According to reporting by Perceiving AI, developer Ryan Shea has launched a new platform called AI IQ (aiiq.org) that places state-of-the-art large models on the human IQ bell curve, using a single number to answer the question "How intelligent is this model, really?" The official site currently lists the following scores for the key tested models:
• GPT-5.5: 136 points (leading the list)
• Claude Opus 4.7 and Gemini 3.1 Pro: tied at 132 points
• Grok 4.3: 125 points
• Kimi K2.6: 122 points
• DeepSeek V4 Pro and Muse Spark: tied at 117 points
• Qwen3.6: 108 points
The platform's algorithm fetches the raw scores of 12 benchmarks from a public leaderboard, converts each into an implied IQ using a calibrated difficulty curve, and then averages the results across four dimensions: abstract reasoning, mathematical reasoning, programming reasoning, and academic reasoning. Dimensions with missing data are filled conservatively, so a model cannot look smarter by running fewer benchmarks. Because all underlying data comes from existing benchmarks, AI IQ does not run any new tests of its own; its core value lies in translating scattered scores into a single scale the average person can understand at a glance.
Beyond the overall IQ ranking, the site offers several cross-sectional views. Users can compare cost-effectiveness in the "IQ vs. Cost" chart, or track each vendor's rate of improvement in the "Frontier IQ Timeline." The platform also overlays emotional-intelligence (EQ) scores from EQ-Bench to indicate whether a model "speaks well."
