According to Perceive Beating monitoring, GLM-5.2, the latest MoE flagship model from MindScape AI, scored 51 on the Artificial Analysis Big Model Intelligence Index v4.1, ahead of MiniMax-M3 (44), DeepSeek V4 Pro (max, 44), and Kimi K2.6 (43), taking the top spot on the global open-source model leaderboard.
On the GDPval-AA v2 test, which simulates real-world knowledge work, GLM-5.2 scored 1524 (human baseline: 1000), ahead of MiniMax-M3 (1418) and DeepSeek V4 Pro (max, 1328) and level with the closed-source frontier model GPT-5.5 (xhigh reasoning). Compared with its predecessor GLM-5.1, the scientific reasoning benchmark CritPt rose 16 percentage points to 21%, HLE rose 12 percentage points to 40%, TerminalBench v2.1 rose 16 percentage points to 78%, and GPQA Diamond reached 89%.
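The percentage-point gains above imply GLM-5.1's earlier scores, which the report does not state directly. A minimal sketch of that arithmetic follows; the dictionary layout and the helper name `implied_previous` are illustrative assumptions, and the results are derived values, not separately reported figures.

```python
# Current GLM-5.2 score (%) and gain over GLM-5.1 (percentage points),
# as given in the report.
reported = {
    "CritPt": (21, 16),
    "HLE": (40, 12),
    "TerminalBench v2.1": (78, 16),
}


def implied_previous(current_pct, gain_pp):
    """Back out the predecessor's score from the current score and the gain."""
    return current_pct - gain_pp


# Derived (not reported) GLM-5.1 scores.
previous = {
    name: implied_previous(current, gain)
    for name, (current, gain) in reported.items()
}

for name, prev in previous.items():
    current = reported[name][0]
    print(f"{name}: GLM-5.1 {prev}% -> GLM-5.2 {current}%")
```

GPQA Diamond is left out because the report gives only the new score (89%) and no gain.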
GLM-5.2 holds the most cost-effective position on the "Intelligence - Task Cost" Pareto frontier. Because it outputs an average of 43k tokens per task (up from 26k for GLM-5.1), its average cost per task has risen to approximately $0.46, higher than GLM-5.1 ($0.25) and DeepSeek V4 Pro (max, $0.05) but still well below that of comparable closed-source models.
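To make the Pareto-frontier idea concrete: a model is on the frontier if no other model is at least as intelligent and at least as cheap while being strictly better on one of the two. The sketch below is an illustration only, not Artificial Analysis's method; the `Model` type and function names are assumptions, and it uses just the two models for which the report gives both an index score and a per-task cost.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    score: float  # intelligence index, higher is better
    cost: float   # average cost per task in USD, lower is better


def dominates(a: Model, b: Model) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a.score >= b.score and a.cost <= b.cost
    strictly_better = a.score > b.score or a.cost < b.cost
    return no_worse and strictly_better


def pareto_frontier(models: list[Model]) -> list[Model]:
    """Keep every model that no other model dominates."""
    return [m for m in models if not any(dominates(o, m) for o in models)]


# Only figures stated in the report are used here.
models = [
    Model("GLM-5.2", 51, 0.46),
    Model("DeepSeek V4 Pro (max)", 44, 0.05),
]

frontier = pareto_frontier(models)
for m in frontier:
    print(f"{m.name}: score {m.score}, ${m.cost:.2f} per task")
```

Both models land on the frontier: GLM-5.2 scores higher but costs more per task, and DeepSeek V4 Pro (max) is cheaper but scores lower, so neither dominates the other.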
GLM-5.2 has 744B total parameters, of which 40B are active, and its context window has grown from 200K in the previous version to 1M. It is released under the MIT open-source license. The official MindScape API (priced at 1.4 for input and 4.4 for output per million tokens) and platforms such as SiliconFlow, DeepInfra, and Nebius AI already offer the model.
