According to 1M AI News monitoring, the widely recognized AI benchmark platform LMArena, whose blind tests draw on millions of participating users, today updated its Code Arena leaderboard. GLM-5.1 ranks first among open-source models worldwide and third among all models.
GLM-5.1 not only inherits the previous generation's open-source SOTA coding capability but also achieves a breakthrough in long-horizon tasks, including:
1. Building a Linux desktop from scratch in 8 hours;
2. Breaking through a vector database optimization bottleneck after 655 iterations;
3. Optimizing real machine learning model workloads over 1,000 rounds of tool calls.
Notably, under evaluation criteria equivalent to those of the METR ranking, GLM-5.1 is the only open-source model capable of sustaining continuous work at the 8-hour level and, alongside Claude Opus 4.6, is one of the few models worldwide with this capability.
