According to monitoring by Perceive Beating, the aggregator OpenRouter has disclosed that the performance gap between open-source models and closed-source state-of-the-art models has stabilized at 3 to 6 months. Over the past 18 months, the leading closed-source labs have not widened their lead as expected, while the open-source camp, represented by players from China and the U.S., is accelerating the replacement of closed-source models with highly cost-effective alternatives.
DeepSeek V4 Flash became the preferred replacement within just two months of its release. With 2.84 trillion parameters, it scored 79.0% on SWE-bench Verified, approaching the level of GPT-5.5. Its official first-party pricing is only $0.14 per million input tokens and $0.28 per million output tokens, making its output about 150 times cheaper than that of GPT-5.5. Even after the premium charged for Western cloud hosting that does not retain data for training, the actual cost is only about 1.3% of that of closed-source state-of-the-art models.
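The pricing figures above can be checked with a little arithmetic. The following is a minimal sketch, not an official calculator: the closed-source output price and the hosting markup are derived from the quoted ratios rather than stated in the report, the 1.3% figure is assumed to be measured on the same output-price basis as the 150x figure, and the per-request token counts are purely hypothetical.

```python
# Figures quoted in the report for DeepSeek V4 Flash (USD per million tokens).
INPUT_PRICE = 0.14
OUTPUT_PRICE = 0.28

# Ratios quoted in the report.
OUTPUT_PRICE_RATIO = 150    # output is "about 150 times cheaper"
HOSTED_COST_SHARE = 0.013   # "about 1.3%" with the Western hosting premium


def implied_closed_output_price(open_price: float, ratio: float) -> float:
    """Closed-source output price implied by the quoted ratio (derived, not quoted)."""
    return open_price * ratio


def implied_hosting_markup(price_ratio: float, hosted_share: float) -> float:
    """Markup over first-party pricing implied by the hosted cost share.

    Assumes both figures are measured against the same closed-source price.
    """
    first_party_share = 1 / price_ratio
    return hosted_share / first_party_share


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in USD of one request, given prices per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


closed_output = implied_closed_output_price(OUTPUT_PRICE, OUTPUT_PRICE_RATIO)
markup = implied_hosting_markup(OUTPUT_PRICE_RATIO, HOSTED_COST_SHARE)

# Hypothetical workload: 2,000 input tokens and 500 output tokens per request.
cost = request_cost(2_000, 500, INPUT_PRICE, OUTPUT_PRICE)

print(f"Implied closed-source output price: ${closed_output:.2f} per million tokens")
print(f"Implied hosting markup: {markup:.2f}x first-party pricing")
print(f"First-party cost of the hypothetical request: ${cost:.5f}")
```

Under these assumptions, the quoted figures are mutually consistent: a 150x price gap corresponds to a first-party cost share of roughly 0.67%, so a hosted share of about 1.3% implies that the hosting premium roughly doubles the first-party price.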
Beyond price, GLM 5.2, released by MindSpec in June 2026, ranks first among open-weight models on the Artificial Analysis Intelligence Index and rivals GPT-5.5 in real-world agentic evaluations, making it an alternative for long-horizon programming and planning tasks. However, GLM 5.2 consumes more tokens during deep reasoning, so enterprises need to weigh output costs when deploying it. The open-source multimodal model MiniMax M3, with its innovative MSA sparse attention architecture, offers native long-context processing of images and video at a lower token price, making it a strong open-source competitor to Gemini Flash.
Meanwhile, NVIDIA's Nemotron 3 Ultra, built on a hybrid Mamba-2 architecture, has become the strongest open-source contender native to the U.S. NVIDIA aims to use an open ecosystem to drive market demand for its hardware and microservices.
OpenRouter emphasizes that although state-of-the-art closed-source models will eventually advance, the token cost of any fixed level of intelligence will continue to fall, leaving enterprises significant room for cost optimization.
