
Self-Play Iteration: MiniMax Releases M2.7, with Programming Performance Approaching Opus Level

According to 1M AI News monitoring, MiniMax has released M2.7, which it calls the "first model deeply involved in iterating on its own model." The team built a research-oriented agent framework on early versions of M2, allowing the model to autonomously handle reinforcement-learning skill construction, memory updates, and process optimization. In one internal experiment, M2.7 ran more than 100 iteration loops on its own (analyzing failed trajectories, modifying its scaffolding, running evaluations, and comparing results), ultimately improving its internal evaluation score by 30%. Across 22 machine-learning tasks in MLE Bench Lite, its three-run average win rate was 66.6%, on par with Gemini-3.1 and second only to Opus 4.6 (75.7%) and GPT-5.4 (71.2%).
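The iteration loop described above can be sketched roughly as follows. This is a minimal illustrative simulation, not MiniMax's actual framework: all function names (`analyze_failures`, `propose_patch`, `evaluate`, `run_tasks`), the scaffold representation, and the scoring model are hypothetical stand-ins for the reported analyze → patch → evaluate → compare cycle.

```python
import random

random.seed(0)  # make the toy simulation reproducible

def run_tasks(scaffold, n=20):
    """Simulate agent trajectories under the current scaffold (stub).
    A stronger scaffold lowers the simulated failure probability."""
    p_fail = max(0.05, 0.5 - 0.05 * sum(scaffold.values()))
    return [{"success": random.random() > p_fail,
             "mode": random.choice(["tooling", "memory"])} for _ in range(n)]

def analyze_failures(trajectories):
    """Return the most frequent failure mode among failed runs, if any."""
    failed = [t for t in trajectories if not t["success"]]
    if not failed:
        return None
    counts = {}
    for t in failed:
        counts[t["mode"]] = counts.get(t["mode"], 0) + 1
    return max(counts, key=counts.get)

def propose_patch(scaffold, failure_mode):
    """Modify the scaffold to target the dominant failure mode (stub)."""
    patched = dict(scaffold)
    patched[failure_mode] = patched.get(failure_mode, 0) + 1
    return patched

def evaluate(scaffold):
    """Stand-in evaluation score: here, stronger scaffolds score higher."""
    return sum(scaffold.values()) + random.random() * 0.1

scaffold = {"tooling": 0, "memory": 0}
best = evaluate(scaffold)
for _ in range(100):  # the article reports 100+ rounds
    trajectories = run_tasks(scaffold)
    mode = analyze_failures(trajectories)
    if mode is None:
        continue
    candidate = propose_patch(scaffold, mode)
    score = evaluate(candidate)
    if score > best:  # compare results; keep only improving changes
        scaffold, best = candidate, score

print(f"best score after search: {best:.2f}")
```

The key structural point is the closed loop: the model's own evaluation results gate which scaffold modifications survive, so improvements compound across rounds.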

On the programming front, M2.7 scored 56.22% on SWE-Pro, tying GPT-5.3-Codex; 55.6% on VIBE-Pro, approaching Opus 4.6; and 57.0% on Terminal Bench 2. MiniMax reported that the model repeatedly brought online production-failure recovery time to under three minutes. In office scenarios, M2.7 posted an Elo score of 1500 on GDPval-AA, ranking second among 45 models, behind only Opus 4.6, Sonnet 4.6, and GPT-5.4. In OpenClaw's MM-Claw evaluation it reached 62.7% accuracy, close to Sonnet 4.6, with a 97% compliance rate across 40 complex skills (>2,000 tokens).

MiniMax has also open-sourced OpenRoom (github.com/MiniMax-AI/OpenRoom), an agent interaction system that embeds AI interactions in a visual web space. M2.7 is now fully deployed on MiniMax Agent and the open platform.
