According to a report by Perceive Beating, MiniMax has published a technical blog post detailing the root-cause analysis behind its M2 series large model's inability to output the name "Ma Jiaqi." What began as an investigation of a single case ultimately uncovered systemic degradation affecting the entire vocabulary.
The root cause was traced to the tokenizer (the component that splits text into the units a model processes), which had merged "Jiaqi" into a single token. During pre-training the model saw this token in large amounts of web text and learned it; in the dialogue data later used for fine-tuning, however, fewer than five samples contained "Jiaqi." Throughout fine-tuning, high-frequency tokens such as tool_call markers and code symbols kept updating the surrounding vector space, pushing low-frequency tokens like "Jiaqi" in the wrong direction. The model still "recognizes" Ma Jiaqi and can answer related questions accurately; the only thing it lost was the ability to emit this token.
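The failure mode described above can be sketched numerically: even when the model's hidden state points toward the right answer, a drifted output embedding for that token drives its logit, and hence its softmax probability, toward zero. The vectors below are a toy illustration, not MiniMax's actual parameters.

```python
import numpy as np

def next_token_probs(hidden: np.ndarray, output_emb: np.ndarray) -> np.ndarray:
    """Softmax over the dot products of the hidden state with each
    token's output embedding (the usual LM head computation)."""
    logits = output_emb @ hidden
    e = np.exp(logits - logits.max())       # stabilized softmax
    return e / e.sum()

# Hidden state that "wants" to say token 2 (our stand-in for "Jiaqi").
hidden = np.array([1.0, 1.0, 0.0])

# Healthy output embeddings: token 2 aligns with the hidden state.
E_healthy = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [1.0, 1.0, 0.0]])

# After fine-tuning drift, token 2's vector points the opposite way.
E_drifted = E_healthy.copy()
E_drifted[2] = [-1.0, -1.0, 0.0]

p_before = next_token_probs(hidden, E_healthy)   # token 2 dominates
p_after = next_token_probs(hidden, E_drifted)    # token 2 nearly unreachable
```

The model's internal "knowledge" (the hidden state) is unchanged in both cases; only the token's output vector moved, which matches the article's observation that the model still answers questions about Ma Jiaqi while being unable to type the name.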
The team then ran a full scan of the roughly 200,000-token vocabulary and found that about 4.9% of tokens had significantly degraded. Japanese was hit hardest: 29.7% of Japanese tokens showed significant degradation, far above Korean at 3.3%, Russian at 3.7%, Chinese at 3.9%, and English at 3.5%. Also ranking high were internet SEO spam phrases such as "Legend private server" and "painless abortion," which degraded through exactly the same mechanism as "Jiaqi."
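A vocabulary-wide scan of this kind can be sketched as comparing each token's embedding before and after fine-tuning by cosine similarity and flagging tokens below a threshold. The vocabulary size, noise levels, and the 0.5 cutoff below are all toy assumptions for illustration; the article does not specify the team's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
V, D = 1000, 16                        # toy vocabulary, not the real ~200k
emb_base = rng.normal(size=(V, D))     # embeddings from the base checkpoint

# Simulate fine-tuning: most tokens keep their direction (small noise),
# but a small minority are knocked far off course.
emb_ft = emb_base + 0.05 * rng.normal(size=(V, D))
degraded_ids = rng.choice(V, size=50, replace=False)
emb_ft[degraded_ids] = rng.normal(size=(50, D))    # direction lost entirely

# Per-token cosine similarity between the two checkpoints.
cos = (emb_base * emb_ft).sum(axis=1) / (
    np.linalg.norm(emb_base, axis=1) * np.linalg.norm(emb_ft, axis=1))

frac_degraded = (cos < 0.5).mean()     # threshold is an assumption
```

Running the same per-token statistic grouped by script or language would reproduce the kind of breakdown the blog reports (Japanese 29.7%, Chinese 3.9%, and so on).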
The severe Japanese degradation also solved an old mystery. The model had occasionally mixed Russian or Korean characters into Japanese conversations, and no cause had been found. This analysis showed that once the Japanese token parameters drifted, they became entangled with tokens of other languages in the vector space, which both incorrectly activated foreign tokens (language mixing) and pushed neighboring low-frequency Chinese tokens out of the normal probability range (token forgetting).
The fix was to build a synthetic dataset covering the entire vocabulary and train the model on a simple echo task for every token. The results were immediate: the proportion of Russian characters mixed into Japanese responses dropped from 47% to 1%, and per-token embedding stability (measured as cosine similarity) rose from a minimum of 0.329 to above 0.97 for all tokens.
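An echo task of the kind described can be sketched as one prompt/completion pair per vocabulary token, so that every output embedding is guaranteed to receive gradient during the repair fine-tune. The prompt wording below is hypothetical; the blog does not publish the exact template.

```python
def build_echo_dataset(vocab):
    """One training sample per token: the model is simply asked to
    repeat the token verbatim. Covering the whole vocabulary ensures
    no low-frequency token is starved of gradient signal."""
    return [
        {"prompt": f"Repeat the following exactly: {tok}",
         "completion": tok}
        for tok in vocab
    ]

# Tiny stand-in vocabulary; a real run would iterate all ~200k tokens.
samples = build_echo_dataset(["Jiaqi", "tool_call", "legend"])
```

Because the task is trivial, it restores the degraded output embeddings without teaching the model any new behavior, which is consistent with the reported outcome of stabilizing cosine similarity without side effects.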
