
DeepSeek's new model "MODEL1" surfaces on the first anniversary of DeepSeek-R1

BlockBeats News, January 21st: according to QuantumBit, a new DeepSeek model, "MODEL1", has surfaced on the first anniversary of DeepSeek-R1. DeepSeek updated its FlashMLA code on GitHub, and MODEL1 is mentioned 28 times across 114 files, where it appears as a model distinct from V32. Since V32 is known to refer to DeepSeek-V3.2, MODEL1 is likely a new architecture. In the code, the two models differ in KV cache layout, sparsity handling, and FP8 decoding, which points to several changes in memory optimization.
