DeepSeek Releases Prover-V2, a Model with 671 Billion Parameters

2025-04-30 11:08

BlockBeats News, April 30th: DeepSeek today released a new model named DeepSeek-Prover-V2-671B on the AI open-source community Hugging Face. It is reported that DeepSeek-Prover-V2-671B is distributed in the efficient safetensors file format and supports multiple numerical precisions, making the model faster and less resource-intensive to train and deploy. With 671 billion parameters, it is likely an upgrade of last year's Prover-V1.5 mathematical model. Architecturally, the model is built on DeepSeek-V3, adopting a Mixture-of-Experts (MoE) design with 61 Transformer layers and a 7,168-dimensional hidden layer. It also supports ultra-long contexts, with maximum position embeddings of 163,840, allowing it to handle complex mathematical proofs. Furthermore, it employs FP8 quantization to reduce the model's size and improve inference efficiency. (Jinshi)
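For context, safetensors checkpoints released on Hugging Face are typically loaded through the standard transformers API. The sketch below is illustrative only: the repository id deepseek-ai/DeepSeek-Prover-V2-671B is taken from the release name in the article, while the precision setting and the Lean 4 prompt format are assumptions, and a 671-billion-parameter model would in practice require a multi-GPU server.

```python
# Minimal sketch of loading a safetensors model of this kind with transformers.
# Assumes the repo id from the article; not verified against the actual release.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "deepseek-ai/DeepSeek-Prover-V2-671B"

# trust_remote_code is commonly required for DeepSeek-V3-style MoE models,
# which ship custom modeling code with the checkpoint.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # pick a precision your hardware supports
    device_map="auto",           # shard the 61 Transformer layers across GPUs
    trust_remote_code=True,
)

# Prover-series models target formal mathematics, so a Lean 4 proof goal
# is a plausible (assumed) prompt format.
prompt = (
    "Complete the following Lean 4 proof:\n"
    "theorem add_comm' (a b : Nat) : a + b = b + a := by\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```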
