Meituan Open-Sources 560B-Parameter Theorem-Proving Model: 97.1% Pass Rate on MiniF2F-Test Within 72 Inferences, a New SOTA Among Open-Source Models

According to 1M AI News monitoring, Meituan's LongCat team has open-sourced LongCat-Flash-Prover, a 560B-parameter Mixture-of-Experts (MoE) model specialized for formal mathematical reasoning in Lean4, a formal theorem-proving language. The model weights are released under the MIT license and are available on GitHub, Hugging Face, and ModelScope.

The model decomposes formal reasoning into three distinct capabilities: auto-formalization (translating natural-language math problems into Lean4 formal statements), sketch generation (producing lemma-style proof outlines), and full proof generation. All three are exercised through agentic tool-integrated reasoning (TIR), in which the model interacts with the Lean4 compiler in real time to verify its output.
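To make the three stages concrete, here is a small hand-written Lean4 illustration. It is not output from LongCat-Flash-Prover; the problem, the theorem names, and the helper lemma are all hypothetical, and the block assumes a recent Lean4 toolchain in which the `omega` tactic is available without extra libraries. The informal problem is "the sum of two even natural numbers is even".

```lean
-- Stage 1 (auto-formalization): the informal problem written as Lean4
-- statements. "Even" is spelled out as "equal to 2 * k for some k".

-- Stage 2 (sketch generation): a lemma-style outline. The helper lemma
-- is stated but left open with `sorry`, so the file compiles with a
-- warning and the main argument can already be checked against it.
theorem two_mul_add_sketch (m n : Nat) : 2 * m + 2 * n = 2 * (m + n) := by
  sorry

theorem sum_of_evens_sketch (a b : Nat)
    (ha : ∃ m, a = 2 * m) (hb : ∃ n, b = 2 * n) :
    ∃ k, a + b = 2 * k := by
  cases ha with
  | intro m hm =>
    cases hb with
    | intro n hn =>
      -- Rewrite a and b, then hand the remaining goal to the helper.
      exact ⟨m + n, by rw [hm, hn]; exact two_mul_add_sketch m n⟩

-- Stage 3 (full proof generation): the open lemma is discharged, so the
-- compiler accepts the whole development without `sorry`.
theorem two_mul_add (m n : Nat) : 2 * m + 2 * n = 2 * (m + n) := by
  omega

theorem sum_of_evens (a b : Nat)
    (ha : ∃ m, a = 2 * m) (hb : ∃ n, b = 2 * n) :
    ∃ k, a + b = 2 * k := by
  cases ha with
  | intro m hm =>
    cases hb with
    | intro n hn =>
      exact ⟨m + n, by rw [hm, hn]; exact two_mul_add m n⟩
```

In a tool-integrated setting, each stage would be sent to the Lean4 compiler, and any error messages would be fed back to the model before the next attempt.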

On the training side, the team proposes a Hybrid-Experts Iteration Framework to generate cold-start data. During the reinforcement learning phase, it introduces the HisPO algorithm to stabilize MoE training on long-horizon tasks, and adds theorem-consistency and validity checks to prevent reward hacking.

Benchmark results show that LongCat-Flash-Prover sets a new state of the art among open-weight models in both auto-formalization and theorem proving. It achieves a 97.1% pass rate on MiniF2F-Test with only 72 inferences per problem, and reaches 70.8% on ProverBench and 41.5% on PutnamBench with at most 220 inferences per problem.
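As a rough guide to reading these figures, the sketch below shows one common way a pass rate under an inference budget is computed: a problem counts as solved if any of its first `budget` attempts is accepted by the verifier. This is an assumption about the metric, not the team's published evaluation code; the function name `pass_rate_within_budget` and the toy data are hypothetical.

```python
from typing import Sequence


def pass_rate_within_budget(
    attempts: Sequence[Sequence[bool]], budget: int
) -> float:
    """Fraction of problems solved within `budget` attempts.

    attempts[i][j] is True if the j-th attempt on problem i was
    accepted by the verifier (for example, the Lean4 compiler).
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    if not attempts:
        return 0.0
    # A problem is solved if at least one attempt inside the budget passes.
    solved = sum(1 for results in attempts if any(results[:budget]))
    return solved / len(attempts)


# Toy data (made up): four problems with a few attempts each.
toy_attempts = [
    [False, True, False],
    [False, False, False],
    [True],
    [False, False, True],
]

rate_budget_2 = pass_rate_within_budget(toy_attempts, budget=2)
rate_budget_3 = pass_rate_within_budget(toy_attempts, budget=3)
print(rate_budget_2, rate_budget_3)
```

Raising the budget can only keep the pass rate the same or increase it, which is why a score is only comparable across models when the budget is reported alongside it.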
