According to Perceive Beating's monitoring, the DeepSeek V4 technical report discloses a comparison between DeepSeek-V4-Pro-Max (Maximum Inference Intensity Mode) and the closed-source flagships. The comparison group comprises Opus 4.6 Max, GPT-5.4 xHigh, and Gemini 3.1 Pro High, along with the open-source models Kimi K2.6 and GLM-5.1; it excludes the recently released Opus 4.7 and GPT-5.5.
On coding, V4-Pro-Max scored 3206 on Codeforces, surpassing GPT-5.4 at 3168 and Gemini 3.1 Pro at 3052 and setting a new record for the benchmark. It also took the top score on LiveCodeBench with 93.5. On SWE-bench Verified it scored 80.6, just 0.2 percentage points behind Opus 4.6 at 80.8.
On long-context evaluation, V4-Pro-Max ranked second on both 1M-token benchmarks: it scored 62.0 on CorpusQA 1M, trailing Opus 4.6 at 71.7 but leading Gemini 3.1 Pro at 53.8, and 83.5 on MRCR 1M, where Opus 4.6 led by almost 10 percentage points at 92.9.
On agent tasks, V4-Pro-Max scored 73.6 on MCPAtlas Public, slightly below Opus 4.6 at 73.8, and 67.9 on Terminal-Bench 2.0, behind GPT-5.4 at 75.1 and Gemini 3.1 Pro at 68.5.
On knowledge and reasoning, V4-Pro-Max still shows a noticeable gap: 90.1 on GPQA Diamond (Gemini 3.1 Pro: 94.3), 57.9 on SimpleQA-Verified (75.6), and 37.7 on HLE (44.4). As an open-source model, V4-Pro-Max has for the first time matched or even surpassed the closed-source flagships on multiple coding and long-context benchmarks, but it still lags behind Gemini 3.1 Pro on knowledge-intensive evaluations.
It is worth stressing that this comparison omits the recently released GPT-5.5 and Opus 4.7; the gap between V4 and the newest generation of closed-source models still awaits validation by third-party evaluations.
