According to monitoring by Perception Beating, Princeton PhD student Yifan Zhang claimed that DeepSeek, the Chinese AI company, will release its next-generation flagship model V4 next week. In the same post he listed three architecture components: Sparse Multi-Query Attention (Sparse MQA), a Fused MoE Mega Kernel, and Hyper-Connections. Zhang did his undergraduate degree at Peking University's Yuanpei College and his master's at Tsinghua University's Yao Class, and is currently a Princeton AI Lab Fellow; he previously worked as a research intern on ByteDance's Seed Base Model Team. He is not employed at DeepSeek, and the DeepSeek team has not officially confirmed the release schedule.
Each of the three components targets an independent direction in LLM optimization, and each is sketched below. Sparse MQA introduces sparsity on top of multi-query attention to further cut inference compute and memory usage in long-context scenarios. A Fused MoE Mega Kernel merges the MoE routing decision and the expert matrix multiplications into a single GPU kernel, eliminating much of the kernel-launch and memory-transfer overhead at inference time. Hyper-Connections generalize residual connections, replacing the single residual addition with multiple learnable weighted paths.
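To make the first idea concrete, here is a minimal PyTorch sketch of what "Sparse MQA" could look like: standard multi-query attention (all query heads sharing one key/value head) combined with a per-query top-k mask as the sparsity pattern. The top-k scheme, the function name `sparse_mqa`, and the tensor shapes are assumptions for illustration only; DeepSeek has not published the actual design.

```python
import torch
import torch.nn.functional as F

def sparse_mqa(q, k, v, top_k=64):
    """Multi-query attention with per-query top-k sparsity (illustrative).

    q:    (batch, n_heads, seq_q,  d)  -- many query heads
    k, v: (batch, 1,       seq_kv, d)  -- one shared key/value head (MQA)
    Only the top_k highest-scoring keys contribute to each query, which is
    one simple way to realize "sparse" attention.
    """
    # Broadcast the single K head across all query heads.
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5  # (B, H, Sq, Skv)
    if top_k < scores.shape[-1]:
        kth = scores.topk(top_k, dim=-1).values[..., -1:]  # k-th largest score
        scores = scores.masked_fill(scores < kth, float("-inf"))
    return F.softmax(scores, dim=-1) @ v                   # (B, H, Sq, d)

# Example: 8 query heads attend over 128 keys, keeping the top 16 per query.
q = torch.randn(1, 8, 128, 64)
k = torch.randn(1, 1, 128, 64)
v = torch.randn(1, 1, 128, 64)
out = sparse_mqa(q, k, v, top_k=16)  # (1, 8, 128, 64)
```

MQA already shrinks the KV cache by a factor of the head count; the sparsity mask then caps how many cached entries each query actually reads, which is where the long-context savings come from.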
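For the fused MoE idea, the contrast below shows what fusion removes: a per-expert loop in which every expert GEMM is its own kernel launch with gather/scatter traffic in between, versus a single batched call where the routing result feeds directly into one grouped matmul. This is a host-side PyTorch analogy with hypothetical names (`moe_unfused`, `moe_single_launch`); an actual mega kernel is hand-written GPU code that keeps routing and the expert GEMMs on-chip in one launch, without materializing per-token weights as the second function does here.

```python
import torch

def moe_unfused(x, w_router, w_experts):
    """Baseline top-1 MoE: routing, then one matmul kernel per expert.

    x:         (tokens, d_in)
    w_router:  (d_in, n_experts)
    w_experts: (n_experts, d_in, d_out)
    """
    expert_id = (x @ w_router).argmax(dim=-1)       # (tokens,) routing decision
    out = x.new_empty(x.shape[0], w_experts.shape[2])
    for e in range(w_experts.shape[0]):             # one kernel launch per expert
        sel = expert_id == e
        out[sel] = x[sel] @ w_experts[e]            # gather -> GEMM -> scatter
    return out

def moe_single_launch(x, w_router, w_experts):
    """Single batched op standing in for a fused kernel: the routing result
    flows straight into one grouped matmul, with no per-expert loop and no
    intermediate round-trips through global memory. (Gathering per-token
    weight matrices is memory-wasteful and purely illustrative.)"""
    expert_id = (x @ w_router).argmax(dim=-1)       # (tokens,)
    w = w_experts[expert_id]                        # (tokens, d_in, d_out)
    return torch.bmm(x.unsqueeze(1), w).squeeze(1)  # one launch for all tokens
```

The overhead being eliminated is not the math itself but everything around it: per-expert launch latency and the writes and reads of intermediate token buffers between routing and the expert GEMMs.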
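Hyper-Connections, unlike the other two items, correspond to a published ByteDance paper (arXiv:2409.19606), so the sketch below follows that paper's static variant: n parallel hidden streams with learnable read, write, and stream-mixing weights around each layer. The class and parameter names are mine, and whether V4 uses this exact formulation is unconfirmed.

```python
import torch
import torch.nn as nn

class HyperConnection(nn.Module):
    """Static hyper-connection wrapped around one layer (sketch).

    Instead of a single residual stream, keep n parallel streams. Each step:
    `beta` mixes the streams into the layer input, a learnable `width`
    matrix mixes the streams with each other, and the layer output is
    written back into every stream with learnable `depth` weights.
    """
    def __init__(self, layer, n_streams=4):
        super().__init__()
        self.layer = layer
        self.beta = nn.Parameter(torch.ones(n_streams) / n_streams)  # read weights
        self.depth = nn.Parameter(torch.ones(n_streams))             # write weights
        self.width = nn.Parameter(torch.eye(n_streams))              # stream mixing

    def forward(self, streams):                    # (n_streams, batch, seq, d)
        h = torch.einsum("n,nbsd->bsd", self.beta, streams)   # streams -> input
        y = self.layer(h)                                      # wrapped layer
        streams = torch.einsum("nm,mbsd->nbsd", self.width, streams)
        return streams + self.depth.view(-1, 1, 1, 1) * y      # depth connections
```

In the paper, the streams are initialized by replicating the embedding n times and are collapsed back into a single hidden state before the output head. Setting `n_streams=1` with all weights fixed at 1 recovers a plain residual block, which is the sense in which this "generalizes" residual connections.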
