According to Perceiver Beating monitoring: yesterday, after DeepSeek open-sourced the TileKernels kernel library, we speculated about V4's core architecture components based on the industrial-grade kernels included in the library. Today the V4 model card was released, and the validation results are as follows:
mHC (Manifold-Constrained Hyper-Connection): Yesterday we speculated that V4 uses DeepSeek's improved mHC rather than the raw Hyper-Connection. The model card confirms that V4 uses Manifold-Constrained Hyper-Connections, which is a match.

MoE architecture with Top-k expert routing: Yesterday we noted that TileKernels includes complete MoE dispatch and gather kernels. The model card confirms that V4 is an MoE model, which is a match.

FP4+FP8 mixed precision: Yesterday we saw that the library contains FP4 and FP8 quantization kernels. The model card confirms that the weights are stored in FP4+FP8 mixed precision, which is a match.
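Neither the model card nor the kernel library (as described here) discloses V4's routing function, so as a neutral illustration, here is a minimal NumPy sketch of generic Top-k expert routing: each token is dispatched to the k experts with the highest router scores, and gate weights come from a softmax over just those k scores. All names and the gating choice are assumptions for illustration, not DeepSeek's kernels.

```python
import numpy as np

def topk_route(logits, k=2):
    """Generic Top-k expert routing sketch (illustrative, not DeepSeek's code).

    logits: [num_tokens, num_experts] router scores.
    Returns (expert_ids, gate_weights), each [num_tokens, k].
    """
    idx = np.argpartition(logits, -k, axis=-1)[:, -k:]   # top-k expert ids per token
    top = np.take_along_axis(logits, idx, axis=-1)       # their logits
    top = top - top.max(axis=-1, keepdims=True)          # stabilize the softmax
    e = np.exp(top)
    gates = e / e.sum(axis=-1, keepdims=True)            # softmax over the k chosen experts
    return idx, gates

# Dispatch/gather then means: route each token to its chosen experts and
# combine their outputs weighted by the gates.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 8))          # 4 tokens, 8 experts
idx, gates = topk_route(logits, k=2)
assert gates.shape == (4, 2) and np.allclose(gates.sum(-1), 1.0)
```

The dispatch and gather kernels in TileKernels would correspond to the scatter (token to expert) and weighted combine (expert output back to token) phases around this routing step.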
The only component that was not confirmed is Engram (the conditional memory module). Yesterday we noted that the V4 specification disclosed by Yifan Zhang did not mention Engram, leaving room for interpretation; the V4 model card likewise makes no mention of it.
The model card also revealed new components not covered by TileKernels: the hybrid attention mechanism (CSA + HCA) is the core of V4's long-context efficiency leap, with inference FLOPs at a 1M-token context only 27% of V3.2's and the KV cache only 10%; training has also switched to the Muon optimizer.
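For context on the last point: Muon is the publicly known optimizer that, for matrix-shaped parameters, replaces the elementwise Adam-style update with an approximately orthogonalized momentum step computed by a Newton-Schulz iteration. Below is a minimal NumPy sketch of that idea; the quintic coefficients follow the publicly released variant of Muon, the function names and hyperparameters are illustrative, and nothing here is claimed to be DeepSeek's actual implementation.

```python
import numpy as np

def newton_schulz(G, steps=5):
    """Approximately orthogonalize a matrix via a quintic Newton-Schulz
    iteration (the core of Muon; coefficients from the public variant)."""
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (np.linalg.norm(G) + 1e-7)   # scale so the spectral norm is <= 1
    transposed = X.shape[0] > X.shape[1]
    if transposed:                        # iterate on the wide orientation
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X   # pushes singular values toward 1
    return X.T if transposed else X

def muon_step(W, grad, momentum, lr=0.02, beta=0.95):
    """One sketched Muon update: accumulate momentum, orthogonalize the
    momentum matrix, and use the result as the step direction."""
    momentum = beta * momentum + grad
    W = W - lr * newton_schulz(momentum)
    return W, momentum
```

The design intent (per Muon's public description) is that orthogonalizing the momentum equalizes the update's singular values, which tends to suit 2D weight matrices; non-matrix parameters are typically still handled by an elementwise optimizer.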
