
Xiaomi Unveils Integrated Reconstruction-Generation World Model Framework, Setting New Records on Mainstream Benchmarks

According to Perceiving Beat monitoring, Xiaomi Auto has officially unveiled the Xiaomi EV World Model, a new world-modeling framework for assisted driving. In an internal first, it deeply couples the 3D reconstruction and video generation modules. In traditional autonomous driving simulation, reconstruction and generation are usually handled separately: the reconstruction module can restore a scene but cannot predict how it will change, while the generation module can predict the future but is prone to distortion and drift over long sequences. The team's proposed JointWM architecture uses 3D geometric structure as a physical skeleton to anchor the scene, then relies on the generation module to fill in visual details and predict unobserved areas. The architecture set multiple new performance records on mainstream benchmarks such as Waymo and nuScenes.

In terms of mechanism, the reconstruction module, WorldRec, abandons the traditional per-pixel paradigm and represents the scene with sparse 3D query points. These are incrementally fused into a cross-view 4D Gaussian spatial skeleton, allowing a 10-second video to be reconstructed in 10 seconds. Working from the geometric priors provided by the reconstruction module and constrained by the skeleton's physical boundaries, the generation module, WorldGen, is responsible only for generating plausible lighting and textures. For content in future frames and in field-of-view blind spots, the generation module performs physical prediction through two-stage temporal training and distribution-matching distillation. The entire architecture achieves a single-view generation speed of 0.19 seconds and a three-view speed of 0.46 seconds on an H20 GPU, and supports video generation of up to 1 minute.

The solution achieved a PSNR of 28.48 on the Waymo reconstruction accuracy test and maintains a leading position in zero-shot generalization on nuScenes. In terms of generation efficiency, it is 5.6 times faster than the autoregressive baseline Epona and ranks among the top of comparable algorithms in spatiotemporal consistency. The research results have already been deployed in three major Xiaomi Auto scenarios: delivering over 100,000 segments of high-quality synthetic data for perception model training, building a highly realistic closed-loop simulation environment to reproduce long-tail road conditions, and launching an assisted driving academy that provides generated video guidance for user operations.
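For readers unfamiliar with the reconstruction metric, the following is a minimal sketch of how PSNR (peak signal-to-noise ratio) is conventionally computed and what a value of 28.48 implies. It is not Xiaomi's evaluation code: the function names `psnr` and `rmse_for_psnr` are illustrative, and the 8-bit pixel range (`max_value=255`) and the decibel unit are assumptions, since the report states neither.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Standard PSNR in dB between two equal-length sequences of pixel values.

    max_value=255 assumes 8-bit images (an assumption, not stated in the report).
    """
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def rmse_for_psnr(psnr_db, max_value=255.0):
    """Invert the PSNR formula: root-mean-square pixel error implied by a PSNR value."""
    return max_value / (10.0 ** (psnr_db / 20.0))


if __name__ == "__main__":
    # RMS error implied by the reported 28.48 figure, under the 8-bit assumption.
    print(round(rmse_for_psnr(28.48), 2))
```

Under these assumptions, a PSNR of 28.48 dB corresponds to a root-mean-square error of roughly 9.6 intensity levels on a 0 to 255 scale. Because the metric is logarithmic, each additional 6 dB roughly halves that error.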



