Google DeepMind has released Gemini Robotics-ER 1.6, and the Spot robot can now read gauges autonomously.

According to 1M AI News monitoring, Google DeepMind has released Gemini Robotics-ER 1.6, positioned as a high-level reasoning model for robots. It shows significant improvements in spatial reasoning and multi-view understanding over its predecessor ER 1.5 and over Gemini 3.0 Flash, and is available to developers through the Gemini API and Google AI Studio.

The core upgrades include three capabilities:

1. Enhanced pointing precision: the model supports precise object detection, counting, spatial-relationship reasoning (e.g., "point out all objects that can fit into the blue cup"), and motion-trajectory planning, and it correctly refuses to point at objects that are not present in the scene.
2. Multi-view success detection: robots can now synthesize multiple camera views to determine task completion, maintaining accuracy even in occluded or dynamic environments.
3. New instrument-reading capability: the model can interpret a range of industrial instruments, including round pressure gauges, vertical level indicators, and digital displays. It reads them progressively through agentic vision (visual reasoning plus code execution): it first zooms into the region of interest, then computes proportions and scale intervals via pointing and generated code, and finally combines world knowledge to derive the reading.
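The "pointing and code" step of that pipeline can be illustrated with plain Python. Everything below is a hypothetical sketch, not DeepMind's implementation: the 0–1000-normalized `[y, x]` point format follows the convention documented for Gemini Robotics-ER 1.5 and is assumed to carry over, and the linear interpolation stands in for the proportion calculation the model is described as emitting as code.

```python
def denormalize_point(point, width, height):
    """Convert a model-returned point [y, x], normalized to 0-1000
    (the ER 1.5 documented convention, assumed here for 1.6), into
    pixel coordinates (x, y) for a width x height image."""
    y, x = point
    return (x / 1000 * width, y / 1000 * height)

def gauge_reading(needle_deg, min_deg, max_deg, min_val, max_val):
    """Linearly interpolate a dial reading from the needle angle.
    Stands in for the 'calculate proportions and intervals' step;
    all parameter names are illustrative."""
    frac = (needle_deg - min_deg) / (max_deg - min_deg)
    return min_val + frac * (max_val - min_val)

# Needle tip pointed at [500, 250] in a 640x480 camera frame:
print(denormalize_point([500, 250], 640, 480))      # -> (160.0, 240.0)
# Round 0-10 bar gauge sweeping 270 degrees, needle at 135 degrees:
print(gauge_reading(135.0, 0.0, 270.0, 0.0, 10.0))  # -> 5.0
```

In the article's description, the model would obtain the needle and scale-mark positions by pointing into a zoomed crop, then emit and execute arithmetic of this kind to turn geometry into a reading.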

The instrument-reading capability stems from DeepMind's collaboration with Boston Dynamics. Boston Dynamics announced on the same day that it has integrated Gemini and Gemini Robotics-ER 1.6 into its Orbit AIVI-Learning product, which was rolled out to all AIVI-Learning customers on April 8. The integration adds support for gauges, allowing the quadruped robot Spot to inspect industrial facilities autonomously and read instruments such as pressure gauges. Boston Dynamics said that with Gemini's reasoning capabilities, AIVI-Learning has also improved its baseline performance and accuracy on existing tasks such as visual inspection, pallet counting, and liquid-accumulation detection.

DeepMind describes ER 1.6 as its "safest robot model" to date. In adversarial spatial-reasoning tasks, its compliance with safety instructions is significantly better than ER 1.5's. In safety-risk identification tests based on real injury reports, the ER-series models outperform Gemini 3.0 Flash by 6% on text scenes and 10% on video scenes.
