Google DeepMind has just introduced Gemini Robotics On-Device, a powerful version of its vision-language-action (VLA) model designed to run directly on robots. Unlike traditional cloud-dependent systems, this new model works entirely offline. That makes it ideal for latency-sensitive tasks such as real-time manipulation and industrial automation.
What Sets This Model Apart
Launched just months after the original Gemini Robotics model, this new version is smaller, faster, and more efficient. It is specially designed for bi-arm robots, requires minimal computing power, and supports natural-language instruction following. From folding clothes to unzipping bags, it can complete these tasks without external data support.
Even more impressively, the model has been adapted beyond its original ALOHA robot platform. On a bi-arm Franka FR3, it handled folding dresses and assembling belts. On the humanoid Apollo robot by Apptronik, the model maintained its versatility and executed natural-language instructions with ease.
To support experimentation and development, Google is also launching the Gemini Robotics SDK. Through this kit, developers can test the model in MuJoCo physics simulations, evaluate it on real-world tasks, and fine-tune it for new domains with as few as 50–100 demonstrations.
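To make the "50–100 demonstrations" idea concrete, here is a minimal sketch of what a small fine-tuning dataset might look like. The actual Gemini Robotics SDK defines its own data formats, so the names below (`Step`, `Demonstration`, `collect_dataset`) are illustrative assumptions, not the real API.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Step:
    observation: List[float]   # e.g. camera/proprioception features
    action: List[float]        # e.g. target joint positions for both arms

@dataclass
class Demonstration:
    instruction: str           # natural-language task description
    steps: List[Step] = field(default_factory=list)

def collect_dataset(demos: List[Demonstration],
                    minimum: int = 50) -> List[Demonstration]:
    """Keep only demonstrations that contain at least one step and check
    that we reach the ~50-demo lower bound mentioned for fine-tuning."""
    usable = [d for d in demos if d.steps]
    if len(usable) < minimum:
        raise ValueError(f"need at least {minimum} demos, got {len(usable)}")
    return usable

# Build 60 toy demonstrations of a single task.
demos = [
    Demonstration(
        instruction="fold the towel",
        steps=[Step(observation=[0.0, 0.1], action=[0.2, 0.3])],
    )
    for _ in range(60)
]
print(len(collect_dataset(demos)))  # 60
```

The point of the sketch is the scale: a task-specific dataset here is dozens of short, language-labeled trajectories, not millions of examples.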
Performance and Speed
In real-world benchmarks, Gemini Robotics On-Device has been compared against Google's own flagship robotics models. The results show that it delivers strong generalisation even when handling unfamiliar or complex tasks. It also shines in instruction following and multi-step actions, making it a great fit for everyday robotics and industrial work.
Google has also kept safety in mind: the model integrates content safety through a Live API and works alongside low-level safety-critical controllers. Dedicated teams oversee risk assessment and provide feedback to further improve safety.
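The "low-level safety-critical controller" pattern above can be sketched as a filter between the high-level policy and the motors: the policy proposes a command, and a simple safety layer clamps it to hardware limits and vetoes anything that would leave a safe envelope. The limits and function names here are illustrative assumptions, not Google's actual implementation.

```python
MAX_VELOCITY = 1.0          # rad/s, assumed per-joint velocity limit
JOINT_RANGE = (-2.0, 2.0)   # rad, assumed safe joint-position range

def safety_filter(position: float, proposed_velocity: float,
                  dt: float = 0.02) -> float:
    """Clamp the proposed joint velocity, and zero it entirely if it
    would push the joint outside its safe range on the next step."""
    v = max(-MAX_VELOCITY, min(MAX_VELOCITY, proposed_velocity))
    next_pos = position + v * dt
    if not (JOINT_RANGE[0] <= next_pos <= JOINT_RANGE[1]):
        return 0.0  # veto: stop the joint rather than exceed its range
    return v

print(safety_filter(0.0, 5.0))    # clamped to 1.0
print(safety_filter(1.999, 1.0))  # vetoed -> 0.0
```

The design choice this illustrates: the learned model never commands hardware directly, so even an erroneous high-level action cannot violate the controller's envelope.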
This launch represents a major step toward making robotics AI more adaptable and widely usable. The model addresses key challenges such as latency and connectivity while unlocking new potential in robotic applications.
With the new SDK in hand, developers can experiment and shape how this technology gets used. Early access is being granted to trusted testers, with broader availability expected later.