New AI Robot System Trained Entirely in Virtual Worlds Could Transform Robotics Development
A new robotics breakthrough suggests that artificial intelligence systems may no longer need massive amounts of real-world training data to learn how to interact with physical environments. Researchers from the Allen Institute for AI have introduced MolmoBot, a robotics model trained completely in simulation rather than through human-controlled demonstrations.
The project could significantly lower the cost and complexity of developing physical AI—robots that can manipulate objects and perform tasks in real-world environments.
A New Approach to Training Robots
Traditionally, robotics systems require thousands of hours of real-world demonstrations. Human operators remotely control robots to collect training data, a process that is both time-consuming and expensive.
For example, the DROID dataset contains about 76,000 teleoperated trajectories gathered by researchers across multiple institutions. Similarly, Google's RT-1 required 130,000 training episodes collected by human operators over 17 months.
Because of the high cost, this type of research is often limited to well-funded laboratories and major technology companies.
According to Ali Farhadi, CEO of the Allen Institute for AI, the goal of the new system is to make robotics research more accessible.
He explained that robotics could become an important scientific tool, allowing researchers to explore new ideas faster—but only if the technology becomes easier to develop and share across the global research community.
Training Robots in Virtual Worlds
Instead of collecting real-world demonstrations, the MolmoBot team created 1.8 million simulated robot manipulation trajectories using a system called MolmoSpaces.
The simulation environment uses the MuJoCo physics engine to generate realistic robotic interactions with objects. Researchers also applied domain randomization, systematically varying object shapes, lighting, camera angles, and other environmental conditions between episodes so the model learns features that generalize rather than the quirks of any single scene.
This approach allows robots to learn from a huge variety of virtual experiences without requiring human operators.
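The article does not detail MolmoSpaces' exact randomization scheme, but the general technique is straightforward to sketch. In the minimal Python example below, every parameter name and range is illustrative rather than taken from the actual system: each training episode draws a fresh set of scene properties before the simulator rolls out a trajectory.

```python
# Minimal sketch of domain randomization for simulated manipulation.
# Parameter names and ranges are illustrative, not the actual
# MolmoSpaces configuration, which the article does not describe.
import random

def sample_randomized_scene():
    """Draw a fresh set of scene parameters for one training episode."""
    return {
        # Object geometry: vary scale and mass so the policy can't
        # overfit to one specific object.
        "object_scale": random.uniform(0.8, 1.2),
        "object_mass_kg": random.uniform(0.05, 0.5),
        # Lighting: intensity and direction change every episode.
        "light_intensity": random.uniform(0.4, 1.6),
        "light_azimuth_deg": random.uniform(0.0, 360.0),
        # Camera: jitter the pose so the policy tolerates viewpoint shifts.
        "camera_yaw_deg": random.gauss(0.0, 5.0),
        "camera_pitch_deg": random.gauss(-30.0, 3.0),
        # Physics: randomize friction so grasps work on varied surfaces.
        "table_friction": random.uniform(0.5, 1.2),
    }

def generate_episodes(n):
    """Yield n episodes, each with an independently randomized scene."""
    for _ in range(n):
        scene = sample_randomized_scene()
        # In a real pipeline, these parameters would be written into the
        # simulator's model before rolling out a trajectory.
        yield scene

if __name__ == "__main__":
    for i, scene in enumerate(generate_episodes(3)):
        print(f"episode {i}: {scene}")
```

The point of constant randomization is that no single rendering or physics quirk stays fixed long enough for the model to exploit it, which is what makes later transfer to real hardware plausible.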
Massive Data Generation with GPUs
Using NVIDIA A100 GPUs, the system produced around 1,024 training episodes per GPU-hour, the equivalent of more than 130 hours of robot experience for every hour of wall-clock time. Compared with teleoperated data collection, this simulation pipeline delivers roughly four times the data throughput, letting developers train robotics models much faster and at lower cost.
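These figures invite a quick back-of-the-envelope check. The short Python sketch below derives two numbers the article does not state directly, assuming the 130:1 experience-to-wall-clock ratio applies to a single GPU (the article does not say whether it is per GPU or cluster-wide):

```python
# Back-of-the-envelope check on the throughput figures quoted above.
# Assumption: the 130:1 ratio is per GPU; the article does not specify.

episodes_per_gpu_hour = 1024
experience_hours_per_real_hour = 130
total_trajectories = 1_800_000

# Implied average simulated episode length under the per-GPU assumption.
avg_episode_seconds = experience_hours_per_real_hour * 3600 / episodes_per_gpu_hour
print(f"implied average episode length: {avg_episode_seconds:.0f} s")  # ~457 s

# Total GPU time needed to generate the full 1.8M-trajectory dataset.
gpu_hours = total_trajectories / episodes_per_gpu_hour
print(f"total generation cost: {gpu_hours:.0f} GPU-hours "
      f"(~{gpu_hours / 24:.0f} GPU-days)")  # ~1758 GPU-hours, ~73 GPU-days
```

Under that assumption, an average episode works out to roughly seven and a half minutes of simulated experience, and the full 1.8-million-trajectory dataset would cost on the order of 1,800 GPU-hours to generate.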
Real-World Testing
Despite being trained entirely in virtual environments, MolmoBot models transferred their skills to real robots without any additional real-world training, a property known as zero-shot sim-to-real transfer. The system was tested on platforms such as the Rainbow Robotics RB-Y1 and the Franka FR3.
In tabletop pick-and-place tasks, MolmoBot achieved a 79.2% success rate, outperforming π0.5, which was trained using extensive real-world data and achieved 39.2% success.
The robots were also able to perform complex actions such as approaching objects, grasping them, and opening doors—without being specifically trained for those exact scenarios.
Open Tools for the Robotics Community
Another key aspect of the project is its open-source approach. The entire MolmoBot ecosystem—including datasets, training pipelines, and model architectures—has been released publicly.
According to Farhadi, open tools are essential for accelerating innovation in robotics.
He emphasized that progress in AI should not rely on closed datasets or proprietary systems. Instead, shared research infrastructure will allow scientists and engineers around the world to collaborate and build more capable physical AI systems.
The Future of Physical AI
By shifting the focus from collecting expensive real-world demonstrations to building better virtual environments, MolmoBot could redefine how robots are trained.
If simulation-based training continues to improve, companies and research institutions may soon be able to develop advanced robotics systems faster, cheaper, and with far fewer physical experiments—bringing the era of practical physical AI closer than ever before.
