Late last week, OpenAI confirmed it shuttered its robotics division in part due to difficulties in collecting the data necessary to break through technical barriers. After years of research into machines that can learn to perform tasks like solving a Rubik’s Cube, company cofounder Wojciech Zaremba said it makes sense for OpenAI to shift its focus to other domains, where training data is more readily available.
Beyond the commercial motivations for eschewing robotics in favor of media synthesis and natural language processing, OpenAI’s decision reflects a growing philosophical debate in AI and robotics research. Some experts believe training systems in simulation will be sufficient to build robots that can complete complex tasks, like assembling electronics. Others emphasize the importance of collecting real-world data, which can provide a stronger baseline.
A longstanding challenge in simulations built on real data is that every scene must respond to a robot's movements, even from angles and viewpoints the original sensor never recorded. Whatever a photo or video doesn't capture has to be rendered or simulated using predictive models, which is why simulation has historically relied on computer-generated graphics and physics-based rendering that only crudely represents the world.
But Julian Togelius, an AI and games researcher and associate professor at New York University, notes that robots pose challenges that don’t exist within the confines of simulation. Batteries deplete, tires behave differently when warm, and sensors regularly need to be recalibrated. Moreover, robots break and tend to be slow — and cost a pretty penny. The Shadow Dexterous Hand, the machine that OpenAI used in its Rubik’s Cube experiments, has a starting price in the thousands. And OpenAI had to improve the hand’s robustness by reducing its tendon stress.
“Robotics is an admirable endeavor, and I very much respect those who try to tame the mechanical beasts,” Togelius wrote in a tweet. “But they’re not a reasonable way to do reinforcement learning, or any other episode-hungry type of learning. In my humble opinion, the future belongs to simulations.”
Training robots in simulation
Gideon Kowadlo, the cofounder of Cerenaut, an independent research group developing AI to improve decision making, argues that no matter how much data is available in the real world, there’s more data in simulation — data that’s easier to control, ultimately. Simulators can synthesize different environments and scenarios to test algorithms under rare conditions. Moreover, they can randomize variables to create diverse training sets with varying objects and environment properties.
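The randomization idea described above can be sketched in a few lines. This is a minimal illustration, not any lab's actual pipeline: the parameter names, ranges, and the `randomize_environment` helper are all hypothetical, and the training call is a placeholder.

```python
import random

def randomize_environment():
    """Sample a new set of physical and visual properties for one episode.

    Randomizing properties like mass and friction per episode prevents a
    policy from overfitting to one fixed simulation. (All names and ranges
    here are illustrative assumptions.)
    """
    return {
        "object_mass_kg": random.uniform(0.05, 2.0),
        "friction_coeff": random.uniform(0.2, 1.2),
        "light_intensity": random.uniform(0.3, 1.0),
        "camera_jitter_deg": random.uniform(-5.0, 5.0),
    }

def run_training(num_episodes):
    """Run episodes, each in a freshly randomized environment."""
    params_log = []
    for _ in range(num_episodes):
        params = randomize_environment()
        params_log.append(params)
        # train_policy_one_episode(simulator, params)  # placeholder
    return params_log

log = run_training(3)
```

Because every episode sees a different world, a policy trained this way has to rely on cues that hold across all of them, which is what makes transfer to the real world plausible.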
Indeed, Ted Xiao, a scientist at Google's robotics division, says that OpenAI's move away from work with physical machines doesn't have to signal the end of the lab's research in this direction. By applying techniques including reinforcement learning to tasks like language and code understanding, OpenAI might be able to develop more capable systems that can then be applied back to robotics. For example, many robotics labs use humans holding controllers to generate data to train robots. But a general AI system that understands controller inputs (as in video games) and the video feeds from camera-equipped robots might learn to teleoperate quickly.
Recent studies hint at how a simulation-first approach to robotics might work. In 2020, Nvidia and Stanford developed a technique that decomposes vision and control tasks into machine learning models that can be trained separately. Microsoft has created an AI drone navigation system that can reason out the correct actions to take from camera images. Scientists at DeepMind trained a cube-stacking system to learn from observation in a simulated environment. And a team at Google detailed a framework that takes a motion capture clip of an animal and uses reinforcement learning to train a control policy, employing an adaptation technique to randomize the dynamics in the simulation by, for example, varying mass and friction.
In a blog post in 2017, OpenAI researchers wrote that they believe general-purpose robots can be built by training entirely in simulation, followed by a small amount of self-calibration in the real world. That increasingly appears to be the case.
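One simple form the real-world self-calibration step could take is fitting a small correction between what a sensor reads in simulation and what it reads on hardware. The sketch below, an assumption rather than OpenAI's method, fits a linear correction by ordinary least squares using only a handful of real measurements.

```python
def fit_linear_calibration(sim_readings, real_readings):
    """Fit real ≈ a * sim + b by ordinary least squares.

    A few paired measurements from the real robot are enough to
    estimate the gain (a) and offset (b) that map simulated sensor
    values onto real ones.
    """
    n = len(sim_readings)
    mean_x = sum(sim_readings) / n
    mean_y = sum(real_readings) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(sim_readings, real_readings))
    var = sum((x - mean_x) ** 2 for x in sim_readings)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

# Hypothetical data: the real sensor reads roughly 2 * sim + 0.1.
sim = [0.0, 1.0, 2.0, 3.0]
real = [0.1, 2.1, 4.1, 6.1]
a, b = fit_linear_calibration(sim, real)  # a == 2.0, b == 0.1
```

The appeal of this pattern is data efficiency: the policy is trained entirely in simulation, and only the cheap calibration step needs the physical robot.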
Thanks for reading,
AI Staff Writer