OpenAI disbands its robotics research team

OpenAI has disbanded its robotics team after years of research into machines that can learn to perform tasks like solving a Rubik's Cube. Company cofounder Wojciech Zaremba quietly revealed on a podcast hosted by startup Weights & Biases that OpenAI has shifted its focus to other domains, where data is more readily available.

"So it turns out that we can make a gigantic progress whenever we have access to data. And I kept all of our machinery unsupervised, [using] reinforcement learning -- [it] work[s] extremely well. There [are] actually plenty of domains that are very, very rich with data. And ultimately that was holding us back in terms of robotics," Zaremba said. "The decision [to disband the robotics team] was quite hard for me. But I got the realization some time ago that actually, that's for the best from the perspective of the company."

In a statement, an OpenAI spokesperson told VentureBeat: "After advancing the state of the art in reinforcement learning through our Rubik's Cube project and other initiatives, last October we decided not to pursue further robotics research and instead refocus the team on other projects. Because of the rapid progress in AI and its capabilities, we've found that other approaches, such as reinforcement learning with human feedback, lead to faster progress in our reinforcement learning research."

OpenAI first widely demonstrated its robotics work in October 2019, when it published research detailing a five-fingered robotic hand guided by an AI model with 13,000 years of cumulative experience. The best-performing system could successfully unscramble Rubik's Cubes about 20% to 60% of the time, which might not seem especially impressive. But the model notably discovered techniques to recover from challenges, like when the robot's fingers were tied together and when the hand was wearing a leather glove.

This was the culmination of over two years of work. In May 2017, OpenAI released Roboschool, open source software for controlling robotics in simulation. That same year, the company said it had created a robotics system, trained entirely in simulation and deployed on a physical robot, that could learn a new task after seeing it done once. And in 2018, OpenAI made available simulated robotics environments and a baseline implementation of Hindsight Experience Replay, a reinforcement learning algorithm that can learn from failure.
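As a rough illustration of what "learning from failure" can mean, the sketch below shows the goal-relabeling idea behind Hindsight Experience Replay. It is not OpenAI's baseline implementation: the grid-style states, the `Transition` type, the `sparse_reward` function, and the "final state as goal" strategy are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

State = Tuple[int, int]  # hypothetical 2D grid position


@dataclass(frozen=True)
class Transition:
    state: State
    action: str
    next_state: State
    goal: State
    reward: float


def sparse_reward(achieved: State, goal: State) -> float:
    # Assumed sparse reward: 0 when the goal is reached, -1 otherwise.
    return 0.0 if achieved == goal else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    # "Final" relabeling: treat the state the agent actually ended in
    # as if it had been the goal all along, and recompute the rewards.
    hindsight_goal = episode[-1].next_state
    return [
        replace(
            t,
            goal=hindsight_goal,
            reward=sparse_reward(t.next_state, hindsight_goal),
        )
        for t in episode
    ]


# A failed episode: the goal was (3, 3), but the agent stopped at (1, 1).
goal = (3, 3)
path = [(0, 0), (1, 0), (1, 1)]
episode = [
    Transition(s, "move", s_next, goal, sparse_reward(s_next, goal))
    for s, s_next in zip(path, path[1:])
]

# Store both the original (all-failure) and the relabeled transitions.
replay_buffer = episode + relabel_with_final_goal(episode)
```

In the original episode every transition carries the failure reward, so there is little signal to learn from. After relabeling, the last transition counts as a success for the substituted goal, which gives a goal-conditioned learner something to train on even though the intended goal was never reached.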

"The sad thing is, if we were a robotics company, the mission of the company would be different, and I think we would continue. I believe quite strongly in the approach that [the] robotics [team] took and the direction," Zaremba added. "But from the perspective of what we want to achieve, which is to build [artificial general intelligence], there were some components missing."

Artificial general intelligence

OpenAI has long asserted that immense computational horsepower is a necessary step on the road to artificial general intelligence (AGI), or AI that can learn any task a human can. While luminaries like Mila founder Yoshua Bengio and Facebook VP and chief AI scientist Yann LeCun argue that AGI can't exist, OpenAI's cofounders and backers -- among them Greg Brockman, chief scientist Ilya Sutskever, Elon Musk, Reid Hoffman, and former Y Combinator president Sam Altman -- believe powerful computers in conjunction with reinforcement learning, pretraining, and other techniques can achieve paradigm-shifting AI advances.

As MIT Technology Review reported in 2020, a team within OpenAI called Foresight runs experiments to test how far it can push AI capabilities by training algorithms with increasingly large amounts of data and compute. According to that same report, OpenAI is using massive computational resources to develop a system trained on images, text, and other data, an approach the company's leadership believes is the most promising path toward AGI.

One of the fruits of this effort is DALL-E, a text-to-image engine that's essentially a visual idea generator. Given a text prompt, the OpenAI system generates images to match the prompt, filling in the blanks when the prompt implies the image must contain a detail that isn't explicitly stated. DALL-E can combine disparate ideas to synthesize objects, some of which are unlikely to exist in the real world -- like a hybrid of a snail and a harp.

Brockman and Altman in particular believe AGI will be able to master more fields than any one person, chiefly by identifying complex cross-disciplinary connections that elude human experts. Furthermore, they predict that responsibly deployed AGI -- in other words, AGI deployed in "close collaboration" with researchers in relevant fields, like social science -- might help solve longstanding challenges in climate change, health care, and education.

Zaremba asserts that pretraining is a particularly powerful technique in the creation of large, sophisticated AI systems. At a high level, pretraining helps the model learn general features that can be reused on the target task to boost its accuracy. Pretraining was used to develop OpenAI's Codex, a model that's trained on billions of lines of public code to power Copilot, GitHub's service that provides suggestions for whole lines of code inside development environments like Microsoft Visual Studio. Codex is a fine-tuned version of OpenAI's GPT-3, a language model pretrained on over a trillion words from websites, books, Wikipedia, and other web sources.

"When we created robotics [systems], we thought that we could go very far with self-generated data and reinforcement learning. At the moment, I believe that pretraining [gives] model[s] 100 times cheaper 'IQ points,'" Zaremba said. "That might be followed with other techniques."

Commercial realities

OpenAI's move away from robotics might be a reflection of the economic realities the company faces. DeepMind, the Alphabet-owned AI research lab, has undergone a similar shift in recent years as R&D costs mount, moving away from prestige projects in favor of work with commercial applications, like protein shape prediction.

It's an open secret that robotics is a capital-intensive field. Industrial robotics company Rethink Robotics closed its doors months after attempting unsuccessfully to find an acquirer. Boston Dynamics, considered among the most advanced robotics firms, was acquired by Google and then sold to SoftBank before Hyundai agreed to buy a controlling stake for $1.1 billion. And Honda retired its Asimo robotics project after over a decade in development.

Roughly a year ago, Microsoft announced it would invest $1 billion in San Francisco-based OpenAI to jointly develop new technologies for Microsoft's Azure cloud platform. In exchange, OpenAI agreed to license some of its intellectual property to Microsoft, which the company would then package and sell to partners, and to train and run AI models on Azure as OpenAI worked to develop next-generation computing hardware.

In the months that followed, OpenAI released a Microsoft Azure-powered API that allows developers to explore GPT-3's capabilities. (OpenAI said recently that GPT-3 is now being used in more than 300 different apps by "tens of thousands" of developers and producing 4.5 billion words per day.) Toward the end of 2020, Microsoft announced that it would exclusively license GPT-3 to develop and deliver AI solutions for customers, as well as to create new products that harness the power of natural language generation.

Microsoft recently announced that GPT-3 will be integrated "deeply" with Power Apps, its low-code app development platform -- specifically for formula generation. The AI-powered features will allow a user building an ecommerce app, for example, to describe a programming goal using conversational language like "find products where the name starts with 'kids.'"

As for projects like DALL-E and Jukebox -- an AI system that can generate music in any style from scratch, complete with vocals -- they also have obvious and immediate business applications. OpenAI predicts that DALL-E could someday augment or even replace 3D rendering engines. For example, architects could use the tool to visualize buildings, while graphic artists could apply it to software and video game design.
