New Nvidia AI agent, powered by GPT-4, can train robots

Nvidia Research announced today that it has developed a new AI agent, called Eureka, that is powered by OpenAI's GPT-4 and can autonomously teach robots complex skills.

In a blog post, the company said Eureka, which autonomously writes reward algorithms, has, for the first time, trained a robotic hand to perform rapid pen-spinning tricks as well as a human can. Eureka has also taught robots to open drawers and cabinets, toss and catch balls, and manipulate scissors, among nearly 30 tasks.

“Reinforcement learning has enabled impressive wins over the last decade, yet many challenges still exist, such as reward design, which remains a trial-and-error process,” Anima Anandkumar, senior director of AI research at Nvidia and an author of the Eureka paper, said in the blog post. “Eureka is a first step toward developing new algorithms that integrate generative and reinforcement learning methods to solve hard tasks.”

Nvidia Research also published the Eureka library of AI algorithms so that people can experiment with them using Nvidia Isaac Gym, a physics simulation reference application for reinforcement learning research. Isaac Gym is built on Nvidia Omniverse, a development platform for building 3D tools and applications based on the OpenUSD framework.

Builds on previous Nvidia work on AI agents

Hype over AI agents has been swirling for months, fueled in part by the rise of autonomous AI agents like Auto-GPT, BabyAGI and AgentGPT in April.

The current Nvidia Research work builds on previous efforts, including the recent Voyager, an AI agent built with GPT-4 that can autonomously play Minecraft. In a New York Times article this week on efforts to transform chatbots into online agents, Jeff Clune, a computer science professor at the University of British Columbia who was previously an OpenAI researcher, said that "this is a huge commercial opportunity, potentially trillions of dollars," while adding that "this has a huge upside, and huge consequences, for society."

Outperforms expert human-engineered rewards

In a new research paper titled "Eureka: Human-level reward design via coding large language models," the authors said that Eureka "exploits the remarkable zero-shot generation, code-writing, and in-context improvement capabilities of state-of-the-art LLMs, such as GPT-4, to perform evolutionary optimization over reward code."

The resulting rewards, they said, can be used to acquire complex skills through reinforcement learning. "Without any task-specific prompting or pre-defined reward templates, Eureka generates reward functions that outperform expert human-engineered rewards. In a diverse suite of 29 open-source RL environments that include 10 distinct robot morphologies, Eureka outperforms human experts on 83% of the tasks, leading to an average normalized improvement of 52%."
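The evolutionary search over reward code that the authors describe can be illustrated with a toy loop. The sketch below is not Eureka's implementation: `mock_llm` is a hypothetical stand-in for the GPT-4 call (it only perturbs two coefficients of a fixed expression, whereas Eureka uses no reward templates), `train_and_score` replaces reinforcement learning with simple hill climbing on a one-dimensional task, and all names and parameters are assumptions made for illustration.

```python
import random

GOAL = 5.0  # toy task (assumption): move a 1-D point from 0.0 to GOAL


def mock_llm(parent, n, rng):
    """Hypothetical stand-in for the LLM: emit n reward-function sources.

    A real system would prompt a model with the task and training feedback;
    here we only perturb two coefficients of the best candidate so far.
    """
    base = parent["weights"] if parent else (0.0, 0.0)
    candidates = []
    for _ in range(n):
        w_dist = base[0] + rng.gauss(0, 1)
        w_drift = base[1] + rng.gauss(0, 1)
        source = (
            "def reward(pos, goal):\n"
            f"    return {w_dist!r} * -abs(goal - pos) + {w_drift!r} * pos\n"
        )
        candidates.append({"source": source, "weights": (w_dist, w_drift)})
    return candidates


def train_and_score(source, steps=40, step_size=0.5):
    """Stand-in for RL training: greedy hill climbing on the candidate reward.

    Returns the task fitness (negative final distance to GOAL), which is
    kept separate from the candidate reward being evaluated.
    """
    namespace = {}
    exec(source, namespace)  # compile the generated reward code
    reward = namespace["reward"]
    pos = 0.0
    for _ in range(steps):
        moves = (pos - step_size, pos, pos + step_size)
        pos = max(moves, key=lambda p: reward(p, GOAL))
    return -abs(GOAL - pos)


def evolve(iterations=5, samples=8, seed=0):
    """Sample candidates, score each one, and mutate from the best so far."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(iterations):
        for candidate in mock_llm(best, samples, rng):
            score = train_and_score(candidate["source"])
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score


if __name__ == "__main__":
    best, score = evolve()
    print(best["source"])
    print("fitness:", score)
```

The point of the sketch is the separation of roles: one component proposes reward code, another trains against it and reports a task-level score, and the best-scoring candidate seeds the next round of proposals.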

“Eureka is a unique combination of large language models and Nvidia's GPU-accelerated simulation technologies,” said Jim Fan, senior research scientist at Nvidia and one of the project’s contributors, in the blog post. “We believe that Eureka will enable dexterous robot control and provide a new way to produce physically realistic animations for artists.”
