Researchers at the University of California, Berkeley have created a framework that teaches artificial intelligence systems to learn motions from YouTube video clips.
The framework combines computer vision and reinforcement learning to teach AI skills from videos. Altogether, the team was able to train AI to perform more than 20 acrobatic skills, such as cartwheels, handsprings, and backflips, along with some martial arts moves.
The method does not require the use of motion capture video, the kind often used to transfer human action to digital forms, such as the movement of LeBron James incorporated into NBA 2K18 or the performance of Andy Serkis as Gollum from Lord of the Rings.
The framework first processes a video to estimate the pose shown in each frame; a simulated character is then trained with reinforcement learning to imitate the movement. The system can also take a single image of a person in motion and predict how that motion is likely to play out.
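To make the second stage more concrete, here is a minimal Python sketch of how an imitation reward might score a simulated character against poses estimated from video. It is an illustration only, not the Berkeley team's implementation: the function names, the `scale` parameter, the exponentiated squared-error form, and the sample joint angles are all assumptions chosen for clarity.

```python
import math

def imitation_reward(sim_pose, ref_pose, scale=2.0):
    """Return a reward in (0, 1]; 1.0 means the poses match exactly.

    Poses are lists of joint angles in radians. The exponentiated
    squared error is an assumed form, not taken from the paper.
    """
    if len(sim_pose) != len(ref_pose):
        raise ValueError("poses must have the same number of joints")
    squared_error = sum((s - r) ** 2 for s, r in zip(sim_pose, ref_pose))
    return math.exp(-scale * squared_error)

def episode_return(sim_poses, ref_poses, scale=2.0):
    """Sum the per-frame rewards over a whole clip."""
    return sum(imitation_reward(s, r, scale)
               for s, r in zip(sim_poses, ref_poses))

# Hypothetical reference motion: two joint angles per video frame,
# standing in for the output of a pose estimator.
reference = [[0.0, 0.5], [0.1, 0.6], [0.2, 0.7]]
close_attempt = [[0.0, 0.5], [0.1, 0.65], [0.25, 0.7]]
poor_attempt = [[1.0, -0.5], [1.2, -0.4], [1.4, -0.2]]

# A reinforcement learning algorithm would adjust the character's
# controller to push this return higher over many simulated attempts.
print(episode_return(close_attempt, reference) >
      episode_return(poor_attempt, reference))
```

In this sketch, an attempt that tracks the reference poses closely earns a higher return than one that drifts away, which is the signal a reinforcement learning algorithm would use to improve the character's controller.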
Skills learned directly from videos can then be reused in different characters and environments. One was reused to train a simulation of Atlas, a humanoid robot from Boston Dynamics that grabbed the world’s attention last year for doing backflips.
The research released Tuesday follows work by Berkeley researchers highlighted in a paper last month about training AI systems to dance.
“All in all, our framework is really just taking the most obvious approach that anyone can think of when tackling the problem of video imitation. The key is in decomposing the problem into more manageable components, picking the right methods for those components, and integrating them together effectively,” authors Jason (Xue Bin) Peng and Angjoo Kanazawa said in a Berkeley AI Research blog post. “However, imitating skills from videos is still an extremely challenging problem, and there are plenty of video clips that we are not yet able to reproduce: Nimble dance steps, such as this Gangnam style clip, can still be difficult to imitate.”
Other authors include UC Berkeley Robot Learning Lab director Pieter Abbeel, UC Berkeley assistant professor Sergey Levine, and UC Berkeley professor Jitendra Malik.
The imitation techniques in the paper, released Tuesday on arXiv, bring to mind other deep learning approaches to training AI motion, such as a system that learned to walk with no prior training using OpenAI’s Gym simulated environment.
MIT CSAIL researchers are also exploring techniques to help robots compensate for a lack of logic or experience in the physical world, while the startup TwentyBN wants to train robots to recognize and interpret human action using computer vision.