UC Berkeley researchers open-source RAD to improve any reinforcement learning algorithm

A group of University of California, Berkeley researchers this week open-sourced Reinforcement Learning with Augmented Data (RAD). In an accompanying paper, the authors say this module can improve any existing reinforcement learning algorithm and that RAD achieves better compute and data efficiency than Google AI's PlaNet, as well as recently released cutting-edge algorithms like DeepMind's Dreamer and SLAC from UC Berkeley and DeepMind.

RAD achieves state-of-the-art results on common benchmarks and matches or beats every baseline in terms of performance and data efficiency across 15 DeepMind control environments, the researchers say. It does this in part by applying data augmentations for visual observations. Coauthors of the paper on RAD include Michael "Misha" Laskin, Kimin Lee, and Berkeley AI Research codirector and Covariant founder Pieter Abbeel.

RAD was released Thursday on the preprint repository arXiv. Data augmentation has been important to advances in convolutional neural networks (CNNs) for challenges like robotic grasping and achieving human-level performance in games like Go.

"For the first time, we show that data augmentations alone can significantly improve the data-efficiency and generalization of RL methods operating from pixels, without any changes to the underlying RL algorithm, on the DeepMind Control Suite and the OpenAI ProcGen benchmarks, respectively," the paper reads. "By using multiple augmented views of the same data point as input, CNNs are forced to learn consistencies in their internal representations. This results in a visual representation that improves generalization, data-efficiency, and transfer learning."

Data augmentation techniques increase diversity in training data sets without collecting new data. "We find that data diversity alone can make agents focus on meaningful information from high-dimensional observations without any changes to the reinforcement learning method," the authors note.
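As a rough illustration of the idea, the sketch below applies a random crop to each image observation in a batch before it would be handed to an unmodified learning algorithm. Random cropping is used here only as one common image augmentation; the function names, the 100-pixel and 84-pixel sizes, and the use of plain Python lists instead of tensors are assumptions made for this example, not details taken from the RAD code release.

```python
import random

def random_crop(image, out_size, rng):
    """Return an out_size x out_size window of `image` at a random offset.

    `image` is a list of rows (H x W); each row is a list of pixel values.
    """
    height, width = len(image), len(image[0])
    if out_size > height or out_size > width:
        raise ValueError("crop size exceeds image size")
    top = rng.randint(0, height - out_size)   # randint bounds are inclusive
    left = rng.randint(0, width - out_size)
    return [row[left:left + out_size] for row in image[top:top + out_size]]

def augment_batch(batch, out_size, rng):
    """Crop every observation independently; the RL update itself is untouched."""
    return [random_crop(obs, out_size, rng) for obs in batch]

rng = random.Random(0)
# Hypothetical 100x100 grayscale observations, standing in for a replay-buffer sample.
batch = [[[rng.randrange(256) for _ in range(100)] for _ in range(100)]
         for _ in range(4)]
augmented = augment_batch(batch, 84, rng)  # same frames, shifted viewpoints
```

Because each observation gets its own random offset, the agent sees slightly different views of the same underlying frames across training steps, which is the kind of added diversity the authors describe.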

It's been a busy week for the machine learning subfield of reinforcement learning.

Earlier this week, NYU researchers released work on arXiv that also applies data augmentation and that they say achieves state-of-the-art results on the DeepMind Control Suite.

And at the entirely digital International Conference on Learning Representations (ICLR) this week, Google AI researchers introduced methods for measuring the reliability of reinforcement learning algorithms, and Huawei AI researchers introduced Adversarial AutoAugment for improving data augmentation policy.

Abbeel also coauthored a number of reinforcement learning papers at ICLR, including HiPPO for training several levels of reinforcement learning algorithms at once and a paper on reinforcement learning and policy optimization that touches on data augmentation.

In a separate development earlier this week, Salesforce released the AI Economist, a reinforcement learning system the company claims is able to create optimal tax policies.
