Waymo's AI improves autonomous vehicle performance while saving costs

Waymo, Alphabet's self-driving vehicle research division, today detailed a system -- Progressive Population Based Augmentation (PPBA) -- it claims has improved the performance of its autonomous systems while reducing the amount of data required to train them. Specifically, Waymo says PPBA bolstered its cars' object detection capabilities while decreasing costs and accelerating the training process.

It's early days, but the approach could improve Waymo vehicles' robustness in challenging driving scenarios -- even while the fleet remains grounded by the COVID-19 pandemic.

The situations Waymo's cars encounter in the real world and in simulation give the company's engineers opportunities to train the models underlying the Waymo Driver, Waymo's full-stack driverless platform. By way of background, the Waymo Driver -- which is now in its fifth generation -- relies on a custom suite of lidar, cameras, and radars, as well as algorithms that enable it to interpret and respond to the sensor data.

Typically, ensuring these models are highly generalizable requires collecting a large, diverse set of training data and recruiting a human team to manually annotate the data. But PPBA automates the bulk of the process by discovering ways to synthesize additional data.

PPBA takes cues from AutoAugment, a Google Research and Google Brain project that uses various image augmentation operations -- such as rotation, cropping, image mirroring, and color shifting -- to morph and transform data. AutoAugment's search is trained through reinforcement learning, and it selects the best augmentation policy (i.e., combination of augmentation operations) for a given sample set while reducing the computational cost of searching for policies.
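To make the idea of an augmentation policy concrete, here is a minimal Python sketch. It is not AutoAugment's code: the operation names, the (operation, probability, magnitude) layout, and the toy grayscale image stored as nested lists are all assumptions chosen for illustration.

```python
import random


def mirror(image, _magnitude):
    """Flip every row left to right (magnitude is unused)."""
    return [row[::-1] for row in image]


def shift_brightness(image, magnitude):
    """Add a constant to every pixel, clamped to the 0..255 range."""
    return [[max(0, min(255, px + magnitude)) for px in row] for row in image]


# Hypothetical policy: a list of (operation, probability, magnitude) triples.
EXAMPLE_POLICY = [(mirror, 0.5, 0), (shift_brightness, 0.8, 30)]


def apply_policy(image, policy, rng):
    """Apply each operation in order, each with its own probability."""
    for operation, probability, magnitude in policy:
        if rng.random() < probability:
            image = operation(image, magnitude)
    return image


if __name__ == "__main__":
    sample = [[0, 100], [200, 250]]
    print(apply_policy(sample, EXAMPLE_POLICY, random.Random(0)))
```

A policy search then amounts to choosing which triples go in the list, which is the part AutoAugment automates.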

PPBA also builds on Waymo's existing data augmentation efforts. In early 2019, the company began applying techniques from a Google Brain and Google Research algorithm called RandAugment to image-based classification and detection tasks. Waymo reports that it achieved "significant" improvements in several classifiers and detectors as a result, including those that help classify foreign objects, such as construction equipment and animals.

PPBA targets lidar, which measures the distance to target objects by illuminating them with laser light and measuring the reflected pulses. Beyond 3D spatial information, logs from lidar sensors contain parameters such as distance, operation strength, and sampling probability.
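As a rough illustration of the ranging principle, a pulsed lidar can estimate distance from the round-trip time of a reflected pulse: the light travels out and back, so the range is half the total path. The sketch below assumes simple time-of-flight ranging and is not a description of Waymo's sensors; the function name is hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact, by definition of the metre


def range_from_round_trip(seconds):
    """Distance in metres to a target, given the pulse's round-trip time."""
    return SPEED_OF_LIGHT_M_PER_S * seconds / 2.0


if __name__ == "__main__":
    # A pulse that returns after one microsecond hit something about 150 m away.
    print(round(range_from_round_trip(1e-6), 3))
```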

To discover policies designed for point cloud data sets, PPBA works on a point cloud augmentation search space containing eight operations, each of which is associated with a probability and specific parameters. The list below starts with the unmodified original sample, followed by the eight operations:

The original data sample
A ground truth augmentation (which has parameters denoting the probability for sampling vehicles, pedestrians, and cyclists)
A random flip
World scaling
Global translate noise (which has parameters for the distortion magnitude of translation operations on certain coordinates)
Frustum dropout
Frustum noise
Random rotation
Random drop laser points
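Several of the geometric operations listed above can be sketched in a few lines of Python. This is an illustrative approximation, not Waymo's implementation: the function names, parameter ranges, and the representation of a point cloud as a list of (x, y, z) tuples are assumptions, and the ground truth augmentation and frustum operations are left out for brevity.

```python
import math
import random


def random_flip(points, rng):
    """Mirror the cloud across the x-z plane by negating y."""
    return [(x, -y, z) for x, y, z in points]


def world_scaling(points, rng, low=0.95, high=1.05):
    """Scale every coordinate by one shared random factor."""
    s = rng.uniform(low, high)
    return [(x * s, y * s, z * s) for x, y, z in points]


def random_rotation(points, rng, max_angle=math.pi / 4):
    """Rotate the cloud about the vertical (z) axis by a random angle."""
    angle = rng.uniform(-max_angle, max_angle)
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def global_translate_noise(points, rng, std=(0.2, 0.2, 0.1)):
    """Shift the whole cloud by one Gaussian offset per axis."""
    dx, dy, dz = (rng.gauss(0.0, sigma) for sigma in std)
    return [(x + dx, y + dy, z + dz) for x, y, z in points]


def random_drop_points(points, rng, keep_probability=0.9):
    """Keep each point independently with the given probability."""
    return [p for p in points if rng.random() < keep_probability]


def augment(points, operations, rng):
    """Apply (operation, probability) pairs in order."""
    for operation, probability in operations:
        if rng.random() < probability:
            points = operation(points, rng)
    return points


if __name__ == "__main__":
    cloud = [(1.0, 2.0, 0.5), (-3.0, 0.5, 1.0), (4.0, -1.0, 0.2)]
    ops = [(random_flip, 0.5), (world_scaling, 0.5), (random_rotation, 0.5),
           (global_translate_noise, 0.5), (random_drop_points, 0.5)]
    print(augment(cloud, ops, random.Random(0)))
```

In a real detection pipeline, the same geometric transforms would also have to be applied to the 3D bounding box labels so that the annotations stay aligned with the points.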

Inspired by biological evolution, PPBA learns to optimize augmentation strategies by starting with multiple search spaces and replacing underperforming ones with "offspring." At each iteration, it adopts the best parameters discovered in past iterations.
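The evolutionary loop described above can be approximated with a toy population-based search. This sketch is not PPBA itself: the scoring function is a hypothetical stand-in for validation accuracy, and the parameter names, population size, and mutation step are assumptions made for illustration.

```python
import random


def toy_score(params):
    """Hypothetical stand-in for validation accuracy (higher is better).

    Peaks when prob is 0.6 and magnitude is 0.3.
    """
    return -((params["prob"] - 0.6) ** 2 + (params["magnitude"] - 0.3) ** 2)


def mutate(params, rng, step=0.1):
    """Perturb each parameter slightly, keeping it inside [0, 1]."""
    return {name: min(1.0, max(0.0, value + rng.uniform(-step, step)))
            for name, value in params.items()}


def evolve(population_size=8, iterations=30, seed=0):
    """Keep the better half each round; refill with mutated 'offspring'."""
    rng = random.Random(seed)
    population = [{"prob": rng.random(), "magnitude": rng.random()}
                  for _ in range(population_size)]
    for _ in range(iterations):
        ranked = sorted(population, key=toy_score, reverse=True)
        survivors = ranked[: population_size // 2]
        offspring = [mutate(rng.choice(survivors), rng)
                     for _ in range(population_size - len(survivors))]
        population = survivors + offspring
    return max(population, key=toy_score)


if __name__ == "__main__":
    print(evolve())
```

Because the survivors are carried over unchanged, the best score in the population can never get worse from one iteration to the next, which mirrors the idea of reusing the best parameters found so far.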

Waymo claims that in experiments PPBA achieved performance improvements across detection architectures and saved costs because it only needs labeled lidar data for training. "Our experiments show that by applying automated data augmentation to lidar data, we can significantly improve 3D object detection without additional data collection or labeling," wrote Waymo in a blog post. "On the baseline 3D detection model, our method is up to 10 [times] more data efficient than without augmentation, enabling us to train machine learning models with fewer labeled examples, or use the same amount of data for better results, at a lower cost."

It's not the first time Waymo has used AI to expedite backend tasks like data augmentation and search.

Waymo previously collaborated with DeepMind on PBT (Population Based Training), which managed to reduce false positives by 24% in pedestrian, bicyclist, and motorcyclist recognition tasks while cutting training time and computational resources in half. Following a pilot study, PBT was integrated directly with Waymo's technical infrastructure, enabling researchers from across the company to apply it with a button click.

More recently, Waymo pulled back the curtain on Content Search, which draws on tech similar to that powering Google Photos and Google Image Search to let data scientists quickly locate almost any object in Waymo's driving history and logs. The company says this has contributed to "many improvements" across its system, including the ability to detect school buses with children about to step onto the sidewalk, people riding electric scooters, and a cat crossing a street.