During a workshop on autonomous driving at the Conference on Computer Vision and Pattern Recognition (CVPR) 2020, Waymo and Uber presented research to improve the reliability — and safety — of their self-driving systems. Waymo principal scientist Drago Anguelov detailed ViDAR, a camera and range-centric framework covering scene geometry, semantics, and dynamics. Raquel Urtasun, chief scientist at Uber’s Advanced Technologies Group, demonstrated a pair of technologies that leverage vehicle-to-vehicle communication for navigation, traffic modeling, and more.


ViDAR, a collaboration between Waymo and one of Google’s several AI labs, Google Brain, infers structure from motion. It learns 3D geometry from image sequences — i.e., frames captured by car-mounted cameras — by exploiting motion parallax, the apparent shift in objects’ positions caused by the observer’s own movement. Given a pair of images and lidar data, ViDAR can predict future camera viewpoints and depth data.
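Waymo hasn’t published ViDAR’s exact formulation, but the geometric principle it exploits is the standard parallax relation: the farther a point is, the less it appears to shift as the camera moves. A minimal sketch (the function name and values are illustrative, not Waymo’s):

```python
import numpy as np

def depth_from_parallax(focal_px, baseline_m, disparity_px):
    """Classic parallax relation: depth = f * B / d.

    focal_px     -- camera focal length, in pixels
    baseline_m   -- distance the camera traveled between the two frames
    disparity_px -- apparent pixel shift of the same scene point
    """
    disparity_px = np.asarray(disparity_px, dtype=float)
    # Points with zero disparity are effectively at infinity.
    return np.where(disparity_px > 0,
                    focal_px * baseline_m / np.maximum(disparity_px, 1e-6),
                    np.inf)

# A point that shifts 20 px between frames, seen through a 1000 px focal
# length after 0.5 m of camera travel, sits about 25 m away.
print(float(depth_from_parallax(1000.0, 0.5, 20.0)))  # -> 25.0
```

In practice a learned model predicts dense depth for every pixel rather than triangulating individual points, but the supervisory signal comes from this same geometry.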

According to Anguelov, ViDAR uses shutter timings to account for rolling shutter, the camera capture method in which not all parts of a scene are recorded simultaneously. (It’s what’s responsible for the “jello effect” in handheld shots or when shooting from a moving vehicle.) Along with support for up to five cameras, this correction lets the framework avoid row-to-row displacements at higher speeds while improving accuracy.
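The talk didn’t detail the correction itself, but a back-of-the-envelope calculation shows why per-row shutter timings matter at driving speeds (the helper names and numbers below are illustrative assumptions):

```python
import numpy as np

def row_capture_times(frame_start_s, readout_time_s, num_rows):
    """With a rolling shutter, image rows are read out sequentially over
    readout_time_s, so each row gets its own capture timestamp."""
    return frame_start_s + np.linspace(0.0, readout_time_s, num_rows)

def row_displacement_m(row_times_s, speed_mps):
    """Distance the vehicle travels between the first row's capture
    and each later row's capture."""
    return (row_times_s - row_times_s[0]) * speed_mps

times = row_capture_times(0.0, 0.03, 1080)  # 30 ms readout, 1080-row image
shift = row_displacement_m(times, 30.0)     # ~108 km/h
print(round(float(shift[-1]), 3))           # bottom row: 0.9 m of travel
```

At highway speed, the car moves nearly a meter between the top and bottom rows of a single frame; ignoring that would smear any geometry inferred from the image.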


Above: A depth prediction model created with ViDAR.

ViDAR is being used internally at Waymo to provide state-of-the-art camera-centric depth, ego-motion (estimating a camera’s motion relative to a scene), and dynamics models. It led to the creation of a model that estimates depth from camera images and one that predicts the direction obstacles (including pedestrians) will travel, among other advances.



Researchers at Uber’s Advanced Technologies Group (ATG) created a system called V2VNet that enables autonomous cars to efficiently share information with each other over the air. Using V2VNet, cars within the network exchange messages containing data sets, timestamps, and location information, compensating for time delays with an AI model and intelligently selecting only relevant data (e.g., lidar sensor readings) from the data sets.


Above: Predictions informed by V2VNet.

To evaluate V2VNet’s performance, ATG compiled a large-scale vehicle-to-vehicle corpus using a “lidar simulator” system. Specifically, the team generated reconstructions of 5,500 logs from real-world lidar sweeps (for a total of 46,796 training and 4,404 validation frames), simulated from viewpoints of up to seven vehicles.

The results of several experiments show V2VNet achieved a 68% lower error rate than a single-vehicle baseline. Performance increased with the number of vehicles in the network, with “significant” improvements on distant and occluded objects and on cars traveling at high speed.

It’s unclear whether V2VNet will make its way into production on real-world cars, but Uber rival Waymo’s driverless Chrysler Pacifica minivans wirelessly exchange information about hazards and route changes via dual modems. “[Our cars] still have to rely on onboard computation for anything that is safety-critical, but … [5G] will be an accelerator,” said Waymo CTO Dmitri Dolgov in a presentation last year.