In a paper published on the preprint server arXiv.org this week, researchers at Uber’s Advanced Technologies Group (ATG) propose an AI technique to improve autonomous vehicles’ traffic movement predictions. It’s directly applicable to the driverless technologies that Uber itself is developing, which must be able to detect, track, and anticipate surrounding cars’ trajectories in order to safely navigate public roads.
It’s well understood that without the ability to predict the decisions other drivers on the road might make, vehicles can’t be fully autonomous. In a tragic case in point, an Uber self-driving prototype hit and killed a pedestrian in Tempe, Arizona, in March 2018, partly because the vehicle failed to detect and avoid the victim. ATG’s research is novel in that it employs a generative adversarial network (GAN) to make car trajectory predictions as opposed to less complex architectures, and it promises to advance the state of the art by boosting the precision of predictions by an order of magnitude.
The coauthors’ GAN, called SC-GAN (short for “scene-compliant GAN”), creates trajectories that follow the constraints present within a scene. To do so, it is given access to high-definition maps of scenes (including roads, crosswalk locations, lane directions, traffic lights, and signage) and to detection and tracking systems informed by on-car lidar, radar, and camera sensors. The GAN’s outputs are expressed in each nearby car’s own frame of reference, with the origin at the car’s center position and the x- and y-axes defined by the car’s heading and left-hand side, respectively.
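To make the per-car frame of reference concrete, here is a minimal Python sketch of converting a world-frame point into a car's own frame. It is not taken from the paper: the function name, the radian heading measured counterclockwise from the world x-axis, and the right-handed world frame are all assumptions made for illustration.

```python
import math


def world_to_actor_frame(point, center, heading_rad):
    """Express a world-frame (x, y) point in a car's own frame.

    Assumed conventions (hypothetical, not from the paper): heading_rad is
    measured counterclockwise from the world x-axis, and the world frame is
    right-handed, so the car's left-hand side is the positive local y-axis.
    """
    # Translate so the car's center becomes the origin.
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    # Rotate by -heading so the car's heading lines up with the local x-axis.
    cos_h, sin_h = math.cos(heading_rad), math.sin(heading_rad)
    x_local = dx * cos_h + dy * sin_h
    y_local = -dx * sin_h + dy * cos_h
    return x_local, y_local
```

Under these assumptions, a point 3 meters directly ahead of a car maps to roughly (3, 0) and a point 2 meters to its left maps to roughly (0, 2), whichever way the car is facing in the world.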
For each individual car whose potential future trajectories the GAN predicts, the scene context information and map constraints are bundled into an RGB image that can be represented by a mathematical object called a matrix. (Matrices are rectangular arrays of numbers arranged in rows and columns, and they’re often used to represent concepts in a format upon which AI models can operate.) The images capture 10 meters behind the cars and 30 meters on either side of them.
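As a rough illustration of how a scene image can be held as an array of numbers, the Python sketch below builds an empty RGB raster and maps a point given in a car's own frame (x ahead, y to the left) to a pixel. The 10 meters behind and 30 meters to each side come from the description above; the forward extent (ahead_m), the resolution (res_m), the pixel layout, and both function names are hypothetical placeholders, not values from the paper.

```python
import math


def make_raster(behind_m=10.0, side_m=30.0, ahead_m=50.0, res_m=0.5):
    """Return an all-black RGB raster as rows x cols x 3 nested lists.

    ahead_m and res_m are placeholder assumptions for this sketch.
    """
    rows = round((ahead_m + behind_m) / res_m)
    cols = round(2 * side_m / res_m)
    return [[[0, 0, 0] for _ in range(cols)] for _ in range(rows)]


def actor_to_pixel(x, y, behind_m=10.0, side_m=30.0, ahead_m=50.0, res_m=0.5):
    """Map a car-frame point (x ahead, y left) to (row, col), or None if outside."""
    rows = round((ahead_m + behind_m) / res_m)
    cols = round(2 * side_m / res_m)
    row = math.floor((ahead_m - x) / res_m)  # row 0 is the farthest point ahead
    col = math.floor((side_m - y) / res_m)   # col 0 is the far left
    if 0 <= row < rows and 0 <= col < cols:
        return row, col
    return None


raster = make_raster()
pixel = actor_to_pixel(0.0, 0.0)  # the car's own center
if pixel is not None:
    raster[pixel[0]][pixel[1]] = [255, 0, 0]  # mark it in red
```

A real rasterizer would draw road polygons, lanes, and other actors in distinct colors; this sketch only shows the coordinate bookkeeping behind treating the image as a matrix.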
In experiments, the team implemented their proposed AI system and several baselines in Google’s TensorFlow machine learning framework and sourced a large-scale, real-world data set (ATG4D) comprising 240 hours of data obtained by driving in various traffic conditions (e.g., varying times of day and days of the week across several U.S. cities). Every 0.1 seconds, each car produced a single data point consisting of its current and past 0.4 seconds of observed velocities, accelerations, headings, and turning rates, for a total of 7.8 million data points. These were split, along with the surrounding high-definition map information, into model training, testing, and evaluation sets.
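The sampling scheme described above (one data point per car every 0.1 seconds, each covering the current state plus the past 0.4 seconds) amounts to a sliding window of five states. The Python sketch below shows one way to build such windows from a single car's track. The State type, the function name, and the choice to skip the first few timesteps rather than pad them are assumptions for illustration; the article does not say how the researchers handled short histories.

```python
from typing import List, NamedTuple


class State(NamedTuple):
    """One observation of a tracked car (field names are hypothetical)."""
    velocity: float
    acceleration: float
    heading: float
    turning_rate: float


def build_data_points(track: List[State], history_steps: int = 4) -> List[List[State]]:
    """Slide a window over a track sampled every 0.1 s.

    With history_steps=4, each data point holds the current state plus the
    past 0.4 s, i.e. five states. Timesteps without a full history are
    skipped (an assumption of this sketch).
    """
    return [
        track[i - history_steps : i + 1]
        for i in range(history_steps, len(track))
    ]
```

For example, a car tracked for one second (10 samples) would yield six data points under this scheme, the first ending at the fifth sample and the last ending at the tenth.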
The researchers report that SC-GAN reduced off-road false positives — a metric measuring the percentage of predicted trajectory points that were off-road, or outside of the drivable region for each car — by 50% compared with a baseline. Furthermore, it outperformed the existing state-of-the-art GAN architectures for motion prediction, decreasing both average and final prediction error numbers “significantly.”
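The metrics mentioned above can be sketched in a few lines of Python. The off-road measure follows the definition given here (the percentage of predicted trajectory points outside the drivable region). Reading the "average and final prediction error" as the mean and last-step Euclidean distance between predicted and observed positions is an assumption based on common practice in motion prediction, not something the article spells out. The is_drivable callback is a hypothetical stand-in for a real map lookup.

```python
import math


def off_road_rate(points, is_drivable):
    """Percentage of predicted (x, y) points that fall outside the drivable region."""
    if not points:
        return 0.0
    off_road = sum(1 for x, y in points if not is_drivable(x, y))
    return 100.0 * off_road / len(points)


def displacement_errors(predicted, observed):
    """Return (average, final) Euclidean error between two equal-length trajectories."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("trajectories must be non-empty and the same length")
    distances = [
        math.hypot(px - ox, py - oy)
        for (px, py), (ox, oy) in zip(predicted, observed)
    ]
    return sum(distances) / len(distances), distances[-1]


# Toy drivable region: a straight lane 4 meters wide centered on y = 0.
def in_lane(x, y):
    return abs(y) <= 2.0


rate = off_road_rate([(0, 0), (1, 0), (2, 3), (3, 5)], in_lane)
average_error, final_error = displacement_errors([(0, 0), (3, 4)], [(0, 0), (0, 0)])
```

In the toy example, two of the four predicted points leave the lane, so the off-road rate is 50.0, and the second trajectory ends 5 meters from the observed position, giving an average error of 2.5 and a final error of 5.0.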
Qualitatively, the researchers say that SC-GAN successfully predicted cars’ movements even in fairly challenging edge cases. For instance, in a scene where a car was approaching an intersection in a straight-only lane, SC-GAN correctly predicted that it would continue straight even though the car’s tracked heading slightly tilted to the left. In another scene, SC-GAN rightly anticipated that a car would take a right turn after approaching an intersection in a turning lane.
“Motion prediction is one of the critical components of the self-driving technology, modeling future behavior and uncertainty of the tracked actors in [the self-driving vehicle’s] vicinity,” wrote the study’s coauthors. “Extensive qualitative and quantitative analysis showed that the method outperforms the current state-of-the-art in GAN-based motion prediction of the surrounding actors, producing more accurate and realistic trajectories.”