MIT CSAIL's VISTA autonomous vehicle simulator transfers skills learned to the real world

In a recent study, researchers from MIT's Computer Science and Artificial Intelligence Laboratory and the Toyota Research Institute describe Virtual Image Synthesis and Transformation for Autonomy (VISTA), an autonomous vehicle development platform that uses a real-world data set to synthesize viewpoints from trajectories a car could take. While driverless car companies including Waymo, Uber, Cruise, and Aurora use simulation environments to train the AI underpinning their real-world cars, MIT claims its system is one of the few that doesn't require humans to manually add road markings, lanes, trees, physics models, and more. That could dramatically speed up autonomous vehicle testing and deployment.

As the researchers explain, VISTA rewards virtual cars for the distance they travel without crashing so that they're "motivated" to learn to navigate various situations, including regaining control after swerving between lanes. VISTA is data-driven, meaning that it synthesizes, from real driving data, new trajectories that stay consistent with the road's appearance and with the distance and motion of every object in the scene. This prevents mismatches between what's learned in simulation and how the cars operate in the real world.
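The reward idea described above can be sketched in a few lines. This is only an illustration under assumptions: the function name, the per-step distance list, and the rule that a crash ends the episode are hypothetical and are not taken from the researchers' code.

```python
def episode_reward(step_distances_m, crashed_flags):
    """Sum the distance driven until the first crash (hypothetical reward sketch).

    step_distances_m: distance covered in each simulation step, in meters.
    crashed_flags: True for a step in which the virtual car crashed.
    """
    total = 0.0
    for distance, crashed in zip(step_distances_m, crashed_flags):
        if crashed:
            # Assumption: a crash ends the episode, so later steps earn nothing.
            break
        total += distance
    return total
```

Under this sketch, a car that drives farther before crashing always scores higher, which is the incentive the article describes.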

To train VISTA, the researchers collected video data from a human driving down a few roads; for each frame, VISTA projected every pixel into a type of 3D point cloud. Then, the researchers placed a virtual vehicle inside the environment and rigged it so that when it issued a steering command, VISTA synthesized a new trajectory through the point cloud based on the steering curve and the vehicle's orientation and velocity.
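To make the steering step concrete, here is a minimal sketch of how a pose could be advanced from a steering curve, orientation, and velocity. It assumes a generic planar kinematic model in which the steering command is expressed as a path curvature; the function and parameter names are hypothetical, and the study may use a different vehicle model.

```python
import math

def step_pose(x, y, heading, curvature, speed, dt):
    """Advance a planar vehicle pose by one time step (illustrative model).

    heading is in radians, curvature in 1/m, speed in m/s, dt in seconds.
    """
    # Turning rate is speed times curvature for a car following an arc.
    new_heading = heading + speed * curvature * dt
    # Move forward along the updated heading.
    new_x = x + speed * math.cos(new_heading) * dt
    new_y = y + speed * math.sin(new_heading) * dt
    return new_x, new_y, new_heading
```

Calling this repeatedly with the virtual car's steering commands would trace out a path that differs from the human driver's, which is the kind of new trajectory the simulator then has to render.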

VISTA used the above-mentioned trajectory to render a photorealistic scene, estimating a depth map containing information relating to the distance of objects from the vehicle's viewpoint. By combining the depth map with a technique that estimates the camera's orientation within a 3D scene, the engine pinpointed the vehicle's location and relative distance from everything within the virtual simulator, while reorienting the original pixels to recreate a representation of the world from the vehicle's new viewpoint.
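The reorientation of pixels described above resembles standard depth-based reprojection. The sketch below assumes an ideal pinhole camera, a known depth for the pixel, and a new viewpoint that differs only by a sideways shift and a yaw rotation. All names and the sign conventions are assumptions chosen for illustration, not details from the study.

```python
import math

def reproject_pixel(u, v, depth, focal, cx, cy, yaw, tx):
    """Map a pixel with known depth into a shifted, rotated camera view.

    focal, cx, cy: pinhole focal length and principal point, in pixels.
    yaw: rotation of the new camera toward +x (to the right), in radians.
    tx: sideways shift of the new camera along +x, in meters.
    Returns (u, v) in the new view, or None if the point falls behind it.
    """
    # Back-project the pixel to a 3D point in the original camera frame.
    x = (u - cx) * depth / focal
    y = (v - cy) * depth / focal
    z = depth
    # Express the point relative to the shifted camera position.
    xs = x - tx
    # Rotate into the yawed camera frame (rotation about the vertical axis).
    xn = math.cos(yaw) * xs - math.sin(yaw) * z
    zn = math.sin(yaw) * xs + math.cos(yaw) * z
    if zn <= 0:
        return None
    # Project back onto the image plane of the new camera.
    return focal * xn / zn + cx, focal * y / zn + cy
```

With no shift and no rotation the pixel maps back to itself, and moving the camera to the right pushes scene points toward the left of the image, as expected.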

In tests conducted after 10 to 15 hours of training -- during which the virtual car drove 10,000 kilometers (about 6,200 miles) -- a car trained with the VISTA simulator was able to navigate through previously unseen streets. Even when positioned at off-road orientations that mimicked various near-crash situations, such as being half off the road or in another lane, the car successfully recovered to a safe driving trajectory within a few seconds.

In the future, the research team hopes to simulate all types of road conditions from a single driving trajectory, such as night and day and sunny and rainy weather. They also hope to simulate more complex interactions with other vehicles on the road.
