Roughly a year ago, Scale and NuTonomy released a driverless data set called NuScenes that they claimed at the time surpassed corpora like KITTI, Baidu’s ApolloScape, and the Udacity Self-Driving Car library in size, scale, and accuracy. Since then, new and more diverse corpora like the Waymo Open Dataset, the Ford Autonomous Vehicle Dataset, and Lyft’s autonomous vehicle data set have emerged, but Motional — whose CEO founded NuTonomy — is looking to take back the crown with the release of an expanded NuScenes.

Data sets like NuScenes can be used to improve the robustness of self-driving cars in environments from cities to back roads. The Rand Corporation estimates that autonomous cars will have to rack up 11 billion miles before we’ll have reliable statistics on their safety, but as headwinds slow real-world testing, simulated miles have become the next best thing.

This expansion of NuScenes includes NuScenes-lidarseg, which improves the semantic segmentation of 1,000 Singapore and Boston scenes, making it one of the largest publicly available lidar segmentation data sets. According to Motional, NuScenes-lidarseg adds 1.4 billion annotated lidar points for a “significantly” more detailed picture of a vehicle’s surroundings than the original bounding boxes, allowing researchers to study things like lidar point cloud segmentation and foreground extraction.

The expanded data set also includes NuImages, a new corpus comprising nearly 100,000 annotated 2D images selected to represent a range of challenging, “educational” driving conditions. Motional says NuImages was created in response to user demand and that it is designed to help autonomous cars operate safely in “unpredictable” scenarios.

Both NuScenes-lidarseg and NuImages build on the existing NuScenes data set, which contains hundreds of scenes comprising over a million images captured using cameras, lidars, radars, GPS, and inertial measurement sensors. Motional says that since NuScenes was released in March 2019, over 8,000 researchers have used it, more than 10 new data sets have been made publicly available, and over 250 scientific papers have cited the data.

