Uber, which hasn’t publicly discussed the architecture of its autonomous car platform in great detail, today published a post laying out the technologies that enable engineers within its Advanced Technologies Group (ATG) to test, validate, and deploy AI models to cars. It gives a glimpse into the complexities of self-driving car development generally, and perhaps more importantly, it serves as a yardstick for Uber’s driverless efforts, which suffered a setback following a fatal accident in Tempe, Arizona in March 2018.
According to Uber, the most important component of the ATG’s workflow is VerCD, a set of tools and microservices developed specifically for prototyping self-driving vehicles. It tracks the dependencies among the various codebases, data sets, and AI models under development, ensuring that workflows start with a data set extraction stage followed by data validation, model training, model evaluation, and model serving stages.
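The stage ordering described above can be illustrated with a small dependency graph. This is only a sketch, not VerCD's actual implementation: the stage names and the `execution_order` helper are hypothetical, and the ordering relies on Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical stage names based on the order given in the article.
# Each key maps to the set of stages that must finish before it starts.
STAGE_DEPENDENCIES = {
    "data_set_extraction": set(),
    "data_validation": {"data_set_extraction"},
    "model_training": {"data_validation"},
    "model_evaluation": {"model_training"},
    "model_serving": {"model_evaluation"},
}


def execution_order(dependencies):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(STAGE_DEPENDENCIES)
print(order)
```

Because each stage here depends only on the one before it, only one valid order exists. A real workflow with branching dependencies would allow several.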
“VerCD … has become a reliable source of truth for self-driving sensor training data for Uber ATG,” wrote Uber. “By onboarding the data set building workflow onto VerCD, we have increased the frequency of fresh data set builds by over a factor of 10, leading to significant efficiency gains. Maintaining an inventory of frequently used data sets has also increased the iteration speed of [machine learning] engineers since the developer can continue their experimentation immediately without waiting several days for a new data set to be built. Furthermore, we have also onboarded daily and weekly training jobs for the flagship object detection and path prediction models for our autonomous vehicles. This frequent cadence of training reduced the time to detect and fix certain bugs down to a few days.”
Uber says the bulk of the engineering effort behind VerCD has gone into company-specific integrations that let existing systems interact with ATG’s full end-to-end machine learning workflow. To this end, the latest version of VerCD’s Orchestrator Service can call various data primitives to build a runtime of a self-driving vehicle for testing. It can also interact with a code repository while creating images with deep learning libraries, and it can replicate data sets between datacenters and to and from the cloud (should model training occur in these locations).
The bulk of the data sets that VerCD manages come from logs collected by the ATG’s self-driving cars. Log data is divided into training, testing, and validation sets, such that 75% goes to training, 15% to testing, and 10% to validation. It includes images from cameras, lidar points and radar information, vehicle state (location, speed, acceleration, heading), and map data (such as the vehicle’s route and the lanes it used). A proprietary tool called GeoSplit selects logs and splits them among the training, testing, and validation sets based on their geographical location.
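Uber has not published how GeoSplit works, so the following is only a sketch of one way a location-based split could be done, not a description of the tool. It assumes each log has a representative latitude and longitude, snaps it to a grid cell, and hashes the cell to pick a split. The grid size and the function names are hypothetical.

```python
import hashlib
import math

# Split shares from the article: 75% train, 15% test, 10% validation.
TRAIN_PERCENT, TEST_PERCENT = 75, 15
CELL_SIZE_DEGREES = 0.01  # hypothetical grid resolution, not from Uber


def grid_cell(latitude, longitude):
    """Snap a coordinate to the integer index of its grid cell."""
    return (
        math.floor(latitude / CELL_SIZE_DEGREES),
        math.floor(longitude / CELL_SIZE_DEGREES),
    )


def assign_split(latitude, longitude):
    """Map a log's location to 'train', 'test', or 'validation'."""
    row, column = grid_cell(latitude, longitude)
    # Hash the cell, not the log, so nearby logs share a split.
    digest = hashlib.sha256(f"{row}:{column}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    if bucket < TRAIN_PERCENT:
        return "train"
    if bucket < TRAIN_PERCENT + TEST_PERCENT:
        return "test"
    return "validation"


print(assign_split(40.4406, -79.9959))
```

Splitting by cell keeps logs from the same stretch of road out of both the training and the evaluation sets. The trade-off is that the 75/15/10 proportions hold only approximately, and across cells rather than across individual logs.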
A typical VerCD user provides the dependencies of any data set, model, or metric builds, and VerCD manages this information in a database backend. Upon registration of a new data set, the VerCD data set service stores the dependency metadata in a complementary database. Data sets are uniquely identified by name and a version number as well as the dependencies tracked by VerCD, allowing for the exact reproduction of sensor log IDs from autonomous vehicles, metadata describing data set lifecycle, and more. Machine learning models are also uniquely identified, supporting the reproduction of things like versioned data sets and the path to AI model training configuration files.
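To make the idea of name-plus-version identity concrete, here is a minimal sketch of a registry keyed that way. The class names, fields, and in-memory storage are all hypothetical stand-ins. The article says only that VerCD keeps this metadata in a database backend.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataSetVersion:
    """Immutable record for one build of a data set (hypothetical schema)."""
    name: str
    version: int
    sensor_log_ids: Tuple[str, ...]
    dependencies: Tuple[str, ...] = ()

    @property
    def key(self):
        # Name plus version uniquely identifies a data set build.
        return f"{self.name}@{self.version}"


class DataSetRegistry:
    """In-memory stand-in for the database that stores data set metadata."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        # Refuse to overwrite: an existing version must stay reproducible.
        if record.key in self._records:
            raise ValueError(f"{record.key} is already registered")
        self._records[record.key] = record

    def lookup(self, name, version):
        return self._records[f"{name}@{version}"]


registry = DataSetRegistry()
registry.register(DataSetVersion("object_detection", 1, ("log-001", "log-002")))
print(registry.lookup("object_detection", 1).sensor_log_ids)
```

Freezing the record and rejecting duplicate keys is one simple way to guarantee that looking up a given name and version always returns the same sensor log IDs.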
Uber ATG uses a hybrid approach to machine learning training, with training jobs running both in on-premises datacenters powered by graphics card and processor clusters and in the cloud. Uber’s Peloton, an open source unified resource scheduler, scales jobs by deploying them to processes on clusters, while Kubernetes deploys and scales apps across clusters of hosts.
Once a machine learning engineer defines the experimental model in VerCD’s Model Service API, the ATG’s systems begin training it. Importantly, VerCD supports a validation step that smooths the transition from an experimental model to a production one. Uber notes that this step enforces additional constraints on model training to ensure reproducibility and traceability.
Depending on how a training build turns out, VerCD designates the model as “failed,” “aborted,” or “successful.” If a model fails or must be aborted, the ML engineer can opt to rebuild it with a new set of parameters. Asynchronously, VerCD can initiate validation of the model, where the checks run on the training pipeline depend on the specific model being trained. A model may be promoted to production only when both the experimental build and validation succeed, according to Uber.
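The promotion rule described above reduces to a simple conjunction. The sketch below uses the three status labels quoted in the article. The enum, the function name, and the boolean validation flag are hypothetical and are not taken from VerCD’s API.

```python
from enum import Enum


class BuildStatus(Enum):
    """The three outcomes the article says VerCD assigns to a model."""
    FAILED = "failed"
    ABORTED = "aborted"
    SUCCESSFUL = "successful"


def can_promote(experimental_build, validation_passed):
    """Allow promotion only if the build succeeded and validation passed."""
    return experimental_build is BuildStatus.SUCCESSFUL and validation_passed


print(can_promote(BuildStatus.SUCCESSFUL, True))
print(can_promote(BuildStatus.SUCCESSFUL, False))
```

Since validation runs asynchronously, a real system would also need a "pending" state for validation. This sketch treats it as already resolved.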
The post might be perceived as an attempt at greater transparency; Uber has a mixed track record when it comes to self-driving car research, to put it mildly. It restarted tests of its driverless cars in Pittsburgh in December 2018, nine months after one of its prototype Volvo SUVs struck and killed a pedestrian in Tempe, and it subsequently began manual tests in San Francisco and Toronto. The National Transportation Safety Board later determined that Uber had disabled the automatic emergency braking system in the Volvo XC90 involved in the fatal crash. (The company said in internal documents that this was to “reduce the potential for erratic vehicle behavior.”)
In a blog post published in June 2018, head of Uber’s ATG Eric Meyhofer detailed newly implemented safeguards, such as a training program focused on safe manual driving and monitoring systems that alert remote monitors if drivers take their eyes off the road. And in a voluntary safety assessment filed with the National Highway Traffic Safety Administration, Uber said that with its newly established systems engineering testing team, it’s now better positioned “to reason over many possible outcomes to ultimately come to a safe response.”