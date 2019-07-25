Sussing out AI models capable of reliably guiding driverless cars typically requires endless amounts of testing and fine-tuning, not to mention computational power out the wazoo. In an effort to make AI algorithm training more effective and efficient, Google parent company Alphabet’s Waymo is collaborating with DeepMind on techniques inspired by evolutionary biology, the two companies revealed in a blog post this morning.

As DeepMind explains, AI algorithms self-improve through trial and error. A model is presented with a task, and over time, it learns to perform this task by continually attempting it and adjusting based on the feedback it receives. Performance is heavily dependent on the training regimen (known as a hyperparameter schedule), and finding the best regimen is commonly left to experienced researchers and engineers. They hand-pick AI models undergoing training, culling the weakest performers and freeing resources to train new algorithms from scratch.

DeepMind devised a less labor-intensive approach in PBT (Population Based Training), which starts with multiple machine learning models initialized with random variables (hyperparameters). The models are evaluated periodically and compete with each other in an evolutionary fashion, such that underperforming members of the population are replaced with “offspring” (copies of better-performing members with slightly mutated variables). PBT doesn’t necessitate restarting training from scratch, because each offspring inherits the state of its parent network, and the hyperparameters are updated actively throughout training. The net result is that PBT spends the bulk of its resources training with “good” hyperparameter values.
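The evaluate-replace-mutate loop described above can be illustrated with a toy sketch. This is not DeepMind's or Waymo's code: the single `lr` hyperparameter, the made-up `train_step` reward, the 25% replacement fraction, and the 0.8/1.2 mutation factors are all assumptions chosen only to make the example runnable.

```python
import random


def train_step(member):
    """Toy stand-in for one training step (hypothetical reward).

    The score grows fastest when the learning rate is near 0.1.
    """
    member["score"] += 1.0 / (1.0 + 100.0 * (member["lr"] - 0.1) ** 2)


def exploit_and_explore(population, rng, fraction=0.25):
    """Replace the weakest members with mutated copies of the strongest."""
    ranked = sorted(population, key=lambda m: m["score"], reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))
    top, bottom = ranked[:cutoff], ranked[-cutoff:]
    for loser in bottom:
        parent = rng.choice(top)
        # Exploit: inherit the parent's state (the score stands in for weights).
        loser["score"] = parent["score"]
        # Explore: slightly mutate the inherited hyperparameter.
        loser["lr"] = parent["lr"] * rng.choice([0.8, 1.2])


def run_pbt(pop_size=8, rounds=20, steps_per_round=5, seed=0):
    rng = random.Random(seed)
    # Start from randomly initialized hyperparameters.
    population = [
        {"lr": 10 ** rng.uniform(-3, 0), "score": 0.0} for _ in range(pop_size)
    ]
    for _ in range(rounds):
        for member in population:
            for _ in range(steps_per_round):
                train_step(member)
        # Periodic evaluation and evolutionary competition.
        exploit_and_explore(population, rng)
    return max(population, key=lambda m: m["score"])


best = run_pbt()
```

Because losers copy a winner's state rather than restarting, no training progress is thrown away, which is the property the article attributes to PBT.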

That said, PBT isn’t perfect: it tends to optimize for the present and fails to consider long-term outcomes, disadvantaging late-blooming AI models. To mitigate this, researchers at DeepMind trained a larger population and created subpopulations called niches, in which algorithms are only allowed to compete within their own subgroups. Lastly, they directly rewarded diversity by giving more unique models an edge in the competition.
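A rough sketch of niche-restricted competition with a diversity bonus might look like the following. The article does not say how DeepMind measures uniqueness, so the bonus here (distance of a hypothetical `lr` hyperparameter from the population mean) and the `diversity_weight` parameter are purely illustrative assumptions.

```python
import random


def compete_within_niches(population, rng, diversity_weight=0.1):
    """Replace the weakest member of each niche with a mutated copy of that
    niche's strongest member. Members never compete across niches."""
    niches = {}
    for member in population:
        niches.setdefault(member["niche"], []).append(member)

    mean_lr = sum(m["lr"] for m in population) / len(population)

    def fitness(m):
        # Assumed diversity bonus: reward hyperparameters far from the mean.
        return m["score"] + diversity_weight * abs(m["lr"] - mean_lr)

    for members in niches.values():
        if len(members) < 2:
            continue  # nothing to compete against in this niche
        ranked = sorted(members, key=fitness, reverse=True)
        winner, loser = ranked[0], ranked[-1]
        loser["score"] = winner["score"]  # inherit the winner's state
        loser["lr"] = winner["lr"] * rng.choice([0.8, 1.2])  # mutate
```

In this sketch a slow-starting niche keeps its own best member even when every model in another niche scores higher, which is how niches can protect late bloomers.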

In several recent studies, DeepMind and Waymo applied PBT to pedestrian, bicyclist, and motorcyclist recognition tasks with the goal of investigating whether it could improve recall (the fraction of obstacles identified over the total number of in-scene obstacles) and precision (the fraction of detected obstacles that are actually obstacles and not false positives). Ultimately, they sought to train a single AI model to maintain recall over 99% while reducing false positives.
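For readers who want the two metrics spelled out, here is a minimal sketch using the definitions given above. The detection counts are invented for illustration and are not figures from the DeepMind or Waymo experiments.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw detection counts."""
    detected = true_positives + false_positives  # everything the model flagged
    actual = true_positives + false_negatives    # everything really in the scene
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: 1,000 real obstacles, 990 found, 100 false alarms.
p, r = precision_recall(true_positives=990, false_positives=100, false_negatives=10)
```

With these made-up counts, recall sits at 99% while precision is roughly 91%, so there is room to cut false positives without touching recall.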

DeepMind reports that these experiments informed a “realistic” evaluation framework for testing AI’s real-world robustness, which in turn informed PBT’s algorithm-selecting competition. The company also says the experiments revealed the need for fast evaluation to support evolutionary competition; in PBT, models are evaluated every 15 minutes. (DeepMind said it employed parallelization across “hundreds” of distributed machines in Google’s data centers to achieve this.)

The results were impressive. PBT algorithms achieved higher precision, reducing false positives by 24% compared to their hand-tuned equivalents while maintaining a high recall rate, DeepMind says. Moreover, they saved time and resources: training with the hyperparameter schedule discovered by PBT took half the training time and used half the computational resources.
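A 24% drop in false positives is not the same thing as a 24-point jump in precision. The short calculation below shows the relationship, using hypothetical detection counts that are assumptions for illustration rather than reported results.

```python
def precision(true_positives, false_positives):
    """Fraction of detections that are real obstacles."""
    return true_positives / (true_positives + false_positives)


# Hypothetical baseline for a hand-tuned model.
tp, fp_hand_tuned = 990, 100

# Apply the reported 24% reduction in false positives.
fp_pbt = fp_hand_tuned * (1 - 0.24)

before = precision(tp, fp_hand_tuned)
after = precision(tp, fp_pbt)
```

Under these assumed counts, precision moves by only a couple of percentage points, because true positives already dominate the detections.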

DeepMind says that it’s incorporated PBT directly into Waymo’s technical infrastructure, enabling researchers from across the company to apply it with a button click. “Since the completion of these experiments, PBT has been applied to many different Waymo models, and holds a lot of promise for helping to create more capable vehicles for the road,” wrote the company. “Traditionally, [AI] can only be trained using simple and smooth loss functions, which act as a proxy for what we really care about. PBT enabled us to go beyond the update rule used for training neural nets, and towards the more complex metrics optimizing for features we care about.”