Google’s in-house tensor processing units (TPUs) are packed with specialized circuits optimized for AI model training, and they’re at the heart of Google products like Translate, Photos, Search, the Google Assistant, and Gmail. They became publicly available in beta through Google’s Cloud Machine Learning Engine in February, and the Mountain View company is now bringing them to European and Asian regions and introducing preemptible pricing.
“Cloud TPUs allow businesses everywhere to transform their own products and services with machine learning, and we’re working hard to make Cloud TPUs as widely available and as affordable as possible,” Brennan Saeta, tech lead for TensorFlow at Google, wrote in a blog post.
The cost savings with preemptible plans are substantial. Google says that training ResNet-50 (a neural network often used as a benchmark for AI training speed) from scratch on a dataset of images costs as little as $7.50 on a preemptible Cloud TPU, compared with the roughly $25 it costs to train the same model on a standard plan.
And preemptible pricing offers the same performance as standard Cloud TPU training. In Stanford’s DAWNBench AI benchmarking competition, Google’s Cloud TPU trained a ResNet-50 model to the target accuracy in just 30 minutes, roughly six times faster than non-TPU submissions.
So what’s the catch? Preemptible instances are given lower compute priority than normal instances, which means that if a normal instance needs additional resources, Google Compute Engine may terminate the preemptible workload. That won’t pose a problem for fault-tolerant training jobs, Google says, thanks to TensorFlow’s built-in support for saving and restoring from checkpoints. Less resilient systems, however, might not be good candidates.
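The checkpoint-and-resume pattern behind that fault tolerance can be sketched in a few lines of plain Python. This is a hypothetical training loop, not TensorFlow’s actual API (TensorFlow handles checkpointing for you via its checkpoint utilities); the file name and step logic here are illustrative assumptions:

```python
import json
import os

CKPT = "checkpoint.json"  # hypothetical checkpoint path

def load_checkpoint():
    # Resume from the last saved step if a checkpoint exists;
    # otherwise start training from scratch.
    if os.path.exists(CKPT):
        with open(CKPT) as f:
            return json.load(f)
    return {"step": 0, "loss": None}

def save_checkpoint(state):
    # Write to a temp file, then atomically swap it in, so a
    # preemption mid-write can't leave a corrupt checkpoint.
    tmp = CKPT + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, CKPT)

def train(total_steps=100, ckpt_every=10):
    state = load_checkpoint()
    for step in range(state["step"], total_steps):
        state["step"] = step + 1
        state["loss"] = 1.0 / (step + 1)  # stand-in for a real training step
        if state["step"] % ckpt_every == 0:
            save_checkpoint(state)
    save_checkpoint(state)
    return state

final = train()
```

If the instance is preempted mid-run, relaunching the same job picks up from the last saved step rather than step zero, which is why the occasional termination costs only a little repeated work.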
Cloud TPU preemptible pricing plans are available starting today.
Google isn’t the only one with cloud-hosted hardware optimized for machine learning frameworks. In March, Microsoft opened Brainwave — a fleet of field-programmable gate arrays (FPGAs) designed to speed up machine learning operations — to select Azure customers. (Microsoft said that this allowed it to achieve 10 times faster performance for the models that power its Bing search engine.) Amazon, meanwhile, provides its own FPGA hardware to customers, and is reportedly developing an AI chip that will accelerate its Alexa speech engine’s model training.
At its I/O developer conference in May, Google took the wraps off the third generation of TPUs: the Tensor Processing Unit 3.0. Google claims a full pod of the new chips offers up to 100 petaflops of performance, eight times that of the company’s second-generation hardware.