Amazon wants to make it easier to get AI-powered apps up and running on Amazon Web Services. Toward that end, it today launched AWS Deep Learning Containers, a library of Docker images preinstalled with popular deep learning frameworks.
“We’ve done all the hard work of building, compiling, and generating, configuring, optimizing all of these frameworks, so you don’t have to,” Dr. Matt Wood, general manager of deep learning and AI at AWS, said onstage at the AWS Summit in Santa Clara this morning. “And that means that you do less of the undifferentiated heavy lifting of installing these very, very complicated frameworks and then maintaining them.”
The new AWS container images in question — which are preconfigured and validated by Amazon — support Google’s TensorFlow machine learning framework and Apache MXNet, with Facebook’s PyTorch and other deep learning frameworks to come. They work across AWS services including Amazon Elastic Container Service (ECS), Amazon Elastic Container Service for Kubernetes (EKS), and Amazon Elastic Compute Cloud (EC2), as well as with self-managed Kubernetes running on EC2. (Microservices can be added to apps deployed on Kubernetes using Deep Learning Containers.)
Wood says Deep Learning Containers include a number of AWS-specific optimizations and improvements, allowing them to deliver “the highest performance for training and inference in the cloud.” The TensorFlow optimizations in particular allow certain AI models to train up to twice as fast through “significantly” improved GPU scaling — up to 90 percent scaling efficiency for 256 GPUs, Amazon claims.
“AWS Deep Learning Containers are tightly integrated with Amazon EKS and Amazon ECS, giving you choice and flexibility to build custom machine learning workflows for training, validation, and deployment,” Amazon wrote in a blog post. “Through this integration, Amazon EKS and Amazon ECS handle all the container orchestration required to deploy and scale the AWS Deep Learning Containers on clusters of virtual machines.”
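In practice, deploying one of these images on EKS looks much like any other Kubernetes deployment: you point a pod spec at a prebuilt image and request the accelerators you need. The manifest below is an illustrative sketch only — the image URI, account ID, and file names are placeholders, not the actual registry paths Amazon publishes:

```yaml
# Illustrative sketch: run a TensorFlow training job from a
# prebuilt Deep Learning Containers image on an EKS cluster.
apiVersion: v1
kind: Pod
metadata:
  name: tensorflow-training
spec:
  containers:
    - name: tensorflow
      # Placeholder URI — substitute the real Deep Learning
      # Containers image from Amazon ECR for your region.
      image: <account-id>.dkr.ecr.us-east-1.amazonaws.com/tensorflow-training:latest
      command: ["python", "train.py"]  # train.py is a hypothetical script
      resources:
        limits:
          nvidia.com/gpu: 1  # request one GPU for training
  restartPolicy: Never
```

Because the framework, drivers, and AWS-specific optimizations are already baked into the image, the manifest stays this small; EKS handles scheduling and scaling the pods across the cluster.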
Their debut comes months after Amazon took the wraps off of Inferentia, a high-throughput, low-latency processor custom-built for cloud inference, at its annual re:Invent conference in Las Vegas. Inferentia supports INT8, FP16, and mixed precision, and multiple machine learning frameworks including TensorFlow, Caffe2, and ONNX. It’s expected to be available this year in AWS products including EC2 and Amazon’s SageMaker.
And it follows on the heels of Amazon Elastic Inference, a service that allows customers to attach GPU-powered inference acceleration to any Amazon EC2 or Amazon SageMaker instance. Elastic Inference is fully compatible with TensorFlow, Apache MXNet, and ONNX, and Amazon says it can reduce deep learning inference costs by up to 75 percent.