Amazon’s been adding AI-focused features to Amazon Web Services, its cloud computing subsidiary, at a steady clip. Just this week, Amazon Transcribe and Comprehend — AWS’ automatic speech recognition (ASR) service and natural language processing service, respectively — gained support for real-time transcriptions and custom entities. And today, Amazon announced a bevy of improvements heading to SageMaker, its end-to-end platform for building, training, and deploying machine learning models.
“Machine learning is a highly collaborative process — combining domain experience with technical skills is the bedrock of success, and often requires multiple iterations and experimentation with different datasets and features,” Dr. Matt Wood, general manager of learning and artificial intelligence (AI) at Amazon Web Services, wrote in a blog post. “Training a successful model is almost never a hole-in-one, and so it’s important to be able to keep track of the important decisions, replay the successful parts, reuse what worked, and get help on what didn’t. We’re introducing new capabilities to make these iterations easier to manage, repeat, and share.”
First on the list is SageMaker Search, which enables AWS customers to find AI model training runs performed with unique combinations of datasets, algorithms, and parameters. It’s accessible from the SageMaker console.
Joining SageMaker Search on the list of new features is integration with AWS Step Functions, which coordinates the steps of a machine learning workflow across multiple services. Also new? Integration with Apache Airflow, an open source framework for authoring, scheduling, and monitoring workflows.
The Step Functions and Apache Airflow integrations will be available starting next month.
“[With Step Functions, you] can automate publishing datasets to Amazon S3, training an ML model on your data with SageMaker, and deploying your model for prediction,” Dr. Wood wrote. “[It’ll] monitor SageMaker (and Glue) jobs until they succeed or fail, and either transition to the next step of the workflow or retry the job. It includes built-in error handling, parameter passing, state management, and a visual console that lets you monitor your ML workflows as they run.”
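The monitor-then-retry-or-advance behavior Dr. Wood describes can be pictured with a short, purely illustrative Python sketch. It does not call AWS Step Functions or SageMaker; `run_workflow`, `make_flaky`, and the step names are hypothetical stand-ins chosen for this example, and the retry limit of three attempts is an assumption.

```python
from typing import Callable, List, Tuple

# Hypothetical step type: a name plus a callable that raises on failure.
Step = Tuple[str, Callable[[], None]]


def run_workflow(steps: List[Step], max_attempts: int = 3) -> List[str]:
    """Run steps in order, retrying a failed step up to max_attempts times."""
    log: List[str] = []
    for name, action in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                action()
            except Exception as exc:
                log.append(f"{name}: attempt {attempt} failed ({exc})")
                if attempt == max_attempts:
                    # Out of retries: stop the whole workflow here.
                    log.append(f"{name}: giving up")
                    return log
            else:
                # Success: transition to the next step.
                log.append(f"{name}: succeeded")
                break
    log.append("workflow complete")
    return log


def make_flaky(failures: int) -> Callable[[], None]:
    """Build a stand-in job that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def job() -> None:
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("job failed")

    return job


# Illustrative pipeline: publish data, train (fails once), then deploy.
steps = [
    ("publish_dataset", make_flaky(0)),
    ("train_model", make_flaky(1)),
    ("deploy_model", make_flaky(0)),
]
result = run_workflow(steps)
for line in result:
    print(line)
```

In this toy run the training step fails once, is retried, and the workflow then moves on to deployment, which mirrors the "transition to the next step of the workflow or retry the job" behavior in the quote.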
Those improvements dovetail with the addition of three new built-in algorithms to SageMaker: IP Insights, which detects suspicious IP addresses; Object2Vec, which learns low-dimensional embeddings of high-dimensional objects; and K-means clustering, which groups data without supervision. AWS also added support for Horovod, Uber’s open source distributed training framework for Google’s TensorFlow; the machine learning library scikit-learn; and Spark MLeap.
Also in the overall upgrade are visualizations and integration with the version-control system Git, which helps track and coordinate changes to files. Developers can now link GitHub, AWS CodeCommit, or self-hosted Git repositories to SageMaker notebooks to clone public and private repositories, and can store repository information in Amazon SageMaker using IAM, LDAP, and AWS Secrets Manager.
Finally, on the security front, SageMaker is now in scope for AWS’ System and Organization Controls (SOC) 1, SOC 2, and SOC 3 audits.
“These new capabilities, algorithms, and accreditation will help bring more machine learning workloads to more developers. By focusing almost exclusively on what customers are asking for, we’re making real strides in making machine learning useful and usable in the real world through Amazon SageMaker,” Dr. Wood wrote. “Accreditation, experimentation, and automation aren’t always the first thing you may think of when it comes to artificial intelligence, but our customers tell us that these features can further shorten the time it takes to build, train, and deploy their models. No R&D department required.”