Artificial intelligence (AI) algorithms are great in theory but essentially useless if you don’t have powerful hardware on which to deploy them. With a conventional computer, a complex model could take hours, days, or even weeks to train.
But Cisco is prepared to address this problem. The San Jose company today announced an expansion of its Unified Computing System (UCS) server portfolio that’s focused squarely on AI — specifically, on IT organizations looking to get AI systems up and running.
First on deck is new hardware: the UCS C480 ML M5. It’s a 4U server with Intel Xeon Scalable processors, eight Nvidia Tesla V100-32G GPUs with high-bandwidth NVLink interconnects, and flexible options on the CPU, networking, storage, memory, and software fronts. The top-of-the-line configuration boasts dual Xeon processors, up to 128GB of 2666MHz DDR4 RAM, 24 SATA hard drives or SSDs, six NVMe drives, and four 100G Virtual Interface Cards (VICs).
“The reason we developed this system is purely from demand within our install base,” Todd Brannon, senior director of datacenter marketing, told VentureBeat in an interview. He gave a few supporting statistics: about eight in 10 companies have already adopted AI as a customer solution or are planning to do so by 2020, by which time it’s expected to be a $1.2 trillion industry.
“GPUs are very relevant to reducing the amount of time it takes to train models, particularly in deep learning,” he said. “We see an order of magnitude decrease in training time.”
The UCS server works with containerized apps (apps that run in their own operating environment, isolated from the wider system) and multicloud computing models (AI systems with datasets stored across services), and it’s fully compatible with Cisco’s AI solutions stack. That includes data pipelines with MapR; Cloudera’s Data Science Workbench, which supports frameworks such as TensorFlow and PyTorch; and Hortonworks’ Hadoop 3.1, which Cisco is working to validate in a design in which the UCS C480 ML M5 stores data and runs containerized Apache Spark and Google TensorFlow analytics workloads.
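To make the idea of a containerized GPU workload concrete, here is a minimal sketch of a Kubernetes pod spec that runs a TensorFlow container on a GPU. The pod name, training script, and GPU count are illustrative assumptions, not a Cisco-validated configuration; GPU scheduling like this also assumes the Nvidia device plugin is installed on the node.

```yaml
# Hypothetical pod spec for a containerized TensorFlow training job.
# Names, script, and GPU count are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: tf-train-example
spec:
  restartPolicy: Never
  containers:
    - name: trainer
      image: tensorflow/tensorflow:latest-gpu   # official TensorFlow GPU image
      command: ["python", "train.py"]           # hypothetical training script
      resources:
        limits:
          nvidia.com/gpu: 1   # requires the Nvidia device plugin on the node
```

Because the container carries its own runtime and dependencies, the same workload can be moved between on-premises servers like the C480 ML M5 and cloud environments without modification.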
“If you’re preparing a large dataset to feed into a model or algorithm, data is the center of the show, and it’s variable,” Brannon said. “It could turn into a 10 petabyte exercise in a couple of days … Our strategy is to take on this new level of data with … computing options for testing, development, training, and inference, all unified by cloud-based management.”
Cisco said it’s contributing code to Google’s Kubeflow open source project, which integrates Kubernetes — a platform for managing containerized workflows and services — with TensorFlow. And Cisco said it’s collaborating with Anaconda to “ensure that data scientists can [get started with] machine learning using languages such as Python.”
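Kubeflow expresses this Kubernetes–TensorFlow integration through custom resources such as TFJob, which describes a distributed training job declaratively. The sketch below is a hypothetical TFJob manifest; the job name, replica count, image, and script are assumptions for illustration, and the API version may differ across Kubeflow releases.

```yaml
# Hypothetical Kubeflow TFJob manifest; names, replica count, and image
# are illustrative assumptions, not a validated configuration.
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: train-example
spec:
  tfReplicaSpecs:
    Worker:
      replicas: 2          # two GPU-backed workers, for illustration
      template:
        spec:
          containers:
            - name: tensorflow   # Kubeflow expects this container name
              image: tensorflow/tensorflow:latest-gpu
              command: ["python", "train.py"]   # hypothetical script
              resources:
                limits:
                  nvidia.com/gpu: 1
```

Submitted with `kubectl apply`, a manifest like this lets Kubernetes schedule and restart the training replicas rather than leaving that orchestration to the data scientist.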
“Over the past four or five years, we’ve seen our big data business increase by a factor of 18,” Brannon said. “There’s demand across all different verticals, from global enterprise customers to the public sector. With [the UCS C480 ML M5], we’re answering the need for a new type of accelerated computing platform. We’re curating top-to-bottom software and hardware stacks with leading ecosystem partners to ensure faster and more predictable deployments.”
The UCS C480 ML M5 will be available in the fourth quarter of this year.