Microsoft debuts its own chips for enterprise AI: ‘Maia’ and ‘Cobalt’

Microsoft is moving to strengthen its computing infrastructure play with the announcement of two new in-house chips for enterprises: Azure Maia 100 and Azure Cobalt 100.

The chips were showcased at this week's Microsoft Ignite 2023 conference in Seattle, the tech giant's largest annual global event. According to the company, they are meant to give enterprises efficient, scalable and sustainable compute to take advantage of the latest cloud and AI breakthroughs.

Microsoft says they represent the final piece of the puzzle in its mission to deliver flexible infrastructure systems – with its own and partner-delivered hardware and software – that can be optimized to meet different workload requirements.

Maia, as the company explained, is its AI accelerator, designed to run cloud-based training and inference for generative AI workloads. Meanwhile, Cobalt is an Arm-based chip designed to handle general-purpose workloads with high efficiency. Both offerings will be deployed in Azure next year, starting with Microsoft’s own data centers driving its Copilot and Azure OpenAI services.

“We are reimagining every aspect of our data centers to meet the needs of our customers,” Scott Guthrie, executive vice president of Microsoft’s Cloud + AI Group, said in a statement. “At the scale we operate, it's important for us to optimize and integrate every layer of the infrastructure stack to maximize performance, diversify our supply chain and give customers infrastructure choice,” he added.

What to expect from Azure Maia and Cobalt?

While Microsoft has not shared specific performance stats, the company does note that the Maia AI chip can handle some of the largest AI workloads running on Microsoft Azure, from training language models to inference. The silicon has been designed specifically for the Azure hardware stack, which Microsoft says lets it get maximum utilization out of the hardware when handling those workloads.

Microsoft developed the accelerator over several years in collaboration with OpenAI. Specifically, the company tested it on models built by the Sam Altman-led generative AI unicorn and used the feedback to make necessary changes.

“We were excited when Microsoft first shared their designs for the Maia chip, and we’ve worked together to refine and test it with our models. Azure’s end-to-end AI architecture, now optimized down to the silicon with Maia, paves the way for training more capable models and making those models cheaper for our customers,” Altman said in a statement.

As with Maia, the capabilities of Cobalt are largely under wraps. However, one thing’s pretty clear: this chip will handle general-purpose workloads on Azure with a focus on energy efficiency. The Arm-based design is meant to maximize performance per watt, so that the data center gets more computing power for each unit of energy consumed.

“The architecture and implementation are designed with power efficiency in mind. We’re making the most efficient use of the transistors on the silicon. Multiply those efficiency gains in servers across all our data centers, it adds up to a pretty big number,” Wes McCullough, corporate vice president of hardware product development at Microsoft, said.

More importantly, since both chips have been designed in-house, Microsoft will install them on custom server boards, placed within tailor-made racks that fit easily within the company's existing data centers. For the Maia rack, the company has also designed "sidekicks" that direct cold liquid to cold plates attached to the chips, keeping them from overheating under heavy power draw.

A custom-built rack for the Maia 100 AI Accelerator and its “sidekick” inside a thermal chamber at a Microsoft lab in Redmond, Washington. The sidekick acts like a car radiator, cycling liquid to and from the rack to cool the chips as they handle the computational demands of AI workloads.

Partner integrations expanded

As part of its flexible systems approach, Microsoft is also backing its custom chips with expanded support for partner hardware. The company said it has launched a preview of the new NC H100 v5 virtual machine series built for Nvidia H100 Tensor Core GPUs and will soon add the latest Nvidia H200 Tensor Core GPUs to its data center fleet. It also plans to add AMD MI300X accelerated VMs to Azure to speed up AI workloads, including high-range AI model training and generative inference.

This approach gives Microsoft's customers multiple options to choose from, depending on their performance and cost needs.

As of now, Microsoft plans to roll out the new chips next year. It has already started working on their second generation.