Microsoft introduces its own enterprise AI chips: ‘Maia’ and ‘Cobalt’.

Wed Nov 22, 2023 - 2:36am GMT+0000

Microsoft is expanding its computing infrastructure offerings with two new in-house chips designed for enterprises: Azure Maia 100 and Azure Cobalt 100. The chips were unveiled at Microsoft Ignite 2023 in Seattle, the company’s flagship annual conference, and promise to give enterprises efficient, scalable, and eco-friendly computing power for the latest advances in cloud computing and artificial intelligence.

The chips represent the final piece in Microsoft’s plan to deliver flexible infrastructure systems that combine its own technology with partner-delivered hardware and software, all of which can be tailored to different workload requirements.

Maia is Microsoft’s AI accelerator, optimized for cloud-based training and inference of generative AI workloads. Cobalt, by contrast, is an Arm-based CPU built to run general-purpose workloads with high energy efficiency. Both chips are slated to arrive in Azure next year, starting in Microsoft’s own data centers, which power its Copilot and Azure OpenAI services.

Scott Guthrie, the executive vice president of Microsoft’s Cloud + AI Group, stated, “We are reimagining every aspect of our data centers to meet the needs of our customers. At the scale we operate, it’s important for us to optimize and integrate every layer of the infrastructure stack to maximize performance, diversify our supply chain and give customers infrastructure choice.”

What Can You Expect from Azure Maia and Cobalt?
While specific performance metrics haven’t been disclosed, Microsoft says the Maia AI chip can handle some of the largest AI workloads on Azure, from training language models to running inference. Because the chip was designed alongside the rest of the Azure hardware stack, workloads can take full advantage of the hardware they run on.

Over the years, Microsoft collaborated closely with OpenAI to develop this accelerator, testing it with models created by the generative AI company led by Sam Altman and incorporating feedback for improvements. Altman expressed his excitement, saying, “We were excited when Microsoft first shared their designs for the Maia chip, and we’ve worked together to refine and test it with our models. Azure’s end-to-end AI architecture, now optimized down to the silicon with Maia, paves the way for training more capable models and making those models cheaper for our customers.”

Details about Cobalt’s capabilities remain largely undisclosed. What is clear is that the chip targets general-purpose workloads on Azure with a strong emphasis on energy efficiency: its Arm-based design is optimized for performance per watt, so the data center gets more computing power out of every unit of energy consumed.

Wes McCullough, corporate vice president of hardware product development at Microsoft, stated, “The architecture and implementation are designed with power efficiency in mind. We’re making the most efficient use of the transistors on the silicon. Multiply those efficiency gains in servers across all our data centers, it adds up to a pretty big number.”
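Performance per watt is simply useful throughput divided by power drawn. The Python sketch below uses entirely hypothetical numbers (Microsoft has published no Cobalt benchmarks) to illustrate the arithmetic behind McCullough’s point: a modest per-server gain compounds into a large figure at fleet scale.

```python
# Illustrative only: performance-per-watt arithmetic with made-up numbers,
# since no Cobalt benchmarks have been published.

def perf_per_watt(throughput_ops: float, power_watts: float) -> float:
    """Useful work delivered per watt of power drawn."""
    return throughput_ops / power_watts

# Hypothetical figures for a baseline server vs. a more efficient one.
baseline = perf_per_watt(throughput_ops=1.0e12, power_watts=500)  # 2.0e9 ops/W
improved = perf_per_watt(throughput_ops=1.1e12, power_watts=450)  # ~2.4e9 ops/W

gain = improved / baseline - 1  # ~22% more work per watt

# Small per-server gains compound at fleet scale (fleet size is hypothetical).
servers = 100_000
watts_saved_per_server = 50  # hypothetical
fleet_savings_mw = servers * watts_saved_per_server / 1e6
print(f"Per-server gain: {gain:.1%}, fleet savings: {fleet_savings_mw:.1f} MW")
```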

Notably, because both chips were designed in-house, Microsoft can mount them on custom server boards and house those boards in tailor-made racks that slot into its existing data centers. For the Maia rack, Microsoft has also built “sidekicks” that circulate cold liquid to cold plates on the chips, preventing overheating during periods of high power usage.

Expanded Partner Integrations
As part of its flexible-systems approach, Microsoft is expanding support for partner hardware alongside its custom chips. The company has introduced a preview of the new NC H100 v5 virtual machine series, built on Nvidia H100 Tensor Core GPUs, and plans to add the newer Nvidia H200 Tensor Core GPUs to its data center infrastructure. Microsoft also intends to bring AMD Instinct MI300X-accelerated VMs to Azure, targeting large-scale AI model training and generative inferencing.

This approach gives Microsoft’s customers a range of options to match their specific performance and cost requirements.
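As a rough illustration of that choice, the Python sketch below, assuming the azure-identity and azure-mgmt-compute SDKs, enumerates the GPU-oriented (NC-series) VM sizes available in a region. Whether a given H100 or MI300X size appears depends on region and availability; nothing here is a confirmed SKU list.

```python
# Sketch: list GPU-oriented VM sizes in a region using the Azure Python SDK.
# Assumes azure-identity and azure-mgmt-compute are installed and the
# environment is configured for DefaultAzureCredential.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

subscription_id = "<your-subscription-id>"  # placeholder
client = ComputeManagementClient(DefaultAzureCredential(), subscription_id)

# Filter for NC-series (GPU) sizes; the H100-backed NC v5 sizes mentioned
# above would show up here once generally available in the chosen region.
for size in client.virtual_machine_sizes.list(location="eastus"):
    if size.name.startswith("Standard_NC"):
        print(size.name, size.number_of_cores, "vCPUs,",
              size.memory_in_mb // 1024, "GiB RAM")
```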

Microsoft plans to roll out the new chips next year and is already at work on second-generation versions of both Maia and Cobalt.