Microsoft’s First Customized AI-Centric Chips

Microsoft Joins List of AI Chip Manufacturers

Over the past few months, Microsoft has revealed that it has built its own custom AI chip for training large language models. The industry currently relies heavily on Nvidia for AI chips, an expensive dependency, and Microsoft hopes to reduce that reliance and the cost that comes with it. Beyond AI-centric chips, the company is also building its own Arm-based CPU for cloud workloads. Both custom chips are designed for Azure data centers and position the company for the coming surge in AI applications.

The New Microsoft AI Chip and CPU

Microsoft’s Azure Maia AI chip and Arm-powered Azure Cobalt CPU are coming this year, though an exact date has not been announced. This development follows last year’s surge in demand for Nvidia’s H100 GPUs, which are widely used to train and run large language models and generative image tools.

Microsoft isn’t new to the silicon game. For more than 20 years, it has co-engineered chips for Xbox consoles and Surface devices. In 2017, the company began laying the groundwork for its own custom chips to advance its cloud and AI ambitions. The new Azure Maia AI chip and Azure Cobalt CPU are both built in-house at Microsoft. Combined with a deep overhaul of its entire cloud server stack to optimize performance, power, and cost, this should lead to meaningful improvements in its cloud infrastructure.

The Azure Cobalt CPU is built on an Arm Neoverse CSS design customized for Microsoft. It’s designed specifically to power general cloud services on Azure, with fine-grained control over per-core performance and power consumption.

Microsoft AI Chip Workload Testing

Microsoft is testing the Cobalt CPU on Microsoft Teams and SQL Server workloads, with plans to soon offer customers virtual machines running on it for a variety of workloads. The technology is comparable to Amazon’s Graviton 3 servers on AWS and offers a nearly 40% performance increase over the Arm-based servers Microsoft currently employs for Azure.
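
For teams preparing their own workloads for Arm-based cloud hosts like Cobalt, a practical first step is simply detecting the CPU architecture at runtime before selecting architecture-specific binaries or container images. Below is a minimal Python sketch of generic Arm64 detection; nothing Cobalt-specific is assumed here.

```python
import platform

# Minimal sketch: detect whether this host reports an Arm64 CPU, as an
# Arm-based Azure VM would, before choosing arch-specific builds.
# Generic detection only; no Cobalt-specific behavior is assumed.
machine = platform.machine().lower()

if machine in ("aarch64", "arm64"):
    print("Arm64 host detected - safe to use arm64 builds.")
else:
    print(f"Non-Arm host ({machine}) - fall back to x86_64 builds.")
```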

Microsoft uses the Maia 100 AI accelerator to run its cloud AI workloads. Built for large language model training and inference, it’s capable of handling some of the largest AI workloads Azure has to offer. The plan is to use it to run OpenAI workloads in the near future, since Microsoft and OpenAI have collaborated on Maia through its design and testing phases. With Azure’s AI architecture optimized through Maia, Microsoft hopes to find new ways to train capable AI models and to make those models cheaper for customers.

Maia is manufactured on a 5-nanometer TSMC process with around 30% fewer transistors than AMD’s MI300X AI GPU. It also supports sub-8-bit (MX) data types, which enable faster model training and inference. Microsoft is part of a group of companies, including AMD, Arm, Intel, and others, that is standardizing this next generation of data formats for AI models. Microsoft’s effort to build these AI chips is collaborative and open, in line with the Open Compute Project (OCP), helping entire systems adapt to the needs of AI.
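
To give a feel for the microscaling idea behind these shared data formats, here is a simplified Python sketch of block quantization where each block of values shares one power-of-two scale and individual elements are stored on a coarse grid. This is an illustrative approximation for intuition only, not the OCP MX specification; the block size and the number of grid levels are arbitrary choices.

```python
import numpy as np

def block_quantize(values, block_size=32, levels=16):
    """Quantize values in blocks that share one power-of-two scale.

    Illustrative only: real MX formats define specific element widths
    and scale encodings; 'levels' stands in for a narrow element format.
    """
    values = np.asarray(values, dtype=np.float32)
    out = np.empty_like(values)
    for start in range(0, values.size, block_size):
        block = values[start:start + block_size]
        max_mag = float(np.max(np.abs(block)))
        # Shared scale: largest power of two not exceeding the block's peak.
        scale = 2.0 ** np.floor(np.log2(max_mag)) if max_mag > 0 else 1.0
        # Snap each element to a coarse grid relative to the shared scale.
        out[start:start + block_size] = (
            np.round(block / scale * (levels / 2)) / (levels / 2) * scale
        )
    return out

weights = np.random.default_rng(0).normal(size=64).astype(np.float32)
q = block_quantize(weights)
print("max abs quantization error:", float(np.max(np.abs(weights - q))))
```

Because the scale is shared across a whole block rather than stored per element, formats like this keep memory and bandwidth costs low while preserving dynamic range, which is what makes sub-8-bit training and inference practical.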

Microsoft AI Chip "Maia" Specifications

Maia is the first server processor Microsoft has built that is entirely liquid cooled, which allows for higher server density while operating at increased efficiency. The hardware was created with Microsoft’s current data center footprint in mind, with each rack designed layer by layer to fit those facilities cohesively. In fact, it has to be this way so Microsoft can spin up more servers as quickly as possible around the world. Logistically, the Maia server boards were designed for simplicity, effectiveness, and rapid deployment.

Along with sharing the MX data types, Microsoft is also sharing the Maia rack designs with its partners so they can use them in their own systems, regardless of what silicon they run. The exact chip designs will stay in house, though.

Maia is currently being tested on GPT-3.5 Turbo, the same model that powers ChatGPT, GitHub Copilot, and Bing AI. Since Microsoft is still in the early stages of testing and deployment, performance benchmarks and specifications haven’t been released yet.

Microsoft's AI Chip vs Nvidia and Other Competitors

With no benchmarks or specifications released, it’s difficult to compare Maia with similar products like Nvidia’s H100 GPU or AMD’s MI300X. Microsoft may be holding back comparisons because of its ongoing partnerships with both companies, but it is also simply too early to draw them until more testing is done.

Microsoft aims to diversify its supply chain, especially since Nvidia is the main supplier of AI server chips right now and companies have been fighting tooth and nail to buy as many as possible. OpenAI alone is estimated to have needed around 30,000 of Nvidia’s older A100 GPUs to commercialize ChatGPT. Microsoft manufacturing its own chips would add more options to the market and help lower the cost of AI for customers across the globe. These chips are being developed solely for Azure cloud workloads, though, so they won’t be available for mass sale like those from Nvidia, Intel, or AMD.

The Next Age of AI Processing

The names Maia 100 and Cobalt 100 imply that Microsoft is already designing second-generation versions of these chips. While the company hasn’t shared a roadmap, it has confirmed that these won’t be the only iterations; it plans to keep updating and developing them well into the future to expand AI capabilities while supporting its Azure workloads.

So how quickly can Microsoft deploy Maia, and how will the pricing of these chips affect the cost of AI services? Microsoft has been quiet on the price of these new servers, but if its silent rollout of Copilot for Microsoft 365 is any indication, the cost may be steep.

If AI is an investment your company hopes to capitalize on, but the price of Microsoft products and services is holding you back, you may need to find new avenues to lower your spend. Third-party support through US Cloud is a proven way to cut your annual Microsoft support budget, with a track record of saving clients 50% or more on their support needs. With a team of fully US-domestic engineers and financially backed SLAs that guarantee 15-minute responses to tickets of all severities, we ensure you get the support Microsoft claims to provide, at half the cost. Faster Microsoft support for less starts with US Cloud.

Get Microsoft Support for Less

Unlock Better Support & Bigger Savings

  • Save 30-50% on Microsoft Premier/Unified Support
  • 2x Faster Resolution Time + SLAs
  • All-American Microsoft-Certified Engineers
  • 24/7 Global Customer Support