Microsoft deliberately chose to use old tech for its Nvidia GPU rival — the Maia 100 AI accelerator uses HBM2E memory and has the mysterious ability to ‘unlock new capabilities’ via firmware update

Maia 100 architecture is “tailored to modern machine learning needs”


At the recent Hot Chips 2024 symposium, Microsoft revealed details about its first-generation custom AI accelerator, the Maia 100, designed for large-scale AI workloads on its Azure platform.

Unlike its rivals, Microsoft has opted for older HBM2E memory technology, paired with the intriguing ability to “unlock new capabilities” via firmware updates. The decision appears to be a strategic move to balance performance against cost efficiency.

The Maia 100 accelerator is a reticle-size SoC, built on TSMC’s N5 process and featuring a CoWoS-S interposer. It includes four HBM2E memory dies, delivering 1.8TBps of bandwidth and 64GB of capacity, tailored for high-throughput AI workloads. The chip is designed to support up to 700W TDP but is provisioned at 500W, making it energy-efficient for its class.
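
As a quick sanity check on those figures (our own arithmetic, not part of Microsoft’s disclosure), the totals divide evenly across the four stacks into numbers that sit comfortably within HBM2E’s rated limits:

```python
# Back-of-envelope split of Maia 100's published memory figures across
# its four HBM2E stacks. Assumption: capacity and bandwidth divide
# evenly; Microsoft has not broken the numbers down per stack.
NUM_STACKS = 4
TOTAL_CAPACITY_GB = 64       # published total capacity
TOTAL_BANDWIDTH_GBPS = 1800  # published aggregate bandwidth (1.8TBps)

capacity_per_stack = TOTAL_CAPACITY_GB / NUM_STACKS       # 16 GB
bandwidth_per_stack = TOTAL_BANDWIDTH_GBPS / NUM_STACKS   # 450 GBps

# HBM2E presents a 1024-bit interface per stack, so the implied
# per-pin data rate is:
pin_speed_gbps = bandwidth_per_stack * 8 / 1024           # ~3.5 Gbps/pin

print(f"{capacity_per_stack:.0f}GB and {bandwidth_per_stack:.0f}GBps per stack, "
      f"~{pin_speed_gbps:.2f}Gbps per pin")
```

An implied per-pin rate of roughly 3.5Gbps is just under the 3.6Gbps ceiling of the fastest HBM2E parts, consistent with Microsoft running the memory at or near full speed.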

“Not as capable as a Nvidia H100”

Microsoft’s approach with Maia 100 emphasizes a vertically integrated architecture, from custom server boards to specialized racks and a software stack designed to enhance AI capabilities. The architecture includes a high-speed tensor unit and a custom vector processor, supporting various data formats and optimized for machine learning needs.

Additionally, the Maia 100 supports Ethernet-based interconnects with up to 4800Gbps of all-gather and scatter-reduce bandwidth, using a custom RoCE-like protocol for reliable, secure data transmission.
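
For readers less familiar with those collectives, the single-process sketch below shows what all-gather and scatter-reduce (more commonly called reduce-scatter) actually compute. It is plain PyTorch tensor arithmetic standing in for the network, not Maia code; Microsoft has not published a collectives API for the chip.

```python
# Single-process sketch of the semantics of the two collectives the
# Maia interconnect is rated for. Plain tensor math stands in for the
# network; the rank layout below is purely illustrative.
import torch

world_size = 4
chunk = 2  # elements each rank keeps after reduce-scatter

# all-gather: each rank contributes one shard, and every rank ends up
# holding the concatenation of all shards.
shards = [torch.full((chunk,), float(r)) for r in range(world_size)]
gathered = torch.cat(shards)  # every rank would hold [0,0,1,1,2,2,3,3]

# reduce-scatter (scatter-reduce): each rank contributes a full-length
# tensor; the tensors are summed element-wise, and rank r keeps only
# chunk r of the reduced result.
full_inputs = [torch.arange(world_size * chunk, dtype=torch.float) + r
               for r in range(world_size)]
reduced = torch.stack(full_inputs).sum(dim=0)
per_rank = reduced.chunk(world_size)  # per_rank[r] is rank r's output
```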

Patrick Kennedy from ServeTheHome reported on Maia at Hot Chips, noting, “It was really interesting that this is a 500W/700W device with 64GB of HBM2E. One would expect it to be not as capable as a Nvidia H100 since it has less HBM capacity. At the same time, it is using a good amount of power. In today’s power-constrained world, it feels like Microsoft must be able to make these a lot less expensive than Nvidia GPUs.”

The Maia SDK simplifies deployment by allowing developers to port their models with minimal code changes, supporting both PyTorch and Triton programming models. This enables developers to optimize workload performance across different hardware backends without sacrificing efficiency.
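
The Triton side of that claim is easy to picture with the canonical vector-add kernel below. This is generic, publicly documented Triton rather than Maia SDK code (Microsoft has not detailed the SDK’s full surface); the point is that a kernel written against Triton’s portable primitives leaves the hardware-specific lowering to whichever backend compiler is present, which is presumably how Maia slots in.

```python
# Canonical Triton vector-add kernel: generic Triton, not Maia SDK code.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```

On the PyTorch side, the pitch is the same: existing models run with minimal code changes, so porting should look like a backend swap rather than a rewrite.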

