Microsoft announced its in-house AI chip, Azure Maia 100, at the Ignite developer conference in Seattle on November 15, 2023. The chip is designed to handle OpenAI models, Bing, GitHub Copilot, ChatGPT, and other AI services. Support for Copilot and Azure OpenAI is expected to commence in early 2024.
TrendForce’s Insights:
Microsoft has not disclosed detailed specifications for Azure Maia 100. Currently, it is known that the chip will be manufactured using TSMC’s 5nm process, featuring 105 billion transistors and supporting at least INT8 and INT4 precision formats. While Microsoft has indicated that the chip will be used for both training and inference, the computational formats it supports suggest a focus on inference applications.
This inference focus is suggested by the chip’s support for INT4, a low-precision computational format that is uncommon among other CSP manufacturers’ AI ASICs. Lower precision reduces power consumption and shortens inference times, improving efficiency; the drawback is a sacrifice in accuracy.
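The precision-versus-accuracy trade-off can be illustrated with a minimal symmetric-quantization sketch. This is a generic illustration of low-precision rounding error, not Maia 100’s actual quantization scheme, which Microsoft has not disclosed:

```python
import numpy as np

def quantize(x, bits):
    """Symmetric uniform quantization of a float array to signed integers."""
    qmax = 2 ** (bits - 1) - 1          # 127 for INT8, 7 for INT4
    scale = np.max(np.abs(x)) / qmax    # map the largest magnitude onto qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the quantized integers."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
weights = rng.standard_normal(10_000).astype(np.float32)

for bits in (8, 4):
    q, scale = quantize(weights, bits)
    err = np.abs(dequantize(q, scale) - weights).mean()
    print(f"INT{bits}: mean absolute reconstruction error = {err:.4f}")
```

Running the sketch shows the INT4 reconstruction error is substantially larger than the INT8 error: the integers are narrower and cheaper to compute with, but each value is rounded onto a much coarser grid.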
Microsoft initiated its in-house AI chip project, “Athena,” in 2019, developed in collaboration with OpenAI. Like other CSP manufacturers’ in-house chips, Azure Maia 100 aims to reduce costs and decrease dependency on NVIDIA. Although Microsoft entered the field of proprietary AI chips later than its primary competitors, its formidable ecosystem is expected to gradually translate into a competitive advantage.
Google led the way with its first in-house AI chip, TPU v1, introduced as early as 2016, and has since iterated to the fifth generation with TPU v5e. Amazon followed suit in 2018 with Inferentia for inference, introduced Trainium for training in 2020, and launched the second generation, Inferentia2, in 2023, with Trainium2 expected in 2024.
Meta plans to debut its inaugural in-house AI chip, MTIA v1, in 2025. Given the releases from major competitors, Meta has expedited its timeline and is set to unveil the second-generation in-house AI chip, MTIA v2, in 2026.
Unlike other CSP manufacturers, which opt for the ARM architecture, both MTIA v1 and MTIA v2 adopt RISC-V. RISC-V is a fully open-source architecture that requires no instruction set licensing fees, and its instruction count (approximately 200) is far smaller than ARM’s (approximately 1,000).
This choice allows chips utilizing the RISC-V architecture to achieve lower power consumption. However, the RISC-V ecosystem is currently less mature, resulting in fewer manufacturers adopting it. Nevertheless, with the growing trend in data centers towards energy efficiency, it is anticipated that more companies will start incorporating RISC-V architecture into their in-house AI chips in the future.
The competition among AI chips will ultimately hinge on the competition of ecosystems. NVIDIA introduced the CUDA architecture in 2006, and it is now nearly ubiquitous in educational institutions; almost all AI engineers encounter CUDA during their academic training.
In 2017, NVIDIA further solidified its ecosystem by launching the RAPIDS AI acceleration integration solution and the GPU Cloud service platform. Notably, over 70% of NVIDIA’s workforce comprises software engineers, emphasizing its status as a software company. The performance of NVIDIA’s AI chips can be further enhanced through software innovations.
Microsoft, for its part, possesses a robust ecosystem of its own in Windows. The recent Intel Arc GPU A770 showed a 1.7x performance improvement in AI-driven Stable Diffusion when running on Microsoft Olive, demonstrating that, like NVIDIA, Microsoft can enhance GPU performance through software.
Consequently, Microsoft’s in-house AI chips are poised to achieve tighter hardware-software integration than those of other CSP manufacturers, giving Microsoft a competitive edge in the AI race.