[News] NVIDIA’s H200 vs. AMD’s MI300X: Is the Former’s High Margin Justifiable?


2024-09-02 Semiconductors editor

AI chip giants NVIDIA and AMD have been locked in heated competition for the past couple of years. Though NVIDIA controls the lion's share of the market for AI computing solutions, it has been challenged by AMD since the latter launched the Instinct MI300X GPU in late 2023, claiming the product to be the fastest AI chip in the world, beating NVIDIA's H200 GPUs.

However, months after the launch of the MI300X, an analysis by Richard's Research Blog indicates that the MI300X costs significantly more to manufacture than NVIDIA's H200, while the H200 outperforms the MI300X by over 40% in inference production applications, which makes NVIDIA's high margin justifiable.

AMD’s MI300X: More Transistors, More Memory Capacity, More Advanced Packaging…with a Higher Cost

The analysis compares the chip specifications of the two best-selling products and explores their margins. NVIDIA's H200 is implemented on TSMC's N4 node with 80 billion transistors. AMD's MI300X, on the other hand, is built with 153 billion transistors on TSMC's 5nm process.

Furthermore, NVIDIA's H200 features 141GB of HBM3e, while AMD's MI300X is equipped with 192GB of HBM3. Regarding packaging, NVIDIA uses TSMC's CoWoS 2.5D for the H200, whereas AMD's MI300X has moved to CoWoS/SoIC 3D packaging with a total of 20 dies/stacks, which significantly increases its complexity.

According to the analysis, under the same process, the number of transistors in the logic compute die is roughly proportional to the total die size and total cost. AMD's MI300X, with nearly twice as many transistors as NVIDIA's H200 (153 billion vs. 80 billion, about 1.9x), is therefore said to cost roughly twice as much as the latter in this respect.

With 36% more memory capacity and much higher packaging complexity, AMD's MI300X is said to suffer a significantly higher manufacturing cost than NVIDIA's H200. It is also worth noting that, as NVIDIA is currently the dominant HBM buyer in the market, the company likely enjoys lower procurement costs, the analysis suggests.

This is the price AMD has to pay for the high specifications of the MI300X, the analysis observes.

NVIDIA’s 80% Margin: High at First Glance, but Actually Justifiable

On the other hand, citing MLPerf test results, the analysis notes that in practical deployment for inference production applications, the H200 outperforms the MI300X by over 40%. This means that if AMD wants to maintain a similar cost/performance ratio (which CSP customers will demand), the MI300X must be priced about 30% lower than the H200, since 1/1.4 ≈ 0.71. This scenario does not even take other factors into consideration, such as customers' familiarity with NVIDIA's ecosystem, the Compute Unified Device Architecture (CUDA), and related software.
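The pricing arithmetic implied by the analysis can be sketched as follows. Note that the 40% performance gap is the analysis's claim, and the dollar figure below is a purely illustrative placeholder, not an actual quoted price:

```python
def parity_price(rival_price: float, perf_advantage: float) -> float:
    """Price at which the slower chip matches the faster chip's
    performance-per-dollar, given the faster chip's performance
    advantage (e.g. 0.40 for a 40% lead)."""
    return rival_price / (1 + perf_advantage)

h200_price = 30_000  # illustrative placeholder, not a real quote
mi300x_parity = parity_price(h200_price, 0.40)
discount = 1 - mi300x_parity / h200_price

# 1 / 1.4 ≈ 0.71, so the required discount comes out to roughly 30%
print(f"Required MI300X discount for perf/$ parity: {discount:.0%}")
```

Because the required discount is a pure ratio (1 − 1/1.4), it is independent of the placeholder price chosen.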

Therefore, the analysis suggests that NVIDIA's 80% gross margin, though it might seem high at first glance, actually leaves room for its competitors to survive. If NVIDIA were to price its products below a 70% margin, its rivals might struggle with negative operating profits.

In addition to achieving better product performance at a lower cost through superior hardware and software technology, NVIDIA excels at non-technical economic factors. Its scale spreads fixed costs such as R&D and expensive photomasks across far more units, which affects operational expenditures (OPEX) and cost distribution as well, while its long-term commitments to clients, customer confidence, and time-to-market also play a role, the analysis notes.

Regarding the key takeaways from their latest earnings reports, NVIDIA claims the demand for Hopper remains strong, while Blackwell chips will potentially generate billions of dollars in revenue in the fourth quarter. AMD’s Instinct MI300 series, on the other hand, has emerged as a primary growth driver, as it is expected to generate more than USD 4.5 billion in sales this year.

(Photo credit: NVIDIA)

Please note that this article cites information from Richard’s Research Blog.
