News
According to a report by Taiwanese media TechNews, industry sources have indicated that Microsoft has recently reduced its orders for Nvidia’s H100 graphics cards. This move suggests that the demand for H100 graphics cards in the large-scale artificial intelligence computing market has tapered off, and the frenzy of orders from previous customers is no longer as prominent.
In this wave of artificial intelligence trends, the major purchasers of AI servers are large-scale cloud computing service providers. Regarding Microsoft’s reported reduction in orders for Nvidia’s H100 graphics cards, market experts point to one key factor: usage of Microsoft’s AI collaboration tool, Microsoft 365 Copilot, has not met expectations.
Another critical factor affecting Microsoft’s decision to reduce orders for Nvidia’s H100 graphics cards is the usage statistics of ChatGPT. Since its launch in November 2022, this generative AI application has experienced explosive growth in usage and has been a pioneer in the current artificial intelligence trend. However, ChatGPT experienced a usage decline for the first time in June 2023.
Industry insiders have noted that the reduction in Microsoft’s H100 graphics card orders was predictable. In May, both server manufacturers and direct customers stated that they would have to wait for over six months to receive Nvidia’s H100 graphics cards. However, in August, Tesla announced the deployment of a cluster of ten thousand H100 graphics cards, meaning that even those who placed orders later were able to receive sufficient chips within a few months. This indicates that the demand for H100 graphics cards, including from customers like Microsoft, has already been met, signifying that the fervent demand observed several months ago has waned.
(Photo credit: Nvidia)
News
According to a report by Taiwan’s Commercial Times, NVIDIA is facing repercussions from US chip restrictions, which now control the export of high-end AI GPU chips to certain countries in the Middle East. NVIDIA claims that these controls won’t have an immediate impact on its performance, and industry insiders in the Taiwanese supply chain believe the initial effects are minimal. However, judging by what happened after exports to China were prohibited, the controls could trigger another wave of preemptive stockpiling.
Industry sources from the supply chain note that following the US restrictions on exporting chips to China last year, the purchasing power of Chinese clients increased rather than decreased, resulting in a surge in demand for second-tier and lower chip products and setting off a wave of stockpiling.
Take NVIDIA’s previous generation A100 chip for instance. After the US implemented export restrictions on China, NVIDIA replaced it with the lower-tier A800 chip, which quickly became a sought-after product in the Chinese market, driving prices to surge. It’s reported that the A800 has seen a cumulative price increase of 60% from the start of the year to late August, and it remains one of the primary products ordered by major Chinese CSPs.
Furthermore, the recently launched L40S GPU server by NVIDIA in August has become a market focal point. While it may not match the performance of systems like HGX H100/A100 in large-scale AI algorithm training, it outperforms the A100 in AI inference or small-scale AI algorithm training. As the L40S GPU is positioned in the mid-to-low range, it is currently not included in the list of chips subject to export controls to China.
Supply chain insiders suggest that even if the control measures on exporting AI chips to the Middle East are further enforced, local clients are likely to turn to alternatives like the A800 and L40S. However, with uncertainty about whether the US will extend the scope of controlled chip categories, this could potentially trigger another wave of purchasing and stockpiling.
The primary direct beneficiaries in this scenario are still the chip manufacturers. Within the Taiwanese supply chain, Wistron, which supplies chip brands in the AI server front-end GPU board sector, stands to gain. Taiwanese supply chain companies producing A800 series AI servers and the upcoming L40S GPU servers, such as Quanta, Inventec, Gigabyte, and ASUS, have the opportunity to benefit as well.
(Photo credit: NVIDIA)
News
According to a report from Chinatimes, Asus announced on the 30th of this month the release of AI servers equipped with NVIDIA’s L40S GPUs, which are now available for order. NVIDIA introduced the L40S GPU in August to address the shortage of H100 and A100 GPUs. Asus responded by unveiling AI server products in less than two weeks, signaling its optimism about the imminent surge of AI applications and its eagerness to seize the opportunity.
Solid AI Capabilities of Asus Group
Apart from being among the first manufacturers to introduce the NVIDIA OVX server system, Asus has leveraged resources from its subsidiaries, such as TaiSmart and Asus Cloud, to establish a formidable AI infrastructure. This not only involves in-house innovation like the Large Language Model (LLM) technology but also extends to providing AI computing power and enterprise-level generative AI applications. These strengths position Asus as one of the few all-encompassing providers of generative AI solutions.
Projected Surge in Server Business
Regarding server business performance, Asus targets a compound annual growth rate of at least 40% until 2027, with a goal of achieving fivefold growth over five years. In particular, the data center server business catering primarily to Cloud Service Providers (CSPs) anticipates tenfold growth within the same timeframe, driven by the adoption of AI server products.
Asus’s CEO recently emphasized that the company’s foray into AI server development was prompt and involved collaboration with NVIDIA from the outset. While the product lineup might be more streamlined compared to other OEM/ODM manufacturers, Asus had secured numerous GPU orders ahead of the AI server demand surge. The company is optimistic about the shipping momentum and order visibility for the new generation of AI servers in the latter half of the year.
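As a rough check on how the 40% annual rate relates to the fivefold and tenfold goals, the compounding arithmetic can be sketched as below. The function names are illustrative, and the sketch assumes growth compounds once per year over exactly five years, which the report does not spell out.

```python
def compound_growth_multiple(annual_rate: float, years: int) -> float:
    """Total growth multiple after compounding annual_rate for the given years."""
    return (1.0 + annual_rate) ** years


def required_cagr(target_multiple: float, years: int) -> float:
    """Annual growth rate needed to reach target_multiple in the given years."""
    return target_multiple ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Assumption: five yearly compounding periods, as implied by "over five years".
    years = 5
    print(f"40% per year for {years} years: {compound_growth_multiple(0.40, years):.2f}x")
    print(f"Rate needed for 5x: {required_cagr(5.0, years):.1%}")
    print(f"Rate needed for 10x: {required_cagr(10.0, years):.1%}")
```

Under these assumptions, 40% a year compounds to roughly 5.4 times the starting level, so the fivefold goal is consistent with the stated minimum rate, while tenfold growth in the same period would need a rate closer to 58% a year.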
Embracing NVIDIA’s Versatile L40S GPU
The NVIDIA L40S GPU, built on the Ada Lovelace architecture, stands out as one of the most powerful general-purpose GPUs in data centers. It offers groundbreaking multi-workload computations for large language model inference, training, graphics, and image processing. Not only does it facilitate rapid hardware solution deployment, but it also holds significance due to the current scarcity of higher-tier H100 and A100 GPUs, which have reached allocation stages. Consequently, businesses seeking to repurpose idle data centers are anticipated to shift their focus toward AI servers featuring the L40S GPU.
Asus’s newly introduced L40S GPU servers include the ESC8000-E11/ESC4000-E11 models with built-in Intel Xeon processors, as well as the ESC8000A-E12/ESC4000A-E12 models utilizing AMD EPYC processors. Depending on the model, these servers can be configured with up to four or up to eight NVIDIA L40S GPUs. This configuration assists enterprises in enhancing training, fine-tuning, and inference workloads, facilitating AI model creation. It also establishes Asus’s platforms as the preferred choice for multi-modal generative AI applications.
News
According to a report from Taiwan’s TechNews, NVIDIA has delivered impressive results in its latest financial report, coupled with an optimistic outlook for its financial projections. This demonstrates that the demand for AI remains robust for the coming quarters. Currently, NVIDIA’s H100 and A100 chips both utilize TSMC’s CoWoS advanced packaging technology, making TSMC’s production capacity a crucial factor.
Examining the core GPU market, NVIDIA holds a dominant market share of 90%, while AMD accounts for about 10%. While other companies might adopt Google’s TPU or develop customized chips, they currently lack significant operational cost advantages.
In the short term, the shortage of CoWoS has led to tight chip supplies. However, according to a recent report by Morgan Stanley Securities, NVIDIA believes that TSMC’s CoWoS capacity won’t restrict shipments of the next quarter’s H100 GPUs. The company anticipates an increase in supply for each quarter next year. Simultaneously, TSMC is raising CoWoS prices by 20% for rush orders, indicating that the anticipated CoWoS bottleneck might alleviate.
According to industry sources, NVIDIA is actively diversifying its CoWoS supply chain away from TSMC. UMC, ASE, Amkor, and SPIL are significant players in this effort. Currently, UMC is expanding its interposer production capacity, aiming to double its capacity to relieve the tight CoWoS supply situation.
According to Morgan Stanley Securities, TSMC’s monthly CoWoS capacity this year is around 11,000 wafers, projected to reach 25,000 wafers by the end of next year. Non-TSMC CoWoS supply chain’s monthly capacity can reach 3,000 wafers, with a planned increase to 5,000 wafers by the end of next year.
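To put the capacity figures side by side, the short sketch below adds up the TSMC and non-TSMC numbers quoted above. It treats the approximate values ("around 11,000", "can reach 3,000") as point estimates, which is an assumption, and the dictionary keys and function name are illustrative only.

```python
# Monthly CoWoS capacity in wafers: (this year, end of next year).
# Figures are the approximate values quoted from Morgan Stanley Securities.
CAPACITY = {
    "TSMC": (11_000, 25_000),
    "non-TSMC": (3_000, 5_000),
}


def summarize(capacity: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Return combined capacity, its growth multiple, and TSMC's share."""
    total_now = sum(now for now, _ in capacity.values())
    total_next = sum(nxt for _, nxt in capacity.values())
    return {
        "total_now": total_now,
        "total_next": total_next,
        "growth_multiple": total_next / total_now,
        "tsmc_share_now": capacity["TSMC"][0] / total_now,
        "tsmc_share_next": capacity["TSMC"][1] / total_next,
    }


if __name__ == "__main__":
    for key, value in summarize(CAPACITY).items():
        print(f"{key}: {value:,.2f}")
```

On these figures, combined monthly capacity would rise from about 14,000 to about 30,000 wafers, a little more than double, with TSMC still supplying the large majority.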
(Photo credit: TSMC)
Press Releases
NVIDIA’s latest financial report for FY2Q24 reveals that its data center business reached US$10.32 billion, a QoQ growth of 141% and a YoY increase of 171%. The company remains optimistic about its future growth. TrendForce believes that the primary driver behind NVIDIA’s robust revenue growth is its AI server-related solutions for data centers. Key products include AI-accelerated GPUs and the HGX AI server reference architecture, which serve as the foundational AI infrastructure for large data centers.
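The growth percentages imply the size of the earlier quarters they are measured against. A minimal sketch of that back-calculation follows. The function name is illustrative, and because the inputs are the rounded figures quoted above, the results are approximate.

```python
def implied_base(current: float, growth: float) -> float:
    """Earlier-period value implied by a current value and a growth rate.

    growth is a fraction, so 141% growth is passed as 1.41.
    """
    return current / (1.0 + growth)


if __name__ == "__main__":
    data_center_revenue = 10.32  # US$ billion, FY2Q24, as reported above

    prior_quarter = implied_base(data_center_revenue, 1.41)  # QoQ +141%
    year_ago = implied_base(data_center_revenue, 1.71)       # YoY +171%

    print(f"Implied prior-quarter revenue: US${prior_quarter:.2f} billion")
    print(f"Implied year-ago revenue:      US${year_ago:.2f} billion")
```

With these inputs, the implied data center revenue is roughly US$4.28 billion for the previous quarter and roughly US$3.81 billion for the same quarter a year earlier.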
TrendForce further anticipates that NVIDIA will integrate its software and hardware resources. Utilizing a refined approach, NVIDIA will align its high-end, mid-tier, and entry-level GPU AI accelerator chips with various ODMs and OEMs, establishing a collaborative system certification model. Beyond accelerating the deployment of CSP cloud AI server infrastructures, NVIDIA is also partnering with entities like VMware on solutions including the Private AI Foundation. This strategy extends NVIDIA’s reach into the edge enterprise AI server market, underpinning steady growth in its data center business for the next two years.
NVIDIA’s data center business surpasses 76% revenue share due to strong demand for cloud AI
In recent years, NVIDIA has been actively expanding its data center business. In FY4Q22, data center revenue accounted for approximately 42.7% of the company’s total revenue, trailing its gaming segment by about 2 percentage points. However, by FY1Q23, the data center business had surpassed gaming, accounting for over 45% of revenue. Starting in 2023, with major CSPs heavily investing in chatbots and various AI services for public cloud infrastructures, NVIDIA reaped significant benefits. By FY2Q24, the data center revenue share had skyrocketed to over 76%.
NVIDIA targets both Cloud and Edge Data Center AI markets
TrendForce observes and forecasts a shift in NVIDIA’s approach to high-end GPU products in 2H23. While the company has primarily focused on top-tier AI servers equipped with the A100 and H100, given positive market demand, NVIDIA is likely to prioritize the higher-priced H100 to effectively boost its data-center-related revenue growth.
NVIDIA is currently emphasizing the L40S as its flagship product for mid-tier GPUs, which has several strategic implications. Firstly, the high-end H100 series is constrained by the limited production capacity of current CoWoS and HBM technologies. In contrast, the L40S primarily utilizes GDDR memory. Without the need for CoWoS packaging, it can be rapidly introduced to the mid-tier AI server market, filling the gap left by the A100 PCIe interface in meeting the needs of enterprise customers.
Secondly, the L40S also targets enterprise customers who don’t require large parameter models like ChatGPT. Instead, it focuses on more compact AI training applications in various specialized fields, with parameter counts ranging from tens of billions to under a hundred billion. It can also address edge AI inference or image analysis tasks. Additionally, in light of potential geopolitical issues that might disrupt the supply of the high-end GPU H series for Chinese customers, the L40S can serve as an alternative. As for lower-tier GPUs, NVIDIA highlights the L4 or T4 series, which are designed for real-time AI inference or image analysis in edge AI servers. These GPUs emphasize affordability while maintaining a high cost-performance ratio.
HGX and MGX AI server reference architectures are set to be NVIDIA’s main weapons for AI solutions in 2H23
TrendForce notes that recently, NVIDIA has not only refined the product positioning of its core AI GPUs but has also actively promoted its HGX and MGX solutions. Although this approach isn’t new in the server industry, NVIDIA has the opportunity to solidify its leading position with this strategy. The key is NVIDIA’s absolute leadership, which stems from the extensive integration of its GPUs with the CUDA platform and establishes a comprehensive AI ecosystem. As a result, NVIDIA has considerable negotiating power with existing server supply chains. Consequently, ODMs like Inventec, Quanta, FII, Wistron, and Wiwynn, as well as brands such as Dell, Supermicro, and Gigabyte, are encouraged to follow NVIDIA’s HGX or MGX reference designs. However, they must undergo NVIDIA’s hardware and software certification process for these AI server reference architectures. Leveraging this, NVIDIA can bundle and offer integrated solutions like its Arm CPU Grace, NPU, and AI Cloud Foundation.
It’s worth noting that for ODMs or OEMs, given that NVIDIA is expected to make significant achievements in the AI server market for CSPs from 2023 to 2024, there will likely be a boost in overall shipment volume and revenue growth of AI servers. However, with NVIDIA’s strategic introduction of standardized AI server architectures like HGX or MGX, the core product architecture for AI servers among ODMs and others will become more homogenized. This will intensify the competition among them as they vie for orders from CSPs. Furthermore, it’s been observed that large CSPs such as Google and AWS are leaning toward adopting in-house ASIC AI accelerator chips in the future, meaning there’s a potential threat to a portion of NVIDIA’s GPU market. This is likely one of the reasons NVIDIA continues to roll out GPUs with varied positioning and comprehensive solutions. They aim to further expand their AI business aggressively to Tier-2 data centers (like CoreWeave) and edge enterprise clients.