News
The recent sharp downturn in black-market prices for AI servers equipped with NVIDIA's top-tier AI chip, the H100, in China has attracted attention, as per a report from Economic Daily News. This fluctuation, set in motion by US sanctions, has reportedly raised concerns about its impact on overall supply and demand dynamics, and about whether it will further squeeze normal market mechanisms.
Industry sources cited by the same report have revealed that the prices of AI servers equipped with the H100 chip have recently plummeted on the Chinese black market. This is primarily due to the imminent launch of NVIDIA’s next-generation high-end AI chip, the H200. With the transition between old and new products, scalpers who previously hoarded H100 chips to drive up prices are now offloading their large inventories.
As per a report from Reuters, despite the US expanding its ban on AI technology-related exports to China last year, some dealers are still taking risks. Trading of H100 chips continues in Shenzhen's Huaqiangbei electronics market, but it has all gone underground. The chips are said to be brought into China mainly through purchasing agents or shell companies set up overseas, making them accessible to Chinese universities, research institutions, and even companies through special dealer channels.
Due to the US ban, both the H100 chip and AI servers equipped with it can only be traded on the black market, not openly. Scalpers have significantly inflated prices, with servers featuring the H100 chip reaching over CNY 3 million (over USD 420,000) in China, compared to the official price of USD 280,000 to USD 300,000, resulting in profits of over 10% for some middlemen after deducting logistics and tariffs.
With the H200 set to launch in the second quarter, the H100 will become the “previous generation” product. Consequently, middlemen who had hoarded H100 chips are eager to sell their inventory, leading to a rapid correction in prices.
Recently, servers with the H100 chip on the Chinese black market have dropped to around CNY 2.7 to 2.8 million, with spot prices in Hong Kong falling to around CNY 2.6 million, a decline of over 10% from the peak.
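As a quick back-of-envelope check of the reported correction, a minimal sketch (using the CNY 3 million peak and CNY 2.6 million Hong Kong spot figures cited above; these are the report's rough figures, not exact transaction prices):

```python
# Sanity-check the reported black-market price decline for H100 servers.

def pct_decline(old: float, new: float) -> float:
    """Percentage drop from an old price to a new price."""
    return (old - new) / old * 100

peak_cny = 3_000_000      # reported black-market peak for an H100 server
hk_spot_cny = 2_600_000   # reported Hong Kong spot price

drop = pct_decline(peak_cny, hk_spot_cny)
print(f"Decline from peak: {drop:.1f}%")  # ~13%, consistent with "over 10%"
```

The CNY 2.7–2.8 million mainland prices imply a smaller but still double-digit correction from the peak, which matches the report's framing.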
According to a previous report from Reuters, in response to claims that Chinese universities and research institutions had acquired high-end NVIDIA AI chips through distributors, an NVIDIA spokesperson stated that the report does not imply that NVIDIA or any of its partners violated export control regulations, that these products account for a negligible share of global sales, and that NVIDIA complies with US regulatory standards.
News
NVIDIA’s upcoming next-generation high-end AI chip, the H200, is on the horizon. As per a report from Economic Daily News, demand for the current mainstream high-end H100 chip has declined, putting an end to the previous supply shortage.
As per the same report, Taiwanese contract manufacturers openly acknowledge that the supply of H100 chips is indeed smoother now, primarily due to the alleviation of tight CoWoS advanced packaging capacity constraints.
Despite a significant short-term correction in the market price of H100 chips in China, Taiwan’s AI server manufacturers, such as Quanta and Inventec, are still striving to maximize shipments. This quarter, the momentum of AI server shipments is expected to see a significant boost.
From the perspective of server manufacturers, the demand and pricing of critical components are typically negotiated directly between cloud service providers (CSPs) and chip manufacturers like NVIDIA. Once the price and quantity are agreed upon, manufacturers are then commissioned to produce and ship the products.
Quanta emphasized that with the easing of tight capacity in upstream CoWoS advanced packaging, the supply of H100 chips has become smoother. Maintaining their previous stance, they anticipate that the momentum of AI server shipments will begin to show from this quarter onwards.
A previous report from Tom’s Hardware noted that the ease of purchasing H100 GPUs has brought about some changes in the market as well. Customers now prioritize price and practicality when leasing AI computing services from cloud service providers.
Additionally, alternatives to the H100 GPU have emerged in the current market, offering comparable performance and software support. These may come at more affordable prices, potentially fostering a fairer market environment.
Mike Yang, Senior Vice President and General Manager of Quanta Cloud Technology (QCT), also mentioned recently that they expect to see a significant improvement in chip supply by June, which will subsequently boost server shipment performance in the second half of the year.
News
The previously elusive NVIDIA data center GPU, H100, has seen a noticeable reduction in delivery lead times amid improved market supply conditions, as per a report from Tom’s Hardware. As a result, customers who previously purchased large quantities of H100 chips are reportedly starting to resell them.
The report further points out that the previously high-demand H100 data center GPU, driven by the surge in artificial intelligence applications, has seen a reduction in delivery wait times from a peak of 8-11 months to 3-4 months, indicating a relief in supply pressure.
Additionally, with major cloud providers such as AWS, Google Cloud, and Microsoft Azure offering easier access to AI computing services for customers, enterprises that previously purchased large quantities of H100 GPUs have begun further reselling these GPUs.
For instance, AWS introduced a new service allowing customers to rent GPUs for shorter periods, easing earlier chip-demand pressure and shortening the waiting time for AI chips.
The report also indicates that customers are reselling these GPUs because of their reduced scarcity and high maintenance costs. This situation contrasts starkly with the market shortage a year ago.
However, even though the current difficulty in obtaining H100 GPUs has significantly decreased, the artificial intelligence market remains robust overall. The demand for large-scale artificial intelligence model computations persists for some enterprises, keeping the overall demand greater than the supply, thereby preventing a significant drop in the price of H100 GPUs.
The report emphasizes that the current ease of purchasing H100 GPUs has also brought about some changes in the market. Customers now prioritize price and practicality when leasing AI computing services from cloud service providers.
Additionally, alternatives to the H100 GPU have emerged in the current market, offering comparable performance and software support at potentially more affordable prices, which may contribute to a more equitable market.
TrendForce’s newest projections spotlight a 2024 landscape where demand for high-end AI servers—powered by NVIDIA, AMD, or other top-tier ASIC chips—will be heavily influenced by North America’s cloud service powerhouses.
Microsoft (20.2%), Google (16.6%), AWS (16%), and Meta (10.8%) are predicted to collectively command over 60% of global demand, with NVIDIA GPU-based servers leading the charge.
However, NVIDIA still faces ongoing hurdles in development as it contends with US restrictions.
TrendForce has pointed out that, despite NVIDIA’s stronghold in the data center sector—thanks to its GPU servers capturing up to 70% of the AI market—challenges continue to loom.
Three major challenges are set to limit the company’s future growth: Firstly, the US ban on technological exports has spurred China toward self-reliance in AI chips, with Huawei emerging as a noteworthy adversary. NVIDIA’s China-specific solutions, like the H20 series, might not match the cost-effectiveness of its flagship models, potentially dampening its market dominance.
Secondly, the trend toward proprietary ASIC development among US cloud behemoths, including Google, AWS, Microsoft, and Meta, is expanding annually due to scale and cost considerations.
Lastly, AMD presents competitive pressure with its cost-effective strategy, offering products at just 60–70% of the prices of comparable NVIDIA models. This allows AMD to penetrate the market more aggressively, especially with flagship clients. Microsoft is expected to be the most enthusiastic adopter of AMD’s high-end GPU MI300 solutions in 2024.
Insights
The US Department of Commerce issued new restrictions on AI chips on October 17, 2023, focusing on controlling chip exports to China, including NVIDIA’s A800, H800, L40S, and RTX 4090, among others. Taiwanese manufacturers primarily serve cloud service providers and brand owners in North America, with relatively few shipments to Chinese server customers. However, Chinese manufacturers, having already weathered two rounds of US chip restrictions, recognize the significance of AI chips in server applications and are expected to accelerate their in-house chip development.
TrendForce’s Insights:
1. Limited Impact on Taiwanese Manufacturers in Shipping AI Servers with H100 GPUs
Major Taiwanese server manufacturing companies, including Foxconn, Quanta, Inventec, GIGABYTE, and Wiwynn, provide AI servers equipped with H100 GPUs to cloud data centers and brand owners in Europe and the United States. These Taiwanese companies have established AI server factories outside China, in countries such as the US, the Czech Republic, Mexico, Malaysia, and Thailand, focusing on producing L10 server units and L11 cabinets in proximity to end users. This strategy aligns with the needs of US cloud providers and brand owners for global server product deployment.
On the other hand, ODMs including MiTAC, Wistron, and Inventec also provide server assembly services for Chinese brands such as Inspur and Lenovo. MiTAC holds a significant share of Inspur’s server assembly and acquired Intel’s DSG (Data Center Solutions Group) business in July 2023; its AI server focus therefore remains on brand customers using H100 GPUs, including Twitter, Dell, AWS, and the European cloud service provider OVH. It is speculated that the production ratio of brand servers will be adjusted before the new restrictions are enforced.
Wistron is a major supplier of NVIDIA’s AI server modules, the DGX A100 and HGX H100, with primary shipments to end users in Europe and the United States. Adjustments in the proportion of shipments destined for Chinese servers are expected following the implementation of the restrictions.
Compal has fewer AI server orders compared to other Taiwanese manufacturers. It has not yet manifested any noticeable changes in Lenovo server assembly proportions. The full extent of the impact will only become more apparent after the enforcement of the ban.
During the transitional period before the implementation of the chip ban in the United States, the server supply chain can still adapt shipments based on local chip demand in China to address market impacts resulting from subsequent chip controls.
2. Chinese Manufacturers Focusing on Accelerating In-House Chip Development
Chinese cloud companies had already started developing their AI chips before the first U.S. chip restrictions in 2022. This included self-developed AI chips like Alibaba Cloud’s T-HEAD, a data center AI chip, and they expanded investments in areas such as DRAM, AI chips, and semiconductors with the aim of establishing a comprehensive IoT system from chips to the cloud.
Baidu Cloud, on the other hand, accelerated the development of its third-generation self-developed Kunlun chip, designed for cloud and edge computing, with plans for an early 2024 release.
Tencent introduced three self-developed chips in 2021, including an AI inference chip called Zixiao, used for Tencent’s meeting business; a video transcoding chip called Canghai, used in cloud gaming and live streaming applications; and a smart network card chip named Xuanling, applied in network storage and computing.
ByteDance invested in cloud AI chips through Moore Threads in 2022 for AI server applications. Huawei released the Ascend 910 chip in 2019 and is expected to introduce the Ascend 910B AI chip in the latter half of 2024. While this chip is claimed to match the computational power of NVIDIA’s A100, its performance still requires product validation, and it is speculated that it may not replace the current use of NVIDIA GPUs in Chinese AI servers.
Despite the acceleration of self-developed chip development among Chinese cloud server manufacturers, the high technological threshold, lengthy development cycles, and high costs associated with GPU development often delay the introduction of new server products. Therefore, Chinese cloud companies and brand manufacturers continue to purchase NVIDIA GPUs for the production of mid to high-end servers to align with their economic scale and production efficiency.
In response to the new U.S. restrictions, Chinese cloud companies have adopted short-term measures such as increasing imports of existing NVIDIA chips and building up stockpiles before the new rules take effect. Over the medium to long term, they are accelerating resource integration and shortening development timelines to speed up GPU chip manufacturing, thus reducing their dependence on chips subject to U.S. restrictions.
News
According to China Times, in response to increased visibility in AI server orders and optimistic future demand, two Taiwan-based ODM-Direct makers, Wiwynn and Quanta, are accelerating the expansion of their server production lines outside China, and there have recently been updates on their progress. Wiwynn has completed the first phase of its self-owned new factory in Malaysia, dedicated to L10 production, while Quanta has further expanded its L10 production line in California; both are gearing up for future AI server orders.
Wiwynn’s new server assembly factory, located in Senai Airport City in Johor, Malaysia, was officially inaugurated on the 12th and will provide full-cabinet assembly services for large-scale data centers. Additionally, the second phase, a front-end server motherboard production line, is expected to be completed and operational next year, allowing Wiwynn to offer high-end AI servers and advanced cooling technology to cloud service providers and customers in the SEA region.
While Wiwynn has experienced some slowdown in shipments and revenue due to its customers adjusting to inventory and CAPEX impacts in recent quarters, Wiwynn still chooses to continue its overseas factory expansion efforts. Notably, with the addition of the new factory in Malaysia, Wiwynn’s vision of establishing a one-stop manufacturing, service, and engineering center in the APAC region is becoming a reality.
Especially as we enter Q4, the shipment of AI servers based on NVIDIA’s AI-GPU architecture is expected to boost Wiwynn’s revenue. The market predicts that after a strong fourth quarter, this momentum will carry forward into the next year.
How significant is the demand for AI servers?
TrendForce projects a dramatic surge in AI server shipments for 2023, with an estimated 1.2 million units (outfitted with GPUs, FPGAs, and ASICs) destined for markets around the world, marking robust YoY growth of 38.4%. This increase reflects mounting demand for AI servers and chips, with AI servers poised to constitute nearly 9% of total server shipments, a figure projected to reach 15% by 2026. TrendForce has revised its CAGR forecast for AI server shipments between 2022 and 2026 upward to an ambitious 29%.
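The projection's figures can be cross-checked with simple growth arithmetic. A minimal sketch (the implied 2022 base and 2026 total are derived values, not numbers from the report):

```python
# Back-of-envelope check of TrendForce's AI server shipment projections.

ai_servers_2023 = 1_200_000   # projected 2023 AI server shipments (units)
yoy_growth = 0.384            # reported 38.4% YoY growth for 2023

# Implied 2022 base shipments, working backward from the 2023 projection
ai_servers_2022 = ai_servers_2023 / (1 + yoy_growth)
print(f"Implied 2022 shipments: {ai_servers_2022:,.0f}")  # ~867,000 units

# Applying the 29% CAGR over 2022-2026 gives the implied 2026 total
cagr = 0.29
ai_servers_2026 = ai_servers_2022 * (1 + cagr) ** 4
print(f"Implied 2026 shipments: {ai_servers_2026:,.0f}")  # ~2.4 million units
```

The implied 2026 figure of roughly 2.4 million units, against a growing total server market, is consistent with AI servers rising from about 9% to 15% of total shipments.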
Quanta has also been rapidly expanding its production capacity in North America and Southeast Asia in recent years. This year, in addition to establishing new facilities in Vietnam, they have recently expanded their production capacity at their California-based Fremont plant.
The Fremont plant in California has been Quanta’s primary US location for the L10 production line and has expanded several times in recent years. With Tier 1 CSPs increasingly building out data centers, Quanta’s Tennessee plant has also received multiple investments to prepare for operational needs and capacity expansion.
In August of this year, Quanta first injected USD 135 million into its California subsidiary, which then leased a nearly 4,500-square-meter site in the Bay Area. More recently, Quanta announced a USD 79.6 million contract awarded to McLarney Construction, Inc. for three construction projects at its new factory locations.
It is expected that Quanta’s new production capacity will gradually come online, with the earliest capacity expected in 2H24, and full-scale production scheduled for 1H25. With the release of new high-end AI servers featuring the H100 architecture, Quanta has been shipping these products since August and September, contributing to its revenue growth. They aim to achieve a 20% YoY increase in server sales for 2023, with the potential for further significant growth in 2024.