News
Major Cloud Service Providers (CSPs) are expected to keep increasing their demand for AI servers over the next two years. TrendForce's latest projections put global AI server shipments at approximately 1.18 million units in 2023, a year-on-year increase of 34.5%. The trend is expected to persist into 2024, with estimated annual growth of around 40.2%, bringing AI servers to over 12% of total server shipments.
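As a quick sanity check on these figures, the growth rates can be applied to the 2023 shipment estimate. The sketch below uses only the numbers quoted above; the variable names are illustrative, and the 2022 and 2024 values are derived here rather than reported by TrendForce.

```python
# Figures quoted in the article (TrendForce projections).
shipments_2023_m = 1.18   # million AI servers shipped in 2023
growth_2023 = 0.345       # year-on-year growth in 2023
growth_2024 = 0.402       # estimated year-on-year growth in 2024

# Derived values (not stated in the article): back out the 2022 base
# and project 2024 by compounding the quoted growth rates.
implied_2022_m = shipments_2023_m / (1 + growth_2023)
projected_2024_m = shipments_2023_m * (1 + growth_2024)

print(f"Implied 2022 shipments:   {implied_2022_m:.2f} million")
print(f"Projected 2024 shipments: {projected_2024_m:.2f} million")
```

Under these inputs, the implied 2022 base is roughly 0.88 million units and the 2024 projection is roughly 1.65 million units.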
NVIDIA, whose key products include AI-accelerating GPUs and the AI server reference architecture HGX, currently holds the highest market share in the AI sector. However, it is crucial to monitor CSPs that are developing their own chips and, in the case of Chinese companies restricted by U.S. sanctions, expanding investments in self-developed ASICs and general-purpose AI chips.
According to TrendForce data, AI servers equipped with NVIDIA GPUs accounted for approximately 65.1% of the AI server market this year, a share projected to decrease to 63.5% next year. In contrast, the shares of servers featuring AMD chips and CSP self-developed chips are expected to increase to 8.2% and 25.4%, respectively, in the coming year.
Another critical component, HBM (High Bandwidth Memory), is primarily supplied by major vendors Samsung, SK Hynix, and Micron, with market shares of approximately 47.5%, 47.5%, and 5.0%, respectively, this year. Because HBM costs roughly 5 to 8 times as much as DDR4/DDR5, this is expected to contribute to a staggering 172% year-on-year revenue growth in the HBM market in 2024.
The three major manufacturers are expected to complete HBM3e verification in the first quarter of 2024. However, the results of each manufacturer’s HBM3e verification will determine how NVIDIA ultimately allocates its procurement among HBM suppliers in 2024. As the verifications are still underway, the HBM market shares for 2024 remain to be seen.
(Photo credit: NVIDIA)
News
Samsung Electronics and South Korean internet giant Naver have joined forces to invest in an artificial intelligence semiconductor solution. According to BusinessKorea’s report, the energy efficiency of the two companies' first solution chip is expected to be roughly eight times higher than that of competing products such as NVIDIA's H100.
This new solution is based on a Field-Programmable Gate Array (FPGA) customized for Naver’s HyperCLOVA X large language model.
According to Tom’s Hardware, citing Naver, this AI chip is eight times more power-efficient than NVIDIA’s H100 AI GPU thanks to its use of LPDDR memory. However, specific details remain undisclosed, and the two companies have yet to clarify the timeline for product development.
Samsung and Naver began their collaboration at the end of 2022, utilizing Samsung’s advanced process technology, expertise in memory technologies like computational storage, processing-in-memory (PIM) and processing-near-memory (PNM), as well as Compute Express Link (CXL). Naver’s strengths in software and AI algorithms are also leveraged in this collaboration.
Samsung has already produced and sold various types of memory and storage technologies for AI applications, including SmartSSD, HBM-PIM, and memory expansion modules with CXL interfaces, all crucial for the upcoming AI chips.
“Through our collaboration with NAVER, we will develop cutting-edge semiconductor solutions to solve the memory bottleneck in large-scale AI systems,” said Jinman Han, Executive Vice President of Memory Global Sales & Marketing at Samsung Electronics.
(Photo credit: Samsung)
News
In response to export restrictions on AI chips by the U.S. Department of Commerce, NVIDIA previously introduced a China-exclusive version of its graphics card, named GeForce RTX 4090 D and featuring the AD102-250 GPU.
According to ICsmart’s report, industry insiders say NVIDIA is rumored to officially unveil the GeForce RTX 4090 D on December 28 at 10:00 PM (GMT+8), with the suggested retail price remaining at CNY 12,999.
Following the new U.S. restrictions on semiconductor exports to China introduced in October this year, sales of NVIDIA’s high-end gaming graphics card, the GeForce RTX 4090, were restricted in China.
To address this issue, NVIDIA decided to develop the customized GeForce RTX 4090 D specifically for the Chinese market. By adjusting certain specifications to comply with U.S. export control requirements, the company aims to continue sales in China.
According to previous information, the RTX 4090 D is still based on TSMC’s 4nm process and features the AD102 GPU. However, the core designation changes from AD102-300 to AD102-250, corresponding to a downgrade in specifications. The exact number of CUDA cores is not yet clear, but it is expected to be fewer than the 16,384 cores in the RTX 4090.
Additionally, the core base clock will see a slight increase from 2235MHz to 2280MHz, while the boost clock remains at 2520MHz. It is possible that the card will retain 24 GB of GDDR6X memory capacity with over 1 TB/s of bandwidth. The total board power (TBP) is expected to see a slight reduction from 450W to 425W.
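The "over 1 TB/s" bandwidth figure and the size of the rumored changes can be checked with simple arithmetic. The sketch below is illustrative only: the 384-bit bus width and 21 Gbps per-pin data rate are the standard RTX 4090 memory configuration and are assumed here to carry over to the RTX 4090 D, which the report does not confirm. The helper function names are hypothetical.

```python
def gddr_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8."""
    return bus_width_bits * data_rate_gbps / 8


def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


# ASSUMPTION: RTX 4090 memory configuration (384-bit, 21 Gbps GDDR6X),
# presumed unchanged on the RTX 4090 D.
bandwidth = gddr_bandwidth_gb_s(384, 21.0)

# Rumored figures quoted in the article.
base_clock_change = pct_change(2235, 2280)  # MHz, RTX 4090 -> RTX 4090 D
tbp_change = pct_change(450, 425)           # W, RTX 4090 -> RTX 4090 D

print(f"Peak memory bandwidth: {bandwidth:.0f} GB/s")
print(f"Base clock change:     {base_clock_change:+.2f}%")
print(f"TBP change:            {tbp_change:+.2f}%")
```

With these inputs the peak bandwidth works out to 1008 GB/s, consistent with "over 1 TB/s", while the base clock rises by about 2% and the board power falls by about 5.6%.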
(Photo credit: NVIDIA)
News
AMD has long aspired to gain more favor for its AI chips, aiming to break into Nvidia’s stronghold in the AI chip market. Key players like Meta, OpenAI, and Microsoft, who are major buyers of AI chips, also desire a diversified market with multiple AI chip suppliers to avoid vendor lock-in issues and reduce costs.
With AMD’s latest AI chip, the Instinct MI300X, slated for significant shipments in early 2024, these three major AI chip buyers have publicly announced plans to place orders, as they consider AMD’s solution a more cost-effective alternative.
At the AMD “Advancing AI” event on December 6th, Meta, OpenAI, Microsoft, and Oracle declared their preference for AMD’s latest AI chip, Instinct MI300X. This marks a groundbreaking move by AI tech giants actively seeking alternatives to Nvidia’s expensive GPUs.
For applications like OpenAI’s ChatGPT, Nvidia GPUs have played a crucial role. However, if the AMD MI300X can provide a significant cost advantage, it has the potential to impact Nvidia’s sales performance and challenge its market dominance in AI chips.
AMD’s Three Major Challenges
AMD grapples with three major challenges: convincing enterprises to consider switching suppliers, competing with Nvidia’s CUDA software as the industry standard, and setting competitive GPU pricing. Lisa Su, AMD’s CEO, highlighted at the event that the new MI300X architecture features 192GB of high-performance HBM3, which not only delivers faster data transfer but also meets the demands of larger AI models. Su emphasized that such a notable performance boost translates directly into an enhanced user experience, enabling quicker responses to complex user queries.
However, AMD is currently facing critical challenges. Companies that rely heavily on Nvidia may hesitate to invest time and resources in an alternative GPU supplier like AMD. Su believes there is an opportunity to persuade these AI tech giants to adopt AMD GPUs.
Another pivotal concern is that Nvidia has established its CUDA software as the industry standard, resulting in a highly loyal customer base. In response, AMD has made improvements to its ROCm software suite to effectively compete in this space. Lastly, pricing is a crucial issue, as AMD did not disclose the price of the MI300X during the event. Convincing customers to choose AMD over Nvidia, whose chips are priced around USD 40,000 each, will require substantial cost advantages in both the purchase and operation of AMD’s offerings.
The Overall Size of the AI GPU Market is Expected to Reach USD 400 Billion by 2027
AMD has already secured agreements to use the MI300X with companies eager for high-performance GPUs. Meta plans to leverage MI300X GPUs for AI inference tasks like AI graphics, image editing, and AI assistants. Meanwhile, Microsoft’s CTO, Kevin Scott, announced that the company will provide access to the MI300X through its Azure web service.
Additionally, OpenAI has decided to have Triton, its GPU programming language dedicated to machine learning algorithm development, support the AMD MI300X. Oracle Cloud Infrastructure (OCI) intends to introduce bare-metal instances based on AMD MI300X GPUs in its high-performance accelerated computing instances for AI.
AMD anticipates that the annual revenue from its GPUs for data centers will reach USD 2 billion by 2024. This projected figure is substantially lower than Nvidia’s most recent quarterly sales related to the data center business (i.e., over USD 14 billion, including sales unrelated to GPUs). AMD emphasizes that with the rising demand for high-end AI chips, the AI GPU market’s overall size is expected to reach USD 400 billion by 2027. This strategic focus on AI GPU products underscores AMD’s optimism about capturing a significant market share. Lisa Su is confident that AMD is poised for success in this endeavor.
(Image: AMD)
News
According to a news report from IJIWEI, sources have revealed that NVIDIA has placed urgent orders with TSMC for the production of AI GPUs destined for China. These orders fall under the category of “Super Hot Run” (SHR), with plans to commence fulfillment in the first quarter of 2024.
In response to the United States implementing stricter export controls on the Chinese semiconductor industry, sources cited in the report indicate that NVIDIA plans to provide a new “specialized” AI chip to China with lowered specifications, replacing the export-restricted H800, A800, and L40S series.
Insiders suggest that NVIDIA intends to resume supplying the RTX 4090 chip to China in January of next year and to release a modified version later to comply with U.S. export restrictions.
On the other hand, NVIDIA continues to increase its orders with TSMC. This move aims to secure TSMC’s manufacturing capacity to meet the demand for the H100. However, due to limitations in CoWoS (Chip-on-Wafer-on-Substrate) production capacity, the H100 GPU is currently facing severe shortages.
It is noted that following NVIDIA, Intel and AMD are also expected to tailor AI chips for China. TSMC, as the primary pure-play foundry partner for these AI chip suppliers, will continue to enjoy a competitive advantage.
According to sources from semiconductor equipment manufacturers, despite TSMC’s efforts to increase CoWoS production capacity, the foundry still cannot meet the growing demand for NVIDIA GPUs. Additionally, the MI300 chip that was recently launched by AMD is also competing for the foundry industry’s production capacity.
Insiders note that TSMC’s ability to expand CoWoS production capacity is limited by delays in equipment replacement, machine installation, and labor deployment. The new capacity is expected to be ready by the second quarter of 2024.
Equipment is identified as one of the key variables affecting TSMC’s expansion of CoWoS production capacity. Unexpected impacts on production and delivery times from Japanese equipment supplier Shibaura have delayed the development and installation of new capacity across TSMC’s production lines, including those in Longtan and Zhunan.
TSMC Chairman Mark Liu mentioned in a press conference in September that the shortage of CoWoS packaging capacity at TSMC is temporary, and it will be addressed through capacity expansion within the next year and a half to meet the growing demand.
(Photo credit: TSMC)