News
The world’s four major CSPs (Cloud Service Providers) – Microsoft, Google, Amazon, and Meta – are continuously expanding their AI infrastructure, with their combined capital expenditures projected to reach USD 170 billion this year. Industry sources cited in a report from Commercial Times point out that, due to the surge in demand for AI chips and the growing area of silicon interposers, the number of interposers that can be produced from a single 12-inch wafer is decreasing. This situation is expected to keep TSMC’s CoWoS (Chip on Wafer on Substrate) production capacity in short supply.
Regarding CoWoS, TrendForce notes that the introduction of NVIDIA’s B series, including the GB200, B100, and B200, is expected to consume more CoWoS production capacity. TSMC has also raised its CoWoS capacity plans for the entirety of 2024, with estimated monthly capacity approaching 40,000 wafers by year-end, an increase of over 150% from total 2023 capacity. Total capacity could nearly double again in 2025.
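As a rough cross-check of the capacity figures quoted above, the implied 2023 baseline and a hypothetical 2025 doubling can be sketched with simple arithmetic (all numbers are the approximations cited in the report, not official TSMC data):

```python
# CoWoS capacity figures as quoted in the report above (approximate).
capacity_2024_monthly = 40_000   # wafers/month, estimated by end-2024
growth_vs_2023 = 1.50            # ">150%" increase over total 2023 capacity

# Implied 2023 monthly capacity: 40,000 = x * (1 + 1.50)
capacity_2023_monthly = capacity_2024_monthly / (1 + growth_vs_2023)
print(f"Implied 2023 capacity: ~{capacity_2023_monthly:,.0f} wafers/month")

# If total capacity nearly doubles in 2025, as the report suggests is possible:
capacity_2025_monthly = capacity_2024_monthly * 2
print(f"Possible 2025 capacity: ~{capacity_2025_monthly:,} wafers/month")
```

Since the reported growth is "over 150%," the implied 2023 figure (~16,000 wafers/month) is an upper-bound estimate.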
However, with NVIDIA releasing the B100 and B200, the interposer area used by a single chip will be larger than before. This means the number of interposers obtained from a 12-inch wafer will decrease further, leaving CoWoS production capacity unable to meet GPU demand. Meanwhile, the number of HBM units installed per GPU is also multiplying.
Moreover, in CoWoS, multiple HBMs are placed around the GPU, and HBM is also considered one of the bottlenecks. Industry sources indicate that HBM is a significant challenge, with the number of EUV (Extreme Ultraviolet Lithography) layers gradually increasing. For example, SK Hynix, which holds the leading market share in HBM, applied EUV to a single layer during its 1α production phase. Starting this year, the company is transitioning to 1β, potentially increasing the number of EUV layers three- to fourfold.
In addition to the increased technical difficulty, the number of DRAM dies within HBM has also grown with each generation. HBM2 stacks 4 to 8 DRAM dies, HBM3/3e increases this to 8 to 12, and HBM4 will further raise the number of stacked dies to 16.
Given these dual bottlenecks, the challenges will be difficult to overcome in the short term. Competitors are also proposing solutions; for instance, Intel is using rectangular glass substrates to replace 12-inch wafer interposers. However, this approach requires significant preparation time and research and development investment, and breakthroughs from industry players are still awaited.
Read more
(Photo credit: NVIDIA)
News
The latest challenger has emerged in the battle for dominance in China’s GPU and graphics card market. China’s LinJoWing has unveiled its self-developed second-generation graphics processing chip, the GP201, reportedly boasting performance metrics surpassing those of AMD’s E8860 embedded graphics card.
According to a report from global media outlet Tom’s Hardware, LinJoWing, despite being only three years old, has demonstrated GP201 performance comparable to that of AMD’s E8860 embedded graphics card from a decade ago. While the GPU is already in production and available in China, it has yet to surface on American shopping websites.
As per a report from Chinese media outlet IT Home, the GP201 outperforms the AMD E8860 embedded graphics card in various aspects such as 3D performance, 2D polygon rendering, ellipse rendering, pixel and image shifting, window rendering, and support for the domestic OpenCL library platform.
Additionally, LinJoWing’s GP201 GPU supports multiple Chinese-made processors and operating systems, with single-precision floating-point computing power reaching 1.2 TFLOPS. It supports 4K 60Hz display and H.265 decoding, with a maximum power consumption of 30W. Currently, five models of the GPU have been released in full-height, half-height, MXM, and other form factors.
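From the two figures quoted above (1.2 TFLOPS single-precision and 30W maximum power, both vendor-reported), a quick performance-per-watt estimate can be sketched as:

```python
# GP201 figures as cited in the text above (approximate, vendor-reported).
fp32_tflops = 1.2   # single-precision floating-point compute
max_power_w = 30    # maximum power consumption

# Performance per watt, in GFLOPS/W.
gflops_per_watt = fp32_tflops * 1000 / max_power_w
print(f"~{gflops_per_watt:.0f} GFLOPS/W")
```

This is a peak-throughput ratio only; real-world efficiency depends heavily on drivers and workload, which is part of Tom’s Hardware’s criticism below.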
Tom’s Hardware believes the performance of the GP201 is actually unimpressive. NVIDIA’s entry-level GT 1030, released in 2017, matches the GP201 in clock speed, TFLOPS, and power consumption, and generally sells for under USD 50 on eBay. The GT 1030 also benefits from mature NVIDIA drivers, a level of software maturity that will be difficult for LinJoWing to reach. However, LinJoWing’s ability to enter production only three years after its founding gives it a competitive edge over other Chinese graphics card makers.
This year, LinJoWing also surpassed its competitor Loongson in the low-end GPU market. However, challenging its biggest competitor, Moore Threads, will require further effort: the specifications of Moore Threads’ flagship GPU can rival NVIDIA’s RTX 3060 Ti.
Read more
(Photo credit: AMD)
News
Japanese digital infrastructure service provider Sakura Internet, backed by government subsidies, is enhancing its cloud services for generative AI. According to a report from MoneyDJ, Sakura Internet’s procurement of GPUs is set to increase fivefold from the initial plan, with purchases including NVIDIA’s latest product, the “B200,” unveiled in March.
On April 19th, Sakura announced that it had secured Japanese government subsidies to strengthen its generative-AI cloud service “Koukaryoku.” The company plans to expand the number of GPUs deployed in “Koukaryoku” to five times the initially planned quantity, incorporating around 10,000 GPUs, including NVIDIA’s latest “NVIDIA HGX B200 system” introduced in March. The goal is to establish a large-scale cloud infrastructure with a computational power of 18.9 EFLOPS by the end of March 2028.
Sakura had previously received similar government subsidies in June 2023, marking this as the second time they have received such support.
Sakura had announced last June that it would invest JPY 13 billion to purchase approximately 2,000 NVIDIA GPUs (with a computational power of 2.0 EFLOPS) between July 2023 and March 2025. Due to significantly higher demand than expected, the procurement of these 2,000 GPUs is projected to be completed ahead of schedule, by the end of June 2024.
This new investment plan, totaling around JPY 100 billion (including costs for server components other than GPUs and maintenance fees), targets additional procurement of approximately 8,000 GPUs (with a computational power of 16.9 EFLOPS) between April 2024 and December 2027.
The overall GPU procurement quantity of around 10,000 units will thus be five times the original plan of approximately 2,000 units. According to Japanese media reports, Sakura will provide server computing power equipped with these GPUs to companies engaged in generative AI research.
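The procurement figures in the report can be cross-checked with simple arithmetic (all numbers are the approximations quoted above):

```python
# Sakura Internet GPU procurement figures as cited in the report above.
initial_gpus, initial_eflops = 2_000, 2.0         # July 2023 - March 2025 plan
additional_gpus, additional_eflops = 8_000, 16.9  # April 2024 - December 2027 plan

total_gpus = initial_gpus + additional_gpus
total_eflops = initial_eflops + additional_eflops
multiple = total_gpus / initial_gpus

print(f"Total GPUs: ~{total_gpus:,}")          # matches the ~10,000 stated
print(f"Total compute: {total_eflops} EFLOPS") # matches the 18.9 EFLOPS target
print(f"Scale-up vs original plan: {multiple:.0f}x")
```

The sums line up with the stated 10,000-GPU, 18.9 EFLOPS, fivefold-expansion targets.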
On April 19, the Ministry of Economy, Trade and Industry of Japan announced that in order to establish the necessary supercomputers for developing generative AI domestically in Japan, they will provide a maximum subsidy of JPY 72.5 billion to five Japanese companies, with Sakura receiving a maximum subsidy of JPY 50.1 billion.
Previously, NVIDIA CEO Jensen Huang visited Japan in December last year and met with Japanese Prime Minister Fumio Kishida. Huang stated that Prime Minister Kishida requested that NVIDIA supply as many GPUs as possible to Japan for generative AI. NVIDIA will collaborate with Japanese companies including Sakura, SoftBank, NEC, NTT, and others to accelerate the development of generative AI.
Read more
(Photo credit: Sakura)
News
According to sources cited by the American news outlet Business Insider, Microsoft plans to double its inventory of GPUs to 1.8 million, primarily sourced from NVIDIA. Having more chips on hand will enable Microsoft to launch AI products that are more efficient, faster, and more cost-effective.
The sources did not detail specific future applications for these chips, but acquiring such a large quantity means Microsoft can deploy them extensively across its own products, including cloud services and consumer electronics.
The sources cited by the same report further revealed that Microsoft plans to invest USD 100 billion in GPUs and data centers by 2027 to strengthen its existing infrastructure.
Microsoft’s significant stockpiling of AI chips underscores the company’s efforts to maintain a competitive edge in the AI field, where having robust computing power is crucial for innovation.
On the other hand, NVIDIA recently stated that the AI computer it is building with Microsoft will run on Microsoft’s Azure cloud platform and will utilize tens of thousands of NVIDIA GPUs, including its H100 and A100 chips.
NVIDIA declined to disclose the contract value of this collaboration. However, industry sources cited by the report estimate that the price of each A100 chip ranges from USD 10,000 to 12,000, while the price of the H100 is significantly higher than this range.
Additionally, Microsoft is also designing its own next-generation AI chip. Not only is Microsoft striving to reduce its reliance on NVIDIA; other companies, including OpenAI, Tesla, Google, Amazon, and Meta, are also investing in developing their own AI accelerator chips. These companies are expected to compete with NVIDIA’s flagship H100 AI accelerator.
Read more
(Photo credit: NVIDIA)
News
The GPU shortage has reportedly been alleviated. Per a report from Economic Daily News, the easing has led to a significant improvement in delivery times for major server brands like Dell: delivery times have decreased from 40 weeks at the end of last year to a normal cycle of 8-12 weeks now, and sometimes even shorter.
Dell is reportedly capitalizing on opportunities in artificial intelligence (AI). The same report cites Terence Liao, General Manager of Dell Taiwan, who indicated on April 9th that the company is seeing strong server orders and demand in the Taiwanese market, a surge driven primarily by robust AI needs within Taiwan’s corporate sector.
As for the previously challenging GPU shortage issue affecting the industry, delivery times have significantly improved this year following the expansion of CoWoS (Chip-on-Wafer-on-Substrate) capacity.
Terence Liao mentioned that towards the end of last year, there was indeed a tight supply of NVIDIA’s H100 GPUs, leading to Dell’s delivery times averaging around 40 weeks and competitors experiencing even longer delays of up to 52 weeks. However, starting from February this year, GPU supply has notably improved. For Dell in Taiwan specifically, delivery times have returned to a normal cycle of 8-12 weeks, and sometimes even shorter.
With the GPU shortage eased, Dell Taiwan openly acknowledges that it currently has a high volume of server orders and strong demand, largely driven by Taiwanese enterprises seeking AI solutions. Terence Liao noted that, from the perspective of the Taiwan market, the industries actively adopting AI include the manufacturing, healthcare, government, finance, and telecommunications sectors.
As per Dell’s GenAI Pulse Survey, 78% of IT decision-makers express anticipation for AI-driven solutions to unleash potential within enterprises, viewing AI as a means to enhance productivity, streamline processes, and reduce costs.
Moreover, from a corporate budget perspective, the allocation for AI servers has increased from around 10% in the past to approximately 20% currently. This shift indicates a heightened commitment within the Taiwanese industry towards investing in AI.
Terence Liao emphasized that the demand for AI servers primarily comes from Cloud Service Providers (CSPs) and general enterprises. While CSPs still represent a significant portion of this demand, Dell is particularly pleased to see an increase in demand from general enterprises.
TrendForce previously underscored that the primary momentum for server shipments this year remains with American CSPs. However, with persistently high inflation and elevated corporate financing costs curtailing capital expenditures, overall demand has not yet returned to pre-pandemic growth levels. Global server shipments are estimated to reach approximately 13.654 million units in 2024, an increase of about 2.05% YoY. Meanwhile, the market continues to focus on the deployment of AI servers, whose shipment share is estimated at around 12.1%.
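TrendForce’s shipment figures imply a rough 2023 baseline and AI-server volume, which can be derived with a back-of-the-envelope calculation from the numbers quoted above:

```python
# Global server shipment figures cited from TrendForce above.
shipments_2024_m = 13.654  # million units, 2024 estimate
yoy_growth = 0.0205        # +2.05% year-over-year
ai_share = 0.121           # AI servers at ~12.1% of shipments

# Implied 2023 baseline: 2024 estimate divided by (1 + growth rate).
implied_2023_m = shipments_2024_m / (1 + yoy_growth)
# Implied AI server volume in 2024: total shipments times the AI share.
ai_servers_2024_m = shipments_2024_m * ai_share

print(f"Implied 2023 shipments: ~{implied_2023_m:.2f} million units")
print(f"Implied 2024 AI server shipments: ~{ai_servers_2024_m:.2f} million units")
```

That works out to roughly 13.38 million units in 2023 and about 1.65 million AI servers in 2024, consistent with the growth and share figures stated above.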
Read more
(Photo credit: Dell)