The world’s four major CSPs (Cloud Service Providers), Microsoft, Google, Amazon, and Meta, are continuously expanding their AI infrastructure, with their combined capital expenditures projected to reach USD 170 billion this year. According to industry sources cited in a report from the Commercial Times, the surge in demand for AI chips and the growing area of silicon interposers mean that fewer chips can be produced from a single 12-inch wafer. As a result, TSMC’s CoWoS (Chip on Wafer on Substrate) production capacity is expected to remain in short supply.
Regarding CoWoS, TrendForce notes that the introduction of NVIDIA’s B series, including the GB200, B100, and B200, is expected to consume more CoWoS production capacity. TSMC has also raised its planned CoWoS capacity for 2024, with estimated monthly capacity approaching 40,000 wafers by year-end, an increase of over 150% from total capacity in 2023. Total capacity could nearly double again in 2025.
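As a rough sanity check on these figures, the sketch below works backward from the reported numbers; the 2023 baseline is inferred from the “over 150%” figure and is not stated in the report:

```python
# Back-of-the-envelope check on the reported CoWoS capacity figures.
# Assumption: "an increase of over 150%" means end-2024 monthly capacity
# is roughly 2.5x the 2023 level; all derived numbers are illustrative.

end_2024_monthly = 40_000                      # wafers/month, per the report
implied_2023_monthly = end_2024_monthly / 2.5  # inferred ~16,000 wafers/month
doubled_2025_monthly = end_2024_monthly * 2    # the "nearly double" scenario

print(f"Implied 2023 monthly capacity: ~{implied_2023_monthly:,.0f} wafers")
print(f"Possible 2025 monthly capacity: ~{doubled_2025_monthly:,.0f} wafers")
```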
However, with the release of the B100 and B200, the interposer area used by a single chip will be larger than before, so fewer interposers can be obtained from each 12-inch wafer, leaving CoWoS production capacity unable to meet GPU demand.
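To see why larger interposers cut so sharply into wafer output, the sketch below applies the common die-per-wafer approximation; the interposer areas used are illustrative assumptions, not figures from the report:

```python
import math

def interposers_per_wafer(wafer_diameter_mm: float, area_mm2: float) -> int:
    """Classic die-per-wafer approximation: gross wafer area divided by die
    area, minus an edge-loss term for partial dies at the wafer rim."""
    r = wafer_diameter_mm / 2
    gross = math.pi * r**2 / area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * area_mm2)
    return int(gross - edge_loss)

# Illustrative interposer areas (assumptions, not figures from the article):
# ~2,500 mm^2 for a previous-generation package vs. ~3,300 mm^2 for a
# larger B-series package.
for label, area in [("previous gen", 2500), ("B-series (assumed)", 3300)]:
    count = interposers_per_wafer(300, area)
    print(f"{label}: ~{count} interposers per 12-inch wafer")
```

Under these assumed areas, yield per wafer drops by roughly a third, which is why capacity expansion alone struggles to keep pace.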
Meanwhile, the number of HBM units installed per package is also multiplying. In a CoWoS package, multiple HBM stacks are placed around the GPU, and HBM is considered another bottleneck. Industry sources indicate that HBM poses a significant challenge, as the number of EUV (Extreme Ultraviolet Lithography) layers it requires is gradually increasing. For example, SK Hynix, which holds the leading market share in HBM, applied a single EUV layer during its 1α production phase. Starting this year, the company is transitioning to 1β, potentially increasing its application of EUV three to four times.
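If EUV scanner time scales roughly linearly with layer count (an illustrative assumption, not a claim from the report), the capacity implication can be sketched as follows:

```python
# Relative EUV exposure demand per wafer as DRAM nodes add EUV layers.
# Layer counts follow the article: one layer at 1-alpha, roughly 3 to 4
# at 1-beta. Linear scaling of scanner time is an assumption.
euv_layers = {"1-alpha": 1, "1-beta (low end)": 3, "1-beta (high end)": 4}

baseline = euv_layers["1-alpha"]
for node, layers in euv_layers.items():
    # More layers means proportionally more scanner passes per wafer,
    # tightening effective DRAM output for a fixed EUV tool fleet.
    print(f"{node}: {layers} EUV layer(s), {layers / baseline:.0f}x exposure demand")
```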
In addition to the increased technical difficulty, the number of DRAM dies stacked within each HBM unit has also grown with each generation. HBM2 stacks 4 to 8 DRAM dies, HBM3/3e raises this to 8 to 12, and HBM4 will push the stack to 16.
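The sketch below tallies the DRAM dies a single GPU package would consume under these stack heights; the stacks-per-package counts are illustrative assumptions for a high-end accelerator, not figures from the article:

```python
# Rough tally of DRAM dies consumed per GPU package across HBM generations.
# Dies-per-stack ranges come from the article; stacks-per-package counts
# are illustrative assumptions for a high-end accelerator.
hbm_generations = {
    "HBM2":    {"dies_per_stack": (4, 8),   "stacks_per_package": 4},
    "HBM3/3e": {"dies_per_stack": (8, 12),  "stacks_per_package": 8},
    "HBM4":    {"dies_per_stack": (16, 16), "stacks_per_package": 8},
}

for gen, cfg in hbm_generations.items():
    lo, hi = cfg["dies_per_stack"]
    n = cfg["stacks_per_package"]
    dies = f"{lo}" if lo == hi else f"{lo}-{hi}"
    total = f"{n * lo}" if lo == hi else f"{n * lo}-{n * hi}"
    print(f"{gen}: {n} stacks x {dies} dies/stack = {total} DRAM dies per package")
```

Even holding the stack count fixed, each generation multiplies the DRAM dies, and therefore the EUV-patterned wafers, behind every GPU shipped.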
Given these dual bottlenecks, the challenges will be difficult to overcome in the short term. Competitors are also proposing solutions; Intel, for instance, is using rectangular glass substrates to replace 12-inch wafer interposers. However, this approach requires significant preparation, time, and R&D investment, and breakthroughs from industry players are still awaited.
(Photo credit: NVIDIA)