News
Last year’s AI boom propelled NVIDIA into the spotlight, yet the company finds itself at a challenging crossroads.
According to a report from TechNews, NVIDIA dominates high-performance computing and artificial intelligence, continuously extending its lead with its latest GPU products. At the same time, global supply chain instability, the rapid emergence of competitors, and uncertainties in technological innovation are exerting unprecedented pressure on the company.
NVIDIA’s stock price surged by 246% last year, driving its market value past USD 1 trillion and making it the first chip company to achieve this milestone. According to the Bloomberg Billionaires Index, NVIDIA CEO Jensen Huang’s personal wealth has soared to USD 55.7 billion.
However, despite NVIDIA’s seemingly radiant outlook, the TechNews report notes that it still faces uncontrollable internal and external challenges.
The most apparent issue lies in capacity constraints.
Currently, NVIDIA’s A100 and H100 GPUs are manufactured using TSMC’s CoWoS packaging technology. However, with the surge in demand for generative AI, TSMC’s CoWoS capacity is severely strained. Consequently, NVIDIA has certified other CoWoS packaging suppliers such as UMC, ASE, and American OSAT manufacturer Amkor as backup options.
Meanwhile, TSMC has relocated its InFO production capacity from Longtan to the Southern Taiwan Science Park. The vacated Longtan fab is being repurposed to expand CoWoS capacity, while the Zhunan and Taichung fabs are also contributing to the expansion of CoWoS production to alleviate the constraints.
However, during the earnings call, TSMC also stated that despite a doubling of capacity in 2024, it still may not be sufficient to meet all customer demands.
In addition to TSMC’s CoWoS capacity, industry rumors suggest that NVIDIA has made significant upfront payments to Micron and SK Hynix to secure a stable supply of HBM3 memory. However, the entire HBM capacity of Samsung, SK Hynix, and Micron for this year has already been allocated, so whether supply can meet market demand will be a significant challenge for NVIDIA.
While cloud service providers (CSPs) fiercely compete for GPUs, major players like Amazon, Microsoft, Google, and Meta are actively investing in in-house AI chips.
Amazon and Google have respectively introduced Trainium and TPU chips, Microsoft announced its first in-house AI chip Maia 100 along with in-house cloud computing CPU Cobalt 100, while Meta plans to unveil its first-generation in-house AI chip MTIA by 2025.
Although these hyperscale customers still rely on NVIDIA’s chips, their in-house efforts may erode NVIDIA’s market share in the long run, inadvertently turning key customers into competitors and pressuring profits. Consequently, NVIDIA can hardly afford to depend solely on these hyperscale customers.
Due to escalating tensions between the US and China, the US issued new regulations prohibiting NVIDIA from exporting advanced AI chips to China. Consequently, NVIDIA introduced specially tailored versions such as A800 and H800 for the Chinese market.
However, these were ultimately blocked as well, with products including the A100, A800, H100, H800, and L40S added to the export control list. Subsequently, NVIDIA decided to introduce new AI GPUs, namely the HGX H20, L20 PCIe, and L2 PCIe, in compliance with the export policies.
However, these chips deliver only 20% of the H100’s computing power and are planned for mass production in the second quarter. Due to the reduced performance, major Chinese companies like Alibaba, Tencent, and Baidu have reportedly declined to purchase them, explicitly stating significant order cuts for the year. Consequently, NVIDIA’s revenue prospects in China appear grim, with some orders even being snatched by Huawei.
Currently, NVIDIA’s sales revenue from Singapore and China accounts for 15% of its total revenue. Moreover, the company holds over 90% market share in the AI chip market in China. Therefore, the cost of abandoning the Chinese market would be substantial. NVIDIA is adamant about not easily giving up on China; however, the challenge lies in how to comply with US government policies and pressures while meeting the demands of Chinese customers.
During its last earnings call, NVIDIA CEO Jensen Huang acknowledged that US export control measures would have an impact: China and other affected regions contributed 20–25% of data center revenue in the last quarter, with a significant decline anticipated this quarter.
He also expressed concerns that besides losing the Chinese market, the situation would accelerate China’s efforts to manufacture its own chips and introduce proprietary GPU products, providing Chinese companies with opportunities to rise.
In the race to capture the AI market opportunity, arch-rivals Intel and AMD are close on NVIDIA’s heels. After NVIDIA pioneered the adoption of TSMC’s 4-nanometer process with the H100, AMD quickly followed suit by launching the first batch of “Instinct MI300X” chips for AI and HPC applications last year.
Shipments of the MI300X commenced this year, with Microsoft’s data center division emerging as the largest buyer. Meta has also procured a substantial amount of Instinct MI300 series products, while LaminiAI stands as the first publicly known company to utilize the MI300X.
According to official performance tests by AMD, the MI300X outperforms the existing NVIDIA H100 80GB available on the market, posing a potential threat to the upcoming H200 141GB.
Additionally, compared to the H100 chip, the MI300X offers a more competitive price for products of the same level. If NVIDIA’s production capacity continues to be restricted, some customers may switch to AMD.
Meanwhile, Intel unveiled the “Gaudi 3” chip for generative AI software last year. Although information is limited, it is rumored that the memory capacity may increase by 50% over Gaudi 2’s 96GB, possibly upgrading to HBM3e memory. CEO Pat Gelsinger directly stated that “Gaudi 3 performance will surpass that of the H100.”
Several global chip design companies have recently announced the formation of the “AI Platform Alliance,” aiming to promote an open AI ecosystem. The founding members of the AI Platform Alliance include Ampere, Cerebras Systems, Furiosa, Graphcore, Kalray, Kinara, Luminous, Neuchips, Rebellions, and Sapeon, among others.
Notably absent is industry giant NVIDIA, leading to speculation that startups aspire to unite and challenge NVIDIA’s dominance.
However, with NVIDIA holding a 75-90% market share in AI, it remains in a dominant position. Whether the AI Platform Alliance can disrupt NVIDIA’s leading position is still subject to observation.
(Photo credit: NVIDIA)
Insights
With the flourishing development of technologies such as AI, cloud computing, big data analytics, and mobile computing, modern society has an increasingly high demand for computing power.
Moreover, with the advancement beyond 3 nanometers, wafer sizes have encountered scaling limitations and manufacturing costs have increased. Therefore, besides continuing to develop advanced processes, the semiconductor industry is also exploring other ways to maintain chip size while ensuring high efficiency.
The concept of “heterogeneous integration” has become a contemporary focus, leading to the transition of chips from single-layer to advanced packaging with multiple layers stacked together.
The term “CoWoS” can be broken down as follows: “CoW” stands for “Chip-on-Wafer,” referring to the stacking of chips, while “WoS” stands for “Wafer-on-Substrate,” which involves stacking chips on a substrate.
Therefore, “CoWoS” collectively refers to stacking chips and packaging them onto a substrate. This approach reduces the space required for chips and offers benefits in reducing power consumption and costs.
This stacking can be further divided into 2.5D horizontal configurations (most famously exemplified by TSMC’s CoWoS) and 3D vertical configurations, in which various processor and memory modules are stacked layer by layer as chiplets. Because its primary application lies in advanced process nodes, it is also referred to as advanced packaging.
TrendForce’s data illustrates the heat of the AI chip market: in 2023, shipments of AI servers (including those equipped with GPUs, FPGAs, ASICs, etc.) reached nearly 1.2 million units, a 38.4% increase from 2022, accounting for nearly 9% of overall server shipments.
Looking ahead to 2026, the proportion is expected to reach 15%, with a compound annual growth rate (CAGR) of AI server shipments from 2022 to 2026 reaching 22%.
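As a quick sanity check, the shipment trajectory implied by these figures can be reproduced with a few lines of arithmetic. Note that the 2022 baseline below is back-calculated from the reported 38.4% growth rate, so it is an estimate rather than a published TrendForce figure:

```python
# Back-calculate 2022 AI server shipments from the reported 38.4% YoY growth.
units_2023 = 1_200_000          # ~1.2 million units shipped in 2023
yoy_growth_2023 = 0.384
units_2022 = units_2023 / (1 + yoy_growth_2023)

# Project 2026 shipments using the reported 22% CAGR over 2022-2026.
cagr = 0.22
units_2026 = units_2022 * (1 + cagr) ** 4

print(f"Implied 2022 shipments:   {units_2022:,.0f}")
print(f"Projected 2026 shipments: {units_2026:,.0f}")
```

The projection works out to roughly 1.9 million AI servers in 2026, consistent with the share of overall server shipments rising from about 9% to 15%.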
Due to the advanced packaging requirements of AI chips, TSMC’s 2.5D advanced packaging CoWoS technology is currently the primary technology used for AI chips.
GPUs, in particular, utilize higher specifications of HBM, which require the integration of core dies using 2.5D advanced packaging technology. The initial stage of chip stacking in CoWoS packaging, known as Chip on Wafer (CoW), primarily undergoes manufacturing at the fab using a 65-nanometer process. Following this, through-silicon via (TSV) is carried out, and the finalized products are stacked and packaged onto the substrate, known as Wafer on Substrate (WoS).
As a result, the production capacity of CoWoS packaging technology has become a significant bottleneck in AI chip output over the past year, and it remains a key factor in whether AI chip demand can be met in 2024. Foreign analysts have previously pointed out that NVIDIA is currently the largest customer of TSMC’s 2.5D advanced packaging CoWoS technology.
This includes NVIDIA’s H100 GPU, which utilizes TSMC’s 4-nanometer advanced process, as well as the A100 GPU, which uses TSMC’s 7-nanometer process, both of which are packaged using CoWoS technology. As a result, NVIDIA’s chips account for 40% to 50% of TSMC’s CoWoS packaging capacity. This is also why the high demand for NVIDIA chips has led to tight capacity for TSMC’s CoWoS packaging.
TSMC’s Expansion Plans Expected to Ease Tight Supply Situation in 2024
During the earnings call held in July 2023, TSMC announced its plans to double the CoWoS capacity, indicating that the supply-demand imbalance in the market could be alleviated by the end of 2024.
Subsequently, in late July 2023, TSMC announced an investment of nearly NTD 90 billion (roughly USD 2.87 billion) to establish an advanced packaging fab in the Tongluo Science Park, with the construction expected to be completed by the end of 2026 and mass production scheduled for the second or third quarter of 2027.
In addition, during the earnings call on January 18, 2024, TSMC’s CFO, Wendell Huang, emphasized that TSMC would continue its expansion of advanced processes in 2024. Therefore, it is estimated that 10% of the total capital expenditure for the year will be allocated towards expanding capacity in advanced packaging, testing, photomasks, and other areas.
In fact, NVIDIA’s CFO, Colette Kress, stated during an investor conference that the key process of CoWoS advanced packaging has been developed and certified with other suppliers. Kress further anticipated that supply would gradually increase over the coming quarters.
Regarding this, J.P. Morgan, an investment firm, pointed out that the bottleneck in CoWoS capacity is primarily due to the supply-demand gap in the interposer. This is because the TSV process is complex, and expanding capacity requires more high-precision equipment. However, the long lead time for high-precision equipment, coupled with the need for regular cleaning and inspection of existing equipment, has resulted in supply shortages.
Apart from TSMC’s dominance in the CoWoS advanced packaging market, other Taiwanese companies such as UMC, ASE Technology Holding, and Powertech Technology are also gradually entering the market.
Among them, UMC expressed during an investor conference in late July 2023 that it is accelerating the deployment of silicon interposer technology and capacity to meet customer needs in the 2.5D advanced packaging sector.
UMC Expands Interposer Capacity; ASE Pushes Forward with VIPack Advanced Packaging Platform
UMC emphasizes that it is the world’s first foundry to offer an open system solution for silicon interposer manufacturing. Through this open system collaboration (UMC+OSAT), UMC can provide a fully validated supply chain for rapid mass production implementation.
On the other hand, in terms of shipment volume, ASE Group currently holds approximately a 32% market share in the global Outsourced Semiconductor Assembly and Test (OSAT) industry and accounts for over 50% of the OSAT shipment volume in Taiwan. Its subsidiary, ASE Semiconductor, also notes the recent focus on CoWoS packaging technology. ASE Group has been strategically positioning itself in advanced packaging, working closely with TSMC as a key partner.
ASE underscores the significance of its VIPack advanced packaging platform, designed to provide vertical interconnect integration solutions. VIPack represents the next generation of 3D heterogeneous integration architecture.
Leveraging advanced redistribution layer (RDL) processes, embedded integration, and 2.5D/3D packaging technologies, VIPack enables customers to integrate multiple chips into a single package, unlocking unprecedented innovation in various applications.
Powertech Technology Seeks Collaboration with Foundries; Winbond Electronics Offers Heterogeneous Integration Packaging Technology
In addition, the OSAT player Powertech Technology is actively expanding its presence in advanced packaging for logic chips and AI applications.
The collaboration between Powertech and Winbond is expected to offer customers various options for CoWoS advanced packaging, indicating that CoWoS-related advanced packaging products could be available as early as the second half of 2024.
Winbond Electronics emphasizes that the collaboration project will involve Winbond Electronics providing CUBE (Customized Ultra-High Bandwidth Element) DRAM, as well as customized silicon interposers and integrated decoupling capacitors, among other advanced technologies. These will be complemented by Powertech Technology’s 2.5D and 3D packaging services.
(Photo credit: TSMC)
News
In October of 2023, the U.S. government expanded its restrictions on chip exports, limiting NVIDIA from exporting certain chips to China without prior permission. Despite this, NVIDIA is not expected to relinquish the Chinese market and may commence production of the AI chip “H20,” specifically designed for China, in the second quarter of this year.
According to a report from Wccftech, there is keen interest in NVIDIA’s potential exclusive chips for China, including H20, L20, and L2, intended to replace H100, L40, and L4, catering to the AI training needs of Chinese customers.
NVIDIA is reportedly trying to accelerate its return to the Chinese AI chip market, expecting to quickly regain its advantage and market share. It is understood that the main baseboard supplier for the new product remains Wistron.
Orders placed through the relevant supply chain manufacturers have been deferred, with substantial shipments expected to begin in the second quarter.
The report indicates that progress on these chip projects is steady, and the products fully comply with U.S. export restrictions. Production of the H20 is expected to commence in the second quarter.
Furthermore, it is reported that these GPUs were originally scheduled for release at the end of 2023 but faced delays due to the ongoing tensions between China and the US.
NVIDIA emphasized that the AI chip designed specifically for the Chinese market will fully comply with the requirements and guidelines of the U.S. Department of Commerce, subsequently enabling the launch of the GeForce RTX 4090D in China.
Industry sources estimate that NVIDIA is actively seeking to comply with US government computing power regulations by further reducing the customized chip’s performance. However, having missed the sales window, NVIDIA has seen many Chinese customers begin exploring local AI chips as an alternative to its products.
This is primarily driven by the availability and competitive cost-effectiveness of Chinese chips, with several Chinese companies switching to Huawei products for AI training.
While NVIDIA has significantly pared down the H20 to comply with restrictions on the Chinese market, reducing its computing power to only 15% of the H100’s, the H20 still aims to stay competitive on other specifications.
According to leaked specifications circulating online at the end of 2023, the H20 boasts a FP8 computing power of 296 TFLOPs and FP16 computing power of 148 TFLOPs, with an increased memory capacity of 96GB compared to the H100’s 80GB.
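These leaked numbers line up with the roughly 15%-of-H100 figure cited above, if compared against NVIDIA’s published H100 SXM dense tensor throughput (about 1,979 TFLOPS FP8 and 990 TFLOPS FP16; these H100 figures are an assumption here, not stated in the source):

```python
# Leaked H20 throughput (from the figures above) vs. NVIDIA's published
# H100 SXM dense tensor throughput (assumed; not stated in the source text).
h20_fp8, h20_fp16 = 296.0, 148.0        # TFLOPS
h100_fp8, h100_fp16 = 1979.0, 989.5     # TFLOPS, dense tensor specs

ratio_fp8 = h20_fp8 / h100_fp8
ratio_fp16 = h20_fp16 / h100_fp16
print(f"H20/H100 FP8 ratio:  {ratio_fp8:.0%}")
print(f"H20/H100 FP16 ratio: {ratio_fp16:.0%}")
```

Both ratios come out to roughly 15%, matching the reported performance cut.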
However, domestically produced chips in China are also formidable. It is claimed that the H20’s performance is only one-fourth that of Huawei’s HiSilicon Ascend 910B, yet its price is exceptionally high. For some Chinese enterprises, there is therefore still an incentive to adopt self-developed AI chips. Whether China’s domestically produced AI chips can eventually disrupt NVIDIA’s monopoly remains to be seen.
(Photo credit: NVIDIA)
News
AMD has long aspired to gain more favor for its AI chips, aiming to break into Nvidia’s stronghold in the AI chip market. Key players like Meta, OpenAI, and Microsoft, who are major buyers of AI chips, also desire a diversified market with multiple AI chip suppliers to avoid vendor lock-in issues and reduce costs.
With AMD’s latest AI chip, the Instinct MI300X, slated for significant shipments in early 2024, these three major AI chip buyers have publicly announced plans to place orders, as they consider AMD’s solution a more cost-effective alternative.
At the AMD “Advancing AI” event on December 6th, Meta, OpenAI, Microsoft, and Oracle declared their preference for AMD’s latest AI chip, Instinct MI300X. This marks a groundbreaking move by AI tech giants actively seeking alternatives to Nvidia’s expensive GPUs.
For applications like OpenAI’s ChatGPT, Nvidia GPUs have played a crucial role. However, if the AMD MI300X can provide a significant cost advantage, it has the potential to impact Nvidia’s sales performance and challenge its market dominance in AI chips.
AMD’s Three Major Challenges
AMD grapples with three major challenges: convincing enterprises to consider substitutions, addressing industry standards compared to Nvidia’s CUDA software, and determining competitive GPU pricing. Lisa Su, AMD’s CEO, highlighted at the event that the new MI300X architecture features 192GB of high-performance HBM3, delivering not only faster data transfer but also meeting the demands of larger AI models. Su emphasized that such a notable performance boost translates directly into an enhanced user experience, enabling quicker responses to complex user queries.
However, AMD is currently facing critical challenges. Companies that heavily rely on Nvidia may hesitate to invest time and resources in an alternative GPU supplier like AMD. Su believes there is still an opportunity to persuade these AI tech giants to adopt AMD GPUs.
Another pivotal concern is that Nvidia has established its CUDA software as the industry standard, resulting in a highly loyal customer base. In response, AMD has made improvements to its ROCm software suite to effectively compete in this space. Lastly, pricing is a crucial issue, as AMD did not disclose the price of the MI300X during the event. Convincing customers to choose AMD over Nvidia, whose chips are priced around USD 40,000 each, will require substantial cost advantages in both the purchase and operation of AMD’s offerings.
The Overall Size of the AI GPU Market is Expected to Reach USD 400 Billion by 2027
AMD has already secured agreements with companies eager for high-performance GPUs to use the MI300X. Meta plans to leverage MI300X GPUs for AI inference tasks like AI graphics, image editing, and AI assistants. On the other hand, Microsoft’s CTO, Kevin Scott, announced that the company will provide access to the MI300X through its Azure web service.
Additionally, OpenAI has decided to have Triton, its GPU programming language dedicated to machine learning algorithm development, support the AMD MI300X. Oracle Cloud Infrastructure (OCI) intends to introduce bare-metal instances based on AMD MI300X GPUs in its high-performance accelerated computing instances for AI.
AMD anticipates that the annual revenue from its GPUs for data centers will reach USD 2 billion by 2024. This projected figure is substantially lower than Nvidia’s most recent quarterly sales related to the data center business (i.e., over USD 14 billion, including sales unrelated to GPUs). AMD emphasizes that with the rising demand for high-end AI chips, the AI GPU market’s overall size is expected to reach USD 400 billion by 2027. This strategic focus on AI GPU products underscores AMD’s optimism about capturing a significant market share. Lisa Su is confident that AMD is poised for success in this endeavor.
(Image: AMD)
News
Intel has entrusted TSMC with producing the CPU for its upcoming Lunar Lake platform on the 3nm process. This marks TSMC’s debut as the exclusive producer of Intel’s mainstream laptop CPUs, adding to the previously negotiated Lunar Lake GPU and high-speed I/O (PCH) chip collaborations. The move positions TSMC to handle all major chip orders for Intel’s crucial platform next year, as reported by UDN News.
Regarding this news, TSMC refrained from commenting on single customer business or market speculations on November 21st. Intel has not issued any statements either.
Recent leaks of Lunar Lake platform internal design details from Intel have generated discussions on various foreign tech websites and among tech experts on X (formerly known as Twitter). According to the leaked information, TSMC will be responsible for producing three key chips for Intel’s Lunar Lake—CPU, GPU, and NPU—all manufactured using the 3nm process. Orders for high-speed I/O chips are expected to leverage TSMC’s 5nm production, with mass production set to kick off in the first half of next year, aligning with the anticipated resurgence of the PC market in the latter half of the year.
While TSMC previously manufactured CPUs for Intel’s Atom platform over a decade ago, it’s crucial to note that the Atom platform was categorized as a series of ultra-low-voltage processors, not Intel’s mainstream laptop platform. In recent years, Intel has gradually outsourced internal chips, beyond CPUs, for mainstream platforms to TSMC, including the GPU and high-speed I/O chips in the earlier Meteor Lake platform—all manufactured using TSMC’s 5nm node.
Breaking from its tradition of in-house production of mainstream platform CPUs, Intel’s decision to outsource to TSMC hints at potential future collaborations. This move opens doors to new opportunities for TSMC to handle the production of Intel’s mainstream laptop platforms.
It’s worth noting that the Intel Lunar Lake platform is scheduled for mass production at TSMC in the first half of next year, with a launch planned for the latter half of the year, targeting mainstream laptop platforms. Unlike the previous two generations of Intel laptop platforms, Lunar Lake integrates CPU, GPU, and NPU into a system-on-chip (SoC). This SoC is then combined with a high-speed I/O chip, utilizing Intel’s Foveros advanced packaging. Finally, the DRAM LPDDR5x is integrated with the two advanced packaged chips on the same IC substrate.
(Image: TSMC)