Insights
In late December 2023, reports surfaced indicating OpenAI CEO Sam Altman’s intention to raise funds to construct a semiconductor plant, ensuring a secure supply of AI chips.
According to a report from the Washington Post on January 24, 2024, Sam Altman has engaged with US congressional members to discuss the construction of the semiconductor plant, including considerations of timing and location, highlighting his increasingly fervent ambition to establish the facility.
TrendForce’s Insights:
The rapid emergence of AI-generated content (AIGC) was undoubtedly a highlight of 2023, and its quality and efficiency are closely tied to the underlying large language models (LLMs). Take OpenAI’s ChatGPT, for instance, which is built on the GPT-3.5 series derived from GPT-3, released in 2020. With 175 billion parameters, GPT-3 is more than 100 times larger than its predecessor, GPT-2, which was itself more than 10 times larger than the original GPT from 2018.
In pursuit of better content quality, more diverse outputs, and greater efficiency, continued expansion of model parameter counts is an inevitable trend. While lightweight language models are being developed for terminal devices, cloud-based AI computing is expected to see training parameter counts keep growing toward the “trillion” scale.
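For a rough sense of this scaling, the widely cited public parameter counts for the GPT series can be compared directly. The sketch below uses approximately 117 million for the original GPT, 1.5 billion for GPT-2, and 175 billion for GPT-3; these figures come from public reporting, not TrendForce data, and are used only for a back-of-envelope comparison.

```python
# Back-of-envelope comparison of publicly reported GPT parameter counts.
# Figures are widely cited approximations, not official TrendForce data.
param_counts = {
    "GPT-1 (2018)": 117e6,
    "GPT-2 (2019)": 1.5e9,
    "GPT-3 (2020)": 175e9,
}

models = list(param_counts.items())
for (prev_name, prev_params), (name, params) in zip(models, models[1:]):
    print(f"{name}: ~{params / prev_params:.0f}x the parameters of {prev_name}")

# A hypothetical trillion-parameter model would be another ~6x jump over GPT-3.
print(f"1 trillion vs. GPT-3: ~{1e12 / param_counts['GPT-3 (2020)']:.1f}x")
```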
Because AI chip performance improves at a limited rate, coping with rapidly growing model training parameters and the vast amount of data generated by flourishing cloud-based AIGC applications inevitably requires more AI chips, and this continues to put pressure on the chip supply chain.
Given that demand for AI computing is escalating faster than chip performance and capacity can grow, it is understandable that Sam Altman is concerned about chip supply.
The construction of advanced process fabs is immensely costly, with estimates suggesting that a single 3nm fab could cost billions of dollars to build. Even if Sam Altman manages to raise sufficient funds, he would still lack advanced semiconductor process and packaging technology, to say nothing of capacity, yield, and operational experience.
Therefore, it is anticipated that Sam Altman will continue to seek collaboration with foundries to achieve his factory construction goal.
Looking at foundries worldwide, TSMC is undoubtedly the preferred partner. After all, TSMC not only holds a leading position in advanced processes and packaging technologies but also boasts the most extensive experience in producing customized AI chips.
While Samsung and Intel are also suitable partners from a localization perspective, considering factors like production schedules and yield rates, choosing TSMC appears to be more cost-effective.
(Photo credit: OpenAI)
News
Sam Altman, the CEO of OpenAI, the developer of ChatGPT, is reportedly expected to visit Korea on January 26th. Altman may hold meetings with top executives from Samsung Electronics and SK Group to strengthen their collaboration on High-Bandwidth Memory (HBM).
According to sources cited by The Korea Times, the details of Sam Altman’s potential meetings with Samsung Electronics Chairman Lee Jae-yong and SK Group Chairman Chey Tae-won are still being adjusted.
OpenAI is set to engage in discussions with Samsung Electronics and SK Group to collaboratively develop artificial intelligence (AI) semiconductors, as part of OpenAI’s strategy to reduce heavy reliance on the AI chip leader NVIDIA.
Reportedly, Altman visited Korea in June of last year, and this upcoming visit is expected to last only about six hours. Most of the time is anticipated to be spent in closed-door meetings with leaders of Korean chip companies or other high-profile executives.
Altman is keen on strengthening relationships with Korean startups and chip industry players, as it contributes to OpenAI’s development of large-scale language models, powering ChatGPT. OpenAI unveiled its latest model, GPT-4 Turbo, at the end of last year and is currently proceeding with planned upgrades to related services.
Regarding this matter, The Korea Times also cited an SK Group spokesman, who did not confirm whether Chey and Altman will meet.
“Nothing specific has been confirmed over our top management’s schedule with Altman,” an official at SK Group said.
(Photo credit: OpenAI)
Insights
Microsoft announced its in-house AI chip, Azure Maia 100, at the Ignite developer conference in Seattle on November 15, 2023. The chip is designed to handle OpenAI models, Bing, GitHub Copilot, ChatGPT, and other AI services. Support for Copilot and the Azure OpenAI Service is expected to commence in early 2024.
TrendForce’s Insights:
Microsoft has not disclosed detailed specifications for Azure Maia 100. Currently, it is known that the chip will be manufactured using TSMC’s 5nm process, featuring 105 billion transistors and supporting at least INT8 and INT4 precision formats. While Microsoft has indicated that the chip will be used for both training and inference, the computational formats it supports suggest a focus on inference applications.
This inference focus is suggested by Maia 100’s support for INT4, a low-precision format that is less common in other CSPs’ AI ASICs. Lower precision reduces power consumption and shortens inference times, improving efficiency, but the drawback is a sacrifice in accuracy.
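To make this precision trade-off concrete, the sketch below applies simple symmetric per-tensor quantization to a random weight tensor at INT8 and INT4 and measures the reconstruction error. It is purely illustrative and assumes nothing about Maia 100’s actual, undisclosed quantization scheme.

```python
import numpy as np

def symmetric_quantize(weights: np.ndarray, bits: int) -> np.ndarray:
    """Quantize to signed integers of the given width, then dequantize back."""
    qmax = 2 ** (bits - 1) - 1            # 127 for INT8, 7 for INT4
    scale = np.abs(weights).max() / qmax  # one scale for the whole tensor
    q = np.clip(np.round(weights / scale), -qmax, qmax)
    return q * scale                      # dequantized approximation

rng = np.random.default_rng(0)
w = rng.normal(size=10_000).astype(np.float32)

for bits in (8, 4):
    w_hat = symmetric_quantize(w, bits)
    print(f"INT{bits}: mean squared error = {np.mean((w - w_hat) ** 2):.2e}")
```

The INT4 reconstruction error comes out far larger than the INT8 error, which is the accuracy sacrifice noted above, while each INT4 weight needs only half the storage and memory bandwidth of an INT8 weight, which is where the power and latency gains come from.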
Microsoft initiated its in-house AI chip project, “Athena,” in 2019, developing it in collaboration with OpenAI. Like other CSPs’ in-house chips, Azure Maia 100 aims to reduce costs and decrease dependence on NVIDIA. Despite entering the field of proprietary AI chips later than its primary competitors, Microsoft is expected to gradually turn its formidable ecosystem into a competitive advantage.
Google led the way with its first in-house AI chip, TPU v1, introduced as early as 2016, and has since iterated to the fifth generation with TPU v5e. Amazon followed suit in 2018 with Inferentia for inference, introduced Trainium for training in 2020, and launched the second generation, Inferentia2, in 2023, with Trainium2 expected in 2024.
Meta plans to debut its inaugural in-house AI chip, MTIA v1, in 2025. Given the releases from major competitors, Meta has expedited its timeline and is set to unveil the second-generation in-house AI chip, MTIA v2, in 2026.
Unlike other CSPs, which opt for the ARM architecture, Meta’s MTIA v1 and MTIA v2 both adopt the RISC-V architecture. RISC-V is a fully open-source architecture that requires no instruction-set licensing fees, and its instruction count (approximately 200) is far lower than ARM’s (approximately 1,000).
This choice allows chips utilizing the RISC-V architecture to achieve lower power consumption. However, the RISC-V ecosystem is currently less mature, resulting in fewer manufacturers adopting it. Nevertheless, with the growing trend in data centers towards energy efficiency, it is anticipated that more companies will start incorporating RISC-V architecture into their in-house AI chips in the future.
The competition among AI chips will ultimately hinge on the competition of ecosystems. NVIDIA introduced the CUDA architecture in 2006, and it has since become nearly ubiquitous in educational institutions; as a result, almost all AI engineers encounter CUDA during their academic training.
In 2017, NVIDIA further solidified its ecosystem by launching the RAPIDS AI acceleration integration solution and the GPU Cloud service platform. Notably, over 70% of NVIDIA’s workforce comprises software engineers, emphasizing its status as a software company. The performance of NVIDIA’s AI chips can be further enhanced through software innovations.
Microsoft, for its part, possesses a robust ecosystem of its own, anchored by Windows. The recent Intel Arc A770 GPU showcased a 1.7x performance improvement in AI-driven Stable Diffusion when optimized through Microsoft Olive, demonstrating that, like NVIDIA, Microsoft has the capability to enhance GPU performance through software.
Consequently, Microsoft’s in-house AI chips are poised to achieve superior performance in software collaboration compared to other CSP manufacturers, providing Microsoft with a competitive advantage in the AI competition.
Insights
According to Bloomberg, Apple is quietly catching up with its competitors in the AI field. Looking at Apple’s moves in AI, in addition to acquiring AI-related companies to quickly obtain relevant technology, Apple is now developing its own large language model (LLM).
TrendForce’s Insights:
As the smartphone market matures, brands are not only pushing hardware upgrades, particularly in camera modules, to stimulate device replacement; many are also keen to introduce new AI functionalities into smartphones in order to reignite the market’s growth potential. Some Chinese brands have already made notable progress in the AI field, especially in large language models.
For instance, Xiaomi introduced its large language model MiLM-6B, ranking tenth on the C-Eval list (a comprehensive evaluation benchmark for Chinese language models developed in collaboration with Tsinghua University, Shanghai Jiao Tong University, and the University of Edinburgh) and first among models of its parameter class. Meanwhile, Vivo has launched the large model VivoLM, with its VivoLM-7B model securing the second position on the C-Eval ranking.
As for Apple, while it may appear to have played a largely observational role as other Silicon Valley companies like OpenAI released ChatGPT and Google and Microsoft introduced AI-powered versions of their search engines, the reality is that since 2018 Apple has quietly acquired more than 20 AI-related companies. Apple’s approach is characterized by extreme discretion, with only a few of these transactions publicly disclosing their final acquisition prices.
On another front, Apple has been discreetly developing its own large language model, called Ajax, spending millions of dollars a day on training with the aim of making its performance more robust than OpenAI’s GPT-3.5 and Meta’s LLaMA.
Analyzing the current most common usage scenarios for smartphones among general consumers, these typically revolve around activities like taking photos, communication, and information retrieval. While there is potential to enhance user experiences with AI in some functionalities, these usage scenarios currently do not fall under the category of “essential AI features.”
However, if a killer application involving large language models were to emerge on smartphones in the future, Apple is poised to have an exclusive advantage in establishing such a service as a subscription-based model. This advantage is due to recent shifts in Apple’s revenue composition, notably the increasing contribution of “Service” revenue.
In August 2023, Apple CEO Tim Cook highlighted in Apple’s third-quarter financial report that Apple’s subscription services, which include Apple Arcade, Apple Music, iCloud, AppleCare, and others, had achieved record-breaking revenue and amassed over 1 billion paying subscribers.
In other words, compared to other smartphone brands, Apple is better positioned to monetize a large language model service through subscription due to its already substantial base of paying subscription users. Other smartphone brands may find it challenging to gain consumer favor for a paid subscription service involving large language models, as they lack a similarly extensive base of subscription users.
In-Depth Analyses
OpenAI’s ChatGPT, Microsoft’s Copilot, Google’s Bard, and Elon Musk’s latest TruthGPT – what will be the next buzzword in AI? In just under six months, the AI competition has heated up, stirring up ripples in the once-calm AI server market as AI-generated content (AIGC) models take center stage.
The unprecedented convenience brought by AIGC has attracted a massive number of users, with OpenAI’s mainstream model, GPT-3, receiving up to 25 million daily visits, often resulting in server overload and disconnection issues.
Because the evolution of these models has increased training parameters and data volume, making computational power even scarcer, OpenAI has reluctantly adopted measures such as paid access and traffic restrictions to stabilize server load.
High-end Cloud Computing is gaining momentum
According to TrendForce, AI servers currently have a mere 1% penetration rate in global data centers, which is far from sufficient to cope with the surge in data demand from the usage side. Therefore, besides optimizing software to reduce computational load, increasing the number of high-end AI servers will be another crucial solution.
Take GPT-3, for instance: the model requires at least 4,750 AI servers with 8 GPUs each, and every similarly large language model like ChatGPT will need 3,125 to 5,000 units. Considering ChatGPT and Microsoft’s other applications as a whole, the need for AI servers is estimated to reach some 25,000 units in order to meet the basic computing power.
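The arithmetic behind these unit counts is simple to reproduce; the sketch below converts the server estimates quoted above into GPU counts under the same 8-GPU-per-server assumption, with all input figures taken directly from this paragraph.

```python
# Back-of-envelope AI server math using the estimates quoted above.
GPUS_PER_SERVER = 8

# GPT-3 baseline from the text: at least 4,750 servers of 8 GPUs each.
gpt3_servers = 4_750
print(f"GPT-3 baseline: {gpt3_servers * GPUS_PER_SERVER:,} GPUs")

# Each comparable large language model: an estimated 3,125 to 5,000 servers.
for servers in (3_125, 5_000):
    print(f"{servers:,} servers -> {servers * GPUS_PER_SERVER:,} GPUs")

# ChatGPT plus Microsoft's other applications: roughly 25,000 servers in total.
print(f"25,000 servers -> {25_000 * GPUS_PER_SERVER:,} GPUs")
```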
While the emerging applications of AIGC and their vast commercial potential have revealed the technical roadmap ahead, they have also shed light on bottlenecks in the supply chain.
The down-to-earth problem: cost
Compared to general-purpose servers that use CPUs as their main computational power, AI servers rely heavily on GPUs, with NVIDIA’s DGX A100 and DGX H100 systems, delivering computational performance of up to 5 PetaFLOPS and beyond, serving as the primary AI server computing platforms. Given that GPUs account for over 70% of server costs, the growing adoption of high-end GPUs has made the architecture more expensive.
Moreover, a significant amount of data transmission occurs during the operation, which drives up the demand for DDR5 and High Bandwidth Memory (HBM). The high power consumption generated during operation also promotes the upgrade of components such as PCBs and cooling systems, which further raises the overall cost.
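As a rough illustration of why GPUs dominate the bill of materials, the sketch below assumes entirely hypothetical component prices for an 8-GPU AI server (none of these dollar figures come from TrendForce or this article) and computes each line item’s share of the total; under any plausible high-end GPU pricing, the GPU share lands above 70%.

```python
# Hypothetical cost breakdown for an 8-GPU AI server.
# All dollar figures are illustrative placeholders, not actual market prices.
bom = {
    "8x high-end GPUs": 8 * 25_000,
    "CPUs": 15_000,
    "DDR5 and HBM-related memory": 20_000,
    "PCB, power delivery, cooling": 12_000,
    "Chassis, NICs, storage, other": 10_000,
}

total = sum(bom.values())
for part, cost in bom.items():
    print(f"{part:30s} ${cost:>9,}  ({cost / total:5.1%})")
print(f"{'Total':30s} ${total:>9,}")
```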
Not to mention the technical hurdles posed by the complex design architecture – for example, a new approach for heterogeneous computing architecture is urgently required to enhance the overall computing efficiency.
The high cost and complexity of AI servers have inevitably limited their development to large manufacturers. Two leading companies, HPE and Dell, have taken different strategies to enter the market.
With the booming market for AIGC applications, we seem to be one step closer to a future metaverse centered around fully virtualized content. However, it remains unclear whether the hardware infrastructure can keep up with the surge in demand. This persistent challenge will continue to test the capabilities of cloud server manufacturers to balance cost and performance.
(Photo credit: Google)