News
According to a report from Chinese media outlet MyDrivers, NVIDIA has updated the EULA terms in the CUDA 11.6 installer, explicitly prohibiting third-party GPU companies from running CUDA software through translation layers. The move has reportedly stirred up discussion in the market.
NVIDIA updated the EULA agreement for CUDA 11.6, explicitly stating, “You may not reverse engineer, decompile or disassemble any portion of the output generated using SDK elements for the purpose of translating such output artifacts to target a non-NVIDIA platform.”
This move is reportedly speculated to target third-party projects like ZLUDA, involving Intel and AMD, as well as compatibility solutions from Chinese firms like Denglin Technology and MetaX Technology.
In response, Moore Threads issued an official statement confirming that its MUSA and MUSIFY technologies remain unaffected, as they are not subject to NVIDIA's EULA terms, so developers can continue using them with confidence.
In fact, NVIDIA has prohibited other hardware platforms from running CUDA software through compatibility layers since 2021, though the restriction previously appeared only in the online EULA. So far, NVIDIA has not explicitly named any parties, limiting itself to warnings in the agreement without taking further action, although further measures cannot be ruled out in the future.
In the past, CUDA development and its ecosystem were widely regarded as NVIDIA’s moat. However, processor architect Jim Keller previously criticized CUDA on the X platform, stating, “CUDA is a swamp, not a moat,” adding, “CUDA is not beautiful. It was built by piling on one thing at a time.”
Meanwhile, according to the account rgznai100 posting on Chinese blog CSDN, NVIDIA’s actions will have a significant impact on AI chip/GPGPU companies that previously adopted CUDA-compatible solutions. NVIDIA may initially resort to legal measures, such as lawsuits, against GPGPU companies following similar paths.
Therefore, Chinese enterprises should endeavor to enhance collaboration with the open-source community to establish an open AI compilation ecosystem, reducing the risks posed by NVIDIA’s market dominance.
(Photo credit: NVIDIA)
According to the Economic Daily News, the AI wave is ushering in a demand for updated specifications in CMOS Image Sensors (CIS), with the global CIS leader, Sony Corporation, aggressively positioning itself to take advantage of this trend. As part of the semiconductor industry’s move towards localized production, Sony has placed significant orders with TSMC’s new Kumamoto plant in Japan, boosting the production volume for the fourth quarter and rapidly increasing the new plant’s capacity utilization.
TSMC does not comment on individual customers or orders. Industry sources point out that the CIS component market previously faced an inventory adjustment issue for over a year. Recently, with clients restarting stock replenishment in anticipation of a recovery, coupled with the AI effect, various end-use applications are adopting lenses developed specifically for AI applications. This shift is expected to drive a new wave of demand for replacing old lenses with new ones to capitalize on AI lens opportunities.
Sony is optimistic about future opportunities in automotive and consumer sectors, and intends to extensively utilize TSMC’s 22nm process for producing CIS components and Image Signal Processors.
Furthermore, to seize AI business opportunities, Sony has launched Digital Signal Processors (DSPs) equipped with AI algorithms, which are expected to enhance applications such as human motion analysis, image processing, and human tracking. With Sony securing large orders from clients, the DSP line is poised to become a major product line in the AI era.
TSMC’s new Kumamoto plant in Japan recently opened and is in the equipment installation phase, with production expected to start as early as the fourth quarter, focusing on 40, 28/22 nm processes for automotive and industrial clients.
JASM, the joint venture operating TSMC's Japan plant, counts Sony as the largest shareholder after TSMC, and Sony has been a major wafer-foundry client of TSMC for many years. With the Kumamoto plant set to start production by the end of the year, Sony is almost certain to secure a significant share of wafer capacity, becoming a major customer filling the plant's capacity.
(Image: TSMC)
The previously elusive NVIDIA data center GPU, H100, has seen a noticeable reduction in delivery lead times amid improved market supply conditions, as per a report from Tom’s Hardware. As a result, customers who previously purchased large quantities of H100 chips are reportedly starting to resell them.
The report further points out that the previously high-demand H100 data center GPU, driven by the surge in artificial intelligence applications, has seen a reduction in delivery wait times from a peak of 8-11 months to 3-4 months, indicating a relief in supply pressure.
Additionally, with major cloud providers such as AWS, Google Cloud, and Microsoft Azure offering easier access to AI computing services for customers, enterprises that previously purchased large quantities of H100 GPUs have begun further reselling these GPUs.
For instance, AWS introduced a new service allowing customers to rent GPUs for shorter periods, easing earlier chip-demand pressure and shortening wait times for AI chips.
The report also attributes the resales to the GPUs' reduced scarcity and high maintenance costs. This situation contrasts starkly with the market shortage a year ago.
However, even though the current difficulty in obtaining H100 GPUs has significantly decreased, the artificial intelligence market remains robust overall. The demand for large-scale artificial intelligence model computations persists for some enterprises, keeping the overall demand greater than the supply, thereby preventing a significant drop in the price of H100 GPUs.
The report emphasizes that the current ease of purchasing H100 GPUs has also brought about some changes in the market. Customers now prioritize price and practicality when leasing AI computing services from cloud service providers.
Additionally, alternatives to the H100 have emerged on the market, offering comparable performance and software support at potentially more affordable prices, which could contribute to a more balanced market.
TrendForce’s newest projections spotlight a 2024 landscape where demand for high-end AI servers—powered by NVIDIA, AMD, or other top-tier ASIC chips—will be heavily influenced by North America’s cloud service powerhouses.
Microsoft (20.2%), Google (16.6%), AWS (16%), and Meta (10.8%) are predicted to collectively command over 60% of global demand, with NVIDIA GPU-based servers leading the charge.
However, NVIDIA still faces ongoing hurdles in development as it contends with US restrictions.
TrendForce has pointed out that, despite NVIDIA’s stronghold in the data center sector—thanks to its GPU servers capturing up to 70% of the AI market—challenges continue to loom.
Three major challenges are set to limit the company’s future growth: Firstly, the US ban on technological exports has spurred China toward self-reliance in AI chips, with Huawei emerging as a noteworthy adversary. NVIDIA’s China-specific solutions, like the H20 series, might not match the cost-effectiveness of its flagship models, potentially dampening its market dominance.
Secondly, the trend toward proprietary ASIC development among US cloud behemoths, including Google, AWS, Microsoft, and Meta, is expanding annually due to scale and cost considerations.
Lastly, AMD presents competitive pressure with its cost-effective strategy, offering products at just 60–70% of the prices of comparable NVIDIA models. This allows AMD to penetrate the market more aggressively, especially with flagship clients. Microsoft is expected to be the most enthusiastic adopter of AMD’s high-end GPU MI300 solutions in 2024.
(Photo credit: NVIDIA)
Under the formidable impetus of AI, global enterprises are vigorously pursuing AI chip development, and China is no exception. Who are China's prominent AI chip makers at present? How do they compare with industry giants like NVIDIA, and what are their unique advantages? A report from TechNews has compiled an overview of eight Chinese companies developing their own AI chips.
In broad terms, AI chips refer to semiconductor chips capable of running AI algorithms. However, in the industry’s typical usage, AI chips specifically denote chips designed with specialized acceleration for AI algorithms, capable of handling large-scale computational tasks in AI applications. Under this concept, AI chips are also referred to as accelerator cards.
Technically, AI chips are mainly classified into three categories: GPU, FPGA, and ASIC. In terms of functionality, AI chips encompass two main types: training and inference. Regarding application scenarios, AI chips can be categorized into server-side and mobile-side, or cloud, edge, and terminal.
The global AI chip market is currently dominated by Western giants, with NVIDIA leading the pack. Industry data cited by TechNews indicates that NVIDIA nearly monopolizes the market with a share of around 80%.
China’s AI industry started relatively late, but in recent years, amid the US-China rivalry and strong support from Chinese policies, Chinese AI chip design companies have gradually gained prominence. They have demonstrated relatively outstanding performance in terminal and large model inference.
However, compared to global giants, they still have significant ground to cover, especially in the higher-threshold GPU and large model training segments.
GPUs are general-purpose chips, currently dominating the usage in the AI chip market. General-purpose GPU computing power is widely employed in artificial intelligence model training and inference fields. Presently, NVIDIA and AMD dominate the GPU market, while Chinese representative companies include Hygon Information Technology, Jingjia Micro, and Enflame Technology.
FPGAs are semi-customized chips known for low latency and short development cycles. Compared to GPUs, they are suitable for multi-instruction, single-data flow analysis, but not for complex algorithm computations. They are mainly used in the inference stage of deep learning algorithms. Frontrunners in this field include Xilinx and Intel in the US, with Chinese representatives including Baidu Kunlunxin and DeePhi.
ASICs are fully customized AI chips with advantages in power consumption, reliability, and integration. Mainstream products include TPU, NPU, VPU, and BPU. Global leading companies include Google and Intel, while China’s representatives include Huawei, Alibaba, Cambricon Technologies, and Horizon Robotics.
In recent years, China has actively invested in the field of self-developed AI chips. Major companies such as Baidu, Alibaba, Tencent, and Huawei have accelerated the development of their own AI chips, and numerous AI chip companies continue to emerge.
Below is an overview of the progress of 8 Chinese AI chip manufacturers:
1. Baidu Kunlunxin
Baidu’s foray into AI chips can be traced back to as early as 2011. After seven years of development, Baidu officially unveiled its self-developed AI chip, Kunlun 1, in 2018. Built on a 14nm process and utilizing the self-developed XPU architecture, Kunlun 1 entered mass production in 2020. It is primarily employed in Baidu’s search engine and Xiaodu businesses.
In August 2021, Baidu announced the mass production of its second-generation self-developed AI chip, Kunlun 2. It adopts a 7nm process and integrates the self-developed second-generation XPU architecture, delivering 2-3 times the performance of the first generation along with significant enhancements in versatility and ease of use.
The first two generations of Baidu Kunlunxin products have already been deployed in tens of thousands of units. The third-generation product is expected to be unveiled at the Baidu Create AI Developer Conference scheduled for April 2024.
2. T-Head (Alibaba)
Established in September 2018, T-Head is the semiconductor chip business entity fully owned by Alibaba. It provides a series of products, covering data center chips, IoT chips, processor IP licensing, and more, achieving complete coverage across the chip design chain.
In terms of AI chip deployment, T-Head introduced its first high-performance artificial intelligence inference chip, the HanGuang 800, in September 2019. It is based on a 12nm process and features a proprietary architecture.
In August 2023, Alibaba’s T-Head unveiled its first self-developed RISC-V AI platform, supporting over 170 mainstream AI models, thereby propelling RISC-V into the era of high-performance AI applications.
Simultaneously, T-Head announced the new upgrade of its XuanTie processor C920, which can accelerate GEMM (General Matrix Multiplication) calculations 15 times faster than the Vector scheme.
In November 2023, T-Head introduced three new processors on the XuanTie RISC-V platform (C920, C907, R910). These processors significantly enhance acceleration computing capabilities, security, and real-time performance, poised to accelerate the widespread commercial deployment of RISC-V in scenarios and domains such as autonomous driving, artificial intelligence, enterprise-grade SSD, and network communication.
3. Tencent
In November 2021, Tencent announced substantial progress in three chip designs: Zixiao for AI computing, Canghai for image processing, and Xuanling for high-performance networking.
Zixiao has successfully completed trial production and been brought online. Reportedly, Zixiao employs an in-house compute-in-memory architecture and proprietary acceleration modules, delivering up to 3 times the computing acceleration performance and over 45% overall cost savings.
Zixiao chips are intended for internal use by Tencent and are not available for external sales. Tencent profits by renting out computing power through its cloud services.
Recently, according to sources cited by TechNews, Tencent is considering using Zixiao V1 as an alternative to the NVIDIA A10 chip for AI image and voice recognition applications. Additionally, Tencent is planning to launch the Zixiao V2 Pro chip optimized for AI training to replace the NVIDIA L40S chip in the future.
4. Huawei
Huawei unveiled its Huawei AI strategy and all-scenario AI solutions at the 2018 Huawei Connect Conference. Additionally, it introduced two new AI chips: the Ascend 910 and the Ascend 310. Both chips are based on Huawei’s self-developed Da Vinci architecture.
The Ascend 910, designed for training, utilizes a 7nm process and boasts computational density that is said to surpass the NVIDIA Tesla V100 and Google TPU v3.
On the other hand, the Ascend 310 belongs to the Ascend-mini series and is Huawei’s first commercial AI SoC, catering to low-power consumption areas such as edge computing.
Based on the Ascend 910 and Ascend 310 AI chips, Huawei has introduced the Atlas AI computing solution. As per the Huawei Ascend community, the Atlas 300T product line includes three models corresponding to the Ascend 910A, 910B, and 910 Pro B.
Among them, the 910 Pro B has already secured orders for at least 5,000 units from major clients in 2023, with delivery expected in 2024. Sources cited by the TechNews report indicate that the capabilities of the Huawei Ascend 910B chip are now comparable to those of the NVIDIA A100.
Due to the soaring demand for China-produced AI chips like the Huawei Ascend 910B in China, Reuters recently reported that Huawei plans to prioritize the production of the Ascend 910B. This move could potentially impact the production capacity of the Kirin 9000s chips, which are expected to be used in the Mate 60 series.
5. Cambricon Technologies
Founded in 2016, Cambricon Technologies focuses on the research and technological innovation of artificial intelligence chip products.
Since its establishment, Cambricon has launched multiple chip products covering terminal, cloud, and edge computing fields. Among them, the MLU 290 intelligent chip is Cambricon’s first training chip, utilizing TSMC’s 7nm advanced process and integrating 46 billion transistors. It supports the MLUv02 expansion architecture, offering comprehensive support for AI training, inference, or hybrid artificial intelligence computing acceleration tasks.
The Cambricon MLU 370 is the company’s flagship product, utilizing a 7nm manufacturing process and supporting both inference and training tasks. Additionally, the MLU 370 is Cambricon’s first AI chip to adopt chiplet technology, integrating 39 billion transistors, with a maximum computing power of up to 256TOPS (INT8).
6. Biren Technology
Established in 2019, Biren Technology has initially focused on general-purpose intelligent computing in the cloud.
It aims to gradually surpass existing solutions in fields such as AI training, inference, and graphics rendering, thereby achieving a breakthrough for China-produced high-end general-purpose computing chips.
In 2021, Biren Technology’s first general GPU, the BR100 series, entered trial production. The BR100 was officially released in August 2022.
Reportedly, the BR100 series is based on Biren Technology's independently developed chip architecture and uses a mature 7nm manufacturing process.
7. Horizon Robotics
Founded in July 2015, Horizon Robotics is a provider of smart driving computing solutions in China. It has launched various AI chips, notably the Sunrise and Journey series. The Sunrise series focuses on the AIoT market, while the Journey series is designed for smart driving applications.
Currently, the Sunrise series has advanced to its third generation, comprising the Sunrise 3M and Sunrise 3E models, catering to the high-end and low-end markets, respectively.
In terms of performance, the Sunrise 3 achieves an equivalent standard computing power of 5 TOPS while consuming only 2.5W of power, representing a significant upgrade from the previous generation.
The Journey series has now iterated to its fifth generation. The Journey 5 chip was released in 2021, with global mass production starting in September 2022. Each chip in the series boasts a maximum AI computing power of up to 128 TOPS.
In November 2023, Horizon Robotics announced that the Journey 6 series will be officially unveiled in April 2024, with the first batch of mass-produced vehicle deliveries scheduled for the fourth quarter of 2024.
Several automotive companies, including BYD, GAC Group, Volkswagen Group’s software company CARIAD, Bosch, among others, have reportedly entered into cooperative agreements with Horizon Robotics.
8. Enflame Technology
Enflame Technology, established in March 2018, specializes in cloud and edge computing in the field of artificial intelligence.
Over the past five years, it has developed two product lines focusing on cloud training and cloud inference. In September 2023, Enflame Technology announced the completion of a CNY 2 billion Series D funding round.
In addition, according to reports cited by TechNews, Enflame Technology’s third-generation AI chip products are set to hit the market in early 2024.
Conclusion
Looking ahead, the industry remains bullish on the commercial development of AI, anticipating a substantial increase in the demand for computing power, thereby creating a significant market opportunity for AI chips.
Data cited by TechNews indicates that the global AI chip market reached USD 580 billion in 2022 and is projected to exceed one trillion US dollars by 2030.
Leading AI chip manufacturers like NVIDIA are naturally poised to continue benefiting from this trend. At the same time, Chinese AI chip companies also have the opportunity to narrow the gap and accelerate growth within the vast AI market landscape.
(Photo credit: iStock)
In 2023, “generative AI” was undeniably the hottest term in the tech industry.
The launch of the generative application ChatGPT by OpenAI has sparked a frenzy in the market, prompting various tech giants to join the race.
As per a report from TechNews, NVIDIA currently dominates the market for AI accelerators, but demand has outstripped supply, leading to a shortage. Even OpenAI intends to develop its own chips to avoid being constrained by tight supply chains.
On the other hand, due to restrictions arising from the US-China tech war, while NVIDIA has offered reduced versions of its products to Chinese clients, recent reports suggest that these reduced versions are not favored by Chinese customers.
Instead, Chinese firms are turning to Huawei for assistance or simultaneously developing their own chips, expected to keep pace with the continued advancement of large-scale language models.
In the current wave of AI development, NVIDIA undoubtedly stands as the frontrunner in AI computing power. Its A100/H100 series chips have secured orders from top clients worldwide in the AI market.
As per analyst Stacy Rasgon of Wall Street investment bank Bernstein Research, each ChatGPT query costs approximately USD 0.04. If ChatGPT's query volume were to scale to one-tenth of Google's search volume, the initial deployment would require approximately USD 48.1 billion worth of GPUs, plus about USD 16 billion worth of chips annually to sustain operations, along with a similar amount for related chips to execute tasks.
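The scale of those figures can be sanity-checked with a back-of-the-envelope calculation. The sketch below is not Bernstein's model: it assumes a commonly cited figure of roughly 8.5 billion Google searches per day (not stated in the article) and covers only the per-query compute cost, whereas Bernstein's estimates also include GPU capital expenditure.

```python
# Rough sanity check of the per-query cost math (illustrative assumptions).
GOOGLE_SEARCHES_PER_DAY = 8.5e9   # assumed figure, not from the article
COST_PER_QUERY_USD = 0.04         # Bernstein's per-query estimate

daily_queries = GOOGLE_SEARCHES_PER_DAY / 10          # one-tenth of Google's volume
daily_cost = daily_queries * COST_PER_QUERY_USD       # compute cost per day
annual_cost = daily_cost * 365                        # compute cost per year

print(f"Queries per day: {daily_queries:,.0f}")       # 850,000,000
print(f"Daily compute cost: ${daily_cost/1e6:,.1f}M") # $34.0M
print(f"Annual compute cost: ${annual_cost/1e9:,.2f}B")  # $12.41B
```

Even under these simplified assumptions, annual compute costs land in the tens of billions of dollars, the same order of magnitude as the article's figures.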
Therefore, whether to reduce costs, decrease overreliance on NVIDIA, or even enhance bargaining power further, global tech giants have initiated plans to develop their own AI accelerators.
Per reports by technology media The Information, citing industry sources, six global tech giants, including Microsoft, OpenAI, Tesla, Google, Amazon, and Meta, are all investing in developing their own AI accelerator chips. These companies are expected to compete with NVIDIA’s flagship H100 AI accelerator chips.
Progress of Global Companies’ In-house Chip Development
Rumors surrounding Microsoft’s in-house AI chip development have never ceased.
At the annual Microsoft Ignite 2023 conference, the company finally unveiled the Azure Maia 100 AI chip for data centers and the Azure Cobalt 100 cloud computing processor. In fact, rumors of Microsoft developing an AI-specific chip have been circulating since 2019, aimed at powering large language models.
The Azure Maia 100, introduced at the conference, is an AI accelerator chip designed for tasks such as running OpenAI models, ChatGPT, Bing, GitHub Copilot, and other AI workloads.
According to Microsoft, the Azure Maia 100 is the first-generation product in the series, manufactured on a 5-nanometer process. The Azure Cobalt 100 is an Arm-based cloud computing processor with 128 cores, offering a 40% performance improvement over the current generation of Arm-based chips in Azure. It supports services such as Microsoft Teams and Azure SQL. Both chips are produced by TSMC, and Microsoft is already designing their second generation.
OpenAI is also exploring the production of in-house AI accelerator chips and has begun evaluating potential acquisition targets. According to earlier reports from Reuters citing industry sources, OpenAI has been discussing various solutions to address the shortage of AI chips since at least 2022.
Although OpenAI has not made a final decision, options to address the shortage of AI chips include developing their own AI chips or further collaborating with chip manufacturers like NVIDIA.
OpenAI has not provided an official comment on this matter at the moment.
Electric car manufacturer Tesla is also actively involved in the development of AI accelerator chips. Tesla primarily focuses on the demand for autonomous driving and has introduced two AI chips to date: the Full Self-Driving (FSD) chip and the Dojo D1 chip.
The FSD chip is used in Tesla vehicles' autonomous driving systems, while the Dojo D1 chip is the AI training chip that powers Tesla's Dojo supercomputer.
Google began secretly developing a chip focused on AI machine learning algorithms as early as 2013 and deployed it in its internal cloud computing data centers to replace NVIDIA’s GPUs.
The custom chip, called the Tensor Processing Unit (TPU), was unveiled in 2016. It is designed to execute large-scale matrix operations for deep learning models used in natural language processing, computer vision, and recommendation systems.
In fact, Google had already constructed the TPU v4 AI chip in its data centers by 2020. However, it wasn’t until April 2023 that technical details of the chip were publicly disclosed.
As for Amazon Web Services (AWS), the cloud computing service provider under Amazon, it has been a pioneer in developing its own chips since the introduction of the Nitro1 chip in 2013. AWS has since developed three product lines of in-house chips, including network chips, server chips, and AI machine learning chips.
Among them, AWS’s lineup of self-developed AI chips includes the inference chip Inferentia and the training chip Trainium.
AWS also unveiled Inferentia2 (Inf2) in early 2023, designed specifically for artificial intelligence. It triples computational performance while quadrupling total accelerator memory.
It supports distributed inference via ultra-high-speed direct connections between chips and can handle models with up to 175 billion parameters, ranking AWS among the most capable in-house chip developers in today's AI chip market.
Meanwhile, Meta, until 2022, continued using CPUs and custom-designed chipsets tailored for accelerating AI algorithms to execute its AI tasks.
However, due to the inefficiency of CPUs compared to GPUs in executing AI tasks, Meta scrapped its plans for a large-scale rollout of custom-designed chips in 2022. Instead, it opted to purchase NVIDIA GPUs worth billions of dollars.
Still, amidst the surge of other major players developing in-house AI accelerator chips, Meta has also ventured into internal chip development.
On May 19, 2023, Meta further unveiled its AI training and inference chip project. The chip consumes only 25 watts, just 1/20th of the power consumption of comparable NVIDIA products, and uses the RISC-V open-source architecture. According to market reports, the chip will be produced on TSMC's 7-nanometer manufacturing process.
China’s Progress on In-House Chip Development
China’s journey in developing in-house chips presents a different picture. In October last year, the United States expanded its ban on selling AI chips to China.
Although NVIDIA promptly tailored new chips for the Chinese market to comply with US export regulations, recent reports suggest that major Chinese cloud computing clients such as Alibaba and Tencent are less inclined to purchase the downgraded H20 chips. Instead, they have begun shifting their orders to domestic suppliers, including Huawei.
This shift indicates Chinese companies' growing reliance on domestically developed chips, as some orders for advanced semiconductors move to local suppliers.
TrendForce indicates that currently about 80% of high-end AI chips purchased by Chinese cloud operators are from NVIDIA, but this figure may decrease to 50% to 60% over the next five years.
If the United States continues to strengthen chip controls in the future, it could potentially exert additional pressure on NVIDIA’s sales in China.
(Photo credit: NVIDIA)