AI chip


2024-04-15

[News] Following in NVIDIA’s Footsteps? Intel Reportedly Plans to Launch Chinese Version of AI Chips

Under pressure from US export restrictions, Intel is reportedly preparing to follow in NVIDIA’s footsteps by developing “special edition” versions of its Gaudi 3 AI accelerator for the Chinese market. The two products are rumored to launch at the end of June and the end of September, respectively.

According to reports from The Register, Intel recently unveiled its new-generation AI acceleration chip, Gaudi 3. Intel stated in the Gaudi 3 white paper that it is preparing to launch a special edition Gaudi 3 tailored for the Chinese market. This would include two hardware variants: the HL-328 OAM-compatible Mezzanine Card and the HL-388 PCIe Accelerator Card. The HL-328 is said to be scheduled for release on June 24, while the HL-388 will follow suit on September 24.

In terms of specifications, the made-for-China edition shares the same features as the original version, including 96MB of on-chip SRAM, 128GB of HBM2e high-bandwidth memory with 3.7 TB/s of bandwidth, a PCIe 5.0 x16 interface, and the same decoding standards.

However, due to US export restrictions on AI chips, the total processing performance (TPP) of high-performance AI chips must be below 4,800 to be exportable to China. This means the Chinese special edition’s 16-bit performance cannot exceed 150 TFLOPS (trillion floating-point operations per second).

For comparison, the original Gaudi 3 achieves 1,835 TFLOPS in FP16/BF16. Intel claims it outpaces NVIDIA’s H100, being approximately 40% faster in large model training and 50% more efficient in inference tasks.

Therefore, the made-for-China edition will need significant cuts to its core count (the original version has 8 Matrix Multiplication Engines [MME] and 64 Tensor Processor Cores [TPC]) and operating frequency. Ultimately, this could reduce its AI performance by approximately 92% to comply with US export control requirements.

Analyses cited in the same report further suggest that the AI performance of Intel’s made-for-China edition will be comparable to that of the H20, NVIDIA’s AI accelerator card tailored for the Chinese market.

The made-for-China edition of Intel’s Gaudi 3 is expected to deliver 148 TFLOPS in FP16/BF16, slightly below the 150 TFLOPS limit. However, in high-bandwidth memory (HBM) capacity and bandwidth, the Chinese special edition will trail NVIDIA’s H20, potentially putting it at a competitive disadvantage against the H20. Still, pricing will also be a key factor in determining whether it holds any competitive advantage.
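The export-cap arithmetic running through the paragraphs above can be sketched in a few lines. Note that the TPP formula below is an assumption inferred from the article’s own figures (a 4,800 TPP cap implying a 150 TFLOPS limit at 16-bit precision); the precise definition comes from the US export rules, not from this article:

```python
# Export-threshold arithmetic, assuming TPP = 2 x bit length x TFLOPS.
# The factor of 2 is inferred from the article's numbers
# (4,800 / (2 x 16) = 150 TFLOPS), not stated in the article itself.

TPP_CAP = 4800  # maximum total processing performance for export to China

def tpp(tflops: float, bit_length: int) -> float:
    """Total processing performance under the assumed formula."""
    return 2 * bit_length * tflops

# Maximum 16-bit throughput the cap allows
max_fp16_tflops = TPP_CAP / (2 * 16)          # 150.0 TFLOPS

# The rumored China edition (148 TFLOPS FP16/BF16) squeezes under the cap
china_edition_ok = tpp(148, 16) <= TPP_CAP    # True (TPP = 4,736)

# The original Gaudi 3 (1,835 TFLOPS FP16/BF16) is far above it
original_ok = tpp(1835, 16) <= TPP_CAP        # False (TPP = 58,720)

# Implied performance cut to comply, matching the ~92% in the article
reduction = 1 - 148 / 1835                    # ~0.919
```

Under this reading, the 148 TFLOPS figure leaves only a 64-point margin below the cap, which is consistent with Intel tuning the part as close to the limit as possible.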

As per a previous report from Reuters, the H20’s prices were said to be comparable to those of competitor Huawei’s products, with NVIDIA reportedly pricing orders from Chinese H20 distributors between USD 12,000 and 15,000 per unit.

TrendForce believes Chinese companies will continue to buy existing AI chips in the short term. NVIDIA’s GPU-based AI accelerators remain a top priority, including the H20, L20, and L2, all designed specifically for the Chinese market following the ban.


(Photo credit: NVIDIA)

Please note that this article cites information from The Register and Reuters.

2024-03-22

[News] AMD Hosts Innovation Summit, CEO Lisa Su Highlights the Beginning of the AI PC Era

Following NVIDIA’s GTC 2024, AMD hosted its own AI PC Innovation Summit on March 21st, with CEO Lisa Su leading the top executives in attendance. As per a report from Commercial Times, collaborating with partners including ASUS, MSI, and Acer, AMD showcased a range of AI PC applications.

AMD highlights that future language models will evolve in two directions. One is the large-scale models introduced by tech giants, which use ever more parameters and grow more complex to operate, with closed architecture as a defining characteristic.

The other direction is small open-source models, which are gaining wider public acceptance. With fewer parameters, these models can run smoothly on edge devices, especially AI PCs, and are expected to attract a significant influx of developers.

Furthermore, the AI compute requirements of large and small language models are entirely different, and AMD positions different hardware to address each.

Lisa Su emphasizes that artificial intelligence is driving a revolution, reshaping every aspect of the tech industry, from data centers to AI PCs and edge computing. AMD is excited about the opportunities presented by this new era of computing.

TrendForce previously issued an analysis in a press release, indicating that the AI PC market is propelled by two key drivers: Firstly, demand for terminal applications, mainly dominated by Microsoft through its Windows OS and Office suite, is a significant factor. Microsoft is poised to integrate Copilot into the next generation of Windows, making Copilot a fundamental requirement for AI PCs. Secondly, Intel, as a leading CPU manufacturer, is advocating for AI PCs that combine CPU, GPU, and NPU architectures to enable a variety of terminal AI applications.

Introduced around the end of 2023, Qualcomm’s Snapdragon X Elite platform is set to be the first to meet Copilot standards, with shipments expected in the second half of 2024. This platform is anticipated to deliver around 45 TOPS.

Following closely behind, AMD’s Ryzen 8000 series (Strix Point) is also expected to meet these requirements. Intel’s Meteor Lake, launched in December 2023 with a combined CPU+GPU+NPU power of 34 TOPS, falls short of Microsoft’s standards. However, Intel’s upcoming Lunar Lake might surpass the 40 TOPS threshold by the end of the year.

The race among Qualcomm, Intel, and AMD in the AI PC market is set to intensify the competition between the x86 and Arm CPU architectures in the Edge AI market. Qualcomm’s early compliance with Microsoft’s requirements positions it to capture the initial wave of AI PC opportunities, as major PC OEMs like Dell, HPE, Lenovo, ASUS, and Acer develop Qualcomm CPU-equipped models in 2024, presenting a challenge to the x86 camp.


(Photo credit: AMD)

Please note that this article cites information from Commercial Times and Bloomberg.

2024-03-19

[News] TSMC’s 4nm Process Powers NVIDIA’s Blackwell Architecture GPU, AI Performance Surpasses Previous Generations by Multiples

Chip giant NVIDIA kicked off its annual Graphics Processing Unit (GPU) Technology Conference (GTC) today, with CEO Jensen Huang announcing the launch of the new artificial intelligence chip, Blackwell B200.

According to a report from TechNews, the new Blackwell architecture features an enormous GPU crafted on TSMC’s 4-nanometer (4NP) process technology, integrating two independently manufactured dies that total 208 billion transistors. The two dies are bound together like a zipper through a 10 TB/s die-to-die link, officially termed the NV-HBI interface.

Externally, the Blackwell complex’s NVLink 5.0 interface provides 1.8 TB/s of bandwidth, doubling the speed of the NVLink 4.0 interface on the previous-generation Hopper architecture GPU.

As per a report from Tom’s Hardware, the AI computing performance of a single B200 GPU can reach 20 petaflops, whereas the previous generation H100 offered a maximum of only 4 petaflops of AI computing performance. The B200 will also be paired with 192GB of HBM3e memory, providing up to 8 TB/s of bandwidth.
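The generational figures quoted above can be sanity-checked in a couple of lines (the precisions compared differ between generations, so this is a headline ratio rather than a like-for-like benchmark):

```python
# Generational ratios from the figures cited above.
b200_ai_pflops = 20        # Blackwell B200 peak AI compute, per Tom's Hardware
h100_ai_pflops = 4         # previous-generation H100 peak AI compute
compute_ratio = b200_ai_pflops / h100_ai_pflops   # 5.0x

nvlink5_tb_s = 1.8         # NVLink 5.0 per-GPU bandwidth
nvlink4_tb_s = nvlink5_tb_s / 2   # article: NVLink 5.0 doubles NVLink 4.0
# -> 0.9 TB/s, i.e. the 900 GB/s of Hopper's NVLink 4.0
```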

NVIDIA’s HBM supplier, South Korean chipmaker SK Hynix, also issued a press release today announcing the commencement of mass production of its high-performance DRAM new product, HBM3e, with shipments set to begin at the end of March.

Source: SK Hynix

Recently, global tech companies have been heavily investing in AI, leading to increasing demands for AI chip performance. SK Hynix points out that HBM3e is the optimal product to meet these demands. As memory operations for AI are extremely fast, efficient heat dissipation is crucial. HBM3e incorporates the latest Advanced MR-MUF technology for heat dissipation control, resulting in a 10% improvement in cooling performance compared to the previous generation.

Per SK Hynix’s press release, Sungsoo Ryu, the head of HBM Business at SK Hynix, said that mass production of HBM3e has completed the company’s lineup of industry-leading AI memory products.

“With the success story of the HBM business and the strong partnership with customers that it has built for years, SK hynix will cement its position as the total AI memory provider,” he stated.


(Photo credit: NVIDIA)

Please note that this article cites information from TechNews, Tom’s Hardware and SK Hynix.

2024-02-29

[News] Rapidus, the First 2nm Client, to Manufacture AI Chips for Tenstorrent

Rapidus, a foundry company established through Japanese government-industry collaboration, aims to mass-produce 2-nanometer chips by 2027. According to Rapidus, Canadian artificial intelligence (AI) chip startup Tenstorrent is set to become Rapidus’ 2nm client.

On February 27th, Rapidus announced a collaboration with Tenstorrent to jointly develop and manufacture edge AI accelerators based on 2nm logic technology. Tenstorrent will be responsible for the design/development of the AI chips, while Rapidus will handle production utilizing its under-construction factory in Hokkaido.

Rapidus had announced its collaboration with Tenstorrent in November 2023 to accelerate the development of AI chips, and this collaboration now extends into the realm of manufacturing.

This marks the first public announcement by Rapidus of securing a client (contract manufacturing order) for the most advanced chips. However, Rapidus has not disclosed details such as production volume or financial terms.

Atsuyoshi Koike, President of Rapidus, stated at a press conference held on February 27th, “In the future, AI will be utilized in all products, and the ability to swiftly produce AI chips that meet customer demands is crucial for competitiveness.”

As per a report from Asahi News, Tenstorrent is currently collaborating with TSMC and Samsung. Tenstorrent’s CEO, Jim Keller, stated, “We will be producing various products. Many people are looking forward to our collaboration with Rapidus.”

Rapidus is expected to mass-produce logic chips of 2 nanometers or less by 2027. The first plant, “IIM-1,” located in Chitose City, Hokkaido, began construction in September 2023. The trial production line is scheduled to start in April 2025, with mass production slated to begin in 2027.

Per NHK’s report, at a press conference held in Chitose City on January 22nd, Atsuyoshi Koike announced that construction of the 2-nanometer plant is proceeding smoothly, and the trial production line is scheduled to be operational by April 2025 as originally planned.

Regarding the construction of the plant, Koike stated, “There has been no delay even for a day; it is progressing according to schedule.” He also mentioned that they are considering the construction of a second and third plant in the future.

After the groundbreaking ceremony held in September 2023, the foundation work for Rapidus’ 2-nanometer factory was mostly completed by December of the same year. Construction of the above-ground factory building commenced in January 2024. The framework of the factory is expected to be completed by April or May, with the factory itself anticipated to be completed by December 2024.

Established in August 2022, Rapidus was jointly founded and funded by eight Japanese companies: Toyota, Sony, NTT, NEC, SoftBank, Denso, Kioxia (formerly Toshiba Memory Corporation), and Mitsubishi UFJ.


(Photo credit: Rapidus)

Please note that this article cites information from Rapidus, Asahi News and NHK.

2024-02-22

[News] Hurdles in Acquiring NVIDIA’s High-End Products: Assessing the Progress of Eight Chinese AI Chip Companies in Self-Development

Under the formidable impetus of AI, global enterprises are vigorously strategizing for AI chip development, and China is no exception. Who are China’s prominent AI chip makers today? How do they compare with industry giants like NVIDIA, and what are their unique advantages? A report from TechNews has compiled an overview of eight Chinese companies’ progress in self-developed AI chips.

  • An Overview of AI Chips

In broad terms, AI chips refer to semiconductor chips capable of running AI algorithms. However, in the industry’s typical usage, AI chips specifically denote chips designed with specialized acceleration for AI algorithms, capable of handling large-scale computational tasks in AI applications. Under this concept, AI chips are also referred to as accelerator cards.

Technically, AI chips are mainly classified into three categories: GPU, FPGA, and ASIC. In terms of functionality, AI chips encompass two main types: training and inference. Regarding application scenarios, AI chips can be categorized into server-side and mobile-side, or cloud, edge, and terminal.

The global AI chip market is currently dominated by Western giants, with NVIDIA leading the pack. Industry sources cited by TechNews reveal that NVIDIA nearly monopolizes the AI chip market with an 80% market share.

China’s AI industry started relatively late, but in recent years, amid the US-China rivalry and strong support from Chinese policies, Chinese AI chip design companies have gradually gained prominence. They have demonstrated relatively outstanding performance in terminal and large model inference.

However, compared to global giants, they still have significant ground to cover, especially in the higher-threshold GPU and large model training segments.

GPUs are general-purpose chips, currently dominating the usage in the AI chip market. General-purpose GPU computing power is widely employed in artificial intelligence model training and inference fields. Presently, NVIDIA and AMD dominate the GPU market, while Chinese representative companies include Hygon Information Technology, Jingjia Micro, and Enflame Technology.

FPGAs are semi-customized chips known for low latency and short development cycles. Compared to GPUs, they are suitable for multi-instruction, single-data flow analysis, but not for complex algorithm computations. They are mainly used in the inference stage of deep learning algorithms. Frontrunners in this field include Xilinx and Intel in the US, with Chinese representatives including Baidu Kunlunxin and DeePhi.

ASICs are fully customized AI chips with advantages in power consumption, reliability, and integration. Mainstream products include TPU, NPU, VPU, and BPU. Global leading companies include Google and Intel, while China’s representatives include Huawei, Alibaba, Cambricon Technologies, and Horizon Robotics.

In recent years, China has actively invested in the field of self-developed AI chips. Major companies such as Baidu, Alibaba, Tencent, and Huawei have accelerated the development of their own AI chips, and numerous AI chip companies continue to emerge.

Below is an overview of the progress of 8 Chinese AI chip manufacturers:

1. Baidu Kunlunxin

Baidu’s foray into AI chips can be traced back to as early as 2011. After seven years of development, Baidu officially unveiled its self-developed AI chip, Kunlun 1, in 2018. Built on a 14nm process and utilizing the self-developed XPU architecture, Kunlun 1 entered mass production in 2020. It is primarily employed in Baidu’s search engine and Xiaodu businesses.

In August 2021, Baidu announced the mass production of its second-generation self-developed AI chip, Kunlun 2. It adopts a 7nm process and integrates the self-developed second-generation XPU architecture, delivering a performance improvement of 2-3 times compared to the first generation. It also exhibits significant enhancements in versatility and ease of use.

The first two generations of Baidu Kunlunxin products have already been deployed in tens of thousands of units. The third-generation product is expected to be unveiled at the Baidu Create AI Developer Conference scheduled for April 2024.

2. T-Head (Alibaba)

Established in September 2018, T-Head is the semiconductor chip business entity fully owned by Alibaba. It provides a series of products, covering data center chips, IoT chips, processor IP licensing, and more, achieving complete coverage across the chip design chain.

In terms of AI chip deployment, T-Head introduced its first high-performance artificial intelligence inference chip, the HanGuang 800, in September 2019. It is based on a 12nm process and features a proprietary architecture.

In August 2023, Alibaba’s T-Head unveiled its first self-developed RISC-V AI platform, supporting over 170 mainstream AI models, thereby propelling RISC-V into the era of high-performance AI applications.

Simultaneously, T-Head announced the new upgrade of its XuanTie processor C920, which can accelerate GEMM (General Matrix Multiplication) calculations 15 times faster than the Vector scheme.

In November 2023, T-Head introduced three new processors on the XuanTie RISC-V platform (C920, C907, R910). These processors significantly enhance acceleration computing capabilities, security, and real-time performance, poised to accelerate the widespread commercial deployment of RISC-V in scenarios and domains such as autonomous driving, artificial intelligence, enterprise-grade SSD, and network communication.

3. Tencent 

In November 2021, Tencent announced substantial progress in three chip designs: Zixiao for AI computing, Canghai for image processing, and Xuanling for high-performance networking.

Zixiao has successfully completed trial production and been put into use. Reportedly, Zixiao employs an in-house storage-computing architecture and proprietary acceleration modules, delivering up to 3 times the computing acceleration performance and over 45% cost savings overall.

Zixiao chips are intended for internal use by Tencent and are not available for external sales. Tencent profits by renting out computing power through its cloud services.

Recently, according to sources cited by TechNews, Tencent is considering using Zixiao V1 as an alternative to the NVIDIA A10 chip for AI image and voice recognition applications. Additionally, Tencent is planning to launch the Zixiao V2 Pro chip optimized for AI training to replace the NVIDIA L40S chip in the future.

4. Huawei

Huawei unveiled its Huawei AI strategy and all-scenario AI solutions at the 2018 Huawei Connect Conference. Additionally, it introduced two new AI chips: the Ascend 910 and the Ascend 310. Both chips are based on Huawei’s self-developed Da Vinci architecture.

The Ascend 910, designed for training, utilizes a 7nm process and boasts computational density that is said to surpass the NVIDIA Tesla V100 and Google TPU v3.

On the other hand, the Ascend 310 belongs to the Ascend-mini series and is Huawei’s first commercial AI SoC, catering to low-power consumption areas such as edge computing.

Based on the Ascend 910 and Ascend 310 AI chips, Huawei has introduced the Atlas AI computing solution. As per the Huawei Ascend community, the Atlas 300T product line includes three models corresponding to the Ascend 910A, 910B, and 910 Pro B.

Among them, the 910 Pro B has already secured orders for at least 5,000 units from major clients in 2023, with delivery expected in 2024. Sources cited by the TechNews report indicate that the capabilities of the Huawei Ascend 910B chip are now comparable to those of the NVIDIA A100.

Due to the soaring demand for China-produced AI chips like the Huawei Ascend 910B in China, Reuters recently reported that Huawei plans to prioritize the production of the Ascend 910B. This move could potentially impact the production capacity of the Kirin 9000s chips, which are expected to be used in the Mate 60 series.

5. Cambricon Technologies

Founded in 2016, Cambricon Technologies focuses on the research and technological innovation of artificial intelligence chip products.

Since its establishment, Cambricon has launched multiple chip products covering terminal, cloud, and edge computing fields. Among them, the MLU 290 intelligent chip is Cambricon’s first training chip, utilizing TSMC’s 7nm advanced process and integrating 46 billion transistors. It supports the MLUv02 expansion architecture, offering comprehensive support for AI training, inference, or hybrid artificial intelligence computing acceleration tasks.

The Cambricon MLU 370 is the company’s flagship product, utilizing a 7nm manufacturing process and supporting both inference and training tasks. Additionally, the MLU 370 is Cambricon’s first AI chip to adopt chiplet technology, integrating 39 billion transistors, with a maximum computing power of up to 256TOPS (INT8).

6. Biren Technology 

Established in 2019, Biren Technology initially focused on general-purpose intelligent computing in the cloud.

It aims to gradually surpass existing solutions in fields such as artificial intelligence training, inference, and graphics rendering, thereby achieving a breakthrough for China-produced high-end general-purpose computing chips.

In 2021, Biren Technology’s first general GPU, the BR100 series, entered trial production. The BR100 was officially released in August 2022.

Reportedly, the BR100 series is built on Biren Technology’s independently developed chip architecture and utilizes a mature 7nm manufacturing process.

7. Horizon Robotics 

Founded in July 2015, Horizon Robotics is a provider of smart driving computing solutions in China. It has launched various AI chips, notably the Sunrise and Journey series. The Sunrise series focuses on the AIoT market, while the Journey series is designed for smart driving applications.

Currently, the Sunrise series has advanced to its third generation, comprising the Sunrise 3M and Sunrise 3E models, catering to the high-end and low-end markets, respectively.

In terms of performance, the Sunrise 3 achieves an equivalent standard computing power of 5 TOPS while consuming only 2.5W of power, representing a significant upgrade from the previous generation.

The Journey series has now iterated to its fifth generation. The Journey 5 chip was released in 2021, with global mass production starting in September 2022. Each chip in the series boasts a maximum AI computing power of up to 128 TOPS.

In November 2023, Horizon Robotics announced that the Journey 6 series will be officially unveiled in April 2024, with the first batch of mass-produced vehicle deliveries scheduled for the fourth quarter of 2024.

Several automotive companies, including BYD, GAC Group, Volkswagen Group’s software company CARIAD, Bosch, among others, have reportedly entered into cooperative agreements with Horizon Robotics.

8. Enflame Technology

Enflame Technology, established in March 2018, specializes in cloud and edge computing in the field of artificial intelligence.

Over the past five years, it has developed two product lines focusing on cloud training and cloud inference. In September 2023, Enflame Technology announced the completion of a CNY 2 billion Series D funding round.

In addition, according to reports cited by TechNews, Enflame Technology’s third-generation AI chip products are set to hit the market in early 2024.

Conclusion

Looking ahead, the industry remains bullish on the commercial development of AI, anticipating a substantial increase in the demand for computing power, thereby creating a significant market opportunity for AI chips.

Per data cited by TechNews, the global AI chip market reached USD 580 billion in 2022 and is projected to exceed a trillion dollars by 2030.

Leading AI chip manufacturers like NVIDIA are naturally poised to continue benefiting from this trend. At the same time, Chinese AI chip companies also have the opportunity to narrow the gap and accelerate growth within the vast AI market landscape.


(Photo credit: iStock)

Please note that this article cites information from TechNews and Reuters.

