NVIDIA’s GB200 thermal challenges have made Co-Packaged Optics (CPO) technology a key topic in the tech industry. According to Commercial Times, NVIDIA plans to unveil a new CPO-based switch at the March GPU Technology Conference (GTC), with mass production slated for August. Suppliers like TSMC and FOCI are actively preparing for this development.
The Commercial Times, citing supply chain sources, reports that shipments of the GB200 series have been underwhelming. The complex design of the GB200 NVL72 racks, combined with the high power draw of high-performance computing, has caused overheating problems. The resulting delays have also fueled speculation that AWS might accelerate the adoption of its in-house chips, Trainium and Inferentia, posing a competitive threat to NVIDIA as ASIC demand grows and encroaches on the general-purpose GPU market.
The report also highlights that while single-rack issues with GB200 have largely been resolved, multi-rack interconnects now present significant challenges. The complexity of the copper cabling and wiring is causing both overheating and glitching. Industry insiders emphasize that the transition from electrical to optical connections is becoming inevitable. NVIDIA is rumored to announce its CPO switch at GTC, with the switch ASIC manufactured by TSMC.
According to Commercial Times, trial production is expected to proceed smoothly, enabling mass production in August. The CPO switch is projected to support 115.2 Tbps signal transmission. TSMC has already validated its compact universal photonics engine (COUPE) with a transfer rate of 1.6 Tbps and is testing 3.2 Tbps products. However, achieving the target transmission speed for the CPO switch requires the coupling of 36 optical engines.
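The relationship between the 115.2 Tbps target and the 36 optical engines can be checked with simple arithmetic. The sketch below assumes the engines are the 3.2 Tbps parts currently in testing, which the report implies but does not state outright; the function name is hypothetical, and rates are expressed in whole Gbps to avoid floating-point rounding.

```python
def engines_needed(target_gbps: int, per_engine_gbps: int) -> int:
    """Smallest number of optical engines whose combined rate meets the target.

    Illustrative only: assumes every engine contributes its full rated
    bandwidth and ignores any redundancy or overhead.
    """
    if target_gbps <= 0 or per_engine_gbps <= 0:
        raise ValueError("rates must be positive")
    # Integer ceiling division: round up without going through floats.
    return -(-target_gbps // per_engine_gbps)


TARGET_GBPS = 115_200  # 115.2 Tbps switch target from the report

print(engines_needed(TARGET_GBPS, 3_200))  # 3.2 Tbps engines -> 36
print(engines_needed(TARGET_GBPS, 1_600))  # 1.6 Tbps engines -> 72
```

Under this assumption, the already-validated 1.6 Tbps engine would require twice as many couplings to reach the same target.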
Industry sources cited by Commercial Times note that semiconductor processes involving nanoscale optical coupling pose significant challenges. While an individual coupling may achieve a yield rate exceeding 90%, repeating the process 36 times magnifies the risk of failure. Even minor errors can lead to CoWoS (Chip-on-Wafer-on-Substrate) assembly failures, resulting in significant financial losses.
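To see why 36 repetitions magnify the risk, here is a rough sketch of the compound yield. It assumes each coupling succeeds independently with the same probability and that all 36 must succeed for an assembly to be good; neither assumption comes from the report, and the function names are hypothetical. The 90% figure is used as a floor, since the report says individual yields may exceed it.

```python
def compound_yield(step_yield: float, steps: int) -> float:
    """Probability that all `steps` couplings succeed.

    Assumes independent couplings with identical yield (a simplification).
    """
    if not 0.0 <= step_yield <= 1.0 or steps < 0:
        raise ValueError("step_yield must be in [0, 1] and steps >= 0")
    return step_yield ** steps


def required_step_yield(target_yield: float, steps: int) -> float:
    """Per-coupling yield needed to reach `target_yield` overall."""
    if not 0.0 < target_yield <= 1.0 or steps <= 0:
        raise ValueError("target_yield must be in (0, 1] and steps > 0")
    return target_yield ** (1.0 / steps)


STEPS = 36  # optical engines per switch, per the report

for y in (0.90, 0.99, 0.999):
    print(f"per-coupling {y:.1%} -> assembly {compound_yield(y, STEPS):.1%}")

print(f"needed per coupling for 90% overall: {required_step_yield(0.90, STEPS):.2%}")
```

At exactly 90% per coupling, only about 2% of assemblies would come out fully good under these assumptions, and reaching 90% overall would take roughly 99.7% per coupling. Real yields depend on rework options and on correlations between couplings that this model ignores.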
The same report notes that equipment manufacturers remain cautious. German equipment supplier Ficontec, which provides key machinery for the process, faces capacity limitations, raising doubts about whether NVIDIA can launch the product as scheduled.
(Photo credit: NVIDIA)