NVIDIA Corporation (NVDA.US) updates its roadmap! Is Taiwan Semiconductor Manufacturing Co., Ltd. Sponsored ADR (TSM.US) being targeted?
16/01/2025
GMT Eight
According to a report by Ming-Chi Kuo, an analyst at TF International Securities, cited by Tom's Hardware, market demand for Nvidia's high-end dual-chip Blackwell designs is outpacing demand for its low-end single-chip designs. As a result, the trillion-dollar GPU giant has updated its Blackwell architecture roadmap to prioritize dual-chip designs packaged with CoWoS-L.
The report further states that starting from the first quarter of this year, Nvidia will focus on its 200 series Blackwell GPUs. However, it is important to note that this only includes the multi-chip versions of the 200 series, such as GB200 NVL72. The single-chip versions of the 200 series, such as B200A, have been discontinued.
Similarly, Nvidia apparently plans to prioritize the use of multi-chip B300 series models, especially GB300 NVL72. Due to high demand for multi-chip variants, the single-chip variants of the B300 GPU will be of lower priority in manufacturing. Nvidia's high-priority Blackwell GPU models will use Taiwan Semiconductor Manufacturing Co., Ltd.'s more advanced CoWoS-L technology. The discontinued B200A and single-chip B300 GPUs both use CoWoS-S.
Ming-Chi Kuo stated that as a result of these changes, certain suppliers will be particularly impacted.
What has changed in NVIDIA Corporation's roadmap?
As previously outlined, NVIDIA Corporation has adopted dual-chip designs for the 200 series, which includes system products like GB200 NVL72 and HGX B200, manufactured using CoWoS-L.
Nvidia stated that the new B200 GPU has 208 billion transistors and can deliver up to 20 petaflops of FP4 compute. It also said that the GB200, which combines two GPUs with a single Grace CPU, could provide up to 30x the performance for LLM inference workloads while significantly improving efficiency. Compared to the H100, it reportedly cuts cost and energy consumption by up to 25x.
However, Ming-Chi Kuo pointed out that the 200 series no longer includes the single-chip B200A, which uses the CoWoS-S process, so the series does not require CoWoS-S capacity.
Research firm SemiAnalysis previously stated that Nvidia plans to introduce a new Blackwell GPU called B200A, which is a lower-end alternative to the delayed B200 GPU. They reported that B200A will contain up to 144GB of HBM3E memory and consume up to 1000 watts of power, meeting the needs of low and mid-range AI systems. According to the initial plan, the B200A GPU will be used in servers like MGX GB200A NVL36, which supports up to 36 GPUs, attracting customers interested in building smaller AI models.
It is worth mentioning that the B200A was to be based on a die called B102, which would also be used for the B20, the Blackwell variant for the Chinese market.
However, as Ming-Chi Kuo mentioned, the strategy of NVIDIA Corporation has now changed. He further stated that starting from the first quarter of 2025, Nvidia will shift its focus to the 200 series and reduce the supply of the H series. This will further reduce their demand for CoWoS-S.
In Ming-Chi Kuo's analysis report, he also analyzed the future B300 series for NVIDIA Corporation. He stated that the series originally planned to have both dual-chip (CoWoS-L) and single-chip (CoWoS-S) designs, including systems like GB300 NVL72 (dual-chip) and HGX B300 NVL16 (single-chip).
Similarly, according to SemiAnalysis, Nvidia's B300 series processors will feature a significantly reworked design while still using Taiwan Semiconductor Manufacturing Co., Ltd.'s 4NP manufacturing process (a 4nm-class node optimized for Nvidia for enhanced performance). The report states that their compute performance will be 50% higher than that of the B200 series. This gain comes at the cost of a TDP of up to 1,400W, only 200W higher than that of the GB200. SemiAnalysis reports that the B300 will be released approximately six months after the B200.
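As a rough sanity check on these figures, the sketch below works out the baseline TDP implied by "only 200W higher" and the resulting performance-per-watt ratio. It assumes the 50% compute uplift and the TDP difference can be measured against the same baseline part, which the reports do not state explicitly; the function and variable names are illustrative only.

```python
# Back-of-the-envelope check of the reported B300 figures.
# Assumption: the 50% uplift and the 200W delta refer to the same baseline part.

def perf_per_watt_gain(perf_ratio: float, new_tdp_w: float, old_tdp_w: float) -> float:
    """Return the change in performance per watt (1.0 means no change)."""
    return perf_ratio / (new_tdp_w / old_tdp_w)

b300_tdp_w = 1400                   # reported upper TDP for the B300
baseline_tdp_w = b300_tdp_w - 200   # "only 200W higher" implies 1,200W
perf_ratio = 1.5                    # "50% higher" compute performance

gain = perf_per_watt_gain(perf_ratio, b300_tdp_w, baseline_tdp_w)
print(f"Implied baseline TDP: {baseline_tdp_w} W")
print(f"Performance-per-watt ratio: {gain:.2f}x")
```

Under these assumptions, a 50% uplift for roughly 17% more power works out to about 1.29x performance per watt.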
Another major improvement in the Nvidia B300 series is the use of 12-Hi HBM3E memory stacks, providing 288 GB of memory and 8 TB/s of bandwidth. The larger memory capacity and higher compute throughput will enable faster training and inference, with inference costs potentially reduced by up to three times, as the B300 can handle larger batch sizes and support longer sequence lengths while addressing latency issues in user interactions.
In addition to higher compute performance and larger memory, Nvidia's second-generation Blackwell machines may also use the company's 800G ConnectX-8 NIC. This NIC has twice the bandwidth of the current 400G ConnectX-7 and 48 PCIe lanes, compared to its predecessor's 32. This will give new servers a significant scale-out bandwidth improvement, which is a win for large clusters.
According to reports, another major change with the B300 and GB300 compared to the B200 and GB200 is that Nvidia will redesign the entire supply chain. The company will no longer try to sell the entire reference motherboard or the entire server chassis. Instead, Nvidia will sell only the B300 on an SXM Puck module, the Grace CPU, and a Host Management Controller (HMC) from Axiado. This will allow more companies to participate in the Blackwell supply chain, making Blackwell-based machines easier to obtain.
With the help of B300 and GB300, Nvidia will provide its hyperscale and OEM partners with more freedom to design Blackwell machines, which will impact their pricing and even performance.
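The B300 memory and networking figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the stack count (8) and per-die capacity (3 GB, i.e. 24 Gbit) are assumptions not given in the reports, chosen because they reproduce the quoted 288 GB total.

```python
# Rough check of the B300 memory and NIC figures quoted in the article.
# Assumptions (not from the reports): 8 HBM3E stacks per GPU, 3 GB per DRAM die.

STACKS_PER_GPU = 8    # assumption
DIES_PER_STACK = 12   # "12-Hi" stacks, as reported
GB_PER_DIE = 3        # assumption (24 Gbit dies)

total_memory_gb = STACKS_PER_GPU * DIES_PER_STACK * GB_PER_DIE
bandwidth_per_stack_tbs = 8 / STACKS_PER_GPU   # 8 TB/s total, as reported

nic_bandwidth_ratio = 800 / 400   # 800G ConnectX-8 vs 400G ConnectX-7
pcie_lane_ratio = 48 / 32         # 48 vs 32 PCIe lanes

print(f"Total HBM3E capacity: {total_memory_gb} GB")
print(f"Bandwidth per stack: {bandwidth_per_stack_tbs} TB/s")
print(f"NIC bandwidth ratio: {nic_bandwidth_ratio}x, PCIe lane ratio: {pcie_lane_ratio}x")
```

With these assumed values the total matches the reported 288 GB, and the NIC doubles network bandwidth while adding 50% more PCIe connectivity.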
However, Ming-Chi Kuo pointed out that while B300-based systems are planned for mass production in 2026, Nvidia and cloud service providers (CSPs) currently prefer the GB300 NVL72 with CoWoS-L packaging. Although single-chip B300 systems with CoWoS-S packaging will also be used, the GB300 NVL72 will be prioritized.
Therefore, there is a more urgent demand for CoWoS-L compared to CoWoS-S.
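The roadmap described so far can be summarized as a small lookup table. The sketch below is only an illustrative restatement of the article's claims; the `System` class, the status labels, and the helper function are hypothetical names, not anything published by Nvidia or Ming-Chi Kuo.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class System:
    name: str
    design: str      # "dual-chip" or "single-chip"
    packaging: str   # "CoWoS-L" or "CoWoS-S"
    status: str      # illustrative labels based on the article's wording

# Product-to-packaging mapping as described in the article.
ROADMAP = [
    System("GB200 NVL72", "dual-chip", "CoWoS-L", "prioritized"),
    System("HGX B200", "dual-chip", "CoWoS-L", "prioritized"),
    System("B200A", "single-chip", "CoWoS-S", "discontinued"),
    System("GB300 NVL72", "dual-chip", "CoWoS-L", "prioritized"),
    System("HGX B300 NVL16", "single-chip", "CoWoS-S", "lower priority"),
]

def active_by_packaging(systems):
    """Group the names of non-discontinued systems by packaging type."""
    groups = defaultdict(list)
    for system in systems:
        if system.status != "discontinued":
            groups[system.packaging].append(system.name)
    return dict(groups)

demand = active_by_packaging(ROADMAP)
for packaging, names in demand.items():
    print(packaging, names)
```

Grouped this way, three of the four remaining systems depend on CoWoS-L, which is consistent with the more urgent demand for that packaging.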
Ming-Chi Kuo pointed out that these changes in the product roadmap will affect Nvidia and its supply chain partners to varying degrees. Some suppliers will be particularly hard hit, leading to a significant recent pullback in their stock prices. However, from Nvidia's perspective, the slowdown/reduction in CoWoS-S expansion is mainly due to changes in the product roadmap rather than a decline in demand. This change also aligns with Taiwan Semiconductor Manufacturing Co., Ltd. Sponsored ADR's strategic plan to promote its CoWoS-L technology as a mainstream solution.
What are the differences between CoWoS-L and CoWoS-S?
The sections above refer to CoWoS-L and CoWoS-S. These are two versions of the CoWoS platform from Taiwan Semiconductor Manufacturing Co., Ltd.
CoWoS stands for Chip-on-Wafer-on-Substrate. As an advanced packaging technology, CoWoS offers advantages such as larger package sizes and more I/O connections. It allows for the stacking of 2.5D and 3D components to achieve homogeneous and heterogeneous integration. Traditional systems faced memory limitations, while modern data centers use High Bandwidth Memory (HBM) to enhance memory capacity and bandwidth. CoWoS technology allows for the heterogeneous integration of logic SoC and HBM on the same IC platform.
Traditionally, scaling transistors according to Moore's Law helped meet the demand for improved performance. However, this is inadequate for modern applications such as High-Performance Computing (HPC), Artificial Intelligence, and even Graphics Processing Units (GPU). CoWoS allows for the stacking of chips on the same substrate, reducing interconnect latency between homogeneous or heterogeneous logic SoCs and HBMs.
Additionally, the use of silicon interposers and organic interposers greatly enhances the thermal management capabilities of stacked integrated circuits. This directly improves the reliability and lifespan of the entire system, while minimizing the risk of thermal throttling.
CoWoS technology helps by allowing multiple logic SoCs and HBMs to be installed on the same interposer and substrate. This is in stark contrast to traditional packaging technologies, which previously required multiple logic SoCs to be installed on printed circuit boards (PCBs) and necessary connections to be made during packaging. This led to larger packaging sizes and increased material costs and manufacturing expenses. CoWoS packaging is overall smaller and more cost-effective.
With the popularity of AI, demand for CoWoS has risen sharply, prompting Taiwan Semiconductor Manufacturing Co., Ltd. Sponsored ADR to significantly expand its CoWoS capacity. The Economic Daily reported earlier this year that Taiwan Semiconductor Manufacturing Co., Ltd. Sponsored ADR is actively increasing its advanced packaging capacity for CoWoS, with estimated capacity nearly doubling by 2025 to reach 75,000 wafers per month, and will continue to increase capacity in 2026 due to strong market demand.
Specifically, CoWoS comes in three versions: CoWoS-S, CoWoS-R, and CoWoS-L.
Taiwan Semiconductor Manufacturing Co., Ltd. Sponsored ADR explains that the CoWoS-S (Chip on Wafer on Substrate with silicon interposer) platform provides cutting-edge packaging technology for ultra-high-performance computing applications such as Artificial Intelligence (AI) and supercomputing. This wafer-level system integration platform provides high-density interconnects and deep trench capacitors over a large silicon interposer area to accommodate various functional top dies/chiplets, including logic chips and stacked High Bandwidth Memory (HBM) cubes. Currently, CoWoS-S with an interposer area of up to 3.3x the reticle size (about 2,700 mm²) is ready for production.
CoWoS-R (Chip on Wafer on Substrate with a fan-out RDL interposer) is another member of the CoWoS advanced packaging family. It uses a redistribution layer (RDL) interposer as the interconnect between System on Chip (SoC) dies and/or High Bandwidth Memory (HBM) to achieve heterogeneous integration. The RDL interposer consists of polymer and copper traces, making it relatively flexible. This improves the integrity of the C4 joints and allows the package to scale up in size to meet very complex functional requirements.
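For a sense of scale, the CoWoS-S interposer figures quoted above imply a per-reticle area that can be compared with a reference exposure field. The sketch below reads the quoted "~2700" as square millimeters; the 26 mm x 33 mm field size is an assumption brought in for comparison and does not come from the article.

```python
# Implied reticle area from the quoted CoWoS-S interposer figures.
interposer_area_mm2 = 2700   # quoted as "~2700", read here as mm^2
reticle_multiple = 3.3       # quoted "3.3X" reticle (photomask) size

implied_reticle_mm2 = interposer_area_mm2 / reticle_multiple

# Assumption: 26 mm x 33 mm is a commonly cited maximum lithography exposure field.
reference_field_mm2 = 26 * 33

print(f"Implied reticle area: {implied_reticle_mm2:.0f} mm^2")
print(f"Reference exposure field: {reference_field_mm2} mm^2")
```

The implied value lands somewhat below the reference field, which is unsurprising given that the quoted interposer area is an approximate, rounded figure.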
CoWoS-L, on the other hand, is one of the chip-last packages on the CoWoS (Chip on Wafer on Substrate) platform. It combines the advantages of CoWoS-S and InFO (Integrated Fan-Out) technologies, using an interposer with Local Silicon Interconnect (LSI) chips for die-to-die interconnect and RDL layers for power and signal delivery, providing the most flexible integration.
The main features of CoWoS-L include:
1. LSI chips provide high-routing-density die-to-die interconnect through multiple layers of sub-micron copper lines. LSI chips can feature a variety of connection architectures in each product, such as SoC to SoC, SoC to chiplet, and SoC to HBM, and can be reused across multiple products. The corresponding metal types, layer counts, and pitches are consistent with CoWoS-S offerings.
2. A molding-based interposer with wide-pitch RDL layers on both the front side and back side, together with Through InFO Vias (TIVs), carries signals and power with low high-frequency signal loss in high-speed transmission.
3. Additional elements, such as stand-alone embedded trench capacitors, can be integrated beneath the SoC die to improve power management.
In conclusion, Ming-Chi Kuo's analysis offers another perspective on the news that a major customer has cut CoWoS orders at Taiwan Semiconductor Manufacturing Co., Ltd. Sponsored ADR.
Ming-Chi Kuo stated that although the expansion of CoWoS-S is slowing down, CoWoS-R capacity is increasing. He also mentioned that, for Taiwan Semiconductor Manufacturing Co., Ltd. Sponsored ADR, the transition from B200 to B300 uses the same Front-End-of-Line (FEoL) process, and that Back-End-of-Line (BEoL) changes can be managed through engineering change orders (ECO).
Therefore, Taiwan Semiconductor Manufacturing Co., Ltd. Sponsored ADR considers them as the same product, and the timing of product transition is not important for Taiwan Semiconductor Manufacturing Co., Ltd. Sponsored ADR.