Big chips, rising again?
Against the triple backdrop of Moore's Law slowing down, advanced packaging picking up the baton, and AI scenarios fragmenting, the seemingly niche technology route of wafer-level integration is unexpectedly redefining the boundary of "large."
At the beginning of 2026, the AI chip field delivered two heavyweight pieces of news:
Elon Musk confirmed on social media that Tesla, Inc. (TSLA.US) has restarted the Dojo 3 supercomputer project, saying Tesla, Inc. will become the world's largest AI chip manufacturer;
Cerebras Systems, another important player in the AI chip industry, signed a multi-year purchase agreement with OpenAI worth more than 10 billion US dollars, promising to deliver 750 megawatts of computing power to be brought online in stages before 2028.
One is the "resurrection" of self-developed training chips, and the other is a commercial breakthrough in wafer-level systems - behind these two different news, the "big chip", once considered a different technology route, once again stands in the spotlight.
The divide between two kinds of big chip
In the history of AI chip evolution, "big chip" has never been a precise technical term; it is more of a catch-all for two completely different designs.
One, represented by Cerebras, is wafer-level single-chip integration; the other, like Tesla, Inc.'s Dojo, is a "wafer-level system" that sits between a single chip and a GPU cluster. The former trades engineering complexity for architectural simplicity, building a single processor out of a full 300mm wafer, while the latter takes a middle path, using advanced packaging to integrate multiple pre-tested chips into a system that behaves like a single chip.
The root of this divide lies in two different answers to the twin pain points of the "memory wall" and the "interconnect bottleneck."
Under the traditional GPU architecture, the separation of processor and memory forces data to shuttle continuously between HBM and the compute cores. According to technical literature, from the A100 to the H100, NVIDIA Corporation's compute throughput rose roughly 6 times, but memory bandwidth grew only 1.7 times. This imbalance has shifted the dominant factor in training time from compute power to memory bandwidth. Multi-GPU systems amplify the overhead - even though NVLink 6.0 has pushed per-GPU bandwidth to 3.6 TB/s, chip-to-chip latency remains hundreds of times higher than that of on-chip interconnects.
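To make the imbalance concrete, here is a minimal roofline-style sketch in Python. The two machine profiles echo the rough A100/H100 figures cited above, but the workload size is a hypothetical stand-in, not a vendor specification:

```python
# Roofline-style estimate: a training step can go no faster than the slower of
# its compute time and its memory-traffic time. All numbers are illustrative.

def step_time(flops, bytes_moved, peak_flops, peak_bw):
    """Lower bound on step time (seconds) under a simple roofline model."""
    return max(flops / peak_flops, bytes_moved / peak_bw)

# Hypothetical workload: 1 PFLOP of math touching 2 TB of data per step.
flops, bytes_moved = 1e15, 2e12

# "Previous-generation" vs "current-generation" profiles: compute grows ~6x,
# memory bandwidth only ~1.7x (the imbalance described above).
gen1 = step_time(flops, bytes_moved, peak_flops=312e12, peak_bw=2.0e12)
gen2 = step_time(flops, bytes_moved, peak_flops=1979e12, peak_bw=3.35e12)

print(f"gen1 step time: {gen1:.2f} s  (compute-limited)")
print(f"gen2 step time: {gen2:.2f} s  (now memory-limited)")
# Compute time shrank ~6x, yet the step only got ~5x faster because
# memory traffic, not arithmetic, now sets the floor.
```

In this toy model, the newer profile's arithmetic is roughly six times faster, but the step is now capped by memory bandwidth, which is exactly why the bottleneck moves from compute to memory.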
The Cerebras WSE-3, released in 2024, answers with 4 trillion transistors, 900,000 AI cores, and 44GB of on-chip SRAM: integrate computation and storage onto a single piece of silicon so that data never has to leave the chip. Its on-chip interconnect bandwidth reaches 214 Pbps, 37 times that of an NVIDIA Corporation H100 system, and its memory bandwidth reaches 21 PB/s, 880 times that of the H100. This extreme integration density yields extreme performance: 1,800 tokens/s on the Llama 3.1 8B model, versus roughly 242 tokens/s for the H100.
However, this extreme approach brings extreme engineering challenges. Yield works against a whole-wafer chip: the larger the area, the higher the chance of a defect, rising roughly exponentially with area. Cerebras' breakthrough was to shrink each AI core to 0.05 square millimeters - about 1% the size of an H100 SM - and to bypass defective regions through redundant design and intelligent routing. This ant-colony-style fault tolerance lets a flawed wafer still deliver its full performance, but at the cost of specialized firmware mapping and complex cooling: the WSE-3's 23kW power draw requires a custom liquid-cooling system with a hybrid coolant.
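A simple Poisson yield model makes the logic of this fault tolerance visible. The sketch below is illustrative only: the defect density is a made-up value, and the wafer area is approximately the WSE-3's quoted silicon area rather than an official yield figure.

```python
import math

# Illustrative Poisson yield model: P(unit is defect-free) = exp(-D * A),
# where D is defect density (defects per mm^2) and A is the unit's area (mm^2).
D = 0.001  # hypothetical defect density, for illustration only

def clean_probability(area_mm2, defect_density=D):
    """Probability that a region of the given area contains no defect."""
    return math.exp(-defect_density * area_mm2)

# A monolithic ~800 mm^2 reticle-sized die must be entirely defect-free...
big_die = clean_probability(800)
# ...whereas a 0.05 mm^2 core almost never contains a defect, so a wafer with
# ~900,000 such cores can simply route around the handful of bad ones.
tiny_core = clean_probability(0.05)

wafer_area_mm2 = 46_225          # roughly the WSE-3's silicon area
expected_defects = D * wafer_area_mm2

print(f"P(800 mm^2 die is clean)   = {big_die:.3f}")
print(f"P(0.05 mm^2 core is clean) = {tiny_core:.6f}")
print(f"Expected defects per wafer = {expected_defects:.0f} "
      f"(a tiny fraction of 900,000 cores, so they can be mapped out)")
```

Under these assumed numbers, a conventional reticle-sized die is clean less than half the time, while a wafer full of tiny redundant cores loses only a few dozen of them - which is the whole point of shrinking the unit of redundancy.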
By comparison, Tesla, Inc.'s Dojo follows the wafer-level system route in between. The D1 chip itself is only 645 square millimeters, but 25 of them are arranged in a 5x5 array on a carrier, using Taiwan Semiconductor Manufacturing Co., Ltd.'s InFO packaging to achieve high-density interconnection, so the 25 chips work together like a single processor. The design avoids the yield risk of a single wafer-sized chip - each D1 can be pre-tested - and to some extent mitigates the interconnect bottleneck of a multi-chip system, with chip-to-chip latency of only about 100 nanoseconds, far below the millisecond level of traditional GPU clusters.
Tesla, Inc.'s pragmatic shift
In August 2025, Bloomberg reported that Tesla, Inc. disbanded the Dojo supercomputing team, which was once seen as the end of the self-developed training chip route. But only six months later, Dojo was restarted, and the logic behind it had fundamentally changed.
Musk revealed on social media that the AI5 chip design is in good shape and that Tesla, Inc. will restart work on Dojo 3 using the AI6 or AI7 chip. The goal is no longer to train autonomous-driving models on Earth, but to focus on "AI computing in space."
This shift is intriguing. Dojo was originally positioned as a general-purpose training platform pitted against 100,000 H100s, with Morgan Stanley estimating it could add $500 billion of incremental value to Tesla, Inc. In reality, with core team members leaving one after another, the project stalled at the end of 2024, and Tesla, Inc. instead bought the equivalent of 67,000 H100s to build its Cortex cluster. The reason is not hard to understand: however strong the D1's theoretical performance, the key to training chips is not the performance of an individual chip.
NVIDIA Corporation's moat is more than a decade of CUDA ecosystem accumulation, locked-in CoWoS advanced packaging capacity, and deep ties with the HBM supply chain. By contrast, even if Tesla, Inc.'s self-developed Dojo 2 had taped out successfully, it would still have needed several years to catch up on software adaptation, cluster scheduling, and reliability engineering, by which time NVIDIA Corporation would have iterated through another two or three product generations.
Today, Tesla, Inc. chooses to buy in training capacity and develop inference chips in-house - essentially a recalculation of opportunity cost. Musk has said it is unreasonable for Tesla, Inc. to spread its resources across two completely different AI chip designs, and that the AI5, AI6, and subsequent chips will be excellent at inference and at least quite good at training. The AI5 chip uses a 3nm process, is manufactured by Taiwan Semiconductor Manufacturing Co., Ltd., and is expected to reach mass production by the end of 2026, with single-chip performance approaching NVIDIA Corporation's Hopper level and a dual-chip configuration approaching the Blackwell architecture.
The more critical change is the shift in strategic focus. Dojo 3 is no longer a general-purpose training platform pitted against GPU clusters; it is aimed at computing deployed in space. Musk plans to finance this vision through SpaceX's future IPO, using Starship to deploy compute satellites that can operate under continuous sunlight.
The cleverness of this positioning is that space computing, as an emerging track, has none of NVIDIA Corporation's ecosystem barriers; it does not require a head-on confrontation with the mature GPU ecosystem, but opens up entirely new application scenarios. In November 2025, Starcloud, an NVIDIA Corporation-backed startup, launched an H100 into space for the first time; three days later, Alphabet Inc. announced plans to deploy TPUs in space by early 2027. The space computing race has only just begun.
Even with the restart, however, challenges remain. Tesla, Inc. has reportedly awarded the Dojo 3 chip manufacturing contract to Samsung, with the packaging work taken over by Intel Corporation. The supply-chain adjustment reflects the reality that Taiwan Semiconductor Manufacturing Co., Ltd., with its capacity saturated, could not give Dojo 3 priority support - and it also exposes Tesla, Inc.'s weak position in competing for foundry capacity.
Cerebras' precise positioning
If Tesla, Inc.'s Dojo is a repositioning through trial and error, the $10 billion collaboration between Cerebras and OpenAI is precise positioning on the eve of an inference breakthrough. OpenAI has committed to purchasing up to 750 megawatts of computing power from Cerebras by 2028, in a deal worth more than $10 billion. The key to the order is that OpenAI is willing to pay a premium for what it calls "ultra-low-latency inference."
Barclays research predicts that AI inference will eventually account for more than 70% of total AI computing demand, and that demand for inference compute could exceed training compute by a factor of 4.5. As ChatGPT and other generative AI applications shift from "train once, deploy many times" to "continuous inference, real-time interaction," the value of low-latency inference has risen sharply. OpenAI infrastructure lead Sachin Katti has said that when AI responds in real time, users do more, stay longer, and run more high-value workloads.
Cerebras' distinctive speed comes from integrating huge amounts of compute, memory, and bandwidth on a single giant chip, eliminating the bottlenecks that slow inference on traditional hardware. The architectural advantage translates into striking performance gaps in practice: the Cerebras WSE-3 has been reported to run 210 times faster than the H100 on carbon-capture simulations and 20 times faster on AI inference. If Cerebras can keep delivering sub-second responses at scale, it could cut infrastructure costs and open the door to richer, more conversational applications built on streaming responses.
This commercial breakthrough did not come easily, however. In the first half of 2024, 87% of Cerebras' revenue came from G42 in the UAE, and the over-reliance on a single customer hindered its IPO plans. In October 2024, Cerebras withdrew its IPO application but continued to raise funds; the latest reports say the company is negotiating a new round of roughly $1 billion at a valuation of about $22 billion. The OpenAI order is larger than Cerebras' current valuation, making OpenAI its largest - and perhaps only - major customer. The relationship is seen as both a commercial breakthrough and a potential risk.
Insiders believe that if OpenAI's finances were stronger, it might follow other tech giants and acquire Cerebras outright, engineering talent and operational infrastructure included; the current cooperation structure is based more on financial reality than on strategic intent. OpenAI CEO Sam Altman personally invested in Cerebras as early as 2017, and Elon Musk tried to acquire Cerebras and fold it into Tesla, Inc. in 2018. These historical entanglements make the current partnership all the more delicate.
The deal also fits into a broader diversification of OpenAI's compute supply chain. In 2025, OpenAI signed agreements with NVIDIA Corporation, AMD, and Broadcom Inc.; in September, NVIDIA Corporation pledged $100 billion to support OpenAI and to build at least 10 gigawatts of NVIDIA Corporation systems, equivalent to 4 to 5 million GPUs. OpenAI's CEO has said that computing scale is highly correlated with revenue growth, but that the availability of compute has become one of the most important limits on further growth. In this context, Cerebras offers a differentiated option - a dedicated system optimized for low-latency inference.
Analyst Neil Shah points out that this forces hyperscalers to diversify their computing systems: NVIDIA Corporation GPUs for general AI workloads, in-house AI accelerators for highly optimized tasks, and systems like Cerebras for specialized low-latency workloads. The fragmentation of inference scenarios - from dialogue generation to code completion to image rendering - means no single chip architecture can cover everything, and that is exactly where dedicated accelerators earn their keep.
Cracks and opportunities in the ecosystem moat
Neither Cerebras nor Tesla, Inc. can avoid the ultimate question: in today's increasingly fierce competition, how much room is left for the big-chip route to survive?
The AI chip market has long been crowded. In June of last year, AMD introduced the MI350X and MI355X GPUs, with training and inference performance matching or exceeding the B200; in January of this year, NVIDIA Corporation unveiled the Rubin platform at CES. Both vendors are iterating at a staggering pace.
As the GPU market moves toward an oligopoly, the window for a third technology route is narrowing fast - why take the risk of betting on an immature wafer-level system when customers can hedge against NVIDIA Corporation with another general-purpose GPU maker such as AMD?
Cerebras' strategy is to compete on a different axis altogether. The CS-3 system is positioned not as a training platform but as a leading inference machine, pushing inference latency to the extreme and simplifying the software stack. The cleverness of this positioning is that the inference market is only beginning to explode, ecosystem lock-in is weaker there than on the training side, and the diversity of inference tasks leaves room for specialized architectures. The $10 billion OpenAI order essentially validates this business logic with real money: when inference costs make up a large share of operating expenses, a 15-fold performance improvement is enough to reshape the supplier landscape.
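To see why a throughput edge can reshape a supplier landscape, here is a purely illustrative cost-per-token sketch; the hourly prices, throughputs, and the 15x multiple are hypothetical assumptions, not figures from Cerebras or OpenAI:

```python
# Purely illustrative cost-per-token sketch; every number below is hypothetical.

def cost_per_million_tokens(system_cost_per_hour, tokens_per_second):
    """Amortized serving cost per million tokens for a system."""
    tokens_per_hour = tokens_per_second * 3600
    return system_cost_per_hour / tokens_per_hour * 1e6

# Hypothetical baseline GPU server: $10/hour, 240 tokens/s on one model.
gpu = cost_per_million_tokens(10.0, 240)

# Hypothetical wafer-scale system: 5x the hourly cost, 15x the throughput.
wafer = cost_per_million_tokens(50.0, 240 * 15)

print(f"GPU baseline: ${gpu:.2f} per million tokens")
print(f"Wafer-scale:  ${wafer:.2f} per million tokens")
print(f"Cost ratio:   {gpu / wafer:.1f}x cheaper per token despite pricier hardware")
```

Under these assumptions, even hardware that costs several times more per hour ends up roughly three times cheaper per token, which is the arithmetic behind paying a premium for speed once inference dominates operating expenses.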
Tesla, Inc., meanwhile, is betting on advanced packaging. Taiwan Semiconductor Manufacturing Co., Ltd. plans a wafer-scale extension of its CoWoS packaging for 2027 that is expected to deliver roughly 40 times the computing power of existing systems, with silicon area exceeding 40 reticles and room for more than 60 HBM stacks - a process route almost tailor-made for wafer-level integration.
When packaging allows dozens of pre-tested logic dies and dozens of HBM stacks to be integrated on a single substrate, the line between the traditional "big chip" and "chiplet interconnection" blurs. Tesla, Inc.'s earlier D2 chip already took this path, using CoWoS packaging to approach wafer-level performance while avoiding the risks of a single wafer-sized chip; the future Dojo 3 may well continue down it.
Redefining the boundaries of "large"
The big chip is back in view, but the boundary of "large" seems to have quietly shifted.
First, there is "large" in physical size. Cerebras' single chip spanning a full wafer remains a technological marvel, but its commercial value is confined to specific scenarios. A Cerebras WSE system costs roughly $2-3 million and has been deployed at Argonne National Laboratory, the Mayo Clinic, and the Condor Galaxy facility built with G42. It will not replace GPUs as a general-purpose training platform, but it can open new fronts in inference, scientific computing, and other latency-sensitive fields.
Second, there is "large" in system integration - whether Tesla, Inc.'s wafer-level packaging or NVIDIA Corporation's GB200 NVL72 rack-scale solution - and this is becoming the mainstream. A SEMI report shows global wafer fab equipment spending reaching $110 billion in 2025 and growing 18% to $130 billion in 2026, with the logic and micro segment a key driver of investment in advanced technologies such as 2nm processes and backside power delivery. The evolution of Taiwan Semiconductor Manufacturing Co., Ltd.'s CoWoS roadmap, the standardization of HBM4, and the spread of the UCIe interconnect protocol are all pushing chiplet-based heterogeneous integration toward system-level single-chip integration.
Finally, there is "large" in business models - and this is the real watershed. The OpenAI-Cerebras collaboration is widely seen as another case of a leading tech company absorbing a promising AI chip startup, folding it into a dominant ecosystem either through outright acquisition or through an exclusive, large-scale commercial partnership. SambaNova, Groq, and Cerebras each took different technical approaches and were for years seen as the few niche challengers able to compete with the industry leaders on specific workloads. But with competition intensifying and customer uptake limited, many such startups have struggled to move beyond pilot deployments with major customers.
The halt and restart of Tesla, Inc.'s Dojo is, in essence, an expensive business experiment: it showed that a full-stack, self-developed training chip is not a path non-cloud giants can replicate, while preserving a technical reserve for self-controlled inference. The union of Cerebras and OpenAI, by contrast, is precise positioning on the eve of an inference breakthrough, trading the extreme performance of a wafer-level architecture for pricing power in a vertical niche.
Against the backdrop of Moore's Law slowing down, advanced packaging picking up the baton, and AI scenarios fragmenting, the seemingly niche route of wafer-level integration is redefining the boundary of "large" in unexpected ways.
Neither player is trying to replicate NVIDIA Corporation's success; both are looking for the value niches that general-purpose solutions overlook in the AI computing landscape. In that sense, this is not a binary story of rise or fall, but a persistent struggle over how to survive in the shadow of giants and eventually carve out new territory.
This article was selected from the official WeChat account "Observation of the Semiconductor Industry." GMTEight Editor: Chen Yufeng.