The AI inference wave is coming, and NVIDIA Corporation (NVDA.US) is gearing up to take on the TPU: after acquiring the Groq core team, it has set its sights on AI21 Labs.

07:48 31/12/2025
GMT Eight
TorchTPU loosens CUDA's moat and Ironwood points squarely at the inference era: NVIDIA acquires Groq's core team to shore up its inference capabilities, intends to acquire AI21 to shore up its applications, and aims to lock in full-stack dominance with Groq + AI21.
Some media outlets, citing informed sources, reported that after spending 20 billion U.S. dollars to acquire the Groq core team, the AI chip superpower and world's most valuable company NVIDIA Corporation (NVDA.US) is now in advanced negotiations to acquire AI21 Labs, a leading Israel-based artificial intelligence company, for 2 billion to 3 billion U.S. dollars. NVIDIA Corporation's pursuit of AI21 Labs follows a recent non-exclusive licensing agreement, worth roughly 20 billion dollars, with AI chip startup Groq, under which Groq licenses its AI inference technology to NVIDIA Corporation and Groq's founder and core research team will join NVIDIA Corporation once the transaction is completed. The move underscores the approaching "global AI inference wave" and the growing competitive pressure from Alphabet Inc. Class C's TPU AI computing clusters. By combining multiple AI chip architectures, consolidating the CUDA ecosystem, and bringing in more AI chip design talent, NVIDIA Corporation is striving to maintain its absolute dominance in the AI chip field, where it holds roughly an 80% market share, and intends to use the Groq and AI21 deals to secure its dominance across the full AI stack.

It is understood that the Israeli AI startup AI21 Labs focuses on developing large language models (LLMs) that let enterprises quickly build customized, enterprise-grade generative AI applications in the vein of ChatGPT, giving it an important position in the enterprise AI ecosystem. The company was co-founded in 2017 by Amnon Shashua, who is also the co-founder and CEO of Mobileye (MBLY.US), the leading player in autonomous driving. According to media reports, following a 2023 funding round led by NVIDIA Corporation (NVDA.US) and Alphabet Inc. Class C (GOOGL.US), the company's post-money valuation was approximately 1.4 billion dollars. AI21 has around 200 employees, many of whom hold advanced degrees in science and engineering and have deep experience in AI application development, which suggests that NVIDIA Corporation may value the comprehensive AI skills of AI21's top talent at least as much as the company's technology itself.

Jensen Huang increasingly favors Israeli tech companies

In recent years, NVIDIA Corporation has been actively acquiring top Israeli tech companies. In December 2024, for example, NVIDIA Corporation formally completed its acquisition of Run:ai, whose proprietary technology can significantly improve the overall efficiency of NVIDIA Corporation's AI chips and reduce the number of GPUs required to complete a given task. Terms of the deal were not disclosed, but earlier media reports put the price at approximately 700 million dollars. In the same year, NVIDIA Corporation also acquired Deci, whose proprietary technology deeply optimizes large AI models so that they run at lower cost and higher efficiency. And in 2019, NVIDIA Corporation paid 6.9 billion dollars for Mellanox, then the hottest tech company in Israel and a leading provider of end-to-end Ethernet and InfiniBand high-speed intelligent interconnect solutions and network services for servers and storage; Mellanox's core technology underpins NVIDIA Corporation's current industry-leading "InfiniBand + Spectrum-X/Ethernet" high-performance network infrastructure.
According to media reports, NVIDIA Corporation is currently building a large research and development center in Kiryat Tivon, south of Haifa, Israel. Jensen Huang, NVIDIA Corporation's co-founder and CEO, has previously been reported as calling Israel the chip company's "second home". NVIDIA Corporation has said that, once completed, the research park will include up to 160,000 square meters (approximately 1.7 million square feet) of office space, along with parks and public areas, on a site of 90 dunams (22 acres), with a design inspired by NVIDIA Corporation's global headquarters in Santa Clara, California. NVIDIA Corporation expects construction to begin in 2027 and the campus to be operational by 2031.

The AI inference wave is coming, and NVIDIA Corporation feels mounting competitive pressure from Alphabet Inc. Class C's TPU

NVIDIA Corporation's AI GPUs all but monopolize the AI training side, where the computing system as a whole must offer broad AI computing cluster versatility and rapid iteration; the AI inference side, by contrast, prizes unit cost, latency, and energy efficiency once cutting-edge AI technologies are deployed at scale. Alphabet Inc. Class C has explicitly positioned Ironwood as the TPU generation "created for the AI inference era", emphasizing performance, efficiency, computing cluster cost-effectiveness, and scalability. As AI inference computing systems become a long-term cash cost center for global tech companies, customers are increasingly inclined to choose more cost-effective, competitive AI ASIC accelerators in the cloud. Media reports have noted that OpenAI has increasingly rented TPUs (Alphabet Inc. Class C's TPU belongs to the AI ASIC technology route) on Google Cloud to lower its AI inference costs - a typical case of the rising competitive pressure from the TPU.

According to SemiAnalysis's calculations, Alphabet Inc. Class C's latest TPU v7 (Ironwood) represents a remarkable generational leap: TPU v7's BF16 compute reaches as high as 4614 TFLOPS, versus only 459 TFLOPS for the widely used previous-generation TPU v5p - roughly a tenfold improvement. TPU v7 is also benchmarked directly against NVIDIA Corporation's Blackwell-architecture B200: in specific AI application scenarios, AI ASIC architectures with cost and energy-efficiency advantages can more readily handle mainstream inference loads, with the TPU reportedly delivering as much as 1.4 times the performance per dollar of Blackwell. One of the core constraints on the TPU AI computing cluster has been the developer stack and engineering inertia (CUDA/PyTorch); as Alphabet Inc. Class C pushes TorchTPU forward and wins the cooperation of key ecosystem participants, the TPU becomes more accessible to external developers, allowing its advantages in massive inference scenarios over NVIDIA Corporation's AI GPU computing systems and the NVIDIA CUDA ecosystem to spread faster and more fully.
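To put the cited figures in perspective, here is a small illustrative calculation. It is only a sketch based on the numbers quoted above; reading the 1.4x performance-per-dollar claim as an equal-throughput cost ratio is a simplifying assumption, not a vendor statement.

```python
# Back-of-envelope check of the figures cited above; illustrative only.
tpu_v7_bf16_tflops = 4614.0   # TPU v7 (Ironwood) BF16 throughput, as cited
tpu_v5p_bf16_tflops = 459.0   # TPU v5p BF16 throughput, as cited

leap = tpu_v7_bf16_tflops / tpu_v5p_bf16_tflops
print(f"Generational leap, v7 vs v5p: {leap:.1f}x")  # ~10.1x

# Reading the "1.4x performance per dollar vs. Blackwell" claim the other way:
# at equal inference throughput, the bill would be roughly 1/1.4 of the baseline.
perf_per_dollar_ratio = 1.4
print(f"Implied cost at equal throughput: {1 / perf_per_dollar_ratio:.0%} of baseline")  # ~71%
```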
As the global AI inference wave shifts the question from "who can train the most powerful language model" to "who can deploy large AI models at scale with the lowest cost and latency", the TPU camp is actively pushing "dedicated inference hardware + cloud delivery + reduced software friction"; by reinforcing itself with Groq (inference-specific capability and talent) and AI21 (model and enterprise-application stack and talent), NVIDIA Corporation is responding directly to this competitive trend. With massive AI inference demand now growing rapidly every six months, and against the backdrop of the sweeping inference wave and mounting competitive pressure from Alphabet Inc. Class C's TPU, NVIDIA Corporation's Groq deal is aimed at capturing inference-chip concepts and top talent, while AI21 is meant to reinforce its software and model-side capabilities. This strategy of "diversifying hardware technology routes + binding the AI application ecosystem end to end" is both a defensive response and a counterattack.

The essence of the deal between NVIDIA Corporation and AI chip startup Groq is a non-exclusive license of inference-oriented AI chip technology plus the absorption of Groq founder and CEO Jonathan Ross together with key executives and engineering teams; some semiconductor industry analysts have stressed that Groq's proprietary chip technology focuses on inference and reduces data-transfer bottlenecks through on-chip SRAM, directly addressing the cost and latency pain points of the inference stage, as sketched below.

NVIDIA Corporation is currently in advanced negotiations to acquire AI21 Labs for 2 billion to 3 billion dollars, and the latest media reports suggest that NVIDIA Corporation may be valuing its roughly 200 top-tier AI talents and its enterprise-grade generative AI capabilities. If NVIDIA Corporation succeeds in folding in AI21 Labs' large-model development and enterprise-application capabilities, it will be far better placed, during the global AI inference boom, to lock customers into its own "software/platform/solution/NVIDIA ecosystem" rather than remain merely an AI GPU supplier. It is also likely to embed itself more deeply in enterprise customers' "model-application-deployment" AI ecosystems, thereby solidifying the CUDA moat and significantly enhancing the stickiness and bargaining power of NVIDIA Corporation's AI computing clusters (and heading off the future diversion of inference computing to self-developed AI ASIC chips or alternative computing power such as the TPU).
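To illustrate why on-chip SRAM matters for the latency pain point the analysts cite, here is a minimal back-of-envelope sketch. It assumes the common view that autoregressive decoding is memory-bandwidth-bound; the model size and bandwidth figures are illustrative assumptions, not Groq or NVIDIA specifications.

```python
# Rough model: in memory-bound autoregressive decoding, each generated token
# re-reads the model weights, so time_per_token ~ bytes_read / memory_bandwidth.
# All numbers below are illustrative assumptions, not vendor specifications.

def decode_tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on single-stream decode speed for a memory-bound workload."""
    return bandwidth_bytes_per_s / model_bytes

model_bytes = 14e9        # e.g. a ~7B-parameter model at 2 bytes per weight (assumed)
hbm_bandwidth = 3.4e12    # illustrative HBM-class bandwidth, bytes/s
sram_bandwidth = 80e12    # illustrative aggregate on-chip SRAM bandwidth, bytes/s

print(f"HBM-bound decode:  ~{decode_tokens_per_second(model_bytes, hbm_bandwidth):.0f} tokens/s per stream")
print(f"SRAM-bound decode: ~{decode_tokens_per_second(model_bytes, sram_bandwidth):.0f} tokens/s per stream")
```

The only point of the sketch is that raising effective memory bandwidth (for example, by keeping weights in on-chip SRAM spread across many chips) directly raises the ceiling on per-stream decode speed, which is exactly the cost-and-latency bottleneck of the inference stage discussed above.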