AI Inference Boom Pushes NVIDIA (NVDA.US) to Take On TPUs at Full Force: AI21 Labs Targeted After Deal for Groq's Core Team

Date: 07:48 31/12/2025
GMT Eight
TorchTPU is loosening CUDA's moat, and Ironwood is aimed squarely at the inference era. NVIDIA has brought in Groq's technology and core team to strengthen its inference capabilities and intends to acquire AI21 to strengthen its application layer, aiming to lock in a dominant position across the full stack with Groq plus AI21.
According to reports citing people familiar with the matter, NVIDIA Corporation (NVDA.US), the dominant AI chipmaker and the world's most valuable company, is in advanced talks to acquire AI21 Labs, a leading artificial intelligence company headquartered in Israel, for $2 billion to $3 billion. The talks follow the $20 billion non-exclusive licensing agreement that NVIDIA recently reached with AI chip startup Groq, under which Groq licenses its AI inference technology to NVIDIA and Groq's founder and core research team join NVIDIA. Together, the moves highlight the global wave of AI inference and the growing competitive pressure from Google's TPU computing clusters. NVIDIA is working to defend its roughly 80% share of the AI chip market through a strategy of "multi-architecture AI computing + consolidation of the CUDA ecosystem + recruitment of more AI chip design talent."

NVIDIA aims to solidify its dominance of the AI stack with Groq and AI21.

AI21 Labs focuses on developing large language models (LLMs) that let companies quickly build customized, enterprise-grade generative AI applications similar to ChatGPT, and it holds a significant position in the enterprise AI ecosystem. The company was co-founded in 2017 by Amnon Shashua, who is also the co-founder and CEO of autonomous driving leader Mobileye (MBLY.US). AI21 Labs was valued at approximately $1.4 billion after a 2023 funding round backed by NVIDIA (NVDA.US) and Alphabet (GOOGL.US). The company has around 200 employees, many of whom hold advanced degrees in STEM fields and have extensive experience in AI application development.
This suggests that NVIDIA may value the combined expertise of AI21's top employees rather than just the company's technology.

Jensen Huang increasingly favors Israeli technology companies.

In recent years, NVIDIA has been actively acquiring leading Israeli technology companies. In December 2024, it completed the acquisition of Run:ai, whose proprietary technology improves the overall utilization of NVIDIA's AI chips and reduces the number of GPUs needed to complete a given workload. The terms of the transaction were not disclosed, but media reports put the price at approximately $700 million. In the same year, NVIDIA also acquired Deci, a company whose proprietary technology optimizes large AI models so that they run more efficiently and at lower cost. In 2019, NVIDIA agreed to acquire Mellanox, then Israel's hottest technology company, for $6.9 billion. Mellanox is a leading end-to-end provider of Ethernet and InfiniBand high-speed interconnect solutions and networking services for servers and storage, and its core technology underpins NVIDIA's "InfiniBand + Spectrum-X/Ethernet" high-performance networking infrastructure.

NVIDIA is reportedly planning to build a large research and development center in Kiryat Tivon, south of Haifa, Israel. According to media reports, CEO and co-founder Jensen Huang has called Israel the "second home" of the chip company he leads. NVIDIA said the completed campus, inspired by its global headquarters in Santa Clara, California, will include up to 160,000 square meters (approximately 1.7 million square feet) of office space, parks, and public areas on a 90-dunam (22-acre) site.
NVIDIA expects construction to begin in 2027 and the campus to be operational by 2031.

As the AI inference wave builds, NVIDIA faces growing competitive pressure from Google's TPU.

NVIDIA's AI GPUs hold a near monopoly on AI training, which demands general-purpose computing clusters and rapid iteration across the whole computing system. AI inference, by contrast, is about cost per token, latency, and efficiency once cutting-edge AI is deployed at scale. Google explicitly positions Ironwood as a TPU generation "born for the AI inference era," emphasizing performance, efficiency, cluster cost-effectiveness, and scalability. As AI inference becomes a long-term cash cost center for technology companies worldwide, customers grow more willing to choose cheaper and more efficient AI ASIC accelerators in the cloud. Media reports previously said that OpenAI was renting TPUs (which follow the AI ASIC technology route) at scale through Google Cloud, with reducing AI inference costs as one of its core motives, a typical example of rising competitive pressure from TPUs.

According to SemiAnalysis estimates, Google's latest TPU v7 (Ironwood) represents a striking generational leap: its BF16 compute reaches 4,614 TFLOPS, versus 459 TFLOPS for the widely used TPU v5p, roughly a tenfold (one order of magnitude) improvement. TPU v7 is also directly comparable to NVIDIA's Blackwell-architecture B200. For specific AI workloads, the more cost-effective AI ASIC architecture has an advantage in performance per dollar: on mainstream inference workloads, the TPU is said to deliver about 1.4 times the performance per dollar of Blackwell.
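For readers who want to check the arithmetic behind the figures quoted above, a minimal Python sketch follows. It uses only the numbers cited in the article; the function names are illustrative, and reading "1.4 times" as a plain 1.4x ratio (rather than 1.4x on top of parity) is an assumption.

```python
# Figures as quoted in the article (attributed there to SemiAnalysis).
TPU_V7_BF16_TFLOPS = 4614.0
TPU_V5P_BF16_TFLOPS = 459.0
# Assumption: "1.4 times" is read as a 1.4x performance-per-dollar ratio.
TPU_VS_BLACKWELL_PERF_PER_DOLLAR = 1.4


def generational_speedup(new_tflops: float, old_tflops: float) -> float:
    """Ratio of peak compute between two chip generations."""
    return new_tflops / old_tflops


def relative_spend_at_equal_throughput(perf_per_dollar_ratio: float) -> float:
    """Fraction of the baseline spend needed to match baseline throughput."""
    return 1.0 / perf_per_dollar_ratio


speedup = generational_speedup(TPU_V7_BF16_TFLOPS, TPU_V5P_BF16_TFLOPS)
spend = relative_spend_at_equal_throughput(TPU_VS_BLACKWELL_PERF_PER_DOLLAR)

print(f"TPU v7 vs. v5p peak BF16 compute: {speedup:.2f}x")   # 10.05x
print(f"Relative spend at equal throughput: {spend:.0%}")     # 71%
```

On these quoted figures, the generational gap works out to just over 10x, and a 1.4x performance-per-dollar edge would imply roughly 29% lower spend for the same inference throughput. Peak TFLOPS and performance per dollar are headline metrics, so real workloads may differ.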
In the past, one of the core constraints on TPU computing clusters was the developer stack and the engineering inertia around CUDA and PyTorch. As Google pushes TorchTPU and wins the cooperation of key ecosystem participants, TPUs become more accessible to outside developers, which lets them spread faster and more widely in large-scale inference scenarios and gives them a competitive edge against NVIDIA's AI GPU systems and the CUDA ecosystem. With the global AI inference wave centered on "who can deploy large AI models at the lowest cost and lowest latency," TPUs are advancing on three fronts at once: "inference-specific hardware + cloud delivery + reduced software friction." NVIDIA's deals for Groq (inference-specific capabilities and talent) and AI21 (model and enterprise application stack, plus talent) are direct responses to this competitive trend.

Demand for hyperscale AI inference is currently doubling roughly every six months. Against that backdrop, and under growing pressure from Google's TPU, NVIDIA aims to secure inference chip design ideas and top talent through Groq and to add software and model-side capabilities through AI21, a typical response along the competitive line of "diversified hardware technology routes + end-to-end binding of the AI application ecosystem."

In essence, NVIDIA's transaction with Groq is a non-exclusive license for inference-focused AI chip technology, combined with the hiring of Groq founder and CEO Jonathan Ross, some key executives, and part of the engineering team. Some semiconductor industry analysts have highlighted Groq's proprietary chip technology, which focuses on inference and reduces data-movement bottlenecks, targeting the cost and latency pain points of the inference stage.
NVIDIA is in advanced negotiations to acquire AI21 Labs for $2 billion to $3 billion, and the latest media reports suggest that NVIDIA may value its roughly 200 highly skilled AI staff and its enterprise generative AI capabilities. If NVIDIA successfully integrates AI21 Labs' large-model development and enterprise application capabilities, it will be better placed to lock customers into its "software/platform/solutions/NVIDIA ecosystem" during the global inference boom, rather than remaining only an AI GPU supplier. That would embed NVIDIA more deeply in enterprise customers' "model-application-deployment" AI ecosystems, reinforce CUDA's moat, and significantly increase the stickiness and bargaining power of NVIDIA's AI computing clusters, reducing the risk that future inference workloads shift to in-house AI ASIC chips or TPU-style alternatives.