Jensen Huang's GTC speech: The era of reasoning is coming, demand will reach at least one trillion US dollars by 2027, and OpenClaw is the new operating system.

GMT Eight | 08:34, 17/03/2026
At the GTC 2026 conference, NVIDIA CEO Jensen Huang positioned the company as a builder of "AI factories," stating that "by 2027, we will see at least $1 trillion of high-confidence demand." He asserted that agents will end the traditional SaaS model and that, in the future, "annual salary + token budget" will become the new standard in the workplace.
On March 16, 2026, the NVIDIA GTC 2026 conference officially opened, and NVIDIA Corporation (NVDA.US) founder and CEO Jensen Huang delivered the keynote speech. At this conference, known as the "annual pilgrimage of the AI industry," Huang explained NVIDIA's transformation from a "chip company" into an "AI infrastructure and factory company." Addressing the market's concerns about the sustainability of its growth, he detailed the business logic he expects to drive future demand: "Token Factory Economics."

The performance outlook is extremely optimistic, with projected demand of "at least $1 trillion by 2027." Over the past two years, global AI computing demand has exploded exponentially. As large models evolve from "perception" and "generation" to "reasoning" and "action" (task execution), their consumption of computing power has increased dramatically. Responding to the market's intense focus on orders and the revenue ceiling, Huang offered remarkably strong guidance: "Last year at this time, I said we saw $500 billion in high-confidence demand covering Blackwell and Rubin until 2026. Now, right here, right now, I see at least $1 trillion in demand by 2027."

Huang's trillion-dollar forecast briefly lifted NVIDIA's stock price by over 4.3%. He added: "Is this reasonable? That's what I'm going to talk about next. In fact, we will have even more demand than that. I am certain that actual computing demand will be much higher." Huang argued that NVIDIA's systems have proven to be the world's "most cost-effective infrastructure": because they can run AI models in almost every field, the $1 trillion invested by customers can be fully utilized over a long lifecycle.
Currently, 60% of NVIDIA's business comes from the top five large cloud service providers, while the remaining 40% is widely distributed across sovereign cloud, enterprise, industrial, robotics, and edge computing sectors.

Token Factory Economics: performance per watt determines the commercial lifeline

To explain the $1 trillion in demand, Huang presented a new way of thinking to global enterprise CEOs. The data center, he argued, is no longer just a repository of servers; it is a "factory" that produces "tokens" (the basic unit of AI output). Huang emphasized: "Every data center, every factory, by definition, is limited by power. A 1GW factory will never become 2GW; this is the law of physics and atoms. At a fixed power, whoever has the highest token throughput per watt will have the lowest production cost."

Huang divided future AI services into five pricing tiers:
- Free tier (high throughput, low speed)
- Intermediate tier (~$3 per million tokens)
- High tier (~$6 per million tokens)
- High-speed tier (~$45 per million tokens)
- Ultra-high-speed tier (~$150 per million tokens)

He pointed out that as models grow larger and contexts grow longer, AI becomes smarter, but the rate of token generation falls. Huang stated: "In this token factory, your throughput and token generation speed will directly determine your revenue next year." He emphasized that NVIDIA's architecture lets customers achieve ultra-high throughput in the free tier while increasing performance by an astonishing 35 times in the highest-value inference tier.

Vera Rubin achieved a 350-times acceleration in two years, and Groq fills the gap in extremely fast inference

Under these physical constraints, NVIDIA introduced its most complex AI computing system, Vera Rubin. Huang stated: "In the past, I mentioned Hopper, and I would hold up a chip, that was cute.
But when it comes to Vera Rubin, you think about the whole system. In this 100% liquid-cooled system that completely eliminates traditional cables, racks that used to take days to install now take only two hours."

Huang explained that through end-to-end hardware-software co-design, Vera Rubin achieves an enormous leap within the same 1GW data center: "In the short span of two years, we increased the token generation rate from 22 million to 700 million, achieving a 350-times increase. Moore's Law over the same period could only bring about a 1.5-times increase."

To address the bandwidth bottleneck in high-speed inference (for example, 1,000 tokens/second), NVIDIA presented a solution built around the newly acquired Groq: disaggregated inference across asymmetric processors. Huang explained: "These two processors have completely different characteristics. The Groq chip has 500MB of SRAM, while a single Rubin chip has 288GB of memory." Through the Dynamo software system, the "prefill" stage, which requires massive compute and memory, is handled by Vera Rubin, while the latency-sensitive "decode" stage is handled by Groq.

Huang also offered guidance on enterprise computing configurations: "If your work mainly involves high throughput, use Vera Rubin 100%; if you have a large amount of high-value, programming-level token generation, allocate 25% of the data center to Groq." It was revealed that Groq's LP30 chip, manufactured by Samsung, is already in production, with shipments expected to begin in the third quarter. Meanwhile, the first Vera Rubin rack is already operational on Microsoft Azure.
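The fixed-power argument and the tier prices above reduce to simple arithmetic. The sketch below is a minimal illustration, not NVIDIA data: the energy-efficiency figure (tokens per joule) is an assumption chosen for illustration, and only the per-million-token prices come from the speech.

```python
# Back-of-the-envelope sketch of "token factory" economics: a data center's
# power is fixed, so its yearly token output -- and hence revenue at a given
# tier price -- scales only with energy efficiency (tokens per joule).
# Efficiency figures below are assumptions; tier prices are from the speech.

SECONDS_PER_YEAR = 365 * 24 * 3600

# Paid tiers quoted in the speech, in USD per million tokens.
TIER_PRICE_USD_PER_M = {
    "intermediate": 3.0,
    "high": 6.0,
    "high_speed": 45.0,
    "ultra_high_speed": 150.0,
}

def annual_revenue(power_watts: float, tokens_per_joule: float, tier: str) -> float:
    """Yearly revenue of a power-limited token factory.

    Power is fixed ("a 1GW factory will never become 2GW"), so
    tokens/s = watts * tokens-per-joule is the only lever.
    """
    tokens_per_year = power_watts * tokens_per_joule * SECONDS_PER_YEAR
    return tokens_per_year / 1e6 * TIER_PRICE_USD_PER_M[tier]

# A hypothetical 1 GW factory at an assumed 1 token per joule,
# selling at the $6-per-million "high" tier:
revenue = annual_revenue(1e9, 1.0, "high")
```

At fixed power, doubling tokens-per-joule exactly doubles revenue, which is why the speech frames throughput per watt as the commercial lifeline.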
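The prefill/decode split described for Dynamo can be sketched as a two-pool router. The class and pool names below are illustrative assumptions, not NVIDIA's actual API; the sketch only shows the routing idea: memory-heavy prefill goes to large-memory GPU nodes, and the latency-sensitive, token-by-token decode loop goes to SRAM-based accelerators.

```python
# Minimal sketch of disaggregated inference (prefill/decode split).
# "rubin_pool" and "groq_pool" are hypothetical names for illustration.

from dataclasses import dataclass

@dataclass
class Request:
    prompt_tokens: int    # prefill cost grows with prompt length
    max_new_tokens: int   # decode emits one token per latency-bound step

def route(stage: str) -> str:
    """Pick a worker pool for one stage of a request."""
    if stage == "prefill":
        return "rubin_pool"   # large HBM (288GB/chip): fits long contexts
    if stage == "decode":
        return "groq_pool"    # on-chip SRAM (500MB): lowest per-token latency
    raise ValueError(stage)

def serve(req: Request) -> list[str]:
    """Plan one request: a single prefill, then one decode step per token."""
    plan = [route("prefill")]
    plan += [route("decode")] * req.max_new_tokens
    return plan

plan = serve(Request(prompt_tokens=4096, max_new_tokens=3))
# plan == ["rubin_pool", "groq_pool", "groq_pool", "groq_pool"]
```

The 25% Groq allocation Huang recommends follows from this shape: decode steps vastly outnumber prefills for long generations, but each step is cheap, so a smaller latency-optimized pool can keep up.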
In addition, on optical interconnect technology, Huang showcased Spectrum X, the world's first commercially produced co-packaged optics (CPO) switch, and calmed the market's debate about the shift from copper to optics: "We need more copper cable production capacity, more optical chip manufacturing capacity, and more CPO manufacturing capacity."

Agents end traditional SaaS; "annual salary + token budget" becomes a Silicon Valley standard

Beyond hardware, Huang devoted a significant portion of his speech to the revolution in AI software and ecosystems, particularly the explosion of agents. He described the open-source project OpenClaw as the "most popular open-source project in human history," saying it surpassed in a few weeks what Linux achieved over 30 years, and stated bluntly that OpenClaw is essentially the "operating system" of agent computers.

Huang asserted: "Every SaaS (Software-as-a-Service) company will become an AaaS (Agent-as-a-Service) company. Without a doubt." To securely deploy intelligent agents with access to sensitive data and code-execution capability, NVIDIA introduced the enterprise-grade NeMo Claw reference design, which includes a policy engine and a privacy router.

For ordinary professionals, this transformation is just around the corner. Huang sketched the future workplace: "In the future, every engineer in our company will need an annual token budget. Their base salary may be in the hundreds of thousands of dollars, and I will allocate about half that amount as a token allowance, enabling them to achieve a 10x efficiency boost. This has become a new bargaining chip in Silicon Valley recruiting: how many tokens does your offer include?"

At the end of the speech, Huang also teased the next-generation computing architecture, Feynman, which will scale copper interconnects and CPO simultaneously.
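The "annual salary + token budget" idea is easy to make concrete. The back-of-the-envelope sketch below uses an assumed salary figure and the $6-per-million "high" tier price quoted earlier in the speech; none of the numbers describe an actual offer.

```python
# How many tokens per year does a token allowance buy?
# Salary figure is an assumption; the $6/1M price is the "high" tier
# quoted in the speech.

def annual_token_budget(base_salary_usd: float,
                        allowance_ratio: float = 0.5,
                        usd_per_million_tokens: float = 6.0) -> float:
    """Tokens/year an engineer's allowance buys at a given tier price."""
    allowance_usd = base_salary_usd * allowance_ratio
    return allowance_usd / usd_per_million_tokens * 1e6

# A hypothetical $300k engineer, half of that as token allowance:
tokens = annual_token_budget(300_000)  # $150k / $6 per 1M = 25 billion tokens
```

At pricier tiers the same allowance buys proportionally fewer tokens, which is why the choice of inference tier matters as much as the budget itself.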
More intriguingly, NVIDIA is developing a data center computer to be deployed in space, called "Vera Rubin Space-1," opening the imagination to extending AI computing power beyond Earth.

The above is a translation of Jensen Huang's full GTC 2026 keynote. This article was originally featured on "Wall Street Knowledge." GMTEight editor: Jiang Yuanhua.