Huachuang Securities: Token Inference Grows Explosively, Domestic Computing Power Faces Massive Demand
The normalization of inference workloads means that demand for computing power is shifting from "periodic training input" to "continuous inference consumption", leading to a systematic upward trend in the value of the computing chip industry.
Huachuang Securities released a research report stating that the exponential increase in Token consumption heralds a fundamental shift in user AI usage patterns. The constant expansion of computing power requirements directly drives the growth of computing power chip demand, leading the Chinese GPU industry into a high-volume phase. The normalization of inference loads signifies a shift from "periodic training input" to "continuous inference consumption" in computing power demand, with the central value of the computing power chip industry showing a systematic upward trend.
Key points from Huachuang Securities:
Event: From February 9 to 15, 2026, OpenRouter data showed that Chinese models had a call volume of 4.12 trillion Tokens, surpassing for the first time the 2.94 trillion Tokens of American models over the same period. From February 16 to 22, four models from Chinese developers ranked among the platform's top five models by call volume, namely MiniMax's M2.5, Moonshot AI's Kimi K2.5, Zhipu's (Knowledge Atlas) GLM-5, and DeepSeek's V3.2. These four models accounted for 85.7% of the total call volume of the Top 5.
Demand side: Computing power chips enter a phase of high load normalization
The exponential increase in Token consumption may seem to be driven by growth in user numbers and usage time, but the deeper driver behind it is a fundamental shift in how users use AI. AI's role is evolving from a "question-and-answer tool" that provides simple information and engages in casual conversation to a "productivity tool" that is deeply embedded in workflows and handles complex tasks.
NVIDIA CEO Jensen Huang has pointed out that without computing power, Tokens cannot be generated, and without Tokens, revenue cannot grow. The continuous expansion of computing power demand directly drives demand for computing chips, propelling the Chinese GPU industry into a high-volume boom phase. Frost & Sullivan predicts that China's AI computing acceleration chip market will grow from 142.537 billion yuan in 2024 to 1,336.792 billion yuan in 2029, a compound annual growth rate of 53.7%.
In terms of sub-markets, the GPU market is growing the fastest, with its market share expected to increase from 69.9% in 2024 to 77.3% in 2029, reaching a market size of 1,033.340 billion yuan. Overall, the normalization of inference loads suggests a shift in computing power demand from "periodic training input" to "continuous inference consumption," with the central value of the computing power chip industry showing a systematic upward trend.
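The market figures above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `cagr` and the variable names are hypothetical, and it assumes the 2024 and 2029 figures are simple endpoints five years apart, which the report summary does not state explicitly.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoint values."""
    return (end / start) ** (1 / years) - 1

# Figures quoted in the article, in billions of yuan.
market_2024 = 142.537
market_2029 = 1336.792
gpu_share_2029 = 0.773  # 77.3% GPU share forecast for 2029

# Assumption: growth is measured over the five years from 2024 to 2029.
endpoint_cagr = cagr(market_2024, market_2029, 5)

# GPU segment size implied by the total market and the GPU share.
gpu_market_2029 = market_2029 * gpu_share_2029

print(f"Endpoint-implied CAGR: {endpoint_cagr:.1%}")
print(f"Implied 2029 GPU market: {gpu_market_2029:.3f} billion yuan")
```

The implied GPU segment size agrees with the 1,033.340 billion yuan quoted above. The endpoint-implied growth rate comes out somewhat above the cited 53.7%, which may reflect a different base year or measurement span in the original forecast; the summary does not say.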
Supply side: Strengthened overseas constraints and enhanced domestic substitution capabilities
1) Tight overseas supply: Policy restrictions combined with capacity bottlenecks continue to limit supply elasticity. According to market reports cited by Global Times, the U.S. Department of Commerce stated that although it approved the export of AI chips to China two months ago, NVIDIA has not yet sold any H200 chips to China. In January of this year, the U.S. Department of Commerce issued new rules easing export restrictions on H200 chips to China, but the U.S. State Department has continued to push for stricter export restrictions. Amid these uncertainties, Chinese customers are holding off on placing orders with NVIDIA until the licensing conditions are clarified.
In addition to policy disturbances, global computing hardware is itself in a structurally tight balance, with current data center GPU delivery lead times ranging from 36 to 52 weeks. Overall, supply is unlikely to become more elastic in the short term, and extended delivery lead times have become the norm. The overseas computing supply gap may persist, creating a window of opportunity for accelerating domestic computing power substitution and strengthening the local supply chain.
2) Enhanced domestic chip substitution capabilities: Performance and commercialization leap forward in tandem. The continuous growth in Token consumption essentially reflects an increase in the frequency and intensity of large model inference calls. The FP8 format is becoming the next-generation computing standard; in essence, it trades some numerical precision for higher overall computing throughput. On February 12, Moore Threads publicly disclosed for the first time that its S5000 delivers more than 1,000 TFLOPS of FP8 computing power per card, with training accuracy closely matching that of NVIDIA's H100 (a difference of less than 1%).
Domestic chips are actively advancing ecosystem adaptation. From December 2025 to March 2026, MetaX's C500/C550 was progressively adapted to multiple domestic large models, including Tencent's HunyuanImage 3.0, StepFun's Step 3.5 Flash, and Zhipu's (Knowledge Atlas) GLM-5. As model adaptation capabilities continue to improve, domestic GPUs are moving from being technically usable to being deployable at scale. On February 27, Cambricon, Moore Threads, and MetaX released their 2025 annual performance reports, all achieving triple-digit revenue growth, with Cambricon posting its first annual profit.
Overall, domestic GPUs are breaking through simultaneously in computing power density, ecosystem adaptation, and commercial realization, and domestic computing chips are gradually achieving large-scale substitution.
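The FP8 trade-off mentioned in point 2 (more throughput in exchange for precision) can be illustrated with a small rounding experiment. This is a minimal sketch, not a model of any particular chip: the helper `round_to_mantissa_bits` is hypothetical, it assumes 3 explicit mantissa bits (as in the E4M3 variant of FP8) against 10 for FP16, and it ignores exponent range limits, subnormals, and saturation.

```python
import math

def round_to_mantissa_bits(x: float, bits: int) -> float:
    """Round x to a significand with `bits` explicit mantissa bits.

    Illustrative only: exponent range, subnormals and saturation are ignored.
    """
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)       # x == m * 2**e with 0.5 <= |m| < 1
    scale = 2 ** (bits + 1)    # implicit leading bit plus the explicit bits
    return math.ldexp(round(m * scale) / scale, e)

def relative_error(x: float, bits: int) -> float:
    """Relative rounding error when x is stored with `bits` mantissa bits."""
    return abs(round_to_mantissa_bits(x, bits) - x) / abs(x)

value = 0.1
fp8_like = round_to_mantissa_bits(value, 3)    # assumption: E4M3-style FP8
fp16_like = round_to_mantissa_bits(value, 10)  # FP16-style mantissa width

print(f"FP8-like:  {fp8_like}  (relative error {relative_error(value, 3):.4%})")
print(f"FP16-like: {fp16_like}  (relative error {relative_error(value, 10):.4%})")
```

Each value stored with fewer bits carries a larger rounding error, but more values fit in the same memory and bandwidth budget, which is generally where the throughput gain of lower-precision formats comes from.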
Investment targets
Recommended focus: (1) Chip design: Cambricon (688256.SH), Hygon Information Technology (688041.SH), MetaX (688802.SH), Moore Threads (688795.SH), Iluvatar CoreX (09903); (2) Chip foundry: Semiconductor Manufacturing International Corporation (688981.SH), Hua Hong Semiconductor (688347.SH; 01347 in Hong Kong); (3) Servers and accessories: Inspur Electronic Information Industry (000977.SZ), Sichuan Huafeng Technology (688629.SH).
Risk warning
Downstream demand lower than expected, domestic substitution progress below expectations, foundry supply risks.