China Securities Co., Ltd.: Big-tech models iterated rapidly over the Spring Festival; cloud demand is expected to "inflate".
During the Spring Festival, large models went through intensive iteration, with multi-agent collaboration and native multimodality driving a leap in capability.
China Securities Co., Ltd. released a research report stating that large models went through intensive iteration over the Spring Festival period, with multi-agent collaboration and native multimodality driving significant advances. Over the past two weeks, leading AI companies at home and abroad released a wave of new-generation base models. Parallel agent architectures, complex logical reasoning, super-long contexts, and native audio-visual modalities have become the defining features of this technology cycle, and the industry is rapidly evolving from dialogue-based question answering towards fully automated handling of complex engineering tasks. Recently, domestic and foreign cloud providers have repeatedly issued price-increase notices, and the rigid-demand premium from AI inference is pushing the industry towards an inflection point. The report recommends attention to the core directions under the cloud price-increase logic.
China Securities Co., Ltd.'s main points are as follows:
During the Spring Festival period, large models went through intensive iteration, with multi-agent collaboration and native multimodality driving significant advances. Over the past two weeks, leading AI companies at home and abroad released a wave of new-generation base models. Parallel agent architectures, complex logical reasoning, super-long contexts, and native audio-visual modalities have become the defining features of this technology cycle, and the industry is rapidly evolving from dialogue-based question answering towards fully automated handling of complex engineering tasks.
Google: Google released its new flagship model, Gemini 3.1 Pro, which significantly outperformed competitors on ARC-AGI-2, a test of frontier reasoning capability, with 77.1% accuracy. It natively supports million-token super-long contexts, scoring 84.9% on the hard-task MRCR v2 test. In code and agent work, its LiveCodeBench Pro score of 2887 leads the industry. The model has also made breakthroughs in 3D spatial reasoning and complex graphics generation, producing high-quality animated SVG graphics from text prompts alone. Compared with its predecessor, Gemini 3.1 Pro markedly reduces error rates, further consolidating its lead in complex logical reasoning and omni-modal input.
Anthropic: Claude Sonnet 4.6, Anthropic's latest flagship model, has been comprehensively upgraded in code writing and long-text reasoning. On GDPval-AA, which evaluates real-world knowledge-work value, it slightly outperformed the flagship Opus 4.6, becoming the new efficiency benchmark. Its computer-use capability rose to 72.5% on the OSWorld evaluation, with integrated handling of web pages and local applications. While holding pricing at $3 per million input tokens, the model significantly improves the coherence of multi-step task execution and reduces over-engineering, accelerating the commercial deployment of on-device agent applications.
xAI: xAI released a test version of the 500-billion-parameter Grok 4.2, which likewise integrates a multi-agent cluster mechanism. On complex tasks, the system automatically schedules multiple heterogeneous agents for parallel reasoning and real-time cross-validation, synthesizing a conclusion after weighing multiple professional perspectives. The architecture performed strongly in the Alpha Arena live large-model investment competition, where it was the only model to achieve a positive return. In front-end development and code generation, the multi-agent debate mechanism avoids the logical traps of single-agent models, improving the usability and accuracy of code output and validating the advantage of parallel agent architectures in engineering settings.
Alibaba: Alibaba open-sourced the Qwen 3.5 flagship series, which combines linear attention with a mixture-of-experts architecture, raising decoding throughput 8.6x while maintaining top-tier reasoning capability. As a native vision-language model, Qwen 3.5 matches leading overseas products on multimodal understanding benchmarks. Its core technical breakthrough lies in scaling up reinforcement-learning tasks and environments in the post-training phase, so that general-purpose capability grows linearly with the scale of the RL environment. The BaiLian platform has launched the Qwen 3.5-Plus flagship API, offering both thinking and fast reasoning modes and integrating seamlessly with mainstream programming tools.
ByteDance: ByteDance's Doubao 2.0 matrix includes Pro, Lite, Mini, and a dedicated Code version, systematically rebuilt for complex instruction execution. The Pro version excels at deep reasoning and long-chain tasks, reaching gold-medal level in math and programming competition tests. On the multimodal front, Doubao 2.0 delivers industry-leading performance in real-time streaming Q&A and long-video understanding. Its Code model is deeply integrated with the in-house AI programming tool TRAE, markedly strengthening automatic error correction in workflows. While maintaining top-tier performance, Doubao 2.0 further cuts token costs by roughly an order of magnitude, easing the cost constraints on deploying long-horizon agent applications.
Zhipu AI: Zhipu AI launched its flagship base model GLM-5, with 744 billion parameters, evolving its core capability from assisted programming towards automated intelligence engineering. The model introduces a sparse attention mechanism and new asynchronous reinforcement-learning infrastructure, significantly reducing compute and memory overheads. On the industrial side, GLM-5 has implemented W4A8 mixed-precision quantization on Huawei Ascend computing clusters, benchmarking against leading overseas compute platforms and cutting deployment costs by 50% in long-sequence, low-latency scenarios. The commercialization of GLM-5 marks a dual breakthrough for domestic large models in complex reasoning capability and underlying compute compatibility.
MiniMax: M2.5 sets new industry standards on productivity benchmarks such as programming and tool invocation, with 80.2% accuracy on SWE-Bench Verified. Built on a self-developed native agent reinforcement-learning framework, M2.5 combines high inference throughput with low cost: running continuously for an hour at 100 TPS costs only $1. This breakthrough in cost and speed makes million-token super-long-context tasks economically feasible. The model has already taken over roughly 30% of real-world business scenarios within MiniMax, spanning research, finance, and other functions, validating its industrial potential as a productivity engine.
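The quoted throughput and price imply a per-token cost. As a back-of-envelope check (assuming the figures mean 100 tokens/second sustained for one hour at a total cost of $1; the variable names are purely illustrative):

```python
# Back-of-envelope cost check for the quoted M2.5 figures
# (assumption: 100 tokens/second sustained for one hour costs $1 in total).

tps = 100                 # quoted inference speed, tokens per second
seconds_per_hour = 3600
cost_per_hour = 1.0       # quoted cost in dollars for one hour of generation

tokens_per_hour = tps * seconds_per_hour                      # 360,000 tokens
cost_per_million = cost_per_hour / tokens_per_hour * 1_000_000

print(f"{tokens_per_hour:,} tokens/hour")
print(f"${cost_per_million:.2f} per million output tokens")
```

Under that reading, the quoted price works out to under $3 per million output tokens, consistent with the report's framing of M2.5 as a low-cost engine for super-long-context tasks.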
Kimi: Moonshot AI released its latest flagship model, Kimi K2.5, which uses joint text-visual pre-training to achieve mutual reinforcement across modalities. Technically, K2.5 introduces agent clusters and a parallel agent reinforcement-learning framework that decomposes long-horizon complex tasks into heterogeneous subproblems for distributed processing, cutting end-to-end inference latency by 4.5x. The accompanying programming assistant, Kimi Code, integrates seamlessly with mainstream IDEs. With high accuracy and strong compute-scheduling efficiency, Kimi K2.5 quickly climbed the open-source tool-invocation leaderboards, accelerating the shift towards automated engineering productivity.
From "universal access to computing power" to "computing-power inflation", the supply-demand logic driving cloud price increases continues to unfold. Recently, domestic and foreign cloud providers have raised prices repeatedly, and the rigid-demand premium from AI inference is pushing the industry towards an inflection point. Taking Alibaba Cloud as an example, its growth rate has climbed steadily since 24Q2, hitting a new quarterly high of 34% in 25Q3, with public-cloud revenue driving the overall increase; AI-related product revenue in particular has posted triple-digit year-on-year growth for nine consecutive quarters. The report argues that the phase of homogenized competition and price wars in cloud services is ending, with cloud resource pricing shifting from "trading price for volume" to "realizing a premium". Core directions under the cloud price-increase logic: 1) Edge cloud/CDN: AI inference is sinking to the edge at scale, allowing edge data to be computed locally and reducing transfers to the central cloud; attention is recommended on CDN price increases lifting profits, overseas markets contributing incremental revenue, and positioning in the edge AI inference market. 2) Cloud repatriation: rising central-cloud costs are pushing enterprises towards local deployments of hyper-converged or distributed storage, cutting costs through hybrid or private clouds. 3) Leading cloud providers face revaluation opportunities; the report remains optimistic on top CSPs with pricing power in AI cloud services.
From "GPU dominance" to "heterogeneous computing collaboration", the logic of demand escalation and cost pass-through is driving CPU and storage prices up in tandem, with hardware value shifting from "compute-centric" to "compute-and-storage-centric". 1) CPU: AI agent applications deployed at scale to the edge and on devices rely on the CPU's general-purpose computing and task-scheduling capabilities for autonomous planning, tool (API) invocation, and complex logical reasoning. As agents proliferate, demand for non-streaming, serial computation rises significantly, driving up CPU usage and specifications. 2) Memory interconnect and Compute Express Link (CXL): high-concurrency inference exposes the "memory wall" bottleneck and raises system-level costs; data centers are accelerating deployment of high-speed interconnects such as CXL to pool memory, significantly reducing CPU wait time and total compute cost.
Risk warnings: (1) Macroeconomic downturn risk: the computer industry's downstream spans many sectors; under macroeconomic pressure, weaker-than-expected industry IT spending would directly affect demand. (2) Accounts-receivable bad-debt risk: most computer companies run project-based businesses billed on completion; lengthening payment cycles from downstream customers could increase bad debts and impairment losses. (3) Intensified industry competition: while demand in the computer industry is relatively stable, intensifying supply-side competition could reshape the industry landscape. (4) Changes in the international environment: rising international trade frictions and continued US pressure on Chinese technology could affect companies with a high share of overseas revenue.