Soochow: Edge-side AI innovation drives hardware memory upgrades, multimodal interaction opens a new era of intelligent terminals
03/03/2025
GMT Eight
Soochow released a research report stating that edge-side AI is accelerating terminal intelligence through model compression, memory innovation, and a revolution in multimodal interaction interfaces. According to institutional estimates, the edge-side AI market is expected to grow at a CAGR of 58% from 2023 to 2028, surpassing 1.9 trillion yuan in 2028. Comparing hardware components, the firm believes that Apple Inc. (AAPL.US) has significant room for improvement in memory, battery, and heat dissipation. It views the energy consumption caused by memory and memory operations as currently the weakest link, and expects it to become the core direction of hardware transformation. Large model technology is driving the transition of human-machine interaction from "instruction-centric" to "intent-centric," with giants such as Apple Inc. (memory innovation) and Alphabet Inc. Class C (UI interaction models) leading breakthroughs in hardware and algorithms.
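As a rough check on the growth figures quoted above, the sketch below backs out the 2023 market size implied by a 58% CAGR and a 2028 value of 1.9 trillion yuan. The implied base is an arithmetic consequence of those two numbers only, not a figure stated in the report, and because the report says the market "surpasses" 1.9 trillion yuan, the result is best read as a lower bound. The function and constant names are illustrative.

```python
def implied_base(end_value, cagr, years):
    """Back out the starting value from an end value and a compound annual growth rate."""
    return end_value / (1 + cagr) ** years


END_2028_TRILLION_YUAN = 1.9  # figure quoted in the report (stated as a floor)
CAGR = 0.58                   # 58% per year, as quoted
YEARS = 2028 - 2023           # five compounding periods

growth_multiple = (1 + CAGR) ** YEARS
base_2023 = implied_base(END_2028_TRILLION_YUAN, CAGR, YEARS)

print(f"Growth multiple over {YEARS} years: {growth_multiple:.2f}x")
print(f"Implied 2023 base: {base_2023:.3f} trillion yuan")
```

In other words, a 58% CAGR sustained for five years corresponds to a market roughly ten times its starting size.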
Key points from Soochow are as follows:
Edge-side AI reshapes human-machine interaction, models iterate rapidly, giants lead industry development
AI autonomy continues to improve, moving interaction from "instruction-centric" to "intent-centric." Large language models (LLMs) are transforming terminals in several ways, and Agents are essential for open-ended tasks: they understand complex inputs, plan and reason, and use tools effectively. According to Soochow, the edge-side AI market is expected to grow at a CAGR of 58% from 2023 to 2028, surpassing 1.9 trillion yuan in 2028. For small models, parameter count has a significant impact on performance; because of hardware limitations, however, innovation in small models concentrates on improving performance within a limited parameter budget. Quantization, pruning, and distillation are the main model compression methods, and differences in datasets, compression precision, and mixed quantization methods are expected to produce a wide variety of small models. In the Agent architecture, the base model itself needs to accept new input types to become a VLA (vision-language-action) model, and demand for personalization and memory operations increases, both of which require additional optimization.
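To make the quantization idea concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is an illustration of the general technique only, not the method used by any model named in the report; the function names and the sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers and the scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.03, 1.0]  # hypothetical sample weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 values:", q)
print("scale:", scale)
print("max round-trip error:", max_error)
```

Each weight is stored in one byte instead of two (half precision) or four (single precision), at the cost of a rounding error of at most half the scale per weight. Pruning and distillation reduce model size by other means and are not shown here.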
Memory is the core focus of hardware transformation, Apple Inc. pursues memory innovation to address memory bottlenecks
Unlike cloud-side models, edge-side models are tightly constrained by hardware, which must be upgraded to overcome its shortcomings. Comparing hardware components, Soochow believes that Apple Inc. has significant room for improvement in memory, battery, and heat dissipation. The energy consumption caused by memory is currently the weakest link and is expected to become the core direction of hardware transformation. For example, loading a 7B model with half-precision parameters takes more than 14GB of DRAM, and DRAM access consumes about two orders of magnitude more energy than SRAM access or computation. Memory utilization efficiency differs significantly between iOS and Android: Soochow suggests that Android needs to provide a unified AI base model at the OS level, while iOS needs to increase hardware memory in addition to compressing models in order to overcome hardware bottlenecks. Beyond simply increasing memory capacity, Apple Inc. is innovating intensively in memory structure, energy consumption, and transfer speed, for example by collaborating with Samsung to develop independent packaging forms and by promoting the new WMCM packaging method to further enhance the flexibility and integration of chip combinations.
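The 14GB figure follows from simple arithmetic: 7 billion parameters at 2 bytes each. The sketch below works this out for several precisions, assuming decimal gigabytes (1 GB = 10^9 bytes) and counting weights only; activations and any inference-time caches are not modeled and would add to the total. The precision table and function name are illustrative.

```python
# Bytes needed to store one parameter at each precision (weights only).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}


def weight_footprint_gb(num_params, precision):
    """Memory needed to hold the weights, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


NUM_PARAMS = 7e9  # the 7B model from the example in the text

for precision in BYTES_PER_PARAM:
    gb = weight_footprint_gb(NUM_PARAMS, precision)
    print(f"{precision}: {gb:.1f} GB")
```

Under these assumptions, half precision comes to 14 GB for the weights alone, which is why lower-precision formats such as int8 or int4 matter so much on memory-constrained devices.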
The multimodal UI interaction revolution brings historic opportunities for Agents
By mode of interaction, task execution methods can be divided into API-based and user interface (UI)-based methods. API-based interaction is less versatile. The UI-based method, built on the Transformer architecture, effectively handles the implicit relationship between tasks and UI elements, which significantly improves the feasibility of GUI Agents, and it is expected to become mainstream. Both Apple Inc. and Alphabet Inc. Class C are currently focusing on UI interaction models: Apple Inc.'s Ferret-UI and Alphabet Inc. Class C's ScreenAI are vision-language models that read the screen and understand its information through a unified encoding. Judging from Alphabet Inc. Class C's UI model, increasing the parameter count has a significant impact on performance, and the gains had not yet saturated at the 5B model, which suggests that further improvement in model performance is needed.
Risk Warning: Underlying innovation may fall short of expectations; technological development may fall short of expectations; competition may intensify.