GF SEC: RAG vector databases for AI inference drive growth in SSD demand; focus on core beneficiaries of the industry chain.

GMT Eight | 09:35 31/12/2025
GF SEC released a research report stating that the RAG architecture provides long-term memory for large models, with demand for RAG storage driven by both enterprise and personal needs. The storage medium for RAG vector databases in AI inference is transitioning from "memory-assisted retrieval" to a "full SSD storage architecture," which will continue to drive demand for high-bandwidth, large-capacity SSDs. The report recommends focusing on core beneficiaries in the industry chain.

The main points of GF SEC are as follows:

RAG provides "long-term memory" for large models; enterprise and personal needs drive growth in RAG demand. In the RAG (Retrieval-Augmented Generation) architecture, a large language model (LLM) first queries a vector database before generating a response. The vector database is the key hub connecting user queries with external knowledge: it efficiently stores, manages, and retrieves high-dimensional vectorized knowledge representations, improving the accuracy and timeliness of generated results. On the enterprise side, RAG is gradually penetrating online scenarios (e-commerce, web search, etc.) and offline scenarios (enterprise, legal, engineering research, etc.). On the personal side, personalized RAG retains a user's long-term memory, preferences, and contextual information, forming a "user-level vector space" that significantly boosts growth in RAG demand.

RAG vector databases for AI inference drive SSD demand growth. The vector database's storage medium must hold large-scale vector data and index structures while delivering high throughput and low latency to meet similarity-retrieval needs in high-concurrency scenarios. That medium is currently transitioning from "memory-assisted retrieval" to a "full SSD storage architecture."
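The retrieval step described above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: the documents, embedding dimension, and random stand-in vectors are all assumptions for demonstration; a production system would use a real embedding model and an ANN index on a vector database rather than brute-force cosine similarity.

```python
import numpy as np

# Toy "knowledge base": each document is paired with a vector embedding.
# In a real RAG system the embeddings come from an embedding model and
# live in a vector database; here they are random stand-ins.
rng = np.random.default_rng(0)
docs = ["doc A ...", "doc B ...", "doc C ..."]
doc_vecs = rng.normal(size=(len(docs), 8))
doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)  # unit-normalize

def retrieve(query_vec, k=2):
    """Return the k documents most similar to the query (cosine similarity)."""
    q = query_vec / np.linalg.norm(query_vec)
    scores = doc_vecs @ q                  # cosine similarity per document
    top = np.argsort(scores)[::-1][:k]     # indices of the k best matches
    return [docs[i] for i in top]

# The LLM prompt is then assembled from the retrieved context plus the question.
query_vec = rng.normal(size=8)
context = retrieve(query_vec)
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: ..."
```

At scale, the brute-force `doc_vecs @ q` scan is replaced by an approximate nearest-neighbor (ANN) index, which is exactly the data structure whose placement (DRAM vs. SSD) the report discusses.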
According to "All-in-storage ANNS Algorithms Optimize VectorDB Usability within a RAG System," using KIOXIA AiSAQ as an example, vectors, PQ quantization results, and indexes are stored together on SSD: 10 billion vectors require 11.2TB of SSD capacity, with PQ vectors occupying 1.28TB and indexes 10TB. On TLC/QLC SSDs, AiSAQ has a 4-7x cost advantage over DiskANN's storage media. In addition, all AiSAQ tenants remain in an active state: tenants can start queries directly, without the "cold start" delay of first loading data from SSD to DRAM, improving the scalability and economic feasibility of RAG systems.

Volcano Engine TOS Vector opens a new paradigm of vector storage and increases SSD demand. According to the Volcano Engine developer community's official account, TOS introduces Vector Bucket, which uses the proprietary cloud-native vector index library Kiwi and a multi-level local cache architecture (covering DRAM, SSD, and remote object storage). For large-scale, long-term storage with low-frequency queries, this architecture meets the tiered needs of high- and low-frequency data while significantly lowering the threshold for enterprises to use large-scale vector data.

TOS Vector works closely with Volcano Engine's high-performance vector database and Volcano AI agent products. In interactive Agent scenarios, high-frequency memories (such as a user's core preferences or recent task execution results) are stored in the vector database for millisecond-level retrieval, while low-frequency memories (such as interaction records from six months ago or historical execution results) are stored in TOS Vector, accepting second-level latency in exchange for lower storage cost and broader memory space. In complex-task Agent scenarios, TOS Vector can handle massive semantic vector storage while ensuring the sustainable accumulation of long-term data.
Risk Warning: AI industry development and demand may fall short of expectations; AI server shipments may fall short of expectations; and domestic manufacturers' technological and product progress may fall short of expectations.