Caitong: VLA Model Drives Restructuring of the Intelligent Driving Industry Chain; Focus Recommended on Leading Software and Hardware Names

28/02/2025
GMT Eight
Caitong has released a research report stating that the application of the VLA (Vision-Language-Action) model in autonomous driving currently involves data acquisition and preprocessing, multimodal information fusion, action command generation and execution, and feedback. Gao Chao, a researcher at the China Automatic Driving Industry Innovation Alliance, expects the VLA model to reach mass production and deployment by 2025, which would boost the penetration rate of urban NOA (Navigate on Autopilot). With VLA technology breaking through and mass production imminent, the report sees structural investment opportunities across the intelligent driving industry chain. It recommends focusing on core software and hardware suppliers such as ArcSoft Corporation (688088.SH), Streamax Technology (002970.SZ), Shanghai Huace Navigation Technology (300627.SZ), Huizhou Desay SV Automotive (002920.SZ), Ningbo Joyson Electronic Corp. (600699.SH), and Thunder Software Technology (300496.SZ), among others.

Key points from Caitong are as follows:

VLA model achieves high-level end-to-end intelligence

VLA, which stands for Vision-Language-Action model, was first proposed by DeepMind in 2023 and applied in the field of robotics. VLA combines the perception capabilities of vision-language models (VLM) with the decision-making abilities of end-to-end (E2E) models, and it also introduces "chain of thought" techniques to achieve global context understanding and human-like reasoning. It takes text and visual data as input and outputs action commands that a robot can execute, which makes it naturally suited to interaction between AI and the physical world. Additionally, the report says the reasoning process is differentiable throughout and the system is transparent and interpretable, so the driving logic can be explained to users through in-car displays, which enhances user trust.
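To make the application process named in the report more concrete, the sketch below walks one driving cycle through the stages it lists: data acquisition and preprocessing, multimodal fusion, action command generation, and execution with feedback. It is only an illustration under stated assumptions, not the method of Caitong or of any VLA vendor. Every name in it (Observation, Action, preprocess, fuse, decide, step, run_cycle), the 20 m distance threshold, and the rule-based decision logic are hypothetical stand-ins for what would be a learned model in a real system.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Preprocessed sensor reading plus a text instruction (hypothetical schema)."""
    obstacle_distance_m: float
    instruction: str


@dataclass
class Action:
    """A driving command with a human-readable rationale."""
    target_speed_mps: float
    rationale: str


def preprocess(raw_distance_m: float, instruction: str) -> Observation:
    # Stage 1: data acquisition and preprocessing.
    # Clamp impossible negative distances and normalize the text.
    return Observation(max(raw_distance_m, 0.0), instruction.strip().lower())


def fuse(obs: Observation) -> dict:
    # Stage 2: multimodal information fusion.
    # A real VLA model would embed vision and language into a shared
    # representation; here the "fusion" is just two boolean features.
    return {
        "obstacle_near": obs.obstacle_distance_m < 20.0,  # assumed threshold
        "wants_stop": "stop" in obs.instruction,
    }


def decide(features: dict, speed_limit_mps: float) -> Action:
    # Stage 3: action command generation.
    # The rationale string mimics the kind of explanation that could be
    # shown to the user on an in-car display.
    if features["wants_stop"]:
        return Action(0.0, "stopping: instructed to stop")
    if features["obstacle_near"]:
        return Action(speed_limit_mps * 0.5, "slowing: obstacle within 20 m")
    return Action(speed_limit_mps, "cruising: road clear")


def step(current_speed_mps: float, action: Action, gain: float = 0.5) -> float:
    # Stage 4: execution and feedback.
    # The vehicle closes part of the gap to the target speed; the returned
    # speed is what the next cycle would start from.
    return current_speed_mps + gain * (action.target_speed_mps - current_speed_mps)


def run_cycle(raw_distance_m: float, instruction: str,
              current_speed_mps: float, speed_limit_mps: float = 15.0):
    """Run one pass through all four stages and return (action, new_speed)."""
    obs = preprocess(raw_distance_m, instruction)
    action = decide(fuse(obs), speed_limit_mps)
    return action, step(current_speed_mps, action)
```

Calling run_cycle(50.0, "Keep going", 10.0) returns an Action together with the updated speed, and passing that speed into the next call closes the feedback loop. The rule-based decide function was chosen only so the example stays deterministic and easy to read.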
Currently, the application of the VLA model in autonomous driving involves data acquisition and preprocessing, multimodal information fusion, action command generation and execution, and feedback.

The VLA model may reach mass production and deployment by 2025, reshaping the competitive landscape of the intelligent driving market

Gao Chao, a researcher at the China Automatic Driving Industry Innovation Alliance, expects the VLA model to reach mass production and deployment by 2025, which would boost the penetration rate of urban NOA. Competition in the intelligent driving market has shifted from simply implementing functions to a deeper contest between technical paradigms, with emphasis on how advanced and sustainable the underlying architecture is. A recent Goldman Sachs report on autonomous driving indicates that by 2030, end-to-end solutions dominated by the VLA model may account for 60% of the L4 market, pointing to a potential restructuring of the value chain for traditional tier-one suppliers.

In addition, Apple recently published a paper on its machine learning research page introducing GIGAFLOW, a large-scale self-play reinforcement learning framework for training general and robust autonomous driving policies. According to the report, this research offers a new training method for the VLA model and helps combine emerging techniques such as reinforcement learning and self-play training to improve the intelligence and generalization of autonomous driving systems.

Risk warning: the penetration rate of automotive intelligence may rise more slowly than expected; policy and regulatory support may fall short of expectations; raw materials such as chips may be in short supply; and macroeconomic risks may materialize.

Contact: contact@gmteight.com