Guotai Junan: OpenAI's o3 model marks a breakthrough in AI and improves the user experience
30/12/2024
GMT Eight
Guotai Junan released a research report stating that OpenAI launched new features, including the o1 API and the o3 models, during the last four days of its 12-day event. The o1 model improves developer efficiency and widens the range of application scenarios through enhanced API features such as function calling, structured output, and vision input. The o3 series models show abilities close to or surpassing those of human experts in coding, mathematics, and scientific reasoning, while adjustable reasoning effort settings significantly reduce costs. The report highlights three areas that may benefit from OpenAI's technological breakthroughs: AI development tools and platforms, AI inference and high-performance computing, and AI safety and alignment technologies.
Key points from Guotai Junan include:
- With the release of the o1 model in the API, OpenAI significantly expands API functionality. New features include function calling, structured output, and vision input, which greatly improve developer efficiency. According to the report, the o1 model reaches 95% accuracy on structured output calls, far exceeding the GPT-4o model, which supports accuracy and stability in complex tasks. In addition, developer messages and a reasoning effort control let developers balance performance against cost, further improving the development experience. With vision input, the o1 model can process images directly, for example to analyze errors in tables, which further expands its application scenarios.
- Deep native integration of the ChatGPT desktop app significantly improves programming and creative efficiency. Users can invoke ChatGPT with a keyboard shortcut and generate complex code snippets in Xcode and the Warp terminal, shortening development time and improving code quality. Integration with tools such as Notion and Apple Notes makes document editing and information gathering more efficient and embeds ChatGPT more deeply in users' daily work and creative processes. Support for advanced voice mode further improves the interactive experience, adding convenience and productivity.
- The o3 series models introduced by OpenAI deliver breakthroughs in performance, cost, and safety. The o3 model scored 87.5% on the ARC-AGI benchmark, surpassing the average human level, and significantly improves coding, mathematical, and scientific problem-solving capabilities. In Codeforces competitive programming, the o3 model's Elo rating reaches about 2727, far above the o1 model's 1891. On Epoch AI's FrontierMath benchmark, one of the most difficult math benchmarks currently available, its accuracy exceeds 25%, whereas previous models scored below 2%. The o3-mini offers adjustable reasoning effort settings (low, medium, high): it surpasses the o1-mini in coding ability at low reasoning effort and outperforms o1 at medium reasoning effort, giving developers a way to balance cost against reasoning performance. At low reasoning effort, the o3-mini's latency is close to that of GPT-4o, while its cost is only a small fraction of the o1 model's, providing a solid foundation for large-scale commercial applications. In addition, OpenAI has for the first time opened the o3 series models to external researchers for safety testing, which strengthens public trust in the technology and further consolidates OpenAI's leading position in AI.
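
To put the Codeforces gap quoted above in perspective, the following minimal sketch converts the two ratings into an expected head-to-head score. It assumes the textbook Elo formula with a 400-point scale; Codeforces uses its own Elo-like rating system, so this is only a rough illustration, and the function name `elo_expected_score` is hypothetical.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B.

    Assumption: the standard Elo logistic formula with a 400-point scale,
    not necessarily the exact formula Codeforces applies.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings as quoted in the report.
O3_RATING = 2727
O1_RATING = 1891

expected = elo_expected_score(O3_RATING, O1_RATING)
print(f"Rating gap: {O3_RATING - O1_RATING} points")
print(f"Expected score for the higher-rated side: {expected:.3f}")
```

Under this assumption, an 836-point gap corresponds to an expected score of roughly 0.99 for the higher-rated side, which is the sense in which 2727 is "far above" 1891.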
Risk warning: Technological breakthroughs carry security and privacy risks; commercialization processes may fall short of expectations.