Alibaba (09988) officially releases Qwen3.7-Plus with comprehensive upgrades in visual and language capabilities.

date
06:47 02/06/2026
avatar
GMT Eight
Building on the powerful text capabilities of Qwen3.7, Qwen3.7-Plus has comprehensively upgraded its visual-language capabilities while maintaining its full intelligent physical capabilities in coding, tool use, and productivity workflows.
On June 2nd, Alibaba (09988) officially released Qwen3.7-Plus under its subsidiary Qianwen, integrating visual and language capabilities into a multimodal model as part of an intelligent base. Building upon the strong text capabilities of Qwen3.7, Qwen3.7-Plus has comprehensively upgraded its visual-language capabilities while maintaining complete intelligent capabilities in encoding, tool usage, and productivity workflows. According to reports, the core feature of Qwen3.7-Plus is its ability as a multimodal interactive hybrid intelligent system. It can perceive real-world scenes, read screens and operate GUI, generate code based on visual references, navigate mobile applications end-to-end, and answer visual questions based on network knowledge seamlessly integrating GUI and CLI interactions in a single intelligent system loop. As a versatile coding intelligence system and productivity assistant, it handles all mode inputs for tasks ranging from front-end prototypes to complex software engineering, and even multi-step workflow automation. It has the ability to generalize across frameworks, ensuring stable performance whether deployed through Claude Code, OpenClaw, Qwen Code, or other frameworks. The Hybrid-Agent intelligent system built on Qwen3.7-Plus integrates the code generation capability of large models with GUI automation execution, achieving full lifecycle app development from requirements analysis to version iteration. The Agent runs continuously for over 11 hours, autonomously completing the full development cycle of an English vocabulary learning app. It has generated over 10,000+ lines of code, made over 1,000+ Agent calls, covering core aspects of software development lifecycle: requirement document generation, automatic code writing, deployment automation, test case creation, GUI automated testing, parallel testing in multiple scenarios, automatic product description updates, and automated version iteration evolution. Additionally, Qwen3.7-Plus also supports multimodal reasoning (able to parse complex visual information such as subway maps), enhanced visual search-based question answering, image/video to SVG vector code conversion, visual-driven web design, and can automatically procure ECS cloud servers and complete the maintenance process in the browser Agent scene. The model performs strongly on various high-difficulty benchmarks such as BabyVision, MathVision, ScreenSpot Pro, and AndroidWorld. Currently, Qwen3.7-Plus is available on the Alibaba Cloud Hundredsmelt platform, supporting OpenAI compatible APIs and calls under the Anthropic protocol.