Qwen3-VL family adds 2B and 32B models to cover the full range of vision-language understanding scenarios.

15:12 22/10/2025
GMT Eight
On October 22, the Tongyi Qianwen (Qwen) team under BABA-W (09988) announced that the Qwen3-VL family is expanding again with two new Dense model sizes, 2B and 32B, covering the range from lightweight edge models to performance sweet-spot models across vision-language understanding scenarios.

Each model ships in two versions: the Instruct version responds faster and executes more stably, suited to dialogue and tool calling; the Thinking version strengthens long-chain reasoning and complex visual understanding, can "think in pictures", and performs better on high-difficulty tasks.

According to the team, Qwen3-VL-32B outperforms GPT-5 mini and Claude 4 Sonnet on STEM, VQA, OCR, video understanding, and agent tasks, matching models as large as 235B with only 32B parameters and even surpassing them on OSWorld. Meanwhile, Qwen3-VL-2B delivers strong performance in a small footprint, running smoothly on edge devices and making developer experiments and deployments lighter. From image recognition to writing, reasoning, and creation, Qwen3-VL makes "understanding the world" easier, faster, and smarter.

To date, Qwen3-VL has released four Dense model sizes (2B, 4B, 8B, 32B) and two MoE model sizes (30B-A3B, 235B-A22B). Each model has Instruct and Thinking versions, plus 12 corresponding FP8 quantized versions, for a total of 24 open-source Qwen3-VL weight models, all available for free commercial use on the ModelScope community and Hugging Face.
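For developers who want to try one of these checkpoints from Hugging Face, a minimal sketch using the transformers image-text-to-text pipeline might look like the following. The repo ID (Qwen/Qwen3-VL-2B-Instruct) follows the naming pattern implied by the article but should be verified on Hugging Face or ModelScope, and the image URL is a placeholder.

```python
from transformers import pipeline

# Assumed repo ID following the family naming in the article;
# confirm the exact name on Hugging Face or ModelScope before use.
MODEL_ID = "Qwen/Qwen3-VL-2B-Instruct"

# "image-text-to-text" is the transformers pipeline task used for
# vision-language chat models; device_map="auto" picks GPU if present.
pipe = pipeline("image-text-to-text", model=MODEL_ID, device_map="auto")

# Chat-style input: one user turn containing an image plus a question.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/chart.png"},  # placeholder
            {"type": "text", "text": "Describe this image briefly."},
        ],
    }
]

# The pipeline applies the model's chat template and runs generation;
# the returned conversation ends with the assistant's reply.
out = pipe(text=messages, max_new_tokens=128)
print(out[0]["generated_text"][-1]["content"])
```

The same sketch would apply to a Thinking checkpoint by swapping the repo ID suffix, at the cost of longer, reasoning-heavy outputs.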