Latest News
Alibaba's Qwen (Qianwen) team has announced the launch of Qwen3.5-Omni, an omni-modal large model. The Qwen3.5-Omni series comes in three sizes: Plus, Flash, and Light. The Instruct version supports a 256K long context, enough for over 10 hours of audio input or over 400 seconds of 720P (1 FPS) audiovisual input. The model was natively pre-trained as a multimodal model on massive amounts of text and visual data and over 1 billion hours of audiovisual data, and it demonstrates outstanding multimodal perception and generation capabilities. Compared with Qwen3-Omni, Qwen3.5-Omni has significantly stronger multilingual capabilities, supporting speech recognition in 113 languages and dialects and speech generation in 36 languages and dialects.

