KeLing AI 3.0 series models officially launched: All-in-One model system ushers in a new era of video creation

09:01 05/02/2026
GMT Eight
On February 5th, KeLing AI officially launched its 3.0 series models globally. The release is currently open to Black Gold members and is expected to roll out to all users in the near future. It includes the KeLing Video 3.0, KeLing Video 3.0 Omni, KeLing Image 3.0, and KeLing Image 3.0 Omni models, covering the entire film and television production process: image and video generation, editing, and post-production. This marks AI's official entry into the core production workflow of film and creative content, as KeLing AI enters the 3.0 era.

Built on a new All-in-One product and technology concept of integrated multi-modal input and output, the KeLing 3.0 series forms a unified video model system with highly consistent multi-modal inputs and outputs, supporting native creation. It consolidates the understanding, generation, and editing of visuals into one continuous flow within a unified framework, allowing creators, for the first time, to complete the entire creation process within a single model. Creators can use text, images, audio, and video as inputs simultaneously and directly obtain professional-grade output, without splitting the workflow across multiple tools and steps.

In terms of stability and expressiveness, the KeLing 3.0 series has been strengthened at several key points. It achieves a breakthrough on the industry's long-standing consistency problem: by combining capabilities such as character image uploads, voice binding, and the globally pioneering "image-video + main reference" technology, the models keep character appearance, movement, and voice stable through complex camera switches, with clear text and recognizable brand logos. Even in multilingual scenarios, the visual style and character features remain highly consistent.
On the narrative level, the models support continuous generation of up to 15 seconds and introduce intelligent shot composition and custom shot control, letting creators organize shot rhythm and narrative structure directly, without relying on fragmented stitching, so that shots carry emotional progression and tension.

KeLing Video 3.0: Movie-level storytelling and precise control

The new intelligent storyboard system acts as an AI director, deeply interpreting script intent and automatically scheduling blocking and scenes. Whether classic dialogue or complex cross-shot transitions, sequences can be generated with a single click, significantly reducing post-production correction costs. With the globally pioneering "image-video + main reference" technology, creators can anchor specific elements of an image: regardless of camera movement, the main characters, props, and scene features remain stable, effectively solving the industry's long-standing "character collapse" problem. The models also support up to 15 seconds of continuous generation and adapt to multiple languages (Chinese, English, Japanese, Korean, Spanish) and regional accents and dialects (Cantonese, Sichuanese, Northeastern, Beijing), delivering emotionally rich, audio-visually synchronized performances. These capabilities make AI more than a tool: an intelligent creative partner capable of executing a director's intent.

KeLing Video 3.0 Omni: All-around references and ultimate consistency

The Omni version further improves character consistency and instruction following. Creators only need to upload reference materials for the model to extract and bind the subject's specific visual features and voice.
Using feature decoupling technology, elements such as characters and props can be freely reused across different scenarios while keeping the same facial features and voice. This version also reduces visual errors, adds dynamism, and overcomes challenges such as text deformation. Combined with flexible shot control, the generated content meets the professional film and television standard of "direct delivery," in effect giving directors highly controllable "digital actors" and a "virtual filming crew."

KeLing Image 3.0 series models: Strengthening static storytelling with 4K ultra-high definition

The KeLing Image 3.0 and KeLing Image 3.0 Omni models focus on enhancing the "narrative feel" of static images, telling complete stories in still frames. They can deeply deconstruct the visual and auditory elements implied by a prompt's keywords, precisely control composition and viewpoint logic, and adapt well to professional needs such as film and television storyboarding and scene setup. The new versions support direct output of 2K/4K ultra-high-definition images and introduce a group-image series function: while enhancing the realism of the imagery, it ensures a high degree of consistency in style, lighting, and detail across a set of images, meeting the strict accuracy and consistency requirements of professional visual materials.

Breaking the barriers of creation: AI becomes part of the core production process through three major transitions

KeLing AI 3.0 has passed through three critical transitions: from "usable" to "controllable" to "professionally scheduled." Since releasing the world's first DiT video generation model for users in June 2024, KeLing AI has driven the industry into the "usable era."
After entering the 2.0 stage, it evolved from "usable" to "user-friendly" through continuous improvement in model capability and performance. Built on the All-in-One concept and on the foundation of the recently introduced O1 and 2.6 models, the KeLing AI 3.0 series further deepens the Multi-modal Visual Language (MVL) interaction concept to achieve a systematic leap from "basic generation" to "professional scheduling," with key breakthroughs in generation quality and professional controllability. Through core capabilities such as intelligent shot composition, "image-video + main reference," and audio-visual synchronization across multiple languages and accents, the models don't just understand creative intent; they can systematically schedule and coordinate shot rhythm, character relationships, and audio-visual structure. Creators can organize shot compositions, anchor subjects, and advance narratives within a single model, allowing KeLing AI to evolve from a single-point generation tool into a next-generation "creative interface" for content creators. In film and advertising, creators can quickly validate ideas using intelligent shot composition and other shot-scheduling capabilities, while stable character consistency can significantly accelerate the construction of digital assets in games and virtual production. According to public data, as of December 2025 KeLing AI has more than 60 million creators, has generated more than 600 million videos, serves more than 30,000 corporate users, and has reached an annualized revenue run rate of 240 million USD. The release of the KeLing 3.0 series marks AI's transition from a simple generation tool to a creative collaborator that understands creative intent and executes shot compositions, signaling the arrival of the "everyone is a director" era.