Zhiyuan Research Institute open-sources Video-XL-2, an ultra-long video understanding model
On June 3, the Zhiyuan Research Institute announced the release of Video-XL-2, a new-generation ultra-long video understanding model developed in collaboration with Shanghai Jiao Tong University and other institutions. The new model reportedly extends the length of video that can be processed significantly, and supports efficient processing of video inputs of up to ten thousand frames on a single GPU. The Video-XL-2 model weights are now fully open to the community.