Latest News
On August 12, Tencent released Hunyuan Large-Vision, a multimodal understanding model. It uses a mixture-of-experts (MoE) architecture with 52B activated parameters, accepts images at arbitrary resolution as well as video and 3D spatial inputs, and focuses on improving scene understanding across multiple languages. (Jiemian News)