Xiaohongshu open-sources a multimodal large model.
Xiaohongshu's Hi Lab has released dots.vlm1, the first vision-language model in its open-source dots model family. Built on a 1.2-billion-parameter vision encoder paired with the DeepSeek V3 LLM, the model reaches near-SOTA performance in visual perception and reasoning after large-scale pre-training and fine-tuning.