DeepSeek team releases new visual compression model DeepSeek-OCR
On October 20, the DeepSeek-AI team released a new piece of research, DeepSeek-OCR, which proposes compressing long text contexts through the visual modality. The long context is rendered as an image and then fed to the model, so that content originally requiring thousands or tens of thousands of text tokens can be represented with only a few hundred visual tokens.
DeepSeek-OCR reportedly consists of two parts: the core encoder, DeepEncoder, and the decoder, DeepSeek3B-MoE-A570M. DeepEncoder is designed to keep activations low under high-resolution input while achieving a high compression ratio, so that the number of visual tokens stays within a manageable range.
Experiments show that when the number of text tokens is no more than 10 times the number of visual tokens (a compression ratio within 10x), the model's OCR (text recognition) accuracy reaches 97%; even at a compression ratio of 20x, accuracy remains at about 60%. These results point to considerable potential for research on compressing long historical contexts and on memory mechanisms in large language models. DeepSeek-OCR also has high practical value.
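The compression ratio quoted above is simply the number of text tokens divided by the number of visual tokens. The following is a minimal sketch of that arithmetic; the function names are hypothetical and chosen for illustration only, and the lookup merely echoes the two figures reported in this article rather than predicting model behavior.

```python
def compression_ratio(text_tokens: int, visual_tokens: int) -> float:
    """Text tokens divided by the visual tokens used to represent them."""
    if visual_tokens <= 0:
        raise ValueError("visual_tokens must be positive")
    return text_tokens / visual_tokens


def reported_accuracy(ratio: float) -> str:
    """Echo the figures quoted in the article (illustrative, not a prediction)."""
    if ratio <= 10:
        return "about 97% reported within 10x"
    if ratio <= 20:
        return "falls toward about 60% reported at 20x"
    return "no figure reported beyond 20x"


# Example: 1,000 text tokens rendered into 100 visual tokens.
ratio = compression_ratio(1000, 100)
print(ratio, reported_accuracy(ratio))
```

For instance, a 1,000-token passage represented by 100 visual tokens sits exactly at the 10x boundary, while the same passage squeezed into 50 visual tokens is at 20x.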
On the OmniDocBench benchmark, DeepSeek-OCR surpassed StepFun's GOT-OCR2.0 (256 tokens per page) using only 100 visual tokens, and outperformed the Shanghai AI Lab's MinerU2.0 (more than 6,000 tokens per page on average) using fewer than 800 visual tokens. In production, DeepSeek-OCR can generate more than 200,000 pages of training data for large language models and vision-language models per day on a single A100-40G GPU.
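To put these figures in perspective, the sketch below works through the arithmetic, taking the article's numbers at face value as round bounds (256 vs. 100 tokens, 6,000 vs. 800 tokens, 200,000 pages per day). The helper names are hypothetical, and because the article gives "over" and "less than" bounds, the results are rough lower bounds rather than measured values.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def token_reduction(baseline_tokens: int, visual_tokens: int) -> float:
    """How many times fewer tokens per page are used than the baseline."""
    return baseline_tokens / visual_tokens


def pages_per_second(pages_per_day: int) -> float:
    """Average sustained throughput implied by a daily page count."""
    return pages_per_day / SECONDS_PER_DAY


# Token budgets per page, using the article's figures as bounds.
print(f"vs GOT-OCR2.0: {token_reduction(256, 100):.2f}x fewer tokens")
print(f"vs MinerU2.0:  at least {token_reduction(6000, 800):.1f}x fewer tokens")

# Throughput on a single A100-40G, as reported.
print(f"throughput:    about {pages_per_second(200_000):.2f} pages/second")
```

On these assumptions, 200,000 pages per day works out to a little over two pages per second sustained on one card.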