
Mistral: OCR 4 — structured document extraction with bounding boxes in 170 languages


Mistral OCR 4 is a new optical character recognition model that tops the OlmOCRBench leaderboard with 85.20 points, supports 170 languages, and delivers paragraph-level bounding boxes — all at a price of $4 per 1,000 pages.


This article was generated using artificial intelligence from primary sources.

Mistral AI has released OCR 4 — a new optical character recognition model that extracts not just text but the entire page structure with spatial paragraph coordinates from scanned and digital documents.

What does Mistral OCR 4 bring that is new?

The model is identified as mistral-ocr-4-0; the alias mistral-ocr-latest now points to this version. The key novelty is the include_blocks parameter, which returns an array of blocks with paragraph-level bounding boxes — rectangular frames that define the position of each paragraph on the page together with reading order. Alongside coordinates, each block carries a structural label: heading, table, equation, caption, header, or footer.
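As a rough illustration of how the block output might be consumed, the sketch below builds a request body and then filters blocks by structural label in reading order. Only the model identifier, the include_blocks parameter, and the label names come from the description above. The shape of the document field, the sample block fields (label, order, bbox, text), and both helper functions are assumptions made for illustration and are not confirmed against Mistral's API reference.

```python
from typing import Iterable, Optional


def build_ocr_request(document_url: str) -> dict:
    """Build a request body asking for paragraph-level blocks.

    "mistral-ocr-4-0" and "include_blocks" are named in the article;
    the layout of the "document" field is an assumption.
    """
    return {
        "model": "mistral-ocr-4-0",
        "document": {"type": "document_url", "document_url": document_url},
        "include_blocks": True,
    }


def blocks_in_reading_order(
    blocks: Iterable[dict], labels: Optional[set] = None
) -> list:
    """Return blocks sorted by reading order, optionally filtered by label.

    Assumes each block is a dict with hypothetical "label" and "order" keys.
    """
    selected = [b for b in blocks if labels is None or b["label"] in labels]
    return sorted(selected, key=lambda b: b["order"])


# Hypothetical blocks for one page; bbox is assumed to be [x0, y0, x1, y1].
sample_blocks = [
    {"label": "table", "order": 2, "bbox": [40, 300, 560, 520], "text": "| Q1 | Q2 |"},
    {"label": "heading", "order": 0, "bbox": [40, 40, 560, 80], "text": "Annual report"},
    {"label": "footer", "order": 3, "bbox": [40, 780, 560, 800], "text": "Page 1"},
    {"label": "caption", "order": 1, "bbox": [40, 90, 560, 110], "text": "Figure 1"},
]

if __name__ == "__main__":
    # Drop page furniture (header/footer) and keep content blocks in order.
    for block in blocks_in_reading_order(sample_blocks, {"heading", "caption", "table"}):
        print(block["label"], block["bbox"], block["text"])
```

Filtering on labels in this way is one option for removing headers and footers before passing text to a downstream pipeline. The real field names should be checked against the official response schema before relying on them.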

Benchmarks: top of the OlmOCRBench leaderboard

Mistral OCR 4 achieves 85.20 points on OlmOCRBench, currently the highest score on that leaderboard, and 93.07 points on OmniDocBench. On the internal Crawl Multilingual test it reaches 98 points. In human preference evaluations the model records an average win rate of 72% over tested alternatives, a notable jump compared to previous Mistral OCR versions.

Support for 170 languages and deployment options

The model covers 170 languages organized into 10 language groups, and input formats include PDF, DOC, PPT, and OpenDocument files. For organizations where data sovereignty is important, Mistral OCR 4 is available as a self-hosted solution within a single container — without sending documents to external servers. Integrations are also available on AWS SageMaker, Microsoft Foundry, and Snowflake.

Pricing and availability

The standard API charges $4 per 1,000 pages, while the Batch API reduces the cost to $2 per 1,000 pages — making it attractive for bulk archive processing. On the Document AI platform the price is $5 per 1,000 pages. Compared to earlier Mistral OCR versions that did not offer structural blocks, OCR 4 delivers significantly richer output on the same infrastructure, suitable for downstream processing in RAG systems and digital archives.
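To make the price difference concrete, here is a minimal cost estimate using the three per-1,000-page rates quoted above. It assumes billing is strictly proportional to page count, with no minimum charge, rounding rule, or volume discount, none of which the source specifies. The tier keys and the function name are hypothetical.

```python
from decimal import Decimal

# USD per 1,000 pages, as quoted in the article. The tier keys are
# illustrative names, not official product identifiers.
PRICE_PER_1000_PAGES = {
    "standard": Decimal("4"),
    "batch": Decimal("2"),
    "document_ai": Decimal("5"),
}


def estimate_cost(pages: int, tier: str) -> Decimal:
    """Estimate cost in USD, assuming purely linear per-page pricing."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return PRICE_PER_1000_PAGES[tier] * pages / 1000


if __name__ == "__main__":
    archive_pages = 250_000
    for tier in PRICE_PER_1000_PAGES:
        print(f"{tier}: ${estimate_cost(archive_pages, tier)}")
```

Under these assumptions a 250,000-page archive comes to $1,000 on the standard API and $500 on the Batch API.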

Frequently Asked Questions

What is OCR and what does Mistral OCR 4 do?
OCR (Optical Character Recognition) is a technology that converts images of text or scanned documents into machine-readable text. Mistral OCR 4 goes a step further: in addition to text extraction it returns structural labels such as headings, tables, and captions, as well as spatial coordinates (bounding boxes) for each paragraph.
How much does Mistral OCR 4 cost?
The API price is $4 per 1,000 pages, while the Batch API offers a price of $2 per 1,000 pages. On the Document AI platform the price is $5 per 1,000 pages.