OCR API

Three OCR engines to match your accuracy, speed, and cost requirements, from fast Tesseract extraction to Claude AI-powered vision understanding of complex documents.

Simple (Tesseract) | Advanced (DocTR) | Vision (Claude AI) | AI Enhancement

3 Engines: Choose Your Fit
PDF + Images: PNG, JPEG, WebP

Key Features

Simple OCR (Tesseract)

Fast, free text extraction using Tesseract. Best for clean, printed documents. Process up to 20 pages synchronously with confidence scoring.

Advanced OCR (DocTR)

Neural network-powered OCR with word-level bounding boxes, confidence scores, and layout-aware extraction. Process up to 10 pages synchronously. No API costs.

Vision OCR (Claude AI)

The highest accuracy option powered by Claude's vision capabilities. Understands document context — handles complex layouts, handwriting, forms, tables, and low-quality scans. Output as text, Markdown, or JSON.

AI Enhancement

Optional post-processing step that uses AI to fix common OCR errors like character confusion, broken words, and formatting artifacts.

Detailed Output Mode

Get per-page breakdowns with word-level coordinates and confidence scores. Build precise extraction pipelines based on text positions.
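
As a rough illustration of how per-page, word-level confidence scores might be used, the sketch below flags low-confidence words for manual review. The `Page` and `Word` shapes, the field names, and the 0.0 to 1.0 confidence scale are assumptions made for this example and are not taken from the API's actual response format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    text: str
    confidence: float  # assumption: scores range from 0.0 to 1.0


@dataclass
class Page:
    number: int
    words: List[Word]


def words_needing_review(pages: List[Page], threshold: float = 0.8) -> List[Tuple[int, str]]:
    """Return (page number, word text) pairs whose confidence is below the threshold."""
    return [
        (page.number, word.text)
        for page in pages
        for word in page.words
        if word.confidence < threshold
    ]


# Hypothetical detailed output for a two-page document.
pages = [
    Page(1, [Word("Invoice", 0.98), Word("N0.", 0.41)]),
    Page(2, [Word("Total", 0.95)]),
]
flagged = words_needing_review(pages)
```

A threshold like 0.8 is only a starting point; the right cutoff depends on the engine and on how costly a missed error is in your pipeline.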

Async Processing

Process unlimited pages via background jobs. Simple OCR auto-queues beyond 20 pages, Advanced beyond 10, and Vision beyond 3.
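
The thresholds above can be expressed as a small lookup. This is a client-side sketch for predicting whether a request will be queued; the engine identifiers (`simple`, `advanced`, `vision`) and the function name are hypothetical, and only the page limits come from the description above.

```python
# Synchronous page limits per engine, as stated above.
# The engine keys themselves are assumed names, not confirmed API values.
SYNC_PAGE_LIMITS = {"simple": 20, "advanced": 10, "vision": 3}


def will_be_queued(engine: str, page_count: int) -> bool:
    """Return True if a document of this size would exceed the engine's sync limit."""
    if engine not in SYNC_PAGE_LIMITS:
        raise ValueError(f"unknown engine: {engine!r}")
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    return page_count > SYNC_PAGE_LIMITS[engine]
```

For example, a 20-page document on the Simple engine stays synchronous, while a 4-page document on the Vision engine would be queued as a background job.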

Use Cases

See how teams are using this API in production

Invoice & Receipt Processing

Extract text from scanned invoices and receipts for automated bookkeeping, expense management, and accounts payable workflows.

Document Digitization

Convert scanned paper documents into searchable, editable text. Process archives of historical or legacy documents at scale.

Form Data Extraction

Extract field values from scanned forms — applications, surveys, medical forms — using word-level coordinates for precise field mapping.
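
One way field mapping by coordinates could work is to define a rectangle for each form field and collect the words whose centers fall inside it. The sketch below assumes words arrive with axis-aligned bounding boxes in page coordinates and that each field fits on a single line; the `Box` and `Word` types and the `field_value` helper are illustrative, not part of the API.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


class Word(NamedTuple):
    text: str
    box: Box  # assumption: axis-aligned box in page coordinates


def field_value(words: List[Word], region: Box) -> str:
    """Join the words whose box centers lie inside the region, left to right."""

    def center_inside(box: Box) -> bool:
        cx = (box.left + box.right) / 2
        cy = (box.top + box.bottom) / 2
        return region.left <= cx <= region.right and region.top <= cy <= region.bottom

    hits = [w for w in words if center_inside(w.box)]
    hits.sort(key=lambda w: w.box.left)  # single-line field: order by x position
    return " ".join(w.text for w in hits)


# Hypothetical words from a scanned form, deliberately out of reading order.
words = [
    Word("Name:", Box(10, 10, 60, 25)),
    Word("Doe", Box(130, 10, 165, 25)),
    Word("Jane", Box(80, 10, 120, 25)),
    Word("Date:", Box(10, 40, 60, 55)),
]
name_region = Box(70, 5, 300, 30)
name = field_value(words, name_region)
```

Testing the word's center rather than its full box tolerates small scan offsets; multi-line fields would additionally need words grouped into lines before sorting.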

Handwritten Content

Use Vision OCR to process handwritten notes, filled forms, and annotations that standard OCR engines struggle with.

Low-Quality Scans

Vision OCR handles faded text, skewed pages, and low-resolution scans where traditional OCR produces poor results.

Multi-Language Documents

Process documents in multiple languages. Supports PDF and image inputs (PNG, JPEG, WebP).

Why Choose Us

Three Engines, One API

Choose the right engine for your accuracy, speed, and cost requirements. Same API interface for all three.
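
The engine descriptions on this page suggest a simple selection rule, sketched below. The engine identifiers and the `choose_engine` helper are hypothetical; the rule only encodes the guidance given here (Vision for handwriting, complex layouts, and poor scans; Advanced for word-level bounding boxes; Simple for clean printed text).

```python
def choose_engine(
    handwriting: bool = False,
    low_quality_scan: bool = False,
    complex_layout: bool = False,
    needs_word_boxes: bool = False,
) -> str:
    """Pick an engine name from document traits (names are assumed, not confirmed)."""
    if handwriting or low_quality_scan or complex_layout:
        return "vision"    # highest accuracy, context-aware
    if needs_word_boxes:
        return "advanced"  # neural OCR with word-level bounding boxes
    return "simple"        # fast Tesseract extraction for clean printed pages
```

Because all three engines share one interface, a rule like this can live in a single place and be tuned as cost or accuracy needs change.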

Beyond Characters

Vision OCR understands document context and structure, not just character shapes. It handles handwriting, complex layouts, and low-quality scans that character-based OCR engines struggle with.

Scale Without Limits

Async processing handles unlimited pages. Jobs that exceed an engine's synchronous page limit are queued automatically, so you never hit a page limit.

Extract Text from Any Document

Choose the right OCR engine for your use case. Start your free trial.