OCR API
Three OCR engines to match your accuracy, speed, and cost requirements, from fast Tesseract extraction to Claude-powered vision analysis of complex documents.
Key Features
Simple OCR (Tesseract)
Fast, free text extraction using Tesseract. Best for clean, printed documents. Process up to 20 pages synchronously with confidence scoring.
Advanced OCR (DocTR)
Neural network-powered OCR with word-level bounding boxes, confidence scores, and layout-aware extraction. Process up to 10 pages synchronously. No API costs.
Vision OCR (Claude AI)
The highest-accuracy option, powered by Claude's vision capabilities. Understands document context and handles complex layouts, handwriting, forms, tables, and low-quality scans. Output as text, Markdown, or JSON.
AI Enhancement
Optional post-processing step that uses AI to fix common OCR errors like character confusion, broken words, and formatting artifacts.
Detailed Output Mode
Get per-page breakdowns with word-level coordinates and confidence scores. Build precise extraction pipelines based on text positions.
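As a rough illustration of how word-level confidence scores might feed a review step, the sketch below flags words that fall under a threshold. The response shape is an assumption made for this example (a list of pages, each a list of word records with hypothetical `text`, `confidence`, and `bbox` keys, and confidence on a 0.0 to 1.0 scale); check the actual API response before relying on it.

```python
from typing import TypedDict


class Word(TypedDict):
    # Hypothetical word record; the real field names may differ.
    text: str
    confidence: float  # assumed to be on a 0.0-1.0 scale
    bbox: tuple[float, float, float, float]  # assumed (x0, y0, x1, y1)


def low_confidence_words(
    pages: list[list[Word]], threshold: float = 0.8
) -> list[tuple[int, Word]]:
    """Return (page_number, word) pairs whose confidence is below threshold."""
    flagged = []
    for page_number, words in enumerate(pages, start=1):
        for word in words:
            if word["confidence"] < threshold:
                flagged.append((page_number, word))
    return flagged
```

Flagged words can then be routed to manual review or to a second pass with a different engine.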
Async Processing
Process unlimited pages via background jobs. Requests are queued automatically once they exceed the synchronous limit: more than 20 pages for Simple OCR, 10 for Advanced OCR, and 3 for Vision OCR.
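The thresholds above can be captured in a small client-side helper that predicts whether a request will return synchronously or be queued. This is only a sketch: the engine keys (`simple`, `advanced`, `vision`) and the function name are illustrative, not identifiers from the API, and the limits are the ones quoted in the text.

```python
# Synchronous page limits as stated above; the engine keys are illustrative.
SYNC_PAGE_LIMITS = {"simple": 20, "advanced": 10, "vision": 3}


def processing_mode(engine: str, page_count: int) -> str:
    """Return "sync" if the job fits the engine's synchronous limit, else "async"."""
    if engine not in SYNC_PAGE_LIMITS:
        raise ValueError(f"unknown engine: {engine!r}")
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    return "sync" if page_count <= SYNC_PAGE_LIMITS[engine] else "async"
```

A caller could use the result to decide whether to wait for an immediate response or to poll for a background job.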
Use Cases
See how teams are using this API in production
Invoice & Receipt Processing
Extract text from scanned invoices and receipts for automated bookkeeping, expense management, and accounts payable workflows.
Document Digitization
Convert scanned paper documents into searchable, editable text. Process archives of historical or legacy documents at scale.
Form Data Extraction
Extract field values from scanned forms (applications, surveys, medical forms) using word-level coordinates for precise field mapping.
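One way field mapping from word coordinates could work is to define a rectangle per form field and collect the words whose centers fall inside it. The sketch below assumes each word is a dict with hypothetical `text` and `bbox` keys, with `bbox` as `(x0, y0, x1, y1)` in page coordinates; the real output format may differ.

```python
Box = tuple[float, float, float, float]  # assumed (x0, y0, x1, y1)


def center(box: Box) -> tuple[float, float]:
    """Return the center point of a bounding box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def map_fields(words: list[dict], regions: dict[str, Box]) -> dict[str, str]:
    """Assign each word to the first region containing its center, then join the text."""
    fields: dict[str, list[dict]] = {name: [] for name in regions}
    for word in words:
        cx, cy = center(word["bbox"])
        for name, (x0, y0, x1, y1) in regions.items():
            if x0 <= cx <= x1 and y0 <= cy <= y1:
                fields[name].append(word)
                break
    # Order words top-to-bottom, then left-to-right. This is a simple
    # approximation of reading order that assumes words on the same line
    # share the same top coordinate.
    return {
        name: " ".join(
            w["text"]
            for w in sorted(ws, key=lambda w: (w["bbox"][1], w["bbox"][0]))
        )
        for name, ws in fields.items()
    }
```

The region rectangles would come from a template of the form, measured once per form layout.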
Handwritten Content
Use Vision OCR to process handwritten notes, filled forms, and annotations that standard OCR engines struggle with.
Low-Quality Scans
Vision OCR handles faded text, skewed pages, and low-resolution scans where traditional OCR produces poor results.
Multi-Language Documents
Process documents in multiple languages. Supports PDF and image inputs (PNG, JPEG, WebP).
Why Choose Us
Three Engines, One API
Choose the right engine for your accuracy, speed, and cost requirements. Same API interface for all three.
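Engine choice can be reduced to a simple rule of thumb based on the engine descriptions on this page. The sketch below is a heuristic only, not an official selection rule, and the engine names and parameters are illustrative.

```python
def choose_engine(
    *,
    handwriting: bool = False,
    low_quality: bool = False,
    complex_layout: bool = False,
    needs_word_boxes: bool = False,
) -> str:
    """Suggest an engine from document traits (heuristic, not an official rule)."""
    # Vision OCR is described as the option for handwriting, complex
    # layouts, and low-quality scans.
    if handwriting or low_quality or complex_layout:
        return "vision"
    # Advanced OCR is described as providing word-level bounding boxes.
    if needs_word_boxes:
        return "advanced"
    # Simple OCR is described as best for clean, printed documents.
    return "simple"
```

For example, `choose_engine(handwriting=True)` suggests the vision engine, while a clean printed page with no special requirements falls through to the simple engine.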
Beyond Characters
Vision OCR understands document context and structure, not just character shapes. It handles the handwriting, complex layouts, and degraded scans that standard OCR engines struggle with.
Scale Without Limits
Async processing handles unlimited pages. Large jobs are queued automatically, so you never hit a page limit.
Extract Text from Any Document
Choose the right OCR engine for your use case. Start your free trial.