POST https://app.alternapdf.com/api/v1/extract/pdf-images

Extract Images

Extract the embedded images from a PDF document. The response is a ZIP archive containing each extracted image in its original format. Use the minimum-dimension filters to skip small icons and decorative elements, and the page-range parameters to limit extraction to part of a large document.

Content-Type: multipart/form-data

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| file (required) | file | | The PDF file to extract images from. |
| min_width | integer | 10 | Minimum image width in pixels. Images narrower than this are excluded. |
| min_height | integer | 10 | Minimum image height in pixels. Images shorter than this are excluded. |
| start_page | integer | | First page to extract images from. Defaults to the first page of the document. |
| end_page | integer | | Last page to extract images from. Defaults to the last page of the document. |
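
As a rough sketch of how these parameters combine, the helper below assembles the form fields for a request and leaves `start_page` and `end_page` out when they are not set, so the documented defaults apply. The name `build_form_fields` and its client-side checks (non-negative dimensions, `start_page` not greater than `end_page`) are assumptions for illustration, not documented server behavior.

```python
from typing import Dict, Optional


def build_form_fields(
    min_width: int = 10,
    min_height: int = 10,
    start_page: Optional[int] = None,
    end_page: Optional[int] = None,
) -> Dict[str, str]:
    """Build the non-file form fields for the extract/pdf-images endpoint.

    Hypothetical helper: the validation rules are assumptions made on the
    client side, not rules stated by the API documentation.
    """
    if min_width < 0 or min_height < 0:
        raise ValueError("min_width and min_height must be non-negative")
    if start_page is not None and end_page is not None and start_page > end_page:
        raise ValueError("start_page must not exceed end_page")

    fields = {"min_width": str(min_width), "min_height": str(min_height)}
    # Omit unset page bounds so the server falls back to its documented
    # defaults (first and last page of the document).
    if start_page is not None:
        fields["start_page"] = str(start_page)
    if end_page is not None:
        fields["end_page"] = str(end_page)
    return fields
```

The returned dictionary can be passed as the `data` argument of `requests.post`, alongside the `files` argument that carries the PDF.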

Code Examples

cURL
curl -X POST "https://app.alternapdf.com/api/v1/extract/pdf-images" \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "file=@document.pdf" \
  -F "min_width=50" \
  -F "min_height=50" \
  -F "start_page=1" \
  -F "end_page=10" \
  --output images.zip
Python
import requests
import zipfile
import io

url = "https://app.alternapdf.com/api/v1/extract/pdf-images"
headers = {"X-API-Key": "YOUR_API_KEY"}

with open("document.pdf", "rb") as f:
    files = {"file": ("document.pdf", f, "application/pdf")}
    data = {
        "min_width": "50",
        "min_height": "50",
        "start_page": "1",
        "end_page": "10",
    }
    response = requests.post(url, headers=headers, files=files, data=data)

# Stop here on an HTTP error so an error body is not saved as a ZIP
response.raise_for_status()

# Check response headers for image count and total size
total_images = response.headers.get("X-Total-Images")
total_size = response.headers.get("X-Total-Size-Bytes")
print(f"Found {total_images} images ({total_size} bytes)")

# Save the ZIP archive
with open("images.zip", "wb") as out:
    out.write(response.content)

# Or extract images directly
with zipfile.ZipFile(io.BytesIO(response.content)) as zf:
    zf.extractall("extracted_images/")
    print(f"Extracted: {zf.namelist()}")
JavaScript
import fs from "fs";

const formData = new FormData();
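// The built-in FormData (Node 18+) needs a Blob, not a read stream, for file fields
const pdf = new Blob([fs.readFileSync("document.pdf")], { type: "application/pdf" });
formData.append("file", pdf, "document.pdf");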
formData.append("min_width", "50");
formData.append("min_height", "50");
formData.append("start_page", "1");
formData.append("end_page", "10");

const response = await fetch("https://app.alternapdf.com/api/v1/extract/pdf-images", {
  method: "POST",
  headers: {
    "X-API-Key": "YOUR_API_KEY",
  },
  body: formData,
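});

// Stop here on an HTTP error so an error body is not saved as a ZIP
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}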

// Check response headers for image count and total size
const totalImages = response.headers.get("X-Total-Images");
const totalSize = response.headers.get("X-Total-Size-Bytes");
console.log(`Found ${totalImages} images (${totalSize} bytes)`);

// Save the ZIP archive
const buffer = await response.arrayBuffer();
fs.writeFileSync("images.zip", Buffer.from(buffer));

Response

Returns a ZIP archive containing all extracted images in their original formats. Response headers provide summary information about the extracted images.

Response Headers

| Header | Type | Description |
|---|---|---|
| Content-Type | string | application/zip |
| Content-Disposition | string | attachment; filename="document_images.zip" |
| X-Total-Images | integer | Total number of images extracted from the PDF. |
| X-Total-Size-Bytes | integer | Total size of all extracted images in bytes (uncompressed). |
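
One possible use of these headers is a sanity check on the downloaded archive. The sketch below compares `X-Total-Images` and `X-Total-Size-Bytes` with the entries actually found in the ZIP. The function name `archive_matches_headers` is hypothetical, and the check assumes that the archive holds exactly one file per image and that the size header equals the sum of the uncompressed entry sizes.

```python
import io
import zipfile
from typing import Mapping


def archive_matches_headers(headers: Mapping[str, str], zip_bytes: bytes) -> bool:
    """Return True if the ZIP contents agree with the summary headers.

    Hypothetical helper: assumes one ZIP entry per extracted image.
    """
    expected_count = int(headers["X-Total-Images"])
    expected_size = int(headers["X-Total-Size-Bytes"])

    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        # Ignore directory entries; file_size is the uncompressed size.
        entries = [info for info in zf.infolist() if not info.is_dir()]

    actual_size = sum(info.file_size for info in entries)
    return len(entries) == expected_count and actual_size == expected_size
```

With `requests`, `response.headers` and `response.content` can be passed in directly. A plain `dict` also works, but its keys must match the header names exactly because it is case-sensitive.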

ZIP Archive Contents

The ZIP archive contains image files named by their page number and index:

document_images.zip
  ├── page_1_img_1.png
  ├── page_1_img_2.jpeg
  ├── page_3_img_1.png
  ├── page_5_img_1.jpeg
  └── page_5_img_2.png
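
Because the file names encode the page number and image index, the archive listing can be grouped by page without opening any image. The sketch below is a minimal illustration that assumes every entry follows the `page_<page>_img_<index>.<ext>` pattern shown above. The name `group_by_page` is hypothetical, and entries that do not match the pattern are skipped.

```python
import posixpath
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Pattern inferred from the example listing, e.g. "page_3_img_1.png"
NAME_PATTERN = re.compile(r"^page_(\d+)_img_(\d+)\.(\w+)$")


def group_by_page(names: Iterable[str]) -> Dict[int, List[Tuple[int, str]]]:
    """Map page number to a list of (image index, entry name), sorted by index."""
    pages = defaultdict(list)
    for name in names:
        # Match on the base name in case entries sit inside a folder.
        match = NAME_PATTERN.match(posixpath.basename(name))
        if match is None:
            continue  # skip entries that do not follow the assumed pattern
        page, index = int(match.group(1)), int(match.group(2))
        pages[page].append((index, name))
    return {page: sorted(items) for page, items in sorted(pages.items())}
```

The input can come straight from `zipfile.ZipFile.namelist()`, so `group_by_page(zf.namelist())` gives a per-page view of the archive.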