POST https://app.alternapdf.com/api/v1/extract/pdf-images

Extract Images

Extract all embedded images from a PDF document. Returns a ZIP archive containing every image found in the document in its original format. Filter by minimum dimensions to skip small icons and decorative elements, and target specific page ranges for large documents.

Content-Type: multipart/form-data

Parameters

Parameter	Type	Default	Description
`file` required	file	—	The PDF file to extract images from.
`min_width`	integer	`10`	Minimum image width in pixels. Images narrower than this are excluded.
`min_height`	integer	`10`	Minimum image height in pixels. Images shorter than this are excluded.
`start_page`	integer	—	First page to extract images from. Defaults to the first page of the document.
`end_page`	integer	—	Last page to extract images from. Defaults to the last page of the document.
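
As a rough illustration of how the optional parameters combine, the sketch below builds the non-file form fields and checks them client-side before a request is sent. The helper name `build_form_fields` is hypothetical and not part of the API. The validation rules are assumptions: pages are taken to be 1-indexed (as the examples and ZIP file names suggest), and the server may well apply different or additional checks.

```python
from typing import Optional


def build_form_fields(
    min_width: int = 10,
    min_height: int = 10,
    start_page: Optional[int] = None,
    end_page: Optional[int] = None,
) -> dict[str, str]:
    """Return the non-file form fields for the request, as strings."""
    if min_width < 0 or min_height < 0:
        raise ValueError("min_width and min_height must be non-negative")
    # Assumption: pages are 1-indexed.
    for name, page in (("start_page", start_page), ("end_page", end_page)):
        if page is not None and page < 1:
            raise ValueError(f"{name} must be 1 or greater")
    if start_page is not None and end_page is not None and start_page > end_page:
        raise ValueError("start_page must not exceed end_page")

    fields = {"min_width": str(min_width), "min_height": str(min_height)}
    # Leave unset page bounds out entirely so the documented defaults apply.
    if start_page is not None:
        fields["start_page"] = str(start_page)
    if end_page is not None:
        fields["end_page"] = str(end_page)
    return fields
```

The returned dictionary can be passed as the `data` argument of `requests.post`, alongside the `file` upload.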

Code Examples

cURL

curl -X POST "https://app.alternapdf.com/api/v1/extract/pdf-images" \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "file=@document.pdf" \
  -F "min_width=50" \
  -F "min_height=50" \
  -F "start_page=1" \
  -F "end_page=10" \
  --output images.zip

Python

import requests
import zipfile
import io

url = "https://app.alternapdf.com/api/v1/extract/pdf-images"
headers = {"X-API-Key": "YOUR_API_KEY"}

with open("document.pdf", "rb") as f:
    files = {"file": ("document.pdf", f, "application/pdf")}
    data = {
        "min_width": "50",
        "min_height": "50",
        "start_page": "1",
        "end_page": "10",
    }
    response = requests.post(url, headers=headers, files=files, data=data)

# Fail fast on HTTP errors so an error body is never saved as a ZIP
response.raise_for_status()

# Check response headers for image count and total size
total_images = response.headers.get("X-Total-Images")
total_size = response.headers.get("X-Total-Size-Bytes")
print(f"Found {total_images} images ({total_size} bytes)")

# Save the ZIP archive
with open("images.zip", "wb") as out:
    out.write(response.content)

# Or extract images directly
with zipfile.ZipFile(io.BytesIO(response.content)) as zf:
    zf.extractall("extracted_images/")
    print(f"Extracted: {zf.namelist()}")

JavaScript

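import fs from "fs";

// The built-in FormData expects a Blob for file fields, not a read stream
const pdf = new Blob([fs.readFileSync("document.pdf")], { type: "application/pdf" });

const formData = new FormData();
formData.append("file", pdf, "document.pdf");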
formData.append("min_width", "50");
formData.append("min_height", "50");
formData.append("start_page", "1");
formData.append("end_page", "10");

const response = await fetch("https://app.alternapdf.com/api/v1/extract/pdf-images", {
  method: "POST",
  headers: {
    "X-API-Key": "YOUR_API_KEY",
  },
  body: formData,
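});

// Stop on HTTP errors so an error body is never saved as a ZIP
if (!response.ok) {
  throw new Error(`Image extraction failed: HTTP ${response.status}`);
}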

// Check response headers for image count and total size
const totalImages = response.headers.get("X-Total-Images");
const totalSize = response.headers.get("X-Total-Size-Bytes");
console.log(`Found ${totalImages} images (${totalSize} bytes)`);

// Save the ZIP archive
const buffer = await response.arrayBuffer();
fs.writeFileSync("images.zip", Buffer.from(buffer));

Response

Returns a ZIP archive containing all extracted images in their original formats. Response headers provide summary information about the extracted images.

Response Headers

Header	Type	Description
`Content-Type`	string	`application/zip`
`Content-Disposition`	string	`attachment; filename="document_images.zip"`
`X-Total-Images`	integer	Total number of images extracted from the PDF.
`X-Total-Size-Bytes`	integer	Total size of all extracted images in bytes (uncompressed).
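
HTTP header values arrive as strings, so the two summary headers need converting before they can be used as numbers. The following is a minimal sketch of one way to do that; `parse_summary_headers` is a hypothetical helper, and returning `None` for a missing or malformed header is an assumption made for illustration, not documented API behavior.

```python
from typing import Mapping, Optional


def parse_summary_headers(headers: Mapping[str, str]) -> dict[str, Optional[int]]:
    """Convert the summary headers to integers, or None when unavailable."""

    def to_int(name: str) -> Optional[int]:
        value = headers.get(name)
        if value is None:
            return None
        try:
            return int(value)
        except ValueError:
            # Malformed value: treat it as missing rather than failing.
            return None

    return {
        "total_images": to_int("X-Total-Images"),
        "total_size_bytes": to_int("X-Total-Size-Bytes"),
    }
```

With `requests`, `response.headers` can be passed in directly, and its lookups are case-insensitive. A plain `dict` must use the exact header casing shown in the table.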

ZIP Archive Contents

The ZIP archive contains image files named by their page number and index:

document_images.zip
  ├── page_1_img_1.png
  ├── page_1_img_2.jpeg
  ├── page_3_img_1.png
  ├── page_5_img_1.jpeg
  └── page_5_img_2.png
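
Because each file name encodes a page number and an image index, the archive listing can be grouped by page without opening any image. The sketch below assumes the `page_<N>_img_<M>.<ext>` pattern shown above holds for every entry. `group_by_page` is a hypothetical helper; it skips names that do not match and ignores any directory prefix.

```python
import re
from collections import defaultdict

# Matches names such as "page_3_img_1.png" (pattern inferred from the listing above).
ENTRY_PATTERN = re.compile(r"^page_(\d+)_img_(\d+)\.(\w+)$")


def group_by_page(names: list[str]) -> dict[int, list[str]]:
    """Group ZIP entry names by page number, ordered by page and image index."""
    pages: dict[int, list[tuple[int, str]]] = defaultdict(list)
    for name in names:
        # Compare only the base name, in case entries sit inside a folder.
        match = ENTRY_PATTERN.match(name.rsplit("/", 1)[-1])
        if match is None:
            continue
        page, index, _extension = match.groups()
        pages[int(page)].append((int(index), name))
    return {
        page: [name for _, name in sorted(entries)]
        for page, entries in sorted(pages.items())
    }
```

The input can be the result of `zipfile.ZipFile.namelist()`, for example `group_by_page(zf.namelist())`.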