POST https://app.alternapdf.com/api/v1/convert/pdf-to-html

PDF to HTML

Convert PDF documents into clean, structured HTML. Preserves text layout, handles scanned documents with built-in OCR, and produces well-formed markup ready for embedding or further processing.

Content-Type: multipart/form-data

Parameters

Parameter	Type	Default	Description
`file` required	file	—	The PDF file to convert to HTML.
`preserve_layout`	boolean	`true`	Attempt to preserve the original spatial layout in the HTML output.
`include_metadata`	boolean	`false`	Include PDF metadata as HTML meta tags in the output.
`page_separator`	string	`<hr>`	HTML string inserted between pages in the output.
`enable_ocr`	boolean	`true`	Automatically apply OCR to scanned or image-based pages.
`ocr_engine`	string	`auto`	OCR engine to use. Options: `tesseract`, `openai`, `auto`.
`outer_wrapper`	boolean	`true`	Wrap the output in a full HTML document structure with `<html>`, `<head>`, and `<body>` tags.
`title`	string	`Document`	The title for the HTML document when `outer_wrapper` is enabled.
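
The parameters above are sent as multipart form fields next to `file`. As a rough sketch, the helper below collects them into a dictionary suitable for the `data` argument of `requests.post`. The name `build_form_fields` is hypothetical and not part of the API, and encoding booleans as the strings `true` and `false` is an assumption based on the request examples in this page.

```python
from typing import Dict

# Engine names taken from the `ocr_engine` row of the parameter table.
OCR_ENGINES = {"tesseract", "openai", "auto"}


def build_form_fields(
    preserve_layout: bool = True,
    include_metadata: bool = False,
    page_separator: str = "<hr>",
    enable_ocr: bool = True,
    ocr_engine: str = "auto",
    outer_wrapper: bool = True,
    title: str = "Document",
) -> Dict[str, str]:
    """Build the optional form fields (hypothetical helper, not part of the API)."""
    if ocr_engine not in OCR_ENGINES:
        raise ValueError(f"unknown ocr_engine: {ocr_engine!r}")

    def flag(value: bool) -> str:
        # Assumption: booleans travel as "true"/"false", as in the examples.
        return "true" if value else "false"

    return {
        "preserve_layout": flag(preserve_layout),
        "include_metadata": flag(include_metadata),
        "page_separator": page_separator,
        "enable_ocr": flag(enable_ocr),
        "ocr_engine": ocr_engine,
        "outer_wrapper": flag(outer_wrapper),
        "title": title,
    }


# Request an HTML fragment (no <html>/<head>/<body>) with a custom page break.
fields = build_form_fields(
    outer_wrapper=False,
    page_separator='<div class="page-break"></div>',
)
```

Defaults mirror the table, so calling `build_form_fields()` with no arguments should describe the same request as sending only `file`.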

Code Examples

cURL

curl -X POST "https://app.alternapdf.com/api/v1/convert/pdf-to-html" \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "file=@document.pdf" \
  -F "preserve_layout=true" \
  -F "outer_wrapper=true" \
  -F "title=My Document" \
  -F "enable_ocr=true"

Python

import requests

url = "https://app.alternapdf.com/api/v1/convert/pdf-to-html"
headers = {"X-API-Key": "YOUR_API_KEY"}

# Upload the PDF as multipart/form-data; option values are sent as strings
with open("document.pdf", "rb") as f:
    files = {"file": ("document.pdf", f, "application/pdf")}
    data = {
        "preserve_layout": "true",
        "outer_wrapper": "true",
        "title": "My Document",
        "enable_ocr": "true",
    }
    response = requests.post(url, headers=headers, files=files, data=data)

# Raise on HTTP errors instead of trying to parse an error body
response.raise_for_status()

result = response.json()
html_content = result["html"]

# Save to file; explicit UTF-8 avoids depending on the platform default encoding
with open("output.html", "w", encoding="utf-8") as out:
    out.write(html_content)

JavaScript

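import { readFile } from "node:fs/promises";

// Node.js 18+ (ES module): fetch, FormData, and Blob are built in.
// The built-in FormData expects a Blob for file fields, not a read stream.
const pdf = new Blob([await readFile("document.pdf")], { type: "application/pdf" });

const formData = new FormData();
formData.append("file", pdf, "document.pdf");
formData.append("preserve_layout", "true");
formData.append("outer_wrapper", "true");
formData.append("title", "My Document");
formData.append("enable_ocr", "true");

const response = await fetch("https://app.alternapdf.com/api/v1/convert/pdf-to-html", {
  method: "POST",
  headers: {
    "X-API-Key": "YOUR_API_KEY",
  },
  body: formData,
});

// Stop on HTTP errors before parsing the body
if (!response.ok) {
  throw new Error(`Conversion failed with HTTP ${response.status}`);
}

const result = await response.json();
console.log(result.html);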

Response

Returns a JSON object containing the converted HTML content.

JSON Response

{
  "status": "success",
  "filename": "document.pdf",
  "html": "<!DOCTYPE html>\n<html>\n<head>\n<title>My Document</title>\n</head>\n<body>\n<p>Extracted content from the PDF...</p>\n<hr>\n<p>Page 2 content...</p>\n</body>\n</html>"
}

Field	Type	Description
`status`	string	Processing status. Always `success` when the HTTP status is 200.
`filename`	string	Name of the uploaded file.
`html`	string	The converted HTML content. When `outer_wrapper` is true, includes full document structure.
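
A minimal sketch of how a client might read these fields once the JSON body is parsed. The helpers `extract_html` and `has_outer_wrapper` are hypothetical names, and the sample payload is a shortened version of the JSON response shown above. How the API reports failures is not described here, so the check simply rejects anything whose `status` is not `success`.

```python
from typing import Any, Dict


def extract_html(payload: Dict[str, Any]) -> str:
    """Return the `html` field after checking `status` (hypothetical helper)."""
    if payload.get("status") != "success":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    html = payload.get("html")
    if not isinstance(html, str):
        raise ValueError("response has no string `html` field")
    return html


def has_outer_wrapper(html: str) -> bool:
    """Heuristic: a wrapped result starts with a doctype or an <html> tag."""
    return html.lstrip().lower().startswith(("<!doctype", "<html"))


# Shortened copy of the documented sample response.
sample = {
    "status": "success",
    "filename": "document.pdf",
    "html": "<!DOCTYPE html>\n<html>\n<head>\n<title>My Document</title>\n"
            "</head>\n<body>\n<p>Page 1</p>\n<hr>\n<p>Page 2</p>\n</body>\n</html>",
}

html = extract_html(sample)
wrapped = has_outer_wrapper(html)
```

With `outer_wrapper` set to false the same check would presumably return `False`, which is a cheap way to decide whether the result can be embedded directly in an existing page.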