PDF to HTML
POST https://app.alternapdf.com/api/v1/convert/pdf-to-html
Convert PDF documents into clean, structured HTML. Preserves text layout, handles scanned documents with built-in OCR, and produces well-formed markup ready for embedding or further processing.
Content-Type: multipart/form-data
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| file (required) | file | none | The PDF file to convert to HTML. |
| preserve_layout | boolean | true | Attempt to preserve the original spatial layout in the HTML output. |
| include_metadata | boolean | false | Include PDF metadata as HTML meta tags in the output. |
| page_separator | string | <hr> | HTML string inserted between pages in the output. |
| enable_ocr | boolean | true | Automatically apply OCR to scanned or image-based pages. |
| ocr_engine | string | auto | OCR engine to use. Options: tesseract, openai, auto. |
| outer_wrapper | boolean | true | Wrap the output in a full HTML document structure with <html>, <head>, and <body> tags. |
| title | string | Document | The title for the HTML document when outer_wrapper is enabled. |
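Because the request is multipart/form-data, every parameter in the table travels as a string form field. The sketch below shows one way to turn Python values into those fields. `build_form_fields` is a hypothetical helper, not part of any AlternaPDF SDK, and it assumes the API accepts lowercase "true"/"false" for booleans (the examples on this page only show "true").

```python
from typing import Dict


def build_form_fields(
    preserve_layout: bool = True,
    include_metadata: bool = False,
    page_separator: str = "<hr>",
    enable_ocr: bool = True,
    ocr_engine: str = "auto",
    outer_wrapper: bool = True,
    title: str = "Document",
) -> Dict[str, str]:
    """Build string form fields for the pdf-to-html endpoint (hypothetical helper)."""
    # Engine names come from the parameter table above.
    allowed_engines = {"tesseract", "openai", "auto"}
    if ocr_engine not in allowed_engines:
        raise ValueError(f"ocr_engine must be one of {sorted(allowed_engines)}")

    def as_flag(value: bool) -> str:
        # Assumption: the API parses lowercase "true"/"false" form values.
        return "true" if value else "false"

    fields = {
        "preserve_layout": as_flag(preserve_layout),
        "include_metadata": as_flag(include_metadata),
        "page_separator": page_separator,
        "enable_ocr": as_flag(enable_ocr),
        "ocr_engine": ocr_engine,
        "outer_wrapper": as_flag(outer_wrapper),
    }
    # The title only applies when the full document wrapper is requested.
    if outer_wrapper:
        fields["title"] = title
    return fields
```

The returned dictionary can be passed as the `data` argument of `requests.post` next to the `files` upload. Leaving a field out altogether should fall back to the default listed in the table.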
Code Examples
cURL
curl -X POST "https://app.alternapdf.com/api/v1/convert/pdf-to-html" \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "file=@document.pdf" \
  -F "preserve_layout=true" \
  -F "outer_wrapper=true" \
  -F "title=My Document" \
  -F "enable_ocr=true"
Python
import requests

url = "https://app.alternapdf.com/api/v1/convert/pdf-to-html"
headers = {"X-API-Key": "YOUR_API_KEY"}

# Form fields are sent as strings alongside the file upload
data = {
    "preserve_layout": "true",
    "outer_wrapper": "true",
    "title": "My Document",
    "enable_ocr": "true",
}

# Keep the file open until the request has been sent
with open("document.pdf", "rb") as f:
    files = {"file": ("document.pdf", f, "application/pdf")}
    response = requests.post(url, headers=headers, files=files, data=data)

response.raise_for_status()  # Stop here on a non-2xx response
result = response.json()
html_content = result["html"]

# Save to file
with open("output.html", "w", encoding="utf-8") as out:
    out.write(html_content)
JavaScript
// Node.js 18+ (ES module): the built-in FormData takes a Blob, not a read stream
import { readFile } from "node:fs/promises";

const pdf = await readFile("document.pdf");
const formData = new FormData();
formData.append("file", new Blob([pdf], { type: "application/pdf" }), "document.pdf");
formData.append("preserve_layout", "true");
formData.append("outer_wrapper", "true");
formData.append("title", "My Document");
formData.append("enable_ocr", "true");

const response = await fetch("https://app.alternapdf.com/api/v1/convert/pdf-to-html", {
  method: "POST",
  headers: {
    "X-API-Key": "YOUR_API_KEY",
  },
  body: formData,
});
const result = await response.json();
console.log(result.html);
Response
Returns a JSON object containing the converted HTML content.
JSON Response
{
  "status": "success",
  "filename": "document.pdf",
  "html": "<!DOCTYPE html>\n<html>\n<head>\n<title>My Document</title>\n</head>\n<body>\n<p>Extracted content from the PDF...</p>\n<hr>\n<p>Page 2 content...</p>\n</body>\n</html>"
}
| Field | Type | Description |
|---|---|---|
| status | string | Processing status. Always success on 200. |
| filename | string | Name of the uploaded file. |
| html | string | The converted HTML content. When outer_wrapper is true, includes full document structure. |
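The fields above can be checked before the HTML is used. The sketch below validates a decoded response and, for fragment output, splits it into per-page chunks. `extract_html` and `split_pages` are hypothetical helpers. The split assumes the separator appears verbatim between pages, as in the sample response, and that the page content never contains the same string.

```python
from typing import Any, Dict, List


def extract_html(payload: Dict[str, Any]) -> str:
    """Return the html field of a decoded response, or raise if it looks wrong."""
    # The field table states that status is always "success" on HTTP 200.
    if payload.get("status") != "success":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    html = payload.get("html")
    if not isinstance(html, str):
        raise ValueError("response has no 'html' string")
    return html


def split_pages(html: str, page_separator: str = "<hr>") -> List[str]:
    """Naively split converted HTML into pages on the separator string."""
    # Assumption: the separator is inserted verbatim and is unique in the output.
    chunks = (chunk.strip() for chunk in html.split(page_separator))
    return [chunk for chunk in chunks if chunk]


if __name__ == "__main__":
    sample = {
        "status": "success",
        "filename": "document.pdf",
        "html": "<p>Page 1</p>\n<hr>\n<p>Page 2</p>",
    }
    for number, page in enumerate(split_pages(extract_html(sample)), start=1):
        print(number, page)
```

Splitting is most dependable with outer_wrapper set to false and a page_separator that is unlikely to occur in the document itself, such as an HTML comment chosen by the caller. With the wrapper enabled, the first and last chunks also carry the <head> and closing tags.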