POST https://app.alternapdf.com/api/v1/convert/pdf-to-html

PDF to HTML

Convert PDF documents into clean, structured HTML. Preserves text layout, handles scanned documents with built-in OCR, and produces well-formed markup ready for embedding or further processing.

Content-Type: multipart/form-data

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| file (required) | file | | The PDF file to convert to HTML. |
| preserve_layout | boolean | true | Attempt to preserve the original spatial layout in the HTML output. |
| include_metadata | boolean | false | Include PDF metadata as HTML meta tags in the output. |
| page_separator | string | <hr> | HTML string inserted between pages in the output. |
| enable_ocr | boolean | true | Automatically apply OCR to scanned or image-based pages. |
| ocr_engine | string | auto | OCR engine to use. Options: tesseract, openai, auto. |
| outer_wrapper | boolean | true | Wrap the output in a full HTML document structure with <html>, <head>, and <body> tags. |
| title | string | Document | The title for the HTML document when outer_wrapper is enabled. |
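
The parameters above are sent as multipart form fields, so boolean options travel as strings. The sketch below shows one way to build those fields from Python values. The helper name build_form_fields is hypothetical and not part of the API. It assumes the server accepts lowercase "true" and "false" (only "true" appears in the examples on this page) and that an omitted field falls back to its documented default.

```python
# Hypothetical helper, not part of the API: turns Python values into form fields.
ALLOWED_OCR_ENGINES = {"tesseract", "openai", "auto"}  # options listed in the table


def build_form_fields(**options):
    """Return a dict of string form fields for the pdf-to-html endpoint."""
    engine = options.get("ocr_engine")
    if engine is not None and engine not in ALLOWED_OCR_ENGINES:
        raise ValueError(f"unsupported ocr_engine: {engine!r}")

    fields = {}
    for name, value in options.items():
        if value is None:
            continue  # leave the field out so the documented default applies
        if isinstance(value, bool):
            # Assumption: the server expects lowercase "true"/"false"
            fields[name] = "true" if value else "false"
        else:
            fields[name] = str(value)
    return fields


fields = build_form_fields(
    preserve_layout=False,
    ocr_engine="tesseract",
    title=None,
)
print(fields)
```

The resulting dict can be passed as the data argument of requests.post alongside the file upload.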

Code Examples

cURL
curl -X POST "https://app.alternapdf.com/api/v1/convert/pdf-to-html" \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "file=@document.pdf" \
  -F "preserve_layout=true" \
  -F "outer_wrapper=true" \
  -F "title=My Document" \
  -F "enable_ocr=true"
Python
import requests

url = "https://app.alternapdf.com/api/v1/convert/pdf-to-html"
headers = {"X-API-Key": "YOUR_API_KEY"}

with open("document.pdf", "rb") as f:
    files = {"file": ("document.pdf", f, "application/pdf")}
    data = {
        "preserve_layout": "true",
        "outer_wrapper": "true",
        "title": "My Document",
        "enable_ocr": "true",
    }
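    response = requests.post(url, headers=headers, files=files, data=data)

# Raise on 4xx/5xx so an error body is not parsed as a successful result
response.raise_for_status()
result = response.json()
html_content = result["html"]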

# Save to file
with open("output.html", "w", encoding="utf-8") as out:
    out.write(html_content)
JavaScript
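import { readFile } from "node:fs/promises";

// Node 18+: the built-in FormData takes a Blob, not a read stream
const formData = new FormData();
const pdf = new Blob([await readFile("document.pdf")], { type: "application/pdf" });
formData.append("file", pdf, "document.pdf");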
formData.append("preserve_layout", "true");
formData.append("outer_wrapper", "true");
formData.append("title", "My Document");
formData.append("enable_ocr", "true");

const response = await fetch("https://app.alternapdf.com/api/v1/convert/pdf-to-html", {
  method: "POST",
  headers: {
    "X-API-Key": "YOUR_API_KEY",
  },
  body: formData,
});

if (!response.ok) {
  throw new Error(`Conversion failed with HTTP ${response.status}`);
}
const result = await response.json();
console.log(result.html);

Response

Returns a JSON object containing the converted HTML content.

JSON Response
{
  "status": "success",
  "filename": "document.pdf",
  "html": "<!DOCTYPE html>\n<html>\n<head>\n<title>My Document</title>\n</head>\n<body>\n<p>Extracted content from the PDF...</p>\n<hr>\n<p>Page 2 content...</p>\n</body>\n</html>"
}
| Field | Type | Description |
| --- | --- | --- |
| status | string | Processing status. Always success on 200. |
| filename | string | Name of the uploaded file. |
| html | string | The converted HTML content. When outer_wrapper is true, includes full document structure. |
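
As a rough illustration of consuming this response, the sketch below checks the status field and splits the html string back into pages on the page separator. The function names extract_html and split_pages are hypothetical. It assumes the separator is emitted verbatim between pages, that the same string does not occur inside page content, and that the request was made with outer_wrapper=false so the html field holds a bare fragment. None of these are confirmed by the documentation above.

```python
# Hypothetical helpers for handling the JSON response; names are illustrative.
def extract_html(payload):
    """Return the html field, failing loudly if the status is not success."""
    if payload.get("status") != "success":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    return payload["html"]


def split_pages(html, separator="<hr>"):
    """Split converted HTML into per-page fragments.

    Assumption: the separator appears verbatim between pages and nowhere else.
    """
    return [part.strip() for part in html.split(separator)]


# Sample payload shaped like the documented response (fragment output assumed)
payload = {
    "status": "success",
    "filename": "document.pdf",
    "html": "<p>Page 1</p>\n<hr>\n<p>Page 2</p>",
}

pages = split_pages(extract_html(payload))
print(len(pages), pages)
```

Choosing a distinctive page_separator value for the request makes this kind of split less likely to collide with markup that already exists in the document.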