POSThttps://app.alternapdf.com/api/v1/convert/pdf-to-markdown

PDF to Markdown

Convert PDF documents into well-structured Markdown. Automatically detects headings, lists, tables, and code blocks. Supports ATX and Setext heading styles, multiple table formats, and preserves bold/italic formatting. OCR is applied automatically for scanned documents.

Content-Type: multipart/form-data

Parameters

Parameter	Type	Default	Description
`file` required	file	—	The PDF file to convert to Markdown.
`detect_headings`	boolean	`true`	Detect and convert headings based on font size and weight.
`detect_lists`	boolean	`true`	Detect and format bulleted and numbered lists.
`detect_tables`	boolean	`true`	Detect tabular data and convert to Markdown tables.
`detect_code_blocks`	boolean	`true`	Detect code blocks based on monospace fonts and formatting.
`preserve_bold_italic`	boolean	`true`	Preserve bold and italic text formatting in Markdown syntax.
`add_page_breaks`	boolean	`true`	Insert page break markers between pages in the output.
`include_metadata`	boolean	`false`	Include PDF metadata as YAML front matter in the Markdown output.
`heading_style`	string	`atx`	Heading format. `atx` uses `# Heading` syntax, `setext` uses underline syntax.
`table_format`	string	`pipe`	Table output format. Options: `pipe`, `grid`, `simple`.

Code Examples

cURL

curl -X POST "https://app.alternapdf.com/api/v1/convert/pdf-to-markdown" \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "file=@document.pdf" \
  -F "detect_headings=true" \
  -F "detect_tables=true" \
  -F "detect_code_blocks=true" \
  -F "heading_style=atx" \
  -F "table_format=pipe"

Python

import requests

url = "https://app.alternapdf.com/api/v1/convert/pdf-to-markdown"
headers = {"X-API-Key": "YOUR_API_KEY"}

with open("document.pdf", "rb") as f:
    files = {"file": ("document.pdf", f, "application/pdf")}
    data = {
        "detect_headings": "true",
        "detect_tables": "true",
        "detect_code_blocks": "true",
        "heading_style": "atx",
        "table_format": "pipe",
    }
    response = requests.post(url, headers=headers, files=files, data=data)

result = response.json()
markdown = result["data"]["markdown"]

# Save to file
with open("output.md", "w") as out:
    out.write(markdown)

print(f"Pages: {result['data']['total_pages']}")
print(f"Processing time: {result['data']['processing_time_ms']}ms")

JavaScript

const formData = new FormData();
formData.append("file", fs.createReadStream("document.pdf"));
formData.append("detect_headings", "true");
formData.append("detect_tables", "true");
formData.append("detect_code_blocks", "true");
formData.append("heading_style", "atx");
formData.append("table_format", "pipe");

const response = await fetch("https://app.alternapdf.com/api/v1/convert/pdf-to-markdown", {
  method: "POST",
  headers: {
    "X-API-Key": "YOUR_API_KEY",
  },
  body: formData,
});

const result = await response.json();
const { markdown, total_pages, processing_time_ms } = result.data;

console.log(markdown);
console.log(`Pages: ${total_pages}, Time: ${processing_time_ms}ms`);

Response

Returns a JSON object containing the converted Markdown and processing metadata.

JSON Response

{
  "status": "success",
  "data": {
    "markdown": "# Document Title\n\nThis is the first paragraph of content...\n\n## Section One\n\n| Column A | Column B |\n|----------|----------|\n| Value 1  | Value 2  |\n\n---\n\n*Page 2 content...*",
    "total_pages": 5,
    "processing_time_ms": 1245,
    "ocr_used": false,
    "ocr_engine": null,
    "ocr_confidence": null,
    "cost_usd": 0.005
  }
}

Field	Type	Description
`status`	string	Processing status. Always `success` on 200.
`data.markdown`	string	The converted Markdown content.
`data.total_pages`	integer	Number of pages processed.
`data.processing_time_ms`	integer	Processing time in milliseconds.
`data.ocr_used`	boolean	Whether OCR was applied to any pages.
`data.ocr_engine`	string \| null	OCR engine used, or null if not applicable.
`data.ocr_confidence`	number \| null	OCR confidence score (0-1), or null if not applicable.
`data.cost_usd`	number	Cost of this request in USD.