explaingit

deepdiy/pdf2md

20 · Rust · Audience · developer · Complexity · 2/5 · Active · License · Setup · easy

TLDR

A Rust CLI that converts PDF files into clean Markdown using a YOLO layout detection model, preserving headings, tables, formulas, and images.

Mindmap

mindmap
  root((pdf2md))
    Inputs
      PDF file
      DPI options
      Page selection
    Outputs
      Markdown file
      Extracted images
      Optional ZIP bundle
    Use Cases
      Convert scanned papers to Markdown
      Batch process research PDFs
      Serve a small in-house API
      Run on a low-spec VPS
    Tech Stack
      Rust
      DocLayoutNet
      YOLO
      Streamlit

Things people build with this

USE CASE 1

Convert a PDF report into Markdown with figures saved alongside

USE CASE 2

Run the prebuilt binary on a 1GB VPS to host a tiny PDF conversion service

USE CASE 3

Wrap the CLI with the included Streamlit app for a browser-based PDF-to-Markdown tool

USE CASE 4

Call the public pdf2md API to get Markdown and a ZIP from any PDF without an API key

Tech stack

Rust · YOLO · DocLayoutNet · Streamlit

Getting it running

Difficulty · easy · Time to first run · 5 min

Drop the prebuilt binary and model folder into a directory and run, no Docker needed.

MIT license, free for personal and commercial use with attribution preserved.
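The run step can be sketched as a small Python wrapper, which is handy when scripting many conversions. This is a minimal sketch under stated assumptions, not the project's own tooling: the function names `build_command` and `convert` are hypothetical, the binary path is whatever release file matches your platform, and the idea that the binary finds its model folder in the current working directory is an assumption drawn from the "drop both into a directory" instruction. Only the positional input path is passed, since no flag names are given here.

```python
import subprocess
from pathlib import Path


def build_command(binary: Path, pdf: Path) -> list[str]:
    # Only the positional input path is passed: it is the single
    # invocation form described in this summary. No flags are assumed.
    return [str(binary), str(pdf)]


def convert(binary: Path, pdf: Path, workdir: Path) -> subprocess.CompletedProcess:
    # Fail early with a clear error instead of letting the binary report it.
    if not pdf.is_file():
        raise FileNotFoundError(f"no such PDF: {pdf}")
    # Absolute paths keep working after the change of directory below.
    cmd = build_command(binary.resolve(), pdf.resolve())
    # Assumption: the binary looks for its model folder in the current
    # working directory, so the process is started from workdir.
    return subprocess.run(
        cmd, cwd=workdir, check=True, capture_output=True, text=True
    )
```

Calling `convert(binary_path, Path("input.pdf"), workdir)` raises `subprocess.CalledProcessError` if the binary exits with a non-zero status, so a batch loop can log the failure and move on to the next file.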

In plain English

pdf2md is a command-line tool that turns PDF files into Markdown text while trying to keep the original page layout intact. It is written in Rust and uses a YOLO-based detection model called DocLayoutNet to figure out which parts of a page are headings, paragraphs, tables, lists, formulas, captions, page headers, footnotes, or images. The output is meant to read cleanly, with no random line breaks chopping up paragraphs.

The project ships ready-to-run binaries for macOS on Apple Silicon, Linux on x86_64 and ARM64, and Windows on x86_64, so most users do not need to compile anything. You copy the binary and the model folder into a working directory, then run a command like ./pdf2md-<platform> input.pdf and get a Markdown file. Options let you set the detection DPI, set the image export DPI, pick a single page, point at a different model folder, or export the full page image used for layout detection. Embedded images are pulled out of the PDF and saved next to the Markdown file.

For people who prefer a browser, there is a small Streamlit web app in the repo that wraps the binary and shows the converted Markdown with images. The README says the tool runs comfortably on a 1-core, 1 GB RAM VPS with no Docker required.

There is also a free public API at pdf2md.deepdiy.net that accepts a PDF over HTTP POST and returns Markdown, image links, and a downloadable ZIP. No API key is needed. The server handles one job at a time and returns HTTP 429 if it is busy, with a 20 MB size limit and a 120 second per-task limit.

The project is released under the MIT license.
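Because the public API handles one job at a time and answers HTTP 429 when busy, a client would likely want a size check before uploading and a polite retry loop. The sketch below illustrates only that logic and makes several assumptions: the names `check_size` and `submit_with_retry` are hypothetical, the 20 MB limit is read conservatively as 20,000,000 bytes, a successful response is assumed to be HTTP 200, and the HTTP call itself is left to a caller-supplied `send` function because the endpoint path and upload format are not given here.

```python
import time
from typing import Callable, Tuple

# Limits quoted in the summary. Reading "20 MB" as decimal megabytes is an
# assumption chosen to stay on the safe side of the server's check.
MAX_BYTES = 20 * 1000 * 1000
TASK_LIMIT_S = 120  # a request timeout slightly above this is a sensible choice


def check_size(n_bytes: int) -> None:
    # Reject oversized uploads locally rather than spending a request on them.
    if n_bytes > MAX_BYTES:
        raise ValueError(f"{n_bytes} bytes exceeds the {MAX_BYTES} byte limit")


def submit_with_retry(
    send: Callable[[], Tuple[int, bytes]],
    attempts: int = 5,
    wait_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    # send() performs one upload and returns (status_code, response_body).
    for attempt in range(attempts):
        status, body = send()
        if status == 200:  # assumed success status
            return body
        if status != 429:
            raise RuntimeError(f"unexpected HTTP status {status}")
        # 429 means the single worker is busy: wait, then try again,
        # but do not sleep after the final attempt.
        if attempt < attempts - 1:
            sleep(wait_s)
    raise TimeoutError(f"server still busy after {attempts} attempts")
```

Keeping the transport behind `send` means the retry policy can be exercised with a fake function, and the real upload can be filled in once the request format is confirmed against the repo's README.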

Copy-paste prompts

Prompt 1
Download the pdf2md binary for my platform and convert a PDF into Markdown with extracted images
Prompt 2
Run pdf2md on a single page of a long PDF using a custom detection DPI
Prompt 3
Launch the bundled Streamlit web app on a VPS so my team can drop PDFs in a browser
Prompt 4
POST a PDF to the pdf2md.deepdiy.net API and parse the returned Markdown and image links
Prompt 5
Build the project from source on Linux ARM64 and swap in a different DocLayoutNet checkpoint

Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.