Build a retrieval-augmented generation (RAG) system that ingests PDFs and Word documents to answer questions about them.
Extract structured data from financial reports, contracts, or research papers to feed into an AI analysis pipeline.
Process scanned images and handwritten documents locally without sending them to external services.
Convert a folder of mixed document types (PDFs, slides, spreadsheets) into a unified JSON format for indexing.
Docling is a Python library and command-line tool that converts documents from many file formats into structured, AI-friendly output. Before a document can serve as context for an AI system, its text and structure have to be extracted in a clean, organized form. That is especially hard for PDFs, which were designed for printing rather than machine reading. PDFs often contain tables, multi-column layouts, headers, footnotes, charts, and mathematical formulas that simple text extraction tools mangle or miss entirely. Docling addresses this with dedicated handling of page layout, reading order, table structure, and image content.

The library accepts PDF, Word (DOCX), PowerPoint (PPTX), Excel (XLSX), HTML, images such as PNG and JPEG, LaTeX, and audio files (through speech recognition). It converts all of these into a unified internal document representation and then exports to Markdown, HTML, or JSON, preserving the structural information that makes the content useful for AI processing. For PDFs, it uses a layout detection model called Heron to identify the different regions of each page.

Docling can run entirely locally, which matters when handling sensitive documents or working in environments without internet access. It integrates with popular AI application frameworks such as LangChain, LlamaIndex, and Haystack, so it can be plugged into an existing retrieval-augmented generation pipeline. The project was started by IBM Research Zurich and is now hosted by the LF AI & Data Foundation, part of the Linux Foundation.

Use Docling when building an AI application that needs to ingest and understand a variety of real-world document formats.
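As a rough illustration of the folder-to-JSON use case, here is a minimal sketch. It assumes the `docling` package is installed and relies on `DocumentConverter`, `convert`, `export_to_dict`, and `export_to_markdown` as described in Docling's Python documentation; check them against the version you have. The names `SUPPORTED_SUFFIXES`, `find_documents`, and `convert_folder` are hypothetical helpers made up for this example, and the suffix list is an assumed subset rather than Docling's full list of supported formats.

```python
import json
from pathlib import Path

# Assumed subset of input types for this sketch (not Docling's official list).
SUPPORTED_SUFFIXES = {
    ".pdf", ".docx", ".pptx", ".xlsx", ".html", ".png", ".jpg", ".jpeg",
}


def find_documents(folder):
    """Return supported files under `folder` (recursively), in sorted order."""
    return sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )


def convert_folder(folder, out_dir):
    """Write one JSON file and one Markdown file per supported input file."""
    # Imported lazily so find_documents() works without Docling installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for path in find_documents(folder):
        # convert() returns a result whose .document is the unified representation.
        doc = converter.convert(path).document

        # Keep the full input file name so a.pdf and a.docx do not collide.
        json_path = out / f"{path.name}.json"
        json_path.write_text(json.dumps(doc.export_to_dict()), encoding="utf-8")
        (out / f"{path.name}.md").write_text(
            doc.export_to_markdown(), encoding="utf-8"
        )
        written.append(json_path)
    return written
```

Calling `convert_folder("docs", "converted")` would then leave a `.json` and a `.md` file in `converted/` for every supported file under `docs/`. The first run may download model weights, so fully offline use requires fetching those in advance.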
Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.