Use a hotkey to quickly copy text from screenshots or from windows that don't allow text selection.
Process hundreds of scanned images or photos in bulk to extract and organize text into CSV or JSONL files.
Convert PDF documents into searchable PDFs with an invisible text layer while keeping the original page images.
Extract data from QR codes, barcodes, and other machine-readable codes in images without external services.
Running from source requires setting up a Python environment and obtaining the OCR model files; the packaged release only needs to be extracted and launched.
Umi-OCR is a free, open-source, fully offline OCR (Optical Character Recognition) tool that extracts text from images and documents without sending anything to the internet. OCR is the technology that reads text embedded in a picture: for example, it can turn a screenshot of a webpage into editable text or extract data from a scanned form. Most commercial OCR services require uploading your files to a remote server, which raises both cost and privacy concerns. Umi-OCR avoids this by running entirely on your own machine.

The tool offers four main modes. Screenshot OCR lets you capture any area of your screen with a hotkey and immediately get the text, which makes it ideal for copying from windows or applications that don't allow text selection. Batch OCR processes folders of image files (JPEG, PNG, WEBP, TIFF, and others) in bulk and writes the results as plain text, Markdown, CSV, or JSONL. Document OCR handles PDF, EPUB, XPS, and other document formats, and can optionally generate a searchable dual-layer PDF in which the original page image is preserved with an invisible text layer beneath it. A QR/barcode feature reads codes from images and can also generate them, supporting 19 code formats including QR Code, EAN, PDF417, and Data Matrix.

A particularly practical feature is the "ignore zone": when batch-processing images that all share the same watermark or header/footer position, you draw rectangles over those areas and Umi-OCR automatically discards text found there without affecting the rest of the page.

The application runs offline on Windows and Linux, requires no installation (just extract and launch), and auto-detects your system language. Internally it uses PaddleOCR or RapidOCR as the recognition engine, with support for multiple languages. The UI is built with Qt/QML and the back end is Python. An HTTP API and command-line interface are available for integrating Umi-OCR into automated workflows or scripts.
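
The "ignore zone" idea described above can be illustrated with a small sketch. This is not Umi-OCR's implementation: the `TextBlock` type, the `(left, top, right, bottom)` rectangle layout, and the rule that a block is discarded only when it lies entirely inside a zone are all assumptions made for illustration, and the real tool may apply a different containment rule.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (left, top, right, bottom) in pixels. Hypothetical layout, chosen for this sketch.
Rect = Tuple[int, int, int, int]


@dataclass
class TextBlock:
    """One recognised piece of text and its bounding rectangle (hypothetical type)."""
    text: str
    box: Rect


def inside(box: Rect, zone: Rect) -> bool:
    """Return True when `box` lies entirely within `zone`."""
    left, top, right, bottom = box
    z_left, z_top, z_right, z_bottom = zone
    return (left >= z_left and top >= z_top
            and right <= z_right and bottom <= z_bottom)


def apply_ignore_zones(blocks: List[TextBlock], zones: List[Rect]) -> List[TextBlock]:
    """Keep only the blocks that are not fully covered by any ignore zone."""
    return [b for b in blocks if not any(inside(b.box, z) for z in zones)]


# A 1000x1400 page with a header band and a footer band marked as ignore zones.
zones = [(0, 0, 1000, 80), (0, 1340, 1000, 1400)]
blocks = [
    TextBlock("ACME Corp - Confidential", (40, 20, 420, 60)),  # inside the header zone
    TextBlock("Invoice total: 118.40", (40, 600, 380, 640)),   # body text
    TextBlock("Page 3 of 12", (440, 1350, 560, 1390)),         # inside the footer zone
]

kept = apply_ignore_zones(blocks, zones)
print([b.text for b in kept])  # ['Invoice total: 118.40']
```

Because the same zones are applied to every page in a batch, this kind of filter only works well when the watermark or header/footer sits in the same position on each image, which matches the use case described above.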
Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.