explaingit

openclaw/clawpdf

52 · TypeScript
TLDR

clawpdf is a TypeScript package that lets JavaScript code work with PDF files in Node.js or the browser, without installing any extra dependencies.

In plain English

clawpdf is a TypeScript package that lets JavaScript code work with PDF files in Node.js or the browser, without installing any extra dependencies. It bundles Google's PDFium PDF engine compiled to WebAssembly, so there is no native addon to compile and no canvas library to install.

The main things it can do are extract text from a PDF, render individual pages as PNG images, and handle password-protected files. There is an "auto" extraction mode that pulls text first and falls back to rendering PNG images only when the extracted text is too short to be useful. This is aimed at use cases where PDFs are fed into an AI model: readable PDFs go in as text, and scanned or image-heavy PDFs go in as images. An adapter function called toMessageContent can shape the output into blocks suitable for multimodal model input.

The API centers on three main functions. openPdf opens a single document and gives you access to its page count, text, and per-page PNG rendering. extractPdf is a one-shot function that applies the auto fallback logic. createEngine creates a reusable PDFium instance, which the README recommends for server code so that you are not spinning up a new WASM engine for each request. A CLI is included so you can extract text or render pages directly from the terminal without writing any code.

The Node.js and browser paths ship in the same package. The browser version pre-configures the WASM URL for bundlers, with an option to host the WASM file yourself. Benchmarks in the README show roughly half the processing time and significantly lower memory use compared to an earlier approach tested against the same sample PDFs. Node.js 20 or later is required. The package is released under the MIT license, with upstream BSD-style and Apache 2.0 notices for the PDFium binary.
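The "auto" fallback described above can be sketched in a few lines. This is an illustration of the idea only, not clawpdf's actual code: the PdfLike interface, the extractAuto, isTextUseful and toBlocks helpers, the block shapes, and the 200-character threshold are all assumptions made for this example, and the real openPdf, extractPdf and toMessageContent signatures may look quite different.

```typescript
// Hypothetical document shape for illustration; clawpdf's real types may differ.
interface PdfLike {
  pageCount: number;
  getText(page: number): Promise<string>;
  renderPng(page: number): Promise<Uint8Array>;
}

// Result of the auto mode: either text or rendered page images.
type Extraction =
  | { kind: "text"; text: string }
  | { kind: "images"; pages: Uint8Array[] };

// Assumed block shape for multimodal model input (not a real provider schema).
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image"; mediaType: "image/png"; data: Uint8Array };

// Assumed threshold: fewer non-whitespace characters than this is "too short".
const DEFAULT_MIN_CHARS = 200;

// Counts only non-whitespace characters so blank pages do not look useful.
function isTextUseful(text: string, minChars: number = DEFAULT_MIN_CHARS): boolean {
  return text.replace(/\s+/g, "").length >= minChars;
}

// Try text first; render PNGs only when the extracted text is too short.
async function extractAuto(
  doc: PdfLike,
  minChars: number = DEFAULT_MIN_CHARS,
): Promise<Extraction> {
  const parts: string[] = [];
  for (let i = 0; i < doc.pageCount; i++) {
    parts.push(await doc.getText(i));
  }
  const text = parts.join("\n");
  if (isTextUseful(text, minChars)) {
    return { kind: "text", text };
  }
  const pages: Uint8Array[] = [];
  for (let i = 0; i < doc.pageCount; i++) {
    pages.push(await doc.renderPng(i));
  }
  return { kind: "images", pages };
}

// Shape an extraction into one text block or one image block per page.
function toBlocks(result: Extraction): ContentBlock[] {
  if (result.kind === "text") {
    return [{ type: "text", text: result.text }];
  }
  return result.pages.map(
    (data): ContentBlock => ({ type: "image", mediaType: "image/png", data }),
  );
}
```

The sketch applies the threshold to the whole document rather than page by page. Which of the two the package actually does is not stated here, so check the README before relying on either behavior.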

Verify against the repo before relying on details.