explaingit

bhaskatripathi/pdfgpt

7,172 | Python | Audience · general | Complexity · 2/5 | License · MIT | Setup · moderate

TLDR

A Python app that lets you upload a PDF and ask questions about it. It finds the most relevant sections and sends them to OpenAI to generate precise answers, with optional page-number citations.

Mindmap

mindmap
  root((pdfGPT))
    How it works
      Split PDF into chunks
      Embed each chunk
      Find similar chunks
      Send to OpenAI
    Features
      PDF upload or URL
      Page citations
      No LangChain
      No vector database
    Tech
      Python
      Universal Sentence Encoder
      OpenAI GPT models
      Docker Compose
    Use cases
      Query research papers
      Analyze contracts
      Build document Q&A

Things people build with this

USE CASE 1

Upload a long research paper or contract and ask specific questions to get direct answers with page citations.

USE CASE 2

Run the app locally or via Docker to query PDFs without uploading them to a hosted document service. The retrieved chunks and your question are still sent to OpenAI.

USE CASE 3

Use as a starting template for building a document Q&A system without LangChain or a vector database.
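
As a starting point for the "no vector database" approach, here is a minimal sketch of caching embeddings in a plain file on disk. The function names, the JSON file format, and the `embedding_cache` directory are assumptions made for illustration, not pdfGPT's actual code; `embed` is any function you supply that turns a chunk of text into a list of numbers.

```python
import hashlib
import json
from pathlib import Path


def cache_path_for(pdf_bytes, cache_dir):
    """Derive a cache file name from the PDF's content hash (hypothetical scheme)."""
    digest = hashlib.sha256(pdf_bytes).hexdigest()
    return Path(cache_dir) / f"{digest}.json"


def load_or_compute_embeddings(pdf_bytes, chunks, embed, cache_dir="embedding_cache"):
    """Return one vector per chunk, reusing a cached file if this PDF was seen before."""
    path = cache_path_for(pdf_bytes, cache_dir)
    if path.exists():
        # Cache hit: skip the embedding step entirely.
        return json.loads(path.read_text(encoding="utf-8"))
    # Cache miss: embed every chunk once and persist the result.
    vectors = [list(embed(chunk)) for chunk in chunks]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(vectors), encoding="utf-8")
    return vectors
```

Keying the cache on the file's content hash means the same PDF is embedded only once, even if it is uploaded again under a different name.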

Tech stack

Python · OpenAI API · Universal Sentence Encoder · Docker

Getting it running

Difficulty · moderate
Time to first run · 30 min

Requires a paid OpenAI API key to generate answers.

MIT license: use freely for any purpose, including commercial use, as long as you keep the copyright notice.

In plain English

pdfGPT is a Python application that lets you ask questions about a PDF document and get answers generated by an AI model. You upload a PDF file or provide a URL to one, then type a question, and the application finds the most relevant sections of the document and sends them to OpenAI to generate a precise answer. The author claims it was one of the earliest open-source systems of this kind, first released in 2021, and argues it remains more accurate than many later alternatives because of its simple architecture.

The technical approach works like this: the application splits the PDF into small chunks of about 150 words each. It then generates a numerical representation (called an embedding) of each chunk using a deep learning encoder called the Universal Sentence Encoder. When you ask a question, the application generates an embedding of your question and uses a nearest-neighbor search to find the five chunks most similar to it. Those five chunks are inserted into a prompt sent to OpenAI, which generates the final answer. The responses can include page number citations in square brackets so you can locate the source in the original document.

One design choice that distinguishes this project from some alternatives is that it does not use a vector database or a third-party orchestration library like LangChain. The embeddings are saved to a file on disk and reloaded on subsequent queries. The application supports OpenAI GPT models including GPT-3.5 Turbo and GPT-4.

A Docker Compose file is included for running the application in a container. A live demo is hosted on Hugging Face Spaces. The project is MIT licensed and open to contributors, though the README notes that documentation has not been kept fully up to date.
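
The retrieval steps described above (chunking, embedding, nearest-neighbor search, prompt assembly) can be sketched in a few functions. This is an illustrative sketch under stated assumptions, not pdfGPT's actual code: all function names are hypothetical, the prompt wording is invented, and `toy_embed` is a simple word-count stand-in for the Universal Sentence Encoder so that the example runs with only the standard library. The call to OpenAI is left out.

```python
import math
import re
from collections import Counter


def split_into_chunks(pages, words_per_chunk=150):
    """Split each page into (page_number, text) chunks of at most `words_per_chunk` words."""
    chunks = []
    for page_number, page_text in enumerate(pages, start=1):
        words = page_text.split()
        for start in range(0, len(words), words_per_chunk):
            chunks.append((page_number, " ".join(words[start:start + words_per_chunk])))
    return chunks


def toy_embed(text):
    """Hypothetical stand-in embedding: lowercase word counts.

    This is NOT the Universal Sentence Encoder; it only keeps the sketch dependency-free.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(question, chunks, k=5):
    """Brute-force nearest-neighbor search: return the k chunks most similar to the question."""
    question_vector = toy_embed(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(question_vector, toy_embed(chunk[1])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, hits):
    """Assemble a prompt with the retrieved chunks, each tagged with its page number."""
    context = "\n\n".join(f"[Page {page}] {text}" for page, text in hits)
    return (
        "Answer the question using only the search results below. "
        "Cite the page number in square brackets after each claim.\n\n"
        f"Search results:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Given a list of page texts, `build_prompt(question, top_k_chunks(question, split_into_chunks(pages)))` produces the string that would be sent to the language model. Carrying the page number alongside each chunk is what makes the bracketed citations possible.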

Copy-paste prompts

Prompt 1
I'm using pdfGPT to query a PDF contract. How do I upload the file and ask it to find all clauses about payment terms?
Prompt 2
I want to run pdfGPT with Docker Compose. Walk me through the setup steps and what environment variables I need to provide.
Prompt 3
pdfGPT returned an answer but I want to verify it. How do I use the page-number citations it includes to find the source in the original PDF?
Prompt 4
I want to modify pdfGPT to use GPT-4 instead of GPT-3.5 Turbo. What do I change in the code?
Prompt 5
How does pdfGPT's nearest-neighbor search work? I want to understand why it splits PDFs into 150-word chunks and uses sentence embeddings.

Verify against the repo before relying on details.