explaingit

daveebbelaar/document-copilot

17 | Python | Audience · pm founder | Complexity · 3/5 | Setup · moderate

TLDR

An AI chatbot that lets you ask plain-English questions about a collection of uploaded PDFs and get sourced answers, built with FastAPI, React, Supabase, and OpenAI, demonstrated on SEC financial filings.

Mindmap

mindmap
  root((Document Copilot))
    What It Does
      PDF ingestion
      Q&A with citations
      Hybrid vector search
    Tech Stack
      FastAPI backend
      React frontend
      Supabase pgvector
    Use Cases
      SEC filing research
      Internal document Q&A
      Research assistant
    Setup
      OpenAI API key
      Supabase account
      Railway hosting

Things people build with this

USE CASE 1

Build an internal tool that lets analysts query a library of SEC filings and receive cited answers without reading each document manually.

USE CASE 2

Create a document chatbot for any PDF collection: upload files, ask questions, and get answers with references to the source sections.

USE CASE 3

Set up a hybrid search system that combines vector similarity and standard text search for more accurate retrieval across large document sets.

Tech stack

Python, FastAPI, React, TypeScript, Supabase, PostgreSQL, pgvector, OpenAI

Getting it running

Difficulty · moderate | Time to first run · 1h+

Requires active Supabase and OpenAI accounts and enabling the pgvector extension inside the Supabase database.
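
Before starting the backend, it can help to confirm that the required credentials are actually set. The sketch below is a minimal preflight check, not part of the repository. `OPENAI_API_KEY` is the variable the OpenAI SDK reads by default, but `SUPABASE_URL` and `SUPABASE_KEY` are assumed names, so check the repository's docs for the ones it really uses.

```python
import os

# Assumed variable names: only OPENAI_API_KEY is a known SDK default.
# The Supabase names are hypothetical placeholders for illustration.
REQUIRED_SETTINGS = ("OPENAI_API_KEY", "SUPABASE_URL", "SUPABASE_KEY")


def missing_settings(env: dict[str, str], required=REQUIRED_SETTINGS) -> list[str]:
    """Return the required names that are unset or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings(dict(os.environ))
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```

This only checks that values exist. It does not verify that the keys are valid or that pgvector is enabled in the database.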

In plain English

Document Copilot is an AI chatbot built to let users ask questions about a collection of documents in plain English and receive answers with source citations. The use case described in the readme is a fictional investment research firm where analysts spend significant time reading SEC financial filings (10-Ks and 10-Qs) before producing any original analysis. The chatbot is meant to handle that reading work and surface relevant information on demand.

The project uses a Python backend built with FastAPI, a React frontend, and a Supabase-hosted PostgreSQL database for storing users, chats, uploaded documents, and document chunks. When a document is ingested, it is split into pieces and converted into numerical representations using OpenAI's API, then stored with a vector search extension called pgvector. When a user asks a question, the system finds the most relevant chunks through a combination of vector similarity and standard text search, then passes them to an OpenAI language model to compose an answer.

A helper script is included to download a small set of real SEC filings from EDGAR, the public US financial disclosure database. By default it fetches recent 10-K filings for five large US companies and saves them locally for use as sample data during development.

The frontend is built with Vite, React, and TypeScript. Authentication is handled through Supabase's email-based auth system. The application is designed to be hosted on Railway.

Setting up the project requires Python 3.12 or later, Node.js, and active accounts with Supabase and OpenAI. Setup guides for the backend, frontend, and database are included in the repository's docs folder.
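
The ingest-and-retrieve flow described in this section can be sketched in plain Python. This is a minimal illustration, not the repository's code. The chunk sizes, the function names, and the use of reciprocal rank fusion to merge the two rankings are all assumptions, since the summary only says that vector similarity and text search are combined. In the real project the vectors would come from OpenAI's API and be stored with pgvector; here they are passed in as plain lists.

```python
import math


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (sizes are illustrative)."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_vector(query_vec: list[float], chunk_vecs: dict[str, list[float]]) -> list[str]:
    """Order chunk ids by descending cosine similarity to the query vector."""
    return sorted(chunk_vecs, key=lambda cid: (-cosine(query_vec, chunk_vecs[cid]), cid))


def rank_by_keywords(query: str, chunks: dict[str, str]) -> list[str]:
    """Order chunk ids by how many query words they contain (a crude text search)."""
    terms = set(query.lower().split())

    def overlap_count(cid: str) -> int:
        return len(terms & set(chunks[cid].lower().split()))

    return sorted(chunks, key=lambda cid: (-overlap_count(cid), cid))


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several rankings: each id scores 1 / (k + rank) per list it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, cid in enumerate(ranking, start=1):
            scores[cid] = scores.get(cid, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


if __name__ == "__main__":
    # Made-up chunks and two-dimensional vectors, purely for demonstration.
    chunks = {
        "c1": "revenue grew in fiscal 2023",
        "c2": "risk factors include supply chain disruption",
        "c3": "total revenue and net income",
    }
    vectors = {"c1": [0.9, 0.1], "c2": [0.0, 1.0], "c3": [0.6, 0.4]}
    by_vector = rank_by_vector([1.0, 0.0], vectors)
    by_keyword = rank_by_keywords("net revenue", chunks)
    print(reciprocal_rank_fusion([by_vector, by_keyword]))
```

The fused list is what would be passed to the language model as context. Rank fusion is one common way to combine the two searches because it needs only the orderings, not comparable scores. A weighted sum of similarity scores is another option.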

Copy-paste prompts

Prompt 1
I'm setting up document-copilot with Supabase and OpenAI. Walk me through enabling the pgvector extension in Supabase and wiring up the environment variables in the backend.
Prompt 2
Using document-copilot, how do I add my own PDF documents instead of the sample SEC filings and trigger re-ingestion so they become searchable?
Prompt 3
In document-copilot, how does the hybrid search work that combines vector similarity with keyword text search? Show me where in the code that logic lives.
Prompt 4
I want to deploy document-copilot on Railway. What environment variables do I need to configure and what are the build commands for the backend and frontend?
Prompt 5
Help me use the included script to download SEC 10-K filings from EDGAR for a specific company and load them into document-copilot.

Verify against the repo before relying on details.