Build a Q&A chatbot over internal company documents that understands relationships between people, products, and events.
Process a collection of books or research papers at roughly $0.08 each and query them for named entities and connections.
Add incremental document updates to an AI app so new files can be inserted without re-processing the entire knowledge base.
Run a self-hosted alternative to cloud RAG services with graph-structured retrieval and persistent on-disk storage.
Requires an OpenAI API key. Processing costs roughly $0.08 per book-length document.
fast-graphrag is a Python library that helps AI applications answer questions from your own documents by first turning those documents into a graph of connected ideas and entities. Instead of dumping text into a search index and hoping the right chunk comes back, it maps out who is who, what connects to what, and where things happened, then walks that graph when you ask a question. The result is answers that draw on genuine structure in your data rather than keyword overlap.

Getting started requires an OpenAI API key. You point the library at a folder of text, describe the domain you care about (for example, "identify the characters and how they interact"), list the kinds of entities you want tracked, and call insert on your documents. The library builds a graph in the background, stored to disk so you can add new documents later without starting from scratch. Queries then return plain answers drawn from that graph.

The main practical claim is cost. The README compares processing a book at $0.08 with fast-graphrag versus $0.48 with Microsoft's graphrag, a six-times difference. The savings come from how the library explores the graph: it uses an algorithm called personalized PageRank, borrowed from the same family of ideas that powers web search rankings, to focus on the most relevant parts rather than scanning everything.

The library is asynchronous, meaning it can handle many documents or queries at the same time without blocking. It supports incremental updates, so new data can be added without re-processing what is already stored. A set of tutorial notebooks in the examples folder covers things like swapping in a different language model, using checkpoints to protect against data corruption, and including source references in answers.

The project is open source under the MIT license. The team behind it, Circlemind, also runs a managed hosted version with a free tier of 100 requests per month. A community Discord is available for questions.
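To make the personalized PageRank idea mentioned above more concrete, here is a minimal sketch in plain Python. It is not fast-graphrag's implementation: the function name `personalized_pagerank`, its parameters, and the toy entity graph are all hypothetical and chosen for illustration only. The sketch assumes an undirected graph and a set of seed entities (for example, the entities recognized in a question), and it scores every node by how easily a random walk that keeps restarting at the seeds reaches it.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Illustrative power-iteration personalized PageRank (hypothetical helper).

    edges: iterable of (node, node) pairs, treated as undirected.
    seeds: nodes the random walk restarts from (e.g. entities in the query).
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seeds = [s for s in seeds if s in neighbors]
    if not seeds:
        raise ValueError("at least one seed must be a node in the graph")

    nodes = sorted(neighbors)
    # Restart distribution: uniform over the seed nodes, zero elsewhere.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(restart)

    for _ in range(iterations):
        # With probability (1 - damping) the walk jumps back to a seed.
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        # Otherwise it moves to a uniformly chosen neighbor.
        for n in nodes:
            share = damping * scores[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        scores = nxt
    return scores


# Toy entity graph (made-up data, not produced by fast-graphrag).
edges = [
    ("Scrooge", "Marley"),
    ("Scrooge", "Bob Cratchit"),
    ("Bob Cratchit", "Tiny Tim"),
    ("London", "Scrooge"),
    ("Paris", "Darnay"),  # separate component, unrelated to the query
]
scores = personalized_pagerank(edges, seeds={"Scrooge"})

# Print nodes from most to least relevant to the seed.
for node, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{node}: {score:.3f}")
```

Nodes that cannot be reached from the seeds receive no score at all, which illustrates how a retriever built on this idea could skip unrelated parts of a graph instead of scanning everything. How fast-graphrag actually chooses seeds and weights edges should be checked against the repo.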
Verify against the repo before relying on details.