circlemind-ai/fast-graphrag

★ 3,787 | Python | Audience · developer | Complexity · 3/5 | License · MIT | Setup · moderate

Mindmap

mindmap
  root((fast-graphrag))
    What it does
      Document to graph
      Graph-based Q and A
      Entity extraction
    Tech approach
      Personalized PageRank
      Async processing
      Incremental updates
    Use cases
      Company knowledge base
      Research paper search
      Low-cost RAG pipeline
    Setup
      OpenAI API key
      MIT open source
      Hosted free tier

Things people build with this

USE CASE 1

Build a Q&A chatbot over internal company documents that understands relationships between people, products, and events.

USE CASE 2

Process a collection of books or research papers at roughly $0.08 each and query them for named entities and connections.

USE CASE 3

Add incremental document updates to an AI app so new files can be inserted without re-processing the entire knowledge base.

USE CASE 4

Run a self-hosted alternative to cloud RAG services with graph-structured retrieval and persistent on-disk storage.

Tech stack

Python · OpenAI API · PageRank · SQLite

Getting it running

Difficulty · moderate | Time to first run · 30 min

Requires an OpenAI API key. Processing costs roughly $0.08 per book-length document.
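To get a feel for what the per-document figure means for a whole corpus, the arithmetic can be sketched as below. This is a rough estimate only: it assumes every document is book-length, takes the $0.08 figure at face value, and uses the $0.48 figure the README quotes for Microsoft's graphrag. The function name and the corpus size of 500 are hypothetical, chosen for illustration.

```python
# Per-document costs in US cents, as quoted from the README on this page.
# Integers avoid floating-point rounding when multiplying.
FAST_GRAPHRAG_CENTS = 8   # $0.08 per book-length document
MS_GRAPHRAG_CENTS = 48    # $0.48 per book-length document


def estimate_cost_usd(num_documents: int, cents_per_document: int) -> float:
    """Rough indexing cost in dollars, assuming every document is book-length."""
    if num_documents < 0:
        raise ValueError("num_documents must be non-negative")
    return num_documents * cents_per_document / 100


if __name__ == "__main__":
    corpus_size = 500  # hypothetical number of documents
    fast = estimate_cost_usd(corpus_size, FAST_GRAPHRAG_CENTS)
    other = estimate_cost_usd(corpus_size, MS_GRAPHRAG_CENTS)
    print(f"fast-graphrag: ${fast:.2f}")
    print(f"Microsoft graphrag: ${other:.2f}")
    print(f"ratio: {MS_GRAPHRAG_CENTS / FAST_GRAPHRAG_CENTS:.0f}x")
```

Actual spend depends on document length and on the model used, so treat the result as an order-of-magnitude guide.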

Free to use for any purpose including commercial, as long as you keep the copyright notice.

In plain English

fast-graphrag is a Python library that helps AI applications answer questions from your own documents by first turning those documents into a graph of connected ideas and entities. Instead of dumping text into a search index and hoping the right chunk comes back, it maps out who is who, what connects to what, and where things happened, then walks that graph when you ask a question. The resulting answers draw on genuine structure in your data rather than keyword overlap.

Getting started requires an OpenAI API key. You point the library at a folder of text, describe the domain you care about (for example, "identify the characters and how they interact"), list the kinds of entities you want tracked, and call insert on your documents. The library builds a graph in the background and stores it on disk, so you can add new documents later without starting from scratch. Queries then return plain answers drawn from that graph.

The main practical claim is cost. The README compares processing a book at $0.08 with fast-graphrag versus $0.48 with Microsoft's graphrag, a sixfold difference. The savings come from how the library explores the graph: it uses an algorithm called personalized PageRank, borrowed from the same family of ideas that powers web search rankings, to focus on the most relevant parts rather than scanning everything.

The library is asynchronous, meaning it can handle many documents or queries at the same time without blocking. It supports incremental updates, so new data can be added without re-processing what is already stored. A set of tutorial notebooks in the examples folder covers topics such as swapping in a different language model, using checkpoints to protect against data corruption, and including source references in answers.

The project is open source under the MIT license. The team behind it, Circlemind, also runs a managed hosted version with a free tier of 100 requests per month. A community Discord is available for questions.
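The personalized PageRank idea mentioned above can be illustrated with a small, self-contained sketch. This is not fast-graphrag's implementation, only a textbook power-iteration version under simple assumptions: an undirected, unweighted graph, with the walk restarting only at the entities named in the question. The function name, the damping value of 0.85, and the toy entity graph are all hypothetical.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=100):
    """Power-iteration personalized PageRank on an undirected graph.

    edges: iterable of (node_a, node_b) pairs
    seeds: nodes the query mentions; the random walk restarts only there
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = sorted(neighbors)

    seed_set = {s for s in seeds if s in neighbors}
    if not seed_set:
        raise ValueError("no seed node appears in the graph")

    # Restart distribution: uniform over the seeds, zero elsewhere.
    restart = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    scores = dict(restart)

    for _ in range(iterations):
        # With probability (1 - damping) the walk jumps back to a seed.
        updated = {n: (1.0 - damping) * restart[n] for n in nodes}
        # Otherwise each node splits its score evenly among its neighbors.
        for n in nodes:
            share = damping * scores[n] / len(neighbors[n])
            for m in neighbors[n]:
                updated[m] += share
        scores = updated
    return scores


# Hypothetical toy entity graph, for illustration only.
EDGES = [
    ("alice", "acme"),
    ("bob", "acme"),
    ("acme", "widget"),
    ("widget", "recall"),
    ("recall", "carol"),
]

if __name__ == "__main__":
    ranked = personalized_pagerank(EDGES, seeds=["alice"])
    for node, score in sorted(ranked.items(), key=lambda kv: -kv[1]):
        print(f"{node:8s} {score:.3f}")
```

Because the walk keeps returning to the seed, entities close to the question's subject collect most of the score, and distant ones can be ignored at query time. That locality is the intuition behind focusing on the relevant part of the graph instead of scanning all of it.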

Copy-paste prompts

Prompt 1

Using fast-graphrag, write Python code to load a folder of text files about my product, build a knowledge graph tracking features and customer complaints, then answer 'What are the top 3 issues users report?'

Prompt 2

I want to use fast-graphrag with a model other than OpenAI. Show me how to swap in a custom OpenAI-compatible API endpoint and update the entity extraction configuration.

Prompt 3

My fast-graphrag graph is built and I need to run 10 queries concurrently. Write an async Python script that fires all queries at once and collects results.

Prompt 4

How do I add checkpoints to my fast-graphrag insert pipeline to protect against data corruption when adding 500 new documents in one batch?

Prompt 5

I want to compare fast-graphrag and Microsoft graphrag on the same 5-document dataset. Write the setup code for both and explain where the cost difference comes from.

Verify against the repo before relying on details.