Build a local AI assistant that searches your PDFs, emails, and chat history without an internet connection.
Add semantic search to Claude Code so it can find conceptually related code and documents in your projects.
Search through 60 million text chunks from personal data using only 6 gigabytes of storage instead of 200.
Index your iMessage history, WeChat chats, and browser history for private local AI-powered search.
Building from source requires additional system libraries for the DiskANN and HNSW graph index backends.
LEANN is a vector database designed to run entirely on a personal laptop. The core idea is to let anyone set up an AI assistant that can search large personal data collections without sending that data to any cloud service. The underlying technique is RAG (Retrieval-Augmented Generation): before an AI answers a question, it first finds relevant documents in a local index and uses them to inform the response.

The main technical claim is a 97% reduction in storage compared to conventional vector databases, without losing accuracy. Traditional approaches store numerical representations (called embeddings) for every piece of text in the index. LEANN instead stores only a graph of relationships between documents and recomputes the embeddings on demand during a search. This lets a collection of 60 million text chunks fit in about 6 gigabytes rather than 200 gigabytes; 6 is 3% of 200, which matches the 97% figure.

The supported data sources are broad. The README walks through specific setups for searching your local file system (PDFs, text files, Markdown), Apple Mail, browser history, iMessage, WeChat chat history, ChatGPT and Claude conversation exports, Slack messages, and Twitter bookmarks. There is also an integration with Claude Code through MCP (the Model Context Protocol), which adds semantic search (finding conceptually related results) on top of the basic keyword search Claude Code offers by default.

Installation is via PyPI using the uv package manager. The project runs on macOS (both ARM and Intel), Linux (Ubuntu, Arch, RHEL-based), and Windows through WSL. Building from source requires additional system libraries for the graph index backends (DiskANN and HNSW). The project collects zero telemetry. It is licensed under MIT.

This summary is based on only part of the README; the full README is longer.
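The storage idea described above (keep only a graph, recompute embeddings during search) can be illustrated with a small sketch. This is not LEANN's API or implementation: `toy_embed`, `build_graph`, and `search` are hypothetical names, the embedding is a hashed bag-of-words stand-in for a real model, and the graph is a brute-force nearest-neighbor graph rather than HNSW or DiskANN. The point is only that the index holds integer edges, and vectors exist briefly for the nodes a query actually visits.

```python
import hashlib
import heapq
import math

DIM = 16  # toy dimensionality; real embedding models use hundreds of dimensions


def toy_embed(text, dim=DIM):
    """Hypothetical stand-in for an embedding model: normalized hashed bag of words."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.sha256(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b):
    # Inputs are unit-length, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))


def build_graph(chunks, k=2):
    """Link each chunk to its k most similar chunks, then drop the vectors."""
    vecs = [toy_embed(c) for c in chunks]  # needed only while building
    graph = {}
    for i, v in enumerate(vecs):
        scored = sorted(
            ((cosine(v, w), j) for j, w in enumerate(vecs) if j != i),
            reverse=True,
        )
        graph[i] = [j for _, j in scored[:k]]
    # Add reverse edges so every node can be reached from its neighbors.
    for i in list(graph):
        for j in list(graph[i]):
            if i not in graph[j]:
                graph[j].append(i)
    return graph  # only integer edges are returned; no embeddings are stored


def search(query, chunks, graph, entry=0, top_k=2, beam=4):
    """Best-first graph search that embeds a chunk only when it is visited."""
    q = toy_embed(query)
    cache = {}

    def score(i):
        if i not in cache:
            cache[i] = cosine(q, toy_embed(chunks[i]))  # recomputed on demand
        return cache[i]

    visited = {entry}
    candidates = [(-score(entry), entry)]  # max-heap via negated similarity
    best = [(score(entry), entry)]         # min-heap of the best `beam` nodes
    while candidates:
        neg_sim, node = heapq.heappop(candidates)
        if len(best) >= beam and -neg_sim < best[0][0]:
            break  # the closest unexplored node cannot improve the results
        for nbr in graph[node]:
            if nbr in visited:
                continue
            visited.add(nbr)
            s = score(nbr)
            if len(best) < beam or s > best[0][0]:
                heapq.heappush(candidates, (-s, nbr))
                heapq.heappush(best, (s, nbr))
                if len(best) > beam:
                    heapq.heappop(best)
    ranked = sorted(best, reverse=True)[:top_k]
    return [(chunks[i], s) for s, i in ranked], len(cache)


CHUNKS = [
    "quarterly budget spreadsheet for the finance team",
    "email about the team offsite schedule",
    "notes on vector database storage costs",
    "chat message about dinner plans on friday",
    "pdf manual for the office printer",
    "email thread about budget approval",
]
GRAPH = build_graph(CHUNKS)
hits, recomputed = search("budget email", CHUNKS, GRAPH)
print(hits, "embeddings recomputed:", recomputed)
```

The trade-off in this sketch is extra compute per query (each visited node is embedded again) in exchange for not keeping one vector per chunk on disk. How LEANN itself limits that recomputation cost is not covered in the summary above.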
Verify against the repo before relying on details.