Power the retrieval step of a RAG chatbot by storing document embeddings and running semantic search in under a millisecond.
Replace multiple specialized search systems with one database that handles vector, keyword, and full-text queries together.
Build an AI product backend in Python with a few lines of code to insert rows and run hybrid searches against million-scale datasets.
Store and query both vector embeddings and ordinary string or number fields in the same table without a separate database.
Ships as a single binary with no external dependencies, Docker images available for Linux and macOS. Windows requires WSL2 with extra host networking steps.
Infinity is a database designed specifically for AI applications, with a focus on search. Unlike a traditional database built primarily for storing and retrieving records, Infinity is built around the kinds of searches that AI systems need: finding similar items based on meaning (dense vector search), matching sparse keyword patterns (sparse vector search), multi-vector comparisons (tensor search), and conventional text search, all at once or in combination. The goal is to let developers building AI-powered products run these searches quickly without stitching together multiple separate systems. The database claims strong performance figures: queries on million-scale vector datasets completing in under 0.1 milliseconds, and full-text search across 33 million documents finishing in about 1 millisecond. These numbers reflect the kinds of workloads common in retrieval-augmented generation (RAG) applications, where an AI assistant needs to quickly find relevant documents before generating a response. Infinity supports several methods for combining and re-ranking results from different search types. Setup is straightforward by database standards. The server ships as a single binary with no external dependencies, and Docker images are available for Linux and macOS. A Python package provides the client-side API. The README includes a short code example showing how to connect, create a table, insert rows, and run a vector search in a few lines of Python. Windows users need to run through WSL2 to use Docker, with some extra steps to enable host networking. The system stores not just vectors but also ordinary data types like strings and numbers in the same table, so you do not need a separate database for the non-vector portions of your data. The Python API is designed to be approachable for developers familiar with data tools rather than database internals. Infinity is built by Infiniflow and is open to contributions. The team maintains a public roadmap on GitHub and a Discord community for questions and discussion.
← infiniflow on gitmyhub — every repo by this author, as a profile.
Verify against the repo before relying on details.