explaingit

quickwit-oss/tantivy

15,180 · Rust

TLDR

Tantivy is a full-text search engine library written in Rust.

In plain English

Tantivy is a full-text search engine library written in Rust. It is not a ready-to-run search server like Elasticsearch or Apache Solr. Instead, it is a piece of code (a crate, in Rust terms) that a developer adds to their own program so that the program can search through large amounts of text. The README describes it as closer in spirit to Apache Lucene, the older Java library that Elasticsearch and Solr are themselves built on top of.

The feature list covers what you can do with it. Searches use BM25 scoring, the same ranking formula Lucene uses. You can write queries in a natural form such as (michael AND jackson) OR "king of pop", and run phrase searches. Indexing is multithreaded and incremental, meaning you can add new documents without rebuilding the whole index. The README says indexing the full English Wikipedia takes under three minutes on the author's desktop. Startup is under 10 milliseconds, which the README calls useful for command-line tools.

Tantivy supports many field types: text, integers, floats, dates, IP addresses, booleans, and hierarchical facets. It can store documents in compressed form using LZ4 or Zstd, run range queries and faceted search, and roll up results with an aggregation collector that produces histograms, range buckets, averages, and stats. Tokenizers, the pieces that split text into searchable words, are configurable, with stemming for 17 Latin-script languages and third-party add-ons for Chinese, Japanese, and Korean.

The README is explicit about what Tantivy does not do. Distributed search across many machines is out of scope; for that the same team points readers to a separate project called Quickwit, which is built on top of Tantivy. Data inside an index is immutable, so editing a document means deleting it and indexing the new version. New documents only become searchable after a commit call on the index writer, and existing readers need to be reloaded to see the change.
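The commit-and-reload behaviour described above can be pictured with a small toy model. The sketch below is not Tantivy's API: ToyIndex, ToyReader, and all of their methods are hypothetical names invented for illustration, and the code uses only the Rust standard library. It assumes a single writer, whole-word matching, and an in-memory list of documents, none of which reflects how a real index is stored.

```rust
use std::sync::Arc;

/// Hypothetical toy document; not a Tantivy type.
#[derive(Clone, Debug, PartialEq)]
struct Doc {
    id: u64,
    body: String,
}

/// Hypothetical toy index: committed data is an immutable shared snapshot.
struct ToyIndex {
    committed: Arc<Vec<Doc>>,  // what freshly loaded readers will see
    pending_adds: Vec<Doc>,    // staged until commit()
    pending_deletes: Vec<u64>, // ids to drop from the snapshot at commit()
}

/// Hypothetical toy reader: holds one snapshot until it is reloaded.
struct ToyReader {
    snapshot: Arc<Vec<Doc>>,
}

impl ToyIndex {
    fn new() -> Self {
        ToyIndex {
            committed: Arc::new(Vec::new()),
            pending_adds: Vec::new(),
            pending_deletes: Vec::new(),
        }
    }

    fn add(&mut self, id: u64, body: &str) {
        self.pending_adds.push(Doc { id, body: body.to_string() });
    }

    fn delete(&mut self, id: u64) {
        // Drop any staged copy and mark the committed copy for removal.
        self.pending_adds.retain(|d| d.id != id);
        self.pending_deletes.push(id);
    }

    /// "Editing" a document is a delete followed by an add.
    fn update(&mut self, id: u64, body: &str) {
        self.delete(id);
        self.add(id, body);
    }

    /// Build a new snapshot; the old one is never modified in place.
    fn commit(&mut self) {
        let mut next: Vec<Doc> = self
            .committed
            .iter()
            .filter(|d| !self.pending_deletes.contains(&d.id))
            .cloned()
            .collect();
        next.append(&mut self.pending_adds);
        self.pending_deletes.clear();
        self.committed = Arc::new(next);
    }

    fn reader(&self) -> ToyReader {
        ToyReader { snapshot: Arc::clone(&self.committed) }
    }
}

impl ToyReader {
    /// Swap in the latest committed snapshot.
    fn reload(&mut self, index: &ToyIndex) {
        self.snapshot = Arc::clone(&index.committed);
    }

    /// Return ids of documents containing `term` as a whole word.
    fn search(&self, term: &str) -> Vec<u64> {
        self.snapshot
            .iter()
            .filter(|d| d.body.split_whitespace().any(|w| w.eq_ignore_ascii_case(term)))
            .map(|d| d.id)
            .collect()
    }
}

fn main() {
    let mut index = ToyIndex::new();
    let mut reader = index.reader();

    index.add(1, "king of pop");
    index.commit();
    // The reader still holds the snapshot taken before the commit.
    println!("before reload: {:?}", reader.search("pop"));

    reader.reload(&index);
    println!("after reload: {:?}", reader.search("pop"));

    index.update(1, "queen of soul");
    index.commit();
    reader.reload(&index);
    println!("after update: {:?}", reader.search("soul"));
}
```

The point of the design is that a reader never sees a half-written state: it keeps working against the snapshot it loaded until it is explicitly reloaded, and an update replaces a document rather than changing it in place.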
Bindings exist for Python and Ruby, and the README lists projects that use Tantivy, including a Matrix chat message indexer and a typo-tolerant search engine with a REST API. Companies named as users include Etsy and ParadeDB.
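To make the BM25 ranking mentioned earlier more concrete, here is a minimal sketch of one widely used form of the formula, written with the Rust standard library only. It is not taken from Tantivy's source: the function names are hypothetical, the constants K1 = 1.2 and B = 0.75 are conventional defaults assumed for illustration, and documents are modelled as plain lists of lowercase words.

```rust
/// Conventional BM25 parameters; assumed here, not quoted from the README.
const K1: f64 = 1.2;
const B: f64 = 0.75;

/// Inverse document frequency: rarer terms get a larger weight.
fn idf(total_docs: usize, docs_with_term: usize) -> f64 {
    let n = total_docs as f64;
    let df = docs_with_term as f64;
    (1.0 + (n - df + 0.5) / (df + 0.5)).ln()
}

/// Score contribution of one term in one document.
fn bm25_term_score(tf: usize, doc_len: usize, avg_doc_len: f64, idf: f64) -> f64 {
    let tf = tf as f64;
    // Longer-than-average documents are penalised through this length norm.
    let norm = K1 * (1.0 - B + B * doc_len as f64 / avg_doc_len);
    idf * tf * (K1 + 1.0) / (tf + norm)
}

/// Sum the per-term scores of `query` for `doc`, using `corpus` for statistics.
fn bm25_score(query: &[&str], doc: &[&str], corpus: &[Vec<&str>]) -> f64 {
    let total_len: usize = corpus.iter().map(|d| d.len()).sum();
    let avg_doc_len = total_len as f64 / corpus.len() as f64;

    query
        .iter()
        .map(|term| {
            // Term frequency in this document.
            let tf = doc.iter().filter(|w| *w == term).count();
            // Number of documents in the corpus that contain the term.
            let df = corpus
                .iter()
                .filter(|d| d.iter().any(|w| w == term))
                .count();
            bm25_term_score(tf, doc.len(), avg_doc_len, idf(corpus.len(), df))
        })
        .sum()
}

fn main() {
    let corpus = vec![
        vec!["king", "of", "pop"],
        vec!["king", "of", "rock"],
        vec!["queen", "of", "soul"],
    ];
    for (i, doc) in corpus.iter().enumerate() {
        println!("doc {}: {:.3}", i, bm25_score(&["king", "pop"], doc, &corpus));
    }
}
```

In this sketch a word that appears in every document, such as "of", contributes very little to the score, while a word found in only one document contributes much more. That is the intuition behind ranking by BM25 rather than by raw word counts.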


Generated 2026-05-21 · Model: sonnet-4-6 · Verify against the repo before relying on details.