haskallcurry/interpolation_search

Analysis updated 2026-06-24

★ 0RustAudience · developerComplexity · 3/5Setup · easy

Mindmap

mindmap
  root((interpolation-search))
    Inputs
      Size flags
      Query count
      Cold copies
      Random seed
    Outputs
      Timing tables
      SVG diagrams
      HTML draggable label
    Use Cases
      Compare lookup algorithms
      Reproduce a viral tweet
      Stress test cache behaviour
    Tech Stack
      Rust
      Cargo

mindmap root((interpolation-search)) Inputs Size flags Query count Cold copies Random seed Outputs Timing tables SVG diagrams HTML draggable label Use Cases Compare lookup algorithms Reproduce a viral tweet Stress test cache behaviour Tech Stack Rust Cargo

Click or tap to explore — scroll the page freely

What do people build with it?

USE CASE 1

Reproduce the tweet's interpolation vs binary search timing comparison

USE CASE 2

Benchmark a 1 billion key lookup if your machine has 40+ GiB of RAM

USE CASE 3

Generate the SVG and HTML diagrams that illustrate probe positions

USE CASE 4

Study an open-addressed hash set with linear probing in idiomatic Rust

What is it built with?

RustCargo

How does it compare?

	haskallcurry/interpolation_search	bakome-hub/bakome-crypto-quant-engine	darthchudi/lob
Stars	0	0	0
Language	Rust	Rust	Rust
Setup difficulty	easy	easy	easy
Complexity	3/5	3/5	3/5
Audience	developer	developer	researcher

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · easy Time to first run · 5min

Just cargo run --release works for the default 1K and 1M cases, the --include-1b switch needs roughly 40 GiB of RAM.

In plain English

interpolation_search is a small Rust project that benchmarks three different ways of looking up a 64 bit integer in a large sorted set. It is built around a comparison from a tweet, referred to in the code as the tweet.txt experiment. The three contenders are interpolation search over a sorted array of unique u64 keys, the standard library's slice::binary_search on the same array, and a deliberately simple open addressed hash set with linear probing, modular wraparound, the value 0 as the empty sentinel, and a 75 percent target load factor. The benchmark generator is deterministic. Stored keys are produced by feeding the integers 0 through n into a mix64 hash function, sorted for the array searches and shuffled before being inserted into the hash set. Timed queries draw a pseudorandom xorshift64 index in the range 0 to 2n and look up mix64 of that index, giving a 50 percent hit rate without storing a separate query stream and without a short repeated access cycle. A smoke run uses cargo run --release with flags such as --sizes 1K, --queries 100000, and --cold-copies 1. Running cargo run --release on its own runs the default 1K and 1M benchmarks. The --include-1b switch adds a 1 billion entry case, but only fits on a machine with enough memory, since the 1B hash set cold arena alone is about 40 GiB. The --cold-copies flag controls how many copies of the data are kept resident at the largest requested size, and smaller sizes scale inversely so the total cold footprint stays roughly constant. A --cold-reference-size option applies the same scaling while running only a subset of sizes, which is useful when sweeping random seeds at the smaller end. Alongside timings, the program writes three visual artefacts: an SVG table for the first tweet comparison, an SVG illustration of binary midpoint versus interpolation probe for the second tweet, and an HTML draggable label version of that second diagram. The --no-svg flag skips assets, and --tweet2-only regenerates only the second tweet output without re running the benchmarks.

Copy-paste prompts

Prompt 1

Walk me through how interpolation_search generates deterministic keys with mix64

Prompt 2

Explain what --cold-copies and --cold-reference-size do to the working set size

Prompt 3

Help me add a new search variant like Eytzinger layout to the benchmark harness

Prompt 4

Show me how the SVG probe diagram is built so I can tweak the styling

Frequently asked questions

What is interpolation_search?

Rust benchmark that compares interpolation search, binary search, and an open-addressed hash set on 64-bit integer lookups over large sorted sets.

What language is interpolation_search written in?

Mainly Rust. The stack also includes Rust, Cargo.

How hard is interpolation_search to set up?

Setup difficulty is rated easy, with roughly 5min to a first successful run.

Who is interpolation_search for?

Mainly developer.

Open on GitHub → Explain another repo

This repo across BitVibe Labs

Verify against the repo before relying on details.