explaingit

jepsen-io/jepsen

7,383 · Clojure · Audience · researcher · Complexity · 5/5 · Setup · hard

TLDR

A Clojure testing framework that deliberately crashes nodes and cuts network connections in a live distributed database, then checks whether the recorded history of operations is logically consistent with the system's guarantees.

Mindmap

mindmap
  root((Jepsen))
    What it does
      Injects faults
      Records operations
      Checks consistency
    Fault types
      Network partitions
      Node crashes
      Clock skew
    Infrastructure
      Control node
      Database nodes
      SSH connections
    Output
      Correctness report
      Performance graphs
      Web interface

Things people build with this

USE CASE 1

Write automated correctness tests for a distributed database that inject network partitions and node crashes

USE CASE 2

Detect consistency anomalies by analyzing every read and write recorded during a fault-injection run

USE CASE 3

Measure a database's availability under adversarial conditions like clock skew and killed processes

USE CASE 4

Reproduce published Jepsen correctness analyses against a specific database version

Tech stack

Clojure · Java · SSH

Getting it running

Difficulty · hard · Time to first run · 1 day+

Requires a control machine plus at least five database nodes reachable via SSH. LXC containers or cloud VMs work.

In plain English

Jepsen is a testing framework for distributed systems, the kind of software that runs across multiple computers at the same time and must coordinate between them. The library's tagline is "breaking distributed systems so you don't have to," which describes its purpose: it deliberately injects faults (network partitions, crashed nodes, clock drift) while running operations against a live system, then checks whether the recorded history of those operations is logically consistent. If it finds something that should not be possible given the system's guarantees, it reports the anomaly.

A Jepsen test is a program written in Clojure, a programming language that runs on the Java virtual machine. The test sets up a control node on your machine and connects via SSH to a set of database nodes where the target system runs. During the test, virtual clients send reads and writes to the system while a separate component called the "nemesis" disrupts things: dropping network packets between nodes, killing processes, manipulating system clocks. Jepsen records every operation's start and end time, then a checker analyzes whether the complete history could have legally occurred given the system's claimed consistency model.

Test results include correctness analysis, performance graphs, and availability charts saved to disk for review. There is also a web interface and a REPL (an interactive prompt) for examining test results in detail after a run.

To run tests, you need a control machine and at least five database machines, though these can be virtual machines or Linux containers rather than real hardware. AWS, LXC containers on a local machine, and ordinary VMs are all supported. The project notes that tests can aggressively modify the database nodes (killing processes, altering firewall rules, changing clocks), so running Jepsen against a production system is not recommended.

Jepsen has been used publicly to find correctness bugs in many well-known databases and coordination systems. The project website lists published analyses. The framework is primarily a tool for database developers and distributed systems researchers who want rigorous, automated correctness testing under adversarial conditions.
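
To make the "record a history, then check it" idea concrete, here is a minimal conceptual sketch in Python. It is not Jepsen's API and not the algorithm its checkers use. The `Op` record and the `is_linearizable` function are hypothetical names invented for this illustration. The sketch assumes a single register, only operations that completed, and a history small enough for brute-force search. It asks whether some ordering of the operations respects their recorded start and end times while every read returns the most recently written value.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Op:
    """One completed client operation (hypothetical record, not Jepsen's format)."""
    kind: str             # "write" or "read"
    value: Optional[int]  # value written, or value the read returned
    start: float          # time the client invoked the operation
    end: float            # time the client received the response


def is_linearizable(history: Sequence[Op], initial: Optional[int] = None) -> bool:
    """Brute-force check for a single register.

    Searches for a total order that respects real-time order and in which
    every read returns the latest written value. Exponential in the worst
    case, so suitable for toy histories only.
    """

    def search(remaining: frozenset, register: Optional[int]) -> bool:
        if not remaining:
            return True
        for i in remaining:
            op = history[i]
            # op cannot go next if another pending op finished before op began.
            if any(history[j].end < op.start for j in remaining if j != i):
                continue
            if op.kind == "read":
                if op.value != register:
                    continue  # this read could not have seen the current value
                next_register = register
            else:
                next_register = op.value
            if search(remaining - {i}, next_register):
                return True
        return False

    return search(frozenset(range(len(history))), initial)


# A read that overlaps a write may legally observe the new value.
concurrent = [Op("write", 1, 0.0, 10.0), Op("read", 1, 5.0, 15.0)]
print(is_linearizable(concurrent))  # True

# A read that starts after write 2 finished must not return the older value 1.
stale = [Op("write", 1, 0.0, 1.0), Op("write", 2, 2.0, 3.0), Op("read", 1, 4.0, 5.0)]
print(is_linearizable(stale))  # False
```

The second history is the kind of anomaly such a check is meant to surface: every operation succeeded, yet no legal ordering explains the stale read. A real checker also has to handle operations whose outcome is unknown (for example, a request that timed out during a partition), which this sketch deliberately leaves out.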

Copy-paste prompts

Prompt 1
Show me how to write a basic Jepsen test that writes and reads a single register on a 5-node cluster
Prompt 2
How do I configure Jepsen's nemesis to partition the network between specific nodes during a test run?
Prompt 3
Which Jepsen checker should I use to verify linearizability, and how do I read the analysis output it produces?
Prompt 4
How do I set up Jepsen to run on LXC containers locally instead of renting five real servers?
Prompt 5
How do I add clock-skew injection to a Jepsen test to check how my database handles time drift between nodes?

Verify against the repo before relying on details.