explaingit

burntsushi/xsv

10,759 · Rust · Audience: data · Complexity: 2/5 · Setup: moderate

TLDR

xsv is a fast Rust command-line tool for slicing, filtering, joining, sorting, and computing stats on CSV files. Note that it is unmaintained; qsv and xan are the actively developed successors.

Mindmap

mindmap
  root((xsv))
    Commands
      select columns
      filter rows
      join files
      stats
      slice
      split
    Key feature
      Index file
      Fast large files
    Use cases
      Data wrangling
      CSV analysis
      ETL pipelines
    Tech
      Rust
      crates.io
    Status
      Unmaintained
      Successors exist

Things people build with this

USE CASE 1

Count rows, select specific columns, and filter a large CSV file from the terminal without opening a spreadsheet.

USE CASE 2

Join two CSV files on a shared column using a fast hash-based approach that does not require pre-sorted input.

USE CASE 3

Compute statistics like mean, median, and standard deviation on a CSV column in seconds using an optional index file.
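The hash-based join mentioned in use case 2 can be illustrated with a minimal Python sketch. This is not xsv's code; it only shows the general build-and-probe technique, which is why neither input has to be sorted. The sample data, the function name hash_join, and the choice to merge the shared key column into a single field are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample inputs; note that neither is sorted by customer_id.
ORDERS = "order_id,customer_id\n1,b\n2,a\n3,b\n"
CUSTOMERS = "customer_id,name\na,Ada\nb,Bo\n"


def hash_join(left_text, left_key, right_text, right_key):
    """Inner-join two CSV strings on a key column using a hash table."""
    # Build phase: map each key value in the right input to the rows carrying it.
    buckets = defaultdict(list)
    for row in csv.DictReader(io.StringIO(right_text)):
        buckets[row[right_key]].append(row)

    # Probe phase: stream the left input once and look up matches by key.
    joined = []
    for row in csv.DictReader(io.StringIO(left_text)):
        for match in buckets.get(row[left_key], []):
            # Assumption: merge both rows into one dict, collapsing same-named columns.
            joined.append({**row, **match})
    return joined


rows = hash_join(ORDERS, "customer_id", CUSTOMERS, "customer_id")
for r in rows:
    print(r["order_id"], r["customer_id"], r["name"])
```

Only one input is held in memory as a lookup table while the other is read once in order, so the cost grows roughly with the combined size of the two files rather than requiring a sort step first.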

Tech stack

Rust

Getting it running

Difficulty · moderate · Time to first run · 30 min

Unmaintained; consider qsv or xan as actively developed alternatives. Install via cargo install xsv or build from source.

Dual-licensed under MIT and the Unlicense, so it can be used freely for any purpose.

In plain English

xsv is a command-line tool written in Rust for working with CSV files. It provides a collection of subcommands that cover the most common operations on tabular data: counting rows, selecting columns, filtering by regex, joining multiple files, sorting, slicing a range of rows, computing statistics like mean and standard deviation, and splitting one large file into many smaller ones. The design goal is for each command to be fast, composable with Unix pipes, and honest about performance tradeoffs.

One notable feature is indexing. Running xsv index on a CSV file creates a small companion index file that enables certain later commands to skip directly to the relevant rows instead of parsing everything from the start. This makes slice operations on large files nearly instant and speeds up statistics gathering significantly. The README demonstrates this on a 3.1-million-row world cities dataset where indexed operations finish in seconds.

The tool also formats CSV output into aligned columns in a terminal via the table command, handles files with unusual quoting or delimiter rules, and can do inner, outer, and cross joins between files using a hash-based approach that keeps things fast without requiring pre-sorted input.

The project is now unmaintained. The author recommends looking at qsv or xan as actively developed alternatives that cover similar ground. The repository remains available for reference and historical use, and the code compiles and runs fine for users who want to install it from source or via the crates.io package. It is dual-licensed under MIT and the Unlicense.
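To make the indexing idea described above concrete, here is a minimal Python sketch of a byte-offset index. It is not xsv's actual index format, which this page does not describe; it only shows, under that assumption, why knowing where each record starts lets a slice jump straight to the requested rows. The sample data and the names build_index and slice_rows are hypothetical.

```python
import csv
import io

# Hypothetical sample file contents; the first record has a quoted comma.
DATA = b'city,country\n"Paris, Texas",US\nLyon,FR\nOslo,NO\n'


def build_index(data):
    """Return the byte offset where each data record starts (header excluded)."""
    offsets = []
    in_quotes = False
    record_start = 0
    for pos, byte in enumerate(data):
        if byte == ord('"'):
            # Track quoting so newlines inside quoted fields do not end a record.
            in_quotes = not in_quotes
        elif byte == ord("\n") and not in_quotes:
            offsets.append(record_start)
            record_start = pos + 1
    if record_start < len(data):
        offsets.append(record_start)  # last record without a trailing newline
    return offsets[1:]  # drop the header's offset


def slice_rows(data, offsets, start, end):
    """Parse only records start..end (zero-based, end exclusive)."""
    if start >= len(offsets):
        return []
    begin = offsets[start]
    stop = offsets[end] if end < len(offsets) else len(data)
    chunk = data[begin:stop].decode("utf-8")
    return list(csv.reader(io.StringIO(chunk, newline="")))


offsets = build_index(DATA)
print(len(offsets))                       # row count comes straight from the index
print(slice_rows(DATA, offsets, 1, 3))    # parses only the last two records
```

Building the index costs one full pass over the file, but afterwards a slice touches only the bytes of the rows it returns, and in this sketch counting rows is just the length of the offsets list.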

Copy-paste prompts

Prompt 1
Using xsv, select only the 'name' and 'email' columns from a CSV file called users.csv and write the output to a new file. Show the command.
Prompt 2
I have a 3-million-row CSV file and want to get statistics on the 'price' column fast. Show me how to create an xsv index first, then run the stats command.
Prompt 3
Using xsv, join two CSV files, orders.csv and customers.csv, on a shared 'customer_id' column. Show the join command.
Prompt 4
How do I use xsv to filter rows where the 'country' column equals 'US' using a regex? Show the command.
Prompt 5
Using xsv, split a large CSV into smaller files of 10,000 rows each. Show the command and explain what output files it creates.

Verify against the repo before relying on details.