Check whether a file was corrupted during download by comparing its hash before and after transfer.
Build a fast key-value lookup table or cache using XXH3 hashes as bucket keys.
Detect duplicate files in a large collection by hashing each file and comparing results.
Speed up data deduplication in a backup or storage system without cryptographic overhead.
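The integrity-check and duplicate-detection use cases above follow one pattern: hash, compare, confirm. The sketch below is illustrative only. To stay self-contained it uses FNV-1a as a stand-in 64-bit hash, not xxHash, and the names `hash64`, `is_intact`, and `is_duplicate` are hypothetical helpers, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in 64-bit hash (FNV-1a), NOT xxHash. With the real library this
   body would be replaced by a call to XXH3_64bits(data, len) from xxhash.h. */
static uint64_t hash64(const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV-1a 64-bit offset basis */
    assert(data != NULL);
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;            /* FNV-1a 64-bit prime */
    }
    return h;
}

/* Integrity check: recompute the hash after transfer and compare it
   with the hash recorded before transfer. */
static int is_intact(const void *data, size_t len, uint64_t expected)
{
    return hash64(data, len) == expected;
}

/* Duplicate check: a length or hash mismatch rules a pair out cheaply.
   Equal hashes only make the pair a candidate, because distinct inputs
   can collide, so a byte-wise comparison confirms the match. */
static int is_duplicate(const void *a, size_t alen,
                        const void *b, size_t blen)
{
    if (alen != blen) return 0;
    if (hash64(a, alen) != hash64(b, blen)) return 0;
    return memcmp(a, b, alen) == 0;
}
```

In a real deduplication pass the hash of each file would be computed once and stored, so most pairs are rejected by comparing two 64-bit integers rather than re-reading file contents.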
xxHash is a hashing library written in C. A hash function takes a piece of data, such as a file or a string of text, and produces a short, fixed-length number called a hash. That number acts like a fingerprint: if the data changes even slightly, the hash almost certainly changes too. Hashes are used constantly in software to check whether data has been corrupted, look things up in tables, or detect duplicate files.
What sets xxHash apart is speed. The README benchmarks show its fastest variant, XXH3, processing data at roughly 31 gigabytes per second on a modern desktop processor, which is faster than that machine can read from RAM. Well-known hash functions such as MD5 and SHA-1 were designed with security in mind and run far more slowly. xxHash is not a security tool and makes no claim to be, but for non-security uses (integrity checking, hash tables, caching) it is much faster.
The library offers several variants. XXH32 produces a 32-bit hash suited to 32-bit processors, XXH64 produces a 64-bit hash for 64-bit systems, and XXH3 (stable since version 0.8.0) produces either 64-bit or 128-bit hashes and is optimized for modern processors through vectorized arithmetic, which processes multiple values at once. All variants pass SMHasher, an independent test suite that evaluates quality properties such as how evenly the output values are distributed.
The code is written in plain C, produces identical hashes on processors with different byte orderings, and is available either as a single header file you drop into a project or as a compiled library. It is free to use under a BSD-style license.
The README is fairly technical, covering benchmark numbers, build configuration options, and integration instructions. The core use case is simple: any software that needs to hash data quickly and does not need cryptographic security.
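To make the description of the 32-bit variant concrete, here is a minimal, unofficial sketch of the XXH32 algorithm as it is publicly specified. It is written for readability, not speed, and the name `xxh32_sketch` is hypothetical. Production code should call the library's own `XXH32(input, length, seed)` from `xxhash.h` instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define P1 0x9E3779B1U
#define P2 0x85EBCA77U
#define P3 0xC2B2AE3DU
#define P4 0x27D4EB2FU
#define P5 0x165667B1U

static uint32_t rotl32(uint32_t x, int r)
{
    return (x << r) | (x >> (32 - r));
}

/* Read 4 bytes as little-endian regardless of host byte order, which is
   what keeps the result identical across platforms. */
static uint32_t read32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint32_t round32(uint32_t acc, uint32_t input)
{
    acc += input * P2;
    acc = rotl32(acc, 13);
    return acc * P1;
}

static uint32_t xxh32_sketch(const void *data, size_t len, uint32_t seed)
{
    const uint8_t *p = (const uint8_t *)data;
    const uint8_t *end;
    uint32_t h;

    assert(data != NULL);
    end = p + len;

    if (len >= 16) {
        /* Four independent accumulators consume 16-byte stripes. */
        const uint8_t *limit = end - 16;
        uint32_t v1 = seed + P1 + P2, v2 = seed + P2;
        uint32_t v3 = seed, v4 = seed - P1;
        do {
            v1 = round32(v1, read32le(p));
            v2 = round32(v2, read32le(p + 4));
            v3 = round32(v3, read32le(p + 8));
            v4 = round32(v4, read32le(p + 12));
            p += 16;
        } while (p <= limit);
        h = rotl32(v1, 1) + rotl32(v2, 7) + rotl32(v3, 12) + rotl32(v4, 18);
    } else {
        h = seed + P5;
    }
    h += (uint32_t)len;

    /* Leftover input: 4 bytes at a time, then single bytes. */
    while (p + 4 <= end) {
        h += read32le(p) * P3;
        h = rotl32(h, 17) * P4;
        p += 4;
    }
    while (p < end) {
        h += (uint32_t)(*p) * P5;
        h = rotl32(h, 11) * P1;
        p++;
    }

    /* Final avalanche spreads every input bit across the output. */
    h ^= h >> 15;  h *= P2;
    h ^= h >> 13;  h *= P3;
    h ^= h >> 16;
    return h;
}
```

For the empty input with seed 0 this sketch returns `0x02CC5D05`. XXH3 gets its extra speed from a different, wider design that this sketch does not attempt to show.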
Verify against the repo before relying on details.