explaingit

seatgeek/fuzzywuzzy

9,261 · Python · Audience: developer · Complexity: 1/5 · Setup: easy

TLDR

This is the archived original home of a Python fuzzy string matching library. The project has been renamed to TheFuzz and moved to a new GitHub location, where all active development now happens.

Mindmap

mindmap
  root((fuzzywuzzy))
    What it does
      Fuzzy string match
      Similarity scoring
      Typo tolerant search
    Status
      Archived repo
      Renamed to TheFuzz
      New GitHub location
    Use Cases
      Search with typos
      Record deduplication
      Database matching
    Setup
      Install TheFuzz instead
      Python package
      Easy pip install

Things people build with this

USE CASE 1

Find the closest matching string from a list even when there are typos or slight spelling differences.
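As a rough illustration of this use case, the sketch below picks the closest entry from a list using Python's standard-library difflib. This is a stand-in chosen so the example runs without installing anything; it is not TheFuzz's own code, and its scores will not necessarily agree with TheFuzz's. The helper name closest_match, the cutoff value, and the product names are all hypothetical.

```python
import difflib

def closest_match(query, choices, cutoff=0.6):
    """Return the choice most similar to query, or None if nothing clears cutoff.

    Hypothetical helper built on difflib, not part of TheFuzz.
    """
    # Compare case-insensitively, but hand back the original spelling.
    by_lowercase = {choice.lower(): choice for choice in choices}
    hits = difflib.get_close_matches(
        query.lower(), list(by_lowercase), n=1, cutoff=cutoff
    )
    return by_lowercase[hits[0]] if hits else None

# Made-up catalogue for illustration only.
products = ["Wireless Keyboard", "Gaming Mouse", "USB-C Charger"]

best = closest_match("wirelss keybord", products)
print(best)

# A query with nothing in common with any entry yields None.
print(closest_match("zzzz", products))
```

TheFuzz offers process.extractOne for the same kind of lookup; check its README for the exact return value before relying on it.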

USE CASE 2

Match incoming database records to existing entries where field values may be spelled or abbreviated differently.
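One way to approach record matching is to normalize both sides, score every candidate, and accept the best one only if it clears a threshold. The sketch below does this with the standard library's difflib.SequenceMatcher as a stand-in for a fuzzy matching library. The function names, the 0.8 threshold, and the company names are assumptions made for illustration.

```python
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(kept.split())

def match_record(incoming, existing, threshold=0.8):
    """Return (best_entry, score), or (None, score) if the best score is below threshold.

    Hypothetical helper; the score is difflib's ratio in the range 0.0 to 1.0.
    """
    target = normalize(incoming)
    best_entry, best_score = None, 0.0
    for entry in existing:
        score = SequenceMatcher(None, target, normalize(entry)).ratio()
        if score > best_score:
            best_entry, best_score = entry, score
    if best_score < threshold:
        return None, best_score
    return best_entry, best_score

# Made-up existing entries for illustration only.
existing = ["Acme Corporation", "Globex Inc.", "Initech LLC"]

match, score = match_record("ACME Corporaton", existing)
print(match, round(score, 2))

no_match, low_score = match_record("Umbrella Group", existing)
print(no_match)
```

Character-level similarity handles typos well but will not equate abbreviations such as "Inc." and "Incorporated"; those usually need an explicit expansion step before scoring.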

Tech stack

Python

Getting it running

Difficulty: easy · Time to first run: 5 min

This repo is archived. Use TheFuzz at its new GitHub location instead, and install it with: pip install thefuzz

In plain English

This repository was a Python library for fuzzy string matching, which is the ability to compare two pieces of text and find a similarity score even when they are not identical. This is useful for tasks like matching a user's search term to a list of items when typos are present, or finding which entry in a database most closely matches an incoming record that might be spelled slightly differently.

The repository has been renamed and moved. It is now called TheFuzz and lives at a different GitHub address. The README is short and only explains this transition: version 0.19.0 of TheFuzz corresponds to version 0.18.0 of the original project, with the main difference being the name change throughout the code. New issues or pull requests should be submitted to the TheFuzz repository rather than here.

There is no further documentation in this repository about how the library works or what functions it provided. Anyone looking to use or contribute to the project should follow the link in the README to the current TheFuzz repository.
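To make the idea of a similarity score concrete, here is a minimal sketch that scores two strings from 0 to 100 using the standard library's difflib.SequenceMatcher. It is only an illustration of the concept: the similarity function is hypothetical, and the numbers it produces should not be assumed to equal those of TheFuzz's fuzz.ratio, which also reports scores on a 0 to 100 scale.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return an integer score from 0 (nothing in common) to 100 (identical).

    Hypothetical helper built on difflib, not TheFuzz's implementation.
    """
    return round(100 * SequenceMatcher(None, a, b).ratio())

identical = similarity("fuzzy wuzzy", "fuzzy wuzzy")
typo = similarity("fuzzy wuzzy", "fuzy wuzzy")
unrelated = similarity("fuzzy wuzzy", "thefuzz")

print(identical)  # identical strings score 100
print(typo)       # a single dropped letter still scores high
print(unrelated)  # mostly different strings score much lower
```

The useful property is the ordering: a near-miss scores close to 100 while an unrelated string scores far lower, which is what lets a threshold separate likely matches from noise.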

Copy-paste prompts

Prompt 1
Using TheFuzz (the renamed fuzzywuzzy library), show me how to compare a user's search input against a list of product names and return the best match even when the spelling is off.
Prompt 2
With TheFuzz in Python, how do I find all items in a list that are at least 80% similar to a given string? Show me the code and explain the difference between ratio and partial_ratio.
Prompt 3
Help me use TheFuzz to deduplicate a list of company names that have slight spelling variations or abbreviations, and group the similar ones together.

Verify against the repo before relying on details.