seatgeek/fuzzywuzzy

★ 9,261
Python
Audience · developer
Complexity · 1/5
Setup · easy

Mindmap

mindmap
  root((fuzzywuzzy))
    What it does
      Fuzzy string match
      Similarity scoring
      Typo tolerant search
    Status
      Archived repo
      Renamed to TheFuzz
      New GitHub location
    Use Cases
      Search with typos
      Record deduplication
      Database matching
    Setup
      Install TheFuzz instead
      Python package
      Easy pip install


Things people build with this

USE CASE 1

Find the closest matching string from a list even when there are typos or slight spelling differences.
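A minimal sketch of this use case, using only the Python standard library so it runs without installing anything. It is not TheFuzz's implementation: difflib.SequenceMatcher stands in for the scorer, and the names difflib_ratio, best_match, and the product list are hypothetical. With TheFuzz installed, its fuzz.ratio function should be usable as the scorer argument, and process.extractOne covers a similar job, but check the TheFuzz repository before relying on that.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, Optional, Tuple


def difflib_ratio(a: str, b: str) -> int:
    """Stand-in scorer: 0-100 similarity, case-insensitive (not TheFuzz's algorithm)."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def best_match(
    query: str,
    choices: Iterable[str],
    scorer: Callable[[str, str], int] = difflib_ratio,
) -> Optional[Tuple[str, int]]:
    """Return (choice, score) for the highest-scoring choice, or None if choices is empty."""
    scored = [(choice, scorer(query, choice)) for choice in choices]
    return max(scored, key=lambda pair: pair[1], default=None)


# Hypothetical sample data: the query contains two typos.
products = ["wireless mouse", "mechanical keyboard", "usb-c charger"]
print(best_match("mechancal keybord", products))
```

Passing the scorer as a parameter keeps the lookup logic independent of the scoring library, so the stand-in can be swapped out later.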

USE CASE 2

Match incoming database records to existing entries where field values may be spelled or abbreviated differently.
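One way this could look, sketched with the standard library only. The helpers normalize, score, and match_record, the threshold of 80, and the company names are all assumptions made for illustration; difflib.SequenceMatcher is a stand-in scorer and will not produce the same numbers as TheFuzz.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional


def normalize(value: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in value.lower())
    return " ".join(cleaned.split())


def score(a: str, b: str) -> int:
    """Stand-in 0-100 similarity on normalized text (not TheFuzz's algorithm)."""
    return round(100 * SequenceMatcher(None, normalize(a), normalize(b)).ratio())


def match_record(incoming: str, existing: Iterable[str], threshold: int = 80) -> Optional[str]:
    """Return the most similar existing entry, or None if nothing reaches the threshold."""
    best = max(existing, key=lambda entry: score(incoming, entry), default=None)
    if best is None or score(incoming, best) < threshold:
        return None
    return best


# Hypothetical existing entries and incoming values.
existing = ["Acme Corporation", "Globex Inc.", "Initech LLC"]
print(match_record("ACME Corporaton,", existing))  # misspelled, stray comma
print(match_record("Umbrella Group", existing))    # no plausible match
```

The threshold keeps an unrelated record from being forced onto its nearest neighbor. A whole-string ratio like this one penalizes heavy abbreviation (for example "Acme Corp."), so abbreviated fields may need a lower threshold or a partial-match scorer.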

Tech stack

Python

Getting it running

Difficulty · easy
Time to first run · 5 min

This repo is archived. Use TheFuzz at its new GitHub location instead, and install it with pip install thefuzz.

In plain English

This repository was a Python library for fuzzy string matching: comparing two pieces of text and producing a similarity score even when they are not identical. That is useful for matching a user's search term against a list of items when typos are present, or for finding which database entry most closely matches an incoming record that is spelled slightly differently.

The repository has been renamed and moved. It is now called TheFuzz and lives at a different GitHub address. The README is short and only explains this transition: version 0.19.0 of TheFuzz corresponds to version 0.18.0 of the original project, with the main difference being the name change throughout the code. New issues and pull requests should be submitted to the TheFuzz repository rather than here.

This repository has no further documentation on how the library works or which functions it provided. Anyone looking to use or contribute to the project should follow the link in the README to the current TheFuzz repository.
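To make the idea of a similarity score concrete, here is a small sketch using Python's built-in difflib module. It illustrates the concept only: the function name similarity and the sample strings are hypothetical, and the scores will not necessarily match what TheFuzz returns for the same inputs.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> int:
    """Return a 0-100 similarity score for two strings, ignoring case."""
    matcher = SequenceMatcher(None, a.lower(), b.lower())
    return round(100 * matcher.ratio())


print(similarity("new york mets", "new york mets"))   # identical strings score 100
print(similarity("new york mets", "new york meats"))  # one extra letter still scores high
print(similarity("new york mets", "atlanta braves"))  # unrelated strings score low
```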

Copy-paste prompts

Prompt 1

Using TheFuzz (the renamed fuzzywuzzy library), show me how to compare a user's search input against a list of product names and return the best match even when the spelling is off.

Prompt 2

With TheFuzz in Python, how do I find all items in a list that are at least 80% similar to a given string? Show me the code and explain the difference between ratio and partial_ratio.

Prompt 3

Help me use TheFuzz to deduplicate a list of company names that have slight spelling variations or abbreviations, and group the similar ones together.


Verify against the repo before relying on details.