explaingit

haujetzhao/asr-hotword

13 · Python · Audience · developer · Complexity · 2/5 · Setup · easy

TLDR

A Python post-processing library that fixes speech recognition errors for brand names and technical terms by matching sounds (phonemes) instead of letters, then swapping in the correct word automatically.

Mindmap

mindmap
  root((asr-hotword))
    How It Works
      Phoneme matching
      Fuzzy similarity score
      Threshold setting
    Hotword File
      One word per line
      Pipe-separated aliases
      Text expansion support
    Languages
      Chinese pinyin
      English letters
      Mixed text
    Performance
      5000 hotwords
      20ms per sentence
    Dependencies
      pypinyin
      rapidfuzz
    Origin
      CapsWriter-Offline
      Voice typing tool


Things people build with this

USE CASE 1

Fix brand names and technical terms that your speech-to-text tool keeps getting wrong, like 'CapsWriter' being heard as 'Caps Rider'.

USE CASE 2

Expand voice shortcuts into full text: say a short alias out loud and have it output a full phone number or email address.

USE CASE 3

Post-process transcripts from any speech recognition system to clean up proper nouns and uncommon words automatically.

USE CASE 4

Handle mixed Chinese and English voice input and correct recognition errors in both languages with one tool.

Tech stack

Python · pypinyin · rapidfuzz

Getting it running

Difficulty · easy · Time to first run · 30 min

Install two packages: pypinyin and rapidfuzz. Clone the repo, add your hotwords to a plain text file (word|alias format), then run the included demo script to test corrections immediately.
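The word|alias file format is simple enough to sketch a reader for it. The snippet below is an illustration only, not the library's own loader: the function name `parse_hotwords` is hypothetical, and the handling of blank lines and stray whitespace is an assumption, since the summary does not say how the real parser treats them.

```python
from typing import Dict, Iterable, List


def parse_hotwords(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each target text to the list of aliases that should trigger it."""
    table: Dict[str, List[str]] = {}
    for raw in lines:
        # Assumption: surrounding whitespace and empty fields are not significant.
        fields = [f.strip() for f in raw.split("|")]
        fields = [f for f in fields if f]
        if not fields:
            continue  # assumption: blank lines are skipped
        # The first field is the text to output; the rest are spoken aliases.
        target, aliases = fields[0], fields[1:]
        table.setdefault(target, []).extend(aliases)
    return table


sample = [
    "CapsWriter|Caps Rider|Caps Writer",
    "",
    "Claude|cloud",
]
hotwords = parse_hotwords(sample)
print(hotwords)
```

Because the function accepts any iterable of strings, an open text file can be passed to it directly in place of the `sample` list.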

No license information was mentioned in the explanation.

In plain English

Speech recognition software often gets uncommon words wrong. Brand names, technical terms, and proper nouns tend to come out garbled: "CapsWriter" might be transcribed as "Caps Rider", "Claude" as "cloud", and a Chinese brand name might appear as an ordinary word that sounds similar. This library is a post-processing tool you run on the raw text output from any speech recognition system to fix those mistakes.

The core idea is phoneme-based matching. Instead of comparing words letter by letter, the library converts both your list of target words and the recognized text into sound units (pinyin syllables for Chinese, individual letters for English), then measures how similar the sounds are. If the similarity score passes a threshold you set, the misrecognized fragment is replaced with the correct word. Processing 5,000 hotwords against a single sentence takes about 20 milliseconds.

You define your hotwords in a plain text file, one entry per line. Each entry has a target word followed by one or more aliases, separated by pipe characters. The aliases are alternate ways the word might get transcribed. Any alias that sounds similar enough to something in the recognized text will trigger a replacement with the first word in the line.

The first word does not have to be a correction target. It can be any text you want to type quickly, like a phone number or email address. Saying the alias out loud then outputs the full expansion.

The library is extracted from a larger project called CapsWriter-Offline, which is a voice typing tool for Windows. It supports mixed Chinese and English text. Installation requires two Python packages: pypinyin (for Chinese phoneme conversion) and rapidfuzz (for fuzzy string matching). The repository includes a sample hotword file and a demo script so you can test corrections against example inputs right away.
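To make the threshold-and-replace idea concrete, here is a minimal sketch for English text only. It is not the library's API or its algorithm: `letters`, `similarity`, `correct`, and the `max_words` parameter are hypothetical names, the 0.85 threshold is an arbitrary choice, and the standard library's `difflib.SequenceMatcher` stands in for rapidfuzz so the example runs without extra installs. Chinese input would additionally need a pinyin conversion step (the role pypinyin plays), which is left out here.

```python
import re
from difflib import SequenceMatcher
from typing import Dict, List


def letters(text: str) -> str:
    """Reduce text to lowercase letters and digits, the comparison units used here."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def similarity(a: str, b: str) -> float:
    """Score two strings between 0.0 and 1.0 (difflib as a stand-in for rapidfuzz)."""
    return SequenceMatcher(None, letters(a), letters(b)).ratio()


def correct(text: str, hotwords: Dict[str, List[str]],
            threshold: float = 0.85, max_words: int = 3) -> str:
    """Replace word windows that resemble an alias with that alias's target."""
    words = text.split()
    out: List[str] = []
    i = 0
    while i < len(words):
        best = None  # (score, window length in words, target)
        for n in range(1, min(max_words, len(words) - i) + 1):
            fragment = " ".join(words[i:i + n])
            for target, aliases in hotwords.items():
                for alias in aliases:
                    score = similarity(fragment, alias)
                    if score >= threshold and (best is None or score > best[0]):
                        best = (score, n, target)
        if best is not None:
            out.append(best[2])   # emit the target text
            i += best[1]          # skip the words that were replaced
        else:
            out.append(words[i])  # no match: keep the word unchanged
            i += 1
    return " ".join(out)


hotwords = {"CapsWriter": ["Caps Rider", "Caps Writer"], "Claude": ["cloud"]}
print(correct("I dictate with caps rider every day", hotwords))
# I dictate with CapsWriter every day
print(correct("ask cloud to summarize it", hotwords))
# ask Claude to summarize it
```

This brute-force loop compares every window against every alias, so it illustrates the behavior rather than the speed figure quoted above. The threshold also matters: at 0.8 the three-word window "with caps rider" would already score high enough against "Caps Rider" to swallow the word "with", which is why the sketch uses a stricter value.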

Copy-paste prompts

Prompt 1
I'm using the asr-hotword Python library to fix speech recognition errors. My hotword file has entries like 'CapsWriter|Caps Rider|Caps Writer'. Show me how to load this file and run correction on a string of recognized text.
Prompt 2
Using asr-hotword, how do I set the similarity threshold so that only very close phoneme matches trigger a replacement? Show a code example with pypinyin and rapidfuzz.
Prompt 3
I want to use asr-hotword to expand voice shortcuts, so that saying 'my email' outputs '[email protected]'. How do I format the hotword file entry and call the library to make this work?
Prompt 4
Help me integrate asr-hotword as a post-processing step after I get raw text from a speech recognition API. Show a minimal Python function that takes raw transcript text and returns corrected text.
Prompt 5
What are the performance limits of asr-hotword? I have 10,000 hotwords and need to process transcripts in real time. How should I structure my hotword file and calls to keep latency low?

Verify against the repo before relying on details.