Analysis updated 2026-05-18
Use your phone to record your car making an unusual noise, then run cardiag to get a ranked list of likely faulty parts.
Build a labeled dataset of car fault sounds by scraping YouTube clips and ingesting your own recordings.
Reuse the audio cleaning and CLAP embedding pipeline as a recipe for other mechanical sound classification projects.
| | adam-s/car-diagnosis | bongobongo2020/krea2-character-lora-trainer | duration-ai/bonsai-image-android |
|---|---|---|---|
| Stars | 8 | 8 | 8 |
| Language | Python | Python | Python |
| Setup difficulty | moderate | moderate | hard |
| Complexity | 3/5 | 3/5 | 5/5 |
| Audience | researcher | vibe coder | researcher |
Figures from each repo's GitHub metadata at analysis time.
Requires Python 3.11 and uv. A pre-trained model ships with the repo, so no large downloads are needed to run basic inference.
Cardiag is a command-line and web tool that listens to a recording of a car and tries to identify what is wrong with it. You point it at an audio clip, and it returns a verdict on whether something sounds abnormal, which area of the car the sound appears to come from, and a ranked list of likely faulty parts.
The tool is built as a triage aid, not a replacement for a diagnosis. The authors are clear about what it can and cannot do. On phone-quality recordings it achieves a 0.79 area under the ROC curve for detecting faults (random guessing scores 0.50), and the correct car zone appears in its top-3 results about 75 percent of the time. Crucially, when the audio is too ambiguous to call, it says "uncertain" rather than guessing. The README includes honest benchmark numbers and explains why one classifier head was cut after failing out-of-sample testing.
The pipeline has several stages. First, it scrapes audio examples from YouTube or TikTok, or accepts clips you record yourself. Then a cleaning step strips out speech, music, and road noise to isolate the mechanical sounds. A pre-trained audio model called CLAP converts those sounds into a numerical representation. Finally, small linear classifiers trained on top of that representation produce the fault, zone, and part predictions.
A pre-trained model ships with the repository, so you can run a diagnosis immediately after cloning without downloading additional data. The web app lets you drop in a clip or paste a YouTube link and see the result, including an explanation of why the model returned that answer. A separate inspect command generates a visual and audible breakdown of what the pipeline extracted from your clip.
The cleaning recipe and the honest training approach are noted as the main reusable contributions. The same method reaches 0.93 on cleaner engine audio datasets, suggesting the limitations come from the difficulty of phone recordings rather than from the technique itself.
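To make the two benchmark figures above easier to interpret, here is a minimal sketch of how an area under the ROC curve and a top-3 hit rate can be computed. It is not taken from the repo: the function names `roc_auc` and `top_k_hit_rate` and the toy labels and scores are assumptions chosen for illustration.

```python
def roc_auc(labels, scores):
    """Fraction of (faulty, healthy) pairs where the faulty clip scores higher.

    Ties count as half a win, so a scorer that cannot separate the
    classes lands at 0.50, the random-guessing baseline.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one faulty and one healthy example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def top_k_hit_rate(true_zones, ranked_zones, k=3):
    """Share of clips whose true zone appears among the first k ranked guesses."""
    hits = sum(1 for truth, ranking in zip(true_zones, ranked_zones)
               if truth in ranking[:k])
    return hits / len(true_zones)


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = faulty clip, 0 = healthy clip.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.3, 0.4, 0.1]
    print("AUC:", roc_auc(labels, scores))

    true_zones = ["engine", "brakes"]
    ranked_zones = [
        ["engine", "exhaust", "brakes"],
        ["engine", "exhaust", "suspension", "brakes"],
    ]
    print("top-3 hit rate:", top_k_hit_rate(true_zones, ranked_zones, k=3))
```

Read this way, an AUC of 0.79 means that a randomly chosen faulty clip outscores a randomly chosen healthy clip about 79 percent of the time.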
Audio-ML pipeline that analyzes a car recording to triage faults: detects if something sounds wrong, identifies the car zone, and ranks likely faulty parts with calibrated uncertainty.
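The triage behaviour summarized here (a fault verdict, a ranked list of parts, and an explicit "uncertain" outcome) can be sketched with plain linear heads over an embedding vector. This is an illustration under stated assumptions, not the repo's code: the `triage` function, the `low` and `high` thresholds, the part names, and the two-dimensional toy embedding are all hypothetical, and a real CLAP embedding would be far longer.

```python
import math


def sigmoid(x):
    """Map a raw linear score to a probability-like value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def linear_score(weights, bias, embedding):
    """Dot product of one linear head with the embedding, plus its bias."""
    return sum(w * x for w, x in zip(weights, embedding)) + bias


def triage(embedding, fault_head, part_heads, low=0.4, high=0.6, top_k=3):
    """Return a verdict and the top_k parts ranked by linear score.

    fault_head is (weights, bias); part_heads maps a part name to
    (weights, bias). Probabilities strictly between `low` and `high`
    (assumed thresholds) are reported as "uncertain" instead of a guess.
    """
    weights, bias = fault_head
    p_fault = sigmoid(linear_score(weights, bias, embedding))
    if p_fault >= high:
        verdict = "abnormal"
    elif p_fault <= low:
        verdict = "normal"
    else:
        verdict = "uncertain"
    ranked = sorted(
        part_heads,
        key=lambda name: linear_score(*part_heads[name], embedding),
        reverse=True,
    )
    return {"verdict": verdict, "p_fault": p_fault, "parts": ranked[:top_k]}


if __name__ == "__main__":
    embedding = [1.0, 0.0]  # toy stand-in for an audio embedding
    part_heads = {
        "belt": ([1.0, 0.0], 0.0),
        "bearing": ([2.0, 0.0], 0.0),
        "exhaust": ([0.0, 1.0], 0.0),
    }
    print(triage(embedding, ([3.0, 0.0], 0.0), part_heads))
    print(triage(embedding, ([0.0, 0.0], 0.0), part_heads))
```

The abstention band is the design choice worth noting: widening the gap between `low` and `high` trades coverage for fewer confident mistakes, which suits a triage aid.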
Mainly Python. The stack also includes CLAP and uv.
License details in the LICENSE file, not summarized in the README.
Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.
Mainly researchers.
Verify against the repo before relying on details.