explaingit

clvv/hf-uncensored-model-popularity

0 · HTML · Audience: data · Complexity: 2/5 · Active · Setup: easy

TLDR

Data and code behind a single Twitter chart measuring the share of uncensored or abliterated language models on Hugging Face by month, with a reproducible Python collector.

Mindmap

mindmap
  root((hf-uncensored-popularity))
    Inputs
      Hugging Face API
      Keyword filter list
      Tag filters
    Outputs
      CSV snapshots
      SVG charts
      HTML report
    Use Cases
      Reproduce the May 2026 chart
      Track uncensored model share
      Audit the keyword list
    Tech Stack
      Python
      huggingface_hub
      HTML
      SVG

Things people build with this

USE CASE 1

Reproduce the chart showing uncensored model share of Hugging Face downloads

USE CASE 2

Reuse the keyword list to classify your own pull of Hugging Face model metadata

USE CASE 3

Update the snapshot CSV to a fresher month and regenerate the SVG and HTML report

USE CASE 4

Audit the methodology behind a viral AI-stats tweet by inspecting the included metadata file

Tech stack

Python · huggingface_hub · HTML · SVG

Getting it running

Difficulty: easy · Time to first run: 5 min

The included CSVs are the authoritative snapshot; rerunning the collector against live Hugging Face data will yield slightly different numbers.
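Because a fresh run reads live data, it can help to see roughly what a live pull looks like. The following is a minimal sketch, not the repository's collector script: it assumes the `huggingface_hub` package is installed and uses `HfApi.list_models` with a single tag filter. The helper names `to_row` and `fetch_rows` and the row field names are hypothetical, chosen only for illustration.

```python
def to_row(model):
    """Flatten one model-info object into a plain dict (hypothetical row layout)."""
    created = getattr(model, "created_at", None)
    return {
        "repo_id": model.id,
        "tags": list(getattr(model, "tags", None) or []),
        # Point-in-time counts as reported by the API; missing values become 0.
        "downloads": getattr(model, "downloads", None) or 0,
        "likes": getattr(model, "likes", None) or 0,
        # Month bucket such as "2026-05"; None when no creation date is available.
        "created_month": created.strftime("%Y-%m") if created else None,
    }


def fetch_rows(tag="text-generation", limit=100):
    """Pull up to `limit` repos carrying `tag` from the live Hub (needs network)."""
    from huggingface_hub import HfApi  # imported lazily so to_row works offline

    return [to_row(m) for m in HfApi().list_models(filter=tag, limit=limit)]
```

Calling `fetch_rows(limit=5)` returns a short list of dicts whose counts reflect the moment of the call, which is why two runs on different days will not match each other or the shipped snapshot.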

In plain English

This repository holds the data and code behind a single chart that someone posted on Twitter. The chart looks at how popular so-called uncensored AI models are on Hugging Face, which is a public site where people share language models. The author is upfront that this is a fully AI-generated analysis, meant as a reproducible public snapshot rather than an official Hugging Face dataset or an audited academic study.

The headline number is that uncensored or heretic-tagged model repos created so far in May 2026 account for about 15.24 percent of rolling downloads among language-model-style repos. Across all time the share is smaller, around 0.9 percent of downloads and 1.8 percent of repos, so the May spike is the part being highlighted.

The method explains what was counted. The denominator pulls in all Hugging Face model repos tagged with text-generation, text2text-generation, conversational, image-text-to-text, gguf, or llama.cpp. The numerator narrows that group down to repos whose ID or tags include words like uncensored, heretic, abliterated, censorless, jailbreak, or liberated. Download and like counts are point-in-time numbers taken from Hugging Face's public API, grouped by the month each repo was created.

The repo ships a Python collector script, a compressed CSV snapshot of the raw data, a smaller monthly aggregate CSV, a self-contained HTML report, the SVG chart files, and a metadata file listing the exact filters and keyword list used. A citation file records hashes and provenance.

To reproduce, you install the huggingface_hub Python package and run the collector script. Fresh runs will give slightly different numbers because the Hugging Face data is live, so the included CSVs are the way to verify the exact chart shown in the tweet.
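The numerator and denominator described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the repository's code: the tag and keyword lists are copied from this summary (the repo's metadata file is the authoritative source), case-insensitive substring matching is an assumption about how "include words like" is applied, and the row layout (`repo_id`, `tags`, `downloads`, `created_month`) and the sample rows are made up.

```python
from collections import defaultdict

# Lists as given in the summary; the repository's metadata file may differ.
DENOMINATOR_TAGS = {"text-generation", "text2text-generation", "conversational",
                    "image-text-to-text", "gguf", "llama.cpp"}
KEYWORDS = ("uncensored", "heretic", "abliterated",
            "censorless", "jailbreak", "liberated")


def in_denominator(row):
    """True when the repo carries at least one language-model-style tag."""
    return bool(DENOMINATOR_TAGS & {t.lower() for t in row["tags"]})


def in_numerator(row):
    """True when the repo ID or any tag contains a keyword (assumed substring match)."""
    haystack = " ".join([row["repo_id"], *row["tags"]]).lower()
    return any(keyword in haystack for keyword in KEYWORDS)


def monthly_download_share(rows):
    """Percent of downloads going to keyword-matched repos, per creation month."""
    totals = defaultdict(lambda: [0, 0])  # month -> [matched downloads, all downloads]
    for row in rows:
        if not in_denominator(row):
            continue  # outside the denominator, so it cannot count in the numerator
        bucket = totals[row["created_month"]]
        bucket[1] += row["downloads"]
        if in_numerator(row):
            bucket[0] += row["downloads"]
    return {month: (100.0 * hit / total if total else 0.0)
            for month, (hit, total) in sorted(totals.items())}


# Made-up rows for illustration only; these are not values from the snapshot.
sample_rows = [
    {"repo_id": "org/base-7b", "tags": ["text-generation"],
     "downloads": 900, "created_month": "2026-05"},
    {"repo_id": "user/base-7b-abliterated", "tags": ["gguf"],
     "downloads": 100, "created_month": "2026-05"},
    {"repo_id": "user/vision-uncensored", "tags": ["image-classification"],
     "downloads": 500, "created_month": "2026-05"},
]
print(monthly_download_share(sample_rows))  # {'2026-05': 10.0}
```

The third sample row matches a keyword but has no denominator tag, so it is excluded entirely. That mirrors the description of the numerator as a subset of the denominator rather than an independent keyword search.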

Copy-paste prompts

Prompt 1
Run the clvv/hf-uncensored-model-popularity collector to produce a fresh CSV for the current month
Prompt 2
Extend the keyword filter in hf-uncensored-model-popularity to also count tags like dolphin and nous-uncensored
Prompt 3
Plot the monthly uncensored share from the included CSV as a stacked bar chart in matplotlib
Prompt 4
Write a notebook that diffs two CSV snapshots from hf-uncensored-model-popularity and reports which repos drove the change

Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.