explaingit

berzerk0/probable-wordlists

9,260 · Audience · developer · Complexity · 1/5 · Setup · easy

TLDR

A huge collection of real-world passwords sorted by how common they are, built from billions of leaked credentials. It is useful for security research and for checking that your own password is not one that millions of other people use.

Mindmap

mindmap
  root((probable-wordlists))
    What it is
      Password wordlists
      Frequency sorted
      2 billion entries
    Sources
      Public leak data
      SecLists
      Hashes.org
    Contents
      Real passwords
      Wireless subset
      Dictionary lists
      Analysis files
    Use cases
      Security research
      Password auditing
      Education

Things people build with this

USE CASE 1

Check if your password pattern is dangerously common by looking it up in the frequency-sorted lists

USE CASE 2

Use the 8-to-40-character wireless subset as a wordlist for authorized Wi-Fi security testing with Aircrack-ng

USE CASE 3

Apply the included HashCat rules and character masks to authorized password recovery research

USE CASE 4

Study which password patterns appear most often to teach others how to pick genuinely strong passwords
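
Use case 1 amounts to scanning a frequency-sorted list and reporting the position of the first match. The sketch below is one possible way to do that in Python; the function name `find_rank` and the `limit` parameter are hypothetical, and it assumes a list file with one password per line, most common first.

```python
from typing import Iterable, Optional


def find_rank(
    lines: Iterable[str],
    candidate: str,
    limit: Optional[int] = None,
) -> Optional[int]:
    """Return the 1-based position of `candidate` in a frequency-sorted
    list, or None if it is absent (or lies beyond `limit` entries)."""
    for rank, line in enumerate(lines, start=1):
        # Stop early when only the top `limit` entries matter.
        if limit is not None and rank > limit:
            break
        # Strip only the line ending: spaces can be part of a password.
        if line.rstrip("\r\n") == candidate:
            return rank
    return None


if __name__ == "__main__":
    # Tiny in-memory stand-in for a real list file (most common first).
    sample = ["123456\n", "password\n", "qwerty\n"]
    print(find_rank(sample, "qwerty"))           # 3
    print(find_rank(sample, "qwerty", limit=2))  # None
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, so only one downloaded list is streamed and nothing is loaded into memory at once. Opening the file with `errors="replace"` is a reasonable precaution, since leaked data is not guaranteed to be valid UTF-8.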

Getting it running

Difficulty · easy Time to first run · 5min
No license terms are mentioned in the explanation.

In plain English

This repository is a large collection of password wordlists sorted by how common each password actually is, rather than alphabetically. The core idea is simple: if you know which passwords millions of real people are using, you can make sure yours is not one of them. The project was built using data from publicly available password leaks and is intended strictly for lawful, ethical, and educational purposes.
The author spent the better part of a year gathering nearly 1,600 files totaling more than 350 gigabytes of leaked credentials from sources such as SecLists, Weakpass, and Hashes.org. Each file was cleaned up and stripped of internal duplicates, and all of them were then combined into one giant source. A password had to appear in at least five of those source files to make the final cut. The frequency with which a password appeared across all sources was treated as a measure of its popularity. The final output covers roughly two billion real passwords, sorted from most to least common.
The repository is organized into three main sections. The first is Real-Passwords, which contains actual leaked passwords, including a subfolder of entries between 8 and 40 characters long that is useful for wireless network testing. The second is Dictionary-Style Lists, which includes general-purpose word collections, common usernames, top-level domains, and other technically useful entries. The third is Analysis Files, which holds HashCat rules and character masks generated using the PACK project, useful for people doing password recovery or security research.
This is not a code project. The repository does not contain software to run; it serves as a reference library of files you can download selectively. A full clone is not recommended because of the size involved, so the project includes a separate downloads page that helps you get only what you need.
The project has been referenced in published books on password cracking, mentioned on the Security Now podcast, and cited by tools like Aircrack and L0phtcrack. Its main value is educational: it makes visible the patterns that weak passwords follow, so anyone who wants to pick a genuinely hard-to-guess password can see what to avoid.
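
The selection rule described above (deduplicate each source, keep passwords seen in at least five sources, sort by how many sources contain them) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's actual tooling: the names `rank_passwords` and `wireless_subset` are hypothetical, popularity is taken to be the number of source files containing a password, and the alphabetical tie-break is an arbitrary choice made only to keep the output deterministic.

```python
from collections import Counter
from typing import Dict, Iterable, List

MIN_SOURCES = 5  # threshold stated in the description above


def rank_passwords(
    sources: Dict[str, Iterable[str]],
    min_sources: int = MIN_SOURCES,
) -> List[str]:
    """Count in how many sources each password appears, keep those that
    reach `min_sources`, and return them most common first."""
    counts: Counter = Counter()
    for entries in sources.values():
        # Deduplicate within one source so each file counts at most once.
        unique = {entry.rstrip("\r\n") for entry in entries}
        unique.discard("")  # ignore blank lines
        counts.update(unique)
    kept = [(pw, n) for pw, n in counts.items() if n >= min_sources]
    # Assumption: ties are broken alphabetically for stable output.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [pw for pw, _ in kept]


def wireless_subset(
    passwords: Iterable[str],
    min_len: int = 8,
    max_len: int = 40,
) -> List[str]:
    """Keep entries in the 8-to-40-character range used by the wireless
    subfolder, preserving the incoming order."""
    return [pw for pw in passwords if min_len <= len(pw) <= max_len]


if __name__ == "__main__":
    # Six toy "leak files"; real inputs would be streamed from disk.
    toy = {f"leak{i}": ["123456", "123456", "password"] for i in range(5)}
    toy["leak5"] = ["123456", "rare-one"]
    ranked = rank_passwords(toy)
    print(ranked)                   # ['123456', 'password']
    print(wireless_subset(ranked))  # ['password']
```

At the scale described (hundreds of gigabytes), an in-memory `Counter` would not be practical; the sketch only shows the logic of the threshold and the ordering.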

Copy-paste prompts

Prompt 1
I'm doing authorized Wi-Fi penetration testing with Aircrack-ng. How do I download only the 8-to-40-character wireless subfolder from berzerk0/probable-wordlists and feed it to Aircrack-ng?
Prompt 2
Using the HashCat rules included in probable-wordlists, write a HashCat command to test a captured WPA2 handshake file against the most common password patterns.
Prompt 3
I want to search a probable-wordlists file to see if a specific password appears in the top 10,000 most common passwords without downloading the entire 350GB repo. How do I do that?
Prompt 4
How do I use the PACK-generated character masks from probable-wordlists to generate targeted HashCat mask attacks for authorized security research?

Verify against the repo before relying on details.