explaingit

minimaxir/big-list-of-naughty-strings

47,634 | Python | Audience · developer | Complexity · 1/5 | Dormant | License | Setup · easy

TLDR

A curated collection of text strings designed to break software (empty strings, Unicode edge cases, code injections, null bytes), for stress-testing any application that accepts user input.

Mindmap

mindmap
  root((repo))
    What it does
      Test data for bugs
      Edge case strings
      Organized by category
    Input types
      Empty and spaces
      Unicode characters
      Code injections
      Right-to-left text
      Null bytes
    Use cases
      Manual QA testing
      Automated test suites
      Sign-up forms
      Search boxes
    Formats
      Plain text file
      JSON version
      Third-party packages
      Python helper script

Things people build with this

USE CASE 1

Stress-test a sign-up form or login field by pasting naughty strings into input boxes during manual QA.

USE CASE 2

Loop through the JSON file in an automated test suite to catch regressions in how your app handles edge-case text.

USE CASE 3

Test a search box, comment field, or chat app against Unicode, injection, and null-byte payloads that commonly crash software.

USE CASE 4

Verify that a database or API correctly sanitizes or rejects malicious or unusual text inputs without crashing.
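Use case 2 above (looping through the JSON file in an automated test suite) could be sketched roughly as follows. This is a minimal illustration, not code from the repository: it assumes the JSON version is a flat array of strings saved locally as `blns.json`, and the names `load_naughty_strings` and `find_failures` are hypothetical helpers introduced here.

```python
import json
from pathlib import Path
from typing import Callable, List, Tuple


def load_naughty_strings(path: str) -> List[str]:
    """Load the JSON version of the list (assumed: one flat array of strings)."""
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, list) or not all(isinstance(item, str) for item in data):
        raise ValueError("expected a JSON array of strings")
    return data


def find_failures(
    handler: Callable[[str], object], strings: List[str]
) -> List[Tuple[str, Exception]]:
    """Call handler on every string and collect the inputs that raise."""
    failures = []
    for text in strings:
        try:
            handler(text)
        except Exception as exc:  # broad on purpose: any crash is a finding
            failures.append((text, exc))
    return failures
```

In a test suite, something like `assert find_failures(my_handler, load_naughty_strings("blns.json")) == []` would flag every input that makes `my_handler` (a stand-in for your own code) raise. Catching exceptions only detects crashes; checking that output is correctly escaped or rejected needs assertions specific to your application.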

Tech stack

Python · JSON · Node.js · .NET · PHP · C++

Getting it running

Difficulty · easy
Time to first run · 5 min
Use freely for any purpose, including commercial use, as long as you keep the copyright notice.

In plain English

The Big List of Naughty Strings is a curated collection of text strings chosen because they are likely to cause bugs, crashes, or unexpected behavior when a program receives them as user input. Developers often test their applications with normal, well-behaved input and forget to check what happens with unusual or edge-case text. Real users, and especially malicious ones, can submit empty strings, strings containing only spaces, very long strings, strings with special Unicode characters, strings that look like code injections, strings in right-to-left languages, strings with null bytes, and strings that have historically tripped up databases or web applications.

The project provides a plain text file called blns.txt in which each line is one of these problematic strings, organized into labeled categories. There is also a JSON version for loading the list programmatically in your own test scripts, and a small Python helper script generates the JSON from the text file. Third-party packages for Node.js, .NET, PHP, and C++ let you import the list directly into automated test suites without copying files manually.

You would use this when building any application that accepts text input from users (a sign-up form, a search box, a comment field, a file upload field, a chat app) and you want to stress-test it against the kinds of input that commonly break software. QA engineers paste these strings into forms during manual testing, and automated test suites loop through the JSON file to catch regressions. The list is language-agnostic: the strings are the test data, and the application under test handles them in whatever language it is written in.
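The text-to-JSON step described above can be illustrated with a short sketch. This is not the repository's helper script. It assumes that category labels in blns.txt are comment lines starting with `#` and that blank lines are separators; `parse_blns_lines` and `lines_to_json` are hypothetical names.

```python
import json
from typing import Iterable, List


def parse_blns_lines(lines: Iterable[str]) -> List[str]:
    """Return the test strings, skipping comment lines and blank lines.

    Assumption: category labels are comment lines that start with '#'.
    """
    strings = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        # Whitespace-only lines are kept: they are valid test strings.
        strings.append(line)
    return strings


def lines_to_json(lines: Iterable[str]) -> str:
    """Serialize the parsed strings as a JSON array."""
    # An empty string cannot occupy a line of its own, so add it explicitly.
    strings = [""] + parse_blns_lines(lines)
    return json.dumps(strings, ensure_ascii=False, indent=2)
```

For example, `with open("blns.txt", encoding="utf-8") as f: print(lines_to_json(f))` would print the array. Passing `ensure_ascii=False` keeps the Unicode test strings readable in the output instead of escaping them.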

Copy-paste prompts

Prompt 1
I'm building a web form that accepts user comments. How do I use the Big List of Naughty Strings to test my form's input validation?
Prompt 2
Show me how to load the naughty strings JSON file into a Python test script and loop through each string to test my application.
Prompt 3
I want to add the Big List of Naughty Strings to my Node.js test suite. What's the easiest way to import and use it?
Prompt 4
My chat app keeps crashing on certain Unicode inputs. How can I use this list to find and fix the problematic strings?
Prompt 5
What are the most common categories of strings in this list that break real-world applications?

Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.