explaingit

amueller/word_cloud

10,532 | Python | Audience · developer | Complexity · 2/5 | License | Setup · easy

TLDR

A Python library that generates word cloud images from text, sizing each word by how often it appears. Works from a Python script or the command line, and supports custom shapes, multiple languages, and color options.

Mindmap

mindmap
  root((word_cloud))
    What it does
      Sizes words by frequency
      Generates image files
      Custom shape masking
    Tech Stack
      Python
      Pillow image handling
      Matplotlib display
    Use Cases
      Presentations
      Data visualization
      Reports
    Setup
      pip install
      Command line tool
      Multi-language support

Things people build with this

USE CASE 1

Generate a word cloud image from a plain text file using the command line, no coding required

USE CASE 2

Create a shaped word cloud that fills the outline of any image, such as a logo or animal silhouette
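A minimal sketch of the masked workflow, assuming the `wordcloud`, `numpy`, and `Pillow` packages are installed. In `wordcloud`, pure white mask pixels are treated as masked out and words are drawn everywhere else, so the sketch first forces the image to two levels. The function names, the threshold of 128, and the file paths are hypothetical choices for illustration.

```python
def binarize(gray_rows, threshold=128):
    """Map grayscale pixel values to a two-level mask.

    Pixels brighter than `threshold` become 255 (white, left empty);
    all others become 0 (inside the shape, available for words).
    """
    return [[255 if value > threshold else 0 for value in row]
            for row in gray_rows]


def masked_cloud(text, mask_path, out_path):
    """Fill the dark region of the image at `mask_path` with words from `text`."""
    # Local imports keep `binarize` usable without the third-party packages.
    import numpy as np
    from PIL import Image
    from wordcloud import WordCloud

    # Load the silhouette as 8-bit grayscale, then force it to pure black/white.
    gray = np.array(Image.open(mask_path).convert("L"))
    mask = np.array(binarize(gray.tolist()), dtype=np.uint8)

    # With a mask supplied, the output image takes the mask's dimensions.
    cloud = WordCloud(mask=mask, background_color="white")
    cloud.generate(text)
    cloud.to_file(out_path)
    return cloud
```

A call such as `masked_cloud(long_text, "silhouette.png", "shaped.png")` would write the shaped image. Thresholding matters because anti-aliased edges in a silhouette are light gray rather than pure white, and would otherwise count as drawable area.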

USE CASE 3

Visualize the most frequent topics in a document or dataset for a presentation or report
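One way to approach this use case is to count words yourself and hand the counts to the library. The sketch below assumes `wordcloud` is installed for the rendering step; the counting step uses only the standard library. The stopword set here is a tiny illustrative sample (the library ships its own larger `STOPWORDS` set), and the function names, image size, and tokenizing regex are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword sample, not the library's own list.
SAMPLE_STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}


def top_words(text, limit=10, stopwords=SAMPLE_STOPWORDS):
    """Return the `limit` most frequent non-stopword words with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in stopwords)
    return dict(counts.most_common(limit))


def render_frequencies(frequencies, out_path):
    """Draw a word cloud from a {word: count} mapping and save it."""
    # Local import so the counting helper works without wordcloud installed.
    from wordcloud import WordCloud

    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(out_path)
    return cloud
```

Counting separately makes it easy to inspect or filter the top terms before rendering, for example `render_frequencies(top_words(report_text, limit=50), "topics.png")`.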

USE CASE 4

Add word cloud generation to a Python data pipeline or web application

Tech stack

Python · NumPy · Pillow · Matplotlib

Getting it running

Difficulty · easy | Time to first run · 5 min
Use freely for any purpose, including commercial use, as long as you keep the copyright notice.

In plain English

This is a Python library that generates word cloud images from text. A word cloud displays the words from a piece of writing at different sizes: the more frequently a word appears in the source text, the larger it is drawn. Word clouds are commonly used in presentations, reports, and social media to give a quick visual impression of what a document or dataset is about.

The library installs with pip or conda, the two most common Python package managers. Once installed, you can use it from a Python script or directly from the command line. The command-line tool takes a plain text file as input and writes an image file, so you can generate a word cloud without writing any code.

The library supports several visual styles. A simple word cloud places words at random positions on a plain background, sized by frequency. A masked version lets you supply a shape image, and the words are arranged to fill only that shape, so you could get a word cloud in the outline of an animal or a letter. The library also supports color customization and languages beyond English, including Arabic.

For PDFs, the README suggests piping the text output of a PDF-to-text conversion tool into the word cloud command. This works on Linux systems where that tool is included by default.

The library is MIT licensed and tested against several recent versions of Python. It depends on three common Python packages for math, image handling, and plotting (NumPy, Pillow, and Matplotlib). The code was originally shared in a 2012 blog post and has been maintained and expanded since then.
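The PDF workflow described above can also be driven from a script. This is a sketch under stated assumptions: the converter is taken to be `pdftotext` (from Poppler), both it and `wordcloud_cli` are assumed to be on the `PATH`, and the function names and file paths are hypothetical. It relies on `wordcloud_cli` reading text from standard input when no `--text` file is given.

```python
import subprocess


def pdf_cloud_commands(pdf_path, image_path):
    """Return the two pipeline stages as argument lists."""
    # "-" tells pdftotext to write the extracted text to stdout.
    extract = ["pdftotext", pdf_path, "-"]
    # With no --text option, wordcloud_cli reads its text from stdin.
    render = ["wordcloud_cli", "--imagefile", image_path]
    return extract, render


def pdf_to_cloud(pdf_path, image_path):
    """Extract text from a PDF and feed it to the word cloud command."""
    extract_cmd, render_cmd = pdf_cloud_commands(pdf_path, image_path)
    # Capture the extracted text, then pass it on as the second stage's stdin.
    extracted = subprocess.run(extract_cmd, capture_output=True, check=True)
    subprocess.run(render_cmd, input=extracted.stdout, check=True)
```

Capturing the text in memory instead of streaming it keeps the sketch short; `check=True` makes either stage raise `CalledProcessError` if the tool fails, rather than silently producing no image.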

Copy-paste prompts

Prompt 1
Using the wordcloud Python library, write a script that reads a .txt file, removes common English stopwords, generates a word cloud image 800x400 pixels with a blue color scheme, and saves it as a PNG.
Prompt 2
I have a black-and-white mask image of a cat silhouette. Write Python code using wordcloud to fill that shape with words from a long string, then display it with matplotlib.
Prompt 3
Using the wordcloud_cli command-line tool from the word_cloud project, what is the exact command to convert a text file called 'report.txt' into a word cloud PNG called 'output.png' with a white background?
Prompt 4
How do I use the wordcloud Python library to generate an Arabic word cloud from a string of Arabic text? Include any special configuration needed for right-to-left languages.
Prompt 5
Write a Python function using wordcloud that accepts a list of strings, joins them, removes stopwords, and returns a PIL image object of the resulting word cloud.

Verify against the repo before relying on details.