explaingit

googlecreativelab/quickdraw-dataset

6,734 | Audience · researcher | Complexity · 2/5 | Setup · easy

TLDR

50 million hand-drawn sketches across 345 categories, collected from Google's Quick, Draw! game and available in multiple formats for machine learning, generative art, and sketch recognition research.

Mindmap

mindmap
  root((quickdraw-dataset))
    What it is
      50M hand-drawn sketches
      345 categories
    Data formats
      Raw JSON strokes
      Binary fast-load
      28x28 NumPy images
    Use cases
      Sketch recognition
      Generative drawing
      Cultural analysis
    Audience
      ML researchers
      Creative coders

Things people build with this

USE CASE 1

Train a sketch recognition model to identify hand-drawn objects across 345 categories.
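As a starting point for such a classifier, the 28x28 bitmaps can be reshaped into image arrays. This is a minimal sketch, assuming each category's `.npy` file holds one flattened row of 784 `uint8` values per drawing (worth checking against the downloaded files). The helper name `prepare_bitmaps` is hypothetical, and a synthetic array stands in for a real `np.load("cat.npy")` call.

```python
import numpy as np

def prepare_bitmaps(flat: np.ndarray) -> np.ndarray:
    """Reshape flattened 28x28 bitmaps to (N, 28, 28) floats in [0, 1]."""
    if flat.ndim != 2 or flat.shape[1] != 28 * 28:
        raise ValueError(f"expected shape (N, 784), got {flat.shape}")
    return flat.reshape(-1, 28, 28).astype(np.float32) / 255.0

# Synthetic stand-in for np.load("cat.npy"); real data would come from the download.
fake = np.random.default_rng(0).integers(0, 256, size=(5, 784), dtype=np.uint8)
images = prepare_bitmaps(fake)
print(images.shape, images.dtype)
```

The resulting array can be passed to most image classification libraries after adding a channel axis if one is required.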

USE CASE 2

Generate new drawings in the style of a given category using the Sketch-RNN model and its prepared training subset.
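The Sketch-RNN subset is commonly described as storing each drawing in a "stroke-3" layout: rows of (dx, dy, pen_lifted) offsets rather than absolute points. The sketch below assumes that layout and uses a hypothetical helper, `stroke3_to_polylines`, to turn such a sequence back into absolute polylines for plotting or inspection.

```python
import numpy as np

def stroke3_to_polylines(seq):
    """Convert rows of (dx, dy, pen_lifted) into a list of (x, y) polylines."""
    seq = np.asarray(seq, dtype=float)
    # Cumulative sums turn per-step offsets into absolute positions.
    xy = np.cumsum(seq[:, :2], axis=0)
    polylines, current = [], []
    for (x, y), lifted in zip(xy, seq[:, 2]):
        current.append((float(x), float(y)))
        if lifted == 1:  # pen leaves the paper after this point
            polylines.append(current)
            current = []
    if current:  # keep a trailing stroke that never recorded a pen lift
        polylines.append(current)
    return polylines

# Hand-made sample; with a real file the sequences are assumed to come from
# something like np.load("bicycle.npz", encoding="latin1", allow_pickle=True)["train"].
sample = [[0, 0, 0], [10, 0, 0], [0, 10, 1], [5, 5, 0], [5, 0, 1]]
lines = stroke3_to_polylines(sample)
print(lines)
```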

USE CASE 3

Analyze how people from different countries draw the same objects by filtering the dataset by country metadata.
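A country filter over the line-delimited JSON files might look like the following sketch. It assumes each record carries `countrycode` and `recognized` fields (names to verify against the repository's format description); the helper `filter_drawings` and the sample records are made up for illustration.

```python
import json

def filter_drawings(lines, country, recognized_only=True):
    """Yield parsed records whose country code matches `country`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if record.get("countrycode") != country:
            continue
        if recognized_only and not record.get("recognized", False):
            continue
        yield record

# Made-up records standing in for lines read from a category file.
sample_lines = [
    '{"word": "cat", "countrycode": "US", "recognized": true, "drawing": [[[0, 10], [0, 10]]]}',
    '{"word": "cat", "countrycode": "DE", "recognized": true, "drawing": [[[5, 9], [1, 2]]]}',
    '{"word": "cat", "countrycode": "US", "recognized": false, "drawing": [[[3, 4], [7, 8]]]}',
]
kept = list(filter_drawings(sample_lines, "US"))
print(len(kept))
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, so a large category file never has to be loaded into memory at once.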

Tech stack

Python · NumPy · Google Cloud Storage

Getting it running

Difficulty · easy | Time to first run · 30 min

Each category is a separate file on Google Cloud Storage, and gsutil (part of the Google Cloud SDK) is the suggested tool for downloading them.
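For scripting per-category downloads, a small helper can assemble the `gsutil cp` invocation. This is only a sketch: the bucket name `gs://quickdraw_dataset`, the folder layout, and the file extensions in `FORMATS` are assumptions to check against the bucket listing, and `gsutil_command` is a hypothetical helper that builds the command without running it.

```python
import shlex

BUCKET = "gs://quickdraw_dataset"  # assumed public bucket name

# Assumed folder and file extension per format; verify against the bucket listing.
FORMATS = {
    "raw": ("full/raw", "ndjson"),
    "simplified": ("full/simplified", "ndjson"),
    "binary": ("full/binary", "bin"),
    "numpy_bitmap": ("full/numpy_bitmap", "npy"),
    "sketchrnn": ("sketchrnn", "npz"),
}

def gsutil_command(category, fmt="simplified", dest="."):
    """Return the argument list for copying one category file with gsutil."""
    folder, ext = FORMATS[fmt]
    uri = f"{BUCKET}/{folder}/{category}.{ext}"
    return ["gsutil", "-m", "cp", uri, dest]

cmd = gsutil_command("cat", "numpy_bitmap")
print(shlex.join(cmd))  # shell-safe string form of the command
```

The argument list can be handed to `subprocess.run(cmd, check=True)` once gsutil is installed and on the PATH.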

The repository's README states that the data is made available under the Creative Commons Attribution 4.0 International license. Google notes the data may contain some inappropriate content despite moderation.

In plain English

The Quick, Draw! Dataset is a collection of 50 million hand-drawn sketches contributed by players of Google's Quick, Draw! game, where participants had 20 seconds to draw a prompt and a neural network tried to guess what they drew. The drawings span 345 categories ranging from everyday objects to abstract concepts. Each drawing is stored as a sequence of pen strokes with x/y coordinates and timestamps, along with metadata including the category, the player's country, and whether the game successfully recognized the drawing.

Google has released the data in several formats to suit different use cases. The raw format stores each drawing as a JSON record in a plain text file, one drawing per line. A preprocessed version cleans up the strokes, removes timing data, and scales everything into a standard 256x256 pixel region. There is also a binary format for faster loading, and a set of 28x28 pixel grayscale bitmap images in NumPy format for anyone who wants to treat the drawings as images rather than vector paths.

All formats are hosted on Google Cloud Storage and can be downloaded by category. The raw data for each category arrives as a separate file, and the download can be done with a single command using Google's gsutil tool. A subset of the data, 75,000 samples per category, was prepared specifically for training the Sketch-RNN model, a generative model that can produce new drawings in the style of a given category. That version is stored in compressed NumPy files and was used in research on teaching computers to draw.

The dataset is made available for developers, researchers, and artists. Google notes that while the drawings were individually moderated, the collection may still contain some inappropriate content.
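To make the preprocessing step concrete, the sketch below shows one way to drop timing data and scale a raw drawing into a 0 to 255 coordinate range. It is an illustration under assumptions, not Google's actual pipeline: the raw stroke layout `[xs, ys, ts]` is assumed, the helper name `normalize_drawing` is hypothetical, and any resampling or stroke simplification the real preprocessing performs is left out.

```python
def normalize_drawing(raw_strokes, size=256):
    """Drop timestamps, shift to the origin, and scale uniformly into [0, size - 1]."""
    xs = [x for stroke in raw_strokes for x in stroke[0]]
    ys = [y for stroke in raw_strokes for y in stroke[1]]
    min_x, min_y = min(xs), min(ys)
    # A single scale factor for both axes preserves the aspect ratio.
    extent = max(max(xs) - min_x, max(ys) - min_y)
    scale = (size - 1) / extent if extent > 0 else 1.0
    return [
        [
            [round((x - min_x) * scale) for x in stroke[0]],
            [round((y - min_y) * scale) for y in stroke[1]],
        ]
        for stroke in raw_strokes
    ]

# Made-up raw drawing: each stroke is [xs, ys, timestamps].
raw = [
    [[100.0, 610.0], [50.0, 50.0], [0, 120]],
    [[100.0, 100.0], [50.0, 304.0], [300, 450]],
]
simple = normalize_drawing(raw)
print(simple)
```

Each output stroke keeps only its x and y lists, so the result has the same two-list-per-stroke shape the filtered and rendered examples in this kind of workflow usually expect.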

Copy-paste prompts

Prompt 1
How do I download the Quick Draw dataset for just the 'cat' category using gsutil and load it as 28x28 NumPy images for training a classifier?
Prompt 2
Using the Quick Draw dataset stroke format, write Python code to render a single drawing from its pen-stroke sequence as a matplotlib plot.
Prompt 3
I want to train Sketch-RNN on the Quick Draw bicycle category. How do I download the correct compressed NumPy files and feed them into the model?
Prompt 4
How do I filter the Quick Draw dataset to only keep drawings from a specific country and with recognized=True, then convert them to image format?

Verify against the repo before relying on details.