explaingit

faviovazquez/ds-cheatsheets

16,218 | Audience · data | Complexity · 1/5 | Setup · easy

TLDR

A curated collection of data science cheatsheets: quick-reference PDFs and images covering Python, R, machine learning, deep learning, SQL, and statistics, organized by topic for easy lookup.

Mindmap

mindmap
  root((repo))
    What it is
      Reference collection
      Quick lookup
      No code to run
    Python topics
      Pandas NumPy
      Jupyter
      Regex and imports
    ML topics
      Scikit-learn
      Deep learning
      Model selection
    Other topics
      R and tidyverse
      SQL
      Statistics

Things people build with this

USE CASE 1

Look up the exact pandas or numpy function name while working on a data analysis task without leaving your editor.

USE CASE 2

Use the scikit-learn cheatsheet to quickly choose the right machine learning model for your dataset.

USE CASE 3

Browse the collection as a study guide to map out all topics in data science before deciding what to learn next.

Getting it running

Difficulty · easy | Time to first run · 5 min

In plain English

ds-cheatsheets is a curated collection of data-science cheatsheets: single-page reference PDFs and images that summarise the key commands, syntax, and concepts of a tool or topic in a form you can glance at while working. The repository itself is essentially a long, organised table of contents that links out to each cheatsheet, rather than software you install and run.

The cheatsheets are grouped by area so you can jump to the section that matches what you are learning or struggling with. Categories include Business Science workflows, Python (basics, pandas, numpy, Jupyter, regular expressions, importing data), R (the tidyverse, data.table, dplyr, lubridate, stringr, purrr, R Markdown, package development), math and calculus refreshers, probabilities and statistics, big data tools (PySpark RDDs and DataFrames, Dask, sparklyr), machine learning (scikit-learn, caret, H2O, mlr, supervised and unsupervised learning summaries, choosing the right model), deep learning (Keras, neural networks, convolutional and recurrent networks), SQL, and data visualisation (Matplotlib, Seaborn, Bokeh and others).

You would use this repository as a study companion or a quick lookup when you are working on a data-science task and need to recall the exact name of a function or the form of a piece of syntax. Beginners use it to map out what topics exist in the field; more experienced practitioners use individual sheets as desk references during day-to-day work. There is no code or runtime here: the repository contains links and PDFs, and many of the cheatsheets come from third parties such as DataCamp, RStudio and Dataquest. This summary covers only part of the README; the full file is longer.
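The grouping by area described above can be pictured as a small lookup table. The sketch below is purely illustrative: the repository ships no index file or API, and the names `CHEATSHEET_INDEX` and `find_categories` are hypothetical. The entries are a subset of the categories and topics listed in this summary, not a verified inventory of the repo.

```python
# Hypothetical index built from categories named in this summary.
# The repository itself contains links and PDFs, not a data file like this.
CHEATSHEET_INDEX = {
    "Python": ["basics", "pandas", "numpy", "Jupyter",
               "regular expressions", "importing data"],
    "R": ["tidyverse", "data.table", "dplyr", "lubridate",
          "stringr", "purrr", "R Markdown", "package development"],
    "Big data": ["PySpark", "Dask", "sparklyr"],
    "Machine learning": ["scikit-learn", "caret", "H2O", "mlr"],
    "Deep learning": ["Keras", "neural networks"],
    "Data visualisation": ["Matplotlib", "Seaborn", "Bokeh"],
}


def find_categories(keyword: str) -> list[str]:
    """Return every category with a topic containing the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        category
        for category, topics in CHEATSHEET_INDEX.items()
        if any(needle in topic.lower() for topic in topics)
    ]


if __name__ == "__main__":
    print(find_categories("pandas"))  # ['Python']
```

In practice the same lookup is done by scanning the README headings; the table only makes the category-then-topic structure explicit.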

Copy-paste prompts

Prompt 1
I keep forgetting how to reshape DataFrames in pandas. Find the relevant cheatsheet in ds-cheatsheets and walk me through the key operations with examples.
Prompt 2
Which cheatsheets in ds-cheatsheets cover deep learning? List the Keras and neural network references and summarize what each one covers.
Prompt 3
Help me use the scikit-learn cheatsheet to choose the right algorithm for a classification task with 5000 rows and 20 features.
Prompt 4
I want to learn the R tidyverse. Which cheatsheets in ds-cheatsheets should I start with, and in what order?

Verify against the repo before relying on details.