karpathy/nn-zero-to-hero

Analysis updated 2026-06-21

★ 21,730 | Jupyter Notebook | Audience · developer | Complexity · 2/5 | Setup · easy

Mindmap

mindmap
  root((nn-zero-to-hero))
    What it does
      Free video course
      Jupyter code files
      Builds from scratch
    Topics Covered
      Backpropagation
      Language models
      GPT transformer
      Tokenization
    Projects Built
      micrograd engine
      makemore model
      GPT from scratch
    Audience
      Python beginners
      AI curious learners


What do people build with it?

USE CASE 1

Follow along with lectures to build a working GPT-style language model entirely from scratch in Python.

USE CASE 2

Study the micrograd notebooks to understand how backpropagation and gradient descent actually work.

USE CASE 3

Use the makemore notebooks to build a character-level model that generates new names or words.

USE CASE 4

Run the tokenization lecture code to understand how text is converted into number chunks for LLMs.
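To make the "number chunks" idea in Use Case 4 concrete, here is a minimal sketch of byte pair encoding (BPE), a scheme commonly used for LLM tokenizers. It is an illustration only: the helper names (`most_common_pair`, `merge`), the toy string, and the choice of three merges are assumptions, not code taken from the repository's notebooks.

```python
from collections import Counter


def most_common_pair(ids):
    """Return the most frequent adjacent pair of token ids."""
    pairs = Counter(zip(ids, ids[1:]))
    return max(pairs, key=pairs.get)


def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


text = "aaabdaaabac"  # toy input, chosen for illustration
ids = list(text.encode("utf-8"))  # start from raw bytes, ids 0..255
merges = {}
for step in range(3):  # 3 merges, an arbitrary number for the demo
    pair = most_common_pair(ids)
    new_id = 256 + step  # new ids start just above the byte range
    ids = merge(ids, pair, new_id)
    merges[pair] = new_id

print(len(text), "bytes ->", len(ids), "tokens")
```

Each merge replaces the most frequent adjacent pair with a new id, so frequently repeated text ends up represented by fewer, larger chunks.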

What is it built with?

Python, Jupyter Notebook, PyTorch

How does it compare?

	karpathy/nn-zero-to-hero	nirdiamant/genai_agents	zergtant/pytorch-handbook
Stars	21,730	21,801	21,628
Language	Jupyter Notebook	Jupyter Notebook	Jupyter Notebook
Setup difficulty	easy	easy	moderate
Complexity	2/5	3/5	2/5
Audience	developer	developer	researcher

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · easy | Time to first run · 30 min

Requires Python and Jupyter Notebook. PyTorch is needed for the later lectures.
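One way to check these requirements before opening a notebook is a short script like the sketch below. It assumes the usual import names `notebook` (Jupyter Notebook) and `torch` (PyTorch); the helper names `is_installed` and `report` are made up for this example.

```python
import importlib.util
import sys

# Import names assumed for the tools listed above.
REQUIRED = ["notebook"]      # Jupyter Notebook
LATER_LECTURES = ["torch"]   # PyTorch, only needed for later lectures


def is_installed(module_name: str) -> bool:
    """Return True if the module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def report() -> dict:
    """Map each package name to whether it was found."""
    return {name: is_installed(name) for name in REQUIRED + LATER_LECTURES}


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    for name, found in report().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

A missing `torch` does not block the first lecture, which by the description here uses only basic Python.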

In plain English

Neural Networks: Zero to Hero is a free video course, accompanied by Jupyter Notebook code files, that teaches how neural networks and modern AI language models work from first principles. It is a series of YouTube lectures in which the instructor writes code live, building increasingly complex neural network systems from scratch.

The course starts at the very bottom. Lecture 1 covers backpropagation, the core algorithm used to train neural networks. Rather than only explaining the concept, the instructor builds a tiny working neural network engine called "micrograd" from scratch using only basic Python.

Lectures 2 through 6 build a character-level language model called "makemore", a system that generates new words or names by learning statistical patterns from training data. It goes through increasingly sophisticated versions: a simple statistical model, a multilayer neural network, techniques for stabilizing training (Batch Normalization), a deep dive into manually computing gradients, and finally a convolutional architecture.

Lecture 7 builds a GPT (Generative Pre-trained Transformer), the same type of architecture used in AI chat systems, from scratch and in full. Lecture 8 covers tokenization, the process of converting text into numerical chunks that language models can process.

The course assumes basic Python knowledge and a vague memory of high school calculus. Each lecture links to a YouTube video and has corresponding Jupyter Notebook files in this repository, so you can follow along and run the code yourself. It is aimed at people who want to understand how modern AI systems work under the hood, not just use them.
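To give a feel for what a from-scratch backpropagation engine involves, here is a minimal scalar autograd sketch in basic Python. It is written in the spirit of the micrograd idea described above, but it is not the repository's code: the class layout, the method names, and the single-neuron example are assumptions made for illustration.

```python
import math


class Value:
    """A scalar that remembers how it was computed, so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a+b)/da = 1 and d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a*b)/da = b and d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))

        def _backward():
            # d tanh(x)/dx = 1 - tanh(x)^2
            self.grad += (1 - t * t) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Visit nodes in reverse topological order so each node's gradient
        # is complete before it is pushed to its parents.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


# One neuron: y = tanh(w * x + b)
x, w, b = Value(2.0), Value(0.5), Value(-1.0)
y = (w * x + b).tanh()
y.backward()
print(y.data, w.grad, x.grad, b.grad)  # prints: 0.0 2.0 0.5 1.0
```

Each operation records its inputs and a small function applying the chain rule locally; `backward` then runs those functions from the output back to the inputs. Gradient descent would nudge `w` and `b` against their gradients.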

Copy-paste prompts

Prompt 1

Walk me through the micrograd code from karpathy/nn-zero-to-hero step by step, and explain how backpropagation works in this tiny neural network engine.

Prompt 2

I am following the makemore lectures in karpathy/nn-zero-to-hero. Explain how batch normalization stabilizes training and why it matters.

Prompt 3

Help me understand the GPT implementation in karpathy/nn-zero-to-hero, specifically how the attention mechanism works in the transformer block.

Prompt 4

I cloned karpathy/nn-zero-to-hero. How do I set up Jupyter Notebook and run the first backpropagation exercise?

Prompt 5

Explain the difference between the bigram model and the MLP model in the makemore series of karpathy/nn-zero-to-hero.
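As background for Prompt 5, a bigram model can be sketched as a table of character-pair counts that is sampled one character at a time. The sketch below is an assumption-laden illustration, not the lecture code: the five-name dataset, the `.` start/end marker, and the `sample_name` helper are all hypothetical. An MLP version would, broadly, replace the count table with a trained network that predicts the next character from several previous ones.

```python
import random
from collections import defaultdict

# Hypothetical toy dataset; the lectures' actual training data is not shown here.
names = ["emma", "olivia", "ava", "mia", "amelia"]

# Count how often each character follows another; "." marks start and end.
counts = defaultdict(lambda: defaultdict(int))
for name in names:
    chars = ["."] + list(name) + ["."]
    for prev, nxt in zip(chars, chars[1:]):
        counts[prev][nxt] += 1


def sample_name(rng, max_len=20):
    """Generate one name by repeatedly sampling the next character."""
    out, prev = [], "."
    while len(out) < max_len:
        followers = counts[prev]
        # Sample in proportion to how often each follower was observed.
        nxt = rng.choices(list(followers), weights=list(followers.values()))[0]
        if nxt == ".":
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


rng = random.Random(0)  # fixed seed so runs are repeatable
samples = [sample_name(rng) for _ in range(5)]
print(samples)
```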

Frequently asked questions

What is nn-zero-to-hero?

A free video course with Jupyter Notebook code that teaches how neural networks and GPT-style language models work from scratch, building everything step by step in Python, with PyTorch used in the later lectures.

What language is nn-zero-to-hero written in?

The repository consists mainly of Jupyter Notebook files. The code is Python, with PyTorch needed for the later lectures.

How hard is nn-zero-to-hero to set up?

Setup difficulty is rated easy, with roughly 30 minutes to a first successful run.

Who is nn-zero-to-hero for?

Mainly developers with basic Python knowledge who want to understand how modern AI systems work under the hood.


Verify against the repo before relying on details.