Analysis updated 2026-06-21
Follow along with lectures to build a working GPT-style language model entirely from scratch in Python.
Study the micrograd notebooks to understand how backpropagation and gradient descent actually work.
Use the makemore notebooks to build a character-level model that generates new names or words.
Run the tokenization lecture code to understand how text is converted into number chunks for LLMs.
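To make "text converted into number chunks" concrete, here is a minimal sketch of one merge step of byte pair encoding, a common tokenization approach. It assumes a byte-level starting vocabulary (ids 0 to 255); the helper names `most_common_pair` and `merge` are hypothetical and chosen for illustration, so the lecture's own code may be organized differently.

```python
from collections import Counter


def most_common_pair(ids):
    """Return the most frequent adjacent pair of ids, or None if there is none."""
    pairs = Counter(zip(ids, ids[1:]))
    return max(pairs, key=pairs.get) if pairs else None


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2  # skip both members of the merged pair
        else:
            out.append(ids[i])
            i += 1
    return out


text = "aaabdaaabac"
ids = list(text.encode("utf-8"))   # raw bytes as integers in 0..255
pair = most_common_pair(ids)       # the pair that occurs most often
merged = merge(ids, pair, 256)     # 256 is the first id outside the byte range
```

In this toy string the most frequent pair is `(97, 97)`, the bytes for "aa", and one merge shortens the sequence from 11 ids to 9. Repeating the step with fresh ids (257, 258, and so on) grows the vocabulary while shrinking the encoded text.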
| | karpathy/nn-zero-to-hero | nirdiamant/genai_agents | zergtant/pytorch-handbook |
|---|---|---|---|
| Stars | 21,730 | 21,801 | 21,628 |
| Language | Jupyter Notebook | Jupyter Notebook | Jupyter Notebook |
| Setup difficulty | easy | easy | moderate |
| Complexity | 2/5 | 3/5 | 2/5 |
| Audience | developer | developer | researcher |
Figures from each repo's GitHub metadata at analysis time.
Requires Python and Jupyter Notebook; PyTorch is needed for the later lectures.
Neural Networks: Zero to Hero is a free video course, accompanied by Jupyter Notebook code files, that teaches how neural networks and modern AI language models work from first principles. The course is a series of YouTube lectures in which the instructor writes code live, building increasingly complex neural network systems from scratch.
The course starts at the very bottom: Lecture 1 covers backpropagation, the core mathematical algorithm used to train neural networks. Rather than just explaining the concept, the instructor builds a tiny working neural network engine called "micrograd" from scratch using only basic Python. Lectures 2 through 6 build a character-level language model called "makemore", a system that generates new words or names by learning statistical patterns from training data. It goes through increasingly sophisticated versions: a simple statistical model, a multilayer neural network, techniques for stabilizing training (Batch Normalization), a deep dive into manually computing gradients, and finally a convolutional architecture. Lecture 7 then builds a GPT (Generatively Pretrained Transformer), the same type of architecture used in AI chat systems, from scratch and in full. Lecture 8 covers tokenization, the process of converting text into numerical chunks that language models can process.
The course assumes basic Python knowledge and a vague memory of high school calculus. Each lecture links to a YouTube video and has corresponding Jupyter Notebook files in this repository, so you can follow along and run the code yourself. It is aimed at people who want to genuinely understand how modern AI systems work under the hood, not just use them.
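The backpropagation idea behind Lecture 1 can be sketched in a few dozen lines of basic Python. The `Value` class below is an illustrative, micrograd-style scalar autograd node written for this summary, not a copy of the repository's code: each node stores its value, its gradient, and a small closure that applies the chain rule to its inputs.

```python
import math


class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))

        def _backward():
            # d tanh(a)/da = 1 - tanh(a)^2
            self.grad += (1.0 - t * t) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        """Run the chain rule from this node back to every input."""
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)          # topological order: inputs before outputs
        self.grad = 1.0      # derivative of the output with respect to itself
        for node in reversed(order):
            node._backward()


# One neuron: y = tanh(x * w + b)
x, w, b = Value(2.0), Value(-3.0), Value(1.0)
y = (x * w + b).tanh()
y.backward()

# One gradient descent step on the weight (0.1 is an arbitrary learning rate)
w_updated = w.data - 0.1 * w.grad
```

After `y.backward()`, `w.grad` holds the derivative of `y` with respect to `w`, and the last line shows how gradient descent would use it to nudge the weight. Accumulating with `+=` matters when a value is used more than once in an expression, since each use contributes its own share of the gradient.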
A free video course with Jupyter Notebook code that teaches how neural networks and GPT-style language models work from scratch, building everything step by step in Python.
Mainly Jupyter Notebook. The stack also includes Python and PyTorch.
Setup difficulty is rated easy, with roughly 30 minutes to a first successful run.
Mainly developers.
Verify against the repo before relying on details.