karpathy/llm101n

Analysis updated 2026-05-18

★ 36,894 · Audience · developer · Complexity · 5/5 · Setup · hard

Mindmap

mindmap
  root((LLM101n))
    What it teaches
      Bigram models
      Neural networks
      Transformer architecture
      GPU acceleration
    Building blocks
      Mathematics foundations
      Backpropagation
      Attention mechanisms
      Tokenization
    Final project
      Storyteller AI
      Web application
      Fine-tuning
      Deployment
    Tech stack
      Python
      C
      CUDA
      NVIDIA GPUs
    Course structure
      17 chapters
      Progressive complexity
      Hands-on building
      Appendix topics

What do people build with it?

USE CASE 1

Learn how large language models work by building one from mathematical foundations to a deployed web application.

USE CASE 2

Understand transformer architecture, attention mechanisms, and GPU optimization through hands-on implementation.

USE CASE 3

Build a storyteller AI that can create, refine, and illustrate short stories end-to-end.

USE CASE 4

Master distributed training, quantization, and inference optimization techniques for LLMs.
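As a rough illustration of what "quantization" means in the last use case, the following is a minimal sketch of symmetric 8-bit quantization in plain Python. It is not taken from the course (which has not been released); the function names `quantize_int8` and `dequantize` are hypothetical, and real systems typically quantize per channel or per block with tensor libraries rather than Python lists.

```python
def quantize_int8(values):
    """Symmetric per-list quantization: map floats to integers in [-127, 127].

    Assumption for illustration: one shared scale for the whole list.
    """
    max_abs = max(abs(v) for v in values)
    # Avoid dividing by zero when every value is 0.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8-style range.
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the integer codes."""
    return [x * scale for x in q]


if __name__ == "__main__":
    weights = [0.0, 0.5, -1.0, 1.0, 0.123]
    codes, scale = quantize_int8(weights)
    print(codes, scale)
    print(dequantize(codes, scale))
```

The trade-off this shows is that each value is stored as a small integer plus one shared scale, at the cost of a rounding error of at most half a quantization step per value.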

What is it built with?

Python · C · CUDA · NVIDIA GPU

How does it compare?

	karpathy/llm101n	babysor/mockingbird	aseprite/aseprite
Stars	36,894	36,897	36,862
Language	—	Python	C++
Setup difficulty	hard	hard	easy
Complexity	5/5	4/5	2/5
Audience	developer	researcher	designer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard · Time to first run · 1 day+

Following the planned course would require an NVIDIA GPU with the CUDA toolkit, plus significant time working through the mathematical foundations before reaching runnable code.

License could not be detected automatically. Check the repository's LICENSE file before use.

In plain English

LLM101n is a planned course by Andrej Karpathy (a well-known AI researcher and educator) that aims to teach students how to build a large language model, the type of AI system that powers ChatGPT, from the ground up. The end product of the course would be a "Storyteller" AI that can create, refine, and illustrate short stories. The stated goal is to build everything end-to-end, from basic mathematics to a working web application, using Python, C, and CUDA (NVIDIA's platform and programming model for running code on its GPUs).

The README describes a detailed 17-chapter syllabus. It starts from the simplest possible language model (a bigram model, which predicts the next token from only the one before it) and builds up progressively through neural network backpropagation, attention mechanisms, the transformer architecture (the foundation of modern LLMs), tokenization, optimization techniques, GPU acceleration, distributed training across multiple GPUs, inference optimization including quantization and KV-caching, fine-tuning with human feedback, and finally deployment as a web app. The appendix lists supplementary topics such as tensor mechanics, different neural network architectures, and multimodal AI.

Important note from the README: at the time the README was written, the course did not yet exist. It is being developed by Eureka Labs, and the repository is archived until the course is ready.

You would follow this project if you want to learn how LLMs work at a deep technical level through a hands-on, build-it-yourself approach rather than just using existing models. The repository has no primary programming language assigned because course materials have not yet been released.
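To make the bigram model mentioned above concrete, here is a minimal counting-based sketch in Python. It is not the course's code (none has been released); it assumes each character is treated as one token to keep the vocabulary tiny, and the names `train_bigram`, `next_char_probs`, and `sample` are hypothetical.

```python
import random
from collections import defaultdict


def train_bigram(text):
    """Count how often each character follows each other character."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts


def next_char_probs(counts, prev):
    """Normalize the counts for `prev` into a probability distribution."""
    row = counts.get(prev, {})
    total = sum(row.values())
    return {ch: n / total for ch, n in row.items()}


def sample(counts, start, length, seed=0):
    """Generate text by repeatedly sampling the next character."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    out = [start]
    for _ in range(length):
        row = counts.get(out[-1])
        if not row:
            break  # the last character was never followed by anything
        chars = list(row.keys())
        weights = list(row.values())
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)


if __name__ == "__main__":
    corpus = "once upon a time there was a tiny storyteller"
    model = train_bigram(corpus)
    print(next_char_probs(model, "t"))
    print(sample(model, "o", 20))
```

The model's only "knowledge" is the table of pair counts, which is why the syllabus treats it as the simplest starting point before moving to neural networks that learn these probabilities instead of counting them.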

Copy-paste prompts

Prompt 1

Walk me through the syllabus of LLM101n and explain what I'll learn in each of the 17 chapters.

Prompt 2

How would I implement a bigram language model as the first step in LLM101n?

Prompt 3

What are the key differences between the chapters on attention mechanisms and the transformer architecture in LLM101n?

Prompt 4

Show me how to set up CUDA and Python to follow along with the GPU acceleration chapters of LLM101n.

Prompt 5

What does the final Storyteller project in LLM101n involve, and how do I deploy it as a web app?

Frequently asked questions

What is llm101n?

A planned course by Andrej Karpathy teaching how to build a large language model from scratch, starting with basic math and ending with a working AI storyteller web app.

What license does llm101n use?

License could not be detected automatically. Check the repository's LICENSE file before use.

How hard is llm101n to set up?

Setup difficulty is rated hard, with roughly a day or more to a first successful run.

Who is llm101n for?

Mainly developers.

Verify against the repo before relying on details.