karpathy/mingpt

Analysis updated 2026-06-21

★ 24,310PythonAudience · researcherComplexity · 3/5Setup · moderate

Mindmap

mindmap
  root((minGPT))
    What It Does
      Readable GPT reimplementation
      Transformer architecture demo
      GPT-2 weight loading
    Tech Stack
      Python
      PyTorch
    Key Files
      Model definition
      Tokenizer
      Training loop
    Use Cases
      ML education
      GPT from scratch
      Text generation
      Research baseline

mindmap root((minGPT)) What It Does Readable GPT reimplementation Transformer architecture demo GPT-2 weight loading Tech Stack Python PyTorch Key Files Model definition Tokenizer Training loop Use Cases ML education GPT from scratch Text generation Research baseline

Click or tap to explore — scroll the page freely

What do people build with it?

USE CASE 1

Study the Transformer architecture hands-on by reading and running a minimal, well-commented GPT implementation.

USE CASE 2

Train a small GPT from scratch on a text file to see character-level language modeling in action.

USE CASE 3

Load OpenAI's pretrained GPT-2 weights and generate text to understand how pretraining and inference connect.

USE CASE 4

Experiment with GPT on simple tasks like number addition to build intuition before tackling larger models.

What is it built with?

PythonPyTorch

How does it compare?

	karpathy/mingpt	scrapegraphai/scrapegraph-ai	plotly/dash
Stars	24,310	24,389	24,150
Language	Python	Python	Python
Setup difficulty	moderate	moderate	easy
Complexity	3/5	2/5	3/5
Audience	researcher	developer	data

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · moderate Time to first run · 30min

Requires PyTorch installed, GPU recommended but not required for small demos.

In plain English

MinGPT is a stripped-down, educational reimplementation of GPT, the type of AI model behind ChatGPT, written by Andrej Karpathy, a prominent AI researcher. GPT (Generative Pretrained Transformer) is the family of language models that take a sequence of text as input and predict what comes next. MinGPT's purpose is not to be the most capable or efficient version, it exists to be the most readable version, so people can actually understand what is happening inside these models. The entire core implementation is about 300 lines of Python code split across three files: the model definition (the Transformer neural network itself), a tokenizer (which converts text into numbers the model can process), and a generic training loop. The Transformer is the architecture that modern large language models are built on, it processes sequences by letting every token "attend" to every other token to understand context. The repo includes several small demonstrations: training a GPT from scratch to add numbers, training one as a character-level text generator on any text file, and loading OpenAI's pretrained GPT-2 weights to generate text from a prompt. A machine learning student or researcher would use minGPT when they want to understand GPT from the ground up without wading through the complexity of production implementations. It is written in Python using PyTorch, a popular deep learning library. Note that the author has since created a successor called nanoGPT for those who want something similarly educational but more capable.

Copy-paste prompts

Prompt 1

Walk me through minGPT's model.py file and explain how the Transformer attention mechanism works in plain English.

Prompt 2

Show me how to train minGPT from scratch on a custom text file and generate samples from the trained model.

Prompt 3

How do I load OpenAI GPT-2 pretrained weights into minGPT and run text generation from a starting prompt?

Prompt 4

Explain what each of the three core files in minGPT does and how they connect to train a language model.

Prompt 5

How does minGPT's training loop work and what hyperparameters should I tune first when experimenting?

Frequently asked questions

What is mingpt?

MinGPT is a 300-line readable PyTorch reimplementation of GPT designed to help you understand how modern AI language models work from the ground up.

What language is mingpt written in?

Mainly Python. The stack also includes Python, PyTorch.

How hard is mingpt to set up?

Setup difficulty is rated moderate, with roughly 30min to a first successful run.

Who is mingpt for?

Mainly researcher.

Open on GitHub → Explain another repo

This repo across BitVibe Labs

Scan in gitsafehub Deploy in gitdeployhub karpathy on gitmyhub

Verify against the repo before relying on details.