rasbt/llms-from-scratch

Analysis updated 2026-05-18

★ 92,051 | Jupyter Notebook | Audience · developer | Complexity · 3/5 | Setup · easy

Mindmap

mindmap
  root((repo))
    What it does
      Build LLM from scratch
      Learn attention mechanism
      Train and fine-tune models
    Learning path
      Text data handling
      Model architecture
      Pretraining pipeline
      Instruction fine-tuning
    Tech stack
      PyTorch
      Python
      Jupyter Notebooks
    Use cases
      Study LLM internals
      Experiment with fine-tuning
      Understand transformer models
    Audience
      Students and learners
      ML practitioners
      Curious developers

What do people build with it?

USE CASE 1

Study how transformer models and attention mechanisms work by implementing them yourself.

USE CASE 2

Build and train a small GPT-style language model on your own machine to understand the full pipeline.

USE CASE 3

Fine-tune pretrained model weights for text classification or instruction-following tasks.

USE CASE 4

Learn PyTorch deep learning fundamentals through hands-on implementation of a real language model.

What is it built with?

Python, PyTorch, Jupyter Notebook

How does it compare?

	rasbt/llms-from-scratch	microsoft/ml-for-beginners	microsoft/generative-ai-for-beginners
Stars	92,051	85,669	110,270
Language	Jupyter Notebook	Jupyter Notebook	Jupyter Notebook
Setup difficulty	easy	easy	moderate
Complexity	3/5	2/5	2/5
Audience	developer	general	developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · easy | Time to first run · 5 min

License could not be detected automatically. Check the repository's LICENSE file before use.

In plain English

LLMs-from-scratch is the official code repository accompanying Sebastian Raschka's book "Build a Large Language Model (From Scratch)." Its purpose is to teach how a ChatGPT-style large language model actually works by walking the reader through building a small but fully functional version of one, line by line, using PyTorch, a popular Python framework for deep learning. Rather than calling someone else's pretrained model, the reader codes the whole pipeline themselves and runs it on their own machine.

The repository is organised by book chapters. After an introductory chapter explaining what LLMs are, later chapters guide the reader through working with text data, coding the attention mechanism that lets the model look at different parts of an input at once, building a GPT-style model architecture, pretraining the model on unlabelled text, fine-tuning it for text classification, and fine-tuning it again so it can follow instructions like a chat assistant. Appendices add an introduction to PyTorch, references, exercise solutions, extras for the training loop, and a parameter-efficient fine-tuning method called LoRA.

Each chapter ships as Jupyter notebooks plus standalone Python scripts and exercise solutions. The README also notes that the code can load weights from larger pretrained models so readers can experiment with fine-tuning a real model after building their own.

People typically use this repository as study material, alongside the book or on its own, to gain intuition about how modern language models are built, trained, and adapted.
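To make the attention step described above more concrete, here is a minimal sketch of scaled dot-product attention with a causal mask, the form commonly used in GPT-style models. It is not code from the repository or the book: it uses only the Python standard library instead of PyTorch, skips the learned query/key/value projections a real model would have, and the function names and toy numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(queries, keys, values):
    """Scaled dot-product attention with a causal mask.

    queries, keys, values: lists of equal-length vectors, one per token.
    Returns (outputs, weights), where weights[i][j] is how much
    token i attends to token j.
    """
    dim = len(keys[0])
    outputs, weights = [], []
    for i, q in enumerate(queries):
        # Causal mask: token i may only look at positions 0..i.
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        row = softmax(scores)
        # Context vector: weighted sum of the visible value vectors.
        context = [
            sum(w * values[j][d] for j, w in enumerate(row))
            for d in range(len(values[0]))
        ]
        # Pad with zeros so every row has one entry per token.
        weights.append(row + [0.0] * (len(queries) - i - 1))
        outputs.append(context)
    return outputs, weights


# Toy input: three tokens with two-dimensional embeddings (made-up numbers).
# Assumption: the raw embeddings serve as queries, keys, and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_self_attention(tokens, tokens, tokens)
```

Each row of `weights` sums to 1, and entries to the right of the diagonal are zero because a token cannot attend to positions that come after it. The first token can only see itself, so its output equals its own value vector.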

Copy-paste prompts

Prompt 1

Walk me through the attention mechanism code in chapter 3 of llms-from-scratch and explain how it lets the model focus on different parts of the input.

Prompt 2

I want to fine-tune a pretrained model using the LoRA method from the appendix. Show me the key steps and how to adapt the code for my dataset.

Prompt 3

Help me understand the pretraining loop in llms-from-scratch: what loss function is used, how are batches created, and what does the training curve typically look like?

Prompt 4

I've built the GPT model from the repo. How do I load real pretrained weights and adapt the code to work with them?

Frequently asked questions

What is llms-from-scratch?

Learn how ChatGPT-style language models work by building one from scratch in PyTorch, chapter by chapter, with code you run yourself.

What language is llms-from-scratch written in?

Mainly Jupyter Notebook. The stack also includes Python and PyTorch.

What license does llms-from-scratch use?

License could not be detected automatically. Check the repository's LICENSE file before use.

How hard is llms-from-scratch to set up?

Setup difficulty is rated easy, with roughly 5 minutes to a first successful run.

Who is llms-from-scratch for?

Mainly developers.

Verify against the repo before relying on details.