jingyaogong/minimind

🔥 Hot★ 50,134PythonAudience · developerComplexity · 4/5ActiveLicenseSetup · hard

Things people build with this

USE CASE 1

Train a working language model on your own GPU to understand how ChatGPT-like systems actually work internally.

USE CASE 2

Fine-tune a small model using LoRA or RLHF techniques to customize its behavior for specific tasks.

USE CASE 3

Deploy a trained model locally via an OpenAI-compatible API and integrate it into your own applications.

USE CASE 4

Study the complete training pipeline from raw text to inference without relying on high-level frameworks.

Tech stack

PythonPyTorchDDPDeepSpeedStreamlit

Getting it running

Difficulty · hard Time to first run · 1day+

Requires GPU with sufficient VRAM, PyTorch/CUDA setup, and 2+ hours of training time per example.

Use freely for any purpose including commercial. Keep the notice and disclose changes to the patent grant.

In plain English

MiniMind is an educational open-source project that teaches you how to build a small but fully functional large language model (LLM) from scratch. An LLM is the kind of AI that powers tools like ChatGPT, it takes in text and generates coherent responses. The project's goal is to make this technology accessible: instead of the hundreds of billions of parameters used by industry models, MiniMind trains a model with only 64 million parameters, small enough to run on a single consumer GPU in about two hours. The project covers the entire training pipeline end-to-end, starting from raw text data. It includes data cleaning, tokenizer training (teaching the model to split text into tokens, which are the units it processes), pretraining (learning general language patterns from large text corpora), and supervised fine-tuning (teaching it to follow instructions). Beyond basic training, it also implements more advanced techniques from scratch: LoRA (a fine-tuning method that reduces memory usage), RLHF (reinforcement learning from human feedback, used to align AI responses with human preferences), and reasoning chain training. All these algorithms are written directly in PyTorch without relying on high-level abstraction libraries, so readers can see exactly how each component works. You would use this repository if you are a student, researcher, or developer who wants to deeply understand how LLMs work internally, not just how to use them, but how they are actually built and trained. It serves as both a working codebase and a learning tutorial. The trained models are small enough to be deployed locally and can be served via an OpenAI-compatible API endpoint, making them usable with existing chat interfaces. A simple Streamlit web interface is also included for testing. The tech stack is Python and PyTorch, with optional support for distributed training across multiple GPUs using DDP and DeepSpeed.

Copy-paste prompts

Prompt 1

Walk me through the MiniMind training pipeline step-by-step: how do I go from raw text to a trained model, and what does each stage do?

Prompt 2

Show me how to implement LoRA fine-tuning in MiniMind to reduce memory usage when adapting the model to my own data.

Prompt 3

How do I use MiniMind's RLHF implementation to align my trained model with human preferences?

Prompt 4

Help me set up MiniMind to train a model on my GPU and then serve it via the OpenAI-compatible API.

Prompt 5

Explain how MiniMind's tokenizer training works and why it's necessary before pretraining the language model.

Open on GitHub → Explain another repo

Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.