lightning-ai/litgpt

★ 13,352 | Python | Audience · researcher | Complexity · 4/5 | License · Apache 2.0 | Setup · moderate

Mindmap

mindmap
  root((LitGPT))
    Models
      Llama 3
      Mistral
      Gemma
      Phi
    Workflows
      Inference
      Fine-tuning
      Pre-training
    Features
      Readable code
      YAML recipes
      Multi-GPU
    Tech
      Python
      PyTorch


Things people build with this

USE CASE 1

Load a Llama 3 or Mistral model and generate text responses in about five lines of Python code

USE CASE 2

Fine-tune a pre-trained language model on your own dataset using the command-line interface and tested YAML recipes

USE CASE 3

Pre-train a custom language model from scratch on a multi-GPU cluster with built-in memory-efficient training techniques

USE CASE 4

Study how a large language model is implemented without abstraction layers hiding the core architecture
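As a rough sketch of the first use case above, the helper below wraps the load-then-generate pattern. The `LLM.load` and `generate` calls follow the LitGPT README as of this writing and should be checked against the current version. The helper name `generate_reply`, the `loader` parameter, and the default model name are illustrative assumptions, not part of the library. The import is deferred so the function can be defined without LitGPT installed.

```python
from typing import Callable, Optional


def generate_reply(
    prompt: str,
    model_name: str = "microsoft/phi-2",  # assumed example; use any model LitGPT supports
    max_new_tokens: int = 50,
    loader: Optional[Callable] = None,  # hypothetical hook for swapping in another loader
) -> str:
    """Load a model by name and return generated text for a single prompt."""
    if loader is None:
        # Deferred import: needs `pip install litgpt`; weights download on first use.
        from litgpt import LLM

        loader = LLM.load
    llm = loader(model_name)
    return llm.generate(prompt, max_new_tokens=max_new_tokens)
```

Calling `generate_reply("What do llamas eat?")` on a machine with LitGPT installed and enough GPU memory should return a short completion. Loading the model on every call is wasteful, so real code would keep the loaded model around and reuse it.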

Tech stack

Python | PyTorch | YAML

Getting it running

Difficulty · moderate | Time to first run · 30 min

Requires a GPU with sufficient VRAM; larger models need quantization or a distributed multi-GPU setup.
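To get a feel for what "sufficient VRAM" means, here is a back-of-the-envelope estimate of the memory needed for model weights alone. It is a simplification, not LitGPT's own accounting: it ignores activations, the KV cache, optimizer state, and quantization overhead. The function names and the 80% headroom factor are assumptions made for illustration.

```python
# Approximate bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Estimate memory for the weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def fits_on_gpu(
    num_params: float,
    vram_gb: float,
    precision: str = "fp16",
    headroom: float = 0.8,  # assumed rule of thumb: keep ~20% free for everything else
) -> bool:
    """Check whether the weights fit within a fraction of the available VRAM."""
    return weight_memory_gb(num_params, precision) <= vram_gb * headroom
```

Under these assumptions, an 8-billion-parameter model needs about 16 GB for its weights at 16-bit precision and about 4 GB at 4-bit. This is why quantization can bring a model within reach of a single consumer GPU.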

Use freely for any purpose, including commercial use, as long as you include the Apache 2.0 license and copyright notice.

In plain English

LitGPT is a Python library for working with large language models (LLMs), AI systems that process and generate text. It provides clean, from-scratch implementations of over 20 well-known models, including Llama 3, Gemma, Phi, Qwen, Falcon, and Mistral. Its distinguishing feature is that every model is written without layered abstractions, so the code stays readable and debuggable rather than buried inside a framework that hides what is happening.

The library covers three main workflows. First, you can load an existing model and use it to generate text, answer questions, or process documents. Second, you can fine-tune an existing model on your own data, which means training a general-purpose model further so that it specializes in a particular task or domain. Third, you can pre-train a model from scratch, which requires much more computing power but gives full control over what the model learns. LitGPT includes YAML configuration files called recipes that contain tested settings for each workflow, so you do not have to work out a good configuration yourself.

The library runs on anything from a single consumer GPU up to clusters of a thousand or more GPUs, and it includes techniques for reducing memory usage so models can run on hardware with limited memory. It is designed for both experimentation and production use and is licensed under Apache 2.0, which allows commercial use as long as the license and copyright notice are included.

To get started with basic inference, you install the package with pip, load a model by name, and call a generate function. The README shows this working in about five lines of Python code. For fine-tuning and pre-training, LitGPT uses a command-line interface: you point it at your data and a configuration file, and it handles the training loop.

The project is maintained by Lightning AI, the same company behind the PyTorch Lightning training framework. Lightning AI also offers cloud GPU infrastructure for running LitGPT workloads, though the library itself runs anywhere Python and PyTorch are available.
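The command-line fine-tuning workflow described above can be sketched as a small helper that assembles the command from a model name, a dataset, and an optional recipe. The flags (`--config`, `--data JSON`, `--data.json_path`, `--data.val_split_fraction`, `--out_dir`) mirror examples in the LitGPT README as of this writing. Treat them as assumptions and confirm them with `litgpt finetune --help`. The function name and all file paths are hypothetical.

```python
import shlex
from typing import List, Optional


def build_finetune_command(
    model_name: str,
    json_path: str,
    out_dir: str,
    config_path: Optional[str] = None,  # optional YAML recipe
    val_split_fraction: float = 0.1,
) -> List[str]:
    """Assemble an argument list for a LitGPT fine-tuning run (nothing is executed)."""
    cmd = ["litgpt", "finetune", model_name]
    if config_path is not None:
        cmd += ["--config", config_path]
    cmd += [
        "--data", "JSON",
        "--data.json_path", json_path,
        "--data.val_split_fraction", str(val_split_fraction),
        "--out_dir", out_dir,
    ]
    return cmd


def as_shell_string(cmd: List[str]) -> str:
    """Render the argument list as a copy-pasteable shell command."""
    return shlex.join(cmd)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine with LitGPT installed. Building an argument list rather than a single string avoids shell-quoting problems with paths that contain spaces.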

Copy-paste prompts

Prompt 1

Show me how to load Llama 3 with LitGPT and run inference on a list of prompts while minimizing GPU memory usage

Prompt 2

Walk me through fine-tuning a Mistral model on a custom CSV dataset using LitGPT's CLI and the recommended YAML recipe

Prompt 3

How do I set up LitGPT pre-training on a multi-GPU machine, and how do I monitor training loss?

Prompt 4

What memory-reduction techniques does LitGPT support, and how do I enable quantization for a model that is too large for my GPU?

Prompt 5

Show me the LitGPT source code for the Llama 3 attention mechanism and explain how it differs from the original transformer paper


Verify against the repo before relying on details.