meta-pytorch/torchtune

★ 5,751 Python Audience · researcher Complexity · 4/5 Setup · hard

Mindmap

mindmap
  root((torchtune))
    What it does
      Fine-tune LLMs
      LoRA and QLoRA
      Knowledge distillation
      DPO PPO alignment
    Tech stack
      Python
      PyTorch
      Hugging Face Hub
      YAML configs
    Supported models
      Llama 4
      Mistral
      Gemma 2
    Use cases
      Single GPU training
      Multi-GPU runs
      Preference alignment


Things people build with this

USE CASE 1

Fine-tune Llama or Mistral on your own dataset using LoRA to cut GPU memory requirements.

USE CASE 2

Run multi-GPU distributed training jobs using PyTorch's native APIs.

USE CASE 3

Apply DPO or GRPO techniques to align a language model's outputs to human preferences.

USE CASE 4

Distill a large language model into a smaller, faster version using the built-in recipes.
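The distillation idea in Use case 4 can be sketched in a few lines. This is a minimal illustration of one common formulation (KL divergence between temperature-softened teacher and student distributions), written in plain Python; it is an assumption for illustration and not necessarily the loss used by torchtune's built-in recipes. The function names and the temperature value are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Penalize the student for diverging from the teacher's distribution."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return kl_divergence(teacher_probs, student_probs)


# Hypothetical logits over a 3-token vocabulary.
teacher = [2.0, 1.0, 0.1]
student = [1.5, 1.2, 0.3]
print(distillation_loss(teacher, student))
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, which is what pushes the smaller model to behave like the larger one.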

Tech stack

Python, PyTorch, LoRA, QLoRA, Hugging Face, YAML

Getting it running

Difficulty · hard Time to first run · 1 day+

Requires one or more CUDA-capable GPUs and model weights from Hugging Face Hub. CPU training is not supported.

In plain English

torchtune was a Python library built by Meta's PyTorch team for fine-tuning and experimenting with large language models. Development wound down in 2025, but the code remains publicly available and was shaped by contributions from over 150 people during its active period.

The central concept is post-training: taking a pre-built AI model and adjusting it to new tasks, datasets, or behaviors. torchtune supported several methods for doing this. Supervised fine-tuning updates the model's weights directly using labeled examples. LoRA and QLoRA are lighter alternatives that train only a small fraction of the model's parameters rather than the whole thing, which cuts down on GPU memory requirements considerably. Knowledge distillation trains a smaller model to behave like a larger one. DPO, PPO, and GRPO are reinforcement-learning-style techniques used to align a model's responses with human preferences.

Running a training job meant picking a recipe (the training method) and a config file (YAML format), then calling the tune run command. The library shipped ready-made configs for a range of well-known models: Llama 4, Llama 3.x, Mistral, Gemma 2, Phi-4, Qwen 2.5, and others. Model weights were loaded from Hugging Face Hub or Kaggle Hub.

The library was designed to run on a single GPU, multiple GPUs on one machine, or multiple machines at once. Its focus was on memory efficiency and performance using PyTorch's built-in APIs, keeping the training code readable and modifiable rather than hiding it behind heavy abstractions.

Because active development has ended, there is no ongoing support. The README links to a GitHub issue that explains the shutdown decision for anyone looking for background on why the project was wound down.
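To make the LoRA memory claim concrete, here is a rough back-of-the-envelope sketch in plain Python. It assumes the standard LoRA setup, where a frozen weight matrix is paired with two small trainable low-rank factors; the layer size (4096 x 4096) and rank (8) are illustrative assumptions, not values taken from any torchtune config.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (frozen, trainable) parameter counts for one LoRA-adapted linear layer."""
    frozen = d_in * d_out              # original weight W, left untouched
    trainable = rank * (d_in + d_out)  # factors A (rank x d_in) and B (d_out x rank)
    return frozen, trainable


# Hypothetical layer: a 4096 x 4096 projection adapted with rank 8.
frozen, trainable = lora_param_counts(4096, 4096, 8)
fraction = trainable / frozen  # trainable parameters relative to the frozen weight
print(f"frozen={frozen:,} trainable={trainable:,} fraction={fraction:.4%}")
# prints: frozen=16,777,216 trainable=65,536 fraction=0.3906%
```

Under these assumptions, the trainable parameters for this layer amount to well under 1% of the frozen weight, so gradients and optimizer state are needed for only that small slice. QLoRA additionally stores the frozen weights in a quantized format, which this sketch does not model.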

Copy-paste prompts

Prompt 1

Using torchtune's LoRA recipe, show me how to fine-tune Llama 3 on a custom JSON dataset with a single GPU.

Prompt 2

Write a YAML config for torchtune to run QLoRA fine-tuning on Mistral-7B, loading weights from Hugging Face Hub.

Prompt 3

How do I use torchtune's DPO recipe to align a language model on a preference dataset? Show me the command and config file.

Prompt 4

Explain the difference between torchtune's LoRA and full supervised fine-tuning recipes, and when I should choose each one.

Prompt 5

How do I set up torchtune for multi-GPU training across two GPUs on one machine?


Verify against the repo before relying on details.