deepspeedai/deepspeedexamples

★ 6,818 · Python · Audience · researcher · Complexity · 5/5 · Setup · hard

Mindmap

mindmap
  root((repo))
    Applications
      End-to-end Training
      Full Pipelines
    Training Examples
      Model Training
      Fine-tuning
    Inference
      MII Server
      FastGen
      Hugging Face
    Compression
      Model Shrinking
    Benchmarks
      Speed Tests

Things people build with this

USE CASE 1

Run a working end-to-end example of training a large language model with DeepSpeed across multiple GPUs.

USE CASE 2

Fine-tune a pre-trained Hugging Face model using DeepSpeed memory optimization to fit it on smaller hardware.

USE CASE 3

Test DeepSpeed inference performance using MII or FastGen to measure throughput on your hardware.

USE CASE 4

Use the benchmarks folder to compare DeepSpeed training speed across different configurations.
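To make the "fit it on smaller hardware" idea in Use Case 2 concrete, here is a rough sketch of how ZeRO-style sharding changes per-GPU memory. It assumes the memory optimization in question is ZeRO (the technique named in Prompt 4), and it uses the standard accounting for mixed-precision Adam: 2 bytes per parameter for fp16 weights, 2 for fp16 gradients, and 12 for fp32 optimizer states. The function name and the 7.5B-parameter, 8-GPU setup are illustrative assumptions, not taken from the repository. Activations and temporary buffers are ignored, so real usage will be higher.

```python
def zero_bytes_per_gpu(num_params: int, num_gpus: int, stage: int) -> float:
    """Approximate per-GPU model-state memory (bytes) for mixed-precision Adam.

    Hypothetical helper for illustration only. Counts fp16 weights (2 B/param),
    fp16 gradients (2 B/param), and fp32 optimizer states (12 B/param).
    Activations and buffers are NOT included.
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    weights = 2 * num_params
    grads = 2 * num_params
    optim = 12 * num_params

    # Each ZeRO stage shards one more kind of state across the GPUs.
    if stage >= 1:
        optim /= num_gpus    # stage 1: shard optimizer states
    if stage >= 2:
        grads /= num_gpus    # stage 2: also shard gradients
    if stage >= 3:
        weights /= num_gpus  # stage 3: also shard the weights themselves
    return weights + grads + optim


if __name__ == "__main__":
    params = 7_500_000_000  # assumed model size, for illustration
    for stage in range(4):
        gb = zero_bytes_per_gpu(params, num_gpus=8, stage=stage) / 1e9
        print(f"ZeRO stage {stage}: ~{gb:.1f} GB of model states per GPU")
```

Under these assumptions the estimate drops from about 120 GB per GPU with no sharding to about 15 GB at stage 3, which is why the higher stages can make a model trainable on hardware that could not otherwise hold it.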

Tech stack

Python · DeepSpeed · PyTorch · Hugging Face

Getting it running

Difficulty · hard · Time to first run · 1 day+

Requires a multi-GPU machine and familiarity with DeepSpeed configuration. Each subfolder has its own setup instructions.
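As a feel for what "DeepSpeed configuration" involves, the sketch below builds a minimal config dictionary in plain Python and prints it as JSON. The keys shown (`train_batch_size`, `train_micro_batch_size_per_gpu`, `gradient_accumulation_steps`, `fp16`, `zero_optimization`) are standard DeepSpeed config fields, but the helper name and the specific values are assumptions for illustration. The examples in this repository ship their own configs, which should take precedence.

```python
import json


def make_ds_config(micro_batch_per_gpu: int,
                   grad_accum_steps: int,
                   world_size: int,
                   zero_stage: int = 2,
                   offload_optimizer: bool = False) -> dict:
    """Build a minimal DeepSpeed-style config dict (hypothetical helper).

    DeepSpeed expects:
        train_batch_size ==
            micro_batch_per_gpu * grad_accum_steps * world_size
    so the global batch size is derived here instead of being set by hand.
    """
    zero = {"stage": zero_stage}
    if offload_optimizer:
        # Keep optimizer states in CPU memory to reduce GPU memory use.
        zero["offload_optimizer"] = {"device": "cpu"}

    return {
        "train_batch_size": micro_batch_per_gpu * grad_accum_steps * world_size,
        "train_micro_batch_size_per_gpu": micro_batch_per_gpu,
        "gradient_accumulation_steps": grad_accum_steps,
        "fp16": {"enabled": True},
        "zero_optimization": zero,
    }


if __name__ == "__main__":
    # Assumed setup: 2 GPUs, 4 samples per GPU per step, 8 accumulation steps.
    config = make_ds_config(4, 8, world_size=2, offload_optimizer=True)
    print(json.dumps(config, indent=2))
```

Deriving the global batch size from the other three numbers avoids the most common config mismatch, where the three batch-related fields disagree with the number of GPUs actually launched.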

License not specified in the explanation.

In plain English

DeepSpeedExamples is a collection of code examples that show how to use DeepSpeed, a software library designed to make training large AI models faster and more efficient. The repository does not contain the DeepSpeed library itself; it contains sample code that uses it. The library is a separate project, also available on GitHub, and is maintained by Microsoft.

The examples are organized into five sections. Applications are end-to-end projects that train and run AI models from start to finish. Training contains scripts for training models or adapting existing ones to new tasks, with each subfolder carrying its own instructions. Inference holds code for running already-trained models to generate predictions, with separate guides for two DeepSpeed inference systems, MII and FastGen, as well as a guide for using DeepSpeed with models from the Hugging Face library. Compression covers techniques for making models smaller. Benchmarks contains tests that measure how fast the DeepSpeed library runs under different conditions.

The README serves mainly as a directory sign: it points to the subfolders rather than explaining how to use them, and each subfolder is expected to have its own more detailed documentation. The README itself is sparse and assumes you already know what DeepSpeed is and why you might use it.

The project accepts outside contributions and follows Microsoft open-source guidelines. Contributors need to sign a Contributor License Agreement before their code can be merged.

This is a technical resource aimed at machine learning engineers who already work with large AI models and want to see working examples of DeepSpeed in practice.

Copy-paste prompts

Prompt 1

I want to fine-tune a large language model using DeepSpeed. Point me to the right training example in DeepSpeedExamples and explain what I need to configure before running it.

Prompt 2

Show me how to run DeepSpeed inference on a Hugging Face model using the FastGen example. What are the hardware requirements and how do I start the server?

Prompt 3

I want to benchmark DeepSpeed training throughput on a 2-GPU machine. Which benchmark script should I use and what command do I run?

Prompt 4

Explain what ZeRO optimization in DeepSpeed does in plain English, then show me which example in DeepSpeedExamples demonstrates it.

Verify against the repo before relying on details.