explaingit

lm-sys/fastchat

39,475PythonAudience · developerComplexity · 4/5ActiveLicenseSetup · hard

TLDR

Open platform for training, serving, and evaluating large language model chatbots. Includes Vicuna (a fine-tuned open-source chatbot) and Chatbot Arena (a benchmark where users vote on AI responses).

Mindmap

mindmap
  root((FastChat))
    What it does
      Train chatbots
      Serve models via API
      Evaluate with benchmarks
    Key components
      Vicuna chatbot
      Chatbot Arena
      MT-Bench
    Use cases
      Self-host LLMs
      Fine-tune models
      Compare model quality
    Tech stack
      Python
      Hugging Face
      Multi-GPU support
    Audience
      Researchers
      Engineers
      Infrastructure teams

Things people build with this

USE CASE 1

Fine-tune open-source models like LLaMA on your own conversation data to create custom chatbots.

USE CASE 2

Self-host multiple language models behind an OpenAI-compatible API so existing tools can use them without code changes.

USE CASE 3

Run systematic evaluations and comparisons of different models using MT-Bench or Chatbot Arena voting.

USE CASE 4

Build and operate a large-scale model evaluation platform with human preference feedback.

Tech stack

PythonHugging Face TransformersPyTorchCUDAMetal

Getting it running

Difficulty · hard Time to first run · 1day+

Requires GPU/CUDA setup, model downloads (10GB+), and multiple service components (training, serving, evaluation backend).

Use freely for any purpose including commercial. Keep the notice and disclose changes to the patent grant.

In plain English

FastChat is an open platform for training, serving, and evaluating large language model chatbots. It was created by the LMSYS organization and is the release repository for Vicuna, an open-source chatbot trained by fine-tuning Meta's LLaMA model on conversation data, and for Chatbot Arena, a popular benchmark where users vote on which AI responses they prefer in blind side-by-side comparisons. The platform has three main capabilities. First, it provides training code and recipes for fine-tuning foundation models like LLaMA on instruction-following data, which is how Vicuna was created. Second, it includes a distributed serving system that can load multiple large language models and expose them through a web chat interface or through an OpenAI-compatible REST API, meaning existing software that calls the OpenAI API can be pointed at FastChat instead to use open-source models. Third, it contains evaluation frameworks including MT-Bench, a multi-turn benchmark designed to measure how well chatbots handle complex, multi-step conversations beyond simple one-shot questions. You would use FastChat if you are a researcher studying how to train better open-source chatbots, an engineer who wants to self-host language models behind an API that existing tools already know how to call, or someone building infrastructure to evaluate and compare multiple models systematically. The Chatbot Arena component has powered over 10 million chat requests across 70 or more models and collected over 1.5 million human preference votes, producing one of the most widely cited LLM leaderboards in the research community. The tech stack is Python throughout. It uses the Hugging Face Transformers library for model loading, supports single and multi-GPU inference, CPU inference, and Apple Silicon via the Metal backend, and can be installed with a single pip command.

Copy-paste prompts

Prompt 1
How do I fine-tune a LLaMA model using FastChat to create a custom chatbot on my domain data?
Prompt 2
Set up FastChat to serve multiple open-source models behind an OpenAI-compatible API endpoint.
Prompt 3
Use FastChat's MT-Bench to evaluate and compare how well different language models handle multi-turn conversations.
Prompt 4
Deploy FastChat with multi-GPU inference to serve large models efficiently in production.
Prompt 5
Create a Chatbot Arena-style evaluation where users vote on responses from different models.
Open on GitHub → Explain another repo

Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.