kwai/douzero

★ 4,550 | Python | Audience · researcher | Complexity · 4/5 | Setup · hard

Mindmap

mindmap
  root((repo))
    What it does
      Plays DouDizhu card game
      Self-play AI training
    Technique
      Monte Carlo methods
      Deep neural networks
      No human knowledge
    How to use
      Train from scratch
      Pre-trained weights
      Online demo
    Use cases
      AI game research
      Reproduce ICML results
      Study self-play AI


Things people build with this

USE CASE 1

Train a DouDizhu AI from scratch using the provided training scripts on a GPU server.

USE CASE 2

Evaluate the pre-trained model weights on CPU to reproduce the paper's benchmark results.

USE CASE 3

Play DouDizhu against the trained AI in a browser using the online demo without any local setup.

USE CASE 4

Study the Monte Carlo self-play training method for games with very large action spaces.

Tech stack

Python

Getting it running

Difficulty · hard | Time to first run · 1 day+

Training requires a multi-GPU server; evaluation can run on CPU, including on Windows.

In plain English

DouZero is an AI system that learned to play DouDizhu, the most popular card game in China. The name translates roughly to "Fighting the Landlord". It is a three-player card game in which two players team up as peasants against a single landlord, and every player is trying to be the first to play all of their cards. The number of possible moves on any given turn is very large, which makes the game unusually hard for AI to handle.

The research behind DouZero was published at ICML 2021, a major machine learning conference. The core idea combines Monte Carlo methods, a classic technique that estimates which moves lead to winning from the outcomes of many simulated games, with deep neural networks that handle the large action space. The AI trains by playing against itself repeatedly, with no human expert knowledge built in. Starting from scratch on a single server with four GPUs, it reached top performance among hundreds of AI agents on a public leaderboard within days.

The repository includes training code, pre-trained model weights, and an evaluation setup. Training requires a GPU. Evaluation can run on a CPU, including on Windows. An online demo at douzero.org lets you play against the trained AI in a browser without setting up anything locally. A Google Colab notebook is also linked so you can experiment in the cloud for free.

The code is released by Kwai (the company behind the Kuaishou short-video platform) and is intended primarily as a research artifact. If you want to reproduce results or study the technique, the paper and training scripts are both available. Community members have built improved versions using different neural network architectures, and those are linked from the README as well.
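To make the Monte Carlo self-play idea concrete, here is a minimal sketch on a toy game, not DouDizhu and not the DouZero code. It assumes a take-away game (remove 1 to 3 stones per turn, whoever takes the last stone wins) and a lookup table in place of the neural network. Every name in it (`legal_actions`, `play_episode`, `train`, `greedy_action`) is hypothetical and chosen for illustration only. Both sides share one value table, play with some random exploration, and after each game every visited (state, action) pair is nudged toward that player's final outcome.

```python
import random
from collections import defaultdict

MAX_TAKE = 3  # toy rule (assumption): a player removes 1 to 3 stones per turn


def legal_actions(pile):
    """Moves available when `pile` stones remain."""
    return list(range(1, min(MAX_TAKE, pile) + 1))


def choose_action(q, pile, epsilon, rng):
    """Epsilon-greedy: mostly pick the best-valued move, sometimes explore."""
    actions = legal_actions(pile)
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(pile, a)])


def play_episode(q, start_pile, epsilon, rng):
    """Self-play one full game. Returns each player's moves and the winner."""
    history = {0: [], 1: []}
    pile, player = start_pile, 0
    while True:
        action = choose_action(q, pile, epsilon, rng)
        history[player].append((pile, action))
        pile -= action
        if pile == 0:
            return history, player  # taking the last stone wins
        player = 1 - player


def train(episodes=20000, epsilon=0.2, step_size=0.05, seed=0):
    """Monte Carlo control: move Q(state, action) toward the game outcome."""
    rng = random.Random(seed)
    q = defaultdict(float)  # stand-in for a neural network in this toy
    for _ in range(episodes):
        history, winner = play_episode(q, rng.randint(1, 10), epsilon, rng)
        for player, moves in history.items():
            outcome = 1.0 if player == winner else -1.0
            for key in moves:
                q[key] += step_size * (outcome - q[key])
    return q


def greedy_action(q, pile):
    """Best move under the learned values, with no exploration."""
    return max(legal_actions(pile), key=lambda a: q[(pile, a)])


if __name__ == "__main__":
    q = train()
    for pile in (5, 6, 7):
        print(f"pile={pile}: take {greedy_action(q, pile)}")
```

A table works here because the toy game has only a handful of states and moves. In a game with a very large action space, a table cannot cover every (state, action) pair, which is where a neural network that generalizes across similar situations comes in.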

Copy-paste prompts

Prompt 1

I want to train a DouZero model. What hardware do I need, how do I start the training run, and what output should I expect?

Prompt 2

Show me the commands to evaluate the pre-trained DouZero model on CPU and explain which metrics to look at.

Prompt 3

Explain how the DouZero self-play loop works and how deep neural networks help handle the large number of possible moves in DouDizhu.

Prompt 4

I want to try a different neural network architecture for DouZero, like the community improvements in the README. What parts of the training code would I need to change?


Verify against the repo before relying on details.