explaingit

openai/baselines

Analysis updated 2026-06-24

Stars · 16,714 | Language · Python | Audience · researcher | Complexity · 4/5 | Setup · hard

TLDR

Reference Python implementations of classic reinforcement learning algorithms (DQN, PPO2, A2C, DDPG) from OpenAI, used as a baseline for RL research.

Mindmap

mindmap
  root((baselines))
    Inputs
      Gym environment
      Algorithm choice
      Hyperparameters
    Outputs
      Trained model
      Training logs
      Evaluation video
    Use Cases
      Compare new RL methods
      Train Atari agents
      Reproduce paper results
    Tech Stack
      Python
      TensorFlow
      MuJoCo
      Gym
    Algorithms
      DQN
      PPO2
      A2C
      DDPG

What do people build with it?

USE CASE 1

Reproduce a reinforcement learning paper against a known baseline

USE CASE 2

Train a PPO2 agent on an Atari game and watch the learned policy play

USE CASE 3

Use DDPG as a starting point for a continuous-control robotics experiment

USE CASE 4

Compare your new RL algorithm against DQN on the same environment

What is it built with?

Python, TensorFlow, Gym, MuJoCo

How does it compare?

                   openai/baselines   ipython/ipython   exaloop/codon
Stars              16,714             16,696            16,769
Language           Python             Python            Python
Setup difficulty   hard               easy              moderate
Complexity         4/5                2/5               4/5
Audience           researcher         data              researcher

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · hard | Time to first run · 1 day+

The project is in maintenance mode and pinned to an old TensorFlow release. MuJoCo environments also need a separate license.
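Before installing, it can help to check which TensorFlow line is already present in your environment. The following is a minimal sketch, not part of baselines itself: it assumes that "old TensorFlow" means the 1.x line, and the helper names `is_tensorflow_1x` and `installed_tensorflow_version` are hypothetical.

```python
def is_tensorflow_1x(version):
    """Return True if a version string such as '1.14.0' is in the 1.x line."""
    major = version.split(".")[0]
    return major.isdigit() and int(major) == 1


def installed_tensorflow_version():
    """Return the installed TensorFlow version string, or None if absent."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__


version = installed_tensorflow_version()
if version is None:
    print("TensorFlow is not installed in this environment")
elif is_tensorflow_1x(version):
    # Assumption: baselines targets the TensorFlow 1.x line.
    print("TensorFlow " + version + " is in the 1.x line")
else:
    print("TensorFlow " + version + " is not 1.x and is likely too new for baselines")
```

If the check reports a newer release, a separate virtual environment with an older TensorFlow pinned is usually simpler than downgrading in place.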

In plain English

OpenAI Baselines is a collection of Python implementations of reinforcement learning algorithms, provided by OpenAI as reference-quality code for researchers and practitioners. Reinforcement learning is a branch of AI where a computer program learns to make decisions by trial and error, receiving rewards for good actions and penalties for bad ones, similar to how you might train a dog with treats.

The library packages up several well-known algorithms under one roof, including DQN, PPO2, A2C, DDPG, and others. These are the foundational techniques that researchers use to teach AI agents to play video games, control robotic simulations, and solve other decision-making problems. The purpose is to give the research community reliable, reproducible starting points so that when someone says "I improved on PPO2," everyone is comparing against the same baseline code.

You train a model by running a command that specifies which algorithm to use and which environment to run it in. For example, you can train an agent to play Atari Pong or control a simulated humanoid figure. The library tracks training progress, lets you save trained models, and lets you load them back later to watch what the agent learned.

The project is currently in maintenance mode, meaning it receives bug fixes but is no longer under active feature development. It requires Python 3 and TensorFlow, and some example environments also need the MuJoCo physics simulator, which requires a separate license.
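The train, save, and replay workflow described above can be sketched as two command lines. The helper below only builds and prints them. It assumes the `baselines.run` entry point and the `--alg`, `--env`, `--num_timesteps`, `--save_path`, `--load_path`, and `--play` flags as shown in the project's README. The function name `baselines_command` and the model path `models/pong_ppo2` are hypothetical, chosen for illustration.

```python
def baselines_command(alg, env, num_timesteps,
                      save_path=None, load_path=None, play=False):
    """Build the argument list for one `python -m baselines.run` call."""
    args = [
        "python", "-m", "baselines.run",
        "--alg=" + alg,                      # which algorithm to train
        "--env=" + env,                      # which Gym environment to use
        "--num_timesteps=" + num_timesteps,  # training budget
    ]
    if save_path:
        args.append("--save_path=" + save_path)
    if load_path:
        args.append("--load_path=" + load_path)
    if play:
        args.append("--play")  # render the agent acting in the environment
    return args


# Train PPO2 on Atari Pong and save the result (hypothetical path).
train = baselines_command("ppo2", "PongNoFrameskip-v4", "2e7",
                          save_path="models/pong_ppo2")

# Reload the saved model with zero extra training and watch it play.
replay = baselines_command("ppo2", "PongNoFrameskip-v4", "0",
                           load_path="models/pong_ppo2", play=True)

for command in (train, replay):
    print(" ".join(command))
```

To launch a run, pass one of these lists to `subprocess.run` in an environment where baselines and its dependencies are installed. Check the flags against the repository before relying on them.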

Copy-paste prompts

Prompt 1
Walk me through training a PPO2 agent on Atari Pong with openai/baselines, step by step
Prompt 2
Show me how to load a saved baselines model and render the agent playing the environment
Prompt 3
Help me port a baselines PPO2 script from TensorFlow 1 to TensorFlow 2 or PyTorch
Prompt 4
Set up a Python 3.7 environment with the exact dependencies baselines needs
Prompt 5
Compare DQN vs PPO2 in baselines and tell me which to pick for a discrete action game

Frequently asked questions

What is baselines?

Reference Python implementations of classic reinforcement learning algorithms (DQN, PPO2, A2C, DDPG) from OpenAI, used as a baseline for RL research.

What language is baselines written in?

Mainly Python. The stack also includes TensorFlow, Gym, and MuJoCo.

How hard is baselines to set up?

Setup difficulty is rated hard, with roughly a day or more to a first successful run.

Who is baselines for?

Mainly researchers.

Verify against the repo before relying on details.