OpenAI Baselines is a collection of Python implementations of reinforcement learning algorithms, provided by OpenAI as reference-quality code for researchers and practitioners. Reinforcement learning is a branch of AI in which a program learns to make decisions by trial and error, receiving rewards for good actions and penalties for bad ones, much as you might train a dog with treats. The library packages several well-known algorithms under one roof, including DQN, PPO2, A2C, and DDPG. These are foundational techniques that researchers use to teach AI agents to play video games, control robotic simulations, and solve other decision-making problems. The purpose is to give the research community reliable, reproducible starting points, so that when someone says "I improved on PPO2," everyone is comparing against the same baseline code.

You train a model by running a command that specifies which algorithm to use and which environment to run it in. For example, you can train an agent to play Atari Pong or to control a simulated humanoid figure. The library tracks training progress, lets you save trained models, and lets you load them back later to watch what the agent learned.

The project is in maintenance mode: it receives bug fixes but is no longer under active feature development. It requires Python 3 and TensorFlow. Some example environments also need the MuJoCo physics simulator, which required a separate license at the time the library was written.
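As a rough sketch of the command-line workflow described above, the Python snippet below assembles the kind of command the project's README documents for training an agent and replaying it afterwards. The `baselines.run` entry point and the `--alg`, `--env`, `--num_timesteps`, `--save_path`, `--load_path`, and `--play` flags are taken from that README and should be checked against the repo. The helper name `build_baselines_command` and the path `models/pong_ppo2` are hypothetical, chosen only for illustration. The snippet builds and prints the commands without running them, so it works even when Baselines is not installed.

```python
import shlex
import sys


def build_baselines_command(alg, env, num_timesteps,
                            save_path=None, load_path=None, play=False):
    """Assemble an argument list for `python -m baselines.run`.

    Hypothetical helper: the flag names follow the Baselines README,
    but verify them against the repo before relying on them.
    """
    cmd = [
        sys.executable, "-m", "baselines.run",
        f"--alg={alg}",                      # which algorithm, e.g. ppo2 or deepq
        f"--env={env}",                      # which Gym environment to run it in
        f"--num_timesteps={num_timesteps}",  # how long to train (0 = no training)
    ]
    if save_path is not None:
        cmd.append(f"--save_path={save_path}")  # where to write the trained model
    if load_path is not None:
        cmd.append(f"--load_path={load_path}")  # previously saved model to restore
    if play:
        cmd.append("--play")                    # render the agent after loading
    return cmd


# Train PPO2 on Atari Pong and save the result (path is a placeholder).
train_cmd = build_baselines_command(
    "ppo2", "PongNoFrameskip-v4", 20000000, save_path="models/pong_ppo2")

# Load the saved model and watch it play without further training.
play_cmd = build_baselines_command(
    "ppo2", "PongNoFrameskip-v4", 0, load_path="models/pong_ppo2", play=True)

for cmd in (train_cmd, play_cmd):
    print(" ".join(shlex.quote(part) for part in cmd))
```

Passing either list to `subprocess.run` would start the actual job, assuming Baselines, TensorFlow, and the Atari environments are installed.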
Generated 2026-05-21 · Model: sonnet-4-6 · Verify against the repo before relying on details.