Analysis updated 2026-07-03
Study how to train reinforcement learning models on historical data when live simulation is not possible.
Use the policy evaluation tools to test a new decision-making policy against old data before deploying it.
Explore Q-learning and policy gradient algorithms in a production-scale Python/PyTorch codebase.
Reference the distributed training setup for handling large recommendation datasets with reinforcement learning.
| | facebookresearch/reagent | opengeos/leafmap | stability-ai/stable-audio-tools |
|---|---|---|---|
| Stars | 3,699 | 3,699 | 3,699 |
| Language | Python | Python | Python |
| Setup difficulty | moderate | easy | hard |
| Complexity | 4/5 | 2/5 | 4/5 |
| Audience | researcher | data | researcher |
Figures from each repo's GitHub metadata at analysis time.
A Docker install is available, but the project is archived and no longer maintained. New users should consider the successor project, Pearl, instead.
ReAgent is an archived open-source platform built by Facebook for applying reinforcement learning to real-world problems at scale. Reinforcement learning is a type of AI training where a system learns by taking actions and receiving feedback, rather than being given labeled examples. ReAgent was built to make this approach practical for large-scale recommendation and optimization tasks, such as deciding what to show users in a feed or how to allocate resources.
The platform is no longer actively maintained. Facebook's team has moved to a successor project called Pearl, and the README directs users there for ongoing support. ReAgent is preserved as a reference and for teams that built on top of it.
When it was active, ReAgent provided a full pipeline: taking raw data, transforming it into training-ready inputs, training AI models using a variety of reinforcement learning algorithms, and then serving those models in production. It was designed for situations where you cannot run a live simulation, so it trained on batches of previously collected data rather than interacting with an environment in real time. It also included tools for evaluating a new policy using old data, which is important when you cannot test a policy by actually deploying it.
The platform supported a wide range of algorithms, including several variants of Q-learning for environments with discrete or continuous actions, policy gradient methods, and contextual bandit approaches for simpler decision problems where only a single decision is made rather than a sequence.
ReAgent was written in Python and used PyTorch for model training. It supported distributed training for handling large datasets. Installation was possible via Docker or manually, with detailed instructions available in the repository's documentation folder.
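To make the idea of evaluating a new policy on old data concrete, here is a minimal sketch of one standard estimator for that task, inverse propensity scoring. It does not use ReAgent's own API. The `LoggedStep` schema, the `ips_estimate` function, and the toy policies are all hypothetical names chosen for illustration, and the sketch assumes the log records the probability with which the old policy chose each action.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class LoggedStep:
    """One record from the historical log (hypothetical schema)."""
    context: str         # what the old policy observed
    action: str          # what the old policy chose
    logging_prob: float  # probability the old policy gave that action
    reward: float        # feedback observed afterwards


def ips_estimate(
    logs: Sequence[LoggedStep],
    target_prob: Callable[[str, str], float],
) -> float:
    """Estimate the new policy's average reward from logged data alone."""
    if not logs:
        raise ValueError("need at least one logged step")
    total = 0.0
    for step in logs:
        # Reweight each logged reward by how much more (or less) likely
        # the new policy is to take the logged action than the old one was.
        weight = target_prob(step.context, step.action) / step.logging_prob
        total += weight * step.reward
    return total / len(logs)


# Toy log: the old policy picked "a" or "b" uniformly at random.
LOGS = [
    LoggedStep("user-1", "a", 0.5, 1.0),
    LoggedStep("user-2", "a", 0.5, 1.0),
    LoggedStep("user-3", "b", 0.5, 0.0),
    LoggedStep("user-4", "b", 0.5, 0.0),
]


def always_a(context: str, action: str) -> float:
    """Candidate policy that always chooses action "a"."""
    return 1.0 if action == "a" else 0.0


def uniform(context: str, action: str) -> float:
    """Candidate policy identical to the old uniform policy."""
    return 0.5


if __name__ == "__main__":
    print(ips_estimate(LOGS, uniform))   # 0.5, the logged average reward
    print(ips_estimate(LOGS, always_a))  # 1.0, without ever deploying it
```

The estimate is only trustworthy when the logged probabilities are accurate and the old policy gave a nonzero probability to every action the new policy might take. When the weights get large the estimate becomes noisy, which is why practical systems often combine this idea with a learned reward model.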
An archived Facebook platform for applying reinforcement learning to large-scale recommendation and optimization problems using previously collected data, now superseded by Pearl.
Mainly Python. The stack also includes PyTorch and Docker.
Open-source and free to use. Exact license terms are in the repository.
Setup difficulty is rated moderate, with roughly 1h+ to a first successful run.
Mainly researchers.
Verify against the repo before relying on details.