Run a DQN agent on CartPole to see a complete working deep reinforcement learning implementation from scratch
Use the PPO code as a readable reference when implementing your own policy gradient research
Compare training reward curves across continuous-action algorithms such as SAC and TD3 to choose one for a continuous control task
Study the SAC implementation to understand how entropy regularization encourages exploration and helps an agent avoid getting stuck in local optima
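The entropy regularization mentioned in the SAC item can be illustrated with a small standalone sketch. It is not taken from the repository: it uses a discrete three-action policy instead of SAC's continuous Gaussian policy, and the action values `q_values` and temperature `alpha` are made-up numbers chosen only for illustration. It compares a fully greedy policy with a softmax policy under the objective "expected value plus `alpha` times entropy".

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; zero-probability actions contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def soft_objective(probs, q_values, alpha):
    """Expected action value plus an entropy bonus weighted by alpha."""
    expected_q = sum(p * q for p, q in zip(probs, q_values))
    return expected_q + alpha * entropy(probs)

# Hypothetical numbers, not taken from any experiment.
q_values = [1.0, 0.9, 0.2]
alpha = 0.5

greedy = [1.0, 0.0, 0.0]                        # always pick the best-looking action
soft = softmax([q / alpha for q in q_values])   # spread probability by value

greedy_score = soft_objective(greedy, q_values, alpha)
soft_score = soft_objective(soft, q_values, alpha)

print("greedy:", round(greedy_score, 4))
print("soft:  ", round(soft_score, 4))
```

Under this objective the softmax policy scores higher than the greedy one, because it gives up a little expected value in exchange for keeping near-optimal actions in play. A larger `alpha` pushes the policy toward uniform randomness, while `alpha` near zero recovers greedy behavior.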
Requires Python 3.6 or below and PyTorch 0.4 or above; newer Python versions may have compatibility issues.
This repository collects PyTorch implementations of popular deep reinforcement learning algorithms. Reinforcement learning is a branch of machine learning in which a software agent learns by taking actions in an environment and receiving rewards or penalties, rather than from labeled training examples. Deep reinforcement learning pairs this with neural networks, letting the agent handle complex, high-dimensional inputs such as game screens or robot sensor readings.

The included algorithms span techniques that researchers and engineers commonly use as starting points or benchmarks: DQN (which famously learned to play Atari games), Policy Gradient methods, Actor-Critic approaches, DDPG and TD3 (for environments with continuous action spaces, such as controlling a robot arm), PPO (a widely used algorithm that balances performance and stability), A2C and A3C (methods that can run multiple parallel learning processes), and SAC (an approach that rewards the policy for keeping its actions random, so the agent keeps exploring instead of settling early on a poor strategy). Each algorithm is in its own folder with code, training charts, and links to the original research papers.

The test environments come from OpenAI Gym, a standard benchmark suite used by reinforcement learning researchers. Examples include CartPole (balancing a pole on a cart), MountainCar (getting a car up a hill with limited engine power), Pendulum (keeping a pendulum upright), and BipedalWalker (teaching a two-legged robot to walk). The README includes training reward curves for several algorithms so you can see what to expect.

The stated goal is educational: the code is meant to be clear and readable so learners can follow how each algorithm works, not just run it. Requirements are Python 3.6 or below, PyTorch 0.4 or above, and the gym library for the test environments.
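The agent, environment, and reward loop described above can be sketched without any dependencies. The example below is not code from the repository. It is a minimal tabular Q-learning agent on a hypothetical five-state corridor, where `step`, `train`, and all hyperparameters are illustrative assumptions. DQN follows the same update rule but replaces the table with a neural network and trains on Gym environments.

```python
import random

N_STATES = 5        # hypothetical corridor: states 0..4, goal at state 4
ACTIONS = (0, 1)    # 0 = step left, 1 = step right

def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01   # small penalty per move, bonus at the goal
    return next_state, reward, done

def train(episodes=300, alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train()
# Greedy action for every non-terminal state after training.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

After training, the greedy policy moves right in every state, which is the shortest route to the goal. The agent was never told this; it inferred it from the rewards alone, which is the core idea the algorithms in the repository scale up.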
← sweetice on gitmyhub — every repo by this author, as a profile.
Verify against the repo before relying on details.