Study the theory of Q-learning, SARSA, and DQN through structured university lecture slides.
Learn how policy gradient methods like REINFORCE and A2C work from clearly organized course materials.
Understand multi-agent reinforcement learning and imitation learning through dedicated lecture sections.
This repository is a course on deep reinforcement learning, organized as a series of lecture slides with accompanying video recordings. The videos are in Chinese, and the slides are PDF files available directly from the repository. It is structured as a university-style curriculum rather than as runnable code.

Reinforcement learning is a branch of machine learning in which a software agent learns to make decisions by trying things out and receiving feedback, much as a person learns a game by playing it repeatedly. "Deep" reinforcement learning means the agent uses a neural network to process what it observes and decide what to do next, which lets it handle far more complex situations than older tabular or rule-based approaches.

The course moves through eight major topic areas. It opens with a conceptual overview of how reinforcement learning works, covering the main families of approaches: value-based methods, policy-based methods, and actor-critic methods. It also includes a session on AlphaGo to show how these ideas apply to a well-known real-world system. From there it goes into TD (temporal-difference) learning, a technique for estimating how good a given situation is by combining the reward observed over the next step or few steps with the current estimate of what follows, rather than waiting until the end of a game or task. This section covers Sarsa, Q-learning, and multi-step methods.

Later sections go deeper into value-based approaches, including experience replay and double DQN, and then into policy gradient methods, including REINFORCE and A2C. A section on continuous action spaces covers scenarios where the agent does not pick from a fixed list of options but instead chooses values along a range. The final sections introduce multi-agent settings, where several agents interact, and imitation learning, where an agent learns by observing examples rather than by trial and error.

This is a self-study or classroom resource. There is no software to install, and the repository itself includes no coding exercises.
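
For readers who want a concrete picture of the TD learning idea mentioned above, here is a minimal Python sketch of the standard tabular Q-learning and Sarsa updates. It is not taken from the course materials. The function names, the dictionary-based value table, and the `alpha` (step size) and `gamma` (discount) values are assumptions chosen for illustration.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.5, done=False):
    """Q-learning: bootstrap from the best-valued action in next_state."""
    best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the TD target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.5, gamma=0.5, done=False):
    """Sarsa: bootstrap from the action the agent actually takes next."""
    next_value = 0.0 if done else q[(next_state, next_action)]
    td_target = reward + gamma * next_value
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def make_table():
    """Hypothetical value table: unseen (state, action) pairs default to 0."""
    q = defaultdict(float)
    q[("s1", "left")] = 1.0
    q[("s1", "right")] = 2.0
    return q


if __name__ == "__main__":
    actions = ["left", "right"]
    # Same transition (s0, right, reward 1.0, s1) under both rules.
    print(q_learning_update(make_table(), "s0", "right", 1.0, "s1", actions))
    print(sarsa_update(make_table(), "s0", "right", 1.0, "s1", "left"))
```

The two rules differ only in the value used for the next state: Q-learning takes the maximum over the available actions, while Sarsa uses the action the agent actually selects next. In this sketch both update the estimate after a single step instead of waiting for the episode to end.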
Verify against the repo before relying on details.