Track the latest research on AI reasoning models and chain-of-thought training from one curated list
Find and download open-source reasoning model weights from DeepSeek, Qwen, NVIDIA, and others
Discover training frameworks and reinforcement learning implementations for reproducing o1-style training
Get links to recorded talks and blog posts explaining how step-by-step reasoning models are built
This repository is a curated collection of research papers, blog posts, talks, and open-source resources focused on AI reasoning, with a particular emphasis on OpenAI's o1 model (internally called Strawberry). It does not contain code. Instead, it tracks the research and public writing that helped explain or replicate the ideas behind models that reason step by step before answering, rather than generating a reply immediately.

The collection is organized into several sections. Official documentation and announcements from OpenAI appear at the top. A news section lists updates from OpenAI, Google DeepMind, DeepSeek, and other labs about their own reasoning-focused models. The blogs section includes posts from researchers and practitioners explaining how o1 works, how to replicate it, and what its limitations are. There are also links to recorded talks from researchers such as Hyung Won Chung and Noam Brown, who worked on AI planning in games before joining OpenAI.

A significant portion of the repository covers open-source efforts. There are links to open model weights from groups such as DeepSeek (R1), Alibaba Qwen (QwQ, QvQ), NVIDIA, Skywork, and others who have built publicly available reasoning models. A separate codebase section links to training frameworks and reinforcement learning implementations that people have used to reproduce o1-style training, including OpenRLHF and related REINFORCE++ work.

The research papers section is extensive. It covers chain-of-thought prompting, process reward models, reinforcement learning from human feedback, Monte Carlo tree search applied to language models, and self-play training methods. These are the technical building blocks behind reasoning-focused language models.

This is a living reference list, updated as new work appears. It is aimed at people who follow AI research and want a single place to track what is being built and published around LLM reasoning. This summary is based on only part of the README; the full README is longer.
← hijkzzz on gitmyhub — every repo by this author, as a profile.
Verify against the repo before relying on details.