explaingit

hijkzzz/awesome-llm-strawberry

6,905 | Audience · researcher | Complexity · 1/5 | Setup · easy

TLDR

A curated reading list of research papers, blog posts, talks, and open-source resources tracking the state of AI reasoning models, with a focus on OpenAI o1 and step-by-step thinking models.

Mindmap

mindmap
  root((awesome-llm-strawberry))
    What it is
      Curated reading list
      No code
      Living reference
    Research topics
      Chain-of-thought
      Process reward models
      Reinforcement learning
      Monte Carlo tree search
    Open-source models
      DeepSeek R1
      Qwen QwQ
      NVIDIA models
    Resource types
      Research papers
      Blog posts
      Recorded talks

Things people build with this

USE CASE 1

Track the latest research on AI reasoning models and chain-of-thought training from one curated list

USE CASE 2

Find and download open-source reasoning model weights from DeepSeek, Qwen, NVIDIA, and others

USE CASE 3

Discover training frameworks and reinforcement learning implementations for reproducing o1-style training

USE CASE 4

Get links to recorded talks and blog posts explaining how step-by-step reasoning models are built

Getting it running

Difficulty · easy | Time to first run · 5 min

In plain English

This repository is a curated collection of research papers, blog posts, talks, and open-source resources focused on AI reasoning, with a particular emphasis on OpenAI's o1 model (internally called Strawberry). It does not contain code. Instead, it tracks the research and public writing that helped explain or replicate the ideas behind models that reason step-by-step before answering, rather than generating a reply immediately.

The collection is organized into several sections. Official documentation and announcements from OpenAI appear at the top. A news section lists updates from OpenAI, Google DeepMind, DeepSeek, and other labs about their own reasoning-focused models. The blogs section includes posts from researchers and practitioners explaining how o1 works, how to replicate it, and what its limitations are. There are also links to recorded talks from researchers such as Hyung Won Chung and Noam Brown, who worked on AI planning in games before joining OpenAI.

A significant portion of the repository covers open-source efforts. There are links to open model weights from groups like DeepSeek (R1), Alibaba Qwen (QwQ, QvQ), NVIDIA, Skywork, and others who have built publicly available reasoning models. A separate codebase section links to training frameworks and reinforcement learning implementations that people have used to reproduce o1-style training, including OpenRLHF and related REINFORCE++ work.

The research papers section is extensive and covers topics like chain-of-thought prompting, process reward models, reinforcement learning from human feedback, Monte Carlo tree search applied to language models, and self-play training methods. These are the technical building blocks behind reasoning-focused language models.

This is a living reference list, updated as new work appears. It is aimed at people who follow AI research and want a single place to track what is being built and published around LLM reasoning. This summary covers only part of the README; the full document is longer.

Copy-paste prompts

Prompt 1
List every open-source reasoning model in the awesome-llm-strawberry repo and summarize their training approach and available weight sizes.
Prompt 2
Based on the papers in this list, compare process reward models vs outcome reward models for training reasoning AI. What does the research say works better?
Prompt 3
From the resources in awesome-llm-strawberry, create a step-by-step reading plan for understanding how OpenAI o1 works, starting from the most accessible posts.
Prompt 4
Which reinforcement learning frameworks in this repo have been used to reproduce o1-style training? Summarize what each one does differently.
Prompt 5
Summarize the key findings from Monte Carlo tree search papers listed in awesome-llm-strawberry and how they apply to language model reasoning.

Verify against the repo before relying on details.