hijkzzz/awesome-llm-strawberry

★ 6,905 | Audience · researcher | Complexity · 1/5 | Setup · easy

Mindmap

mindmap
  root((awesome-llm-strawberry))
    What it is
      Curated reading list
      No code
      Living reference
    Research topics
      Chain-of-thought
      Process reward models
      Reinforcement learning
      Monte Carlo tree search
    Open-source models
      DeepSeek R1
      Qwen QwQ
      NVIDIA models
    Resource types
      Research papers
      Blog posts
      Recorded talks

Things people build with this

USE CASE 1

Track the latest research on AI reasoning models and chain-of-thought training from one curated list

USE CASE 2

Find and download open-source reasoning model weights from DeepSeek, Qwen, NVIDIA, and others

USE CASE 3

Discover training frameworks and reinforcement learning implementations used to reproduce o1-style training

USE CASE 4

Get links to recorded talks and blog posts explaining how step-by-step reasoning models are built

Getting it running

Difficulty · easy | Time to first run · 5 min
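
Because the repository contains no code, "running" it amounts to reading its README. As a minimal sketch, assuming the README uses standard Markdown `#` headings and inline `[title](url)` links (an assumption, not confirmed here), the snippet below groups links under their section heading so a section can be pasted into one of the prompts further down. The function name `links_by_section` and the sample text are hypothetical and are not taken from the repository.

```python
import re
from collections import defaultdict

# Assumed README conventions: ATX headings ("## Papers") and inline links.
HEADING = re.compile(r"^#{1,6}\s+(.*\S)\s*$")
LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def links_by_section(markdown_text):
    """Group (title, url) pairs under the most recent Markdown heading."""
    sections = defaultdict(list)
    current = "(no heading)"
    for line in markdown_text.splitlines():
        heading = HEADING.match(line)
        if heading:
            current = heading.group(1)
            continue
        for title, url in LINK.findall(line):
            sections[current].append((title, url))
    return dict(sections)


# Hypothetical sample text; NOT the real README contents.
SAMPLE = """\
# Reading list
## Papers
- [Example paper](https://example.org/paper)
## Blogs
- [Example post](https://example.org/post) and [another](https://example.org/other)
"""

index = links_by_section(SAMPLE)
for section, links in index.items():
    print(section, len(links))
```

To use it on the real list, read a locally saved copy of the README into a string and pass that string to `links_by_section` in place of `SAMPLE`. Headings with no links under them are omitted from the result.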

In plain English

This repository is a curated collection of research papers, blog posts, talks, and open-source resources focused on AI reasoning, with a particular emphasis on OpenAI's o1 model (internally code-named Strawberry). It does not contain code. Instead, it tracks the research and public writing that helped explain or replicate the ideas behind models that reason step by step before answering, rather than generating a reply immediately.

The collection is organized into several sections. Official documentation and announcements from OpenAI appear at the top. A news section lists updates from OpenAI, Google DeepMind, DeepSeek, and other labs about their own reasoning-focused models. The blogs section includes posts from researchers and practitioners explaining how o1 works, how to replicate it, and what its limitations are. There are also links to recorded talks by researchers such as Hyung Won Chung and Noam Brown, who worked on AI planning in games before joining OpenAI.

A significant portion of the repository covers open-source efforts. There are links to open model weights from groups such as DeepSeek (R1), Alibaba Qwen (QwQ, QvQ), NVIDIA, Skywork, and others who have built publicly available reasoning models. A separate codebase section links to training frameworks and reinforcement learning implementations that people have used to reproduce o1-style training, including OpenRLHF and related REINFORCE++ work.

The research papers section is extensive and covers topics such as chain-of-thought prompting, process reward models, reinforcement learning from human feedback, Monte Carlo tree search applied to language models, and self-play training methods. These are the technical building blocks behind reasoning-focused language models.

This is a living reference list, updated as new work appears. It is aimed at people who follow AI research and want a single place to track what is being built and published around LLM reasoning. The full README is longer than the portion summarized here.

Copy-paste prompts

Prompt 1

List every open-source reasoning model in the awesome-llm-strawberry repo and summarize their training approach and available weight sizes.

Prompt 2

Based on the papers in this list, compare process reward models and outcome reward models for training reasoning AI. Which does the research say works better?

Prompt 3

From the resources in awesome-llm-strawberry, create a step-by-step reading plan for understanding how OpenAI o1 works, starting from the most accessible posts.

Prompt 4

Which reinforcement learning frameworks in this repo have been used to reproduce o1-style training? Summarize what each one does differently.

Prompt 5

Summarize the key findings from Monte Carlo tree search papers listed in awesome-llm-strawberry and how they apply to language model reasoning.

Verify against the repo before relying on details.