sapientinc/hrm

★ 12,419 | Python | Audience · researcher | Complexity · 4/5 | Setup · hard

Mindmap

mindmap
  root((HRM))
    What it does
      Dual module reasoning
      No step supervision
      Small but capable
    Tech Stack
      Python
      PyTorch
      FlashAttention
      CUDA
    Use Cases
      Solve Sudoku
      Navigate mazes
      ARC benchmark
    Audience
      AI researchers
      ML students


Things people build with this

USE CASE 1

Train an AI model to solve extreme Sudoku puzzles on a single laptop GPU in around ten hours

USE CASE 2

Load a pre-trained HRM checkpoint from Hugging Face and run it on the ARC-AGI-2 benchmark

USE CASE 3

Study how dual-speed reasoning (slow abstract + fast computation) works in a small AI model

USE CASE 4

Fine-tune the maze-solving checkpoint to navigate custom maze layouts

Tech stack

Python, PyTorch, FlashAttention, CUDA, Weights and Biases, Hugging Face

Getting it running

Difficulty · hard | Time to first run · 1h+

Requires a CUDA-capable GPU. The quick-start Sudoku demo takes about 10 hours of training on a single laptop GPU, and the full experiments are designed for an 8-GPU setup.
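Because every path here depends on a CUDA-capable GPU, a short pre-flight check can be worth running before starting a ten-hour job. The sketch below is not part of the HRM repository. The function name `check_environment` and the shape of its report are made up for illustration. It only assumes that PyTorch, if installed, exposes the standard `torch.cuda.is_available()` and `torch.cuda.device_count()` calls.

```python
import importlib.util


def check_environment() -> dict:
    """Report whether PyTorch is importable and how many CUDA GPUs it can see.

    Hypothetical helper for illustration; not part of the HRM codebase.
    """
    report = {"torch_installed": False, "cuda_available": False, "gpu_count": 0}

    # Look for the package first so a missing install is reported, not raised.
    if importlib.util.find_spec("torch") is None:
        return report

    try:
        import torch
    except ImportError:
        # The package is present but failed to load (for example, a broken install).
        return report

    report["torch_installed"] = True
    report["cuda_available"] = bool(torch.cuda.is_available())
    if report["cuda_available"]:
        report["gpu_count"] = int(torch.cuda.device_count())
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A `gpu_count` of 1 matches the quick-start Sudoku scenario, and 8 matches the full-experiment setup described above. The check does not cover FlashAttention, which the README handles separately.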

In plain English

This repository contains the official code for the Hierarchical Reasoning Model (HRM), a research AI architecture designed for complex reasoning tasks. Most current large language models rely on a technique called Chain-of-Thought, in which they generate long sequences of intermediate reasoning steps to solve a problem. HRM takes a different approach, inspired by how the human brain uses different processing speeds for different kinds of thinking: a slower, higher-level module handles abstract planning while a faster, lower-level module handles detailed computation. The two modules run in a loop within a single forward pass, so the model reasons without explicit step-by-step supervision during training.

What makes HRM notable is its size. The model has only 27 million parameters, far fewer than the large language models that currently dominate AI, and it was trained on only 1,000 examples. Despite this, it achieves near-perfect results on very difficult Sudoku puzzles and on finding optimal paths through large mazes. It also outperforms much larger models on the Abstraction and Reasoning Corpus (ARC), a benchmark used to measure general reasoning ability in AI systems. The paper describing the architecture is available on arXiv.

The repository lets you train HRM from scratch or load pre-trained checkpoints from Hugging Face. Checkpoints are available for the ARC-AGI-2 benchmark, extreme Sudoku puzzles, and hard maze-solving. Training requires a GPU with CUDA support. The quick-start guide walks through training a Sudoku solver on a single laptop GPU in around ten hours, while full-scale experiments are designed to run on an 8-GPU setup.

The code depends on PyTorch, FlashAttention (the required version depends on the GPU generation), and Weights and Biases for tracking training metrics. Setup instructions in the README cover installing CUDA and the required Python packages. A puzzle visualizer is included as an HTML file to help you explore the training data.
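The two-module loop described above can be sketched in a few lines of plain Python. This is a toy illustration of the update schedule only, not the repository's implementation. It assumes the slow module updates once after every few fast steps, which is one reading of "slower" and "faster" here. The names `low_step`, `high_step`, `hierarchical_forward`, `n_cycles`, and `t_steps`, and the averaging formulas inside them, are placeholders rather than HRM's actual layers or hyperparameters.

```python
from typing import List, Tuple

Vector = List[float]


def low_step(z_low: Vector, z_high: Vector, x: Vector) -> Vector:
    """Placeholder fast update: blend the low-level state, the current plan, and the input."""
    return [0.5 * l + 0.25 * h + 0.25 * xi for l, h, xi in zip(z_low, z_high, x)]


def high_step(z_high: Vector, z_low: Vector) -> Vector:
    """Placeholder slow update: fold the low-level result back into the plan."""
    return [0.5 * h + 0.5 * l for h, l in zip(z_high, z_low)]


def hierarchical_forward(
    x: Vector, n_cycles: int = 2, t_steps: int = 3
) -> Tuple[Vector, int, int]:
    """Run n_cycles slow updates, each preceded by t_steps fast updates.

    Returns the final high-level state and the number of low- and
    high-level updates performed, all within one call (one "forward pass").
    """
    z_low = [0.0] * len(x)
    z_high = [0.0] * len(x)
    low_updates = 0
    high_updates = 0

    for _ in range(n_cycles):
        # Fast module: several detailed steps under a fixed plan.
        for _ in range(t_steps):
            z_low = low_step(z_low, z_high, x)
            low_updates += 1
        # Slow module: one planning update using what the fast module found.
        z_high = high_step(z_high, z_low)
        high_updates += 1

    return z_high, low_updates, high_updates
```

With the defaults, `hierarchical_forward([1.0, 2.0])` performs six low-level updates and two high-level updates. That difference in update frequency is the point of the sketch. In the real model, learned neural network modules take the place of the averaging formulas.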

Copy-paste prompts

Prompt 1

Set up the HRM repository and train a Sudoku solver from scratch on my laptop GPU following the quick-start guide. Show me the exact commands.

Prompt 2

Load the HRM ARC-AGI-2 pre-trained checkpoint from Hugging Face and run inference on a set of my own test puzzles

Prompt 3

Explain in plain English how HRM's higher-level planning module and lower-level computation module work together during a single forward pass

Prompt 4

Walk me through using the HTML puzzle visualizer included in the HRM repo to explore the training data

Prompt 5

Compare HRM's 27M parameter performance on ARC reasoning tasks against a much larger language model and explain why it punches above its weight


Verify against the repo before relying on details.