explaingit

locuslab/eqr

17 | Python | Audience · researcher | Complexity · 5/5 | License | Setup · hard

TLDR

Research code from Carnegie Mellon that trains neural networks to solve hard reasoning tasks like extreme Sudoku and maze navigation by running the same reasoning step repeatedly until the answer converges.

Mindmap

mindmap
  root((EqR))
    What it does
      Iterative reasoning
      Convergence to attractor
      Hard problem solving
    Tasks
      Sudoku-Extreme
      Maze-Unique 30x30
    Architecture
      Equilibrium loop
      Halt condition
      Breadth search
    Setup
      torchrun
      FlashAttention
      Custom optimizer

Things people build with this

USE CASE 1

Reproduce the EqR paper experiments on Sudoku-Extreme and Maze-Unique to validate iterative reasoning results.

USE CASE 2

Test how increasing the number of reasoning iterations improves accuracy on hard problems compared to a standard transformer baseline.

Tech stack

Python · PyTorch · CUDA · FlashAttention · torchrun

Getting it running

Difficulty · hard | Time to first run · 1 day+

Requires the CUDA-compiled FlashAttention and adam-atan2 extensions. Training also needs a distributed GPU setup launched with torchrun.

Use freely for any purpose, including commercial use. Changes must be noted but do not need to be open-sourced.

In plain English

EqR (Equilibrium Reasoners) is a research project from Carnegie Mellon University that explores a different way to do multi-step reasoning in neural networks. The idea is that instead of having a model produce an answer in one forward pass, you run the model's reasoning layer repeatedly until it converges to a stable output, called an attractor. The code in this repository reproduces experiments from an accompanying academic paper.

The two test tasks are Sudoku-Extreme (very hard Sudoku puzzles that standard models struggle with) and Maze-Unique (30x30 maze navigation problems where exactly one solution path exists). These tasks were chosen because they require extended step-by-step reasoning. By running the model iteratively and checking when the output stops changing, EqR can apply more compute to harder problems without changing the network's size.

The repository contains training and evaluation scripts, dataset builders for both task types (or download scripts to fetch pre-built datasets from Hugging Face), and pre-trained checkpoints. Training uses distributed GPU setups via torchrun and requires two CUDA-compiled extensions: a custom optimizer called adam-atan2 and FlashAttention for the maze task. Both must be compiled from source, and the README includes detailed notes on getting the build right.

Evaluation lets you control how many iterative reasoning steps the model runs (the halt_max_steps parameter) and how many different starting points to try in parallel (breadth search). The model is compared against a standard transformer baseline under the same compute budget.

The codebase builds on two earlier repositories (HRM and TRM) and is released under the Apache 2.0 license. Large-scale inference code using Google's XLA compiler is listed as a planned future release.
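To make the "iterate until the output stops changing" idea concrete, here is a minimal toy sketch in plain Python. It is not the repository's code. The names reasoning_step, iterate_to_equilibrium, breadth_search, and the tol threshold are hypothetical, the "reasoning layer" is a hand-written contraction, not a trained network, and picking the candidate with the smallest residual is an assumption about how breadth search might select a result. Only the halt_max_steps name comes from the description above.

```python
import math
import random


def reasoning_step(state, puzzle):
    # Toy stand-in for a learned reasoning layer (assumption, not EqR's model).
    # This map is a contraction whose fixed point is tanh(puzzle[i]).
    return [0.5 * s + 0.5 * math.tanh(p) for s, p in zip(state, puzzle)]


def residual(state, puzzle):
    # How much one more step would change the state (0 at an exact fixed point).
    nxt = reasoning_step(state, puzzle)
    return max(abs(a - b) for a, b in zip(nxt, state))


def iterate_to_equilibrium(puzzle, init, halt_max_steps=16, tol=1e-6):
    # Apply the same step repeatedly; halt early once the state stops changing,
    # or give up after halt_max_steps iterations.
    state = list(init)
    for step in range(1, halt_max_steps + 1):
        new_state = reasoning_step(state, puzzle)
        delta = max(abs(a - b) for a, b in zip(new_state, state))
        state = new_state
        if delta < tol:
            return state, step, True
    return state, halt_max_steps, False


def breadth_search(puzzle, breadth=4, halt_max_steps=16, seed=0):
    # Try several random starting points and keep the run with the smallest
    # residual (hypothetical selection rule for illustration).
    rng = random.Random(seed)
    best = None
    for _ in range(breadth):
        init = [rng.uniform(-1.0, 1.0) for _ in puzzle]
        state, steps, converged = iterate_to_equilibrium(puzzle, init, halt_max_steps)
        score = residual(state, puzzle)
        if best is None or score < best[0]:
            best = (score, state, steps, converged)
    return best


if __name__ == "__main__":
    puzzle = [0.3, -1.2, 2.0]
    for budget in (4, 64):
        _, steps, converged = iterate_to_equilibrium(puzzle, [0.0] * 3, budget)
        print(f"halt_max_steps={budget}: steps={steps}, converged={converged}")
```

In this toy every starting point reaches the same attractor, so breadth search adds nothing here. It would only matter for a step function with several attractors, where different starting points can settle on different outputs.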

Copy-paste prompts

Prompt 1
I want to run inference using the pre-trained EqR checkpoint from locuslab/eqr on Sudoku-Extreme puzzles. Walk me through loading the checkpoint, setting halt_max_steps, and running the evaluation script.
Prompt 2
Explain how EqR iterative reasoning works: why does running the same layer repeatedly help solve hard Sudoku puzzles and how does the model decide when to stop iterating?
Prompt 3
I am comparing EqR to a standard transformer baseline under the same compute budget using the eqr evaluation code. Show me the commands to run both models and how to interpret the accuracy output.

Verify against the repo before relying on details.