Run autonomous machine-learning experiments overnight and wake up to a log of what the AI discovered.
Study how an AI agent iterates on real training pipelines and makes architectural decisions.
Compare different model architectures and hyperparameters fairly under a fixed time budget.
Tinker with autonomous AI research loops without manual intervention between runs.
Requires an NVIDIA GPU with CUDA support and PyTorch compilation; the autonomous agent loop may take hours to produce meaningful results.
autoresearch is a small experimental setup that lets an AI agent run machine-learning research on a single GPU automatically, overnight. The idea is to give the agent a working but simplified language-model training pipeline and let it experiment by itself: it modifies the training code, runs a short training job, checks whether the result improved, keeps or discards the change, and repeats. You wake up the next day to a log of experiments and, hopefully, a better model. The training code is a simplified single-GPU implementation drawn from a related project called nanochat. The repository is deliberately tiny and centers on three files. prepare.py handles one-time data preparation (it downloads training data and trains a tokenizer) and provides runtime utilities; the agent does not touch this file. train.py is the single file the agent edits. It contains the full model, optimizer, and training loop, so architecture, hyperparameters, batch size, and similar choices are all fair game. program.md is a short instructions file that you, the human, edit to set up your "research org"; it is what you point the agent at to start a run. Each training run uses a fixed five-minute wall-clock time budget, no matter the hardware. The metric tracked is val_bpb (validation bits per byte), where lower is better. The fixed budget means roughly twelve experiments per hour and around a hundred while you sleep, and it lets architectural changes be compared fairly because every run gets the same amount of time. You might use autoresearch to tinker with autonomous AI research loops or to study how an agent iterates on a real training pipeline. Requirements are a single NVIDIA GPU, Python 3.10 or newer, and the uv project manager.
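The keep-or-discard loop and the time-budget arithmetic described above can be sketched in a few lines of Python. This is an illustration only, not code from the repository: in autoresearch the agent edits train.py and reads val_bpb from a real five-minute training run, whereas here the functions toy_propose and toy_evaluate, the config dictionary, and the learning-rate values are all hypothetical stand-ins so the sketch runs without a GPU.

```python
import random

TIME_BUDGET_S = 5 * 60  # fixed five-minute wall-clock budget per run (from the text)


def experiments_in(hours, budget_s=TIME_BUDGET_S):
    """Upper bound on runs that fit in `hours`, ignoring overhead between runs."""
    return int(hours * 3600 // budget_s)


def run_loop(propose, evaluate, baseline_config, n_experiments):
    """Greedy loop: keep a change only if it lowers val_bpb, else discard it."""
    best_config = baseline_config
    best_bpb = evaluate(best_config)
    log = []
    for i in range(n_experiments):
        candidate = propose(best_config)   # stand-in for "agent edits train.py"
        bpb = evaluate(candidate)          # stand-in for "run training, read val_bpb"
        kept = bpb < best_bpb              # lower val_bpb is better
        if kept:
            best_config, best_bpb = candidate, bpb
        log.append({"experiment": i, "val_bpb": bpb, "kept": kept})
    return best_config, best_bpb, log


# Hypothetical stand-ins, for illustration only.
_rng = random.Random(0)


def toy_propose(config):
    """Perturb a single made-up hyperparameter."""
    factor = _rng.choice([0.5, 0.8, 1.25, 2.0])
    return {**config, "lr": config["lr"] * factor}


def toy_evaluate(config):
    """Synthetic score with a minimum at lr = 0.003; not a real val_bpb."""
    return 1.0 + (config["lr"] - 0.003) ** 2 * 1e4


if __name__ == "__main__":
    print("runs per hour:", experiments_in(1))
    print("runs in 8 hours:", experiments_in(8))
    best, bpb, log = run_loop(toy_propose, toy_evaluate, {"lr": 0.01}, 20)
    print("best config:", best, "score:", round(bpb, 4))
    print("changes kept:", sum(entry["kept"] for entry in log))
```

Because every candidate is scored under the same budget, the comparison bpb < best_bpb is the only acceptance rule the sketch needs. A real run would also spend time on agent reasoning and startup between experiments, so the counts from experiments_in are upper bounds.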
Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.