explaingit

sakanaai/ai-scientist

13,584 | Jupyter Notebook | Audience · researcher | Complexity · 5/5 | Setup · hard

TLDR

A system that uses large language models to automate the full scientific research cycle, generating hypotheses, running experiments, and producing formatted academic papers with AI peer review.

Mindmap

mindmap
  root((AI Scientist))
    What it does
      Generates hypotheses
      Runs experiments
      Writes papers
      Reviews papers
    Templates
      NanoGPT
      2D Diffusion
      Grokking
    Requirements
      NVIDIA GPU
      API keys
      Linux machine
      LaTeX
    Supported models
      GPT-4o
      Claude
      Gemini


Things people build with this

USE CASE 1

Automate generation of novel research papers on small language model training using GPT-4o or Claude as the driving model

USE CASE 2

Run experiments on neural network grokking phenomena without manually writing any experiment code

USE CASE 3

Generate and AI-review a 2D diffusion model research paper using the included experiment template

Tech stack

Python, Jupyter Notebook, LaTeX, NVIDIA CUDA, OpenAI API, Anthropic API, Google Gemini API

Getting it running

Difficulty · hard | Time to first run · 1 day+

Requires a Linux machine with NVIDIA GPUs, a LaTeX installation for PDF generation, and paid API keys for a frontier model such as GPT-4o, Claude, or Gemini.
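
The requirements above can be checked before a first run with a small preflight script. This is a minimal sketch, not part of the repository: the function name `preflight` is made up for illustration, `pdflatex` is assumed to be the LaTeX compiler in use, and the environment variable names are assumptions based on common SDK conventions (the Gemini one in particular may differ), so confirm them against the README.

```python
import os
import shutil


def preflight(env=None, which=shutil.which):
    """Return a list of problems; an empty list means every check passed.

    `env` and `which` are injectable so the checks can be exercised
    without touching the real machine.
    """
    env = os.environ if env is None else env
    problems = []

    # nvidia-smi ships with the NVIDIA driver; if it is missing, the GPU
    # setup is probably incomplete.
    if which("nvidia-smi") is None:
        problems.append("nvidia-smi not found on PATH")

    # Assumption: pdflatex is the compiler used to build the paper PDF.
    if which("pdflatex") is None:
        problems.append("pdflatex not found on PATH")

    # Assumed variable names; at least one provider key must be set.
    key_names = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY")
    if not any(env.get(name) for name in key_names):
        problems.append("no API key set; expected one of: " + ", ".join(key_names))

    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print("MISSING:", issue)
    print("ready" if not issues else f"{len(issues)} problem(s) found")
```

The script only confirms that the tools and keys are present. It does not verify that the GPU is usable from Python or that a key is valid.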

In plain English

The AI Scientist is a system from Sakana AI that attempts to automate the full cycle of scientific research using large language models. Given a research template and a set of starting ideas, it can generate new research hypotheses, write and run experiments, analyze results, and produce a formatted academic paper, including a review of that paper by another AI model. The aim is to have AI conduct research with minimal human involvement, rather than just assisting human researchers.

The system works through experiment templates that define a research domain. Three templates are included: NanoGPT (a small language model training setup), 2D Diffusion (a generative modeling task), and Grokking (a phenomenon in neural network learning). Each template gives the system a codebase to modify and experiment with. The AI generates ideas, writes code changes, runs the experiments on a GPU machine, reads the results, and then writes a LaTeX paper summarizing what it found. A separate reviewer pass uses an LLM to evaluate the generated paper.

Running the system requires a Linux machine with NVIDIA GPUs, a Python environment, a LaTeX installation (for PDF generation), and API keys for at least one supported frontier model such as GPT-4o, Claude, or Gemini. The README lists all supported model providers including OpenAI, Anthropic, Google, and options via Amazon Bedrock and Vertex AI. The project recommends using only frontier-grade models since weaker models produce poor research quality.

The project includes an important safety warning: the system executes code written by the LLM, which could include network access, file operations, or installation of packages. Running it in a containerized environment with restricted network access is strongly advised.

Sample papers produced by the system are available in the repository and on a shared Google Drive folder from the original research runs.
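
One way to follow the containerization advice in the safety warning is to launch runs through Docker. The sketch below only builds the command line; it is an illustration under stated assumptions, not the project's documented setup. The image name `ai-scientist-sandbox` and the network name `restricted-net` are hypothetical: the network is assumed to be a user-defined Docker network whose egress rules the operator has configured separately, since the run still needs to reach the model API.

```python
import shlex


def sandbox_command(workdir, image="ai-scientist-sandbox", network="restricted-net"):
    """Build a `docker run` argument list for an isolated, GPU-enabled run.

    `image` and `network` are hypothetical names; create them yourself.
    """
    return [
        "docker", "run",
        "--rm",                        # discard the container after the run
        "--gpus", "all",               # expose NVIDIA GPUs to the container
        "--network", network,          # attach to the operator-restricted network
        "-v", f"{workdir}:/workspace", # mount only the run directory
        "-w", "/workspace",            # start inside the mounted directory
        image,
    ]


if __name__ == "__main__":
    # Print the command for inspection instead of executing it.
    print(shlex.join(sandbox_command("/data/run1")))
```

Mounting a single run directory limits which host files LLM-written code can touch. The network restriction itself lives in how `restricted-net` is configured, which this sketch does not cover.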
Community-contributed templates beyond the three official ones are accepted but are not maintained by the Sakana AI team.
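
The cycle described above (generate ideas, run an experiment, write a paper, review it) can be pictured as a simple staged pipeline. The sketch below is a toy model of that flow, not the repository's code: `Idea`, `run_pipeline`, and the three stage callables are hypothetical names, and the stubs in the demo stand in for the LLM-driven steps.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    title: str
    score: float  # hypothetical ranking score for choosing which idea to run


def run_pipeline(ideas, run_experiment, write_paper, review_paper):
    """Take the top-scored idea through experiment, write-up, and review.

    The three callables are placeholders for the LLM-driven stages.
    """
    if not ideas:
        raise ValueError("at least one idea is required")
    best = max(ideas, key=lambda idea: idea.score)  # pick the highest-scored idea
    results = run_experiment(best)      # would modify and run the template code
    paper = write_paper(best, results)  # would produce the LaTeX source
    review = review_paper(paper)        # separate reviewer pass over the paper
    return {"idea": best, "results": results, "paper": paper, "review": review}


if __name__ == "__main__":
    # Stub stages only; no model calls and no real experiment results.
    out = run_pipeline(
        [Idea("idea A", 4.0), Idea("idea B", 6.5)],
        run_experiment=lambda idea: {"status": "stub run for " + idea.title},
        write_paper=lambda idea, results: "draft paper on " + idea.title,
        review_paper=lambda paper: {"decision": "stub", "chars_read": len(paper)},
    )
    print(out["idea"].title)
    print(out["paper"])
```

Passing the stages in as callables keeps the control flow visible and makes each stage replaceable, which mirrors how a template supplies the domain-specific parts while the surrounding loop stays the same.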

Copy-paste prompts

Prompt 1
Set up the AI Scientist with the NanoGPT template and GPT-4o, generate 3 research ideas, run the highest-scored idea as a full experiment, and save the resulting LaTeX paper
Prompt 2
Using the Grokking template in AI Scientist with Claude as the model, generate a research hypothesis about modular arithmetic and produce a formatted PDF paper with results and an AI review
Prompt 3
Configure AI Scientist to use Gemini via the Vertex AI provider, run the 2D Diffusion template, and compare the quality of the generated paper against a Claude-powered run
Prompt 4
Add a custom experiment template to AI Scientist for a reinforcement learning codebase and have it generate a new research idea and run the experiment autonomously

Verify against the repo before relying on details.