explaingit

toadoum/ai-research-skill

Analysis updated 2026-05-18

Stars · 7
Language · Python
Audience · researcher
Complexity · 2/5
License · MIT
Setup · easy

TLDR

A portable SKILL.md that gives Claude Code, Codex, and OpenClaw a full ML research workflow with guardrails against data leakage, fabricated citations, and single-seed results, plus a self-improving loop that logs and learns from past mistakes.

Mindmap

mindmap
  root((AI Research Skill))
    Research stages
      Hypothesis framing
      Literature review
      Baseline reproduction
      Experiment design
      Analysis and writing
    Self-improving loop
      Mistakes logged to LESSONS.md
      Agent reads lessons next run
      Rules promoted to SKILL.md
    Guardrails
      No fabricated citations
      Multi-seed results required
      Leakage audit built in
    Compatible agents
      Claude Code
      Codex
      OpenClaw
    License
      MIT

What do people build with it?

USE CASE 1

Give Claude Code a structured research workflow so it catches data leakage and reproducibility issues automatically during an ML experiment.

USE CASE 2

Log recurring agent mistakes to LESSONS.md so the same failure (like reporting a single-seed result as a finding) never repeats across sessions.

USE CASE 3

Scaffold a new ML experiment directory with reproducibility guardrails already in place before writing any code.

USE CASE 4

Use the literature-review reference doc to build a comparison matrix from papers the agent actually reads, not citations it invents.
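The leakage check named in use case 1 can take many forms, and the skill's own checklist is not reproduced on this page. As a minimal sketch, assuming examples are plain strings and that the failure being tested is duplicate text shared between the train and test splits, one such check might look like the following. The function names `normalize` and `find_split_overlap` and the sample data are illustrative assumptions, not part of the repo.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical strings compare equal."""
    return " ".join(text.lower().split())


def find_split_overlap(train, test):
    """Return the normalized test examples that also appear in the train split."""
    train_keys = {normalize(example) for example in train}
    test_keys = {normalize(example) for example in test}
    return sorted(train_keys & test_keys)


# Made-up example data, for illustration only.
train = ["The movie was great", "Terrible plot", "Loved the acting"]
test = ["terrible   plot", "A new unseen review"]

overlap = find_split_overlap(train, test)
if overlap:
    print(f"Possible leakage: {len(overlap)} test example(s) also in train: {overlap}")
else:
    print("No exact-duplicate overlap found between splits.")
```

A check like this only catches duplicated inputs. Leakage through preprocessing fitted on the full dataset, or through label-derived features, needs separate checks.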

What is it built with?

Python, SKILL.md, AGENTS.md

How does it compare?

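| | toadoum/ai-research-skill | captaingrock/krea2trainer | codenamekt/hexus |
| --- | --- | --- | --- |
| Stars | 7 | 7 | 7 |
| Language | Python | Python | Python |
| Setup difficulty | easy | hard | moderate |
| Complexity | 2/5 | 4/5 | 3/5 |
| Audience | researcher | designer | developer |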

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · easy
Time to first run · 5 min

Requires an AI coding agent (Claude Code, Codex, or OpenClaw) already installed. The skill itself is just files you copy into the agent's skills directory.

MIT license: use freely for any purpose, including commercial use, as long as you keep the copyright notice.

In plain English

AI Research Skill is a set of files you drop into your coding agent's skills directory to give it structured guidance for every stage of a machine learning research project. It works with Claude Code, Codex, and OpenClaw by placing a SKILL.md file where those agents automatically look for instructions, so the guidance loads without any extra setup whenever your task looks like research.

The skill walks the agent through seven research stages: turning a vague idea into a testable hypothesis, doing a literature review from papers actually read rather than recalled, reproducing the strongest known baseline before claiming any improvement, designing experiments with proper seed fixing and data-leakage checks, running configurations so every result is reproducible, analyzing results across multiple seeds rather than a single lucky run, and writing a paper where every claim is backed by a number.

What makes this project different from a static prompt is a self-improving loop. When the agent catches or makes a research mistake, it runs a small Python script that appends the lesson to a LESSONS.md file in your project. At the start of every future task, the agent reads that file and applies what it already learned. If the same mistake appears three times, the lesson gets promoted into the SKILL.md itself. The loop cannot be used to make the agent less rigorous: a built-in guardrail refuses any lesson that would weaken research integrity by fabricating, hiding, or cherry-picking results.

The repository also includes reference documents for literature review, experiment design, and paper writing, plus a script that scaffolds a reproducible experiment directory with the correct folder structure. The project is licensed under MIT and aimed at researchers who run ML experiments with AI coding agents and want the agent to catch common mistakes (data leakage, single-seed results, fabricated citations) before they cost weeks of work.
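The self-improving loop described above can be sketched in a few lines. This is a hypothetical illustration of the idea, not the repo's `log_lesson.py`: the function signature, the bullet format in LESSONS.md, the return values, and the phrase-based guardrail are all assumptions. Only the file name LESSONS.md and the promotion threshold of three occurrences come from the description.

```python
import tempfile
from pathlib import Path

PROMOTE_AFTER = 3  # per the description: the third occurrence triggers promotion

# Deliberately crude, hypothetical stand-in for the integrity guardrail.
# The real skill's refusal logic is not shown on this page.
FORBIDDEN_PHRASES = ("report only the best", "drop failed runs", "skip the leakage")


def log_lesson(lessons_file: Path, rule: str) -> str:
    """Append a lesson and say whether it was rejected, logged, or due for promotion."""
    lowered = rule.lower()
    if any(phrase in lowered for phrase in FORBIDDEN_PHRASES):
        return "rejected"  # lesson would weaken rigor, so nothing is written

    line = f"- {rule.strip()}"
    lines = lessons_file.read_text().splitlines() if lessons_file.exists() else []
    lines.append(line)
    lessons_file.write_text("\n".join(lines) + "\n")

    # Count how often this exact lesson has now been recorded.
    return "promote" if lines.count(line) >= PROMOTE_AFTER else "logged"


with tempfile.TemporaryDirectory() as tmp:
    lessons = Path(tmp) / "LESSONS.md"
    rule = "Report mean and std over multiple seeds, never a single run."
    statuses = [log_lesson(lessons, rule) for _ in range(3)]
    rejected = log_lesson(lessons, "Report only the best seed to save time.")
    logged_lines = lessons.read_text().splitlines()

print(statuses, rejected, len(logged_lines))
```

In this sketch a "promote" status is only a signal; copying the rule into SKILL.md is left to the caller, since how the real skill performs that step is not described here.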

Copy-paste prompts

Prompt 1
I installed ai-research-skill in my Claude Code skills directory. I'm starting an NLP classification experiment. Walk me through the Frame and Review stages from the skill's workflow before I write any code.
Prompt 2
My experiment shows a suspiciously high F1 score. Use the ai-research-skill leakage audit checklist to help me diagnose whether label information is leaking into my training set.
Prompt 3
I made a mistake in my last research session where I claimed a gain from a single seed. Help me run log_lesson.py to record this as a rule in LESSONS.md so it doesn't happen again.
Prompt 4
I'm writing the analysis section of my ML paper. Use the ai-research-skill writing guidelines to check that every performance claim I'm making is backed by mean and std over at least 3 seeds.
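The multi-seed requirement behind Prompts 3 and 4 can be sketched as a small aggregation helper. This is an illustration under stated assumptions, not code from the repo: the name `summarize_seeds`, the dictionary layout, and the scores are made up, and the minimum of 3 seeds is taken from the wording of Prompt 4.

```python
from statistics import mean, stdev

MIN_SEEDS = 3  # Prompt 4 asks for mean and std over at least 3 seeds


def summarize_seeds(scores_by_seed: dict) -> dict:
    """Aggregate one metric across seeds, refusing to summarize too few runs."""
    if len(scores_by_seed) < MIN_SEEDS:
        raise ValueError(
            f"need at least {MIN_SEEDS} seeds, got {len(scores_by_seed)}"
        )
    values = list(scores_by_seed.values())
    return {
        "n_seeds": len(values),
        "mean": mean(values),
        "std": stdev(values),  # sample standard deviation
    }


# Made-up scores keyed by random seed, for illustration only.
baseline = summarize_seeds({0: 0.80, 1: 0.82, 2: 0.84})
print(f"F1 = {baseline['mean']:.3f} +/- {baseline['std']:.3f} (n={baseline['n_seeds']})")

# A single-seed result is refused rather than reported as a finding.
try:
    summarize_seeds({0: 0.91})
except ValueError as err:
    single_seed_error = str(err)
    print(f"Refused: {single_seed_error}")
```

Raising an error instead of returning a summary with a missing std is a design choice in this sketch: it makes a single-seed claim fail loudly before it reaches a results table.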

Frequently asked questions

What is ai-research-skill?

A portable SKILL.md that gives Claude Code, Codex, and OpenClaw a full ML research workflow with guardrails against data leakage, fabricated citations, and single-seed results, plus a self-improving loop that logs and learns from past mistakes.

What language is ai-research-skill written in?

Mainly Python. The stack also includes SKILL.md and AGENTS.md.

What license does ai-research-skill use?

MIT license: use freely for any purpose, including commercial use, as long as you keep the copyright notice.

How hard is ai-research-skill to set up?

Setup difficulty is rated easy, with roughly 5 minutes to a first successful run.

Who is ai-research-skill for?

Mainly researchers who run ML experiments with AI coding agents.

Verify against the repo before relying on details.