Turn a machine learning research paper PDF into a runnable code repository automatically
Prototype an ML paper implementation without writing all the boilerplate from scratch
Run locally using open-source models via vLLM to generate paper code without an OpenAI API key
Evaluate how well AI-generated code matches a reference implementation using the included benchmark
Requires an OpenAI API key (roughly $0.50-0.70 per paper with o3-mini) or a local vLLM setup with DeepSeek-Coder.
Paper2Code is a research project associated with ICLR 2026 that attempts to turn machine learning research papers into working code repositories automatically. Its core system, PaperCoder, takes a paper as input, either as a PDF or as LaTeX source files, and produces a folder of code that implements the methods the paper describes.
The system works in three stages, each handled by its own AI agent. A planning agent reads the paper and lays out what needs to be built. An analysis agent examines the technical details of the methods. A code generation agent then writes the actual code. The result is a structured output directory containing planning notes, analysis artifacts, and the final generated repository.
To use it, you need either an API key for an AI provider or a local model setup. The default supported provider is OpenAI, where running the system on a single paper costs roughly fifty to seventy cents with the o3-mini model. The project also supports running open-source language models locally through vLLM, with DeepSeek-Coder as the default model for that path. The README walks through converting a PDF into the JSON format the system expects; alternatively, you can feed it LaTeX directly. It includes an example based on "Attention Is All You Need", the paper that introduced the Transformer model.
The README also describes an evaluation framework for scoring generated code in two modes: reference-free, where an AI judge assesses the code against the paper alone, and reference-based, where the code is compared against the original authors' published code. The project has also released a benchmark dataset on HuggingFace called paper2code, which pairs machine learning papers with their official code repositories for evaluation.
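The three-stage flow described above (planning, then analysis, then code generation, with each stage's output saved to disk) can be illustrated with a minimal sketch. This is not PaperCoder's actual code or API: the function names, prompt strings, and output file names below are all hypothetical, and the model call is replaced by a stub so the example runs without an API key.

```python
from pathlib import Path
from typing import Callable

# Hypothetical stand-in for a model call: takes a prompt, returns text.
# A real setup would wrap an OpenAI or vLLM client here instead.
LLM = Callable[[str], str]


def stub_llm(prompt: str) -> str:
    """Return a placeholder answer labelled with the stage named on line 1."""
    stage = prompt.split("\n", 1)[0]
    return f"# stub output for stage {stage}"


def run_pipeline(paper_text: str, llm: LLM, out_dir: Path) -> dict[str, Path]:
    """Run planning -> analysis -> coding and save each stage's output."""
    out_dir.mkdir(parents=True, exist_ok=True)

    # Stage 1: the planning step sees only the paper.
    plan = llm(f"PLAN\n{paper_text}")
    # Stage 2: the analysis step sees the paper plus the plan.
    analysis = llm(f"ANALYZE\n{plan}\n{paper_text}")
    # Stage 3: the coding step sees the plan and the analysis.
    code = llm(f"CODE\n{plan}\n{analysis}")

    # Assumed layout: notes at the top level, generated code in a subfolder.
    repo_dir = out_dir / "repo"
    repo_dir.mkdir(exist_ok=True)
    artifacts = {
        "planning": out_dir / "planning.md",
        "analysis": out_dir / "analysis.md",
        "code": repo_dir / "main.py",
    }
    artifacts["planning"].write_text(plan, encoding="utf-8")
    artifacts["analysis"].write_text(analysis, encoding="utf-8")
    artifacts["code"].write_text(code, encoding="utf-8")
    return artifacts


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        for name, path in run_pipeline("paper body", stub_llm, Path(tmp) / "out").items():
            print(name, "->", path.name)
```

Passing the model in as a plain callable keeps the stage ordering independent of the provider, which mirrors the summary's point that the same pipeline can run against either a hosted API or a local model.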
Verify against the repo before relying on details.