SynthTraces is a small codebase that generates synthetic recordings of conversations with AI coding assistants. The goal is to capture how an AI model behaves when a user asks it questions about a real software project, so that researchers can study different models and compare them side by side.

Each session pairs two AI models. One model plays the user and asks questions such as "How do I run this code?" or "What recent changes were made and why?" The other model plays the coding agent, which can read files, write code, run commands, and edit files inside a real project codebase. Their entire exchange is recorded as a trace.

The dataset covers 20 agent models, 3 user models, 20 project codebases (such as transformers and diffusers), and 20 starting questions. Multiplied together (20 × 3 × 20 × 20), that gives up to 24,000 unique session recordings, one for each combination.

The agent models are hosted remotely and include popular open models from DeepSeek, OpenAI, and Qwen. The user models run locally with llama.cpp, a program for running models on ordinary hardware. The project codebases are real open-source repositories, cloned locally so that the coding agent can actually interact with them. The setup is meant to produce realistic traces of how an AI behaves when helping someone navigate an unfamiliar codebase, rather than simulated or hand-crafted examples.

The trace dataset is published on Hugging Face for others to download and study. The README notes that final statistics on success rates and token counts will be filled in once generation is complete. The code is licensed under MIT.
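The session count can be checked with a short sketch that enumerates every combination of agent model, user model, codebase, and starting question. This is an illustration only, not code from the repository: the identifiers below (`agent-00`, `repo-00`, and so on) and the `session_grid` helper are hypothetical placeholders, since the actual model, repository, and question names are not listed here.

```python
from itertools import product

# Hypothetical placeholder names; only the counts (20, 3, 20, 20) come from the description.
agent_models = [f"agent-{i:02d}" for i in range(20)]
user_models = [f"user-{i}" for i in range(3)]
codebases = [f"repo-{i:02d}" for i in range(20)]
questions = [f"question-{i:02d}" for i in range(20)]


def session_grid(agents, users, repos, prompts):
    """Return an iterator over every (agent, user, repo, prompt) combination."""
    return product(agents, users, repos, prompts)


sessions = list(session_grid(agent_models, user_models, codebases, questions))
print(len(sessions))  # 24000
```

The figure is an upper bound ("up to 24,000"), so a real run may well produce fewer traces if some combinations are skipped or fail.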
Verify against the repo before relying on details.