Pair a strong PM model with a small Intern to keep cost low while reading a large codebase
Run a local Qwen via vLLM as the Coder and a hosted Claude as the PM
Auto-review every Coder edit against the approved plan and bounce drift back
Lock down per-role tool permissions so the Intern can only read, glob, and grep
Requires Bun plus three OpenAI-compatible endpoints and matching API keys for the PM, Coder, and Intern roles.
TeamCode is a terminal coding assistant that runs three AI models together as a small committee instead of relying on one model for everything. The author argues that a single model writing and reviewing its own code falls into confirmation bias, so the work is split across three roles: a PM that plans, a Coder that writes and edits files, and an Intern that does cheap bulk reading. The roles can use different providers or model sizes.
The normal flow is described as a committee protocol. The PM searches the codebase with glob and grep, dispatches the Intern to read files, and produces a concrete plan. The Coder then checks that plan against the real code, pushes back if it disagrees, and the two debate for up to three rounds before the user is asked to approve, reject, or edit the proposal. Once approved, the Coder writes files in a background fiber so the user can keep chatting with the PM. If PM auto-review is on, every file the Coder writes is read back by the Intern and compared to the plan, and if the implementation drifts the PM steers the Coder back.
Each role has different tools. The PM is mostly read-only and can dispatch the Intern and submit plans to the Coder. The Coder can read, write, edit, and run shell commands. The Intern can only read, glob, and grep, so the author suggests a small 4B-class model is enough for that role. A permissions block in the config locks these capabilities down per agent.
Setup uses Bun. The user clones the repo, runs bun install, copies teamcode.jsonc, and points each role at an OpenAI-compatible endpoint such as DashScope, OpenAI, vLLM, or LiteLLM. API keys can live in the config, in environment variables like TEAMCODE_PM_API_KEY, or be set with the /apikey slash command. Other slash commands change the per-role model, toggle auto-review, set the max parallel Interns, compact context, switch theme, and show the current plan and status.
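The committee flow described above can be pictured as a small control loop. The following is a minimal sketch under stated assumptions, not TeamCode's actual implementation: the names run_committee, pm_plan, coder_review, ask_user, and the Review type are hypothetical, and only the three-round debate cap and the approve, reject, or edit step come from the description.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# From the description: the PM and Coder debate for up to three rounds.
MAX_DEBATE_ROUNDS = 3


@dataclass
class Review:
    """Hypothetical result of the Coder checking a plan against the real code."""
    agrees: bool
    objection: str = ""


def run_committee(
    task: str,
    pm_plan: Callable[[str, str], str],      # (task, objection) -> plan text
    coder_review: Callable[[str], Review],   # plan -> Review
    ask_user: Callable[[str], str],          # plan -> "approve", "reject", or an edited plan
) -> Optional[str]:
    """Return the plan to implement, or None if the user rejects it."""
    plan = pm_plan(task, "")
    for _ in range(MAX_DEBATE_ROUNDS):
        review = coder_review(plan)
        if review.agrees:
            break
        # The Coder pushed back, so the PM revises the plan using the objection.
        plan = pm_plan(task, review.objection)
    decision = ask_user(plan)
    if decision == "reject":
        return None
    if decision == "approve":
        return plan
    # Assumption: any other reply is treated as the user's edited plan.
    return decision


# Stand-in callables for illustration only; real roles would call model endpoints.
def demo_pm(task: str, objection: str) -> str:
    return f"plan for {task}" + (f" (revised: {objection})" if objection else "")


def demo_coder(plan: str) -> Review:
    return Review(agrees="revised" in plan, objection="file path is wrong")


result = run_committee("add logging", demo_pm, demo_coder, lambda plan: "approve")
print(result)
```

In this sketch the user is always consulted after the debate, whether or not the PM and Coder converged, which matches the description that the proposal goes to the user for approval after at most three rounds.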
The README notes the project is tuned for edge development scenarios with medium-sized models, low bandwidth, and generous VRAM, naming the DGX Spark as an example. Recommended pairings cover Qwen 3.6, GPT-5.5, and Claude at three tiers per role. Recent versions added primitive security guardrails that ask for user permission and a Cautious, Balanced, or Fast personality setting.
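The per-role tool lockdown described earlier amounts to an allow-list lookup. The following is a minimal sketch, assuming hypothetical tool names (dispatch_intern, submit_plan, shell) and a hypothetical is_allowed helper. The real permissions block in teamcode.jsonc may use different keys and may grant the PM additional read tools, so check the repo before relying on this shape.

```python
# Allow-list per role, following the capabilities described in the summary.
# Tool names are illustrative assumptions, not TeamCode's real identifiers.
ROLE_TOOLS: dict[str, frozenset[str]] = {
    # PM: searches with glob and grep, dispatches the Intern, submits plans.
    "pm": frozenset({"glob", "grep", "dispatch_intern", "submit_plan"}),
    # Coder: can read, write, edit, and run shell commands.
    "coder": frozenset({"read", "write", "edit", "shell"}),
    # Intern: read-only bulk reading.
    "intern": frozenset({"read", "glob", "grep"}),
}


def is_allowed(role: str, tool: str) -> bool:
    """Deny by default: unknown roles and unlisted tools are rejected."""
    return tool in ROLE_TOOLS.get(role, frozenset())


print(is_allowed("intern", "grep"))   # True
print(is_allowed("intern", "write"))  # False
```

Denying by default means a misspelled role or a newly added tool is blocked until it is listed explicitly, which suits a setup where the smallest model gets the narrowest access.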
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.