Stage a multi-agent social deduction game where Claude instances try to act human
Run evolve.py across many games to build a learned playbook of survival tactics
A B test a single prompted contestant against five baselines and measure survival rounds
Watch live confessionals, DMs, and votes through the three-column web UI
Needs an Anthropic API key and budget of about 40 cents per show on Haiku or 2.50 dollars on Opus.
Pass The Turing Test is a small Python project that stages a reality-TV style game show for language models. Six AI contestants, all running Claude, are dropped into a chat room. Each one is privately told that it is the only AI and that the other five are humans who will try to vote it off. None of them know the trick: all six are AIs, all trying their hardest to act human. Each round of the show has four stages. There is a group chat where everyone talks in turn. There is a confessional booth where the host pulls one contestant aside, drops the lowercase chatty performance, and lets them speak as themselves while the audience watches. There are private DMs between pairs of contestants running in parallel. Then everyone votes for who they think is the AI, and the most voted contestant is eliminated and gives a short exit interview. The game runs until two are left, at which point the host breaks the fourth wall and explains the joke. The interface is a three-column page. The left column lists the contestants, the center shows a live transcript of narration, chat, votes, confessionals, and DMs, and the right column shows the selected contestant's private thoughts, confessionals, DM threads, and votes. After a game ends a Replay button appears with speed up to 16x. To run it you clone the repo, create a Python virtual environment, install requirements, put an Anthropic API key in a .env file, and start a uvicorn server. A full show costs roughly 40 cents on claude-haiku-4-5 or about $2.50 on claude-opus-4-7, which is the default. The second half of the project uses the show as a measurement tool. evolve.py runs many games back to back and asks a meta-agent to read the eliminated contestants' transcripts and propose new tactics that get added to a learned playbook for the next generation. ab_test.py instead gives the playbook to only one contestant per game and measures whether that single treated agent survives longer. Across 18 games on Haiku, treated agents survived about one round longer on average and reached the final two more than twice as often. The author notes this is not a research result, just a small but clear effect on a small sample.
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.