Analysis updated 2026-06-24 · repo last pushed 2026-05-21
Cut context usage on Claude Code or Cursor by sandboxing large tool outputs into small summaries
Restore an agent's working memory after a context compaction using BM25 search over a SQLite event log
Run Playwright snapshots, log greps, and GitHub issue dumps without flooding the model context
Add a Think in Code pattern so the agent writes a script that returns only the answer
| | mksglu/context-mode | sairyss/domain-driven-hexagon | tonejs/tone.js |
|---|---|---|---|
| Stars | 14,627 | 14,636 | 14,609 |
| Language | TypeScript | TypeScript | TypeScript |
| Last pushed | 2026-05-21 | 2024-06-11 | 2026-05-20 |
| Maintenance | Maintained | Dormant | Maintained |
| Setup difficulty | moderate | moderate | easy |
| Complexity | 4/5 | 4/5 | 3/5 |
| Audience | developer | developer | developer |
Figures from each repo's GitHub metadata at analysis time.
Automatic routing enforcement only works on platforms with hook support; on the others, a routing file has to be copied into the project by hand.
Context Mode is an MCP server: a small program that plugs into AI coding agents such as Claude Code, Cursor, and a list of others, and changes how those agents handle large pieces of data. The README explains the core problem: every time an agent calls a tool, the raw response goes straight into the model's context window. A web page snapshot taken with Playwright might be 56 KB, twenty GitHub issues might be 59 KB, and a single access log file might be 45 KB. After half an hour of this, the README says, 40% of the working memory has been spent on data the model is no longer using, and when the conversation has to compact itself, the agent loses track of what it was doing.
According to the README, the tool addresses this in four ways. First, sandbox tools run external commands and keep the raw output out of context, replacing it with a small summary; one stated figure is 315 KB becoming 5.4 KB, a 98% reduction. Second, session continuity is tracked in a SQLite database with full-text search. When the conversation is compacted, only the events the model needs are retrieved via BM25 ranking, rather than everything being dumped back in. Third, what the README calls Think in Code asks the agent to write a script that returns only the answer rather than reading every file. Fourth, the project deliberately does not impose a writing style on the model's replies, citing benchmarks where strict brevity rules hurt coding performance.
Installation is grouped by platform, and the README documents 15 of them. Claude Code is the easiest case: two slash commands add the plugin marketplace and install context-mode, after which a SessionStart hook injects the routing instructions automatically and a doctor command validates the setup. Platforms with hook support get fully automatic routing enforcement; the others need a one-time routing file copied into the project.
The MCP server ships eleven tools. Six are sandbox tools for running code and indexing data: ctx_batch_execute, ctx_execute, ctx_execute_file, ctx_index, ctx_search, and ctx_fetch_and_index. Five meta-tools cover statistics, diagnostics, upgrades, purging old data, and insights. Slash commands surface the same functions inside the agent. The README opens with logos of companies whose teams the project claims as users and shows a 570-point Hacker News thread as social proof. The license is the Elastic License v2. The full README is longer than what was shown.
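
The sandboxing idea described above (run a command, keep the raw output out of context, hand back only a small summary) can be illustrated with a minimal Python sketch. This is not the project's implementation, which is written in TypeScript. The function name `run_sandboxed`, the summary fields, and the sample command are all hypothetical and chosen only for illustration.

```python
import subprocess
import sys


def run_sandboxed(cmd: list[str], head_lines: int = 5) -> dict:
    """Run a command and return a compact summary instead of its raw output.

    Hypothetical helper: the field names are illustrative and are not
    taken from context-mode.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
    raw = proc.stdout
    lines = raw.splitlines()
    return {
        "exit_code": proc.returncode,
        "raw_bytes": len(raw.encode("utf-8")),  # size of what was kept out
        "line_count": len(lines),
        "head": lines[:head_lines],  # a small sample for orientation
    }


# Stand-in for a noisy tool call: a child process printing 10,000 log lines.
noisy_cmd = [
    sys.executable,
    "-c",
    "for i in range(10000): print('GET /item/%d 200' % i)",
]
summary = run_sandboxed(noisy_cmd)
print(summary["line_count"], "lines summarized;", summary["raw_bytes"], "raw bytes withheld")
```

Only `summary` would be passed on to the model. The raw text stays inside the helper, which is the property the README's size-reduction figures depend on.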
MCP server that keeps large tool outputs out of an AI coding agent's context window by sandboxing commands and storing session events in a searchable SQLite database.
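
The searchable event log can be sketched with SQLite's FTS5 extension, which provides a built-in `bm25()` ranking function. This is a rough illustration that assumes the local SQLite build includes FTS5. The table name `events`, its columns, the sample rows, and the `recall` helper are hypothetical and do not reflect the project's actual schema.

```python
import sqlite3

# In-memory database for illustration; a real event log would live on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one full-text-indexed row per session event.
conn.execute("CREATE VIRTUAL TABLE events USING fts5(kind, body)")
conn.executemany(
    "INSERT INTO events (kind, body) VALUES (?, ?)",
    [
        ("edit", "Refactored the login handler to validate session tokens"),
        ("test", "Playwright snapshot of the checkout page passed"),
        ("error", "Login test failed: session token expired during redirect"),
        ("note", "Updated README badges"),
    ],
)


def recall(query: str, limit: int = 2) -> list[tuple[str, str]]:
    """Return the best-matching events for a query, ranked by BM25.

    FTS5's bm25() yields lower (more negative) values for better matches,
    so an ascending ORDER BY puts the most relevant rows first.
    """
    rows = conn.execute(
        "SELECT kind, body FROM events "
        "WHERE events MATCH ? ORDER BY bm25(events) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


# After a compaction, fetch only the events relevant to the current task.
hits = recall("login session")
for kind, body in hits:
    print(kind, "|", body)
```

The `limit` parameter is what keeps the restored context small: only the top-ranked events are returned, not the whole log.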
Mainly TypeScript. The stack also includes MCP and SQLite.
Maintained: commit in the last 6 months (last push 2026-05-21).
Elastic License v2 lets you use and modify the code for free, but you cannot offer it to others as a hosted or managed service, circumvent its license-key functionality, or remove its licensing notices.
Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.
Aimed mainly at developers.
Verify against the repo before relying on details.