Analysis updated 2026-05-18
Stop Claude Code from reloading a large MEMORY.md or AGENTS.md file every session by caching only relevant snippets
Set a hard token budget on context recall to control API costs for a team of coding agents sharing memory
See a savings receipt after each recall showing how many tokens were injected versus the full baseline
Build a memory layer for any MCP-compatible coding assistant using the remember, recall, and search_memory tools
| | yohadh/thrift-memory | arashthr/hugo-flow | argeneau12e/kairos-tx |
|---|---|---|---|
| Stars | 2 | 2 | 2 |
| Language | TypeScript | TypeScript | TypeScript |
| Setup difficulty | easy | moderate | hard |
| Complexity | 2/5 | 3/5 | 4/5 |
| Audience | developer | developer | developer |
Figures from each repo's GitHub metadata at analysis time.
Requires an MCP-compatible coding agent like Claude Code or Cursor. No external API keys needed.
This is a memory server for AI coding assistants, designed to reduce the token cost of reloading project context files at the start of every session. Tools like Claude Code or Cursor often re-read the same large documentation files each time they start, and pay for those tokens every time. Thrift Memory sits between the agent and those files, stores memories from previous sessions, and at each new session recalls only the pieces relevant to the current task, subject to a token limit you set.

What sets this tool apart from other memory systems is that every recall operation returns a receipt showing how many tokens the full context would have cost (the baseline), how many were actually injected, and how many were saved. The formula is savedTokens = baselineTokens - injectedTokens. This lets you see the actual cost reduction rather than assuming savings are happening.

The tool provides three main capabilities. The first is an MCP server with three tools: remember (store a memory in organization, agent, or session scope), recall (retrieve relevant memories under a hard token budget), and search_memory (browse stored memories without applying a tight budget). The second is a local dashboard where you can see savings over time and manage individual memories, including pinning important ones so they are always included. The third is an optional HTTP proxy that trims live requests and retries requests that fail because of rate limits.

Installation is done through npm. For Claude Code users there is a plugin that sets up the MCP server and adds slash commands in one step. For other MCP-compatible tools, you add a short JSON configuration block pointing to the npx command.
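The receipt arithmetic and the hard token budget described above can be illustrated with a small sketch. This is not the project's code: the `Memory` and `Receipt` types, the `recallWithinBudget` function, the relevance score, and the 4-characters-per-token estimate are all hypothetical stand-ins. Here pinned memories are simply considered first and still count against the budget, which may differ from how the tool treats them.

```typescript
// Hypothetical data model; the real thrift-memory schema may differ.
interface Memory {
  text: string;
  relevance: number; // assumed score in [0, 1], higher means more relevant
  pinned: boolean;   // pinned memories are considered before all others
}

interface Receipt {
  baselineTokens: number; // cost of injecting every stored memory
  injectedTokens: number; // cost of what was actually injected
  savedTokens: number;    // baselineTokens - injectedTokens
}

// Crude estimate (about 4 characters per token); an assumption, not the tool's tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function recallWithinBudget(
  memories: Memory[],
  budget: number
): { injected: Memory[]; receipt: Receipt } {
  const baselineTokens = memories.reduce((sum, m) => sum + estimateTokens(m.text), 0);

  // Order: pinned first, then by descending relevance.
  const ordered = [...memories].sort(
    (a, b) => Number(b.pinned) - Number(a.pinned) || b.relevance - a.relevance
  );

  const injected: Memory[] = [];
  let injectedTokens = 0;
  for (const m of ordered) {
    const cost = estimateTokens(m.text);
    if (injectedTokens + cost > budget) continue; // never exceed the hard budget
    injected.push(m);
    injectedTokens += cost;
  }

  return {
    injected,
    receipt: { baselineTokens, injectedTokens, savedTokens: baselineTokens - injectedTokens },
  };
}

// Usage: three memories costing 100, 50, and 30 tokens, recalled under a 100-token budget.
const memories: Memory[] = [
  { text: "a".repeat(400), relevance: 0.2, pinned: false },
  { text: "b".repeat(200), relevance: 0.9, pinned: false },
  { text: "c".repeat(120), relevance: 0.1, pinned: true },
];
const { receipt } = recallWithinBudget(memories, 100);
// receipt is { baselineTokens: 180, injectedTokens: 80, savedTokens: 100 }
```

The greedy loop skips any memory that would push the total over the budget instead of stopping at the first miss, so a smaller, lower-ranked memory can still fill the remaining room.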
An MCP memory server for AI coding agents that recalls only task-relevant context under a token budget and shows exactly how many tokens were saved.
Mainly TypeScript. The stack also includes Node.js and npm.
Setup difficulty is rated easy, with roughly 5 minutes to a first successful run.
The audience is mainly developers.
Verify against the repo before relying on details.