Analysis updated 2026-05-18
Reduce token consumption and API costs when using Claude Code, Cursor, or GitHub Copilot for extended coding sessions.
Extend effective context window length by compressing requests, allowing longer conversations on large codebases without hitting limits.
Optimize AI assistant usage for teams or individuals with strict token budgets or rate-limit constraints.
| | rtk-ai/rtk | sharkdp/fd | fuellabs/fuels-rs |
|---|---|---|---|
| Stars | 42,925 | 42,859 | 43,217 |
| Language | Rust | Rust | Rust |
| Setup difficulty | moderate | easy | moderate |
| Complexity | 3/5 | 1/5 | 4/5 |
| Audience | developer | developer | developer |
Figures from each repo's GitHub metadata at analysis time.
Requires Rust toolchain installation and compilation from source.
RTK is a local CLI proxy that sits between your AI coding assistant and the LLM provider it calls, with the goal of dramatically reducing the number of tokens consumed per session. According to the project, it achieves a 60 to 90 percent reduction in token usage, which translates directly into lower API costs or fewer rate-limit hits when working with tools like Claude Code, GitHub Copilot, or Cursor.
It works by intercepting the outgoing requests that your AI coding tool makes to its backend model, transforming or compressing the context before forwarding it. Because it hooks into the bash command layer that tools like Claude Code use when invoking shell commands, it can operate without modifying the AI tool itself. The proxy runs locally on your machine and the AI assistant's configuration is updated to route requests through it.
You would use RTK when your AI coding sessions are consuming tokens faster than your budget or subscription allows, or when you are hitting context limits that cut off long conversations. It is particularly relevant for heavy users of Claude Code, Codex, or Cursor who run extended sessions on large codebases and find that context window costs or limits become a practical constraint.
The project is written in Rust, which gives it low overhead as a local proxy process. It is designed to be transparent to the user: once configured, the AI coding tool behaves the same as before but the underlying requests are more token-efficient. This makes it a cost-optimization layer rather than a change to the AI tool's functionality.
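
The description above does not spell out which transformations RTK applies, so the following is only a minimal Python sketch of one plausible kind of context compression: collapsing repeated lines in shell command output and capping its length before it reaches the model. The function names (`compress_output`, `estimate_tokens`, `reduction_percent`) and the 4-characters-per-token estimate are assumptions made for illustration, not part of RTK.

```python
def compress_output(text: str, max_lines: int = 40) -> str:
    """Collapse consecutive duplicate lines, then cap the total line count.

    Hypothetical illustration only; not RTK's actual algorithm.
    """
    compressed = []
    previous = None
    repeat = 0
    for line in text.splitlines():
        line = line.rstrip()
        if line == previous:
            # Same as the line before: count it instead of keeping it.
            repeat += 1
            continue
        if repeat:
            compressed.append(f"[previous line repeated {repeat} more time(s)]")
            repeat = 0
        compressed.append(line)
        previous = line
    if repeat:
        compressed.append(f"[previous line repeated {repeat} more time(s)]")
    if len(compressed) > max_lines:
        omitted = len(compressed) - max_lines
        compressed = compressed[:max_lines] + [f"[{omitted} more lines omitted]"]
    return "\n".join(compressed)


def estimate_tokens(text: str) -> int:
    """Rough estimate assuming about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def reduction_percent(before: str, after: str) -> float:
    """Percentage drop in estimated tokens from `before` to `after`."""
    tokens_before = estimate_tokens(before)
    tokens_after = estimate_tokens(after)
    return 100.0 * (tokens_before - tokens_after) / tokens_before


if __name__ == "__main__":
    # Simulated noisy build output: one warning repeated many times.
    raw = "warning: unused variable\n" * 50 + "error: build failed\n"
    small = compress_output(raw)
    print(small)
    print(f"estimated reduction: {reduction_percent(raw, small):.0f}%")
```

How much a filter like this saves depends entirely on how repetitive the input is, so any real figure should be measured on your own sessions rather than inferred from this sketch.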
Local CLI proxy that intercepts AI coding assistant requests and compresses context to reduce token usage by 60 to 90%, lowering API costs and rate-limit pressure.
Mainly Rust. The stack also includes CLI and proxy components.
Use freely for any purpose, including commercial. Keep the license notice and disclose changes; the license also carries a patent grant.
Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.
Mainly developers.
Verify against the repo before relying on details.