explaingit

maxforai/tokenless

32 · JavaScript

TLDR

Tokenless is a command-line add-on for Claude Code that tries to cut how many tokens each session burns through.

In plain English

Tokenless is a command-line add-on for Claude Code that tries to cut how many tokens each session burns through. The pitch on the front page comes down to two headline numbers: roughly 50% fewer request tokens in a vibe-coding run, and up to 80% fewer response tokens in plain chat. You install it from GitHub with npm, run a couple of setup commands, and then keep using Claude Code as normal. A tokenless style command lets you switch between three output profiles at any time.

The core problem it tackles is that Claude Code sessions get expensive because every file read, log, diff, and verbose final reply is carried into the context of the next request. Tokenless intercepts that flow: large tool outputs and file reads are kept on your machine as raw artifacts, while what actually goes to the model is a compact packet, called a TOKENLESS-READ-PACKET. The packet contains an artifact id, imports, a symbol list, a few snippets, and the exact commands needed to expand the original output later if the agent really needs it.

It also targets two other sources of growth. The launcher trims out the heavy Task and Plan tools by default to keep the agent's trajectory from ballooning, and the output profiles change how Claude itself replies. The chat profile keeps responses short and readable, the coding profile produces dense, structured responses for code work, and the off profile disables all of this and gives you stock Claude Code behavior.

The README is unusually open about evidence. It lists a table of API-body measurements from real sessions: a 5-turn CRM vibe-coding run dropped from 4.7M request tokens to 2.5M, a natural conversation dropped from 7,223 response tokens to 1,442, a large CSS visual edit dropped by around 54 to 60 percent, and a 10,000-line React edit dropped by about 40 percent.

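
As a quick sanity check, the two measurements quoted with absolute token counts can be turned into percentages and compared against the headline claims. The short Python sketch below is an illustration only and is not part of Tokenless; the helper name percent_reduction is made up for this example, and the inputs are simply the figures reported above.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Figures as quoted above from the README's measurement table.
crm_run = percent_reduction(4_700_000, 2_500_000)  # request tokens, 5-turn CRM run
chat = percent_reduction(7_223, 1_442)             # response tokens, natural conversation

print(f"CRM vibe-coding run: {crm_run:.1f}% fewer request tokens")
print(f"Natural conversation: {chat:.1f}% fewer response tokens")
```

The results come out at about 46.8% and 80.0%, which lines up with the "roughly 50%" and "up to 80%" figures on the front page.
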
There is also a research-backing section pointing at papers on brevity constraints, prompt compression, LLMLingua, LongLLMLingua, Selective Context, and Gist Tokens, while admitting that these do not automatically prove Tokenless helps every session.

Installation is npm install -g from the GitHub repo, then tokenless repair-hooks, tokenless install-commands, and tokenless launch. CLAUDE_BIN can point to a non-standard Claude Code binary.

The project is MIT licensed and includes README translations for Chinese, Japanese, French, and Spanish.
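
To make the read-packet idea described earlier more concrete, here is a minimal Python sketch of how a large file read could be reduced to a compact summary while the raw text stays local. Everything in it is an assumption made for illustration: the ReadPacket class, its field names, the hash-based artifact id, the regex-based extraction, and the "<expand-command>" placeholder are all hypothetical. Tokenless itself is written in JavaScript, and its real packet format and expand commands are not reproduced here.

```python
import hashlib
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReadPacket:
    """Hypothetical stand-in for a compact read packet; field names are assumed."""
    artifact_id: str
    imports: List[str] = field(default_factory=list)
    symbols: List[str] = field(default_factory=list)
    snippets: List[str] = field(default_factory=list)
    expand_commands: List[str] = field(default_factory=list)


def build_packet(source: str, max_snippets: int = 2) -> ReadPacket:
    """Summarize Python source text into a small packet (illustrative only)."""
    lines = source.splitlines()
    # Assumed id scheme: a short hash of the content.
    artifact_id = hashlib.sha256(source.encode("utf-8")).hexdigest()[:12]
    # Lines that start with an import statement.
    imports = [ln.strip() for ln in lines if re.match(r"[ \t]*(import|from)\s", ln)]
    # Names of functions and classes defined anywhere in the file.
    symbols = re.findall(r"^[ \t]*(?:def|class)\s+(\w+)", source, flags=re.MULTILINE)
    # Keep only the first few definition lines as snippets.
    definitions = [ln.strip() for ln in lines if re.match(r"[ \t]*(def|class)\s", ln)]
    snippets = definitions[:max_snippets]
    # Placeholder only: the real expand command is defined by the tool.
    expand_commands = [f"<expand-command> {artifact_id}"]
    return ReadPacket(artifact_id, imports, symbols, snippets, expand_commands)


SAMPLE = (
    "import os\n"
    "from pathlib import Path\n"
    "\n"
    "class Store:\n"
    "    def save(self, path):\n"
    "        return Path(path).write_text('x')\n"
    "\n"
    "def load(path):\n"
    "    return Path(path).read_text()\n"
) + "# padding comment line\n" * 200

packet = build_packet(SAMPLE)
print(packet.symbols)
print(len(SAMPLE), "characters raw vs", len(str(packet)), "characters in the packet")
```

The point of the sketch is the shape of the trade: the model sees a few hundred characters of structure plus a pointer back to the artifact, and the full text is only fetched again if it is actually needed.
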

Generated 2026-05-21 · Model: sonnet-4-6 · Verify against the repo before relying on details.