Analysis updated 2026-06-24
Cut your Claude Code monthly bill by switching to the coding profile for vibe sessions
Keep chat replies short during natural Q&A with the chat profile
Store large file reads as local artifacts so they stop bloating future requests
Disable Task and Plan tools by default to keep agent trajectories small
| | maxforai/tokenless | systemoutprintlnhelloworld/plus-pp-helper | rosalina7515/ui-modernizer |
|---|---|---|---|
| Stars | 32 | 32 | 30 |
| Language | JavaScript | JavaScript | JavaScript |
| Setup difficulty | easy | moderate | easy |
| Complexity | 2/5 | 3/5 | 2/5 |
| Audience | vibe coder | developer | vibe coder |
Figures from each repo's GitHub metadata at analysis time.
Not on the npm registry yet. Install via npm install -g github:MaxForAI/Tokenless and run tokenless repair-hooks before first launch.
Tokenless is a command-line add-on for Claude Code that tries to cut how many tokens each session burns through. The headline claim on the front page is roughly 50% fewer request tokens in a vibe-coding run, and up to 80% fewer response tokens in plain chat. You install it from GitHub with npm, run a couple of setup commands, and then keep using Claude Code as normal. A tokenless style command lets you switch between three output profiles at any time.

The core problem it tackles is that Claude Code sessions get expensive because every file read, log, diff, and verbose final reply gets carried into the context of the next request. Tokenless intercepts that flow: large tool outputs and file reads are kept on your machine as raw artifacts, while what actually goes to the model is a compact packet, called a TOKENLESS-READ-PACKET, containing an artifact id, imports, a symbol list, a few snippets, and the exact commands needed to expand the original output later if the agent really needs it.

It also targets two other sources of growth. The launcher trims out the heavy Task and Plan tools by default to keep the agent trajectory from ballooning, and the chat and coding output profiles change how Claude itself replies. The chat profile keeps responses short and readable, the coding profile produces dense structured responses for code work, and the off profile disables all of this and gives you stock Claude Code behavior.

The README is unusually open about evidence. It lists a table of API-body measurements from real sessions: a 5-turn CRM vibe-coding run dropped from 4.7M request tokens to 2.5M, a natural conversation dropped from 7,223 response tokens to 1,442, a large CSS visual edit dropped by around 54 to 60 percent, and a 10,000-line React edit dropped by about 40 percent.
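The two absolute figures quoted from the README can be checked against the headline percentages with simple arithmetic. The snippet below is only a sanity check on those quoted numbers; the helper name `percentReduction` is made up for illustration and is not part of Tokenless.

```javascript
// Percentage reduction between a baseline and a reduced token count.
// `percentReduction` is a hypothetical helper, not a Tokenless API.
function percentReduction(before, after) {
  if (before <= 0) {
    throw new RangeError("before must be positive");
  }
  return ((before - after) / before) * 100;
}

// Figures as quoted from the README's measurement table.
const crmRun = percentReduction(4_700_000, 2_500_000); // 5-turn CRM vibe-coding run
const chatRun = percentReduction(7223, 1442); // natural conversation

console.log(`CRM run: ${crmRun.toFixed(1)}% fewer request tokens`);
console.log(`Chat run: ${chatRun.toFixed(1)}% fewer response tokens`);
```

On these inputs the CRM run works out to about 46.8% and the chat run to about 80.0%, which is in line with the "roughly 50%" and "up to 80%" claims.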
There is also a research-backing section pointing at papers on brevity constraints, prompt compression, LLMLingua, LongLLMLingua, Selective Context, and Gist Tokens, while admitting these do not automatically prove Tokenless helps every session.

Installation is npm install -g from the GitHub repo, then tokenless repair-hooks, tokenless install-commands, and tokenless launch. CLAUDE_BIN can point to a non-standard Claude Code binary. The project is MIT licensed and includes README translations in Chinese, Japanese, French, and Spanish.
CLI add-on for Claude Code that reduces session token usage by storing large tool outputs locally and sending compact reference packets to the model instead.
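To make the "store locally, send a compact packet" idea concrete, here is a minimal sketch of how such a packet could be built. It is not Tokenless's actual code: the function names, the packet field names, the in-memory `Map` standing in for on-disk artifact storage, the regex-based extraction, and the `hypothetical-expand` command string are all assumptions made for illustration.

```javascript
// Illustrative sketch only. Every name and field here is an assumption,
// not the real TOKENLESS-READ-PACKET format.
const artifacts = new Map(); // stand-in for raw artifacts kept on disk
let nextId = 1;

function buildReadPacket(path, source, maxSnippetLines = 3) {
  const artifactId = `artifact-${nextId++}`;
  artifacts.set(artifactId, source); // the full content never leaves the machine

  const lines = source.split("\n");
  // Naive line-based extraction; a real tool would more likely use a parser.
  const imports = lines.filter((l) =>
    /^\s*(import\s|const\s+\w+\s*=\s*require\()/.test(l)
  );
  const symbols = [];
  for (const line of lines) {
    const m = line.match(
      /^\s*(?:export\s+)?(?:async\s+)?(?:function|class)\s+([A-Za-z_$][\w$]*)/
    );
    if (m) symbols.push(m[1]);
  }

  return {
    artifactId,
    path,
    totalLines: lines.length,
    imports,
    symbols,
    snippets: lines.slice(0, maxSnippetLines),
    // Placeholder for "the exact commands needed to expand the original output".
    expand: `hypothetical-expand ${artifactId}`,
  };
}

// Returns the raw content for an artifact id, as an agent would request on demand.
function expandArtifact(artifactId) {
  if (!artifacts.has(artifactId)) {
    throw new Error(`unknown artifact: ${artifactId}`);
  }
  return artifacts.get(artifactId);
}

const source = [
  'import fs from "node:fs";',
  "",
  "export function loadConfig(path) {",
  "  return JSON.parse(fs.readFileSync(path, 'utf8'));",
  "}",
  "",
  "class Cache {}",
].join("\n");

const packet = buildReadPacket("src/config.js", source);
console.log(JSON.stringify(packet, null, 2));
```

For a file this small the packet is no shorter than the source, so the approach only pays off when the stored output is large relative to the summary, which matches the README's focus on large file reads and tool outputs.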
Mainly JavaScript. The stack also includes Node and npm.
MIT license, free to use, modify, and ship commercially as long as the copyright notice stays.
Setup difficulty is rated easy, with roughly 5 minutes to a first successful run.
Aimed mainly at vibe coders.
Verify against the repo before relying on details.