doorman11991/smallcode

★ 1,096 | JavaScript | Audience · developer | Complexity · 3/5 | Active | Setup · moderate

Mindmap

mindmap
  root((smallcode))
    Inputs
      Local LLM server
      env config
      Project files
    Outputs
      Search-replace patches
      TODO plans
      Tool execution
    Use Cases
      Offline coding agent
      Privacy-first dev
      Run on laptop
    Tech Stack
      Node.js
      JavaScript
      LM Studio
      Ollama

Things people build with this

USE CASE 1

Run a coding assistant offline using an 8B model in Ollama

USE CASE 2

Get patch-based edits from a local LLM without leaking code to a cloud API

USE CASE 3

Use a small local model with optional fallback to Claude or DeepSeek on hard failures

USE CASE 4

Try a context-budgeted agent loop on a laptop with 16k token models

Tech stack

Node.js, JavaScript, npm, SQLite, LM Studio, Ollama

Getting it running

Difficulty · moderate | Time to first run · 30 min

Needs Node 18+ and a local LLM server such as LM Studio or Ollama running before first use. The better-sqlite3 dependency may need a native build on non-LTS Node versions.
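As a quick preflight for the Node 18+ requirement above, here is a minimal sketch. The helper name meetsNodeRequirement is hypothetical and not part of smallcode. It only reads the version string that Node exposes through process.versions.node.

```javascript
// Hypothetical preflight helper (an illustration, not smallcode's own code).
// Returns true when the given Node version string has a major version >= minMajor.
function meetsNodeRequirement(version, minMajor = 18) {
  // process.versions.node looks like "20.11.1"; a leading "v" is tolerated.
  const major = Number.parseInt(String(version).replace(/^v/, "").split(".")[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

// Warn early instead of failing later during install or first run.
if (!meetsNodeRequirement(process.versions.node)) {
  console.error(`Node 18+ is required, found ${process.versions.node}`);
}
```

This checks only the Node version. It does not check whether LM Studio or Ollama is reachable, which depends on how your local server is configured.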

In plain English

SmallCode is a terminal coding assistant built specifically for small language models, the kind in the 8 to 35 billion parameter range that can run on a regular laptop or desktop instead of a cloud server. The README contrasts it with tools like OpenCode, which assume you are using a frontier model such as Claude or GPT-5 with a very large context window and reliable output. SmallCode tries to squeeze useful coding work out of weaker, local models by changing the design around their limits.

You install it from npm with a single command, or grab a prebuilt tarball for Windows, macOS, or Linux that bundles Node.js so you do not need to compile anything. It then talks to a local model server such as LM Studio or Ollama, configured through a .env file. There is an optional escape hatch: if the local model fails after retries, SmallCode can hand off the task to a cloud model like Claude or DeepSeek, but only if you provide an API key.

The README lists several design choices that target small-model weaknesses. A context budget engine caps tool output and summarizes old history so the model never overflows its window. A two-stage tool router shows the model only the tools it needs, which matters when the context is just 8 or 16 thousand tokens. A forgiving parser accepts tool calls in JSON, YAML, XML, or plain text, since small models often produce messy output. Editing uses search-and-replace patches rather than full file rewrites, because small models tend to truncate or hallucinate when reproducing whole files. There is also a planning layer that breaks complex tasks into a TODO file the model rereads each turn, plus detection for repetition loops and stuck patch attempts.

The author claims an 87 percent benchmark score using a 4 billion active parameter model.
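To make the search-and-replace editing idea concrete, here is a minimal sketch of how such a patch could be applied. The function name applyPatch and the { search, replace } patch shape are assumptions for illustration only; smallcode's actual patch format and matching rules may differ, so check the repo.

```javascript
// Hypothetical helper: apply one search-and-replace patch to file contents.
// The { search, replace } shape is an assumption, not smallcode's real format.
function applyPatch(source, patch) {
  const { search, replace } = patch;
  if (search.length === 0) {
    return { ok: false, reason: "empty search text" };
  }
  const first = source.indexOf(search);
  if (first === -1) {
    return { ok: false, reason: "search text not found" };
  }
  // Refuse ambiguous patches: the search text must match exactly once.
  if (source.indexOf(search, first + 1) !== -1) {
    return { ok: false, reason: "search text matches more than once" };
  }
  // Splice the replacement in; everything outside the match is untouched.
  const result =
    source.slice(0, first) + replace + source.slice(first + search.length);
  return { ok: true, result };
}

// Usage: the model only has to emit the small changed region, not the whole file.
const before = "function add(a, b) {\n  return a - b;\n}\n";
const outcome = applyPatch(before, {
  search: "return a - b;",
  replace: "return a + b;",
});
console.log(outcome.ok ? outcome.result : outcome.reason);
```

Returning a reason instead of throwing gives an agent loop something to feed back to the model on a failed attempt, which is one plausible way to support the stuck-patch detection the README mentions.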

Copy-paste prompts

Prompt 1

Set up smallcode with Ollama running llama 8B and walk me through the .env config

Prompt 2

Show me how smallcode's two-stage tool router decides which tools to expose to a small model

Prompt 3

Explain how smallcode's search-and-replace patch tool works and when it falls back to full rewrites

Prompt 4

Add a new tool to smallcode that runs pytest and parses failures into the TODO planner

Prompt 5

Compare smallcode's context budget engine to OpenCode's approach for a 16k context model

Verify against the repo before relying on details.