serge-chat/serge

★ 5,732 | Svelte | Audience · developer | Complexity · 3/5 | License · MIT and Apache 2.0 | Setup · moderate

Mindmap

mindmap
  root((repo))
    What it does
      Local AI chat
      No data sent out
      No API key needed
      Browser-based UI
    Tech stack
      SvelteKit frontend
      FastAPI backend
      LangChain
      Redis sessions
      llama.cpp engine
    Setup
      Docker required
      Single start command
      Windows via WSL2
    Limits
      RAM must fit model
      Crashes if too little memory


Things people build with this

USE CASE 1

Run a private AI chat assistant on your laptop without sending any messages to third-party servers

USE CASE 2

Prototype a local LLM-powered tool without needing an OpenAI API key or monthly bill

USE CASE 3

Test different open-source language models side by side in a familiar chat interface

USE CASE 4

Set up a self-hosted AI assistant on a home server for your household or small team

Tech stack

Svelte · SvelteKit · FastAPI · Python · LangChain · Redis · Docker · llama.cpp

Getting it running

Difficulty · moderate | Time to first run · 30 min

Requires Docker and enough free RAM to load your chosen model. The app will crash if the model exceeds available memory.
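As a rough illustration of the memory requirement above, here is a minimal Python sketch of a pre-flight check you could run before loading a model. It is not part of Serge. The function names and the 1.2x headroom factor are assumptions made for this example, and the free-memory helper is best-effort: it returns None on platforms that do not expose that value.

```python
import os
from typing import Optional

GIB = 1024 ** 3


def fits_in_memory(model_bytes: int, available_bytes: int, headroom: float = 1.2) -> bool:
    """Return True if the model plus a safety margin fits in available RAM.

    `headroom` is an assumed multiplier for runtime overhead, not a
    figure taken from Serge or llama.cpp; tune it for your setup.
    """
    if model_bytes < 0 or available_bytes < 0:
        raise ValueError("sizes must be non-negative")
    return model_bytes * headroom <= available_bytes


def available_ram_bytes() -> Optional[int]:
    """Best-effort free physical memory in bytes, or None if unsupported."""
    try:
        return os.sysconf("SC_AVPHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        # os.sysconf or these names are unavailable on some platforms.
        return None


if __name__ == "__main__":
    free = available_ram_bytes()
    model_size = 4 * GIB  # placeholder size; use your model file's real size
    if free is None:
        print("Could not determine free memory on this platform.")
    else:
        print("Model likely fits:", fits_in_memory(model_size, free))
```

In practice you would pass the real size of the model file, for example from os.path.getsize, instead of the placeholder value.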

Free to use, modify, and distribute under both the MIT and Apache 2.0 licenses, two of the most permissive open-source licenses available.

In plain English

Serge is a self-hosted chat application that lets you run AI language models on your own computer without sending any data to external services and without needing an API key or paid account. It wraps llama.cpp, a tool that runs large language models locally on consumer hardware.

The interface is a web page you open in your browser. You type messages and the AI responds, as in other chat tools, but everything stays on your machine. Chat history and settings are stored locally in a small database.

The project is packaged as a Docker container, so you start it with a single command and it handles setup automatically. Once it is running, you visit a local address in your browser to begin chatting.

On the technical side, the frontend is built with SvelteKit, the backend API uses FastAPI and LangChain, and Redis stores chat sessions. These components all run inside the container, so you do not need to install any of them yourself.

The main practical requirement is having enough free memory to load the model you choose. Models vary in size, and the application will crash if your machine runs out of RAM while loading one. The project flags this as an important caveat to be aware of before getting started.

Windows users can run Serge using Docker Desktop with WSL2 enabled. The project is open source under the MIT and Apache 2.0 licenses. A Discord community exists for help and discussion, and contributions are welcome via issues or pull requests on GitHub.
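To make the idea of storing chat sessions in a key-value store more concrete, here is a small Python sketch. It does not reproduce Serge's actual storage code: the `chat:<id>` key format, the function names, and the JSON message layout are all assumptions made for illustration. An in-memory stand-in replaces Redis so the example runs without a server.

```python
import json
from typing import Dict, List, Optional


class InMemoryStore:
    """Stand-in for a key-value store so the sketch runs without a server."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


def _key(chat_id: str) -> str:
    # Hypothetical key naming scheme, not taken from Serge.
    return f"chat:{chat_id}"


def load_messages(store, chat_id: str) -> List[dict]:
    """Return the stored message list for a chat, or an empty list."""
    raw = store.get(_key(chat_id))
    return json.loads(raw) if raw else []


def append_message(store, chat_id: str, role: str, content: str) -> List[dict]:
    """Append one message to a chat and write the whole list back as JSON."""
    messages = load_messages(store, chat_id)
    messages.append({"role": role, "content": content})
    store.set(_key(chat_id), json.dumps(messages))
    return messages


if __name__ == "__main__":
    store = InMemoryStore()
    append_message(store, "demo", "user", "Hello")
    append_message(store, "demo", "assistant", "Hi, how can I help?")
    print(load_messages(store, "demo"))
```

Because the helpers only call `get` and `set`, a real Redis client with the same two methods could in principle be passed in place of `InMemoryStore`, though this read-modify-write pattern is not safe against concurrent writers.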

Copy-paste prompts

Prompt 1

Using the Serge codebase, help me add a system prompt field to the chat UI so I can set a custom persona for the AI before starting a conversation.

Prompt 2

Show me how Serge's FastAPI backend passes messages to llama.cpp and how I can add a streaming endpoint that sends tokens to the browser one at a time as they are generated.

Prompt 3

I want to add a model download manager to Serge's SvelteKit frontend that lists available models from a remote index and shows download progress. Help me build that component.

Prompt 4

Help me extend Serge to support multiple named conversation threads that persist between page refreshes, storing them in the existing Redis session store.


Verify against the repo before relying on details.