explaingit

techfreq/ai-local-executive-team

6 | Python | Audience · developer | Complexity · 4/5 | Active | Setup · hard

TLDR

Local multi-agent bridge that loads five LM Studio models as a CEO, CTO, CFO, CPO, and COO board, routes each prompt to the right role, and exposes the result through OpenAI-style and Ollama-style API endpoints.

Mindmap

mindmap
  root((ai-local-executive-team))
    Inputs
      User chat messages
      Preset selection
      Hotkeys
    Outputs
      Streamed agent responses
      Action plan from COO
      Benchmark winners saved
    Use Cases
      Multi-perspective answers offline
      Replace cloud LLMs for coding
      Local vision analysis
      Auto-tune hardware presets
    Tech Stack
      Python
      LM Studio
      CUDA
      OpenAI API
      Ollama API

Things people build with this

USE CASE 1

Run a five-model executive board offline for product and engineering questions

USE CASE 2

Plug local models into Continue, Cline, or OpenWebUI through one bridge

USE CASE 3

Auto-tune model configs on your GPU and save the fastest setup

USE CASE 4

Route casual prompts to a fast 14B and strategy prompts to the full board
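
The routing in Use Case 4 can be pictured with a small sketch. This is not the repository's router: the keyword sets, the `route` function, and the three route names are hypothetical, chosen only to illustrate sending casual prompts to a fast model, technical prompts to the CTO, and strategic prompts to the full board.

```python
import re

# Hypothetical keyword sets; the project's real routing rules may differ.
TECH_WORDS = {"bug", "code", "python", "api", "deploy", "database", "refactor"}
STRATEGY_WORDS = {"strategy", "roadmap", "pricing", "market", "launch", "invest"}


def route(prompt: str) -> str:
    """Return 'board', 'cto', or 'fast' for a prompt (illustrative heuristic)."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    if words & STRATEGY_WORDS:
        return "board"  # strategic question: convene all five roles
    if words & TECH_WORDS:
        return "cto"    # technical question: single specialist model
    return "fast"       # everything else: small, fast model


if __name__ == "__main__":
    for text in ("hey, how are you?",
                 "Why does my Python API return a 500?",
                 "What should our pricing strategy be?"):
        print(f"{route(text):5}  {text}")
```

Checking strategy keywords first means a prompt that mixes technical and strategic terms goes to the full board; a real router would likely use something richer than keyword matching.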

Tech stack

Python · LM Studio · CUDA · OpenAI API · Ollama API

Getting it running

Difficulty · hard | Time to first run · 1h+

Needs LM Studio, multiple downloaded large models, and a GPU with 8GB+ VRAM to be useful.

In plain English

This project sets up what the author calls an executive board: a set of AI models that all run on your own computer rather than in the cloud. Five downloaded models are assigned roles named after company titles: CEO, CTO, CFO, CPO, and COO. When you ask a question, the CEO model opens by framing it, the CTO, CFO, and CPO models each answer in parallel from their own angle, the COO turns those answers into an action plan, and the CEO closes with a final recommendation. The pitch is that one general model is a single mind, while a panel of models, each tuned to a job, gives you different perspectives on the same prompt.

The models are served by a separate program called LM Studio, a free desktop app that downloads open-weight language models and runs them on your machine. The bridge in this repository sits between LM Studio and any chat client. It exposes endpoints in both the OpenAI and Ollama API styles, so tools such as Continue, Cline, Claude Code, and OpenWebUI can connect without special integration.

Routing is automatic: short or casual messages go to the small fast model, technical questions go to the CTO, and big strategic questions trigger the full five-model board meeting. The README documents six presets that trade speed for quality, ranging from Fastest at about thirty seconds to Nuclear at up to forty minutes. Live hotkeys let you switch presets, abort a generation, or run a built-in benchmark that tries different settings and saves the fastest combination for your hardware.

The author developed and tested it on a Windows machine with an NVIDIA RTX 3060 (12 GB of video memory), 64 GB of system RAM, and a Ryzen 9 5900X.
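
The board-meeting sequence described above (CEO framing, three parallel answers, COO plan, CEO verdict) can be sketched with `asyncio`. This is a minimal illustration under assumptions, not the project's code: `ask` is a hypothetical stand-in that echoes its input instead of calling LM Studio, and `board_meeting` and the prompt wording are invented for the example.

```python
import asyncio


async def ask(role: str, prompt: str) -> str:
    """Hypothetical model call; a real bridge would query LM Studio here."""
    await asyncio.sleep(0)  # yield control, as a network request would
    return f"[{role}] {prompt}"


async def board_meeting(question: str) -> dict:
    # 1. The CEO frames the question.
    framing = await ask("CEO", f"Frame: {question}")

    # 2. CTO, CFO, and CPO answer concurrently from their own angle.
    advisors = ["CTO", "CFO", "CPO"]
    answers = await asyncio.gather(*(ask(role, framing) for role in advisors))

    # 3. The COO turns the answers into an action plan.
    plan = await ask("COO", "Plan from: " + " | ".join(answers))

    # 4. The CEO closes with a final recommendation.
    verdict = await ask("CEO", f"Decide on: {plan}")

    return {
        "framing": framing,
        "answers": dict(zip(advisors, answers)),
        "plan": plan,
        "verdict": verdict,
    }


if __name__ == "__main__":
    result = asyncio.run(board_meeting("Should we ship v2?"))
    for stage, text in result.items():
        print(stage, "->", text)
```

`asyncio.gather` returns results in the order the coroutines were passed, so the answers can be zipped back to their roles. Whether the parallel step actually saves time on a single GPU depends on how the models are loaded, which this sketch does not model.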

Copy-paste prompts

Prompt 1
Add a new CMO role to ai-local-executive-team that runs in parallel with CTO/CFO/CPO and updates the routing logic
Prompt 2
Write a config preset for ai-local-executive-team tuned to a 24GB RTX 4090 with all models fully on GPU
Prompt 3
Containerize swarm_bridge_server.py with a docker-compose that also starts LM Studio's headless server
Prompt 4
Add a streaming WebSocket endpoint to ai-local-executive-team so a web client can show each executive typing in real time
Prompt 5
Extend the intent router to detect legal questions and add a General Counsel agent backed by a specialized model

Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.