Enable Caduceus mode in Hermes so the agent maintains a visible to-do plan and verifies each step before moving to the next.
Author a Hermes workflow that fans out sub-tasks to multiple agents in parallel and watch live progress in the Orchestration Theater panel.
Configure multiple AI models in Hermes and let the Auto Router assign each sub-task to the cheapest model that meets the capability bar.
Install hermes-caduceus with one script that backs up every file it touches, then undo the whole thing with a single uninstall flag.
Requires an existing Hermes desktop app installation and Python 3.11 or newer. It uses only the standard library, so no extra packages are needed.
Hermes-caduceus is an optional add-on mode for Hermes, a desktop AI agent app made by Nous Research. If you already use Hermes, this project lets you turn on a more advanced planning style with a single command. When turned off, the fork behaves exactly like the original Hermes, so there is no penalty for installing it.
The core addition is called Caduceus mode, which you activate by typing "/caduceus on" inside Hermes. Once on, the agent works through tasks by maintaining a visible to-do plan, completing one step at a time, and checking its own work before marking anything as done. For simple or quick requests, it skips the extra ceremony and responds normally, so the overhead only appears when a task genuinely calls for it.
For more complex jobs, Caduceus includes a workflow engine called the Loom. Rather than running everything in one conversation thread, the agent can author a small Python workflow and run it across multiple sub-agents in parallel or in sequence. The Hermes desktop app shows this process in real time through what the project calls the Orchestration Theater: a visual panel with live phase lanes, per-agent status, token usage, and a shared budget counter. You watch the fan-out happen rather than waiting in silence.
A third feature, the Auto Router, handles model selection when multiple AI models are configured. It scores each sub-task by capability requirements and sends that task to the cheapest model that meets the bar. The main orchestrator always keeps the model you chose for your session, so only the background workers get re-routed.
Installation is a single git clone and one Python script, which auto-detects your Hermes install, backs up every file it touches, and can be fully undone with an uninstall flag. The project requires Python 3.11 or newer and uses only the standard library, with no additional packages to install. A local GPU worker option also exists for running workflow sub-agents on local models rather than cloud endpoints.
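The Auto Router's rule, "cheapest model that meets the capability bar", can be illustrated with a short sketch. This is not the project's code: the `Model` class, the `route` function, the 1-10 capability scale, the model names, and the prices are all hypothetical, chosen only to show the selection logic under those assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """Hypothetical model record; fields are illustrative, not from the repo."""
    name: str
    capability: int    # assumed 1-10 capability score
    cost_per_1k: float  # assumed price per 1,000 tokens


def route(required_capability: int, models: list[Model]) -> Model:
    """Return the cheapest model whose capability meets the required bar."""
    eligible = [m for m in models if m.capability >= required_capability]
    if not eligible:
        raise ValueError(f"no model meets capability {required_capability}")
    return min(eligible, key=lambda m: m.cost_per_1k)


# Made-up catalogue of configured models.
MODELS = [
    Model("small", capability=3, cost_per_1k=0.10),
    Model("medium", capability=6, cost_per_1k=0.50),
    Model("large", capability=9, cost_per_1k=3.00),
]

if __name__ == "__main__":
    for need in (2, 5, 8):
        print(need, "->", route(need, MODELS).name)
```

In this sketch a sub-task scored at 5 goes to "medium", because "small" falls below the bar and "large" costs more. How the real Auto Router computes its scores and prices is not stated here, so check the repo for the actual behavior.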
Verify against the repo before relying on details.