explaingit

aryangonsalves/voiceclaw

Analysis updated 2026-05-18

3 stars · Python · Audience: general · Complexity: 3/5 · Setup: moderate

TLDR

An always-listening Windows voice agent that routes commands through on-device PC control, a local AI model, and cloud AI, letting you control your computer and run complex tasks by speaking.

Mindmap

mindmap
  root((VoiceClaw))
    How it works
      Wake word detection
      4-tier brain routing
      Local and cloud AI
    Tiers
      Learned cache
      Local PC control
      Ollama local model
      Claude or OpenAI agent
    Features
      Media and app control
      File management
      Web search
      Plugin system
    Setup
      Windows installer
      Portable ZIP
      Python source

What do people build with it?

USE CASE 1

Control your Windows PC hands-free using voice commands for media, apps, and browser navigation without any internet connection.

USE CASE 2

Run multi-step agent tasks by voice, such as finding files, searching the web, or clicking on-screen elements.

USE CASE 3

Add custom voice commands for your own workflows by writing a simple Python plugin file.
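The "drop a Python file into a plugins folder" pattern can be sketched as below. This is a minimal illustration of folder-based plugin discovery, not VoiceClaw's actual plugin API: the `COMMANDS` dictionary, the `load_plugins` function, the `open_docs.py` file name, and the example URL are all hypothetical. Check the repo's README for the real plugin contract.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical plugin file contents. The COMMANDS convention is an assumption
# made for this sketch, not VoiceClaw's documented interface.
EXAMPLE_PLUGIN = '''
import webbrowser

def open_docs():
    """Open a (placeholder) documentation site in the default browser."""
    webbrowser.open("https://example.com/docs")

# Map each spoken phrase to the function that handles it.
COMMANDS = {"open docs": open_docs}
'''


def load_plugins(folder):
    """Import every .py file in `folder` and merge their COMMANDS dicts."""
    commands = {}
    for path in sorted(Path(folder).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the plugin file's top-level code
        commands.update(getattr(module, "COMMANDS", {}))
    return commands


def demo():
    """Write the example plugin to a temporary folder and list its phrases."""
    with tempfile.TemporaryDirectory() as folder:
        Path(folder, "open_docs.py").write_text(EXAMPLE_PLUGIN, encoding="utf-8")
        return sorted(load_plugins(folder))
```

Calling `demo()` returns the phrases that the example plugin registers. The handler itself is only invoked when a matching phrase is dispatched, so loading a plugin does not open the browser.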

What is it built with?

Python · Windows · faster-whisper · openWakeWord · Ollama · PySide6

How does it compare?

Repo | aryangonsalves/voiceclaw | 0marildo/imago | agentlexi/agent-lexi
Stars | 3 | 3 | 3
Language | Python | Python | Python
Setup difficulty | moderate | easy | moderate
Complexity | 3/5 | 2/5 | 4/5
Audience | general | general | vibe coder

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · moderate
Time to first run · 30 min

An API key is required only for the AI agent features; local control and speech recognition work without one. The app is not code-signed and may trigger a Windows SmartScreen warning.

In plain English

VoiceClaw is a lightweight always-listening voice assistant for Windows that lets you control your computer by speaking. After saying a wake word like "Hey Jarvis" or "Alexa," you can open and close applications, scroll pages, control media playback, manage files, search the web, and hand off complex multi-step tasks to an AI agent. The app runs in the background, and while idle it runs only wake-word detection.

The system uses a four-tier approach to keep common commands instant and reserve cloud AI for genuinely difficult requests. The first tier is a learned command cache: phrases you have used before resolve immediately to their actions without checking anything else. The second tier handles simple local control commands like "open Chrome," "next video," or "volume up" with no internet and no model. The third tier routes short factual questions to a local Ollama model if one is running. The fourth tier sends complex or multi-step requests (reasoning, web lookups, file operations, or clicking on-screen elements) to a Claude or OpenAI agent using your own API key.

If you do not have an AI API key, the free tiers still cover direct PC control and on-device speech recognition. You can also switch to push-to-talk or hotkey mode if you prefer not to have a wake word running continuously.

The companion desktop app (built with PySide6) shows a dashboard with live listening status and a command tester, lets you manage wake words, microphone settings, and hotkeys, and displays a log of recent issues. A plugin system lets you add your own voice commands by dropping a Python file into a plugins folder.

VoiceClaw is available as a prebuilt Windows installer or a portable ZIP from the Releases page, or it can be run from source with Python. The README notes the app is not code-signed yet, so Windows may show a security warning on first run.
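The four-tier routing described above can be sketched as a simple fall-through: try the cheapest tier first and move to the next one only when the current tier cannot handle the phrase. This is a rough sketch under stated assumptions, not VoiceClaw's actual code. The `Router` class, the `LOCAL_COMMANDS` set, and the `is_short_question` heuristic (a question word or trailing "?" and at most 12 words) are all hypothetical names and rules invented for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical examples of phrases the local-control tier might recognize.
LOCAL_COMMANDS = {"open chrome", "next video", "volume up"}
QUESTION_WORDS = {"what", "who", "when", "where", "why", "how", "which"}


def is_short_question(text: str) -> bool:
    """Assumed heuristic: looks like a question and has at most 12 words."""
    words = text.split()
    if not words or len(words) > 12:
        return False
    return text.endswith("?") or words[0] in QUESTION_WORDS


@dataclass
class Router:
    ollama_available: bool = False   # is a local Ollama model running?
    has_api_key: bool = False        # did the user supply a Claude/OpenAI key?
    learned_cache: dict = field(default_factory=dict)  # phrase -> cached action

    def route(self, phrase: str) -> str:
        """Return the name of the tier that would handle `phrase`."""
        text = phrase.strip().lower()
        if text in self.learned_cache:        # tier 1: learned command cache
            return "cache"
        if text in LOCAL_COMMANDS:            # tier 2: local PC control
            return "local"
        if self.ollama_available and is_short_question(text):
            return "ollama"                   # tier 3: local model
        if self.has_api_key:                  # tier 4: cloud agent
            return "agent"
        return "unhandled"                    # no tier can take the request
```

With `Router(ollama_available=True, has_api_key=True)`, the phrase "Volume up" resolves at the local tier, while a multi-step request such as "find last week's invoices and email them" falls through to the agent tier. Without an API key, the same request comes back as unhandled, which mirrors the note that only the free tiers work without a key.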

Copy-paste prompts

Prompt 1
I've installed VoiceClaw on Windows with an Anthropic API key. How do I set up the wake word and test that the cloud agent tier is working?
Prompt 2
Walk me through writing a VoiceClaw plugin that adds a voice command to open a specific application or website.
Prompt 3
How does VoiceClaw's four-tier routing decide when to use the local Ollama model versus the Claude cloud agent?
Prompt 4
I want VoiceClaw to start automatically when I sign into Windows. Which command or script should I use?

Frequently asked questions

What is voiceclaw?

An always-listening Windows voice agent that routes commands through on-device PC control, a local AI model, and cloud AI, letting you control your computer and run complex tasks by speaking.

What language is voiceclaw written in?

Mainly Python. The stack also includes faster-whisper, openWakeWord, Ollama, and PySide6, and the app targets Windows.

How hard is voiceclaw to set up?

Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.

Who is voiceclaw for?

Mainly a general audience.

Verify against the repo before relying on details.