ertugrulakben/lucid

★ 0 Python Audience · developer Complexity · 4/5 Active License Setup · moderate

Mindmap

mindmap
  root((lucid))
    Inputs
      Ctrl Alt J prompt
      Screen pixels
      Reference images
    Outputs
      Mouse and keyboard actions
      Saved workflows
      Toast notifications
    Use Cases
      Answer questions about screen
      Replay named workflows
      Run scheduled desktop jobs
    Tech Stack
      Python
      Claude
      SQLite
      Windows
      uv

Things people build with this

USE CASE 1

Run a desktop AI agent that clicks and types across any Windows app

USE CASE 2

Record a repetitive task once and replay it with different variables

USE CASE 3

Schedule cron-style desktop jobs that summarise email or fill spreadsheets

USE CASE 4

Use Claude vision to answer questions about whatever is on screen

Tech stack

Python · Claude · SQLite · Windows · uv

Getting it running

Difficulty · moderate Time to first run · 30 min

Windows only, needs uv plus an Anthropic API key or an existing Claude Code CLI subscription configured in settings.yaml.

MIT license: you can use, modify, and ship it commercially as long as you keep the copyright notice.

In plain English

Lucid is a desktop assistant for Windows that lets an AI model see what is on your screen and then move the mouse, type on the keyboard, and click through programs on your behalf. It runs in the system tray, and you summon it with the keyboard shortcut Ctrl+Alt+J, which opens a small prompt bar similar in spirit to the Spotlight search box on macOS. The project is open source under the MIT license and uses Anthropic's Claude as its vision model, either through an API key or through an existing Claude Code CLI subscription.

The tool offers three modes that the user picks from the prompt bar. Answer mode looks at the current screen and replies in text, useful for questions like what an error message means or what a PDF says. Teach mode records a task once with mouse and keyboard, then saves it as a named workflow with variables so it can be replayed later with different values, for example creating an invoice for a different customer. Execute mode hands full control to Claude, which then drives the desktop using a set of around sixteen actions such as clicking, typing, dragging, focusing a window, pasting into a file dialog, or taking a screenshot to the clipboard.

Several features are aimed at making this practical rather than a demo. Claude is steered toward clicking elements by their visible label instead of guessing pixel coordinates, and a retry guard nudges it to try a different approach when it keeps clicking the same spot. There is a shell command runner limited to short read-only commands, a built-in scheduler that supports cron expressions, intervals, one-shot times, and relative delays, and a resilient mode for long tasks that extends the timeout and step budget.

Lucid also keeps per-user context. A profile file holds details like name, email, default browser, and frequent folders, none of which are committed to the repository. A small SQLite database remembers facts the assistant has learned, files it has touched, and one-sentence summaries of successful tasks, and pulls the most relevant of these back as context for new requests. There is also an opt-in captcha solver for accounts the user owns, rate-limited and disabled for scheduled jobs.

Safety controls are explicit. Password fields are detected and screenshots of blacklisted windows are suppressed. Destructive-sounding actions such as send, delete, format, or pay open a modal that asks the user to deny, allow, or allow for the session. A kill-switch shortcut stops the loop within half a second, and there is an option to send Ctrl+Z to the foreground window when the user presses stop.
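To make the retry guard described above more concrete, here is a minimal Python sketch of one way such a guard could work. It is not taken from the Lucid source: the `RetryGuard` class, its `max_repeats` threshold, and the wording of the nudge are all assumptions made for illustration.

```python
from collections import deque
from typing import Optional


class RetryGuard:
    """Hypothetical helper (not Lucid's actual code): flags repeated clicks on one target."""

    def __init__(self, max_repeats: int = 3) -> None:
        # max_repeats is an assumed threshold; the real value is unknown.
        self.max_repeats = max_repeats
        self._recent = deque(maxlen=max_repeats)

    def record_click(self, target: str) -> Optional[str]:
        """Record a click target; return a nudge if the last max_repeats clicks were identical."""
        self._recent.append(target)
        window_full = len(self._recent) == self.max_repeats
        if window_full and len(set(self._recent)) == 1:
            self._recent.clear()  # start counting afresh after nudging
            return (
                f"You clicked '{target}' {self.max_repeats} times in a row; "
                "try a different approach."
            )
        return None


guard = RetryGuard(max_repeats=3)
results = [guard.record_click(t) for t in ["Save", "Save", "Open", "Open", "Open"]]
```

In this sketch only the fifth click produces a nudge, because it is the first point at which three consecutive clicks hit the same target. A real guard would presumably compare labels or coordinates with some tolerance rather than exact strings.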
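The deny / allow / allow-for-session prompt for destructive-sounding actions can be sketched in Python in a similar spirit. Everything below is an illustrative assumption rather than Lucid's implementation: the `ConfirmationGate` class, the `Decision` enum, and the whole-word keyword match are hypothetical, and the keyword list simply reuses the four examples named above.

```python
from enum import Enum


class Decision(Enum):
    DENY = "deny"
    ALLOW = "allow"
    ALLOW_SESSION = "allow for session"


# Keywords copied from the description above; the real list may differ.
DESTRUCTIVE_WORDS = ("send", "delete", "format", "pay")


class ConfirmationGate:
    """Hypothetical gate: asks before destructive-sounding actions."""

    def __init__(self, ask) -> None:
        # `ask` stands in for the modal: callable(description) -> Decision.
        self._ask = ask
        self._session_allowed = False

    def needs_confirmation(self, description: str) -> bool:
        # Whole-word match, so "information" does not trigger on "format".
        words = (w.strip(".,!?") for w in description.lower().split())
        return any(w in DESTRUCTIVE_WORDS for w in words)

    def permit(self, description: str) -> bool:
        """Return True if the action may run, asking the user only when needed."""
        if not self.needs_confirmation(description) or self._session_allowed:
            return True
        decision = self._ask(description)
        if decision is Decision.ALLOW_SESSION:
            self._session_allowed = True
        return decision is not Decision.DENY


# Simulated user answers in place of a real modal dialog.
answers = iter([Decision.DENY, Decision.ALLOW_SESSION])
gate = ConfirmationGate(ask=lambda _description: next(answers))
outcomes = [
    gate.permit("Type the customer name"),  # no keyword, runs without asking
    gate.permit("Click Send"),              # asks, user denies
    gate.permit("Delete draft"),            # asks, user allows for the session
    gate.permit("Pay invoice"),             # session already allowed, no prompt
]
```

Passing the prompt in as a callable keeps the decision logic testable without a GUI; in a desktop app the same callable would open the modal and block until the user answers.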

Copy-paste prompts

Prompt 1

Walk me through how Lucid stores my Anthropic API key and where data/profile.yaml fits in

Prompt 2

Show me how to add a new action to the Execute mode action set in Lucid

Prompt 3

Help me write a Lucid schedule that opens Gmail every weekday at 9am and summarises unread mail

Prompt 4

Explain how the retry guard and click-by-label logic in Lucid prevents stuck loops

Prompt 5

Give me a Teach mode workflow definition for creating an invoice with customer and amount variables

Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.