Run a desktop AI agent that clicks and types across any Windows app
Record a repetitive task once and replay it with different variables
Schedule cron-style desktop jobs that summarise email or fill spreadsheets
Use Claude vision to answer questions about whatever is on screen
Windows only. Requires uv and either an Anthropic API key or an existing Claude Code CLI subscription, configured in settings.yaml.
Lucid is a desktop assistant for Windows that lets an AI model see what is on your screen and then move the mouse, type on the keyboard, and click through programs on your behalf. It runs in the system tray, and you summon it with the keyboard shortcut Ctrl+Alt+J, which opens a small prompt bar similar in spirit to the Spotlight search box on macOS. The project is open source under the MIT license and uses Anthropic's Claude as its vision model, either through an API key or through an existing Claude Code CLI subscription.

The tool offers three modes that the user picks from the prompt bar. Answer mode looks at the current screen and replies in text, useful for questions like what an error message means or what a PDF says. Teach mode records a task once with mouse and keyboard, then saves it as a named workflow with variables so it can be replayed later with different values, for example creating an invoice for a different customer. Execute mode hands full control to Claude, which then drives the desktop using a set of around sixteen actions such as clicking, typing, dragging, focusing a window, pasting into a file dialog, or taking a screenshot to the clipboard.

Several features are aimed at making this practical rather than a demo. Claude is steered toward clicking elements by their visible label instead of guessing pixel coordinates, and a retry guard nudges it to try a different approach when it keeps clicking the same spot. There is a shell command runner limited to short read-only commands, a built-in scheduler that supports cron expressions, intervals, one-shot times, and relative delays, and a resilient mode for long tasks that extends the timeout and step budget.

Lucid also keeps per-user context. A profile file holds details like name, email, default browser, and frequent folders, none of which are committed to the repository.
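
To make the retry guard mentioned above more concrete, here is a minimal Python sketch of one way such a check could work. It is not taken from Lucid's source: the class name RetryGuard, the threshold of three repeats, and the 8-pixel tolerance are all assumptions chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from math import hypot


@dataclass
class RetryGuard:
    """Flags runs of clicks that land on (nearly) the same spot.

    Hypothetical helper: names and defaults are illustrative only.
    """

    max_repeats: int = 3      # assumed threshold before nudging the model
    radius_px: float = 8.0    # assumed tolerance for "the same spot"
    _recent: deque = field(default_factory=deque, init=False, repr=False)

    def record_click(self, x: int, y: int) -> bool:
        """Record a click; return True when the model should be nudged."""
        # A click that moves away from the previous one starts a new run.
        if self._recent:
            last_x, last_y = self._recent[-1]
            if hypot(x - last_x, y - last_y) > self.radius_px:
                self._recent.clear()
        self._recent.append((x, y))
        return len(self._recent) >= self.max_repeats
```

An agent loop could call record_click before each click action and, when it returns True, add a hint to the next model prompt asking for a different approach rather than repeating the click.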
A small SQLite database remembers facts the assistant has learned, files it has touched, and one-sentence summaries of successful tasks, and pulls the most relevant of these back as context for new requests. There is also an opt-in captcha solver for accounts the user owns, rate limited and disabled for scheduled jobs.

Safety controls are explicit. Password fields are detected and screenshots of blacklisted windows are suppressed. Destructive-sounding actions such as send, delete, format, or pay open a modal that asks the user to deny, allow, or allow for the session. A kill-switch shortcut stops the loop within half a second, and there is an option to send Ctrl+Z to the foreground window when the user presses stop.
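
The deny, allow, or allow-for-the-session prompt described above can be sketched as a small gate in front of each action. This is a rough Python illustration under stated assumptions, not Lucid's implementation: ConfirmationGate, Decision, and the ask_user callback are hypothetical names, the keyword list is copied from the four examples in the description, and "allow for the session" is assumed to cover all later destructive actions.

```python
from enum import Enum
from typing import Callable


class Decision(Enum):
    DENY = "deny"
    ALLOW = "allow"
    ALLOW_SESSION = "allow_session"


# Keywords from the description above; the real trigger list may differ.
DESTRUCTIVE_WORDS = ("send", "delete", "format", "pay")


class ConfirmationGate:
    """Hypothetical gate that asks before destructive-sounding actions."""

    def __init__(self, ask_user: Callable[[str], Decision]) -> None:
        self._ask_user = ask_user        # stands in for the modal dialog
        self._session_allowed = False    # set by "allow for the session"

    def is_destructive(self, label: str) -> bool:
        # Match whole words so that "information" does not trigger "format".
        words = (word.strip(".,:;!?") for word in label.lower().split())
        return any(word in DESTRUCTIVE_WORDS for word in words)

    def permits(self, label: str) -> bool:
        """Return True if the action described by `label` may proceed."""
        if not self.is_destructive(label) or self._session_allowed:
            return True
        decision = self._ask_user(label)
        if decision is Decision.ALLOW_SESSION:
            self._session_allowed = True
        return decision is not Decision.DENY
```

Passing the prompt in as a callback keeps the decision logic separate from the user interface, so the same gate can be exercised in tests with a stub instead of a real modal.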
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.