Long-press the notch companion and dictate a task that Claude Haiku then performs by driving the mouse and keyboard in Slack, Discord, or Figma.
Use the background observer to build a per-app memory of UI element positions so future agent runs do not relearn the same interface.
Open the developer window with Command-Shift-I to inspect the live observation stream, every API request, and the action timeline.
Fork the project as a starting point for your own desktop AI agent that needs Accessibility, Screen Recording, and Microphone permissions.
Requires an M-series MacBook with a notch, macOS 14 or newer, a local build in Xcode, four API keys (Anthropic, Google, OpenAI, OpenRouter), and three sensitive system permissions.
Agent Notch is a macOS app that lives in the notch at the top of a MacBook screen. The idea is that you long-press a small cursor companion, speak what you want done, and an AI model called Claude Haiku 4.5 takes over the mouse and keyboard to carry it out. The notch itself shows what the agent is doing while it works.
It only runs on M-series MacBooks that have a physical notch, on macOS 14 or newer. To install it you use a few command-line tools (Homebrew, XcodeGen, a signing script), then open the project in Xcode and build it. You need four API keys: one from Anthropic for the Claude model, one from Google for Gemini, one from OpenAI for voice transcription and text-to-speech, and one from OpenRouter for a context-selection model called Mercury 2. The app also asks for three macOS permissions on first run: Accessibility, so it can detect the long-press and send clicks; Screen Recording, so it can see what is on your screen; and Microphone, so it can hear you.
A central piece of the project is the context system. Two things run in parallel. A background observer watches your screen and builds up a memory of where buttons live in apps like Slack, Discord, and Figma, so the agent does not have to relearn an interface every time. A foreground path triggers when you long-press: it grabs a fast snapshot of the current app, your selection, your clipboard, and your cursor position, then asks Mercury 2 to summarize it all into a short brief. The brief is passed to Claude before any action is taken, and references like 'her' or 'that doc' are resolved to concrete things first.
Privacy is built into the architecture. Password managers, secure input fields, and credentials in URLs are never logged. A single kill switch pauses all collection.
A developer window (Command-Shift-I) shows the live observation stream, the per-app memory, every request and response, and the full timeline of what the agent did after each long-press.
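The background observer's per-app memory and the privacy rules described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: every type, property, and function name here (`Observation`, `AppMemory`, `strippingCredentials`, the `com.example.*` bundle identifiers) is hypothetical, and the sketch assumes "credentials in URLs" means the `user:password@` part of a URL.

```swift
import Foundation

// Hypothetical sketch. None of these names come from the Agent Notch source.

/// Screen position of a UI element, in points.
struct ElementPosition: Equatable {
    var x: Double
    var y: Double
}

/// One thing the background observer saw.
struct Observation {
    var bundleID: String          // app that owns the element
    var elementLabel: String      // for example, a "Send" button
    var position: ElementPosition
    var isSecureInput: Bool       // true while a secure text field has focus
    var url: URL?                 // page URL, if any
}

/// Returns the URL without its user:password part, or nil if it cannot be
/// parsed (dropping the URL is safer than logging it unchanged).
func strippingCredentials(from url: URL) -> URL? {
    guard var parts = URLComponents(url: url, resolvingAgainstBaseURL: false) else {
        return nil
    }
    parts.user = nil
    parts.password = nil
    return parts.url
}

/// Per-app memory of element positions, guarded by the privacy rules.
final class AppMemory {
    var isPaused = false                          // the kill switch
    let excludedBundleIDs: Set<String>            // e.g. password managers
    private(set) var positions: [String: [String: ElementPosition]] = [:]
    private(set) var loggedURLs: [URL] = []

    init(excludedBundleIDs: Set<String>) {
        self.excludedBundleIDs = excludedBundleIDs
    }

    /// Stores the observation unless a privacy rule forbids it.
    /// Returns true if something was recorded.
    @discardableResult
    func record(_ obs: Observation) -> Bool {
        guard !isPaused,
              !obs.isSecureInput,
              !excludedBundleIDs.contains(obs.bundleID) else {
            return false
        }
        positions[obs.bundleID, default: [:]][obs.elementLabel] = obs.position
        if let url = obs.url, let clean = strippingCredentials(from: url) {
            loggedURLs.append(clean)
        }
        return true
    }

    /// Looks up a remembered element so a later run need not relearn it.
    func position(of label: String, in bundleID: String) -> ElementPosition? {
        positions[bundleID]?[label]
    }
}

// Usage with placeholder bundle identifiers.
let memory = AppMemory(excludedBundleIDs: ["com.example.vault"])
memory.record(Observation(
    bundleID: "com.example.chat",
    elementLabel: "Send",
    position: ElementPosition(x: 912, y: 640),
    isSecureInput: false,
    url: URL(string: "https://alice:secret@example.com/doc")
))
```

Keeping every privacy check in one `guard` at the single write path means the kill switch and the exclusions cannot be bypassed by a caller that forgets to check them. A real implementation would likely also persist the memory and handle tokens in query strings, which this sketch does not attempt.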
The project is written in Swift and SwiftUI, was built at TritonHacks 2026, and is released under the MIT license.
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.