Search your screen history to recall something you saw last week without remembering which app it was in.
Automatically transcribe and summarize your video calls from Zoom, Teams, or Google Meet.
Ask in plain English what you were working on at a specific time and get an answer grounded in your actual screen history.
Record voice memos with a hotkey and have them transcribed and stored alongside your screen activity.
Requires a GPU with at least 4 GB VRAM to run the local Gemma 4 model at a usable speed.
ScreenMind is a Python application that records your screen activity and turns it into a searchable, conversational memory, all running on your own computer without any cloud connection. Think of it as a private alternative to Microsoft Recall: every time your screen changes, ScreenMind captures a screenshot, runs it through a locally installed AI model called Gemma 4, and stores a structured description of what was on screen, what app you were using, and what you were doing.

The AI model handles three types of input. For screenshots it reads the visual content and produces a detailed summary including the app name, your activity category, and a description of every visible element. For audio it can transcribe voice memos you record with a hotkey, and it also auto-detects video calls (Zoom, Teams, Google Meet) to transcribe and summarize those meetings automatically. For reasoning tasks it generates daily summaries and powers a chat interface where you can ask questions like "what did that person say on Discord yesterday" and get back answers grounded in your actual screen history.

Searching your history uses two methods at once: a semantic search that finds content by meaning rather than exact keywords, and a full-text keyword search. You can also browse a timeline view, replay a timelapse of your entire day, and view an analytics dashboard showing which apps you used most and how your time was distributed.

All data stays on your machine. ScreenMind filters out sensitive information such as credit card numbers, API keys, and passwords before saving anything. Screenshots are encrypted on disk. You can set a PIN to lock the dashboard and switch on an incognito mode to pause recording entirely.

The project also includes an agent platform where you can write short automation scripts in plain text or Python, a server that exposes your screen history to AI coding tools like Claude Desktop and Cursor, and optional integrations with Obsidian and Notion.
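The filtering of credit card numbers, API keys, and passwords mentioned above could look roughly like the sketch below. This is an illustration under stated assumptions, not ScreenMind's actual code: the function name `redact`, the regular expressions, the key prefixes, and the placeholder strings are all hypothetical, and the real project may detect sensitive data in a different way.

```python
import re

# Hypothetical patterns for illustration only; not taken from ScreenMind.
API_KEY_RE = re.compile(r"\b(?:sk-|ghp_|AKIA)[A-Za-z0-9_-]{16,}")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
PASSWORD_RE = re.compile(r"\b(password|passwd|pwd)(\s*[:=]\s*)\S+", re.IGNORECASE)


def _luhn_ok(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def _mask_card(match):
    """Mask a digit run only when it looks like a valid card number."""
    digits = re.sub(r"\D", "", match.group(0))
    return "[REDACTED_CARD]" if _luhn_ok(digits) else match.group(0)


def redact(text):
    """Mask likely secrets in text before it is written to storage."""
    # Keys first, so digit runs inside a key are not tested as card numbers.
    text = API_KEY_RE.sub("[REDACTED_KEY]", text)
    text = CARD_RE.sub(_mask_card, text)
    # Keep the label and separator, replace only the value.
    text = PASSWORD_RE.sub(lambda m: m.group(1) + m.group(2) + "[REDACTED]", text)
    return text


if __name__ == "__main__":
    print(redact("card 4111 1111 1111 1111, password: hunter2"))
```

The Luhn check is a design choice to cut false positives: long digit runs such as order numbers are left alone unless they also pass the card checksum. Pattern-based filtering of this kind is best-effort and will miss secrets that do not match a known shape.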
Setup requires Python 3.10 or later, about 5 GB of disk space for the AI model, and a graphics card with at least 4 GB of memory to run analysis at a reasonable speed.
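Two of the requirements above can be checked before installing with a few lines of standard-library Python. This is a hypothetical preflight sketch, not a script shipped with the project: the function name `preflight` and the constants are assumptions, and the GPU memory requirement is deliberately not checked because that would need a third-party library.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)   # from the stated requirement
MODEL_DISK_GB = 5      # approximate model size from the stated requirement


def preflight(path=".", min_python=MIN_PYTHON, disk_gb=MODEL_DISK_GB):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < min_python:
        found = ".".join(str(n) for n in sys.version_info[:3])
        wanted = ".".join(str(n) for n in min_python)
        problems.append(f"Python {wanted}+ required, found {found}")
    # Free space on the filesystem that contains `path`, in GiB.
    free_gb = shutil.disk_usage(path).free / 1024**3
    if free_gb < disk_gb:
        problems.append(f"about {disk_gb} GB of free disk needed, found {free_gb:.1f} GB")
    return problems


if __name__ == "__main__":
    issues = preflight()
    print("\n".join(issues) if issues else "Python version and disk space look fine.")
```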
Verify against the repo before relying on details.