Build an AI agent that browses the web by reading tab state from markdown files and writing click or type commands, without any browser SDK
Automate multi-step web tasks like logging in, filling forms, and clicking buttons using a plain text loop any LLM can drive
Record browser sessions as video clips while an AI agent performs tasks, for debugging or auditing the agent's decisions
Requires Node.js and npm to install, plus a Chrome browser. There is no graphical interface; everything runs via CLI commands and plain text files.
This project, called preprint, is a tool that lets AI agents control a web browser by reading and writing plain text files instead of learning complex browser automation protocols. The core idea is that a background program (a daemon) runs a real Chrome browser instance and continuously writes the current state of each open tab into a markdown file. The agent reads that file to see what is on screen, then appends a single action command at the bottom of the file. The daemon carries out that action in the live browser and rewrites the file with the new state, all within about a second.
Setting it up involves installing the package through npm and running a command that opens a web page inside a managed Chrome window. A folder called "preprint" appears in your working directory with one file per tab. Each tab file shows the page content as a simplified tree of clickable elements, along with the last action taken and any console output from the page. A second file tracks what changed since the previous snapshot.
The actions an agent can trigger include clicking a button (referenced by a short code from the page file), typing text into a field, pressing a key, scrolling the page, navigating to a different URL, taking a screenshot, or recording a video clip. After each action the agent re-reads the updated tab file before deciding the next step. Screenshots and recordings are saved to a separate artifacts folder and persist even after the tab is closed.
The tool supports multiple browser sessions with different identities. You can use your own Chrome profile, a named separate profile, or a completely clean browser with no stored login data. Sessions stay open until you explicitly close them or run a stop command that shuts everything down.
This is a command-line developer tool, written in Rust, aimed at programmers building AI agents that need to interact with live web pages. It is not a hosted service and has no graphical interface.
With 17 stars on GitHub, it appears to be a new or early-stage experiment.
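
The read, append, re-read loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not preprint's documented interface: the tab file path, the action text (for example `click a1`), and the `decide` callback standing in for the LLM are all hypothetical, and the real file names and command syntax should be taken from the repo.

```python
import time
from pathlib import Path
from typing import Callable, Optional


def read_state(tab_file: Path) -> str:
    """Return the snapshot currently stored in the tab file."""
    return tab_file.read_text(encoding="utf-8")


def append_action(tab_file: Path, command: str) -> str:
    """Append one action line at the bottom of the tab file.

    Returns the content we expect the file to hold right after the append,
    so the caller can tell when the daemon has replaced it.
    """
    before = read_state(tab_file)
    appended = "\n" + command.strip() + "\n"
    with tab_file.open("a", encoding="utf-8") as fh:
        fh.write(appended)
    return before + appended


def wait_for_rewrite(tab_file: Path, submitted: str,
                     timeout: float = 5.0, poll: float = 0.1) -> Optional[str]:
    """Poll until the file differs from what we submitted.

    Returns the new snapshot, or None if nothing changed before the timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = read_state(tab_file)
        if current != submitted:
            return current
        time.sleep(poll)
    return None


def run_agent(tab_file: Path, decide: Callable[[str], Optional[str]],
              max_steps: int = 10) -> int:
    """Read state, ask `decide` for a command, append it, re-read, repeat.

    `decide` is a placeholder for the LLM call; it returns None when done.
    Returns the number of actions submitted.
    """
    steps = 0
    state = read_state(tab_file)
    while steps < max_steps:
        command = decide(state)
        if command is None:
            break
        submitted = append_action(tab_file, command)
        steps += 1
        new_state = wait_for_rewrite(tab_file, submitted)
        if new_state is None:
            break  # no rewrite seen in time; stop rather than act on stale state
        state = new_state
    return steps
```

An agent would call something like `run_agent(Path("preprint/<tab file>"), decide)`, where `decide` sends the snapshot to a model and returns the next action line. Comparing file contents rather than timestamps keeps the sketch portable, at the cost of re-reading the file on every poll.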
Verify against the repo before relying on details.