Build an agent that logs in to a site and pulls structured records into JSON
Drive a single-page web app from Claude or another LLM by describing clicks in natural language
Convert any page, including iframe and shadow DOM content, to clean markdown for an agent
Run a long-lived persistent profile so the agent stays logged in across sessions
Needs Node.js 20 or newer, a Chrome or Chromium binary, and TypeScript 5 or newer; the patched-Chromium backend is optional and still only planned.
AgenticBrowser is a browser runtime built so AI agents can use the web the way a person does. The README's framing is direct: if a human can access a page, the agent should be able to access it too. The tool opens any URL, including pages built with single-page JavaScript frameworks, iframes, shadow DOM, and lazy-loaded content, and it tries to deal with anti-bot challenges from services like Cloudflare, reCAPTCHA, and hCaptcha without a human stepping in.

Once a page is open, the agent can do a small set of high-level actions. It can read the page as clean markdown, click or type by describing the target in plain language (for example, "click the login button"), pull structured data out of a page using a JSON schema, and verify goals such as whether the user is logged in. If access fails, a recover step tries alternatives like reader mode, archive snapshots, or print views. Profiles can be saved on disk so the agent stays logged in between sessions.

The project is layered. A command router sits above a state machine that classifies the page, a challenge solver, content extraction, and a behavior layer that models pointer movement, typing cadence, and scrolling on real human telemetry. Underneath that is a stealth engine that uses CDP-level overrides and per-launch fingerprint seeds to keep values like timezone, locale, GPU vendor, and Client Hints brands consistent across a session. There is also a planned patched-Chromium backend in runtime-binary that modifies fingerprint surfaces at the C++ and V8 level, which the runtime would pick up automatically when present.

You can run it three ways. As an MCP server it exposes tools such as browser_open, browser_read, browser_act, browser_extract, and browser_verify, which other agent frameworks can call. As a CLI you run commands like open, read, act, and extract directly. As a TypeScript SDK you import functions like openUrl, readContent, and actOnPage, with optional config for proxy, geo, persistent profile, and a behavior preset.

Requirements are Node.js 20 or newer, Chrome or Chromium, and TypeScript 5 or newer. The license is MIT.
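
For the MCP route, a client reaches each tool through the protocol's standard `tools/call` JSON-RPC request. The sketch below only builds such request payloads and sends nothing. The tool names come from the description above, but the argument names (`url`, `instruction`) and the `toolCall` helper are assumptions for illustration, not the server's confirmed input schema; check what the server advertises via `tools/list` before relying on them.

```typescript
// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: build one tool invocation; ids increase with each call.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// ASSUMPTION: `url` and `instruction` are illustrative argument names only.
const openRequest = toolCall("browser_open", { url: "https://example.com/login" });
const actRequest = toolCall("browser_act", { instruction: "click the login button" });

console.log(JSON.stringify(openRequest));
console.log(JSON.stringify(actRequest));
```

An MCP client library would normally build and send these messages for you; writing them out by hand is mainly useful for seeing what an agent framework passes to the server.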
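
The per-launch fingerprint seed idea can be illustrated with a small deterministic generator: one seed is chosen at launch and every fingerprint value is derived from it, so repeated reads within a session agree. This is a sketch of the general technique, not AgenticBrowser's implementation. The `Fingerprint` type, `deriveFingerprint`, `pick`, the use of the mulberry32 generator, and the candidate value lists are all hypothetical.

```typescript
// Hypothetical type for illustration; not the project's own data model.
interface Fingerprint {
  timezone: string;
  locale: string;
  gpuVendor: string;
}

// Small seeded PRNG (mulberry32): the same seed always yields the same sequence.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Timezone and locale are picked as a pair so they never contradict each other.
const REGIONS = [
  { timezone: "America/New_York", locale: "en-US" },
  { timezone: "Europe/Berlin", locale: "de-DE" },
  { timezone: "Asia/Tokyo", locale: "ja-JP" },
];
const GPU_VENDORS = ["Intel Inc.", "NVIDIA Corporation", "AMD"];

// Choose one element using the seeded generator instead of Math.random().
function pick<T>(items: readonly T[], rand: () => number): T {
  return items[Math.floor(rand() * items.length)];
}

// Derive every fingerprint value from a single seed.
function deriveFingerprint(seed: number): Fingerprint {
  const rand = mulberry32(seed);
  const region = pick(REGIONS, rand);
  return { ...region, gpuVendor: pick(GPU_VENDORS, rand) };
}

const launchSeed = 123456789; // assumed to be chosen once per browser launch
const first = deriveFingerprint(launchSeed);
const second = deriveFingerprint(launchSeed);
console.log(first, JSON.stringify(first) === JSON.stringify(second));
```

Deriving everything from one seed, rather than calling `Math.random()` per value, is what makes the values stable for the whole session while still varying between launches.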
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.