explaingit

vercel-labs/agent-browser

Analysis updated 2026-05-18

Stars: 31,927 · Language: Rust · Audience: developer · Complexity: 3/5 · Setup: moderate

TLDR

A command-line tool that lets AI agents control a web browser (clicking buttons, filling forms, taking screenshots) through simple text commands.

Mindmap

mindmap
  root((agent-browser))
    What it does
      Control browser via CLI
      Click buttons and links
      Fill forms automatically
      Take screenshots
    How it works
      Launches Chrome browser
      Accessibility tree snapshots
      Element reference IDs
      Natural language mode
    Use cases
      AI web automation
      Form filling workflows
      Content scraping
      Repetitive web tasks
    Tech stack
      Rust core
      Chrome for Testing
      npm distribution
    Audience
      AI engineers
      Automation builders
      Vibe coders

What do people build with it?

USE CASE 1

Build an AI agent that autonomously fills out web forms and submits them without human intervention.

USE CASE 2

Automate repetitive web tasks like logging in, navigating pages, and extracting data from multiple websites.

USE CASE 3

Create a chatbot that can browse the web, read page content, and answer questions about what it finds.

USE CASE 4

Scrape dynamic web content by having an agent click through pages and capture screenshots or text.

What is it built with?

Rust · Chrome for Testing · Node.js · npm · Cargo

How does it compare?

                   vercel-labs/agent-browser   tokio-rs/tokio   surrealdb/surrealdb
Stars              31,927                      31,894           32,036
Language           Rust                        Rust             Rust
Setup difficulty   moderate                    moderate         moderate
Complexity         3/5                         3/5              3/5
Audience           developer                   developer        developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty: moderate · Time to first run: 30 min

Requires downloading the Chrome for Testing binary and compiling the Rust code with Cargo.
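
Before starting the setup, it can help to confirm that the build tools are on your PATH. The following is a minimal sketch, not part of agent-browser itself: the executable names `cargo` and `npm` are assumptions inferred from the stack listed on this page, and `missing_tools` is a hypothetical helper.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed executable names, inferred from the stack listed on this page.
# Adjust them to match the install route you actually choose.
REQUIRED_TOOLS = ("cargo", "npm")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    lookup: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that `lookup` cannot find on PATH."""
    return [tool for tool in tools if lookup(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing on PATH:", ", ".join(missing))
    else:
        print("All assumed prerequisites found.")
```

The `lookup` parameter defaults to `shutil.which` and exists only so the check can be exercised without touching the real PATH.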

Use freely for any purpose, including commercial use. Keep the license notice and disclose your changes; a patent grant is included.

In plain English

Agent-browser is a command-line tool that lets AI agents control a web browser programmatically: opening pages, clicking buttons, filling in forms, taking screenshots, and extracting information, all from simple text commands in a terminal. It was built by Vercel Labs specifically to power automated browser tasks inside AI-driven workflows.

The problem it solves is that AI agents often need to interact with the web just as a human would: navigating to a URL, reading the page content, clicking a link, or submitting a form. Most existing browser automation tools are designed for software testing and can be heavy or slow. Agent-browser is designed to be fast and lightweight, which suits AI pipelines where the agent issues many browser commands in sequence.

It works by launching a Chrome browser in the background (using Google's official "Chrome for Testing" channel) and exposing a set of command-line instructions to control it, such as "click this element", "fill this input field", "take a screenshot", or "get the text of this element". The tool can identify elements by reference IDs taken from an accessibility tree snapshot, a structured representation of everything visible on the page. This is particularly useful for AI agents that reason about page structure rather than pixel positions. It also supports natural-language commands through a built-in AI chat mode.

You would use this tool when building an AI agent that needs to browse the web, fill out forms, scrape content, or automate repetitive web tasks. The core binary is written in Rust for performance, and it is distributed via npm, Homebrew, or Cargo (Rust's package manager).
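
The reference-ID workflow described above can be sketched as follows. This is an illustration under stated assumptions, not the tool's documented interface: the snapshot text format (including the `[ref=...]` marker), the subcommand names `fill`, `click`, and `screenshot`, and the `@` prefix on references are all assumed here, and `parse_refs`, `build_command`, and `plan_sign_in` are hypothetical helpers. The code only builds argument lists; it never launches a browser.

```python
import re
import shlex
from typing import Dict, List

# Hypothetical snapshot text. The real output format may differ; the
# `[ref=...]` marker and the role/name layout are assumptions.
SAMPLE_SNAPSHOT = """\
- heading "Sign in" [ref=e1]
- textbox "Email" [ref=e2]
- button "Continue" [ref=e3]
"""

# Matches lines such as: - button "Continue" [ref=e3]
LINE_RE = re.compile(
    r'^\s*-\s+(?P<role>\w+)\s+"(?P<name>[^"]*)"\s+\[ref=(?P<ref>\w+)\]'
)


def parse_refs(snapshot: str) -> Dict[str, str]:
    """Map 'role:name' to a reference ID for each recognised line."""
    refs: Dict[str, str] = {}
    for line in snapshot.splitlines():
        match = LINE_RE.match(line)
        if match:
            refs[f"{match['role']}:{match['name']}"] = match["ref"]
    return refs


def build_command(subcommand: str, *args: str) -> List[str]:
    """Return an argv list for one CLI call; nothing is executed here."""
    # Assumption: the executable is named "agent-browser".
    return ["agent-browser", subcommand, *args]


def plan_sign_in(snapshot: str, email: str) -> List[List[str]]:
    """Build the command sequence for filling a field and clicking a button."""
    refs = parse_refs(snapshot)
    return [
        build_command("fill", "@" + refs["textbox:Email"], email),
        build_command("click", "@" + refs["button:Continue"]),
        build_command("screenshot", "after-click.png"),
    ]


if __name__ == "__main__":
    for cmd in plan_sign_in(SAMPLE_SNAPSHOT, "ada@example.com"):
        print(shlex.join(cmd))  # shell-quoted form, ready to inspect
```

Addressing elements by reference rather than by screen coordinates means an agent can choose a target from structured text alone. Check the tool's own help output for the real command syntax before running anything built this way.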

Copy-paste prompts

Prompt 1
How do I set up agent-browser to let my AI agent click buttons and fill forms on a website?
Prompt 2
Show me how to use the accessibility tree snapshot feature to identify page elements for my automation script.
Prompt 3
How can I integrate agent-browser into my Node.js project to automate web scraping tasks?
Prompt 4
What's the best way to use agent-browser's natural language mode to let an AI agent understand and interact with web pages?
Prompt 5
How do I take screenshots and extract text from a webpage using agent-browser commands?

Frequently asked questions

What is agent-browser?

A command-line tool that lets AI agents control a web browser (clicking buttons, filling forms, taking screenshots) through simple text commands.

What language is agent-browser written in?

Mainly Rust. The stack also includes Chrome for Testing, Node.js, npm, and Cargo.

What license does agent-browser use?

Use freely for any purpose, including commercial use. Keep the license notice and disclose your changes; a patent grant is included.

How hard is agent-browser to set up?

Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.

Who is agent-browser for?

Mainly developers.

Verify against the repo before relying on details.