explaingit

autoscrape-labs/pydoll

6,841 · Python · Audience · developer · Complexity · 2/5 · Setup · easy

TLDR

A Python library for automating Chrome and Edge browsers that skips the WebDriver middleman, adds human-like mouse and typing behavior to dodge bot detection, and extracts structured data via typed schema classes.

Mindmap

mindmap
  root((Pydoll))
    Browser Control
      Chrome and Edge
      No WebDriver needed
      Async operations
    Human Behavior
      Curved mouse paths
      Variable typing speed
    Data Extraction
      Schema classes
      CSS selectors
      Shadow DOM support
    Network Control
      Request interception
      Cookie session reuse


Things people build with this

USE CASE 1

Scrape a website that blocks Selenium by using human-like movements and no detectable WebDriver signature

USE CASE 2

Define a typed Python schema class with CSS selectors to extract structured data from a page in a single call

USE CASE 3

Intercept browser network traffic to discover hidden API endpoints used by a website

USE CASE 4

Automate interactions inside shadow DOM elements or cross-origin iframes that other tools cannot reach
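The "human-like movements" in use case 1 can be pictured with a small, library-independent sketch. It is not Pydoll's implementation. It only illustrates the general idea of a curved path with small random noise and variable timing, using a cubic Bezier curve. The function names (`human_like_path`, `step_delays`) and every parameter are hypothetical, chosen for illustration.

```python
import random


def bezier_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)


def human_like_path(start, end, steps=25, curve=0.3, jitter=1.5, rng=None):
    """Return steps + 1 points from start to end along a randomly bent curve.

    Hypothetical helper: `curve` scales how far the control points are pushed
    sideways, `jitter` is the maximum per-point noise in pixels.
    """
    rng = rng or random.Random()
    dx, dy = end[0] - start[0], end[1] - start[1]

    def control(frac):
        # Move along the straight line, then offset perpendicular to it.
        offset = rng.uniform(-curve, curve)
        return (start[0] + dx * frac - dy * offset,
                start[1] + dy * frac + dx * offset)

    c1, c2 = control(1 / 3), control(2 / 3)
    path = []
    for i in range(steps + 1):
        x, y = bezier_point(start, c1, c2, end, i / steps)
        if 0 < i < steps:  # keep the two endpoints exact
            x += rng.uniform(-jitter, jitter)
            y += rng.uniform(-jitter, jitter)
        path.append((x, y))
    return path


def step_delays(steps, base=0.01, spread=0.6, rng=None):
    """Return one pause (in seconds) per step, varied around `base`."""
    rng = rng or random.Random()
    return [base * rng.uniform(1 - spread, 1 + spread) for _ in range(steps)]


path = human_like_path((0, 0), (300, 200), steps=20, rng=random.Random(1))
delays = step_delays(20, rng=random.Random(2))
```

An automation tool would then move the pointer through `path`, sleeping for the matching entry in `delays` between points. How Pydoll actually generates its paths should be checked against the repo.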

Tech stack

Python · Chrome DevTools Protocol · asyncio

Getting it running

Difficulty · easy · Time to first run · 30 min

Requires Chrome or Edge browser installed on the system.

In plain English

Pydoll is a Python library for controlling Chrome or Edge browsers automatically. Unlike WebDriver-based tools such as Selenium, it talks directly to the browser over the Chrome DevTools Protocol rather than going through a separate WebDriver program. This means there is no extra driver binary to install, and the WebDriver signature that websites check to identify automated traffic is absent.

The library is built around Python's asyncio, which lets it run many browser operations concurrently without blocking. It is fully type-annotated, so your code editor can offer autocompletion and a type checker can catch mistakes before you run anything. Mouse movements and typing mimic human behavior by default, using curved movement paths, variable timing, and small random noise, which makes automated browsing harder for anti-bot systems to spot.

For extracting data from web pages, Pydoll offers a structured approach: you define a Python class describing the fields you want (such as title, author, or tags), point each field at a CSS selector, and call a single extract method. The result is a typed Python object with all the values filled in, rather than raw HTML strings you have to parse yourself. This also works inside shadow DOM elements and cross-origin iframes, which other automation tools often cannot reach.

Network-level control is built in as well. You can intercept outgoing requests to block ads or trackers, monitor traffic to find hidden API endpoints, or make HTTP requests that carry the browser's current cookies and session state.

Installation is a single pip command, with no separate driver to download. The README includes working code examples for navigation, interaction, and data extraction, and a separate documentation site covers all features in detail. The project is open source and sponsored by several web scraping service providers.
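The schema-based extraction described above follows a general pattern that can be sketched with the standard library alone. This is not Pydoll's API. The `selector` helper, the `extract` function, and the `FAKE_PAGE` stand-in for a real page are all hypothetical, and the real class and method names should be taken from the repo's README. The sketch only shows how a typed class whose fields carry CSS selectors can be filled in by one call.

```python
from dataclasses import dataclass, field, fields
from typing import Callable, Dict, List, Sequence


def selector(css: str, many: bool = False):
    """Hypothetical helper: attach a CSS selector to a dataclass field."""
    return field(metadata={"css": css, "many": many})


@dataclass
class Article:
    # Each field declares its type and where on the page its value lives.
    title: str = selector("h1.title")
    author: str = selector(".byline .author")
    tags: List[str] = selector(".tags li", many=True)


def extract(model, query_all: Callable[[str], Sequence[str]]):
    """Build a `model` instance by running each field's selector.

    `query_all` stands in for the browser: it takes a CSS selector and
    returns the text of every matching element.
    """
    values = {}
    for f in fields(model):
        matches = query_all(f.metadata["css"])
        if f.metadata["many"]:
            values[f.name] = list(matches)
        elif matches:
            values[f.name] = matches[0]
        else:
            raise LookupError(f"no element matches {f.metadata['css']!r}")
    return model(**values)


# Stand-in for a rendered page: selector -> text of matching elements.
FAKE_PAGE: Dict[str, List[str]] = {
    "h1.title": ["Browser automation without WebDriver"],
    ".byline .author": ["A. Writer"],
    ".tags li": ["python", "scraping"],
}

article = extract(Article, lambda css: FAKE_PAGE.get(css, []))
```

The benefit of this shape is that the caller gets `article.title` and `article.tags` as typed attributes instead of walking raw HTML. A real implementation would replace the lambda with queries against the live page.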

Copy-paste prompts

Prompt 1
Using pydoll, write an async Python script that opens Chrome, navigates to a product listing page, and extracts title, price, and link for each item using a structured schema class.
Prompt 2
Show me how to use pydoll network interception to block all ad and analytics tracker requests while scraping a site.
Prompt 3
I need to log into a website with pydoll using human-like typing. Show me how to fill username and password fields with realistic delays and cursor movement.
Prompt 4
Define a pydoll schema class to extract article title, author, publication date, and tags from a news website using CSS selectors.

Verify against the repo before relying on details.