omkarcloud/botasaurus

★ 4,557 | Python | Audience · developer | Complexity · 3/5 | Setup · easy

Mindmap

mindmap
  root((botasaurus))
    What it does
      Bypass bot detection
      Scrape websites
      Cache results
    Anti-Detection
      Realistic mouse moves
      Browser masking
      Proxy rotation
    Delivery Modes
      Desktop app
      Web interface
      Kubernetes scale
    Tech
      Python
      Browser automation
      HTTP requests


Things people build with this

USE CASE 1

Scrape websites protected by Cloudflare or Datadome using anti-detection browser automation.

USE CASE 2

Package a finished scraper as a desktop app or web interface so non-technical clients can run it themselves.

USE CASE 3

Cut proxy costs by routing some requests directly through the browser instead of an external proxy service.

USE CASE 4

Run parallel scraping tasks with result caching to avoid re-fetching pages you have already processed.

Tech stack

Python · Selenium · Playwright · Kubernetes

Getting it running

Difficulty · easy | Time to first run · 30min

Install via pip. Some anti-detection features require a compatible browser driver to be present.

In plain English

Botasaurus is a Python framework for building web scrapers that can get past bot-detection systems. Many websites use tools like Cloudflare, Fingerprint, and Datadome to identify and block automated traffic. Botasaurus is specifically designed to avoid triggering these systems by making the browser behave more like a real human, including realistic mouse movements and other signals that detection tools look for.

The framework wraps around browser automation (similar to Selenium or Playwright) and HTTP request libraries, but adds a layer that masks the signs of automation. You write a Python function, add a decorator like @browser or @request to it, and Botasaurus handles the rest: launching the browser with anti-detection settings, managing browser profiles, rotating proxies, and saving results automatically as JSON files. A basic scraper that opens a page and extracts a heading can be written in about ten lines of code.

Beyond detection bypassing, Botasaurus includes features aimed at reducing the cost and complexity of scraping at scale. It claims to cut proxy costs by up to 97% by sending some requests directly from within the browser rather than routing all traffic through an external proxy server. It also supports running multiple scraping tasks in parallel, caching results to avoid refetching the same data, and distributing work across multiple machines using Kubernetes.

One notable feature is the ability to package a finished scraper as a desktop application for Windows, macOS, or Linux, or to turn it into a web interface that non-technical users can operate through a browser. This is aimed at developers who build scraping tools for clients or customers who do not want to use a command line.

Installation is through pip. The project is split across several packages (botasaurus, botasaurus-api, botasaurus-driver, botasaurus-server, and others) that can be installed and upgraded together. The full README is longer than what was shown.
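To make the "write a function, add a decorator, results are saved as JSON" workflow concrete, here is a rough standard-library sketch of that pattern. It is not Botasaurus's API: `json_task` and `scrape_heading` are hypothetical names, the `output` directory is an assumed default, and the heading is extracted from a literal HTML string rather than a live browser.

```python
import functools
import json
from pathlib import Path


def json_task(output_dir="output"):
    """Hypothetical decorator: run the function, then save its result as JSON."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            out = Path(output_dir)
            out.mkdir(parents=True, exist_ok=True)
            # File name is derived from the function name,
            # e.g. output/scrape_heading.json
            path = out / f"{func.__name__}.json"
            path.write_text(json.dumps(result, indent=2))
            return result
        return wrapper
    return decorate


@json_task()
def scrape_heading(html):
    """Toy extraction: return the text inside the first <h1> element."""
    start = html.index("<h1>") + len("<h1>")
    end = html.index("</h1>", start)
    return {"heading": html[start:end]}
```

Calling `scrape_heading("<body><h1>Hello</h1></body>")` returns the extracted dictionary and also writes it to `output/scrape_heading.json`. In the real framework the decorated function would receive a browser driver or request object instead of a raw HTML string; check the project README for the actual signatures.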

Copy-paste prompts

Prompt 1

Write a Botasaurus scraper using the @browser decorator to extract product prices from a Cloudflare-protected e-commerce site.

Prompt 2

How do I configure proxy rotation in Botasaurus so I can scrape at scale without getting blocked?

Prompt 3

Show me how to package my Botasaurus scraper as a web interface a client can use without touching the command line.

Prompt 4

How do I use Botasaurus caching so my scraper skips pages it has already downloaded on a previous run?

Prompt 5

How do I distribute a Botasaurus scraping job across multiple machines using Kubernetes?


Verify against the repo before relying on details.