explaingit

omkarcloud/botasaurus

4,557 | Python | Audience · developer | Complexity · 3/5 | Setup · easy

TLDR

A Python web scraping framework that bypasses bot detection systems like Cloudflare by mimicking human browser behavior, and can package finished scrapers as desktop apps or web interfaces for non-technical clients.

Mindmap

mindmap
  root((botasaurus))
    What it does
      Bypass bot detection
      Scrape websites
      Cache results
    Anti-Detection
      Realistic mouse moves
      Browser masking
      Proxy rotation
    Delivery Modes
      Desktop app
      Web interface
      Kubernetes scale
    Tech
      Python
      Browser automation
      HTTP requests

Things people build with this

USE CASE 1

Scrape websites protected by Cloudflare or Datadome using anti-detection browser automation.

USE CASE 2

Package a finished scraper as a desktop app or web interface so non-technical clients can run it themselves.

USE CASE 3

Cut proxy costs by routing some requests directly through the browser instead of an external proxy service.

USE CASE 4

Run parallel scraping tasks with result caching to avoid re-fetching pages you have already processed.
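The caching and parallelism idea in use case 4 can be illustrated with a minimal standard-library sketch. It is not Botasaurus code and does not use its API: `cached_fetch`, `scrape_all`, the `fetch` callable, and the one-JSON-file-per-URL cache layout are all assumptions made for illustration.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from hashlib import sha256
from pathlib import Path


def cached_fetch(url, fetch, cache_dir):
    """Return the cached result for `url`; call `fetch(url)` only on a cache miss.

    Hypothetical helper: results are stored as one JSON file per URL,
    named after the SHA-256 hash of the URL.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / (sha256(url.encode("utf-8")).hexdigest() + ".json")

    if cache_file.exists():
        # Cache hit: reuse the stored result instead of fetching again.
        return json.loads(cache_file.read_text(encoding="utf-8"))

    # Cache miss: fetch, then persist the result for later runs.
    result = fetch(url)
    cache_file.write_text(json.dumps(result), encoding="utf-8")
    return result


def scrape_all(urls, fetch, cache_dir, workers=4):
    """Process `urls` in parallel threads; results keep the input order.

    Assumes the URLs are unique, so no two threads write the same cache file.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda url: cached_fetch(url, fetch, cache_dir), urls))
```

Calling `scrape_all` twice with the same URLs and cache directory should invoke `fetch` only during the first call; the second call reads every result from disk.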

Tech stack

Python, Selenium, Playwright, Kubernetes

Getting it running

Difficulty · easy | Time to first run · 30 min

Install via pip. Some anti-detection features require a compatible browser driver to be present.

In plain English

Botasaurus is a Python framework for building web scrapers that can get past bot-detection systems. Many websites use tools like Cloudflare, Fingerprint, and Datadome to identify and block automated traffic. Botasaurus is designed to avoid triggering these systems by making the browser behave more like a real human, including realistic mouse movements and other signals that detection tools look for.

The framework wraps around browser automation (similar to Selenium or Playwright) and HTTP request libraries, but adds a layer that masks the signs of automation. You write a Python function, add a decorator like @browser or @request to it, and Botasaurus handles the rest: launching the browser with anti-detection settings, managing browser profiles, rotating proxies, and saving results automatically as JSON files. A basic scraper that opens a page and extracts a heading can be written in about ten lines of code.

Beyond bypassing detection, Botasaurus includes features aimed at reducing the cost and complexity of scraping at scale. It claims to cut proxy costs by up to 97% by sending some requests directly from within the browser rather than routing all traffic through an external proxy server. It also supports running multiple scraping tasks in parallel, caching results to avoid re-fetching the same data, and distributing work across multiple machines using Kubernetes.

One notable feature is the ability to package a finished scraper as a desktop application for Windows, macOS, or Linux, or to turn it into a web interface that non-technical users can operate through a browser. This is aimed at developers who build scraping tools for clients or customers who do not want to use a command line.

Installation is through pip. The project is split across several packages (botasaurus, botasaurus-api, botasaurus-driver, botasaurus-server, and others) that can be installed and upgraded together. The full README is longer than the portion summarized here.
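The decorator workflow described above (write a function, decorate it, get results saved as JSON) can be illustrated with a minimal standard-library sketch. It does not use Botasaurus's @browser or @request decorators and launches no browser; `save_results`, `scrape_heading_task`, and the `output` directory name are hypothetical stand-ins chosen for illustration.

```python
import functools
import json
import re
from pathlib import Path


def save_results(output_dir="output"):
    """Hypothetical decorator: run the task, then write its return value to
    <output_dir>/<function name>.json. Illustrates the pattern only."""
    def decorator(task):
        @functools.wraps(task)
        def wrapper(*args, **kwargs):
            result = task(*args, **kwargs)
            out_dir = Path(output_dir)
            out_dir.mkdir(parents=True, exist_ok=True)
            out_file = out_dir / f"{task.__name__}.json"
            out_file.write_text(json.dumps(result, indent=2), encoding="utf-8")
            return result
        return wrapper
    return decorator


@save_results(output_dir="output")
def scrape_heading_task(html):
    """Stand-in for a scraping task: pull the first <h1> out of an HTML string."""
    match = re.search(r"<h1[^>]*>(.*?)</h1>", html, re.S)
    return {"heading": match.group(1).strip() if match else None}
```

Calling `scrape_heading_task("<h1>Hello</h1>")` returns the extracted heading and also writes it to `output/scrape_heading_task.json`. In a real framework, the decorator would additionally own concerns such as browser start-up and proxy settings, which keeps the task function itself short.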

Copy-paste prompts

Prompt 1
Write a Botasaurus scraper using the @browser decorator to extract product prices from a Cloudflare-protected e-commerce site.
Prompt 2
How do I configure proxy rotation in Botasaurus so I can scrape at scale without getting blocked?
Prompt 3
Show me how to package my Botasaurus scraper as a web interface a client can use without touching the command line.
Prompt 4
How do I use Botasaurus caching so my scraper skips pages it has already downloaded on a previous run?
Prompt 5
How do I distribute a Botasaurus scraping job across multiple machines using Kubernetes?

Verify against the repo before relying on details.