explaingit

apify/crawlee

Analysis updated 2026-06-21

Stars · 23,088 | Language · TypeScript | Audience · developer | Complexity · 3/5 | Setup · moderate

TLDR

A Node.js library that automates web scraping: it visits websites, extracts data, rotates proxies to avoid blocks, and manages large URL queues.

Mindmap

mindmap
  root((repo))
    What it does
      Visits sites automatically
      Extracts data at scale
      Avoids bot detection
      Manages URL queues
    Tech stack
      TypeScript
      Node.js
      Playwright
      Puppeteer
    Use cases
      Price comparison tools
      AI training data
      News aggregation
      Competitor monitoring
    Audience
      Developers
      Data engineers

What do people build with it?

USE CASE 1

Build a price comparison tool that scrapes product listings from multiple retail websites automatically

USE CASE 2

Collect news articles from dozens of sites automatically for a content aggregation feed

USE CASE 3

Gather training data for AI models from public web pages at scale

USE CASE 4

Monitor competitor websites and alert you when prices or content change

What is it built with?

TypeScript, Node.js, Playwright, Puppeteer

How does it compare?

                  apify/crawlee   lfnovo/open-notebook   louislam/dockge
Stars             23,088          23,081                 23,095
Language          TypeScript      TypeScript             TypeScript
Setup difficulty  moderate        hard                   easy
Complexity        3/5             4/5                    2/5
Audience          developer       developer              ops devops

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · moderate | Time to first run · 30 min

Requires Node.js and optionally Playwright or Puppeteer browser drivers for JavaScript-heavy sites.

In plain English

Crawlee is a web scraping and browser automation library for Node.js. Web scraping means automatically visiting websites and extracting information from them, like prices, product listings, article text, or any other data you can see in a browser. Crawlee makes this easier by handling the repetitive, technical work for you.

The problem it solves is that scraping modern websites is hard: pages load content using JavaScript, websites detect and block automated requests, and managing a queue of thousands of URLs while handling errors and retries gets complex fast. Crawlee handles all of this. It can control real browsers (via Playwright or Puppeteer) to scrape JavaScript-heavy sites, or use fast HTTP requests for simpler pages. It automatically rotates proxies to avoid blocks, generates realistic browser fingerprints to appear human-like, manages a queue of URLs to visit, and saves collected data to disk or cloud storage.

You would use this if you need to extract data from websites at scale, for example to build a price comparison tool, aggregate news articles, collect training data for AI, or monitor competitor websites. It works in JavaScript and TypeScript and runs on Node.js. It is developed by Apify, a company that provides cloud infrastructure for running scrapers, though Crawlee itself runs anywhere.
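
To make the queueing, retry, and proxy-rotation ideas above more concrete, here is a minimal, library-free sketch in TypeScript. It is not Crawlee's API or implementation: the names `UrlQueue`, `crawl`, `Fetcher`, and the proxy labels are hypothetical, the fetcher is synchronous for brevity (a real crawler would be asynchronous), and proxies are rotated in simple round-robin order, which is only one possible strategy.

```typescript
// Library-free sketch: NOT Crawlee's API. All names here are hypothetical.

// A fetcher visits `url` through `proxy` and returns the links it found.
// It throws when the request fails (for example, when it is blocked).
type Fetcher = (url: string, proxy: string) => { links: string[] };

interface QueueEntry {
  url: string;
  retries: number;
}

class UrlQueue {
  private pending: QueueEntry[] = [];
  private seen = new Set<string>();

  // Add a URL unless it has been queued before (deduplication).
  add(url: string): boolean {
    if (this.seen.has(url)) return false;
    this.seen.add(url);
    this.pending.push({ url, retries: 0 });
    return true;
  }

  // Put a failed entry back at the end of the queue with its retry count bumped.
  requeue(entry: QueueEntry): void {
    this.pending.push({ url: entry.url, retries: entry.retries + 1 });
  }

  next(): QueueEntry | undefined {
    return this.pending.shift();
  }
}

function crawl(
  startUrls: string[],
  proxies: string[],
  fetchPage: Fetcher,
  maxRetries = 2,
): { visited: string[]; failed: string[] } {
  if (proxies.length === 0) throw new Error('at least one proxy is required');
  const queue = new UrlQueue();
  startUrls.forEach((u) => queue.add(u));

  const visited: string[] = [];
  const failed: string[] = [];
  let attempt = 0;

  for (let entry = queue.next(); entry !== undefined; entry = queue.next()) {
    // Round-robin rotation: each attempt uses the next proxy in the list.
    const proxy = proxies[attempt % proxies.length];
    attempt += 1;
    try {
      const { links } = fetchPage(entry.url, proxy);
      visited.push(entry.url);
      links.forEach((link) => queue.add(link));
    } catch {
      // Retry a bounded number of times, then record the URL as failed.
      if (entry.retries < maxRetries) queue.requeue(entry);
      else failed.push(entry.url);
    }
  }
  return { visited, failed };
}

// Usage with an in-memory fake site (no network access).
const fakeSite: Record<string, string[]> = {
  'https://example.com/': ['https://example.com/a', 'https://example.com/b'],
  'https://example.com/a': ['https://example.com/'],
  'https://example.com/b': [],
};
let flakyCalls = 0;
const result = crawl(['https://example.com/'], ['proxy-1', 'proxy-2'], (url) => {
  // Fail the first request for /b to exercise the retry path.
  if (url === 'https://example.com/b' && flakyCalls++ === 0) throw new Error('blocked');
  return { links: fakeSite[url] ?? [] };
});
console.log(result);
```

In this sketch the link from /a back to the start page is ignored because the queue remembers every URL it has seen, and the one simulated failure on /b is retried rather than dropped. Crawlee provides its own request queue, retry handling, and proxy configuration, so treat this only as an illustration of the underlying idea and check the project's documentation for the real interfaces.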

Copy-paste prompts

Prompt 1
Help me write a Crawlee script in TypeScript that scrapes all product names and prices from a retail site and saves them to a CSV file.
Prompt 2
I'm using Crawlee with Playwright. How do I rotate proxies and avoid getting blocked on JavaScript-heavy pages?
Prompt 3
Show me how to set up a Crawlee URL queue that crawls an entire sitemap and stores results to disk.
Prompt 4
I want to scrape a site that requires login. Write me a Crawlee script that handles authentication with Playwright.
Prompt 5
How do I run a Crawlee scraper on Apify's cloud platform versus running it locally on Node.js?

Frequently asked questions

What is crawlee?

A Node.js library that automates web scraping: it visits websites, extracts data, rotates proxies to avoid blocks, and manages large URL queues.

What language is crawlee written in?

Mainly TypeScript. The stack also includes Node.js, Playwright, and Puppeteer.

How hard is crawlee to set up?

Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.

Who is crawlee for?

Mainly developers.

Verify against the repo before relying on details.