explaingit

gocolly/colly

Analysis updated 2026-06-21

Stars · 25,275   Language · Go   Audience · developer   Complexity · 2/5   Setup · easy

TLDR

Colly is a Go web scraping and crawling framework that handles cookies, rate-limiting, caching, and parallel requests automatically, so you can extract data from websites without writing boilerplate code.

Mindmap

mindmap
  root((Colly))
    What it does
      Scrapes web pages
      Follows links
      Extracts structured data
    Features
      Cookie handling
      Rate limiting
      Local page caching
      Parallel requests
    Use cases
      Price monitoring
      Content archiving
      Data pipelines
    Tech stack
      Go
    Audience
      Developers
      Data engineers

What do people build with it?

USE CASE 1

Scrape product prices from an e-commerce site on a schedule and save them to a spreadsheet.

USE CASE 2

Build a web crawler that archives all pages of a blog for offline reading or search indexing.

USE CASE 3

Monitor competitor websites for content changes and trigger an alert when something updates.

USE CASE 4

Build a data pipeline that collects structured data from multiple websites for further analysis.

What is it built with?

Go

How does it compare?

Repo | gocolly/colly | microsoft/typescript-go | asdf-vm/asdf
Stars | 25,275 | 25,329 | 25,330
Language | Go | Go | Go
Setup difficulty | easy | easy | easy
Complexity | 2/5 | 3/5 | 2/5
Audience | developer | developer | developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · easy   Time to first run · 30 min

In plain English

Colly is a web scraping and crawling framework written in Go (also called Golang). Web scraping means automatically visiting websites and extracting structured information from their pages, the same information you would otherwise copy by hand.

Writing a crawler from scratch involves a lot of repetitive work: handling cookies, controlling how fast you request pages, dealing with different text encodings, and respecting a site's robots.txt rules (a file that tells bots which pages they may access). Colly handles all of that for you.

You give Colly a starting URL and write callback functions, small pieces of code that run when Colly finds a link, downloads a page, or encounters an HTML element matching a selector (such as all anchor tags or all product prices). Colly then follows links, manages sessions, and can run multiple requests in parallel. It can cache pages locally so you don't re-download them, and it supports distributed scraping across multiple machines.

You would use Colly if you need to gather data from websites for analysis, build a search index, archive content, or monitor pages for changes. It works for anything from a quick one-off script to a large-scale data pipeline. Colly is written in Go, and no other runtime or framework is required.

Copy-paste prompts

Prompt 1
Write a Colly scraper in Go that visits a news homepage and extracts article titles and URLs into a CSV file.
Prompt 2
Show me how to configure Colly to rate-limit requests to one per second and cache downloaded pages to disk.
Prompt 3
Using Colly, write a spider that starts at a homepage, follows all internal links, and prints the title of each page.
Prompt 4
How do I make Colly respect a site's robots.txt file while still scraping pages that allow crawling?
Prompt 5
Write a Colly scraper that sends scraped data to a PostgreSQL database as it crawls, using goroutines for concurrency.

Frequently asked questions

What is colly?

Colly is a Go web scraping and crawling framework that handles cookies, rate-limiting, caching, and parallel requests automatically, so you can extract data from websites without writing boilerplate code.

What language is colly written in?

Go.

How hard is colly to set up?

Setup difficulty is rated easy, with roughly 30 minutes to a first successful run.

Who is colly for?

Mainly developers.


Verify against the repo before relying on details.