explaingit

gocolly/colly

📈 Trending | 25,301 | Go | Audience · developer | Complexity · 2/5 | Active | License | Setup · easy

TLDR

Go framework for automatically visiting websites and extracting structured data. Handles cookies, rate limiting, encoding, and robots.txt rules so you don't have to.

Mindmap

mindmap
  root((Colly))
    What it does
      Web scraping
      Link following
      Data extraction
    Key features
      Callback functions
      Session management
      Parallel requests
      Local caching
    Use cases
      Data analysis
      Search indexing
      Content archiving
      Change monitoring
    Tech stack
      Go language
      No extra runtime
    Audience
      Backend developers
      Data engineers

Things people build with this

USE CASE 1

Extract product prices and details from e-commerce sites for price comparison or market analysis.

USE CASE 2

Build a search index by crawling a website and collecting all pages and their content.

USE CASE 3

Monitor news sites or blogs for new articles matching specific keywords and alert when they appear.

USE CASE 4

Archive web content by downloading and storing pages locally for offline access or historical records.

Tech stack

Go

Getting it running

Difficulty · easy | Time to first run · 5 min
Use freely for any purpose, including commercial use, as long as you keep the copyright notice.

In plain English

Colly is a web scraping and crawling framework written in Go (also called Golang). Web scraping means automatically visiting websites and pulling structured information out of their pages, the same information you would otherwise copy by hand. Writing a crawler from scratch involves a lot of repetitive work: handling cookies, controlling how fast you visit pages, dealing with different text encodings, and respecting a site's robots.txt rules (a file that tells bots which pages they may access). Colly provides built-in support for all of that.

You give Colly a starting URL and write callback functions: small pieces of code that run when Colly sends a request, downloads a page, hits an error, or encounters an HTML element matching a selector (such as all anchor tags or all product prices). Colly then follows the links your callbacks ask it to visit, manages sessions (cookies), and can run multiple requests in parallel. It can cache pages locally so you don't re-download them, and it supports distributed scraping across multiple machines.

You would use Colly if you need to gather data from websites for analysis, build a search index, archive content, or monitor pages for changes. It works for anything from a quick one-off script to a large-scale data pipeline. The language is Go; no other runtime or framework is required.

Copy-paste prompts

Prompt 1
Show me a Colly example that visits a website, finds all links matching a CSS selector, and prints their URLs.
Prompt 2
How do I set up Colly to respect rate limits and robots.txt when scraping a site?
Prompt 3
Write a Colly script that downloads all product names and prices from a paginated e-commerce site.
Prompt 4
How can I use Colly callbacks to extract data when a page loads and when an error occurs?
Prompt 5
Show me how to run multiple Colly requests in parallel and cache results locally.

Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.