laramies/theharvester

★ 16,199 | Python | Audience · developer | Complexity · 3/5 | Setup · moderate

Mindmap

mindmap
  root((theHarvester))
    What it does
      Collect emails
      Find subdomains
      Discover IPs and URLs
    Data Sources
      Search engines
      Cert transparency
      Security search engines
      Breach databases
    Features
      Passive recon
      Active brute-force
      REST API
    Audience
      Pen testers
      Blue team defenders
      Security researchers

Things people build with this

USE CASE 1

Map all subdomains and email addresses exposed for a target organization before a penetration test.

USE CASE 2

Audit what information about your own company is publicly visible to outside attackers.

USE CASE 3

Automate reconnaissance by integrating theHarvester via its REST API into a larger security pipeline.
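As a rough illustration of the pipeline step in use case 3, the sketch below flattens a scan result into de-duplicated rows and writes them as CSV. The payload keys (`emails`, `hosts`, `ips`) and the helper names are assumptions made for this example, not the documented shape of theHarvester's REST API response, so check the real response before adapting it.

```python
import csv
import io
import json


def results_to_rows(payload, domain):
    """Flatten a result payload into sorted, de-duplicated (domain, kind, value) rows.

    The keys "emails", "hosts" and "ips" are hypothetical; adjust them to
    whatever the real API response contains.
    """
    rows = set()
    for kind in ("emails", "hosts", "ips"):
        for value in payload.get(kind, []):
            # Normalise case and whitespace so near-duplicates collapse.
            rows.add((domain, kind, value.strip().lower()))
    return sorted(rows)


def rows_to_csv(rows):
    """Render rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["domain", "kind", "value"])
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample data standing in for a scan result.
sample = json.loads(
    '{"emails": ["Admin@example.com", "admin@example.com"],'
    ' "hosts": ["www.example.com"], "ips": ["192.0.2.10"]}'
)
print(rows_to_csv(results_to_rows(sample, "example.com")))
```

Keeping the flattening step separate from the CSV writer means the same rows could be sent to a ticketing system or a database instead of a file.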

Tech stack

Python · Docker · uv

Getting it running

Difficulty · moderate | Time to first run · 30 min

Several of the most powerful data sources require API keys from third-party providers, and their free tiers allow only a limited number of requests per day.

License terms were not mentioned in the explanation.

In plain English

theHarvester is a reconnaissance tool used in the early "information-gathering" stage of a penetration test or red-team assessment. Its job is to collect publicly available information about a given domain: names, email addresses, IP addresses, subdomains, and URLs. This lets a security team see what an outside attacker would be able to find about their organisation. The practice is called OSINT, short for open-source intelligence, because everything is pulled from public resources.

The tool runs as a command-line program. You give it a domain to target, and it then queries a long list of "passive" data sources in turn: public search engines such as Baidu, Brave, DuckDuckGo, Mojeek and Yahoo; certificate transparency logs through crt.sh and Cert Spotter; security-focused search engines such as Shodan, Censys, Netlas, FOFA, ZoomEye and SecurityTrails; breach-checking services such as haveibeenpwned and DeHashed; and email-finder services such as Hunter and RocketReach, among many others. Some sources are free and others need an API key; the README lists the free quotas and paid tiers. On top of that, "active" modules can brute-force subdomain names from a dictionary and take screenshots of discovered subdomains.

You would reach for theHarvester if you are a penetration tester scoping out a target's external attack surface, a blue-team defender wanting to see what is exposed about your own organisation, or a security researcher doing reconnaissance.

An optional REST API, protected by an API key, allows the tool to be integrated with other systems. theHarvester is written in Python (3.12 or higher) and uses the uv package manager for installation. It can also be run from a prebuilt Docker image.
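To make the command-line workflow described above concrete, here is a minimal sketch that assembles a passive-scan invocation without running it. It assumes the executable is named `theHarvester`, that `-d`, `-b` and `-l` select the domain, a comma-separated source list, and a result limit, and that `crtsh` and `certspotter` are valid source identifiers. Confirm all of these against `theHarvester -h` for your installed version. The helper name `build_harvester_command` is invented for this example.

```python
import shlex


def build_harvester_command(domain, sources, limit=None, executable="theHarvester"):
    """Return an argv list for a passive scan. Nothing is executed here.

    Flag names and source identifiers are assumptions; verify them with
    the tool's own help output before use.
    """
    if not domain or " " in domain:
        raise ValueError("domain must be a non-empty string without spaces")
    if not sources:
        raise ValueError("at least one data source is required")
    argv = [executable, "-d", domain, "-b", ",".join(sources)]
    if limit is not None:
        argv += ["-l", str(limit)]
    return argv


cmd = build_harvester_command("example.com", ["crtsh", "certspotter"], limit=200)
print(shlex.join(cmd))
# prints: theHarvester -d example.com -b crtsh,certspotter -l 200
```

Building an argument list rather than a shell string avoids quoting problems if the command is later passed to `subprocess.run`. Only scan domains you are authorised to assess.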

Copy-paste prompts

Prompt 1

Using theHarvester, write me a shell command to gather subdomains and emails for the domain example.com using Shodan and crt.sh as sources.

Prompt 2

I want to run theHarvester from Docker against my own domain to audit our external exposure. Walk me through the setup steps.

Prompt 3

How do I configure API keys for theHarvester's paid data sources, such as SecurityTrails and Hunter, so they load automatically on each run?

Prompt 4

Generate a Python script that calls the theHarvester REST API to run a scan and parses the JSON results into a CSV report.

Verify against the repo before relying on details.