explaingit

drunkleen/downloader-action

23 · Audience · general · Complexity · 2/5 · Active · Setup · easy

TLDR

A forkable GitHub Actions setup that downloads files via aria2c on a runner, zips them, splits archives over 95MB, and commits the result back to your repo under a downloads folder.

Mindmap

mindmap
  root((downloader-action))
    Inputs
      File URLs
      Page URLs
      Zip password
    Outputs
      Zipped downloads
      Page screenshots
      Updated README
      Browse log
    Use Cases
      Serverless file fetch
      Page archive snapshots
      Repo size cleanup
      Bulk media grab
    Tech Stack
      GitHub Actions
      aria2c
      curl
      Puppeteer
      git filter-repo

Things people build with this

USE CASE 1

Download large files into a private GitHub repo without renting a server

USE CASE 2

Snapshot a web page as HTML plus screenshot plus media into a dated folder

USE CASE 3

Auto-split downloads over 95MB into parts so GitHub will accept them

USE CASE 4

Periodically prune blobs over 20MB to stay under the 5GB repo limit
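
To illustrate use case 3, here is a minimal Python sketch of splitting an archive into parts of at most 95 MB and joining them again on the receiving side. It is not the repo's own implementation: the `.partNNN` naming scheme, the function names, and the reading of "95MB" as 95 MiB are all assumptions made for this example.

```python
from pathlib import Path

# Assumption: "95MB" is read as 95 MiB here; the repo may use a different unit.
PART_SIZE = 95 * 1024 * 1024


def split_file(src: Path, part_size: int = PART_SIZE) -> list[Path]:
    """Split src into numbered parts of at most part_size bytes.

    Files that already fit are left alone and returned as-is.
    The ".partNNN" suffix is a hypothetical naming scheme.
    """
    if src.stat().st_size <= part_size:
        return [src]
    parts = []
    with src.open("rb") as f:
        index = 1
        # Read one part at a time until the file is exhausted.
        while chunk := f.read(part_size):
            part = src.with_name(f"{src.name}.part{index:03d}")
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts


def join_parts(parts: list[Path], dest: Path) -> None:
    """Reassemble parts into dest by concatenating them in name order."""
    with dest.open("wb") as out:
        # Zero-padded suffixes make lexical order equal to numeric order.
        for part in sorted(parts):
            out.write(part.read_bytes())
```

Because the parts are plain byte slices, the receiver only needs to concatenate them in order to get the original archive back; no special tool is required.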

Tech stack

GitHub Actions · aria2c · curl · Puppeteer · Bash

Getting it running

Difficulty · easy · Time to first run · 5 min

Fork the repo, enable Actions on the fork, then click Run on the workflow and enter your URLs. There is no server to provision.

In plain English

Downloader Action turns a forked GitHub repository into a do-it-yourself file downloader. You do not run a server. Instead, GitHub Actions, the free automation service built into every GitHub repo, spins up a temporary Ubuntu machine when you click Run, downloads the files you ask for, and commits them back into your repo under a downloads folder. From there you grab them by clicking the raw link.

The main workflow takes one or more URLs and an optional zip password. On the runner it uses aria2c, a fast multi-connection downloader, and falls back to curl if aria2c fails. A failed download is retried up to three times, with a longer wait before each retry. All downloaded files are packed into a zip archive. Because GitHub rejects single files over 100 megabytes, archives larger than 95 megabytes are automatically split into 95-megabyte parts. Uploads over 200 megabytes are pushed in small batches with their own retry loop. After each run, a README is generated inside the archive folder with direct download links, and the root README is updated with a list of all downloads.

A second workflow uses a headless Chromium browser through Puppeteer to visit a URL. It takes a full-page screenshot, saves the raw HTML, and downloads all images, audio, video, and document files it finds on the page, organised by domain and timestamp inside a pages folder. A running log of visited sites is kept in browse.md, with favicons resolved from the site root first, then from the page HTML, then from a Google fallback.

Two cleanup workflows are also included. Delete Downloads wipes the working tree, while History Cleaner rewrites git history with git filter-repo to strip blobs larger than 20 megabytes and then force-pushes every branch and tag. The README warns that the History Cleaner is destructive and irreversible, and that all existing clones will need to be re-cloned afterwards. GitHub repos have a soft 5 gigabyte limit, so the cleaner is meant to be run periodically.
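
The retry-and-fallback behaviour described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the workflow's actual script: the function names, the linear 5-second backoff, the choice to try both tools on every attempt, and the specific aria2c and curl flags are all hypothetical choices for this example.

```python
import subprocess
import time
from typing import Callable, Sequence


def aria2c_download(url: str) -> bool:
    # -x / -s: connections per server / number of splits (values assumed).
    return subprocess.run(["aria2c", "-x", "8", "-s", "8", url]).returncode == 0


def curl_download(url: str) -> bool:
    # -L follows redirects, --fail makes HTTP errors exit non-zero,
    # -O saves under the remote file name.
    return subprocess.run(["curl", "-L", "--fail", "-O", url]).returncode == 0


def fetch_with_fallback(
    url: str,
    downloaders: Sequence[Callable[[str], bool]],
    max_attempts: int = 3,      # assumed reading of "up to three"
    base_delay: float = 5.0,    # assumed; the real wait times are not stated
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try each downloader in order; wait longer after every failed round."""
    for attempt in range(1, max_attempts + 1):
        for download in downloaders:
            if download(url):
                return True
        if attempt < max_attempts:
            # Linear backoff: base_delay, 2 * base_delay, ...
            sleep(base_delay * attempt)
    return False
```

A call such as `fetch_with_fallback(url, [aria2c_download, curl_download])` would prefer aria2c and only reach for curl when aria2c reports failure. Passing the downloaders and the `sleep` function as arguments keeps the loop easy to exercise without touching the network.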

Copy-paste prompts

Prompt 1
Walk me through the downloader-action workflow that takes URLs and commits zipped files back to my fork
Prompt 2
Show how the Puppeteer workflow saves a full-page screenshot, raw HTML, and media into a domain folder
Prompt 3
Explain why files over 95MB get split and how the receiver reassembles them
Prompt 4
Write a workflow_dispatch invocation that downloads three URLs with a zip password
Prompt 5
Add a step to downloader-action that uploads the resulting zip to S3 instead of committing it

Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.