explaingit

dkron-io/dkron

4,687 · Go
Audience · ops devops
Complexity · 4/5
Setup · moderate

TLDR

A distributed cron job scheduler that runs scheduled tasks across a cluster of servers with no single point of failure: if one server goes down, the others keep running your jobs automatically.

Mindmap

mindmap
  root((dkron))
    What it does
      Distributed cron
      No single point of failure
      Job scheduling
    Tech used
      Go runtime
      Raft consensus
      Serf gossip
    Management
      Web dashboard
      REST API
      Docker Compose
    Client libraries
      Python client
      Ruby client
      PHP client

Things people build with this

USE CASE 1

Replace single-server cron jobs with a fault-tolerant cluster so scheduled tasks survive server failures.

USE CASE 2

Schedule jobs to run on specific subsets of servers in a distributed infrastructure using tags or roles.

USE CASE 3

Manage and monitor all scheduled jobs through a built-in web dashboard or REST API without SSH access.

USE CASE 4

Integrate job scheduling into your application code using client libraries for Python, Ruby, or PHP.

Tech stack

Go · Docker · Raft · Serf

Getting it running

Difficulty · moderate
Time to first run · 30min

Local testing needs Docker and Docker Compose; production deployment requires a cluster of multiple servers for fault tolerance to work.
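The multi-server requirement follows from how Raft, the consensus protocol Dkron uses, makes decisions: the cluster keeps working only while a majority of its server nodes can reach each other. A minimal sketch of that arithmetic follows; the function names are illustrative and not part of Dkron.

```python
def quorum(servers: int) -> int:
    """Smallest majority of a Raft cluster with `servers` voting nodes."""
    if servers < 1:
        raise ValueError("a cluster needs at least one server")
    return servers // 2 + 1


def tolerated_failures(servers: int) -> int:
    """Number of servers that can fail while a majority is still reachable."""
    return servers - quorum(servers)


if __name__ == "__main__":
    # Print cluster size, quorum size, and tolerated failures per size.
    for n in (1, 2, 3, 5):
        print(n, quorum(n), tolerated_failures(n))
```

Under this rule a cluster of one or two servers tolerates no failures, three servers tolerate one, and five tolerate two, which is why odd sizes of three or more are the usual choice.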

In plain English

Dkron is a tool for running scheduled tasks across a group of servers, similar to how cron jobs work on a single computer but designed to keep running correctly even when individual servers fail. On a single machine, a scheduled task stops if that machine goes down. Dkron spreads the responsibility across a cluster of machines so there is no single point of failure: if one server dies, the others continue running the scheduled jobs.

The system is built in Go and uses two coordination mechanisms: Raft, a protocol for reaching agreement across distributed nodes, and Serf, a tool for cluster membership and failure detection using a gossip approach where nodes exchange status information with each other directly. Together these give the cluster the ability to elect a leader, detect failures, and continue functioning without manual intervention.

Jobs are defined with cron-style schedules and can be targeted at any subset of servers in the cluster. Dkron provides a web interface for viewing and managing jobs, and an API for adding or modifying jobs programmatically. The README links to API documentation on the project website for the full details.

Setting up a test environment requires Docker and Docker Compose. Running one command starts the cluster locally and makes the dashboard accessible in a browser. Scaling the cluster to more server or agent nodes is also done through Docker Compose by specifying the desired count.

Client libraries in Python, Ruby, and PHP are listed in the README for interacting with the API from application code. The project was inspired by a Google research paper on reliable distributed scheduling and by Airbnb's Chronos scheduling system. It runs on Linux, macOS, and Windows. Full documentation is hosted on the project website.
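As a rough illustration of the cron-style schedules, server targeting, and API described above, the sketch below builds a job definition and shows how it might be submitted over HTTP. The field names (`name`, `schedule`, `executor`, `executor_config`, `tags`), the `/v1/jobs` path, and the `localhost:8080` address are assumptions that should be checked against the API documentation for your Dkron version. `build_job` and `submit_job` are hypothetical helpers, not part of any Dkron client library.

```python
import json
import urllib.request


def build_job(name, schedule, command, tags=None):
    """Return a job definition as a plain dict (field names are assumed)."""
    job = {
        "name": name,
        "schedule": schedule,  # assumed interval syntax, e.g. "@every 5m"
        "executor": "shell",
        "executor_config": {"command": command},
    }
    if tags:
        # Assumed targeting mechanism: only nodes carrying these tags run the job.
        job["tags"] = dict(tags)
    return job


def submit_job(job, base_url="http://localhost:8080"):
    """POST the job to the assumed /v1/jobs endpoint; return the HTTP status."""
    request = urllib.request.Request(
        f"{base_url}/v1/jobs",
        data=json.dumps(job).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Build a job that runs every five minutes on nodes tagged role=web.
job = build_job("cleanup", "@every 5m", "date", tags={"role": "web"})
print(json.dumps(job, indent=2))
```

Calling `submit_job(job)` only makes sense once a cluster is running and reachable at the given address, so the sketch stops at printing the payload.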

Copy-paste prompts

Prompt 1
Set up a local Dkron cluster using Docker Compose and schedule a job that runs every 5 minutes. Show me the full docker-compose.yml and job definition.
Prompt 2
How do I target a Dkron job to run only on servers with a specific tag in my cluster, rather than all nodes?
Prompt 3
Show me how to use the Dkron Python client library to create and update scheduled jobs programmatically from my application.
Prompt 4
How does Dkron elect a new leader if the current leader node goes down? Does it happen automatically or does it require manual intervention?

Verify against the repo before relying on details.