Build a self-directed study curriculum for distributed systems starting from the curated beginner bootcamp resources.
Find and read the foundational academic papers on topics like consensus, logical clocks, and large-scale storage systems.
Prepare for system design interviews by working through the papers and books organized by topic.
This repository is a curated collection of reading material for anyone who wants to learn how distributed systems work. Distributed systems are programs or services that run across multiple computers at the same time, and coordinating those computers reliably is one of the harder problems in software engineering. The list gathers books, academic papers, blog posts, and talks that explain the core concepts.

The collection is organized into sections. A bootcamp section lists a handful of starting points for newcomers, including an introduction to the CAP theorem (which describes the trade-off a distributed system must make between consistency and availability when a network partition occurs) and a tour of the classic failure modes engineers commonly hit. From there, the list branches into books, most of which are free or available with free registration, ranging from introductory overviews to dense academic texts.

A large portion of the list is research papers. These include foundational work like Leslie Lamport's paper on logical clocks, which established how computers in a network can agree on the order of events without a shared clock. Other papers cover how Google built its storage systems, how Amazon designed its Dynamo database, and how consensus algorithms like Paxos and Raft let a group of machines agree on a value even when some of them fail.

The list also covers messaging systems, streaming logs, and coordination services. There are sections on monitoring and testing distributed systems, and on specific technologies like service meshes and container orchestration tools.

The repository contains no runnable code. It is a reading list maintained by the community, intended as a starting point for developers, students, or anyone building or evaluating systems that need to run reliably across more than one machine.
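As an illustration of the logical clock idea mentioned above, here is a minimal sketch of a Lamport-style counter. It is not taken from the repository, which ships no code; the class name `LamportClock` and its method names are hypothetical, chosen only for this example. Each process keeps an integer counter, increments it on every local event, and on receiving a message jumps past the sender's timestamp, so a send is always ordered before its matching receive.

```python
class LamportClock:
    """Hypothetical minimal Lamport logical clock for a single process."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Advance the clock for a local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Timestamp an outgoing message (sending counts as an event)."""
        return self.tick()

    def receive(self, msg_time: int) -> int:
        """Merge the sender's timestamp: take the max, then advance by one."""
        self.time = max(self.time, msg_time) + 1
        return self.time


# Two processes, A and B, with no shared wall clock.
a, b = LamportClock(), LamportClock()
a.tick()              # local event on A: A's clock becomes 1
stamp = a.send()      # A sends a message stamped 2
b.receive(stamp)      # B moves to max(0, 2) + 1 = 3
```

Because B's clock ends up strictly greater than the stamp A attached, the receive is ordered after the send even though neither machine consulted a real clock. The papers in the list cover the full treatment, including how ties between processes are broken.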
Verify against the repo before relying on details.