Study the four stages of ML system design (setup, data pipeline, model training, serving) before a technical interview.
Use the 27 open-ended ML systems design questions and the community answers to practice for interviews.
Contribute edits, new resources, or answers to the open-source booklet via pull request.
This repository contains a short booklet written in 2019 on how to design machine learning systems, covering the process from initial project setup through data handling, model training, and eventually deploying and maintaining a working system. The author, Chip Huyen, describes it as an early attempt to document this topic, and notes that her later O'Reilly book from 2022, titled Designing Machine Learning Systems, is a more thorough and current treatment of the same subject.

The booklet follows four stages: setting up the project, building the data pipeline, selecting and training a model, and serving the model in production. Each section links to external resources for deeper reading and includes case studies from machine learning engineers at large tech companies. At the end there are 27 open-ended interview questions on machine learning systems design, with community-contributed answers available in this same repository.

This is not a traditional software tool: it is a document built with a package called magicbook that converts text files into HTML and PDF output. The repository includes the source content files and build instructions for anyone who wants to contribute edits or additions to the text. Contributing can mean fixing errors, adding resources, or editing the questions and answers.

The README is explicit that this is not the repository for the 2022 O'Reilly book, which has its own separate GitHub repository. Anyone looking for the more current and comprehensive material should consult that book's repository instead. This one remains publicly available as the original draft, along with the community answers it has accumulated.