explaingit

eugeneyan/applied-ml

28,865 · Audience · data · Complexity · 1/5 · Stale · License · Setup · easy

TLDR

Curated reading list of real-world machine learning case studies from companies like Airbnb, Netflix, and Google showing how they deployed ML systems in production.

Mindmap

mindmap
  root((applied-ml))
    What it does
      Real production case studies
      Company implementations
      Practical results
    Topics covered
      Data quality and engineering
      Feature stores
      Search and ranking
      Recommendation systems
      NLP and computer vision
      Model management
    Learning value
      Before-and-after decisions
      Why approaches worked
      Measurable outcomes
      Team structures
    Use cases
      Design your own system
      Learn from peers
      Avoid common pitfalls
      Validate architecture choices

Things people build with this

USE CASE 1

Review how Airbnb or Netflix solved a recommendation or ranking problem before building your own system.

USE CASE 2

Find documented failures and lessons learned from major companies to avoid repeating their mistakes.

USE CASE 3

Understand the full ML lifecycle from data engineering through model deployment by reading case studies across 30+ topic areas.

USE CASE 4

Learn what validation, A/B testing, and privacy techniques companies actually use in production rather than in theory.

Getting it running

Difficulty · easy · Time to first run · 5 min
Use freely for any purpose, including commercial use, as long as you keep the copyright notice.

In plain English

applied-ml is a curated reading list of papers, technical blog posts, and articles in which companies describe how they use machine learning and data science in real production systems. The focus is on practical, deployed applications rather than academic research: how the problem was framed, which techniques were tried, why certain approaches worked, and what measurable results were achieved.

The list is organized into over 30 topic areas covering the full lifecycle of a machine learning system. Topics include data quality and engineering, feature stores (centralized repositories for the input data fed to models), search and ranking systems, recommendation engines, natural language processing, computer vision, anomaly detection, forecasting, embeddings (a technique for representing data as numerical vectors), reinforcement learning, model management, and the working practices and organizational structures of ML teams. There are also sections on validation and A/B testing, privacy-preserving techniques, and documented failures. Contributing companies include Airbnb, Uber, Netflix, Google, Meta, LinkedIn, Pinterest, Shopify, DoorDash, Lyft, and many others.

The list serves as a reference for a data scientist or machine learning engineer who is designing a system and wants to learn how comparable organizations approached the same type of problem before committing to an architecture. It answers the question "how did others actually do this in production?" rather than "what does the theory say?"

Copy-paste prompts

Prompt 1
I'm building a recommendation system. Show me the most relevant papers and case studies from applied-ml about how companies like Netflix or Airbnb approached this problem.
Prompt 2
What are the best practices for feature engineering and feature stores according to the applied-ml reading list? Give me 3 key takeaways from real company implementations.
Prompt 3
I need to set up model validation and A/B testing for my ML system. What does applied-ml say about how production teams at Google or Meta handle this?
Prompt 4
Show me documented ML failures from applied-ml and explain what went wrong and how companies fixed it.
Prompt 5
Which applied-ml case studies cover privacy-preserving machine learning techniques? I need to understand how companies protect user data in production.

Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.