Analysis updated 2026-07-03 · repo last pushed 2026-01-12
Build a click-through rate prediction model for an e-commerce site to rank products a user is likely to buy.
Train a personalized recommendation system for a streaming service using viewing history and content metadata.
Benchmark ad-ranking model performance against the MLPerf standard using the Criteo advertising dataset.
Scale a recommendation model to billions of data points using distributed multi-GPU training.
| | facebookresearch/dlrm | facebookresearch/videopose3d | karpathy/makemore |
|---|---|---|---|
| Stars | 4,048 | 4,036 | 4,010 |
| Language | Python | Python | Python |
| Last pushed | 2026-01-12 | 2022-12-10 | 2024-06-04 |
| Maintenance | Maintained | Dormant | Dormant |
| Setup difficulty | hard | moderate | easy |
| Complexity | 4/5 | 3/5 | 2/5 |
| Audience | developer | developer | researcher |
Figures from each repo's GitHub metadata at analysis time.
Requires a GPU and large datasets such as Criteo; distributed training needs multiple GPUs or machines.
This repository contains code for a deep learning model that predicts whether someone will click on an ad or product, which is useful for personalized recommendations and targeted advertising. Instead of treating all user data the same way, DLRM (Deep Learning Recommendation Model) handles two types of information differently: numerical features like age or price, and categorical features like product category or user ID.
The model splits the problem into two paths. One path processes numerical data through simple mathematical layers. The other path converts categorical features (like "user viewed category X") into dense numerical vectors called embeddings, then combines these vectors in ways that capture relationships between different categories. Finally, the two paths merge, and the combined information flows through additional layers to produce a single score: the probability that the user will click. The architecture is flexible: you can swap in different ways of combining the categorical embeddings depending on what makes sense for your problem.
People building recommendation systems, ad platforms, or e-commerce sites would use this code. For example, a streaming service could use DLRM to predict whether a user will watch a particular show based on their viewing history and show metadata. Facebook and other companies use variants of this approach at massive scale.
The repository includes test scripts to verify correctness, benchmark scripts to measure performance, and support for training on well-known datasets like Criteo's advertising data. It also supports distributed training across multiple GPUs or machines, which is essential when working with billions of data points. The implementation is provided in two versions, PyTorch (a popular deep learning framework) and Caffe2, so teams can choose whichever fits their infrastructure better. The code also handles real-world complexities like loading large datasets efficiently, saving and resuming training from checkpoints, and integrating with MLPerf, a standard benchmark for measuring machine learning system performance.
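The two-path design described above can be illustrated with a small, dependency-free sketch. This is not the repository's code: the class name `TinyDLRM`, its parameters, the single-layer paths, and the random untrained weights are all assumptions made for illustration, and pairwise dot products are used as just one possible way of combining the embeddings.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def linear_relu(x, weights):
    """One dense layer (one weight row per output unit) followed by ReLU."""
    return [max(0.0, dot(row, x)) for row in weights]


class TinyDLRM:
    """Hypothetical toy model: untrained weights, forward pass only."""

    def __init__(self, num_dense, table_sizes, dim, seed=0):
        rng = random.Random(seed)
        # Numerical path: maps dense features to a vector of size `dim`.
        self.bottom = make_matrix(dim, num_dense, rng)
        # Categorical path: one embedding table per categorical feature.
        self.tables = [make_matrix(size, dim, rng) for size in table_sizes]
        # The merged input is the dense vector plus one value per vector pair.
        num_vectors = 1 + len(table_sizes)
        num_pairs = num_vectors * (num_vectors - 1) // 2
        self.top = [rng.uniform(-0.1, 0.1) for _ in range(dim + num_pairs)]

    def predict(self, dense, categorical):
        dense_vec = linear_relu(dense, self.bottom)
        # Look up one embedding row per categorical index.
        embeddings = [table[i] for table, i in zip(self.tables, categorical)]
        vectors = [dense_vec] + embeddings
        # Assumed interaction: dot product of every distinct pair of vectors.
        pairs = [
            dot(vectors[i], vectors[j])
            for i in range(len(vectors))
            for j in range(i)
        ]
        # Merge both paths, then squash the score into a click probability.
        logit = dot(self.top, dense_vec + pairs)
        return 1.0 / (1.0 + math.exp(-logit))


# Three numerical features and two categorical features (table sizes 10 and 5).
model = TinyDLRM(num_dense=3, table_sizes=[10, 5], dim=4)
print(model.predict(dense=[0.5, 1.2, -0.3], categorical=[7, 2]))
```

A real model would stack several layers in each path and learn the weights from click data; the sketch only shows how the numerical and categorical paths merge into a single probability.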
Meta's deep learning model for predicting ad clicks and product recommendations, handling both numerical and categorical user data to estimate the probability a user will click.
Mainly Python. The stack also includes PyTorch and Caffe2.
Maintained — commit in last 6 months (last push 2026-01-12).
Apache 2.0: free for any use, including commercial, as long as you keep the license notice.
Setup difficulty is rated hard, with roughly 1h+ to a first successful run.
Mainly developer.
Verify against the repo before relying on details.