Study how Twitter's For You feed ranking model selects and orders content to show to users
Use TwHIN embeddings as a starting point for building your own social media recommendation system
Adapt the Heavy Ranker model architecture for a custom large-scale content ranking task
Requires a Linux machine with an Nvidia GPU; torchrec does not run on other platforms without workarounds.
This repository contains open-sourced machine learning models that Twitter uses to power parts of its recommendation system. The code covers two specific models: the Heavy Ranker that decides what shows up in the For You feed on the home timeline, and TwHIN embeddings, which are a way of representing Twitter users and content as numerical vectors for use in recommendation tasks. A research paper on TwHIN is linked from the README for anyone who wants the technical background.

The project is written in Python and is intended to run inside a Python virtual environment on Linux machines. It also depends on torchrec, a library for large-scale recommendation systems that works best with an Nvidia GPU. If you do not have a Linux machine with an Nvidia GPU, running this code locally will likely require extra workarounds the README does not cover.

Setup is handled by a single shell script, and each sub-project within the repository has its own README with more specific instructions for running that model. The top-level README is brief and points readers to those individual sub-project folders for details. The README is sparse overall. It identifies what is included and how to get started at a high level, but does not describe in plain terms how the ranking or embedding models work, what inputs they take, or how they were trained. Readers who want deeper context would need to explore the sub-project folders and the linked research paper directly.

This repository is primarily useful to people with a machine learning background who want to study or adapt the actual models Twitter uses. It is not a product users interact with directly, and it is not a tool for general-purpose use without significant technical knowledge.
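To make the idea of "users and content as numerical vectors" concrete, here is a minimal sketch of one common way such embeddings can be used in a recommendation task: score each candidate item by its dot product with a user vector and keep the highest-scoring items. This is an illustration only, not code from the repository. The function name `rank_candidates`, the vector sizes, and the random vectors are all hypothetical, and the repository's own models may score candidates quite differently.

```python
import numpy as np


def rank_candidates(user_vec, item_vecs, top_k=3):
    """Rank items by dot-product similarity to a user vector.

    Hypothetical helper for illustration; not part of the repository.
    Returns (indices, scores) for the top_k items, best first.
    """
    user_vec = np.asarray(user_vec, dtype=float)
    item_vecs = np.asarray(item_vecs, dtype=float)
    # One score per item: higher means the item vector points in a
    # direction more similar to the user vector.
    scores = item_vecs @ user_vec
    # Sort by descending score; a stable sort keeps ties in input order.
    order = np.argsort(-scores, kind="stable")[:top_k]
    return order.tolist(), scores[order].tolist()


# Stand-in embeddings: random vectors, not real TwHIN embeddings.
rng = np.random.default_rng(0)
user_vec = rng.normal(size=8)
item_vecs = rng.normal(size=(5, 8))

top_items, top_scores = rank_candidates(user_vec, item_vecs)
print(top_items, top_scores)
```

In a real system the vectors would come from a trained embedding model rather than a random generator, and a simple similarity score like this would more plausibly serve as one input to a larger ranking model than as the final ordering.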
Verify against the repo before relying on details.