explaingit

twitter/the-algorithm

73,250 Scala Audience · researcher Complexity · 5/5 Quiet License Setup · hard

TLDR

X's recommendation algorithm that decides which posts appear in your feed, notifications, and search results by ranking candidates based on engagement likelihood, author reputation, and user interests.

Mindmap

mindmap
  root((repo))
    What it does
      Ranks posts for feed
      Filters policy violations
      Scores user reputation
      Groups interest communities
    How it works
      Candidate gathering
      Ranking models
      Filtering layers
      Feed assembly
    Key components
      SimClusters
      TwHIN
      Tweepcred
      Navi
    Tech stack
      Scala
      Python
      Rust
      Bazel
    Use cases
      Study recommendation systems
      Understand feed algorithms
      Research algorithmic transparency
      Learn large-scale ranking

Things people build with this

USE CASE 1

Study how large-scale recommendation systems rank content for hundreds of millions of users.

USE CASE 2

Learn the architecture of a real-world feed algorithm with candidate generation, ranking, and filtering stages.

USE CASE 3

Research how reputation scoring and interest clustering influence what content users see.

USE CASE 4

Understand the infrastructure and machine learning components behind algorithmic content selection.

Tech stack

Scala · Python · Rust · Bazel

Getting it running

Difficulty · hard Time to first run · 1 day+

The code spans multiple languages (Scala, Python, Rust), builds with Bazel, and likely requires internal data, trained models, and complex infrastructure to run end to end.

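Licensed under AGPL-3.0. You may use, modify, and distribute the code, including commercially, as long as you keep the copyright notice, include a copy of the license, and release the source of any modified version under the same license, including versions you only run as a network service.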

In plain English

This is the source code for the recommendation algorithm that powers X (formerly Twitter). Its job is to decide which posts appear in your "For You" feed, which notifications you receive, and what shows up when you search or explore the platform. In short, it answers the question: out of hundreds of millions of posts, which ones should this specific user see right now?

The system works in several stages. First, candidate sources gather a large pool of potentially relevant posts from both accounts you follow and accounts you don't. Then ranking models score each candidate based on factors like how likely you are to engage with it, how reputable the author is, and whether it matches your interests. Finally, filtering layers remove content that violates policies or legal requirements before the final feed is assembled and delivered to you. Key internal components include SimClusters (which groups users into interest communities), TwHIN (which builds relationship maps between users and posts), and a PageRank-style reputation scorer called Tweepcred.

You would look at this repository if you are a researcher studying recommendation systems, a developer curious about how large-scale feed algorithms are structured, or someone interested in transparency around algorithmic content selection. It is not a standalone runnable application but rather a collection of services and machine learning jobs that require the broader X infrastructure to operate.

The primary languages are Scala and Python, with some Rust for high-performance model serving (a component called Navi). Build tooling uses Bazel. This is reference and study material, not a plug-and-play product.
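The staged flow described above (gather candidates, rank them, then filter) can be illustrated with a toy sketch. Nothing here is taken from the repository: the `Post` type, the function names, the 0.7/0.3 score weights, the `flagged` field, and the sample data are all hypothetical, interest match stands in for a learned engagement prediction, and the `reputation` function is a generic PageRank-style iteration rather than Tweepcred itself.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: str
    author: str
    topic: str
    flagged: bool = False  # hypothetical stand-in for a policy or legal filter


def reputation(follows: dict[str, list[str]], damping: float = 0.85,
               iters: int = 50) -> dict[str, float]:
    """Generic PageRank-style score over a follow graph (illustrative only)."""
    users = sorted(set(follows) | {v for vs in follows.values() for v in vs})
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iters):
        new = {u: (1.0 - damping) / n for u in users}
        for u in users:
            # Users who follow nobody spread their score evenly over everyone.
            targets = follows.get(u) or users
            share = damping * rank[u] / len(targets)
            for v in targets:
                new[v] += share
        rank = new
    return rank


def gather_candidates(posts: list[Post], following: set[str], user: str) -> list[Post]:
    """Stage 1: pool posts from followed and non-followed accounts."""
    in_network = [p for p in posts if p.author in following]
    out_of_network = [p for p in posts
                      if p.author not in following and p.author != user]
    return in_network + out_of_network


def score(post: Post, interests: dict[str, float], rep: dict[str, float]) -> float:
    """Stage 2: assumed weighting of interest match and author reputation."""
    return 0.7 * interests.get(post.topic, 0.0) + 0.3 * rep.get(post.author, 0.0)


def build_feed(posts: list[Post], user: str, follows: dict[str, list[str]],
               interests: dict[str, float], limit: int = 3) -> list[Post]:
    rep = reputation(follows)
    candidates = gather_candidates(posts, set(follows.get(user, [])), user)
    ranked = sorted(candidates, key=lambda p: score(p, interests, rep), reverse=True)
    visible = [p for p in ranked if not p.flagged]  # Stage 3: filtering
    return visible[:limit]


# Hypothetical sample data.
follows = {"ana": ["bo"], "bo": ["cy"], "cy": ["bo"], "dee": ["bo"]}
posts = [
    Post("p1", "bo", "scala"),
    Post("p2", "cy", "cooking"),
    Post("p3", "dee", "scala", flagged=True),
    Post("p4", "ana", "scala"),
]
feed = build_feed(posts, "ana", follows, {"scala": 1.0, "cooking": 0.1})
print([p.post_id for p in feed])  # ['p1', 'p2']
```

In this sketch the flagged post is dropped even though it would have ranked well, and the viewer's own post never enters the candidate pool. The real system replaces each stage with many services and learned models, so this only conveys the overall shape of the pipeline.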

Copy-paste prompts

Prompt 1
Walk me through how the-algorithm's ranking models decide which posts to show in the For You feed.
Prompt 2
Explain the role of SimClusters and TwHIN in X's recommendation system and how they group users and posts.
Prompt 3
Show me the candidate generation and filtering stages in the-algorithm and what factors determine which posts survive each stage.
Prompt 4
How does Tweepcred work as a reputation scorer in this recommendation system, and what impact does it have on feed ranking?
Prompt 5
What are the main differences between how the-algorithm ranks posts for the feed versus search results?

Generated 2026-05-18 · Model: sonnet-4-6 · Verify against the repo before relying on details.