explaingit

cortexproject/cortex

5,801
Go
Audience · ops devops
Complexity · 5/5
Setup · hard

TLDR

A horizontally scalable, multi-tenant metrics storage backend that extends Prometheus with long-term retention on cloud object storage like S3 or GCS.

Mindmap

mindmap
  root((cortex))
    What it does
      Metrics storage
      Long term data
      Multi tenant
    Architecture
      Horizontal scale
      High availability
      Object storage
    Tech
      Prometheus
      S3 and GCS
      OpenTelemetry
    Operations
      CNCF project
      Community calls
      Slack support

Things people build with this

USE CASE 1

Store Prometheus metrics from multiple teams in one cluster with each team's data fully isolated.

USE CASE 2

Retain months or years of metrics cost-effectively by writing data to Amazon S3 or Google Cloud Storage.

USE CASE 3

Scale metrics ingestion horizontally by adding nodes as data volume grows, without downtime.

USE CASE 4

Ingest OpenTelemetry metrics alongside Prometheus-format data into the same storage backend.
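
Use case 1 depends on every request carrying a tenant ID, which Cortex reads from the `X-Scope-OrgID` HTTP header. The sketch below only builds the URL and headers for a tenant-scoped query and sends nothing. The base address, the `/prometheus` path prefix, the tenant names, and the helper `tenant_query` are assumptions for illustration, so check them against your own deployment.

```python
from urllib.parse import urlencode

# Assumption: a hypothetical address for a Cortex query endpoint.
CORTEX_URL = "http://cortex.example.internal:9009"


def tenant_query(tenant_id: str, promql: str) -> tuple[str, dict[str, str]]:
    """Return the URL and headers for an instant query scoped to one tenant."""
    params = urlencode({"query": promql})
    # Assumption: the Prometheus-compatible API is served under /prometheus.
    url = f"{CORTEX_URL}/prometheus/api/v1/query?{params}"
    # The tenant ID travels in the X-Scope-OrgID header on every request.
    headers = {"X-Scope-OrgID": tenant_id}
    return url, headers


# Same query, two tenants: only the header differs.
url_a, headers_a = tenant_query("team-a", "up")
url_b, headers_b = tenant_query("team-b", "up")
```

The returned pair can be handed to any HTTP client. Because the header alone selects the tenant, it is usually set by a trusted proxy or gateway rather than by end users.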

Tech stack

Go, Prometheus, Amazon S3, Google Cloud Storage, Azure, OpenTelemetry

Getting it running

Difficulty · hard
Time to first run · 1 day+

Requires multiple cloud infrastructure components, object storage configuration, and deep Prometheus knowledge.

In plain English

Cortex is a system for storing and querying metrics data at large scale. It is designed to work as a long-term backend for Prometheus, a widely used tool that collects measurements such as CPU usage, request counts, and error rates from servers, applications, and infrastructure. Prometheus on its own is best suited to short-term storage on a single machine. Cortex extends it by storing that data reliably across multiple machines and keeping it for much longer periods.

Cortex is built around four main properties: horizontal scalability, high availability, multi-tenancy, and long-term storage. Horizontal scalability means that as data volume grows, you add more machines rather than upgrading a single one, and the work is distributed across them. High availability means data is replicated, so losing one machine does not cause data loss. Multi-tenancy means a single Cortex cluster can accept metric streams from multiple independent teams or customers and keep their data isolated from one another.

For long-term storage, Cortex supports writing data to object storage systems including Amazon S3, Google Cloud Storage, Microsoft Azure, and OpenStack Swift. This keeps costs manageable compared to keeping everything on local disk. Cortex also supports ingesting OpenTelemetry metrics in addition to Prometheus-format data.

Cortex is a CNCF incubating project, meaning it is hosted by the Cloud Native Computing Foundation. It is regularly featured at KubeCon conferences, with talks available going back to 2016. Community calls are held every four weeks, and support is available through a Slack channel and a mailing list.
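
The scaling and replication ideas above can be illustrated with a toy consistent-hash ring. This is a simplified sketch, not Cortex's own ring implementation: the `Ring` class, the token count, the node names, and the replication factor of 3 are all assumptions chosen for illustration.

```python
import bisect
import hashlib


class Ring:
    """Toy consistent-hash ring that maps a series to several distinct nodes."""

    def __init__(self, nodes, tokens_per_node=64, replication_factor=3):
        self.replication_factor = replication_factor
        self._node_count = len(set(nodes))
        # Each node owns many pseudo-random tokens so load spreads evenly.
        self._ring = sorted(
            (self._hash(f"{node}-{i}"), node)
            for node in nodes
            for i in range(tokens_per_node)
        )
        self._tokens = [token for token, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        # First 4 bytes of SHA-256 as a 32-bit token (illustrative choice).
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")

    def owners(self, tenant_id: str, series: str) -> list:
        """Walk clockwise from the series token, collecting distinct nodes."""
        want = min(self.replication_factor, self._node_count)
        key = self._hash(f"{tenant_id}/{series}")
        start = bisect.bisect_right(self._tokens, key)
        found = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in found:
                found.append(node)
                if len(found) == want:
                    break
        return found


nodes = ["ingester-1", "ingester-2", "ingester-3", "ingester-4"]
ring = Ring(nodes)
owners = ring.owners("team-a", "http_requests_total")

# If the first owner fails, the remaining replicas still hold the series.
survivors = [node for node in owners if node != owners[0]]
```

Including the tenant ID in the hashed key means two tenants with the same series name can land on different nodes. Adding a node adds tokens to the ring, so only a fraction of series move to the newcomer.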

Copy-paste prompts

Prompt 1
Set up Cortex with Amazon S3 as the long-term storage backend for my Prometheus instance. Show the minimal config blocks I need.
Prompt 2
Configure Cortex multi-tenancy so that team A and team B's metrics are stored separately in the same cluster with no data leakage.
Prompt 3
Walk me through the Cortex architecture: which components handle ingestion, querying, and long-term storage independently?
Prompt 4
Migrate from single-node Prometheus to Cortex for long-term retention. What are the main steps, and where is the risk of data loss highest?
Prompt 5
How does Cortex handle high availability? What exactly happens when one ingester node fails?

Verify against the repo before relying on details.