explaingit

gauravs19/cloud-native-observability

Analysis updated 2026-05-18

Stars · 3 · Audience · ops devops · Complexity · 2/5 · Setup · easy

TLDR

A vendor-neutral reference catalog of 316 cloud observability metrics and alert rules across 41 categories, with importable Grafana dashboards and Prometheus rules for Azure, AWS, and GCP.

Mindmap

mindmap
  root((repo))
    What it is
      Metrics reference
      Alert rule catalog
      316 metrics total
    Frameworks
      RED method
      USE method
      Golden Signals
      DORA metrics
    Components
      Grafana dashboards
      Prometheus alerts
      Cloud service map
    Coverage
      18 stack layers
      7 cross-cutting dims
      Azure AWS GCP

What do people build with it?

USE CASE 1

Use the catalog as a checklist when setting up monitoring for a new cloud service or Kubernetes cluster

USE CASE 2

Import the Grafana dashboards and Prometheus alert rules as a starting baseline for your observability stack

USE CASE 3

Find the right Azure, AWS, or GCP service to collect a specific metric using the cloud service map in Part D

USE CASE 4

Review AI/LLM serving metrics or FinOps cost signals that are often missing from standard monitoring setups

What is it built with?

Grafana · Prometheus · OpenTelemetry · YAML

How does it compare?

Repo · gauravs19/cloud-native-observability · 0marildo/imago · abdurrafey237/rag-chatbot
Stars · 3 · 3 · 3
Language · (not listed) · Python · Jupyter Notebook
Setup difficulty · easy · easy · moderate
Complexity · 2/5 · 2/5 · 3/5
Audience · ops devops · general · general

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · easy · Time to first run · 5 min

There is no code to install. Import the Grafana JSON dashboards or copy the Prometheus YAML alert rules directly into your existing stack.
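If you prefer to script the dashboard import rather than use the Grafana UI, one option is Grafana's HTTP API, which accepts a dashboard model wrapped in a small JSON envelope at `POST /api/dashboards/db`. The sketch below only builds that envelope and does not send it. It assumes the repo's dashboard files are plain dashboard models, which this page does not confirm. The function name `build_import_payload` and the sample dashboard are hypothetical.

```python
import json


def build_import_payload(dashboard_json: str, overwrite: bool = False) -> dict:
    """Wrap a dashboard model for Grafana's POST /api/dashboards/db endpoint.

    Hypothetical helper: it prepares the request body only and performs no I/O.
    """
    dashboard = json.loads(dashboard_json)
    if not isinstance(dashboard, dict):
        raise ValueError("dashboard JSON must be an object")
    # A null id asks Grafana to create a new dashboard instead of
    # updating whichever dashboard had that numeric id in the source instance.
    dashboard["id"] = None
    return {"dashboard": dashboard, "overwrite": overwrite}


# Illustrative input only; the uid and title are made up for this example.
sample = '{"id": 42, "uid": "example-uid", "title": "Example", "panels": []}'
payload = build_import_payload(sample)
print(json.dumps(payload, indent=2))
```

The resulting body can then be posted with whatever HTTP client and authentication your Grafana instance already uses.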

In plain English

Cloud-Native Observability is a reference catalog that lists what to measure and how to alert in a modern cloud-based software system. It is not a tool you run but a document you read and adapt. The catalog covers 316 individual metrics across 41 categories, spanning everything from frontend pages and API gateways to databases, Kubernetes clusters, AI models, and business KPIs. Metric names follow OpenTelemetry and Prometheus naming conventions, so they translate directly into most monitoring setups.

The catalog is organized around three well-known frameworks for deciding which metrics matter. RED measures request rate, error rate, and response duration. USE measures utilization, saturation, and errors for resources like CPU and memory. Golden Signals is a simpler set of four key indicators (latency, traffic, errors, and saturation) developed for large-scale systems. Each metric in the catalog has a label showing which framework it belongs to, plus a recommended alert action: page a human immediately, create a ticket for follow-up, or just display on a dashboard without alerting.

The repository also includes ready-to-import Grafana dashboard files and Prometheus alert rule files, so you can get a working baseline quickly rather than building from scratch. A cloud service map in the catalog links each category to the matching services on Azure, AWS, and Google Cloud, so you can find the right monitoring source for whatever cloud you use.

A companion project called NFR Advisor helps you decide which metrics from this catalog apply to a specific system. It generates service level objectives for each requirement and links them back to the relevant metrics here. Together they form a starting checklist for teams setting up observability for the first time or auditing an existing monitoring setup.
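To make the RED framework concrete, here is a minimal sketch that computes request rate, error ratio, and a duration percentile from a batch of request records. It is not taken from the repo. The `Request` shape, the `red_summary` function, and the choice of a nearest-rank 95th percentile are all assumptions made for illustration. In practice these values usually come from Prometheus queries rather than application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """One observed request (hypothetical shape, for illustration only)."""
    duration_s: float  # response time in seconds
    is_error: bool     # True if the request failed, e.g. an HTTP 5xx


def red_summary(requests: list[Request], window_s: float) -> dict[str, float]:
    """Compute Rate, Errors, and Duration over a window of `window_s` seconds."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    total = len(requests)
    if total == 0:
        return {"rate_per_s": 0.0, "error_ratio": 0.0, "p95_duration_s": 0.0}
    errors = sum(1 for r in requests if r.is_error)
    durations = sorted(r.duration_s for r in requests)
    # Nearest-rank p95: ceil(0.95 * total), done in integers to avoid float rounding.
    rank = (95 * total + 99) // 100
    return {
        "rate_per_s": total / window_s,         # R: requests per second
        "error_ratio": errors / total,          # E: fraction of failed requests
        "p95_duration_s": durations[rank - 1],  # D: 95th percentile latency
    }


# Twenty synthetic requests over a 10 second window, two of them failing.
sample = [Request(duration_s=0.01 * i, is_error=(i in (5, 10))) for i in range(1, 21)]
print(red_summary(sample, window_s=10.0))
```

USE follows the same pattern but is computed per resource (CPU, memory, disk) rather than per request stream.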

Copy-paste prompts

Prompt 1
I'm setting up monitoring for a Kubernetes-hosted API. Which metrics from the cloud-native observability catalog should I start with, and how do I choose between RED, USE, and Golden Signals?
Prompt 2
How do I import the Grafana dashboards from this repo and what do the RED-by-endpoint and USE-by-resource dashboards show?
Prompt 3
I want to set up multi-window burn-rate SLO alerts with Prometheus. Walk me through the alert rules in this catalog and when to use page vs ticket vs watch severity.
Prompt 4
How does the Part D cloud service map work? Show me how to find the Azure services for monitoring a relational database workload.

Frequently asked questions

What is cloud-native-observability?

A vendor-neutral reference catalog of 316 cloud observability metrics and alert rules across 41 categories, with importable Grafana dashboards and Prometheus rules for Azure, AWS, and GCP.

How hard is cloud-native-observability to set up?

Setup difficulty is rated easy, with roughly 5 minutes to a first successful run.

Who is cloud-native-observability for?

Mainly operations and DevOps teams.


Verify against the repo before relying on details.