explaingit

open-metadata/openmetadata

13,906 · TypeScript · Audience · data · Complexity · 4/5 · Setup · hard

TLDR

OpenMetadata is an open-source platform that maps all your company's data sources (databases, dashboards, pipelines, and ML models) into a single searchable catalog, adding context about quality, ownership, and data lineage so teams and AI tools can find and trust data faster.

Mindmap

mindmap
  root((OpenMetadata))
    What it does
      Metadata catalog
      Data lineage
      Quality monitoring
    Data Sources
      120 plus services
      Databases and warehouses
      ML models
    Semantics
      Glossaries
      Business concepts
      PII tagging
    AI Integration
      MCP server
      Context for AI tools

Things people build with this

USE CASE 1

Search and discover data assets across 120+ connected services without tracking down data owners manually.

USE CASE 2

Audit column-level data lineage to see exactly where values come from and what downstream systems would break if something changed.

USE CASE 3

Monitor data quality and freshness with automated checks on your most critical datasets.

USE CASE 4

Plug AI assistants into your data catalog via the built-in MCP server to give them business context alongside raw schema.
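
The lineage audit in use case 2 comes down to walking a dependency graph. Below is a minimal sketch of that idea in plain Python. It does not call OpenMetadata, and the `LINEAGE` mapping, the column names, and the `downstream` helper are all hypothetical, chosen only to show how "what would break if this column changed" can be answered once column-level edges are known.

```python
from collections import deque

# Hypothetical column-level lineage: each key feeds the columns in its list.
# These names are illustrative only and are not taken from OpenMetadata.
LINEAGE = {
    "raw.orders.amount": ["staging.orders.amount_usd"],
    "staging.orders.amount_usd": ["marts.revenue.total", "dash.sales.revenue_chart"],
    "marts.revenue.total": ["dash.finance.kpi"],
}


def downstream(column, lineage):
    """Return every column reachable from `column`, in breadth-first order."""
    visited = {column}  # guards against cycles in the lineage graph
    ordered = []
    queue = deque([column])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in visited:
                visited.add(child)
                ordered.append(child)
                queue.append(child)
    return ordered


impacted = downstream("raw.orders.amount", LINEAGE)
print(impacted)
```

Here `impacted` lists the staging column first, then the mart and dashboard columns that depend on it. A real catalog would load these edges from its lineage store rather than a hard-coded dictionary.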

Tech stack

TypeScript

Getting it running

Difficulty · hard · Time to first run · 1h+

Requires running multiple components (metadata server, search engine, database); the Docker Compose quickstart simplifies initial setup.

In plain English

OpenMetadata is an open-source platform for keeping track of an organization's data. In a large company, information lives in many separate places: databases, data warehouses, dashboards, reports, pipelines, and machine-learning models. OpenMetadata does not store that data itself. Instead it collects metadata, which is data about the data, such as what tables exist, what each column means, who owns it, where it came from, and how fresh and trustworthy it is. It pulls this together into a single connected map the README calls a metadata knowledge graph.

The README frames much of this around making the information usable by both people and AI assistants. Its argument is that connecting an AI tool straight to a raw database only gives it the bare structure, not the meaning or the context: whether a dataset can be trusted, who is responsible for it, or what other systems rely on it. OpenMetadata aims to supply that missing context so users and AI can find, understand, and safely use data.

The platform groups its work into a few areas. Context covers the technical facts about each data asset plus quality test results, freshness checks, and lineage, which is the record of where data flows from and to. It tracks lineage even down to the level of individual columns, so you can see what might break if one column changes. Semantics adds business meaning on top, letting teams define shared vocabularies (glossaries), business concepts like Customer or Revenue, metrics, and classification tags such as PII or Confidential for sensitive information.

The README says OpenMetadata connects to more than 120 data services and offers search, APIs, and software development kits so other programs can read and write this metadata. It also mentions an MCP server, a standard way to plug AI assistants into the catalog. Common uses include data discovery, data quality monitoring (called observability), and data governance, which is the practice of controlling who can use what data and how.

The document ends with quickstart, documentation, community, contributing, and license sections. The project's code is written mainly in TypeScript.
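
To make "metadata, which is data about the data" concrete, here is a small sketch of what one catalog entry might carry and how ownership, tags, and freshness can be queried. It is an illustration under stated assumptions, not OpenMetadata's schema or API: the `Asset` class, its field names, the helper functions, and the sample entries are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class Asset:
    """A hypothetical catalog entry; fields are illustrative, not OpenMetadata's schema."""
    name: str
    owner: str
    last_updated: datetime
    tags: set = field(default_factory=set)


def with_tag(assets, tag):
    """Names of assets carrying the given classification tag."""
    return [a.name for a in assets if tag in a.tags]


def stale(assets, now, max_age):
    """Names of assets whose last update is older than max_age."""
    return [a.name for a in assets if now - a.last_updated > max_age]


# Fixed reference time so the example is reproducible.
NOW = datetime(2024, 1, 10, tzinfo=timezone.utc)
CATALOG = [
    Asset("analytics.customers", "crm-team", NOW - timedelta(hours=2), {"PII"}),
    Asset("analytics.orders", "sales-team", NOW - timedelta(days=3), {"Confidential"}),
    Asset("analytics.web_events", "growth-team", NOW - timedelta(hours=30), {"PII"}),
]

pii_assets = with_tag(CATALOG, "PII")
stale_assets = stale(CATALOG, NOW, timedelta(days=1))
print(pii_assets, stale_assets)
```

The point of the sketch is that none of these questions touch the underlying rows. Everything is answered from the metadata record alone, which is the kind of context the README argues a raw database connection cannot give an AI tool.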

Copy-paste prompts

Prompt 1
Using the OpenMetadata REST API, write a Python script that lists all datasets tagged as PII, showing the owner, last-updated date, and the service they belong to.
Prompt 2
Write an OpenMetadata YAML ingestion config that connects to a PostgreSQL database named 'analytics' and ingests table-level metadata including column descriptions.
Prompt 3
Help me design an OpenMetadata glossary for an e-commerce company with at least five business terms like Customer, Order, Revenue, Churn, and Funnel. Include definitions and related terms.
Prompt 4
I want to set up data quality tests in OpenMetadata on a table called 'orders'. Show me how to define a freshness check and a column completeness check using the UI or API.

Verify against the repo before relying on details.