Transform raw warehouse data into analytics-ready tables using plain SQL files
Run automated data quality checks to catch nulls, duplicates, and unexpected values after each run
Build dependency-aware data pipelines without writing custom orchestration code
Visualize how your data flows through a project as an auto-generated lineage diagram
Requires a connection to a supported data warehouse (Snowflake, BigQuery, Redshift, Postgres, etc.) before any models can run.
dbt (data build tool) is a command-line tool that helps data analysts transform raw data in a warehouse into clean, structured tables ready for analysis. Instead of writing complex scripts or building custom pipelines, analysts write plain SQL SELECT statements, and dbt turns those statements into actual tables or views in the database.

The central concept is a "model," which is just a SQL file that selects from other tables or models. Models can reference each other, so dbt tracks the order in which they need to run. If model B depends on model A, dbt knows to run A first. It can also render these relationships as a diagram, which helps teams understand how data flows through their project.

dbt also includes a testing layer so teams can verify that their data meets expectations, for example by checking that a column has no nulls or that every value in a field is unique. Running tests after each transformation run helps catch data quality problems early.

The open-source version (dbt Core) runs locally or in CI pipelines. A hosted option (dbt Cloud) adds collaboration features, scheduling, and a web interface. Both use the same model syntax, so it is straightforward to move between them.

The README is brief and links to external documentation for full usage details. An active community exists on Slack and the dbt Community Discourse forum.
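The dependency ordering and data checks described above can be illustrated with a small toy sketch. This is not dbt's implementation: it only mimics the idea using Python's standard library and an in-memory SQLite database. The model names (`stg_orders`, `customer_orders`), the `raw_orders` table, and the helper functions are all hypothetical, and the regular expression is a simplified stand-in for dbt's Jinja-based `{{ ref('...') }}` handling.

```python
import re
import sqlite3
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical models: name -> SELECT statement. The {{ ref('...') }} syntax
# imitates how dbt models point at each other; everything here is illustrative.
MODELS = {
    "customer_orders": (
        "SELECT customer_id, COUNT(*) AS order_count "
        "FROM {{ ref('stg_orders') }} GROUP BY customer_id"
    ),
    "stg_orders": "SELECT id AS order_id, customer_id FROM raw_orders",
}

# Simplified matcher for ref() calls (an assumption, not dbt's parser).
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"](\w+)['\"]\s*\)\s*\}\}")


def run_order(models):
    """Return model names ordered so that dependencies come first."""
    graph = {name: set(REF_PATTERN.findall(sql)) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


def build(conn, models):
    """Create each model as a view, replacing ref() calls with view names."""
    for name in run_order(models):
        sql = REF_PATTERN.sub(lambda m: m.group(1), models[name])
        conn.execute(f"CREATE VIEW {name} AS {sql}")


def count_nulls(conn, table, column):
    """Number of rows where the column is NULL (0 means the check passes)."""
    query = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(query).fetchone()[0]


def count_duplicates(conn, table, column):
    """Number of distinct values that occur more than once (0 means unique)."""
    query = (
        f"SELECT COUNT(*) FROM (SELECT {column} FROM {table} "
        f"GROUP BY {column} HAVING COUNT(*) > 1)"
    )
    return conn.execute(query).fetchone()[0]


def demo():
    """Load a tiny raw table, build the models, and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_orders (id INTEGER, customer_id INTEGER)")
    conn.executemany(
        "INSERT INTO raw_orders VALUES (?, ?)", [(1, 10), (2, 10), (3, 20)]
    )
    build(conn, MODELS)
    return conn


if __name__ == "__main__":
    conn = demo()
    print("run order:", run_order(MODELS))
    print("null order_ids:", count_nulls(conn, "stg_orders", "order_id"))
    print("duplicate order_ids:", count_duplicates(conn, "stg_orders", "order_id"))
```

Although `customer_orders` is listed first in the dictionary, the topological sort places `stg_orders` ahead of it because of the `ref()` call. In a real dbt project, models live in separate `.sql` files and checks such as `not_null` and `unique` are declared in YAML rather than written by hand; the project's documentation covers the actual syntax.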
Verify against the repo before relying on details.