Attribute slow Postgres queries to the request or background job that fired them
Tag every BigQuery call with tenant and region for audit logs
Capture AI agent generated SQL with actor and operation fields for review
Forward query context into OpenTelemetry traces for end-to-end debugging
You must call datacontext.configure once at startup and wrap the right data-access functions; otherwise, no events fire.
DataContext is a Python library for figuring out which part of an application caused a specific database query. In a live service, by the time a slow or strange query reaches the logs or the database team, the link back to the request, background job, or AI agent that triggered it has usually been lost. DataContext attaches that runtime context to every query as it happens, so the people on call can answer the question of who actually sent it.
The library is installed from PyPI with pip install datacontext, and it has optional extras for OpenTelemetry, SQLAlchemy, PostgreSQL, BigQuery, Snowflake, Dagster, and dbt. To use it, the developer calls datacontext.configure once at startup, naming the service, the environment, and the data-access function that should be wrapped. From then on, every call to that function emits one finished event when it returns or raises an exception. The author stresses that wrappers keep return values and exceptions exactly as they were, and that if the library itself fails the application keeps running.
Each emitted event is a JSON object with a stable shape: the event name, start and end timestamps, the service and environment, the database system and client name, a SHA-256 fingerprint of the query, a sanitized version of the SQL, the duration in milliseconds, and the file, line, function, and short stack of the code that issued it. When the developer wraps a block of code in a context.use(...) block, fields like operation, actor, request_id, tenant, and region are added to every query captured inside it. If OpenTelemetry is active, the trace_id and span_id are attached as well.
DataContext is described as deliberately small and early. The supported instrumentation today is manual helpers, function wrapping, native SQLAlchemy, PostgreSQL, BigQuery, and Snowflake hooks, plus Dagster and dbt execution-context attribution. Output goes to JSONL files, a callback function, or an OpenTelemetry-oriented sink.
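To make the event shape and the effect of a context block concrete, here is a minimal standard-library sketch. It does not use DataContext itself and should not be read as its implementation: the helper names (use, sanitize, build_event), the event key names, the sanitizing rule, and the choice to fingerprint the sanitized text rather than the raw SQL are all assumptions made for illustration.

```python
import contextvars
import hashlib
import json
import re
from contextlib import contextmanager

# Hypothetical stand-in for a context store; not DataContext's real API.
_fields = contextvars.ContextVar("fields", default={})


@contextmanager
def use(**fields):
    """Add fields to every event built inside the block, then restore the old set."""
    token = _fields.set({**_fields.get(), **fields})
    try:
        yield
    finally:
        _fields.reset(token)


def sanitize(sql):
    """Assumed rule: replace string and numeric literals with '?', collapse whitespace."""
    sql = re.sub(r"'(?:[^']|'')*'", "?", sql)
    sql = re.sub(r"\b\d+(?:\.\d+)?\b", "?", sql)
    return " ".join(sql.split())


def build_event(sql, duration_ms):
    """Build one event dict; key names here are illustrative, not the library's schema."""
    sanitized = sanitize(sql)
    return {
        "event": "query.finished",
        # Assumption: the fingerprint is taken over the sanitized text, so queries
        # that differ only in literal values share one fingerprint.
        "query_fingerprint": hashlib.sha256(sanitized.encode("utf-8")).hexdigest(),
        "sanitized_sql": sanitized,
        "duration_ms": duration_ms,
        **_fields.get(),
    }


if __name__ == "__main__":
    with use(operation="checkout", tenant="acme", region="eu-west-1"):
        event = build_event("SELECT * FROM orders WHERE id = 42", 12.5)
    print(json.dumps(event))  # one JSON object per line, as in a JSONL file
```

In this sketch the context is held in a contextvars.ContextVar, so fields set in one thread or asyncio task do not leak into another. Whether DataContext stores its context the same way should be checked against the repo.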
The maintainers say further database clients and ORMs will be prioritized based on real requests in GitHub Discussions and issues. The production-behavior section lists the safety rules: wrappers do not swallow exceptions, capture failures fall back to a minimal event, sink failures are logged and dropped, and raw SQL is opt-in while sanitized text is the default.
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.