Analysis updated 2026-05-18
Record an AI agent's execution trace as a SQLite file and diff it against a previous run to detect dropped steps.
Set up a GitHub Action that automatically blocks a PR when an agent regresses from its golden baseline.
Detect tool-drop or looping regressions in a LangChain or OpenAI Agents SDK workflow using built-in anomaly rules.
Add reasoning observability to a plain Python AI pipeline without installing any third-party dependencies.
| | therealdk8890/dprovenancekitpython | a-bissell/unleash-lite | abhiinnovates/whatsapp-hr-assistant |
|---|---|---|---|
| Stars | 1 | 1 | 1 |
| Language | Python | Python | Python |
| Setup difficulty | easy | hard | hard |
| Complexity | 3/5 | 4/5 | 3/5 |
| Audience | developer | researcher | developer |
Figures from each repo's GitHub metadata at analysis time.
DProvenanceKit is a Python library that helps you catch when an AI agent quietly changes what it does between runs. When an agent drops a tool call, skips a verification step, or falls into a new loop, the change may never surface in the output text itself. This library records each run as a structured trace, lets you compare two runs side by side, and can block a pull request in your CI pipeline if the agent's behavior has drifted from a known-good baseline.

The core workflow is: record a run, save it to a file, then diff later runs against that saved golden run. You wrap your existing code with a simple context manager. The trace file is an SQLite database stored on your machine, with no external service required. When you run the same workflow again after making changes, the library compares the two traces and reports which steps were dropped, added, or reordered.

It integrates with popular AI frameworks including LangChain, LangGraph, the OpenAI Agents SDK, LlamaIndex, and CrewAI through optional adapters. It also accepts traces from any OpenTelemetry-instrumented system. The core library has zero dependencies beyond the Python standard library, so adding it to an existing project is straightforward.

For teams using automated pipelines, the library ships a command-line gate tool and a ready-made GitHub Action. These can fail a pull request automatically when a candidate run structurally diverges from the golden baseline, posting a diff comment to the PR. Two built-in anomaly rules detect the most common regressions: a tool call being dropped and an agent entering a loop where it was not looping before.

The library is a Python port of an original Swift implementation. A separate hosted web dashboard for visualizing traces and managing multiple runs is available as a commercial service at dprovenance.dev, but the local open-source library works independently.
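The record, save, and diff workflow described above can be illustrated with the Python standard library alone. The sketch below is not DProvenanceKit's API: `record_run`, `load_run`, `diff_runs`, and the `steps` table layout are hypothetical names chosen for illustration, and a real trace would likely carry more than step names.

```python
import os
import sqlite3
import tempfile
from contextlib import contextmanager
from difflib import SequenceMatcher


@contextmanager
def record_run(db_path, run_id):
    """Hypothetical recorder: collect step names, then write them to SQLite."""
    steps = []
    try:
        yield steps.append  # the caller logs each step by name
    finally:
        conn = sqlite3.connect(db_path)
        with conn:  # commits the transaction on success
            conn.execute(
                "CREATE TABLE IF NOT EXISTS steps "
                "(run_id TEXT, seq INTEGER, name TEXT)"
            )
            conn.executemany(
                "INSERT INTO steps VALUES (?, ?, ?)",
                [(run_id, i, name) for i, name in enumerate(steps)],
            )
        conn.close()


def load_run(db_path, run_id):
    """Read one run's step names back in recorded order."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT name FROM steps WHERE run_id = ? ORDER BY seq", (run_id,)
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


def diff_runs(golden, candidate):
    """Return steps dropped from and added to the candidate, relative to golden."""
    dropped, added = [], []
    matcher = SequenceMatcher(a=golden, b=candidate, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            dropped.extend(golden[i1:i2])
        if tag in ("insert", "replace"):
            added.extend(candidate[j1:j2])
    return {"dropped": dropped, "added": added}


# Usage: record a golden run, then a candidate that skips the "verify" step.
db = os.path.join(tempfile.mkdtemp(), "trace.sqlite")
with record_run(db, "golden") as log:
    for step in ["plan", "search_tool", "verify", "answer"]:
        log(step)
with record_run(db, "candidate") as log:
    for step in ["plan", "search_tool", "answer"]:
        log(step)

report = diff_runs(load_run(db, "golden"), load_run(db, "candidate"))
print(report)  # {'dropped': ['verify'], 'added': []}
```

In this simplified diff, a reordered step shows up in both the `dropped` and `added` lists; reporting reorders separately, as the library is described as doing, would take extra bookkeeping.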
A Python library that records AI agent execution traces, diffs them against a golden baseline, and can block CI pipelines when an agent silently drops a step or adds a loop.
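As a rough illustration of how a CI gate could turn the two regressions named above (a dropped step and a new loop) into a pass/fail signal, here is a minimal standard-library sketch. The function names, the repeat threshold of 3, and the definition of a loop as back-to-back repeats of one step are all assumptions made for this example, not the library's actual rules.

```python
def tool_drops(golden, candidate):
    """Step names present in the golden run but absent from the candidate."""
    return sorted(set(golden) - set(candidate))


def longest_repeat(steps):
    """Length of the longest run of one step repeated back to back."""
    best = run = 0
    prev = object()  # sentinel that never equals a real step name
    for step in steps:
        run = run + 1 if step == prev else 1
        best = max(best, run)
        prev = step
    return best


def new_loop(golden, candidate, threshold=3):
    """True only if the candidate loops and the golden run did not."""
    return longest_repeat(candidate) >= threshold > longest_repeat(golden)


def gate(golden, candidate):
    """Return (exit_code, problems): exit code 0 passes, 1 fails the check."""
    problems = [f"dropped step: {name}" for name in tool_drops(golden, candidate)]
    if new_loop(golden, candidate):
        problems.append("new loop detected")
    return (1 if problems else 0), problems


# Usage: the candidate drops "verify" and repeats "search" three times.
golden = ["plan", "search", "verify", "answer"]
candidate = ["plan", "search", "search", "search", "answer"]
code, problems = gate(golden, candidate)
print(code, problems)
```

In a CI job, passing `code` to `sys.exit` would make the step fail whenever `problems` is non-empty, which is the general mechanism by which a gate can block a pull request.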
Mainly Python. The stack also includes SQLite and OpenTelemetry.
Setup difficulty is rated easy, with roughly 30 minutes to a first successful run.
Audience: mainly developers.
Verify against the repo before relying on details.