kqiu10/sherlog

★ 24 · Python · Audience: developer · Complexity: 3/5 · License: MIT · Setup: moderate

Mindmap

mindmap
  root((sherlog))
    How it works
      Read failure log
      Diagnose root cause
      Propose code patch
      Run tests in sandbox
    AI agents
      Diagnosis agent
      Fix proposal agent
      Verification agent
    Tech stack
      Python
      LangGraph
      Claude AI
      PostgreSQL
      Docker
    Safety
      Read-only source access
      Temp copy for testing
      Original never modified


Things people build with this

USE CASE 1

Feed a broken CI build log into Sherlog and get a tested, confirmed code fix back automatically

USE CASE 2

Set up automated bug triage that diagnoses recurring failure patterns from stored past incidents

USE CASE 3

Integrate Sherlog into a development workflow to reduce time spent manually reading error logs

USE CASE 4

Use the self-correcting verification loop to validate AI-proposed patches before applying them to production code

Tech stack

Python · LangGraph · Anthropic Claude · PostgreSQL · Docker · uv

Getting it running

Difficulty · moderate

Time to first run · 30 min

Requires Python 3.11+, Docker for PostgreSQL, and a paid Anthropic API key.
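The requirements above can be checked before installing. The following is a minimal sketch, not part of Sherlog: `preflight` is a hypothetical helper, and `ANTHROPIC_API_KEY` is the variable the Anthropic Python SDK reads by default. Whether Sherlog reads the same variable is an assumption to confirm in the repo.

```python
import os
import shutil
import sys
from typing import Mapping


def preflight(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return a list of unmet prerequisites (empty if everything looks fine)."""
    problems = []
    # The page states Python 3.11 or later is required.
    if sys.version_info < (3, 11):
        problems.append("Python 3.11+ is required")
    # Docker is needed to run the PostgreSQL container.
    if shutil.which("docker") is None:
        problems.append("docker was not found on PATH")
    # Assumed variable name; check the repo for the one Sherlog actually uses.
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    return problems
```

Calling `preflight()` with no arguments inspects the current environment. It only checks that the pieces are present, not that the Docker daemon is running or that the key is valid.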

License: MIT. Use freely for any purpose, including commercial use, as long as you keep the copyright notice.

In plain English

Sherlog is a tool that analyzes application failure logs, identifies why something broke, proposes a code fix, and then actually tests that fix to confirm it works. It uses multiple AI agents working together, built with a library called LangGraph and powered by Anthropic's Claude. One agent handles diagnosis, another proposes a fix, and a third verifies the result by running the project's existing test suite against a temporary copy of the code.

The process starts when you give Sherlog a failure log from a broken build or crashed application. The tool can retrieve similar past failures from a database, read the actual source code files involved through read-only access (so it cannot change your code during investigation), and identify the root cause. It then proposes a specific change: which file, which line to remove, and what to replace it with.

The verification step is what makes this different from asking an AI to look at a bug. Instead of an AI opinion about whether a fix seems right, Sherlog copies your project to a temporary folder, applies the proposed patch, and runs your actual test command. If the tests pass, the fix is accepted. If they fail, the loop feeds the failure back and tries again, up to a configurable limit. The original project is never modified. The README reports benchmark results where the test-based verifier caught all broken fixes in a 14-case set, while a purely AI-based opinion check missed one.

The project requires Python 3.11 or later, Docker to run the PostgreSQL database that stores past incidents, and an Anthropic API key. Installation uses the "uv" package manager. A quickstart example shows Sherlog diagnosing a simple arithmetic bug: it reads the source, proposes a one-line patch, applies it in a sandbox, and reports a passing test.

The project is in early development. The core pipeline from log ingestion through diagnosis, fix proposal, and self-correcting verification is described as functional. The license is MIT.
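The copy, patch, test, and retry cycle described above can be sketched with the standard library alone. This is an illustration of the idea, not Sherlog's code: `Patch`, `apply_patch`, `verify`, `fix_loop`, and the `propose` callback (which stands in for the fix proposal agent) are all hypothetical names, and the single-line patch shape is an assumption based on the "which file, which line to remove, what to replace it with" description.

```python
import shutil
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class Patch:
    """Hypothetical patch shape: one file, one line removed, one line added."""
    file: str      # path relative to the project root
    old_line: str  # exact content of the line to remove (no trailing newline)
    new_line: str  # content that replaces it


def apply_patch(root: Path, patch: Patch) -> None:
    """Replace the first line equal to old_line; raises ValueError if absent."""
    target = root / patch.file
    lines = target.read_text().splitlines()
    lines[lines.index(patch.old_line)] = patch.new_line
    target.write_text("\n".join(lines) + "\n")


def verify(project: Path, patch: Patch, test_cmd: list[str]) -> tuple[bool, str]:
    """Apply the patch to a temporary copy of the project and run the tests there."""
    with tempfile.TemporaryDirectory() as tmp:
        sandbox = Path(tmp) / "work"
        shutil.copytree(project, sandbox)  # the original tree is only read
        try:
            apply_patch(sandbox, patch)
        except (ValueError, FileNotFoundError) as exc:
            return False, f"patch did not apply: {exc}"
        result = subprocess.run(test_cmd, cwd=sandbox, capture_output=True, text=True)
        return result.returncode == 0, result.stdout + result.stderr


def fix_loop(
    project: Path,
    propose: Callable[[str], Patch],
    test_cmd: list[str],
    max_attempts: int = 3,
) -> Optional[Patch]:
    """Ask for patches until one passes the tests or the attempt limit is hit."""
    feedback = ""  # empty on the first attempt, test output afterwards
    for _ in range(max_attempts):
        patch = propose(feedback)
        passed, output = verify(project, patch, test_cmd)
        if passed:
            return patch
        feedback = output  # feed the failure back into the next proposal
    return None
```

A call might look like `fix_loop(Path("my_project"), propose, [sys.executable, "-m", "pytest"])`, where `propose` wraps whatever produces candidate patches. Note that a temporary directory protects the original files from modification but is not a security boundary: the test command still runs with the caller's permissions.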

Copy-paste prompts

Prompt 1

I have a Python application that keeps throwing a KeyError in production. How do I use Sherlog to diagnose the failure log and generate a tested fix?

Prompt 2

Set up Sherlog with Docker and PostgreSQL to store past incidents. Walk me through the uv install steps and required environment variables including the Anthropic API key.

Prompt 3

Sherlog's verification loop ran my tests but they still fail after 3 retries. How do I increase the retry limit and read the intermediate patch proposals it tried?

Prompt 4

How does Sherlog use LangGraph to coordinate its diagnosis, fix, and verification agents? Show me the agent graph structure.


Verify against the repo before relying on details.