explaingit

benchflow-ai/mockflow

5 · Python · Audience · developer · Complexity · 4/5 · Active · License · Setup · moderate

TLDR

Local mock services that imitate Gmail, Google Calendar, Docs, Drive, and Slack so AI agents can be tested against repeatable fake workplace environments without touching real accounts.

Mindmap

mindmap
  root((mockflow))
    Inputs
      Agent API calls
      Seed data
      Task config toml
    Outputs
      Mocked Gmail responses
      Devhub dashboard
      Smoke results
    Use Cases
      Benchmark agents
      Local dev testing
      Reproducible eval
    Tech Stack
      Python
      Docker
      uv

Things people build with this

USE CASE 1

Test an AI agent against a local Gmail mock without hitting real accounts

USE CASE 2

Run reproducible agent benchmarks across Drive, Docs, Calendar, and Slack mocks

USE CASE 3

Add a new mock service that conforms to the shared Docker base image contract

USE CASE 4

Develop and smoke-test agent tasks like archiving stale Drive drafts locally

Tech stack

Python · Docker · uv

Getting it running

Difficulty · moderate · Time to first run · 30 min

Needs Python 3.12 or newer, uv, a running Docker daemon, and several free local ports in the 9000 range.
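
The requirements above can be checked before a first run. The following is a minimal sketch, not part of mockflow: the function names are hypothetical, and the only port checked is 9060 (the devhub port named in this summary), since the other ports should come from the repo's own configuration. It also only confirms that a docker binary is on PATH, not that the daemon is running.

```python
import shutil
import socket
import sys

# Assumption: 9060 (devhub) is the only port this summary names explicitly.
# Add the service ports from the repo's own configuration as needed.
PORTS_TO_CHECK = [9060]


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 only when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


def check_prerequisites(ports=PORTS_TO_CHECK) -> list[str]:
    """Collect human-readable problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < (3, 12):
        problems.append("Python 3.12 or newer is required")
    for tool in ("uv", "docker"):
        # Only checks that the binary exists, not that the Docker daemon is up.
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' was not found on PATH")
    for port in ports:
        if not port_is_free(port):
            problems.append(f"port {port} is already in use")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("problem:", problem)
```

An empty result means the basic tooling is in place; it does not guarantee that the services themselves will start.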

AGPL v3 lets you use and modify the code, but if you offer it as a network service you must share your modified source under the same license.

In plain English

Mockflow is a collection of fake services that imitate popular workplace tools so that AI agents can be tested against them locally, without touching real accounts or real data. The current set covers stand-ins for Gmail, Google Calendar, Google Docs, Google Drive, and Slack, each packaged as its own environment under packages/environments.
The point of the project is to give a benchmark or a developer a stable, repeatable place to run an agent through a task. Instead of the agent hitting the actual Gmail API, where the data changes constantly and mistakes have real consequences, it talks to a local mock that behaves like Gmail in the ways that matter for the test.
The repo describes itself as owning local environment development, seed contracts, dev tooling, API parity checks, and a shared Docker base image. The benchmark scoring and the canonical task definitions live in a different repo.
To run it locally you need Python 3.12 or newer, the uv package manager, a running Docker daemon, and a handful of free local ports in the 9000 range. A script called scripts/dev.sh starts every configured mock service together with a small web dashboard called devhub, which you open at 127.0.0.1 port 9060. Another script starts only the services that a particular example task needs, and a smoke script runs a fast set of checks before you commit changes.
All the mock services share a Docker base image published as kywch/mockflow, with a version number tracked in a VERSION file. Task Dockerfiles are expected to be thin and to pin themselves to that base image version. The README walks through the release checklist for bumping the base image: building it, running the example smoke tests against it, pushing it to the registry, and validating that a fresh pull works. The repo also documents runtime contracts, such as canonical service names, environment variable names for service URLs, and a config.toml file that holds runtime metadata.
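
The pinning rule for task Dockerfiles can be checked mechanically. The sketch below is illustrative only and is not taken from the repo: it assumes the image tag equals the contents of the VERSION file, which this summary does not confirm, and the function names and the 0.0.0 tag are hypothetical.

```python
import re

BASE_IMAGE = "kywch/mockflow"  # image name as given in the summary

# Matches lines such as "FROM kywch/mockflow:<tag>" and captures the tag.
FROM_PATTERN = re.compile(
    r"^\s*FROM\s+" + re.escape(BASE_IMAGE) + r":(?P<tag>\S+)",
    re.IGNORECASE | re.MULTILINE,
)


def pinned_tags(dockerfile_text: str) -> list[str]:
    """Return every tag the Dockerfile uses for the shared base image."""
    return [m.group("tag") for m in FROM_PATTERN.finditer(dockerfile_text)]


def pin_matches_version(dockerfile_text: str, version_text: str) -> bool:
    """True when each base-image FROM line is pinned to exactly the VERSION value.

    Assumption: the tag is the VERSION file content with whitespace stripped.
    An untagged FROM line yields no tags, so it counts as not pinned.
    """
    expected = version_text.strip()
    tags = pinned_tags(dockerfile_text)
    return bool(tags) and all(tag == expected for tag in tags)


if __name__ == "__main__":
    # Hypothetical sample; in practice read the real VERSION file and Dockerfile.
    sample_dockerfile = "FROM kywch/mockflow:0.0.0\nCOPY task/ /task/\n"
    print(pin_matches_version(sample_dockerfile, "0.0.0\n"))  # prints True
```

A check like this could run alongside the smoke script so that a base image bump that leaves a task Dockerfile behind is caught before a push.
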
Example tasks shipped with the repo include archiving stale Google Drive drafts, indexing keywords across Google Docs, forwarding confidential email, and a couple of multi-service flows that touch mail and calendar together. The project is licensed under AGPL v3 only, and the team behind it, BenchFlow, also offers a hosted version of Mockflow as a paid service.

Copy-paste prompts

Prompt 1
Give me a 5-minute install guide for mockflow with Python 3.12, uv, and Docker
Prompt 2
Show me how scripts/dev.sh launches all mock services and opens devhub at 127.0.0.1 port 9060
Prompt 3
Walk me through the runtime contracts in mockflow including service names, env vars, and config.toml
Prompt 4
How do I bump the kywch/mockflow base image version and run the smoke tests before pushing
Prompt 5
Explain how a task Dockerfile pins to the shared mockflow base image and what stays thin

Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.