Test an AI agent against a local Gmail mock without hitting real accounts
Run reproducible agent benchmarks across Drive, Docs, Calendar, and Slack mocks
Add a new mock service that conforms to the shared Docker base image contract
Develop and smoke-test agent tasks like archiving stale Drive drafts locally
Needs Python 3.12 or newer, uv, a running Docker daemon, and several free ports in the 9000 range.
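A minimal sketch of how these prerequisites could be checked before starting anything. It is not part of the repo. The only port named in this summary is 9060 (the devhub dashboard), so the port list is an assumption to adjust. The check confirms only that the `docker` CLI is on PATH, not that the daemon is running.

```python
import shutil
import socket
import sys

# Assumption: 9060 (devhub) is the only port this summary names;
# add the ports your configured mock services actually use.
PORTS_TO_CHECK = [9060]


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def preflight(ports=PORTS_TO_CHECK) -> list:
    """Collect human-readable problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < (3, 12):
        problems.append(f"Python 3.12+ required, found {sys.version.split()[0]}")
    # Only checks that the executables exist, not that the Docker daemon is up.
    for tool in ("uv", "docker"):
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    for port in ports:
        if not port_is_free(port):
            problems.append(f"port {port} is already in use")
    return problems


if __name__ == "__main__":
    for line in preflight() or ["all preflight checks passed"]:
        print(line)
```

Returning a list of problems rather than raising on the first failure lets a developer see every missing prerequisite in one run.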
Mockflow is a collection of fake services that pretend to be popular workplace tools so that AI agents can be tested against them locally, without touching real accounts or real data. The current set covers stand-ins for Gmail, Google Calendar, Google Docs, Google Drive, and Slack, each packaged as its own environment under packages/environments. The point of the project is to give a benchmark or a developer a stable, repeatable place to run an agent through a task. Instead of the agent hitting the actual Gmail API, where the data changes constantly and mistakes have real consequences, it talks to a local mock that behaves like Gmail in the ways that matter for the test.
The repo describes itself as owning local environment development, seed contracts, dev tooling, API parity checks, and a shared Docker base image. The actual benchmark scoring and the canonical task definitions live in a different repo.
To run it locally, you need Python 3.12 or newer, the uv package manager, a running Docker daemon, and a handful of free local ports in the 9000 range. A script called scripts/dev.sh starts every configured mock service together with a small web dashboard called devhub, which you open at 127.0.0.1:9060. Another script can start only the services that a particular example task needs, and a smoke script runs a fast set of checks before you commit changes.
All the mock services share a Docker base image published as kywch/mockflow, with a version number tracked in a VERSION file. Task Dockerfiles are expected to be thin and to pin themselves to that base image version. The README walks through the release checklist for bumping the base image: building it, running the example smoke tests against it, pushing it to the registry, and validating that a fresh pull works. Runtime contracts are also written down, such as canonical service names, environment variable names for service URLs, and a config.toml file that holds runtime metadata.
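The pinning rule for task Dockerfiles can be illustrated with a short check. This is a sketch under stated assumptions, not the repo's own tooling. It assumes the pin is written as a `FROM kywch/mockflow:<tag>` line and that the tag matches the contents of the VERSION file verbatim. The version string `1.2.3` and the helper names are hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

BASE_IMAGE = "kywch/mockflow"  # image name as given in the summary


def pinned_tag(dockerfile_text: str, image: str = BASE_IMAGE) -> Optional[str]:
    """Return the tag on the first `FROM <image>:<tag>` line, or None if absent."""
    pattern = re.compile(
        rf"^\s*FROM\s+(?:--platform=\S+\s+)?{re.escape(image)}:(?P<tag>[^\s@]+)",
        re.IGNORECASE | re.MULTILINE,
    )
    match = pattern.search(dockerfile_text)
    return match.group("tag") if match else None


def is_pinned_to(dockerfile_text: str, version: str) -> bool:
    """Assumption: the Dockerfile tag equals the VERSION file contents verbatim."""
    return pinned_tag(dockerfile_text) == version.strip()


def check_files(dockerfile: Path, version_file: Path) -> bool:
    """Convenience wrapper; both paths are supplied by the caller."""
    return is_pinned_to(dockerfile.read_text(), version_file.read_text())


if __name__ == "__main__":
    example = "FROM kywch/mockflow:1.2.3\nCOPY task/ /task/\n"  # hypothetical version
    print(pinned_tag(example))               # 1.2.3
    print(is_pinned_to(example, "1.2.3\n"))  # True
```

A floating tag such as `latest` fails this check, which is the behavior a pinning rule is meant to enforce.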
Example tasks shipped with the repo include archiving stale Google Drive drafts, indexing keywords across Google Docs, forwarding confidential email, and a couple of multi-service flows that touch mail and calendar together. The project is licensed under AGPL v3 only, and the team behind it, BenchFlow, also offers a hosted version of Mockflow as a paid service.
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.