Monitor whether your production ML model's predictions are drifting as new real-world data comes in over time.
Evaluate whether your LLM chatbot responses are too short, too negative, or refusing requests they should answer.
Generate an HTML data quality report before deploying a new model version to catch problems early.
Set up a live monitoring dashboard to track AI system health metrics across real user traffic over time.
Evidently is an open-source Python library for evaluating and monitoring AI systems. If you have built a machine learning model or a chatbot powered by a large language model, Evidently gives you tools to check whether the system is behaving correctly, both during development and once it is running in production.

The library works in two main ways. First, you can run Reports and Test Suites: you feed in your data and model outputs, and Evidently calculates metrics that show whether data quality has changed, whether a model's predictions are drifting over time, or whether an AI assistant is giving responses that are too short, too negative, or refusing requests it should answer. Over 100 built-in checks cover classification accuracy, recommendation diversity, text sentiment, and more. You can view results interactively in a Jupyter notebook or export them as HTML or JSON for sharing.

Second, Evidently includes a monitoring dashboard that tracks these metrics over time as your system handles real users. You can run the dashboard on your own machine or sign up for a hosted version, Evidently Cloud, which offers a free tier and adds alerting, team management, and other extras.

The library installs with a single pip command (pip install evidently) and is designed to be modular. You can start with a one-line preset that catches common problems in tabular data, then layer in custom metrics as you learn what your specific system needs to track.

Evidently is developed by Evidently AI and released under an open-source license. It is aimed at data scientists and ML engineers who want to catch problems in AI systems before or after they reach users, without building monitoring infrastructure from scratch.
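To make "drifting predictions" concrete, here is a minimal sketch of one common drift statistic, the Population Stability Index (PSI), written with the Python standard library only. It is an illustration of the idea, not Evidently's API or its implementation; the function name psi, the equal-width binning, and the smoothing constant are all assumptions made for this example.

```python
import math
from typing import List, Sequence


def psi(reference: Sequence[float], current: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between a reference and a current sample.

    Hypothetical helper for illustration only. Bins are equal-width and
    derived from the reference sample's range.
    """
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread to bin")
    width = (hi - lo) / bins

    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the reference range into the edge bins.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Floor each share at a small constant so log() never sees zero.
        eps = 1e-6
        return [max(c / len(values), eps) for c in counts]

    ref_shares = shares(reference)
    cur_shares = shares(current)
    # Each term is non-negative; the sum is zero only when the binned shares match.
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))


# Example: predictions seen at validation time vs. a shifted production sample.
reference_scores = [i / 100 for i in range(100)]
current_scores = [0.5 + i / 200 for i in range(100)]
drift_score = psi(reference_scores, current_scores)
```

A frequently quoted rule of thumb treats a PSI above roughly 0.2 as a meaningful shift, but the right threshold depends on the data. A monitoring library saves you from hand-rolling statistics like this for every column and from wiring up the reporting around them.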
Verify against the repo before relying on details.