Deploy a trained scikit-learn or PyTorch model as a REST endpoint that scales with traffic on a Kubernetes cluster.
Run A/B tests splitting traffic between two model versions to compare their prediction quality on real users.
Add an outlier detector that flags unusual inputs before they reach a production model to protect prediction quality.
Monitor model accuracy and request metrics with built-in Prometheus and Grafana dashboards without custom instrumentation.
Requires a running Kubernetes cluster and Helm: you need a cloud provider account or a local kind/minikube setup before any model can be deployed.
Seldon Core is a platform for taking trained machine learning models and turning them into live services that other software can call over the internet. If you have a model built with Python, Java, TensorFlow, PyTorch, or a handful of other languages and frameworks, Seldon Core wraps it up and exposes it as a web endpoint that accepts requests and returns predictions. The whole system runs on Kubernetes, which is an infrastructure layer for managing many containers at once across cloud servers.
The core idea is that putting a model into production is harder than training it. You need to handle incoming traffic, scale up when requests spike, keep logs, monitor accuracy over time, and roll out new versions without breaking things. Seldon Core packages all of that machinery. It has been installed over two million times and is used by organizations that run thousands of models simultaneously. It works on AWS, Azure, Google Cloud, Alibaba Cloud, DigitalOcean, and OpenShift, so teams are not locked into one provider.
Beyond basic serving, the project adds several built-in capabilities. You can run A/B tests that split traffic between two versions of a model to see which performs better. You can add outlier detectors that flag unusual inputs before they reach the model. You can attach explainers that produce a reason for each prediction. All requests and responses can be logged to Elasticsearch for auditing, and metrics flow to Prometheus and Grafana for dashboards and alerts. Distributed tracing via Jaeger lets engineers see exactly how long each step in a multi-model pipeline takes.
Version 2 of Seldon Core is now available, and the project recommends that new users start there. Version 1, which the bulk of the documentation covers, is still supported. Installation for V1 is done through Helm, a Kubernetes package manager, using a single command that sets up the operator in a dedicated namespace.
Once installed, you deploy a model by writing a short configuration file that names the model type and points to the model file stored in cloud object storage. The project is open source and maintained by Seldon. There is an active Slack community, fortnightly video calls, and a public issue tracker for bugs and feature requests.
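As a rough illustration of the configuration file described above, the following Python sketch builds a V1-style SeldonDeployment manifest as a plain dictionary and prints it as JSON, which kubectl can apply in the same way as YAML. The field names follow the V1 resource layout as commonly documented (predictors, graph, implementation, modelUri); the deployment name, the bucket path, and the helper function seldon_deployment are hypothetical and exist only for this example. Check the exact schema against the repo before using it.

```python
import json


def seldon_deployment(name, model_uri, implementation="SKLEARN_SERVER", replicas=1):
    """Build a Seldon Core V1 SeldonDeployment manifest as a plain dict.

    Hypothetical helper for illustration; it is not part of Seldon Core.
    """
    return {
        "apiVersion": "machinelearning.seldon.io/v1",
        "kind": "SeldonDeployment",
        "metadata": {"name": name},
        "spec": {
            "predictors": [
                {
                    "name": "default",
                    "replicas": replicas,
                    "graph": {
                        # Name of this node in the inference graph.
                        "name": "classifier",
                        # The model type, here a prepackaged scikit-learn server.
                        "implementation": implementation,
                        # Location of the model file in object storage.
                        "modelUri": model_uri,
                    },
                }
            ]
        },
    }


# Assumed values: the deployment name and bucket path are placeholders.
manifest = seldon_deployment("iris-model", "gs://example-bucket/sklearn/iris")
print(json.dumps(manifest, indent=2))
```

Saving the printed output to a file and running kubectl apply -f on it would be the usual next step, assuming the operator is already installed in the cluster.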
Verify against the repo before relying on details.