explaingit

replicate/cog

9,411 · Go · Audience: data · Complexity: 3/5 · Setup: moderate

TLDR

Cog is an open-source tool that automatically packages machine learning models into Docker containers, handles GPU and library configuration, and generates an HTTP API so models can be deployed anywhere.

Mindmap

mindmap
  root((cog))
    What it does
      Model packaging
      Docker automation
      HTTP API generation
    Tech Stack
      Go and Python
      Docker CUDA
      Rust HTTP server
    Use Cases
      Deploy ML models
      Test predictions locally
      Build model APIs
    Workflow
      Write cog.yaml
      Write predict.py
      Build container
      Deploy or run locally
    Platforms
      macOS
      Linux
      Windows WSL

Things people build with this

USE CASE 1

Package a trained machine learning model into a Docker container without writing a Dockerfile or web server code.

USE CASE 2

Deploy a PyTorch or TensorFlow model to Replicate's platform or any cloud provider using a single build command.

USE CASE 3

Expose a model as an HTTP API automatically by defining input and output types in a small Python prediction class.

USE CASE 4

Test model predictions locally with a single CLI command before deploying anywhere.
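As a rough illustration of the prediction class mentioned in Use Case 3, the sketch below mirrors the two-method shape (load once at startup, then handle each prediction) in plain Python, so it runs without Cog installed. In a real project the class would subclass `cog.BasePredictor` and could describe its inputs with `cog.Input`; the `EchoPredictor` name, the `prefix` attribute, and the toy "model" are hypothetical stand-ins, not code from the repository.

```python
class EchoPredictor:
    """Plain-Python stand-in for a Cog predictor (hypothetical example).

    With Cog installed, this would subclass cog.BasePredictor instead.
    """

    def setup(self) -> None:
        # Runs once at startup: load weights or other expensive state here.
        # A string prefix stands in for a real model.
        self.prefix = "echo: "

    def predict(self, text: str, repeat: int = 1) -> str:
        # Runs once per request. The type annotations are the kind of
        # declaration Cog reads to describe the generated API.
        return self.prefix + " ".join([text] * repeat)


predictor = EchoPredictor()
predictor.setup()
result = predictor.predict("hello", repeat=2)
print(result)
```

Keeping the expensive loading in `setup` and the per-request work in `predict` is what allows one container to serve many requests without reloading the model.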

Tech stack

Go · Python · Docker · CUDA · PyTorch · TensorFlow · Rust

Getting it running

Difficulty · moderate · Time to first run · 30 min

Requires Docker installed locally; GPU models additionally need CUDA-compatible hardware and drivers.

In plain English

Cog is an open-source tool from Replicate that helps machine learning researchers and engineers package their models so they can be deployed anywhere. The core problem it solves is that getting a trained machine learning model running on a server is surprisingly difficult: you have to configure Docker containers, match the right versions of GPU libraries like CUDA with the right versions of frameworks like PyTorch or TensorFlow, and write a web server to accept requests. Cog handles all of that for you.

Instead of writing a Docker configuration file by hand, you describe your environment in a short YAML file. You tell Cog whether you need a GPU, which system packages are required, which Python version to use, and which file contains your prediction logic. Cog reads this and builds a properly configured Docker image, choosing the right base image and library combinations automatically.

To define how your model runs, you write a small Python class with two methods: one that loads the model into memory at startup, and one that processes each prediction request. Cog reads the input and output types you declare and automatically generates an HTTP API for your model, so other systems can call it by sending a JSON request to a URL. The HTTP server is built on a fast Rust-based framework.

Once built, the Docker image can run on any machine that supports Docker, including your own servers, cloud providers, or the Replicate platform. You can test predictions locally with a single command before deploying anywhere.

Cog runs on macOS, Linux, and Windows 11 via WSL. Installation is available through Homebrew on macOS, a shell script, or by downloading a binary directly from the GitHub releases page. The project was created by former engineers from Docker and Spotify, and contributions are welcome through a guide in the repository.
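The JSON request described above can be sketched with Python's standard library. This assumes a Cog container is listening on `http://localhost:5000`, exposes a `/predictions` endpoint, and wraps a model that takes a single `text` input. The port, the input name, and the helper `build_prediction_request` are assumptions for illustration, so check the repository's HTTP documentation for the exact request shape.

```python
import json
import urllib.request


def build_prediction_request(base_url: str, inputs: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST for a model's prediction endpoint."""
    # The model inputs are wrapped under an "input" key (assumed request shape).
    body = json.dumps({"input": inputs}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predictions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_prediction_request("http://localhost:5000", {"text": "hello"})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually send it once a container is running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The request is only built here, not sent, so the sketch runs without a container. Uncommenting the last lines sends it to a running model.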

Copy-paste prompts

Prompt 1
Show me a minimal cog.yaml and predict.py for packaging a PyTorch image classifier with Cog.
Prompt 2
How do I build and run a Cog model container locally to test predictions before pushing to Replicate?
Prompt 3
Write a Cog predict.py class that accepts an image URL as input and returns a JSON object with classification labels.
Prompt 4
What do I put in cog.yaml to specify CUDA 11.8 and PyTorch 2.0 for a GPU-based model?
Prompt 5
How do I install Cog on macOS using Homebrew and run my first prediction from the command line?

Verify against the repo before relying on details.