Watch GPU memory and temperature update live during a machine learning training run instead of running nvidia-smi repeatedly
Find and kill a runaway GPU process that is consuming all VRAM on a shared research server
Use nvisel to automatically pick the GPU with the most free memory before launching a training job
Export GPU metrics to a Grafana dashboard to track historical utilization across a team's GPU cluster
Requires an NVIDIA GPU with drivers installed; AMD and Intel GPUs are not supported.
nvitop is a terminal-based tool for monitoring NVIDIA graphics cards and the processes running on them. If you work with machine learning or any other GPU-backed software, you have probably used the built-in nvidia-smi command, which prints a static snapshot of GPU state. nvitop goes further: it provides a continuously updating, color-coded interface showing GPU memory usage, temperature, running processes, and more, in a format that is much easier to read at a glance.

The tool can be used in two main ways. The first is a quick status check that prints the current GPU and process state to the terminal. The second is a full monitor mode that runs interactively, much as htop does for CPU processes. In monitor mode you can sort and filter the list of GPU processes, view the process tree to see which parent programs launched each GPU task, inspect environment variables, send signals to stop or pause processes, and navigate with either the keyboard or the mouse. Bar charts and history graphs show how resource usage has changed over time.

Beyond the interactive display, nvitop ships a companion tool called nvisel that helps deep learning researchers choose which GPU to use before starting a training job, based on available memory or other criteria. The package also exposes a Python API so developers can query GPU and process information from their own scripts or applications. This API supports collecting snapshots of metrics and building custom monitoring dashboards. There is also an exporter component that feeds data into Grafana, a popular tool for displaying metrics on dashboards.

Installation is straightforward via pip or conda, and the tool works on both Linux and Windows. It queries the GPU directly through NVIDIA's own library rather than parsing nvidia-smi output, which makes it faster and more accurate than tools that parse that output.
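As a rough illustration of the "pick the GPU with the most free memory" idea, here is a minimal Python sketch. It is not nvisel's actual selection logic. The names GpuInfo, pick_gpu, and query_gpus are hypothetical and exist only for this example. The nvitop calls used inside query_gpus (Device.all(), device.index, memory_free(), memory_total()) are assumed from nvitop's Python API and should be checked against the repo.

```python
from typing import List, NamedTuple, Optional, Sequence


class GpuInfo(NamedTuple):
    """Hypothetical record holding the fields this sketch needs."""
    index: int
    memory_free: int   # bytes
    memory_total: int  # bytes


def pick_gpu(gpus: Sequence[GpuInfo], min_free: int = 0) -> Optional[int]:
    """Return the index of the GPU with the most free memory.

    GPUs with less than `min_free` bytes free are ignored.
    Returns None when no GPU qualifies.
    """
    candidates = [g for g in gpus if g.memory_free >= min_free]
    if not candidates:
        return None
    return max(candidates, key=lambda g: g.memory_free).index


def query_gpus() -> List[GpuInfo]:
    """Collect GPU records through nvitop, if it is installed.

    Assumption: nvitop exposes Device.all(), device.index,
    device.memory_free() and device.memory_total() as used here.
    Returns an empty list when nvitop or a usable driver is missing.
    """
    try:
        from nvitop import Device
    except ImportError:
        return []
    records = []
    try:
        for device in Device.all():
            free, total = device.memory_free(), device.memory_total()
            # Skip devices whose memory figures are unavailable.
            if isinstance(free, int) and isinstance(total, int):
                records.append(GpuInfo(device.index, free, total))
    except Exception:
        return []
    return records


if __name__ == "__main__":
    # Require at least 8 GiB free before considering a GPU.
    chosen = pick_gpu(query_gpus(), min_free=8 * 1024**3)
    print("No suitable GPU found" if chosen is None else f"Use GPU {chosen}")
```

Keeping the selection logic separate from the query step means the choice can be exercised with hand-written records on a machine that has no GPU. A launcher script could then export the chosen index through CUDA_VISIBLE_DEVICES before starting the training job.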
Verify against the repo before relying on details.