explaingit

plasma-umass/scalene

13,414 · Python · Audience: data · Complexity: 3/5 · Setup: easy

TLDR

A Python profiler from UMass Amherst that pinpoints which lines of code are slow or wasting memory. It typically adds only 10-20% runtime overhead, far less than standard profilers, and offers AI-powered fix suggestions.

Mindmap

mindmap
  root((repo))
    What it does
      Python profiler
      Line level detail
      Low overhead
    CPU Profiling
      Python vs native time
      System time
      Hot spot highlights
    Memory Profiling
      Memory leaks
      Accidental copies
      GPU memory
    Output
      Browser interface
      Command line mode
      AI suggestions

Things people build with this

USE CASE 1

Find exactly which lines of a Python script are consuming the most CPU time without slowing the program down significantly.
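To make the idea of low-overhead, line-level CPU profiling concrete, here is a minimal sketch of a sampling profiler. It is not Scalene's implementation; it only illustrates the general technique of periodically checking which line a thread is executing instead of instrumenting every call. The names `sample_lines`, `busy_work`, and `profile` are hypothetical, and the sketch relies on the CPython-specific `sys._current_frames()`.

```python
import collections
import sys
import threading
import time


def sample_lines(target_thread_id, counts, stop_event, interval=0.005):
    """Periodically record which line the target thread is executing."""
    while not stop_event.is_set():
        frame = sys._current_frames().get(target_thread_id)
        if frame is not None:
            counts[(frame.f_code.co_name, frame.f_lineno)] += 1
        time.sleep(interval)


def busy_work(duration=0.2):
    """Hypothetical workload: spin for roughly `duration` seconds."""
    deadline = time.perf_counter() + duration
    total = 0
    while time.perf_counter() < deadline:
        total += sum(range(200))
    return total


def profile(func):
    """Run func() while a background thread samples the calling thread."""
    counts = collections.Counter()
    stop_event = threading.Event()
    sampler = threading.Thread(
        target=sample_lines,
        args=(threading.get_ident(), counts, stop_event),
        daemon=True,
    )
    sampler.start()
    try:
        result = func()
    finally:
        # Always stop the sampler, even if func() raises.
        stop_event.set()
        sampler.join()
    return result, counts


if __name__ == "__main__":
    _, line_counts = profile(busy_work)
    for (name, lineno), hits in line_counts.most_common(5):
        print(f"{name}:{lineno} sampled {hits} times")
```

Lines that appear in more samples are where the program spent more time. Because the profiled code runs unmodified between samples, the cost stays roughly proportional to the sampling rate rather than to the number of function calls.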

USE CASE 2

Detect memory leaks and accidental expensive copies between Python and compiled libraries like NumPy.
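The cost of an accidental copy can be illustrated with the standard library alone. The sketch below is an assumption-laden stand-in, not Scalene's mechanism: it uses `array.array` in place of a NumPy array and `tracemalloc` in place of Scalene's memory tracking, and the helper names `build_native_buffer`, `accidental_copy`, and `measure_growth` are hypothetical.

```python
import array
import tracemalloc


def build_native_buffer(n):
    """Store n doubles in one compact, C-style buffer (stand-in for NumPy)."""
    return array.array("d", range(n))


def accidental_copy(buf):
    """Convert the buffer to a list: one boxed Python float per element."""
    return list(buf)


def measure_growth(func, *args):
    """Return (result, bytes still allocated after calling func)."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    result = func(*args)
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, after - before


if __name__ == "__main__":
    buf, buf_bytes = measure_growth(build_native_buffer, 100_000)
    as_list, list_bytes = measure_growth(accidental_copy, buf)
    print(f"native buffer: {buf_bytes} bytes")
    print(f"list copy:     {list_bytes} bytes")
```

The list copy holds the same values but allocates a separate Python object for each one, so it typically occupies several times the memory of the compact buffer. This is the kind of growth a line-level memory profiler attributes to the converting line.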

USE CASE 3

Profile GPU time alongside CPU and memory in machine learning training scripts on NVIDIA hardware.

USE CASE 4

Get AI-generated suggestions for speeding up specific slow lines of Python code directly in the profiling output.

Tech stack

Python · NumPy · PyTorch · CUDA

Getting it running

Difficulty · easy · Time to first run · 5 min

GPU profiling currently supports NVIDIA hardware only.

In plain English

Scalene is a profiler for Python programs. A profiler measures where a program spends its time and memory while it runs, so that you can find the slow or wasteful parts and improve them. Scalene was built by researchers at UMass Amherst and sets itself apart from other Python profilers by being significantly faster while reporting more detail.

Most profilers slow your program down by a large factor while measuring it, which makes the results less realistic. Scalene samples the running program rather than tracking every single function call, so its overhead is typically 10 to 20 percent. It also profiles at the individual line level, not just per function, so you can see exactly which line of code is consuming time or memory.

On the CPU side, Scalene separates time spent in Python itself from time spent in native code, such as compiled libraries. This distinction matters because most developers can only optimize their own Python code, not the internals of the libraries underneath. Scalene also reports system time separately, which helps spot input and output bottlenecks. Hot spots are highlighted in red in the output.

Memory profiling is a notable strength. Scalene tracks which specific lines of code cause memory to grow, identifies likely memory leaks, and measures how much data is being copied across the boundary between Python and native code. Accidental copies, such as converting a NumPy array into a plain Python list without realizing it, can be expensive, and Scalene flags them.

GPU time is also reported, though this is currently limited to NVIDIA hardware. The output can be viewed in a browser-based interface that opens automatically after a run and shows an interactive breakdown of CPU, GPU, and memory data per line. A command-line-only mode is also available. The tool also includes AI-powered optimization suggestions based on the profiling results.
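The split between Python time and native time described above can be seen in a small, hypothetical script like the one below, which is the sort of program one might hand to a profiler. Both functions compute the same sum, but `python_loop_sum` does its work in interpreted bytecode, while `native_sum` delegates to the C-implemented built-in `sum()`. The function names and the `timed` helper are illustrative assumptions, and the timings here come from `time.perf_counter`, not from Scalene.

```python
import time


def python_loop_sum(n):
    """Add numbers one at a time in interpreted Python bytecode."""
    total = 0
    for i in range(n):
        total += i
    return total


def native_sum(n):
    """Delegate the same work to the C-implemented built-in sum()."""
    return sum(range(n))


def timed(func, n):
    """Return (result, elapsed seconds) for a single call to func(n)."""
    start = time.perf_counter()
    result = func(n)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000
    loop_result, loop_seconds = timed(python_loop_sum, n)
    native_result, native_seconds = timed(native_sum, n)
    assert loop_result == native_result
    print(f"Python loop: {loop_seconds:.3f} s")
    print(f"built-in sum: {native_seconds:.3f} s")
```

A profiler that distinguishes the two kinds of time would attribute the loop's cost to Python and most of the `sum()` call's cost to native code, which points to the loop as the part a developer can rewrite directly.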

Copy-paste prompts

Prompt 1
Show me how to run Scalene on my Python script and interpret the output to find the slowest lines of code.
Prompt 2
How do I use Scalene to detect memory leaks in a Python data processing script?
Prompt 3
How do I profile a NumPy-heavy script with Scalene to find accidental array copies between Python and native code?
Prompt 4
Walk me through reading Scalene's browser output to identify which lines need optimization in a data pipeline.
Prompt 5
How do I enable GPU profiling in Scalene for a PyTorch training script on an NVIDIA GPU?

Verify against the repo before relying on details.