explaingit

nixtla/statsforecast

4,780 · Python · Audience · data · Complexity · 3/5 · Setup · moderate

TLDR

A Python library for fast time series forecasting using classic statistical models, with automatic model selection and support for processing millions of series at once using distributed computing.

Mindmap

mindmap
  root((statsforecast))
    What it does
      Time series forecasting
      Auto model selection
      Anomaly detection
    Models
      AutoARIMA
      AutoETS
      Theta
    Scaling
      Spark
      Dask
      Ray
    Audience
      Data scientists
      Analysts

Things people build with this

USE CASE 1

Forecast weekly sales figures for thousands of products at once without waiting hours for results.

USE CASE 2

Automatically find the best ARIMA or ETS model for a time series without manually tuning parameters.

USE CASE 3

Detect anomalies in time series data such as unusual spikes in website traffic or energy consumption.

USE CASE 4

Scale a forecasting pipeline across a cluster with Ray or Spark when data volumes are too large for one machine.

Tech stack

Python · PyPI · conda · Spark · Dask · Ray · scikit-learn

Getting it running

Difficulty · moderate · Time to first run · 30 min

Requires familiarity with pandas DataFrames and time series concepts. The distributed backends need Spark, Dask, or Ray configured separately.

In plain English

StatsForecast is a Python library for predicting future values in time series data using established statistical methods. A time series is any sequence of measurements recorded over time: sales figures by week, electricity usage by hour, website traffic by day. Statistical forecasting models try to find patterns in that history and extend them forward. StatsForecast packages many of these well-known models together and focuses on making them run much faster than previous Python implementations.

The library includes automatic versions of several classic forecasting approaches. AutoARIMA, for example, searches for the best configuration of a model family called ARIMA (which stands for Autoregressive Integrated Moving Average) by testing different parameter combinations and picking the one that fits the data best. Similar auto-selection versions exist for ETS (a family of exponential smoothing models), Theta, and CES. For situations where you have a rough idea of what you want, manual versions of each model are also available. There is also support for time series with multiple seasonal patterns, anomaly detection, and incorporating external variables like weather or pricing.

Speed is a central claim of the project. The README states that the AutoARIMA implementation is roughly 20 times faster than a comparable Python library called pmdarima and about 500 times faster than Facebook's Prophet. For very large workloads, the library integrates with distributed computing frameworks including Spark, Dask, and Ray, which lets it split work across many machines. The README includes a benchmark showing one million time series processed in around 30 minutes using Ray.

The library uses the same interface style as scikit-learn, a well-known Python machine learning library, so anyone familiar with that pattern will recognize the fit and predict calls. It is available through PyPI and conda-forge, the two main Python package distribution channels.

Copy-paste prompts

Prompt 1
Using nixtla/statsforecast, fit an AutoARIMA model to a pandas DataFrame with a date column and a value column, then forecast the next 12 periods and plot the result.
Prompt 2
Show me how to use StatsForecast to run the same AutoETS model across 500 different time series in parallel and collect all forecasts into one DataFrame.
Prompt 3
Using StatsForecast with Ray, set up a distributed forecasting job that processes one million time series and saves results to a CSV.
Prompt 4
How do I add external regressors to a StatsForecast model so that a known future variable like a holiday flag is included in the forecast?

Verify against the repo before relying on details.