explaingit

stefan-jansen/machine-learning-for-trading

17,322 | Jupyter Notebook | Audience · data | Complexity · 4/5 | Setup · hard

TLDR

Companion code for the book Machine Learning for Algorithmic Trading: 150+ Jupyter notebooks that take you from raw market data, through ML model training, to trading strategies backtested on historical prices.

Mindmap

mindmap
  root((machine-learning-for-trading))
    What it does
      Book companion code
      150 Jupyter notebooks
      ML4T workflow
    Tech Stack
      Python
      TensorFlow
      Zipline
      pandas
    Use Cases
      Backtest ML strategies
      NLP on SEC filings
      Synthetic data GANs
    Audience
      Data scientists
      Quant traders
      Finance students

Things people build with this

USE CASE 1

Backtest a machine-learning-driven trading strategy on historical stock prices using the Zipline engine

USE CASE 2

Train an NLP model on SEC filings or earnings call transcripts to predict stock price movements

USE CASE 3

Generate synthetic time-series market data using GANs for strategy stress testing

USE CASE 4

Build a deep reinforcement learning trading agent using the included end-to-end notebook examples

Tech stack

Python · Jupyter Notebook · pandas · TensorFlow · Zipline

Getting it running

Difficulty · hard | Time to first run · 1h+

Requires a custom Zipline fork and financial data downloads before the trading strategy notebooks can run.

In plain English

This repository is the companion code for the book "Machine Learning for Algorithmic Trading, 2nd edition", a roughly 800-page, 23-chapter guide to using machine learning to design trading strategies for financial markets. The repo's role is to make the book runnable: it contains over 150 Jupyter notebooks that show how to source market and alternative data, engineer features, train ML models, turn predictions into trading signals, and backtest the resulting strategy on historical data.

The material follows what the book calls the ML4T workflow: collecting data, extracting features, training and tuning an ML model, designing a strategy on top of its predictions, and simulating it on past prices with a backtesting engine. Coverage runs from linear regression and unsupervised learning to CNNs and RNNs applied to market and alternative data, generative adversarial networks for synthetic time-series data, and deep reinforcement learning for a trading agent. Alternative data sources include SEC filings, earnings call transcripts, and satellite images. A customised version of the Zipline library is provided to plug ML predictions into the backtest, and an appendix documents over 100 alpha factors.

The repo suits quantitative-trading practitioners, finance students, and data scientists who want a worked-example path from raw data to a backtested ML-driven strategy. The code is Python in Jupyter notebooks and relies on standard data-science libraries including pandas and TensorFlow. A companion website at ml4trading.io and a community at exchange.ml4trading.io are linked from the README. The full README is longer than the portion summarized here.
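
To make the five workflow steps concrete, here is a minimal sketch in pandas and NumPy. It is not taken from the repo's notebooks: the function names are hypothetical, the prices are a synthetic random walk rather than real market data, ordinary least squares stands in for a tuned ML model, and a few lines of pandas stand in for the Zipline backtesting engine. Transaction costs and slippage are ignored.

```python
import numpy as np
import pandas as pd


def make_synthetic_prices(n_days=500, seed=0):
    """Step 1 (data): a random-walk price series standing in for real market data."""
    rng = np.random.default_rng(seed)
    log_returns = rng.normal(loc=0.0003, scale=0.01, size=n_days)
    dates = pd.bdate_range("2020-01-01", periods=n_days)
    return pd.Series(100 * np.exp(np.cumsum(log_returns)), index=dates, name="close")


def build_features(prices):
    """Step 2 (features): trailing momentum and volatility; target is the next-day return."""
    rets = prices.pct_change()
    data = pd.DataFrame({
        "mom_20": prices.pct_change(20),     # 20-day trailing return
        "vol_20": rets.rolling(20).std(),    # 20-day trailing volatility
        "target": rets.shift(-1),            # return realised over the following day
    })
    return data.dropna()


def design_matrix(data):
    """Add an intercept column to the two feature columns."""
    return np.column_stack([np.ones(len(data)), data[["mom_20", "vol_20"]].to_numpy()])


def fit_linear_model(train):
    """Step 3 (model): ordinary least squares, a stand-in for a tuned ML model."""
    coef, *_ = np.linalg.lstsq(design_matrix(train), train["target"].to_numpy(), rcond=None)
    return coef


def backtest(signal, next_day_returns):
    """Step 5 (backtest): hold the asset when signal is 1, stay flat when 0; no costs."""
    strategy_returns = signal * next_day_returns
    return (1 + strategy_returns).cumprod()


prices = make_synthetic_prices()
data = build_features(prices)

# Chronological split: fit on the first 70% of days, evaluate on the rest.
split = int(len(data) * 0.7)
train, test = data.iloc[:split], data.iloc[split:]

coef = fit_linear_model(train)
pred = pd.Series(design_matrix(test) @ coef, index=test.index)

# Step 4 (signal): go long only when the predicted next-day return is positive.
signal = (pred > 0).astype(int)
equity = backtest(signal, test["target"])

print(f"Out-of-sample days: {len(test)}, final equity multiple: {equity.iloc[-1]:.3f}")
```

The split is chronological rather than random so that the model is never fitted on days that come after the ones it is evaluated on. Because the prices here are random, the resulting equity curve says nothing about whether such a strategy would work on real data.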

Copy-paste prompts

Prompt 1
Using the ML4T workflow from machine-learning-for-trading, help me source daily S&P 500 price data, engineer momentum and volatility features, and train a gradient boosting model to generate buy and sell signals.
Prompt 2
Set up the custom Zipline environment from machine-learning-for-trading and run a backtest for a mean-reversion strategy on US equities.
Prompt 3
Using the SEC filing notebooks in machine-learning-for-trading, build an NLP pipeline that predicts earnings surprises from 10-K filings.
Prompt 4
Walk me through training an LSTM on historical price sequences using the RNN notebooks from machine-learning-for-trading.
Prompt 5
Using the 100+ alpha factors documented in machine-learning-for-trading, build a multi-factor ranking model that selects the top 20 stocks each month.
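
As a reference point for what Prompt 5 asks for, the ranking step can be sketched in a few lines of pandas. This is an illustrative assumption rather than the repo's method: the factor names (`momentum`, `value`, `quality`), the tickers, and both helper functions are hypothetical, the factor values are random numbers, and the composite score is simply the equal-weighted average of each factor's cross-sectional percentile rank.

```python
import numpy as np
import pandas as pd


def composite_rank(factors):
    """Average each factor's cross-sectional percentile rank per date.

    `factors` is indexed by (date, ticker) with one column per factor;
    a higher factor value is assumed to be better.
    """
    ranks = factors.groupby(level="date").rank(pct=True)
    return ranks.mean(axis=1)


def select_top_n(score, n=20):
    """Return {date: sorted list of the n highest-scoring tickers}."""
    ordered = score.sort_values(ascending=False)
    top = ordered.groupby(level="date").head(n)
    return {
        date: sorted(group.index.get_level_values("ticker"))
        for date, group in top.groupby(level="date")
    }


# Hypothetical month-end universe: 3 rebalance dates, 50 made-up tickers.
rng = np.random.default_rng(42)
dates = pd.to_datetime(["2021-01-29", "2021-02-26", "2021-03-31"])
tickers = [f"STOCK_{i:02d}" for i in range(50)]
index = pd.MultiIndex.from_product([dates, tickers], names=["date", "ticker"])
factors = pd.DataFrame(
    rng.normal(size=(len(index), 3)),
    index=index,
    columns=["momentum", "value", "quality"],
)

score = composite_rank(factors)
portfolio = select_top_n(score, n=20)

for date, names in portfolio.items():
    print(date.date(), len(names), "stocks selected")
```

Ranking within each date, rather than across the whole sample, keeps the selection comparable from month to month even when the overall level of a factor drifts over time.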

Verify against the repo before relying on details.