explaingit

borisbanushev/stockpredictionai

5,569 | JavaScript | Audience · researcher | Complexity · 5/5 | Setup · hard

TLDR

A Jupyter notebook walkthrough for building a stock price prediction pipeline using a GAN, LSTM, sentiment analysis, Fourier transforms, and Bayesian optimization. It is educational, not trading advice.

Mindmap

mindmap
  root((repo))
    Model architecture
      GAN generator
      GAN discriminator
      LSTM time series
      CNN discriminator
    Input signals
      Historical price data
      Technical indicators
      BERT news sentiment
      Fourier transforms
      ARIMA modeling
    Tuning
      Bayesian optimization
      Reinforcement learning
    Tech Stack
      Python
      MXNet Gluon
      Multiple GPUs

Things people build with this

USE CASE 1

Study how to combine GAN, LSTM, NLP, and ARIMA into one time-series prediction pipeline.

USE CASE 2

Use the notebook as a template for building your own stock or asset price prediction system.

USE CASE 3

Learn how Bayesian optimization with reinforcement learning can tune a GAN during training.

USE CASE 4

Explore how BERT-based sentiment analysis of financial news can be added as an input signal to a trading model.
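As a rough illustration of the last use case, the sketch below shows one way a news sentiment signal could be attached to daily price rows. It is not the notebook's code. It assumes each headline has already been scored in the range -1 to 1 by a separate classifier (the BERT step itself is not shown), and the function names, the `sentiment` column, the dates, and every number in the sample data are hypothetical placeholders.

```python
from collections import defaultdict


def daily_sentiment(scored_headlines):
    """Average per-headline scores into one score per date."""
    buckets = defaultdict(list)
    for date, score in scored_headlines:
        buckets[date].append(score)
    return {date: sum(scores) / len(scores) for date, scores in buckets.items()}


def add_sentiment_feature(price_rows, scored_headlines, neutral=0.0):
    """Return copies of the price rows with a 'sentiment' field added.

    Days without any headline get the neutral value (an assumption;
    forward-filling the previous day's score is another option).
    """
    by_date = daily_sentiment(scored_headlines)
    return [
        {**row, "sentiment": by_date.get(row["date"], neutral)}
        for row in price_rows
    ]


# Made-up sample data, for illustration only.
price_rows = [
    {"date": "2019-01-02", "close": 100.0},
    {"date": "2019-01-03", "close": 101.5},
    {"date": "2019-01-04", "close": 99.8},
]
scored_headlines = [
    ("2019-01-02", 0.75),
    ("2019-01-02", 0.25),
    ("2019-01-04", -0.4),
]

for row in add_sentiment_feature(price_rows, scored_headlines):
    print(row)
```

Averaging per day keeps the feature aligned with daily price rows. A real pipeline would also need to decide how to treat news published after the market closes so that the model never sees information from the future.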

Tech stack

Python, Jupyter, MXNet, Gluon, LSTM, BERT, ARIMA

Getting it running

Difficulty · hard | Time to first run · 1 day+

Requires multiple GPUs and the MXNet framework with its Gluon API. The notebook was published in January 2019, so some dependencies may be outdated.

License terms were not clearly stated in the README.

In plain English

This repository is a Jupyter notebook walkthrough for building a stock price prediction system, published in January 2019. The author works through predicting daily price movements for Goldman Sachs stock using a combination of AI techniques stacked together in one pipeline.

The central model is a Generative Adversarial Network, or GAN. In this setup, one neural network (the generator) attempts to produce realistic stock price sequences, while a second network (the discriminator) tries to tell real data from generated data. By competing against each other during training, both networks improve. The generator uses an LSTM, a type of network suited to time-series data, and the discriminator uses a convolutional neural network.

What makes the notebook unusual is the number of input signals it feeds into this model. Beyond basic historical price data and standard technical indicators like moving averages and Bollinger bands, it incorporates NLP sentiment analysis using BERT to process financial news, Fourier transforms to capture long-term trend directions, autoregressive time-series modeling via ARIMA, and an autoencoder to extract higher-level patterns that simpler features might miss. Correlated assets such as commodities and currency pairs are also included.

Tuning the GAN's settings is handled through Bayesian optimization combined with reinforcement learning techniques (including Rainbow and PPO algorithms), which decide when and how to adjust the model during training. The code uses MXNet with its Gluon API and is designed to run on multiple GPUs. The notebook includes a disclaimer that this is an educational exploration of techniques, not investment advice. The full README is longer than what was shown.
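The Fourier step mentioned above can be illustrated with a minimal sketch: transform the price series, keep only the lowest-frequency components, and invert the result to get a smoothed trend. This is not the notebook's implementation. It uses a naive O(n^2) discrete Fourier transform from the standard library so that it runs without dependencies (a real pipeline would more likely use an FFT routine), the names `fourier_trend` and `num_components` are hypothetical, and the price series is synthetic.

```python
import cmath
import math


def dft(values):
    """Naive O(n^2) discrete Fourier transform; fine for short demo series."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Inverse of dft(); returns complex values."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def fourier_trend(prices, num_components):
    """Low-pass filter: keep the mean plus the num_components lowest frequencies.

    Each kept frequency k is paired with its mirror n - k so that the
    reconstructed series stays real-valued.
    """
    n = len(prices)
    coeffs = dft(prices)
    filtered = [
        c if (k <= num_components or k >= n - num_components) else 0j
        for k, c in enumerate(coeffs)
    ]
    return [value.real for value in inverse_dft(filtered)]


# Synthetic series: a slow cycle plus a faster wiggle (not real market data).
n = 64
slow_part = [100 + 5 * math.sin(2 * math.pi * t / n) for t in range(n)]
prices = [s + math.sin(2 * math.pi * 12 * t / n) for t, s in enumerate(slow_part)]

trend = fourier_trend(prices, num_components=3)
print("max deviation from slow cycle:",
      max(abs(a - b) for a, b in zip(trend, slow_part)))
```

Keeping fewer components gives a smoother, longer-term trend, and keeping more lets the curve follow the prices more closely. Several such curves at different cutoffs could then serve as separate input features.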

Copy-paste prompts

Prompt 1
How does the GAN in this stock prediction notebook work? Explain the generator and discriminator roles in plain terms.
Prompt 2
Show me how to adapt the stockpredictionai notebook to predict prices for a different stock ticker using my own CSV data.
Prompt 3
How does the Bayesian optimization and reinforcement learning tuning work in this notebook, and how do I adjust it?
Prompt 4
What does the Fourier transform step do in this stock prediction model and how does it capture long-term price trends?
Prompt 5
How do I add BERT-based sentiment analysis from financial news headlines as an input feature in this pipeline?

Verify against the repo before relying on details.