explaingit

chz1y/steerable-music-transformer

21
Python
Audience · researcher
Complexity · 4/5
Setup · hard

TLDR

A small AI model that generates Bach-style choral music with precise bar-by-bar control over rhythmic density, without breaking the harmonic structure. Trained on 349 Bach pieces using a compact 4-layer Transformer.

Mindmap

mindmap
  root((repo))
    What it does
      Control rhythm density
      Generate Bach chorales
      Bar by bar steering
    How it works
      Complexity tags Level 1-10
      REMI+ note encoding
      Micro-Transformer model
    Training Data
      349 Bach pieces
      MIDI source files
      music21 library
    Results
      0.893 correlation score
      Harmonic structure preserved
      Reproducible pipeline

Things people build with this

USE CASE 1

Generate Bach-style choral music where you control how note-dense each bar sounds, from sparse to busy.

USE CASE 2

Research how to steer AI music generation without accidentally distorting the harmonic quality.

USE CASE 3

Reproduce the paper's experiments by building the dataset from MIDI files and training the model yourself.

USE CASE 4

Explore how small, well-trained AI models can produce controllable musical output without needing massive compute.

Tech stack

Python, PyTorch, music21, MIDI, REMI+, Transformer

Getting it running

Difficulty · hard
Time to first run · 1 day+

Requires building a MIDI dataset via music21, running a preprocessing script, training the model, then generating and evaluating samples. These steps run in sequence, and all of them must finish before there is any output.

No license information was mentioned in the explanation.

In plain English

This repository is the code for a research paper about generating music with precise control over how rhythmically busy each musical bar sounds. The problem the paper addresses is that when AI music systems are asked to produce denser rhythms, they often break the harmonic structure of the music at the same time. This project introduces a method to adjust one of those qualities independently of the other.

The approach works by training a small neural network on 349 four-part choral pieces by J.S. Bach. Before training, each bar of music in the dataset is labeled with a complexity tag on a scale from Level 1 to Level 10, indicating how densely packed with notes that bar is. The model learns to generate music according to whichever tag is supplied at the start of each bar, allowing you to dial the rhythmic density up or down on a bar-by-bar basis.

The research uses a music encoding format called REMI+ to represent the notes as a sequence of tokens, similar to how text is represented for language models. The model itself is described as a micro-Transformer: only four layers and eight attention heads, which is quite small compared to typical language models. The README argues that with well-prepared, high-purity training data you do not need a much larger model to achieve reliable, steerable output.

The paper reports a Pearson correlation of 0.893 between the requested complexity level and the actual note density of what the model generates, and shows through a separate analysis that increasing rhythmic density does not measurably increase harmonic noise.

To reproduce the results, you build the training dataset from MIDI files with a provided script, train the model with another script, generate samples, and then produce evaluation charts. The MIDI source data comes from the open-source music21 library.
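
The per-bar tagging and the correlation check described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's code: the equal-width binning rule, the function names (`density_to_level`, `tag_bars`, `pearson`), and the token spellings (`Complexity_3`, `Pitch_60`) are all hypothetical, and the numbers in the demo are made up.

```python
import math
from typing import List

NUM_LEVELS = 10  # the text above describes tags from Level 1 to Level 10


def density_to_level(note_count: int, max_notes: int) -> int:
    """Map a bar's note count to a level in 1..NUM_LEVELS.

    Assumption: equal-width bins over [0, max_notes]. The repository's
    actual binning rule is not stated in the text above.
    """
    if max_notes <= 0:
        raise ValueError("max_notes must be positive")
    clipped = min(max(note_count, 0), max_notes)
    return min(NUM_LEVELS, 1 + (clipped * NUM_LEVELS) // max_notes)


def tag_bars(bars: List[List[str]], max_notes: int) -> List[str]:
    """Flatten bars into one token list, with a tag before each bar.

    Token names ("Complexity_3", "Pitch_60") are placeholders, not the
    repository's vocabulary.
    """
    tokens: List[str] = []
    for bar in bars:
        note_count = sum(1 for tok in bar if tok.startswith("Pitch_"))
        tokens.append(f"Complexity_{density_to_level(note_count, max_notes)}")
        tokens.extend(bar)
    return tokens


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up toy bars, for illustration only.
    bars = [
        ["Position_0", "Pitch_60"],
        ["Position_0", "Pitch_60", "Position_8", "Pitch_64"],
    ]
    print(tag_bars(bars, max_notes=4))

    # Made-up numbers: requested levels vs. note counts of generated bars.
    requested = [2, 2, 5, 8, 8]
    measured = [3, 4, 9, 14, 15]
    print(round(pearson(requested, measured), 3))
```

At generation time the same idea runs in reverse: the caller places the desired tag token at the start of each bar and the model continues from there. Evaluation then compares the requested levels against the note counts measured in the generated bars, which is the kind of comparison the reported 0.893 figure summarizes.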

Copy-paste prompts

Prompt 1
I have the steerable-music-transformer repo. Walk me through building the training dataset from MIDI files using the provided script, then training the model from scratch.
Prompt 2
Using the steerable-music-transformer, how do I generate a Bach-style chorale where the first two bars are rhythmically sparse (Level 2) and the last two bars are dense (Level 8)?
Prompt 3
Explain how REMI+ encoding works in the steerable-music-transformer and why it was chosen over standard MIDI representation for training a Transformer model.
Prompt 4
I want to evaluate my generated samples from steerable-music-transformer. How do I reproduce the Pearson correlation chart and the harmonic noise analysis from the paper?
Prompt 5
Can I fine-tune the steerable-music-transformer on a different composer's MIDI files instead of Bach? What parts of the data preparation script would I need to change?

Verify against the repo before relying on details.