Reproduce the 4096 token benchmark and plot O(n log n) vs O(n squared) curves
Swap the FFT mixer into a small Transformer to compare loss and speed
Run the long range mixing test that measures cosine similarity across 256 token gaps
Read the bundled paper to learn how a learned frequency filter substitutes for QK attention
Plain pip install of torch, numpy, and matplotlib is enough; commercial use is blocked by the custom license.
Spectron is a research project by Fardin Sabid in Bangladesh that proposes a different way to perform the attention step inside a language model. Modern language models such as GPT are built on Transformers, and the part of a Transformer that lets each token look at every other token is called self-attention. The standard recipe multiplies a query matrix Q by the transpose of a key matrix K, producing an n by n score matrix for a sequence of n tokens. That is fine for short text, but its time and memory cost grow with the square of the sequence length. For a 128,000 token input, the README points out that the attention step alone would need around 68 gigabytes of memory.

The Spectron idea is to skip that big matrix entirely. Instead of comparing every token with every other token directly, the input sequence is passed through a Fast Fourier Transform (FFT), which converts it into a frequency representation, multiplied elementwise by a learned filter W, and then passed back through the inverse FFT. The README sums this up as three lines of PyTorch and writes the operation as IFFT(W ⊙ FFT(x)). The author explains that low frequencies capture long range structure while high frequencies capture local detail, and that the learned filter W decides which frequencies matter for each dimension.

The README claims this runs in O(n log n) time instead of O(n squared) and backs the claim with a small benchmark table. At 4096 tokens, Spectron is reported as 15.4 times faster than a Transformer baseline. There are also figures that fit the measured runtimes to the two complexity curves, and a long range mixing test in which tokens 256 positions apart end up with a cosine similarity of 0.91. The model used for the benchmark has 1.9 million parameters.

To try it locally, clone the repo, pip install torch, numpy, and matplotlib, and run python test.py. The project ships a research paper PDF and a citation file.
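The 68 gigabyte figure is consistent with a single attention score matrix stored as 32-bit floats, if "128,000 tokens" is read as 128 x 1024 = 131,072. Both the float32 storage and the 131,072 reading are assumptions made here for illustration, not statements from the README. The helper name attention_matrix_bytes is hypothetical.

```python
def attention_matrix_bytes(n_tokens: int, bytes_per_entry: int = 4) -> int:
    """Bytes needed to hold one n x n attention score matrix."""
    return n_tokens * n_tokens * bytes_per_entry


# Assumption: "128,000 tokens" means 128 * 1024 = 131,072 tokens,
# and each score is a 4-byte float32.
n = 128 * 1024
gigabytes = attention_matrix_bytes(n) / 1e9
print(f"{gigabytes:.1f} GB")  # prints "68.7 GB"
```

The count covers one matrix only; a real model holds one per attention head and per layer unless it uses a memory-efficient attention kernel, so this is a lower bound on the naive cost.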
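The operation IFFT(W ⊙ FFT(x)) described above can be sketched as a small PyTorch module. This is a minimal illustration under stated assumptions, not the repository's code: the class name FFTMixer, the real-valued filter of shape (frequencies, channels), and the all-ones initialisation are all hypothetical choices, and the project's own parameterisation may differ.

```python
import torch
import torch.nn as nn


class FFTMixer(nn.Module):
    """Hypothetical token mixer: IFFT(W * FFT(x)) along the sequence axis."""

    def __init__(self, seq_len: int, dim: int):
        super().__init__()
        # rfft of a real signal keeps only the non-negative frequencies.
        n_freq = seq_len // 2 + 1
        # Assumption: one real gain per (frequency, channel), initialised
        # to 1 so the layer starts out as the identity map.
        self.filter = nn.Parameter(torch.ones(n_freq, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, seq_len, dim).
        spectrum = torch.fft.rfft(x, dim=1)   # tokens -> frequencies
        spectrum = spectrum * self.filter     # apply the learned filter W
        # frequencies -> tokens, restoring the original sequence length
        return torch.fft.irfft(spectrum, n=x.shape[1], dim=1)


x = torch.randn(2, 16, 8)       # batch of 2, 16 tokens, 8 channels
y = FFTMixer(seq_len=16, dim=8)(x)
print(y.shape)                  # torch.Size([2, 16, 8])
```

With the all-ones filter the layer returns its input unchanged, and training would move individual gains away from 1 to emphasise or suppress frequencies. Note that a per-frequency multiplication is a circular convolution over the whole sequence, so this sketch is not causal: every output position depends on later positions as well as earlier ones.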
The license is a custom Spectron Research and Ethical License v1.0 that allows free use for research and personal projects but requires a separate license for commercial use and forbids military or surveillance use.
Generated 2026-05-22 · Model: sonnet-4-6 · Verify against the repo before relying on details.