fardinsabid/spectron

Analysis updated 2026-06-24

★ 1 Python Audience · researcher Complexity · 3/5 License Setup · easy

Mindmap

mindmap
  root((spectron))
    Inputs
      Token sequence
      Learned filter W
    Outputs
      Mixed token sequence
      Benchmark plots
      Paper PDF
    Use Cases
      Study FFT attention
      Compare with Transformer
      Long range mixing test
    Tech Stack
      Python
      PyTorch
      NumPy
      Matplotlib


What do people build with it?

USE CASE 1

Reproduce the 4096 token benchmark and plot O(n log n) vs O(n squared) curves

USE CASE 2

Swap the FFT mixer into a small Transformer to compare loss and speed

USE CASE 3

Run the long range mixing test that measures cosine similarity across 256 token gaps

USE CASE 4

Read the bundled paper to learn how a learned frequency filter substitutes for QK attention
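
As a starting point for use case 1, the following is a minimal timing sketch, not the repo's own test.py. The names fft_mix, dot_product_attention, time_fn, and run_benchmark are hypothetical, the filter is fixed to ones rather than learned, and the attention baseline leaves out the Q, K, and V projections, so the absolute numbers will not match the README's table.

```python
import time

import torch


def fft_mix(x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    """FFT along the sequence axis, multiply by a filter, inverse FFT."""
    spectrum = torch.fft.rfft(x, dim=1)
    return torch.fft.irfft(spectrum * w, n=x.shape[1], dim=1)


def dot_product_attention(x: torch.Tensor) -> torch.Tensor:
    """Simplified baseline: x plays the role of Q, K, and V (no projections)."""
    scores = x @ x.transpose(1, 2) / x.shape[-1] ** 0.5  # (batch, n, n)
    return torch.softmax(scores, dim=-1) @ x


def time_fn(fn, *args, repeats: int = 3) -> float:
    """Average wall-clock seconds per call, after one warm-up call."""
    fn(*args)
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats


def run_benchmark(lengths=(256, 1024, 4096), dim: int = 64):
    """Return a list of (seq_len, fft_seconds, attention_seconds)."""
    results = []
    with torch.no_grad():
        for n in lengths:
            x = torch.randn(1, n, dim)
            w = torch.ones(n // 2 + 1, dim)  # placeholder filter, not learned
            results.append(
                (n, time_fn(fft_mix, x, w), time_fn(dot_product_attention, x))
            )
    return results


if __name__ == "__main__":
    for n, t_fft, t_attn in run_benchmark():
        print(f"n={n:5d}  fft={t_fft * 1e3:8.3f} ms  attention={t_attn * 1e3:8.3f} ms")
```

The returned tuples can be passed to Matplotlib to plot both curves against sequence length.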

What is it built with?

Python, PyTorch, NumPy, Matplotlib

How does it compare?

	fardinsabid/spectron	a-bissell/unleash-lite	abhiinnovates/whatsapp-hr-assistant
Stars	1	1	1
Language	Python	Python	Python
Setup difficulty	easy	hard	hard
Complexity	3/5	4/5	3/5
Audience	researcher	researcher	developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · easy Time to first run · 30 min

A plain pip install of torch, numpy, and matplotlib is enough. Commercial use is blocked by the custom license.

The custom Spectron Research and Ethical License v1.0 allows research and personal use only. Commercial use needs a separate license, and military or surveillance use is forbidden.

In plain English

Spectron is a research project by Fardin Sabid in Bangladesh that proposes a different way to do the attention step inside a language model. Modern language models like GPT are built on Transformers, and the part of a Transformer that lets each word look at every other word is called self-attention. The standard recipe multiplies a matrix called Q by the transpose of K, producing one score for every pair of tokens. That is fine for short text, but the cost grows with the square of the sequence length. For a 128,000 token input, the README points out that the attention step alone would need around 68 gigabytes of memory.

The Spectron idea is to skip that big matrix entirely. Instead of comparing every token to every other token directly, the input sequence is sent through a Fast Fourier Transform, which converts it into a frequency representation, then multiplied by a learned filter, then sent back through the inverse Fourier Transform. The README sums this up as three lines of PyTorch and writes the operation as IFFT(W ⊙ FFT(x)). The author explains that low frequencies capture long range structure while high frequencies capture local details, and the learned filter W decides which frequencies matter for each dimension.

The README claims this runs in O(n log n) time instead of O(n squared) and shows a small benchmark table to back it up. At 4096 tokens, Spectron is reported as 15.4 times faster than a Transformer baseline. There are also figures fitting the measured runtimes to the two complexity curves, and a long range mixing test where tokens 256 positions apart end up with cosine similarity 0.91. The model used for the benchmark has 1.9 million parameters.

To try it locally, clone the repo, pip install torch, numpy, and matplotlib, and run python test.py. The project ships a research paper PDF and a citation file.
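
The IFFT(W ⊙ FFT(x)) operation described above can be sketched in PyTorch roughly as follows. This is an illustration under stated assumptions, not the repo's actual code: the class name SpectralMixer, the complex-valued filter with one weight per frequency and feature dimension, the identity initialisation, and the fixed sequence length are all hypothetical choices.

```python
import torch
import torch.nn as nn


class SpectralMixer(nn.Module):
    """Hypothetical FFT token mixer: y = IFFT(W * FFT(x)) along the sequence axis."""

    def __init__(self, seq_len: int, dim: int):
        super().__init__()
        self.seq_len = seq_len
        n_freq = seq_len // 2 + 1  # rfft keeps only the non-negative frequencies
        # Assumption: complex filter stored as (real, imag) pairs,
        # initialised to 1 + 0j so the layer starts as an identity map.
        init = torch.zeros(n_freq, dim, 2)
        init[..., 0] = 1.0
        self.filter = nn.Parameter(init)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: real tensor of shape (batch, seq_len, dim)
        spectrum = torch.fft.rfft(x, dim=1)      # tokens -> frequencies
        w = torch.view_as_complex(self.filter)   # (n_freq, dim), complex
        spectrum = spectrum * w                  # elementwise filter, W ⊙ FFT(x)
        return torch.fft.irfft(spectrum, n=self.seq_len, dim=1)  # back to tokens


if __name__ == "__main__":
    mixer = SpectralMixer(seq_len=4096, dim=64)
    tokens = torch.randn(2, 4096, 64)
    print(mixer(tokens).shape)  # same shape as the input
```

A pointwise product in the frequency domain is a circular convolution along the sequence, so each output position in this sketch depends on every input position, including later ones. A causal language model would need further changes that are not shown here.
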
The license is a custom Spectron Research and Ethical License v1.0 that allows free use for research and personal projects but requires a separate license for commercial use and forbids military or surveillance use.
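
The 68 gigabyte figure quoted above can be sanity-checked with a few lines of arithmetic. The sketch below assumes a single n by n score matrix stored as 4-byte floats with one head and batch size 1. The README's exact assumptions are not restated here, and the helper name attention_matrix_bytes is hypothetical.

```python
def attention_matrix_bytes(seq_len: int, bytes_per_value: int = 4, heads: int = 1) -> int:
    """Bytes needed to hold the full seq_len x seq_len attention score matrix."""
    return heads * seq_len * seq_len * bytes_per_value


if __name__ == "__main__":
    for n in (4096, 128_000, 131_072):
        gb = attention_matrix_bytes(n) / 1e9  # decimal gigabytes
        print(f"{n:>7} tokens: {gb:8.2f} GB")
```

Under these assumptions, 128,000 tokens comes to about 65.5 GB and 131,072 tokens (128 × 1024) to about 68.7 GB, which is in line with the figure the README cites.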

Copy-paste prompts

Prompt 1

Clone fardinsabid/spectron, install torch and numpy, and run python test.py end to end on CPU

Prompt 2

Walk me through the three line IFFT(W times FFT(x)) operation in the Spectron code and how W is learned

Prompt 3

Modify the Spectron benchmark to compare 8192 token runs against a vanilla scaled dot product attention baseline

Prompt 4

Explain what the Spectron Research and Ethical License lets me do if I want to use this in a commercial product

Prompt 5

Rewrite the long range mixing test in Spectron so it logs cosine similarity at gaps of 64, 256, and 1024

Frequently asked questions

What is spectron?

Research PyTorch implementation of an FFT-based replacement for self-attention that claims O(n log n) cost and a 15x speedup at 4096 tokens.

What language is spectron written in?

Mainly Python. The stack also includes PyTorch, NumPy, and Matplotlib.

What license does spectron use?

The custom Spectron Research and Ethical License v1.0 allows research and personal use only. Commercial use needs a separate license, and military or surveillance use is forbidden.

How hard is spectron to set up?

Setup difficulty is rated easy, with roughly 30 minutes to a first successful run.

Who is spectron for?

Mainly researchers.

Verify against the repo before relying on details.