Analysis updated 2026-05-18
Learn how backpropagation works by reading and running a from-scratch autograd engine written in plain, commented Swift.
Train a small GPT-style Transformer on your own text file on Apple Silicon and watch the model overfit in real time.
Build a local RAG pipeline that answers questions about your own documents using BM25 text search and a local Llama 2 model with no cloud required.
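The BM25 retrieval step named in the last use case can be sketched in a few dozen lines of Swift. This is a minimal illustration, not the repository's code: the `BM25Index` type, its method names, the simple tokenizer, the defaults `k1 = 1.2` and `b = 0.75`, and the IDF variant with `+ 1` inside the logarithm are all assumptions made for the example.

```swift
import Foundation

// Hypothetical BM25 index; type and method names are illustrative only.
struct BM25Index {
    let k1: Double
    let b: Double
    private let docs: [[String]]        // tokenized documents
    private let docFreq: [String: Int]  // number of documents containing each term
    private let avgLength: Double       // average document length in tokens

    init(documents: [String], k1: Double = 1.2, b: Double = 0.75) {
        self.k1 = k1
        self.b = b
        let tokenized = documents.map(BM25Index.tokenize)
        var df: [String: Int] = [:]
        for tokens in tokenized {
            for term in Set(tokens) { df[term, default: 0] += 1 }
        }
        docs = tokenized
        docFreq = df
        let totalLength = tokenized.reduce(0) { $0 + $1.count }
        avgLength = tokenized.isEmpty ? 0 : Double(totalLength) / Double(tokenized.count)
    }

    // Assumed tokenizer: lowercase, split on anything that is not a letter or digit.
    static func tokenize(_ text: String) -> [String] {
        text.lowercased().split { !($0.isLetter || $0.isNumber) }.map(String.init)
    }

    // BM25 score of one document for a query.
    func score(query: String, documentIndex: Int) -> Double {
        let tokens = docs[documentIndex]
        guard !tokens.isEmpty, avgLength > 0 else { return 0 }
        var termFreq: [String: Int] = [:]
        for token in tokens { termFreq[token, default: 0] += 1 }
        let n = Double(docs.count)
        // Longer-than-average documents are penalized, shorter ones boosted.
        let lengthNorm = 1 - b + b * Double(tokens.count) / avgLength
        var total = 0.0
        for term in Set(BM25Index.tokenize(query)) {
            guard let f = termFreq[term], let df = docFreq[term] else { continue }
            let idf = log((n - Double(df) + 0.5) / (Double(df) + 0.5) + 1)
            total += idf * Double(f) * (k1 + 1) / (Double(f) + k1 * lengthNorm)
        }
        return total
    }

    // Indices of the best-scoring documents, highest score first.
    func search(_ query: String, limit: Int = 3) -> [Int] {
        let scored = docs.indices.map {
            (index: $0, value: score(query: query, documentIndex: $0))
        }
        return scored.filter { $0.value > 0 }
            .sorted { $0.value > $1.value }
            .prefix(limit)
            .map { $0.index }
    }
}
```

In a RAG pipeline, the passages returned by `search` would be pasted into the prompt ahead of the user's question. How the course itself wires that prompt to the local model is not described here, so that part is left out of the sketch.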
| | asaptf/swift-language-models | iamwilliamli/livetranscriber | crafcat7/peakmon |
|---|---|---|---|
| Stars | 6 | 6 | 7 |
| Language | Swift | Swift | Swift |
| Setup difficulty | moderate | hard | easy |
| Complexity | 4/5 | 3/5 | 3/5 |
| Audience | researcher | developer | developer |
Figures from each repo's GitHub metadata at analysis time.
Part 3 (GPT Transformer) requires an Apple Silicon Mac with MLX. Parts 1, 2, and 4 run on any macOS or Linux machine with Swift installed.
swift-language-models is a self-contained educational course that teaches how modern AI language models work, written entirely in Swift. Instead of wrapping a framework you cannot inspect, each of the four parts builds a working model from scratch in code you can read, run, and modify. The goal is understanding the actual mechanism, not just knowing how to call an API. The course is structured as four independent projects that you run with a single shell script.

The first part is an n-gram model: it counts how often each character follows the previous few characters, turns those counts into probabilities, and samples text from them. No learning happens here, but it shows what a language model is actually trying to do.

The second part introduces real learning by building a backpropagation engine (the math that lets a model improve by measuring its mistakes) from scratch, then using it to train a small neural network that predicts the next character.

The third part builds a GPT-style Transformer, the kind of architecture behind systems like ChatGPT. It uses Apple's MLX framework and runs on Apple Silicon hardware, training in minutes on a small text file. You can watch it overfit, which teaches why training and validation sets are kept separate.

The fourth part adds retrieval-augmented generation (RAG): a way of letting a language model answer questions about documents you provide by first searching for the relevant passages using a classic text-search method called BM25, then feeding those passages into the prompt. This part runs locally with no cloud connection.

Every line of code is commented. Each part is a complete, self-contained project with its own build and no shared state, so you can start with any of them. Parts 1, 2, and 4 run on both Linux and macOS. Part 3 requires Apple Silicon hardware. The repository does not specify a license.
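The backpropagation engine described for the second part can be illustrated with a tiny scalar autograd sketch. This is an assumption-laden example in the spirit of that description, not the course's implementation: the `Value` class, its stored closure, and the choice to support only `+` and `*` are hypothetical.

```swift
// Hypothetical scalar autograd node; not the repository's actual type.
final class Value {
    var data: Double
    var grad: Double = 0
    private let parents: [Value]
    // Closure that pushes this node's gradient back to its parents.
    private var backwardStep: () -> Void = {}

    init(_ data: Double, parents: [Value] = []) {
        self.data = data
        self.parents = parents
    }

    static func + (lhs: Value, rhs: Value) -> Value {
        let out = Value(lhs.data + rhs.data, parents: [lhs, rhs])
        // d(out)/d(lhs) = 1 and d(out)/d(rhs) = 1
        out.backwardStep = { [unowned out] in
            lhs.grad += out.grad
            rhs.grad += out.grad
        }
        return out
    }

    static func * (lhs: Value, rhs: Value) -> Value {
        let out = Value(lhs.data * rhs.data, parents: [lhs, rhs])
        // d(out)/d(lhs) = rhs and d(out)/d(rhs) = lhs
        out.backwardStep = { [unowned out] in
            lhs.grad += rhs.data * out.grad
            rhs.grad += lhs.data * out.grad
        }
        return out
    }

    // Apply the chain rule from this node back to every input.
    func backward() {
        var order: [Value] = []
        var visited = Set<ObjectIdentifier>()
        // Depth-first walk that lists each node after all of its parents.
        func visit(_ node: Value) {
            guard visited.insert(ObjectIdentifier(node)).inserted else { return }
            node.parents.forEach(visit)
            order.append(node)
        }
        visit(self)
        grad = 1
        for node in order.reversed() { node.backwardStep() }
    }
}
```

For example, with `a = Value(2)`, `b = Value(3)`, and `c = a * b + a`, calling `c.backward()` leaves `a.grad` at 4 (that is, `b + 1`) and `b.grad` at 2 (that is, `a`). Gradients accumulate with `+=` so that a value used more than once, like `a` here, receives the sum of its contributions.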
A four-part hands-on course in Swift that builds language models from scratch, going from simple letter counting through a GPT Transformer to retrieval-augmented generation.
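The "simple letter counting" starting point can be sketched as a character n-gram model: count which character follows each short context, normalize the counts into probabilities, and sample from them. The `CharNGramModel` type and its method names below are hypothetical and chosen only for illustration.

```swift
// Hypothetical character n-gram model; names are illustrative,
// not taken from the repository.
struct CharNGramModel {
    let order: Int  // number of preceding characters used as context
    private(set) var counts: [String: [Character: Int]] = [:]

    init(order: Int) {
        precondition(order >= 1, "order must be at least 1")
        self.order = order
    }

    // Count how often each character follows each `order`-length context.
    mutating func train(on text: String) {
        let chars = Array(text)
        guard chars.count > order else { return }
        for i in order..<chars.count {
            let context = String(chars[(i - order)..<i])
            counts[context, default: [:]][chars[i], default: 0] += 1
        }
    }

    // Turn the raw counts for one context into a probability distribution.
    func probabilities(after context: String) -> [Character: Double] {
        guard let following = counts[context] else { return [:] }
        let total = Double(following.values.reduce(0, +))
        return following.mapValues { Double($0) / total }
    }

    // Sample the next character in proportion to its count.
    func sampleNext<G: RandomNumberGenerator>(
        after context: String, using rng: inout G
    ) -> Character? {
        guard let following = counts[context] else { return nil }
        let total = following.values.reduce(0, +)
        var threshold = Int.random(in: 0..<total, using: &rng)
        // Sort by character so the walk order is stable for a given generator.
        for (char, count) in following.sorted(by: { $0.key < $1.key }) {
            if threshold < count { return char }
            threshold -= count
        }
        return nil
    }

    // Generate text by repeatedly sampling and sliding the context window.
    func generate<G: RandomNumberGenerator>(
        seed: String, length: Int, using rng: inout G
    ) -> String {
        var output = Array(seed)
        while output.count < length, output.count >= order {
            let context = String(output.suffix(order))
            guard let next = sampleNext(after: context, using: &rng) else { break }
            output.append(next)
        }
        return String(output)
    }
}
```

Trained with `order: 2` on the string `"abababac"`, the context `"ba"` is followed by `b` twice and `c` once, so `probabilities(after: "ba")` gives `b` a probability of 2/3 and `c` a probability of 1/3. Generation stops early when it reaches a context that never appeared in the training text, which is one visible limit of pure counting.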
Mainly Swift. The stack also includes MLX and Llama 2.
No license information is provided in the README.
Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.
Mainly researchers.
Verify against the repo before relying on details.