asaptf/swift-language-models

Analysis updated 2026-05-18

★ 6 | Swift | Audience · researcher | Complexity · 4/5 | Setup · moderate

Mindmap

mindmap
  root((swift-models))
    Four Parts
      N-gram counting
      Neural autograd
      GPT Transformer
      RAG pipeline
    Key Concepts
      Tokenization
      Backpropagation
      Self-attention
      BM25 retrieval
    Tech Stack
      Swift
      MLX Apple Silicon
      Llama 2
    Platforms
      Linux
      macOS
      Apple Silicon

What do people build with it?

USE CASE 1

Learn how backpropagation works by reading and running a from-scratch autograd engine written in plain, commented Swift.

USE CASE 2

Train a small GPT-style Transformer on your own text file on Apple Silicon and watch the model overfit in real time.

USE CASE 3

Build a local RAG pipeline that answers questions about your own documents using BM25 text search and a local Llama 2 model with no cloud required.
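
The retrieval step in use case 3 relies on BM25 ranking. The following is a minimal sketch of BM25 scoring in Swift, not the repository's implementation: the function names `tokenize` and `bm25Scores`, the tokenizer rule, and the parameter defaults `k1 = 1.2` and `b = 0.75` are all assumptions chosen for illustration.

```swift
import Foundation

// Hypothetical helper: lowercase the text and split on anything that is
// not a letter or digit. The repository's tokenizer may differ.
func tokenize(_ text: String) -> [String] {
    text.lowercased()
        .split(whereSeparator: { !$0.isLetter && !$0.isNumber })
        .map(String.init)
}

// Hypothetical sketch: score every document against the query with BM25.
// k1 and b are common default values, assumed here.
func bm25Scores(query: String, documents: [String],
                k1: Double = 1.2, b: Double = 0.75) -> [Double] {
    let docs = documents.map(tokenize)
    let n = Double(docs.count)
    guard n > 0 else { return [] }
    let avgLength = Double(docs.map(\.count).reduce(0, +)) / n
    let queryTerms = Set(tokenize(query))

    // Document frequency: in how many documents does each query term occur?
    var docFreq: [String: Int] = [:]
    for doc in docs {
        for term in Set(doc) where queryTerms.contains(term) {
            docFreq[term, default: 0] += 1
        }
    }

    return docs.map { doc -> Double in
        // Term frequency inside this document.
        var termFreq: [String: Int] = [:]
        for term in doc { termFreq[term, default: 0] += 1 }

        // Longer-than-average documents are penalized through b.
        let relativeLength = avgLength > 0 ? Double(doc.count) / avgLength : 0
        let lengthNorm = 1 - b + b * relativeLength

        var score = 0.0
        for term in queryTerms {
            guard let tf = termFreq[term], let df = docFreq[term] else { continue }
            // Rare terms get a larger inverse document frequency weight.
            let idf = log(1 + (n - Double(df) + 0.5) / (Double(df) + 0.5))
            let f = Double(tf)
            score += idf * f * (k1 + 1) / (f + k1 * lengthNorm)
        }
        return score
    }
}
```

In a RAG pipeline, the highest-scoring passages would then be pasted into the prompt ahead of the user's question. A document that shares no term with the query scores exactly zero.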

What is it built with?

Swift, MLX, Llama 2

How does it compare?

	asaptf/swift-language-models	iamwilliamli/livetranscriber	crafcat7/peakmon
Stars	6	6	7
Language	Swift	Swift	Swift
Setup difficulty	moderate	hard	easy
Complexity	4/5	3/5	3/5
Audience	researcher	developer	developer

Figures from each repo's GitHub metadata at analysis time.

How do you get it running?

Difficulty · moderate | Time to first run · 30 min

Part 3 (GPT Transformer) requires an Apple Silicon Mac with MLX. Parts 1, 2, and 4 run on any macOS or Linux machine with Swift installed.

No license information is provided in the README.

In plain English

swift-language-models is a self-contained educational course that teaches how modern AI language models work, written entirely in Swift. Instead of wrapping a framework you cannot inspect, each of the four parts builds a working model from scratch in code you can read, run, and modify. The goal is understanding the actual mechanism, not just knowing how to call an API. The course is structured as four independent projects that you run with a single shell script.

The first part is an n-gram model: it counts how often each character follows the previous few characters, turns those counts into probabilities, and samples text from them. No learning happens here, but it shows what a language model is actually trying to do.

The second part introduces real learning by building a backpropagation engine (the math that lets a model improve by measuring its mistakes) from scratch, then using it to train a small neural network that predicts the next character.

The third part builds a GPT-style Transformer, the kind of architecture behind systems like ChatGPT. It uses Apple's MLX framework and runs on Apple Silicon hardware, training in minutes on a small text file. You can watch it overfit, which teaches why training and validation sets are kept separate.

The fourth part adds retrieval-augmented generation (RAG): a way of letting a language model answer questions about documents you provide by first searching for the relevant passages using a classic text-search method called BM25, then feeding those passages into the prompt. This part runs locally with no cloud connection.

Every line of code is commented. Each part is a complete, self-contained project with its own build and no shared state, so you can start with any of them. Parts 1, 2, and 4 run on both Linux and macOS. Part 3 requires Apple Silicon hardware. The repository does not specify a license.
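
The counting approach of the first part (count followers, normalize to probabilities, sample) can be sketched in a few lines of Swift. This is an illustrative sketch under assumed names (`buildCounts`, `probabilities`, `sampleNext`), not code taken from the repository.

```swift
// Hypothetical sketch: count how often each character follows the
// preceding `order` characters of context.
func buildCounts(text: String, order: Int) -> [String: [Character: Int]] {
    let chars = Array(text)
    var counts: [String: [Character: Int]] = [:]
    guard order > 0, chars.count > order else { return counts }
    for i in order..<chars.count {
        let context = String(chars[(i - order)..<i])
        counts[context, default: [:]][chars[i], default: 0] += 1
    }
    return counts
}

// Turn the follower counts of one context into probabilities that sum to 1.
func probabilities(_ followers: [Character: Int]) -> [Character: Double] {
    let total = Double(followers.values.reduce(0, +))
    guard total > 0 else { return [:] }
    return followers.mapValues { Double($0) / total }
}

// Sample the next character in proportion to how often it was seen.
func sampleNext(_ followers: [Character: Int]) -> Character? {
    let total = followers.values.reduce(0, +)
    guard total > 0 else { return nil }
    var remaining = Int.random(in: 0..<total)
    for (character, count) in followers {
        if remaining < count { return character }
        remaining -= count
    }
    return nil
}
```

To generate text, one would start from a seed context, call `sampleNext` on that context's counts, append the result, and slide the context forward by one character. Nothing is trained here: the "model" is just the table of counts.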

Copy-paste prompts

Prompt 1

I want to understand backpropagation by running swift-language-models Part 2 (neural-char). How do I build and run it, and what should I watch in the output as the loss drops during training?

Prompt 2

Walk me through Part 3 of swift-language-models: how do I train MiniGPT on my own text file using run.sh, what does overfitting look like in the numbers, and why does this part require Apple Silicon?

Prompt 3

I want to run the RAG part (Part 4, swift-rag) to answer questions about my own documents. What format should my corpus files be, and how do I run a search-only query without loading the language model?

Prompt 4

What is the conceptual difference between what Part 1 (n-gram counting) and Part 2 (neural learning) do, and why does the course make you build counting before introducing gradients?

Frequently asked questions

What is swift-language-models?

A four-part hands-on course in Swift that builds language models from scratch, going from simple letter counting through a GPT Transformer to retrieval-augmented generation.

What language is swift-language-models written in?

Mainly Swift. The stack also includes MLX and Llama 2.

What license does swift-language-models use?

No license information is provided in the README.

How hard is swift-language-models to set up?

Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.

Who is swift-language-models for?

Mainly researchers.

Verify against the repo before relying on details.