explaingit

doctorwkt/acwj

13,217 | C | Audience · developer | Complexity · 3/5 | License | Setup · moderate

TLDR

A 64-part tutorial series that walks you through writing a real C compiler from scratch in C, starting from reading text tokens and ending with a compiler that can compile itself.

Mindmap

mindmap
  root((acwj))
    What it does
      Build a C compiler
      Step by step 64 parts
      Self-compiling target
    Topics covered
      Lexer and tokens
      Expressions variables
      Control flow loops
      Functions pointers
      Structs enums
      Preprocessor
    Output target
      x86-64 assembly
      Second CPU target
    License
      GPL3 source code
      Creative Commons docs
    Audience
      Developers
      CS learners

Things people build with this

USE CASE 1

Follow all 64 numbered parts in order to understand how a compiler works by building a real one

USE CASE 2

Study how a C compiler handles pointers, structs, type casting, and a preprocessor in actual working code

USE CASE 3

Use as a practical reference when implementing your own programming language or compiler project

Tech stack

C · x86-64 · Assembly

Getting it running

Difficulty · moderate | Time to first run · 30 min

Each part is self-contained and requires a C compiler plus a Linux environment to compile and run the examples.

Source code is licensed under GPL3: if you distribute modified versions, you must share the source under the same license. Written tutorial content is under Creative Commons.

In plain English

This repository documents a step-by-step journey to write a compiler from scratch in C. A compiler is a program that reads source code written by a programmer and translates it into machine instructions a computer can run. The author set out to build a compiler for a large subset of the C programming language, with the specific goal of making it capable of compiling itself, a benchmark in compiler development sometimes called self-compilation or bootstrapping.

The project is organized as a series of 64 numbered parts, each with its own folder and explanatory document. The series starts from the very basics, introducing how a compiler reads and recognizes tokens in source text, then progressively adds more complexity: arithmetic expressions, variables, control flow like if-statements and loops, functions, pointers, arrays, structs, unions, enums, type casting, and eventually a preprocessor. The final parts cover register spilling, lazy evaluation, passing a triple-compilation test, and adding support for a different CPU target.

Each part is written to be followed along, with explanations of both the practical code changes and the underlying theory when relevant. The author's stated approach is to stay focused on practice rather than academic formalism. Someone who wants to understand how programming languages work from the inside, or who is curious about what happens between writing code and running it, would find this series more accessible than most textbook treatments of the subject.

The author has since moved on to a new language project called alic and considers this series complete. The source code is licensed under GPL3, and the written documents are under Creative Commons. Some early code ideas were drawn from an existing open-source compiler called SubC.
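To give a feel for the starting point of the series, recognizing tokens in source text, here is a minimal sketch of a scanner in C. It is an illustration only, not code from the repository: the names `token_kind`, `struct token`, `next_token`, and `count_tokens` are hypothetical, and the sketch assumes input made up of non-negative integer literals and the four basic arithmetic operators.

```c
#include <ctype.h>

/* Token kinds for this sketch. Hypothetical names, not taken from acwj. */
enum token_kind {
    TOK_EOF, TOK_PLUS, TOK_MINUS, TOK_STAR, TOK_SLASH, TOK_INTLIT, TOK_UNKNOWN
};

struct token {
    enum token_kind kind;
    int value; /* only meaningful when kind == TOK_INTLIT */
};

/* Scan one token starting at *src and advance *src past it. */
struct token next_token(const char **src)
{
    const char *p = *src;
    struct token t = { TOK_EOF, 0 };

    /* Skip whitespace between tokens. */
    while (isspace((unsigned char)*p))
        p++;

    if (*p == '\0') {
        t.kind = TOK_EOF;
    } else if (isdigit((unsigned char)*p)) {
        /* Accumulate consecutive digits into one integer literal. */
        t.kind = TOK_INTLIT;
        while (isdigit((unsigned char)*p))
            t.value = t.value * 10 + (*p++ - '0');
    } else {
        /* Single-character operators; anything else is reported as unknown. */
        switch (*p++) {
        case '+': t.kind = TOK_PLUS;    break;
        case '-': t.kind = TOK_MINUS;   break;
        case '*': t.kind = TOK_STAR;    break;
        case '/': t.kind = TOK_SLASH;   break;
        default:  t.kind = TOK_UNKNOWN; break;
        }
    }

    *src = p;
    return t;
}

/* Count the tokens in a string, excluding the final end-of-input token. */
int count_tokens(const char *src)
{
    int n = 0;
    while (next_token(&src).kind != TOK_EOF)
        n++;
    return n;
}
```

Calling `next_token` repeatedly on a string such as `"2 + 34 * 5"` yields an integer literal, a plus, another literal, a star, a final literal, and then the end-of-input token. A parser for arithmetic expressions would consume this stream one token at a time.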

Copy-paste prompts

Prompt 1
I am at part 5 of the acwj compiler series. Help me understand how the symbol table works and how it connects to code generation for variable assignments.
Prompt 2
Show me how the acwj compiler handles pointer arithmetic and why the type system needs to track base types.
Prompt 3
I want to add a new binary operator to the acwj compiler. Walk me through which files to modify and how the parser, AST, and code generator each need to change.
Prompt 4
Explain how the self-compilation test in acwj works. What does it mean for a compiler to compile itself, and how do you verify the output is correct?

Verify against the repo before relying on details.