Analysis updated 2026-06-24
Use the GIFs in slides or blog posts to teach how convolution output size is computed
Regenerate the animations with custom filter sizes for your own paper figures
Compile the LaTeX paper locally and cite the resulting PDF in coursework
Reference the transposed and dilated convolution diagrams when debugging CNN architectures
| | vdumoulin/conv_arithmetic | tuhdo/os01 | unicitynetwork/whitepaper |
|---|---|---|---|
| Stars | 14,644 | 13,543 | 13,178 |
| Language | TeX | TeX | TeX |
| Setup difficulty | moderate | hard | easy |
| Complexity | 2/5 | 5/5 | 1/5 |
| Audience | researcher | developer | researcher |
Figures from each repo's GitHub metadata at analysis time.
Rebuilding needs a full LaTeX toolchain plus Python animation dependencies. The README points to the arXiv paper for any actual explanation.
This repository is the source for a short technical report and a set of accompanying animations about convolution arithmetic in deep learning. Convolution is a mathematical operation used inside the neural networks that process images, and the arithmetic part is the small but fiddly set of rules for how the output size depends on the input size, the filter size, padding, and stride. The report is a paper by Vincent Dumoulin and Francesco Visin titled "A guide to convolution arithmetic for deep learning", available on arXiv as 1603.07285.
The most visible part of the repository is its animated GIFs, which show small grids of blue input squares being scanned by a moving filter to produce a cyan output. The README lays these out in three tables. The first table covers ordinary convolutions in seven variants: four without strides (no padding, arbitrary padding, half padding, and full padding) and three with strides (no padding, padding, and an odd case of padding with strides). The second table shows the same set of cases for transposed convolutions, which are sometimes used to upsample or reverse a convolution. The third table shows a single dilated convolution animation, where the filter skips positions on the input.
The rest of the README explains how to rebuild everything from source. There is a small shell script in bin/generate_makefile that creates the Makefile. Once that is in place, running make all_animations regenerates every GIF into the gif directory, with intermediate frames written as PDF and PNG into the pdf and png directories. Running plain make compiles the LaTeX document into the final PDF.
The README also reminds users that the code and images are free to use under the project's licence as long as the paper is properly cited. The README does not otherwise describe how the math works. For that, the reader is pointed to the arXiv paper.
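The size rules that the animations illustrate can be sketched in a few lines of Python. This is a minimal sketch based on the standard relationships covered in the Dumoulin and Visin paper, applied to one spatial axis. The function names `conv_output_size` and `transposed_conv_output_size` are hypothetical and are not part of the repository. The parameter `a` is assumed to be the remainder `(i + 2p - k) mod s` of the forward convolution being reversed.

```python
def conv_output_size(i, k, p=0, s=1, d=1):
    """Output length along one axis of a (possibly dilated) convolution.

    i: input size, k: filter size, p: zero padding on each side,
    s: stride, d: dilation rate (d=1 means an ordinary convolution).
    """
    if min(i, k, s, d) < 1 or p < 0:
        raise ValueError("sizes, stride and dilation must be >= 1; padding >= 0")
    # A dilated filter spans k + (k - 1)(d - 1) input positions.
    k_eff = k + (k - 1) * (d - 1)
    if i + 2 * p < k_eff:
        raise ValueError("filter does not fit inside the padded input")
    # Floor division drops positions that the last stride cannot cover.
    return (i + 2 * p - k_eff) // s + 1


def transposed_conv_output_size(i, k, p=0, s=1, a=0):
    """Output length of the transpose of a convolution with k, p, s.

    a (assumption): remainder (i_fwd + 2p - k) mod s of the forward
    convolution, needed to recover inputs the stride did not cover.
    """
    return s * (i - 1) + a + k - 2 * p


if __name__ == "__main__":
    print(conv_output_size(4, 3))             # no padding, no strides
    print(conv_output_size(5, 3, p=1))        # half padding keeps the size
    print(conv_output_size(5, 3, p=1, s=2))   # padding with strides
    print(conv_output_size(7, 3, d=2))        # dilated filter
    print(transposed_conv_output_size(2, 3))  # reverses the first case
```

Checking a result against the matching GIF is a quick way to confirm that the arguments mean what you expect before reusing the helper on a real architecture.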
Source for the Dumoulin and Visin paper on convolution arithmetic, plus animated GIFs that show how padding, stride, and dilation change CNN output sizes.
Mainly TeX. The stack also includes LaTeX and Make.
Setup difficulty is rated moderate, with roughly 30 minutes to a first successful run.
Mainly researchers.
Verify against the repo before relying on details.