This repository is the source for a short technical report and a set of accompanying animations about convolution arithmetic in deep learning. Convolution is the mathematical operation at the core of the neural networks that process images, and the "arithmetic" is the small but fiddly set of rules for how the output size depends on the input size, the filter size, the padding, and the stride. The report is a paper by Vincent Dumoulin and Francesco Visin, "A guide to convolution arithmetic for deep learning", available on arXiv as 1603.07285.

The most visible part of the repository is its animated GIFs, which show a filter moving across a small grid of blue input squares to produce a cyan output grid. The README lays these out in three tables. The first covers ordinary convolutions in seven variants: four without strides (no padding, arbitrary padding, half padding, and full padding) and three with strides (no padding, padding, and a second padded case labelled odd). The second table shows the same set of cases for transposed convolutions, which are sometimes used to upsample a feature map or to map back to the shape of a convolution's input. The third table shows a single dilated convolution animation, in which the filter skips positions on the input.

The rest of the README explains how to rebuild everything from source. A small script, bin/generate_makefile, creates the Makefile. Once that is in place, running make all_animations regenerates every GIF into the gif directory, with the individual frames written as PDF and PNG files into the pdf and png directories. Running plain make compiles the LaTeX document into the final PDF.

The README also reminds users that the code and images are free to use under the project's licence as long as the paper is properly cited. It does not explain how the math works; for that, the reader is pointed to the arXiv paper.
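The README itself gives no formulas, so the following is only a minimal sketch of the size rules the summary alludes to. It uses the standard one-axis relationships, with i the input size, k the filter size, p the padding, s the stride, and d the dilation. The function names and the example sizes are chosen here for illustration and are not taken from the repository.

```python
def conv_output_size(i, k, p=0, s=1, d=1):
    """Output size along one axis: floor((i + 2p - k_eff) / s) + 1."""
    # A dilated filter with k taps spans k + (k - 1)(d - 1) input positions.
    k_eff = k + (k - 1) * (d - 1)
    if i + 2 * p < k_eff:
        raise ValueError("filter does not fit inside the padded input")
    return (i + 2 * p - k_eff) // s + 1


def transposed_conv_output_size(i, k, p=0, s=1, a=0):
    """Output size of the transpose of a convolution with filter k, padding p, stride s.

    `a` is the number of trailing input positions the forward convolution
    left uncovered, i.e. (i_forward + 2p - k) % s. It is 0 when the stride
    divides evenly.
    """
    return s * (i - 1) + k - 2 * p + a


# Illustrative sizes (assumptions, not read from the GIFs):
no_padding = conv_output_size(4, k=3)                   # 2
half_padding = conv_output_size(5, k=3, p=1)            # 5, same as the input
full_padding = conv_output_size(5, k=3, p=2)            # 7, larger than the input
strided = conv_output_size(5, k=3, p=1, s=2)            # 3
strided_odd = conv_output_size(6, k=3, p=1, s=2)        # 3, last row and column unused
dilated = conv_output_size(7, k=3, d=2)                 # 3
upsampled = transposed_conv_output_size(2, k=3)         # 4
restored = transposed_conv_output_size(3, k=3, p=1, s=2, a=1)  # 6
```

The two strided cases show why the arithmetic is fiddly: inputs of size 5 and 6 both give an output of size 3, so the transposed direction needs the extra term `a` to recover the size-6 input.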
Generated 2026-05-21 · Model: sonnet-4-6 · Verify against the repo before relying on details.