Introduction

For someone who has just started with deep learning, keeping track of the dimensions can be a little daunting. For detailed explanations and visualizations, I highly recommend going through ‘A guide to convolution arithmetic for deep learning’ by Dumoulin et al.

The following notation is used in the calculations that follow.

Notations:

  • 2D discrete convolutions, $N = 2$
  • square input size, $i$
  • square kernel size, $k$
  • strides, $s$
  • padding, $p$
  • output size, $o$

No zero padding, unit strides

\(o = (i-k) + 1\)

Zero padding and unit strides

\(o = (i-k) + 2p + 1\)
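As a quick sanity check, the two unit-stride cases can be sketched as one small function. The name `conv_out_unit_stride` is mine, picked for illustration; setting `p=0` gives the no-padding case.

```python
def conv_out_unit_stride(i: int, k: int, p: int = 0) -> int:
    """Output size along one axis for stride 1: o = (i - k) + 2p + 1."""
    if k > i + 2 * p:
        raise ValueError("kernel is larger than the padded input")
    return (i - k) + 2 * p + 1


print(conv_out_unit_stride(5, 3))       # no padding: 3
print(conv_out_unit_stride(5, 4, p=2))  # zero padding of 2: 6
```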

Half (same) padding

For any $i$ and odd $k$ $(k = 2n + 1, n \in \mathbb{N})$, with $s = 1$ and $p = \lfloor k/2 \rfloor = n$:

\[o = i + 2 \left\lfloor\frac{k}{2}\right\rfloor - (k - 1) = i + 2n - 2n = i\]
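A small check of the half padding rule, assuming an odd kernel and unit stride. The helper name `same_padding` is hypothetical; the loop just confirms that the output size equals the input size for a few values.

```python
def same_padding(k: int) -> int:
    """Padding that preserves the input size for odd k and stride 1: p = floor(k / 2)."""
    if k % 2 == 0:
        raise ValueError("half (same) padding as defined here needs an odd kernel size")
    return k // 2


for i in (5, 7, 32):
    for k in (1, 3, 5, 7):
        p = same_padding(k)
        o = (i - k) + 2 * p + 1  # unit-stride formula with zero padding
        assert o == i
        print(f"i={i}, k={k}, p={p} -> o={o}")
```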

Full padding

For any $i$ and $k$, with $p = k - 1$ and $s = 1$:

\(o = i + 2(k-1) - (k-1) = i + (k-1)\)
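Full padding is the case where the output grows: every partial overlap between kernel and input is counted. A minimal sketch, with `full_padding_out` as an illustrative name:

```python
def full_padding_out(i: int, k: int) -> int:
    """Output size with full padding (p = k - 1) and stride 1."""
    p = k - 1
    return (i - k) + 2 * p + 1  # simplifies to i + (k - 1)


for i, k in [(5, 3), (7, 3), (28, 5)]:
    assert full_padding_out(i, k) == i + k - 1

print(full_padding_out(5, 3))  # 7
```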

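Zero padding, non-unit strides

For any $i, k, p, s$, where $\lfloor \cdot \rfloor$ denotes the floor function:

\(o= \left\lfloor\frac{i+2p-k}{s}\right\rfloor+1\)

Setting $p = 0$ gives the no zero padding case, \(o= \left\lfloor\frac{i-k}{s}\right\rfloor+1\).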
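The strided formula maps directly onto integer floor division. The sketch below uses a function name of my own choosing (`conv_out`) and also shows a side effect of the floor: two different input sizes can produce the same output size.

```python
def conv_out(i: int, k: int, s: int = 1, p: int = 0) -> int:
    """Output size along one axis: o = floor((i + 2p - k) / s) + 1."""
    if s < 1:
        raise ValueError("stride must be at least 1")
    if k > i + 2 * p:
        raise ValueError("kernel is larger than the padded input")
    return (i + 2 * p - k) // s + 1


print(conv_out(5, 3, s=2))       # no padding: 2
print(conv_out(5, 3, s=2, p=1))  # 3
print(conv_out(6, 3, s=2, p=1))  # also 3: the last row/column is never covered
```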

Pooling arithmetic

\(o= \left\lfloor\frac{i-k}{s}\right\rfloor+1\)
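One way to convince yourself of the pooling formula is to count the window positions directly. In this sketch (both function names are illustrative), a window starts at 0, s, 2s, ... and is kept only while it fits entirely inside the input.

```python
def pool_out(i: int, k: int, s: int) -> int:
    """Pooling output size along one axis: o = floor((i - k) / s) + 1."""
    return (i - k) // s + 1


def count_windows(i: int, k: int, s: int) -> int:
    """Count start positions 0, s, 2s, ... whose window [start, start + k) fits in the input."""
    return len(range(0, i - k + 1, s))


for i, k, s in [(5, 3, 1), (6, 2, 2), (7, 3, 2), (32, 2, 2)]:
    assert pool_out(i, k, s) == count_windows(i, k, s)
    print(f"i={i}, k={k}, s={s} -> o={pool_out(i, k, s)}")
```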

Dilated convolutions

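Dilated convolutions can be used to increase the receptive field cheaply. With dilation rate $d$, the effective kernel size is $\hat{k} = k + (k-1)(d-1)$, so replacing $k$ with $\hat{k}$ in the general formula gives:

\(o= \left\lfloor\frac{i+2p-k-(k-1)(d-1)}{s}\right\rfloor+1\)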
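A short sketch of the dilated case, assuming the same single-axis setup as above. The function names are hypothetical; with `d=1` the effective kernel size reduces to `k`, so the ordinary formula is recovered.

```python
def effective_kernel(k: int, d: int) -> int:
    """Effective kernel size with dilation d: k_hat = k + (k - 1)(d - 1)."""
    return k + (k - 1) * (d - 1)


def dilated_conv_out(i: int, k: int, s: int = 1, p: int = 0, d: int = 1) -> int:
    """Output size along one axis: o = floor((i + 2p - k_hat) / s) + 1."""
    return (i + 2 * p - effective_kernel(k, d)) // s + 1


print(effective_kernel(3, 2))          # 5
print(dilated_conv_out(7, 3, d=2))     # 3
print(dilated_conv_out(7, 3, d=1))     # 5, same as an undilated 3x3 kernel
```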

Convolution calculator

(will be updated to be more inclusive)

This is inspired by convnet-calculator, and a major chunk of the code comes from there.
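For readers who want to trace shapes offline, here is a minimal Python sketch that chains the formulas above through a stack of layers. It is not the calculator's code: the `Layer` tuple, `out_size`, and `trace_shapes` are names I made up for illustration, and the example network is an arbitrary conv/pool stack on a 28-pixel-wide input.

```python
from typing import Iterable, List, NamedTuple


class Layer(NamedTuple):
    """Hypothetical layer spec: kernel size k, stride s, padding p, dilation d."""
    name: str
    k: int
    s: int = 1
    p: int = 0
    d: int = 1


def out_size(i: int, layer: Layer) -> int:
    """Apply o = floor((i + 2p - k_hat) / s) + 1 with k_hat = k + (k - 1)(d - 1)."""
    k_hat = layer.k + (layer.k - 1) * (layer.d - 1)
    o = (i + 2 * layer.p - k_hat) // layer.s + 1
    if o < 1:
        raise ValueError(f"{layer.name}: kernel does not fit an input of size {i}")
    return o


def trace_shapes(i: int, layers: Iterable[Layer]) -> List[int]:
    """Return the size along one axis before the first layer and after every layer."""
    sizes = [i]
    for layer in layers:
        sizes.append(out_size(sizes[-1], layer))
    return sizes


net = [
    Layer("conv1", k=5),
    Layer("pool1", k=2, s=2),
    Layer("conv2", k=5),
    Layer("pool2", k=2, s=2),
]
print(trace_shapes(28, net))  # [28, 24, 12, 8, 4]
```

Pooling layers are described with the same tuple because the pooling formula is the convolution formula with $p = 0$ and $d = 1$.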