1. Some materials are taken from the machine learning course by Victor Kitov
If the number of input features is $N_{in}$, how do we calculate the number of output features $N_{out}$?
(c) cs231n
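The usual answer, along one spatial dimension, is $N_{out} = \lfloor (N_{in} + 2P - F) / S \rfloor + 1$. The names $F$ (filter size), $P$ (zero padding on each side) and $S$ (stride) are assumptions introduced here, not symbols from the notes, and the floor is the rounding convention most frameworks use when the division is not exact. A minimal sketch:

```python
def conv_output_size(n_in, filter_size, padding=0, stride=1):
    """Number of output positions along one spatial dimension.

    Parameter names are illustrative; they correspond to F, P and S
    in the formula floor((N_in + 2P - F) / S) + 1.
    """
    span = n_in + 2 * padding - filter_size
    if span < 0:
        raise ValueError("filter is larger than the padded input")
    return span // stride + 1


# 5x5 input, 3x3 filter, no padding, stride 1 -> 3 positions per dimension
print(conv_output_size(5, 3))                 # 3
# padding of 1 keeps the size unchanged for a 3x3 filter
print(conv_output_size(5, 3, padding=1))      # 5
# 227-wide input, 11-wide filter, stride 4
print(conv_output_size(227, 11, stride=4))    # 55
```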
Let's simplify things a bit: consider only a single-channel $5 \times 5$ input and a $3 \times 3$ convolutional filter.
No strides, no padding, no bias weight.
$$ \begin{align}
\left( \begin{array}{ccccc} 0 & 1 & 2 & 1 & 0 \\ 4 & 1 & 0 & 1 & 0 \\ 2 & 0 & 1 & 1 & 1 \\ 1 & 2 & 3 & 1 & 0 \\ 0 & 4 & 3 & 2 & 0 \\ \end{array} \right) &\quad * & \left( \begin{array}{ccc} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 2 & 1 & 0 \\ \end{array} \right) & \quad = & \left( \begin{array}{ccc} 9 & 5 & 4 \\ 8 & 8 & 10 \\ 8 & 15 & 12 \\ \end{array} \right) \\
\mathbf{X} \qquad \qquad &\quad * & \mathbf{W} \qquad & \quad = & \mathbf{I} \qquad
\end{align} $$

Here $\mathbf{X}$ denotes the input, $\mathbf{W}$ the filter, and $\mathbf{I}$ the result of the convolution.

In terms of indices (filter indexing starts from the "center"):
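The numbers in this example can be reproduced with a few lines of NumPy. This is a minimal sketch, not an efficient implementation; the helper name `correlate_valid` is made up for illustration. Note that the filter is slid over the input without being flipped (cross-correlation), which is what the values in $\mathbf{I}$ correspond to.

```python
import numpy as np

X = np.array([[0, 1, 2, 1, 0],
              [4, 1, 0, 1, 0],
              [2, 0, 1, 1, 1],
              [1, 2, 3, 1, 0],
              [0, 4, 3, 2, 0]])

W = np.array([[0, 1, 0],
              [1, 0, 1],
              [2, 1, 0]])


def correlate_valid(x, w):
    """Slide w over x with stride 1 and no padding (hypothetical helper)."""
    kh, kw = w.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # elementwise product of the filter with the current window
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out


I = correlate_valid(X, W)
print(I)  # the 3x3 matrix I from the equation: rows (9, 5, 4), (8, 8, 10), (8, 15, 12)
```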
$$ I_{i,j} = \sum\limits_{-1 \leq a,b \leq 1} W_{a,b}X_{i+a, j+b} $$

Usually some activation function $f(\cdot)$ is then applied:

$$ O_{i,j} = f(I_{i,j}) $$

Similarly to the backprop algorithm in the previous lecture:
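As a check of the index formula on the example, assume the rows and columns of $\mathbf{X}$ are numbered from 0 (a convention the notes do not fix). The top-left output entry is then centered at $X_{1,1}$, and keeping only the nonzero filter weights gives:

$$ I_{1,1} = W_{-1,0}X_{0,1} + W_{0,-1}X_{1,0} + W_{0,1}X_{1,2} + W_{1,-1}X_{2,0} + W_{1,0}X_{2,1} = 1 \cdot 1 + 1 \cdot 4 + 1 \cdot 0 + 2 \cdot 2 + 1 \cdot 0 = 9 $$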
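A sketch of that step, assuming a scalar loss $L$ and the shorthand $\delta_{i,j}$ (neither symbol is defined above) for the error arriving at the pre-activation $I_{i,j}$:

$$ \delta_{i,j} = \frac{\partial L}{\partial I_{i,j}} = \frac{\partial L}{\partial O_{i,j}} f'(I_{i,j}) $$

Differentiating $I_{i,j} = \sum W_{a,b}X_{i+a, j+b}$ with the chain rule gives

$$ \frac{\partial L}{\partial W_{a,b}} = \sum\limits_{i,j} \delta_{i,j} X_{i+a, j+b}, \qquad \frac{\partial L}{\partial X_{i,j}} = \sum\limits_{-1 \leq a,b \leq 1} \delta_{i-a, j-b} W_{a,b} $$

where terms whose index falls outside the output are taken to be zero. Both sums have the same sliding-window form as the forward pass.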
OMG, it is convolution too!
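One way to convince yourself is a numerical check. The sketch below assumes an identity activation and a made-up loss $L = \frac{1}{2}\sum_{i,j} I_{i,j}^2$ (chosen only so that $\partial L / \partial I_{i,j} = I_{i,j}$ is easy to write down), and compares the gradient with respect to the filter, computed as a sliding-window product of the input with $\partial L / \partial I$, against finite differences. The helper `correlate_valid` is hypothetical.

```python
import numpy as np


def correlate_valid(x, w):
    """Slide w over x with stride 1 and no padding (hypothetical helper)."""
    kh, kw = w.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 5))
W = rng.normal(size=(3, 3))


def loss(w):
    # Assumed loss for the check only: L = 0.5 * sum(I^2)
    return 0.5 * np.sum(correlate_valid(X, w) ** 2)


# dL/dI for the assumed loss (3x3, same shape as the output)
delta = correlate_valid(X, W)

# Analytic gradient: dL/dW[a, b] = sum_{i,j} delta[i, j] * X[i + a, j + b],
# i.e. the input correlated with delta, which is again a 3x3 array
grad_W = correlate_valid(X, delta)

# Central finite differences, one filter weight at a time
eps = 1e-6
grad_num = np.zeros_like(W)
for a in range(W.shape[0]):
    for b in range(W.shape[1]):
        step = np.zeros_like(W)
        step[a, b] = eps
        grad_num[a, b] = (loss(W + step) - loss(W - step)) / (2 * eps)

max_abs_diff = np.max(np.abs(grad_W - grad_num))
print(max_abs_diff)  # should be tiny compared with the gradient entries
```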
What about the derivative here?