
Summary Lecture 13

Benedikt Willecke
IES19372

Convolutional Neural Networks (CNNs) are most famous for their use in image recognition. A CNN consists of multiple layer types: convolutional layers, non-linear activation layers (e.g. ReLU), pooling layers and a fully connected layer. Here, the trend is towards deep networks. The convolutional layer is a window (the filter) that slides step by step over the image and outputs one number per position, so a 3x3 window produces 1 output out of 9 inputs. This is done by calculating the dot product between the filter and the image patch underneath it. At different depths of the network, and with different convolution methods, we can extract features on low, medium and high level. The trend goes toward small filter sizes but deep networks. If the filter and stride do not fit the image size exactly, we can use a different stride or add padding, as sketched below.
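A minimal sketch of this sliding-window computation in plain NumPy (the function name conv2d, the image sizes and the averaging filter are made-up examples, not from the lecture): each output value is the dot product of the filter with the patch it currently covers, and stride and zero-padding control how the window moves and how large the output becomes.

```python
import numpy as np

def conv2d(image, kernel, stride=1, padding=0):
    """Slide the kernel over the image; each output value is the
    dot product of the kernel with the patch under it."""
    if padding > 0:
        image = np.pad(image, padding)               # zero-padding around the border
    kh, kw = kernel.shape
    ih, iw = image.shape
    oh = (ih - kh) // stride + 1                     # output size: (N - F + 2P)/S + 1
    ow = (iw - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)       # dot product -> one number
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0                       # 3x3 averaging filter
print(conv2d(image, kernel).shape)                   # (3, 3): 3x3 window, stride 1, no padding
print(conv2d(image, kernel, stride=2, padding=1).shape)  # (3, 3): larger stride, padded border
```

In a real network the same filter spans all input channels and many such filters run in parallel, but the core dot-product step stays the same.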
In the equations used for the feed-forward and backpropagation passes, the weight is the filter kernel and there is a bias as well, so the calculations are basically the same as for an ordinary layer. Sometimes, however, we reshape the matrices and vectors so that the whole convolution becomes a single matrix multiplication, which makes the calculations more efficient.
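A small sketch of that reshaping trick (often called im2col; the shapes and names below are illustrative assumptions, not the lecture's notation): the image patches are laid out as columns, the filter becomes a row of weights, and the convolution plus bias turns into y = Wx + b, one matrix product.

```python
import numpy as np

def conv2d_im2col(image, kernel, bias=0.0):
    """The same 3x3 convolution, written as one matrix multiplication."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    # Lay out every patch as one column of a matrix (the "im2col" step).
    cols = np.stack([
        image[i:i+kh, j:j+kw].ravel()
        for i in range(oh) for j in range(ow)
    ], axis=1)                                       # shape (kh*kw, oh*ow)
    w = kernel.ravel()[None, :]                      # filter as a 1 x (kh*kw) weight row
    out = w @ cols + bias                            # y = Wx + b in one multiplication
    return out.reshape(oh, ow)

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0
print(conv2d_im2col(image, kernel, bias=0.5).shape)  # (3, 3), same values as the loop version plus the bias
```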
The fully connected layer is the idea that every node in the previous layer is connected to every node in the next one, with a non-linear activation function applied in between. It can actually be implemented as a convolutional layer whose filter covers its whole input, which makes the process very efficient: the network can then be evaluated on a larger input in a single forward pass.
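A rough sketch of that equivalence, assuming a 7x7 single-channel feature map and 10 output nodes (these sizes and variable names are illustrative, not from the lecture): a fully connected layer with weight matrix W computes the same numbers as a convolution whose 10 filters each cover the whole 7x7 input.

```python
import numpy as np

rng = np.random.default_rng(0)
feat = rng.standard_normal((7, 7))          # one 7x7 input feature map
W = rng.standard_normal((10, 49))           # fully connected: 10 outputs from 49 inputs
b = rng.standard_normal(10)

# Fully connected layer: flatten the input, then y = Wx + b.
fc_out = W @ feat.ravel() + b

# Same computation as a convolution: 10 filters, each 7x7, applied at the
# single position where the filter covers the whole input.
filters = W.reshape(10, 7, 7)
conv_out = np.array([np.sum(f * feat) for f in filters]) + b

print(np.allclose(fc_out, conv_out))        # True: both layers give identical values
```

On a larger input the same filters could simply be slid over every position, which is the single-forward-pass trick mentioned above.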
The pooling layer reduces the spatial size of the data, for example by max-pooling, which keeps only the largest value in each window; this is basically downsampling. However, nowadays there is a trend towards getting rid of the pooling layer and just using a convolutional layer with a larger stride, and it has been shown that this can improve performance.
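A short sketch of 2x2 max-pooling in plain NumPy (the sizes are made-up examples; the lecture gave no code): each 2x2 block of the input is replaced by its maximum, halving height and width. The strided-convolution alternative mentioned above would instead reuse the conv2d idea with stride=2.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max-pooling with stride 2: keep only the largest value per block."""
    h, w = x.shape
    blocks = x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool_2x2(x))
# [[ 5.  7.]
#  [13. 15.]]  -> the 4x4 map is downsampled to 2x2
```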
