
CNN Building Blocks

24/08/2022 1
What is CNN?

• A convolutional neural network (CNN) is a type of artificial neural network used
in image recognition and processing that is specifically designed to process pixel
data.
• CNNs are powerful image-processing artificial intelligence (AI) systems that use
deep learning to perform both generative and descriptive tasks, often using machine
vision that includes image and video recognition, along with recommender systems
and natural language processing (NLP).
• A CNN uses a system much like a multilayer perceptron that has been designed
for reduced processing requirements. The layers of a CNN consist of an input
layer, an output layer and hidden layers that include multiple convolutional
layers, pooling layers, fully connected layers and normalization layers.

CNN Building Blocks

1. Kernel (Filter)
2. Stride
3. Padding
4. Pooling
5. Flatten

Kernel

• A kernel is a filter that is used to extract features from an image.
• The kernel is a matrix that moves over the input data, takes the dot
product with each sub-region of the input data, and produces the output
as a matrix of those dot products (the feature map).
• The kernel moves over the input data in steps given by the stride value.

Kernel

Input 5*5      Kernel 3*3     Feature Map 3*3

3 3 2 1 0      0 1 2          12 12 17
0 0 1 3 1      2 2 0          10 17 19
3 1 2 2 3      0 1 2           9  6 14
2 0 0 2 2
2 0 0 0 1

O = (i - k) + 1 = (5 - 3) + 1 = 3, where i is the input size and k is the kernel size.
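
The sliding dot product described for the kernel can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the function name `cross_correlate` is made up for this example, it assumes a square input and a square kernel, and it uses stride 1 with no padding. As in most CNN code, the kernel is applied without flipping.

```python
def cross_correlate(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map.

    Hypothetical helper for illustration; assumes square `image` and `kernel`.
    """
    i, k = len(image), len(kernel)
    out = (i - k) + 1  # output size, O = (i - k) + 1
    feature_map = []
    for r in range(out):
        row = []
        for c in range(out):
            # Dot product of the kernel with the k*k sub-region whose
            # top-left corner is at (r, c).
            total = sum(
                kernel[a][b] * image[r + a][c + b]
                for a in range(k)
                for b in range(k)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# The 5*5 input and 3*3 kernel from the example.
image = [
    [3, 3, 2, 1, 0],
    [0, 0, 1, 3, 1],
    [3, 1, 2, 2, 3],
    [2, 0, 0, 2, 2],
    [2, 0, 0, 0, 1],
]
kernel = [
    [0, 1, 2],
    [2, 2, 0],
    [0, 1, 2],
]

for row in cross_correlate(image, kernel):
    print(row)
```

Running it prints a 3*3 feature map whose first row is `[12, 12, 17]`.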
Stride

• The filter is moved across the image left to right, top to bottom, with
a one-pixel column change on the horizontal movements, then a one-pixel
row change on the vertical movements.
• The amount of movement between applications of the filter to the
input image is referred to as the stride, and it is almost always
symmetrical in height and width dimensions.
• The default stride in two dimensions is (1,1) for the height
and the width movement.

Stride

Input 5*5      Kernel 3*3     Feature map 2*2 (stride 2)

3 3 2 1 0      0 1 2          12 17
0 0 1 3 1      2 2 0           9 14
3 1 2 2 3      0 1 2
2 0 0 2 2
2 0 0 0 1

O = floor((i - k) / s) + 1 = floor((5 - 3) / 2) + 1 = 2, where s is the stride.
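
To make the effect of the stride concrete, the sketch below lists where the kernel is placed along one axis and checks that count against the output-size formula. The helper names `window_origins` and `output_size` are made up for this example. It also relies on the fact that, without padding, a stride-s feature map keeps every s-th row and column of the stride-1 feature map.

```python
def window_origins(i, k, s):
    """Top-left offsets along one axis where a size-k kernel fits in a size-i input."""
    return list(range(0, i - k + 1, s))


def output_size(i, k, s):
    """O = floor((i - k) / s) + 1."""
    return (i - k) // s + 1


# Stride-1 feature map computed for the 5*5 input and 3*3 kernel of the example.
stride1_map = [
    [12, 12, 17],
    [10, 17, 19],
    [9, 6, 14],
]

s = 2
origins = window_origins(5, 3, s)
# The number of kernel positions per axis matches the formula.
assert len(origins) == output_size(5, 3, s)

# Keep every s-th row and every s-th column of the stride-1 map.
stride2_map = [row[::s] for row in stride1_map[::s]]

print(origins)      # [0, 2]
print(stride2_map)  # [[12, 17], [9, 14]]
```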

Padding

• Padding adds the pixels that the convolutional kernel needs in order to
process the edge pixels onto the outside of the image, either by copying
the pixels at the edge of the image or by filling with zeros (zero padding).
• Padding fixes the border effect problem.

Padding
Input 5*5 with padding 1 (7*7)    Kernel 3*3    Feature Map 5*5

0 0 0 0 0 0 0                     0 1 2          6 14 17 11  3
0 3 3 2 1 0 0                     2 2 0         14 12 12 17 11
0 0 0 1 3 1 0                     0 1 2          8 10 17 19 13
0 3 1 2 2 3 0                                   11  9  6 14 12
0 2 0 0 2 2 0                                    6  4  4  6  4
0 2 0 0 0 1 0
0 0 0 0 0 0 0

O = floor((i - k + 2p) / s) + 1 = floor((5 - 3 + 2*1) / 1) + 1 = 5, where p is the padding.
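
The zero padding used in this example, together with the output-size formula, can be sketched as follows. The names `zero_pad` and `output_size` are illustrative helpers, not part of any library, and the sketch assumes the same padding p on every side.

```python
def zero_pad(image, p):
    """Return a copy of `image` with p rows and p columns of zeros on every side."""
    width = len(image[0]) + 2 * p
    padded = [[0] * width for _ in range(p)]           # top border
    for row in image:
        padded.append([0] * p + list(row) + [0] * p)   # left and right borders
    padded.extend([0] * width for _ in range(p))       # bottom border
    return padded


def output_size(i, k, s=1, p=0):
    """O = floor((i - k + 2p) / s) + 1."""
    return (i - k + 2 * p) // s + 1


image = [
    [3, 3, 2, 1, 0],
    [0, 0, 1, 3, 1],
    [3, 1, 2, 2, 3],
    [2, 0, 0, 2, 2],
    [2, 0, 0, 0, 1],
]
kernel = [
    [0, 1, 2],
    [2, 2, 0],
    [0, 1, 2],
]

padded = zero_pad(image, 1)
print(len(padded), len(padded[0]))   # 7 7
print(output_size(5, 3, s=1, p=1))   # 5

# Top-left entry of the feature map: the kernel applied to the first 3*3 window.
top_left = sum(kernel[a][b] * padded[a][b] for a in range(3) for b in range(3))
print(top_left)                      # 6
```

With p = 1 and a 3*3 kernel at stride 1, the output has the same 5*5 size as the unpadded input.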

Pooling

• Pooling is used to downsample the features detected in feature maps.
• Pooling layers provide an approach to downsampling feature maps by
summarizing the presence of features in patches of the feature map.
• Two common pooling methods are average pooling and max pooling,
which summarize the average presence of a feature and the most
activated presence of a feature, respectively.

Pooling
Input with padding (7*7)    Kernel 3*3    Feature Map 5*5

0 0 0 0 0 0 0               0 1 2          6 14 17 11  3
0 3 3 2 1 0 0               2 2 0         14 12 12 17 11
0 0 0 1 3 1 0               0 1 2          8 10 17 19 13
0 3 1 2 2 3 0                             11  9  6 14 12
0 2 0 0 2 2 0                              6  4  4  6  4
0 2 0 0 0 1 0
0 0 0 0 0 0 0

Max Pooling 2*2    Avg Pooling 2*2

14 17              11.5  14.25
11 19               9.5  14

Both results use 2*2 windows moved in steps of 2 over the feature map; the
last row and column do not fill a complete window and are dropped.
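
Max and average pooling over the 5*5 feature map can be sketched with one small function. This is an illustrative sketch: `pool2d` and `mean` are hypothetical helper names, and it assumes 2*2 windows with stride 2 where windows that do not fit completely inside the feature map are dropped.

```python
def pool2d(feature_map, size, stride, reduce_fn):
    """Summarize each size*size patch of `feature_map` with `reduce_fn`.

    Hypothetical helper; patches that would run past the edge are skipped.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            patch = [
                feature_map[r + a][c + b]
                for a in range(size)
                for b in range(size)
            ]
            pooled_row.append(reduce_fn(patch))
        pooled.append(pooled_row)
    return pooled


def mean(values):
    return sum(values) / len(values)


feature_map = [
    [6, 14, 17, 11, 3],
    [14, 12, 12, 17, 11],
    [8, 10, 17, 19, 13],
    [11, 9, 6, 14, 12],
    [6, 4, 4, 6, 4],
]

max_pooled = pool2d(feature_map, size=2, stride=2, reduce_fn=max)
avg_pooled = pool2d(feature_map, size=2, stride=2, reduce_fn=mean)

print(max_pooled)  # [[14, 17], [11, 19]]
print(avg_pooled)  # [[11.5, 14.25], [9.5, 14.0]]
```

Passing the reduction as an argument keeps the window logic in one place, so the two pooling methods differ only in how each patch is summarized.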

Flattening

• Once the pooled feature map is obtained, the next step is to flatten it.
• Flattening transforms the entire pooled feature map matrix into a
single column, which is then fed to the neural network for processing.
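
Flattening can be sketched as reading the pooled feature map row by row into one list. The row-by-row order is an assumption made for this example, and `flatten` is an illustrative name rather than a library function.

```python
def flatten(feature_map):
    """Concatenate the rows of a 2D feature map into a single 1D list."""
    return [value for row in feature_map for value in row]


# The 2*2 max-pooled feature map from the pooling example.
pooled = [
    [14, 17],
    [11, 19],
]

vector = flatten(pooled)
print(vector)  # [14, 17, 11, 19]
```

The resulting vector has one entry per element of the pooled map, which is the form a fully connected layer takes as input.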

Thank you……..
