
Convolutional Neural Network

CP -6
Machine Learning
M S Prasad
INPUT -> [[Conv + ReLU]*N -> Pool]*M -> [FC + ReLU]*K -> output
[Figure: convolution applied to a data set; the example layer has 54 weight parameters and 2 biases]
Stride: the number of pixels the filter moves to the right at each step until it reaches the end of the
image, and the number of pixels it moves down before reaching the bottom of the image.

[Figure: convolution with stride 1 vs. stride 2]

Padding: Sometimes we want to take advantage of all the pixels in the image, so the padding indicates how
many columns and rows of zeros we add around the border of the image. Also, if you want to apply a
'full' convolution you need to add a padding of (wf - 1), where wf is the width of the filter.
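A minimal NumPy sketch (not from the lecture; the function conv2d and the 5 x 5 example input are assumptions) showing how stride and zero padding change the output size of a 2-D convolution:

import numpy as np

def conv2d(image, kernel, stride=1, padding=0):
    # zero-pad the image on all four sides
    if padding > 0:
        image = np.pad(image, padding, mode='constant')
    n, f = image.shape[0], kernel.shape[0]
    out = (n - f) // stride + 1            # output size: (n + 2p - f)/s + 1
    result = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            patch = image[i*stride:i*stride+f, j*stride:j*stride+f]
            result[i, j] = np.sum(patch * kernel)
    return result

image = np.arange(25, dtype=float).reshape(5, 5)         # 5 x 5 input
kernel = np.ones((3, 3)) / 9.0                           # 3 x 3 averaging filter
print(conv2d(image, kernel, stride=1, padding=0).shape)  # (3, 3)
print(conv2d(image, kernel, stride=2, padding=0).shape)  # (2, 2)
print(conv2d(image, kernel, stride=1, padding=1).shape)  # (5, 5) 'same' output size
print(conv2d(image, kernel, stride=1, padding=2).shape)  # (7, 7) 'full', padding = wf - 1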
BIAS
The relation between the bias and the result of a convolution: the bias adds a specific value to the result in
every channel, so for the error received from an upper layer, every bias value needs to change according to the
error of its related channel.
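As a concrete illustration (an assumed NumPy sketch, not code from the lecture), the gradient for each channel's bias is simply the upstream error summed over that channel's spatial positions:

import numpy as np

# upstream error (dL/dOutput) for a layer with 2 channels of size 4 x 4
upstream_error = np.random.randn(2, 4, 4)

# each channel has one scalar bias that was added to every position,
# so its gradient is the sum of the error over that channel
bias_grad = upstream_error.sum(axis=(1, 2))   # shape (2,), one value per channel
print(bias_grad)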
Pooling layer
Non-linear down-sampling of the volume using small filters that sample, for example, the maximum or
average value in a rectangular area of the output from the previous layer. Pooling reduces the spatial
size, which reduces the number of parameters and computations, and additionally helps to avoid
overfitting, i.e. high training accuracy but low validation accuracy.

Pooling is a down-sampling operation that reduces the dimensionality of the feature map. The
rectified feature map then passes through a pooling layer to generate a pooled feature map,
which retains the salient features detected by the convolution filters, such as edges, corners,
body, feathers, eyes, and beak.
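To illustrate the down-sampling described above, here is a small NumPy sketch (illustrative only) of 2 x 2 max pooling with stride 2 applied to a rectified feature map:

import numpy as np

def max_pool_2x2(feature_map):
    # assumes the height and width are even
    h, w = feature_map.shape
    return feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 4],
                        [5, 6, 1, 2],
                        [1, 2, 0, 1],
                        [3, 4, 6, 2]], dtype=float)
print(max_pool_2x2(feature_map))
# [[6. 4.]
#  [4. 6.]]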
Normalisation layer Different kinds of normalisation layers have been proposed
to normalise the data, but have not proven useful in practice and have therefore
not gained any solid ground.

Fully connected layer Neurons in this layer are fully connected to all activations
in the previous layers, as in regular neural networks. These are usually at the
end of the network, e.g. outputting the class probabilities.

Loss layer Often the last layer in the network; it computes the objective of the
task, such as classification, e.g. by applying the softmax function.
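For reference, a minimal NumPy sketch of the softmax function mentioned above (illustrative only); it converts the final layer's scores into class probabilities that sum to one:

import numpy as np

def softmax(scores):
    # subtract the max for numerical stability; the result is unchanged mathematically
    exps = np.exp(scores - np.max(scores))
    return exps / exps.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))   # approximately [0.659 0.242 0.099]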

FC -> Softmax / Logistic -> output


Batch Normalization

Input: values of x over a mini-batch B = {x1, x2, ..., xm}; learnable parameters gamma, beta

mu_B = (1/m) * sum_i xi (mini-batch mean)
sigma_B^2 = (1/m) * sum_i (xi - mu_B)^2 (mini-batch variance)
x_hat_i = (xi - mu_B) / sqrt(sigma_B^2 + eps) (normalise)

Output: yi = gamma * x_hat_i + beta = BN_gamma,beta(xi) (scale and shift)
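A small NumPy sketch of this transform (illustrative; it assumes gamma = 1 and beta = 0 rather than learned values):

import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    mu = x.mean(axis=0)                     # mini-batch mean
    var = x.var(axis=0)                     # mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # normalise
    return gamma * x_hat + beta             # scale and shift

batch = np.random.randn(8, 4) * 3.0 + 5.0   # mini-batch of 8 samples, 4 features
y = batch_norm(batch)
print(y.mean(axis=0).round(3), y.std(axis=0).round(3))  # roughly 0 and 1 per feature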
SAMPLE CNN
A key property of the convolution layer is that all spatial locations share the same convolution kernel, which
greatly reduces the number of parameters needed for a convolution layer.

In a deep neural network setup, convolution also encourages parameter sharing.

The combination of convolution kernels with deep, hierarchical structures is very effective in
learning good representations (features) from images for visual recognition tasks.

A key concept in CNNs (or, more generally, deep learning) is distributed representation. For example,
suppose our task is to recognize N different types of objects and a CNN extracts M features from any
input image. Most likely, any one of the M features is useful for recognizing all N object
categories, and recognizing one object type requires the joint effort of all M features.
Why CNN

Say our initial image is 224 x 224 x 3. If we proceed without convolution, we need 224 x 224 x 3 =
150,528 neurons in the input layer.
After applying convolution, the input tensor is reduced, for example to 1 x 1 x 1000, which means we only need
1000 neurons in the first layer of the feedforward network.
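A quick sanity check of the arithmetic above (illustrative):

# neurons needed to feed the raw 224 x 224 x 3 image into a dense layer
print(224 * 224 * 3)     # 150528

# neurons needed after the convolutional stack reduces the tensor to 1 x 1 x 1000
print(1 * 1 * 1000)      # 1000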
Data Size Consideration
Conv input: n x n; filter: f x f
Output: (n - f + 1) x (n - f + 1), which reduces the spatial size.

Padding: p
Output: (n + 2p - f + 1) x (n + 2p - f + 1)
'Same' padding makes the output size equal to the input, i.e. n + 2p - f + 1 = n, so p = (f - 1)/2.

Stride: s
Output: [(n + 2p - f)/s + 1] x [(n + 2p - f)/s + 1] (rounded down)

Number of channels
Input n x n x nc with padding p, stride s, and nf filters of size f x f x nc; output [(n + 2p - f)/s + 1] x [(n + 2p - f)/s + 1] x nf.
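These formulas can be collected into a small helper function (an illustrative sketch; the name conv_output_size is not from the lecture):

def conv_output_size(n, f, p=0, s=1):
    """Spatial size of a convolution output: floor((n + 2p - f)/s) + 1."""
    return (n + 2 * p - f) // s + 1

print(conv_output_size(6, 3))             # 4 -> (n - f + 1)
print(conv_output_size(6, 3, p=1))        # 6 -> 'same' padding, p = (f - 1)/2
print(conv_output_size(7, 3, p=1, s=2))   # 4 -> stride 2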
Flattening. Flattening is used to convert all the resultant 2-Dimensional arrays from pooled
feature maps into a single long continuous linear vector.
In July 2012, researchers at Google exposed an advanced neural network to a
series of unlabelled, static images sliced from YouTube videos.
To their surprise, they discovered that the neural network learned a cat-detecting
neuron on its own, supporting the popular assertion that “the internet is made of
cats”.

Any questions?

Refer to the lecture notes on CNN.


import numpy as np
import pandas as pd
from keras.optimizers import SGD
from keras.datasets import cifar10
from keras.models import Sequential
from keras.utils import np_utils as utils
from keras.layers import Dropout, Dense, Flatten
from keras.layers.convolutional import Conv2D, MaxPooling2D

# load data set


(X, y), (X_test, y_test) = cifar10.load_data()

# normalize data set # convert to categories


X, X_test = X.astype('float32')/255.0, X_test.astype('float32')/255.0

y, y_test = utils.to_categorical(y, 10), utils.to_categorical(y_test, 10)


#initialize model
model = Sequential()

model.add(Conv2D(32, (3, 3), input_shape=(32, 32, 3), padding='same', activation='relu'))


# 32 filters, 3x3 kernel, ReLU activation, 'same' padding, 3 input channels

# add dropout after the first conv layer
model.add(Dropout(0.2))

# another conv layer with padding valid


model.add(Conv2D(32, (3, 3), activation='relu', padding='valid'))
# max pooling
model.add(MaxPooling2D(pool_size=(2, 2)))
# flatten data
model.add(Flatten())
model.add(Dense(512, activation='relu'))  # dense layer with 512 hidden units

# add dropout before the output dense layer


model.add(Dropout(0.3))
model.add(Dense(10, activation='softmax'))
# Compile model
model.compile(loss='categorical_crossentropy', optimizer=SGD(momentum=0.5, decay=0.0004), metrics=['accuracy'])
# We could also use the Adam optimizer, which adjusts the learning rate smoothly.

# Fit for 25 epochs


model.fit(X, y, validation_data=(X_test, y_test), epochs=25, batch_size=512)

print("Accuracy: &2.f%%" %(model.evaluate(X_test, y_test)[1]*100))

# Alternatively, the model could be compiled with the Adam optimizer (before fitting):
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])


Regularization Methods in CNNs
Regularization is a method of including extra information to solve an ill-posed problem or to prevent overfitting.
CNNs also use regularization to handle these problems. Below are the different types of regularization
techniques used by CNNs:
•Empirical
•Explicit
Different categories of empirical regularization:
• Dropout
• DropConnect
• Stochastic pooling

Different categories of explicit regularization:
• Early stopping
• Weight decay
• Number of parameters
• Max norm constraint
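A hedged Keras sketch of how a few of these techniques look in code (dropout, L2 weight decay, and early stopping); the layer sizes and hyperparameters are illustrative assumptions, not values from the lecture:

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.regularizers import l2
from keras.callbacks import EarlyStopping

model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(100,),
                kernel_regularizer=l2(1e-4)))   # explicit regularization: weight decay
model.add(Dropout(0.5))                         # empirical regularization: dropout
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# explicit regularization: stop when the validation loss stops improving
early_stop = EarlyStopping(monitor='val_loss', patience=3)
# model.fit(X, y, validation_split=0.2, epochs=100, callbacks=[early_stop])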
Residual Networks (ResNet)
ResNet is a Convolutional Neural Network (CNN) architecture made up of a series of residual blocks
(ResBlocks).
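A minimal sketch of one residual block using the Keras functional API (illustrative; real ResNets also use batch normalization and specific filter counts):

from keras.layers import Input, Conv2D, Activation, Add
from keras.models import Model

def res_block(x, filters=64):
    shortcut = x                                                   # identity shortcut
    y = Conv2D(filters, (3, 3), padding='same', activation='relu')(x)
    y = Conv2D(filters, (3, 3), padding='same')(y)
    y = Add()([shortcut, y])                                       # add the input back
    return Activation('relu')(y)

inputs = Input(shape=(32, 32, 64))
outputs = res_block(inputs)
model = Model(inputs, outputs)
model.summary()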

U-Nets
A U-Net is a convolutional neural network architecture that was developed for biomedical image segmentation.
U-Nets have been found to be very effective for tasks where the output is of similar size to the input and needs
that amount of spatial resolution. This makes them very good for creating segmentation masks and for
image processing/generation tasks such as super-resolution.
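A very small sketch of the U-Net idea (illustrative only, far shallower than a real U-Net): the feature map from the down-sampling path is concatenated with the up-sampled feature map so that spatial detail is preserved for the output mask.

from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, concatenate
from keras.models import Model

inputs = Input(shape=(64, 64, 1))
# down-sampling path
c1 = Conv2D(16, (3, 3), padding='same', activation='relu')(inputs)
p1 = MaxPooling2D((2, 2))(c1)                 # 32 x 32
# bottleneck
c2 = Conv2D(32, (3, 3), padding='same', activation='relu')(p1)
# up-sampling path with a skip connection from c1
u1 = UpSampling2D((2, 2))(c2)                 # back to 64 x 64
u1 = concatenate([u1, c1])                    # skip connection preserves spatial detail
outputs = Conv2D(1, (1, 1), activation='sigmoid')(u1)   # segmentation mask

model = Model(inputs, outputs)
model.summary()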
PRACTICAL: Step by Step Guide

Step 1: Choose a Dataset
