CONVOLUTIONAL NEURAL NETWORK
• A Convolutional Neural Network, also known as CNN, is a class of neural networks that specializes in processing
data that has a grid-like topology, such as an image.
• The moment we perceive an image, our brain analyses a massive amount of data. Each neuron has its own receptive
field and is coupled to other neurons in such a way that the full visual field is covered.
• Similarly, a CNN has a number of layers, designed so that simpler patterns (lines, curves, etc.) are detected first,
followed by more complicated patterns (faces, objects, etc.).
CONVOLUTION LAYER
• Convolution is performed between two matrices: the kernel, a small matrix of learnable parameters (weights), and the
image matrix given as input. The kernel is dimensionally smaller than the image. It slides across the image matrix,
and at each position the element-wise products are summed (a dot product) to produce one element of the resultant
output matrix. Convolution provides three key properties:
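The sliding dot-product described above can be sketched in NumPy as a minimal "valid" cross-correlation (what CNN frameworks actually compute under the name convolution); the function name and the example sizes are illustrative, not from the original:

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Valid cross-correlation: slide the kernel over the image and
    take the element-wise product-and-sum at each position."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh = (ih - kh) // stride + 1
    ow = (iw - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i * stride:i * stride + kh,
                          j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)  # dot product of the patch
    return out

image = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 "image"
kernel = np.ones((2, 2))                          # 2x2 summing kernel
result = conv2d(image, kernel)                    # 3x3 output matrix
```

Note that the 4x4 input shrinks to a 3x3 output, matching the output-size formula given later for F = 2, P = 0, S = 1.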
1. Sparse interaction
2. Parameter sharing
3. Equivariant representation
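A quick calculation shows why parameter sharing matters; the layer sizes here are arbitrary examples, not from the original:

```python
# Arbitrary example sizes: a 32x32 single-channel image, 3x3 kernel.
H = W = 32
K = 3

# A fully connected layer mapping the image to a same-sized output
# needs one weight per input-output pixel pair.
dense_params = (H * W) * (H * W)   # over a million weights

# A convolutional layer reuses the same KxK kernel at every position.
conv_params = K * K                # just 9 shared weights

print(dense_params, conv_params)
```

Sparse interaction shows up the same way: each output value depends only on the K*K input pixels under the kernel, not on the whole image.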
• In this model, the output size O of each convolutional layer can be formulated as

O = (I − F + 2P) / S + 1

where I, F, P, and S denote the input size, kernel size, padding size, and stride size, respectively.
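The output-size formula can be wrapped in a small helper as a sketch (the function name is illustrative; integer division assumes the chosen sizes divide evenly):

```python
def conv_output_size(I, F, P=0, S=1):
    """O = (I - F + 2P) / S + 1 for input size I, kernel size F,
    padding P, and stride S."""
    return (I - F + 2 * P) // S + 1

conv_output_size(32, 5, P=2, S=1)    # padding of 2 preserves the 32x32 size
conv_output_size(227, 11, P=0, S=4)  # a large stride shrinks the map quickly
```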
POOLING LAYER
• The pooling layer reduces the spatial size of the representation by deriving a summary statistic of the nearby outputs, which
decreases computation. There are several pooling functions, such as the maximum, the average of a rectangular neighbourhood, the
L2 norm of a rectangular neighbourhood, and a weighted average based on the distance from the central pixel.
Formula with Padding

Wout = (W − F + 2P) / S + 1

This will yield an output volume of size Wout x Wout x D.
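The most common of the summary statistics above, max pooling, can be sketched in NumPy (the function name and example sizes are illustrative):

```python
import numpy as np

def max_pool2d(x, size=2, stride=2):
    """Replace each size x size window with its maximum value."""
    h, w = x.shape
    oh = (h - size) // stride + 1
    ow = (w - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = x[i * stride:i * stride + size,
                          j * stride:j * stride + size].max()
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool2d(x)  # 4x4 input shrinks to a 2x2 summary
```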
ACTIVATION FUNCTIONS
• Rectified Linear Units (ReLUs) are adopted as the activation function in the convolutional layers and max-pooling
layers to avoid gradient explosion and ensure faster convergence speed during the back-propagation operation, which
can be formulated as

ReLU(x) = max(0, x)

• Softmax is used as the activation function in the output layer, the input of which is the matrix output from the fully
connected layers. The formulation is given as

softmax(z_i) = exp(z_i) / Σ_j exp(z_j)
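Both activation functions are short in NumPy; this sketch also includes the standard max-subtraction trick for numerical stability, which leaves the softmax result unchanged:

```python
import numpy as np

def relu(x):
    """ReLU(x) = max(0, x), applied element-wise."""
    return np.maximum(0, x)

def softmax(z):
    """softmax(z_i) = exp(z_i) / sum_j exp(z_j)."""
    e = np.exp(z - np.max(z))  # subtract max to avoid overflow
    return e / e.sum()

relu(np.array([-2.0, 0.0, 3.0]))   # negatives clamp to zero
softmax(np.array([1.0, 1.0, 1.0])) # equal logits give equal probabilities
```

Because softmax outputs sum to 1, the output layer can be read directly as class probabilities.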