
BLDEA’s V. P. Dr. P. G. Halakatti College of Engineering and Technology, Vijayapur.

Department of Computer Science and Engineering

AIML Presentation On
“Convolutional Neural Network”

Presented By:
Anjana Hiroli
Aishwarya Hirolli
INTRODUCTION
• A CNN (convolutional neural network) is a neural network that is typically trained with supervised
learning. This means that the network requires a set of data that is already labelled with the
required classes.

• CNNs are particularly useful for finding patterns in images to recognize objects, classes, and
categories.

• It is one of the various types of artificial neural networks which are used for different applications
and data types. A CNN is a kind of network architecture for deep learning algorithms and is
specifically used for image recognition and tasks that involve the processing of pixel data.

• Using a convolutional neural network with transfer learning on biomedical images may improve the
performance of the model and could achieve better accuracy.
WHY YOU HAVE CHOSEN THIS ALGORITHM

• CNNs have excelled on large-scale image datasets such as ImageNet, showcasing their ability
to handle diverse and complex visual information. This success has contributed to the
widespread adoption of CNNs in the field of computer vision.

• Finding good internal representations of image objects and features has been the main goal
since the beginning of computer vision. Therefore, many tools have been invented to deal
with images. Many of these are based on a mathematical operation called convolution.

• CNNs have demonstrated remarkable success in image-related tasks, including image
classification, object detection, and image segmentation. Their architecture is well-suited for
extracting meaningful features from images, leading to state-of-the-art performance in
various benchmarks.
CNN Architecture
A CNN consists of the following layers:
• Input layer
• Convolutional layer
• Pooling layer
• Fully connected (FC) layer
Convolutional Layer:
• The primary purpose of this layer is to extract features from the input image.
• Computers read images as pixels, expressed as a matrix (N x N x 3): height by width by depth.
• A filter is used to detect the presence of specific features or patterns in the original (input)
image. It is usually expressed as a matrix (M x M x 3), with smaller spatial dimensions than the input
but the same depth.
• This filter is convolved (slid) across the width and height of the input, and a dot product is
computed at each position to give an activation map.
• Stride: the number of pixels by which we slide the filter matrix over the input image matrix. With a
stride of one, the filter is moved across the image left to right, top to bottom, with a one-pixel
column change on the horizontal movements, then a one-pixel row change on the vertical movements.
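The sliding dot product described above can be sketched in plain Python. This is a minimal illustration and not an optimized implementation: the function name `convolve`, the toy image, and the edge-detecting filter are all assumptions made for the example, and no padding is applied. As in most CNN libraries, the filter is not flipped before the dot product.

```python
def convolve(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the activation map.

    image:  N x N x D nested list (height, width, depth)
    kernel: M x M x D nested list (same depth as the image)
    """
    n, m, depth = len(image), len(kernel), len(kernel[0][0])
    out_size = (n - m) // stride + 1
    activation_map = []
    for row in range(out_size):
        out_row = []
        for col in range(out_size):
            top, left = row * stride, col * stride
            # Dot product between the filter and the patch it currently covers.
            total = 0
            for i in range(m):
                for j in range(m):
                    for c in range(depth):
                        total += image[top + i][left + j][c] * kernel[i][j][c]
            out_row.append(total)
        activation_map.append(out_row)
    return activation_map


# Hypothetical 4x4 image with depth 1: dark left half (0), bright right half (1).
image = [[[0], [0], [1], [1]] for _ in range(4)]
# Hypothetical 2x2 filter that responds to a dark-to-bright vertical edge.
kernel = [[[-1], [1]], [[-1], [1]]]

print(convolve(image, kernel, stride=1))  # 3x3 activation map
print(convolve(image, kernel, stride=2))  # 2x2 activation map
```

With a stride of 1 the activation map peaks where the filter sits on the edge; with a stride of 2 the filter skips over that position, which shows how a larger stride shrinks the output and can miss fine detail.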
Zero-padding:
• Padding refers to the process of adding extra pixels to the input image before applying convolution.
• The number of pixels needed for the convolutional kernel to process the edge pixels is added around
the outside of the image. With zero-padding these extra pixels are set to zero; an alternative is to
copy the pixels from the edge of the image.
• If we have an input of size W x W x D and D_out kernels with a spatial size of F, stride S,
and amount of padding P, then the output volume has size W_out x W_out x D_out, where
$$W_{out} = \frac{W - F + 2P}{S} + 1$$
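As a quick check of the arithmetic, the sketch below zero-pads a small single-channel image and computes the output width from W, F, S, and P using the standard relation (W - F + 2P) / S + 1. The helper names `zero_pad` and `conv_output_size` are illustrative assumptions, not part of any library.

```python
def zero_pad(image, p):
    """Return a copy of a 2D image (list of rows) with `p` zeros added on every side."""
    width = len(image[0]) + 2 * p
    padded = [[0] * width for _ in range(p)]          # top border
    for row in image:
        padded.append([0] * p + list(row) + [0] * p)  # left and right borders
    padded.extend([0] * width for _ in range(p))      # bottom border
    return padded


def conv_output_size(w, f, s=1, p=0):
    """Output width for input width w, filter size f, stride s, padding p."""
    span = w - f + 2 * p
    if span < 0 or span % s != 0:
        raise ValueError("filter, stride and padding do not tile the input evenly")
    return span // s + 1


print(zero_pad([[1, 2], [3, 4]], 1))
print(conv_output_size(32, 5))            # no padding: output shrinks to 28
print(conv_output_size(32, 5, s=1, p=2))  # padding of 2 keeps the width at 32
```

Choosing P = (F - 1) / 2 with a stride of 1 keeps the output the same width as the input, which is a common reason for padding.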
Pooling Layer:
• Its main function is to reduce the size of the volume, which makes computation faster and reduces
memory use.
• Two common types of pooling layers are max pooling and average pooling.
• Max pooling selects the maximum element from the region of the feature map covered by the
filter. Thus, the output of a max-pooling layer is a feature map containing the most
prominent features of the previous feature map.
• Average pooling computes the average of the elements in the region of the feature map covered
by the filter. Thus, while max pooling gives the most prominent feature in a particular patch of the
feature map, average pooling gives the average of the features present in a patch.
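The difference between the two pooling types is easy to see on a small example. The sketch below assumes a square, single-channel feature map and a 2x2 window with a stride of 2; the function name `pool` and the sample values are made up for illustration.

```python
def pool(feature_map, size=2, stride=2, mode="max"):
    """Apply max or average pooling to a square 2D feature map (list of rows)."""
    if mode == "max":
        reduce = max
    else:
        reduce = lambda values: sum(values) / len(values)
    out_size = (len(feature_map) - size) // stride + 1
    pooled = []
    for r in range(out_size):
        row = []
        for c in range(out_size):
            # Collect the elements covered by the pooling window.
            window = [
                feature_map[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            ]
            row.append(reduce(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 4, 3, 8],
]

print(pool(feature_map, mode="max"))      # [[6, 4], [7, 9]]
print(pool(feature_map, mode="average"))  # [[3.75, 2.25], [3.5, 5.0]]
```

Both calls reduce the 4x4 map to 2x2, so the next layer has a quarter as many values to process.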

• The resulting feature maps are then passed to a fully connected layer, which performs the final
classification or regression task.
Flattening: The resulting feature maps are flattened into a one-dimensional vector after the
convolution and pooling layers so they can be passed into a fully connected layer for
classification or regression.

Fully connected layer


• The fully connected layer is used for classifying the input image into a label. It consists of neurons that are
each connected to every neuron in the previous layer, and the output of each neuron is passed through an
activation function such as the Rectified Linear Unit (ReLU) or sigmoid.
• The final decision is made based on which neuron has the highest activation.
• This layer connects the information extracted in the previous steps (i.e., the convolution and pooling layers) to
the output layer and eventually classifies the input into the desired label.
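Flattening and the fully connected layer can be sketched together as follows. This is a toy illustration only: the weights, biases, and labels are invented values rather than trained parameters, and the function names are assumptions made for the example.

```python
import math


def flatten(feature_maps):
    """Turn a list of 2D feature maps into one flat vector."""
    return [value for fmap in feature_maps for row in fmap for value in row]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fully_connected(inputs, weights, biases, activation=sigmoid):
    """Each output neuron takes a weighted sum of every input, plus a bias."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs


def classify(feature_maps, weights, biases, labels):
    """Pick the label whose output neuron has the highest activation."""
    activations = fully_connected(flatten(feature_maps), weights, biases)
    return labels[activations.index(max(activations))]


# Hypothetical pooled feature map and untrained, hand-picked parameters.
feature_maps = [[[6, 4], [7, 9]]]
weights = [[0.1, 0.0, 0.0, 0.1],   # neuron for "cat"
           [0.0, 0.2, -0.1, 0.0]]  # neuron for "dog"
biases = [0.0, 0.5]
labels = ["cat", "dog"]

print(flatten(feature_maps))                            # [6, 4, 7, 9]
print(classify(feature_maps, weights, biases, labels))  # cat
```

In a real network the weights and biases are learned from the labelled training data rather than chosen by hand.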
Applications:
• Image and pattern recognition
• Speech recognition
• Natural language processing
• Video analysis
THANK YOU
