
Unit No 4

1. DEEP LEARNING - Introduction


Deep learning is a type of machine learning and artificial intelligence (AI) that imitates the way
humans gain certain types of knowledge. Deep learning is an important element of data science,
which includes statistics and predictive modeling. It is extremely beneficial to data scientists who
are tasked with collecting, analyzing and interpreting large amounts of data; deep learning makes
this process faster and easier.

At its simplest, deep learning can be thought of as a way to automate predictive analytics. While
traditional machine learning algorithms learn comparatively shallow, often linear models, deep
learning algorithms are stacked in a hierarchy of increasing complexity and abstraction.

[Fig. 2: Growing uses for deep learning. Source: Semiconductor Engineering]



2. Introduction to Convolutional Neural Networks


Before diving into the Convolutional Neural Network, let us first revisit some concepts of Neural
Networks. In a regular Neural Network there are three types of layers:

Input Layer: This is the layer in which we give input to our model. The number of neurons in this
layer is equal to the total number of features in our data (the number of pixels in the case of an
image).

Hidden Layer: The input from the Input layer is then fed into the hidden layer. There can be
many hidden layers depending upon our model and data size. Each hidden layer can have a
different number of neurons, which is generally greater than the number of features. The output
of each layer is computed by matrix multiplication of the output of the previous layer with the
learnable weights of that layer, then by the addition of learnable biases, followed by an activation
function, which makes the network nonlinear.

Output Layer: The output from the hidden layer is then fed into a logistic function like sigmoid
or softmax, which converts the output for each class into a probability score for that class.

The data is fed into the model and the output from each layer is obtained; this step is called
feedforward. We then calculate the error using an error function; some common error functions
are cross-entropy, squared error, etc. After that, we backpropagate through the model by
calculating the derivatives. This step, called backpropagation, is essentially used to minimize the
loss. Here is a basic Python sketch of a neural network with random inputs and two hidden layers.
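(Since the original listing is not reproduced here, the following is a minimal NumPy sketch; the layer sizes, sigmoid activations, squared-error loss, and learning rate are illustrative assumptions.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Random toy data: 8 samples, 4 features, binary targets (illustrative sizes).
X = rng.standard_normal((8, 4))
y = rng.integers(0, 2, size=(8, 1)).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Learnable weights and biases for two hidden layers and an output layer.
W1, b1 = rng.standard_normal((4, 5)), np.zeros((1, 5))
W2, b2 = rng.standard_normal((5, 5)), np.zeros((1, 5))
W3, b3 = rng.standard_normal((5, 1)), np.zeros((1, 1))

lr = 0.1
for epoch in range(1000):
    # Feedforward: matrix-multiply by weights, add biases, apply the activation.
    h1 = sigmoid(X @ W1 + b1)
    h2 = sigmoid(h1 @ W2 + b2)
    out = sigmoid(h2 @ W3 + b3)

    # Backpropagation: derivatives of the squared-error loss via the chain rule.
    d_out = (out - y) * out * (1 - out)
    d_h2 = (d_out @ W3.T) * h2 * (1 - h2)
    d_h1 = (d_h2 @ W2.T) * h1 * (1 - h1)

    # Gradient-descent updates to minimize the loss.
    W3 -= lr * h2.T @ d_out; b3 -= lr * d_out.sum(axis=0, keepdims=True)
    W2 -= lr * h1.T @ d_h2;  b2 -= lr * d_h2.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h1;   b1 -= lr * d_h1.sum(axis=0, keepdims=True)
```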

3. Convolutional Neural Networks


Convolutional Neural Networks, or convnets, are neural networks that share their parameters.
Imagine you have an image. It can be represented as a cuboid having width and height (the
spatial dimensions of the image) and depth (as images generally have red, green, and blue
channels).

Layers used to build ConvNets



A convnet is a sequence of layers, and every layer transforms one volume to another through a
differentiable function. Types of layers: let's take an example by running a convnet on an
image of dimension 32 x 32 x 3.

Input Layer: This layer holds the raw input of the image with width 32, height 32, and depth 3.

Convolution Layer: This layer computes the output volume by computing the dot product
between all filters and image patches. Suppose we use a total of 12 filters for this layer; we'll get
an output volume of dimension 32 x 32 x 12.

Activation Function Layer: This layer applies an element-wise activation function to the
output of the convolution layer. Some common activation functions are ReLU: max(0, x),
Sigmoid: 1/(1+e^-x), Tanh, Leaky ReLU, etc. The volume remains unchanged, hence the output
volume will have dimension 32 x 32 x 12.

Pool Layer: This layer is periodically inserted in the convnet, and its main function is to reduce
the size of the volume, which makes computation faster, reduces memory usage, and also helps
prevent overfitting. Two common types of pooling layers are max pooling and average pooling.
If we use a max pool with 2 x 2 filters and stride 2, the resultant volume will be of dimension
16 x 16 x 12.

Fully-Connected Layer: This layer is a regular neural network layer that takes input from the
previous layer, computes the class scores, and outputs a 1-D array of size equal to the
number of classes.
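Putting these layers together, here is a minimal Keras sketch of the 32 x 32 x 3 example above (Keras is an assumed framework here, and the 10-class output is an illustrative assumption):

```python
from tensorflow import keras
from tensorflow.keras import layers

num_classes = 10  # illustrative assumption

model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),                   # input layer: 32 x 32 x 3
    layers.Conv2D(12, kernel_size=3, padding="same"),  # 12 filters -> 32 x 32 x 12
    layers.Activation("relu"),                         # element-wise ReLU, shape unchanged
    layers.MaxPooling2D(pool_size=2, strides=2),       # 2 x 2 max pool -> 16 x 16 x 12
    layers.Flatten(),                                  # 16 * 16 * 12 = 3072 values
    layers.Dense(num_classes, activation="softmax"),   # 1-D array of class scores
])
model.summary()
```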

4. Concept of Convolution (1D and 2D) layers


Convolutional Layer

In deep learning, a convolutional neural network (CNN or ConvNet) is a class of deep neural
networks that is typically used to recognize patterns present in images, but CNNs are also used
for spatial data analysis, computer vision, natural language processing, signal processing, and
various other purposes.

What Is a Convolution?

Convolution is an orderly procedure where two sources of information are intertwined; it's an
operation that changes a function into something else. Convolutions have long been used,
typically in image processing, to blur and sharpen images, but also to perform other
operations (e.g., enhancing edges and embossing).

A typical CNN pipeline consists of four operations: Convolution, Non-Linearity (ReLU), Pooling
or Sub-Sampling, and Classification (Fully Connected Layer).

The first layer of a Convolutional Neural Network is always a Convolutional Layer.
Convolutional layers apply a convolution operation to the input, passing the result to the
next layer. A convolution converts all the pixels in its receptive field into a single value.

For example, if you apply a convolution to an image, you will decrease the image
size as well as bring all the information in the field together into a single pixel. The final
output of the convolutional layer is a vector. Based on the type of problem we need to solve and
on the kind of features we are looking to learn, we can use different kinds of convolutions.

The 2D Convolution Layer


The most common type of convolution is the 2D convolution layer, usually
abbreviated as conv2D. A filter or kernel in a conv2D layer "slides" over the 2D input data,
performing an element-wise multiplication and summing the results into a single output pixel.
The kernel performs the same operation for every location it slides over, transforming a 2D
matrix of features into a different 2D matrix of features.
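To make the sliding-window idea concrete, here is a minimal NumPy sketch of a "valid" 2D convolution (strictly, a cross-correlation, as in most deep learning libraries); the input and edge-detection kernel are illustrative:

```python
import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over every location, multiply element-wise, sum to one pixel.
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i:i + kh, j:j + kw]   # receptive field at this location
            out[i, j] = np.sum(patch * kernel)  # single output pixel
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])              # simple vertical-edge filter
print(conv2d(image, kernel).shape)              # (3, 3): the output is smaller
```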

The Dilated or Atrous Convolution

This operation expands the window size without increasing the number of weights by inserting
zero-values into convolution kernels. Dilated or Atrous Convolutions can be used in real-time
applications and in applications where processing power is limited, as the RAM requirements
are less intensive.

Separable Convolutions

There are two main types of separable convolutions: spatial separable convolutions and
depthwise separable convolutions. The spatial separable convolution deals primarily with the
spatial dimensions of an image and kernel: the width and the height. Unlike spatial
separable convolutions, depthwise separable convolutions work with kernels that cannot be
"factored" into two smaller kernels. As a result, the depthwise separable convolution is more
frequently used.

Transposed Convolutions

These types of convolutions are also known as deconvolutions or fractionally strided
convolutions. A transposed convolutional layer carries out a regular convolution but reverts its
spatial transformation.
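The following Keras sketch (an assumed framework; the filter counts and input shape are illustrative) shows how these three convolution variants are typically expressed:

```python
from tensorflow import keras
from tensorflow.keras import layers

x = keras.Input(shape=(32, 32, 3))

# Dilated (atrous): dilation_rate=2 widens the receptive field with no extra weights.
d = layers.Conv2D(8, kernel_size=3, dilation_rate=2, padding="same")(x)

# Depthwise separable: a per-channel spatial conv followed by a 1x1 pointwise conv.
s = layers.SeparableConv2D(8, kernel_size=3, padding="same")(x)

# Transposed ("deconvolution"): stride 2 reverses a stride-2 downsampling, 32 -> 64.
t = layers.Conv2DTranspose(8, kernel_size=3, strides=2, padding="same")(x)

print(d.shape, s.shape, t.shape)  # (None, 32, 32, 8) (None, 32, 32, 8) (None, 64, 64, 8)
```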

Develop a 1D Convolutional Neural Network

Here we develop a one-dimensional convolutional neural network model (1D CNN) for a human
activity recognition dataset. Convolutional neural network models were developed for image
classification problems, where the model learns an internal representation of a two-dimensional
input, in a process referred to as feature learning.

Before going through Conv1D, let me give you a hint: in Conv1D, the kernel slides along one
dimension. Now let's pause the blog here and think: which type of data requires a kernel sliding
in only one dimension and has spatial properties? The answer is time-series data. Let's look at
the following data.

This data is collected from an accelerometer that a person is wearing on their arm. The data
represents the acceleration along all 3 axes. A 1D CNN can perform activity recognition tasks
from accelerometer data, such as whether the person is standing, walking, jumping, etc. This data
has 2 dimensions: the first dimension is time-steps, and the other is the acceleration values along
the 3 axes.

The following plot illustrates how the kernel moves over the accelerometer data. Each row
represents the time-series acceleration for one axis. The kernel can only move in one dimension,
along the axis of time.
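Here is a minimal Keras sketch of such a 1D CNN (an assumed framework; the window length of 128 time-steps, the filter counts, and the six activity classes are illustrative assumptions):

```python
from tensorflow import keras
from tensorflow.keras import layers

timesteps, channels, num_classes = 128, 3, 6  # e.g., 128 readings x 3 axes, 6 activities

model = keras.Sequential([
    layers.Input(shape=(timesteps, channels)),
    layers.Conv1D(64, kernel_size=3, activation="relu"),  # kernel slides along time only
    layers.Conv1D(64, kernel_size=3, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Flatten(),
    layers.Dense(100, activation="relu"),
    layers.Dense(num_classes, activation="softmax"),      # one score per activity class
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```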

5. Case study of CNN, e.g., on Diabetic Retinopathy


Diabetic retinopathy (DR) is the leading cause of blindness in the working-age population of the
developed world. Presently, detecting DR is a manual, time-consuming process that requires a
trained ophthalmologist to examine and evaluate digital fundus photographs of the retina.
Machine learning technologies such as Convolutional Neural Networks (CNNs) have
emerged as an effective tool in medical image analysis for the detection and classification of DR
in real time.

Diabetes mellitus, commonly known as diabetes, causes high blood sugar. A persistently high
blood sugar level leads to various complications and general vascular deterioration of the heart,
eyes, kidneys, and nerves. Diabetic retinopathy (DR) is one of the leading diseases caused by
diabetes. It damages the blood vessels of the retina in those who have type-I or type-II diabetes.
DR is classified into two major classes: nonproliferative (NPDR) and proliferative (PDR).

[Figure: Computer Vision through Convolutional Neural Network]

CNN for Diabetic Retinopathy detection

A Convolutional Neural Network is a feed-forward neural network. It mainly consists of an input
layer, many hidden layers (such as convolutional, ReLU, pooling, flatten, fully connected and
softmax layers) and a final multi-label classification layer. The CNN methodology involves two
stages of processing: a time-consuming training stage, where millions of images go through many
iterations of the CNN architecture to finalize the model parameters of each layer, and a second,
real-time prediction stage, where each image in the test dataset is fed into the trained model to
score and validate the model.

However, there are two issues with CNN methods for DR detection. One is achieving a desirable
trade-off between sensitivity (patients correctly identified as having DR) and specificity (patients
correctly identified as not having DR). This is significantly harder for a five-class problem
containing normal, mild, moderate, severe, and proliferative DR classes. The second problem is
overfitting. Skewed datasets cause the network to overfit to the class most prominent in the
dataset. Large datasets are often massively skewed.
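One common mitigation for such skew, sketched below (an illustrative approach, not the method of any particular DR study), is to weight the loss by inverse class frequency so that errors on rare classes are penalized more heavily:

```python
import numpy as np

# Hypothetical, heavily skewed label counts for five DR classes (0 = normal ... 4 = proliferative).
labels = np.array([0] * 700 + [1] * 120 + [2] * 100 + [3] * 50 + [4] * 30)

# Inverse-frequency class weights: rarer classes contribute more to the loss.
counts = np.bincount(labels)
class_weight = {c: len(labels) / (len(counts) * n) for c, n in enumerate(counts)}
print(class_weight)

# With Keras, these weights would be passed to training, e.g.:
# model.fit(x_train, y_train, epochs=10, class_weight=class_weight)
```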

6. Building a smart speaker - https://youtu.be/W9qDi8xgGqg

7. Self-driving car, etc. - https://youtu.be/1L0TKZQcUtA and
https://youtu.be/gCm4fhv9WRI

8. Unsupervised Learning - SOM Algorithm and its variants


A Self Organizing Map (or Kohonen Map or SOM) is a type of Artificial Neural Network that is
inspired by biological models of neural systems from the 1970s.

It follows an unsupervised learning approach and trains its network through a competitive
learning algorithm. SOM is used for clustering and mapping (or dimensionality reduction)
to map multidimensional data onto a lower-dimensional space, which allows people to reduce
complex problems for easier interpretation. SOM has two layers: one is the Input layer and the
other one is the Output layer.

The architecture of the Self Organizing Map with two clusters and n input features of any sample
is given below:

[Figure: SOM architecture with two clusters and n input features]

How does SOM work?


Let's say we have input data of size (m, n), where m is the number of training examples and n is
the number of features in each example. First, SOM initializes the weights, of size (n, C), where
C is the number of clusters. Then, iterating over the input data, for each training example it
updates the winning vector (the weight vector with the shortest distance (e.g., Euclidean distance)
from the training example). The weight update rule is given by:

w_ij = w_ij(old) + alpha(t) * (x_ik - w_ij(old))

where alpha is the learning rate at time t, j denotes the winning vector, i denotes the i-th feature
of the training example, and k denotes the k-th training example from the input data. After
training the SOM network, the trained weights are used for clustering new examples. A new
example falls in the cluster of its winning vector.

SOM Algorithm

The steps involved are:

1. Weight initialization
2. For 1 to N number of epochs:
3. Select a training example
4. Compute the winning vector
5. Update the winning vector
6. Repeat steps 3, 4 and 5 for all training examples
7. Cluster the test sample
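Here is a minimal NumPy sketch of these steps (a simplified SOM with one output node per cluster and no neighborhood function, matching the update rule above; the data and hyperparameters are illustrative):

```python
import numpy as np

def train_som(X, n_clusters, epochs=100, alpha0=0.5, seed=0):
    m, n = X.shape
    rng = np.random.default_rng(seed)
    W = rng.random((n_clusters, n))                       # step 1: weight initialization
    for epoch in range(epochs):                           # step 2: for 1..N epochs
        alpha = alpha0 * (1 - epoch / epochs)             # decaying learning rate alpha(t)
        for x in X:                                       # step 3: select a training example
            j = np.argmin(np.linalg.norm(W - x, axis=1))  # step 4: winning vector (Euclidean)
            W[j] += alpha * (x - W[j])                    # step 5: w = w + alpha * (x - w)
    return W

def cluster(W, x):
    # A new example falls in the cluster of its winning vector.
    return int(np.argmin(np.linalg.norm(W - x, axis=1)))

X = np.array([[0.1, 0.2], [0.15, 0.25], [0.9, 0.8], [0.95, 0.85]])
W = train_som(X, n_clusters=2)
print([cluster(W, x) for x in X])  # the two natural groups map to two clusters
```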
