At its simplest, deep learning can be thought of as a way to automate predictive analytics. While
many traditional machine learning algorithms are linear or shallow, deep learning algorithms are
stacked in a hierarchy of increasing complexity and abstraction.
Hidden Layer: The output of the input layer is then fed into the hidden layer. There can be
many hidden layers, depending on the model and the size of the data. Each hidden layer can have
a different number of neurons, which is often greater than the number of features. The output
of each layer is computed by multiplying the output of the previous layer by that layer's
learnable weight matrix, adding the learnable biases, and then applying an activation
function, which makes the network nonlinear.
Output Layer: The output from the last hidden layer is then fed into a function such as sigmoid
or softmax, which converts the score of each class into a probability for that class.
The data is then fed into the model and the output of each layer is computed; this step is called
feed forward. We then calculate the error using an error function; common choices include
cross-entropy and squared error loss. After that, we propagate the error back through the model
by calculating the derivatives of the loss with respect to the weights. This step is called
backpropagation, and the resulting derivatives are used to update the weights so that the loss
is minimized.
Here’s the basic Python code for a neural network with random inputs and two hidden layers.
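A minimal NumPy sketch of such a network follows. It assumes ReLU hidden layers, a softmax output, and cross-entropy loss; the layer sizes, learning rate, and all function names are illustrative assumptions rather than values taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from the text).
n_samples, n_features, n_hidden, n_classes = 8, 4, 16, 3

X = rng.normal(size=(n_samples, n_features))      # random inputs
y = rng.integers(0, n_classes, size=n_samples)    # random class labels

# Learnable weights and biases: two hidden layers plus an output layer.
sizes = [n_features, n_hidden, n_hidden, n_classes]
W = [rng.normal(scale=0.1, size=(n_in, n_out))
     for n_in, n_out in zip(sizes[:-1], sizes[1:])]
b = [np.zeros(n_out) for n_out in sizes[1:]]


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)          # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def forward(X):
    """Feed forward: matrix multiply, add bias, apply activation."""
    h1 = np.maximum(0.0, X @ W[0] + b[0])         # hidden layer 1 (ReLU)
    h2 = np.maximum(0.0, h1 @ W[1] + b[1])        # hidden layer 2 (ReLU)
    probs = softmax(h2 @ W[2] + b[2])             # output layer (softmax)
    return h1, h2, probs


def cross_entropy(probs, y):
    return -np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12))


def train_step(X, y, lr=0.1):
    """One feed-forward pass, then backpropagation and a weight update."""
    h1, h2, probs = forward(X)
    loss = cross_entropy(probs, y)

    # Gradient of the mean cross-entropy with respect to the output logits.
    d3 = probs.copy()
    d3[np.arange(len(y)), y] -= 1.0
    d3 /= len(y)
    d2 = (d3 @ W[2].T) * (h2 > 0)                 # back through ReLU 2
    d1 = (d2 @ W[1].T) * (h1 > 0)                 # back through ReLU 1

    grads_W = [X.T @ d1, h1.T @ d2, h2.T @ d3]
    grads_b = [d1.sum(axis=0), d2.sum(axis=0), d3.sum(axis=0)]
    for i in range(3):
        W[i] -= lr * grads_W[i]
        b[i] -= lr * grads_b[i]
    return loss


losses = [train_step(X, y) for _ in range(200)]
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Because the inputs and labels are random, the network can only memorize them; the point of the sketch is the order of operations (feed forward, loss, backpropagation, update), not the accuracy.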
A ConvNet is a sequence of layers, and every layer transforms one volume into another through a
differentiable function. Types of layers: Let’s take an example by running a ConvNet on an
image of dimension 32 x 32 x 3.
Input Layer: This layer holds the raw input of the image with width 32, height 32, and depth 3.
Convolution Layer: This layer computes the output volume by computing the dot product
between each filter and every image patch. Suppose we use a total of 12 filters for this layer;
with stride 1 and padding that preserves the width and height, we get an output volume of
dimension 32 x 32 x 12.
Activation Function Layer: This layer applies an element-wise activation function to the
output of the convolution layer. Some common activation functions are ReLU: max(0, x),
Sigmoid: 1/(1+e^-x), Tanh, Leaky ReLU, etc. The volume remains unchanged, so the output
volume has dimension 32 x 32 x 12.
Pool Layer: This layer is periodically inserted in the ConvNet. Its main function is to reduce
the size of the volume, which speeds up computation, reduces memory use, and also helps to limit
overfitting. Two common types of pooling layers are max pooling and average pooling. If we use
a max pool with 2 x 2 filters and stride 2, the resultant volume will be of dimension 16 x 16 x 12.
Fully-Connected Layer: This layer is a regular neural network layer that takes input from the
previous layer, computes the class scores, and outputs a 1-D array of size equal to the
number of classes.
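The volume sizes quoted in the layer list above can be checked with a short shape trace. This is a sketch under stated assumptions: 3 x 3 kernels with stride 1 and padding 1 for the convolution, and 10 output classes. Neither value is given in the text, and the helper names are hypothetical.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, window, stride):
    """Spatial output size of a pooling window along one dimension."""
    return (size - window) // stride + 1


def trace_shapes(height=32, width=32, depth=3, n_filters=12, n_classes=10):
    """Return the volume shape after each layer of the example ConvNet."""
    shapes = [("input", (height, width, depth))]

    # Assumed: 3 x 3 kernels, stride 1, padding 1, so width and height are kept.
    height = conv_output_size(height, kernel=3, stride=1, padding=1)
    width = conv_output_size(width, kernel=3, stride=1, padding=1)
    shapes.append(("convolution", (height, width, n_filters)))

    # Element-wise activation leaves the volume unchanged.
    shapes.append(("activation", (height, width, n_filters)))

    # Max pool with a 2 x 2 window and stride 2 halves width and height.
    height = pool_output_size(height, window=2, stride=2)
    width = pool_output_size(width, window=2, stride=2)
    shapes.append(("max pool", (height, width, n_filters)))

    # The fully-connected layer outputs one score per class (10 is assumed).
    shapes.append(("fully connected", (n_classes,)))
    return shapes


for name, shape in trace_shapes():
    print(f"{name:16s} {shape}")
```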
In deep learning, a convolutional neural network (CNN or ConvNet) is a class of deep neural
networks typically used to recognize patterns present in images, but also used
for spatial data analysis, computer vision, natural language processing, signal processing, and
various other purposes.
What Is a Convolution?
Convolution is an orderly procedure in which two sources of information are intertwined; it is an
operation that combines two functions to produce a third. Convolutions have long been used
in image processing to blur and sharpen images, but also to perform other
operations (e.g. enhancing edges and embossing).
The main operations in a ConvNet are: Convolution, Non-Linearity (ReLU), Pooling or
Sub-Sampling, and Classification (Fully Connected Layer).
For example, if you apply a convolution to an image without padding, you decrease the image
size and bring all the information in the kernel's field together into a single pixel. The
output of a convolutional layer is a feature map. Based on the type of problem we need to solve and
on the kind of features we are looking to learn, we can use different kinds of convolutions.
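The size reduction and the classic image-processing uses mentioned above can be sketched with a plain NumPy convolution. The function name, the toy 5 x 5 image, and the two kernels are illustrative assumptions. The kernel is flipped here, as in the mathematical definition of convolution; deep learning libraries typically skip the flip, which makes no difference for symmetric kernels like these.

```python
import numpy as np


def convolve2d_valid(image, kernel):
    """2-D convolution without padding: the output is smaller than the input."""
    flipped = kernel[::-1, ::-1]                  # true convolution flips the kernel
    k_h, k_w = kernel.shape
    out_h = image.shape[0] - k_h + 1
    out_w = image.shape[1] - k_w + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # Each output pixel gathers the whole patch under the kernel.
            out[r, c] = np.sum(image[r:r + k_h, c:c + k_w] * flipped)
    return out


image = np.arange(25, dtype=float).reshape(5, 5)  # toy image: a smooth ramp
box_blur = np.ones((3, 3)) / 9.0                  # averages each 3 x 3 patch
edge = np.array([[-1, -1, -1],
                 [-1,  8, -1],
                 [-1, -1, -1]], dtype=float)      # responds to local contrast

blurred = convolve2d_valid(image, box_blur)
edges = convolve2d_valid(image, edge)
print(blurred.shape)   # (3, 3): a 5 x 5 image shrinks under a 3 x 3 kernel
print(edges)           # all zeros: a smooth ramp contains no edges
```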
Unit No 4
Dilated (atrous) convolution expands the window size without increasing the number of weights
by inserting zero values into the convolution kernel. Dilated convolutions can be used in
real-time applications and in applications with limited processing power, as their RAM
requirements are less intensive.
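The zero-insertion idea can be sketched as follows. The helper name `dilate_kernel` and the dilation rate of 2 are illustrative assumptions.

```python
import numpy as np


def dilate_kernel(kernel, rate):
    """Insert (rate - 1) zeros between neighbouring kernel weights."""
    k_h, k_w = kernel.shape
    out = np.zeros(((k_h - 1) * rate + 1, (k_w - 1) * rate + 1),
                   dtype=kernel.dtype)
    out[::rate, ::rate] = kernel                  # original weights, spread apart
    return out


kernel = np.ones((3, 3))
dilated = dilate_kernel(kernel, rate=2)
print(dilated.shape)                 # (5, 5): the window covers a larger area
print(int(np.count_nonzero(dilated)))  # 9: the number of weights is unchanged
```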
Separable Convolutions
There are two main types of separable convolutions: spatial separable convolutions and
depthwise separable convolutions. The spatial separable convolution deals primarily with the
spatial dimensions of an image and kernel: the width and the height. It factors a kernel into
two smaller kernels. Unlike spatial separable convolutions, depthwise separable convolutions
also work with kernels that cannot be “factored” into two smaller kernels. As a result, they
are more frequently used.
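Both ideas can be illustrated briefly. The sketch below uses the Sobel kernel as an example of a kernel that factors into two smaller ones, and then compares weight counts for a standard and a depthwise separable convolution. The choice of kernel, the channel counts, and the function names are assumptions for illustration, and biases are ignored.

```python
import numpy as np

# Spatial separable: a 3 x 3 Sobel kernel factors into a 3 x 1 and a 1 x 3 kernel.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])
column = np.array([[1], [2], [1]])     # 3 x 1 kernel
row = np.array([[-1, 0, 1]])           # 1 x 3 kernel
factored = column @ row                # 6 weights reproduce the 9-weight kernel


def standard_conv_weights(kernel, c_in, c_out):
    """Weights in a standard convolution layer (biases ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_weights(kernel, c_in, c_out):
    """One k x k filter per input channel, then a 1 x 1 pointwise convolution."""
    return kernel * kernel * c_in + c_in * c_out


print(np.array_equal(factored, sobel_x))                 # True
print(standard_conv_weights(3, c_in=3, c_out=12))        # 324
print(depthwise_separable_weights(3, c_in=3, c_out=12))  # 63
```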
Transposed Convolutions
This section considers a one-dimensional convolutional neural network model (1D CNN) for a
human activity recognition dataset. Convolutional neural network models were developed for image
classification problems, where the model learns an internal representation of a two-dimensional
input in a process referred to as feature learning.
Before going through Conv1D, let me give you a hint. In Conv1D, the kernel slides along one
dimension. Now let’s pause here and think: which type of data requires the kernel to slide in
only one dimension and still has spatial properties? The answer is time-series data. Let’s look at
the following data.
This data is collected from an accelerometer that a person wears on their arm. The data
represents the acceleration along all 3 axes. A 1D CNN can perform activity recognition from
accelerometer data, such as determining whether the person is standing, walking, jumping, etc.
This data has 2 dimensions: the first is the time steps and the other is the acceleration
values along the 3 axes.
The following plot illustrates how the kernel moves over the accelerometer data. Each row
represents the time-series acceleration for one axis. The kernel can only move in one dimension,
along the axis of time.
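The sliding motion described above can be sketched with NumPy. The data here is synthetic random noise standing in for real accelerometer readings, and the kernel width of 5 and the function name are assumptions for illustration.

```python
import numpy as np


def conv1d_valid(signal, kernel):
    """Slide a kernel along the time axis only.

    signal: array of shape (time_steps, channels)
    kernel: array of shape (width, channels); it spans all channels at once
    """
    width = kernel.shape[0]
    steps = signal.shape[0] - width + 1
    return np.array([np.sum(signal[t:t + width] * kernel)
                     for t in range(steps)])


rng = np.random.default_rng(0)
accel = rng.normal(size=(100, 3))     # synthetic stand-in: 100 time steps, 3 axes
kernel = rng.normal(size=(5, 3))      # covers 5 time steps across all 3 axes

features = conv1d_valid(accel, kernel)
print(features.shape)                 # (96,): one value per kernel position in time
```

A real Conv1D layer would learn many such kernels and add a bias and an activation, but the movement along a single axis is the same.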
trained ophthalmologist to examine and evaluate digital fundus photographs of the retina.
Machine learning technologies such as Convolutional Neural Networks (CNNs) have
emerged as an effective tool in medical image analysis for the detection and classification of DR
in real time.
Diabetes mellitus, commonly known as diabetes, causes high blood sugar. A persistently high
blood sugar level leads to various complications and general vascular deterioration of the heart,
eyes, kidneys, and nerves. Diabetic retinopathy (DR) is one of the leading diseases caused by
diabetes. It damages the blood vessels of the retina in people who have type I or type II diabetes.
DR is classified into two major classes: nonproliferative (NPDR) and proliferative (PDR).
However, there are two issues with CNN methods for DR detection. One is achieving a desirable
trade-off between sensitivity (patients correctly identified as having DR) and specificity
(patients correctly identified as not having DR). This is significantly harder for a five-class
problem containing normal, mild, moderate, severe, and proliferative DR classes. The second
problem is overfitting. Skewed datasets cause the network to overfit to the class most prominent
in the dataset. Large datasets are often massively skewed.
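The two metrics, and the effect of a skewed dataset on them, can be sketched in a few lines. The labels below are made up for illustration (1 = has DR, 0 = no DR), the function name is hypothetical, and the sketch assumes both classes appear in the true labels.

```python
def sensitivity_specificity(y_true, y_pred):
    """Binary sensitivity and specificity; 1 = has DR, 0 = no DR."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # DR found
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # DR missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # healthy, flagged
    return tp / (tp + fn), tn / (tn + fp)


# A skewed, made-up dataset: 9 patients without DR and 1 with DR.
y_true = [0] * 9 + [1]

# A model that over-fits to the prominent class predicts "no DR" for everyone.
y_pred = [0] * 10

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)   # 0.0 1.0
```

Such a model is right for 9 of 10 patients, yet it never detects the disease, which is why accuracy alone is a poor guide on skewed data.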
A Self Organizing Map (SOM) follows an unsupervised learning approach and trains its network
through a competitive learning algorithm. SOM is used for clustering and mapping (or
dimensionality reduction): it maps multidimensional data onto a lower-dimensional space, which
allows people to reduce complex problems for easier interpretation. SOM has two layers: the
input layer and the output layer.
The architecture of the Self Organizing Map with two clusters and n input features of any sample
is given below:
where alpha is the learning rate at time t, j denotes the winning vector, i denotes the ith
feature of a training example, and k denotes the kth training example from the input data. After
training the SOM network, the trained weights are used for clustering new examples. A new example
falls in the cluster of the winning vector.
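The training and clustering steps can be sketched as below. The sketch assumes the standard competitive-learning update for the winning vector, w_ij(new) = w_ij(old) + alpha(t) * (x_i^k - w_ij(old)), with the winner chosen by smallest squared Euclidean distance. The linear decay of alpha, the initial weights, the toy data, and the function names are all illustrative assumptions.

```python
import numpy as np


def winner(weights, x):
    """Index j of the weight vector closest to x (squared Euclidean distance)."""
    return int(np.argmin(np.sum((weights - x) ** 2, axis=1)))


def train_som(X, init_weights, epochs=10, alpha0=0.5):
    """Competitive learning with one weight vector per cluster."""
    weights = np.array(init_weights, dtype=float)    # copy; one row per cluster
    for t in range(epochs):
        alpha = alpha0 * (1.0 - t / epochs)          # assumed linear decay of alpha(t)
        for x in X:                                  # x is the kth training example
            j = winner(weights, x)
            weights[j] += alpha * (x - weights[j])   # move only the winner toward x
    return weights


# Toy data with two obvious groups and n = 2 input features.
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
weights = train_som(X, init_weights=[[1.0, 1.0], [4.0, 4.0]])

# A new example falls in the cluster of its winning vector.
print(winner(weights, np.array([0.1, 0.1])))   # 0
print(winner(weights, np.array([4.8, 5.2])))   # 1
```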
SOM Algorithm