
NEURAL NETWORK AND DEEP LEARNING

CCS355

UNIT III
THIRD-GENERATION NEURAL NETWORKS

Spiking Neural Networks – Convolutional Neural Networks – Deep Learning Neural Networks –
Extreme Learning Machine Model – Convolutional Neural Networks: The Convolution
Operation – Motivation – Pooling – Variants of the basic Convolution Function – Structured
Outputs – Data Types – Efficient Convolution Algorithms – Neuroscientific Basis –
Applications: Computer Vision, Image Generation, Image Compression.
Spiking Neural Networks
Introduction:
Spiking Neural Networks sit at the intersection of neuroscience and artificial
intelligence and open up numerous prospects in the field of robotics. The Spiking Neural
Network (SNN) is the third generation of neural network models, built on specialized
network topologies that rethink the entire computational process. Communicating through
discrete spikes is what makes these networks more energy-efficient.

Neural networks are classified into three generations.


1. The first generation of these structures (the perceptron) consists of a single artificial
neuron that can be trained.
2. Second-generation neural networks are Artificial Neural Networks (ANN) structures that
are still in use today.
3. Third-generation artificial neural networks are the Spiking Neural Networks (SNN)
structure, which attempts to fully mimic the working principles of the human brain and
communicate via spikes.
The Spiking Neural Network (SNN) is constructed using specific network topologies. Because
its neurons communicate only through spikes, it can be more energy-efficient, which is
essential for compact devices.
How it works:

➢ Spiking Neural Networks (SNNs) were developed in computational neuroscience to
replicate the behaviour of biological neurons. One result is the Leaky Integrate-and-Fire
(LIF) model, which describes neuronal activity as the integration of incoming spikes
combined with a weak dissipation (leakage) of the accumulated potential to the environment.
➢ Spiking Neural Networks do not need a strictly layered, feed-forward structure. Apart from
the input and output layers, there may be no distinct layers at all. Instead of crisp layers,
they can use more complex structures such as loops or multidirectional connections to convey
data between neurons.
Architecture diagram:
➢ The value of each neuron corresponds to the current electrical potential of a biological neuron.
According to its mathematical model, a neuron’s value can fluctuate. For instance, when a neuron
receives a spike from an upstream neuron, its value may increase or decrease. As soon as the value
crosses a certain threshold, the neuron transmits a single impulse to every downstream neuron
connected to it, and its value then drops below its average (resting) level.

➢ The neuron thus has a refractory phase comparable to that of a biological neuron. Its value
eventually reverts to the average.
Architecture diagram of SNN
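
The integrate, leak, fire, and refractory behaviour described above can be sketched in a few lines of Python. This is only a minimal illustration of one LIF-style neuron; the function name simulate_lif and every parameter value (time constant, threshold, reset value, refractory length) are assumptions chosen for the example, not values taken from these notes.

```python
import numpy as np

def simulate_lif(input_current, dt=1.0, tau=10.0, v_rest=0.0,
                 v_reset=-1.0, v_threshold=1.0, refractory_steps=3):
    """Simulate one leaky integrate-and-fire neuron (illustrative parameters)."""
    v = v_rest
    refractory_left = 0
    potentials, spikes = [], []
    for current in input_current:
        fired = False
        if refractory_left > 0:
            # Refractory phase: input is ignored, the value drifts back to rest.
            refractory_left -= 1
            v += dt / tau * (v_rest - v)
        else:
            # Leak toward the resting value and integrate the incoming input.
            v += dt / tau * (v_rest - v) + current
            if v >= v_threshold:
                # Threshold crossed: emit one spike, drop below rest, go refractory.
                fired = True
                v = v_reset
                refractory_left = refractory_steps
        potentials.append(v)
        spikes.append(fired)
    return np.array(potentials), np.array(spikes, dtype=bool)

# Constant input drives the neuron to fire periodically.
potentials, spikes = simulate_lif(np.full(50, 0.3))
print("spike times:", np.flatnonzero(spikes))
```

With no input the value stays at rest and the neuron never fires; a stronger input shortens the gap between spikes, which is how the spike timing carries information.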
Application of Spiking Neural Networks

1. Computer vision is the field that stands to gain the most from automatic video analysis
using Spiking Neural Networks. The IBM TrueNorth digital neurochip, which simulates the
activity of neurons in the visual cortex using one million programmable neurons and 256
million programmable synapses, can be useful in this regard. This neurochip is frequently
regarded as the first piece of hardware compatible with Spiking Neural Networks.
2. Real-Time Processing: SNNs excel in tasks requiring real-time processing and temporal pattern
recognition, such as audio and video processing, robotics, and brain-computer interfaces.
3. Cognitive Computing: They hold promise for emulating higher-order cognitive functions and
complex behaviours exhibited by biological brains.
Advantages and Disadvantages of SNN
❖ Advantages
1. An SNN is a dynamic system. As a result, it excels at dynamic processes such as speech and
dynamic image recognition.
2. An SNN can continue to train while it is already in operation.
3. To train an SNN, you only need to train the output neurons.
4. SNNs typically need fewer neurons than traditional ANNs.
5. Because the neurons send impulses rather than continuous values, SNNs can work very
quickly.
6. Because they exploit the temporal representation of information, SNNs offer higher
information-processing efficiency and better noise immunity.
❖ Disadvantages:
1. SNNs are difficult to train.
2. As of now, there is no standard learning algorithm designed specifically for SNNs.
3. Building a small SNN is impracticable.
Convolutional Neural Network (CNN)

➢ A Convolutional Neural Network (CNN) is a type of deep learning neural network
architecture commonly used in computer vision. Computer vision is a field of Artificial
Intelligence that enables a computer to understand and interpret images and other visual data.
➢ For image classification in particular, Convolutional Neural Networks are the usual choice.
➢ A Convolutional Neural Network (CNN) is an extended version of the Artificial Neural
Network (ANN) that is predominantly used to extract features from grid-like matrix data,
for example visual datasets such as images or videos, where spatial patterns play an
important role.
In a regular Neural Network there are three types of layers:
1. Input Layer: This is the layer through which input is given to the model. The number of neurons
in this layer equals the total number of features in the data (the number of pixels in the case of
an image).
2. Hidden Layer: The input from the input layer is fed into the hidden layer. There can be
many hidden layers, depending on the model and the size of the data. Each hidden layer can have
a different number of neurons, which is often greater than the number of features. The output of
each layer is computed by matrix multiplication of the previous layer's output with the
learnable weights of that layer, followed by the addition of learnable biases and then an
activation function, which makes the network nonlinear.
3. Output Layer: The output from the hidden layer is fed into a function such as sigmoid
or softmax, which converts the output for each class into a probability score for that class.
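
As a rough illustration of the three layer types just listed, the sketch below pushes a small batch through one hidden layer (matrix multiplication, bias, activation) and a softmax output layer. The layer sizes, the choice of ReLU as the activation, and the random weights are assumptions made for the example only.

```python
import numpy as np

def relu(z):
    # Nonlinear activation: negative values are clipped to zero.
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = z - z.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def forward(x, W1, b1, W2, b2):
    hidden = relu(x @ W1 + b1)         # hidden layer: weights, biases, activation
    return softmax(hidden @ W2 + b2)   # output layer: one probability per class

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))            # 5 samples, 4 input features
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # 8 hidden neurons
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)   # 3 classes
probs = forward(x, W1, b1, W2, b2)
print(probs.sum(axis=1))               # each row of probabilities sums to 1
```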
Convolutional Neural Network (CNN)

1. Flattening: After the convolution and pooling layers, the resulting feature maps are flattened
into a one-dimensional vector so they can be passed into a fully connected layer for
classification or regression.
2. Fully Connected Layers: These take the input from the previous layer and compute the final
classification or regression result.
3. Output Layer: For classification tasks, the output from the fully connected layers is fed
into a function such as sigmoid or softmax, which converts the output for each class into a
probability score for that class.
Example:
Let’s consider an image and apply the convolution layer, activation layer, and pooling layer
operations to extract its internal features.

Figure: input, original greyscale image, and output.


Applications of Convolutional Neural Networks
1. Image Classification: CNNs are used for image classification tasks, including
recognizing objects, scenes, and facial expressions.
2. Medical Imaging: CNNs play a role in medical image analysis, such as
diagnosing diseases from X-rays, MRIs, and histopathology slides.
3. Autonomous Vehicles: CNNs enable object detection and scene
understanding in autonomous vehicles, improving safety and navigation.
4. Artificial Intelligence Art: CNNs have creative applications in generating art,
style transfer, and image manipulation.
Advantages and Disadvantages of Convolutional Neural Networks
Advantages (CNN):
1. Good at detecting patterns and features in images, videos, and audio signals.
2. Reasonably robust to translation of features; robustness to rotation and scaling usually has to
be learned, for example through data augmentation.
3. End-to-end training, no need for manual feature extraction.
4. Can handle large amounts of data and achieve high accuracy.
Disadvantages (CNN):
1. Computationally expensive to train and requires a lot of memory.
2. Can be prone to overfitting if there is not enough data or proper regularization is not used.
3. Requires large amounts of labelled data.
4. Interpretability is limited; it is hard to understand what the network has learned.
Deep Learning Neural Networks

Definition: Deep learning is the branch of machine learning that is based on artificial
neural network architectures. An artificial neural network (ANN) uses layers of
interconnected nodes called neurons that work together to process and learn from the
input data.

In a fully connected Deep neural network, there is an input layer and one or more hidden layers
connected one after the other. Each neuron receives input from the previous layer neurons or the
input layer. The output of one neuron becomes the input to other neurons in the next layer of the
network, and this process continues until the final layer produces the output of the network. The
layers of the neural network transform the input data through a series of nonlinear transformations,
allowing the network to learn complex representations of the input data.

Figure: Artificial Intelligence, Neural Networks, and Deep Learning Neural Networks.
Advantages and Disadvantages of Deep Learning Neural Networks

➢ Advantages of Deep Learning


1. High accuracy: Deep Learning algorithms can achieve state-of-the-art performance in
various tasks, such as image recognition and natural language processing.
2. Automated feature engineering: Deep Learning algorithms can automatically discover
and learn relevant features from data without the need for manual feature engineering.
3. Scalability: Deep Learning models can scale to handle large and complex datasets, and can
learn from massive amounts of data.
4. Flexibility: Deep Learning models can be applied to a wide range of tasks and can handle
various types of data, such as images, text, and speech.
➢ Disadvantages of Deep Learning:
1. High computational requirements: Deep Learning models require large amounts of data
and computational resources to train and optimize.
2. Requires large amounts of labelled data: Deep Learning models often require a large
amount of labelled data for training, which can be expensive and time-consuming to
acquire.
3. Interpretability: Deep Learning models can be challenging to interpret, making it
difficult to understand how they make decisions.
4. Overfitting: Deep Learning models can sometimes overfit to the training data, resulting
in poor performance on new and unseen data.
Applications of Deep Learning Neural Networks
1. Computer Vision: Deep learning has advanced computer vision tasks, including image
classification, object detection, and facial recognition.
2. Natural Language Processing: Deep learning is applied in natural language understanding,
machine translation, and sentiment analysis.
3. Healthcare: Deep learning is transforming healthcare with applications in medical image
analysis, disease diagnosis, and personalized treatment.
4. Autonomous Vehicles: Deep learning enables autonomous driving systems to perceive and
interpret their environment.
5. Finance, Retail, and Entertainment: Deep learning is also making an impact in other
industries, with uses such as financial forecasting, recommendation systems, and content
generation.
Extreme Learning Machines (ELMs)

Definition:
Extreme Learning Machines (ELMs) are a type of single-hidden-layer feedforward
neural network (SLFN) in which the weights connecting the input layer to the hidden layer are
randomly generated and then kept fixed. The output layer weights are computed analytically
using the Moore-Penrose pseudoinverse.

Architecture of the Extreme Learning Machine

✓ The architecture of an ELM explains how it works in machine learning.
✓ The architecture is simple and straightforward and involves three segments, which are
listed below:

1. Input layer
2. Hidden layer – Single hidden layer
3. Output layer
1. Input layer:
In ELM, the Input Layer is where the data enters the model. It’s represented as a vector called X, which
contains the input features.
X = [X[1], X[2], X[3], ..., X[N]]
In this representation, each X[i] corresponds to a specific feature or attribute of the data. N is the total
number of features. The Input Layer is responsible for passing the data to the Hidden Layer for further
processing.

2. Hidden layer-single hidden layer:


The hidden layer of ELM is where random weights and biases are assigned. Let’s denote the number of
hidden neurons as L as per above Fig 1. The weights connecting the input features to the hidden neurons
are represented by a weight matrix W of size (number of features, L).

The value of L is a hyperparameter that needs to be set before training the neural network. Each
column in the weight matrix corresponds to the weights of one hidden neuron. The biases for the
hidden neurons are represented by a bias vector b of size (L, 1).
The second dimension of 1 makes the bias vector a column vector. For a single input column
vector X of size (N, 1), the product of the transposed weight matrix and the input, W^T * X, is a
column vector of size (L, 1), so the bias must be a column vector of the same size for the addition
to be well defined. The purpose of the bias term is to shift the activation function to the left or
right, allowing the network to model more complex functions.
The output of the hidden layer, often denoted as H, is calculated by applying the activation function
g element-wise to the weighted sum of the input features plus the bias, much like the linear
combination used in linear regression:
H = g(W^T * X + b)
3. Output layer
In ELM, the output layer weights are calculated using the Moore-Penrose pseudoinverse of the
hidden layer output matrix H. This output weight matrix is denoted as beta and is obtained as
beta = pinv(H) * T, where T is the matrix of training targets. The output predictions,
represented as f(x), are calculated by multiplying the hidden layer output H by the output
weights beta:
f(x) = H * beta
When H stacks the hidden layer outputs of several data points as rows, each row in f(x)
represents the predictions for the corresponding data point.
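
The three ELM segments can be put together in a short sketch. It assumes a batch layout in which each row of X is one data point, so the hidden output is computed as g(X W + b), and it assumes a sigmoid activation and a toy sine-regression dataset; the function names elm_fit and elm_predict are made up for this illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_fit(X, T, n_hidden, rng):
    """Fit an ELM: random fixed hidden layer, analytic output weights."""
    n_features = X.shape[1]
    W = rng.normal(size=(n_features, n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = sigmoid(X @ W + b)                       # hidden output, shape (n_samples, L)
    beta = np.linalg.pinv(H) @ T                 # Moore-Penrose least-squares solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    return sigmoid(X @ W + b) @ beta             # f(x) = H * beta

# Toy regression problem (assumed for illustration): learn sin(3x) on [-1, 1].
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
T = np.sin(3.0 * X)
W, b, beta = elm_fit(X, T, n_hidden=30, rng=rng)
pred = elm_predict(X, W, b, beta)
mse = float(np.mean((pred - T) ** 2))
print("training MSE:", mse)
```

No iterative optimisation appears anywhere: the only fitted quantity is beta, and it comes from a single pseudoinverse, which is why training is fast.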
Advantages and Applications of Extreme Learning Machines
Advantages:
1. Fast Training: ELMs have a fast training speed due to their analytical solution for output
weights.
2. Scalability: ELMs are scalable and can handle large datasets efficiently.
3. Generalization: Despite their simplicity, ELMs often generalize well to unseen data.
Applications:
1. Classification: ELMs are used for tasks such as image classification, sentiment analysis, and
pattern recognition.
2. Regression: ELMs can perform regression tasks like predicting housing prices, stock prices, and
time series forecasting.
3. Feature Learning: ELMs are used for feature learning and representation learning in deep
learning architectures.
The Convolution Operation
Definition:
The convolution operation in the context of deep learning refers to the process of applying a
filter (also known as a kernel) to an input image or feature map. This operation involves sliding the filter
over the input data and computing element-wise multiplications followed by summation to produce an
output feature map.
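
The sliding-window definition above can be written directly as two nested loops. The sketch assumes a single-channel input, no padding, and a stride of 1, and, like most deep learning layers, it does not flip the kernel (strictly speaking this is cross-correlation). The input and kernel values are arbitrary examples.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image; multiply element-wise and sum at each position."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1          # output size without padding
    out = np.zeros((oh, ow))
    for r in range(oh):
        for c in range(ow):
            patch = image[r:r + kh, c:c + kw]  # region currently under the filter
            out[r, c] = np.sum(patch * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])               # responds to diagonal differences
feature_map = conv2d(image, kernel)
print(feature_map)                             # 3x3 map, every entry is -5.0
```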
Purpose:
✓ The convolution operation serves as a fundamental building
block in convolutional neural networks (CNNs), enabling them
to automatically learn hierarchical features from raw data.
✓ By convolving learnable filters with input data, CNNs can
effectively extract spatial hierarchies of features, capturing
patterns of increasing complexity.
Enhancing Feature Extraction:
1. The motivation behind the convolution operation lies in its ability to enhance feature
extraction from raw data, particularly in tasks involving images, audio, and video.
2. Convolution allows the network to learn local patterns, edges, and textures from input
data, enabling it to generalize better to unseen examples.

Parameter Sharing:
1. Another key motivation is parameter sharing, where the same filter is applied across
different spatial locations of the input data.
2. This sharing reduces the number of parameters in the network, making it more
computationally efficient and reducing the risk of overfitting.
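
A quick count makes the saving from parameter sharing concrete. The example assumes a 32 x 32 single-channel input and a 3 x 3 filter producing a 30 x 30 output, and compares that with a fully connected layer of the same input and output sizes; the sizes are illustrative only.

```python
def dense_params(n_inputs, n_outputs):
    # One weight per input-output pair plus one bias per output.
    return n_inputs * n_outputs + n_outputs

def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    # One shared filter (plus one bias) per output channel, reused at every location.
    return (kernel_h * kernel_w * in_channels + 1) * out_channels

h = w = 32
k = 3
out_h, out_w = h - k + 1, w - k + 1           # 30 x 30 output positions
dense = dense_params(h * w, out_h * out_w)    # 1024 * 900 + 900
conv = conv_params(k, k, 1, 1)                # 3 * 3 * 1 + 1
print(dense, conv)                            # 922500 10
```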
Pooling
Definition:
Pooling, also known as downsampling, is a technique used in convolutional neural
networks (CNNs) to reduce the spatial dimensions of feature maps while retaining important
information.
Types of Pooling:
1. Max Pooling: Selects the maximum value from each patch of the feature map, preserving the
most dominant features.
2. Average Pooling: Computes the average value within each patch, providing a more smoothed
representation of features.
Purpose:
✓ Pooling helps in reducing the computational complexity of the network by reducing the spatial
dimensions of the feature maps.
✓ It also introduces a degree of translation invariance, making the network more robust to
small spatial translations in the input data.
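
Both pooling types can be sketched with one reshape. The example assumes non-overlapping square patches (stride equal to the patch size) and drops any rows or columns that do not fill a complete patch; the feature map values are arbitrary.

```python
import numpy as np

def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling over size x size patches."""
    h, w = feature_map.shape
    h_out, w_out = h // size, w // size
    trimmed = feature_map[:h_out * size, :w_out * size]
    # Axes 1 and 3 index the position inside each patch.
    patches = trimmed.reshape(h_out, size, w_out, size)
    if mode == "max":
        return patches.max(axis=(1, 3))
    if mode == "average":
        return patches.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'average'")

fm = np.array([[1.0, 3.0, 2.0, 0.0],
               [4.0, 2.0, 1.0, 1.0],
               [0.0, 1.0, 5.0, 6.0],
               [2.0, 2.0, 7.0, 8.0]])
max_pooled = pool2d(fm, mode="max")        # [[4, 2], [2, 8]]
avg_pooled = pool2d(fm, mode="average")    # [[2.5, 1.0], [1.25, 6.5]]
print(max_pooled)
print(avg_pooled)
```

The 4 x 4 map shrinks to 2 x 2 in both cases; max pooling keeps the strongest response in each patch, while average pooling smooths it.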
Variants of the Basic Convolution Function

1. Strided Convolution:
Strided convolution involves moving the filter by more than one pixel at a time, effectively reducing
the spatial dimensions of the output feature map.
2. Dilated Convolution:
Dilated convolution, also known as atrous convolution, introduces gaps between the filter elements,
allowing it to cover a larger receptive field without increasing the number of parameters.
3. Transposed Convolution (Deconvolution):
Transposed convolution, often called deconvolution, is used for upsampling feature maps, effectively
increasing the spatial dimensions of the output.
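
The effect of each variant on the output size can be checked with the standard size formulas. The sketch below works along one spatial dimension and assumes no extra output padding for the transposed case; the function names are illustrative.

```python
def conv_output_size(n, kernel, stride=1, padding=0, dilation=1):
    # A dilated kernel spans dilation * (kernel - 1) + 1 input positions.
    effective_kernel = dilation * (kernel - 1) + 1
    return (n + 2 * padding - effective_kernel) // stride + 1

def transposed_conv_output_size(n, kernel, stride=1, padding=0, dilation=1):
    # Inverse size relation of a strided convolution (no output padding assumed).
    effective_kernel = dilation * (kernel - 1) + 1
    return (n - 1) * stride - 2 * padding + effective_kernel

basic = conv_output_size(32, kernel=3)                  # 30
strided = conv_output_size(32, kernel=3, stride=2)      # 15: stride shrinks the map
dilated = conv_output_size(32, kernel=3, dilation=2)    # 28: kernel spans 5 pixels
upsampled = transposed_conv_output_size(16, kernel=2, stride=2)  # 32: map grows
print(basic, strided, dilated, upsampled)
```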
Data Types
1. Input Data:
✓ Convolutional neural networks (CNNs) are capable of processing various types of input data,
including images, audio spectrograms, and text embeddings.
✓ Each type of data may require different pre-processing steps and network architectures tailored
to its specific characteristics.
2. Output Data:
✓ The output of a CNN depends on the task it is designed to perform. For classification tasks, the
output typically consists of class probabilities or discrete labels.
✓ For tasks like object detection or segmentation, the output may include bounding boxes,
masks, or pixel-wise labels.
Efficient Convolution Algorithms
Importance:
✓ Efficient convolution algorithms are crucial for accelerating the training and inference
processes of convolutional neural networks (CNNs), especially for large-scale datasets and
complex network architectures.
Techniques:
✓ Techniques like fast Fourier transform (FFT)-based convolution, Winograd convolution, and
depthwise separable convolution are commonly used to optimize the computational
efficiency of convolutions.
✓ Hardware accelerators like GPUs and TPUs further enhance the speed and efficiency of
convolutional operations in deep learning frameworks.
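
As one example of these techniques, FFT-based convolution replaces the sliding sum with a multiplication in the frequency domain. The sketch below is a one-dimensional illustration (real implementations work on 2-D batches) and checks the result against a direct double loop; the signal and kernel lengths are arbitrary.

```python
import numpy as np

def direct_convolve_full(signal, kernel):
    """Reference: full linear convolution with an explicit double loop."""
    n, k = len(signal), len(kernel)
    out = np.zeros(n + k - 1)
    for i in range(n):
        for j in range(k):
            out[i + j] += signal[i] * kernel[j]
    return out

def fft_convolve_full(signal, kernel):
    """Same result via the convolution theorem: multiply the spectra."""
    size = len(signal) + len(kernel) - 1       # zero-pad to avoid circular wrap-around
    spectrum = np.fft.rfft(signal, n=size) * np.fft.rfft(kernel, n=size)
    return np.fft.irfft(spectrum, n=size)

rng = np.random.default_rng(0)
signal = rng.normal(size=256)
kernel = rng.normal(size=31)
direct = direct_convolve_full(signal, kernel)
fast = fft_convolve_full(signal, kernel)
print(np.allclose(direct, fast))
```

The direct loop costs on the order of n * k multiplications, while the FFT route costs on the order of n log n, so the FFT version tends to pay off for large kernels.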
Neuroscientific Basis
Inspiration:
✓ The neuroscientific basis of convolutional neural networks (CNNs) draws inspiration from
the visual cortex of the human brain.
✓ CNN architectures mimic the hierarchical organization of neurons in the visual cortex, with
each layer learning increasingly complex features from raw sensory input.
Receptive Fields:
✓ The concept of receptive fields in CNNs is analogous to the receptive fields of neurons in
the visual cortex, representing the region of the input space that influences the activity of a
particular neuron.
✓ CNNs learn hierarchical representations by aggregating information across multiple
receptive fields, capturing spatial hierarchies of features similar to the visual processing in
the brain.
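
How the receptive field grows from layer to layer can be worked out with a simple recurrence. The sketch assumes a plain stack of convolution or pooling layers, each described by a (kernel size, stride) pair along one spatial dimension; the example stacks are illustrative.

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride) pairs ordered from input to output."""
    size, jump = 1, 1        # one output unit initially sees one input pixel
    for kernel, stride in layers:
        size += (kernel - 1) * jump   # each extra kernel tap reaches `jump` pixels further
        jump *= stride                # strides spread neighbouring units further apart
    return size

two_convs = receptive_field([(3, 1), (3, 1)])                 # 5
conv_pool_conv = receptive_field([(3, 1), (2, 2), (3, 1)])    # 8
print(two_convs, conv_pool_conv)
```

Two stacked 3 x 3 filters already see a 5 x 5 region of the input, which is how deeper layers come to respond to larger and more complex patterns.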
Applications of Deep Learning in Computer Vision
1. Object Detection:
✓ Deep learning techniques, particularly Convolutional Neural Networks (CNNs), have revolutionized
object detection tasks.
✓ CNN-based architectures such as Faster R-CNN, YOLO (You Only Look Once), and SSD (Single Shot
MultiBox Detector) enable accurate and real-time object detection in images and videos.
✓ Object detection finds applications in various domains, including surveillance, autonomous vehicles,
and image-based search engines.
2. Image Classification:
✓ Deep learning models excel in image classification tasks, where they classify images into predefined
categories or classes.
✓ Image classification finds applications in medical diagnosis, quality control in manufacturing, and
content-based image retrieval.
3. Image Enhancement and Restoration:
✓ Deep learning techniques are used for image enhancement and restoration tasks, including denoising,
deblurring, and super-resolution.
✓ Image enhancement and restoration techniques are applied in satellite imaging, medical imaging, and
digital photography.
4. Face Recognition:
✓ Deep learning has significantly improved the accuracy and robustness of face recognition systems.
✓ Face recognition is widely used in security systems, surveillance, biometric authentication, and social
media tagging.
5. Video Analysis:
✓ Video analysis finds applications in surveillance, sports analytics, healthcare monitoring, and video
content recommendation.
Applications of Deep Learning in Image Generation

1. Generative Adversarial Networks (GANs):


✓ GANs are a class of deep learning models used for generating realistic synthetic data.
✓ They consist of two networks: a generator that generates fake data samples and a discriminator
that distinguishes between real and fake samples.
✓ GANs are used for generating images, music, text, and other types of data.
2. Style Transfer:
✓ Style transfer involves applying the artistic style of one image to the content of another image.
✓ Deep learning models like Neural Style Transfer use CNNs to separate content and style
representations and combine them to create stylized images.
✓ Style transfer finds applications in artistic rendering, photo editing, and graphic design.
3. Super-Resolution:
✓ Super-resolution techniques enhance the resolution and quality of low-resolution images.
✓ Deep learning models such as SRGAN (Super-Resolution Generative Adversarial Network) and
ESPCN (Efficient Sub-Pixel Convolutional Neural Network) learn to generate high-resolution
images from low-resolution inputs.
✓ Super-resolution is used in medical imaging, surveillance, satellite imaging, and digital
photography.
4. Image Synthesis:
✓ Deep learning models are used for synthesizing novel and realistic images from scratch.
✓ Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) can generate
diverse and high-quality images across different domains.
✓ Image synthesis finds applications in virtual reality, video game development, and creative arts.
Applications of Deep Learning in Image Compression
1. Lossy Image Compression:
✓ Deep learning techniques are used to achieve efficient lossy compression of images while
maintaining perceptual quality.
✓ Autoencoder-based models learn compact representations of images and decode them to
reconstruct high-quality images.
2. Content-Aware Compression:
✓ Content-aware compression techniques leverage deep learning to adaptively allocate bits to
different regions of an image based on their importance.
✓ Deep neural networks can learn to prioritize preserving important features while discarding
less relevant details during compression.
✓ Content-aware compression is applied in medical imaging, satellite imaging, and remote
sensing.
3. Progressive Image Compression:
✓ Deep neural networks can generate low-resolution versions of images, which are successively refined to
higher resolutions.
✓ Progressive image compression is used in web browsing, image sharing platforms, and digital libraries.
4. Learned Image Codecs:
✓ Learned image codecs replace traditional handcrafted compression algorithms with deep learning-based
models.
✓ Learned image codecs are applied in embedded systems, mobile devices, and cloud-based image
processing platforms.
5. Low-Bitrate Compression:
✓ Deep learning techniques are used to achieve high-quality image compression at low bitrates, enabling
efficient transmission and storage of visual data.
✓ Low-bitrate compression is used in video conferencing, telemedicine, and multimedia messaging
applications.
