CCS355
UNIT III
THIRD-GENERATION NEURAL NETWORKS
1. Computer vision is the field that stands to gain the most from automatic video analysis using Spiking Neural Networks. The IBM TrueNorth neuromorphic chip, which simulates the activity of neurons in the visual cortex using one million programmable neurons and 256 million programmable synapses, can be useful in this regard. This chip is frequently regarded as the first piece of hardware compatible with Spiking Neural Networks.
2. Real-Time Processing: SNNs excel in tasks requiring real-time processing and temporal pattern
recognition, such as audio and video processing, robotics, and brain-computer interfaces.
3. Cognitive Computing: They hold promise for emulating higher-order cognitive functions and
complex behaviours exhibited by biological brains.
Advantages and Disadvantages of SNN
❖ Advantages
1. An SNN is a dynamic system, so it excels at dynamic tasks such as speech recognition and dynamic image recognition.
2. An SNN can continue to train even while it is already running.
3. To train an SNN, you simply need to train the output neurons.
4. SNNs typically need fewer neurons than traditional ANNs to perform comparable tasks.
5. Because neurons transmit discrete spikes rather than continuous values, SNNs can work incredibly quickly.
6. Because they leverage the temporal representation of information, SNNs offer improved information-processing efficiency and noise immunity.
❖ Disadvantages:
1. SNNs are difficult to train.
2. As of now, there is no learning algorithm designed expressly for this task.
3. Building a small SNN is impractical.
Convolutional Neural Network (CNN)
1. Flattening: The resulting feature maps are flattened into a one-dimensional vector after the convolution and pooling layers so they can be passed into a fully connected layer for classification or regression.
2. Fully Connected Layers: These take the input from the previous layer and compute the final classification or regression output.
3. Output Layer: The output from the fully connected layers is then fed into a logistic function for classification, such as sigmoid or softmax, which converts the raw output for each class into a probability score.
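The softmax step in the output layer can be sketched in a few lines of NumPy; the logit values below are made up for illustration:

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability, then normalize the exponentials
    # so the outputs form a probability distribution over the classes.
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Hypothetical raw scores (logits) from the fully connected layer for 3 classes.
logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs)  # probabilities sum to 1; the largest logit gets the largest probability
```

The max-subtraction does not change the result mathematically but prevents overflow for large logits.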
Example:
Let’s consider an image and apply the convolution layer, activation layer, and pooling layer operation
to extract the inside feature.
[Figure: nested scope of Artificial Intelligence, Neural Networks, and Deep Learning]
Extreme Learning Machines (ELM)
Definition:
Extreme Learning Machines (ELMs) are a type of single-hidden layer feedforward
neural network (SLFN) where the weights connecting the input layer to the hidden layer are
randomly generated and fixed. The output layer weights are then computed analytically
using the Moore-Penrose pseudoinverse.
✓ We are going to discuss the architecture of ELM which provides a detailed explanation
of how ELM works in machine learning.
✓ The architecture of ELM is very simple and straightforward, involving the three
segments listed below,
1. Input layer
2. Hidden layer – Single hidden layer
3. Output layer
1. Input layer:
In ELM, the Input Layer is where the data enters the model. It’s represented as a vector called X, which
contains the input features.
X = [X[1], X[2], X[3], ..., X[N]]
In this representation, each X[i] corresponds to a specific feature or attribute of the data. N is the total
number of features. The Input Layer is responsible for passing the data to the Hidden Layer for further
processing.
2. Hidden layer:
The number of hidden neurons, L, is a hyperparameter that must be set before training. The
hidden-layer weights form a matrix W of size (L, N), in which each row holds the weights of one
hidden neuron. The biases for the hidden neurons are represented by a bias vector b of size (L, 1).
The second dimension of 1 ensures that the bias vector is a column vector. This matters because
the dot product of the (L, N) weight matrix W and the (N, 1) input vector X is an (L, 1) column
vector, and adding the bias to it element-wise requires the bias to be a column vector as well. The
purpose of the bias term is to shift the activation function to the left or right, allowing the model
to fit more complex functions.
The output of the hidden layer, often denoted as H, is calculated by applying the activation function
g element-wise to the dot product of the weights and the input features, plus the bias:
H = g(W * X + b)
3. Output layer:
In ELM, the output-layer weights are calculated using the Moore-Penrose pseudoinverse of the
hidden-layer output matrix: beta = pinv(H) * T, where T holds the training targets. This output
weight matrix is denoted as beta. The output predictions, represented as f(x), are calculated by
multiplying the hidden-layer output H by the output weights beta:
f(x) = H * beta
To make predictions, we multiply the hidden layer output H by the output weights beta. Each
row in f(x) represents the predictions for a corresponding data point.
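The three ELM steps above (fixed random hidden weights, H = g(W · X + b), beta from the pseudoinverse) can be sketched with NumPy. The toy regression data, the hidden-layer size L = 20, and the tanh activation are illustrative assumptions, not part of the original notes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: 100 samples, 3 features (made-up target function).
X = rng.normal(size=(100, 3))
T = X[:, :1] ** 2 + np.sin(X[:, 1:2])        # targets, shape (100, 1)

L = 20                                        # number of hidden neurons (hyperparameter)
W = rng.normal(size=(L, X.shape[1]))          # random input-to-hidden weights, fixed
b = rng.normal(size=(L, 1))                   # random hidden biases, fixed

def hidden(X):
    # H = g(W * X + b), with g = tanh applied element-wise; rows are samples.
    return np.tanh(X @ W.T + b.T)

H = hidden(X)                                 # hidden-layer output matrix
beta = np.linalg.pinv(H) @ T                  # analytic output weights via pseudoinverse
pred = hidden(X) @ beta                       # f(x) = H * beta
print(np.mean((pred - T) ** 2))               # training mean squared error
```

There is no iterative training loop: the only "learning" is the single least-squares solve for beta, which is why ELM training is fast.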
Advantages and Applications Extreme Learning Neural Networks
Advantages:
1. Fast Training: ELMs have a fast training speed due to their analytical solution for output
weights.
2. Scalability: ELMs are scalable and can handle large datasets efficiently.
3. Generalization: Despite their simplicity, ELMs often generalize well to unseen data.
Applications:
1. Classification: ELMs are used for tasks such as image classification, sentiment analysis, and
pattern recognition.
2. Regression: ELMs can perform regression tasks like predicting housing prices, stock prices, and
time series forecasting.
3. Feature Learning: ELMs are used for feature learning and representation learning in deep
learning architectures.
The Convolution Operation
Definition:
The convolution operation in the context of deep learning refers to the process of applying a
filter (also known as a kernel) to an input image or feature map. This operation involves sliding the filter
over the input data and computing element-wise multiplications followed by summation to produce an
output feature map.
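The sliding-filter computation described above can be sketched in a few lines of NumPy. As in most deep learning frameworks, the sketch omits kernel flipping (strictly, this is cross-correlation), and the image and filter values are made up for illustration:

```python
import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over the image; at each position take the
    # element-wise product with the overlapped patch and sum it.
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1              # output height (no padding, stride 1)
    ow = image.shape[1] - kw + 1              # output width
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 "image"
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])       # toy 2x2 filter
out = conv2d(image, kernel)
print(out)                                          # 3x3 output feature map
```

Note how a 2x2 filter over a 4x4 input yields a 3x3 feature map: each output cell summarizes one local patch.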
Purpose:
✓ The convolution operation serves as a fundamental building
block in convolutional neural networks (CNNs), enabling them
to automatically learn hierarchical features from raw data.
✓ By convolving learnable filters with input data, CNNs can
effectively extract spatial hierarchies of features, capturing
patterns of increasing complexity.
Enhancing Feature Extraction:
1. The motivation behind the convolution operation lies in its ability to enhance feature
extraction from raw data, particularly in tasks involving images, audio, and video.
2. Convolution allows the network to learn local patterns, edges, and textures from input
data, enabling it to generalize better to unseen examples.
Parameter Sharing:
1. Another key motivation is parameter sharing, where the same filter is applied across
different spatial locations of the input data.
2. This sharing reduces the number of parameters in the network, making it more
computationally efficient and reducing the risk of overfitting.
Pooling
Definition:
Pooling, also known as downsampling, is a technique used in convolutional neural
networks (CNNs) to reduce the spatial dimensions of feature maps while retaining important
information.
Types of Pooling:
1. Max Pooling: Selects the maximum value from each patch of the feature map, preserving the
most dominant features.
2. Average Pooling: Computes the average value within each patch, providing a more smoothed
representation of features.
Purpose:
✓ Pooling helps in reducing the computational complexity of the network by reducing the spatial
dimensions of the feature maps.
✓ It also introduces translational invariance, making the network more robust to spatial
translations in the input data.
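Both pooling variants can be sketched with NumPy over non-overlapping 2x2 patches; the feature-map values below are made up for illustration:

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    # Split the feature map into non-overlapping size x size patches
    # and reduce each patch to a single value (max or average).
    h, w = x.shape
    patches = x[:h - h % size, :w - w % size].reshape(h // size, size, w // size, size)
    if mode == "max":
        return patches.max(axis=(1, 3))    # dominant value per patch
    return patches.mean(axis=(1, 3))       # smoothed value per patch

fmap = np.array([[1.0, 3.0, 2.0, 4.0],
                 [5.0, 7.0, 6.0, 8.0],
                 [9.0, 2.0, 1.0, 0.0],
                 [3.0, 4.0, 5.0, 6.0]])
pooled_max = pool2d(fmap, mode="max")
pooled_avg = pool2d(fmap, mode="avg")
print(pooled_max)   # 2x2 map of per-patch maxima
print(pooled_avg)   # 2x2 map of per-patch means
```

Each 2x2 pooling pass halves both spatial dimensions, which is where the reduction in computational cost comes from.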
Variants of the Basic Convolution Function
1. Stride Convolution:
Stride convolution involves moving the filter by more than one pixel at a time, effectively reducing
the spatial dimensions of the output feature map.
2. Dilated Convolution:
Dilated convolution, also known as atrous convolution, introduces gaps between the filter elements,
allowing it to capture larger receptive fields without increasing the number of parameters.
3. Transposed Convolution (Deconvolution):
Transposed convolution, or deconvolution, is used for upsampling feature maps, effectively
increasing the spatial dimensions of the output.
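The effect of stride and dilation on output size follows one formula: out = (n + 2·pad − k_eff) / stride + 1, where the effective kernel size is k_eff = dilation·(k − 1) + 1. A small sketch (the 32-pixel input and 3x3 kernel are illustrative):

```python
def conv_out_size(n, k, stride=1, dilation=1, pad=0):
    # Dilation spreads the k taps apart, enlarging the effective kernel.
    k_eff = dilation * (k - 1) + 1
    return (n + 2 * pad - k_eff) // stride + 1

print(conv_out_size(32, 3))               # plain 3x3 conv: 30
print(conv_out_size(32, 3, stride=2))     # stride 2 roughly halves the size: 15
print(conv_out_size(32, 3, dilation=2))   # dilated 3x3 covers a 5x5 field: 28
```

This makes the trade-offs concrete: stride shrinks the output cheaply, while dilation widens the receptive field while keeping the same 9 parameters.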
Data Types
1. Input Data:
✓ Convolutional neural networks (CNNs) are capable of processing various types of input data,
including images, audio spectrograms, and text embeddings.
✓ Each type of data may require different pre-processing steps and network architectures tailored
to its specific characteristics.
2. Output Data:
✓ The output of a CNN depends on the task it is designed to perform. For classification tasks, the
output typically consists of class probabilities or discrete labels.
✓ For tasks like object detection or segmentation, the output may include bounding boxes,
masks, or pixel-wise labels.
Efficient Convolution Algorithms
Importance:
✓ Efficient convolution algorithms are crucial for accelerating the training and inference
processes of convolutional neural networks (CNNs), especially for large-scale datasets and
complex network architectures.
Techniques:
✓ Techniques like fast Fourier transform (FFT)-based convolution, Winograd convolution, and
depthwise separable convolution are commonly used to optimize the computational
efficiency of convolutions.
✓ Hardware accelerators like GPUs and TPUs further enhance the speed and efficiency of
convolutional operations in deep learning frameworks.
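The parameter saving from depthwise separable convolution can be checked with simple arithmetic: a standard k x k convolution uses k·k·C_in·C_out weights, while the depthwise-then-pointwise factorization uses k·k·C_in + C_in·C_out. The kernel and channel sizes below are illustrative:

```python
def standard_conv_params(k, c_in, c_out):
    # One k x k filter per (input channel, output channel) pair.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # One k x k filter per input channel, then a 1x1 pointwise convolution.
    return k * k * c_in + c_in * c_out

std = standard_conv_params(3, 64, 128)
sep = depthwise_separable_params(3, 64, 128)
print(std, sep, std / sep)   # the separable version is several times smaller
```

For these sizes the factorization cuts the weight count by roughly 8x, which is why it is popular in efficiency-oriented architectures.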
Neuroscientific Basis
Inspiration:
✓ The neuroscientific basis of convolutional neural networks (CNNs) draws inspiration from
the visual cortex of the human brain.
✓ CNN architectures mimic the hierarchical organization of neurons in the visual cortex, with
each layer learning increasingly complex features from raw sensory input.
Receptive Fields:
✓ The concept of receptive fields in CNNs is analogous to the receptive fields of neurons in
the visual cortex, representing the region of the input space that influences the activity of a
particular neuron.
✓ CNNs learn hierarchical representations by aggregating information across multiple
receptive fields, capturing spatial hierarchies of features similar to the visual processing in
the brain.
Applications of Deep Learning in Computer Vision
1. Object Detection:
✓ Deep learning techniques, particularly Convolutional Neural Networks (CNNs), have revolutionized
object detection tasks.
✓ CNN-based architectures such as Faster R-CNN, YOLO (You Only Look Once), and SSD (Single Shot
MultiBox Detector) enable accurate and real-time object detection in images and videos.
✓ Object detection finds applications in various domains, including surveillance, autonomous vehicles,
and image-based search engines.
2. Image Classification:
✓ Deep learning models excel in image classification tasks, where they classify images into predefined
categories or classes.
✓ Image classification finds applications in medical diagnosis, quality control in manufacturing, and
content-based image retrieval.
3. Image Enhancement and Restoration:
✓ Deep learning techniques are used for image enhancement and restoration tasks, including denoising,
deblurring, and super-resolution.
✓ Image enhancement and restoration techniques are applied in satellite imaging, medical imaging, and
digital photography.
4. Face Recognition:
✓ Deep learning has significantly improved the accuracy and robustness of face recognition systems.
✓ Face recognition is widely used in security systems, surveillance, biometric authentication, and social
media tagging.
5. Video Analysis:
✓ Video analysis finds applications in surveillance, sports analytics, healthcare monitoring, and video
content recommendation.
Applications of Deep Learning in Image Generation