Deep Learning
Department of Computer Science and Engineering
Pranveer Singh Institute of Technology, Kanpur
Dimensionality Reduction
Dimensionality reduction refers to reducing the number of input variables (features) while
preserving as much important information (variance or class separability) as possible. It helps:
● Reduce computational cost
● Avoid overfitting
● Improve visualization
● Remove redundant/noisy features
There are two broad types:
● Linear methods (e.g., PCA, LDA)
● Non-linear (manifold) methods (e.g., t-SNE, Isomap)
Principal Component Analysis (PCA)
🧠 Intuition:
Imagine a cloud of points in 3D space that lies roughly along a line.
Although it’s 3D, most variation is along that line — so we can represent it using just one
variable instead of three.
⚙ Steps:
1. Standardize data: Subtract mean, divide by standard deviation.
2. Compute Covariance Matrix: Measures how features vary together.
3. Find Eigenvectors & Eigenvalues:
○ Eigenvectors = principal component directions
○ Eigenvalues = amount of variance captured
4. Sort and Select: Keep top-k eigenvectors with largest eigenvalues.
5. Transform: Project the original data onto these axes.
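A minimal NumPy sketch of these five steps, assuming a data matrix X of shape (samples, features); the height/weight numbers below are only illustrative:

```python
import numpy as np

def pca(X, k):
    """Project data X (n_samples x n_features) onto its top-k principal components."""
    # 1. Standardize: subtract mean, divide by standard deviation
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)
    # 2. Covariance matrix of the standardized features
    cov = np.cov(X_std, rowvar=False)
    # 3. Eigen-decomposition (eigh works because the covariance matrix is symmetric)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # 4. Sort eigenvectors by decreasing eigenvalue and keep the top k
    order = np.argsort(eigenvalues)[::-1][:k]
    components = eigenvectors[:, order]
    # 5. Transform: project the standardized data onto the selected axes
    return X_std @ components

# Example: 1000 people, correlated height and weight, reduced from 2D to 1D
rng = np.random.default_rng(0)
height = rng.normal(170, 10, 1000)
weight = 0.9 * height + rng.normal(0, 5, 1000)
X = np.column_stack([height, weight])
Z = pca(X, k=1)   # shape (1000, 1): one "body size" component
```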
Principal Component Analysis (PCA)
📘 Example:
Suppose we have height and weight data for 1000 people.
● Height and weight are correlated.
● PCA may find that one component (body size) explains 95% of the variation.
● We can reduce from 2D → 1D without losing much information.
📊 Application:
● Image compression (e.g., compressing 1024-pixel images to 100 features).
● Noise removal.
● Visualizing high-dimensional data in 2D (for clustering).
Linear Discriminant Analysis (LDA)
Type: Linear, Supervised
Goal: Reduce dimensions while maximizing class separability.
🧠 Intuition:
While PCA looks for directions of maximum variance,
LDA looks for directions that best separate classes.
⚙ Steps:
1. Compute mean of each class.
2. Compute within-class scatter (how spread out data is within a class).
3. Compute between-class scatter (how far apart class means are).
4. Find the projection matrix W that maximizes the ratio of between-class to within-class scatter (the Fisher criterion): J(W) = |Wᵀ S_B W| / |Wᵀ S_W W|
5. Project the data into the lower dimension using W.
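A short sketch using scikit-learn's LinearDiscriminantAnalysis on the Iris data; the dataset choice is only for illustration, and with C classes LDA yields at most C − 1 components:

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Iris: 4 features, 3 classes -> at most (classes - 1) = 2 discriminant axes
X, y = load_iris(return_X_y=True)

lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)   # supervised: uses the class labels y

print(X_lda.shape)                    # (150, 2)
print(lda.explained_variance_ratio_)  # share of between-class variance per axis
```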
Linear Discriminant Analysis (LDA)
📘 Example:
In a face recognition dataset with multiple people:
● LDA projects each face image into a lower-dimensional space
where faces of the same person are close together, and
different people are far apart.
Manifold Learning
Type: Non-linear
Goal: When data lies on a curved surface (manifold) in high-dimensional space, manifold
learning “unfolds” it.
🧠 Intuition:
Think of a “Swiss roll” — a 2D sheet rolled in 3D space.
PCA would fail because it’s linear, but manifold learning can “unwrap” it into a flat 2D
surface.
📚 Common Techniques:
● Isomap: Preserves geodesic (curved) distances using nearest neighbors.
● t-SNE: Preserves local neighborhoods; great for 2D visualization of clusters.
● LLE (Locally Linear Embedding): Keeps local linear relationships among data points.
📘 Example:
In image datasets, all images of digits "2" may lie on one curved manifold, and "3" on
another — manifold learning helps visualize these relationships.
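A short scikit-learn sketch of t-SNE and Isomap on the 8×8 digits dataset; parameter values such as perplexity=30 and n_neighbors=10 are illustrative defaults, not tuned settings:

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE, Isomap

# 8x8 digit images -> 64-dimensional vectors
X, y = load_digits(return_X_y=True)

# t-SNE: preserves local neighborhoods, good for 2-D visualization of clusters
X_tsne = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# Isomap: preserves geodesic distances estimated from a nearest-neighbor graph
X_iso = Isomap(n_neighbors=10, n_components=2).fit_transform(X)

print(X_tsne.shape, X_iso.shape)   # (1797, 2) (1797, 2)
```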
Metric Learning
Goal: Learn a distance function that reflects semantic similarity.
🧠 Intuition:
Euclidean distance may not always reflect similarity.
Metric learning teaches the model what “similar” means in context.
📘 Example:
● Face Recognition: Siamese networks learn embeddings such that:
○ Same person → small distance.
○ Different persons → large distance.
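A minimal PyTorch sketch of this Siamese idea using a contrastive loss; the EmbeddingNet layer sizes and the margin value are illustrative assumptions, not a specific published model:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Illustrative encoder mapping 784-dim inputs to 32-dim embeddings."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))

    def forward(self, x):
        return self.net(x)

def contrastive_loss(z1, z2, same, margin=1.0):
    """same = 1 for pairs of the same person, 0 otherwise."""
    d = F.pairwise_distance(z1, z2)
    # Same person -> pull embeddings together; different -> push apart beyond the margin
    return (same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2)).mean()

# Toy usage with random pairs
net = EmbeddingNet()
x1, x2 = torch.randn(8, 784), torch.randn(8, 784)
same = torch.randint(0, 2, (8,)).float()
loss = contrastive_loss(net(x1), net(x2), same)
```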
Autoencoders and Dimensionality Reduction in Neural Networks
An Autoencoder is a neural network that learns to compress and then reconstruct
input data.
🔹 Architecture:
Input → Encoder → Bottleneck → Decoder → Output
● Encoder: Reduces dimension to a compressed representation.
● Bottleneck: The low-dimensional “code.”
● Decoder: Reconstructs original data from code.
🧠 Intuition:
Like PCA, but nonlinear — can capture complex patterns.
Autoencoders and Dimensionality Reduction in Neural Networks
📘 Example:
For an image of 28×28 pixels (784 inputs):
● Encoder compresses it to 64 features.
● Decoder reconstructs the 784-pixel image from those 64 features.
● The 64-feature vector is a non-linear reduced representation.
⚙ Variants:
● Denoising Autoencoder: Learns to remove noise (input: noisy image → output: clean
image).
● Sparse Autoencoder: Forces most hidden units to be inactive → compact encoding.
● Variational Autoencoder (VAE): Learns probability distributions in the latent space; used for generative modeling.
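A minimal PyTorch sketch matching the 784 → 64 → 784 example above; the intermediate layer sizes and the MSE reconstruction loss are illustrative choices:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """784 -> 64 -> 784, matching the 28x28 image example (layer sizes are illustrative)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 64))
        self.decoder = nn.Sequential(nn.Linear(64, 256), nn.ReLU(),
                                     nn.Linear(256, 784), nn.Sigmoid())

    def forward(self, x):
        code = self.encoder(x)      # 64-dim bottleneck: the reduced representation
        return self.decoder(code)   # reconstruction of the 784-dim input

model = Autoencoder()
x = torch.rand(32, 784)             # a batch of flattened 28x28 images
loss = nn.MSELoss()(model(x), x)    # reconstruction error drives training
loss.backward()
```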
Introduction to Convolutional Neural Networks (ConvNets)
CNNs are designed to process data with spatial structure, like images (height × width × color
channels).
🔹 Key Idea:
Instead of connecting every input neuron to every output neuron (as in dense layers),
CNNs use local connections called filters (kernels) to detect patterns.
Layers in CNN:
Convolution Layer
○ Applies filters to extract features like edges or corners.
○ Example: A 3×3 kernel slides over an image, computing weighted sums.
Introduction to Convolutional Neural Networks (ConvNets)
Activation (ReLU)
○ Adds non-linearity: f(x) = max(0, x).
○ Helps model complex patterns.
Pooling Layer
○ Reduces spatial size (e.g., 2×2 max pooling).
○ Makes features more translation invariant.
Fully Connected Layer
○ Flattens feature maps and performs classification.
Example: For a 28×28 grayscale image:
● Conv1 → detects edges
● Conv2 → detects corners or shapes
● Conv3 → detects object parts
● Fully Connected → outputs “This is a cat.”
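A minimal PyTorch sketch of this Conv → ReLU → Pool → Fully Connected pipeline for 28×28 grayscale inputs; the filter counts and num_classes=10 are illustrative assumptions:

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    """Conv -> ReLU -> Pool blocks followed by a fully connected classifier."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 14x14 -> 7x7
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),                   # deeper features
        )
        self.classifier = nn.Linear(64 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)          # flatten feature maps for the dense layer
        return self.classifier(x)

model = SimpleCNN()
logits = model(torch.randn(8, 1, 28, 28))   # batch of 8 grayscale 28x28 images
print(logits.shape)                          # torch.Size([8, 10])
```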
CNN Architecture
a) AlexNet (2012)
● 8 layers (5 conv + 3 FC).
● ReLU activations (faster than sigmoid/tanh).
● Used Dropout to avoid overfitting.
● Trained on GPU — a breakthrough for deep learning.
Example: Classified 1.2 million ImageNet images into 1000 categories.
Impact: Sparked the modern deep learning revolution.
AlexNet (2012)
This was the first architecture that used GPU to boost the training performance. AlexNet consists of 5 convolution
layers, 3 max-pooling layers, 2 Normalized layers, 2 fully connected layers and 1 SoftMax layer. Each convolution
layer consists of a convolution filter and a non-linear activation function called “ReLU”. The pooling layers are used
to perform the max-pooling function and the input size is fixed due to the presence of fully connected layers. The input
size is mentioned at most of the places as 224x224x3 but due to some padding which happens it works out to be
227x227x3. Above all this AlexNet has over 60 million parameters.
CNN Architecture
b) VGGNet (2014)
● Simple, uniform architecture.
● Only 3×3 convolution filters used repeatedly.
● 16–19 layers deep.
● Great performance but huge parameters (~138M).
Example: Used widely as a feature extractor in computer vision.
VGGNet
● Inputs: VGGNet accepts 224×224-pixel images as input. To maintain a consistent input size for the ImageNet competition, the model's developers cropped out the central 224×224 patch of each image.
● Convolutional Layers: The VGG convolutional layers use the smallest feasible receptive field, 3×3, to capture left-to-right and up-to-down patterns. In addition, 1×1 convolution filters are used to transform the input linearly. Each convolution is followed by a ReLU unit, which shortens training time compared with saturating activations such as sigmoid or tanh. ReLU (rectified linear unit) is a piecewise linear function that outputs the input if the input is positive and zero otherwise. The convolution stride is fixed at 1 pixel so that spatial resolution is preserved after convolution (the stride is the number of pixels the filter shifts over the input matrix).
VGGNet
● Hidden Layers: The VGG network's hidden layers all use ReLU. Local Response Normalization (LRN) is typically not used with VGG, as it increases memory usage and training time without improving overall accuracy.
● Fully Connected Layers: VGGNet contains three fully connected layers. The first two have 4096 channels each, while the third has 1000 channels, one for each class.
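A minimal PyTorch sketch of VGG's repeated 3×3-convolution pattern; the vgg_block helper and the channel counts shown are illustrative, following the 64, 128, ... progression of VGG-16:

```python
import torch
import torch.nn as nn

def vgg_block(in_ch, out_ch, num_convs):
    """Stack of 3x3 convolutions (stride 1, padding 1) + ReLU, ended by 2x2 max pooling."""
    layers = []
    for _ in range(num_convs):
        layers += [nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=1, padding=1), nn.ReLU()]
        in_ch = out_ch
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))   # halves height and width
    return nn.Sequential(*layers)

# First two stages of a VGG-16-style network
stem = nn.Sequential(vgg_block(3, 64, 2), vgg_block(64, 128, 2))
out = stem(torch.randn(1, 3, 224, 224))
print(out.shape)   # torch.Size([1, 128, 56, 56]) -- 224 halved twice by the two pools
```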
CNN Architecture
Inception (GoogLeNet)
● Introduced Inception modules combining multiple filter sizes (1×1, 3×3, 5×5) in
parallel.
● 1×1 convolutions used for dimensionality reduction to reduce computation.
● Deeper but more efficient than VGG.
Inception (GoogLeNet)
Inception Module (GoogLeNet Style)
The Inception module runs multiple convolution paths in parallel, then concatenates the outputs
depth-wise.
Inception (GoogLeNet)
Purpose of each branch
● 1×1 Conv branch: keeps local details, cheap computation
● 1×1 → 3×3 Conv: medium-size receptive field
● 1×1 → 5×5 Conv: larger receptive field
● Pool → 1×1 Conv: captures background + reduces spatial variance
Why 1×1 conv?
● Reduces number of channels
● Adds non-linearity
● Makes 3×3 and 5×5 branches much cheaper
Inception (GoogLeNet)
Example: One Inception Module with Channel Sizes
Suppose the input feature map has 256 channels. Because the branch outputs are concatenated depth-wise, the output depth equals the sum of the channels produced by the four branches, as the sketch below shows.
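A minimal PyTorch sketch of such a module; the branch widths used below (64 + 128 + 32 + 32) are illustrative assumptions rather than the exact GoogLeNet values:

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    """Four parallel branches whose outputs are concatenated along the channel axis."""
    def __init__(self, in_ch, c1, c3_reduce, c3, c5_reduce, c5, pool_proj):
        super().__init__()
        self.b1 = nn.Sequential(nn.Conv2d(in_ch, c1, 1), nn.ReLU())
        self.b2 = nn.Sequential(nn.Conv2d(in_ch, c3_reduce, 1), nn.ReLU(),
                                nn.Conv2d(c3_reduce, c3, 3, padding=1), nn.ReLU())
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, c5_reduce, 1), nn.ReLU(),
                                nn.Conv2d(c5_reduce, c5, 5, padding=2), nn.ReLU())
        self.b4 = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(in_ch, pool_proj, 1), nn.ReLU())

    def forward(self, x):
        # Output depth = c1 + c3 + c5 + pool_proj
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)

# 256 input channels; branch widths 64 + 128 + 32 + 32 -> output depth 256
module = InceptionModule(256, c1=64, c3_reduce=96, c3=128, c5_reduce=16, c5=32, pool_proj=32)
print(module(torch.randn(1, 256, 28, 28)).shape)   # torch.Size([1, 256, 28, 28])
```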
CNN Architecture
ResNet (2016)
● Introduced skip connections (residual blocks).
● Solved vanishing gradient problem by allowing gradients to flow directly through identity
connections.
● Enabled training of extremely deep networks (up to 152 layers).
Example: Became standard for most vision tasks (object detection, segmentation, etc.).
ResNet (2016)
Key Innovation → Skip Connections (Residual Learning)
The core idea is:
y = F(x) + x
where
● x = input to the residual block
● F(x) = output of the stacked convolutional layers
● x is added directly to F(x) through an identity (skip) connection
This solves the vanishing gradient problem and allows very deep networks (50, 101, 152 layers).
ResNet (2016)
1. Residual Block (Basic Block: for ResNet-18/34)
Block Structure
Input → Conv3×3 → BN → ReLU → Conv3×3 → BN → Add(Input) → ReLU → Output
2. Bottleneck Residual Block (for ResNet-50/101/152)
Uses 1×1 → 3×3 → 1×1 convolutions for efficiency.
Block Structure
Input → 1×1 → 3×3 → 1×1 → Add(Input) → ReLU
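A minimal PyTorch sketch of the basic residual block, assuming the input and output shapes match (no downsampling or projection shortcut):

```python
import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    """ResNet-18/34 style block: Conv3x3 -> BN -> ReLU -> Conv3x3 -> BN -> add input -> ReLU."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)   # skip connection: F(x) + x

block = BasicBlock(64)
y = block(torch.randn(1, 64, 56, 56))
print(y.shape)   # torch.Size([1, 64, 56, 56])
```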
ResNet (2016)
Full ResNet Architecture Construction:
ResNet-50
Input: 224×224 RGB
Stage 1: Conv 7×7, 64 filters, stride 2
MaxPool 3×3, stride 2
Stage 2: [1×1, 64; 3×3, 64; 1×1, 256] × 3 blocks
Stage 3: [1×1,128; 3×3,128; 1×1,512] × 4 blocks
Stage 4: [1×1,256; 3×3,256; 1×1,1024] × 6 blocks
Stage 5: [1×1,512; 3×3,512; 1×1,2048] × 3 blocks
Global Average Pool
FC 1000 (Softmax)
ResNet (2016)
ResNet-101
Same as ResNet-50, except:
Stage 4: × 23 blocks (instead of 6)
ResNet-152
Stage 2: × 3
Stage 3: × 8
Stage 4: × 36
Stage 5: × 3
Why ResNet Is Powerful
✔ Skip connections allow gradient flow → prevents vanishing gradients
✔ Enables extremely deep models (152+ layers)
✔ Became the backbone for:
● Object detection (Faster R-CNN, Mask R-CNN)
● Image segmentation (DeepLab)
● Feature extraction in many CV tasks
✔ Very stable training even at large depth
Training a ConvNet
a) Weight Initialization
Proper initialization ensures stable gradients.
Example: Poor initialization → exploding or vanishing gradients → model doesn’t converge.
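A short PyTorch sketch of one common choice, He (Kaiming) initialization for ReLU networks; the small model it is applied to is only illustrative:

```python
import torch.nn as nn

def init_weights(module):
    """He (Kaiming) initialization, well suited to ReLU networks."""
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.kaiming_normal_(module.weight, nonlinearity='relu')
        if module.bias is not None:
            nn.init.zeros_(module.bias)

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(),
                      nn.Flatten(), nn.Linear(16 * 30 * 30, 10))
model.apply(init_weights)   # applies the function to every submodule
```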
Training a ConvNet
b) Batch Normalization
● Normalizes outputs of each layer to have mean=0, variance=1.
● Reduces internal covariate shift.
● Allows higher learning rates and faster convergence.
● Acts like a regularizer.
Example: Without BatchNorm, training deep CNNs may oscillate or diverge.
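A small PyTorch sketch showing BatchNorm2d after a convolution; the layer sizes are arbitrary:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3, padding=1, bias=False)  # bias is redundant before BN
bn = nn.BatchNorm2d(16)   # normalizes each of the 16 channels across the batch

x = torch.randn(8, 3, 32, 32)
h = bn(conv(x))
# In training mode, each channel of h has approximately zero mean and unit variance
print(h.mean(dim=(0, 2, 3)))   # per-channel means, all close to 0
```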
Training a ConvNet
c) Hyperparameter Optimization
Hyperparameters = settings not learned by the network (e.g., learning rate, batch size, optimizer, number of layers).
Training a ConvNet
Optimization Methods:
1. Grid Search: Try all combinations (costly).
2. Random Search: Randomly sample combinations.
3. Bayesian Optimization: Smartly explore promising regions.
4. AutoML / HyperOpt / Optuna: Automated tuning.
✅ Example Workflow:
Building a CNN for dog–cat classification:
1. Use pretrained ResNet50.
2. Fine-tune with He initialization.
3. Apply BatchNorm + Dropout.
4. Tune learning rate, batch size, and optimizer.
5. Evaluate on test data.
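A hedged PyTorch/torchvision sketch of this workflow, assuming torchvision's pretrained ResNet-50 weights are available; the dropout rate, learning rate, and random batch are illustrative choices:

```python
import torch
import torch.nn as nn
from torchvision import models

# 1. Start from a pretrained ResNet-50
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# 2-3. Replace the 1000-class head with a 2-class dog/cat head, add Dropout,
#      and apply He initialization to the new layer
model.fc = nn.Sequential(nn.Dropout(0.5), nn.Linear(model.fc.in_features, 2))
nn.init.kaiming_normal_(model.fc[1].weight, nonlinearity='relu')

# 4. Hyperparameters to tune: learning rate, batch size, optimizer choice
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a random batch (replace with real dog/cat data)
x, y = torch.randn(16, 3, 224, 224), torch.randint(0, 2, (16,))
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```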