Theoretical Explanation
Convolution Process
Max Pooling
Case Studies
Conclusion
Research Scope
Learning Hierarchical Representations
› Image Recognition
  – Pixel -> Edge -> Texton -> Motif -> Part -> Object
› Text
  – Character -> Word -> Word Group -> Clause -> Sentence -> Story
› Speech
  – Sample -> Spectral Band -> Sound -> Phone -> Phoneme -> Word
DEC 28, 2023
Human View
Human View Vs. Machine View
Color Image to RGB Matrix
What is Deep Learning?
› “Representation-learning methods with multiple levels of representation, obtained by composing simple but non-linear modules that each transform the representation at one level (starting with the raw input) into a representation at a higher, slightly more abstract level.”
Ref: LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
Flow of Image Classification:
Convolution and Transposed Convolution
Ref: Celard, P., Iglesias, E. L., Sorribes-Fdez, J. M., Romero, R., Vieira, A. S., & Borrajo, L. (2023). A survey on deep learning applied to medical images: from simple artificial neural networks to generative models. Neural Computing and Applications, 35(3), 2291-2323.
Convolutional Neural Networks
Evolution of Neural Networks
Convolution

These are the network parameters to be learned: each filter detects a small pattern (3 x 3).

6 x 6 image:      Filter 1:      Filter 2:
1 0 0 0 0 1        1 -1 -1       -1  1 -1
0 1 0 0 1 0       -1  1 -1       -1  1 -1
0 0 1 1 0 0       -1 -1  1       -1  1 -1
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0
Convolution, stride=1

Placing Filter 1 on the top-left 3 x 3 patch of the 6 x 6 image and taking the dot product gives 3; sliding one column to the right gives -1:

3 -1 ...
Convolution, if stride=2

With stride 2 the filter moves two columns at a time, so the first output row of Filter 1 is:

3 -3
Convolution, stride=1

Sliding Filter 1 over the whole 6 x 6 image produces a 4 x 4 output:

 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1
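The 4 x 4 output above can be reproduced with a few lines of NumPy. This is a minimal sketch of "valid" cross-correlation (what deep-learning libraries call convolution) with the stride exposed as a parameter; conv2d is a made-up helper name, not from the slides.

```python
# Minimal sketch of the slide's convolution: "valid" cross-correlation of
# the 6x6 binary image with the 3x3 Filter 1 (conv2d is a made-up helper).
import numpy as np

image = np.array([
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
])
filter1 = np.array([
    [ 1, -1, -1],
    [-1,  1, -1],
    [-1, -1,  1],
])

def conv2d(img, kernel, stride=1):
    """Slide the kernel over the image, taking a dot product at each position."""
    k = kernel.shape[0]
    n = (img.shape[0] - k) // stride + 1
    return np.array([[np.sum(img[i*stride:i*stride+k, j*stride:j*stride+k] * kernel)
                      for j in range(n)] for i in range(n)])

print(conv2d(image, filter1))            # 4 x 4 map; first row: 3 -1 -3 -1
print(conv2d(image, filter1, stride=2))  # 2 x 2 map; first row: 3 -3
```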
Convolution, stride=1: repeat this for each filter

Filter 2 over the same 6 x 6 image gives a second 4 x 4 feature map:

-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3

Two 4 x 4 images, forming a 2 x 4 x 4 matrix.
Color image: RGB 3 channels

A color image has 3 values per pixel, so the input is a 3 x 6 x 6 tensor. Each filter is correspondingly 3 channels deep (3 x 3 x 3): Filter 1 and Filter 2 each slide over all three channels at once.
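A toy sketch can make the point that a 3-channel filter still produces a single feature map: the channel dimension is summed out. The image values and filter weights below are made up for illustration, not from the slides.

```python
# Toy sketch of multi-channel convolution: one 3 x 3 x 3 filter spans all
# RGB channels of a 3 x 6 x 6 image and yields a single 4 x 4 feature map.
# (Image values and filter weights are made up for illustration.)
import numpy as np

rgb_image = np.ones((3, 6, 6))     # channels-first: 3 channels, 6 x 6 each
rgb_filter = np.zeros((3, 3, 3))   # one filter, 3 channels deep
rgb_filter[0, 1, 1] = 1.0          # single weight: center pixel of the R channel

out = np.array([[np.sum(rgb_image[:, i:i+3, j:j+3] * rgb_filter)
                 for j in range(4)] for i in range(4)])
print(out.shape)  # (4, 4): channels are summed, not stacked
```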
Convolution v.s. Fully Connected

Convolution is a fully-connected layer with most connections removed. Flatten the 6 x 6 image into inputs x1, x2, ..., x36: a fully-connected neuron would connect to all 36 inputs, while a convolution output connects only to the 9 inputs under the filter.
Fewer parameters!

Number the flattened 6 x 6 image 1 to 36. The first output of Filter 1 (value 3) connects only to 9 inputs (1, 2, 3, 7, 8, 9, 13, 14, 15), not to all 36, so it needs far fewer parameters than a fully-connected neuron.

Even fewer parameters: shared weights

The next output (value -1) connects to inputs 2, 3, 4, 8, 9, 10, 14, 15, 16 using the very same 9 weights, so all outputs of one filter share a single set of parameters.
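The counting argument above can be made concrete with simple arithmetic, assuming no bias terms and the 4 x 4 "valid" output from the earlier slides:

```python
# Parameter counts for the 6 x 6 image and one 3 x 3 filter (no biases,
# "valid" convolution producing a 4 x 4 output of 16 neurons).
inputs = 6 * 6    # 36 flattened pixels
outputs = 4 * 4   # 16 output neurons

fully_connected   = inputs * outputs  # every output sees every input
locally_connected = 9 * outputs       # each output sees only 9 inputs
shared_weights    = 9                 # one filter: all 16 outputs share 9 weights

print(fully_connected, locally_connected, shared_weights)  # 576 144 9
```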
The Whole CNN

image -> Convolution -> Max Pooling -> (can repeat many times) -> Flattened -> Fully Connected Feedforward network -> cat, dog, ...
Max Pooling

Outputs of Filter 1 and Filter 2:

 3 -1 -3 -1      -1 -1 -1 -1
-3  1  0 -3      -1 -1 -2  1
-3 -3  0  1      -1 -1 -2  1
 3 -2 -2 -1      -1  0 -4  3
Why Pooling

• Subsampling pixels will not change the object: a subsampled bird is still a bird.

2 x 2 max pooling keeps the maximum of each 2 x 2 block, turning the 4 x 4 conv output into a new but smaller 2 x 2 image:

Conv output (Filter 1):   After 2 x 2 max pooling:
 3 -1 -3 -1               3 0
-3  1  0 -3               3 1
-3 -3  0  1
 3 -2 -2 -1
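A minimal sketch of that pooling step (max_pool is a made-up helper, assuming the pool size divides the map size evenly):

```python
# 2 x 2 max pooling over the 4 x 4 feature map from the slides:
# keep the maximum of each non-overlapping 2 x 2 block.
import numpy as np

feature_map = np.array([
    [ 3, -1, -3, -1],
    [-3,  1,  0, -3],
    [-3, -3,  0,  1],
    [ 3, -2, -2, -1],
])

def max_pool(fmap, size=2):
    n = fmap.shape[0] // size
    return np.array([[fmap[i*size:(i+1)*size, j*size:(j+1)*size].max()
                      for j in range(n)] for i in range(n)])

print(max_pool(feature_map))  # [[3 0] [3 1]]
```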
The Whole CNN

Each filter is a channel. After convolution and max pooling we get a new image, smaller than the original, with as many channels as there are filters:

Channel 1:   Channel 2:
3 0          -1 1
3 1           0 3

Convolution and Max Pooling can repeat many times; the final maps are Flattened into a vector and fed to a Fully Connected Feedforward network.
CNN in Keras

Only the network structure and the input format (vector -> 3-D tensor) are modified.

input_shape = (28, 28, 1)
Convolution: there are 25 3 x 3 filters.
Max Pooling
Convolution
Max Pooling
CNN in Keras (vector -> 3-D array)

Input: 1 x 28 x 28
Convolution -> 25 x 26 x 26 (how many parameters for each filter? 9)
Max Pooling -> 25 x 13 x 13
Convolution -> 50 x 11 x 11 (how many parameters for each filter? 225 = 25 x 9)
Max Pooling -> 50 x 5 x 5
CNN in Keras (vector -> 3-D array)

Input: 1 x 28 x 28
Convolution -> 25 x 26 x 26
Max Pooling -> 25 x 13 x 13
Convolution -> 50 x 11 x 11
Max Pooling -> 50 x 5 x 5
Flattened -> 1250 -> Fully connected feedforward network -> Output
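The shape and parameter bookkeeping on these slides can be checked with a small pure-Python sketch, assuming "valid" 3 x 3 convolutions with stride 1 and 2 x 2 max pooling (the helper names are made up):

```python
# Shape bookkeeping for the slides' Keras-style CNN: "valid" 3x3 convolutions
# (stride 1) and 2x2 max pooling, channels-first shapes (c, h, w).
def conv_shape(c_in, h, w, n_filters, k=3):
    # each filter spans all c_in input channels: c_in * k * k weights per filter
    return (n_filters, h - k + 1, w - k + 1), c_in * k * k

def pool_shape(c, h, w, size=2):
    return (c, h // size, w // size)

shape, p1 = conv_shape(1, 28, 28, 25)                     # (25, 26, 26), 9 weights/filter
shape = pool_shape(*shape)                                # (25, 13, 13)
shape, p2 = conv_shape(shape[0], shape[1], shape[2], 50)  # (50, 11, 11), 225 = 25 x 9
shape = pool_shape(*shape)                                # (50, 5, 5)
flattened = shape[0] * shape[1] * shape[2]

print(shape, flattened, p1, p2)  # (50, 5, 5) 1250 9 225
```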
AlphaGo

A Neural Network maps the board (a 19 x 19 matrix: black = 1, white = -1, none = 0) to the next move (19 x 19 positions). A fully-connected feedforward network can be used, but a CNN performs much better.
AlphaGo’s policy network
The following is a quotation from their Nature article:
Note: AlphaGo does not use Max Pooling.
CNN in Speech Recognition

A spectrogram can be treated as an image (one axis is time, the other frequency) and fed to a CNN.
THANK YOU
"Programs must be written for people to read, and only incidentally for machines to execute."