
A Six Day AICTE ATAL Workshop
on
Convolutional Neural Networks
11th – 16th December 2023

Neural Networks to CNNs
Concept and Case studies
14th December 2023 (Thursday)

Organized by
Department of Information Technology
Sir C R Reddy Engineering College
Eluru, Andhra Pradesh, India.

sites.google.com/site/yvsm007
vishnu.murthy@manipal.edu
Dr. Vishnu Yarlagadda
Associate Professor, Department of ICT
Manipal Institute of Technology (MIT), MAHE
Yelahanka, Bengaluru, KA – 560 064, IN.
StringBuffer sb = new StringBuffer("urvishnu"); sb.append("@gmail.com");
Presentation Outline
What is Deep Learning?

Theoretical Explanation

Convolution Process

Max Pooling

Stride and Padding

Digit Recognition - Application

Case Studies

Conclusion

Research Scope
Learning Hierarchical Representations
› Image Recognition
– Pixel -> Edge -> Texton -> Motif -> Part -> Object

› Text
– Character -> Word -> Word Group -> Clause -> Sentence -> Story

› Speech
– Sample -> Spectral Band -> Sound -> Phone -> Phoneme -> Word

Human View

Human View Vs. Machine View

Color Image to RGB Matrix

What is Deep Learning?
› “Representation-learning methods with multiple levels of representation, obtained by
composing simple but non-linear modules that each transform the representation at one
level (starting with the raw input) into a representation at a higher, slightly more
abstract level.”

Ref: LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444.

Flow of Image Classification:

Convolution and Transposed Convolution

Celard, P., Iglesias, E. L., Sorribes-Fdez, J. M., Romero, R., Vieira, A. S., & Borrajo, L. (2023). A survey on deep learning applied to medical images: from simple artificial neural networks to generative models. Neural Computing and Applications, 35(3), 2291–2323.
Convolutional Neural Networks

Evolution of Neural Networks

Convolution

These are the network parameters to be learned.

6 x 6 image:
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0

Filter 1 (3 x 3):
 1 -1 -1
-1  1 -1
-1 -1  1

Filter 2 (3 x 3):
-1  1 -1
-1  1 -1
-1  1 -1

Each filter detects a small pattern (3 x 3).
Convolution, stride = 1

Sliding Filter 1 across the 6 x 6 image one pixel at a time and taking the dot product
with each 3 x 3 patch, the first two values are 3 and -1.
Convolution, stride = 2

If the stride is 2, the filter moves two pixels at a time, and the first two dot
products along the top row are 3 and -3.
Convolution, stride = 1 (Filter 1, complete output)

 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1
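The sliding dot product on the preceding slides can be reproduced with a short NumPy
sketch. As in most deep-learning libraries, the kernel is not flipped (strictly this is
a cross-correlation); convolve2d below is a local helper written for this example, not
a library call.

import numpy as np

# The 6 x 6 binary image from the slides
image = np.array([[1, 0, 0, 0, 0, 1],
                  [0, 1, 0, 0, 1, 0],
                  [0, 0, 1, 1, 0, 0],
                  [1, 0, 0, 0, 1, 0],
                  [0, 1, 0, 0, 1, 0],
                  [0, 0, 1, 0, 1, 0]])

# Filter 1: responds to a diagonal pattern of 1s
filter1 = np.array([[ 1, -1, -1],
                    [-1,  1, -1],
                    [-1, -1,  1]])

def convolve2d(img, kernel, stride=1):
    # Slide the kernel over the image (no padding) and take the dot product
    # of the kernel with each 3 x 3 patch it covers.
    kh, kw = kernel.shape
    out_h = (img.shape[0] - kh) // stride + 1
    out_w = (img.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w), dtype=int)
    for i in range(out_h):
        for j in range(out_w):
            patch = img[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)
    return out

print(convolve2d(image, filter1, stride=1))
# [[ 3 -1 -3 -1]
#  [-3  1  0 -3]
#  [-3 -3  0  1]
#  [ 3 -2 -2 -1]]

print(convolve2d(image, filter1, stride=2))
# [[ 3 -3]
#  [-3  0]]   (the first row matches the 3 and -3 shown for stride = 2)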
Convolution, stride = 1 (Filter 2): repeat this for each filter

-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3

The two 4 x 4 outputs together form the feature map: a 2 x 4 x 4 matrix.
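The same computation can also be done with SciPy's correlate2d, which with mode="valid"
performs exactly this sliding dot product; stacking the two 4 x 4 outputs gives the
2 x 4 x 4 feature map. This sketch is self-contained and independent of the helper above.

import numpy as np
from scipy.signal import correlate2d

image = np.array([[1, 0, 0, 0, 0, 1],
                  [0, 1, 0, 0, 1, 0],
                  [0, 0, 1, 1, 0, 0],
                  [1, 0, 0, 0, 1, 0],
                  [0, 1, 0, 0, 1, 0],
                  [0, 0, 1, 0, 1, 0]])
filter1 = np.array([[ 1, -1, -1], [-1,  1, -1], [-1, -1,  1]])
filter2 = np.array([[-1,  1, -1], [-1,  1, -1], [-1,  1, -1]])

# One 4 x 4 output per filter, stacked into a 2 x 4 x 4 feature map
feature_map = np.stack([correlate2d(image, f, mode="valid")
                        for f in (filter1, filter2)])
print(feature_map.shape)    # (2, 4, 4)
print(feature_map[1])
# [[-1 -1 -1 -1]
#  [-1 -1 -2  1]
#  [-1 -1 -2  1]
#  [-1  0 -4  3]]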
Color image: RGB 3 channels

A color image has three 6 x 6 channels (R, G, B), so the input is 6 x 6 x 3, and
Filter 1 and Filter 2 each become 3 x 3 x 3: one 3 x 3 slice per input channel.
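A minimal sketch of this slide's point, using made-up random data: for an RGB input each
filter spans all three channels, so a single output value is the dot product over a full
3 x 3 x 3 patch.

import numpy as np

rgb_image = np.random.randint(0, 2, size=(6, 6, 3))  # hypothetical 6 x 6 color image
rgb_filter = np.random.randn(3, 3, 3)                # one 3 x 3 x 3 filter

patch = rgb_image[0:3, 0:3, :]        # top-left 3 x 3 patch across all 3 channels
value = np.sum(patch * rgb_filter)    # 27 multiply-adds produce one output value
print(value)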
Convolution vs. Fully Connected

Convolution applies Filter 1 and Filter 2 directly to the 6 x 6 image. A fully connected
layer instead takes the image flattened into a 36-dimensional vector x1, x2, ..., x36
and connects every output to all 36 inputs.
Fewer parameters

Viewed as a neuron, the first convolution output (the value 3) is connected to only
9 of the 36 flattened inputs: pixels 1, 2, 3, 7, 8, 9, 13, 14, 15 (the top-left 3 x 3
patch), with the entries of Filter 1 as its weights. It is not fully connected, so it
needs far fewer parameters.
Shared weights

The second output (the value -1) is connected to the next 9 inputs, pixels 2, 3, 4,
8, 9, 10, 14, 15, 16, and it uses exactly the same 9 weights (Filter 1). Sharing the
weights across positions means even fewer parameters.
The whole CNN

Input image
  -> Convolution
  -> Max Pooling        (these two steps can be repeated many times)
  -> Convolution
  -> Max Pooling
  -> Flattened
  -> Fully Connected Feedforward network
  -> output (cat, dog, ...)
Max Pooling

The 4 x 4 outputs of the two filters:

Filter 1 output:      Filter 2 output:
 3 -1 -3 -1           -1 -1 -1 -1
-3  1  0 -3           -1 -1 -2  1
-3 -3  0  1           -1 -1 -2  1
 3 -2 -2 -1           -1  0 -4  3

Max pooling keeps only the largest value in each 2 x 2 block of these outputs.
Why Pooling

• Subsampling pixels does not change the object: a subsampled bird is still a bird.
• We can therefore subsample the pixels to make the image smaller, leaving fewer
  parameters to characterize the image.

A CNN compresses a fully connected network in two ways:
• Reducing the number of connections
• Sharing weights on the edges
Max pooling further reduces the complexity.
Max Pooling

Applying convolution and then 2 x 2 max pooling to the 6 x 6 image produces a new,
smaller image: one 2 x 2 image per filter. For Filter 1 the pooled values are
3 0
3 1
and for Filter 2 they are
-1 1
 0 3
Each filter is a channel of the new image.
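A minimal NumPy sketch of 2 x 2 max pooling on the Filter 1 output; the reshape trick
groups the 4 x 4 map into non-overlapping 2 x 2 blocks and keeps the maximum of each.

import numpy as np

# 4 x 4 output of Filter 1 from the convolution step
conv_out = np.array([[ 3, -1, -3, -1],
                     [-3,  1,  0, -3],
                     [-3, -3,  0,  1],
                     [ 3, -2, -2, -1]])

def max_pool_2x2(x):
    h, w = x.shape
    # Split into 2 x 2 blocks, then take the maximum inside each block
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

print(max_pool_2x2(conv_out))
# [[3 0]
#  [3 1]]   (the Filter 2 map pools to [[-1 1] [0 3]] in the same way)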
The whole CNN

Convolution followed by Max Pooling produces a new image (here the two 2 x 2 channels
3 0 / 3 1 and -1 1 / 0 3). The new image is smaller than the original, and its number
of channels equals the number of filters. Convolution and Max Pooling can be repeated
many times.
The whole CNN

Input image
  -> Convolution
  -> Max Pooling        (a new image)
  -> Convolution
  -> Max Pooling        (a new image)
  -> Flattened
  -> Fully Connected Feedforward network
  -> output (cat, dog, ...)
Flattening

The 2 x 2 outputs of the two filters,
 3 0        -1 1
 3 1         0 3
are flattened into a single 1-D vector, which is fed into a fully connected
feedforward network.
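Flattening is just a reshape: the two 2 x 2 channels become one 8-dimensional vector.
The ordering below is NumPy's default row-major order; the exact ordering a framework
uses may differ, but it only has to be consistent.

import numpy as np

pooled = np.array([[[ 3, 0],       # channel from Filter 1
                    [ 3, 1]],
                   [[-1, 1],       # channel from Filter 2
                    [ 0, 3]]])

flat = pooled.flatten()   # 1-D vector fed into the fully connected feedforward network
print(flat)               # [ 3  0  3  1 -1  1  0  3]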
CNN in Keras

Only the network structure and the input format change (vector -> 3-D tensor).

Input_shape = (28, 28, 1)
  28 x 28 pixels; 1 channel for black/white, 3 for RGB

Convolution: there are 25 filters, each 3 x 3
Max Pooling
Convolution
Max Pooling
CNN in Keras

Only the network structure and the input format change (vector -> 3-D array).

Input: 1 x 28 x 28
Convolution   -> 25 x 26 x 26   (how many parameters per filter? 9)
Max Pooling   -> 25 x 13 x 13
Convolution   -> 50 x 11 x 11   (how many parameters per filter? 25 x 9 = 225)
Max Pooling   -> 50 x 5 x 5
CNN in Keras

Only the network structure and the input format change (vector -> 3-D array).

Input: 1 x 28 x 28
Convolution   -> 25 x 26 x 26
Max Pooling   -> 25 x 13 x 13
Convolution   -> 50 x 11 x 11
Max Pooling   -> 50 x 5 x 5
Flattened     -> 1250
Fully connected feedforward network -> Output
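A minimal Keras sketch matching the shapes on these slides (Keras uses channels-last,
so the input is 28 x 28 x 1 rather than 1 x 28 x 28). The activations and the sizes of
the final Dense layers are illustrative assumptions, not taken from the slides.

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),               # 28 x 28 grayscale digit image
    layers.Conv2D(25, (3, 3), activation="relu"),  # -> 26 x 26 x 25, 9 weights per filter
    layers.MaxPooling2D((2, 2)),                   # -> 13 x 13 x 25
    layers.Conv2D(50, (3, 3), activation="relu"),  # -> 11 x 11 x 50, 25 x 9 = 225 weights per filter
    layers.MaxPooling2D((2, 2)),                   # -> 5 x 5 x 50
    layers.Flatten(),                              # -> 1250-dimensional vector
    layers.Dense(100, activation="relu"),          # fully connected feedforward part (size assumed)
    layers.Dense(10, activation="softmax"),        # 10 output classes for digit recognition
])
model.summary()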
AlphaGo

Input: the board as a 19 x 19 matrix (black: 1, white: -1, none: 0).
Output of the neural network: the next move (19 x 19 positions).

A fully-connected feedforward network could be used, but a CNN performs much better.

AlphaGo's policy network is described in their Nature article.
Note: AlphaGo does not use Max Pooling.
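A small sketch of the board encoding described above (the stone positions are made up):

import numpy as np

board = np.zeros((19, 19), dtype=np.int8)   # empty points are 0
board[3, 3] = 1                             # a black stone -> 1
board[15, 15] = -1                          # a white stone -> -1

# Shaped as a batch of one 19 x 19 single-channel "image" for a CNN policy network
network_input = board.reshape(1, 19, 19, 1)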
CNN in speech recognition

The input "image" is the spectrogram of the speech signal (frequency on one axis, time
on the other), and the CNN filters move in the frequency direction.
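A hedged sketch of this idea: treat the spectrogram as an image of shape
(frequency bins, time frames, 1) and give the convolution a kernel that covers the full
time window, so it can only slide along the frequency axis. All sizes here are
illustrative assumptions.

import numpy as np
from tensorflow.keras import layers, models

freq_bins, time_frames = 128, 40
spectrogram = np.random.randn(1, freq_bins, time_frames, 1).astype("float32")

model = models.Sequential([
    layers.Input(shape=(freq_bins, time_frames, 1)),
    # Kernel spans 8 frequency bins and all 40 time frames,
    # so the filter moves only in the frequency direction.
    layers.Conv2D(16, kernel_size=(8, time_frames)),
])
print(model(spectrogram).shape)   # (1, 121, 1, 16)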
THANK YOU

Programs must be written for people to read, and only incidentally for machines to execute.

