Lecture 08-09
Learning Outcomes
• Neural Networks
• Different Architectures of Neural Networks
• Backpropagation Algorithm
• Gradient Descent Approach
• Deep Learning
• Convolutional Neural Networks
Biological Inspirations
• Some numbers…
– The human brain contains about 10 billion nerve cells (neurons)
– Each neuron is connected to other neurons through about 10,000 synapses
• Properties of the brain
– It can learn, reorganize itself from experience
– It adapts to the environment
– It is robust and fault tolerant
Biological Neuron
• A neuron has
– A branching input (dendrites)
– A branching output (the axon)
• The information circulates from the dendrites to the axon via the cell body
• Axon connects to dendrites via synapses
– Synapses vary in strength
– Synapses may be excitatory or inhibitory
What is an artificial neuron?
• Definition: a non-linear, parameterized function with a restricted output range

y = f( w0 + Σ_{i=1..n-1} wi xi )

where x1, …, x_{n-1} are the inputs, wi the weights, w0 the bias, and f the activation function.
Activation Functions
• Linear: y = x
• Logistic: y = 1 / (1 + exp(-x))
• Hyperbolic tangent: y = (exp(x) - exp(-x)) / (exp(x) + exp(-x))
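The three activation functions above can be sketched directly in NumPy (an illustrative sketch; the function names are ours):

```python
import numpy as np

def linear(x):
    # Linear activation: passes the input through unchanged
    return x

def logistic(x):
    # Logistic (sigmoid): squashes any input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Hyperbolic tangent: squashes any input into the range (-1, 1)
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

print(logistic(0.0))   # 0.5: the logistic curve crosses the y-axis at 0.5
print(tanh(0.0))       # 0.0: tanh is centered at the origin
```

Note the restricted output ranges of logistic and tanh, which is exactly the "restricted output range" property in the neuron definition above.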
Neural Networks
• A mathematical model to solve engineering problems
– Groups of highly connected neurons realizing compositions of non-linear functions
• Tasks
– Classification
– Discrimination
– Estimation
• 2 major types of networks
– Feed forward Neural Networks
– Recurrent Neural Networks
Learning
• The procedure of estimating the parameters of the neurons so that the whole network can perform a specific task
• 3 types of learning
– Supervised learning
– Unsupervised learning
– Reinforcement learning
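As a hedged illustration of supervised learning, here is a minimal gradient-descent fit of a single logistic neuron to the OR function; the data, random seed, and learning rate are illustrative choices, not from the slides:

```python
import numpy as np

# Supervised learning sketch: fit one logistic neuron to the OR function
# with plain gradient descent on the squared error.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0.0, 1.0, 1.0, 1.0])          # desired outputs (OR)

rng = np.random.default_rng(0)
w = rng.normal(size=2)                       # weights
b = 0.0                                      # bias (w0 in the neuron formula)
lr = 0.5                                     # learning rate

def forward(X, w, b):
    # Logistic neuron: sigmoid of the weighted sum plus bias
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

loss_before = np.mean((forward(X, w, b) - t) ** 2)
for _ in range(2000):
    y = forward(X, w, b)
    grad = (y - t) * y * (1 - y)             # error signal through the sigmoid
    w -= lr * X.T @ grad / len(X)            # gradient-descent step on weights
    b -= lr * grad.mean()                    # gradient-descent step on bias
loss_after = np.mean((forward(X, w, b) - t) ** 2)
print(loss_after < loss_before)              # training reduces the error
```

The loop is the essence of gradient descent: compute the error, follow its gradient downhill by a small step (the learning rate), and repeat.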
Classification of Some NN Systems with respect to Learning Method and Architecture Type
Single-layer feedforward networks, by learning method:
– Gradient descent: ADALINE, Hopfield, Perceptron
– Hebbian: AM, Hopfield
– Competitive: LVQ, SOFM
Recurrent Neural Networks
• Can have arbitrary topologies
• Can model systems with internal states (dynamic ones)
• Delays are associated with a specific weight
• Training is more difficult
• Performance may be problematic
– Stable outputs may be more difficult to evaluate
– Unexpected behavior (oscillation, chaos, …)
Building a Neural Network
1. "Select structure": design the way the neurons are interconnected
2. "Select weights": decide the strengths with which the neurons are interconnected
– weights are selected to get a "good match" to a "training set"
– "training set": a set of inputs together with the desired outputs
– this is often done with a "learning algorithm"
Multiple Output Units: One-vs-Rest
Neural Network Classification
Representing Boolean Function
Combining Representations to Create Non-Linear Functions
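As a sketch of the idea, a single threshold neuron can represent linearly separable Boolean functions such as AND and OR, and combining two layers of such neurons yields the non-linear XOR; the weights below are illustrative choices, not the slides' values:

```python
# Representing Boolean functions with threshold neurons (illustrative weights).
def neuron(inputs, weights, bias):
    # Fires (outputs 1) when the weighted sum plus bias is positive.
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if s > 0 else 0

def AND(x1, x2):
    return neuron([x1, x2], [1, 1], -1.5)

def OR(x1, x2):
    return neuron([x1, x2], [1, 1], -0.5)

def XOR(x1, x2):
    # XOR is not linearly separable: no single neuron can represent it,
    # but combining two layers works: XOR = OR AND (NOT AND) = OR AND NAND.
    nand = neuron([x1, x2], [-1, -1], 1.5)
    return AND(OR(x1, x2), nand)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b), XOR(a, b))
```

The XOR construction is the classic demonstration of why layering representations adds expressive power.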
Layering Representations
Forward-Propagating Local Input Signals
Why now?
1) Algorithm Advancements
2) GPU Computing
3) Availability of Large Training Data
DEEP LEARNING
• We know it is good to learn a small model.
• From this fully connected model, do we really need all the edges?
• Can some of these be shared?
Deep Learning vs. Machine Learning
Popular Deep Learning Algorithm
Convolutional Neural Networks
Consider Learning an Image
• Some patterns are much smaller than the whole image
– A small region can be represented with fewer parameters (e.g., a "beak" detector)
• The same pattern appears in different places, so the detectors can be compressed
– Instead of training many separate "small" detectors (an "upper-left beak" detector, a "middle beak" detector, …), one detector can "move around" the image
A Convolutional Layer
A CNN is a neural network with some convolutional layers (and some other layers). A convolutional layer has a number of filters that perform the convolution operation.
A filter can act as a "beak" detector.
Convolution
These are the network parameters to be learned.

Filter 1:
 1 -1 -1
-1  1 -1
-1 -1  1

Filter 2:
-1  1 -1
-1  1 -1
-1  1 -1

6 x 6 image:
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0
1 0 0 0 1 0
0 1 0 0 1 0
0 0 1 0 1 0

Each filter detects a small pattern (3 x 3).
Convolution: Filter 1, stride = 1
 1 -1 -1
-1  1 -1
-1 -1  1
Place the filter on the top-left 3 x 3 patch of the 6 x 6 image and take the dot product: the result is 3. Sliding the filter one pixel to the right (stride = 1) gives -1.
Convolution: Filter 1, stride = 2
If stride = 2, the filter moves two pixels at a time, so the first row of outputs is 3, -3.
Convolution: Filter 1, stride = 1 (full result)
Sliding Filter 1 over the entire 6 x 6 image with stride = 1 produces a 4 x 4 output:
 3 -1 -3 -1
-3  1  0 -3
-3 -3  0  1
 3 -2 -2 -1
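The sliding-window computation above can be sketched in NumPy (a minimal valid-mode convolution; `conv2d` is our own helper, not a library call):

```python
import numpy as np

# Sliding-window (valid, strided) convolution as used on the slides.
def conv2d(image, kernel, stride=1):
    kh, kw = kernel.shape
    oh = (image.shape[0] - kh) // stride + 1
    ow = (image.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)  # dot product with the filter
    return out

image = np.array([[1,0,0,0,0,1],
                  [0,1,0,0,1,0],
                  [0,0,1,1,0,0],
                  [1,0,0,0,1,0],
                  [0,1,0,0,1,0],
                  [0,0,1,0,1,0]])
filter1 = np.array([[ 1,-1,-1],
                    [-1, 1,-1],
                    [-1,-1, 1]])

print(conv2d(image, filter1, stride=1))
# top-left entry is 3, matching the dot product computed on the slide
```

With stride = 1 this reproduces the 4 x 4 map above; passing stride=2 reproduces the coarser 2 x 2 result.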
Convolution: Filter 2, stride = 1
Repeat this for each filter. Filter 2 produces its own 4 x 4 feature map:
-1 -1 -1 -1
-1 -1 -2  1
-1 -1 -2  1
-1  0 -4  3
The two 4 x 4 images together form a 2 x 4 x 4 feature map.
Color Image: RGB, 3 Channels
A color image has three channels (R, G, B), so the 6 x 6 image becomes a 3 x 6 x 6 array. Each filter correspondingly has three 3 x 3 slices (a 3 x 3 x 3 filter), one per channel, and the convolution sums over all three channels.
Convolution vs. Fully Connected
In a fully connected layer, the 6 x 6 image would be flattened into 36 inputs x1, x2, …, x36, and every output unit would connect to all of them. Convolution keeps only a small subset of these connections.
Fewer parameters!
Number the flattened image inputs 1 to 36 (row by row). The first convolution output (value 3) connects only to the 9 inputs covered by Filter 1: inputs 1, 2, 3, 7, 8, 9, 13, 14, 15, not to all 36.
Shared weights: even fewer parameters
The next output (value -1) connects to inputs 2, 3, 4, 8, 9, 10, 14, 15, 16, and it reuses exactly the same 9 weights as the first output. Shared weights reduce the parameter count even further.
The whole CNN
image → Convolution → Max Pooling → Convolution → Max Pooling → … (can repeat many times) → Flatten → Fully Connected Feedforward network → "cat", "dog", …
Max Pooling
The two filters and their 4 x 4 feature maps:
Filter 1:          Filter 2:
 1 -1 -1           -1  1 -1
-1  1 -1           -1  1 -1
-1 -1  1           -1  1 -1
Feature maps:
 3 -1 -3 -1        -1 -1 -1 -1
-3  1  0 -3        -1 -1 -2  1
-3 -3  0  1        -1 -1 -2  1
 3 -2 -2 -1        -1  0 -4  3
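The 2 x 2 max-pooling step can be sketched in NumPy (`max_pool` is our own illustrative helper):

```python
import numpy as np

# 2 x 2 max pooling: keep only the maximum of each non-overlapping block.
def max_pool(fmap, size=2):
    h, w = fmap.shape
    out = np.zeros((h // size, w // size))
    for i in range(0, h, size):
        for j in range(0, w, size):
            out[i // size, j // size] = fmap[i:i+size, j:j+size].max()
    return out

# The Filter 1 feature map from the convolution slides.
fmap1 = np.array([[ 3, -1, -3, -1],
                  [-3,  1,  0, -3],
                  [-3, -3,  0,  1],
                  [ 3, -2, -2, -1]])
print(max_pool(fmap1))   # each entry is the max of one 2 x 2 block
```

Each 4 x 4 map shrinks to 2 x 2, which is the subsampling effect discussed next.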
Why Pooling
• Subsampling pixels does not change the object: a subsampled bird is still a bird
Max pooling keeps only the maximum of each 2 x 2 block; for the Filter 1 feature map this gives:
3 0
3 1
After max pooling we get a new image, smaller than the original: each filter contributes one channel, so the 6 x 6 input becomes a 2 x 2 x 2 image. Convolution and max pooling can be repeated many times.
Flatten
The pooled 2 x 2 feature maps
3 0        -1 1
3 1         0 3
are flattened into a single vector and fed into a fully connected feedforward network.
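Putting the pieces together, here is a hedged end-to-end sketch of the pipeline (convolution, max pooling, flatten, fully connected layer); the dense-layer weights are random placeholders, not trained values:

```python
import numpy as np

# Convolution -> max pooling -> flatten -> fully connected, on the slide's data.
def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    return np.array([[np.sum(image[i:i+kh, j:j+kw] * kernel)
                      for j in range(ow)] for i in range(oh)])

def max_pool(fmap, size=2):
    h, w = fmap.shape
    return np.array([[fmap[i:i+size, j:j+size].max()
                      for j in range(0, w, size)] for i in range(0, h, size)])

image = np.array([[1,0,0,0,0,1],[0,1,0,0,1,0],[0,0,1,1,0,0],
                  [1,0,0,0,1,0],[0,1,0,0,1,0],[0,0,1,0,1,0]])
filters = [np.array([[ 1,-1,-1],[-1, 1,-1],[-1,-1, 1]]),   # Filter 1
           np.array([[-1, 1,-1],[-1, 1,-1],[-1, 1,-1]])]   # Filter 2

pooled = [max_pool(conv2d(image, f)) for f in filters]   # two 2 x 2 maps
flat = np.concatenate([p.ravel() for p in pooled])       # length-8 vector
print(flat.shape)                                        # (8,)

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 8))      # placeholder weights for 2 classes (cat, dog)
scores = W @ flat                # fully connected layer output
print(scores.shape)              # (2,)
```

In a real CNN the filter values and the dense weights would all be learned by backpropagation rather than fixed or random.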
Review Questions
1. What are the three types of learning in neural networks?
2. What is the major challenge of the gradient descent algorithm?
3. What is the learning rate in a neural network, and why should its value be small?
4. What are filtering and max pooling in a CNN?
5. What are the advantages of deep learning?
Thank you