CHAPTER 04
MULTILAYER PERCEPTRONS
Introduction
Limitation of Rosenblatt’s Perceptron
AND Operation:
x1  x2  d
---------
0   0   0
0   1   0
1   0   0
1   1   1

A single neuron with inputs x_1 and x_2, a bias input +1, and weights w_1, w_2, w_0 produces the output y and separates the classes with a linear decision boundary. Its activation is the logistic function

    f(z) = \frac{1}{1 + e^{-z}}

The four training patterns impose the constraints:

    w_1(0) + w_2(0) + w_0 < 0   =>   w_0 < 0
    w_1(0) + w_2(1) + w_0 < 0   =>   w_2 < -w_0
    w_1(1) + w_2(0) + w_0 < 0   =>   w_1 < -w_0
    w_1(1) + w_2(1) + w_0 > 0   =>   w_1 + w_2 > -w_0

It is easy to find a set of weights that satisfies these inequalities, for example

    y = f(10 x_1 + 10 x_2 - 20)
Introduction
Limitation of Rosenblatt’s Perceptron
OR Operation:
x1  x2  d
---------
0   0   0
0   1   1
1   0   1
1   1   1

The same single neuron (inputs x_1, x_2, bias input +1, weights w_1, w_2, w_0, logistic activation f(z) = \frac{1}{1 + e^{-z}}) again separates the classes with a linear decision boundary. The constraints are now:

    w_1(0) + w_2(0) + w_0 < 0   =>   w_0 < 0
    w_1(0) + w_2(1) + w_0 > 0   =>   w_2 > -w_0
    w_1(1) + w_2(0) + w_0 > 0   =>   w_1 > -w_0
    w_1(1) + w_2(1) + w_0 > 0   =>   w_1 + w_2 > -w_0

It is easy to find a set of weights that satisfies these inequalities, for example

    y = f(20 x_1 + 20 x_2 - 10)
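Both weight assignments are easy to check mechanically. Below is a minimal NumPy sketch (not from the slides) that runs both neurons over all four input patterns; since the quoted AND weights place the pattern (1, 1) exactly on f(v) = 0.5, the decision rule here counts f(v) >= 0.5 as class 1:

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def fire(x1, x2, w1, w2, w0):
    # Class 1 when f(v) >= 0.5, i.e. when the induced field v >= 0.
    return int(logistic(w1 * x1 + w2 * x2 + w0) >= 0.5)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    y_and = fire(x1, x2, 10, 10, -20)   # y = f(10 x1 + 10 x2 - 20)
    y_or  = fire(x1, x2, 20, 20, -10)   # y = f(20 x1 + 20 x2 - 10)
    print(f"({x1}, {x2})  AND -> {y_and}   OR -> {y_or}")
```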
Introduction
Limitation of Rosenblatt’s Perceptron
XOR Operation:
x1  x2  d
---------
0   0   0
0   1   1
1   0   1
1   1   0

Here the same single neuron would need to satisfy:

    w_1(0) + w_2(0) + w_0 < 0   =>   w_0 < 0
    w_1(0) + w_2(1) + w_0 > 0   =>   w_2 > -w_0
    w_1(1) + w_2(0) + w_0 > 0   =>   w_1 > -w_0
    w_1(1) + w_2(1) + w_0 < 0   =>   w_1 + w_2 < -w_0

but the two classes can only be separated by a non-linear decision boundary:

    y = f(???)
Clearly the second and third inequalities are incompatible with the fourth: adding them gives w_1 + w_2 > -2w_0 > -w_0 (since w_0 < 0), so there is no single-neuron solution to the XOR problem. We need more complex networks!
The XOR Problem
A two-layer network to solve the XOR problem
Figure 4.8 (a) Architectural graph of network for solving the XOR problem. (b)
Signal-flow graph of the network.
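The exact weights of Fig. 4.8 are not reproduced in this extract, so the sketch below uses the classic construction in which hidden neuron 1 computes AND, hidden neuron 2 computes OR, and the output neuron fires when OR is active but AND is not; any weights realizing this structure solve XOR:

```python
# Two-layer network for XOR with hard-limit neurons. The specific
# weights are one well-known choice, not necessarily those of Fig. 4.8.
def step(v):                # hard-limit activation
    return 1 if v > 0 else 0

def xor_net(x1, x2):
    h1 = step(1.0 * x1 + 1.0 * x2 - 1.5)     # AND: fires only for (1, 1)
    h2 = step(1.0 * x1 + 1.0 * x2 - 0.5)     # OR : fires if any input is on
    return step(-2.0 * h1 + 1.0 * h2 - 0.5)  # OR AND NOT(AND) = XOR

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", xor_net(*x))
```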
The XOR Problem
A two-layer network to solve the XOR problem
Figure 4.9 (a) Decision boundary constructed by hidden neuron 1 of the network in
Fig. 4.8. (b) Decision boundary constructed by hidden neuron 2 of the network. (c)
Decision boundaries constructed by the complete network.
Figure 4.1 Architectural graph of a multilayer perceptron with two hidden layers.
MLP: Some Preliminaries
Weight Dimensions
A weight w_ji connects source node i to neuron j, so each layer's weight matrix has one row per neuron and one column per input node (including the bias). For a 4-class task (Pedestrian, Car, Motorcycle, Truck) the desired responses form a 4 x 4 identity matrix, one one-hot row per class:

    Pedestrian:  1 0 0 0
    Car:         0 1 0 0
    Motorcycle:  0 0 1 0
    Truck:       0 0 0 1
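A minimal sketch of this one-hot target encoding (the full class names are assumed from the slide's abbreviations):

```python
import numpy as np

classes = ["Pedestrian", "Car", "Motorcycle", "Truck"]
targets = np.eye(len(classes))    # identity matrix: one one-hot row per class

for name, d in zip(classes, targets):
    print(f"{name:>10}: d = {d}")
```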
In the forward phase, the weights of the network are fixed and
the input signal is propagated through the network, layer by
layer, until it reaches the output.
The induced local field of neuron j is

    v_j(n) = \sum_{i=0}^{m} w_{ji}(n) y_i(n)

and the error signal of output neuron j is

    e_j(n) = d_j(n) - y_j(n)
Figure 4.4 Signal-flow graph highlighting the details of output neuron k connected
to hidden neuron j.
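A minimal sketch of this forward phase for one hidden layer and one output neuron (the logistic activation and the layer sizes are illustrative assumptions):

```python
import numpy as np

def logistic(v):
    return 1.0 / (1.0 + np.exp(-v))

def forward_layer(W, y_prev):
    """One layer of the forward phase: v_j = sum_i w_ji * y_i, y_j = f(v_j).

    W[j, i] holds w_ji, with column 0 the bias weights; y_0 = +1 is
    prepended to the previous layer's output.
    """
    v = W @ np.concatenate(([1.0], y_prev))   # induced local fields v_j(n)
    return logistic(v)

# Example: a 2-input, 3-hidden, 1-output network with random fixed weights.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 3))        # 3 hidden neurons, 2 inputs + bias
W2 = rng.normal(size=(1, 4))        # 1 output neuron, 3 hidden + bias
x = np.array([0.5, -0.2])
y_out = forward_layer(W2, forward_layer(W1, x))
e = np.array([1.0]) - y_out         # e_j(n) = d_j(n) - y_j(n), with d = 1
print("output:", y_out, "error:", e)
```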
The Back-propagation Algorithm
We redefine the local gradient for a hidden neuron j as:
    \delta_j(n) = -\frac{\partial E(n)}{\partial y_j(n)} \frac{\partial y_j(n)}{\partial v_j(n)} = -\frac{\partial E(n)}{\partial y_j(n)} \varphi_j'(v_j(n))
where the instantaneous total error, summed over the set C of output neurons k, is:

    E(n) = \frac{1}{2} \sum_{k \in C} e_k^2(n)
Differentiating w.r.t. y_j(n) yields:

    \frac{\partial E(n)}{\partial y_j(n)} = \sum_{k} e_k(n) \frac{\partial e_k(n)}{\partial y_j(n)} = \sum_{k} e_k(n) \frac{\partial e_k(n)}{\partial v_k(n)} \frac{\partial v_k(n)}{\partial y_j(n)}
But e_k(n) = d_k(n) - y_k(n) = d_k(n) - \varphi_k(v_k(n)), hence

    \frac{\partial e_k(n)}{\partial v_k(n)} = -\varphi_k'(v_k(n))
The Back-propagation Algorithm
Also, we have

    v_k(n) = \sum_{j=0}^{m} w_{kj}(n) y_j(n)
Differentiating yields

    \frac{\partial v_k(n)}{\partial y_j(n)} = w_{kj}(n)

Then we get:

    \frac{\partial E(n)}{\partial y_j(n)} = -\sum_{k} e_k(n) \varphi_k'(v_k(n)) w_{kj}(n) = -\sum_{k} \delta_k(n) w_{kj}(n)

Substituting back into the definition of \delta_j(n) gives the back-propagation formula for a hidden neuron:

    \delta_j(n) = \varphi_j'(v_j(n)) \sum_{k} \delta_k(n) w_{kj}(n)
Figure 4.5 Signal-flow graph of a part of the adjoint system pertaining to back-
propagation of error signals.
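A minimal sketch of this backward step (logistic activation assumed, so \varphi'(v) = y(1 - y)):

```python
import numpy as np

def hidden_delta(y_hidden, W_out, delta_out):
    """delta_j = phi'_j(v_j) * sum_k delta_k * w_kj.

    y_hidden : outputs y_j of the hidden layer (logistic assumed,
               so phi'(v_j) = y_j * (1 - y_j))
    W_out    : output-layer weights, W_out[k, j] = w_kj
               (bias column excluded: the bias feeds no hidden neuron)
    delta_out: local gradients delta_k of the output layer
    """
    phi_prime = y_hidden * (1.0 - y_hidden)
    return phi_prime * (W_out.T @ delta_out)   # back-propagated weighted sum

# Output-layer gradients first: delta_k = e_k * phi'_k(v_k).
y_out = np.array([0.7])
e = np.array([1.0]) - y_out
delta_out = e * y_out * (1.0 - y_out)
W_out = np.array([[0.3, -0.8, 0.5]])           # 1 output, 3 hidden neurons
y_hidden = np.array([0.2, 0.9, 0.6])
print(hidden_delta(y_hidden, W_out, delta_out))
```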
The Back-propagation Algorithm
We summarize the relations for the back-propagation algorithm:
Logistic function:

    \varphi(v) = \frac{1}{1 + \exp(-av)}, \quad a > 0

    \varphi'(v) = \frac{a \exp(-av)}{[1 + \exp(-av)]^2} = a \varphi(v)[1 - \varphi(v)]

Hyperbolic tangent function:

    \varphi(v) = a \tanh(bv), \quad a, b > 0

    \varphi'(v) = \frac{b}{a}[a - \varphi(v)][a + \varphi(v)]
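These two derivative identities are easy to verify numerically; a minimal sketch (the constants a = 1.7159, b = 2/3 are example values often recommended in the literature):

```python
import numpy as np

a, b = 1.7159, 2.0 / 3.0
v = np.linspace(-2.0, 2.0, 5)

# Logistic: phi(v) = 1 / (1 + exp(-a v)),  phi'(v) = a phi(v)[1 - phi(v)]
phi_log = 1.0 / (1.0 + np.exp(-a * v))
dphi_log = a * phi_log * (1.0 - phi_log)

# Tanh: phi(v) = a tanh(b v),  phi'(v) = (b/a)[a - phi(v)][a + phi(v)]
phi_tanh = a * np.tanh(b * v)
dphi_tanh = (b / a) * (a - phi_tanh) * (a + phi_tanh)

# Compare against centered finite differences.
h = 1e-6
num_log = ((1 / (1 + np.exp(-a * (v + h)))) - (1 / (1 + np.exp(-a * (v - h))))) / (2 * h)
num_tanh = (a * np.tanh(b * (v + h)) - a * np.tanh(b * (v - h))) / (2 * h)
print(np.allclose(dphi_log, num_log), np.allclose(dphi_tanh, num_tanh))
```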
1. Initialization
2. Presentation of training examples
3. Forward computation
4. Backward computation
5. Iteration
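A minimal end-to-end sketch of these five steps, training a 2-2-1 logistic network on XOR (the architecture, learning rate, and epoch count are illustrative assumptions; a different seed may be needed if training lands in a local minimum):

```python
import numpy as np

rng = np.random.default_rng(1)

# XOR training set: inputs and desired responses.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
D = np.array([[0], [1], [1], [0]], dtype=float)

def logistic(v):
    return 1.0 / (1.0 + np.exp(-v))

# 1. Initialization: small random weights, bias folded in as column 0.
W1 = rng.uniform(-1, 1, size=(2, 3))   # hidden layer: 2 neurons, 2 inputs + bias
W2 = rng.uniform(-1, 1, size=(1, 3))   # output layer: 1 neuron, 2 hidden + bias
eta = 0.5                              # learning rate

for epoch in range(20000):                 # 5. Iteration until convergence
    for x, d in zip(X, D):                 # 2. Presentation of training examples
        # 3. Forward computation
        y0 = np.concatenate(([1.0], x))    # prepend bias input y_0 = +1
        y1 = logistic(W1 @ y0)
        y1b = np.concatenate(([1.0], y1))
        y2 = logistic(W2 @ y1b)
        # 4. Backward computation
        e = d - y2
        delta2 = e * y2 * (1 - y2)                       # output local gradients
        delta1 = y1 * (1 - y1) * (W2[:, 1:].T @ delta2)  # hidden local gradients
        W2 += eta * np.outer(delta2, y1b)                # delta rule: eta * delta * y
        W1 += eta * np.outer(delta1, y0)

for x in X:
    y1 = logistic(W1 @ np.concatenate(([1.0], x)))
    y2 = logistic(W2 @ np.concatenate(([1.0], y1)))
    print(x, "->", np.round(y2, 2))
```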
Heuristics for making the BP Better
3. Activation function
Use an odd (antisymmetric) activation, i.e. the hyperbolic tangent rather than the logistic function:

    \varphi(v) = a \tanh(bv)
Heuristics for making the BP Better
4. Target values
It is very important to choose the values of the desired response to be within the range of the sigmoid function. For example, with \varphi(v) = a \tanh(bv) the targets should be offset from the limits, d_j = a - \epsilon rather than d_j = a, for some small \epsilon > 0.
5. Normalizing the inputs
Each input variable should be preprocessed so that its mean value, averaged over the entire training sample, is close to zero, or else it will be small compared to its standard deviation.

Figure 4.11 Illustrating the operation of mean removal, decorrelation, and covariance equalization for a two-dimensional input space.
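A minimal sketch of the three operations in Fig. 4.11, assuming the training samples are the rows of X: mean removal, decorrelation via the eigenvectors of the covariance matrix, and covariance equalization by rescaling to unit variance:

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2-D inputs with a non-zero mean, as in the figure's setting.
X = rng.multivariate_normal([3.0, -1.0], [[2.0, 1.5], [1.5, 2.0]], size=500)

# 1. Mean removal: each input variable gets zero mean over the training set.
Xc = X - X.mean(axis=0)

# 2. Decorrelation: rotate onto the eigenvectors of the covariance matrix.
cov = np.cov(Xc, rowvar=False)
eigval, eigvec = np.linalg.eigh(cov)
Xd = Xc @ eigvec

# 3. Covariance equalization: rescale each decorrelated variable to unit variance.
Xw = Xd / np.sqrt(eigval)

print(np.round(np.cov(Xw, rowvar=False), 3))   # approximately the identity matrix
```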
Next Time
• Computer Experiment (Section 4.15)