
Back-propagation

Chih-yun Lin
2/13/2019
Agenda
Perceptron vs. back-propagation network
- Network structure
- Learning rule
Why a hidden layer?
An example: Jets or Sharks
Conclusions
Network Structure – Perceptron
[Diagram: a single output unit O connected directly to the input units Ij by weights Wj]
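As a rough sketch of what this structure computes (the step activation g and the function name perceptron_output are assumptions for illustration, not from the slides):

import numpy as np

def g(x):
    # Step activation with threshold 0; the slides leave g unspecified.
    return 1.0 if x > 0 else 0.0

def perceptron_output(W, I):
    # O = g(sum_j Wj * Ij): the single output unit is wired straight to the inputs.
    return g(np.dot(W, I))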
Network Structure – Back-propagation Network
[Diagram: input units Ik feed hidden units aj through weights Wk,j; the hidden units feed the output units Oi through weights Wj,i]
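A minimal sketch of the forward pass through this structure, assuming a sigmoid activation g and illustrative variable names (none of these names come from the slides):

import numpy as np

def g(x):
    # Sigmoid activation (an assumption; the slides only refer to g and g').
    return 1.0 / (1.0 + np.exp(-x))

def forward(W_kj, W_ji, I):
    # I: input activations Ik
    # W_kj: input-to-hidden weights Wk,j (shape: hidden x inputs)
    # W_ji: hidden-to-output weights Wj,i (shape: outputs x hidden)
    in_j = W_kj @ I      # in_j = sum_k Wk,j * Ik
    a_j = g(in_j)        # hidden activations aj
    in_i = W_ji @ a_j    # in_i = sum_j Wj,i * aj
    O = g(in_i)          # output activations Oi
    return a_j, O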
Learning Rule
Measure error
Reduce that error
- By appropriately adjusting each of the weights in the network
Learning Rule – Perceptron
Err = T – O
- O is the predicted output
- T is the correct output
Wj ← Wj + α * Ij * Err
- Ij is the activation of a unit j in the input layer
- α is a constant called the learning rate
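A minimal sketch of this rule in Python (the step activation and the function name perceptron_update are illustrative assumptions):

import numpy as np

def perceptron_update(W, I, T, alpha=0.1):
    # One application of Wj <- Wj + alpha * Ij * Err, applied to every weight at once.
    O = 1.0 if np.dot(W, I) > 0 else 0.0   # predicted output (step activation assumed)
    err = T - O                            # Err = T - O
    return W + alpha * I * err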
Learning Rule – Back-propagation Network
Erri = Ti – Oi
Wj,i ← Wj,i + α * aj * Δi
- Δi = Erri * g′(ini)
- g′ is the derivative of the activation function g
- aj is the activation of the hidden unit
Wk,j ← Wk,j + α * Ik * Δj
- Δj = g′(inj) * Σi Wj,i * Δi
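A minimal sketch of one backward pass implementing the two updates above, assuming a sigmoid activation (so g′(in) = g(in)(1 − g(in))) and illustrative names:

import numpy as np

def g(x):
    return 1.0 / (1.0 + np.exp(-x))   # sigmoid (assumed)

def backprop_update(W_kj, W_ji, I, T, alpha=0.1):
    # Forward pass through the two-layer network.
    in_j = W_kj @ I
    a_j = g(in_j)                     # hidden activations aj
    in_i = W_ji @ a_j
    O = g(in_i)                       # outputs Oi
    # Output layer: Delta_i = Err_i * g'(in_i)
    delta_i = (T - O) * g(in_i) * (1.0 - g(in_i))
    # Hidden layer: Delta_j = g'(in_j) * sum_i Wj,i * Delta_i
    delta_j = g(in_j) * (1.0 - g(in_j)) * (W_ji.T @ delta_i)
    # Weight updates from the slides.
    W_ji = W_ji + alpha * np.outer(delta_i, a_j)   # Wj,i <- Wj,i + alpha * aj * Delta_i
    W_kj = W_kj + alpha * np.outer(delta_j, I)     # Wk,j <- Wk,j + alpha * Ik * Delta_j
    return W_kj, W_ji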
Learning Rule – Back-propagation Network (cont.)
E = 1/2 * Σi (Ti – Oi)²
∂E / ∂Wk,j = – Ik * Δj
- So the update Wk,j ← Wk,j + α * Ik * Δj is gradient descent on the error E.
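A sketch of the chain-rule derivation behind this gradient, assuming Oi = g(ini) with ini = Σj Wj,i aj and aj = g(inj) with inj = Σk Wk,j Ik (these intermediate definitions follow from the network structure but are not written out on the slides):

\begin{aligned}
\frac{\partial E}{\partial W_{k,j}}
 &= \sum_i \frac{\partial E}{\partial O_i}\,\frac{\partial O_i}{\partial in_i}\,\frac{\partial in_i}{\partial a_j}\,\frac{\partial a_j}{\partial in_j}\,\frac{\partial in_j}{\partial W_{k,j}} \\
 &= \sum_i \bigl(-(T_i - O_i)\bigr)\, g'(in_i)\, W_{j,i}\, g'(in_j)\, I_k \\
 &= -I_k\, g'(in_j) \sum_i W_{j,i}\, \Delta_i \;=\; -I_k\, \Delta_j .
\end{aligned}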
Why a hidden layer?
Consider teaching a single threshold unit the XOR function (output 1 exactly when one input is 1):
(1 * w1) + (1 * w2) < θ  ==>  w1 + w2 < θ
(1 * w1) + (0 * w2) > θ  ==>  w1 > θ
(0 * w1) + (1 * w2) > θ  ==>  w2 > θ
(0 * w1) + (0 * w2) < θ  ==>  0 < θ
These four constraints are contradictory (the last three force w1 + w2 > 2θ > θ), so no single-layer perceptron can represent XOR.
Why a hidden layer? (cont.)
Now add a hidden unit that is active only when both inputs are 1, contributing a third weight w3:
(1 * w1) + (1 * w2) + (1 * w3) < θ  ==>  w1 + w2 + w3 < θ
(1 * w1) + (0 * w2) + (0 * w3) > θ  ==>  w1 > θ
(0 * w1) + (1 * w2) + (0 * w3) > θ  ==>  w2 > θ
(0 * w1) + (0 * w2) + (0 * w3) < θ  ==>  0 < θ
With a sufficiently negative w3 all four constraints can be satisfied, so the hidden unit makes XOR representable.
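A minimal runnable sketch of this fix in Python; the specific weights (w1 = w2 = 0.6, w3 = -1.0, θ = 0.5) and the AND-style hidden unit are illustrative choices that satisfy the four constraints, not values from the slides:

def step(x, theta):
    return 1 if x > theta else 0

def xor_net(x1, x2):
    h = step(x1 + x2, 1.5)                   # hidden unit: active only when both inputs are 1
    w1, w2, w3, theta = 0.6, 0.6, -1.0, 0.5  # illustrative weights
    return step(w1 * x1 + w2 * x2 + w3 * h, theta)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))       # prints the XOR truth table: 0, 1, 1, 0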
An example: Jets or Sharks
Conclusions
Expressiveness:
- Well suited for continuous inputs, unlike most decision-tree systems
Computational efficiency:
- Time to error convergence is highly variable
Generalization:
- Reasonable success in a number of real-world problems
Conclusions (cont.)
Sensitivity to noise:
- Very tolerant of noise in the input data
Transparency:
- Neural networks are essentially black boxes
Prior knowledge:
- Hard to use one’s knowledge to “prime” a network to learn better
