
Special Topics in Machine Learning

Lecture # 3
The multilayer network
[Figure: a multilayer network with input units X1, ..., Xn, a hidden layer, and output units Y1, ..., Yn; weight matrices W(n,m) connect the Input Layer, Hidden Layer, and Output Layer.]
Weights in the architecture
Weights between hidden and output layers
Wgc: weight from neuron C to G
Wgd: weight from neuron D to G
Wge: weight from neuron E to G
Wgf: weight from neuron F to G

Weights between input and hidden layers
Wca weight from neuron A to C
Wcb weight from neuron B to C
Wda weight from neuron A to D
Wdb weight from neuron B to D
Wea weight from neuron A to E
Web weight from neuron B to E
Wfa weight from neuron A to F
Wfb weight from neuron B to F
Calculation of the output of the Neural Network
Equation 1:
Gout = [cout*Wgc] + [dout*Wgd] + [eout*Wge] + [fout*Wgf]
Equation 2: cout = [Ain*Wca] + [Bin *Wcb]
Equation 3: dout = [Ain*Wda] + [Bin *Wdb]
Equation 4: eout = [Ain*Wea] + [Bin *Web]
Equation 5: fout = [Ain*Wfa] + [Bin *Wfb]
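To make the flow of Equations 1-5 concrete, here is a minimal Python sketch that evaluates them directly for the A, B → C, D, E, F → G network. The input values and all weight values are invented for illustration, and no activation function is applied, matching the equations exactly as written above.

```python
# Minimal sketch of Equations 1-5 for the 2-4-1 network (A, B -> C, D, E, F -> G).
# The inputs and weights below are made-up illustration numbers, not lecture values.

# Inputs
A_in, B_in = 0.5, 0.8

# Weights between input and hidden layer (e.g. w_ca = weight from A to C)
w_ca, w_cb = 0.1, 0.4
w_da, w_db = 0.2, 0.3
w_ea, w_eb = 0.3, 0.2
w_fa, w_fb = 0.4, 0.1

# Weights between hidden and output layer
w_gc, w_gd, w_ge, w_gf = 0.5, 0.6, 0.7, 0.8

# Equations 2-5: hidden-unit outputs
c_out = A_in * w_ca + B_in * w_cb
d_out = A_in * w_da + B_in * w_db
e_out = A_in * w_ea + B_in * w_eb
f_out = A_in * w_fa + B_in * w_fb

# Equation 1: output of G
g_out = c_out * w_gc + d_out * w_gd + e_out * w_ge + f_out * w_gf
print(g_out)
```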
Calculation of Errors:
Simple equations to calculate errors are
Equation 6: Eg = Desired output – Actual output
(Eg is the error at unit G; the errors at the other units are named analogously)
 
Equation 7: Ec =Eg*Wgc
Equation 8: Ed =Eg*Wgd
Equation 9: Ee =Eg*Wge
Equation 10: Ef =Eg*Wgf
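The same can be done for Equations 6-10. The sketch below uses a made-up actual output and made-up hidden-to-output weights to show how the error at G is distributed back to the hidden units.

```python
# Minimal sketch of Equations 6-10 with made-up numbers.
desired = 1.0        # target output
g_out = 0.83         # actual output of G (hypothetical value)

# Hidden-to-output weights (hypothetical values)
w_gc, w_gd, w_ge, w_gf = 0.5, 0.6, 0.7, 0.8

# Equation 6: error at the output unit G
E_g = desired - g_out

# Equations 7-10: errors attributed to the hidden units
E_c = E_g * w_gc
E_d = E_g * w_gd
E_e = E_g * w_ge
E_f = E_g * w_gf
print(E_g, E_c, E_d, E_e, E_f)
```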
Example of simple Feed Forward NN

Calculate Gout, Cout, Dout, Eout, Fout, Eg, Ec, Ed, Ee, Ef when the desired value = 1. The activation function (A.F.) is the log-sigmoid.
MLP Architecture
The Multi-Layer-Perceptron was first introduced by M. Minsky and S. Papert in 1969
Type: Feedforward

Neuron layers: 1 input layer, 1 or more hidden layers, 1 output layer

Learning method: Supervised
Why the MLP?
• The single-layer perceptron classifiers discussed previously can only deal with linearly separable sets of patterns.

• The multilayer networks introduced here are the most widespread neural network architecture.
– They were of limited practical use until the 1980s because no efficient training algorithm was available.
– The introduction of the backpropagation training algorithm changed this (McClelland and Rumelhart 1986).
Multi Layer Network Feedforward
• Feedforward Neural Networks
– Each unit is connected only to units in the next layer
– Processing proceeds from the input units to the output units
– There is no feedback (the network is a directed acyclic graph, DAG)
– The network has no internal state

[Figure: feedforward multilayer network with input, hidden, and output layers]

Perceptron
Terminology and Nomenclature
Backpropagation Algorithm
Step 0: Initialize weights (set to small random values).

Step 1: While stopping condition is false, do Steps 2-9.

Step 2: For each training pair, do Steps 3-8.
Backpropagation Algorithm
Step 3: Each input unit (Xi, i = 1, ..., n) receives its input signal xi and broadcasts it to all units in the hidden layer.

Step 4: Each hidden unit (Zj, j = 1, ..., p) sums its weighted input signals,

$$ z\_in_j = v_{0j} + \sum_{i=1}^{n} x_i v_{ij} $$

applies its activation function to compute its output signal, zj = f(z_inj), and sends this signal to all units in the layer above (the output units).
Backpropagation Algorithm
Step 5: Each output unit (Yk, k = 1, ..., m) sums its weighted input signals,

$$ y\_in_k = w_{0k} + \sum_{j=1}^{p} z_j w_{jk} $$

and applies its activation function to compute its output signal, yk = f(y_ink).
Backpropagation Algorithm
Step 6: Each output unit (Yk, k = 1, ..., m) receives a target pattern corresponding to the input training pattern and computes its error information term:

$$ \delta_k = (t_k - y_k) \, f'(y\_in_k) $$

It calculates its weight correction term (used to update wjk later),

$$ \Delta w_{jk} = \alpha \, \delta_k \, z_j $$

calculates its bias correction term (used to update w0k later),

$$ \Delta w_{0k} = \alpha \, \delta_k $$

and sends δk to the units in the layer below.
Backpropagation Algorithm
Step 7: Each hidden unit (Zj, j = 1, ..., p) sums its delta inputs (from the units in the layer above),

$$ \delta\_in_j = \sum_{k=1}^{m} \delta_k w_{jk} $$

multiplies by the derivative of its activation function to calculate its error information term,

$$ \delta_j = \delta\_in_j \, f'(z\_in_j) $$

calculates its weight correction term (used to update vij later),

$$ \Delta v_{ij} = \alpha \, \delta_j \, x_i $$

and calculates its bias correction term (used to update v0j later),

$$ \Delta v_{0j} = \alpha \, \delta_j $$
Backpropagation Algorithm
Step 8: Each output unit (Yk, k = 1, ..., m) updates its bias and weights (j = 0, ..., p):

$$ w_{jk}(\text{new}) = w_{jk}(\text{old}) + \Delta w_{jk} $$

Each hidden unit (Zj, j = 1, ..., p) updates its bias and weights (i = 0, ..., n):

$$ v_{ij}(\text{new}) = v_{ij}(\text{old}) + \Delta v_{ij} $$

Step 9: Test the stopping condition.
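Putting Steps 0-9 together, the following Python/NumPy sketch runs the whole training loop for a small 2-2-1 network with a log-sigmoid activation. The learning rate, the weight initialization range, and the squared-error threshold used as a stopping condition are illustrative assumptions, not values prescribed in the lecture.

```python
import numpy as np

def f(x):                  # log-sigmoid activation
    return 1.0 / (1.0 + np.exp(-x))

def f_prime(x):            # derivative of the log-sigmoid
    s = f(x)
    return s * (1.0 - s)

rng = np.random.default_rng(0)
n, p, m = 2, 2, 1          # number of input, hidden, output units
alpha = 0.5                # learning rate (assumed)

# Step 0: initialize weights to small random values
v = rng.uniform(-0.5, 0.5, size=(n + 1, p))   # input->hidden, row 0 holds biases v_0j
w = rng.uniform(-0.5, 0.5, size=(p + 1, m))   # hidden->output, row 0 holds biases w_0k

X = np.array([[1, 1], [1, 0], [0, 1], [0, 0]], dtype=float)  # training inputs
T = np.array([[0], [1], [1], [0]], dtype=float)               # targets (XOR)

for epoch in range(10000):             # Step 1: repeat until stopping condition
    sq_error = 0.0
    for x, t in zip(X, T):             # Step 2: for each training pair
        # Steps 3-5: feedforward
        z_in = v[0] + x @ v[1:]
        z = f(z_in)
        y_in = w[0] + z @ w[1:]
        y = f(y_in)

        # Step 6: output error terms and weight/bias corrections
        delta_k = (t - y) * f_prime(y_in)
        dw = alpha * np.outer(z, delta_k)
        dw0 = alpha * delta_k

        # Step 7: hidden error terms and weight/bias corrections
        delta_in = delta_k @ w[1:].T
        delta_j = delta_in * f_prime(z_in)
        dv = alpha * np.outer(x, delta_j)
        dv0 = alpha * delta_j

        # Step 8: update biases and weights
        w[0] += dw0; w[1:] += dw
        v[0] += dv0; v[1:] += dv

        sq_error += float(np.sum((t - y) ** 2))

    if sq_error < 0.01:                # Step 9: stopping condition (error threshold)
        break
```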


BackPropagation training cycle
1/ Feedforward of the input training pattern

2/ Backpropagation of the associated error

3/ Adjustment of the weights
Preprocessing and Post-processing
Preprocessing
• Extraction of features and/or feature selection
• Data smoothing if required
• Normalization if required

Post-processing
• Interpretation and analysis of results
Termination conditions
• Number of training iterations

• Mean Square Error

• Accuracy
Numerical Example
x1  x2  t
1   1   0
1   0   1
0   1   1
0   0   0
Activation Function:
Log Sigmoid
Initial weights V:
v1 = 0.23
v2 = 0.46
v3 = 0.46
v4 = 0.92

Initial weights W:
w1 = 0.023
w2 = 0.023

Network topology: 2×2×1
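As a sketch of how the first feedforward pass of this numerical example could be computed, the Python below uses the listed weights with the first training pair (x1 = 1, x2 = 1, t = 0). The assignment of v1-v4 and w1-w2 to particular connections, and the absence of bias weights, are assumptions, since the slide only lists the values.

```python
import math

def logsig(x):
    return 1.0 / (1.0 + math.exp(-x))

# Initial weights from the slide; the mapping to connections is an assumption:
# v1, v2 from x1 to the two hidden units, v3, v4 from x2; w1, w2 to the output.
v1, v2, v3, v4 = 0.23, 0.46, 0.46, 0.92
w1, w2 = 0.023, 0.023

x1, x2, t = 1, 1, 0                       # first training pair

# Hidden layer
z_in1 = x1 * v1 + x2 * v3
z_in2 = x1 * v2 + x2 * v4
z1, z2 = logsig(z_in1), logsig(z_in2)

# Output layer
y_in = z1 * w1 + z2 * w2
y = logsig(y_in)

print(z1, z2, y, t - y)                   # hidden outputs, network output, error
```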


Thanks
