Multilayer Perceptron (MLP)
• An MLP consists of an input layer, several hidden layers, and an output layer.
• Node i, also called a neuron, includes a summer and a nonlinear activation function g.
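As a minimal sketch of this structure, node i can be written directly in MATLAB; the weights w, input p, and bias b below are illustrative values, not from the original slides:
p = [0.5; -1.2; 0.3];   % input vector (illustrative)
w = [0.4, -0.1, 0.7];   % weight row vector (illustrative)
b = 0.2;                % bias (illustrative)
n = w*p + b;            % summer: weighted sum plus bias
a = tansig(n);          % nonlinear activation function g, here tansig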
Basics using MATLAB Neural Network Toolbox
• The MATLAB commands used in the procedure are newff (which takes the type of architecture, the size, and the type of training algorithm), train, and sim.
• newff: create a feed-forward backpropagation network.
• The MATLAB command newff generates an MLP neural network, which is called net.
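A minimal end-to-end sketch of this newff/train/sim workflow; the data and layer sizes below are illustrative assumptions, not from the slides:
P = 0:0.1:1;  T = sin(2*pi*P);                               % illustrative training data
net = newff([0 1], [5,1], {'tansig','purelin'},'traingd');   % create the MLP, called net
net1 = train(net, P, T);                                     % train with backpropagation
a = sim(net1, P);                                            % simulate the trained network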
Activation functions
• Purelin: linear, purelin(n) = n
• Satlin: saturating linear, clipped to [0, 1]
• Satlins: symmetric saturating linear, clipped to [−1, 1]
• Logsig: log-sigmoid, logsig(n) = 1/(1 + e^(−n))
• Tansig: hyperbolic tangent sigmoid, tansig(n) = tanh(n)
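To see the shapes of these functions, one can plot them directly; a minimal sketch, where the input range is an illustrative choice:
n = -5:0.1:5;
plot(n, purelin(n), n, satlin(n), n, satlins(n), n, logsig(n), n, tansig(n))
legend('purelin','satlin','satlins','logsig','tansig'); grid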
Training
• To create a network that can handle noisy input vectors, it is best to train the network on both ideal and noisy vectors. To do this, the network is first trained on ideal vectors until it has a low sum-squared error.
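A minimal sketch of this two-pass idea, assuming a network net and ideal data P, T already exist; the 0.1 noise level is an illustrative assumption:
net1 = train(net, P, T);                  % first pass: ideal vectors only
Pn = [P, P + 0.1*randn(size(P))];         % append noisy copies of the inputs
Tn = [T, T];                              % targets repeat for the noisy copies
net2 = train(net1, Pn, Tn);               % second pass: ideal and noisy vectors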
Basic flow diagram
[figure: flow diagram of the network design procedure]
Example-1
• Consider the humps function in MATLAB. It is given by:
y = 1/((x − 0.3)² + 0.01) + 1/((x − 0.9)² + 0.04) − 6
solution (a)
To obtain the data, use the following commands:
x = 0:.05:2; y=humps(x);
P=x; T=y;
plot(P,T,'x')
grid; xlabel('time (s)'); ylabel('output'); title('humps function')
[figure: the humps function, output vs. time (s)]
Step-1
Design the network
% DESIGN THE NETWORK
% ==================
% First try a simple one: a feedforward (multilayer perceptron) network
net=newff([0 2], [5,1], {'tansig','purelin'},'traingd');
% Here newff defines the feedforward network architecture.
% The first argument [0 2] defines the range of the input and initializes the network parameters.
% The second argument [5,1] defines the structure of the network. There are two layers.
% 5 is the number of nodes in the first hidden layer,
% 1 is the number of nodes in the output layer.
% Next the activation functions in the layers are defined.
% In the first hidden layer there are 5 tansig functions.
% In the output layer there is 1 linear (purelin) function.
% traingd defines the basic learning scheme: it is a network training function
% that updates weight and bias values according to gradient descent.
Step-2
Design the network
% Define learning parameters
net.trainParam.show = 50;       % The result is shown at every 50th iteration (epoch)
net.trainParam.lr = 0.05;       % Learning rate used in some gradient schemes
net.trainParam.epochs = 1000;   % Max number of iterations
net.trainParam.goal = 1e-3;     % Error tolerance; stopping criterion
%Train network
net1 = train(net, P, T);
[figure: training record, performance (blue) vs. goal (black) over 1000 epochs]
solution (a)
% Simulate how good a result is achieved: input is the same input vector P.
% Output is the output of the neural network, which should be compared with the output data T.
a= sim(net1,P);
[figure: network output and target vs. time, 0 to 2 s]
Step-4
solution (a)
The fit is quite bad, especially in the beginning. Increase the size of the network: use 20 nodes in the first hidden layer.
net=newff([0 2], [20,1], {'tansig','purelin'},'traingd');
Otherwise apply the same algorithm parameters and start the training process.
net.trainParam.show = 50;       % The result is shown at every 50th iteration (epoch)
net.trainParam.lr = 0.05;       % Learning rate used in some gradient schemes
net.trainParam.epochs = 1000;   % Max number of iterations
net.trainParam.goal = 1e-3;     % Error tolerance; stopping criterion
%Train network
net1 = train(net, P, T);
[figure: training record, performance (blue) vs. goal (black) over 1000 epochs]
• The error goal of 0.001 is not reached now either, but the situation has improved significantly.
solution (a)
% Simulate how good a result is achieved: input is the same input vector P.
% Output is the output of the neural network, which should be compared with the output data T.
a= sim(net1,P);
% Plot result and compare
plot(P,a-T,P,T); grid;
[figure: network output and error vs. time, 0 to 2 s]
Step-6
solution (a)
• Try Levenberg-Marquardt: trainlm. Use also a smaller network, with 10 nodes in the first hidden layer.
net=newff([0 2], [10,1], {'tansig','purelin'},'trainlm');
%Define parameters
net.trainParam.show = 50;
net.trainParam.lr = 0.05;
net.trainParam.epochs = 1000;
net.trainParam.goal = 1e-3;
%Train network
net1 = train(net, P, T);
[figure: training record; performance is 0.00099148, goal is 0.001, reached in 495 epochs]
Step-7
solution (a)
• Performance is now according to the tolerance specification.
%Simulate result
a= sim(net1,P);
%Plot the result and the error
plot(P,a-T,P,T)
xlabel('Time (s)'); ylabel('Output of network and error');
title('Humps function')
[figure: Humps function, output of network and error vs. time (s)]
Step-8
solution (a)
solution (b)
• RADIAL BASIS FUNCTION NETWORKS
• Here we would like to find a function which fits the 41 data points using a radial basis network. A radial basis network is a network with two layers. It consists of a hidden layer of radial basis neurons and an output layer of linear neurons. Here is a typical shape of a radial basis transfer function used by the hidden layer (radbas(n) = exp(−n²)):
p = -3:.1:3;
a = radbas(p);
plot(p,a)
[figure: the radbas transfer function, a bell curve rising from 0 to 1 and back over p = −3 to 3]
solution (b)
• We can use the function newrb to quickly create a radial basis network which approximates the function at these data points. Generate data as before:
x = 0:.05:2; y=humps(x);
P=x; T=y;
plot(P,T)
[figure: the humps data, T vs. P]
Step-1
solution (b)
net1 = newrb(P,T);
[figure: training record; performance is 822.585, goal is 0, after 25 epochs]
The general form is net = newrb(P,T,GOAL,SPREAD).
Step-2
solution (b)
• For humps the network training leads to singularity and therefore difficulties in training. Simulate and plot the result:
a= sim(net1,P);
plot(P,T-a,P,T)
[figure: network error and target vs. input, 0 to 2]
solution (b)
The plot shows that the network approximates humps, but the error is quite large. The problem is that the default values of the two parameters of the network are not very good: the defaults are goal (mean squared error goal) = 0.0 and spread (spread of the radial basis functions) = 1.0. In our example choose goal = 0.02 and spread = 0.1.
net1 = newrb(P,T,0.02,0.1);
a= sim(net1,P);
plot(P,T-a,P,T)
xlabel('Time (s)'); ylabel('Output of network and error');
title('Humps function approximation - radial basis function')
solution (b)
[figure: Humps function approximation - radial basis function; output of network and error vs. time (s)]
What is the significance of a small value of spread? What about a large one?
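One way to explore this question is to train two radial basis networks on the humps data with different spreads and compare the fits; a minimal sketch, where the spread values 0.01 and 1.0 are illustrative choices:
netSmall = newrb(P, T, 0.02, 0.01);   % small spread: narrow basis functions
netLarge = newrb(P, T, 0.02, 1.0);    % large spread: wide, overlapping basis functions
plot(P, sim(netSmall,P), P, sim(netLarge,P), P, T)
legend('small spread','large spread','target'); grid
Roughly, with a small spread each neuron responds only near its own center, which can give a spiky, overfitted fit; with a large spread the responses overlap and the fit is smooth but may miss sharp details.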
Example-2
Consider a surface described by z = cos(x) sin(y), defined on the square −2 ≤ x ≤ 2, −2 ≤ y ≤ 2.
solution
Generate data:
x = -2:0.25:2; y = -2:0.25:2;
z = cos(x)'*sin(y);
mesh(x,y,z)
xlabel('x axis'); ylabel('y axis'); zlabel('z axis');
title('surface z = cos(x)sin(y)');
gi=input('Strike any key ...');
Step-1
solution
[figure: the surface z = cos(x)sin(y), z axis over the x-y grid]
solution
Store data in input matrix P and output vector T:
P = [x;y]; T = z;
Use a fairly small number of neurons in the first layer, say 25, and 17 in the output layer (T = z is a 17×17 matrix, so one output neuron per column of z).
Initialize the network:
net=newff([-2 2; -2 2], [25 17], {'tansig' 'purelin'},'trainlm');
Apply the Levenberg-Marquardt algorithm:
%Define parameters
net.trainParam.show = 50;
net.trainParam.lr = 0.05;
net.trainParam.epochs = 300;
net.trainParam.goal = 1e-3;
%Train network
net1 = train(net, P, T);
gi=input('Strike any key ...');
Step-2
solution
[figure: training record; performance is 0.000229301, goal is 0.001, reached in 4 epochs]
solution
Simulate the response of the neural network and draw the corresponding surface:
a= sim(net1,P);
mesh(x,y,a)
[figure: mesh of the network output over the x-y grid]
Step-3
solution
The result looks satisfactory, but a closer look at the error surface is revealing:
% Error surface
mesh(x,y,a-z)
xlabel('x axis');
ylabel('y axis');
zlabel('Error');
title('Error surface')
[figure: error surface over the x-y grid]
Step-4
solution
Example-3
Consider Bessel functions Jα(t), which are solutions of the differential equation
t²y″ + ty′ + (t² − α²)y = 0
a) Plot J1(t).
b) Try different structures for fitting. Start with a two-layer network.
solution (Plot J1(t))
First generate the data. MATLAB has the Bessel functions as built-in functions.
t=0:0.1:20;
y=bessel(1,t);   % besselj(1,t) in newer MATLAB versions
plot(t,y)
grid
xlabel('time in secs');ylabel('y');
title('First order bessel function');
[figure: first order Bessel function, y vs. time in secs over 0 to 20]
Step-1
solution
Next try to fit a backpropagation network on the data. Try Levenberg-Marquardt.
P=t; T=y;
%Define network. First try a simple one
net=newff([0 20], [10,1], {'tansig','purelin'},'trainlm');
%Define parameters
net.trainParam.show = 50;
net.trainParam.lr = 0.05;
net.trainParam.epochs = 300;
net.trainParam.goal = 1e-3;
%Train network
net1 = train(net, P, T);
[figure: training record; performance is 0.0006501, goal is 0.001]
solution
% Simulate result
a= sim(net1,P);
%Plot result and compare
plot(P,a,P,a-T)
xlabel('time in secs');ylabel('Network output and error');
title('First order bessel function'); grid
[figure: first order Bessel function, network output and error vs. time in secs]
solution
Since the error is fairly significant, let's reduce it by doubling the nodes in the first hidden layer to 20 and decreasing the error tolerance to 1e-4.
P=t; T=y;
%Define network. First try a simple one
net=newff([0 20], [20,1], {'tansig','purelin'},'trainlm');
%Define parameters
net.trainParam.show = 50;
net.trainParam.lr = 0.05;
net.trainParam.goal = 1e-4;
%Train network
net1 = train(net, P, T);
[figure: training record; performance is 5.13247e-005, goal is 0.0001, reached in 4 epochs]
Step-2
solution
% Simulate result
a= sim(net1,P);
%Plot result and compare
plot(P,a,P,a-T)
xlabel('time in secs');ylabel('Network output and error');
title('First order bessel function'); grid
[figure: first order Bessel function, network output and error vs. time in secs]
solution
The result is considerably better, although it would still require improvement. This is left as a further exercise to the reader.
P=t; T=y;
%Define network. First try a simple one
net=newff([0 20], [40,1], {'tansig','purelin'},'trainlm');
%Define parameters
net.trainParam.show = 50;
net.trainParam.lr = 0.05;
net.trainParam.epochs = 300;
net.trainParam.goal = 1e-6;
%Train network
net1 = train(net, P, T);
[figure: training record; performance is 9.72949e-007, goal is 1e-006, reached in 104 epochs]
Step-3
solution
[figure: final fit of the first order Bessel function]
Block Set
The Neural Network Toolbox provides a set of blocks you can use to build neural networks in Simulink, or which can be used by the function gensim to generate the Simulink version of any network you have created in MATLAB.
Bring up the Neural Network Toolbox block set with this command:
neural
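For example, assuming a trained network net1 from one of the earlier examples, a Simulink version can be generated with (a minimal sketch):
gensim(net1)    % builds a Simulink model of the network net1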
Transfer Function Blocks
Double-click on the Transfer Functions block in the Neural window to bring up a window containing several transfer function blocks.
Each of these blocks takes a net input vector and generates a corresponding output vector whose dimensions are the same as those of the input vector.
Net Input Blocks
Double-click on the Net Input Functions block in the Neural window to bring up a window containing two net-input function blocks.
Each of these blocks takes any number of weighted input vectors, weight layer output vectors, and bias vectors, and returns a net-input vector.
Weight Blocks
Each of these blocks takes a neuron's weight vector and applies it to an input vector (or a layer output vector) to get a weighted input value for a neuron.
It is important to note that these blocks expect the neuron's weight vector to be defined as a column vector, because Simulink signals can be column vectors but cannot be matrices or row vectors.
Because of this limitation, you have to create S weight function blocks (one for each row) to implement a weight matrix going to a layer with S neurons.
Example-4
Check if it is possible to find a neural network model which produces the same behavior as the Van der Pol equation
x″ + (x² − 1)x′ + x = 0
or in state-space form
x₁′ = x₂
x₂′ = (1 − x₁²)x₂ − x₁
solution
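A minimal sketch of one possible approach, not necessarily the original slides' solution: generate training data by integrating the state-space form with ode45, then fit a feedforward network that maps the state to its derivative. The layer sizes and the initial state are illustrative assumptions.
vdp = @(t,x) [x(2); (1 - x(1)^2)*x(2) - x(1)];     % state-space form from above
[t, X] = ode45(vdp, 0:0.05:20, [2; 0]);            % one trajectory from x = [2; 0]
P = X';                                            % inputs: states x1, x2
T = [X(:,2), (1 - X(:,1).^2).*X(:,2) - X(:,1)]';   % targets: state derivatives
net = newff([min(P,[],2) max(P,[],2)], [10,2], {'tansig','purelin'},'trainlm');
net1 = train(net, P, T);                           % network model of the dynamics
a = sim(net1, P);                                  % compare a with T to judge the fit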