IN ELECTRONIC ENGINEERING
PROJECT REPORT
Acknowledgements
I would like to thank Ms Jennifer Bruton for her time and invaluable guidance during this
project. I would also like to thank my friends and family for their ordinary laughter and
most especially my mom for hanging in there with me.
Declaration
I hereby declare that, except where otherwise indicated, this document is entirely my own
work and has not been submitted in whole or in part to any other university.
Abstract
This project outlines the development of a neural network model for system identification. It
traces the growth of neural networks from their humble beginnings as single-layer
perceptrons to neural network models. Both multi-layer and recurrent network models are
examined and their merits as system identifiers discussed. The system chosen as a basis for
the empirical data collection is the anti-lock brake system, which exhibits highly non-linear
behaviour and lends itself to neural network modelling for system identification purposes.
The backpropagation algorithm is used in the development of the neural network. Until
recently, backpropagation neural networks made up 80% of all neural network applications
[1]. The use of backpropagation has declined due to the relatively long required training
times for the iterative algorithm. Genetic algorithms are discussed as a possible alternative.
Table of Contents
Acknowledgements
Declaration
Abstract
Table of Contents
Table of Figures
Introduction
1.1 Artificial Neural Networks
1.1.1 Background
1.1.2 How the Human Brain Learns
1.2 Artificial Neuron and Activation Function
1.2.1 Linear Activation Function
1.2.2 Non-Linear Activation Functions
1.2.3 Neural Network Matlab Toolbox
1.3 Summary
The Perceptron
2.1 Implementing a Single Layer Perceptron in Matlab
2.1.1 Single Layer Perceptron Designed without Neural Network Toolbox
2.1.2 Designing and Training using the Neural Network Toolbox
2.2 Multi-Layer Perceptron
2.2.1 Implementing a Multi-Layer Perceptron in Matlab - XOR Classification
2.3 Summary
Anti-Lock Braking System
4.1 ABS Model
4.1.1 Basic Steps of System Identification
4.1.2 The Simulink Model
4.1.3 Pseudo Random Binary Sequence Input
4.2 Data
4.2.1 Data Collection
Table of Figures
FIGURE 1 COMPONENTS OF BIOLOGICAL NEURON [2]
FIGURE 2 COMPONENTS OF THE SYNAPSE [2]
FIGURE 3 LINEAR ACTIVATION FUNCTION, EQUATION 1
FIGURE 4 LOG SIGMOID ACTIVATION FUNCTION, EQUATION 2
FIGURE 5 TAN-SIGMOID ACTIVATION FUNCTION, EQUATION 3
FIGURE 6 SINGLE LAYER PERCEPTRON ARCHITECTURE [6]
FIGURE 7 INPUT VECTORS OF THE SLP PLOTTED
FIGURE 8 CLASSIFICATION PLOT WITH NEW INPUT CORRECTLY PLOTTED IN RED
FIGURE 9 MULTI-LAYER PERCEPTRON ARCHITECTURE [6]
FIGURE 10 OUTPUT PLOT OF XOR INPUTS INTO SLP, WHICH WAS UNABLE TO PERFORM CLASSIFICATION
FIGURE 11 MODEL OF MLP THAT SOLVES EXOR CLASSIFICATION DIFFICULTIES
TABLE 1 THE XOR TRUTH TABLE
TABLE 2 TRUTH TABLE FOR THE NEURON WITH STRONG NEGATIVITY N1 AND THE NEURON WITH STRONG POSITIVITY N2 [5]
FIGURE 12 OVERALL CLASSIFICATION OF XOR PROBLEM [5]
FIGURE 13 TRAINING OF THE XOR NETWORK, MEAN SQUARE ERROR PLOT OVER 70 EPOCHS [5]
FIGURE 14 TRAINING OF XOR NETWORK WITH PERFORMANCE GOAL MET, MEAN SQUARE ERROR PLOT UNTIL CONVERGENCE [5]
FIGURE 15 ABS MODEL WITH PSEUDO RANDOM BINARY SEQUENCE INPUT
FIGURE 16 PSEUDO RANDOM BINARY SEQUENCE
FIGURE 17 A VISUAL REPRESENTATION OF INPUT AND OUTPUT DATA
FIGURE 18 PARALLEL IDENTIFICATION MODEL [3]
FIGURE 19 SERIES-PARALLEL IDENTIFICATION MODEL [3]
FIGURE 20 TRAINING PLOT OF TRAINGD
FIGURE 21 TRAINGDM PLOT WITH MOMENTUM CONSTANT OF 0.9
FIGURE 22 TRAINGDM WITH MU=0, TRAINING PLOT SIMILAR TO TRAINGD PLOT AS WEIGHT CHANGE BASED ON GRADIENT
FIGURE 23 TRAINING PLOT OF TRAINGDA
FIGURE 24 VARIABLE LEARNING RATE PLOTTED AGAINST EACH EPOCH ITERATION
FIGURE 25 TRAINLM PERFORMANCE TRAINING PLOT
FIGURE 26 TRAINING, TESTING AND VALIDATION DATA PLOT USING TRAINLM, HIGHLIGHTING OVER-FITTING
FIGURE 27 TRAINING, TESTING AND VALIDATION DATA PLOT USING TRAINGDM, HIGHLIGHTING OVER-FITTING
FIGURE 28 POST TRAINING ANALYSIS PLOT FOR TRAINLM ALGORITHM
FIGURE 29 POST TRAINING ANALYSIS PLOT FOR TRAINGDM ALGORITHM
FIGURE 31 POOR PERFORMANCE OF RECURRENT NEURAL NETWORK WITH TRAINLM ALGORITHM
FIGURE 32 TRAINGDX ALGORITHM PERFORMANCE PLOT
FIGURE 33 DETERIORATED PERFORMANCE OF RECURRENT NEURAL NETWORK
FIGURE 34 SUM SQUARE ERROR PLOT OF RECURRENT NETWORK WITH PRE AND POST PROCESSING IMPLEMENTED
FIGURE 35 THE BASIC CONCEPTS BEHIND GENETIC ALGORITHMS [7]
Chapter 1
Introduction
System identification, whether carried out with conventional or neural network methods, is the
development of a mathematical model of a dynamic system based on empirical data.
The choice of identifier structure is based on well-established results in linear systems theory
and can be applied with great success in the development of non-linear neural network
identifiers.
This is the basis of neural network system identification solutions and the
Before neural
network system identification and its merits are examined, artificial neural networks and their
concepts are described. The background and concepts behind artificial neural networks
are discussed along with the development of these simple structures into more complex
recurrent neural networks.
extensively for system modelling where the physical processes are not understood fully or
are highly complex.
The structure of the human brain neuron is the template for artificial learning. However,
lack of knowledge leads to approximations and assumptions of the general architecture of
an artificial neural network. The knowledge of neurons is incomplete and computing power
is limited so models are often idealisations of real networks of neurons.
equation 1
Plots of these
differentiable, non-linear activation functions are illustrated in figures 4 & 5. They are
commonly used in networks trained with backpropagation. The networks referred to in this
project are generally backpropagation models and they mainly use log-sig and tan-sig
activation functions. The logistic (log-sigmoid) activation function is defined by the equation
$$\mathrm{logsig}(x) = \frac{1}{1 + e^{-x}}$$
equation 2
The slope parameter of the sigmoid is normally 1, though it can be changed, which in turn
changes the shape of the sigmoid. As the slope parameter tends toward infinity the sigmoid
behaves more and more like a hard-limiter, where the slope of the sigmoid is zero. In this
case, where the slope is not zero, the output range is contained between 0 and 1.
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
equation 3
Tansig(x) runs faster than tanh(x) so it is a good choice when speed is an important factor.
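The two sigmoid functions can be compared numerically. The following is a minimal sketch in plain Matlab, written without the toolbox; the sample grid and variable names are illustrative assumptions. It evaluates the log-sigmoid of equation 2 directly and writes the tan-sigmoid in the algebraically equivalent single-exponential form 2/(1+exp(-2x)) - 1, then checks that this form agrees with tanh(x).

```matlab
% Evaluate both sigmoids on a small grid (the grid itself is an assumption)
x = linspace(-5, 5, 11);

% Log-sigmoid, equation 2: output lies strictly between 0 and 1
logsig_x = 1 ./ (1 + exp(-x));

% Tan-sigmoid written with a single exponential: output lies between -1 and 1
tansig_x = 2 ./ (1 + exp(-2 * x)) - 1;

% Largest difference from the built-in hyperbolic tangent
max_diff = max(abs(tansig_x - tanh(x)));
fprintf('max |tansig - tanh| = %g\n', max_diff);
```

The difference is at the level of floating point rounding, so the choice between the two forms is one of speed rather than of shape.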
1.3 Summary
Artificial neural networks use the central nervous system (CNS) of living creatures as the
basis for their architecture. The development of an artificial neural network requires an
activation function, which is either linear or non-linear. This function converts the
activation level of a unit into an output signal. An activation function is needed in every
neural network, including the single layer perceptron.
Chapter 2
The Perceptron
A single layer perceptron (SLP) is the simplest form of artificial neural network that can be
built, and this chapter discusses it in detail. An SLP consists of one or more
artificial neurons in parallel. Each neuron in the single layer provides one network output,
and is usually connected to all of the external inputs (Figure 6). The diagram below
illustrates a very simple neural network; it consists of a single neuron in the output layer.
There are n neurons in the input layer; each circle represents a neuron. The total input
stimulus to the neuron in the output layer is
$$z_{in} = \sum_{i=0}^{n} w_i x_i$$
equation 4
y = output of the neuron = f(z_in). The input x0 is a special input, referred to as the bias
input. Its value is normally fixed at +1. Its associated weight w0 is referred to as the bias
weight.
equation 5
y = output
w = weights
x = inputs
equation 6
$$e(k) = t(k) - y(k)$$
equation 7
e = error
= fixed value
t = target
A hard limit activation function is used to calculate y(k). This is a threshold activation
function; it is implemented in Matlab code using the sign function. The activation function
limits the output to between minus one and one. When an element is fed
through this function it returns one if the element is greater than zero, zero if it equals
zero, and minus one if it is less than zero. A zero output will never be produced if the target
is never set to zero. This network is in effect a binary output perceptron. It can only
classify input patterns that are linearly separable. Frank Rosenblatt first developed this
perceptron architecture in 1958 [3].
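The training procedure for a single layer perceptron can be sketched in plain Matlab without the toolbox. This is an illustrative sketch rather than the code used in the project: the learning rate value, the bipolar (-1/+1) coding of the AND targets and the weight update w = w + eta*e*x are assumptions (the update is the standard perceptron rule). The error follows the target-minus-output form of equation 7 and the activation is the sign function described above.

```matlab
% Training patterns for a two-input AND gate, one column per pattern
X = [0 0 1 1;
     0 1 0 1];
T = [-1 -1 -1 1];                 % bipolar targets, so a zero target never occurs

Xb  = [ones(1, size(X, 2)); X];   % prepend the bias input x0 = +1
w   = zeros(3, 1);                % [w0; w1; w2], where w0 is the bias weight
eta = 0.5;                        % fixed learning rate (assumed value)

for epoch = 1:100
    mistakes = 0;
    for k = 1:size(Xb, 2)
        y = sign(w' * Xb(:, k));  % hard-limit activation
        e = T(k) - y;             % error: target minus output
        if e ~= 0
            w = w + eta * e * Xb(:, k);   % perceptron weight update (assumed form)
            mistakes = mistakes + 1;
        end
    end
    if mistakes == 0
        break;                    % every pattern is classified correctly
    end
end

Y     = sign(w' * Xb);            % outputs for the four training patterns
y_new = sign(w' * [1; 0.7; 1.2]); % classify a new input [0.7; 1.2] (bias prepended)
```

Several passes through the data are needed before the loop exits, which matches the observation that one training cycle is not always enough.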
The network trains so that it behaves like an AND gate. The outputs are linearly separable,
so the network can classify them as a one or a zero, like binary logic. A classification line is
drawn across the plane, which is shown in blue in figure 8. If a new input is applied,
the newly trained network is simulated and classification of this new point occurs. In this
case the new input is [0.7; 1.2]; it is correctly classified as a one and shown in red on the
right side of the classification line.
After one training cycle of the network the correct classification is not always achieved. It
can take several training cycles or epochs to modify the weights until the correct
classification of the problem is achieved. This design structure is the basis for all artificial
neural networks. The SLP leads to the creation of multi-layer perceptrons, which are
structures of multiple single layer perceptrons.
network layer by layer. The computations performed by this feed forward network with a
single hidden layer, non-linear activation functions and a linear output layer, can be written
mathematically as
$$x = f(s) = B\varphi(As + a) + b$$
equation 8
s = inputs
x = outputs
$\varphi$ = non-linearity function.
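The computation in equation 8 can be traced with a small numerical example. The sketch below assumes that A and a are the weights and biases of the hidden layer, that B and b are those of the linear output layer, and that the non-linearity is tanh; all numerical values are hypothetical and chosen only to show the shapes involved.

```matlab
% Hypothetical network: 2 inputs, 3 hidden neurons, 1 output
A = [ 0.5 -0.3;
      0.8  0.2;
     -0.6  0.9];         % hidden-layer weights (3 x 2)
a = [0.1; -0.2; 0.05];   % hidden-layer biases  (3 x 1)
B = [0.7 -0.4 0.3];      % output-layer weights (1 x 3)
b = 0.2;                 % output-layer bias

s = [1; -1];             % one input vector

h = tanh(A * s + a);     % hidden layer: element-wise non-linearity
x = B * h + b;           % linear output layer, as in equation 8
```

Because each hidden output lies between -1 and 1, the output here is bounded by the sum of the absolute output weights plus the bias.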
It has been proven that this architecture can approximate any continuous function on a
compact set to any degree of accuracy. The multi-layer perceptron has been termed the
universal approximator. However, it is never known exactly how many hidden layers of
neurons will ensure optimum network convergence, or whether the weight matrix that
corresponds to that error goal can be found. These solutions are unique to each neural
network and the input and output data applied [4]. To begin with, the MLP architecture is
applied to the EXOR problem. Historically, it was this problem that first exhibited the
limitations of the SLP and also led to the development of more complex multi-layer
perceptrons. Minsky and Papert (1969) believed, in their intuitive judgement, that the
extension to multi-layer systems would be sterile.
inability of the SLP to classify the EXOR problem and other such linearly non-separable
problems [6].
Figure 10 Output plot of XOR inputs into SLP, which was unable to perform classification
A new MLP must be built using the newff() function [5]. This creates a new network,
which has an input layer, a hidden layer and an output layer (Figure 11).
Figure 11 Model of MLP that solves EXOR classification difficulties
The essence of this problem is to build a perceptron network that takes two Boolean inputs
and outputs the XOR of them. The XOR truth table is shown below in table 1.
X1    X2    Desired Output
0     0     0
0     1     1
1     0     1
1     1     0
Table 1 The XOR truth table
The first neuron is designed with strong negativity. The second neuron is designed with
strong positivity and the third neuron must discriminate between the two of them
X1    X2    N1    N2
Table 2 Truth table for the neuron with strong negativity N1 and the neuron with strong positivity N2 [5].
This problem is now linearly separable and classification can be achieved. Matlab produces
a plot of the overall classification (Figure 12). There is a classification line through (0,0) to
(1,1) indicating the output is 0 for both of these inputs and another classification line
through (0,1) to (1,0) indicating that both of these inputs produce 1 as an output.
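The three-neuron idea can be illustrated with hand-chosen weights. This is a sketch under stated assumptions, not the trained network from the project: N1 is read here as a neuron with negative input weights, N2 as a neuron with positive input weights, and all weights and thresholds below are hypothetical values picked by hand with a hard-limit activation.

```matlab
% The four XOR input patterns, one column per pattern
X = [0 0 1 1;
     0 1 0 1];

step = @(z) double(z >= 0);            % hard-limit activation (0 or 1)

% Hidden neurons (hand-chosen, hypothetical weights)
N1 = step(-X(1, :) - X(2, :) + 1.5);   % negative weights: off only for (1,1)
N2 = step( X(1, :) + X(2, :) - 0.5);   % positive weights: off only for (0,0)

% Output neuron discriminates between the two: on only when both are on
Y = step(N1 + N2 - 1.5);
```

In the (N1, N2) plane the four patterns become linearly separable, which is why a single output neuron can complete the classification.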
The neural network does not automatically classify the inputs correctly. When the input
data was first applied the output was incorrect; the network had to be trained to recognise
the inputs and perform as an XOR gate. The training takes place over 70 epochs.
Figure 13 shows that the minimum gradient was reached but the performance goal
was not met.
Figure 13 Training of the XOR network, mean square error plot over 70 epochs [5]
The network is trained again. This time a specific goal is set for the network to achieve.
This goal is 0.0037^2. It only takes four epochs for this goal to be achieved and correct
classification then takes place (Figure 14). The hidden layer behaves like a little black box,
hence its name, hidden layer.
approximated so it may behave slightly differently each time the network and training
algorithm are run, every time producing different results.
Figure 14 Training of XOR network with performance goal met, mean square error plot until convergence [5]
2.3 Summary
The single layer perceptron is the simplest form of artificial neural network. It is possible to
implement the SLP without the neural network toolbox but this perceptron is not as
powerful as one created with the toolbox. Using the toolbox the single layer perceptron can
perform classification on linearly separable inputs.
Chapter 4
Anti-Lock Braking System
The ABS model is a demo model found in Matlab Simulink. It, like many other models,
can be modelled or identified using multi-layer perceptrons. A typical anti-lock braking
system senses when wheel lock-up is about to occur. It then releases the brakes for a very
short time and reapplies the brakes when the wheel spins up again. ABS greatly reduces the
possibility of skidding during hard braking. ABS also lets the driver steer during braking.
This ability to steer during braking is one of the main benefits of ABS; in a hard braking
situation without ABS the wheels may skid and at times lose traction between the tires
and road, which could result in accidents. Neural networks have already been used with
great success to develop a genetic neural fuzzy controller. This controller finds the optimal
wheel slips that maximize the road adhesion coefficient [7]. The anti-lock brake system
lends itself to neural network modelling and fuzzy logic control because of its need to
constantly alter its response to variations in its inputs [8].
select and estimate the model structures used to build the neural network,
These are the steps followed in the development of multi-layer perceptrons for the ABS
model.
controlled input into the ABS model. The input is either 0 or 1, which provides random
excitation. The input must be persistently exciting so that the training set is representative
of the entire class of inputs that may excite the system.
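A pseudo random binary sequence of this kind can be generated with a linear feedback shift register. The sketch below is plain Matlab and is independent of the Simulink block used in the project; the register length, the feedback stages and the starting state are assumptions chosen for illustration.

```matlab
n     = 7;               % shift-register length (assumed)
taps  = [7 6];           % stages fed back; a maximal-length choice for n = 7
state = ones(1, n);      % any non-zero starting state will do
N     = 2^n - 1;         % one full period of the sequence (127 samples)
prbs  = zeros(1, N);

for k = 1:N
    prbs(k) = state(n);                                  % output the last stage
    fb      = mod(state(taps(1)) + state(taps(2)), 2);   % XOR of the tapped stages
    state   = [fb, state(1:n-1)];                        % shift by one stage
end
```

Over one period the sequence contains almost equal numbers of ones and zeros, which is part of what makes it a useful excitation signal.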
4.2 Data
4.2.1 Data Collection
Data must be collected from the model during simulation. Input and output data is collected
using a workspace sink. The data is categorised under three headings: training data,
testing data and validation data. Generally 60% of the input and output data is used for
training, 20% is used for testing and 20% is used for validation. However, previous research
using neural networks for feature extraction and temporal segmentation of acoustic signals
used 80% of the collected data for training the system and 20% for testing [9]. Testing and
validation of a network are very important aspects of developing an effective neural network,
so when modelling the ABS 60% of the data is used for training, and the validation and testing
data files are each created from 20% of the collected data. This set of data is reused each
time a new training algorithm is implemented to ensure that training, testing and validation
conditions remain constant throughout the experiment. Any variations in results
can then only be related to the algorithms or the network's architecture, as opposed to a different
input data sequence and its corresponding output data.
load input_data
load output_data
Figure 17 illustrates the response of the ABS to the random excitation signal input. This is
a visual representation of a small section of the data that is loaded before the network can be
run or trained.
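The 60/20/20 division described in Section 4.2.1 can be sketched as follows. The signals here are stand-ins generated in the script (they are not the ABS data), and the variable names and the use of contiguous blocks are assumptions made for illustration.

```matlab
N = 1000;                          % hypothetical number of logged samples
u = double(mod(1:N, 7) < 3);       % stand-in binary input (not the real PRBS)
y = filter(0.1, [1 -0.9], u);      % stand-in output (not the ABS model)

nTrain = round(0.6 * N);           % 60% for training
nTest  = round(0.2 * N);           % 20% for testing
nVal   = N - nTrain - nTest;       % remaining 20% for validation

% Contiguous blocks keep the time ordering of the input/output pairs
idxTrain = 1:nTrain;
idxTest  = nTrain + (1:nTest);
idxVal   = nTrain + nTest + (1:nVal);

train_in = u(idxTrain);  train_out = y(idxTrain);
test_in  = u(idxTest);   test_out  = y(idxTest);
val_in   = u(idxVal);    val_out   = y(idxVal);
```

Saving these three sets once and reloading them for every training algorithm keeps the comparison between algorithms fair.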
4.3 Summary
The ABS model exhibits a high level of non-linearity; this is the main reason for its choice
as a model for system identification. It is also easy to modify the Simulink model so that a
PRBS input can be applied. Data is collected from the ABS Simulink model and this data is
applied to the design of the neural network.
identification process.
Chapter 5
Building Neural Network the design detail
System identification is carried out in phases. The first is the data collection process, which
is outlined in the previous section (Section 4.2 Data). Next, this data is processed to filter it and
remove any outliers. Processing can improve the overall performance of a model [11]. A
model structure is selected and the best parameters for this structure are computed.
The model's properties and convergence results are examined and analysed. The Matlab neural
network toolbox provides all the necessary functions to ensure these procedures can be
followed.
Scaling
The function premnmx() is used to scale the inputs and targets so they fall within the
(-1,1) range. The network is then trained to produce outputs in the (-1,1) range.
These are converted back into the same units that were used for the original targets using
postmnmx().
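The scaling step can be written out in plain Matlab to show what the mapping does. This sketch does not call the toolbox; it assumes the usual linear min-max mapping onto the range -1 to 1, and the data values are hypothetical.

```matlab
t = [12 30 45 18 60];                           % hypothetical targets in original units

mint = min(t);                                  % smallest target value
maxt = max(t);                                  % largest target value

tn     = 2 * (t - mint) / (maxt - mint) - 1;    % scale into the range -1 to 1
t_back = (tn + 1) * (maxt - mint) / 2 + mint;   % undo the scaling after simulation
```

The minimum and maximum must be stored at scaling time, since the same two values are needed to convert network outputs back to the original units.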
Mean and Standard Deviation
The second approach is normalisation of the mean and standard deviation of the training set.
This is done using prestd(). It normalises the inputs and targets so they will have zero mean
and unity standard deviation. The outputs are converted back into the same units that are
used for the original targets using poststd.
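The mean and standard deviation approach can be sketched in the same way. Again this is plain Matlab with hypothetical data, not a call to prestd() or poststd(); it subtracts the mean and divides by the standard deviation, then reverses both steps.

```matlab
p = [2 4 4 4 5 5 7 9];            % hypothetical input values in original units

meanp = mean(p);                  % mean of the training set
stdp  = std(p);                   % standard deviation of the training set

pn     = (p - meanp) / stdp;      % zero mean, unity standard deviation
p_back = pn * stdp + meanp;       % convert back to the original units
```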
The series-parallel identification structure does not use feedback (Figure 19). Instead, it
uses the actual plant output to predict future outputs. Static backpropagation is used, and
stability and convergence are generally guaranteed with this method [3].
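In a series-parallel arrangement the network inputs at each time step are delayed values of the measured plant output and of the plant input, so the training pairs can be assembled before any network is involved. The sketch below builds such a regressor matrix in plain Matlab; the number of delays, the stand-in signals and the variable names are assumptions.

```matlab
u = double(mod(1:50, 7) < 3);      % stand-in plant input (not the real PRBS)
y = filter(0.2, [1 -0.5], u);      % stand-in plant output (not the ABS model)

ny = 2;                            % number of past outputs used (assumed)
nu = 2;                            % number of past inputs used (assumed)
N  = numel(y);
k0 = max(ny, nu) + 1;              % first step with a full set of delayed values

Phi = zeros(ny + nu, N - k0 + 1);  % one column of network inputs per time step
for k = k0:N
    % Measured plant outputs y(k-1), y(k-2) and inputs u(k-1), u(k-2)
    Phi(:, k - k0 + 1) = [y(k-1:-1:k-ny)'; u(k-1:-1:k-nu)'];
end
target = y(k0:N);                  % the value the network should predict
```

Because every column uses measured outputs rather than the network's own predictions, there is no feedback loop and static backpropagation can be applied to the pairs (Phi, target).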
Like a normal system identification model a neural network model structure is defined by
inputs but also by the neural network architecture. This architecture includes the type of
network, hidden layers and hidden nodes. In this case the series-parallel identification
model is used as the neural network model structure. This is because of its high level of
stability and convergence success and because of its ability to be used off line [10].
trained. The procedure used to adjust the weights of the links so as to produce the
desired output is known as training the network.
Backpropagation involves
variations to the basic training algorithm of the back propagation neural network. These
variation algorithms are the basis of test procedures evaluating the overall most effective
way to model the ABS.
MLP Code
equation 9
k = learning rate
This function is known as the steepest descent (gradient descent) training function. The changes to
the weights and biases are obtained by multiplying the learning rate by the negative
gradient. The higher the learning rate, the larger the step taken. If the learning rate is set too
large the algorithm can become unstable; if the learning rate is set too small the
algorithm will take too long to converge.
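The effect of the learning rate can be seen on a one-dimensional example. The sketch below is not the toolbox implementation: it assumes the update w = w - lr * gradient and uses the stand-in error surface E(w) = w^2, with three hypothetical learning rates.

```matlab
grad  = @(w) 2 * w;          % gradient of the stand-in error surface E(w) = w^2
rates = [0.001 0.1 1.1];     % too small, reasonable, too large (assumed values)
w_end = zeros(size(rates));

for i = 1:numel(rates)
    w = 1;                               % same starting weight for each run
    for epoch = 1:50
        w = w - rates(i) * grad(w);      % step along the negative gradient
    end
    w_end(i) = w;                        % weight after 50 epochs
end
```

After 50 epochs the smallest rate has barely moved from the start, the middle rate is close to the minimum at zero, and the largest rate has grown in magnitude, which is the instability described above.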
Traingd implements the steepest descent algorithm. Figure 20 shows the training plot of
The performance of the network is measured in this case by the mean square error
(mse).
network to respond to the local gradient and recent trends in the error surface. Momentum
helps the network avoid becoming stuck in a local minimum. The momentum constant, mu,
is a number between 0 and 1. The training plot in figure 21 shows the ABS data
modelled with the traingdm algorithm with a momentum constant of 0.9.
When the momentum constant is 1 the new weight change is set equal to the last weight change and
the gradient is simply ignored. When the momentum constant is 0 the weight change is
based solely on the gradient and traingdm simply behaves as the traingd algorithm
would (Figure 22).
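The two limiting cases of the momentum constant can be checked on a small example. This is a sketch under assumptions, not the toolbox code: the weight change is taken as dw = mc * dw_previous - (1 - mc) * lr * gradient, which is one form consistent with the description above, and the error surface E(w) = w^2 is a stand-in.

```matlab
grad  = @(w) 2 * w;        % gradient of the stand-in error surface E(w) = w^2
lr    = 0.1;               % learning rate (assumed)
mcs   = [0 0.9 1];         % momentum constants to compare
w_end = zeros(size(mcs));

for i = 1:numel(mcs)
    mc = mcs(i);
    w  = 1;                % starting weight
    dw = 0;                % previous weight change
    for epoch = 1:50
        dw = mc * dw - (1 - mc) * lr * grad(w);   % blend of old change and gradient
        w  = w + dw;
    end
    w_end(i) = w;
end

% Plain gradient descent for comparison with the mc = 0 case
w_gd = 1;
for epoch = 1:50
    w_gd = w_gd - lr * grad(w_gd);
end
```

With mc = 0 the result matches plain gradient descent, and with mc = 1 the weight never moves because the initial weight change is zero and the gradient is ignored.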
Figure 22 Traingdm with mu=0, training plot similar to Traingd plot as weight change based on gradient
Traingda implements the steepest descent training function with a variable learning rate. If
the learning rate is set too large the algorithm can oscillate and become unstable, but if it is
set too small the algorithm will take too long to converge. With Traingda the learning rate
is allowed to change during the training process in response to the
complexity of the local error surface. This procedure increases the learning rate, but only to
the extent that the network can learn without large error increases. Near optimal learning is
achieved for the local terrain. When a larger learning rate could still result in stable learning the
learning rate is increased; when the learning rate is too high to guarantee a decrease in error
it is decreased until stable learning is achieved again. In figure 23 the minimum gradient
is reached by epoch 66, so the learning rate variation and training stop at this epoch.
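The variable learning rate rule can be sketched on the same kind of one-dimensional example. This is an illustrative sketch rather than the toolbox implementation: the increase factor, decrease factor and permitted error increase below are assumed values, and the error surface E(w) = w^2 is a stand-in for the network error.

```matlab
perf = @(w) w .^ 2;        % stand-in error surface
grad = @(w) 2 * w;         % its gradient

lr           = 0.01;       % initial learning rate (assumed)
lr_inc       = 1.05;       % factor applied when the error decreases (assumed)
lr_dec       = 0.7;        % factor applied when a step is rejected (assumed)
max_perf_inc = 1.04;       % largest error increase that is tolerated (assumed)

w       = 1;
lr_hist = zeros(1, 100);   % learning rate after each epoch

for epoch = 1:100
    w_new = w - lr * grad(w);                  % tentative gradient step
    if perf(w_new) > max_perf_inc * perf(w)
        lr = lr * lr_dec;                      % error grew too much: reject the step
    else
        if perf(w_new) < perf(w)
            lr = lr * lr_inc;                  % error fell: try a larger rate next
        end
        w = w_new;                             % accept the step
    end
    lr_hist(epoch) = lr;
end
```

Plotting lr_hist against the epoch number gives a curve of the same kind as figure 24: the rate climbs while learning is stable and is cut back when a step would increase the error too much.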
The increase in learning rate is plotted in figure 24. The learning rate increase terminates at
epoch 66 when the training stops. The training plots produced with these steepest
descent algorithms all achieve a performance of around 0.477 mse. Another type of algorithm,
Trainlm, is also implemented; however, it does not affect the mse performance.
overcome the problems of having to compute the Hessian matrix (second derivatives) of the
performance index at the current values of weights and biases. This algorithm appears to be
a faster method for training moderate size feed-forward neural networks. In this case the
time elapsed for training is just 9.0470 seconds, compared with approximately 15 seconds
for the training algorithm Traingda. Trainlm
is a very efficient Matlab implementation, since the solution of the matrix equation is a built-in
function, so its attributes become even more pronounced in a Matlab setting [11].
The performance of the network remains constant at approximately 0.477 mse. Even with
the use of the most efficient training algorithm in the toolbox the performance is unchanged.
It is concluded that the efficiency of this algorithm relates to time: it completes
training more rapidly than the other algorithms.
Figure 26 Training, testing and validation data plot using Trainlm, highlighting over-fitting
Figure 27 Training, testing and validation data plot using Traingdm, highlighting over-fitting
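% Convert the scaled network outputs back to the original target units
a = postmnmx(Y, mint, maxt);
% Linear regression of network outputs against targets: slope m, intercept b, R-value r
[m, b, r] = postreg(a(2,:), teach2(2,:));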
m and b correspond to the slope and the y-intercept of the best linear regression relating
targets to the network outputs. If there was a perfect fit, i.e. if the outputs exactly equalled the
targets, the slope would be 1 and the y-intercept would be 0. The third variable returned,
the R-value, is the correlation coefficient between the outputs and targets. It is a measure of how well
the variation in output is explained by the targets. If this number is equal to 1, then there is
perfect correlation between targets and outputs. These are the post training analysis outputs
for the Trainlm algorithm.
m= 6.9188e-005
b=63.2698
r=0.0083
The R-value is extremely low and indicates a very poor linear fit, which is shown in figure
28. A similar plot is obtained for the Traingdm algorithm indicating the overall weakness of
the neural network to perform system identification (Figure 29). The R-value in this case is
a negative number but the system still exhibits poor linear fit.
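The meaning of the three regression values can be illustrated with base Matlab functions. The sketch below does not call postreg(); it fits a straight line with polyfit and computes the correlation with corrcoef, using hypothetical targets and two hypothetical sets of outputs, one equal to the targets and one almost unrelated to them.

```matlab
t         = [1 2 3 4 5];             % hypothetical targets
a_perfect = t;                       % outputs exactly equal to the targets
a_poor    = [3 3.1 2.9 3 3];         % outputs almost unrelated to the targets

% Perfect fit: slope 1, intercept 0, correlation 1
c_perfect = polyfit(t, a_perfect, 1);    % c(1) is the slope m, c(2) the intercept b
R_perfect = corrcoef(t, a_perfect);      % R(1,2) is the correlation coefficient

% Poor fit: slope near 0, intercept near the mean output, correlation near 0
c_poor = polyfit(t, a_poor, 1);
R_poor = corrcoef(t, a_poor);
```

The poor case has the same pattern as the results reported above: a slope close to zero, an intercept close to the average output and a small R-value, which may be slightly negative.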
The analysis of this system indicates a very poorly functioning neural network identifier.
This system could be improved by changing the architecture. This could be done by adding
more hidden layers and increasing the number of input neurons. The actual optimum
structure is achieved through trial and error. Some changes are made to the structure but a
significant improvement in performance is not observed. The hidden layer of the
network designed with the Trainlm algorithm is increased from 5 neurons to 22 neurons.
The output training performance plot shows no significant change (Figure 30).
5.3 Summary
The building of a neural network follows a number of systematic procedures. Adherence to
these procedures does not necessarily guarantee a highly effective neural network model.
The model requires rigorous testing to obtain the optimal architecture, as the number of
hidden layers and the number of neurons in each layer determine the performance of the network. The
MLP architecture in this study does not reach its optimal potential. However, the structure
provides the basis for a recurrent neural network.
Chapter 6
Recurrent Neural Networks
6.1 Structure of Recurrent Neural Network design detail
Although multi-layer networks and recurrent neural networks have different structures they
may be viewed similarly. The networks have the potential to be used in unison in systems
with dynamic elements and feedback [10]. In effect, recurrent neural networks used for
identification or model based predictive control are multi-layer neural networks with a delay
element in their feedback loop. Recurrent neural networks could be built with multi-layer
networks in their feedback loop, creating a system where the structures compute in tandem.
Hence the networks could be used in unison, creating systems with both dynamic elements
and feedback. This is beyond the scope of the structures examined and tested with the ABS
data; multi-layer perceptrons and recurrent neural networks were tested as separate entities
and their results compared. There are two recurrent neural network structures available in the Matlab
neural network toolbox: the Hopfield and the Elman structure. The Elman structure is
chosen as the architecture of the recurrent network used to model ABS. This choice is made
because the Hopfield architecture is seldom used in practice; even the best Hopfield designs
may have spurious results that can lead to incorrect answers [11]. Elman networks are two-layer
backpropagation networks with the addition of a feedback connection from the output
of the hidden layer to its input.
neural network architecture. Elman Code is an example of the code used to test the Elman
structure.
Elman Code
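% net is assumed to have been created earlier (creation code not shown here)
net.trainParam.epochs=300;    % maximum number of training epochs
net.trainParam.show=5;        % display progress every 5 epochs
net.trainParam.goal=0.01;     % performance goal
net.performFcn='sse';         % sum squared error performance function
% Scale inputs (teach1) and targets (teach2) before training
[pn,minp,maxp,tn,mint,maxt]=premnmx(teach1,teach2);
% Convert the scaled matrices to sequences for the recurrent network
pnseq=con2seq(pn);
tnseq=con2seq(tn);
[net,tr]=train(net,pnseq,tnseq);
toc                           % elapsed time; needs a matching tic before this code
hold on;
semilogy(tr.epoch,tr.perf)    % training error on a logarithmic axis
title('Sum squared error of Elman Network')
xlabel('Epoch')
ylabel('Sum squared error')
Y=sim(net,pnseq);             % simulate the trained network on the training inputs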
The recurrent connection present in the Elman network allows the network to detect and
generate time-varying patterns. The Elman structure differs from conventional two-layer
networks in that the first layer has the recurrent connection. The delay in this connection
stores values from the previous time step, which can be used at the current time step. This
property may cause results to appear inconsistent: even if two Elman networks with
the same weights and biases are given identical inputs at a given time step, their outputs can
be different due to different feedback states. The network has proved effective at storing
information for future reference, and that is why it is tested for identification of the ABS
model. Different training algorithms are tested and the results compared with the
multi-layer structures.
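The dependence of the output on the feedback state can be shown with a small base Matlab sketch. It uses a single hidden neuron with scalar weights; all values are illustrative assumptions and are unrelated to the ABS data.

```matlab
% Two copies of one Elman-style neuron with identical weights but
% different input histories. All values are illustrative assumptions.
W_in = 0.5; W_rec = 0.8; W_out = 1.0;
step = @(x, a_prev) tanh(W_in * x + W_rec * a_prev);

histA = [ 1  1  1];    % inputs previously seen by network A
histB = [-1 -1 -1];    % inputs previously seen by network B
aA = 0; aB = 0;        % both feedback states start at zero
for t = 1:3
    aA = step(histA(t), aA);
    aB = step(histB(t), aB);
end

x_now = 0.2;                       % identical input at the current step
yA = W_out * step(x_now, aA);
yB = W_out * step(x_now, aB);
fprintf('yA = %.4f, yB = %.4f\n', yA, yB);
```

Both copies receive the same input at the final step, but their outputs differ because the stored hidden values differ. Any comparison between Elman networks therefore has to start them from the same feedback state.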
6.2.2 Results
Trainlm is the first algorithm used to train the network, as it is the quickest of all the
algorithms. Because it tends to proceed so rapidly, it does not necessarily do well when
implemented in Elman structures. Its speed is also relative: the algorithm takes 75.0630
minutes to run 100 epochs on the Elman network, compared with a run time of 28.6410
seconds for the trainlm algorithm on the multi-layer network. The performance results were
also very poor: the mean square error performance measurement was 3954.82. Figure 31
highlights the network's poor performance.
2.7670e+003 minutes to run 100 epochs, which is significantly longer than the trainlm run
time of 75.0630 minutes. Its error performance is only slightly better than that of the
trainlm algorithm up to the point where the maximum epoch is reached.
These results were inadequate, so pre- and post-processing is implemented to see if
improvements can be made. First, the input and target data are normalised so that they
have zero mean and unity standard deviation. After training, the inputs and outputs are
scaled back into the original units. This does not improve performance; in fact, Figure 33
highlights that performance has deteriorated.
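The normalisation step can be sketched in base Matlab as follows. The data matrix is an illustrative assumption (one row per signal, one column per sample), and the sketch shows the arithmetic only; the report does not name the toolbox routine that was used, although the toolbox of the time is understood to provide prestd() and poststd() for this purpose.

```matlab
% Zero-mean, unit-standard-deviation normalisation and its inverse.
% P is illustrative data, not the ABS data.
P = [ 1  2  3  4  5;
     10 20 30 40 50];
N = size(P, 2);

meanp = mean(P, 2);                  % mean of each row
stdp  = std(P, 0, 2);                % standard deviation of each row

% Pre-processing: subtract the mean, divide by the standard deviation
Pn = (P - repmat(meanp, 1, N)) ./ repmat(stdp, 1, N);

% ... the network would be trained and simulated on Pn here ...

% Post-processing: return to the original units
P_back = Pn .* repmat(stdp, 1, N) + repmat(meanp, 1, N);
```

The same stored mean and standard deviation must be reused for any new data presented to the trained network, otherwise the network sees inputs on a different scale from those it was trained on.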
A second type of pre- and post-processing, scaling, is implemented because of the lack of
success with the mean and standard deviation method. The function premnmx() scales the
data to the range [-1, 1] for training and postmnmx() converts the data back to its original
units after the algorithm has run. The resultant plot, shown in Figure 34, does not show any
significant difference in performance compared with the case in which mean and standard
deviation processing was carried out on the data.
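The scaling performed by this second method can be sketched in base Matlab. This reproduces the arithmetic of min-max scaling to [-1, 1] and its inverse rather than calling premnmx() and postmnmx() themselves; the data matrix and the variable name span are illustrative assumptions.

```matlab
% Min-max scaling of each row to [-1, 1] and its inverse.
% P is illustrative data, not the ABS data.
P = [  0   5  10;
     100 150 300];
N = size(P, 2);

minp = min(P, [], 2);                % minimum of each row
maxp = max(P, [], 2);                % maximum of each row
span = maxp - minp;                  % width of each row's range

% Pre-processing: map [minp, maxp] onto [-1, 1]
Pn = 2 * (P - repmat(minp, 1, N)) ./ repmat(span, 1, N) - 1;

% Post-processing: map [-1, 1] back onto [minp, maxp]
P_back = 0.5 * (Pn + 1) .* repmat(span, 1, N) + repmat(minp, 1, N);
```

A row whose values are all equal has zero span and would cause a division by zero, so constant signals need to be handled separately in this sketch.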
Figure 34 Sum square error plot of recurrent network with pre and post processing implemented
the backpropagation training algorithm because it is not based on error gradient and does
not require as much computational time when the neuron number is high [12].
Development of genetic algorithms for identification and training purposes is a relatively
new direction and could produce extremely interesting results.
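To make the idea of gradient-free training concrete, the following is a minimal sketch of a generic genetic algorithm in base Matlab. It is not the method of [12] and was not part of the testing in this project. It evolves the three weights of a single tanh neuron on the bipolar AND gate data of Appendix 1; the population size, generation count, mutation strength and the names fitness, elite and children are all illustrative assumptions.

```matlab
% Training data: bipolar AND gate, bias input in row 1 (as in Appendix 1)
p   = [ 1  1  1  1;
       -1 -1  1  1;
       -1  1 -1  1];
tau = [-1 -1 -1  1];

% Fitness: sum squared error of one tanh neuron with weight column w.
% No gradient of this error is ever computed.
fitness = @(w) sum((tau - tanh(w' * p)).^2);

popSize = 20; nGen = 30; sigma = 0.3;    % illustrative settings
half = popSize / 2;
pop  = randn(3, popSize);                % random initial population
best = zeros(1, nGen);                   % best error in each generation

for g = 1:nGen
    err = zeros(1, popSize);
    for i = 1:popSize
        err(i) = fitness(pop(:, i));
    end
    [err, order] = sort(err);            % rank: lowest error first
    pop = pop(:, order);
    best(g) = err(1);

    elite = pop(:, 1:half);              % selection: keep the better half
    children = zeros(3, half);
    for i = 1:half
        ma = elite(:, randi(half));      % two randomly chosen parents
        pa = elite(:, randi(half));
        mask = rand(3, 1) > 0.5;         % uniform crossover
        child = ma;
        child(mask) = pa(mask);
        children(:, i) = child + sigma * randn(3, 1);   % mutation
    end
    pop = [elite, children];             % parents survive unchanged
end
fprintf('Best error in final generation: %.4f\n', best(end));
```

Because the better half of each generation is carried over unchanged, the best error can never increase from one generation to the next. The cost is one fitness evaluation per individual per generation, so the method trades gradient computation for repeated forward passes.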
Development of the project in the future will not be limited to the use of genetic algorithms
and the improvement of the structures which use backpropagation. The architecture may also
be developed to include both multi-layer and recurrent networks, hence maximising the
strength of each of the individual architectures in one unified unit. The multi-layer network's
strength lies in its success at pattern recognition problems, and the recurrent network's
strength lies in its solution of optimisation problems. The Matlab toolbox has proved a very
powerful tool for building each of the architectures separately; its capabilities may be
investigated and perhaps extended to build a more complex model. In this study the
development of research and testing has been progressive. It traces the development of the
SLP through its growth into recurrent networks. Testing highlights the flaws in all the
architectures, such as the SLP's inability to perform non-linear classification, the MLP's poor
error performance, and the recurrent network's poor error performance and long training
durations. Possible solutions are offered and interesting future directions are discussed in
the form of genetic algorithm development and architecture modification.
References
[1] Bruce D. Baker & Craig E. Richards (In Press), Exploratory application of neural networks to school finance: forecasting educational spending.
[2]
[3] J. Wesley Hines (1997), Fuzzy and Neural Approaches in Engineering, A Wiley-Interscience Publication, John Wiley & Sons, Inc.
[4]
[5]
[6]
[7] Yonggon Lee & Stanislaw H. Zak (2001), Designing a Genetic Neural Fuzzy Anti-Lock Brake System Controller, IEEE Transactions on Evolutionary Computation.
[8] W.K. Lennon & K.M. Passino (1995), Intelligent control for brake systems, IEEE Transactions on Fuzzy Systems, Vol. 3, 381-388.
[9]
[10]
[11] http://www.mathworks.com/access/helpdesk/help/helpdesk.shtml
[12]
Appendix 1
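%The AND gate problem again, this time with 12 weight updates
%(3 passes over the 4 training patterns)
clear
w1=[0 1 -1]';       %initial weights: bias weight, then input weights
b=1;                %constant bias input
k=1;
x1=[-1 -1]';        %bipolar input patterns
x2=[-1 1]';
x3=[1 -1]';
x4=[1 1]';
tau1=-1;            %bipolar targets: +1 only when both inputs are +1
tau2=-1;
tau3=-1;
tau4=1;
tau=[tau1 tau2 tau3 tau4];
p=[[b;x1] [b;x2] [b;x3] [b;x4]];    %one augmented pattern per column
mu=0.2;             %learning rate
%First update written out explicitly (the loop below repeats it)
new_w(:,k)=w1;
y(k)=sign(w1'*p(:,k));
e(k)=tau(:,k)-y(k);
new_w(:,k+1)=w1+(mu*e(k)*p(:,k));
%Perceptron learning rule: column k+i of new_w holds the weights
%used for pattern i of the current pass
k=0;
while k<12
    for i=1:4
        y(i)=sign(new_w(:,k+i)'*p(:,i));                %neuron output
        e(i)=tau(:,i)-y(i);                             %error
        new_w(:,k+i+1)=new_w(:,k+i)+(mu*e(i)*p(:,i));   %weight update
    end
    k=k+4;
end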
Appendix 2
P=[-0.5 -0.5 0.3 0.1;
   -0.5 0.5 -0.5 1.0];          %inputs
T=[0 0 0 1];                    %targets
plotpv(P,T);                    %vectors plotted
net=newp(minmax(P),1);          %network created with one layer (slp)
plotpv(P,T);                    %vectors replotted with networks
net.b{1}=1;                     %bias
plotpc(net.IW{1},net.b{1});     %attempt at classification
Appendix 3
%Designing Neural Network
close all % close all open figures
clear all % clear all old variables, to reduce the risk of confusing errors
tic;
load input_data
load output_data
teach1=teach1'; % transpose so that each column is one sample
teach2=teach2';
net = newff(minmax(teach1),[5,2],{'tansig' 'purelin'},'traingd');
%Training the Neural Network
net=init(net);
Y = sim(net,teach1); % output of the untrained network
[pn,minp,maxp,tn,mint,maxt] = premnmx(teach1,teach2); % scale to [-1,1]
%net.trainParam.show=5;
net.trainParam.epochs=200;
net.trainParam.lr=0.02;
[net,tr]=train(net,pn,tn); % tr holds the training record
Yn = sim(net,pn); % simulate on the scaled inputs used for training
Y = postmnmx(Yn,mint,maxt); % convert the outputs back to original units
%plot(tr.epoch,tr.perf,tr.epoch,tr.vperf,tr.epoch,tr.tperf)
%legend('Training','Validation','Test',-1);
%ylabel('Squared Error'); xlabel('Epoch')
toc