
Neural and Evolutionary Computing
Lab 2: Neural Networks for Approximation and Prediction
______________________________________________________________________________

1. Approximation and Prediction Problems
Aim: extract from data a model which describes either the dependence between two variables or the variation in time of a variable (in order to predict new values in a time series).

Examples:
• Experimental data analysis:
  o The input data are the measured values of the analyzed variables (one variable is the predictor, the other is the predicted one)
  o The results are values of the predicted variable estimated for some given values of the predictor
• Time series prediction:
  o The input data are previous values of a time-dependent variable
  o The result is an estimation of the next value in the time series

2. Using the nftool for approximation problems

nftool (neural network fitting tool) provides a graphical user interface for designing and training a feedforward neural network for solving approximation (fitting) problems. The networks created by nftool are characterized by:
• One hidden layer (the number of hidden units can be changed by the user; the default value is 20)
• The hidden units have a sigmoidal activation function (tansig or logsig) while the output units have a linear activation function
• The training algorithm is Backpropagation based on a Levenberg-Marquardt minimization method (the corresponding Matlab function is trainlm)
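The tool is started by typing its name at the Matlab command line:

nftool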

The learning process is controlled by a cross-validation technique based on a random division of the initial set of data into 3 subsets: for training (weights adjustment), for learning process control (validation) and for evaluation of the quality of the approximation (testing). The quality of the approximation can be evaluated by:
• Mean Squared Error (MSE): it expresses the difference between the correct outputs and those provided by the network; the approximation is better if MSE is smaller (closer to 0)
• Pearson's Correlation Coefficient (R): it measures the correlation between the correct outputs and those provided by the network; the closer R is to 1, the better the approximation.
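Both measures can also be computed directly in Matlab (a small sketch added here, assuming t is a row vector of correct outputs and y the corresponding network outputs):

% t = correct outputs, y = network outputs (row vectors of equal length)
mse_val = mean((t - y).^2);    % Mean Squared Error; smaller is better
c = corrcoef(t, y);            % 2x2 correlation matrix
r_val = c(1, 2);               % Pearson's correlation coefficient R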

Exercise 1
a) Using nftool, design and train a neural network to discover the dependence between the data in the set "Simple Fitting Problem".
b) Analyze the network behavior by visualizing the dataset and the approximated function. Analyze the quality of the approximation by using the plot of R.
c) Compare the obtained results for different numbers of hidden units: 5, 10, 20.

d) Save the network in an object named netFitting and simulate its functioning for different input data.

Hint:
a) use nftool and select the corresponding dataset from "Load Example Data Set"
b) visualize the graphs with "PlotFit" and "PlotRegression"
c) change the architecture, retrain and compare the quality measures (MSE, R) for different cases
d) after the network is saved it can be simulated by using sim(netFitting, dateIntrare)

3. Using Matlab functions for creating and training feedforward neural networks

The function used in Matlab to define a multi layer perceptron is newff. In the case of a network having m layers of functional units (i.e. m-1 hidden layers) the syntax of the function is:

ffnet = newff(inputData, desiredOutputs, {k1, k2, ..., km-1}, {f1, f2, ..., fm}, '<training algorithm>')

The significance of the parameters involved in newff is:
• inputData: a matrix containing on its columns the input data in the training set. Based on this, newff establishes the number of input units.
• desiredOutputs: a matrix containing on its columns the correct answers in the training set. Based on this, newff establishes the number of output units.
• {k1, k2, ..., km-1}: the number of units on each hidden layer.
• {f1, f2, ..., fm}: the activation functions for all functional layers. Possible values are: 'logsig', 'tansig' and 'purelin'. The hidden layers should have nonlinear activation functions ('logsig' or 'tansig') while the output layer can also have a linear activation function.
• '<training algorithm>': a parameter specifying the type of the learning algorithm. There are two main training variants: incremental (the training process is initiated by the function adapt) and batch (the training process is initiated by the function train). When the function adapt is used the possible values of this parameter are: 'learngd' (the serial standard BackPropagation derived using the gradient descent method), 'learngdm' (the momentum variant of BackPropagation) and 'learngda' (a variant with adaptive learning rate). When the function train is used the possible values of this parameter are: 'traingd' (the classical batch BackPropagation derived using the gradient descent method), 'traingdm' (the momentum variant of BackPropagation) and 'trainlm' (a variant based on the Levenberg-Marquardt minimization method).

Remark 1: the learning function can be different for different layers. For instance, by ffnet.inputWeights{1,1}.learnFcn='learngd' one can specify which learning algorithm is used for the weights between the input layer and the first layer of functional units.

Example 1. Definition of a network used to represent the XOR function:

ffnet = newff([0 1 0 1; 0 0 1 1], [0 1 1 0], 5, {'logsig', 'logsig'}, 'traingd')
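As a further illustration of the general syntax (added here; the layer sizes and the names inputData and desiredOutputs are assumptions), a network with two hidden layers of 10 and 5 tansig units, a linear output layer and batch Levenberg-Marquardt training could be defined as:

% inputData / desiredOutputs: training inputs and targets stored columnwise
ffnet = newff(inputData, desiredOutputs, [10 5], {'tansig', 'tansig', 'purelin'}, 'trainlm');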

Training. There are two training functions: adapt (incremental learning) and train (batch learning).

Adapt. The syntax of this function is:

[ffTrained, y, err] = adapt(ffnet, inputData, desiredOutputs)

There are several parameters which should be set before starting the learning process:
• Learning rate - lr: the implicit value is 0.01
• Number of epochs - passes (number of passes through the training set): the implicit value is 1
• Momentum coefficient - mc (when 'learngdm' is used): the implicit value is 0.9
Example: ffnet.adaptParam.lr = 0.1; ffnet.adaptParam.passes = 10;

Train. The syntax of this function is:

[ffTrained, y, err] = train(ffnet, inputData, desiredOutputs)

There are several parameters which should be set before starting the learning process:
• Learning rate - lr: the implicit value is 0.01
• Number of epochs - epochs (number of passes through the training set): the implicit value is 1
• Maximal value of the error - goal
• Momentum coefficient - mc (when 'learngdm' is used): the implicit value is 0.9
Example: ffnet.trainParam.lr = 0.1; ffnet.trainParam.epochs = 1000; ffnet.trainParam.goal = 0.001;

Simulating multi layer perceptrons. In order to compute the output of a network one can use the function sim, having the syntax:

sim(ffTrained, testData)

Application 1. Linear and nonlinear regression. Let us consider a set of bidimensional data represented as points in the plane (the coordinates are specified by using the function ginput). Find a linear or a nonlinear function which approximates the data (by minimizing the sum of squared distances between the points and the graph of the function). A solution in Matlab is:

function [in, d] = regression(layers, hiddenUnits, training, cycles)
% if layers = 1 then a single layer linear neural network will be created
% if layers = 2 then a network with one hidden layer will be created and
%   hiddenUnits denotes the number of units on the hidden layer
% training = 'adapt' or 'train'
% cycles = the number of passes (for adapt) or epochs (for train)

% read the data
clf
axis([0 1 0 1])
hold on
in = []; d = [];
b = 1; n = 0;
while b == 1
    [xi, yi, b] = ginput(1);
    if b == 1
        n = n + 1;
        in(1, n) = xi;
        d(1, n) = yi;
        plot(xi, yi, 'r*');
    end
end

% define the network and train it
if (layers == 1)
    reta = newlind(in, d);    % linear network designed starting from the input data
else
    ret = newff(in, d, hiddenUnits, {'tansig', 'purelin'}, 'traingd');
    if strcmp(training, 'adapt')    % strcmp instead of == for string comparison
        % setting the parameters
        ret.adaptParam.lr = 0.05;
        ret.adaptParam.passes = cycles;
        % network training
        reta = adapt(ret, in, d);
    else
        ret.trainParam.lr = 0.05;
        ret.trainParam.epochs = cycles;
        ret.trainParam.goal = 0.001;
        % network training
        reta = train(ret, in, d);
    end
end

% graphical representation of the approximation
inf = min(in); sup = max(in);
x = inf:(sup - inf)/100:sup;
y = sim(reta, x);
plot(in, d, 'b*', x, y, 'r-');
end

Exercises:
1. Test the ability of the function regression to approximate different types of data. Hint: for linear regression call the function as regression(1, 1, 'adapt', 1) (the last two parameters are ignored); for nonlinear regression the call should be regression(2, 5, 'adapt', 5000).
2. In the case of nonlinear regression analyze the influence of the number of hidden units. Values to test: 5 (as in the previous exercise), 10 and 20.
3. In order to test the behavior of different architectures on the same set of data, save the data specified in the first call and define a new function which does not read the data but receives them as parameters (see the sketch below).
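One simple way to keep the points for Exercise 3 (a sketch added here; the file name regdata.mat is hypothetical) is to store the data returned by regression and reload them later:

% collect the points once and store them
[in, d] = regression(2, 5, 'adapt', 5000);
save('regdata.mat', 'in', 'd');
% later: reload the same points and pass them to the modified function
load('regdata.mat');    % restores in and d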

Application 2. Prediction. Let us consider a sequence of data (a time series): x1, x2, ..., xn, which can be interpreted as values recorded at successive moments of time. The goal is to predict the value corresponding to moment n+1. The main idea is to suppose that a current value xi depends on N previous values: xi-1, xi-2, ..., xi-N. Based on this hypothesis we can design a neural network which is trained to extract the association between any subsequence of N successive values and the next value in the series. The training set will have n-N pairs of (input data, correct output): {((x1, ..., xN), xN+1), ((x2, ..., xN+1), xN+2), ..., ((xn-N, xn-N+1, ..., xn-1), xn)}. Therefore, the neural network will have N input units, a given number of hidden units and 1 output unit. A solution in Matlab is:

function [in, d] = prediction(data, inputUnits, hiddenUnits, training, epochs)
% data : the series of data (one row matrix)
% inputUnits : number of previous data which influence the current one
% hiddenUnits : number of hidden units
% training = 'adapt' or 'train'
% epochs = number of epochs
L = size(data, 2);
in = zeros(inputUnits, L - inputUnits);
d = zeros(1, L - inputUnits);
for i = 1:L - inputUnits
    in(:, i) = data(1, i:i + inputUnits - 1)';
    d(i) = data(1, i + inputUnits);
end
inTest = data(1, L - inputUnits + 1:L)';
ret = newff(in, d, hiddenUnits, {'tansig', 'purelin'}, 'traingd');
if strcmp(training, 'adapt')    % strcmp instead of == for string comparison
    ret.adaptParam.lr = 0.05;
    ret.adaptParam.passes = epochs;
    reta = adapt(ret, in, d);
else
    ret.trainParam.lr = 0.05;
    ret.trainParam.epochs = epochs;
    ret.trainParam.goal = 0.001;
    reta = train(ret, in, d);
end
% graphical plot
x = 1:L - inputUnits;
y = sim(reta, in);
rezTest = sim(reta, inTest);
disp('Predicted value:');
disp(rezTest);
plot(x, d, 'b-', x, y, 'r-', L - inputUnits + 1, rezTest, 'k*');
end
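Before using real data, the function can be checked on a synthetic series (an illustration added here; the sine series and the parameter values are made up):

% a noisy sine wave as a synthetic time series
t = 1:200;
data = sin(0.1 * t) + 0.05 * randn(1, 200);
prediction(data, 5, 10, 'train', 1000);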

Let us use the prediction function to estimate the next value in a series of Euro/Lei exchange rates. The data should be first read:

data = csvread('dailyRate.csv')';

The data in the file are daily rates from 2005 up to October 16, 2009, downloaded from www.bnr.ro. Since the values are stored in reverse chronological order we construct a new list:

data2(1:length(data)) = data(length(data):-1:1);

Then the prediction function can be called by specifying the data set, the number of input units, the number of hidden units, the type of the learning process and the maximal number of epochs:

prediction(data2, 5, 10, 'adapt', 2000)

Exercises:
1. Analyze the influence of the number of input units on the ability of the network to make predictions (Hint: try the following values: 2, 5, 10, 15).
2. Analyze the influence of the number of hidden units on the ability of the network to make predictions (Hint: the number of hidden units should be at least as large as the number of input units).
3. Analyze the influence of the training algorithm on the ability of the network to make predictions (see the remark after this list).
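Remark: the training algorithm is fixed to 'traingd' inside the prediction function, so Exercise 3 requires a small change to it. One possibility (a sketch, not part of the original function; trainAlg is a hypothetical extra parameter) is to pass the algorithm name to the network creation line:

% hypothetical variant of the network creation line inside prediction;
% trainAlg can be 'traingd', 'traingdm' or 'trainlm' (see Section 3)
ret = newff(in, d, hiddenUnits, {'tansig', 'purelin'}, trainAlg);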