
# ECE 626 – Advanced Neural Networks

Project Report #1

Instructor
Dr. Scott Dick
University of Alberta

Submitted by
Kanwarpal Singh
Student ID: 1524706

Faculty of Graduate Studies & Research – Master of Engineering
Electrical & Computer Engineering – Software Engineering and Intelligent Systems

Modelling problem:

We have been given the wine dataset, which consists of 178 examples with 13
attributes each; it is a 3-class problem. We are required to build a multi-layer
perceptron network trained by backpropagation with momentum. To do that, we first
divide the dataset into two parts, training and testing: we hold out 10% of the
data (18 examples) as test data and use the remaining 160 examples as training
data. The next step is to perform 10-fold cross-validation on the training data for
parameter selection. This divides the training data into 10 equal parts of 16
examples each, known as folds. For each fold we set the parameters of the network:
the number of layers, the number of neurons, the learning rate and the momentum.
We then create and train a network with these parameters. The best parameters are
selected on the basis of the accuracy of the network.
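
The split sizes quoted above follow from simple arithmetic. The short MATLAB sketch below reproduces them; the variable names are illustrative and are not taken from the project code.

```matlab
% Sizes of the hold-out split and of the cross-validation folds
nSamples = 178;                      % examples in the wine dataset
nTrain   = floor(0.90 * nSamples);   % 160 examples kept for training
nTest    = nSamples - nTrain;        % 18 examples held out for testing
k        = 10;                       % number of folds
foldSize = nTrain / k;               % 16 examples per fold

fprintf('train = %d, test = %d, fold size = %d\n', nTrain, nTest, foldSize);
```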

Multi-layer Perceptron and Back Propagation algorithm:

A multi-layer perceptron, as the name suggests, is a network with one or more
hidden layers between the input and output layers. It is a network of neurons,
which are called perceptrons. A multi-layer perceptron network looks like this:

[Figure: multi-layer perceptron network with input, hidden and output layers]

Here the middle layer is the hidden layer, and every node in one layer is fully
connected to the nodes in the next layer. In a neural network, every input is
multiplied by a weight. The sum of these weighted inputs and the bias is fed into
a function known as the activation function, which transforms it into the output
of the neuron.

The activation functions used most often in neural networks include the sigmoid
and tanh functions.
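
As a minimal sketch of the computation just described, the MATLAB lines below evaluate a single neuron with both activation functions. The input, weight and bias values are made up for illustration.

```matlab
% One neuron: weighted sum of the inputs plus bias, passed through an activation
x = [0.5; -1.0; 2.0];        % inputs (illustrative values)
w = [0.4;  0.3; -0.2];       % one weight per input (illustrative values)
b = 0.1;                     % bias

z = w' * x + b;              % net input of the neuron

sigmoid  = @(v) 1 ./ (1 + exp(-v));  % logistic sigmoid, output in (0, 1)
aSigmoid = sigmoid(z);               % neuron output with a sigmoid activation
aTanh    = tanh(z);                  % neuron output with a tanh activation, in (-1, 1)
```

The only difference between the two outputs is the range: the sigmoid squashes the net input into (0, 1), while tanh squashes it into (-1, 1).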

Backpropagation algorithm:

Backpropagation is the most commonly used method for training a neural
network. Its purpose is to compute the partial derivatives of the cost function
with respect to every weight w and bias b in the network. The algorithm works
step by step as follows:

1. First, a forward pass is executed. This computes the activations for the
layers l1, l2, l3, ... up to the output layer t1.

2. For every output unit i in layer t1, set the error term of that unit from the
difference between the network output and the target.

3. For the layers l = t1-1, t1-2, t1-3, ..., set the error term of every node in
layer l from the error terms of layer l+1.

4. Now calculate the partial derivatives from these error terms.
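
For reference, these four steps are usually written as follows for a squared-error cost on a single training example. The notation is an assumption, since the text above does not fix it: J is the cost, a_i^{(l)} and z_i^{(l)} are the activation and net input of unit i in layer l, f is the activation function, y_i is the target, w_{ij}^{(l)} is the weight from unit j in layer l to unit i in layer l+1, b_i^{(l)} is the corresponding bias, and \delta_i^{(l)} is the error term of unit i in layer l.

```latex
% Step 2: error term of output unit i (output layer t1)
\delta_i^{(t1)} = -\left(y_i - a_i^{(t1)}\right) f'\!\left(z_i^{(t1)}\right)

% Step 3: error term of unit i in a hidden layer l = t1-1, t1-2, \dots
\delta_i^{(l)} = \left(\sum_{j} w_{ji}^{(l)}\, \delta_j^{(l+1)}\right) f'\!\left(z_i^{(l)}\right)

% Step 4: partial derivatives of the cost
\frac{\partial J}{\partial w_{ij}^{(l)}} = a_j^{(l)}\, \delta_i^{(l+1)},
\qquad
\frac{\partial J}{\partial b_i^{(l)}} = \delta_i^{(l+1)}
```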

Training by backpropagation with momentum can be done with the built-in MATLAB
training function traingdm.
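
To make the momentum idea concrete, here is a small hand-written sketch of gradient descent with momentum on a one-dimensional quadratic. It is not the toolbox implementation: the update form dx = mc*dx - lr*(1-mc)*g is an assumption about how a learning rate lr and a momentum constant mc are combined, and the cost function is invented for the example.

```matlab
% Gradient descent with momentum on f(x) = (x - 3)^2 (illustrative cost)
lr = 0.1;      % learning rate
mc = 0.5;      % momentum constant
x  = 0;        % starting point
dx = 0;        % previous update, initially zero

for epoch = 1:200
    g  = 2 * (x - 3);                  % gradient of the cost at x
    dx = mc * dx - lr * (1 - mc) * g;  % blend the previous step with the new gradient step
    x  = x + dx;                       % apply the update
end

fprintf('x after training = %.6f\n', x);
```

With mc = 0 the loop reduces to plain gradient descent; a larger mc gives more weight to the previous step, which smooths the path towards the minimum at x = 3.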

Methodology used in Project:

In this project we are provided with the Wine dataset, which can be downloaded
from the UCI Machine Learning Repository. The dataset has 178 examples with 13
attributes each. The 13 attributes are:

1. Alcohol
2. Malic acid
3. Ash
4. Alcalinity of ash
5. Magnesium
6. Total phenols
7. Flavanoids
8. Nonflavanoid phenols
9. Proanthocyanins
10. Color intensity
11. Hue
12. OD280/OD315 of diluted wines
13. Proline

Preprocessing the data:

Before performing any operation or analysis on the data, we first need to bring
it into a form that we can work on directly, so some preprocessing is required.
Normalisation ensures that all attributes fall within one particular range.
Fortunately, MATLAB provides built-in functions for preprocessing. We normalise
the data and remove unwanted data from our dataset using these functions:

normal_input = mapminmax(inputs); % Normalising the network inputs

normal_input2 = removeconstantrows(normal_input); % Removing attributes that never change

fixed_input = fixunknowns(normal_input2); % Handling unknown (NaN) values
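
As an illustration of what the normalisation step does, the sketch below rescales each row of a small made-up matrix to the range [-1, 1] by hand. It assumes that this row-wise min-max mapping to [-1, 1] is what mapminmax applies with its default settings; the matrix X is hypothetical.

```matlab
% Row-wise min-max scaling to [-1, 1] (each row is one attribute)
X = [1 2 3; 10 20 40];               % 2 attributes x 3 samples (made-up values)

xmin = min(X, [], 2);                % minimum of every row
xmax = max(X, [], 2);                % maximum of every row

% y = 2*(x - xmin)/(xmax - xmin) - 1, applied row by row
Y = 2 * bsxfun(@rdivide, bsxfun(@minus, X, xmin), xmax - xmin) - 1;

disp(Y);
```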

Now that the inputs have been normalised, we need to design the neural network.
First, we divide the dataset into a training set and a testing set. In this
problem we use 10% of the samples as the testing set, which comes to 18 samples;
the remaining 160 samples form the training set. The next step is to divide the
160 training samples into 10 folds for k-fold cross-validation, so the training
data is split into 10 equal parts of 16 samples each. K-fold cross-validation can
also be done with built-in MATLAB functions such as crossvalind and cvpartition.
In every iteration, one of the 10 folds is used for testing and the remaining
nine folds are used for training.
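
The fold assignment described above can be sketched in base MATLAB without any toolbox function. The variable names are illustrative, and the random permutation simply stands in for whatever shuffling the toolbox functions perform.

```matlab
% Assign 160 training samples to 10 folds of 16 samples each
n = 160;                               % training samples
k = 10;                                % folds
foldId = mod(randperm(n) - 1, k) + 1;  % shuffled labels; each of 1..k occurs n/k times

foldSizes = zeros(1, k);
for f = 1:k
    valIdx   = find(foldId == f);      % samples held out in this iteration
    trainIdx = find(foldId ~= f);      % samples used for training in this iteration
    foldSizes(f) = numel(valIdx);
    % ... train on trainIdx and evaluate on valIdx here ...
end

disp(foldSizes);
```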

Design Procedure for creating network:

With the data divided into 10 folds, we need to create a network for each fold.
In MATLAB a neural network is represented by a network object, here named net.
We create a feedforward network and then give it a training function, the number
of layers and the number of neurons. Since we need to select the best
hyperparameters, the four parameters we use to judge the best network are:

• Number of layers
• Number of neurons
• Learning rate
• Momentum

We assign different parameter values to different networks, and the best
hyperparameters are chosen on the basis of the accuracy of the network.
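
Choosing the best setting by accuracy amounts to taking a maximum. The sketch below does this for the per-fold accuracies reported in the results table of this report, where each fold was trained with its own parameter setting; the variable names are illustrative.

```matlab
% Accuracy (%) obtained by the network of each fold, one parameter setting per fold
accuracy = [97.3 68.1 98.1 63.7 96.9 95 96.9 97.5 98.3 93.5];

[bestAccuracy, bestFold] = max(accuracy);   % highest accuracy and the fold it belongs to

fprintf('best fold = %d (accuracy %.1f%%)\n', bestFold, bestAccuracy);
```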

Training the network:

After the network object and its parameters are defined, we need to train the
network. The train function is used for this:

[net,tr] = train(net, inputs, targets);

Along with the train function, we assign the training parameters, such as the
number of epochs, the learning rate and the momentum constant.

net.trainParam.lr=0.01
net.trainParam.epochs = 2000
net.trainParam.mc=0.5

After the network is created and trained, we compute the confusion matrix. From
the confusion matrix we can calculate the true positive rate (TPR), true negative
rate, false positive rate (FPR) and false negative rate, and from these we can
compute the accuracy. The table below shows the accuracy, TPR and FPR of every
fold.

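| Fold count | Accuracy (%) | TPR (%) | FPR (%) |
|------------|--------------|---------|---------|
| 1          | 97.3         | 98.125  | 1.875   |
| 2          | 68.1         | 68.125  | 31.8    |
| 3          | 98.1         | 98.1    | 1.9     |
| 4          | 63.7         | 63.7    | 36.3    |
| 5          | 96.9         | 96.3    | 3.7     |
| 6          | 95           | 97.5    | 2.5     |
| 7          | 96.9         | 96.875  | 3.12    |
| 8          | 97.5         | 94.4    | 5.6     |
| 9          | 98.3         | 98.8    | 1.2     |
| 10         | 93.5         | 93.8    | 6.3     |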
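
The quantities in the table can be derived from a confusion matrix as in the following sketch. The class labels are made up for illustration, and the one-versus-rest definition of TPR and FPR per class is an assumption, since the report does not say how the rates were aggregated over the three classes.

```matlab
% Confusion matrix and per-class rates for a 3-class problem (made-up labels)
trueClass  = [1 1 1 2 2 2 3 3 3 3];
predClass  = [1 1 2 2 2 2 3 3 1 3];
numClasses = 3;

% C(r, c) counts the samples of true class r that were predicted as class c
C = accumarray([trueClass(:) predClass(:)], 1, [numClasses numClasses]);

accuracy = sum(diag(C)) / sum(C(:));   % fraction of correctly classified samples

TP = diag(C);                          % true positives per class
FN = sum(C, 2) - TP;                   % samples of the class that were missed
FP = sum(C, 1)' - TP;                  % samples wrongly assigned to the class
TN = sum(C(:)) - TP - FN - FP;         % everything else

TPR = TP ./ (TP + FN);                 % true positive rate per class
FPR = FP ./ (FP + TN);                 % false positive rate per class
```

Multiplying accuracy, TPR and FPR by 100 gives percentages in the same form as the table.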

Conclusion:

In this report we have discussed in detail the methodology for creating a
multi-layer perceptron network trained with the backpropagation algorithm. The
problem defined on this dataset is a 3-class classification problem. We first
divided the data into two parts, training and testing. The training set was
divided into 10 folds, and every fold was tested once. Finally, the whole network
was trained and then tested on the testing set that we separated in the first
step. In this problem we have learned how to train a neural network with
different parameters and how to avoid overfitting. The best hyperparameters based
on the results are:

Learning rate = 0.09
Number of hidden neurons: 10
Epochs: 2000
Momentum constant = 0.5

Appendix:

% Assumes wineInputs (13x178) and wineTargets (3x178) are already in the workspace
inputs = wineInputs;
outputs = wineTargets;

% Normalise the inputs; the targets are 0/1 class indicators and are left unscaled
normal_input = mapminmax(inputs);
normal_input2 = removeconstantrows(normal_input);
i1 = fixunknowns(normal_input2);
t1 = outputs;

% Divide the data into two parts: 90% training, 10% testing
q = size(i1,2);              % 178 samples
q1 = floor(q*0.90);          % 160 training samples
q2 = q - q1;                 % 18 test samples
index = randperm(q);

index1 = index(1:q1);        % indices of the training samples
index2 = index(q1+(1:q2));   % indices of the test samples

a1 = i1(:,index1);           % 13x160 training inputs
b1 = t1(:,index1);           % 3x160 training targets
testset1 = i1(:,index2);     % 13x18 test inputs
testset2 = t1(:,index2);     % 3x18 test targets

trainFcn = 'traingdm';

% Dividing the training data into k folds
n = size(a1,2);              % 160 training samples
k = 10;
cvpart = crossvalind('Kfold',n,k);   % fold label (1..k) for every training sample
% One parameter setting per fold (values as used in the original runs)
hiddenSizes = [10 20 10 20 10 20 10 20 10 20];
lrs = [0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 1.0];
mcs = [0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0];
perf = zeros(1,k);           % performance of every fold

for i = 1:k
    testIdx = (cvpart == i);     % samples held out in this fold
    trainIdx = ~testIdx;         % samples used for training in this fold

    net = patternnet(hiddenSizes(i), trainFcn);
    net.performFcn = 'mse';

    % Set the training parameters before calling train
    net.trainParam.lr = lrs(i);
    net.trainParam.epochs = 2000;
    net.trainParam.mc = mcs(i);

    % Train on the nine training folds only
    [net,tr] = train(net, a1(:,trainIdx), b1(:,trainIdx));

    % Evaluate the trained network on the held-out fold
    y = net(a1(:,testIdx));
    perf(i) = perform(net, b1(:,testIdx), y);
end

% Train the final network on the full training set
% and evaluate it on the held-out test set

% Create a new network
hiddenLayerSize = 10;
net = patternnet(hiddenLayerSize, trainFcn);
%net.performFcn = 'mse';

% Set the training parameters before calling train
net.trainParam.lr = 0.01;
net.trainParam.epochs = 2000;
net.trainParam.mc = 0.1;

[net,tr] = train(net,a1,b1);

y = net(testset1);
perffinal = perform(net,testset2,y)

Matlab code 1