
SCHOOL OF TECHNOLOGY

SIMULATION AND MODELLING

Lecture 08

GAMING SIMULATION: Deep Discriminative Models:

Multilayer Perceptron (MLP) Network

Advancing Knowledge, Driving Change | www.kca.ac.ke


Deep Discriminative Models
 Deep discriminative models are deep learning models that aim to create a decision boundary between the classes by learning the conditional probability distribution p(y|x).

Deep Discriminative Models
Examples of Discriminative Models:
 Multilayer Perceptron Networks
 Convolutional Neural Networks
 Recurrent Neural Networks, such as Vanilla Recurrent Neural Networks (Vanilla RNNs), Long Short-Term Memory networks (LSTMs) and Gated Recurrent Units (GRUs)
 Deep Stacking Networks
 Time-Delay Neural Networks
 Hierarchical Temporal Memory (HTM)
 Deep Sequential Neural Networks

Multilayer Perceptron
 Multilayer Perceptron (MLP) is a type of discriminative neural network that is made up of several perceptrons and has one or more hidden layers between the input and the output layer.
 A number of neurons are connected in layers to build a multilayer perceptron.
 Example: an MLP with a single hidden layer.
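As a rough illustration (not part of the original slides), the forward pass of such an MLP with one hidden layer can be sketched in a few lines of Python; the layer sizes, random weights and use of NumPy here are illustrative assumptions only.

```python
import numpy as np

# A minimal sketch of an MLP with one hidden layer (weights are random placeholders).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)   # input layer (2 units) -> hidden layer (3 units)
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)   # hidden layer -> output layer (1 unit)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    hidden = sigmoid(x @ W1 + b1)      # hidden-layer activations
    return sigmoid(hidden @ W2 + b2)   # network output

print(forward(np.array([0.05, 0.10])))
```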


Advantages of Multilayer Perceptron
1. A multilayer perceptron can also learn non-linear functions. This is not possible with a single perceptron, which cannot represent functions that are not linearly separable.

[Figure: two scatter plots in the (x1, x2) plane; the left pattern of + and - points is linearly separable, while the right pattern is non-linearly separable.]

Backpropagation Artificial Neural Network

Backpropagation
 A backpropagation network is an example of a Multilayer Perceptron (MLP). As the algorithm runs, the weights are adjusted by propagating the error backward from the output nodes to the inner nodes.

The goal of backpropagation is to optimize the weights so that the network learns how inputs can be mapped to outputs.


Bias

 In a multilayer network, a bias unit is an "extra" neuron added to each pre-output layer that always stores the value 1.
 Bias units aren't connected to any previous layer and in this sense don't represent a true "activity".
Bias

 Bias is used to adjust the output along with the weighted sum of the inputs to the neuron.
 It is an additional parameter in the neural network.
 Example: in the diagram, a bias of 1.0 has been added as a constant value.

Bias

 Bias is like the intercept added in a linear equation, Y = mx + c, since it is also the output when the input is zero.
 Here, m = weight and c = bias.
 If the bias is absent, the model (Y = mx) can only fit lines passing through the origin, which does not reflect real-world data. With the introduction of bias, the model becomes more flexible.

Therefore, bias is a constant which helps the model fit the given data as well as possible.


Bias
 The change in bias shifts the activation function to the left or to the right; e.g. increasing the bias by 5 shifts the function to the left by 5 units.
 Example: bias helps in controlling the value at which the activation function will trigger.

Bias

 Bias can also be used to delay the triggering of the activation function.
 Example:
 Suppose an activation function act() which triggers on any net input greater than 0, and input1 = 1, weight1 = 2, input2 = 2, weight2 = 2.
 Net total input = input1*weight1 + input2*weight2 = (1*2) + (2*2) = 6
 Since the net input > 0, the activation function triggers and outputs 1.
 If a bias = -6 is introduced, the net total input becomes 0: (1*2) + (2*2) + (-6) = 0.
 Therefore, the activation function will not trigger.
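A minimal Python check of this arithmetic, assuming a simple step ("trigger") activation that outputs 1 only when its net input is greater than 0:

```python
# Small check of the bias example above, with a step ("trigger") activation.
def act(net):
    return 1 if net > 0 else 0   # triggers only when the net input is greater than 0

input1, weight1, input2, weight2 = 1, 2, 2, 2
print(act(input1 * weight1 + input2 * weight2))          # net = 6 -> triggers, outputs 1
print(act(input1 * weight1 + input2 * weight2 + (-6)))   # net = 0 -> does not trigger, outputs 0
```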

Effect of Adjusting Weights

 A change in weight adjusts the speed of learning; that is, it makes the activation function steeper or flatter.
 Example: suppose we increase the weights as follows:
 Weight1 is changed from 1.0 to 4.0 and weight2 from -0.5 to 1.5.

It can be inferred that:
 An increase in weight increases the rate (or speed) of triggering: the output rises more steeply with the input.
 A decrease in weight delays triggering.
[Figure: neuron output versus input before and after increasing the weights.]
Backpropagation Neural Network
 The backpropagation algorithm has four main stages:
1. Initialization of weights
2. Feed forward - each input unit (X) receives an input signal and transmits it to each of the hidden units Z1, Z2, Z3, ..., Zn.
 Each hidden unit then calculates its activation function and sends its signal Zi to each output unit.
 Each output unit calculates its activation function to form the response to the given input pattern.
3. Back-propagation of errors:
 Each output unit compares its activation Yk with its target value Tk to determine the associated error for that unit.
 Based on the error, an error factor δk is computed and used to distribute the error at output unit Yk back to all units in the previous layer.
 Similarly, an error factor δj is computed for each hidden unit Zj.
4. Updating weights and biases
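As a rough outline of how these four stages fit together, the following Python-style sketch shows one possible structure; the network object and its methods (initialize_weights, feed_forward, backpropagate_error, update_weights_and_biases) are hypothetical placeholders, not a real library API.

```python
# Illustrative pseudocode only; the `network` object and its methods are hypothetical.
def train_backprop(network, samples, learning_rate, epochs):
    network.initialize_weights()                      # 1. Initialization of weights
    for _ in range(epochs):
        for x, target in samples:
            output = network.feed_forward(x)          # 2. Feed forward through hidden and output layers
            deltas = network.backpropagate_error(     # 3. Back-propagation of errors (output -> hidden)
                output, target)
            network.update_weights_and_biases(        # 4. Updating weights and biases
                deltas, learning_rate)
```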
Backpropagation Algorithm
 Example
 Consider the following network.
[Figure: a 2-2-2 network with inputs i1 and i2, hidden units h1 and h2, output units o1 and o2, and a bias unit feeding each layer.]
Given that input1 = 0.05 and input2 = 0.10, with bias weights b1 = 0.35 and b2 = 0.60 (each bias unit outputs the constant 1),
train the above network to output 0.01 and 0.99.
Backpropagation Algorithm

 Step 1: Guess the initial weights:

 Layer 1 weights: w1 = 0.15, w2 = 0.20, w3 = 0.25, w4 = 0.30
 Layer 2 weights: w5 = 0.40, w6 = 0.45, w7 = 0.50, w8 = 0.55

Backpropagation Algorithm
 Step 2: Forward Pass
 i) Calculate the total net input for hidden unit 1 (h1):
net h1 = (input1 x w1) + (input2 x w2) + (b1 x 1)
net h1 = (0.05 x 0.15) + (0.10 x 0.20) + (0.35 x 1) = 0.3775

 ii) Calculate the activation (output), e.g. using the logistic function:
out h1 = 1 / (1 + e^(-0.3775)) ≈ 0.593

Backpropagation Algorithm
 Step 2: Forward Pass
 iii) Calculate the total net input for hidden unit 2 (h2):
net h2 = (input1 x w3) + (input2 x w4) + (b1 x 1)
net h2 = (0.05 x 0.25) + (0.10 x 0.30) + (0.35 x 1) = 0.3925

 iv) Calculate the activation (output), e.g. using the logistic function:
out h2 = 1 / (1 + e^(-0.3925)) ≈ 0.596

Backpropagation Algorithm
 Step 2: Forward Pass
 v) Calculate the total net input for each unit of the output layer (o1, o2):
net o1 = (out h1 x w5) + (out h2 x w6) + (b2 x 1) = (0.593 x 0.40) + (0.596 x 0.45) + (0.60 x 1) ≈ 1.1059
net o2 = (out h1 x w7) + (out h2 x w8) + (b2 x 1) = (0.593 x 0.50) + (0.596 x 0.55) + (0.60 x 1) ≈ 1.2243

 vi) Calculate the activations (outputs), e.g. using the logistic function:
out o1 = 1 / (1 + e^(-1.1059)) ≈ 0.7514
out o2 = 1 / (1 + e^(-1.2243)) ≈ 0.773
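The forward pass above can be reproduced with a short Python sketch (the small differences in the last decimals come from carrying the unrounded hidden outputs rather than the rounded 0.593 and 0.596):

```python
import math

def logistic(x):
    """Logistic (sigmoid) activation used in the worked example."""
    return 1.0 / (1.0 + math.exp(-x))

# Inputs, weights and bias weights from the worked example
i1, i2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30   # input -> hidden
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55   # hidden -> output
b1, b2 = 0.35, 0.60                        # bias weights (bias units output 1)

# Hidden layer
net_h1 = i1 * w1 + i2 * w2 + b1 * 1        # 0.3775
net_h2 = i1 * w3 + i2 * w4 + b1 * 1        # 0.3925
out_h1, out_h2 = logistic(net_h1), logistic(net_h2)   # ≈ 0.593, ≈ 0.597

# Output layer
net_o1 = out_h1 * w5 + out_h2 * w6 + b2 * 1   # ≈ 1.1059
net_o2 = out_h1 * w7 + out_h2 * w8 + b2 * 1   # ≈ 1.2249
out_o1, out_o2 = logistic(net_o1), logistic(net_o2)
print(round(out_o1, 4), round(out_o2, 4))     # ≈ 0.7514 0.7729
```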

Backpropagation Algorithm
 Step 3: Calculate the Total Error
 Each error can be calculated using the squared error function, E = ½ (target - output)².
 For example, the target output for o1 is 0.01 but the neural network outputs 0.75136507, therefore its error is:
E o1 = ½ (0.01 - 0.75136507)² ≈ 0.2748
 Similarly, the target output for o2 is 0.99 but the neural network outputs approximately 0.773, therefore its error is:
E o2 = ½ (0.99 - 0.773)² ≈ 0.0235

Backpropagation Algorithm
 Step 3: Calculate the Total Error
 All the errors are then added together to obtain the total error. For example:
E total = E o1 + E o2 ≈ 0.2748 + 0.0235 ≈ 0.298
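A quick check of this step in Python, carrying over the outputs from the forward pass (out o2 taken as approximately 0.773):

```python
# Squared-error loss for each output, using the outputs from the forward-pass sketch above.
target_o1, target_o2 = 0.01, 0.99
out_o1, out_o2 = 0.75136507, 0.773       # network outputs carried over from the forward pass

E_o1 = 0.5 * (target_o1 - out_o1) ** 2   # ≈ 0.2748
E_o2 = 0.5 * (target_o2 - out_o2) ** 2   # ≈ 0.0235
E_total = E_o1 + E_o2                    # ≈ 0.298
print(round(E_total, 4))
```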

Backpropagation Algorithm
 Step 4: Backward Pass
 In this step each weight is adjusted by propagating the error backward from the output nodes to the inner nodes.
 The aim of the adjustment is to bring the actual output closer to the target output, thereby minimizing the error for each output neuron and for the network as a whole.
 The chain rule is used to adjust the weights.

Step 4: Backward Pass
 Step a) Calculate the change in weights using a certain learning rule; for example, the chain rule can be applied.
 For y = f(u) with u = g(x), the chain rule states that the derivative of y with respect to x is dy/dx = (dy/du) * (du/dx).
 A derivative is a function which measures the slope; in neural networks it is used to calculate the change in weights.
 For example, the change in w5 can be calculated as follows:

Step 4: Backward Pass
 Step a) Calculate the change in weights using a certain learning rule.
 Instead of calculating the derivative of how a specific weight affects the cost directly, we can calculate these three factors:
dError/dOutput: the derivative of how a neuron's output affects the total error
dOutput/dWeightedInput: the derivative of how the net input of a neuron affects the neuron's output
dWeightedInput/dWeight: the derivative of how a weight affects the weighted input of a neuron
 Finally, the three derivatives are multiplied together to get the change in a weight, for example:
dError/dWeight5 = dError/dOutput * dOutput/dWeightedInput * dWeightedInput/dWeight5
Step 4: Backward Pass
 Step a) Calculate the change in weights.
(i) Calculate the derivative of the total error with respect to the output:
dEtotal/dOuto1 = -(target o1 - out o1)
(ii) Calculate the derivative of how the net input of the neuron affects the neuron's output:
dOuto1/dNeto1 = out o1 * (1 - out o1)
(iii) Calculate the derivative of how the weight affects the weighted input of the neuron:
dNeto1/dw5 = out h1
(iv) Finally, calculate the partial derivative of the total error with respect to the weight, e.g. w5:
dEtotal/dw5 = dEtotal/dOuto1 * dOuto1/dNeto1 * dNeto1/dw5
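Putting the three factors together for w5 with the numbers of the worked example gives roughly the following (a sketch; the exact last digit depends on how much the intermediate values are rounded):

```python
# Chain-rule gradient for w5, using the numbers from the worked example.
target_o1 = 0.01
out_o1 = 0.75136507   # output of o1 from the forward pass
out_h1 = 0.593        # output of hidden unit h1

dE_dout_o1 = -(target_o1 - out_o1)     # dError/dOutput            ≈ 0.7414
dout_dnet_o1 = out_o1 * (1 - out_o1)   # dOutput/dWeightedInput    ≈ 0.1868
dnet_dw5 = out_h1                      # dWeightedInput/dWeight5   = 0.593

dE_dw5 = dE_dout_o1 * dout_dnet_o1 * dnet_dw5
print(round(dE_dw5, 4))                # ≈ 0.0821
```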

Step 4: Backward Pass
 Step a) Calculate the change in weights.
 The delta rule can also be applied to calculate the change in weights:
Δw = n * δ * out_previous, where for an output unit δ = (target - out) * out * (1 - out), out_previous is the output of the unit feeding the weight, and n is the learning rate; the new weight is then the old weight + Δw (equivalently, the old weight - n * dEtotal/dw).

Step 4: Backward Pass
 b) Calculate the new weight by subtracting the change in weight from the previous weight.
 For example, the new w5 can be calculated as follows:
new w5 = w5 - n * dEtotal/dw5
 where:
 n = predefined learning rate, e.g. 0.5
 dEtotal/dw5 = partial derivative of the total error with respect to w5
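A short check of this update, assuming the learning rate n = 0.5 and the gradient for w5 of roughly 0.0821 obtained in the previous sketch:

```python
# Illustrative numbers from the worked example; grad_w5 is dEtotal/dw5 from the previous sketch.
learning_rate = 0.5
w5_old, grad_w5 = 0.40, 0.0821
w5_new = w5_old - learning_rate * grad_w5
print(round(w5_new, 3))   # ≈ 0.359
```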

Step 4: Backward Pass
 The backward pass can be repeated in the same way to get the new weights for w6, w7 and w8.
 The neural network is then updated with the new weights leading out of the hidden layer, so that they can be used when calculating the new weights of the preceding layer with the backpropagation algorithm.
[Figure: the updated network diagram showing the new output-layer weights.]

Step 4: Backward Pass
 Calculate the weights of the hidden layer by repeating the backward pass,
 but in a slightly different way, to account for the fact that the output of each hidden-layer neuron contributes to the output (and therefore to the error) of multiple output neurons.

Step 4: Backward Pass
 The derivative of the total error with respect to w1 can be calculated as follows:
dEtotal/dw1 = dEtotal/dOuth1 * dOuth1/dNeth1 * dNeth1/dw1, where dEtotal/dOuth1 = dEo1/dOuth1 + dEo2/dOuth1
Step 4: Backward Pass
Derivative of the total error with respect to w1

dEo1/dOuth1 = dEo1/dNetInputo1 * dNetInputo1/dOuth1

dEo1/dNetInputo1 = dEo1/dOuto1 * dOuto1/dNetInputo1
But:
dEo1/dOuto1 = -(Target - Outo1) = -(0.01 - 0.75136) = 0.74136
dOuto1/dNetInputo1 = Outo1(1 - Outo1) = 0.75136(1 - 0.75136) = 0.18681
Therefore,
dEo1/dNetInputo1 = 0.74136 * 0.18681 ≈ 0.1385

dNetInputo1/dOuth1 = w5 = 0.40

Hence, dEo1/dOuth1 ≈ 0.1385 * 0.40 ≈ 0.0554

Step 4: Backward Pass
Derivative of the total error with respect to w1

Similarly, dEo2/dOuth1 can be calculated as follows:

dEo2/dOuth1 = dEo2/dNetInputo2 * dNetInputo2/dOuth1

dEo2/dNetInputo2 = dEo2/dOuto2 * dOuto2/dNetInputo2
But:
dEo2/dOuto2 = -(Target - Outo2) = -(0.99 - 0.7732) = -0.2168
dOuto2/dNetInputo2 = Outo2(1 - Outo2) = 0.7732(1 - 0.7732) = 0.17536
Therefore,
dEo2/dNetInputo2 = -0.2168 * 0.17536 ≈ -0.03801

dNetInputo2/dOuth1 = w7 = 0.51

Hence, dEo2/dOuth1 = -0.03801 * 0.51 ≈ -0.0194

Step 4: Backward Pass
Derivative of the total error with respect to w1

Finally, the two values are put together as follows:

dEo1/dOuth1 ≈ 0.0554
dEo2/dOuth1 ≈ -0.0194

dEtotal/dOuth1 = dEo1/dOuth1 + dEo2/dOuth1
             ≈ 0.0554 - 0.0194
             ≈ 0.036
Therefore,
dEtotal/dw1 = dEtotal/dOuth1 * dOuth1/dNeth1 * dNeth1/dw1
where,
dEtotal/dOuth1 ≈ 0.036
dOuth1/dNeth1 = Outh1(1 - Outh1) = 0.59326(1 - 0.59326) = 0.24130
dNeth1/dw1 = i1 = 0.05

Hence,
dEtotal/dw1 ≈ 0.036 * 0.24130 * 0.05 ≈ 0.000434
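The whole chain for w1 can be reproduced in a few lines (a sketch using the values from the slides above, including w7 = 0.51 as used there; the last digits depend on rounding):

```python
# Hidden-layer gradient dEtotal/dw1, using the numbers from the slides above.
out_o1, out_o2 = 0.75136, 0.7732
target_o1, target_o2 = 0.01, 0.99
out_h1, i1 = 0.59326, 0.05
w5, w7 = 0.40, 0.51          # weights from h1 to o1 and o2, as used in the slides

# How h1's output affects each output error
dEo1_dout_h1 = -(target_o1 - out_o1) * out_o1 * (1 - out_o1) * w5    # ≈ 0.0554
dEo2_dout_h1 = -(target_o2 - out_o2) * out_o2 * (1 - out_o2) * w7    # ≈ -0.0194
dEtotal_dout_h1 = dEo1_dout_h1 + dEo2_dout_h1                        # ≈ 0.036

# Chain through h1's own activation and the input that w1 multiplies
dEtotal_dw1 = dEtotal_dout_h1 * out_h1 * (1 - out_h1) * i1           # ≈ 0.000434
print(round(dEtotal_dw1, 6))
```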

Step 4: Backward Pass
 The new weight 1 (w1) is calculated as follows:
 new w1 = old w1 - n * dEtotal/dw1
 where:
 n = learning rate = 0.5

Therefore,
new w1 = 0.15 - 0.5 * 0.000434 ≈ 0.14978

Step 4: Backward Pass
 The backward pass can be repeated to obtain the remaining new weights of the hidden layer:
 w2 = 0.199956, w3 = 0.24975, w4 = 0.299950

The updated network is as follows:
[Figure: the updated network diagram showing the new weights.]
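Finally, the whole procedure can be collected into one compact, self-contained sketch for this 2-2-2 network. The loop structure, variable names and iteration count below are illustrative choices rather than anything prescribed by the slides, but the forward pass, error, deltas and update rule are the ones worked through above.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

i = [0.05, 0.10]                       # inputs
targets = [0.01, 0.99]                 # target outputs
w = {"w1": 0.15, "w2": 0.20, "w3": 0.25, "w4": 0.30,   # input -> hidden
     "w5": 0.40, "w6": 0.45, "w7": 0.50, "w8": 0.55}   # hidden -> output
b1, b2 = 0.35, 0.60                    # bias weights (bias units output 1)
lr = 0.5                               # learning rate

for _ in range(10000):
    # Forward pass
    out_h1 = logistic(i[0] * w["w1"] + i[1] * w["w2"] + b1)
    out_h2 = logistic(i[0] * w["w3"] + i[1] * w["w4"] + b1)
    out_o1 = logistic(out_h1 * w["w5"] + out_h2 * w["w6"] + b2)
    out_o2 = logistic(out_h1 * w["w7"] + out_h2 * w["w8"] + b2)

    # Output-layer deltas (dE/dnet for each output neuron)
    d_o1 = -(targets[0] - out_o1) * out_o1 * (1 - out_o1)
    d_o2 = -(targets[1] - out_o2) * out_o2 * (1 - out_o2)

    # Hidden-layer deltas, computed before any weight is changed
    d_h1 = (d_o1 * w["w5"] + d_o2 * w["w7"]) * out_h1 * (1 - out_h1)
    d_h2 = (d_o1 * w["w6"] + d_o2 * w["w8"]) * out_h2 * (1 - out_h2)

    # Weight updates: new w = old w - lr * dEtotal/dw
    w["w5"] -= lr * d_o1 * out_h1
    w["w6"] -= lr * d_o1 * out_h2
    w["w7"] -= lr * d_o2 * out_h1
    w["w8"] -= lr * d_o2 * out_h2
    w["w1"] -= lr * d_h1 * i[0]
    w["w2"] -= lr * d_h1 * i[1]
    w["w3"] -= lr * d_h2 * i[0]
    w["w4"] -= lr * d_h2 * i[1]

# After many iterations the outputs approach the targets and the total error becomes very small
E_total = 0.5 * (targets[0] - out_o1) ** 2 + 0.5 * (targets[1] - out_o2) ** 2
print(round(E_total, 6))
```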

References

 https://medium.com/towards-artificial-intelligence/understanding-back-propagation-in-an-easier-way-you-never-before-42fe26d44a47
 https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/
 https://hmkcode.com/ai/backpropagation-step-by-step/

