
ARTIFICIAL NEURON

Artificial Neural Network-III

Lecture No 4

Dr. Kashif Zafar

February 24, 2022

1
ARTIFICIAL NEURAL NETWORKS

Training of Weights

2
ARTIFICIAL NEURAL NETWORKS

Training (setting) of weights

X1 X2 Y
1.0 1.0 1
9.4 6.4 -1
2.5 2.1 1
8.0 7.7 -1
0.5 2.2 1
7.9 8.4 -1
7.0 7.0 -1
2.8 0.8 1
1.2 3.0 1
7.8 6.1 -1

3
ARTIFICIAL NEURAL NETWORKS

Training of weights

For this simple problem, which we can view graphically, the weights can be set by analysis. If we draw a line through (0, 10) and (10, 0) in this space, it would perfectly separate the two classes.

The boundary line between the two classes has to be
W1X1 + W2X2 – θ = 0

For (X1, X2) = (0, 10) we have 10W2 – θ = 0.
If we take W2 = 1, we have θ = 10.
Similarly, for (X1, X2) = (10, 0), we have 10W1 – θ = 0. Since we have already calculated θ to be 10, we have W1 = 1 (see the check below).
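
As a quick check, here is a minimal Python sketch (not from the slides) that verifies these weights against the table of training samples above; the sign convention that class +1 lies on the negative side of the boundary is an assumption consistent with that data:

# Minimal sketch: verify that the line X1 + X2 - 10 = 0 separates the two classes.
# Dataset from the table above; "net < 0 means class +1" is an assumed convention.

samples = [
    (1.0, 1.0, 1), (9.4, 6.4, -1), (2.5, 2.1, 1), (8.0, 7.7, -1),
    (0.5, 2.2, 1), (7.9, 8.4, -1), (7.0, 7.0, -1), (2.8, 0.8, 1),
    (1.2, 3.0, 1), (7.8, 6.1, -1),
]

w1, w2, theta = 1.0, 1.0, 10.0

for x1, x2, y in samples:
    net = w1 * x1 + w2 * x2 - theta      # signed distance from the boundary (up to scale)
    predicted = 1 if net < 0 else -1     # assumed convention: below the line -> class +1
    assert predicted == y                # every sample falls on the correct side
print("All 10 samples are separated correctly.")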
4
ARTIFICIAL NEURAL NETWORKS

Training of weights

For complex problems in multi-dimensional space, we have to have some algorithm for setting (or training) the weights.

Supervised training: the classes of the training samples are known.

Random initialization of weights.

Threshold or bias is considered as a weight and its input is fixed at -1 (see the sketch below).
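
A minimal sketch of this convention, with hypothetical helper names: the threshold is stored as one extra weight and every input vector is extended with a constant -1, so training only ever updates a single weight vector.

import random

# Treat the threshold as an extra weight whose input is fixed at -1.
def augment(x):
    return list(x) + [-1.0]              # input vector extended with the constant -1

def init_weights(n_inputs, low=-0.5, high=0.5):
    # Random initialization; the range is an arbitrary choice for illustration.
    return [random.uniform(low, high) for _ in range(n_inputs + 1)]  # +1 for the threshold

def net_input(weights, x):
    xa = augment(x)
    return sum(w * xi for w, xi in zip(weights, xa))   # = sum(w_i * x_i) - theta

weights = init_weights(2)
print(weights, net_input(weights, (1.0, 1.0)))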

5
ARTIFICIAL NEURAL NETWORKS

Training of weights

For each training sample, we calculate the output with the current weights.

The error will be equal to ydesired – yactual = ydesired – f(∑ xi wi)

For the complete set of training patterns, the error will be equal to
E = ∑p (ydesired – yactual)²

The error is squared so that the positive and negative errors may not cancel each other out during summation (see the sketch below).
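
A minimal sketch of this error computation over the table of training samples, assuming the same bipolar step output convention as in the earlier check:

# Sum-of-squared-errors E = sum over patterns of (y_desired - y_actual)^2.

samples = [
    (1.0, 1.0, 1), (9.4, 6.4, -1), (2.5, 2.1, 1), (8.0, 7.7, -1),
    (0.5, 2.2, 1), (7.9, 8.4, -1), (7.0, 7.0, -1), (2.8, 0.8, 1),
    (1.2, 3.0, 1), (7.8, 6.1, -1),
]

def output(w1, w2, theta, x1, x2):
    net = w1 * x1 + w2 * x2 - theta
    return 1 if net < 0 else -1          # assumed bipolar step output

def sse(w1, w2, theta):
    # Squared so that positive and negative errors do not cancel in the sum.
    return sum((y - output(w1, w2, theta, x1, x2)) ** 2 for x1, x2, y in samples)

print(sse(1.0, 1.0, 10.0))   # 0: every sample lies on the correct side of X1 + X2 = 10
print(sse(1.0, 1.0, 3.0))    # 12: the line X1 + X2 = 3 misclassifies three class +1 samples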
6
ARTIFICIAL NEURAL NETWORKS

Training of weights

Error is the difference between the computed output and the given output (also called the target or desired value).

How to reduce this error?
Input is fixed
Architecture of the neuron is fixed
We can only change the weights

Which weight to change?
Since we don't know how much each weight contributes to the error, we change all the weights.

7
ARTIFICIAL NEURAL NETWORKS

Training of weights: Generalized Delta Rule

How much will be the change in a weight?

We wish to change the weights so that the error is reduced.

Each weight configuration can be represented by a point on an error surface.

8
ARTIFICIAL NEURAL NETWORKS

Training of weights: Generalized Delta Rule

Suppose y is some unknown function of x
y = f(x)

Suppose we can find the slope (rate of change of y) at any point x.
The slope of a function at any point x is the gradient of the tangent to the curve at that point.
The slope is Δy / Δx in the figure.
9
ARTIFICIAL NEURAL NETWORKS

Training of weights: Generalized Delta Rule

If Δx is small, then Δy is almost the same as the change δy in the function.

We can move towards the minimum of the function if we adjust x in the direction of the negative gradient:
xnew = xold – c(Δy / Δx)

If we keep on making this change iteratively, we should approach the minimum of the function.
This technique is called gradient descent (see the sketch below).
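
A minimal sketch of gradient descent in one dimension; the function f(x) = (x - 3)^2, the starting point and the step size c are illustrative choices, and the slope is estimated from two nearby points as described above:

# Gradient descent on a one-dimensional function using a finite-difference slope.

def f(x):
    return (x - 3.0) ** 2                # example function with its minimum at x = 3

def slope(f, x, dx=1e-4):
    return (f(x + dx) - f(x - dx)) / (2 * dx)   # Δy / Δx around x

x = 4.0          # starting point
c = 0.1          # step size (learning rate)
for _ in range(100):
    x = x - c * slope(f, x)              # x_new = x_old - c * (Δy / Δx)
print(round(x, 4))                       # close to 3.0, the minimum of f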
10
ARTIFICIAL NEURAL NETWORKS

Training of weights: Generalized Delta Rule

Example

Let the current x be 4. Suppose y = f(x) = 9.5 at x = 5
and y = 9.0 at x = 3.
Then Δy / Δx = (9.5 – 9.0) / (5 – 3) = 0.25
The new point considered will be
xnew = xold – c*0.25 = 3.875 (if c = 0.5)

If the y = f(x) curve had instead been sloping downward (y = 9.0 at x = 5 and y = 9.5 at x = 3), then
Δy / Δx = (9.0 – 9.5) / (5 – 3) = -0.25
The new point will be xnew = xold – c*(-0.25) = 4.125
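
The same arithmetic as a short sketch, using the numbers from the example above:

# One gradient-descent step with the values from the example above.
c = 0.5
x_old = 4.0

slope_up = (9.5 - 9.0) / (5 - 3)         # = 0.25 for the rising curve
print(x_old - c * slope_up)              # 3.875

slope_down = (9.0 - 9.5) / (5 - 3)       # = -0.25 for the falling curve
print(x_old - c * slope_down)            # 4.125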
11
ARTIFICIAL NEURAL NETWORKS

Training of weights: Generalized Delta Rule

We start with a random weight configuration.

The delta rule uses gradient descent:
Δwi = -c (∂Error / ∂wi)

The error can be reduced most rapidly by adjusting each weight wi in the direction of the negative gradient -∂E / ∂wi (see the sketch below).
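
A minimal sketch of this update rule, assuming a linear output (y_actual equals the net input) so that the error surface is smooth, and estimating each ∂E/∂wi numerically for illustration; the sample subset, learning rate and starting weights are arbitrary choices:

# Delta-rule style update: w_i <- w_i - c * dE/dw_i, with dE/dw_i estimated numerically.
# A linear output (y_actual = net) is assumed here so that the error surface is smooth.

samples = [(1.0, 1.0, 1), (9.4, 6.4, -1), (2.5, 2.1, 1), (8.0, 7.7, -1), (0.5, 2.2, 1)]

def error(w):                            # w = [w1, w2, theta], theta's input fixed at -1
    e = 0.0
    for x1, x2, y in samples:
        y_actual = w[0] * x1 + w[1] * x2 - w[2]
        e += (y - y_actual) ** 2
    return e

def step(w, c=0.001, dw=1e-5):
    new_w = []
    for i in range(len(w)):
        bumped = w[:]                    # estimate dE/dw_i by a small finite difference
        bumped[i] += dw
        grad_i = (error(bumped) - error(w)) / dw
        new_w.append(w[i] - c * grad_i)  # move against the gradient
    return new_w

w = [0.1, -0.2, 0.0]                     # arbitrary starting weights [w1, w2, theta]
print(round(error(w), 3))                # error with the starting weights
for _ in range(200):
    w = step(w)
print(round(error(w), 3))                # noticeably smaller after repeated updates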

12
ARTIFICIAL NEURAL NETWORKS

Training of weights: Generalized Delta Rule

The weight change is calculated as
Δwi = wi(new) - wi(old) = -c (∂Error / ∂wi)

which can be shown to be equal to
Δwi = -2c[-(ydesired - yactual) · f'(act) · xi]

This rule is called the Generalized Delta Rule.

Example: For the logistic function f(act) = 1/(1 + e^(-λ·act))
we have f'(act) = λ·f(act)(1 – f(act)) (see the sketch below).
13
ARTIFICIAL NEURAL NETWORKS

References

Engelbrecht:
Training of weights: Sections 2.4, 2.4.1, 2.4.2

Laurene Fausett:
Where NNs are used: Section 1.3
Delta Rule: pages 86-88
Generalized Delta Rule: page 106

14
