
University of Khartoum
Department of Electronics & Electrical Engineering
Software & Control Engineering

EC5245: ARTIFICIAL NEURAL NETWORK & FUZZY LOGIC
By: Ustaza Hiba Hassan
Lecture 5

How often do we update the weights?


• We update the weights using one of the following techniques:
• Online: after each training case.
• Full batch: after a full sweep through the training data.
• Mini-batch: after a small sample of training cases.
• Mini-batch learning is very useful for large neural networks with
very large and highly redundant training sets.
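To make the three schedules concrete, the following sketch (a hypothetical NumPy linear model with squared error; all names and hyperparameters are illustrative, not from the lecture) shows where the weight update happens in each case.

```python
import numpy as np

def gradient(w, X, y):
    """Gradient of mean squared error for a linear model y_hat = X @ w."""
    return 2 * X.T @ (X @ w - y) / len(y)

def train(X, y, mode="mini-batch", lr=0.01, batch_size=32, epochs=10):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        if mode == "full-batch":
            w -= lr * gradient(w, X, y)             # one update per full sweep
        elif mode == "online":
            for i in range(len(y)):                 # one update per training case
                w -= lr * gradient(w, X[i:i+1], y[i:i+1])
        else:                                       # mini-batch
            idx = np.random.permutation(len(y))
            for start in range(0, len(y), batch_size):
                b = idx[start:start + batch_size]   # small sample of training cases
                w -= lr * gradient(w, X[b], y[b])
    return w
```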

Modes of Learning
• Learning is usually carried out in one of two modes: online or offline.
• In online learning, weights are updated after processing each training case; the used case is then discarded.
• In offline learning, weights are updated after ALL the training data is processed; hence, the data is stored and accessed multiple times.
• Batch learning is done offline.

The Error Surface


• The course of learning can be traced on the error surface: since learning is supposed to reduce the error, each weight change made by the learning algorithm should move the current point on the error surface downhill, into a valley of the surface.
• The following graph shows the error surface of a linear neuron with two input weights.

The Error Surface (cont.)


• In general, the error surface lies
in a space with a horizontal axis
for each weight and one vertical
axis for the error.
• For a linear neuron with a
squared error, it is a quadratic
bowl.
• Vertical cross-sections are
parabolas.
• Horizontal cross-sections are
ellipses.
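As an illustration of the quadratic bowl, the error surface of a linear neuron with two input weights can be evaluated on a grid; the data below are made up for the sketch.

```python
import numpy as np

# Toy training set for a linear neuron y_hat = w1*x1 + w2*x2 (made-up data)
X = np.array([[0.5, 1.0], [1.5, 0.5], [1.0, 2.0], [2.0, 1.5]])
t = np.array([1.0, 1.2, 2.5, 2.8])

# Evaluate the squared error on a grid of (w1, w2) values
w1, w2 = np.meshgrid(np.linspace(-2, 4, 121), np.linspace(-2, 4, 121))
E = np.zeros_like(w1)
for x, target in zip(X, t):
    E += (w1 * x[0] + w2 * x[1] - target) ** 2   # sum of squared errors

# E is a quadratic bowl: vertical cross-sections are parabolas,
# horizontal cross-sections are ellipses.
```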

More Neural Network Architectures


• The main architectures, as mentioned before, are:
• FeedForward Artificial Neural Network (FFANN),
• FeedBack (Recurrent) Artificial Neural Network (FBANN),
• However, there are several other commonly used architectures that are
derived from them, such as,
• Radial Basis Function Neural Network (RBFNN),
• Self-Organizing Feature Maps Neural Networks (SOFMNN) or
(SOMNN).

FFANN & FBANN


• In an FFANN, the hidden layer(s) usually use sigmoid neurons and the output layer uses linear neurons.
• Feedback networks are dynamic, i.e. their state changes continuously until they reach an equilibrium point. They remain at the equilibrium point until the input changes and a new equilibrium needs to be found.
• FBANNs, also known as recurrent NNs (RNNs), make use of sequential information; hence they are widely adopted in natural language processing (NLP).
• Examples of RNN networks are the Elman and Hopfield networks.

Elman Neural Network


• The Elman network was developed by Elman in 1990.
• He introduced what he called context units.
• These units hold a copy of the hidden unit activations from the previous time-step.
• The context units are treated just like inputs for backpropagation.
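A minimal forward-pass sketch of this idea (NumPy; layer sizes, weight names and the sigmoid/linear choice are illustrative): the context units are simply a copy of the previous hidden activations fed back alongside the new input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_in, n_hid, n_out = 3, 5, 2
rng = np.random.default_rng(0)
W_in  = rng.normal(size=(n_hid, n_in))    # input  -> hidden
W_ctx = rng.normal(size=(n_hid, n_hid))   # context -> hidden
W_out = rng.normal(size=(n_out, n_hid))   # hidden -> output

context = np.zeros(n_hid)                 # context units start at zero
for x in rng.normal(size=(4, n_in)):      # a short input sequence
    hidden  = sigmoid(W_in @ x + W_ctx @ context)
    output  = W_out @ hidden              # linear output layer
    context = hidden.copy()               # context = copy of previous hidden state
```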

Elman Feedback Neural Networks Architecture



Training of RNN
• Training of Elman RNNs deploys a modified version of the backpropagation algorithm, called Backpropagation Through Time (BPTT).
• The gradient at each output depends on calculations from previous time steps as well as the current time step. More advanced architectures are capable of utilizing several time steps using the BPTT algorithm.
• For example, to calculate the gradient at t = 3, we need to backpropagate 2 steps and sum up the gradients.
• However, simple RNNs, also known as vanilla RNNs, have limitations in learning long-term dependencies.
• The solution to this flaw has led to many evolved RNNs, such as the Long Short-Term Memory (LSTM) RNN, which has a selective memory (so to speak).
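A small sketch of the "backpropagate and sum the gradients" idea, using a scalar linear RNN so the summed gradient can be checked numerically; all values are illustrative and this is not the full BPTT algorithm for a trained Elman network.

```python
import numpy as np

w, u = 0.8, 0.5                    # recurrent and input weights (illustrative)
x = np.array([1.0, -0.3, 0.7])     # inputs for t = 1..3
y = 0.9                            # target for the output at t = 3

# Forward pass: h_t = w*h_{t-1} + u*x_t, loss = 0.5*(h_3 - y)^2
h = [0.0]
for t in range(3):
    h.append(w * h[-1] + u * x[t])
dL_dh3 = h[3] - y

# BPTT: dL/dw is a sum over time steps, each term backpropagated
# through (3 - t) applications of the recurrent weight.
dL_dw = sum(dL_dh3 * (w ** (3 - t)) * h[t - 1] for t in range(1, 4))

# Numerical check of the summed gradient
eps = 1e-6
def loss(w_):
    h_ = 0.0
    for t in range(3):
        h_ = w_ * h_ + u * x[t]
    return 0.5 * (h_ - y) ** 2
assert abs(dL_dw - (loss(w + eps) - loss(w - eps)) / (2 * eps)) < 1e-6
```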

Elman RNN (cont.)


• The internal feedback loop of Elman networks makes it possible to learn to detect and generate time-based (temporal) patterns.
• Hence, Elman networks are useful in areas where time is crucial, such as signal processing and prediction.

Hopfield Feedback Neural Networks Architecture



Hopfield Neural Networks

• A Hopfield neural network is used to store one or more stable target vectors. These stable vectors can be viewed as memories that the network recalls when presented with similar vectors.
• All neurons are interconnected.
• It has symmetric bidirectional weights, i.e. wij = wji.
• Every neuron is equally likely to fire, and only one fires at a time; hence it has asynchronous operation.
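These properties can be sketched for the common discrete (bipolar) Hopfield formulation, with Hebbian storage, symmetric weights and asynchronous updates; note this sketch uses the sign rule rather than the satlins transfer function described on the next slide, and the patterns are made up.

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian storage: symmetric weights (w_ij = w_ji), zero diagonal."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, state, steps=100, rng=np.random.default_rng(0)):
    """Asynchronous operation: one randomly chosen neuron fires at a time."""
    state = state.copy()
    for _ in range(steps):
        i = rng.integers(len(state))
        state[i] = 1 if W[i] @ state >= 0 else -1
    return state

stored = np.array([[1, -1, 1, -1, 1, -1],
                   [1, 1, 1, -1, -1, -1]])
W = train_hopfield(stored)
noisy = np.array([1, -1, 1, -1, -1, -1])   # corrupted version of the first pattern
print(recall(W, noisy))                    # should settle near a stored pattern
```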

Cont.
• The input p to this network just supplies the initial conditions.
The Hopfield network uses the saturated linear transfer
function satlins.

Saturated Linear Transfer Function


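For reference, satlins itself is just a linear function saturated at ±1; a one-line sketch:

```python
import numpy as np

def satlins(n):
    """Saturated linear transfer function: linear on [-1, 1], clipped outside."""
    return np.clip(n, -1.0, 1.0)

print(satlins(np.array([-2.0, -0.5, 0.0, 0.5, 2.0])))  # [-1.  -0.5  0.   0.5  1. ]
```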

Hopfield Networks (cont.)


• Hopfield networks can act as error correction or vector categorization networks.
• Input vectors are used as the initial conditions to the network, which recurrently updates until it reaches a stable output vector.
• Hopfield networks are interesting from a theoretical standpoint, but are seldom used in practice.
• Even the best Hopfield designs may have spurious (false) stable points that lead to incorrect answers.

RADIAL BASIS FUNCTION NEURAL NETWORK

Radial Basis Function Network (RBFNN)


• Radial basis function (RBF) neural networks (RBFNNs) were introduced by Broomhead and Lowe in 1988.
• The approach originated from the theory of function approximation.
• It uses radial basis functions as the activation functions of the hidden units.
• Hidden units are known as radial centers and are represented by the vectors c1, c2, ..., ch.
• The transformation from input space to hidden-unit space is nonlinear, whereas the transformation from hidden-unit space to output space is linear.

Radial Basis Function Networks (RBFNN)



RBFNN (cont.)
• The net input to the RBF transfer function is the vector distance between its weight vector w and the input vector p, multiplied by the bias b. The box ||dist|| accepts the input vector p and the single-row input weight matrix, and produces the distance between them.
• The transfer function of an RBF neuron is defined as:

radbas(n) = exp(−n²)

RBFNN (cont.)
• As the distance between w and p decreases, the output increases.
• Thus, an RB neuron acts as a detector that produces 1 whenever the input p is identical to its weight vector w.
• The bias b allows the sensitivity of the RB neuron to be adjusted.
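Putting the two slides together, a single radial basis neuron can be sketched as follows (NumPy; the weight vector, input and bias values are illustrative):

```python
import numpy as np

def radbas_neuron(w, p, b):
    """Net input = distance between weight vector w and input p, times bias b."""
    n = b * np.linalg.norm(w - p)
    return np.exp(-n ** 2)            # radbas(n) = exp(-n^2)

w = np.array([1.0, 2.0])
print(radbas_neuron(w, np.array([1.0, 2.0]), b=0.5))  # identical input -> output 1
print(radbas_neuron(w, np.array([3.0, 0.0]), b=0.5))  # farther input -> smaller output
```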

Two unnormalized Gaussian radial basis functions in one input dimension. The basis
function centers are located at c1=0.75 and c2=3.25.

Radial Basis Function Networks (RBFNN)


• There are many types of radial basis function neural networks, such as,
• Generalized Regression Neural Networks (GRNN) and
• Probabilistic Neural Networks (PNN).

Interpolation and Approximation of RBF networks:


• Assuming that there is no noise in the training data set, we need to estimate a function d(·) that yields the exact desired outputs for all training data.
• This task is usually called an interpolation problem, and the resultant function d(·) should pass through all of the training data points.
• Consider a Gaussian basis function centered at ui with a width parameter σ:

w_i = R_i(||x − u_i||) = exp( −||x − u_i||² / (2σ_i²) )

Each training input x_i serves as a center for the basis function R_i. Thus, the Gaussian interpolation radial basis function network (RBFN) is:

d(x) = Σ_{i=1}^{n} c_i exp( −||x − x_i||² / (2σ_i²) )

where c_i is the output value associated with the i-th receptive field and i = 1, ..., n.
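A sketch of this exact-interpolation case: with one Gaussian basis function centered at each training input and a fixed σ, the coefficients c_i are obtained by solving a linear system so that d(x) passes through every training point (1-D made-up data; all names and values are illustrative):

```python
import numpy as np

# Made-up 1-D training data (no noise assumed)
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
d_train = np.array([0.0, 0.8, 0.9, 0.1, -0.7])
sigma = 1.0

def gaussian_design(x, centers, sigma):
    """R[k, i] = exp(-||x_k - center_i||^2 / (2 sigma^2))."""
    diff = x[:, None] - centers[None, :]
    return np.exp(-diff ** 2 / (2 * sigma ** 2))

# Solve R c = d so the interpolant passes through all training points
R = gaussian_design(x_train, x_train, sigma)
c = np.linalg.solve(R, d_train)

def d(x):
    return gaussian_design(np.atleast_1d(x), x_train, sigma) @ c

print(np.allclose(d(x_train), d_train))   # True: exact interpolation
```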

• R_i may be a Gaussian function, a logistic function, or another radial function. The final output may be a weighted sum of the output values associated with the receptive fields, or a weighted average of those outputs:

weighted sum:      d(x) = Σ_{i=1}^{H} c_i w_i = Σ_{i=1}^{H} c_i R_i(x)

weighted average:  d(x) = ( Σ_{i=1}^{H} c_i w_i ) / ( Σ_{i=1}^{H} w_i ) = ( Σ_{i=1}^{H} c_i R_i(x) ) / ( Σ_{i=1}^{H} R_i(x) )
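The two output forms differ only in whether the basis responses are normalized; a small sketch with illustrative centers and coefficients:

```python
import numpy as np

centers = np.array([0.0, 1.5, 3.0])      # receptive-field centers (illustrative)
c = np.array([1.0, -0.5, 2.0])           # output value of each receptive field
sigma = 1.0

def responses(x):
    return np.exp(-(x - centers) ** 2 / (2 * sigma ** 2))

def d_weighted_sum(x):
    w = responses(x)
    return np.sum(c * w)                  # d(x) = sum_i c_i R_i(x)

def d_weighted_average(x):
    w = responses(x)
    return np.sum(c * w) / np.sum(w)      # d(x) = sum_i c_i R_i(x) / sum_i R_i(x)

print(d_weighted_sum(2.0), d_weighted_average(2.0))
```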

RBFN Training
• Several training techniques are used for optimal selection of the parameter vectors ci and wi. Some of them are:
i. Random selection (results in fixed centers).
ii. Deploying a self-organized methodology.
iii. Adopting supervised learning processes.
• An RBFN's approximation capacity may be further improved with supervised adjustment of the centers and shapes of the receptive field (radial basis) functions.
• Several learning algorithms have been proposed to identify and update all modifiable parameters. Linear parameters can be updated using the least squares method or the gradient method.

Random Selection
• Results in fixed centers for the RBFs of the hidden units.
• The locations of the centers can be chosen randomly from the training data set. Different values of centers and widths are used for each radial basis function.
• Hence, some trial and error with the training data is required.
• The output layer weights are usually learned by a technique called the pseudo-inverse.
• The main problem with random selection is that it requires a large training set for optimal performance.
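With the centers fixed, the hidden activations form a fixed design matrix and the output weights follow from the pseudo-inverse; a sketch on made-up 1-D data (σ and the number of centers are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(0, 4, size=50)                 # made-up 1-D training inputs
t = np.sin(X)                                  # made-up targets
centers = rng.choice(X, size=8, replace=False) # centers picked at random from the data
sigma = 0.7

# Hidden-layer design matrix: one Gaussian response per (sample, center) pair
H = np.exp(-(X[:, None] - centers[None, :]) ** 2 / (2 * sigma ** 2))

w_out = np.linalg.pinv(H) @ t                  # output weights via pseudo-inverse
print(np.mean((H @ w_out - t) ** 2))           # training MSE
```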

Self-Organized Selection
• Usually deploys hybrid learning:
• The centers of the hidden-layer RBFs are learned using self-organized clustering techniques, for example k-means.
• The output-layer linear weights are estimated via supervised learning, e.g. using the LMS algorithm.
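A sketch of this hybrid scheme: a few k-means iterations position the centers, then the linear output weights are adapted with the LMS rule (all sizes, widths and learning rates are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.uniform(0, 4, size=60)        # made-up 1-D inputs
t = np.cos(X)                         # made-up targets
k, sigma, lr = 6, 0.7, 0.05

# --- Stage 1: self-organized selection of centers with k-means ---
centers = rng.choice(X, size=k, replace=False)
for _ in range(20):
    labels = np.argmin(np.abs(X[:, None] - centers[None, :]), axis=1)
    for j in range(k):
        if np.any(labels == j):
            centers[j] = X[labels == j].mean()

# --- Stage 2: supervised LMS updates of the output weights ---
w = np.zeros(k)
for _ in range(50):                   # epochs
    for x_i, t_i in zip(X, t):
        h = np.exp(-(x_i - centers) ** 2 / (2 * sigma ** 2))
        w += lr * (t_i - w @ h) * h   # LMS: w <- w + lr * error * input
```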

RBFN Fitting:
• Set σ =1.0 for both exponential and Gaussian basis functions.

w_i = R_i(||x − x_i||)
• When the basis functions do not have enough overlap, the
weighted sum of the hidden outputs may generate curves that are
not smooth enough.

RBFN Fitting:
• An example of a radial basis algorithm is the following two-layer network.

• The first layer has radial basis neurons. Each neuron's net input is the distance between the input vector and its weight vector, multiplied by its bias.
• The second layer has pure linear neurons. Its net input is calculated by combining its weighted inputs and biases.

RBFN Fitting:
• Both layers have biases. Initially the radbas layer has no neurons. The following steps are repeated until the network's mean squared error falls below the goal:
• The network is simulated.
• The input vector with the greatest error is found.
• A radbas neuron is added with weights equal to that vector.
• The pure linear layer weights are redesigned to minimize the error.
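A simplified sketch of this greedy procedure (not MATLAB's newrb itself; biases are omitted and σ is fixed for brevity, and the data are made up): start with no radial basis neurons, repeatedly add one centered on the worst-fitted input, and refit the linear layer by least squares.

```python
import numpy as np

X = np.linspace(0, 4, 40)             # made-up 1-D training inputs
t = np.sin(2 * X)                     # made-up targets
sigma, goal = 0.5, 1e-3

centers = []                          # radbas layer starts with no neurons
pred = np.zeros_like(t)
while np.mean((pred - t) ** 2) > goal and len(centers) < len(X):
    worst = np.argmax(np.abs(pred - t))        # input vector with the greatest error
    centers.append(X[worst])                   # add a radbas neuron at that input
    H = np.exp(-(X[:, None] - np.array(centers)[None, :]) ** 2 / (2 * sigma ** 2))
    w, *_ = np.linalg.lstsq(H, t, rcond=None)  # redesign the linear layer weights
    pred = H @ w

print(len(centers), np.mean((pred - t) ** 2))  # neurons used and final MSE
```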

Example of Radial Basis Approximation: (a) training data



(b) weighted sum of radial basis transfer functions



(c) Correct weight and bias values



The following is adopted from:



RBFN vs Multilayer Network


• Comparison between an RBF network and a multilayer network:

RBFNN
• Consider the following RBFNN;

Cont.
• When the input falls within a small local region of the input space, the radial basis functions in the hidden layer produce a noticeable response.
• Each hidden unit owns a unique receptive field in input space.
• An input vector xi that lies in the receptive field for center cj activates cj, and the target output is obtained by a proper choice of weights. The output is given by:

• where Φ is some radial function.



Cont.
• There are several radial functions; the one most commonly used is the Gaussian.



Learning via Fixed Center Approach (Gaussian RBF)


