
18CSE352T

NEURO FUZZY AND GENETIC PROGRAMMING

SESSION 3
Topics that will be covered in this Session
• Learning Algorithms
• The Basic Principle of ANN Learning
• Supervised Learning
• Hebb Rule
• Perceptron Learning Rule
• Delta Rule
• Extended Delta Rule
Learning Algorithms
• An ANN is characterized by three entities:
• Its architecture
• Activation function
• Learning technique
• Learning refers to the process of finding the appropriate set of weights of the
interconnections so that the ANN attains the ability to perform the designated task.
• This process is called Training the ANN
• How do we find the appropriate set of weights so that the ANN is able to solve a given
problem?
• Start with an initial set of weights and then gradually modify them to arrive at the final
weights
The Basic Principle of ANN Learning
• The ANN starts with an initial distribution of interconnection weights and then
goes on adjusting the weights iteratively until some predefined stopping
criterion is satisfied
• The weight of a certain interconnection path at iteration $t$ is denoted $w_{ij}(t)$
• The weight at iteration $(t+1)$ is obtained by

$w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t)$

• where $\Delta w_{ij}(t)$ is the adjustment to the weight $w_{ij}(t)$

• A learning algorithm is characterized by the method undertaken by it to
compute $\Delta w_{ij}(t)$
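A minimal sketch of this generic scheme, with a hypothetical compute_adjustment callback standing in for a concrete learning rule (the rules in the following sections each define one); the stopping tolerance and iteration cap are illustrative assumptions:

```python
import numpy as np

def train(weights, inputs, targets, compute_adjustment, max_iter=100):
    """Generic ANN training loop: w(t+1) = w(t) + delta_w(t).

    `compute_adjustment` is a placeholder for a concrete learning
    rule (Hebb, perceptron, delta, ...) covered later in this session.
    """
    for _ in range(max_iter):                 # predefined stopping criterion
        delta_w = compute_adjustment(weights, inputs, targets)
        weights = weights + delta_w           # iterative weight adjustment
        if np.all(np.abs(delta_w) < 1e-6):    # stop once adjustments vanish
            break
    return weights
```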
Supervised Learning
• Labelled training data
• The computer is presented with example inputs & outputs
• A teacher is present
Input Arguments        Class / Target Attribute

SIZE      COLOR        FRUIT NAME
BIG       RED          APPLE
SMALL     RED          CHERRY
BIG       GREEN        BANANA
SMALL     GREEN        GRAPE

• Classification : Target attribute → Categorical


• Prediction (Regression) : Target attribute → Numeric
Unsupervised Learning
• Unlabelled training data
• Derives structure from data based on relationships among attributes
• No teacher

Input Arguments

SIZE      COLOR
BIG       RED
SMALL     RED
BIG       GREEN
SMALL     GREEN

[Figure: the patterns grouped on Size & Color — Size splits into Big / Small, Color into Red / Green, yielding the clusters Big & Red, Big & Green, Small & Red, Small & Green]

• Clustering
Supervised Learning
• A neural network is trained with the help of a set of patterns known as the training
vectors
• The target outputs corresponding to these vectors may or may not be known beforehand
• When they are known and that knowledge is employed in the training process, the
training is termed supervised learning
• Otherwise, the learning is said to be unsupervised
• Some popular supervised learning methods are
• Perceptron Learning
• Delta Learning
• Least-Mean-Square (LMS) Learning
• Correlation Learning
• Outstar Learning
Linearly Separable Data
• Two classes of patterns are linearly separable if a straight line (more generally, a
hyperplane) can divide the pattern space so that all patterns of one class lie on one
side and all patterns of the other class lie on the opposite side
1. Hebb Rule
• It is one of the earliest learning rules for ANNs
• Weight adjustment is computed as

$\Delta w_i = x_i \, t$

where $t$ is the target activation and $x_i$ is the activation of the $i$-th input unit
• Points to note:
• Hebb Rule cannot learn when the target is 0 (because the weight adjustment
$\Delta w_i = x_i \, t$ becomes 0, irrespective of the value of $x_i$)
• Hebb Rule results in better learning if the input/output both are in bipolar form
• It does not guarantee to learn a classification instance even if the classes are
linearly separable
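A minimal sketch of the rule on bipolar data (the AND-style patterns and the appended bias input are illustrative assumptions, not from the slides):

```python
import numpy as np

# Bipolar training pairs; the last column is a bias input fixed at 1
X = np.array([[ 1,  1, 1],
              [ 1, -1, 1],
              [-1,  1, 1],
              [-1, -1, 1]])
t = np.array([1, -1, -1, -1])     # bipolar AND targets

w = np.zeros(3)
for x_i, t_i in zip(X, t):
    w += x_i * t_i                # Hebb rule: delta_w = x * t
print(w)                          # learned weights: [2. 2. -2.]
```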
2. Perceptron Learning Rule
What is a Perceptron?
• The Perceptron is one of the earliest neural network
models proposed by Rosenblatt in 1962
• It has a simple structure, pattern-classifying behaviour, and
learning ability
Structure of a Perceptron
• It consists of a number of input units and a processing
unit
• The perceptron sends an output 1 if the net input is
greater than a predefined adjustable threshold value θ.
Otherwise it sends output 0
• It is customary to include the adjustable threshold θ as
an additional weight attached to an input which is
permanently maintained as 1 (see the sketch below)
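A small sketch of the equivalence behind this convention (the input values, weights, and θ are illustrative): the test Σ wᵢxᵢ > θ becomes w₀·1 + Σ wᵢxᵢ > 0, with the extra weight w₀ = −θ attached to the constant input:

```python
import numpy as np

x = np.array([0.5, -1.0])      # input pattern (illustrative values)
w = np.array([0.8,  0.3])      # interconnection weights
theta = 0.2                    # adjustable threshold

out_threshold = 1 if w @ x > theta else 0   # compare net input to theta

x_b = np.append(x, 1.0)        # extra input permanently held at 1
w_b = np.append(w, -theta)     # threshold absorbed as the weight w0 = -theta
out_bias = 1 if w_b @ x_b > 0 else 0        # compare to zero instead

assert out_threshold == out_bias            # the two formulations agree
```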
Perceptron Learning Rule – cont..
• The inputs to a perceptron are real
values
• The output is binary (0 or 1)
• The perceptron comprises
• The input units
• The weights
• The summation processor
• Activation function
• Adjustable threshold value
• The perceptron acts as the basic ANN
structure for pattern classification
Perceptron Learning Rule – cont..
• It is convenient to use the bipolar activation function

$y_{out} = f(y_{in}) = \begin{cases} +1, & \text{if } y_{in} > 0 \\ -1, & \text{otherwise} \end{cases}$

• Let
$X = (x_1, x_2, \ldots, x_n)$ be the training vector,
$t \in \{+1, -1\}$ be the target output,
$y_{out}$ be the output of the perceptron, and
$W = (w_1, w_2, \ldots, w_n)$ be the current combination of weights

Learning Strategy
• If the perceptron produces the desired output, then the weights need not be changed
• If the perceptron misclassifies X negatively (if it erroneously produces -1 instead of +1), then the weights should be appropriately increased
• If the perceptron misclassifies X positively (if it erroneously produces +1 instead of -1), then the weights should be appropriately decreased
Perceptron Learning Rule – cont..
• The perceptron learning rule can be formulated as

$\Delta w_i = \eta \, (t - y_{out}) \, x_i \quad$ for $i = 0, 1, \ldots, n$

• Here η is a constant known as the learning rate
• When a training vector is correctly classified, then $t = y_{out}$ and the weight adjustment $\Delta w_i = 0$
• If the pattern is misclassified negatively, that is $t = +1$ and $y_{out} = -1$, then $(t - y_{out}) = 2 > 0$ and so $\Delta w_i$ is incremental and is
proportional to $x_i$
• If the pattern is misclassified positively, that is $t = -1$ and $y_{out} = +1$, then $(t - y_{out}) = -2 < 0$ and so $\Delta w_i$ is decremental and is
proportional to $x_i$
• Using matrix notation, the perceptron learning rule can be written as

$W_{new} = W_{old} + \eta \, (t - y_{out}) \, X$

where $W$ and $X$ are the vectors corresponding to the interconnection weights and the inputs
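A minimal sketch of the rule on linearly separable bipolar data (the AND-style patterns, the η value, and the epoch cap are illustrative assumptions):

```python
import numpy as np

def bipolar_step(y_in):
    return 1 if y_in > 0 else -1            # bipolar activation function

# Bipolar AND data; the last column is the bias input held at 1
X = np.array([[1, 1, 1], [1, -1, 1], [-1, 1, 1], [-1, -1, 1]])
t = np.array([1, -1, -1, -1])

w = np.zeros(3)
eta = 0.5                                   # learning rate (assumed value)
for epoch in range(100):
    errors = 0
    for x_i, t_i in zip(X, t):
        y_out = bipolar_step(w @ x_i)
        if y_out != t_i:                    # misclassified pattern
            w += eta * (t_i - y_out) * x_i  # perceptron learning rule
            errors += 1
    if errors == 0:                         # all patterns classified correctly
        break
print(w)                                    # e.g. [1. 1. -1.]
```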
3. Delta / LMS (Least Mean Square) / Widrow-Hoff Rule

• Here the weight adjustment is computed as

$\Delta w_i = \eta \, (t - y_{in}) \, x_i$

where $y_{in}$ is the net input to the output unit
• In LMS learning, the identity function is used as the activation function during the
training phase (so that $y_{out} = y_{in}$)
• The learning rule minimizes the mean square error between the activation and the
target value
• The output of LMS is in binary form
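A minimal sketch of the delta rule (data and learning rate are illustrative assumptions); note the error term uses the net input $y_{in}$ itself, per the identity activation used during training:

```python
import numpy as np

X = np.array([[1, 1, 1], [1, -1, 1], [-1, 1, 1], [-1, -1, 1]])  # bias input = 1
t = np.array([1.0, -1.0, -1.0, -1.0])

w = np.zeros(3)
eta = 0.1
for epoch in range(50):
    for x_i, t_i in zip(X, t):
        y_in = w @ x_i                 # identity activation during training
        w += eta * (t_i - y_in) * x_i  # delta rule: minimizes mean square error
print(w)                               # approaches the least-squares weights [0.5, 0.5, -0.5]
```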
4. Extended Delta Rule

• The Extended Delta Rule removes the restriction that the output activation function
must be the identity function
• Any differentiable function can be used for this purpose
• Here the weight adjustment is computed as

$\Delta w_i = \eta \, (t - y_{out}) \, x_i \, g'(y_{in})$

where g(.) is the output activation function and g'(.) is its first derivative
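A minimal sketch, taking g to be tanh (an assumed choice; any differentiable activation works), with the same illustrative bipolar data and an assumed learning rate:

```python
import numpy as np

def g(y_in):
    return np.tanh(y_in)               # differentiable activation (assumed choice)

def g_prime(y_in):
    return 1.0 - np.tanh(y_in) ** 2    # its first derivative

X = np.array([[1, 1, 1], [1, -1, 1], [-1, 1, 1], [-1, -1, 1]])  # bias input = 1
t = np.array([1.0, -1.0, -1.0, -1.0])

w = np.zeros(3)
eta = 0.2
for epoch in range(500):
    for x_i, t_i in zip(X, t):
        y_in = w @ x_i
        y_out = g(y_in)
        # extended delta rule: delta_w = eta * (t - y_out) * g'(y_in) * x
        w += eta * (t_i - y_out) * g_prime(y_in) * x_i
print(w)
```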
