
Feed Forward Neural Networks

Supervised Learning - I

Course: Soft Computing


By: Dr P Indira Priyadarsini



FEED FORWARD NEURAL NETWORKS
1. Feed-forward training of input patterns
• The inputs are fed simultaneously into the units making up the input layer.
• These inputs pass through the input layer, are weighted, and are then fed
simultaneously to a second layer of units, known as a hidden layer.
• The weighted outputs of the last hidden layer are input to the units making up the
output layer, which emits the network's prediction for the given tuples.

2. Back-propagation of errors

• Each output node compares its activation with the desired output.
• The error is propagated backwards to upstream nodes.

3. Weight adjustment
• The weights of all links are adjusted simultaneously, based on the error that was
propagated backwards. (A minimal sketch of these three steps follows below.)
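The three steps can be seen in a minimal sketch, assuming a tiny 2-input, 2-hidden-unit, 1-output network with a sigmoid activation; the input pattern, target, and learning rate below are illustrative assumptions, not values from these slides.

```python
import numpy as np

# Minimal sketch of the three steps for a 2-input, 2-hidden-unit, 1-output
# network. Sigmoid activation, input pattern, target, and learning rate are
# illustrative assumptions.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x  = np.array([0.5, 0.9])               # input pattern fed to the input layer
yd = np.array([1.0])                    # desired output for this tuple
W1 = rng.uniform(-0.5, 0.5, (2, 2))     # input -> hidden weights
W2 = rng.uniform(-0.5, 0.5, (2, 1))     # hidden -> output weights
alpha = 0.1                             # learning rate

# 1. Feed-forward: weighted inputs pass through the hidden layer, and the
#    weighted hidden outputs feed the output layer.
h = sigmoid(x @ W1)                     # hidden-layer activations
y = sigmoid(h @ W2)                     # the network's prediction

# 2. Back-propagation of errors: compare the output with the desired value
#    and propagate the error back to the upstream (hidden) nodes.
delta_out = (yd - y) * y * (1 - y)
delta_hid = (delta_out @ W2.T) * h * (1 - h)

# 3. Weight adjustment: all links are updated from the propagated error.
W2 += alpha * np.outer(h, delta_out)
W1 += alpha * np.outer(x, delta_hid)
```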



Single-Layer Feed-Forward Network



What is meant by learning?
Learning is the ability of the neural network (NN) to learn from its environment and
to improve its performance through that learning:
- The NN is stimulated by an environment
- The NN undergoes changes in its free parameters
- The NN responds in a new way to the environment

Definition of Learning
• Learning is a process by which the free parameters of a neural network are
adapted through a process of stimulation by the environment in which the
network is embedded.
• The type of learning is determined by the manner in which the parameter
changes take place.



LEARNING AND MEMORY

- What are the specific features of humans when compared to animals?

- Learning and memory are what make humans intelligent.

- Learning is the process by which we acquire new information about the
world around us.

- Memories hold that information over time.

- Learning allows us to store and retain knowledge. It builds our memories.

- We are not born with most of the information in our brains.

- We acquire it through learning.

- It is through this interleaved process of knowledge acquisition and
storage that the environment alters our behavioral responses.
- Names, faces, streets, algebra, programming, art, dance, sport, reading,
playing the piano. We learn them all.

- We are what we are because of what we learn.

- Memory is the glue that binds our mental life.

- Specific disorders of memory can be devastating.

- Life has no meaning without memory.

- Imagine not being able to recall where we were a moment ago, what we
read a couple of hours ago, or what we said some time ago. Where did we
go to high school?

- What is the name of one’s child?

- Life is tragic without memory. It is meaningless.



• Neural networks were designed on analogy with the human brain.

• The brain’s memory, however, works by association.

• For example, we can recognize a familiar face even in an unfamiliar
environment within 100-200 ms.

• We can also recall a complete sensory experience, including sounds and
scenes, when we hear only a few bars of music. The brain routinely
associates one thing with another.

• Human memory is essentially associative. One thing may remind us of
another, and that of another, and so on.

• We use a chain of mental associations to recover a lost memory. If we
forget where we left an umbrella, we try to recall where we last had it,
what we were doing, and who we were talking to. We attempt to establish
a chain of associations, and thereby to restore a lost memory.



• Multilayer neural networks trained with the back-propagation algorithm are
used for pattern recognition problems. However, to emulate the human
memory's associative characteristics we need a different type of network: a
recurrent neural network.
• A recurrent neural network has feedback loops from its outputs to its inputs.
The presence of such loops has a profound impact on the learning capability
of the network.

There are two types of memory in the brain:

1. Short-term memory (STM)

2. Long-term memory (LTM)


• Once a memory is created, it must be stored (no matter how briefly). Many
experts think there are three ways we store memories: first in the sensory
stage; then in short-term memory; and ultimately, for some memories, in
long-term memory. Because there is no need for us to maintain everything in
our brain, the different stages of human memory function as a sort of filter
that helps to protect us from the flood of information that we're confronted
with on a daily basis.
• Short-term memory has a fairly limited capacity; it can hold about seven items
for no more than 20 or 30 seconds at a time.
• You may be able to increase this capacity somewhat by using various memory
strategies.
• For example, a ten-digit number such as 8005840392 may be too much for your
short-term memory to hold.
• But divided into chunks, as in a telephone number, 800-584-0392 may actually
stay in your short-term memory long enough for you to dial the telephone. 
• Important information is gradually transferred from short-term memory into
long-term memory. The more the information is repeated or used, the more
likely it is to eventually end up in long-term memory, or to be "retained."

• Long term memory can be further divided into two: implicit and explicit
memory.

• Explicit and implicit memory are stored at different sites and use a different
logic.

• Explicit memory deals with everyday facts and events. Its storage site is the
medial temporal lobe and the hippocampus.
• Explicit memory is defined as the memory that involves conscious
recollection of past experience.
• For example: recollection of a birthday party celebrated three days ago.

• Implicit memory is defined as the memory in which past experience is
utilized without conscious awareness.

• For example: cycling, driving, playing tennis.

• Implicit memories are stored differently depending upon how they are
acquired.



LEARNING AND MEMORY. The hippocampus, parahippocampal
region, and areas of the cerebral cortex (including prefrontal cortex)
compose a system that supports declarative, or cognitive, memory.
Different forms of nondeclarative, or behavioral, memory are
supported by the amygdala, striatum, and cerebellum.



• We just discussed a form of supervised learning.
• A “teacher” tells the network what the correct output is based
on the input, until the network learns the target concept.

• Supervised learning allows you to collect data or produce a data
output from previous experience.
• It helps you to optimize performance criteria using experience.
• Supervised machine learning helps you to solve various types of real-
world computation problems.
• Examples: back-propagation, perceptron learning and Hopfield
networks.
• We can also train networks where there is no teacher.

• This is called unsupervised learning.

• The network learns a prototype based on the distribution of patterns
in the training data (see the sketch below).
• Such networks allow us to:
  • Discover the underlying structure of the data
  • Encode or compress the data
  • Transform the data
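A tiny sketch of what "learning a prototype from the distribution of patterns" can look like, using simple competitive learning (the mechanism underlying self-organizing maps). The data, the number of prototypes, the learning rate, and the number of passes are all illustrative assumptions; the slides do not specify them.

```python
import numpy as np

# Unlabeled patterns drawn from two clusters; no teacher provides targets.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
                  rng.normal(1.0, 0.1, (20, 2))])

prototypes = rng.uniform(0.0, 1.0, (2, 2))   # two prototype vectors
eta = 0.1                                    # learning rate

for _ in range(50):                          # repeated presentation of the data
    for x in data:
        # the prototype closest to the pattern "wins" ...
        k = np.argmin(np.linalg.norm(prototypes - x, axis=1))
        # ... and is moved a little toward the pattern
        prototypes[k] += eta * (x - prototypes[k])

print(prototypes)   # prototypes drift toward regions where patterns concentrate
```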
• Here are the prime reasons for using unsupervised learning:

• Unsupervised machine learning finds all kinds of unknown patterns in data.

• Unsupervised methods help you to find features which can be useful for
categorization.

• It takes place in real time, so all the input data can be analyzed and labeled in
the presence of learners.

• It is easier to get unlabeled data from a computer than labeled data, which
needs manual intervention.

• Examples: self-organizing maps (SOM) and Adaptive Resonance Theory (ART)



SUPERVISED VERSUS UNSUPERVISED LEARNING



ERROR CORRECTION AND GRADIENT DESCENT RULES



Perceptron Learning Algorithm
• In 1958, Frank Rosenblatt introduced a training algorithm that provided the
first procedure for training a simple ANN: a perceptron.
• The perceptron is the simplest form of a neural network. It consists of a
single neuron with adjustable synaptic weights and a hard limiter.
• The operation of Rosenblatt's perceptron is based on the McCulloch and
Pitts neuron model. The model consists of a linear combiner followed by a
hard limiter.
• The weighted sum of the inputs is applied to the hard limiter, which
produces an output equal to +1 if its input is positive and -1 if it is negative.
• The aim of the perceptron is to classify inputs x1, x2, ..., xn into one of two
classes, say A1 and A2.
• In the case of an elementary perceptron, the n-dimensional space is divided
by a hyperplane into two decision regions. The hyperplane is defined by the
linearly separable function:

$$\sum_{i=1}^{n} x_i w_i - \theta = 0$$



Linear separability in the perceptrons

[Figure: (a) Two-input perceptron, whose decision boundary x1w1 + x2w2 - θ = 0
separates Class A1 from Class A2 in the (x1, x2) plane; (b) three-input perceptron,
whose separating hyperplane is x1w1 + x2w2 + x3w3 - θ = 0.]



How does the perceptron learn its classification tasks?
This is done by making small adjustments in the weights to reduce the difference
between the actual and desired outputs of the perceptron. The initial weights are
randomly assigned, usually in the range [-0.5, 0.5], and are then updated to obtain
an output consistent with the training examples.

If at iteration p the actual output is Y(p) and the desired output is Yd(p), then the
error is given by:

$$e(p) = Y_d(p) - Y(p), \qquad p = 1, 2, 3, \ldots$$

Iteration p here refers to the pth training example presented to the perceptron.

• If the error, e(p), is positive, we need to increase perceptron output Y(p),
but if it is negative, we need to decrease Y(p).



The perceptron learning rule

$$w_i(p+1) = w_i(p) + \alpha \cdot x_i(p) \cdot e(p)$$

where p = 1, 2, 3, ... and α is the learning rate, a
positive constant less than unity.
The perceptron learning rule was first proposed by
Rosenblatt in 1960. Using this rule we can derive
the perceptron training algorithm for classification
tasks.
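A sketch of a single application of this rule. The numbers correspond to the third pattern of epoch 1 in the AND example near the end of these slides (inputs 1 and 0, desired output 0, weights 0.3 and -0.1, θ = 0.2, α = 0.1); the 0/1 step output used here matches that example.

```python
import numpy as np

alpha = 0.1                        # learning rate (positive constant < 1)
theta = 0.2                        # threshold
w  = np.array([0.3, -0.1])         # current weights w1(p), w2(p)
x  = np.array([1, 0])              # inputs x1(p), x2(p)
Yd = 0                             # desired output

Y = 1 if np.dot(x, w) - theta >= 0 else 0   # actual output (step activation)
e = Yd - Y                                   # e(p) = Yd(p) - Y(p) = -1
w = w + alpha * x * e                        # w_i(p+1) = w_i(p) + alpha*x_i(p)*e(p)
print(w)                                     # -> [0.2, -0.1]
```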

Perceptron's training algorithm

Step 1: Initialisation
Set initial weights w1, w2, ..., wn and threshold θ
to random numbers in the range [-0.5, 0.5].

Perceptron's training algorithm (continued)

Step 2: Activation
Activate the perceptron by applying inputs x1(p),
x2(p), ..., xn(p) and desired output Yd(p). Calculate
the actual output at iteration p = 1:

$$Y(p) = \mathrm{step}\left[\sum_{i=1}^{n} x_i(p)\, w_i(p) - \theta\right]$$

where n is the number of perceptron inputs, and step
is a step activation function.
The step function is one of the simplest kinds of activation function: we consider
a threshold value and, if the net input is greater than the threshold, the neuron
is activated.
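A minimal sketch of such a step function; returning 1 when the argument is exactly zero is an assumption, but it is the choice consistent with the AND example later in these slides.

```python
def step(v):
    # 1 when the thresholded net input is non-negative, 0 otherwise
    return 1 if v >= 0 else 0

# Usage: Y_p = step(sum(x_i * w_i for x_i, w_i in zip(x, w)) - theta)
```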
Perceptron's training algorithm (continued)
Step 3: Weight training
Update the weights of the perceptron:

$$w_i(p+1) = w_i(p) + \Delta w_i(p)$$

where Δwi(p) is the weight correction at iteration p.
The weight correction is computed by the delta rule:

$$\Delta w_i(p) = \alpha \cdot x_i(p) \cdot e(p)$$

Step 4: Iteration
Increase iteration p by one, go back to Step 2 and
repeat the process until convergence.
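Putting the four steps together, a sketch of the whole training loop in Python. The function name, the cap on the number of epochs, and the stopping test (an epoch with no errors) are assumptions; the slides only say to repeat until convergence.

```python
import numpy as np

def step(v):
    return 1 if v >= 0 else 0

def train_perceptron(X, Yd, alpha=0.1, theta=0.2, max_epochs=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: initialisation - random weights in [-0.5, 0.5]
    w = rng.uniform(-0.5, 0.5, X.shape[1])
    for _ in range(max_epochs):
        errors = 0
        for x, yd in zip(X, Yd):
            # Step 2: activation - compute the actual output
            y = step(np.dot(x, w) - theta)
            # Step 3: weight training - delta rule
            e = yd - y
            w = w + alpha * x * e
            errors += abs(e)
        # Step 4: iteration - stop once an epoch produces no errors
        if errors == 0:
            break
    return w
```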
Example of perceptron learning: the logical operation AND

Epoch  Inputs     Desired    Initial weights   Actual    Error   Final weights
       x1   x2    output Yd  w1      w2        output Y  e       w1      w2
  1    0    0     0          0.3    -0.1       0          0      0.3    -0.1
       0    1     0          0.3    -0.1       0          0      0.3    -0.1
       1    0     0          0.3    -0.1       1         -1      0.2    -0.1
       1    1     1          0.2    -0.1       0          1      0.3     0.0
  2    0    0     0          0.3     0.0       0          0      0.3     0.0
       0    1     0          0.3     0.0       0          0      0.3     0.0
       1    0     0          0.3     0.0       1         -1      0.2     0.0
       1    1     1          0.2     0.0       1          0      0.2     0.0
  3    0    0     0          0.2     0.0       0          0      0.2     0.0
       0    1     0          0.2     0.0       0          0      0.2     0.0
       1    0     0          0.2     0.0       1         -1      0.1     0.0
       1    1     1          0.1     0.0       0          1      0.2     0.1
  4    0    0     0          0.2     0.1       0          0      0.2     0.1
       0    1     0          0.2     0.1       0          0      0.2     0.1
       1    0     0          0.2     0.1       1         -1      0.1     0.1
       1    1     1          0.1     0.1       1          0      0.1     0.1
  5    0    0     0          0.1     0.1       0          0      0.1     0.1
       0    1     0          0.1     0.1       0          0      0.1     0.1
       1    0     0          0.1     0.1       0          0      0.1     0.1
       1    1     1          0.1     0.1       1          0      0.1     0.1

Threshold: θ = 0.2; learning rate: α = 0.1
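The epochs in this table can be reproduced with a short script. The sketch below starts from the initial weights in the first row (0.3 and -0.1) rather than random ones, and stores the weights as exact fractions so that the case where the net input exactly equals the threshold behaves as in the table instead of being disturbed by floating-point rounding.

```python
from fractions import Fraction as F

def step(v):
    return 1 if v >= 0 else 0

X  = [(0, 0), (0, 1), (1, 0), (1, 1)]     # input patterns x1, x2
Yd = [0, 0, 0, 1]                          # desired outputs: logical AND
w  = [F(3, 10), F(-1, 10)]                 # initial weights 0.3 and -0.1
alpha, theta = F(1, 10), F(2, 10)          # learning rate 0.1, threshold 0.2

for epoch in range(1, 6):
    for (x1, x2), yd in zip(X, Yd):
        y = step(x1 * w[0] + x2 * w[1] - theta)     # activation
        e = yd - y                                   # error
        w = [w[0] + alpha * x1 * e,                  # weight training
             w[1] + alpha * x2 * e]
    print(epoch, [float(wi) for wi in w])
# Epoch 5 ends with weights [0.1, 0.1], as in the last row of the table.
```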
THANK YOU

