
Artificial Neural Network

Human brain?
Consider recognizing a face: the human brain does it effortlessly,
while the same task is hard for a computer.
Is it an easy task for a computer?
Switching speed of a silicon chip: ~10^-9 s
Switching speed of a neuron: ~10^-3 s
Yet the brain is a massively parallel network of neurons, with
about 10 billion neurons and some 60 trillion
interconnections.
What we can do is try to mimic some part of how the brain
works.

Usefulness and Capability of NN


Non-linearity
Interconnections of non-linear neurons; this means the
non-linearity is distributed throughout the network.
Input-output mapping
Learning from labeled examples (involves a teacher).
Learning ability
Adaptability
Can adapt its free parameters to changes in the
surrounding environment.

Evidential Response
A decision comes with a measure of confidence.
Fault tolerance
Performance degrades gracefully rather than failing abruptly.

All these capabilities can be achieved or mimicked by an ANN.

Why Artificial Neural Networks?


There are two basic reasons why we are interested in
building artificial neural networks (ANNs):

Technical viewpoint: Some problems, such as
character recognition or the prediction of future
states of a system, require massively parallel and
adaptive processing.

Biological viewpoint: ANNs can be used to
replicate and simulate components of the human
(or animal) brain, thereby giving us insight into
natural information processing.

Neural Network

[Figure: a biological neuron]

Biological inspiration
The spikes travelling along the axon of the pre-synaptic
neuron trigger the release of neurotransmitter substances at the
synapse.
The neurotransmitters cause excitation or inhibition in the
dendrite of the post-synaptic neuron.
The integration of the excitatory and inhibitory signals may
produce spikes in the post-synaptic neuron.
The contribution of the signals depends on the strength of the
synaptic connection.

Fundamentals of Biological Neuron


Synapses can be excitatory or inhibitory.
Spikes (signals) arriving at an excitatory synapse tend
to cause the receiving neuron to fire.
Spikes (signals) arriving at an inhibitory synapse tend
to inhibit the receiving neuron from firing.
The cell body and synapses essentially compute the
difference between the incoming excitatory and
inhibitory inputs.
When this difference is large enough (compared to
the neuron's threshold), the neuron will fire.
The faster excitatory spikes arrive at its synapses,
the faster it will fire.

Artificial Neural Networks


The building blocks of neural networks are the neurons.
In technical systems, we also refer to them as units or nodes.

Basically, each neuron
receives input from many other neurons,
changes its internal state (activation) based on the current
input, and
sends one output signal to many other neurons, possibly
including its input neurons (recurrent network).

Artificial Neural Networks


Information is transmitted as a series of electric
impulses, so-called spikes.
The frequency and phase of these spikes encode the
information.
In biological systems, one neuron can be connected to as
many as 10,000 other neurons.
Usually, a neuron receives its information from other
neurons in a confined area, its so-called receptive field.

Artificial Neural Network

This configuration is actually called a perceptron.


A perceptron models a neuron by taking a weighted sum of its
inputs and sending the output 1 if the sum is greater than
some adjustable threshold value, and 0 otherwise.
This thresholding rule is called an activation function.

Perceptron
A computer model or computerized machine devised to
represent or simulate the ability of the brain to recognize and
discriminate.
Perceptrons are the simplest structures to start with when
studying neural networks.
The links between the nodes not only show the relationship
between the nodes but also transmit data and information,
called a signal or impulse.
The perceptron is a simple model of a neuron (nerve cell).

Few points about single layer Perceptron


The strength of a connection is called the synaptic
weight.
The larger the value, the stronger the connection.
The activation function output is either 0 or 1.
The perceptron algorithm was invented in 1957 at
the Cornell Aeronautical Laboratory by Frank
Rosenblatt, funded by the United States Office of
Naval Research.

Perceptron: Neuron Model


(A special form of single-layer feedforward network.)
The perceptron, first proposed by Rosenblatt (1958), is a
simple neuron that is used to classify its input into one of two
categories.
A perceptron uses a step function that returns +1 if the weighted
sum of its inputs is greater than or equal to 0, and -1 otherwise:

φ(v) = +1 if v ≥ 0
       -1 if v < 0

where v = w_1 x_1 + w_2 x_2 + ... + w_n x_n + b, with inputs
x_1 ... x_n, weights w_1 ... w_n, and bias b.
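As a minimal sketch (not part of the original slides), the ±1 perceptron rule above can be written in Python as follows; the weight and bias values are assumptions chosen for illustration:

```python
import numpy as np

def step(v):
    """Perceptron activation: +1 if v >= 0, -1 otherwise."""
    return 1 if v >= 0 else -1

def perceptron(x, w, b):
    """Weighted sum of the inputs plus bias, passed through the step function."""
    v = np.dot(w, x) + b   # induced field v
    return step(v)

# Assumed weights and bias for a 2-input perceptron.
w = np.array([0.5, -0.4])
b = 0.1
print(perceptron(np.array([1.0, 1.0]), w, b))    # +1: 0.5 - 0.4 + 0.1 = 0.2 >= 0
print(perceptron(np.array([-1.0, 1.0]), w, b))   # -1: -0.5 - 0.4 + 0.1 = -0.8 < 0
```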

The Neuron Diagram


[Figure: neuron model. Input values x_1 ... x_m are multiplied by weights w_1 ... w_m and summed; a bias b is added to give the induced field v, which passes through an activation function φ(·) to produce the output.]

Bias of a Neuron
The bias b has the effect of applying an affine transformation to
the weighted sum u:

v = u + b

The bias is an external parameter of the neuron. It can be
modeled by adding an extra input x_0 = +1 with weight w_0 = b:

v = Σ_{j=0}^{m} w_j x_j

v is called the induced field of the neuron.
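A one-line check of this equivalence, with assumed input, weight, and bias values:

```python
import numpy as np

x = np.array([0.3, 0.7])     # inputs x_1, x_2 (assumed)
w = np.array([0.2, -0.5])    # weights w_1, w_2 (assumed)
b = 0.4                      # bias

u = np.dot(w, x)             # weighted sum u
v = u + b                    # induced field v = u + b

# Equivalent form: prepend an extra input x_0 = +1 with weight w_0 = b.
x_ext = np.concatenate(([1.0], x))
w_ext = np.concatenate(([b], w))
assert np.isclose(v, np.dot(w_ext, x_ext))
```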

The McCulloch-Pitts Model of a Neuron

Neuron Models
The choice of activation function determines the
neuron model.
Examples:

Step function:
φ(v) = a if v < c
       b if v ≥ c

Ramp function:
φ(v) = a if v < c
       b if v > d
       a + ((v - c)(b - a) / (d - c)) otherwise

Sigmoid function with z, x, y parameters:
φ(v) = z + 1 / (1 + exp(-x v + y))

Gaussian function:
φ(v) = (1 / (√(2π) σ)) exp(-(1/2) ((v - μ) / σ)²)

Step Function
[Figure: step function jumping from a to b at v = c]
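A sketch of these four activation functions in Python; the default parameter values are assumptions, and the formulas follow the reconstructions above:

```python
import numpy as np

def step(v, a=0.0, b=1.0, c=0.0):
    """Step function: a below the threshold c, b at or above it."""
    return a if v < c else b

def ramp(v, a=0.0, b=1.0, c=-1.0, d=1.0):
    """Ramp: a below c, b above d, linear interpolation in between."""
    if v < c:
        return a
    if v > d:
        return b
    return a + (v - c) * (b - a) / (d - c)

def sigmoid(v, z=0.0, x=1.0, y=0.0):
    """Sigmoid with vertical offset z, slope x, and horizontal shift y."""
    return z + 1.0 / (1.0 + np.exp(-x * v + y))

def gaussian(v, mu=0.0, sigma=1.0):
    """Gaussian activation centered at mu with width sigma."""
    return np.exp(-0.5 * ((v - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
```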

Curve fitting using ANN

[Figure: scatter plots of weight (y-axis) versus height (x-axis) data points]

The objective is to fit a curve that passes through this set of points:

y = m x + c

where m is the slope and c is the intercept. The same equation
can be written in neural network form:

y_k = w_1 x + w_0

This equation is the neural model of a straight line:
w_0 is the intercept, w_1 is the slope, and x is the input.
w_1 can be seen as the synaptic weight between the input
and the output.

Neural model of straight line


[Figure: a neuron with constant input 1 weighted by w_20 (the bias) and input x_1 weighted by w_21, producing output y_2.]

y_2 = w_21 x + w_20

where w_20 is the intercept, w_21 is the slope, and x is the input.

Multi-input and single-output version of the height-
weight problem

Weight can be due to calorie intake.
Weight can be due to a biological problem.
Weight can be due to a physical problem or
some disease.
There can be various other factors for which we
can get a different weight for a person.
How do we solve such problems, with multiple
input parameters determining the weight?

Multi input case of a neural network


[Figure: a neuron with constant input 1 weighted by w_30 (the bias) and inputs x_1, x_2 weighted by w_31, w_32, producing output y_3.]

y_3 = w_30 + w_31 x_1 + w_32 x_2

Here we get two slopes; hence this is a two-dimensional problem.
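A direct transcription of this two-input linear neuron, with assumed weight values:

```python
def linear_neuron(x1, x2, w30=0.5, w31=0.8, w32=-0.3):
    """Two-input linear neuron: bias w30 plus weighted inputs x1, x2."""
    return w30 + w31 * x1 + w32 * x2

print(linear_neuron(1.0, 2.0))  # 0.5 + 0.8*1.0 - 0.3*2.0 = 0.7
```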

How to model a three-dimensional
system

It is easy to visualize the three-dimensional input
parameter space.

[Figure: axes of a three-dimensional coordinate system]

Multi-dimensional system

y = f(x_1, x_2, x_3, ..., x_n)

Error using gradient descent

[Figure: fitted line through the weight-versus-height data, showing the deviations between targets and predictions]

How do we calculate the error? For pattern p, with target t_p
and output y_p:

E_p = (t_p - y_p)²

E = Σ_p E_p = Σ_p (t_p - y_p)²

Why the error is important

If we start with an arbitrary intercept and slope for the straight line,
we can measure the error and then determine in which direction we
should adjust the slope and intercept so that the next fit is better.
The objective in the neural network is to find the minimum error.

Gradient descent

Δw ∝ -∂E/∂w

This is also called steepest descent.

Gradient descent

Δw_ij = -η ∂E/∂w_ij

Since E = Σ_p E_p,

∂E/∂w_ij = Σ_p ∂E_p/∂w_ij

For an output neuron o and input i, by the chain rule:

∂E/∂w_oi = (∂E/∂y_o) · (∂y_o/∂w_oi)

with

E = (1/2) Σ_p (t_0^p - y_0^p)²

(the factor 1/2 is included to simplify the derivative).

Using the Chain Rule

∂E/∂y_o = -(t_0 - y_0)

Since y_0 = Σ_j w_oj x_j,

∂y_o/∂w_oi = x_i

Therefore

∂E/∂w_oi = -(t_0 - y_0) x_i

Δw_oi = η (t_0 - y_0) x_i

where η is the learning rate, and the update is

w_oi ← w_oi + Δw_oi


Successful convergence of an ANN depends much on the
learning rate.
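Putting the pieces together, here is a minimal sketch of fitting the straight-line neuron y = w_1 x + w_0 with this update rule; the height-weight data, learning rate, and epoch count are all assumptions for illustration:

```python
import numpy as np

# Synthetic height -> weight data (assumed for illustration).
x = np.array([1.5, 1.6, 1.7, 1.8, 1.9])       # heights
t = np.array([55.0, 60.0, 65.0, 72.0, 80.0])  # target weights

w0, w1 = 0.0, 0.0   # arbitrary initial intercept and slope
eta = 0.05          # learning rate (assumed; too large a value diverges)

for epoch in range(20000):
    for xp, tp in zip(x, t):
        y = w1 * xp + w0          # neuron output for this pattern
        delta = eta * (tp - y)    # delta rule: eta * (target - output)
        w1 += delta * xp          # the input to the slope weight is x
        w0 += delta               # the input to the bias weight is 1

# Approaches the least-squares fit (w1 ≈ 62, w0 ≈ -39 for this data).
print(f"intercept w0 = {w0:.1f}, slope w1 = {w1:.1f}")
```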

Concept of learning rate

Can we fit a non-linear curve using a
straight-line fit?

y = f(x)

[Figure: a non-linear curve y = f(x)]

Even if we want to fit a non-linear curve with
straight lines, we have to fit several straight-line segments.

Non-linear curve fitting

To fit this, we have to use a non-linear curve fitting method.

This led people to think about modeling a non-linear activation function.

Non-Linear Activation Function

[Figure: the neuron diagram as before (inputs x_1 ... x_m, weights w_1 ... w_m, bias b, summing function, activation function φ(·) producing the output), now with a non-linear φ.]


Obtaining a non-linear solution with a linear NN model is very
complicated, so a non-linear activation function is used instead.

[Figure: neuron k with weights w_k0, w_k1, w_k2, w_k3, ..., w_kn feeding the induced field v_k, which passes through a non-linear function φ to give the output y.]

v_k = Σ_j x_j w_kj

y = φ(v_k)

Sigmoid function

A sigmoid function produces a curve with an S shape.

As v_k → ∞, φ(v_k) → 1; as v_k → -∞, φ(v_k) → 0.

y = φ(v_k) = 1 / (1 + e^(-a v_k))

This is also known as the logistic function.


Changing a to a very high value makes this approach the McCulloch-Pitts model.

Importance of a in the sigmoid function


1. If a is very high (tending to infinity), it will
behave as a McCulloch-Pitts model.
2. If a is very low (tending to zero), it will
behave as a smooth, purely sigmoidal function.
a is used for tuning the sigmoid function.
So whatever v is, the NN activation function
will give a non-linear output based on the
activation function.

A few important points about the sigmoid activation function


y_k can take any value between 0 and 1.
y_k is 0.5 at v_k = 0.
y_k tends to 1 as v_k tends to infinity.
y_k tends to 0 as v_k tends to minus infinity.
It is monotonically increasing.
It is continuously differentiable.
The McCulloch-Pitts model, by contrast, is not a continuous model.

Tanh function

φ(v_k) = tanh(a v_k)
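A small sketch comparing the effect of the slope parameter a on these activations (the probe value v and the a values are assumed):

```python
import numpy as np

def sigmoid(v, a=1.0):
    """Logistic sigmoid; a large a approaches the McCulloch-Pitts step."""
    return 1.0 / (1.0 + np.exp(-a * v))

def tanh_act(v, a=1.0):
    """Hyperbolic tangent activation, ranging over (-1, 1)."""
    return np.tanh(a * v)

v = 0.5
for a in (0.1, 1.0, 100.0):
    print(f"a={a:>5}: sigmoid={sigmoid(v, a):.4f}, tanh={tanh_act(v, a):.4f}")
# As a grows, sigmoid(0.5) approaches 1 (step-like behaviour);
# as a shrinks, the output stays near 0.5 (a flat, gentle curve).
```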

Learning mechanisms in ANN

Learning Mechanism in NN

Five basic learning rules


1. Error correction learning
2. Memory based learning
3. Hebbian learning
4. Competitive learning
5. Boltzmann learning
The learning mechanisms above are also used by the human brain,
but it is very difficult to say which type is used most by the
brain.

Error correction learning

e_k(n) = d_k(n) - y_k(n)


n is the discrete time step, or iteration number.
Learning proceeds by minimization of E(n):

E(n) = (1/2) Σ_k e_k(n)²

Δw_kj(n) = η e_k(n) x_j(n)

This update is found using the gradient descent method.

This is also called the delta rule, or the Widrow-Hoff rule.

Error correction method


w_kj(n+1) = w_kj(n) + Δw_kj(n)

w_kj(n+1) is the updated synaptic weight.

This is the simplest of the error correction techniques.

This learning rule is the one used most widely in ANN techniques.
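One Widrow-Hoff update step under these definitions can be sketched as follows; the learning rate and the signal values are assumed:

```python
import numpy as np

eta = 0.1                         # learning rate (assumed)
x = np.array([0.5, -1.0, 0.2])    # inputs x_j(n) (assumed)
w = np.array([0.1, 0.4, -0.3])    # synaptic weights w_kj(n) (assumed)
d = 1.0                           # desired response d_k(n)

y = np.dot(w, x)                  # neuron output y_k(n) (linear neuron here)
e = d - y                         # error signal e_k(n) = d_k(n) - y_k(n)
w = w + eta * e * x               # w_kj(n+1) = w_kj(n) + eta * e_k(n) * x_j(n)
print(w)
```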

Memory based learning


The input patterns can be treated as vectors:
each input X is a vector with components x_1, x_2, ..., x_m.

Memory based learning


{X_i, d_i}, i = 1, ..., N

For every input vector X_i there is a desired output d_i.
In a large memory we store all N of these pattern pairs.

Suppose all the input vectors {X_i} are stored in memory,
and another input vector X_test arrives.
Using the Euclidean distance, we find which stored X_i is
nearest to X_test.

Now suppose X_j ∈ {X_1, X_2, ..., X_N} is found to be nearest
to X_test. Then X_j is the nearest neighbor of X_test:

min_i d(X_i, X_test) = d(X_j, X_test)

Memory learning mechanism using patterns

[Figure: 0 and 1 class patterns for some input vectors]

[Figure: 0 and 1 class patterns for some input vectors, with a test point X_test]

This concept is also known as the nearest neighbor criterion, or
memory based learning.

Memory learning mechanism using patterns

[Figure: 0 and 1 class patterns for some input vectors with an outlier]

[Figure: 0 and 1 class patterns for some input vectors with an outlier and a test point near it]

K nearest neighbor classification


This means we consider the k nearest neighbors
instead of a single neighbor, so that a lone outlier cannot
determine the classification by itself.
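A minimal sketch of nearest-neighbor and k-nearest-neighbor classification with the Euclidean distance; the stored patterns, labels, and test point are assumed toy data:

```python
import numpy as np
from collections import Counter

# Stored pattern pairs {X_i, d_i}: vectors with class labels 0 or 1 (toy data).
X = np.array([[0.1, 0.2], [0.2, 0.1], [0.3, 0.25],
              [0.9, 0.8], [0.8, 0.9], [0.25, 0.2]])
d = np.array([0, 0, 0, 1, 1, 1])  # last point: a class-1 outlier in class-0 territory

def knn_classify(x_test, X, d, k=1):
    """Classify x_test by majority vote among its k nearest stored vectors."""
    dists = np.linalg.norm(X - x_test, axis=1)  # Euclidean distances to all X_i
    nearest = np.argsort(dists)[:k]             # indices of the k nearest neighbors
    return Counter(d[nearest]).most_common(1)[0][0]

x_test = np.array([0.22, 0.18])
print(knn_classify(x_test, X, d, k=1))  # 1: the outlier is the single nearest neighbor
print(knn_classify(x_test, X, d, k=3))  # 0: the majority vote overrides the outlier
```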

Hebbian Learning

Hebbian learning is said to be one of the learning
mechanisms closest to that of the biological
neuron.

Hebbian Learning Mechanism


Hebb was a neurophysiologist, 1949 in the book of
Organization of Behaviour

Synaptic weight

If cell A consistently fires cell B, metabolic


changes happen so that the efficiency of A
firing B increases.

In other words:
it states that where one cell's firing repeatedly
contributes to the firing of another cell, the magnitude
of this contribution will tend to increase gradually
with time.
OR:
the Hebbian learning rule, in which a change in the
strength of a connection is a function of the pre- and
postsynaptic neural activities.

Hebbian Synapse
Hebbian synapses are:
a) Time-dependent (pre- and post-synaptic activity must occur in a
synchronized way)
b) Local (the change depends only on the presynaptic and
postsynaptic signals)
c) Strongly interactive (the change is driven jointly by the
presynaptic and postsynaptic neurons)
Positive correlation - synaptic strengthening.
Uncorrelated or negative correlation - synaptic weakening.

Uncorrelated means one does not affect the other.

Classification of synaptic modification


1. Hebbian- Synapse increases its strength with positive
correlation.
2. Anti-Hebbian- Synapse increases its strength with
negative correlation.
3. Non-Hebbian- Does not involve a Hebbian
mechanism of either kind.

Mathematical model of Hebbian
modifications

Δw_kj(n) = F(y_k(n), x_j(n))

where y_k(n) is the postsynaptic activity and x_j(n) the
presynaptic activity.

According to the Hebbian model, the change in w_kj can be
positively or negatively correlated with these activities.

Hebb's hypothesis

Δw_kj(n) = η y_k(n) x_j(n)


This is also called the activity product rule.
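A sketch of the activity product rule over a few time steps, with assumed activity traces and learning rate; it also shows the unbounded growth discussed next:

```python
import numpy as np

eta = 0.5                               # learning rate (assumed)
x_j = np.array([1.0, 1.0, 0.0, 1.0])    # presynaptic activity over time (assumed)
y_k = np.array([1.0, 0.5, 1.0, 1.0])    # postsynaptic activity over time (assumed)

w_kj = 0.1
for n in range(len(x_j)):
    w_kj += eta * y_k[n] * x_j[n]       # activity product rule
    print(f"n={n}: w_kj = {w_kj:.2f}")
# With persistently correlated activity the weight only ever grows,
# which is why the pure Hebbian rule is unstable.
```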

Concept of Hebbian learning rule

[Figure: synapse with presynaptic signal x_j, synaptic weight w_kj, and postsynaptic activity y_k]

Under this rule the weight rises exponentially, i.e. w_kj grows
without bound.

Covariance Hypothesis

x̄ is the time-averaged value of x_j, and ȳ is the time-averaged
value of y_k.

Δw_kj = η (x_j - x̄)(y_k - ȳ)

The covariance hypothesis is a modified version of the Hebbian
learning mechanism.
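The same sketch with the covariance form of the rule; the time averages are taken over the same assumed traces:

```python
import numpy as np

eta = 0.5
x_j = np.array([1.0, 1.0, 0.0, 1.0])    # presynaptic activity (assumed)
y_k = np.array([1.0, 0.5, 1.0, 1.0])    # postsynaptic activity (assumed)
x_bar, y_bar = x_j.mean(), y_k.mean()   # time-averaged activities

w_kj = 0.1
for n in range(len(x_j)):
    w_kj += eta * (x_j[n] - x_bar) * (y_k[n] - y_bar)  # covariance rule
    print(f"n={n}: w_kj = {w_kj:.3f}")
# The weight can now decrease as well as increase, which gives the
# rule its stabilizing effect.
```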

Covariance hypothesis

[Figure: weight change Δw_kj plotted against postsynaptic activity y_k, crossing zero at y_k = ȳ, with slope proportional to (x_j - x̄); the dashed branch corresponds to -(x_j - x̄).]

The covariance hypothesis has a stabilizing effect:

Δw_kj > 0 when x_j > x̄ and y_k > ȳ (synaptic strengthening)
Δw_kj < 0 when x_j > x̄ and y_k < ȳ, or x_j < x̄ and y_k > ȳ (synaptic weakening)

Notes on Hebbian Learning


Hipocampus: It is the area of one brain where
we see Hebbian behaviour

Competitive learning

[Figure: a layer of source nodes (1-4) feeding an output layer through excitatory feedforward connections (weights w_51, w_52, w_53, w_54 into neuron 5, and similarly into the other output neurons), with inhibitory lateral feedback connections among the output-layer neurons.]

Mathematical modeling of competitive
learning

y_k = 1 if v_k > v_j for all j ≠ k
      0 otherwise

Σ_j w_kj = 1  for all k

Δw_kj = η (x_j - w_kj)  if neuron k wins the competition
        0               if neuron k loses the competition
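A sketch of a single winner-take-all step under this rule; the weight matrix, input pattern, and learning rate are assumed:

```python
import numpy as np

eta = 0.1
# Rows are the weight vectors of three output neurons,
# normalized so that each row sums to 1 (assumed values).
W = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.5, 0.3],
              [0.3, 0.2, 0.5]])
x = np.array([0.1, 0.8, 0.1])   # input pattern (assumed; also sums to 1)

v = W @ x                        # induced fields v_k
k = np.argmax(v)                 # the winner: y_k = 1, all other outputs 0
W[k] += eta * (x - W[k])         # only the winner moves toward the input
print(k, W[k], W[k].sum())       # the row sum stays 1 because x sums to 1
```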

Multi Layer Perceptron
