
Artificial Neural Network

Human brain?
Consider recognizing a face: the human brain does it effortlessly,
while the same task is hard for a computer.
Is it an easy task for a computer?
Switching speed of a silicon chip: ~10^-9 s
Switching speed of a neuron: ~10^-3 s
Yet the brain is a massively parallel network of neurons, with
about 10 billion neurons and some 60 trillion
interconnections.
What we can do is try to mimic some part of how the brain
works.

Usefulness and Capability of NN


Non-linearity
Interconnections of non-linear neurons; this means the
non-linearity is distributed throughout the network.
Input-output mapping
Learning from labeled examples (involves a teacher).
Learning ability
Adaptability
Can adapt its free parameters to changes in the
surrounding environment.

Evidential Response
A decision comes with a measure of confidence.
Fault tolerance
Performance degrades gracefully rather than failing abruptly.

All these capabilities can be achieved or mimicked by an ANN.

Why Artificial Neural Networks?


There are two basic reasons why we are interested in
building artificial neural networks (ANNs):

Technical viewpoint: Some problems, such as
character recognition or the prediction of future
states of a system, require massively parallel and
adaptive processing.

Biological viewpoint: ANNs can be used to
replicate and simulate components of the human
(or animal) brain, thereby giving us insight into
natural information processing.

Neural Network

[Figure: a biological neuron]

Biological inspiration
The spikes travelling along the axon of the pre-synaptic
neuron trigger the release of neurotransmitter substances at the
synapse.
The neurotransmitters cause excitation or inhibition in the
dendrite of the post-synaptic neuron.
The integration of the excitatory and inhibitory signals may
produce spikes in the post-synaptic neuron.
The contribution of the signals depends on the strength of the
synaptic connection.

Fundamentals of Biological Neuron


Synapses can be excitatory or inhibitory.
Spikes (signals) arriving at an excitatory synapse tend
to cause the receiving neuron to fire.
Spikes (signals) arriving at an inhibitory synapse tend
to inhibit the receiving neuron from firing.
The cell body and synapses essentially compute the
difference between the incoming excitatory and
inhibitory inputs.
When this difference is large enough (compared to
the neuron's threshold), the neuron will fire.
The faster excitatory spikes arrive at its synapses,
the faster it will fire.

Artificial Neural Networks


The building blocks of neural networks are the neurons.
In technical systems, we also refer to them as units or nodes.

Basically, each neuron
receives input from many other neurons,
changes its internal state (activation) based on the current
input, and
sends one output signal to many other neurons, possibly
including its input neurons (recurrent network).

Artificial Neural Networks


Information is transmitted as a series of electric
impulses, so-called spikes.
The frequency and phase of these spikes encode the
information.
In biological systems, one neuron can be connected to as
many as 10,000 other neurons.
Usually, a neuron receives its information from other
neurons in a confined area, its so-called receptive field.

Artificial Neural Network

This configuration is actually called a perceptron.


A perceptron models a neuron by taking a weighted sum of its
inputs and sending the output 1 if the sum is greater than
some adjustable threshold value, and 0 otherwise.
This thresholding rule is called an activation function.

Perceptron
A computer model or computerized machine devised to
represent or simulate the ability of the brain to recognize and
discriminate.
Perceptrons are the simplest structures to start with when
studying neural networks.
The links between the nodes not only show the relationship
between the nodes but also transmit data and information,
called a signal or impulse.
The perceptron is a simple model of a neuron (nerve cell).

Few points about single layer Perceptron


The strength of a connection is called the synaptic
weight.
The larger the value, the stronger the connection.
The activation function output is either 0 or 1.
The perceptron algorithm was invented in 1957 at
the Cornell Aeronautical Laboratory by Frank
Rosenblatt, funded by the United States Office of
Naval Research.

Perceptron: Neuron Model


(A special form of single-layer feedforward network.)
The perceptron, first proposed by Rosenblatt (1958), is a
simple neuron that is used to classify its input into one of two
categories.
A perceptron uses a step function that returns +1 if the weighted
sum of its inputs is greater than or equal to 0, and -1 otherwise:

φ(v) = +1 if v ≥ 0
       -1 if v < 0

where v = w_1 x_1 + w_2 x_2 + ... + w_n x_n + b, with inputs
x_1 ... x_n, weights w_1 ... w_n, and bias b.
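As a minimal sketch (not part of the original slides), the ±1 perceptron rule above can be written in Python as follows; the weight and bias values are assumptions chosen for illustration:

```python
import numpy as np

def step(v):
    """Perceptron activation: +1 if v >= 0, -1 otherwise."""
    return 1 if v >= 0 else -1

def perceptron(x, w, b):
    """Weighted sum of the inputs plus bias, passed through the step function."""
    v = np.dot(w, x) + b   # induced field v
    return step(v)

# Assumed weights and bias for a 2-input perceptron.
w = np.array([0.5, -0.4])
b = 0.1
print(perceptron(np.array([1.0, 1.0]), w, b))    # +1: 0.5 - 0.4 + 0.1 = 0.2 >= 0
print(perceptron(np.array([-1.0, 1.0]), w, b))   # -1: -0.5 - 0.4 + 0.1 = -0.8 < 0
```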

The Neuron Diagram


[Figure: neuron model. Input values x_1 ... x_m are multiplied by weights w_1 ... w_m and summed; a bias b is added to give the induced field v, which passes through an activation function φ(·) to produce the output.]

Bias of a Neuron
The bias b has the effect of applying an affine transformation to
the weighted sum u:

v = u + b

The bias is an external parameter of the neuron. It can be
modeled by adding an extra input x_0 = +1 with weight w_0 = b:

v = Σ_{j=0}^{m} w_j x_j

v is called the induced field of the neuron.
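A one-line check of this equivalence, with assumed input, weight, and bias values:

```python
import numpy as np

x = np.array([0.3, 0.7])     # inputs x_1, x_2 (assumed)
w = np.array([0.2, -0.5])    # weights w_1, w_2 (assumed)
b = 0.4                      # bias

u = np.dot(w, x)             # weighted sum u
v = u + b                    # induced field v = u + b

# Equivalent form: prepend an extra input x_0 = +1 with weight w_0 = b.
x_ext = np.concatenate(([1.0], x))
w_ext = np.concatenate(([b], w))
assert np.isclose(v, np.dot(w_ext, x_ext))
```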

The McCulloch-Pitts Model of a Neuron

Neuron Models
The choice of activation function determines the
neuron model.
Examples:

Step function:
φ(v) = a if v < c
       b if v ≥ c

Ramp function:
φ(v) = a if v < c
       b if v > d
       a + ((v - c)(b - a) / (d - c)) otherwise

Sigmoid function with z, x, y parameters:
φ(v) = z + 1 / (1 + exp(-x v + y))

Gaussian function:
φ(v) = (1 / (√(2π) σ)) exp(-(1/2) ((v - μ) / σ)²)

Step Function
[Figure: step function jumping from a to b at v = c]
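A sketch of these four activation functions in Python; the default parameter values are assumptions, and the formulas follow the reconstructions above:

```python
import numpy as np

def step(v, a=0.0, b=1.0, c=0.0):
    """Step function: a below the threshold c, b at or above it."""
    return a if v < c else b

def ramp(v, a=0.0, b=1.0, c=-1.0, d=1.0):
    """Ramp: a below c, b above d, linear interpolation in between."""
    if v < c:
        return a
    if v > d:
        return b
    return a + (v - c) * (b - a) / (d - c)

def sigmoid(v, z=0.0, x=1.0, y=0.0):
    """Sigmoid with vertical offset z, slope x, and horizontal shift y."""
    return z + 1.0 / (1.0 + np.exp(-x * v + y))

def gaussian(v, mu=0.0, sigma=1.0):
    """Gaussian activation centered at mu with width sigma."""
    return np.exp(-0.5 * ((v - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
```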

Curve fitting using ANN

[Figure: scatter plots of weight (y-axis) versus height (x-axis) data points]

The objective is to fit a curve that passes through this set of points:

y = m x + c

where m is the slope and c is the intercept. The same equation
can be written in neural network form:

y_k = w_1 x + w_0

This equation is the neural model of a straight line:
w_0 is the intercept, w_1 is the slope, and x is the input.
w_1 can be seen as the synaptic weight between the input
and the output.

Neural model of straight line


[Figure: a neuron with constant input 1 weighted by w_20 (the bias) and input x_1 weighted by w_21, producing output y_2.]

y_2 = w_21 x + w_20

where w_20 is the intercept, w_21 is the slope, and x is the input.

Multi-input and single-output version of the height-
weight problem

Weight can be due to calorie intake.
Weight can be due to a biological problem.
Weight can be due to a physical problem or
some disease.
There can be various other factors for which we
can get a different weight for a person.
How do we solve such problems, with multiple
input parameters determining the weight?

Multi input case of a neural network


[Figure: a neuron with constant input 1 weighted by w_30 (the bias) and inputs x_1, x_2 weighted by w_31, w_32, producing output y_3.]

y_3 = w_30 + w_31 x_1 + w_32 x_2

Here we get two slopes; hence this is a two-dimensional problem.
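A direct transcription of this two-input linear neuron, with assumed weight values:

```python
def linear_neuron(x1, x2, w30=0.5, w31=0.8, w32=-0.3):
    """Two-input linear neuron: bias w30 plus weighted inputs x1, x2."""
    return w30 + w31 * x1 + w32 * x2

print(linear_neuron(1.0, 2.0))  # 0.5 + 0.8*1.0 - 0.3*2.0 = 0.7
```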

How to model a three-dimensional
system

It is easy to visualize the three-dimensional input
parameter space.

[Figure: axes of a three-dimensional coordinate system]

Multi-dimensional system

y = f(x_1, x_2, x_3, ..., x_n)

Error using gradient descent

[Figure: fitted line through the weight-versus-height data, showing the deviations between targets and predictions]

How do we calculate the error? For pattern p, with target t_p
and output y_p:

E_p = (t_p - y_p)²

E = Σ_p E_p = Σ_p (t_p - y_p)²

Why the error is important

If we start with an arbitrary intercept and slope for the straight line,
we can measure the error and then determine in which direction we
should adjust the slope and intercept so that the next fit is better.
The objective in the neural network is to find the minimum error.

Gradient descent

Δw ∝ -∂E/∂w

This is also called steepest descent.

Gradient descent

Δw_ij = -η ∂E/∂w_ij

Since E = Σ_p E_p,

∂E/∂w_ij = Σ_p ∂E_p/∂w_ij

For an output neuron o and input i, by the chain rule:

∂E/∂w_oi = (∂E/∂y_o) · (∂y_o/∂w_oi)

with

E = (1/2) Σ_p (t_0^p - y_0^p)²

(the factor 1/2 is included to simplify the derivative).

Using the Chain Rule

∂E/∂y_o = -(t_0 - y_0)

Since y_0 = Σ_j w_oj x_j,

∂y_o/∂w_oi = x_i

Therefore

∂E/∂w_oi = -(t_0 - y_0) x_i

Δw_oi = η (t_0 - y_0) x_i

where η is the learning rate, and the update is

w_oi ← w_oi + Δw_oi


Successful convergence of an ANN depends much on the
learning rate.
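Putting the pieces together, here is a minimal sketch of fitting the straight-line neuron y = w_1 x + w_0 with this update rule; the height-weight data, learning rate, and epoch count are all assumptions for illustration:

```python
import numpy as np

# Synthetic height -> weight data (assumed for illustration).
x = np.array([1.5, 1.6, 1.7, 1.8, 1.9])       # heights
t = np.array([55.0, 60.0, 65.0, 72.0, 80.0])  # target weights

w0, w1 = 0.0, 0.0   # arbitrary initial intercept and slope
eta = 0.05          # learning rate (assumed; too large a value diverges)

for epoch in range(20000):
    for xp, tp in zip(x, t):
        y = w1 * xp + w0          # neuron output for this pattern
        delta = eta * (tp - y)    # delta rule: eta * (target - output)
        w1 += delta * xp          # the input to the slope weight is x
        w0 += delta               # the input to the bias weight is 1

# Approaches the least-squares fit (w1 ≈ 62, w0 ≈ -39 for this data).
print(f"intercept w0 = {w0:.1f}, slope w1 = {w1:.1f}")
```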

Concept of learning rate

Can we fit a non-linear curve using a
straight-line fit?

y = f(x)

[Figure: a non-linear curve y = f(x)]

Even if we want to fit a non-linear curve with
straight lines, we have to fit several straight-line segments.

Non-linear curve fitting

To fit this, we have to use a non-linear curve fitting method.

This led people to think about modeling a non-linear activation function.

Non-Linear Activation Function

[Figure: the neuron diagram as before (inputs x_1 ... x_m, weights w_1 ... w_m, bias b, summing function, activation function φ(·) producing the output), now with a non-linear φ.]


Obtaining a non-linear solution with a linear NN model is very
complicated, so a non-linear activation function is used instead.

[Figure: neuron k with weights w_k0, w_k1, w_k2, w_k3, ..., w_kn feeding the induced field v_k, which passes through a non-linear function φ to give the output y.]

v_k = Σ_j x_j w_kj

y = φ(v_k)

Sigmoid function

A sigmoid function produces a curve with an S shape.

As v_k → ∞, φ(v_k) → 1; as v_k → -∞, φ(v_k) → 0.

y = φ(v_k) = 1 / (1 + e^(-a v_k))

This is also known as the logistic function.


Changing a to a very high value makes this approach the McCulloch-Pitts model.

Importance of a in the sigmoid function


1. If a is very high (tending to infinity), it will
behave as a McCulloch-Pitts model.
2. If a is very low (tending to zero), it will
behave as a smooth, purely sigmoidal function.
a is used for tuning the sigmoid function.
So whatever v is, the NN activation function
will give a non-linear output based on the
activation function.

A few important points about the sigmoid activation function


y_k can take any value between 0 and 1.
y_k is 0.5 at v_k = 0.
y_k tends to 1 as v_k tends to infinity.
y_k tends to 0 as v_k tends to minus infinity.
It is monotonically increasing.
It is continuously differentiable.
The McCulloch-Pitts model, by contrast, is not a continuous model.

Tanh function

φ(v_k) = tanh(a v_k)
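A small sketch comparing the effect of the slope parameter a on these activations (the probe value v and the a values are assumed):

```python
import numpy as np

def sigmoid(v, a=1.0):
    """Logistic sigmoid; a large a approaches the McCulloch-Pitts step."""
    return 1.0 / (1.0 + np.exp(-a * v))

def tanh_act(v, a=1.0):
    """Hyperbolic tangent activation, ranging over (-1, 1)."""
    return np.tanh(a * v)

v = 0.5
for a in (0.1, 1.0, 100.0):
    print(f"a={a:>5}: sigmoid={sigmoid(v, a):.4f}, tanh={tanh_act(v, a):.4f}")
# As a grows, sigmoid(0.5) approaches 1 (step-like behaviour);
# as a shrinks, the output stays near 0.5 (a flat, gentle curve).
```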

Learning mechanisms in ANN

Learning Mechanism in NN

Five basic learning rules


1. Error correction learning
2. Memory based learning
3. Hebbian learning
4. Competitive learning
5. Boltzmann learning
The learning mechanisms above are also used by the human brain,
but it is very difficult to say which type is used most by the
brain.

Error correction learning

e_k(n) = d_k(n) - y_k(n)


n is the discrete time step, or iteration number.
Learning proceeds by minimization of E(n):

E(n) = (1/2) Σ_k e_k(n)²

Δw_kj(n) = η e_k(n) x_j(n)

This update is found using the gradient descent method.

This is also called the delta rule, or the Widrow-Hoff rule.

Error correction method


w_kj(n+1) = w_kj(n) + Δw_kj(n)

w_kj(n+1) is the updated synaptic weight.

This is the simplest of the error correction techniques.

This learning rule is the one used most widely in ANN techniques.
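One Widrow-Hoff update step under these definitions can be sketched as follows; the learning rate and the signal values are assumed:

```python
import numpy as np

eta = 0.1                         # learning rate (assumed)
x = np.array([0.5, -1.0, 0.2])    # inputs x_j(n) (assumed)
w = np.array([0.1, 0.4, -0.3])    # synaptic weights w_kj(n) (assumed)
d = 1.0                           # desired response d_k(n)

y = np.dot(w, x)                  # neuron output y_k(n) (linear neuron here)
e = d - y                         # error signal e_k(n) = d_k(n) - y_k(n)
w = w + eta * e * x               # w_kj(n+1) = w_kj(n) + eta * e_k(n) * x_j(n)
print(w)
```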

Memory based learning


The input patterns can be treated as vectors:
each input X is a vector with components x_1, x_2, ..., x_m.

Memory based learning


{X_i, d_i}, i = 1, ..., N

For every input vector X_i there is a desired output d_i.
In a large memory we store all N of these pattern pairs.

Suppose all the input vectors {X_i} are stored in memory,
and another input vector X_test arrives.
Using the Euclidean distance, we find which stored X_i is
nearest to X_test.

Now suppose X_j ∈ {X_1, X_2, ..., X_N} is found to be nearest
to X_test. Then X_j is the nearest neighbor of X_test:

min_i d(X_i, X_test) = d(X_j, X_test)

Memory learning mechanism using patterns

[Figure: 0 and 1 class patterns for some input vectors]

[Figure: 0 and 1 class patterns for some input vectors, with a test point X_test]

This concept is also known as the nearest neighbor criterion, or
memory based learning.

Memory learning mechanism using patterns

[Figure: 0 and 1 class patterns for some input vectors with an outlier]

[Figure: 0 and 1 class patterns for some input vectors with an outlier and a test point near it]

K nearest neighbor classification


This means we consider the k nearest neighbors
instead of a single neighbor, so that a lone outlier cannot
determine the classification by itself.
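A minimal sketch of nearest-neighbor and k-nearest-neighbor classification with the Euclidean distance; the stored patterns, labels, and test point are assumed toy data:

```python
import numpy as np
from collections import Counter

# Stored pattern pairs {X_i, d_i}: vectors with class labels 0 or 1 (toy data).
X = np.array([[0.1, 0.2], [0.2, 0.1], [0.3, 0.25],
              [0.9, 0.8], [0.8, 0.9], [0.25, 0.2]])
d = np.array([0, 0, 0, 1, 1, 1])  # last point: a class-1 outlier in class-0 territory

def knn_classify(x_test, X, d, k=1):
    """Classify x_test by majority vote among its k nearest stored vectors."""
    dists = np.linalg.norm(X - x_test, axis=1)  # Euclidean distances to all X_i
    nearest = np.argsort(dists)[:k]             # indices of the k nearest neighbors
    return Counter(d[nearest]).most_common(1)[0][0]

x_test = np.array([0.22, 0.18])
print(knn_classify(x_test, X, d, k=1))  # 1: the outlier is the single nearest neighbor
print(knn_classify(x_test, X, d, k=3))  # 0: the majority vote overrides the outlier
```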

Hebbian Learning

Hebbian learning is said to be one of the learning
mechanisms closest to that of the biological
neuron.

Hebbian Learning Mechanism


Hebb was a neurophysiologist, 1949 in the book of
Organization of Behaviour

Synaptic weight

If cell A consistently fires cell B, metabolic


changes happen so that the efficiency of A
firing B increases.

In other words:
it states that where one cell's firing repeatedly
contributes to the firing of another cell, the magnitude
of this contribution will tend to increase gradually
with time.
OR:
the Hebbian learning rule, in which a change in the
strength of a connection is a function of the pre- and
postsynaptic neural activities.

Hebbian Synapse
Hebbian synapses are:
a) Time-dependent (pre- and post-synaptic activity must occur in a
synchronized way)
b) Local (the change depends only on the presynaptic and
postsynaptic signals)
c) Strongly interactive (the change is driven jointly by the
presynaptic and postsynaptic neurons)
Positive correlation - synaptic strengthening.
Uncorrelated or negative correlation - synaptic weakening.

Uncorrelated means one does not affect the other.

Classification of synaptic modification


1. Hebbian- Synapse increases its strength with positive
correlation.
2. Anti-Hebbian- Synapse increases its strength with
negative correlation.
3. Non-Hebbian- Does not involve a Hebbian
mechanism of either kind.

Mathematical model of Hebbian
modifications

Δw_kj(n) = F(y_k(n), x_j(n))

where y_k(n) is the postsynaptic activity and x_j(n) the
presynaptic activity.

According to the Hebbian model, the change in w_kj can be
positively or negatively correlated with these activities.

Hebb's hypothesis

Δw_kj(n) = η y_k(n) x_j(n)


This is also called the activity product rule.
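A sketch of the activity product rule over a few time steps, with assumed activity traces and learning rate; it also shows the unbounded growth discussed next:

```python
import numpy as np

eta = 0.5                               # learning rate (assumed)
x_j = np.array([1.0, 1.0, 0.0, 1.0])    # presynaptic activity over time (assumed)
y_k = np.array([1.0, 0.5, 1.0, 1.0])    # postsynaptic activity over time (assumed)

w_kj = 0.1
for n in range(len(x_j)):
    w_kj += eta * y_k[n] * x_j[n]       # activity product rule
    print(f"n={n}: w_kj = {w_kj:.2f}")
# With persistently correlated activity the weight only ever grows,
# which is why the pure Hebbian rule is unstable.
```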

Concept of Hebbian learning rule

[Figure: synapse with presynaptic signal x_j, synaptic weight w_kj, and postsynaptic activity y_k]

Under this rule the weight rises exponentially, i.e. w_kj grows
without bound.

Covariance Hypothesis

x̄ is the time-averaged value of x_j, and ȳ is the time-averaged
value of y_k.

Δw_kj = η (x_j - x̄)(y_k - ȳ)

The covariance hypothesis is a modified version of the Hebbian
learning mechanism.
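The same sketch with the covariance form of the rule; the time averages are taken over the same assumed traces:

```python
import numpy as np

eta = 0.5
x_j = np.array([1.0, 1.0, 0.0, 1.0])    # presynaptic activity (assumed)
y_k = np.array([1.0, 0.5, 1.0, 1.0])    # postsynaptic activity (assumed)
x_bar, y_bar = x_j.mean(), y_k.mean()   # time-averaged activities

w_kj = 0.1
for n in range(len(x_j)):
    w_kj += eta * (x_j[n] - x_bar) * (y_k[n] - y_bar)  # covariance rule
    print(f"n={n}: w_kj = {w_kj:.3f}")
# The weight can now decrease as well as increase, which gives the
# rule its stabilizing effect.
```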

Covariance hypothesis

[Figure: weight change Δw_kj plotted against postsynaptic activity y_k, crossing zero at y_k = ȳ, with slope proportional to (x_j - x̄); the dashed branch corresponds to -(x_j - x̄).]

The covariance hypothesis has a stabilizing effect:

Δw_kj > 0 when x_j > x̄ and y_k > ȳ (synaptic strengthening)
Δw_kj < 0 when x_j > x̄ and y_k < ȳ, or x_j < x̄ and y_k > ȳ (synaptic weakening)

Notes on Hebbian Learning


Hipocampus: It is the area of one brain where
we see Hebbian behaviour

Competitive learning

[Figure: a layer of source nodes (1-4) feeding an output layer through excitatory feedforward connections (weights w_51, w_52, w_53, w_54 into neuron 5, and similarly into the other output neurons), with inhibitory lateral feedback connections among the output-layer neurons.]

Mathematical modeling of competitive
learning

y_k = 1 if v_k > v_j for all j ≠ k
      0 otherwise

Σ_j w_kj = 1  for all k

Δw_kj = η (x_j - w_kj)  if neuron k wins the competition
        0               if neuron k loses the competition
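A sketch of a single winner-take-all step under this rule; the weight matrix, input pattern, and learning rate are assumed:

```python
import numpy as np

eta = 0.1
# Rows are the weight vectors of three output neurons,
# normalized so that each row sums to 1 (assumed values).
W = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.5, 0.3],
              [0.3, 0.2, 0.5]])
x = np.array([0.1, 0.8, 0.1])   # input pattern (assumed; also sums to 1)

v = W @ x                        # induced fields v_k
k = np.argmax(v)                 # the winner: y_k = 1, all other outputs 0
W[k] += eta * (x - W[k])         # only the winner moves toward the input
print(k, W[k], W[k].sum())       # the row sum stays 1 because x sums to 1
```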

Multi Layer Perceptron
