
Definitions of Certain Key Terms

Neuron: The basic nerve cell or computing unit for biological information processing.
Action potential: The pulse of electric potential generated
across the membrane of a neuron following the application
of a stimulus greater than the threshold value.
Axon: The output node of a neuron that carries the action
potential to other neurons in the network.
Axon hillock: The starting point of the axon.
Dendrite: The input part of the neuron that carries a temporal summation of action potentials to the soma.
Soma: The cell body of the neuron (that processes the
inputs from dendrites).
Somatic gain: The parameter that changes the slope of the non-linear activation function used in the neuron architecture.
Synapse: The junction point between the axon of a pre-
synaptic neuron and the dendrite of a post-synaptic neuron.
It is the axon-dendrite contact organ.
Synaptic and somatic learning: Synaptic learning is the component of learning that determines the optimum synaptic weights by minimizing a certain performance index of the error. Somatic learning is the adaptation of the optimum value of the slope of the non-linear activation function.

1. Neuro Computing

A human brain consists of approximately 10^11 computing elements called neurons. They communicate through a connection network of axons and synapses, with a density of approximately 10^4 synapses per neuron.
The human brain is thus a densely connected electrical switching network, conditioned largely by biochemical processes. The neuron is the fundamental building block of a biological neural network and operates in a chemical environment. A typical neuron cell has three major regions: the soma (cell body), the axon, and the dendrites. The dendrites form a dendritic tree, a very fine bush of thin fibers around the neuron body; they receive information from neighboring neurons and carry it to the cell body. An axon is a long cylindrical connection (a long fiber that serves as a transmission line) that carries impulses away from the neuron. The end part of the axon splits into fine branches, each of which terminates in a small end bulb almost touching the dendrites of the neighboring neurons. This axon-dendrite contact is termed a synapse. The synapse is where the neuron introduces its signal (in the form of electrical impulses) to the neighboring neuron. Furthermore, the neuron is covered by a thin membrane.

A neuron responds to the total of its inputs aggregated over a short time interval (the period of latent summation). The neuron responds if the total potential of its membrane reaches a certain level. The neuron generates a pulse response and sends it along its axon only when certain conditions are satisfied. Incoming impulses are excitatory if they cause firing, or inhibitory if they hinder firing. The precise condition for firing is that the excitation should exceed the inhibition by an amount called the threshold of the neuron (a typical value for the threshold is 40 mV).

The incoming impulses to a neuron can only be generated by the neighboring neurons or by the neuron itself (through feedback). Usually a certain number of impulses are required for a neuron to fire. Impulses that are closely spaced in time and arrive synchronously are more likely to cause the neuron to fire. Observations show that biological neural networks perform temporal integration and summation of electrical signals. The resulting spatio-temporal processing performed by biological neural networks is a complex process and is less structured than digital computation. Furthermore, the electrical impulses are not synchronized in time, as opposed to the synchronous discipline of digital computation. One important characteristic of the biological neuron is that the magnitude of the signal generated does not differ significantly: the signal in the nerve fiber is either absent or has a maximum value. This means that the information is transmitted between the nerve cells in the form of binary signals.

After carrying a pulse, an axon fiber enters a state of complete inactivity for a certain time called the refractory period. During this interval the nerve does not conduct any signal, regardless of the intensity of excitation. The refractory period is not uniform over the cells. The time units for modeling biological neurons are of the order of milliseconds. There are also different types of neurons and different ways in which they are connected.

We are thus dealing with a dense network of interconnected neurons that release asynchronous signals, which are not only fed forward to neighboring neurons but also fed back to the generating neuron itself. The picture of the real phenomena in a biological neural network therefore becomes quite involved.

The brain is a highly nonlinear, complex, and parallel information-processing system. The human brain has the ability to arrange its structural constituents (neurons) to perform certain operations, such as pattern recognition, perception, and motor control, many times faster than the fastest computer available today. In what follows, an example of such an operation by the human brain is described.

Consider human vision, which is an information-processing task. The visual system continuously gives a representation of the environment around us and supplies the information needed to react to it. The human brain routinely accomplishes such perceptual recognition tasks in approximately 100-200 msec, whereas a digital computer may take days to perform a much less complex task. Consider, for example, the sonar of a bat, which is an active echo-location system. The bat's sonar gives information such as how far away the target is, the relative velocity of the target, the size of the target, the size of various features of the target, and the azimuth and elevation of the target.

The vestibulo-ocular reflex (VOR) is part of the vision operations performed by the human eye and brain. The function of the VOR is to maintain the stability of the retinal image by making eye rotations opposite to head rotations. Pre-motor neurons and motor neurons carry out the muscle movement. The pre-motor neurons in the vestibular nuclei receive and process head-rotation signals (the inputs) and send the results to the eye-muscle motor neurons responsible for the eye rotations. Since these input and output signals are well defined, it is possible to model such a vestibulo-ocular reflex (VOR) system.

In what follows two questions are asked.

1.1 Why are neurons so slow?

1. The axon is a long insulated conductor, a few microns in diameter, filled with a much poorer conductor than copper; even a few millimeters of it will have a high resistance.
2. No insulation is perfect, so some current will leak through the membrane.
3. A cell membrane is an insulating sheet tens of angstroms thick with conductors on both sides. The membrane material has a high dielectric constant, so we should expect a large membrane capacitance (a typical value is about 1 µF per cm^2).

The time constant, which is given by the product of the resistance and the capacitance, is therefore also high, which is why the neuron responds slowly.
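To make the argument concrete, here is a minimal numerical sketch of the time constant τ = R·C for a patch of membrane. The capacitance of about 1 µF/cm^2 is the value quoted above; the specific membrane resistance of 10 kΩ·cm^2 is an assumed, representative value used only for illustration.

```python
# Illustrative estimate of a membrane time constant tau = R * C.
# The numbers are representative textbook-scale values, not data from these notes
# (apart from the ~1 microfarad per cm^2 capacitance mentioned above).

membrane_capacitance_per_cm2 = 1e-6   # farads per cm^2 (about 1 uF/cm^2)
membrane_resistance_cm2 = 10e3        # ohm * cm^2 (assumed specific leak resistance)

tau = membrane_resistance_cm2 * membrane_capacitance_per_cm2  # seconds
print(f"Membrane time constant: {tau * 1e3:.1f} ms")          # ~10 ms, millisecond scale
```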

1.2 Why is the action potential all-or-none?

A neuron responds to the total of its inputs aggregated over a short time interval (the period of latent summation), and it fires only if the total potential of its membrane reaches a certain level. The precise condition for firing is that the excitation should exceed the inhibition by an amount called the threshold of the neuron (a typical value is 40 mV). Below the threshold there is no response at all; once the threshold is crossed, the neuron generates a pulse whose magnitude does not depend on the strength of the stimulus. The signal in the nerve fiber is therefore either absent or at its maximum value, which is why the action potential is said to be all-or-none.

1.3 Computation by the human brain

We may have complete knowledge of the neural architecture and its arrangement, yet the characterization of the high-level computation of the human mind remains a mystery. This is because the electrochemical transmission of signals and the adjustment of the synaptic (connection) weights are involved and complex. This paradoxical situation of the human mind can be roughly explained as follows:

Imagine connecting a logic analyzer to a working CPU with a completely known and well-documented architecture. Suppose that all the signal flow between the logic analyzer and the CPU is known, documented, and analyzed. Knowledge of this activity at the micro level is still insufficient to explain the computation that is taking place at the macro level.

Note, however, that the primary purpose, application, and objective of the human brain is survival. The time-evolved performance of human intelligence reflects an attempt to optimize this objective. This distinguishing characteristic does not, however, reduce our interest in biological computation, since:

1. The brain integrates and stores experiences, which could be previous classifications or associations of input data. In this sense it self-organizes experience.
2. The brain considers new experiences in the context of stored experiences.
3. The brain is able to make accurate predictions about new situations on the basis of previously self-organized experiences.
4. The brain does not require perfect information. It is tolerant of deformations of input patterns and perturbations in input data.
5. The brain seems to have available, perhaps unused, neurons ready for use.
6. The brain does not provide, through microscopic or macroscopic examination of its activity, much useful information about its operation at a high level.
7. The brain tends to produce behavior that is homeostatic, meaning in a state of equilibrium (stable) or tending towards such a state. This is an interesting feature found in some recurrent neural networks such as the Hopfield and Grossberg networks.

1.4 The Artificial Neural Network


The idea of the artificial neural network is motivated by the recognition that the human brain computes in an entirely different way from the conventional digital computer. Such a neural network is defined as follows:
A neural network is a massively parallel distributed processor made up of simple processing units, which has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects:
(1) Knowledge is acquired by the network from its environment through a process of learning.
(2) Interneuron connection strengths, called synaptic weights, are used to store the acquired knowledge.

1.5 Representation of knowledge

Knowledge refers to stored information or models used by a person or machine to interpret, predict, and appropriately respond to the outside world. The neural network thus learns the environment in which it is embedded. The knowledge learned is of two kinds:
1. The known world state, that is, facts about what is and what has been known. This kind of knowledge is referred to as prior information.
2. Measurements (observations) obtained by using sensors designed to probe the environment. This information provides the examples used to train the neural network.

The examples may be labeled or unlabeled. In labeled examples, each example representing an input signal is paired with a target or desired response. Unlabeled examples consist of different realizations of the input signal by itself. The neural network then acquires knowledge by training on these labeled or unlabeled examples.

The knowledge representation inside the neural network is rather complicated. In what follows, four common-sense rules of knowledge representation are explained.

Rule 1. Similar inputs from a similar class usually produce similar representations inside the network and should therefore be classified as belonging to the same category.

One commonly used measure of similarity is the Euclidean distance. The Euclidean distance between a pair of vectors x_i and x_j in the Euclidean space R^m is given by

d(x_i, x_j) = \|x_i - x_j\| = \left[ \sum_{k=1}^{m} (x_{ik} - x_{jk})^2 \right]^{1/2}
The similarity between two inputs is defined as the reciprocal of the Euclidean distance between the two vectors: the smaller the distance, the more similar the inputs are.
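As an illustration of Rule 1, the following small Python sketch computes the Euclidean distance and the reciprocal-distance similarity for a few example vectors; the vectors themselves are arbitrary illustrations, not data from the notes.

```python
import math

def euclidean_distance(x_i, x_j):
    """Euclidean distance d(x_i, x_j) = ||x_i - x_j|| between two m-dimensional vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))

def similarity(x_i, x_j):
    """Similarity defined, as in Rule 1, as the reciprocal of the Euclidean distance."""
    return 1.0 / euclidean_distance(x_i, x_j)

# Example: the pair (x1, x2) is closer, hence more similar, than the pair (x1, x3).
x1, x2, x3 = [1.0, 2.0], [1.1, 2.1], [5.0, 7.0]
print(euclidean_distance(x1, x2), euclidean_distance(x1, x3))
print(similarity(x1, x2) > similarity(x1, x3))   # True
```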

Rule 2. The second rule is the opposite of the first: items to be categorized as separate classes should be given widely different representations in the network. Consequently, the greater the Euclidean distance, the more separated the inputs are.
Rule 3. If a particular feature is important, then a larger number of neurons should be used for the representation of that feature in the network.
Rule 4. Prior information and invariances should be built into the network so that they need not be learned; this results in a reduced network architecture. The free parameters to be adjusted are fewer, which means fewer building blocks and lower cost. Here we are talking about specialized networks. Biological neural networks are indeed specialized.

There are no general rules for incorporating prior information and invariance. Prior information can be incorporated into the network architecture through weight sharing and localized connections, as sketched below. Invariance here means invariance to transformations, which can be achieved
(i) by structure
(ii) by training
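The following sketch is an assumed illustration, not taken from the notes, of invariance achieved by structure through weight sharing: the same small set of weights is applied at every position of a one-dimensional input, so only a few free parameters are needed and a pattern produces the same response wherever it occurs.

```python
def shared_weight_layer(x, kernel):
    """Apply the same (shared) weights at every position of the input x.

    With an input of length n and a kernel of length k, only k free
    parameters are needed instead of roughly n * n for a fully connected
    layer, and a pattern is detected regardless of where it appears in x.
    """
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]

kernel = [1.0, -1.0]                  # one shared "edge detector": 2 free parameters
signal_a = [0, 0, 1, 1, 0, 0]         # a step near the start
signal_b = [0, 0, 0, 0, 1, 1]         # the same step, shifted to the right
print(shared_weight_layer(signal_a, kernel))
print(shared_weight_layer(signal_b, kernel))  # same response pattern, only shifted
```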

1.6 Characteristics of Neural Networks


1. Generalization
A neural network derives its computing power from (i) its massively parallel distributed structure and (ii) its ability to learn. We train the network using a set of training examples; the network then gives an appropriate response even to an example that was not included in the training set. This ability is called generalization.
2. Nonlinearity
The basic model of a neural network is nonlinear if the activation function is nonlinear (which is usually the case). Nonlinearity is an important feature, since the underlying physical mechanism is nonlinear. Furthermore, the nonlinearity is distributed throughout the network.
3. Adaptation
A neural network is inherently adaptive. When a neural network performs a task, two dimensions are involved: space and time. The training of a neural network is usually done in a stationary environment, but the real environment changes continuously, so spatiotemporal training is required: the synaptic weights of the network (the weight space) must change continuously. As the environment changes, the training examples as well as the weight space change. This is a continuous process in all animals, and such continuous change is also possible in an artificial neural network. In other words, the training process in an artificial neural network can be continuous, with the free parameters of the system continuously adapting to the environment. The question that arises is how often this adaptation should take place; that depends on the application. Would unsupervised training be better than supervised training, as appears to be the case in the human brain?

1.7 Models of a neuron


A neuron is an information-processing unit, and a neural network consists of a number of such units. The figure shows the model of a neuron. One can identify three basic ingredients of this neuron model:
(i) A set of connecting links, called synapses, between the input signals x_j (j = 1, 2, ..., m) and neuron k. These synapses are characterized by their synaptic weights w_kj (j = 1, 2, ..., m). Note that the subscripts of w are kj and not jk; the reason will become clear when we deal with the back-propagation algorithm for training the neuron.

(ii) An adder, which sums the input signals weighted (multiplied) by their respective synaptic weights.
(iii) An activation function, which limits the amplitude of the output of the neuron to some finite value, typically the range [0, 1] or [-1, 1]. This squashing operation is carried out by the nonlinear activation function.

[Figure: Model of neuron k, with inputs x_1, x_2, ..., x_m, synaptic weights w_k1, w_k2, ..., w_km, an externally applied bias b_k, a summing junction producing the induced local field v_k, and the activation function φ(·) producing the output y_k.]
The neuron model above includes an externally applied bias term b_k. The effect of the bias is to increase or lower the net input of the activation function, as shown in the figure below.
[Figure: Induced local field v_k versus the linear combiner's output u_k, for b_k > 0, b_k = 0, and b_k < 0.]

We will describe neuron k by the following set of equations:

u_k = \sum_{j=1}^{m} w_{kj} x_j

y_k = \varphi(u_k + b_k) = \varphi(v_k)

where v_k = u_k + b_k is the induced local field.

To incorporate the bias as an input term, the neuron model may be modified by adding a fixed input x_0 = +1 with weight w_k0 = b_k. The equations then become

v_k = \sum_{j=0}^{m} w_{kj} x_j

y_k = \varphi(v_k)

[Figure: Modified model of neuron k, with the bias absorbed as the fixed input x_0 = +1 weighted by w_k0 = b_k, in addition to the inputs x_1, ..., x_m with weights w_k1, ..., w_km, the activation function φ(·), and the output y_k.]
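The following is a minimal Python sketch of the neuron model given by the equations above, in both its original form and the form with the bias absorbed as w_k0 = b_k acting on the fixed input x_0 = +1. The logistic activation and the numerical values are assumptions chosen only for illustration.

```python
import math

def neuron_output(x, w, b, a=1.0):
    """Single neuron k: v_k = sum_j w_kj * x_j + b_k, y_k = phi(v_k).

    Here phi is taken to be the logistic function with slope parameter a;
    any of the activation functions of Section 1.9 could be used instead.
    """
    v = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b   # induced local field
    return 1.0 / (1.0 + math.exp(-a * v))              # y_k = phi(v_k)

def neuron_output_absorbed_bias(x, w, a=1.0):
    """Equivalent form with the bias absorbed: x_0 = +1 and w_k0 = b_k."""
    v = sum(w_j * x_j for w_j, x_j in zip(w, x))
    return 1.0 / (1.0 + math.exp(-a * v))

x, w, b = [0.5, -1.0, 2.0], [0.4, 0.3, 0.1], 0.2        # illustrative values
print(neuron_output(x, w, b))
print(neuron_output_absorbed_bias([1.0] + x, [b] + w))  # same result
```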

1.8 Signal Flow Graph of a Neuron

The signal flow graph of a single neuron is shown in the


figure below. One can identify the source nodes, the
computation node and the communication links from the
figure.

[Figure: Signal-flow graph of a neuron, with source nodes x_0 = +1, x_1, x_2, ..., x_m, links carrying the synaptic weights, and a computation node producing v_k and the output y_k = φ(v_k).]

1.9 Types of Activation Functions


Three types of activation functions are explained below.

1. Threshold Function:
As shown in the figure, we have

\varphi(v_k) = \begin{cases} 1 & \text{if } v_k \ge 0 \\ 0 & \text{if } v_k < 0 \end{cases}

[Figure: The threshold (hard-limiting) activation function plotted against v_k; a bipolar version takes the values +1 and -1.]

This type of neuron model is known as the McCulloch-Pitts model.

2. Piecewise-Linear Function

\varphi(v_k) = \begin{cases} 1 & \text{if } v_k \ge \tfrac{1}{2} \\ v_k & \text{if } -\tfrac{1}{2} < v_k < \tfrac{1}{2} \\ 0 & \text{if } v_k \le -\tfrac{1}{2} \end{cases}

[Figure: The piecewise-linear activation function plotted against v_k, rising linearly between -1/2 and +1/2.]

3. Sigmoid Function (Logistic Function)
The S-shaped sigmoid function is the most commonly used activation function:

\varphi(v_k) = \frac{1}{1 + e^{-a v_k}}

where a is the slope parameter.

[Figure: The logistic activation function φ(v_k) plotted against v_k, rising from 0 to 1.]

Note:
1. The sigmoid function is differentiable, whereas the threshold function is not. Differentiability is an important feature in neural network theory.
2. As a → ∞, φ(v_k) takes only the values 0 and 1, and the sigmoid reduces to the threshold function.
3. The logistic function derives its name from the transcendental law of logistic growth. Measured in appropriate units, all growth processes are supposed to be represented by the logistic distribution function

F(t) = \frac{1}{1 + \alpha e^{-\beta t}}

where t represents time and α, β are constants.

Another example of the odd sigmoid function which ranges


from -1 to +1 is the hyperbolic tangent function (the
sigmum function) given by the expression
av
(v ) tan h
2

1 e av 2
1
1 e av 1 e av

This is a bipolar continuous activation function ranging between -1 and +1. As a → ∞, we obtain a bipolar hard-limiting activation function whose output is either -1 or +1.
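The following is a minimal Python sketch of the activation functions of this section: the threshold (hard limiter), the piecewise-linear function, the logistic function with slope parameter a, and the hyperbolic tangent form. The test values of v and a are illustrative.

```python
import math

def threshold(v):
    """Hard limiter (McCulloch-Pitts): 1 if v >= 0, else 0."""
    return 1.0 if v >= 0 else 0.0

def piecewise_linear(v):
    """1 for v >= 1/2, v for -1/2 < v < 1/2, 0 for v <= -1/2."""
    if v >= 0.5:
        return 1.0
    if v <= -0.5:
        return 0.0
    return v

def logistic(v, a=1.0):
    """phi(v) = 1 / (1 + exp(-a v)); approaches the threshold function as a grows."""
    return 1.0 / (1.0 + math.exp(-a * v))

def tansigmoid(v, a=1.0):
    """phi(v) = tanh(a v / 2); bipolar, ranging from -1 to +1."""
    return math.tanh(a * v / 2.0)

for v in (-2.0, -0.25, 0.0, 0.25, 2.0):
    print(v, threshold(v), piecewise_linear(v),
          round(logistic(v), 3), round(tansigmoid(v), 3))
```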

1.10 Exercises:

1. Show that the derivative of the logistic function with respect to v is

\varphi'(v) = a \, \varphi(v) \, [1 - \varphi(v)]

What is the value of this derivative at the origin?

\varphi(v) = \frac{1}{1 + e^{-a v}}

\varphi'(v) = \frac{d\varphi(v)}{dv} = \frac{a e^{-a v}}{(1 + e^{-a v})^2} = \frac{1}{1 + e^{-a v}} \cdot \frac{a e^{-a v}}{1 + e^{-a v}} = a \, \varphi(v) \, [1 - \varphi(v)]

At v = 0, \varphi(v) = \tfrac{1}{2}. Therefore

\varphi'(0) = a \cdot \tfrac{1}{2} \cdot \left(1 - \tfrac{1}{2}\right) = \frac{a}{4}
2. Show that the derivative of the tansigmoid function with respect to v is

\varphi'(v) = \frac{a}{2} \left[1 - \varphi^2(v)\right]

What is the value of this derivative at the origin?

\varphi(v) = \tanh\!\left(\frac{a v}{2}\right)

\varphi'(v) = \frac{a}{2} \operatorname{sech}^2\!\left(\frac{a v}{2}\right) = \frac{a}{2}\left[1 - \tanh^2\!\left(\frac{a v}{2}\right)\right] = \frac{a}{2}\left[1 - \varphi^2(v)\right]

\varphi'(0) = \frac{a}{2}
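As a sanity check on Exercises 1 and 2, the following sketch compares the derived derivative formulas against central finite-difference estimates; the slope parameter a and the test points are assumed values.

```python
import math

a = 2.0        # illustrative slope parameter
h = 1e-6       # finite-difference step

def logistic(v):
    return 1.0 / (1.0 + math.exp(-a * v))

def tansig(v):
    return math.tanh(a * v / 2.0)

for v in (-1.0, 0.0, 0.7):
    num_log = (logistic(v + h) - logistic(v - h)) / (2 * h)   # numerical derivative
    ana_log = a * logistic(v) * (1.0 - logistic(v))           # a * phi * (1 - phi)
    num_tan = (tansig(v + h) - tansig(v - h)) / (2 * h)
    ana_tan = (a / 2.0) * (1.0 - tansig(v) ** 2)              # (a/2) * (1 - phi^2)
    print(round(num_log - ana_log, 8), round(num_tan - ana_tan, 8))  # both ~0

# At the origin the analytical values are a/4 and a/2 respectively.
print(a / 4.0, a / 2.0)
```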

4. In the logistic activation function, the presence of the constant a has the same effect as multiplying all the inputs by a:

\varphi(v) = \frac{1}{1 + e^{-a v}} = \frac{1}{1 + \exp\!\left(-a \sum_i w_i x_i\right)} = \frac{1}{1 + \exp\!\left(-\sum_i w_i (a x_i)\right)}

5. Show that
(i) a linear neuron may be approximated by a neuron with a sigmoidal activation function and small synaptic weights
(Hint: for small values of x, e^x ≈ 1 + x);
(ii) a McCulloch-Pitts model of a neuron may be approximated by a neuron with a sigmoidal activation function and large synaptic weights.
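A small numerical illustration of Exercise 5, under the simplifying assumption of a single input x and unit slope parameter: with a small weight the logistic output is approximately affine in x (consistent with e^x ≈ 1 + x), while with a large weight it saturates like the hard limiter of the McCulloch-Pitts model. The weight values used are illustrative.

```python
import math

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

# (i) Small synaptic weight: the output is approximately affine in x,
#     since for small v, 1 / (1 + e^(-v)) ~ 1/2 + v/4.
w_small = 0.01
for x in (-1.0, 0.0, 1.0):
    print(logistic(w_small * x), 0.5 + w_small * x / 4.0)   # nearly equal

# (ii) Large synaptic weight: the output behaves like the hard limiter of the
#      McCulloch-Pitts model, saturating at 0 or 1.
w_large = 100.0
for x in (-1.0, -0.1, 0.1, 1.0):
    print(round(logistic(w_large * x), 6))                  # ~0 or ~1
```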

1.11 Logic operations performed by an ANN


Logical AND:
Consider the truth table for an AND gate:

x2  x1 | y
 0   0 | 0
 0   1 | 0
 1   0 | 0
 1   1 | 1

[Figure: A single neuron with inputs x1 and x2, weights w1 and w2, bias b, and a hard limiter producing the output y.]

w1 = 1, w2 = 1, b = -1.5

Logical OR:
Consider the truth table for the OR gate:

x2  x1 | y
 0   0 | 0
 0   1 | 1
 1   0 | 1
 1   1 | 1

w1 = 1, w2 = 1, b = -0.5

Note: The implementations of the AND and OR logic functions differ only in the value of the bias.
Complement:

x | y
1 | 0
0 | 1

[Figure: A single neuron with input x, weight w = -1, bias b = 0.5, and a hard limiter producing the output y.]
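The following minimal Python sketch implements the three logic neurons above with a hard-limiter activation, using the weights and biases given in the text; the function names are only illustrative labels.

```python
def hard_limiter(v):
    """Output 1 if the induced local field is non-negative, else 0."""
    return 1 if v >= 0 else 0

def logic_neuron(inputs, weights, bias):
    """Single neuron: y = hard_limiter(sum_i w_i x_i + b)."""
    return hard_limiter(sum(w * x for w, x in zip(weights, inputs)) + bias)

def AND(x1, x2):
    return logic_neuron([x1, x2], [1, 1], -1.5)

def OR(x1, x2):
    return logic_neuron([x1, x2], [1, 1], -0.5)

def NOT(x):
    return logic_neuron([x], [-1], 0.5)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, AND(x1, x2), OR(x1, x2))
print(NOT(0), NOT(1))   # 1 0
```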
