Module 4 Notes
• The strength of the EPSP increases in response to the stimulus train, and this change can last for anything from hours to weeks.
• This facilitation of the EPSP is referred to as long-term potentiation (LTP).
• Interestingly, CA1 LTP displays three important properties: cooperativity, associativity, and specificity.
• Cooperativity implies that CA1 LTP cannot be produced by activating only one fibre: there is a minimum number of fibres that must be activated together to achieve LTP.
• This cooperativity has an associative feature: when both strong and weak excitatory inputs arrive in
the same dendritic region of a pyramidal cell, the weak input gets potentiated if it is activated in
association with the strong one. Finally, LTP is specific to the dendrites where it is produced.
• Amongst various other observations, the most important point to note is that LTP requires
depolarization of the postsynaptic neuron, and that this depolarization must coincide with the
stimulus from the presynaptic neuron.
• "When an axon of cell A... excite[s] cell B and repeatedly or persistently takes part in firing it, some
growth process or metabolic change takes place in one or both cells so that A 's efficacy as one of the
cells’ firing B is increased."
• The non-NMDA receptor channel is transmitter gated: the channel opens when the glutamate neurotransmitter binds to the designated site on the channel.
• The opening of the channel allows the flow of sodium ions into, and potassium ions out of the
postsynaptic dendrite.
• This eventually leads to a depolarization of the postsynaptic dendritic region.
• On the other hand, the NMDA receptor channel is a uniquely double-gated channel: it is usually blocked by $Mg^{++}$ ions, and so initially the channel remains blocked even when glutamate binds to it. This is shown in Fig. 4.2(a).
• Magnesium ions decouple from the channel only if the postsynaptic cell is sufficiently depolarized.
• Therefore, an NMDA channel can become unblocked only when
o glutamate binds to the receptor, and
o the postsynaptic cell is sufficiently depolarized by strong cooperative inputs from the
presynaptic neurons.
• Depolarization is achieved only when many non-NMDA receptors open in response to the simultaneous firing of many presynaptic inputs, since opening these receptors allows an influx of $Na^+$ into the postsynaptic cell.
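To make the coincidence requirement concrete, here is a tiny illustrative sketch (ours, not from the notes) of the NMDA channel as a molecular AND gate; the function name and threshold value are hypothetical:

```python
# Illustrative sketch only: the NMDA channel as a molecular AND gate.
# The unblock threshold is a hypothetical stand-in value.

def nmda_conducts(glutamate_bound: bool, v_post_mv: float,
                  unblock_threshold_mv: float = -50.0) -> bool:
    """The channel conducts only when BOTH gates are satisfied:
    glutamate is bound AND the postsynaptic membrane is depolarized
    enough to expel the Mg++ block."""
    mg_block_removed = v_post_mv > unblock_threshold_mv
    return glutamate_bound and mg_block_removed

# Presynaptic glutamate alone is not enough (Mg++ still blocks) ...
assert not nmda_conducts(glutamate_bound=True, v_post_mv=-70.0)
# ... nor is depolarization alone (no transmitter bound) ...
assert not nmda_conducts(glutamate_bound=False, v_post_mv=-30.0)
# ... only the coincidence of the two opens the channel.
assert nmda_conducts(glutamate_bound=True, v_post_mv=-30.0)
```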
Fig 4.2: Structure of the NMDA synapse and the molecular mechanism of LTP
Figure 4.3: A matrix associative memory encodes associations into a connection matrix
• Whether the system is auto associative or hetero-associative, the memory of the vectors that are to be
associated is stored in the connections of the network.
• Associative memory models enjoy properties such as fault tolerance and, when they are auto-associative, the ability to solve the pattern completion problem.
• Figure 4.4 shows two layers of neurons, one which receives input vectors and one which generates
output vectors.
where we assume that the signal vector A is generated using a non-linear function of activations in the input layer. In this case the memory is non-linear.
4.3.2. Hebb Association Matrix
• The Hebb postulate states that the change in synaptic efficacy is proportional to the conjunction
(or product) of pre and postsynaptic neuron signals.
• To formalize this idea, consider two layers 𝐿𝑥 and 𝐿𝑦 of n and m neurons respectively such that
neurons from 𝐿𝑥 connect to 𝐿𝑦 using a weight matrix W.
• If we assume that the weight matrix is constructed using Hebbian learning, it generates weight
values in accordance with the product of the presynaptic and postsynaptic activities of a neuron
as follows:
$$\delta w_{ij} = \eta\, a_i b_j \qquad (4.3)$$
for some learning rate η, and signal vectors $A = (a_1, \ldots, a_n)^T$ and $B = (b_1, \ldots, b_m)^T$ across $L_x$ and $L_y$, respectively. This rule is often referred to as the generalized Hebb rule.
• If we assume the learning rate $\eta = 1$, Hebbian learning essentially reduces to taking the outer products of signal vectors.
• For example, if we start out with a zero association matrix, then a single association between two vectors A and B is encoded as:
$$W = \begin{pmatrix} a_1 b_1 & \cdots & a_1 b_m \\ \vdots & \ddots & \vdots \\ a_n b_1 & \cdots & a_n b_m \end{pmatrix} = A B^T \qquad (4.4)$$
• The Hebb conjunctional learning rule should be clear from the above matrix: the weight 𝑤𝑖𝑗 that
connects neuron i of 𝐿𝑥 to neuron j of 𝐿𝑦 is the product of the respective signals 𝑎𝑖 and 𝑏𝑗 of both
the neurons. The association (A, B) is thus encoded into the matrix.
What if more than one association needs to be encoded? The natural course to follow is to superpose the individual association matrices. Assuming Q associations $\{A_k, B_k\}_{k=1}^{Q}$ are to be encoded, the resulting connection matrix is then
$$W = \sum_{k=1}^{Q} A_k B_k^T \qquad (4.6)$$
Suppose the vectors $A_k$ are orthonormal, that is, orthogonal to one another and normalized to unit magnitude. In such a case, when a vector $A_i$ is presented to neurons in layer $L_x$, a forward pass through the network yields a vector B′:
$$B'^T = A_i^T W \qquad (4.7)$$
Expanding W with Eq. (4.6) and using orthonormality ($A_i^T A_k = 1$ if $k = i$, and 0 otherwise) gives $B'^T = \sum_{k=1}^{Q} (A_i^T A_k)\, B_k^T = B_i^T$, so the stored association is recalled exactly.
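The encode-and-recall behaviour of Eqs. (4.6) and (4.7) is easy to check numerically. The following is a minimal NumPy sketch (our illustration; the dimensions and seed are arbitrary) that builds W from outer products of orthonormal keys and confirms perfect recall:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, Q = 8, 5, 3

# Orthonormal key vectors A_k (via QR decomposition of a random matrix),
# and arbitrary associated vectors B_k.
A = np.linalg.qr(rng.standard_normal((n, Q)))[0]   # columns A_1..A_Q
B = rng.standard_normal((m, Q))                    # columns B_1..B_Q

# Eq. (4.6): superpose the outer products A_k B_k^T into one matrix.
W = sum(np.outer(A[:, k], B[:, k]) for k in range(Q))   # shape (n, m)

# Eq. (4.7): a forward pass with key A_i recovers B_i exactly,
# because the cross-talk terms A_i^T A_k (k != i) vanish.
for i in range(Q):
    B_recalled = A[:, i] @ W
    assert np.allclose(B_recalled, B[:, i])
print("perfect recall for all", Q, "orthonormal keys")
```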
Layer F comprises n neurons which receive external inputs $I_i$, $i = 1, \ldots, n$. Each neuron feeds back signals $\mathcal{S}_i(x_i)$ through weights $w_{ij}$, $i = 1, \ldots, n$; $j = 1, \ldots, n$. The simplest form of dynamics is the additive activation dynamics model given by:
$$\dot{x}_i = -A_i x_i + \sum_{j=1}^{n} w_{ij}\, \mathcal{S}_j(x_j) + I_i, \qquad i = 1, \ldots, n \qquad (4.12)$$
where we assume that individual neurons can have distinct sigmoidal characteristics
$$\mathcal{S}_i(x) = \frac{1 - e^{-\lambda_i x}}{1 + e^{-\lambda_i x}} \qquad (4.13)$$
with an inverse function
$$\mathcal{S}_i^{-1}(s) = \frac{1}{\lambda_i} \ln\!\left(\frac{1+s}{1-s}\right) = \frac{1}{\lambda_i}\, \mathcal{S}^{-1}(s) \qquad (4.14)$$
where $s = \mathcal{S}(x) = \frac{1 - e^{-x}}{1 + e^{-x}}$. Usually, the gain scale factor $\lambda_i$ is common to all neurons.
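As an illustration of Eqs. (4.12) and (4.13), the following sketch (ours; the parameter choices are arbitrary) integrates the additive dynamics with a simple forward-Euler step and shows the activations settling to an equilibrium:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
lam = np.full(n, 2.0)                 # gain scale factors lambda_i
A_decay = np.ones(n)                  # passive decay terms A_i
I_ext = rng.standard_normal(n)        # external inputs I_i
W = rng.standard_normal((n, n))
W = 0.5 * (W + W.T)                   # symmetric weights, as assumed later
np.fill_diagonal(W, 0.0)              # no self-feedback

def S(x):
    """Bipolar sigmoid of Eq. (4.13)."""
    return (1.0 - np.exp(-lam * x)) / (1.0 + np.exp(-lam * x))

# Forward-Euler integration of Eq. (4.12): xdot = -A x + W S(x) + I.
x = rng.standard_normal(n)
dt = 0.01
for _ in range(5000):
    x = x + dt * (-A_decay * x + W @ S(x) + I_ext)

print("equilibrium activations:", np.round(x, 3))
print("residual |xdot|:",
      np.linalg.norm(-A_decay * x + W @ S(x) + I_ext))
```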
Assuming the ith input and output amplifier voltages to be 𝑥𝑖 and 𝑆𝑖 (𝑥𝑖 ) (where S(.) is the amplifier transfer
function) respectively, the Kirchhoff current law equation at the input node of the ith amplifier can be
written as:
-------------------(4.15)
Rearrangement yields
-------------(4.16)
------------------(4.17)
Where,
--------------------(4.18)
-----------------(4.19)
where $a_i(x_i) > 0$, $C_{ij} = C_{ji}$, and $d_j'(x_j) \geq 0$.
These models also admit the Lyapunov function
----------------(4.20)
and are globally asymptotically stable.
Rearranging equation (4.12) yields:
$$\dot{x}_i = \left(-A_i x_i + I_i\right) - \sum_{j=1}^{n} \left(-w_{ij}\right) \mathcal{S}_j(x_j) \qquad (4.21)$$
which has the Cohen-Grossberg form with the following substitutions:
$$a_i(x_i) = 1 \qquad (4.22)$$
$$b_i(x_i) = -A_i x_i + I_i, \qquad C_{ij} = -w_{ij}, \qquad d_j(x_j) = \mathcal{S}_j(x_j)$$
$$E = -\frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{n} w_{jk}\, s_j s_k + \sum_{i=1}^{n} A_i \int_0^{s_i} \mathcal{S}_i^{-1}(\beta_i)\, d\beta_i - \sum_{i=1}^{n} I_i s_i \qquad (4.27)$$
-------------(4.28)
$$\dot{E} = \sum_{i=1}^{n} \frac{\partial E}{\partial s_i}\, \dot{s}_i \qquad (4.29)$$
From Eqs. (4.27) and (4.12), and noting that $x_i = \mathcal{S}_i^{-1}(s_i)$,
$$\frac{\partial E}{\partial s_i} = A_i\, \mathcal{S}_i^{-1}(s_i) - \sum_{j=1}^{n} w_{ij}\, s_j - I_i \qquad (4.30)$$
$$= -\dot{x}_i \qquad (4.31)$$
Substituting (4.31) in (4.29) yields
$$\dot{E} = -\sum_{i=1}^{n} \dot{x}_i\, \dot{s}_i \qquad (4.32)$$
The product 𝑥̇ 𝑖 𝑠̇𝑖 is always positive since both the activation and signal time derivatives have the same sign.
The positivity of this product also follows from the monotonicity of both signal function and its inverse.
$$\dot{E} = -\sum_{i=1}^{n} \frac{d}{dt}\left(\mathcal{S}_i^{-1}(s_i)\right) \dot{s}_i \qquad (4.33)$$
$$= -\sum_{i=1}^{n} \left(\mathcal{S}_i^{-1}\right)'(s_i)\, \dot{s}_i^{2} \qquad (4.34)$$
$$< 0 \qquad (4.35)$$
since $(\mathcal{S}_i^{-1})'(s_i) > 0$ and $\dot{s}_i^2 > 0$. Note that $\dot{E} = 0$ iff $\dot{s}_i = 0$ for all i: the time derivatives of the signals go to zero, i.e., the signals cease to evolve further in time, exactly when the time derivative of the energy goes to zero. The signal state trajectory must therefore eventually come to rest. This establishes the stability of the system in continuous time.
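This energy-descent argument can be checked numerically. The sketch below (our illustration, assuming a small Euler step and symmetric weights with no self-feedback) evaluates the energy of Eq. (4.27) along a simulated trajectory, using the closed form of the integral of $\mathcal{S}_i^{-1}$, and confirms that it never increases:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
lam = np.full(n, 2.0)
A_decay = np.ones(n)
I_ext = rng.standard_normal(n)
W = rng.standard_normal((n, n)); W = 0.5 * (W + W.T)
np.fill_diagonal(W, 0.0)

S = lambda x: (1 - np.exp(-lam * x)) / (1 + np.exp(-lam * x))

def energy(s):
    """Energy of Eq. (4.27). The integral of S_i^{-1} has the closed
    form (1/lam_i) * [(1+s)ln(1+s) + (1-s)ln(1-s)]."""
    integral = ((1 + s) * np.log1p(s) + (1 - s) * np.log1p(-s)) / lam
    return -0.5 * s @ W @ s + A_decay @ integral - I_ext @ s

x = 0.1 * rng.standard_normal(n)
dt, E_prev = 0.005, np.inf
for _ in range(5000):
    x = x + dt * (-A_decay * x + W @ S(x) + I_ext)
    E = energy(S(x))
    assert E <= E_prev + 1e-6   # energy never increases along the path
    E_prev = E
print("final energy:", E_prev)
```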
The binary threshold signal function is actually a limiting case of the sigmoidal function as $\lambda_i \to \infty$. When the neuron gain scale factor $\lambda_i \to \infty$, we say the network operates in the high-gain limit. If we allow $\lambda_i \to \infty$, the energy function of Eq. (4.28) reduces to
$$E = -\frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{n} w_{jk}\, s_j s_k - \sum_{i=1}^{n} I_i s_i \qquad (4.36)$$
In the absence of external inputs the energy function of Eq. (4.36) reduces to the familiar quadratic form
$$E = -\frac{1}{2} \sum_{j=1}^{n} \sum_{k=1}^{n} w_{jk}\, s_j s_k \qquad (4.37)$$
In this high-gain limit, a neuron switches its state from -1 to 1 or from 1 to -1 at discrete intervals of time by sampling its current activation. At the kth instant of time the neuron updates its activation and signal to instant k+1 in accordance with the difference equations:
$$x_i^{k+1} = \sum_{j=1}^{n} w_{ij}\, s_j^k + I_i, \qquad s_i^{k+1} = \begin{cases} +1 & x_i^{k+1} > 0 \\ s_i^k & x_i^{k+1} = 0 \\ -1 & x_i^{k+1} < 0 \end{cases} \qquad (4.38)$$
i = 1,... ,n where the subscript i on the signal function has been dropped since all neurons operate in the high
gain limit. Note that positive and negative activations translate unambiguously to 1 and -1 whereas the
neuron does not change state if its activation is zero.
At time instant k we assume the energy of the system to be 𝐸𝑘 :
$$E_k = -\frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}\, s_i^k s_j^k - \sum_{i=1}^{n} I_i s_i^k \qquad (4.39)$$
Neuron state changes at time instant k will yield a new system state vector and consequently a new system
energy 𝐸𝑘+1 :
$$E_{k+1} = -\frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}\, s_i^{k+1} s_j^{k+1} - \sum_{i=1}^{n} I_i s_i^{k+1} \qquad (4.40)$$
The change in energy Δ𝐸 is then defined as
$$\Delta E = E_{k+1} - E_k \qquad (4.41)$$
Without any loss of generality, we may assume that only one neuron, say the Ith, changes state. Note that in the double summation of Eq. (4.40) there are two sets of terms that involve index I: one where i = I and the other where j = I. For the moment we are only interested in the energy $E_I^k$ comprising terms that involve index I:
$$E_I^k = -\frac{1}{2} \sum_{j=1}^{n} w_{Ij}\, s_I^k s_j^k - \frac{1}{2} \sum_{i=1}^{n} w_{iI}\, s_i^k s_I^k - I_I s_I^k \qquad (4.42)$$
Replacing the index i by j in the second summation term and noting that 𝑤𝐼𝑗 = 𝑤𝑗𝐼 since network
connections are symmetric, yields
$$E_I^k = -\sum_{j=1}^{n} w_{Ij}\, s_I^k s_j^k - I_I s_I^k = -s_I^k \left( \sum_{j=1}^{n} w_{Ij}\, s_j^k + I_I \right) \qquad (4.43)$$
Since the change in energy $\Delta E_k$ involves only the Ith neuron, we have:
$$\Delta E_k = E_I^{k+1} - E_I^k = -\Delta s_I^{k+1} \left( \sum_{j=1}^{n} w_{Ij}\, s_j^k + I_I \right) = -\Delta s_I^{k+1}\, x_I^{k+1} \qquad (4.44)$$
where $x_I^{k+1}$ is the activation of neuron I at time instant k+1, and $\Delta s_I^{k+1} = s_I^{k+1} - s_I^k$.
Fig 4.7: Three 12 x 12-pixel patterns encoded into a Hopfield network
Recall that bipolar encoding of these patterns also encodes their complements. This means that there are going to be three complement-state spurious attractors amongst the other mixture states. We do, however, expect that partially distorted patterns will be cleaned up to their original form. In other words, as long as the distortion level is such that the initial point lies within the basin of attraction of the encoded memory, image clean-up and proper recall are guaranteed.
Figure 4.8 portrays a 40 per cent distorted version of the original plane image. This means that a random sample of 58 of the original 144 pixels of the plane image has been inverted. For the purposes of this simulation, we employed pure asynchronous update. We only have to ensure that each neuron in the network is updated the same number of times on average. In the present simulation, the order of update of the neurons was randomized, but it was ensured that each neuron was updated at least once during a field update cycle involving all 144 neurons.
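A compact simulation of this experiment can be written as follows. This sketch is ours: random bipolar patterns stand in for the three 12 x 12 images, but the Hebbian encoding, the 58-pixel distortion, and the randomized asynchronous update cycle follow the description above:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 144                                  # 12 x 12 bipolar pixels

# Three random bipolar "patterns" stand in for the plane/ship/gate images.
patterns = rng.choice([-1, 1], size=(3, n))

# Hebbian encoding with zero diagonal (no self-feedback).
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0.0)

def recall(s, n_sweeps=10):
    """Pure asynchronous update: each field update cycle visits every
    neuron once, in randomized order, per Eq. (4.38) with zero input."""
    s = s.copy()
    for _ in range(n_sweeps):
        for i in rng.permutation(n):
            x = W[i] @ s
            if x != 0:                   # hold state if activation is zero
                s[i] = 1 if x > 0 else -1
    return s

# Distort 40% of the pixels (58 of 144) of pattern 0, then clean it up.
probe = patterns[0].copy()
flip = rng.choice(n, size=58, replace=False)
probe[flip] *= -1
out = recall(probe)
print("recovered pattern 0:",
      np.array_equal(out, patterns[0])
      or np.array_equal(out, -patterns[0]))  # complement is also an attractor
```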
Fig 4.9: Labelled regions of a slotted wedge in a particular view of the object
Labelling involves calculating the area of each region and tracing the boundary to determine corner points.
The area is a local feature of each region. Once each region has been labelled, a global relational feature is
computed. As shown in Figure 4.9 for region 2, this global feature is the distance from the centroid of region
2 to the centroids of all other regions. Both the area and the distances are suitably normalized.
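A rough sketch of this feature computation (ours, using scipy.ndimage on a made-up binary image in place of the wedge views) might look as follows:

```python
import numpy as np
from scipy import ndimage

# Hypothetical binary mask standing in for one view of the object.
img = np.zeros((64, 64), dtype=int)
img[10:30, 10:40] = 1            # region-like blobs
img[40:55, 20:30] = 1
img[35:45, 45:60] = 1

labels, n_regions = ndimage.label(img)          # label connected regions
idx = range(1, n_regions + 1)

# Local feature: area (pixel count) of each region.
areas = ndimage.sum(img, labels, index=idx)

# Global relational feature: distances from one region's centroid
# to the centroids of all the other regions.
centroids = np.array(ndimage.center_of_mass(img, labels, index=idx))
d = np.linalg.norm(centroids - centroids[0], axis=1)  # from region 1

# Normalize both features so they are comparable across views.
areas_n = areas / areas.sum()
d_n = d / (d.max() if d.max() > 0 else 1.0)
print("normalized areas:", np.round(areas_n, 3))
print("normalized centroid distances from region 1:", np.round(d_n, 3))
```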
Activation and Signal Function: Assume that the network has an activation vector $X = (x_1^k, \ldots, x_n^k)^T$ and a corresponding signal vector $S = (s_1^k, \ldots, s_n^k)^T$ at time instant k. Then BSB neurons update their signals in accordance with the following pair of update equations.
$$x_i^{k+1} = \gamma\, x_i^k + \alpha \sum_{j=1}^{n} w_{ij}\, s_j^k + \delta\, s_i^k, \qquad s_i^{k+1} = S\!\left(x_i^{k+1}\right) \qquad (4.45)$$
where γ, α and δ are constants that control the nature of the activation update equation. If γ = α = 1 and δ = 0, Eq. (4.45) reduces to
$$x_i^{k+1} = x_i^k + \sum_{j=1}^{n} w_{ij}\, s_j^k, \qquad s_i^{k+1} = S\!\left(x_i^{k+1}\right) \qquad (4.46)$$
The signal function S(.) in Eq. (4.46) is given by a bipolar version of the linear threshold signal function:
$$S(x) = \begin{cases} +1 & x \geq 1 \\ x & -1 < x < 1 \\ -1 & x \leq -1 \end{cases} \qquad (4.47)$$
Weight Matrix: As in the case of the Hopfield network, the connection matrix is symmetric, which implies
the existence of n mutually orthogonal eigenvectors 𝜂1 … … . 𝜂𝑛 . Then the matrix W is completely
determined by its eigenvalues and eigenvectors. If we start out with a zero-connection matrix, and assume
that Q orthonormal input vectors 𝐴𝑖 are presented, such that each 𝐴𝑖 appears 𝜆𝑖 times, then in accordance
with the Hebb encoding scheme the resulting connection matrix is
$$W = \sum_{i=1}^{Q} \lambda_i\, A_i A_i^T \qquad (4.48)$$
Note that each 𝐴𝑖 is an eigenvector of W with eigenvalue 𝜆𝑖 since
$$W A_i = \sum_{j=1}^{Q} \lambda_j\, A_j A_j^T A_i = \lambda_i A_i \qquad (4.49)$$
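The eigen-structure claimed in Eqs. (4.48) and (4.49) can be verified directly; the following NumPy sketch (ours; sizes and presentation counts are arbitrary) builds W and checks that each stored vector is an eigenvector:

```python
import numpy as np

rng = np.random.default_rng(4)
n, Q = 10, 4

# Q orthonormal vectors A_i (columns), each assumed presented lam_i times.
A = np.linalg.qr(rng.standard_normal((n, Q)))[0]
lam = np.array([5.0, 3.0, 2.0, 1.0])    # presentation counts / eigenvalues

# Eq. (4.48): Hebbian encoding piles up lam_i copies of each outer product.
W = sum(lam[i] * np.outer(A[:, i], A[:, i]) for i in range(Q))

# Eq. (4.49): each A_i is an eigenvector of W with eigenvalue lam_i.
for i in range(Q):
    assert np.allclose(W @ A[:, i], lam[i] * A[:, i])
print("each stored vector is an eigenvector; its eigenvalue is its count")
```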
Operational Summary
-----------------------------(4.50)
where 𝛿𝑗𝑖 is the Kronecker delta function, and we have defined
------------------------------(4.51)
Although equation 4.50 is specified in discrete time, one can easily rewrite it in continuous form as
-------------------------(4.52)
Now, consider a change of variables. Make the substitution:
------------------------------------(4.53)
in which case Eq. (4.52) modifies to
------------------------(4.54)
A Lyapunov function for the BSB model is now straightforward to derive. All we need to do is make the appropriate substitutions into the Lyapunov function of the standard Cohen-Grossberg form. This yields:
------------------(4.55)
Using the definitions from Eqs. (4.47), (4.51), and (4.53), Eq. (4.45) can be written in terms of the original variables $s_i$ as
--------------------(4.56)
The simulated annealing method for finding an optimal configuration of neuron states, given a set of weights, is based on the physical annealing metaphor. It involves the following basic steps:
1. Randomize neuron states once in the beginning and initialize the temperature to a high value.
2. Choose a neuron I randomly from the network.
3. Compute energy 𝐸𝐴 of the present configuration A.
4. Flip the state of neuron I to generate a new configuration B.
5. Compute the energy 𝐸𝐵 of configuration B.
6. If $E_B < E_A$ then accept the state change of neuron I; otherwise accept the state change for neuron I with probability $e^{-\Delta E^{(1)}/T}$, where $\Delta E^{(1)} = E_B - E_A$.
7. Continue selecting and testing neurons randomly, setting their states several times in this way, until thermal equilibrium is reached.
8. Finally, lower the temperature and repeat the procedure.
From an implementation point of view, note that the transition probability is dependent on ΔE. Therefore,
only those neurons need to be considered which are directly connected to the neuron I under consideration.
The change in energy $\Delta E^{(1)}$ which occurs when neuron I flips its state from $s_I$ to $-s_I$ is
$$\Delta E^{(1)} = 2\, s_I \sum_{j=1}^{n} w_{Ij}\, s_j \qquad (4.57)$$
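Putting steps 1-8 together, a minimal simulated-annealing sketch (ours; the temperature schedule and trial counts are arbitrary choices) that exploits the local energy change of Eq. (4.57) is:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 32
W = rng.standard_normal((n, n)); W = 0.5 * (W + W.T)
np.fill_diagonal(W, 0.0)

def energy(s):
    return -0.5 * s @ W @ s

s = rng.choice([-1, 1], size=n)          # 1. randomize states once
T = 10.0                                 #    ... and start hot
while T > 0.05:                          # 8. lower temperature and repeat
    for _ in range(20 * n):              # 7. many trials per temperature
        I = rng.integers(n)              # 2. pick a neuron at random
        dE = 2 * s[I] * (W[I] @ s)       # 3-5. energy change via Eq. (4.57)
        # 6. accept downhill moves always, uphill with prob e^{-dE/T}
        if dE < 0 or rng.random() < np.exp(-dE / T):
            s[I] = -s[I]
    T *= 0.9
print("annealed energy:", energy(s))
```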
• The model also introduces a powerful stochastic learning algorithm in place of the simple Hebbian
rule employed in Hopfield networks.
• Relaxation in Boltzmann machines is done using the simulated annealing procedure.
• The literature on Boltzmann machines differentiates between a completion network and an input-
output network (similar to the standard feedforward neural network architecture).
• These architectures are portrayed in Fig. 4.11. Note that there are hidden neurons that do not receive any external input. In addition, the connections are all bidirectional. Architectural differences from the Hopfield network and a standard feedforward network should be carefully observed.
• In a completion network architecture (Fig. 4.11(a)), a partially specified vector is expected to be completed (or corrected) by clamping (holding fixed) the neuron signals at the corresponding known values, and allowing the remaining unknown values to be determined through a network relaxation process that is based on simulated annealing. In an input-output network (Fig. 4.11(b)), the inputs are clamped to a particular value and the outputs are determined through the relaxation process.
Let us summarize the similarities and differences between the Boltzmann machine and the Hopfield
network. First the similarities. In both the models
1. neuron states are bipolar,
2. weights are symmetric,
3. neurons are selected at random for asynchronous update, and
4. there is no self-feedback.
----------------------------(4.58)
Neurons in a Boltzmann machine flip their states based on Glauber dynamics. Specifically, for the Ith neuron with signal $s_I$ and activation $x_I$,
$$s_I^{k+1} = \begin{cases} +1 & \text{with probability } P(x_I) \\ -1 & \text{with probability } 1 - P(x_I) \end{cases}, \qquad P(x_I) = \frac{1}{1 + e^{-2 x_I / T}} \qquad (4.59)$$
Alternatively, Eq. (4.59) can be recast into the form of a flip probability. For this, we expand Eq. (4.59) and observe the signal flip probabilities for neuron I:
$$P(s_I: +1 \to -1) = 1 - P(x_I), \qquad P(s_I: -1 \to +1) = P(x_I) \qquad (4.60)$$
which we conveniently combine using the neuron signal value before the flip, and the energy change $\Delta E_I$ that results from the flip:
$$P(s_I \to -s_I) = \frac{1}{1 + e^{\Delta E_I / T}}, \qquad \Delta E_I = 2\, s_I x_I \qquad (4.61)$$
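A single Glauber update is then only a few lines of code. This sketch is ours and assumes the flip-probability form above; note the contrast with the Metropolis-style acceptance used in the simulated annealing procedure, since here even downhill flips are accepted only probabilistically:

```python
import numpy as np

rng = np.random.default_rng(6)

def glauber_step(s, W, T, rng):
    """One Glauber update per Eqs. (4.59)-(4.61): flip neuron I with
    probability 1 / (1 + exp(dE_I / T)), where dE_I = 2 s_I x_I."""
    n = len(s)
    I = rng.integers(n)
    x_I = W[I] @ s                      # activation of the chosen neuron
    dE = 2 * s[I] * x_I                 # energy change if s_I flips
    p_flip = 1.0 / (1.0 + np.exp(min(dE / T, 60.0)))  # capped for overflow
    if rng.random() < p_flip:
        s[I] = -s[I]
    return s

n = 16
W = rng.standard_normal((n, n)); W = 0.5 * (W + W.T)
np.fill_diagonal(W, 0.0)
s = rng.choice([-1, 1], size=n)
for _ in range(5000):                   # relax at a fixed temperature
    s = glauber_step(s, W, T=1.0, rng=rng)
print("state after Glauber relaxation:", s)
```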
$$B'^T = \Phi\!\left( A'^T W \right) \qquad (4.62)$$
where the signal vector A′ of $L_x$ is composed with the weight matrix W to set up activations in $L_y$, which are then thresholded (by the operator $\Phi(\cdot)$) to generate the signal vector B′ across $L_y$. Assuming that A′ is closer to an encoded association $A_i$ than to all other encoded associations $A_j$, it is desired that the generated vector B′ be closer to $B_i$ than to any other encoded $B_j$, $B_i$ being the vector associated with $A_i$. If we wish to pass B′ back to $L_x$ in order to improve the accuracy of recall, we may do so by using $W^T$ in this reverse chain of activity transmission from $L_y$ to $L_x$:
$$A''^T = \Phi\!\left( B'^T W^T \right) \qquad (4.63)$$
A′′ can then be fed forward through W to generate B′′. Continuing this bidirectional process generates a sequence of vector pairs:
$$(A', B') \rightarrow (A'', B'') \rightarrow (A''', B''') \rightarrow \cdots \qquad (4.64)$$
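The ping-pong recall of Eq. (4.64) can be sketched as follows (our illustration, with small random bipolar pairs; with a lightly distorted key the process should typically settle on the encoded pair):

```python
import numpy as np

rng = np.random.default_rng(7)
n, m, Q = 12, 8, 2

# Bipolar pattern pairs (A_k, B_k) encoded as W = sum_k A_k B_k^T.
A = rng.choice([-1, 1], size=(Q, n))
B = rng.choice([-1, 1], size=(Q, m))
W = sum(np.outer(A[k], B[k]) for k in range(Q)).astype(float)

def threshold(x, prev):
    """Bipolar threshold; hold the previous signal where activation is 0."""
    return np.where(x > 0, 1, np.where(x < 0, -1, prev))

# Start from a noisy version of A_0 and ping-pong until (A', B') is stable.
a = A[0].copy(); a[:3] *= -1                  # distort a few components
b = threshold(a @ W, np.ones(m, dtype=int))   # forward pass through W
while True:
    a_new = threshold(W @ b, a)               # reverse pass through W^T
    b_new = threshold(a_new @ W, b)           # forward pass again
    if np.array_equal(a_new, a) and np.array_equal(b_new, b):
        break                                 # bidirectional fixed point
    a, b = a_new, b_new
print("recovered pair 0:",
      np.array_equal(a, A[0]) and np.array_equal(b, B[0]))
```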