
Associative Memory Neural Networks

Associative memory
• Associative memory is defined as the ability to learn and remember the
relationship between unrelated items; for example, remembering the name of
someone or the aroma of a particular perfume.
• Associative memory deals specifically with the relationship between different
objects or concepts. A normal associative memory task involves testing
participants on their recall of pairs of unrelated items, such as face-name pairs. 
• Associative memories are neural networks (NNs) for modeling the learning
and retrieval of memories in the brain. The retrieved memory and its query are
typically represented by binary, bipolar, or real vectors describing patterns of
neural activity.
Pattern Association
• Learning is the process of forming associations between related patterns.
• The patterns we associate together may be of the same type or of different
types.
• Each association is an input-output vector pair, s:t.
• If each vector t is the same as the vector s with which it is associated, then the
net is called an autoassociative memory.
• If the t's are different from the s's, the net is called a heteroassociative
memory.
• In each of these cases, the net not only learns the specific pattern pairs that
were used for training, but also is able to recall the desired response pattern
when given an input stimulus that is similar, but not identical, to the training
input.
Training Algorithms for Pattern Association
• Similar to Hebbian learning for classification.
• Algorithm (bipolar or binary patterns): for each training pair s:t, update each weight by
\[
\Delta w_{ij} = s_i\, t_j
\]
• w_ij increases if both s_i and t_j are ON (binary) or have the same sign (bipolar).
• Over the whole training set,
\[
w_{ij} = \sum_{p=1}^{P} s_i(p)\, t_j(p), \qquad W = \{ w_{ij} \}
\]
• Instead of obtaining W by iterative updates, it can be computed from the
training set by calculating the outer product of s and t.
Outer product
• Outer product. Let s and t be row vectors.
Then for a particular training pair s:t
\[
W(p) = s^{T}(p)\, t(p) =
\begin{bmatrix} s_1 \\ s_2 \\ \vdots \\ s_n \end{bmatrix}
\begin{bmatrix} t_1 & \cdots & t_m \end{bmatrix}
=
\begin{bmatrix} s_1 t_1 & \cdots & s_1 t_m \\ s_2 t_1 & \cdots & s_2 t_m \\ \vdots & & \vdots \\ s_n t_1 & \cdots & s_n t_m \end{bmatrix}
=
\begin{bmatrix} w_{11} & \cdots & w_{1m} \\ \vdots & & \vdots \\ w_{n1} & \cdots & w_{nm} \end{bmatrix}
\]
• And the full weight matrix is the sum over all P training pairs:
\[
W = \sum_{p=1}^{P} W(p)
\]
• It involves 3 nested loops p, i, j (order of p is irrelevant)
p= 1 to P /* for every training pair */
i = 1 to n /* for every row in W */
j = 1 to m /* for every element j in row i */
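The triple loop and the outer-product form give the same matrix. Below is a minimal Python/NumPy sketch; the function names and array layout (S holds the P input vectors as rows, T the P target vectors) are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def hebb_weights_loops(S, T):
    """Hebb rule with the three nested loops: W[i, j] = sum over p of s_i(p) * t_j(p)."""
    P, n = S.shape              # P training pairs, n input components
    m = T.shape[1]              # m output components
    W = np.zeros((n, m))
    for p in range(P):          # for every training pair
        for i in range(n):      # for every row in W
            for j in range(m):  # for every element j in row i
                W[i, j] += S[p, i] * T[p, j]
    return W

def hebb_weights_outer(S, T):
    """Equivalent closed form: sum of outer products s(p)^T t(p), i.e. W = S^T T."""
    return S.T @ T
```

Both functions return the same W; in practice the closed form `S.T @ T` is the one normally used.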
Delta rule
• In its original form, the delta rule assumed that the activation function for the
output unit was the identity function.
• A simple extension allows for the use of any differentiable activation
function; we shall call this the extended delta rule.
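As a rough sketch of how the delta rule differs from the pure Hebb rule, the weights are adjusted iteratively toward the targets. The code below assumes the identity activation on the output units; the learning rate and epoch count are arbitrary illustrative choices.

```python
import numpy as np

def delta_rule_train(S, T, alpha=0.1, epochs=50):
    """Iteratively nudge W so that s(p) @ W approaches t(p) for every training pair."""
    n, m = S.shape[1], T.shape[1]
    W = np.zeros((n, m))
    for _ in range(epochs):
        for s, t in zip(S, T):
            y = s @ W                          # identity activation on the output units
            W += alpha * np.outer(s, t - y)    # weight change proportional to the error
    return W
```

For the extended delta rule, the error term (t - y) would additionally be multiplied by the derivative of the chosen differentiable activation function evaluated at the net input.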
Hetero-associative Memory
Associative memory neural networks are nets in which the weights
are determined in such a way that the net can store a set of P
pattern associations.
• Each association is a pair of vectors (s(p), t(p)), with p = 1,
2, . . . , P.
• Each vector s(p) is an n-tuple (has n components), and each t(p)
is an m-tuple.
• The weights may be found using the Hebb rule or the delta rule.
Hetero-associative Memory
Example of hetero-associative memory

• Binary pattern pairs s:t with |s| = 4 and |t| = 2.


• Total weighted input to output unit j:
\[
y\_in_j = \sum_i x_i\, w_{ij}
\]
• Activation function: threshold
\[
y_j = \begin{cases} 1 & \text{if } y\_in_j > 0 \\ 0 & \text{if } y\_in_j \le 0 \end{cases}
\]
• Weights are computed by the Hebbian rule (sum of the outer products
of all training pairs):
\[
W = \sum_{p=1}^{P} s^{T}(p)\, t(p)
\]

• Training samples:
s(p) t(p)
p=1 (1 0 0 0) (1, 0)
p=2 (1 1 0 0) (1, 0)
p=3 (0 0 0 1) (0, 1)
p=4 (0 0 1 1) (0, 1)
Example of hetero-associative memory

1  1 0 1  1 0
       
0 0 0 1 1 0
s T (1)  t (1)   1 0    s T (2)  t ( 2)   1 0   
0 0 0  0 0 0
0 0  0 0 0 
   0    

 0 0 0
0 0 0
       
0 0 0
s T (3)  t (3)    0 1   s ( 4)  t ( 4) 
T  0   0 1   0 0
 0 0 0
 
1 0 1
1  0 1  1  0
  
   1 
2 0
 
1 0
W  Computing the weights
0 1
 
0 2 

Example of hetero-associative memory
Recall:

• x = (1 0 0 0) and x = (0 1 0 0) (similar to s(1) and s(2)):
\[
\begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 2 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 2 \end{bmatrix}
= \begin{bmatrix} 2 & 0 \end{bmatrix}
\;\Rightarrow\; y_1 = 1,\ y_2 = 0
\]
\[
\begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 2 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 2 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \end{bmatrix}
\;\Rightarrow\; y_1 = 1,\ y_2 = 0
\]
• x = (0 1 1 0):
\[
\begin{bmatrix} 0 & 1 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 2 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 2 \end{bmatrix}
= \begin{bmatrix} 1 & 1 \end{bmatrix}
\;\Rightarrow\; y_1 = 1,\ y_2 = 1
\]
• (1 0 0 0) and (1 1 0 0) belong to class (1, 0); (0 0 0 1) and (0 0 1 1) belong to
class (0, 1); (0 1 1 0) is not sufficiently similar to either class.
• The delta rule would give the same or similar results.
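Recall with the threshold activation can be sketched as follows (probe vectors as above; the helper function name is illustrative):

```python
import numpy as np

W = np.array([[2, 0],
              [1, 0],
              [0, 1],
              [0, 2]])

def recall(x, W):
    """Threshold activation: y_j = 1 if the total input is positive, else 0."""
    return (x @ W > 0).astype(int)

print(recall(np.array([1, 0, 0, 0]), W))   # [1 0]  -> class (1, 0)
print(recall(np.array([0, 1, 0, 0]), W))   # [1 0]  -> class (1, 0)
print(recall(np.array([0, 1, 1, 0]), W))   # [1 1]  -> matches neither stored class
```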
Auto-associative memory

• For an auto-associative net, the training input and target output vectors are
identical.
• The process of training is often called storing the vectors, which may be
binary or bipolar.
• The performance of the net is judged by its ability to reproduce a stored
pattern from noisy input; performance is, in general, better for bipolar vectors
than for binary vectors.
Auto-associative memory
• Same as hetero-associative nets, except t(p) = s(p).
• Used to recall a pattern from its noisy or incomplete version
(pattern completion / pattern recovery).
• A single pattern s = (1, 1, 1, -1) is stored (weights computed by Hebbian
rule – outer product)
1 1 1  1
1 1 1  1
W  
1 1 1  1
 1 1  1 1 

training pat. 111  1 W   4 4 4  4  111  1


noisy pat   111  1 W   2 2 2  2  111  1
missing info  0 0 1  1 W   2 2 2  2  111  1
more noisy   1  11  1 W   0 0 0 0 not recognized
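The same test can be sketched in NumPy; the vectors are those from the example, and sign() is used as the bipolar activation (reading an all-zero response as "not recognized" is an assumption of this sketch):

```python
import numpy as np

s = np.array([1, 1, 1, -1])
W = np.outer(s, s)                    # store the single pattern (Hebb rule / outer product)

probes = {
    "training pattern": [ 1,  1,  1, -1],
    "noisy pattern":    [-1,  1,  1, -1],
    "missing info":     [ 0,  0,  1, -1],
    "more noise":       [-1, -1,  1, -1],
}
for name, x in probes.items():
    y_in = np.array(x) @ W
    print(f"{name}: {y_in} -> {np.sign(y_in).astype(int)}")
```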
Auto-associative memory
• The preceding process of using the net can be written more succinctly
as:

• As before, the differences take one of two forms: "mistakes" in the data
or "missing" data.
• The only "mistakes" we consider are changes from +1 to -1 or vice versa.
• We use the term "missing" data to refer to a component that has the value 0,
rather than either +1 or -1.
Iterative Auto-associative memory

• In some cases the net does not respond to an input signal immediately with a
stored target pattern, but the response may be similar enough to a stored pattern
to be fed back into the net as a new input.
• Testing a recurrent auto-associative net: stored vector with second, third and
fourth components set to zero.
• The weight matrix to store the vector (1, 1, 1, -1), with the diagonal terms set to
zero (no self-connections), is
\[
W = \begin{bmatrix} 0 & 1 & 1 & -1 \\ 1 & 0 & 1 & -1 \\ 1 & 1 & 0 & -1 \\ -1 & -1 & -1 & 0 \end{bmatrix}
\]
Iterative Auto-associative memory
• The vector (1,0,0,0) is an example of a vector formed from the stored
vector with three "missing" components (three zero entries).
• The performance of the net for this vector is given next.
• Input vector (1, 0, 0, 0):
• (1, 0, 0, 0).W = (0, 1, 1, -1) >> iterate
• (0, 1, 1, -1).W = (3,2,2, -2) >> (1, 1, 1, -1).
• Thus, for the input vector (1, 0, 0, 0), the net produces the "known" vector
(1, 1, 1, -1) as its response in two iterations.
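A sketch of the iterative recall loop, using the zero-diagonal weight matrix above and feeding each response back as the next input (the iteration cap is an assumed safeguard, not part of the slides):

```python
import numpy as np

s = np.array([1, 1, 1, -1])
W = np.outer(s, s)
np.fill_diagonal(W, 0)               # no self-connections in the iterative net

x = np.array([1, 0, 0, 0])           # stored vector with three components "missing"
for step in range(10):               # cap the number of iterations as a safeguard
    y = np.sign(x @ W).astype(int)
    print(step, x, "->", y)
    if np.array_equal(y, x):         # stable: the response no longer changes
        break
    x = y
```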
Bidirectional Associative Memory (BAM)
• First proposed by Bart Kosko
• Heteroassociative network
• It associates patterns from one set, set A, to patterns from another set,
set B, and vice versa
• Generalize and also produce correct outputs despite corrupted or
incomplete inputs
• Consists of two fully interconnected layers of processing elements
• There can also be a feedback link connecting each node to itself.
Bidirectional Associative Memory (BAM)
• The BAM maps an n-dimensional input vector onto an m-dimensional output vector.

A BAM network (each node may also be connected to itself)
Bidirectional Associative Memory (BAM)
• How does the BAM work?

BAM operation: (a) forward direction; (b) backward direction


• The input vector is applied to the transpose of the weight matrix to produce an output vector.
• Then, the output vector is applied to the weight matrix to produce a new input vector.
• This process is repeated until the input and output vectors become unchanged (i.e., a stable state is reached).
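A generic sketch of this forward/backward iteration (the function name, the sign activation, and the iteration cap are assumptions; W is the BAM weight matrix built in the storage step described below):

```python
import numpy as np

def bam_recall(W, x, max_iters=100):
    """Bounce a probe between the two layers until both vectors stop changing."""
    y = np.sign(W.T @ x)              # forward: input applied to the transpose of W
    for _ in range(max_iters):
        x_new = np.sign(W @ y)        # backward: output applied to W gives a new input
        y_new = np.sign(W.T @ x_new)  # forward again
        if np.array_equal(x_new, x) and np.array_equal(y_new, y):
            break                     # stable state reached
        x, y = x_new, y_new
    return x, y
```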
Bidirectional Associative Memory (BAM)
Basic idea behind the BAM
• Store pattern pairs so that when n-dimensional vector X from set A
is presented as input, the BAM recalls m-dimensional vector Y
from set B, but when Y is presented as input, the BAM recalls X.
Bidirectional Associative Memory (BAM)
The BAM training algorithm
Step 1: Storage. The BAM is required to store M pairs of patterns. For example, we may
wish to store four pairs:

In this case, the BAM input layer must have six neurons and the output layer three neurons.
Bidirectional Associative Memory (BAM)

The weight matrix is determined as the sum of the outer products of all M pattern
pairs (M = number of pairs):
\[
W = \sum_{m=1}^{M} X_m\, Y_m^{T}
\]
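A minimal sketch of the storage step, assuming four bipolar pairs with six-component X vectors and three-component Y vectors as described above; the specific pattern values are illustrative placeholders, not the pairs shown in the original figure:

```python
import numpy as np

# Illustrative bipolar pattern pairs (6 input components, 3 output components)
X = np.array([[ 1,  1,  1,  1,  1,  1],
              [-1, -1, -1, -1, -1, -1],
              [ 1, -1,  1, -1,  1, -1],
              [-1,  1, -1,  1, -1,  1]])
Y = np.array([[ 1,  1,  1],
              [-1, -1, -1],
              [ 1, -1,  1],
              [-1,  1, -1]])

# W is the sum over all M pairs of the outer products X_m Y_m^T
W = sum(np.outer(x, y) for x, y in zip(X, Y))
print(W.shape)   # (6, 3): six input neurons, three output neurons
```

This W could then be handed to the recall sketch shown earlier to test the forward and backward directions.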
Bidirectional Associative Memory (BAM)
• Step 2: Testing. The BAM should be able to receive any vector from set A
and retrieve the associated vector from set B, and receive any vector from set
B and retrieve the associated vector from set A. Thus, first we need to
confirm that the BAM is able to recall the stored Y pattern when presented with
the corresponding X pattern. That is,

For instance
Bidirectional Associative Memory (BAM)
• Then, we confirm that the BAM recalls the stored X pattern when presented with
the corresponding Y pattern. That is,

For instance
Bidirectional Associative Memory (BAM)
• Step 3: Retrieval. Present an unknown vector (probe) X to the BAM and
retrieve a stored association. The probe may be a corrupted or incomplete
version of a pattern from set A (or from set B) stored in the BAM. That is,

• Repeat the iteration until equilibrium, when input and output vectors remain unchanged
with further iterations. The input and output patterns will then represent an associated
pair.
• The BAM is unconditionally stable (Kosko, 1992). This means that any set of
associations can be learned without risk of instability. This important quality
arises from the BAM using the transpose relationship between the weight matrices in
the forward and backward directions.
Let us now return to our example. Suppose we use vector X as a probe. It
represents a single error compared with the pattern from set A:

This probe applied as the BAM input produces the output vector Y1 from set B.
The vector Y1 is then used as input to retrieve the vector X1 from set A. Thus,
the BAM is indeed capable of error correction.
