
Notes on the paper Long Term Memory Storage Capacity of Multiconnected Neural Networks

Amir Hesam Salavati E-mail: saloot@gmail.com August 8, 2011

Summary

In [1] the authors extend the concept of the Hopfield associative memory to neural networks built from higher-order neurons. The state of such a neuron depends not only on the states of its neighbors but also on their second- and higher-order correlations. The authors argue that higher-order neurons are biologically meaningful: the brain is densely packed with neurons in close vicinity of one another, and under such conditions a neuron can affect the synaptic weights of other connections in addition to its own. Using this model, the authors show that the storage capacity of Hopfield networks can be improved. More specifically, they obtain storage capacities that are polynomial in $N$ with exponents linear in $p$, the order of the network; in other words, $M = O(N^{p-2})$. Note, however, that the number of stored patterns is still proportional to the number of synapses: for a network of size $N$, we get more synapses by considering higher-order correlations.

Rate: 10+/10

Model and Method


A network of higher-order neurons is called a multiconnected network in this paper. [important] In a multiconnected network, the weight $w_{ij}$ between neurons $i$ and $j$ may also depend on the state of neuron $k$, denoted by $s_k$. Such dependencies are captured by

$$w_{ij} \to w_{ij} + w_{ijk} s_k$$

As a result, the simple update rule of normal neural networks becomes equivalent to that of higher-order neurons:

$$s_i = f\Big(-\theta + \sum_j w_{ij} s_j + \sum_{j,k} w_{ijk} s_j s_k + \dots\Big) \qquad (1)$$

where $f(\cdot)$ is a non-linear function and $\theta$ is the firing threshold. Figure 1 illustrates the schematics of a higher-order neuron.

Figure 1: Schematics of a higher-order synapse [1].
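As a rough illustration of update rule (1), the following Python sketch computes one synchronous update for a network truncated at third order. The function name `update`, the choice of the sign function for $f$, the subtraction of the threshold, and the NumPy array layout are assumptions made for this sketch, not details taken from [1].

```python
import numpy as np

def update(s, w2, w3, theta=0.0):
    """One synchronous update of a network with second- and third-order weights.

    s  : (N,) array of +/-1 states
    w2 : (N, N) array, w2[i, j] = w_ij
    w3 : (N, N, N) array, w3[i, j, k] = w_ijk
    """
    # Linear input sum: -theta + sum_j w_ij s_j + sum_{j,k} w_ijk s_j s_k
    h = -theta + w2 @ s + np.einsum("ijk,j,k->i", w3, s, s)
    # f is assumed to be the sign function; ties are mapped to +1
    return np.where(h >= 0, 1, -1)

# Hypothetical usage with random weights
rng = np.random.default_rng(0)
N = 8
s = rng.choice([-1, 1], size=N)
w2 = rng.normal(size=(N, N))
w3 = rng.normal(size=(N, N, N))
s_next = update(s, w2, w3)
```

Setting `w3` to zero recovers the usual update of a network with binary connections only, which makes the third-order term easy to switch on and off when comparing the two models.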

The overall problem is the same as that of the Hopfield associative memory: we have $M$ binary patterns of length $N$, denoted by $X = \{x_i^\mu\}$, which we would like to memorize using multiconnected neural networks. The update rule is given by equation (1). [disadvantage][question] The authors have assumed that the absolute values of the higher-order connection weights scale with those of the second-order ones, i.e.

$$|w_{ijk}| \sim \lambda |w_{ij}| \qquad (2)$$

[very important] With the assumption $\theta = 0$, the learning rule for the weights is an extended version of the Hopfield learning rule:

$$w_{ij} = \langle s_i s_j \rangle, \quad w_{ijk} = \lambda \langle s_i s_j s_k \rangle, \quad \dots, \quad w_{ijk \dots p} = \lambda^{p-2} \langle s_i s_j s_k \dots s_p \rangle \qquad (3)$$

where $\langle \cdot \rangle$ denotes averaging over all patterns in the training set. Using an analysis similar to that of Hopfield, the authors break down the linear input sum of each neuron into a desired term and an interference term. Assuming the patterns to be stochastically independent, the interference term becomes a Gaussian random variable as a consequence of the central limit theorem. Therefore, the stability condition reduces to requiring that this Gaussian random variable be smaller in magnitude than the desired term. Requiring the probability of violating this condition to be small gives the maximum number of patterns that can be stored.
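The learning rule and the stability argument above can be sketched numerically for a network truncated at order three. The helper names `learn` and `is_stable`, the sign activation, and the values of `N`, `M`, and `lam` are illustrative assumptions rather than settings from [1]; the load is kept far below capacity so that the desired term dominates the interference term.

```python
import numpy as np

def learn(patterns, lam):
    """Extended Hebbian rule, truncated at order three.

    patterns : (M, N) array with +/-1 entries
    Returns w2[i, j] = <s_i s_j> and w3[i, j, k] = lam * <s_i s_j s_k>,
    where <.> is the average over the M stored patterns.
    """
    M = patterns.shape[0]
    w2 = np.einsum("mi,mj->ij", patterns, patterns) / M
    w3 = lam * np.einsum("mi,mj,mk->ijk", patterns, patterns, patterns) / M
    return w2, w3

def is_stable(s, w2, w3):
    """True if s is a fixed point of rule (1) with theta = 0 and f = sign."""
    h = w2 @ s + np.einsum("ijk,j,k->i", w3, s, s)
    # Stable when every input sum has the same sign as the stored state
    return bool(np.all(h * s > 0))

rng = np.random.default_rng(0)
N, M, lam = 40, 4, 0.1          # illustrative sizes, well below capacity
patterns = rng.choice([-1, 1], size=(M, N))
w2, w3 = learn(patterns, lam)
stable = [is_stable(p, w2, w3) for p in patterns]
```

Raising `M` while keeping `N` fixed makes the interference term grow until some stored patterns stop being fixed points, which is the effect the Gaussian analysis quantifies.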

Results
[very important][advantage][good for report] For $\lambda \leq 1/N$, the capacity is $(1 + \lambda N)^2$ times that of Hopfield networks.

p2 [very important][advantage][good for report] For $\lambda > 1/\sqrt{N}$, the capacity is 2p1N (p1/2) times that of Hopfield networks. Here, $p$ is the order of the network.

In the above expression, note that the exponent is linear in $p$. [important] For networks with order two and three, the pattern retrieval capacity is:

$$M \approx \frac{(1 + \lambda N)^2 \, M_{\text{Hopfield}}}{1 + 2^{5/2} \pi^{-1/2} \lambda N^{1/2} + 3 \lambda^2 N} \qquad (4)$$

This formula is valid over the whole range of values of $\lambda$. [important][idea] An interesting point to note is that the number of stored patterns is still proportional to the number of synapses: for a network of size $N$, we get more synapses by considering higher-order correlations. It has also been found that the proportionality factor between the number of stored bits and the number of synapses is a decreasing function of the synaptic order.
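To get a feel for equation (4), the sketch below evaluates the capacity ratio $M / M_{\text{Hopfield}}$ in a few regimes. It assumes that equation (4) reads $(1 + \lambda N)^2 / (1 + 2^{5/2} \pi^{-1/2} \lambda N^{1/2} + 3 \lambda^2 N)$; the function name `capacity_ratio` and the sample values of `lam` and `N` are hypothetical choices for illustration.

```python
import math

def capacity_ratio(lam, N):
    """M / M_Hopfield for a network of order two and three, per equation (4)."""
    num = (1 + lam * N) ** 2
    den = (1
           + 2 ** 2.5 / math.sqrt(math.pi) * lam * math.sqrt(N)
           + 3 * lam ** 2 * N)
    return num / den

N = 10_000
r_zero = capacity_ratio(0.0, N)       # no third-order weights: the ratio is 1
r_small = capacity_ratio(1.0 / N, N)  # roughly (1 + lam * N)^2 = 4
r_large = capacity_ratio(1.0, N)      # approaches N / 3 when lam^2 * N >> 1
```

Under this reading, the large-$\lambda$ limit grows linearly in $N$ relative to the Hopfield capacity, consistent with the polynomial gain described in the Summary for $p = 3$.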

Ideas and Facts That Might Help Our Research


[very important][idea] A very important idea, mentioned towards the end of the paper, is the effect of partitioning the memory, i.e. breaking the whole network into smaller parts, each of which acts as a normal associative memory. A pattern is then a combination of sub-patterns. The authors have noted that:

1. [very important] Partitioned associative memories have capacities exponential in the network size ($2^{\alpha N}$ for some $\alpha$ on the order of 0.03).
2. [very important] Partitioning is biologically relevant, since it is observed that in the human brain 50 to 100 neurons form a microcolumn and a group of 50 to 100 microcolumns constitutes a column. A column is also linked to approximately 100 other columns, which are not necessarily nearby.

[very important][good for report] The authors claim that higher-order synapses are indeed a realistic assumption: Electron micrographs reveal that the cortical neurons are packed in tightly intertwined bundles of fibers (Roney et al. 1979; Shaw et al. 1982). The assumption that neurons are linked only by binary connections is too simple. Synapses can modify the membrane potentials of other synapses as well as those of dendrites.
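The exponential capacity of partitioned memories noted in item 1 above can be made plausible with a simple counting argument: if each module stores a fixed number of sub-patterns and a composite pattern picks one sub-pattern per module, the number of composite patterns is exponential in the number of modules. The sketch below works this out; the module size of 100 neurons and the 8 sub-patterns per module are illustrative assumptions, not figures from [1], and the function names are hypothetical.

```python
import math

def composite_patterns(N, module_size, patterns_per_module):
    """Number of composite patterns when N neurons are split into equal modules.

    Each of the N // module_size modules independently stores
    patterns_per_module sub-patterns; a composite pattern is one
    sub-pattern per module.
    """
    n_modules = N // module_size
    return patterns_per_module ** n_modules

def partition_exponent(module_size, patterns_per_module):
    """Exponent alpha such that the count above equals 2 ** (alpha * N)."""
    return math.log2(patterns_per_module) / module_size

count = composite_patterns(N=1000, module_size=100, patterns_per_module=8)
alpha = partition_exponent(module_size=100, patterns_per_module=8)
```

With these assumed numbers the exponent has the same order of magnitude as the value quoted in item 1, but the counting ignores how reliably each module retrieves its sub-pattern.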

[disadvantage][question] The authors have assumed that the absolute values of the higher-order connection weights scale with those of the second-order ones, i.e.

$$|w_{ijk}| \sim \lambda |w_{ij}| \qquad (5)$$

[very important][idea] With the assumption $\theta = 0$, the learning rule for the weights is an extended version of the Hopfield learning rule:

$$w_{ij} = \langle s_i s_j \rangle, \quad w_{ijk} = \lambda \langle s_i s_j s_k \rangle, \quad \dots, \quad w_{ijk \dots p} = \lambda^{p-2} \langle s_i s_j s_k \dots s_p \rangle \qquad (6)$$

where $\langle \cdot \rangle$ denotes averaging over all patterns in the training set.

[idea][My opinion]: I think that, in order to fully utilize the power of higher-order neurons, not only should the learning rule be based on higher-order correlations, but the neural update rule itself should also look like (1). If the neural update rule is the same as the traditional one, then even with the best learning rule, which is the pseudo-inverse rule, one is unable to learn more than a linear number of patterns.

[advantage] Using the proposed scheme, the authors obtain polynomial storage capacities with exponents linear in $p$, the order of the network.

[idea][important] The authors have also investigated the effect of noise during the learning phase.

Introductory Parts and Related Works


The "long term memory" in the title refers to associative memory.

References
[1] P. Peretto, J. J. Niez, Long term memory storage capacity of multiconnected neural networks, Biological Cybernetics, Vol. 54, No. 1, 1986, pp. 53-63.