Abstract—Based on the previous work of a number of authors, we discuss an important class of neural networks which we call multi-associative neural networks (MANN's) and which associate one pattern with multiple patterns. As a computationally efficient example of such networks, we describe a specific MANN, that is, a multi-associative, dynamically generated variant of the counterpropagation network (MCPN). As an application of MANN's, we design a general system that can learn and retrieve complex spatio-temporal sequences with any MANN. This system consists of comparator units, a parallel array of MANN's, and delayed feedback lines from the output of the system to the neural network layer. During learning, pairs of sequences of spatial patterns are presented to the system and the system learns to associate patterns at successive times in sequence. During retrieving, a cue sequence, which may be obscured by spatial noise and temporal gaps, causes the system to output the stored spatio-temporal sequence. We prove analytically that this system is capable of learning and generating any spatio-temporal sequence within the maximum complexity determined by the number of embedded MANN's, with the maximum length and number of sequences determined by the memory capacity of the embedded MANN's. To demonstrate the applicability of this general system, we present an implementation using the MCPN. The system shows desirable properties such as fast and accurate learning and retrieving, and the ability to store a large number of complex sequences consisting of nonorthogonal spatial patterns.

Index Terms—Auto-associative, hetero-associative, multi-associative, neural network, noise, spatio-temporal sequence.

Manuscript received January 6, 1997; revised April 10, 1998. This work was supported by the Australian Research Council and Deakin University. The author was with the School of Computing and Mathematics, Deakin University, Victoria 3168, Australia. He is now with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798 (e-mail: elpwang@ntu.edu.sg). Publisher Item Identifier S 1083-4419(99)00900-0.

I. INTRODUCTION

A HETERO-associative neural network (HANN) associates a spatial pattern A with another pattern B, which may or may not be the same as pattern A, whereas an auto-associative neural network (AANN) associates a spatial pattern with itself, i.e., B = A in an AANN. For example, the multilayer perceptron network [40], the counterpropagation network [25], and the bidirectional associative memory [32] are HANN's, whereas the Hopfield network [27] is an AANN.

One pattern may often be associated with many patterns. For example [7], an image of a banana may be associated with not only a banana, but also a yellow object, a fruit, an oblong object, etc. Let us call a neural network that associates one spatial pattern with multiple patterns a multi-associative neural network (MANN). Hirai [26] proposed a network that associates a pattern with a collection of patterns. Hagiwara [22] and coworkers [37] extended the bidirectional associative memory [32] to store associations among multiple patterns. These networks used covariance (inner-product) learning rules similar to that in the Hopfield network [27] and therefore have limited memory capacity. Owens et al. [38] attempted to reduce training time and improve the classification capability of the multilayer perceptron by incorporating multiple output layers. In this paper, we discuss the MANN in general, as well as a computationally efficient exemplar MANN. As a demonstration of the applicability of MANN's, we present the design of a system that is capable of learning and retrieving complex spatio-temporal sequences using any static MANN. A specific implementation of this general system shows a number of desirable advantages over existing spatio-temporal systems.

There has been extensive research activity in learning and generating temporal phenomena with artificial neural networks. The main motivation for creating and studying these models is to build intelligent artificial systems that mimic or even surpass certain aspects of biological intelligence. Spatio-temporal phenomena are particularly interesting due to their abundance in both nature and practical applications. For instance, a particular spatio-temporal scene triggers a response sequence in a robot. The survival of many animals also depends on the ability to learn and produce spatio-temporal sequences, e.g., a prey escapes from a predator with a series of maneuvers.

Grossberg [14]–[20] proved mathematically that his Avalanche model is capable of learning spatio-temporal sequences after an infinite number of presentations of the training sequences; however, the model, presented in the form of delayed partial-differential-difference equations, is computationally expensive, and its practical capabilities to effectively generate spatio-temporal sequences, such as the ones discussed in the present paper, are yet to be demonstrated. Fukushima's [13] spatio-temporal system consists of a number of binary neurons connected with delayed Hebb-type [24] covariance (inner-product) synapses [1]. This system required many iterations for sequence retrieval and retrieved nonorthogonal patterns with difficulty. Images generated by this system are often obscured by noise, which may be attributed to spurious memories characteristic of covariance (inner-product) learning rules [1], [27]. Time delays have been used in Hopfield networks [27], [28] to generate time-dependent sequences
of spatial patterns [30], [41], [42] and to process speech signals [45]. Kosko [32] proposed the bidirectional associative memory that consists of interconnected networks and is able to produce spatio-temporal sequences. Buhmann and Schulten [5] used stochastic noise to induce transitions between spatial patterns in Hopfield networks, and these transitions formed spatio-temporal sequences. These systems also use Hebb-type learning rules and thus have rather low memory capacity and spurious memories.

Guyon et al. [21] presented a spatio-temporal system that required a priori analytical expressions of all stored sequences. Other mechanisms for temporal sequence generation are time-dependent [10], [39], asymmetric [8], [36], and diluted higher-order [50], [51], [56], [57] synaptic interactions.

Bradski et al. [2]–[4] coupled two ART networks [7] to form sequence-producing systems (also see [23]). The capabilities of these networks in handling complex sequences, where one spatial pattern in a sequence may be followed by different spatial patterns, have not been mathematically proven or clearly substantiated with examples such as the ones given in the present paper. D. Wang and Yuwono [49] recently proposed a novel network to generate complex sequences [47], [48]. They proved that the model can learn to generate any complex sequences within a certain limit determined by the network architecture. Like many other sequence processing systems [59], their network handles sequences of symbols (scalars) rather than spatial patterns (vectors). To learn and retrieve spatio-temporal sequences, their network requires preprocessing to transform sequences of spatial patterns to sequences of symbols as network inputs, and post-processing to transform sequences of symbols to sequences of spatial patterns as network outputs. This work bears some resemblance to a concurrent effort by Nigrin [35]. These networks memorize sequences with the competitive learning algorithm used in the ART networks and anticipate upcoming components in a sequence with feedback connections. They use decreasing activation in neurons to represent order information in a sequence. Nigrin's network classifies sequences (also see [7]), rather than generates sequences as the network presented in [49] does. In this paper, we confine our discussion to the storage and retrieval of spatio-temporal sequences. Extension of the present work to sequence classification and discrimination will be the subject of future studies.

Many authors, inspired by Elman [12] and Jordan [29], incorporated time-delayed feedbacks into backpropagation networks [40] to form recurrent networks (for a review, see [44]). These networks were used in processing temporal speech signals [33], [46]. DeVries and Principe [9] proposed an interesting temporal model combining existing approaches such as backpropagation, time-delays, and Grossberg's work. Both backpropagation and recurrent networks have long training times, which makes real-time learning very difficult.

Despite intensive research activities in learning and retrieving spatio-temporal sequences with neural networks, various difficulties, such as slow and inaccurate learning and retrieving, strict orthogonality requirements, limited memory capacity, and difficulties in dealing with complex sequences, remain in the existing approaches. Furthermore, these existing networks for spatio-temporal sequence generation are all based on some specific learning rules and network architectures. In this paper, as an application of MANN's, we propose a general framework for learning and generating spatio-temporal sequences with any arbitrary MANN, which may be based on any arbitrary architecture and learning algorithm. To demonstrate the applicability of this general system, we present an implementation using a specific MANN, that is, our multi-associative, dynamically generated variant of the counterpropagation network [25], and we show that this system significantly improves the efficiency of spatio-temporal sequence learning and generation in comparison with other existing systems. The present work also increases the capability of dealing with complex sequences when compared to our previous work based on AANN's and HANN's [53], [54], as evidenced by the Theorem in Section III and the example given in Tables I–III.

This paper is organized as follows. In Section II, we discuss general features of MANN's and provide a computationally efficient exemplar MANN based on the counterpropagation network. As an application of MANN's, we describe in Section III the architectural design and the operating mechanisms of our general system for learning and generating spatio-temporal sequences, together with analytical discussions on the maximum complexity of the sequences learned by the system and the memory capacity of the system (the maximum lengths and number of sequences stored). An implementation of the general system, as well as the learning and the retrieval of some explicit examples, will be described in Section IV. We end the paper with some closing remarks in Section V.

II. MULTIASSOCIATIVE NEURAL NETWORKS AND AN EXAMPLE

As defined in the previous section, a multi-associative neural network (MANN) associates one pattern with multiple patterns. There are many situations where multi-associations may occur. For instance, a facial image may be associated with not only the person's name, but also a friend, a humorous person, etc. A car may be associated with a red object, a tool for transport, or even a traffic hazard! Although one pattern may be associated with a set of patterns, which we call the association set of this pattern, we assume that the network outputs only one pattern in its association set at any given instance of time, and a signal external to the MANN determines which pattern in the association set is the overall output of the MANN.

We assume that there are two separate input channels in a MANN, i.e., the conditioned stimulus (CS) and the unconditioned stimulus (US) channels, in analogy with classical conditioning [31], that is, after repeated presentations of a US together with a CS, the CS alone can generate the response caused by the US. During learning, paired CS and US input patterns are presented to a MANN through the CS channel and the US channel, respectively, and the MANN learns the association between the CS and the US patterns in each pair. After learning, each CS pattern may be associated with multiple US patterns.
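Before turning to the MCPN itself, it is convenient to state the multi-association abstractly; the notation below is introduced here purely for illustration and is not taken from the network equations.

```latex
% A MANN maps a CS pattern x to its association set S(x) of US patterns;
% an external control signal c then selects the single member that
% becomes the overall output of the network.
\mathrm{MANN}\colon\; \mathbf{x} \;\longmapsto\; S(\mathbf{x}) =
  \{\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_{q(\mathbf{x})}\},
\qquad \mathrm{output}(\mathbf{x}, c) = \mathbf{y}_c \in S(\mathbf{x}).
```

In this notation, an HANN is the special case q(x) = 1 for every stored pattern x, and an AANN is the further special case S(x) = {x}.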
Initially, the network has input neurons and one group of associative neurons, but no competitive neurons or weights. When the first pair of CS and US training patterns is presented to the network, the first competitive neuron is generated, with its incoming weights being the CS training pattern and the outgoing weights connecting this neuron to all associative neurons being the US training pattern. It will become clearer that each competitive neuron may be connected to more than one group of associative neurons to allow for multi-associations. Let w_ij denote the outgoing weight vector connecting the ith competitive neuron with the jth group of associative neurons; as shown below, it is actually the jth US pattern associated with the CS pattern represented by the incoming weight vector of the ith competitive neuron.

For each subsequent pair of CS–US associations, a competition is carried out in the competitive layer and the neuron whose incoming weights are the most similar to the CS training pattern is located. If the similarity between the CS training pattern and the weight vector of this winning neuron is below a vigilance threshold [6], which signifies that the CS training pattern belongs to a category that has not been learned, a new competitive neuron is generated in the same way in which the first competitive neuron was generated, namely, with its incoming and outgoing weight vectors being the CS and the US training vectors, respectively.

If the similarity between the CS training pattern and the incoming weight vector of this winning neuron is above the vigilance threshold, which signifies that the CS training pattern belongs to a learned category, the incoming weight vector of the winning neuron is modified according to (1). The outgoing weight vector of this neuron is then compared with the training US pattern, if the winning neuron is connected to only one set of associative neurons, that is, if the CS pattern represented by the winning neuron has been associated with only one US pattern. If the winning neuron is connected to multiple groups of associative neurons, namely, if the CS pattern represented by the winning neuron has been associated with multiple US patterns, then only the outgoing weight vector that is most similar to the training US pattern is considered. If the similarity between this outgoing weight vector and the training US pattern is over another vigilance threshold, which signifies that the training US vector belongs to a learned category, then this outgoing weight vector is updated according to (2). If the similarity between the US training pattern and the most similar outgoing weight vector of the winning neuron is below the vigilance threshold, a new group of associative neurons is created and the weight vector connecting these new associative neurons to the winning competitive neuron is set to be the US training pattern, thereby establishing a new multi-association (Fig. 2).

In (2) and (3), we choose a special type of learning rate [52], [59]:

a_i = 1/n_i,   b_ij = 1/m_ij   (4)

where n_i and m_ij are the numbers of times that the incoming and outgoing weights of neuron i have been modified, respectively, so that the incoming and outgoing weights connected to neuron i are exactly the overall averages of the CS and the US training patterns used to modify the weights of this neuron [52], [59].

After training, each competitive neuron in the network is connected to at least one group of associative neurons. Some competitive neurons may be connected to multiple groups of associative neurons, if the CS patterns which these competitive neurons represent have been associated with multiple US patterns during training (Fig. 2). The associative neurons in the multi-associative layer of the network are connected to the output layer of the network. When a CS input pattern is presented during retrieving, a winner-take-all competition is carried out in the competitive layer. If the similarity between the input CS pattern and the incoming weights of the winning competitive neuron is above the vigilance threshold, the US association set represented by the outgoing weights of this winning competitive neuron is activated. An external control signal permits only one group of associative neurons to be active at any instance of time, and the US pattern represented by the active group of associative neurons becomes the overall output of the network. If the similarity between the input CS pattern and the incoming weights of the winning competitive neuron is below the vigilance threshold, the network outputs a "don't know" answer.
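The training and retrieval procedures above can be summarized in a short sketch. The following Python fragment is a minimal illustration rather than the exact implementation: cosine similarity stands in for the similarity measure used with (1)–(3), the vigilance threshold values and all identifier names are ours, and the running-average updates realize the learning rates chosen in (4).

```python
import numpy as np

def sim(a, b):
    """Similarity between two patterns; cosine similarity is an assumed choice."""
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

class MCPN:
    """Minimal sketch of the multi-associative counterpropagation network.
    Each dynamically generated competitive neuron stores an incoming CS
    weight vector and one or more outgoing US weight vectors (its US
    association set).  Patterns are real-valued vectors."""

    def __init__(self, theta_cs=0.9, theta_us=0.9):  # vigilance thresholds (names ours)
        self.theta_cs, self.theta_us = theta_cs, theta_us
        self.neurons = []  # each: {"w": CS weights, "n": count, "out": [US weights], "m": [counts]}

    def learn(self, cs, us):
        if self.neurons:
            winner = max(self.neurons, key=lambda nd: sim(cs, nd["w"]))  # competition
            if sim(cs, winner["w"]) >= self.theta_cs:        # CS belongs to a learned category
                winner["n"] += 1
                winner["w"] = winner["w"] + (cs - winner["w"]) / winner["n"]  # running average, cf. (4)
                j = max(range(len(winner["out"])), key=lambda k: sim(us, winner["out"][k]))
                if sim(us, winner["out"][j]) >= self.theta_us:  # US belongs to a learned category
                    winner["m"][j] += 1
                    winner["out"][j] = winner["out"][j] + (us - winner["out"][j]) / winner["m"][j]
                else:  # new group of associative neurons: a new multi-association
                    winner["out"].append(us.astype(float))
                    winner["m"].append(1)
                return
        # unfamiliar CS pattern: generate a new competitive neuron
        self.neurons.append({"w": cs.astype(float), "n": 1,
                             "out": [us.astype(float)], "m": [1]})

    def recall(self, cs):
        """Return the US association set of the winner, or None ("don't know")."""
        if not self.neurons:
            return None
        winner = max(self.neurons, key=lambda nd: sim(cs, nd["w"]))
        return winner["out"] if sim(cs, winner["w"]) >= self.theta_cs else None
```

During retrieval of multi-associations, an external control signal would select a single member of the returned association set; in this sketch the whole set is returned and the caller plays that role.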
In this section, we have discussed in general an important class of neural networks, i.e., the multi-associative neural networks (MANN's), and we have built a multi-associative, dynamically generated variant of the counterpropagation network (MCPN) as an example of MANN's. In the subsequent sections, as an application of MANN's, we will propose a general system for learning and retrieving spatio-temporal sequences with any MANN, and we will implement this general system with the MCPN.

III. A GENERAL SPATIO-TEMPORAL SYSTEM WITH MULTIASSOCIATIVE NEURAL NETWORKS

Spatio-temporal sequences in the present paper denote time-dependent sequences in which each frame or element at any given time is a spatial pattern. Throughout this paper, we treat time as a discrete entity. Fig. 3 shows three examples of such sequences, which we have created to demonstrate the functionality of the system to be proposed. Note that in Fig. 3 pattern I, which is the same as pattern 1, appears in both sequences a) and b), one pattern in sequence a) is somewhat similar to pattern 6 in sequence b), and pattern J appears more than once within sequence c). Hence these sequences are complex sequences in which one spatial pattern may be followed by multiple patterns. The cyclic nature of the sequences is not required in order to use our model. We are interested
Fig. 3. Three examples of spatio-temporal sequences. Note the complexities in the sequences: pattern I appears in both sequence a) and sequence b), and pattern J appears more than once in sequence c).

TABLE I
THE CS AND US INPUTS TO EACH MANN, AS WELL AS THE OVERALL CS AND US INPUTS TO THE SYSTEM, AS A SYSTEM OF THREE EMBEDDED MANN'S LEARNS THE SEQUENCE REFEREE

There are two stages of operation for the present spatio-temporal system: the learning stage, at which the system learns spatio-temporal sequences, and the retrieving stage, at which the system retrieves sequences after being presented with cues. Learning and retrieving may be mixed; however, we assume that only one operation occurs at a given instance of time. We will first present some formal discussions on the learning and retrieving mechanisms of the system, which will then be made clearer with some special exemplar sequences and a specific implementation of the general design.

At the learning stage, pairs of sequences of spatial patterns are presented to the CS and the US input channels of the system simultaneously, one spatial pattern at each time step. The two sequences in each training pair may be the same, or one may be a variation of the other. If there is an external input to the CS input channel of the system, the system simply outputs these external signals, regardless of the outputs from the comparator units. Since there are always CS inputs to the system during learning, it is equivalent to disabling the comparator units at the learning stage, that is, the comparator units are used only during retrieval.

The CS sequence in each training pair is fed into the CS input channel and is then directed into each embedded MANN through appropriate delayed feedback (Fig. 4). The US sequence in the pair, the expected output of the system corresponding to the CS sequence, is fed into the US input channel and is then directed into each embedded MANN without any delays.

Consider a spatio-temporal sequence of length L, i.e., p(1)-p(2)-...-p(L). Suppose the length of the subsequence necessary to unambiguously determine spatial pattern p(t) in the sequence is d(t), which is called the degree of pattern p(t) [49]. The maximum degree for the entire sequence, i.e., d = max{d(1), d(2), ..., d(L)}, is called the degree of this sequence. For example, in the sequence REFEREE, the subsequence ERE is required to unambiguously determine the last E, hence the degree of the last E is 3 [49]. One can verify that the degree of this sequence is also 3.
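The degree can be computed mechanically. The sketch below follows our reading of the definition in [49]; treating the start of the sequence as a unique boundary marker, so that early components are always determined, is our own convention.

```python
def degree(seq):
    """Degree of a sequence: for each component, find the length of the
    shortest immediately preceding subsequence that uniquely determines it;
    the degree of the sequence is the maximum over all components.
    Assumes the marker "^" does not occur in seq."""
    s = ["^"] + list(seq)                  # index 0 holds a unique start marker
    L = len(s) - 1
    degs = [1]                             # take the first component's degree as 1
    for t in range(2, L + 1):
        for k in range(1, t + 1):          # context length; k = t reaches the marker
            ctx = tuple(s[t - k:t])
            # successors of every occurrence of this context in the sequence
            succ = {s[u] for u in range(k, L + 1) if tuple(s[u - k:u]) == ctx}
            if len(succ) == 1:             # the context determines the next pattern
                degs.append(k)
                break
    return max(degs)

print(degree("REFEREE"))  # prints 3: the last E needs the subsequence ERE
```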
When a spatio-temporal sequence p(1)-p(2)-...-p(L) is presented simultaneously to the CS input and the US input channels of the system during learning, the overall system CS input and the overall system US input are the same as the US inputs for each MANN at all times. Learning does not occur for the first MANN, i.e., MANN 1, until t = 2, since there is no CS input for MANN 1 at t = 1 due to the time-delay in the CS pathway of MANN 1 (Fig. 4). Similarly, MANN k does not learn until t = k + 1. The CS input pattern for MANN 1 at t = 2 is p(1), whereas the US input pattern to MANN 1 is p(2). Hence MANN 1 learns to associate CS pattern p(1) with US pattern p(2) at t = 2. At t = 3, MANN 1 learns to associate CS pattern p(2) with US pattern p(3) and MANN 2 learns to associate CS pattern p(1) with US pattern p(3), and so on. Once this sequence is presented to the system, other sequences can be presented to the system in exactly the same way.

At the retrieving stage after learning, a cue sequence, which is a small piece of a stored sequence and which may or may not be obscured by spatial noise and/or temporal gaps, is presented to the system through the CS input channel only, one spatial pattern at each time step. The comparator units are again disabled during the presentation of the cue sequence, because there exist overall CS inputs to the system. The US input channel of the system is not used during retrieving. After the last pattern in the cue sequence has been presented, the comparator units are made functional and their task is to find and to output the common pattern existing in all of the activated US association sets of the MANN's at each subsequent time step. Suppose the length of the cue sequence, i.e., the number of spatial patterns in the cue sequence excluding any temporal gaps, is l, and at time t, m MANN's have CS input patterns, where m <= K and we recall that K is the number of embedded MANN's. Suppose MANN k receives a CS input pattern, which has been associated with a set of US patterns S_k during training, where k = 1, 2, ..., m. If the cue sequence is a part of a training sequence, with each spatial pattern having a tolerable noise level, and the cue sequence is sufficient to unambiguously determine the next spatial pattern in the sequence, then the following must hold: i) this next pattern must be in every US association set S_k, with k = 1, 2, ..., m; and ii) there must not exist any other spatial pattern common to every US association set. This result is not difficult to prove, since i) is a direct consequence of the assumptions that the cue sequence is a part of a training sequence and that each spatial pattern in the cue sequence has a tolerable noise level. In addition, the violation of ii) would contradict the assumption that the cue sequence is sufficiently long to unambiguously determine the next pattern. We therefore obtain the following theorem concerning the complexity of the spatio-temporal sequences that can be learned and generated by the present general system:

Theorem: A system shown in Fig. 4 with K embedded MANN's is able to learn and generate any spatio-temporal sequence of degree d <= K.

Proof: The proof is similar to the arguments in the previous paragraph. After presenting a sequence to the CS and the US channels simultaneously during training, MANN k learns to associate each spatial pattern in the sequence with the pattern(s) k steps later in the sequence. During retrieving, if a cue sequence with length d or longer is presented to the system, d being the degree of the sequence, the next spatial pattern in the sequence following the last pattern in the cue sequence must be present in all association sets in the MANN's, as long as K >= d. Furthermore, this pattern must be the only common pattern present in all association sets in the MANN's, since the cue sequence is sufficiently long to unambiguously determine the next pattern in the sequence. One can infer that the entire sequence can be retrieved by this cue sequence.

Note that the mathematical analysis of our system is much less complicated compared to that of Wang and Yuwono [49]. Let us consider the example used by Wang and Yuwono [49], the sequence REFEREE; however, we assume that each letter in the sequence is a spatial pattern (vector), rather than a symbol (scalar) as assumed by [49]. Since the degree of the sequence is 3 as discussed above, we will use three embedded MANN's in the system.
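The learning and retrieval mechanisms of the general system can be illustrated with a short, self-contained sketch. Here a dictionary-based class stands in for an arbitrary trained MANN, single characters stand for the spatial patterns, the comparator units are realized as a set intersection, and all names are ours.

```python
class DictMANN:
    """Stand-in for any trained MANN: maps a CS pattern to its US association
    set.  Single characters stand for spatial patterns here."""
    def __init__(self):
        self.assoc = {}
    def learn(self, cs, us):
        self.assoc.setdefault(cs, set()).add(us)
    def recall(self, cs):
        return self.assoc.get(cs, set())          # empty set means "don't know"

def train(manns, seq):
    # MANN k (k = 1..K) receives, through its delayed feedback line, the
    # pattern k steps earlier as CS and the current pattern as US.
    for t, us in enumerate(seq):
        for k, m in enumerate(manns, start=1):
            if t - k >= 0:
                m.learn(seq[t - k], us)

def retrieve(manns, cue, steps):
    out = list(cue)
    for _ in range(steps):
        # comparator units: the pattern common to all activated association sets
        sets = [m.recall(out[-k]) for k, m in enumerate(manns, start=1)
                if k <= len(out)]
        common = set.intersection(*sets)
        if len(common) != 1:                      # ambiguous or unknown cue
            break
        out.append(common.pop())
    return "".join(out)

manns = [DictMANN() for _ in range(3)]            # K = 3 >= degree of REFEREE
train(manns, "REFEREE")
print(retrieve(manns, "REF", steps=4))            # prints REFEREE
```

With K = 3 embedded MANN's storing the degree-3 sequence REFEREE, the length-3 cue REF regenerates the whole sequence, in accordance with the Theorem; an ambiguous or unknown cue yields a multi-member or empty intersection, so the sketch stops rather than guessing. In the implementation described below, the DictMANN stand-in would be replaced by the MCPN operating on 11 × 11 spatial patterns.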
Fig. 5. Spatio-temporal sequence retrieval after training: (a) a retrieval of sequence a); (b) a response to an ambiguous cue; (c) an unambiguous retrieval of sequence b); (d) a response to a nonorthogonal cue pattern; (e) a response to an unknown cue sequence; (f) a response to a cue sequence containing temporal gaps; and (g) a retrieval of sequence c).

TABLE IV
THE CS AND US INPUTS TO EACH MANN, AS WELL AS THE OVERALL CS AND US INPUTS TO THE SYSTEM, AS A SYSTEM OF THREE EMBEDDED MANN'S LEARNS THE SEQUENCE a) IN FIG. 3

TABLE V
THE CS AND US INPUTS TO EACH MANN, AS WELL AS THE OVERALL CS AND US INPUTS TO THE SYSTEM, AS A SYSTEM OF THREE EMBEDDED MANN'S LEARNS THE SEQUENCE b) IN FIG. 3

TABLE VI
THE CS AND US INPUTS TO EACH MANN, AS WELL AS THE OVERALL CS AND US INPUTS TO THE SYSTEM, AS A SYSTEM OF THREE EMBEDDED MANN'S LEARNS THE SEQUENCE c) IN FIG. 3

TABLE VII
TESTING THE SYSTEM FOR CASE a) IN FIG. 3: THE CS INPUT TO AND THE OUTPUT OF EACH MANN, AS WELL AS THE CS INPUT TO AND THE OUTPUT OF THE OVERALL SYSTEM
We now implement the general system with a specific MANN, that is, the MCPN described in Section II. We choose both dimensions of the CS and the US spatial patterns, as well as the numbers of input neurons and initial associative neurons in the MCPN, to be 11 × 11. Three such MCPN's are used in this implementation (K = 3). We train the system to store the sequences consisting of manually created spatial images of alpha-numeral characters shown in Fig. 3, and then test the system in the situations shown in Fig. 5. We present each training sequence only once to the system. The detailed inputs and outputs at each time step of training or retrieval for the cases shown in Fig. 5(a)–(c) are given in Tables IV–IX. Compared to Fukushima's results [13], our system is capable of fast and accurate storing and generating of complex sequences consisting of nonorthogonal spatial patterns. Our system can also output "don't know" answers, thereby reducing the error rate.

Sequences a) and b) were first used by Fukushima [13] in training and testing his spatio-temporal system. We present here a comparison between our results and Fukushima's.
REFERENCES

[13] K. Fukushima, "A model of associative memory in the brain," Kybern., vol. 12, pp. 58–63, 1973.
[14] S. Grossberg, "On learning of spatiotemporal patterns by networks with ordered sensory and motor components, I: Excitatory components of the cerebellum," Stud. Appl. Math., vol. 48, pp. 105–132, 1969.
[15] S. Grossberg, "Nonlinear difference-differential equations in prediction and learning theory," in Proc. Nat. Acad. Sci., 1967, vol. 58, pp. 1329–1334.
[16] S. Grossberg, "Some networks that can learn, remember, and reproduce any number of complicated space-time patterns," Stud. Appl. Math., vol. 49, pp. 135–166, 1970.
[17] S. Grossberg, "Neural pattern discrimination," J. Theor. Biol., vol. 27, pp. 291–337, 1970.
[18] S. Grossberg, Studies of Mind and Brain: Neural Principles of Learning, Perception, Development, Cognition, and Motor Control. Boston, MA: Reidel, 1982.
[19] S. Grossberg, "The adaptive self-organization of serial order in behavior: Speech, language, and motor control," in Pattern Recognition by Humans and Machines, E. C. Schwab and H. C. Nusbaum, Eds. New York: Academic, vol. 1, pp. 187–294, 1986.
[20] S. Grossberg, The Adaptive Brain, II: Vision, Speech, Language, and Motor Control. Amsterdam, The Netherlands: Elsevier/North-Holland, 1987.
[21] I. Guyon, L. Personnaz, J. P. Nadal, and G. Dreyfus, "Storage and retrieval of complex sequences in neural networks," Phys. Rev. A, vol. 38, pp. 6365–6372, 1988.
[22] M. Hagiwara, "Multidirectional associative memory," in Proc. Int. Joint Conf. Neural Networks, 1990, vol. 1, pp. 3–6.
[23] M. J. Healy, T. P. Caudell, and S. D. G. Smith, "A neural architecture for pattern sequence verification through inferencing," IEEE Trans. Neural Networks, vol. 4, pp. 9–20, Jan. 1993.
[24] D. O. Hebb, The Organization of Behavior. New York: Wiley, 1949, p. 44.
[25] R. Hecht-Nielsen, Neurocomputing. Reading, MA: Addison-Wesley, 1990.
[26] Y. Hirai, "A model of human associative processor (HASP)," IEEE Trans. Syst., Man, Cybern., vol. SMC-13, pp. 851–857, Sept./Oct. 1983.
[27] J. J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities," in Proc. Nat. Acad. Sci. USA, Apr. 1982, vol. 79, pp. 2554–2558.
[28] J. J. Hopfield, "Neurons with graded response have collective computational properties like those of two-state neurons," in Proc. Nat. Acad. Sci., May 1984, vol. 81, pp. 3088–3092.
[29] M. I. Jordan, "Attractor dynamics and parallelism in a connectionist sequential machine," in Proc. 8th Annu. Conf. Cognit. Sci. Soc. Hillsdale, NJ: Lawrence Erlbaum, 1986, pp. 531–546.
[30] D. Kleinfeld, "Sequential state generation by model neural networks," in Proc. Nat. Acad. Sci., 1986, vol. 83, pp. 9469–9473.
[31] A. H. Klopf, "A neuronal model of classical conditioning," Psychol. Rev., vol. 88, pp. 135–170, 1988.
[32] B. Kosko, "Bidirectional associative memories," IEEE Trans. Syst., Man, Cybern., vol. 18, pp. 49–60, Jan./Feb. 1988.
[33] R. P. Lippmann, "Review of neural networks for speech recognition," Neural Computat., vol. 1, pp. 1–38, 1989.
[34] W. S. McCulloch and W. H. Pitts, "A logical calculus of ideas immanent in nervous activity," Bull. Math. Biophys., vol. 5, pp. 115–133, 1943.
[35] A. Nigrin, Neural Networks for Pattern Recognition. Cambridge, MA: MIT Press, 1993.
[36] H. Nishimori, T. Nakamura, and M. Shiino, "Retrieval of spatio-temporal sequence in asynchronous neural network," Phys. Rev. A, vol. 41, pp. 3346–3354, 1990.
[37] Y. Osana, M. Hattori, and M. Hagiwara, "Chaotic bidirectional associative memory," in Proc. IEEE Int. Conf. Neural Networks, 1996, pp. 816–821.
[38] F. J. Owens, G. H. Zheng, and D. A. Irvine, "A multi-output-layer perceptron," Neural Comput. Appl., vol. 4, pp. 10–20, 1996.
[39] P. Peretto and J. J. Niez, "Collective properties of neural networks," in Disordered Systems and Biological Organization, E. Bienenstock et al., Eds. Berlin, Germany: Springer-Verlag, 1986, pp. 171–185.
[40] D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, Parallel Distributed Processing. Cambridge, MA: MIT Press, 1986.
[41] H. Sompolinsky and I. Kanter, "Temporal association in asymmetric neural networks," Phys. Rev. Lett., vol. 57, pp. 2861–2864, 1986.
[42] D. W. Tank and J. J. Hopfield, "Neural computation by concentrating information in time," in Proc. Nat. Acad. Sci., 1987, vol. 84, pp. 1896–1900.
[43] G. L. Tarr, S. K. Rogers, M. Kabrisky, M. E. Oxley, and K. L. Priddy, "Effective neural network modeling in C," in Artificial Neural Networks, T. Kohonen, K. Mäkisara, O. Simula, and J. Kangas, Eds. Amsterdam, The Netherlands: Elsevier, 1991, pp. 1497–1500.
[44] A. C. Tsoi and A. D. Back, "Locally recurrent globally feedforward networks: A critical review of architectures," IEEE Trans. Neural Networks, vol. 5, pp. 229–239, Mar. 1994.
[45] K. P. Unnikrishnan, J. J. Hopfield, and D. W. Tank, "Connected-digit speaker-dependent speech recognition using a neural network with time-delayed connections," IEEE Trans. Signal Processing, vol. 39, pp. 698–713, 1991.
[46] A. Waibel, T. Hanazawa, G. Hinton, and K. Shikano, "Phoneme recognition using time-delay neural networks," IEEE Trans. Acoust., Speech, Signal Process., vol. 37, pp. 328–339, 1989.
[47] D. L. Wang and M. A. Arbib, "Complex temporal sequence learning based on short-term memory," Proc. IEEE, vol. 78, pp. 1536–1543, Sept. 1990.
[48] D. L. Wang and M. A. Arbib, "Timing and chunking in processing temporal order," IEEE Trans. Syst., Man, Cybern., vol. 23, pp. 993–1009, July/Aug. 1993.
[49] D. L. Wang and B. Yuwono, "Anticipation-based temporal pattern generation," IEEE Trans. Syst., Man, Cybern., vol. 25, pp. 615–628, Apr. 1995.
[50] L. Wang, "Oscillatory and chaotic dynamics in neural networks under varying operating conditions," IEEE Trans. Neural Networks, vol. 7, pp. 1382–1388, Nov. 1996.
[51] L. Wang, "Suppressing chaos with hysteresis in a higher-order neural network," IEEE Trans. Circuits Syst., vol. 43, pp. 845–846, Dec. 1996.
[52] L. Wang, "On competitive learning," IEEE Trans. Neural Networks, vol. 8, pp. 1214–1217, Sept. 1997.
[53] L. Wang, "Learning and retrieving spatio-temporal sequences with any static associative neural network," IEEE Trans. Circuits Syst. II, vol. 45, pp. 729–738, June 1998.
[54] L. Wang and D. L. Alkon, "An artificial neural network system for temporal–spatial sequence processing," Pattern Recognit., vol. 28, no. 8, pp. 1267–1276, 1995.
[55] L. Wang, E. E. Pichler, and J. Ross, "Oscillations and chaos in neural networks: An exactly solvable model," in Proc. Nat. Acad. Sci., Dec. 1990, vol. 87, pp. 9467–9471.
[56] L. Wang and J. Ross, "Chaos, multiplicity, crisis, and synchronicity in higher order neural networks," Phys. Rev. A, vol. 44, pp. R2259–R2262, Aug. 1991.
[57] L. Wang and J. Ross, "Physical modeling of neural networks," in Methods in Neuroscience, Computers and Computations in the Neurosciences, P. M. Conn, Ed. San Diego, CA: Academic, 1992, vol. 10, pp. 549–567.
[58] Time Series Prediction: Forecasting the Future and Understanding the Past, A. S. Weigend and N. A. Gershenfeld, Eds. Reading, MA: Addison-Wesley, 1993.
[59] L. Wu and F. Fallside, "On the design of connectionist vector quantizers," Comput. Speech Lang., vol. 5, pp. 207–229, 1991.

Lipo Wang received the B.S. degree in laser technology from National University of Defense Technology, Changsha, China, in 1983, and the Ph.D. degree in physics from Louisiana State University, Baton Rouge, in 1988 (with a CUSPEA assistantship). His Ph.D. dissertation was on low-dimensional electron systems in semiconductor devices such as the MOSFET.

Since then, he has been interested in the theory and applications of neural networks. In 1989, he worked at Stanford University, Stanford, CA, as a Postdoctoral Fellow. In 1990, he was a Lecturer in electrical engineering at the University College, University of New South Wales, Australia. From 1991 to 1994, he was on the Research Staff at the National Institutes of Health, Bethesda, MD. From 1995 to August 1998, he was a Computing Lecturer at the School of Computing and Mathematics, Deakin University, Melbourne, Australia. Since September 1998, he has been an Associate Professor at the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. His research interests include combinatorial optimization, control, temporal behavior, pattern recognition, data mining, fuzzy networks, and biological networks. He is author or co-author of over 40 journal publications and 30 conference presentations, holds a U.S. patent on neural networks, and is co-author/editor of Artificial Neural Networks: Oscillations, Chaos, and Sequence Processing (Los Alamitos, CA: IEEE Comput. Soc. Press, 1993) and Proceedings of the 1998 World Multiconference on Systemics, Cybernetics and Informatics. He is an Associate Editor for Knowledge and Information Systems: An International Journal, and has served on the program committees of a number of conferences. More information is available at his home page at http://www.ntu.edu.sg/home/elpwang/.