D. A. Allport

Distributed memory, modular subsystems and dysphasia
I take it as self-evident that the dysphasias (acquired disorders of
language) are a class of memory disorder. Of course, this is not to
say that they are, primarily, impairments of 'episodic' memory, that
is, of memory for particular experiences or events; but they are
impairments, nonetheless, of memory, or memory-retrieval, for the
previously familiar patterns of language. Dysphasic memory
impairments are seen, for example, in the difficulty of retrieving
the spoken form of a word, given some specification of its meaning;
or in retrieving a meaning, given the spoken form; in recovering
the orthographic (written) form of a word, given its spoken form;
and so on.¹ Let us call the ability that is needed for such tasks and
which is evidently disturbed in dysphasic impairments 'language
memory', to distinguish it both from the more general (non-episodic
and non-linguistic) knowledge of the world and from memory for
particular, experienced (non-linguistic) events or episodes, both of
which may be well preserved in many forms of dysphasia (Allport,
If this is granted, that the dysphasias represent a class of memory
disorders, it must be equally evident that we shall need a theory of
memory retrieval and memory interference (a theory of the nature
and origin of confused or incomplete or inaccurate retrieval) as an
essential tool in the understanding of dysphasia. In spite of this,
there have been surprisingly few attempts to apply such theoretical
understanding as we have of the psychology of memory to the phenomena of dysphasia.
Certain recent developments in the fundamental conception of
memory processes, and of their possible embodiment in physical
structures like the brain, now make this a more promising enterprise than it has appeared hitherto. The key developments here are
in models of 'distributed' memory and of parallel-associative processes of retrieval (Hinton & Anderson, 1981). My aim in this chapter is to introduce these theoretical ideas, so far as is possible in a
non-technical way, and to consider some of their implications for
our understanding of the nature of dysphasic difficulties. Before
doing so, however, it will be worthwhile, by way of contrast, to
consider the currently dominant approach in cognitive neuropsychology to the understanding of language and language disorders,
namely the identification of isolable 'processing components'
(modular subsystems), and to review, briefly, the strengths and
limitations of this approach (Section I). In Section II I outline
various different levels of explanation, as applied to neuropsychological data. I shall then be in a position to introduce, in Section III,
some of the essential ideas of distributed memory in a way that, I
hope, may make them intuitively accessible to the non-mathematical reader. Finally, in Section IV, I consider how these ideas apply
to aspects of brain-injured, dysphasic performance and to what are,
or are not, valid neuropsychological 'components'.


Despite the immense and rapidly growing quantity of information
available on the anatomy and physiology of the brain, we still know
almost nothing about the processes in the nervous system responsible for language or other higher-level cognitive abilities. The
traditional aphasiological approach (the correlation of behavioural
deficit with anatomical lesion site) has yielded somewhat slender
dividends in terms of insights into the disordered processes. Meanwhile, independent of the neurosciences, the psychological investigation of normal cognitive abilities has developed in a number of
important ways. In the 'information-processing' approach to
cognitive psychology, or 'cognitive science' (Norman, 1981), a key
idea has been that the mechanisms of behaviour can be described
at an abstract, or process level, without any reference to the physical
or biological hardware involved, much as a computer program can
be written without explicit reference to the physical machine on
which it will run. In this tradition, information-processing models
of cognitive processes are often expressed in flow-chart form, that
is as blueprints for a set of computable processes, where the long-term goal is the complete specification of these processes in a working computer program. (In the psychology of language, McClelland
& Rumelhart's (1981) model of written word recognition provides
an elegant, representative example of this kind of theory-building.)





If the flow-chart model is seen as a step towards formulating a
fully-specified, computable model, a still more preliminary step is
to try to identify the in-principle-separable components (the building blocks) of the system as a whole, components which may then
be studied and modelled in an intelligible way, at least partly in
isolation from the rest of the system. There is increasing support
for the view that the staggering complexity of human behaviour is
the product of interaction among many different, semi-independent
subsystems each performing a unique, specialist role in the overall
organization (e.g. Allport, 1977, 1980; Minsky, 1979; Fodor, 1983).
An analogy is often drawn with a 'society of experts', a large organization (the CIA provides a favourite example) in which different units of the organization have different skills and are in
possession of different pieces of information; in which no single
member of the organization can possibly possess all the knowledge,
or all the expertise, contained in the organization as a whole. The
analogy between individual minds, and societies, has a number of
interesting features. For the present, the essential idea is that the
functional components of mind are, in general, special-purpose rather
than general-purpose elements in the working of the whole
system (Allport, 1980). If this general view is correct, then to characterize in outline any of these separable subsystems (to discover
broadly what it does and with what other subsystems it communicates) becomes an essential preliminary to constructing a detailed, information-processing model of how that subsystem computes the
specialized functions that have been ascribed to it.
In the light of this preliminary but essential goal, the recent surge
of interest among cognitive psychologists in the phenomena of
dysphasia, and other behavioural consequences of brain injury, is
easily understood. Cognitive psychologists have come to recognize
the potential of the individual, neuropsychological case-study to
reveal dissociable behavioural deficits, and hence to provide clues
about the functional separability of the underlying component
mechanisms (e.g. Marin et al., 1976; Patterson, 1981; Shallice,
1979). Equally ambitiously, it is hoped, selective impairment of
some components may permit a uniquely privileged view of the
working of the remaining, intact systems.
The most convincing defence of this research strategy is its ability
to produce consistent interpretations of dysphasic performance,
converging on the same functional components as can be inferred
from experiments using normal subjects. Much recent research,
particularly that concerned with the processing of written language,
can claim to provide successful illustrations (e.g. Coltheart et al.,
1980; Patterson & Coltheart, 1984). A particularly influential example, to which we shall need to return, is Morton's logogen model,
a model of normal lexical organization which has been applied to
several varieties of dysphasic and dyslexic performance (e.g.
Morton, 1980; Ellis, 1982).
I share the enthusiasm and excitement over the 'modular subsystems' approach. At the same time, it is important to recognize
certain potential limitations in its application to brain-injured
The strategy rests on two rather strong assumptions. The first
is that biological information-processing systems (human minds) are indeed highly modular in organization, in the way suggested,
not only in their abstract or 'functional' organization but also, and
equivalently, in their anatomical embodiment, so that localized
anatomical lesions can selectively damage just one or a small
number of psychologically intelligible subsystems, leaving other
subsystems physically unimpaired. The second assumption is that,
in this case, the ensuing behaviour reflects the normal operation of
the remaining, intact subsystems, minus the contribution of the
damaged components, without major compensating reorganization
on the part of the surviving components. This latter assumption,
in particular, appears threatened by the evident fact that, following
cerebral injury, at least some recovery of language, as of other
cognitive abilities, is almost always observed (cf. Newcombe and
Ratcliff, 1979; Finger and Stein, 1982). Indeed, this is surely the
aspect of dysphasia of principal interest to therapists, and to the
patients themselves. Yet contemporary analyses of language mechanisms at the level of 'separable functional components' (the box-and-arrow notation of current cognitive neuropsychology) appear to
have nothing that they can say about it.
Functional components and cerebral lesions
There are, however, more serious problems at stake, all of which
reflect a mismatch with the level of description needed to accommodate the manifestations of brain injury in dysphasia. First, and
obviously threatening to this approach, Wood (1978, 1982) has
shown how, in a distributed memory system, a clear 'double dissociation' between behavioural deficits can be consistent with
complete overlap in the underlying representations. Of this subject,
however, more later. The inferred separable components (the
'boxes' and 'arrows' of current information-processing models in
neuropsychology) are highly abstract entities, whose psychological
validity and interpretation, it is claimed (e.g. Morton, 1981), is
independent of any possible, physical implementation in the brain.
Applied to the behaviour of intact, normal subjects, this approach
seems reasonable enough. Directed towards the understanding of
the effects of neurological injury, it appears less obviously satisfactory. What, specifically, might it mean to think of 'lesioning' a component in such an abstract, disembodied system?
It is, perhaps, straightforward enough to think of the simple deletion of an entire component, a 'box' or an 'arrow', from such an
information-processing model. This option, however, lays itself
open to the objection raised by Freud (1891) against the earlier
'diagram makers', to the effect that theorizing in this form seems
to reflect dysphasic performance as though seen in silhouette, without internal structure. For this level of analysis, the 'ideal' dysphasic
data should take the form of complete failure on one set of tasks,
normal (intact) performance on another set. In practice, such
complete functional dissociations are seldom, if ever, seen. On the
other hand, what is seen every day in the dysphasic clinic is reduced
efficiency of performance in one or more domains: slower and less
reliable word-finding; partial or incomplete retrieval of word-meanings; increased confusability between similar items or similar
constructions; and so on and so on. How these all-too-familiar
phenomena of diminished, but not zero, performance within any
one processing domain are to be explained by theories at the level
of Independent Processing Components is far from obvious.
Of course, this is not to deny that focal head injury may result
in selective impairment of particular domains of language processing.
The point at issue is that, within any one domain, the impairment is,
most commonly, partial (a general reduction of efficiency), not all-or-none.
Another feature of the reduced efficiency of dysphasic performance deserves comment here. When the same tests designed to
probe receptive or expressive lexical knowledge are repeated over
a period, success on individual words typically fluctuates from one
occasion of testing to another, even though the overall test scores
may remain remarkably consistent. That is, particular classes of
words can be differentially affected, as a group, in a consistent way.
However, what appears not to occur is the permanent loss of unique,
individual written or spoken word-forms, leaving others in the same
class intact. The same applies to memory for other recurrent
patterns, such as faces or melodies. Whereas the recognition of
previously familiar faces can be impaired in general, what has never
been reported is an acquired, selective inability to recognize (say)
one's grandmother, while recognition of one's grandfather is
preserved. Brain lesions may have selective effects at the level of
whole processing components, but not, it appears, at the level of
individual words or objects in memory.
Clearly, none of the features of dysphasic performance that I have
mentioned show the analysis of psychological mechanisms in terms
of distinct, but interacting, subsystems to be in any sense wrong.
Far from it. They are, nonetheless, examples of very obvious and
general phenomena that theories, at that level of abstraction, simply
fail to engage at all. As I suggested earlier, if we are to get some
theoretical insight into them, we shall need to look for theories at
a different logical level of description.
David Marr (1981) has put forward a theoretical framework for our
understanding of the processes of vision, a field in which information-processing analysis is a long way ahead of the corresponding
research in language. In presenting this framework, and the progress within it so far achieved, Marr illustrates the point again and
again that, if one hopes to understand any complex information-processing system, one will need different kinds of explanation at
several different levels of description, levels which may be, at first,
only very loosely linked. Some properties of a system's behaviour
will be most appropriately explained at one level, some at another.
To make matters harder, it is by no means always obvious in
advance which level of explanation will be the most appropriate,
or tractable, for any given, behavioural phenomenon.
Marr & Nishihara (1978) distinguished four levels of description.
To begin with, there is the analysis of basic components and their
local circuitry: how do transistors and diodes (neurons and synapses)
work? At another level up are questions about implementation:
how are assemblies of the basic components arranged to implement
particular mechanisms (the adders and multipliers of a pocket
calculator, for example)? Most importantly, for our present purpose,
at this level arise questions about how the fundamental mechanisms
of memory (storage, comparison, retrieval) are implemented.
The third level is that of representation and algorithm, the level of
description at which most current work in artificial intelligence, and
much of cognitive psychology, is aimed. Here the central questions
are: (a) what aspects of the information being handled by the system
are given explicit² internal representation, so that they can be used
directly by a given process; (b) at what 'stage' in the system, i.e.
from which other representations, can they be obtained; and (c) how
(i.e. by what computable procedures) are they derived? Cognitive
psychologists have tended to concern themselves more with the first
two of these questions, with identifying distinct or common
(shared) codes of representation, and with mapping their channels
of intercommunication, than with specifying computable procedures for transforming one code into another. Questions of types (a) and (b) in the cognitive psychology of language
would include, for example, many currently live issues about the
organization of the mental lexicon. (For instance, in the perception
or production of speech, is there any level of explicit representation
of systematic phonemes? In written word-recognition, is there a
stage of representation of abstract letter-identities? If so, what other
stages or subsystems can read from (have inputs from) this particular code? Are there distinct lexical and non-lexical coding systems
by which a skilled reader can derive pronunciation from print?
Etc., etc.)
Finally, the top level of description contains the abstract theory
of the computation or process being performed, that is, the theory
in the broadest sense of what is being done, and why; and what are
the constraints provided by the world in which it operates that
make it possible? As regards language, the level of 'computational
theory' perhaps corresponds most nearly to that of abstract theoretical linguistics.
In terms of these four levels of analysis it is not immediately
obvious to which level the cognitive neuropsychologists' 'functionally separable components', inferred from dissociable behavioural
deficits, should be assigned. Arguably, the most global of these
component distinctions, such as, for example, the distinction
between 'logogen system' in general and 'cognitive system', belong
properly to the level of the abstract 'computational' (linguistic?)
theory. Similarly, linguistic intuitions regarding the broad decomposition of the language faculty into syntactic, semantic, phonological (etc.) domains, which have claimed support from the
major categories of dysphasic impairment (e.g. Caramazza &
Berndt, 1978; Lesser, 1978), belong at this level. There is a parallel
here with Marr's use of neuropsychological dissociations in the
perception of three-dimensional objects (Taylor & Warrington,
1973) as the basis for certain fundamental choices, at the 'computational theory' level, about the overall organization of the visual
process (Marr, 1981).
Equally, it can be argued that the neuropsychologists' separable
functional components correspond (or at least ought to correspond) one for one with distinct representation systems, i.e.
distinct attribute codes (e.g. Allport, 1980; Monsell, 1983). Hence
they belong to the next level of analysis, the level of 'representation
and algorithm'.
At either of these levels of analysis, however, we find little help
in understanding what it might mean to 'lesion' (to injure rather
than to eliminate) one of these abstract components. Where physical injury results not in the total abolition of some function (or
representational ability) but in a reduction of its scope and efficiency (for example in diminished vocabulary, slower, unreliable
and errorful retrieval, etc.), then the box-and-arrow notation of
current functional-component models (e.g. Morton, Ch. 9) offers
no obvious way to accommodate these changes.
To understand these behavioural effects we need also to have a
model of functionally separable components at the (neural) implementation level. The principal aim of this paper is to motivate, and
to provide at least an introduction to, such a model.
Even to suggest such an enterprise evokes responses of dismay,
even of abrupt dismissal, on the part of many cognitive psychologists. Clearly there is a yawning theoretical gulf here. On one side
of the gap there is a vigorous, even flourishing cognitive psychology, applied to both normal and pathological language processes,
operating almost exclusively at Marr's third level (symbolic representations). On the other side of the gap there are dramatic and
continuing advances in the neurosciences, almost entirely at the
'basic components' level. Between these two, however, questions
at what Marr called the implementation level appear to have been
very largely ignored by those on either side.³
In spite of this, if we are to press our question, 'To which level
of description does the analysis of modular subsystems or "separable functional components" properly belong?', the correct answer
appears to be: all levels, down to and including that of fundamental
mechanism or 'implementation'. That is, the way in which psychological processes emerge from interactions among modular subsystems has strong implications for, and is in turn illuminated by,
analysis at each of Marr's three upper levels of description: the top level,
computational (linguistic) theory; the level of symbolic representation and process; and the level of physical implementation.
Indeed, it is because questions about the modular decomposition of
the mind/brain arise at each of these levels, and because their solutions have important implications for other questions at each level,
that the identification of modular subsystems represents such a
primary and essential goal for the sciences of cognition (theoretical
and computational linguistics, cognitive psychology, neuropsychology) and for the understanding of language pathology.
Semantic nets and neural nets
A widely accepted notation for representing the structure of lexical
and semantic knowledge, adopted both in psychology and in
computer science, takes the form of a 'semantic net': a network of
concept nodes and labelled, directed relational links (e.g. Collins
& Loftus, 1975). Most discussions of semantic nets are confined to
the abstract level of 'representation and algorithm', without reference to their possible embodiments in (neuronal) hardware. In
neuropsychology and the study of dysphasia, however, if we are to
understand the disorders of lexical and semantic memory that result
from physical injury we shall undoubtedly need an explicit theory
of the relationship between the abstract representation level and the
level of its physical implementation.
One obvious possibility is to suppose that different concept nodes
in a semantic network correspond to different physical elements in
the hardware (neurons, cell-assemblies, etc.), and that the relational
links between concepts (and between concepts and word-forms)
similarly correspond to particular physical linkages. Evidently, this
is a possibility that many people take quite seriously, both in
neurophysiology (e.g. Barlow, 1972) and, mutatis mutandis, in artificial intelligence (e.g. Fahlman, 1979). However, it is not the only
one. Another possibility is that each 'concept node' corresponds not
to a distinct part, or component, of the hardware but to a particular
pattern of activity in it. Different concept nodes, in this sort of
implementation, can be represented by different patterns of activity
in the same set of physical units, the same network. That is, the
representation of concepts is 'distributed'.

Distributed memory
To get a better idea of what this might mean, consider the following
simplified example, for which I am indebted (as throughout this
section) to Hinton (1981). Imagine a network of simple hardware
elements (switches) and their physical interconnections, as illustrated in Figure 2.1a. Each element has two possible activity states,
either 'on' or 'off', which can be represented symbolically by a 1 or
a 0. Figure 2.1b shows a sequence of activity states of five hardware elements, resulting from an initial input and mutual interactions within the network.
As I have already emphasized, the same physical system can be
described at more than one level of analysis. Thus, the behaviour
of our hypothetical network could be described either in terms of
the activities of the individual hardware elements (as in Figure
2.1b) or, alternatively, at a higher level, in terms of the activity of
the network as a whole. That is, recurring patterns of activity across
all five elements now become the units of analysis, the basic descriptive elements to which particular names could be assigned. In this
way, Figure 2.1c depicts the sequential relationships between
patterns of activity of the hardware elements in Figure 2.1a. It is
important to see that, while the diagrams in Figures 2.1a and 2.1c
are superficially similar, their interpretation is radically different.
In Figure 2.1a the nodes represent physically distinct parts of the
machine; arrows represent individual hardware connections; and
many different nodes can be 'on' at the same time. In Figure 2.1c
none of these things are true. Here the nodes stand for mutually
exclusive states of the network as a whole; the arrows represent
possible transitions between these states.
Diagrams 2.1a and 2.1c both describe the same physical system,
but in diagram 2.1c the descriptive elements stand for distributed
states; there is no simple one-to-one correspondence between these
elements and particular physical parts of the network.
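
The contrast between the two levels of description can be made concrete with a small simulation. The following Python sketch is only an illustration: the wiring rule in `step` (each element copies its neighbour) is a hypothetical choice, not the set of connections drawn in Figure 2.1a, and the names `step`, `run` and `state_diagram` are invented here. It records the same behaviour twice, once as a sequence of 0/1 activities of five elements and once as named, mutually exclusive states of the whole network with the transitions between them.

```python
def step(state):
    """Hardware-level rule (hypothetical): each element copies its left
    neighbour, with the last element feeding back to the first."""
    return (state[-1],) + state[:-1]


def run(initial, n_steps):
    """Element-level description: successive 0/1 activity vectors."""
    history = [initial]
    for _ in range(n_steps):
        history.append(step(history[-1]))
    return history


def state_diagram(history):
    """Network-level description: give each distinct whole-network
    pattern a name and record which named state follows which."""
    names = {}
    transitions = {}
    for current, following in zip(history, history[1:]):
        for pattern in (current, following):
            names.setdefault(pattern, "S%d" % len(names))
        transitions[names[current]] = names[following]
    return names, transitions


# Five elements, two of them 'on' in the initial input.
history = run((1, 1, 0, 0, 0), 5)
names, transitions = state_diagram(history)

for pattern in history:
    print(pattern, "is state", names[pattern])
print(transitions)
```

In the element-level record two elements are 'on' at every moment; in the state-level record exactly one named state holds at a time, and no named state corresponds to any single element.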
This illustration of Hinton's makes a good starting-point for
understanding the idea of 'distributed' memory. In that example,
however, a whole lot of important questions were (temporarily)
sidestepped. To begin with, one might ask, in what sense should
any particular pattern of activity in the network be treated as a
'unit', rather than the merely accidental co-occurrence of activities
among its constituent hardware elements? In Figure 2.1c, the
arrows assert something about the sequential constraints among
particular activity states, i.e. they refer to the (past or future) history
of the network. 'Units', in this notation, are thus activity-patterns
that stably recur in the system's history. If the network is to act
as a memory, we want it to distinguish patterns or events that are
already familiar (i.e. that stably recur), and thus can have potential
significance, from unknown or arbitrary configurations. Our initial
question thus gives rise to two, more specific questions:
1. How might such a network be arranged so that each of a number
of different activity-patterns can be stably reinstated at different
2. How might new, reinstatable activity-patterns (new 'units') be
To begin to answer these questions, imagine now a network of
hardware elements, in which every element is connected to every
other, including itself, as in Figure 2.2a. Assume also that each
element can be active in a graded amount, rather than simply 'on'
or 'off'. Each interconnection transmits excitation (inhibition) from
one element to another, with a given positive (negative) weighting,
or 'strength' of transmission. The same weightings can be shown
also in the form of a matrix of interconnections, as in Figure 2.2b.
(Naturally for any psychologically plausible application to human
memory we shall need to think about a matrix of many more than
just four elements.)

Fig. 2.2 (a) A completely interconnected network of physical elements. (b) The
same system shown as a matrix of interconnections. Each interconnection may
have a different variable weighting.

Most of the suggestions about learning within such a matrix of
interconnected active elements are variants of an idea put forward
originally by Hebb (1949). The idea is that the strength of connectivity between any two elements (neurons) changes as a function of
the amount of concurrent ('pre- and post-synaptic') activity in that
pair of elements. For example, in Anderson's (1977) matrix memory
model, the basic learning assumption is that the weightings of each
interconnection are changed in proportion to the product of the
activity level in each of the corresponding pairs of source and
receiving units. If the inputs to such a system cause the same
pattern of activity to occur repeatedly, the set of active elements
constituting that pattern will become increasingly strongly inter-associated. That is, each element will tend to turn on every other
element in the inter-associated pattern and (with negative weights)
to turn off the elements that do not form a part of the pattern. To
put it another way, the pattern as a whole will become 'auto-associated': it will come to cause itself as its own successor. It thus
becomes one of a set of stable states of the system. We may call
a learned (auto-associated) pattern an 'engram'.
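
The learning rule just described can be written down in a few lines. The sketch below is a minimal illustration under stated assumptions, not Anderson's (1977) model itself: activity is coded as +1 or -1 (so that the product rule yields negative as well as positive weights), the learning rate is fixed, and the function names `new_matrix`, `hebbian_update` and `successor` are hypothetical.

```python
def new_matrix(n):
    """An n-by-n matrix of interconnection weights, all initially zero."""
    return [[0.0] * n for _ in range(n)]


def hebbian_update(weights, pattern, rate=1.0):
    """Change each weight in proportion to the product of the activity
    in its receiving unit (row i) and its source unit (column j)."""
    for i, receiving in enumerate(pattern):
        for j, source in enumerate(pattern):
            weights[i][j] += rate * receiving * source


def successor(weights, activity):
    """The pattern that the current activity causes via the weights:
    each unit sums its weighted inputs and takes the sign of the total."""
    totals = [sum(w * a for w, a in zip(row, activity)) for row in weights]
    return [1 if t > 0 else -1 if t < 0 else 0 for t in totals]


# Assumed coding: +1 = active, -1 = inactive.
pattern = [1, -1, 1, -1]
weights = new_matrix(len(pattern))

# The same pattern occurs repeatedly, so its elements become
# increasingly strongly inter-associated.
for _ in range(3):
    hebbian_update(weights, pattern)

print(weights[0][2])   # weight between two co-active elements (positive)
print(weights[0][1])   # weight between an active and an inactive element (negative)
print(successor(weights, pattern) == pattern)
```

After learning, the pattern reproduces itself when passed through the matrix, which is the sense in which it has become 'auto-associated'.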
The establishment of an auto-associated pattern will have a
number of interesting consequences.



1. Stability
Once evoked, a learned pattern (but not an unlearned one) will
tend to maintain itself.

2. Part-to-whole retrieval
The activation of only some elements of the learned pattern will
tend to evoke each of the remaining elements of that pattern, since
all of its missing elements receive positive connections from each
of the elements already present, while currently active elements that
are not part of the learned pattern are inhibited. As more of the
missing elements are activated, they also begin to assist the recruitment of the remainder of the auto-associated pattern, until the
network settles into the completed pattern. Some dramatic illustrations of this auto-associative forcing of missing pattern-parts are
given by Kohonen (1977; Kohonen et al., 1981).

3. Retrieval dynamics
The process of reinstatement of the complete learned pattern is thus
extended over time. Where the input is related, in some degree, to
several different engrams (see below), the network will take longer
to 'settle' into one, stable pattern of activity. Ratcliff (1978) has
put forward a mathematical model of memory retrieval dynamics
that is formally equivalent, in several important respects, to that
of Anderson (1977), and which provides an impressive fit to a range
of experimental data on memory retrieval times.

4. Categorical perception and 'capture'
Input patterns that are similar to (i.e. that share many elements
with) a strongly auto-associated pattern, but which are not themselves already-learned patterns, or are less strongly learned, will
tend to recruit the more strongly learned pattern and thus be
replaced by it. That is, they get 'captured' by the stronger pattern.
With some quite reasonable assumptions about feedback within
such a system, and a maximum and minimum (zero) activity level
in individual elements, it can be shown that such systems will tend
to settle into stable, learned activity-patterns in which some units
are maximally active while the remainder are not responding at all
(Anderson, 1977). In effect, that is, such systems will tend to
exhibit a strong form of 'categorical perception'.
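
Consequences 1 to 4 can be seen together in a small simulation. The Python sketch below is an illustration under several assumptions of my own, not a reproduction of Anderson's (1977) model: activity is bounded between -1 and +1 (rather than between zero and a maximum), a single pattern has been learned by the product rule, feedback is added to the current activity with a fixed `gain`, and the names `learn` and `settle` are hypothetical.

```python
def learn(pattern):
    """Weights for one auto-associated pattern (product rule, scaled by size)."""
    n = len(pattern)
    return [[pattern[i] * pattern[j] / n for j in range(n)] for i in range(n)]


def settle(weights, cue, gain=1.0, max_steps=50):
    """Repeatedly add weighted feedback to the current activity, clipping
    each element to the range [-1, 1], until the activity stops changing.
    Returns the final activity and the number of changes that were needed."""
    state = [float(v) for v in cue]
    for step_count in range(1, max_steps + 1):
        feedback = [sum(w * a for w, a in zip(row, state)) for row in weights]
        new_state = [max(-1.0, min(1.0, s + gain * f))
                     for s, f in zip(state, feedback)]
        if new_state == state:
            return new_state, step_count - 1
        state = new_state
    return state, max_steps


engram = [1, 1, -1, -1, 1, -1]
weights = learn(engram)

# 1. Stability: the learned pattern, once evoked, maintains itself.
stable, steps_learned = settle(weights, engram)

# 2 and 3. Part-to-whole retrieval, extended over time: only the first
# two elements are given; the remainder are recruited step by step.
completed, steps_partial = settle(weights, [1, 1, 0, 0, 0, 0])

# 4. Capture: an input differing from the engram in one element
# is replaced by the engram.
captured, steps_similar = settle(weights, [1, 1, -1, -1, 1, 1])

print(stable == engram, steps_learned)
print(completed == engram, steps_partial)
print(captured == engram, steps_similar)
```

The clipping of activity at fixed limits is what drives every element to an extreme value, so that the network ends in one learned pattern rather than in some blend of the input and the engram.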

5. Many engrams
Suppose, now, that the input forces a different activity-pattern in
the same population of interconnected elements. If this pattern
recurs, or is sustained, it too will come to be auto-associated.
However, the (at first sight) really surprising feature of matrix
memories of this kind is that the learning of this new pattern need
not disturb the memory for (i.e. the recoverability of) the previously learned pattern, even though both patterns are stored in the
same matrix of interconnections. So long as the different patterns
are orthogonal (that is, so long as they are not correlated with one
another), then many different patterns (engrams) can be literally
superimposed on the same matrix of interconnected elements, without mutual interference.⁴ The requirement for interference-free
recovery of stored patterns, that the different patterns should be
uncorrelated, is intuitively obvious when it is appreciated that the
process of retrieval of any stored pattern is essentially a process of
correlating a given input-vector (a 'retrieval cue') against the matrix
as a whole.⁵ To the extent that the retrieval pattern correlates
with (overlaps with, resembles) more than one engram that has
been stored in the matrix, retrieval will inevitably be distorted by,
or suffer 'interference' from these other, related patterns.
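
The role of orthogonality can be checked by direct calculation. The following Python sketch is a minimal illustration, assuming +1/-1 coded patterns of eight elements and a single pass of the retrieval cue through the matrix; the patterns and the names `dot`, `store` and `retrieve` are invented for the example.

```python
def dot(u, v):
    """Correlation (inner product) between two activity-patterns."""
    return sum(a * b for a, b in zip(u, v))


def store(patterns):
    """Superimpose several engrams on one matrix of interconnections."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                weights[i][j] += p[i] * p[j] / n
    return weights


def retrieve(weights, cue):
    """Correlate the retrieval cue against the matrix as a whole."""
    return [dot(row, cue) for row in weights]


first = [1, 1, 1, 1, -1, -1, -1, -1]
second = [1, 1, -1, -1, 1, 1, -1, -1]     # orthogonal to `first`
similar = [1, 1, 1, 1, -1, -1, -1, 1]     # `first` with one element changed

print(dot(first, second))    # zero: the two patterns are uncorrelated
print(dot(first, similar))   # non-zero: the two patterns overlap

# Two orthogonal engrams in the same matrix: retrieval is undisturbed.
clean = retrieve(store([first, second]), first)
print(clean == first)

# Two correlated engrams in the same matrix: retrieval is distorted.
distorted = retrieve(store([first, similar]), first)
print(distorted)
```

In the second case the element on which the two stored patterns disagree is recovered only weakly, while the elements on which they agree are exaggerated: the cue has retrieved a mixture of both engrams.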
The same principles apply to associations between activity-patterns in different sets of hardware elements. Imagine that the
group of elements, α, in Figure 2.3, is completely connected to a
second group of elements, β: every element in the first group is
connected to every element in the second group. Suppose further,





tunable filter, responding only to learned ('tuned') input patterns.

The response of the system to a novel input-pattern, i.e. one
completely unrelated (orthogonal) to any previously stored pattern,
will be damped to zero. Similarly, an orthogonal activity-pattern in
the elements of α that has not been associated with an activity-pattern in another set of elements, β, will give rise to zero activity
in β. That is, a novel input will not be 'seen' by higher levels of
the system until it is learned. This must have the result, as Anderson points out, that such a system will be agonizingly difficult to
teach. Once some learned patterns are established, however, further
associative learning can be increasingly rapid, the larger (or more
multi-dimensional) are the already learned patterns involved. For
the same reasons, initial biases in the network will have a profound
influence on later learning (cf. Edelman, 1981).





Fig. 2.3 Two groups of physical elements, a: and ~, representing two different
domains of attributes (after Anderson, 1977). Every element in a: projects to every
element in ~. Every element in ~ receives an input from every element in a:.

that whenever the activity-pattern A is excited by inputs to (x, other

inputs ('forcing' inputs) excite the activity-pattern B in the second
set of elements, {3. (For a discussion of the role of 'forcing stimuli'
in associative learning see Kohonen et aI, 1981.) As before, our
assumption is that the strength of each interconnection is changed
as a function of the product of the activity in each interconnected
pair of elements in (X and {3, respectively. Now, after this learning
has occurred, whenever pattern A recurs in the elements of (x,
pattern B will be reinstated in the elements of {3. Again, many
different associations between different activity-patterns in (X and
{3 can be stored within the same matrix of interconnections; and
again, of course, the same limitations will be observed due to interference from similar, or related, patterns and their stored associations. The effective 'strength' or recoverability of an engram will
be a joint function of (1) the strength of auto-association among the
elements of the stored pattern, and (2) the strength of association
between the retrieval cue and the to-be-recovered engram, relative
to its overlapping associations with all other stored engrams-in
other words its distinctiveness or 'uniqueness' as a retrieval cue
(cf. Cermak & Craik, 1979).
As Anderson and others have frequently pointed out, memory
in this kind of system is, formally as well as intuitively, a form of


Some properties of distributed, matrix memories

The foregoing is intended to give an entirely informal and intuitive
introduction to the basic ideas of distributed representation and
matrix memory systems. The theory of distributed representation
has been developed over the past dozen years or so by a number
of people, notably, with application to the psychology of memory,
by James Anderson and his colleagues at Brown University
(Anderson, 1973, 1977), and by Kohonen in Helsinki (Kohonen, 1977;
Kohonen et al, 1981). Hinton & Anderson (1981) have compiled
an outstanding collection of papers on distributed memory models
by many different authors including themselves, and the interested
reader is very strongly advised to consult this collection for a fuller
and, of course, more technical introduction. It should be
emphasized that the presentation here has been kept to some of the most
basic, qualitative features of parallel, distributed representation.
Almost nothing has been said about the computational capabilities
of such parallel, network systems, which have in fact begun to be
used extensively in modelling the complex processes of human
vision (Ballard et al, 1983). Their application to theories of higher
mental function is also being actively explored (Fahlman et al,
Even with the very informal account to which we have confined
ourselves here, however, a number of the important properties of
such distributed memory systems should be apparent.
First, 'retrieval' is not a matter of fetching information from
some storage location and transferring or copying it into another
location where it can be 'read', as it is in almost all other kinds of
conventional memory-systems, from current general-purpose
computers to libraries. Retrieval, in a distributed, matrix memory,
consists in the re-activation of a specific activity pattern in a
specific (i.e. code-specific or content-specific) subset of elements.
The activation of that pattern, in that set of elements, can give rise
in turn to the activation of an associated pattern, in a different set
of elements, and so on. The essential character of the
information-processing that occurs in such a system thus consists in 'mapping'
or transcoding patterns of activity from one set of elements to
another. Radically unlike other kinds of memory, however, there
is no distinction between the 'processor' that operates on the
available information and the 'store' in which it is held. The memory
is not a passive container, in which, in principle, any
information-content can be placed, but an active, content-specific
pattern-recognizer and pattern-transcoder.
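The idea of retrieval as transcoding between two sets of elements can be made concrete with a small sketch. It assumes, purely for illustration, a linear outer-product rule and +1/-1 activities. The names `learn` and `transcode`, the sizes of the two sets, and the patterns themselves are hypothetical choices for this example.

```python
# Toy hetero-associative mapping between two sets of elements (illustrative).

def dot(u, v):
    """Correlation (dot product) between two activity patterns."""
    return sum(a * b for a, b in zip(u, v))

def learn(weights, source, target):
    """Strengthen each connection by the product of source and target activity."""
    return [[w + t * s for w, s in zip(row, source)]
            for row, t in zip(weights, target)]

def transcode(weights, source):
    """Activity evoked in the target set by a pattern in the source set."""
    return [dot(row, source) for row in weights]

N_SOURCE, N_TARGET = 8, 4
weights = [[0] * N_SOURCE for _ in range(N_TARGET)]

# Two learned pairs; the two source patterns are orthogonal to one another.
SOURCE_1, TARGET_1 = [1, 1, 1, 1, -1, -1, -1, -1], [1, -1, 1, -1]
SOURCE_2, TARGET_2 = [1, -1, 1, -1, 1, -1, 1, -1], [1, 1, -1, -1]

weights = learn(weights, SOURCE_1, TARGET_1)
weights = learn(weights, SOURCE_2, TARGET_2)

print(transcode(weights, SOURCE_1))  # scaled copy of TARGET_1
print(transcode(weights, SOURCE_2))  # scaled copy of TARGET_2
```

Note that the single `weights` matrix is both the 'store' and the 'processor': nothing is fetched from a location, and the associated pattern simply appears in the second set of elements when the first is activated.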
Second, among any set of engrams or learned patterns that have
been superimposed on the same population of hardware elements,
only one can be fully retrieved (re-activated) at a time. That is to
say, within any one set (or 'domain') of pattern-feature elements,
a distributed memory system must be 'single-channel' in operation.
In order that one pattern can be fully realized, other, potentially
competing patterns on the same set of elements must be
Third, learning is automatically generalized to new input patterns
in proportion to their resemblance (correlation) to patterns already
learned, a property of the very greatest importance in dealing with
a world in which events seldom, if ever, recur exactly as before; a
property, also, that appears to be omnipresent in biological
memories, and whose absence is perhaps the single most severe
limitation on the use or recovery of information from large-scale
conventional (man-made) memory systems. (The latter, in default
of true similarity-based content-addressing abilities, must fall back
on elaborate methods of indexing and searching to locate the
desired information, such as the pattern-matching and back-tracking
procedures of list processing languages, that have formed the
indispensable apparatus of contemporary Artificial Intelligence.) In
distributed, matrix memories the 'interference' resulting from
similarity among stored patterns is the price that is paid for the
enormous advantage of automatic transfer of learning to similar but
novel configurations.
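A minimal sketch of this graded transfer follows, assuming a linear outer-product memory in which the response to a test cue is the stored associate scaled by the correlation between the test cue and the learned cue. The function name `evoked_strength` and the example patterns are assumptions made for illustration only.

```python
# Graded generalization in a linear matrix memory (illustrative sketch).

def dot(u, v):
    """Correlation (dot product) between two activity patterns."""
    return sum(a * b for a, b in zip(u, v))

def evoked_strength(stored_cue, test_cue):
    """Strength with which test_cue evokes the associate of stored_cue.

    Normalised so that a perfect match gives 1.0 and an orthogonal
    (wholly novel) cue gives 0.0.
    """
    return dot(stored_cue, test_cue) / dot(stored_cue, stored_cue)

LEARNED = [1, 1, 1, 1, -1, -1, -1, -1]
ONE_CHANGED = [1, 1, 1, -1, -1, -1, -1, -1]   # one element differs
TWO_CHANGED = [1, 1, -1, -1, -1, -1, -1, -1]  # two elements differ
UNRELATED = [1, -1, 1, -1, 1, -1, 1, -1]      # orthogonal to LEARNED

for cue in (LEARNED, ONE_CHANGED, TWO_CHANGED, UNRELATED):
    print(evoked_strength(LEARNED, cue))  # 1.0, 0.75, 0.5, 0.0
```

The same proportionality that gives transfer to similar cues is what produces interference when two similar cues have different associates.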



Fourth, matrix memory systems automatically respond to the
common elements, or prototypes, from a set of related, learned
instances, where the 'prototype' is the pattern having the highest
correlation with (sharing the largest number of microfeatures with)
the entire set of instances, even though the prototype pattern itself
was never previously encountered: a property that is evidently
possessed by biological, human memory (e.g. Posner & Keele,
1970; Solso & McCarthy, 1981). To put the same point in a slightly
different way, matrix memory systems extract 'semantic memory' (in
the sense of the long-term-invariant or common features and
relationships of many encoded events and their associations) as an
automatic by-product of the encoding of particular, related,
'episodic' instances. However, there is no explicit encoding of these
common features and relations distinct from the encoding of each
particular instance. 'Episodic' and 'semantic' memory (Tulving,
1983) are thus not separate 'components' of mind. It should not be
possible to lose 'semantic' memory while preserving episodic
memory for the same classes of encoded events; though the
converse, one-way dissociation (failure to retrieve unique, episodic
'context' information) may occur (Kinsbourne & Wood, 1982).
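The prototype effect can be illustrated with a small calculation. The sketch assumes a linear memory formed by superimposing the outer products of the learned instances, and takes the 'response' to a cue to be the summed squared correlation of the cue with every stored instance. The patterns, the one-element distortions and the function names are hypothetical choices for this example.

```python
# Prototype extraction from stored instances (illustrative sketch).

def dot(u, v):
    """Correlation (dot product) between two activity patterns."""
    return sum(a * b for a, b in zip(u, v))

def flip(pattern, index):
    """Return a copy of pattern with one element inverted (a distorted instance)."""
    return [-x if i == index else x for i, x in enumerate(pattern)]

def response(instances, cue):
    """Summed squared correlation of the cue with every stored instance.

    For a matrix M built by superimposing outer products of the instances,
    this equals cue . (M . cue), the overall strength of the evoked activity.
    """
    return sum(dot(inst, cue) ** 2 for inst in instances)

PROTOTYPE = [1, 1, 1, 1, -1, -1, -1, -1]
# Four learned instances, each a different one-element distortion of the
# prototype. The prototype itself is never stored.
INSTANCES = [flip(PROTOTYPE, i) for i in (0, 2, 5, 7)]

print(response(INSTANCES, PROTOTYPE))     # 144
print(response(INSTANCES, INSTANCES[0]))  # 112
```

Under these assumptions the never-presented prototype evokes a stronger response (144) than any instance that was actually learned (112), because it is the pattern most highly correlated with the set as a whole.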


Word forms and word-meanings
Can we now identify, in terms of these ideas about distributed,
matrix memory systems, what would count as the separable 'functional
components' of neuropsychology (Section I)? We know from
a very wide range of neurophysiological research (Cowey, 1981;
Mountcastle, 1978) that individual neural elements in different local
regions of brain are responsive to different classes of sensory
attributes: in vision, to colour, movement, orientation, stereo disparity,
and so on; in hearing, to pitch, glide, duration, etc. Moreover,
in some regions individual units are found to be selectively
responsive to highly complex configurations, such as faces
(Perrett et al, 1982). Let us call the class of attributes encoded by
each of these sets of specialist elements an attribute domain. It
appears very natural, then, to propose that the neuropsychologists'
separable 'functional components', identified behaviourally through
doubly-dissociable deficits in performance, correspond to sets of
auto-associated patterns, or engrams, defined over a common population
of feature-elements, hence over the same attribute domain. Different
attribute domains, different 'components'.
Consider the store of spoken word-forms as one such hypothetical
component (Allport & Funnell, 1981). It seems reasonable to
assume a representational domain in which the individual elements
are responsive to acoustic spectra and to the temporal modulation
of sound patterns (Kay, 1982). Possibly, we may need to envisage
also a more abstract attribute domain, in which the elements encode
phonetic or even phonemic properties, though the evidence does
not seem particularly to favour it (Klatt, 1979). A neural dictionary
of spoken word-forms might then be realized as a set of auto-associated patterns superimposed on the same acoustic (or phonetic)
attribute domain, hence on the same population of feature elements,
the same neuronal network. In a system of this kind, individual
word-units, therefore, could not be identified with particular sets
of neural elements; on the contrary, the entire vocabulary of spoken
word-forms would be physically superimposed on the same neural
An immediate consequence of this kind of distributed representation is that physical injury should not result in the loss of particular word-forms while others remained unimpaired. Rather we
should expect the destruction of any proportion of the neural
network underlying the vocabulary of spoken word-forms to have
the effect of reducing the discriminability of many or all these
learned patterns; the larger the lesion, the greater the effect. Wood
(1978, 1981) has constructed a simulation model, based on Anderson's matrix memory ideas, which exhibits precisely these properties of local 'mass-action': decreasing overall retrieval accuracy as
increasing numbers of elements in the matrix are disabled. Further,
the failure or inaccuracy of retrieval should be most apparent in
respect of those word-forms (or other engrams) that are least strongly auto-associated; thus, uncommon words will be more impaired
than those that have been encountered often (or also, perhaps, more
recently). Moreover, the errors in retrieval will take the form of
increased confusability among acoustically (or phonetically) similar
word-forms, including the 'capture' of less familiar words by their
stronger neighbours within the attribute-space. Finally, since the
word-units in such a system exist only as sequential compositions
of (acoustic/phonetic) feature-elements, it must follow that
degradation of information at the word level should always be
accompanied by loss of discriminability at the sub-lexical feature level.
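The 'mass-action' prediction can be sketched in a few lines. This is not Wood's simulation: it is a deliberately simplified illustration that treats a lesion as the removal of a subset of feature-elements and measures how far apart two word-form patterns remain over the surviving elements, averaged over every possible lesion of a given size. The patterns and function names are assumptions made for the example.

```python
# Loss of discriminability with lesion size (illustrative sketch only).
from itertools import combinations

def squared_distance(u, v, intact):
    """Squared distance between two patterns over the intact elements only."""
    return sum((u[i] - v[i]) ** 2 for i in intact)

def mean_separation(u, v, n_lesioned):
    """Average separation of u and v over every possible lesion of a given size."""
    n = len(u)
    lesions = list(combinations(range(n), n_lesioned))
    total = 0
    for lesion in lesions:
        intact = [i for i in range(n) if i not in lesion]
        total += squared_distance(u, v, intact)
    return total / len(lesions)

WORD_A = [1, 1, 1, 1, -1, -1, -1, -1]
NEIGHBOUR = [1, 1, 1, -1, -1, -1, -1, 1]  # differs from WORD_A in two elements
UNRELATED = [1, -1, 1, -1, 1, -1, 1, -1]  # differs from WORD_A in four elements

for m in (0, 2, 4, 6):
    print(m,
          mean_separation(WORD_A, NEIGHBOUR, m),   # 8.0, 6.0, 4.0, 2.0
          mean_separation(WORD_A, UNRELATED, m))   # 16.0, 12.0, 8.0, 4.0
```

On average, separation falls in proportion to the number of elements lost, for every pair of patterns at once, and similar ('neighbouring') word-forms, which start closer together, approach confusability soonest. No pattern is lost outright while the others are spared.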

My contention is that all of these properties are precisely to be
found in dysphasic, lexical impairment.
- Slower, less distinctive (errorful) retrieval or recognition of
spoken word-forms.
- In word-finding, incomplete and/or misordered retrieval in the
form of phonemic paraphasias.
- Capture of less familiar word-forms by their acoustic
neighbours, both in production, as in so-called 'verbal paraphasias'
(malapropisms), and in recognition (Allport, 1983b, c).
- Impairment of spoken word-forms in perception and production
appears to be associated with impaired discrimination of speech
sounds at a sub-lexical level (e.g. Allport, 1983c). The discovery
of even a single case in which the word-form store was clearly
impaired, without any corresponding sub-lexical impairment,
would threaten one of the central assumptions put forward here,
about lexical (word-form) representation.
- Finally, what (I maintain) is not observed is the permanent loss
of particular spoken word-forms, leaving their acoustic
neighbours available and unimpaired. Again, the unambiguous
demonstration of even one such case would be sufficient to
falsify the model. (Note: Wood (1981) has shown how the
retrieval of particular engrams may be selectively impaired, even
in a fully distributed memory; this can occur if two, nearly
identical engrams differ from one another only by a few
microfeatures (all of which have been lesioned) and if no other engrams
are critically dependent for their differentiation on the same
microfeatures. Clearly this type of effect in no way alters the
statement, above, nor its openness to empirical falsification.)

Written word-forms
The immediately preceding discussion has been in terms of a store
of spoken word-forms, as one possible example of a neuropsychologically dissociable, functional component. A similar case can be
made in respect of a store of written word-forms. Allport & Funnell
(1981) reviewed a variety of evidence for the independence of these
two functional components, which they referred to, respectively,
as the phonological and the orthographic lexicon. Each one of the
empirical consequences, listed above, of injury to the phonological
lexicon, based on our assumptions regarding distributed
representation, can be re-stated, mutatis mutandis, as consequences of injury
to the orthographic lexicon. Here, of course, the increased
confusability in retrieval will be in terms of orthographic (letter by letter)
similarity. Again, according to the model of distributed
representation advocated here, there are no orthographic word-units
physically distinct from the representations of (positional)
letter-identities of which they are composed. The model, therefore,
predicts that impairment of the (receptive-expressive) orthographic
lexicon should be invariably accompanied by increased
confusability among (non-lexical) letter identities, and/or, perhaps,
letter-positions.
Most importantly, if this model is correct, what should never be
observed is the selective, permanent loss of orthographic knowledge
regarding any individual written word, while its orthographic
neighbours, sharing many of the same letters in the same
(approximate) relative positions, are unimpaired.



Word-meanings and object concepts

I assume that the distributed engrams representing particular
word-forms in the phonological and orthographic lexicons are
associatively linked with other auto-associated patterns representing
non-linguistic word-meanings ('semantic memory'). For simplicity, let
us confine the discussion here to the representation of relatively
simple object-concepts. Following the general conception of
distributed memory that I have put forward here, I shall further assume
that the auto-associated patterns representing physical objects are
distributed across a very wide range of attribute domains, encompassing
every class of sensory and motor (action-related) attributes
pertaining to the particular object-concept. The object-concept of
telephone, for example, must involve the convolution not only of many
different complex properties of shape, surface texture, size and so
forth that are codable in visual and tactile attribute domains, but
also properties specific to auditory and to action-coding domains of
representation, including manipulation and speech. Indeed, the full
object-concept for telephone, as for any other functional artefact,
must presumably embody a specification of the complete 'scripted'
routine of interactions with the object. The engrams specifying
complex action-routines (when and what to pick up, how to hold
it, etc.) will no doubt share many of the (auto-associated)
sub-patterns, of which they are composed, with a vast number of other
learned action-routines that likewise involve grasping, picking up,
etc. Similarly in the sensory domains, the same elements involved
in coding, say, the characteristic sounds of one particular object
(a telephone) will participate also in many other auto-associated
patterns representing other objects or events. Figure 2.4 gives a
very rough sketch of the idea here, though the diagram fails to
capture the hierarchical nature of object-concepts.

Fig. 2.4 Schematic diagram to illustrate how object concepts might be
represented as auto-associated activity patterns (dotted outlines) distributed across
many different sensory and motor attribute domains. Spoken and written
word-forms are similarly represented as auto-associated patterns within their
corresponding ('phonological'/'orthographic') attribute domains. Mappings
between word-forms and word-meanings are embodied as distributed matrices of
interconnections between attribute domains.

The essential idea is that the same neural elements that are
involved in coding the sensory attributes of a (possibly unknown)
object presented to eye or hand or ear also make up the elements
of the auto-associated activity-patterns that represent familiar
object-concepts in 'semantic memory'. This model is, of course, in
radical opposition to the view, apparently held by many psychologists, that 'semantic memory' is represented in some abstract,
modality-independent, 'conceptual' domain remote from the mechanisms of perception and of motor organization.



Again, if we consider the possible effects of physical injury to
such a system, some consequences are immediately apparent:
- Since object-concepts are typically distributed over many
different attribute domains and hence, generally, over widely
dispersed brain regions, they will appear to be less vulnerable
to local brain injury than (for example) word-forms in the
phonological lexicon that are defined over only one (or very few)
attribute domains. Only very diffuse or widespread injury, as
in severe degenerative disease or toxicosis, is liable to result in
the clinically evident loss of entire classes of object-concept (e.g.
Warrington, 1975).
- Concepts defined over relatively few attribute domains (purely
visual objects, for example, such as clouds or colours) will be
more vulnerable to local cerebral injury (Gardner, 1973).
- Disorders of object-concepts could result either from
degradation of the auto-associative linkages that bond components of the
engram together as a unit, or from more local injury to
particular attribute domains. In the latter case, the loss of particular
attribute information in semantic memory should be
accompanied by a corresponding perceptual (agnosic) deficit.
- It is important to distinguish loss or degradation within the
object-representations from the functional disconnection of
object-concepts and their corresponding spoken or written
word-forms. The associative links between word-forms and
object-concepts, embodied as a distributed matrix of
interconnections, will possess the same properties of graceful
degradation (or 'mass action') as the engrams that they link. Effects of
this kind, resulting from partial disconnections between
attribute domains (and relating to anomic and other dysphasic
syndromes), have been demonstrated by Gordon (1981) in a
simulation study of distributed representation. Since the
associative mappings in each direction are distributed over quite
different populations of links ('synapses'), the functional
disconnections can of course be unidirectional (cf. Allport & Funnell,
The theory of distributed, associative memory has many fundamental implications for our understanding of neuropsychological
impairments, only a few of which have been even touched on here.
In particular, I have tried to suggest how these ideas may be
mapped onto neuropsychological conceptions of separable 'functional components'. I conclude, now, with an illustration of how
the same theoretical orientation may help to clarify what perhaps
should not be considered as a functionally independent component.
Other examples could be given, but one must suffice.

Auditory-verbal short-term memory
I began by claiming that the dysphasias were, self-evidently, a kind
of memory disorder. (I hope that, by now, the sense in which this
claim was intended is sufficiently clear.) However, in the more
familiar sense of 'memory' as recall or recognition of particular
events, people, places-that is, episodic memory-there is little to
suggest that dysphasic patients have any necessarily accompanying
impairments of this kind. With one exception. This is in the
immediate, auditory-vocal repetition of spoken sequences: what
used to be called 'immediate memory span' (Miller, 1956). It is a
commonplace that virtually all dysphasic patients-certainly all
those whom it would be appropriate to classify as having impairments of the phonological lexicon-have a greatly reduced span of
immediate repetition (e.g. Albert, 1976; Heilman et al, 1976). The
majority of healthy adults can repeat back a sequence such as a
seven-digit telephone number, without error. For many dysphasic
patients the span is of three digits, or less.
Up to the early 1970s at least, it was thought appropriate to
attribute the (normal) immediate repetition span, or a large part of
it, to the capacity of a central short-term store (STS) that was either
(opinions differed) modality independent or in some way specialized
for spoken language (Atkinson & Shiffrin, 1971; Crowder, 1976).
Thus it was very natural for Shallice and Warrington (1970, 1977;
Warrington & Shallice, 1969) to represent STS as a separable 'functional component', in the neuropsychological sense, and to suggest
that deficits in 'span' could be due to specific impairment of this
hypothetical, functional component, which they referred to as
'auditory-verbal short-term memory'. However, there are several
reasons why their hypothesis, formulated in this way, may be a
First, the once-popular distinction between long-term and short-term stores (even attribute-specific stores) as functionally separable
components has come under increasingly severe criticism. It is
probably fair to say that there is now no really convincing evidence
in favour of such a distinction, and much that is contrary (e.g. Hunt
& Elliott, 1980; Glenberg & Kraus, 1981; for extensive discussion
of this issue see Cermak & Craik, 1979).
Second, as is well known, serial 'span' in normal subjects is
massively affected by the acoustic similarity among the words in the
sequence. 'Bat, hat, rat, gnat, vat, cat' is harder to remember than
'bat, hood, shrew, midge, jar, dog'. Moreover, in normal as well
as in dysphasic subjects, span depends not only on the sounds of
the syllables but on their lexical familiarity and their meaningfulness. Thus, the typical span of seven digits drops to around five
common words, and to only about two or three nonsense-syllables;
sequences of common words result in a longer span than rare
words; names of concrete objects have a longer span than abstract
words (Brener, 1940).
All of these behavioural characteristics suggest that a mechanism
having the properties of the phonological lexicon is intimately
involved in (or, indeed, is the functional component responsible
for) repetition span. If this suggestion is correct, it would, of
course, follow that all those dysphasic patients who show impairment of the phonological lexicon should also exhibit a reduced
auditory-verbal repetition span. The available evidence (currently quite limited) suggests that they do (Allport, 1983b). If, on the
contrary, auditory-verbal short-term memory and the phonological
lexicon were functionally independent subsystems, there is no
obvious or compelling reason why these impairments should in fact
In the model I have put forward here, lesion of the phonological
lexicon must result in the reduced distinctiveness of the 'phonological' attribute domain. Consequently, for all such patients,
spoken word-lists are more acoustically (phonologically) similar.
For the same reason, as the dimensionality of the attribute-space
is reduced, so will be the strength of all those auto-associated
patterns (word-forms) that are defined over it. The effect will be
that, for such patients, previously familiar words behave more like
uncommon words or even like nonsense-syllables; their stability and
recoverability is diminished. In the simplest kind of matrix model
of list memory (Murdock, 1979), the signal-to-noise ratio (d') for
item information is equal to k/n, where k is the number of
dimensions or 'feature elements' in the attribute domain and n is the
number of items in the list. For a given d', the smaller (more
severely lesioned) the attribute-space, the fewer the list-items that
can be recalled. (Order information will show similar effects. The
coding of temporal sequence in distributed memory is briefly
discussed by Murdock, 1979, and by Kohonen et al, 1981.)
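The dependence of span on the size of the attribute-space can be written out as a short worked relation. It takes the ratio quoted from Murdock at face value and adds two assumptions that are mine, for illustration only: that recall requires d' to reach some criterion value, written d'_c, and that a lesion simply removes a fraction p of the k feature elements.

```latex
% Relation as quoted in the text (after Murdock, 1979):
% k = number of feature elements, n = number of items in the list.
\[
  d' = \frac{k}{n}
  \quad\Longrightarrow\quad
  n_{\max} = \frac{k}{d'_{c}}
\]
% Illustrative assumption: a lesion removes a fraction p of the k elements.
\[
  k_{p} = (1 - p)\,k
  \quad\Longrightarrow\quad
  n_{\max}(p) = \frac{(1 - p)\,k}{d'_{c}} = (1 - p)\,n_{\max}(0)
\]
```

On this reading, and purely as arithmetic, a fall in span from seven items to three would correspond to 1 - p = 3/7, that is, to the loss of rather more than half of the effective dimensions of the attribute-space.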
Traditionally, in psychology, the mechanisms of perception and
of memory have been studied in rather separate compartments. In
contrast, the way of thinking inspired by the conception of
distributed associative memory strongly discourages any such separation.
The case of auditory-verbal short-term memory perhaps provides
one example.

Concluding remarks
Few would claim that the currently available systems of classifying
dysphasic impairments are wholly adequate; still less that the
theoretical framework underlying such classification, and in terms
of which these impairments are to be understood (and remediated),
is satisfactory; or even that there is a coherent theoretical
framework at all. The traditional objective, of assigning dysphasic
patients to one or another of a set of mutually exclusive categories,
in practice results in most patients being categorized (if one is
honest) as 'mixed', a category that is of singularly little use either
as regards decisions about, or evaluation of, therapy.
In contrast, the 'modular subsystems' approach adopted increasingly by cognitive neuropsychologists, where it has been systematically applied, has shown the ability to provide not only a coherent
descriptive classification of impairments but to offer genuinely new
insights into the functional/causal relationships between them (e.g.
Patterson & Coltheart, 1984). Where the modular subsystems
approach, on its own, fails to provide insight, on the contrary, is
in the most commonplace character of dysphasic (and other neuropsychological) impairment: the so-called 'graceful degradation' of
performance, whereby particular linguistic functions are impoverished, slowed, subject to increased equivocation and error, rather
than simple, all-or-none loss of function.
Distributed associative memory provides a potential account of
these phenomena, as well as of many other fundamental properties
of normal memory retrieval. The claim of this paper is that these
two theoretical approaches precisely and necessarily complement
each other. We need to work on them both.
Since the potential of these approaches has, as yet, only begun
to be exploited, the future, for cognitive neuropsychology, of their
combined application is still wide open. The prospect, however,
looks encouraging.

1. These examples are all lexical (Allport & Funnell, 1981). Acquired
disorders in other aspects of language (syntax, prosody, semantics) can similarly
be thought of as disorders of memory retrieval.
2. In a representation such as a topographic map, for example, local contours
are shown explicitly, whereas (say) the visibility of one point from another is only
implicit in the representation: it must be derived by a further process of
inference. For further discussion on this and related issues of representation and
process, see Marr (1981), Chapter 1; Palmer (1978).
3. With notable exceptions, of course, including, e.g. Edelman &
Mountcastle, 1978; Schmitt et al, 1981; Hinton & Anderson, 1981.
4. The number of completely orthogonal (unrelated) patterns that can be
represented within a matrix is, of course, limited by the dimensionality (the
number of independent elements) in the matrix. For discussion of capacity
limitation in linear and non-linear matrix memories, see Willshaw, 1981.
5. Strictly, in simple matrix-memories, taking the dot-product. For a helpful
introduction see Murdock, 1979; elementary matrix algebra required.
6. There may well be impairments in still lower-level auditory (or
articulatory) attribute domains that need not be accompanied by impairments at
the word-form level.

Albert M L 1976 Short-term memory and aphasia. Brain and Language 3: 28-33
Allport A 1977 What level of detail for cognitive theories? AISB Quarterly 27: 10-13
Allport D A 1980 Patterns and actions: cognitive mechanisms are content-specific. In:
Claxton G (ed) Cognitive psychology; New directions. Routledge & Kegan Paul,
Allport D A 1983a Language and cognition. In: Harris R (ed) Approaches to
language. Pergamon, Oxford
Allport D A 1983b Auditory-verbal short-term memory and conduction aphasia. In:
Bouma H, Bouwhuis D (eds) Attention & Performance 10. Erlbaum, Hillsdale NJ.
Allport D A 1983c Speech production and comprehension: one lexicon or two? In:
Prinz W & Sanders A F (eds) Cognition and motor processes. Springer, Berlin
Allport D A, Funnell E 1981 Components of the mental lexicon. Philosophical
Transactions of the Royal Society (London) B295: 397-410
Anderson J A 1973 A theory for the recognition of items from short memorized lists.
Psychological Review 80: 417-438
Anderson J A 1977 Neural models with cognitive implications. In: LaBerge D,
Samuels S J (eds) Basic processes in reading. Erlbaum, Hillsdale NJ
Atkinson R C, Shiffrin R M 1971 The control of short-term memory. Scientific
American 225: 82-90
Ballard D H, Hinton G E, Sejnowski T J 1983 Parallel visual computation. Nature
306: 21-26
Barlow H 1972 Single units and perception: a neuron doctrine for perceptual
psychology. Perception 1: 371-394
Brener R 1940 An experimental investigation of memory span. Journal of
Experimental Psychology 26: 467-482
Caramazza A, Berndt R S 1978 Semantic and syntactic processes in aphasia: a review
of the literature. Psychological Bulletin 85: 898-918
Cermak L S, Craik F I M (eds) 1979 Levels of processing in human memory.
Erlbaum, Hillsdale NJ
Collins A M, Loftus E F 1975 A spreading-activation theory of semantic processing.
Psychological Review 82: 407-428
Coltheart M, Patterson K, Marshall J C (eds) 1980 Deep dyslexia. Routledge &
Kegan Paul, London
Cowey A 1981 Why are there so many visual areas? In: Schmitt F O, Worden F G,
Adelman G, Dennis S G (eds) The organization of the cerebral cortex. MIT Press,
Cambridge MA
Crowder R G 1976 Principles of learning and memory. Erlbaum, Hillsdale NJ
Edelman G M 1981 Group selection as the basis for higher brain function. In: Schmitt
F O, Worden F G, Adelman G, Dennis S G (eds) The organization of the cerebral
cortex. MIT Press, Cambridge MA
Edelman G M, Mountcastle V B (eds) 1978 The mindful brain: cortical organization
and the group-selective theory of higher brain function. MIT Press, Cambridge
Ellis A W 1982 Spelling and writing (and reading and speaking). In: Ellis A W (ed)
Normality and pathology in cognitive functions. Academic Press, London
Fahlman 1979 NETL: a system for representing and using real-world knowledge.
MIT Press, Cambridge MA
Fahlman S E, Hinton G E, Sejnowski T J 1983 Massively parallel architectures for
AI: Netl, Thistle, and Boltzmann machines. Proceedings of the National
Conference on Artificial Intelligence, Washington, DC
Finger S, Stein D G 1982 Brain damage and recovery. Academic, New York.
Fodor J A 1982 The modularity of mind.
Freud S 1891 Zur Auffassung der Aphasien. Deuticke, Vienna
Gardner H 1973 The contribution of operativity to naming capacity in aphasic
patients. Neuropsychologia 11: 213-220
Glenberg A M, Kraus T A 1981 Long-term recency is not found on a recognition test.
Journal of Experimental Psychology: Learning, Memory and Cognition
7: 475-479
Gordon B 1982 Confrontation naming: computational model and disconnection
simulation. In: Arbib M A, Caplan D, Marshall J C (eds) Neural models of
language processes. Academic, New York
Hinton G E 1981 Implementing semantic nets in parallel hardware. In: Hinton G E,
Anderson J A (eds) Parallel models of associative memory. Erlbaum, Hillsdale NJ
Hinton G E, Anderson J A (eds) 1981 Parallel models of associative memory.
Erlbaum, Hillsdale NJ
Hebb D O 1949 The organization of behavior. Wiley, New York
Heilman K M, Scholes R, Watson R T 1976 Defects of immediate memory in Broca's
and Conduction aphasia. Brain and Language 3: 201-208
Hunt R R, Elliott J M 1980 The role of nonsemantic information in memory:
orthographic distinctiveness effects on retention. Journal of Experimental
Psychology: General 109: 49-74
Kay R H 1982 Hearing of modulation in sounds. Physiological Reviews 62: 894-975
Kinsbourne M, Wood F 1982 Theoretical considerations regarding the
episodic-semantic memory distinction. In: Cermak L S (ed) Human memory and
amnesia. Erlbaum, Hillsdale NJ
Klatt D H 1980 Speech perception: a model of acoustic-phonetic analysis and lexical
access. In: Cole R A (ed) Perception and production of fluent speech. Erlbaum,
Hillsdale NJ
Kohonen T 1977 Associative memory: a system-theoretical approach. Springer,
Kohonen T, Lehtio P, Oja E 1981 Storage and processing of information in
distributed associative memory systems. In: Hinton G E, Anderson J A (eds)
Parallel models of associative memory. Erlbaum, Hillsdale NJ
Lesser R 1978 Linguistic investigations of aphasia. Arnold, London
Marin O S M, Saffran E M, Schwartz M 1976 Dissociations of language in aphasia:
implications for normal function. Annals of the New York Academy of Sciences
280: 868-884
Marr D 1981 Vision. Freeman, San Francisco
Marr D, Nishihara H K 1978 Visual information processing: Artificial Intelligence
and the sensorium of sight. Technology Review 81: 2-23
McClelland J L, Rumelhart D E 1981 An interactive activation model of context
effects in letter perception: Part 1. An account of basic findings. Psychological
Review 88: 375-407
Miller G A 1956 The magical number seven, plus or minus two: some limits on our
capacity for processing information. Psychological Review 63: 81-97
Minsky M 1979 The Society theory of thinking. In: Winston P H, Brown R H (eds)
Artificial Intelligence: an MIT perspective. MIT Press, Cambridge MA
Monsell S 1983 Components of working memory underlying verbal skills: a
'distributed capacities' view. In: Bouma H, Bouwhuis D (eds) Attention and
Performance 10. Erlbaum, Hillsdale NJ
Morton J 1980 The Logogen model and orthographic structure. In: Frith U (ed)
Cognitive processes in spelling. Academic, London
Mountcastle V B 1978 An organizing principle for cerebral function: the unit module
and the distributed system. In: Edelman G M, Mountcastle V B (eds) The
mindful brain: cortical organization and the group-selective theory of higher brain
function. MIT Press, Cambridge MA
Murdock B B 1979 Convolution and correlation in perception and memory. In:
Nilsson L G (ed) Perspectives on memory research. Erlbaum, Hillsdale NJ
Newcombe F, Ratcliff G 1979 Long-term psychological consequences of cerebral
lesions. In: Gazzaniga M S (ed) Handbook of behavioral neurobiology, Vol 2:
Neuropsychology. Plenum, New York
Norman D A (ed) 1981 Perspectives in cognitive science. Erlbaum, Hillsdale NJ
Palmer S E 1978 Fundamental aspects of cognitive representation. In: Rosch E,
Lloyd B (eds) Cognition and categorization. Erlbaum, Hillsdale NJ
Patterson K, Coltheart M 1984 Acquired disorders of reading: a psycholinguistic
description. In: Oxbury J, Whurr R, Wyke M, Coltheart M (eds) Aphasia.
Butterworth, London
Perrett D I, Rolls E T, Caan W 1982 Visual neurones responsive to faces in the
monkey temporal cortex. Experimental Brain Research 47: 329-342
Posner M I, Keele S W 1970 Retention of abstract ideas. Journal of Experimental
Psychology 83: 304-308
Ratcliff R 1978 A theory of memory retrieval. Psychological Review 85: 59-108
Schmitt F 0, Worden F G, Adelman G, Dennis S G (eds) 1981 The organization of
the cerebral cortex. MIT Press, Cambridge MA
Shallice T 1979 Case study approach in neuropsychological research. Journal of
Clinical Neuropsychology 1: 183-211
Shallice T, Warrington E K 1970 Independent functioning of verbal memory stores: a
neuropsychological study. Quarterly Journal of Experimental Psychology
22: 261-273
Shallice T, Warrington E K 1977 Auditory-verbal short-term memory impairment
and conduction aphasia. Brain and Language 4: 479-491
Solso R L, McCarthy J E 1981 Prototype formation of faces. British Journal of
Psychology 72: 499-503
Taylor A M, Warrington E K 1973 Visual discrimination in patients with localized
cerebral lesions. Cortex 9: 82-93
Tulving E 1983 Elements of episodic memory. Clarendon Press, Oxford
Warrington E K 1975 The selective impairment of semantic memory. Quarterly
Journal of Experimental Psychology 27: 635-657
Warrington E K, Shallice T 1969 The selective impairment of auditory verbal
short-term memory. Brain 92: 885-896
Willshaw D 1981 Holography, associative memory and inductive generalization. In:
Hinton G E, Anderson J A (eds) Parallel models of associative memory. Erlbaum,
Hillsdale NJ
Wood C C 1978 Variations on a theme of Lashley: Lesion experiments on the neural
model of Anderson, Silverstein, Ritz and Jones. Psychological Review 85: 582-591
Wood C C 1982 Implications of simulated lesion experiments for the interpretation of
lesions in real nervous systems. In: Arbib M A, Caplan D, Marshall J C (eds)
Neural models of language processes. Academic, New York

B. Butterworth

Jargon aphasia: processes and

Jargon is a rare and spectacular manifestation of an aphasic
condition. Critchley (1970) defined it as 'a type of speech impairment whereby the patient emits a profusion of utterances, most of
which are incomprehensible to the hearer, though not perhaps to
the speaker.' Words are often quite inappropriate in context; some
words are not to be found in the dictionary; syntax is frequently
odd and erroneous; empty phrases and circumlocutions abound.
Not infrequently, this speech is diagnosed as demented and it is
only through the happy intervention of a knowledgeable doctor,
speech therapist or psychologist that the patient is saved from
psychiatric help. Take the case of Mr K. 'He enjoyed good health
until the winter of 1970 when, at the age of 76, his language
behaviour suddenly became grossly abnormal. This occurred to
such an extent that his next of kin thought he had just been struck
by sudden madness and decided, somewhat hastily, to have him
interned in a lunatic asylum. Mr K. understood the meaning of this
decision and resented it; indeed he never forgot or forgave although
he later agreed that his verbal protests could hardly have helped.
After a week or so at the asylum, he had the good fortune of being
visited by a knowledgeable intern. As a consequence he was transferred to the aphasia unit of la Salpetriere where clinical manifestations of a left posterior sylvian softening were observed.' (Lecours
et al, 1981: Case No. 1).
In this chapter, I will describe the typical disorders of sentence
construction and of words found in jargon, and offer some suggestions as to the deficits and compensatory strategies that produce
them. In particular, I shall explore the idea, first hinted at by Freud
(1891), that these patients suffer from no loss of grammatical or
lexical knowledge.
The most striking cases, those containing neologisms, are
encountered only infrequently: in a sample of 420 aphasics, Kertesz