
Proceedings of the

Berkeley, California

August 19-21, 1981

The Conference is supported in part by a grant from The Alfred P. Sloan Foundation.

Constraint, Construal, and Cognitive Science 1
Robert P. Abelson

A Model for Planning in Everyday Situations 11
Robert Wilensky

A Commitment-Based Framework for Describing Informal Cooperative Work 17

Richard E. Fikes

A Cognitive Science Approach to Improving Planning 23

Barbara Hayes-Roth

Everyday Problem Solving 29

James A. Levin

Marriage is a Do-It-Yourself Project: The Organization of Marital Goals 31

Naomi Quinn


Transformational Structure and Perceptual Organization 41
Stephen E. Palmer

Rational Processes in Perception 50

Alan Gilchrist, Irvin Rock

The Role of Spatial Working Memory in Shape Perception 56

Geoffrey E. Hinton

Color Perception and the Meanings of Color Words 61

Paul Kay

Structure and Function in the Early Processing of Visual Information 65

Shimon Ullman

Affect, Emotion, and Other Cognitive Curiosities 75
George Mandler

Affect and Memory Representation 78

Wendy G. Lehnert

Situation Based Emotion Frames and the Cultural Construction of Emotions 84

Catherine Lutz

Disentangling the Affective Lexicon 90

Andrew Ortony, Gerald L. Clore

Affect and the Perception of Risk 96

Amos Tversky, Eric J. Johnson

Generative Analogies as Mental Models 97
Dedre Gentner

The Role of Experience in Models of the Physical World 101

Andrea A. diSessa

The Form and Function of Mental Models 103

P. N. Johnson-Laird

Learning Through Growth of Skill in Mental Modeling 106

Jill H. Larkin, Herbert A. Simon

Interestingness and Memory for Stories 113
John B. Black, Steven P. Shwartz, Wendy G. Lehnert

Use of Goal-Plan Knowledge in Understanding Stories 115

Edward E. Smith, Allan M. Collins

Inferences in Story Comprehension 117

Raymonde Guindon (sponsored by Alice Healy)

Self-Embedding is Not a Linguistic Issue 121

John M. Carroll

Center-Embedding Revisited 123

Michael B. Kac

The Comprehension of Focussed and Non-Focussed Pronouns 125

Jeanette K. Gundel, Deborah A. Dahl

The Natural Natural Language Understander 128

Henry Hamburger, Stephen Crain

Why Do Children Say "Goed"? - A Computer Model of Child Generation 131

Mallory Selfridge (sponsored by Katherine Nelson)

Five Experiments in the Development of the Early Infant Object Concept 133
George F. Luger, T. G. R. Bower, Jennifer G. Wishart

Analogy Generation in Scientific Problem Solving 137

John Clement

Understanding Design 141

Gerhard Fischer, Heinz-Dieter Boecker

System Properties of American Law: Constraints on Applying Cognitive Science. ... 143
Janet L. Lachman, Roy Lachman

Writing with a Computer 145

Ira Goldstein

Creating Pleasant Programming Environments for Cognitive Science Students 148

M. Eisenstadt, J. H. Laubsch, J. H. Kahney

New Tools for Cognitive Science 150
Leonard Friedman

Cultural Constraints on Cognitive Representation 152

Alf Zimmer

Some Thoughts on Logic, Language, Mind and Reality 156

Rachel Joffe Falmagne

Time and Cognition: The Domestication of the Maya Mind 159

Hugh Gladwin

Why Some Modified Class Inclusion Tasks are Easy for Young Children:
A Process Model for Finding Referents of Labels in Arrays 163
Adele A. Abrahamsen

MOPs and Learning 166

Roger C. Schank

Retrieving General and Specific Information from Stored Knowledge of Specifics. . . 170
James L. McClelland

Can If be Formally Represented? 173

D. S. Bree

To See And Not to See, That is the Question 177

E. Andreewsky, A. Andreewsky, G. Deloche, D. Bourcier

E/Motional Memory as a Mediating Construct in the Study of Person/Environment
Interaction 179
Marisa Zavalloni

The Perception of Disoriented Complex Objects 181

Steven P. Shwartz (sponsored by Stephen Kosslyn)

The Role of Intrinsic Axes in Shape Recognition 184

Marianne Wiser (sponsored by Mary Potter)

Relations Between Schemata-Based Computational Vision and Aspects of Visual
Attention 187
Roger A. Browse (sponsored by D. Kahneman)

Growing Schemas Out of Interviews 191

Jerry R. Hobbs, Michael H. Agar

Shaping Explanations: Effects of Questioning on Text Interpretation 193

Richard H. Granger, Jr. (sponsored by Douglas R. White)

The Need for Context in Event Identity 197

John M. Morris

Point of View in Problem Solving 200

Edwin L. Hutchins, James A. Levin

Representing Problem-Solving Episodes 203

Arthur M. Farley, David L. McCarty

Where Process- and Measurement Models Meet:
Evaluation of States in Problem Solving 207
Jan Drosler (sponsored by Walter Kintsch)

Positive Affect and Creative Problem Solving 210

Alice M. Isen, Gary P. Nowicki

Memory in Story Invention 213

Natalie Dehn (sponsored by Robert Abelson)

Recognizing Thematic Units in Narratives 216

Brian J. Reiser, Wendy G. Lehnert, John B. Black (sponsored by Andrew Ortony)

Using Qualitative Simulation to Generate Explanations 219

Kenneth Forbus, Albert Stevens

Incentive and Cognitive Processing 222

Michael W. Eysenck

The Role of TAUs in Narratives 225

Michael G. Dyer (sponsored by Wendy Lehnert)

Controlling Parsing by Passing Messages 228

Brian Phillips, James Hendler

A Parser with Something for Everyone 231

Eugene Charniak

Thought Sequences and the Language of Consciousness 234

Benny Shannon

What Conditions Should a Theory of Consciousness meet? 236

Bernard J. Baars

The Role of the Observer in Cognitive Science 238

Richard Hammersley (sponsored by Donald Norman)

Are We Ready for a Cognitive Engineering? 240

S. K. Card, A. Newell

Knowledge Structures of Computer Programmers 243

Beth Adelson (sponsored by Marilyn Shatz)

GLISP: An Efficient, English-Like Programming Language 249

Gordon S. Novak, Jr.

Some Questions About Verbal and Pictorial Representation 252

Sheldon Richmond

The Relationship Between Human Motion and Objects in the Cognitive
Representation of Visual Events 255
Margot D. Lasher

Toward a Model of Cognitive Process in Cartoon Comprehension 261

Michael E. J. Masson, Inga Boehler (sponsored by Peter Polson)

Demon Timeouts
Limiting the Life Span of Spontaneous Computations in Cognitive Models .... 263
Steven Small (sponsored by Chuck Rieger)

The Effects of Integrated Knowledge on Fact Retrieval and Consistency Judgments:
When Does it Help, and When Does it Hurt? 265
Lynne M. Reder, Brian H. Ross

A Question Answering Method of Exploring Prose Comprehension: An Overview . . . 268

Arthur C. Graesser

Role of Context in Cognitive Development 270

Gideon Carmi

Concept Formation Through the Interaction of Multiple Models 271

Mark H. Burstein (sponsored by Christopher Riesbeck)

A Simulated World for Modeling Learning and Development 274

Pat Langley, David Nicholas, David Klahr, Greg Hood

Why Do We Do What We Do? 277

James A. Galambos, John B. Black (sponsored by Lance Rips)

An Activation-Trigger-Schema Model for the Simulation of Skilled Typing 281

David E. Rumelhart, Donald A. Norman

Toward a Formal Theory of Human Plausible Reasoning 283

Allan Collins, Ryszard S. Michalski

Skills, Learning and Parallelism 284

Aaron Sloman

Cognitive Style and the Learning of a First Computer Language 286

M. J. Coombs (sponsored by R. Wilensky)

The Concept of E-Machine: On Brain Hardware and the Algorithms of Thinking.... 289
Victor Eliashberg

Invariance Hierarchies in Metaphor Interpretation 292

Jaime G. Carbonell (sponsored by Philip Hayes)

Dual-System Processing in Language Comprehension 296

Jon M. Slack

Metaphor as Nonliteral Similarity 299

Odella E. Schattin, Hartvig Dahl (sponsored by Virginia Teller)

A Theory of Intelligence 301

P. J. vanHeerden

A Tetragenic Frame for Modelling Unmediated Knowledge Acquisition 303

Egon E. Loebner

Surrealistic Imagery and the Calculation of Behavior 307

Sheldon Klein, David A. Ross, Mark S. Manasse,
Johanna Danos, Mark S. Bickford, Walter A. Burt, Kendall L. Jensen

Learning with Understanding 310
James G. Greeno

The Growth of Number Representation: Successive Levels of Schematic Learning. . . 313

Lauren B. Resnick

The Role of Experiences and Examples in Learning Systems 317

Edwina L. Rissland, Oliver G. Selfridge, Elliot M. Soloway

I-Interruption Effects in Backward Pattern Masking:
The Neglected Role of Fixation Stimuli 320
J. Pascual-Leone, J. Johnson, D. Goodman, D. Hameluck, L. H. Theodor

Cognitive Load Affects Early Brain Potential Indicators of Perceptual Processing. . . . 32

Francois Richer, Jackson Beatty

The Spatial Representation and Processing of Information in Cognition 326

Gary W. Strong, Bruce A. Whitehead (sponsored by John Holland)

Constraint, Construal, and Cognitive Science

Robert P. Abelson
Yale University

Cognitive science has barely emerged as a discipline -- or an interdiscipline, or
whatever it is -- and already it is having an identity crisis.

Within us and among us we have many competing identities. Two particular prototypic
identities cause a very serious clash, and I would like to explicate this conflict and
then explore some areas in which a fusion of identities seems possible. Consider the
two-word name "cognitive science". It represents a hybridization of two different
impulses. On the one hand, we want to study human and artificial cognition, the
structure of mental representations, the nature of mind. On the other hand, we want to
be scientific, be principled, be exact. These two impulses are not necessarily
incompatible, but given free rein they can develop what seems to be a diametric
opposition.

The study of the knowledge in a mental system tends toward both naturalism and
phenomenology. The mind needs to represent what is out there in the real world, and it
needs to manipulate it for particular purposes. But the world is messy, and purposes
are manifold. Models of mind, therefore, can become garrulous and intractable as they
become more and more realistic. If one's emphasis is on science more than on cognition,
however, the canons of hard science dictate a strategy of the isolation of idealized
subsystems which can be modeled with elegant productive formalisms. Clarity and
precision are highly prized, even at the expense of common sense realism. To caricature
this tendency with a phrase from John Tukey (1959), the motto of the narrow hard
scientist is, "Be exactly wrong, rather than approximately right".

The one tendency points inside the mind, to see what might be there. The other points
outside the mind, to some formal system which can be logically manipulated (Kintsch et
al., 1981). Neither camp grants the other a legitimate claim on cognitive science. One
side says, "What you're doing may seem to be science, but it's got nothing to do with
cognition." The other side says, "What you're doing may seem to be about cognition, but
it's got nothing to do with science."

Superficially, it may seem that the trouble arises primarily because of the two-headed
name cognitive science. I well remember the discussions of possible names. Even though
I never liked "cognitive science", the alternatives were worse: abominations like
"epistology" or "representonomy".

But in any case, the conflict goes far deeper than the name itself. Indeed, the
stylistic division is the same polarization that arises in all fields of science, as
well as in art, in politics, in religion, in child rearing -- and in all spheres of
human endeavor. Psychologist Silvan Tomkins (1965) characterizes this overriding
conflict as that between characterologically left-wing and right-wing world views. The
left-wing personality finds the sources of value and truth to lie within individuals,
whose reactions to the world define what is important. The right-wing personality
asserts that all human behavior is to be understood and judged according to rules or
norms which exist independent of human reaction. A similar distinction has been made by
an unnamed but easily guessable colleague of mine, who claims that the major clashes in
human affairs are between the "neats" and the "scruffies". The primary concern of the
neat is that things should be orderly and predictable while the scruffy seeks the
rough-and-tumble of life as it comes.

I am exaggerating slightly, but only slightly, in saying that the major disagreements
within cognitive science are instantiations of a ubiquitous division between neat
right-wing analysis and scruffy left-wing ideation. In truth there are some signs of an
attempt to fuse or to compromise these two tendencies. Indeed, one could view the
success of cognitive science as primarily dependent not upon the cooperation of
linguistics, AI, psychology, etc., but rather, upon the union of clashing world views
about the fundamental nature of mentation. Hopefully, we can be open minded and
realistic about the important contents of thought at the same time we are principled,
even elegant, in our characterizations of the forms of thought.

The fusion task is not easy. It is hard to neaten up a scruffy or scruffy up a neat. It
is difficult to formalize aspects of human thought which are variable, disorderly, and
seemingly irrational, or to build tightly principled models of realistic language
processing in messy natural domains. Writings about cognitive science are beginning to
show a recognition of the need for world-view unification, but the signs of strain are
clear. Consider the following passage from a recent article by Frank Keil (1981) in
Psychological Review, giving background for a discussion of his formalistic analysis of
the concept of constraint:

"Constraints will be formal restrictions that limit the class of logically possible
knowledge structures that can normally be used in a given cognitive domain." (p. 198).

Now, what is the word "normally" doing in a statement about logical possibility? Does
it mean that something which is logically impossible can be used if conditions are not
normal? This seems to require a cognitive hyperspace where the impossible is possible.

It is not my intention to disparage an author on the basis of a single statement
infelicitously put. I think he was genuinely trying to come to grips with the reality
that there is some boundary somewhere to the penetration of his formal constraint
analysis into the vicissitudes of human affairs. But I use the example as symptomatic
of one kind of approach to the cognitive science fusion problem: you start from a neat,
right-wing point of view, but acknowledge some limited role for scruffy, left-wing
orientations. The other type of approach is the obvious mirror: you start from the
disorderly left-wing side and struggle to be neater about what you are doing. I prefer
the latter approach to the former. I will tell you why, and then lay out the beginnings
of such an approach.

The strategy of trying to move leftward from the right suffers from a seemingly
permanent limitation on the kinds of content and process you are
willing to consider. If what really matters to you is the formal tractability of the
domain of investigation, then your steps are likely to be small and timid. Recent
history in several social and behavioral science areas makes this quite clear.

In cognitive anthropology, there was a great deal of fascination 25 years ago with the
orderliness of systems for kinship terminology. Kin terms in different societies were
found to be precisely describable by concatenations of the values on a handful of
well-specified components such as sex and generation. Formal mini-models captured these
regularities elegantly. Originally it was thought that this kind of componential
semantics held great promise for the analysis of culture and language in general, but
gradually it was realized that outside of kinship terms and pronoun systems, precious
little else in the language of any society was ordered so neatly. Faith in tight
componential analysis has largely been abandoned.

Rational decision theory has until recently had a tight hold on the views of economists
and some psychologists of the way people made decisions. The typical rational decision
model specifies a set of uncertain outcomes, with each of which is associated a
probability and a utility. Choices among ensembles of outcomes are then said to be
predictable on the basis of a strict composition rule on the probabilities and
utilities. The only trouble is, the behavior of human subjects overwhelmingly disobeys
the predictions of the models, no matter how hard the axioms try. There have been
numerous attempts to rescue the general framework, including the clever strategy of
training subjects to obey the rational model following initial demonstrations of
deviation from it. I would recommend this device also to people promoting competence
models of syntax in the face of incompetent performers, except that I cannot as a
psychologist bring myself to believe that it tells us anything about human psychology.
In any case, there are so many violations of rational decision theory that it is a
clear failure as a descriptive or explanatory psychological model. Only an approach
that deals directly with observed decision phenomena (for example, the work of Tversky
and Kahneman (1980)) has a chance of success. (For fuller reviews of this field, see
Einhorn and Hogarth (1981), March (1978), and Abelson & Levi (Note 1).)

Other examples of excessive faith in the unaided power of formalisms to subdue the
beast of psychological explanation could be adduced from within experimental psychology
itself. A good case from some years back is provided by stochastic learning models
(Bush & Mosteller, 1955), which were extremely rich as mathematical objects, but turned
out to have applicability to a very small range of problems, indeed. Models like this
were part of the "bottom up" tradition of doing science within experimental psychology,
the belief that by starting with very tightly controlled, limited, and isolated
laboratory phenomena, one could gradually explicate the operation of the whole
organism. This tradition is of course still strongly honored by many experimental
psychologists, but I think that those psychologists interested in cognitive science
have largely departed from that tradition, at least in its most extreme form. In the
service of studying more important and more general phenomena than those falling within
the formal boundaries of mini-models, cognitive science psychologists have been willing
to use messier stimulus materials and at least contemplate non-laboratory
methodologies. The way is not easy, and there is much anguishing. That, I claim, is the
price of trying to move leftward from a right-wing starting point.

Linguists, by and large, are farther away from a cognitive science fusion than are
cognitive psychologists. The belief that formal semantic analysis will prove central to
the study of human cognition suffers from the touching self-delusion that what is
elegant must perforce be true and general. Intense study of quantification and truth
conditions because they provide a convenient intersection of logic and language will
not prove any more generally informative about the huge range of potential uses of
language than the anthropological analysis of kinship terms told us about culture and
language. On top of that, there is the highly restrictive tradition of defining the
user of language as a redundant if not defective transducer of the information to be
found in the linguistic corpus itself. There is no room in this tradition for the human
as inventor and changer and social transmitter of linguistic forms, and of contents to
which those forms refer. To try to understand cognition by a formal analysis of
language seems to me like trying to understand baseball by an analysis of the physics
of what happens when an idealized bat strikes an idealized baseball. One might learn a
lot about possible trajectories of the ball, but there is no way in the world one could
ever understand what is meant by a double play or a run or an inning, much less the
concept of winning the World Series. These are human rule systems invented on top of
the physical possibilities of the batted ball, just as there are human rule systems
invented on top of the structural possibilities of linguistic forms. One can never
infer the rule systems from a study of the forms alone.

Well, now I have stated a strong preference against trying to move leftward from the
right. What about the other? What are the difficulties in starting from the scruffy
side and moving toward the neat? The obvious advantage is that one has the option of
letting the problem area itself, rather than the available methodology, guide us about
what is important. The obstacle, of course, is that we may not know how to attack the
important problems. More likely, we may think we know how to proceed, but other people
may find our methods sloppy. We may have to face accusations of being ad hoc, and
scientifically unprincipled, and other awful things.

Sometimes we worry about such matters ourselves. There is a neat person struggling to
get into every scruffy person (just as there is a scruffy person struggling to get out
of every neat person). What is required is that we act on our worries, that we try to
take the criticisms seriously and see what can be done about them. The messy intuitions
and theories, albeit they concern very general and important problems (God bless 'em),
need to be articulated and developed in a more orderly way.

I will take the work of the Yale Artificial Intelligence Project, and in particular,
the programmatic statements in the Schank and Abelson (1977) book as a point of
departure. The Yale point
of view is quintessentially scruffy, and has been criticized accordingly. No matter
that scripts and plans and goals and themes are psychologically reasonable, and that
computer programs using such concepts are operational at the frontier of realistic
processing of natural language; nevertheless, it is said that the system of concepts is
not formal enough.

To make a system more formal is to define its concepts more precisely, and to have them
enter into general predictions and explanations according to a set of principles,
preferably a small elegant set. Let me first address the question of the definition of
concepts. The Yale group deals with knowledge structures such as scripts and plans; it
may seem at first that these are pretty amorphous entities. What counts, say, as a
script or as clearly not a script? How can you tell? When we gave examples such as the
restaurant script, and cognitive psychological experiments (e.g., Bower, Black &
Turner, 1979; Galambos & Rips, 1979; Graesser, 1981) forthwith used verbal stimuli from
the restaurant situation, along with doctor visits and laundromat activities and so on,
it may have produced misleading impressions that the intention was to define scripts
solely by waving at passing examples, or perhaps by writing down a definitive list of
111 of them, or worse, by allowing any damn thing to be a script just by calling it
that.

I say that these impressions are misleading because in fact we have become acutely
aware (Schank, Note 2) that scripts have been loosely used. The original intention was
not at all to create a haven for loose concepts; in fact, scripts (among other
knowledge structures) were very tightly defined, by a set of interdependent
constraints. Indeed, if a knowledge structure is proposed as crucial in the top-down
processing of certain inputs, then clearly it must embody [...] of us to leave these
constraints largely implicit, rather than spelling them out systematically.

It is not my main intention today to remedy this neglect for scripts or any other
specific type of knowledge structure, but rather to make clear my general view of the
role that constraints play in the process of understanding text. However, having raised
the issue, it is useful to begin by indicating what constrains script structures.
Related remarks apply to related types of structures such as MOPs (Schank, 1980) and
metascripts (Abelson, 1981).

The casual definition of a script is "a stereotyped sequence of events familiar to the
individual". Implicit in this definition are two powerful sources of constraint. One is
the notion of an event sequence, which implies the causal chaining of enablements and
results for physical events and of initiations and reasons for mental events. Causal
chains are highly ruleful, and many of those rules have been spelled out explicitly
(Schank, 1975; Schank and Abelson, 1977, Ch. 2). The other constraint generator comes
from the ideas of stereotypy and familiarity. That an event sequence is stereotyped
implies the absence of fortuitous events. Also, for events to be often repeated implies
that there is some set of standard individual and institutional goals which gives rise
to the repetition. Furthermore, there are almost certainly subgoals, each of which
defines a scene involving a transaction between particular role players in a certain
physical setting, using given props.

At the scene level of a script, therefore, there run in parallel four networks of
coherences: those governing the transactions, the role players, the physical settings,
and the props. None of these entities can enter into sequences arbitrarily. Scene
transitions between one physical setting and another, for example, follow the
topographical rules of familiar environments. One does not step off the airplane
directly into a swimming pool, or go from the doctor's waiting room into the kitchen.
Role players remain from scene to scene except when somebody makes a purposive and
expected entry or exit. It does not require belaboring these coherences in full detail
to realize the enormous degree of constraint thus imposed on input relevant to any
given script.

Perhaps one of the things that disguised the high order of systematicity of scripts was
that some of the computer programs that used them, such as SAM (Cullingford, Note 3),
were written in a way that did not insure against ad hoc violations of some of the
constraints. A prankish programmer could perfectly well prepare an expected event
sequence wherein the customer ate the check and gave the food to the cashier, thus
thwarting their mutual goals, and nothing in the Script Applier would protest the
illegitimacy of such expectations. Of course they would turn out by experience to be
useless in matching realistic inputs, but that is a very weak way to recognize
absurdity. (And it is still vulnerable to the possibility of prankish inputs.) Later
programs such as POLITICS (Carbonell, 1978) did not, by the way, suffer the same degree
of vulnerability, but this whole issue has not been treated as explicitly as it might
be.

Why is any of this important? Well, there may be some people who feel it is not
important, that there are more compelling issues for language AI to worry about. But it
bothers me that the concept of structural constraint seems to have been coopted by the
neats, when all the while the scruffy Yale programs are based very heavily on a whole
series of implicit constraints.

Let us look more closely at some general issues pertaining to the idea of constraint.
In the original formulation of information theory and communication theory, the
structural constraints on the communicative elements were presumed mutually accessible
to the sender and receiver of messages. They each knew the redundancies of letter
strings or phoneme strings, and this consensus was the basis for an analysis of the
information content of messages. In effect, one could ignore most of the properties of
the receiver, and concentrate the analysis on the properties of the stimulus ensemble.

Nowadays the emphasis in cognitive science is on chunks of meaning, and one cannot
generate meaning simply by higher-order approximations to the structure of low-level
stimulus elements. The idea that the set of possible messages is very much constrained
is still a powerful idea, but at least two drastic changes are necessary in applying an
information theoretic type of analysis to higher-level meaning elements, say,
sentences, rather than to low-level stimulus elements such as letters. For one thing,
the number of possible elements is infinite rather than finite. For another, there is
no guarantee at all that the receiver of messages adequately comprehends the structure
of contingencies between sentences that can possibly be generated by the message
source.

People in the Chomskyan tradition writing about constraints in knowledge structures do
not usually distinguish between structural constraint intrinsic to the stimulus
ensemble and structural constraint characterizing the receiver's construal of the
stimulus possibilities. The former focuses the analysis of structure strictly on
analysis of language, completely defining the psychology of the receiver out of
independent existence. This is a very right-wing attitude, in the sense I have
previously discussed. People are said to be the way they are because of immutable
external regularities. There is little interest in studying learning, or human error,
or individual differences in intelligence.

Furthermore, there is total disregard of cultural shaping of knowledge structures. That
is, even in cases where there is a structural match between the semantics of the
language and the corresponding mental representation in a particular domain, this match
may have been produced by a process of cultural invention rather than by the inevitable
emergence of a natural truth. Much social knowledge pertains to what anthropologists
(cf. D'Andrade, Note 4) call constitutive rule systems, extensive networks of how to
construe, how to behave in, and even how to feel about culturally defined situations.
The nexus of rules defining the meaning of marriage is one example. Other examples of
cultural rule systems are mental illness, senior citizenship, golf, and sexual
harassment. It seems to me perfectly obvious that there is no foreordained meaning for
any of these domains, or a thousand others, but rather a meaning which evolves under
the pressure of social, political and economic motives and experiences. I belabor the
banal here because of recently renewed claims that to know knowledge, one only need
know semantics.

Having thus gored the one-horned ox, let me try to lay out a balanced view of one
aspect of the interplay between mental representations and stimulus structures. I will
place the argument in the context of text understanding.

Consider an individual who is exposed to a string of language, presented one chunk at a
time, say, sentence by sentence. Considerable constraint will be imposed by the general
context surrounding the presentation of the string, say, whether it is a story or a
piece of conversation, and what the nature of the topic and style of presentation is.
The local context operative at a particular place within the string will exercise
further constraint. Is there a way to conceptualize a measure of the degree of
structuredness at any given place in the presentation

I propose a characterization relating the structuredness of a context to the constraint
in the stimulus string and something I will call the construal function on the part of
the understander. The constraint in the stimulus string can be expressed by the
distribution of probabilities P(i) of occurrence over all potential next stimulus
chunks. If a few inputs are moderately likely and all others are of very low
probability, the stimulus contains more structure than if likelihood is fairly evenly
distributed over a large set of possibilities. This is as in standard information
theory.

But the understander may not extract from the stimulus the available structure. The
individual has expectations of what sorts of things may occur next. If something which
is highly unexpected occurs, it is difficult to process. In general, we may imagine
that there is a distribution of measures of processing facility F(i) over all possible
next stimulus chunks. Under various different construals of what the stimulus string is
about, thus what expectations are appropriate, the distribution of facility measures
will be different. Perhaps processing facility could be operationalized as the speed
with which a given chunk can be comprehended, or perhaps in some more subtle way, but
in any case the understander is conceptualized as having the capacity to prepare for
coming inputs by making a differential allocation of facility measures F(i). This is a
much more realistic view of the understander than assuming that he has no expectations
at all, thus relatively equal facility in accepting all inputs, or only a single
dominant expectation. Artificial intelligence programs that work heavily top-down
always in effect smear their expectations over a domain of related possibilities. A
good image for this smearing tendency is Chuck Rieger's concept of the "expectancy
cloud".

The average value of processing facility is the sum of cross-products of stimulus
probability P(i) and facility measure F(i) over all possible input chunks. This average
facility will be high or low depending on four things: (1) the general simplicity of
the context; (2) the general facility of the understander (these two factors can
jointly be characterized by the mean of the F's unweighted by stimulus probability);
(3) the degree of predictability inherent in the stimulus string (which could be
indexed by an uncertainty measure on the P's); and (4) the match between the F's and
the P's, that is, the extent to which the construal by the understander appropriately
allocates her preparations in the direction of inputs which are relatively likely to
occur. Under an assumption of a fixed unweighted variance of the F's, it is easy to
show that the average [facility is greatest when there is a] proportionality
relationship between the P's and the F's, high facility attaching to relatively high
likelihood, and low facility attaching to relatively low likelihood.

The match between what is expected and what might occur should not automatically be
presumed, either
according to some cosmic principle of innate resonance
of the string? Further, is there a way to think
between the individual and the environment, or on
about structure such that it is a joint property
the basis of a gradual learning of, and accomodation
of the stimulus string and the interpretive machinery
to contextual contingencies. There are at least
of the understander?
four reasons for this. First, it is very possible,
indeed frequent, for people to misconstrue situations
and have a whole series of misguided expectations.
Misconstrual tendencies are very interesting to social psychologists and
there has been a good deal of recent research on stereotyping, on
misleading first impressions, on the effects of inappropriate but salient
schemata, and on the insensitivity of false constructions to empirical
evidence (cf. Nisbett & Ross, 1980).

A second reason not to presume that construals reflect stimulus
constraints is that people are generally extremely slow to pick up the
contingency structure in novel input materials, if they ever get it at
all. Contingencies are especially problematic when multidimensional
dependencies are involved. It is clear from classical two-alternative
guessing situations that people are very good at learning the simplest
zero-order structure, that is, the relative frequencies of two different
alternatives. Although appropriate data do not to my knowledge exist,
there is little reason, however, to suppose that people are adept at
learning the zero-order structure within large numbers of alternatives.
And it is very clear from so-called "cue validity" studies (e.g., Hammond
& Summers, 1965) that there are sharp limitations on the learning of
higher-order structures. When many independent cues are modest predictors
of an outcome variable, people are unable to use all the cues, but settle
instead for (somewhat fallible) use of three or four of them. In
realistic stimulus domains where it is not at all clear how many cues of
what sort there might be, the situation can be even worse. For example,
in studies of how people judge whether someone else is lying or not
(Krauss et al., 1976; Kraut, 1978), the facial, gestural and speech cues
that judges employ to diagnose the liar overlap hardly at all with the
set of cues that actually predict lying.

A third reason, related to the second, is that stimulus structure is
usually dependent on the source of the string being communicated.
Different communicators have different styles and different angles on
what to say or write about a given topic. The understander usually will
not have long enough experience with particular communicators to pick up
their individual contingency structures, even if learning were very
rapid.

Fourth, the construal function must be flexible in operation, so that
when there is a shift in the topic of the stimulus string, the
understander can establish a new set of expectations. A lack of matching
could come about if adjustments in construal were sluggish, lagging
behind the stimulus. It is intuitively clear that there are both
individual and situational differences in the rate of adjustment of
construals. Part of the ordinary concept of intelligence, or perhaps
quick-mindedness, is the ability of an individual to keep up with what is
being read or said, especially if the point is rapidly shifting.
Situations, too, may help or hinder quick reconstrual. There is a good
deal of psychological literature on the phenomenon of perseverance,
wherein a person's analysis of a problem area continues in a vein which
has previously been successful, despite the introduction of new materials
which make the old analysis outmoded (Luchins, 1942), or the presentation
of evidence that past success was spurious (Ross, Lepper, & Hubbard,
1975).

In short, the structure in personal construals need not match the
structure of stimulus constraints, for several reasons. When there is a
match, however, understanding is considerably facilitated. The example of
script processing provides an especially clear case. An account of a
highly scripted activity such as a visit to a doctor introduces very high
stimulus constraint, because only a limited number of events have high
probability of occurrence in the account. If the understander construes
the account as indeed concerning a doctor visit (as opposed, say, to a
chat with a professional colleague), then his relevant knowledge
structure will highly constrain his expectations to a small set of
events. Given a sufficient consensus on what sorts of things transpire in
accounts of doctor visits, understanding will (on average) be very
facile.

My discussion to this point has concealed an important aspect of the
concepts of constraint and construal in text processing. There are really
three different types of structural limitation on coherent stimuli and
coherent expectations. Recall that we are supposing that the input string
is received a chunk at a time, and that we are interested in the
probability of occurrence and the processing facility associated with
every possible chunk. For tangibility we may suppose that the chunks are
sentences. Two somewhat different kinds of constraints are those applying
within chunks, and those applying between chunks. Let us designate these
respectively as combinatorial constraint and sequential constraint. A
third kind I will call functional equivalence constraint, to which I will
return shortly.

By within chunk, or combinatorial constraint, I refer to tendencies or
rules for what linguistic components go with what. This would include all
of syntax, semantic rules or "selection restrictions" about sensible
meaning combinations, such as what actions require animate actors and
what attributes are pertinent for what object classes, and also fragments
of pragmatic knowledge that tell us what combinations are unlikely in the
real world even though semantically possible, such as the Queen using
obscenity or coal miners curtsying. In this category of constraint, it
seems quite likely that rules characterizing stimulus structure would be
generally well matched by rules characterizing construals. These aspects
of constraint are widely appreciated and shared, and the reasons given
above in support of mismatch tendencies are least likely to apply.

By between chunk, or sequential constraint, I mean tendencies for certain
chunks to follow a given chunk or sequence of chunks. From a formal point
of view, one might suppose that between chunk constraint is just another
bundle of selection restrictions, or set of rules about what goes with
what, and thus is just like within chunk constraint, but operating on
larger units. In the field of story understanding, the supposition has
sometimes been made that a kind of syntax exists linking successive
units, and so we have the notion of "story grammars" (cf. Rumelhart,
1977; Mandler & Johnson, 1977).

Story grammars may have some use as rough tools in restricted story
domains, but they have I think rightly been criticized (e.g., Black &
Wilensky, 1979) when they make overly strong claims. Stories are simply
not grammatical in quite the same sense that sentences are. Nor are other
bodies of text.
When a literate reader asserts that a short string of writing with a
period at the end is not a sentence, the literate observer will almost
always agree. But who is to say that a longer string of writing is not a
paragraph, or not a story? Judgements here are somewhat fuzzy, as we are
dealing not with inviolate rules but with general coherence tendencies.
Pragmatics and stylistics become much more important, and the role of
syntax dwindles. Construals can more readily be out of synch with
stimulus constraints. To think otherwise, and to regard sequential
constraint as merely an extension of combinatorial constraint, is to fall
too much under the thrall of the bottom-up, start small, scientific
strategy of the neats.

The third category, that I have called "functional equivalence"
constraint, is rather different in nature. In the general conception I
have sketched of the construal process, there is the clear unrealism of
supposing that the organism allocates preparatory provisions among an
infinite number of possible stimuli. Much more reasonable is the
supposition that the understander prepares differentially for different
categories of input content, in effect grouping stimuli into equivalence
classes according to the functions they serve. Within each class,
processing facility would be roughly constant. This is a fairly strong
form of subjectively imposed constraint, for it says that such-and-such
stimuli are to be regarded as equivalent, and all processed with ease,
whereas such-and-such other stimuli, constituting another equivalence
class, will be processed with less facility, and so on. Now there is
great opportunity for mismatching, depending on whether the understander
does or does not carve the possibilities at different joints than the
probability structure of the stimuli would. To maintain matching, it
would be necessary for every stimulus within an equivalence class to be
approximately equally probable. The understander, however, may lump
possibilities together because they have comparable personal concern or
affective significance, not because they are necessarily objectively
substitutable.

These ideas may become clearer if I recapitulate and then amplify the
model I am outlining. We are supposing an understander exposed to a
stimulus string one chunk at a time. Strong structural constraints
characterize the ensemble of stimulus strings which the given string
instantiates, but we do not presuppose that the understander necessarily
perfectly appreciates those constraints. Instead, the understander, in a
construal process, or general expectational policy, imposes constraints
of her own. Construals are from time to time altered as the stimulus
string unfolds, but while in force, each given construal is a processing
allocation which determines the particular degree of facility with which
different possible inputs and sequences of inputs would be processed. A
major construal defines functional equivalence classes of stimuli, such
that stimulus possibilities within a class are processed alike.

The average facility of processing is maximized when there is a match
between the probability structure of the stimulus ensemble and the
facility structure of the construal. Furthermore, given some degree of
match, the average facility increases with increasing structuredness.
There are really three aspects of both types of structure: the zero-order
structure partitioning the possibilities according to how likely or
expected they are, and the contingency structure limiting what goes with
what, both within chunks and across chunks.

I have argued that matching between construal structure and stimulus
constraint structure should not in general be presupposed, but that it is
likely to occur in certain contexts. Within chunks, it is likely that
syntactic constraints will be well matched. Across chunks, certain types
of knowledge structures permit ready construals of well-structured
realities. I have already mentioned scripts as one example.

Another between-chunk example has to do with the role that knowledge of
intentionality plays in story understanding. Stories tend to be
summarized in terms of goals and plans of the main characters. Goals and
successful plans tend to be well remembered. AI models of story
understanding make a big point of tracking goals and goal fates. This was
the case in Schank and Abelson (1977), and in other Yale models such as
those of Wilensky (1978) and Carbonell (1978), and in a great many
non-Yale models as well. Why is this? What is there about goals that
makes them so important? From a scruffy common sense point of view, this
question may not seem worth asking because the answer seems obvious.
Goals underlie most of the activities of people, and what interests
people is hearing about things that other people do. One might amplify
this intuition with the observation that intentionality is not
observable, but must be inferred, and there is something especially
intriguing in making inferences about people to explain why they do what
they do. Another angle is that goals relate to emotions, as I will
discuss later, and emotions are especially interesting.

A formalist would not be especially happy with such explanations. They
essentially say that goals are interesting because everybody knows they
are interesting; but no principled account is given. Well, it seems to me
clear that a principled account can be given, and indeed, it is implicit
in almost all analyses of the role of goals and plans, but it is usually
not spelled out. Simply put, it is that intentionality is "inference
rich"--it provides a high degree of stimulus constraint. Construals
structured to match will confer great advantage to the understander.
Intentional action introduces constraints of all three kinds:
combinatorial, sequential and functional equivalence. Especially
noteworthy is the possibility of remote sequential constraint, that is,
the influence of somebody having a goal on his actions much later in
time. In a story or a novel--or in life--it could be dozens or hundreds
of pages--or days--before the goals of individuals are actualized, yet
that latent potentiality is present all the while. This constraint demon
that goals unleash is a kind of inferential time bomb set to go off one
knows not when. No Markov process behaves like this.

It is clear, therefore, that knowledge structures concerning goals are
highly constraining, thus very important in the understanding process.
Whether goals are the most constraining concept in texts about human
activities is very hard to say, because we have no comparative constraint
measures on different inference-rich concepts, averaged over all
possible, or all available, or all experienced texts of a given type. I
want to point out, however, that a good
guiding philosophy behind the choice of knowledge structures to
investigate is to try to pick those which are both highly constraining
and highly frequent, thereby being very useful for construal functions.
Although we did not articulate it explicitly, this was indeed the
rationale guiding the Schank & Abelson (1977) choice of scripts, plans,
goals, and themes as the highest priority knowledge structure concepts to
investigate. Far from being ad hoc, therefore, these concepts are very
closely tied to ideas concerning constraint.

Thus far in my account, I have given mainly a way to talk carefully about
the role of structural constraint in the process of understanding, with
very little in the way of predictive principles or substantive claims.
Let me now attempt some claims based on the construal concept.

I have said that construals are from time to time altered or updated
during the course of understanding. I believe that what is remembered
about a stimulus string is not the string itself, but the set of
construals and reconstruals used in interpreting it. When a construal is
active and inputs arrive which are not readily processed, that is, are
unexpected in terms of the construal, then reconstrual becomes necessary.
Memory for one's own construal structures, therefore, would include both
the original construal and the reconstrual compelled by unforeseen
stimulus events. Thus my proposal here is similar to Schank & Abelson's
(1977) script pointer plus "weird list" idea, and to Graesser's (1981)
"script pointer plus tag" hypothesis, except that it is more general in
that the knowledge structure involved need not be a script, but could be
anything. Indeed, it could be any combination of knowledge structures
implicated in a construal which partially succeeds and then needs
correction. (Among other recent treatments containing similar ideas are
those of Lehnert (1979) and of Schank (Note 2).)

There are some memory data which fit nicely with this Construal Principle
(e.g., Hastie, 1980; Spiro & Esposito, 1977; Graesser, 1981). But I am
also aware that there are other data which are hard to explain by it
(e.g., Anderson & Pichert, 1978), and that there are complications in
applying it to data in general. Let that be a story for another occasion,
however.

Reconstrual seems to me to be an especially interesting phenomenon. One
class I would like to pursue arises when two incompatible construals
compete with one another for processing dominance. By incompatible
construals, I mean sets of expectations [...] based on the other set. The
more massive the preparations, the more serious the incompatibility. In
cognitive terms, the massive preparatory part of a construal is the
establishment of sequential constraints; thus incompatible construals
involve opposing sequential constraints.

Characteristically associated with incompatible construals, under certain
conditions, are affective experiences. There are different types of
incompatibility, as I shall spell out, and with each of these is
associated a different variety of affect. These connections are
compelling enough to serve as the basis for a theory of affect. Recently,
there have arisen in psychology a number of systematic taxonomies of
affective states (deRivera, 1977; Roseman, 1979; Wilensky, Ortony &
Collins, 1980) which set forth a number of disposing factors said to
generate one or another affect. While highly evocative, the various
schemes seem to lack a unifying principle common to their sets of
affects; this I believe to be provided by the idea of incompatible
construals.

My analysis owes much to the scheme devised by Roseman (1979), but is
differently organized in order to get the benefit of the incompatibility
principle. In conveying a preliminary version of my theory, I will
continue to talk about construals, but I will depart freely from the text
understanding paradigm to refer also to a behavioral situation paradigm
where the prospective and actual events might happen or do happen to the
individual involved, rather than his just imagining their occurrence for
the characters in a story.

To avoid confusion, I should say what the theory is not about. It is not
primarily about the pleasures and pains associated directly with physical
sensations, either innately or through conditioning. Thus it is not
primarily about sexual pleasure, or pleasurable tastes, or about fright,
pain or disgust, or about the love or aversion for the people or objects
associated with those pleasures or pains. Secondarily, however, it may
implicate pleasures or pains or other goal states, as will be explained.
The theory is also not about the semantics of affect as coded in words or
phrases capable of evoking associated emotions, such as pejorative
adjectives applied to another person so as to stimulate dislike or the
sunny lyrics of a love song. Rather, it concerns the emergence of an
affective state as a consequence of the structural relationships in an
ongoing situation: it is a structural theory of "on-line affect."

In analyzing incompatible construals, we have to ask why more than one
construal would ever be necessary. An obvious answer is that the ongoing
construal leads to poor understanding, and must be changed. A more direct
way to say this is that anticipation does not correspond to reality. What
you imagine will happen does not in fact happen, and you must update your
imaginings. If the update is incompatible with the previous construal,
then an affective process will occur which is both a signal and a symptom
of the activity of reconstrual (and which, incidentally, will be
associated with high memorability for the event which precipitated the
reconstrual).

Two conditions seem basic to the degree of incompatibility of construals.
One is the range of possible cognitive chunks implicated by the two
construals; the second is the discrepancy in the hedonic import of the
two construals, whereby one is highly pleasurable or painful and the
other is not. Inference-richness and hedonic import would seem in
practice to co-occur, because one mainly makes extensive inferences about
that which is personally consequential. But the two concepts are
conceptually separable.

In any case, not every reconstrual involves incompatibility, and many
incompatibilities are quite trivial in extent and significance. Thus
structural affect is not freely evoked by minor alterations from previous
expectations. Thus if you mistook someone to say they were from Stanford
and it turned out they meant Stamford (Connecticut), or if you thought
that a session of the conference was in Room A and it turned out to be in
Room B, those changes would not provoke affect (unless your mistake led
to some commitment or consequence difficult to undo).

It is instructive to consider systematically how the inference-richness
of consequential alternative construals might vary and give rise to
differing affective states. A useful rubric is the intentional action
sequence, where a positive or negative outcome state is cognized by the
individual along with an alternative outcome of opposite import. What is
of interest is the point in the sequence at which the alternativity
arises, thereby determining the depth of reconstrual which is necessary.
There are different classes of cases, depending on whether two
alternative construals are present only in imagination, or whether one is
imagined and the other represents reality, or there are representations
of two disparate realities because reality has changed.

Let us suppose a sequence in which a goal leads to some planned action
which through some causal instrumentality determines an outcome. Consider
first the case in which this sequence has progressed up to a certain
point in reality, and then there are alternative imagined construals, one
hedonically positive, the other negative, of the uncertain future course
of the sequence. If goal, action, and causal instrumentality are fixed,
but only the outcome is uncertain, there is a minimal range of content
for the alternative construals to deal with. The associated affective
experience can be characterized as SUSPENSE. If goal and action are
fixed, but there are alternative causal instrumentalities each
potentially controlling the outcome, the alternative construals are
inferentially richer and the affect in general will be more elaborate.
Perhaps there are alternative authorities who may become responsible for
the outcome, as for example two judges who might be assigned to your
legal case, one probably sympathetic, the other probably unsympathetic.
The affect here is one of the pair of HOPE/FEAR, depending on whether the
favorable or unfavorable construal is emphasized.

When only the goal is fixed, but two (or more) well defined and distinct
action plans are construable, each with uncertain connection to the
important outcome, the associated affective state may be characterized as
AGONIZING. When not even the goal is fixed, but incompatible possible
goals can be clearly construed, the affective state is one [...]

Consider next the case in which a particular sequence leading to a
favorable outcome is construed in imagination, but reality forces an
alternative construal in which the outcome is in fact negative. If goal,
action, and causal instrumentality are fixed, but the real negative
outcome differs from the imagined positive outcome, the state is one of
DISAPPOINTMENT. If goal and action are fixed, but the real causal
instrumentality producing a negative outcome differs from the imagined
causal instrumentality producing a positive outcome, the affective state
is one of FRUSTRATION or ANGER. In relation to previously outlined cases,
it can be seen that FRUSTRATION represents dashed HOPE, and
DISAPPOINTMENT is negatively resolved SUSPENSE.

In a slightly different type of subcase, the negative reality has already
occurred, but the individual imagines what might have been, by
reconstruing the sequence starting at a particular point of departure.
The idea "I shouldn't have done what I did; if only I had acted
differently, things would have been different" corresponds to a state of
EMBARRASSMENT or MORTIFICATION. If the recrimination goes all the way
back to believing that one has pursued the wrong goal, then the affective
state is one of GUILT.

Another set of subcases arises when there is an imagined negative
outcome, but the actual outcome is positive. Without going through all
the details, suffice it to say that depending on the sequential point at
which alternativity occurs, the respective affective states of LUCKINESS,
GRATITUDE, PRIDE, and RECTITUDE can be generated.

Finally, there is the case of incompatible construals which arise because
one reality is suddenly replaced by another reality. This need have
nothing to do with an intentional action sequence, because it can be
outside of the individual's control. If the old reality was positive, and
the new reality is no longer positive, the affect is SORROW. If the old
reality was not negative, but the new reality is negative, the affect is
DISTRESS. If the old reality was not positive, but the new reality is
replaced by one which is not negative, a state of RELIEF is produced.

I have been perhaps somewhat scruffy in my presentation of this system of
16 affects (albeit I had earlier implied I would try to be neat). It was
not my intention here, however, to be complete and well-disciplined, but
only to lay out a particular direction of theory and research involving
the role of construals in understanding, memory, and affective
experience. The conception of a construal function as a system of
subjective constraints which may or may not match objective stimulus
constraints is, it seems to me, a very important conception. There is no
reason why the idea of systems of constraint should be abandoned to
cooption by the right wing within cognitive science, which presumes to
investigate Mind without reference to minds. Instead, we need in
cognitive science a fusion of left and right wings, of subjective and
objective, of content and of formalism.
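
The facility model described earlier in the paper can be made concrete with a minimal sketch. It assumes a finite set of possible chunks with probabilities P(i) and facility measures F(i), and computes average facility as the sum of the cross-products P(i)F(i). The function names, the example probabilities, and the decision to hold both the unweighted mean and the unweighted variance of the F's fixed are assumptions made for illustration only. Under those assumptions, the allocation that maximizes average facility rises linearly with P(i) (this follows from the Cauchy-Schwarz inequality), which is one way to read the proportionality claim.

```python
import math


def average_facility(p, f):
    """Sum of cross-products P(i) * F(i) over all possible input chunks."""
    return sum(pi * fi for pi, fi in zip(p, f))


def matched_facilities(p, mean_f, var_f):
    """Hypothetical helper: facilities with unweighted mean `mean_f` and
    unweighted variance `var_f` that rise linearly with probability."""
    n = len(p)
    deviations = [pi - 1.0 / n for pi in p]  # P(i) minus its unweighted mean
    spread = sum(d * d for d in deviations)
    if spread == 0:
        raise ValueError("uniform P: every allocation gives the same average")
    scale = math.sqrt(n * var_f / spread)    # fixes the variance of the F's
    return [mean_f + scale * d for d in deviations]


# Assumed example: four possible chunks with made-up probabilities.
p = [0.5, 0.3, 0.15, 0.05]
matched = matched_facilities(p, mean_f=1.0, var_f=0.04)
mismatched = list(reversed(matched))  # same mean and variance, wrong order

print(average_facility(p, matched))     # well-matched construal
print(average_facility(p, mismatched))  # misconstrual
```

Reversing the allocation leaves the mean and variance of the F's unchanged but attaches high facility to unlikely chunks, so the second value printed is smaller than the first.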
1. Abelson, R.P. & Levi, A. Decision-making and decision theory. Chapter
in preparation for G. Lindzey & E. Aronson (Eds.), Handbook of social
psychology. Reading, Mass.: Addison-Wesley, 1983.

2. Schank, R.C. Dynamic memory: A theory of learning in computers and
people. Manuscript in preparation, Yale University, 1981.

3. Cullingford, R. Script application: Computer understanding of
newspaper stories. Research Report 116, Dept. of Computer Science, Yale
University, 1978.

4. D'Andrade, R. Cultural meaning systems. Paper presented at the SSRC
conference on Culture and Cognition, May 1981.

5. Wilensky, R., Collins, A., & Ortony, A. A cognitive analysis of
affect. Draft of paper. U. Cal. Berkeley, Nov., 1980.

Abelson, R.P. Social psychology's rational man. In S.I. Benn & G.W.
Mortimore (Eds.), Rationality and the social sciences. London: Routledge
& Kegan Paul, 1976.

Abelson, R.P. The psychological status of the script concept. American
Psychologist, 1981, 36, 715-729.

Anderson, R., & Pichert, J.W. Recall of previously unrecallable
information following a shift in perspective. Journal of Verbal Learning
and Verbal Behavior, 1978, 17, 1-12.

Black, J.B. & Wilensky, R. An evaluation of story grammars. Cognitive
Science, 1979, 3, 213-230.

Bower, G.H., Black, J.B. & Turner, T.J. Scripts in memory for text.
Cognitive Psychology, 1979, 11, 177-220.

Bush, R.R. & Mosteller, F. Stochastic models for learning. New York:
Wiley, 1955.

Carbonell, J.G., Jr. POLITICS: Automated ideological reasoning. Cognitive
Science, 1978, 2, 27-52.

deRivera, J. A structural theory of the emotions. Psychological Issues,
Monograph 40. N.Y.: International Universities Press, 1977.

Einhorn, H. & Hogarth, R. Behavioral decision theory: Processes of
judgement and choice. Annual Review of Psychology, 1981, 32, 53-88.

Galambos, J. & Rips, L.J. The representation of events in memory. Paper
presented to the Midwestern Psychological Association, May, 1979.

Graesser, A. Prose comprehension beyond the word. New York:
Springer-Verlag, 1981.

Hammond, K.R. & Summers, D.A. Cognitive dependence on linear and
nonlinear cues. Psychological Review, 1965, 72, 215-224.

Hastie, R. Memory for behavioral information that confirms or contradicts
a personality impression. In R. Hastie, T.M. Ostrom, E.B. Ebbesen, R.S.
Wyer, D.L. Hamilton, & D.E. Carlston (Eds.), Person memory: The cognitive
basis of social perception. Hillsdale, N.J.: Erlbaum, 1980.

Keil, F.C. Constraints on knowledge and cognitive development.
Psychological Review, 1981, 88, 197-227.

Kintsch, W., Miller, J., & Polson, P. Problems of methodology in
cognitive science. Paper for the Cognitive Science Symposium, Boulder,
Col., July 1981.

Krauss, R.M., Geller, V., & Olson, C.T. Modalities and cues in the
detection of deception. Paper presented at the American Psychological
Association convention, Washington, D.C., 1976.

Lehnert, W. Text processing effects and recall memory. Research Report
157, Dept. of Computer Science, Yale University, May 1979.

Luchins, A. Mechanization in problem solving. Psychological Monographs,
1942, 54, Whole No. 248.

Mandler, J.M. & Johnson, N.S. Remembrance of things parsed: Story
structure and recall. Cognitive Psychology, 1977, 9, 111-151.

March, J.G. Bounded rationality, ambiguity, and the engineering of
choice. Bell Journal of Economics and Management Science, 1978, 9,
587-608.

Nisbett, R. & Ross, L. Human inference: Strategies and shortcomings of
social judgement. Englewood Cliffs, N.J.: Prentice-Hall, 1980.

Roseman, I. Cognitive aspects of emotion and emotional behavior. Paper
delivered at the American Psychological Association convention, New York,
1979.

Ross, L., Lepper, M.R. & Hubbard, M. Perseverance in self perception and
social perception: Biased attributional processes in the debriefing
paradigm. Journal of Personality and Social Psychology, 1975, 32,
880-892.

Rumelhart, D.E. Understanding and summarizing brief stories. In D.
LaBerge & S.J. Samuels (Eds.), Basic processes in reading: Perception and
comprehension. Hillsdale, N.J.: Erlbaum, 1977.

Schank, R.C. The structure of episodes in memory. In D.G. Bobrow & A.
Collins (Eds.), Representation and understanding: Studies in cognitive
science. New York: Academic Press, 1975.

Schank, R.C. Language and memory. Cognitive Science, 1980, 4, 243-284.

Schank, R.C. and Abelson, R.P. Scripts, plans, goals and understanding.
Hillsdale, N.J.: Erlbaum, 1977.

Spiro, R.J. & Esposito, J. Superficial processing of explicit inferences
in text. Technical Report 60, Center for the Study of Reading, Urbana,
Ill., 1977.

Tomkins, S. The psychology of being right -- and left. Trans-action,
1965, 3, 23-27.

Tukey, J.W. Analyzing data: Sanctification or detective work? American
Psychologist, 1969, 24, 83-91.

Tversky, A. & Kahneman, D. The framing of decisions and the psychology of
choice. Science, 1980.

Wilensky, R. Why John married Mary: Understanding stories involving
recurring goals. Cognitive Science, 1978, 2, 235-266.

A Model for Planning in Everyday Situations*
Robert Wilensky
Computer Science Division
Department of EECS
University of California, Berkeley
Berkeley, California 94720

*Research sponsored in part by the National Science Foundation under grant MCS79-06543 and by the Office of Naval Research under contract N00014-80-C-0732.

1. Introduction - Four Tenets for a Theory of Planning
Much previous work on planning and problem solving has been concerned with either very specialized systems or with highly artificial domains (e.g., consider Fikes and Nilsson (1971), Newell and Simon (1972), Sussman (1975), Shortliffe (1976)). More recently, there has been an increase in attention given to planning in commonplace situations. For example, Rieger (1975) has proposed a set of "common sense algorithms" for reasoning about everyday physical situations; Hayes-Roth and Hayes-Roth (1979) are concerned with how a person might schedule a day's activities; and Carbonell's (1980) POLITICS program reasons dogmatically about political decisions. On another front, Sacerdoti (1977) and McDermott (1978), while operating perhaps in the more traditional problem solving context, have proposed some powerful approaches to problem solving in general.

1.1. Everyday Planning is Reasoning about Interactions Between Goals
We have been developing a theory of planning that is concerned with reasoning about everyday situations. A central tenet of this theory is that most of the planning involved in everyday situations is primarily concerned with the interactions between goals. That is, planning for individual goals is assumed to be a fairly simple matter, consisting primarily of the straightforward application of rather large quantities of world knowledge. The complexity of planning is attributed to the fact that most situations involve numerous goals that interact in complicated ways.
Thus while traditional problem solving research has been concerned with finding the solution to a single, difficult problem (e.g., finding the winning chess move), most everyday problem solving consists of synthesizing solutions to fairly simple, interacting problems. For example, a typical everyday situation that involves the sort of planning we are interested in might be to obtain some nails, and also buy a hammer. The plan for each goal is straightforward: One simply goes to the hardware store, buys the desired item, and returns. The problem lies in recognizing that it is a terrible idea to execute these plans independently. Rather, the seemingly simple common sense plan is to combine the two individual plans, resulting in the plan of going to the hardware store, buying both items, and then returning.
Simple as this situation may be, most conventional planners are ill-equipped to handle it. Although some planning programs have mechanisms for removing redundancies from a plan, they generally lack a mechanism for even noticing this sort of interaction if these plans are derived from heretofore unrelated goals.
Perhaps more importantly, the interaction between plans may have more complex ramifications. For example, if enough items are to be purchased at the hardware store, then a better plan might be to take one's car, while walking may do otherwise. Thus a good part of planning involves detecting the interactions between goals, figuring out their implications, and then deciding what to do about them.

1.2. Planning Knowledge Should be Equally Available for Understanding
The second tenet of our theory of planning is that planning knowledge should be equally usable by both a planner and an understander. That is, while a planner uses its planning knowledge to bring about a desired state of affairs, an understander may need to use this same knowledge to comprehend the actions of a person it is watching or of a character about whom it is reading. For example, a planner with the goal of keeping fit might take up jogging; an understander might use the same knowledge to infer that someone who has taken up jogging may have done so because he had the goal of staying in shape.
Planning and understanding are rather different processes, and this will of course be reflected in our planning and understanding mechanisms. However, our theory of planning specifies that knowledge should be represented in a fashion so that it is usable by either.

1.3. Meta-Planning is Used as the Driving Principle
The third salient feature of our theory is that it is based on meta-planning. By this I mean that the problems a planner encounters in producing a plan for a given situation may themselves be formulated as goals. These "meta-goals" can then be submitted to the planning mechanism, which treats them just like any other goals. That is, the planner attempts to find a "meta-plan" for this meta-goal; the result of successful application of this plan will be the solution to the original planning problem.
A typical example of a meta-goal is the goal RESOLVE-GOAL-CONFLICT. A planner would presumably have an instance of this goal whenever it detects that some of its "ordinary" goals are in conflict with one another. The meta-plans for this goal are the various goal conflict resolution strategies available to the planner.
Meta-planning is described in more detail in Wilensky (1980). Here, we give only a brief characterization of its main features and advantages.
Meta-goals are organized by meta-themes. These are very general principles of planning that describe situations in which meta-goals come into being. We summarize these briefly:

Meta-themes
1) DON'T WASTE RESOURCES
2) ACHIEVE AS MANY GOALS AS POSSIBLE
3) MAXIMIZE THE VALUE OF THE GOALS ACHIEVED
4) AVOID IMPOSSIBLE GOALS
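The hammer-and-nails situation of Section 1.1, together with the claim of Section 1.3 that meta-goals are submitted to the same planning mechanism as ordinary goals, can be illustrated with a short Python sketch. It is only an illustration under stated assumptions and not a description of any existing program: the plan representation, the `find_plan` lookup, and the meta-goal name COMBINE-PLANS are hypothetical names introduced here.

```python
# Illustrative sketch only. The plan library, the step tuples, and the
# meta-goal name "COMBINE-PLANS" are hypothetical, not taken from the paper.

def errand(item, store="hardware-store"):
    """Normal plan for obtaining an item: go to the store, buy it, return."""
    return [("go-to", store), ("buy", item), ("return-home",)]


def combine_errands(plans):
    """Meta-plan: merge errand plans that make the same trip."""
    first = plans[0]
    # The merge only applies when every plan starts and ends the same way.
    if any(p[0] != first[0] or p[-1] != first[-1] for p in plans):
        return None
    purchases = [step for p in plans for step in p[1:-1]]
    return [first[0], *purchases, first[-1]]


# Ordinary goals and meta-goals live in one library and are looked up
# by the same mechanism.
PLAN_LIBRARY = {
    "HAVE": errand,
    "COMBINE-PLANS": lambda *plans: combine_errands(plans),
}


def find_plan(goal, *args):
    """Single planning mechanism: fetch the plan for a goal and apply it."""
    return PLAN_LIBRARY[goal](*args)


nails = find_plan("HAVE", "nails")
hammer = find_plan("HAVE", "hammer")
combined = find_plan("COMBINE-PLANS", nails, hammer)
# combined == [("go-to", "hardware-store"), ("buy", "nails"),
#              ("buy", "hammer"), ("return-home",)]
```

The point of the sketch is the dispatch: the meta-goal is fetched and applied exactly as the two ordinary goals are, and its result is the revised plan for the original problem.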

As an example of how these function, the meta-theme "ACHIEVE AS MANY GOALS AS POSSIBLE" is responsible for detecting goal conflicts. That is, if the planner intends to perform a set of actions that will negatively interact with one another, this meta-theme causes the planner to have the goal of resolving the conflict. If this meta-goal fails, i.e., the planner could not find a way to resolve the conflict, then the meta-theme "MAXIMIZE THE VALUE OF THE GOALS ACHIEVED" springs into action. This meta-theme sets up the goal of arriving at a scenario in which the less valuable goals are abandoned in order to fulfill the most valuable ones. The details of the meta-plans involved in these processes are described at length in the last two sections of this paper.
Meta-planning has a number of advantages over other approaches to planning; these advantages are summarized below:

1.3.1. Meta-planning knowledge can be used for both planning and understanding
As meta-goals and meta-plans are declarative structures in the same sense as are ordinary goals and plans, they may be used to understand situations as well as plan in them. Thus an understander with access to this knowledge would be able to interpret someone's action as an attempt to resolve a goal conflict, for example. In contrast, planning programs that have the equivalent knowledge embedded procedurally would not be able to conveniently use it to explain someone else's actions.

1.3.2. The same planning mechanism can apply to more difficult tasks
Meta-planning knowledge generally embodies a set of strategies for complicated plan interactions. By formulating this knowledge in terms of goals and plans, the same planning architecture that already exists for simpler planning can be used to implement more complicated planning involving multiple goals, etc.

1.3.3. More general resolution of traditional planning problems
Traditional planners usually treat problems such as goal conflicts by special purpose means, by the introduction of critics, for example (Sussman 1975, Sacerdoti 1977). This is equivalent to having the general problem solver consult an expert when it gets in trouble. Meta-planning allows the general problem solver to call a general problem solver (itself) instead. Thus all the power of such a system can be focussed on planning problems, rather than just relying on a few expert tactics. Of course, all the specific knowledge usually embodied in critics would still be available to the general problem solver. But the meta-planning model allows this knowledge to interact with all other knowledge, as it can now take part in general reasoning processes.

1.3.4. Representational advantages
The meta-planning model also provides more flexibility when no solution can be found. Since a meta-goal represents the formulation of a problem, the existence of the problem may be dealt with other than by its being fully resolved. For example, the problem solver may simply decide to accept a flawed plan if the violation is viewed as not being too important, or decide to abandon one of the goals that it can't satisfy. By separating solving the problem from formulating the problem, the problem may be accessed as opposed to treated, an option that most other problem solving models do not allow.

1.4. Projection is Used to Infer Goals and Debug Plans
The fourth significant feature of our planning model is that it is based on projection. That is, as the planner formulates a plan for a goal, the execution of this plan is simulated in a hypothetical world model. Problems with proposed plans may be detected by examining these hypothetical worlds.
Projection not only enables the planner to find problems with its own plans, but it also enables it to determine that a situation merits having a new goal. For example, sensing an impending danger requires the planner to project from the current state of affairs into a hypothetical world which it finds less desirable. Having done this projection, the planner can infer that it should have the goal of preventing the undesirable state of affairs from coming into being.
Projecting hypothetical realities also allows a general "goal detection" mechanism to work for meta-goals as well as for "ordinary" goals. When proposed plans for goals are projected, interactions will appear in the hypothetical world. Since such interactions generally indicate that some important planning principle is not being adhered to, the occurrence of this hypothetical negative interaction is usually a signal to the planner to achieve some particular meta-goal.
Working with projected universes entails some liabilities as well, as does the notion of meta-planning and of using highly declarative representations. However, our claim is that the prices associated with these ideas are prices that must be paid anyway. By putting them together in the manner described here, a great deal of power is obtained for no additional cost.
In the next section, I discuss the general structure of a planning mechanism based on these assumptions. This is the structure used in PANDORA (Plan ANalysis with Dynamic Organization, Revision, and Application), a planning system now under construction at Berkeley. The sections following show how these mechanisms function together in reasoning about goal conflict situations. As we have noted, we intend these ideas to be applicable to understanding as well as planning, and in fact, they are being used in a new implementation of PAM, a plan-based story understanding system. While we do not discuss the structure of PAM here, the analysis of goal conflicts is presented in a form in which its use in understanding as well as planning may be seen.

2. The Design of a Planner Based on Meta-planning
This section describes the overall architecture of a planner based on the four tenets just considered. The planner is composed of the following major components:

1) The Goal Detector
This mechanism is responsible for determining that the planner has a goal. The goal detector has access to the planner's likes and dislikes, to the state of the world and any changes that may befall it, and to the planner's own internal planning structures and hypothetical world models. The goal detector can therefore establish a new goal because of some change in the environment, because such a goal is instrumental to another goal, or in order to resolve a problem in a planning structure that arises in a hypothetical world model.

2) The Plan Generator
The plan generator proposes plans for the goals already detected. It may dredge up stereotyped solutions, it may edit previously known plans to fit the current situation, or it may create fairly novel solutions. The plan generator is also responsible for expanding high-level plans into their primitive components to allow execution.

3) The Executor
The executor simply carries out the plan steps as proposed by the plan generator. It is responsible for the detection of errors, although not for their correction.

The importance of the goal detector should be emphasized. Most planning systems do not worry about where their goals come from; high-level goals are generally handed to the planner in the form of a problem to be solved. However, a planning system needs to infer its own goals for a number of reasons: an autonomous planner needs to know when it should go into action; for example, it should be able to recognize that it is hungry or that its power supply is low and what goal it should therefore have. It should be able to take advantage of opportunities that present themselves, even if it doesn't have a particular goal in mind at the time. It should be able to protect itself from dangers from its environment, from other planners, or from consequences of its own plans.
The goal detector operates through the use of a mechanism called the Noticer. The Noticer is a general facility in charge of recognizing that something has occurred that is of interest to some part of the system. The Noticer monitors changes in the external environment and in the internal states of the system. When it detects the presence of something that it was previously instructed to monitor, it reports this occurrence to the source that originally told it to look for that thing. The Noticer can be thought of as a collection of IF-ADDED demons whose only action is to report some occurrence to some other mechanism.
Goals are detected by having themes and meta-themes asserted into the Noticer with orders to report to the goal detector. Theme is a term used by Schank and Abelson (1975) to mean something that gives rise to a goal; a meta-theme, similarly, is responsible for generating meta-goals. For example, we can assert to the Noticer that when the planner gets hungry (i.e., when the value of some internal state reaches a certain point), it should have the goal of being not hungry (i.e., of changing this value), or that if someone is threatening to kill the planner, the planner should have the goal of protecting its life. On the meta-level, we might assert that if a goal conflict comes into existence, then the planner should have the meta-goal of resolving this conflict.
Note that the presumption of a goal detector coupled with meta-planning creates a system of considerable power. For example, no separate mechanism is required for detecting goal conflicts or for noticing that a set of proposed plans will squander a resource. The need to resolve conflicts or conserve resources is expressed by formulating descriptions of the various situations in which this may occur, and the appropriate meta-goal to have when it does. By asserting these descriptions into the Noticer to detect meta-goals, goal conflicts and other important goal interactions are handled automatically.

The planner component of our model itself consists of three components:
1) The Proposer, which suggests plausible plans to try
2) The Projector, which tests plans by building hypothetical world models of what it would be like to execute these plans
3) The Revisor, which can edit and remove parts of a proposed planning structure
The Proposer begins by suggesting the most specific plan it knows of that is applicable to the goal. If this plan is rejected or fails, the Proposer will propose successively more general and "creative" solutions. Once the Proposer has suggested a plan, the Projector starts computing what will happen to the world as the plan is executed. The difficult problems in conducting a simulation involve reasoning about "possible world" type situations which are not amenable to standard temporal logic (McCarthy and Hayes, 1969). However, we finesse this issue by defining hypothetical states in terms of what the planner thinks of in the course of plan construction. In other words, our solution is to let the system assert the changes that would be made into a hypothetical data base, in the meantime letting the goal detector have access to these states. Thus if the plan being simulated would result in the planner dying, say, this would constitute a hypothetical undesirable state, which might trigger further goals, etc.
As the Projector hypothetically carries out the plan, and other goals and meta-goals are detected by the goal detector, the original plan may have to be modified. This is done by explicit calls to the Revisor, which knows the plan structure and can make edits or deletions upon request. The modified plan structure is simulated again until it is either found satisfactory or the entire plan is given up and a new one suggested by the Proposer.
Actually, the function of the Projector is somewhat more pervasive than has so far been described. The Projector must be capable of projecting current events into future possibilities based both on the intentions of the planner and on its analysis of those events themselves. For example, if the planner sees a boulder rolling down the mountain, it is the job of the Projector to project the future path that the boulder will traverse. If the projected path crosses that of the planner, for example, a preservation goal should be detected. Thus the Projector is a quite powerful and general device that is capable of predicting plausible futures.
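The interplay of Proposer, Projector, and Revisor described in this section can be sketched as a small loop. The Python below is a toy illustration under stated assumptions, not a description of PANDORA's code: the rainy-day newspaper domain, the step names, the `UNDESIRABLE` set (which stands in for the goal detector's reaction to a hypothetical state), and every function name are hypothetical.

```python
# Toy propose / project / revise loop. All names and data are hypothetical.

UNDESIRABLE = {"planner-wet"}          # states the planner would rather avoid

PLANS = {                              # candidate plans, most specific first
    "HAVE-PAPER": [["walk-outside"], ["send-dog"]],
}


def simulate(steps, world):
    """Projector: assert each step's changes into a hypothetical data base."""
    state = set(world)                 # copy, so the real world is untouched
    for step in steps:
        if step == "put-on-raincoat":
            state.add("wearing-raincoat")
        elif step == "walk-outside":
            if "raining" in state and "wearing-raincoat" not in state:
                state.add("planner-wet")
            state.add("has-paper")
        elif step == "send-dog":
            state.add("has-paper")
    return state


def revise(steps, problem, world):
    """Revisor: edit the plan on request, or return None to give it up."""
    if problem == "planner-wet" and "raincoat-available" in world:
        i = steps.index("walk-outside")
        return steps[:i] + ["put-on-raincoat"] + steps[i:]
    return None


def plan_for(goal, world):
    for steps in PLANS[goal]:                      # Proposer
        while steps is not None:
            outcome = simulate(steps, world)       # Projector
            problems = outcome & UNDESIRABLE       # detected in hypothetical state
            if not problems:
                return steps
            steps = revise(steps, sorted(problems)[0], world)  # Revisor
    return None


print(plan_for("HAVE-PAPER", {"raining", "raincoat-available"}))
# ['put-on-raincoat', 'walk-outside']
print(plan_for("HAVE-PAPER", {"raining"}))
# ['send-dog']
```

A revised plan is simulated again before it is accepted, and only when the Revisor gives up does the loop fall back to the next, more general proposal.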

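The Noticer described earlier in this section, a collection of IF-ADDED demons whose only action is to report to whoever asked, can also be sketched briefly. Again this is a hedged illustration rather than the actual mechanism: the fact tuples, the hunger threshold, the goal names BE-NOT-HUNGRY and PROTECT-LIFE, and the class interfaces are assumptions made for the example.

```python
# Hypothetical sketch of a Noticer with IF-ADDED demons and a goal detector.

HUNGER_LIMIT = 7  # assumed threshold on an internal state


class Noticer:
    """Holds demons; each demon only reports a matching fact to its source."""

    def __init__(self):
        self._demons = []

    def watch(self, condition, report_to):
        self._demons.append((condition, report_to))

    def add(self, fact, hypothetical=False):
        # Facts may come from the real world or from a projected one.
        for condition, report_to in self._demons:
            if condition(fact):
                report_to(fact, hypothetical)


class GoalDetector:
    """Asserts themes and meta-themes into the Noticer and collects goals."""

    def __init__(self, noticer, themes):
        self.goals = []
        for condition, goal in themes:
            noticer.watch(condition, self._reporter(goal))

    def _reporter(self, goal):
        def report(fact, hypothetical):
            self.goals.append((goal, fact, hypothetical))
        return report


THEMES = [
    (lambda f: f[0] == "hunger" and f[1] >= HUNGER_LIMIT, "BE-NOT-HUNGRY"),
    (lambda f: f[0] == "threat-to-life", "PROTECT-LIFE"),
    # A meta-theme is registered in exactly the same way as a theme.
    (lambda f: f[0] == "goal-conflict", "RESOLVE-GOAL-CONFLICT"),
]

noticer = Noticer()
detector = GoalDetector(noticer, THEMES)
noticer.add(("hunger", 8))                       # real change in internal state
noticer.add(("hunger", 3))                       # below threshold, no report
noticer.add(("goal-conflict", "STAY-DRY", "HAVE-PAPER"), hypothetical=True)

print([goal for goal, _, _ in detector.goals])
# ['BE-NOT-HUNGRY', 'RESOLVE-GOAL-CONFLICT']
```

Because the meta-theme is just another entry in the list of watched descriptions, no separate conflict-detection machinery appears in the sketch.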
3. Reasoning about Goal Conflicts
In the next two sections we give a more detailed analysis of one particular part of our planning model, namely, the resolution of goal conflicts. The problem is important in its own right; however, the presentation that follows is aimed at demonstrating the kind of "strategy architecture" to which the model is conducive. In particular, the section illustrates a number of important meta-goals and the meta-plans for them, and describes how they would be invoked and utilized by the model. The section also emphasizes the utility of meta-planning for the application of planning knowledge to understanding goal conflicts as well as to planning for them.
Since it is desirable to achieve all of one's goals, a planner faced with a goal conflict will probably attempt to resolve that conflict. We express this by saying that the state of having a goal conflict is a situation that causes the meta-theme "ACHIEVE AS MANY GOALS AS POSSIBLE" to become active. In such a situation, this meta-theme creates the meta-goal RESOLVE-GOAL-CONFLICT. This is a meta-goal because resolving the conflict can be viewed as a planning problem that needs to be solved by the creation of a better plan. In this formulation, the resolution of the goal conflict is performed by the execution of a meta-plan, the result of which will be a set of altered plans whose execution will not interfere with one another.
The knowledge needed to replan around a goal conflict is quite diverse, and may depend upon the particular goals in question and on the nature of the conflict. However, the meta-plans with which this knowledge is applied are rather general. To see why, it is necessary to ask how it is possible for goal conflicts to be resolved at all. There appear to be two ways in which goal conflicts can come about that determine how they may be resolved:
1) The conflict detected is based on the plans for one's goals, rather than on the goals themselves. In this case, it may be possible to achieve the goals by other, non-conflicting plans.
2) The conflict depends upon some additional circumstance or condition beyond the stated goals or plans. The conflict might therefore be resolved if this circumstance is changed.
We therefore define two very general meta-plans, RE-PLAN and CHANGE-CIRCUMSTANCE. For these to be effective, we need to supply these meta-plans with more information; if we use RE-PLAN blindly, for example, we might end up enumerating all possible plans for each conflicting goal, although many of these plan combinations will present the same problem that caused the original goal conflict.

3.1. RE-PLAN
There are a number of different re-planning strategies applicable to goal conflict situations. They are given here in order of decreasing specificity. This is in accordance with our belief about the order in which such plans would actually be used, i.e., the most specific one first, then progressively more general ones, until a satisfactory set of plans is found. In this respect, meta-plans are entirely analogous to ordinary plans insofar as the planning process is concerned.
The order of plan application is just a corollary of the First Law of Knowledge Application: "Always use the most specific piece of knowledge applicable."

3.1.1. USE-NORMAL-PLAN applied to resolving goal conflicts
The most specific re-planning strategy is likewise analogous to the planning strategy for ordinary goals, namely, find a normal plan. A normal plan in the case of goal conflict is to find a stored plan specifically designed for use in a goal conflict between the kinds of goals found in the current situation. For example, consider the following situation:
(1) Mary was very hungry, but she was trying to lose some weight. She decided to take a diet pill.
In (1), there is a conflict between the goal of losing weight and satisfying hunger, as the normal plan for the latter goal involves eating. The RE-PLAN meta-plan is used, and the USE-NORMAL-PLAN strategy applied. The normal plan found that is applicable to both goals is to take a diet pill.
Just as many objects are functionally defined by the role they play in ordinary plans, so some objects are functionally defined by the role they play in plans aimed at resolving specific goal conflicts. Thus a diet pill is an object functionally defined by its ability to resolve the conflict between hunger and weight loss; a raincoat is defined by the role it plays in preventing wetness when one must go outside. In fact, a great deal of mundane planning knowledge appears to consist of plans for resolving specific types of goal conflicts.

3.1.2. Intelligent use of TRY-ALTERNATIVE-PLAN to find non-conflicting set
A general planning strategy that is applicable when a plan cannot be made to work is to try another plan for that goal. In the case of resolving goal conflicts, this means that alternative plans for each conflicting goal can be proposed until a set is found that are not in conflict. As noted above, this may be costly, but it will only be tried when no canned conflict resolution plan has been found. Moreover, the plan can provide some intelligent ways of proposing new alternatives that may help keep costs down.
For example, consider the following situation:
(2) John was going outside to pick up the paper when he noticed it was raining. He looked for his raincoat, but he couldn't find it. He decided to get Fido to fetch the paper for him.
Here, John first thought to walk outside, but then found that this would cause a conflict. As his normal plan for resolving this conflict failed, John tried proposing other plans, looking for ones that wouldn't entail his getting wet. Since getting the dog to fetch the paper is such a plan, and since John presumably doesn't care if Fido gets wet, this plan is adopted.
The meta-planning strategy used here is called TRY-ALTERNATIVE-PLAN. The difference between using this meta-plan and blind generate and test strategies is that some control can be exerted here over exactly what is undone and what is looked for as a replacement. For example, the backtracking here need not be chronological or dependency-directed, but can be knowledge-directed. That is, rather than undo the last planning decision, a planning decision related to either goal can be undone, possibly based on an informed guess.
In addition, when fetching a new plan, it may be possible to specify in the fetch some conditions that the fetched plan may have to meet without actually testing that plan for a conflict. For example, in the case of getting the newspaper when it is raining, we can ask for a plan for getting something that doesn't involve going outside. That is, we can look for a plan for one goal that does not contain an action that led to the original
conflict. If our m e m o r y mechanism can handle such To decide what circumstance to change, a planner
requests, then we can retrieve only those plans that do once again needs to analyze the cause of the conflict
not conflict in the same way that the original plan does. Thus CHANGE-CIRCUMSTANCE first requires the use of
In order for this to work, TRY-ALTERNATIVK-PUN MAKE-A>TTR1BUTI0N to propose a candidate for alteration.
needs to know what part of a plan contributed to the goril As was the case for RE-PLAN, MAKE-ATTRIBUTION
conflict so it can look for a plan without this action. This requires access to detailed knowledge about the nature
generally depends upon the kind of conflict. We Cdri for- of negative goal interactions in order to find a particular
mulate this within the meta-planning framework by circumstance with which to meddle. An analysis of such
defining a meta-plan called MAKE-ATTRIBUTION. Here, interactions appears in Wilensky (197B).
MAKE-ATTRIBUTION is used as a subplan of the TRY-
ALTERNATIVE-PUN meta-plan, although we shall m a k e 4. Goal AbandoDment
other uses of it below. TRY-ALTERNATIVE-PUNfirstasks When attempts to resolve a goal conflict are unsuc-
MAKE-ATTRIBUTION to specify a cause of the problem. cessful, a planner must m a k e a decision about what
and then fetches a new plan without the objectional ele- should be salvaged. In terms of meta-planning. we can
ment in it. describe these "goal abandonment " situations as follows.
TRY-ALTERNATIVE-PLAN can also control how far up The inability to achieve a RESOLVE-GOAL-CONFLICT
the proposed goal-subgoal structure it should go to undo meta-goal results in the planner having this failed meta-
a decision. Thus, if no alternative plan for a goal can be goal. Having a failed RESOLVE-GOAL-CONFLICT meta-goal
found, the goal itself can be questioned if it is a subgoal is a condition that triggers the meta-theme MAXIMIZE
of some other plan. For example, consider the following THE VALUE OF THE GOALS ACHIEVED. This triggering
scenario: condition causes this meta-theme to invoke a new meta-
(3) John was going to get the newspaper when he goal. called CHOOSE-MOST-VALUABLE-SCENARIO. This
noticed it was raining. He decided to listen to the goal is satisfied when the relative worth of various
radio instead. achievable subsets of the conflicting goals is assessed.
Here the entire subgoal of getting the newspaper was and the subset offering the greatest potential yield
eliminated. Since this was apparently a subgoal of determined.
finding out the news, the alternative plan of listening to To achieve this meta-goal. we postulate a
the radio can be substituted. Once again, MAKE- SIMULATE-AND-SELECT meta-plan. This plan proposes
ATTRIBUTION is used to propose a plan that doesn't various combinations of goals to try. and computes the
involve an unwanted step. The difTerence between this worth of each combination. The most valuable set of
and the last case is that here a plan lying above the goals is returned as the scenario most worth pursuing.
conflicting goal is re-planned. 4.1. The SIMULATE-AND-SELECT meta-plan
In addition to the RE-PLAN meta-plan, the other structure. To begin with, it makes a n u m b e r of
general goal conflict resolution strategy is to change the presumptions about evaluating the cost and worth of
circumstance that contributes to the conflict. This is goals and of comparing them to one another. We
actually more general than RE-PLAN, because it m a y be presume that values can be attributed to individual
applicable to conflicts where the goals themselves states in isolation ceteris parihiis, and that the value of a
exclude one another, whereas RE-PLAN requires the set can be computed from its parts. This does not
conflict to be plan-based. presume that the computation is simple: indeed, it m a y
CHANGE-CIRCUMSTANCE can resolve a goal conflict by altering a state of the world that is
responsible for the goals conflicting with one another. Once this has been achieved, the
original set of plans may be used without encountering the original problem.

For example, consider the following situations:

(4) John had a meeting with his boss in the morning, but he was feeling ill and wanted to
    stay in bed. He decided to call his boss and try to postpone the meeting until he felt
    better.

(5) John wanted to live in San Francisco, but he also wanted to live near Mary, and she
    lived in New York. John tried to persuade Mary to move to San Francisco with him.

In (4), John's conflict is caused by his plan to attend the meeting and his plan to stay
home and rest. These plans conflict because of the time constraints on John's meeting, which
force the plans to overlap; the plans require John to be in two places at once, so they
cannot be executed simultaneously. If the time constraint on attending the meeting were
relaxed, however, then the conflict would cease to exist. Thus rather than alter his plans,
John can seek to change the circumstances that cause his plans to conflict by attempting to
remove the time constraint that is a cause of the difficulty.

In (5), the conflict is between living in San Francisco and being near Mary, who is in New
York. The basis for this exclusion involves the location of San Francisco and of Mary.
However, if one of these locations were changed so that the distance between them were
reduced, then the states would no longer exclude one another. Thus John can attempt to
change Mary's location, while still maintaining his original goals.

involve the consultation of large amounts of world knowledge. However, we do assume that all
values can be made commensurable. The general issues involving attributing values to goals
are discussed (although by no means resolved) in Wilensky (1980).

The SIMULATE-AND-SELECT meta-plan has in effect two distinct options. The first is quite
straightforward. It simply involves constructing maximal achievable (i.e., non-conflicting)
subsets from among the conflicting goals, and evaluating the net worth of each one. Since we
are generally dealing with two goals in a conflict, this means just evaluating the worth of
one goal and comparing it to the value of the other. Thus if having the newspaper is deemed
more valuable than getting wet, then the planner walks outside to get the newspaper and
allows himself to get soaked. Alternatively, a reader trying to understand someone else's
behavior would use knowledge about this meta-plan to make inferences about their value
system. If we observe John risking getting wet in order to get his morning paper, then we
conclude that having his paper is worth more to him than getting wet.

However, there is another set of alternatives that need to be considered. Consider once
again the example of fetching the newspaper in from the rain, in which the original goals
are to get the newspaper and to remain dry. Rather than abandon either goal completely, a
reasonable alternative is to try to reduce the degree to which one gets wet as much as
possible. A plan for remaining as dry as possible while moving through the rain is to run as
fast as one can. This plan satisfies one goal entirely, and another to a degree. The total
value of this scenario is likely to be greater than the value of
staying dry but not getting the paper, and since the other abandonment possibility (getting
the paper but getting soaked) is clearly worse than this (i.e., getting the paper but
getting less soaked), the scenario involving partial fulfillment is likely to be adopted.

Partial goal fulfillment is a general principle that is applicable to all goals that involve
scalar values. It allows the SIMULATE-AND-SELECT meta-plan to propose options in which the
partial fulfillment of one goal enables the (possibly partial) fulfillment of the other.
This process is illustrated by the "newspaper in the rain" example: MAKE-ATTRIBUTION
determines that the problem with the "stay dry" goal above is that it requires not going
outside. Thus a partial version of this goal is sought that doesn't involve this condition.
In the case of not getting wet above, the "stay as dry as possible" alternative is selected
because this doesn't require not going outside. This scenario is therefore hypothesized and
evaluated along with the strict abandonment options, and the one with the highest value
chosen.

5. Summary and Projections

We have proposed a theory of planning based on four tenets: (1) Commonsense planning is
essentially the consideration of interactions of otherwise simple plans, (2) knowledge about
planning should be usable both by a planner and an understander, (3) planning problems
should be formulated as meta-goals, and solved by the same planning mechanism responsible
for the fulfillment of ordinary goals, and (4) to accomplish much of its mandate, the
planner makes projections of the future based on its current knowledge of the world and its
own tentative plans.

These tenets form the basis for a model of planning whose most salient features are a goal
detector and a projector. The goal detector is used to infer goals, including meta-goals,
based on the situations in which the planner finds itself; the projector is used to guess
what the future will bring based on the planner's current beliefs and plans. As the goal
detector has access to the hypothetical situations simulated by the projector, it can detect
problems with currently intended plans by noticing their consequences in hypothetical
realities. These problems are dealt with by setting up meta-goals to try to assure a more
desirable future state of affairs.

We examined this model of planning in the particular domain of goal conflict resolution.
Here we found use for the meta-plans RE-PLAN (consisting of USE-NORMAL-PLAN and
USE-ALTERNATE-PLAN) and CHANGE-CIRCUMSTANCE for the meta-goal RESOLVE-GOAL-CONFLICT. Both
meta-plans make use of the powerful sub-plan MAKE-ATTRIBUTION. For the related goal of
CHOOSE-MOST-VALUABLE-SCENARIO, the SIMULATE-AND-SELECT meta-plan is used to create
alternatives involving goal abandonment and partial goal fulfillment. MAKE-ATTRIBUTION was
found to be useful here as well.

We are currently attempting to test these ideas in two programs. PAM, a story understanding
system, uses knowledge about goal interactions to understand stories involving multiple
goals. That is, PAM can detect situations like goal conflict and goal competition, and,
realizing that these threaten certain meta-goals, PAM will interpret a character's
subsequent behavior as a meta-plan to address the negative consequences of these
interactions. Thus PAM makes use of the knowledge structures described above, but of course,
it does not test the model of planning per se.

Both the model of planning knowledge and of planning are being used in the development of
PANDORA (Plan ANalyzer with Dynamic Organization, Revision and Application). PANDORA is
given a description of a situation and determines if it has any goals it should act upon. It
then creates plans for these goals, using projection to test them. New goals, including
meta-goals, may be inferred in the process, possibly causing PANDORA to revise its previous
plans.

The following is an example of the kind of planning situation that PANDORA can handle.
PANDORA is presented with a task that requires it to get some nails and to get a hammer.
PANDORA proposes the normal plans for these goals, which require it to go to the store, buy
the desired item, and return. As the plans involve some common preconditions, the meta-theme
"DON'T WASTE RESOURCES" causes PANDORA to have the meta-goal COMBINE-PLANS. A meta-plan
associated with this goal synthesizes a new plan that involves going to the store, buying
both objects, and returning.

PANDORA can also detect and resolve a number of goal conflict-based situations. In addition,
PANDORA is being used to model the planning processes of a human who needs to cook dinner
during a power failure, in which most of the normal plans for one's goals will not be
effective.

References

1  Fikes, R. and Nilsson, N. J. (1971). STRIPS: A new approach to the application of theorem
   proving to problem solving. Artificial Intelligence 2, 189-208.

2  Hayes-Roth, B. and Hayes-Roth, F. (1978). Cognitive Processes in Planning. RAND Report
   R-2366-ONR.

3  McDermott, Drew (1978). Planning and Acting. In Cognitive Science vol. 2, no. 2.

4  McCarthy, J. and Hayes, P. J. (1969). Some philosophical problems from the standpoint of
   artificial intelligence. In Meltzer and D. Michie (eds.) Machine Intelligence, vol. 4.
   New York: American Elsevier, pp.

5  Newell, A., and Simon, H. A. (1972). Human Problem Solving. Englewood Cliffs, N. J.:
   Prentice Hall.

6  Rieger, C. (1975). The Commonsense Algorithm as a Basis for Computer Models of Human
   Memory, Inference, Belief, and Contextual Language Comprehension. In Theoretical Issues
   in Natural Language Processing. R. Schank and B. L. Nash-Webber (eds.). Cambridge, Mass.

7  Sacerdoti, E. D. (1977). A Structure for Plans and Behavior. Elsevier North-Holland,
   Amsterdam.

8  Schank, R. C. and Abelson, R. P. (1977). Scripts, Plans, Goals, and Understanding.
   Lawrence Erlbaum Press, Hillsdale, N.J.

9  Shortliffe, E. H. (1976). MYCIN: Computer-based Medical Consultations. American Elsevier.

10 Stefik, Mark J. (1980). Planning and Meta-Planning - MOLGEN: Part 2. Stanford Heuristic
   Programming Project HPP-80-13 (working paper), Computer Science Department, Stanford
   University.

11 Sussman, G. J. (1975). A Computer Model of Skill Acquisition. American Elsevier, New York.

12 Wilensky, R. (1978). Understanding goal-based stories. Yale University Research Report
   No. 140.

13 Wilensky, R. (1980). Meta-planning: Representing and Using Knowledge about Planning in
   Problem Solving and Natural Language Understanding. Berkeley Electronic Research
   Laboratory Memorandum No. UCB/ERL/M80/33.
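The choice that SIMULATE-AND-SELECT makes in the "newspaper in the rain" example can be illustrated with a minimal sketch. It is not PANDORA's implementation: the function names, the numeric goal values, and the assumption that running leaves the planner "half dry" are all hypothetical, chosen only to show strict abandonment options and a partial fulfillment option being evaluated on a common scale.

```python
from itertools import combinations

def maximal_achievable_subsets(goals, conflicts):
    """Return the non-conflicting goal subsets not contained in a larger non-conflicting subset."""
    def achievable(subset):
        return not any(a in subset and b in subset for a, b in conflicts)

    candidates = [frozenset(c)
                  for size in range(len(goals), 0, -1)
                  for c in combinations(goals, size)
                  if achievable(c)]
    return [s for s in candidates if not any(s < other for other in candidates)]

def scenario_value(fulfillment, goal_values):
    """Net worth of a scenario: each goal's value scaled by its degree of fulfillment (0.0 to 1.0)."""
    return sum(goal_values[g] * degree for g, degree in fulfillment.items())

def simulate_and_select(scenarios, goal_values):
    """Return the name of the scenario with the highest net worth."""
    return max(scenarios, key=lambda name: scenario_value(scenarios[name], goal_values))

# Hypothetical worths for the two conflicting goals (assumed commensurable).
goal_values = {"have newspaper": 5.0, "stay dry": 4.0}
conflicts = [("have newspaper", "stay dry")]

# Strict abandonment options: keep one maximal achievable subset, abandon the rest.
scenarios = {}
for subset in maximal_achievable_subsets(list(goal_values), conflicts):
    name = "keep " + " and ".join(sorted(subset))
    scenarios[name] = {g: (1.0 if g in subset else 0.0) for g in goal_values}

# Partial fulfillment option: run through the rain, staying (by assumption) half dry.
scenarios["run for the newspaper"] = {"have newspaper": 1.0, "stay dry": 0.5}

chosen = simulate_and_select(scenarios, goal_values)
print(chosen)
```

With these assumed values the partial fulfillment scenario is worth more than either abandonment option, so it is the one selected.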
A Commitment-Based Framework for
Describing Informal Cooperative Work

Richard E. Fikes
Cognitive and Instructional Sciences Group
Xerox Palo Alto Research Center
June 1981


In this paper we present a framework for describing cooperative work in informal domains such as an office. We argue that
standard models of such work are inadequate for describing the adaptability and variability that is observed in offices, and are
fundamentally misleading as metaphors for understanding the skills and knowledge needed by computers or people to do the work.
The basic claim in our alternative framework is that an agent's work is defined in terms of making and fulfilling commitments to
other agents. The tasks described in those commitments are merely agreed upon means for fulfilling the commitments, and the
agents involved in the agreement decide in any given situation how and whether a given commitment has been fulfilled. We also
claim that in informal domains, descriptions of tasks, functions, and procedures are necessarily incomplete and imprecise. They
result from negotiations among the agents and serve as agreed upon specifications of what is to be done. The descriptions evolve
during their use in continuing negotiations as situations and questions arise in which their meaning is unclear. These claims imply
that for a given situation an agent using such descriptions must be capable of interpreting imprecise descriptions, determining
effective methods for performing tasks, and negotiating with other agents to determine task requirements.

In this paper we present a framework for describing cooperative work in domains where there
is no agreed upon formal model of the tasks and functions to be done, nor of the procedures
for doing them (i.e., in informal task domains). Most of the work people do is in such
informal domains where it is not feasible to create precise statements of the tasks to be
done, the situations in which they are to be done, the resources available, nor the
capabilities of the processors doing them. For example, people are regularly confronted with
task descriptions such as "write a progress report on the project", "describe the items to
be purchased", "yield right of way", "keep your eye on the ball", or "slice chicken breasts
very thinly into julienne strips".

The project that produced this framework has been focused on the problems of automating
office work involving the use of prespecified procedures, and our experiences with those
problems motivate and provide examples for the discussions in this paper. However, the
results reported here are intended to be generally applicable in any task domain where tasks
and procedures are informally specified, and agents enlist each other's help to achieve
their individual goals.

We began our office automation efforts by attempting to develop a model of the work being
automated that would provide a basis for our system design efforts. We were particularly
interested in describing the skills and information needed to do the work, and in accounting
for the adaptability and variability of methods used by people in performing their tasks.

Our initial thesis was that office procedures are like computer programs and they are
"executed" by a collection of office workers in a manner analogous to a collection of
computers executing a program. It seemed a simple matter to automate office procedures by
storing them in a computerized data base as if they were programs. Then, at each step in the
execution of the procedures in an office the computer could do the step itself, or tell the
person doing the step what operation to do and then monitor the results.

As we proceeded, problems arose that led us to question our thesis [Fikes and Henderson].
One such problem was that our model did not account for the variability in the way tasks are
accomplished. For example, an office worker has more options in following a procedure than
our model described. He can choose to ignore some of the requirements of a task (e.g., leave
some fields of a form blank), renegotiate the requirements of a task (e.g., request
extension of a deadline), or use some method other than the standard procedure for
performing a task.

A second problem was that our model did not account for the difficulties related to working
with informally specified tasks, functions, and procedures. For example, the informality of
office work makes infeasible the specification of precise algorithms for performing tasks.
Situations occur in which the available procedures do not indicate what to do (e.g., a
vendor claims that goods were delivered, but no record of their arrival can be found), what
is indicated cannot be done (e.g., a deadline has already passed), or what is indicated is
not the preferred method for performing the task (e.g., because an unexpected resource is
available). Hence, the work involved in using office procedures is qualitatively different
from the work involved in executing a formal algorithm.

We concluded, then, that modeling procedural office work as simple program execution is an
inadequate basis for automating it and is misleading as a metaphor for understanding the
skills and knowledge needed by computers or people to do the work. That conclusion led us to
search for alternative ways of modeling cooperative work that would account for the way
office tasks are actually performed, and would inform us regarding the required skills and
knowledge. This paper presents the initial results of that search, an alternative based on
the agreements made by agents performing the work and agents for whom the work is done (see
[Flores and Ludlow] for another analysis of office work based on such agreements).

Analyzing Cooperative Informal Work

The Social Nature of Tasks and Functions

Tasks

We begin by analyzing a simple work situation in which an agent has a task that he wants
done. For the purposes of this discussion we will consider a task to be defined as a set of
goals to be achieved while maintaining a set of constraints. The basic tenet of our model is
that tasks are essentially social in nature in that they are done by one agent (a
contractor) for some other agent (a client). The situation where an agent does a task for
himself is the special case in which the client and contractor are the same agent.
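The definition of a task just given can be written down as a small data structure. The following is only an illustrative sketch: the class name, field names, and example agents are hypothetical and do not come from any implemented system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    """A task: goals to achieve while maintaining constraints, done by a contractor for a client."""
    goals: frozenset
    constraints: frozenset
    client: str
    contractor: str

    @property
    def is_self_assigned(self) -> bool:
        # Special case: the agent does the task for himself.
        return self.client == self.contractor

# Hypothetical example agents and task content.
report = Task(
    goals=frozenset({"progress report on the project is written"}),
    constraints=frozenset({"delivered before the deadline"}),
    client="project manager",
    contractor="engineer",
)
errand = Task(
    goals=frozenset({"items to be purchased are described"}),
    constraints=frozenset(),
    client="engineer",
    contractor="engineer",
)
print(report.is_self_assigned, errand.is_self_assigned)
```

Treating the self-assigned case as a task whose client and contractor coincide keeps a single representation for both situations, as the text suggests.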

A client (i.e., an agent who wants a task done) can choose to enlist some other agent to be
the contractor for a task. The client accomplishes that enlistment by establishing a "task
contract" with the contractor (see the work on contract nets [Davis and Smith] [Smith] for a
detailed model of how such contracts are established). We model a contract as consisting of
a collection of commitments, each made by one of the contracting agents to another of the
contracting agents. A task contract is an agreement between two agents containing a
commitment by one of the agents (the contractor) to do a task for a second agent (the
client). The contract may also contain other commitments, such as a commitment by the client
to remunerate (i.e., achieve some goal for) the contractor in return for accomplishing the
task.

In order for a task contract to be established, the client and contractor must agree on the
task that is to be performed. Their negotiations will produce an agreed upon description of
the task, and the commitment will be a statement of intent to do the described task.
Therefore, we consider cooperative tasks as being defined by a process, and as representing
a negotiated agreement between the client and contractor.

Note that a task contract establishes a goal for the contractor of fulfilling his commitment
to the client, and performing the described task is only a means for achieving that goal.
That observation is one of the bases for our explanations of the behavior variations
observed in human cooperative work situations. That is, the agreed upon task description
provides the contractor with a set of sufficient conditions for fulfilling his commitment,
and therefore represents one method of achieving his goal. However, any actions by the
contractor that result in the client agreeing that the commitment has been fulfilled will
achieve the contractor's goal. For example, the contractor may choose to achieve some
variation of the task's goal, ignore some of the constraints, convince the client that the
task shouldn't be done, etc. He is free to use whatever method he thinks will succeed and is
most desirable for him in the context of his other goals and constraints.

In order for a contractor to make use of the flexibility available to him in fulfilling his
commitment, he must know who the client is, the client must be accessible to him for
negotiation, and the contractor must be capable of planning and performing alternative
courses of action to those described in the task contract.

Often in human work situations, a person will agree to perform a given type of task whenever
a given set of conditions occur; that is, he will agree to perform a "function". For
example, a buyer in a corporate procurement department may agree to issue a purchase order
whenever a properly executed purchase request is received. Also, procedures are typically
designed as methods for performing functions rather than individual tasks and are used
whenever the function's task is to be done (e.g., the procedure for issuing purchase
orders). Hence, in order to describe those situations, we will generalize our discussion to
include functions as well as tasks.

As with tasks, one method available to a client for performing a function is to establish an
agreement (a "function contract") with a contractor in which the contractor commits to do
the function for the client. The contract will contain an agreed upon description of the
function to be performed by the contractor. For our discussion, we will consider a function
description to consist of a parameterized task description and parameterized set of
preconditions such that any given instance of the preconditions defines an instance of the
task. Whenever an instance of the preconditions becomes true, the contractor agrees to
perform the corresponding instantiated task.

As with tasks, the contractor's goal is to fulfill his commitment and the agreed upon
function description provides him with a set of sufficient conditions for achieving that
goal. Each time the function's preconditions become true the contractor can choose to do
whatever actions he thinks will satisfy the client.

Note that in the transition from task to function a new subtask has been introduced; namely,
recognition of the occurrence of the preconditions. Hence, an agent who has committed to
perform a function must establish monitors that recognize situations in which an instance of
the function's task is to be performed.

Tasks and Functions in Informal Domains

An agent performing a function depends on the function description to specify each situation
in which he is to do something and in each of those situations the task that he is to
perform. In informal domains, use of those descriptions becomes problematic because of their
imprecision and incompleteness (What is a "properly executed purchase request"?) [Suchman].
Hence, the contractor is confronted with the new subtasks in each situation of interpreting
the function description to determine whether a task is to be done, what the task would be,
and then after doing something whether the task has been accomplished.

We claim in our model that the sole criterion for an acceptable interpretation of these
descriptions is agreement by the contractor and client. That is, the function and task
descriptions are part of the contract between the contractor and client, and those agents
are the final authority as to what those descriptions mean and whether they have been
satisfied. For example, the meaning of "describe the items to be purchased" on a purchase
requisition form is worked out in each case by the requisitioner and the procurement
department buyer, for whom the description is being created.

The interpretation of task and function descriptions in any given situation is therefore a
subject for negotiation between client and contractor. That is, if a commitment description
is not sufficiently precise or complete for the contractor to determine what he should do in
a given situation, then additional negotiations with the client are necessary. Hence, in
informal domains, the negotiation processes that produce commitment descriptions continue
during the fulfillment of those commitments and become an integral part of the work required
to fulfill them.

Agents performing functions in informal domains must be capable of determining appropriate
interpretations of imprecise descriptions and of recognizing when the description is
sufficiently inadequate to warrant renegotiation with the client. When agents are skilled in
those capabilities, the difficult and time consuming process of creating comprehensive
function and procedure descriptions can be avoided. Descriptions can be allowed to build up
incrementally by generalizing the experiences gained in particular situations.

Functions as Operators for Planning

Functions play the same role as operators in standard Artificial Intelligence planning and
problem solving frameworks (for example, [Fikes and Nilsson]) in that they can be used by
agents to achieve goals. We said earlier that an agent who wants a task done can enlist a
second agent to do the task by establishing a task agreement with the second agent.
Functions provide an alternative method of enlisting a second agent to do a task. That is,
if the second agent is a contractor who has made a commitment (it doesn't matter to whom) to
provide a function, and the task that the first agent ("the consumer") wants done is an
instance of that function's task, then the consumer can cause the contractor to do the task
by persuading him that the appropriate instance of the function's preconditions is true. If
the contractor refuses to do the task, then the consumer can appeal to the function's
client, attempting to convince him that the preconditions were satisfied and that the
contractor did not fulfill his commitment to accomplish the task.

For example, if an employee of a small company wants to obtain some equipment for use in his
work, then he can achieve that goal by obtaining the appropriate authorizations and
submitting the appropriate forms to the company's procurement department. The procurement
department has made a commitment to the company president to be the contractor for the
function of purchasing equipment, and receipt of the appropriate forms and authorizations is
the precondition for that function. The employee becomes a consumer of that function by
convincing the contractor that an instance of its preconditions has become true. If the
procurement department refuses to provide the advance, the employee can complain to the
company president that they are not performing their function.
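A function description, as characterized above, pairs a parameterized task description with parameterized preconditions, and the contractor must monitor for situations in which the preconditions hold. The sketch below illustrates that structure with the purchasing example. All names (`FunctionDescription`, `task_instance`, `monitor`, the situation fields, and the employees) are hypothetical, and a fixed predicate is a simplifying assumption: in an informal domain, whether the preconditions hold is itself negotiated between the agents.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass(frozen=True)
class FunctionDescription:
    task_template: str                      # parameterized task description
    preconditions: Callable[[Dict], bool]   # parameterized preconditions

    def task_instance(self, situation: Dict) -> Optional[str]:
        """Return the instantiated task if this situation satisfies the preconditions."""
        if self.preconditions(situation):
            return self.task_template.format(**situation)
        return None

def monitor(function: FunctionDescription, situations: List[Dict]) -> List[str]:
    """Recognize the situations in which an instance of the function's task is to be performed."""
    tasks = []
    for situation in situations:
        task = function.task_instance(situation)
        if task is not None:
            tasks.append(task)
    return tasks

# Hypothetical purchasing function: forms and authorization are the preconditions.
purchasing = FunctionDescription(
    task_template="purchase {item} for {employee}",
    preconditions=lambda s: s.get("forms_submitted", False) and s.get("authorized", False),
)

situations = [
    {"employee": "Lee", "item": "a terminal", "forms_submitted": True, "authorized": True},
    {"employee": "Kim", "item": "a printer", "forms_submitted": True, "authorized": False},
]
pending = monitor(purchasing, situations)
print(pending)
```

In this reading, a consumer who persuades the contractor that the preconditions are true is in effect getting a situation accepted by the `preconditions` test, whatever evidence the contractor agrees to count.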

In deciding to use a function, the consumer replaced his original task with the new task of
persuading the contractor to do the original task. Notice the method for accomplishing the
new task is to convince the contractor that an instance of the preconditions has been
satisfied, rather than simply to make an instance of the preconditions true. The consumer is
free to negotiate with the contractor as to what he will accept as satisfactory evidence
that the preconditions are true. For example, the employee requesting an equipment purchase
might convince the procurement department that a phone call from the employee's manager is
sufficient in that case to authorize the purchase. If the preconditions are informally
described, then there is the additional issue to be resolved in those negotiations of
determining an agreed upon interpretation of the descriptions in the situation. For example,
the employee might ask the procurement department to accept a memo requesting the purchase
rather than the standard form. This is another case where negotiations during the
performance of a task are vital to its completion and where variability is introduced by the
one-time agreements that result from those negotiations.

Subcontracting to Perform Tasks and Functions

Consider again the basic situation in which a client wants a task done and has obtained a
commitment from a contractor to do the task. We could then describe the contractor's
situation as one in which he wants a task done, and that he has the option of persuading yet
a third agent (a "subcontractor") to do some or all of the task for him. The subcontractor
then is in the same situation and has an option to enlist a fourth agent, etc. The same
description holds for functions as well as tasks.

We are interested here in examining the role that the contractor's client plays in the work
of a subcontractor. For that purpose it is sufficient to consider the three agent case where
a contractor and client have an agreement in which the contractor commits to perform a
function, and the contractor instead of performing the function himself establishes an
agreement with a subcontractor in which the subcontractor commits to perform the function.
In that case, the contractor's client then becomes the consumer for the subcontractor's
function.

We can augment our purchasing example by considering the function contract between the
company president and the employee. In that contract, the president commits to purchase
equipment for the employee whenever he submits an authorized request. Instead of doing the
purchasing himself the president contracts with the procurement department to do it. Hence,
a subcontracting relationship exists in which the employee is the consumer, the company
president is the contractor, and the procurement department is the subcontractor. Figure 1
indicates the structure of the two contracts that establish those relationships.

    Main Function Contract:
        Client: The employee
        Contractor: The company president
        Function Description: Purchase equipment for the employee
            whenever he submits an authorized request

    Function Subcontract:
        Client: The company president
        Contractor: The procurement department
        Consumer: The employee
        Function Description: Purchase equipment for the employee
            whenever he submits an authorized request

    Figure 1: Example subcontracting situation in an office

The contractor wants the function done in order to fulfill his commitment to the consumer.
The commitment of the subcontractor to perform the function can therefore be considered as
being to fulfill the contractor's commitment to the consumer. Satisfaction of the consumer
is a sufficient condition and important method for the subcontractor to fulfill his
commitment. The subcontractor can therefore do whatever he thinks will convince the consumer
that the contractor's commitment to him has been fulfilled.

The consumer therefore plays an important role in the subcontractor's work and is an
additional agent with whom the subcontractor can negotiate to determine what is required of
him in a given situation. As before, if the work is being done in an informal domain, then
determining agreed upon interpretations of the descriptions in particular situations is an
additional issue for negotiation. If the subcontractor and the consumer agree on what is to
be done, then the contractor need not enter into the negotiations or even know what was
agreed on because his commitment to the consumer is being fulfilled and the commitment to
him by the subcontractor is being fulfilled.

If in a given situation, the subcontractor and consumer cannot agree on the task to be done,
then they both can appeal to the contractor for help. The subcontractor can argue that his
commitment to the contractor does not include what the consumer is asking for, and the
consumer can argue that the contractor's commitment to him is not being fulfilled. Hence,
the contractor needs to enter into the negotiations only when the subcontractor and the
consumer cannot agree.

For example, when the employee requests the equipment purchase, the procurement department
buyer may attempt to satisfy the employee by convincing him that he should use previously
purchased equipment or that he should rent equipment. He may persuade the employee to help
find an appropriate vendor, and in return agree to obtain the authorizations for the
purchase that are normally part of the employee's responsibility. Such localized one-time
agreements between agents occur regularly in office settings, and are an important aspect of
the variability and adaptability that characterize office work. Standard computer program
description techniques (e.g., flow charts) are hopelessly inadequate for describing such
activity.

So, we see that the consumer is a source of information for the subcontractor about what is
to be done and an authority on when the task has been completed. Also, the consumer acts as
a monitor for the contractor as to whether the subcontractor has done his job, since it is
the consumer who cares whether or not the task is accomplished. The interdependencies among
the consumer, contractor, and subcontractor discussed in this section are summarized in
Figure 2.

    For the consumer:
        The subcontractor:
            Performs the desired task.
        The contractor:
            Settles disputes with the subcontractor.

    For the contractor:
        The subcontractor:
            Fulfills the commitment to the consumer.
        The consumer:
            Provides remuneration for doing the task, and monitors the
            subcontractor's work.

    For the subcontractor:
        The consumer:
            Helps interpret the task description, and indicates when the task
            is completed.
        The contractor:
            Provides remuneration for doing the task, and helps settle
            disputes with the consumer.

    Figure 2: Summary of the Consumer, Contractor, Subcontractor
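The three-agent subcontracting situation can be made concrete with a small sketch that builds the two contracts of the purchasing example and determines who takes part in a negotiation. The names `FunctionContract`, `subcontract`, and `negotiators` are hypothetical, introduced here only for illustration, and the rule coded in `negotiators` is the one stated in the text: the contractor enters the negotiation only when the subcontractor and the consumer cannot agree.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class FunctionContract:
    client: str
    contractor: str
    description: str

def subcontract(main: FunctionContract, subcontractor: str) -> Tuple[FunctionContract, str]:
    """The main contractor becomes the subcontract's client; the main client becomes the consumer."""
    sub = FunctionContract(client=main.contractor,
                           contractor=subcontractor,
                           description=main.description)
    consumer = main.client
    return sub, consumer

def negotiators(main: FunctionContract, sub: FunctionContract, agreed: bool) -> List[str]:
    """Consumer and subcontractor negotiate directly; the contractor joins only on disagreement."""
    parties = [main.client, sub.contractor]
    if not agreed:
        parties.append(main.contractor)
    return parties

main_contract = FunctionContract(
    client="the employee",
    contractor="the company president",
    description="Purchase equipment for the employee whenever he submits an authorized request",
)
sub_contract, consumer = subcontract(main_contract, "the procurement department")

print(negotiators(main_contract, sub_contract, agreed=True))
print(negotiators(main_contract, sub_contract, agreed=False))
```

Deriving the consumer from the main contract, rather than storing it separately, reflects the observation that the contractor's client is what makes the subcontractor's work accountable to a third party.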

The Social Nature of Procedures

Now consider a situation in which an agent has a function he wants done and a procedure describing how to do it. We will call the agent who wants the function "the procedure's manager" and the function "the procedure's function". A procedure describes a method for doing a function in terms of a collection of steps to be done in a specified order, and thereby provides a means for the procedure's manager to organise a collection of agents to perform the procedure's function. That is, the procedure's manager has the option for each step of the procedure of obtaining a commitment from some other agent to do the step (i.e., of "installing the step"). If he obtains such a commitment for each step of the procedure (i.e., if he "installs the procedure"), then the agents who agreed to do the steps (i.e., the "step contractors") will do the function for him. For example, if a procurement department manager is assigned the function of purchasing equipment for employees, then he can either find or create a procedure for performing that function and install the procedure by obtaining commitments from the people in his department to be step contractors for each of the procedure's steps.

In formal domains, operation descriptions can be provided for each step in a procedure that are guaranteed to satisfy the designer's intention for the step (e.g., add x to y). Then the commitment of a step contractor is to perform the step's operation in a manner that satisfies the formal description. The contractor need not have any model of the results expected from his step or of the role they play in performing the procedure's task. His total sphere of concern is to perform the operation as specified. That is the style of procedure execution done, for example, by a typical programming language interpreter.

In informal domains, there are no guarantees that a procedure will successfully accomplish its task. Those guarantees are lost because the procedure, its task, and the situations in which it will be used are imprecisely described. Hence, procedures in informal domains are only prototypes of methods for performing tasks. They suggest a way of decomposing a task into steps, and perhaps indicate how the task is typically performed, but they do not alleviate the need for problem solving in each specific situation to determine how to perform a task. The user of an informal procedure is confronted with the subproblems of determining the meaning of the procedure in the specific situation and whether it will be applicable or effective.

A basic problem in informal domains with installing procedures to perform functions is that one must commit at the time of installation to the decomposition specified by the procedure. If indeed, as we argued above, that decomposition is only suggestive and needs to be reexamined each time the procedure is used, then the strategy of installing a procedure is an ineffective means of transferring the work from the procedure's manager to the step contractors. The challenge then is to describe and install procedures in a manner that maximises their adaptability and flexibility.

Procedure Steps as Functions

An important way of meeting the challenge of compensating for the inadequacies of informally specified procedures is to add to the description of each step a description of the function to be accomplished by that step (i.e., the goals to be achieved and constraints to be maintained each time the step is performed). For example, add to a step described as "Submit to procurement an authorized purchase request" the function description "Whenever an employee wants equipment purchased, achieve the goal: Procurement knows the employee wants equipment purchased and has the information and authorizations necessary to make the purchase".

A function description specifies the requirements of a step without reference to how those requirements are to be performed and therefore provides the option of using whatever method is appropriate in a particular situation to accomplish the function's task. The agent performing a step can use the function description to evaluate whether the action described for the step is an appropriate method in a given situation, to plan alternative methods for performing the step, and to evaluate whether his actions accomplished the step.

Adding function descriptions to steps results in procedures applicable to a wider range of situations because it allows the agents performing the steps to take into consideration properties of the situation such as resource limitations and interactions with other tasks that may not have been known at the time the procedure was designed. The work involved in using those function descriptions is significantly different from the work of performing steps described as operations. In particular, it involves subtasks of planning to determine a method to use, and monitoring to determine whether the method accomplished the function. However, an agent capable of effectively performing those subtasks can better determine the appropriateness of his results and can successfully perform his step in unexpected situations [Fikes].

Subcontracting Within Procedures

Our description of procedure installation thus far would predict that each time a procedure step is activated and the step contractor does something other than the task described in his agreement with the procedure's manager, the contractor must obtain an acknowledgement from the manager that what he did satisfies his commitment. In actual practice in offices, there is a broad variability of behavior in the performance of procedure steps, and only rarely is that behavior accompanied by interaction with the procedure's manager (typically the step contractor's supervisor). Instead, there are frequent negotiations among the agents doing the steps of the procedure. Those agents are not generally working for each other and have made no apparent commitments to each other. How do we explain their negotiations and the role those interactions play in their work? In this section we model those interactions by extending our description of procedure installation to include the subcontracting relationships that are established among the step contractors.

We can apply our analysis of subcontracting to the performance of procedure steps by identifying the commitments made during a procedure installation and considering the "functional role" played by procedure steps. A step's functional role is the rationale used by the procedure designer for including the step in the procedure (e.g., achieve a goal of the procedure's task, satisfy a precondition of some other step in the procedure). That rationale is therefore the defining basis for the function to be performed at that step [VanLehn and Brown].

The function to be performed at each step of a procedure has a set of preconditions as part of its description. The designer of a procedure must assure that when a given step is to be performed, its preconditions are satisfied. That design goal is satisfied by including other steps earlier in the procedure that will cause those preconditions to be satisfied. The functional role of those earlier steps, therefore, is to satisfy the preconditions of the later step.

We can characterize a function's preconditions as consisting of "activation conditions", the occurrence of which signals the contractor that an instance of the function's task is to be done, and "enabling conditions", the satisfaction of which provides the context needed by the contractor to perform the task. For example, the function performed by a buyer in a procurement department is activated when he receives a purchase request and is enabled when he receives authorization to make the purchase. We distinguish, therefore, between steps whose functional role is to activate other steps and those whose role is to enable other steps.

We make use of that distinction in describing the contract that installs a procedure step. That contract contains a commitment by the step contractor to perform the step's function whenever the step's activation conditions occur and a commitment by the procedure's manager to satisfy the step's enabling conditions whenever the activation conditions occur. For example, an accounting department clerk (the step contractor) may make a commitment to his manager (the procedure's manager) to respond to vendors' invoices whenever one is received. The manager would, in turn, agree to provide the clerk with the purchase order, packing slips, and other supporting documents needed to respond appropriately to the vendor.

The procedure's manager satisfies his commitment to satisfy a step's enabling conditions by installing those procedure steps whose functional role is to enable that step. Hence, an agent who is performing a step whose functional role is to enable some other step is in effect a subcontractor whose consumer is the agent performing the step he is enabling. In the accounting department example above, the agents who supply the clerk with the supporting documents are subcontractors whose consumer is the clerk.
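The contract structure described above can be illustrated with a small sketch. It is not part of the framework itself: the class name, the field names, and the two supplying clerks ("purchasing clerk", "receiving clerk") are hypothetical, chosen only to mirror the accounting department example. The sketch assumes that each step has one activation condition and a set of enabling conditions, and it treats any step whose result satisfies another step's enabling condition as a subcontract whose consumer is the enabled step's contractor.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One installed procedure step (all names here are illustrative)."""
    name: str
    contractor: str                               # agent committed to perform the step
    activation: str                               # event signalling the task is to be done
    enabling: set = field(default_factory=set)    # context the manager commits to supply
    produces: set = field(default_factory=set)    # results this step provides


def subcontracts(steps):
    """Return (producer, consumer, result) triples.

    A step whose result satisfies another step's enabling condition is
    treated as a subcontract; its consumer is the enabled step's contractor.
    """
    triples = []
    for enabler in steps:
        for enabled in steps:
            if enabler is enabled:
                continue
            for result in sorted(enabler.produces & enabled.enabling):
                triples.append((enabler.contractor, enabled.contractor, result))
    return triples


def can_perform(step, events, available):
    """True when the step is activated and all its enabling conditions hold."""
    return step.activation in events and step.enabling <= available


# Hypothetical installation of the invoice-handling example.
steps = [
    Step("file purchase order", "purchasing clerk", "order placed",
         produces={"purchase order"}),
    Step("file packing slip", "receiving clerk", "goods received",
         produces={"packing slip"}),
    Step("respond to invoice", "accounting clerk", "invoice received",
         enabling={"purchase order", "packing slip"}),
]

for producer, consumer, result in subcontracts(steps):
    print(f"{producer} supplies '{result}' to {consumer}")
```

In this sketch the subcontracting relationships are derived from the step descriptions rather than declared separately, which reflects the point that they follow from the functional roles of the steps.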

Our earlier comments about the role that a consumer plays in the work of a subcontractor therefore apply here. The agent doing the step being enabled and the agent satisfying the enabling condition negotiate with each other to determine what the enabler's task is in problematic situations, and the procedure's manager is brought into the negotiations only when they cannot agree. Also, the agent being enabled acts as a monitor on the enabler for the procedure's manager.

The analysis of subcontracting applies to any procedure step whose functional role involves providing a result to some agent other than directly to the procedure's manager. In those cases the agent providing the result is fulfilling a commitment made by the procedure's manager to the consumer of that result (or to a client of that consumer). Hence, the consumer and producer can work out together what is to be provided.

We conclude from this discussion that an important way of increasing the adaptability of a procedure is to include in the description of each step the functional role of the step. If that functional role involves fulfilling a commitment of the procedure's manager to some third agent, then the description should include the identity of that agent, and the step contractor should have access to him for ongoing negotiations.

Summary and Conclusions

In this analysis we have described a framework that identifies the sources of variability and adaptability observed in human cooperative work situations. Our basic claim is that an agent's work is defined in terms of making and fulfilling commitments to other agents. The tasks described in those commitments are merely agreed upon means for fulfilling the commitments. The agents involved in the agreement are free in any given situation to decide how and whether a given commitment has been fulfilled. Hence, nonstandard methods and outcomes may be considered acceptable even though they do not correspond to the described tasks, functions, and procedures.

We claimed that descriptions of tasks and functions result from negotiations between clients and contractors, and serve as agreed upon specifications of what the contractor is to do. In informal domains, those descriptions are necessarily incomplete and imprecise. Determining their intended meaning in specific situations is an important component of the work. That determination involves continuing negotiations as situations and questions arise in which the meaning of the descriptions is unclear.

Procedures provide a means for organising a collection of agents to perform a function. In informal domains, procedures represent only prototypes of methods whose meaning and applicability in specific situations is unclear. Their use requires problem solving and negotiation to determine an effective method in a given situation.

Information Needed To Do Cooperative Procedural Work

This framework characterizes the information needed by agents doing cooperative work and the role that the information plays in their work. In general, it indicates that an agent needs descriptions of the task and function contracts to which he has agreed, and the functions available to him.

For each task or function commitment that an agent has made, he needs to know the agreed upon task or function description (because it provides a set of sufficient conditions for fulfilling the commitment), the agent to whom the commitment was made (so that the contractor knows whose satisfaction he is trying to obtain), and the consumer of the results of the task or function in the case where the commitment is a subcontract (because satisfying the consumer is a sufficient condition for fulfilling the commitment).

An agent needs to know the functions available to him so that he can use them as steps in plans he forms to accomplish his tasks. In order to use a function, he needs a description of its task (so that he can determine whether the function can be used to accomplish his task), its preconditions (because they describe a means for initiating performance of the task), the identity of the contractor (so he will know who he must persuade to perform the task), and the identity of the client (so he will know who to appeal to if he feels that the contractor is not adequately performing the function).

Information Needed From a Procedure Description

We have also indicated information that is needed from the description of an informal procedure in order for the procedure to be used adaptively and flexibly. The description should identify the procedure's manager (so that each step contractor knows whose satisfaction he is trying to obtain), and each step of the procedure should be described as a function (so that the step contractor can choose his own method of performing the step). If satisfaction of an enabling condition of a step is subcontracted to another step, then the description of the step being enabled should identify the enabling step and who is performing it (so that the contractor for the enabled step can negotiate with the enabler and monitor his performance). Finally, as noted above about all functions, if a step is a subcontract, then its description should identify the consumer of the subcontract (because satisfying him is a sufficient condition for fulfilling the commitment).

Implications for Office Automation

This framework is serving as a basis for our exploration of how computer-based systems can effectively participate in procedural work in offices. We have reported in earlier papers our preliminary results in this regard ([Fikes] and [Fikes and Henderson]) and will not attempt to describe our current efforts in detail here. Instead, we will conclude this paper with some general remarks on office automation to suggest the uses we are making of the commitment-based framework.

Our discussion indicates that in informal domains, "intelligent" capabilities such as planning, plan monitoring, and negotiation are required to do even simple cooperative work. Current computer-based systems that claim to automate such work in offices do not have those capabilities. They require precise descriptions of their function and how to perform it. Therefore, they can "commit" to doing only a formalizable approximation of the function desired by the client. They are incapable of performing the function in situations that do not match the assumptions of the formalization, and cannot adapt their methods to account for unexpected features of a particular situation such as resource limitation changes or interactions with other tasks. In addition, they require more effort by the client to establish their task or function contract since they have no capability of interpreting vague descriptions and only very limited capabilities for recognizing situations where a description is inapplicable.

All too often, designers and installers of office automation equipment do not realize the unformalizable subtleties of the work being automated, and therefore do not anticipate the differences between what the equipment is going to do and what the people whom it is replacing did. Those differences often cause major upheavals in an organization because they change the work requirements of all the agents who interact with the equipment. A major goal of the analysis described in this paper has been to provide a model of the aspects of office functions being overlooked by current automation efforts so that the differences in functionality introduced by the automation can be predicted and compensated for.

Automation can increase productivity in an office by supporting, as well as replacing, people in their performance of office functions. For example, the framework we have described suggests ways of supporting office work by providing agents with the information they need when they need it. It also suggests a facilitator role for a computer-based system using knowledge of who the clients, contractors, and consumers are for each task being performed. By knowing who must be satisfied by each result, a system would be able to monitor and track the performance of a task without needing to understand the methods being used or the semantics of the task itself.

Acknowledgements

Lucy Suchman is a major participant in the research effort from which this paper emerges and has made significant contributions to the ideas presented herein. Also, I would like to thank Danny Bobrow, Austin Henderson, Tom Malone, Al Newell, and Kurt VanLehn for providing valuable feedback and ideas in support of this paper.
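As a rough illustration of the commitment information discussed in this paper (the description, the client, and, for subcontracts, the consumer) and of the facilitator role suggested for a computer-based system, the following sketch records commitments and tracks which ones still await acceptance. The class, its fields, and the example agents are hypothetical. The sketch assumes, following the text, that acceptance by either the client or the consumer is sufficient to count a commitment as fulfilled, and it deliberately knows nothing about how a task is performed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Commitment:
    """A task or function commitment (field names are illustrative)."""
    description: str                    # agreed-upon task or function description
    contractor: str                     # agent who made the commitment
    client: str                         # agent to whom the commitment was made
    consumer: Optional[str] = None      # set only when the commitment is a subcontract
    accepted_by: Optional[str] = None   # agent who declared the result satisfactory

    def must_satisfy(self):
        """Agents whose satisfaction is sufficient to fulfil the commitment."""
        if self.consumer is not None:
            return [self.consumer, self.client]
        return [self.client]

    def accept(self, agent):
        """Record that one of the entitled agents is satisfied with the result."""
        if agent not in self.must_satisfy():
            raise ValueError(f"{agent} cannot accept this result")
        self.accepted_by = agent

    @property
    def fulfilled(self):
        return self.accepted_by is not None


def outstanding(commitments):
    """Commitments nobody entitled has accepted yet; no task semantics needed."""
    return [c for c in commitments if not c.fulfilled]


# Hypothetical log for the invoice-handling example.
log = [
    Commitment("supply packing slip", "receiving clerk",
               client="accounting manager", consumer="accounting clerk"),
    Commitment("respond to vendor invoice", "accounting clerk",
               client="accounting manager"),
]
log[0].accept("accounting clerk")
print([c.description for c in outstanding(log)])
```

The tracker only needs to know who must be satisfied by each result, which is the property the text identifies as sufficient for monitoring.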

Fikes, R. "Automating the Problem Solving in Procedural Office Work". In Proceedings of the AFIPS Office Automation Conference, Houston, Texas, March 1981.
Fikes, R., and Henderson, A. "On Supporting the Use of Procedures in Office Work". Proceedings of The First Annual National Conference on Artificial Intelligence, Stanford, Calif., August 1980.
Fikes, R., and Nilsson, N. "STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving". Artificial Intelligence 2 (1971), pp 189-208.
Flores, F., and Ludlow, J. "Doing and Speaking in the Office". In Fick, G., and Sprague, R. (eds), DSS: Issues and Challenges. London: Pergamon Press, 1981.
Smith, R. G. "A Framework for Problem Solving in a Distributed Processing Environment". Dept. of Computer Science, Stanford Univ., Stanford, Calif. STAN-CS-78-700 (HPP-78-28), Dec. 1978.
Smith, R. G., and Davis, R. "Frameworks for Cooperation in Distributed Problem Solving". IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-11, No. 1, January 1981.
Suchman, L. "Office Procedures as Practical Action: Theories of Work and Software Design". To appear in a special issue on Office Semantics of the IEEE Transactions on Software Engineering.
VanLehn, K., and Brown, J. S. "Planning Nets: A Representation for Formalizing Analogies and Semantic Models of Procedural Skills". In Snow, R., Federico, P., and Montague, W. (eds), Aptitude, Learning and Instruction: Cognitive Process Analyses. Hillsdale, N.J.: Lawrence Erlbaum Associates.


Barbara Hayes-Roth
Rand Corporation
Santa Monica, CA. 90406

INTRODUCTION

Planning is the predetermination of an intended sequence of actions aimed at achieving a goal. We all engage in planning for a variety of goals, ranging from everyday goals like performing a set of errands to more consequential goals like making a career change. Whether or not we achieve our goals depends in part on the quality of our plans.

During the past few years, my colleagues and I have been studying the cognitive processes people use for planning. When we began this work, most of the earlier scientific research on planning had focused on the development of automatic planning systems (e.g., Fahlman, 1974; Fikes, 1977; Sacerdoti, 1974, 1975). Other research had examined the role of plans in human behavior (e.g., Ernst & Newell, 1969; Miller, Galanter, & Pribram, 1969). However, little was known about the psychology of planning per se: how to identify effective planners, what special skills or strategies effective planners use, and what task factors impact on planning effectiveness. Because our long-range goal was to develop computer aids for human planners, we felt that understanding these psychological issues was an important first step. Accordingly, we embarked on a program of research designed to elucidate the cognitive processes underlying planning and to develop a computer aid that exploits cognitive strengths and compensates for cognitive weaknesses.

Of course, there are many different kinds of planning, depending upon the number of planners involved, the planning environment, the type of goals under consideration, the action options, etc. For our research, we chose to focus on individual planning of multiple-task sequences in a spatial environment. This task domain is well-defined and manageable. At the same time, it is general enough to apply to a variety of specific real-world planning tasks (e.g., planning travel itineraries, planning delivery routes, planning factory inspections, planning tactical missions). For our research, we wanted an instantiation of this task that was both realistic and familiar to the people who would serve as subjects in our experiments. We chose the following errand-planning task:

    Given: a list of desired errands
           a map of the local environment
           starting and ending times
           starting and ending locations
           temporal constraints
           contextual information
    Plan:  which errands to accomplish
           how much time to allocate for each errand
           in what order to conduct the errands
           by what routes to travel between successive errands.

Our research program comprises the following tasks:

1. Develop a cognitive model of the planning process.
2. Conduct experimental investigations of the model.
3. Apply the model in studies of individual differences in planning skill.
4. Apply the model in studies of planners' deficiencies and their vulnerabilities to task factors.
5. Infer principles for improving planning performance.
6. Design a computer aid around the inferred principles.
7. Implement the computer aid.
8. Test the computer aid in real planning environments.

We have completed tasks 1-5 and have recently begun working on task 6. This paper summarizes our work to date.

A COGNITIVE MODEL OF THE PLANNING PROCESS

Our model of the cognitive processes underlying planning behavior exploits the theoretical architecture of the Hearsay-II speech-understanding system (CMU Computer Science Research Group, 1977; Erman & Lesser, 1975; Lesser, Fennell, Erman, & Reddy, 1975; Lesser & Erman, 1977; Hayes-Roth & Lesser, 1977). It also incorporates principles developed in the research on automatic planning systems and on the role of plans in human behavior mentioned above. The model has three basic components: specialists, a "blackboard," and a control regime. Each of these is discussed briefly below. (See Hayes-Roth & Hayes-Roth, 1979, for a more detailed discussion of the model.)

Specialists

Specialists are the mental processes that generate decisions for incorporation into the plan in process. For example, one specialist might generate a decision to establish a particular goal for the plan. Another might generate a decision to take a particular action toward achieving that goal.

We operationalize specialists as condition-action rules, similar to the production rules of Newell and Simon (1972). The condition part of the rule describes the circumstances under which the specialist can make a contribution to the plan. Ordinarily, the condition requires that some other planning decision has already been made. When that condition is satisfied, we say that the specialist has been "invoked." The action part of the rule describes the decision the specialist can contribute to the plan if it is "executed."

We assume that a given individual possesses many planning specialists. Some of them are generic and can
make contributions to all planning problems. Other specialists are domain-specific and can make contributions only to planning problems in their particular domains. We also assume that the many specialists an individual brings to bear on a planning problem operate independently. They do not communicate or influence one another's behavior directly. However, they can communicate and influence one another's behavior indirectly, as discussed in the next section.

The Blackboard

The blackboard is a structured internal data base in which executed specialists record their decisions. All specialists can also inspect the blackboard and respond differentially to the presence of different kinds of information. In this way, specialists indirectly communicate and influence one another's behavior.

The model partitions the blackboard into five conceptual "planes" that distinguish the different kinds of decisions we think planners make. The meta-plan plane contains decisions about how to approach the problem, what kinds of problem-solving strategies to use, what kinds of policies should guide plan development, and what kinds of criteria should be used to evaluate tentative plans. The plan abstraction plane contains decisions characterizing the kinds of actions that should be included in the plan. The knowledge base plane contains data, assumptions, and knowledge about the world that might be useful in instantiating plan abstraction decisions. The plan plane contains decisions about the plan itself. These decisions are typically instantiations of plan abstraction decisions, based on information in the knowledge base. Finally, the executive plane contains decisions about how to sequence the execution of invoked specialists during the planning process. These decisions determine the order in which decisions are generated on the other planes of the blackboard.

The model further partitions each plane of the blackboard into different "levels of abstraction." To illustrate, the plan plane has four different levels of abstraction. The outcomes level contains decisions about the goals of the plan. The design level contains decisions about the general spatial-temporal organization of the plan. The procedures level contains decisions about the actions planned within that spatial-temporal organization. The operations level contains decisions about the low-level operations necessary to carry out those actions. The other planes have similar levels of abstraction.

The blackboard structure outlined above serves two functions. First, it embodies our model of the psychological categories of planning. Thus, it distinguishes our model from other planning models and provides one basis for evaluating the model's psychological validity. Second, the blackboard structure improves computational efficiency by permitting specialists to restrict their inspection activities to those areas of the blackboard likely to contain information of interest.

Control Regime

According to the model, planning proceeds in a series of "cycles." On each cycle, many specialists may be invoked. One specialist is scheduled to execute its action next. It does so, recording its decision at an appropriate location on the blackboard. The recording of a new decision signals the beginning of the next cycle. This process repeats until the planner has developed a satisfactory plan.

The process of scheduling one of the invoked specialists on each cycle is another important feature of the model. Most previous conceptions of the planning process imposed upon it a strict, hierarchical control regime. High-level abstract decisions were made first and refined by later decisions at successively lower levels of abstraction. By contrast, our model assumes an opportunistic control regime. Specialists are scheduled and decisions generated in highly variable orders determined by competing scheduling heuristics. We have concentrated on two scheduling heuristics, focus of attention and recency. The focus of attention heuristic recommends scheduling specialists that record decisions in pre-selected areas of the blackboard. The recency heuristic recommends scheduling specialists that have been invoked recently, for example, on the last one or two cycles. (Hayes-Roth & Lesser (1977) have recommended other heuristics, such as efficiency and efficacy.) We implement these heuristics by means of specialists that record relevant decisions on the executive plane of the blackboard.

The interaction of the focus of attention and recency heuristics can manifest a variety of specific control strategies, including the hierarchical strategy mentioned above. We believe that the flexibility embodied in the opportunistic control regime is both a more accurate model of the variability we observe in human planning behavior and a more powerful approach to planning in general.


We conducted two kinds of experimental investigations of the planning model: psychological experiments and computer simulation experiments.

The psychological experiments provided support for several of the basic assumptions of the model, including the following: (a) that people make planning

decisions in each of the conceptual categories of the blackboard; (b) that people formulate plans at the postulated levels of abstraction; (c) that people develop plans with both top-down and bottom-up decision sequences; (d) that people effect alternative control strategies for planning; and (e) that people opportunistically exploit the information and constraints available during planning. (These experiments are discussed in detail in several reports: Goldin & Hayes-Roth, 1980; Hayes-Roth, 1980; Hayes-Roth & Hayes-Roth, 1979; Hayes-Roth & Thorndyke, 1980.)

The computer simulation was a LISP implementation of the model with about fifty specialists. The simulation served two functions. First, it demonstrated the sufficiency of the model to account for human planning behavior. The simulation produced plans and planning protocols similar to those produced by human planners. It also exhibited their characteristic strategic flexibility. Second, the simulation allowed us to explore some of the computational properties of the model, providing more general insights into distributed computation and heuristic control regimes. (The computer simulation is discussed in more detail in Hayes-Roth & Hayes-Roth, 1979, and in Hayes-Roth, Hayes-Roth, Rosenschein, & Cammarata, 1979).

INDIVIDUAL DIFFERENCES IN PLANNING SKILL

The first application of the opportunistic model was to individual differences in planning skill: why are some planners more effective than others? The model suggests three areas in which effective planners might differ from ineffective planners: their generation of decisions in different areas of the planning blackboard, their flexibility in distributing attention among the different areas of the blackboard, and their repertoires of specialists.

We evaluated these hypotheses by examining the planning processes of several planners with different skill levels. We assessed a planner's skill level based on the quality of the plans he or she produced. The quality of a plan was a composite score reflecting several interacting dimensions (e.g., efficiency, constraint satisfaction, temporal realism). Planners who achieved high plan scores were designated effective planners; those who achieved low plan scores were designated ineffective planners. We then examined the planning process of effective versus ineffective planners as revealed in thinking aloud protocols. Basically, we asked subjects to verbalize their thoughts while they formulated plans. We then analyzed these protocols, classifying statements as representing particular planes of the blackboard or levels of abstraction. Finally, we examined the relationship between planning skill and the frequency with which planners made different kinds of decisions.

The results supported all three hypotheses advanced above. Effective planners generated decisions in all areas of the planning blackboard, whereas ineffective planners generated primarily plan and plan-abstraction decisions. Effective planners also generated decisions at different levels of abstraction, whereas ineffective planners generated primarily low-level decisions. Effective planners showed greater attentional flexibility than ineffective planners. They more frequently shifted their focus of attention among the different planes of the blackboard and among different temporal loci within the plan itself. Finally, effective planners exhibited many more specialists than ineffective planners and they seemed to exploit powerful specialists more actively. (These results are discussed in more detail in Goldin & Hayes-Roth, 1980.)

PLANNERS' DEFICIENCIES AND THEIR VULNERABILITIES TO TASK FACTORS

We next applied the model to an analysis of general deficiencies in human planning and to the deleterious effects of task factors. Using protocol analysis methods similar to those described above, we identified three task factors that impede effective planning: constraints, complexity, and time stress.

Planners seem able to accommodate one or two simple time constraints. However, as the number of time constraints in a problem increases, planning effectiveness deteriorates. Time-constrained tasks, particularly those that appear late in the plan, rarely satisfy their constraints. The problem lies in planning strategy. The opportunistic model permits alternative planning strategies and, for heavily constrained problems, a constraint-based strategy is appropriate. Apparently, however, most of our subjects did not have a constraint-based strategy in their repertoires.

Planners also have difficulty coping with increasing problem complexity. As the number of tasks under consideration and the number of alternative possible plans increase, planners require disproportionately longer times to generate satisfactory plans. This problem also seems to lie in planning strategy. Instead of adopting a strategy which would restrict attention to a small number of the most promising alternatives, our subjects appeared willing to consider new alternatives throughout a planning session.

The third deficiency in human planning lies in the area of time estimation. Planners tend to underestimate the time required to execute planned actions and, as a consequence, to overestimate the number of actions they can execute in the available time. This tendency is exacerbated by time stress (the
time required to execute all the actions under consideration divided by the available time). Two factors contribute to this problem. Planners typically estimate time requirements at relatively high levels of abstraction. Because they fail to enumerate all time-consuming components of an action, they systematically underestimate the time required to execute it. Planners also respond to strong motivational factors. Their strong desire to accomplish all or most of the tasks under consideration biases their time estimates toward underestimation. (These results are discussed in more detail in Hayes-Roth, 1980.)

PRINCIPLES FOR IMPROVING PLANNING PERFORMANCE

Based on the opportunistic model and the empirical results summarized above, we developed the following principles for improving planning performance:

Criteria for Selecting Planners

1. Large-capacity working memory. The opportunistic model describes people's tendency to "jump around" the space of possible considerations while forming plans. This suggests that it may be important for planners to have large-capacity working memories in order to keep track of several aspects of a developing plan simultaneously.

2. Attentional flexibility. Our studies of individual differences in planning performance showed that good planners shift attention among different aspects of a planning problem more frequently than poor planners. Therefore, attentional flexibility may be another important characteristic of potential planners.

3. Strategic flexibility. Our studies of top-down versus bottom-up planning strategies showed the impact of particular planning strategies on the efficacy of the plans subjects produce and on the efficiency with which they produce them. In addition, some subjects appeared more willing than others to adopt alternative strategies. Therefore, strategic flexibility may be another important selection criterion.

What to Teach Planners

4. Concepts of abstract plans, meta-cognitive decisions, executive decisions, and knowledge-base decisions. Our studies of individual differences showed that good planners made decisions in all categories of the planning blackboard, whereas poor planners made only certain kinds of decisions. In particular, high-level abstract decisions, world knowledge decisions, metacognitive decisions, and executive decisions distinguished good planners from poor planners. Therefore, planners should be taught the roles of these types of decisions in the planning process.

5. Domain-specific planning heuristics. Our studies of individual differences also showed that good planners had more different planning specialists than poor planners. Therefore, planners should be taught a variety of domain-specific planning heuristics.

6. Costs and benefits of opportunism. There is considerable evidence that most people employ some amount of opportunism in the planning process. Opportunism provides planners freedom to examine various aspects of a problem, investigate alternative plan configurations, etc. This enables them to discover solutions that more rigid approaches would obscure. On the other hand, opportunism requires additional time and may lead planners down unproductive, as well as productive, solution paths. Planners should be taught these advantages and disadvantages and how to exercise controlled opportunism.

7. General planning strategies. As mentioned above, different planning strategies are appropriate under different circumstances. Planners should be taught general planning strategies and the circumstances under which each is appropriate.

8. Judgment and time estimation. Most people show a strong tendency to underestimate the time required for planned actions. As a consequence, their plans are unrealistic and overrun the time available for execution. Planners should be taught cognitive methods for making such judgments more reliably and more accurately.

How to Train Planners

9. Provide explicit instruction. Explicit instruction appears to be a highly effective technique for training particular planning strategies and methods.

10. Induce illustrative experiences. Many planners seem to be able to generalize what they learn from one planning problem to subsequent, similar planning problems. Therefore, general strategies and methods can be taught by instructing planners how to use them on specific problems and providing opportunities for them to generalize them to similar problems.

11. Illustrate effective planning. Because our planning simulation effectively mimics the cognitive processes people use while planning, it may provide a useful model during training. The simulation could be programmed to illustrate planning strategies.
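
The time-stress measure used in the discussion of time estimation (time required for all actions under consideration, divided by the available time) can be illustrated with a small worked sketch. This is not part of the reported studies: the function name `time_stress`, the errands, and every number below are hypothetical and serve only to show how estimates made at a high level of abstraction can make a plan look feasible when a more detailed enumeration does not.

```python
def time_stress(required_times, available_time):
    """Return total required time divided by available time.

    Values above 1.0 mean the actions under consideration cannot all fit.
    """
    if available_time <= 0:
        raise ValueError("available_time must be positive")
    return sum(required_times) / available_time


# Hypothetical estimates (minutes) for four errands, made at a high level
# of abstraction, i.e. one rough number per errand.
coarse_estimates = [20, 30, 15, 45]

# Hypothetical re-estimates after enumerating the components of each errand
# (travel, waiting, the errand itself). Illustrative numbers only.
detailed_estimates = [30, 45, 25, 60]

available = 120  # minutes available for execution (assumed)

coarse_ratio = time_stress(coarse_estimates, available)
detailed_ratio = time_stress(detailed_estimates, available)

print(f"coarse estimates:   time stress = {coarse_ratio:.2f}")
print(f"detailed estimates: time stress = {detailed_ratio:.2f}")
```

With these made-up numbers the coarse estimates give a ratio below 1, so all four errands appear to fit, while the detailed estimates give a ratio above 1, so at least one errand would have to be dropped.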
Useful Aids for Planners

(See the following section.)

ONGOING RESEARCH

As discussed in the introduction to this paper, we are particularly interested in developing computer aids to planning. We believe that, for the foreseeable future, people will play central roles in most important planning activities. Accordingly, an effective planning aid should exploit people's cognitive strengths and compensate for their cognitive weaknesses. We have recently begun working on the design of such an aid. Our work in this area focuses on a different instantiation of the same general planning task: project planning. We chose this task because it is an important real-world task that could benefit from the development of a planning aid and because we have contact with a variety of people who carry on project planning professionally. Our current design comprises the following components:

Goal Formulation

Our research suggests that planners suffer three deficiencies in the area of goal formulation. They do not formulate complete, well defined goals specifying all of the objectives, constraints, and policies the plan in progress should serve. They sometimes vacillate among conflicting goals, striving to satisfy different goals at different points in the planning process. They try to satisfy too many goals, given the available resources. The goal formulation component should help planners to articulate, prioritize, prune, and coordinate project goals.

Product Specification

Given a set of goals, the planner must generate functional specifications for project product(s). Presumably, development of these products would satisfy project goals. The problems in product specification are that planners may not generate complete specifications or they may not specify products that systematically satisfy project goals. The product specification component should help them to do so.

Task Analysis

Given a set of specifications, the planner must specify a set of project tasks whose execution would implement planned project products. Again, the problems are that planners may not generate a complete set of tasks or analyze them at a sufficiently low level of abstraction or that they may not specify tasks that systematically implement project products. The task analysis component should help them to do so.

Resource Estimation

For each task under consideration, the planner must determine what resources are required to execute it. Our research suggests that planners are unduly optimistic about the number of tasks that can be accomplished with given fixed resources. The resource estimation component should help planners realistically assess the resource requirements of tasks under consideration.

Resource Allocation

Given limited resources and alternative goals, the planner must determine how to allocate the available resources. Our research suggests that planners tend to spread resources too thinly across too many goals. The resource allocation component should help planners to formulate realistic allocation schemes and to perform cost/benefit analyses of alternative allocation schemes.

Scheduling

The planner must schedule planned tasks in a way that provides adequate time for the execution of each task, ensures completion of prerequisites by the time they are required, and minimizes slack time and other costs. The scheduling component should support these activities.

Evaluation

Our research suggests that poor plan evaluation is a major impediment to effective planning. Poor planners do very little systematic evaluation of their tentative plans. Even good planners frequently decide arbitrarily among two or three final contenders. The evaluation component should at least assess whether the final plan meets criteria articulated by the planner. It should also assess the efficacy, efficiency and robustness of the plan in a simulated environment.

During the next few years, we plan to refine this design, implement it as a computer system, and evaluate its utility in the context of several Rand projects.

REFERENCES

CMU Computer Science Research Group. Summary of the CMU Five-year ARPA Effort in Speech Understanding Research, Technical Report, Carnegie-Mellon University, 1977.

Erman, L.D. and V.R. Lesser. "A Multi-Level Organization for Problem Solving Using Many Diverse Cooperating Sources of Knowledge," Proceedings of the Fourth International Joint Conference on Artificial Intelligence, Tbilisi, USSR, 1975, 483-490.

Ernst, G.W. and A. Newell. GPS: A Case Study in Generality and Problem Solving, New York: Academic Press, 1969.

Fahlman, S.E. "A Planning System for Robot Construction Tasks," Artificial Intelligence, 1974, 5, 1-49.

Fikes, R.E. "Knowledge Representation in Automatic Planning Systems," In A.K. Jones (ed.), Perspectives on Computer Science. New York: Academic Press, 1977.

Goldin, Sarah and Barbara Hayes-Roth. Individual Differences in Planning Processes, N-1488-ONR, The Rand Corporation, June, 1980.

Hayes-Roth, Barbara and Frederick Hayes-Roth. "A Cognitive Model of Planning," Cognitive Science, 1979, 3, 275-310.

Hayes-Roth, Barbara and Frederick Hayes-Roth. Cognitive Processes in Planning, R-2366, The Rand Corporation, December, 1978.

Hayes-Roth, Barbara and Perry W. Thorndyke. Decisionmaking During the Planning Process, N-1213-ONR, The Rand Corporation, October, 1980.

Hayes-Roth, Barbara. Estimation of Time Requirements During Planning: Interactions Between Motivation and Cognition, N-1581-ONR, The Rand Corporation, November, 1980.

Hayes-Roth, Barbara. Flexibility in Executive Strategies, N-1170-ONR, The Rand Corporation, September, 1980.

Hayes-Roth, Barbara. Human Planning Processes, R-2670-ONR, Rand Corporation, December, 1980.

Hayes-Roth, F. and V.R. Lesser. "Focus of Attention in the Hearsay-II Speech Understanding System," Proceedings of the Fifth International Joint Conference on Artificial Intelligence, Boston, Mass., 1977, 27-35.

Lesser, V.R., R.D. Fennell, L.D. Erman, and D.R. Reddy. "Organization of the Hearsay-II Speech Understanding System," IEEE Transactions on Acoustics, Speech and Signal Processing, ASSP-23, 1975, 11-23.

Miller, G.A., E. Galanter, and K.H. Pribram. Plans and the Structure of Behavior, Holt, Rinehart and Winston, Inc., 1960.

Newell, A. and H.A. Simon. Human Problem Solving. Englewood Cliffs, N.J.: Prentice-Hall, 1972.

Sacerdoti, E.D. "Planning in a Hierarchy of Abstraction Spaces," Artificial Intelligence, 1974, 5, 115-135.

Sacerdoti, E.D. A Structure for Plans and Behavior. Technical Note 109, Stanford Research Institute, Menlo Park, California, August, 1975.

Everyday Problem Solving

James A. Levin

Laboratory of Comparative Human Cognition
University of California, San Diego

This paper was generated through the distributed social processing of the Laboratory of Comparative Human Cognition, with special help from Denis Newman, Naomi Miyake, Bud Mehan, Ed Hutchins, Peg Griffin, Mike Cole, and Marcia Boruta. Financial support was provided by The Spencer Foundation.

Abstract

Everyday problem solving is different in significant ways from the kinds of problem solving that take place in laboratory microworld settings. Attempts to simplify have excluded important factors that can help us understand aspects of the problem solving that are problems from the point of view of the laboratory. This paper describes several research projects that have examined problem solving in non-laboratory settings, and some of the implications of these studies for cognitive science. The current notions of distributed cognitive processing can be extended in a powerful way to the socially distributed problem solving characteristic of everyday settings. This notion of socially distributed problem solving can then reflect back on individual problem solving, which is acquired and often carried out in social settings.

Someone walks into your office and asks you to recommend a paper to read as an introduction to research on children's problem solving. You discuss with the person exactly what she wants to know, you walk over to your bookshelf to look for an appropriate book, you call a friend on the phone who might know. All very unexceptional, yet imagine that the person didn't allow you to discuss with her exactly what she wanted to know, to go to your bookshelf or call on the phone, but instead required you to answer her question without these external resources. In everyday circumstances you would throw her out of your office. Yet these are exactly the constraints of the laboratory microworlds within which problem solving is studied.

Recently a number of research groups have been studying problem solving in non-laboratory settings. This work has some important implications for cognitive science: it serves to reinforce the findings concerning the role of expertise in human problem solving and expert artificial intelligence systems, and this non-laboratory work extends beyond the current problem solving research, pointing to ways to enrich both models of human problem solving and expert inference systems.

Expertise.

Recent research on expert problem solving has highlighted the large amount of domain specific knowledge and cognitive processes that constitute expertise (Chase & Simon, 1973; Larkin, McDermott, Simon, & Simon, 1980). These studies of human problem solving have been paralleled by the development of artificial intelligence "expert" systems, which are also characterized by a focus on domain specific knowledge and inference processes. This work contrasts with the early work on problem solving, both in psychology and in computer science, which postulated a few all powerful general problem solving processes that would operate over a large passive data base. These "central processor" models have been displaced by various "distributed" data base and processor models, with multiple concurrent processes that interact to produce complex processing.

The domain specific focus of expert knowledge has been reinforced by studies of everyday problem solving. For example, a group of researchers at the University of California, Irvine have been examining the ways that ordinary adults use arithmetic knowledge while grocery shopping (Lave, 1980; De la Rocha, Murtaugh, & Lave, 1981). Schools spend many years teaching us general purpose algorithms that can be used during shopping to calculate comparative prices. Yet most of their observations show that people ordinarily use special purpose heuristics while shopping. Even in this mundane everyday setting, people have developed "expertise" to carry out this task, domain specific methods that bear little resemblance to the general purpose computational skills taught in school.

Similar research by Scribner and her associates (Scribner, 1981) reinforces this finding of special purpose expertise in everyday functioning. They studied the work in a dairy warehouse, examining the use of computation in filling orders and determining total prices. The experts in this domain had developed special purpose algorithms to allow them to function efficiently in this domain.

Problem Solving vs. Routine Functioning.

What is the relation between expertise and problem solving? Problem solving is not just accomplishing particular kinds of tasks labeled as "problems". The processes involved in solving most puzzles are different the second time you solve them (when you know the answer) than the first time. In fact, "expertise" can be defined as the knowledge and cognitive skills that allow a person to perform routinely what other people would have to do through problem solving. Central to the definition of problem solving is the notion of a "blocked condition" (Hutchins & Levin, 1981). Derived from the Gestalt studies of problem solving (Kohler, 1925), this occurs when a problem solver is unable to achieve some goal, after repeated attempts to do so. Problem solving is the cognitive processing that occurs when a problem solver is blocked. Routine functioning is the processing that occurs when unblocked.

Studies of everyday problem solving.

Several research groups have examined how people deal with these blocked conditions in non-laboratory settings. Suchman (1980) did an ethnographic study of problem solving in an office setting, setting down a detailed account of accounting practices, especially those involved in dealing with non-standard cases. Even in the mundane work-a-day office setting, the execution of explicit instructions remains "irremediably problematic", requiring interactive work on the part of the participants. Levin & Kareev (1980) examined the problem solving of children in computer clubs. In both these studies, a critical component of the problem solving, which is largely absent from laboratory studies, is the conceptual organization of the task: determining what the problem IS, what goals and constraints are to be satisfied, and what actions are available.

The second major difference between laboratory problem solving and everyday problem solving is the much more important role played by external resources, those outside the individual problem solver. The laboratory setting is relatively sterile of help and the experimenter works to keep it that way. A standard experiment would not be run in the middle of a busy room. Given a puzzle to solve, it is not considered proper for the subject to ask a friend what the answer is. However, when a person encounters a problem in everyday life, asking someone what to do is usually appropriate.

The use of social resources is probably the biggest single difference between standard laboratory
adults when faced with an arithmetic problem in an everyday setting is to ask someone what the answer is (Lave, 1980). In computer clubs, children help and ask for help effortlessly to get beyond minor blocks so that they can get on with their play (Levin & Kareev, 1980).

Division of labor. One of the most important ways that people use social resources for problem solving is by organizing a division of the work involved. When faced with a new computer game, children divide up the task so that no child is overwhelmed while the game gets played. One child will type in the required responses, another will take over the generation of guesses, others will evaluate and modify the guesses (Levin & Kareev, 1980). This process of organizing and executing a division of labor is so effortless and smooth that repeated viewing of video taped instances is required to see it at all.

Socially distributed problem solving.

An important contribution of cognitive science to the study of everyday problem solving is its frameworks for distributed processing. Using this new processing "language", researchers can now talk about the social distribution of problem solving in many situations, characterizing the nature of each processor and the kinds of interactions that occur (LCHC, 1981; Mehan, 1981). Issues of conflict resolution and information integration, central issues for distributed processing, are also critically important for models of socially distributed problem solving.

Yet this contribution is not a one-way street. The study of problem solving that is distributed over several people can suggest hypotheses about how the same process might be organized cognitively when performed by a single person. A major issue for any cognitive model is the structure of the knowledge and processes: what are the units and subunits? The division of a problem that can be solved distributed across several people provides an existence proof that the problem can successfully be organized that way. Researchers can then carry out empirical tests of whether an individual in fact does organize the task that way.

Acquisition of expertise. A second contribution of this approach to everyday problem solving is to deal at least partly with the issue of acquisition. Current research on the acquisition of problem solving skills and domain specific expertise has concentrated on independent invention (Langley, 1980; Lenat, 1977). Yet models that depend totally on independent invention of knowledge and processes never get very far. It remains a major puzzle how such systems could acquire the huge amounts of domain specific knowledge needed by experts. A way to overcome this block was pointed out by D'Andrade, in his invited address at the previous Cognitive Sciences Meeting (1980): people acquire knowledge and skills from other people. We are socialized within a rich culture, where the people and objects are at least partially organized specifically to help novices become experts in the domains important for functioning in the world. Children are taught the important facts of life; beginners are trained to become experts.

From this point of view, the acquisition of expertise can be characterized as the progressive internalization by the learner of socially distributed processing. A person generally becomes an expert in a setting where he/she can gradually take on more and more of the effort in handling tasks that experts in the domain handle. Children who are novices at a computer game initially divide the task over several children and adults. Gradually, each child takes over more and more functions, so that fewer children have to cooperate to accomplish the task. Finally a child can play the game alone. This progressive acquisition of a process initially distributed socially can in fact provide a rationale for the way that the cognitive processes are distributed within an individual expert, a distribution that allows the expert to draw smoothly upon social resources whenever problems arise.

References

Chase, W. G., & Simon, H. A. Perception in chess. Cognitive Psychology, 1973, 4, 55-81.

De la Rocha, C., Murtaugh, M., & Lave, J. A conceptual framework for locating cognitive processes in everyday life. To appear in J. Lave & B. Rogoff (Eds.), Everyday cognition: Its development in social context. Cambridge, MA: Harvard University Press, in press.

D'Andrade, R. G. The cultural part of cognition. Paper presented at the Second Annual Cognitive Science Conference, Yale University, 1980. Cognitive Science, in press 1981.

Hutchins, E. L., & Levin, J. A. Point of view in problem solving. Paper presented at the Third Annual Cognitive Science Conference, U. C. Berkeley, 1981.

Kohler, W. The mentality of apes. London: Routledge and Kegan Paul, 1925.

Langley, P. Data-driven discovery of physical laws. Cognitive Science, 1981, 5, 31-54.

Lave, J. What's special about experiments as contexts for thinking. The Quarterly Newsletter of the Laboratory of Comparative Human Cognition, 1980, 2, 86-91.

Laboratory of Comparative Human Cognition, UCSD. Culture and intelligence. In R. Sternberg (Ed.), Handbook of intelligence. New York: Cambridge University Press, 1981.

Larkin, J., McDermott, J., Simon, D.P., & Simon, H.A. Expert and novice performance in solving physics problems. Science, 1980, 208, 1335-1342.

Lenat, D. B. Automated theory formation in mathematics. Proceedings of the Fifth International Joint Conference on Artificial Intelligence. 1977, 833-842.

Levin, J. A., & Kareev, Y. Problem solving in everyday situations. The Quarterly Newsletter of the Laboratory of Comparative Human Cognition, 1980, 2, 47-52.

Mehan, H. Practical decision making in naturally occurring institutional settings. To appear in B. Rogoff & J. Lave (Eds.), Everyday cognition: Its development in social context. Cambridge, MA: Harvard University Press, in press.

Newell, A., & Simon, H. A. Human problem solving. Englewood Cliffs, NJ: Prentice-Hall, 1972.

Scribner, S. Studying working intelligence. To appear in J. Lave & B. Rogoff (Eds.), Everyday cognition: Its development in social context. Cambridge, MA: Harvard University Press, in press.

Suchman, L. Office procedure as practical action: Theories of work and software design. Paper presented at the Workshop on Research in Office Semantics, Chatham, Massachusetts, June 1980.

Marriage is a Do-It-Yourself Project:
The Organization of Marital Goals

Naomi Quinn
Duke University

In the story understanding literature, a class of stories which is on its way to becoming something of a famous example has to do with getting married. One typical variant, used by Wilensky (1978: 67), goes like this:

    John was tired of frequenting the local singles' bar. He decided to get married.

The point of this story is that to understand how getting married could possibly replace going to a singles' bar, the reader must know that a state such as marriage can subsume goals that arise repeatedly. In this case the goal is Satisfy-SEX. A series of variants of this story, however, demonstrate that marriage may in fact subsume a variety of such recurring goals:

    John was feeling lonely every evening. He decided to get married (Wilensky ibid.: 72).

    Mary was tired of working for a living. She decided to quit her job and get married (Wilensky ibid.: 77).

Schank and Abelson introduce two other variants (1977: 126) to illustrate a further characteristic of goal subsumption:

    After his marriage with Mary broke up, John began frequenting the local singles' bars.

    After his marriage with Mary broke up, John decided to join a chess club.

To understand these stories we need to realize, Schank and Abelson (ibid.) say, that "goals that arise via subsumption tend to subsume more than a single goal." "Being married," they note, "can subsume the sex urge goal, goals of social stimulation, the desire to have children, to have power over another person, or to be with a loved one." Wilensky (ibid.: 77) views this tendency to goal multiplicity as a distinctive property of social relationships:

    Since there is typically more than one obligation imposed upon the member of a relationship, social relationships usually subsume more than one goal. For example, being married to someone usually subsumes a number of recurring goals in addition to those for which money is instrumental. Since being married obligates each partner to have sex with the other, and since having a willing partner is a precondition for having sex, then marriage subsumes the recurring Satisfy-sex goal. Also, since marriage requires the partners to live together, then it subsumes the recurring Enjoy-company goal to which being near a loved one is instrumental.

These examples promote a certain view of social relationships as bundles of discrete goals which happen, perhaps because of cultural convention, to be packaged together. As Schank and Abelson (ibid.: 127) summarize, "Each social relationship carries with it a packet of goals." This paper is offered as an alternative to that particular view of goals and social relationships. It argues that marital goals are not so much packaged together as highly organized, and that what organizes them is an underlying model of marriage which, far from being fixed and conventional, is constructed by the individual in the course of the relationship.

Perhaps the best way to introduce this view of the organization of goals is to quote some statements by Alex, an interviewee of mine, on the subject of his marriage. Talking about his preconceptions of marriage, he says,

    6H-1: Well yeah I thought it was all going to be wonderful. You know it was--the problem of sex was going to be solved. You know I was an adolescent or barely out of adolescence, you know--this was a wonderful idea. And, I don't know--the idea--and--you really--you're asking some good questions because there really are some things that I knew about and that I wanted. A companion and friend. Probably the most. You know, the things that I did want and I got out of the marriage. That seemed important then--to have someone there all the time that you could rely on. And talk to all the time about things. Somebody to help and somebody to help you, you know, that seemed like a real good idea. That seemed like something I really wanted. It seemed like something that we got out of the marriage. Somebody always there.

In a second interview, he goes back to this issue:

    6H-2: I think, right from the very beginning, I believe I mentioned on the other tape that I felt the--talking about the--what was--did you expect getting married that I did feel a need for a companion and somebody to be with me and to fulfill something that was empty, I didn't know what it was. And I think that might have been the first step--the first stage of love. But now I'd say that love is, to me, is the desire to give more to the other person than you're giving to yourself, at times. You know it's the, not only willingness to accept the part of marriage where you have to change and adjust, but a strong desire to do so. Not only the willingness to accept that there are times when the other part--person or partner might need some help. But the desire to give help. To that person. The sort of love of the fulfilling the other person's needs.

Later in the same interview, talking about why some of the couples they had been thrown together with in the Navy had marital difficulties:

    6H-2: There were certain things they married for. They'd gotten which it may have been their dinner on the table every night and, you know, a warm body next to you in bed. And it suddenly went beyond that. It suddenly got into conversation. And support, things of that sort. And that's--that caused some difficulties.

In a later interview, he describes giving advice to a Navy shipmate:

    6H-^: And he said he was going to get married and I said, "Well," I said, "I hope you think about it real hard because I think you might find marriage to be a little bit surprising than what it is. Because it was for me. Shocking sometimes, you know, that it wasn't all love and sex and that's it. Yeah, that there was some work to be done."

The first thing to be said about the "goal subsumption" model of marriage, then, is that it is a folk model. The reason why we understand marriage as subsumption of a particular set of goals is not because social relationships necessarily subsume multiple goals, but because the goal subsumption model of marriage is a very common preconception in our society. It is not,
however, the only myth about marriage; another,
sub-cultural one, is expressed in the following segments
from interviews with another husband,

2H-7: The marriage thing, I think basically,
is just a mental thing where you know, you've
heard so many things about marriage like it's--
marriage have become something like a negative
thing. You know, a lot of people, you know,
you know, it's just--you just hear so much,
you've never been married so you don't know.
You know you just hear so many things about
when you're married, you know, things really
change. Your wife don't let you go out or,
you know, your husband don't let you go out
and, you have to--you know, you never have
any money, you can't spend money on what you
want to. You know you hear so many things--
when you get married your wife don't want you
sexually anymore or you don't want your wife
and you have to be sneaking here and there
trying to do things. You know, these are
things that you hear and because of all this,
you know, you know, it's like you have a--
it's just like you sort of take caution and
say, "Well let's try it out first and then
we'll"--you know--

You hear it all your life from a kid on up
and from various people.

2H-7: Your wife won't let you out and--or
you got to be doing something. You got to
keep the baby, you're babysitting, you know.
And they say things like, "You're babysitting
while your wife is out in the streets, you
know, playing around." Or you know, you know,
they put a lot of, you know, it's a lot of
fears being--it, you know, it be a lot of
fears--a lot of things that people--you know,
it's just constantly, you know.

2H-7: You know, so it's just something
constantly you know. It's like something--
like it's the end of the road, you know, to
get married. It's like you--you're just
crazy. You're out of your mind, you know.

2H-7: Okay, I'm single, okay and I'm going
out and maybe one of my--somebody I know a
buddy or somebody that's married. Okay, you
hear from women and men. Okay, and you're
single and people tell you, "Oh, you're so
lucky." You know, you know, "You're single
and you can go out here and see all these
women you want to see, you know, anywhere--
you know, you're just--you're living the
life, you know, and they're making you feel
like, "God," you know, "Marriage is like the
end of everything," you know.

Marriage, in this view of Jimmy's, does not so much
solve the problem of sex as create it:

2H-7: Marriage or whatever it is--it's a sex
thing, you know. You got to--your sex got to
be cooking all the time, you know, or else,
you know, you ain't doing nothing, you know.
And your wife may essential--or whoever may
just leave you because, you know--or you may
have to leave, you know, your wife or your
woman because--or go for somebody that's,
you know, that things are rolling, you know.
This is what society has put into you.

Because of this very strong stereotype he has been
taught, Jimmy decided he is never going to get married,

2H-2: I wouldn't want anything like a marriage
to interfere with my normal way of living. I
wouldn't want to interrupt my life to be--you
know, maybe because I feel like I would probably
be unhappy because no woman would really give
me, maybe, the freedom that I need.

Jimmy's model might be dubbed the "goal-frustration"
model of marriage. Our cultural knowledge of this
model should assist us in understanding stories such as,

Jimmy wanted his freedom. He decided not to
get married.

or

Jimmy did not want to feel like he had to keep
sex cooking all the time. He determined never
to get married.

The real-life excerpts above are taken from
interviews with two husbands in a larger study, during
which each spouse in eleven marriages has been
interviewed for an average of 15-16 hours apiece.
Interviewees describe their preconceptions of marriage,
and make clear in the course of these extended
interviews what they think their marriages turned out
to be. To illustrate how goals are organized by these
understandings of marriage which emerge in the course
of it, I would like to provide a somewhat extended
analysis of Alex's conceptualization of his marriage,
which I will then compare with a somewhat briefer look
at another marriage.

First of all, Alex sees his marriage as a joining
together of two people, expressed more specifically in
two metaphors he uses again and again: MARRIAGE IS
BEING A COUPLE and MARRIAGE IS A PERSON. The first of
these metaphors is characteristic of his descriptions
of the early period of his thirteen-year marriage, for
example in this remembrance of the decision to have a
child while he was in the Navy:

6H-1: After I got the first promotion with
the idea that the second one might be coming.
I think that's how it was and that was quite
successful because Shirley got pregnant right
away. And she, you know, when I got back from
Guantanamo she told me and we told the news
to everybody and it was a really big deal.
And I think that the apartment and the baby
and all of that stuff really began to come
down on us, you know, and we started believing
that we were truly a couple. And we were truly
a family and really married.

Here the equation between being married and being a
couple is explicit; and being a couple is also equated
with being a family. All three continue to be used
interchangeably throughout Alex's interviews. The
first identity is explored in the following remarks
about a couple they knew in the Navy:

6H-2: You know it's funny there's even a
couple that we got to know in Iceland that
were just dating there, they just met there.
One of the teachers and a fellow that was up
there who didn't get married for years, but
did later. I mean they went off on their
own--he went further--he went out to the West
Coast and got a ship. She went to Okinawa to
teach. And they eventually got married. And
we knew them very well, and were close to
them and they were just dating then. And they
ended up getting married and having a very
solid marriage, now with a child. And we see
them, they lived in Okinawa for a while and
then they went to Germany. And that's another--
it's amazing how that happened.

I: Why?

6H-2: Well that--I still feel like--I still--
I knew them as a couple, even then. You know
I always saw them as a couple, even though
they really--they weren't in the same
relationship as the other couples that we
knew, but they ended up that way. What's
amazing is how that couple formed the same
sort of marriage that the other married
couples that we knew already had.

Alex is amazed that his perception of them as a married
couple turned out to be accurate. The second equation,
between being a couple and being a family, is elaborated
further in this segment:

6H-1: That, you know, we were now a family.
Just the two of us, we were a family, and
that--and the commitment was to make that
family move. You know, that it was going to
be a solid thing--you know it was going to be
a--you know just like the parents' had been--
they were--and all the people we knew, around
us. Remember of course that this mass divorce
thing which is going on now, hadn't--was not
going on then. This was--we got married in
'67, I guess it was. January '67.

A couple is a family. This excerpt also introduces a
goal--to make that family move. The juxtaposition of
this observation with the comments about contrasting
marriages back then with the mass divorce of the
present time makes clear that what Alex means by a
family that moves is a marriage which does not end in
divorce. Thus one goal of BEING A COUPLE (and hence a
family) is permanence. This excerpt also hints at how
that permanence is to be attained--an issue which will
be taken up below.

The COUPLE metaphor is associated not only with the
goal of permanence, but also with the further goals
(Preservation Goals, in Schank and Abelson's ibid.:
115-116) of exclusivity, proximity, and shared
experience. The first of these goals emerges in Alex's
descriptions of several incidents early in the marriage,
revolving around his wife's flirtatiousness and untoward
interest in men. Alex has this to say:

6H-1: I think I decided it was time for
Shirley to stop fooling around with any other
guys. And that [GETTING MARRIED] was the one
way I thought I could convince her to do that.
Other than that she was going to party for the
rest of her life.

from which it is clear that being married implies
exclusivity. The relationship between BEING A COUPLE
and exclusivity is still clearer in another striking
passage about his response to her emotional tie to her
parents:

6H-1: I think it was significant because you
know I made a very strong decision then and,
you know, we--Shirley went along and I think
that was one of the first times that I had
said, "I'm going to separate you from your
parents." In a very decisive manner and
"This is what we're going to do." And you
know, "You're going to follow me here, you're
going to go with me here as your husband."
Maybe it was, you know--looking back on that
I--maybe that seems--sounds a little
chauvinistic but I think it was a declaration
that we have a marriage and it's just two of
us in this marriage not four.

This segment offers an unusually clear glimpse into the
internal logic by which a goal follows from a metaphor--
probably because Alex is here recapping the very
argument which he used to persuade his wife. Thus "we
have a marriage" sets out the initial assumption; "it's
just the two of us in this marriage" establishes the
identity of a "marriage" with a "couple"; "not four" is
a logical inference following from "just the two of us,"
and in turn supplies a reason for the goal of
exclusivity, here to be realized by separating her from
her parents. Such a chain of reasoning is not often
made explicit in these descriptions of marriages;
rather, the link between a given marital metaphor and
a given marital goal must often be inferred from their
juxtaposition, from more scattered pieces of logic
which can be pieced together to relate them, and from
their recurrence and emphasis as common themes in a
given individual's interviews. Thus, it does not
surprise us to learn that Alex repeatedly stresses the
further goal of proximity. Talking about their decision
whether or not to accept a post in Iceland, a station
which provided housing for families, he says:

6H-2: I think if you're separated by necessity,
if there is no choice about it, that's one
thing. Then you become--you know, you're tied
together by your letters and you're both
fighting it. And that's a different thing.
But to be given a chance--to be told that you
may stay together that everything had been
cleared, that we had a place. It had all been
settled, and then to say, "I'm not going to go."
That would have been a breach of the faith of
the marriage. It would have been hard to
overcome.

Choosing to stay together at every opportunity is a
consequence of the proximity goal; and maintaining ties
during forced separations is an attempt to overcome the
circumstances which prevent proximity. Letter-writing
was a major shipboard activity during his overseas
trips in the Navy; and nowadays, on business trips, he
makes long daily phone calls home. While separations
can be "fought" in these ways, a marriage may not be
able to sustain extremely lengthy ones:

6H-2: But I thought that would be two years
like that [APART] would have done severe
damage to our marriage. I think it would,
you know, it would have done severe damage
and if we weren't the same people that we are
right now that might have happened. And that
there is one of the weakening things of a
trip overseas like that.

It can be inferred that separation causes damage by
blocking the proximity goal, just as Shirley's
flirtations blocked the exclusivity goal. Elsewhere he
describes her flirtatiousness as "jeopardizing our
relationship" before they were married.

Another consequence of the proximity goal is that
partings are difficult; a number of incidents have to
do with sorrowful departures:

6H-6: I think the thing that bothers me the
most and sometimes it bothers me when I leave
in the morning is if something happens to
either one of us and we'll don't see each
other again. And that makes me--I--every once
in a while that thought strikes me--maybe it
doesn't other people who are married--and it
makes me very sad. And I just want to go back
and say some more--and sometimes we're--it's
very hard for us, you know, we'll be very slow

at parting and Shirley will some mornings,
you know. Just going to work Shirley'll be
at the door waving to me and stuff. That's
important I think to both of us that we do
care enough that--and even little departures
are important.

Talking about his out-of-town business trips, which
he does not enjoy, Alex dwells on another goal--that of
sharing experience. This can be seen as a logical
consequence of the proximity goal. Shared experience
both requires proximity, and affirms it.

6H-6: Now I would not choose to take a
vacation alone. That would be terrible to me.
I would prefer to have Shirley with me. On
any sort of vacation or pleasant business at
all. Any sort of real pleasant time I prefer
to have her with me because if--I really would
prefer to share those experiences with her.
In fact, one of the things that I don't like
about the trips is--those trips which are
clearly just for me. When I do go to someplace
very good to eat or I get to go to a show, I
really feel badly about her not being there
because we don't have that as a shared
experience anymore. And those shows and good
meals and things like that that we've done as
a shared thing are really important to us and
have been--are good moments for us in our
marriage.

So much so that,

6H-6: I'd probably feel more guilty about
going to a show than almost anything else we'd
do because the two of us love to go to a show.
And I would probably--I'm going to be in New
York for about a week--or in Stamford for about
a week in January. And I'm going to be in--
about 8 days I'm going to be away in January
including 4 or 5 in Stamford. And when we're
in Stamford we often get tickets to New York
shows--take a train in. And I'm sure we'll
go see some shows. I'm going to try my best
to go--to talk people into shows that Shirley
would not want to go to, but I would.

And as the following excerpt shows, this goal of
sharing experiences has a not inconsequential influence
on the couple's economic decisions:

6H-6: But I also think that it's excellent
when I--you know, when I can have the chance
to have J. along. Not to show her that I'm
working hard because I wouldn't take her on
a trip when there's no choice but to work 12
hours a day. Like the trip to the Caribbean
would have been ridiculous for her to go on
because it was--there were--days began at
7 in the morning and ending at 7:30 in the
evening of work. And nothing pleasant at all
happening. But if it's going to be for a
meeting situation where the amount--actual
work is marginal and there are spouse affairs
going along with it and there are maybe even
some social affairs with it a dance or dinners
or things of that sort. And it's apparent
that a wife would be--or a spouse would be very
much welcome and comfortable there then yes,
you know, by all means we've gone into debt
over those. I mean we really have. We've
spent more than we should have often on those,
without a hesitant mind, and I still don't
look back on them and think that that was money
wasted. By no means was that money wasted.

They have "made a lot of wonderful trips" which have
"been one of the highlights of our marriage." She in
turn, jumps at the chance to go on a trip with him, and
insists that they find a recreational activity to do
together--the one they have settled on is ice skating:

6H-2: Because at the time we were sort of
looking for something to do together and
Shirley--the options she had brought up were
unacceptable to me, particularly the one to
go square dancing. That was one that I had
just--we tried some sports, we did volleyball
for a while but then I just hurt my shoulder.
And I couldn't play volleyball anymore. But
we--Shirley said, I remember her saying, "We
don't do anything together," or something, you
know, she had one of these--and I was, I said,
"That's crazy." You know, "We live together
we have all sorts of things we do together all
the time," but she wanted something specific
she could put her finger on that we did
together.

Just as forced separations are met by attempts to
remain tied together at a distance, unavoidable
occasions when pleasurable experiences cannot be shared
are countered with efforts to share these experiences
vicariously:

6H-6: If I go to a show I--once I've gone,
you know, once it's too late, you know, not
to go. I try to tell her as much as I can to
share with her. She spends a lot of time
looking real sad when I'm telling her about
it. To make sure that I know that she would
prefer to have been there but I do tell her.
In--and neat restaurants that I've been to or
something like that. Places that I would like
to, in the future--us, you know, have us go to
together.

Just as he filled his overseas letters with details of
his life in the Navy, he shares with her all the details
of his business trips--"And the trip is something to be
discussed with her as far as I'm concerned"--as well as
the ins and outs of office politics. A media specialist
for a large company, he brings all of the films he has a
hand in producing home for her to view. She always
presses him for more details of his solitary trips:

6H-6: My memory is not as good as hers for
details. And she always wants more details and
usually what happens to me is I don't remember.
As soon as I get back I have trouble with some
details but over the next week or two weeks or
maybe a period--a long period of time as I
remember more details I tell her. But I tell
her about things like the politics and the
meeting itself and what happened and she's
really my sounding board on a lot of that stuff.
When they've been difficult meetings she really
is a sounding board. I've always shared my
work with Shirley.

A potential of being joined to another person is that
that union can become even closer over time, and Alex
characterizes his marriage in this way. He talks of
how they are "much more tied to each other now than we
were then. We were still two independent people,"
implying that now they are not. He feels "more in love";
romantic love has "deepened and grown stronger." They
like to be with each other more, and also to do things
for each other more than before. To separate himself
from his wife and children becomes more and more

6H-4: I just don't think marriage is about
me. I think it's about us and each time you
add another person in there that person becomes
very deeply involved.

Adding children. It's hard for me to see how
you can separate yourself from all of these
other individuals that you've become so tied
to. Other marriages may not be like mine and
Shirley's where I feel, you know, that we are
this close.

At some point, and increasingly as his story moves
toward the present, he begins to talk about the
marriage itself as no longer "two independent people"
coupled together, but as a person in its own right:

6H-1: When Shirley came back we shortly after
that got an apartment and I think that was
the beginning of some good things. We were
really more upset about the ship's leaving
than we had been about anything else and I
think that was a sign that we were beginning
to really feel for our marriage.

In a series of similarly striking metaphors, he speaks
of the marriage growing and maturing,

6H-6: Our marriage and our love has grown a
great deal over the years and only in the last
maybe eight, nine, ten years do I feel that
it's really matured.

and as something he has a confidence in:

6H-2: I think we pass the crisis then and I
had a confidence in our marriage then which
I think I was lacking right at the point when
we were--you know, well like I said we--I
don't think we got to know each other in that
very intimate way until we got overseas.

Elsewhere he speaks of some things as being "a breach of
the faith of the marriage," more or less "good for a
marriage," "helpful to a marriage situation," "hard for
the marriage," and "challenging to marriages,"

6H-1: There is certain kinds of analysis,
encounter groups and all sorts of things are
done now that do focus on the individual.
Yes, and I think that is challenging to
marriages. What it really comes down to, the
big question is, should the couples that
break up, as a result of having these things,
should they have been married at all? Maybe
those marriages weren't any good anyway. But
you then begin to wonder--take this a step
further--you wonder if any marriage would
make it given the--given egotism. If you put
the self ahead of the marriage, if any marriages
would last. What I'm saying is that I myself
don't know whether analysis destroys that type
of relationship, or whether it just points out
the problems with that type of relationship.

Here the goal of marital permanence is recast in terms
of human survival--Alex wonders "if any marriage would
make it." And another goal is introduced, that of
unselfishness toward one's spouse. This is expressed
elsewhere as "the desire to give more to the other
person than you're giving to yourself, at times" and
"the need to give in the marriage." Egotism, by
contrast, is "putting the self ahead of the marriage."
The exact logic of unselfishness as a goal of marriage
as a PERSON is spelled out in the following statement
concerning the effects of his wife's analysis; this
statement makes use of the PERSON metaphor once again:

6H-4: I certainly believe that Shirley as an
individual should have her rights and
privileges as I think I should. But I also
believe very strongly in Shirley and Alex as
the married couple or the D. family as it
exists here as the family that in itself has
rights and privileges that must be seen to.

Individual needs may interfere with the needs of the
marriage. And analysis, Alex believes, will create this
interference because it promotes individual needs:

6H-4: Our marriage is a very good thing for
both of us. Has been a good thing for both
of us. And is the--to me is the best thing
in the world for both of us. And in analysis
you have to deal with 100% of yourself, me.
You know, the "me" generation is often brought
up. And one of the things that's brought up
is the--a lot of analysis. A lot of these,
"the importance of looking out for Number One"
type concepts. Well that's what analysis is
to a great degree. You're really going after
yourself and your analyst is always saying,
"You have a right to this, you have a right to
that, you have a right to--" You know, "Other
people have no right." Well that really works
very well when you're in analysis. It doesn't
work very well when you're in the middle of a
marriage. Because a marriage is not that. A
marriage is something else. It's something
where you have to be--have to give away part
of this right to always react, certainly.

Therefore Alex is afraid of analysis; for analysis
threatens marriage by blocking the marital unselfishness
goal in the same way as his wife's flirtatiousness
threatened the marriage by blocking the goal of
exclusivity, and extended separation threatened the
marriage by blocking the proximity goal:

6H-4: When Shirley was in analysis for a
short period of time. When that happened
that was very hard on me and, you know, we had
a very good marriage. And I think analysis--I
was afraid analysis was going to ruin it. And
I suspect that Shirley is too. Because she
said to me, "I dread the day you ever go into
analysis because I don't know if you'll--you
know, I don't know what's going to happen to
you." You know--and--as if something might
happen to our marriage.

At the same time Alex agrees that individuals have the
right to find out who they are and what they want out of
life, and he wonders whether there isn't some alternative
way "to analyze people in a marriage as opposed to
analyzing people who are not in a marriage situation."
Because, as he says about individual analysis, in one
final metaphor,

6H-4: If you start to take one part--it's
something like trying to analyze only one side
of my brain. To take one part of the family
and deal that way.

I am arguing that metaphors engender goals. MARRIAGE
IS BEING A COUPLE engenders the marital goals of
permanence, exclusivity, proximity and shared experience,
because these are properties of BEING A COUPLE which
realize the chosen metaphor. This metaphor engenders
the further goals of connectedness during separation and
vicarious sharing of separate experiences because these
are alternative ways of realizing the goals of proximity
and shared experience, respectively. In the same way,
MARRIAGE IS A PERSON entails the goal of marital
unselfishness. But marital metaphors, themselves, are
organized by underlying conceptualizations. The logic
which links these two metaphors seems to be a processual
one--that of two independent people coupled together and
merging into a single person.
A further set of metaphors ubiquitous in Alex's
thinking, marriage as some kind of MANUFACTURED PRODUCT,
is linked to the metaphor, MARRIAGE IS BEING A COUPLE.
This is done by equating one particular property of
BEING A COUPLE--permanence--with a property taken by
MANUFACTURED PRODUCTS--durability. The reasoning is
revealed in one of the interview segments above:

6H-1: That, you know, we were now a family.
Just the two of us, we were a family, and
that--and the commitment was to make that
family move. You know, that it was going to
be a solid thing

"just like the parents' had been," with an ensuing
discussion of marriage in the '60's contrasted with the
mass divorce of today. Another passage which makes
explicit the relationship between product manufacture
and permanency is

6H-1: When we got the apartment in Virginia
Beach in Norfolk it was really nice. And it
was--I have very fond memories of that place
and our marriage at that place. I think we
began to really form a marriage there. The
fact that Shirley got pregnant also indicates
we were getting into something more permanent.

Thus strong or solid or good marriages, or what Alex
elsewhere calls successful marriages, like his own, are
those which are permanent. His comparisons with other
couples inevitably employ the DURABLE PRODUCT metaphor
to distinguish between these marriages and others which
are "weak," or "broken":

6H-2: And we were relying on the kind of
looking at each other and saying, "Well,"
you know, "Who are you?" Now people were
doing that up there, married couples were
doing that quite often and it resulted quite
often in broken marriages. There was a lot
of trouble with that. A lot of couples who
came up there [TO ICELAND] had problems.
And some other couples that came up there
seemed to be very strong in their marriages,
when they left there, as I think we were. And
we remained good friends, although long-distance
friends, with some of these couples. That we
knew up there who were our age, who had good
strong marriages, and whose marriages were
obviously strengthened by the stay there. Or
at least not weakened by it. I think our
marriage was strengthened there. I think we
did--we were forced to look at each other very

marriages are less likely to break than weak ones.
But the introduction of the DURABLE PRODUCT metaphor
seems to permit a proliferation of metaphors expanding
this notion of marriage as some kind of MANUFACTURED
PRODUCT. A common one Alex uses is a metaphor of
marriage as a production or project, the effort at
making the marriage, an effort which in his words
"works" or doesn't work, and which you have to "work at"
and "work out." In one version, the result is a
decidedly home-made product:

6H-4: Maybe this is just for us. Maybe this
has nothing to do with anybody else. But it
could be that--our situation when we got
married was such that we had lots of room to
adjust. Because we didn't have any idea what
we were getting into. That gave us a lot of
room to adjust. And by the time we had been
through the first year we realized, you know,
there would have to be adjustments made. And
a few years afterwards when things really got
serious we were--you know, when the marriage
was strong, it was very strong because it was
made as we went along--it was sort of a do-it-
yourself project.

While Alex is "not exactly sure why one marriage works
and another doesn't," he does have ideas about what has
made his own marriage, and other strong marriages he
has known, work:

6H-2: Well I'd say there's something about
those people that we knew, they had a basic
solid foundation in their marriages that
could be shaped into something good.

Here the product is an edifice of some sort, to allow
the notion that what is critical is a solid foundation.
The foundation, he believes, is "commitment":

6H-2: The couples that we knew that it was
working, I don't think there was any question
in their minds about getting married. All of
them were pretty sure that they wanted to get
married. And they were pretty sure they knew
what to do with marriage. And this was true
about us too, even though, I've said over and
over again, you know, how much we've changed
and how much our relationship has matured.
(Here he refers to the MARRIAGE IS A PERSON metaphor,
which is not applicable in this context),
I still, at the very beginning, I said, you
know, right away we knew there was a
commitment. And this was the same thing that
was basically true about all these couples.
I think they all got married thinking that
this was "the thing." The right thing to do
and it made sense to them.

In his own case the commitment to marriage is attributable
to background, and it becomes clear that the commitment
itself is about not giving up the marriage; now, perhaps,
permanence is equated with the tenacity of the effort
rather than the durability of the product:

6H-1: I think we're just lucky as hell, that
we have a good marriage. And I think it's not
just due to luck but also the fact that both
of us are--have deep commitments to the concept
of marriage based on our backgrounds and
upbringing. We couldn't have given this up
once we made the commitment to it, without
tremendous trauma.

He attributes this attitude to the common cultural
heritage he shares with his wife, because "Jewish
families are strong and such" and "the cultural heritage
gets carried on." Even though his parents' marriage,
unlike his own, is far from ideal.

6H-7: I think they believe in home and family
too and perhaps I've gotten this strong feeling
about home and family from the fact that they're
together despite all this bickering and arguing,
you know, they would never think of divorcing,
I don't believe. I don't think it ever crossed
either their minds.

But there are other factors which worked for him and his
wife:

6H-2: I will only say this, what worked for us
is, we have a common cultural background. We
have the Jewish heritage which is very strong--
that heritage is very strong and has made a
difference for us. We are supportive of each
other and we are very, very complimentary to
each other.

He elaborates on this latter "asset":
6H-4: I think that we were so different and
we had such complimentary differences that
our weaknesses--that both of our weaknesses
were such that the other person could fill in.
And that quickly became apparent to us that
if we wanted to not deride the other person for
their weaknesses we would instead get their
strengths in return. And that's what I think
has been the asset--these are the assets that
have been very good for us. And I suppose
what that means is that we have both looked
into the other person and found their best
parts and used those parts to make the
relationship gel. And make the relationship
complete.

Here he moves to another PRODUCT metaphor which suggests
cannibalizing parts from two broken-down automobiles to
build a working one. Elsewhere, the PRODUCT idea allows
still another metaphor:

6H-4: I'd certainly hate to see people--
everybody in the world jumping into marriage
immature. But maybe there's really something
to the idea that it is not so much how you
feel--exactly how you feel before you get
into a marriage but what you can make of life,
you know, in the marriage that really counts.
Having thought it out--in other words, having
thought it out beforehand and coming to the
conclusion that you really are in love. Might
not be as good for a marriage as having gotten
married, looked into what was worthy of being
in love about, found it, identified it after
awhile, because it does--it takes a while, and
then made that the cornerstone.

And finally, another element in their situation which,
coupled with commitment, made for a permanent marriage
was the very lack of any idea, at the outset, what
marriage was about; this understanding permits him still
another DURABLE PRODUCT metaphor:

6H-4: But I think that my commitment to the
marriage was just so strong and I--as I said
before I think this had a lot to do with
heritage and background and what we knew had
to be done. My commitment to the marriage
was so strong that even though, or maybe--I
don't know, maybe because I hadn't thought
about this, because I had to make it a struggle,
or because it was a struggle for us that we
forged a lifetime proposition. And maybe it
was because things were wrong at first, and we
were screwed up and we didn't know what we
were doing. Maybe that had something to do
with what was good about it. The fact that we
really had to work at it.

So that all these different metaphors--of carrying
out a project, shaping a product, putting it together
from different parts, building an edifice with a solid
foundation and a cornerstone, forging something that
will last a lifetime, something strong and unbreakable--
are permitted by the underlying, and more general,
understanding that MARRIAGE IS A MANUFACTURED PRODUCT.
And each of these metaphors introduces its own marital
goals: you have to start out with the intent not to
give up (the foundation); you have to be willing to work
and struggle to produce it (the effort); you have to
find something worth being in love about (the
cornerstone); you have to identify each others'
strengths and overlook each others' weaknesses (the
parts), and so on. In Schank and Abelson's (ibid.:
116-117) terms, these would seem to be Instrumental
Goals--they are all in the service of constructing a
durable product. In turn, the conceptualization of
marriage as a PRODUCT is linked to a prior understanding
of marriage as BEING A COUPLE, via the association
between product durability and marital permanence.
Unlike other goals of BEING A COUPLE, such as proximity
and shared experience, permanence is an Achievement Goal.
And it is difficult to achieve, something that must be
worked at, a feature which lends itself to characterization
in terms of production. Thus, one model, of what
marriage should be like, leads to another model or theory,
about how it can be implemented. Each suggests apposite
metaphors, and each new metaphor is capable of
introducing new marital goals.

This highly individualized and relatively complex
model of marriage contrasts with the cultural stereotype
with which Alex entered marriage--the goal subsumption
model which cast marriage as a RESOURCE and asked what
he "got out of it." In general the preconceived ideas
with which husbands and wives enter marriage, by
comparison with the understandings which they ultimately
construct through the process of encounter and
assimilation and realization, are very sketchy, static,
culturally standardized and readily dispelled. These
more complex understandings supply any number of simple
stories for a story understanding task, such as

Alex decided it was time for Shirley to stop
fooling around with any other guys. He got
married to her.

or

Alex thought that two years apart would do
severe damage to his marriage. He decided
to take the post which provided housing for
families.

Alex feels that analysis focusses on the
individual. He was afraid that his wife's
analysis would ruin his marriage.

and so on.

The degree to which marital goals are tied to the
metaphors which engender them is more obvious by
comparison with a different set of metaphors. For this
purpose the views of another husband will be somewhat
more briefly summarized and interpreted. This husband,
Bob, began marriage with a version of the LIVE HAPPILY
EVER AFTER myth of American marriage,

5H-2: I don't think many issues did come up.
I think that I was a very adept liver in the
sense of I did my thing rather well. I was
much into being, I think--playing a great deal
on the surface of life and living. I think
that, it seemed to me that people got married
and lived together in a certain amount of
harmony, ergo, we will marry and live together
in a certain amount of harmony. We didn't
pursue in any depth any issue that I recall.
That if, I don't know, me walking in with my
feet sloppy ticked off Eileen then that was a
reason why I didn't walk in with my feet sloppy.

This model was to be jeopardized, first, by their
friendship with another couple who were continually
questioning their own relationship:

5H-2: I think that we saw at that juncture,
probably not even consciously, the potential
for a different kind of marriage--a different
sort of motif. And I think we were both
impressed, awed with it and frightened of it.
We didn't want to leap in into an inquiry
method of marriage at the point--I don't think
either of us felt an impetus to do it, in a
sense that it seemed to be the way to a good
marriage.
Ultimately, however, a series of extramarital sexual
relationships with other people led them to adopt the
inquiry method. These relationships called into
question, for each spouse at different junctures, the
status of their relationship with each other. When
Bob reacted to his wife's affair by declaring his own
intent to figure out "where he is" on his own,

5H-4: She then became very concerned about,
"My God. All this means all this. You know
what I really care about is my relationship
with Bob and I don't want it to go to hell."
And I think that was a very important thing
to hear for me. And I think from then on, we
talked a great deal more about our--where are
we, and not where are you but--or where am I
vis-a-vis life and love and the pursuit of
happiness, but where are you and I?

When he gets involved in an unexpectedly intense love
affair, he and Eileen go to the mountains to work
through the questions,

5H-5: "Where was I? What did it mean?"
"What did it mean to the two of us?" Eileen
basically supportive and in favor of it but
also scared to death and not knowing and I
guess in a sense, at least what I--was conveyed
to me was, "I kind of--I thought another
relationship would be nice and I'm glad that
you're having it. But my God I didn't think it
would be quite this intense," and all of that
sort of thing. The--I think she was not overly
threatened by my sexual involvement with Martha
but that was a weighty dimension. The--I think
she was more concerned about the depth of our
love and the basicness of my love at that point.
Which again, excluded Eileen but--certainly had
an intensity that was different from my love
for Eileen at that time.

It is not only Eileen who is scared by the depth of Bob's
involvement with Martha:

5H-6: I think the risk is basically that the
situation of the relationship with Eileen and
I would get out of hand, such that I couldn't
insure that it would continue no matter what
I did. And so I would be in a position of
losing someone that I thought so much of and
I could do little about it. And so if I played
it safe, if I played it tentatively, I had a
certain control in it, in the sense that
Eileen could be running around and making a
risk of herself in a sense, but if I didn't I
could always hold it together and keep it going
and all that. Which I might think
retrospectively is not probably right either
but that was my perception. So I think really
what was at stake was the--was ending--losing
the relationship.

Bob and Eileen meet each one of these marital crises by
talking:

5H-4: I think it worked out that Eileen and
I both perceived that we had little idea of,
in fact, what our commitments to one another
involved. That we did not in many ways know
ourselves particularly well, relative to
relationships, relative to life and living
and that sort of thing. I think it worked
out that through miles of talking, with one
another, on a lot of this, that it at least
made sense to us and felt important to us to
continue our commitment and our living
together with one another.

And on another occasion,

5H-5: So we went back to the mountains and I
think it was a time of a lot of walks. It was
a time of a lot of searching. A time when we
gave each other some space to kind of work out
where each of us were.... I think at the end of
that time together, we both were--it was pretty
clear to both of us that our love was a very
strong and deep one.... And in that it was a
biggie, that we really had to keep in contact
about it, we really had had to talk about it.

The conclusion which emerges "through miles of talking"
is that the relationship with each other is a basic or
PRIMARY RELATIONSHIP, a "biggie," and this understanding
makes sense of their relationships with other lovers:

5H-5: There was a primary commitment to me
and the relationship with Dave was ancillary,
may deal on areas that I didn't. But was
largely additive and so that made it good and
okay and reasonable and all that. That's kind
of a rule or an understanding in much of that
that made sense.

5H-9: And so there's been a very strong draw.
I think that as we were slugging through some
of the more rational issues or those that we
could get into some sort of rational component,
what really prompted us to do that work and
what really made it important that we come to
some clarity is the strength of the emotional
sort of thing. Which again I think we both
recognized and then just a--I don't think we
glorified it as much as acknowledged that it
was there. That is to say my hurt hurt Eileen,
her hurt hurt me and I can be hurt by lots of
other people's hurt but I think not to a depth
that that seemed to consistently have, and
then with her as well. There--maybe it's the
combination that there is a--there's an
intellectual stimulation with one another,
there's an emotional stimulation with one
another, there's a sexual stimulation with one
another, there's a childbearing stimulation
with one another, or wrestling with great
issues of the world, and so I think Eileen
encapsulates for me an ongoing growth potential
for me and all that gambit and vice versa, I
believe. So--and I think we have found parts
of that in many other people many times, but
no one who we felt could replace in that sense.

MARRIAGE IS A PRIMARY RELATIONSHIP, comparable to other
relationships, but more basic, important, emotionally
deep, and irreplaceable. Unlike Alex's model of
marriage, it does not entail exclusivity. Consistent
with this view of marriage, Bob reports that he has
trouble with the term 'marriage' itself, and he is far
more inclined to speak of their 'relationship.' He
comments that

5H-8: I think that we looked at our
relationship very often in contexts of other
relationships. We have been as a couple at times
very involved with other friends and what they
were doing or not doing. And they have seemed
very comfortable sharing with us where they are
and where they aren't and I think we have
provided for them sometimes new vocabulary and
vice versa. I think Eileen has been more open
about where she is with her friends than I have.
Getting out types of new vocabulary or why
don't you look at it this way. Or getting
support for areas that she was in some sense
holding out or trying to say to me, "No, it
needs to go another way." That sort of thing.
So I think other people provided us both
through their direct request for our counsel
and help and all and the times in us utilizing
their marriages as kind of case studies for
where we are in all this.
This use of other marriages as "case studies" providing
new vocabulary and new perspectives helpful in the task
of understanding their own relationship, contrasts with
the way Alex uses other couples' marriages. For Alex,
other marriages are either like or unlike his own in
terms of durability, and their comparison provides him
with some understanding of what that difference is.

If MARRIAGE IS A PRIMARY RELATIONSHIP, the
Preservation Goal is to keep this relationship together
in the course of other, less important ones. But a
RELATIONSHIP does not link two people together
physically, as does BEING A COUPLE; rather, it positions
them vis-a-vis one another. Thus the problem of
implementing this goal is not one of constructing a more
durable product; it is one of "keeping in contact," a
major Instrumental Goal of this marriage. This involves
the "inquiry method" of marriage. Marriage as INQUIRY
lends itself to the further metaphor of mutual
self-examination, and hence marriage, as a SPATIAL
RELATIONSHIP.

5H-7: I don't know I think that if she were
to change dramatically or I were that that may
not be so. Perhaps part of our commitment to
continually evaluate where we are in some sort
of awareness of that phenomenon. That the
greatest liability to our relationship is to
not work on trying to get a good sense of where
we are. If I let her alone and don't try and
she lets me alone we might satellite far enough
away so we're not sure what's in between us.

This last is a kind of "Lost in Space" metaphor which
nicely captures the sense that there are no reference
points except each other. In this spatial metaphor,
"keeping in contact about it" means "talking about it."
Talk, which initially served to establish the primacy
of their relationship to each other, now serves to keep
each spouse apprised of where the other is so that they
can, presumably, do whatever necessary to "hold
together" their relationship.

5H-5: So I think that the relationship with
Martha probably set a lot of ground rules for
how we're going to communicate through other
relationships, at other times. I guess it
also very deeply confirmed that we can deal
with and process in the throes of living
relatively effectively, a lot of stuff. And
that if we don't, if we try to pack it off
somewhere, very quickly there's a very
perceptiv--perceivable sterility, by each of
us, at where our relationship is.

Again, the role of communication in this marriage
contrasts sharply with the use of talk by Alex and
Shirley to convert unshared into vicariously shared
experience by telling all the remembered details of
their times apart.

Alex feels his marriage threatened when certain
conditions--his wife's flirtations, her analysis, their
lengthy separation--arise to block critical marital
goals. He generally responds to such situations by
doing his best to eliminate the threatening condition.
What threatens Bob's marriage is a much more gradual
process of change in one or the other spouse, which can
be brought about by new relationships or new experiences
of any kind, and which is so nearly imperceptible that
it requires continual monitoring:

5H-7: I guess what I'm saying is I don't know
how big a change that I thought there might
have been but I think that there is always a
thought when Eileen goes into a different sort
of experience and--or I do, of the other
partner saying, "How much will this change
them?" Indeed, Eileen's going to her job in
Greensboro. There was a part of me that
monitored very carefully how she was changing
in the aspect of, "Is this still the person
who I want to interrelate with? Can I deal
with this change? Does it seem reasonable to
me? What do I have to gain or lose from it?"
That whole bit.

Such separate experiences cannot be dealt with, simply,
by sharing them vicariously, for they alter the
individual in ways that may simply move him or her too
far away, so that the relationship can no longer be
held together. Efforts to keep in contact may not be
enough, either. Changes in the other person may be so
great as to call into question whether this is "still
the person who I want to interrelate with"--whether
this should continue to be the PRIMARY RELATIONSHIP.
For unlike Alex's version of a marriage as some kind of
physical coupling, a PRIMARY RELATIONSHIP cannot
necessarily be expected to be permanent.

5H-7: I think that what our relationship is,
is important because we're two other people
that do live together, we have lived together,
Eileen and--the information and the knowledge
and the respect and love for me that she has
is enormous and vice versa. And I think that
now it's much more a commitment to making
certain that we're pretty clear where each of
us are in our growth. In our lives. In our
living. Not, I guess--what I don't feel in
what you asked is, not to keep the relationship
together but because we have a relationship.
I truly could conceive of us in our evolution
of things feeling that an ongoing relationship,
living together, does not make sense. When
we're at a point in growth and who we are or
because of another person or another opportunity
that seems to be there that says, "Okay we need
not and we probably should not perpetuate this."
I guess here again I feel very simple-minded
about the whole thing in that that we live
together we need to clarify these things there's
a compelling interest to do so. I feel pretty
much the same with most everyone as a matter of
fact, but not to the depth of commitment to
doing it. I mean it's very important to me
that Eileen and I are pretty clear where each
of us are.

So that the final Instrumental Goal of marital
self-examination is to assess when it is time to discontinue
the relationship, either because the two spouses have
moved too far apart, or alternatively, because another
relationship has taken primacy. Such assessment, these
passages hint, is understood within yet another
metaphor, of gain, loss and substitutability--A
RELATIONSHIP IS AN ECONOMIC EXCHANGE.

Elsewhere (Quinn n.d.) I have described the role of
the word 'commitment' in framing marital goals. In one
of its senses, marital commitment is to those goals
which are effortful and ongoing enough to require
substantial dedication to them. It is not surprising,
then, to hear Alex say, "....the commitment was to make
that family move. You know, that it was going to be a
solid thing," while Bob speaks of "our commitment to
continually evaluate where we are," "a commitment to
making certain that we're pretty clear where each of us
are in our growth, in our lives. In our living." The
word 'commitment' is used to mark those marital goals
which are instrumental to the implementation of a given
model of marriage.

I hope to have hinted that two-liners from
story-understanding tasks are poor sources of inspiration for
a theory of goal organization. All of us can interpret
any one of the Alex stories I invented above, and
probably for the most part interpret it correctly.
This is because we have a great deal of knowledge about
the possible links between particular goals and
particular actions, including both knowledge of human
nature and of American culture. This knowledge helps us
understand what Alex might be up to in the particular
instance; it does not tell us much more, because to do
such spot interpretation of someone else's behavior we
do not ordinarily need to know his philosophy of
marriage. We do, indeed, have some folklore about the
way in which marriages may organize goals. This
folklore, again, is a misleading source of information
about the way goals are actually organized in ongoing
marriages. It may tell us a lot more about why John
decided to get married than about why John decided to
get divorced. The organization of marital goals and the
goals of other interpersonal relationships is better
understood in the richer context of extended discourse
about the discourser's own relationship—that is, from
the inside. While such discourse is not perfectly
transparent by any means, it does suggest a view of
goals as organized by metaphors which are themselves
organized around underlying theories of what a
relationship should be and how such a relationship can
be implemented.
Quinn, Naomi
n.d. Analysis of a key word: 'commitment' in
American marriage. Unpublished ms.
Schank, Roger and Robert Abelson
1977 Scripts, plans, goals and understanding: an
inquiry into human knowledge structures. New
Jersey: Erlbaum.
Wilensky, Robert
1978 Understanding goal-based stories. Doctoral
dissertation. Department of Computer Science,
Yale University.


Transformational Structure and Perceptual Organization
Stephen E. Palmer
University of California, Berkeley

The problems of structure and
organization in perception are among the most
central in the history of the field.
[...]
In this paper I suggest answers to both
questions. To provide an initial idea of the
direction I will take, here is a brief
preview. The organizational phenomena I take
to be most critical for understanding the
structure of perception are: (a) object
constancy despite changes in image size,
shape, position, and orientation, (b) motion
perception of structured objects, (c) figural
goodness or "good Gestalt", (d) perceptual
grouping of the visual field, and (e)
reference frame effects. First, I will argue
that these phenomena are related by the fact
that underlying them all is a common
transformational structure. Second, I will
describe a general design for a computational
system within which transformational
structure can be analyzed easily. The
proposal is to couple a transformationally
based system of analyzers with an attentional
mechanism that establishes a variable frame
of reference within it. The job of the
attentional frame is to move around within
the space of analyzers so as to maximize the
invariance and stability of the attended
output. It does so by compensating for the
changes brought about by the transformations.
Finally, I will make a few remarks about why
I think the kind of computational structure I
am proposing is an interesting one from a
purely theoretical standpoint.

Transformational Structure

The basis of transformational structure
is the concept of transformational
invariance. Transformational invariance
refers to the fact that when an object
undergoes a spatial transformation, such as a
rotation, a great many changes occur in the
pattern of stimulation on the retina, but at
the same time there is a great deal of higher
order structure that does not change at all.
All sorts of relationships (or relationships
among relationships) do not change, even
though each first order property does, and
these unchanging aspects are the
transformational invariants. In many cases
they correspond more directly to the
intrinsic properties of the real world object
undergoing the transformation -- its size,
shape, color, and so forth -- than to the
properties of its projected image. The
image, after all, changes in ways that the
object clearly does not. Because
transformational invariants reflect object
properties rather than image properties,
computing them probably plays an important
role in getting from an image based
representation to an object or world based
one.

How can the transformational structure
of an event be computed from the
spatio-temporal images that arise from it?
The problem, of course, is that while it is
trivial to compute the image over time from
knowledge of the real world object and its
real world transformation, the reverse is not
at all trivial. In fact, there isn't even a
unique solution, but only an infinite set of
object/transformation pairs. In other words,
there are many different world events --
objects undergoing transformations -- that
could give rise to the same specific event
images and no logical basis on which to
decide the correct one from purely optical
information. It is clear that some
additional assumptions are needed about
either the nature of the object, the nature
of the transformation, or both to reach a
determinate solution. To reach the correct
solution as well, the additional assumptions
will have to be chosen in a principled way.

What assumptions might help here?
Clearly a good bet would be any assumption
that is generally true of events in the
world. Then, arriving at solutions that are
consistent with these assumptions will almost
always result in veridical perception.

Perhaps the most striking fact about the
transformations that characterize events in
the world is that both objects and
transformations tend to be fairly stable over
time and space. This is certainly true over
short intervals of time and small regions of
space and is usually true over long intervals
of time (on the order of at least whole
seconds) and larger regions of space as well.
To the extent that this is so, objects can be
well approximated as rigid and the
transformations they undergo as uniform
motions in three dimensional space.
Naturally, there are many cases of non-rigid
objects undergoing various complex motions.
But even these are probably best understood
in terms of how they can be analyzed into a
structure of roughly rigid components
undergoing roughly uniform motions. The
motion of a person walking is a case of
non-rigid motion that has yielded to such an
analysis (Johansson, 1973; Cutting, 1981).

The rigidity and uniformity assumptions
suggest that the perceptual system operates
in such a way as to maximize invariance in
both objects and motions. In anthropomorphic
terms, the system "wants" to analyze both
objects and motions as "changing as little as
possible". "Wanting to change as little as
possible" refers simultaneously to the facts
that the perceived object/motion pair
must, in some sense, account for the sensed
variations in the image and that this can be
done in more than one way. To pick a well
studied example, the kinetic depth effect, if
a line changes simultaneously in its
projected length and orientation, it might be
either (a) a line that gets shorter and
longer by certain amounts as it rotates in
various possible ways or (b) a line of
constant length that is rotating in depth
(Wallach & O'Connell, 1953). The latter is
almost invariably perceived, although it
sometimes takes an observer several seconds
to achieve it. Once it is perceived,
however, it is stable and does not
spontaneously change to something else.
According to the present line of thought, the
preferred interpretation arises because it is
"simpler" than the alternatives given the
heuristic assumption that objects in the
world tend to be rigid and undergo uniform
motions in three dimensional space. Thus,
the perceptual system seems to prefer
interpretations of greatest possible
invariance.

Motion and Constancy

The perceptual phenomena most directly
and obviously related to transformational
invariance are those of motion perception and
object constancy. Motion occurs when the
position of some object or part of an object
changes over time. It is the paradigmatic
case of a transformation, of course, but it
is perhaps not so obvious what it has to do
with invariance. In fact, the whole concept
of a distinct object undergoing motion
presupposes invariance in that the object is
taken to be unchanging (except for its
position, of course) as it moves. Logically,
one could just as well say that the world had
changed its intrinsic nature over time. This
would be a more reasonable notion if one
perceived the visible surfaces of the world
like a rubber sheet that simply changed its
shape plastically during events. The fact is
that people do not perceive the world in this
way, but as consisting of articulated,
unchanging objects that undergo various sorts
of motions. This highlights the fact that
perceiving motions of objects actually
presupposes invariant aspects as well as
varying ones, and that an event in the world
always has both components.

It appears, then, that object constancy
is just the other side of the coin from
motion perception. Motion is the perceived
transformation; object constancy is the
perceived invariance. They are completely
coupled in that for the motion to be
different, the object must be different too.
In the case of a rotation in depth, for
example, either the object is rigid and the
motion is uniform (as in actual depth
rotation), or the object is plastic and the
motion is nonuniform in such a way that their
combined changes produce the two dimensional
image (as in the plastically deforming
colored regions in a motion picture of a
depth rotation).

Given that real world events tend to
consist of rigid objects in uniform motions,
a perceptual system would be biased toward
veridicality if it somehow embodied
preferences toward perceiving rigid objects
and uniform motions. Indeed, there is good
evidence that this is true. When presented
with ambiguous information, people tend to
perceive rigid objects rotating or
translating in three-dimensional space as
long as the optical structure is consistent
with such an interpretation and the system is
given sufficient time. For example,
Johansson (1950) showed that people have a
strong tendency to see two moving points as
fixed at the ends of a rigid rod moving in
three dimensions rather than as moving
non-rigidly in two dimensions. Thus, an
ambiguous object in unambiguous motion tends
to be perceived as rigid rather than
plastically deforming. The other side of
this story is that an unambiguously specified
object in ambiguous stroboscopic motion tends
to be seen in uniform motion (Shepard & Judd,
1976; Farrell & Shepard, 1981). Clearly,
there is something special about rigid
objects and motions for the human visual
system.

Figural Goodness

Another important problem in perceptual
theory that is intimately related to
transformational structure is what
psychologists have come to call "figural
goodness." Figural goodness refers primarily
to subjective feelings of order, regularity,
and simplicity in certain figures as opposed
to others. The relation of this subjective
feeling to transformational structure is not
intuitively obvious, but it is nevertheless
simple to grasp.

Figures are good to the extent that
they are themselves invariant over certain
types of transformations. The most obvious
case is that of standard bilateral or
reflectional symmetry. To illustrate, the
letters "A" and "[illegible]" are reflectionally
symmetric about their vertical axes because
each letter is the same as itself after being
reflected about a vertical line through its
center. The other widely known type of
symmetry is rotational. The letters "N" and
"Z" are rotationally symmetric through an
angle of 180 degrees because each letter is
the same as itself after being rotated by
180 degrees. Still other letters have
transformational invariance over a number of
different transformations: "X", "O", "H", and
"I" all have two reflectional symmetries
(about vertical and horizontal lines through
their centers) as well as 180-degree
rotational symmetry. A perfect circle has
still greater transformational invariance
because it is unchanged by all central
rotations and reflections.

There are two other, less well known
types of symmetry: translational and
dilational (Weyl, 1952). They are defined by
the same abstract scheme as for reflectional
and rotational symmetries. Both of these
latter sorts of symmetries technically apply
only to idealized, infinite patterns, but one
can define "local" versions that apply to
finite patterns by only requiring invariance
for part of the pattern over the
transformation. (See Palmer, in press, for a
more complete discussion of symmetry, local
symmetry, and their relation to
transformational structure.)

It turns out that the goodness of a
figure can be well predicted from its
symmetries in this extended sense: the set of
transformations over which the figure is
invariant. Garner (1974) showed that ratings
of perceived goodness increased monotonically
with the number of transformations over which
the figure is invariant. Garner actually
talks about "rotation and reflection (or
R & R) subsets" of a figure, but this concept
turns out to be isomorphic to the number of
rotational and reflectional invariants
(Palmer, in press). Further, the amount of
transformational invariance a figure has also
strongly affects how quickly people can match
two figures for physical identity, how well
they remember a figure, and how easily they
can describe it. Such results demonstrate
the reality of figural goodness in perceptual
processing. (See Garner, 1974, for a
review). There is additional evidence that
figural goodness depends on translational and
dilational invariances as well (Leeuwenberg,
1971).

[Figure 1, panels A and B]
Fig. 1: Grouping by proximity and similarity.

The point I want to make about grouping
phenomena is just this: Elements are grouped
together when they are in closer
transformational relationships to each other
than they are to other elements.
Transformational "closeness" refers to the
magnitude of the transformation required to
achieve transformational invariance. For
example, the dots in Figure 1A must undergo a
larger translation to bring them into
congruence with their vertical neighbors than
with their horizontal neighbors. In Figure
1B, the figures must undergo a rotation as
well as a translation to coincide with their
vertical neighbors, whereas just a
translation will suffice for the horizontal
neighbors.
In summary, figural goodness seems tc be Now consider some of the most potent
characterl2ed quite nicely by the concept of factors in grouping phenomena. Similarity of
transf invarlance. The relevant two elements in position, orientation, and
transformations in this case are reflections, size can be defined by the magnitudes of the
rotations, translations, and illations. transforrations — translations, rotations,
Except for the addition of reflections, these and dilations, respectively — that are
ere the sane set that charactFrized motion required to make them equivalent. Continuity
and cbjert constancy phenomena. It seers is similarity over local translations, and
unlikely that this is merely a coincidence. bilateral symmetry is similarity over
reflections. And so, once again, we find the
jrorplnfi same types of transformations lurking behird
grouping phenomena as we found behind motion,
The next phenomenon I wart to relate to object constancy, and figural gcodness.
transformational structure is grouping Grouping seems to be determined by maximizing
(Werthelmer, 1922;. Figure 1 shows some transformational relatedness within a
standard exarples of grouping phenomena In perceptual group.
which mcst people report perceiving either a
Frames of Reference
vertical or horizontal oigarizatlor. Figure
lA demonstrates the infli.erce of proximity en The final category cf perceptual
grouping. The dots are orj^anized into jjhenomena I went to discusss I will call
vertical cclumr.s (rather than hcrizc-tal or reference frame effects." It Includes a
diagonal rows) because their vertical number cf different results in many different
proximity is greater than their horliortal domains. Whet they all have in common is to
Froxlmily. Figure 13 eemcnstrates the suggest that perception at any moment occurs
Influence of sirilarity of orientation. All within e single, unitary frame of reference
else being equal, similar elements tend to be that captures common properties of the whole
grouped together and dissimilar elements display. Cther properties are perceived
grouped apart. ^'ary different klnes cf relative to this frame, very lik'=l;• in terms
similarity have teen shown to affect of deviations from it.
grouping, but color, size, and orientation
are particularly striking. Continuity, As an example. consider now the
symmetry, and closure are thre" other well orientation cf a global referer'ce frame can
documented factors. Lvt perhaps the rost affect shape perception. Fi^^ures 2A and 23
ptent of all is what group together
gestaltists by
called show the same form in two orientations that
common fate." Elements rnrve ir, the same differ by a 4.5-degree rotation. Figure 2A is
common fate whPr they rate. ;ven a perceived as a square because its sides are
direction at the same field cf random dct hori7ontal and vertical, and Figure 21 is
texture spontaneously
completely homogeneous organizes Into figure perceived as a diamond because its sides are
and ground ^hen a spatial subset of the dots diagonal. Powever, an interesting thing
begins to move together or when the rest of happens when a number of such forms are
the dots begin to move around them. aligned diagonally. The perceived shapes

reverse: horizontal and vertical sides produce the appearance of diamonds while diagonal sides produce the appearance of squares (Attneave, 1968; Palmer, in preparation). It seems that the tilt of the whole configuration is somehow "factored out" of the display, and the orientation of the sides is then perceived relative to the whole configuration. The conjecture is that this "factoring out" is done by establishing a tilted frame for the figure within which 45 degrees is the referent orientation. This would explain why the shapes are perceived as they are, and it fits well with many other orientational phenomena in shape perception. (See Rock, 1973, for a review.)

Fig. 2: Reference frames in shape perception. (Panels A and B.)

Similar effects are well known to occur in motion perception. For instance, when two points are in sinusoidal motion, one vertically and the other horizontally as shown in Figure 3A, people do not usually perceive them as such. Rather, they see a configuration that moves diagonally as a unit, within which the two dots move toward and away from each other as depicted in Figure 3B (Johansson, 1950). Here again, it seems that the perceptual system establishes a frame of reference for the common motion and "factors it out." "Induced" motion effects are similar in that an unmoving object is seen to move because a larger, more prominent optical structure serves as the frame of reference but is moving so slowly that its motion is not detected (Duncker, 1929).

Fig. 3: Reference frames in motion perception.

The relation of reference frames to transformational structure can best be explained by analogy to their role in analytic geometry. There, a reference frame is established to describe a point's location numerically. It provides a set of assumptions within which spatial positions map into numerical coordinates. Different reference frames map the same point into different coordinates, but they are related by exactly the same transformations that relate their corresponding frames. In standard Euclidean geometry, this set consists of translations, rotations, dilations, and reflections. Thus, the structure underlying all possible Euclidean frames of reference is based on transformational invariance: the set of transformations that make one reference frame equivalent to all others.

Returning to perception, it seems that the perceptual orientation of a line or perceived direction of motion is like the position of a point in analytic geometry: their values depend on the reference frames within which they are perceived. Like its geometrical counterpart, this frame seems to include something akin to a position (origin), orientation (axis), resolution (unit size), and reflection (sense). And if this frame is variable from one moment to the next, then an underlying structure of transformations is implied that relates one frame to another. These transformations are exactly the same as we have encountered repeatedly: translations (for position), rotations (for orientation), dilations (for unit size), and reflections (for sense).

Transformational Theory of Perceptual Structure

All of the perceptual phenomena just discussed suggest that the perceptual system has a definite preference for processing optical structure involving certain kinds of transformational invariances. The questions I now want to address are (a) what this might tell us about perceptual organization and (b) how such a system might be constructed computationally.

I think the examples considered above are telling us that the visual system is built on a transformational base that can be used to extract transformations as a heuristic for solving perceptual problems. In other words, the system is designed to be transparent to transformations of the sort most often encountered so that it "prefers" interpretations involving them. By "transparent" I mean that (a) these transformations can be computed rapidly and easily in such a way that (b) the system can compensate for them simply and efficiently. This strongly suggests that the system must be designed to solve the problems of transformational invariance right from the start, and that these design features form the heart of the system.

There are three basic components I will consider here: (a) a space of analyzers that are transformationally related to one another, (b) higher order analyzers that compute output similarity of lower order analyzers over local transformational relations, and (c) an attentional mechanism that establishes a perceptual reference frame within the analyzer space that maximizes invariances. The output of attentional fixations is stored in memory as a representation of the perceived object. I will discuss each part in turn more fully, but the reader should remember that they are interrelated proposals that only make sense within the complete systemic structure.
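As a rough illustration of how the three components just listed might fit together, the following toy sketch works on a one-dimensional pattern and considers only reflection. Everything in it is an assumption made for illustration rather than part of the proposal: the function names are hypothetical, the first-order analyzers merely report local intensity, the second-order measure is a simple count of agreeing pairs, and attention is placed by exhaustive search.

```python
def first_order_outputs(pattern):
    """(a) One hypothetical analyzer per position, reporting local intensity.

    Neighbouring analyzers are related by translation; analyzers on opposite
    sides of an axis are related by reflection.
    """
    return list(pattern)


def reflection_invariance(outputs, center, radius):
    """(b) Second-order measure for one candidate frame origin.

    Compares the outputs of analyzer pairs related by reflection about
    `center` and returns the fraction of pairs (out to `radius`) that agree.
    Pairs extending beyond the pattern count as disagreements.
    """
    agree = 0
    for d in range(1, radius + 1):
        left, right = center - d, center + d
        if left >= 0 and right < len(outputs) and outputs[left] == outputs[right]:
            agree += 1
    return agree / radius


def attend(outputs, radius):
    """(c) Place the frame origin where reflectional invariance is greatest.

    Exhaustive search is an assumption made for brevity; ties go to the
    leftmost position.
    """
    return max(range(len(outputs)),
               key=lambda c: reflection_invariance(outputs, c, radius))


pattern = [0, 2, 5, 9, 5, 2, 0, 7]  # symmetric about index 3, plus one stray element
outputs = first_order_outputs(pattern)
origin = attend(outputs, radius=3)
print(origin, reflection_invariance(outputs, origin, 3))
```

In this example the frame origin settles on index 3, the axis about which most of the pattern is reflectionally invariant. A fuller sketch would add orientation, resolution, and sense as further dimensions of the search.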
relations bet.een pairs of analyzers in the
first-orcPf A".aly7er Space space reflect the transformational relations
between them as discussed above. Ihis
The first conpone r.t CO nsists cf a set transforralicnal structure defines the
for sets' cf analyz ers corputlr*^ spatial system.
propertips c' the visu al fi eld ir parallel.
^vrprisirgly, alrosl any sort of analyzers
«ill ("o, es Irng as t h ey hive particular Orientation Sense
structural ips t 0 each other. The
structural cc"Strair.t is that they be
transfornatlorally re Idted to each other.
Precisely what this me ans i s developed rore
fully and fornally el sewnere (, in
press), tut the basic nctio n is Just this:
Tvo analyzer? are tra nsfor r^aticnally_ related
if their_ "receptive fiel ds" or "spatial
furctlcns' are ide ntico 1 exceot for a o —
transformeticr. frcm a speci fir set. I'' the
present case, the set corsists, not
surprisingly, of the transformations
discussed earlier: t ransl ations, rotations,
reflections, and dil a t i 0 ns (the sc-called
"sinilarity' transfo rrnati ons of Fuclldean
geornetry). I cal 1 such sets cf
transforTationally rel ated analyzers
functional systerrs bee ause they compute, in \
this transforTdtional sense the same spatial
function 'Palpier, in p ress)
An exampl e wou Id be a se t of excitatory Fig. 4: The space of analyzers.
"bar d elector s w ith inhib itory surrounds
(ala Fu bel S, «(iese 1. 1 9e2) whose elements aigher-order Analyzers
differ only in the posit ion, orient ation, and
size of their rece ptive fie Ids. Each bar The importan ce of transformational
detecto r is relat ed t 0 eac h othe r one by a relatedress among analyzers is that their
transla tion, rotat ion, dila tion, or some output is guarantee d to be the same given any
composi te of tw 0 0r m ore of these two patterns (or portions of patterns) that
transfo rmation s. A simi larly const ructed set are identical over the transformation that
of edg e dete ctors wou Id CO nstitu te another relates them. F or instance, consider a
functio nal sys tem, disti net from the first pattern with reflec tional symmetry about a
because there is no tr ansfo rmatio n from the vertical axis sue h as the letter "a". As
specifi ed set that nakes a ba r-like receptive discussed before, this means that it is
field into a n edg e-lik e rec eptive field and invariant over a reflection about this
vice ve rsa. vertical line. No w consider any analyzer
The ov erall structure o f a functional that covers any por tion of this pattern. Its
system can te c onceptualize d as a n analyzer output, whatever that might be, must be
space in wh ich ea ch analyzer is a p olnt. The Identical to that of the other analyzer
dimensions of t he space c orresp ond to the related to it by reflection in the same
transformat icnal relations among then vertical line. M oreover, this is generally
i.e., the posit ion, orienta tion, resolution true for all pairs of analyzers related to
(or size), and re flection (or sense ) of the each other ty this particular transfornatlon
analyzers relati ve to each o ther. Since the given any pattern having that type of
number of a nalyze rs is certai nly fi nite, the sym.metry. Thus, the interesting fact about
space is only sparsely popul ated with transformational r elatedness among first
analyzer p oints. Therefore , it is more order analyzers is that transformational
appropriate to t hink of it a s some thing like regularities in the stimulus event will be
a discrete lattic e structure such a s depicted reflected in easi ly computable regularities
in Figure 4. The cycli c dim ension is in their outputs.
In general, higher order analyzers are
orientation (whic h repeats af ter 1 60-degrees
elements that compute such relationships
of rotatio r) an d the bina ry di mension is
among the outputs of lower order analyzers.
reflection (whi ch repeats aft er each
They respond to symmetries or motions,
reflection) , wh ile both posit ioral and
depending on whether they compare outputs
resolutlona 1 dimensions are s imple orderirgs.
simultaneously or over a time lag. We have
The diagra m is n aturally a s implif icatior of
just discussed a case Involving symmetry, but
the actual space, since one c annot depict a
it turns out that motion analysis has exactly
structure of mo re than thr ee dim ensiots in
the same logical structure except for the
real space. To t hink of the whole structure
intrc'uction cf a temporal difference. For
in concret e ter ms , one car conce ive of the
example, suppose that a pattern at some time
vertical di mensio n cf the str uclure shown in
produces outputs in the first order aralyzers
Figure 4 as re solution an d then imagine a
and that the pattern then moves, say, by a
whole two diren sional arra y of ther (to
translation. The output this produces after
represent t he two positional dimens ions) like
a shcrt duration will be exactly the same as
a case full of be er cans. N otice that the

the output it produced initially for each pair of analyzers related by that translation over that time lag.

These second order analyzers can be conceptualized as the links in the lattice structure depicted in Figure 4, although there probably would be far more of them. They represent local transformational relations among first order analyzers. They compute regularities in space (symmetry) or space-time (motion) by determining transformational invariances. One can even think of them as being embedded in the same space as the first order analyzers because they have the same sort of transformational structure. In the case of motion analyzers, of course, there is the additional dimension of rate of motion.

This transformational structure of the second order analyzers allows the possibility of piggy-backing still higher order analyzers on top. That is, the outputs of second order analyzers could be compared for similarity in just the same way that they compare the outputs of first order analyzers. This makes most sense for motion analyzers. Those analyzers that compare outputs simultaneously would provide information about symmetries and regularities of motion. Those that compare over a further time lag would provide information about accelerations and decelerations.

Visual Attention and the Mind's Eye

Given such a space of analyzers, how can it be used, as intended, to "factor out" transformational structure? The real problem here is to find some internal transformation that will compensate for the external transformation. One transformation "compensates" for another if applying the second after the first yields the identity transformation, i.e., invariance or no transformation at all. For example, suppose an object is moving directly toward an observer. Over time, the image of this object expands uniformly in the visual field. If the observer were to move away from the object at the same rate as the object moved toward the observer, then the two transformations would exactly cancel, and the image of the object would not change. Thus, the "moving backward" transformation by the observer exactly compensates for the object motion.

I want to suggest that something similar happens inside the head. Rather than the eye compensating for transformations by moving about in the world, however, I suggest that visual attention moves about within the analyzer space, playing the role of the mind's eye. Like the eye with respect to the world, visual attention can change its position and orientation with respect to the analyzer space. Unlike the eye, it accomplishes both of these transformations by simple movements within the analyzer space. That is, rotations of the mind's eye correspond to translations of visual attention along the orientational dimension of the analyzer space. Similarly, changes of scale ("zooming" a la Kosslyn, 1981) can be accomplished by translations along the resolution dimension.

Perceptual Reference Frames

At the heart of this proposal is the hypothesis that visual attention is localized within the analyzer space and is centered at a particular position. This establishes a perceptual reference frame for further perceptual analysis. Fixing visual attention on one position of the analyzer space induces a reference frame because it determines a position (or origin), orientation (or axis), direction along that orientation (or sense), and resolution (or unit of distance) relative to which the contents of visual attention are coded. Thus, positioning visual attention determines the description given to objects under analysis. This is entirely analogous to the role of reference frames in analytic geometry. A circle has a different equation when the origin of its reference frame is at its center than when the origin is off-center. The corresponding phenomenon in perception occurs when the same figure looks like a square or a diamond, depending on the orientation of the reference frame within which it is perceived (see Figure 2).

If visual attention acts as a perceptual reference frame for constructing descriptions of shapes, then it is clear that certain rigid transformations can be completely compensated for by corresponding attentional changes within the analyzer space. The result is analogous to a geometrical reference frame displacing as the circle did or changing its scale size as the circle grew larger so that its equation did not change despite the changes in the geometric figure. It is easy to see that the changes in reference frame that would accompany displacements of such an attentional mechanism would be able to compensate for rigid transformations of objects, keeping the contents of the attentional frame (whatever they might be) constant despite the transformation. It is not difficult to imagine that such a system would "prefer" to register uniform transformations of rigid objects rather than complex motions of deforming objects. Compensating for motions of rigid objects can be accomplished simply by an attentional transformation, and it does not require any change in the description of the object. Any other sort of transformation requires changes in the object's description as well. If the attentional frame somehow manages to follow the path within the analyzer space that maximizes object constancy and motional uniformity, then its operation will embody such "preferences."

Attentional Control

This leads directly to the next problem: how this attentional reference frame is controlled. It should be mentioned at the outset that there is a certain amount of conscious control over the attentional reference frame. People usually can, if pressed, attend to specified positions, sizes, and orientations. But I suspect that conscious control of attention is a high level cognitive activity that does not usually extend down to the level at which we are currently dealing. Rather, it seems that a great deal of the nitty-gritty details of
attentional control must be strongly determined by stimulus structure.

How might this be done? The answer I want to explore is that the structure of stimuli determines how visual attention is positioned: their symmetries and regularities. The basic idea is that attention is positioned to maximize transformational invariance. This can be done by finding the output from the higher order analyzers for a given region of the analyzer space, since these analyzers are the ones sensitive to transformational invariance.

It is important to realize here that different reference frames imply different symmetries and regularities. Therefore, any "economy of coding" scheme for representation (e.g., Attneave, 1954) requires that the reference frame be chosen to maximize such symmetries. To illustrate this fact, consider again the case of a circle. When the origin of the reference frame is at its center, the circle has all possible reflectional and rotational symmetries centered about that point. When the origin is off-center, its only symmetry about that point is a single reflection about the line joining it to the circle's center. Such facts will be represented in the outputs of the second order analyzers at the center and off-center positions within the analyzer space. The output will be much greater at the position within the analyzer space that corresponds to the center. Therefore, there will be a strong tendency to establish the attentional reference frame at the center of the circle.

For a circle, there will be no particular orientation preference, precisely because it has complete central symmetry. For a square, however, or for any other figure that has significant asymmetries, there will be decided preferences in orientation. A square is symmetric about only the lines joining opposite midpoints of its sides or opposite vertices of its angles. Therefore, only these four orientations are serious candidates. If the midpoint line is used, the figure would have a different perceived shape than if the vertex line were used. In fact, these two frame orientations result in the "square" and "diamond" interpretations, respectively. Obviously, not all figures have exact symmetries like circles and squares do. But the same principles would apply to approximate symmetries.

The notion I have in mind for the placement of attention is a "hill climbing" process. Its goal is to maximize transformational invariance in the stimulus information by seeking the position of highest output in the analyzer space, at least locally. Sometimes there will be several maxima, and in these cases quite different percepts will arise when different maximal positions are chosen for the perceptual reference frame. The square/diamond and ambiguous triangles are two well known examples (Attneave, 1968; Palmer, 1980; Palmer & Bucher, 1981).

A single attentional fixation will seldom be sufficient to code a whole scene or complex object. More complete coding would be accomplished by making many attentional fixations at other positions within the analyzer space that have high outputs. There is evidence of a bias toward beginning at the most global level (Navon, 1977). A reasonable guess would be that after one or two global fixations of an object at a low resolution level of the analyzer space, many local fixations would be made at higher resolution to code details. I am assuming that the contents of these attentional fixations are somehow stored in memory to form a representation of the world. The result will be a hierarchical structural description, much like those I have discussed previously (Palmer, 1975, 1977).

Given that attention can be positioned to maximize transformational invariance, it is not difficult for it to continue to do so if the object begins to move. That is merely a matter of tracking the same maximum through the analyzer space with the help of the motion analyzers discussed earlier. Recall that these analyzers are sensitive to transformational invariance over time, and, therefore, that maintaining maximum transformational invariance will entail following the maximum output of these analyzers. Doing so has the effect of maintaining object constancy over the transformation. For example, when a square begins to rotate, it is perceived as such, not as a square that changes in perceived shape until it becomes a diamond and then changes back into a square again. The latter is what would be expected if the rotation were not followed by the reference frame initially used to code its shape, and the frame instead stayed fixed in the same unchanging orientation. I suspect that the latter is what happens when people are shown a tight spiral pattern rotating, yet see it as circles contracting into the center.

Organizational Phenomena Revisited

We now begin to see how motion can be analyzed and constancy can be maintained within such a transformationally based system. The transformations are initially coded by the higher order analyzers and then used to achieve and maintain maximal constancy. The motion finally perceived is not a simple function of the motion analyzers, since it too depends on a reference frame that maximizes invariance. This is accomplished by an attentional mechanism that finds and follows maximal output levels within the analyzers sensitive to motion and their higher-order analyzers.

Transformations of this reference frame can compensate for stimulus transformations, thereby maintaining constancy. It seems necessary that much of this must be done outside conscious attention, however, because there hardly would be enough of it to go around. More likely, once an object's representation has been established in memory, its representational schema can follow the appropriate reference frame without conscious attention. This monitoring process
would only require conscious attention if something unexpected were to occur, such as the object disappearing or changing its intrinsic properties.

Many perceptual grouping phenomena are a natural result of this attentional process working within a transformationally structured space. It seeks the maximal amount of transformational invariance at different levels of resolution and codes elements together that are closely related within the analyzer space. Given that attention can only cover a portion of the analyzer space and that it is attracted to local maxima, it will tend to code together items that are transformationally similar.

Finally, reference frame effects result from an attentional mechanism that is centered on a position within the analyzer space and a coding scheme, more fully specified elsewhere (Palmer, in press), that describes shape relative to the reference values of the frame. The fact that frame effects generally show that global structure affects local structure more strongly than vice versa suggests that global information tends to dominate in determining the position of the attentional frame. Higher order structure of the whole configuration seems to strongly affect the placement of the reference frame and, therefore, to influence the resulting perception.

In all of these phenomena it is clear that the system's preference for invariance over transformations holds within the three dimensional space of the world. It may hold in the two dimensional space of images as well, but when the two conflict, the simpler three dimensional solution generally dominates. We have been discussing the analysis of two dimensional images, and it is not entirely clear how to extend the proposal into the third dimension. One possibility would be to add a second, three dimensional level that embodied the same design features, but in a higher dimensionality. Another would be to translate the relevant simple transformations in three dimensions into their complex counterparts in two dimensions. I do not yet have a well defined proposal to make on this difficult issue.

The Importance of Systemic Structure

Before closing, I want to say a few words about an interesting property of the theory quite apart from its ability (or lack thereof) to account for perceptual phenomena. I am intrigued by its systemic nature. The reader may have noticed that in explaining the theory I made almost no reference to the specific nature of the analyzers that comprise the pieces of the system. At one point I said that they might be something like bar- or edge-detectors, but that was merely for illustration. In fact, it makes very little difference what the analyzers look like as long as they satisfy certain symmetry conditions: namely, they cannot be symmetrical about any transformation proposed to exist within the analyzer space. The reason should be clear. If the analyzers are invariant over a transformation, then that transformation cannot exist as a dimension within the analyzer space. For example, if all the analyzers were rotationally symmetric, as are circular center-surround receptive fields, then the system could not support the orientation dimension or the higher order rotational motion analyzers.

In any case, the fact that the constraints on the system are so weak suggests that it is primarily the structure of the whole system that is doing the work. Indeed, this must be true if the basic building blocks of the system are transformational relations. Transformational relations are an emergent property of systems of analyzers; they are simply undefined for any individual analyzer without a systemic context of other analyzers. This suggests that the internal structure of individual analyzers might best be considered in terms of their functional role within the system. Further, the nature of the elements of the system might actually be determined by optimization of their functional roles within the system as a whole (Palmer, in press). I think these are interesting and important notions for perceptual theory. They hark back to the gestalt claim that emergent properties of whole systems play the critical role in understanding perceptual phenomena. Perhaps they were right.

REFERENCES

Attneave, F. Some informational aspects of visual perception. Psychological Review. 1954, 61, 183-193.

Attneave, F. Triangles as ambiguous figures. American Journal of Psychology. 1968, 81, 447-453.

Cutting, J. E. Coding theory adapted to gait perception. Journal of Experimental Psychology: Human Perception & Performance. 1981, 7, 71-87.

Duncker, K. Ueber induzierte Bewegung. Psychologische Forschung. 1929, 12.

Farrell, J. E., & Shepard, R. N. Shape, orientation, and apparent rotational motion. Journal of Experimental Psychology: Human Perception & Performance. 1981, 7, 477-486.

Garner, W. R. The processing of information and structure. Potomac, Md.: Erlbaum, 1974.

Hubel, D. H., & Wiesel, T. N. Receptive fields, binocular interaction, and the functional architecture of the cat's visual cortex. Journal of Physiology (London). 1962, 160, 106-154.

Johansson, G. Configuration in event perception. Stockholm: Almqvist & Wiksell,

Johansson, G. Visual perception of biological motion and a model for its analysis. Perception and Psychophysics. 1973, 14, 201-211.
Kosslyn, S. M. Image and mind. Cambridge, Mass.: Harvard University Press.

Leeuwenberg, E. L. J. A perceptual coding language for visual and auditory patterns. American Journal of Psychology, 1971, 84, 307-349.

Navon, D. Forest before trees: The precedence of global features in visual perception. Cognitive Psychology, 1977, 9.

Palmer, S. E. Visual perception and world knowledge: Notes on a model of sensory-cognitive interaction. In D. A. Norman & D. E. Rumelhart (Eds.), Explorations in cognition. Hillsdale, N.J.: Erlbaum, 1975.

Palmer, S. E. Hierarchical structure in perceptual representation. Cognitive Psychology, 1977, 9, 441-474.

Palmer, S. E. What makes triangles point: Local and global effects in configurations of ambiguous triangles. Cognitive Psychology, 1980, 12, 285-305.

Palmer, S. E. Symmetry, transformation, and the structure of perceptual systems. In J. Beck & F. Metelli (Eds.), Representation and organization in perception. Hillsdale, N.J.: Erlbaum, in press.

Palmer, S. E. Reference frames in shape perception. In preparation.

Palmer, S. E., & Bucher, N. M. Configural effects in perceived pointing of ambiguous triangles. Journal of Experimental Psychology: Human Perception & Performance, 1981, 7, 88-114.

Rock, I. Orientation and form. New York: Academic Press, 1973.

Shepard, R. N., & Judd, S. A. Perceptual illusion of rotation of three dimensional objects. Science, 1976, 191.

Wallach, H., & O'Connell, D. N. The kinetic depth effect. Journal of Experimental Psychology, 1953, 45, 205-217.

Wertheimer, M. Untersuchungen zur Lehre von der Gestalt. Psychologische Forschung, 1923, 4, 301-350. Translated in W. D. Ellis (Ed.), A sourcebook of gestalt psychology. New York: Harcourt, Brace & Co., 1938.

Weyl, H. Symmetry. Princeton, N.J.: Princeton University Press, 1952.

Rational Processes in Perception

Alan Gilchrist and Irvin Rock

In this paper we will give our reasons for believing that certain current attempts to explain perceptual phenomena on a lower level in terms of known sensory mechanisms are untenable. We will do this by focussing on two topics, lightness perception and the perception of apparent motion. We will summarize some older data (not all of which are sufficiently known) and will describe some recent work of our own. Finally, on a more positive note, we will try to indicate the direction that a theory must take if it is to deal effectively with these phenomena.

Lightness Perception

We will begin with the assumption that Helmholtz was essentially wrong in his belief that an object's lightness can be inferred by interpreting the luminance reflected by it to the eye in terms of the amount of illumination falling on it. Such a process requires unequivocal information about the illumination whereas the only information directly available is the intensity of light, or luminance, reflected by each surface in the field. Each such luminance is the joint product of the reflectance property of the surface and the illumination falling on that surface. Rather we will assume that the perceived shade of gray of a surface is governed primarily by the luminance of that surface relative to the luminance of neighboring surfaces as Hering (1920) suggested and as Wallach (1948) elegantly demonstrated. There is now fairly wide agreement among investigators on this general principle.

But what is the underlying explanation of it? There is great appeal in Hering's suggestion of reciprocal interaction, i.e. that a bright region of the field would have a darkening effect on an adjacent region and a dark region would have a brightening effect on an adjacent region. We now know for a fact that the rate of discharge in one nerve fiber is attenuated when a neighboring fiber is stimulated by light. Thus such lateral inhibition can plausibly be invoked to explain why the apparent lightness of one region is governed by the extent of stimulation of an adjacent region (see Cornsweet, 1970; Jameson and Hurvich, 1964).

In fact, given lateral inhibition as a known sensory effect, one might have been able to predict the phenomenon of contrast, even if it had never been observed (although questions can be raised about the spatial distance over which such a mechanism can be expected to occur). Surrounding a gray region by a white one should lead to diminished discharging of retinal fibers stimulated by the gray region; surrounding another gray region of the same value by a black one should lead to increased discharging of retinal fibers stimulated by that gray region because of a release of inhibition. Thus one of these gray regions should look lighter than the other and so it does. An implicit assumption here is that the phenomenal shade of gray perceived in a given region is a direct function of the rate of discharging of fibers stimulated by that region.

The fact of constancy of lightness can be explained along the same lines. When the illumination falling on a surface changes, then the luminance of all adjacent regions rises and falls together. Thus, while the rate of discharging of cells stimulated by a gray region should increase when illumination increases, so too should the rate of discharging of cells from a surrounding white region increase. The latter will increase the inhibition on the former with the net result of little if any change in the absolute rate of discharging of those cells. Therefore the perceived lightness should remain more or less constant and so it does. Note again, however, the assumption, here explicit, that the phenomenal lightness is a direct function of the rate of discharge of the appropriate fibers.

Underlying this assumption is another assumption about how the visual system works that Gilchrist (1981) has called the photometer metaphor. Just as the signal produced by a photometer is a direct function of the light falling upon it, so the perceived lightness of each point in the field is assumed to be a direct function of the rate of discharging of the cells stimulated by each such point. With the knowledge that has been available about light, about the formation of the retinal image, and about photochemical processes and nerve physiology it is understandable why such a view has become so deeply ingrained as not even to be explicitly recognized as an assumption. Given this assumption, phenomena such as contrast and constancy, in which lightness does not correlate with luminance, seem to require an explanation along the lines of lateral inhibition.

There is now, however, reason to reject this approach. Evidence has been accumulating to support the theory that the perception of lightness (and chromatic color) is based on information at the edges between regions of differing luminance (or hue). Homogeneous regions between edges are then "assumed" to have the lightness or color indicated by these edges. There is overwhelming evidence (Yarbus, 1967; Whittle and Challands, 1969; J. Walraven, 1976) that the visual system responds to changes in stimulation, not to an unchanging state of stimulation. This is normally guaranteed by continuous eye movements, which are necessary for vision. Whenever an image can be held stationary on the retina for a few seconds, all visual experience stops. These facts are inconsistent with the photometer metaphor and they strongly indicate the crucial nature of edges or gradients in the retinal image since this is where stimulation changes in the normal moving eye. Krauskopf (1963) has shown that when the boundary of a surface is prevented from moving on the retina, its color will disappear and be replaced by the color of the surrounding region, signalled by the boundary of that region.

It seems unlikely that any absolute luminance information would be picked up in this way and yet it now seems quite possible that the visual system achieves what it does using only relative information. Even a simple edge-relations approach goes a long way toward explaining lightness constancy since the luminance ratio between two adjacent surface colors remains the same even when illumination changes.

The important point here is that there is no need to invoke a concept such as lateral inhibition to explain constancy. Once the photometer assumption is made explicit and, in fact, is displaced by the concept of edge information, the whole edifice collapses. Of course lateral inhibition is a well-established physiological fact. It is probably part of the process whereby the ratio at an edge is determined. But we don't believe that lateral inhibition solves any of the basic problems of constancy. The concept of an exaggeration or enhancement of edge ratios seems unnecessary and illogical. If lateral inhibition exaggerated an edge ratio, it would do so in the same way every time an edge of the same value were present on the retina. A given edge ratio would be specified by a given neural signal, with or without an exaggeration function. Therefore the exaggeration function doesn't seem to add anything of explanatory value.

Certain problems turn out to be dissolvable pseudo-problems with the adoption of an edge-relations approach. One of these is constancy under changing illumination. On the other hand, other problems emerge, although they are more tractable. For example, how do we now explain the constancy of surface lightness as the surface is viewed against differing backgrounds? The luminance ratio at the edge of a surface can change
dramatically as it is placed on different backgrounds and yet lightness perception remains almost unchanged. For example, in the classic example of lightness contrast, the gray square on the white background has an edge ratio that is radically different (even opposite in sign) from that of the gray square on the black background. Thus, under a simple edge theory they ought to appear radically different in lightness, and yet they appear almost the same. This suggests that lightness is not determined simply by the boundary of a surface, but by the relationship between that boundary and other boundaries. Presumably the boundaries of the squares themselves only signal departures from a background lightness, which in turn is signalled by the boundary of each background. Thus the edge dividing the white and black backgrounds signals the relationship between the two background lightnesses and we might expect that this edge will be as critical to the lightness of the targets as the edges of the targets themselves. In fact Gilchrist and Piantineda (unpublished experiment) have found that if that edge is retinally stabilized, the two gray squares turn black and white respectively, just what we would expect based simply on the ratios at the edges of the gray squares. We might say that the assignment of lightnesses to the various regions is the end result of a computational process in which information from all edges present is integrated. Arend (1973) and Land and McCann (1971) have proposed similar schemes.

If such computational processes occur and are governed by remote as well as local edge information, the reader may well wonder about the achievement of constancy. Consider the typical case where two gray disks of equal reflectance on the same background are unequally illuminated because one region and its immediate background are in shadow. Earlier we said that constancy could be explained on the basis of the equal ratio of each gray region to its background. That would be true in the example under consideration. But now we have also said that the presence of other edges enters into the equation. The shadow edge can easily have a ratio as great as a white-black edge and, more probably, even greater. If this is entered into the computation, constancy would fail; the disks would be seen as different shades of gray in accordance with the luminance difference between them. The equal disk-to-surround luminance ratios here logically cannot signify that the two grays are equal if the gray regions are seen as on backgrounds of different luminance values, or so it would seem.

Unless, that is, the perceptual system can discriminate between reflectance edges and illumination edges, that is, between changes in the pigment of the surface and changes in the amount of illumination shining on the surface. If so, perhaps illumination edges would not be included in the computation of surface lightness values. There is now strong evidence of just such discrimination of reflectance and illumination edges (Gilchrist, in press). If observers view the two disks under the conditions just described, they typically do perceive the two grays as almost equal, i.e. constancy is achieved. Moreover, they perceive both sides of the background as white with one side in shadow. Thus the central edge is apparently correctly identified as an illumination edge. If, however, the observers view the display through an aperture that permits only part of the background and the two gray regions to be seen, and if the edge of the shadow is reasonably sharp, the grays no longer look equal; constancy is destroyed. Moreover, the observers now perceive the two sides of the background as unequal in lightness. Thus the edge is interpreted as separating different reflectances, not different illuminations. Only in this condition does the central edge enter into the process of computing the lightness of the gray disks.

We reported earlier that if the boundary between the black and white backgrounds in the traditional contrast pattern is made to disappear through retinal stabilization, one gray square turns black and the other turns white. This provides some of the best evidence for the concept of edge integration. A similar experiment was done by Gilchrist et al. (in press) that demonstrates the importance of the distinction between reflectance edges and illumination edges. When the boundary between the white and black backgrounds is made to look like the edge of a shadow, the two squares will also turn black and white respectively, just as in the stabilized-image experiment. Thus when an edge is identified as an illumination edge, it seems to drop out of the integration process for surface lightness just as if it were invisible.

It would take us too far afield to enter into a full discussion of precisely how the perceptual system discriminates illumination from reflectance edges. While the presence of penumbra at an illumination edge may be one source of information it is not the only one and is not necessary in the experiments just described.

Before discussing the important role that depth perception plays in discriminating edges, it is worth considering the ramifications of what we have just discussed for the notion of lateral inhibition or any other theory of neural interaction which seeks to explain the important effects of remote edges on what is perceived in regions adjacent to other edges. For now, in addition to other difficulties such a theory faces, in dealing with such "remote" effects it would have to be argued that such effects do not occur at all when those remote edges are interpreted as representing illumination rather than reflectance differences. In fact, it is interesting to note that apparently no one has noticed that when contrast effects are applied to illumination edges, they not only fail to result in constancy, but they actually make matters worse. Constancy requires that we explain how the perception of surface lightness could be the same on both sides of an illumination edge, given the difference in luminance. Applying a mechanism here that further exaggerates the luminance difference is a little like bringing water to a drowning man.

Recent experiments (Gilchrist, 1977, 1981) have demonstrated the role of depth perception in distinguishing an illumination edge from a reflectance edge. In one experiment an artificial interposition cue was used to make a target square appear as located either in a near plane, dimly illuminated, or in a far plane, brightly illuminated, as shown in Figure 1. In terms of the retinal image, the target square was always flanked by a surface of much lower luminance to its lower right and by a surface of much higher luminance to its upper left (note relative luminances given in Figure 1). In space, however, the low luminance surface was located in the near plane while the high luminance surface was located in the far plane. The results show that lightness is determined by the luminance ratio of the target square to its coplanar neighbor. Thus the target square looked white when it appeared in the near plane but black when it appeared in the far plane. In other terms, the edge dividing the target from its coplanar neighbor was treated as a reflectance difference while the edge dividing the target from its non-coplanar neighbor was presumably treated as representing an illumination difference. Since the retinal image was essentially the same in both conditions, this result is inconsistent with an explanation based on lateral inhibition.

In another experiment involving planes meeting to form a dihedral angle, these ideas were put to a more rigorous test in which predictions based on perceived coplanarity would be the opposite of predictions based solely on retinal ratios.

The experimental arrangements are shown in Figure 2. In the horizontal plane, a black target tab extended out into space from a larger white square. In the vertical plane, a white target tab extended upward into space from a larger black square. Thus each target

tab was seen against the background square that was in a separate plane. The horizontal surfaces received about 30 times as much illumination as the vertical surfaces, or just enough illumination difference to make the luminance of the black target tab equal to that of the white target tab. Given the viewing perspective of the observer, 45 degrees from each plane, the display was similar to traditional contrast displays: two targets of equal luminance on bright and dark backgrounds respectively. Thus a theory based on lateral inhibition would clearly predict that the target on the bright background, in this case the upper target, should appear darker than the other target, although the exact magnitude of the effect is harder to determine. On the other hand, if lightness is really based on luminance relationships within planes, then each tab should be compared with the larger background square that lies in the same plane, even though it is adjacent only along one edge of the tab. Thus not only would the coplanar ratio principle predict that the upper tab would appear lighter, not darker, than the lower tab, it would predict that the upper tab should look white and the lower tab black.

In fact the latter result was actually obtained. Figure 2 shows the median Munsell matches (next to samples of those Munsell values) obtained from naive observers. Moreover, since the target tabs were actually trapezoidal in shape, they could be made to switch perceived planes when viewed monocularly. In that condition of the experiment the perceived lightnesses of the tabs also switched, with the lower tab now appearing white and the upper tab appearing black. Since these changes in perceived lightness were produced solely by a change in depth perception, with no change in the retinal image, these data raise difficulties that may be insurmountable for current theories based on lateral inhibition.

[Figure 1. Median observer matches: 9.0 and 3.5.]

[Figure 2. Median Munsell matches: 3.75, 8.0, 7.75, 3.0.]

Apparent Motion

Although we know a good deal about the conditions that produce the illusory impression of motion referred to as apparent motion, we still do not understand why it occurs or, for that matter, why it only occurs under certain conditions. What we know about this effect is that given the sudden appearance of object a, its sudden disappearance, followed, typically after just the right time interval, by the appearance of object b in just the right new spatial location, one tends to see motion of a to b. The currently favored explanation is that a motion-detector cell in the brain will discharge even if the appropriate receptor field of the retina is stimulated discontinuously by two points rather than by a point moving over the retina. Such cells do seem to exist in various species of animals (Grüsser-Cornehls, 1968; Barlow and Levick, 1965).

However, the fact is that it is not necessarily the case that the conditions for apparent motion perception entail stimulation of separate retinal regions. Ordinarily that is the case, since a and b are in separate spatial locations and the eye is more or less stationary. What seems to matter is the perception of a and b in separate locations in space.

To get at this question an experiment was performed in which the observers had to quickly move their eyes back and forth synchronous with the onset of a and b so that each stimulated the same central region of the retina, rather than, as is more typical, two discretely different loci (Rock and Ebenholtz, 1962). Therefore, the conditions for apparent motion might be thought not to exist. Yet, the observer does locate a and b in phenomenally discrete places in the environment. The result was that although nothing was said to the observers about motion that might create an expectation of perceiving motion, most of them nonetheless spontaneously did. This experiment seems to prove that, in humans at least, it is not necessary to explain stroboscopic motion in terms of a sensory mechanism that detects sudden change of retinal location. There is neither change of retinal nor cortical locus of projection of a and b here.

An entirely different view that has been presented by Rock (1975) is that the impression of motion is a solution to the problem posed by the rather unusual stimulus sequence. First a inexplicably disappears. Then b inexplicably appears elsewhere. By "inexplicable" we
mean that when an object in the world disappears as we are looking at it, it is generally because another object moves in front of it or it is occluded by another object because of our motion. However, when a stationary object suddenly and rapidly moves to another location, it does tend to disappear from one location and to appear in another. Therefore, perhaps this state of affairs in a stroboscopic display suggests the solution of motion.

Given that potential solution, the question arises as to whether it is acceptable. Motion from a to b does account for the brief stimulation by a and b, but isn't the absence of any visible object between the locus a and b a violation of the requirement that a solution be supported by what is present in the stimulus? If the solution is "a moving across space to b" doesn't this call for stimulus support in the form of continuously visible motion across that spatial interval? Ordinarily that would be true, but it is a fact that has been demonstrated that for very rapid motion of an actually displacing object, little more than a blur can be seen in the region between the terminal locations (Kaufman et al, 1971). In fact it was shown that if the terminal locations are occluded, no motion of a moving object is seen. Therefore when the spatial and temporal intervals between a and b in a stroboscopic display are such as would correspond with the real motion of a rapidly displacing object, the absence of continuously visible movement need not act as a constraint against perceiving movement. In fact, this analysis may explain why slow rates of alternation do not lead to the impressions of motion. By "slow rate" we mean a condition with a relatively long interval between the disappearance (offset) of a and the appearance (onset) of b. Such a rate would imply a slowly moving object and a slowly moving object would normally be seen throughout the spatial interval between a and b. Therefore the absence of object motion over that interval at slow rates of alternation is a violation of the requirement of stimulus support. Hence the movement solution is not acceptable at slow rates even if the offset and onset tend to suggest this solution.

While on this topic of rate we might briefly comment on the case where the alternation is very rapid, i.e. a zero or only a minimum interval between the offset of a and onset of b. If the "on" time of a and b is itself very brief, this state of affairs will result in a and b being visible simultaneously by virtue of neural persistence. But if a is visible when b appears, the solution that a has moved to b is not supported or, one might say, is contradicted. This deduction was tested by using rates of alternation that ordinarily do produce the stroboscopic effect but with the following variation: First a appears, followed by the usual blank interval; when b appears so does a (in its original location). Therefore the sequence of events is: a; a and b; a; a and b, etc. If the presence of a during the exposure of b violates the requirements of the motion solution, then observers should not achieve a stroboscopic effect under these conditions. Our observers did not. If, however, the display is changed so that the a object that appears concurrently with b but in the same location as a is somewhat different than the a that appears alone, observers do perceive a moving to b. The sequence is a; a' and b; a; a' and b; etc.

It was noted above that in the typical experiment on stroboscopic motion, a and b inexplicably disappear and appear. What was meant was that no rationale is provided to the observer of why they appear and disappear such as is the case when things in the environment suddenly appear or disappear because another object in front suddenly moves out of the way or in the way. This suggested the following kind of experiment. Suppose we cause the retina to be stimulated by a and b in just the right places at just the right tempo, etc., but by a method in which we move an opaque object back and forth, alternately covering and uncovering a and b. (See Figure 3B) As far as the sensory theory of apparent motion is concerned there is no obvious reason why these conditions should not produce an impression of a and b moving. But from the standpoint of problem solving theory, we have now provided an explicable basis for the alternate appearance and disappearance of a and b, namely, that they are there all the time but undergoing covering and uncovering. Therefore the perceptual system may prefer this solution or at least we are offering it a viable alternative not usually available (see Stoper, 1964; Sigman and Rock, 1974).

[Figure 3]

The subjects rarely perceived motion of the dots here. Some may object that the presence of the actually moving rectangle interfered in some way with perceiving stroboscopic motion. It is, after all, an unusual, atypical way of studying such motion. The rectangle may draw the subjects' attention or otherwise inhibit motion perception of the dots. For this reason a slight change was introduced, one that had another purpose to it as well. Suppose the rectangle moves, but a bit too far, far enough no longer to be in front of where the dot had been. But by a method, the details of which need not be discussed here, things were so arranged that when the rectangle is in its terminal location, the dot is nonetheless not visible.

Now it is no longer a fitting or intelligent solution to perceive a and b as two permanently present dots that are simply undergoing covering and uncovering. For it can be seen that in fact the rectangle is not covering the spot in its terminal location and yet the spot is not visible (violation of the stimulus-support requirement). Therefore the best solution is again one of
movement and that is what the subjects perceived. Note change location from frame to frame (and often many
that this experiment serves as a control for the ob- such objects are simultaneously changing locations in
jection raised to the first one; the moving rectangle either the same or varying directions). Yet this
here does not interfere with perceiving motion of the outcome is not predictable at all in terms of the other
dots. kinds of sensory theories mentioned earlier.
Another variation performed is based on the idea A related example is the perception of motion of
that for the covering-uncovering solution to be viable, complex stimuli such as the line drawings of three-
the covering object must appear to be opaque. If it dimensional cuboid figures that Shepard and his
does not, it can hardly be covering anything. This associates have used in the mental rotation studies.
factor was manipulated in an experiment illustrated Shepard and Judd (1976) presented two perspectives of
in Figure 3B. The actual stimulus conditions are such figures in a stroboscopic motion paradigm and
very similar but in one case, because the oblique lines showed that, at the appropriate rate of alternation,
within the rectangle are stationary and aligned with observers perceive these objects rotating through the
all the others, the rectangle looks like a hollow wire angle necessary to account for the change in perspective
perimeter. In a control condition the lines inside it from a to b. This effect clearly implies that the
moved with the rectangle, and it looked like an opaque perceptual system deals with the problem of accounting
object. The difference in results is very clear: when for the differences in a and b by an intelligent motion
the rectangle appeared to not be opaque, subjects by solution. A further finding of interest is that the
and large perceived movement whereas in the case of optimum rate of alternation for achieving a continuous
the opaque-appearing rectangle, they did not. A hollow coherent rotation of a rigid whole object was an inverse
rectangle is in contradiction of the property of opacity function of the angular difference as implied by the
required by the covering-uncovering solution. two perspective views. In other words, the greater
In a final experiment, conditions were such that no the angle through which rotational motion was seen, the
physical contours at all moved back and forth in front slower the rate of alternation had to be.
of the dots. There was, however, a phenomenally opaque This finding can be considered to be in keeping with
object that moved, one based on illusory contours, as one of Korte's Laws which states that optimum apparent
illustrated in Figure 3C. The great majority of motion is preserved when the spatial separation between
subjects did not perceive movement. In a control presentations of a and b is increased by increasing the
experiment, illustrated in Figure 3D, the orientation time interval between presentation of a and b. This law
of the corner fragments was changed so that no subjective makes sense if one assumes that the perceived speed of
rectangle was perceived and this array was moved rotation is constant. If therefore the mental representation
back and forth. Now the majority of subjects did of the object has to rotate through a greater
perceive movement. angle, more time is required.
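
The constant-speed reading of Korte's law given above (a larger perceived separation or angle needs a proportionally longer interval) can be put as simple arithmetic. The Python sketch below is illustrative only: the function names, the strictly linear form, and the speed value are assumptions introduced here, not quantities reported in the studies cited.

```python
def optimal_interval_ms(angle_deg, perceived_speed_deg_per_ms):
    """Interval between presentations needed to cover `angle_deg`
    if the perceived rotation proceeds at a constant speed.

    Assumption (hypothetical model): time = angle / speed, with no offset.
    """
    if angle_deg < 0:
        raise ValueError("angle must be non-negative")
    if perceived_speed_deg_per_ms <= 0:
        raise ValueError("speed must be positive")
    return angle_deg / perceived_speed_deg_per_ms


def alternation_rate_hz(interval_ms):
    """Alternation rate, taken here to be the reciprocal of the interval."""
    if interval_ms <= 0:
        raise ValueError("interval must be positive")
    return 1000.0 / interval_ms


if __name__ == "__main__":
    speed = 0.5  # degrees per millisecond; a made-up value for illustration
    for angle in (30, 60, 120):
        interval = optimal_interval_ms(angle, speed)
        rate = alternation_rate_hz(interval)
        print(f"angle={angle} deg  interval={interval} ms  rate={rate:.2f} Hz")
```

Under these assumptions, doubling the angle doubles the required interval and halves the alternation rate, which is the inverse relation between optimum rate and angular difference described in the text.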
It should be noted that in all these cases where a Further support for this interpretation is provided
covering-uncovering effect is perceived there is no by an experiment which asked the following question: Is
reason why movement of the dots could not have been it the retinal spatial separation or the perceived
perceived as well. That is to say, if the observer spatial separation that governs Korte's Law? Perceived
were to see an opaque rectangle moving back and forth separation was varied by creating conditions in which
and, simultaneous with this, a dot stroboscopically a and b appeared at differing distance but were always
moving in the opposite direction, such a solution located so as to project to the eye in the same retinal
would also account for the stimulus sequence. Converse- loci (Corbin, 19l*2; Attneave and Block, 19T3). The
ly everything implied by that solution is represented experiments demonstrated that it was the perceived
in tie stirLuius, and no contradictory perception is spatial separation, not the retinal separation that
occurring. Therefore the tendency to perceive dots enters into Korte's Law.
undergoing occlusion and disocclusion rather than dots A problem-solving theory can account for these facts.
moving, represents a preference for one solution over It offers an explanation of why motion is seen. Unlike
the other. The preferred solution is obviously related other theories, it takes as a point of departure and is
to a very basic characteristic of perception, namely, quite compatible with the fact that the conditions
object permanence, the tendency to assume the continued leading to motion perception entail change of perceived
presence or existence of an object even when it is location rather than change of retinal location. It
momentarily not visible for one reason or another. But offers a rationale for the known facts about alternation,
given the very strong predilection we have to perceive i.e. why movement is perceived only within a certain
apparent motion even under the most unlikely conditions, range of middle values of inter-stimulus interval. It
it remains a problem as to why it is not perceived in can deal easily with the kinds of perceived transforma-
this situation and the object-permanence solution is tions or movements that occur when a and b are more
preferred. A possible answer is that the covering- than single dots or lines, such as groupings or forms
uncovering solution accounts for all stimulus change by with sub-parts, or complex three-dimensional figures
one "cause": a rectangle covering and uncovering in differing orientations. Finally it permits us to
spots that are continuously present. The other solution predict instances where no motion will be perceived
entails two independent events that are coincidentally despite the maintenance of the spatial and temporal
and unaccountably correlated; a rectangle moving in one parameters that ordinarily produce the stroboscopic
direction and in anti-phase to spots moving in the effect.
opposite direction. On the other hand this theory does not as yet explain
There is another line of evidence that also strongly all the known facts. It does not explain the reported
supports a problem-solving interpretation of strobo- findings that motion is seen more readily if both a and
scopic motion. If the stimulus consists of more than b are placed so that their projections fall within one
a single dot or line, the problem arises of what in a hemisphere of the brain (Gengerelli, 1948); nor does it
is seen moving to what in b. To make the point clear, explain why the effect is more readily obtained if a
suppose that a and b each consist of a two-by-three and b stimulate one eye compared to the case where a
matrix of dots. What will be seen here is the rectan- stimulates one eye and b the other (Ammons and Weitz,
gular grouping moving as a whole (Ternus, 1926). 1951). However, these findings have never been repli-
Apparently the perceptual system seeks a movement cated and warrant careful re-examination. And finally
solution that will do justice to the object as a whole. a problem-solving theory might be considered to be
Indeed, were this not the case, the motion perceived inappropriate as an explanation of the stroboscopic
in moving pictures would be quite chaotic, because it effect that seems to occur in decorticated guinea pigs
is typically objects consisting of many parts that (Smith, 1940) or newly born lower organisms such as
fish or insects (Rock, Tauber, and Heller, 1965).
However there now seems to be fairly good evidence
that there are two kinds of apparent motion (Braddick,
1974; Anstis, 1980). One kind, referred to as the
short-range process, occurs over very small angular
separations of a and b. There is reason for believing
that this kind may be based on motion-detector neurons
responsive to a small shift in stimulation on the
retina. The other kind, referred to as the long-range
process, occurs over larger angular separation of a
and b. This process is probably not based on the
activation of motion-detector neurons. Most if not all
of the evidence discussed above pertains to this
long-range process. The short-range process thus seems
to have a direct sensory basis whereas the long-range
process seems to have a cognitive basis. In the light
of this distinction, it is possible that the findings
referred to in the previous paragraph are explicable
in terms of the short-range process.

At this point we should step back from these empirical
studies and see what general lessons can be drawn as
to the nature of theories of perception. If the visual
system is to achieve a faithful representation of the
physical world then the organization of its own
processes must somehow mirror the organization of the
world. Any theory of perception that does not take
this point into account will ultimately fail.

In certain theories of perception, constancy and
veridicality are fortuitous outcomes that occur only
under some circumstances. This is not good enough.
Both the logic of what the perceptual system must
accomplish and the empirical evidence of what it does
achieve demand a theory in which constancy is
inevitable, not accidental.

Herein lies the danger of theories based on simple and
limited physiological findings. Unless the
physiological finding can be seen as part of a larger
process that "homes-in" on reality in an inevitable
way, that physiological finding is likely to be
misunderstood. This is the problem with viewing
lateral inhibition as an exaggeration or distortion
process. If we cling to the photometer metaphor, to
the assumption that fundamentally the visual system
measures the intensity (and perhaps the wavelength) of
the light at each point in the image, then it is not
surprising that some kind of distortion process will
be required to transform the array of photometer
readings into something vaguely representing visual
experience.

There is no need to talk as though the intensity of
light at point A in the image is "affected" by the
intensity of light at point B. The fact is that the
light at point A, seen by itself, would be
perceptually meaningless. Having a second intensity of
light present in the visual field doesn't merely
change the first amount of light, it literally
establishes a relationship and the apprehension of
this relationship produces lightness perception in its
simplest form.

Contrast theories are usually thought to be relational
theories, but they are not. As Koffka (1935) has
correctly pointed out, the ultimate correlate of
lightness perception in a contrast theory is still an
absolute amount of light, not a relationship. The
contrast process only allows the absolute value of one
region of the field to be changed as a function of
other values. But it is still that absolute value that
reigns. And the reason that it has to be changed is
that as an absolute value, it will always be out of
touch with visual experience, which involves
relationships. In fact the history of theories of
lightness perception is the history of different
correction factors designed to bring the local
luminance into correlation with perceived lightness.
This has never worked and it is time we recognize that
no theory based on absolute amounts of light can work.
What is constant about a white surface, for instance,
is its relationship to the rest of its environment.

It is not surprising that the visual system gets its
critical information from the edge. This is the point
in the image where the relationship between two
amounts of light is represented. In more complex
scenes each local edge relationship, or ratio, has to
be seen in relation to other edge ratios. The concept
of edge integration that we have discussed does not
involve any distortion or exaggeration process. Rather
it involves the proper organizing of certain local
relationships in order to make explicit a more global
relationship that was only implicit in the local
relationships. At the level of cognition the same
function is served formally by the syllogism.

Fundamentally the visual system must be logical
because the world is logical. The world is not put
together in a random or capricious way. If a
rectangular object is incapable of obscuring a set of
diagonal lines it will also be incapable of obscuring
a luminous spot. How inefficient it would be if local
percepts were allowed to coexist with other local but
contradictory percepts. A system that excludes
contradictions from its global relationships has the
tremendous advantage of reducing the ambiguity of its
local relationships.

Perception and cognition seem to share this quality of
excluding contradictions within their own domains. Of
the two, however, perception seems to be the more
successful. Of course it is possible to construct
figures such as impossible triangles, or Escher
drawings, which surprise us by the extent to which
visual contradictions are tolerated. But it is the
rareness of such visual contradictions that leads to
our delight at such figures. Examples in which the
cognitive system fails to exclude contradictions are
unfortunately too numerous to mention. One only needs
to turn to political speeches, or the Bible, or
journal articles to find a gold mine of examples.

Seeing, then, is like thinking, at least in many of
its formal properties, and this may be because
thinking is like seeing. That is, seeing may be the
primitive form of thinking, the basic prototypical
form that shows how relationships are to be integrated
in order to correctly represent the world. Seeing, of
course, had to come first, and it must be there even
in mosquitoes. Thinking, however, allows us to
integrate relationships that extend beyond the time
and space limitations of the visual system.

Perhaps the world looks the way it looks because we
are what we are physiologically. But it should not be
forgotten that the world existed before we did and
thus, as we learn from evolution, we are what we are
because the world is what it is.

Retina-based versus scene-based frames

In the early stages of visual processing, the

The Role of Spatial Working Memory in Shape Perception

Geoffrey E. Hinton
size, position and orientation of parts of the
MRC Applied Psychology Unit visual input are represented relative to the frame
Cambridge, England of reference defined by the retina. Anorthoscopic
perception, however, cannot depend on storage in
these early, "retina-based" representations because
ABSTRACT people typically fixate on the peephole, so all the
different pieces of the object project to the same
Three demonstrations are presented and used to bit of the retina (Rock, 1981). Representations that
support a number of apparently unrelated claims encode the positions of the pieces relative to the
about the internal representations that people have retina would not allow us to perceive the whole
when they perceive or imagine a spatial structure. object because the relative position of a piece
The first demonstration illustrates properties of within the whole is determined by where the peephole
the spatial working memory that enables us to is, not by where the piece falls on the retina. It
integrate successive glimpses of parts of an object is just conceivable that as we move our eyes, the
into a coherent whole. The second demonstration internal records of all the previously perceived
shows that our ability to generate a mental image is pieces are correspondingly altered so that the
severely limited by the form of our knowledge of the records always encode where the piece is relative to
shape of an object. The third shows that the shape the current retinal position, but this seems very
representation which we create when we attend to a unlikely.
whole object does not involve creating the kinds of What is needed is a way of representing where the
shape representations for the parts of the object pieces are that is not affected by eye-movements or
that we would form if we attended to them and saw even by movements of the whole person through space
them as wholes in their own right. The real (Turvey, 1977). This can be achieved by using a
motivation for this medley of demonstrations and for temporary scene-based frame of reference that is
the interpretations offered is that these phenomena defined by some larger contextual object or
can all be seen as manifestations of a particular configuration within the external scene. If we keep
kind of parallel mechanism which is described a continually updated representation of the
briefly in the last section. relationship between the retina and this scene-based
frame, we can use it to convert from positions on
I PERCEPTION THROUGH A PEEPHOLE the retina into positions relative to the scene
before storage. These positions relative to the
Fig. 1 illustrates a phenomenon called anorthoscopic scene will be unaffected by subsequent eye or body
perception that occurs when people perceive an movements. Obviously the scene-based frame will
object one piece at a time through a slit or have to change from time to time, and it will have
peephole (Hochberg, 1968). Under suitable conditions to have a scale that is appropriate to the scale of
people report that they have a perceptual experience the parts we are attending to, but over a period of
of the whole object. They somehow integrate a number a second or two, perceptual integration of the
of separately perceived pieces into a single results of successive fixations could be achieved by
Gestalt. This means that they must be storing using a single scene-based frame of reference.
internal records of their perceptions of the
individual pieces. The simplest theory of Post-categorical versus atomistic representations
anorthoscopic perception is that the subject builds
up an internal, picture-like representation part by In a picture-like representation, the shapes of
part, and then uses this internal "picture" as a objects are not explicitly represented — it
substitute for a retinal image in identifying the requires an interpretive process to extract them.
whole object. As we shall see, this theory has Consider, for example, how a straight line is
problems. represented in an array. The line is decomposed into
"atomic" fragments each of which is depicted by
filling in one cell in the array. The absolute
positions of the individual atomic fragments
relative to the whole array are encoded directly and
precisely, but there is no direct encoding of the
straightness of the line, because this depends on
the relative positions of the various fragments.
Using this kind of atomic depiction it is impossible
to represent the fact that a line is straight
without representing precisely where it is relative
to the whole array. It is impossible to be precise
about shape and vague about position in a picture-
Figure 1. A cartoon strip showing a peephole like representation.
moving around the outline of a shape. The fact The memory used in anorthoscopic perception,
that successive frames in the cartoon fall in
different positions makes the task harder.

however, seems to allow just this combination of any orientation if the orientation is defined in
precision and vagueness. If a peephole is moved terms of the natural axes of the cube. But when the
around a polygonal spiral (see Fig. 2) people often diagonal is used to define the required orientation,
"perceive" a closed polygon. Their memory for the we realise that relative to the diagonal, we have no
precise locations of the individual sides is poor clear idea where the various parts of the cube are.
and can be swayed by expectations about closed Our knowledge of the spatial dispositions of the
polygons, but they know that the sides are straight. parts of a cube is relative to the "intrinsic" frame
This informal evidence that spatial working memory of reference defined by the cube's own axes.
can be more precise about the shapes of pieces than Knowledge in this form is ideal for recognising the
about their positions implies that it contains shape of a rigid object because whatever the
explicit representations of shapes rather than being object's actual size, position and orientation, the
a picture-like collection of atomistic local dispositions of its parts will always be the same
features in which shapes are only implicit. A recent relative to an intrinsic frame of reference based
experiment supports this conclusion. on the object itself (Palmer, 197^; Marr and
Nishihara, 1978). So if the appropriate object-based
frame can be imposed, the early retina-based
representations which encode the positions of the
Figure 2. A parts relative to the retina can be recoded into
object-based representations and this encoding will
peephole is moved
around a polygonal constitute a viewpoint-independent shape
spiral without description that allows the object to be recognised.
revealing the free I have now appealed to three different sorts of
ends or the adjacent reference frame. The initial processing of the
visual input uses representations relative to the
parallel sides.
retina; recognition of the shape of an object
involves recoding these early retina-based
representations into ones that are relative to an
object-based frame; and anorthoscopic perception
Girgus, Gellman, and Hochberg (1980) have shown relies on storing the relationships of recognised
that it is considerably easier to "see" the shape of shapes to a temporary
Ill scene-based
a whole object if the peephole is moved around the
outline of the object than if the peephole jumps Fig. 3 shows a face composed entirely of pieces
randomly from one part of the outline to another. of fruit. Palmer (1975) reports that when subjects
The two different conditions were balanced so that are shown this figure very briefly, they see it as a
the total exposure to any one part of the object was face without seeing the parts as fruit. The
identical, so the contents of a picture-like store fruitface figure demonstrates that forming the
would be equally good in both cases. The obvious Gestalt for a face does not depend on forming
interpretation of this experiment is that when Gestalts for the parts. This is puzzling because to
neighbouring parts of an object are exposed in see the face we must form some representations of
succession, it is possible to form more complex the parts and their relationships to the whole,
chunks (shapes) and hence to reduce the number of since it is the relative dispositions of the parts
chunks that must be stored in spatial working within the whole that make it a face. One
memory. When successive exposures are of widely possibility which has not been much explored is that
separated pieces, either no chunks are formed, or each part of the face can have two quite different
chunks are created which do not correspond to the internal representations. When the part is seen as a
natural parsing of the whole object into parts. This constituent of the face it receives a representation
type of explanation implies that the memory involved in which it is interpreted as filling the role of,
contains explicitly segmented and identified chunks.
II THE CUBE TASK
say, an eye because of its crude overall shape and
its relation to the whole face. When it is seen as a
Hinton (1979) describes an apparently simple whole in its own right, however, it receives a quite
mental imagery task that people cannot do: different internal representation in which the rough
"Imagine a wire-frame cube resting on a tabletop shapes and dispositions of its parts cause it to
with the front face directly in front of you and be seen as a piece of fruit.
perpendicular to your line of sight. Imagine the
long diagonal that goes from the bottom, front, left-
hand corner to the top, back right-hand one. Now Figure 3. A face
imagine the cube is reoriented so that this diagonal composed entirely
is vertical and the cube is resting on one corner. of pieces of fruit .
Place one fingertip about a foot above a tabletop (After Palmer, 1975)
and let this mark the position of the top corner on
the diagonal. The corner on which the cube is
resting is on the tabletop, vertically below your
fingertip. With your other hand point to the spatial
locations of the other corners of the cube."
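The geometry that the cube task asks for can be worked out numerically. The sketch below is an illustration only, not part of the original demonstration: it assumes a cube with one corner at the origin, takes the long diagonal as the vertical axis, and reports each corner's height along that axis and its distance from it. The function name and the edge length are hypothetical.

```python
import itertools
import math


def corner_layout(edge=1.0):
    """Map each cube corner to (height, distance_from_axis).

    The cube is treated as balanced on the corner at the origin, so the
    long diagonal from (0, 0, 0) to (edge, edge, edge) is the vertical axis.
    """
    axis = (1 / math.sqrt(3),) * 3  # unit vector along the long diagonal
    layout = {}
    for corner in itertools.product((0.0, edge), repeat=3):
        # Height is the projection of the corner onto the diagonal.
        height = sum(c * a for c, a in zip(corner, axis))
        squared_norm = sum(c * c for c in corner)
        # Clamp at zero to absorb floating-point rounding on the axis itself.
        radial = math.sqrt(max(squared_norm - height * height, 0.0))
        layout[corner] = (height, radial)
    return layout


if __name__ == "__main__":
    for corner, (height, radial) in sorted(corner_layout().items(),
                                           key=lambda item: item[1][0]):
        print(corner, round(height, 3), round(radial, 3))
```

Apart from the bottom and top corners, the six remaining corners fall into two rings of three, one third and two thirds of the way up the diagonal, all at the same distance from the vertical axis.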
It is fairly easy to imagine a cube in just about

reference and representing the size, position, and
The idea that an object receives a quite orientation of each part of the object relative to
different internal representation when it becomes this frame.
the object of focal attention does not fit the
popular view of attention as a kind of internal 3. The representation that an object receives
spotlight which can illuminate any one of a number when it is seen as a Gestalt and its shape is
of otherwise unconscious shape representations. recognised is completely different from its
However, the idea is very compatible with "early representation when it is seen as a constituent of a
selection" theories (Treisman and Gelade, 1980) in larger Gestalt. Only one Gestalt can be formed at a
which focal attention is constructive and is time, but many separate records of previous
necessary for the generation of a shape Gestalts can be stored in spatial working memory.
representation.

The internal spotlight metaphor for visual V A MECHANISM FOR SPATIAL REPRESENTATIONS
attention is a powerful one, but I believe it is
based on a mistaken analogy between external There is not space here to discuss all the
perception and introspection. Normally our attention various kinds of mechanism that have been suggested
moves rapidly and smoothly from one level to another for representing spatial structures. I shall simply
and we do not realise that at any instant we are describe one possibility which is designed to make
attending at just one level. Only when the use of parallel interactions between very large sets
information at the different levels is made of features. This kind of computation seems to be a
inconsistent, as in the fruitface, does it become natural way of harnessing the computational power
obvious that the Gestalt for the whole cannot
provided by a system like the brain in which a large
coexist with the Gestalts for its parts.
number of richly interconnected units all compute in
Introspection is of little use for deciding what is
parallel (Anderson and Hinton, 1981). The mechanism
in our minds at one brief instant because it does is based on four related assumptions:
not allow us to decide between two possibilities.
Either there are shape representations that lurk 1. A perceptual feature must always be
outside focal attention, or shape representations represented relative to some frame of reference
are generated or regenerated the moment we ask because properties like the length, position, and
ourselves whether they are there. Our fundamental orientation of a feature implicitly assume a
epistemological assumption that the existence of reference frame.
objects is independent of our awareness of them
cannot be applied to the contents of our own minds. 2. At any moment during perception we use three
different frames of reference -- retina-based,
An obvious objection to any theory which claims object-based, and scene-based — so our perceptual
that people only see one shape at a time is that the apparatus has three different sets of units, each of
shape of an object is determined by the shapes of which represents features relative to one of these
its parts and their dispositions relative to the frames of reference.
whole. This kind of recursive definition of a shape
in terms of the shapes of its parts leads to a 3. The meaning of features relative to one frame
regress that only terminates at hypothetical of reference in terms of features relative to
"primitive" features. The fruitface figure is another depends on the relationship between the two
important because it suggests an alternative way out frames. So the way in which units in one set affect
of the regress. The representations of the parts units in another set must be controlled by a
that are used in perceiving the shape of the whole representation of the spatial relationship between
may be different in kind from the representations the frames of reference used by the two sets. A
used to perceive the shapes of the parts when we particular spatial relationship pairs each unit in
attend to them. Naturally, different shape one set with one unit in the other set, and allows
representations must be able to influence one activity in one of these units to cause activity in
another. Having recognised an eye it should be the other.
easier to see the whole face, but this influence
could be mediated by spatial working memory. 4. Different Gestalts correspond to alternative
Although only one Gestalt can be formed at a time, patterns of activity in the very same set of object-
records of many previous Gestalts can be kept in based units. So only one Gestalt can be formed at a
working memory and used to influence the formation time, though records of many previous Gestalts can
of the next Gestalt. be stored as activity in the scene-based units.
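The third assumption above, that a represented spatial relationship decides which unit in one set drives which unit in another, can be illustrated with a deliberately reduced sketch. It assumes a one-dimensional row of units and treats the relationship as a pure translation; the mechanism described here also involves size and orientation, which this toy example leaves out. The function and variable names are hypothetical.

```python
def map_activity(source, offset):
    """Route activity from one set of units to another.

    The offset stands in for the represented spatial relationship between
    the two frames: it pairs source unit i with target unit i - offset.
    """
    target = [0.0] * len(source)
    for index, activity in enumerate(source):
        paired = index - offset
        if 0 <= paired < len(target):
            # Activity in a source unit causes activity in its paired unit.
            target[paired] += activity
    return target


if __name__ == "__main__":
    # The same shape at two retinal positions.
    retina_a = [0, 0, 1, 1, 0, 0]
    retina_b = [0, 0, 0, 1, 1, 0]
    # With the matching relationship, both give the same object-based pattern.
    print(map_activity(retina_a, 2))
    print(map_activity(retina_b, 3))
    # With a mismatched relationship the pattern is different.
    print(map_activity(retina_a, 3))
```

The point of the sketch is only that the mapping between the two sets is not fixed: changing the represented relationship changes the pairing, so one retinal pattern can yield different object-based patterns and different retinal patterns can yield the same one.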

Fig. 4 incorporates these assumptions. Unlike

IV WHAT THE DEMONSTRATIONS SHOW many box diagrams in psychology, the separate boxes
really are intended as separate collections of
The demonstrations have been used as evidence for hardware units. Every unit continually recomputes
the following claims: its activity level as a function of the input it
receives from other units. In the short term (i.e. in
1. We integrate the information obtained in
about 100 msec), the whole system computes by
successive glances by storing records of the shapes
settling into a state of activity that is
that we identify and their relationships to a
temporarily stable. This kind of settling process is
temporary scene-based frame of reference. We can use
described in more detail in Hinton (1981b) where it
these stored records to generate new shape
is shown that the process of assigning an
representations.
appropriate object-based frame of reference can be
2. The process of recognising a shape (forming a implemented by the three-way interaction between
Gestalt) involves imposing an object-based frame of retina-based units, object-based units and the units
for representing the spatial relationship between

the retina and the object. This kind of three way identities. It is more useful to make each active
interaction is what the triangular symbols in Fig. scene-based unit represent the existence of an
4 depict. After each settling, control processes object of a particular type with a particular
(unspecified here) can reset the pattern of activity relationship to the current scene, as the following
in any set of units, and thereby initiate a new examples show.
process of settling. Not all the units in a set need
be involved in the interactions with other sets, for Suppose that as a result of previous perceptual
example, the object-based units that are directly analysis, activity in a scene-based unit, S_i,
affected by retina-based units probably code fairly represents the existence of an eye with the
simple features, whereas the object-based units that relationship R_es to the scene. Suppose also that the
directly affect the scene-based ones probably code system is now attempting to settle on an
complex conjunctions of the simpler features. interpretation of a larger object (a face) with the
relationship R_fs to the scene. R_es and R_fs determine
R_ef, the relationship of the eye to the face, and
Scene-based units. so they determine which object-based unit, O_j,
Active units encode should be activated to represent the eye as a
recently identified constituent relative to the frame of reference of
shapes and their rel- the whole face. This influence of the contents of
lations to the scene working memory on perception can be implemented (see
Fig. 4) by having an explicit representation of R_fs
Units whose activity which governs the interaction between scene-based
represents the relat- and object-based units and ensures that activity in
ionship of the current S_i provides excitatory input to O_j.
object-based frame to Now consider what is required of spatial working
the current scene- memory if the face is seen first and attention is
based frame. then focussed on one eye. The fact that this part
Object-based units. had the role of an eye within the whole face should
Pattern of activity facilitate its interpretation as an eye when it
represents shape of becomes the focus of attention. This effect can be
the current Gestalt achieved if the Gestalt for the whole face activates
scene-based units that represent the major
Units whose activity constituents of the Gestalt as well as the whole. So
represents the relat- the mapping from object-based to scene-based units
ionship of the current operates simultaneously on units that represent the
object-based frame to identity of the whole Gestalt and on units
the retina-based frame representing its major constituents.
Retina-based units.
Activity pattern is VI CONCLUSION
the result of early
Three demonstrations have been used to illustrate
visual processing.
aspects of our internal representations of spatial
structures. Particular attention has been given to
Figure 4. A parallel mechanism, the spatial working memory that allows people to
integrate their perception over time. It has been
argued that this memory contains compact records of
This kind of mechanism raises many interesting the rich perceptual Gestalts that are formed when a
issues, some of which are discussed elsewhere person attends to an object. The interactions
(Hinton, 1981a). The following section focusses on between spatial working memory and the apparatus in
what the scene-based features are like, and how they which Gestalts are formed allow previous Gestalts
influence the formation of a new Gestalt, i.e. to influence (or entirely determine) the formation
how they affect the formation of a temporarily stable of the current Gestalt even though only one Gestalt
pattern of activity in the object-based units. can be present at a time. This view of the role of
spatial working memory supports "early selection"
theories in which focal attention is required to
Scene-based features synthesize a shape, and only one shape can be seen at a
time. It also supports the view that different
Once the general approach of implementing spatial
Gestalts correspond to alternative patterns of
working memory as activity in a set of scene-based
activity in a set of units that encode features
units is accepted, quite a lot can be deduced about
relative to a frame of reference imposed on the object.
the nature of the units from their function. One
important function of spatial working memory is to
Finally, a few provisos. The demonstrations are
allow previously identified Gestalts to aid in the
well known but the interpretations of what they
formation of related Gestalts. Having recognised an
show are probably contentious, and the mechanism I
eye, the whole face should be easier to see, and
suggest is speculative and underspecified. There has
vice versa. The kind of precisely located, atomistic
not been space to elaborate on many interesting
features that would be needed for a picture-like
issues like how the mechanism might account for the
representation would not be of much value in spatial
experimental data on mental rotation (Cooper and
working memory, because they would not explicitly
Shepard, 1975) or spatial working memory (Eroadbent
represent the identities of objects, and so their
and Eroadbent, 1981; Phillips and Christie, 1977).
effects could not be made to depend on these
Nor has it been possible to discuss crucial

theoretical issues like the number of units that
would be required by the mechanism, or the problems
of encoding novel shapes in working memory. Triesman, A. M. A Gelade, G. A feature-integration
theory of attention. Cognitive Psychology,
1980, 2£, 97-156.
Turvey, M. T. Contrasting orientations to the theory
I thank Steve Traper, Ed Hutchins, Tony yiarcel, of visual information processing. Psychological
Don Korman, Dave Kumelhart, Tim Shallioe, Joanne Review, 1977, 84, 67-88.
Sharp and Aaron Sloman for useful discussions. Many
of the ideas presented here were developed while I
was a Visiting Scholar at the Program in Cognitive
Science at the University of California, San Diego,
supported by a grant from the Sloan Foundation.


Anderson, J. A. & Hinton, G. E. Models of information processing in the brain. In G. E. Hinton & J. A. Anderson (Eds.), Parallel models of associative memory. Hillsdale, NJ: Erlbaum, 1981.

Broadbent, D. E. & Broadbent, M. H. P. Recency effects in visual memory. Quarterly Journal of Experimental Psychology, 1981, 33A, 1-15.

Cooper, L. A. & Shepard, R. N. Chronometric studies of the rotation of mental images. In W. G. Chase (Ed.), Visual information processing. New York: Academic Press, 1975.

Girgus, J. S., Gellman, L. H. & Hochberg, J. The effect of spatial order on piecemeal shape recognition: A developmental study. Perception and Psychophysics, 1980, 28, 133-138.

Hinton, G. E. Some demonstrations of the effects of structural descriptions in mental imagery. Cognitive Science, 1979, 3, 231-250.

Hinton, G. E. Shape representation in parallel systems. To appear in Proc. IJCAI-81, 1981a.

Hinton, G. E. A parallel computation that assigns canonical object-based frames of reference. To appear in Proc. IJCAI-81. Vancouver, Canada, 1981b.

Hochberg, J. In the mind's eye. In R. N. Haber (Ed.), Contemporary theory and research in visual perception. New York: Holt, Rinehart and Winston, 1968.

Marr, D. & Nishihara, H. K. Representation and recognition of the spatial organisation of three-dimensional shapes. Proc. Roy. Soc. Series B, 1978, 200, 269-294.

Palmer, S. E. Visual perception and world knowledge: Notes on a model of sensory cognitive interaction. In D. A. Norman & D. E. Rumelhart (Eds.), Explorations in cognition. San Francisco: Freeman, 1975.

Phillips, W. A. & Christie, D. F. M. Components of visual memory. Quarterly Journal of Experimental Psychology, 1977, 29, 117-134.

Rock, I. Anorthoscopic perception. Scientific American, March 1981, 244, 103-111.


Color Perception and the Meanings of Color Words

Paul Kay
University of California, Berkeley

The relation between perception and meaning is hard to trace in any domain. I have been asked today to discuss the problems of identifying such connections in the domain of color. Color is an area in which our ignorance regarding the relation of perception and linguistic meaning is less than total; nonetheless you will not be surprised to learn that here, as elsewhere, there are more questions than answers. In the time available I will be able to do no more than sketch one view of the matter and so will probably present a clearer picture than is in fact warranted by current knowledge. In particular there will be little time to discuss the detailed empirical evidence that supports this view and no time to discuss alternative views.1

I will begin by describing in lamentably oversimplified terms certain structures in the human visual system that give rise to the sensation of color. We will then see how these aspects of visual physiology can help us understand independent findings regarding the meanings of words for color in the world's languages. In particular, starting from the simple arithmetic of differential firing rates of certain individual types of cells in the visual system, we are able to build a model of the perceptual categorization of color that explains a good deal both about cross-linguistically universal features in color naming and about dimensions of difference among the classifications of color found in the languages of the world. Finally we will see how this model organizes certain systematic observations that have been made regarding regularities in the temporal evolution of the color classification systems of the world's languages.

Aspects of the Neurophysiology of Color Perception

It is widely known that the retina contains three kinds of color receptors, i.e., cones. It is perhaps less generally known that at post-retinal but still peripheral levels of neural processing, information regarding dominant wavelength, or hue, is recoded from this three-channel system into a four-channel system, yielding the four fundamental hue sensations, blue, green, yellow, and red. In 1920 Ewald Hering (1968) postulated, on the basis of primarily introspective evidence, that such a system must exist. Hering noted further that subjectively there is no such thing as a mixture of green and red nor of blue and yellow--try to imagine what one could possibly mean by the locutions 'a reddish green' or 'a bluish yellow'. He therefore supposed that there must exist what he called two 'opponent processes' in the visual system, one red vs. green process and one yellow vs. blue process. At a given moment, each process has to be in exactly one of its named states; for example, at a certain time the red-green process might be in the red state and the yellow-blue process in the yellow state: this pairing of states would give rise to the sensation of orange. If one admits continuity to the model by allowing each of the four opponent states to operate at varying strengths, the relative strengths of the two states operative at a given moment will determine the precise shade that is subjectively experienced. In our example, the relative strengths of the red and yellow states will determine whether a reddish orange, a yellowish orange, or a relatively balanced or pure orange is experienced.

Despite the superiority of the Hering model over its competitors in explaining these and many other aspects of the subjective experience of color, it never gained general acceptance until Russell De Valois and his associates, as recently as the late 1960s, isolated the anatomical structures that accomplish the opponent-process function and monitored this function in the living organism. After a series of preliminary experiments, which established that the visual system of the macaque monkey is like man's in all relevant respects, De Valois and his co-workers inserted micro-electrodes into individual cells in the Lateral Geniculate Nuclei of live macaques and recorded the rates of firing of these cells while the animals' eyes were exposed to light of systematically varying wavelengths and intensities. It was found that LGN cells could be thus classified into six types, two of which were primarily sensitive to overall luminosity or brightness, and four of which were primarily sensitive to dominant wavelength or hue.

The latter four types constitute the opponent process system. Each of these opponent cells has a spontaneous or basal rate of firing (of about 6 spikes per second): this rate increases or decreases depending on the wavelength of stimulating light according to the four patterns shown in Figure 1.

Inspection of Figure 1 reveals that types of cell A and C form a pair in having the same crossover point between advanced and retarded rate of firing at about 605nm and also in having mirror image maxima and minima of firing rate at about 540nm and 640nm. Cells of types A and C together constitute Hering's postulated red-green process: considering all cells of types A and C at once, we may take the sum of the absolute deviations from the basal rate of firing in the long wavelength region, that is above the crossover point, as signaling the strength of the red response. Similarly, the total absolute deviation from the basal firing rate below the crossover point represents the strength of the green response. In analogous fashion the type B and D cells together constitute the yellow-blue opponent process: the sum of absolute deviations above the crossover point is the total amount of yellow information or equivalently the strength of the yellow response, while below the crossover point the sum of absolute deviations represents the total blue response. Note that in a given stimulus condition, each opponent process must be in either one state or the other depending whether the wavelength of the stimulus is above or below the crossover point for that pair of types of cells.

At a given wavelength there are thus two families of possibilities. (1) If the visible wavelength is at one of the crossover points, then one of the opponent systems is inert, e.g. at about 605nm the macaque's red-green system is quiescent; at this point all hue information is carried by the yellow-blue channel, which is in the yellow (longer wavelength) state; this is called the yellow unique hue point; all the organism sees at this wavelength is pure yellow. The blue, green, and red unique hue points are defined in the same way. (2) If the stimulus wavelength is not at a unique hue point, then exactly two of the four fundamental hue states are operative and the relative strengths of these two states determine the precise shade of perceived hue; for example in the region

[Figure 1. Firing rates of the four opponent cell types as a function of wavelength. Source: De Valois et al. (1966:972-73, Figs. 9-12). Note: Horizontal dotted lines indicate mean basal response rate for each cell type.]

between the yellow and green unique hue points the green state of the red-green system and the yellow state of the yellow-blue system are operative; the relative strengths of these response states determine whether a yellowish green, a greenish yellow, or a perfectly balanced chartreuse or lime is perceived.

One may, in sum, model this system as having quantitative outputs in four channels, RED, YELLOW, GREEN, BLUE, where at a given instant there are non-zero outputs in either (1) a single channel or (2) two adjacent channels (considering red and blue as also adjacent). From psycho-physical data Wooten (1970) has estimated the curves for humans comparable to those of Figure 1. Using these curves (not shown here) it is a straightforward matter to calculate for each channel of fundamental hue response (RED, YELLOW, GREEN, BLUE) the proportion of total hue response in that channel for each wavelength of visible light. Curves representing these calculations are shown in Figure 2. Being proportions, these functions necessarily have ordinates varying from zero to unity across the spectrum. It is therefore natural to interpret them as fuzzy sets, which interpretation is reflected by the ordinate of Figure 2 being labelled "Degree of Membership".

So far we have talked only about types of neural cells, their rates of firing, and certain functions composed of the firing rates of different classes of cells, but we have said nothing about the meanings of any words in natural languages. We are now prepared to make the initial connection: the curves labelled BLUE, GREEN, YELLOW, and RED in Figure 2 represent at one and the same time (1) the outputs of the fundamental hue-response categories as defined in terms of proportional output in the individual hue channel of the opponent process system, and also (2) the meanings of the ordinary English words blue, green, yellow, and red, along with their exact translations into many languages. Similarly, non-opponent fundamental response channels BLACK and WHITE, corresponding to the English words black and white (and their translations in many other languages), are defined by the two classes of brightness-sensitive cells discovered by De Valois and his associates, and these categories also may be modeled as fuzzy sets.

The Semantics of Color Words

We have seen that six English color words (and their translations into other languages that have exact translations of these words) can be given

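[Figure 2. Membership curves for the fundamental hue categories BLUE, GREEN, YELLOW, and RED across the spectrum (axis ticks at 500, 550, 600); ordinate labelled "Degree of Membership".]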
neurophysiological definitions. For these six semantic categories, we have achieved a considerable rapprochement of semantics and perception. What can we say now about the perceptual basis of other color words in English and in other languages? In this investigation we will restrict our attention to what have come to be called 'basic' color words. In any language the basic color words form a natural set, and it is the comparison of the sets of basic color words across languages that has been found most fruitful in the cross-linguistic investigation of this semantic domain. In every language there is a small set of semantically simple words such that any color can be named with a member from this set. Members of this set are called the basic color words or basic color terms of the language. Several languages are known in which there are just two basic color terms. English has eleven; in addition to the six already discussed, which name the fundamental neural response categories, there are also brown, purple, pink, orange and grey. For many speakers of Russian, there are twelve basic color terms; Russian has a basic color term specifically for light blue, goluboy, along with the term for darker blue, sinyiy.

For a long time it was believed by linguists and anthropologists that there were no constraints on the way the basic color terms of a language might divide the perceptual domain of color and hence no tendency for color words to be translatable across unrelated languages. Another way this idea was put was the claim that perception has no influence over color-naming in a language beyond setting the bounds of the visible spectrum. Thus in what was probably the most widely accepted linguistics textbook of the 1950s, H. A. Gleason said, "There is a continuous gradation of color from one end of the spectrum to the other. Yet an American describing it will list the hues as red, orange, yellow, green, blue, purple--or something of the kind. There is nothing inherent either in the spectrum or the human perception of it which would compel its division in this way" (1961:4). We now know that this is wrong and that all the basic color terms in all languages are based on the six fundamental response categories: the four of the opponent (i.e. hue) system and the two non-opponent (i.e. brightness) categories.

We have already noted that each of these six categories has a structure that invites its interpretation as a fuzzy set. There is strong additional motivation for the fuzzy set interpretation, namely that all the other basic color categories, either in English or in any of the other languages that have been investigated, may be defined in terms of simple Boolean functions of these fuzzy sets. For example, in many languages of the world, including the majority of Native American languages, there is a single word that is used wherever an English speaker would use either the word green or the word blue. There is considerable experimental evidence indicating that this widespread basic color category (let us call it 'grue') is in fact the fuzzy union of the fundamental neural response categories GREEN and BLUE. For example, it has been found in a large number of languages that subjects asked to pick out from an array of color stimuli the best example of their category grue will not select something intermediate between green and blue such as we might call turquoise or aqua; rather, they will select either a focal green or a focal blue. Since the union of two fuzzy sets is defined as the maximum of the individual characteristic functions, this pre-theoretically surprising, but empirically robust, finding is predicted by the definition of the category 'grue' as the fuzzy union of GREEN and BLUE.

Berlin and Kay (1969) surveyed the basic color lexicons of ninety-eight languages and reported strong constraints on the semantics of basic color term systems. They also postulated a narrowly constrained evolutionary sequence through which basic color lexicons must pass as they add terms over time. That sequence, as reformulated by Kay and McDaniel (1978) about a decade later on the basis of a great deal of work by many investigators in the interim, is summarized in Figure 3.

A language with only two basic color terms has one which is the fuzzy union, 'WHITE or RED or YELLOW', and one which is the fuzzy union, 'BLACK or GREEN or BLUE'; these are conveniently glossed as 'light-warm' and 'dark-cool', respectively. When a language adds a third term, it does so by splitting the 'light-warm' term into a 'WHITE' term and a 'RED or YELLOW' (i.e. 'warm') term. At the next stage of development, either the 'dark-cool' term splits into 'BLACK' and 'GREEN or BLUE', that is 'cool', or the warm term splits into 'RED' and 'YELLOW' terms (see Stages IIIa and IIIb in Figure 3). At Stage IV, whatever possibility didn't occur at Stage III now occurs, so the language now has basic terms for the fuzzy categories 'WHITE', 'BLACK', 'RED', 'YELLOW', and 'GREEN or BLUE' (i.e. 'grue'). At Stage V, the 'grue' category is dissolved into its fundamental neural response components 'GREEN' and 'BLUE', and there is now one basic color term for each fundamental neural response category.

Up to here in the sequence, we have been considering two types of basic color categories, those that consist in unions of fundamental neural response categories and those that consist in the fundamental neural response categories themselves. Beyond evolutionary Stage V, basic color categories of a new kind are formed on the basis of the intersections of the fundamental categories. More precisely each of

[Figure 3. Evolutionary sequence of basic color lexicons: from Stage I (W or R or Y; Bk or G or Bu) through Stages II, IIIa, IIIb, IV, and V (W, R, Y, G, Bu, Bk) to the later stages, which add Y + Bk (Brown), R + W (Pink), R + Bu (Purple), R + Y (Orange), and B + W (Grey).]

these later combinations of the fundamental categories consists in twice the fuzzy intersection of its constituent categories. For example, the fuzzy set orange is twice the intersection (minimum) of the fuzzy sets RED and YELLOW.2

It is not possible in the time available to discuss the empirical motivation for the formulation of these intersectional categories in terms of precisely twice the intersection of the constituent fuzzy categories (see Kay and McDaniel 1978:631-635; Mervis and Roth 1981). But the main points of the story so far should now be clear. Empirical semantic researches have revealed that, so far as we can tell at present, all the basic color categories of the languages of the world are based on the six fundamental neural response categories, whose structures are determined by the firing patterns of LGN (and other) cells in the visual pathway. Languages with fewer than six basic color terms have terms that encode categories composed of fuzzy unions of the fundamental categories. Languages that encode more basic categories than the six perceptually fundamental ones encode categories based on the fuzzy intersections of the fundamental ones.

Furthermore, there appear to be quite narrow constraints on which of the logically possible Boolean combinations of the six fundamental response categories actually occur in the world's languages. For example, of the fifty-seven possible categories that might be formed by taking fuzzy unions of the six neurologically fundamental categories, only the four we have discussed ('light-warm', 'warm', 'dark-cool', and 'cool'--i.e. 'grue') occur in actual languages. Little is known by way of explanation of this fact, though it is perhaps worth recalling that Hering designated the colors white, red and yellow collectively as inherently arousing, and the colors black, green and blue as inherently non-arousing. Even more striking as an empirical generalization crying for theoretical explanation is the evolutionary sequence depicted in Figure 3. Why should the color lexicons of the world sort into just the handful of types permitted by this sequence and, above all, why should the temporal evolution of color terminology systems follow this particular, narrowly restricted course? Answers to these questions, as they are found, will deepen our understanding of the relation of perception and linguistic meaning in the domain of color.

Notes

1. This talk is, in effect, a highly compressed summary of Kay and McDaniel (1978), and the hearer or reader interested in pursuing the subject should consult that paper and the references cited there. McDaniel (1972) was the first to propose a perceptual explanation for the Berlin and Kay (1969) findings regarding semantic universals in terms of the opponent process model of color vision. Figures 1, 2, and 3 accompanying this text are respectively Figures 4, 6, and 13 of Kay and McDaniel.

2. In Figure 3, the '+' sign denotes the binary operation 'twice the fuzzy intersection'.

References

Berlin, B., and Paul Kay. 1969. Basic color terms: their universality and evolution. Berkeley & Los Angeles: University of California Press.

De Valois, R. L.; I. Abramov; and G. H. Jacobs. 1966. Analysis of response patterns of LGN cells. Journal of the Optical Society of America 56.966-77.

De Valois, R. L., and G. H. Jacobs. 1968. Primate color vision. Science 162.533-40.

Gleason, H. A. 1961. An introduction to descriptive linguistics. New York: Holt, Rinehart and Winston.

Hering, Ewald. 1920. Grundzüge der Lehre vom Lichtsinn. Berlin: Springer. [English version: Outlines of a theory of the light sense. Translated by L. M. Hurvich & D. Jameson. Cambridge, MA: Harvard University Press, 1964.]

Kay, Paul, and Chad K. McDaniel. 1978. The linguistic significance of the meanings of basic color terms. Language 54:610-646.

Mervis, C. B., and E. M. Roth. 1981. The internal structure of basic and non-basic color categories. Language 57:383-405.

Wooten, B. R. 1970. The effects of simultaneous and successive chromatic constraint on spectral hue. Doctoral dissertation, Brown University.
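The fuzzy-set definitions used in the paper above can be made concrete with a small numerical sketch. The Python below is illustrative only: the unique-hue wavelengths and the piecewise-linear membership curves are assumptions standing in for the curves of Figure 2, which are not reproduced here, and every function name is hypothetical. It shows a fuzzy union computed as a maximum, "twice the fuzzy intersection" computed as twice a minimum, and a count of the unions that can be formed from six categories.

```python
from itertools import combinations

# Hypothetical unique-hue wavelengths in nm, ordered along the spectrum.
# Illustrative placeholders only; they are not values read from Figure 2.
UNIQUE_HUES = [("BLUE", 470.0), ("GREEN", 510.0), ("YELLOW", 580.0), ("RED", 650.0)]

FUNDAMENTAL = ["BLACK", "WHITE", "RED", "YELLOW", "GREEN", "BLUE"]


def hue_memberships(wavelength):
    """Degree of membership of a wavelength in each fundamental hue category.

    Assumed piecewise-linear curves: membership is 1 at a category's unique
    hue point and falls to 0 at the neighbouring one, so at most two adjacent
    categories are non-zero and the memberships sum to 1. The red-blue
    adjacency mentioned in the text is ignored in this simplification.
    """
    degrees = {name: 0.0 for name, _ in UNIQUE_HUES}
    first_name, first_nm = UNIQUE_HUES[0]
    last_name, last_nm = UNIQUE_HUES[-1]
    if wavelength <= first_nm:
        degrees[first_name] = 1.0
    elif wavelength >= last_nm:
        degrees[last_name] = 1.0
    else:
        for (lo_name, lo_nm), (hi_name, hi_nm) in zip(UNIQUE_HUES, UNIQUE_HUES[1:]):
            if lo_nm <= wavelength <= hi_nm:
                t = (wavelength - lo_nm) / (hi_nm - lo_nm)
                degrees[lo_name] = 1.0 - t
                degrees[hi_name] = t
                break
    return degrees


def grue(wavelength):
    """Fuzzy union 'GREEN or BLUE': the maximum of the two memberships."""
    degrees = hue_memberships(wavelength)
    return max(degrees["GREEN"], degrees["BLUE"])


def orange(wavelength):
    """Twice the fuzzy intersection (minimum) of RED and YELLOW."""
    degrees = hue_memberships(wavelength)
    return 2.0 * min(degrees["RED"], degrees["YELLOW"])


def count_possible_unions(categories):
    """Number of unions that combine at least two of the given categories."""
    return sum(1 for k in range(2, len(categories) + 1)
               for _ in combinations(categories, k))


if __name__ == "__main__":
    for nm in (470, 490, 510):
        print("grue", nm, grue(nm))
    for nm in (580, 615, 650):
        print("orange", nm, orange(nm))
    print("unions of six categories:", count_possible_unions(FUNDAMENTAL))
```

Under these assumed curves, grue has full membership at both the blue and the green focus and only half membership midway between them, which is the pattern the best-example experiments report. Orange reaches full membership where RED and YELLOW are equal, and the count of unions of two or more of the six categories comes to fifty-seven.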

Structure and Function in the Early Processing of Visual Information

Shimon Ullman

The Artificial Intelligence Laboratory
Massachusetts Institute of Technology

1. Introduction

A central notion in contemporary cognitive science is that mental processes involve computations defined over internal representations. This general view suggests a distinction between the study of the representation and computations performed by our cognitive systems on the one hand, and the physical brain mechanisms supporting these computations on the other. The two studies proceed along different paths, and neither is completely reducible to the other. It is the hope of cognitive science, however, that the studies of function and mechanism can complement each other, and that theories can be developed for various cognitive subsystems that will describe and explain their computational aspects, their underlying mechanisms, and the interactions between the two.

In this paper I shall describe some attempts to combine the study of brain mechanisms with computational considerations in the first stages of visual information processing. This work combines the contributions of many individuals, most notably the late David Marr, and a group of people who were fortunate to work with him, primarily at M.I.T.'s Artificial Intelligence Laboratory and Psychology Department.

2. Representing intensity changes in images

The first computational problem arising in the early processing of visual information is the organization and representation of the input registered by the eyes. At the photoreceptor level, the input to the visual system consists of over [illegible] million light intensity measurements (registered by over [illegible] million cones and rods in each eye). This is an unwieldy, large and [illegible] set of measurements. We can therefore expect the visual system to construct a more economical representation of the input, that will make explicit the relevant information for later processing stages.

A reasonable candidate for the task is a representation that can be roughly described as an edge representation of the image. The idea is to make explicit the locations in the image where light intensity changes sharply from one level to another. The motivations for this type of a representation are (i) it will achieve a more concise description of the image than the original array of intensity values, and (ii) sharp changes in light intensity values usually have a physical significance. They are often associated, for example, with object boundaries, markings on objects' surfaces, and so forth. An edge representation is therefore useful in making the transition from the domain of light intensities in the image to analyzing the physical structure of the visible environment. One general observation often raised in support of the edge representation approach is that many objects are recognisable from a sketch of their edges and contours alone, although in terms of the underlying light intensity distributions, the sketch and the original image are markedly different.

The representation of localized intensity changes is not the only approach that has been proposed for the first stages of analyzing visual information. One popular alternative is the Fourier analysis approach that received wide attention in the psychophysical literature following Campbell and Robson's (1968) discovery of spatial frequency tuned channels in the visual system. The approach presented here is in a sense a combination of the frequency channels and the edge detection approaches, but it is concerned primarily with the detection of intensity changes.

A large variety of techniques have been proposed in the past (primarily within the engineering field of image processing) for the detection of intensity changes in images. A major problem that has been discovered in the course of developing these techniques is that significant intensity changes in an image can occur at a variety of scales. Some changes are gradual and smooth: they can also be described in frequency domain terminology as low frequency changes. Others are high frequency and sharply localized changes. To capture all of the significant intensity changes, it is possible to examine the image at a number of different resolutions, or scales. A low resolution "copy" will serve for capturing the gradual, gross changes, a high resolution "copy" for the fine details. Figure 1 shows an example of what it means for the same image to be examined at three different resolutions. The resolution decreases from 1a to 1c. It can be seen that in the lower resolution copies fine details are progressively blurred. The low resolution copy can be obtained by a process called gaussian filtering (and this filtering is in a sense optimal, see Marr & Hildreth 1980). This simply means that at every point a local average is taken of the intensity values, using a gaussian weighting function. The resolution of the resulting copy is controlled by the size of the gaussian. A larger gaussian averages the intensity values over a wider neighborhood, and hence is less sensitive to fine details. The gaussian smoothing is also called in mathematical terms the convolution of the image with a gaussian filter, denoted by G * I (where I is the image, G is the gaussian smoothing function).

As a result of the first operation we have a number of "copies" of the original image, at a number of different resolutions, as determined by the sizes of the gaussian filters (figure 2). The next step is to isolate the sharp intensity changes in each copy. We shall consider this problem first in the context of one-dimensional signals. In this case, the image I is a function of a single variable, denoted by x. A sharp change in the signal I(x) can be defined as a peak in its first derivative, since the derivative, by definition, measures the signal's slope. From elementary calculus, peaks in the first derivative can also be located by zero-crossings of the second derivative (i.e. places where the second derivative changes sign). Mathematically the two criteria are equivalent but the second characterization has certain advantages when two-dimensional signals are concerned [Marr & Hildreth 1980].
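The two steps described in this section, gaussian smoothing at a chosen scale followed by a search for zero-crossings of the second derivative, can be sketched for a one-dimensional signal. The Python below is a minimal illustration under stated assumptions, not the implementation used in the work cited: the kernel is truncated at three standard deviations, the borders are handled by repeating the end samples, the second derivative is a central difference, and all function names are hypothetical.

```python
import math


def gaussian_kernel(sigma):
    """Sampled gaussian weights, truncated at 3 sigma and normalised to sum to 1."""
    radius = int(math.ceil(3 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Local gaussian-weighted average at every point (the convolution G * I)."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Assumption: repeat the end samples beyond the borders.
            idx = min(max(i + j - radius, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out


def second_derivative(signal):
    """Central second difference; the two end points are left at zero."""
    d2 = [0.0] * len(signal)
    for i in range(1, len(signal) - 1):
        d2[i] = signal[i - 1] - 2.0 * signal[i] + signal[i + 1]
    return d2


def zero_crossings(values, eps=1e-9):
    """Indices i where the sign changes between values[i] and values[i + 1]."""
    return [i for i in range(len(values) - 1)
            if (values[i] > eps and values[i + 1] < -eps)
            or (values[i] < -eps and values[i + 1] > eps)]


if __name__ == "__main__":
    # A step edge between samples 49 and 50.
    step = [0.0] * 50 + [1.0] * 50
    for sigma in (2.0, 4.0):
        copy = smooth(step, sigma)
        print(sigma, zero_crossings(second_derivative(copy)))
```

For the step edge in the usage example, the smoothed copy at either scale yields a single zero-crossing between samples 49 and 50, which is where the intensity change lies. A larger `sigma` gives a lower resolution copy, so fine details in a richer signal would contribute zero-crossings only at the smaller scale.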

In summary, the localization of sharp changes is obtained by performing:

$$\frac{d^2}{dx^2}(G * I) \tag{1}$$

The zero-crossings in the output will indicate the locations of sharp intensity changes in the image at the scale determined by the gaussian. This means that the image I is first passed through a gaussian smoothing function G, and then a second derivative of the result is taken. The two operations of scaling and differentiation can be combined in a convenient manner. The combination is based on a mathematical identity that states that the order of differentiation and convolution can be interchanged without affecting the result. In mathematical notation:

$$\frac{d^2}{dx^2}(G * I) = \left(\frac{d^2 G}{dx^2}\right) * I \tag{2}$$

The implication is that the two operations can be collapsed into a single one: simply filter the image not through a gaussian function, but through $\frac{d^2 G}{dx^2}$ (the second derivative of a gaussian). This function is shown in figure 3a (3b shows its fourier transform). The analogue in two dimensions would be a similar but circularly symmetric function which has the appearance of a "mexican hat". Mathematically, in the two-dimensional case the filter is $\nabla^2 G$, where $\nabla^2$ is the Laplacian operator and G a two-dimensional gaussian function.

The scheme is now straightforward: the representation of intensity changes is obtained from the zero-crossings in the result of passing the image through filters that have the shape of $\nabla^2 G$.

Those who have some familiarity with the physiology of the visual system would readily recognize the shape of these filters as corresponding to the shape of ganglion receptive fields. In other words, the retinal structure can be viewed as approximating the convolution of the image with the $\nabla^2 G$ filters. (For more detail see Marr & Hildreth 1980, Marr et al. 1980.)

Figure 4 shows examples of images following this operation, and the resulting zero-crossings representations (generated by Ellen Hildreth). The first row shows two images prior to the filtering stage. The second row shows the images filtered through the retinal operation. It gives some idea of the form of the image as it travels up the optic nerve from the eye, via an intermediate station called the LGN, to the visual cortex in area 17 of the brain. The third row illustrates the resulting zero-crossing representations. Figure 5 [illegible]

tion can be relaxed (this problem arises since the filters in the human system are probably more than an octave wide). If appropriate extensions along these lines can be made, it would imply that the zero-crossings provide not only a convenient representation that captures the significant aspects of the image, but also a complete one. That is, no essential information is lost by discarding the image and analyzing the zero-crossing representation alone. (See Marr, Poggio & Ullman 1979, for further discussion of this issue.)

3. The biological detection of zero-crossings

The analysis so far leads to the general suggestion that following the retinal operation the next step is to locate and represent a map of the zero-crossings in the output. If this suggestion is correct, then a main function of the primary visual cortex should be the construction of the zero-crossings representation. I shall next turn to consider briefly how zero-crossings may be detected by the mechanisms of the visual cortex.

The fibers of the optic nerve coming from the eye to the brain carry the image filtered through the $\nabla^2 G$ receptive fields (this is, of course, a computational idealization). This image is in fact carried by units of two complementary types, called on-center and off-center units. The off-center units are simply "inverted mexican hats" with negative center and positive surround. Let us now consider their output in the vicinity of an edge. Figure 6a depicts a step edge, and 6b is the result of passing 6a through the receptive fields. This output contains both negative and positive values. In contrast, the optic nerve carries no negative values; the positive part of the signal is carried by the on-center units, and the negative part by the off-center ones. This means that within the system the zero-crossing itself is always flanked by two peaks of activity: of on-center cells on one side, and off-center cells on the other. The detection of a zero-crossing can easily be accomplished, therefore, by a simple combination of the on- and off-center units. When two adjacent units, one off-center, the other on-center, are active simultaneously, they indicate the existence of a zero-crossing running midway between them. Note that a point of zero value is detected in this scheme by detecting peaks of activity rather than zero activity.

The basic zero-crossing detector is shown in figure 7a. It is composed of the two sub-units (on- and off-center) combined with an "and" operation. This means that the two units are required to be active simultaneously to produce a response. The unit can be made oriented by combining a number of such detectors lying in a row (figure 7b).
sculpture h> Henry Moore) and its /cro-<rossiiig icpresc'niation at three Such an oriented unit will exhibit many of die properties of cortical
difTereni resolutions. simple cells ("edge detectors") originally discovered by iliibel & VVicsel

Hcforc turning to the physiological aspects of the ^cro-crossing in die visual cortex of die cat | 1 % 3 | and monkey (19fi8|. It will still l.ick,

representations, it will be of inleiest to note that 7ero-crossings in however, one fundamental property; cells in die visual cortex are also

bandp.issfillersare k n o w n to he, in a sense, "rich in information" oden selective for direction of motion. They respond well when dieir
preferred stimulus moves in one direction, but little or not at all when it
II. I ogan of the Hell l,aboi,itories has shown that a onc-diinensional with a bandv,idth of less dian one octave be completely moves in die opposite direction.

reconstructed (up to an overall mulnplicalivc consi.uii) from its /cro-

crossings alone, provided that some simple conditions ,ire mil |l ogan 4. Vrldini; (ilrcitiiinal selecliiity

1977|, It is not clear, however, whether die dieorcm can he extended With the addition of one siihunit it is possible to m a k e the b.isic
to two dimensions, and under what conditions die one-iictavc restric- zero-crossing dclector diicc tionally selective, and use it for Ihe incisure-

mcni of visual moliiin. lo sec bow. consider ag.iin ihc zero crossing die two figures arc uncorrclaled W h e n these figures arc presented lo
iiwcialcd uiih an inlcnsily edge (figure 6b). At Ihc /cro-iiossing itself h u m a n observers in a rapid ailcrnadon. the central square is imniedalcly
ihc ciiirtiu \,iliic is. of course, zero. Ii c m be rc;idil.\ seen from ilic perceived to m o v e h,ick and forlh against a background of uncorrclaled
flptiie Ibal il ihc piolilc m m moves lo the iiphl. Ihe i.iliic al Ihis poijil molion Figure 10c shows iJic zero-crossings representation of lOa.
will be iiicriMsiiip. II II iiiii\c lo ilio Irli. ibe \aliic mil be diLHMsiiip ligure lOd is the result of die molion analysis of the zero-crossing (Ihe
lt> sinipiv in^pitliiii; llic sii;n ol (lie leio| tliaii,i;i.' il iIuk luie be- light dots indicate die direction of motion of die z.cro-crossings). In
comes p(ftvible lo dcli.-miinc (be iliKilioii of motion. Il is mil ililliiiill figure 10c tlic light dots where removed from tlic area where coherent
lo csliblisli It IS liiKliii po' ihl' to niiMsnie Ibe spiiij ol million motion (to tlic right) was found. Ilic motion assignment was correct,
111 Uic direction of Ibc unit by comparing llie slope of Ihc zero-crossing with die exception of a few isolated points, and as a result the moving
and Ibe rale of temporal clinngc. Ihc extra sub-unit should respond square was detected.
tlicrcforc to temporal changes. Ideally, it should behave like the time i.e. ,'/j(V'G). 1 have sketched above some aspects of an evolving dicory of early
visual information processing. Ihe main goal has been not to present a
As it turns out. the population of retinal cells contain a natural
comprehensive review of the theory, but to illustrate an alienipt aimed
candidate for this task. Ihcsc arc the so called Y-lypc cells, originally
at combining die study of structure and function in the early stages of
discovered by rnrolh-Cupcll & Kobson (lOMil. Ihis is a relatively small
visual perception. M.ijor paits of the Iheoiy were consequcndy left out
siib-popiilaiion of cells that arc kiumn lo be "transient" I hat is. iJicy
of Ihe discussion, most notably, die use of Ibc early iepresent;iti(ms in
rc-pond lo a sicacl.v stimulus by a sluirt aiu1 brisk response when Ihc
sicrcn \isi(iii [M.irr & I'ocgiii 1979. Crimson 19S1).
stimulus is turned on or o(T. I he ntlicr major populalion of retinal cells
l-'in.illy. 1 would like to end with two hiief cautionary notes. Ihc
arc the sustained. X-typc cells. Such a cell responds to a stationary
first has to do with Ihc specific pmhlciii of aii.dy/ing image contours.
siunuliis will) a sustained response tJiat usually conlinues as long as Uie
l-.vcn if Ibe zero-crossing .in.iljsis is .ilong the right track. i| provides
stimulus is present Milhin its rcccpti\c licld.
only thefirstslaccs in the ,inal>sis of cdgts .ind liii.ige ciiiiloiiis ligure
diir schematic model of die simplest directionally selective units is
11 illiisiiates examples ofcontuurs are e isily peicoived but cannol
therefore constniclcd fiom three types of sub-units. As before, it has
be c.iplured by any simple iiit',iisit\-based .iimKsis of the im.ige. In
a row of on-center cells, and a row of oir-centcr cells, both of the sus-
(igiiic H a all Ihe lines In- .ilciiij; Ilic IS deii. di.i.fon.ils Ihe
taincil Ijpc. In addilion. il has ,iii input from at least one transient Y -
.iiid \eitic.i! Iioinid.'irics whiih .iic .ipp.iieiil in Ihe im.ige .ire |iioducetl
lype unit (figure 7i). A more detailed discussion of this general scheme
not b\ .thiiipt intensiis cli.tiirr.. biii by uiLiiii '.'inupi'u; pioccssos.
can be found in (Man & Ullnian 1981).
li^'iiie 111) is .111 es.niiplc of so i tll.d "(ii!'iiili\<- < iitiloni'.". I hc\ do
Ihis general vhcme for zcro-trossing and motion detection w:i not exist in the im.age. and cannol he detected by simple intensity-based
dinon prim,inly by consider.ilions. I'liysiologically, al operations. These example senes to illustrate that e\cn a seemingly
Ihoiigli \-t)pe units were often <lcsciil)ed as transient, it was not cica simple and elementary task such .is die detection of image contours, re-
wlicther ihey can also be dcscfihcd .is at least approximating die re quires in fact complex processing that is still far from being complelely
quired time deru.iiivc opcr.ilion. W e Ibereforecoinpaied the response understood.
icqiiiicil hy the ciiiii|iui.ilion,il sihciiic with pluvioKignal response Ilic second and more general comiiieiit to do with llic iiitcgra-
(lakcii finni Rodirck i^ Stone. lOfA. Dicher & S.iiulcrson 197.V sei liim of theories ol riinclion and struclure in more complex systems, llic
Mill iV; Ullinan l')Sl lor del.ills) S o m e uimp.insims,ire shown in figiin examples I have oullincd c o m e from a system is relatively simple
u (for \ cells) and 9(foi V lcHs). I Ik- top row in figiiie N is Ibe coinolu and easy lo explore. Us anatimiical structure is orderly, the input lo the
lion iif \.irimis profiles (edge, thin h.ii, wide wiih V ' G . On system is relatively easy to control and manipulate expeiimenlally, and
icniei colls ,iu' cv|Hxled to cari\ the po'.iliM.' |i,iil of these profiles, anc m u c h is known about its physiology. I'ven under Uicse favorable condi-
oil (cni'-i the ncr.iliie |Viit In the iiixl ino runs the posime put i.l tions, the integration orsliucture and function proves lo be exceedingly
the si;; iscomp.iud with reLonliiii's fioiii on ceiilei cells, .-ind in Ihc difiicult. W h a t is die hope. then, for achieving tompiehensive theories
List lo'As the 111 e.iiuc- put is l o m p m d with lerordinrs liniii oil (.enlei of structure and function for higher, more complicaicd. cognitive sys-
I ills '.111111,11 ioiM|i.iii.ons .lie ^ 111'I'll III li-iiu- ') In I'M , ii die iniiipol.i- tems?
•flic task is certainlyfiuniid.ible.hut it is piobablv worthy of
tional model, based on •',{(•*I)- '"^ physiological recordings. It can be
exploration at least in certain instancs. since it appears unlikely dial
seen that even in the cases where the profiles .ne r.ither complicated, die
the sitiicture tif complex systems he understood without some
general agreement is good.
guidelines supplied b> computational theories. It has to he admitted.
linallv in Ihis section,figureId shows an example of applying the
however, that given the dilTiculties of die task il is unclear whether
motion deteclioii scheme described above Ui a mo\ing random texture.
coherent and detailed liieoiics combining stiiicturc and fuiictlnn can be
I igurcs Ida and h shovv a pair ori.iiidoin d m patterns, A central square
achieved al present beyond Ihe simplest congnilive systems.
ill ID.i is sliilli'd ill lOli sliplitl) to die right, while the h.itkgrounds of
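
The zero-crossing and direction-of-motion scheme of sections 3 and 4 can be illustrated in one dimension. What follows is only a minimal sketch under stated assumptions, not the implementation used in the work described here. The function names, the choice of sigma, and the activity threshold are all hypothetical. Neighbouring samples stand in for adjacent on- and off-center units, and a difference between two frames stands in for the transient (time derivative) sub-unit. The kernel is shifted to sum to zero so that uniform regions give no response.

```python
import numpy as np

def d2_gaussian_kernel(sigma):
    """Sampled second derivative of a 1-D gaussian (illustrative retinal filter)."""
    radius = int(4 * sigma)
    x = np.arange(-radius, radius + 1, dtype=float)
    k = (x**2 / sigma**4 - 1.0 / sigma**2) * np.exp(-x**2 / (2.0 * sigma**2))
    # Assumption: subtract the mean so the kernel sums to zero exactly.
    return k - k.mean()

def retinal_filter(profile, sigma=2.0):
    """Convolve an intensity profile with the kernel, replicating border values."""
    k = d2_gaussian_kernel(sigma)
    radius = len(k) // 2
    padded = np.pad(np.asarray(profile, dtype=float), radius, mode="edge")
    return np.convolve(padded, k, mode="valid")

def zero_crossings(response, threshold=1e-3):
    """Return indices i such that a zero-crossing lies between samples i and i+1.

    The positive part of the response plays the role of on-center units and the
    negative part that of off-center units; a crossing is signalled only when an
    on unit and a neighbouring off unit are active together (the "and" operation).
    """
    on_active = np.maximum(response, 0.0) > threshold
    off_active = np.maximum(-response, 0.0) > threshold
    hits = (on_active[:-1] & off_active[1:]) | (off_active[:-1] & on_active[1:])
    return [int(i) for i in np.flatnonzero(hits)]

def motion_direction(resp_t0, resp_t1, i):
    """Sign of motion at the crossing between i and i+1: +1 rightward, -1 leftward.

    For a profile translating rigidly, the temporal change equals minus the
    velocity times the spatial slope, so velocity = -(temporal change) / slope.
    """
    slope = resp_t0[i + 1] - resp_t0[i]
    dt = 0.5 * ((resp_t1[i] - resp_t0[i]) + (resp_t1[i + 1] - resp_t0[i + 1]))
    return int(np.sign(-dt / slope))

# A step edge at sample 50, then the same edge shifted one sample either way.
positions = np.arange(100)
edge = np.where(positions >= 50, 1.0, 0.0)
r0 = retinal_filter(edge)
r_right = retinal_filter(np.where(positions >= 51, 1.0, 0.0))
r_left = retinal_filter(np.where(positions >= 49, 1.0, 0.0))
crossings = zero_crossings(r0)
directions = [motion_direction(r0, r_right, i) for i in crossings]
```

The ratio computed inside `motion_direction` is also a rough speed estimate, in the sense of comparing the slope of the zero-crossing with the rate of temporal change; the sketch keeps only its sign.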

Acknowledgment: I wish to thank E. Hildreth and K. Stevens for their invaluable help.

Campbell, F.W. & Robson, J.G. Application of Fourier analysis to the visibility of gratings. J. Physiol. (London) 197, 551-556.
Dreher, B. & Sanderson, K.J. 1973 Receptive field analysis: responses to moving visual contours by single lateral geniculate neurons in the cat. J. Physiol. Lond. 234, 95-118.
Enroth-Cugell, C. & Robson, J.G. 1966 The contrast sensitivity of retinal ganglion cells of the cat. J. Physiol. (Lond.) 187, 517-
Grimson, W.E.L. 1981 A computer implementation of a theory of human stereo vision. Phil. Trans. Roy. Soc. B. 292 (1058).
Hubel, D.H. & Wiesel, T.N. 1962 Receptive fields, binocular interaction, and functional architecture in the cat's visual cortex. J. Physiol. London. 160, 106-154.
Hubel, D.H. & Wiesel, T.N. 1968 Receptive fields and functional architecture of monkey striate cortex. J. Physiol. London. 195.
Logan, B.F. 1977 Information in the zero-crossings of bandpass signals. Bell Syst. Tech. J. 56, 487-510.
Marr, D. & Poggio, T. 1979 A computational theory of human stereo vision. Proc. Roy. Soc. Lond. B 204, 301-328.
Marr, D., Poggio, T. & Ullman, S. 1979 Bandpass channels, zero-crossings, and early visual information processing. J. Opt. Soc. Am., 69(6), 914-916.
Marr, D. & Hildreth, E. 1980 Theory of edge detection. Proc. R. Soc. Lond. B. 187-217.
Marr, D. & Ullman, S. 1981 Directional selectivity and its use in early visual processing. Proc. Roy. Soc. Lond. B. 211, 151-
Rodieck, R.W. & Stone, J. 1965 Analysis of receptive fields of cat retinal ganglion cells. J. Neurophysiol. 28, 833-849.

Fig. 1. The same image at three different resolutions.

Figure 2. Different resolution copies of the original image are obtained by convolving the image with gaussian filters of different sizes.

Figure 3. a. The shape of the second derivative of a gaussian. b. Its Fourier transform.




Figure 4. Examples of zero-crossing representations. First row: the original images. Second row: the images following the convolution with $\nabla^2 G$. Third row: the resulting zero-crossings representations.

Fig. 5. Zero-crossing representations of the same image at three different resolutions.


Figure 6. A step edge before (a) and after the convolution with the retinal operator (b).

Figure 7. A schematic diagram of the basic zero-crossing detector. a. On-center and off-center units are ANDed together. b. A row of such subunits makes the detector orientation-specific. c. With the addition of a time derivative subunit the detector becomes directionally selective.




Figure 8. A comparison between the model and physiological recordings, for on- and off-center X-type units. For details see text.

Figure 11. Contours that cannot be detected by simple intensity-based operations.



Affect, emotion, and other cognitive curiosities

George Mandler
University of California, San Diego

Copyright © 1981 George Mandler

The lure of phenomenocentrism. During the past century - at least since Darwin, Marx and Freud - our concept of reality has undergone changes that have been particularly apparent in the cognitive sciences. A dominant symptom has been the sharp swing of attitudes toward the apparent and convincing reality of phenomenal experience. It is practically impossible not to be overwhelmed by the immediacy of human experience. Philosophers have tended to accept its primacy through the ages and many cognitive scientists still do. We have overcome the perils of ethnocentrism and of anthropocentrism; we believe no longer that either our social values or humankind are the touchstones of social organization or the measure of animal behavior. But we are still phenomenocentric; we accept as basic the surface experiences that we seem to share with our fellow featherless bipeds. Phenomenocentrism has been abandoned in some corners of cognitive science; but many psychologists, philosophers and AI practitioners seem to hold stoutly to the dogma that common human experience provides the basic building blocks of cognition. I want to argue that this kind of commitment to surface structure as the psychologically "real" prevents progress and, in particular, has barred us from any reasonable understanding of human emotion. Any satisfactory comprehension of the structures and processes that generate affect and emotion requires the postulation of deep structure, a willingness to go "beyond phenomenology."

It is the case, of course, that the common experience - our phenomenal world - does provide the major insights that have led us to determine the important problems and some of the answers about human cognition. Psychology has failed when it has ignored these insights. But the importance of common experience, language and understanding in the context of discovery must not color the need to go beyond phenomenology and folk experience when we try to understand the processes, structures and mechanisms that determine, generate and construct that experience.

There are two glaring examples of phenomenocentrism in the fields of affect and emotion. The first, popular through the ages and exemplified by Charles Darwin and his descendants, postulates the categories of emotion taken from the natural language as the fundamental building blocks of human (and lower) emotional life. It is a position that searches for the effects of unanalyzed emotions, posits a limited set of such fundamental emotions, and even seeks their locus in various corners of the human brain. It has variously reified such nicely vague concepts as fear, love, anxiety, joy, lust, dread, and so forth, and so forth.

The other position, beloved of poets and some social psychologists, posits the primacy and noncognitive (i.e., knowledge independent) nature of affective experience. Its advocates assert that feelings and affects occur prior to the registration and/or experience of other aspects of human experience, and in particular prior to "cognitive" events.

Emotion and the categories of natural language. A concern that is related to phenomenocentrism and, like it, a hangover from the 19th century is exemplified in the penchant to take seriously the implicit claim that some natural language categories define a well bounded, precise set of phenomena. Emotion is one of these categories. To ask "what is (an) emotion?" - as William James did and as some cognitive scientists still do - assumes that the vagueness and redundancy of natural language is suspendable. Categories like emotion (just as intelligence, justice, equity, learning, aggression, etc. etc.) have an evolutionary history and current function that do not support the weight of explanatory systems. They are useful, and have developed, as communicative devices in natural discourse. The analysis of these functions is an important enterprise (usually engaged in by our philosophical brethren). However, they fall far short of definitional devices as a first step toward satisfactory explanatory and theoretical ends. At least one can say that to date attempts to develop satisfactory, exhaustive, and scientifically useful definitions (much less explanations) of "EMOTION" have failed.

Again, having inveighed against natural language categories, I turn to them as a starting point. The categories of our common experience are, of course, collections of events (or objects) that do have a vague common core. That common core is - as in the case of the phenomenal experiences - an indispensable starting point for serious investigation. In the case of affect and emotion, there are apparently two aspects that characterize the collection. Dictionaries tell us that emotions refer to "vehement, excited mental states," that they involve "agitation, disturbance of mind, feeling, passion." Affects are mental states that involve "desires, intentions," and "inward dispositions" and "intent." During its early history, and to some extent in its modern usage, the term "affect" also invoked the same kind of physical referent that the emotions do. What the common concepts of emotion and affect seem to have in common is a state of physi(ologi)cal excitation or arousal. What apparently differentiates the various affects or emotions are desires, intentions, and values.

Looking for emotion's deep structure. If we now turn to the problem of arriving at a program of theory or research, we need to postulate a system that constructs or generates some subset of these emotional phenomena. I shall defend one version of such a theory, but the main thrust of my argument is that some kind of theory (deep structure) needs to be developed that generates so-called emotional behavior and experience. Psychologists have variously emphasized the agitation/arousal dimension or the intention/value aspect. Most of the papers at this symposium address the latter problem, i.e., the specification of the cognitive structure of emotions. I shall outline a point of view which is equally concerned with the arousal and evaluative aspects of the generative system, though I do not go into the kinds of details that the cognitive components require, and that are exemplified by the other papers of the symposium.

A point of view. My concern is specifically with the conscious experience of emotion. As a consequence, I have been concerned recently with the construction of conscious experience in general. There have been a number of recent suggestions, notably by A.J. Marcel (1981), that stress the constructive nature of consciousness, such that a particular conscious state is seen as constructed out of two or more activated schemas that produce a phenomenal unity that apparently conforms to the intentions of the individual and the requirements of the task and situation. The constructive approach to consciousness is ideally suited to accommodate the notion that conscious experience of emotion concatenates both evaluative
cognitions and autonomic arousal. Thus, the phenomenal emotional experience is not some additive result of arousal and evaluation, but rather the schemas activated by arousal and evaluation are used in constructing the phenomenally unified emotional experience. Its intensity will indeed be related to the degree of arousal and its specific quality will depend on a complex evaluative cognitive event, but the two ingredients are experienced as a single emotion, just as eggs, milk, and sugar may be experienced as custard. This approach also accommodates the fact that there exist experiences of "cold" emotions, of evaluative cognitions without arousal, and of unemotional arousal, of autonomic arousal without the cognitions that provide emotional qualities.

The arousal component of emotional experience can be ascribed primarily to peripheral autonomic nervous system (ANS) activity. Whereas there is some evidence that what is most efficiently registered is some general level of ANS activity (heart rate, blood pressure, gastric motility etc.), we also know that there are large individual differences in the patterning of the various autonomic indicators. It may well be the case that registration of peripheral arousal will, in the individual case, be governed by different patterns and may, in some cases, be driven by a single channel (such as heart rate activity). For the present, the sufficient and necessary conditions for the occurrence of autonomic arousal are not adequately known. To a large extent we still rely on lists of "elicitors" which are of varying degree of utility (e.g., tissue injury, stress, surprise, threat, emergency reaction, etc. etc.). I have suggested that the interruption of ongoing action, the discrepancy between expectation and evidence, and similar instances of incongruity between organisms' schemas and the evidence from the environment, are responsible for a large subset of the occurrences of ANS arousal. Such a hypothesis not only is consistent with the homeostatic view of the ANS, but also responds to adaptive, evolutionary functions of the autonomic system in general. Whenever an expectation is violated or a plan kept from being carried out (either in thought or action) an interruption (discrepancy) occurs which leads to ANS arousal. It is important to note that pleasant as well as unpleasant experiences are captured by this approach - unexpected, desirable events generate arousal just as unexpected, noxious events.

What is the nature of evaluative cognitions? In the first place, I suggest that there are three sources of value that influence the quality of an emotional experience. Our evaluations may be based on innate, prewired values - such as the preference for certain temperature ranges, the avoidance of looming objects, the preference for sweet over bitter tastes; or our values may be culturally predicated - we are "told" what is edible, lovable, drinkable, without ever having had direct interactions with the objects in question that would direct our values; and finally we may make valuative judgments that are determined by the structure of the valued event - or rather we base our judgment on some comparison between the event and some existing schema. It is the latter which I find most challenging; what are the structures that determine judgments of beauty, ugliness, preference or rejection, and which in turn determine emotions of joy and disgust, liking and disliking? There has been some reasonable amount of work done on such staples as anxiety, fear, and grief, but much analysis is still to come - some of it presented at this symposium. I have taken one step in that direction in trying to show how one of the more primitive evaluative judgments, that of preference arising out of the sense of familiarity, is related to the congruity between expectations (schemas) and evidence in the world.

In general, though, I would argue that much of the valuative cognitions that contribute to emotional quality deal with the internal structure of events rather than with the presence or absence of features or attributes. Thus, the sense of loss that leads to grief deals with relationships and not with the specific characteristics of the lost person. Similarly beauty and ugliness deal with structural relationships, as do the cognitions that underlie jealousy and even fear. That practically canonic emotion - anxiety - apparently has a cognitive basis in the perceived absence of structure, not in any definable feature of the anxiety arousing situations.

Emotion and cognitive science. In contrast to certain speculations to the contrary (e.g., Zajonc, 1980), these arguments suggest that evaluative cognitions (preferences, likings, aversions etc.) are relatively complex cognitive events, certainly involving more processing than simple definitional or featural judgments. We have recently collected some data that support this argument; simple impressionistic judgments of liking (the simplest evaluative judgment) are slower than simple categorical judgments, with the effect becoming rather large for familiar objects. Thus it takes longer to process the information needed for simple valuation than for simple categorization, exactly what we would expect if the former involves processing of internal structural relationships, while categorization may proceed on more simple presence/absence judgments about features or attributes.

The argument that affect or emotion is prior to or independent of cognitions frequently appeals to the phenomenal evidence that we are often conscious of valuations before we are aware of the details of the event that is being valued. Even if this particular kind of phenomenocentric assertion is confirmed by some future analyses, it does not say anything about cognition and affect, but it does address the nature of consciousness and the kinds of conscious constructions involved in affect and in other kinds of experiences. That analysis is beyond the scope of this presentation.

Cognitive scientists have, often for good reason, been accused of being scientific imperialists. The old division of the world into cognition, conation, and will has been destroyed by raiding parties that have penetrated deep into (undefended) territory previously considered noncognitive. To the extent that such aggression has been justified, it has been based only in part on the claim that all of experience and behavior deals with knowledge. More important may have been the fact that the new theory-rich cognitive science has been willing to take on all kinds of problems in terms of an information processing organism (animate or otherwise). For the time being, I have left aside the "cognitive" nature of autonomic arousal. However, in the spirit of cognitive imperialism one should be concerned with the kind of structural representations that would be useful theoretically and that would lead to a better understanding of the initiation, perception, and conscious construction of peripheral autonomic arousal. The neurosciences are charter members of the cognitive sciences; maybe it is time to tell the peripheral physiologists to look to their borders.


Mandler, G. Mind and emotion. New York: Wiley, 1975.

Mandler, G. The structure of value: Accounting for taste. Technical Report No. 101, Center for Human Information Processing, University of California, San Diego, May 1981.

Marcel, A.J. Conscious and unconscious perception: II. An approach to consciousness. Cognitive Psychology, 1981, in press.

Zajonc, R.B. Feeling and thinking: Preferences need no inferences. American Psychologist, 1980, 35.

Affect and Memory Representation

Wendy G. Lehnert

Department of Computer Science

Yale University
1. Affective Knowledge Structures and Computers

Many philosophers, psychologists, and A.I. workers have taken various positions on the issue of machines and emotion. Some argue that a computer can never "experience" human emotions in any significant sense because it just doesn't make sense to attribute consciousness to an inorganic and programmed system [Gunderson 1971, Puccetti 1968, Scriven 1960, Ziff 1959]. Others argue that our subjective sense of emotional experience is too "intuitive" and ill-defined a candidate for computational modelling [Dreyfus 1972, Weizenbaum 1976]. Still others argue that emotion will be a natural and necessary consequence of intelligent information processing, an inevitable side-effect of intelligence [Kenny 1963, Simon 1967, Doyle 1980, Sloman 1981]. And then there are always "rational purists" who consider emotional experience totally irrelevant to reasoning processes and therefore of no consequence to artificial intelligence whatsoever.

Many lines of reasoning have been invoked to secure these various positions [Dennett 1976], although most of the arguments are conducted with a distinctively philosophical tone. It is ironic that the most passionate advocates in these debates rarely argue from first-hand experience with computer simulations. Why does A.I. seem to be so silent on the topic of emotion and computers? I cannot speak for everyone in the field, but I would guess that a lot of us prefer to avoid the whole morass because we believe that the questions being answered are not the questions we should be asking.

A computer can have knowledge of human emotionality in the same sense that it can have knowledge of mass spectroscopy, medical diagnostic techniques, or payroll data. Computers do not have to "be" emoting entities to use this knowledge any more than they have to "be" chemists, physicians, or bureaucrats to use knowledge specific to those professions. If it is difficult to give knowledge of emotions to computers, it is only difficult for the same reason that a thousand other topics are difficult for computers: people do not have a rigorous understanding of their intuitive knowledge in terms of information processing requirements. We need to develop (1) symbolic systems of internal representation, and (2) processing strategies to manipulate these symbolic structures. These two requirements are universal to all A.I. efforts, and the difficulties involved are not significantly amplified when the knowledge to be encoded is knowledge of emotionality.

When we apply knowledge of emotions to an information processing task, we can evaluate our expertise on emotions by evaluating the overall effectiveness of the larger information processing system. What experience has A.I. had with affective knowledge structures? Our experience is admittedly limited, but it is not totally

An ambitious implementation of belief system maintenance was attempted by Kenneth Colby in the early 60's [Colby 1967; 1973]. While Colby is best known for PARRY, the paranoid conversationalist [Colby 1975], his earlier work was aimed at a more general simulation of neurotic thought processes [Colby 1963; 1965]. Colby was specifically interested in simulations of Freudian defense mechanisms when they surface in clinical dialogues between psychiatrists and patients. His work involved affective manipulations, but only in a very superficial sense. Colby utilized "emotion monitors" which were numerical parameters with names like "excitation," "self-esteem," "danger," "well-being," and "pleasure." While Colby's simulations were never intended to implement a complete system of affective representation, he nevertheless found it necessary to maintain and manipulate these numeric parameters. For example, the "excitation" monitor reflected the overall anxiety of the system - a factor that any psychotherapist would want to take into account. Whether or not someone's anxiety level can be adequately represented on a scale of 1 to 10 is another question.

It is inevitable that belief system manipulations manifest themselves most naturally in interactive conversation. Colby was forced into conversational task orientations when he began his work on belief systems, and this eventually drew him toward PARRY. PARRY also utilized numeric parameters for "anger," "fear," and "mistrust," - a somewhat more narrow set than was needed for generally neurotic simulations. While PARRY is the only conversational system that I know of which has implemented an affective component, it is clear that any conversational system would require affective manipulations if it was designed to simulate emotional responses [Schank and Lehnert 1979].

Although Colby was primarily interested in thought processes, his simulations became thoroughly mired in language processing difficulties. Colby's sentence processing techniques relied on lexical pattern matching routines, and his internal memory representations were lexically-oriented as well. These devices were ineffective substitutes for natural language processing strategies, and Colby's models were significantly hampered by inadequate representational techniques [Boden 1977]. Similar impediments were encountered by other researchers who tackled belief systems early on [see e.g. Abelson 1973], so it comes as no surprise to see that the most recent work on belief systems is thoroughly grounded in theories of natural language processing and internal memory representation [Carbonell 1978]. Whatever one's ultimate research goal (models of belief systems, memory organization, etc.) all dialog simulations are primarily natural language processing systems, and any attempt to circumvent this fact is destined to fail. In fact,
non-existent. To date, three distinct task when the research goal is a model cf human emotion,
orientations have touched on affective maiarulations
of one sort or Enother.
This work wiis supported in part by the Advanced
]) belief systMi ir.airtai Research Projects Agency under contract
2) convevsat i cnal s iimui at i ens r^GOOU-75-c-llll and ir part by the National Science
3) niirrative text processii-^ Foundation under contract IST79J8463.

it is far better to embrace the challenges of natural language
processing with open arms: a natural language processing project
provides the only naturalistic and realistic laboratory for theories of
affective memory representation.

For example, narrative texts provide a rich proving ground for theories
of affect - one need only look at stories that involve emotional
reactions and emotional behavior. Yet affective knowledge representation
has not been systematically tackled within natural language processing
programs until very recently. The remainder of this paper will outline
some of my own experiences in trying to implement affective knowledge
structures in a narrative text processing system, BORIS.

The BORIS system currently utilizes a limited system of affective
representation and affect-related inference mechanisms. In addition to
these representational techniques for affects per se, the recently
proposed TAU knowledge structure (Thematic Affect Unit) contributes to
BORIS's affective inference capabilities as well [Dyer 1981]. Since
descriptions of BORIS's processing techniques can be found elsewhere
[Dyer and Lehnert 1980, Lehnert, Dyer, Johnson, Yang and Harley 1981],
we will not go into a description of BORIS here - we will instead take
this opportunity to explain how a computational model of affect might
contribute to other models of affect that are not computational in
nature.

To begin with, a computational model must address a specific set of
problems to be solved. In language processing, affective manipulations
are needed for four general inference situations:

(1) Given an event description, infer an affective antecedent:

    "John took a Valium." ( => John is upset)

(2) Given an event description, infer an affective consequent:

    "John got a big raise." ( => John is happy)

(3) Given an affective state, infer its likely antecedent:

    "After the hurricane, John was depressed."
    (John suffered a loss => John is depressed)

(4) Given an affective state, infer its likely consequence:

    "John was so happy about his royalty check, he made reservations
    at Reno Sweeny's."
    ( ... => John intends to celebrate)

In many cases, we must combine two or more of the above inference types
to make sense of an implicit causality:

    "After the hurricane, John saw a shrink."

To see how this process-orientation differs from a purely psychological
approach to the problem, we will look at some inference problems in a
narrative text, and see how far a non-process-oriented model can go in
helping a system like BORIS. When we first began to look at affect in
BORIS, we were greatly inspired by Ira Roseman's model for representing
affective states [Roseman 1979]. There are of course other approaches to
affect [deRivera 1977, Izard 1977], but we will not attempt to survey
all the relevant proposals here. Readers who are familiar with
alternative systems can judge for themselves whether similar
troublespots would arise in trying to implement another system.

2. Conceptual Decomposition for Affective States

In the Roseman system, emotional states are represented by decomposition
into five dimensions. Four of the dimensions assume positive and
negative fields, while the fifth assumes a three-valued spectrum:

Five Dimensions of Affect
1) Motivational Status (desirability) [+,-]
2) Situational Status (attainment) [+,-]
3) Probability Status (certainty) [+,-]
4) Legitimacy Status (deservedness) [+,-]
5) Agency Status [self, other, circumstantial]

When an event is mapped into its appropriate place on each of the five
spectrums, we can predict emotional reactions to the event. For example,
(a) wanting a ticket to a sold-out Grateful Dead Concert describes a
mental state with a positive motivation (wanting it) and a negative
situation (not having it); (b) winning a ticket to a Grateful Dead
Concert is an event with a positive motivation (wanting it) and a
positive situation (having it); (c) losing the ticket has negative
motivation (not wanting to lose it) and positive situation (having lost
it); and (d) finding it again involves a negative motivation (not
wanting it lost) with a negative situation (not having it lost). If all
of this happens circumstantially, we expect to see (a) sorrow, (b) joy,
(c) distress, and (d) relief.

Using all five dimensions, Roseman's system differentiates 13 primary
emotions. These are listed below with a vector encoding of the five
dimensions as listed above. For example, (+ + + - S) corresponds to a
positive motivation, positive situation, positive probability, negative
legitimacy, and self-agency. An "*" indicates that the corresponding
dimension can assume any value.

PRIMARY EMOTIONS

 M S P L A                     M S P L A

(+ + + * C)  JOY              (+ + * * O)  LIKING
(+ + - * C)  HOPE             (- - * * O)  LIKING
(- - - * C)  HOPE             (+ - * - O)  [label illegible]
(- - + * C)  RELIEF           (- + * - O)  [label illegible]
(- + + - C)  DISTRESS         (- + * + O)  [label illegible]
(- + * + C)  FRUSTRATION      (+ - * + O)  [label illegible]
(+ - * + C)  FRUSTRATION      (+ - - * C)  FEAR
(+ - + - C)  SORROW           (- + - * C)  FEAR
(+ + * + S)  PRIDE            (+ - * + S)  REGRET
(- - * + S)  PRIDE            (- + * + S)  REGRET
(* * * - S)  GUILT

[The second HOPE row, the two FEAR rows, and the agency column of the
other-agency rows are partly illegible in the source; the signs shown
there are inferred from the pattern of the neighboring rows.]

Many lexical descriptions of emotionality are used to reference more
than one conceptual configuration. For example, John could "regret"
flunking a test (- + + + S). Alternatively, if John got a high B on the
test, he might "regret" not getting an A (+ - + + S). These are two
distinct senses of regret: we can regret what happened, and we can
regret what didn't happen. People who dwell on (- + + + S)
configurations kick themselves for past mistakes while people who dwell
on situations
involving (+ - + + S) are melancholy dreamers. We can describe either
personality in terms of past regrets, but important conceptual
distinctions are lurking beneath these words.

Lexical ambiguities at the conceptual level make it difficult to
describe Roseman's 13 primary emotions to everyone's satisfaction. For
example, one could argue that "liking" is not an emotion at all but an
attitude. The appropriate emotion for (+ + * * O) and (- - * * O) is
really one of gratitude. Or perhaps "distress" should be called
"discomfort." It is instructive to engage in this sort of criticism as
an intuitive exercise, but a better way to test Roseman's system is with
a computer implementation.

3. Implementing Affect

When we tried to implement Roseman's system in BORIS, we ran into some
interesting difficulties. To get a sense of these, we will look at a
sample text that BORIS processes, highlighting some problem areas
encountered.

****************************************************
A BORIS Narrative

Richard hadn't heard from his old roommate Paul for years. Paul had
loaned Richard money which was never paid back, but now he had no idea
where to find his old friend. When a letter finally arrived from San
Francisco, Richard was anxious to find out how Paul was.

Unfortunately, the news was not good. Paul's wife Sarah wanted a
divorce. She also wanted the car, the house, the children, and alimony.
Paul wanted the divorce, but he didn't want to see Sarah take everything
he had. His salary from the state school system was very small. Not
knowing who to turn to, he was hoping for a favor from the only lawyer
he knew. Paul gave his home phone number in case Richard felt he could
help.

Richard eagerly picked up the phone and dialed. After a brief
conversation, Paul agreed to have lunch with him the next day. He
sounded extremely relieved and grateful.

The next day, as Richard was driving to the restaurant, he barely
avoided hitting an old man on the street. He felt extremely upset by the
incident, and had three drinks at the restaurant. When Paul arrived,
Richard was fairly drunk. After the food came, Richard spilled a cup of
coffee on Paul. Paul seemed very annoyed by this so Richard offered to
drive him home for a change of clothes.

When Paul walked into the bedroom and found Sarah with another man, he
nearly had a heart attack. Then he realized what a blessing it was. With
Richard there as a witness, Sarah's divorce case was shot. Richard
congratulated Paul and suggested that they celebrate at dinner. Paul was
eager to comply.
****************************************************

There are a number of important affect-related inferences in this story.
For example, we should infer that Richard felt bad about spilling his
coffee on Paul, and his offer to drive Paul home was motivated (at least
in part) by a desire to alleviate guilt. In the next sentence, when Paul
finds Sarah, we should not assume that Paul suffered a cardiac arrest;
he is just very surprised. We must also understand why it made sense for
Richard to congratulate Paul and suggest a celebration. What did Paul
have to celebrate? Adulterous mates are not normally greeted with such
enthusiasm, so the celebration must be causally connected to something
else that Paul should feel good about. Notice that if Richard had
expressed his heartfelt condolences to Paul instead of congratulating
him, this would also make sense. Paul's affective state is complex and
must be fully understood to accommodate these various possibilities.

To make affective inferences, BORIS needed to interpret events and
states from the story in terms of Roseman's five affect dimensions.
"Motivation" and "situation" were relatively easy to recognize by
relying primarily on goal states. But the three remaining dimensions
proved to be trickier than expected. We will look at some difficulties
in "agency" recognition, although similar illustrations could have been
chosen from "certainty" and "legitimacy" as well.

Initially, we thought that the agent for an event would simply be the
physical actor of the event [Schank 1975]. We quickly discovered
otherwise. For example, in a question answering task, experimental
subjects indicated that Richard was happy to receive the letter from
Paul. He wasn't grateful, and he didn't like Paul any more than before;
he was simply happy. In order to infer that Richard was happy to get
Paul's letter, we have to ascribe a circumstantial agent when Richard
gets the letter. If the letter's arrival was encoded with other-agency,
then Richard would either like Paul or feel grateful to Paul for getting
the letter. Simple joy can only come from circumstantial agency. But the
letter's arrival is encoded as an MTRANS event with actor = Paul (Paul
sent the letter). If agency is not a function of an event's actor, what
is it? BORIS was (and still is) stymied by the agency problem.

It seems that agency is a function of actors but more specifically,
intentional actors. Notice how the affective inference changes if
Richard believes that Paul sent the letter just because Paul wanted to
make Richard happy. Now it is much more reasonable for Richard to like
Paul or feel grateful to Paul for sending the letter. If X knowingly
does Z to make Y happy, and Z succeeds in making Y happy, then Y will
like X for doing it. If Paul does something intending to make Richard
happy, then Richard experiences the event with other-agency. But if Paul
does something which only makes Richard happy incidentally, then Richard
experiences the event with circumstantial-agency. Knowledge of an
actor's ultimate intentions is needed to establish affective agency for
inference purposes.

In addition to intentionality, affective agency can be influenced by an
actor's degree of social responsibility. For example, it makes sense
that Paul got annoyed when Richard spilled coffee on him. But what is
annoyance? Annoyance can be a variant of anger or dislike (Paul was
annoyed with Richard), both of which require other-agency. Richard
didn't intend to spill the coffee, but he was nevertheless responsible
for the event (albeit innocently), and this responsibility gives us
other-agency. Annoyance is even more ambiguous in the sense that it may
also describe frustration, which involves circumstantial-agency: it is
rotten luck to have someone spill coffee on you. If Paul is upset, but
not upset with Richard specifically, then his annoyance is one of pure
frustration.
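
The five-dimension encoding and the agency discussion above can be
illustrated with a minimal Python sketch. This is not BORIS code: the
names AffectVector, Agency, and infer_agency, and the two boolean flags,
are hypothetical, and treating an actor who is also the experiencer as
self-agency is an assumption the text does not spell out. Deciding the
values of the two flags is exactly the hard part described above; the
sketch only shows how agency would follow once they are known.

```python
from dataclasses import dataclass
from enum import Enum


class Agency(Enum):
    SELF = "S"
    OTHER = "O"
    CIRCUMSTANTIAL = "C"


@dataclass(frozen=True)
class AffectVector:
    """Roseman-style vector: four signed dimensions plus agency."""
    motivation: str    # "+" or "-"
    situation: str     # "+" or "-"
    probability: str   # "+" or "-"
    legitimacy: str    # "+" or "-"
    agency: Agency

    def __str__(self) -> str:
        return (f"({self.motivation} {self.situation} {self.probability} "
                f"{self.legitimacy} {self.agency.value})")


def infer_agency(actor: str, experiencer: str,
                 intended_affect: bool, held_responsible: bool) -> Agency:
    """Pick an agency value for an event as seen by the experiencer.

    intended_affect:  the actor acted in order to produce the affect.
    held_responsible: the experiencer holds the actor socially
                      responsible for the outcome.
    Both flags are hypothetical inputs supplied by the caller.
    """
    if actor == experiencer:
        return Agency.SELF  # assumption: own actions count as self-agency
    if intended_affect or held_responsible:
        return Agency.OTHER
    # The physical actor alone is not enough to establish other-agency.
    return Agency.CIRCUMSTANTIAL


# Paul sent the letter, but not in order to make Richard happy.
letter = infer_agency("Paul", "Richard",
                      intended_affect=False, held_responsible=False)

# Richard did not mean to spill the coffee, but he is held responsible.
coffee = infer_agency("Richard", "Paul",
                      intended_affect=False, held_responsible=True)

# John regretting that he flunked a test, as encoded in Section 2.
regret = AffectVector("-", "+", "+", "+", Agency.SELF)
```

With these inputs the letter comes out circumstantial (simple joy) and
the spilled coffee comes out as other-agency (annoyance with Richard),
matching the readings given in the text.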

Since "annoyed" is ambiguous, and this particular example could go
either way, it is useful to look for limiting cases which force one
interpretation over another. For example, it seems reasonable that Paul
might be more annoyed with Richard for the accident since Richard was
drunk. If Richard were sober, he would somehow be less at fault. Suppose
a frail little old lady is carrying a cup of coffee, and as she passes
by Paul she collapses from a heart attack. Do we expect Paul to be angry
at the old lady for spilling coffee on him? Not likely. Now suppose a
boisterous drunk lurches past Paul and drops a drink on him. Do we
expect Paul to be angry at the drunk? Sure. Neither event was intended,
but a drunk is more responsible for his actions than a heart attack
victim. People choose to get drunk, but they don't choose to have heart
attacks. The element of free will operating in a drunk renders him more
responsible for his accidents: a drunk chooses to be accident-prone.
Since BORIS has no heuristics for assessing relative degrees of
responsibility, BORIS defaults to circumstantial agency and therefore
interprets Paul's annoyance as one of pure frustration. This is not
altogether right, but a more correct interpretation requires an
assessment mechanism for social responsibility.

One final problem with agency involves events that cause complex affect
states. When Paul catches Sarah in their bedroom with another man, he
witnesses and reacts to an event involving two actors. The event is
assumed to be intentional (we are given no reason to interpret the
bedroom activities as a rape) and at least one of the lovers must be
responsible for it. So it seems that we have a clear-cut candidate for
other-agency. Since this event will save Paul from a nasty court battle
and divorce settlement, it is desirable from Paul's perspective. Sarah's
activity can therefore be interpreted by Paul as a desirable, positively
attained, certain, and illegitimate (she's violating their marital
contract) event of other-agency (+ + + - O). But this configuration
brings us to the improbable prediction that Paul will like Sarah and her
lover, or feel grateful to them for engaging in their illicit activity.

The difficulty with this example is the complexity of Paul's emotional
state. He may be happy about the settlement implications, but he is
probably very unhappy about his territorial rights. Even if he doesn't
feel possessive about Sarah (Paul did say that he also wanted a
divorce), he has a right to feel put out by a stranger in his bedroom,
to say nothing of his bed. His privacy is surely being violated on at
least one level, and we are assured of his negative reaction when we are
told that he "almost had a heart attack." So Paul's reaction is mixed:
it has a strong negative component (- + + - O) and a more far-sighted
positive component (+ + + - O) as well. This explains why it would make
sense for Richard to either express sympathy or offer congratulations.
It seems most appropriate to first offer condolences and then
congratulations, but either one can be understood as a reasonable
reaction on Richard's part.

Special heuristics must be invoked for complex emotional states, and
higher level knowledge structures will be needed to handle inferences in
these cases. For example, the representational system of plot units
(which grew out of our experience with BORIS's affect analysis),
includes a special structure called "hidden blessing" to handle
situations like Paul's reaction to Sarah [Lehnert 1980, 1981a, 1981b].
The hidden blessing plot unit encodes any event that causes an initial
negative reaction which later yields to a dominantly positive emotion. A
similar "mixed blessing" plot unit handles cases where the initial
reaction is positive, but a negative emotion follows.

In general, a plot unit is a configuration of three kinds of affect
states: (1) "M" mental states with neutral affect, (2) "+" events that
cause positive affects, and (3) "-" events that cause negative affects.
Each affect state is interpreted with respect to a specific character,
although plot units may contain multiple affect states that involve more
than one character. For example, the retaliation unit involves two
characters and five affect states:

[Figure: retaliation plot unit for characters X and Y; diagram not
recoverable from the source]

This configuration tells us that X did something (?) which caused a
negative reaction in Y (-). This negative event motivated a mental state
(M) in Y, which was subsequently actualized by a positive event (+) for
Y, and a negative event (-) for X. In other words, X did something that
distressed Y, so Y retaliated by doing something to distress X. The
vertical and diagonal links in this diagram describe various causal
relationships between affect states within characters and across
characters.

Affect state maps are constructed for each character in a story, as a
way of tracking that character's emotional ups (+'s) and downs (-'s),
and specific plot units are recognized when the linkages across affect
states indicate that a given plot unit configuration has been
encountered. If X were to get back at Y for Y's retaliation, we would
have two instances of the retaliation plot unit sharing two common
affect states:

[Figure: affect state map with two overlapping retaliation plot units
and their shared affect states; diagram not recoverable from the source]

A plot unit graph for a story can be generated by creating a graph node
for each instantiated plot unit, and placing arcs between all pairs of
plot units that share at least one common affect state. The above affect
state map yields a graph of two nodes connected by an arc, while the
BORIS divorce story involves 24 plot units in a connected graph.

It appears that the connectivity features in a plot unit graph provide a
strong basis for summarization algorithms. When a story's plot unit
graph contains a pivotal node (a node with maximal connectivity), we can
expect a short summary of the story to be based on the conceptual
content of that pivotal node. For example in the BORIS divorce story,
the hidden blessing unit is pivotal, and we can summarize the story by
saying "Paul saved himself from a nasty divorce settlement when he

accidentally found his wife in bed with another man." The hidden
blessing unit encodes Paul's discovery as an event of mixed affect
states, and a synopsis of any hidden blessing has to explain what was
ultimately good about an initially negative event.

We have been experimenting with plot units primarily as a basis for
narrative text summarization [Lehnert 1980, 1981a, 1981b; Lehnert,
Black, and Reiser 1981; Reiser, Lehnert, and Black 1981], but their use
as a predictive knowledge structure for affective inference remains to
be explored. Initial efforts in this direction led to the development of
a slightly higher level of memory representation (Thematic Affect Units)
that relates to adages and fables [Dyer 1981].

4. Conclusions

Our experience with Roseman's affect analysis and BORIS suggests that
affective inferences are dependent on a substantial range of other
inference mechanisms. It is not possible to study problems of affect
without addressing seemingly unrelated problem areas. The current state
of the art in language processing allows us to tackle recognition
techniques involving scripts, goals, plans, interpersonal themes, plot
units, and thematic affect units, all of which can contribute to affect
recognition techniques. But affect analysis can also lead us into
largely uncharted regions of intentionality and social responsibility,
just to name two areas we've discussed.

We have not attempted to compile a list of all the related knowledge
needed to handle affect, because this list is likely to be a
comprehensive list of all knowledge structures used for language
processing. Interestingly enough, there will probably be no knowledge
structures devoted exclusively to affect. One could argue that the
presence of a Roseman-like vector (+ - + + S) within computational
memory constitutes an affect-specific knowledge structure. But this
structure is just a processing artifact: we do not expect to find these
vectors in long or short-term memory representations. If we needed to
explicitly encode "John was angry," we might be reduced to vector
notation, but as soon as this sentence is embedded in a context which
tells us what happened to John and why he feels angry, his anger will
likewise be embedded in some larger structure. TAU's fill this role
already, and plot units (originally called "affect units") also operate
along these lines. For example, vindictive or vengeful feelings can be
readily derived from the retaliation plot unit.

In conclusion, I would say that it is altogether too early to pass
judgement on any specific representational techniques for affects.
Roseman has supplied the computational environment with a valuable
framework in which to work (or perhaps play) and other psychological
theories may also prove to be valuable, although I know of no others
that have been (even partially) implemented. This particular area is a
difficult one for A.I. implementations because it draws on so many other
complex problem areas in memory and cognition.

It was nevertheless valuable to attempt an implementation of Roseman's
model, if only for the sake of the spin-offs that emerged. By
confronting problems of affect representation and affect-related
inferences, we were led to the design of two new knowledge structures
for narrative text analysis. Plot units and thematic affect units were
natural devices for solving affect-oriented processing problems. The
importance of affective knowledge structures for memory representation
deserves further exploration in both the psychological and A.I. research
paradigms. Recent research in both areas has uncovered some provocative
results concerning emotion and memory which pave the way for further
investigations. From the A.I. end of the world, we are seeing how the
use of plot units for narrative text summarization suggests that
emotional states of characters in a story play a far more central role
in high-level memory representation than was previously suspected
[Lehnert 1981a]. At the same time, psychologists are demonstrating how
the moods and attitudes of experimental subjects can dramatically alter
their patterns of memory retrieval [Bower 1981].

So affects appear to play many different roles in human cognition. In
particular, we have seen how knowledge of emotional reactions can be
crucial to various information processing tasks, including narrative
text comprehension. Still, a great deal of work remains to be done if we
are going to fully understand the roles of emotional knowledge and
experience in human information processing. We must turn to psychology
labs and LISP programs for the answers to our questions, putting aside
the more philosophical speculations about machines and emotions. It will
be much easier to resolve our speculative debates in the face of some
solid research, and the quality of our interdisciplinary dialogs will be
greatly enhanced by empirical contributions from psychology and
artificial intelligence.

BIBLIOGRAPHY

Abelson, R.P. (1973). "The Structure of Belief Systems." in Computer
    Models of Thought and Language. (eds: Schank and Colby). San
    Francisco: W.H. Freeman.

Boden, M. (1977). Artificial Intelligence and Natural Man. New York:
    Basic Books.

Bower, G. H. (1981). "Mood and Memory." American Psychologist. Vol. 36,
    No. 2. 129-148.

Carbonell, J. G. (1978). "POLITICS: Automated Ideological Reasoning."
    Cognitive Science. Vol. 2, No. 1. 27-52.

Colby, K. M. (1963). "Computer Simulation of a Neurotic Process." in
    Computer Simulation of Personality: Frontier of Psychological
    Research (eds: Tomkins and Messick). New York: Wiley.

Colby, K. M. (1965). "Computer Simulation of Neurotic Processes." in
    Computers in Biomedical Research. Vol. 1 (eds: Stacy and Waxman).
    New York: Academic Press.

Colby, K. M. (1967). "Computer Simulation and Change in Personal Belief
    Systems." Behavioral Science. 12, 248-253.

Colby, K. M. (1973). "Simulations of Belief Systems." in Computer
    Models of Thought and Language. (eds: Schank and Colby). San
    Francisco: W.H. Freeman.

Colby, K. M. (1975). Artificial Paranoia. New York: Pergamon.

Dennett, D. C. (1978). Brainstorms. Montgomery, Vermont: Bradford Books.

deRivera, J. (1977). A Structural Theory of the Emotions. New York:
    International Universities Press.

Doyle, J. (1980). "A Model for Deliberation, Action, and
    Introspection." AI-TR.581, MIT Artificial Intelligence Laboratory.

Dreyfus, H. L. (1972). What Computers Can't Do: A Critique of
    Artificial Reason. New York: Harper and Row.

Dyer, M. (1981). "The Role of TAU's in Narratives." Proceedings of the
    Third Annual Conference of the Cognitive Science Society. Berkeley,
    Calif.

Dyer, M. and Lehnert, W. (1980). "Organization and Search Processes for
    Narratives." Department of Computer Science Research Report #175.
    Yale University, New Haven, Conn.

Gunderson, K. (1971). Mentality and Machines. New York: Doubleday.

Izard, C. E. (1977). Human Emotions. New York: Plenum Press.

Kenny, A. (1963). Action, Emotion and Will. London: Routledge and Kegan
    Paul.

Lehnert, W. (1980). "Narrative Text Summarization." Proceedings of the
    First Annual National Conference on Artificial Intelligence.
    Stanford, Calif.

Lehnert, W. (1981a). "Plot Units and Narrative Summarization."
    Cognitive Science, (in press).

Lehnert, W. (1981b). "Plot Units: A Narrative Summarization Strategy."
    in Strategies for Natural Language Processing. (eds: Lehnert and
    Ringle). New Jersey: Lawrence Erlbaum (in press).

Lehnert, W., Black, J., and Reiser, B. (1981). "Summarizing
    Narratives." Proceedings of the Seventh International Joint
    Conference on Artificial Intelligence. Vancouver, British Columbia.

Lehnert, W., Dyer, M., Johnson, P., Yang, C.J., and Harley, S. (1981).
    "BORIS - An Experiment in In-Depth Understanding of Narratives."
    Department of Computer Science Research Report #188. Yale
    University, New Haven, Conn.

Puccetti, R. (1968). Persons: A Study of Possible Moral Agents in the
    Universe. London:

Reiser, B., Lehnert, W., and Black, J. (1981). "Recognizing Thematic
    Units in Narratives." Proceedings of the Third Annual Conference of
    the Cognitive Science Society. Berkeley, Calif.

Roseman, I. (1979). "Cognitive aspects of emotion and emotional
    behavior." presented at the meeting of the American Psychological
    Association, New York. (unpublished manuscript. Department of
    Psychology, Yale University, New Haven, Conn.)

Schank, R. (1975). Conceptual Information Processing. Amsterdam:
    North-Holland.

Schank, R. and Lehnert, W. (1979). "The Conceptual Content of
    Conversation." Proceedings of the Sixth International Joint
    Conference on Artificial Intelligence. Tokyo.

Scriven, M. (1960). "The compleat robot: a prolegomena to
    androidology." in Dimensions of Mind: A Symposium. (Ed: S. Hook).
    New York: New York University Press.

Simon, H. A. (1967). "Motivational and Emotional Controls of
    Cognition." (reprinted in Models of Thought. Yale University Press.
    1979)

Sloman, A. (1981). "Why Robots Will Have Emotions." Proceedings of the
    Seventh International Joint Conference on Artificial Intelligence.
    Vancouver, British Columbia.

Weizenbaum, J. (1976). Computer Power and Human Reason. San Francisco:
    W.H. Freeman and Company.

Ziff, P. (1959). "The feelings of robots." Analysis. No. 19, 64-68.
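
As an illustration of the plot unit graph construction described in
Section 3 (one node per instantiated plot unit, an arc between any two
units that share an affect state, and a pivotal node of maximal
connectivity), here is a minimal Python sketch. It is not the BORIS
implementation: the function names, the unit names, and the affect
state identifiers s1 to s8 are hypothetical, and ties for maximal
connectivity are simply all returned.

```python
from itertools import combinations


def plot_unit_graph(units: dict[str, set[str]]) -> dict[str, set[str]]:
    """Build an adjacency map over plot units.

    `units` maps a plot unit name to the set of affect state ids it
    spans. Two units are linked when they share at least one state.
    """
    graph: dict[str, set[str]] = {name: set() for name in units}
    for a, b in combinations(units, 2):
        if units[a] & units[b]:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def pivotal_nodes(graph: dict[str, set[str]]) -> list[str]:
    """Return the unit(s) with maximal connectivity, sorted by name."""
    if not graph:
        return []
    best = max(len(neighbors) for neighbors in graph.values())
    return sorted(n for n, nbrs in graph.items() if len(nbrs) == best)


# Two retaliation units of five affect states each, sharing two states
# (hypothetical ids s4 and s5), as in the X-gets-back-at-Y example.
units = {
    "retaliation-1": {"s1", "s2", "s3", "s4", "s5"},
    "retaliation-2": {"s4", "s5", "s6", "s7", "s8"},
}
graph = plot_unit_graph(units)
```

For the two overlapping retaliation units this yields two nodes joined
by a single arc. On a larger story the unit returned by pivotal_nodes
would be the candidate on which a short summary is based.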

but in social science generally. The
SITUATION BASED EMOTION FRAMES AND traditional and still current concern has
THE CULTURAL CONSTRUCTION OF EMOTIONS been with emotion as physical sensation.
Internal feelings are presumed to be the
Catherine Lutz
primary referents of emotion words, and they
are therefore seen as difficult to talk
1. Introduction: Anthropological Approaches about — emotions are, in this cultural
framework, internal, private, pre-verbal
The question for this panel concerns the states. Affect is not accidentally omitted
relevance of models of emotion for general from the classic social science problem
theories of internal representation and termed 'language and thought'.1 Although it
processing. The particular question which I has more recently been acknowledged that one
will address is that of the contributions may think about how one feels and feel about
which an anthropological or comparative how one thinks, the dichotomy of cognition
approach can make to an understanding of and affect remains fundamental even in
emotional representations and emotional attempts to 'bridge' the gap between them.
'tasks'. In particular, I would like to We have value stances, moreover, towards
suggest that we begin by looking at the way the concepts and relations of our theories,
people themselves frame emotional experience and we are ambivalent about emotion. On the
and interactions. Much of this framing one hand, cognition is seen as 'higher' in
structure, moreover, is culturally provided evolutionary and other senses than emotion.
and culturally variable. It can be seen in Lakoff and Johnson (1980) present some
emotion word meanings, in the logic of striking evidence in English metaphors that
emotional response, and in explicit emotion is so viewed. Metaphors which they
ethnotheories of emotion. term 'orientational' indicate that 'good',
Anthropologists have begun to document a 'high status', 'control', and 'rational' are
diversity of theories of mind and self (e.g., 'up', while 'bad', 'low status', 'being
Geertz, 1976; Leenhardt, 1979 (1947); Rosaldo, controlled', and 'emotional' are 'down'
1980; Strauss, 1977; White, 1981). Cultural (1980: 14-17). The mature person ideally
knowledge systems, or 'ethnotheories,' explain controls affect with thought. 'Intelligence',
why, when, and how emotion occurs, and they as that term is commonly defined in American
are embedded in more general cultural theories English, refers to cognitive abilities, with
about the person, internal processes, and 'emotional ability' being somewhat of a
social action. The examination of these contradiction in terms.2
culturally constituted systems of knowledge Emotion is instinctual, inchoate,
can play two important roles in achieving the formless, but at the same time it may
goals of cognitive science. In the first, we also be positively valued. In scientific
encounter ourselves in the other by seeing psychological terms, emotion is motivational
our own knowledge structures contrastingly and causes behavior. It is the life force,
highlighted. This is consonant with cognitive a sacred center. To claim emotion for
science's often stated aim of converting tacit computers, then, is sacrilege. For the above
understandings into explicit ones. Beliefs reasons, attempts at the 'cold' examination
about the self are among the most tacit in any of a 'hot' topic (Abelson, 1963) appear to us,
culture, as it is with them that intrapsychic as Americans, both quixotic and profane.
and social reality are framed and felt. As I would propose that we proceed on the
Abelson (1979) has pointed out, one person's assumption that the terms 'cognition' and
knowledge may appear to another as belief. 'affect' represent, not dual and separate
Our 'knowledge' about emotion may turn out to processes (Zajonc, 1930), but ideal types
be largely belief on cross-cultural placed at the ends of a true continuum of
inspection and comparison. more or less immediate and more or less
This is related to the second role for motivated processing of events or situations.
cross-cultural comparison in cognitive science While emotion is the human potential
which is that we may learn from, as well as for extremely high speed and action-oriented
about, the theories of other systems. For processing, emotions, are culturally
example, the emotion words of other cultural constructed concepts which point to clusters
groups may have different referents or slice of situations typically calling for some kind
up the affective pie in different ways than of action. Many of these situation clusters
do English emotion terms. If one of the aims will be universal while some will be
of cognitive science is to develop an environmentally specific. In the simplest
abstract, logical, and unambiguous language example, one language group may code sudden
for a scientific psychology, the events and dangerous events under the same
ethnosemantics of emotion will be an rubric while another community might
important topic of inquiry. A true science distinguish them. Those situations which
of affect needs a language which is will be universally framed together via
transcultural. emotion words will be those for which
Our most basic Western ideas about what responses -- such as movement towards others
constitutes humanness — ideas which are or flight — are shared and necessitated by
dramatically evident in debates over whether the requisites of human social life. As
computers can "feel emotion" -- form D'Andrade has pointed out, a person who
84 implicit frame for our inquiries. The simply 'thought' coolly about acquiring food
nature of this frame explains why emotion has would have little chance of survival if that
been neglected, not only in cognitive science. thought were easily extinguished and did not
(1930: 16).
Such an affectless person could note that someone was in the process of stealing her car but, in the absence of anger, could be easily called off to play bingo.

Why do human communities need emotion words? It is not simply to name internal states. Emotion words are important primarily as communicators of one's perception of the occurrence of a particularly salient situation frame. The use of emotion words can often involve "backwards inferencing" on the part of listeners about the events which led up to the speaker's statement (White, 1979). In using emotion words, people also communicate the behavior which is likely to follow on their part. As one of the most crucial functions of social groups lies in their organization and regulation of individual behavior, emotion words are necessary for the most efficient and judicious coordination of plans, understanding, and behavior. By clustering situations which share action plans and other dimensions of meaning, emotion words facilitate such coordination on both the intra- and the inter-psychic levels.

2. Emotion Frames

The framing of emotion occurs at many levels. What has been and is here meant by a 'frame'? Interest in the framing of knowledge and social interaction has been widespread in social science. Attempts have been made to find person-centered, rather than investigator-originated, breaks in the stream of mental and social reality with a view to understanding how the psychologically and culturally constructed framed units organize human experience. These units have been a variety of cognitive ones (Minsky, 1975; Rumelhart and Ortony, 1976); they have been linguistic units which frame one or more concepts, logical relations, or cultural propositions (Black and Metzger, 1965; Rosch, 1977); they have been situational units whose definition is a function of the transformation of "physical space and chronological time... into social space and social time" (McHugh, 1968: 3; see also Hall, 1977: 129-140); they may be contexts of interaction, the bracketing of which transforms an event into one with fundamentally different meanings and epistemological and behavioral entailments (Bateson, 1972: 177-193; Goffman, 1974); and on the most global level cultural groups are themselves framed by shared knowledge systems, customs, and world view -- an event which is framed in an East African culture may have a very different meaning than that 'same' event framed in French culture. Cross-cultural and interpersonal understanding -- emotional and otherwise -- occurs only to the extent that there is a shared and agreed upon or a constructed frame for interaction.

There appears to be a growing recognition in psychology that the more inclusive frames are just that, rather than simply variables, and that the more micro-level frames are embedded in them. This is seen in the increasing concern with what is termed 'world knowledge'. It is also seen in the recent interest in metacognition (A. Brown, 1978; Flavell, 1976, 1978). Metacognition, or thinking about thought processes, is the ability to introspect about, monitor, and control those processes (Brown and Barclay, 1976: 72). More basically, the term metacognition has been used to refer also to the awareness of the existence, nature, variables, and integration of cognitive processes (Wellman, n.d.). The importance of examining the phenomenon of metacognition lies in its potential role in the molding of more micro-level processes.

There is an important bridge to be built between these notions about metacognition and emerging anthropological concerns with ethnotheories of self and psychological process. Although work in metacognition has tended to be based on the assumption that these abilities and knowledge are culture-free, the child develops in a cultural milieu replete with explicit and implicit ideas about how one internally and socially operates. The child receives training in metacognitive skills in the context of that culture's typical learning settings. The psychologist's metacognitive skills and the anthropologist's ethnotheories thus constitute and constrain each other.

The concept of 'metacognition' should be seen, however, in the same ideal typical light in which we are here placing 'cognition', with 'metaemotion' coined to describe the more immediate and action-oriented processing of our processing. These metaskills can be conceptualized, like cognition and affect, as points along a continuum of delay and immediacy and of passivity and behavioral implication. To Brown and Barclay's list of metamemorial skills -- introspection, monitoring, and control (1976: 72) -- we must add, however, the evaluation of one's own cognitive and emotional processing. The skills of control and evaluation fall towards the same end of this continuum as does affect. People have knowledge concerning emotions and their place in the workings of the self and relationships, and they evaluate particular emotions (for example, 'righteous indignation is good', 'hate is bad', 'regret is good') and emotion in general. The inseparability of value and knowledge is seen in the fact that people in cultures which do not value a particular type of processing skill (such as intuition) will, according to the evidence accumulating in the fields of both metacognition and cross-cultural psychology, do less of such processing. Cultural values and knowledge determine what we term emotional experience; that is, they structure the way in which we frame events as salient, frame expressive communications from others as meaningful, and frame our own emotions linguistically and behaviorally.3

3. Ifaluk Theories of Emotion and Situation Frames

The Ifaluk are a Micronesian people whose adaptation to their coral islet environment has been notably successful and resistant to colonial influences. Their one-half square mile atoll is densely populated, supporting 430 people on a diet extracted through fishing, horticulture, and gathering. Their society is organized
through the ranking of individuals based on their clan and lineage affiliations, gender, and age. Several chiefs head the matrilineal clans, and organize island-wide activities. The absence of physical aggression on Ifaluk marks it as one of the world's least violent societies. Ghosts of ancestral and other varieties are ubiquitous and dangerous, although they may also bring protection from typhoons and disease.

The Ifaluk have a rich body of knowledge about the nature of the person and emotion (Lutz, 1980, 1981). Although they do not distinguish emotion from thought, and do not have a monolexeme for either, they do label two closely related but nonetheless distinct types of internal processes, one more willful (tip-) than the other (nunuwan), more a product of individual desire than of social conformity or social intelligence. The Ifaluk words which we would call emotion words can be used to modify either of these two process terms.

The Ifaluk evaluation of emotion, that is, their metaemotional attitude, is a very positive one. Emotion is viewed as inextricably part of thought, and the correct understanding and behavioral enactments of emotions, both pleasant and unpleasant, are seen as signs of maturity. The mentally ill on Ifaluk are marked, in the indigenous view, by deficiencies in the ability to display both emotional and more technological understandings. Their view here interestingly converges with that of researchers recently working on the emotional-cognitive development of Down's syndrome infants (Cicchetti and Sroufe, 1976; Emde, Katz and Thorpe, 1978).

Emotion is a 'covert' category (C. Brown, 1974) for the Ifaluk. A domain of words exists which informants separate from others in sorting tasks. This domain includes words whose primary referents are organism-environment interactions. Ifaluk emotion words are defined by the situations in which people experience them. English emotion words, by contrast, foreground the organism response, with physiological signs (Davitz, 1969) taken as 'symptoms' of an internal event, rather than an external one. The two ethnotheoretical systems overlap to an important degree, however. While the Ifaluk have a less elaborate model of physiological correlates of events, some English emotion definitions point to the existence of a secondary situational model.

Joel Davitz, in his book The Language of Emotion, presents data on the meanings of emotion words for 50 American informants in a dictionary-like format. Data collection began in an open-ended manner, with people being asked to describe their experiences of each of 50 different emotions. A checklist was developed out of a large and representative sample of descriptive phrases. This checklist was presented to another 50 individuals to get comparable and quantifiable descriptors for the 50 emotions. The overwhelming emphasis on physiological state is evidenced, for example, in the definition of 'anger'. The following descriptions are listed in order of frequency (numbers in parentheses are percentages of the sample who checked each item).

    my blood pressure goes up...(72); I'm easily irritated...(64); I seem to be caught up and overwhelmed by the feeling (64); my face and mouth are tight, tense, hard (60); there is an excitement, a sense of being keyed up, overstimulated, supercharged (58); my pulse quickens (56); my body seems to speed up (54); there is a quickening of heartbeat (52); my fists are clenched (52) (Davitz, 1969: 35).

Contrast the meaning of 'anger' with the following definitions of the Ifaluk emotion word, song, which may be translated as 'justified anger'. "We are song from some gossip we hear [about ourselves which is not true], if someone talks a lot at us, [or] if the pig comes and eats food I just made." "If someone doesn't give me something I [legitimately] ask for, I'm song."

The evidence for the centrality of situation as a frame for Ifaluk emotional experience and understanding is found in several kinds of data. As just mentioned, folk definitions of emotion words are predominantly given in a form similar to the following: 'Emotion X is when someone steals my bananas', or 'If your child dies, then emotion Y'. Secondly, daily conversations abound in which emotion is explicitly discussed. Emotions of the self and others are spoken of in terms of their environmental or situational causes and correlates. Physical symptoms are not a salient topic of conversation. A third source of data is sorting tasks in which people were asked to pile-sort cards with Ifaluk emotion words on them. Post-test questioning of people about the rationale for their sorting decisions produces answers which most commonly cite a common eliciting situation or situation type, or alternately a sequence of situations (Lutz, 1982). Examples include, "These [emotion] words all go together because they all occur when we are interrupted in our work", and "They all involve something happening that we want [to happen]."

The Ifaluk logic of emotion is based on propositions about the parameters and meaning of commonly occurring situations. Why, then, and on what criteria do the Ifaluk frame something as a situation and frame diverse situations within the confines of a single emotion word? Let us return to the example of the word song, which has been translated as 'justified anger'. The prototypical situation which causes song is one in which another person has violated a cultural rule or value. Included are both situations where ego or a relative of ego suffers as a result and situations where it is simply cultural principle which is at issue, with no one directly disadvantaged as a result. Specifically, some of the commonly associated situations include hearing false gossip about oneself, encountering someone whose personality does not conform with the cultural values of generosity, calmness, even-temper, and respectfulness, being
excluded from the emotional and material support expected from kinspersons and others, and hearing of the violation of a taboo such as that against walking upright in front of the men's house.

'Anger' can also be defined by its situational correlates. Americans will define a frustrating situation as one in which anger typically occurs. 'Anger', according to Carroll Izard, arises in situations where one is

    either physically or psychologically restrained from doing what one intensely desires to do. The restraint may be in terms of physical barriers, rules and regulations, or one's own incapability (Izard, 1977: 329-330).

Brown and Herrnstein define 'anger' as the "illegitimate disappointment of legitimate expectations" (1975: 274). Although their definition is more restrictive than that of Izard, the two are not contradictory. As Brown and Herrnstein point out, legitimacy is "neither universally acknowledged nor unchanging" (1975: 279). America is a poly-cultural and complex society, and there is much disagreement over values, rules, and regulations. This, and the value placed on individual achievement, make it unsurprising that 'anger' may be evoked in situations where one's own individual goals (based on one's personal sense of the legitimacy of those goals) are blocked. The restraints and rules which Izard speaks of as eliciting 'anger' in Americans are seen as the correct and moral order on Ifaluk. A separate term, ngush, describes the state of being required to conform to a valid cultural rule which is nonetheless frustrating of other individual goals such as comfort.

Although the goals of individuals will be occasionally blocked in every culture, the definition of the situation in which one is blocked will vary from culture to culture, as will the subsequent emotional reaction (Whiting, 1944). The socialization process results in the molding of goals. The American child comes to expect certain kinds of achievement from her or himself. "Standing on your own two feet", "Making it", and "Proving yourself" are cultural aphorisms which reflect the types of independence and achievement goals which are acquired. The Ifaluk child, on the other hand, comes to operate with goals dictated by the values of sharing and interpersonal dependence. The behavioral implications of these values are outlined in cultural rules which provide for the equal distribution of everything from food to children and, conversely, for the equal distribution of sacrifice and restraint. The 'crazy' person on Ifaluk is partially defined as one who spends too much time alone, or thinks only of her or his own needs.

Important differences in the nature, and hence the translations, of song and 'anger' flow from these differences in culturally constituted goals. It is for such reasons that American 'anger' results from [...] results from the failure of the other to conform to group goals.

In attempting to compare emotion words across cultures, it can be seen that neither can the referent be assumed without examination of the ethnopsychological theories in which it is embedded, nor can the translation process be anything but primary. The translation of song as 'anger', or even as the more accurate 'justified anger', tends to suggest erroneously that the two terms share common referents and webs of ethnotheoretical meaning. A first step towards improving the accuracy of translation of emotion words might be to use situations rather than feeling tone as the primary criteria for mapping an emotion word in one language onto one or more emotion words in another language (Lutz, 1980).

Three final suggestions are in order about the general relationship between situation and emotion. Although the storage of emotional meaning in situation frames is particularly explicit in Ifaluk ethnotheory, the work of Minsky (1975) and others suggests that the situation might be a universal frame for many kinds of experience, as it represents an especially efficient way of storing related bits of information. The efficiency of this storage method might be enhanced, moreover, by certain types of metaemotional evaluations and controls and by developed metacognitive approaches to the situational coding of emotion.

Secondly, the function of emotion words may importantly consist of their linking of situations in a culturally meaningful way. 'Anger' links a certain group of situations, while song links another. Cultural adaptation calls for varieties of responses to the same 'objective' circumstances. Situations are correlated by their shared relationship to a particular cultural value and by the types of action which follow from them. Emotion words code these environmental correlations, and thus provide for understanding of self and social life.

Finally, the situations which are correlates and constituters of emotion in Ifaluk ethnopsychological theory are commonly occurring ones, such as confronting a task for which one is ill-prepared, observing a rule violation, or having one's child fall ill. These situation based emotion frames organize behavior through their links to action plans. Schank and Abelson claim that frequently encountered events usually require the development of accompanying scripts. These scripts are "highly stylized ways of executing planboxes" (1977: 96), while plans are the ranges of choices dictated by a particular goal. Thus we expect that Ifaluk cultural values, from which many individual goals originate, would contribute both to the framing of situations under a common emotion term, and to the available interactional script that dictates behavior in the emotion defined situation. We and the Ifaluk need emotion words to communicate definitions of the situation and intended plans of action.

This and other ethnopsychological theories [...]
blinders which we take with us to the study of affect. Through the comparative investigation of emotion frames including the linguistic, situational, and ethnotheoretic, we may be able to identify those frames which emerge only in particular environmental circumstances and those which are universally meaningful.

Notes

Acknowledgements. I would like to thank Geoffrey White for helpful comments on an earlier draft of this paper.

1. There are various forms in which the Whorfian hypothesis on the linguistic determination of thought may be construed, from the strongest, in which non-verbal cognitive activity is patterned by language, to the weakest, in which only the perception of absent stimuli (i.e., memory) is strongly influenced (Miller and McNeil cited in Lemon, 1981: 202-203). If we include affect as part of the same continuum of internal processing on which thought is found (see below), the modeling of relationships between language and other processing will need to take account of both affect and cognition. We should expect that, under certain types of conditions of emotion activation, one or another version of the Whorfian hypothesis may apply.

2. For views of emotion as intelligence, see D'Andrade, 1980: 15-17; Lutz and LeVine, n.d.; Meichenbaum, 1980: 274-278.

3. The sociologist Arlie Hochschild's (1979) notion of 'emotion work' is relevant here as a conceptualization of the way in which emotion knowledge structures ('feeling rules' in her scheme) affect emotional experience itself. In her view, these structures are provided by ideology and, in particular, by social 'feeling rules'. These rules govern not only behavior, but feelings themselves, and hence are not merely the 'display rules' of Ekman (1974).

References

ABELSON, R. Computer simulation of 'hot' cognition. In S. Tomkins and S. Messick (Eds.), Computer Simulation of Personality. New York: John Wiley, 1963.

ABELSON, R. Belief and knowledge systems. Cognitive Science, 1979, 3, 355-366.

BATESON, G. Steps to an Ecology of Mind. New York: Ballantine Books, 1972, pp. 177-193.

BLACK, M., and METZGER, D. Ethnographic description and the study of law. American Anthropologist, 1965, 67, 6(2), 141-165.

BROWN, A. Knowing when, where and how to remember: A problem of meta-cognition. In R. Glaser (Ed.), Advances in Instructional Psychology. Hillsdale, N. J.: Lawrence Erlbaum, 1978.

BROWN, A., and BARCLAY, C. The effects of training specific mnemonics on the meta-mnemonic efficiency of retarded children. Child Development, 1976, 47, 70-80.

BROWN, C. Unique beginners and covert categories in folk biological taxonomies. American Anthropologist, 1974, 76, 325-327.

BROWN, R., and HERRNSTEIN, R. Psychology. Boston: Little, Brown, 1975.

CICCHETTI, D., and SROUFE, L. A. The relationship between affective and cognitive development in Down's syndrome infants. Child Development, 1976, 47, 920-929.

D'ANDRADE, R. The cultural part of cognition. Address given to the Second Annual Cognitive Science Conference, New Haven, 1980.

DAVITZ, J. The Language of Emotion. New York: Academic Press, 1969.

EKMAN, P. Universal facial expressions of emotion. In R. LeVine (Ed.), Culture and Personality: Contemporary Readings. Chicago: Aldine, 1974.

EMDE, R., KATZ, E., and THORPE, J. Emotional expression in infancy: II. Early deviations in Down's syndrome. In M. Lewis and L. Rosenblum (Eds.), The Development of Affect. New York: Plenum Press, 1978.

FLAVELL, J. Metacognitive aspects of problem solving. In L. Resnick (Ed.), The Origins of Intelligence. Hillsdale, N. J.: Lawrence Erlbaum, 1976.

FLAVELL, J. Metacognitive development. In J. M. Scandura and C. J. Brainerd (Eds.), Structural/Process Theories of Complex Human Behavior. Alphen a. d. Rijn, The Netherlands: Sijthoff and Noordhoff, 1978.
GEERTZ, C. "From the native's point of view": On the nature of anthropological understanding. In K. Basso and H. Selby (Eds.), Meaning in Anthropology. Albuquerque: University of New Mexico Press, 1976.

GOFFMAN, E. Frame Analysis: An Essay on the Organization of Experience. Cambridge: Harvard University Press, 1974.

HALL, E. T. Beyond Culture. New York: Anchor Press, 1977 (1976).

HOCHSCHILD, A. Emotion work, feeling rules, and social structure. American Journal of Sociology, 1979, 85, 551-595.

IZARD, C. E. Human Emotions. New York: Plenum Press, 1977.

LAKOFF, G., and JOHNSON, M. Metaphors We Live By. Chicago: University of Chicago Press, 1980.

LEENHARDT, M. Do Kamo. Chicago: University of Chicago Press, 1979 (1947).

LEMON, N. Language and learning: Some observations on the linguistic determination of cognitive processes. In B. Lloyd and J. Gay (Eds.), Universals of Human Thought: Some African Evidence. New York: Cambridge University Press, 1981.

LUTZ, C. Emotion words and emotional development on Ifaluk atoll. Ph.D. Dissertation, Harvard University, Cambridge, Mass., 1980.

LUTZ, C. Talking about 'our insides': Ifaluk conceptions of the self. Paper presented at the Tenth Annual Meeting of the Association for Social Anthropology in Oceania, San Diego, 1981.

LUTZ, C. The domain of emotion words on Ifaluk. American Ethnologist, 1982, 9, in press.

LUTZ, C., and LEVINE, R. Culture and intelligence in infancy: An ethnopsychological view. In M. Lewis (Ed.), The Origins of Intelligence. 2nd edition. New York: Plenum Press, n.d.

MCHUGH, P. Defining the Situation: The Organization of Meaning in Social Interaction. New York: Bobbs-Merrill, 1968.

MEICHENBAUM, D. A cognitive-behavioral perspective on intelligence. Intelligence, 1980, 4, 271-283.

MINSKY, M. L. A framework for representing knowledge. In P. H. Winston (Ed.), The Psychology of Computer Vision. New York: McGraw-Hill, 1975.

ROSALDO, M. Knowledge and Passion: Ilongot Notions of Self and Social Life. New York: Cambridge University Press, 1980.

ROSCH, E. Human categorization. In N. Warren (Ed.), Advances in Cross-Cultural Psychology. Vol. I. London: Academic Press, 1977.

RUMELHART, D. E., and ORTONY, A. The representation of knowledge in memory. In R. C. Anderson, R. Spiro, and W. E. Montague (Eds.), Schooling and the Acquisition of Knowledge. Hillsdale, N. J.: Lawrence Erlbaum, 1976.

SCHANK, R., and ABELSON, R. Scripts, Plans, Goals, and Understanding: An Inquiry into Human Knowledge Systems. Hillsdale, N. J.: Lawrence Erlbaum, 1977.

STRAUSS, A. Northern Cheyenne ethnopsychology. Ethos, 1977, 5, 326-357.

WELLMAN, H. A child's theory of mind: The development of conceptions of cognition. In S. Yussen (Ed.), The Growth of Insight in the Child. New York: Academic Press, n.d.

WHITE, G. Some social uses of emotion language: A Melanesian example. Paper presented at the 78th Annual Meeting of the American Anthropological Association, Cincinnati, 1979.

WHITE, G. 'Person' and 'emotion' in A'ara ethnopsychology. Paper presented at the Tenth Annual Meeting of the Association for Social Anthropology in Oceania, San Diego, 1981.

WHITING, J. W. M. The frustration complex in Kwoma society. Man, 1944, 115, 140-144.

ZAJONC, R. B. Feeling and thinking: Preferences need no inferences. American Psychologist, 1980, 35, 151-175.

Disentangling the Affective Lexicon

Andrew Ortony and Gerald L. Clore

University of Illinois

When social psychologists and personality theorists investigate traits and emotions they frequently rely on lists of words that denote, or are thought to denote, traits, or emotions, or feelings. A classic problem in such work is that there is no unambiguous way of specifying which words refer to emotions, which refer to traits, and which refer to other behaviors and non-emotional states. Investigators have generally relied on their intuitions in these matters, and by and large agreement has not been very high.

This problem is not so severe in lists of personality-trait words, largely because many modern empirical studies involving trait descriptors choose their terms from one of several "standardized" lists. For example, social psychologists frequently draw from the list of 555 words compiled by Anderson (1968). This list was developed by selecting feasible candidates from the 18,000 words appearing in Allport and Odbert's (1936) classic monograph. From the resulting reduced list of 2200 words, Anderson removed extreme words (e.g. majestic), words designating temporary states (e.g. aghast), words having to do with physical characteristics (e.g. hairy), strongly sex-linked words (e.g. alluring), and other words considered unsuitable as ingredients in impression formation (e.g. fond). Finally, words found to be unfamiliar to college students were eliminated. Although Anderson's list was also determined primarily on the basis of intuition, and although it does contain some ambiguous words (for example, happy, while certainly designating a trait, also can designate an emotion), it nevertheless has sufficient face validity to have gained wide acceptance.

Those who would study the emotions are less fortunate. Whether one seeks to map out the cognitive basis of the emotions, as we do, or whether one is investigating the effects of emotions on behavior, or of behavior on emotions, an indiscriminate use of language can be dangerously misleading both in theory construction and in the conduct of research. Many of the words in lists used in studying emotions either do not designate the kinds of states they are intended to, or they are ambiguous between different kinds of states. The indiscriminate use of such lists in theoretical and empirical research poses a serious methodological problem. For example, Russell (1980) scaled 28 "emotion-denoting adjectives". He found "sleepiness" to be an important dimension of such words. Although he lists the words used in his studies, he provides no justification for their inclusion and he describes no method for their selection. He included words like bored, tired, sleepy, drowsy, tranquil, and relaxed; we do not think that such terms denote emotions at all. If one [...] study of the emotions, some alternative method is needed for identifying emotion words. What we propose in this paper can be viewed as the linguistic groundwork required for language-based studies of the emotions and other affect-related terms.

The problem that we are dealing with is by no means unique to this domain. There are many areas where it is difficult or impossible to be entirely explicit about the criteria for class membership, but where psychologists have relied on rating scales and the intuitions of judges to achieve reliable and valid classifications. The work of Rosch and her colleagues (e.g. Rosch, 1978; Rosch & Mervis, 1975) on the categorization of concrete objects is an obvious example.

In the present context, one possibility would be to present subjects with candidate emotion words and ask how good they are as examples of emotions. This we plan to do. However, as the sole strategy, this approach has drawbacks related to not knowing what criteria subjects employ in their judgments, and consequently it raises troublesome problems about reliability. Thus, as a first step, we chose to employ a number of explicit tests that we hope offer greater reliability and that provide potentially useful additional information about the structure of the affective lexicon. These tests constitute a set of heuristics for isolating genuine emotion words (and other kinds of words) from a list of putative emotion words. They take the form of a group of sentence frames into which a candidate word is inserted. The tests are "passed" or "failed" by a particular word depending on the extent to which groups of judges consider the resulting sentences to be meaningful rather than anomalous.

Finally, it should be emphasized that we think of these tests as a set of heuristics or "rules of thumb" rather than as an algorithm. Nevertheless, we think that they do a tolerable job of disentangling the affective lexicon -- certainly a better job than blind intuition, or than no criteria at all.

While our primary goal is to isolate the genuine emotion words from a pool of candidate emotion words, we also consider it interesting to attempt to classify the major kinds of words appearing in the pool. The pool comprises the union of several lists claiming to be lists of emotions and/or of feelings. In constructing it we drew primarily from lists used in various psychological treatments of emotion (Bush, 1972; Dahl & Stengel, 1978; Davitz, 1969; Russell, 1980). The final pool consisted of about 500 words, the largest contribution coming from Dahl and Stengel's extensive list. A sizeable number also came from Bush (1972), who had reduced a prior list of 2,186 adjectives from Allport and Odbert (1936). From these Bush selected the 263 words that raters agreed were more relevant to emotions ("what a person feels") than to personality ("what a person is like") or to behavior ("what a person does"). Also included was Davitz's list of words from Roget's Thesaurus, and other smaller lists.
Includes among one's stimuli, words that have a high We found as we examined these lists that while we
loading on sleepiness, sleepiness will turn out to be could not give a satisfactory definition of an
a factor. Until the inclusion of such words in the emotion, we could readily eliminate many of the
stimulus set can be justified, generalizations about candidates as words that did not refer to emotions.
the structure of the emotions have to be regarded as For example. In the lists of words designating
suspect. emotions and feelings used by Dahl and Stengel (1978)
The specification of necessary and sufficient or Bush (1972), there are numerous "Intruders" such as
criteria for emotions is a notoriously difficult if tired, hungry, breathless, and revived— words which
not an impossible goal. But because the employment of seem to designate body states, a"nd words like
linguistic stimuli is an important avenue into the confused, baffled, and sure, which seem to represent
non-affective cognitive states. Still other entries like abandoned,
abused, and appreciated represent the acts or beliefs of others
relevant to the self; they could certainly cause emotions but do not
themselves denote emotions.

The linguistic tests that we propose are attempts to classify such
"intruders" in a reasonably systematic way while also separating out
emotion words. The first distinction we make is between words that
designate traits or emotions and words that do not. Words that do
designate traits or emotions are of three kinds: (a) "pure" trait
words, which refer only to traits and not to emotions (e.g. studious,
ambitious, mean), (b) "pure" emotion words, which refer only to
emotions and not to traits (e.g. jubilant, distressed, embarrassed),
and (c) polysemous words that can be used to refer to both emotions
and traits (e.g. cheerful, happy, proud). For brevity we shall refer
to such words as "emotion-trait hybrids". Although less central to our
concerns, still of interest are the three kinds of words already
mentioned that constitute the other half of the pool. These we call
"body-state" words, "cognitive-state" words, and "other-action" words.
The tests that we discuss in this paper are all designed to deal with
adjectives or participial forms. Rephrasing of the tests is required
to handle noun and verb forms.

Words denoting emotions or traits

The first test that we propose is actually a pair of sentence frames.
One, Frame A, deals with negatively valenced words, and one, Frame B,
deals with positively valenced ones. These frames can be thought of as
linguistic filters. Their logic is to contrast candidate words with
something explicitly emotional so that words that do not denote
emotions will produce meaningful (as opposed to anomalous) sentences.
The test separates the entire pool into two halves: (a) an item that
fails the test (i.e. produces an anomalous sentence) is most probably
a word that denotes a trait or an emotion, and (b) an item that passes
the test (i.e. produces an acceptable sentence) is probably a
body-state word, an other-action word, or a cognitive-state word.
Thus, the test is intended to allow as sensible completions only words
like puzzled and certain (cognitive-state words), breathless and
refreshed (body-state words), and abandoned and appreciated
(other-action words).

Test 1.
    Frame A: Although at that moment Mary was xxxxx,
             she was emotionally content
    Frame B: Although at that moment Mary was xxxxx,
             she was not emotionally content

The word although anticipates a contrast, and in the contexts of these
frames, it is a contrast of valence. However, the presence of the
phrase emotionally content constrains the contrast to non-emotional
states. Accordingly, emotion words will fail the test, but body-state
words, cognitive-state words, and other-action words all pass it. For
example, words like breathless, puzzled, and abandoned pass the test
because they fit the sentence frame for negative words (Frame A), and
words like refreshed, certain, and appreciated pass because they fit
the frame for positive words (Frame B). Traits are prevented from
fitting into the sentence frames by incorporating in the frames a
reference to a particular moment so that a quality that is enduring
will give rise to an anomalous sentence. Thus, trait words as well as
emotion words fail the test (e.g. honest, unkind, jubilant, and
distressed).

Since our primary goal is to separate trait descriptors from emotion
words, we shall deal first with that part of the initial pool that
fails Test 1. Recall, first, that terms like proud, sad, and happy are
sometimes used as trait descriptors and sometimes as emotion words.
Thus, the half of the pool containing traits and emotions actually
contains words of three kinds: the "pure" emotion words that
unambiguously designate emotions (e.g. embarrassed, disgusted,
jubilant), the "pure" trait words that unambiguously designate traits
(e.g. thrifty, intelligent, studious, dishonest), and the
"emotion-trait hybrid" words that have two senses, one referring to an
emotion and one to a trait.

The test that we now describe is designed to separate pure trait terms
and emotion-trait hybrids from pure emotion terms. Because the context
provided by the sentence frame resists temporary states in favor of
persevering qualities, it allows as sensible completions only traits
and hybrid words with a trait as one meaning.

Test 2: John was well-known as a(n) xxxxx person

The result of applying this test is to separate examples like the
following:

    (PASS)                     (FAIL)
    pure traits and hybrids    pure emotions

    anxious                    disgusted
    happy                      distressed
    proud                      embarrassed
    materialistic              jubilant
    superstitious              love-sick

In order to separate the hybrids from the pure traits, another test,
Test 3, is needed. This test may be applied to the same set of words
as Test 2. The hybrids can then be isolated by taking the intersection
of words passing Test 2 and of those passing Test 3. This is because
Test 2 detects words that have trait readings, while Test 3 detects
that subset of them that also have emotion readings (see Fig. 1).

The rationale behind Test 3 is that emotions can be experienced to
varying degrees, and can be experienced in the absence of an
interpersonal exchange. Thus, reflecting on a situation can give rise
to an emotion but not to a trait, although, if a term is ambiguous as
between a trait and an emotion, it will fit the test because of its
emotion sense.

Test 3: As he reflected on what had happened,
        John was quite xxxxx

The result of applying this test is to separate examples like the
following:

    pure emotions and hybrids  pure traits

    cheerful                   ambitious
    distressed                 intelligent
    disgusted                  knowledgeable
    ecstatic                   mean
    frightened                 sensitive
    proud                      thrifty
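The decision logic of Tests 1 to 3 can be summarized in a minimal Python sketch. The frame strings are quoted from the text. Everything else is an assumption made for illustration: the function names are hypothetical, and each test outcome is reduced to a single boolean, whereas the paper describes graded judgments by groups of judges and treats the tests as heuristics rather than as an algorithm.

```python
# Sentence frames quoted from the text; "xxxxx" marks the slot for the candidate word.
FRAMES = {
    "1A": "Although at that moment Mary was xxxxx, she was emotionally content",
    "1B": "Although at that moment Mary was xxxxx, she was not emotionally content",
    "2": "John was well-known as a(n) xxxxx person",
    "3": "As he reflected on what had happened, John was quite xxxxx",
}


def fill_frame(frame_id: str, word: str) -> str:
    """Insert a candidate word into a frame (hypothetical helper)."""
    return FRAMES[frame_id].replace("xxxxx", word)


def classify_emotion_pool(passes_test1: bool, passes_test2: bool, passes_test3: bool) -> str:
    """Map pass/fail outcomes on Tests 1-3 to a category.

    Assumption: each outcome has already been reduced to a boolean.
    Test 1 counts as passed if the word fits either Frame A or Frame B.
    """
    if passes_test1:
        # Body-state, cognitive-state, or other-action word; handled by Tests 4 and 5.
        return "not an emotion or trait word"
    if passes_test2 and passes_test3:
        return "emotion-trait hybrid"
    if passes_test2:
        return "pure trait"
    if passes_test3:
        return "pure emotion"
    return "unclassified"


if __name__ == "__main__":
    print(fill_frame("2", "proud"))
    # Outcomes below follow the example tables in the text.
    print(classify_emotion_pool(False, True, True))   # proud
    print(classify_emotion_pool(False, True, False))  # thrifty
    print(classify_emotion_pool(False, False, True))  # embarrassed
```

The hybrid branch mirrors the statement that hybrids are the intersection of the words passing Test 2 and those passing Test 3.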

[Figure 1: diagram not recoverable from the source. Legible labels
include EMBARRASSED, SUPERSTITIOUS, PURE, BODY-STATES, and COGNITIVE;
the diagram refers the reader to Fig. 2.]





Figure 1.

Our primary goal has now been achieved. We think we have proposed a
reasonably methodical procedure for isolating emotion words from a
pool that contains some words that do not denote emotions.
Furthermore, we have separated two kinds of emotion words, those that
seem to denote emotions exclusively, and those (hybrids) that also
denote traits. We consider this to be a potentially important
distinction. The results of multidimensional scaling studies, for
example, in which subjects make similarity judgments can be muddied by
the unwitting inclusion of a subset of ambiguous stimuli (i.e.
hybrids).

Words not denoting emotions or traits

There may be occasions on which one might want to compare emotion
words to some other kinds of words, say, cognitive-state words, or
other-action words. Although of secondary interest to us, the same
kind of procedures can be used to separate the three kinds of words
appearing in the other half of the initial pool, namely the half
comprising words that passed Test 1 (see Fig. 2).

The first kind of words that we attempt to isolate are those that do
not represent an internal state of a person at all. These we call
"other-action" words because they characterize the actions (or the
attitudes) of others that are relevant to (although not necessarily
directed towards) the self. Perhaps because they are so strongly
associated with emotional responses, many other-action words have
found their way into lists of emotions and traits. For example, the
word abandoned appears in the lists of Dahl and Stengel (1978) and of
Bush (1972). It also appears in the Personal Traits column of Allport
and Odbert's (1936) list. However, in modern English abandoned
designates neither a trait nor an emotion. One cannot be disposed to
behave "abandonedly" (although we do speak of behaving "with gay
abandon", meaning recklessly), and one does not experience "being
abandoned" as a separate emotion. Rather, abandoned represents the
actions of some other vis a vis the self. Its special quality, its
emotional loading, presumably comes from the fact that the knowledge
that one has been abandoned typically gives rise to (negatively toned)
emotions. It is, however, as much of an error to assume that
"abandoned" is an emotion or a trait as it would be to suppose that
"kicked in the groin" was.

We shall assume that, unlike emotion words, the most salient
characteristic of other-action words is that others can engage in
those actions without the person to whom they are relevant necessarily
being aware of them. Since one can be abandoned and not know it,
abandoned cannot represent an emotion or any other kind of internal
state; it is an other-action word. Notice that it does not follow from
this that awareness entails an emotional state. Normally, awareness is
a necessary but not sufficient condition for an emotion. Thus, Test 4
is designed to identify other-action words. The logic of the test is
(a) to deny awareness by using the expression "totally unaware", and
(b) to take advantage of the fact that "other-action words" require
actions by others that might influence the self by explicitly making
the agent of the action an other.

Test 4: John was totally unaware that he
        had been xxxxx by the woman

For this sentence frame reasonable completions are restricted to words
that denote the actions (physical or mental) of others. As with the
earlier tests, some words will fail to fit simply because they are of
the wrong syntactic type, but, as always, the more interesting cases
are those for which the resulting sentence is not syntactically
ill-formed but










Figure 2.

rather is semantically anomalous. For example, the word puzzled does
not fit very well because it is odd to suppose that John could be
puzzled and not realize it. We think that puzzled fits better into the
category that we call "cognitive-state" words. A more difficult
example is revived. Because revived suggests the possibility of prior
unconsciousness it seems better able to fit into the sentence frame,
yet we like to think that revived is a body-state word. If this is so,
then subjects should judge it to fit better in the body-state frame
(see Test 5, below), even though it might do tolerably well in the
other-action frame. Since subjects are asked to judge how well a
target word fits in a frame, it would be sufficient for our purposes
to discover that words like abandoned and ignored fit better than
words like revived. It is not necessary that words not in the category
upon which we are focussing give rise to seriously anomalous
sentences. Our expectation is only that the most reasonable
completions are produced by other-action words. Internal state words
make poor completions. Thus, we can separate other-action words using
this test. When subjects in our experiments are instructed to make
meaningfulness judgments they are warned to ignore one particular
interpretation of the sentence frame in Test 4 that would confound the
results by rendering spurious "meaningful" judgments. Subjects are
told that the focus of the sentence should be on John's lack of
awareness, not on the identity of the agent responsible for the
action. Thus, they are instructed to ignore the interpretation in
which John might have wrongly attributed his being ignored,
appreciated, revived etc. to someone other than the woman. The
following are examples of words that we think pass Test 4 most easily.

Other-action words:

    abused
    abandoned
    appreciated
    defeated
    disgraced
    ignored
    neglected
    slighted
    welcome

It is worth pointing out a couple of things at this juncture. The
first concerns the difference between "being" and "feeling" something.
The inclination to treat other-action words as internal state words is
much greater when they occur with "feel" than with "is". The reason is
that "feel" can be, and often is, interpreted to mean "feel as one
would if (one realized that) one was xxxxx". Thus, that John was
ignored entails nothing about what John felt. It merely asserts that
somebody ignored John. Whether or not John responded to this
other-action emotionally will depend on all kinds of factors (e.g. Was
he aware of the fact? Did he expect anything else? Did he care? etc.).
In other words, the inference to an emotional response is a pragmatic
one, not a logical one. Yet, if one says that "John felt ignored", we
have much more license to infer that John was in an (emotional)
internal state. We infer that John responded emotionally. There is no
doubt that feeling ignored is a unique kind of (negative) feeling; so
too is the feeling of being pricked by a needle. But this fact is not
sufficient for it to
count as an emotion. One cannot confuse causes with their highly
correlated effects.

The second point pertains to the problem of ambiguity. Consider the
word encouraged. There certainly is an other-action sense of the word.
For example, someone might not realize that they were being encouraged
to get hooked on drugs by the person that kept giving them free
samples. But there is also another sense of the word in which one can
be encouraged, namely the sense in which one becomes more optimistic
about something or other, and in this sense the actions of an other
are by no means necessary. So, for example, a person might be
encouraged about the prospects of a good day's sailing on waking up
and seeing the sun shining and feeling a fine breeze. Words like
encouraged and relieved seem to have both an other-action sense and an
emotion sense. It is our expectation (but only the data will tell)
that where such words have an emotional sense, that sense is more
salient. If this is the case, they will fail Test 1 and will be
treated in the emotion and trait pool. This we consider more important
than that they survive for treatment in the pool presently under
consideration (i.e. that part of the initial pool not denoting
emotions or traits). This particular kind of cross-pool ambiguity
would only become a problem if a significant number of emotions were
lost, an outcome that we consider unlikely.

Our final test is designed to separate out body-state words. These are
terms that pertain to the physical state of an animate being; they are
not restricted to humans, although some of them might be used more
frequently with respect to humans. Again, these words can be valenced,
and are often, but by no means necessarily, associated with emotional
responses. Their appearance in various lists of emotion words (e.g.,
again, those of Dahl & Stengel, and of Bush, and of Russell) is
probably due to the fact that they appear in (Column II of) the
Allport and Odbert (1936) list. This category is loosely characterized
by these authors as "terms designating mood, emotional activity, or
causal and temporary forms of conduct" (p. vii). In it appear words
like thirsty and breathless which in our opinion do not fit even this
loose characterization. What is it to be in a thirsty mood? Is being
thirsty an emotional activity, or a form of conduct? We suspect that
these terms appear in Column II not because they belong there, but
because they are less incongruous there than in one of the other three
categories used by Allport and Odbert.

Test 5 attempts to separate out these terms by using a sentence frame
that focuses on body states (as opposed to other kinds of sensations,
or perceptions), and that minimizes the cognitive content by
predicating them of a newborn infant:

Test 5: The pediatrician explained that one of the
        physical characteristics of a newborn infant
        was to be xxxxx.

It seems to us that this test only allows as good completions terms
that designate body feelings. It seems to us to more readily allow
completions with words that do not entail awareness and that do not
suggest cognitive activity (emotional or otherwise). Thus it would be
odd to complete this frame with an other-action word like ignored, and
it would be odd with words like certain. The oddness arises both from
attempting to predicate higher level cognitive functions involving
social awareness and metacognition to newborn infants, and from the
fact that these predicates do not refer to physical characteristics.
Examples of words which would pass this test with relative ease are:

Body-state words:

    [word list missing from the source]

Finally, it is our hope, and again we shall have to wait until the
data are in, that those words that do not fit well the sentence frame
for either Test 4 or Test 5 fail to do so because they are poor
examples of other-action words and of body-state words. In that case
it is our expectation that they will be good examples of
cognitive-state words such as the following:

Cognitive-state words:

    bored
    disbelieving
    distracted
    doubtful
    overworked
    puzzled
    uncertain
    uninspired
    amused
    aware
    certain
    impressed
    interested
    sure
    vindicated

Conclusion

The question of the psychological validity of the various categories
that we have proposed is obviously an important issue. We find these
categories to be intuitively reasonable and we believe that they do
represent psychologically important distinctions. However, ultimately
we would like to know that these distinctions correlate with
behavioral differences. For example, in a pilot study conducted by
Lord and Ortony, memory for emotion words was found to be very much
superior to memory for cognitive-state words. These are the kind of
data that one needs to demonstrate the psychological validity of the
distinctions we have proposed.

Finally, we should point out that we are more wedded to the general
principles that we have proposed than we are to the specific tests.
Indeed, some of the tests we find rather inelegant. It remains to be
seen how effective these tests are, and we are convinced that there is
room for considerable improvement. However, some procedure along the
lines of the one we have suggested seems essential if one is to avoid
the kinds of problems in the analysis of emotions that we identified
at the outset. We hope that our discussion will alert those who are
studying the emotions to the need to distinguish between states that
genuinely are emotional in nature and those that are not.
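The treatment of the words that pass Test 1 (Tests 4 and 5) can be sketched in the same spirit. The two frames are quoted from the text, and the rule that a word goes to the frame it fits better follows the discussion of revived. The numeric fit scores, the 0.5 threshold, the tie-breaking rule, and the function name are all assumptions made for illustration; the text reports no such values.

```python
# Frames quoted from the text; "xxxxx" marks the slot for the candidate word.
FRAME_4 = "John was totally unaware that he had been xxxxx by the woman"
FRAME_5 = ("The pediatrician explained that one of the physical "
           "characteristics of a newborn infant was to be xxxxx.")


def classify_non_emotion_word(fit_test4: float, fit_test5: float,
                              threshold: float = 0.5) -> str:
    """Assign a word that passed Test 1 to one of the three remaining categories.

    fit_test4 and fit_test5 are hypothetical mean judged-fit scores in [0, 1]
    for the other-action frame and the body-state frame. The threshold and the
    tie-break in favor of "other-action" are arbitrary choices for this sketch.
    """
    if max(fit_test4, fit_test5) < threshold:
        # Fits neither frame well: expected to be a cognitive-state word.
        return "cognitive-state"
    if fit_test4 >= fit_test5:
        return "other-action"
    return "body-state"


if __name__ == "__main__":
    # Invented scores, chosen only to reproduce the expectations stated in the text.
    print(classify_non_emotion_word(0.9, 0.1))  # e.g. abandoned
    print(classify_non_emotion_word(0.6, 0.8))  # e.g. revived
    print(classify_non_emotion_word(0.2, 0.1))  # e.g. puzzled
```

Comparing the two fit scores, rather than requiring a word to fail one frame outright, reflects the remark that words outside the target category need not produce seriously anomalous sentences.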


Allport, G. W., & Odbert, H. S. Trait-names: A psycho-lexical study.
    Psychological Monographs, 1936, 47(1, Whole No. 211).

Anderson, N. H. Likableness ratings of 555 personality-trait words.
    Journal of Personality and Social Psychology, 1968, 9, 272-279.

Bush, L. E. Empirical selection of adjectives denoting feelings.
    JSAS Catalog of Selected Documents in Psychology, 1972, 2, 67.

Dahl, H., & Stengel, B. A classification of emotion words: A
    modification and partial test of de Rivera's Decision Theory of
    Emotions. Psychoanalysis and Contemporary Thought, 1978, 1,

Davitz, J. The language of emotion. New York: Academic Press, 1969.

Rosch, E. Principles of categorization. In E. Rosch (Ed.), Cognition
    and categorization. Hillsdale, N.J.: Erlbaum, 1978.

Rosch, E., & Mervis, C. B. Family resemblances: Studies in the
    internal structure of categories. Cognitive Psychology, 1975, 7,
    573-605.

Russell, J. A. A circumplex model of affect. Journal of Personality
    and Social Psychology, 1980, 39,

We would like to thank Charles Lord and George Miller for helpful
comments on earlier drafts of the paper. The research reported herein
was supported in part by the National Institute of Education under
Contract No. HEW-NIE-C-400-76-0116, and in part by a Spencer
Fellowship awarded to the first author by the National Academy of
Education.

Affect and the Perception of Risk

Amos Tversky and Eric J. Johnson

Stanford University

As cognitive scientists turn their attention to emotion, they face the
task of integrating affect into models of cognition. The perception of
risks seems an ideal area to examine the relationship between
cognitive and affective processes. When we witness an accident, or
read a newspaper story about a natural disaster, we do more than
simply revise our subjective probabilities. We are often quite
disturbed and shaken by such events. Our encounters with risk are
inevitably connected with feelings, including those of surprise,
dismay, and worry.

Previous work in risk perception has concentrated on the cognitive
domain. Lichtenstein et al. (1978), for example, have asked people to
estimate the frequency of death due to various causes. They argue that
the availability of instances in memory helps determine these
perceived frequencies. Thus, homicide is seen as much more common than
suicide, although actually the reverse is true. Causes of death which
are spectacular and the subject of media coverage appear to be
overestimated while more mundane causes are underestimated.

We conducted three studies using an experimental paradigm similar to
the one used by Lichtenstein et al. Before they made their estimates,
however, subjects read a newspaper-like account of the death of a
single individual under the guise of a newspaper reporting study.
These stories, although quite graphic, were relatively devoid of
information. They were, however, effective in changing mood, causing
readers to report they felt much more depressed than a control group
which had not read the stories. Later, in an apparently unrelated
questionnaire, these subjects were asked to estimate the frequency of
death due to various causes. The causes of death ranged from those
closely related to the topic of the story, such as stomach and lung
cancer for a story about leukemia, to unrelated causes such as
tornados and airplane accidents.

The potential impact of these stories, and their accompanying changes
in mood, represent a continuum. At one end of the continuum, we might
expect the story to have no effect on the estimates. This is the
normatively justified response, since the stories contained no
information about the frequency of that cause of death in the
population. In contrast, the reader of the story might generalize from
the instance in the newspaper-like story and increase their estimate
of the frequency of that cause of death. We will term this a local
generalization. The impact of the story might also generalize to
other, related risks. A story about a leukemia victim might also raise
our subjective probability of related diseases such as lung and
stomach cancer, but not unrelated risks such as airplane accidents.
This gradient generalization should be closely related to the
similarity of the risks. Finally there is abundant evidence in social
psychology (Isen, Shalker, Clark, and Karp 1978) for more pervasive
influences of affect. We might expect that increases in estimated
frequency might occur for all risks, a possibility we term global
generalization.

In the first two studies we examined the generalization of negative
affect across the responses. Despite our attempts to provide a
sensitive test of local or gradient generalization, both studies
demonstrate sizable global generalization. Readers of the newspaper
stories estimated that all causes of death were about 40% more common
than did the control group. Since the changes were unrelated to the
topic of the story, these data suggested that the effect was due to
the mood induced, and that the bad moods were more than unpleasant
states. In addition, they had pervasive influences on an important
class of risk-related judgments.

In the third experiment, we broadened the estimates we requested to
include items not related to either death or risk. For example,
subjects were asked to report the frequency of bankruptcy and divorce.
Even with these widely divergent estimates, we have found strong
global generalization of affect, with no evidence for either local or
gradient generalization. We also included a condition in which
subjects read an additional newspaper story free of risk-related
content, but which described a series of negative events which
occurred to the main character. Since the story made no reference to
risk or death, its principal effect was the negative mood it induced
in the reader. This depressing story resulted in a pattern of results
almost identical to those induced by the risk-related newspaper
stories.

These data, viewed as a whole, demonstrate that affect can have a
large and pervasive influence on one important class of judgments,
estimates of the frequency of risk-related events in the population.
So far we have found no indication of a connection between the
information contained in a story and its impact on the estimated
frequency of death. The overriding factor in these increases does not
appear to be the story told, but rather the mood or affect state it
conveys to the reader. These effects are not limited to areas of
death, but have been shown for estimates of non-fatal hazards and
lifestyle-threatening risks such as divorce and bankruptcy.

Any model of affect must account for two important aspects of this
phenomenon: (1) induction of a negative mood alone is sufficient to
change estimates, and (2) the size of the change is unrelated to the
semantic similarity, either among the estimates themselves, or between
the cause of the mood and the estimates.

References

Isen, A. M., Shalker, T. E., Clark, M., & Karp, L. "Affect,
    accessibility of material in memory, and behavior: A cognitive
    loop?". Journal of Personality and Social Psychology, 1978, 37,
    1-12.


Generative Analogies as Mental Models

Dedre Gentner

    Subject explaining electric current: If you increase resistance
    in the circuit, the current slows down. Now that's like a
    high--cars on a highway where you--if you notice as you close
    down a lane, you have cars moving along. Okay, as you go down
    into the thing, the cars move slower through that narrow point.

When people are reasoning about an unfamiliar domain, they often
appear to use analogies, as in the above example from a protocol.
Analogies are also used in teaching, as in the following excerpt:

    The idea that electricity flows as water does is a good analogy.
    Picture the wires as pipes carrying water (electrons). Your wall
    plug is a high-pressure source which you can tap simply by
    inserting a plug. The plug has two prongs--one to take the flow
    to the lamp, radio, or air conditioner, the second to conduct
    the flow back to the wall. A valve (switch) is used to start or
    stop flow.

If implicit or explicit analogies are an important determinant of the
way people think about complex systems, then it becomes crucial to
know exactly how such analogies work. This paper considers the
psychological role of these analogies in structuring the target
domain.

The first question that must be posed is whether such seeming
analogical models do in fact strongly affect the person's
conceptualization of the target domain (the Generative Analogy
hypothesis), or whether they are merely convenient ways of talking
about the domain (the Mere Terminology hypothesis). The mere use of
terms borrowed from a given domain--as, for example, when electricity
is discussed in terms of moving vehicles or flowing water--is not in
itself proof that the speaker is conceiving of electricity as deeply
analogous to traffic or to water flow.

To demonstrate that an analogy has generative conceptual power, we
must show that nontrivial inferences specific to the base occur in the
target. These inferences must be such that they cannot be attributed
to shallow lexical associations; e.g. it is not enough to find that
the person who speaks of electricity as "flowing" also uses terms such
as "capacity" or "pressure". Such usage is certainly suggestive of a
Generative Analogy, but it could also occur under the Mere Terminology
hypothesis.

The goal here is to show that, at least some of the time, the
Generative Analogy hypothesis holds: that deep, indirect inferences in
the target follow from use of a given base domain as an analogical
model. To do this, we must first decide what inferences should follow
from use of a given analogy, and then observe whether the analogies
people adopt appear to affect the set of inferences they readily make.

The plan of this paper is (1) to propose a structure-mapping theory of
analogy that will allow us to predict the set of inferences that
should follow from use of a given analogy; (2) to contrast two
analogical models for the domain and show that they lead to different
indirect inferences; (3) to show that people's inferences concerning
simple circuits vary according to which of these models they use; and
finally (4) to discuss the general issue of analogical models and
structure-mapping.

A structure-mapping theory of analogy. The claim here is that
analogies select certain aspects of existing knowledge, and that this
selected knowledge can be structurally characterized. First, let's
consider what an analogy is not. An analogy such as

    (1) An electric circuit with a battery and resistor is much like
        a plumbing system with a reservoir and a constricted section
        of pipe.

clearly does not convey that all of one's knowledge about the plumbing
system should be attributed to the circuit. The inheritance of
characteristics is only partial. This might suggest that an analogy is
a weak similarity statement, conveying that some but not all of the
characteristics of the base system apply to the target system. But
this weak characterization fails to capture the distinction between
literal similarity and analogical relatedness. Contrast statement (1)
with a literal similarity statement like

    (2) A hose is like a pipe.

The literal similarity statement (2) conveys that the pipe and the
hose share object attributes--e.g. cylindrical shape--as well as
sharing similar relationships with other objects--e.g. CONVEY (hose,
water)/CONVEY (pipe, water). Statement (1) also conveys considerable
overlap in functional relations: e.g. IMPEDE (resistor,
current)/IMPEDE (constriction, water). However, it does not convey
overlap of objects and their attributes. The resistor as a separate
object need not have any qualities in common with a constriction. The
analogy, in short, conveys overlap in the system of relations among
objects, but no particular overlap in the characteristics of the
objects themselves. The literal similarity statement conveys overlap
both in relations among the objects and in the attributes of the
individual objects.

The analogical models used in science can be characterized as
structure-mappings between complex systems. In these analogies, the
objects of the known domain, the base domain, are mapped onto the
objects of the domain of inquiry, the target domain; the predicates of
the base domain--particularly the relations that hold among the
nodes--are then applied in the target domain. Structure-mapping
analogy asserts that identical operations and relationships hold among
nonidentical things. The relational structure is preserved, but not
the objects.

Given a particular propositional representation of knowledge we can
proceed with an explicit characterization of analogical mapping. A
structure-mapping analogy between a target system T and a base system B

This research was supported by the Office of Naval Research and was
carried out at Bolt Beranek and Newman
and at U.C.S.D. Donald Gentner collaborated on all
aspects of this work, but particularly in developing
the two analogical models of electricity. I thank Allan
Collins, Ken Forbus, Ed Smith and Al Stevens for many
insightful comments on this research. Please address
correspondence to Dedre Gentner, Bolt Beranek & Newman,
50 Moulton Street, Cambridge, MA 02138.

is an assertion that

1. Given a decomposition of the base and target domains into object nodes $b_1, b_2, \ldots, b_n$ of the base system B and object nodes $t_1, t_2, \ldots, t_m$ of the target system T,

2. The analogical mapping M maps the nodes of B into the nodes of T:

$M: b_i \to t_i$

3. Then predicates valid in B can be applied in T, using the node substitutions dictated by the mapping:

$M: [F(b_i, b_j)] \to [F(t_i, t_j)]$

Further, the probability that the derived proposition is valid in the target T is greater for relational predicates than for attributes, and greater for high-order relations than for lower-order relations. The strength of the analogical predication increases as we move down the list.

(i) $M: A(b_i) \to A(t_i)$

(ii) $M: [F(b_i, b_j)] \to [F(t_i, t_j)]$

(iii) $M: [G(F_1(b_i, b_j), F_2(b_k, b_l))] \to [G(F_1(t_i, t_j), F_2(t_k, t_l))]$

Thus, TRUE $[A(b_i)]$ does not strongly suggest TRUE $[A(t_i)]$. Attributional predicates (i) are less likely to carry over than relational predicates (ii); and lower-order relations less likely than higher-order relations (iii).

ing analogy is particularly apt for combinations of batteries, while the moving-object model is superior for combinations of resistors.

Figure 1
Structural representations of water flow and of simple electric circuit, showing structural overlap
[Diagram not reproduced. Circuit panel: object nodes battery, wire1, electricity, resistor, wire2, joined by CONNECTED TO links, with relation labels ACROSS, THROUGH, OF. Water-flow panel relation labels: PRESSURE ACROSS, FLOW THROUGH, NARROWNESS OF.]
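The node-substitution rule in the definition above can be made concrete with a small sketch. The Python fragment below is illustrative only: the tuple encoding of predicates, the names `map_predicate` and `order`, the particular node correspondence, and the sample higher-order predicate are all assumptions made for this example and are not part of the formal statement of the theory.

```python
# Illustrative sketch of node substitution under an analogical mapping M.
# The tuple encoding and every name below are assumptions for this example.

# Hypothetical mapping M from hydraulic base nodes to circuit target nodes.
M = {
    "reservoir": "battery",
    "constriction": "resistor",
    "water": "current",
}

def map_predicate(pred, mapping):
    """Carry a base predicate into the target by substituting object nodes.

    A predicate is a tuple (name, arg1, arg2, ...). Each argument is either
    a node name (string) or a nested predicate (tuple). Predicate names are
    kept unchanged; only object nodes are replaced.
    """
    name, *args = pred
    new_args = []
    for arg in args:
        if isinstance(arg, tuple):
            new_args.append(map_predicate(arg, mapping))  # nested relation
        else:
            new_args.append(mapping[arg])                 # object node
    return (name, *new_args)

def order(pred):
    """Return 0 for an attribute A(b), 1 for a relation among objects,
    and one more than the deepest nested predicate otherwise."""
    _, *args = pred
    nested = [order(a) for a in args if isinstance(a, tuple)]
    if nested:
        return 1 + max(nested)
    return 0 if len(args) == 1 else 1

attribute = ("CYLINDRICAL", "reservoir")                  # type (i)
relation = ("IMPEDE", "constriction", "water")            # type (ii)
higher = ("DEPENDS_ON",                                   # type (iii), hypothetical
          ("FLOW_THROUGH", "water", "constriction"),
          ("PRESSURE_ACROSS", "water", "reservoir"))

print(map_predicate(relation, M))  # ('IMPEDE', 'resistor', 'current')
print([order(p) for p in (attribute, relation, higher)])  # [0, 1, 2]
```

On this encoding, the claim that predication strength increases from (i) to (iii) amounts to saying that predicates with a higher `order` value are the ones most likely to survive the substitution as valid statements about the target.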

[Figure 1, water-flow panel: object nodes reservoir, pipe1, water, constriction, pipe2, joined by CONNECTED TO links.]

Two models of simple electric circuits. One common analogy used to teach simple electricity is based on plumbing systems. Figure 1 shows the structure-mapping conveyed by this analogy. The object-nodes of the hydraulic base domain (e.g. the reservoir and constriction) are mapped onto the object-nodes (the battery and resistor) of the circuit. Given this correspondence of nodes, the analogy conveys that the relationships that hold between the objects and object-attributes of the hydraulic system also hold between the nodes of the electric system--for example, that current increases with voltage just as rate of water flow increases with pressure; and that current decreases with resistance just as the rate of water flow decreases with degree of constriction.

A second kind of analogy for electric circuits is based on objects moving through chutes. Current is seen as a moving crowd of small objects: voltage is the forward pressure or pushiness of the objects. Like the plumbing model, the moving-object model provides relations that map usefully into the electrical system: If we imagine a source of pushiness corresponding to the battery, and gates in the chute corresponding to the resistors, then the more pushiness, the higher the rate of aggregate motion; the narrower the gates, the lower the rate of aggregate motion.

Although these two analogies convey many of the same relations, in some respects they differ in the aptness of the relational match with the target domain, particularly if we consider slightly more complex circuits (see Gentner and Gentner, in press).

Combinational problems. One way to observe deep indirect inferences in the target domain is to ask about different combinations of components. For example, we can ask how the current in a circuit with two resistors in series or in parallel compares with that in a simple one-resistor circuit. The answers are not obvious. The four circuits generated by series and parallel combinations of batteries and resistors are non-transparent. They provide an excellent way to observe true inferences, as opposed to shallow verbal associations. To deduce the current in these four circuits, the person must move beyond the first-stage naive model of circuitry (shown in Figure 1). This first level of insight is that batteries make for more current, resistors make for less current. These rules hold for batteries and resistors in series, but not for parallel combinations, as shown in Figure 2. Parallel batteries give the same current as a single system, not more; while parallel resistors allow more current than in a simple circuit, not less.
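The qualitative answers for the four combination circuits follow from Ohm's law for ideal components. The sketch below is a worked check and is not taken from the study: the function names, the unit values V = 1 and R = 1, and the assumption of identical ideal batteries and resistors are all chosen here for illustration.

```python
# Worked check of the four combination circuits using Ohm's law, I = V / R.
# Assumes identical ideal batteries (voltage V) and identical ideal
# resistors (resistance R); names and unit values are illustrative.

def series_resistance(r, n=2):
    return n * r            # equal resistances add in series

def parallel_resistance(r, n=2):
    return r / n            # n equal resistors in parallel

def current(voltage, resistance):
    return voltage / resistance

V, R = 1.0, 1.0
simple = current(V, R)      # baseline: one battery, one resistor

circuits = {
    "serial batteries":   current(2 * V, R),                   # voltages add
    "parallel batteries": current(V, R),                       # voltage unchanged
    "serial resistors":   current(V, series_resistance(R)),
    "parallel resistors": current(V, parallel_resistance(R)),
}

def compare(i, baseline):
    if i > baseline:
        return "greater than"
    if i < baseline:
        return "less than"
    return "equal to"

for name, i in circuits.items():
    print(f"{name}: current is {compare(i, simple)} the simple circuit")
```

Only the serial cases obey the naive rule that batteries add current and resistors subtract it; the two parallel cases are the ones that separate genuine inference from verbal association.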

Figure 2
Current and voltage in simple serial and parallel configuration circuits.
[Diagram not reproduced. Panels: SIMPLE CIRCUIT, SERIAL BATTERIES, PARALLEL BATTERIES, SERIAL RESISTORS; columns: CURRENT, VOLTAGE.]

Do analogies make a difference. If subjects are really using their analogical models to understand electronics they should draw on their knowledge of combinational relations among the corresponding components in their respective base domains. Even though the complex circuit problems are couched purely in terms of electronics, we should see differences in subjects' predictions depending on which model they use.

For example, here are two sections of a protocol of a subject trying to predict the current in a parallel-resistor circuit. In the first section, she uses a hydraulics model with reservoirs for batteries, which was her initial model, and derives the wrong answer of less current. In the second section, she uses a crowd model suggested to her by the experimenter to derive the correct answer of more current:

    We started off as one pipe, but then we split into two. Now does that make any difference? I guess it seems to me that this does make a difference. So what we have here is one pipe, one sort of line coming off and then we let it go like that for a while. We let it split off. We have a different current in the split-off section, and then we bring it back together. That's a whole different thing. That just functions as one big pipe of some obscure description. So you should not get as much

    CROWD MODEL

    Again I have all these people coming along here. I have this big area here where people are milling around. I think it is crucial that I separate them though before they get to the gates . . . I can model the two gate system by just putting the two gates right into the arena just like that. So this is one possible model of the two gate system. There are two gates instead of one which seems to imply that the resistance would be half as great as if there were only one gate for all those people.

This protocol suggests that models do affect inferences. The following study tests this possibility on a larger scale. In this study, fairly naive high school and college students were first shown a simple circuit with a battery and a resistor, and then asked to give qualitative solutions for the four combination circuits shown in Figure 2. They were asked to circle whether the current (and voltage) in each of the combination circuits would be greater than, equal to, or less than that of the simple battery-resistor circuit. After they gave their answers for all four combination circuits, they were asked to describe the way they thought about electricity. Then, for each of the four circuit problems, they were asked to circle whether they had thought about flowing fluid, moving objects, or some other way of conceiving of electricity. They were also asked questions about water, to be sure that they understood the base domain.

Figure 3 shows a schematic diagram of parallel and serial resistors in the target and in the two base domains.

Figure 3
[Diagram not reproduced.]

Subjects who used the fluid model should do well on the battery questions. This is because serial and parallel reservoirs combine in the same manner as serial and parallel batteries, and the combinational distinctions are spatially quite distinct in the water domain. Two equal reservoirs in series (one above the other) give more pressure and hence a greater rate of water flow than a single reservoir. However, since water pressure depends only on height, not on volume, two reservoirs in parallel (side by side) yield only a rate of flow equal to that of a single reservoir. Thus, if the flowing fluid analogy is generative, then subjects with this model (assuming they know the way pressure works in the base) should be able to differentiate serial and parallel configurations of batteries. For resistors, however, the fluid flow model with its constrictions should not, in general, lead to a strong differentiation between serial and parallel resistors. As the protocol above shows, the difference between serial and parallel constrictions is fairly opaque in the fluid flow domain; therefore subjects with this model cannot import the correct combinational distinction into the electricity domain. Thus, the prediction is that subjects with the fluid flow model should do better with batteries than with resistors.

For subjects with the moving-objects model, the pattern should be quite different. In this model, configurations of batteries should be relatively difficult to differentiate, since analogies for batteries are hard to find. In contrast, resistors should be better understood. This is because in the moving-objects model, resistors are often seen as gates. Conceiving of resistors as gates should lead to better differentiation between the parallel and serial configurations. If all the objects must pass through two gates one after the other (serial) then the rate of flow should be lower than for just one gate. On the other hand, if the flow splits and moves through two parallel gates, then the rate of flow should be twice the rate for a single gate. Thus subjects using this model should correctly respond that parallel resistors give more current than a single resistor; and serial resistors, less.

Overall, if these models are truly generative analogies, we should find that the fluid-flow people do better with batteries than resistors, and the moving-object people do better with resistors than with batteries.

Results. Figure 4 shows the results for subjects who used either the fluid-flow analogy or the moving-objects analogy consistently, on all four problems. For the fluid-flow subjects, only those who correctly answered the later questions about the behavior of reservoirs were included. This was to insure that subjects possessed the requisite knowledge in the base domain. There were nine fluid-flow subjects and seven moving-object subjects.

The patterns of combinational inference are different depending on which model the subject had. As predicted, people who used the fluid-flow model performed better on batteries than on resistors. The reverse is true for the moving-object people. In a Model type x Component type x Topology analysis of variance, the interaction between model type and circuit component is significant; F [values illegible]; p < .05.

Figure 4
Proportion correct on different kinds of circuits for subjects using different models of electricity
[Plot not reproduced. Two panels, one for each model (the second titled MOVING OBJECTS MODEL), each showing proportion correct on CURRENT for BATTERIES and RESISTORS in SERIAL and PARALLEL configurations.]

Conclusions. The results of the study indicate that, for our subjects, the analogies used for electricity were truly generative. Use of different analogies led to systematic differences in the patterns of correct and incorrect inferences in the target domain. Moreover, these combinatorial differences are not easily attributable to shallow verbal associations and communicative patterns. These analogies seem to be truly generative for our subjects; structural relations from the base domain are mapped into the target domain, where they genuinely affect the person's conceptual view of the domain.

The structure-mapping interpretation of the process avoids two extreme positions that often arise in discussions of analogy as explanation: the "vague metaphoricizing" position, which holds that analogy is inherently illogical and unhelpful, and the "appropriate abstractions" position, which emphasizes the fact that analogies convey correct knowledge about the target domain. The structure-mapping view is neutral with respect to whether analogy per se is helpful or harmful. According to the structure-mapping view, the inferences conveyed by a given analogy are not necessarily either correct or incorrect; they are predictable from the predicate structure in the two domains. Relational predicates, particularly those that participate in higher-order systems of relations, are most likely to be mapped; but these may be either correct (as in the case of the moving-object model applied to parallel resistors) or incorrect or indeterminate (as when the moving-object model is applied to batteries). The more we know about the structure of analogy, the better we can design good educational analogies and predict the problems that will occur in use of any given analogy.

References

Gentner, D., and Gentner, D.R. Flowing waters or teeming crowds: Mental models of electric circuits. In D. Gentner & A. Stevens (Eds.), Mental models. Hillsdale, New Jersey: Lawrence Erlbaum Associates, in press.

THE ROLE OF EXPERIENCE IN MODELS OF THE PHYSICAL WORLD

Andrea A. diSessa
M.I.T. D.S.R.E.
Cambridge, MA
July 8, 1981

In recent years there has been a growing body of research dealing with "naive physics:" what people, prior to extensive instruction, expect are the principles governing the everyday workings of the physical world. This research is extremely interesting as a study of untutored learning since presumably whatever systematic understanding physics-naive individuals arrive at must be due primarily to their direct experience with the physical world. Here we would like to summarize briefly some results of that experimental work and a tentative theoretical interpretation of it. Then we will be in a position to speculate on the general nature of the models people make of their experience in order to cope with it.

What has emerged from naive physics research is a surprisingly robust and systematic set of ideas about mechanics which are often, however, decidedly non-Newtonian. Studies from age 10 upward to university students and physics-naive adults have given in many cases very uniform results (Viennot, 1979; Clement, 1979; McCloskey, et al. 1980; White, 1981). In one study (diSessa, 1981) we let elementary school students and university undergraduates play with the same computer simulation of an object obeying Newton's Laws. Despite many more years experience, a year of high school physics and a university level course on mechanics one could see a clear overlap in the set of strategies the university students used as compared to the elementary students. Moreover, many of these common strategies were not neutral, or merely based on other-than-textbook analyses, but were overtly non-Newtonian. The university students often had difficulty applying the simplest classroom concepts to the simulation, even when asked. One of the prominent expectations most students showed was that force acts by directly producing motion in the direction of the force rather than by combining with previous motion. Thus these students sided with Aristotle against Newton.

Besides the obvious pedagogical problems, data such as this pose in a very direct form the fundamental question of what one learns from experience. How can it be that people come to a robust non-Newtonian understanding, even resisting instruction, of a world with which they deal everyday, a world governed by Newtonian principles?

No one expects the answers to such a question to be simple. But some analysis of a set of naive conceptions (diSessa, to appear) suggests that something like the following mechanism may play an important role. It is a mechanism concerning incremental learning based on experience. The basic idea is that among all the experiences one has regarding a class of phenomena, for example pushing and pulling things around, a few are selected to stand as prototypical for the class and are systematically used in both explanations and predictions. It is not, of course, a literal recall of an observed phenomenon which serves this purpose but what the person establishes as a conventional interpretation of the phenomenon in terms which that person already understands. I call these paradigmatic interpretations of experience "phenomenological primitives." In the case of the Aristotelian expectation of motion always in the direction of force, one might hypothesize that the common event of pushing an object from rest serves as an important prototype. Thus the phenomenological primitive here is the "theory" that things simply move in the direction you push them, and previous momentum is generally ignored since it plays no role in the prototype. I suspect that the interpretation of pushing from rest is based on prior, common-sense notions of agency and causality which one finds associated with naive conceptions of force in many ways. Indeed, what is more surprising, one can make a rather strong case that these same ideas had a great impact on the historical development of the science of mechanics as well. (See diSessa, 1980)

Though the notion of phenomenological primitive might help account for the origins of false "theories" like the Aristotelian expectation, we must look to the knowledge system in which these structures operate in order to account for their long term stability. Here we can offer only the briefest suggestion as to the character of this system, again by example. The example involves a counter-example to the Aristotelian expectation. Imagine a ten-ton truck hurtling down the highway and a small push on its side. Who could believe the truck will move in the direction of the push? Indeed, no one does! When subjects are prompted in such a way, the Aristotelian intuition is not even considered, or, if it is, excuses for its inapplicability are found: "The force (sic) of the truck is too big; it's overcoming the sideways push." But either way (through selective cueing or maintaining excuses as part of the knowledge base) the Aristotelian expectation is isolated and kept safe from refutation.

What generalizations can we draw from this story of naive physics about the models people spontaneously make? The first is that these models are robust, that naive ideas and the interpretations of experience one makes with them can have a powerful effect on long term understanding or misunderstanding. All the evidence points to the conclusion that naive physics plays a large role in learning textbook physics. To draw a caricature, it is almost as if one is trying to teach physics to a stable cognitive system which already knows a different physics.

The second generality is the importance of knowing the particular pre-existing notions. What one carries away from an experience is one's interpretation of it. Such a truism only warrants attention when we remind ourselves of how little we know about the naive vocabulary and about how one builds deeper understanding out of it. The surprise of a Newtonian world teaching Aristotelian physics, however, should serve as a reminder.

The third generality concerns the fragmentation exhibited by spontaneous models. One can find people both believing and disbelieving their Aristotelian expectations depending on circumstances. This apparent incoherence is puzzling. After all, what other than a coherent system could exhibit such robustness? In fact, it is in this area, coherence, that I believe our own naive ideas about knowledge systems most need refining.
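The truck example can be put numerically. The sketch below contrasts the Aristotelian expectation with Newtonian behavior under assumptions made only for illustration: a point mass, a constant force applied for one short time step, made-up magnitudes, and the function names `newtonian_velocity` and `aristotelian_direction`. None of these come from the studies cited.

```python
import math

# Contrast of the two expectations for the truck example.
# All quantities and function names are illustrative assumptions.

def newtonian_velocity(v, force, mass, dt):
    """Newton: the push changes the existing velocity, v' = v + (F/m) dt."""
    return (v[0] + force[0] / mass * dt, v[1] + force[1] / mass * dt)

def aristotelian_direction(force):
    """Aristotelian expectation: motion simply follows the push and
    previous motion is ignored. Returns the unit direction of the force."""
    norm = math.hypot(force[0], force[1])
    return (force[0] / norm, force[1] / norm)

def heading_degrees(v):
    """Direction of travel, measured from the highway (x) axis."""
    return math.degrees(math.atan2(v[1], v[0]))

mass = 10_000.0          # roughly a ten-ton truck, in kg
v0 = (30.0, 0.0)         # moving along the highway, in m/s
push = (0.0, 500.0)      # small sideways push, in N
dt = 1.0                 # duration of the push, in s

v1 = newtonian_velocity(v0, push, mass, dt)
print(v1)                              # (30.0, 0.05)
print(heading_degrees(v1))             # a small fraction of a degree
print(aristotelian_direction(push))    # (0.0, 1.0): straight sideways
```

With a large prior velocity the Newtonian heading barely changes, which is why the Aristotelian prototype of pushing from rest looks so plainly inapplicable in this case.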


Clement, J. Common preconceptions and misconceptions as an important source of difficulty in physics courses. Amherst, MA: University of Massachusetts, Cognitive Development Project Working Paper, July 1979.

diSessa, A. Unlearning Aristotelian physics: a study of knowledge-based learning. Cognitive Science, to appear 1981.

diSessa, A. Phenomenology and the evolution of intuition. In Mental Models, Gentner, D. and Stevens, A. (Eds.), Lawrence Erlbaum, to appear.

diSessa, A. Momentum flow as an alternative perspective in elementary mechanics (Working Paper). Cambridge, MA: MIT DSRE, 1980.

McCloskey, M., Caramazza, A., and Green, B. Curvilinear motion in the absence of external forces: naive beliefs about the motion of objects. Science, Vol. 210, 5 December 1980.

Viennot, L. Le raisonnement spontané en dynamique élémentaire. Paris: Hermann, 1979.

White, B. Designing computer games to enhance learning (Technical Report). Cambridge, MA: MIT AI Laboratory, 1981.


P. N. Johnson-Laird

Centre for Research on Perception and Cognition

Laboratory of Experimental Psychology
University of Sussex
Brighton BN1 9QG England

You are lost in the maze at Hampton Court Palace. You come to a turning and for a moment you are not sure which way to go. You recognize that you have been at this point before, and, in your imagination, you turn right, proceed down an alley, and are then confronted by a dead end. And so this time around, you decide to turn left. What you did was to reconstruct a route through the maze on the basis of a mental model of it. You may hardly have experienced any imagery at all; or you may have had a succession of vivid images like a snippet from an imaginary movie that culminates in a leafy cul-de-sac. In either case, there was nothing verbal about your reasoning: you navigated your way through your model of the maze much as a rat in a psychological laboratory might have done (O'Keefe and Nadel, 1978). Yet, there is another method that you could use to make your decision. You recall instead that the way to get out of the maze is to keep turning left at every available opportunity, and, since you are presented with such an opportunity, you accordingly decide to turn left. This method makes use of a mental representation of verbal propositions.

The two alternatives illustrate the contrast between exploiting a mental model (perhaps with accompanying imagery) and making use of a propositional representation. My aim in this paper is to show that the contrast is real -- that there are both forms of mental representation -- and to offer an account of the purpose that they serve. Indeed, if there are mental models, then the two most important questions about them are: what form do they take? what function do they serve? I will try to answer both questions.

Direct empirical evidence for the contrast between propositional representations and mental models comes from a series of experiments that Kannan Mani and I have carried out (Mani and Johnson-Laird, in press). In the most recent of our studies, the subjects heard a verbal description of a spatial layout, such as:

    The spoon is to the left of the knife
    The plate is to the right of the knife
    The fork is in front of the spoon
    The cup is in front of the knife.

They were then shown a diagram, such as:

    spoon   knife   plate
    fork    cup

and they had to decide whether or not the diagram was consistent with the description. (If you think of the diagram as depicting the arrangement of the objects on a table top, then obviously it is consistent with the description.) Half the descriptions that the subjects received were determinate like the example above, and the other half were indeterminate. The indeterminate descriptions were constructed merely by changing the last word in the second sentence:

    The spoon is to the left of the knife
    The plate is to the right of the spoon
    The fork is in front of the spoon
    The cup is in front of the knife.

This description is consistent with two radically different diagrams:

    (1)                        (2)
    spoon   knife   plate      spoon   plate   knife
    fork    cup                fork            cup

The materials were counterbalanced so that for each set of five objects, a subject received either the determinate or else the indeterminate description. After the subjects had judged a series of eight descriptions and diagrams, they were given an unexpected test of their memory for the descriptions. On each trial, they had to rank four alternatives in terms of their resemblance to the original description: the original description, an inferrable description, and two "foils" with a different meaning. The inferrable description for the example above contained the sentence:

    The fork is to the left of the cup

in place of the sentence interrelating the spoon and the knife. The description can therefore be inferred from the layout corresponding to the original description in the case of both the determinate and the indeterminate descriptions.

The subjects remembered the layouts of the determinate descriptions very much better than those of the indeterminate descriptions. The percentage of trials on which they ranked the original and the inferrable descriptions prior to the foils was 88% for the determinate descriptions, but only 58% for the indeterminate descriptions. All twenty of the subjects conformed to the trend, and there was no effect of whether or not a diagram had been consistent with a description. However, the percentage of trials on which the original description was ranked higher than the inferrable description was 68% for the determinate descriptions, but 88% for the indeterminate descriptions. This difference was highly reliable, too.

Evidently, subjects tend to remember the layout of determinate descriptions better than that of indeterminate descriptions, but they tend to remember verbatim detail of indeterminate descriptions better than that of determinate descriptions. This 'cross-over' effect is impossible to explain without postulating at least two sorts of mental representation. A plausible account of the results is indeed that subjects construct a mental model of the determinate descriptions but abandon such a representation in favour of a superficial propositional one as soon as they encounter an indeterminacy in a description. Mental models are easier to remember than propositional representations, perhaps because they are more structured and elaborated (cf. Craik and Tulving, 1975) and require a greater amount of processing to construct (cf. Johnson-Laird and Bethell-Fox, 1978). But, models encode little or nothing of the linguistic form of the sentences on which they are based, and subjects accordingly confuse inferrable descriptions with the originals. Propositional representations are relatively hard to remember, but they do encode the linguistic form of sentences. Hence, when they are remembered, the subjects are likely to make a better than chance recognition of verbatim content.

It is natural to suppose that propositional representations are produced as part of the normal process of comprehending discourse (cf. Kintsch, 1974; Fodor, Fodor, and Garrett, 1975), but they can also serve the useful purpose of providing an economical representation of radically indeterminate discourse. The function of mental models is profoundly semantic: a propositional representation is true or false with respect to a mental model of the world. This relation is established by mapping propositional representations onto mental models, and I have argued elsewhere for a procedural semantics that carries out this task (see, e.g. Johnson-Laird, 1980, for a description of a program that builds up spatial arrays from verbal descriptions). Truth or falsity with respect to reality ultimately depends on the construction of mental models on the basis of perceptual experience.

There is one other crucial function served by mental models. The fundamental semantic principle of truth is that an assertion is true provided that there is no counterexample to it. The assertion, "Socrates is dead," has only one possible counterexample, namely, that Socrates is not dead; the assertion, "All men are mortal," has a large number of potential counterexamples. Likewise, given the truth of a set of premises, a conclusion is necessarily true only if there is no counterexample to it, that is, no way of interpreting the premises that renders the conclusion false. If human beings have grasped this principle, then they can reason validly without possessing any mental logic, rules of inference, or inferential schemata. It is a straightforward matter to write computer programs that make inferences without recourse to rules of inference: they construct models of the premises, draw a putative conclusion on the basis of a simple heuristic, and then search for counterexamples to the conclusion. On the previous occasion that I presented this idea (Johnson-Laird, 1980), it was viewed as on a par with the Pelagian heresy in some quarters. Yet cognitive scientists should be prepared to accept that the doctrine of mental logic may be just as mistaken as the idea of original sin. Abandoning the doctrine [...] then the question of its origins does not even arise. The theory of mental models also reveals the major cause of inferential error: the greater the number of mental models that have to be constructed in order to make a valid deduction, the greater the load on working memory, and the more likely an error is to be made. My colleagues and I have checked this prediction in a variety of inferential tasks. Table 1 presents the relevant data from four of our experiments in which the subjects had to draw their own conclusions from syllogistic premises: it gives the percentages of valid conclusions that were drawn depending on the number of mental models that had to be constructed.

Table 1   The percentages of correct valid conclusions drawn from syllogistic premises in four experiments. The percentages are shown as a function of the number of mental models that have to be constructed in order to draw a valid conclusion.

                  One model   Two model   Three model
                  problems    problems    problems
    Experiment 1     92          46          28
    Experiment 2     80          20           9
    Experiment 3     62          20           3
    Experiment 4     58           0           0

In Experiment 1, 20 students at Teachers College, Columbia University, were asked to state what followed from premises in each of the 64 logically distinct varieties (see Johnson-Laird and Steedman, 1978). Experiment 2 was a replication with 20 students at Milan University, and Experiment 3 was a further replication in which 20 Italian subjects were given just 10 seconds in which to make each of their responses. These experiments were carried out in collaboration with Bruno Bara. Finally, in Experiment 4, which Debbie Bull and I designed, 19 children between 11 and 12 years of age were asked to draw conclusions from 20 out of the 64 possible pairs of syllogistic premises. The trend in each experiment was remarkable: not a single subject that we have tested has ever failed to perform best on those syllogisms that require only a single model to be constructed. It is difficult to resist the conclusion that inferential ability is based on the manipulation of mental models.

Let me finish with one final conjectural flourish. The psychological core of understanding any phenomenon consists in your having a 'working model' of it in your mind. If you understand inflation, a mathematical proof, the way a computer works, DNA or a divorce, then you have a mental representation of it that serves as a model in much the same way as, say, a clock functions as a model of the solar system. Like a clock, a mental model need not be wholly accurate to be useful, which is just as well because, of course, there are no complete models of any empirical phenomena. If a television set is mentally represented as containing a beam of electrons that are magnetically deflected across the screen, then this component of the model serves an explanatory function. It accounts, for example, for the distortion of the picture that occurs when a magnet is held near to the screen. Other components of the model may serve no such function. One might imagine, say, each electron as deflected by the
certainly solves the otherwise intractable mystery of magnetic field much as a ball-bearing is diverted from
how children could acquire logic without being able its course by a magnet, but without having any
to reason validly. The thesis that logic is innate representation of the nature of magnetism: the
is the only plausible solution but it has no more 'picture' is just a picture, which simulates reality
explanatory value or empirical content than an appeal rather than models its underlying principles. At
to divine intervention. If there is no mental logic. least one other component of every dynamic model is
neither modelled nor simulated. This element is
time. Time is not represented in a dynamic model,
but rather the model unwinds in real time in much
the same way as do the events that are modelled,
though perhaps at a different rate. In models that
are not dynamic, of course, time can be represented
by a spatial axis. What one should expect in
examining the growth of expertise in a particular
domain is the gradual transition from mere
prepositional principles to a fully articulated
mental model, and the gradual replacement of
simulated elements by their modelled counterparts.
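
The claim made earlier in this paper, that a program can make inferences without rules of inference by building models of the premises and searching for counterexamples, can be illustrated with a small sketch. This is not the program referred to in the text: it replaces the heuristic construction of a single model with an exhaustive search over all small models, and every name in it (`KINDS`, `holds`, `models_of`, `find_counterexample`, `follows`) is a hypothetical choice made for illustration. It also reads "All A are B" as true when there are no A's, which is an assumption about quantifier meaning rather than something the paper specifies.

```python
from itertools import combinations, product

TERMS = ("A", "B", "C")

# Every kind of individual that can exist: one frozenset of properties
# per combination of the three terms (8 kinds in all).
KINDS = [frozenset(t for t, keep in zip(TERMS, bits) if keep)
         for bits in product([False, True], repeat=len(TERMS))]


def holds(statement, model):
    """Evaluate a quantified statement such as ("all", "A", "B") in a model."""
    quantifier, x, y = statement
    xs = [individual for individual in model if x in individual]
    if quantifier == "all":
        return all(y in individual for individual in xs)
    if quantifier == "some":
        return any(y in individual for individual in xs)
    if quantifier == "no":
        return not any(y in individual for individual in xs)
    if quantifier == "some_not":
        return any(y not in individual for individual in xs)
    raise ValueError(f"unknown quantifier: {quantifier}")


def models_of(premises):
    """Yield every non-empty set of individual kinds that satisfies the premises."""
    for size in range(1, len(KINDS) + 1):
        for model in combinations(KINDS, size):
            if all(holds(premise, model) for premise in premises):
                yield model


def find_counterexample(premises, conclusion):
    """Return a model of the premises in which the conclusion is false, if any."""
    for model in models_of(premises):
        if not holds(conclusion, model):
            return model
    return None


def follows(premises, conclusion):
    """A conclusion follows when no model of the premises falsifies it."""
    return find_counterexample(premises, conclusion) is None


valid = follows([("all", "A", "B"), ("all", "B", "C")], ("all", "A", "C"))
invalid = follows([("some", "A", "B"), ("some", "B", "C")], ("some", "A", "C"))
```

Under these assumptions `valid` is true and `invalid` is false: no rule of inference is consulted, only the presence or absence of a falsifying model. Because duplicate individuals cannot change the truth of these four statement forms, searching sets of kinds rather than multisets loses nothing.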


Craik, F.I.M., & Tulving, E. Depth of processing and the retention of words in episodic memory. Journal of Experimental Psychology: General, 1975, 104, 268-294.

Fodor, J.D., Fodor, J.A., & Garrett, M.F. The psychological unreality of semantic representations. Linguistic Inquiry, 1975, 4, 515-531.

Johnson-Laird, P.N. Mental models in cognitive science. Cognitive Science, 1980, 4, 71-115.

Johnson-Laird, P.N., & Bethell-Fox, C.E. Memory for questions and amount of processing. Memory and Cognition, 1978, 6, 496-501.

Johnson-Laird, P.N., & Steedman, M.J. The psychology of syllogisms. Cognitive Psychology, 1978, 10,

Kintsch, W. The representation of meaning in memory. Hillsdale, N.J.: Lawrence Erlbaum Associates, 1974.

Mani, K., & Johnson-Laird, P.N. The mental representation of spatial descriptions. Memory and Cognition, in press.

O'Keefe, J., & Nadel, L. The hippocampus as a cognitive map. London: Oxford University Press,

Learning Through Growth of Skill in Mental Modeling*

Jill H. Larkin
Herbert A. Simon
Carnegie-Mellon University

* This work was supported by NIE NSF grant number 1 5586?, by NSF grant number 1 55035, and by the Defense Advanced Research Projects Agency (DOD), ARPA Order No. 3597, monitored by the Air Force Avionics Laboratory under Contract F;UGI57U C 1151. The authors acknowledge the contributions of f>u;..'iri ('.(tW^ii in the collection, transcription and coding of the protocol data.

The mental models of a skilled scientist are often different from those of an untrained person. For example, in thinking about the interaction of physical objects, the untrained person seems largely restricted to envisioning objects in a sequence of motions. The entities in these "naive" or immediate mental models correspond directly to objects in the real world. The inferential rules that control the running of these models correspond roughly to rules reflecting how events unfold in real time. While such mental models are perfectly adequate for getting around in everyday situations, they are sometimes dramatically wrong (Green, McCloskey, and Caramazza, 1980; Clement, 1981) and they certainly seem less effective in solving scientific problems than are the more extended representations of the scientist.

The mental models of a person with training in physics are not limited to entities and inferential rules based directly on experience. Instead, these models can and do include entities that have technical meanings defined only by the scientific discipline, and that are related by special inferential rules again defined in the discipline. For example, a person with training in physics is not restricted to considering perceivable objects like cats and coffee cups, but may also represent situations in terms of technical entities like forces or pressure drops. Similarly, capacity to make inferences about a situation need not parallel imagined development of the situation in time, but instead may reflect special constraint laws of physics, e.g., that the momentum of an isolated system must remain constant. These ideas are discussed more fully in (Larkin, 1981a) and in the discussion of physical intuition in (Simon and Simon, 1978).

In this paper we consider how an individual might develop the ability to re-represent situations in terms of scientific entities. Presumably this development is one goal of science instruction. We shall present preliminary results from an experimental and theoretical case study in such development. Subjects with backgrounds in physics studied sections taken from a physics textbook that described material (fluid statics) they had not previously encountered, and then used this material in efforts to solve problems. In a coordinated theoretical effort we are developing a computer-implemented model of learning from text that is capable of using declarative statements of facts (in this case, relations of physics) both to "understand" the derivation of new results and to apply these results in solving problems.

1. The ABLE System

The system, called ABLE, is a descendant of a system that learned through practice to apply principles of mechanics, and that accounted for strategy differences between skilled and less skilled individuals (Larkin, 1981b).

ABLE is a production system written in the current implementation of OPS, a LISP-based efficient production system language (Forgy, 1980; Forgy, 1979). Thus ABLE has a working memory composed of passive elements of knowledge that are acted on by a large production memory composed of elements of procedural knowledge encoded as condition-action pairs. When the conditions of a particular production are found to match some of the contents of working memory, this match cues the execution of the corresponding actions which then act to modify working memory. Then the conditions of some new production are found to match, and the cycles continue. Production systems have a continued history of fruitfulness in psychological modeling (Newell and Simon, 1972; Newell, 1973; McDermott, 1978) but the major feature used in ABLE is the easy modeling of learning. Each production is an independent piece of knowledge, and the circumstances under which it applies are determined only by the contents of its own conditions. Thus the addition of knowledge (learning) is modeled simply by the addition of new productions.

To explain the working of ABLE we consider its application to solving part of the problem given in Table 1 and Figure 1a, and presented as a worked example in Halliday & Resnick (1970). The first paragraph of the example states the problem. We currently give ABLE a good understanding of this paragraph, i.e., a good immediate representation of the problem. It is coded as a set of related declarative elements in working memory, indicated by the graph structure in Figure 1b.

Table 1: Worked example from a textbook (Halliday and Resnick, 1970) showing the application of relations of fluid statics to relate densities of liquids in a U-shaped tube.

    A U-tube is partly filled with water. Another liquid, which does not mix with water, is poured into one side until it stands a distance $d$ above the water level on the other side, which has meanwhile risen a distance $l$ (Fig. 1a). Find the density of the liquid relative to that of water.

    In Fig. 1 points C are at the same pressure [1]. Hence, the pressure drop from C to each surface is the same [2], for each surface is at atmospheric pressure [3].

    The pressure drop on the water side is $\rho_w g\,2l$ [4], where the $2l$ [5] comes from the fact that the water column has risen a distance $l$ on one side and fallen a distance $l$ on the other side, from its initial position. The pressure drop on the other side is $\rho g(d + 2l)$ [6], where $\rho$ is the density of the unknown liquid. Hence,

        $\rho_w g\,2l = \rho g(d + 2l)$ [8]

    and

        $\rho/\rho_w = 2l/(2l + d)$ [9].

    The ratio of the density of a substance to the density of water is called the relative density (or the specific gravity) of that substance.

    [1-9] These numbers label inferences for reference later in the text.

1.1. Encoding of Principles

After achieving this immediate representation of the situation, how does a solver make scientific inferences of the kind illustrated by the textbook solution given in Table 1? In other words, how is a scientific mental model run?

Such inferences must be based on scientific principles that are in some sense "known" to the solver. "Known" might initially mean that the appropriate textbook page is available for inspection. We discuss later the growth of other kinds of knowing. Thus we provide ABLE with knowledge of relevant principles in the form illustrated in Figure 2. Like all principles and definitions in ABLE, it includes a symbolic statement of the principle $\Delta p = \rho g h$ together with a setting to which the principle applies and in terms of which the symbols in the statement are defined. Here the setting includes a portion of liquid with density $\rho$, two points in that liquid separated by a height $h$. The "gravitational acceleration" $g = 9.8\ \mathrm{m/s^2}$ is not specified but assumed to be known outside the context of this principle.

This knowledge of a principle is encoded as a passive link-node structure involving no knowledge of how or when to apply the principle. In this sense it is declarative knowledge, although clearly

it goes beyond minimal propositional encoding of the phrases that may have been used to describe it in the textbook. To illustrate how ABLE uses this knowledge of principles, we consider its application to develop the inference labeled 4 in Table 1, that is, to infer that the pressure drop from C to A on the water side of the tube is $\rho_w g\,2l$.

Figure 1: (a) Diagram provided by the textbook for the example in Table 1. (b) Annotated "diagram" (immediate representation) provided for ABLE in starting to work the example.
[Figure 1(b) node labels: water-top, original-level, water, liquid-bottom, tube connection.]

Figure 2: Graph structure representation of the principle $\Delta p = \rho g h$.
[Setting nodes: point A, point B, liquid, height, pressure drop, density. Statement: $\Delta p = \rho g h$.]

1.2. Interpretive Use of Principles

ABLE applies declarative knowledge through general procedural knowledge that first matches the setting of the principle to the setting of the problem, and then uses the statement of the principle (interpreted in the setting of the problem) to make inferences.

The control structure of ABLE, shown in Figure 3, is based on that proposed by Neves and Anderson (1981) for the analogous task of supplying reasons for the statements in a geometry proof. The following paragraphs provide English statements of the productions that control the shift of goals. Figure 3 indicates these productions by labeled arrows linking goal statements.

Figure 3: Goals used in ABLE with labeled arrows indicating productions that change goals.
[Goal labels: Expression for a quantity; Find principle; Establish principle; Compare principle (arrow P4); Check established; Recurse on non-matched quantity; Fail.]

ABLE starts by using the knowledge in production P1 that if the goal is to find an expression for a quantity in a problem setting, then one should search for a principle that can provide further information about that quantity. The initial search can be based on a variety of criteria (cf. Simon & Simon (1978), Larkin (1981b)). ABLE currently uses a very rough process, embodied in P2, that to be considered a principle must involve a quantity of the type currently desired (here pressure drop).

ABLE then applies the knowledge in production P3 to set the goal of comparing elements of the problem situation with elements of the principle, including its statement and setting. Production P4 contains knowledge of how to trace and compare the graph structures representing the current problem situation (e.g., Figure 1b) and the principle situation (Figure 2). When all possible correspondences between the two graph structures have been matched, production P5 sets as a goal to check whether all parts of the principle have been matched. If they have, the goals are successively marked as succeeded. If this is not the case, production P6 recognizes that part of the situation crucial to the applicability of the principle has not been satisfied. The goals are then successively marked as failed, and ABLE ultimately must seek a different principle. If, however, the only part of the principle situation that does not have a correspondence in the problem situation is a particular theoretical entity (quantity) then production P7 recognizes that if this correspondence could be established then the principle would succeed. Thus in our example, if ABLE had not already established the correspondence $h = 2l$ (as the text solution has not; see Table 1), then ABLE would set as a subgoal to establish an expression for the height $h$, beginning work again with production P1 in Figure 3.

Much of ABLE's work involves establishing a detailed match between a principle setting and a subset of the problem setting. In the single inference discussed above, of the total of 9 cycles of production execution, 5 were concerned with matching settings. This costly and compulsive matching seems, however, to be necessary for good problem solving. For example, in the current problem there are several densities, pressures, and heights. Without careful matching between settings, the solver can easily "infer" relations between quantities that in fact have no connection.

1.3. Reducing the Costs of Interpretive Matching

Because matching a principle to a setting is costly, it is crucial that a solver develop good search procedures for locating principles likely to produce useful inferences. The primitive search algorithm embodied in production P1 (pick a principle involving the kind of quantity you're trying to solve for) is certainly not good enough.
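
The interpretive matching of Section 1.2 can be made concrete with a small sketch. It is not ABLE's OPS code: the dictionary encodings `PRINCIPLE` and `PROBLEM`, the functions `match_principle` and `instantiate`, and the three outcome labels are all hypothetical, chosen only to mirror the behavior the text attributes to productions P4 through P7. The sketch assumes a principle is stored as typed physical entities, relations among them, and quantities attached to those entities, and that the goal quantity (here the pressure drop) need not already exist in the problem.

```python
from itertools import permutations

# Hypothetical declarative encoding of the principle dp = rho * g * h (cf. Figure 2).
PRINCIPLE = {
    "physical": {"liquid": "liquid", "A": "point", "B": "point"},
    "relations": [("in", "A", "liquid"), ("in", "B", "liquid")],
    # quantity name -> (type, physical entities it is attached to)
    "quantities": {"rho": ("density", ("liquid",)),
                   "h": ("height", ("A", "B")),
                   "dp": ("pressure-drop", ("A", "B"))},
}

# Hypothetical immediate representation of the water side of the U-tube (cf. Figure 1b).
PROBLEM = {
    "physical": {"water": "liquid", "C": "point", "water-top": "point"},
    "relations": {("in", "C", "water"), ("in", "water-top", "water")},
    "quantities": {("density", ("water",)): "rho_w",
                   ("height", ("C", "water-top")): "2*l"},
}


def match_principle(principle, problem, goal):
    """Compare a principle setting with a problem setting.

    Returns ("succeeded", binding) when every part of the setting is matched,
    ("subgoal", quantity) when exactly one theoretical quantity is missing,
    and ("failed", None) otherwise.
    """
    names = list(principle["physical"])
    subgoal = None
    for combo in permutations(problem["physical"], len(names)):
        binding = dict(zip(names, combo))
        # Corresponding physical entities must be of the same type.
        if any(principle["physical"][n] != problem["physical"][binding[n]]
               for n in names):
            continue
        # Every relation in the principle setting must hold in the problem.
        if any((rel, binding[a], binding[b]) not in problem["relations"]
               for rel, a, b in principle["relations"]):
            continue
        unmatched = []
        for quantity, (qtype, attached) in principle["quantities"].items():
            key = (qtype, tuple(binding[e] for e in attached))
            if key in problem["quantities"]:
                binding[quantity] = problem["quantities"][key]
            elif quantity != goal:
                unmatched.append(quantity)
        if not unmatched:
            return ("succeeded", binding)
        if len(unmatched) == 1 and subgoal is None:
            subgoal = unmatched[0]
    return ("subgoal", subgoal) if subgoal else ("failed", None)


def instantiate(binding):
    """Interpret the statement of the principle in the setting of the problem."""
    return f"dp = {binding['rho']} * g * ({binding['h']})"


status, binding = match_principle(PRINCIPLE, PROBLEM, goal="dp")
```

With the height $2l$ present in the problem, the match succeeds and `instantiate(binding)` gives the expression corresponding to inference 4 of Table 1. If the height entry is removed from `PROBLEM["quantities"]`, the same call returns a subgoal for `h`, which is the situation the text describes for production P7. The exhaustive loop over permutations is one way to see why this kind of matching is costly.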

The following paragraphs describe ABLE's ability to develop better search mechanisms.

ABLE learns through two mechanisms, proceduralization and composition. The mechanisms implemented here are adapted from those described by Neves and Anderson (1981). Similar mechanisms have been implemented by (Newell, Shaw, & Simon, 1960; Lewis, 1978; Hayes and Simon, 1974).

Proceduralization is a mechanism through which declarative knowledge (e.g., the statement of a principle and a situation to which it applies) can, through application in an example, be converted to procedural knowledge of how to apply that principle to analogous situations. Composition is the collapsing or compiling of procedures so that they run more quickly and with reduced need for conscious monitoring (Hayes and Simon, 1974).

In ABLE productions P1 and P4 (Figure 3) contain the capacity for proceduralization. (These are the productions that use directly declarative knowledge of relations.) When each of these productions executes, it builds a copy of itself involving the specific declarative entities from the relation being applied. Thus for example, when P4 applies to the principle $\Delta p = \rho g h$, it may apply in the form:

    IF   the goal is to compare a principle to the current setting
         and there are corresponding theoretical entities in the principle and problem settings
         and these entities refer to two physical entities of the same type
    THEN mark the two physical entities as corresponding.

This production builds a copy of itself of the form:

    IF   the goal is to compare the principle $\Delta p = \rho g h$ to the current setting
         and the density of a liquid in the problem setting corresponds to the density in the principle setting
    THEN mark this liquid in the problem as corresponding to the liquid in the principle $\Delta p = \rho g h$.

This new production is part of the procedural knowledge needed to apply the principle $\Delta p = \rho g h$.

As Neves and Anderson (1981) point out, these proceduralized productions are almost always shorter (contain fewer conditions) than the original general productions that built them. Thus they may immediately provide the advantage of reducing working-memory load. However, their main importance is that they are the ingredients for building efficient productions that recognize useful configurations in a problem and relate these configurations to potentially useful principles.

The second mechanism of learning is composition. If two proceduralized productions (any of those built by P1 and P4) execute in sequence, they are combined to form a single production that does the work of both. This is done first by collecting the condition and action elements from both and deleting repetitions. (Further details are given by Neves and Anderson (1981).) For example, the proceduralized production above can be composed with a production built by P1 in Figure 3 to form the following:

    IF   the goal is to find an expression for $\Delta p$ of a fluid
         and there is a density for the fluid
    THEN set the goal to compare to the current situation the principle $\Delta p = \rho g h$
         with the correspondence between the pressure drops and the densities already established.

Productions of this form, indicated by P8 in Figure 3, short circuit the primitive search algorithm of P1. They do not, however, short circuit all the interpretive matching between the principle and problem settings. In the current case, ABLE must still match lines and points and heights. In human language the knowledge in a production like P8 might be expressed, "I have to relate pressure drops to density and I remember there was a principle about that, so let me check whether it applies." Although such knowledge does not replace the use of a general ability to apply a principle in a situation, a collection of this kind of knowledge provides a means of locating principles that are likely to prove useful. Furthermore, as proceduralization and composition proceed, they build productions with more information in their conditions. Thus the ability to recognize a configuration in which a certain principle will be useful becomes more and more accurate.

In the limit, if ABLE has worked many similar problems, it builds productions like the following:

    IF   the goal is to find or justify an expression for pressure drop
         and there is a density associated with a fluid
         and there are two points in the fluid separated by a height
    THEN there is a pressure drop between the two points, and its value is $\Delta p = \rho g h$.

At this point ABLE has at least one of the capabilities necessary for building what we have called a scientific representation for a problem. On encoding a problem involving appropriate heights and densities, ABLE immediately knows that the problem also involves related pressure drops, and can use these entities as readily as any in the immediate representations.

1.4. Settings for Learning

When can the learning involved in proceduralization and composition occur? Clearly solving problems is one such setting. However, as we shall see in our discussion of human learners, it seems likely that this learning also takes place during study of text material. The following is one mechanism through which this might occur.

Consider the textbook example in Table 1. Suppose the learner considers the various inferences (labeled 1-9) as statements to be understood or verified on the basis of previous knowledge. Then understanding the sentence involving inference 4 involves exactly the process discussed above, justifying the expression for the pressure drops by using the principle $\Delta p = \rho g h$. If such reason-giving is part of active study of scientific text, then through reading the learner should acquire some ability to recognize situations to which principles will apply.

These comments are not limited to the study of text examples. New principles themselves are presented in much the same manner. A setting is described, and a sequence of inferences is stated, ultimately leading to a statement of a new principle.

2. Human Learners

In previous work (Simon and Simon, 1978; Larkin, McDermott, Simon, and Simon, 1980a; Larkin, 1981b), we have compared the problem solving of true experts, individuals with extensive professional experience in physics, with that of novices, individuals whose experience is limited to the equivalent of less than one college level course. The performance of the expert subjects would correspond here to the ABLE model after all proceduralization and composition had occurred. This very ABLE model is essentially equivalent in performance to the models of expert subjects described in earlier papers. Proceduralization and composition have produced a collection of sensitive productions that can recognize a configuration of knowledge in a problem situation, and make an immediate inference based on an appropriate principle.
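
The two learning mechanisms described in Section 1.3, proceduralization and composition, can be sketched in a few lines. This is an illustration under stated assumptions, not the OPS implementation: the `Production` class, the tuple patterns with `?`-prefixed variables, and the functions `proceduralize`, `compose` and `fire` are hypothetical. Two choices go beyond what the text states and follow the usual reading of these mechanisms: proceduralization drops any condition that is satisfied by the declarative knowledge being built in, and composition drops conditions of the second production that the first production's actions already establish.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Production:
    name: str
    conditions: tuple  # patterns that must be present in working memory
    actions: tuple     # elements added to working memory when the production fires


def substitute(pattern, binding):
    """Replace variables such as "?p" with the specific entities bound to them."""
    return tuple(binding.get(part, part) for part in pattern)


def proceduralize(general, binding, declarative, name):
    """Build a specific copy of a general production.

    Conditions that only test declarative knowledge are dropped, because that
    knowledge is now built into the new production (an assumption of this sketch).
    """
    conditions = tuple(c for c in (substitute(c, binding) for c in general.conditions)
                       if c not in declarative)
    actions = tuple(substitute(a, binding) for a in general.actions)
    return Production(name, conditions, actions)


def dedupe(items):
    """Delete repetitions while keeping the original order."""
    return tuple(dict.fromkeys(items))


def compose(first, second, name):
    """Combine two productions that fired in sequence into one."""
    carried = tuple(c for c in second.conditions if c not in first.actions)
    return Production(name,
                      dedupe(first.conditions + carried),
                      dedupe(first.actions + second.actions))


def fire(production, memory):
    """Return working memory after one recognize-act cycle with this production."""
    if all(c in memory for c in production.conditions):
        return memory | set(production.actions)
    return memory


LAW = "dp=rho*g*h"
declarative = {("principle", LAW, "involves", "pressure-drop"),
               ("principle", LAW, "has", "density")}

p1_general = Production(
    "P1",
    (("goal", "find", "?q"), ("principle", "?p", "involves", "?q")),
    (("goal", "compare", "?p"),))
p4_general = Production(
    "P4",
    (("goal", "compare", "?p"), ("principle", "?p", "has", "?kind"),
     ("problem", "has", "?kind")),
    (("corresponds", "?p", "?kind"),))

p1_specific = proceduralize(p1_general, {"?q": "pressure-drop", "?p": LAW},
                            declarative, "P1-dp")
p4_specific = proceduralize(p4_general, {"?p": LAW, "?kind": "density"},
                            declarative, "P4-dp")
p8 = compose(p1_specific, p4_specific, "P8")
```

Here `p8` has two conditions (a goal of finding a pressure drop, and a density in the problem) and two actions (set the goal of comparing the principle, and record the density correspondence), which parallels the composed production quoted in Section 1.3. Both proceduralized productions carry fewer conditions than the general ones that built them.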
The human solvers considered here are all novices with respect to
the physics material; they know varying amounts of general physics,

but none had previously studied fluid statics. The question is then to what extent do human learners, confronted with a novel section of physics text and associated problems, perform like the ABLE system? The answer is that in interesting ways human learners are more and less able. In this discussion we shall focus on the performance of four subjects, two who look very much like the ABLE system and two who look very different. These data are preliminary and the support for features of the ABLE model is suggestive rather than conclusive.

In individual sessions each subject was asked to talk aloud as much as possible while reading a six-page discussion of fluid statics from a physics textbook (Halliday and Resnick, 1970) and working three associated problems. Subjects were given one and one-half hours for the task, and were encouraged to work just as if they were completing an assignment for a science course.

Table 2 summarizes the characteristics of subjects we consider more and less able learners, characteristics we discuss in the following paragraphs.

Table 2: Characteristics of more and less able learners.

                      More Able                           Less Able
    Reading:          Aloud, many comments                Often silently, usually few comments
                      Reading precedes problem solving    Problem solving before all text read
                      Slowly (e.g., 19 min, 20 min)       Rapidly (e.g., 7 min, 5 min)
    Problem solving:  Correct or factor of 2 error        Other errors
                      No search                           Search common
                      Order of principles like ABLE       Order of principles means-ends

2.1. More Able Learners

Reading

We have suggested in the preceding section that careful active reading of text may be an important setting for acquiring the partially proceduralized and composed productions that aid in locating useful principles. Indeed the two more able subjects used a great deal of effort in processing the text. First, both subjects began their work by reading the text completely from beginning to end, although both glanced at the problems before beginning to read.

Second, these learners show consistent evidence that they are processing what they read carefully and conscientiously. The two subjects interrupted their reading 53 and 70 times respectively to make comments on what they were reading. Most of these comments (75%, 81%) suggest that the subject is relating what is being read to previous knowledge ("Ok, intuition would tell you that"; "...which is analogous to just the weight of something in mechanics"), or are expressions of understanding (e.g., "Ok, that's easy enough").

Problem Solving

We consider here performance on the following problem, which is very analogous to the worked example in the text (Table 1).

    A simple U-tube contains mercury. When 13.6 cm of water is poured into the right arm, how high does the mercury rise in the left arm from its initial position?

The text provides the densities of water ($1.0 \times 10^{3}\ \mathrm{kg/m^3}$) and mercury ($1.36 \times 10^{4}\ \mathrm{kg/m^3}$).

The two subjects solved this problem without any search through the text. They readily recalled the relevance of the U-tube example, found the right page in the text, and based their solutions on inferences made in that example. Thus before beginning the problem solution, these subjects had some internalized knowledge of useful principles to apply to a U-tube setting.

The order in which principles are applied is similar to that generated by the ABLE system, and differs slightly from the order presented in the original example in that information is generated in a forward working manner so that information is always available at the time it is needed. These data are presented in Table 3. Part (f) of Table 3 shows the nine inferences, labeled 1-9 in Table 1, used by the textbook in solving the analogous problem stated in Table 1. The remaining solutions are for the problem solved by the human solvers.

Table 3: Order of major steps in solution to the U-tube problem by (a) ABLE, (b) and (c) more able human solvers, (d) and (e) less able human solvers, (f) text example.
[Table 3 body: sequences of equations for each solver; not legible in this copy.]

The two subjects considered here both solved this problem correctly. All of the subjects we consider to be more able either solved this problem correctly or made the simple error of solving for the height of the original fluid (mercury) above the point C (Figure 1a), rather than solving for half that distance, the distance the mercury rose.

2.2. Less Able Learners

Reading

Unlike ABLE and the more able solvers, the less able solvers skimmed through the text rapidly. The two considered here

complained that reading aloud interfered with understanding and symbol for distance However, as illustrated by subject S4, the
were permitted to read the text silently, which they did rapidly (see errors can be far more exotic. However, all errors are produced by
Table 2), These subjects also both stopped reading and began copying s o m e equation from the problem, substituting for the
solving the /irst problem as soon as they encountered material symbols in. that equation values that correspond in type, and then
relevant to it. solving for a symbol for distance.
Problem Solving
Unlike the more able subjects, the less able subjects do search
through the text tor appropriate relations. As Table 3 shows, S 3 and 3. Conclusion
S 4 each tried two different principles. In all cases the selection w a s As w e have noted elsewhere (Larkin, 1981a, Simon and Simon,
preceded by an episode of searching the text material. 1978) individuals trained in physics seem to work with mental models
As shown in Table 3 the order in which principles are applied by, that are different than those used by less trained individuals. In
the less able subjects is very different from that produced by the particular, skilled individuals re-represent the problems in terms of
more able subjects, by the A B L E system, and in the analogous text technical entities (e.g., pressure drops) that have no special
example. The procedure of these subjects seems to be the meaning outside the discipline of physics. Here w e suggest that the
following: (1) search through the text for an equation that involves general learning mechanisms of proceduralization and composition
distances (presumably because a distance is the quantity to be provide s o m e explanation of how this ability to re-represent
found). This equation may be the equation resulting from the U tube problems might be acquired.
example (subject S3 in Table 3) but it may not be (subject S4). (2)
Substitute values for quantities appearing in this equation, using as a
criterion for substitution merely that the value substituted must be of
the same type as the symbol for which it is substituted (i.e., a height
for a height, a density for a density). Indeed, even this simple
constraint is sometimes violated when subjects fail to distinguish
between pressure p and density ρ. (3) If after this substitution all
values in the quantity have been used, and there is in the equation a
quantity of the appropriate type (here distance) left over, then solve
the equation if possible. (4) If it is impossible to fit all the
information in the problem into the equation, then abandon it and get a
new one. (5) If there remain in the equation symbols not assigned
values, then search either for an expression involving this symbol, or
for some "standard" value for this symbol (e.g., atmospheric pressure).

This procedure is very different from that executed by ABLE. However,
it is not hard to see how such performance might be produced by ABLE
through appropriate deletion of strategic knowledge. First, one would
have to use the initial ABLE system, before it had built any procedural
knowledge about applying principles. This absence corresponds to the
lack of processing of the text observed in these human solvers. Second,
one would have to remove from ABLE its strategic knowledge that a
principle can be applied only if all aspects of the setting of the
principle are matched against the setting of the problem (production P4
in Figure 3). This production would be replaced with one that would
allow use of a relation if all symbols in it could be matched by
quantities of the same type in the problem (i.e., a length for a
length). This uncritical matching perhaps is associated with the less
able subjects' poor abilities for selecting useful principles. They have
to match a lot of principles, and so may do it in a less costly way,
even though this economy has devastating effects on their problem
solutions. With these changes the ABLE system could produce any of the
incorrect solutions we have observed in the less able subjects.

The result is a weak means-ends procedure of searching for relevant
principles observed elsewhere in novice solvers (Simon and Simon, 1978;
Larkin, McDermott, Simon, and Simon, 1980b). A first principle is
proposed because it contains a quantity of the type to be solved for.
Subsequent principles are proposed because they can be used to replace
in the original equation quantities without known values. Substitution
is based on the weak criterion that the two quantities must be of the
same type (e.g., two lengths, two

Because of their uncritical matching of principles to the problem
situation, the errors made by the less able subjects are varied and
exotic compared to the simple "sensible" error characteristic of the
more able subjects. The most common error is illustrated by subject S3
in Table 3. The equation from the U-tube example is used, the distance
136 cm is substituted for one of the distances in the equation, l and d,
and the equation is solved for the remaining

(2) Our prototype ABLE system acquires a principle in declarative form,
as a student might by reading a chapter. This initial encoding does not
itself include any information about how or where to apply the
principle. Thus initial applications of the principle are interpretive:
achieved through general procedural knowledge about how to apply any
principle or definition. Through such application ABLE builds new
specific procedural knowledge associated with the principle. Initially
fragments of procedural knowledge aid in the search process. They
contain patterns of information that have been used with that principle
in the past, thus short-circuiting ABLE's original general and weak
method of selecting principles. Ultimately a principle can be completely
proceduralized (for a set of analogous contexts) so that application is
completely automatic. This final automatic knowledge may well be an
ingredient of what one would want to call an expert's mental model in
which technical entities (e.g., pressure drops) are seen as readily as
visible entities like heights.

References

Clement, J. Students' preconceptions in introductory mechanics.
    American Journal of Physics, 1981, in press.

Forgy, C.L. Implementing a Fast Syntactic Pattern Matcher. Technical
    Report, Computer Science Department, Carnegie-Mellon University,
    August 1979.

Forgy, C.E. Preliminary OPS5 Manual. Technical Report, Computer Science
    Department, Carnegie-Mellon University, 1980.

Green, B., McCloskey, M., Caramazza, A. Curvilinear Motion in the
    Absence of External Forces: Naive Beliefs About the Motion of
    Objects. Science, December 1980, 210(5), 1139-1141.

Halliday, D. and Resnick, R. Fundamentals of Physics. New York: John
    Wiley & Sons, 1970.

Hayes, J.R., and Simon, H.A. Understanding written task instructions.
    In Gregg, L.W. (Ed.), Knowledge and Cognition. Hillsdale, NJ:
    Lawrence Erlbaum Assoc., 1974.

Larkin, J.H., McDermott, J., Simon, D.P., Simon, H.A. Models of
    competence in solving physics problems. Cognitive Science, 1980, 4,
    317-345.

Larkin, J.H., McDermott, J., Simon, D.P., and Simon, H.A. Expert and
    novice performance in solving physics problems. Science, June 1980,
    208, 1335-1342.

Larkin, J.H. The Role of Problem Representation in Physics. C.I.P. 429,
    Carnegie-Mellon University, 1981. Presented at a Conference on
    Mental Models, La Jolla, CA, October 1980.

Larkin, J.H. Enriching formal knowledge: A model for learning to solve
    problems in physics. In J.R. Anderson (Ed.), Cognitive skills and
    their acquisition. Hillsdale, NJ: Lawrence Erlbaum Associates,
    Inc., 1981.

Lewis, C.H. Production system models of practice effects. PhD thesis,
    University of Michigan, 1978.

McDermott, J. Some strengths of production system architectures. In
    NATO A.S.I. proceedings: Structural/process theories of complex
    human behavior. The Netherlands: A.W. Sijthoff, International
    Publishing Company, 1978.

Neves, D. and Anderson, J.R. Knowledge Compilation: Mechanisms for the
    Automatization of Cognitive Skills. In Anderson, J.R. (Ed.),
    Cognitive Skills and Their Acquisition. Hillsdale, NJ: Lawrence
    Erlbaum Associates, Inc., 1981.

Newell, A. and Simon, H.A. Human Problem Solving. Englewood Cliffs, NJ:
    Prentice Hall, Inc., 1972.

Newell, A., Shaw, J.C., & Simon, H.A. A variety of intelligent learning
    in a general problem solver. In Yovits, M.C. & Cameron, S. (Eds.),
    International Tracts in Computer Science and Technology and Their
    Application. Pergamon Press, 1960.

Newell, A. Production systems: Models of control structures. In Chase,
    W.G. (Ed.), Visual information processing. New York: Academic
    Press, 1973.

Simon, D.P. and Simon, H.A. Individual differences in solving physics
    problems. In Siegler, R. (Ed.), Children's thinking: What develops?
    Hillsdale, NJ: Lawrence Erlbaum Associates, 1978.

Interestingness and Memory for Stories

John B. Black, Steven P. Shwartz

and Wendy G. Lehnert

Yale University

Two experiments investigated the effects of the number of interesting
events in a story on memory for that story. The results differed
depending on whether or not the story followed a stereotypical script.
For script-based stories, adding interesting events spaced throughout
the stories decreased the stories' memorability. However, for
non-script-based stories, adding a few interesting events aided memory
for the stories, but adding too many decreased memory. These results
were interpreted in terms of a limited resource model of story
understanding that focusses processing on interesting events.

One of the major problem areas for theories of language understanding is
how to determine which inferences are important to make while
understanding a story and which inferences are not important to make.
When inferencing is left uncontrolled, a combinatorial explosion of
inferences is often the result.

Schank and Abelson (1977) proposed a group of higher level knowledge
structures (scripts, plans, goals and themes) that constrain inferencing
to only the most relevant paths, but Schank (1979) concluded that even
more constraint was needed. In particular, he proposed that
interestingness be used to guide inferencing: i.e., that the inferencing
process should concentrate on the most interesting events in a
narrative. Schank argued that an interestingness value could be assigned
to events in narratives using the criteria that some events are
inherently interesting (e.g., sex, death and violence are always
interesting) and other events are interesting in certain contexts (e.g.,
taking off one's clothes in public is interesting but not in private).
These interestingness values can then be used to focus the inferencing
processes on only the most interesting events in a narrative.

Lehnert (1979) extended this analysis by adding the notion of resource
limitations and making predictions for the recall of stories. One of the
predictions was that adding interesting events to a story will improve
its memorability because they provide markers around which events in the
story can be organized. However, because processing resources are
limited, adding interesting events to a story will only aid memory up to
a certain point. If a story contains too many interesting events then
there will be too many high priority events competing for the limited
processing resources, so this "overload" condition should also be
detrimental to recall of the story. We conducted two experiments to test
these predictions.

Number of Interesting Events Experiment

In this experiment we explicitly tested the prediction that as the
number of interesting events in a story increased, the memory for the
story would first be aided and then be hurt. In particular, we designed
four stories of 20 events each such that up to nine unrelated but
inherently interesting events could be substituted for uninteresting
events with minimal change in the sentences. For example, one story
about a visit to a library contained the following two uninteresting
events:

    On her way Mary passed a friend
    and remembered she was in her class.

The first sentence here can be made interesting by merely substituting
"prostitute" for "friend."

Two of these four stories were based on stereotypical scripts (visiting
the library and eating at a restaurant) while the other two were not
(walking in the park and visiting a private club). We wrote four
versions of each of these stories. These versions differed only in the
number of interesting events. One version had 0, another had 3, a third
had 6 and the final one 9. These interesting events were distributed
evenly throughout each story version. Our main interest was in how this
number of interesting events would affect the subjects' recall of the
stories. We gave the four stories (one in each version) to 32 subjects
to read (8 subjects saw a given version of a given story), then after a
15 minute intervening task the subjects were given the story titles and
asked to recall as much as they could from the stories. These recall
protocols were then scored for the gist of the events in the stories.

Table 1 gives the percentage of the story statements recalled for each
version of the stories.

                            Table 1
    Percent of Story Statements Recalled in First Experiment

    Type of Story        Number of Interesting Events
                           0       3       6       9

    Script                65      51      52      56
    Non-script            53      59      49      42

The results were quite different for the different kinds of story. In
particular, the non-script stories followed the predicted pattern: i.e.,
adding 3 interesting events improved recall but adding more hurt recall.
However, recall of the script stories was best with 0 interesting
events, second best with 9, and worse with the intermediate values of 3
and 6.
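
The counterbalancing reported for this experiment (four stories, four
versions, 32 subjects, 8 subjects per story-version cell) is consistent
with a Latin-square rotation. The paper does not say how subjects were
actually assigned, so the Python sketch below is only one scheme that
satisfies those counts. The story labels, the function names `assign`
and `cell_counts`, and the rotation rule itself are illustrative
assumptions.

```python
from collections import Counter

# Labels are hypothetical shorthand for the four stories named in the text.
STORIES = ["library", "restaurant", "park", "private club"]
VERSIONS = [0, 3, 6, 9]  # number of interesting events in each version


def assign(n_subjects=32):
    """Return one dict per subject mapping story -> version.

    Assumed scheme: rotate the version list by (subject index mod 4), so
    every subject reads all four stories, each in a different version.
    """
    plan = []
    for subject in range(n_subjects):
        shift = subject % len(VERSIONS)
        plan.append({
            story: VERSIONS[(i + shift) % len(VERSIONS)]
            for i, story in enumerate(STORIES)
        })
    return plan


def cell_counts(plan):
    """Count how many subjects saw each (story, version) combination."""
    return Counter(
        (story, version)
        for subject_plan in plan
        for story, version in subject_plan.items()
    )


if __name__ == "__main__":
    counts = cell_counts(assign())
    for (story, version), n in sorted(counts.items()):
        print(f"{story:13s} version {version}: {n} subjects")
```

Under this rotation each of the 16 story-version cells receives exactly
8 of the 32 subjects, which matches the cell size stated in the text.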

Our interpretation of these results is that in the script-based stories
there is already a story organization (namely, the script), so adding
interesting events is harmful to recall because the existing
organization is disrupted. In the 0 interesting events condition,
subjects can reconstruct a script-based story during recall by merely
remembering that it was a typical implementation of the script (e.g., a
typical library story). However, adding interesting events makes
straight reconstruction no longer possible. The slight upturn at 9
interesting events might occur because by this time the subjects have
completely abandoned trying to use the script as a reconstruction aid
and merely focused on the interesting events as memory aids.

With the non-script stories, on the other hand, the results were as
predicted because there was no strong preexisting organization in the
interesting events condition. Specifically, adding 3 interesting events
aided recall, but the 6 and 9 events conditions led to worse recall
because of an overabundance of interesting events.

Massed and Spaced Interesting Events Experiment

In this experiment we explicitly tested the prediction from Lehnert's
limited resource model that it is the density of interesting events
rather than the absolute number that determines recall. In the first
experiment, the number of interesting events in the story was confounded
with the density of interesting events. In this experiment we used the
same basic stories but had only the 0 and 3 interesting events
conditions. Now, however, we varied density independently of number by
having the interesting events either be located adjacent to one another
(the massed condition) or be evenly spaced throughout the story as
before (the spaced condition). The procedure was the same as in the
first experiment with 24 subjects this time (but still 8 per group).

Table 2 gives the percentage of the story statements recalled in the
versions of the stories.

                            Table 2
    Percent of Story Statements Recalled in Second Experiment

    Type of Story        Number of Interesting Events
                           0     3 Spaced   3 Massed

    Script                55       58         64
    Non-script            50       55         46

As with the first experiment, the results were quite different for the
two types of stories. With the script stories, there was essentially no
difference between the 0- and 3-spaced conditions, but the 3-massed
condition led to better recall of the story. With the non-script
stories, on the other hand, the results confirmed the expectation that
the spaced interesting events would improve recall, while the massed
ones would decrease it.

Our interpretation of these results is that with the script-based
stories, having the disruptive interesting events massed allows the
reader to store them in memory as separate units, which is harder to do
if the interesting events are spaced throughout the story. Thus the
massed condition facilitates a memory representation like the one
proposed by Graesser, Gordon, and Sawyer (1979), namely, remembering the
story as a script plus a list of deviations.

With the non-script stories, on the other hand, such a representation is
not possible because there are no organizing scripts. In these stories,
the interesting events provide the organization for the story, and the
construction of this organization during reading is facilitated by
having the interesting events spaced throughout the story. If the
interesting events are massed, the limited resources available for
processing are "locally overloaded," so the memory for a story with
massed interesting events is even worse than one with no interesting
events.

Conclusions

The non-script stories conformed to our expectations that adding a few
interesting events spaced throughout a story would increase the story's
memorability, but that adding too many interesting events or having them
massed together would decrease the story's memorability. The results
were different, however, if the stories were highly stereotyped,
script-based stories. In particular, adding interesting events to
script-based stories disrupted the existing script organization and
hence led to less memorable stories unless these disruptive events were
massed together in the story. Thus a writer can "spice up" a text by
adding inherently interesting statements, but this procedure will
increase the reader's memory only if the text was poorly organized
originally.

References

Graesser, A., Gordon, S.E. and Sawyer, J.D. Recognition memory for
    typical and atypical actions in scripted activities: Test of a
    script pointer + tag hypothesis. Journal of Verbal Learning and
    Verbal Behavior, 1979, 18, 319-332.

Lehnert, W.G. Text processing effects and recall memory. Research
    Report no. 157, Department of Computer Science, Yale University,
    1979.

Schank, R.C. Interestingness: Controlling inferences. Artificial
    Intelligence, 1979, 12, 273-297.

Schank, R.C. and Abelson, R.P. Scripts, plans, goals and understanding.
    Hillsdale, NJ: Erlbaum, 1977.

This research was supported by a grant from the Sloan Foundation.

Use of Goal-Plan Knowledge
In Understanding Stories
Edward E. Smith & Allan M. Collins
Bolt Beranek and Newman Inc.

There seems to be a growing consensus among researchers that
understanding a story involves constructing a representation of the
characters' goals and plans (e.g., Rumelhart, 1977; Schank & Abelson,
1977; Wilensky, 1978). There is, however, relatively little experimental
evidence for people's on-line construction of such goal-plan
representations, and the purpose of the present experiment was to
provide such evidence.

In our paradigm, subjects are presented a five-line story, one line (or
sentence) at a time. Subjects press a button as soon as they understand
a sentence, and the sequence of reading times obtained with a particular
story is assumed to offer a line-by-line description of the process
whereby a representation is constructed for that story. The general idea
behind our study was to systematically delete mention of certain goals
and plans needed to understand the story, and to see if readers then
took longer at lines where they would have to infer these missing goals
and plans.

To get more specific, we have to deal with a story in detail. Consider
then the following story about an operation.

    1.  While John was laid off, his wife became seriously ill.
    2.  John needed money to pay for an operation.
    3a. He decided to borrow money from his Uncle Harry.
    3b. He would give his Uncle Harry a quick call.
    3c. He had to find out Uncle Harry's phone number.
    4.  John reached for the most recent suburban directory.
    5.  John was overjoyed Uncle Harry agreed to the loan.

By our analysis, this story can be represented by four levels of
embedded goals and plans, where each level contains one goal and its
associated plan. At the first or top level would be John's goal of his
wife getting better (an inference from line 1) and his plan of getting
her an operation (inference from line 2). A precondition for the
"operation" plan, however, is that the person have money, and this
precondition becomes the goal at the second level (it is explicitly
stated in line 2), while the plan at the second level is to borrow money
(see line 3a). The "borrow" plan also has a precondition, namely being
in contact with the lender, and this becomes the goal at the third level
(see line 3b); the plan at this level is to phone the lender (see line
3b). Finally, the "phone" plan has as a precondition that one know the
telephone number, and this is the goal at the fourth level (see line
3c); the plan at this level is to use the directory (see line 4).

None of our subjects saw this full version of the operation story.
Rather they saw a version with two lines deleted. As an example, some
subjects read the story with lines 3a and 3b deleted. We expected these
subjects to be relatively slow in reading line 3c because there is a
sizeable gap between the goal mentioned in it and the goal and plan
given by the directly preceding line 2. That is, by deleting lines 3a
and 3b, we gapped a couple levels of goals and plans and our subjects
must fill this gap when reading line 3c, which should take time. When
these subjects get to line 4, however, they should read it relatively
quickly; for now there is no gap in the underlying goal-plan
representation that needs to be filled. In contrast to the case just
described, other subjects read the operation story with lines 3b and 3c
deleted. These subjects should read line 3a relatively quickly (there is
no gap to fill in the goal-plan representation), but read line 4
relatively slowly (two levels of goals and plans have been gapped, and
need to be filled at this point).
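
The gap manipulation described above can be made concrete with a small
Python sketch. It lists the eight goal and plan units of the operation
story in top-down order, records which units each story line expresses
(following the analysis in the text), and counts how many units are
skipped when a line is read after some lines have been deleted. The
counting rule, the unit labels, and the function name `gap_sizes` are
assumptions made for illustration; the paper speaks more loosely of "a
couple levels" or "two levels" being gapped and does not give an exact
scoring procedure. Line 5, the outcome, is left out because the analysis
assigns it no level.

```python
# Goal (G) and plan (P) units of the operation story, top level first.
UNITS = [
    "G1 wife gets better",
    "P1 get her an operation",
    "G2 have money",
    "P2 borrow money",
    "G3 be in contact with lender",
    "P3 phone lender",
    "G4 know phone number",
    "P4 use directory",
]

# Indices into UNITS expressed by each story line, per the text's analysis.
LINE_UNITS = {
    "1": [0],
    "2": [1, 2],
    "3a": [3],
    "3b": [4, 5],
    "3c": [6],
    "4": [7],
}
STORY_ORDER = ["1", "2", "3a", "3b", "3c", "4"]


def gap_sizes(deleted):
    """Return {line: number of skipped goal/plan units} for the shown lines.

    Assumed rule: the gap at a line is the number of units lying strictly
    between the deepest unit already expressed and the first unit that the
    current line expresses.
    """
    gaps = {}
    deepest_seen = -1
    for line in STORY_ORDER:
        if line in deleted:
            continue
        units = LINE_UNITS[line]
        gaps[line] = max(0, units[0] - deepest_seen - 1)
        deepest_seen = max(deepest_seen, units[-1])
    return gaps


if __name__ == "__main__":
    print(gap_sizes({"3a", "3b"}))
    print(gap_sizes({"3b", "3c"}))
```

With lines 3a and 3b deleted, this rule puts the whole gap at line 3c and
none at line 4; with lines 3b and 3c deleted, it puts no gap at line 3a
and the whole gap at line 4. That is the ordering of slow and fast lines
the text predicts, although the absolute unit counts depend on the
assumed rule.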

So, by varying which lines of the story are deleted, we can manipulate
the size of the gap that a reader must fill when reading a particular
story line. Since these gap-sizes are measured in units of underlying
goals and plans, a finding that reading time increases with gap size
would be evidence for the on-line construction of a goal-plan
representation.

With stories like the operation one, we have obtained the predicted
gap-size effect -- the time to read a line increases monotonically with
the number of goals and plans that have to be filled in at that point --
though its magnitude tends to be greater on some lines than others. We
were concerned, however, that the gap-size effect might only occur with
stories that explicitly emphasize characters' intentions, i.e., readers
might only construct goal-plan representations for stories that
explicitly use intentional constructions like "would give" and "decided
to". For this reason (and others) we rewrote our stories so that
sentences that were once intentional in tone now became actional. This
required changing the order of the rewritten sentences, since the
chronological order for actions is the reverse of that for intentions.
In the actional version of the operation story, for example, John first
reaches for the directory, then finds Uncle Harry's new number, then
calls him, and then asks him for a loan -- the opposite order of lines
3a, 3b, 3c, and 4 in the original story. Once having created these
actional versions of the stories we then varied which lines were
deleted, thereby varying the size of the goal-plan gap that the reader
had to fill. We again found that the time to read a sentence increased
with its associated gap size (though again the effect's magnitude
depended on the exact line being read). These results suggest that
people construct roughly the same goal-plan representation of a story
regardless of whether the story emphasizes the characters' actions or
intentions.

References

Rumelhart, D. E. Understanding and summarizing brief stories. In
    D. LaBerge & J. Samuels (Eds.), Basic processes in reading:
    Perception and comprehension. Hillsdale, N.J.: Lawrence Erlbaum
    Associates, 1977.

Schank, R., & Abelson, R. Scripts, plans, goals, and understanding.
    Hillsdale, N.J.: Lawrence Erlbaum Associates, 1977.

Wilensky, R. Why John married Mary: Understanding stories involving
    recurring goals. Cognitive Science, 1978, 2, 235-266.


Department of Psychology
University of Colorado, Boulder


Predictions were made about the types and the number of inferences to be
found in the verbal protocols of subjects reading difficult-to-understand
texts. The predictions were made on the basis of two alternative models
of inference generation, the backward bottom-up inference and the
forward bottom-up inference. It was also hypothesized that plan-goal and
script inferences would be stored longer in STM than coreference and
role identification inferences. This implies that plan-goal and script
inferences are more likely to be reported than coreference and role
identification inferences. The protocol analyses support the backward
bottom-up inference model and support the assumed length of storage in
STM of the different types of inferences.

1.0 Introduction

The study of the generation of inferences in text comprehension is
important because it is believed to be one of the main mechanisms by
which a cohesive text representation is built. While inferences are not
to be equated with expectations, it has not always been clear how
inferences interact with top-down processing.

Some researchers have proposed a forward bottom-up inference model. The
inferences generated at the input of a sentence are strictly determined
by the local information in the sentence and they are unconstrained by
the context (Rieger, 1975; Thorndyke, 1976). In Thorndyke's model
(1976), diverse forward bottom-up inferences are generated from each new
input sentence. Their number depends on the presentation rate of the
text, the difficulty of the text, the purpose in reading, etc. Some of
these inferences could then be compatible with an incoming sentence,
facilitating its comprehension.

Other researchers have proposed backward, constrained models (Haviland
and Clark, 1974; Kintsch and van Dijk, 1978; Wilensky, 1978). The
inferences generated from the input sentence are constrained by the
previous sentences. The inferences are produced specifically to
establish a coherence relation between the semantic representation of
the current sentence and the representation of the text. The type and
number of inferences generated are constrained by the context.

Consider a simplified illustration of Wilensky's algorithm to explain
events (1978). The inferences generated from an event are possible plans
for that event. Those plans are checked to see whether one of these
plans is a known plan, or is a plan for a known goal, or is a plan for a
known theme, or is an instrumental plan for a known plan. Constraints
are provided by already asserted or inferred themes, goals or plans, and
by the fact that actions must be explained relative to them.

Experimental evidence seems to be somewhat more in favor of a backward,
constrained generation of inferences. The results of Thorndyke's
experiment (1976) were proposed as support for a forward bottom-up
inference model. However, this experiment might suffer from an
identifiability problem. Using previously experimentally derived
inferences, a recognition test with those inferences and sentences of
the texts is given. It is found that the false alarm rate for compatible
inferences is greater than for incompatible inferences. However, such
predictions can be derived as well from a backward bottom-up inference
model as from a forward bottom-up inference model. A backward inference
model will predict that the compatible inferences, that is, those which
establish coherence, become part of the text representation and are
likely to be falsely recognized as text sentences (see Keenan, McKoon
and Kintsch, in Kintsch, 1974). Incompatible inferences, which do

not establish coherence, do not become part of the text representation
and are not likely to be falsely recognized.

In an experiment by Miller and Kintsch (1980), subjects have to produce
continuations for a sentence presented by itself, in a specific context
establishing clearly a main topic, or in a non-specific context
suggesting many topics. It is predicted and found that the subjects in
the no or non-specific context will rely on the local constraints
provided by the target sentence to produce these continuations. On the
other hand, it is predicted and found that subjects in the specific
context condition will produce continuations related to the main topic.
Any model of inference generation, backward or forward, will predict the
continuations to be determined by the local constraints in the no or
non-specific contexts. However, a forward inference model cannot account
for the fact that the continuations produced in the specific context
condition are all related to the main topic (constrained), while a
backward inference model predicts just that.

2.0 Method and Predictions

The study to be presented makes predictions over the types and the
number of inferences likely to be found in the verbal protocols of
subjects reading difficult-to-understand texts. These predictions are
based on the two alternative models of inference generation, the forward
and the backward models; they are based on assumed or known properties
of the memory storage of inferences; and they are based on the theory of
verbal reports (Ericsson and Simon, 1980).

Nine subjects were trained to make thinking-aloud and retrospective
reports on their inferences on twelve practice texts. It was emphasized
that they should not try to report all the inferences that they could
eventually make, but only report the inferences they were making
automatically, without effort. However, they were also instructed that
some texts might be harder than others and that it would be normal to
find the production of inferences more difficult in those cases, but
still to report them.

The texts used were "Paul's Outing" (Collins, Brown and Larkin, 1977),
"The Crowd", an adaptation of a text used by Bransford and Johnson
(1973), and "Noon, Downtown", a short humorous narrative. In all the
texts, the last sentence indicated clearly the topic of the text. The
text "The Crowd" is more descriptive than narrative.

For an inference to be reported, it must be stored in STM for a minimal
amount of time (Ericsson and Simon, 1980). Inferences likely to be
reported are inferences of plans and goals, and inferences of scripts
(Schank and Abelson, 1977). In the protocols, explicit inferences of
plans and goals would be like: "The driver went underneath the truck to
repair it." Explicit inferences of scripts would be like: "This is going
to be about traffic." or "He seems to go to a movie."

Inferences of plans and goals are likely to be reported because they are
the major coherence relations in the text representation of narratives.
They need to be stored long enough in STM (short term memory) to build
the cohesive text representation and they become a permanent part of the
text representation. Inferences of scripts are likely to be reported
because the scripts are used to interpret a certain number of sentences
and are the basis of the text representation.

Inferences not likely to be reported are coreference relations and role
identifications. Coreference relations are the determination of the
referent of a pronoun or a definite description. They are often made
automatically, on the basis of the structure of the text, without
requiring STM to store intermediate computations, and therefore are not
likely to be reported. Role identifications are the recognition that a
story character or object corresponds to a script character or object.
The recognition is made on the basis of functional or categorical
information provided by the text. Presumably, this is accomplished
through direct pattern-matching and therefore does not use STM to store
intermediate results and cannot be reported. However, if the inferred
coreference or role identification reveals itself to be inadequate, it
is likely that the reader will report the inadequacy and will report the
new assignment.

For each text, a set of inferences were a priori derived and predicted
to appear explicitly, either frequently or infrequently, in the
protocols.

It is predicted that the most frequent explicit inferences will be those
of plans and goals, and scripts. A related prediction is that the most
frequently implicit inferences will be those of coreferences and role
identifications. By implicit, it is meant that the protocol indicates
rather certainly that the inference was made but only indirectly. For
example, suppose two of the sentences were: "Paul plunked down $5 at the
window.... but he refused to take it." Suppose also that the protocol
contains something like: "Well, I couldn't understand why John wouldn't
take the change." It is clear that the reader made the coreference
relation between "he" and "John".

According to the backward bottom-up inference models, plan-goal
inferences should be reported only after the event they help to explain
is read, and in most cases, only one coherence relation should be
reported for that event. According to the forward bottom-up inference
models (Thorndyke, 1976), numerous inferences could be reported when a
sentence is read, potentially explaining an event in a future sentence.

The protocols were analysed in terms of the types and frequency of the
explicitly and implicitly reported inferences. A simple consistency
check was performed on the classification of the inferences using a
third of the protocols (9 out of 27). The consistency estimate was about
.95.

In every case of a reported coherence relation, it has been reported
after the event it explained, and in only one case has there been more
than one coherence relation proposed (there were two). This is
consistent with a backward bottom-up inference model.

Table 1 presents the types, predicted frequencies and observed
frequencies in the protocols of the a priori determined inferences.

The overall frequency of the predicted frequent inferences is .65, and
of the infrequent ones is .11.

As was predicted, plan-goal and script inferences are much more
frequently reported than coreference or role identification inferences.

Table 2 presents the frequencies of inferences explicitly generated,
implicitly generated and of inferences for which there is no indication
in the protocols that they have been made. The data are summed over the
two methods because while there is a slight tendency for more inferences
to be reported in the thinking-aloud reports, this is not true for all
stories.

Script and plan-goal inferences are more often explicit than implicit or
not mentioned, while coreference and role identification inferences are
most often implicit.

The location and the number of inferred coherence relations support a
backward bottom-up inference model. The greater frequency of reported
plan-goal and script inferences is consistent with their assumed length
of storage in STM and LTM. The relative infrequency of role
identification and coreference inferences supports their assumed
automaticity and lack of use of STM to store intermediate results.

The somewhat less good fit of the data to the predictions in "The Crowd"
might be due to the fact that "The Crowd" is more descriptive than
narrative and that the models for inference generation were devised for
narratives.

                            Table 1
            Observed frequencies of each instance
                 of the predicted inferences
                 summed over the two methods

NOON, DOWNTOWN (each on a total of 5)

Predicted frequent:
    p-g   cause  p-g   p-g   p-g   p-g   p-g
     5      3     4     1     4     2     3       M = .6

Predicted infrequent:
    None of the 6 instances of coreferences
    has been observed in any protocol             M = .0

PAUL'S OUTING (over a total of 6)
Predicted frequent:
p-g  p-g  p-g  p-g
5    5    6    5      M = .9
Predicted infrequent:
cor  cor  role  role  role  role  cause  role  cause
1    2    0     1     0     1     0      0     2      M = .1

THE CROWD (over a total of 6)
Predicted frequent:
scr  scr  p-g  scr  scr  cause  scr
1    3    4    2    4    1      3      M = .5
Predicted infrequent:
role  role  cor  role  cor  impl  impl
1     1     0    1     0    2     1      M = .2

p-g = plan-goal      role = role-identification
scr = script         impl = implicature
M = mean

Table 2
Types and frequencies of inferences as a function of stories and methods.

21 p-g    6(2) p-g    3(5) p-g
0 cor     12 cor

PAUL'S OUTING
20 p-g    3(2) p-g    1(0) p-g
3 cor     9 cor
2 role    12 role
1 cause

THE CROWD
EXPLICIT     IMPLICIT    NOT FOUND
16(15) p-g   7 role      11(12) p-g
5(4) role    7 cor
3 impl       1(0) p-g
1 cause
1(0) cor

I would like to thank Dr. Anders Ericsson for his useful advice and
encouragement.

References

Bransford, J. D., Johnson, M. K. Contextual Prerequisites for
  Understanding: Some Investigations of Comprehension and Recall.
  Journal of Verbal Learning and Verbal Behavior, 1972, 11,
Collins, A. M., Brown, J. S., Larkin, K. M. Inference in Text
  Understanding. Technical Report No. 40. Bolt, Beranek and Newman, 1977.
Ericsson, K. A., Simon, H. A. Verbal Reports as Data. Psychological
  Review, 1980, 87, 215-251.
Kintsch, W. The Representation of Meaning in Memory. Hillsdale, N.J.:
  Erlbaum, 1974.
Kintsch, W., van Dijk, T. A. Toward a Model of Text Comprehension and
  Production. Psychological Review, 1978, 85,
Miller, J. R., Kintsch, W. Knowledge-based Aspects of Prose Comprehension
  and Readability. Presented at the annual meeting of the American
  Educational Research Association, Boston, April 1980.
Rieger, C. Conceptual Memory and Inference. Conceptual Information
  Processing. Amsterdam: North-Holland, 1975.
Schank, R. C., Abelson, R. P. Scripts, Plans, Goals, and Understanding.
  Lawrence Erlbaum, 1977.
Thorndyke, P. W. The Role of Inferences in Discourse Comprehension.
  Journal of Verbal Learning and Verbal Behavior, 1976, 15, 437-446.
Wilensky, R. Understanding Goal-based Stories. Research Report No 140,
  Department of Computer Science, Yale University, 1978.

Self-Embedding is Not
A Linguistic Issue*

John M. Carroll
MIT Linguistics and Philosophy Dept.
and IBM Watson Research Center

Anyone familiar with language research over the last 25 years or so will
be no stranger to examples like these:

    The woman died.
    The woman the man met died.
    The woman the man the girl loved met died.

Such examples have an illustrious past, both in the formal study of
language and in the psychology of language. The self-embedding (SE)
property they display singularly places natural language syntax beyond
the generative capacity of finite devices (Chomsky, 1959).** The
interaction of SE with language comprehension entrains a remarkable
phenomenon: one (or zero) levels of SE cause no noticeable increment in
comprehension difficulty, but two (or more) levels are typically
associated with substantial impairment of comprehension (e.g., Miller
and Isard, 1964).

Explaining SE. This remarkable fact has maintained the study of SE as a
preeminent topic in psycholinguistics. A variety of accounts have been
produced. The earliest accounts tended to emphasize the extent to which
SE overtaxed memory resources for speakers and hearers (Miller and
Chomsky, 1963: 470ff.; Yngve, 1960). Later accounts tended to focus on
the kinds of operations that were invoked in the course of processing SE
structures (Miller and Isard, 1964; Bever, 1970). Significantly, though,
all of these accounts regarded SE as a general structural property --
whose expression in language was of special interest -- not as a property
unique to language ex hypothesi. Miller and Chomsky (1963: 484) write
"Self-embedding of such great theoretical significance ... that we should
certainly look for occurrences of it in non-linguistic contexts." Bever
(1970) cited as one of the strengths of his account of SE processing
problems that analogous principles seemed to explain phenomena in visual
perception.

Clearly though, there is another theoretical option. One could
hypothesize that SE phenomena are special to language. On such a view,
the natural explanatory mechanism for the interaction of SE with sentence
comprehension would be language-specific -- not some general property of
memory or perceptual process as suggested by Bever, Miller and Chomsky,
etc. This view is (often implicitly) adopted by many natural language
parsing theorists. A recent example is Fodor and Frazier (1980), who
suggest that the interaction of SE with sentence comprehension be
theoretically reconstructed as a parsing principle attaching an incoming
word or phrase into its surface structure using the smallest possible
number of new nonterminal nodes (Fodor and Frazier, 1980: 426-434). (See
Carroll, 1981; Ford, Bresnan, and Kaplan, 1981; and Wanner, 1980, for
further discussion of this model.)

The question of whether generalizations about SE phenomena are linguistic
or still more general is a question of fact. But it is worth emphasizing
that assumptions one way or the other lead immediately to empirical
consequences. For example, the assumption that SE is strictly linguistic
actually does some work in the analysis of Fodor and Frazier. The single
strong distinction Fodor and Frazier (1980) are able to draw between
their model and that of Wanner (1980) turns precisely on their
language-specific analysis of SE phenomena.

While the resolution of this issue may ultimately be decided on
theoretical grounds, I will focus chiefly on the presentation of data
whose prima facie analysis entails the view that the correct level of
generalization for addressing SE phenomena is something like "complex
sequences". I will discuss examples from film, dance, music, and social
interaction. Then, I will turn back to the theoretical level and outline
a proposal, following Bever's (1970) Double Function principle.

Film. An obvious candidate for SE analysis in film is the "flashback"
scene: an entire scene is embedded into another scene. In the cinema of
D.W. Griffith, for example, a scene (S) can consist of a long-shot (L)
followed by a series of detail-shots (D). A detail-shot may itself
include a scene. This has the effect of embedding a scene within a scene.
In Carroll (1980: 61-63) I have discussed the following grammar-fragment
as the basis of a formal analysis of this sort of structure:

    S ==> L + D*
    D ==> D' + S + D'

Several examples of this structure occur in Griffith's Broken Blossoms.
Richard Barthelmess, as the Chinaman, casts a misty-eyed look within a
detail-shot, and immediately there is a cut away to a flashback of his
youth. It would not have been possible for Griffith to cut away from
Barthelmess, show the flashback, and then proceed with the story-line
without first returning to Barthelmess. The return is obligatory; it is
part of the structure of (Griffith's) cinema. As subsequent examples will
show, this is the crucial property of sequences that affords SE. If a
sequential domain has a structure strong enough to afford obligatory
returns, it can have SE. (This, of course, is precisely analogous to
points made by Chomsky, 1959, with respect to language.)

Dance. A well-known example of dance within dance is in the Nutcracker
Suite, where several episodes of puppet-dance occur as part of the ballet
itself. More common, but also more subtle, are examples in which a dance
phrase (Lasher, 1981) is interrupted, whereupon a complete and distinct
other phrase is danced, then finally the original phrase runs to
completion.

One class of such examples occurs when a female dancer is lifted by a
male dancer in classical ballet: the female assumes a rigid posture while
the male carries her and continues moving; when she is returned to the
floor, the female continues with her own dance. Another class of examples
are cases in which a group of dancers assume rigid postures while a
subset, usually a couple or a single dancer, perform a dance. At the
completion of the embedded dance, the group resumes. These latter types
of examples have analogs in music, where they are easier to cite because
of the available notation.

Music. Music has an extremely articulated structure and affords several
classes of SE structures. One of these is the cadenza: a strong cadence
is interrupted by a solo and then completed. For example, the cadence
I-6/4 -- V7 -- I, closing a section of a violin concerto, can be
interrupted by a violin solo: I-6/4 -- V7 -- Solo -- I.

When the interruption amounts to a delay in the matrix structure, it is
often called "parenthesis" (Meyer, 1973). Meyer discusses an example from
Haydn's String Quartet in Eb Major, Opus 50 No. 3: a melodic patterning
of thirds Eb-G, F-Ab, implying G-Bb, and reinforced by harmonic and
rhythmic patterns, is interrupted by a four measure repeating a-b-a-b
structure: ". . the real melody is characterized by goal-directed motion:
but the parenthesis is static." (p. 241)

Social interaction. For some years now, Sacks (1973) and associates have
been developing a theory of the structure of conversational interactions
based on the "adjacency pair" unit. The paradigm example is
question/answer:

    A: Do you know what an adjacency pair is?
    B: No.

The implication of the "second pair part" entailed by the occurrence of
the "first pair part" is sufficient structure to support SE. Indeed,
Goffman (1979: 258-9) noticed such examples.

    A1: Can I borrow your hose?
    B2: Do you need it this very moment?
    A2: No.
    B1: Yes.

Instead of answering A's initial question (and completing an adjacency
pair structure), B initiates a second question-answer adjacency pair.
When this second pair is complete, there is a return to the first
structure. Goffman also noted an example of multiple SE (B is a trainman
in a station):

    A1: Have you got the time?
    B2: Standard or Daylight Saving?
    A3: What time are you running on?
    B3: Standard.
    A2: Standard then.
    B1: It's five o'clock.

Jefferson (n.d.) has shown however that such cases of multiple SE are
extremely rare, although noniterative SE is quite common.

Indeed, it is striking that, as in the case of language, multiple SE
structures simply do not obtain. One certainly has flashbacks, but never
flashbacks within flashbacks. When such structures are employed (as by
Alain Resnais in Je T'aime, Je T'aime), they are employed precisely in
order to confuse the viewer with respect to temporal sequence.
Analogously, one cannot imagine straightforward contexts in which a dance
phrase within a dance phrase within a dance
a right-hand
to indicate
of indexing
thatthea return
second isrule
the "prime"
This amounts
as a phrase acould
within appear
the theoretical
playthis Etc. in
In atheballet
a theme
analysis of One cannot
final embed
phenomena this
and a Itheme
paper suggest
want ato playa
a return
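The "obligatory return" property of the adjacency-pair examples above can be pictured as a stack discipline: a pair that is opened later must be completed before an earlier pair can be returned to. The Python sketch below is only an illustration under that assumption and is not part of the original analysis; the function name `nesting_depth` and the encoding of each turn by its pair number (A1, B2, A2, B1 becomes 1, 2, 2, 1) are hypothetical choices made for this example.

```python
from typing import List


def nesting_depth(pair_ids: List[int]) -> int:
    """Return the largest number of adjacency pairs open at the same time.

    Each integer names an adjacency pair: its first occurrence is the
    first pair part, its second occurrence is the obligatory return.
    """
    open_pairs: List[int] = []  # stack of pairs still awaiting their return
    deepest = 0
    for pair_id in pair_ids:
        if open_pairs and open_pairs[-1] == pair_id:
            # The most recently opened pair is completed (the return).
            open_pairs.pop()
        elif pair_id in open_pairs:
            # An outer pair is resumed before an inner pair has been closed.
            raise ValueError(f"pair {pair_id} returns before an inner pair is closed")
        else:
            # A new first pair part interrupts whatever is still open.
            open_pairs.append(pair_id)
            deepest = max(deepest, len(open_pairs))
    if open_pairs:
        raise ValueError(f"pairs left without a return: {open_pairs}")
    return deepest


QUESTION_ANSWER = [1, 1]           # A, B
HOSE = [1, 2, 2, 1]                # A1, B2, A2, B1
TRAINMAN = [1, 2, 3, 3, 2, 1]      # A1, B2, A3, B3, A2, B1

print(nesting_depth(QUESTION_ANSWER))  # 1
print(nesting_depth(HOSE))             # 2
print(nesting_depth(TRAINMAN))         # 3
```

On this encoding a depth of 1 means no embedding, the hose exchange has one embedded pair, and the trainman exchange is the multiple (iterated) case that the text describes as extremely rare.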
Double function. The double function principle can be put quite simply
as "The same stimulus cannot be perceived in two incompatible ways at the
same time." The prima facie force of the principle in visual perception
and language comprehension (Bever, 1970) and cinema perception (Carroll,
1980: 190-193), exclusive of SE phenomena, has been reviewed elsewhere.
As applied to SE, the line one would want to run is that the categories
"embedder" and "embeddee" are perceptually significant and that
accordingly an object of perception cannot simultaneously be both
embedder and embeddee. From this the usual facts regarding SE follow.

This remains an empirical hypothesis, of course, but the plausibility of
the premises for the analysis is considerable. In contrast, appeals to
memory (Miller and Chomsky, 1963; Yngve, 1960) are less compelling in
that people typically can memorize multiple SE sequences -- they just
can't understand them. Appeals to processing constraint are either too
vague to assess (Miller and Isard, 1964) or tantamount to the proposal
under discussion (Bever, 1970). Finally, the cross-modal evidence of SE
and the iterative SE constraint strongly indicate that language-specific
analyses are missing the true generalization.

Notes

*This work was facilitated by a fellowship from the National Science
Foundation and a sabbatical leave from the IBM Corporation. I am grateful
to Tom Bever, Jeff Coulter, Len Meyer, and Margot Lasher for discussion
and criticism.

**I will presuppose, but not review, the standard distinction between
"nesting" and "self-embedding" (Chomsky and Miller, 1963).

References

Bever, T.G. The cognitive basis for linguistic structures. In J.R. Hayes
  (Ed.) Cognition and language learning. New York: Wiley, 1970.
Carroll, J.M. Toward a structural psychology of cinema. The Hague:
  Mouton, 1980.
Carroll, J.M. On fallen horses racing past barns. In Papers from the
  parasession on language and behavior. Chicago: Chicago Linguistic
  Society, 1981.
Chomsky, N. On certain formal properties of grammars. Information and
  Control, 1959, 2, 137-167.
Chomsky, N. and Miller, G.A. Introduction to the formal analysis of
  natural languages. In Luce et al. (1963).
Fodor, J.D. and Frazier, L. Is the human sentence parser an ATN?
  Cognition, 1980, 8, 417-459.
Ford, M., Bresnan, J., and Kaplan, R. A competence-based theory of
  syntactic closure. In J. Bresnan (Ed.) The mental representation of
  grammatical relations. Cambridge: MIT Press, 1981.
Goffman, E. Replies and responses. Language in Society, 1979, 5,
Jefferson, G. Unpublished research, n.d.
Lasher, M.D. The cognitive representation of an event involving human
  motion. Cognitive Psychology, 1981.
Luce, R.D., Bush, R.R., and Galanter, E. (Eds.) Handbook of mathematical
  psychology. New York: Wiley, 1963.
Meyer, L.B. Explaining music. Berkeley: University of California Press,
  1973.
Miller, G.A. and Chomsky, N. Finitary models of language users. In Luce
  et al. (1963).
Miller, G.A. and Isard, S. Free recall of self-embedded English
  sentences. Information and Control, 1964, 7, 292-303.
Sacks, H. Lecture notes. Summer Institute of Linguistics, Ann Arbor,
  Michigan, 1973.
Wanner, E. The ATN and the sausage machine: Which one is baloney?
  Cognition, 1980, 8, 209-225.
Yngve, V.H. A model and an hypothesis for language structure.
  Proceedings of the American Philosophical Society, 1960, 104.

Center-Embedding Revisited

Michael B. Kac
Dept. of Linguistics
University of Minnesota
Minneapolis. MN 55455
The severe comprehension difficulty associated with certain
center-embedding constructions is perhaps the best known of
psychosyntactic phenomena. Most attempts at explanation have been
variations on a single theme--that the c.e. configuration leads to an
overload of short-term memory during processing. That this is not the
whole story can be seen from considering the fact, rarely noted, that
c.e. constructions exist which are understood quite easily, e.g.

(1) If either the Pope is Catholic or pigs have wings then Napoleon
    loves Josephine.

To this observation it might be replied that it is not c.e. per se that
causes difficulty, but rather MULTIPLE c.e.; thus, (1) would not be
expected to pose problems since it is embedded only to a depth of 1. But
the same is true of

(2) If if the Pope is Catholic then pigs have wings then Napoleon loves
    Josephine.

which is, at best, at the outer reaches of comprehensibility.

The facts regarding (1-2) can be accounted for by a few simple
assumptions. The first is that constituent recognition is carried out in
strict left-right fashion; assume further that in attempting to analyze
(1-2), the parser has at some point built a structure of the form

(3) If/either S1 then/or S2 then S3

and that at this point the following procedures are applied:

(4) a. When an if is encountered, open an S at that point; then locate
       the first then to the right of this if and close at the end of
       the S immediately following.
    b. When an either is encountered, open an S at that point; then
       locate the first or to the right of this either and close at the
       end of the S immediately following.

Applied to (1), (4) will operate in straightforward fashion; it will
close the S beginning with if directly after S3, and the S beginning with
either directly after S2. Applied to (2), however, it will misparse,
closing the S beginning with the first if prematurely after S2--a garden
path effect of a familiar kind. A parallel account can be given of the
difference between

(5) a. That for Harry to like Maxine would annoy Fred bothers me.
    b. That that Harry likes Maxine annoys Fred bothers me.

where both involve nesting of complementizer-verb dependencies, but where
(5b) is considerably more difficult to process than (5a). Assume that
complementizers are associated with the verbs that they mark as non-main
by the following procedure:

(6) a. Link each for and to to the first infinitive to its right.
    b. Link each that to the first finite verb to its right.

In (5a), that is linked to would while the for and to are both linked to
like, as desired; in (5b), however, there will be a garden path since the
first that will be erroneously linked to likes rather than annoys. Thus
(5a-b) and (1-2) are treated analogously in that errors of prematurity
are committed in the cases that pose comprehension difficulty but not in
the cases that don't.

The general idea embodied in the foregoing can be further extended.
Consider Object relative constructions like

(7) a. The rat that the cat chased squeaked.
    b. The rat that the cat that the dog bit chased squeaked.

Examples of this type are perhaps most familiar from discussions of c.e.
That (7b) should be more difficult to process than (7a) falls directly
out of general design features of a syntactic parser currently under
investigation called MULTIGAP ('multiple pass group-analyzing parser')
and is, moreover, attributed to a garden path effect much like the ones
hypothesized in the earlier cases discussed. In MULTIGAP, although simple
NP's (i.e. NP's without clausal modifiers) are identified early in
parsing, recognition of complex NP's is forestalled until after an
exploratory phase during which the parser builds a structural
representation called a PREANALYSIS, in which the sentence being parsed
is parcelled up into a sequence of units called BOUNDED GROUPS
(b-groups). A key notion here is that of the TRANSITION from one
predicate to the next (or from the last predicate to #), i.e. the
material that intervenes between the former and the latter. If conditions
are satisfied for construing the latter as subordinate to the former
(e.g. if there is an overt subordinator in the transition) then the
transition is of one type, labelled B'; otherwise, it is of a different
type, labelled B. Cross-cutting this dichotomy is a distinction between
STRONG and WEAK transitions. If the transition from a predicate of n
places to the next predicate, or to #, contains at least n-1 NP's, it is
strong, weak otherwise. In (7b), the transitions from bit to chased and
from chased to squeaked are both weak, while that from squeaked to # is
strong. Abbreviatorily, the four transition types are labelled s (strong
B), w (weak B), s' (strong B'), and w' (weak B'). The parser not only
types transitions according to this scheme, but also inserts into each a
special boundary marker, ], to delineate b-groups. If a transition is of
any type other than s, ] is positioned directly after the initial
predicate in the sequence under consideration, otherwise directly after
the last NP of the transition. (Recall that simple NP recognition has
already taken place.) The preanalysis of (7b) is thus

(8) The rat that the cat that the dog bit ]w1 chased ]w2 squeaked ] #

Once the preanalysis is set, the parser looks for possible opening points
of complex NP's--sequences of the form NP-SUB-V and NP-SUB-NP. (The full
MULTIGAP design is capable of dealing with cases where no overt
subordinator occurs, but discussion will be limited here to cases where
there is such a subordinator.) The opening is labelled as belonging to
type 1 (Subject relative) or type 2 (other) depending on which type of
opening sequence is found. Accordingly, the parser will find two type 2
openings in (8): the rat that the cat and the cat that the dog. Closure
is effected by a procedure which, if the opening is type 1, closes at the
first s to the right, but at the first w to the right of a type 2
opening. Thus in (7a), the complex NP will be correctly closed after
chased, but in (7b), there will be premature closure of the larger NP at
w1 rather than at w2, an error of prematurity exactly analogous to those
discussed earlier. Note, moreover, that c.e. of a type 1 relative in a
type 2 construction poses no problems:

(9) The mouse that the cat who chased the rat saw squeaked.

While (9) is not absolutely straightforward, it can be understood with a
little effort, which is not the case with (7b) even though (9) is of the
same depth of embedding. The difference is accounted for in this
treatment by the fact that the larger NP is closed off directly after saw
(i.e. after the one w) while the smaller one closes directly after
rat--at the first s. Because there is only one w, no error of prematurity
occurs.

Complex NP's actually come in two varieties, which we might call first
and second degree; in a first degree NP, there is only one predicate,
while in a second degree NP the main predicate of the construction has a
complement. The two types of NP are somewhat different in their behavior
in that while type 2 NP's of the first degree always end in
w-transitions, a type 2 NP of the second degree may end in either a w or
an s; thus compare

(10) a. The boy that Harry believes ]w' likes Maxine ]s1 saw Sue ]s2 #
     b. The boy that Harry believes ]s' Maxine likes ]w saw Sue ]s #

In (10a), the Subject NP ends at s1 while in (10b) it ends at w,
corresponding to the fact that in the former case the head NP is the
Subject of the complement clause while in the latter it is the Object of
that clause. The parser copes with this fact by being equipped with two
closure mechanisms, one for first degree complex NP's, and another for
second degree NP's. A first degree NP can be identified by checking to
see that no B' intervenes between the opening and the first B to the
right thereof; if such a B' is found, closure is forestalled. Once all
first degree NP's have been closed, the parser closes second degree NP's
by simply looking for the first available B to the right of any opening
for which no corresponding closure has been made. (Any B that has already
been swallowed up into a previously recognized NP is no longer available
for consideration.) It is a consequence of this feature of the parser
that a second degree complex NP with a c.e. first degree NP should be
more easily processed than one with a second degree NP, an expectation
that is evidently borne out: compare

(11) The boy who believes that the girl who kissed Harry likes Maxine
     saw Sue.

which, like (9), requires some effort, but is within bounds; however,

(12) The boy who believes that the girl who thinks that Harry likes
     Maxine saw Sue kissed Samantha.

evidently is not. In the case of (11), the inner NP is parsed first, then
the outer one, and there is no erroneous closure. In the case of (12), by
contrast, there are no first degree NP's, and the NP beginning with the
boy is closed prematurely, after Maxine.

Honesty compels mention of a perplexing case that this approach cannot
handle. If, in a c.e. type 2 relative of the first degree, one of the
NP's is replaced by a pronoun, comprehensibility seems to increase:

(13) The rat that the cat I saw chased squeaked.

I have no explanation to offer for this fact.

A general assumption underlies the treatment advanced here, as follows:
in an optimal parser, one seeks the simplest procedures that will apply
in error-free fashion to the simplest cases--cases in which a
construction of type x contains no other constructions of type x. Effects
of the sort observed here are then explained as being due to some of the
more complex cases being tractable in terms of maximally simple
procedures while others are not. Given the results described above, the
principle seems to have a measure of plausibility as at least a working
hypothesis.
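As a rough illustration of why the first-match procedures in (4) and (6) succeed on (1) and (5a) but commit errors of prematurity on (2) and (5b), the following Python sketch may help. It is not part of MULTIGAP or of the proposal itself: the function names, the flat symbol lists standing in for structure (3), and the hand-assigned tags (THAT, FOR, TO, FIN, INF, NP) are all assumptions made only for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical encodings of (1) and (2) in the format of structure (3).
EX_1 = ["if", "either", "S1", "or", "S2", "then", "S3"]
EX_2 = ["if", "if", "S1", "then", "S2", "then", "S3"]

# Hypothetical hand-tagged versions of (5a) and (5b).
EX_5A = [("That", "THAT"), ("for", "FOR"), ("Harry", "NP"), ("to", "TO"),
         ("like", "INF"), ("Maxine", "NP"), ("would", "FIN"),
         ("annoy", "INF"), ("Fred", "NP"), ("bothers", "FIN"), ("me", "NP")]
EX_5B = [("That", "THAT"), ("that", "THAT"), ("Harry", "NP"),
         ("likes", "FIN"), ("Maxine", "NP"), ("annoys", "FIN"),
         ("Fred", "NP"), ("bothers", "FIN"), ("me", "NP")]


def close_connectives(symbols: List[str]) -> Dict[int, int]:
    """Procedure (4): map the index of each 'if' / 'either' to the index
    of the S at whose end the clause it opens is closed."""
    partner = {"if": "then", "either": "or"}
    closures: Dict[int, int] = {}
    for i, sym in enumerate(symbols):
        if sym not in partner:
            continue
        # Locate the FIRST matching connective to the right ...
        for j in range(i + 1, len(symbols) - 1):
            if symbols[j] == partner[sym]:
                # ... and close at the S immediately following it.
                closures[i] = j + 1
                break
    return closures


def link_complementizers(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Procedure (6): link for/to to the first infinitive to the right and
    that to the first finite verb to the right."""
    wanted = {"FOR": "INF", "TO": "INF", "THAT": "FIN"}
    links: List[Tuple[str, str]] = []
    for i, (word, tag) in enumerate(tagged):
        target = wanted.get(tag)
        if target is None:
            continue
        for later_word, later_tag in tagged[i + 1:]:
            if later_tag == target:
                links.append((word, later_word))
                break
    return links


print(close_connectives(EX_1))      # {0: 6, 1: 4}: if closes after S3, either after S2
print(close_connectives(EX_2))      # {0: 4, 1: 4}: the first if closes too early, after S2
print(link_complementizers(EX_5A))  # [('That', 'would'), ('for', 'like'), ('to', 'like')]
print(link_complementizers(EX_5B))  # [('That', 'likes'), ('that', 'likes')]
```

In both failing cases the outer element grabs the nearest candidate that properly belongs to the inner element, which is the sense in which the two garden paths are treated analogously.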

The Comprehension of Focussed and
Non-Focussed Pronouns
Jeanette K. Gundel and Deborah A. Dahl
University of Minnesota
Gundel (1980) has shown that focussed pronouns like the underlined form
in (1) and non-focussed pronouns like the underlined form in (2) have
different communicative functions.

(1) Q. Who did they call?
    A. Pat said they called HER.
(2) Q. Has Pat been called yet?
    A. Pat said they called her TWICE.

The purpose of this paper is to investigate the possibility that these
different functions are associated with different psychological processes
underlying the comprehension of pronouns.

We will begin with a brief description of the linguistic differences
between focussed and non-focussed pronouns. The focussed pronoun in (1)
is a referring expression. Its function, like that of other referring
expressions, is to pick out, and call the addressee's attention to, some
entity in the discourse context. These differ from full NP's only in that
the entity is assumed to be identifiable on the basis of its presence in
the immediate linguistic or non-linguistic context. As with full NP's,
focus on a pronoun is obligatory when the pronoun is part of the comment
of a sentence (i.e. new information being asserted, questioned, etc.
about a topic), as in (1). Pronouns which are topics can also be
focussed, e.g. if there is a topic shift or contrast as in (3)-(5).

(3) I asked Bruce about it. HE said he didn't CARE.
(4) THEM, I don't LIKE.
(5) Q. Are Bill and Mary still here?
    A. HE went HOME, but SHE's in the other ROOM.

Non-focussed pronouns like the underlined form in (2) have no independent
referring function. They are always controlled by already established
discourse or sentence topics. From a communicative point of view, they
are almost completely redundant. Their function is thus primarily
syntactic. This distinction between focussed and non-focussed pronouns is
independent of whether the coreferential full NP is in the same sentence
or in a previous sentence in the discourse. Furthermore, as seen in (6)
and (7), both focussed and non-focussed pronouns can be
non-linguistically evoked. (Halliday and Hasan (1976) call this
exophoric.)

(6) (A sees B reading an application and says)
    Do you think we should ADMIT her? (non-focussed)
(7) (A hands an application to B and says)
    Do you think we should admit HER?

Although focussed and non-focussed pronouns have different lexical,
syntactic and semantic properties, linguistic theories of anaphora have
not generally distinguished the two. This is no doubt due partly to the
fact that most theories of pronominal anaphora are based on English,
which has identical forms for focussed and non-focussed pronouns and
where the latter differ from the former only in that they are unstressed
and have corresponding phonological reduction in casual speech. The one
exception is the 3rd person neuter singular it, which is always
non-focussed (see Linde (1979)). For example

(8) Q. Which do you want?
    A. *I'll take IT.
(9) *IT, I don't LIKE.

However, in many languages non-focussed pronouns differ lexically from
focussed ones. Some languages (e.g. Irish, Spanish and Polish) have long
form focussed pronouns and corresponding short form, usually clitic,
non-focussed pronouns. Compare the Polish examples in (10) and (11).

(10) Jan jest tutaj. Ja go widziałam. (non-focussed)
         is   here   I  him saw
     "Jan is here. I saw him."
(11) Q. Kogo widziałaś?    "Who did you see?"
        who  saw
     A. Ja JEGO widziałam. "I saw HIM" (focussed)
        I  him  saw

In most languages, the two sets of pronouns (commonly referred to as
non-emphatic and emphatic respectively) are related historically, the
non-focussed pronoun being a phonologically reduced version of the
focussed one. There are languages however which have totally unrelated
forms for focussed and non-focussed pronouns. For example, in Fijian the
3rd person singular non-focussed form is e and the corresponding focussed
form is koya. Finally, there are languages which allow, and in some cases
require, so-called zero anaphora (i.e. no form at all) in those cases
where English would have a non-focussed pronoun. The following Mandarin
example from Li and Thompson (1979) is an illustration.

(12) qu-le        shui  lai    "(He) brought the water."
     bring-aspect water

Focussed pronouns, however, cannot be omitted in these languages.

While the existence of zero NP-anaphora in languages like Spanish, for
example, has often been linked to the fact that such languages have
subject agreement marking on the verb, it is important to point out that
such agreement is neither a necessary nor a sufficient condition for zero
NP-anaphora. As can be seen from the example in (12), Mandarin allows
zero NP-anaphora even though it has no agreement marking on the verb.

In addition to the lexical differences discussed above, focussed pronouns
differ from non-focussed pronouns in their syntactic properties and in
conditions on coreference with other NP's. In English, non-focussed
pronouns are excluded from certain syntactic environments. For example, a
direct object must precede an indirect object if the direct object is a
non-focussed pronoun, as illustrated in (13) and (14).

(13) Q. Did you give the books to Tom?
     A. No. I gave MARY {the books / *them}.
(14) Q. Which books did you give to Mary?
     A. I gave Mary THEM.

As (13) shows, it is not non-focussed NP's in general, but only
non-focussed pronouns, which are excluded from final position in such
sentences.

On the other hand, some syntactic environments require non-focussed
pronouns, as illustrated in (15) and (16).

(15) My pocket has a hole in {it / *THAT}.

(16) The old dog still has a lot of life left in him
                                             *HIM
Focussed pronouns also differ from non-focussed pronouns in conditions
on coreference. For example, as noted in Akmajian and Jackendoff (1970),
only non-focussed pronouns can be coreferential with a following full NP
in the same sentence. Thus, John and he can be coreferential in (17) but
not in (18).
(17) After he woke up, John went to TOWN.
(18) After HE woke up, John went to town.
Coreference assignments also differ depending on whether the pronoun is
focussed or non-focussed in examples like (19) and (20).
(19) Mary called Alice and then SARAH called her.
(20) Mary called Alice and then Sarah called HER.

Since there are clear linguistic and functional differences between
focussed and non-focussed pronouns, we would now like to address the
issue of whether these differences are reflected in differences in
processing.

We suggested above that focussed pronouns can refer to any entity in the
immediate discourse context. Non-focussed pronouns, on the other hand,
can only be coreferential with a subset of these, namely established
topics. If this is true, one might expect that the processing of
non-focussed pronouns would be less complex since the set of available
entities is more restricted.

Theories of psychological processes underlying pronoun comprehension
have generally not distinguished between focussed and non-focussed
pronouns. Assumptions about pronoun processing can be classified into
two main categories. The first, which we will refer to as the Reference
Search Hypothesis, is stated explicitly in Clark and Clark (1977,
p. 78): "on finding a definite noun phrase, search memory for the entity
it was meant to refer to and replace the interpretation of the noun
phrase by a reference to the entity directly." A similar statement is
found in Clark and Sengul (1979): "When listeners encounter 'the woman'
or 'she' they are assumed to treat this as given information for which
they must find a referent. They then search memory for the unique entity
to which 'the woman' or 'she' was intended to refer." (Note that under
this hypothesis pronoun comprehension is not assumed to be essentially
different from comprehension of full NP's.) Caramazza and Gupta (1979)
also seem to implicitly accept the Reference Search Hypothesis when, in
discussing some of their stimulus sentences, they say: "the sentence
materials used in Experiment I could be expected to generate this chain
of events because the preposed subordinate clauses do not contain enough
information to guide the subject in the search for appropriate referents
to the anaphoric pronouns" (emphasis added).

The second major kind of pronoun processing theory may be referred to as
the Topic-Stability hypothesis. A pronoun, in this view, serves not as a
signal to the listener to initiate a memory search, but rather as a
signal to assign coreference relations between the pronoun and the
discourse topic. A statement consistent with this view, though not
specifically proposing a processing theory, can be found in Chafe
(1974): "if the explanation in terms of consciousness is correct, it is
misleading to speak as if the addressee needs to perform some operation
of recovery for given information. The point is rather that such
information is already on stage in the mind." Karmiloff-Smith (1980)
takes a similar view: "anaphoric pronominalization functions as an
implicit instruction for the addressee not to recompute for retrieval of
an antecedent referent, but rather to treat the pronoun as the default
case for the thematic subject of a span of dis[course]." She goes on to
say that deviations from the default (topic) case will be signalled by
the use of a full NP. As we have seen (e.g. example (3) above) such
deviations can be signalled as well by the use of focussed [pronouns].

If the two pronoun functions discussed above are associated with
differences in processing, it may be that both the Reference Search
Hypothesis and the Topic-Stability Hypothesis are correct. Reference
search may be used in processing focussed pronouns and topic-stability
may be used in processing non-focussed pronouns. Assuming that executing
a reference search is a more complex task than assignment of reference
to a discourse topic, this dual process hypothesis predicts that
processing focussed pronouns should be more difficult than processing
non-focussed pronouns. This hypothesis would also seem to predict that
no differences should be found in processing between sentences which
both have non-focussed pronouns. Previous work, however, has found such
differences. Caramazza and Gupta (1979), for example, found differences
in reaction time to naming the NP coreferential with a non-focussed
pronoun (it would have been unstressed had it been presented auditorily)
depending on pragmatic and syntactic factors. According to the
Topic-Stability Hypothesis, these differences should not have been
found. However, the prediction made by this hypothesis depends on the
fact that unstressed pronouns must refer to the discourse topic.
Sentences presented in isolation, as in Caramazza and Gupta's study, may
be ambiguous with respect to discourse topic. If more than one entity is
eligible to serve as the discourse topic, then a reference search might
be necessary even for an unstressed pronoun, although the number of
entities to be examined might be smaller than for a focussed pronoun.
Thus, these results do not seem to provide a serious counterexample to
our proposal.

The only previous study to directly address the question of
comprehension differences between focussed and non-focussed pronouns is
Maratsos (1973). He found that focussed pronouns of one type (that is,
the role-switch type seen in (20)) were more difficult for children to
comprehend than unstressed pronouns. He attributes this difference to
the operation of a role stability strategy that tells the child to try
to maintain the same actors in syntactic and semantic roles. Obviously
this interpretation resembles the Topic-Stability hypothesis, in that in
the Topic-Stability hypothesis the listener is maintaining the same
entity as a discourse topic. In the dual-process hypothesis, as well as
in Maratsos's hypothesis, stress is seen as signalling change. However,
the fact that some types of focussed pronouns do not signal grammatical
or semantic role change (such as the example in (7)) shows that
Maratsos's characterization is not quite accurate for a wider sample of
focussed pronouns than the ones he used in his experiment.

In light of the results that Maratsos obtained, we expected that the
processing of focussed pronouns, assumed to be of the reference search
type, would be more difficult than the processing of unfocussed
pronouns, assumed to be of the topic-stability type.

We conducted a pilot study to test the hypothesis that there are
different kinds of processing for stressed and unstressed pronouns. In
this study, 15 subjects listened to a set of 40 short discourses, 20
experimental and 20 filler. The experimental discourses occurred in two
forms, identical except for a biasing context that made a focussed or
non-focussed pronoun appropriate. In both forms of the discourse the
referent of the pronoun was the same. For example:
(21) A. Did Bill say who would be late?
     B. Yes, after I called him up, Bill said that HE would be late.
(22) A. Did Bill say whether he would be late?
     B. Yes, after I called him up, Bill said that he would be late.

After listening to the sentences, the subjects answered true or false to
a statement about a part of the sentence that had nothing to do with the
pronominal reference. After (21) or (22), for example, the subjects
might hear:
(23) True or false: Bill called me up.

On the assumption that a difficulty in pronoun processing would lead to
a general degradation in performance in comprehending all parts of the
sentence, we predicted that more errors would be made in the sentences
with focussed pronouns in them.

We found that subjects did not make significantly more errors in either
condition. We believe, however, that this result is due to the fact that
the task is simply too easy. We found that subjects performed at about
the 90% level of correctness for both focussed and non-focussed pronoun
discourses, and most of these errors were due to two stimulus discourses
that were difficult for other reasons. We also asked the subjects to
judge whether they felt the following statement was easy, medium, or
difficult to answer, and found no difference between focussed and
non-focussed conditions in this measurement either. We did find that
judgements of difficulty were not consistently related to performance.
Very often subjects thought that the statement was easy to confirm when
they gave the wrong answer.

A future experiment will provide a more sensitive test of this
hypothesis. This experiment will auditorily present subjects with
discourses containing pronouns of the types we are interested in, and
measure reaction time to naming the referent of the pronoun. This would
be similar to the procedure used in Caramazza and Gupta (1979), except
that they used a visual presentation.

If our hypothesis is correct, then reaction time to naming the referent
of a focussed pronoun should be longer than reaction time to naming the
referent of a non-focussed pronoun. If there is no difference in
reaction times, we would conclude that the reference search process is
used for all pronoun processing. Note that the Topic-Stability
hypothesis cannot be correct (at least for adults) for all pronouns,
since it can lead to the wrong referent's being selected for some
focussed pronouns, as illustrated in (19) and (20). On the other hand,
reference search could lead to the correct referent for all pronouns.

We have suggested that pronouns can be divided into two types on the
basis of their linguistic characteristics, and we have suggested that
people may be able to take advantage of the severe restrictions on what
the antecedent of an unstressed pronoun can be in processing. No
differences were detected in a pilot study, but an additional experiment
is proposed that will use a more sensitive test of processing difficulty
than the one used in the first

References

Akmajian, A., and Jackendoff, R. (1970). Coreferentiality and stress.
    Linguistic Inquiry, 1, 124-126.
Caramazza, A., and Gupta, S. (1979). The roles of topicalization,
    parallel function, and verb semantics in the interpretation of
    pronouns. Linguistics, 17, 497-518.
Chafe, W.L. (1974). Language and consciousness. Language, 50, 111-133.
Clark, H.H., and Clark, E.V. (1977). Psychology and Language. New York:
    Harcourt Brace Jovanovich, Inc.
Clark, H.H., and Sengul, C.J. (1979). In search of referents for nouns
    and pronouns. Memory and Cognition, 7, 35-41.
Gundel, J. (1980). Zero-NP anaphora in Russian: A case of
    topic-prominence. In Kreiman, J., and Ojeda, A.E., eds., Papers from
    the Parasession on Pronouns and Anaphora. Chicago: Chicago
    Linguistic Society, 139-146.
Halliday, M.A.K., and Hasan, R. (1976). Cohesion in English. London:
    Longman.
Karmiloff-Smith, A. (1980). Psychological processes underlying
    pronominalization and non-pronominalization in children's connected
    discourse. In Kreiman, J., and Ojeda, A.E., eds., Papers from the
    Parasession on Pronouns and Anaphora. Chicago: Chicago Linguistic
    Society, 231-250.
Li, C.N., and Thompson, S.A. (1979). Third-person pronouns and
    zero-anaphora in Chinese discourse. In Givon, T., ed., Syntax and
    Semantics 12: Discourse and Syntax. New York: Academic Press,
    311-335.
Linde, C. (1979). Focus of attention and the choice of pronouns in
    discourse. In Givon, T., ed., Syntax and Semantics 12: Discourse and
    Syntax. New York: Academic Press, 337-354.
Maratsos, M.P. (1973). The effects of stress on the understanding of
    pronominal co-reference in children. Journal of Psycholinguistic
    Research, 2, 1-8.

The Natural Natural Language Understander
Henry Hamburger, UCI and NSF
and
Stephen Crain, U of Texas

This study of natural language comprehension by natural understanding
systems (children) is based on a procedural analysis represented in the
form of a programming language. To clarify what is cognitively required
for a child to respond appropriately to certain expressions in English,
we show how these forms can be translated into procedures in a
high-level programming language. It is then possible to discuss two
kinds of difficulties a natural language form can present to the
listener: (i) incompatibility of the form with its associated procedure
and (ii) complexity of that procedure. An example of procedure
complexity is the nesting of loops, whereas a contributor to
incompatibility is a word or contiguous phrase that corresponds to
separated pieces of the procedure. We shall present evidence of both
types of difficulty from experiments with children and will compare the
predictions of our procedural view with those of a less detailed
syntactic explanation that has been advanced for a subset of the
phenomena.

Many cognitive tasks can be expressed either as a natural language
command or as a programming language procedure. In many tasks requiring
some elementary mathematical knowledge (e.g., counting), the cognitive
procedure appears to be substantially more complex than the syntactic
structure of the command. In such tasks, an explanation of children's
difficulties can be pursued more profitably in the realm of cognitive
procedure than of syntactic structure or parsing.

To take a particularly simple example, the verb 'count' has the capacity
to serve as either a transitive or an intransitive verb, but it is
probably never the first such verb encountered by a child; compare
'let's eat' and 'eat your cracker.' Therefore the transition from
intransitive use of 'count' to its transitive use involves no syntactic
innovation for a child. But despite being syntactically ordinary,
'count' presents complexities both in its procedure and in translation
to that procedure. Just ask the next three-year-old you see to count,
and then to count a few objects. Chances are good that you will get
errors on the transitive (latter) task but not on the intransitive one.
A look at these errors will give some perspective on what a correct
procedure must do.

One kind of attempt at transitive counting involves blithely swinging a
finger past the objects to be counted, while apparently counting
essentially intransitively, with no attempt to coordinate individual
objects to individual numbers. A somewhat more sophisticated performance
has the finger stopping at some or all of the objects, but in imperfect
coordination with the numbers. These misperformances, we believe, arise
and persist not out of ignorance of the lexical item 'count' or the
associated data structure of positive integers, nor from the syntax of
transitive verbs (which, it was suggested above, has already been
learned), but because of difficulty inherent in forming a correct
procedure for the task of transitive counting.

Consider procedures i and ii, intended to represent correct procedures
for the two kinds of counting, expressed at a coarse enough grain that
individual statements can be taken as having theoretical content. The
tokens are intended to be more or less self-explanatory. In the interest
of simplicity, no mention is made of initialization, stored data, or
output.

i   procedure: 'Count.'
      repeat                 The procedure runs
        next number          through successive numbers
      until fail             until there are no more.

ii  procedure: 'Count them.'
      repeat
        next object          At each cycle, the next
        next number          object and number are
      until fail             noted.

For i to be utilized in forming ii, it is necessary to insert new
material, specifically 'next object,' into the loop already present in
i. Thus one approach to accounting for the difficulty posed by the
transitive case is to hypothesize that breaking into a loop is a
difficult or low-priority move for acquisition or comprehension. A
related view is implicit in iii where it can be seen that a single word,
'count,' corresponds to two separate pieces of procedure. This
phenomenon can appropriately be called translational discontinuity in
the sense that it pertains to the relationship between two
representations. Since it arises in the translation from a (very)
high-level language (English) to a less-high-level language (of
procedures), it can also be called a compiling discontinuity. Note that
there is no discontinuity in the phrase structure of the language form
itself.

iii procedure: 'Count ...'
      repeat
        next number
      until fail

To take these ideas a step further, consider the phrase 'the second
green ball.' Roeper (1972) found that many children interpret this
phrase as if 'second' and 'green' each modify the noun in the same way,
that is, as if the phrase meant something like 'the ball which has the
properties of being both green and second.' Matthei (1978) expresses
this observation as a distinction in syntactic phrase structure,
assigning the adult interpretation the phrase structure '(second (green
ball))', and the non-adult interpretation '(second green ball).' The
idea of using syntactic phrase structure in this way to encode semantic
interpretations must rely on some assumptions about how semantics
relates to syntactic structure, for example, a compositional semantics
tied to the syntactic phrase marker. In any case, no such
phrase-structural distinction is possible for phrases like 'second
ball,' for which Matthei found substantial corresponding errors: 36% of
responses (four- and five-year-olds, mostly) interpreted 'second ball'
as 'the one that is both second and a ball,' as opposed to the 52% who
made the similar error on the example above that had an adjective in the
phrase.
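Procedures i and ii above can be rendered as a short Python sketch. This is only an illustration under stated assumptions: the function names, the `_FAIL` sentinel, and the use of Python iterators to stand in for the 'next' operators are ours, not part of the paper's notation, and the paper deliberately leaves initialization, stored data, and output unspecified.

```python
from typing import Iterable, List, Tuple

# Hypothetical sentinel: a 'next' operator had nothing left to deliver.
_FAIL = object()


def count(known_numbers: Iterable[int]) -> List[int]:
    """Procedure i, 'Count.': recite successive numbers until 'next' fails."""
    numbers = iter(known_numbers)
    recited = []
    while True:                        # repeat
        number = next(numbers, _FAIL)  #   next number
        if number is _FAIL:            # until fail
            return recited
        recited.append(number)


def count_them(objects: Iterable[str],
               known_numbers: Iterable[int]) -> List[Tuple[str, int]]:
    """Procedure ii, 'Count them.': note one object and one number per cycle."""
    objs, numbers = iter(objects), iter(known_numbers)
    noted = []
    while True:                        # repeat
        obj = next(objs, _FAIL)        #   next object (the inserted material)
        if obj is _FAIL:               # until fail
            return noted
        number = next(numbers, _FAIL)  #   next number
        if number is _FAIL:            # until fail
            return noted
        noted.append((obj, number))
```

The only difference between the two functions is the 'next object' step placed inside the existing loop, which is the insertion the text identifies as the likely source of difficulty.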
Pursuing the procedural approach with this construction, we now find not
only compiling discontinuity but also nested loops. These aspects of the
procedure appear both with and without the adjective, as can be seen
with reference to iv and v, again omitting initialization and any data
structures.

iv  procedure: 'second ball'
      repeat
        repeat
          next object
        until
          pred ball
        next number
      until
        pred two

v   procedure: 'second green ball'
      repeat
        repeat
          next object
        until
          pred green
          pred ball
        next number
      until
        pred two

The task environment is a display of several objects in a row, with a
left-to-right ordering clearly communicated in advance. The child is
asked to 'take the second ball' from a display of balls and boxes, or to
'take the second green ball' from a display of red and green balls. To
take the former case, suppose that the first object is a box. Then even
if the second object is a ball, so that it is both second and a ball, it
is still not the second ball. Procedure iv correctly makes this
distinction (which eludes so many children) by means of nested loops in
which successive objects are checked for ballhood in the inner loop and
counting proceeds in the outer only in case the current object is a
ball.

With an adjective present the appropriate procedure is v, which is like
iv except that the inner loop is exited only in the event that two
successive predicates are satisfied. Rather than introduce an 'and'
operator in v, we have allowed the possibility of multiple exit tests,
thereby making conjunction look simpler than disjunction. Even so, one
still should expect the extra predicate to add some processing burden
and indeed the phrase in v does lead to more errors than that in iv.
Furthermore the outer predicate should also impose a processing burden.
Suppose we remove it and interpret the absence of any exit test as
signifying 'repeat forever or until system breakdown (say by failure of
a 'next' operator).' What is then left of v is a procedure for 'count
the green balls,' which does indeed appear to cause children less
difficulty than 'second green ball.' (Matthei regarded the two as
cognitively equivalent.)

Not only do the words 'second' and 'ball' together translate into nested
loops, but they do so in a discontinuous manner. This is so because
'second' is responsible for the outer loop and 'ball' (in iv) is
responsible for the inner loop. Worse yet, the programming statement
'next object' is implicit in the counting loop, so that under an
absolutely left-to-right control structure, the state of (cognitive)
affairs after 'second' is processed would be as in vi. Processing the
remainder of the phrase requires, on this model, locating 'next object'
in the existing loop, using it ('next object') in constructing a new
loop, and finally nesting the new loop in the existing one.

vi  procedure: 'second ...'
      repeat
        next object


        next number
      until
        pred two

It is our view that these complexities are what make such phrases
difficult for children. Not only does this account indicate where
complexities lie, but in addition it provides an account of how specific
errors can arise from a straightforward attempt to avoid breaking into
loops and incurring compiling discontinuity. Suppose that a child simply
tacks on the appropriate test ('pred ball') at the end of vi to form
vii. This is possible if the convention introduced in v, above, is used
again here: allowing a sequence of 'pred' statements as a compound exit
test. This ploy yields precisely the observed incorrect result whenever
it is possible (when the second object is a ball), and an infinite loop
otherwise.

vii procedure: 'second ball'
    (incorrect interpretation)
      repeat
        next object
        next number
      until
        pred two
        pred ball

To gain perspective on the earlier examples and raise some new issues,
consider the phrase 'second biggest ball.' Here the ordering along which
'second' is to be counted is based not on position but on size. This
ordering must be at least partially determined by the subject, using an
appropriate algorithm. The usual sorting algorithms (ripple, bubble,
Shell, etc.) are probably poor models both because they are severely
nonparallel and because they provide a complete ordering where only a
partial one is needed.

In addition to some sorting, this task demands memory of some order
relationships that have been established by that sorting. These size
relationships in short-term memory must be used in concert with the
ordering of positive integers held in long-term memory. This requirement
is not present for the earlier tasks since the positional ordering is
continually available from the display. In our experiments the display
has been a card with a row of pictured objects, so subjects cannot
create a physical order. If separate physical objects were used, then a
child's sequence of moves might reveal use of a specific sorting
algorithm.

Results of pilot experiments suggest that 'second biggest ball' is, as
one would expect from these considerations, an extremely difficult
phrase to interpret. We devised the most straightforward display we
could that would test comprehension of this phrase: all the objects were
identical except in size; all but two were of the same small size; the
remaining two were both substantially larger than the smaller ones,
adjacent to each other and noticeably different from each other in size;
neither was in second position. In experiments so far the error rate has
been 80%; we will report more comprehensive testing at the conference.
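The contrast between the nested-loop procedures (iv, v) and the flat procedure with a tacked-on test (vii) can be made concrete with a small Python sketch. Everything named here is an assumption for illustration: the dictionary representation of displayed objects, the function names, and the decision to return `None` when 'next object' fails. That last choice stands in for what the text describes as an infinite loop or system breakdown.

```python
from typing import Callable, Dict, List, Optional, Sequence

Obj = Dict[str, str]
Pred = Callable[[Obj], bool]


def nested_nth(display: Sequence[Obj], n: int, preds: List[Pred]) -> Optional[int]:
    """Procedures iv / v: count only the objects that pass the inner exit test."""
    position, number = -1, 0
    while True:                                  # outer repeat
        while True:                              #   inner repeat
            position += 1                        #     next object
            if position >= len(display):
                return None                      #     'next object' fails
            if all(p(display[position]) for p in preds):
                break                            #   until pred ... (compound test)
        number += 1                              #   next number
        if number == n:                          # until pred two (for n == 2)
            return position


def flat_nth(display: Sequence[Obj], n: int, preds: List[Pred]) -> Optional[int]:
    """Procedure vii: one loop, with the predicate tacked onto the exit test."""
    position, number = -1, 0
    while True:                                  # repeat
        position += 1                            #   next object
        if position >= len(display):
            return None                          #   stands in for the infinite loop
        number += 1                              #   next number
        if number == n and all(p(display[position]) for p in preds):
            return position                      # until pred two, pred ball


def is_ball(obj: Obj) -> bool:
    return obj["kind"] == "ball"


def is_green(obj: Obj) -> bool:
    return obj.get("color") == "green"
```

With a display of box, ball, ball (positions 0 to 2), `nested_nth(display, 2, [is_ball])` selects position 2, the second ball, while `flat_nth` selects position 1, the object that is both second and a ball. With ball, box, ball, the flat version never satisfies its exit test and runs off the end of the display.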
It is possible to set up more complex displays in
which different miscomprehensions lead to distinct
choices as response. With balls and boxes of var-
ious sizes there exist arrays in which, say, one
object is both second biggest and a ball but is
not the second biggest ball; or in which the
second ball is the biggest; etc. A welter of
possibilities exists for testing the way in which
ordinals, superlatives, relative adjectives, abso-
lute adjectives, nouns and relative clauses are
comprehended by children and what the course of
development looks like, in terms of the kinds of
procedures we have posited here. Such develop-
mental sequences are, in turn, the raw material
for theories of an acquisition device.
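A display of the kind just described, in which one object is both second biggest and a ball but is not the second biggest ball, can be checked with a short Python sketch. The object representation, the function names, and the single-pass `top_two` helper are illustrative assumptions. The helper is meant only to show that a partial ordering (the two largest items) suffices; it is not a claim about the algorithm children actually use.

```python
from typing import Dict, Iterable, Optional, Sequence, Tuple

Obj = Dict[str, object]


def top_two(display: Sequence[Obj],
            candidates: Iterable[int]) -> Tuple[Optional[int], Optional[int]]:
    """Positions of the largest and second-largest candidates, in one pass."""
    first: Optional[int] = None
    second: Optional[int] = None
    for i in candidates:
        if first is None or display[i]["size"] > display[first]["size"]:
            first, second = i, first
        elif second is None or display[i]["size"] > display[second]["size"]:
            second = i
    return first, second


def second_biggest_ball(display: Sequence[Obj]) -> Optional[int]:
    """Adult reading: restrict to balls first, then order by size."""
    balls = [i for i, obj in enumerate(display) if obj["kind"] == "ball"]
    return top_two(display, balls)[1]


def second_biggest_and_ball(display: Sequence[Obj]) -> Optional[int]:
    """Miscomprehension: the object that is second biggest overall, if it is a ball."""
    _, second = top_two(display, range(len(display)))
    if second is not None and display[second]["kind"] == "ball":
        return second
    return None
```

For a row consisting of a box of size 9 followed by balls of sizes 7, 5, and 2, the two readings pick different objects (positions 2 and 1 respectively), so the child's response reveals which interpretation was used.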

Why Do Children Say "Goed"?
A Computer Model of Child Generation

Mallory Selfridge
Department of EE and CS
University of Connecticut, Storrs, CT

1.0 Introduction

An important question in modelling child language generation is why
children say regular forms of irregular words, such as "goed" (Clark and
Clark, 1978), during development, although they never hear them. Three
other general characteristics of children's generation also require
explanation. First, Benedict's (1976) work suggests clearly that
comprehension of various aspects of language precedes the generation of
those aspects. Second, the length of the utterances children say becomes
generally longer as development proceeds (Bloom, 1973). Third, Wetstone
and Friedlander (1973) suggest that first children say things in the
wrong order, and then say things in the correct order.

In order to address these issues, this paper explores the hypothesis
that learning to talk is driven by learning to understand. This
hypothesis begins by assuming that the principal effect of learning to
understand is the development of the lexicon as additional words are
learned and their "definitions" are refined and modified. It further
assumes that the language generation process is not learned, but is an
innate part of a child's cognitive repertoire. Finally, it states that
the ability to generate grows as the lexicon develops during the
development of comprehension. The hypothesis predicts that a computer
model which incorporated it would display the characteristics described
above.

The following sequence of utterances interspersed with transitions
summarizes the development of CHILD's generation capacity, and is drawn
from a full run of CHILD during which it both learns to comprehend and
generate. The statements referring to CHILD "learning" meaning and
syntax summarize the learning of comprehension described in Selfridge
(1980). In order to show development, CHILD has been supplied with the
same concept to generate repeatedly: the concept for "the parent puts
the ball on the table", (PTRANS ACTOR (PARENT1) OBJECT (BALL1) TO (TOP
VAL (TABLE1))).

    CHILD knows no words
    CHILD says "um"

    CHILD learns meaning of "ball"
    CHILD says "ball"

    CHILD learns meaning of "put"
    CHILD says "ball put"

    CHILD learns meaning of "table"
    CHILD says "table ball put"

    CHILD learns syntax of "put"
    CHILD says "put ball table"

    CHILD learns meaning of "on"
    CHILD says "put ball table on"

    CHILD learns syntax of "on"
    CHILD says "put ball on table"
CHILD (Selfridge, 1980) was initially developed This progression shows that CHILD's generation
to model a subset of the development of comprehension does correspond to the data described earlier.
about a limited domain in a child between the ages of First, CHILD certainly does learn to generate after
one and two years, using context-based inference and it learns to understand. Second, the length of its
learning rules to build a dictionary accessible to a utterances does indeed grow. Third, the ability to
conceptual analyzer (Birnbaum and Selfridge, 1979). generate structurally correct utterances follows the
New words were added to the dictionary, the meanings