Cognition on Cognition
Edited by Jacques Mehler and Susana Franck

November 1995, ISBN 0-262-63167-9, 504 pp., $55.00 / £35.95 (PAPER)

Preface: Building COGNITION

I Neuropsychology
1 Insensitivity to future consequences following damage to prefrontal cortex
2 Autism: beyond "theory of mind"
3 Developmental dyslexia and animal studies: at the interface between cognition and neurology
4 Foraging for brain stimulation: toward a neurobiology of computation
5 Beyond intuition and instinct blindness: toward an evolutionarily rigorous cognitive science

II Thinking
6 Why should we abandon the mental logic hypothesis?
7 Concepts: a potboiler
8 Young children's naive theory of biology
9 Mental models and probabilistic thinking
10 Pretending and believing: issues in the theory of ToMM
11 Extracting the coherent core of human probability judgment: a research program for cognitive psychology
12 Levels of causal understanding in chimpanzees and children
13 Uncertainty and the difficulty of thinking through disjunctions

III Language and Perception
14 The perception of rhythm in spoken and written language
15 Categorization in early infancy and the continuity of development
16 Do speakers have access to a mental syllabary?
17 On the internal structure of phonetic categories: a progress report
18 Perception and awareness in phonological processing: the case of the phoneme
19 Ever since language and learning: afterthoughts on the Piaget-Chomsky debate
20 Some primitive mechanisms of spatial attention
21 Language and connectionism: the developing interface
22 Initial knowledge: six suggestions

Series: Bradford Books; Cognition Special Issue

Preface: Building COGNITION
The human mind needs to acknowledge and celebrate anniversaries; however, some anniversaries are more salient than others. This book emanates from Volume 50 of the journal, COGNITION. Why that volume of COGNITION was important to us perhaps becomes clear when we understand how the mind encodes numbers. Indeed, Dehaene et al. (1992) reported that the number 50 is psychologically more salient than, say, either 47 or 53. So, predictably, Volume 50 was a befitting occasion to celebrate an anniversary; it was a time to take stock of what was happening during the early years and a time to remember how we were long ago and how we have evolved as a journal. In our first editorial, we wanted to remember those who have provided us with so much help and the cultural climate that made the journal possible. In this introduction to COGNITION on Cognition we leave as much of the original introduction as possible so that the flavor initially conveyed remains. COGNITION was envisioned by T. G. Bever and Jacques Mehler because we thought that the new and diffuse area of cognition had to be facilitated by overcoming the inflexibility of form and content that were characteristic of most earlier journals in psychology and linguistics. Moreover, cognition was a multidisciplinary domain while psychology and linguistics were too narrow and too attached to one school of thought or another. So too were most journals. In the sixties, one could see the birth of the cognitive revolution in Cambridge, Massachusetts, where many of those who were to become the main actors were working on a project which was to become modern Cognitive Science. Was it possible to study intelligent behavior, in man and in machine, in the way that one studies chemistry, biology or even astronomy? We were sure the question should be answered affirmatively. Since then, the study of mind has become a part of the natural sciences. 
Positivism and behaviorism, among others, had confined publishing to patterns that were ill-suited to our needs. Psychologists, linguists, neuropsychologists, and others would often voice their dismay. Authors knew that to enhance their chances of publication they had to avoid motivating their studies theoretically. "Make your introduction as short and vacuous as possible" seemed to be the unspoken guideline of most journals. Editors were often even more hostile towards discussions that had "too much theory," as they used to say in those days. That was not all. Psychology journals did not welcome articles from linguistics while neuropsychologists had to hassle with neurologists to see their findings published. For a psychologist to publish in a linguistics journal was equally out of bounds. Readership was also broken down along lines of narrow professional affinity. Yet scientists from all these disciplines would meet and discuss a range of exciting new issues in the seminars held at the Harvard Center of Cognitive Studies, and at similar centers that were being created at MIT and Penn, among others. Those were the days when computer scientists and psychologists, neurologists and linguists were searching jointly for explanations to the phenomena that their predecessors had explored from much narrower perspectives. If perception continued to be important, learning was beginning to lose its grip on psychology. Neuropsychology and psycholinguistics were becoming very fashionable and so was the simulation of complex behavior. Studying infants and young children had once more become a central aspect of our concerns. Likewise, students of animal behavior were discovering all kinds of surprising aptitudes to which psychologists had been blinded by behaviorism. It was, however, in the fields of linguistics and computer science that the novel theoretical perspectives were being laid out with greatest clarity. What was wanted was a journal that could help students to become equally familiar with biological findings, advances in computer science, and psychological and linguistic discoveries, while allowing them to become philosophically sophisticated. So, some of us set out to create a journal which would enclose such a variegated domain. We also wanted a publication for which it would be fun to write and which would be great to read. These ideas were entertained at the end of the sixties, a difficult time.
France was still searching for itself in the midst of unrest, still searching for its soul after hesitating for so long about the need to face up to its contradictions, those that had plunged it into defeat, occupation and then collaboration on one side, suffering, persecution and resistance on the other. The United States, contending with internal and external violence, was trying to establish a multiracial society. At the same time it was fighting far from home for what, we were being told, was going to be a better world, though the reasons looked much less altruistic to our eyes. All these conflicts fostered our concerns. They also inspired the scientists of our generation to think about their role and responsibility as social beings. The nuclear era was a reminder that science was not as useless and abstruse as many had pretended it to be. Was it so desirable for us to be scientists during weekdays and citizens on Sundays and holidays, we asked ourselves. How could one justify indifference over educational matters, funding of universities, sexism, racism, and many other aspects of our daily existence? In thinking about a journal, questions like these were always present in our minds. COGNITION was born in France and we have edited the journal from its Paris office ever since. When Jacques Mehler moved from the United States to France, he worked in a laboratory located across from the Folies Bergeres, a neighborhood with many attractions for tourists but none of the scientific journals that were essential for keeping up with cognitive science. In 1969, the laboratory was moved to a modern building erected on the site at which the infamous Prison du Cherche-Midi had been located until its demolition at the end of the Second World War. This prison stood opposite the Gestapo Headquarters and resistance fighters and other personalities were tortured and then shot within its walls. A few decades earlier in the prison, another French citizen had been locked up, namely, Captain Dreyfus. It was difficult to find oneself at such a place without reflecting on how the rational study of the mind might illuminate the ways in which humans go about their social business and also, how science and society had to coexist. The building shelters the Ecole des Hautes Etudes en Sciences Sociales (EHESS), an institution that played an important role in the development of the French School of History. F. Braudel presided over the Ecole for many years while being the editor of the prestigious Annales, a publication that had won acclaim in many countries after it was founded by M. Bloch and L. Febvre. It was obvious that the Annales played an important role at the Ecole, where M. Bloch, an Alsatian Jew who was eventually murdered for his leading role during the Resistance, was remembered as an important thinker. Bloch was a convinced European who preached a rational approach to the social sciences. He was persuaded of the importance of expanding communication between investigators from different countries and cultures. Today, M. Bloch and his itinerary help us understand the importance of moral issues and the role of the individual as an ultimate moral entity whose well-being does not rank below state, country, or religion. Our hope is that rational inquiry and cognitive science will help us escape from the bonds of nationalism, chauvinism, and exclusion. Cognitive scientists, like all other scientists and citizens, should be guided by moral reason, and moral issues must be one of our fields of concern. A Dutch publisher, Mouton, offered us the opportunity to launch the journal. In the late sixties, money seemed less important than it does today. Publishers were interested in ideas and the elegance with which they were presented.
We agreed to minimize formal constraints, and there was no opposition to the inclusion of a section to be used to air our political and social preoccupations. Opposition during those early planning stages came from a source that we had not at all foreseen as a trouble area. To our great surprise we discovered that publishing an English language journal in France was not an easy task. Some of our colleagues disapproved of what they perceived as a foreign-led venture. "Isn't it true," they argued, "that J. Piaget, one of the central players in the Cognitive Revolution, writes in French?" "A French intellectual ought to try and promote the French culture through the language of Descartes, Racine and Flaubert," we were reminded time and again. For a while we had mixed feelings. We need no reminders of how important differences and contrasts are to the richness of intellectual life. Today politicians discuss ways in which the world is going to be able to open markets and promote business. The GATT discussions have concentrated partly on the diversity of cultural goods. We agree with those who would like to see some kind of protection against mass-produced television, ghost-written books, and movies conceived to anesthetize the development of good taste and intelligence. Unfortunately, nobody really knows how to protect us against these lamentable trends. Removing all cultural differences and catering only to the least demanding members of society, no matter how numerous, will promote the destruction of our intellectual creativity. So why did we favor making a journal in English, and why is it that even today we fight for a lingua franca of science? Science is a special case, we told ourselves then, as we do today. We all know that since the Second World War, practically all the top-quality science has been published in English. It would be unthinkable for top European scientists to have won the Nobel prize or reached world renown if they had published their foremost papers in their own language. They didn't. Likewise, it is unthinkable today for serious scientists, regardless of where they study and work, to be unable to read English. Of course, novels, essays, and many disciplines in the humanities are more concerned with form than with truth. It is normal that these disciplines fight to preserve their tool of privilege, the language in which they need to express themselves. Thus we viewed the resistance to English during the planning stages of COGNITION as an ideological plot to keep the study of mind separate from, and antagonistic to, science and closer to the arts and humanities. Our aim was just the opposite, namely, to show that there was a discipline, cognition, which was as concerned with truth as chemistry, biology, or physics. We were also aware that the fear of contact and communication among fellow scientists is the favorite weapon used by narrow-minded chauvinists and, in general, by authoritarian characters with whom we still, unfortunately, have to cope in some parts of the European academic world. While COGNITION was trying to impose the same weights and measures for European (inter alia) and American science, some of our colleagues were pleading for a private turf, for special journals catering to their specific needs. We dismissed those pleas, and the journal took the form that the readership has come to expect. Today, we include in this volume a series of articles that were originally published in the Special Issue produced to celebrate the fiftieth volume of the journal. We present these articles in an order which we think brings out their thematic coherence.
There are areas that deal with theoretical aspects which range from the status of explanations in cognitive science, the evolutionary accounts offered to explain the stable faculties that are characteristic of homo abilis, to the way in which humans use general faculties to reason about their environment, and so forth. Another group of papers deals with the way in which humans process information and use language, the parts of cognitive science that are best understood, so far. We also present a number of papers that deal with infants' initial abilities and their capacity to learn the distinctive behaviors of the species. We also include several papers that try to relate behaviors to their underlying neural structures. This form-to-function pairing may become particularly relevant to explain development. Indeed, many of the changes in behavior that one observes in the growing organism may stem from neural changes and/or from learning. Understanding the neural structures underlying our capacities may help us understand how these are mastered. It is difficult to imagine what the contents of volume 100 of COGNITION will look like. Certainly the journal, publishing in general, and academic publishing in particular, will change in radical ways in the years to come. Not only will the contents evolve in ways that will seem transparent a posteriori but also the form will change in ways that are hard to predict a priori. The ways in which science develops are hard to foresee because until one has bridged the next step vistas are occluded by the present. Fortunately, we do not need to worry about this for the time being. Our work is cut out: concentrating on what we are doing rather than on the ways in which we are doing what we are doing. On the contrary, we must start thinking about how the changes in publishing will affect our ways of doing science. It is part of the scientist's duty to explore the changes to come so as to ensure that the independence and responsibility of science are protected in the world of tomorrow as they are today. We cannot close this short introduction without thanking Amy Pierce for her help in preparing this special issue for publication with MIT Press.

Jacques Mehler and Susana Franck

Insensitivity to future consequences following damage to human prefrontal cortex
Antoine Bechara, Antonio R. Damasio*, Hanna Damasio, Steven W. Anderson
Department of Neurology, Division of Behavioral Neurology and Cognitive Neuroscience, University of Iowa College of Medicine, Iowa City, IA 52242, USA

Abstract Following damage to the ventromedial prefrontal cortex, humans develop a defect in real-life decision-making, which contrasts with otherwise normal intellectual functions. Currently, there is no neuropsychological probe to detect this defect in the laboratory, and the cognitive and neural mechanisms responsible for it have resisted explanation. Here, using a novel task which simulates real-life decision-making in the way it factors uncertainty of premises and outcomes, as well as reward and punishment, we find that prefrontal patients, unlike controls, are oblivious to the future consequences of their actions, and seem to be guided by immediate prospects only. This finding offers, for the first time, the possibility of detecting these patients' elusive impairment in the laboratory, measuring it, and investigating its possible causes.

* Corresponding author. Supported by NINDS PO1 NS19632 and the James S. McDonnell Foundation.

Introduction Patients with damage to the ventromedial sector of prefrontal cortices develop a severe impairment in real-life decision-making, in spite of otherwise preserved intellect. The impairments are especially marked in the personal and social realms (Damasio, Tranel, & Damasio, 1991). Patient E.V.R. is a prototypical example of this condition. He often decides against his best interest, and is unable to learn from his mistakes. His decisions repeatedly lead to negative consequences. In striking contrast to this real-life decision-making impairment, E.V.R.'s general intellect and problem-solving abilities in a laboratory setting remain intact. For instance, he produces perfect scores on the Wisconsin Card Sorting Test (Milner, 1963); his performances in paradigms requiring self-ordering (Petrides & Milner, 1982), cognitive estimations (Shallice & Evans, 1978), and judgements of recency and frequency (Milner, Petrides, & Smith, 1985) are flawless; he is not perseverative, nor is he impulsive; his knowledge base is intact and so is his short-term and working memory as tested to date; his solution of verbally posed social problems and ethical dilemmas is comparable to that of controls (Saver & Damasio, 1991). The condition has posed a double challenge, since there has been neither a satisfactory account of its physiopathology, nor a laboratory probe to detect and measure an impairment that is so obvious in its ecological niche. Here we describe an experimental neuropsychological task which simulates, in real time, personal real-life decision-making relative to the way it factors uncertainty of premises and outcomes, as well as reward and punishment. We show that, unlike controls, patients with prefrontal damage perform defectively and are seemingly insensitive to the future.

Materials and methods The subjects sit in front of four decks of cards equal in appearance and size, and are given a $2000 loan of play money (a set of facsimile US bills). The subjects are told that the game requires a long series of card selections, one card at a time, from any of the four decks, until they are told to stop. After turning each card, the subjects receive some money (the amount is only announced after the turning, and varies with the deck). After turning some cards, the subjects are both given money and asked to pay a penalty (again the amount is only announced after the card is turned and varies with the deck and the position in the deck according to a schedule unknown to the subjects). The subjects are told that (1) the goal of the task is to maximize profit on the loan of play money, (2) they are free to switch from any deck to another, at any time, and as often as wished, but (3) they are not told ahead of time how many card selections must be made (the task is stopped after a series of 100 card selections). The preprogrammed schedules of reward and punishment are shown on the score cards (Fig. 1). Turning any card from deck A or deck B yields $100; turning any card from deck C or deck D yields $50. However, the ultimate future yield of each deck varies because the penalty amounts are higher in the high-paying decks (A and B), and lower in the low-paying decks (C and D). For example, after turning 10 cards from deck A, the subjects have earned $1000, but they have also encountered 5 unpredicted punishments bringing their total cost to $1250, thus incurring a net loss of $250. The same happens on deck B. On the other hand, after turning 10 cards from decks C or D, the subjects earn $500, but the total of their unpredicted punishments is only $250 (i.e., the subject nets $250). In summary, decks A and B are equivalent in terms of overall net loss over the trials. The difference is that in deck A, the punishment is more frequent, but of smaller magnitude, whereas in deck B, the punishment is less frequent, but of higher magnitude. Decks C and D are also equivalent, in terms of overall net gain. In deck C, the punishment is more frequent and of smaller magnitude, while in deck D the punishment is less frequent but of higher magnitude. Decks A and B are thus "disadvantageous" because they cost the most in the long run, while decks C and D are "advantageous" because they result in an overall gain in the long run.

Fig. 1. The top score card represents that of a typical control subject, and the bottom one that of [...]. [...] represent the profiles of selections from the first to the 100th card. Control subjects select more from decks [...]

The performances of a group of normal control subjects (21 women and 23 men) in this task were compared to those of E.V.R. and other frontal lobe subjects (4 men and 2 women). The age range of normal controls was from 20 to 79 years; for E.V.R.-like subjects it was from 43 to 84 years. About half the number of subjects in each group had a high school education, and the other half had a college education. E.V.R.-like subjects were retrieved from the Patient Registry of the Division of Behavioral Neurology and Cognitive Neuroscience. Selection criteria were the documented presence of abnormal decision-making and the existence of lesions in the ventromedial prefrontal region. To determine whether the defective performance of E.V.R.-like subjects on the task is specific to ventromedial frontal lobe damage, and not merely caused by brain damage in general, we compared the performances of E.V.R.-like subjects and normal controls to those of an education-matched group of brain-damaged controls. There were 3 women and 6 men, ranging in age from 20 to 71 years.
These controls were retrieved from the same Patient Registry and were chosen so as to have lesions in occipital, temporal and dorsolateral frontal regions. Several of the brain-damaged controls had memory defects, as revealed by conventional neuropsychological tests. Finally, to determine what would happen to the performance if it were repeated over time, we retested the target subjects and a smaller sample of normal controls (4 women and 1 man between the ages of 20 and 55, matched to E.V.R. in level of education) after various time intervals (one month after the first test, 24 h later, and for the fourth time, six months later).
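
The payoff arithmetic described above can be checked with a short sketch. The per-card rewards and the penalty totals per 10 cards are taken from the text; the names `DECKS` and `net_per_block` are hypothetical, and the card-by-card penalty schedule of Fig. 1 is not reproduced here. Going beyond one block of 10 cards assumes that the schedule simply repeats, which the text does not state.

```python
# Reward per card and total penalty per block of 10 cards, as stated in the text.
# DECKS and net_per_block are illustrative names, not part of the original task.
DECKS = {
    "A": {"reward_per_card": 100, "penalty_per_10": 1250},
    "B": {"reward_per_card": 100, "penalty_per_10": 1250},
    "C": {"reward_per_card": 50, "penalty_per_10": 250},
    "D": {"reward_per_card": 50, "penalty_per_10": 250},
}


def net_per_block(deck: str, cards: int = 10) -> int:
    """Net play-money outcome after `cards` selections from a single deck.

    Only whole blocks of 10 cards are supported, because the text gives
    penalty totals per 10 cards rather than per card.
    """
    if cards <= 0 or cards % 10 != 0:
        raise ValueError("cards must be a positive multiple of 10")
    spec = DECKS[deck]
    earned = spec["reward_per_card"] * cards
    penalties = spec["penalty_per_10"] * (cards // 10)
    return earned - penalties


if __name__ == "__main__":
    for deck in "ABCD":
        print(deck, net_per_block(deck))
```

Under these figures, decks A and B each lose $250 per 10 cards and decks C and D each gain $250, which is the contrast between the "disadvantageous" and "advantageous" decks.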

Results Fig. 2 (left) shows that normal controls make more selections from the good decks (C and D), and avoid the bad decks (A and B). In sharp contrast, E.V.R.-like subjects select fewer from the good decks (C and D), and choose more from the bad decks (A and B). The difference is significant. An analysis of variance comparing the number of cards from each deck chosen by normal controls and by target subjects revealed a significant interaction of group (controls vs. targets) with choice (A, B, C, D) (F(3,147) = 42.9, p < .001). Subsequent Newman-Keuls t-tests revealed that the number of cards selected by normal controls from deck A or B was significantly lower than the number of cards selected by target subjects from the same decks (ps < .001). On the contrary, the number of cards selected by controls from decks C or D was significantly higher than the number selected by target subjects (ps < .001). Within each group, comparison of the performances among subjects from different age groups, gender and education yielded no statistically significant differences. Fig. 2 (right) shows that a comparison of card selection profiles revealed that controls initially sampled all decks and repeated selections from the bad decks A and B, probably because they pay more, but eventually switched to more and more selections from the good decks C and D, with only occasional returns to decks A and B. On the other hand, E.V.R. behaves like normal controls only in the first few selections. He does begin by sampling all decks and selecting from decks A and B, and he does make several selections from decks C and D, but then he returns more frequently and more systematically to decks A and B. The other target subjects behave similarly.

Fig. 2. (Left panels) Total number of cards selected from each deck (A, B, C or D) by normal controls (n = [...]) [...] represent means ± s.e.m. (Right panels) Profiles of card selections (from the first to the 100th selection) [...]

Fig. 3 reveals that the performance of brain-damaged controls was no different from that of normal controls, and quite the opposite of the performance of the prefrontal subjects. One-way ANOVA on the difference in the total numbers of card selections from the advantageous decks minus the total numbers of selections from the disadvantageous decks obtained from normal and brain-damaged controls did not reveal a significant difference between the two groups (F(1,52) = 0.1, p > .1), but the difference between the normal and E.V.R.-like groups was highly significant (F(1,50) = 74.8, p < .001). As a result of repeated testing, E.V.R.'s performance did not change, one way or the other, when tested one month after the first test, 24 h later, and for the fourth time, six months later. This pattern of impaired performance was also seen in other target subjects. On the contrary, the performance of normal controls improved over time.

Fig. 3. Total number of selections from the advantageous decks (C + D) minus the total numbers of selections from the disadvantageous decks (A + B) from a group of normal controls (n = 44), brain-damaged controls (n = 9), E.V.R., and E.V.R.-like subjects (n = 6). Bars represent means ± s.e.m. Positive scores reflect advantageous courses of action, and negative scores reflect disadvantageous courses of action.
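
The net score plotted in Fig. 3, selections from decks C and D minus selections from decks A and B, can be computed from a subject's sequence of choices as in the minimal sketch below. The function name `net_score` and the example sequences are hypothetical and are not data from the study.

```python
from collections import Counter

ADVANTAGEOUS = {"C", "D"}
DISADVANTAGEOUS = {"A", "B"}


def net_score(selections) -> int:
    """Return (C + D) minus (A + B) for a sequence of deck labels.

    A positive score reflects an advantageous course of action and a
    negative score a disadvantageous one, as in the Fig. 3 caption.
    """
    counts = Counter(selections)
    unknown = set(counts) - ADVANTAGEOUS - DISADVANTAGEOUS
    if unknown:
        raise ValueError(f"unknown deck labels: {sorted(unknown)}")
    good = sum(counts[deck] for deck in ADVANTAGEOUS)
    bad = sum(counts[deck] for deck in DISADVANTAGEOUS)
    return good - bad


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    mostly_good = list("ABCDCCDD")
    mostly_bad = list("ABCDAABB")
    print(net_score(mostly_good), net_score(mostly_bad))
```

With a full run of 100 selections the score ranges from -100 (only decks A and B) to +100 (only decks C and D).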

Discussion These results demonstrate that E.V.R. and comparable subjects perform defectively in this task, and that the defect is stable over time. Although the task involves a long series of gains and losses, it is not possible for subjects to perform an exact calculation of the net gains or losses generated from each deck as they play. Indeed, a group of normal control subjects with superior memory and IQ, whom we asked to think aloud while performing the task, and keep track of the magnitudes and frequencies of the various punishments, could not provide calculated figures of the net gains or losses from each deck. The subjects must rely on their ability to develop an estimate of which decks are risky and which are profitable in the long run. Thus, the patients' performance profile is comparable to their real-life inability to decide advantageously, especially in personal and social matters, a domain for which in life, as in the task, an exact calculation of the future outcomes is not possible and choices must be based on approximations. We believe this task offers, for the first time, the possibility of detecting these patients' elusive impairment in the laboratory, measuring it, and investigating its possible causes. Why do E.V.R.-like subjects make choices that have high immediate reward, but severe delayed punishment? We considered three possibilities: (1) E.V.R.-like subjects are so sensitive to reward that the prospect of future (delayed) punishment is outweighed by that of immediate gain; (2) these subjects are insensitive to punishment, and thus the prospect of reward always prevails, even if they are not abnormally sensitive to reward; (3) these subjects are generally insensitive to future consequences, positive or negative, and thus their behavior is always guided by immediate prospects, whatever they may be. 
To decide on the merit of these possibilities, we developed a variant of the basic task, in which the schedules of reward and punishment were reversed, so that the punishment is immediate and the reward is delayed. The profiles of target subjects in that task suggest that they were influenced more by immediate punishment than by delayed reward (unpublished results). This indicates that neither insensitivity to punishment nor hypersensitivity to reward is an appropriate account of the defect. A qualitative aspect of the patients' performance also supports the idea that immediate consequences influence the performance significantly. When they are faced with a significant money loss in a given deck, they refrain from picking cards out of that same deck, for a while, just like normals do, though unlike normals they then return to select from that deck after a few additional selections. When we combine the profiles of both basic task and variant tasks, we are left with one reasonable possibility: that these subjects are unresponsive to future consequences, whatever they are, and are thus more controlled by immediate prospects. How can this "myopia" for the future be explained? Evidence from other studies suggests that these patients possess and can access the requisite knowledge to conjure up options of actions and scenarios of future outcomes just as normal controls do (Saver & Damasio, 1991). Their defect seems to be at the level of acting on such knowledge. There are several plausible accounts to explain such a defect. For instance, it is possible that the representations of future outcomes that these patients evoke are unstable, that is, that they are not held in working memory long enough for attention to enhance them and reasoning strategies to be applied to them. This account invokes a defect along the lines proposed for behavioral domains dependent on dorsolateral prefrontal cortex networks, and which is possibly just as valid in the personal/social domain of decision-making (Goldman-Rakic, 1987). Defects in temporal integration and attention would fall under this account (Fuster, 1989; Posner, 1986).
Alternatively, the representations of future outcomes might be stable, but they would not be marked with a negative or positive value, and thus could not be easily rejected or accepted. This account invokes the somatic marker hypothesis which posits that the overt or covert processing of somatic states provides the value mark for a cognitive scenario (Damasio, 1994; Damasio et al., 1991). We have been attempting to distinguish between these two accounts in a series of subsequent experiments using this task along with psychophysiological measurements. Preliminary results favor the latter account, or a combination of the two accounts. Those results also suggest that the biasing effect of the value mark operates covertly, at least in the early stages of the task.

Damasio, A.R. (1994). Descartes' error: Emotion, rationality and the human brain. New York: Putnam (Grosset Books).



Damasio, A.R., Tranel, D., & Damasio, H. (1991). Somatic markers and the guidance of behavior. In H. Levin, H. Eisenberg, & A. Benton (Eds.), Frontal lobe function and dysfunction (pp. 217-228). New York: Oxford University Press.
Fuster, J.M. (1989). The prefrontal cortex (2nd edn.). New York: Raven Press.
Goldman-Rakic, P.S. (1987). Circuitry of primate prefrontal cortex and regulation of behavior by representational memory. In F. Plum (Ed.), Handbook of physiology: The nervous system (Vol. V, pp. 373-401). Bethesda, MD: American Physiological Society.
Milner, B. (1963). Effects of different brain lesions on card sorting. Archives of Neurology, 9, 90-100.
Milner, B., Petrides, M., & Smith, M.L. (1985). Frontal lobes and the temporal organization of memory. Human Neurobiology, 4, 137-142.
Petrides, M., & Milner, B. (1982). Deficits on subject-ordered tasks after frontal and temporal-lobe lesions in man. Neuropsychologia, 20, 249-262.
Posner, M.I. (1986). Chronometric explorations of the mind. New York: Oxford University Press.
Saver, J.L., & Damasio, A.R. (1991). Preserved access and processing of social knowledge in a patient with acquired sociopathy due to ventromedial frontal damage. Neuropsychologia, 29, 1241-1249.
Shallice, T., & Evans, M.E. (1978). The involvement of the frontal lobes in cognitive estimation. Cortex, 14, 294-303.


Autism: beyond "theory of mind"
Uta Frith*, Francesca Happé
MRC Cognitive Development Unit, 4 Taviton Street, London WC1H 0BT, UK

Abstract The theory of mind account of autism has been remarkably successful in making specific predictions about the impairments in socialization, imagination and communication shown by people with autism. It cannot, however, explain either the non-triad features of autism, or earlier experimental findings of abnormal assets and deficits on non-social tasks. These unexplained aspects of autism, and the existence of autistic individuals who consistently pass false belief tasks, suggest that it may be necessary to postulate an additional cognitive abnormality. One possible abnormality - weak central coherence - is discussed, and preliminary evidence for this theory is presented.

The theory of mind account of autism
In 1985 Cognition published an article by Baron-Cohen, Leslie, and Frith, entitled: Does the autistic child have a "theory of mind"? The perceptive reader would have recognized this as a reference to Premack and Woodruff's (1978) question: Does the chimpanzee have a theory of mind? The connection between these two was, however, an indirect one - the immediate precursor of the paper was Wimmer and Perner's (1983) article on the understanding of false beliefs by normally developing pre-school children. Each of these three papers has, in its way, triggered an explosion of research interest: in the social impairments of autism, the mind-reading capacities of non-human primates, and the development of social understanding in normal children. The connections which existed between the three papers have been mirrored in continuing connections between these three fields of research - developmental psychology (Astington, Harris, & Olson, 1989; Perner, 1991; Russell, 1992; Wellman, 1990), cognitive ethology

* Corresponding author



(Byrne & Whiten, 1988; Cheney & Seyfarth, 1990), and developmental psychopathology (Cicchetti & Cohen, in press; Rutter, 1987). There can be little doubt that these contacts have enriched work in each area.

Perceptive readers would also have noticed the inverted commas surrounding the phrase "theory of mind" in the 1985 paper. Baron-Cohen, Leslie, and Frith followed Premack and Woodruff's definition of this "sexy" but misleading phrase: to have a theory of mind is to be able to attribute independent mental states to self and others in order to explain and predict behaviour. As might befit a "theory" ascribable to chimpanzees, this was not a conscious theory but an innately given cognitive mechanism allowing a special sort of representation - the representation of mental states. Leslie (1987, 1988) delivered the critical connection between social understanding and understanding of pretence, via this postulated mechanism; metarepresentation is necessary, in Leslie's theory, for representing pretence, belief and other mental states. From this connection, between the social world and the world of imaginative play, sprang the link to autistic children, who are markedly deficient in both areas.

The idea that people with autism could be characterized as suffering from a type of "mind-blindness", or lack of theory of mind, has been useful to the study of child development - not because it was correct (that is still debatable) but because it was a causal account which was both specific and falsifiable. The clearest expression of this causal account is given in Frith, Morton, and Leslie (1991).

What is to be explained? Autism is currently defined at the behavioural level, on the basis of impairments in socialization, communication and imagination, with stereotyped repetitive interests taking the place of creative play (DSM-III-R, American Psychiatric Association, 1987).
A causal account must link these behavioural symptoms to the presumed biological origins (Gillberg & Coleman, 1992; Schopler & Mesibov, 1987) of this disorder. Specificity is particularly important in any causal account of autism because autistic people themselves show a highly specific pattern of deficits and skills. The IQ profile alone serves to demonstrate this; autistic people in general show an unusually "spiky" profile across Wechsler subtests (Lockyer & Rutter, 1970; Tymchuk, Simmons, & Neafsey, 1977), excelling on Block Design (constructing a pattern with cubes), and failing on Picture Arrangement (ordering pictures in a cartoon strip). This puzzling discrepancy of functioning has caused many previous psychological theories of autism to fail. For example, high arousal, lack of motivation, language impairment, or perceptual problems are all too global to allow for both the assets and deficits of autism.

Fine cuts along a hidden seam
What are the specific predictions made by the hypothesis that people with autism lack a "theory of mind"? The hypothesis does not address the question of the spiky IQ profile - it is silent on functioning in non-social areas - but it focuses on the critical triad of impairments (Wing & Gould, 1979). Not only does it make sense of this triad, but it also makes "fine cuts" within the triad of autistic impairments. Social and communicative behaviour is not all of one piece, when viewed from the cognitive level. Some, but not all, such behaviour requires the ability to "mentalize" (represent mental states). So, for example, social approach need not be built upon an understanding of others' thoughts - indeed Hermelin and O'Connor (1970) demonstrated to many people's initial surprise that autistic children prefer to be with other people, just like non-autistic children of the same mental age. However, sharing attention with someone else does require mentalizing - and is consistently reported by parents to be missing in the development of even able autistic children (Newson, Dawson, & Everard, 1984).

The mentalizing-deficit account has allowed a systematic approach to the impaired and unimpaired social and communicative behaviour of people with autism. Table 1 shows some of the work exploring predictions from the hypothesis that autistic people lack mentalizing ability. The power of this hypothesis is to make fine cuts in the smooth continuum of behaviours, and in this it has been remarkably useful. It has sparked an enormous amount of research, both supporting and attacking the theory (reviewed by Baron-Cohen, Tager-Flusberg, & Cohen, 1993; Happé, 1994a; Happé & Frith, in press).

Table 1. Autistic assets and deficits as predicted by the "fine cuts" technique, between tasks which require mentalizing and those which do not

Assets                                Deficits
Ordering behavioural pictures         Ordering mentalistic pictures (Baron-Cohen et al., 1986)
Understanding see                     Understanding know (Perner et al., 1989)
Protoimperative pointing              Protodeclarative pointing (Baron-Cohen, 1989b)
Sabotage                              Deception (Sodian & Frith, 1992)
False photographs                     False beliefs (Leslie & Thaiss, 1992; Leekam & Perner, 1991)
Recognizing happiness and sadness     Recognizing surprise (Baron-Cohen et al., 1993)
Object occlusion                      Information occlusion (Baron-Cohen, 1992)
Literal expression                    Metaphorical expression (Happé, 1993)

References refer to Assets and Deficits.

The fine cuts method, as used in the laboratory, has also informed research into the pattern of abilities and deficits in real life (Table 2), although this enterprise has still some way to go. This technique, which aims to pit two behaviours against each other which differ only in the demands they make upon the ability to mentalize, pre-empts many potential criticisms. By looking at performance across tasks which are equivalent in every other way, except for the critical cognitive component, intellectual energy has been saved for the really interesting theoretical debates. It is also peculiarly suitable for use in brain-imaging studies.

Table 2. Autistic assets and deficits observed in real life

Assets                                Deficits
Elicited structured play              Spontaneous pretend play (Wetherby & Prutting, 1984)
Instrumental gestures                 Expressive gestures (Attwood, Frith, & Hermelin, 1988)
Talking about desires and emotions    Talking about beliefs and ideas (Tager-Flusberg, 1993)
Using person as tool                  Using person as receiver of information (Phillips, 1993)
Showing "active" sociability          Showing "interactive" sociability (Frith et al., in press)

References refer to Assets and Deficits.

Another key benefit of the specificity of this approach is the relevance it has for normal development. The fine cuts approach suits the current climate of increased interest in the modular nature of mental capacities (e.g., Cosmides, 1989; Fodor, 1983). It has allowed us to think about social and communicative behaviour in a new way. For this reason, autism has come to be a test case for many theories of normal development (e.g., Happé, 1993: Sperber & Wilson's 1986 Relevance theory).

Limitations of the theory of mind account
The hijacking of autism by those primarily interested in normal development has added greatly to the intellectual richness of autism research. But just how well does the theory of mind account explain autism? By the stringent standard, that explanatory theories must give a full account of a disorder (Morton & Frith, in press), not that well. The mentalizing account has helped us to understand the nature of the autistic child's impairments in play, social interaction and verbal and non-verbal communication. But there is more to autism than the classic triad of impairments.

Non-triad features
Clinical impressions originating with Kanner (1943) and Asperger (1944; translated in Frith, 1991), and withstanding the test of time, include the following:

- Restricted repertoire of interests (necessary for diagnosis in DSM-III-R, American Psychiatric Association, 1987).
- Obsessive desire for sameness (one of two cardinal features for Kanner & Eisenberg, 1956).
- Islets of ability (an essential criterion in Kanner, 1943).
- Idiot savant abilities (striking in 1 in 10 autistic children, Rimland & Hill, 1984).
- Excellent rote memory (emphasized by Kanner, 1943).
- Preoccupation with parts of objects (a diagnostic feature in DSM-IV, forthcoming).

All of these non-triad aspects of autism are vividly documented in the many parental accounts of the development of autistic children (Hart, 1989; McDonnell, 1993; Park, 1967). None of these aspects can be well explained by a lack of mentalizing.

Of course, clinically striking features shown by people with autism need not be specific features of the disorder. However, there is also a substantial body of experimental work, much of it predating the mentalizing theory, which demonstrates non-social abnormalities that are specific to autism. Hermelin and O'Connor were the first to introduce what was in effect a different "fine cuts" method (summarized in their 1970 monograph) - namely the comparison of closely matched groups of autistic and non-autistic handicapped children of the same mental age. Table 3 summarizes some of the relevant findings.

Table 3. Experimental findings not accounted for by mind-blindness. Surprising advantages and disadvantages on cognitive tasks, shown by autistic subjects relative to normally expected asymmetries

Unusual strength                  Unusual weakness
Memory for word strings           Memory for sentences (e.g., Hermelin & O'Connor, 1967)
Memory for unrelated items        Memory for related items (e.g., Tager-Flusberg, 1991)
Echoing nonsense                  Echoing with repair (e.g., Aurnhammer-Frith, 1969)
Pattern imposition                Pattern detection (e.g., Frith, 1970a,b)
Jigsaw by shape                   Jigsaw by picture (e.g., Frith & Hermelin, 1969)
Sorting faces by accessories      Sorting faces by person (e.g., Weeks & Hobson, 1987)
Recognizing faces upside-down     Recognizing faces right-way-up (e.g., Langdell, 1978)

References refer to Unusual strength and Unusual weakness.

The talented minority
The mentalizing deficit theory of autism, then, cannot explain all features of autism. It also cannot explain all people with autism. Even in the first test of the hypothesis (reported in the 1985 Cognition paper), some 20% of autistic children passed the Sally-Ann task. Most of these successful children also passed another test of mentalizing - ordering picture stories involving mental states (Baron-Cohen, Leslie, & Frith, 1986) - suggesting some real underlying competence in representing mental states. Baron-Cohen (1989a) tackled this apparent disconfirmation of the theory, by showing that these talented children still did not pass a harder (second-order) theory of mind task (Perner & Wimmer, 1985). However, results from other studies focusing on high-functioning autistic subjects (Bowler, 1992; Ozonoff, Rogers, & Pennington, 1991) have shown that some autistic people can pass theory of mind tasks consistently, applying these skills across domains (Happé, 1993) and showing evidence of insightful social behaviour in everyday life (Frith, Happé, & Siddons, in press). One possible way of explaining the persisting autism of these successful subjects is to postulate an additional and continuing cognitive impairment. What could this impairment be?

The recent interest in executive function deficits in autism (Hughes & Russell, 1993; Ozonoff, Pennington, & Rogers, 1991) can be seen as springing from some of the limitations of the theory of mind view discussed above. Ozonoff, Rogers, & Pennington (1991) found that while not all subjects with autism and/or Asperger's syndrome showed a theory of mind deficit, all were impaired on the Wisconsin Card Sorting Test and Tower of Hanoi (two typical tests of executive function). On the basis of this finding they suggest that executive function impairments are a primary causal factor in autism. However, the specificity, and hence the power of this theory as a causal account, has yet to be established by systematic comparison with other non-autistic groups who show impairments in executive functions (Bishop, 1993). While an additional impairment in executive functions may be able to explain certain (perhaps non-specific) features of autism (e.g., stereotypies, failure to plan, impulsiveness), it is not clear how it could explain the specific deficits and skills summarized in Table 3.

The central coherence theory
Motivated by the strong belief that both the assets and the deficits of autism spring from a single cause at the cognitive level, Frith (1989) proposed that autism is characterized by a specific imbalance in integration of information at different levels. A characteristic of normal information processing appears to be the tendency to draw together diverse information to construct higher-level meaning in context; "central coherence" in Frith's words. For example, the gist of a story is easily recalled, while the actual surface form is quickly lost, and is effortful to retain. Bartlett (1932), summarizing his famous series of experiments on remembering images and stories, concluded: "an individual does not normally take [such] a situation detail by detail. In all ordinary instances he has an overmastering tendency simply to get a general impression of the whole; and, on the basis of this, he constructs the probable detail" (p. 206). Another instance of central coherence is the ease with which we recognize the contextually appropriate sense of the many ambiguous words used in everyday speech (son-sun, meet-meat, sew-so, pear-pair). A similar tendency to process information in context for global meaning is also seen with non-verbal material - for example, our everyday tendency to misinterpret details in a jigsaw piece according to the expected position in the whole picture. It is likely that this preference for higher levels of meaning may characterize even mentally handicapped (non-autistic) individuals - who appear to be sensitive to the advantage of recalling organized versus jumbled material (e.g., Hermelin & O'Connor, 1967).

Frith suggested that this universal feature of human information processing was disturbed in autism, and that a lack of central coherence could explain very parsimoniously the assets and deficits shown in Table 3. On the basis of this theory, she predicted that autistic subjects would be relatively good at tasks where attention to local information - relatively piece-meal processing - is advantageous, but poor at tasks requiring the recognition of global meaning.

Empirical evidence: assets
A first striking signpost towards the theory appeared quite unexpectedly, when Amitta Shah set off to look at autistic children's putative perceptual impairments on the Embedded Figures Test. The children were almost better than the experimenter! Twenty autistic subjects with an average age of 13, and non-verbal mental age of 9.6, were compared with 20 learning disabled children of the same age and mental age, and 20 normal 9-year-olds. These children were given the Children's Embedded Figures Test (CEFT; Witkin, Oltman, Raskin, & Karp, 1971), with a slightly modified procedure including some pretraining with cut-out shapes. The test involved spotting a hidden figure (triangle or house shape) among a larger meaningful drawing (e.g., a clock). During testing children were allowed to indicate the hidden figure either by pointing or by using a cut-out shape of the hidden figure. Out of a maximum score of 25, autistic children got a mean of 21 items correct, while the two control groups (which did not differ significantly in their scores) achieved 15 or less.

Gottschaldt (1926) ascribed the difficulty of finding embedded figures to the overwhelming "predominance of the whole". The ease and speed with which autistic subjects picked out the hidden figure in Shah and Frith's (1983) study was reminiscent of their rapid style of locating tiny objects (e.g., thread on a patterned carpet) and their immediate discovery of minute changes in familiar lay-outs (e.g., arrangement of cleaning materials on bathroom shelf), as often described anecdotally. The study of embedded figures was introduced into experimental psychology by the Gestalt psychologists, who believed that an effort was needed to resist the tendency to see the forcefully created gestalt, at the expense of the constituent parts (Koffka, 1935). Perhaps this struggle to resist overall gestalt forces does not occur for autistic subjects. If people with autism, due to weak central coherence, have privileged access to the parts and details normally securely embedded in whole figures, then novel predictions could be made about the nature of their islets of ability.

The Block Design subtest of the Wechsler Intelligence Scales (Wechsler, 1974, 1981) is consistently found to be a test on which autistic people show superior performance relative to other subtests, and often relative to other people of the same age. This test, first introduced by Kohs (1923), requires the breaking up of line drawings into logical units, so that individual blocks can be used to reconstruct the original design from separate parts. The designs are notable for their strong gestalt qualities, and the difficulty which most people experience with this task appears to relate to problems in breaking up the whole design into the constituent blocks. While many authors have recognized this subtest as an islet of ability in autism, this fact has generally been explained as due to intact or superior general spatial skills (Lockyer & Rutter, 1970; Prior, 1979). Shah and Frith (1993) suggested, on the basis of the central coherence theory, that the advantage shown by autistic subjects is due specifically to their ability to see parts over wholes. They predicted that normal, but not autistic, subjects would benefit from pre-segmentation of the designs.

Twenty autistic, 33 normal and 12 learning disabled subjects took part in an experiment, where 40 different block designs had to be constructed from either whole or pre-segmented drawn models (Fig. 1). Autistic subjects with normal or near-normal non-verbal IQ were matched with normal children of 16 years. Autistic subjects with non-verbal IQ below 85 (and not lower than 57) were compared with learning disabled children of comparable IQ and chronological age (18 years), and normal children aged 10. The results showed that the autistic subjects' skill on this task resulted from a greater ability to segment the design. Autistic subjects showed superior performance compared to controls in one
condition only - when working from whole designs. The great advantage which the control subjects gained from using pre-segmented designs was significantly diminished in the autistic subjects, regardless of their IQ level. On the other hand, other conditions which contrasted presence and absence of obliques, and rotated versus unrotated presentation, affected all groups equally. From these latter findings it can be concluded that general visuo-spatial factors show perfectly normal effects in autistic subjects, and that superior general spatial skill may not account for Block design superiority.

Fig. 1. Examples of all types of design: "whole" versus "segmented" (1, 2, 3, 4 vs. 5, 6, 7, 8); "oblique" versus "non-oblique" (3, 4, 7, 8 vs. 1, 2, 5, 6); "unrotated" versus "rotated" (1, 3, 5, 7 vs. 2, 4, 6, 8).

Empirical evidence: deficits
While weak central coherence confers significant advantages in tasks where preferential processing of parts over wholes is useful, it would be expected to confer marked disadvantages in tasks which involve interpretation of individual
stimuli in terms of overall context and meaning. An interesting example is the processing of faces, which seems to involve both featural and configural processing (Tanaka & Farah, 1993). Of these two types of information, it appears to be configural processing which is disrupted by the inverted presentation of faces (Bartlett & Searcy, 1993; Rhodes, Brake, & Atkinson, 1993). This may explain the previously puzzling finding that autistic subjects show a diminished disadvantage in processing inverted faces (Hobson, Ouston, & Lee, 1988; Langdell, 1978).

One case in which the meaning of individual stimuli is changed by their context is in the disambiguation of homographs. In order to choose the correct (context-appropriate) pronunciation in the following sentences, one must process the final word as part of the whole sentence meaning: "He had a pink bow"; "He made a deep bow". Frith and Snowling (1983) predicted that this sort of contextual disambiguation would be problematic for people with autism. They tested 8 children with autism who had reading ages of 8-10 years, and compared them with 6 dyslexic children and 10 normal children of the same reading age. The number of words read with the contextually appropriate pronunciation ranged from 5 to 7 out of 10 for the autistic children, who tended to give the more frequent pronunciation regardless of sentence context. By contrast, the normal and dyslexic children read between 7 and 9 of the 10 homographs in a contextually determined manner. This finding suggested that autistic children, although excellent at decoding single words, were impaired when contextual cues had to be used. This was also demonstrated in their relative inability to answer comprehension questions and to fill in gaps in a story text. This work fits well with previous findings (Table 3) concerning failure to use meaning and redundancy in memory tasks.

The abnormality of excellence
The hypothesis that people with autism show weak central coherence aims to explain both the glaring impairments and the outstanding skills of autism as resulting from a single characteristic of information processing. One characteristic of this theory is that it claims that the islets of ability and savant skills are achieved through relatively abnormal processing, and predicts that this may be revealed in abnormal error patterns. One example might be the type of error made in the Block Design test. The central coherence theory suggests that, where errors are made at all on Block Design, these will be errors which violate the overall pattern, rather than the details. Kramer, Kaplan, Blusewicz, and Preston (1991) found that in normal adult subjects there was a strong relation between the number of such configuration-breaking errors made on the Block Design test and the number of local (vs. global) choices made in a similarity-judgement task
(Kimchi & Palmer, 1982). Preliminary data from subjects with autism (Happé, in preparation) suggest that, in contrast to normal children, errors violating configuration are far more common than errors violating pattern details in autistic Block Design performance.

A second example concerns idiot savant drawing ability. Excellent drawing ability may be characterized by a relatively piece-meal drawing style. Mottron and Belleville (1993) found in a case study of one autistic man with exceptional artistic ability that performance on three different types of tasks suggested an anomaly in the hierarchical organization of the local and global parts of figures. The authors observed that the subject "began his drawing by a secondary detail and then progressed by adding contiguous elements", and concluded that his drawings showed "no privileged status of the global form ... but rather a construction by local progression". In contrast, a professional draughtsman who acted as a control started by constructing outlines and then proceeded to parts. It remains to be seen whether other savant abilities can be explained in terms of a similarly local and detail-observant processing style.

Central coherence and mentalizing
Central coherence, then, may be helpful in explaining some of the real-life features that have so far resisted explanation, as well as making sense of a body of experimental work not well accounted for by the mentalizing deficit theory. Can it also shed light on the continuing handicaps of those talented autistic subjects who show consistent evidence of some mentalizing ability? Happé (1991), in a first exploration of the links between central coherence and theory of mind, used Snowling and Frith's (1986) homograph reading task with a group of able autistic subjects. Autistic subjects were tested on a battery of theory of mind tasks at two levels of difficulty (first- and second-order theory of mind), and grouped according to their performance (Happé, 1993). Five subjects who failed all the theory of mind tasks, 5 subjects who passed all and only first-order tasks, and 6 subjects who passed both first- and second-order theory of mind tasks were compared with 14 7-8-year-olds. The autistic subjects were of mean age 18 years, and had a mean IQ of around 80. The three autistic groups and the control group obtained the same score for total number of words correctly read. As predicted, however, the young normal subjects, but not the autistic subjects, were sensitive to the relative position of target homograph and disambiguating context: "There was a big tear in her eye", versus "In her dress there was a big tear". The normal controls showed a significant advantage when sentence context occurred before (rare pronunciation) target words (scoring 5 out of 5, vs. 2 out of 5 where target came first), while the autistic subjects (as in Frith and Snowling, 1983) tended to give the more frequent pronunciation regardless (3 out of 5 appropriate pronunciations in each case). The important point of this study was that this was true of all three autistic groups, irrespective of level of theory of mind performance. Even those subjects who consistently passed all the theory of mind tasks (mean VIQ 90) failed to use sentence context to disambiguate homograph pronunciation. It is possible, therefore, to think of weak central coherence as characteristic of even those autistic subjects who possess some mentalizing ability.

Happé (submitted) explored this idea further by looking at WISC-R and WAIS subtest profiles. Twenty-seven children who failed standard first-order false belief tasks were compared with 21 subjects who passed. In both groups Block Design was a peak of non-verbal performance for the majority of subjects: 18/21 passers, and 23/27 failers. In contrast, performance on the Comprehension subtest (commonly thought of as requiring pragmatic and social skill) was a low point in verbal performance for 13/17 "failers" but only 6/20 "passers". It seems, then, that while social reasoning difficulties (as shown by Wechsler tests) are striking only in those subjects who fail theory of mind tasks, skill on non-verbal tasks benefiting from weak central coherence is characteristic of both passers and failers.

There is, then, preliminary evidence to suggest that the central coherence hypothesis is a good candidate for explaining the persisting handicaps of the talented minority. So, for example, when theory of mind tasks were embedded in slightly more naturalistic tasks, involving extracting information from a story context, even autistic subjects who passed standard second-order false belief tasks showed characteristic and striking errors of mental state attribution (Happé, 1994b). It may be that a theory of mind mechanism which is not fed by rich and integrated contextual information is of little use in everyday life.

The finding that weak central coherence may characterize autistic people at all levels of theory of mind ability goes against Frith's (1989) original suggestion that a weakness in central coherence could by itself account for theory of mind impairment. At present, all the evidence suggests that we should retain the idea of a modular and specific mentalizing deficit in our causal explanation of the triad of impairments in autism. It is still our belief that nothing captures the essence of autism so precisely as the idea of "mind-blindness". Nevertheless, for a full understanding of autism in all its forms, this explanation alone will not suffice. Therefore, our present conception is that there may be two rather different cognitive characteristics that underlie autism. Following Leslie (1987, 1988) we hold that the mentalizing deficit can be usefully conceptualized as the impairment of a single modular system. This system has a neurological basis - which may be damaged, leaving other functions intact (e.g., normal IQ). The ability to mentalize would appear to be of such evolutionary value (Byrne & Whiten, 1988; Whiten, 1991) that only insult to the brain can produce deficits in this area. By contrast, the processing characteristic of weak central coherence, as illustrated above, gives both advantages and disadvantages, as would strong central coherence. It is possible, then, to think of this balance (between preference for parts
vs. wholes) as akin to a cognitive style, which may vary in the normal population. No doubt, this style would be subject to environmental influences, but, in addition, it may have a genetic component. It may be interesting, then, to focus on the strengths and weaknesses of autistic children's processing, in terms of weak central coherence, in looking for the extended phenotype of autism. Some initial evidence for this may be found in the report by Landa, Folstein, and Isaacs (1991) that the parents of children with autism tell rather less coherent spontaneous narratives than do controls.

Central coherence and executive function

With the speculative link to cognitive style rather than straightforward deficit, the central coherence hypothesis differs radically not only from the theory of mind account, but also from other recent theories of autism. In fact, every other current psychological theory claims that some significant and objectively harmful deficit is primary in autism. Perhaps the most influential of such general theories is the idea that autistic people have executive function deficits, which in turn cause social and non-social abnormalities. The umbrella term "executive functions" covers a multitude of higher cognitive functions, and so is likely to overlap to some degree with conceptions of both central coherence and theory of mind. However, the hypothesis that autistic people have relatively weak central coherence makes specific and distinct predictions even within the area of executive function. For example, the "inhibition of pre-potent but incorrect responses" may contain two separable elements: inhibition and recognition of context-appropriate response. One factor which can make a pre-potent response incorrect is a change of context. If a stimulus is treated in the same way regardless of context, this may look like a failure of inhibition. However, autistic people may have no problem in inhibiting action where context is irrelevant. Of course it may be that some people with autism do have an additional impairment in inhibitory control, just as some have peripheral perceptual handicaps or specific language problems.

Future prospects

The central coherence account of autism is clearly still tentative and suffers from a certain degree of over-extension. It is not clear where the limits of this theory should be drawn; it is perhaps in danger of trying to take on the whole problem of meaning! One of the areas for future definition will be the level at which coherence is weak in autism. While Block Design and Embedded Figures tests appear to tap processing characteristics at a fairly low or perceptual level,

work on memory and verbal comprehension suggests higher-level coherence deficits. Coherence can be seen at many levels in normal subjects, from the global precedence effect in perception of hierarchical figures (Navon, 1977) to the synthesis of large amounts of information and extraction of inferences in narrative processing (e.g., Trabasso & Suh, 1993, in a special issue of Discourse Processes on inference generation during text comprehension). One interesting way forward may be to contrast local coherence within modular systems, and global coherence across these systems in central processing. So, for example, the calendrical calculating skills of some people with autism clearly show that information within a restricted domain can be integrated and processed together (O'Connor & Hermelin, 1984; Hermelin & O'Connor, 1986), but the failure of many such savants to apply their numerical skills more widely (some cannot multiply two given numbers) suggests a modular system specialized for a very narrow cognitive task. Similarly, Norris (1990) found that building a connectionist model of an "idiot savant date calculator" only succeeded when forced to take a modular approach.

Level of coherence may be relative. So, for example, within text there is the word-to-word effect of local association, the effect of sentence context, and the larger effect of story structure. These three levels may be dissociable, and it may be that people with autism process the most local of the levels available in open-ended tasks. The importance of testing central coherence with open-ended tasks is suggested by a number of findings. For example, Snowling and Frith (1986) demonstrated that it was possible to train subjects with autism to give the context-appropriate (but less frequent) pronunciation of ambiguous homographs. Similarly, Weeks and Hobson (1987) found that autistic subjects sorted photographs of faces by type of hat when given a free choice, but, when asked again, were able to sort by facial expression. It seems likely, then, that autistic weak central coherence is most clearly shown in (non-conscious) processing preference, which may reflect the relative cost of two types of processing (relatively global and meaningful vs. relatively local and piece-meal).

Just as the idea of a deficit in theory of mind has taken several years and considerable (and continuing) work to be empirically established, so the idea of a weakness in central coherence will require a systematic programme of research. Like the theory of mind account, it is to be hoped that, whether right or wrong, the central coherence theory will form a useful framework for thinking about autism in the future.

References

American Psychological Association (1987). Diagnostic and Statistical Manual of Mental Disorders, 3rd revised edition (DSM-III-R). Washington, DC: American Psychological Association.

behavioural and intentional understanding of picture stories in autistic children. Attwood.V. New York: Wiley. The autistic child's theory of mind: A case of specific developmental delay. U. Cicchetti. A. U. 12.. A. (1992). Journal of Child Psychology and Psychiatry. Developing theories of mind. Mechanical. 76. Perceptual role taking and protodeclarative pointing in autism.) (in press). 18.. 281-316. 187-276.W. Remembering: A study in experimental and social psychology.. 113-125. P. and humans. S. Frith.. U. & Seyfarth.. (1986). 4.A. (1983). U. & Cohen. (1989a). MA: MIT Press. & Siddons. Die "autistischen Psychopathen" im Kindesalter. Oxford: Oxford University Press. Journal of Abnormal Psychology. retarded and young normal children...Autism: beyond "theory of mind" 77 Asperger. Asperger. H. (Eds. S. U. D. 34. & Cross.. (1989b). by H. (1991). Can children with autism recognise surprise? Cognition and Emotion. Bowler. 7. Archiv fur Psychiatrie und Nervenkrankheiten.R. Frith. executive functions and theory of mind: A neuropsychological perspective. (1992). Autism. Theory of mind and social adaptation in autistic. 433-438. H. R. 113-127. (1993). Modularity of mind. Journal of Autism and Developmental Disorders. 25.C. Frith. 31. Frith. New York: Cambridge University Press. (1970a). J. 279-293.M. 37-46. Trends in Neuroscience. J. Baron-Cohen. Cognitive Psychology. Bishop. 14. Happe.. Emphasis and meaning in recall in normal and autistic children. D. U. & Cohen.). UK: Cambridge University Press. Out of sight or out of mind? Another look at deception in autism. B. The role of visual and motor cues for normal. (1985). D. 33. The cognitive basis of a biological disorder: Autism. Autism: Explaining the enigma. Manual of developmental psychopathology (Vol. A. (1932). (Eds. (1969). J. Frith. B. (Eds.. D. 76-136. (1990). Byrne. 29-38. Spitz. UK: Cambridge University Press. A. Cambridge.. (1989). apes. Oxford: Clarendon Press. A. 7.) (1988). Baron-Cohen. . & Olson. Baron-Cohen. 
Journal of Child Psychology and Psychiatry. Cambridge. F. Frith (Ed. 21. How monkeys see the world. Autism and Asperger syndrome. 117. Harris. 30. Leslie. & Whiten. Does the autistic child have a "theory of mind"? Cognition. S. "Theory of mind" in Asperger's syndrome. Social Development. U.L. 10.L. Baron-Cohen. Oxford: Basil Blackwell.) (1993). U. (1993). Understanding other minds: Perspectives from autism. J.J. & Leslie. D.C. (1969). Studies in pattern detection in normal and autistic children: II. (Eds. Cognition.) (1989). Aurnhammer-Frith. S. Studies in pattern detection in normal and autistic children: I. 877-893..M.. Journal of Child Psychology and Psychiatry. The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task. J. Morton. Annotation. & Hermelin.. Chicago: University of Chicago Press.M. & Frith. Frith. Frith. Inversion and configuration of faces. U.J. Bartlett. U. British Journal of Developmental Psychology. Cosmides.. P.M. F. 241-257. U. Cambridge. 33. Bartlett.. In U. Astington. Journal of Child Psychology and Psychiatry. R. Baron-Cohen. L. (1991). D.. subnormal and autistic children. & Hermelin. Cheney.. 285-297. 153-163. S. Translation and annotation of "Autistic psychopathy" in childhood.H. (1989). Frith. Fodor. 1)... & Frith. 10. (1988). Immediate recall of auditory sequences. (1970b). (1944).M. S. D. Language and Speech. A. 413-420. S. (in press). (1993). Reproduction and production of color sequences. Journal of Child Psychology and Psychiatry. & Searcy.M. Tager-Flusberg. 507-516. Baron-Cohen. The understanding and use of interpersonal gestures by autistic and Down's syndrome children. Journal of Experimental Child Psychology. F. 120-135. Leslie.. British Journal of Developmental Psychology. Machiavellian intelligence: Social expertise and the evolution of intellect in monkeys. Baron-Cohen. 1141-1155.

M. New York: McMillan. F. Leslie.E. (1991). 498-510. S. & Russell. (1991). (1987).. American Journal of Orthopsychiatry.B. Leekam. (1992). Harris. & O'Connor. A.E. 24. & Perner.28 U.. Unpublished Ph. & O'Connor. 26. Blusewicz. & Lee.). University of London.W. (1991). Kanner. Hermelin. (1994b).G. 203-218. M. Kaplan..E. R. 1-24. Domain specificity in conceptual development: Evidence from autism. M. T. S. New York: Plenum Press.H. F.G. (in preparation). 55-65. A. Langdell. Autistic children's difficulty with mental disengagement from an object: Its implications for theories of autism. B.G. 412-426. Landa. U. In E. L.. (submitted). 13. C.). & O'Connor. J. An advanced test of theory of mind: Understanding of story characters' thoughts and feelings by able autistic. 213-218. M. & Snowling. Ueber den Einfluss der Erfahrung auf die Welt der Wahrnehmung von Figuren. F. 225-251.G. F. New York: Cambridge University Press. 441-453.H. P.A. Journal of Speech and Hearing Research. Koffka. C. & Palmer. What's in a face? The case of autism...E.. J. Journal of Experimental Psychology: Human Perception and Performance. (1988). Remembering of words by psychotic and subnormal children. (1926). 43. Oxford: Pergamon. Journal of Developmental Psychology. Psychological Review. 40. Cognition.. 34.L. J. Autistic disturbances of affective contact. Does the autistic child have a metarepresentational deficit? Cognition. S. Gillberg. 101-119. Kohs. and global-local similarity judgement in autistic subjects. F. 455-465. (1978). Principles of Gestalt psychology.R. U. (1991). (1992). Leslie. 94..E. Hart.. (1923). 8. Happe. 16. A. Hobson. In J. & Isaacs. & Frith. 8. K. Mesibov (Eds. F. (in press). E. 35. Theory of mind and communication in autism. F. K. Astington.M. Developmental Psychology. F. Psychological experiments with autistic children. Journal of Child Psychology and Psychiatry. S.D. Hermelin. Form and texture in hierarchically constructed patterns. Happe.G. Cognition. (1993).M.J. 2. 
thesis.. K. B. Recognition of faces: An approach to the study of autism. N.. Intelligence measurement. Central coherence. 255-268. (1994a). Annotation: Psychological theories of autism. B. (1943). London: Mac Keith Press. 329-342. Journal of Autism and Developmental Disorders. Happe. Idiot savant calendrical calculators: Rules and regularities. British Journal of Psychology. British Journal of Psychology. Olson (Eds. Some implications of pretence for mechanisms underlying the child's theory of mind. (1956). C. & Preston. J. 79.. Psychologische Forschung. L. 48. Journal of Child Psychology and Psychiatry. 217-250. Leslie. Kramer..C. Journal of Clinical and Experimental Neuropsychology. Hughes. (1986). (1989). R. 1. (1982). Ouston. Schopler & G. Without reason: A family copes with two generations of autism. & Thaiss.. Early infantile autism 1943-1955. Happe" Frith. block design errors. Happe. (1935). 1339-1345. 261-317. Happe. . Kimchi. R. Folstein. N.. 19. 215-229. Learning and cognition in autism. N. Communicative competence and theory of mind in autism: A test of relevance theory.G. mentally handicapped and normal children and adults. & Coleman. Gottschaldt. L. 29. & D. Reading for meaning and reading for sound in autistic and dyslexic children.P. (1993). Visual hierarchical analysis of block design configural errors. Pretence and representation: The origins of "Theory of Mind".. T. (1983). (1967). Hermelin. Theory of mind and IQ profiles in autism: A research note. Happe. The biology of the autistic syndromes. New York: Penguin Books. Happe. Theory of mind in autism. C. 885-893. Nervous Child. 521-535. Psychological Medicine. Spontaneous narrative-discourse performance of parents of autistic individuals. Kanner. L.E. (1970). Developing theories of mind. (1988). New York: Harcourt Brace.E. 58. & Eisenberg. Frith.E.

J. (1967). 1. A five to fifteen year follow-up study of infantile psychosis: IV. L. J. Psychological Medicine. Idiot savant calendrical calculators: Maths or memory. Oxford: Blackwell. D. Prior. What's lost in inverted faces? Cognition. & Frith. 357-380. B.). 42. B. 18. D. & Hill. & Leekam..). 35. MA: MIT Press. N. C. Forest before trees: The precedence of global features in visual perception. 392-415. (1990). Morton. Manual of Developmental Psychopathology (Vol. U. Idiot savants. & Frith.F. A. How to build a connectionist idiot (savant). (1984). British Journal of Social and Clinical Psychology.D. & Woodruff. D. Cohen (Eds. pp. The role of cognition in child development and disorder.": Attribution of second-order beliefs by 5-10 year old children. 515-526. Deception and sabotage in autistic.L.... Russell. and communication. J.. Mottron. U. Rhodes. B. (1993). 1351-1364. 353-383. A. Patterns of cognitive ability.R. New York: Wiley. S.J. Communication. Premack. Leslie. thesis. .) (1987). Wortis (Ed. (1984).R. 152-163. 1-16. & Frith. Perner. (1993). U. Causal modelling: A structural approach to developmental psychopathology. Cognitive abilities and disabilities in infantile autism: A review. Journal of Experimental Child Psychology. O'Connor. A study of perceptual analysis in a high-level autistic subject with exceptional graphic abilities. 34. (1992). Ozonoff. 7. Unpublished Ph.. The theory-theory: So good they named it twice? Cognitive Development. UK: Penguin Books. 485-519.. Brake.. 32. In J. (in press). Journal of Child Psychology and Psychiatry. Cognition. B. Does the chimpanzee have a theory of mind? Behavioural and Brain Sciences. G. News from the Border: A mother's memoir of her autistic son.M. & Rogers. Neurobiological issues in autism. Norris. & Atkinson. 1-4. Pennington. (1983).. A. New York: Ticknor & Fields. Ozonoff. 19. 1081-1106. & Belleville. (1989). (1979). 32. In D.. Snowling. 25-57.F. (1993). Shah.J. J.. Journal of Child Psychology and Psychiatry. 
Journal of Child Psychology and Psychiatry. 7. belief. 60. Cambridge. Rutter. 9. & Hermelin. (1978). S. Perner. "John thinks that Mary thinks that . . 277-291. & Rutter. The natural history of able autistic people: Their management and functioning in social context. 801-806. S. & Wimmer. Navon.. (1991).Autism: beyond "theory of mind" 29 Lockyer.B. & Frith. S. Phillips. Journal of Experimental Child Psychology. S. (1993). 1-2. (1986). Comprehension in "hyperlexic" readers. 47. . A. (1970). J. 33. 279-309. 23. New York: Plenum Press. Sperber. Journal of Abnormal Child Psychology. Relevance: Communication and cognition. University of London. L.T.. (1984). Schopler. The siege: The battle for communication with an autistic child. 591-605. Cognitive Psychology.. Exploration of the autistic child's theory of mind: Knowledge. Brain and Cognition.. D. U. S. Child Development. J. Asperger's syndrome: Evidence of an empirical distinction from high-functioning autism. Sodian. (1987). Understanding intention and desire by children with autism. An islet of ability in autistic children: A research note. 437-471. Rimland.. H. (1985).. Newson. Journal of Child Psychology and Psychiatry.. U. P. 14. (1991). S. E. (1992)..J. W. (1993). Shah. 9. New York: PlenumPress. Harmondsworth. Understanding the representational mind. 1107-1122. Ch. (1986).P. 155-169). retarded and normal children. (Eds. M. E.C. U. & Everard. M. British Journal of Medical Psychology. 4. 24. Park. D. 689-700. M.. Mental retardation and developmental disabilities (vol. & Wilson. Perner. Dawson. 613-620. 39. M. G.. 60. B. & Frith.. Summary of the report to DHSS in four parts. 13. & Mesibov. (1991). McDonnell. & Pennington. (1977). Cicchetti & D. A... Why do autistic individuals show superior performance on the Block Design task? Journal of Child Psychology and Psychiatry. M. Executive function deficits in high-functioning autistic children: Relationship to theory of mind. G. 13). Rogers. Frith.

J. Happ<5 Tager-Flusberg. MA: MIT Press.). (1987). (1983). (Ed.P. Journal of Speech and Hearing Research.. Beliefs about beliefs: Representation and the constraining function of wrong beliefs in young children's understanding of deception. A. 103-128. C.30 U. M. 27.J. (1991). 16. Wechsler. D.. Severe impairments of social interaction and associated abnormalities in children: Epidemiology and classification. Quarterly Journal of Experimental Psychology. Journal of Mental Deficiency Research. & Karp. Oxford: Oxford University Press. Tanka. 11-29.Revised.J.. H. 137-152. 9. 46A. T. H. Intellectual characteristics of adolescent childhood psychotics with high verbal ability. Natural theories of mind. & Perner. Oltman. Understanding text: Achieving explanatory coherence through on-line inferences and mental operations in working memory. S. 364-377. 13. S. Frith. Wechsler. Tager-Flusberg. 3-34. Cohen (Eds. H. New York: Psychological Corporation. Journal of Autism and Developmental Disorders. Simmons. Semantic processing in the free recall of autistic children: Further evidence for a cognitive deficit.Q. Tymchuk.. J. . California: Consulting Psychologists Press. (1974). British Journal of Developmental Psychology. H. (1977).M. Witkin. S. 21. L. & D. Wechsler Adult Intelligence Scales .Revised. Oxford: Basil Blackwell.. J. (1981). (1984). (1993). S. The child's theory of mind. (1971).A. Parts and wholes in face recognition.. Tager-Flusberg. & Prutting. F. (1979). Weeks. Wing. (1993). & Farah. Wetherby. 417-430. Discourse Processes. J.W.K. & Suh.J. (1990). A.J. (1993). New York: Psychological Corporation. Wellman. What language reveals about the understanding of minds in children with autism. The salience of facial expression for autistic children. D.) (1991). Cognition..M. 28. 9. & Neafsey. Cambridge. & Hobson. Profiles of communicative and cognitive-social abilities in autistic children. R. Wimmer. P. Whiten. In S... A manual for the Embedded Figures Test. 225-245. 
Trabasso.A. 133-138. Raskin.. A.. Baron-Cohen. E. Wechsler Intelligence Scale for Children . Understanding other minds: Perspectives from autism. H. Journal of Child Psychology and Psychiatry. & Gould. H.

3 Developmental dyslexia and animal studies: at the interface between cognition and neurology

Albert M. Galaburda
Department of Neurology, Beth Israel Hospital and Harvard Medical School, Boston, MA 02215, USA

Abstract

Recent findings in autopsy studies, neuroimaging, and neurophysiology indicate that dyslexia is accompanied by fundamental changes in brain anatomy and physiology, involving several anatomical and physiological stages in the processing stream, which can be attributed to anomalous prenatal and immediately postnatal brain development. Epidemiological evidence in dyslexic families led to the discovery of animal models with immune disease, comparable anatomical changes and learning disorders, which have added needed detail about mechanisms of injury and plasticity to indicate that substantial changes in neural networks concerned with perception and cognition are present. It is suggested that the disorder of language, which is the cardinal finding in dyslexic subjects, results from early perceptual anomalies that interfere with the establishment of normal cognitive-linguistic structures, coupled with primarily disordered cognitive processing associated with developmental anomalies of cortical structure and brain asymmetry. This notion is supported by electrophysiological data and by findings of anatomical involvement in subcortical structures close to the input as well as cortical structures involved in language and other cognitive functions. It is not possible at present to determine where the initial insult lies, whether near the input or in high-order cortex, or at both sites simultaneously.

The preparation of this review was supported by grant 2P01 HD20806 from NIH/NICHD. The author thanks his colleagues Drs. Gordon F. Sherman and Glenn D. Rosen for their collaboration. The author thanks Dr. Lisa Schrott, presently at the Department of Psychiatry, University of Colorado Health Sciences Center, for help in the preparation of the report on behavioral studies in immune-defective mice.

1. Introduction

An important aim of behavioral neurology in the clinics and its sister specialty concerned with normal biology, cognitive neuroscience, is the mapping of behavior onto neurophysiology and brain structure. Where it concerns complex behaviors such as the elements of language, high-level vision, biographical memory, consciousness and attention, and the clinical syndromes in which these functions fail, progress has been slow. This is in part the result of the fact that to a large extent animal models have not been possible for capturing behaviors thought to be either purely human, or considered to have achieved qualitative or quantitative uniqueness (or both) only in human beings. Developmental disorders offer a partial advantage in this regard in that they concern the early acquisition of cognitive capacities, whereby relatively simpler biological and environmental factors may prove to be more important. Thus, for instance, sensory and perceptual deficits are more likely to play a role in language acquisition than they do after language has been implemented in the brain, and it is relatively easier to model sensory and perceptual abnormalities in animal models than it is to model cognitive processes like language. Our recent work is being driven by the hypothesis that the condition known as developmental dyslexia, also known as specific reading disability, originates as a disorder of perception affecting the brain at a vulnerable time (before the age of 1 year) when phonological structures relating to the native language are being organized in the developing brain. In this review, I will outline some of the findings in the brains of dyslexics and in suitable animal models, which have led to this hypothesis.

Dyslexia remains a diagnosis that is made on educational grounds, that is, based on detection in the schools and on results of psychoeducational test batteries. Some specific cognitive anomalies are thought to be present in nearly all dyslexics, for example phonological deficits (Fischer, Liberman, & Shankweiler, 1978; Morais, Luytens, & Alegria, 1984), but it is likely that many with similar deficits are not dyslexic and some dyslexics do not exhibit phonological deficits (see recent discussions in Cognition, Volume 48, Number 3, September 1993). Thus, there is, as yet, no absolute cognitive marker for the diagnosis of dyslexia. This should not be surprising in a skill for which many areas of mental capacity on the one hand and cultural influences on the other play such an important role. In terms of brain loci and etiology, the diagnosis of dyslexia probably encompasses several biological subtypes, as yet unidentified. Although it is theoretically possible to learn to be dyslexic, in which case the brain would be found to be normal, brains of individuals with this diagnosis, which have come to post-mortem examination, have exhibited fairly uniform neuropathological findings (see below). Autopsy studies have not disclosed differences between dyslexics of the visual and auditory types, either reflecting the fact that available methods are not capable of showing such differences or the fact that phonological cases,

being more common, have constituted the bulk of the anatomical studies.

Our present research is directed in part toward answering the question of whether subcortical damage, or damage at early stages of processing, is the first step, causing developmental changes both upstream and downstream in the interconnected neural networks, which would support the hypothesis that cognitive deficits are secondary to early perceptual deficits. Findings so far, however, indicate that the brain is affected widely, including much of the perisylvian cortex containing both auditory and visual areas (Humphreys et al., 1990), as well as subcortical regions closer to the input (Galaburda & Livingstone, 1993; Livingstone et al., 1991). The areas of anatomical abnormality are likely to interact with each other during development so that it is not possible to state with confidence at present whether the cortical changes are primary, or are in themselves secondary to changes occurring upstream or downstream first. Another possibility, more difficult to justify on the basis of available information, is that of injury to multiple stages of processing at the same developmental time.

2. Neuroanatomical characteristics

My colleagues and I have reported alterations in the pattern of brain asymmetry of language areas as well as minor cortical malformations in four male and three female dyslexic brains (Galaburda, Sherman, Rosen, Aboitiz, & Geschwind, 1985; Humphreys, Kaufmann, & Galaburda, 1990). Specifically there is absence of the ordinary pattern of leftward asymmetry of the planum temporale, and the perisylvian cortex displays minor cortical malformations, including foci of ectopic neurons in the molecular layer and focal microgyria. The planum temporale is a part of the temporal lobe thought to make up a portion of Wernicke's speech area. The perisylvian cortices affected by the minor malformations include inferior frontal cortex, both in the region of and anterior to the classical Broca's area, cortex of the parietal operculum often involved by lesions producing conduction aphasia, cortex of the inferior parietal lobule often involved in anomic aphasia with writing disturbances, cortex of the superior temporal gyrus (part of Wernicke's area again), and cortex belonging to the "what" portion of the visual pathway in the middle and inferior temporal gyrus.

A second set of observations is made on the human dyslexic thalamus. We have found that both neurons in the magnocellular layers of the lateral geniculate nucleus and in the left medial geniculate nucleus are smaller than expected (Galaburda & Livingstone, 1993; Livingstone, Rosen, Drislane, & Galaburda, 1991). The former is associated with slowness in the early segments of the magnocellular pathway as

assessed by evoked response techniques addressing magnocellular function separately from parvocellular (Livingstone et al., 1991). The latter may relate to the temporal processing abnormalities described in the auditory system of language impaired children (Tallal & Piercy, 1973). Furthermore, similar temporal processing abnormalities have long been suspected to underlie deficits of aphasic patients (Efron, 1963).

The relationship between the two elements of the first set of findings, that is, the anomaly in asymmetry and the cortical malformations, and the relationship between the first and second set, that is, the anomaly in rapid temporal processing associated with thalamic changes, is another focus of research with possibilities for resolution in animal models. Some answers have come from the study of strains of mice that spontaneously develop autoimmune disease, which also may display foci of neocortical ectopic neurons (Sherman, Galaburda, Behan, & Rosen, 1987; Sherman, Galaburda, & Geschwind, 1985), as is the case in dyslexic brains. Thus, ectopic animals show alterations in the usual pattern of brain asymmetry, thus suggesting an interaction between very early developmental events and the expression of cortical asymmetry (Rosen, Sherman, Mehler, Emsbo, & Galaburda, 1989). In the normal state, increasing size asymmetry between homologous areas of the two hemispheres is associated with decreasing number of neurons and decreasing amount of cortex on the small side (Galaburda, Aboitiz, Rosen, & Sherman, 1986; Galaburda, Corsiglia, Rosen, & Sherman, 1987) (rather than increasing amount on the large side). This relationship breaks down in animals with ectopias, in which asymmetry and size and number of neurons interrelate randomly. Such a discovery was not possible in the dyslexic brain sample alone, since all of them displayed symmetry of the planum temporale. One could suggest, therefore, that variable degrees of asymmetry work well only when a specific relationship between degree of asymmetry and neuronal numbers is allowed to take place, and that the development of ectopias may interfere with this relationship. There is an additional formal relationship between asymmetry and callosal connectivity (Rosen, Sherman, & Galaburda, 1989), which we anticipate will also break down in the presence of ectopias (research in progress). This would add further support to the notion that symmetry in the presence of ectopias, both in the mouse and human brains, is likely to be associated with fundamental changes in the functional properties of networks participating in perceptual and cognitive activities.

Animal research has also aimed at explaining the question of how minor malformations could lead to noticeable and even clinically persistent disorders of cognitive function. Thus, associated with these "ectopias", there may be significant changes in some cortical neuronal subtypes (Sherman, Stone, Rosen, & Galaburda, 1990), prominent alterations in corticocortical connectivity (Sherman, Rosen, Stone, Press, & Galaburda, 1992), and modification of behavior (see below).

Early on there were reports of neuroimaging aberrations in the pattern of brain

asymmetry in dyslexic subjects (Hier, LeMay, Rosenberger, & Perlo, 1978). More recently, Hynd and colleagues (Hynd, Semrud-Clikeman, Lorys, Novey, & Eliopulos, 1990) examined the specificity of the asymmetry changes reported in dyslexic brains, in a magnetic resonance imaging (MRI) study. Their study found that symmetry of the planum was significantly more common in dyslexics as compared with normals and individuals with attention deficit disorder/hyperactivity syndrome (ADD/H). The study also found that both ADD/H and dyslexics had a narrower right anterior temporal area, and dyslexics alone had bilaterally smaller insular regions and significantly reduced left plana temporale. This reduction in the size of the left planum is in contrast to research, again in normal experimental animals, showing that symmetry, as compared to asymmetry, reflects bilaterally large symmetrical regions rather than bilaterally small symmetrical regions (Galaburda et al., 1987) and relatively excessive numbers of neurons (Galaburda et al., 1986) and interhemispheric connections (Rosen et al., 1989; Witelson, 1985). However, as seen in the mice with ectopias, the normal interaction between asymmetry and size no longer exists in the latter (see above). Another explanation, and this is this author's experience, is that different investigators use somewhat different criteria and methods for outlining the planum temporale (Galaburda, 1993), and this may explain why different imaging studies find different effects on the planum temporale.

Larsen and colleagues (Larsen, Hoien, Lundberg, & Odegaard, 1990) reconstructed and measured the planum temporale in 19 eighth-grade dyslexics and appropriate controls using MRI. In the dyslexics 70% of the brains showed symmetry of the planum, as compared to only 30% of the control sample. Among the dyslexic subgroup exhibiting phonological deficits, 100% of the brains showed absence of asymmetry of the planum, leading the authors to suggest that asymmetry of the planum is necessary for normal phonological awareness.

Another MRI study examined neuroanatomical differences between dyslexics and normals (Duara et al., 1991). This study found that a brain segment lying anterior to the occipital pole was larger on the right in dyslexics but not in controls. None of the brain segments corresponded to the planum temporale of other studies, so comparisons were not possible. Relative area measurements of the midsagittal corpus callosum (CC) showed that female dyslexics had the largest CC, followed by male dyslexics, followed by normal females, followed by normal males. The splenium of the callosum was also largest in female dyslexics, followed by male dyslexics, followed in turn by normal readers. The splenium contains fibers from the posterior temporal and parietal cortices, which participate in language functions.

A recent study by Leonard and colleagues, which parceled the region of the planum and adjacent parietal operculum on MRI, described a shift toward increased right parietal opercular tissue in the dyslexic sample, away from temporal tissue, as well as anomalies in folding of the cortex in that region (Leonard et al., 1993). Moreover, this report, too, supports the notion

that alterations in the CC may be characteristic of dyslexic brains and may underlie, in part, their reading difficulties.

Some studies have found increased left-handedness in dyslexics and increased dyslexia in left-handers (Geschwind & Behan, 1982; Schachter, Ransil, & Geschwind, 1987), and left-handers, like dyslexics, often show aberrant patterns of brain asymmetry (LeMay, 1977). A recent study (Steinmetz, Volkmann, Jancke, & Freund, 1991) confirmed this finding in MRI scans. This study produced particularly accurate reconstructions of the planum temporale from MRI scans, which corresponded most closely to studies done directly on post-mortem material. The authors reconstructed and measured the planum temporale in 52 normal subjects (26 right-handers and 26 left-handers) and found that left-handers had a lesser degree of leftward planum asymmetry than right-handers.

3. Neurophysiological studies

Altered brain potentials have been described in dyslexics. One recent study (Landwehrmeyer, Gerling, & Wallesch, 1990) looked at auditory potentials evoked by a variety of linguistic and non-linguistic stimuli in dyslexics and non-dyslexics. Both groups showed increased right hemisphere surface negativity in a non-linguistic stimulus. Unlike normal readers, however, dyslexics failed to show increased left hemisphere negativity to linguistic stimuli, but instead showed increased right hemisphere negativity, thus suggesting that linguistic stimuli are treated in part as non-linguistic stimuli by this group. Another study (Chayo-Dichy, Ostrosky-Solis, Meneses, Harmony, & Guevara, 1990) looked at contingent negative variation (CNV), or expectancy wave. The resolution of this wave is called the postimperative negative variation (PINV). Both amplitude and latency differences at a left parietal site were documented in the PINV and amplitude differences in the CNV in a sample of nine right-handed preadolescent boys, as compared to matched controls. This suggested to the authors that significant differences existed in expectancy, attention and brain activity signal processing. A third study (Stelmack & Miles, 1990) found that disabled readers seemed to have a failure to engage long-term semantic memory, as demonstrated by their N400 responses to visually presented primed and unprimed words. These are the waves occurring at 400 ms from the stimulus onset and presumably reflect activity in cognitively related cortex. These results, combined, indicate that physiological differences exist between dyslexic and normal readers in cognitive processing, particularly affecting linguistic categories.

There have been several functional tomographic studies using positron emission tomography (PET) (Hagman et al., 1992; Rumsey et al., 1992). Rumsey and colleagues (1992) demonstrated anomalies in cerebral blood flow in the left temporoparietal region (an area important for language) in dyslexic men. What is less well recognized

is their difficulties at the more peripheral levels of sensory processing and perception. The flicker fusion rate, which is the fastest rate at which a contrast reversal of a stimulus can be seen, is abnormally slow in dyslexic children at low spatial frequencies and low contrasts (Lovegrove, Garzia, & Nicholson, 1990). These stimuli are handled by the magnocellular pathway of the visual system, which is segregated already in the retina and continues to be separate through the lateral geniculate nucleus (LGN), the primary visual cortex (V1) and higher-order visual cortices (Livingstone & Hubel, 1988). On the other hand, there is an ongoing debate as to the extent to which the two subsystems remain segregated, beginning even in V1, and whether the separation changes in character altogether (Ferrera, Nealey, & Maunsell, 1992; Lachica, Beck, & Casagrande, 1993). Williams and Lecluyse (1990) have taken advantage of this possibility and have shown that image blurring, which reduces the contrast of high spatial frequencies, is capable of re-establishing normal temporal processing of words in disabled readers.

Another evoked potentials study (Livingstone et al., 1991) reported visual findings in dyslexics that can be attributed to perceptual anomalies and do not therefore primarily implicate language dysfunction. Flickering checkerboard patterns were presented to dyslexics and non-dyslexics at different contrasts and rates, and the transient and sustained visual evoked potentials were recorded. The parvocellular pathway, which is slow, relatively contrast insensitive, and color selective, appeared to function normally in the dyslexic group. On the other hand, the dyslexics showed abnormalities when fast, low-contrast stimuli were presented. The timing of the physiological abnormality in the dyslexics suggested a magnocellular deficit early in the pathway, implicating the retina, the LGN and/or V1, none of which is currently thought to mediate cognitive functions. Moreover, in the same study, the neurons present in the magno- and parvocellular layers of the LGN were measured in five dyslexic and five control brains, and it was found that the magno cells only were smaller in the dyslexic group, thus complementing the physiological findings of a slowed magnocellular system. The parvo cells were not changed in size, but in some specimens both magno and parvo layers displayed disorganization of their architecture.

Earlier evidence suggested a similar defect in fast auditory processing (Tallal & Piercy, 1973) and proposed that fast processing may be abnormal in several modalities, including the visual and auditory, and that such abnormalities may interfere with auditory and visual language acquisition and efficient language processing at loci where fast processing is required for extraction of meaning. We have also found preliminary evidence that the types of anatomical abnormalities described in the visual thalamus may extend to the auditory thalamus as well. We have measured the cell bodies of representative regions of the medial geniculate nuclei of dyslexic and control brains and have found that there is a shift in the former toward smaller neurons, especially affecting the left

hemisphere (Galaburda & Livingstone, 1993). Such a shift may reflect a corruption of a hitherto not well-understood large-celled system, and lesser differences between this large-celled system and a slower small-celled system, which could again produce difficulties in the processing of rapidly changing auditory information (Tallal & Piercy, 1973).

It is not possible to state at this stage in the research program whether the functional deficits demonstrated early on in the pathway, associated with anatomical changes also relatively close to the sensory organs, represent a primary failure or the result of changes that have begun downstream and have propagated toward the periphery (as well as further downstream). Although all possibilities are predicted from current notions of developmental plasticity, animal models are being developed in our laboratory to study this question specifically vis-a-vis the relationship between cortical anomalies and changes in the thalamus: which came first?

4. Behavioral studies in animals with anomalous cortex

As indicated above, NZB and BXSB mice are being used as an animal model of cortical malformations associated with the human dyslexic condition (Denenberg, Mobraaten et al., 1991; Denenberg, Sherman et al., 1991; Schrott, Denenberg, et al., 1992; Sherman, Rosen, & Galaburda, 1987). A priori any behavioral abnormalities demonstrated in the immune-defective mice could be attributed either to the presence of cortical malformations, the existence of abnormalities in immune function making the animal sick, or to a combination of both. Because dyslexia is a specific learning disorder and may be accompanied by enhancement of certain abilities (Geschwind & Galaburda, 1985a, 1985b, 1985c), the use of a behavioral battery is crucial. The battery of behavioral tasks that we have administered includes four measures of learning (discrimination, spatial, complex maze and avoidance learning), as well as tests of lateralization and activity. We have been able to separate the behavioral deficits into two types: ectopia-associated behaviors, and autoimmune-related behavior.

4.1. Ectopia-associated behaviors

In working with inbred strains, one of the most difficult problems encountered is the choice of an appropriate control group. Comparing across inbred strains is risky, since behaviors are known to be influenced by the vastly different genetics of each strain. Therefore, within-strain comparisons were used to examine the behavior of NZB and BXSB mice. In the investigation of ectopia-associated behaviors, this was easily accomplished since 40-50% of NZB and BXSB mice

Schrott. The presence of ectopias also interacts with an animal's paw preference. A similar pattern of behavior is seen in the Morris maze. . No such effects were seen in non-ectopic mice (Schrott. The starting point varies from one of four locations in a semi-random sequence. Mice received 10 trials for 5 days (Denenberg. For water escape learning. Denenberg. The mice received 5 trials on a single day of testing and time to reach the platform was recorded. and two spatial measures. left-pawed NZB males and females and BXSB males had faster times than their right-pawed counterparts. Ectopias depressed performance on discrimination learning. reflecting a different pattern of learning than NZB mice without ectopias. 1990). In this task. In this task an animal was placed in one end of an oval tub and had to swim to find a hidden escape platform at the other end using extra-maze spatial cues. 1990). compensated for in the presence of alternative strategies available in enriched early environments. right-pawed BXSB male mice had better performance than their left-pawed ectopic littermates.Developmental dyslexia and animal studies 39 develop ectopias. so the animals had to use an associative. Measures included number of correct choices and time to reach the escape ladder. This is in itself interesting. Waters. Talgo. 1992). & Galaburda. Rearing in an enriched environment. NZB mice with ectopias made fewer correct choices and took longer to find the escape ladder over the first 4 days of testing. Three measures in the behavioral battery were found to be sensitive to the presence of ectopias: a non-spatial discrimination learning task. Mice received 4 trials a day for 5 days (Denenberg et al.. however. Reinforcement consisted of escape from the water plus being placed in a dry box beneath a heat-lamp. Denenberg. This is a complex spatial task requiring the animal to find a hidden escape platform using extramaze cues. 
since it supports the notion that focal brain injury early on can be. the maze is divided into four quadrants and three annuli and the percentage of time spent in each portion was also measured. In ectopic NZB mice. Sherman. 1992). No paw preference effects were seen in non-ectopic . This test utilized a two-arm swimming T-maze. Waters. rather than spatial or positional strategy. et al. On the discrimination learning task. The left-right location of the positive stimulus was altered in a semi-random sequence. Enriched ectopic NZB mice had similar performance to their enriched non-ectopic littermates (Schrott. with a grey stem. Rosen.water escape and the Morris maze. Ectopic NZB mice were slower to find the escape platform and spent more time in the outermost annulus of the maze. In addition. at least in part. but caught up by day 5. Sherman. An escape ladder hung at the end of the alley designated to be positive (Wimer & Weller. The opposite relationship was seen for the spatial water escape task. was able to compensate for the deficit in ectopic mice. and a black and a white alley. to solve the task. Time and distance to reach the escape platform were measured. & Kenner. a significant improvement in both performance measures was seen if they were reared in an enriched environment as compared to standard cages. 1965).

after formation of the blood-brain barrier soon after birth. & Galaburda. Retz. Lai. 1991. Press. & Lai. Denenberg et al. The box was separated into two compartments by a divider. 1992. thus suggesting that ectopias could have more or less severe functional consequences according to laterality. Rosen. activity. The disruption of underlying fiber architecture. Sherman. Stone. Forster. & Galaburda. Autoimmune-related behavior It should be remembered that ectopias arise from influences on brain development taking place as early as the 13th embryonic day. In fact. is the crucial characteristic for these associations. In addition. hemisphere or size.. 1990. Freter. Stone.. Morrison. could cross the blood-brain barrier and indirectly affect brain function. Talgo. Denenberg. Rosen. Spencer. Poor performance in active and passive avoidance conditioning has been consistently associated with autoimmune mice (Denenberg. Most likely this is because the damage from an ectopia is more widespread than the focal lesion itself.40 A. Across numerous studies it has been found that the presence of an ectopia. 1992. such as paw preference and environmental enrichment. Press. & Galaburda. & Lai.2. but often at a slower rate or with poorer scores. with concomitant learning deficits. the Lashley maze or avoidance conditioning. 1990). The behavioral consequences of ectopia presence are task-specific. Humphries. Ectopic mice are capable of learning. Mobraaten et al. Galaburda mice of either strain (Denenberg. and that subsequently with increasing age many mice with or without ectopias acquire autoimmune disease consisting of humoral and cell-mediated injury to many organs other than the brain. Carroll. Waters et al. & Deni. Again this is interesting in the light of claims that dyslexics are more likely than non-dyslexics to be left-handed (Geschwind & Behan. Stone. is not known at present. 1990. 1983. 
Whether the learning deficits are a direct consequence of the ectopia or whether an ectopia is a marker for aberrant development in general. Sherman. & Galaburda. & Bennett. Behan. Five seconds of a pulsed light served as the . rather than its architectonic location. 4.. Rosen. Sherman. ectopias interact with other variables. Sherman. 1982). 1986). metabolic changes arising from failure of organs such as the kidneys and liver. Nandy. 1992. which are often involved in autoimmunity. Schrott. Rosen. Mathis. alterations in neuronal circuitry and neurotransmitter abnormalities that accompany an ectopia reflect a brain that has developed abnormally (Sherman. the brain is relatively protected from autoimmunity. On the other hand. 1988. In the present set of studies avoidance conditioning was conducted in a two-way shuttlebox. No main ectopia effects are seen for measures of lateralization. Bennet. 1991).

conditioned stimulus, while up to 20 s of 0.4 mA footshock acted as the unconditioned stimulus. The number of avoidances, escapes (and the time to make them), as well as null responses were recorded.

The most striking characteristic of NZB and BXSB mice is their very poor performance, with few avoidances made. Escape from the shock is their preferential response. It is interesting to note that failure to avoid or escape the shock (a null response) is not extinguished rapidly, as would typically be found in an animal learning this task. Instead, null responses and the latency to escape often increased across days. This response pattern is associated with a high degree of autoimmunity (the more autoimmune a mouse, the poorer the avoidance performance).

The negative relationship between these two variables was more difficult to establish than the relationships between ectopias and behavior because of the lack of a proper control. Comparing avoidance performance within a strain was not possible, because all mice within a strain develop an autoimmune condition and there is insufficient variability in avoidance performance (all mice have poor performance). Comparing avoidance performance across strains (an autoimmune strain vs. a strain with normal immune functioning) is problematic because an avoidance difference could result from any number of genetic differences unrelated to immune functioning. This difficulty was solved in a rather complicated way. A set of embryo transfer studies permitted comparison of genetically identical mice who differed with regard to their immune status. Transfer of a non-autoimmune DBA embryo to an autoimmune BXSB maternal host induced autoimmune disease in the adult animal, as well as impaired avoidance performance. Conversely, when the severity of the disease was reduced by transferring an NZB embryo to a non-autoimmune hybrid mother, avoidance performance was improved (Denenberg, Mobraaten et al., 1991).

Further support for this association was provided by a study with BXSB-DBA reciprocal hybrids. The hybrid offspring were autosomally identical but differed in degrees of immune reactivity, as a function of the uterine environment in which they were raised. The DBA x BXSB cross yielded offspring with greater immune reactivity and poorer avoidance performance than the BXSB x DBA cross (Denenberg et al., 1992). Finally, in a group of genetically related mice (NXRF recombinant inbred lines), the line with the greatest degree of autoimmunity had the poorest avoidance performance (Schrott, personal communication). Thus, in four groups of mice with vastly different rearing histories, the degree of autoimmunity was negatively related to performance on an active avoidance conditioning task.

The degree of autoimmunity was not associated with any of the other tasks in the behavioral battery. Environmental enrichment had no effect on avoidance learning, nor were any ectopia interactions present (Denenberg, Mobraaten et al., 1991; Denenberg et al., 1992; Schrott, Denenberg, Sherman, Waters et al., 1992). In addition, pharmacological manipulations including cholinomimetics,

& Galaburda. Denenberg. Rosen. male rats with neonatally induced focal malformation of the cortex (the resultant malformation is similar to one of the forms found in dyslexic brains (Humphreys. 1). 1984. using water reinforcement. & Schleicher. personal communication). Press. Zilles. Certain recombinant inbred lines with NZB as one of the progenitors develop a higher incidence of callosal agenesis. including the formation of ectopias and a small infrapyramidal mossy fiber tract system (Anstatt. In an effort to relate these results. (Tallal. and this abnormality affects spatial learning (Schrott. Denenberg. sequential tones. Rosen. Results showed that all rats were able to . Fig. 1988). Stimuli were reduced in duration from 750 to 375 ms across 24 days of testing. Nowakowski. Temporal processing in rats Language-impaired individuals exhibit severe deficits in the discrimination of rapidly presented auditory stimuli. These possible mechanisms are by no means mutually exclusive. Sherman. Galaburda nootropics and antidepressants failed to improve performance (Schrott. Tallal & Piercy. Other behaviors One behavior that does not fit into either of these categories is the Lashley m a z e . Sherman.4. 1992). 1991)) were tested in an operant paradigm for auditory discrimination of stimuli consisting of two sequential tones.. These abnormalities are not seen in BXSB mice (Sherman et al. including phonological and non-verbal stimuli (i. 1991. Waters et al. 1988. 1992). (2) effects of circulating autoantibodies or other autoimmune factors. NZB mice have a low incidence of callosal agenesis (approximately 7%). they are consistent with this hypothesis. 4.a complex maze which can be solved using spatial and/or associative learning strategies. 1973)...3. even when given additional trials and cues (Schrott. BXSB mice have excellent performance on this task.e. & Galaburda. while NZB have great difficulty learning it. Sherman.42 A. 
(4) altered stress responses and/or hormonal interactions. Fink. Although the negative neuroanatomical and neurochemical findings cannot conclusively prove that immune dysregulation mediates active avoidance deficits in autoimmune mice. In addition. Possible immune mechanisms include (1) immune complex deposition on brain membranes and subsequent alterations in the permeability of the blood-brain barrier. Subjects were shaped to perform a go-no-go target identification. unpublished data) and may account for the poor performance of NZBs in the Lashley maze. NZB mice are known to have abnormalities of the hippocampus. (3) cytokine effects. 4. and (5) developmental aspects.

[Figure 1. Comparison of auditory temporal processing impairments in language ... (caption truncated in the source). The graph on the left, titled "Human 2-Tone Sequence Task", illustrates the percentage correct on a two-tone discrimination ... for normal and language-impaired (LI) individuals (see Tallal & Piercy, 1973) at different total stimulus times (msec). A further measure is calculated by false alarm minus hit latency (in milliseconds) ...]

or whether all the results could be explained by late slowing. T. Pascal. Right. 562.. 15. R.. N.H.. Behavior. K. 579-588. A. Quantitative and cytoarchitectonic studies of the entorhinal region and the hippocampus of New Zealand Black mice.and left-lesioned subjects were significantly depressed in comparison to shams at the shortest duration (250 ms.M. Postnatal development of forebrain regions in the autoimmune NZB-mouse: A model for degeneration in neuronal systems. . V. 98-104. (1991). V.H. 571. Rosen. and were significantly depressed in comparison to shams. G. Talgo.S. 48. A. Sherman. & Guevara. Spatial learning. Archives of Neurology.. (1988). undergoing further experimental investigation. 358. 47.F. (1990). 36.O. L. A computer-aided procedure for measuring Lashley III maze performance. ectopias and immunity in BD/DB reciprocal crosses. G. F. behavior and autoimmunity... Barker.E. 114-122. M. 857-861.. (1978). Denenberg..A.. 73.S. Cortex.M. 347-357. G. thus suggesting that any nearby lesion may propagate along connectionally related areas to result in changes in those areas incompatible with normal temporal processing capacity. 323-329. Retz.A. Behan. Ferrera.. Rabin. P. 496-510... Physiology of Behavior. aphasia. (1991). W. Chayo-Dichy. Levin. & Lubs. Learning and memory deficits associated with autoimmunity: Significance in aging and Alzheimer's disease. Sherman. G. 563. R.W. Fischer..O.M.. G. L. bilaterally lesioned rats showed specific impairment at stimulus durations of 400 ms or less.. T. 403-424.H.. Brain Research. S. Mixed parvocellular and magnocellular geniculate signals in visual area V4... 54. Liberman.. Sherman. paw preference and neocortical ectopias in two autoimmune strains of mice. However. Carroll. Jallad. Sheldon.R.. J. H. D.D. Reading reversals and developmental dyslexia: A further study.. T.. G.. 183. Waters.. L. Behan.D. F.F. discrimination learning. & Galaburda. 1031-1034. (1991).. Denenberg. 1).. B. Duara. Kushch. (1991). K.H. M. B. 
I..M.. Loewenstein.P. 14.. Brain Research. & Lai. Effects of the autoimmune uterine/ maternal environment upon cortical ectopias. V. Galaburda discriminate at longer stimulus durations. L. These questions are. Denenberg. Rosen. R. References Anstatt. the neonatal lesion did not substantially involve the auditory pathways. Gross-Glenn.. 410-416.. Physiology of Behavior. Brain.A.. Journal of Neural Transmission.. & Deni.D. 253-273. V. Fink. & Kenner. Harmony. Mobraaten. and deja vu. & Galaburda. M. Nature. Denenberg. G. Ostrosky-Solis. Nealey. (1991). D. 756-761. G. & Maunsell. N. Talgo.J. Zilles. International Journal of Neuroscience. however. A computer-aided procedure for measuring discrimination learning. R...C. & Galaburda. D.. Anatomy and Embryology.F. K.M.44 A. The late event related potentials CNV and PINV in normal and dyslexic subjects. A. L. P. S.Y. Schrott.. & Shankweiler. Schrott. V. (1992). L. H. Morrison. Morrison. J.. Neuroanatomy differences between dyslexic and normal readers on magnetic resonance imaging scans.. Forster.. Schrott.. Fig. Interestingly.W. Temporal perception. 50. Freter. Rosen. A.. & Schleicher. Efron. A.H.A. (1963). Schrott. Waters.A.. 249-257. N. S... Drug Development Research.. The experiments could not address the question of whether the cortical lesion propagates upstream and results in temporal processing anomalies early in the process..M. (1990). (1992). Denenberg. L. Meneses.W.. Brain Research. V. (1988). N.

521-552. 727-738.. Experimental evidence for a transient system .M. C... G. Semrud-Clikeman. V. A hypothesis and a program for research. K.. Flowers. 42. Planum temporale asymmetry: Reappraisal since Geschwind and Levitsky. Journal of Neuropathology and Experimental Neurology. Sherman. Developmental dyslexia in women: Neuropathological findings in three cases. & Casagrande. Landwehrmeyer. A.. & Galaburda..F. & Galaburda. G. G. W. Lovegrove. A. R. M.S. (1990)...Developmental dyslexia and animal studies 45 Galaburda. Galaburda. (1990). G. (1990)... Segregation of form. A. & Wallesch. 240. physiology and perception. D. F. 50.M..W. Andersen.S. Patterns of task-related slow brain potentials in dyslexia..W. 145-160. 7943-7947. Developmental dyslexia: Four consecutive cases with cortical anomalies... (1991). Evidence for a magnocellular defect in developmental dyslexia. J.A. 791-797... Archives of Neurology. Anomalous cerebral structure in dyslexia revealed with magnetic resonance imaging. MRI evaluation of the size and symmetry of the planum temporale in adolescents with developmental dyslexia.. Aboitiz. 79.. M. M. A. Buchsbaum. 329. A. G. Hynd... J. P. Asymmetries of the skull and handedness: Phrenology revisited. and pathology: II. & Staab. Journal of Neurological Science. & Odegaard. Cerebral lateralization. (1988).E. Histological asymmetry in the primary visual cortex of the rat: Implications for mechanisms of cerebral asymmetry. F.O. Science. Archives of Neurology.J. J. . (1985b). & Sherman. Left-handedness: Association with immune disease.M. Brain morphology in developmental dyslexia and attention deficit disorder/hyperactivity. H. D. and pathology: I. Lachica. M. H. Sherman. N. 18.W. (1993).. Archives of Neurology. J. 740-749... Press. A. Archives of Neurology. R.. & Galaburda.D.. 222-233. Drislane. Archives of Neurology. A. (1982). Larsen. associations.M.G.. V. 39. Hoien.M. P. 88.F. G. Novey.A... 428-521. Geschwind. & Geschwind. L. & Livingstone. 
Archives of Neurology..F. Tallal. N. Geschwind. Archives of Neurology. Rosenberger. T. M. Geschwind. LeMay.. G. and depth: Anatomy. P. associations. Hagman. (1985a). M.. 90-92. The planum temporale (Editorial). & Perlo. M. G. (1978). 289-301.. C M . S. Livingstone. P. & Nicholson..M.S.. (1985c).M. Galaburda.S.D. 682. F. P. Proceedings of the National Academy of Sciences. E. Rosen. J. migraine.M.D. Lorys. Kaufmann.T. Livingstone.. Rosen. A.. 151-160. 50. D.D. Intrinsic connections of layer-Ill of striate cortex in squirrel monkey and bush baby: Correlations with patterns of cytochrome oxidase.D. & Eliopulos. B. (1993).. A.. Biological mechanisms. Aboitiz. Hynd. Lundberg. Galaburda. Cerebral lateralization. Rosen.. (1977).. (1987). Leonard. Humphreys. (1993). Biological mechanisms. Rosen. color. Cortex. 70-82. (1990). (1993). Humphreys. 47. Alexander.. G. Mao. Annals of Neurology. W.. A. (1991). (1986). & Sherman. M. 35. O. (1990). Archives of Neurology. Hier. L. L. & Katz.B. 42... Archives of Neurology.K. 22.V. Wood. 919-926. Proceedings of the National Academy of Sciences USA. A. A hypothesis and a program for research. Brain and Language.K. 50. 42. J. Garofalakis.. Lombardino.M. & Behan. (1985). associations..M. & Hubel.M. Beck. Cerebral lateralization. 49. Annals of Neurology... USA. Neuropsychologia. 457. 32. Rosen. Honeyman. G. 461-469. 25. Cerebral brain metabolism in adult dyslexic subjects assessed with positron emission tomography during performance of an auditory task. 853-868. Freezing lesions of the newborn rat brain: A model for cerebrocortical microgyria.. Voeller.. Corsiglia.. E.B. N. Annals of the New York Academy of Sciences. F. Galaburda. Agee. W. 634-654. (1992).M. LeMay. E.. 28.D. and developmental disorder. A hypothesis and a program for research. Journal of Comparative Neurology. & Galaburda. movement.F. A. A. & Galaburda.. M.. Biological mechanisms. 163-187. 

4 Foraging for brain stimulation: toward a neurobiology of computation
C.R. Gallistel*
Department of Psychology, University of California at Los Angeles, 405 Hilgard Ave., Los Angeles, CA 90024-1563, USA
*Tel. (310) 206 7932, fax (310) 206 5895, e-mail randy@cognet.ucla.edu

Abstract
The self-stimulating rat performs foraging tasks mediated by simple computations that use interreward intervals and subjective reward magnitudes to determine stay durations. This is a simplified preparation in which to study the neurobiology of the elementary computational operations that make cognition possible, because the neural signal specifying the value of a computationally relevant variable is produced by direct electrical stimulation of a neural pathway. Newly developed measurement methods yield functions relating the subjective reward magnitude to the parameters of the neural signal. These measurements also show that the decision process that governs foraging behavior divides the subjective reward magnitude by the most recent interreward interval to determine the preferability of an option (a foraging patch). The decision process sets the parameters that determine stay durations (durations of visits to foraging patches) so that the ratios of the stay durations match the ratios of the preferabilities.

Introduction
Cognitive psychology arose when psychological theorists became convinced that behavior was mediated by computational processes that could not readily be described in the language of reflex physiology. Thus, the rise of cognitive psychology widened the conceptual gap between neuroscientific theory and psychological theory. We need a conceptual scheme and research strategies to bridge the gap.

One strategy is to assume that we already understand those aspects of neural functioning that enable neural tissue to carry out complex computations. In that case, we may hope to work from the bottom up, trying to build satisfactory models of psychological processes using what we know about how the nervous system works as the starting point of the modeling effort. However, there are reasons to doubt that we know the computationally relevant properties of neural tissue. We clearly do not know one such property. We do not know how neural tissue stores information, yet storing and retrieving the values of variables are fundamental to computation. Computational models of psychological processes require the storing and retrieving of the values of the variables used in computation. How this is to be accomplished by neurophysiological processes is, to say the least, obscure. It is also far from clear that we know the neural processes that implement other elements of computation, such as adding and multiplying the values of the retrieved variables.

If we do not already know the computationally relevant properties of neural tissue, then we must work from the top down. In order to have a neurobiology of computation, we will have to discover how the elements of computation are realized in the central nervous system. We must develop simplified preparations in which elementary computational operations demonstrably occur, derive from the behavioral study of those preparations properties of the neural processes that mediate the computations, then use our knowledge of those properties to discover the underlying cellular and molecular mechanisms.

In pursuing this strategy, we follow the approach that proved successful in other areas where science has established the physical basis for phenomena that initially seemed refractory to physico-chemical explanation. Genes and their properties were inferred from the study of inheritance. Genetic theory required molecules that could generate copies of themselves. How this was to be accomplished chemically was, to say the least, obscure until the revelation of the sequence of complementary base-pairs cross-linking the two strands of DNA. Genes were then shown by the methods of classical genetics to be present in simple organisms (bacteria and yeast) that lent themselves to biochemical manipulation. This led in time to the elucidation of the molecular structure of the gene. The mechanism of inheritance was not, and in retrospect could not have been, deduced by building models based on what biochemists understood about molecular structure before 1954, because crucial aspects of the structure of complex biological molecules were undreamed of in biochemistry up to that point. The chemical identification of the gene revolutionized biochemistry. Thus, the discovery of the neural realization of the elements of computation may someday revolutionize neuroscience.

Developing a suitable preparation
The pursuit of a top-down strategy requires the development of suitable simple preparations. The preparations must be as simple as possible from two different perspectives: a psychological perspective and a neurobiological perspective.

From a psychological perspective, they must exploit those behavioral phenomena for which we have computational models that are clear, relatively simple, well developed, and empirically well supported. Most work on bacterial genes focuses on a relatively few genes whose effects are simply expressed and readily measured. Work on the neurobiology of computation needs likewise to focus on cognitive phenomena that are relatively well understood at the computational level.

A good case can be made that behavior based on comparison of remembered and currently elapsing temporal intervals is a suitably simple aspect of cognition. Models of timing behavior are among the simplest, most clearly worked out, and most quantitatively successful computational models in cognitive psychology. Extensive experimental and theoretical work by Gibbon and Church and their collaborators has established a detailed model of the process by which temporal intervals are measured and recorded in memory and, more importantly, detailed models of the decision processes. The decision processes that mediate responding in timing tasks use simple computations to translate remembered and currently elapsing intervals into behavior (see Church, 1984; Gibbon, Church, & Meck, 1984; Gibbon, Church, Fairhurst, & Kacelnik, 1988, for reviews). Moreover, it is becoming clear that they elucidate and integrate a wide range of phenomena in classical and instrumental conditioning (Gallistel, 1990, in press; Gibbon, 1992). Thus, timing tasks recommend themselves from a psychological perspective.

From a neurobiological perspective, there should be behavioral methods by which we may derive quantitative properties of the neurophysiological processes that mediate the storage of the variable and/or the ensuing computations. We must determine quantitative properties of these computational processes by behavioral methods rather than by neurophysiological methods precisely because we do not yet know what neurophysiologically identified processes we should measure. Our goal is to discover which neurophysiological processes we should be examining. Until we have reached that goal, our knowledge of the quantitative properties of the relevant neurophysiological processes must come from measures based on their behavioral consequences. It is only by their behavioral consequences that we know them, just as genes were for many decades known only through their consequences for inheritance. And there should be some apparent strategy by which one might try to identify the neural circuitry that carries out the computational aspects of the task. Finally, there should be some idea of how the computationally relevant signals might someday be generated in the neural circuit after it was subjected to the radical isolation required for serious work on cellular and molecular mechanisms.

It is becoming commonplace to cut slices or slabs of living neural tissue containing interesting neural circuits, remove them from the brain, and keep them alive for hours or days in Petri dishes, where they may be subjected to experiments that could not be carried out in the intact brain. Suppose one had removed from the brain a slab containing a neural circuit that mediated the storage and retrieval of temporal intervals, reward magnitudes, and the computations involved in the decision processes that translate these stored variables into choice behavior. How could you generate in the isolated neural circuit the signals that served as the inputs to that circuit in the intact animal? You cannot deliver food rewards to isolated neural circuits.

Rationale for using self-stimulation
For some years, much work on electrical self-stimulation of the brain in the rat has been motivated by the belief that this phenomenon may provide us with a preparation in which to study the neurobiology of the elementary computational operations that make complex cognitive processes possible. Here, I explain this rationale before summarizing our recent experimental findings from the study of matching behavior in self-stimulating rats. Our findings emphasize the computational nature of the decision processes that mediate the choices a foraging animal makes, and they yield the function relating the magnitude of a stored variable to the parameters of the neural signal that specifies its value.

Matching behavior is the very general tendency of animals, from pigeons to humans, to allocate the times they invest in competing choices in a manner that matches the relative returns obtained from those choices (Davison & McCarthy, 1988; Herrnstein, 1961, 1991). The return from the time invested in a given choice is the magnitude of the rewards received from that investment divided by the interreward interval. The relative return is the ratio of the return from a given choice to the sum of the returns from all the available choices. The to-be-reviewed experimental findings suggest that the decision process underlying matching behavior computes these subjective quantities (the return and the relative return), then adjusts the parameters of a stochastic "patch-leaving" process in such a way that ratios of the expected stay durations match ratios of the most recently obtained returns. Thus, in matching behavior, as in other tasks that depend on the assessment and comparison of temporal intervals, the behavior provides an unusually direct reflection of an elementary computation performed by the central nervous system: the ratios of the times allocated to the alternatives match the computed ratios of the recently experienced returns from each alternative.
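The definitions of return and relative return given above can be made concrete with a short sketch. It is only an illustration of the arithmetic, not the author's model of the patch-leaving process: the names `Patch`, `patch_return` and `relative_returns` are hypothetical, and the numerical values are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Patch:
    """One available choice (hypothetical container for this illustration)."""
    magnitude: float               # subjective reward magnitude, arbitrary units
    interreward_interval_s: float  # interreward interval, in seconds


def patch_return(patch: Patch) -> float:
    """Return = reward magnitude divided by the interreward interval."""
    return patch.magnitude / patch.interreward_interval_s


def relative_returns(patches: list[Patch]) -> list[float]:
    """Relative return = a choice's return divided by the sum of all returns."""
    returns = [patch_return(p) for p in patches]
    total = sum(returns)
    return [r / total for r in returns]


# Invented example: equal-sized rewards, one patch paying four times as often.
patches = [Patch(magnitude=10.0, interreward_interval_s=4.0),
           Patch(magnitude=10.0, interreward_interval_s=16.0)]
shares = relative_returns(patches)

# Under matching, the shares of time allocated equal the relative returns,
# so the ratio of stay durations equals the ratio of the returns.
print(shares, shares[0] / shares[1])
```

In this invented case the returns are 2.5 and 0.625, so matching predicts that the animal spends four times as long in the first patch as in the second.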

In matching behavior as it is commonly studied in the laboratory, the remembered variables that enter into the computations of the decision process (the subjective durations of the interreward intervals and the subjective magnitudes of the rewards) are specified by external events: the onsets of tones and lights and the delivery of food rewards. The computationally relevant neural signals produced by these events are at present almost impossible to determine. However, in the self-stimulating rat, the quantities that enter into the computation (the subjective durations of the interreward intervals and the subjective magnitudes of the rewards) are specified by direct electrical stimulation of a pathway in the central nervous system itself.

The stimulus that produces the rewarding effect in self-stimulation is a brief train (0.1-1.0 s long) of 0.1 ms cathodal pulses delivered to the medial forebrain bundle, a neuroanatomically complex collection of diverse projections interconnecting the upper brain stem and the forebrain. The parameters of the rewarding neural signal (the number of axons fired by the pulses and the number of times per second they fire) are determined by the current and pulse frequency, respectively. The rat's desire for brain stimulation reward is intense and insatiable. It will work for hours on end for these rewards, even when it gets 40-60 large rewards per minute, a rate of return that would rapidly satiate its desire for any known natural reward. The combination of direct control over the rewarding neural signal and a rewarding effect that sustains almost any amount of responding makes it possible to bring psychophysical methods to bear in analyzing the neural pathway that carries the signal (Shizgal & Murray, 1989; Yeomans, 1988). Psychophysical procedures may also be used to determine the function relating the psychological magnitude of the remembered reward to the strength and duration of the neural signal that produces it (Leon & Gallistel, 1992; Mark & Gallistel, 1993; Simmons & Gallistel, in press). Thus, in the self-stimulating rat, we have a directly controllable neural signal related in a known way to a psychological magnitude that plays a central role in computationally interesting decision processes.

Generating the reward signal by direct electrical stimulation of neural tissue provides dramatic simplification from both the psychological and the neurobiological perspectives. We generate the computationally relevant signal directly. Thus, we bypass the complexities of the sensory perceptual circuits that translate natural events into the signals specifying the values of the psychological variables in computationally interesting decision processes. From the neurobiological standpoint, the problem of identifying the relevant neural circuitry is greatly simplified by the fact that the search starts at a localized site in the central nervous system. The axons that carry the rewarding signal must pass within about a 0.5 mm radius of the electrode tip. This gives us the starting point for a line of experiments that combines neuroanatomical, psychophysical and electrophysiological methods in an attempt to identify the axons that carry the rewarding signal, describe the morphology and physiological characteristics of the neurons from which these axons arise, and investigate the electrophysiological and neurochemical processes in the postsynaptic cells of the circuits in which these neurons are embedded (Shizgal & Murray, 1989; Stellar & Rice, 1989; Wise & Rompre, 1989; Yeomans, 1988).

Finally, if we succeed in identifying the circuit and isolating it for cellular and molecular study, we can use electrical stimulation to activate it in isolation, with the same stimulus used in the intact animal. Thus, the self-stimulation phenomenon has the potential to provide the kind of simplification at both the psychological and neurobiological levels of analysis that may make it possible to work from the psychological level of analysis down to the neurobiological mechanisms that mediate elementary computation.

Measuring the subjective magnitude of the reward
One of the early findings from psychophysical experiments on brain stimulation reward was the surprisingly simple form of the trade-off between the stimulating current and the pulse frequency required to produce a just acceptable reward from a train of fixed duration (Gallistel, Shizgal, & Yeomans, 1981). For many electrode placements, over most of the usable range of stimulating currents and pulse frequencies, the required pulse frequency varies as the reciprocal of the current. Thus, doubling the stimulating current halves the required pulse frequency. It is reasonably certain that the number of axonal firings produced by a train of very brief cathodal pulses is directly proportional to the pulse frequency. Thus, the finding that the required pulse frequency varies as the reciprocal of the current suggests two conclusions (Gallistel et al., 1981). (1) The number of reward-relevant axons fired by a stimulating pulse is directly proportional to the stimulating current, that is, doubling the current doubles the number of relevant axons fired by each pulse in the train. (2) The subjective magnitude of the reward produced by the resulting neural signal is determined simply by the strength of that signal, where strength is defined as the number of action potentials per unit time. Firing 1000 axons 10 times to produce 10,000 action potentials has the same rewarding effect as firing 500 axons 20 times. Thus, either doubling current or doubling pulse frequency doubles the strength of the reward signal.

How does the subjective magnitude of the reward grow as we increase the strength of the neural signal? To determine this, we had to find a method for measuring the subjective magnitude of the reward. Gallistel and Leon (1991) reasoned that matching behavior could be used for this purpose, albeit in a somewhat indirect way. From a normative or "rational-decision-maker" standpoint, the subjective magnitude of the rewards the rat receives when it holds down a lever ought to combine multiplicatively with the subjective rate of rewards generated by holding down that lever to determine the subjective return from that lever (return = magnitude x rate).
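The reciprocal trade-off between current and pulse frequency described in this section can be sketched in a few lines. This is an illustration only: the function names, the constant `k` (axons fired per unit of current) and the numerical values are assumptions made for the example, not measured quantities.

```python
def signal_strength(axons_fired: float, firings_per_second: float) -> float:
    """Strength of the reward signal: action potentials per unit time."""
    return axons_fired * firings_per_second


def required_pulse_frequency(threshold_strength: float,
                             current_uA: float,
                             k: float = 1.0) -> float:
    """Pulse frequency needed to reach a fixed signal strength.

    Assumes conclusion (1): axons fired per pulse = k * current, with k an
    unknown constant (set to 1.0 here purely for illustration).
    """
    axons_fired = k * current_uA
    return threshold_strength / axons_fired


# The worked example from the text: both trains yield 10,000 action potentials.
assert signal_strength(1000, 10) == signal_strength(500, 20) == 10_000

# Doubling the current halves the required pulse frequency (invented values).
f_low = required_pulse_frequency(threshold_strength=40_000, current_uA=200)
f_high = required_pulse_frequency(threshold_strength=40_000, current_uA=400)
print(f_low, f_high)
```

Because the required frequency is the threshold strength divided by a quantity proportional to current, any fixed threshold traces out a line of slope -1 in double logarithmic coordinates.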

A rational decision-making process that obeyed the matching law (allocating equal amounts of time to choices that yield equal returns) would divide its time equally between a lever yielding rewards of magnitude 10 every 8 s and a lever yielding rewards of magnitude 20 every 16 s (rewards twice as big, delivered half as often).

To determine the function relating the subjective magnitude of the rewarding effect to the strength of the neural signal that produced it, Gallistel and Leon (1991) had rats work for brain stimulation reward on concurrent variable interval schedules of reward. A variable interval schedule makes the next reward on a lever available after an interval determined by an approximation to a Poisson process. Once every second, the scheduling algorithm in effect flips a biased coin. If the coin comes up heads, it makes a reward available on the lever. If the rat has the lever down, it gets the reward then and there; otherwise it gets it when it next presses the lever. The bias on the coin determines the average interval between the delivery of a reward and the availability of the next reward. The lower the probability of heads, the longer the average interreward interval on the lever. The expected interval to the next reward is 1/p s, where p is the probability of heads in a flip made every 1 s.

In the Gallistel and Leon (1991) experiment, there were two levers, each in its own alcove. When the rat held down one lever, it received rewards at a rate determined by one variable interval schedule, say, one in which the expected interval between rewards was 16 s (a VI 16 s schedule). When the rat held down the lever in the other alcove, it received rewards determined by a different schedule, say, one in which the expected interval was 4 s (VI 4 s). In this situation, rats move back and forth between the levers with sufficient frequency that the average rate of reward from each lever is approximately equal to the reciprocal of the expected interval in the schedule for that lever. Thus, the rat gets about 15 rewards per minute from a lever with a VI 4 s schedule and about 3.75 rewards per minute from a lever with a VI 16 s schedule. When the rewards given on the two sides are the same size, the rat spends most of its time on the lever that delivers these rewards four times as often.

To make the subjective return from a lever with a VI 16 s schedule equal the subjective return from a lever with the VI 4 s schedule (assuming a rational decision maker), we would have to make the subjective magnitude of the reward on the first lever four times bigger than the subjective magnitude of the reward on the second lever. As we increase the strength of the rewarding signal on the other side, we increase the subjective magnitude of the reward the rat experiences from each stimulating train received from that lever. When the factor by which the subjective reward magnitudes on the two sides differ is the inverse of the factor by which the subjective rates of reward differ, the subjective return on the two sides will be equal. By the matching law, the rat should spend equal amounts of time on the two levers. Gallistel and Leon (1991) varied the factor by which the rates of reward differed and determined the adjustments in pulse frequency or current required to offset this difference, that is, to make the rat divide its time 50:50 between the two levers (equipreference).
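The coin-flip description of a variable interval schedule lends itself to a small simulation. The sketch below is not the scheduling program used in the experiments; it assumes, for simplicity, that the lever is held down continuously, so that every reward is collected in the second in which it becomes available. The function name, the seed and the run length are arbitrary choices for the illustration.

```python
import random


def simulate_vi_rate(expected_interval_s: float,
                     duration_s: int,
                     rng: random.Random) -> float:
    """Simulate a VI schedule and return rewards per minute.

    Once per simulated second a biased coin is flipped; heads (probability
    p = 1 / expected_interval_s) makes a reward available. Simplifying
    assumption: the lever is always down, so the reward is taken at once.
    """
    p_heads = 1.0 / expected_interval_s
    rewards = 0
    for _ in range(duration_s):
        if rng.random() < p_heads:
            rewards += 1
    return rewards / duration_s * 60.0


rng = random.Random(0)  # fixed seed so that the run is repeatable
rate_vi4 = simulate_vi_rate(4, 200_000, rng)
rate_vi16 = simulate_vi_rate(16, 200_000, rng)
print(rate_vi4, rate_vi16)  # close to 15 and 3.75 rewards per minute

# Factor by which the VI 16 s reward magnitude would have to exceed the
# VI 4 s reward magnitude for the two returns (magnitude x rate) to be equal.
required_magnitude_ratio = (1 / 4) / (1 / 16)
print(required_magnitude_ratio)
```

With a long enough run the simulated rates approach the reciprocals of the expected intervals, and the magnitude ratio needed for equipreference is simply the inverse of the rate ratio, here a factor of four.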

Fig. 1 uses double logarithmic coordinates to display the experimentally determined trade-off between the difference in rate of reward (on the y axis) and either pulse frequency or current at equipreference (on the x axis). Increasing either pulse frequency or current by a given factor (i.e., by a given interval on a log scale) has the same effect on the subjective magnitude of the reward (Fig. 1).

Gallistel and Leon's (1991) results strengthen the conclusions drawn from the trade-off between current and pulse frequency at the threshold for reward acceptability. The conclusions are, first, that the number of action potentials per second in the rewarding signal is directly proportional to both pulse frequency and current: $n_a = kIf$ (where $n_a$ is the number of action potentials per second, $I$ is current, $f$ is pulse frequency and $k$ is a constant). Second, the subjective magnitude of the reward ($M$) is determined by the number of action potentials per second in a signal of fixed duration, $M = f(n_a)$. Thus, the x axis in Fig. 1 can be read as the number of action potentials in the neural signal that produces the rewarding effect (times an unknown scaling factor $k$).

Given two measurement assumptions, the y axis in Fig. 1 can be equated with the subjective magnitude of the reward (on a log scale). (1) The subjective rate of reward ($r$) is proportional to the objective rate of reward. (2) Subjective rate of reward combines multiplicatively with subjective reward magnitude to determine the subjective return from a lever, or the $V$ (for value) of the lever: $V = Mr$. The $V$ values for the available options are the quantities the decision process uses in computing the ratios of the returns from its most recent investments in those options. The ratio of these values determines the ratio of times allocated to the options.

Gallistel and Leon (1991) did not test their measurement assumptions, but Mark and Gallistel (1993) did test them, in the course of an experiment that used matching behavior to determine the function relating the subjective magnitude of the reward to the duration of the neural signal that produced it. Like Gallistel and Leon (1991), Mark and Gallistel (1993) varied the ratio of the interreward intervals in the two concurrent schedules of reward. At each ratio of rates of reward, they varied the duration of the stimulating train delivered by one lever to find the factor by which the train durations had to differ in order to offset the difference in the rates of reward (the equipreference method). However, they also analyzed the data from the many trials on which the rats did not allocate equal amounts of time to the two levers (because the two levers did not yield equal returns).

By the matching law, the ratio of the amounts of time allocated to the two levers on a trial is proportional to the ratio of the returns from those levers. When the schedules for the two levers deliver rewards at equal rates, a 4:1 ratio of time allocation in favor of one lever implies that the subjective reward from that lever is four times as big as the subjective reward from the other lever. (The fixed magnitude of the reward on the competing lever establishes the unit of measurement.) Mark and Gallistel term this the direct method, because subjective reward magnitude is read directly from the time allocation ratio in a matching task.

[Figure 1 not reproduced. Top abscissa: Current (µA) at Equipreference. Bottom abscissa: Pulse Frequency (pps) at Equipreference.]
Fig. 1. The trade-off between the relative rates of reward on concurrent variable interval schedules of reward (left ordinate) and the pulse frequency or current to which one reward had to be adjusted to induce the rat to spend equal amounts of time on the two levers (bottom and top abscissas). The scales are logarithmic, so equal intervals on the current (top) and pulse frequency (bottom) axes correspond to changes by equal multiplicative factors. The pulse frequency and current of the other reward were fixed at 126 pps and 400 µA. The train duration for both rewards was 0.5 s. The logarithmic scale of subjective reward magnitude (right ordinate) was generated by setting the magnitude of the smallest measured reward equal to 1. Under the measurement assumptions specified in the text, this graph may also be read as a graph of the subjective magnitude of the reward as a function of the number of action potentials per second in the 0.5 s rewarding signal. Data replotted from Gallistel and Leon (1991).

In the direct method, the relative rate of reward need not be varied. A complete determination of the relation between a stimulation parameter, such as train duration, and the subjective magnitude of the reward produced can be made at a single setting of the schedules of reward on the two levers. Thus, measurements made by the direct method do not depend on the assumptions about how the rat's subjective estimate of the rate of reward depends on the objective rate, because, in the direct method, the relative rate of reward is held constant.

On the other hand, they do depend on the assumption that the ratio of the times the rat allocates to the two levers matches the ratio of the subjective returns. The equipreference method does not depend on this latter assumption, only on the weaker assumption that when the returns are equal the rat has no preference between the levers. In comparing the measurements made by the equipreference and direct methods, we test the validity of these different measurement assumptions. If either assumption is wrong, the measurements will not agree. However, as shown in Fig. 2, they do agree.

[Figure 2 not reproduced. Abscissa: Train Duration (s). Ordinate: Subjective Reward Magnitude. Key: equipreference, direct.]
Fig. 2. The measurements of the subjective reward magnitude at different train durations made by the equipreference method are compared to the measurements made by the direct method (with the rates of reward the same on both levers). The reward magnitude versus train duration function obtained by the direct method is approximately superimposable on the function obtained by the equipreference method. The approximate agreement between the two sets of measurements validates the assumption that the subjective rate of reward is proportional to the objective rate. The curve was computed by a smoothing routine from the complete data set shown in Fig. 3. Data from Mark and Gallistel (1993).

The comparison in Fig. 2 tests the validity of the assumption that the subjective rate of reward is proportional to the objective rate of reward, one of the two assumptions on which the validity of the equipreference measurements rested. This is a key assumption in the equipreference method, but this assumption is unnecessary in the direct method. In the data shown here, the relative rate of reward was 1:1.

The second assumption was that the decision process underlying matching behavior multiplied the subjective magnitude of reward by the subjective rate of reward to determine the subjective return from a lever. Mark and Gallistel (1993) tested the validity of this assumption by comparing the measurements of subjective reward magnitude made at different relative rates of reward. If rate of reward combines multiplicatively with subjective magnitude to determine return, then the relative rate of reward should act as a simple scaling factor when subjective reward magnitude is measured by the direct method.

Suppose that when the rates of reward on the two levers are equal, the rat has a 4:1 preference for the reward generated by a 1 s train over the reward generated by a 0.5 s train. The direct method takes this to mean that the subjective reward from the 1 s train is four times bigger than the subjective reward from the 0.5 s train. Suppose we repeat the measurements with the schedules of reward adjusted so that the 0.5 s reward comes twice as often as the 1 s reward. The rat's preference (time allocation ratio) for the 1 s lever should be reduced by a factor of two. It should allocate only twice as much time to the 1 s lever. Suppose we repeat the measurements with the schedules adjusted so that the 0.5 s reward comes only half as often as the 1 s reward. The rat should now allocate eight times as much time to the 1 s lever (a factor of two increase in its preference). And so on.

At a given setting of the relative rates of reward, Mark and Gallistel (1993) determined the rat's time allocation ratio as a function of the train duration on one lever, keeping the reward on the other lever constant. The time allocation ratios are the direct measures of the subjective magnitude of the variable reward. A set of these time allocation ratios, one for each duration of the variable reward, gives the function relating subjective reward magnitude to the duration of the train of stimulation. If the relative rate of reward acts as a scaling factor, then the functions we obtained at different relative rates of reward should be superimposable, provided we correct for the difference in the scale of measurement. To correct for differences in the scale of measurement, we multiplied the time allocation ratios from a given session by the inverse of the rate ratio in effect for that session. Fig. 3 shows that, after rescaling, the measurements made at different relative rates of reward superimpose. This validates the assumption that subjective rate of reward combines multiplicatively with subjective reward magnitude to determine the subjective return from a lever.

The greatest value of these measurement experiments lies in what they reveal about the decision process, the computational process that uses these psychological variables (subjective magnitude and subjective rate) to determine how the animal will behave. In the course of validating the measurements, we have established that the decision process multiplies subjective reward magnitude by subjective rate of reward to determine the subjective value of the lever. Thus, we have isolated a simple computational process, where we can control the values of the variables that enter into the computation by direct electrical stimulation of a pathway in the central nervous system. Moreover, we have determined how those values depend on the strength and the duration of the barrage of action potentials produced by the stimulation. The subjective magnitude of the reward is a steep sigmoidal function of the strength of the neural signal (Gallistel & Leon, 1991; Simmons & Gallistel, in press) and a somewhat less steep sigmoidal function of its duration (Mark & Gallistel, 1993).

Lea & Dow. for example.2 0. 1:8 means that the constant reward was delivered eight times as often as the reward whose magnitude was measured. Staddon.1 0. All of these models. 1981). 3. 1988.1 •^ • f 0. In the Gibbon et al. however. 1993). (1988) showed that matching behavior could result from rate estimates based on just two interreward intervals. Gibbon et al. Data from Mark and Gallistel (1993). Thus. sampled from the population of remembered interreward intervals on a lever. one for each lever. Recently. Comparison of direct determinations of reward magnitude versus train duration function made at different relative rates of reward. they all predict that when the relative . matching behavior was another example of behavior generated by a decision process that used remembered temporal intervals. in press) and a somewhat less steep sigmoidal function of its duration (Mark & Gallistel. However. A ratio of. Gallistel "LSC11 nitu< D) C O xJL* rfr • ••7 x jjcpT 0. Vaughan. 1981. The fact that the rescaled sets of measurements superimpose implies that subjective rate of reward combines multiplicatively with subjective reward magnitude to determine subjective return (or value). The key gives the rate of delivery of the reward whose magnitude was measured. (1988) analysis. Thus. implicitly or explicitly assume that the animal's current time allocation ratios are based on a lengthy sample of previous returns. (1988) specified the interval of past history on which the subjective estimate of rate of reward was based.05 CO x eqp 1:8 • 1:4 • 1:2 • 1:1 •2:1 A • • 0.5 1 2 Train Duration (s) Fig. however. 1984. A model of the decision process in matching It has commonly been assumed that the subjective estimates of rate of reward that underlie matching behavior were based on a reward-averaging process of some kind (Killeen. (1988) did not specify the size of the populations of remembered interreward intervals from which the samples were drawn. Simmons & Gallistel. 
Leon & Gallistel. 1992. neither reward-averaging models nor the timing model proposed by Gibbon et al.5 ^ C O 3 ^ C D •/^ *K * •* •§ °.60 C D T3 2 C.2 }J*xx T • • f 1 0. relative to the rate of delivery of the reward whose magnitude was held constant. Gibbon et al. The measurements of reward magnitude (the observed time allocation ratios) made at different relative rates of reward were multiplied by the inverse of these rate ratios.

we tabulated the time allocation ratio and the ratio of the numbers of rewards received. it might be thought that the rapid adjustment to the reversal reflected higher-order learning. The expected number on the leaner lever was only two. in our experiment. This accounts for the gaps in the solid lines in Fig. there were many windows in which this number was in fact zero. These rapid shifts in time allocation ratio in response to changes in the relative rates of reward imply that-at least under some conditions . Ratios based on numbers derived from a small sampling of two Poisson scheduling processes show large variability . They all predict that adjustments in time allocation ratios following a step change in the relative rate of reward should be sluggish. 4). 4. its occurrence was signaled by the withdrawal and reappearance of the levers. Thus. but very noisy. The surprise is that the rat's window-to-window time allocation ratios show similar variability and that the variability in the time . Because we used such a narrow sampling window. It happened at the same time in every session.Foraging for brain stimulation 61 rate of return changes. We obtained similar results in a similar experiment with rats responding for brain stimulation reward (Mark & Gallistel. He found that the pigeons reversed their time allocation ratio within the span of about one expected interreward interval on the leaner schedule. we showed that the rats in our experiment adjusted to the totally unpredictable random changes in the apparent relative rates of reward due to the noise inherent in Poisson scheduling processes (Mark & Gallistel. Dreyfus (1991). Due to the random variations inherent in Poisson schedules. Within each successive window. A gap occurs wherever the reward ratio in a window was undefined. in which case the ratio of the numbers of rewards received was undefined. 
The minimum sample on which estimates of the latest relative rates of reward could in principle be based is the most recent interreward interval on each lever. the reversal in relative rate of reward at mid-session was predictable. estimate of the current rate of reward on that lever. In both the Dreyfus (1991) experiment and our experiment. Its time allocation ratios will change only when the averages over a large number of previous rewards have changed or only when populations that include a large number of previous interreward intervals have changed. We plotted the logarithms of both ratios on the same graph (Fig.the animal's estimate of the rate of reward is based only on a small sample of the more recent rates of reward on the two levers. 4. However.see solid lines in Fig. We used sampling windows twice as long as the expected interreward on the leaner schedule. the numbers of rewards received within a window were small. 1994). He ran sessions in which the relative rates of reward reversed in the middle of each session. and. The reciprocal of the interval between the last two rewards on a lever gives an unbiased. working with pigeons responding for food reward. 1994). it should take the animal a long time to adjust its time allocation ratios to the new rates of return. obtained results strikingly at variance with this prediction. This variability is random.

Fig. 4. Plots of log reward ratios (R1/R2) and log time allocation ratios (T1/T2) in successive windows equal to two expected interreward intervals on the leaner schedule, over sessions comprised of two trials, with a 16-fold reversal in the programmed relative rate of reward between the trials. Successive windows overlap by half a window. A gap in the solid line means that the ratio was undefined within that window because no reward occurred on one or both sides. The programmed reward ratios of 4:1 and 1:4 are indicated by the horizontal lines at +0.6 and -0.6 respectively. The lighter horizontal lines indicate the actually experienced reward ratio as calculated by aggregating over the trial. The actual numbers of rewards obtained, which yield these ratios, are given beside these lighter lines. The actually experienced combined rate of reward across the two trials in a given condition (combined reward density) is given at the lower right of each panel. (The density actually experienced, which is slightly greater than the programmed density, reflects the variability inherent in Poisson schedules.) (A) Programmed reward density = 19.2 rewards/min. (B) Programmed reward density = 4.8 rewards/min. (C) Programmed reward density = 2.4 rewards/min. (D) Programmed reward density = 1.2 rewards/min. Note that the time allocation ratio tracks wide random fluctuations in the reward ratio regardless of the overall reward density. Reproduced from Mark and Gallistel (1994) by permission of the publisher. (Plots not reproduced; horizontal axis: time (min).)
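The windowing described in the Fig. 4 caption can be illustrated with a short sketch. This is not the analysis code of Mark and Gallistel (1994); it is a minimal illustration under stated assumptions: rewards are given as timestamps per lever, stays as hypothetical (start, end, lever) triples, logarithms are base 10 (so that 4:1 falls near +0.6), and every function and variable name is invented for the example.

```python
import math


def overlap(a_start, a_end, b_start, b_end):
    """Length of the overlap between two time intervals (0 if they are disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def log_ratio(x1, x2):
    """log10(x1 / x2), or None when the ratio is undefined (a zero on either side)."""
    if x1 <= 0 or x2 <= 0:
        return None
    return math.log10(x1 / x2)


def windowed_log_ratios(rewards_1, rewards_2, stays, session_end, leaner_interval):
    """Log reward ratio and log time allocation ratio in successive windows.

    Assumed conventions (hypothetical, for illustration only):
      - window width = two expected interreward intervals on the leaner schedule
      - successive windows overlap by half a window
      - stays is a list of (start, end, lever) with lever equal to 1 or 2
    """
    width = 2.0 * leaner_interval
    step = width / 2.0
    rows = []
    start = 0.0
    while start + width <= session_end:
        end = start + width
        # Numbers of rewards received on each lever within the window.
        r1 = sum(1 for t in rewards_1 if start <= t < end)
        r2 = sum(1 for t in rewards_2 if start <= t < end)
        # Time spent on each lever within the window.
        t1 = sum(overlap(s, e, start, end) for s, e, lever in stays if lever == 1)
        t2 = sum(overlap(s, e, start, end) for s, e, lever in stays if lever == 2)
        rows.append((start, log_ratio(r1, r2), log_ratio(t1, t2)))
        start += step
    return rows


# Toy session; all numbers are invented for illustration.
rewards_1 = [1.0, 2.0, 3.0, 4.0]
rewards_2 = [6.0]
stays = [(0.0, 8.0, 1), (8.0, 10.0, 2), (10.0, 20.0, 1)]
rows = windowed_log_ratios(rewards_1, rewards_2, stays,
                           session_end=20.0, leaner_interval=5.0)
for window_start, log_reward_ratio, log_time_ratio in rows:
    print(window_start, log_reward_ratio, log_time_ratio)
```

A `None` entry plays the role of a gap in the plotted lines: the ratio is undefined in any window where one of the two counts (or times) is zero.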

allocation ratios tracks the variability in the reward ratios. The time allocation ratio tracks the random noise inherent in estimates of relative rates of reward based on very narrow samples from two Poisson scheduling processes. The tracking of the noise, which is evident in Fig. 4, was confirmed by quantitative analyses, which included cross-correlation functions and scatter plots of the time allocation ratio within a window as a function of the reward ratio within that same window. Thus, under conditions where changes in the relative rate of reward occur frequently, the decision process that mediates matching behavior relies on rate estimates derived from a very narrow sample of the most recent rates of return experienced in the various foraging "patches" (levers or keys).

Moreover, the tracking of the noise is the same regardless of the overall rate of reward. Experimental sessions lasted for 60 expected rewards on the leaner schedule. When the leaner schedule was VI 16 s (upper left panel in Fig. 4), the rat got about 15 rewards per minute from the two schedules combined, and a session lasted only 15 min. When the leaner schedule was VI 256 s (lower right panel), the rat got about one reward per minute, and sessions lasted more than 4 h. If the rate-estimating process used averaging windows whose width was fixed and independent of the time scale of the reward schedules (cf. Lea & Dow, 1984), these changes in the overall rates of reward would change the number of rewards over which the rate estimation process averaged and, hence, the extent to which the time allocation ratio tracked the noise in the reward ratio. However, as can be seen in Fig. 4, the tracking of the noise remains the same over large changes in the overall rate of reward, that is, over large changes in the time scale of reward delivery. This implies that the statistical properties of the rate estimation process (e.g., the variability in the estimates) are scale invariant. Thus, the rate estimation process is time scale invariant, a general feature of behavior on timing tasks (Gibbon, 1992).

These findings led us to develop a timing model for the decision process in matching (Mark & Gallistel, 1994). The current rate estimate for a lever is assumed to be the reciprocal of the interval between the last two rewards received. This is the maximally localized estimate of the current rate. Also, estimating rate by taking the reciprocal of the interreward interval makes the statistical properties of the rate estimate time scale invariant. The current return from that lever is this rate estimate times the subjective magnitude of the most recent reward. The expected duration of the animal's stay "in a patch" (on a lever) is assumed to be determined by a Poisson patch-leaving parameter, λp, where the subscript designates the patch. The reciprocal of this parameter, 1/λp, is the expected duration of the stay. The tendency to leave a patch (λp) is inversely proportional to the current return from that patch. The greater the most recent return, the less the tendency to leave a patch. The expected durations of the stays in each of the available patches sum to a constant, which is the duration of a visit cycle consisting of one visit to each patch. The duration of a visit cycle is inversely proportional to the overall rate of reward. That is, the higher the overall rate of reward, the more rapidly the animal cycles through the patches.

This simple model of the decision process in matching predicts the rapid kinetics observed by Mark and Gallistel (1994) and Dreyfus (1991). Surprisingly, it also predicts a quite unrelated and rather counterintuitive result reported by Belke (1992). He trained pigeons to respond for food on concurrent variable interval schedules of reward, using two pairs of keys: a white-red pair and a green-yellow pair. When the white-red pair was illuminated, there was a VI 40 s schedule for the red key and a VI 20 s schedule for the white key. When the green-yellow pair was illuminated, there was a VI 40 s schedule on the green key and a VI 80 s schedule on the yellow key. In accord with the matching law, the pigeons had about a 2:1 preference for the white key over the red and a similar preference for the green key over the yellow. Belke then determined the pigeon's preference in short (unrewarded) tests that pitted the red and green keys against each other. The surprising result was that the pigeons had a 4:1 preference for the green key. When choosing between the red and green keys, they showed a twofold stronger preference for the green key than the preference they showed for that key when it was paired with the yellow key. This is surprising because the red and green keys were both associated with VI 40 s schedules, while the yellow key was associated with a schedule only half as rich. Our model of the decision process predicts the observed 4:1 preference for the green key, on the assumption that the patch-leaving tendencies for the two keys (λr and λg) had the same values during the short unrewarded tests as they did during the training portions of the experiment.

We are now testing the implications of our model for the microstructure of matching behavior. We are also testing the possibility that the use the decision process makes of local information about rate of return is determined by the animal's global experience. There is reason to think that if the rates of return remain constant for many sessions, the animals may use a different decision process to determine the times they allocate to each patch, one that does not rely on maximally localized estimates of the rates of reward (Keller & Gollub, 1977; Mazur, 1992; Myerson & Miezin, 1980). However, for present purposes, the important point is that we have developed and are testing a fully specified computational model of the decision process that operates when there is a history of frequent changes in the rate of reward.

Conclusion

None of the models of psychological function propounded within contemporary cognitive psychology, including those models that take their inspiration from assumptions about how the nervous system works, is easily realized by neurophysiological processes that are faithful to well-established principles of neurobiology. This is not surprising, because the well-established principles of neurobiology do not say anything about how nervous tissue can store and retrieve the values of variables, nor how it can operate with those variables in accord with the elementary operations of arithmetic and logic, the operations that are the foundation of computation. Insofar as the behavioral evidence demands a computational account of psychological processes, there are clearly important neurophysiological mechanisms that remain to be discovered.

What cognitive psychology has to offer neurobiology is a new conceptual scheme, a scheme rooted in computation rather than in reflexes. We should take the computational assumptions implicit in cognitive psychology seriously and use them to ask questions about the nervous system. The obvious question is, what are the elements of neurobiological computation, the "reduced instruction set" out of which all computations are compounded? How are these elementary operations implemented by cellular mechanisms and by basic neural circuits?

Some progress along these lines has been made in early stages of sensory processing (e.g., Landy & Movshon, 1991). However, early sensory processing, or at least those aspects currently being modeled at the neural level, does not involve an operation crucial to many later, higher-level computations, namely, the storage and retrieval of the values of variables. We need to develop preparations that do involve this crucial element of post-sensory computation. Models of behavior based on remembered temporal intervals and other simple psychological quantities such as reward magnitude seem suited to the purpose. These are simple computational models, yet they involve the basic arithmetic operations that are the foundation of all computation. They have been exceptionally well specified and tested. This testing has yielded some surprising findings that may offer lines of attack on the underlying cellular and molecular mechanisms. For example, when the mechanism that measures the subjective duration of an interval writes the measured interval into memory, it miswrites the value by an animal-specific (not task-specific) scalar value (Church, 1984). As a result, the animal systematically misremembers intervals by some percentage. Depending on the animal, it may remember intervals as 10% shorter or 10% longer than they were. This scalar error in the recording of the values of intervals makes no functional or psychological sense. It is presumably a hardware error analogous to the hardware error that makes the period of an animal's internal clock deviate from 24 h by an animal-specific factor. The discovery of mutants whose circadian clock has a dramatically enhanced period error (Konopka, 1987; Ralph, Foster, Davis, & Menaker, 1990; Ralph & Menaker, 1988) has provided a line of attack on the molecular and cellular mechanisms of internal clocks. Similarly, one may hope to find mutants with a dramatically enhanced scalar write-to-memory error, which would provide an avenue of attack on the molecular basis of the write-to-memory operation.
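The decision model for matching described before the Conclusion is simple enough to write down in a few lines. The following is only a sketch under stated assumptions, not the published implementation: the proportionality constant `k` is hypothetical, the "overall rate of reward" is taken to be the sum of the per-patch rate estimates, and, in the Belke (1992) illustration, the programmed VI values stand in for the most recent interreward intervals. All function names are invented for the example.

```python
def rate_estimate(last_interval):
    """Maximally local rate estimate: reciprocal of the most recent interreward interval."""
    return 1.0 / last_interval


def expected_stays(last_intervals, magnitudes, k=1.0):
    """Expected stay duration (1 / lambda_p) in each patch.

    Assumptions made for this sketch:
      - current return = rate estimate * subjective magnitude of the latest reward
      - the stays sum to one visit cycle, and stay length is proportional to return
        (so the leaving tendency is inversely proportional to return)
      - visit cycle duration = k / (sum of rate estimates), with k hypothetical
    """
    rates = [rate_estimate(t) for t in last_intervals]
    returns = [r * m for r, m in zip(rates, magnitudes)]
    cycle = k / sum(rates)
    total_return = sum(returns)
    return [cycle * ret / total_return for ret in returns]


def leaving_rates(stays):
    """Poisson patch-leaving parameters: reciprocals of the expected stays."""
    return [1.0 / s for s in stays]


# Belke (1992): white VI 20 s vs red VI 40 s; green VI 40 s vs yellow VI 80 s.
# Equal subjective magnitudes (1.0) are assumed for all keys.
white, red = expected_stays([20.0, 40.0], [1.0, 1.0])
green, yellow = expected_stays([40.0, 80.0], [1.0, 1.0])
print("white:red   ", white / red)
print("green:yellow", green / yellow)
print("green:red   ", green / red)

# Multiplicative combination of rate and magnitude: a reward assumed to be four
# times larger, delivered equally often, twice as rarely, or twice as often
# relative to the smaller reward.
for other_interval in (10.0, 5.0, 20.0):
    big, small = expected_stays([10.0, other_interval], [4.0, 1.0])
    print(other_interval, big / small)
```

Under these assumptions the sketch gives 2:1 preferences within each training pair and a 4:1 preference for green over red when the stay durations learned in training are carried over to the test, and the three magnitude examples come out at 4:1, 2:1 and 8:1, in line with the scaling argument made earlier for the direct method.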

Research in my own laboratory uses subjects in which the reward is generated by direct electrical stimulation of the central nervous system. Experiments in many laboratories have used self-stimulation behavior to define quantitative and pharmacological characteristics of the neural pathway that carries the rewarding signal (Shizgal & Murray, 1989; Stellar & Rice, 1989; Wise & Rompre, 1989; Yeomans, 1988). More recently, in experiments reviewed here, we have been able to measure the subjective value of the reward and plot the functions relating this value to the strength and duration of the neural signal generated by the stimulation. We have also been able to use results from matching experiments with self-stimulating rats to develop and test models of the computations used by the decision process in matching behavior. This opens up another line of attack on the neural mechanisms that mediate the computations in timing tasks. It gives us direct control over the neural signal that plays two crucial roles. First, this neural signal specifies one of the psychological values that is used in the decision process that mediates matching behavior: the subjective magnitude of the reward. Second, this same neural signal specifies the beginning and end of interreward intervals. The specification of the subjective duration of those intervals requires a clock signal as well as a signal that marks off the interval. There is reason to believe, however, that the clock signal originates within the nervous system itself (Gallistel, 1990). No further stimulus is required to generate the time signals that enable the interval-measuring mechanism to measure an interval. Thus, the neural signal produced by direct electrical stimulation of the medial forebrain bundle provides the stimulus input necessary to define the two quantities that the decision process in matching uses to determine how the animal will allocate its time among patches: the subjective reward magnitude and the interreward interval.

Our model of the decision process in matching is closely related to models for the decision processes in other timing tasks. All of them assume that the decisions that determine quantitative properties of the observed behavior are based on elementary computations involving the addition, subtraction, multiplication, division and ordering of scalar quantities. These scalar quantities represent simple aspects of the animal's experience: the durations of intervals and the magnitudes of rewards. These models are cognitive models, but they are simpler and more completely specified than most such models. Thus, they give us a good starting point for a program of research aimed at establishing the neurobiological foundations of the computational capability that makes cognition possible.

References

Belke, T.W. (1992). Stimulus preference and the transitivity of preference. Animal Learning and Behavior, 20, 401-406.

Church, R.M. (1984). Properties of the internal clock. In J. Gibbon & L. Allan (Eds.), Timing and time perception (pp. 567-582). New York: New York Academy of Sciences.
Davison, M., & McCarthy, D. (1988). The matching law: A research review. Hillsdale, NJ: Erlbaum.
Dreyfus, L.R. (1991). Local shifts in relative reinforcement rate and time allocation on concurrent schedules. Journal of Experimental Psychology: Animal Behavior Processes, 17, 486-502.
Gallistel, C.R. (1990). The organization of learning. Cambridge, MA: Bradford Books/MIT Press.
Gallistel, C.R. (in press). Space and time. In N.J. Mackintosh (Ed.), Handbook of perception and cognition. Vol. 9: Animal learning and cognition. New York: Academic Press.
Gallistel, C.R., & Leon, M. (1991). Measuring the subjective magnitude of brain stimulation reward by titration with rate of reward. Behavioral Neuroscience, 105, 913-924.
Gallistel, C.R., Shizgal, P., & Yeomans, J.S. (1981). A portrait of the substrate for self-stimulation. Psychological Review, 88, 228-273.
Gibbon, J. (1992). Ubiquity of scalar timing with a Poisson clock. Journal of Mathematical Psychology, 36, 283-293.
Gibbon, J., Church, R.M., Fairhurst, S., & Kacelnik, A. (1988). Scalar expectancy theory and choice between delayed rewards. Psychological Review, 95, 102-114.
Gibbon, J., Church, R.M., & Meck, W.H. (1984). Scalar timing in memory. In J. Gibbon & L. Allan (Eds.), Timing and time perception (pp. 52-77). New York: New York Academy of Sciences.
Herrnstein, R.J. (1961). Relative and absolute strength of response as a function of frequency of reinforcement. Journal of the Experimental Analysis of Behavior, 4, 267-272.
Herrnstein, R.J. (1991). Experiments on stable sub-optimality in individual behavior. American Economic Review, 81, 360-364.
Keller, J.V., & Gollub, L.R. (1977). Duration and rate of reinforcement as determinants of concurrent responding. Journal of the Experimental Analysis of Behavior, 28, 145-153.
Killeen, P.R. (1981). Averaging theory. In C.M. Bradshaw, E. Szabadi, & C.F. Lowe (Eds.), Quantification of steady-state operant behavior. Amsterdam: Elsevier/North-Holland.
Konopka, R.J. (1987). Genetics of biological rhythms in Drosophila. Annual Review of Genetics, 21, 227-236.
Landy, M.S., & Movshon, J.A. (1991). Computational models of visual processing. Cambridge, MA: Bradford Books/MIT Press.
Lea, S.E.G., & Dow, S.M. (1984). The integration of reinforcements over time. In J. Gibbon & L. Allan (Eds.), Timing and time perception (pp. 269-277). New York: New York Academy of Sciences.
Leon, M., & Gallistel, C.R. (1992). The function relating the subjective magnitude of brain stimulation reward to stimulation strength varies with site of stimulation. Behavioural Brain Research, 52, 183-193.
Mark, T.A., & Gallistel, C.R. (1993). Subjective reward magnitude of MFB stimulation as a function of train duration and pulse frequency. Behavioral Neuroscience, 107, 389-401.
Mark, T.A., & Gallistel, C.R. (1994). The kinetics of matching. Journal of Experimental Psychology: Animal Behavior Processes, 20, 79-95.
Mazur, J.E. (1992). Choice behavior in transition: Development of preference with ratio and interval schedules. Journal of Experimental Psychology: Animal Behavior Processes, 18, 364-378.
Myerson, J., & Miezin, F.M. (1980). The kinetics of choice: An operant systems analysis. Psychological Review, 87, 160-174.
Ralph, M.R., Foster, R.G., Davis, F.C., & Menaker, M. (1990). Transplanted suprachiasmatic nucleus determines circadian period. Science, 247, 975-977.
Ralph, M.R., & Menaker, M. (1988). A mutation of the circadian system in golden hamsters. Science, 241, 1225-1227.
Shizgal, P., & Murray, B. (1989). Neuronal basis of intracranial self-stimulation. In J.M. Liebman & S.J. Cooper (Eds.), The neuropharmacological basis of reward (pp. 106-163). Oxford: Clarendon Press.
Simmons, J.M., & Gallistel, C.R. (in press). The saturation of subjective reward magnitude as a function of current and pulse frequency. Behavioral Neuroscience, 108.
Staddon, J.E.R. (1988). Quasi-dynamic choice models: Melioration and ratio invariance. Journal of the Experimental Analysis of Behavior, 49, 303-320.
Stellar, J.R., & Rice, M.B. (1989). Pharmacological basis of intracranial self-stimulation reward. In J.M. Liebman & S.J. Cooper (Eds.), The neuropharmacological basis of reward (pp. 14-65). Oxford: Clarendon Press.
Vaughan, W. (1981). Choice and the Rescorla-Wagner model. In M.L. Commons, R.J. Herrnstein, & H. Rachlin (Eds.), Quantitative analyses of behavior: Matching and maximizing accounts (pp. 263-279). Cambridge, MA: Ballinger.
Wise, R.A., & Rompre, P.-P. (1989). Brain dopamine and reward. Annual Review of Psychology, 40, 191-225.
Yeomans, J.S. (1988). Mechanisms of brain stimulation reward. In A. Epstein & A. Morrison (Eds.), Progress in psychobiology and physiological psychology (pp. 227-266). New York: Academic Press.

5 Beyond intuition and instinct blindness: toward an evolutionarily rigorous cognitive science

Leda Cosmides*,a, John Toobyb

a Laboratory of Evolutionary Psychology, Department of Psychology, University of California, Santa Barbara, CA 93106-3210, USA
b Laboratory of Evolutionary Psychology, Department of Anthropology, University of California, Santa Barbara, CA 93106-3210, USA

Abstract

Cognitive psychology has an opportunity to turn itself into a theoretically rigorous discipline in which a powerful set of theories organize observations and suggest focused new hypotheses. This cannot happen, however, as long as intuition and folk psychology continue to set our research agenda. This is because intuition systematically blinds us to the full universe of problems our minds spontaneously solve, restricting our attention instead to a minute class of unrepresentative "high-level" problems. In contrast, evolutionarily rigorous theories of adaptive function are the logical foundation on which to build cognitive theories, because the architecture of the human mind acquired its functional organization through the evolutionary process. Theories of adaptive function specify what problems our cognitive mechanisms were designed by evolution to solve, thereby supplying critical information about what their design features are likely to be. This information can free cognitive scientists from the blinders of intuition and folk psychology, allowing them to construct experiments capable of detecting complex mechanisms they otherwise would not have thought to test for. The choice is not between no-nonsense empiricism and evolutionary theory, it is between folk theory and evolutionary theory.

* Corresponding author. For many illuminating discussions on these topics, we warmly thank Pascal Boyer, David Buss, Martin Daly, Mike Gazzaniga, Gerd Gigerenzer, Steve Pinker, Roger Shepard, Dan Sperber, Don Symons, Margo Wilson, and the members of the Laboratory of Evolutionary Psychology (UCSB). We also thank Don Symons for calling our attention to Gary Larson's "Stoppit" cartoon and, especially, Steve Pinker for his insightful comments on an earlier draft. We are grateful to the McDonnell Foundation and NSF Grant BNS9157-499 to John Tooby for their financial support.

Nothing in biology makes sense except in the light of evolution.
Theodosius Dobzhansky

Is it not reasonable to anticipate that our understanding of the human mind would be aided greatly by knowing the purpose for which it was designed?
George C. Williams

The cognitive sciences have reached a pivotal point in their development. We now have the opportunity to take our place in the far larger and more exacting scientific landscape that includes the rest of the modern biological sciences. Every day, research of immediate and direct relevance to our own is being generated in evolutionary biology, behavioral ecology, developmental biology, genetics, paleontology, population biology, and neuroscience. In turn, many of these fields are finding it necessary to use concepts and research from the cognitive sciences. But to benefit from knowledge generated in these collateral fields, we will have to learn how to use biological facts and principles in theory formation and experimental design. This means shedding certain concepts and prejudices inherited from parochial parent traditions: the obsessive search for a cognitive architecture that is general purpose and initially content-free; the excessive reliance on results derived from artificial "intellectual" tasks; the idea that the field's scope is limited to the study of "higher" mental processes; and a long list of false dichotomies reflecting premodern biological thought: evolved/learned, evolved/developed, innate/learned, genetic/environmental, biological/social, biological/cultural, emotion/cognition, animal/human. Most importantly, cognitive scientists will have to abandon the functional agnosticism that is endemic to the field (Tooby & Cosmides, 1992).

The biological and cognitive sciences dovetail elegantly because in evolved systems, such as the human brain, there is a causal relationship between the adaptive problems a species encountered during its evolution and the design of its phenotypic structures. Indeed, a theoretical synthesis between the two fields seems inevitable, because evolutionary biologists investigate and inventory the set of adaptive information-processing problems the brain evolved to solve, and cognitive scientists investigate the design of the circuits or mechanisms that evolved to solve them. In fact, the cognitive subfields that already recognize and exploit this relationship between function and structure, such as visual perception, have made the most rapid empirical progress. These areas succeed because they are guided by (1) theories of adaptive function, (2) detailed analyses of the tasks each mechanism was designed by evolution to solve, and (3) the recognition that these tasks are usually solved by cognitive machinery that is highly functionally specialized. We believe the study of central processes can be revitalized by applying the same adaptationist program. But for this to happen, cognitive scientists will have to replace the intuitive, folk psychological notions that now dominate the field with evolutionarily rigorous theories of function.

It is exactly this reluctance to consider function that is the central impediment to the emergence of a biologically sophisticated cognitive science. Surprisingly, a few cognitive scientists have tried to ground their dismissal of functional reasoning in biology itself. The claim that natural selection is too constrained by other factors to organize organisms very functionally has indeed been made by a small number of biologists (e.g., Gould & Lewontin, 1979). However, this argument has been empirically falsified so regularly and comprehensively that it is now taken seriously only by research communities too far outside of evolutionary biology to be acquainted with its primary literature (Clutton-Brock & Harvey, 1979; Daly & Wilson, 1983; Dawkins, 1982, 1986; Krebs & Davies, 1987; Williams, 1966; Williams & Nesse, 1991).¹ Other cognitive scientists take a less ideological, more agnostic stance; most never think about function at all. As a result, cognitive psychology has been conducted as if Darwin never lived. Most cognitive scientists proceed without any clear notion of what "function" means for biological structures like the brain, or what the explicit analysis of function could teach them. Indeed, many cognitive scientists think that theories of adaptive function are an explanatory luxury: fanciful, unfalsifiable speculations that one indulges in at the end of a project, after the hard work of experimentation has been done.

¹ Similar results emerge from the cognitive sciences. Although artificial intelligence researchers have been working for decades on computer vision, object recognition, color constancy, speech recognition and comprehension, and many other evolved competences of humans, naturally selected computational systems still far outperform artificial systems on the adaptive problems they evolved to solve, on those rare occasions when artificial systems can solve the assigned tasks at all. In short, natural selection is known to produce cognitive machinery of an intricate functionality as yet unmatched by the deliberate application of modern engineering. This is a far more definable standard than "optimality", where many anti-adaptationist arguments go awry. There are an uncountable number of changes that could conceivably be introduced into the design of organisms and, consequently, the state space of potential organic designs is infinitely large and infinitely dimensioned. Thus, there is no way of defining an "optimal" point in it, much less "measuring" how closely evolution brings organisms to it. However, when definable engineering standards of functionality are applied, adaptations can be shown to be very functionally designed for solving adaptive problems.

But theories of adaptive function are not a luxury. They are an indispensable methodological tool, crucial to the future development of cognitive psychology. Atheoretical approaches will not suffice: a random stroll through hypothesis space will not allow you to distinguish figure from ground in a complex system. To isolate a functionally organized mechanism within a complex system, you need a theory of what function that mechanism was designed to perform.

This article is intended as an overview of the role we believe theories of adaptive function should play in cognitive psychology. We will briefly explain why they are important, where exactly they fit into a research program, how they bear

on cognitive and neural theories, and what orthodoxies they call into question. (For a more complete and detailed argument, see Tooby & Cosmides, 1992.)

I. Function determines structure

Explanation and discovery in the cognitive sciences

. . . trying to understand perception by studying only neurons is like trying to understand bird flight by studying only feathers: it just cannot be done. In order to understand bird flight, we have to understand aerodynamics; only then do the structure of feathers and the different shapes of birds' wings make sense. (Marr, 1982, p. 27)

David Marr developed a general explanatory system for the cognitive sciences that is much cited but rarely applied. His three-level system applies to any device that processes information - a calculator, a cash register, a television, a computer, a brain. It is based on the following observations:

(1) Information-processing devices are designed to solve problems.
(2) They solve problems by virtue of their structure.
(3) Hence to explain the structure of a device, you need to know (a) what problem it was designed to solve, and (b) why it was designed to solve that problem and not some other one.

In other words, you need to develop a task analysis of the problem, or what Marr called a computational theory (Marr, 1982). Knowing the physical structure of a cognitive device and the information-processing program realized by that structure is not enough. For human-made artifacts and biological systems, form follows function. The physical structure is there because it embodies a set of programs; the programs are there because they solve a particular problem. A computational theory specifies what that problem is and why there is a device to solve it. It specifies the function of an information-processing device. Marr felt that the computational theory is the most important and the most neglected level of explanation in the cognitive sciences.

This functional level of explanation has not been neglected in the biological sciences, however, because it is essential for understanding how natural selection designs organisms. An organism's phenotypic structure can be thought of as a collection of "design features" - micro-machines, such as the functional components of the eye or liver. Over evolutionary time, new design features are added or discarded from the species' design because of their consequences. A design feature will cause its own spread over generations if it has the consequence of solving adaptive problems: cross-generationally recurrent problems whose solution promotes reproduction, such as detecting predators or detoxifying

poisons.

Natural selection is a feedback process that "chooses" among alternative designs on the basis of how well they function. By selecting designs on the basis of how well they solve adaptive problems, this process engineers a tight fit between the function of a device and its structure.2 To understand this causal relationship, biologists had to develop a theoretical vocabulary that distinguishes between structure and function. Marr's computational theory is a functional level of explanation that corresponds roughly to what biologists refer to as the "ultimate" or "functional" explanation of a phenotypic structure.

2 All traits that comprise species-typical designs can be partitioned into adaptations, which are present because they were selected for, by-products, which are present because they are causally coupled to traits that were selected for, and noise, which was injected by the stochastic components of evolution. Like other machines, only narrowly defined aspects of organisms fit together into functional systems: most of the system is incidental to the functional properties. Unfortunately, some have misrepresented the well-supported claim that selection organizes organisms very functionally as the obviously false claim that all traits of organisms are functional - something no sensible evolutionary biologist would ever maintain. Nevertheless, cognitive scientists need to recognize that while not everything in the designs of organisms is the product of selection, all complex functional organization is (Dawkins, 1986; Pinker & Bloom, 1990; Tooby & Cosmides, 1990a, 1990b, 1992; Williams, 1966, 1985).

Table 1. Three levels at which any machine carrying out an information-processing task must be understood (from Marr, 1982, p. 25)
1. Computational theory: What is the goal of the computation, why is it appropriate, and what is the logic of the strategy by which it can be carried out?
2. Representation and algorithm: How can this computational theory be implemented? In particular, what is the representation for the input and output, and what is the algorithm for the transformation?
3. Hardware implementation: How can the representation and algorithm be realized physically?
In evolutionary biology: Explanations at the level of the computational theory are called ultimate level explanations. Explanations at the level of representations and algorithm, or at the level of hardware implementation, are called proximate levels of explanation.

A computational theory defines what problem the device solves and why it solves it; theories about programs and their physical substrate specify how the device solves the problem. "How" questions - questions about programs and hardware - currently dominate the research agenda in the cognitive sciences. Answering such questions is extremely difficult, and most cognitive scientists realize that groping in the dark is not a productive research strategy. Many see

the need for a reliable source of theoretical guidance. The question is, what form should it take?

Why ask why? - or - how to ask how

It is currently fashionable to think that the findings of neuroscience will eventually place strong constraints on theory formation at the cognitive level. Undoubtedly they will. But extreme partisans of this position believe neural constraints will be sufficient for developing cognitive theories. In this view, once we know enough about the properties of neurons, neurotransmitters and cellular development, figuring out what cognitive programs the human mind contains will become a trivial task.

This cannot be true. Consider the fact that there are birds that migrate by the stars, bats that echolocate, bees that compute the variance of flower patches, spiders that spin webs, humans that speak, ants that farm, lions that hunt in teams, cheetahs that hunt alone, monogamous gibbons, polyandrous seahorses, polygynous gorillas . . . There are millions of animal species on earth, each with a different set of cognitive programs. The same basic neural tissue embodies all of these programs, and it could support many others as well. Facts about the properties of neurons, neurotransmitters, and cellular development cannot tell you which of these millions of programs the human mind contains. Even if all neural activity is the expression of a uniform process at the cellular level, it is the arrangement of neurons - into birdsong templates or web-spinning programs - that matters. The idea that low-level neuroscience will generate a self-sufficient cognitive theory is a physicalist expression of the ethologically naive associationist/empiricist doctrine that all animal brains are essentially the same.

In fact, as David Marr put it, a program's structure "depends more upon the computational problems that have to be solved than upon the particular hardware in which their solutions are implemented" (1982, p. 27). In other words, knowing what and why places strong constraints on theories of how.

For this reason, a computational theory of function is not an explanatory luxury. It is an essential tool for discovery in the cognitive and neural sciences. A theory of function may not determine a program's structure uniquely, but it reduces the number of possibilities to an empirically manageable number. Task demands radically constrain the range of possible solutions; consequently, very few cognitive programs are capable of solving any given adaptive problem. By developing a careful task analysis of an information-processing problem, you can vastly simplify the empirical search for the cognitive program that solves it. And once that program has been identified, it becomes straightforward to develop clinical tests that will target its neural basis.
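The claim that task demands prune the space of candidate programs can be illustrated with a toy search. The sketch below is only an illustration under stated assumptions, not something taken from Marr or from this article: it writes the four rules for combining prices in Marr's cash-register example (zero, commutativity, associativity, inverses; see Table 2) as tests and applies them to a small, hypothetical set of candidate operations. The candidate list, the sample range, and all function names are invented for the example, and the inverse test checks only the refund case (a price followed by its negative).

```python
from itertools import product
from typing import Callable, Dict, List

Operation = Callable[[int, int], int]

# Hypothetical candidate ways of combining two prices (assumed for illustration).
CANDIDATES: Dict[str, Operation] = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
    "subtract": lambda a, b: a - b,
    "max": lambda a, b: max(a, b),
}

# A small sample of integer "prices"; negative values stand in for refunds.
SAMPLE = range(-5, 6)


def has_zero(op: Operation) -> bool:
    """Buying nothing together with an item costs the same as the item alone."""
    return all(op(a, 0) == a and op(0, a) == a for a in SAMPLE)


def is_commutative(op: Operation) -> bool:
    """The order in which goods are presented does not affect the total."""
    return all(op(a, b) == op(b, a) for a, b in product(SAMPLE, repeat=2))


def is_associative(op: Operation) -> bool:
    """Splitting the goods into two piles does not affect the total."""
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(SAMPLE, repeat=3)
    )


def has_inverses(op: Operation) -> bool:
    """Buying an item and returning it for a refund totals zero."""
    return all(op(a, -a) == 0 for a in SAMPLE)


CONSTRAINTS: Dict[str, Callable[[Operation], bool]] = {
    "zero": has_zero,
    "commutativity": is_commutative,
    "associativity": is_associative,
    "inverses": has_inverses,
}


def failed_constraints(op: Operation) -> List[str]:
    """Names of the task constraints that a candidate operation violates."""
    return [name for name, check in CONSTRAINTS.items() if not check(op)]


def surviving_operations() -> List[str]:
    """Candidates that satisfy every constraint on the sample."""
    return [name for name, op in CANDIDATES.items() if not failed_constraints(op)]


if __name__ == "__main__":
    for name, op in CANDIDATES.items():
        print(name, failed_constraints(op) or "satisfies every constraint")
```

Of these four candidates, only addition passes every test. The point of the sketch is the shape of the search: each constraint removes candidates before any question about implementation is asked.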

To figure out how the mind works, cognitive scientists will need to know what problems our cognitive and neural mechanisms were designed to solve.

Beyond intuition: how to build a computational theory

To illustrate the notion of a computational theory, Marr asks us to consider the what and why of a cash register at a check-out counter in a grocery store. We know the what of a cash register: it adds numbers. Addition is an operation that maps pairs of numbers onto single numbers, and it has certain abstract properties, such as commutativity and associativity (see Table 2). How the addition is accomplished is quite irrelevant: any set of representations and algorithms that satisfy these abstract constraints will do. The input to the cash register is prices, which are represented by numbers. To compute a final bill, the cash register adds these numbers together. That's the what.

But why was the cash register designed to add the prices of each item? Why not multiply them together, or subtract the price of each item from 100? According to Marr, "the reason is that the rules we intuitively feel to be appropriate for combining the individual prices in fact define the mathematical operation of addition" (p. 22, emphasis added). He formulates these intuitive rules as a series of constraints on how prices should be combined when people exchange money for goods, then shows that these constraints map directly onto those that define addition (see Table 2). On this view, cash registers were designed to add because addition is the mathematical operation that realizes the constraints on buying and selling that our intuitions deem appropriate. Other mathematical operations are inappropriate because they violate these intuitions; for example, if the cash register subtracted each price from 100, the more goods you chose the less you would pay - and whenever you chose more than $100 of goods, the store would pay you.

In this particular example, the buck stopped at intuition. But it shouldn't. Our intuitions are produced by the human brain, an information-processing device that was designed by the evolutionary process. To discover the structure of the brain, you need to know what problems it was designed to solve and why it was designed to solve those problems rather than some other ones. In other words, you need to ask the same questions of the brain as you would of the cash register. Cognitive science is the study of the design of minds, regardless of their origin. Cognitive psychology is the study of the design of minds that were produced by the evolutionary process. Evolution produced the what, and evolutionary biology is the study of why.

Most cognitive scientists know this. What they don't yet know is that understanding the evolutionary process can bring the architecture of the mind into

sharper relief. For biological systems, the nature of the designer carries implications for the nature of the design.

Table 2. Why cash registers add (adapted from Marr, 1982, pp. 22-23)

Rules defining addition
1. There is a unique element, "zero"; adding zero has no effect: 2 + 0 = 2
2. Commutativity: (2 + 3) = (3 + 2) = 5
3. Associativity: (2 + 3) + 4 = 2 + (3 + 4)
4. Each number has a unique inverse that when added to the number gives zero: 2 + (-2) = 0

Rules governing social exchange in a supermarket
1. If you buy nothing, it should cost you nothing; and buying nothing and something should cost the same as buying just the something. (The rules for zero.)
2. The order in which goods are presented to the cashier should not affect the total. (Commutativity.)
3. Arranging the goods into two piles and paying for each pile separately should not affect the total amount you pay. (Associativity; the basic operation for combining prices.)
4. If you buy an item and then return it for a refund, your total expenditure should be zero. (Inverses.)

The brain can process information because it contains complex neural circuits that are functionally organized. The only component of the evolutionary process that can build complex structures that are functionally organized is natural selection. And the only kind of problems that natural selection can build complexly organized structures for solving are adaptive problems, where "adaptive" has a very precise, narrow technical meaning (Dawkins, 1986; Pinker & Bloom, 1990; Tooby & Cosmides, 1990a, 1992; Williams, 1966). Bearing this in mind, let's consider the source of Marr's intuitions about the cash register. Buying food at a grocery store is a form of social exchange - cooperation between two or more individuals for mutual benefit. The adaptive problems that arise when individuals engage in this form of cooperation have constituted a long-enduring selection pressure on the hominid line. Paleoanthropological evidence indicates that social exchange extends back at least 2 million years in the human line, and the fact that social exchange exists in some of our primate cousins suggests that it may be even more ancient than that. It is exactly the kind of problem that selection can build cognitive mechanisms for solving.

Social exchange is not a recent cultural invention, like writing, yam cultivation, or computer programming; if it were, one would expect to find evidence of its having one or several points of origin, of its having spread by contact, and of its being extremely elaborated in some cultures and absent in others. But its distribution does not fit this pattern. Social exchange is both universal and highly elaborated across human cultures, presenting itself in many forms: reciprocal

gift-giving, food-sharing, market pricing, and so on (Cosmides & Tooby, 1992; Fiske, 1990, 1991). It is an ancient, pervasive and central part of human social life.

In evolutionary biology, researchers such as George Williams, Robert Trivers, W.D. Hamilton, and Robert Axelrod have explored constraints on the evolution of social exchange using game theory, modeling it as a repeated Prisoner's Dilemma. These analyses have turned up a number of important features of this adaptive problem, a crucial one being that social exchange cannot evolve in a species unless individuals have some means of detecting individuals who cheat and excluding them from future interactions (e.g., Axelrod, 1984; Axelrod & Hamilton, 1981; Boyd, 1988; Trivers, 1971). One can think of this as an evolvability constraint. Selection cannot construct mechanisms in any species - including humans - that systematically violate such constraints. Behavior is generated by computational mechanisms. If a species engages in social exchange behavior, then it does so by virtue of computational mechanisms that satisfy the evolvability constraints that characterize this adaptive problem.

Behavioral ecologists have used these constraints on the evolution of social exchange to build computational theories of this adaptive problem - theories of what and why. These theories have provided a principled basis for generating hypotheses about the phenotypic design of mechanisms that generate social exchange in a variety of species. They spotlight design features that any cognitive program capable of solving this adaptive problem must have. By cataloging these design features, animal behavior researchers were able to look for - and discover - previously unknown aspects of the psychology of social exchange in species from chimpanzees, baboons and vervets to vampire bats and hermaphroditic coral-reef fish (e.g., Cheney & Seyfarth, 1990; de Waal & Luttrell, 1988; Fischer, 1988; Smuts, 1986; Wilkinson, 1988, 1990).

This research strategy has been successful for a very simple reason: very few cognitive programs satisfy the evolvability constraints for social exchange. If a species engages in this behavior (and not all do), then its cognitive architecture must contain one of these programs.

In our own species, social exchange is a universal, species-typical trait with a long evolutionary history. We have strong and cross-culturally reliable intuitions about how this form of cooperation should be conducted, which arise in the absence of any explicit instruction (Cosmides & Tooby, 1992; Fiske, 1991). In developing his computational theory of the cash register - a tool used in social exchange - David Marr was consulting these deep human intuitions.3

3 Had Marr known about the importance of cheating in evolutionary analyses of social exchange, he might have been able to understand other features of the cash register as well. Most cash registers have anti-cheating devices. Cash drawers lock until a new set of prices is punched in; two rolls of tape keep track of transactions (one is for the customer; the other rolls into an inaccessible place in the cash register, preventing the clerk from altering the totals to match the amount of cash in the drawer); and so on.

From these facts, we can deduce that the human cognitive architecture contains

programs that satisfy the evolvability constraints for social exchange. As cognitive scientists, we should be able to specify what rules govern human behavior in this domain, and why we humans reliably develop circuits that embody these rules rather than others. In other words, we should be able to develop a computational theory of the organic information-processing device that governs social exchange in humans.

Since Marr, cognitive scientists have become familiar with the notion of developing computational theories to study perception and language, but the notion that one can develop computational theories to study the information-processing devices that give rise to social behavior is still quite alien. Yet some of the most important adaptive problems our ancestors had to solve involved navigating the social world, and some of the best work in evolutionary biology is devoted to analyzing constraints on the evolution of mechanisms that solve these problems. In fact, these evolutionary analyses may be the only source of constraints available for developing computational theories of social cognition.

Principles of organic design

The field of evolutionary biology summarizes our knowledge of the engineering principles that govern the design of organisms. As a source of theoretical guidance about organic design, functionalism has an unparalleled historical track record. As Ernst Mayr notes, "The adaptationist question, 'What is the function of a given structure or organ?' has been for centuries the basis for every advance in physiology" (1983, p. 328). Attention to function can advance the cognitive sciences as well. Aside from those properties acquired by chance or imposed by engineering constraint, the mind consists of a set of information-processing circuits that were designed by natural selection to solve adaptive problems that our hunter-gatherer ancestors faced generation after generation.4 If we know what these problems were, we can seek mechanisms that are well engineered for solving them.

4 Our ancestors spent the last 2 million years as Pleistocene hunter-gatherers (and several hundred million years before that as one kind of forager or another). The few thousand years since the scattered appearance of agriculture is a short stretch, in evolutionary terms (less than 1% of the past 2 million years). Complex designs - ones requiring the coordinated assembly of many novel, functionally integrated features - are built up slowly, change by change, subject to the constraint that each new design feature must solve an adaptive problem better than the previous design (the vertebrate eye is an example). For these and other reasons, it is unlikely that our species evolved complex adaptations even to agriculture, let alone to post-industrial society (for discussion, see Dawkins, 1982; Tooby & Cosmides, 1990a, 1990b).

The exploration and definition of these adaptive problems is a major activity of evolutionary biologists. By combining results derived from mathematical modeling, comparative studies, behavioral ecology, paleoanthropology and other fields,

evolutionary biologists try to identify what problems the mind was designed to solve and why it was designed to solve those problems rather than some other ones. In other words, evolutionary biologists explore exactly those questions that Marr argued were essential for developing computational theories of adaptive information-processing problems.

Computational theories address what and why, but because there are multiple ways of achieving any solution, experiments are needed to establish how. But the more precisely you can define the goal of processing - the more tightly you can constrain what would count as a solution - the more clearly you can see what a mechanism capable of producing that solution would have to look like. The more constraints you can discover, the more the field of possible solutions is narrowed, and the more you can concentrate your experimental efforts on discriminating between viable hypotheses.

A technological analogy may make this clearer. It is difficult to figure out the design of the object I'm now thinking about if all you know is that it is a machine (toaster? airplane? supercollider?). But the answer becomes progressively clearer as I add functional constraints: (1) it is well designed for entertainment (movie projector, TV, CD player?); it was not designed to project images (nothing with a screen); it is well designed for playing taped music (stereo or Walkman); it was designed to be easily portable during exercise (Walkman). Knowing the object is well engineered for solving these problems provides powerful clues about its functional design features that can guide research. Never having seen one, you would know that it must contain a device that converts magnetic patterns into sound waves; a place to insert the tape; an outer shell no smaller than a tape, but no larger than necessary to perform the transduction; and so on.

Guessing at random would have taken forever. Information about features that have no impact on the machine's function would not have helped much either (e.g., its color, the number of scratches). Because functionally neutral features are free to vary, information about them does little to narrow your search. Functional information helps because it narrowly specifies the outcome to be produced. The smaller the class of entities capable of producing that outcome, the more useful functional information is. This means (1) narrow definitions of outcomes are more useful than broad ones (tape player versus entertainment device), and (2) functional information is most useful when there are only a few ways of producing an outcome (Walkman versus paperweight; seeing versus scratching).

Narrow definitions of function are a powerful methodological tool for discovering the design features of any complex problem-solving device, including the human mind. Yet the definition of function that guides most research on the mind (it "processes information") is so broad that it applies even to a Walkman. It is possible to create detailed theories of adaptive function. This is because

natural selection is only capable of producing certain kinds of designs: designs that promoted their own reproduction in past environments. This rule of organic design sounds too general to be of any help. But when it is applied to real species in actual environments, this deceptively simple constraint radically limits what counts as an adaptive problem and, therefore, narrows the field of possible solutions. Table 3 lists some principles of organic design that cognitive psychologists could be using, but aren't.

Table 3. Evolutionary biology provides constraints from which computational theories of adaptive information-processing problems can be built

To build a computational theory, you need to answer two questions:
1. What is the adaptive problem?
2. What information would have been available in ancestral environments for solving it?

Some sources of constraints
1. More precise definition of Marr's "goal" of processing that is appropriate to evolved (as opposed to artificial) information-processing systems
2. Game-theoretic models of the dynamics of natural selection (e.g., kin selection, Prisoner's Dilemma and cooperation - particularly useful for analysis of cognitive mechanisms responsible for social behavior)
3. Evolvability constraints: can a design with properties X, Y, and Z evolve, or would it have been selected out by alternative designs with different properties? (i.e., does the design represent an evolutionarily stable strategy? - related to point 2)
4. Hunter-gatherer studies and paleoanthropology - source of information about the environmental background against which our cognitive architecture evolved. (Information that is present now may not have been present then, and vice versa).
5. Studies of the algorithms and representations whereby other animals solve the same adaptive problem. (These will sometimes be the same, sometimes different)

Doing experiments is like playing "20 questions" with nature, and evolutionary biology gives you an advantage in this game: it tells you what questions are most worth asking, and what the answer will probably look like. It provides constraints - functional and otherwise - from which computational theories of adaptive information-processing problems can be built.

Taking function seriously

We know the cognitive science that intuition has wrought. It is more difficult, however, to know how our intuitions might have blinded us. What cognitive systems, if any, are we not seeing? How would evolutionary functionalism transform the science of mind?
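Points 2 and 3 of Table 3 can be made concrete with a small simulation of the repeated Prisoner's Dilemma mentioned earlier. What follows is a minimal sketch under stated assumptions, not a reproduction of any model in the cited literature: the payoff values are the conventional illustrative ones (5, 3, 1, 0), the three strategies and every function name are hypothetical, and "excluding cheaters from future interactions" is approximated by a strategy that stops cooperating with a partner once that partner has defected.

```python
from typing import Callable, Dict, List, Tuple

COOPERATE, DEFECT = "C", "D"

# Assumed illustrative payoffs: (payoff to first player, payoff to second player).
PAYOFFS: Dict[Tuple[str, str], Tuple[int, int]] = {
    (COOPERATE, COOPERATE): (3, 3),
    (COOPERATE, DEFECT): (0, 5),
    (DEFECT, COOPERATE): (5, 0),
    (DEFECT, DEFECT): (1, 1),
}

# A strategy maps the partner's past moves to this round's move.
Strategy = Callable[[List[str]], str]


def unconditional_cooperator(partner_history: List[str]) -> str:
    """Always cooperates; has no means of detecting cheaters."""
    return COOPERATE


def cheater(partner_history: List[str]) -> str:
    """Always takes the benefit without paying the cost."""
    return DEFECT


def discriminator(partner_history: List[str]) -> str:
    """Cooperates until the partner cheats once, then withholds cooperation."""
    return DEFECT if DEFECT in partner_history else COOPERATE


def play(first: Strategy, second: Strategy, rounds: int = 10) -> Tuple[int, int]:
    """Total payoffs for two strategies over a repeated interaction."""
    seen_by_first: List[str] = []   # moves the second player has made so far
    seen_by_second: List[str] = []  # moves the first player has made so far
    total_first = total_second = 0
    for _ in range(rounds):
        move_first = first(seen_by_first)
        move_second = second(seen_by_second)
        gain_first, gain_second = PAYOFFS[(move_first, move_second)]
        total_first += gain_first
        total_second += gain_second
        seen_by_first.append(move_second)
        seen_by_second.append(move_first)
    return total_first, total_second


if __name__ == "__main__":
    print("cheater vs unconditional cooperator:", play(cheater, unconditional_cooperator))
    print("cheater vs discriminator:", play(cheater, discriminator))
    print("discriminator vs discriminator:", play(discriminator, discriminator))
```

Under these assumed payoffs, a cheater paired with an unconditional cooperator collects far more than two mutual cooperators would, whereas a cheater paired with a discriminator collects less than discriminators earn from each other. That asymmetry is the evolvability constraint in miniature: a design for cooperation that cannot detect cheaters is open to exploitation.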

Textbooks in psychology are organized according to a folk psychological categorization of mechanisms: "attention", "memory", "reasoning", "learning". In contrast, textbooks in evolutionary biology and behavioral ecology are organized according to adaptive problems: foraging (hunting and gathering), kinship, predator defense, resource competition, cooperation, aggression, parental care, dominance and status, inbreeding avoidance, courtship, mateship maintenance, trade-offs between mating effort and parenting effort, mating system, sexual conflict, paternity uncertainty and sexual jealousy, signaling and communication, navigation, habitat selection, and so on. Textbooks in evolutionary biology are organized according to adaptive problems because these are the only problems that selection can build mechanisms for solving. Textbooks in behavioral ecology are organized according to adaptive problems because circuits that are functionally specialized for solving these problems have been found in species after species. No less should prove true of humans. Twenty-first-century textbooks on human cognition will probably be organized similarly.

Fortunately, behavioral ecologists and evolutionary biologists have already created a library of sophisticated models of the selection pressures, strategies and trade-offs that characterize these very fundamental adaptive problems, which they use in studying processes of attention, memory, reasoning and learning in non-humans. Which model is applicable for a given species depends on certain key life-history parameters. Findings from paleoanthropology, hunter-gatherer archaeology, and studies of living hunter-gatherer populations locate humans in this theoretical landscape by filling in the critical parameter values. Ancestral hominids were ground-living primates; omnivores, exposed to a wide variety of plant toxins and having a sexual division of labor between hunting and gathering; mammals with altricial young, long periods of biparental investment in offspring, enduring male-female mateships, and an extended period of physiologically obligatory female investment in pregnancy and lactation. They were a long-lived, low-fecundity species in which variance in male reproductive success was higher than variance in female reproductive success. They lived in small nomadic kin-based bands of perhaps 20-100; they would rarely (if ever) have seen more than 1000 people at one time; they had little opportunity to store provisions for the future; they engaged in cooperative hunting, defense and aggressive coalitions; they made tools and engaged in extensive amounts of cooperative reciprocation; they were vulnerable to a large variety of parasites and pathogens. When these parameters are combined with formal models from evolutionary biology and behavioral ecology, a reasonably consistent picture of ancestral life begins to appear (e.g., Tooby & DeVore, 1987).

In this picture, the adaptive problems posed by social life loom large. Most of these are characterized by strict evolvability constraints, which could only be satisfied by cognitive programs that are specialized for reasoning about the social

world. This suggests that our evolved mental architecture contains a large and intricate "faculty" of social cognition (Brothers, 1990; Cosmides & Tooby, 1992; Fiske, 1991; Jackendoff, 1992). Yet despite its importance, very little work in the cognitive sciences has been devoted to looking for cognitive mechanisms that are specialized for reasoning about the social world. Nor have cognitive neuroscientists been looking for dissociations among different forms of social reasoning, or between social reasoning and other cognitive functions. The work on autism as a neurological impairment of a "theory of mind" module is a very notable - and very successful - exception (e.g., Baron-Cohen, Leslie & Frith, 1985; Frith, 1989; Leslie, 1987).

There are many reasons for the neglect of these topics in the study of humans (see Tooby & Cosmides, 1992), but a primary one is that cognitive scientists have been relying on their intuitions for hypotheses rather than asking themselves what kind of problems the mind was designed by evolution to solve. By using evolutionary biology to remind ourselves of the types of problems hominids faced across hundreds of thousands of generations, we can escape the narrow conceptual cage imposed on us by our intuitions and folk psychology. This is not a minor point: if you don't think a thing exists, you won't take the steps necessary to find it. By having the preliminary map that an evolutionary perspective provides, we can find our way out into the vast, barely explored areas of the human cognitive architecture.

II. Computational theories derived from evolutionary biology suggest that the mind is riddled with functionally specialized circuits

During most of this century, research in psychology and the other biobehavioral and social sciences has been dominated by the assumptions of what we have elsewhere called the Standard Social Science Model (SSSM) (Tooby & Cosmides, 1992). This model's fundamental premise is that the evolved architecture of the human mind is comprised mainly of cognitive processes that are content-free, few in number and general purpose. These general-purpose mechanisms fly under names such as "learning", "induction", "imitation", "reasoning" and "the capacity for culture", and are thought to explain nearly every human phenomenon. Their structure is rarely specified by more than a wave of the hand. In this view, the same mechanisms are thought to govern how one acquires a language and a gender identity, an aversion to incest and an appreciation for vistas, a desire for friends and a fear of spiders - indeed, nearly every thought and feeling of which humans are capable. By definition, these empiricist mechanisms have no inherent content built into their procedures, they are not designed to construct certain mental contents more readily than others, and they have no features specialized for processing particular kinds of content over others. In

other words, they are assumed to operate uniformly, no matter what content, subject matter or domain of life experience they are operating on. (For this reason, such procedures are described as content-independent, domain-general or content-free). The premise that these mechanisms have no content to impart is what leads to the doctrine central to the modern behavioral and social sciences: that all of our particular mental content originated in the social and physical world and entered through perception. As Aquinas put this empiricist tenet a millennium ago, "There is nothing in the intellect that was not first in the senses." As we will discuss, this view of central processes is difficult to reconcile with modern evolutionary biology.

The weakness of content-independent architectures

To some it may seem as if an evolutionary perspective supports the case that our cognitive architecture consists primarily of powerful, general-purpose problem-solvers - inference engines that embody the content-free normative theories of mathematics and logic. After all, wouldn't an organism be better equipped and better adapted if it could solve a more general class of problems over a narrower class?

This empiricist view is difficult to reconcile with evolutionary principles for a simple reason: content-free, general-purpose problem-solving mechanisms are extraordinarily weak - or even inert - compared to specialized ones. Every computational system - living or artificial - must somehow solve the frame problem (e.g., Pylyshyn, 1987). Most artificial intelligence programs have domain-specific knowledge and procedures that do this (even those that are called "general purpose"). A program equipped solely with domain-general procedures can do nothing unless the human programmer solves the frame problem for it: either by artificially constraining the problem space or by supplying the program - by fiat - with pre-existing knowledge bases ("innate" knowledge) that it could not have acquired on its own, with or without connections to a perceptual system.

However, to be a viable hypothesis about our cognitive architecture, a proposed design must pass a solvability test. It must, in principle, be able to solve problems humans are known to be able to solve. At a minimum, any proposed cognitive architecture had to produce sufficiently self-reproductive behavior in ancestral environments - we know this because all living species have been able to reproduce themselves in an unbroken chain up to the present. While artificial intelligence programs struggle to recognize and manipulate coke cans, naturally intelligent programs situated in organisms successfully negotiate through lifetimes full of biotic antagonists - predators, conspecific competitors, self-defending food items, parasites, even siblings. At the same time, these naturally intelligent programs solve a large series of intricate problems in the project of assembling a

sufficient number of replacement individuals: offspring. Just as a hypothesized set of cognitive mechanisms underlying language must be able to account for the facts of human linguistic behavior, so too must any hypothetical domain-general cognitive architecture reliably generate solutions to all of the problems that were necessary for survival and reproduction in the Pleistocene. For humans and most other species, this is a remarkably diverse, highly structured and very complex set of problems.

If it can be shown that there are essential adaptive problems that humans must have been able to solve in order to have propagated and that domain-general mechanisms cannot solve them, then the domain-general hypothesis fails. We think there is a very large number of such problems, including inclusive fitness regulation, nutritional regulation, foraging, navigation, incest avoidance, sexual jealousy, predator avoidance, mate choice, social exchange - at a minimum, any kind of information-processing problem that involves motivation, and many others as well. We have developed this argument in detail elsewhere (Cosmides & Tooby, 1987, 1994; Tooby & Cosmides, 1990a, 1992), so we won't belabor it here. Instead, we will simply summarize a few of the relevant points.

(1) The "Stoppit" problem. There is a Gary Larson cartoon about an "all-purpose" product called "Stoppit". When sprayed from an aerosol can, Stoppit stops faucet drips, taxis, cigarette smoking, crying babies and charging elephants. An "all-purpose" cognitive program is no more feasible for an analogous reason: what counts as adaptive behavior differs markedly from domain to domain. An architecture equipped only with content-independent mechanisms must succeed at survival and reproduction by applying the same procedures to every adaptive problem. But there is no domain-general criterion of success or failure that correlates with fitness (e.g., what counts as a "good" mate has little in common with a "good" lunch or a "good" brother). Because what counts as the wrong thing to do differs from one class of problems to the next, there must be as many domain-specific subsystems as there are domains in which the definitions of successful behavioral outcomes are incommensurate.

(2) Combinatorial explosion. Combinatorial explosion paralyzes even moderately domain-general systems when encountering real-world complexity. As generality is increased by adding new dimensions to a problem space or new branch points to a decision tree, the computational load increases with catastrophic rapidity. A content-independent, specialization-free architecture contains no rules of relevance, procedural knowledge or privileged hypotheses, and so could not solve any biological problem of routine complexity in the amount of time an organism has to solve it (for discussion see, for example, Carey, 1985; Gallistel, Brown, Carey, Gelman, & Keil, 1991; Keil, 1989; Markman, 1989; Tooby & Cosmides, 1992). The question is not "How much specialization does a general purpose system require?" but rather "How many degrees of freedom can

a system tolerate - even a specialized, highly targeted one - and still compute decisions in useful, real-world time?" Combinatorics guarantee that real systems can only tolerate a small number. (Hence this problem cannot be solved by placing a few "constraints" on a general system.)

(3) Clueless environments. Content-free architectures are limited to knowing what can be validly derived by general processes from perceptual information. This sharply limits the range of problems they can solve: when the environment is clueless, the mechanism will be too. Domain-specific mechanisms are not limited in this way. They can be constructed to embody clues that fill in the blanks when perceptual evidence is lacking or difficult to obtain.

Consider the following adaptive problem. All plant foods contain an array of toxins. Ones that your liver metabolizes with ease sometimes harm a developing embryo. This subtle statistical relationship between the environment, eating behavior and fitness is ontogenetically "invisible": it cannot be observed or induced via general-purpose processes on the basis of perceptual evidence.[5] It can, however, be "observed" phylogenetically, by natural selection, because selection does not work by inference or simulation. Natural selection "counts up" the actual results of alternative designs operating in the real world, over millions of individuals, over thousands of generations, and weights these alternatives by the statistical distribution of their consequences: those design features that statistically lead to the best available outcome are retained. In this sense it is omniscient - it is not limited to what could be validly deduced by one individual, based on a short period of experience, it is not limited to what is locally perceivable, and it is not confused by spurious local correlations. As a result, it can build circuits - like those that regulate food choice during pregnancy - which embody privileged hypotheses that reflect and exploit these virtually unobservable relationships in the world. For example, the embryo/toxin problem is solved by a set of functionally specialized mechanisms that adjust the threshold on the mother's normal food aversion system (Profet, 1992). They lower it when the embryo is most at risk - thereby causing the food aversions, nausea and vomiting of early pregnancy - and raise it when caloric intake becomes a priority. As a result, the mother avoids ordinarily palatable foods when they would threaten the embryo: she responds adaptively to an ontogenetically invisible relationship. Functionally specialized designs allow organisms to solve a broad range of otherwise unsolvable adaptive problems. (For discussion of this design principle, see Cosmides & Tooby, 1987, 1994; Shepard, 1981, 1987; Tooby & Cosmides, 1990a.)

[5] Women ingest thousands of plant toxins every day; embryos self-abort for many reasons; early term abortions are often undetectable; the best trade-off between calories consumed and risk of teratogenesis is obscure.
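The combinatorial explosion described in point (2) can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `leaf_count`, the choice of three options per decision point and the depths shown are assumptions made for the example, not figures taken from the text.

```python
def leaf_count(branching_factor: int, depth: int) -> int:
    """Number of distinct action sequences in a uniform decision tree.

    Each of `depth` successive choice points offers `branching_factor`
    options, so the number of complete sequences is b ** d.
    """
    return branching_factor ** depth


# Hypothetical numbers: 3 options at every choice point.
# Each added branch point multiplies the search space by 3.
for depth in (5, 10, 20, 40):
    print(f"depth {depth:>2}: {leaf_count(3, depth):,} sequences")
```

Even with only three options per step, forty sequential choices already yield more sequences than could plausibly be enumerated one by one, which is the sense in which an unconstrained search cannot finish "in useful, real-world time".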

In sum, architectures that do not come factory-equipped with sufficiently rich sets of content-specific machinery fail the solvability test. They could not have evolved, survived or propagated because they are incapable of solving even routine adaptive problems (Cosmides & Tooby, 1987, 1994; Tooby & Cosmides, 1992).

Natural selection, efficiency and functional specialization

Some researchers accept the conclusion that the human mind cannot consist solely of content-independent machinery, but nevertheless continue to believe that the mind needs very little content-specific organization to function. They believe that the preponderance of mental processes are content-independent and general purpose. Moreover, they believe that the correct null hypothesis - the parsimonious, prudent scientific stance - is to posit as few functionally specialized mechanisms as possible.

This stance ignores what is now known about the nature of the evolutionary process and the types of functional organization that it produces. Natural selection is a relentlessly hill-climbing process which tends to replace relatively less efficient designs with ones that perform better. Hence, in deciding which of two alternative designs is more likely to have evolved, their comparative performance on ancestral adaptive problems is the appropriate standard to use. Given this standard, positing a preponderance of general-purpose machinery is neither prudent nor parsimonious.[6] General-purpose mechanisms can't solve most adaptive problems at all, and in those few cases where one could, a specialized mechanism is likely to solve it more efficiently.

[6] Parsimony applies to number of principles, not number of entities - physicists posit a small number of laws, not a small number of elements, molecules or stellar bodies. Epicycle upon epicycle would have to be added on to evolutionary theory to create a model in which less efficient designs frequently outcompeted more efficient ones.

The reason why is quite straightforward. A general engineering principle is that the same machine is rarely capable of solving two different problems equally well. We have both cork-screws and cups because each solves a particular problem better than the other. It would be extremely difficult to open a bottle of wine with a cup or to drink from a cork-screw. This same principle applies to the design of the human body. The heart is elegantly designed for pumping blood, but it is not good at detoxifying poisons; the liver is specialized for detoxifying poisons, but it cannot function as a pump. Pumping blood throughout the body and detoxifying poisons are two very different problems; consequently, the human body has a different machine for solving each of them. In biology, machines like these - ones that are specialized

and functionally distinct - are called adaptive specializations (Rozin, 1976). Specialization of design is natural selection's signature and its most common result (Williams, 1966).[7] In fact, the more important the adaptive problem, the more intensely natural selection tends to specialize and improve the performance of the mechanism for solving it. There is no reason to believe that the human brain and mind are any exception.

[7] There are strict standards of evidence that must be met before a design feature can be considered an adaptation for performing function X. (1) The design feature must be species-typical; (2) function X must be an adaptive problem (i.e., a cross-generationally recurrent problem whose solution would have promoted the design feature's own reproduction); (3) the design feature must reliably develop (in the appropriate morphs) given the developmental circumstances that characterized its environment of evolutionary adaptedness; and, most importantly, (4) it must be shown that the design feature is particularly well designed for performing function X, and that it cannot be better explained as a by-product of some other adaptation or physical law. Contrary to popular belief, the following forms of "evidence" are not relevant: (1) showing that the design feature has a high heritability; (2) showing that variations in the environment do not affect its development; (3) showing that "learning" plays no role in its development. (Criteria for frequency-dependent adaptations differ. For refinements and complications, see Dawkins, 1982, 1986; Symons, 1992; Tooby & Cosmides, 1990b, 1992; and, especially, Williams, 1966, 1985.)

The cognitive programs that govern how you choose a mate should differ from those that govern how you choose your dinner. Different information-processing problems usually have different solutions. Implementing different solutions requires different, functionally distinct mechanisms (Sherry & Schacter, 1987). Speed, reliability and efficiency can be engineered into specialized mechanisms, because they do not need to engineer a compromise between mutually incompatible task demands: a jack of all trades - assuming one is possible at all - is necessarily a master of none. For this reason, one should expect the evolved architecture of the human mind to include many functionally distinct cognitive adaptive specializations.

And it does. For example, the learning mechanisms that govern language acquisition are different from those that govern the acquisition of food aversions, and both of these are different from the learning mechanisms that govern the acquisition of snake phobias (e.g., Cook, Hodes, & Lang, 1986; Cook & Mineka, 1989; Garcia, 1990; Mineka & Cook, 1988; Pinker, 1994; Ohman, Dimberg, & Ost, 1985; Ohman, Eriksson, & Olofsson, 1975). These adaptive specializations are domain-specific: the specialized design features that make them good at solving the problems that arise in one domain (avoiding venomous snakes) make them bad at solving the problems that arise in another (inducing a grammar). They are also content-dependent: they are activated by different kinds of content (speech versus screams), and their procedures are designed to accept different kinds of content as input (sentences versus snakes).

A mind that applied relatively general-purpose reasoning circuits to all these problems, regardless of their content, would be a very clumsy problem-solver. But flexibility and efficiency of thought and action can be achieved by a mind that contains a battery of

special-purpose circuits. The mind is probably more like a Swiss army knife than an all-purpose blade: competent in so many situations because it has a large number of components - bottle opener, cork-screw, knife, toothpick, scissors - each of which is well designed for solving a different problem.

The functional architecture of the mind was designed by natural selection; natural selection is a hill-climbing process which produces mechanisms that solve adaptive problems well; a specialized design is usually able to solve a problem better than a more generalized one. It is unlikely that a process with these properties would design central processes that are general purpose and content-free. Consequently, one's default assumption should be that the architecture of the human mind is saturated with adaptive specializations.

How to find a needle in a haystack

The human brain is the most complex system scientists have ever tried to understand; identifying its components is enormously difficult. The more functionally integrated circuits it contains, the more difficult it will be to isolate and map any one of them. Looking for a functionally integrated mechanism within a multimodular mind is like looking for a needle in a haystack. The odds you'll find one are low unless you can radically narrow the search space. Marr's central insight was that you could do this by developing computational theories of the problems these mechanisms were designed to solve - for the human brain, the adaptive problems our hunter-gatherer ancestors faced.

The only behavioral scientists who still derive their hypotheses from intuition and folk psychology, rather than an evolutionarily based theory, are those who study humans.[8] The empirical advantages of using evolutionary biology to develop computational theories of adaptive problems have already been amply demonstrated in the study of non-human minds (e.g., Gallistel, 1990; Gould, 1982; Krebs & Davies, 1987; Real, 1991). We wanted to demonstrate its utility in studying the human mind. We thought an effective way of doing this would be to use an evolutionarily derived computational theory to discover cognitive mechanisms whose existence no one had previously suspected. Because most cognitive scientists still think of central processes as content-independent, we thought it would be particularly interesting to demonstrate the existence of central processes that are functionally specialized and content-dependent: domain-specific reasoning mechanisms.

[8] For a detailed analysis of the common arguments against the application of evolutionary biology to the study of the human mind, see Tooby and Cosmides (1992).

Toward this end, we have conducted an experimental research program over the last 10 years, exploring the hypothesis that the human mind contains

specialized circuits designed for reasoning about adaptive problems posed by the social world of our ancestors: social exchange, threat, coalitional action, mate choice, and so on. We initially focused on social exchange because (1) the evolutionary theory is clear and well developed, (2) the relevant selection pressures are strong, (3) paleoanthropology evidence suggests that hominids have been engaging in it for millions of years - more than enough time for selection to shape specialized mechanisms - and (4) humans in all cultures engage in social exchange. By starting with an adaptive problem hunter-gatherers are known to have faced, we could proceed to design experiments to test for associated cognitive specializations.

The evolutionary analysis of social exchange parallels the economist's concept of trade. Sometimes known as "reciprocal altruism", social exchange is an "I'll scratch your back if you scratch mine" principle (for evolutionary analyses see, for example, Axelrod, 1984; Axelrod & Hamilton, 1981; Boyd, 1988; Trivers, 1971; Williams, 1966). Using evolvability constraints that biologists had already identified (some involving the Prisoners' Dilemma), we developed a computational theory of the information-processing problems that arise in this domain (Cosmides, 1985; Cosmides & Tooby, 1989). This gave us a principled basis for generating detailed hypotheses about the design of the circuits that generate social exchange in humans. Some of the design features we predicted are listed in Table 4.

For example, mathematical analyses had established cheater detection as a crucial adaptive problem. Circuits that generate social exchange will be selected out unless they allow individuals to detect those who fail to reciprocate favors - cheaters. This evolvability constraint led us directly to the hypothesis that humans might have evolved inference procedures that are specialized for detecting cheaters. We tested this hypothesis using the Wason selection task, which had originally been developed as a test of logical reasoning (Wason, 1966; Wason & Johnson-Laird, 1972).

A large literature already existed showing that people are not very good at detecting logical violations of "if-then" rules in Wason selection tasks, even when these rules deal with familiar content drawn from everyday life (e.g., Manktelow & Evans, 1979; Wason, 1983). For example, suppose you are skeptical when an astrologer tells you, "If a person is a Leo, then that person is brave," and you want to prove him wrong. In looking for exceptions to this rule, you will probably investigate people who you know are Leos, to see whether they are brave. Many people also have the impulse to investigate people who are brave, to see if they are Leos. Yet investigating brave people would be a waste of time; the astrologer said that all Leos are brave - not that all brave people are Leos - so finding a brave Virgo would prove nothing. And, if you are like most people, you probably won't realize that you need to investigate cowards. Yet a coward who turns out to be a Leo would represent a violation of the rule.
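The logic of the astrologer example can be sketched in a few lines of code. This is an illustrative sketch only, not the experimental materials: the `Person` class, the `could_violate` function and the four labelled cases are hypothetical names introduced here. The rule "if Leo then brave" is violated only by a Leo who is not brave, so the only people worth investigating are those whose hidden attribute could complete that combination.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    """One 'card': we know the sign or the temperament, but not both."""
    is_leo: Optional[bool] = None    # None means "not yet known"
    is_brave: Optional[bool] = None  # None means "not yet known"


def could_violate(person: Person) -> bool:
    """True if this person could turn out to be a Leo who is not brave."""
    if person.is_leo is False:    # a non-Leo can never break the rule
        return False
    if person.is_brave is True:   # a brave person can never break the rule
        return False
    return True


people = {
    "known Leo": Person(is_leo=True),
    "known Virgo": Person(is_leo=False),
    "known brave": Person(is_brave=True),
    "known coward": Person(is_brave=False),
}

to_investigate = [name for name, p in people.items() if could_violate(p)]
print(to_investigate)  # ['known Leo', 'known coward']
```

The filter keeps the known Leo and the known coward and discards the brave person, matching the point that a brave Virgo "would prove nothing".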

Table 4. Reasoning about social exchange: evidence of special design*

(a) The following design features were predicted and found
1. The algorithms governing reasoning about social contracts operate even in unfamiliar situations.
2. The definition of cheating that they embody depends on one's perspective.
3. They are just as good at computing the cost-benefit representation of a social contract from the perspective of one party as from the perspective of another.
4. They cannot operate so as to detect cheaters unless the rule has been assigned the cost-benefit representation of a social contract.
5. They embody implicational procedures specified by the computational theory.
6. They include inference procedures specialized for cheater detection.
7. Their cheater detection procedures cannot detect violations of social contracts that do not correspond to cheating.
8. They do not include altruist detection procedures.

(b) The following by-product hypotheses were empirically eliminated
1. Familiarity cannot explain the social contract effect.
2. It is not the case that social contract content merely facilitates the application of the rules of inference of the propositional calculus.
3. Social contract content does not merely "afford" clear thinking.
4. Permission schema theory cannot explain the social contract effect; in other words, application of a generalized deontic logic cannot explain the results.
5. It is not the case that any problem involving payoffs will elicit the detection of violations.

* To show that an aspect of the phenotype is an adaptation to perform a particular function, one must show that it is particularly well designed for performing that function, and that it cannot be better explained as a by-product of some other adaptation or physical law.

If your mind had reasoning circuits specialized for detecting logical violations of rules, it would be immediately obvious to you that you should investigate Leos and cowards. But it is not intuitively obvious to most subjects. In general, fewer than 10% of subjects spontaneously realize this. Despite claims for the power of culture and "learning", even formal training in logical reasoning does little to

boost performance (e.g., Cheng, Holyoak, Nisbett, & Oliver, 1986; Wason & Johnson-Laird, 1972).

However, we found that people who ordinarily cannot detect violations of "if-then" rules can do so easily and accurately when that violation represents cheating in a situation of social exchange. This is a situation in which one is entitled to a benefit only if one has fulfilled a requirement (e.g., "If you are to eat these cookies, then you must first fix your bed" or "If you are to eat cassava root, then you must have a tattoo on your face"). In these situations, the adaptively correct answer is immediately obvious to almost all subjects, who commonly experience a "pop out" effect. No formal training is needed. Whenever the content of a problem asks subjects to look for cheaters on a social exchange - even when the situation described is culturally unfamiliar and even bizarre - subjects experience the problem as simple to solve, and their performance jumps dramatically. Seventy to 90% of subjects get it right, the highest performance ever found for a task of this kind.

From a domain-general, formal view, investigating people eating cassava root and people without tattoos is logically equivalent to investigating Leos and cowards. But everywhere it has been tested, people do not treat social exchange problems as equivalent to other kinds of reasoning problems. Their minds distinguish social exchange contents, and apply domain-specific, content-dependent rules of inference that are adaptively appropriate only to that task. (For a review of the relevant experiments, see Cosmides & Tooby, 1992. For more detailed descriptions, see Cosmides, 1985, 1989; Cosmides & Tooby, 1989; Gigerenzer & Hug, 1992.)

We think that the goal of cognitive research should be to recover, out of carefully designed experimental studies, high-resolution "maps" of the intricate mechanisms that collectively constitute the cognitive architecture. Our evolutionarily derived computational theory of social exchange allowed us to construct experiments capable of detecting, isolating and mapping out previously unknown cognitive procedures. It led us to predict a large number of design features in advance - features that no one was looking for and that most of our colleagues thought were outlandish (Cosmides & Tooby, 1989). Experimental tests have confirmed the presence of all the predicted design features that have been tested for so far. Those design features that have been tested and confirmed are listed in Table 4, along with the alternative by-product hypotheses that we and our colleagues have eliminated. So far, no known theory invoking general-purpose cognitive processes has been able to explain the very precise and unique pattern of data that experiments like these have generated. The data seem best explained by the hypothesis that humans reliably develop circuits that are complexly specialized for reasoning about reciprocal social interactions.

Parallel lines of investigation have already identified two other domain-specialized reasoning mechanisms: one for reasoning about aggressive threats and one

for reasoning about protection from hazards (e.g., Manktelow & Over, 1990; Tooby & Cosmides, 1989). We are now designing clinical tests to identify the neural basis for these mechanisms. By studying patient populations with autism and other neurological impairments of social cognition, we should be able to see whether dissociations occur along the fracture lines that our various computational theories suggest.

Reasoning instincts

In our view, a large range of reasoning problems (like the astrological one) are difficult because (1) their content is not drawn from a domain for which humans evolved functionally specialized reasoning circuits, and (2) we lack the content-independent circuits necessary for performing certain logical operations ("logical reasoning"). In contrast, social exchange problems are easy because we do have evolved circuits specialized for reasoning about that important, evolutionarily long-enduring problem in social cognition. The inferences necessary for detecting cheaters are obvious to humans for the same reason that the inferences necessary for echolocation are obvious to a bat.

Instincts are often thought of as the polar opposite of reasoning. Non-human animals are widely believed to act through "instinct", while humans "gave up instincts" to become "the rational animal". But the reasoning circuits we have been investigating are complexly structured for solving a specific type of adaptive problem, they reliably develop in all normal human beings, they develop without any conscious effort and in the absence of any formal instruction, they are applied without any conscious awareness of their underlying logic, and they are distinct from more general abilities to process information or to behave intelligently. In other words, they have all the hallmarks of what one usually thinks of as an "instinct" (Pinker, 1994). Consequently, one can think of these specialized circuits as reasoning instincts. They make certain kinds of inferences just as easy, effortless and "natural" to us as humans, as spinning a web is to a spider or dead-reckoning is to a desert ant.

Three decades of research in cognitive psychology, evolutionary biology and neuroscience have shown that the central premise of the SSSM - that the mind is general purpose and content-free - is fundamentally misconceived. An alternative framework - sometimes called evolutionary psychology - is beginning to replace it (Tooby & Cosmides, 1992). According to this view, the evolved architecture of the human mind is full of specialized reasoning circuits and regulatory mechanisms that organize the way we interpret experience, construct knowledge and make decisions. These circuits inject certain recurrent concepts and motivations into our mental life, and they provide universal frames of meaning that allow us to understand the actions and intentions of others. Beneath the level of surface

variability, all humans share certain views and assumptions about the nature of the world and human action by virtue of these universal reasoning circuits (Atran, 1990; Boyer, 1994; Brown, 1991; Carey & Gelman, 1991; Gelman & Hirschfeld, 1994; Keil, 1994; Leslie, 1987; Markman, 1990; Spelke, 1990; Sperber, 1985, 1990, 1994; Symons, 1979; Tooby & Cosmides, 1992).

III. Intuition is a misleading source of hypotheses because functionally specialized mechanisms create "instinct blindness"; computational theories are lenses that correct for instinct blindness

Intuitions about cognition: the limitations of an atheoretical approach

The adaptationist view of a multimodular mind was common at the turn of the century. Early experimental psychologists, such as William James and William McDougall, thought the mind is a collection of "faculties" or "instincts" that direct learning, reasoning and action (James, 1890; McDougall, 1908). These faculties were thought to embody sophisticated information-processing procedures that were domain-specific. In James's view, human behavior is so much more flexibly intelligent than that of other animals because we have more instincts than they do - not fewer (James, 1890).

The vocabulary may be archaic, but the model is modern. With every new discovery, it becomes more apparent that the evolved architecture of the human mind is densely multimodular - that it consists of an enormous collection of circuits, each specialized for performing a particular adaptive function. The study of perception and language has provided the most conspicuous examples, but evidence for the existence of learning instincts (Marler, 1991) and reasoning instincts is pouring in from all corners of the cognitive sciences (for examples, see Atran, 1990; Barkow, Cosmides, & Tooby, 1992; Baron-Cohen, Leslie, & Frith, 1985; A. Brown, 1990; D. E. Brown, 1991; Carey & Gelman, 1991; Cosmides & Tooby, in press; Daly & Wilson, 1988, 1994; Frith, 1989; Gelman & Hirschfeld, 1994; Gigerenzer, Hoffrage, & Kleinbolting, 1991; Leslie, 1988; Pinker, 1994; Rozin, 1976; Spelke, 1988; Sperber, 1994; Symons, 1979; Wilson & Daly, 1992; Wynn, 1992).

In spite of this consistent pattern, however, most cognitive scientists balk at the model of a brain crowded with specialized inference engines. Even Fodor, who has championed the case for modular processes, takes the traditional view that "central" processes are general purpose (Fodor, 1983). The notion that learning and reasoning are like perception and language - the complex product of a large collection of functionally specialized circuits - is deeply at war with our intuitions.

But so is the inherent indeterminacy in the position of electrons. It is uncomfortable but scientifically necessary to accept that common sense is the

faculty that tells us the world is flat.9 Our intuitions may feel authoritative and irresistibly compelling, and they may lead us to dismiss many ideas as ridiculous. But they are, nevertheless, an untrustworthy guide to the reality of subatomic particles or the evolved structure of the human mind.

9 This should not be surprising. Our intuitions were designed to generate adaptive behavior in Pleistocene hunter-gatherers, not useful theories for physicists and cognitive scientists.

In the case of central processes, we think human intuition is not merely untrustworthy: it is systematically misleading. Well-designed reasoning instincts should be invisible to our intuitions, even as they generate them - no more accessible to consciousness than retinal cells and line detectors, but just as important in creating our perception of the world.

Intuitively, we are all naive realists, experiencing the world as already parsed into objects, relationships, goals, foods, dangers, humans, words, sentences, social groups, motives, artifacts, animals, smiles, glares, relevances and saliences, the known and the obvious. This automatically manufactured universe, input as toy worlds into computers, seems like it could almost be tractable by that perennially elusive collection of general-purpose algorithms cognitive scientists keep expecting to find. But to produce this simplified world that we effortlessly experience, a vast sea of computational problems are being silently solved, out of awareness, by a host of functionally integrated circuits. These reasoning instincts are powerful inference engines, whose automatic, non-conscious operation creates our seamless experience of the world. The sense of clarity and self-evidence they generate is so potent it is difficult to see that the computational problems they solve even exist. As a result, we incorrectly locate the computationally manufactured simplicity that we experience as a natural property of the external world - as the pristine state of nature, not requiring any explanation or research.

Thus the "naturalness" of certain inferences acts to obstruct the discovery of the mechanisms that produced them. Cognitive instincts create problems for cognitive scientists. Precisely because they work so well - because they process information so effortlessly and automatically - we tend to be blind to their existence. Not suspecting they exist, we do not conduct research programs to find them.

To see that they exist, you need to envision an alternative conceptual universe. But these dedicated circuits structure our thought so powerfully that it can be difficult to imagine how things could be otherwise. As William James wrote:

It takes . . . a mind debauched by learning to carry the process of making the natural seem strange, so far as to ask for the why of any instinctive human act. To the metaphysician alone can such questions occur as: why do we smile, when pleased, and not scowl? Why are we unable to talk to a

crowd as we talk to a single friend? Why does a particular maiden turn our wits so upside-down? The common man can only say, Of course we smile, of course our heart palpitates at the sight of the crowd, of course we love the maiden, that beautiful soul clad in that perfect form, so palpably and flagrantly made for all eternity to be loved! And so, probably, does each animal feel about the particular things it tends to do in the presence of particular objects. To the lion it is the lioness which is made to be loved; to the bear, the she-bear. To the broody hen the notion would probably seem monstrous that there should be a creature in the world to whom a nestful of eggs was not the utterly fascinating and precious and never-to-be-too-much-sat-upon object which it is to her. (James, 1890)

For exactly this reason, intuition is an unreliable guide to points of interest in the human mind. Functionally specialized reasoning circuits will make certain inferences intuitive - so "natural" that there doesn't seem to be any phenomenon that is in need of explanation.

Consider, for example, sentences (1) and (2):

(1) If he's the victim of an unlucky tragedy, then we should pitch in to help him out.
(2) If he spends his time loafing and living off of others, then he doesn't deserve our help.

The inferences they express seem perfectly natural; there seems to be nothing to explain. They may not always be applicable, but they are perfectly intelligible. But consider sentences (3) and (4):

*(3) If he's the victim of an unlucky tragedy, then he doesn't deserve our help.
*(4) If he spends his time loafing and living off of others, then we should pitch in to help him out.

Sentences (3) and (4) sound eccentric in a way that (1) and (2) do not. Yet they involve no logical contradictions. The inferences they embody seem to violate a grammar of social reasoning - in much the same way that "Alice might slowly" violates the grammar of English but "Alice might come" does not (Cosmides, 1985; Cosmides & Tooby, 1989, 1992). If so, then one needs to look for a reasoning device that can reliably generate (1) and (2) without also generating (3) and (4).

Realizing that not generating (3) and (4) is a design feature of the mechanism is tricky, however. Precisely because the device in question does not spontaneously generate inferences like (3) and (4), we rarely notice their absence or feel the need to explain it. And that is the root of the problem. There is a complex pattern to the inferences we generate, but seeing it requires a contrast between figure and

ground; the geometry of a snowflake disappears against a white background. "Unnatural" inferences form the high contrast background necessary to see the complex geometry of the inferences that we do spontaneously generate. Yet these "unnatural" inferences are exactly the ones we don't produce. Without this background, the pattern can't be seen. As a result, we look neither for the pattern, nor for the mechanisms that generate it. And no one guesses that our central processes instantiate domain-specific grammars every bit as rich as that of a natural language (for more examples, see Table 5).

Table 5. Inferences that violate a grammar of social reasoning

(a) I want to help him because he has helped me so often in the past.
    I don't want to help him because whenever I'm in trouble he refuses to help me.
    *I want to help him because whenever I'm in trouble he refuses to help me.
    *I don't want to help him because he has helped me so often in the past.
(b) I love my daughter. If you hurt her, I'll kill you.
    *I love my daughter. If you hurt her, I'll kiss you.
(c) If I help you now, then you must promise to help me.
    *If I help you now, then you must promise to never help me.
(d) He gave her something expecting nothing in return; she was touched.
    *He gave her something expecting nothing in return; she was enraged.
(e) She paid $5 for the book because the book was more valuable to her than $5.
    *She paid $5 for the book because the book was less valuable to her than $5.

Hidden grammars

In the study of language, a grammar is defined as a finite set of rules that is capable of generating all the sentences of a language without generating any non-sentences; a sentence is defined as a string of words that members of a linguistic community would judge as well formed. In the study of reasoning, a grammar is a finite set of rules that can generate all appropriate inferences while not simultaneously generating inappropriate ones. If it is a grammar of social reasoning, then these inferences are about the domain of social motivation and

behavior; an "inappropriate" inference is defined as one that members of a social community would judge as incomprehensible or nonsensical.10

The cornerstone of any computational theory of the problem of language acquisition is the specification of a grammar. Discovering the grammar of a human language is so difficult, however, that there is an entire field - linguistics - devoted to the task. The task is difficult precisely because our linguistic inferences are generated by a "language instinct" (Pinker, 1994). One thing this set of specialized circuits can do is distinguish grammatical from ungrammatical sentences. But the rules that generate sentences - the grammar itself - operate effortlessly and automatically, hidden from our conscious awareness. Indeed, these complex rules are so opaque that just 40 years ago most linguists thought each human language - English, Chinese, Setswana - had a completely different grammar. Only recently have these grammars been recognized as minor variants on a Universal Grammar (UG): an invariant set of rules embodied in the brains of all human beings who are not neurologically impaired (Chomsky, 1980; Pinker, 1994).11

Universal grammars of social reasoning are invisible to cognitive scientists now for the same reason that UG was invisible to linguists for such a long time. The fact that the internal operations of the computational machinery in question are automatic and unconscious is a contributing factor, but the causes of invisibility go even deeper.

10 The similarities between a grammar of language and a grammar of social reasoning run even deeper. Context can make a seemingly ungrammatical sentence grammatical. To pick a standard linguistic example, "The horse raced past the barn fell" seems ungrammatical when "raced" is categorized as the main verb of the sentence, but grammatical if the context indicates that there are two horses. "Fell" is then recategorized as the main verb, and "raced" as a passive verb within a prepositional phrase. Context can have the same effect on statements that seem socially ungrammatical. "I'll give you $1000 for your gum wrapper" seems eccentric - ungrammatical - because gum wrappers are considered worthless. It violates a grammatical constraint of social contract theory: that (benefit to offerer) > (cost to offerer) (Cosmides & Tooby, 1989). To become grammatical, the context must cause the violated constraint to be satisfied. For example, recategorizing the gum wrapper as something extremely valuable (potentially justifying the $1000 payment) would do this: the statement seems sensible if you are told that the speaker is a spy who knows the gum wrapper has a microdot with the key for breaking an enemy code.

11 The term "innate" means different things to different scientific communities, but no person who uses the term means "immune to every environmental perturbation". UG is innate in the following sense: its intricate internal organization is the product of our species' genetic endowment in the same way that the internal organization of the eye is. Its neurological development is buffered against most naturally occurring variations in the physical and social environment. Certain environmental conditions are necessary to trigger the development of UG, but these conditions are not the source of its internal organization. As a result, all normal human beings raised in reasonably normal environments develop the same UG (e.g., Pinker, 1994). For an extensive discussion of how natural selection structures the relationships among genotype, phenotype and environment in development, see Tooby and Cosmides (1992).
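The gum-wrapper example in footnote 10 turns on a single constraint, (benefit to offerer) > (cost to offerer), and on how context changes the values that enter it. The following is a minimal sketch of that one check, not the authors' model. The `Offer` class, the function name, and all numeric values are assumptions introduced here for illustration; the only thing taken from the text is the inequality itself.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """An offer as valued by the person making it (illustrative names only)."""
    description: str
    benefit_to_offerer: float  # value the offerer places on what is received
    cost_to_offerer: float     # value the offerer places on what is given up


def is_socially_grammatical(offer: Offer) -> bool:
    """Apply the one constraint quoted in footnote 10: benefit > cost."""
    return offer.benefit_to_offerer > offer.cost_to_offerer


# Default context: the gum wrapper is worthless to the person offering $1000.
plain = Offer("$1000 for a gum wrapper",
              benefit_to_offerer=0.0, cost_to_offerer=1000.0)

# Spy context: the wrapper carries a microdot. The figure 1_000_000 is an
# invented placeholder meaning "far more valuable than $1000".
spy = Offer("$1000 for a gum wrapper",
            benefit_to_offerer=1_000_000.0, cost_to_offerer=1000.0)

print(is_socially_grammatical(plain))  # False
print(is_socially_grammatical(spy))    # True
```

The wording of the offer is identical in both cases. Only the recategorization of the wrapper changes, which is the sense in which context, rather than the sentence, restores grammaticality.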

Instinct blindness

UG is a small corner of hypothesis space; there are an indefinitely large number of grammars that are not variants of UG. To explain the fact that all natural languages fall within the bounds of UG, one must first realize that UG exists. To realize that it exists, one must realize that there are alternative grammars. But this last step is where our imagination stumbles.

The language instinct structures our thought so powerfully that alternative grammars are difficult to imagine. This is not an incidental feature of the language instinct; it is the language acquisition device's (LAD) principal adaptive function.12 Any set of utterances a child hears is consistent with an infinite number of possible grammars, but only one of them is the grammar of its native language. A content-free learning mechanism would be forever lost in hypothesis space. The LAD is an adaptation to combinatorial explosion: by restricting the child's grammatical imagination to a very small subset of hypothesis space - hypotheses consistent with the principles of UG - it makes language acquisition possible. Its function is to generate grammatical inferences consistent with UG without simultaneously generating inconsistent ones. To do this, the LAD's structure must make alternative grammars literally unimaginable (at least by the language faculty). This is good for the child learning language, but bad for the cognitive scientist, who needs to imagine these unimaginable grammars. Forming the plural through mirror reversal - so that the plural of "cat" is "tac" - is a rule in an alternative grammar. No child considers this possibility; the LAD cannot generate this rule. The cognitive scientist needs to know this, however, in order to characterize UG and produce a correct theory of the LAD's cognitive structure. UG is what; an algorithm is how. A proposed algorithm can be ruled out, for example, if formal analyses reveal that it produces both the mirror reverse rule and the "add 's' to a stem" rule.

Alternative grammars - and hence Universal Grammar - were difficult to discover because circuits designed to generate only a small subset of all grammatical inferences in the child also do so in the linguist. This property of the language instinct is crucial to its adaptive function. But it caused a form of theoretical blindness in linguists, which obstructed the discovery of UG and of the language instinct itself. One can think of this phenomenon as instinct blindness.

Discovering a grammar of social reasoning is likely to prove just as difficult as discovering the grammar of a language, and for exactly the same reasons.

12 As a side-effect, it can also solve problems that played no causal role in its selective history. For example, the LAD was not designed to support writing, but its properties made the design and spread of this cultural invention possible.
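The criterion for ruling out a proposed algorithm can be written down as a toy check. This is only a sketch under stated assumptions: the rule names, the `UG_CONSISTENT` set, and the `is_ruled_out` function are hypothetical stand-ins. The text says only that mirror reversal belongs to an alternative grammar and that an algorithm producing both rules is disqualified.

```python
from typing import Callable, Dict, Iterable

PluralRule = Callable[[str], str]


def add_s(stem: str) -> str:
    """The familiar rule: add 's' to a stem."""
    return stem + "s"


def mirror_reverse(stem: str) -> str:
    """The alternative-grammar rule from the text: reverse the stem."""
    return stem[::-1]


# Registry of candidate rules a proposed learning algorithm might produce.
RULES: Dict[str, PluralRule] = {
    "add_s": add_s,
    "mirror_reverse": mirror_reverse,
}

# Assumption for illustration: of the two rules above, only "add_s" is
# treated as consistent with UG.
UG_CONSISTENT = {"add_s"}


def is_ruled_out(rules_generated: Iterable[str]) -> bool:
    """A proposed algorithm fails if it generates any rule outside UG."""
    return any(name not in UG_CONSISTENT for name in rules_generated)


print(RULES["mirror_reverse"]("cat"))            # tac
print(is_ruled_out(["add_s", "mirror_reverse"]))  # True
print(is_ruled_out(["add_s"]))                    # False
```

The check is deliberately one-sided: it tests what a candidate algorithm must not generate, which is the part of the design that intuition tends to overlook.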

Yet there is no field, parallel to linguistics, that is devoted to this task; indeed, very few individuals even recognize the need for such a grammar, let alone such a field (for exceptions, see Cosmides, 1985; Cosmides & Tooby, 1989, 1992; Fiske, 1991; Jackendoff, 1992).

Our intuitions blind us not only to the existence of instincts, but to their complexity. The phenomenal experience of an activity as "easy" or "natural" often leads scientists to assume that the processes that give rise to it are simple. Legend has it that in the early days of artificial intelligence, Marvin Minsky assigned the development of machine vision to a graduate student as a summer project. This illusion of simplicity hampered vision research for years:

. . . in the 1960s almost no one realized that machine vision was difficult. The field had to go through [a series of fiascoes] before it was at last realized that here were some problems that had to be taken seriously. The reason for this misperception is that we humans are ourselves so good at vision. (Marr, 1982, p. 16)

Phenomenally, seeing seems simple. It is effortless, automatic, reliable, fast, unconscious and requires no explicit instruction. But seeing is effortless, automatic, reliable, fast, and unconscious precisely because there is a vast array of complex, dedicated computational machinery that makes this possible.

Most cognitive scientists don't realize it, but they are grossly underestimating the complexity of our central processes. To find someone beautiful, to fall in love, to feel jealous, to experience moral outrage, to fear disease, to reciprocate a favor, to initiate an attack, to deduce a tool's function from its shape - and a myriad other cognitive accomplishments - can seem as simple and automatic and effortless as opening your eyes and seeing. But this apparent simplicity is possible only because there is a vast array of complex computational machinery supporting and regulating these activities. The human cognitive architecture probably embodies a large number of domain-specific "grammars", targeting not just the domain of social life, but also disease, botany, tool-making, animal behavior, foraging and many other situations that our hunter-gatherer ancestors had to cope with on a regular basis. Research on the computational machinery responsible for these kinds of inferences, choices and preferences - especially the social ones - is almost totally absent in the cognitive sciences. This is a remarkable omission, from an evolutionary point of view. Instinct blindness is one culprit; extreme and unfounded claims about cultural relativity is another (e.g., Brown, 1991; Sperber, 1982; Tooby & Cosmides, 1992).

Anthropological malpractice

As a result of the rhetoric of anthropologists, most cognitive researchers have, as part of their standard intellectual furniture, a confidence that cultural relativity

is an empirically established finding of wide applicability (see discussion of the Standard Social Science Model in Tooby & Cosmides, 1992). Consequently, most scientists harbor the incorrect impression that there is no "Universal Grammar" of social reasoning to be discovered. According to this view, a grammar of social reasoning might exist in each culture, but these grammars will differ dramatically and capriciously from one culture to the next. In its most extreme form, the relativist position holds that the grammars of different cultures are utterly incommensurate - that there is no transformation that can map the rules of one onto the rules of another. If so, then these rules cannot be expressions of an underlying UG of social reasoning.

Among anthropologists, however, cultural relativism is an interpretation imposed as an article of faith - not a conclusion based on scientific data (Brown, 1991; Sperber, 1982; Tooby & Cosmides, 1992).13 Indeed, Maurice Bloch, a prominent member of the field, has complained that it is the "professional malpractice of anthropologists to exaggerate the exotic character of other cultures" (Bloch, 1977). To some degree, this is a self-legitimizing institutional pressure: why go long distances to study things that could be studied at home (Brown, 1991)? More importantly, however, anthropologists are just as oblivious to what is universally natural for the human mind as the rest of us. Their attention is drawn to what differs from culture to culture, not what is absent from all cultures or what differs from species to species. Drawing on their cognitive instincts, they understand, automatically and without reflection, much of what happens in other cultures. They know they can work out exchanges without language, or see a smile, a shared look, or an aggressive gesture and infer its meaning and its referent. Indeed, they operate within a huge set of implicit panhuman assumptions that allow them to decode the residue of human life that does differ from place to place (Sperber, 1982; Tooby & Cosmides, 1992).

13 For a history and discussion of how unsupported relativist claims gained widespread acceptance in the social sciences, see Brown (1991) and Tooby and Cosmides (1992).

The notion of universal human reasoning instincts - including social reasoning instincts - is completely compatible with the ethnographic record. It is more than empirically reasonable, however; it is a logical necessity, for the reasons discussed above. Indeed, without universal reasoning instincts, the acquisition of one's "culture" would be literally impossible, because one wouldn't be able to infer which representations, out of the infinite universe of possibilities, existed in the minds of other members of the culture (Boyer, 1994; Chomsky, 1980; Sperber, 1985, 1990; Tooby & Cosmides, 1992).

Instinct blindness is a side-effect of any instinct whose function is to generate some inferences or behaviors without simultaneously generating others. This is a

very general property of instincts, because combinatorial explosion is a very general selection pressure (for discussion, see Tooby & Cosmides, 1992). The fact that human instincts are difficult for human minds to discover is a side-effect of their adaptive function.

Many aspects of the human mind can't be seen by the naked "I" - by intuition unaided by theory. A good theory rips away the veil of naturalness and familiarity that our own minds create, exposing computational problems whose existence we never even imagined. The cognitive sciences need theoretical guidance that is grounded in something beyond intuition. Otherwise, we're flying blind.

Corrective lenses

There are various ways of overcoming instinct blindness. One of the most common is the study of non-human minds that differ profoundly from our own - animal minds and electronic minds, broody hens and AI programs. Linguists were awakened to the existence of alternative grammars by the creation of computer "languages", which are not variants of UG. These languages "made the natural seem strange", inspiring linguists to generate even stranger grammars. To do this, they had to escape the confines of their intuitions, which they did through the use of mathematical logic and the theory of computation. In William James's terms, they debauched their minds with learning.

The study of animal behavior is another time-honored method for debauching the mind - the one used by William James himself. Hermaphroditic worms, colonies of ant sisters who come in three "genders" (sterile workers, soldiers, queens), male langur monkeys who commit systematic infanticide when they join a troop, flies who are attracted to the smell of dung, polyandrous jacanas who mate with a male after breaking the eggs he was incubating for a rival female, fish who change sex when the composition of their social group changes, female praying mantises who eat their mate's head while copulating with him - other animals engage in behaviors that truly are exotic by human standards. Human cultural variation is trivial in comparison. Observing behaviors caused by alternative instincts jars us into recognizing the specificity and multiplicity of our own instincts. Observations like these tell us what we are not, but not what we are.

That's why theoretical biology is so important. It provides positive theories of what kinds of cognitive programs we should expect to find in species that evolved under various ecological conditions: theories of what and why. Evolutionary biology's formal theories are powerful lenses that correct for instinct blindness. In their focus, the intricate outlines of the mind's design stand out in sharp relief.

Dupre (Ed.. T. In J. (1981). 278-292. (1986). Cognition. L. Hillsdale. Doctoral dissertation..). & Harvey. Sex. R. Hawthorne. L. Department of Psychology. Deduction or Darwinian algorithms? An explanation of the "elusive" content effect on the Wason selection task. Tooby Atran. D. Chicago: University of Chicago Press. Rules and representations. Daly. NY: Aldine de Gruyter. J. Daly. Hirschfeld (Eds. Is the repeated prisoner's dilemma a good model of reciprocal altruism? Ethology and Sociobiology. In J. (1989). Journal of Abnormal Psychology.. Journal of Abnormal Psychology. & Oliver. R. Carey. J.). Cosmides.. P. Cosmides. (1988). S. Evolutionary psychology and the generation of culture. P. P. New York: Cambridge University Press. 381-398). Tooby (Eds. Cosmides. (1992). MA: MIT Press. & Tooby.H.. U. III. Barkow. (1989). L. Nisbett. N. The cognitive foundations of natural history. M. A. New York: Oxford University Press. Boston: Wadsworth. L. Cosmides. Brown. Bloch. Brothers.. (in press). & Gelman. 195-207. J. Leslie. 107-133. 547-565. & Wilson. Cosmides. New York: Basic Books. L. Boyer. Homicide. New York: Cambridge University Press. Origins of domain specificity: The evolution of functional organization. J. London B. 14. A.102 References L.W. E. In J. Case study: A computational theory of social exchange. Cognitive Psychology. (1991). K. Mapping the mind: Domain specificity in cognition and culture. (1990).). Does the autistic child have a "theory of mind"? Cognition.L. S.E. Daly. S.. & Tooby. Cosmides.. & Seyfarth.. Man. Domain-specific principles affect learning and transfer in children. University Microfilms #86-02206. (1994). R. Fox (Eds. M. 12. Concepts in Neuroscience. 187-276.L. (1980). 98. Science. New York: Oxford University Press. Cosmides. 10. Cognitive Science. 448-459. Human universals. (Eds. Cambridge. The adapted mind: Evolutionary psychology and the generation of culture. (1986). 1. Comparison and adaptation. (1985). M. L. Carey. Cheney. (1977). 
& Tooby.. M. (1985). R. 95. (1990). Baron-Cohen. R. L. L. Cosmides. Neonate cognition (pp. Gelman & L. Hodes. NJ: Erlbaum. 9. 27-51. S. Hillsdale. Chomsky. J. Proceedings of the Royal Society. (1984). The epigenesis of mind. M. & Lang. & Tooby. W.... New York: McGraw-Hill. New York: Columbia University Press.. Brown. J. evolution and behavior.. (1990). Pragmatic versus syntactic approaches to training deductive reasoning. D. Cognition. In S. Clutton-Brock.D. Cheng. 37-46. J. Cook. & J. The latest on the best: Essays on evolution and optimality. (1979). R.. M. 51-97. From evolution to behavior: Evolutionary psychology as the missing link. M. Part II. (Eds. (1985). S. 1390-1396. L. (1988). Axelrod. (1989). Boyd. 31. & Frith. (1994). Preparedness and phobia: Effects of stimulus content on human visceral conditioning. Mehler & R.. L. (1983). Harvard University. NJ: Erlbaum.. & Hamilton. The evolution of cooperation.) (1991). Ethology and Sociobiology. Berkeley: University of California Press. L. & Wilson. R. Cosmides. Cognitive adaptations for social exchange.. How monkeys see the world. 211. (1990). Holyoak. Constraints on semantic development. The evolution of cooperation. The adapted mind: Evolutionary psychology and the generation of culture. (1994). & Tooby. & Wilson. P. The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task. The naturalness of religious ideas. The social brain: A project for integrating primate behavior and neurophysiology in a new domain. Discriminative parental solicitude and the relevance of evolutionary ..)..J. 205. Axelrod. Cook. 293-328. The past and the present in the present. & Mineka. J. Barkow. & Tooby. 211-222.) (1992). Cosmides. Are humans good intuitive statisticians after all? Rethinking some conclusion of the literature on judgment under uncertainty. Observational conditioning of fear to fear-relevant versus fearirrelevant stimuli in rhesus monkeys.. M. 18. (1987). 21.

Oxford: Blackwell Scientific Publications. J. (1991). The extended phenotype. G. & Keil. 98. San Francisco: Freeman. E. 51-73). In S. R. Constraints children place on word meanings. U. & Luttrell. In S. A. N. J. & Hug.. In T. 185-210). Hoffrage.B. Gelman. E. M.).G. D. Marler. R.. K. (pp. Bootsin (Eds. 287-305. (1983). Frith. NJ: Erlbaum. & G. (1989). MA: MIT Press. Freeman. Simultaneous hermaphroditism. (1982). (1991). Cambridge. An introduction to behavioural ecology. Zentall & B. R.J. Gilhooly. 119-136. Weiskrantz (Ed. Mayr. Dawkins. Galef (Eds. Garcia.C. & Hirschfeld..) (1994). MA: MIT Press. L. St. U. 121. Structures of social life: The four elementary forms of human relations. & Over. Mineka. The epigenesis of mind. Biological constraints on the fear response. (1982). Leslie. (1989). Mechanisms of social reciprocity in three primate species: Symmetrical relationship characteristics or cognition? Ethology and Sociobiology. Hillsdale. Gould. 2. A. A. A. S.. M. (1988). K. Mapping the mind: Domain specificity in cognition and culture.. Gallistel. 9. Autism: Explaining the enigma. (1988). Gallistel. Leslie. Psychological Review.C. In L. D. & Lewontin. tit-for-tat. Cambridge.. . 9. Dawkins.. (1985). Probabilistic mental models: A Brunswikean theory of confidence.. Cambridge. Ethology and Sociobiology. Gelman. How to carry out the adaptationist program? The American Naturalist. Gazzaniga (Ed. Social learning and the acquisition of snake fear in monkeys. (1990). P. In K.). NJ: Erlbaum. cheating and perspective change. The necessity of illusion: Perception and thought in infancy. Cognition. and cognitive development.B. Fiske. G. F. Ohman. C. Facilitation of reasoning by realism: Effect or non-effect? British Journal of Psychology. 57-77. Reiss & R. Brown.I. Krebs. & Davies. Gould.M.. Concepts. New York: Cambridge University Press. McDougall. (1992). R. New York: Norton.M. 477-488. 
Vision: A computational investigation into the human representation and processing of visual information. Carey & R. The modularity of mind. (1979).A. 101-118. L. Theoretical issues in behavior therapy (pp. Gelman (Eds. Proceedings of the Royal Society.M.M. Cambridge. New York: Norton. R.. Categorization and naming in children: Problems of induction. The instinct to learn. F. (1986). Gigerenzer. S..Beyond intuition and instinct blindness 103 models to the analysis of motivational systems. & Evans. Ethology: The mechanisms and evolution of behavior.H. Keane. NJ: Erlbaum. 324-334.L.R. (1988). Jackendoff. 94. Keil. Luce. A. 506-528. MA: MIT Press.. & Kleinbolting. H. Oxford: W. 70.I. (1982). (1991).R. New York: Henry Holt. MA: MIT Press. Gelman (Eds. (Eds. The organization of learning..M. (1991). J. Oxford: Clarendon Press. kinds.J.). Fischer. 123-175). New York: Academic Press.T. J. Languages of the mind.L. Fodor. Oxford: Blackwell. Logie. The spandrels of San Marco and the Panglossian paradigm: A critique of the adaptationist programme. (1990).P. K. Boston: John W. Introduction to social psychology. Principles of psychology.. Cognitive Science. Hillsdale. 205. The cognitive neurosciences. U.. Hillsdale. C. Domain-specific reasoning: Social contracts. L. Carey & R. Social learning: Psychological and biological perspectives. Manktelow. 581-598.).B. Thought without language (pp. Marr. New York: Free Press. J.A. (1990).C. Markman. E. (1979). Deontic thought and the selection task.G. F. R. Cambridge. (1983). MA: MIT Press. MA: MIT Press. London B. Lines of thinking (Vol.G.H.R. In S.E. & Cook. & Ost. S.R. 43. W. Learning without memory. 412-426. Journal of Cognitive Neuroscience. 127-171. Psychological Review. (1989). Carey. (1990). In M. Markman. (1987). Gigerenzer. 14. Cambridge. Erdos (Eds. (1988). James.T. Lessons from animal learning for the study of cognitive development. The epigenesis of mind. E. 1).). de Waal. Chichester: Wiley. 

6 Why should we abandon the mental logic hypothesis?

Luca Bonatti*
Laboratoire de Sciences Cognitives et Psycholinguistique, 54, Boulevard Raspail, 75006 Paris, France
Philosophy Department, Rutgers University, PO Box 270, New Brunswick, NJ 08903-0270, USA

Abstract
Two hypotheses on deductive reasoning are under development: mental logic and mental models. It is often accepted that there are overwhelming arguments to reject the mental logic hypothesis. I review these arguments and claim that they are either not conclusive, or point at problems which are troublesome for the mental model hypothesis as well.

1. Introduction

An old and venerable idea holds that logic is concerned with discovering or illuminating the laws of thought. Its psychological corollary is that a system of logic in the mind underlies our thinking processes. This thesis fits very well with representational views of the mind according to which cognitive processes are largely proof-theoretical. Within such a framework, it is a thesis about the structure of the vehicle of internal representations. In a nutshell, it holds that reasoning consists of operations on mental representations, according to logical rules implemented in procedures activated by the forms of the mental representations. Even if the thesis has been around for centuries, there is still little convincing psychological evidence of the existence of a mental logic. Such evidence has mostly been accumulated in the last few years, and almost exclusively concerns propositional reasoning (Braine, Reiser & Rumain, 1984; Lea, O'Brien, Fisch, Noveck & Braine, 1990; Rips, 1983).

* Correspondence to: L. Bonatti, Laboratoire de Sciences Cognitives et Psycholinguistique, 54, Boulevard Raspail, 75006 Paris, France. The author is indebted to Martin Braine, Emmanuel Dupoux, Jerry Fodor, Jacques Mehler, and Christophe Pallier for comments on a first draft of this paper.

In the same years in which some results were beginning to appear, mental logic was seriously challenged by an alternative, mental models, mostly due to the work of Johnson-Laird and his collaborators.

Both hypotheses share the basic geography of cognition: the mental models hypothesis, too, is (inter alia) about the nature of the internal representations of deductive processes. They differ, however, on their supposed nature. Roughly, the mental model hypothesis claims that understanding a text consists of the manipulation of tokens representing concrete samples of entities in the world, and reasoning consists of the construction of alternative arrangements of tokens. No abstract rules should be needed to accomplish deduction. Thus, at least at first blush, while mental logic seems naturally to require a language of thought on whose formulas abstract rules apply, mental models seem to be able to dispense with it and substitute analog simulations for discrete manipulation of propositional-like objects (McGinn 1989).

In a very short time, the hypothesis had an enormous success, to the point that probably the words "mental models" are second only to "generative grammar" for their consequences within the cognitive science community. Originally, crucial aspects of the new hypothesis were left vague, and both its exact status and the feasibility of its claims were a puzzle (Boolos, 1984; Rips, 1986). Only recently has a substantial effort of formal clarification been undertaken (especially in Johnson-Laird & Byrne, 1991 and Johnson-Laird, Byrne, & Schaeken, 1992), but the task is still far from being accomplished (Bonatti, in press; Hodges, 1993). Nevertheless, vagueness notwithstanding, among psychologists an almost unanimous consensus has been reached on the death of mental logic and on the fact that reasoning is carried out by constructing mental models; nowadays the group of psychologists who doubt the truth of the mental model theory is on the verge of extinction. What precisely a mental model is seemed to be a question of secondary importance, if compared to the big revolution introduced by the theory.

A good part of this sweeping success, however, is due to the impressive list of problems the new hypothesis promised to solve. Let me list them. Mental models would:
(1) provide a general theory of deductive reasoning (Johnson-Laird, 1983a; Johnson-Laird & Bara, 1984, p. 3; Johnson-Laird & Byrne, 1991, p. x), and, in particular,
(1a) explain propositional reasoning (Johnson-Laird & Byrne, 1991, 1993; Johnson-Laird, Byrne, & Schaeken, 1992);
(1b) explain relational reasoning (Johnson-Laird, 1983b; Johnson-Laird & Byrne 1989, 1991, 1993);
(1c) explain the figural effect in reasoning (Johnson-Laird & Bara, 1984; Johnson-Laird & Byrne, 1991, Ch. 6);
(1d) explain syllogistic reasoning (Johnson-Laird, 1983a, Ch. 5; Johnson-Laird & Bara, 1984; Johnson-Laird & Byrne, 1991), including individual differences
(Johnson-Laird, 1983a, pp. 117-121) and the belief bias effect (Johnson-Laird & Byrne, 1991, pp. 125-126; Oakhill, Johnson-Laird, & Garnham, 1989);
(1e) explain reasoning with single and multiple quantifiers (Johnson-Laird, 1983a; Johnson-Laird & Byrne, 1991; Johnson-Laird, Byrne, & Tabossi, 1989);
(2) explain how logical reasoning is performed without logic (Byrne, 1991; Johnson-Laird, 1983a, Ch. 6; Johnson-Laird & Byrne, 1991);
(3) account for a vast series of linguistic phenomena, such as anaphors, definite and indefinite descriptions, pronouns and plausibility effects in language processing (Johnson-Laird, 1983a; Garnham, 1987);
(4) offer a theory of the structure of discourse (Johnson-Laird, 1983a, pp. 370-371; Garnham, 1987);
(5) explain the difference between implicit and explicit inferences (Johnson-Laird, 1983a, Ch. 6);
(6) "solve the central paradox of how children learn to reason" (Johnson-Laird, 1983a, p. 45);
(7) explain content effects in reasoning (Byrne, 1991, p. 77);
(8) offer an explanation of meaning (Johnson-Laird, 1983a, p. 397; McGinn, 1989);
(9) "readily cope with the semantics of propositional attitudes" (Johnson-Laird, 1983a, p. 430) and solve the problems presented by them (Johnson-Laird, 1983a, pp. 430-436);
(10) provide a solution to the controversy on the problem of human rationality (Johnson-Laird & Byrne, 1993, p. 332);
(11) solve the problem of how words relate to the world (Johnson-Laird, 1983a, pp. 402, 489; McGinn, 1989);
(12) elucidate the nature of self-awareness and consciousness (Johnson-Laird, 1983a, p. xi; Ch. 16, pp. 473-474).

Even the most benevolent reader, when confronted with a theory so rich in both philosophical consequences and empirical power, should have at least felt inclined to raise her critical eyebrows. Nevertheless, critical voices were confined to a "small chorus of dissenters", almost all tied to the "ardent advocates of rule theories" (Johnson-Laird & Byrne, 1991, p. ix).

In fact, with some patience and time, I think it can be shown that all the philosophical advantages claimed for mental models are unsupported propaganda, and that most of the psychological evidence is much less firm than generally admitted. But showing it is quite a long task. In this paper, I will confine myself to a modest task.

Another source of support for the mental model hypothesis came from a parallel series of arguments to the conclusion that the mental logic hypothesis is doomed to failure. I will
plainly go through the list of this second class of arguments and show that either they are not conclusive or, when reasonable, they point at problems which are troublesome for the mental model theory as well. The arguments follow in no particular order of importance.

2. Mental logic doesn't have the machinery to deal with meaning and cannot explain the role of content and context in understanding and reasoning

This is one of the major complaints against a mental logic. How could a formal theory enlighten us on such a clearly content-driven process as reasoning? In fact, the complaint is correct, as mental logic theorists recognize. For this reason, for mental logic theories a comprehension mechanism sensitive to pragmatic information drives input analysis (Braine et al., 1984; Braine & O'Brien, 1991; O'Brien, 1993). Though the comprehension principles guiding it are only sketched, there is a hypothesis on their role in the time course of reasoning. According to it, one should distinguish two separate processes involved in problem solving. The first one is comprehension; the second one is reasoning proper. After a first processing roughly delivering a syntactic analysis of a linguistic signal, the identification of its logical form and a first semantic analysis retrieving literal meaning, pragmatics and general knowledge help to select a particular logical form for the input signal. Afterwards, representations possibly sharply different from the first semantic analysis are passed on to a processor blind to content and pragmatics. The general picture suggested, with some integration, looks like the diagram in Fig. 1. So a theory of mental logic cannot, and does not intend to, explain the role of content in reasoning, though it may help to locate how and when content and pragmatics interact with reasoning proper.

[Figure 1. The place of pragmatics, comprehension mechanisms and reasoning proper (in double squares). Legend: A first semantic analysis is elaborated; then, with the aid of pragmatic information and world knowledge, the logical form of the input sentence is selected. Box labels are only partly legible: comprehension mechanism; rules (basic procedures plus heuristic strategies); fallback strategies.]

However, the thesis that "in contrast [to mental logic], the model theory has the machinery to deal with meaning" (Byrne, 1991, p. 77) is false. From this point of view, models are no improvement.

Models are supposed to be constructed either directly from perception, or indirectly from language. In the first case, no detailed account of how perception should generate models has been given. [Footnote 1: Sometimes it looks as if perceptual models in Marr's sense are considered to be equivalent to mental models in Johnson-Laird's sense (see Johnson-Laird, 1983a; Johnson-Laird et al., 1992, p. 421), but there are structural differences between the two constructs which make it difficult to accept the identification. To mention the most apparent one, perceptual models don't contain negation, but mental models do. Accordingly, for each perceptual model there is an infinite number of mental models corresponding to it. A perceptual model of John scratching his head is a mental model of John scratching his head, but also of John not scratching his leg, of John not running the New York Marathon, of Mary being late for a date, and so on.] For linguistic models, a sketch of the procedures for their construction exists: models are constructed from propositional representations via a set of procedures sometimes
called procedural semantics. Now, notice the following points.

First, procedural semantics presupposes the literal meaning of words and sentences, which have to be received as its input. Procedural semantics is essentially translation from mental representations to mental representations, not a function from mental representations to the world. As Johnson-Laird himself writes, "The reader should bear in mind that the present theory uses a procedural semantics to relate language, not to the world, but to mental models" (1983a, p. 248). But, then, if procedural semantics is not about literal meaning and logical forms, neither are mental models.

Second, the procedures that construct models do not operate properly on natural language sentences, but on the logical forms of propositional representations. For example (Johnson-Laird & Byrne, 1991, p. 170 ff.), when given as input a sentence like

The circle is on the right of the triangle

a parser will start working and after some crunching the following information will be placed on the top of its stack:

(The-circle-is ...) → Sentence ((1, 0, 0)(△)(○))

The output of the parser is a pair containing both the grammatical description of the input ("sentence") and its semantical evaluation (in this case, an array containing numerical coordinates specifying the interpretation of the spatial relation, and the interpretations of the definite descriptions). Only at this point will procedural semantics take over and construct a model out of the propositional representation of the sentence. In this case, the model will be:

△ ○

that is, an image of a triangle to the left of the circle. Thus procedural semantics presupposes logical forms. By the same token, procedural semantics can work only if the output of the parser is not ambiguous: for example, scope relations must be already straightened out. The sentence

(1) Every man loves a woman

must be parsed to yield either

(2) For all men x there is some woman y such that (x loves y)

or
(3) For some woman y all men x are such that (x loves y)

Only on the basis of one of them can procedural semantics yield a mental model. Thus the input to procedural semantics must be clear.

Third, the possibility of constructing the appropriate models of a text strictly depends on the expressive power of the logical forms on which procedural semantics operates. To continue with the previous example, there are interpretations of (1) which don't correspond to either (2) or (3), involving a generic reading of the indefinite description. While it is not clear that a mental model can express the difference between a generic woman and a specific woman, this much is clear: if the logical form is not rich enough to articulate such a distinction, then mental models cannot represent it either, since they come from expressible logical forms. Thus the input to procedural semantics must be rich.

Fourth, while the programs implementing the mental model theory described in Johnson-Laird (1983a) and Johnson-Laird et al. (1992) assume that the syntactic analysis of the input sentence plus word meaning is sufficient to determine its propositional content and logical forms, in a more natural setting the propositional content needed to construct the relevant mental models cannot be the first semantic analysis of the input, but the propositional content and the logical forms of the message it conveys. Now, by standard Gricean reasons the message conveyed in "Luca is a nice guy" in a text like

Q: Is Luca a good philosopher?
A: Well, let's say that Luca is a nice guy

has something to do with my ability as a philosopher, and not with how much people like me. So if we take seriously the proposal that mental models are the kind of structure we build when comprehending a text, it is this contextual message that they must retain. A similar point can be made for metaphors, analogies, and all the cases in which the hearer/reader gathers information from an utterance aided by her general world knowledge, her understanding of relevance in communication, and other pragmatic factors. Now, since procedural semantics is proposed as a set of procedures extracting models from propositional representations, clearly the propositional representations on which it has to act in order to build the right mental models are not the results of a first semantic analysis of input sentences retrieving their literal meaning, but the analysis of their message in context, which, therefore, has to be retrieved before models are constructed. Procedural semantics works once all the disambiguations due to context, scope phenomena and retrieval of the speaker's intentions have taken place.

To sum up, the input to procedural semantics presupposes both the literal meaning of the text and its logical form, and must be rich, clear, free from
structural ambiguities, and post-pragmatic. Thus when we begin to fill in the details, we come up with a sophisticated input analysis and we get the overall picture presented in Fig. 2, which, as far as the role of pragmatics and meaning is concerned, does not differ from the mental logic picture: just like mental logic, procedural semantics and mental models presuppose, and do not explain, a theory of how pragmatics affects the selection of the correct message a set of utterances carries in the relevant situation.

[Figure 2. The place of pragmatic and comprehension mechanisms in the mental model hypothesis. Flow: Input → Parsing → First and second semantic analysis → Procedural semantics → Mental models. Legend: After a first semantic analysis is elaborated, pragmatic information and world knowledge help to select the logical form of the input sentence.]

It could be objected that I am presenting a misleading picture, based on the algorithms implementing a small fraction of the mental model theory rather than on the theory itself. Algorithms are only a part of the story; the rest will come, with time. So Johnson-Laird et al. (1992) write:

The process of constructing models of the premises is, in theory, informed by any relevant general knowledge, but we have not implemented this assumption. (p. 425)

But such an "assumption" amounts to the solution to the frame problem, and the suspicion that it won't be implemented is more than warranted (Fodor, 1983). In any case, if the problem were solvable, it would still be the case that the retrieval of the relevant message would occur in the pre-modelic construction processes
selecting the right logical forms and propositional contents which are input to procedural semantics. Thus besides their name, models have no advantage over mental logic to explain the role of content in reasoning. They cannot explain literal meaning, nor meaning in situation, nor how pragmatics and general knowledge affect interpretation, and they don't seem to have the adequate structure to do it.

In fact, there is a litmus test for sensitivity to content. The natural understanding of entailment seems to require a connection in content between antecedent and consequent. But the paradoxes of material implication allow false arbitrary antecedents to imply arbitrary consequents, regardless of their contents and even of their truth values. So if a theory of reasoning licenses them, it surely can't be advertised as the model to imitate for sensitivity to content in any of the relevant senses of "content". Now, while in Braine and O'Brien's (1991) logical theory of implication the paradoxes are not available as theorems, mental models allow one to derive them as valid inferences (Johnson-Laird & Byrne, 1991). The reason is pretty clear. The mental model theory of connectives mainly consists of a variation on truth tables, and truth tables are only sensitive to truth values, not to content connections or relevance.

3. There is no mental logic because people make fallacious inferences

People often reach conclusions which, if judged according to the canons of standard logic, are fallacious. And this should be a problem for a mental logic.

The most glaring problem is that people make mistakes. They draw invalid conclusions, which should not occur if deduction is guided by a mental logic. (Johnson-Laird, 1983a, p. 25)

In less sophisticated versions, the argument notices that undergraduates make mistakes, or that they make more mistakes than what the average individual should innately know according to the logical competence mental logic attributes to people (Churchland, 1990, p. 283), and, worst of all, they show repeated resistance to the teacher's efforts to correct them (Bechtel & Abrahamsen, 1991, p. 168 ff.).

In fact, mistakes come in different classes. They may be due to cognitive components not engaging reasoning proper, such as the comprehension stage or strategies of response selection, to performance failures, or to faulty competence. Any errors due to pre-deductive comprehension mechanisms, or post-deductive response selection strategies, can be accommodated by the two hypotheses roughly in the same way: the existence of such errors doesn't count against mental logic any more than it counts against mental models.

Performance mistakes are explained away by mental models by indicating how models are built and handled by mechanisms not proprietary to reasoning: mostly, mechanisms of working
memory storage and retrieval. A system based on mental logic can account for them in the same way.

Errors of competence (as it were, directly generated by how the reasoning box is) are a more delicate matter. The question is to decide with respect to which point of reference they are errors. Does failure to apply excluded middle count as an error? Does the absence of reasoning schemata corresponding to material implication count? Classical logic, or, for that matter, any alternative logic, cannot be a favored point of reference without further justifications. One major task of a psychological theory of deductive reasoning is to characterize what people take the right implications to be starting from certain premises, under ideal conditions. What could count as a systematic error in this context? Previous assumptions on the nature of rationality must be exploited. It can be argued, for example, that it is rational to proceed from truths to truths. On this basis, invalid reasoning processes could count as mistakes. If it could be shown that under ideal conditions people respond erratically to identical problems, or embody a rule which brings about a systematic loss of truths, then it may be said that subjects make mistakes in point of competence regardless of the compliance of natural logical consequence to classical, or other, logics.

But if this were the case, mental models would be in a worse position than mental logic. It is possible (though not desirable) to account for systematic errors within a mental logic framework by indicating which rules (if any) induce systematic violations of the selected normative model, but the tentative set of rules proposed for model construction is meant to be truth preserving in principle. Thus it is puzzling to figure out how models might account for purported systematic violations: errors in point of competence would be an even deeper mystery for the mental model hypothesis. As of today the algorithms proposed to implement logical reasoning by models are either psychologically useless or ill defined (Bonatti, in press; O'Brien, Braine & Yang, in press), so it is difficult to give a definite judgement on this issue.

4. There is no mental logic because higher-order quantifiers are not representable in first-order logic, and yet we reason with them

This argument has been considered "the final and decisive blow" to the doctrine of mental logic (Johnson-Laird, 1983a, p. 141). According to Barwise and Cooper (1981), expressions such as "More than half of" or "Most" are sets of sets, and therefore an adequate logic for natural language needs to extend beyond first order. The argument from this proposal to the rejection of mental logic runs as follows:

[Higher-order calculus] is not complete. If there can be no formal logic that captures all the valid
again. so be it. There is no evolutionary explanation of the origin of mental logic Another alleged argument against mental logic concerns its origin. This failure is a final and decisive blow to the doctrine of mental logic. it may backfire. Barring such arguments. But an argument is needed to ask for completeness as a constraint over a mental logic. p. It would be a very interesting empirical discovery to find out that. it is not complete. So what can possibly be wrong in using higher-order logic? We are told. in the absence of evidence that natural reasoning is complete. so it must be attached a certain importance. Johnson-Laird & Bara. italics mine) The argument has often been repeated (see. 1989). 15). To accept that there is a mental logic seems to lead to the . 1991. that any theory that assumes that the logical properties of expressions derive directly from a mental logic cannot give an adequate account of those that call for a higher-order predicate calculus. would be irrelevant. p. Johnson-Laird & Byrne. In fact. Johnson-Laird et al. Neither should they: such an advantage. It is a remarkable fact that natural language contains terms with an implicit "logic" that is so powerful that it cannot be completely encompassed by formal rules of inference. blame the incompleteness of a higher-order mental logic system as if the mental model counterproposal were complete. But the only fragment for which a psychological implementation has been proposed . say. then a fortiori there can be no mental logic that does either. It would be desirable that subjects reason consistently. Even more basic logical properties cannot be granted a priori. to presuppose that our reasoning system is consistent requires an argument. because we can decide what we want from it. the "final and decisive blow against mental logic" blows up. for example. (Johnson-Laird.propositional reasoning-is not even valid. 
A bland version of it simply claims that there is no evolutionary explanation of mental logic. and if patterns of inference are required that can be better formalized in second-order logic. or completeness. 1983a. 5. and it is difficult to see what it would look like. and this is enough to reject the theory (Cosmides. A richer version runs as follows. p. Johnson-Laird & Byrne. Models have no advantage over mental logic on the issue of completeness. But finding out how people reason is an empirical enterprise. It follows.Why should we abandon the mental logic hypothesis? 119 deductions. a subject's system for propositional reasoning is complete. 81. The question is to figure out why. but it's not enough that we want it to be so. but. We may impose constraints on a logical system by requiring that it possesses certain logical properties such as consistency. 1990. Such objection makes sense only if one presupposes that a mental calculus must be complete. 6. 140-141. The nature of the representational device in which mental processes are carried out is an empirical question. pp. as everybody hopes to discover that under ideal conditions they do. 1984. of course.

admission that most of our reasoning abilities are innate. Nativism, in general, cannot be a problem: everybody has to live with it, and the only issue is whether you like it weaker or stronger. But there should be something specifically wrong with nativism about mental logic: there is no evolutionary explanation for its origin:

By default, it seems that our logical apparatus must be inborn, though there is no account of how it could have become innately determined (Johnson-Laird, 1983a, p. 40).

The moral that Fodor drew is an extreme version of nativism: no concept is invented; all concepts are innate. Alas, any argument that purports to explain the origins of all intellectual abilities by postulating that they are innate merely replaces one problem by another. No one knows how deductive competence could have evolved according to the principles of neo-Darwinism. (Johnson-Laird, 1983a, pp. 142-143)

So intractable is the problem for formal rules that many theorists suppose that deductive ability is not learned at all. It is innate. Fodor (1980) has even argued that, in principle, logic could not be learned. The difficulty with this argument is not that it is wrong, although it may be, but that it is too strong. It is hard to construct a case against the learning of logic that is not also a case against its evolution. If it could not be acquired by trial-and-error and reinforcement, then how could it be acquired by neo-Darwinian mechanisms? (Johnson-Laird & Byrne, 1991, p. 204)

It is first worth noticing that the argument is meant to apply to cognition, and only to very restricted kinds of cognitive abilities. If you try to generalize it beyond this domain, it becomes flatly absurd. For the given premise is that Darwinian mechanisms are a sort of trial-and-error and reinforcement mechanisms applied to the species. Its generalization says: for any x, if x cannot be acquired by trial-and-error and reinforcement, then how could it be acquired by a neo-Darwinian mechanism? Now take a non-cognitive phenomenon and substitute it for x: breathing cannot be acquired by trial-and-error and reinforcement, so how did the species acquire the ability to breathe? That doesn't work. And neither does it work for most innate cognitive abilities, even restricting its field of application. Try with colors, or perceptual primitives: the ability to recognize colors (or any perceptual primitive) cannot be acquired by trial-and-error and reinforcement, so how could the ability to recognize colors be acquired by neo-Darwinian mechanisms? This doesn't work either. So I assume that the argument is really targeted against mental logic.

Second, notice that there are at least three different questions one may raise. What is the logical syntax of mental processes? What logical system underlies reasoning abilities? What concepts is the mind able to entertain, whether innately or by experience? The above argument does not keep them separate, yet they may have radically different answers. For example, an organism may be innately endowed with the syntax of first-order logic, but it may keep changing its logical system (for simplicity, the set of its axioms) by flip-flopping an axiom, and at the same time may need to learn any concept by experience. Such an organism would have an innate logical syntax, but no innate logic or innate concepts. Or else, an organism may be endowed with an
Why should we abandon the mental logic hypothesis? 121

innate logical syntax and an innate logic, but may need experience to acquire contentful concepts. The arguments for or against nativism are quite different in the three cases.

I will assume that the above argument is really targeted against nativism of a system of logic. Then, it can be reconstructed in the following way. If there is a mental logic, an account is due of how it is acquired. Since there is no theory of its acquisition, it must be assumed that the logical system - not just its syntax - is innate. But, alas, this claim is unsupported because there is no evolutionary story on how such a system gets fixated. Thus the doctrine of mental logic has to be rejected.

The short answer to such an argument (in its bland and its rich forms) is: too bad for evolutionary explanations. The argument presupposes that there must be an evolutionary explanation of how deductive abilities are fixated. What would it look like? For the much clearer case of language, evolutionary explanations are uninformative. Whether a mutation endowing humans with linguistic abilities concerns the structures of the organism or its functions, whether language has been a direct mutation, or a byproduct of another mutation, under what metric it turned out to be advantageous: these are unanswered questions. This is a general problem concerning the application of evolutionary concepts to cognition. Lewontin specifically makes this point for problem solving:

    . . . generalized problem solving and linguistic competence might seem obviously to give a selective advantage to their possessors. But there are several difficulties. First, human cognition may have developed as the purely epiphenomenal consequence of the major increase in brain size, which, in turn, may have been selected for quite other reasons. . . . Second, even if it were true that selection operated directly on cognition, we have no way of measuring the actual reproductive advantages. . . . Fourth, the claim that greater rationality and linguistic ability lead to greater offspring production is largely a modern prejudice, culture- and history-bound. . . . The problem is that we do not know and never will. We should not confuse plausible stories with demonstrated truth. There is no end to plausible story telling. (Lewontin, 1990, pp. 244-245)

And there is no reason to ask for mental logic what does not exist and might not exist for other, better-known, cognitive domains.

The long answer requires a reflection on the state of evolutionary explanations of cognitive mechanisms. The quest for a Darwinian explanation of cognitive evolution is founded at best on an analogy with biological evolution, and analogies may be misleading.

But let us suppose that one should seriously worry for the lack of a Darwinian explanation of how innate logic has been selected. Again, here one should sense the kind of comparative advantage that the mental model hypothesis gains. The argument seems to presuppose that, as opposed to the case of mental logic, either (a) the ability of building mental models is not innate but learned, or (b) if it is not learned, there is an evolutionist explanation of its origin. Alternative (a) is empty. There is no learning theory for models and it is

unlikely that any future such theory will bring about substantial economies in nativism, since most of the structures needed for problem solving are the same regardless of which theory turns out to be correct. Without (a), alternative (b) assumes the following form: an innate mechanism for building mental models gives an evolutionary advantage that an innate mental logic doesn't give. But evolutionary explanations are not so fine-grained as to discriminate between our capacity to construct models as opposed to derivations. If there is any such explanation, it will work for both; if there isn't one for mental logic, there isn't one for mental models either.

6. Mental logic cannot explain reasoning because people follow extra-logical heuristics

Often heuristics of various sorts guide human responses even in deduction, such as reliance on the most frequent interpretation, or on previous experience, or on previously held beliefs. But it is unclear how this counts against mental logic. Models need heuristics as much as logical rules do. For example, if a premise has different possible interpretations, an order is needed to constrain the sequence of constructed models (Galotti, Baron & Sabini, 1986). Such an order too may depend on heuristics having nothing to do with models proper.

But there may be something more to the argument. It may be argued that heuristics don't pose any special problem to model-based theories of reasoning, whereas they do for logic-based theories. At least in principle, models may help to solve the problem of implicitness: certain processes may be externally described by explicit rules which nevertheless are not explicitly represented in the mental life of an organism. Solution: the rules supervene to the structure of models. For example, a model for the sentences "a is to the right of b" and "b is to the right of c" allows us to derive "a is to the right of c" with no explicit rule to that effect (see Johnson-Laird, 1983a; Johnson-Laird & Byrne, 1991). In this case, transitivity is an emerging feature of the structure of the model. Analogously, it may be argued that also other apparent rule-following behaviors such as strategies are emerging features of models. Just like Dennett's queen moving out early, heuristics can be an epiphenomenon of the structure of models, whereas rule-based systems must express them explicitly.

However, often subjects reason by following heuristics that they can perfectly spell out and that are not accounted for by the structure of models (see, for example, Galotti, 1989), and this squares very badly with a radical rule epiphenomenalism. But the other side of the coin is the problem of explicitness: how could a system represent the information that is explicitly represented? This is no difficulty for mental logic, but how could a heuristic be explicitly represented within models? Tokens and possibly some of their logical relations are explicit in models, but not

performatives. Models don't contain information specifying the order in which certain operations have to be executed, but only the result of such operations. So while a propositional-like system doesn't have the problem of explicitness, models may have it.

7. Mental logic cannot offer a theory of meaning for connectives

    In fact, the formal rules for propositional connectives are consistent with more than one possible semantics . . . Hence, although it is sometimes suggested that the meaning of a term derives from, implicitly reflects, or is nothing more than the roles of inference for it, this idea is unworkable . . . (Johnson-Laird et al., 1992, p. 420)

But truth tables (and thus models) don't have such a problem, since they "are merely a systematic way of spelling out a knowledge of the meanings of connectives" (Johnson-Laird et al., 1992, p. 420).

Johnson-Laird et al. refer to an argument presented in Prior (1960, 1964). But an aspect of it has been forgotten. Prior argued that rules of inference cannot analytically define the meaning of the connectives they govern. If there were nothing more to the meaning of a connective than the inferences associated to it, then the connective tonk could be defined, with the meaning specified by the following rules:

    (1) From P, derive P tonk Q
    (2) From P tonk Q, derive Q

and with tonk we could obtain the following derivation:

    2 and 2 are 4
    Therefore, 2 and 2 are 4 tonk 2 and 2 are 5
    Therefore, 2 and 2 are 5.

Prior's argument is a challenge to a conceptual role semantics. If meaning is inferential role, how to avoid tonk? According to Prior, tonk shows that explicit definitions cannot give the meaning to a term on the ground of the analytical tie between the definiens and the definiendum, but can at most correspond to a previously possessed meaning: we see that certain rules of inferences are adequate for "and" because we know its meaning and judge the adequacy of the rules with respect to it. We can perfectly introduce a sign for tonk governed by the above rules and have a purely symbolic game running. But games with rules and transformations of symbols don't generate meaning: "to believe that anything of this sort can take us beyond the symbols to their meaning, is to believe in magic" (Prior, 1964, p. 191). The difference between "and" and tonk is that in the first case the rules correspond to the (previously held) sense of the word "and": they

don't confer it its meaning, but are "indirect and informal ways" (Prior, 1964, p. 192) to clarify it. But in the second case there is no prior sense to appeal to. We can define a class of signs standing for conjunction, and a class of signs standing for contonktion, but the latter is an empty class. There are conjunction-forming signs, because there is a conjunction. There are no contonktion-forming signs, because there is no contonktion and the explicit introduction of a sign for it does not give life to a new connective.

So Prior's argument goes. One way to read it is that rules can't give a symbol its meaning, but something else can: namely, truth tables in a metalanguage. This seems to be the interpretation adopted by Johnson-Laird et al. (1992) when they claim that mental logic cannot explain the meaning of connectives, but truth tables can.

Prior (1964) remarked that explicitly defining connectives in terms of truth tables did not change the point of his criticism. In his view, there was "no difference in principle between [rules of inferences and truth tables]" (Prior, 1964, p. 192). Instead of using rules, we can define a conjunction-forming sign by using the familiar truth table, but this will not give conjunction its meaning. Truth tables are in no better position than rules to generate meanings. If they apparently don't suffer from tonkitis, they suffer from another equally worrisome disease. In fact, Prior noticed, any formula of arbitrary length with the same truth table will turn out to be a conjunction-forming sign, and moreover, so will formulas involving non-logical conceptions such as "P ett Q", which is the abbreviation for "Either P and Q, or Oxford is the capital of Scotland" (Prior, 1964, p. 194). The point of this further facet of the argument is that truth tables identify a much broader class of signs than conjunction, signs that are understood on the basis of the understanding of conjunction (see Usberti, 1991). We might try to eliminate all the unwanted signs which would be defined by the truth table for conjunction by saying that the table defines the meaning of the shortest possible sign for conjunction, and not of the other signs. We would probably be happy with this solution. But, he argued, we would accept it because we understand that such a characterization captures the meaning of the conjunction. And if we wanted to resort again to formal games, then tonkitis would reappear, since a (symbolic) truth table game defining a contonktion-forming sign is easy to find: tonk "is a sign such that when placed between two signs for the propositions P and Q, it forms a sign that is true if P is true and false if Q is false (and therefore, of course, both true and false if P is true and Q is false)" (Prior, 1964, p. 193).

We can now leave Prior and touch on the real problem. If we grant that explicit rules, or truth tables, don't define the meaning of the logical symbols, but are accepted on the basis of their correspondence to some pre-existent meaning we

attach to connectives and quantifiers, we still have to explain what the source of our intuitions about the meaning of connectives and quantifiers is, because if thinking that a game of symbols can take us beyond the symbols to their meanings is magic, as Prior said, it is equally magic to think that the meaning of logical symbols comes from nowhere. For Johnson-Laird, Byrne, and Schaeken, truth tables are merely "a systematic way of spelling out a knowledge of the meanings of connectives". But in general this is false. There are 16 binary truth tables: only some of them do, or seem to, spell out the meaning of binary connectives; others clearly don't. Why is it so? Why do we feel that the truth table for the conjunction reflects the meaning of the conjunction, whereas the classical truth table for the implication doesn't reflect the meaning of natural implication, and the anomalous truth table for tonk can't reflect the meaning of a new connective?

Nothing seems to block the following possibility. When I see somebody who reminds me of my brother, one of the possibilities is that it is my brother. So when I see a set of rules for the conjunction and I think that it adequately expresses what I mean by a conjunction, one of the possibilities is that I find that resemblance because the rules are the exact expression of the patterns of inferences of a logical connective in the mind, from which the explicit rules are a clone copy. In this case, there is nothing more to the meaning of the term than the rules themselves. At the same time, when I see the truth table of material implication I realize that it does not spell out the meaning of natural implication because the rules governing natural implications are not reflected in it, and when I see the rules of inference - or the truth table - for tonk, I have no intuition about their adequacy because there is no logical connective for tonk in the mind. Contonktion cured.

Intuitions, however, are not good guides. It is not enough to say that conjunctions have a meaning because they seem to correspond to rules in the mind but contonktions don't because they don't titillate our intuitions. There are lots of logical operators that may not have any straightforward correspondence with natural language, and yet are computed in retrieving the truth conditions of natural language sentences - consider, for example, focus, or quantifiers over events. If a semanticist presented us with a set of rules for them, we would probably not have the same immediate intuition we feel for conjunction. This is where a theory of mental logic comes in. A developed theory of mental logic offers empirical reasons to show that conjunctions are in the mind, while contonktions are not. If such a theory can be worked out (and a tiny part of it already exists), then mental logic can be the basis of a theory of meaning for natural connectives. For the moment, we are very far from having such a complete theory. The present point is simply that no argument exists to hamper its development.
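The formal games discussed in this section can be made concrete in a short sketch. The Python below is an illustration only, not something found in Prior or in Johnson-Laird et al.: the encoding of formulas as tuples and every name in it (`tonk_intro`, `tonk_elim`, `truth_table`, `contonktion_values`, and so on) are assumptions introduced here. It plays the tonk game with rules (1) and (2), enumerates the 16 binary truth tables, shows that a longer formula and "P ett Q" receive the same table as conjunction, and shows that the truth-table version of tonk demands two values on one row.

```python
from itertools import product

# Part 1: tonk as a purely symbolic game.
# A formula is an atomic sentence (a string) or a triple ("tonk", left, right).
# This encoding is a hypothetical choice made for the illustration.

def tonk_intro(p, q):
    """Rule (1): from P, derive P tonk Q, for any Q whatsoever."""
    return ("tonk", p, q)

def tonk_elim(formula):
    """Rule (2): from P tonk Q, derive Q."""
    operator, _left, right = formula
    if operator != "tonk":
        raise ValueError("rule (2) applies only to tonk formulas")
    return right

premise = "2 and 2 are 4"
conclusion = tonk_elim(tonk_intro(premise, "2 and 2 are 5"))

# Part 2: a truth table picks out a class of signs, not a meaning.
VALUATIONS = list(product([True, False], repeat=2))  # the four rows (P, Q)

def truth_table(sign):
    """Column of truth values that a binary sign yields, row by row."""
    return tuple(sign(p, q) for p, q in VALUATIONS)

# Every way of filling a four-row column: the 16 binary truth tables.
ALL_TABLES = set(product([True, False], repeat=4))

OXFORD_IS_CAPITAL_OF_SCOTLAND = False

def conjunction(p, q):
    return p and q

def longer_formula(p, q):
    # not (not P or not Q): a longer sign with the same table
    return not ((not p) or (not q))

def ett(p, q):
    # "Either P and Q, or Oxford is the capital of Scotland"
    return (p and q) or OXFORD_IS_CAPITAL_OF_SCOTLAND

# Part 3: the truth-table game for tonk.
def contonktion_values(p, q):
    """Values demanded on a row: true if P is true, false if Q is false."""
    demanded = set()
    if p:
        demanded.add(True)
    if not q:
        demanded.add(False)
    return demanded

print(conclusion)       # prints "2 and 2 are 5"
print(len(ALL_TABLES))  # prints 16
print(truth_table(conjunction) == truth_table(longer_formula) == truth_table(ett))
print(len(contonktion_values(True, False)))  # two values demanded on one row
```

Nothing in the code distinguishes the conjunction table from the other fifteen, or the conjunction sign from "ett"; whatever does that work has to come from outside the symbolic game, which is the point pressed in the text.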

8. There is no mental logic because valid inferences can be suppressed

This recent argument is based on the so called "suppression of valid inferences" paradigm. By modifying a paradigm used by Rumain, Connell, and Braine (1983), Byrne (1989) set up an experiment in which to premises such as

    If she meets her friend she will go to a play
    She meets her friend

an extra premise was added, transforming the argument in

    If she meets her friend she will go to a play
    If she has enough money she will go to a play
    She meets her friend

and she showed that in this case the percentage of subjects applying modus ponens drops from 96% to 38%. Mental model theorists attributed a considerable importance to this result. It shows, they claimed, that also valid deductions as strong as modus ponens can be blocked:

    Models can be interrelated by a common referent or by general knowledge. Byrne (1989) demonstrated that these relations in turn can block modus ponens. . . . (Johnson-Laird et al., 1992, p. 326)

and as a consequence that by their own argument, rule theorists ought to claim that there cannot be inference rules for (valid deduction):

    The suppression of the deduction shows that people do not have a secure intuition that modus ponens applies equally to any content. Yet, this intuition is a criterion for the existence of formal rules in the mind. (Johnson-Laird & Byrne, 1991, p. 83)

But no argument is offered to ensure that modus ponens is really violated, or to justify the claim that this result supports the mental models hypothesis. If we assume that deductive rules apply not to the surface form of a text, but to its integrated representation, then subjects may be led by pragmatic reasons to construe the two premises

    If she meets her friend she will go to a play
    If she has enough money she will go to a play

as a single

    If (she meets her friend and she has enough money) she will go to a play

and therefore when provided only with the premise "She meets her friend", they don't know about the truth of the conjunctive antecedent and correctly refuse to use modus ponens. In other cases, also studied by Byrne, when subjects are given arguments such as

    If she meets her friend she will go to a play
    If she meets her mother she will go to a play
    She meets her friend

they do conclude that she will go to a play. This may be because subjects compose the premises "If A then B" and "If C then B" as a single "If (A or C), then B", and knowing one of the disjuncts of the composed antecedent suffices to correctly apply modus ponens. Thus, under this interpretation, there is no suppression of valid inferences: simply, people tend to construct a unified representation of a text which may itself be governed by formal rules of composition.

It may be replied that my response to the suppression argument puts the weight of the explanation on pre-logical comprehension processes, rather than on deduction proper, and that mental logic theorists have no account of such processes. This wouldn't be necessary for models, because they "have the machinery to deal with meaning". But I have shown that such a claim is false. Models too rely on pragmatic comprehension mechanisms, and don't explain them. If model theorists want to explain why people draw the inference in one case and not in the other, they have to say that in one case a model licensing the inference is constructed, and in the other a model not licensing the inference is constructed. To account for why it is so, they offer no explanation.

9. Conclusions

"Yes, but mental logic has had its shot. It has been around for centuries and nothing good came out of it. It's time to change."

Often the contrast between the long history of mental logic and its scarce psychological productivity is taken as a proof of its sterility. In fact, this impression derives from a mistake of historical perspective. The idea is very ancient, but the conceptual tools needed to transform it into the basis for testable empirical hypotheses are very recent. For centuries, logic too remained substantially unchanged, to the point that Kant considered it a completed discipline (1965, pp. 17-18). So there was no reason to change the conventional wisdom on the relations between logic and psychology: the former was stable because considered complete and the latter was stable because non-existent. When, with Frege, Russell and the neopositivists, logic as we mean it started being developed, the routes of logic and psychology separated.

Well beyond the 1930s, among the large majority of philosophically minded logicians, showing interest in psychological processes became a sort of behavior that well-mannered people should avoid. No substantial argument against the psychological feasibility of mental logic motivated this change of view. Rather, its roots have to be looked for in the general spirit of rebellion against German and English idealism from which twentieth-century analytic philosophy stemmed.

However, for independent reasons, the same conclusion became popular among experimental psychologists and was generally held until the early 1960s, both by behaviorists and by the new-look psychologists. There was, indeed, the Piagetian exception, but it does not count: Piaget's flirting with mental logic was never clear enough to become a serious empirical program (Braine & Rumain, 1983), and recent Piagetian-oriented investigations on mental logic (see Overton, 1990) have not helped towards a clarification.

It was again an impulse coming from logicians - not from psychologists - that put logic back in the psychological ballpark. Hilbert first directly expressed a connection between symbols and thought which could serve as a psychological underpinning for mental logic. For him, the fundamental idea of proof theory was "none other than to describe the activity of our understanding, to make a protocol of the rules according to which our thinking actually proceeds. Thinking, it so happens, parallels speaking and writing: we form statements and place them one behind another" (1927, p. 475). Yet Hilbert's intuition was not enough. Formal systems, as conceived by the axiomatic school, were the least possible attractive tool to investigate the psychology of reasoning. What was still missing to render logic ready for psychological investigation was on the one side a more intuitive presentation of formal systems, and on the other side a model of how a physical structure can use a formal system to carry out derivations. The first was provided by Gentzen, and the second by Turing.

Nevertheless, the distance between Gentzen's and Turing's ideas and a real psychological program should not be underestimated. Gentzen did introduce the systems of natural deduction with the aim to "set up a formal system which comes as close as possible to actual reasoning" (Gentzen, 1969, p. 68), but his reference to "actual reasoning" was merely intuitive. And Turing did offer the abstract model of how a physical mechanism could perform operations once considered mental along the lines suggested by Hilbert, but Turing's real breakthrough consisted of the realization that a computer can be a mind, namely, that certain kinds of properties once attributable only to humans can also be appropriately predicated of other physical configurations. Such insight, however, once again leaves the mechanisms and procedures by which the mind itself operates underspecified. It says that mental processes can be simulated, but it leaves it undetermined whether the simulandum and the simulans share the same psychology.

The further step necessary to the formulation of a psychological notion of mental logic came when functionalism advanced the explicit thesis that the

psychological vocabulary is computational vocabulary, and that the natural kinds described by psychology are not organisms, but computational devices. The change leading to this second step was gradual, and required a lot of philosophical work to be digested. Only then had logic and philosophy come to the right point of development to take the mental logic hypothesis seriously. We are now beyond the 1960s, and not in Aristotle's age. And another decade or more had to go before experimental techniques were sufficiently developed to begin asking nature the right questions in the right way. The works by Braine, Rips and their collaborators are the first attempts at elaborating mental logic in a regimented psychological setting. Thus the psychological history of mental logic is very recent. It is, in fact, roughly contemporary with the psychological history of the mental model hypothesis. This shouldn't come as a surprise: both needed largely the same conceptual tools to be conceived. Mental models are not the inevitable revolution after millennia of mental logic domination.

So, contrary to widespread assumptions, there are no good arguments against mental logic, be it in point of principle, or in point of history. Most psychologists have abandoned the program and married the mental models alternative, both for its supposed superiority in handling empirical data and for the overwhelmingly convincing arguments against mental logic. In fact, the case for mental models has been overstated under both counts. If a case against it and in favor of mental models can be made, it cannot rest on principled reasons, but on the formal and empirical development of the two theories. Given how little we know about the mind and reasoning, conclusions on research programs that only began to be adequately developed a few years ago are premature. Indeed, extending the mental logic hypothesis beyond propositional reasoning engenders formidable problems connected with the choice of an appropriate language to express the logical forms of sentences on which rules apply, the choice of psychologically plausible rules to test, and the choice of appropriate means to test them. Approaching these problems requires the close collaboration of psychologists, natural language semanticists and syntacticians. But these are problems, however hard, and not mysteries. Psychologists should keep playing the mental logic game.

References

Barwise, J., & Cooper, R. (1981). Generalized quantifiers and natural language. Linguistics and Philosophy, 4, 159-219.
Bechtel, W., & Abrahamsen, A. (1991). Connectionism and the mind. Oxford: Basil Blackwell.
Bonatti, L. (in press). Propositional reasoning by model? Psychological Review.
Boolos, G. (1984). On 'Syllogistic inference'. Cognition, 17, 181-182.
Braine, M.D. (1979). If then and strict implication: A response to Grandy's note. Psychological Review, 86, 158-160.

313-371).P. Journal of Memory and Language.). Garnham. P. Cognition.. W. R.M. Chichester: Ellis Horwood Ltd. Johnson-Laird. B. MA: MIT Press.. 21. 323-380. 16. (1991). Psychological Review. MA: MIT Press. MA: Harvard University Press 1967. (1989). The psychology of learning and motivation (Vol. (1990).N. Can valid inferences be suppressed? Cognition. Language and learning: The debate between Jean Piaget and Noam Chomsky (pp. Johnson-Laird. K. 28.. 418-439. J. I. (1991). Oxford: Basil Blackwell. Hillsdale. 263-340). Cognition. 105. Szabo (Ed. & Sabini.. In J.). & Byrne. London: Routledge & Keegan. (1980). Johnson-Laird. 99. (1989).. R. San Diego. J. Ravell (Ed. A theory of if: Lexical entry. 16. Johnson-Laird. (1989). 229-246).). (1992) Propositional reasoning by model. 69-84. (1990). Gentzen. K. & Rumain. 31. reprinted in S.N. Cosmides. (1927). Posner (Ed. New York: Wiley.N. & O'Brien. (1984). MA: MIT Press. Cognition. reasoning program. & Braine. Mental content. P. Meta-logical problems: Knights. Galotti. 182-203. CA: Academic Press. Foundations of cognitive science (pp. Johnson-Laird. In D.M. & Rumain. D.N. In M. Psychological Review. In J. Lewontin.D. Churchland. (1990). C.. In M. Braine. Byrne. 61-83.. 187-276. 39. P.P.. From Frege to Godel (pp. van Heijenoort (Ed.A. Modularity of mind. (1990).N. & Schaeken. M.C. Spatial descriptions and referential continuity.. In M. Psychological Bulletin. On the nature of explanation: a PDP approach. Martin Press. pp. Johnson-Laird. D. Fisch.J. Suppressing valid inferences with conditionals. Johnson-Laird.P.D.). Braine. MA: MIT Press. Johnson-Laird. Predicting propositional logic inferences in text comprehension. rules or mental models? Journal of Experimental Psychology.. pp.D. G. The evolution of cognition. Amsterdam: North Holland. Spatial reasoning. (1989). Smith (Eds. (1983). and pragmatic principles.. R. Hodges. In J. PiatteliPalmarini (Ed. R. Thinking and reasoning: psychological approaches (pp. 361-387.N. Cognition. 
(1993). Thinking as a skill.. Ehrlich..N. Journal of Verbal Learning and Verbal Behavior. & Johnson-Laird.M. 142-163). Osherson and E. Mental models. The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task. M. R. Hilbert.. & Byrne.. 29. Evans (Ed. 44-75).M. 18. D.B. Cambridge. (1986). (1993). . R. P. R. O'Brien. 31. Byrne. Thinking: an invitation to cognitive science (vol. Byrne. 36. & Byrne. Deduction. MA: Harvard University Press. General. (1989). Fodor. J. 469-499).N.. Critique of pure reason. 565-575. Cambridge. Forrest (1991). 3. R.. B.E. 296-306. M. M. I. 115. Psychological Review. The logical content of theories of deduction. knaves and Rips. Investigations into logical deduction. Mental models. S. K. P. B. Mental models as representations of discourse and text. P. Behavioral and Brain Sciences. Noveck. 68-128). pp.N. (1969). P. P. P. A. Byrne.N. NJ: Erlbaum. & Bara. Reiser.. Individual differences in syllogistic reasoning: deduction. Galotti. Johnson-Laird. Some empirical justification for a theory of natural propositional logic. 658-673. Approaches to studying formal and everyday reasoning. The foundations of mathematics. Cambridge.). R.130 L. (1991). Logical Reasoning. Syllogistic inference.A. Ill. 464-469). Cambridge. 96. (1989). P.. Cambridge. B. Cambridge. 1-61. 281-292). McGinn. (1983). 16-25. Lea. (1983a). The collected papers of Gehrard Gentzen (pp. (1965). J. J.). & Byrne. P. (1984). 16 353-354. Precis of Deduction. Bonatti Braine. (1987). Journal of Memory and Language. R. (1982). Reasoning by model: the case of multiple quantification. Emergent computation (pp.B. Cambridge MA: Harvard University Press. Behavioral and Brain Sciences. Fodor. L. CaarmichaeVs handbook of child psychology (Vol. P. 71-78. On the impossibility of acquiring 'more powerful' structures. New York: St. Baron. 331-351. (1983b).N. 98.).M. Johnson-Laird. (1989). P.. Kant. & Tabossi.

Oakhill, J., Johnson-Laird, P.N., & Garnham, A. (1989). Believability and syllogistic reasoning. Cognition, 31, 117-140.
O'Brien, D. (1993). Mental logic and irrationality: we can put a man on the moon, so why can't we solve those logical reasoning problems? In K.I. Manktelow & D.E. Over (Eds.), Rationality (pp. 110-135). London: Routledge.
O'Brien, D., Braine, M.D., & Yang, Y. (in press). Propositional reasoning by mental models? Simple to refute in principle and in practice. Psychological Review.
Overton, W. (Ed.) (1990). Reasoning, necessity and logic: Developmental perspectives. Hillsdale, NJ: Erlbaum.
Prior, A.N. (1960). The runabout inference ticket. Analysis, 21, 38-39.
Prior, A.N. (1964). Conjunction and contonktion revisited. Analysis, 24, 191-195.
Rips, L.J. (1983). Cognitive processes in propositional reasoning. Psychological Review, 90, 38-71.
Rips, L.J. (1986). Mental muddles. In M. Brand & R. Harnish (Eds.), The representation of knowledge and belief (pp. 258-286). Tucson: University of Arizona Press.
Rumain, B., Connell, J., & Braine, M.D. (1983). Conversational comprehension processes are responsible for reasoning fallacies in children as well as adults: It is not the biconditional. Developmental Psychology, 19, 471-481.
Usberti, G. (1991). Prior's disease. Teoria, 2, 131-138.

7 Concepts: a potboiler

Jerry Fodor*

Graduate Center, CUNY, 33 West 42nd Street, New York, NY 10036, USA
Center for Cognitive Science, Rutgers University, Psychology Building, Busch Campus, Piscataway, NJ 08855, USA

Abstract

An informal, but revisionist, discussion of the role that the concept of a concept plays in recent theories of the cognitive mind. It is argued that the practically universal assumption that concepts are (at least partially) individuated by their roles in inferences is probably mistaken. A revival of conceptual atomism appears to be the indicated alternative.

Introduction: the centrality of concepts

What's ubiquitous goes unremarked; nobody listens to the music of the spheres (or to me, for that matter). I think a certain account of concepts is ubiquitous in recent discussions about minds; not just in philosophy but also in psychology, linguistics, artificial intelligence, and the rest of the cognitive sciences; and not just this week, but for the last fifty years or so. And I think this ubiquitous theory is quite probably untrue. This paper aims at consciousness raising; I want to get you to see that there is this ubiquitous theory and that, very likely, you yourself are among its adherents. What to do about the theory's not being true (if it's not) - what our cognitive science would be like if we were to throw the theory overboard - is a long, hard question, and one that I'll mostly leave for another time.

The nature of concepts is the pivotal theoretical issue in cognitive science; it's the one that all the others turn on. Here's why:

* Correspondence to: J. Fodor, Center for Cognitive Science, Rutgers University, Psychology Building, Busch Campus, Piscataway, NJ 08855, USA.

Cognitive science is fundamentally concerned with a certain mind-world

either in cognitive science or in philosophy. semantic) properties of a creature's mental states are supposed to be sensitive to. and that these have both semantic and causal properties. or both about whether and in what sense it is committed to complex mental representations. however. Mental entities that exhibit both semantic and causal properties are generically called "mental representations". almost1 universal agreement that theories of this relation must posit mental states some of whose properties are representational. On all hands. particularly. and whose representational and causal properties are determined. For discussion. The caveat is because it's moot how one should understand the relation between main-line cognitive science and the Gibsonian tradition. This paper will not be concerned. The representational (or. the semantic properties of a node in a network are specified by the node's label. reliably comport with its utilities. however. with these issues in the metaphysical foundations of semantics. all the others (including. in general.2 The causal properties of a creature's mental states are supposed to determine the course of its mental processes. about how the representational/semantic properties of mental states are to be analyzed. provoked by Fodor and Pylyshyn (1988). Roughly. Concepts are the least complex mental entities that exhibit both representational and causal properties. as I'll often say. This account subsumes even the connectionist tradition which is. they are. 2 There is no general agreement. and. beliefs. by those of the concepts they're constructed from. or confused. other than tangentially. and some of whose properties are causal. desires and the rest of the "propositional attitudes") are assumed to be complexes whose constituents are concepts. Enter concepts. So even connectionists think there are concepts as the present discussion understands that notion. then. 
concepts serve both as the domains over which the most elementary mental processes are defined. and hence to carry information about. for example. and its causal properties are determined by the character of its connectivity.134 J. often unclear. See. see Fodor (1990) and references cited therein. Hence their centrality in representational theories of mind. Fodor relation. For recent discussion. eventually. Suffice it for present purpose that connectionists clearly assume that there are elementary mental representations (typically labeled nodes). simply taken for granted by psychologists when empirical theories of cognitive processes are proposed. in normal circumstances. Smolensky (1988) and Fodor and McLaughlin (1990). the character of its environment. There is. and theories that propose to account for the adaptivity of behavior by reference to the semantic and causal properties of mental representations are called "representational theories of the mind". wholly or in part. and as the most primitive bearers of semantic properties. l . the goal is to understand how its mental processes can cause a creature to behave in ways which. at present. see Fodor and Pylyshyn (1981). There is a substantial literature on this issue. the character of its behavior.

1. Ancient history: the classical background

The kind of concept-centered psychological theory I've just been sketching should seem familiar, not only from current work in cognitive science, but also from the philosophical tradition of classical British empiricism. I want to say a bit about classical versions of the representational theory of mind because, though their general architecture conforms quite closely to what I've just outlined, the account of concepts that they offered differs, in striking ways, from the ones that are now fashionable. Comparison illuminates both the classical and the current kinds of representational theories, and reveals important respects in which their story was closer to being right about the nature of concepts than ours. So, anyhow, I am going to argue.

Here's a stripped-down version of a classical representational theory of concepts. Concepts are mental images. They get their causal powers from their associative relations to one another, and they get their semantic properties from their resemblance to things in the world. So, for example, the concept DOG applies to dogs because dogs are what (tokens of) the concept looks like. Thinking about dogs often makes one think about cats because dogs and cats often turn up together in experience, and it's the patterns in one's experience, and only these, that determine the associations among one's ideas. Because association is the only causal power that ideas have, and because association is determined only by experience, any idea can, in principle, become associated to any other, depending on which experiences one happens to have. Classical ideas cannot, therefore, be defined by their relations to one another. Though DOG-thoughts call up CAT-thoughts, LEASH-thoughts, BONE-thoughts, BARK-thoughts and the like in most actual mental lives, there are possible mental lives in which that very same concept reliably calls up, as it might be, PRIME NUMBER-thoughts or TUESDAY AFTERNOON-thoughts or KETCHUP-thoughts. It depends entirely on how often you've come across prime numbers of dogs covered with ketchup on Tuesday afternoons.

So much by way of a reminder of what classical theorists said about concepts. I don't want to claim much for the historical accuracy of my exegesis (though it may be that Hume held a view within hailing distance of the one I've sketched; for purposes of exposition, I'll assume he did). But I do want to call your attention to a certain point about the tactics of this kind of theory construction - a point that's essential but easy to overlook.

Generally speaking, if you know what an X is, then you also know what it is to have an X. And ditto the other way around. No doubt, this applies to concepts. If, for example, your theory is that concepts are pumpkins, then it has to be a part of your theory that having a concept is having a pumpkin; and if your theory is that having a concept is having a pumpkin, then it has to be a part of your theory that pumpkins are what concepts are. I take it that this is just truistic.

Sometimes it's clear in which direction the explanation should go, and sometimes it isn't. So, for example, one's theory about having a cat ought surely to be parasitic on one's theory about being a cat; first you say what a cat is, and then you say that having a cat is just: having one of those. With jobs, pains, and siblings, however, it goes the other way round. First you say what it is to have a job, or a pain, or a sibling, and then the story about what jobs, pains and siblings are is a spin-off. These examples are, I hope, untendentious. But decisions about the proper order of explanation can be unobvious, important, and extremely difficult. To cite a notorious case: ought one first to explain what the number three is and then explain what it is for a set to have three members? Or do you first explain what sets are, and then explain what numbers are in terms of them? Or are the properties of sets and of numbers both parasitic on those of something quite else (like counting, for example)? If I knew and I was rich, I would be rich and famous.

Anyhow, classical representational theories uniformly took it for granted that the explanation of concept possession should be parasitic on the explanation of concept individuation. First you say what it is for something to be the concept X - you give the "identity conditions" for the concept - and then the story about concept possession follows without further fuss. Well, but how do you identify a concept? Answer: you identify a concept by saying what it is the concept of. The concept DOG, for example, is the concept of dogs; that's to say, it's the concept that you use to think about dogs with. Correspondingly, having the concept DOG is just having a concept to think about dogs with. Similarly, mutatis mutandis, for concepts of other than canine content: the concept X is the concept of Xs. Having the concept X is just having a concept to think about Xs with. (More precisely, having the concept X is having a concept to think about Xs "as such" with. The context "thinks about ..." is intentional for the "..." position. We'll return to this presently.)

So much for the explanatory tactics of classical representational theories of mind. Without exception, however, current theorizing about concepts reverses the classical direction of analysis. The substance of current theories lies in what they say about the conditions for having the concept X. It's the story about being the concept X - the story about concept individuation - that they treat as parasitic: the concept X is just whatever it is that a creature has when it has that concept. (See, for example, Peacocke, 1992, which is illuminatingly explicit on this point.) This subtle, and largely inarticulate, difference between contemporary representational theories and their classical forebears has, so I'll argue, the most profound implications for our cognitive science. To a striking extent, it determines the kinds of problems we work on and the kinds of theories that we offer as solutions to our problems. I suspect that it was a wrong turn - on balance, a catastrophe - and that we shall have to go back and do it all again.

First, however, just a little about why the classical representational view was abandoned. There were, I think, three kinds of reasons: methodological, metaphysical and epistemological. We'll need to keep them in mind when we turn to discussing current accounts of concepts.

Methodology: Suppose you're a behaviorist of the kind who thinks there are no concepts. In that case, you will feel no need for a theory about what concepts are, classical or otherwise. Behaviorist views aren't widely prevalent now, but they used to be; one of the things that killed the classical theory of concepts was simply that concepts are mental entities,[3] and mentalism went out of fashion.

[3] Terminological footnote: here and elsewhere in this paper, I follow the psychologist's usage rather than the philosopher's; for philosophers, concepts are generally abstract entities, hence, of course, not mental. The two ways of talking are compatible. The philosopher's concepts can be viewed as the types of which the psychologist's concepts are tokens.

Metaphysics: A classical theory individuates concepts by specifying their contents; the concept X is the concept of Xs. This seemed OK - it seemed not to beg any principled questions - because classical theorists thought that they had of-ness under control; they thought the image theory of mental representation explained it. We now know that they were wrong to think this. Even if concepts are mental images (which they aren't) and even if the concept DOG looks like a dog (which it doesn't) still, it isn't because it looks like a dog that it's a concept of dogs. Of-ness ("content", "intentionality") does not reduce to resemblance, and it is now widely, and rightly, viewed as problematic. It doesn't follow either that classical theorists were wrong to hold that the story about concept possession should be parasitic on the story about concept identification, or that they were wrong to hold that concepts should be individuated by their contents. But it's true that if you want to defend the classical order of analysis, you need an alternative to the picture theory of meaning.

Epistemology: The third of the standard objections to the classical account of concepts, though at least as influential as the others, is distinctly harder to state. Roughly, it's that classical theories aren't adequately "ecological". Used in this connection, the term has a Gibsonian ring, but I'm meaning it to pick out a much broader critical tradition. (In fact, I suspect Dewey was the chief influence; see the next footnote.) Here's a rough formulation. What cognitive science is trying to understand is something that happens in the world; it's the interplay of environmental contingencies and behavioral adaptations. Viewing concepts primarily as the vehicles of thought puts the locus of this mind/world interaction (metaphorically and maybe literally) not in the world but in the head. Having put it in there, classical theorists are at a loss as to how to get it out again. So the ecological objection goes.

This kind of worry comes in many variants, the epistemological being, perhaps, the most familiar. If concepts are internal mental representations, and thought is conversant only with concepts, how does thought ever contact the external world
that the mental representations are supposed to represent? If there is a "veil of ideas" between the mind and the world, how can the mind see the world through the veil? Isn't it, in fact, inevitable that the classical style of theorizing eventuates either in solipsism ("we never do connect with the world, only with our idea of it") or in idealism ("it's OK if we can never get outside of heads because the world is in there with us")?[4] And, surely, solipsism and idealism are both refutations of theories that entail them.

[4] Thus Dewey (1958): "Experience to them is not only something extraneous which is occasionally superimposed upon nature, but it forms a veil or screen which shuts us off from nature, unless in some way it can be 'transcended'" (p. 1a). "Other [philosophers' methods] begin with results of a reflection that has already torn in two the subject-matter and the operations and states of experiencing. The problem is then to get together again what has been sundered ..." (p. 9). The remedy he recommends is resolutely to refuse to recognize the distinction between experience and its object. "[Experience] recognizes in its primary integrity no division between act and material, subject and object, but contains them both in an unanalyzed totality."

Notice that this ecological criticism of the classical story is different from the behaviorist's eschewal of intentionality as such. The present objection to "internal representations" is not that they are representations, but that they are internal. In fact, this sort of objection to the classical theory predates behaviorism by a lot. Reid used it against Hume, for example. Notice too that this objection survives the demise of the image theory of concepts; treating mental representation as, say, discursive rather than iconic doesn't help. What's wanted isn't either pictures of the world or stories about the world; what's wanted is what they call in Europe being in the world. (I'm told this sounds even better in German.)

This is all, as I say, hard to formulate precisely; I think, in fact, that it is extremely confused. But even if the "ecological" diagnosis of what's wrong with classical concepts is a bit obscure, it's clear enough what cure was recommended, and this brings us back to our main topic. If what we want is to get thought out of the head and into the world, we need to reverse the classical direction of analysis, precisely as discussed above; we need to take having a concept as the fundamental notion and define concept individuation in terms of it. This is a true Copernican revolution in the theory of mind, and we are still living among the debris.

Here, in the roughest outline, is the new theory about concept possession: having a concept is having certain epistemic capacities. To have the concept of X is to be able to recognize Xs, and/or to be able to reason about Xs in certain kinds of ways. (Compare the classical view discussed above: having the concept of X is just being able to have thoughts about Xs.) It is a paradigmatically pragmatist idea that having a concept is being able to do certain things rather than being able to think certain things. Accordingly, in the discussion that follows, I will contrast classical theories of concepts with "pragmatic" ones. I'll try to make it plausible that all the recent and current accounts of concepts in cognitive science really are just variations on the pragmatist legacy.

In particular, I propose to consider (briefly, you'll be pleased to hear) what I take to be five failed versions of pragmatism about concepts. Each evokes its proprietary nemesis; there is, for each, a deep fact about concepts by which it is undone. The resulting symmetry is gratifyingly Sophoclean. When we've finished with this catalogue of tragic flaws, we'll have exhausted all the versions of concept pragmatism I've heard of, or can think of, and we'll also have compiled a must-list for whatever theory of concepts pragmatism is eventually replaced by.

2.1. Behavioristic pragmatism (and the problem of intentionality)

I remarked above that behaviorism can be a reason for ruling all mentalistic notions out of psychology, concepts included. However, not all behaviorists were eliminativists; some were reductionists instead. Thus Ryle, and Hull (and even Skinner about half the time) are perfectly content to talk of concept possession, so long as the "criteria" for having a concept can be expressed in the vocabulary of behavior and/or in the vocabulary of dispositions to behave. Do not ask what criteria are; there are some things we're not meant to know. Suffice it that criterial relations are supposed to be sort-of-semantical rather than sort-of-empirical.

So, then, which behaviors are supposed to be criterial for concept possession? Short answer: sorting behaviors. Au fond, according to this tradition, having the concept X is being able to discriminate Xs from non-Xs; to sort things into the ones that are X and the ones that aren't. Though behaviorist in essence, this identification of possessing a concept with being able to discriminate the things it applies to survived well into the age of computer models (see, for example, "procedural" semanticists like Woods (1975)); and lots of philosophers still think there must be something to it (see, for example, Peacocke, 1992). This approach gets concepts into the world with a vengeance: having a concept is responding selectively, or being disposed to respond selectively, to the things in the world that the concept applies to; and paradigmatic responses are overt behaviors "under the control" of overt stimulations.

I don't want to bore you with ancient recent history, and I do want to turn to less primitive versions of pragmatism about concepts. So let me just briefly remind you of what proved to be the decisive argument against the behavioristic version: concepts can't be just sorting capacities, for if they were, then coextensive concepts - concepts that apply to the same things - would have to be identical. And coextensive concepts aren't, in general, identical. Even necessarily coextensive concepts - like TRIANGULAR and TRILATERAL, for example - may perfectly well be different concepts. To put this point another way, sorting is something that happens under a description; it's always relative to some or other way of conceptualizing the things that are being sorted. Though their behaviors
may look exactly the same, and though they may end up with the very same things in their piles, the creature that is sorting triangles is in a different mental state, and is behaving in a different way, from the creature that is sorting trilaterals; and only the first is exercising the concept TRIANGLE. (For a clear statement of this objection to behaviorism, see Dennett, 1978.) Behaviorists had a bad case of mauvaise foi about this; they would dearly have liked to deny the intentionality of sorting outright. In this respect, articles like Kendler (1952), according to which "what is learned ... [is] a pseudoproblem in psychology", make fascinating retrospective reading.

Suppose, however, that you accept the point that sorting is always relative to a concept, but you wish, nonetheless, to cleave to some kind of pragmatist reduction of concept individuation to concept possession and of concept possession to having epistemic capacities. The question then arises: what difference in their epistemic capacities could distinguish the creature that is sorting triangles from the creature that is sorting trilaterals? What could the difference between them be, if it isn't in the piles that they end up with?

The universally popular answer has been that the difference between sorting under the concept TRIANGLE and sorting under the concept TRILATERAL lies in what the sorter is disposed to infer from the sorting he performs. To think of something as a triangle is to think of it as having angles; to think of something as a trilateral is to think of it as having sides. The guy who is collecting triangles must therefore accept that the things in his collection have angles (whether or not he has noticed that they have sides); and the guy who is collecting trilaterals must accept that the things in his collection have sides (even if he hasn't noticed that they have angles). The long and short is: having concepts is having a mixture of abilities to sort and abilities to infer.[5]

[5] The idea that concepts are (at least partially) constituted by inferential capacities receives what seems to be independent support from the success of logicist treatments of the "logical" concepts (AND, ALL, etc.). For many philosophers (though not for many psychologists) thinking of concepts as inferential capacities is a natural way of extending the logicist program from the logical vocabulary to TREE or TABLE. So, when these philosophers tell you what it's like to analyze a concept, they start with AND. (Here again, Peacocke, 1992, is paradigmatic.) It should, however, strike you as not obvious that the analysis of AND is a plausible model for the analysis of TREE or TABLE.

Since inferring is presumably neither a behavior nor a behavioral capacity, this formulation is, of course, not one that a behavioristic pragmatist can swallow. So much the worse for behaviorists, as usual. But notice that pragmatists as such are still OK: even if having a concept isn't just knowing how to sort things, it still may be that having a concept is some kind of knowing how, and that theories of concept possession are prior to theories of concept individuation.

We are now getting very close to the current scene. All non-behaviorist
versions of pragmatism hold that concept possession is constituted, at least in part, by inferential dispositions and capacities. They are thus all required to decide which inferences constitute which concepts. Contemporary theories of concepts, though without exception pragmatist, are distinguished by the ways that they approach this question. Of non-behavioristic pragmatist theories of concepts there are, by my reckoning, exactly four. Of which the first is as follows.

2.2. Anarchic pragmatism (and the realism problem)

Anarchic pragmatism is the doctrine that though concepts are constituted by inferential dispositions and capacities, there is no fact of the matter about which inferences constitute which concepts. California is, of course, the locus classicus of anarchic pragmatism; but no doubt there are those even on the East Coast who believe it in their hearts.

I'm not going to discuss the anarchist view. If there are no facts about which inferences constitute which concepts, then there are no facts about which concepts are which. And if there are no facts about which concepts are which, then there are no facts about which beliefs and desires are which (for, by assumption, beliefs and desires are complexes of which concepts are the constituents). And if there are no facts about which beliefs and desires are which, there is no intentional cognitive science, for cognitive science is just belief/desire explanation made systematic. And if there is no cognitive science, we might as well stop worrying about what concepts are and have a nice long soak in a nice hot tub.

I'm also not going to consider a doctrine that is closely related to anarchic pragmatism: namely, that while nothing systematic can be said about concept identity, it may be possible to provide a precise account of when, and to what degree, two concepts are similar. Some such thought is often voiced informally in the cognitive science literature, but there is, to my knowledge, not even a rough account of how such a similarity relation over concepts might be defined. I strongly suspect this is because a robust notion of similarity is possible only where there is a correspondingly robust notion of identity. For a discussion, see Fodor and Lepore (1992, Ch. 7).

2.3. Definitional pragmatism (and the analyticity problem)

Suppose the English word "bachelor" means the same as the English phrase "unmarried male". Synonymous terms presumably express the same concept (this is a main connection between theories about concepts and theories about language), so it follows that you couldn't have the concept BACHELOR and fail to have the concept UNMARRIED MALE. And from that, together with the
intentionality of sorting (see section 2.1), it follows that you couldn't be collecting bachelors so described unless you take yourself to be collecting unmarried males; unless, that is, you accept the inference that if something belongs in your bachelor collection, then it is something that is male and unmarried. Maybe this treatment generalizes; maybe, having the concept X just is being able to sort Xs and being disposed to draw the inferences that define X-ness.

The idea that it's defining inferences that count for concept possession is now almost as unfashionable as behaviorism. Still, the departed deserves a word or two of praise. The definition story offered a plausible (though partial) account of the acquisition of concepts. If BACHELOR is the concept UNMARRIED MALE, then it's not hard to imagine how a creature that has the concept UNMARRIED and has the concept MALE could put them together and thereby achieve the concept BACHELOR. (Of course the theory that complex concepts are acquired by constructing them from their elements presupposes the availability of the elements. About the acquisition of these, definitional pragmatism tended to be hazy.) This process of assembling concepts can be - indeed, was - studied in the laboratory; see Bruner, Goodnow, & Austin (1956) and the large experimental literature that it inspired. Other significant virtues of the definition story will suggest themselves when we discuss concepts as prototypes in section 2.4.

But alas, despite its advantages, the definition theory doesn't work. Concepts can't be definitions because most concepts don't have definitions. At a minimum, to define a concept is to provide necessary and sufficient conditions for something to be in its extension (i.e., for being among the things that concept applies to). And, if the definition is to be informative, the vocabulary in which it is couched must not include either the concept itself or any of its synonyms. As it turns out, for most concepts, this condition simply can't be met; more precisely, it can't be met unless the definition employs synonyms and near-synonyms of the concept to be defined. Maybe being male and unmarried is necessary and sufficient for being a bachelor; but try actually filling in the blanks in "JC is a dog iff JC is a ..." without using the words like "dog" or "canine" or the like on the right-hand side.

There is, to be sure, a way to do it; if you could make a list of all and only the dogs (Rover, Lassie, Spot ... etc.), then being on the list would be necessary and sufficient for being in the extension of DOG. That there is this option is, however, no comfort for the theory that concepts are definitions. Rather, what it shows is that being a necessary and sufficient condition for the application of a concept is not a sufficient condition for being a definition of the concept.

This point generalizes beyond the case of lists. Being a creature with a backbone is necessary and sufficient for being a creature with a heart (so they tell me). But it isn't the case that "creature with a backbone" defines "creature with a heart" or vice versa. Quite generally, it seems that Y doesn't define X unless Y applies to all and only the possible Xs (as well, of course, as all and only the
actual Xs); that's why you can't define DOG by just listing the dogs. It is, again, the modal notion - possibility - that's at the heart of the idea that concepts are definitions. Correspondingly, what killed the definition theory of concepts, first in philosophy and then in cognitive psychology, is that nobody was able to explicate the relevant sense of "possible".

It seems clear enough that even if Rover, Lassie and Spot are all the dogs that there actually are, it is possible, compatible with the concept of DOG, that there should be others. But is it, in the same sense, possible, compatible with the concept DOG that some of these non-actual dogs are ten feet long? How about twenty feet long? How about twenty miles long? How about a light-year long? To be sure, it's not biologically possible that there should be a dog as big as a light-year; but presumably biology rules out a lot of options that the concept DOG, as such, allows. Probably biology rules out zebra-striped dogs; surely it rules out dogs that are striped red, white and blue. But I suppose that red, white and blue striped dogs are conceptually possible; somebody who thought that there might be such dogs wouldn't thereby show himself not to have the concept DOG - would he? So, again, are light-year-long dogs possible, compatible with the concept DOG? Suppose somebody thought that maybe there could be a dachshund a light-year long. Would that show that he has failed to master the concept DOG? Or the concept LIGHT-YEAR? Or both?

To put the point in the standard philosophical jargon: even if light-year-long dogs aren't really possible, "shorter than a light-year" is part of the definition of DOG only if "some dogs are longer than a light-year" is analytically impossible; mere biological or physical (or even metaphysical) impossibility won't do. Well, then, is it analytically impossible that there should be such dogs? If you doubt that this kind of question has an answer, or that it matters a lot for any serious purpose what the answer is, you are thereby doubting that the notion of definition has an important role to play in the theory of concept possession.

So much for definitions.

2.4. Stereotypes and prototypes (and the problem of compositionality)

Because it was pragmatist, the definition story treated having a concept as having a bundle of inferential capacities, and was faced with the usual problem about which inferences belong to which bundles. The notion of an analytic inference was supposed to bear the burden of answering this question, and the project foundered because nobody knows what makes an inference analytic, and nobody has any idea how to find out. "Well", an exasperated pragmatist might nonetheless reply, "even if I don't know what makes an inference analytic, I do know what makes an inference statistically reliable. So why couldn't the theory of concept possession be statistical rather than definitional? Why couldn't I exploit
the notion of a reliable inference to do what definitional pragmatism tried and failed to do with the notion of an analytic inference?" We arrive, at last, at modern times.

For lots of kinds of Xs, people are in striking agreement about what properties an arbitrarily chosen X is likely to have. (An arbitrarily chosen bird is likely to be able to fly; an arbitrarily chosen conservative is likely to be a Republican; an arbitrarily chosen dog is likely to be less than a light-year long; and so forth.) Moreover, for lots of kinds of Xs, people are in striking agreement about which Xs are prototypic of the kind (diamonds for jewels; red for colors; not dachshunds for dogs). And, sure enough, the Xs that are judged to be prototypical are generally ones that have lots of the properties that an arbitrary X is judged likely to have; and the Xs that are judged to have lots of the properties that an arbitrary X is likely to have are generally the ones that are judged to be prototypical.

So, then, why shouldn't having the concept of an X be having the ability to sort by X-ness, together with a disposition to infer from something's being X to its having the typical properties of Xs? I think, in fact, that this is probably the view of concepts that the prototypical cognitive scientist holds these days.

Notice, in passing, that stereotypes share one of the most agreeable features of definitions: they make the learning of (complex) concepts intelligible. If the concept of an X is the concept of something that is reliably Y and Z, then you can learn the concept X if you have the concepts Y and Z together with enough statistics to recognize reliability when you see it. It would be OK, for this purpose, if the available statistical procedures were analogically (rather than explicitly) represented in the learner. Qua learning models, "neural networks" are analog computers of statistical dependencies, so it's hardly surprising that prototype theories of concepts are popular among connectionists. (See, for example, McClelland & Rumelhart, 1986.)

To see why it doesn't work, let's return one last time to the defunct idea that concepts are definitions. It was a virtue of that idea that it provides for the compositionality of concepts, and hence for the productivity and systematicity of thought. This, we're about to see, is no small matter.

In the first instance, productivity and systematicity are best illustrated by reference to features (not of minds but) of natural languages. To say that languages are productive is to say that there is no upper bound to the number of well-formed formulas that they contain. To say that they are systematic is to say that if a language can express the proposition that P, then it will be able to express a variety of other propositions that are, in one way or another, semantically related to P. (So, if a language can say that P and that ¬Q, it will also be able to say that Q and that ¬P; if it can say that John loves Mary, it will be able to say that Mary loves John ... and so forth.) As far as anybody knows, productivity and systematicity are universal features of human languages.

Productivity and systematicity are also universal features of human thought
that very same concept DOG also occurs in the thought that Rover is a brown dog. also gets you from the concept ANTIMISSILE to the concept ANTIANTIMISSLE. and.P . the very same process that gets you from the concept MISSILE to the concept ANTIMISSILE. if it can entertain the thought that Mary loves John. Compositionality requires. Whatever the content of the concept of BROWN DOG may be. and so on ad infinitum. It's on these assumptions that compositionality explains how being able to think that Rover is brown and that Rover is a dog is linked to being able to think that Rover is a brown dog. Systematicity follows because the concepts and principles you need to construct the thoughts that P and -Q are the very same ones that you need to construct the thoughts that Q and . that constituent concepts must be insensitive to their host. . it can entertain the thought that John loves Mary . This sort of treatment of compositionality is familiar.P . If you accept compositionality. The idea is that mental representations are constructed by the application of a finite number of combinatorial principles to a finite basis of (relatively or absolutely) primitive concepts. of the thoughts of many infra-human creatures). And compositionality further requires that the content of a complex representation is exhausted by the contributions that its constituents make. your grasp of the concepts BROWN and DOG wouldn't explain your grasp of the concept BROWN DOG. whatever the concept BROWN is that occurs in the thought that Rover is brown. it can also entertain the thought that . And also. and so on. a constituent concept contributes the same content to all the complex representations it occurs in. There is no upper bound to the number of thoughts that a person can think. the very same concept BROWN also occurs in the thought that Rover is a brown dog. for all I know. it must be completely determined by the content of the constituent concepts BROWN and DOG. 
together with the combinatorial apparatus that sticks these constituents together. It is extremely plausible that the productivity and the systematicity of language and thought are both to be explained by appeal to the systematicity and productivity of mental representations. I want to emphasize that it places a heavy constraint on both theories of concept possession and theories of concept individuation. (I am assuming the usual distinctions between cognitive "competence" and cognitive "performance"). .Concepts: a potboiler 145 (and. if this were not the case. . in effect. (So. and that mental representations are systematic and productive because they are compositional. if a mind can entertain the thought that P and any negative thoughts. and I will assume that it is essentially correct. and the concepts and principles you need to construct the thought that John loves Mary are the very same ones that you need to construct the thought that Mary loves John.) Productivity follows because the application of these constructive principles can iterate without bound. then you are required to say that whatever the concept DOG is that occurs in the thought that Rover is a dog.
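The account of compositionality given here (a finite basis of primitive concepts plus a finite set of combinatorial principles) can be illustrated with a minimal sketch. It is only a toy under stated assumptions: the function names (`anti`, `iterate_anti`, `loves`, `constructible_thoughts`, `constituents`) and the use of strings and tuples as stand-ins for mental representations are hypothetical choices made for illustration, not anything proposed in the text.

```python
from itertools import product

# Hypothetical toy basis of primitive concepts (illustrative assumption).
NAMES = ("JOHN", "MARY")


def anti(concept: str) -> str:
    """One combinatorial principle; it applies equally to its own output."""
    return "ANTI" + concept


def iterate_anti(concept: str, times: int) -> str:
    """Productivity: the same principle can be applied any number of times."""
    for _ in range(times):
        concept = anti(concept)
    return concept


def loves(agent: str, patient: str) -> tuple:
    """Build a complex representation from its constituents and structure."""
    return ("LOVES", agent, patient)


def constructible_thoughts(names) -> set:
    """Systematicity: every ordered pair of distinct names yields a thought."""
    return {loves(a, b) for a, b in product(names, repeat=2) if a != b}


def constituents(thought: tuple) -> frozenset:
    """The primitive concepts a complex representation is built from."""
    return frozenset(thought)


if __name__ == "__main__":
    print(iterate_anti("MISSILE", 2))
    thoughts = constructible_thoughts(NAMES)
    print(loves("JOHN", "MARY") in thoughts, loves("MARY", "JOHN") in thoughts)
```

In this sketch the resources needed to build the thought that John loves Mary are exactly the ones needed to build the thought that Mary loves John, which is the sense of systematicity at issue; nothing in it bears on how such representations are realized in a mind.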

In short, when complex concepts are compositional, the whole must not be more than the sum of its parts, otherwise compositionality won't explain productivity and systematicity. And if compositionality doesn't, nothing will.

If this account of compositionality strikes you as a bit austere, it may be some comfort that the systematicity and productivity of thought is compatible with compositionality failing in any finite number of cases. It allows, for example, that finitely many thoughts (hence a fortiori, finitely many linguistic expressions) are idiomatic or metaphoric, so long as there are infinitely many that are neither.

We can now see why, though concepts might have turned out to be definitions, they couldn't possibly turn out to be stereotypes or prototypes. Concepts do contribute their defining properties to the complexes of which they are constituents, and the defining properties of complex concepts are exhaustively determined by the defining properties that the constituents contribute. Since bachelors are, by definition, unmarried men, tall bachelors are, by the same definition, tall unmarried men; and very tall bachelors are very tall unmarried men; and very tall bachelors from Hoboken are very tall unmarried men from Hoboken ... and so on. Correspondingly, there is nothing more to the definition of "very tall bachelor from Hoboken" than very tall unmarried man from Hoboken; that is, there is nothing more to the definition of the phrase than what the definitions of its constituents contribute. So, then, if concepts were definitions, we could see how thought could be compositional, and hence productive and systematic. Concepts aren't definitions, of course. It's just that, from the present perspective, it's rather a pity that they're not.

For stereotypes, alas, don't work the way that definitions do. Stereotypes aren't compositional. Thus, for example, "ADJECTIVE X" can be a perfectly good concept even if there is no adjective X stereotype. And even if there are stereotypic adjective Xs, they don't have to be stereotypic adjectives or stereotypic Xs. I doubt, for example, that there is a stereotype of very tall men from Hoboken; but, even if there were, there is no reason to suppose that it would be either a stereotype for tall men, or a stereotype for men from Hoboken, or a stereotype for men. On the contrary: often enough, the adjective in "ADJECTIVE X" is there precisely to mark a way that adjective Xs depart from stereotypic Xs. Fitzgerald made this point about stereotypes to Hemingway when he said, "The rich are different from the rest of us." Hemingway replied by making the corresponding point about definitions: "Yes", he said, "they have more money".

In fact, this observation about the uncompositionality of stereotypes generalizes in a way that seems to me badly to undermine the whole pragmatist program of identifying concept possession with inferential dispositions. I've claimed that knowing what is typical of adjective and what is typical of X doesn't, in the general case, tell you what is typical of adjective Xs. The reason it doesn't is perfectly clear; though some of your beliefs about adjective Xs are compositionally inherited from your beliefs about adjectives, and some are compositionally inherited from your beliefs about Xs, some are beliefs that you have acquired about adjective Xs as such, and these aren't compositional at all. The same applies, of course, to the inferences that your beliefs about adjective Xs dispose you to draw. Some of the inferences you are prepared to make about green apples follow just from their being green and from their being apples. That is to say: they derive entirely from the constituency and structure of your GREEN APPLE concept. But others depend on information (or misinformation) that you have picked up about green apples as such: that green apples go well in apple pie, that they are likely to taste sour, that there are kinds of green apples that you'd best not eat uncooked, and so forth. Patently, these inferences are not definitional and not compositional; they are not ones that GREEN APPLE inherits from its constituents. They belong to what you know about green apples, not to what you know about the corresponding words or concepts. You learned that "green apple" means green and apple when you learned English at your mother's knee. But probably you learned that green apples mean apple pies from the likes of Julia Child.

The moral is this: the content of complex concepts has to be compositionally determined, so whatever about concepts is not compositionally determined is therefore not their content. But, as we've just been seeing, the inferential role of a concept is not, in general, determined by its structure together with the inferential roles of its constituents. That is, the inferential roles of concepts are not, in general, compositional; only defining inferences are.

This puts your paradigmatic cognitive scientist in something of a pickle. On the one hand, he has (rightly, I think) rejected the idea that concepts are definitions. On the other hand, he cleaves (wrongly, I think) to the idea that having concepts is having certain inferential dispositions. But, on the third hand (as it were), only defining inferences are compositional so if there are no definitions, then having concepts can't be having inferential capacities. I think that is very close to being a proof that the pragmatist notion of what it is to have a concept must be false. This line of argument was first set out in Fodor and Lepore (1992). Philosophical reaction has been mostly that if the price of the pragmatist account of concepts is reviving the notion that there are analytic/definitional inferences, then there must indeed be analytic/definitional inferences. My own view is that cognitive science is right about concepts not being definitions, and that it's the analysis of having concepts in terms of drawing inferences that is mistaken. Either way, it seems clear that the current situation is unstable. Something's gotta give.

In light of the issues about compositionality that we've just discussed, I return briefly to my enumeration of the varieties of pragmatist theories of concept possession. It should now seem unsurprising that none of them work; it appears there are principled reasons why none of them could.

2.5. The "theory theory" of concepts (and the problem of holism)

Pragmatists think that having a concept is having certain epistemic capacities; centrally it's having the capacity to draw certain inferences. We've had trouble figuring out which inferences constitute which concepts; well, maybe that's because we haven't been taking the epistemic bit sufficiently seriously. Concepts are typically parts of beliefs, but they are also, in a different sense of "part", typically parts of theories. This is clearly true of sophisticated concepts like ELECTRON, but perhaps it's always true. Even every-day concepts like HAND or TREE or TOOTHBRUSH figure in complex, largely inarticulate knowledge structures. To know about hands is to know, inter alia, about arms and fingers; to know about toothbrushes is, inter alia, to know about teeth and the brushing of them. Perhaps, then, concepts are just abstractions from such formal and informal knowledge structures. On this view, to have the concept ELECTRON is to know what physics has to say about electrons; and to have the concept TOOTHBRUSH is to know what dental folklore has to say about teeth.

Here are some passages in which the developmental cognitive psychologist Susan Carey (1985) discusses the approach to concepts that she favors: "... [young] children represent only a few theory-like cognitive structures, in which their notions of causality are embedded and in terms of which their deep ontological commitments are explicated. Cognitive development consists, in part, in the emergence of new theories out of these older ones, with the concomitant reconstructing of the ontologically important concepts and emergence of new explanatory notions" (p. 14). "... successive theories differ in three related ways: in the domain of phenomena accounted for, the nature of explanations deemed acceptable, and even in the individual concepts at the center of each system ... Change of one kind cannot be understood without reference to the changes of the other kinds" (pp. 4-5). The last two sentences are quoted from Carey's discussion of theory shifts in the history of science; her proposal is, in effect, that these are paradigms for conceptual changes in ontogeny.

Cognitive science is where philosophy goes when it dies. The version of pragmatism according to which concepts are abstractions from knowledge structures corresponds exactly to the version of positivism according to which terms like "electron" are defined implicitly by reference to the theories they occur in. Both fail, and for the same reasons. Suppose you have a theory about electrons (viz., that they are X) and I have a different theory about electrons (viz., that they are Y). And suppose, in both cases, that our use of the term "electron" is implicitly defined by the theories we espouse. Well, the "theory theory" says that you have an essentially different concept of electrons from mine if (and only if?) you have an essentially different theory of electrons from mine. The problem of how to individuate concepts thus reduces, according to this view, to the problem of how to individuate theories.

But, of course, nobody knows how to individuate theories. Roughly speaking, theories are bundles of inferences, just as concepts are according to the pragmatist treatment. The problem about which inferences constitute which concepts has therefore an exact analogon in the problem which inferences constitute which theories. Unsurprisingly, these problems are equally intractable. Indeed, according to the pragmatist view, they are interdefined. Theories are essentially different if they exploit essentially different concepts; concepts are essentially different if they are exploited by essentially different theories. It's hard to believe it matters much which of these shells you keep the pea under.

One thing does seem clear: if your way out of the shell game is to say that a concept is constituted by the whole of the theory it belongs to, you will pay the price of extravagant paradox. For example: it turns out that you and I can't disagree about dogs, or electrons, or toothbrushes since we have no common conceptual apparatus in which to couch the disagreement. You utter "Some dogs have tails." "No dogs have tails" I reply. We seem to be contradicting one another, but in fact we're not. Since taillessness is part of my theory of dogs, it is also part of my concept DOG according to the present, holist account of concept individuation. Since you and I have different concepts of dogs, we mean different things when we say "dog". So the disagreement between us is, as comfortable muddleheads like to put it, "just semantic"; you have your idea of dogs and I have mine. (What, one wonders, makes them both ideas of dogs?) You might have thought that our disagreement was about the facts and that you could refute what I said by producing a dog with a tail. But it wasn't and you can't, so don't bother trying. First the pragmatist theory of concepts, then the theory theory of concepts, then holism, then relativism. So it goes. Or so, at least, it's often gone.

I want to emphasize two caveats. The first is that I'm not accusing Carey of concept holism, still less of the slide from concept holism to relativism. Carey thinks that only the "central" principles of a theory individuate its concepts. The trouble is that she has no account of centrality, and the question "which of the inferences a theory licenses are central?" sounds suspiciously similar to the question "which of the inferences that a concept licenses are constitutive?" Carey cites with approval Kuhn's famous distinction between theory changes that amount to paradigm shifts and those that don't (Kuhn, 1962). If you have caught onto how this game is played, you won't be surprised to hear that nobody knows how to individuate paradigms either. Where is this buck going to stop?

My second caveat is that holism about the acquisition of beliefs and about the confirmation of theories might well both be true even if holism about the individuation of concepts is, as I believe, hopeless. There is no contradiction between Quine's famous dictum that it's only as a totality that our beliefs "face the tribunal of experience", and Hume's refusal to construe the content of one's concepts as being determined by the character of one's theoretical commitments.

There is, to be sure, a deep, deep problem about how to get a theory of confirmation and belief fixation if you are an atomist about concepts. But there is also a deep, deep problem about how to get a theory of confirmation and belief fixation if you are not an atomist about concepts. So far as I know, there's no reason to suppose that the first of these problems is worse than the second.

So much for caveats. It's worth noticing that the holistic account of concepts at which we've now dead-ended is diametrically opposite to the classical view that we started with. In classical accounts, concepts are individuated by what they are concepts of, and not by what theories they belong to. We saw that, for the likes of Hume, any concept could become associated to any other. This was a way of saying that the identity of a concept is independent of the theories one holds about the things that fall under it. Or, to put it in contemporary terms, it's independent of the concept's inferential role. Hume was thus a radical atomist just where contemporary cognitive scientists are tempted to be radically holist. In this respect, I think that Hume was closer to the truth than we are.

Here's how the discussion has gone so far: modern representational theories of mind are devoted to the pragmatist idea that having concepts is having epistemic capacities. But not just sorting capacities since sorting is itself relativized to concepts. Maybe, then, inferential capacities as well? So be it, but which inferential capacities? Well, at a minimum, inferential capacities that respect the compositionality of mental representations. Defining inferences are candidates since they do respect the compositionality of mental representations. Or, rather, they would if there were any definitions, but there aren't any definitions to speak of. Statistical inferences aren't candidates because they aren't compositional. It follows that concepts can't be stereotypes. The "theory theory" merely begs the problem it is meant to solve since the individuation of theories presupposes the individuation of the concepts they contain. Holism would be a godsend and the perfect way out except that it's preposterous on the face of it. What's left, then, for a pragmatist to turn to?

I suspect, in fact, that there is nothing left for a pragmatist to turn to and that our cognitive science is in deep trouble. Not that there aren't mental representations, or that mental representations aren't made of concepts. The problem is, rather, that Hume was right: concepts aren't individuated by the roles that they play in inferences, or, indeed, by their roles in any other mental processes. If, by stipulation, semantics is about what constitutes concepts and psychology is about the nature of mental processes, then the view I'm recommending is that semantics isn't part of psychology.

If semantics isn't part of psychology, you don't need to have a sophisticated theory of mental processes in order to get it right about what concepts are. Hume, for example, did get it right about what concepts are, even though his theory of mental processes was associationistic and hence hopelessly primitive. Concepts are the constituents of thoughts; as such, they're the most elementary mental objects that have both causal and representational properties. Since, however, concepts are individuated by their representational and not by their causal properties, all that has to be specified in order to identify a concept is what it is the concept of. The whole story about the individuation of the concept DOG is that it's the concept that represents dogs, as previously remarked.

But if "What individuates concepts?" is easy, that's because it's the wrong question, according to the present view. The right questions are: "How do mental representations represent?" and "How are we to reconcile atomism about the individuation of concepts with the holism of such key cognitive processes as inductive inference and the fixation of belief?" Pretty much all we know about the first question is that here Hume was, for once, wrong; mental representation doesn't reduce to mental imaging. What we know about the second question is, as far as I can tell, pretty nearly nothing at all.

The project of constructing a representational theory of the mind is among the most interesting that empirical science has ever proposed. But I'm afraid we've gone about it all wrong.

At the very end of Portnoy's Complaint, the client's two hundred pages of tortured, non-directive self-analysis comes to an end. In the last sentence of the book, the psychiatrist finally speaks: "So [said the doctor]. Now vee may perhaps to begin. Yes?"

References

Bruner, J., Goodnow, J., & Austin, G. (1956). A Study of Thinking. New York: Wiley.
Carey, S. (1985). Conceptual Change in Childhood. Cambridge, MA: MIT Press.
Dennett, D. (1978). Skinner Skinned. In Brainstorms. Cambridge, MA: MIT Press.
Dewey, J. (1958). Experience and Nature. New York: Dover Publications.
Fodor, J. (1990). A Theory of Content and Other Essays. Cambridge, MA: MIT Press.
Fodor, J., & Lepore, E. (1991). Why meaning (probably) isn't conceptual role. Mind and Language, 6, 328-343.
Fodor, J., & Lepore, E. (1992). Holism: A Shopper's Guide. Oxford: Blackwell.
Fodor, J., & McLaughlin, B. (1990). Connectionism and the problem of systematicity: why Smolensky's solution doesn't work. Cognition, 35, 183-204.
Fodor, J., & Pylyshyn, Z. (1981). How direct is visual perception? Some reflection on Gibson's "ecological approach". Cognition, 9, 139-196.
Fodor, J., & Pylyshyn, Z. (1988). Connectionism and cognitive architecture. Cognition, 28, 3-71.
Kendler, H. (1952). "What is learned?" A theoretical blind alley. Psychological Review, 59, 269-277.
Kuhn, T. (1962). The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
McClelland, J., & Rumelhart, D. (1986). A distributed model of human learning and memory. In J. McClelland & D. Rumelhart (Eds.), Parallel Distributed Processing (Vol. 2). Cambridge, MA: MIT Press.
Peacocke, C. (1992). A Study of Concepts. Cambridge, MA: MIT Press.
Smolensky, P. (1988). On the proper treatment of connectionism. Behavioral and Brain Sciences, 11, 1-23.
Woods, W. (1975). What's in a link? In D. Bobrow & A. Collins (Eds.), Representation and Understanding. New York: Academic Press.

8 Young children's naive theory of biology

Giyoo Hatano*,a, Kayoko Inagaki b
a Faculty of Liberal Arts, Dokkyo University, Soka, Saitama 340, Japan
b School of Education, Chiba University, Chiba 263, Japan
* Corresponding author.

Abstract

This article aimed at investigating the nature of young children's naive theory of biology by reviewing a large number of studies conducted in our laboratories. More specifically, we tried to answer the following five critical questions. What components does young children's knowledge system for biological phenomena (or naive biology) have? What functions does it have in children's lives? How is it acquired in ontogenesis? How does its early version change as children grow older? Is it universal across cultures and through history? We propose that young children's biological knowledge system has at least three components, that is, knowledge needed to specify the target objects of biology, ways of inferring attributes or behaviors of biological kinds, and a non-intentional causal explanatory framework, and that these three constitute a form of biology, which is adaptive in children's lives. We also claim that the core of naive biology is acquired based on specific cognitive constraints as well as the general mechanism of personification and the resultant vitalistic causality, but it is differently instantiated and elaborated through activity-based experiences in the surrounding culture.

Introduction

A growing number of cognitive developmentalists have come to agree that young children possess "theories" about selected aspects of the world (Wellman & Gelman, 1992). This conceptualization is a distinct departure from the Piagetian position, which assumed young children to be preoperational and thus incapable of offering more or less plausible explanations in any domain, because the term "theories" means coherent bodies of knowledge that involve causal explanatory understanding. How similar young children's theories are to theories scientists have is still an open question, but the former certainly have something more than a collection of facts and/or procedures to obtain desired results (Kuhn, 1989).

An important qualification here is "selected aspects of". In other words, young children are assumed to possess theories only in a few selected domains, where innate or early cognitive constraints work. Carey (1987) suggests that there are a dozen or so such domains. It is generally agreed that naive physics and naive psychology are included among them. What else? Wellman and Gelman (1992) take biology as the third domain. A few other candidate theories young children may possess are theory of matters (e.g., Smith, Carey, & Wiser, 1985), astronomy (e.g., Vosniadou & Brewer, 1992), and theory of society (e.g., Furth, 1980). Nonetheless, none of these has been widely accepted as an important domain, nor researched extensively, at least compared with the "big three".

Whichever aspects of the world young children have theories about, exact characterizations of these theories require further studies. Among others, the following questions seem critical. What components does each theory have? What functions does it have in children's lives? How is it acquired in ontogenesis? How does its early version change as children grow older? Is it universal across cultures and through history? In what follows, we would like to offer our tentative answers to these questions as to naive biology, based on a large amount of data collected by our associates and ourselves. Because of the limited space available, we will refer to studies conducted in other laboratories only when they are highly relevant.

As to whether young children have acquired a form of biology, however, there has been a debate in recent years. On one hand, Carey (1985) claimed that children before around age 10 make predictions and explanations for biological phenomena based on intuitive psychology (i.e., intentional causality). According to her, young children lack the mind-body distinction; more specifically, do not recognize that our bodily functions are independent of our intention nor that biological processes which produce growth or death are autonomous. On the other hand, a number of recent studies have suggested that children possess biological knowledge at much earlier ages than Carey claimed. Some developmentalists (e.g., Hatano & Inagaki, 1987) have asserted that the differentiation between psychology and biology occurs, if it does, much earlier than Carey (1985) assumed. Others have proposed that biological phenomena are conceptualized differently from other phenomena from the beginning (e.g., Keil, 1992).

Components of young children's biological knowledge

We are convinced that the body of knowledge which young children possess about biological phenomena (e.g., behavior of animals and plants needed for individual survival; bodily process; reproduction and inheritance of properties to offspring) has at least three components, and believe that these three constitute a naive biology (Inagaki, 1993b). The first is knowledge enabling one to specify objects to which biology is applicable; in other words, knowledge about the living-non-living distinction, and also about the mind-body distinction. The second is a mode of inference which can produce consistent and reasonable predictions for attributes or behaviors of biological kinds. The third is a nonintentional causal explanatory framework for behaviors needed for individual survival and bodily processes. These components correspond respectively to the three features that Wellman (1990) lists in characterizing framework theories: ontological distinctions, coherence, and a causal-explanatory framework.

Animate-inanimate and mind-body distinctions

An increasing number of recent studies have revealed that young children have the animate-inanimate distinction. More specifically, preschool children can distinguish animals from inanimate objects by attending to some salient distinctive features, for example animals' ability to perform autonomous movements (e.g., Gelman, 1990). Though only a small number of studies have dealt with plants as living things, they have also indicated that young children recognize plants as distinct from non-living things in some respects. For example, children before age 6 distinguish plants and animals from non-living things in terms of growth, that is, changes in size as time goes by (Inagaki, 1993a). In this study, which is an extension of that of Rosengren, Gelman, Kalish, and McCormick (1991), which investigated children's differentiation between animals and artifacts in terms of growth, children of ages 4-6 were presented with a picture of a flower's bud (or a new artifact or a young animal) as the standard stimulus picture, and were then asked to choose which of two other pictures would represent the same plant (or artifact or animal) a few hours later and several months/years later. The children showed "invariance" patterns (i.e., no change in size both a few hours later and several months/years later for all the items) for artifacts, but "growth" patterns (i.e., changes in size either/both a few hours later and several months/years later) for plants and animals. Backscheider, Shatz, and Gelman (1993) also reported that 4-year-olds recognize that, when damaged, both animals and plants can regrow, whereas artifacts can be mended only by human intervention. It is clear that young children can distinguish typical animals and plants from typical non-living things in some attributes.

That young children treat inanimate things differently from animals and plants is not sufficient for claiming that they have an integrated category of living things. Proof that they are aware of the commonalities between animals and plants is needed. By asking 5- and 6-year-olds whether a few examples of plants or those of inanimate things would show similar phenomena to those we observe for animals, Hatano and Inagaki (1994) found that a great majority of them recognized commonalities between animals and plants in terms of feeding and growing in size over time, and thus distinguished them from inanimate things. Moreover, many of them justified their responses by mapping food for animals to water for plants, such as "A tulip or a pine tree dies if we do not water it"; for growth, about one-third of them cited the phenomenon of plants getting bigger from a seed or a bud, and one-fifth of them by referring to watering as corresponding to feeding as a condition for growth. Based on this and other related studies, we can conclude that children are able to acquire the living-non-living distinction by age 6.

Young children can also distinguish between the body and the mind, in other words, biological phenomena from social or psychological ones, both of which are observed among a subset of animate things. Springer and Keil (1989) reported that children of ages 4-7 consider those features leading to biologically functional consequences for animals to be inherited, while other sorts of features, such as those leading to social or psychological consequences, to be not. Siegal (1988) indicated that children of ages 4-8 recognize that illness is caused not by moral but by medical factors; they have substantial knowledge of contagion and contamination as causes of illness. Inagaki and Hatano (1987) revealed that children of 5-6 years of age recognize that the growth of living things is beyond their intentional control. For example, a baby rabbit grows not because its owner wants it to but because it takes food. These findings all suggest that young children recognize the autonomous nature of biological processes.

An even more systematic study on the mind-body distinction has just been reported by Inagaki and Hatano (1993, Experiment 1). By interviewing children using a variety of questions, they showed that even children aged 4 and 5 already recognize not only the differential modifiability among characteristics that are unmodifiable by any means (e.g., gender), that are bodily and modifiable by exercise or diet (e.g., running speed), and that are mental and modifiable by will or monitoring (e.g., forgetfulness), but also the independence of activities of bodily organs (e.g., heartbeat) from a person's intention. Another important piece of evidence for this distinction is young children's use of non-intentional (or vitalistic) causality for bodily phenomena but not for social-psychological ones; this point is discussed in a later section.

" ["// someone does not pick up the cage. and (c) compatible situations." ["Why doesn't the grasshopper do anything? Why does it just stay thereT] "It cannot (go out of the cage and) walk. "Suppose a woman buys a grasshopper. "The grasshopper will be dizzy and die. Our studies (Inagaki & Hatano. but they do not use knowledge about humans indiscriminately. in which a human being and the target object would behave similarly. Results indicated that for the similar situations many of the children generated reasonable predictions with some explanations by using person analogies. After shopping she is about to leave the store without the grasshopper. unlike a . they produced unreasonable predictions for the compatible situations.Young children's naive theory of biology 157 Personification as means to make educated guesses about living things When children do not have enough knowledge about a target animate object. 1991) confirmed such a process of constrained personification. where the object and a human being would in fact react differently. On her way home she drops in at a store with this caged grasshopper. "Does a grasshopper feel something if the person who has been taking care of it daily dies? [If the subject's answer is "Yes"] How does it feel?" (compatible). what will the grasshopper doT] "The grasshopper. where the target object and a human would react differently. In one of the studies (Inagaki & Hatano. 'cause it cannot open the cage. 'cause the grasshopper. "The grasshopper will be picked up by someone. The following are example responses of a child aged 6 years 3 months: for the "too-much-eating" question of the "similar" situation. where they were unable to check the plausibility of products of person analogies because of the lack of adequate knowledge (e. Example questions for each situation are as follows: "We usually feed a grasshopper once or twice a day when we raise it at home. 1987. about the relation between the brain and feeling). 
they can make an educated guess by using personification or the person analogy in a constrained way. 1991).will just stay there. What will the grasshopper do?" (contradictory). we asked children of age 6 to predict a grasshopper's or tulip's reactions to three types of novel situations: (a) similar situations.. Young children are so familiar with humans that they can use their knowledge about humans as a source for analogically attributing properties to less familiar animate objects or predicting the reactions of such objects to novel situations. for the "left-behind" question of the "contradictory" situation. whereas they did not give personified predictions for the contradictory situations. but predictions obtained through the person analogy do not seem implausible to them. is like a person (in this point)". and predictions based on the person analogy contradict children's specific knowledge about the target. What will happen with it if we feed it 10 times a day?" (similar situation). As expected.g. (b) contradictory situations. though it is an insect.

person"; for the "caretaker's death" question of the "compatible" situation, "The grasshopper will feel unhappy." This illustrates well how this child applied knowledge about humans differentially according to the types of situations. Generally speaking, children generate reasonable predictions, using person analogies in a constrained way, and the person analogy may be misleading only where they lack biological knowledge to check analogy-based predictions.

Non-intentional causality
The experimental evidence presented so far enables us to indicate that young children have a coherently organized body of knowledge applicable to living things. This body of knowledge can be called a theory only when a causal explanatory framework is included in it. This concerns the third component of their biological knowledge. Here the type of causality, intentional or non-intentional, determines the nature of a theory. Carey (1985) claimed that, as mentioned above, children before age 10 base their explanations of biological phenomena on an intentional causality, because they are ignorant of physiological mechanisms involved. On the contrary, we claim that young children before schooling can apply a non-intentional causality in explaining biological phenomena, and thus they have a form of biology which is differentiated from psychology.

Young children cannot give articulated mechanical explanations when asked to explain biological phenomena (e.g., bodily processes mediating input-output relations) in an open-ended interview (e.g., Gellert, 1962); sometimes they try to explain them using the language of person-intentional causality (Carey, 1985). These findings apparently support the claim that young children do not yet have biology as an autonomous domain. It seems inevitable to accept this claim so long as we assume only two types of causalities, that is, intentional causality versus mechanical causality, as represented by Carey (1985). However, we propose an intermediate form of causality between these two. Children may not be willing to use intentional causality for biological phenomena but not as yet able to use mechanical causality. These children may rely on this intermediate form of causality, which might be called "vitalistic causality".

Intentional causality means that a person's intention causes the target phenomenon, whereas mechanical causality means that physiological mechanisms cause the target phenomenon. For instance, a specific bodily system enables a person, irrespective of his or her intention, to exchange substances with its environment or to carry them to and from bodily parts. In contrast, vitalistic causality indicates that the target phenomenon is caused by activity of an internal organ, which has, like a living thing, "agency" (i.e., a tendency to initiate and sustain behaviors). The activity is often described as a transmission or exchange of the "vital force", which can be conceptualized as unspecified substance, energy, or information. Vitalistic causality is clearly different from person-intentional causality in the sense that the organ's activities inducing the phenomenon are independent of the intention of the person who possesses the organ.

In Inagaki and Hatano (1990) some of the children of ages 5-8 gave explanations referring to something like vital force as a mediator when given novel questions about bodily processes, such as, what the halt of blood circulation would cause; for example, one child said, "If blood does not come to the hands, they will die, because the blood does not carry energies to them"; and another child, "We wouldn't be able to move our hands, because energies fade away if blood does not come there." However, as the number of these children was small, we did another experiment to induce children to choose a plausible explanation out of the presented ones.

We (Inagaki & Hatano, 1993, Experiment 2) predicted that even if young children could not apply mechanical causality, and if they could not generate vitalistic causal explanations for themselves, they would prefer vitalistic explanations to intentional ones for bodily processes when asked to choose one from among several possibilities. We asked 6-year-olds, 8-year-olds, and college students as subjects to choose one from three possible explanations each for six biological phenomena, such as blood circulation and breathing. The three explanations represented intentional, vitalistic and mechanical causality, respectively. An example question on breathing with three alternative explanations was as follows: "Why do we take in air? (a) Because we want to feel good [intentional]; (b) Because our chest takes in vital power from the air [vitalistic]; (c) Because the lungs take in oxygen and change it into useless carbon dioxide [mechanical]."

The 6-year-olds chose vitalistic explanations as most plausible most often; they chose them 54% of the time. With increasing age the subjects came to choose mechanical explanations most often. It should be noted that the 6-year-olds applied non-intentional (vitalistic plus mechanical) causalities 75% of the time, though they were more apt to adopt intentional causality than the 8-year-olds or adults.

This vitalistic causality is probably derived from a general mechanism of personification. One who has no means for observing the opaque inside or details of the target object often tries to understand it in a global fashion, by assuming it or its components to be human-like (Ohmori, 1985). Hence, young children try to understand the workings of internal bodily organs by regarding them as humanlike (but non-communicative) agents, and by assigning their activities global life-sustaining characters, which results in vitalistic causality for bodily processes. We can see a similar mode of explanation in the Japanese endogenous science before the Meiji restoration (and the beginning of her rapid modernization),

which had evolved with medicine and agriculture as its core (Hatano & Inagaki, 1987).

Young children seem to rely on vitalistic causality only for biological phenomena. They seldom attribute social-psychological behavior, which is optional and not needed for survival, to the agency of a bodily organ or part, as revealed by Inagaki and Hatano (1993, Experiments 3 and 3a). The following is an example question for such behavior used in the study: "When a pretty girl entered the room, Taro came near her. Why did he do so?" Eighty per cent of the 6-year-olds chose (a) "Because Taro wanted to become a friend of hers" [intentional explanation], whereas only 20% opted for (b) "Because Taro's heart urged him to go near her" [vitalistic]. For biological phenomenon questions - almost the same as those used in Experiment 2 of Inagaki and Hatano (1993) except for excluding the mechanical causal explanation - they tended to choose vitalistic explanations rather than intentional ones.

What, then, is the relationship between the vitalistic explanation for biological phenomena and the teleological-functional explanation for biological properties (Keil, 1992)? Both are certainly in between the intentional and the mechanical; both seem to afford valid perspectives of the biological world. One interpretation is that they are essentially the same idea with different emphases - the teleological concerns more the why or the cause, whereas the vitalistic is concerned more with the how or the process. Another interpretation is that, because the vitalistic explanation refers to activity of the responsible organ or bodily part (implicitly for sustaining life), it is closer to mechanical causality than is the teleological one, which refers only to the necessity. Anyway, it will be intriguing to examine these characterizations of young children's "biological" explanations in concrete experimental studies.

In sum, we can conclude from the above findings that children as young as 6 years of age possess three essential components of biology. In other words, contrary to Carey (1985), children before schooling have acquired a form of biology differentiated from psychology.

Functions of naive biology
The personifying and vitalistic biology that young children have is adaptive in nature in their everyday life. In other words, we believe that naive biology is formed, maintained, and elaborated, because it is functional. First, it is useful in everyday biological problem solving, for example, in making predictions for reactions of familiar animate entities to novel situations, and for properties and behaviors of unfamiliar entities. This is taken for granted, because naive biology includes constrained personification as a general mode of inference for biological phenomena and about biological kinds. Since a human being is a kind of living

entity, the use of the person analogy can often produce reasonable, though not necessarily accurate, predictions, especially for advanced animals. The person analogies in young children's biology are especially useful, because they are constrained by the perceived similarity of the target objects to humans and also by specific knowledge regarding the target objects, as revealed by Inagaki and Hatano (1987, 1991).

Second, naive biology also enables young children to make sense of biological phenomena they observe in their daily lives. Our favorite example is a 5-year-old girl's statement reported by Motoyoshi (1979). Based on accumulated experience with raising flowers, and relying on her naive biology, she concluded: "Flowers are like people. If flowers eat nothing (are not watered), they will fall down of hunger. If they eat too much (are watered too often), they will be taken ill." In this sense, young children's naive biology constitutes what Keil (1992) calls a mode of construal.

Naive biology provides young children with a conceptual tool for interpreting bodily phenomena of other animals as well as humans. Hatano and Inagaki (1991b) presented to 6-year-olds three bodily phenomena of a squirrel, which can also be observed for humans (being constipated, diarrhea, and getting older and weaker), and asked them to guess a cause for each phenomenon. About three-quarters of them on the average could offer some reasonable causes, and also judge the plausibility of causes suggested by the experimenter. About a half of them explicitly referred to humans at least once in their causal attributions for a squirrel. At the same time, however, some of their expressions strongly suggest that they edited or adapted to this animal those responses obtained by the person analogy (e.g., "A squirrel became weaker because it did not eat chestnuts").

Third, naive biology is useful because it helps children learn meaningfully, or even discover, procedures for taking care of animals and plants as well as themselves in everyday life. Global understanding of internal bodily functions is enough for such purposes. Inagaki and Kasetani (1994) examined whether inducing the person analogy, a critical component of naive biological knowledge, would enhance 5- and 6-year-olds' comprehension of raising procedures of a squirrel. The subjects were aurally given the description of the procedures while watching pictures visualizing them. The description included several references to humans in the experimental condition but not in the control condition. For example, about the necessity of giving a variety of food to a squirrel, the experimenter indicated, "You do not eat favorite food only. You eat a variety of food, don't you?" After listening to the description of all procedures, the children were asked to tell how to raise a squirrel to another lady. They were asked questions by this lady, for example, "What kind of food might I give a squirrel? Favorite chestnuts only, chestnuts, seeds and vegetables mixed, or ice cream?" They were thus required to choose an alternative and to give the reason. The experimental group children, irrespective of age, gave more often adequate

reasons for their correct choices than the control ones, though their superiority in the number of correct choices was significant only for the younger subjects. For instance, one 5-year-old child said, "Don't feed chestnuts only. You must give a squirrel plenty of seeds and carrots, because a person will die if he eats the same kind of food only, and so will it."

An anecdotal but impressive example of the discovery of a procedure for coping with trouble with a raised animal is reported by Motoyoshi (1979). Children aged 5 in a day care center inferred that, when they observed unusual excretion of a rabbit they were taking care of every day, it might be suffering from diarrhea like a person, and after group discussion they produced an idea of making the rabbit take medicine for diarrhea as a suffering person would.

Young children's naive biology is functional partly because its components - pieces of knowledge, the mode of inference, and causality - are promptly and effortlessly retrieved and used to generate more or less plausible ideas. Their personifying and vitalistic biology seems to be triggered almost automatically whenever children come into contact with novel phenomena which they recognize as "biological" (Inagaki, 1990b). Speaking generally, making an educated guess by applying insufficient knowledge is often rewarded in everyday life, both in individual problem solving and in social interaction, so most everyday knowledge is readily used. Children's naive biology is not an exception, we believe. In fact, in our study described above (Inagaki & Hatano, 1987) it was very rare that the children gave no prediction or the "I don't know" answer to our questions, which were somewhat unusual.

It should also be noted that naive biological knowledge is seldom applied "mechanically". As mentioned earlier, children constrain their analogies by factual, procedural or conceptual knowledge about the target to generate a reasonable answer.

Acquisition of naive biology
As already mentioned, our experimental data strongly suggest that children as young as 6 years of age have acquired a form of biology. This early acquisition of biology is not surprising from the perspective of human evolution, because it has been essential for our species to have some knowledge about animals and plants as potential foods (Wellman & Gelman, 1992) and also knowledge about our bodily functions and health (Hatano, 1989; Inagaki & Hatano, 1993). When children acquire an autonomous domain of biology is still an open question for us, because we have not examined whether much younger subjects too possess a form of biology.

However, we think that the acquisition of biology comes a little later than that of physics or psychology. Infants seldom need biological knowledge, since they do not need to take care of their health nor try to find food themselves. Moreover,

autonomous biology has to deal with entities which have agency (i.e., initiate and maintain activity without external forces) but can hardly communicate with us humans, and thus has to apply an intermediate form of causality between the intentional and mechanical. Autonomous biology also needs to integrate animals and plants, which appear so different, into a single category of living things. Though there is some evidence that even infants can distinguish objects having a capacity for self-initiated movement from those not having it (e.g., Golinkoff, Harding, Carlson, & Sexton, 1984), this cannot directly serve as the basis for the living-non-living distinction.

Cognitive bases of naive biology
Whether naive biology gradually emerges out of naive psychology (Carey, 1985) or is a distinct theory or mode of construal from the start (Keil, 1992) is still debatable. It is true that, as Keil argues, preschool children have some understanding of the distinction between the biological and the social-psychological. In Vera and Keil (1988), for example, 4-year-olds' inductions about animals, when given the biological context, resembled those previously found for 7-year-olds, who were given the same attribution questions without context; giving the social-psychological context to 4-year-olds did not affect the inductions they made.

However, young children may overestimate the controllability of bodily processes by will or intention. In fact, our modified replication study on the controllability of internal bodily functions suggests that 3-year-olds are not sure whether the workings of bodily organs are beyond their control (Inagaki & Suzuki, 1991).

Our own speculation about how young children acquire personifying and vitalistic biology through everyday life experiences is as follows. Children notice through somatosensation that several "events", uncontrolled by their intention, are going on inside the body. Since children cannot see the inside of the body, they will try to achieve "global understanding" by personifying an organ or bodily part. Considering that young children use analogies in a selective, constrained way (Inagaki & Hatano, 1987, 1991; Vosniadou, 1989), it is plausible that they apply the person analogy to bodily organs in that way. More specifically, they attribute agency and some related human properties but not others (e.g., the ability to communicate) to these organs. They also through personification generalize this global understanding of the body to other living things.

A set of specific innate or very early cognitive constraints is probably another important factor in the acquisition of naive biology. It is likely that even very young children have tendencies to attribute a specific physical reaction to a specific class of events, such as that diarrhea is caused by eating something poisonous. These tendencies enhance not only their rejection of intentional

causality for bodily phenomena but also their construction of more specific beliefs about bodily processes.

To sum up, we believe that the ability of young children to make inferences about bodily processes, as well as about animals' and plants' properties and behaviors, is based on personification, but this does not mean it is purely psychological, because they understand the mind-body distinction to some extent. However, it is suggested that they sometimes overattribute human mental properties (not communication ability, but working hard, being happy and others in addition to agency) to bodily organs (Hatano & Inagaki, unpublished study) as well as to less advanced animals, plants or even inanimate objects (e.g., Inagaki & Sugiyama, 1988). In this sense, early naive biology is "psychological", though it is autonomous.

Activity-based experiences
We are willing to admit that, because of the above general mechanism of personification and the resultant vitalistic causality, which "fit nicely with biology" (Keil, 1992, p. 105), and specific cognitive constraints, there must be some core elements in naive biology that are shared among individuals within and between cultures, as suggested by Atran (1990). However, we would like to emphasize that this condition does not mean children's activity-based experiences do not contribute to the acquisition. Some such experiences are also universal in human ways of living, but others may vary and thus produce differently instantiated versions of naive biology. For example, if children are actively engaged in raising animals, it will be possible for them to acquire a rich body of knowledge about them, and therefore to use that body of knowledge, as well as their knowledge about humans, as a source for analogical predictions and explanations for other biological kinds.

Our studies have in fact revealed that such an activity may produce a slightly different version of naive biology from the ordinary one. Inagaki (1990a) compared the biological knowledge of kindergarteners who had actively engaged in raising goldfish for an extended period at home with that of children of the same age who had never raised any animal. Although these two groups of children did not differ in factual knowledge about typical animals in general, the goldfish-raisers had much richer procedural, factual and conceptual knowledge about goldfish. More interestingly, the goldfish-raisers used the knowledge about goldfish as a source for analogies in predicting reactions of an unfamiliar "aquatic" animal (i.e., a frog), one that they had never raised, and produced reasonable predictions with some explanations for it. For example, when asked whether we could keep a baby frog in the same size forever, one of the raisers answered, "No, we can't, because a frog will grow bigger as goldfish grew bigger.

My goldfish were small before, but now they are big." It might be added that the goldfish-raisers tended to use person analogies as well as goldfish analogies for a frog. In other words, the goldfish-raisers could use two sources for making analogical predictions.

Moreover, in another study (Kondo & Inagaki, 1991; see also Hatano & Inagaki, 1992), goldfish-raising children tended to enlarge their previously possessed narrow conception of animals. Goldfish-raisers attributed animal properties which are shared by humans (e.g., having a heart, excreting) not only to goldfish but also to a majority of animals phylogenetically between humans and goldfish at a higher rate than non-raisers. This suggests that the experience of raising goldfish modifies young children's preferred mode of biological inferences.

Theory changes in biological understanding
So far we have emphasized strengths of young children's naive biology. What weaknesses does it have? Its weaknesses are obvious even when compared with intuitive biology in lay adults. Let us list some major ones: (a) limited factual knowledge; (b) lack of inferences based on complex, hierarchically organized biological categories; (c) lack of mechanical causality; and (d) lack of some conceptual devices (e.g., "evolution", "photosynthesis").

The use of inferences based on complex, hierarchically organized biological categories and of mechanical causality requires a theory change or conceptual change (i.e., fundamental restructuring of knowledge), whereas the accumulation of more and more factual knowledge can be achieved by enrichment only. Whether the acquisition of basic conceptual devices in scientific or school biology is accompanied by a theory change is not beyond dispute, but incorporating them meaningfully into the existing body of knowledge can usually be achieved only with the restructuring of that knowledge.

It is expected that, as children grow older, their personifying and vitalistic biology will gradually change toward truly "non-psychological" (if not scientific) biology by eliminating the above weaknesses (b) and (c), that is, toward a biology which relies on category-based inferences and rejects intentional causal explanations. We assume that this change is almost universal, at least among children growing up in highly technological societies, and that it can occur without systematic instruction in biology, though schooling may have some general facilitative effects on it.

Inagaki and Sugiyama (1988) examined how young children's human-centered or "similarity-based" inference would change as they grew older. They gave attribution questions, such as "Does X have a property Y?", to children aged from 4 to 10 and college students. Results indicated that there was a progression from 4-year-olds' predominant reliance on similarity-based attribution (attributing



human properties in proportion to perceived similarity between target objects and humans) to adults' predominant reliance on category-based attribution (attributing by relying on the higher-order category membership of the targets and category-attribute associations). This shift is induced not only by an increased amount of knowledge but also by the development of metacognitive beliefs evaluating more highly the usefulness of higher-order categories (Hatano & Inagaki, 1991a; Inagaki, 1989).

In contrast to young children's vitalistic, and sometimes even intentional, biological explanations, older children reject intentional explanations for biological phenomena and are inclined to use mechanical causality exclusively. In Experiment 2 of Inagaki and Hatano's (1993) study, the difference between 6-year-olds and 8-year-olds was larger than the difference between 8-year-olds and adults in terms of preference for mechanical explanations and avoidance of intentional ones.

These results suggest that young children's biology is qualitatively different from the biology that older children and adults have, and that, in accordance with Carey's claim, there occurs a conceptual change in biological understanding between ages 4 and 10. However, contrary to her claim, this change is characterized not as the differentiation of biology from psychology but as a qualitative change within the autonomous domain of biology, because children as young as 6 years of age already possess a form of biology.

Another important change may occur as a result of the learning of scientific biology at school. In order to be able to reason "scientifically" in biology one needs to know its basic concepts and principles - major conceptual devices which cannot be acquired without intervention. For example, if one does not know the phenomenon of photosynthesis, one will not be able to understand the difference between animals and plants (i.e., plants can produce nutriment themselves), and thus may accept the false analogy of mapping water for plants onto food for animals. We assume that, unlike the first theory change, this change is hard to achieve and thus occurs only among a limited portion of older children or adolescents.
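The contrast between similarity-based and category-based attribution described in this section can be made concrete with a small sketch. It is only an illustration under stated assumptions: the similarity scores, category labels and category-attribute associations are hypothetical values chosen for the example, not data from Inagaki and Sugiyama (1988) or any other study cited here, and the two functions are a simplification of the two modes of inference rather than a model proposed by the authors.

```python
from dataclasses import dataclass

# All values below are hypothetical illustrations, not data from the studies cited.


@dataclass(frozen=True)
class Target:
    name: str
    category: str               # assumed higher-order category label
    similarity_to_human: float  # assumed perceived similarity, 0.0 to 1.0


# Assumed category-attribute associations of the kind an adult might hold.
CATEGORY_ATTRIBUTES = {
    "animal": {"has a heart", "grows bigger"},
    "plant": {"grows bigger"},
    "inanimate": set(),
}


def similarity_based(target: Target) -> float:
    """Strength of attributing a human property, proportional to similarity."""
    return target.similarity_to_human


def category_based(target: Target, prop: str) -> bool:
    """Attribute a property only if the target's category is associated with it."""
    return prop in CATEGORY_ATTRIBUTES[target.category]


grasshopper = Target("grasshopper", "animal", 0.3)
tulip = Target("tulip", "plant", 0.1)
stone = Target("stone", "inanimate", 0.0)

for t in (grasshopper, tulip, stone):
    print(t.name, similarity_based(t), category_based(t, "grows bigger"))
```

Under these assumed numbers, the similarity-based rule gives the grasshopper only a weak attribution of a human property such as "has a heart", because it looks unlike a person, whereas the category-based rule attributes the property outright because a grasshopper is an animal. The same rule attributes "grows bigger" to the tulip but not to the stone, regardless of how dissimilar either is to humans.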

Universality of naive biology
Which aspects of naive biology are universal, and which aspects are not? As suggested by Atran (1990), it may be possible to find the "common sense" or core beliefs shared by all forms of folk biology and even by scientific biology. However, what such core beliefs are is debatable. Much of the research inspired by Piaget has shown parallels among the biological understanding of children in different cultures. The distinctions between animals and terrestrial inanimate objects are particularly strong. However,



the biological understanding observed in different cultures is not identical. The most striking of the differences thus far reported concerns Israeli children's ideas about plants. Stavy and Wax (1989) showed that about half of a sample of 6-12-year-olds, when asked to judge the life status of animals, plants and non-living things, classified plants either as non-living things or as falling within a third category: things that are neither living nor non-living. Beliefs about inanimate objects may also differ between cultures. Whereas recent studies conducted in North America indicate that young children seldom attribute life or other living-thing properties to any terrestrial inanimate objects (e.g., Dolgin & Behrend, 1984; Richards & Siegler, 1984), Inagaki and Sugiyama (1988) reported that some Japanese preschoolers extended mental properties even to inanimate objects without movement or function, such as stones. Hatano et al. (1993) tried to differentiate between universal and culturally specific aspects of children's conceptions of life and understanding of attributes of living things, by comparing kindergarteners, 2nd- and 4th-graders from Israel, Japan and the United States. The children were asked whether two instances each of four object types (people, other animals, plants and inanimate objects) possessed each of 16 attributes that included life status (being alive), unobservable animal attributes (e.g., has a heart), sensory attributes (e.g., feels pain), and attributes true of all living things (e.g., grows bigger). The results illustrate both similarities and differences across cultures in children's biological understanding. Children in all cultures knew that people, other animals, plants, and inanimate objects were different types of entities, with different properties, and were extremely accurate regarding humans, somewhat less accurate regarding other animals and inanimate objects, and least accurate regarding plants.
At the same time, as predicted from cultural analyses, Israeli children were considerably more likely not to attribute to plants properties that are shared by all living things, whereas Japanese children, whose overall accuracy was comparable to the Israeli, were considerably more likely to attribute to inanimate objects properties that are unique to living things. These differences are especially interesting because they suggest that children's naive biology is influenced by beliefs within the culture where they grow up. Consider why Japanese children might be more likely than children in the United States or Israel to view plants or inanimate objects as alive and having attributes of living things. Japanese culture includes a belief that plants are much like human beings. This attitude is represented by the Buddhist idea that even a tree or blade of grass has a mind. In Japanese folk psychology, even inanimate objects are sometimes considered to have minds. For example, it is at least not a silly idea for Japanese to assign life or divinity not only to plants but also to inanimate objects, especially big or old ones. In addition, linguistic factors seem to influence Japanese children's attributional judgements. The kanji (Chinese character) representing it has a prototypal meaning of "fresh" or "perishable" as well as


G. Hatano, K. Inagaki

"alive". Therefore, this kanji can be applied to cake, wine, sauce, and other perishable goods. Similar features of culture and language may account for Israeli children being less apt than American or Japanese children to attribute to plants life status and properties of living things. Stavy and Wax (1989) suggested that within the Israeli culture plants are regarded as very different from humans and other animals in their life status. This cultural attitude parallels that of a biblical passage (Genesis 1:30), well known to Israeli students, indicating that plants were created as food for living things including animals, birds and insects. Adding to, or perhaps reflecting, their cultural beliefs, the Hebrew word for "animal" is very close to that for "living" and "alive". In contrast, the word for "plant" has no obvious relation to such terms (Stavy & Wax, 1989). How culture influences the development of biological understanding has yet to be studied. Parents, schools and mass media may serve to transmit cultural beliefs. For example, Japanese parents may communicate the attitude through their actions toward plants and divine inanimate objects, though they do not usually tell their children this explicitly. Culture may provide children with opportunities to engage in activities that lead them to construct some particular biological understanding, as in the case of children raising goldfish (Hatano & Inagaki, 1992; Inagaki, 1990a).

Postscript
Since Carey (1985), young children's naive biology has been an exciting topic for research in cognitive development. As more and more ambitious researchers have joined to study it, not only has a richer database been built and finer conceptualizations offered about this specific issue, but also, through attempts to answer questions like the ones discussed so far in this article, a better understanding of fundamental issues in the developmental studies on cognition, like the nature of domains, theories, constraints, etc., has been achieved. It will probably be a popular topic for the coming several years, and research questions about naive biology can be better answered and/or better rephrased. What is urgently needed now is (a) to integrate nativistic and cultural accounts of acquisition and change in naive biology, and (b) to find commonalities and differences between naive biology and other major theories of the world possessed by young children (Hatano, 1990).

References
Atran, S. (1990). Cognitive foundations of natural history: Towards an anthropology of science. Cambridge, UK: Cambridge University Press.
Backscheider, A.G., Shatz, M., & Gelman, S.A. (1993). Preschoolers' ability to distinguish living kinds as a function of regrowth. Child Development, 64, 1242-1257.
Carey, S. (1985). Conceptual change in childhood. Cambridge, MA: MIT Press.
Carey, S. (1987). Theory change in childhood. In B. Inhelder, D. de Caprona, & A. Cornu-Wells (Eds.), Piaget today (pp. 141-163). Hillsdale, NJ: Erlbaum.
Dolgin, K.G., & Behrend, D.A. (1984). Children's knowledge about animates and inanimates. Child Development, 55, 1646-1650.
Furth, H.G. (1980). The world of grown-ups: Children's conceptions of society. New York: Elsevier.
Gellert, E. (1962). Children's conceptions of the content and functions of the human body. Genetic Psychology Monographs, 65, 291-411.
Gelman, R. (1990). First principles organize attention to and learning about relevant data: Number and the animate-inanimate distinction as examples. Cognitive Science, 14, 79-106.
Golinkoff, R.M., Harding, C.G., Carlson, V., & Sexton, M.E. (1984). The infant's perception of causal events: The distinction between animate and inanimate objects. In L.L. Lipsitt & C. Rovee-Collier (Eds.), Advances in Infancy Research (Vol. 3, pp. 145-165). Norwood, NJ: Ablex.
Hatano, G. (1989). Language is not the only universal knowledge system: A view from "everyday cognition". Dokkyo Studies in Data Processing and Computer Science, 7, 69-76.
Hatano, G. (1990). The nature of everyday science: A brief introduction. British Journal of Developmental Psychology, 8, 245-250.
Hatano, G., & Inagaki, K. (1987). Everyday biology and school biology: How do they interact? Quarterly Newsletter of the Laboratory of Comparative Human Cognition, 9, 120-128.
Hatano, G., & Inagaki, K. (1991a). Learning to trust higher-order categories in biology instruction. Paper presented at the meeting of the American Educational Research Association, Chicago.
Hatano, G., & Inagaki, K. (1991b). Young children's causal reasoning through spontaneous personification. Paper presented at the 33rd meeting of the Japanese Educational Psychology Association, Nagano [in Japanese].
Hatano, G., & Inagaki, K. (1992). Desituating cognition through the construction of conceptual knowledge. In P. Light & G. Butterworth (Eds.), Context and cognition: Ways of learning and knowing (pp. 115-133). London: Harvester/Wheatsheaf.
Hatano, G., & Inagaki, K. (1994). Recognizing commonalities between animals and plants. Paper to be presented at the meeting of the American Educational Research Association, New Orleans.
Hatano, G., Siegler, R.S., Richards, D.D., Inagaki, K., Stavy, R., & Wax, N. (1993). The development of biological knowledge: A multi-national study. Cognitive Development, 8, 47-62.
Inagaki, K. (1989). Developmental shift in biological inference processes: From similarity-based to category-based attribution. Human Development, 32, 79-87.
Inagaki, K. (1990a). The effects of raising animals on children's biological knowledge. British Journal of Developmental Psychology, 8, 119-129.
Inagaki, K. (1990b). Young children's use of knowledge in everyday biology. British Journal of Developmental Psychology, 8, 281-288.
Inagaki, K. (1993a). Young children's differentiation of plants from non-living things in terms of growth. Paper presented at the 60th meeting of the Society for Research in Child Development, New Orleans.
Inagaki, K. (1993b). The nature of young children's naive biology. Paper presented at the symposium, "Children's naive theories of the world", at the 12th meeting of the International Society for the Study of Behavioral Development, Recife, Brazil.
Inagaki, K., & Hatano, G. (1987). Young children's spontaneous personification as analogy. Child Development, 58, 1013-1020.
Inagaki, K., & Hatano, G. (1990). Development of explanations for bodily functions. Paper presented at the 32nd meeting of the Japanese Educational Psychology Association, Osaka [in Japanese].
Inagaki, K., & Hatano, G. (1991). Constrained person analogy in young children's biological inference. Cognitive Development, 6, 219-231.
Inagaki, K., & Hatano, G. (1993). Young children's understanding of the mind-body distinction. Child Development, 64, 1534-1549.
Inagaki, K., & Kasetani, M. (1994). Effects of hints to use knowledge about humans on young children's understanding of biological phenomena. Paper to be presented at the 13th meeting of the International Society for the Study of Behavioral Development, Amsterdam.
Inagaki, K., & Sugiyama, K. (1988). Attributing human characteristics: Developmental changes in over- and underattribution. Cognitive Development, 3, 55-70.
Inagaki, K., & Suzuki, Y. (1991). The understanding of the mind-body distinction in children aged 3 to 5 years. Paper presented at the 33rd meeting of the Japanese Educational Psychology Association, Nagano [in Japanese].
Keil, F.C. (1992). The origins of an autonomous biology. In M.R. Gunnar & M. Maratsos (Eds.), Modularity and constraints in language and cognition. Minnesota Symposia on Child Psychology (Vol. 25, pp. 103-137). Hillsdale, NJ: Erlbaum.
Kondo, H., & Inagaki, K. (1991). Effects of raising goldfish on the grasp of common characteristics of animals. Paper presented at the 44th Annual Meeting of Japanese Early Childhood Education and Care Association, Kobe [in Japanese].
Kuhn, D. (1989). Children and adults as intuitive scientists. Psychological Review, 96, 674-689.
Massey, C.M., & Gelman, R. (1988). Preschooler's ability to decide whether a photographed unfamiliar object can move itself. Developmental Psychology, 24, 307-317.
Motoyoshi, M. (1979). Watashino Seikatuhoikuron. [Essays on education for day care children: Emphasizing daily life activities.] Tokyo: Froebel-kan [in Japanese].
Ohmori, S. (1985). Chishikito gakumonno kouzou. [The structure of knowledge and science.] Tokyo: Nihon Hoso Shuppan Kyokai [in Japanese].
Richards, D.D., & Siegler, R.S. (1984). The effects of task requirements on children's life judgments. Child Development, 55, 1687-1696.
Rosengren, K.S., Gelman, S.A., Kalish, C.W., & McCormick, M. (1991). As time goes by: Children's early understanding of growth. Child Development, 62, 1302-1320.
Siegal, M. (1988). Children's knowledge of contagion and contamination as causes of illness. Child Development, 59, 1353-1359.
Smith, C., Carey, S., & Wiser, M. (1985). On differentiation: A case study of the development of the concepts of size, weight, and density. Cognition, 21, 177-237.
Springer, K., & Keil, F.C. (1989). On the development of biologically specific beliefs: The case of inheritance. Child Development, 60, 637-648.
Stavy, R., & Wax, N. (1989). Children's conceptions of plants as living things. Human Development, 32, 88-94.
Vera, A.H., & Keil, F.C. (1988). The development of inductions about biological kinds: The nature of the conceptual base. Paper presented at the 29th meeting of the Psychonomic Society, Chicago.
Vosniadou, S. (1989). Analogical reasoning as a mechanism in knowledge acquisition: A developmental perspective. In S. Vosniadou & A. Ortony (Eds.), Similarity and analogical reasoning (pp. 413-469). Cambridge, UK: Cambridge University Press.
Vosniadou, S., & Brewer, W. (1992). Mental models of the earth: A study of conceptual change in childhood. Cognitive Psychology, 24, 535-585.
Wellman, H.M. (1990). The child's theory of mind. Cambridge, MA: MIT Press.
Wellman, H.M., & Gelman, S.A. (1992). Cognitive development: Foundational theories of core domains. Annual Review of Psychology, 43, 337-375.

Mental models and probabilistic thinking
Philip N. Johnson-Laird*
Department of Psychology, Princeton University, Green Hall, Princeton, NJ 08544, USA

Abstract
This paper outlines the theory of reasoning based on mental models, and then shows how this theory might be extended to deal with probabilistic thinking. The same explanatory framework accommodates deduction and induction: there are both deductive and inductive inferences that yield probabilistic conclusions. The framework yields a theoretical conception of strength of inference, that is, a theory of what the strength of an inference is objectively: it equals the proportion of possible states of affairs consistent with the premises in which the conclusion is true, that is, the probability that the conclusion is true given that the premises are true. Since there are infinitely many possible states of affairs consistent with any set of premises, the paper then characterizes how individuals estimate the strength of an argument. They construct mental models, which each correspond to an infinite set of possibilities (or, in some cases, a finite set of infinite sets of possibilities). The construction of models is guided by knowledge and beliefs, including lay conceptions of such matters as the "law of large numbers". The paper illustrates how this theory can account for phenomena of probabilistic reasoning.
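The definition of strength given in the abstract (the proportion of states of affairs consistent with the premises in which the conclusion is true) can be illustrated with a small enumeration. The following is only a sketch under strong assumptions that are not part of the paper: the infinitely many states of affairs are replaced by a finite set of equally weighted truth assignments, and the function name `strength`, the atoms `a` and `b`, and the example premises are all hypothetical.

```python
from fractions import Fraction
from itertools import product


def strength(atoms, premises, conclusion):
    """Proportion of premise-consistent states in which the conclusion holds.

    Assumption: each True/False assignment to the atoms stands in for one
    equally weighted "state of affairs".
    """
    states = [dict(zip(atoms, values))
              for values in product([True, False], repeat=len(atoms))]
    # Keep only the states in which every premise is true.
    consistent = [s for s in states if all(p(s) for p in premises)]
    if not consistent:
        raise ValueError("the premises are jointly inconsistent")
    supporting = [s for s in consistent if conclusion(s)]
    return Fraction(len(supporting), len(consistent))


atoms = ["a", "b"]                      # hypothetical atomic propositions
exclusive_or = lambda s: s["a"] != s["b"]
inclusive_or = lambda s: s["a"] or s["b"]
not_a = lambda s: not s["a"]
is_a = lambda s: s["a"]
is_b = lambda s: s["b"]

# Valid deduction: no counterexamples, so the strength is 1.
valid = strength(atoms, [exclusive_or, not_a], is_b)
# Induction: the conclusion holds in 2 of the 3 consistent states.
inductive = strength(atoms, [inclusive_or], is_a)
# Conclusion inconsistent with the premises: strength 0.
inconsistent = strength(atoms, [exclusive_or, not_a], is_a)
```

Exact fractions are used so that the three cases (1, an intermediate value, and 0) can be compared without rounding.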

*Fax (609) 258 1113, e-mail phil@clarity.princeton.edu. The author is grateful to the James S. McDonnell Foundation for support. He thanks Jacques Mehler for soliciting this paper (and for all his work on 50 volumes of Cognition). He also thanks Ruth Byrne for her help in developing the model theory of deduction, Eldar Shafir for many friendly discussions and arguments about the fundamental nature of probabilistic thinking, and for his critique of the present paper. Malcolm Bauer, Jonathan Evans and Alan Garnham also kindly criticized the paper. All these individuals have tried to correct the erroneous thoughts it embodies. Thanks also to many friends - too numerous to mention - for their work on mental models.

1. Introduction
Everyone from Aristotle to aboriginals engages in probabilistic thinking, whether or not they know anything of the probability calculus. Someone tells you:

There was a severe frost last night.

and you are likely to infer:

The vines will probably not have survived it.

basing the inference on your knowledge of the effects of frost. These inferences are typical and ubiquitous. They are part of a universal human competence, which does not necessarily depend on any overt mastery of numbers or quantitative measures. Aristotle's notion of probability, for instance, amounts to the following two ideas: a probability is a thing that happens for the most part, and conclusions that state what is probable must be drawn from premises that do the same (see Rhetoric, I, 1357a). Such ideas are crude in comparison to Pascal's conception of probability, but they correspond to the level of competence a psychological theory should initially aspire to explain.

Of course many people do encounter the probability calculus at school. Few master it, as a simple test with adults shows:

There are two events, which each have a probability of a half. What is the probability that both occur?

Many people respond: a quarter. The appropriate "therapy" for such errors is to invite the individual first to imagine that A is a coin landing heads and B is the same coin landing tails, that is, p(A & B) = 0, and then to imagine that A is a coin landing heads and B is a coin landing with the date uppermost, where date and head are on the same side, that is, p(A & B) = 0.5. At this point, most people begin to grasp that there is no definite answer to the question above - joint probabilities are a function of the dependence of one event on the other.

Cognitive psychologists have discovered many phenomena of probabilistic thinking, principally that individuals do not follow the propositional calculus in assessing probabilities, and that they appear to rely on a variety of heuristics in making judgements about probabilities. A classic demonstration is Tversky and Kahneman's (1983) phenomenon of the "conjunction fallacy", that is, a violation of the elementary principle that p(A & B) ≤ p(B).
For example, subjects judge that a woman who is described as 31 years old, liberal and outspoken, is more likely to be a feminist bankteller than a bankteller. Indeed, we are all likely to go wrong in thinking about probabilities: the calculus is a branch of mathematics that few people completely master. Theorists relate probability to induction, and they talk of both inductive inference and inductive argument. The two expressions bring out the point that the informal arguments of everyday life, which occur in conversation, newspaper

editorials and scientific papers, are often based on inductive inferences. The strength of such arguments depends on the relation between the premises and the conclusion. But the nature of this relation is deeply puzzling - so puzzling that many theorists have abandoned logic altogether in favor of other idiosyncratic methods of assessing informal arguments (see, for example, Toulmin, 1958; the movement for "informal logic and critical thinking", e.g. Fisher, 1988; and "neural net" models, e.g. Thagard, 1989). Cognitive psychologists do not know how people make probabilistic inferences: they have yet to develop a computable account of the mental processes underlying such reasoning. For this celebratory volume of Cognition, the editor solicited papers summarizing their authors' contributions to the field. The present paper, however, looks forward as much as it looks back. Its aim is to show how probabilistic thinking could be based on mental models - an approach that is unlikely to surprise assiduous readers of the journal (see, for example, Byrne, 1989; Johnson-Laird & Bara, 1984; Oakhill, Johnson-Laird, & Garnham, 1989). In pursuing the editor's instructions, part 2 of the paper reviews the theory of mental models in a self-contained way. Part 3 outlines a theoretical conception of strength of inference, that is, a theory of what objectively the strength of an inference or argument depends on. This abstract account provides the agenda for what the mind attempts to compute in thinking probabilistically (a theory at the "computational" level; Marr, 1982). However, as we shall see, it is impossible for a finite device, such as the human brain, to carry out a direct assessment of the strength of an inference except in certain limiting cases. Part 4 accordingly describes a theory of how the mind attempts to estimate the strength of inferences (a theory at the "algorithmic" level).
Part 5 shows how this algorithmic theory accounts for phenomena of probabilistic thinking and how it relates to the heuristic approach. Part 6 contrasts the model approach with theories based on rules of inference, and shows how one conception of rules can be reconciled with mental models.
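The coin "therapy" described in the introduction can be checked by simple counting. The sketch below is purely illustrative and is not part of the paper: the helper `probability` and the event names are hypothetical, and every outcome of a fair toss is assumed to be equally likely. It shows that two events of probability one half can have a joint probability of 0, of 1/2, or (for two separate coins) of 1/4, and that the conjunction rule p(A & B) ≤ p(B) holds in each case.

```python
from fractions import Fraction


def probability(outcomes, event):
    """Proportion of equally likely outcomes in which `event` holds."""
    hits = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(hits, len(outcomes))


SIDES = ["heads", "tails"]          # one fair coin


def heads(side):
    return side == "heads"


def tails(side):
    return side == "tails"


def date_up(side):
    # Assumption from the text: the date is on the same side as the head.
    return side == "heads"


# Same coin: heads and tails exclude each other.
p_heads_and_tails = probability(SIDES, lambda s: heads(s) and tails(s))
# Same coin: heads and date-uppermost always occur together.
p_heads_and_date = probability(SIDES, lambda s: heads(s) and date_up(s))

# Two separate fair coins: the independent case behind the answer "a quarter".
TWO_COINS = [(first, second) for first in SIDES for second in SIDES]
p_two_heads = probability(TWO_COINS, lambda pair: heads(pair[0]) and heads(pair[1]))

# Conjunction rule: p(A & B) <= p(B) for every pair of events defined above.
events = (heads, tails, date_up)
conjunction_rule_holds = all(
    probability(SIDES, lambda s: a(s) and b(s)) <= probability(SIDES, b)
    for a in events
    for b in events
)
```

Counting outcomes directly, rather than multiplying probabilities, keeps the dependence between the two events explicit, which is the point of the example in the text.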

2. Reasoning and mental models
Mental models were originally proposed as a programmatic basis for thinking (Craik, 1943). More recently, the theory was developed to account for verbal comprehension: understanding of discourse leads to a model of the situation under discussion, that is, a representation akin to the result of perceiving or imagining the situation. Such models are derived from syntactically structured expressions in a mental language, which are constructed as sentences are parsed (see Garnham, 1987; Johnson-Laird, 1983). Among the key properties of models is that their structure corresponds to the structure of what they represent (like a visual image), and thus that individual entities are represented just once in a model. The theory of mental models has also been developed to explain deductive


reasoning (Johnson-Laird, 1983; Johnson-Laird & Byrne, 1991). Here, the underlying idea is that reasoning depends on constructing a model (or set of models) based on the premises and general knowledge, formulating a conclusion that is true in the model(s) and that makes explicit something only implicit in the premises, and then checking the validity of the conclusion by searching for alternative models of the premises in which it is false. If there are no such counterexamples, then the conclusion is deductively valid, that is, it must be true given that the premises are true. Thus, the first stage of deduction corresponds to the normal process of verbal comprehension, the second stage corresponds to the normal process of formulating a useful and parsimonious description, and only the third stage is peculiar to reasoning.

To characterize any particular domain of deduction, for example reasoning based on temporal relations such as "before", "after" and "while", or sentential connectives such as "not", "if", "and" and "or", it is necessary to account for how the meanings of the relevant terms give rise to models. The general reasoning principles, as outlined above, then automatically apply to the domain. In fact, the appropriate semantics has been outlined for temporal relations, spatial relations, sentential connectives and quantifiers (such as "all", "none" and "some"), and all of these domains can be handled according to five representational principles:

(1) Each entity is represented by an individual token in a model, its properties are represented by properties of the token, and the relations between entities are represented by the relations between tokens. Thus, a model of the assertion "The circle is on the right of the triangle" has the following spatial structure:

△   ○

which may be experienced as a visual image, though what matters is not so much the subjective experience as the structure of the model. To the extent that individuals grasp the truth conditions of propositions containing abstract concepts, such as friendship, ownership and justice, they must be able to envisage situations that satisfy them, that is, to form mental models of these situations (see Johnson-Laird, 1983, Ch. 15).

(2) Alternative possibilities can be represented by alternative models. Thus, the assertion "Either there is a triangle or there is a circle, but not both" requires two alternative models, which each correspond to separate possibilities:

△
○

(3) The negation of atomic propositions can be represented by a propositional annotation. Thus, the assertion "There is not a triangle" is represented by the following sort of model:

¬△

where "¬" is an annotation standing for negation (for a defence of such annotations, see Polk & Newell, 1988; and Johnson-Laird & Byrne, 1991, pp. 130-1). Of course, the nature of the mental symbol corresponding to negation is unknown. The principal purpose of the annotation is to ensure that models are not formed containing both an element and its negation. Thus, the only way to combine the disjunctive models above with the model of "There is not a triangle" is to eliminate the first model, leaving only the second model and its new negated element:

¬△   ○

It follows that there is a circle. As this example shows, deductions can be made without the need for formal rules of inference of the sort postulated in "natural deduction systems" (see, for example, Rips, 1983; Braine, Reiser & Rumain, 1984), such as, in this case, the formal rule for disjunction:

A or B
not A
∴ B

(4) Information can be represented implicitly in order to reduce the load on working memory. An explicit representation makes information immediately available to other processes, whereas an implicit representation encodes the information in a way that is not immediately accessible. Individuals and situations are represented implicitly by a propositional annotation that works in concert with an annotation for what has been represented exhaustively. Thus, the proper initial representation of the disjunction "Either there is a triangle or there is a circle, but not both" indicates that the cases in which triangles occur, and the cases in which circles occur, have been exhaustively represented, as shown by the square brackets:

[△]
[○]

This set of models implicitly represents the fact that circles cannot occur in the first model and triangles cannot occur in the second model, because circles are exhaustively represented in the second model and triangles are exhaustively represented in the first model. Thus, a completely explicit set of models can be constructed by fleshing out the initial models to produce the set:

△    ¬○
¬△   ○

where there is no longer any need for square brackets because all the elements in the models have been exhaustively represented. The key to understanding implicit information is accordingly the process of fleshing out models explicitly, which is governed by two principles: first, when an element has been exhaustively represented (as shown by square brackets) in one or more models, add its negation to any other models; second, when a proposition has not been exhaustively represented, then add both it and its negation to separate models formed by fleshing out any model in which it does not occur. (Only the first principle is needed to flesh out the models of the disjunction above.)

(5) The epistemic status of a model can be represented by a propositional annotation; for example, a model represents a real possibility, a counterfactual state of affairs, or a deontic state.

A model that does not contain propositional annotations, that is, a model based on the first two assumptions above, represents a set of possible states of affairs, which contains an infinite number of possibilities (Barwise, 1993). Hence, the model above of the assertion "The circle is on the right of the triangle" corresponds to infinitely many possibilities; for example, the model is not specific about the distance apart of the two shapes. Any potential counterexample to a conclusion must be consistent with the premises, but the model itself does not enable the premises to be uniquely reconstructed. Hence, in verbal reasoning, there must be an independent record of the premises, which is assumed to be the linguistic representation from which the models are constructed. This record also allows the inferential system to ascertain just which aspects of the world the model represents; for example, a given model may, or may not, represent the distances apart of objects, but inspection of the model alone does not determine whether it represents distance. Experimental evidence bears out the psychological reality of both linguistic representations and mental models (see Johnson-Laird, 1983).

Models with propositional annotations compress sets of states of affairs in a still more powerful way: a single model now represents a finite set of alternative sets of situations. This aspect of mental models plays a crucial role in the account of syllogistic reasoning and reasoning with multiple quantifiers. For example, syllogistic premises of the form:

All the A are B
All the B are C

call for one model in which the number of As is small but arbitrary:

[[a] b] c
[[a] b] c
. . .

As are exhaustively represented in relation to Bs, Bs are exhaustively represented in relation to Cs, Cs are not exhaustively related, and the three dots designate implicit individuals of some other sort. This single model supports the conclusion:

All the A are C

and there are no counterexamples. The initial model, however, corresponds to eight distinct sets of possibilities depending on how the implicit individuals are fleshed out explicitly. There may, or may not, be individuals of each of the three following sorts:

individuals who are not-a, not-b, not-c
individuals who are not-a, not-b but c
individuals who are not-a, but b and c

These three binary contrasts accordingly yield eight alternatives, and each of them is consistent with an indefinite number of possibilities depending on the actual numbers of individuals of the different sorts (see also Garnham, 1993). In short, eight distinct potentially infinite sets have been compressed into a single model, which is used for the inference.

The theory of reasoning based on mental models makes three principal predictions. First, the greater the number of models that an inference calls for, the harder the task will be. This prediction calls for a theoretical account of the models postulated for a particular domain. Such accounts typically depend on independently motivated psycholinguistic principles; for example, negative assertions bring to mind the affirmative propositions that are denied (Wason, 1965). Second, erroneous conclusions will tend to be consistent with the premises rather than inconsistent with them. Reasoners will err because they construct some of the models of the premises - typically, just one model of them - and overlook other possible models. This prediction can be tested without knowing the detailed models postulated by the theory: it is necessary only to determine whether or not erroneous conclusions are consistent with the premises. Third, knowledge can influence the process of deductive reasoning: subjects will search more assiduously for alternative models when a putative conclusion is unbelievable than when it is believable.

The first two of these predictions have been corroborated experimentally for all the main domains of deduction (for a review, see Johnson-Laird & Byrne, 1991; and for a reply to commentators, see Johnson-Laird & Byrne, 1993). The third prediction has been corroborated in the only domain in which it has so far been tested, namely, syllogistic reasoning (see, for example, Oakhill, Johnson-Laird, & Garnham, 1989). In contrast, theories of deduction

based on formal rules of inference exist only for spatial reasoning and reasoning based on sentential connectives (e.g. Rips, 1983; Braine, Reiser, & Rumain, 1984). Where the model theory and the formal rule theories make opposing predictions, the evidence so far has corroborated the model theory.

3. The strength of an inference
By definition, inductive arguments are logically invalid, that is, their premises could be true but their conclusions false. Yet such arguments differ in their strength - some are highly convincing, others are not. These differences are an important clue to the psychology of inference. However, one needs to distinguish between the strength of an argument - the degree to which its premises, if true, support the conclusion - and the degree to which the conclusion is likely to be true in any case. An argument can be strong but its conclusion improbable because the argument is based on improbable premises. Hence, the probability of the premises is distinct from the strength of the argument. In principle, the probability of a conclusion should depend on both the probability of the premises and the strength of the argument. But, as we shall see, individuals are liable to neglect the second of these components.

Osherson, Smith, and Shafir (1986) in a ground-breaking analysis of induction explored a variety of accounts of inferential strength that boil down to three main hypotheses: (1) an inference is strong if, given an implicit assumption, schema or causal scenario, it is logically valid, that is, the inference is an enthymeme (cf. Aristotle); (2) an inference is strong if it corresponds to a deduction in reverse, such as argument from specific facts to a generalization of them (cf. Hempel, 1965); and (3) an inference is strong if the predicates (or arguments) in premises and conclusion are similar (cf. Kahneman & Tversky, 1972). Each hypothesis has its advantages and disadvantages, but their strong points can be captured in the following analysis, which we will develop in two stages. First, the present section of the paper will specify an abstract characterization of the objective strength of an argument - what in theory has to be computed in order to determine the strength of an inference (the theory at the "computational" level). Second, the next section of the paper will specify how in practice the mind attempts to assess the strength of an argument (the theory at the "algorithmic" level).

The relation between premises and conclusion in inductive inference is a semantic one, and it can be characterized abstractly by adopting the semantic approach to logic (see, for example, Barwise & Etchemendy, 1989). An assertion such as "The circle is on the right of the triangle" is, as we have seen, true in infinitely many different situations, that is, the distance apart of the two shapes can differ, as can their respective sizes, shapes, textures and so on. Yet in all of

these different states the circle is on the right of the triangle. Philosophers sometimes refer to these different states as "possible worlds" and argue that an assertion is true in infinitely many possible worlds. We leave to one side the issue of whether or not possible worlds are countably infinite. The underlying theory has led to a powerful, though controversial, account of the semantics of natural language (see, for example, Montague, 1974).

Armed with the notion of possible states of affairs, we can define the notion of the strength of an inference in the following terms: a set of premises, including implicit premises provided by general and contextual knowledge, lend strength to a conclusion according to two principles:

(1) The conclusion is true in at least one of the possible states of affairs in which the premises are true, that is, the conclusion is at least consistent with the premises. If there is no such state of affairs, then the conclusion is inconsistent with the premises: the inference has no strength whatsoever, and indeed there is a valid argument in favor of the negation of the conclusion.

(2) Possible states of affairs in which the premises are true but the conclusion false (i.e., counterexamples) weaken the argument. If there are no counterexamples, then the argument is maximally strong - the conclusion follows validly from the premises. If there are counterexamples, then the strength of the argument equals the proportion of states of affairs consistent with the premises in which the conclusion is also true.

This account has a number of advantages. First, it embraces deduction and induction within the same framework. What underlies deduction is the semantic principle of validity: an argument is valid if its conclusion is true in any state of affairs in which its premises are true. An induction increases semantic information and so its conclusion must be false in possible cases in which its premises are true. Hence, inductions are reverse deductions, but they are the reverse of deductions that throw semantic information away.

Second, the probability of any one distinct possible state of affairs (possible world) is infinitesimal, and so it is reasonable to assume that possible states of affairs are close to equi-possible. It follows that a method of integrating the area of a subset of states of affairs provides an extensional foundation for probabilities. The strength of an inference is accordingly equivalent to the probability of the conclusion given the premises. It is 1 in the case of a valid deduction, 0 in the case of a conclusion that is inconsistent with the premises, and an intermediate value for inductions. The two abstract principles, however, are not equivalent to the probability calculus: as we shall see, the human inferential system can attempt to assess the relevant proportions without necessarily using the probability calculus. Likewise, the principles have no strong implications for the correct interpretation of probability, which is a matter for self-conscious philosophical reflection.
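The two principles can be made concrete with a small sketch. It is only an illustration under simplifying assumptions that are not in the text: the infinitely many possible states of affairs are replaced by the finitely many truth assignments to a few atomic propositions, each treated as equi-possible, and the function name `strength` and the lambda encoding of premises are hypothetical.

```python
from fractions import Fraction
from itertools import product


def strength(atoms, premises, conclusion):
    """Proportion of premise-consistent states in which the conclusion holds.

    Returns None when no state satisfies the premises, since the
    proportion is then undefined.
    """
    # Enumerate every truth assignment to the atomic propositions.
    states = [dict(zip(atoms, values))
              for values in product([True, False], repeat=len(atoms))]
    # Keep only the states of affairs in which all premises are true.
    consistent = [s for s in states if all(p(s) for p in premises)]
    if not consistent:
        return None
    # Counterexamples are the consistent states left out of this list.
    supporting = [s for s in consistent if conclusion(s)]
    return Fraction(len(supporting), len(consistent))


atoms = ["p", "q"]
# Valid deduction: from "p and q" infer "p" (no counterexamples).
valid = strength(atoms, [lambda s: s["p"] and s["q"]], lambda s: s["p"])
# Induction: from "p or q" infer "p" (one counterexample in three states).
inductive = strength(atoms, [lambda s: s["p"] or s["q"]], lambda s: s["p"])
# Inconsistent conclusion: from "p" infer "not p".
inconsistent = strength(atoms, [lambda s: s["p"]], lambda s: not s["p"])
print(valid, inductive, inconsistent)  # prints: 1 2/3 0
```

The three cases reproduce the values named above: 1 for a valid deduction, 0 for a conclusion inconsistent with the premises, and an intermediate value for an induction.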

The principles are compatible with interpretations in terms of actuarial frequencies of events, equi-possibilities based on physical symmetry, and subjective degrees of belief (cf. Ramsey, 1926; Hintikka's, 1962, analysis of beliefs in terms of possibility; and for an alternative conception, see Shafer & Tversky's, 1985, discussion of "belief functions"). Hence, an argument (or a probability) may concern either a set of events or a unique event. Individuals who are innumerate may not assign a numerical degree of certainty to their conclusion, and even numerate individuals may not have a tacit mental number representing their degree of belief. Individuals' beliefs do differ in subjective strength, but it does not follow that such differences call for a mental representation of numerical probabilities. An alternative conception of "degrees of belief" might be based on analogue representations (cf. Hintzman, Nozawa, & Irmscher, 1982), or on a system that permitted only partial rankings of strengths, such as one that recorded the relative ease of constructing different classes of models.

Third, the account is compatible with semantic information. The semantic information conveyed by a proposition, A, equals 1 - p(A), where p(A) denotes the probability of A (Bar-Hillel & Carnap, 1964; Johnson-Laird, 1983). If A is a complex proposition containing conjunctions, disjunctions, etc., its probability can be computed in the usual way according to the probability calculus. Hence, as argued elsewhere (Johnson-Laird, 1993), we can distinguish between deduction and induction on the basis of semantic information, that is, the possible states of affairs that a proposition rules out as false. Deduction does not increase semantic information, that is, the conclusion of a valid deduction rules out the same possibilities as the premises or else fewer possibilities, and so the conclusion must be true given that the premises are true. Induction increases semantic information, that is, the conclusion of an induction goes beyond the premises (including those tacit premises provided by general knowledge) by ruling out at least some additional possibility over and above the states of affairs that they rule out. This account captures all the standard cases of induction, such as the generalization from a finite set of observations to a universal claim (for a similar view, see Ramsey, 1926).

Fourth, the account is compatible with everyday reasoning and argumentation. One feature of such informal argumentation is that it typically introduces both a case for a conclusion and a case against it - a procedure that is so unlike a logical proof that many theorists have supposed that logic is useless in the analysis of everyday reasoning (e.g., Toulmin, 1958). The strength of an argument, however, can be straightforwardly analyzed in the terms described above: informal argumentation is typically a species of induction, which may veer at one end into deduction and at the other end into a creative process in which one or more premises are abandoned. Thus, a case for a conclusion may depend on several inductive arguments of differing strength.

The obvious disadvantage of the account is that it is completely impractical. No
one can consider all the infinitely many states of affairs consistent with a set of premises. No one can integrate all those states of affairs in which the conclusion is true and all those states of affairs in which it is false. Inference with quantifiers has no general decision procedure, that is, proofs for valid theorems can always be found in principle, but demonstrations of invalidity may get lost in the "space" of possible derivations. Inference with sentential connectives has a decision procedure, but the formulation of parsimonious conclusions that maintain semantic information is not computationally tractable, that is, as premises contain more atomic propositions, it takes exponentially longer to generate such conclusions (given that NP ≠ P). So how does this account translate into a psychological mechanism for assessing the strength of an argument? It is this problem that the theory of mental models is designed to solve.

4. Mental models and estimates of inferential strength

Philosophers have tried to relate probability and induction at a deep level (see, for example, Carnap, 1950), but as far as cognitive psychology is concerned they are overlapping rather than identical enterprises: there are probabilistic inferences that are not inductive, and there are inductive inferences that are not probabilistic. Here, for example, is a piece of probabilistic reasoning that is deductive:

    The probability of heads is 0.5.
    The probability of the date uppermost given heads is 1.
    The probability of the date uppermost given tails is 0.
    Hence, the probability of the date uppermost is 0.5.

This deduction makes explicit what is implicit in the premises, and it does not increase their semantic information. A more mundane example is as follows:

    If you park illegally within the walls of Siena, you will probably have your car towed.
    Phil has parked illegally within the walls of Siena.
    Phil will probably have his car towed.

This inference is also a valid deduction. Conversely, many inductive inferences are not probabilistic, that is, they lead to conclusions that people hold to be valid. For example, the engineers in charge at Chernobyl inferred initially that the explosion had not destroyed the reactor (Medvedev, 1990). Such an event was unthinkable from their previous experience, and they had no evidence to suppose that it had occurred. They were certain that the reactor was intact, and their
conviction was one of the factors that led to the delay in evacuating the inhabitants of the nearby town. Of course people do make probabilistic inductions, and it is necessary to explain their basis as well as the basis for probabilistic deductions. To understand the application of the model theory to the assessment of strength, it will be helpful to consider first how it accounts for deductions based on probabilities.

Critics sometimes claim that models can be used only to represent alternative states of affairs that are treated as equally likely. In fact, there is no reason to suppose that when individuals construct or compare models they take each model to be equally likely. To illustrate the point, consider an example of a deduction leading to a probabilistic conclusion:

    Kropotkin is an anarchist.
    Most anarchists are bourgeois.
    Probably, Kropotkin is bourgeois.

The quantifier "most" calls for a model that represents a proportion (see Johnson-Laird, 1983, p. 137). Thus, a model of the second premise takes the form:

    [a]  b
    [a]  b
    [a]  b
    [a]
    ...

where the set of anarchists is exhaustively represented, that is, anarchists cannot occur in fleshing out the implicit model designated by the three dots. When the information in the first premise is added to this model, one possible model is:

    k  [a]  b
       [a]  b
       [a]  b
       [a]

in which Kropotkin is bourgeois. Another possible model is:

       [a]  b
       [a]  b
       [a]  b
    k  [a]

in which Kropotkin is not bourgeois. Following Aristotle, assertions of the form: probably S, can be treated as equivalent to: in most possible states of affairs, S. And in most possible states of affairs as assessed from models of the premises, Kropotkin is bourgeois. Hence, the inferential system needs to keep track of the relative frequency with which the two sorts of models occur. It will detect the greater frequency of models in which Kropotkin is bourgeois, and so it will deduce:

    Probably, Kropotkin is bourgeois.

Individuals who are capable of one-to-one mappings but who have no access to cardinal or ordinal numbers will still be able to make this inference. They have merely to map each model in which S occurs one-to-one with each model in which S does not occur, and, if there is a residue, it corresponds to the more probable category. Likewise, there are many ways in principle in which to estimate the relative frequencies of the two sorts of model - from random sampling with replacement to systematic explorations of the "space" of possible models.

The only difference in induction is that information that goes beyond the premises (including those in tacit knowledge) is added to models on the basis of various constraints (see Johnson-Laird, 1983). The strength of an inference depends, as we have seen, on the relative proportions of two sorts of possible states of affairs consistent with the premises: those in which the conclusion is true and those in which it is false. Reasoners can estimate these proportions by constructing models of the premises and attending to the proportions with which the two sorts of models come to mind, and perhaps to the relative ease of constructing them. For example, given that Evelyn fell (without a parachute) from an airplane flying at a height of 2000 feet, then most individuals have a prior knowledge that Evelyn is likely to be killed, but naive individuals who encounter such a case for the first time can infer the conclusion. The inference is strong, but not irrefutable. They may be able to imagine cases to the contrary, for example, Evelyn falls into a large haystack, or a deep snow drift. But, in constructing models (of sets of possibilities), those in which Evelyn is killed will occur much more often than those in which Evelyn survives - just as models in which Kropotkin is bourgeois outnumber those in which he is not. Insofar as individuals share available knowledge, their assessments of probabilities should be consistent.

This account is compatible with the idea of estimating likelihoods in terms of scenarios, which was proposed by Tversky and Kahneman (1973, p. 229), and it forms a bridge between the model theory and the heuristic approach to judgements of probability. Estimates of the relative proportions of the two sorts of models - those in which a conclusion is true and those in which it is false - will
be rudimentary, biased and governed by heuristics. In assessing outcomes dependent on sequences of events, models must allow for alternative courses of events. They then resemble so-called "event trees", which Shafer (1993) argues provide a philosophical foundation to probability and its relations to causality. Disjunctive alternatives, however, are a source of difficulty both in deduction (see, for example, Johnson-Laird & Byrne, 1991) and in choice (see, for example, Shafir & Tversky, 1992).

5. Some empirical consequences of the theory

The strength of an argument depends on the relation between the premises and the conclusion, and, in particular, on the proportion of possibilities compatible with the premises in which the conclusion is true. This relation is not in general a formal or syntactic one, but a semantic one. It takes work to estimate the strength of the relation, and the theory yields a number of predictions about making and assessing inductive inferences. The main predictions of the theory are as follows:

First, arguments - especially in daily life - do not wear their logical status on their sleeves, and so individuals will tend to approach deductive and inductive arguments alike. They will tend to construct one or two models, draw a conclusion, and be uncertain about whether it follows of necessity. They will tend to confuse an inductive conclusion, that is, one that could be true given the premises, with a deductive conclusion, that is, one that must be true given the premises.

Second, envisioning models, which each correspond to a class of possibilities, is a crude method, and, because of the limited processing capacity of working memory, many models are likely never to be envisaged at all. The process will be affected by several constraints. In particular, individuals are likely to seek the most specific conclusion consistent with the premises (see Johnson-Laird, 1993), they are likely to seek parsimonious conclusions (see Johnson-Laird & Byrne, 1991), and they are likely to be constrained by the availability of relevant knowledge (Tversky & Kahneman, 1973). The model theory postulates a mechanism for making knowledge progressively available. Reasoners begin by trying to form a model of the current situation, and the retrieval of relevant knowledge is easier if they can form a single model containing all the relevant entities. Once they have formed an initial model, knowledge becomes available to them in a systematic way. They manipulate the spatial or physical aspects of the situation, that is, they manipulate the model directly by procedures corresponding to such changes. Next, they make more abstract conceptual manipulations, for example, they consider the properties of superordinate concepts of entities in the model. Finally, they make still more abstract inferences based on introducing
relations retrieved from models of analogous situations (cf. Gentner, 1983). Consider the following illustration:

    Arthur's wallet was stolen from him in the restaurant.
    The person charged with the offense was outside the restaurant at the time of the robbery.
    What follows?

Reasoners are likely to build an initial model of Arthur inside the restaurant when his wallet is stolen and the suspect outside the restaurant at that time. They will infer that the suspect is innocent. They may then be able to envisage the following sort of sequence of ideas from their knowledge about the kinds of things in the model:

(1) Physical and spatial manipulations: The suspect leant through the window to steal the wallet. The suspect stole the wallet as Arthur was entering the restaurant, or ran in and out of the restaurant very quickly (creative inferences that, in fact, are contrary to the premises).

(2) Conceptual manipulations: The suspect had an accomplice - a waiter, perhaps - who carried out the crime (theft is a crime, and many crimes are committed by accomplices).

(3) Analogical thinking: The suspect used a radio-controlled robot to sneak up behind Arthur to take the wallet (by analogy with the use of robots in other "hazardous" tasks).

In short, the model theory predicts that reasoners begin by focusing on the initial explicit properties of their model of a situation, and then they attempt to move away from them, first by conceptual operations, and then by introducing analogies from other domains. It is important to emphasize that the order of the three sorts of operations is not inflexible, and that particular problems may elicit a different order of operations. Nevertheless, there should be a general trend in moving away from explicit models to implicit possibilities.

Third, reasoners are also likely to be guided by other heuristics, which have been extensively explored by Tversky and Kahneman, and their colleagues. These heuristics can be traced back to Hume's seminal analysis of the connection between ideas: "there appear to be only three principles of connexion between ideas, namely, Resemblance, Contiguity in time or place, and Cause or Effect" (Hume, 1748, Sec. III). Hence, semantic similarity between the premises and the conclusion, and the causal cohesiveness between them, will influence probabilistic judgements. Such factors may even replace extensional estimates based on models.
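An extensional estimate based on models need not involve numbers at all. The sketch below illustrates the one-to-one mapping described earlier for assertions of the form "probably S": models in which S holds are paired off against models in which it does not, and any residue marks the more probable category. The function name `more_probable`, the dictionary encoding of a model, and the particular set of four Kropotkin models are assumptions made for illustration, not part of the theory's specification.

```python
def more_probable(models, holds):
    """Pair models for and against S one-to-one and report the residue."""
    supporting = [m for m in models if holds(m)]
    opposing = [m for m in models if not holds(m)]
    # Remove one model from each pile at a time; no counting is involved.
    while supporting and opposing:
        supporting.pop()
        opposing.pop()
    if supporting:
        return "probably S"
    if opposing:
        return "probably not S"
    return "no residue"


# Hypothetical encoding: four anarchists, three of them bourgeois,
# and Kropotkin could be any one of the four.
kropotkin_models = [
    {"bourgeois": True},
    {"bourgeois": True},
    {"bourgeois": True},
    {"bourgeois": False},
]
verdict = more_probable(kropotkin_models, lambda m: m["bourgeois"])
print(verdict)  # prints: probably S
```

Because the procedure only pairs and discards, it yields a ranking of the two categories rather than a numerical probability, which is all that a conclusion of the form "probably S" requires.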

Fourth, individuals should be inferential satisficers, that is, if they reach a credible (or desirable) conclusion, or succeed in constructing a model in which such a conclusion is true, they are likely to accept it, and to overlook models that are counterexamples. Conversely, if they reach an incredible (or undesirable) conclusion, they are likely to search harder for a model of the premises in which it is false. This propensity to satisfice will in turn lead them to be overconfident in their conclusions, especially in the case of arguments that do have alternative models in which the conclusion is false. Individuals are indeed often overconfident in their inductive judgements, and Gigerenzer, Hoffrage, and Kleinbölting (1991) have propounded a theory of "probabilistic mental models" to account for this phenomenon. These are long-term representations of probabilistic cues and their validities (represented in the form of conditional probabilities). These authors propose that individuals use the single cue with the strongest validity and do not aggregate multiple cues, and that their confidence derives from the validity of this cue. They report corroboratory evidence from their experiments on the phenomenon of overconfidence, that is, rated confidence tends to be higher than the actual percentage of correct answers. As Griffin and Tversky (1992) point out, however, overconfidence is greater with harder questions and this factor provides an alternative account of Gigerenzer et al.'s results. In contrast, the model theory proposes that the propensity to satisfice should lead subjects to overlook models in the case of multiple-model problems, and so they should tend to be more confident than justified in the case of harder problems. With easier one-model problems, the error and its correlated overconfidence cannot occur. But should subjects be underconfident in such cases, as is sometimes observed? One factor that may be responsible for the effect in repeated-measure designs is the subjects' uncertainty about whether or not there might be other models in a one-model case.

Overconfidence in inductive inference occurred in an unpublished study by Johnson-Laird and Anderson, in which subjects were asked to draw initial conclusions from such premises as:

    The old man was bitten by a poisonous snake.
    There was no known antidote available.

They tend initially to infer that the old man died. Their confidence in such conclusions was moderately high. They were then asked whether there were any other possibilities and they usually succeeded in thinking of two or three. When they could go no further, they were asked to rate again their initial conclusions, and showed a reliable decline in confidence. Hence, by their own lights, they were initially overconfident, though by the end of the experiment they may have been underconfident as a result of bringing to mind remote scenarios.

Finally, individuals are likely to focus on what is explicit in their initial models and thus be susceptible to various "focusing effects" (see Legrenzi, Girotto, &
Johnson-Laird, 1993). These effects include difficulty in isolating genuinely diagnostic data (see, for example, Beyth-Marom & Fischhoff, 1983; Doherty, Mynatt, Tweney, & Schiavo, 1979), testing hypotheses in terms of their positive instances (Evans, 1989; Klayman & Ha, 1987), neglect of base rates in certain circumstances (Tversky & Kahneman, 1982), and effects of how problems in deductive and inductive reasoning are framed (e.g., Johnson-Laird & Byrne, 1989; Tversky & Kahneman, 1981). Focusing is also likely to lead to too great a reliance on the credibility of premises (and conclusion) and too little on the strength of the argument, that is, the relation between the premises and conclusion. Reasoners will build an initial model that makes explicit the case for a conclusion, and then fail to adjust their estimates of its likelihood by taking into account alternative models (see also Griffin & Tversky, 1992, for an analogous view). Conversely, any factor that makes it easier for individuals to flesh out explicit models of the premises should improve performance.

6. Rules for probabilistic thinking

An obvious potential basis for probabilistic reasoning is the use of rules of inference, such as:

    If q & r then s (with probability p)
    If q then s (with probability p')

Numerous AI programs include rules of this sort (see, for example, Holland, Holyoak, Nisbett, & Thagard, 1986; Michalski, 1983; Winston, 1975). The most plausible psychological version of this idea is due to Collins and Michalski (1989). They argue that individuals construct mental models on the basis of rules of inference, and that these rules have numerical parameters for such matters as degree of certainty. They have not tried to formalize all patterns of plausible inference, but rather some patterns of inference that make up a core system of deductions, analogies and inductions. They admit that it is difficult to use standard psychological techniques to test their theory, which is intended to account only for people's answers to questions. It does not make any predictions about the differences in difficulty between various sorts of inference, and, as they point out (p. 7), it does not address the issue of whether people make systematic errors. Hence, their main proposed test consists in trying to match protocols of arguments against the proposed forms of rules. Pennington and Hastie (1993) report success in matching these patterns to informal inferences of subjects playing the part of trial jurors. But, as Collins and Michalski mention, one danger is that subjects' protocols are merely rationalizations for answers arrived at by other means. In sum, AI rule systems for induction have not yet received decisive corroboration.
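For concreteness, rules of the form "If q & r then s (with probability p)" might be represented as in the sketch below. It is an illustration only, not Collins and Michalski's system or any particular AI program: the `Rule` class, the numerical values standing in for p and p', and the policy of preferring the most specific applicable rule are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    antecedents: frozenset  # propositions that must all be known
    consequent: str
    probability: float      # hypothetical certainty parameter


def conclude(rules, facts, goal):
    """Probability attached to the most specific applicable rule, or None."""
    applicable = [r for r in rules
                  if r.consequent == goal and r.antecedents <= facts]
    if not applicable:
        return None
    # Assumed policy: the rule with the most antecedents takes precedence.
    return max(applicable, key=lambda r: len(r.antecedents)).probability


rules = [
    Rule(frozenset({"q", "r"}), "s", 0.9),  # If q & r then s (with probability p)
    Rule(frozenset({"q"}), "s", 0.6),       # If q then s (with probability p')
]
print(conclude(rules, {"q"}, "s"))       # prints: 0.6
print(conclude(rules, {"q", "r"}, "s"))  # prints: 0.9
```

Such rules operate on the form of the premises alone: the lookup never consults what q, r or s mean.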

In contrast, another sort of rule theory has much more empirical support. This theory appeals to the idea that individuals have a tacit knowledge of such rules as the "law of large numbers" (see Nisbett, 1993; Smith, Langston, & Nisbett, 1992). Individuals apply the rules to novel materials, mention them in justifying their responses, benefit from training with them, and sometimes overextend their use of them. The rules in AI programs are formal and can be applied to the representation of the abstract logical form of premises. The law of large numbers, however, is not a formal rule of inference. It can be paraphrased as follows:

    The larger the sample from a population, the less its mean is likely to diverge from the population mean.

Aristotle would not have grasped such notions as sample, mean and population, but he would have been more surprised by a coin coming up heads ten times in a row than a coin coming up heads three times in a row. He would thus have had a tacit grasp of the law that he could make use of in certain circumstances. The law has a rich semantic content that goes well beyond the language of logical constants, and it is doubtful whether it could be applied to the logical form of premises. On the contrary, it is likely to be applied only when one has grasped the content of a problem, that is, constructed a model that makes explicit that it calls for an estimate based on an example.

Individuals are likely to hold many other general principles as part of their beliefs about probability. For instance, certain devices produce different outcomes on the basis of chance, that is, at approximately equal rates and in unpredictable ways; if a sample from such a device is deviant, things are likely to even up in the long run (gambler's fallacy). Such principles differ in generality and validity, but they underlie the construction of many probabilistic judgements. The fact that individuals can be taught correct laws and that they sometimes err in over-extending them tells us nothing about the mental format of the laws. They may take the form of schemas or content-specific rules of inference, but they could be represented declaratively. Likewise, how they enter into the process of thinking - the details of the computations themselves - is also unknown. There is, however, no reason to oppose them to mental models. They seem likely to work together in tandem, just as conceptual knowledge must underlie the construction of models.

7. Conclusions

The principal thesis of the present paper is that general knowledge and beliefs, along with descriptions of situations, lead to mental models that are used to assess probabilities. Most cognitive scientists agree that humans construct mental
representations, and many may suspect that the model theory merely uses the words "mental model" where "mental representation" would do. So, what force, if any, is there to the claim that individuals think probabilistically by manipulating models? The answer, which has been outlined here, is twofold. First, the representational principles of models allow sets of possibilities to be considered in a highly compressed way, and even in certain cases sets of sets of possibilities. Hence, it is feasible to assess probability by estimating possible states of affairs within a general framework that embraces deduction, induction and probabilistic thinking. This framework provides an extensional foundation of probability theory that is not committed a priori to either a frequency or degrees-of-belief interpretation, which are both equally feasible on this foundation. Second, the model theory makes a number of predictions based on the distinction between explicit and implicit information, and on the processing limitations of working memory. Such predictions, as the study of deduction has shown, are distinct from those made by theories that postulate only representations of the logical form of assertions.

References

Aristotle (1984). The complete works of Aristotle, edited by J. Barnes, 2 vols. Princeton: Princeton University Press.
Bar-Hillel, Y., & Carnap, R. (1964). An outline of a theory of semantic information. In Y. Bar-Hillel (Ed.), Language and information. Reading, MA: Addison-Wesley.
Barwise, J. (1993). Everyday reasoning and logical inference. Behavioral and Brain Sciences, 16, 337-338.
Barwise, J., & Etchemendy, J. (1989). Model-theoretic semantics. In M.I. Posner (Ed.), Foundations of cognitive science. Cambridge, MA: MIT Press.
Beyth-Marom, R., & Fischhoff, B. (1983). Diagnosticity and pseudodiagnosticity. Journal of Personality and Social Psychology, 45, 1185-1197.
Braine, M.D.S., Reiser, B.J., & Rumain, B. (1984). Some empirical justification for a theory of natural propositional logic. In The psychology of learning and motivation (Vol. 18). New York: Academic Press.
Byrne, R.M.J. (1989). Suppressing valid inferences with conditionals. Cognition, 31, 61-83.
Carnap, R. (1950). Logical foundations of probability. Chicago: Chicago University Press.
Collins, A.M., & Michalski, R. (1989). The logic of plausible reasoning: A core theory. Cognitive Science, 13, 1-49.
Craik, K. (1943). The nature of explanation. Cambridge, UK: Cambridge University Press.
Doherty, M.E., Mynatt, C.R., Tweney, R.D., & Schiavo, M.D. (1979). Pseudodiagnosticity. Acta Psychologica, 43, 11-21.
Evans, J.St.B.T. (1989). Bias in human reasoning: Causes and consequences. London: Erlbaum.
Fisher, A. (1988). The logic of real arguments. Cambridge, UK: Cambridge University Press.
Garnham, A. (1987). Mental models as representations of discourse and text. Chichester: Ellis Horwood.
Garnham, A. (1993). A number of questions about a question of number. Behavioral and Brain Sciences, 16, 350-351.
Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy. Cognitive Science, 7, 155-170.
Gigerenzer, G., Hoffrage, U., & Kleinbölting, H. (1991). Probabilistic mental models: A Brunswikian theory of confidence. Psychological Review, 98, 506-528.
Griffin, D., & Tversky, A. (1992). The weighing of evidence and the determinants of confidence. Cognitive Psychology, 24, 411-435.
Hempel, C.G. (1965). Aspects of scientific explanation. New York: Macmillan.
Hintikka, J. (1962). Knowledge and belief: An introduction to the logic of the two notions. Ithaca: Cornell University Press.
Hintzman, D.L., Nozawa, G., & Irmscher, M. (1982). Frequency as a nonpropositional attribute of memory. Journal of Verbal Learning and Verbal Behavior, 21, 127-141.
Holland, J.H., Holyoak, K.J., Nisbett, R.E., & Thagard, P. (1986). Induction: Processes of inference, learning, and discovery. Cambridge, MA: MIT Press.
Hume, D. (1748/1988). An enquiry concerning human understanding. La Salle, IL: Open Court.
Johnson-Laird, P.N. (1983). Mental models: Towards a cognitive science of language, inference and consciousness. Cambridge, UK: Cambridge University Press.
Johnson-Laird, P.N. (1993). Human and machine thinking. Hillsdale, NJ: Erlbaum.
Johnson-Laird, P.N., & Bara, B. (1984). Syllogistic inference. Cognition, 16, 1-61.
Johnson-Laird, P.N., & Byrne, R.M.J. (1989). Only reasoning. Journal of Memory and Language, 28, 313-330.
Johnson-Laird, P.N., & Byrne, R.M.J. (1991). Deduction. Hillsdale, NJ: Erlbaum.
Johnson-Laird, P.N., & Byrne, R.M.J. (1993). Authors' response [to multiple commentaries on Deduction]: Mental models or formal rules? Behavioral and Brain Sciences, 16, 368-376.
Kahneman, D., & Tversky, A. (1972). Subjective probability: A judgment of representativeness. Cognitive Psychology, 3, 430-454.
Klayman, J., & Ha, Y.-W. (1987). Confirmation, disconfirmation and information in hypothesis testing. Psychological Review, 94, 211-228.
Legrenzi, P., Girotto, V., & Johnson-Laird, P.N. (1993). Focussing in reasoning and decision making. Cognition, 49, 37-66.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco: W.H. Freeman.
Medvedev, Z.A. (1990). The legacy of Chernobyl. New York: Norton.
Michalski, R.S. (1983). A theory and methodology of inductive learning. In R.S. Michalski, J.G. Carbonell, & T.M. Mitchell (Eds.), Machine learning: An artificial intelligence approach. Los Altos, CA: Morgan Kaufmann.
Montague, R. (1974). Formal philosophy: Selected papers. New Haven: Yale University Press.
Nisbett, R.E. (Ed.) (1993). Rules for reasoning. Hillsdale, NJ: Erlbaum.
Oakhill, J.V., Johnson-Laird, P.N., & Garnham, A. (1989). Believability and syllogistic reasoning. Cognition, 31, 117-140.
Osherson, D.N., Smith, E.E., & Shafir, E. (1986). Some origins of belief. Cognition, 24, 197-224.
Pennington, N., & Hastie, R. (1993). Reasoning in explanation-based decision making. Cognition, 49, 123-163.
Polk, T.A., & Newell, A. (1988). Modeling human syllogistic reasoning in Soar. In Tenth Annual Conference of the Cognitive Science Society (pp. 181-187). Hillsdale, NJ: Erlbaum.
Ramsey, F.P. (1926/1990). Truth and probability. In D.H. Mellor (Ed.), F.P. Ramsey: Philosophical papers. Cambridge, UK: Cambridge University Press.
Rips, L.J. (1983). Cognitive processes in propositional reasoning. Psychological Review, 90, 38-71.
Shafer, G. (1993). Using probability to understand causality. Unpublished MS, Rutgers University.
Shafer, G., & Tversky, A. (1985). Languages and designs for probability judgment. Cognitive Science, 9, 309-339.
Shafir, E., & Tversky, A. (1992). Thinking through uncertainty: Nonconsequential reasoning and choice. Cognitive Psychology, 24, 449-474.
Smith, E.E., Langston, C., & Nisbett, R.E. (1992). The case for rules in reasoning. Cognitive Science, 16, 1-40.
Thagard, P. (1989). Explanatory coherence. Behavioral and Brain Sciences, 12, 435-502.
Toulmin, S.E. (1958). The uses of argument. Cambridge, UK: Cambridge University Press.
Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5, 207-232.
Tversky, A., & Kahneman, D. (1981). The framing of decisions and the psychology of choice. Science, 211, 453-458.
Tversky, A., & Kahneman, D. (1982). Evidential impact of base rates. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), Judgments under uncertainty: Heuristics and biases. New York: Cambridge University Press.
Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review, 90, 293-315.
Wason, P.C. (1965). The contexts of plausible denial. Journal of Verbal Learning and Verbal Behavior, 4, 7-11.
Winston, P.H. (1975). Learning structural descriptions from examples. In P.H. Winston (Ed.), The psychology of computer vision. New York: McGraw-Hill.

10 Pretending and believing: issues in the theory of ToMM
Alan M. Leslie
Department of Psychology, Center for Cognitive Science, Rutgers University, Piscataway, NJ 08855-1179, USA

Abstract
Commonsense notions of psychological causality emerge early and spontaneously in the child. What implications does this have for our understanding of the mind/brain and its development? In the light of available evidence, the child's "theory of mind" is plausibly the result of the growth and functioning of a specialized mechanism (ToMM) that produces domain-specific learning. The failure of early spontaneous development of "theory of mind" in childhood autism can be understood in terms of an impairment in the growth and functioning of this mechanism. ToMM constructs agent-centered descriptions of situations or "metarepresentations". Agent-centered descriptions place agents in relation to information. By relating behavior to the attitudes agents take to the truth of propositions, ToMM makes possible a commonsense causal interpretation of agents' behavior as the result of circumstances that are imaginary rather than physical. Two early attitude concepts, pretends and believes, are discussed in the light of some current findings.

Dedication: This article is dedicated to the memory of Daniel Roth, my student, collaborator and friend who tragically lost his long struggle against cancer on April 17, 1993.

This paper has undergone a long gestation, various parts having been presented to the BPS Developmental Section Annual Conference, Coleg Harlech, September 1988; Trieste Encounters on Cognitive Science, Trieste, Italy, May 1989; International Workshop on Naturalized Epistemology, Cornell University, December 1989; International Conference on Cultural Knowledge and Domain Specificity, University of Michigan, Ann Arbor, October 1990; Society for Research on Child Development Biennial Meeting, Seattle, April 1991; and Inaugural Conference of the Rutgers University Center for Cognitive Science, Rutgers University, November 1991. I am grateful to participants and audiences at those meetings and also to colleagues and friends at the MRC Cognitive Development Unit for nurture and good nature.

1. Introduction
Consider the scenario in Fig. 1. Numerous studies (e.g., Baron-Cohen, Leslie, & Frith, 1985; Wimmer & Perner, 1983) have shown that, by about 4 years, children understand this scenario by attributing a (false) belief to Sally and predicting her behavior accordingly. Premack and Woodruff (1978) coined the term "theory of mind" for the ability, illustrated by this scenario, to predict, explain and interpret the behavior of agents in terms of mental states. Such findings raise the following question. How is the preschool child able to learn about mental states when these are unobservable, theoretical constructs? Or put another way: how is the young brain able to attend to mental states when they can be neither seen, heard nor felt?

[Figure 1 panels: Sally puts her marble in the basket; Sally goes away; Ann moves marble; "Where will Sally look for her marble?"]
Fig. 1. A standard false belief scenario that can be solved by 4-year-olds (after Baron-Cohen, Leslie, & Frith, 1985).

A general answer to the above question is that the brain attends to behavior and infers the mental state that the behavior issues from. For example, in the scenario in Fig. 2, Mother's behavior is talking to a banana. The task for a 2-year-old watching her is to infer that Mother PRETENDS (of) the banana (that) "it is a telephone". Mother's behavior described as a physical event - as one object in relation to another - is minimally interesting. The real significance of her behavior emerges only when mother is described as an agent in relation to information. As an agent, mother can adopt an attitude (of pretending) to the truth of a description ("it is a telephone") in regard to a given object (the banana). Entertaining this kind of intentional or agent-centered description requires computing a certain kind of internal representation. I have called this the "metarepresentation" or "M-representation" (Leslie, 1987; Leslie & Thaiss, 1992).

[Figure 2: Mother's behaviour: talking to a banana! Infer mental state: mother PRETENDS (of) the banana (that) "it is a telephone"]
Fig. 2. A pretend scenario that can be solved by 2-year-olds.

I shall explore the following assumption. Native to our mental architecture is a domain-specific processing stream adapted for understanding the behavior of agents. A major component of this system is a mechanism which computes the M-representation. I call this mechanism ToMM (theory of mind mechanism). Here are five guiding ideas in the theory of ToMM.

(1) The key to understanding the origins of theory of mind lies in time-pressured, on-line processing to interpret an agent's behavior in terms of underlying intentions. Early in development, human beings undertake the information-processing task of understanding the behavior of agents, not simply as a sequence of events, but as instantiating intentions in the broad sense, that is, as issuing from mental states. This processing task is time-pressured because agent-centered descriptions must be arrived at fast enough to keep up with the flow of behavior in a conversation or other interaction. This pressure will constrain the amount and types of information that can be taken into account and has had an adaptive evolutionary influence on the architecture of theory of mind processing.

(2) Descriptions of intentional states are computed by a specialized theory of mind mechanism (ToMM) which is post-perceptual, operates spontaneously, is domain specific, and is subject to dissociable damage - in the limit, modular. Information about behavior arrives through a number of different sensory channels and includes verbal utterances, so ToMM should operate post-perceptually. ToMM should be able to function relatively spontaneously since it has the job of directing the child's attention to mental states which, unlike behavior, cannot be seen, heard or felt. ToMM should also be able to function as a source of intuitions in reasoning about agents and thus be addressable centrally. ToMM is specifically concerned with "cognitive" properties of agents and employs specialized notions for this task. Finally, ToMM can be damaged or impaired independently of other processing systems (see below).

(3) ToMM employs a proprietary representational system which describes propositional attitudes. This property of ToMM is discussed in the theory of the M-representation to which I return below.

(4) ToMM forms the specific innate basis for our capacity to acquire a theory of mind. Perhaps the most important job ToMM has to do is to produce development within its designated domain and to produce it early, rapidly and uniformly without benefit of formal instruction. To this end, ToMM introduces the basic attitude concepts and provides intuitive insight into mental states early in life while encyclopedic knowledge and general problem-solving resources are limited.

(5) ToMM is damaged in childhood autism resulting in its core symptoms and impairing these children's capacity to acquire a theory of mind. Leslie and Roth (1993) have recently reviewed evidence supporting this idea (see also Frith, Morton, & Leslie, 1991; Leslie & Frith, 1990).

2. Pretending and ToMM
One of the easily observed products of ToMM is the capacity to pretend. Spontaneous pretending emerges between 18 and 24 months of age. Around this time, the child begins to entertain deliberate suppositions about simple imaginary situations: for example, she pretends that a banana is a telephone or that an empty cup contains juice. The ability is productive and does not remain limited to a single or to a few special topics, is exercised playfully and communicatively without ulterior motive (e.g., to deceive), permits sharing of information about imaginary situations with other people, and encompasses the ability to understand other people's communicative pretence. Due regard must be paid to the question of distinguishing pretence from other phenomena which are superficially similar at a behavioral level (e.g., functional play, acting from a mistaken belief, play in animals). I discussed some of the more important of these distinctions in Leslie

(1987) and pointed out that the aim of previous workers to develop a behavioral definition of pretence was unattainable. I proposed instead a theoretical definition in terms of underlying cognitive processes.

There are four critical features of early pretence that a cognitive model must capture. The first requirement is to account for the fundamental forms of pretence. There are three of these, one for each of the basic (external) semantic relations between a representation and what it represents (viz., reference, truth and existence). In object substitution pretence, a given real object, for example a banana, is pretended to be some other object, for example a telephone. Such pretence requires a decoupling of the internal representation for telephones from its normal reference so that it functions in context as if it referred to a member of some arbitrary class of object, in this case a banana. Second, in properties pretend, a given object or situation is pretended to have some property typically it does not have, for example a dry table is pretended to be wet. Here the pretence decouples the normal effects of predicating wetness in the internal representation. And thirdly, imaginary objects can be pretended to have existence, for example that there is a hat on teddy's head. Here the pretence affects the normal existence presuppositions in the internal representation. A cognitive model of pretence has to explain why there are exactly three fundamental forms and why there are exactly these three.1

1 For reasons which are not clear, Perner (1991) writes as if the fact that the fundamental forms can be combined into more complex forms - for example, pretending that teddy's imaginary hat has a hole in it - should be a source of embarrassment to my theory. In fact, the possibility of "complex" pretence springs readily from the assumed combinatorial properties of metarepresentation. A further misunderstanding is to suppose that the only way the child could possibly handle the three aspects of opacity is by explicitly theorizing about reference, truth and existence - that is, by theorizing about the general nature of representation. Leslie (1987) did not propose any such thing. Indeed the whole thrust of my proposals was to avoid such a commitment by describing processing machinery that would achieve a similar result implicitly.

Leslie (1987) argued that the fundamental forms of pretence reflect the semantic phenomena of opacity (Quine, 1961). Opacity may be roughly described as the result of the "suspension" of the semantic relations of reference, truth and existence that occurs when a representation is placed in an intentional context, such as a mental state report or counterfactual reasoning. To explain the isomorphism between the three fundamental forms of pretence (behavioral phenomena) and the three aspects of opacity (semantic phenomena), I proposed the existence of a certain kind of internal representation. Representational structures, having whatever properties give rise to opacity phenomena, must be a part of the human mind/brain from its infancy onwards.

The second critical feature of the development of pretence that a cognitive theory must account for is related to the first. Rather than appearing in three

discrete stages, the fundamental forms of pretence emerge together in a package. Given a single mechanism with the right properties, a cognitive model can capture both the character of the three fundamental forms of pretence and their emergence as a package (see Leslie, 1987, for discussion).

The third crucial feature of pretence to be explained is, when the child first becomes able to pretend herself (solitary pretence), why does she also gain the ability to understand pretence-in-others? Traditional investigations overlooked this startling fact. Understanding another person's behavior as a pretence can be studied as an information-processing task the child undertakes. For example, when mother says, "The telephone is ringing", and hands the child a banana, the 2-year-old, who is also undertaking a number of other complex information-processing tasks, such as building a catalogue of object kinds, analysing agents' goal-directed actions with instruments and acquiring a lexicon, is nonetheless neither confused about bananas, nor about mother's strange behavior, nor about the meaning of the word "telephone". Instead, in general, the child understands that mother's behavior - her gesturing and her use of language - relates to an imaginary situation which mother pretends is real. Again, we can account for the yoking in development between the capacity to pretend oneself and the capacity to understand pretence-in-others if we assume that a single mechanism is responsible for both.

Finally, a cognitive account must address the fact that pretence is related to particular aspects of the here and now in specific ways. This is another critical feature of the early capacity for pretence that a cognitive model must capture. For example, it is this banana that mother pretends is a telephone, not bananas in general nor that banana over there. The pretended truth of the content, "it is a telephone", is anchored in a particular individual object in the here and now. This is true both for solitary pretence and in understanding the pretence of other people.

These four critical features of pretence - the three fundamental forms, their emergence as a package, the yoking of solitary pretence with the ability to understand pretence-in-others, and the anchoring of pretend content in the here and now - can be succinctly explained as consequences of the data structure called the "metarepresentation". This representational system provides precisely the framework that is needed to deploy another attitude concept closely related to pretending, namely, the concept of believing. Thus, the same representational system is required if the child is to interpret mother's behavior in terms of mother BELIEVES (of) the banana (that) "it is a telephone". Pretending and believing, though closely related attitude concepts, are, nevertheless, different concepts and their successful deployment can make rather different demands on problem-solving resources. I shall consider the emergence of the concept, believing, in the second part of this article.
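The notation used in the text, as in mother PRETENDS (of) the banana (that) "it is a telephone", reads naturally as a record with an agent, an attitude, an anchor and a quoted content. The following Python fragment is a minimal sketch of that reading and nothing more. The class name MRepresentation, the field names and the render method are hypothetical labels chosen for illustration, and holding the content as a plain string is an assumption made only to mark that it is quoted rather than asserted.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MRepresentation:
    """Illustrative record: AGENT ATTITUDE (of) ANCHOR (that) "CONTENT"."""
    agent: str     # who takes the attitude (the child herself or another person)
    attitude: str  # the attitude concept, e.g. PRETENDS or BELIEVES
    anchor: str    # the particular real object in the here and now
    content: str   # the quoted description, held as data and not asserted

    def render(self) -> str:
        # Reproduce the notation used in the text.
        return f'{self.agent} {self.attitude} (of) {self.anchor} (that) "{self.content}"'


# Understanding pretence-in-others: mother is the agent.
mothers_pretence = MRepresentation("mother", "PRETENDS", "the banana", "it is a telephone")

# Solitary pretence: the same structure with a different agent slot.
own_pretence = replace(mothers_pretence, agent="I", attitude="PRETEND")

# Believing: only the attitude slot changes.
mothers_belief = replace(mothers_pretence, attitude="BELIEVES")

print(mothers_pretence.render())
print(own_pretence.render())
print(mothers_belief.render())
```

On this sketch, one record type serves whether the agent slot holds the child herself or another person, which is one way to picture why a single mechanism could yoke solitary pretence to pretence-in-others. Moving from pretending to believing changes only the attitude slot.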

2.1. The metarepresentation
Leslie (1987, 1988c; Leslie & Frith, 1990) outlined some general ideas on how a mechanism like ToMM could account for the above. Three different types of representation were distinguished. "Primary" representations are literal, transparent descriptions of the world. "Decoupled" representations are opaque versions of primary representations. The decoupling of a representation allows a processor to treat the representation as a "report" of information instead of merely reacting to it. This in turn allows the (decoupled) representation to be placed within a larger relational structure in which an attitude to the truth of the "report" can be represented. This larger relational structure is built around a set of primitive relations - the attitude concepts or "informational relations". These relations tie together the other components. This entire relational structure is the third type of representation and is referred to as the "metarepresentation" (or, to distinguish it from Perner's later use of the term, the "M-representation").

ToMM employs the metarepresentation. Following Marr (1982), we can say that this system makes explicit four kinds of information. Descriptions in this system identify:
(1) an agent
(2) an informational relation (the attitude)
(3) an aspect of the real situation (the anchor)
(4) an "imaginary" situation (the description)
such that a given agent takes a given attitude to the truth of a given description in relation to a given anchor. The informational relation is the pivotal piece of information in the sense that it ties together the other three pieces of information into a relational structure and identifies the agent's attitude. The direct object of the identified attitude is (the truth of) a proposition or description (typically of an "imaginary" situation) in relation to a "real" object or state of affairs. Not counting the implicit truth value, informational relations are thus three-place relations (Leslie, 1987).

What does an informational relation represent? Perner (1991) has made a great deal of the fact that I borrowed the term "metarepresentation" from Pylyshyn (1978) for whom it meant a "representation of the representational relation". This seemed an innocuous enough phrase to me then, and still does, as long as one leaves it as an empirical issue exactly how a given "representational relation" is represented. But for Perner the term can only mean that the child possesses a certain kind of "representational theory of mind" (RTM) in which mental states are individuated by form rather than by meaning. I see no reason to accept this stricture. In any case, Leslie (1987) simply assumed that very young children did not have access to an RTM in this sense. The model of metarepresentation I outlined was designed to account for the very young child's capacities by attributing more modest knowledge in which, for example, "representational relations", such as reference and truth, are handled implicitly, while "representational relations" such as pretending and believing are handled explicitly. As we shall see later, there is no evidence available to suggest that preschool children have an RTM in Perner's sense.

The critical point about what informational relations represent is that they denote the kind of relation that can hold between an agent and the truth of a description (applied to a given state of affairs). This kind of relation immediately determines a class of notion different from the other kinds of relation that feature in early cognition, for example spatial and mechanical relations, and forms the conceptual core of commonsense theory of mind. My assumption is that there is a small set of primitive informational relations available early on, among them BELIEVE and PRETEND. These notions are primitive in the sense that they cannot be analyzed into more basic components such that the original notion is eliminated. While one can paraphrase "John believes that p is true" in a number of ways, one does not thereby eliminate the notion believes. For example, one can say "p is true for John", but that just gives another way (an alternate set of sounds for) saying "John believes that p is true".

Perner (1991) adopts part of the above theory of pretence, namely the notion of decoupling, though he discusses it in terms of "models". According to this view, pretence emerges when the child can entertain multiple "models" of the world instead of just a single model that is possible during the first year. Representations of different times and places apparently constitute different "models". It seems unlikely that infants during the first year cannot relate past states of affairs to present ones but, in any case, Perner's notion of "model" does not say much about pretence. The opacity properties of pretence are not illuminated by tense and location "models" because the content of pretence is opaque in the here and now. This kind of opacity is also what is relevant to believing. Consider now a "Zaitchik photograph" (one that has gone out of date). This photograph is only a representation of a past situation and not of the current situation. Contrast this with the case in which someone assumes (wrongly) that the photograph is a photograph of the current situation. This is now quite different. The distinguishing feature in this latter case is clearly not the representation itself which remains the same, but the fact that an agent believes that the photograph depicts a current situation.

Perner's model notion fails to address the relationship of the agent to the "model". Perner (1988) says that for the child the agent is simply "associated" with the model. But an associative relationship can also hold between, for example, can-openers and kitchens without the child ever thinking that can-openers pretend or believe anything about kitchens. Perner (1991) at times

attributes a behaviorist notion of pretence to the young child such that the agent who pretends that p is understood as acting as if p were true. This proposal is only useful if we are also told how the child views the relation between p and the agent's behavior in the case in which p actually is true. If the relation between circumstances and behavior in the normal case is causal, is it also causal in the case of pretence? If so, how can imaginary circumstances be viewed as causal? How could the child learn about the causal powers of imaginary circumstances? The only solution to this dilemma, as far as I can see, involves some kind of mentalistic rather than behavioristic interpretation of the relation between the agent and p, that is, some kind of attitude notion. Finally, parity of argument demands that if we insist upon a behavioristic construal of pretence-understanding in the child, then we should also insist upon a behavioristic construal of false-belief-understanding. After all, falsely believing that p demands the interpretation acting as if p every bit as much (or every bit as little, depending upon point of view) as pretending that p. The fact that one child is a bit older than the other does not in itself constitute a compelling reason for treating the two cases in radically different ways.

2.2. Decoupling
The role of decoupling in the metarepresentation is to transform a primary, transparent internal representation into something that can function as the direct object of an informational relation. In the case of informational relations, as in the case of verbs of argument and attitude, the truth of the whole expression is not dependent upon the truth of its parts. This is a crucial part of the semantics of mental state notions and is what gives rise to the possibility of pretends and beliefs being false. The decoupling theory was an attempt to account for this feature without supposing that the child had to devise a theory of opacity.

Leslie (1987) suggested that one way to think about the decoupling of an internal representation from its normal input/output relations vis-a-vis normal processing was as a report or copy in one processing system (the "expression raiser") of a primary representation in another (e.g., general cognitive systems). This suggestion drew upon the analogy between opacity phenomena in mental state reports and reported speech. Subsequently, Leslie (1988c) and Leslie and Frith (1990) developed this idea in terms of the relationship between decoupled representations and processes of inference. The basic idea is that decoupling introduces extra structure into a representation and that this extra structure affects how processes of inference operate, ensuring that the truth of the part does not determine the truth of the whole. The simplest illustration of this is that one cannot infer it is a telephone from "it is a telephone".

Normally, the truth of a whole expression is determined by the truth of its

parts. For example, "Mary picked up the cup which was full" is true iff the cup Mary picked up was full. This same principle is involved in the detection of contradiction. Suppose the whole-parts principle was implemented in a spontaneous inferencing device that carries out elementary deductions. Consider the following: the cup is empty and the empty cup is full. The device will quickly produce the conclusion, "the cup is full and not full", revealing a contradiction because the whole and all of its parts cannot be simultaneously true. Of course, the device should not detect a contradiction in I pretend the empty cup is full. One might think at first that contradiction is blocked by the element, pretend, but contradiction returns in I pretend the cup is both empty and full, despite the presence of the element, pretend.

Decoupling creates extra structure in the representation - an extra level to which the inferencing device is sensitive. We can think of decoupling as controlling the occurrence of contradiction:
(1) the cup is empty.
(2) the empty cup is full.
(3) I pretend the empty cup "it is full".
(4) I pretend the cup "it is both empty and full".
Thus, in (2), with no decoupling, there is a single level within which a contradiction is detectable. In (3), however, there are two levels. The inferencing device first examines the upper level where it encounters I pretend the empty cup X and registers no contradiction. Next, it examines the lower level where it sees X "it is full" and again detects no contradiction. On the lower level of (4), however, the device encounters X "it is both empty and full" and registers contradiction within the level as in (2). Contradiction is detected within but not across decoupled levels. This is exactly what is required by informational relations.

Similar patterns can be seen in causal inferences. I pretend this empty cup "it contains tea" can be elaborated by an inference such as: if a container filled with liquid is UPTURNED, then the liquid will pour out and make something wet. This same inference works in both real and pretend situations; it also works for both own pretence and for understanding other people's pretence (Leslie, 1987). For example, if I upturn the cup which I am pretending contains tea, I conclude that I pretend the table "it is wet". The consequent is decoupled because the antecedent was, or we may say that the inference operates within the decoupled level. So, in pretend situations, we do not conclude that pretend tea will really make the table wet. The conclusions of the inference are again closed under decoupling. If pretend scenarios - both one's own and those one attributes to other people - unfold by means of inference, then we could predict another consequent

based on a variation of the above inference: if the liquid comes out of the container, then the container will be empty. This leads to pretending something that is true, namely, that the empty cup is empty. At first glance, this may seem ridiculous. But there is, of course, an important difference between the empty cup is empty and pretending (of) the empty cup "it is empty". Later I will present an empirical demonstration that young children routinely make this sort of inference in pretence.

2.3. Yoking
The emergence of solitary pretence is yoked to the emergence of the ability to understand pretence-in-others. The very young child can share with other people the pretend situations she creates herself and can comprehend the pretend situations created by other people. She is able to comprehend the behavior and the goals of other people not just in relation to the actual state of affairs she perceives, but also in relation to the imaginary situation communicated to her and which she must infer. We can illustrate this in two different ways: first in relation to behavior, and second in relation to language use.

Mother's goal-directed behavior with objects will be an important source of information for the young child about the conventional functions of objects. Likewise, mother's use of language will be a major source of information about the meanings of the lexical items the child learns. But in pretence, the child will have to know how to interpret mother's actions and utterances with respect to mother's pretence rather than with respect to the primary facts of the situation. When mother says, for example, "The telephone is ringing" and hands the child the banana, it will not be enough for the child to compute linguistic meaning. She will have to calculate speaker's meaning as well (cf. Grice, 1957). This double computation is inherently tied to the agent as the source of the communication and is seamlessly accomplished through the metarepresentation.

Interestingly, Baldwin (1993) has provided independent evidence that children from around 18 months of age begin to calculate speaker's meaning. Baldwin showed that, from around 18 months, children do not simply take the utterance of a novel word to refer to the object they themselves are looking at but instead look round and check the gaze of the speaker. They then take the novel word to refer to the object that the speaker is looking at, even if this is different from the one they were looking at when they heard the utterance. In the circumstances studied by Baldwin, the 18-month-old calculates speaker's meaning, not in service of pretence, but in the service of calculating linguistic meaning. This finding reinforces the idea that infants around this age are developing an interest in what might be called the "informational properties" of agents.
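The account of decoupling in Section 2.2, on which contradiction is detected within but not across decoupled levels, can be pictured with a small Python sketch. It is an illustration under stated assumptions and not the inferencing device itself: the names Level, INCOMPATIBLE and contradictory_levels are hypothetical, each level is reduced to a bare set of predicates about the cup, and the only incompatibility the sketch knows about is empty versus full.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Assumed toy knowledge: the only clash this sketch knows is empty vs. full.
INCOMPATIBLE = [frozenset({"empty", "full"})]


@dataclass(frozen=True)
class Level:
    """One level of a representation, with an optional decoupled level below it."""
    predicates: FrozenSet[str]
    decoupled: Optional["Level"] = None


def contradictory_levels(expr: Level) -> List[int]:
    """Return the depths (0 = upper level) at which a contradiction is found.

    Each level is checked on its own; predicates are never pooled across
    levels, so an upper-level 'empty' cannot clash with a lower-level 'full'.
    """
    hits: List[int] = []
    depth = 0
    level: Optional[Level] = expr
    while level is not None:
        if any(pair <= level.predicates for pair in INCOMPATIBLE):
            hits.append(depth)
        level = level.decoupled
        depth += 1
    return hits


# (1) the cup is empty.
case1 = Level(frozenset({"empty"}))
# (2) the empty cup is full.
case2 = Level(frozenset({"empty", "full"}))
# (3) I pretend the empty cup "it is full".
case3 = Level(frozenset({"empty"}), Level(frozenset({"full"})))
# (4) I pretend the cup "it is both empty and full".
case4 = Level(frozenset(), Level(frozenset({"empty", "full"})))

for name, case in [("(1)", case1), ("(2)", case2), ("(3)", case3), ("(4)", case4)]:
    print(name, contradictory_levels(case))
```

Checking each level separately, instead of pooling all predicates, is the design choice that keeps case (3) free of contradiction while still catching the one inside the decoupled level of case (4).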

3. Understanding pretence-in-others

In this section, I shall describe an experimental demonstration of a number of the phenomena discussed so far. This experiment was first presented in Leslie (1988a) and discussed briefly in Leslie (1988c).² The following hypotheses are tested. First, early pretence can involve counterfactual inferencing. Second, such inferencing can be used to elaborate pretence. Third, pretend contents are not always counterfactual. Fourth, inferencing within pretence can use real-world causal knowledge and such knowledge is available to 2-year-olds in a form abstract enough to apply in imaginary situations and in counterfactual argument where perceptual support is minimal or contradictory. Fifth, 2-year-old pretence is anchored in the here and now in specific ways. Sixth, 2-year-olds can infer the content of someone else's pretence and demonstrate this by making an inference appropriate to that person's pretence. Seventh, one can communicate through action, gesture and utterance a definite pretend content to a 2-year-old child, sufficient for the child to calculate speaker's meaning/pretender's meaning and to support a particular counterfactual inference based upon the communicated content.

² Harris and Kavanaugh (in press) have recently replicated and extended this study, though they draw somewhat different conclusions in line with their "simulation theory". The simulationist view of theory of mind phenomena raises a number of complex issues which I do not discuss here (but see Leslie & German, in press).

3.1. Method

The child was engaged by the experimenter in pretend play. Toy animals and some other props were introduced to the child during a warm-up period. My assistants in this task were Sammy Seal, Mummy Bear, Lofty the Giraffe, Larry Lamb and Porky Pig. Other props included toy cups, plates, a bottle, some wooden bricks and a paper tissue. This warm-up period served to convey that what was to happen was pretend play and to overcome the shyness children of this age often and quite rightly have with strangers who want to share their innermost thoughts with them. The experimenter pretended that it was Sammy's birthday that day, that Sammy was being awakened by Mummy Bear and was being told that there was going to be a party to which his friends were coming. The general design was to share pretence, allowing the child to introduce what elements he or she wished or felt bold enough to advance but to embed a number of critical test sub-plots as naturally as possible into the flow of play. These sub-plots allow testing of pretence-appropriate inferencing. Could the child make inferences which are appropriate to the pretend scenario he has internally represented, but which are not appropriate to the actual physical condition of the props?
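The idea of drawing inferences that fit the pretend scenario rather than the real state of the props can be illustrated with a toy sketch. It is only an illustration, assuming a simple encoding in which the real state and the pretend state of each prop are held in separate dictionaries; the names `real`, `pretend` and `upturn` are hypothetical and are not part of the study or of the metarepresentational model itself. The same real-world rule (an upturned container ends up empty) is applied to the pretend state, while the real state is left untouched.

```python
# Toy illustration only: hypothetical names, not the author's model.
# The real and the pretend state of the props are kept apart.

def upturn(contents, cup):
    """Apply the real-world rule: an upturned container ends up empty."""
    updated = dict(contents)  # copy, so the state passed in is not mutated
    updated[cup] = "empty"
    return updated

# Both toy cups are really empty throughout the game.
real = {"cup_a": "empty", "cup_b": "empty"}

# After pretend filling, both cups "contain juice" within the pretence.
pretend = {"cup_a": "juice", "cup_b": "juice"}

# One cup is upturned: the rule is applied to the pretend state only.
pretend = upturn(pretend, "cup_a")

print(pretend)  # {'cup_a': 'empty', 'cup_b': 'juice'}
print(real)     # {'cup_a': 'empty', 'cup_b': 'empty'}
```

In this toy encoding the pretend content for `cup_a` ends up agreeing with its real state, while the pretend content for `cup_b` remains counterfactual.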

3.2. Subjects

There were 10 children aged between 26 and 36 months, with a mean age of 32.6 months. Two further children were eliminated for being uncooperative or wholly inattentive.

3.3. The sub-plots

(1) CUP EMPTY/FULL. The child is encouraged to "fill" two toy cups with "juice" or "tea" or whatever the child designated the pretend contents of the bottle to be. The experimenter then says, "Watch this!", picks up one of the cups, turns it upside down, shakes it for a second, then replaces it alongside the other cup. The child is then asked to point at the "full cup" and at the "empty cup". (Both cups are, of course, really empty throughout.)

(2) UPTURN CUP. Experimenter "fills" a cup from the bottle and says, "Watch what happens!" Sammy Seal then picks up the cup and upturns it over Porky Pig's head, holding it there upside down. Experimenter asks, "What has happened? What has happened to Porky?"

(3) MUDDY PUDDLE. The child is told that it is time for the animals to go outside to play. An area of the table top is designated "outside". A sub-part of this area is pointed to and experimenter says "Look, there's a muddy puddle here!" Experimenter then takes Larry Lamb and says "Watch what happens!" Larry is then made to walk along until the "puddle" area is reached whereupon he is rolled over and over upon this area. Experimenter asks "What has happened? What has happened to Larry?"

(4) BATH-WATER SCOOP. Following the above, it is suggested that Larry should have a bath. Experimenter constructs a "bath" out of four toy bricks forming a cavity. Experimenter says, "I will take off Larry's clothes and give him a bath. Then it will be your turn to put his clothes back on, OK?" Experimenter then makes movements around the body and legs of Larry suggesting perhaps the removal of clothes and each time puts them down on the same part of the table top, making a "pile". Larry is then placed in the cavity formed by the bricks for a few seconds while finger movements are made over him. Experimenter then says, "Watch this!" and picks up a cup. The cup is placed into the cavity and a single scooping movement is made. The cup is then held out to the child and he or she is asked, "What's in here?" If the child does not answer, the scoop is repeated once to "Watch this!" and "What's in here?"

(5) CLOTHES PLACE. Larry is then removed and placed on the table. The child is told "It's your turn to put Larry's clothes on again" and handed Larry. Where (if anywhere) the child reaches in order to get the "clothes" is noted.
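For scoring purposes, the CUP EMPTY/FULL sub-plot offers two alternatives, so a child who merely guesses should be right about half the time. The chance calculation can be sketched as follows, assuming independent guesses at probability 0.5 and the 10 children tested; `prob_at_least` is a hypothetical helper name, and the snippet is an illustration rather than the analysis code used in the study.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Binomial probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Chance that all 10 children pick the correct cup by guessing alone.
p_all = prob_at_least(10, 10)
print(round(p_all, 4))  # 0.001
```

The open-ended sub-plots have no such obvious chance level, so the same arithmetic does not carry over to them directly.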

4. Results

Table 1 shows the number of children passing each sub-plot plus the entire range of responses that occurred. Statistical analysis seems mostly unnecessary. The CUP EMPTY/FULL subplot could be guessed correctly half the time, so all 10 children passing is significant (p = 0.001, binomial test). In the other cases it is difficult to estimate the probability of a correct answer by chance but it is presumably low. The failures came from two children who answered "Don't know" or failed to respond despite the sub-plot being repeated for them.

Table 1. Number of subjects passing test sub-plots and the full range of responses obtained

Test | Subjects passing | Range of responses obtained indicating appropriate inference
CUP EMPTY/FULL | 10/10 | Points to or picks up correct cup
UPTURN CUP | 9/10 | Refills cup; says "I'll wipe it off him" and wipes with tissue; "he's spilling"; "he got wet"; "threw water on him"; "poured milk over him, wet"; "tipped juice on head"
MUDDY PUDDLE | 9/10 | Dries animal with tissue; says "oh no, all the mud"; "covered in mud"; "got mud on"
BATH-WATER SCOOP | 9/10 | Says "water", "bathwater"; "water" and pours into other cup; "water" and upturns into bath
CLOTHES-PLACE | 9/9 | Picks up from correct place; points to correct place

Failures were produced by two different children with "don't know" responses or no response after the test was repeated twice. One child was not asked the clothes-place test through experimenter error.

5. Discussion

These results support a number of features of the metarepresentational model of pretence. They demonstrate counterfactual causal reasoning in 2-year-olds based on imaginary suppositions. For example, in the CUP EMPTY/FULL scenario the child works from the supposition the empty cups "they contain juice"

and upon seeing the experimenter upturn one of the cups, the child applies a "real-world" inference concerning the upturning of cups (see pp. 220-221). In this case the child was asked about the cups, so the conclusions generated were this empty cup "it contains juice" and that empty cup "it is empty". The last conclusion is, of course, an example of pretending something which is true and not counterfactual. Notice, however, that in terms of decoupling this is not the tautology, this empty cup is empty. A similar conclusion was generated by one of the children in the UPTURN CUP scenario and expressed by his pretending to refill the "empty cup" when asked what had happened. These examples help us realize that, far from being unusual and esoteric, cases of "non-counterfactual pretence" are ubiquitous in young children's pretence and indeed have an indispensable role in the child's ability to elaborate pretend scenarios.³

³ Though I was perhaps the first to derive this as a prediction from a theoretical model, I am certainly not the first to point out that pretends "can be true", that is, pretending something is true when it is true. Vygotsky (1967) describes the case of two sisters whose favorite game was to pretend to be sisters.

One way to understand the above result is this: the logic of the concept, pretend, does not require that its direct object (i.e., its propositional content) be false. In this regard, PRETEND shows the logic of the BELIEVE class of attitudes: the truth of the whole attitude expression is not dependent upon the truth of all of its parts, specifically, not a function of the truth of its direct object. Some attitudes, on the other hand, like KNOW, do require the truth of their direct object (though even here there are subtleties), but PRETEND and BELIEVE do not. As we saw earlier, we can understand this peculiarity of attitude expressions in terms of decoupling.

Our feeling that a "true pretend" is odd reflects the normativity of our concept of pretence: pretends "ought" to be false. Having counterfactual contents is, as it were, what pretends are for, but their falseness is not strictly required by the logic of the concept. As we shall see later, BELIEVE shows the opposite normativity to PRETEND. Normally, beliefs are true.

In the experiment, the children correctly inferred what the experimenter was pretending. The very possibility of "correctness" depends upon some definite pretend situation being communicated. The child calculates a construal of the agent's behavior, a construal which relates the agent to the imaginary situation that the agent communicates. To achieve this, the child is required to calculate, in regard to utterances, speaker's meaning as well as linguistic meaning, and, in regard to action, the agent's pretend goals and pretend assumptions. The child is not simply socially excited into producing otherwise solitary pretend; the child can answer questions by making definite inferences about a definite imaginary situation communicated to him by the behavior of the agent. This is predicted by the Leslie (1987) model. The child is

also capable of intentionally communicating his own pretend ideas back to participating agents.

The spontaneous processing of the agent's utterances, gestures and mechanical interactions with various physical objects to produce an interpretation of agent pretending this or that is surely one of the infant's more sublime accomplishments. One of the deep properties that we seem pre-adapted to attribute to agents is the power of the agent to take an attitude to imaginary situations (or, more accurately, to the truth of descriptions). This allows a rational construal of the role of non-existent affairs in the causation of real behavior. It is striking that this is done quite intuitively by very young children. However, there is no more need to regard the child as "theorizing" like a scientist when he does this than there is when the child acquires the grammatical structure of his language.

6. Believing and ToMM

One of the central problems in understanding the development of theory of mind is the relation between the concepts of pretending and believing. There are two broad possibilities. There may be no specific relationship between the two and their development may reflect quite different cognitive mechanisms and quite different representational structures. Versions of this position have been held for example by Perner (1991), Flavell (1988) and Gopnik and Slaughter (1991). Alternatively, there may be a close psychological relationship between the concepts of pretending and believing: both may be introduced by the same cognitive mechanism, both may belong to the same pre-structured representational system. Within this general scheme there are a number of more detailed options. For example, one concept, pretend, may develop first while the other, believe, may develop later, either because of maturational factors or because believe requires more difficult and less accessible information to spur its emergence (Leslie, 1988b). Or the two notions may differentiate out of a common ancestor concept. Or there may be a progressive strengthening or sharpening of the pre-structured metarepresentational system (Plaut & Karmiloff-Smith, 1993). Or there may be different performance demands associated with employing the two concepts, demands that can be met only at different times in development depending upon a variety of factors (Leslie & Thaiss, 1992).

Whichever of these options, or whichever mixture of these options (they are not mutually exclusive), turns out to be correct, there is no reason to suppose that pretend and believe require radically different representational systems, any more than the concepts dog and cat, though undoubtedly different concepts, require radically different representational capacities. It is often claimed in support of the special nature of believe, and in contradiction of the second set of positions above, that solving false belief tasks,

such as that in Fig. 1, requires the child to employ a radically different conceptualization of mental states from that required by understanding pretend (e.g., Perner, 1991). Specifically, it is claimed that false belief can only be conceptualized within a "representational theory of mind" (RTM). I have criticized the RTM view of the preschoolers' theory of mind at length elsewhere (e.g., Leslie & Thaiss, 1992; see also Leslie & German, in press; Leslie & Roth, 1993).

Briefly, there are two ways in which one could speak of the child possessing a "representational theory of mind". One could use the term "representational" loosely to cover anything which might be considered true or false. In this loose sense of representational theory of mind, anything which can be semantically evaluated will count as a "representation"; mental states involve representations simply because their contents are semantically evaluable, that is, because a mental state involves a relation to a proposition, not because it involves a relation to an image, a sentence, or whatever.

The second and stricter way of using the term is to denote only entities that can be semantically evaluated and have a physical form or a syntax. Insofar as cognitive science holds an RTM, it is in this second stricter sense of "representational". The form of the representation is held to be critical to the individuation of the mental state. Thus, for example, an image of a cat on a mat counts as a different mental state from a sentential representation of the same cat on the same mat. Indeed, psychologists might argue about whether a given piece of knowledge is represented in an image form or in a sentential form. From the point of view of psychology, mental states are individuated within this framework in terms of their form, not in terms of their content.

Because it would be massively confusing to use the same term both for a theory of mind which individuates mental states in terms of their contents (semantics) and for a theory of mind which individuates mental states in terms of their form but not their content, we shall use different terms. We shall speak of a "propositional attitude" (PA), or semantically based, theory of mind for the first type of theory and a representational, or syntactically based, theory of mind (RTM) for the second.

Now we can ask, does the child employ a PA-based (semantic) theory of mind or representational (syntactic) theory of mind? Perner's (1988, 1991) claim is that success on a variety of false belief tasks at age 4 reflects a radical theory shift from a PA-based theory of mind to an RTM. However, all of the evidence quoted in support of this claim (mainly passing various false belief tasks) only shows that the child individuates beliefs on semantic grounds. After all, the falseness of a belief is a quintessentially semantic property. To date, there are no demonstrations of preschoolers individuating beliefs on syntactic grounds in disregard of their content. All the available evidence supports the idea that preschoolers are developing a semantic theory of belief and other attitudes. What the theory of ToMM aims to account for is the specific basis for this early emerging semantic theory of the attitudes.
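The contrast between individuating mental states by content and individuating them by form can be made concrete with a small sketch. It assumes a deliberately crude encoding, with hypothetical field and function names that are not part of either theory: a mental state is an attitude plus a content (the proposition) plus a form (the particular vehicle that carries it). A PA-based scheme compares attitude and content; an RTM scheme in the strict sense compares attitude and form.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MentalState:
    attitude: str  # e.g. "believe" or "pretend"
    content: str   # the proposition the state is about
    form: str      # the vehicle carrying it (hypothetical encoding)

def same_state_semantic(a, b):
    """PA-based individuation: same attitude and same content."""
    return (a.attitude, a.content) == (b.attitude, b.content)

def same_state_syntactic(a, b):
    """Strict RTM individuation: same attitude and same form."""
    return (a.attitude, a.form) == (b.attitude, b.form)

# The same cat on the same mat, carried once as an image and once as a sentence.
as_image = MentalState("believe", "the cat is on the mat", "image:cat-on-mat")
as_sentence = MentalState("believe", "the cat is on the mat",
                          "sentence:'the cat is on the mat'")

print(same_state_semantic(as_image, as_sentence))   # True
print(same_state_syntactic(as_image, as_sentence))  # False
```

On this encoding, evidence that a child tracks whether a belief is true or false bears only on the `content` field, which is why passing false belief tasks cannot by itself separate the two schemes.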

6.1. A task analytic approach to belief problems

How can we begin to investigate the claim that a prestructured representational system interacts with performance factors in producing the patterns seen in the preschool period? Specifically, how do we investigate the notion that performance limitations mask the preschooler's competence with the concept of belief? We can try to develop a task analysis. In carrying out this analysis, we must separate the various component demands made on conceptual organization from those made on general problem-solving resources in the course of tackling false belief problems.

An important beginning has been made in this line of research by Zaitchik (1990). Zaitchik designed a version of the standard false belief task in which the protagonist Sally is replaced by a machine, namely, a polaroid camera. The protagonist's seeing of the original situation of the marble is replaced by the camera's taking a photograph of it, while, after moving the marble to a new location, the protagonist's out-of-date belief is replaced in the new task by the camera's out-of-date photograph. While the conceptual content of the task changes (from belief to photograph), the general task structure remains identical. This task then provides an intriguing control for the general problem-solving demands of the false belief task. Results from comparing these two tasks show two things: first, normal 3-year-olds fail both the false belief and the photographs tasks (Zaitchik, 1990), as would be expected on the basis of a general performance limitation; second, autistic children fail only the false belief task but pass the photographs version (Leekam & Perner, 1991; Leslie & Thaiss, 1992), consistent with autistic impairment in the domain-specific mechanism ToMM.

The ToMM model can be extended to relate it to the performance limitations affecting young preschoolers. Building on ideas proposed by Roth (Leslie & Roth, 1993; Roth, 1993; Roth & Leslie, in preparation), Leslie and Thaiss (1992) outlined the ToMM-SP model. Some false belief tasks, such as the Sally and Ann scenario in Fig. 1 and other standards such as "Smarties", make demands on at least two distinct mechanisms. Specific conceptual demands are made of ToMM to compute a belief metarepresentation, while, in the course of accurately calculating the content of the belief, more general problem-solving demands are made of a "selection processor" (SP). These latter demands require the child to interrogate memory for the specific information that is key to the belief content inference, disregarding other competing or confusing information. For example, to infer the correct content for Sally's belief, the situation that Sally was exposed to at t0 has to be identified from memory and the inference to Sally's belief based on that, resisting the pre-potent tendency to simply base the inference upon the (present) situation at t1.

There is a conceptual basis for the existence of this pre-potent response. Normatively beliefs are true: this is what beliefs are "for", they are "for"

accurately describing the world. A belief is "useful" to an agent only to the extent it is true; in short, beliefs "ought" to be true.⁴ It makes sense, therefore, if, by default, inferences to belief contents are based upon current actuality. Notice that the normative assumption is a far cry from the "copy theory" of belief (Wellman, 1990). In the case of false belief, this normative design fails and in order to accurately compute the errant content the pre-potent assumption must be resisted.

⁴ Although this normative assumption is fundamental to the notion of belief, it is not part of the logic of the concept that belief contents must be true (compare the earlier parallel discussion on page 225 of pretends being normatively false). In view of this, we should not be too hard on the preschooler if she takes a few months to discover that, contrary to design, the vicissitudes of the real world sometimes defeat beliefs with dire consequences for the agent's goals. (The same problem does not arise in the case of true pretends because the agent is always able to intentionally communicate the content of his pretend whereas, for obvious reasons, an agent is not in a position to intentionally communicate that he has a false belief. However, see Roth and Leslie, 1991, for a case in which communication does help the 3-year-old with false belief.) Incidentally, it has been my experience that there are many adults who are surprised, even dismayed, to discover that pretends can be true.

We can organize our thinking about the general demands made by some belief tasks by positing a general, or at least non-theory-of-mind-specific, processing mechanism. Roth and I have dubbed this the "selection processor" (SP). The SP performs a species of "executive" function, inhibiting a pre-potent inferential response and selecting the relevant substitute premise. Like many other "executive functions", SP shows a gradual increase in functionality during the preschool period.

According to this view, then, the 3-year-olds' difficulty with false belief is due to limitations in this general component. Meanwhile, in the normal 3-year-old, ToMM is intact. Some belief tasks do not require this general component or stress it less, by, for example, drawing attention to the relevant "selection" and/or by encouraging inhibition of the pre-potent response. In these cases, better performance on false belief tasks is seen in 3-year-olds (e.g., Mitchell & Lacohee, 1991; Roth & Leslie, 1991; Wellman & Bartsch, 1988; Zaitchik, 1991). Similar considerations may apply, for example, to the case of Zaitchik photographs.

The autistic child, by contrast, shows poor performance on a wider range of belief reasoning tasks, even compared with Down's syndrome children and other handicapped groups (e.g., Baron-Cohen, 1991; Baron-Cohen, Leslie, & Frith, 1985; Leslie & Frith, 1988; Roth & Leslie, 1991). This disability is all the more striking alongside the excellent performance autistic children show on out-of-date photographs, maps and drawings tasks (Charman & Baron-Cohen, 1992; Leekam & Perner, 1991; Leslie & Thaiss, 1992). These tasks control for the general problem-solving demands of standard false belief tasks. Autistic impairment on false belief tasks cannot, therefore, be due to an inability to meet the general problem-solving demands of such tasks. So, although autistic children seem to be

impaired in certain kinds of "executive functioning" (Ozonoff, Pennington, & Rogers, 1991), this cannot be the cause of their failure on false belief tasks. This pattern can be succinctly explained on the assumption of a relatively intact SP together with an impaired ToMM, the mirror-image of the normal 3-year-old. Fig. 3 summarizes the ToMM-SP model of normal and abnormal development.

Fig. 3. The ToMM-SP model of development (after Leslie & Thaiss, 1992). [Figure: ToMM and SP marked as available (y) or not (X) for each group. ToMM: 4 year old y, 3 year old y, autistic X. SP: 4 year old y, 3 year old X, autistic y. Task types shown: pretence and non-standard FB; standard FB; standard false representation.]

Roth and I have recently extended our approach of studying minimally different task structures in an effort to isolate general processing demands from specific conceptual demands (Roth & Leslie, in preparation). In one study, we compared the performance of young, middle and older 3-year-olds on a standard version of the Sally and Ann task with a "partial true belief" version (see Leslie & Frith, 1988, for details of the tasks used). This allowed us to assess the importance of the falseness of the belief (a conceptual factor) in generating difficulty for 3-year-olds while holding general task structure constant. The results are shown in Fig. 4. It can be readily seen that there was no difference in difficulty between the two tasks for 3-year-olds when task structure is equated. Taken together with previous findings that 3-year-olds can understand "knowing and not knowing" (e.g., Pratt & Bryant, 1990; Wellman & Bartsch, 1988), this shows that the conceptual factor of the falseness of the belief per se is not the source of difficulty for 3-year-olds. There must be something about the problem-solving structure of this standard belief task that stresses 3-year-olds.

Another approach in the literature to the problem of isolating belief competence is to find simplified task structures that 3-year-olds perform better on. The theoretical assumption behind such work is that by finding simplified tasks one reduces the number of false negatives that standard tasks produce. Surian and

Leslie (in preparation) point out a danger with this approach. Manipulations designed to simplify tasks may inadvertently allow children to pass for the wrong reasons, for reasons which do not reflect the conceptual competence that the investigator is targeting. To avoid this, we need to introduce controls for false positives.

Fig. 4. Partial true belief versus false belief in younger, middle and older three-year-olds. Performance on both a standard false belief task and a true belief analogue improves gradually during the fourth year. [Figure: performance plotted by age group (Younger, Middle, Older).]

A concrete example will help make the idea of controlling false positives clear. Suppose we run a group of children on the Sally and Ann task and find that 100% of the children pass. We then run another group of same age children on a modified version of the task in which Sally does not go away for her walk but instead remains behind and watches Ann all the while. In this version, Sally sees the transfer of the marble and knows that it is in its new location. Imagine that, to our surprise, we find that 100% in this SEE control group fail! In this control condition, failing means that the child indicates the empty location when asked where Sally will look for the marble. Now we will say that the first finding, in the "NOT-SEE" test, of 100% indicating the empty location, consisted of false positives; it did not demonstrate false belief understanding.

Suppose instead we had discovered that the false belief group were only 50% correct. We would have been tempted to describe this result as "chance" but we waited till we saw how many children passed the SEE control version of the task. Suppose that on this, 100% of the children succeed. Now we can be confident that in the NOT-SEE test the children really did take Sally's exposure history into account because if they had responded like the controls no one would have

passed: therefore the 50% who did were not false positives. In other words, the second pattern of results says more about false belief understanding than the 100% "passing" in the paragraph above.

One immediate use of this enhanced technique of balancing a NOT-SEE test with a SEE control is to allow us to look at 3-year-old performance with a more sensitive instrument. Robinson and Mitchell (1992) report a study that would benefit from the use of a SEE control. Three-year-olds were given a scenario in which Sally has two bags each containing pieces of material. She places the bags, one in each of two drawers, and goes to the next room. Ann then enters, takes the two bags out of their drawers, plays with them, then replaces the bags. But, by accident, she swaps their locations. Sally then calls from the next room that she wants her bag of material to do some sewing and that it is important that she gets the correct bag. She tells Ann that it is the bag in the red drawer she wants but, of course, Sally does not know that Ann has swapped the bags. The child is asked to identify the bag that Sally really wants. In this interesting task, the child can correctly identify Sally's desire only if she first relates it to Sally's false belief. The results showed that 50% of the 3-year-olds passed, a higher proportion than that obtained with a standard FB task.

Unfortunately, the possibility exists with this design that children were simply confused by the swapping and interpreted Sally's description of the bag either as referring to the bag that was in the red drawer or to the bag that is in the red drawer, half making one interpretation and half the other. This would yield 50% false positives without any of the children actually calculating Sally's false belief.

Surian and Leslie (in preparation) examined the Robinson and Mitchell scenario in relation to the SEE control. If the children were using such a low-level, "dumb" strategy, it will show up again in a version of the Robinson and Mitchell task that implements the SEE control. This time, Sally watches as Ann swaps the bags between the drawers. Now, when Sally asks for the bag in the red drawer, she must mean the bag that is in the red drawer. Yet, if the children were simply following a "dumb" strategy and not calculating at all what Sally believed, then it should make no difference that Sally had watched the proceedings. Half will still interpret her as wanting the bag that was in the red drawer: the confusion created by swapping will occur again, resulting again in 50% "correct". (Bear in mind that the indicated location counted as correct in the SEE control condition is the opposite of that counted correct in the NOT-SEE (false belief) condition.)

They found that the proportion of "correct" locations indicated in the false belief condition was the same as the proportion of incorrect locations indicated in the SEE control. This shows that the children were not taking into account Sally's exposure and thus were providing false positives in the false belief task. According to these new results, swapping locations and asking about desire in relation to false belief does not, in the

Robinson and Mitchell task, however, produce a simplified false belief problem for 3-year-olds.

Fodor (1992) has also recently proposed a model of a performance limitation in 3-year-olds' theory of mind reasoning. Like the ToMM model, Fodor assumes that the 3-year-old possesses the conceptual competence for understanding false belief. In this model, the 3-year-old typically predicts behavior from desire without calculating belief. Older children, however, routinely calculate both belief and desire because they have greater processing capacity available. According to the model, the young child will not calculate belief unless the prediction from desire yields an ambiguous result and the child is unable to specify a unique behavior. Standard false belief tasks allow unique predictions from desire, so the young child does not calculate belief and thus fails. Whenever desire prediction results in ambiguity, however, the child will break the impasse by calculating belief. Therefore, Fodor predicts that when the 3-year-old does calculate belief, she will succeed.

One difficulty is to know what predictions the child will consider as ambiguous. For example, though Fodor (1992) suggests splitting and moving the object into two target locations as a way of creating ambiguity, it could be that the child will regard search in both locations as a single unambiguous action. In order to test Fodor's suggestion clearly, we need a scenario in which ambiguity in the object of desire is unavoidable. A modification of the Robinson and Mitchell scenario meets this requirement nicely.

Surian and Leslie then went on to test 3-year-olds in a study which combines the methods of minimally different task structures with the SEE control for false positives. The Robinson and Mitchell task was modified to introduce ambiguity into Sally's desire. Although this modification makes the scenario more complex as a story, we supposed that it would simplify the scenario as a false belief problem. Instead of having two bags of material, Sally has four pencils. Three of the pencils are sharp, while the fourth pencil is broken. Sally leaves the pencils and goes into the next room. Now Ann comes in and finds the pencils. First Ann sharpens the broken pencil, then she breaks each of the original three sharp pencils. Now there are three broken pencils and one sharp pencil. At this point, Sally calls from next door, "Ann, bring me my favorite pencil, you know, it's broken!" As before, the child is asked which pencil Sally really wants.

The child has been given no information prior to this about which pencil is Sally's favorite. The only information the child has to go on is Sally's attached description, "it's broken". But now there are three pencils which are broken. This unavoidably produces ambiguity. According to Fodor's model, the ambiguity in the object of desire should trigger the child into consulting Sally's belief. When the child consults Sally's belief about which pencil is broken, he will realize that Sally thinks that the now sharp pencil is still broken. The sharp pencil is the one Sally really wants!

Our results showed that 48% of our 3-year-olds correctly chose the pencil that

This performance was significantly better than performance on the unmodified Robinson and Mitchell scenario and better than on the standard false belief task. Although further studies under way may change the picture, it seems that ambiguity of desire can help 3-year-olds in solving a false belief problem.

Before reaching this conclusion, however, we should recognize that there are low-level "dumb" strategies that could have produced these results. For example, the word "favorite" singles out a particular individual. The children may simply have latched onto the "odd-one-out", the uniquely sharp pencil. Perhaps the passers were false positives. To control for dumb possibilities like this, we also ran a SEE control version of the pencils task. If children simply respond with this or some other dumb strategy, then they should use the same dumb strategy when Sally remains in the room watching Ann process the pencils. In the SEE control, as in the NO-SEE test, the child has no information on which pencil is Sally's favorite other than the description Sally gives of it as being broken. Again this is ambiguous, because, by the time Sally makes her request, there are three broken pencils. If the children follow a dumb strategy, about half the children should again respond by picking the odd-one-out, the uniquely sharp pencil. In fact, in the SEE control condition only about 20% of the children chose the sharp pencil, the rest choosing one of the broken pencils. This pattern was significantly different from that obtained in the NO-SEE test. Most of the passers were true positives.

We were able to isolate Fodor's ambiguity of desire factor by comparing the performance on Robinson and Mitchell's original task with the ambiguity modified version of it. By combining a method of minimal task differences with the SEE control, Surian and Leslie obtained a sensitive measure of 3-year-old competence, while at the same time controlling for false positives by means of a SEE control. We thus obtained support for Fodor's ambiguity factor.

Important though this is, it is not clear that Fodor's model identifies all the performance factors limiting 3-year-olds' successes. Fodor's (1992) model focused on the prediction of behavior. However, the child is also concerned with the underlying mental states themselves. For example, in the ambiguity study above, the child did not calculate belief in order to predict behavior. She calculated belief in order to figure out the referent of Sally's desire. Furthermore, in standard false belief tasks, even when 3-year-olds presumably do consult belief (for example, when they are directly requested to do so) they still have difficulty calculating its content accurately. For example, in the standard Sally and Ann scenario it makes little difference if, rather than being asked to predict behavior, 3-year-olds are asked where Sally thinks the marble is. Despite being asked to consult belief, they are no better at calculating its content than when asked to predict behavior. And even when the ambiguity factor was apparently activated in the study above, still half the children did not calculate belief content correctly.

Fodor's model assumes that 3-year-olds do not ordinarily consult belief, but that when they do, they can easily calculate its content correctly (even in the case of false beliefs). The SP model, on the other hand, proposes that ToMM's routine calculation of belief normally assumes that content reflects relevant current facts. In light of the normative nature of the belief concept, this assumption is, in the general case, justified. But for false belief situations where belief does not operate as it ought to, in order to produce a correct answer about content, this assumption has to be inhibited or blocked and a specific alternative content identified. Both of these processes (the inhibition of the pre-potent response and the selection of the correct content) stretch the 3-year-olds' capabilities. Unless "help" is given by the form of the task, 3-year-olds will tend to assume beliefs reflect current facts or will fail to identify the correct content.

Fodor's ambiguity of desire can be assimilated to the SP model as one factor which can inhibit the normal content assumption and lead to search for an alternative belief content. For example, when the child tries to infer which pencil Sally wants, the first hypothesis will simply be "a broken one". But which one is that, given there are three broken ones? Since it is not possible to reach a definite answer to the question of what Sally wants, the ordinary assumption about belief content is inhibited and an attempt made to calculate the content from Sally's exposure history. If the experimental effect simply reflected the dumb strategy, "my first answer is going to be wrong, so I'll pick something else", and the only other different thing the child can pick is the sharp pencil, then this same dumb strategy should have been followed by the SEE control children as well. But it was not. Recall that the control children simply had to live with an indefinite answer because in their case Sally in fact knew there were three broken pencils. The children were indeed calculating belief. Nevertheless, though the pencils story helped the children, it was not enough to help more than half of them to get their calculation right. Perhaps if task structure were made to help with the selection of the appropriate content as well as with inhibiting the pre-potent response, performance would improve further.

In a final experiment, Surian and Leslie (in preparation) re-examined a modified standard scenario based on Siegal and Beattie (1991). In this otherwise standard Sally and Ann task, instead of asking the child "Where will Sally look for her marble?" the child is asked "Where will Sally look for her marble first?" Siegal and Beattie found that adding the word "first" dramatically improved 3-year-olds' success. This result has largely been ignored, however, because it is open to some obvious objections. For example, the word "first" may simply lead the child to point to the first location the marble occupied, in other words to follow a dumb associative strategy. Alternatively, the word "first" might cue the child that the experimenter expects the first look to fail. If there is to be a first look, presumably there is to be a second look, but why should there be a second look unless the first one fails? Therefore, point to the empty location for the first failing look. Again, put like this, the word "first" simply triggers a dumb strategy in the child who then appears to succeed but who, in fact, does nothing to calculate belief.

We simply added the necessary SEE control to examine the viability of such "dumb strategy" explanations. If the child is not attending at all to Sally's belief then it should make no difference that Sally watches Ann move the marble. In the control condition too, the word "first" should trigger the dumb response strategy.

We found that 83% of the children in the false belief task passed, replicating Siegal and Beattie's finding. If this was the result of a dumb strategy, then we should expect to find a similar proportion failing the SEE control task, because in this condition a point to the first location is considered wrong. In fact, 88% were correct in this condition too. Therefore, once again, the effect of the word "first" is specific to the status of Sally's belief. If Sally indeed knows where the marble is, asking where she will look first obtains a contrasting answer from 3-year-olds: namely, "in the second location". Finally, the look-first question fails to help autistic children: we found only 28% of an autistic group passed, no different from an unmodified Sally and Ann task. These last results vindicate Siegal and Beattie (1991) and suggest that they have been wrongly ignored.

Siegal and Beattie argued that including the word "first" made the experimenter's intentions explicit for the 3-year-old. We are now in a position to suggest an account of how this manipulation makes the experimenter's intentions explicit for 3-year-olds and why they, but not 4-year-olds, need the help. The word "first" may give the child a "double" help. The child's attention is drawn to the possibility that Sally's first look may fail. If her look fails (to find the marble), Sally's behavior defeats her desire. Plausibly, behavior defeating a desire encourages the blocking of the normal assumption regarding belief content in the same way that ambiguity about what would satisfy the desire does. This is a variation on Fodor's factor. In addition, the word "first" directs the child's attention to the first location and this helps select the correct counterfactual content. The double help results in very good performance. Bear in mind, however, that this hypothesized double help has specific effects depending upon the status of Sally's belief. Notice that the absolute level of success in the look first task is very high indeed. It is quite comparable to the level of success obtained by 4-year-olds in the standard task and higher than that obtained with 3-year-olds in the ambiguity task.

In summary, there is increasing evidence that 3-year-olds have an underlying competence with the concept of belief but that this competence is not revealed in the tasks that are standardly used to tap it. It seems increasingly likely that their competence is masked by a number of "general" factors that create a performance squeeze. This squeeze gradually relaxes over the course of the fourth year (see Fig. 4), and probably beyond. Further support is provided for this view by the finding that false belief tasks show a difficulty gradient, that some false belief tasks are easier than others. The ToMM-SP model provides, to date, the most wide-ranging model of the young child's normal theory of mind competence, of the performance factors that squeeze the child's success on false belief calculations, and of the abnormal development of this domain found in childhood autism.

References

Baldwin, D.A. (1993). Infants' ability to consult the speaker for clues to word reference. Journal of Child Language, 20, 395-418.
Baron-Cohen, S. (1991). The development of a theory of mind in autism: Deviance and delay? In M. Konstantareas & J. Beitchman (Eds.), Psychiatric Clinics of North America, special issue on Pervasive developmental disorders (pp. 33-51). Philadelphia: Saunders.
Baron-Cohen, S., Leslie, A.M., & Frith, U. (1985). Does the autistic child have a "theory of mind"? Cognition, 21, 37-46.
Charman, T., & Baron-Cohen, S. (1992). Understanding drawings and beliefs: A further test of the metarepresentation theory of autism (Research Note). Journal of Child Psychology and Psychiatry, 33, 1105-1112.
Flavell, J.H. (1988). The development of children's knowledge about the mind: From cognitive connections to mental representations. In J.W. Astington, P.L. Harris, & D.R. Olson (Eds.), Developing theories of mind (pp. 244-267). New York: Cambridge University Press.
Fodor, J.A. (1992). A theory of the child's theory of mind. Cognition, 44, 283-296.
Frith, U., Morton, J., & Leslie, A.M. (1991). The cognitive basis of a biological disorder: Autism. Trends in Neurosciences, 14, 433-438.
Gopnik, A., & Slaughter, V. (1991). Young children's understanding of changes in their mental states. Child Development, 62, 98-110.
Grice, H.P. (1957). Meaning. Philosophical Review, 66, 377-388.
Harris, P.L., & Kavanaugh, R. (in press). The comprehension of pretense by young children. Society for Research in Child Development Monographs.
Leekam, S., & Perner, J. (1991). Does the autistic child have a "metarepresentational" deficit? Cognition, 40, 203-218.
Leslie, A.M. (1987). Pretense and representation: The origins of "theory of mind". Psychological Review, 94, 412-426.
Leslie, A.M. (1988a). Causal inferences in shared pretence. Paper presented to BPS Developmental Conference, Coleg Harlech, Wales, September 16-19, 1988.
Leslie, A.M. (1988b). Some implications of pretense for mechanisms underlying the child's theory of mind. In J.W. Astington, P.L. Harris, & D.R. Olson (Eds.), Developing theories of mind (pp. 19-46). New York: Cambridge University Press.
Leslie, A.M. (1988c). The necessity of illusion: Perception and thought in infancy. In L. Weiskrantz (Ed.), Thought without language (pp. 185-210). Oxford: Oxford Science Publications.
Leslie, A.M., & Frith, U. (1988). Autistic children's understanding of seeing, knowing and believing. British Journal of Developmental Psychology, 6, 315-324.
Leslie, A.M., & Frith, U. (1990). Prospects for a cognitive neuropsychology of autism: Hobson's choice. Psychological Review, 97, 122-131.
Leslie, A.M., & German, T. (in press). Knowledge and ability in "theory of mind": One-eyed overview of a debate. In M. Davies & T. Stone (Eds.), Mental simulation: Philosophical and psychological essays. Oxford: Blackwell.
Leslie, A.M., & Roth, D. (1993). What autism teaches us about metarepresentation. In S. Baron-Cohen, H. Tager-Flusberg, & D. Cohen (Eds.), Understanding other minds: Perspectives from autism (pp. 83-111). Oxford: Oxford University Press.
Leslie, A.M., & Thaiss, L. (1992). Domain specificity in conceptual development: Neuropsychological evidence from autism. Cognition, 43, 225-251.
Marr, D. (1982). Vision. San Francisco: W.H. Freeman.
Mitchell, P., & Lacohee, H. (1991). Children's early understanding of false belief. Cognition, 39, 107-127.
Ozonoff, S., Pennington, B.F., & Rogers, S.J. (1991). Executive function deficits in high-functioning autistic individuals: Relationship to theory of mind. Journal of Child Psychology and Psychiatry, 32, 1081-1105.
Perner, J. (1988). Developing semantics for theories of mind: From propositional attitudes to mental representation. In J.W. Astington, P.L. Harris, & D.R. Olson (Eds.), Developing theories of mind (pp. 141-172). New York: Cambridge University Press.
Perner, J. (1991). Understanding the representational mind. Cambridge, MA: MIT Press.
Plaut, D.C., & Karmiloff-Smith, A. (1993). Representational development and theory-of-mind computations. Behavioral and Brain Sciences, 16, 70-71.
Pratt, C., & Bryant, P. (1990). Young children understand that looking leads to knowing (so long as they are looking into a single barrel). Child Development, 61, 973-982.
Premack, D., & Woodruff, G. (1978). Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences, 4, 515-526.
Pylyshyn, Z. (1978). When is attribution of beliefs justified? Behavioral and Brain Sciences, 1, 592-593.
Quine, W.V. (1961). From a logical point of view. Cambridge, MA: Harvard University Press.
Robinson, E.J., & Mitchell, P. (1992). Children's interpretation of messages from a speaker with a false belief. Child Development, 63, 639-652.
Roth, D. (1993). Beliefs about false beliefs: Understanding mental states in normal and abnormal development. Ph.D. thesis, Tel Aviv University.
Roth, D., & Leslie, A.M. (1991). The recognition of attitude conveyed by utterance: A study of preschool and autistic children. British Journal of Developmental Psychology, 9, 315-330. Reprinted in G. Butterworth, P. Harris, A.M. Leslie, & H. Wellman (Eds.), Perspectives on the child's theory of mind (pp. 315-330). Oxford: Oxford University Press.
Siegal, M., & Beattie, K. (1991). Where to look first for children's knowledge of false beliefs. Cognition, 38, 1-12.
Vygotsky, L.S. (1967). Play and its role in the mental development of the child. Soviet Psychology, 5, 6-18.
Wellman, H.M., & Bartsch, K. (1988). Young children's reasoning about beliefs. Cognition, 30, 239-277.
Wimmer, H., & Perner, J. (1983). Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children's understanding of deception. Cognition, 13, 103-128.
Zaitchik, D. (1990). When representations conflict with reality: The preschooler's problem with false beliefs and "false" photographs. Cognition, 35, 41-68.
Zaitchik, D. (1991). Is only seeing really believing? Sources of true belief in the false belief task. Cognitive Development, 6, 91-103.

11 Extracting the coherent core of human probability judgement: a research program for cognitive psychology

Daniel Osherson^{a,*}, Eldar Shafir^{b}, Edward E. Smith^{c}

a IDIAP, C.P. 609, CH-1920 Martigny, Valais, Switzerland
b Department of Psychology, Green Hall, Princeton University, Princeton, NJ 08544-1010, USA
c Department of Psychology, University of Michigan, 330 Packard Road, Ann Arbor, MI 48104, USA

* Corresponding author. E-mail osherson@maya.idiap.ch
Research support was provided by the Swiss National Science Foundation contract 21-32399.91 to Osherson, by US Public Health Service Grant 1-R29-MH46885 from NIMH to Shafir, and by Air Force Office of Scientific Research, Contract No. AFOSR-92-0265 to Smith.

Abstract

Human intuition is a rich and useful guide to uncertain events in the environment but suffers from probabilistic incoherence in the technical sense. Developing methods for extracting a coherent body of judgement that is maximally consistent with a person's intuition is a challenging task for cognitive psychology, and also relevant to the construction of artificial expert systems. The present article motivates this problem, and outlines one approach to it.

1. Introduction

Human assessment of chances provides a guide to objective probabilities in a wide variety of circumstances. The survival of the species in diverse and rapidly evolving environments is testimony to this fact, as is the adequacy of our choices and judgements in most contexts encountered in daily life. At the same time, our assessments of chance are subject to systematic errors and biases that render them incompatible with the elementary axioms of probability. The character and origin of these errors have been the topic of intense scrutiny over several decades (for a partial review, see Osherson, in press).

How can the power and wisdom of human judgement be exploited while avoiding its weaknesses? One approach is to formulate principles of reasoning that simulate human judgement in large measure but correct it in view of probabilistic coherence. Such is the research program advocated and elaborated in our recent work (Osherson, Biolsi, Smith, Shafir, & Gualtierotti, 1994), and proposed in this article as a research program for cognitive psychology.

The fundamental idea of extracting elements of human judgement for use within a normatively correct system of reasoning has already been articulated by Judea Pearl (1986, 1988). However, as explained below, our approach is different from, and complementary to, the tradition spawned by Pearl's studies. In a word, earlier approaches are "extensional" in character, assigning probabilities to unanalyzed statements and their logical combinations; in contrast, our approach is "intensional" inasmuch as it relies on a representation of the semantic content of the statements to which probabilities must be attached.

The goal of the present article is not to insist upon the details of any one approach but rather to highlight a research question that we find challenging and important. The question is: how can orderly and plausible judgement about uncertain events be extracted from the turmoil of human intuition?

We begin with general remarks on the difficulty of reasoning about probabilities in coherent fashion. A specific implementation of our approach is presented in Osherson et al. (1994) but a somewhat different one will be summarized below.

2. Coherent reasoning

2.1. The attraction of probabilistic reasoning

A system that recommends action in the face of uncertainty should quantify its estimates of chance in conformity with the axioms of probability.[1] Such is the burden of classical analyses of betting and prediction, which highlight the risks of reasoning non-probabilistically (see Cox, 1946; de Finetti, 1964, 1972; Jeffrey, 1983; Lindley, 1982; Resnik, 1987; Savage, 1972, for extended discussion). Alternative principles have been proposed to govern situations in which probabilities cannot generally be determined, as in Shafer (1976, 1986) and Shortliffe and Buchanan (1975). However, close examination of these principles (e.g., Fagin & Halpern, 1991; Heckerman, 1986) reinforces the conviction that probabilistic reasoning remains the standard of rigorous thinking in the face of uncertainty.

[1] Presentation and discussion of the elementary axioms of probability is provided in Resnik (1987, section 3.2). We do not rely here on the additional axiom of countable additivity (Ross, 1988, section 1.3), which is more controversial (see Kelly, 1993, section 13.4).

Although the foregoing conclusion remains the subject of lively debate and reflection (as in Dubois & Prade, 1991; Shafer & Pearl, 1990) it will be adopted without further justification in what follows.

2.2. The computational difficulty of probabilistic reasoning

Whatever its normative attractiveness, probabilistic reasoning poses difficult computational challenges. If probabilities must be distributed over the sentences of an expressive language, these difficulties can become insurmountable.[2] However, even when probabilities must be attributed to sentences without complex internal structure, certain manipulations are known to be intractable, for example updating Bayesian belief networks (see Cooper, 1987). The root difficulty is the large number of events that must be kept in view in order to ensure probabilistic coherence. To explain the problem we now review some standard terminology. The remaining discussion bears exclusively on finite event spaces (the domain for most probabilistic expert systems). Moreover, instead of the term "event" we use the equivalent terminology of "statements" and "propositions". (For additional discussion, see Neapolitan, 1990, section 5.1; Pearl, 1988, section 2.1.)

[2] This is shown for first-order arithmetical languages in Gaifman and Snir (1982, Theorem 3.7); the case of weak monadic second order logic is discussed in Stockmeyer (1974, Theorem 6.1).

2.3. Probability over statements

To establish a domain of discourse for reasoning, a finite number of statements $S_1, S_2, \ldots, S_N$ are fixed in advance. Each $S_i$ is a determinate claim whose truth value may not be known with certainty, for example:

(1) Tulsa will accumulate more than 10 inches of rain in 1995.

The $N$ statements give rise to $2^N$ state descriptions, each of the form:

$$\pm S_1 \land \cdots \land \pm S_N$$

where $\pm S_i$ means that $S_i$ may or may not be negated. A state description is the logically strongest claim that can be made about the domain since it consists of the conjunction of every statement or its negation. A proposition is a subset of state descriptions.

Propositions are often denoted by the apparatus of propositional logic. For example, "$S_1 \lor S_2$" denotes the set of state descriptions in which at least one of $S_1$, $S_2$ occurs positively. There are $2^{2^N}$ propositions. A distribution is an assignment of non-negative numbers to the state descriptions, whose sum is one. A given distribution induces a probability for each proposition, obtained by summing the numbers associated with the state descriptions that make it up. Let a collection $C$ of propositions be given, and suppose that $M$ is a mapping of $C$ to real numbers. $M$ is called coherent just in case there is some distribution $P$ such that for all $X \in C$, $M(X) = P(X)$. Otherwise, $M$ is incoherent.

2.4. Maintaining coherence in large domains

Now suppose that the domain in question embraces 500 statements $S_1, S_2, \ldots, S_{500}$ concerning, let us say, five meteorological events in 100 American cities (we assume the events to be non-exclusive, referring to different months, for example). Suppose as well that some human agent $\mathcal{H}$ is asked to assign probabilities to a growing list $\mathcal{L}$ of propositions built from those statements. Each proposition is relatively simple in form, but the same statement may occur across different propositions. Thus, $\mathcal{L}$ might start off like this:

$$\mathcal{L} = \{\, S_2 \lor \neg S_3,\;\; S_{14} \rightarrow (S_2 \land S_1),\;\; S_8 \land \neg S_{14},\;\; \ldots \,\} \qquad (2)$$

Question: How can $\mathcal{H}$ assign coherent probabilities to the successive propositions in $\mathcal{L}$? In other words, as $\mathcal{H}$ associates numbers with more and more propositions, what procedure can be employed to ensure that the numbers are always extendible to a genuine probability distribution?

To achieve coherency, $\mathcal{H}$ could in principle proceed as follows.

Stage 1: Faced with the first proposition $S_2 \lor \neg S_3$, $\mathcal{H}$ writes down the four state descriptions based on statements $S_2$ and $S_3$, namely: $S_2 \land S_3$, $\neg S_2 \land S_3$, $S_2 \land \neg S_3$, $\neg S_2 \land \neg S_3$. Then $\mathcal{H}$ chooses a distribution over these state descriptions that reflects her beliefs about their respective probabilities. By summing over the first, third and fourth state descriptions she arrives at her probability for $S_2 \lor \neg S_3$. Her probabilities are coherent at the end of this stage.
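The definitions of Section 2.3 and Stage 1 of the procedure can be made concrete with a small sketch. It is only an illustration under stated assumptions: the helper names (`state_descriptions`, `probability`, `is_distribution`) and the numerical weights are invented here and do not come from the article. Exact fractions are used so that the sums carry no rounding error.

```python
from fractions import Fraction
from itertools import product


def state_descriptions(statements):
    """Return all 2**N truth assignments (state descriptions) over the statements."""
    return [dict(zip(statements, values))
            for values in product([True, False], repeat=len(statements))]


def probability(proposition, distribution):
    """Sum the weights of the state descriptions in which the proposition holds.

    `distribution` is a list of (state_description, weight) pairs.
    """
    return sum((w for sd, w in distribution if proposition(sd)), Fraction(0))


def is_distribution(distribution):
    """A distribution has non-negative weights that sum to one."""
    weights = [w for _, w in distribution]
    return all(w >= 0 for w in weights) and sum(weights) == 1


# Stage 1: the four state descriptions over S2 and S3, in the order
# S2 & S3, ~S2 & S3, S2 & ~S3, ~S2 & ~S3. The weights are hypothetical beliefs.
stage1 = [
    ({"S2": True,  "S3": True},  Fraction(4, 10)),
    ({"S2": False, "S3": True},  Fraction(3, 10)),
    ({"S2": True,  "S3": False}, Fraction(1, 10)),
    ({"S2": False, "S3": False}, Fraction(2, 10)),
]


def first_proposition(sd):
    """The proposition S2 v ~S3."""
    return sd["S2"] or not sd["S3"]


# Sums the first, third and fourth weights, as in Stage 1.
p_first = probability(first_proposition, stage1)
print(p_first)  # 7/10

# Adding S1 and S14 for the next proposition already requires 2**4 state descriptions.
print(len(state_descriptions(["S1", "S2", "S3", "S14"])))  # 16
```

Representing a proposition as a predicate over truth assignments, rather than as an explicit set of state descriptions, is a convenience of this sketch; the two views are interchangeable for a finite domain.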

Stage 2: Faced with the second proposition $S_{14} \rightarrow (S_2 \land S_1)$, she writes down the 16 state descriptions based on statements $S_1$, $S_2$, $S_3$ and $S_{14}$, that is, based on the four statements that appear in stages 1 and 2. Then $\mathcal{H}$ chooses a distribution over these state descriptions that meets two conditions, namely: (i) it is consistent with the distribution chosen for the four state descriptions of stage 1 (this is possible since her probabilities are coherent at the end of stage 1); (ii) it reflects her beliefs about the 16 state descriptions now in play. By taking the appropriate sum, $\mathcal{H}$ arrives at her probability for $S_{14} \rightarrow (S_2 \land S_1)$. Because of property (i) her probability for the first proposition $S_2 \lor \neg S_3$ may be recovered by adding the relevant state descriptions from among the current 16; the probability assigned to $S_2 \lor \neg S_3$ will not have changed from stage 1. Consequently, the totality of her probability attributions are still coherent at the end of this stage.

Stage 3: Faced with the third proposition $S_8 \land \neg S_{14}$, she writes down the 32 state descriptions based on statements $S_1$, $S_2$, $S_3$, $S_8$, $S_{14}$, and so forth.

The disadvantage of this procedure is that it soon requires excessive space to write down the list of needed state descriptions. Eventually, $2^{500}$ state descriptions need be written down at once, an enormous number.

It is not immediately obvious what procedure $\mathcal{H}$ can substitute for this one. Suppose, for example, that $\mathcal{H}$ attempts at each stage to limit attention to just the state descriptions needed in the evaluation of the current proposition. Thus, in the first stage, $\mathcal{H}$ would attribute probabilities only to the four state descriptions based on $S_2$ and $S_3$; in the second stage, $\mathcal{H}$ would attribute probabilities only to the eight state descriptions based on $S_1$, $S_2$ and $S_{14}$, etc. Let us assume, furthermore, that $\mathcal{H}$ chooses her probabilities in coherent fashion at each stage. This procedure is nonetheless insufficient to guarantee the coherence of $\mathcal{H}$'s judgement since it ignores logical dependencies among propositions at different stages. To take the simplest example, suppose that the state descriptions of (I) show up in $\mathcal{L}$ at one stage, and those of (II) show up later:

$$\text{(I)}\;\begin{cases} S_1 \land S_2 \\ \neg S_1 \land S_2 \\ S_1 \land \neg S_2 \\ \neg S_1 \land \neg S_2 \end{cases} \qquad\qquad \text{(II)}\;\begin{cases} S_2 \land S_3 \\ \neg S_2 \land S_3 \\ S_2 \land \neg S_3 \\ \neg S_2 \land \neg S_3 \end{cases}$$

Then overall coherence requires that the sum of the probabilities assigned to the first two state descriptions of (I) equal the sum of the probabilities assigned to the first and third state descriptions of (II). Otherwise, the two distributions imply different values for the statement $S_2$, violating overall coherence. It is thus clear that the revised procedure suggested above is inadequate without some means of keeping track of the combinations of state descriptions seen in $\mathcal{L}$ up to the current point, and this entails the same combinatorial explosion as before.

The difficulty of maintaining coherence over large domains is widely recognized. For example, Clark Glymour (1992, p. 361) summarizes the matter this way:

    To represent an arbitrary probability distribution, we must specify the value of the probability function for each of the state descriptions. So with 50 atomic sentences, for many probability distributions we must store $2^{50}$ numbers to represent the entire distribution. We cannot keep $2^{50}$ parameters in our heads, let alone 2 raised to the power of a few thousand, which is what would be required to represent a probability distribution over a realistic language. For such a language, therefore, there will be cases in which our beliefs are inconsistent and our degrees of belief incoherent.

Finally, note that the problem is aggravated by the expressiveness of the language used for ordinary thought. For example, we can easily express 100 predicates that apply sensibly to any of 100 grammatical subjects. There result 10,000 statements concerning which an agent might wish to reason probabilistically. Is there any hope of carrying out such reasoning coherently?

2.5. Tractability via conditional independence

Some distributions have special properties that allow coherent reasoning within modest computational bounds. For example, consider a distribution $P$ over statements $S_1, \ldots, S_N$ such that for all subsets $\{S_{r_1}, \ldots, S_{r_k}\}$ of $\{S_1, \ldots, S_N\}$,

$$P(S_{r_1} \land \cdots \land S_{r_k}) = P(S_{r_1}) \times \cdots \times P(S_{r_k}),$$

that is, in which the underlying statements are mutually stochastically independent. Reasoning according to $P$ does not require storing the probability of each state description. It suffices to store only the probabilities associated with $S_1, \ldots, S_N$ since the probability of any needed state description can be recovered through multiplication (relying where needed on the fact that $P(\neg S_i) = 1 - P(S_i)$). It follows that if the beliefs of our agent $\mathcal{H}$ are mutually independent in the foregoing sense then she can confront the list $\mathcal{L}$ with no fear of incoherence. For each proposition $X$ of $\mathcal{L}$, $\mathcal{H}$ need only carry out the following steps: (a) list the statement letters occurring in $X$; (b) decide which state descriptions over the foregoing list imply $X$; and (c) take the sum of the probabilities of the latter state descriptions as the final answer.
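Steps (a) to (c) can be sketched in a few lines for the mutually independent case. The function name `probability_under_independence` and the marginal values below are assumptions made for illustration only; they are not taken from the article.

```python
from fractions import Fraction
from itertools import product


def probability_under_independence(proposition, letters, marginals):
    """Probability of a proposition when the statements are mutually independent.

    proposition: function from a truth assignment (dict) to bool
    letters:     step (a), the statement letters occurring in the proposition
    marginals:   dict mapping each statement letter to P(S_i)
    """
    total = Fraction(0)
    for values in product([True, False], repeat=len(letters)):
        sd = dict(zip(letters, values))
        if not proposition(sd):
            continue  # step (b): keep only state descriptions that imply X
        weight = Fraction(1)
        for letter, value in sd.items():
            # Independence: multiply P(S_i) or P(~S_i) = 1 - P(S_i).
            weight *= marginals[letter] if value else 1 - marginals[letter]
        total += weight  # step (c): sum the probabilities
    return total


# Hypothetical marginal probabilities for three of the statements.
marginals = {"S1": Fraction(1, 2), "S2": Fraction(1, 4), "S14": Fraction(1, 5)}


def second_proposition(sd):
    """The proposition S14 -> (S2 & S1)."""
    return (not sd["S14"]) or (sd["S2"] and sd["S1"])


p_second = probability_under_independence(
    second_proposition, ["S1", "S2", "S14"], marginals)
print(p_second)  # 33/40
```

Only one number per statement is stored, and the work per proposition grows with the number of letters in that proposition rather than with the size of the whole domain, which is why the propositions need to remain reasonably short.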

might arise in ℒ (assuming that each proposition remains reasonably short). In addition, her judgement exhibits "path independence" in the sense that reordering ℒ will not change ℋ's probability for any proposition.

The mutual independence of S1, ..., SN is an unrealistic assumption in most situations. Weaker but still useful forms of independence can be defined (see Whittaker, 1990, for extended discussion). For example, if P(S1 | S2 ∧ S3) = P(S1 | S3) then S1 is said to be "conditionally independent" of S2 given S3 (according to P). A variety of schemes for exploiting conditional independence have been devised (e.g., Lauritzen & Spiegelhalter, 1988; Geiger, Verma, & Pearl, 1990; Heckerman, 1990b; Andreassen, Woldbye, Falck, & Andersen, 1989; Olesen et al., 1989; Long, Naimi, Criscitiello, & Jayes, 1987). If P exhibits a felicitous pattern of conditional independence, then its underlying state descriptions can be factored in such a way as to render their manipulation computationally tractable.

Unfortunately, even with the right configuration of conditional independencies, many of these systems require manual entry of an excessive number of probabilities and conditional probabilities, often minute in magnitude. Usually the probabilities cannot be assessed in actuarial fashion, so must be based on the judgement of experts (e.g., doctors). It is well known that experts are apt to provide incoherent estimates of probability (see Casscells, Schoenberger, & Grayboys, 1978; Kahneman & Tversky, 1972; Tversky & Kahneman, 1983; Winterfeld & Edwards, 1986, Ch. 4.5), and the numbing task of making thousands of judgements no doubt aggravates this tendency.

Several responses may be envisioned to the foregoing problem. First, the interrogation of experts can be rationalized and simplified (as in Heckerman, 1990a). Second, techniques can be implemented for finding a probability distribution that is maximally close to a set of possibly incoherent judgements (see Osherson et al., 1994). Third, procedures can be devised to reduce the effect of judgemental biases that lead to incoherence (as discussed in Kahneman, Slovic, & Tversky, 1982, Chs. 30-32; Winterfeld & Edwards, 1986). Fourth, methods can be invented for constructing human-like distributions on the basis of judgements that are psychologically more natural than assessments of the probability of arbitrary propositions. The fourth response has been raised in Szolovits and Pauker (1978). It is the one advocated here, though not to the exclusion of the other three.

The essential innovation of our approach is to attempt to derive probabilities on the basis of the "semantic" (really, "informational") content of the grammatical constituents that compose statements. The potential benefit is reduction in the amount of information that must be culled from a human informant (and later stored). The reduction is based on the combinatorial mechanisms of grammar, which allow a large number of statements to be generated from a small number of constituents. Let us now examine this idea.
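The three-step recipe given earlier for an agent whose beliefs are mutually independent (list the statement letters occurring in X, find the state descriptions over them that imply X, sum their probabilities) can be illustrated with a minimal sketch. The function names, the letters "S1" and "S2", and the probabilities below are illustrative assumptions, not part of the original text; the sketch relies only on independence and on the fact that the probability of a negated statement is one minus the probability of the statement.

```python
from itertools import product

def state_description_probability(state, p_true):
    """Probability of one state description, assuming mutual independence.

    `state` maps each statement letter to True/False; `p_true` maps each
    letter to the probability that it is true (hypothetical inputs).
    """
    prob = 1.0
    for letter, is_true in state.items():
        # P(not S) = 1 - P(S) for a negated letter.
        prob *= p_true[letter] if is_true else 1.0 - p_true[letter]
    return prob

def proposition_probability(proposition, letters, p_true):
    """Steps (a)-(c): enumerate state descriptions over `letters`,
    keep those that make `proposition` true, and sum their probabilities."""
    total = 0.0
    for values in product([True, False], repeat=len(letters)):
        state = dict(zip(letters, values))
        if proposition(state):
            total += state_description_probability(state, p_true)
    return total

# Illustrative (assumed) probabilities for two independent statements.
p_true = {"S1": 0.5, "S2": 0.2}
letters = ["S1", "S2"]

p_and = proposition_probability(lambda s: s["S1"] and s["S2"], letters, p_true)
p_or = proposition_probability(lambda s: s["S1"] or s["S2"], letters, p_true)
print(p_and, p_or)
```

Only one probability per statement is stored, and the enumeration grows with the number of letters in the proposition being evaluated rather than with the size of the whole domain, which is why short propositions stay tractable.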

3. Statement semantics

How can the meaning of statements best be represented in view of deriving human-like probabilities from their representations? Surely we are far from having the answer to this question. In the hope of stimulating discussion, the present section describes a simple approach that analyzes grammatical constituents along a fixed stock of dimensions. We begin by specifying the kind of statement to be treated.

3.1. Subjects, predicates, objects

A statement like

(3) Lawyers seldom blush

decomposes into a grammatical subject (namely, "Lawyers") and a grammatical predicate (namely, "seldom blush"). Henceforth we employ the term "object" in place of "grammatical subject" in order to prevent confusion with the "subjects" participating in psychological experiments. Thus, "Lawyers" and "seldom blush" are the object and predicate, respectively, of statement (3). Given object o and predicate p, we use [o, p] to denote the statement formed from them. We limit attention to statements of this simple object-predicate form. A domain of reasoning is established by fixing a (finite) list obj of objects and a (finite) list pred of predicates and then considering the set of statements S = {[o, p] | o ∈ obj and p ∈ pred}. For simplicity in what follows we assume that all the statements in S are meaningful, neither contradictory nor analytic, and logically independent from each other.

3.2. Vectorial representations

Our approach associates each object and predicate with a real vector in n dimensions, for some fixed value of n. Such a vector may be conceived as a rating of the object or predicate along n dimensions (e.g., for n = 3, a rating of the object TIGER along the dimensions size, speed and ferocity). The vector is intended to code a given person's knowledge (or mere belief) about the item in question (see Smith & Medin, 1981, for discussion). Vector representations might seem impoverished compared to "frames" or other elaborate schemes for knowledge representation (Bobrow & Winograd, 1976; Minsky, 1981, 1986). It is thus worth noting the considerable representational power of real vectors. Suppose that person P is chosen, and let us say that
predicate p "fits" object o just in case the probability of [o, p] according to P exceeds .5 (any other threshold would serve as well). This may be abbreviated to: P([o, p]) > .5. We would like to represent the fit-relation in terms of vectors. For this purpose, suppose that n-dimensional vectors are assigned to obj ∪ pred, one per object and one per predicate. Given such an assignment, let us say that o "dominates" p just in case the coordinates of o's vector are at least as great as the corresponding coordinates of p's vector. We have the following fact, proved in Doignon, Ducamp, and Falmagne (1984):

(4) Let P embody any fit-relation whatsoever. Then for some n, there is an assignment of n-dimensional vectors to obj ∪ pred such that for all o ∈ obj and all p ∈ pred, P([o, p]) > .5 if and only if o dominates p. Moreover, n can be chosen to not exceed the smaller of: the cardinality of obj, the cardinality of pred.

Intuitively, we can think of the vector assigned to a predicate as establishing criteria for membership in the associated category. (For example, the predicate "can learn a four choice-point maze in three trials" might have a requirement of .75 in the coordinate corresponding to intelligent.) For o to have greater than .5 probability of possessing p, o's values at each coordinate must exceed the criterion established by p. Fact (4) shows that such a scheme is perfectly general for representing probability thresholds and it renders plausible the idea that real vectors might also serve to predict continuous assessments of the probability of statements.

3.3. To and from statement representations

Recall that our goal is to capture the coherent core of a person's judgement about chances. Call the person at issue ℋ. Having decided to use vectors to represent obj ∪ pred, two questions remain to be answered. These are:

(a) Which particular object and predicate vectors are attributed to ℋ?
(b) How are object and predicate vectors translated into a probability distribution?

Once answers are offered to (a) and (b), a third question may be addressed, namely:

(c) If ℋ's vectors are fixed in accordance with the answer to (a), and if
probabilities are subsequently assigned to propositions in accordance with the answer to (b), how well do the resulting probabilities accord with ℋ's judgement about chances? Does the processed and regimented output of our system retain any of the insight that characterizes ℋ's understanding about probabilities in the environment?

Let us now briefly consider (a)-(c).

3.4. Fixing object and predicate vectors

One means of obtaining vectors is to request the needed information directly from ℋ via feature ratings (as in Osherson, Stern, Wilkie, Stob, & Smith, 1991). A less direct approach is to infer the vectors from similarity ratings among objects and predicates. In this case, we work backwards from a suitable vector-based model of similarity (e.g., those discussed in Osherson, 1987; Suppes, Krantz, Luce, & Tversky, 1989; Tversky, 1977), looking for vectors that best predict ℋ's similarity data. Another strategy is to postulate a model of simple probability judgements based on the needed vectorial representations, and then work backwards to vectors from such judgements. In this case, our system carries out "extrapolation", extending a small set of probability judgements to a more complete set (see Osherson, Smith, Meyers, Shafir, & Stob, in press).

3.5. Vectors to probabilities

Turning to question (b) above, we describe one procedure for synthesizing probabilities from the vectors underlying obj and pred. It rests upon a scheme for constructing three-dimensional Venn diagrams. Specifically, the pair of vectors associated with object o and predicate p is translated into a subregion ℛ of the unit cube.³ The volume of ℛ represents the probability of [o, p]. The position of ℛ determines its intersection with subregions assigned to other statements. The probability of a complex proposition (e.g., the intersection or the union of two statements) may then be determined by calculating the volume of the corresponding region. It is easy to see that use of the diagram guarantees probabilistic coherence.⁴

³ The unit cube has sides of length 1. It is used for convenience in what follows; various other kinds of solids would serve as well.
⁴ There is no mathematical reason to limit the diagram to three dimensions. Volumes in the n-dimensional unit cube for any positive n yield bona fide distributions. So far our experiments indicate that three dimensions are enough.

Let us now outline a simple scheme for selecting the particular region assigned
to a given statement [o, p]. Let O, P be the vectors underlying o, p, respectively, and suppose them to be suitably normalized so that all coordinates fall into the interval [0, 1]. Define the "O, P-box" to be the (unique) rectangular solid ℛ with the following properties:

(5) (a) O falls within ℛ;
    (b) for i ≤ 3 the length of the ith side of ℛ is 1 minus the absolute difference between O_i and P_i;
    (c) within the foregoing constraints, P is as close as possible to the geometrical center of ℛ.

It may be seen that the volume of ℛ varies directly with the match of P's coordinates to those of O; statements based on compatible objects and predicates are accorded higher probability thereby. Moreover, ℛ's position in the cube represents aspects of the semantics of [o, p]. For example, if p and q are complementary predicates with contrasting vectors then (5) assigns [o, p] and [o, q] boxes with little or no intersection. This reflects the low probability that must sensibly be assigned to [o, p] ∧ [o, q] in view of the incompatible contents of p and q.

Many alternatives to (5) are possible. To serve as a computationally tractable means of coherent reasoning in large domains it suffices to meet the following condition:

(C) Given a point x in the unit cube, and vectors O, P underlying object o and predicate p, it must be computationally easy to determine whether x lies in the region associated with [o, p].⁵

⁵ A more careful formulation of C would refer to ε-spheres in place of points x, etc.

In this case it is straightforward to calculate the volumes associated with any Boolean combination of statements, hence with any proposition. It is clear that (5) satisfies C.

Observe that within any Venn diagram scheme that conforms to C, coherent probabilistic reasoning can proceed without storing an entire distribution. It is enough to store the vectors underlying objects and predicates since the volume associated with any given proposition (of reasonable length) can be easily retrieved from the vectors. Thus, given 10 objects and 10 predicates, only 20 vectors need be stored. This is easily achieved even for vectors of considerable size. In contrast, 10 objects and 10 predicates give rise to 100 statements and thus to a distribution with 2^100 state descriptions. A potential solution to the problem of coherent reasoning, posed in section 2.4, is offered thereby.

It must be emphasized that not every distribution can be represented by a Venn
diagram that meets C (just as not every distribution manifests conditional independencies of a computationally convenient kind). The question thus arises: do distributions that conform to C approximate human intuition about chance in a wide variety of domains? We are thus led to question (c) above, namely, whether the distribution delivered by our method resembles the original intuitions of subject ℋ. Sticking with the simple scheme in (5) - henceforth called the "Venn model" - let us now address this matter.

3.6. Accuracy of the method

We summarize one experimental test of our method. By an elementary argument (over obj ∪ pred) is meant a non-empty set of statements, one of which is designated as "conclusion", the remainder (if any) as "premises". Statements are considered special cases of elementary arguments, in which the premise set is empty. An argument may be conceived as an invitation to evaluate the probability of its conclusion while assuming the truth of its premises.

Thirty college students evaluated 80 elementary arguments based on four mammals (which served as objects) and two predicates (e.g., "are more likely to exhibit 'fight' than 'flight' posture when startled"). For each subject, an individually randomized selection of 30 arguments was used to fix vectors representing his objects and predicates. This was achieved by working backwards from the Venn model, seeking vectors that maximize its fit to the subject's judgement about the 30 input arguments. The Venn model was then applied to the resulting vectors to produce probabilities for the remaining 50 arguments. For each subject we calculated the average, absolute deviation between the Venn model's predictions for the 50 arguments and the probabilities offered directly by the subject. Pearson correlations between the two sets of numbers were also calculated.

The median, average absolute deviation between the observed probabilities assigned to a subject's 50 predicted arguments and the probabilities generated by the Venn model is .11.⁶ The correlation between the two sets of numbers is .78. The results suggest that the Venn method can extrapolate a coherent set of probabilities from a small input set, and do this in such a way that the extrapolated distribution provides a reasonable approximation to the judgement of the person providing input. The input set of probabilities need not be coherent.

⁶ This deviation can be compared to the following statistic. Consider the mean of the probabilities assigned to the 30 arguments used to fix the object and predicate vectors of a given subject. We may use this single number as a predictor of the probabilities assigned to the remaining 50 arguments. In this case the median, average absolute deviation between the observed and predicted probabilities is .20.
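As a rough illustration of the Venn model and of the accuracy statistics just described, the following sketch computes the probability of simple statements as the volume of the O, P-box (by clause (b) of (5), side i has length 1 minus the absolute difference of the ith coordinates) and then compares predictions with judgements by average absolute deviation and Pearson correlation. All vectors, statements and "observed" judgements below are hypothetical values invented for illustration; they are not data from the experiment. The sketch covers single statements only: arguments with premises, and conjunctions generally, would also need the position of each box as fixed by clauses (a) and (c) of (5), which is not attempted here. The baseline is simplified to the mean of the same judgements rather than the mean of a separate input set.

```python
from math import prod, sqrt
from statistics import mean

def box_volume(obj_vec, pred_vec):
    """Volume of the O,P-box: product over coordinates of 1 - |O_i - P_i|."""
    return prod(1.0 - abs(o - p) for o, p in zip(obj_vec, pred_vec))

def mean_abs_deviation(observed, predicted):
    """Average absolute deviation between two equally long lists."""
    return mean(abs(a - b) for a, b in zip(observed, predicted))

def pearson(xs, ys):
    """Pearson correlation, computed directly from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical 3-dimensional vectors, normalized to [0, 1].
objects = {"tiger": (0.8, 0.7, 0.9), "rabbit": (0.2, 0.6, 0.1)}
predicates = {"is dangerous": (0.7, 0.6, 0.9), "is timid": (0.2, 0.5, 0.1)}

# Hypothetical judgements of one imaginary subject.
observed = {
    ("tiger", "is dangerous"): 0.90,
    ("tiger", "is timid"): 0.10,
    ("rabbit", "is dangerous"): 0.05,
    ("rabbit", "is timid"): 0.80,
}

statements = list(observed)
judged = [observed[s] for s in statements]
predicted = [box_volume(objects[o], predicates[p]) for o, p in statements]

print("average absolute deviation:", mean_abs_deviation(judged, predicted))
print("Pearson correlation:", pearson(judged, predicted))

# Simplified baseline: predict every statement by one mean judgement.
baseline = [mean(judged)] * len(judged)
print("baseline deviation:", mean_abs_deviation(judged, baseline))
```

Only one vector per object and one per predicate is stored; every statement probability is recomputed from them on demand, which is the storage saving the text points to.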

4. Concluding remarks

The problem of recovering the coherent core of human probability judgement strikes us as an important project for cognitive psychology. It unites theorizing about the mental mechanisms of reasoning with a practical problem for expert systems, namely, finding an exploitable source of Bayesian priors. The system sketched above is preliminary in character, and serves merely to suggest the feasibility of the research program we advocate.

It was noted in section 2.1 that probabilistic coherence has non-trivial justification as a standard - however incomplete - of normatively acceptable reasoning. Psychological research in recent years has produced considerable understanding of the character and causes of incoherent reasoning, even if debate continues about its scope and interpretation (see Gigerenzer & Murray, 1987; Osherson, 1990; Shafir & Tversky, 1992; Tversky & Shafir, 1992; and references cited there). We thus take there to be good empirical evidence, plus great computational plausibility, in favor of the thesis that human judgement is imperfect from the normative point of view. This thesis does not, however, impugn every aspect of ordinary reasoning. Indeed, the merits of human judgement have often been emphasized by the very researchers who investigate its drawbacks (e.g., Nisbett & Ross, 1980, p. 14). A challenge is posed thereby, namely, to devise methods that distill the rational component of human thought, isolating it from the faulty intuition that sometimes clouds our reason. Such appears to have been the goal of early inquiry into probability and utility (Gigerenzer et al., 1989, Ch. 1). It remains a worthy aim today.

References

Andreassen, S., Woldbye, M., Falck, B., & Andersen, S. (1989). Munin: A causal probabilistic network for interpretation of electromyographic findings. In Proceedings of the Tenth International Joint Conference on Artificial Intelligence.
Bobrow, D., & Winograd, T. (1976). An overview of KRL, a knowledge representation language. Cognitive Science, 1, 3-46.
Casscells, W., Schoenberger, A., & Grayboys, T. (1978). Interpretation by physicians of clinical laboratory results. New England Journal of Medicine, 299, 999-1000.
Cooper, G.F. (1987). Probabilistic inference using belief networks is NP-hard. Memo KSL-87-27, Knowledge Systems Laboratory, Stanford University, May 1987.
Cox, R. (1946). Probability, frequency, and reasonable expectation. American Journal of Physics, 14, 1-13.
de Finetti, B. (1964). La prevision: Ses lois logiques, ses sources subjectives [transl. into English]. In H. Kyburg & P. Smokler (Eds.), Studies in subjective probability. New York: Wiley.
de Finetti, B. (1972). Probability, induction and statistics. New York: Wiley.
Doignon, J.-P., Ducamp, A., & Falmagne, J.-C. (1984). On realizable biorders and the biorder dimension of a relation. Journal of Mathematical Psychology, 28, 73-109.
Dubois, D., & Prade, H. (1991). Updating with belief functions, ordinal conditional functions and
possibility measures. In P. Bonissone, M. Henrion, L.N. Kanal, & J.F. Lemmer (Eds.), Proceedings of the Sixth Workshop on Uncertainty in Artificial Intelligence (pp. 311-329). Amsterdam: Elsevier.
Fagin, R., & Halpern, J. (1991). A new approach to updating beliefs. In P. Bonissone, M. Henrion, L.N. Kanal, & J.F. Lemmer (Eds.), Proceedings of the Sixth Workshop on Uncertainty in Artificial Intelligence (pp. 347-374). Amsterdam: Elsevier.
Gaifman, H., & Snir, M. (1982). Probabilities over rich languages. Journal of Symbolic Logic, 47, 495-548.
Geiger, D., Verma, T., & Pearl, J. (1990). d-Separation: From theorems to algorithms. In M. Henrion, R. Shachter, L.N. Kanal, & J.F. Lemmer (Eds.), Uncertainty in artificial intelligence 5. Amsterdam: North-Holland.
Gigerenzer, G., & Murray, D. (1987). Cognition as intuitive statistics. Hillsdale, NJ: Erlbaum.
Gigerenzer, G., Swijtink, Z., Porter, T., Daston, L., Beatty, J., & Kruger, L. (1989). The empire of chance. Cambridge, UK: Cambridge University Press.
Glymour, C. (1992). Thinking things through. Cambridge, MA: MIT Press.
Heckerman, D. (1986). Probabilistic interpretations for MYCIN's certainty factors. In L.N. Kanal & J.F. Lemmer (Eds.), Uncertainty in artificial intelligence. Amsterdam: North-Holland.
Heckerman, D. (1990a). A tractable inference algorithm for diagnosing multiple diseases. In M. Henrion, R. Shachter, L.N. Kanal, & J.F. Lemmer (Eds.), Uncertainty in artificial intelligence 5. Amsterdam: North-Holland.
Heckerman, D. (1990b). Probabilistic similarity networks. Cambridge, MA: MIT Press.
Jeffrey, R. (1983). The logic of decision (2nd edn.). Chicago: University of Chicago Press.
Kahneman, D., Slovic, P., & Tversky, A. (Eds.) (1982). Judgment under uncertainty: Heuristics and biases. Cambridge, UK: Cambridge University Press.
Kahneman, D., & Tversky, A. (1972). Subjective probability: A judgement of representativeness. Cognitive Psychology, 3, 430-454.
Kelly, K. (1993). The logic of reliable inquiry. MS, Department of Philosophy, Carnegie Mellon University.
Lauritzen, S., & Spiegelhalter, D. (1988). Local computations with probabilities on graphical structures and their applications to expert systems. Journal of the Royal Statistical Society, B, 50, 157-224.
Lindley, D. (1982). Scoring rules and the inevitability of probability. International Statistical Review, 50, 1-26.
Long, W., Naimi, S., Criscitiello, M., & Jayes, R. (1987). The development and use of a causal model for reasoning about heart failure. IEEE Symposium on Computer Applications in Medical Care (pp. 30-36).
Minsky, M. (1981). A framework for representing knowledge. In J. Haugeland (Ed.), Mind design. Cambridge, MA: MIT Press.
Minsky, M. (1986). The society of mind. New York: Simon & Schuster.
Neapolitan, R. (1990). Probabilistic reasoning in expert systems. New York: Wiley.
Nisbett, R., & Ross, L. (1980). Human inference: Strategies and shortcomings of social judgment. Englewood Cliffs, NJ: Prentice-Hall.
Olesen, K., Kjaerulff, U., Jensen, F., Jensen, F., Falck, B., Andreassen, S., & Andersen, S. (1989). A munin network for the median nerve: A case study on loops. Applied Artificial Intelligence, 3, Special Issue: Towards Causal AI Models in Practice.
Osherson, D. (1987). New axioms for the contrast model of similarity. Journal of Mathematical Psychology, 31, 93-103.
Osherson, D. (1990). Judgment. In D. Osherson & E. Smith (Eds.), Invitation to cognitive science: Thinking (pp. 55-88). Cambridge, MA: MIT Press.
Osherson, D. (in press). Probability judgment. In E. Smith & D. Osherson (Eds.), Invitation to cognitive science: Thinking (2nd edn.). Cambridge, MA: MIT Press.
Osherson, D., Biolsi, K., Smith, E., Shafir, E., & Gualtierotti, A. (1994). A source of Bayesian priors. IDIAP Technical Report 94-03.
Osherson, D., Smith, E., Meyers, T., Shafir, E., & Stob, M. (in press). Extrapolating human probability judgment. Theory and Decision.
Osherson, D., Stern, J., Wilkie, O., Stob, M., & Smith, E. (1991). Default probability. Cognitive Science, 15, 251-270.
Pearl, J. (1986). Fusion, propagation, and structuring in belief networks. Artificial Intelligence, 29, 241-288.
Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Mateo, CA: Morgan-Kaufmann.
Resnik, M. (1987). Choices: An introduction to decision theory. Minneapolis: University of Minnesota Press.
Ross, S. (1988). A first course in probability. New York: Macmillan.
Savage, L. (1972). The foundations of statistics. New York: Dover.
Shafer, G. (1976). A mathematical theory of evidence. Princeton, NJ: Princeton University Press.
Shafer, G. (1986). Constructive probability. Synthese, 48, 1-60.
Shafer, G., & Pearl, J. (Eds.) (1990). Readings in uncertain reasoning. San Mateo, CA: Morgan-Kaufmann.
Shafir, E., & Tversky, A. (1992). Thinking through uncertainty: Nonconsequential reasoning and choice. Cognitive Psychology, 24, 449-474.
Shortliffe, E., & Buchanan, B. (1975). A model of inexact reasoning in medicine. Mathematical Biosciences, 23, 351-379.
Smith, E., & Medin, D. (1981). Categories and concepts. Cambridge, MA: Harvard University Press.
Stockmeyer, L. (1974). The complexity of decision problems in automata theory and logic. Ph.D. thesis, MIT.
Suppes, P., Krantz, D., Luce, R., & Tversky, A. (1989). Foundations of measurement (Vol. II). San Diego: Academic Press.
Szolovits, P., & Pauker, S. (1978). Categorical and probabilistic reasoning in medical diagnosis. Artificial Intelligence, 11, 115-144.
Tversky, A. (1977). Features of similarity. Psychological Review, 84, 327-352.
Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgement. Psychological Review, 90, 293-315.
Tversky, A., & Shafir, E. (1992). The disjunction effect in choice under uncertainty. Psychological Science, 5, 305-309.
Whittaker, J. (1990). Graphical models in applied multivariate statistics. New York: Wiley.
Winterfeld, D.V., & Edwards, W. (1986). Decision analysis and behavioral research. New York: Cambridge University Press.

12 Levels of causal understanding in chimpanzees and children

David Premack*, Ann James Premack
Laboratoire de Psycho-Biologie du Developpement, CNRS, 41 rue Gay Lussac, 75005 Paris, France

Abstract

We compare three levels of causal understanding in chimpanzees and children: (1) causal reasoning, (2) labelling the components (actor, object, and instrument) of a causal sequence, and (3) choosing the correct alternative for an incomplete representation of a causal sequence. We present two tests of causal reasoning, the first requiring chimpanzees to read and use as evidence the emotional state of a conspecific. Despite registering the emotion, they failed to use it as evidence. The second test, comparing children and chimpanzees, required them to infer the location of food eaten by a trainer. Children and, to a lesser extent, chimpanzees succeeded. When given information showing the inference to be unsound - physically impossible - 4-year-old children abandoned the inference but younger children and chimpanzees did not. Children and chimpanzees are both capable of labelling causal sequences and completing incomplete representations of them. The chimpanzee Sarah labelled the components of a causal sequence, and completed incomplete representations of actions involving multiple transformations. We conclude the article with a general discussion of the concept of cause, suggesting that the concept evolved far earlier in the psychological domain than in the physical.

*Corresponding author.
The data reported here were collected at the University of California, Santa Barbara, and the University of Pennsylvania Primate Facility. We are greatly indebted to Guy Woodruff, who participated in all phases of the research and would be a co-author if we knew his whereabouts and could obtain his permission. We are also indebted to the many students, graduate and undergraduate, at both institutions who assisted in the care and testing of the chimpanzees.

Introduction

In this paper we compare three levels of causal understanding in chimpanzees and children. At the deepest level, the individual engages in causal reasoning, solving problems in which he sees the outcome of a process but not its cause, and must infer or reconstruct the missing events. At an intermediate level the individual analyses intact causal sequences into their components and labels them. He must label the actor, object, and instrument of the action. The ability to carry out this task is a prerequisite for causal reasoning; if one cannot identify the separate parts of an intact sequence, one cannot identify the missing part of an incomplete sequence. At the most superficial level, the individual must complete an incomplete representation of a causal action, by selecting the correct alternative. Since the alternatives are all visible this task is the least demanding.

Chimpanzees have been shown to be capable of all but the deepest level (Premack 1976, 1983), though they have been shown capable of analogical reasoning. Not only do they complete incomplete analogies and make same/different judgments about exemplars that are and are not analogies (Gillan, Premack, & Woodruff, 1981), they also construct analogies from scratch (Oden & Premack, unpublished data). There is evidence (Gillan, 1981), albeit inconclusive, that they can do transitive inference. But there is little indication that they are capable of "Sherlock Holmes" type reasoning, that is, causal reasoning.

Causal reasoning

In causal reasoning, an outcome - a corpse on the floor - is presented, but the cause is not. A human confronted with this scene would ask immediately, "Who did it? How? When, where and why?" He would answer these questions by making inferences from the "evidence", which has two main sources: existing knowledge concerning the "corpse" and its past, and observations concerning the immediate circumstances. As an astute observer, Sherlock Holmes was a good reasoner because he had an uncanny sense of what was "relevant", detecting implications in what others dismissed as neutral facts.

In this paper we present two tests of causal reasoning: one conducted with a group of chimpanzees, another in which we compare chimpanzees and young children.

Subjects

The chimpanzees (Pan troglodytes) were African born; four were 3½-4½ years old and Sarah was about 10 years old at the time of the study. Animals entered the laboratory as infants, were diapered and bottle-fed, trained extensively in match-to-sample, and (some) were taught an artificial language. They were maintained rather like middle-class children on three meals a day, snacks, and continuous access to water. Participating children came from Philadelphia schools, and varied in age from 3.8 to 4.5 years, with an average age of 4.1 years. When tested, some were "rewarded" with a preferred food while others were simply praised.

Reading emotional evidence

Four juvenile chimpanzees were tested in a simple two-step reasoning problem. We first trained them to run down a path to a consistently positive hidden object, then introduced them to occasional negative trials, that is, a rubber snake was substituted for the food on 15% of trials on a random schedule. The unpredictable negative trials profoundly changed the character of the chimpanzees' run. They no longer dashed full speed to the goal, but slowed midway, approaching the goal hesitantly.

We next offered the animals an opportunity to play Holmes, to escape the uncertainty of the negative trial by using the emotional state of an informant to infer what object it had encountered on its run. Now, before starting a run, each animal was placed in a holding room with an informant that had just completed a run. The informant, having encountered either food or the rubber snake on its run, was in either a positive or negative emotional state, and this state, we have reason to believe, was successfully communicated to the recipient. Uninformed human judges shown videotapes of the informant could discriminate the state of the informant (ca. 98% correct). Beyond that, they could discriminate the recipient's state following its contact with the informant (ca. 70% correct). However, the "informed" chimpanzees seemed not to profit from this contact, for they accepted all opportunities to run, and did so in the same way whether: (1) the informant was in a positive state, (2) a negative state, or even when on control trials (3) had had no contact with an informant at all.

The holding room was adjacent to the runway. Animals were taken there prior to, and immediately after, a run (to serve as an informant). Every chimpanzee played both roles, that of informant and recipient, and had the opportunity to observe that its conspecifics too played both roles. The use of four animals permitted 12 possible recipient-informant pairs, all of which were used.

Under these conditions, the animals should have been able to infer that an informant's emotional state was the result of what it had found at the end of a run. No chimpanzee ever encountered another in the holding room whose emotional state was not owed to a run. Nonetheless, it still could not use the emotional state (which it registered at some level) as evidence from which to infer what the informant had encountered on its run.

Could the chimpanzee have made this inference but not have assumed that what the informant found was a good prediction of what it would find - in other words, assumed it might be snakes for you but food for me? We cannot rule out this possibility, but a human in this circumstance would certainly explore the hypothesis that snakes for you means snakes for me as well. Perhaps at 3½-4½ years the chimpanzees were too young and could have solved the problem when older. The age at which children can solve this problem is not known, for one cannot test children with frightening objects. Before testing the chimpanzees, we had assumed this was a simple problem, one that could be easily solved, and therefore could be used as a base condition on which to impose variations that would permit our answering fundamental questions about reasoning.

Using location as evidence

In the next experiment, we tested both chimpanzees (the same group of four used in the previous problem), and two groups of children (10 in each group). The apes were tested in their outdoor compound, and the children in a classroom.

For the chimpanzees, we placed two opaque containers about 30 feet apart. These formed the base of a triangle with the chimpanzee at its apex, midway between the containers and 30 feet from the base. The chimpanzee was accompanied by a trainer. As the two watched, a second trainer placed an apple in one container and a banana in the other. Following this action, the accompanying trainer placed a blind in front of the chimpanzee so that the containers were no longer visible to it. The trainer distracted the animal for approximately 2 min before removing the blind. What the subject now saw was the second trainer standing midway between the containers eating either an apple or a banana. Having eaten the fruit, the trainer left, and the chimpanzee was released.

Each animal was given 10 trials, with an intertrial interval of about an hour. The two fruit were placed equally often in both containers, and the fruit eaten by the experimenter was counterbalanced over trials. We used apples and bananas because apes are fond of both and find them about equally desirable. Children were tested with a comparable procedure adjusted to suit a classroom and human food preferences.

Results

Children in both groups were largely consistent, 18 of 20 choosing the container associated with an item different from the one the experimenter was eating.¹ That is, on trials on which the experimenter was seen eating a cookie, children selected the container which held the doughnut, and on trials on which he was seen eating the doughnut, selected the container which held the cookie.

Chimpanzees, by contrast, were inconsistent. Sadie, the oldest chimpanzee, responded as did the children, choosing the container associated with a fruit different from the one eaten by the experimenter. She made this selection on the first trial as well as on all subsequent ones. Luvie did the opposite, choosing the container associated with the fruit that was the same as the one eaten by the trainer. For instance, upon seeing the trainer eat the apple, she went to the container which held the apple, and upon seeing him eat the banana, to the container which held the banana. Bert and Jessie responded in an intermediate fashion, choosing the container associated with food different from that which the trainer was seen to be eating after first choosing the opposite container for two and four trials, respectively.

¹ The difference between these data and preliminary data reported earlier (Premack, 1983) comes from an unaccountable difference between our village and city children. Village children typically lagged city children by 6-12 months.

Discussion

Causal reasoning is difficult because a missing item must be reconstructed. While in simple learning, a monkey receives an electric shock when it presses a lever (and in observational learning observes that another receives an electric shock when it presses the lever), in causal reasoning the monkey does not observe the model press the lever but sees only its emotional response to the shock. Because in both simple and observational learning temporal contiguity between the lever press and electric shock are either experienced or observed, the relation between them is readily learned. Causal reasoning, however, does not provide such temporally contiguous events. While the monkey sees the relation between the lever press and the model's painful state, the chimpanzee does not actually see this relation, but experiences only the informant's emotional state, and must reconstruct from it the event that caused the state. In our tests, even though the chimpanzees experienced the same emotional states as a consequence of the same events, they were incapable of reconstructing those events from the emotional state of another chimpanzee. This helps clarify the striking difference in difficulty between learning and reasoning, and suggests why the former is found in all species, the latter in exceedingly few.

This experiment can be seen as one of causal reasoning because here too there is a missing element to reconstruct. The subjects saw the trainer eating one or another food, but were never shown where he obtained it. Nevertheless, the children and perhaps one chimpanzee evidently did reconstruct the location of the food. They assumed the food was the same as that which had been placed in the container, and in making this assumption believed the one container to be empty. What is most interesting about this outcome is that subjects "asked" the question of where the trainer got the food, and "answered" it quite specifically by going consistently to the container holding the food different from that eaten by the trainer. It is not the specific content of the assumption alone that is of interest, but the additional fact that they made any assumption at all. Most species, we suggest, will make no assumptions. They will observe the trainer eating and never ask where he obtained the food. Such a question is asked only if one sees an event as part of a causal sequence in which there is a missing component. One might say that in the second experiment there is evidence for causal reasoning on the part of the children and perhaps one chimpanzee.

Could we induce our subjects to change their assumption? Suppose there is insufficient time for the trainer to recover the food placed in the containers. Would this affect choice? Keeping all other conditions constant, we tested this possibility by wrapping the fruit and pastries in elaborate packages before placing them in the containers. Now the trainer could not possibly have obtained the food from the containers - there was not sufficient time for him to unwrap these items. Children of 4 years and older were profoundly affected by this change. They no longer chose the container holding the pastry different from the one eaten by the trainer but chose at chance level between the two containers. By contrast, younger children and the chimpanzees were totally unaffected by the change in the availability of the item. They responded as before.

Labelling a causal action

A causal action can be analysed into three components: the actor who carries out the action, the object on which he acts, and the instrument with which he performs the action. For instance, John paints a wall with a brush, cuts an apple with a knife, and washes his dirty socks in soapy water. Can young children and chimpanzees analyse such causal sequences? We devised a non-verbal procedure to answer this question by showing simple actions on a television monitor and giving our subjects markers that adhered to the screen and allowed them to identify each component of an action. For example, a red square placed on John identified the actor, a green triangle on the apple, the recipient of the action, a blue circle on the knife, the instrument of the action.

Sarah was given this problem. She was trained on three different actions: Bill cutting an orange, John marking a paper with a pencil, and Henry washing an apple with water. The trainer demonstrated the proper placement of each marker, handed the markers to Sarah, and corrected Sarah's errors. After reaching criterion on the three training tapes, she was given a transfer test in which all her responses were approved - our standard procedure in transfer tests.

The tests were uniquely demanding, for the scenes were not merely new, but also decidedly more complex than those used in training. Where the training scenes had presented one person acting on one object with one instrument, the transfer scenes presented: two objects, only one of which was acted upon; two instruments, only one of which was used; and two persons, only one of whom carried out the action, the other being engaged in some scenes as an observer of the action of the first, and in other scenes as the recipient of action, for example, Bill brushed Bob's hair.

Sarah passed the transfer tests, but at a relatively low level of accuracy. She was 85% correct in the use of the actor marker, 67% correct with the object marker, and 62% correct with the instrument marker. We attempted to improve her accuracy by training her on the transfer series, correcting her errors where previously we had approved all her responses. This tactic was entirely new, and brought the experiment to a halt. The attempt failed because she now placed the markers on the blank part of the screen (calling our attention to a fact we had previously overlooked - most of the screen is blank!). In retrospect, we recognize the experiment was needlessly difficult, and could have been simplified by dropping one of the categories, either of object or of instrument.

Kim Dolgin (1981) as part of her doctoral research applied the same procedures to young children from 3.8 to 4.4 years, with an average age of 4 years, using the same non-verbal approach used with Sarah. The children failed the transfer tests. They did not properly identify the actor marker but drew a simpler distinction, animate/inanimate (or perhaps person/non-person). They reserved the actor marker for people but without regard for whether the person was an actor, observer, or recipient of the action. They made a similar simplification in the case of object and instrument markers, reserving them for non-persons, but without regard for whether the object or instrument was in actual use or simply present.

With the children Dolgin took a further step not possible with Sarah. She told the children the meaning of each marker. For example, she presented the scene in which Bill cut the apple, and then showing the child the actor marker told her "He's the one doing it", the object marker "This is what he's doing it to", and the instrument marker "This is what he's doing it with." The results were dramatic. The children passed the transfer tests at an average level of 93%, far higher than Sarah.

Causal sequences as transformations

An actual causal sequence is a dynamic one in which an agent acts on an object, typically with the use of an instrument, changing its state and/or location. But one can represent the causal sequence in a stylized way - an apple and a cut apple representing the transformation of the apple, a knife the object responsible for the transformation.

To determine whether non-speaking creatures can recognize such transformations, we designed a test in which the subject was given an incomplete representation of this causal sequence and was required to choose an alternative that properly completed it (Premack, 1976). The main actions we tested were cutting, marking, wetting and, in special cases, actions that reversed the effect: joining, erasing and drying. The subjects had extensive experience with the actions on which they were tested, carrying them out in a play context.

The problem was given to three chimpanzees (and numerous children) in two basic formats, one requiring that they choose the missing operator, another the missing terminal state. In the former, they were given as a representative test sequence "apple ? cut apple" along with the alternatives: knife, pencil, container of water. In the latter, they were given as a representative test sequence "apple knife ?" along with the alternatives: cut apple, cut banana, wet apple.

The chimpanzees were given not only novel object-operator combinations but also anomalous ones - for example, cut sponges, wet paper, fruit that had been written on - and performed as well on the anomalous material as on the other (see Premack, 1976, pp. 4-7, 249-261 for details). The tests were passed only by language-trained chimpanzees which were not given any training on the test but passed them on their first exposure to them.

This test was only one of four that language-trained chimpanzees could do; the other three were analogies, same/different judgments on the relations between relations, and the matching of physically unlike proportions (e.g., ¼ potato to ¼ glass of water) (Premack, 1983). Language training conferred an immediate ability to solve this complete set of four tasks. Only with protracted training could non-language-trained chimpanzees be taught to do these tests, and then only one test at a time, with no apparent saving from one to the other (Premack, 1988).
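The two test formats can be illustrated with a small sketch. It is not a model of how the animals solved the task; it merely makes explicit what counts as a correct completion. The table of transformations and the function names are hypothetical, chosen to mirror the examples in the text, and a reversing action (erasing) is included as one of the special cases mentioned above.

```python
# Hypothetical encoding: (initial state, operator) -> terminal state.
TRANSFORMATIONS = {
    ("apple", "knife"): "cut apple",
    ("apple", "container of water"): "wet apple",
    ("paper", "pencil"): "marked paper",
    ("marked paper", "eraser"): "paper",  # a reversing action
}


def missing_operator(initial, terminal, alternatives):
    """Complete 'initial ? terminal' by choosing among the offered operators."""
    for operator in alternatives:
        if TRANSFORMATIONS.get((initial, operator)) == terminal:
            return operator
    return None  # none of the alternatives accounts for the change


def missing_terminal(initial, operator, alternatives):
    """Complete 'initial operator ?' by choosing among the offered end states."""
    outcome = TRANSFORMATIONS.get((initial, operator))
    return outcome if outcome in alternatives else None


if __name__ == "__main__":
    print(missing_operator("apple", "cut apple",
                           ["knife", "pencil", "container of water"]))
    print(missing_terminal("apple", "knife",
                           ["cut apple", "cut banana", "wet apple"]))
```

Because the table is keyed on the initial state, the same pair of objects presented in the opposite order calls for a different operator, which is the property a reversing action depends on.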

Mapping the directionality of action

The direction of the action was depicted in the test sequences by the left/right order of the objects: the object in its initial state was always presented on the left, the object in its transformed state on the right. But the standard tests did not require the subjects to read the sequences from left to right. Thus, they might have chosen an operator simply on the grounds that it belonged with a particular action - a knife, for example, as the operator for cutting - making this choice without regard to order. Whether the intact apple was to the left or right of the cut one, the animal could well have made the same choice.

To obtain evidence that Sarah could discriminate the left-right order of the sequence and make use of it, she was first acquainted with pairs of actions that had reverse effects. For example, the trainer showed her how to mend broken pieces of an object with Scotch tape, how to erase pencil or crayon marks with a gum eraser, how to dry a wet object with a cloth - and then gave her the opportunity to carry out the actions. She adopted these new actions with enthusiasm.

She was then given three test sessions with the original "cut, mark, wet", and four sessions with the new cases "tape, erase, dry". The tests took the standard form, for example, "apple ? cut apple", accompanied by the standard three operators, for example, knife, pencil, container of water. Each of the three cases (tape, erase, dry) was presented four times per session in random order, with the position of the correct operator randomized across left, centre and right positions.

Results:
Total = 40/60
Original cases = 12/18
New cases = 28/42
Zdiff between old and new not significant.

These preliminary results simply established that Sarah understood the new actions and could deal with them correctly. She was then required to use the left-right order of the sequence, and presented pairs of trials in which the same material appeared in reverse order, for example, "paper ? marked paper", "marked paper ? paper". She was given pencil, container of water, and eraser as possible operators. Now, to choose correctly, Sarah had to take order into account, for while pencil was correct (and eraser incorrect) for one order, eraser was correct (and pencil incorrect) for the other. It takes a pencil to mark a blank sheet of paper, an eraser to remove the mark.

Sarah was given 16 sessions, 12 trials per session, old cases being presented 24 times each, new cases 36 times each in random order across sessions. A total of 26 objects were used and 12 operators, two of each kind. Although the objects and operators used were the same as those in the previous tests, they were combined in new ways. All other details were the same as those already reported.

Results:
Total = 110/180
Original cases = 47/72
New cases = 63/108

If we exclude trials in which Sarah chose the incorrect irrelevant alternative rather than the relevant one, corresponding figures are:
Total = 110/148
Original cases = 47/62
New cases = 63/86
Zdiff between old and new not significant.

The reversal pairs compared as follows:
Cut/tape: 28/60
Mark/erase: 37/60
Wet/dry: 45/60
Zdiff between c/t and w/d = 3.18.

Finally, Sarah was given an extensive transfer test involving new objects and operators. In five sessions of 12 trials per session, 30 new objects were used as well as 60 new operators, 10 of each kind. Each object appeared twice, once with one or another of the six new operators, and again with the reverse operator. Each operator appeared three times: as correct alternative, incorrect reverse alternative, and incorrect irrelevant alternative. Each case appeared twice per session in counterbalanced order. All other details were the same as those of the preceding tests.

Results:
Total = 44/60
Original cases = 20/30
New cases = 24/30

Excluding trials in which Sarah chose incorrect alternative:
Total = 44/52
Original cases = 20/25
New cases = 24/27

Reversal pairs:
Cut/tape = 15/20
Mark/erase = 12/20 (p < .05 with three alternatives)
Wet/dry = 17/20
No significant Zdiff.

These data establish that Sarah could use left-right order to map the directionality of action as accurately on unfamiliar as on familiar cases.

Multiple transformations

The basic consequence of a causal action is a transformation - a change from an initial state to a final one. Could Sarah understand causal action from this perspective, looking at the initial state of an object, comparing it with its terminal state, then selecting the operator(s) that explains or accounts for the difference?

We can add to the interest of this question by removing the restrictions that were applied to the examples Sarah had been given. First, transformations involved more than a single action - for example, paper could be both cut and marked. Second, the initial state could be an already-transformed object rather than one in an intact or canonical state.

Now we not only lifted restrictions, but gave Sarah a special trash bin in which to discard incorrect or irrelevant operators. So, besides removing the interrogative particle and replacing it with the correct or relevant operators, she was required to select the incorrect or irrelevant operators, and place it/them in the trash.

The test consisted of six 12-trial sessions, each consisting of both single-action and double-action trials in equal number counterbalanced over the session. The six actions and their combinations were presented in equal number in each session counterbalanced over the session. The rest of the procedural details have already been reported.

Results:
Single transformations = 25/36
Double transformations = 24/36 (p < .001, both cases)
Total = 49/72

These results add to the evidence of Sarah's ability to use the test sequences as representations of action. Her analyses answered these implicit questions: (1) What operator changed the object from its initial to its terminal state? (2) In applying this operator to this object, what terminal state did one produce? (3) Which operators caused the difference between the initial and terminal state, and which did not? In answering these questions, Sarah had to attend to the order of the test sequences, "reading" them from left to right.

We speculate that the representational ability required to pass these tests is that of a mind/brain which is capable of copying its own circuits. In carrying out an actual causal sequence, such as cutting an apple with a knife, an individual may form a neural circuit enabling him to carry out the act efficiently. But suppose he is not required to actually cut an apple, but is instead shown a representation of cutting - an incomplete depiction of the cutting sequence such as was given the chimpanzees - could he use the neural circuit to respond appropriately, that is, to complete the representation by choosing the missing element? Probably not, for the responses associated with the original circuit are those of actual cutting; they would not apply to repairing an incomplete representation of cutting.

Moreover, the representation of a sequence can be distorted in a number of ways, not only by removing elements as in the chimpanzee test, but also by duplicating elements, misordering them, adding improper elements, or combinations of the above. To restore distorted sequences to their canonical form requires an ability to respond flexibly, for example, to remove elements, add others, restore order and the like. Flexible novel responding of this kind is not likely to be associated with the original circuitry (that concerned with actual cutting), but more likely with a copy of the circuit. Copies of circuits are not tied, as are the original circuits, to a fixed set of responses, and they may therefore allow for greater novelty and flexibility. For this reason, we suggest, flexible responding may depend on the ability of a mind/brain to be able to make copies of its own circuits.

An attempt to combine three questions

Sarah was given a test that consisted of three questions: (1) What is the cause of this problem? (2) How can it be solved? (3) What is neither cause nor solution but merely an associate of the problem?

These questions were not asked explicitly, with words, but implicitly with visual exemplars. Similarly, her answers were not given in words but in visual exemplars. The problems about which she was queried were depicted by a brief videotape (the terminal image of which was put on hold). For instance, she was shown a trainer vigorously stamping out a small conflagration of burning paper. The questions asked her in this case were: What caused the fire? How could it be put out? What is neither cause nor solution but an associate of the fire? The correct answers were photographs of: matches (cause), a bucket of water (solution), and a pencil (associate), the latter because she often used a pencil in scribbling on paper of exactly the kind that was shown in the videotape.

The three questions were identified by different markers (like those used to identify the three components of an action), though the meaning of these markers was not determined by their location on the television image, but by the correct answers with which each marker was associated. The markers were introduced by presenting each of them with a videotape, offering three photographs and teaching her which of them was correct.

In the example concerning a fire, which served as one of three training cases, she was presented the marker for "cause" (red square), with three photographs - matches, clay, knife - and taught to choose matches. When presented the marker for "solution" (green triangle), she was shown photographs of water, Scotch tape, eraser, and taught to choose the bucket of water. When presented the marker for associate (blue circle), she was shown photographs of pencil, apple, blanket, and taught to choose pencil. This procedure, repeated with two other training videotapes, was intended to teach her to view the problems depicted on the videotape according to the perspective indicated by the correct answer associated with a marker.

Once she reached criterion on the three training cases, she was given a transfer test involving 20 new problems. When she failed, it was decided to train her on this material and to bring in a new trainer - one who no longer played an active role in her care or testing but who had been an early caretaker, was a favourite, and could be counted on to bring out her best effort.

Sarah definitely "tried" harder with some trainers than with others. Her looking behaviour was readily observable. After the trainer gave her the test material in a manila envelope, he left the cage area. Sarah could be observed on a television monitor to empty the envelope on the cage floor, spread out the alternatives, inspect them, choose, and then ring her bell (as a period marker signalling an end and summoning the trainer). Ordinarily she looked only once at an alternative before choosing, but with a difficult question and a favourite trainer, several times. With this favourite trainer she not only looked at the alternatives with more than usual care but did several double-takes, that is, looked, looked away, and then quickly looked back. This did not help her cause; she made three consecutive errors. When the trainer entered to show her the fourth videotape, she lost sphincter control and ran screaming about the cage.

Although a demanding test, it is not necessarily beyond the capacity of the chimpanzee. It must be taught more carefully than we did, not as a combination of three markers, but one marker at a time, then two and, only when there is success on two, all three presented together. We subsequently used this approach with 4½-6½-year-old children on a problem only slightly less demanding than the one given Sarah, and they succeeded nicely (Premack, 1988).
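For readers who wish to check the statistics quoted with the reversal and transformation results, the following sketch may help. It rests on two assumptions of ours, not statements from the chapter: that Zdiff is a two-proportion z statistic computed with a pooled variance estimate, and that chance on a three-alternative trial is 1/3. Under the first assumption the cut/tape versus wet/dry comparison (28/60 against 45/60) yields the reported value of 3.18.

```python
from math import comb, sqrt


def z_diff(k1, n1, k2, n2):
    """Two-proportion z statistic with a pooled variance estimate (assumed)."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se


def binomial_tail(k, n, p):
    """Exact probability of k or more successes in n trials at chance level p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Cut/tape (28/60) against wet/dry (45/60).
    print(round(z_diff(28, 60, 45, 60), 2))
    # Mark/erase in the transfer test: 12 of 20 correct, three alternatives.
    print(binomial_tail(12, 20, 1 / 3))
    # Single transformations: 25 of 36 correct, chance again assumed to be 1/3.
    print(binomial_tail(25, 36, 1 / 3))
```

The exact binomial tail is used in place of a normal approximation because the trial counts are small; with 12 of 20 correct it falls below .05, in line with the level reported for the mark/erase pair.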

General discussion

There are two traditions in which causality has been studied in psychology: the natural causality of Michotte (1963), and the arbitrary causality of Hume (1952). Natural causality (Premack, 1991) concerns the relation between special pairs of events - one object launching another by colliding with it is the classic example; whereas arbitrary causality concerns the relation between any pair of temporally contiguous events - a lever press followed by the presentation of a Mars bar is one example.

These two traditions have fostered conflicting interpretations of causality. The perception of natural causal relations requires only a single pairing of appropriate events, is demonstrable in 6-month-old infants (e.g., Leslie & Keeble, 1987) and is considered innate; whereas the perception of arbitrary causal relations requires repeated pairings in time of two events, and is learned. But are these differences real or do they simply reflect differences in subject matter?

Although Michotte's case is typically the only cited example of natural causality, it is essential to recognize that there is another, and more basic example of natural causality. This is the psychological case where we perceive causality under two conditions: (1) when an object moves without being acted upon by another object, that is, is "self-propelled" (Premack, 1990); and (2) when one object affects another "at a distance", that is, affects another despite a lack of spatial/temporal contiguity. Humans unquestionably perceive causality under both these conditions. Yet this fact has received little comment - virtually none compared to the extensive comment on the Michotte case. Why? We suggest, because the perception of causality in the psychological domain evolved far earlier than it did in the Michotte case and belongs to a "part of the mind" that is less accessible to language.

The perception of causality of the Michotte variety probably evolved late, and even then only in a few tool-using species (Kummer, in press), for only humans, apes, and the exceptional monkey (e.g., Cebus) handle objects in a manner capable of producing collisions. Compare this to the perception of causality in the psychological domain, which is not restricted to a few tool-using species but is found in virtually all species. Intentional action which involves either a single individual or one individual acting upon another is part of the experience of all but a few invertebrates.

Bar pressing that produces food, collisions that launch objects, threats that produce withdrawal, and so on, all fit neatly into either a physical or psychological category. These cases are important because they give the impression that the concept of cause has a content: psychological, physical, or both. However, infants may perceive a causal relationship when presented with totally arbitrary cases, for example, a sharp sound followed by a temporally contiguous colour change in an object.

Let us habituate one group of infants to a contiguous case, another to a delay case, and then apply the Leslie-Keeble paradigm by reversing the order of the two events. We present colour change followed by sound to both groups. The contiguous group, we predict, will show greater dishabituation (recovery in looking time) than the delay group. If so, our outcome will be the same as that obtained by Leslie-Keeble in the Michotte example - a greater recovery for the group in which the events are presented contiguously.

But how does one explain these results? Just as Leslie-Keeble do. While contiguous events certainly do lead to the perception of causality, is this perception confined to the Michotte case? Causality is not bound by content. In other words, this example is important because it demonstrates that the concept of cause may be without content, for the sequence "sound-colour change" which involves neither an intentional act nor the transfer of energy from one object to another, is an example of neither psychological nor physical causality.

Perhaps the concept of causality at its most fundamental level is no more than a device that records the occurrence of contiguous events - associative learning - and is found in all species. All species may share a device that records the occurrence of contiguous events, and evolution contributed two major additions to this primitive device: first, the capacity to act intentionally which enabled certain species not only to register but also to produce contiguous events; second, the capacity, largely unique to the human, to explain or interpret events that have been both registered and produced.

While the basis of the primitive level has not been resolved by neuroscience (for this level operates on "events", and how the mind/brain binds items so as to construct events remains a challenge for neuroscience; e.g., Singer, 1990), fortunately, the second level of causality is well represented by work on animal learning. Especially Dickinson and his colleagues (e.g., Dickinson & Shanks, in press) have considered the special subset of concurrent events - act-outcome pairs - brought about by intentional action. Are such pairs marked in some fashion, and thus represented differently in memory from other concurrent pairs?

The third level of causality is to be found in recent work on domain-specific theories of naive physics (Spelke, Phillips, & Woodward, in press; Baillargeon, Kotovsky, & Needham, in press), psychology (Leslie, in press; Gelman, Durgin, & Kaufman, in press; Premack & Premack, in press), and arguably biology (Carey, in press; Keil, in press). These theories separate the concurrent events (registered by the primitive device) into special categories, and propose an explanatory mechanism for each of them. Explanation, we suggest, embedded in naive theories about the world, is largely a human specialization.

References

Baillargeon, R., Kotovsky, L., & Needham, A. (in press). The acquisition of physical knowledge in infancy. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.
Carey, S. (in press). On the origin of causal understanding. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.
Dickinson, A., & Shanks, D. (in press). Instrumental action and causal representation. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.
Dolgin, K.G. (1981). A developmental study of cognitive predisposition: A study of the relative salience of form and function in adult and four-year-old subjects. Dissertation, University of Pennsylvania.
Gelman, R., Durgin, F., & Kaufman, L. (in press). Distinguishing between animates and inanimates: Not by motion alone. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.
Gillan, D.J. (1981). Reasoning in the chimpanzee. II. Transitive inference. Journal of Experimental Psychology: Animal Behavior Processes, 7, 150-164.
Gillan, D.J., Premack, D., & Woodruff, G. (1981). Reasoning in the chimpanzee. I. Analogical reasoning. Journal of Experimental Psychology: Animal Behavior Processes, 7, 1-17.
Hume, D. (1952). An enquiry concerning human understanding. In Great books of the western world (Vol. 35). Chicago: Benton.
Keil, F. (in press). The growth of causal understandings of natural kinds. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.
Kummer, H. (in press). On causal knowledge in animals. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.
Leslie, A., & Keeble, S. (1987). Do six-month-old infants perceive causality? Cognition, 25, 265-288.
Michotte, A. (1963). The perception of causality. London: Methuen.
Premack, D. (1976). Intelligence in ape and man. Hillsdale, NJ: Erlbaum.
Premack, D. (1983). The codes of man and beasts. Behavioral and Brain Sciences, 6, 125-167.
Premack, D. (1988). Minds with and without language. In L. Weiskrantz (Ed.), Thought without language. Oxford: Clarendon Press.
Premack, D. (1990). The infant's theory of self-propelled objects. Cognition, 36, 1-16.
Premack, D. (1991). Cause/induced motion: intention/spontaneous motion. Talk at Fyssen Foundation Conference on the origins of the human brain. Oxford: Clarendon Press.
Premack, D., & Premack, A.J. (in press). Semantics of action. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.
Singer, W. (1990). Search for coherence: A basic principle of cortical self-organization. Concepts in Neuroscience, 1, 1-26.
Spelke, E.S., Phillips, A., & Woodward, A.L. (in press). Infant's knowledge of object motion and human action. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.

13 Uncertainty and the difficulty of thinking through disjunctions

Eldar Shafir*
Department of Psychology, Princeton University, Princeton, NJ 08544, USA

Abstract

This paper considers the relationship between decision under uncertainty and thinking through disjunctions. Decision situations that lead to violations of Savage's sure-thing principle are examined, and a variety of simple reasoning problems that often generate confusion and error are reviewed. The common difficulty is attributed to people's reluctance to think through disjunctions. Instead of hypothetically traveling through the branches of a decision tree, it is suggested, people suspend judgement and remain at the node. This interpretation is applied to instances of decision making, information search, deductive and inductive reasoning, probabilistic judgement, games, puzzles and paradoxes. Some implications of the reluctance to think through disjunctions, as well as potential corrective procedures, are discussed.

* E-mail eldar@clarity.princeton.edu
This research was supported by US Public Health Service Grant No. 1-R29-MH46885 from the National Institute of Mental Health, and by a grant from the Russell Sage Foundation. The paper has benefited from long discussions with Amos Tversky, and from the comments of Philip Johnson-Laird, Daniel Osherson, and Edward Smith.

Introduction

Everyday thinking and decision making often occur in situations of uncertainty. A critical feature of thinking and deciding under uncertainty is the need to consider possible states of the world and their potential consequences for our beliefs and actions. Uncertain situations may be thought of as disjunctions of possible states: either one state will obtain, or another. In order to choose between alternative actions or solutions in situations of uncertainty, a person needs to consider the anticipated outcomes of each action or each solution pattern under each state. Thus, when planning a weekend's outing, a person may want to consider which of a number of activities she would prefer if the weekend is sunny and which she would prefer if it rains. Similarly, when contemplating the next move in a chess game, a player needs to consider what the best move would be if the opponent were to employ one strategy, and what may be the best move if the opponent were to follow an alternative plan.

Special situations sometimes arise in which a particular action, or solution, yields a more desirable outcome no matter how the uncertainty is resolved. Thus, a person may prefer to go bowling rather than hiking regardless of whether it is sunny or it rains, and an exchange of queens may be the preferred move whatever the strategy chosen by the opponent. An analogous situation was described by Savage (1954) in the following passage:

A businessman contemplates buying a certain piece of property. He considers the outcome of the next presidential election relevant to the attractiveness of the purchase. So, to clarify the matter for himself, he asks whether he would buy if he knew that the Republican candidate were going to win, and decides that he would do so. Similarly, he considers whether he would buy if he knew that the Democratic candidate were going to win, and again finds that he would do so. Seeing that he would buy in either event, he decides that he should buy, even though he does not know which event obtains . . .

Savage calls the principle that governs this decision the sure-thing principle (STP). According to STP, if a person would prefer a to b knowing that X obtained, and if he would also prefer a to b knowing that X did not obtain, then he definitely prefers a to b (Savage, 1954, p. 22). STP has a great deal of both normative and descriptive appeal, and is one of the simplest and least controversial principles of rational behavior. It is an important implication of "consequentialist" accounts of decision making, in that it captures a fundamental intuition about what it means for a decision to be determined by the anticipated consequences.1 It is a cornerstone of expected utility theory, and it holds in other models of choice which impose less stringent criteria of rationality (although see McClennen, 1983, for discussion).

1 The notion of consequentialism appears in the decision theoretic literature in a number of different senses. See, for example, Hammond (1988), Levi (1991), and Bacharach and Hurley (1991) for technical discussion. See also Shafir and Tversky (1992) for a discussion of nonconsequentialism.

Despite its apparent simplicity, however, people's decisions do not always abide by STP. The present paper reviews recent experimental studies of decision under uncertainty that exhibit violations of STP in simple disjunctive situations. It is argued that a necessary condition for such violations is people's failure to see through the underlying disjunctions. In particular, it is suggested that in situations of uncertainty people tend to refrain from fully contemplating the consequences of potential outcomes and, instead, suspend judgement and remain, undecided, at the uncertain node. Studies in other areas, ranging from deduction and probability judgement to games and inductive inference, are then considered, and it is argued that a reluctance to think through disjunctions can be witnessed across these diverse domains. Part of the difficulty in thinking under uncertainty, it is suggested, derives from the fact that uncertainty requires thinking through disjunctive situations. Some implications and corrective procedures are considered in a concluding section.

Decisions

Risky choice

Imagine that you have just gambled on a toss of a coin in which you had an equal chance to win $200 or lose $100. Suppose that the coin has been tossed, but that you do not know whether you have won or lost. Would you like to gamble again, on a similar toss? Alternatively, how would you feel about taking the second gamble given that you have just lost $100 on the first (henceforth, the Lost version)? And finally, would you play again given that you have won $200 on the first toss (the Won version)? Tversky and Shafir (1992) presented subjects with the Won, Lost, and uncertain versions of this problem, each roughly a week apart. The problems were embedded among several others so the relation among the three versions would not be transparent, and subjects were instructed to treat each decision separately. The data were as follows: the majority of subjects accepted the second gamble after having won the first gamble, the majority accepted the second gamble after having lost the first gamble, but most subjects rejected the second gamble when the outcome of the first was not known. Among those subjects who accepted the second gamble both after a gain and after a loss on the first, 65% rejected the second gamble in the disjunctive condition, when the outcome of the first gamble was uncertain. In fact, this particular pattern - accept when you win, accept when you lose, but reject when you do not know - was the single most frequent pattern exhibited by our subjects (see Tversky & Shafir, 1992, for further detail and related data).

A decision maker who would choose to accept the second gamble both after having won and after having lost the first should - in conformity with STP - choose to accept the second gamble even when the outcome of the first is uncertain. However, when it is not known whether they have won or lost, our subjects refrain from contemplating (and acting in accordance with) the consequences of winning or of losing. Instead, they act as if they need the uncertainty about the first toss to be resolved. Elsewhere, we have suggested that people have different reasons for accepting the second gamble following a gain and following a loss, and that a disjunction of different reasons ("I can no longer lose . . ." in case I won the first gamble, or "I need to recover my losses . . ." in case I lost) is often less compelling than either of these definite reasons alone (for further discussion of the role of reasons in choice, see Shafir, Simonson, & Tversky, 1993). Tversky and Shafir (1992) call the above pattern of decisions a disjunction effect. A disjunction effect occurs when a person prefers x over y when she knows that event A obtains, and she also prefers x over y when she knows that event A does not obtain, but she prefers y over x when it is unknown whether or not A obtains. The disjunction effect amounts to a violation of STP, and hence of consequentialism. While a reliance on reasons seems to play a significant role in the psychology that yields disjunction effects, there is nonetheless another important element that contributes to these paradoxical results: people do not see through the otherwise compelling logic that characterizes these situations. When confronting such disjunctive scenarios, which can be thought of as decision trees, people seem to remain at the uncertain nodes, rather than contemplate the - sometimes incontrovertible - consequences of the possible branches.

The above pattern of nonconsequential reasoning may be illustrated with the aid of the value function from Kahneman and Tversky's (1979) prospect theory. The function, shown in Fig. 1, represents people's subjective value of losses and of gains, and captures common features of preference observed in numerous empirical studies. Its S-shape combines a concave segment to the right of the origin reflecting risk aversion in choices between gains, and a convex segment to the left of the origin reflecting risk seeking in choices between losses. Furthermore, the slope of the function is steeper on the left of the origin than on the right, reflecting the common observation that "losses loom larger than gains" for most people. (For more on prospect theory, see Kahneman & Tversky, 1979, 1982, as well as Tversky & Kahneman, 1992, for recent extensions.) The function in Fig. 1 represents a typical decision maker who is indifferent between a 50% chance of winning $100 and a sure gain of roughly $35, and, similarly, is indifferent between a 50% chance of losing $100 and a sure loss of roughly $40. Such a pattern of preferences can be captured by a power function with an exponent of .65 for gains and .75 for losses. While prospect theory also incorporates a decision weight function, π, which maps stated probabilities into their subjective value for the decision maker, we will assume, for simplicity, that decision weights coincide with stated probabilities. While there is ample evidence to the contrary, this does not change the present analysis.

Fig. 1. The value function \(v(x) = x^{.65}\) for \(x \ge 0\) and \(v(x) = -(-x)^{.75}\) for \(x \le 0\).

Consider, then, a person P whose values for gains and losses are captured by the function of Fig. 1. Suppose that P is presented with the gamble problem above and is told that he has won the first toss. He now needs to decide whether to accept or reject the second. P needs to decide, in other words, whether to maintain a sure gain of $200 or, instead, opt for an equal chance at either a $100 or a $400 gain. Given P's value function, his choice is between two options whose expected values are as follows:

Accept the second gamble: \[ .50 \times 400^{.65} + .50 \times 100^{.65} \]
Reject the second gamble: \[ 1.0 \times 200^{.65} \]

Because the value of the first option is greater than that of the second, P is predicted to accept the second gamble. Similarly, when P is told that he has lost the first gamble and needs to decide whether to accept or reject the second, P faces the following options:

Accept the second gamble: \[ .50 \times -[200^{.75}] + .50 \times 100^{.65} \]
Reject the second gamble: \[ 1.0 \times -[100^{.75}] \]


Again, since the first quantity is larger than the second, P is predicted to accept the second gamble. Thus, once the outcome of the first gamble is known, the value function of Fig. 1 predicts that person P will accept the second gamble whether he has won or lost the first. But what is P expected to do when the outcome of the first gamble is not known? Because he does not know the outcome of the first gamble, P may momentarily assume that he is still where he began - that, for the moment, no changes have transpired. Not knowing whether he has won or lost, P remains for now at the status quo, at the origin of his value function. When presented with the decision to accept or reject the second gamble, P evaluates it from his original position, without incorporating the outcome of the first gamble, which remains unknown. Thus, P needs to choose between accepting or rejecting a gamble that offers an equal chance to win $200 or lose $100:

Accept the second gamble: \[ .50 \times -[100^{.75}] + .50 \times 200^{.65} \]
Reject the second gamble: \[ 0 \]

Because the expected value of accepting is just below 0, P decides to reject the second gamble in this case. Thus, aided by prospect theory's value function, we see how a decision maker's "suspension of judgement" - his tendency to assume himself at the origin, or status quo, when it is not known whether he has won or lost - leads him to reject an option that he would accept no matter what his actual position may be. Situated at a chance node whose outcome is not known, P's reluctance to consider each of the hypothetical branches leads him to behave in a fashion that conflicts with his preferred behavior given either branch. People in these situations seem to confound their epistemic uncertainty - what they may or may not know - with uncertainty about the actual consequences - what may or may not have occurred. A greater focus on the consequences would have helped our subjects realize the implications for their preference of either of the outcomes. Instead, not knowing which was the actual outcome, our subjects chose to evaluate the situation as if neither outcome had obtained. It is this reluctance to think through disjunctions that characterizes many of the phenomena considered below.
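The three comparisons in this analysis can be checked numerically. The following is a minimal sketch, assuming the power-function form with exponents of .65 for gains and .75 for losses stated in the text and, as the text does, decision weights equal to stated probabilities. The function names `value` and `gamble_value` are illustrative and do not come from the original paper.

```python
def value(x, gain_exp=0.65, loss_exp=0.75):
    """Subjective value of outcome x: x**.65 for gains, -(-x)**.75 for losses."""
    return x ** gain_exp if x >= 0 else -((-x) ** loss_exp)


def gamble_value(outcomes):
    """Value of a gamble given as (probability, outcome) pairs.

    Assumption from the text: decision weights equal stated probabilities.
    """
    return sum(p * value(x) for p, x in outcomes)


# Won version: P stands at +200 and the second toss adds +200 or -100.
won_accept = gamble_value([(0.5, 400), (0.5, 100)])
won_reject = value(200)

# Lost version: P stands at -100 and the second toss adds +200 or -100.
lost_accept = gamble_value([(0.5, -200), (0.5, 100)])
lost_reject = value(-100)

# Uncertain version: P evaluates the second toss from the status quo (0).
unknown_accept = gamble_value([(0.5, -100), (0.5, 200)])
unknown_reject = value(0)

for label, accept, reject in [
    ("won", won_accept, won_reject),
    ("lost", lost_accept, lost_reject),
    ("unknown", unknown_accept, unknown_reject),
]:
    decision = "accept" if accept > reject else "reject"
    print(f"{label:8s} accept={accept:8.2f} reject={reject:8.2f} -> {decision}")
```

Under these assumptions the sketch gives accept in the Won and Lost cases and reject in the uncertain case, which is the disjunction effect pattern described in the text.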

Search for noninstrumental information: the Hawaiian vacation

Imagine that you have just taken a tough qualifying exam. It is the end of the semester, you feel tired and run-down, and you are not sure that you passed the exam. In case you failed you have to take it again in a couple of months - after
the Christmas holidays. You now have an opportunity to buy a very attractive 5-day Christmas vacation package to Hawaii at an exceptionally low price. The special offer expires tomorrow, while the exam grade will not be available until the following day. Do you buy the vacation package? This question was presented by Tversky and Shafir (1992) to Stanford University undergraduate students. Notice that the outcome of the exam will be known long before the vacation begins. Thus, the uncertainty characterizes the present, disjunctive situation, not the eventual vacation. Additional, related versions were presented in which subjects were to assume that they had passed the exam, or that they had failed, before they had to decide about the vacation. We discovered that many subjects who would have bought the vacation to Hawaii if they were to pass the exam and if they were to fail, chose not to buy the vacation when the exam's outcome was not known. The data show that more than half of the students chose the vacation package when they knew that they passed the exam and an even larger percentage chose the vacation when they knew that they failed. However, when they did not know whether they had passed or failed, less than one-third of the students chose the vacation and the majority (61%) were willing to pay $5 to postpone the decision until the following day, when the results of the exam would be known.2 Note the similarity of this pattern to the foregoing gamble scenario: situated at a node whose outcome is uncertain, our students envision themselves at the status quo, as if no exam had been taken. This "suspension of judgement" - the reluctance to consider the possible branches (having either passed or failed the exam) - leads our subjects to behave in a manner that conflicts with their preferred option given either branch. 
The pattern observed in the context of this decision is partly attributed by Tversky and Shafir (1992) to the reasons that subjects summon for buying the vacation (see also Shafir, Simonson, & Tversky, 1993, for further discussion). Once the outcome of the exam is known, the student has good - albeit different - reasons for going to Hawaii: having passed the exam, the vacation can be seen as a reward following a successful semester; having failed the exam, the vacation becomes a consolation and time to recuperate before a re-examination. Not knowing the outcome of the exam, however, the student lacks a definite reason for going to Hawaii. The indeterminacy of reasons discourages many students from buying the vacation, even when both outcomes - passing or failing the exam - ultimately favor this course of action. Evidently, a disjunction of different reasons (reward in case of success; consolation in case of failure) can be less compelling than either definite reason alone.

2 Another group of subjects were presented with both Fail and Pass versions, and asked whether they would buy the vacation package in each case. Two-thirds of the subjects made the same choice in the two conditions, indicating that the data for the disjunctive version cannot be explained by the hypothesis that those who buy the vacation in case they pass the exam do not buy it in case they fail, and vice versa. While only one-third of the subjects made different decisions depending on the outcome of the exam, more than 60% of the subjects chose to wait when the outcome was not known.

A significant proportion of subjects were willing to pay, in effect, for information that was ultimately not going to alter their decision - they would choose to go to Hawaii in either case. Such willingness to pay for noninstrumental information is at variance with the classical model, in which the worth of information is determined by its potential to influence choice. People's reluctance to think through disjunctive situations, on the other hand, entails that noninstrumental information will sometimes be sought. (See Bastardi & Shafir, 1994, for additional studies of the search for noninstrumental information and its effects on choice.) While vaguely aware of the possible outcomes, people seem reluctant to fully entertain the consequences as long as the actual outcome is uncertain. When seemingly relevant information may become available, they often prefer to have the uncertainty resolved, rather than consider the consequences of each branch of the tree under the veil of uncertainty.

A greater tendency to consider the potential consequences may sometimes help unveil the noninstrumental nature of missing information. In fact, when subjects were first asked to contemplate what they would do in case they failed the exam and in case they passed, almost no subject who had expressed the same preference for both outcomes then chose to wait to find out which outcome obtained (Tversky & Shafir, 1992). The decision of many subjects in the disjunctive scenario above was not guided by a simple evaluation of the consequences (for, then, they would have realized that they prefer to go to Hawaii in either case). An adequate account of this behavior needs to contend with the fact that the very simple and compelling disjunctive logic of STP does not play a decisive role in subjects' reasoning.
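The classical notion of the worth of information mentioned above can be illustrated with a small calculation. This is only a sketch: the utilities and the equal pass/fail probabilities below are hypothetical numbers chosen for illustration (the study reports choices, not utilities), and the function name `value_of_information` is likewise an assumption of this example.

```python
def value_of_information(utilities, probs):
    """Expected-utility gain from learning the state before choosing.

    utilities[action][state] is the utility of an action in a state;
    probs[state] is the probability of that state.
    """
    actions = list(utilities)
    # Without the information: commit to the single best action on average.
    eu_without = max(
        sum(probs[s] * utilities[a][s] for s in probs) for a in actions
    )
    # With the information: pick the best action separately in each state.
    eu_with = sum(
        probs[s] * max(utilities[a][s] for a in actions) for s in probs
    )
    return eu_with - eu_without


probs = {"pass": 0.5, "fail": 0.5}  # hypothetical

# Hypothetical utilities: buying is preferred whether the exam is passed or failed.
same_choice = {
    "buy": {"pass": 10, "fail": 8},
    "skip": {"pass": 5, "fail": 5},
}

# Hypothetical contrast: the preferred action depends on the exam outcome.
different_choice = {
    "buy": {"pass": 10, "fail": 2},
    "skip": {"pass": 5, "fail": 5},
}

voi_same = value_of_information(same_choice, probs)
voi_different = value_of_information(different_choice, probs)
print("same choice in both states:", voi_same)
print("choice depends on state:  ", voi_different)
```

When the same action is best in every state, the computed value of information is zero, so on the classical account nothing should be paid to postpone the decision; paying to wait in that case is what the text calls a search for noninstrumental information.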
A behavioral pattern which systematically violates a simple normative rule requires both a positive and a negative account (see Kahneman and Tversky, 1982, for discussion). We need to understand not only the factors that produce a particular response, but also why the correct response is not made. Work on the conjunction fallacy (Shafir, Smith, & Osherson, 1990; Tversky and Kahneman, 1983), for example, has addressed both the fact that people's probability judgement relies on the representativeness heuristic - a positive account - and the fact that people do not perceive the extensional logic of the conjunction rule as decisive - a negative account. The present work focuses on the negative facet of nonconsequential reasoning and STP violations. It argues that, like other principles of reasoning and decision making, STP is very compelling when stated in a general and abstract form, but is often non-transparent, particularly because it applies to disjunctive situations. The following section briefly reviews studies of nonconsequential decision making in the context of games, and ensuing sections extend the analysis to other domains.

Games

Prisoner's dilemma


The theory of games explores the interaction between players acting according to specific rules. One kind of two-person game that has received much attention is the Prisoner's dilemma, or PD. (For an extensive treatment, see Rapoport & Chammah, 1965). A typical PD is presented in Fig. 2. The cell entries indicate the number of points each player receives contingent on the two players' choices. Thus, if both cooperate each receives 75 points but if, for example, the other cooperates and you compete, you receive 85 points while the other receives 25. What characterizes the PD is that no matter what the other does, each player fares better if he competes than if he cooperates; yet, if they both compete they do significantly less well than if they had both cooperated. Since each player is encountered at most once, there is no opportunity for conveying strategic messages, inducing reciprocity, or otherwise influencing the other player's choice of strategy. A player in a PD faces a disjunctive situation. The other chooses one of two strategies, either to compete or to cooperate. Not knowing the other's choice, the first player must decide on his own strategy. Whereas each player does better competing, their mutually preferred outcome results from mutual cooperation rather than competition. A player, therefore, experiences conflicting motivations. Regardless of what the other does, he is better off being selfish and competing; but assuming that the other acts very much like himself, they are better off both making the ethical decision to cooperate rather than the selfish choice to compete. How might this disjunctive situation influence people's choice of strategy?

                   OTHER cooperates       OTHER competes
YOU cooperate      You: 75, Other: 75     You: 25, Other: 85
YOU compete        You: 85, Other: 25     You: 30, Other: 30

Fig. 2. A typical prisoner's dilemma. The cell entries indicate the number of points that you and the other player receive contingent on your choices.
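The two properties that the text uses to characterize the PD can be verified directly from the payoffs of Fig. 2. The following is a minimal sketch; the dictionary layout and the helper name `dominates` are choices made for this illustration.

```python
# Points from Fig. 2, keyed by (your move, other's move) -> (your points, other's points).
payoffs = {
    ("cooperate", "cooperate"): (75, 75),
    ("cooperate", "compete"): (25, 85),
    ("compete", "cooperate"): (85, 25),
    ("compete", "compete"): (30, 30),
}
moves = ("cooperate", "compete")


def dominates(a, b):
    """True if move a earns you more than move b whatever the other player does."""
    return all(payoffs[(a, other)][0] > payoffs[(b, other)][0] for other in moves)


# Property 1: competing is better for you no matter what the other does.
compete_dominates = dominates("compete", "cooperate")

# Property 2: both players do worse under mutual competition than under mutual cooperation.
mutual_cooperation = payoffs[("cooperate", "cooperate")][0]
mutual_competition = payoffs[("compete", "compete")][0]
cooperation_pays = mutual_cooperation > mutual_competition

print("compete dominates cooperate:", compete_dominates)
print("mutual cooperation beats mutual competition:", cooperation_pays)
```

Because competing is the better reply to each of the other player's possible moves, the sure-thing principle applies: a player who reasons through both branches has no need to know which branch obtains.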



Shafir and Tversky (1992) have documented disjunction effects in one-shot PD games played for real payoffs. Subjects (N = 80) played a series of PD games (as in Fig. 2) on a computer, each against a different unknown opponent supposedly selected at random from among the participants. Subjects were told that they had been randomly assigned to a "bonus group", and that occasionally they would be given information about the other player's already-chosen strategy before they had to choose their own. This information appeared on the screen next to the game, and subjects were free to take it into account in making their decision. (For details and the full instructions given to subjects, see Shafir & Tversky, 1992.) The rate of cooperation in this setting was 3% when subjects knew that the opponent had defected, and 16% when they knew that the opponent had cooperated. Now what should subjects do when the opponent's decision is not known? Since 3% cooperate when the other competes and 16% cooperate when the other cooperates, one would expect an intermediate rate of cooperation when the other's strategy is not known. Instead, when subjects did not know whether their opponent had cooperated or defected (as is normally the case in this game), the rate of cooperation rose to 37%. In violation of STP, a quarter of the subjects defected when they knew their opponent's choice - be it cooperation or defection - but cooperated when their opponent's choice was not known. Note the recurring pattern: situated at a disjunctive node whose outcome is uncertain, these subjects envision themselves at the status quo, as if, for the moment, the uncertain strategy selected by the opponent has no clear consequences.
These players seem to confound their epistemic uncertainty - what they may or may not know about the other's choice of strategy - with uncertainty about the actual consequences - the fact that the other is bound to be a cooperator or a defector, and that they, in turn, are bound to respond by defecting in either case. (For further analysis and a positive account of what may be driving subjects' tendency to cooperate under uncertainty, see Shafir & Tversky, 1992.)
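The expectation of an intermediate cooperation rate can be made explicit. The sketch below assumes, as a simplification that is not part of the original study, that a consequentialist population cooperates at the known-outcome rates (16% and 3%) within each branch, weighted by some probability p that the opponent cooperated. The function name `predicted_rate` is illustrative.

```python
def predicted_rate(p_other_cooperates, rate_if_coop=0.16, rate_if_defect=0.03):
    """Cooperation rate expected under uncertainty if choices depended only
    on the opponent's actual move (a simplifying assumption of this sketch)."""
    return (
        p_other_cooperates * rate_if_coop
        + (1 - p_other_cooperates) * rate_if_defect
    )


# Sweep the assumed probability that the opponent cooperated from 0 to 1.
rates = [predicted_rate(p / 100) for p in range(101)]
lowest, highest = min(rates), max(rates)

observed = 0.37  # cooperation rate reported when the opponent's move was unknown
outside_range = not (lowest <= observed <= highest)

print(f"predicted range: {lowest:.2f} to {highest:.2f}")
print(f"observed rate {observed:.2f} outside range: {outside_range}")
```

Whatever value of p is assumed, the predicted rate stays between 3% and 16%, so the observed 37% cannot be produced by any mixture of the two known-outcome rates.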

Newcomb's problem and quasi-magical thinking

Upon completing the PD game described in the previous section, subjects (N = 40) were presented, on a computer screen, with the following scenario based on the celebrated Newcomb's problem (for more on Newcomb's problem, see Nozick, 1969; see Shafir & Tversky, 1992, for further detail and discussion of the experiment).
You now have one more chance to collect additional points. A program developed recently at MIT was applied during this entire session to analyze the pattern of your preferences. Based on that analysis, the program has predicted your preference in this final problem.

[Figure: Box A, marked "20 points", and Box B.]

Consider the two boxes above. Box A contains 20 points for sure. Box B may or may not contain 250 points. Your options are to:
(1) Choose both boxes (and collect the points that are in both).
(2) Choose Box B only (and collect only the points that are in Box B).
If the program predicted, based on observation of your previous preferences, that you will take both boxes, then it left Box B empty. On the other hand, if it predicted that you will take only Box B, then it put 250 points in that box. (So far, the program has been remarkably successful: 92% of the participants who chose only Box B found 250 points in it, as opposed to 17% of those who chose both boxes.) To ensure that the program does not alter its guess after you have indicated your preference, please indicate to the person in charge whether you prefer both boxes or Box B only. After you indicate your preference, press any key to discover the allocation of points.

According to one rationale that arises in the context of this decision, if the person chooses both boxes, then the program, which is remarkably good at predicting preferences, is likely to have predicted this and will not have put the 250 points in the opaque box. Thus, the person will get only 20 points. If, on the other hand, the person takes only the opaque box, the program is likely to have predicted this and will have put the 250 points in that box, and so the person will get 250 points. A subject may thus be tempted to reason that if he takes both boxes he is likely to get only 20 points, but that if he takes just the opaque box he is likely to get 250 points. There is a compelling motivation to choose just the opaque box, and thereby resemble those who typically find 250 points in it. There is, of course, another rationale: the program has already made its prediction and has already either put the 250 points in the opaque box or has not. If it has already put the 250 points in the opaque box, and the person takes both boxes, he gets 250 + 20 points, whereas if he takes only the opaque box, he gets only 250 points. If the program has not put the 250 points in the opaque box and the person takes both boxes, he gets 20 points, whereas if he takes only the opaque box he gets nothing. Therefore, whether the 250 points are there or not, the person gets 20 points more by taking both boxes rather than the opaque box only. The second rationale relies on consequentialist reasoning reminiscent of STP (namely, whatever the state of the boxes following the program's prediction, I will do better choosing both boxes rather than one only). The first rationale, on the other hand, while couched in terms of expected value, is partially based on the assumption that what the program will have predicted - although it has predicted this already - depends somehow on what the subject ultimately decides to do.
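The two rationales can be set side by side in a short calculation. This is a sketch under stated assumptions: the first rationale is rendered here as an expected value that treats the success rates quoted in the scenario (92% and 17%) as the chances of finding Box B full given one's own choice, which is one possible reading and not a formula given in the original paper. The names `payoff`, `extra_from_both`, `ev_b_only` and `ev_both` are illustrative.

```python
BOX_A = 20    # points in Box A, always available
BOX_B = 250   # points in Box B, if the program put them there


def payoff(choice, b_is_full):
    """Points collected for a choice ('both' or 'B only') given Box B's state."""
    in_b = BOX_B if b_is_full else 0
    return in_b + BOX_A if choice == "both" else in_b


# Second rationale (dominance): compare the choices within each state of Box B.
extra_from_both = {
    b_is_full: payoff("both", b_is_full) - payoff("B only", b_is_full)
    for b_is_full in (True, False)
}

# First rationale (assumed rendering): weight the states by the success rates
# quoted in the scenario, conditional on one's own choice.
p_full_given_b_only = 0.92
p_full_given_both = 0.17
ev_b_only = (
    p_full_given_b_only * payoff("B only", True)
    + (1 - p_full_given_b_only) * payoff("B only", False)
)
ev_both = (
    p_full_given_both * payoff("both", True)
    + (1 - p_full_given_both) * payoff("both", False)
)

print("extra points from taking both, by state of Box B:", extra_from_both)
print(f"choice-conditional expectation: B only = {ev_b_only:.1f}, both = {ev_both:.1f}")
```

Within each state, taking both boxes yields exactly 20 more points, while the choice-conditional expectation favors Box B only; the conflict between these two calculations is the conflict between the two rationales described in the text.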
The results we obtained were as follows: 35% of the subjects chose both boxes, while 65% preferred to take Box B only. This proportion of choices is similar to that observed in other surveys concerning the original Newcomb's problem (see, for example, Gardner, 1973, 1974; Hofstadter, 1983). What can be said about the majority who prefer to take just one box? Clearly, had they known for certain that there were 250 points in the opaque box (and could see 20 in the other), they would have taken both rather than just one. And certainly, if they knew that the 250 points were not in that box, they would have taken both rather than just the one that's empty. These subjects, in other words, would have taken both boxes had they known that Box B is either full or empty, but a majority preferred to take only Box B when its contents were not known. The conflicting intuitions that subjects experience in the disjunctive situation - when the program's prediction is not known - are obviously resolved in favor of both boxes once the program's decision has been announced: at that point, no matter what the program has predicted, taking both boxes brings more points. Subjects, therefore, should choose both boxes also when the program's decision is uncertain. Instead, many subjects fail to be moved by the foreseeable consequences of the program's predictions, and succumb to the strong motivation to choose just the opaque box and thereby resemble those who typically find 250 points in it.3 As Gibbard and Harper (1978) suggest in an attempt to explain people's choice of a single box, "a person may . . . want to bring about an indication of a desired state of the world, even if it is known that the act that brings about the indication in no way brings about the desired state itself." This form of magical thinking was demonstrated by Quattrone and Tversky (1984), whose subjects selected actions that were diagnostic of favorable outcomes even though the actions could not cause those outcomes. Note that such instances of magical thinking typically occur in disjunctive situations, before the exact outcome is known.
Once they are aware of the outcome, few people think they can reverse it by choosing an action that is diagnostic of an alternative event. Shafir and Tversky (1992) discuss various manifestations of "quasi-magical" thinking, related to phenomena of self-deception and illusory control. These include people's tendency to place larger bets before rather than after a coin has been tossed (Rothbart & Snyder, 1970; Strickland, Lewicki, & Katz, 1966), or to throw dice softly for low numbers and harder for high ones (Henslin, 1967). Similarly, Quattrone and Tversky (1984) note that Calvinists act as if their behavior will determine whether they will go to heaven or to hell, despite their belief in divine pre-determination, which entails that their fate has been determined at birth. The presence of uncertainty, it appears, is a major contributor to quasi-magical thinking; few people act as if they can undo an already certain event, but while facing a disjunction of events, people often behave as if they can exert some control over the outcome. Thus, many people who are eager to vote while the outcome is pending may no longer wish to do so once the outcome of the elections has been determined. In this vein, it is possible that Calvinists would do fewer good deeds if they knew that they had already been assigned to heaven, or to hell, than while their fate remains a mystery. Along similar lines, Jahoda (1969) discusses the close relationship between uncertainty and superstitious behavior, which is typically exhibited in the context of uncertain outcomes rather than in an attempt to alter events whose outcome is already known.

As illustrated by the studies above, people often are reluctant to consider the possible outcomes of disjunctive situations, and instead suspend judgement and envision themselves at the uncertain node. Interestingly, it appears that decision under uncertainty is only one of numerous domains in which subjects exhibit a reluctance to think through disjunctive situations. The difficulties inherent to thinking through uncertainty and, in particular, people's reluctance to think through disjunctions manifest themselves in other reasoning and problem-solving domains, some of which are considered below.

3 The fact that subjects do not see through this disjunctive scenario seems indisputable. It is less clear, however, what conditions would serve to make the situation more transparent, and to what extent. Imagine, for example, that subjects were given a sealed copy of the program's decision to take home with them, and asked to inspect it that evening, after having made their choice. It seems likely that an emphasis on the fact that the program's decision has long been made would reduce the tendency to choose a single box.
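The two-box choice discussed above can be checked by going through both states of the opaque box in turn. The following Python sketch is only an illustration: the function names and the string labels "both" and "one" are assumptions introduced here, while the point values (250 in the opaque box, 20 in the visible one) are those given in the text.

```python
VISIBLE_POINTS = 20   # points that can be seen in the transparent box
OPAQUE_POINTS = 250   # points in the opaque box, if the program put them there


def payoff(choice, opaque_full):
    """Points earned for choosing 'both' boxes or only the opaque 'one'."""
    opaque = OPAQUE_POINTS if opaque_full else 0
    visible = VISIBLE_POINTS if choice == "both" else 0
    return opaque + visible


def dominates(choice_a, choice_b):
    """True if choice_a pays more than choice_b in every state of the opaque box."""
    return all(
        payoff(choice_a, state) > payoff(choice_b, state)
        for state in (True, False)
    )


if __name__ == "__main__":
    for state in (True, False):
        print(state, payoff("both", state), payoff("one", state))
    print(dominates("both", "one"))
```

Whichever state is assumed, taking both boxes pays 20 points more, which is the branch-by-branch reasoning that subjects appear not to carry out when the program's prediction is unknown.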

Probabilistic judgement

Researchers into human intuitive judgement as well as teachers of statistics have commented on people's difficulties in judging the probabilities of disjunctive events (see, for example, Bar-Hillel, 1973; Carlson & Yates, 1989; Tversky & Kahneman, 1974). While some disjunctive predictions may in fact be quite complicated, others are simple, assuming that one sees through their disjunctive character. Consider, for example, the following "guessing game" which consisted of two black boxes presented to Princeton undergraduates (N = 40) on a computer screen, along with the following instructions.
Under the black cover, each of the boxes above is equally likely to be either white, blue, or purple. You are now offered to play one of the following two games of chance:
Game 1: You guess the color of the left-hand box. You win 50 points if you were right, and nothing if you were wrong.
Game 2: You choose to uncover both boxes. You win 50 points if they are the same color, and nothing if they are different colors.
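The two games can be compared by enumerating the equally likely colorings of the boxes. The sketch below is a hypothetical illustration in Python (the function names are not from the study, and the two boxes are assumed to be colored independently); it computes the winning chances of each game, and the chance of winning Game 2 conditional on each possible color of the first box.

```python
from fractions import Fraction
from itertools import product

COLORS = ("white", "blue", "purple")  # each box is equally likely to be any of these


def p_win_game1(guess="white"):
    """Game 1: win if the guessed color matches the left-hand box."""
    wins = sum(1 for left in COLORS if left == guess)
    return Fraction(wins, len(COLORS))


def p_win_game2():
    """Game 2: win if both boxes turn out to be the same color."""
    outcomes = list(product(COLORS, repeat=2))
    wins = sum(1 for left, right in outcomes if left == right)
    return Fraction(wins, len(outcomes))


def p_win_game2_given(first):
    """Game 2, conditional on the color of the first box."""
    wins = sum(1 for second in COLORS if second == first)
    return Fraction(wins, len(COLORS))


if __name__ == "__main__":
    print(p_win_game1(), p_win_game2())
    print({color: p_win_game2_given(color) for color in COLORS})
```

The conditional chances are identical for every color of the first box, so the unconditional chance of winning Game 2 equals that of Game 1.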

The chances of winning in Game 1 are 1/3; the chances of winning in Game 2 are also 1/3. To see that, one need only realize that the first box is bound to be either white, blue, or purple and that, in each case, the chances that the other will be the same color are 1/3. Notice that this reasoning incorporates the disjunctive logic of STP. One enumerates the possible outcomes of the first box, considers the chances of winning conditional on each outcome, and realizes that the chances are the same no matter what the first outcome was. Subjects, therefore, are expected to find the two games roughly equally attractive, provided that they see through the disjunctive nature of Game 2. This disjunctive rationale, however, seems not to have been entirely transparent to our subjects, 70% of whom indicated a preference for Game 1 (significantly different from chance, Z = 2.53, p < .05). These subjects may have suspected an equal chance for both games, but a certain lack of clarity about the disjunctive case may have led them to prefer the unambiguous first game.

Whereas this preference could also be attributed to subjects' beliefs about the computer set-up, the next version not only ensured the perceived independence of outcomes, but also emphasized the game's sequential character which, it was thought, may make its disjunctive nature more transparent. One hundred and three Stanford undergraduates listed their highest buying prices for the gambles below:

The following games of chance are played with a regular die that has two yellow sides, two green sides, and two red sides:
Game A: You roll the die once. You win $40 if it falls on green, and nothing otherwise. What is the largest amount of money that you would be willing to pay to participate in this game?
Game B: You roll the die twice. You win $40 if it falls on the same color both times (e.g., both red, or both green) and nothing otherwise. What is the largest amount of money that you would be willing to pay to participate in this game?

The probability of winning in Game A is 1/3. The probability of winning in Game B is also 1/3. To see this, one need only realize that for every outcome of the first toss, the probability of winning on the second toss is always 1/3. Forty-six percent of our subjects, however, did not list the same buying price for the two games. Eighty-five percent of these subjects offered a higher price for Game A than for Game B. In fact, over all subjects, Game A was valued at an average of $6.09, while Game B was worth an average of only $4.69 (t = 4.65, p < .001). We have investigated numerous scenarios of this kind, in all of which a large proportion of subjects prefer to gamble on a simple event over an equally likely or more likely disjunctive event.

Inductive inference

Inferential situations often involve uncertainty not only about the conclusion, but about the premises as well. The judged guilt of a defendant depends on the veracity of the witnesses, the diagnosis of a patient depends on the reliability of the tests, and the truth of a scientific hypothesis depends on the precision of
earlier observations. Reasoning from uncertain premises can be thought of as reasoning through disjunctions. How likely is the defendant to be guilty if the witness is telling the truth and how likely if the witness is lying? What is the likelihood that the patient has the disease given that the test results are right, and what is the likelihood if the results are false? The aggregation of uncertainty is the topic of various theoretical proposals (see, for example, Shafer & Pearl, 1990), all of which agree on a general principle implied by the probability calculus. According to this principle, if I believe that event A is more probable than event B in light of some condition c, and if I also believe that event A is more probable than event B given the absence of c, then I believe that A is more probable than B regardless of whether c obtains or not. Similar to the failure of STP in the context of choice, however, this principle may not always describe people's actual judgements. In the following pilot study, 182 Stanford undergraduates were presented with the divorce scenario below, along with one of the three questions that follow:

Divorce problem
Tim and Julia, both school teachers, have been married for 12 years. They have a 10-year old son, Eric, to whom they are very attached. During the last few years Tim and Julia have had recurring marital problems. They have consulted marriage counselors and have separated once for a couple of months, but decided to try again. Their marriage is presently at a new low.

Disjunctive question (N = 88): What do you think are the chances that both Tim and Julia will agree to a divorce settlement (that specifies whether Eric is to stay with his father or with his mother)? [59.8%]
Father question (N = 48): What do you think are the chances that both Tim and Julia will agree to a divorce settlement if Eric is to stay with his father? [40.8%]
Mother question (N = 46): What do you think are the chances that both Tim and Julia will agree to a divorce settlement if Eric is to stay with his mother? [49.7%]

Next to each question is its mean probability rating. Subjects judged the probability that the parents will agree to a divorce settlement that specified that the child is to stay with his father to be less than 50%, and similarly if it specified that the child is to stay with his mother. However, they thought that there was a higher - almost 60% - chance that the parents would agree to divorce in the disjunctive case, when there was uncertainty about whether the settlement would specify that the child is to stay with the father or with the mother (z = 4.54 and 2.52 for the father and for the mother, respectively; p < .05 in both cases). Because the above effect is small, and there are potential ambiguities in the interpretation of the problem, more exploration of this kind of judgement is
required. It does appear, however, that people's reasoning through this disjunctive situation may be nonconsequential. In effect, the pattern above may capture a "disjunction effect" in judgement similar to that previously observed in choice. Either disjunct - each branch of the tree - leads to attribute a probability of less than one-half, but when facing the disjunction people estimate a probability greater than one-half.

Of course, people do not always refrain from considering the potential implications of disjunctive inferences. The pilot data above illustrate one kind of situation that may yield such effects due to the way uncertainty renders certain considerations less compelling. Disjunction effects in judgement are likely to arise in contexts similar to those which characterize these effects in choice. While either disjunct presents a clear scenario with compelling reasons for increasing or decreasing one's probability estimate, a disjunctive situation can be less compelling. In the above divorce scenario, people see a clear reason for lowering the probability estimate of a settlement once they know that the child is to stay with his father, namely, the mother is likely to object. Similarly, if the child is to stay with his mother, the father will object. But what about when the fate of the child is not known? Rather than consider the potential objections of each parent, subjects evaluate the situation from a disjunctive perspective, wherein neither parent has reasons to object. From this perspective the couple seems ready for divorce. As in choice, instead of contemplating the consequences of traversing each of the branches, people tend to remain nonconsequential at the uncertain node. Thus, people tend to suspend judgement in disjunctive situations, even if every disjunct would eventually affect their perceived likelihood in similar ways.

More generally, such patterns can emerge from a tendency towards "concrete thinking" (Slovic, 1972) wherein people rely heavily on information that is explicitly available, at the expense of other information which remains implicit. Numerous studies have shown that people often do not decompose categories into their relevant subcategories. For example, having been told that "robins have an ulnar artery", subjects rate it more likely that all birds have an ulnar artery than that ostriches have it (Osherson, Smith, Wilkie, Lopez, & Shafir, 1990; see also Shafir, Smith, & Osherson, 1990). A precondition for such judgement is the failure to take account of the fact that the category birds consists of subcategories, like robins, sparrows, and ostriches. Along similar lines, most subjects estimate the frequency, on a typical page, of seven-letter words that end in ing (_ _ _ _ ing) to be greater than the frequency of seven-letter words that have the letter n in the sixth position (_ _ _ _ _ n _) (Tversky & Kahneman, 1983). When making these estimates, subjects focus on the particular category under consideration: because instances of the former category are more easily available than instances of the latter, subjects erroneously conclude that they must be more frequent. Evidently, subjects do not decompose the latter category into its constituent subcategories (i.e., seven-letter words that end in ing, seven-letter words that end in ent, seven-letter words that end in ine, etc.).
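The relation between the two word categories can be made explicit: every seven-letter word that ends in ing has n as its sixth letter, so the first category is a subcategory of the second and cannot be the more frequent. The Python sketch below illustrates this on a small hand-picked word list; the list and the function names are assumptions made purely for illustration and are not data from the studies cited.

```python
def ends_in_ing(word):
    """Seven-letter word ending in 'ing'."""
    return len(word) == 7 and word.endswith("ing")


def n_in_sixth_position(word):
    """Seven-letter word whose sixth letter is 'n'."""
    return len(word) == 7 and word[5] == "n"


# Hypothetical sample, chosen only to illustrate the subcategories.
SAMPLE = ["running", "nothing", "payment", "present",
          "machine", "weekend", "kitchen", "address"]

ing_words = [w for w in SAMPLE if ends_in_ing(w)]
n_words = [w for w in SAMPLE if n_in_sixth_position(w)]

if __name__ == "__main__":
    # Every 'ing' word also belongs to the broader 'n in sixth position' category.
    print(all(n_in_sixth_position(w) for w in ing_words))
    print(len(ing_words), len(n_words))
```

Because the first category is contained in the second, its count can never exceed the count of the second, whatever text is sampled.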

Various manifestations of the tendency for considerations that are out of sight to be out of mind have been documented by Fischhoff, Slovic, and Lichtenstein (1978) who, for example, asked car mechanics to assess the probabilities of different causes of a car's failure to start. The mean probability assigned to the hypothesis "the cause of failure is something other than the battery, the fuel system, or the engine" doubled when the unspecified disjunctive alternative was broken up into some of its specific disjuncts (e.g., the starting system, the ignition system, etc.). Along similar lines, Johnson, Hershey, Meszaros, and Kunreuther (1993) found that subjects were willing to pay more when offered health insurance that covers hospitalization for "any disease or accident" than when offered health insurance that covers hospitalization "for any reason". Evidently, subjects do not perceive the latter, implicit disjunction as encompassing the various disjuncts explicitly mentioned in the former. For an extensive treatment of the relationship between explicit and implicit disjunctions in probability judgement, see Tversky and Koehler (1993).

Inferential disjunction effects may also occur in situations in which different rationales apply to the various disjuncts. Under uncertainty, people may be reluctant to contemplate the consequences, even if they would eventually affect judgement in similar ways. Shafir and Tversky (1992) have suggested that the financial markets' behavior during the 1988 US Presidential election had all the makings of a disjunction effect. In the weeks preceding the election, US financial markets remained relatively inactive and stable, "because of caution before the Presidential election" (The New York Times, November 5, 1988). "Investors were reluctant to make major moves early in a week full of economic uncertainty and seven days away from the Presidential election" (The Wall Street Journal, November 2, 1988). Immediately following the election, a clear outlook emerged. The dollar plunged sharply to its lowest level in 10 months, stock and bond prices declined, and the Dow Jones industrial average fell a total of almost 78 points over the ensuing week. The dollar's decline, wrote the WSJ, "reflected continued worry about the US trade and budget deficits"; "economic reality has set back in" (WSJ, November 10), explained the analysts. The financial markets, observed the NYT, "had generally favored the election of Mr. Bush and had expected his victory, but in the three days since the election they have registered their concern about where he goes from here". Of course, the financial markets were likely to have registered at least as much concern had Mr. Dukakis been elected. Most traders agree that the stock market would have dropped significantly had Dukakis staged a come-from-behind victory. "When I walked in and looked at the screen", explained one trader after the election, "I thought Dukakis had won" (NYT, November 10). After days of inactivity preceding the election, the market declined immediately following Bush's victory, and would have declined at least as much had Dukakis been the victor (these would unlikely be due to disjoint sets of actors). Of course, a thorough analysis of the financial markets' behavior reveals
numerous complications but, at least on the surface, this incident has all the makings of a disjunction effect: the markets would decline if Bush was elected, they would decline if Dukakis was elected, but they resisted any change until after the elections. Being at the node of such a momentous disjunction seems to have stopped Wall Street from addressing the expected consequences. "Considering how Wall Street had rooted for Bush's election", said the NYT (November 11), "its reaction to his victory was hardly celebratory. Stocks fell, bonds fell and the dollar dropped. It makes one think of the woman in the New Yorker cartoon discussing a friend's failing marriage: 'She got what she wanted, but it wasn't what she expected'." Indeed, it is in the nature of nonconsequential thinking to encounter events that were bound to be, but were not expected.

Deductive inference

The Wason selection task

One of the most investigated tasks in research into human reasoning has been the selection task, first described by Wason (1966). In a typical version of the task, subjects are presented with four cards, each of which has a letter on one side and a number on the other. Only one side of each card is displayed. For example:

[Figure: four cards, each with one side displayed; the discussion below refers to the cards showing E, 4, and 7]

Subjects' task is to indicate those cards, and only those cards, that must be turned over to test the following rule: "If there is a vowel on one side of the card, then there is an even number on the other side of the card." The simple structure of the task is deceptive - the great majority of subjects fail to solve it. Most select the E card or the E and the 4 cards, whereas the correct choices are the E and the 7 cards. (The success rate of initial choices in dozens of studies employing the basic form of the selection task typically ranges between 0 and a little over 20%; for reviews, see Evans, 1989, and Gilhooly, 1988.) The difficulty of the Wason selection task is perplexing. Numerous variations of the task have been documented, and they generally agree that people have no trouble evaluating the relevance of the items that are hidden on the other side of each card. Wason and Johnson-Laird (1970, 1972; see also Wason, 1969) explicitly address the discrepancy between subjects' ability to evaluate the relevance of potential outcomes (i.e., to understand the truth conditions of the rule), and their inappropriate selection of the relevant cards. (Oakhill & Johnson-Laird, 1985, report related findings regarding subjects' selection of counterexamples when testing generalizations.) While the problem is logically quite simple, they conclude, "clearly it is the attempt to solve it which makes it difficult" (Wason & Johnson-Laird, 1972, p. 174). Thus, subjects understand that neither a vowel nor a consonant on the other
side of the 4 card contributes to the possible falsification of the rule, yet they choose to turn the 4 card when its other side is not known. Similarly, subjects understand that a consonant on the other side of the 7 card would not falsify the rule but that a vowel would falsify it, nevertheless they neglect to turn the 7 card in the disjunctive situation. Subjects are easily able to evaluate the logical consequences of potential outcomes in isolation, but they seem to act in ways that ignore these consequences when facing a disjunction. As Evans (1984, p. 458) has noted, "this strongly confirms the view that card selections are not based upon any analysis of the consequences of turning the cards". Subjects confronted with the above four-card problem fail to consider the logical consequences of turning each card, and instead remain, judgement suspended, at the disjunctive node, the cards' hidden sides not having been adequately evaluated.

What exactly subjects do when performing the selection task remains outside the purview of the present paper, especially considering the numerous studies that have addressed this question. In general, a pattern of content effects has been observed in a number of variations on the task (see, for example, Griggs & Cox, 1982; Johnson-Laird, Legrenzi, & Legrenzi, 1972; and Wason, 1983, for a review; although see also Manktelow & Evans, 1979). To explain the various effects, researchers have suggested verification biases (Johnson-Laird & Wason, 1970), matching biases (Evans, 1984; Evans & Lynch, 1973), selective focusing (Legrenzi, Girotto, & Johnson-Laird, 1993), memories of domain-specific experiences (Griggs & Cox, 1982; Manktelow & Evans, 1979), pragmatic reasoning schemas (Cheng & Holyoak, 1985), as well as an innate propensity to look out for cheaters (Cosmides, 1989). What these explanations have in common is an account of performance on the selection task that does not involve disjunctive reasoning per se. Instead, people are assumed to focus on items that have been explicitly mentioned, to apply prestored knowledge structures, or to remember relevant past experiences. It is likely that such content effects facilitate performance on the selection task by rendering it more natural for subjects to contemplate the possible outcomes, which tend to describe familiar situations. While most people find it trivially easy to reason logically about each isolated disjunct, the disjunction leads them to withhold such reasoning, at least when the content is not familiar.

The THOG problem

Another widely investigated reasoning problem whose disjunctive logic makes it difficult for most people to solve is the THOG problem (Wason & Brooks, 1979). The problem presents four designs: a black triangle, a white triangle, a black circle, and a white circle. Subjects are given an exclusive disjunction rule. They are told that the experimenter has chosen one of the shapes (triangle or
circle) and one of the colors (black or white), and that any design is a THOG if, and only if, it has either the chosen shape or color, but not both. Told that the black triangle is a THOG, subjects are asked to classify each of the remaining designs. The correct solution is that the white circle is a THOG and that the white triangle and the black circle are not. This is because the shape and color chosen by the experimenter can only be either a circle and black, or a triangle and white. In both cases the same conclusion follows: the black circle and white triangle are not THOGs and the white circle is. The majority of subjects, however, fail to follow this disjunctive logic and the most popular answer is the mirror image of the correct response. Reminiscent of the selection task, subjects appear to have no difficulty evaluating what is and what is not a THOG once they are told the particular shape and color chosen by the experimenter (Wason & Brooks, 1979). It is when they face a disjunction of possible choices that subjects appear not to work through the consequences. Uncertain about the correct shape and color, subjects fail to consider the consequences of the two options and reach a conclusion that contradicts their preferred solution given either alternative. Smyth and Clark (1986) and Girotto and Legrenzi (1993) also address the relationship between failure on the THOG problem and nonconsequential reasoning through disjunctions.

Double disjunctions

Further evidence of subjects' reluctance to think through inferential disjunctions comes from a recent study of propositional reasoning conducted by Johnson-Laird, Byrne, and Schaeken (1992; see also Johnson-Laird & Byrne, 1991, Chapter 3). These investigators presented subjects with various premises and asked them to write down what conclusion, if any, followed from those premises. They concluded that reasoning from conditional premises was easier for all subjects than reasoning from disjunctive premises. In one study subjects were presented with "double disjunctions" - two disjunctive premises such as the following:

June is in Wales or Charles is in Scotland, but not both.
Charles is in Scotland or Kate is in Ireland, but not both.

To see what follows from this double disjunction, one simply needs to assume, in turn, the separate disjuncts. If we assume that June is in Wales, then it is not the case that Charles is in Scotland and, therefore, we know that Kate is in Ireland. Similarly, if we assume that Charles is in Scotland, then it is not the case that June is in Wales or that Kate is in Ireland. It therefore follows from this double disjunction that either Charles is in Scotland or June is in Wales and Kate is in
Ireland. It is clear, once separate disjuncts are entertained, that certain conclusions follow. Yet, nearly a quarter of Johnson-Laird et al.'s subjects (ages 18-59, all working at their own pace) concluded that nothing follows, and many others erred in their reasoning, to yield a total of 21% valid conclusions. (Other kinds of disjunctions - negative and inclusive - fared worse, yielding an average of 5% valid conclusions.) As in the previous studies, if subjects are provided with relevant facts they have no trouble arriving at valid conclusions. Thus, once subjects are told that, say, June is in Wales, they have no trouble concluding that Kate is in Ireland. Similarly, they reach a valid conclusion if told that Charles is in Scotland. But when facing the disjunctive proposition, people seem to confound their epistemic uncertainty, what they may or may not know, with uncertainty about the actual consequences, the fact that one or another of the disjuncts must obtain. Presented with a disjunction of simple alternatives most subjects refrain from assuming the respective disjuncts and arrive at no valid conclusions.4

4 Johnson-Laird, Byrne, and Schaeken (1992) investigate these disjunctions in the context of their theory of propositional reasoning. In fact, a number of psychological theories of propositional reasoning have been advanced in recent years (e.g., Braine, Reiser, & Rumain, 1984; Osherson, 1974-6; Rips, 1983), and the relationship between reasoning about disjunctive propositions and reasoning through disjunctive situations merits further investigation. One issue that arises out of the aforementioned studies is worth mentioning. Rips (1983), in his theory of propositional reasoning which he calls ANDS, finds reason to assume certain "backward" deduction rules that are triggered only in the presence of subgoals. This leads Braine et al. to make the following observation:

The conditionality of inferences on subgoals places ANDS on a very short leash that has counterintuitive consequences. For example, consider the following premises:
There is an F or an R
If there is an F then there is an L
If there is an R then there is an L
It seems intuitively obvious that there has to be an L. If ANDS is given the conclusion There is an L, then ANDS makes the deduction. But if the conclusion given is anything else (e.g., There is an X, or There is not an L), ANDS will not notice that there has to be an L. . . . (1984, pp. 357-8)

The explicit availability of the premises in the example above may distinguish it from a standard disjunction effect, wherein the specific disjuncts are not explicitly considered. Apart from that, the phenomenon - that it should be "intuitively obvious" that there has to be an L, but may "not be noticed" - seems a good simulation of the disjunction effect.

Puzzles and paradoxes

The impossible barber

Many well-known puzzles and semantic paradoxes have an essentially disjunctive character. Consider, for example, that famous, clean-shaven, small-village
barber who shaves all and only the village men who do not shave themselves. The description of this barber seems perfectly legitimate - one almost feels like one may know the man. Behind this description, however, lurks an important disjunction: either this barber shaves himself or he does not (which, incidentally, still seems perfectly innocent). But, of course, once we contemplate the disjuncts we realize the problem: if the barber shaves himself, then he violates the stipulation that he only shaves those who do not shave themselves. And if he does not shave himself, then he violates the stipulation that he shaves all those who don't. The impossible barber is closely related to another of Bertrand Russell's paradoxes, namely, the set paradox. The set paradox, which had a profound influence on modern mathematical thinking, concerns the set of all sets that do not contain themselves as members. (Is this set a member of itself?) The logical solution to these paradoxes is beyond the scope of the present paper (see Russell, 1943), but their "paradoxical" nature is instructive. One definition of "paradox" is "a statement that appears true but which, in fact, involves a contradiction" (Falletta, 1983). What characterizes the paradoxes above is a logical impossibility that goes undetected partly due to their underlying disjunctive nature. Unless we delve into the appropriate disjuncts (which are themselves often not trivial to identify) and contemplate their logical consequences, these impossible disjunctions appear innocuous.

Knights and knaves

Many puzzles also rely on the surprising complexity or lack of clarity that arise in simple disjunctive situations. A class of such puzzles concerns an island in which certain inhabitants called "knights" always tell the truth, and others called "knaves" always lie. Smullyan (1978) presents a variety of knight-knave puzzles, and Rips (1989) investigates the psychology of reasoning about them. Consider, for example, the following puzzle (which the reader is invited to solve before reading further):

There are three inhabitants, A, B, and C, each of whom is a knight or a knave. Two people are said to be of the same type if they are both knights or both knaves. A and B make the following statements:
A: B is a knave.
B: A and C are of the same type.
What is C? (Smullyan, 1978, p. 22, reprinted in Rips, 1989)

We know that A must be either a knight or a knave. If A is a knight, then his statement about B must be true, so B is a knave. If B is a knave, then his
statement that A and C are of the same type is false. Hence, since we are assuming A is a knight, C must be a knave. On the other hand, suppose A is a knave. Then his statement that B is a knave is false, and B is the knight. Hence, B's statement that A and C are of the same type is true, and since A is a knave, so is C. Thus, we have shown that C is a knave regardless of whether A is a knave or a knight. While each assumption about A leads straightforwardly to a conclusion about C, the disjunctive nature of the puzzle makes it quite difficult. And the difficulties are not negligible: about 30% of Rips' subjects stopped working on a set of such problems relatively quickly and scored at less than chance accuracy (which was 5%), and the remaining subjects averaged a low solution rate of 26% of the problems answered correctly. In fact, after reviewing subjects' think-aloud protocols, Rips (1989, p. 89) concludes that "most of the subjects' difficulties involved conceptual bookkeeping rather than narrowly logical deficiencies" (although see Johnson-Laird & Byrne, 1991, for a discussion of possible difficulties involved in more complex cases). Rips proceeds to stipulate that while the required propositional (logical) rules are equally available to everyone, "subjects differ in the ease with which they hit upon a stable solution path" (p. 109). Thus, it is not the simple logical steps that seem to create the difficulties in this case, but rather the general, conceptual "solution path" required to reason through a disjunction.

Conclusion

In their seminal Study of Thinking, Bruner, Goodnow, and Austin (1956) observed "the dislike of and clumsiness with disjunctive concepts shown by human subjects" (p. 181). The studies reviewed above indicate that people's dislike of and clumsiness with disjunctions extend across numerous tasks and domains. While various factors may contribute to the clumsiness with disjunctions in different domains, it nonetheless appears that a consideration of people's reluctance to think through disjunctions may shed light on common difficulties experienced in reasoning and decision making under uncertainty. Decision difficulty is sometimes attributed to emotional factors having to do with conflict and indecision. Alternatively, it can result from the sheer complexity that characterizes many decision situations. In the context of the present paper, on the other hand, STP violations were observed in a number of simple contexts of decision and reasoning that do not seem readily attributable to either emotional factors or complexity considerations. The disjunctive scenarios reviewed in this paper were quite simple, most involving just a couple of possible disjuncts. In contrast to many complicated tasks that people perform with relative ease, these problems appear computationally very simple. They serve to highlight the discrepancy between logical complexity on the one hand and psychological
In general, subjects appear reluctant to travel through the branches of a decision tree. Indeed, numerous studies have shown that merely encouraging subjects to systematically consider the various disjuncts often allows them to avoid common errors. In the context of the THOG problem, for example, Griggs and Newstead (1982) have shown that simply spelling out for subjects the four disjunctive possibilities reliably improves their performance. Similar effects have been shown by Johnson-Laird and Byrne (1991) in the context of the double disjunctions, and by Tversky and Shafir (1992) in the context of various disjunction effects in decision problems. Merely mentioning the few possible disjuncts can hardly be considered a major facilitation from the point of view of computational or logical complexity, but it does appear to set subjects on the right solution path, namely that of systematically contemplating the decision tree's various branches.

Typically, shortcomings in reasoning are attributed to quantitative limitations of human beings as processors of information. "Hard problems" tend to be characterized by reference to the "required amount of knowledge", the "memory load", or the "size of the search space" (cf. Kotovsky, Hayes, & Simon, 1985; Kotovsky & Simon, 1990). These limitations are bound to play a critical role in many situations. As discussed by Shafir and Tversky (1992), however, such limitations are not sufficient to account for all that is difficult about thinking. In contrast to the "frame problem" (Hayes, 1973; McCarthy & Hayes, 1969), for example, which is trivial for people but exceedingly difficult for AI, the task of thinking through disjunctions is trivial for AI (which routinely implements "tree search" and "path finding" algorithms) but is apparently quite unnatural for people.

While it is possible that subjects occasionally forget intermediate results obtained in their reasoning process, subjects in these experiments were allowed to write things down and, besides, there often were very few intermediate steps to remember. Recall, for example, Rips' observation in the context of the knights/knaves problem, to the effect that most subjects' difficulties involved conceptual bookkeeping, or arriving at a stable solution path, rather than narrowly logical deficiencies (for a related discussion, see Goldman, 1993).

It appears that decision under uncertainty can be thought of as another domain in which subjects exhibit a reluctance to think through disjunctive situations. Thinking through an event tree requires people to assume momentarily as true something that may in fact be false. People may be reluctant to make this assumption, especially when competing alternatives (other branches of the tree) are readily available. It is apparently difficult to devote full attention to each of several branches of an event tree (cf. Slovic & Fischhoff, 1977), particularly when it is known that most will eventually prove to be false hypothetical assumptions. Often, subjects may lack the motivation to traverse the tree simply because they assume, as is often the case, that the problem will not be resolved by separately evaluating the branches.

We usually try to formulate problems in ways that have sifted through the irrelevant disjunctions: those that are left are normally assumed to matter.

Situations of uncertainty, it is suggested, can be thought of as disjunctive situations: one event may occur, or another. The studies above indicate that the disjunctive logic of these uncertain situations often introduces an uncertainty of its own. Thus, even in situations in which there should be no uncertainty since the same action or outcome will eventually obtain in either case, people's reluctance to think through these scenarios often creates an uncertainty that, if it were not for this reluctance, would not be felt or observed.

It appears that part of what may be problematic in decision under uncertainty is more fundamental than the problems typically envisioned, concerning the difficulties involved in the estimation of likelihoods and their combination with the estimated utilities of outcomes. Subjects' violations of STP in a variety of decision contexts were attributed to their failure to think through the disjunctive situation. Like other normative principles of decision making, STP is generally satisfied when its application is transparent, but is sometimes violated when it is not (Tversky & Kahneman, 1986). The frequency of disjunction effects, in other words, substantially diminishes when the logic of STP is made salient. In fact, when Tversky and Shafir (1992) first asked subjects to indicate their preferred course of action under each outcome and only then to make a decision in the disjunctive condition, the majority of subjects who opted for the same option under every outcome chose that option also when the precise outcome was not known.

As with numerous other systematic behavioral errors, the fact that people routinely commit a mistake does not, of course, mean that they are not capable of realizing it once it is apparent. Many of the patterns observed above, we suggest, reflect a failure on the part of people to detect and apply the relevant principles rather than a lack of appreciation for their normative appeal (see Shafir, 1993, for related discussion). Because it is a general "solution path" that seems to be neglected, rather than a limitation in logical or computational skill, a proficiency in thinking through uncertain situations may be something that people can improve upon through deliberate planning and introspection. Further study of people's psychology in situations of uncertainty and in other disjunctive situations is likely to improve our understanding and implementation of reasoning in general, and of the decision making process in particular.

References

Bacharach, M., & Hurley, S. (1991). Issues and advances in the foundations of decision theory. In M. Bacharach & S. Hurley (Eds.), Foundations of decision theory: Issues and advances (pp. 1-38). Oxford: Basil Blackwell.

Naming the parents of the THOG: Mental representation and reasoning. S. Braine. Journal of Experimental Psychology: Human Perception and Performance. (1984). Hillsdale. Manuscript. Scientific American. NJ: Erlbaum. Princeton Universitv. Evans. 125-162)..A. Hammond. Thinking: Directed. 391-416. & Newstead. Foundations and applications of decision theory. Harmondsworth: Penguin Books. Some empirical justification for a theory of natural propositional logic. The frame problem and related problems in artificial intelligence. M. 18. & Kunreuther. Leach.R. Bastardi. Metamagical the mas: Questing for the essence of mind and pattern. CA: Academic Press.R. (1974).J. with a mind-bending prediction paradox by William Newcomb. Organizational Behavior and Human Performance. (1989). Hershey. Elithorn & D.E. Gilhooly. 102-109. 407-420. E.. New York: Doubleday. B. Cognition. Theory and Decision. M. Johnson-Laird.T. A. R.N. (1993). 391-397. J. Griggs. San Francisco: Jossey-Bass. Reiser. 229(1). P.N. (1969). P. 9.. V.. G. In G. Hofstadter (1985)..T. American Journal of Sociology. 73.M.. Cosmides. Carlson. 8. The logic of social exchange: has natural selection shaped how humans reason? Cognition.). J. J. The psychology of learning and motivation (Vol. B. Gardner. J. Dilemmas for superrational thinkers. 330-344. Johnson-Laird.).. In C. Jones (Eds. 368-379.J.T.A. probability distortions. Reprinted in D. Cheng. & Lichtenstein. Goldman.. Framing. The role of problem structure in a deductive reasoning task. Bias in human reasoning: Causes and consequences.278 E. (1993). (1989). On the natural selection of reasoning theories.W. Philosophical applications of cognitive science. R. Journal of Risk and Uncertainty. and insurance decisions. forthcoming.J. Craps and magic. 33. (1978). L. J. 25-78. British Journal of Psychology.H. P. 2nd ed. CO: Westview Press. Griggs. 35-51. (1973). P. The elusive thematic-materials effect in Wason's selection task. A study of thinking. 187-276. J. 44. 
San Diego. 451-468. pp.. & Rumain. & Yates.R. Relfections on Newcomb's problem: a prediction and free-will dilemma. (1988).St. (1993). (1991).. 17. Slovic. 313-371). Shafir Bar-Hillel. (1985). P. (1992). 99. Consequentialist foundations for expected utility. Meszaros. Hayes. Propositional reasoning by model. (1956). B.J. New York: Academic Press. (1973). The psychology of superstition. Cheng. (Vol. 75. & Byrne.. June. (1983). 73.St. New York: Wiley. (1982).).. (1978). 1.F. G. K. (1988). & Holyoak. S. (1994). M. Artificial and human thinking.F. 7. Cognitive Psychology. & Cox. British Journal of Psychology. R.J. J. On the subjective probability of compound events. (1989).W.L. & Legrenzi. Henslin. (1973). (1982). Heuristic and analytic processes in reasoning. New York: Basic Books. Psychological Review. J. Dordrecht: Reidel. 418-439. pp. Organizational Behavior and Human Decision Processes. Counterfactuals and two kinds of expected utility. 285-313. Memory and Cognition.I. B.. undirected. & Shafir. Evans. 396-406. M.J. & E. Journal of Experimental Psychology: Language. E.. 297-307. (1983)..M. Gardner.M. Bruner. W.A.. H. McClennen. Evans. Boulder. Fault trees: sensitivity of estimated failure probabilities to problem representation.B. & Austin. Girotto.. British Journal of Psychology. Gibbard. The paradoxicon. Matching bias in the selection task. Free will revisited. P.B. 104-108. Scientific American. Pragmatic reasoning schemas.S. Fischhoff. (1973).B. and creative. . (1989). 230(3). J.. Disjunction errors in qualitative likelihood judgment. & Lynch. & Holyoak.L.J.S. J. J. (1984). 64.J. Scientific American. 4. D. J. On the search for and misuse of useless information. Hillsdale: Erlbaum. & Harper. P. & Schaeken.W.St. Quarterly Journal of Experimental Psychology. 316-330. P. Byrne. 31. (Eds. R.A. Hofstadter. K. Jahoda. Johnson.. Deduction. A.S. K. N.D.. 25. A. Hooker. leading up to a luring lottery. In A. W. Falletta. Bower (Ed. Goodnow.J. (1976).

12(2). Quattrone. Cognition.F. Rationality: Psychological and philosophical perspectives (pp. CA: MorganKaufmann.. 2nd ed.. Causal versus diagnostic contingencies: On self-deception and on the voter's illusion. E. Journal of Personality and Social Psychology. Prospect theory: An analysis of decision under risk. (1982). 38-43. & Tversky. Cognition. Shafir. A theoretical analysis of insight into a reasoning task.. Cognition. & Snyder.N. & Tversky. J. 477-488. E. Smith. Typicality and reasoning fallacies.P. 31. Michie (Eds. (1977). D. P. Levi. Cognitive Psychology. Smith. 143-183. Cognitive processes in propositional reasoning. 117-136). (1990). Stigum & F. E.N. P.A. Hurley (Eds. (1993).A. (1993).). M. Manktelow & D. I. 49. Why are some problems hard? Evidence from tower of Hanoi. Canadian Journal of Behavioral Science. G. Shafir. 38-71. NJ: Erlbaum. E... Cognition. 134-148. In K.. (1983). L. A. V. A. & Pearl. A. On the study of statistical intuitions. Hillsdale. H. Legrenzi. In M. B. Cognitive Psychology. Machine intelligence. K. 11-36. (1989).. 3. (1985). The psychology of knights and knaves. Hayes. (1978). L. Slovic. Nozick.I... (1979). Intuitions about rationality and cognition. 47.. (1990). D. 49. Rothbart.. J. Shafir.J.V. (1984). San Mateo. Meltzer & D. Newcomb's problem and two principles of choice. In B. 24. E.about man's ability to process information.M. 11. D. & Fischhoff. E. 395-400. & Chammah. R.E. (1974-6). & Evans. Hempel. In B. P. 229-239.R. 63. J. Confidence in the prediction and postdiction of an uncertain event. L. (1970). Memory and Cognition. 85-116. A. (1972). J. In N. Sure-thing doubts. Wilke.. On the psychology of experimental surprises. (1969). A. Logical abilities in Children (Vol..J. Oakhill. B. (Eds. A. New York: Routledge. Kahneman. The foundations of statistics. Oxford: Basil Blackwell. Over (Eds. 248-294.N. Russell. P.N. 2-4).. Consequentialism and sequential choice. (1969). 
What makes some problems really hard: Explorations in the problem space of difficulty. Cognitive Psychology.St.).M. D. & Johnson-Laird. (1979). Readings in uncertain reasoning. Cognition.) (1990).. I. (1983). Foundations of decision theory: Issues and advances (pp. Girotto. Psychological Review. Orgeon Research Institute Research Monograph. 1. Rescher (Ed. Econometrica. New York: American Elsevier. 18.A. Bacharach & S. New York: Norton. (1972). P.. & Shafir. Kahneman. 22. 263-291.).). S.N. Psychological Review. 544-551. New York: Simon & Schuster.. Reason-based choice. McCarthy. 79-94. Dordrecht: Reidel. Shafer.. Johnson-Laird. 90. memory and the search for counterexamples. From Shakespeare to Simon: speculations .E. 70. P.J. (1965). & Osherson. H.C. Savage. 17. 237-248. Some philosophical problems from the standpoint of Artificial Intelligence. P. The principles of mathematics. & Tversky. Prisoner's dilemma. A. & Hayes. Shafir. M. Kotovsky. Rips. & Simon.and some evidence . What is the name of this book? The riddle of Dracula and other logical puzzles. Manktelow.T.N. Ann Arbor: University of Michigan Press. Thinking through uncertainty: Nonconsequential reasoning and choice. 20. 97. & Wason. E. McClennen. Cognitive Psychology. (1992).I. (1993). 123-141. . Category based induction. D. & Legrenzi. 2.). British Journal of Psychology. Kotovsky. B. R. 260-283). Foundations of utility and risk theory with applications (pp. Facilitation of reasoning by realism: Effect or non-effect? British Journal of Psychology. 449-474. (1954).E. E. (1991). 46. Rapoport.. & Simon.. Lopez. 92-122).Uncertainty and the difficulty of thinking through disjunctions 279 Johnson-Laird.. Legrenzi. K. (1970). Rips. Smullyan. Essays in honor of Carl G. New York: Wiley.. Focussing in reasoning and decision making. A. P. Reasoning and a sense of reality. 37-66.N.. & Tversky. 185-200. Simonson.. Rationality.. P. K. (1943). J. (1990). (1985). & Tversky. A. G. Wenstop (Eds.. P. & Johnson-Laird. 
Journal of Experimental Psychology: Human Perception and Performance. Osherson. Slovic. Dordrecht: Reidel. Osherson.

In J. Wason. Structural simplicity and psychological complexity: Some thoughts on a novel problem.).C.C.N. 5. British Journal of Psychology. Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. & Kahneman. Tversky. P. (1969). Lewicki.B.. A conflict between selecting and evaluating information in an inferential task. 185. Wason. Realism and rationality in the selection task. Psychological Review. (1986). P. & Kahneman. Tversky. & Johnson-Laird. (1986).C. 3. The disjunction effect in choice under uncertainty. D. 509-515. A. A. 59. M. A. Wason.. New horizons in psychology (Vol. & Shafir. My half-sister is a THOG: Stragetic processes in a reasoning task.C. (1979). Psychological Science. Science. (1983). 1). British Journal of Psychology. D.. Tversky. Psychology of reasoning: Structure and content. D. P. Thinking and reasoning: Psychological approaches. Journal of Business. 143-151. 297-323. Tversky. (1966). 41. L. & Kahneman. 305-309. 293-315. In B. Bulletin of the British Psychological Society. Stanford University. (1972). Temporal orientation and perceived control as determinants of risk-taking. & Clark. D. A. & Katz. Cambridge. & Brooks. A. Reasoning..J. P. Harmandsworth: Penguin. Shafir Smyth. 2. Rational choice and the framing of decisions. 275-287. Psychological Research. S.G. Strickland. P. (1993).E.. (1966). .H. Evans (Ed. MA: Harvard University Press. P. Wason. 77. (1992).. (1983). 90.M. P. Wason. Tversky.J.. Wason. Support theory: A nonextensional representation of subjective probability. Judgment under uncertainty: Heuristics and biases.M. 281-284. 79-90.C. & Koehler. E. A. Manuscript. A. 22. P. (1974).C. Tversky. Advances in prospect theory: Cumulative representation of uncertainty. P. 61. 1124-1131. 251-278.M. (1970)..280 E.N.T. R.St. (1992). Journal of Risk and Uncertainty. & Kahneman. & Johnson-Laird. Journal of Experimental Social Psychology. London: Routledge & Kegan Paul. Foss (Ed.). D. 
THOG: The anatomy of a problem..

14 The perception of rhythm in spoken and written language

Anne Cutler
Max-Planck-Institut für Psycholinguistik, Wundtlaan 1, 6525 XD Nijmegen, Netherlands
MRC Applied Psychology Unit, Cambridge, UK

Part I: Introduction

Rhythm is perceptually salient to the listener. This claim is central to the research project briefly described below: a large-scale investigation of listening, in which the principal issue was how listeners segment continuous speech into words. Listeners must recognize spoken utterances as a sequence of individual words, because the utterances may never previously have been heard. Indeed, listeners' subjective impression of spoken language is that it may be effortlessly perceived as a sequence of words. Yet speech signals are continuous: there are only rarely reliable and robust cues to where one word ends and the next begins.

Research on this issue in a number of languages prompted apparently differing proposals. In English, experimental evidence (Cutler & Norris, 1988; Cutler & Butterfield, 1992; McQueen, Norris & Cutler, 1994) suggested that lexical segmentation could be efficiently achieved via a procedure based on exploitation of stress rhythm: listeners assumed that each strong syllable was likely to be the beginning of a new lexical word, and hence segmented speech input at strong syllable onsets. Experimental evidence from French, in contrast (Mehler, Dommergues, Frauenfelder & Segui, 1981; Segui, Frauenfelder & Mehler, 1981) motivated a syllable-by-syllable segmentation procedure. Similar to the French findings were results from Catalan, and, under certain conditions, Spanish (Sebastian, Dupoux, Segui & Mehler, 1992). Cross-linguistic studies with French and English listeners (Cutler, Mehler, Norris & Segui, 1986, 1992) established that these contrasting results reflected differences between listeners rather than being effects of the input itself; non-native listeners did not use the segmentation procedures used by native listeners, but instead could apply their native procedure—sometimes inappropriately—to non-native input.

Language rhythm offered a framework within which the English and French results could be interpreted as specific realizations of a single universal procedure. The rhythm of English is stress-based, while French is said to have a syllable-based rhythm. Thus in both these cases the segmentation procedure preferred by listeners could be viewed as exploitation of the characteristic rhythmic structure of the language. To test this hypothesis, experiments similar to those conducted in French and English were carried out in Japanese, a language with a rhythmic structure based on a subsyllabic unit, the mora (Otake, Hatano, Cutler & Mehler, 1993; Cutler & Otake, 1994).

These showed indeed that Japanese listeners could effectively use moraic structure to segment spoken input, thus supporting the hypothesis that listeners solve the segmentation problem in speech recognition by exploiting language rhythm—whatever form the rhythmic structure of their native language may take.

The segmentation problem, however, is specific to listening. One of the many differences between listening and reading is that in most orthographies the segmentation problem does not arise—words are clearly demarcated in most printed texts. Therefore it is at least arguable that the reader, who is confronted with no segmentation problem, has no need of language rhythm. In other words: Rhythm will not be perceptually salient to the reader. The following section of this chapter was conceived as a (self-referential) test of this suggestion. The reader is invited to read the piece as it originally appeared (Part II). For those still unconvinced, Part III may serve as a useful comparison to Part II.

Part II: The Perception of Rhythm in Language

1. The segmentation problem

The orthography of English has a very simple basis for establishing where words in written texts begin and end: both before and also after every word are empty spaces and this demarcation surely helps the reader comprehend. In a spoken text, however, as presented to a hearer, such explicit segmentation cues are rarely to be found; little pauses after every single word might make things clearer, but the input is continuous, a running stream of sound. This implies that part of listening involves an operation whereby input is segmented, to be processed word by word, for we cannot hold in memory each total collocation, as most sentences we come across are previously unheard. Yet we listeners experience no sense of some dramatic act of separating input into pieces that are known; as we listen to an utterance it seems unproblematic—words in sentences seem just as clear as words that stand alone. Just how listeners accomplish such an effortless division is a question that psychologists have now begun to solve, and this paper will describe (although with minimal precision) some experimental studies showing what it might involve. The findings, as this summary explains, at once can vindicate the order of the problem and the hearer's sense of ease, for though speech must be segmented, yet the data plainly indicate that rhythm in the input makes segmenting speech a breeze.

2. The language-specificity of rhythmic structure

Now linguistic rhythmic structures have a noticeable feature in that language-universal they are definitely not. This fact is all too obvious to any hapless teacher who has tried to coax the prosody of French from, say, a Scot. Thus while English rhythmic structure features alternating stresses in which syllables contrast by being either strong or weak, this particular endowment is not one which French possesses, having rather one where syllables are equal, so to speak. These distinctions were expressed within traditional phonetics as uniquely based on timing (stress or syllable), though now we admit of more complexity in rhythmic exegetics and of other types of patterning that languages allow; thus in Japanese the mora is the (subsyllabic) unit which provides the root of rhythm, as phonologists maintain. An important source of evidence, and few would dare impugn it, can be found in verse and poetry: the metrical domain. So compare the English limerick, a form which thousands take up, with the haiku, a poetic form of note in Japanese; there are five lines in a limerick, and stress defines their makeup: the third and fourth are two-stress lines, the others all are threes; and analogously haiku have their composition reckoned by the mora computation, in a manner iron-cast: while the longest line in morae, having seven, is the second, there are five and only five in both the first line and the last.

3. The use of rhythm in listening

Just those rhythms found in poetry are also those which function in perception, as the work referred to earlier suggests; thus for English there is evidence¹ involving a conjunction of spontaneous performance and experimental tests, which together show that listeners use stress in segmentation, by hypothesizing boundaries when syllables are strong. Since the lexicon has far more words with strong pronunciation of the word-initial syllable, this method can't go wrong. In comparison with English we should surely not ignore a set of studies² run quite recently on hearers in Japan, which produced results consistent with the story that the mora is the unit that these listeners segment by when they can; while those studies³ that initiated all this lengthy series were performed on native listeners of French some years ago, and they demonstrated well beyond the range of any queries that these listeners used syllables for parsing speech en mots. More experiments were subsequently carried out in Spanish, and in Catalan and Portuguese and Québécois and Dutch, which in spite of minor variance did nothing that would banish the conclusion that for hearers rhythm matters very much. So the picture that emerges is that rhythm as exhibited in verse forms of a language can effectively predict those procedures which, assuming that their use is not inhibited, allow us to declare the segmentation problem licked.

4. The non-use of rhythm in reading

Unexpected complications to this neat account, however, are observed when we consider rhythms found in written text. Some preliminary findings, which this section will endeavor to elucidate, at first left their discoverer perplexed. For if rhythm is so integral a part of our audition, then it ought to be the case that it is hard to overlook, but the most pronounced of rhythms can escape our recognition when they're reproduced in printing in an article or book. Late in 1989 the present author wrote a letter, in which verse (or rather, doggerel) pretended to be prose, to at least a hundred friends, from whom responses showed the better part had not perceived the rhymes at all, wherever they arose. In a follow-up, a colleague⁴ gave this ready-made material to subjects to read out, and his results were even worse: of the readers who produced the text, in strict progression serial, not one perceived the letter as a rhyming piece of verse. But the selfsame text, however, may be printed as a ballad (thus, with lines which end in rhymes), and any reader can descry where the rhythm is, which renders this interpretation valid: written rhythm's only noticed when it clearly hits the eye. But perhaps the readers' lack of use of rhythm, as conceded, if judiciously considered has a lesson it can teach: it arises just because no segmentation step is needed. Thus the role of language rhythm is in understanding speech.

¹ Cutler & Norris (1988), Cutler & Butterfield (1992).
² Otake et al. (1993), Cutler & Otake (1994).
³ Mehler et al. (1981), Segui et al. (1981).
⁴ Many thanks to Aki Fukushima and Bob Ladd for conducting this study and permitting me to describe it.
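The stress-based procedure summarized above (hypothesize a word boundary at the onset of each strong syllable) can be sketched in a few lines. This is only an illustration under simplifying assumptions: the input is taken to be already divided into syllables and hand-labelled as strong or weak, the example phrase and its stress labels are made up for the sketch, and the function name `segment_at_strong_syllables` is hypothetical. It is not the procedure or the materials used in the cited experiments.

```python
from typing import List, Tuple

# A syllable is represented as (text, is_strong); the labels are assumed given.
Syllable = Tuple[str, bool]


def segment_at_strong_syllables(syllables: List[Syllable]) -> List[List[str]]:
    """Group syllables into candidate words, opening a new candidate
    at every strong syllable (and at the very first syllable)."""
    candidates: List[List[str]] = []
    for text, is_strong in syllables:
        if is_strong or not candidates:
            # Strong onset (or start of input): hypothesize a boundary here.
            candidates.append([text])
        else:
            # Weak syllable: attach it to the current candidate word.
            candidates[-1].append(text)
    return candidates


if __name__ == "__main__":
    # Hypothetical, hand-labelled input for "the doctor sends a bill".
    utterance = [("the", False), ("doc", True), ("tor", False),
                 ("sends", True), ("a", False), ("bill", True)]
    print(segment_at_strong_syllables(utterance))
```

In this toy input the weak syllable "a" is grouped with the preceding strong syllable, which shows that the sketch only proposes boundaries at strong onsets; words that begin with a weak syllable would need further lexical processing.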

Part III: The Perception of Rhythm in Language

1. The segmentation problem

The orthography of English has a very simple basis
for establishing where words in written texts begin and end:
both before and also after every word are empty spaces
and this demarcation surely helps the reader comprehend.

In a spoken text, however, as presented to a hearer,
such explicit segmentation cues are rarely to be found;
little pauses after every single word might make things clearer,
but the input is continuous, a running stream of sound.

This implies that part of listening involves an operation
whereby input is segmented, to be processed word by word,
for we cannot hold in memory each total collocation,
as most sentences we come across are previously unheard.

Yet we listeners experience no sense of some dramatic
act of separating input into pieces that are known;
as we listen to an utterance it seems unproblematic—
words in sentences seem just as clear as words that stand alone.

Just how listeners accomplish such an effortless division
is a question that psychologists have now begun to solve,
and this paper will describe (although with minimal precision)
some experimental studies showing what it might involve.

The findings, as this summary explains, at once can vindicate
the order of the problem and the hearer's sense of ease,
for though speech MUST be segmented, yet the data plainly indicate
that rhythm in the input makes segmenting speech a breeze.

2. The language-specificity of rhythmic structure

Now linguistic rhythmic structures have a noticeable feature
in that language-universal they are definitely not.
This fact is all too obvious to any hapless teacher
who has tried to coax the prosody of French from, say, a Scot.

Thus while English rhythmic structure features alternating stresses
in which syllables contrast by being either strong or weak,
this particular endowment is not one which French possesses,
having rather one where syllables are equal, so to speak.

These distinctions were expressed within traditional phonetics
as uniquely based on timing (stress or syllable), though now
we admit of more complexity in rhythmic exegetics
and of other types of patterning that languages allow;

thus in Japanese the mora is the (subsyllabic) unit
which provides the root of rhythm, as phonologists maintain.
An important source of evidence, and few would dare impugn it,
can be found in verse and poetry: the metrical domain.

So compare the English limerick, a form which thousands take up,
with the haiku, a poetic form of note in Japanese;
there are five lines in a limerick, and stress defines their makeup:
the third and fourth are two-stress lines, the others all are threes;

and analogously haiku have their composition reckoned
by the mora computation, in a manner iron-cast:
while the longest line in morae, having seven, is the second,
there are five and only five in both the first line and the last.

3. The use of rhythm in listening

Just those rhythms found in poetry are also those which function
in perception, as the work referred to earlier suggests;
thus for English there is evidence involving a conjunction
of spontaneous performance and experimental tests,

which together show that listeners use stress in segmentation,
by hypothesizing boundaries when syllables are strong.
Since the lexicon has far more words with strong pronunciation
of the word-initial syllable, this method can't go wrong.

In comparison with English we should surely not ignore a
set of studies run quite recently on hearers in Japan,
which produced results consistent with the story that the mora
is the unit that these listeners segment by when they can;

while those studies that initiated all this lengthy series
were performed on native listeners of French some years ago,
and they demonstrated well beyond the range of any queries
that these listeners used syllables for parsing speech en mots.

More experiments were subsequently carried out in Spanish,
and in Catalan and Portuguese and Québécois and Dutch,
which in spite of minor variance did nothing that would banish
the conclusion that for hearers rhythm matters very much.

So the picture that emerges is that rhythm as exhibited
in verse forms of a language can effectively predict
those procedures which, assuming that their use is not inhibited,
allow us to declare the segmentation problem licked.

4. The non-use of rhythm in reading

Unexpected complications to this neat account, however,
are observed when we consider rhythms found in written text.
Some preliminary findings, which this section will endeavor
to elucidate, at first left their discoverer perplexed.

For if rhythm is so integral a part of our audition,
then it ought to be the case that it is hard to overlook,
but the most pronounced of rhythms can escape our recognition
when they're reproduced in printing in an article or book.

Late in 1989 the present author wrote a letter,
in which verse (or rather, doggerel) pretended to be prose,
to at least a hundred friends, from whom responses showed the better
part had not perceived the rhymes at all, wherever they arose.

In a follow-up, a colleague gave this ready-made material
to subjects to read out, and his results were even worse:
of the readers who produced the text, in strict progression serial,
not one perceived the letter as a rhyming piece of verse.

But the selfsame text, however, may be printed as a ballad
(thus, with lines which end in rhymes), and any reader can descry
where the rhythm is, which renders this interpretation valid:
written rhythm's only noticed when it clearly hits the eye.

But perhaps the readers' lack of use of rhythm, as conceded,
if judiciously considered has a lesson it can teach:
it arises just because no segmentation step is needed.
Thus the role of language rhythm is in understanding speech.

References

Cutler, A., & Butterfield, S. (1992). Rhythmic cues to speech segmentation: Evidence from juncture misperception. Journal of Memory and Language, 31, 218-236.
Cutler, A., Mehler, J., Norris, D., & Segui, J. (1986). The syllable's differing role in the segmentation of French and English. Journal of Memory and Language, 25, 385-400.
Cutler, A., Mehler, J., Norris, D., & Segui, J. (1992). The monolingual nature of speech segmentation by bilinguals. Cognitive Psychology, 24, 381-410.
Cutler, A., & Norris, D.G. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception and Performance, 14, 113-121.
Cutler, A., & Otake, T. (1994). Mora or phoneme? Further evidence for language-specific listening. Journal of Memory and Language, 33, 824-844.
McQueen, J.M., Norris, D.G., & Cutler, A. (1994). Competition in spoken word recognition: Spotting words in other words. Journal of Experimental Psychology: Learning, Memory and Cognition, 20, 621-638.
Mehler, J., Dommergues, J.-Y., Frauenfelder, U., & Segui, J. (1981). The syllable's role in speech segmentation. Journal of Verbal Learning & Verbal Behavior, 20, 298-305.
Otake, T., Hatano, G., Cutler, A., & Mehler, J. (1993). Mora or syllable? Speech segmentation in Japanese. Journal of Memory and Language, 32, 358-378.
Sebastian-Galles, N., Dupoux, E., Segui, J., & Mehler, J. (1992). Contrasting syllabic effects in Catalan and Spanish. Journal of Memory and Language, 31, 18-32.
Segui, J., Frauenfelder, U., & Mehler, J. (1981). Phoneme monitoring, syllable monitoring and lexical access. British Journal of Psychology, 72, 471-477.

15 Categorization in early infancy and the continuity of development

Peter D. Eimas*
Department of Cognitive and Linguistic Sciences, Brown University, Providence, RI 02902, USA

Abstract

Arguments and evidence are presented for the conclusion that the young infant's perceptually based categorical representations for natural kinds - animals in this case - are the basis for their mature conceptual counterparts. In addition, it is argued that conceptual development is continuous in nature and without the need for special developmental processes. A consideration of the development of the syllabic, segmental, and featural categories of phonology shows a more complex pattern of change - one marked by both continuities and discontinuities in the representations themselves and the processes that produce them.

1. Introduction

The standard wisdom that underlies developmental theories of human cognition includes the presumption that development is marked by discontinuities - by stages that differ in kind with respect to the medium of mental representations and often the processes or rules that operate on these representations. The foremost proponent of this general view of development in the twentieth century was certainly Piaget (1952, for example), but comprehensive theories of cognitive development akin to that of Piaget are found in the writings of Bruner, Olver, and Greenfield (1966) and Vygotsky (1934/1962) among others. They, like Piaget, hypothesized a variety of stages and substages to explain the substantive changes in perception and thought that were believed to mark our intellectual passage from birth to maturity.

* E-mail eimas@browncog.bitnet
Preparation of this discussion and the author's research described herein were supported by Grants HD 05331 and HD 28606. The author thanks Gundeep Behl-Chadha, Joanne L. Miller, and Paul C. Quinn for their comments on earlier versions.

More typical of the writings of recent developmental theorists are explanations of (presumed) qualitative changes in quite restricted aspects of cognition. An earlier example of this approach is found in the Kendlers' description of the changes in discriminative learning and transfer that occur during the fifth and seventh years of life (e.g., Kendler & Kendler, 1962, but see Eimas, 1970). Other, more recent examples are offered by investigators concerned with developmentally correlated regressions in various facets of cognition from imitation to conservation and language comprehension. These relatively specific reversals in cognitive growth are typically conceived of (or can be conceived of) as reflecting the consequences of developmentally timed neuronal reorganizations that alter cognitive processes and strategies (Bever, 1982).

The initial concerns of the present discussion pertain to the conceptual development of natural kinds, specifically animals, from early infancy to early childhood and how this development relates to the typically presumed discontinuities in development. In the second section, I consider development of phonology from the perspective of continuities and discontinuities across representations and the processes that yield these representations.

Conceptual development

At least since the time of Piaget (Piaget & Inhelder, 1964) and Vygotsky (1934/1962), this facet of cognitive development has been viewed as a process marked by a series of stages in which the categorical representations for objects and events that populate our environment differ in kind across the inevitable progression, ceteris paribus, from one developmental period to the next. For example, the representations hypothesized by Piaget in the earliest years of life were based on sensorimotor representations, whereas for Vygotsky the earliest representations were the idiosyncratic associations of the individual child among the things of the world - cognitive "heaps". Only later, at or near the time of puberty, do conceptual representations that are abstract, logically structured, and meaningful emerge. And only then according to Piaget are acts of cognition involving concepts able to support the logic of problem solving, for example, or according to Vygotsky to provide the conceptually based meanings that are conveyed by human language. For Bruner and his colleagues it was only when concepts became symbolically represented, as opposed to enactively (motorically) or iconically represented, did thought attain mature levels of computational power. This stage-wise view of conceptual development is found today in the writings of Mandler (1992) and Karmiloff-Smith (1992), for example. It is important to note that these theorists posit not only a difference in kind between early and

later representations of the categories of our world, but they also posit specialized mechanisms that perform this transition - a transformation that is taken to make human cognition in its mature form possible. Mandler (1992) has assumed that the earliest categorical structures of infants, the earliest parsing of things and events in the world, are perceptual in nature (cf. Quinn & Eimas, 1986) and remain so until they undergo a process of perceptual analysis that yields the meaningful conceptual representations of older children and adults - representations that permit us to know the kind of thing being represented. In a similar vein, Karmiloff-Smith (1992) has posited a process of representational redescription that operates a number of times during development, with each redescription producing increasingly more abstract representations that eventually become available to consciousness.1

Quinn and Eimas (1986) also noted the qualitative-like differences between the earliest perceptually based representations of young infants and the conceptually driven representations of maturity - the latter having a core that was then presumed to include information that was not sensory or perceptual in nature. Quinn and Eimas were, however, moot about causal factors that brought about these apparent changes which I now attempt to describe, drawing on a discussion of conceptual development by Quinn, Eimas, and Behl-Chadha (in preparation). In this endeavor we depart from the classical view of conceptual development, offering instead the contention that development of conceptual representations and even of the naive theories in which they are ultimately embedded is a continuous process that does not require the application of special-purpose processes of development. What is necessary instead is the application and re-application of processes that are available to all sentient beings (as far as we know), that are innately given, that are operative early in life, and that remain operative throughout the course of our existence. The initial function of these processes is to form perceptually driven categorical representations, whereas their later function is to enrich these initial representations informationally and to do so to an extent that they begin to take on the characteristics of concepts. In effect, this view posits continuity with respect to the cognitive operations by which conceptual representations develop and considers the nature of categorical representations to be unchanging across development, the apparent qualitative difference between perceptually and conceptually driven

1 Xu and Carey (1993) have presented evidence that 10-month-old infants use spatiotemporal but not property-kind information in forming representations of specific physical objects (sortal concepts) that specify their boundaries and numerical identity (i.e., whether an object is identical to one encountered at another time). Although this can be viewed as a discontinuity in conceptual development, it can also be described as part of a continuous process whereby different attributes of the physical world attain functional significance at different ages.

representations being in actuality for us one of degree of informational richness and complexity.2

It is important from the view taken here that definitions of mature concepts are vague and more than somewhat imprecise with respect to the information that is represented and the information that presumably distinguishes mature from immature representations (Quinn et al., in preparation). In a recent discussion of the current "Zeitgeist on the nature of concepts" Jones and Smith (1993) offer the following description of concepts that they take to summarize the prevailing view: "... our perceptual experiences as we encounter objects in the world are represented at the periphery of our concepts. At the center lies our nonperceptual knowledge: principally, beliefs about the origins and causes of category membership. Thus at the periphery of our concept kitten would be a description of its surface properties, fur, four-legged, and so forth. At the center would be understanding of the kitten as a living entity - as an immature cat, as having been born of a cat" (p. 114).

In opposition to this "Zeitgeist on the nature of concepts", we offer the idea that the nonperceptual knowledge that is taken to mark concepts as opposed to perceptual categories finds its origins and basis in the same processes of perception and categorization that make possible the initial perceptually driven categorical representations. For example, "living entity" in the example of kitten described above may be given by such information as the self-propelled and intended motion that is associated with living animals (sufficiently restricted initially to exclude mollusks and plants, at the very least) and that would seem to be readily perceived and distinguished from other kinds of motion by young infants (e.g., Bertenthal, Proffitt, & Cutting, 1984); thus it too is perceptually based. To be represented as an "immature cat" could well be a consequence of such characteristics as size (conditional upon other feline attributes), head and body shape, facial characteristics, playfulness, and the kitten's sounds of communication - characteristics again likely to be perceivable by infants (cf. Quinn et al., in preparation). The acquisition of these attributes and undoubtedly others is presumed, as noted, to be gradual and most importantly to rest on the processes of categorization and possibly prototype extraction that yield the early categorical structures of natural and artifactual kinds.

There is no question that knowledge about animals (and other aspects of our world) is also gained by means not directly mediated by our senses and perceptual systems in the classical sense. Rather considerable biological knowledge about

2 It is important to note that this position should not be construed as an argument for an empiricist view of conceptual development. Given the precocity of the infant's abilities to categorize and their sophistication, it would seem obvious that these abilities have strong biological determinants. In addition, as has been well argued by Edelman (1987), for example, the information the infant is sensitive to in the course of parsing the world and in adding knowledge to these categorical representations must likewise be (in part) a function of our biology.

what distinguishes different kinds of animals, for example, as well as such biological principles as inheritance, respiration, digestion, and the like, is gained in the formal and informal processes of education by means of language. These acquisitions, we likewise argue, do not require processes specially designed for cognitive growth. What is required is simply the association of language-based knowledge with mental structures that currently represent the animals in question. Presumably, acquisition of this knowledge is governed by general, if at first rudimentary, laws of language comprehension and learning.

In support of this position, I note our recent research on the young infant's categorization of natural kinds, in this case species of animals. We have obtained considerable evidence showing that infants as young as 3 and 4 months of age can form representations for a variety of mammalian species that are quite exclusive, at (or nearly at) what would be considered the basic level of representation (Eimas & Quinn, in press; Eimas, Quinn, & Cowan, submitted; Quinn, Eimas, & Rosenkrantz, 1993). These include, for example, a categorical representation for cats that excludes horses, tigers, and even female lions given appropriate experience that contrasts cats and lions. Infants of this age can also form a global representation for a number of individual mammalian species, that is, a rudimentary superordinate-like representation (Behl-Chadha, 1993). In sum, quite young infants are starting with categorical representations that parse at least part of the world of animals in ways that will continue to have significance throughout the course of development. What comes with further experience, or so we believe, is a quantitative enrichment, and not a qualitative transformation of these early categorical representations.

We have also argued that the processes of perceptually based categorization and association can in principle yield more abstract representations, for example, an independent representation for animate things (admittedly restricted initially to mammals and similar animals). The common aspects of the features for animate things, for example locomotion and the possession of faces, legs, elongated bodies and so forth, that are found in the categorical representations for different species are presumed to be recognized and abstracted (categorized) and form the basis for a representation for animate beings. These processes of recognition and abstraction operate across the considerable variation that exists in the individual representations of these features in different species. They are believed to be the same processes that permit very young infants to abstract prototypic values for a variety of (possibly simpler) stimulus attributes such as orientation (Quinn & Bomba, 1986). As a consequence, a number of properties, for example, biological motion, may be individuated and represented by some average or prototypic value or range of permissible values. Furthermore, inasmuch as these attributes are correlated in their representations for specific animals, their abstract prototypic values may be bound together and in this manner readily form a unified, if at first rudimentary, representation for animate

things. Given representations for animals at (or nearly at) the basic level as well as the beginnings of a global representation and an emerging representation for animate things, the infant has a number of the necessary representations for the beginnings of a naive theory of biology that will ultimately bring organization to increasingly complex biologically based representations and complete our story of conceptual development for the domain of biology (cf. Murphy & Medin, 1985). What is important is that the emergence of what is viewed as conceptually based categorical representations and even a naive theory can be viewed as a continuous developmental process and one that does not need special processes that transform simple, immature representations into complex, more abstract symbolic representations.

While we have applied this line of thought only to biological entities, we see no reason that such a view cannot have wide application to conceptual development across many domains of natural kinds and artifacts. It is a view very much in accord with that recently offered by Spelke, Breinlinger, Macomber, and Jacobson (1992) for object perception. They theorized that development begins around a "constant core" that yields the perception and representation of coherent objects that adhere to (some) laws of physics. To this core we would add the ability to form perceptually based categorical representations for objects and events, which on gradually acquiring further knowledge become the conceptual representations that make human cognition possible.

Phonological development

In this section, I discuss development of the categories of an emerging system of phonology in terms of the ideas used to describe the emergence of categorical representations for animals in infants and young children. I begin by describing research on the perception of speech, noting in particular the infant's abilities to categorize speech and the apparent constancy across development with respect to the processes of perception that yield categorical representations. I then note, however, that there is evidence for and against the continuity of phonological development with respect to the relation between the original categorical representations for speech and those necessary for a mature phonology and the processes that cause these changes.

There are numerous studies showing that infants in the first weeks and months of life are not only sensitive to and attracted by human speech, but that they are able to represent the sounds of speech categorically (see Eimas, Miller, & Jusczyk, 1987, and Jusczyk, in press, for reviews). The latter is important in that it shows that infants are able to listen through the natural variation in speech that arises from instance-to-instance variation in production, the characteristics of the speaker, including the speaker's sex, emotional state, and rate of production, and phonetic context. This process is necessary if infants are to ultimately arrive at

structures that can support perceptual constancy and provide the basic constituents of language. Thus, for example, Eimas, Siqueland, Jusczyk, and Vigorito (1971) showed that 1- and 4-month-old infants failed to discriminate small differences in the speech signal corresponding to moment-to-moment variations in production (voice onset time in this case) when the different exemplars were drawn from either of the two voicing categories of English in syllable-initial position. However, when the same acoustic difference signaled different voicing categories, the sounds were reliably discriminated. Similar findings have been obtained for voicing information for infants being raised in other language communities as well as for information corresponding to differences in place and manner of articulation, and for sufficiently brief vocalic distinctions. Further evidence for categorization comes from a series of experiments by Kuhl (e.g., 1979, 1983) using a quite different experimental procedure and somewhat older infants. She found that infants, approximately 6 months of age, form equivalence classes for a number of consonantal and vocalic categories whose exemplars varied considerably in their acoustic properties as a consequence of differences in speaker, intonation patterns, and phonetic environment and were readily discriminable from each other. Moreover, given that categorizations of this form occur on the first exposure to novel category exemplars and that the initial categories based on voicing, for example, exist without constraint by the parental language, the categorization of speech would appear to be a basic biologically given characteristic of human infants.

The categorization process for speech in effect maps a potentially indefinite number of signals onto a single representation. What is particularly interesting is that this many-to-one mapping is itself not invariant. Studies with young infants, based on experiments performed originally with adult listeners, have shown that the boundary between categories can be altered as a consequence of systematically varying contextual factors such as rate of speech (Eimas & Miller, 1980; Miller & Eimas, 1983; and see Levitt, Jusczyk, Murray, & Carden, 1988, and Eimas & Miller, 1991). In a similar vein, research with adults and infants has shown that the multiple acoustic properties that are sufficient and available to signal phonetic contrasts enter into perceptual trading relations (e.g., Fitch, Halwes, Erickson, & Liberman, 1980, and Miller & Eimas, 1983, respectively). That is to say, the values along a given cue influence the values along a second cue that signal a categorical representation. Interestingly, these early representations are not only categorical in nature but also organized entities (Eimas & Miller, 1992). The processes for the categorization of speech would thus appear to be precocious and stable across time - one form of continuity in development.

However, it is important to note that experiments evidencing the categorical perception of speech in infancy do not inform us whether these early categorical representations are linguistic. Nor do they inform us whether the units of

categorization are the equivalent of features, segments, or syllables, whether they be auditory or linguistic. I take these initial representations to be linguistic in nature - a highly contentious assumption in the field of speech perception, especially infant speech perception, but one that is justified experimentally and biologically in my view (e.g., Eimas & Miller, 1991, 1992; Liberman & Mattingly, 1985; Mattingly & Liberman, 1990). By this I simply mean that the categorical representations of speech are a part of the mental structures that form a human language (cf. Liberman & Mattingly, 1985; Mattingly & Liberman, 1990). I also take the initial representations to be syllabic in structure - a decision for which there would appear to be greater empirical evidence (Jusczyk, 1993, in press; Mehler, 1981) than for segmental or featural representations (e.g., Hillenbrand, 1984). Now if this is indeed the case, then at some point in development these representations must change from being syllabic to being segmental and featural as well as syllabic, as is required for phonological knowledge (e.g., Kenstowicz & Kisseberth, 1979). It is at this point that arguments for continuity become more complex.

Given these assumptions, a major issue to be considered is whether there is continuity between the initial categorical representations for speech and later phonological representations. A second issue is the basis for development. There would appear to be two views that are quite different with respect both to the nature of the relation between initial and mature representations and their presumed means of transformation. The first, articulated by Studdert-Kennedy (1991), for example, takes the initial categorizations of infants to be little more than general, prelinguistic auditory constraints that must exist if developing systems of perception and production of speech are ultimately to act in concert. They are, furthermore, not the forerunners of the phonetic categories of human languages; indeed, there is little relation between the two. The mature representations of human phonologies emerge from representations of larger linguistic units, namely, words or word-like structures, and then only when the child begins to use language productively to convey meaning - a noncontinuity view of development of representations and the means by which they develop. This form of development is, according to Studdert-Kennedy, in agreement with a general principle of evolution that he views as being applicable to ontogeny, "that complex structures evolve by differentiation of smaller structures from larger ... we should expect phonemes to emerge from words" (Studdert-Kennedy, 1991, p. 12).

The beginnings of an alternative, more complex, view of development with respect to issues of continuity and noncontinuity can be found in my earlier writings. For example, I assumed in Eimas (1975), that initial categories of speech were segmental and unchanging. More recently, however, having been convinced by the prevailing weight of evidence (e.g., Mehler, 1981)

and by the arguments of Jusczyk (e.g., 1985, 1993), I have come to believe that syllabic representations provide the better starting position (Eimas et al., 1987). Nonetheless, I continue to believe that the initial mental structures for speech are linguistic and, if true, it results in a situation, I would argue, that is cognitively more economical than nonlinguistic representations of speech - parity between perception and production is immediately given (cf. Liberman & Mattingly, 1985; Mattingly & Liberman, 1990). That is, the representations supporting perception and production have at least a common core and thus the relation between the two need not be a part of the early stages of language acquisition. These initial holistic representations exist in a form from which segmental and featural representations may eventually be abstracted or differentiated (cf. Gibson, 1969; Jusczyk, 1993). Differentiation is possibly aided later by the processes of speech production that are involved in later meaningful communication, but I would argue with Jusczyk that differentiation is more a consequence of the representation and encoding of similar syllables in nearby positions than of production.

A major point to be taken from this latter section is that acquiring the categories of a mature phonology, the syllables, segments and features, would seem to be indicants of both continuous and discontinuous forms of development. As noted, there is the apparent constancy across development in the processes of speech perception. As a result this aspect of phonological development is continuous. There is further continuity if the initial and later representations of speech are both linguistic in nature. However, continuity does not exist between the form of the initial and final representations of speech with respect to segments and features - representational forms that in my opinion differ in kind from syllabic units, although it does exist for syllables. The process of featural and segmental differentiation may signify a discontinuity, although it need not be, given the prevalence of differentiation in development (Gibson, 1969). Of course, having a process of differentiation as a necessary means for the emergence of phonetic categories may be construed as a special process (over and above that which is involved in categorization per se) designed to further development, that is, a special developmentally related process that transforms initial representations. Moreover, other processes, more probably discontinuous in nature, may be involved if the differentiation of segments and features is also a consequence of production. Finally, it should be remembered that the final representations are not only segmental and featural, they are also syllabic. The mature syllabic structures are undoubtedly more complex, and more varied, than the original representations (cf. Eimas et al., 1987). This is to say, this aspect of development is one that is continuous in nature - there is a change in the complexity of syllabic representations, but not in the kind of unit, as is the case for segments and features. Thus the picture for phonological development is quite different from

that for conceptual development. Moreover, there is no simple way to characterize the nature of phonological development, as may well be true for many of the varied facets of perception, cognition, and language. Attempts to understand and describe our intellectual and linguistic origins and their development thus find further justification for domain-specific theories.

References

Behl-Chadha, G. (1993). Perceptually driven formation of superordinate-like categorical structures in early infancy. Unpublished doctoral dissertation, Brown University.
Bertenthal, B.I., Proffitt, D.R., & Cutting, J.E. (1984). Infant sensitivity to figural coherence in biomechanical motions. Journal of Experimental Child Psychology, 37, 213-230.
Bever, T.G. (1982). Regressions in mental development: Basic phenomena and theories. Hillsdale, NJ: Erlbaum.
Bruner, J.S., Olver, R.R., & Greenfield, P.M. (1966). Studies in cognitive growth. New York: Wiley.
Edelman, G.M. (1987). Neural Darwinism. New York: Basic Books.
Eimas, P.D. (1970). Attentional processes. In H.W. Reese & L.P. Lipsitt (Eds.), Experimental child psychology (pp. 279-310). New York: Academic Press.
Eimas, P.D. (1975). Speech perception in early infancy. In L.B. Cohen & P. Salapatek (Eds.), Infant perception (Vol. 2, pp. 193-231). New York: Academic Press.
Eimas, P.D., & Miller, J.L. (1980). Contextual effects in infant speech perception. Science, 209, 1140-1141.
Eimas, P.D., & Miller, J.L. (1991). A constraint on the perception of speech by young infants. Language and Speech, 34, 251-263.
Eimas, P.D., & Miller, J.L. (1992). Organization in the perception of speech by young infants. Psychological Science, 3, 340-345.
Eimas, P.D., Miller, J.L., & Jusczyk, P.W. (1987). On infant speech perception and the acquisition of language. In S. Harnad (Ed.), Categorical perception: The groundwork of cognition (pp. 161-195). New York: Cambridge University Press.
Eimas, P.D., & Quinn, P.C. (in press). Studies on the formation of basic-level categories in young infants. Child Development.
Eimas, P.D., Quinn, P.C., & Cowan, P. (submitted). Development of categorical exclusivity in perceptually based categories of young infants.
Eimas, P.D., Siqueland, E.R., Jusczyk, P., & Vigorito, J. (1971). Speech perception in infants. Science, 171, 303-306.
Fitch, H.L., Halwes, T., Erickson, D.M., & Liberman, A.M. (1980). Perceptual equivalence of two acoustic cues for stop-consonant manner. Perception and Psychophysics, 27, 343-350.
Gibson, E.J. (1969). Principles of perceptual learning and development. New York: Appleton-Century-Crofts.
Hillenbrand, J.M. (1984). Speech perception by infants: Categorization based on nasal consonant place of articulation. Journal of the Acoustical Society of America, 75, 1613-1622.
Jones, S.S., & Smith, L.B. (1993). The place of perception in children's concepts. Cognitive Development, 8, 113-139.
Jusczyk, P.W. (1993). From general to language-specific capacities: The WRAPSA Model of how speech perception develops. Journal of Phonetics, 21, 3-28.
Jusczyk, P.W. (in press). Language acquisition: Speech sounds and the beginnings of phonology. In J.L. Miller & P.D. Eimas (Eds.), Handbook of perception and cognition, Vol. 11: Speech, language, and communication. Orlando, FL: Academic Press.
Karmiloff-Smith, A. (1992). Beyond modularity. Cambridge, MA: MIT Press.
Kendler, H.H., & Kendler, T.S. (1962). Vertical and horizontal processes in problem solving. Psychological Review, 69, 1-16.

Kenstowicz, M., & Kisseberth, C. (1979). Generative phonology. New York: Academic Press.
Kuhl, P.K. (1979). Speech perception in early infancy: Perceptual constancy for spectrally dissimilar vowel categories. Journal of the Acoustical Society of America, 66, 1668-1679.
Kuhl, P.K. (1983). Perception of auditory equivalence classes for speech in early infancy. Infant Behavior and Development, 6, 263-285.
Levitt, A., Jusczyk, P.W., Murray, J., & Carden, G. (1988). Context effects in two-month-old infants' perception of labiodental/interdental fricative contrasts. Journal of Experimental Psychology: Human Perception and Performance, 14, 361-368.
Liberman, A.M., & Mattingly, I.G. (1985). The motor theory of speech perception revised. Cognition, 21, 1-36.
Mandler, J.M. (1992). How to build a baby: II. Conceptual primitives. Psychological Review, 99, 587-604.
Mattingly, I.G., & Liberman, A.M. (1990). Speech and other auditory modules. In G.M. Edelman, W.E. Gall, & W.M. Cowan (Eds.), Signal and sense: Local and global order in perceptual maps (pp. 501-520). New York: Wiley-Liss.
Mehler, J. (1981). The role of syllables in speech processing: Infant and adult data. Philosophical Transactions of the Royal Society of London, B295, 333-352.
Miller, J.L., & Eimas, P.D. (1983). Studies on the categorization of speech by infants. Cognition, 13, 135-165.
Murphy, G.L., & Medin, D.L. (1985). The role of theories in conceptual coherence. Psychological Review, 92, 289-316.
Piaget, J. (1952). The origins of intelligence in children. New York: International University Press.
Piaget, J., & Inhelder, B. (1964). The early growth of logic. London: Routledge & Kegan Paul.
Quinn, P.C., & Bomba, P.C. (1986). Evidence for a general category of oblique orientations in four-month-old infants. Journal of Experimental Child Psychology, 42, 345-354.
Quinn, P.C., & Eimas, P.D. (1986). On categorization in early infancy. Merrill-Palmer Quarterly, 32, 331-363.
Quinn, P.C., Eimas, P.D., & Rosenkrantz, S.L. (1993). Evidence for representations of perceptually similar natural categories by 3-month old and 4-month old infants. Perception, 22, 463-475.
Spelke, E.S., Breinlinger, K., Macomber, J., & Jacobson, K. (1992). Origins of knowledge. Psychological Review, 99, 605-632.
Studdert-Kennedy, M. (1991). Language development from an evolutionary perspective. In N. Krasnegor, D. Rumbaugh, R. Schiefelbusch, & M. Studdert-Kennedy (Eds.), Language acquisition: Biological and behavioral determinants (pp. 3-28). Hillsdale, NJ: Erlbaum.
Vygotsky, L.S. (1934/1962). Thought and language (transl. E. Hanfmann & G. Vakar). Cambridge, MA: MIT Press.
Xu, F., & Carey, S. (1993). Infants' metaphysics: The case of numerical identity. MIT Center for Cognitive Science, Occasional Paper (#48). Cambridge, MA.

16 Do speakers have access to a mental syllabary?

Willem J.M. Levelt*, Linda Wheeldon
Max Planck Institute for Psycholinguistics, Wundtlaan 1, Nijmegen 6525 XD, Netherlands

*Corresponding author. E-mail PIM@MPI.NL
The authors wish to thank Ger Desserjer for his assistance with the running and analysis of all of the experiments reported.

Abstract
The first, theoretical part of this paper sketches a framework for phonological encoding in which the speaker successively generates phonological syllables in connected speech. The final stage of this process, phonetic encoding, consists of accessing articulatory gestural scores for each of these syllables in a "mental syllabary". The second, experimental part studies various predictions derived from this theory. The main finding is a syllable frequency effect: words ending in a high-frequent syllable are named faster than words ending in a low-frequent syllable. As predicted, this syllable frequency effect is independent of and additive to the effect of word frequency on naming latency. The effect, moreover, is not due to the complexity of the word-final syllable. In the General Discussion, the syllabary model is further elaborated with respect to phonological underspecification and activation spreading. Alternative accounts of the empirical findings in terms of core syllables and demisyllables are considered.

Introduction
The purpose of the present paper is to provide evidence for the notion that speakers have access to a mental syllabary, a repository of articulatory-phonetic syllable programs. The notion of stored syllable programs originates with Crompton (1982) and was further elaborated in Levelt (1989, 1992, 1993). The latter two papers introduced the terms "phonetic" and "mental syllabary" for this hypothetical mental store. Most current theories of speech production model the pre-articulatory form representation at a phonological level as consisting of

discrete segments or features (Dell, 1988; Shattuck-Hufnagel, 1979) and some models assume explicitly that this level of representation directly activates articulatory routines (Mackay, 1982). However, the actual phonetic realization of a phonological feature is determined by the context in which it is spoken. The fact that phonetic context effects can differ across languages means that they cannot all be due to the implementation of universal phonetic rules but must form part of a language-dependent phonetic representation (Keating, 1988). The mental syllabary was postulated as a mechanism for translating an abstract phonological representation of an utterance into a context-dependent phonetic representation which is detailed enough to guide articulation.

The present paper will provide experimental evidence that is consistent with the existence of a mental syllabary and provides a challenge to theories that assume (tacitly or otherwise) that the phonetic forms of all syllables are generated anew each time they are produced. We present here a more detailed model of syllable retrieval processes than has previously been attempted, and while we readily admit that much further evidence is required in order to substantiate it, we propose it as a productive framework for the generation of empirical research questions and as a clear target for further empirical investigation.

In the following we will first discuss some of the theoretical reasons for assuming the existence of a syllabary in the speaker's mind. We will then sketch a provisional framework for the speaker's encoding of phonological words - a framework that incorporates access to a phonetic syllabary. This theoretical section will be followed by an empirical one in which we present the results of four experiments that address some of the temporal consequences of a speaker's retrieving stored syllable programs during the ultimate phase of phonological encoding. In the final discussion section, we will return to a range of further theoretical issues that are worth considering, given the notion of a mental phonetic syllabary.

The syllabary in a theory of phonological encoding

Crompton's suggestion
As with so many notions in theories of speech production, the idea that a speaker retrieves whole phonetic syllable programs was originally proposed to account for the occurrence of particular speech errors. Crompton (1982) suggested the existence of a library of syllable-size articulatory routines to account for speech errors involving phonemes and syllable constituents. For example, an error like guinea hig pair (for guinea pig hair) arises when the mechanism of addressing syllable routines goes awry. The articulatory syllables [pɪg] and [hɛər] in the library are addressed via sets of phonemic search instructions such as:

onset = p, nucleus = ɪ, coda = g
onset = h, nucleus = ɛə, coda = r

If these search instructions get mixed up, leading to the exchange of the onset conditions, then instructions arise for the retrieval of two quite different articulatory syllables, namely [hɪg] and [pɛər]. This provides an elegant account for the phonetic "accommodation" that takes place: [hɪg] is pronounced with the allophone of [h] appropriate to its new vowel context, not with the allophone that would have been its realization in hair. This addressing mechanism, Crompton argues, is fully compatible with Shattuck-Hufnagel's (1979) scan copier mechanism of phonological encoding. According to that model, a word's phonological segments are spelled out from the word's lexical representation in memory, and inserted one-by-one into the slots of a syllabic frame for the word that is independently retrieved from the word's lexemic representation. This copier mechanism in fact specifies the search instructions for each of a word's successive syllables (i.e., onset of syllable 1, nucleus of syllable 1, coda of syllable 1, onset of syllable 2, etc.).

A functional paradox
However, in the same paper Crompton reminds us of a paradox, earlier formulated by Shattuck-Hufnagel (1979, p. 338), but not solved by either of them: "perhaps its [the scan copier's] most puzzling aspect is the question of why a mechanism is proposed for the one-at-a-time serial ordering of phonemes when their order is already specified in the lexicon". Levelt (1992) formulated this functional paradox as follows: Why would a speaker go through the trouble of first generating an empty skeleton for the word, and then filling it with segments? In some way or another both must proceed from a stored phonological representation, the word's phonological code in the lexicon. Isn't it wasteful of processing resources to pull these apart first, and then to combine them again (at the risk of creating a slip)?

And (following Levelt, 1989) he argued that the solution of the paradox should be sought in the generation of connected speech. In connected speech it is the exception rather than the rule that a word's canonical syllable skeleton is identical to the frame that will be filled. Instead, new frames are composed, not for lexical words (i.e., for words in their citation form), but for phonological words, which often involve more than a single lexical word. It is only at this level that syllabification takes place, not at any earlier "citation form" level. Let us now outline this framework in more detail (see Fig. 1).

Word form retrieval
A first step in phonological encoding is the activation of a selected word's

"lexeme" - the word's form information in the mental lexicon. In Fig. 1 this is exemplified for two words, demand and it, as they could appear in an utterance such as police demand it.

Figure 1. A framework for phonological encoding. [Stages labelled in the figure: word form retrieval, segmental spellout, metrical spellout, phonological word formation, segment-to-frame association, retrieval of syllabic gestural scores, articulatory network.]

Although terminologies differ, all theories of phonological encoding, among them Meringer and Mayer (1895), Shattuck-Hufnagel (1979), Dell (1988) and Levelt (1989), distinguish between two kinds of form information: a word's segmental and its metrical form. The segmental information relates to the word's phonemic structure: its composition of consonants, consonant clusters, vowels, diphthongs, glides, etc. Theories differ with respect to the degree of specification, ranging from minimal or underspecification (Stemberger, 1983) to full phonemic specification (Crompton, 1982), and with respect to the degree of linear ordering of segmental

information. Without prejudging these issues (but see Discussion below), we have represented segments in Fig. 1 by their IPA labels and as consonantal or vocalic (C or V).

The metrical information is what Shattuck-Hufnagel (1979) called the word's "frame". It specifies at least the word's number of syllables (its "syllabicity") and its accent structure, that is, the lexical stress levels of successive syllables. Other metrical aspects represented in various theories are: onset versus rest of word (Shattuck-Hufnagel 1992), the precise CV structure of syllables (Dell, 1988), the degree of reduction of syllables (Crompton, 1982) and (closely related) whether syllables are strong or weak (Levelt, 1993). Our representation in Fig. 1 follows Hayes (1989) as far as a syllable's weight is represented in a moraic notation (one mora for a light syllable, two morae for a heavy one). This is not critical, though; weight could also be represented by branching (vs. not branching) the nucleus. But the mora representation simplifies the formulation of the association rules (see below). The representation of accent structure in Fig. 1 is no more than a primitive "stressed" (with ') versus unstressed (without ').

There is also general agreement that metrical information is, to some extent, independently retrieved. This is sometimes phenomenologically apparent when we are in a "tip-of-the-tongue" state, where we fail to retrieve an intended word, but feel pretty sure about its syllabicity and accent structure. This relative independence of segmental and metrical retrieval is depicted in Fig. 1 as two mechanisms: "segmental spellout" and "metrical spellout" (see Levelt, 1989, 1993, for more details).

An important aspect of form retrieval, which will play an essential role in the experimental part of this paper, is that it is frequency sensitive. Jescheniak and Levelt (in press) have shown that the word frequency effect in picture naming (naming latency is longer for pictures with a low-frequency name than for pictures with a high-frequency name) is entirely due to accessing the lexeme, that is, the word's form information.

Phonological word formation
A central issue for all theories of phonological encoding is how segments become associated to metrical frames. All classical theories, however, have restricted this issue to the phonological encoding of single words. However, when generating connected speech, speakers do not concatenate citation forms of words, but create rhythmic, pronounceable metrical structures that largely ignore lexical word boundaries. Phonologists call this the "prosodic hierarchy" (see, for instance, Nespor & Vogel, 1986). Relevant here is the level of phonological words (or clitic groups). In the utterance police demand it, the unstressed function word it cliticizes to the head word demand, resulting in the phonological word demandit. Of crucial importance here is that phonological words, not lexical

words, are the domain of syllabification. The phonological word demandit is syllabified as de-man-dit, where the last syllable straddles a lexical boundary. Linguists call this "resyllabification", but in a processing model this term is misleading. It presupposes that there was lexical syllabification to start with (i.e., de-mand + it). There is, in fact, good reason to assume that a word's syllables are not fully specified in the word form lexicon. If they were, they would regularly be broken up in connected speech. That is not only wasteful, but it also predicts the occurrence of syllabification speech errors such as de-mand-it. Such errors have never been reported to occur in fluent connected speech.

In short, there must be a mechanism in phonological encoding that creates metrical frames for phonological words. This is depicted in Fig. 1 as "phonological word formation". Notice that this is an entirely metrical process. There are no known segmental conditions on the formation of phonological words (such as "a word beginning with segment y cannot cliticize to a word ending on segment x"). The conditions are syntactic and metrical. Essentially (and leaving details aside), a phonological word frame is created by blending the frames of its constituent words, as depicted in Fig. 1.

Segment-to-frame association
The next step in phonological encoding, then, is the association of spelled-out segments to the metrical frame of the corresponding phonological word. There is good evidence that this process runs "from left to right" (Dell, 1988; Meyer, 1990, 1991; Meyer & Schriefers, 1991; Shattuck-Hufnagel, 1979). But the mechanisms proposed still vary substantially. However, whatever the mechanisms, they must adhere to a language's rules of syllabification. Levelt (1992) presented the following set of association rules for English, without any claim to completeness:

(1) A vowel only associates to μ, a diphthong to μμ.
(2) The default association of a consonant is to σ. A consonant associates to μ if and only if any of the following conditions hold: (a) the next element is lower in sonority, (b) there is no σ to associate to, (c) associating to σ would leave a μ without associated element.

In addition, there is a general convention that association to σ, the syllable node, can only occur on the left-hand side of the syllable, that is, to the left of any unfilled morae of that syllable. See Levelt (1992) for a motivation of these rules. On the assumption that spelled-out segments are ordered, and that association proceeds "from left to right", a phonological word's syllabification is created "on the fly" when these rules are followed.
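To make the left-to-right association concrete, the following is a minimal Python sketch of a simplified reading of rule (2); it is not the authors' mechanism. It assumes a toy five-level sonority scale and a broad ASCII transcription (`dimandit` for demandit), both chosen only for illustration. It implements conditions (a) and (b) and leaves out moraic frames, condition (c) and diphthongs.

```python
# Toy sonority scale (an assumption for illustration, not taken from the source).
SONORITY = {
    **dict.fromkeys("aeiou", 5),       # vowels
    **dict.fromkeys("jw", 4),          # glides
    **dict.fromkeys("lr", 3),          # liquids
    **dict.fromkeys("mn", 2),          # nasals
    **dict.fromkeys("pbtdkgfvsz", 1),  # obstruents
}


def is_vowel(segment):
    return SONORITY[segment] == 5


def syllabify(segments):
    """Assign ordered segments to syllables, working from left to right."""
    syllables, current, has_nucleus = [], [], False
    for i, seg in enumerate(segments):
        rest = segments[i + 1:]
        if is_vowel(seg):
            if has_nucleus:  # a second vowel opens a new syllable
                syllables.append(current)
                current = []
            current.append(seg)
            has_nucleus = True
            continue
        # Condition (a): the next element is lower in sonority.
        falls = bool(rest) and SONORITY[rest[0]] < SONORITY[seg]
        # Condition (b): no later syllable whose onset could take the consonant.
        no_later_syllable = not any(is_vowel(s) for s in rest)
        if has_nucleus and (falls or no_later_syllable):
            current.append(seg)  # the consonant closes the current syllable
        else:
            if has_nucleus:  # default: the consonant is the next syllable's onset
                syllables.append(current)
                current, has_nucleus = [], False
            current.append(seg)
    if current:
        syllables.append(current)
    return ["".join(s) for s in syllables]


print("-".join(syllabify("dimand")))    # di-mand
print("-".join(syllabify("dimandit")))  # di-man-dit
```

On this sketch the citation form of demand comes out as di-mand, whereas the phonological word demandit comes out as di-man-dit, with the last syllable straddling the lexical boundary.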

The reader can easily verify that for demandit the syllabification becomes de-man-dit, where the last syllable straddles the lexical boundary.

It should be noticed that this is not an account of the mechanism of segment-to-frame association. It is doubtless possible to adapt Shattuck-Hufnagel's (1979) scan-copier mechanism or Dell's (1988) network model to produce the left-to-right association proposed here. The adaptations will mainly concern (i) the generation of phonological, not lexical word frames, and (ii) the use of more global syllable frames, that is, frames only specified for weight, not for individual segmental slots.

Accessing the syllabary
The final step of phonological encoding (which is sometimes called phonetic encoding) is to compute or access the articulatory gestures that will realize a phonological word's syllables. It is at this point that the notion of a mental syllabary enters the picture. But before turning to that, we should first say a few words about what it is that has to be accessed or computed.

We suggest that it is what Browman and Goldstein (1991) have called gestural scores. Gestural scores are, like choreographic or musical scores, specifications of tasks to be performed. Since there are five subsystems in articulation that can be independently controlled, a gestural score involves five "tiers". They are the glottal and the velar system, plus three tiers in the oral system: tongue body, tongue tip and lips. An example of a gestural task is to close the lips, as in the articulation of apple. The gestural score only specifies that the lips should be closed, but not how it should be done. The speaker can move the jaw, the lower lip, both lips, or all of these articulators to different degrees. But not every solution is equally good. As Saltzman and Kelso (1987) have shown, there are least-effort solutions that take into account which other tasks are to be performed, what the prevailing physical conditions of the articulatory system are (does the speaker have a pipe in his mouth that wipes out jaw movement?), etc. These computations are done by what they called an "articulatory network" - a coordinative motor system that involves feedback from the articulators. Relevant here is that gestural scores are abstract. They specify the tasks to be performed, not the motor patterns to be executed.

The gestural score for a phonological word involves scores for each of its syllables. The issue here is: how does a speaker generate these scores? There may well be a direct route here. A syllable's phonological specifications are, as Browman and Goldstein have convincingly argued, to some extent already specifications of the gestural tasks that should be carried out in order to realize the syllable. One can present a reader with a phonotactically legal non-word that consists of non-existing syllables (such as fliltirp), and the reader will pronounce it all right. Still, there may be another route as well. After all, most syllables that a speaker uses are highly overlearned articulatory gestures.
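The two routes just described can be pictured as a table lookup with a fallback. The Python sketch below is illustrative only: the names `SYLLABARY`, `assemble_online` and `phonetic_encode`, the string placeholders standing in for gestural scores, the stored entries, and the split of fliltirp into `flil` and `tirp` are all hypothetical, and nothing is claimed here about how either route is actually realized.

```python
# Hypothetical store: phonological syllable -> placeholder for a gestural score.
SYLLABARY = {
    "di": "<stored score: di>",
    "man": "<stored score: man>",
    "dit": "<stored score: dit>",
}


def assemble_online(syllable):
    """Direct route: build a score from the syllable's own specification."""
    return "<assembled score: " + "+".join(syllable) + ">"


def phonetic_encode(syllables):
    """Return a (score, route) pair for each phonological syllable, in order."""
    encoded = []
    for syllable in syllables:
        if syllable in SYLLABARY:  # overlearned syllable: retrieve its score
            encoded.append((SYLLABARY[syllable], "retrieved"))
        else:                      # non-existing syllable: compute its score
            encoded.append((assemble_online(syllable), "assembled"))
    return encoded


for score, route in phonetic_encode(["di", "man", "dit", "flil", "tirp"]):
    print(route, score)
```

The phonological syllable serves as the address and the stored score as the retrieved content; only syllables missing from the store take the direct route.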

It has been argued time and again that most (though not all) phenomena of allophonic variation, of coarticulation and of assimilation have the syllable as their domain (see, for instance, Fujimura & Lovins, 1978; Lindblom, 1983). In other words, if you know the syllable and its stress level, you know how to pronounce its segments. Or rather: phonetic segments have no independent existence; they are mere properties of a syllabic gesture, its onset, nucleus and offset. If these syllabic scores are overlearned, it is only natural to suppose that they are accessible as such, that is, that we have a store of syllabic gestures for syllables that are regularly used in speech. This is depicted in Fig. 1 as the syllabary.

According to this theory, the syllabary is a finite set of pairs consisting of, on the one hand, a phonological syllable specification and, on the other hand, a syllabic gestural score. The phonological specification is the input address; the gestural score is the output. As phonological syllables are, one by one, created during the association process, each will activate its gestural score in the syllabary. That score will be the input to the "articulatory network" (see above), which controls motor execution of the gesture. Crompton (1982) made the suggestion that articulatory routines for stressed and unstressed syllables are independently represented in the repository, and this was adopted in Levelt (1989). It should be noticed that the size of the syllabary will be rather drastically different between languages, ranging from a few hundred in Chinese or Japanese to several thousands in English or Dutch.

So far for the theoretical framework. It is obvious that many theoretical issues have not (yet) been raised. It is, in particular, not the intention of the present paper to go into much more detail about the initial processes of phonological encoding, segmental and metrical spellout, phonological word formation and segment-to-frame association. We will, rather, focus on the final step in the theory: accessing the syllabary. It is important to notice this step has a certain theoretical independence. Most theories of phonological encoding are not specific about phonetic encoding, and many of them would be compatible with the notion of a syllabary. Still, the syllabary theory may have interesting consequences for an underspecification approach to phonological encoding. It may provide an independent means of determining what segmental features should minimally be specified in the form lexicon, as will be taken up in the General Discussion.

The following four experiments were inspired by the notion of a syllabary. Their results are compatible with that notion, but alternative explanations are by no means excluded. Still, they provide new evidence about the time course of phonetic encoding that has not been predicted by other theories.

EXPERIMENT 1: WORD AND SYLLABLE FREQUENCY
According to the theory outlined above, there are two steps in phonological encoding where the speaker accesses stored information. The first one is in

retrieving word form information, that is, the lexeme. The second one is in retrieving the syllabic gestural score. The former involves the form part of the mental lexicon, the latter the syllabary. We have modelled these two steps as successive and independent.

It has long been known that word form access is sensitive to word frequency. Oldfield and Wingfield (1965) and Wingfield (1968) first showed that naming latencies for pictures with low-frequency names are substantially longer than latencies for pictures with high-frequency names. The effect is, moreover, not due to the process of recognizing the picture; it is a genuinely lexical one. Jescheniak and Levelt (in press) have further localized the effect in word form access. Accessing a low-frequent homophone (such as wee) turned out to be as fast as accessing non-homophone controls that are matched for frequency to the corresponding high-frequent homophone (in this case, we). Since homophones, by definition, share their word form information, but not their semantic/syntactic properties, the frequency effect must have a form-level locus: the low-frequent homophone inherits the form-accessing advantage of its high-frequent twin.

It is, however, enough for the rationale of the experiment to know that there is a genuinely lexical frequency effect in word retrieval, and to assume that accessing the syllabary is a later and independent step in phonological encoding. Similar to word retrieval, accessing the store of syllables might also involve a frequency effect: accessing a syllable that is frequently used in the language may well be faster than accessing a syllable that is less frequently used.

The experiment was designed to look for an effect on word production latency of the frequency of occurrence of a word's constituent syllables. High- and low-frequency bisyllabic words were tested which comprised either two high-frequency syllables or two low-frequency syllables. Whole-word frequency of occurrence was therefore crossed with syllable frequency, allowing us to test for any interaction. The syllabary theory predicts that the effects should be additive and independent.

Method
In the following experiments the linguistic restrictions on the selection of experimental materials were severe. It is, in particular, impossible to obtain the relevant naming latencies by means of a picture-naming experiment; there are simply not enough depictable target words in the language. We therefore designed another kind of naming task, that would put minimal restrictions on the words we could test. In the preparation phase of the experiment, subjects learned to associate each of a small number of target words to an arbitrary symbol. During the experiments, these symbols were presented on the screen and the subjects produced the corresponding target words; their naming latencies were measured.
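The prediction that the two frequency effects are additive and independent can be stated as a small worked example. The Python sketch below uses made-up values (a 600 ms baseline and two 15 ms costs, chosen purely for illustration and not taken from the data) to show what additivity means for the four cells of the design: the syllable frequency effect is the same size in both word frequency groups, so the interaction contrast is zero.

```python
BASE_MS = 600.0              # hypothetical latency for the fastest cell
WORD_COST_MS = 15.0          # hypothetical cost of a low-frequency word
SYLLABLE_COST_MS = 15.0      # hypothetical cost of low-frequency syllables


def predicted_latency(word_freq, syllable_freq):
    """Additive model: each factor adds its own cost, with no cross term."""
    latency = BASE_MS
    if word_freq == "low":
        latency += WORD_COST_MS
    if syllable_freq == "low":
        latency += SYLLABLE_COST_MS
    return latency


cells = {(w, s): predicted_latency(w, s)
         for w in ("high", "low") for s in ("high", "low")}

# Syllable frequency effect within each word frequency group.
effect_in_hf_words = cells[("high", "low")] - cells[("high", "high")]
effect_in_lf_words = cells[("low", "low")] - cells[("low", "high")]
interaction = effect_in_lf_words - effect_in_hf_words

print(cells)
print("interaction contrast:", interaction)
```

A non-zero contrast would indicate that the size of one effect depends on the level of the other factor, which is what the crossed design allows one to test.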

Notice that we decided against a word-reading task, which would always involve linguistic processing of the input word.

Frequency counts
All frequency counts were obtained from the computer database CELEX (Centre for Lexical Information, Max Planck Institute, The Netherlands), which has a Dutch lexicon based on 42 million word tokens. The word frequency counts we used are two occurrences per million counts from this database: word form frequency, which includes every occurrence of that particular form, and lemma frequency, which includes the frequencies of all word forms with the same stem. Syllable frequencies were counted for phonetic syllables in Dutch. The phonetic script differentiates the reduced vowel schwa from full vowel forms, giving approximately 12,000 individual syllable forms. Syllable frequencies were calculated for the database from the word form occurrences per million count. Two syllable frequency counts were calculated: overall frequency of occurrence and the frequency of occurrence of the syllable in a particular word position (i.e., first or second syllable position). The syllable frequencies range from 0 to approximately 90,000 per million words, with a mean frequency of 121.

In all of the experiments reported the same criteria were used in assigning words to frequency conditions. All low-frequency words had a count of less than 10 for both word form and lemma counts. All high-frequency words had both counts over 10. Low-frequency syllables had counts of less than 300 in both overall and position-dependent counts; high-frequency syllables had both counts over 300. Most low-frequency syllables, therefore, had above-average frequency of occurrence in the language. This is important as our model claims that very low-frequent syllables will be constructed on-line rather than retrieved from store.

We are aware of the fact that we have been counting citation form syllables, not syllables as they occur in connected speech. But if the latter frequency distribution deviates from the one we used, this will most likely work against our hypothesis; our distinct HF and LF syllable classes will tend to be blurred in the "real" distribution.

Vocabulary
The experimental vocabulary comprised four groups of 16 bisyllabic Dutch words. These groups differed in the combination of word frequency and syllable frequency of their constituent words. Average frequencies for each word group are given in Table 1. Each group contained 13 nouns and three adjectives.
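The assignment criteria given under "Frequency counts" amount to two threshold tests. A minimal Python sketch follows; the function names and the `None` result for items that meet neither criterion are assumptions made for illustration, while the cut-offs (10 for the two word counts, 300 for the two syllable counts) are those stated in the text. The example counts are hypothetical.

```python
WORD_CUTOFF = 10        # applies to word form and lemma counts
SYLLABLE_CUTOFF = 300   # applies to overall and position-dependent counts
MEAN_SYLLABLE_FREQUENCY = 121  # mean syllable frequency reported in the text


def frequency_class(count_a, count_b, cutoff):
    """'high' if both counts exceed the cutoff, 'low' if both fall below it."""
    if count_a > cutoff and count_b > cutoff:
        return "high"
    if count_a < cutoff and count_b < cutoff:
        return "low"
    return None  # mixed or borderline counts meet neither criterion


def word_class(word_form_count, lemma_count):
    return frequency_class(word_form_count, lemma_count, WORD_CUTOFF)


def syllable_class(overall_count, positional_count):
    return frequency_class(overall_count, positional_count, SYLLABLE_CUTOFF)


# Hypothetical counts, for illustration only.
print(word_class(25, 40))        # both counts over 10
print(word_class(8, 30))         # counts disagree, so no class is assigned
print(syllable_class(150, 130))  # "low", although above the mean of 121
print(150 > MEAN_SYLLABLE_FREQUENCY)
```

The third call illustrates the point made in the text: a syllable can fall in the low-frequency class and still occur more often than the average syllable.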

Groups were also matched for word onset phonemes and mean number of phonemes. Each group was divided into four matched subgroups which were recombined into four experimental vocabularies of 16 words (four from each condition, see Appendix 1). Within each vocabulary four groups of four words (one from each condition) were selected to be elicited in the same block. These groups contained words which were phonologically and semantically unrelated and each group contained at least one word with second syllable stress.

Table 1. Log syllable and word frequencies and number of phonemes of words in each of the Word x Syllable frequency groups of Experiment 1
[Table body not recoverable. Rows: log word frequency (word form, lemma); log syllable frequency (1st syllable position dependent, 1st syllable total, 2nd syllable position dependent, 2nd syllable total); number of phonemes. Columns: the four combinations of high and low word frequency with high and low syllable frequency.]

Symbols
Four groups of four symbol strings were constructed. Each symbol consisted of a string of six non-alphabetic characters. The four groups of symbols were roughly matched for gross characteristics as follows:

Set 1 )))))) %%%%%% >>>>>>
Set 2 \\\\\\ &&&&&&
Set 3 }}}}}} ######
Set 4 [[[[[[ @@@@@@ A A A A A A

Design
Subjects were assigned to one of the vocabularies. Their task was to learn to produce words in response to symbols. Subjects learned one block of four words at a time. The experiment consisted of 12 blocks of 24 naming trials - three blocks for each four-word set. Within a block subjects produced each word six times. The first production of each word in a block was a practice trial.
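The block structure, together with the ordering constraint the Design section imposes (a random order in which no symbol occurs twice in a row), can be sketched in Python as follows. This is not the authors' software: the function name, the use of rejection sampling, the fixed seed and the placeholder word labels are all assumptions made for illustration.

```python
import random

WORDS_PER_BLOCK = 4
PRODUCTIONS_PER_WORD = 6


def block_order(words, rng):
    """Shuffle until no word (and hence no symbol) occurs twice in a row."""
    trials = [w for w in words for _ in range(PRODUCTIONS_PER_WORD)]
    while True:
        rng.shuffle(trials)
        if all(a != b for a, b in zip(trials, trials[1:])):
            return list(trials)


rng = random.Random(1)  # fixed seed only to make the example repeatable
order = block_order(["word1", "word2", "word3", "word4"], rng)

trials_per_block = WORDS_PER_BLOCK * PRODUCTIONS_PER_WORD  # 4 x 6 = 24
blocks = 3 * 4  # three blocks for each of the four four-word sets
print(len(order), trials_per_block, blocks * trials_per_block)  # 24 24 288
```

Rejection sampling is the simplest way to honour the constraint; most shuffles of 24 trials contain a repetition, so the loop typically runs many times, but each attempt is cheap.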

Order of presentation was random, with the condition that no symbol occurred twice in a row. This condition was included in order to eliminate the potentially large facilitation effect due to immediate repetition and to encourage subjects to clear their minds at the end of each trial. Within a vocabulary the order of presentation of block groups was rotated across subjects. Within a vocabulary each block group was assigned a symbol set. The assignment of symbols to words within sets was also rotated across subjects.

Procedure
Subjects were tested individually. They were given a card on which four words with associated symbols were printed. They were asked to practise the relationship between the symbols and the words until they thought they could accurately produce the words in response to the symbols. The printed order of the words from each frequency group was rotated across block groups. When each subject was confident that they had learned the associations they were shown each symbol once on the computer screen and asked to say the associated word. If they could do this correctly they then received three blocks of 24 trials. This procedure was repeated for all four groups of words.

The events on each trial were as follows. A fixation cross appeared on the screen for 300 ms. The screen then went blank for 500 ms, after which a symbol appeared on the screen and remained there for a further 500 ms. Subjects then had 2 s in which to respond, followed by a 3 s interval before the onset of the next trial. Both naming latencies and durations were recorded for each trial.

Subjects
Thirty-two subjects were tested, 24 women and 8 men. All were native speakers of Dutch. They were voluntary members of the Max-Planck subjects pool, between the ages of 18 and 34. They were paid for their participation.

Results

Exclusion of data
Data from two subjects were replaced due to high error rates. The first production of a word in each block was counted a practice trial and excluded from the analysis. Correct naming latencies following error trials were also excluded from the latency analysis as errors can often perturb subjects' responses on the

following trial. 3.3% of the data points were lost due to these criteria. Data points greater than two standard deviations from the mean were counted as outliers and were also excluded. This resulted in the loss of only 1.6% of the data points. Missing values in all experiments reported were substituted by a weighted mean based on subject and item statistics calculated following Winer (1971, p. 488).

Naming latency
Collapsed across syllable frequency, high and low word frequency latencies were 592.0 ms and 607.2 ms respectively. The main effect of word frequency (15.2 ms) was significant, F1(1, 28) = …, F2(1, 48) = …. Collapsed across word frequency, high and low syllable frequency latencies were 592.3 ms and 606.8 ms respectively. The main effect of syllable frequency (14.5 ms) was also significant, F1(1, 28) = …, F2(1, 48) = …. The size of the syllable frequency effect is similar in both word frequency groups and vice versa: the interaction of word and syllable frequency was insignificant, F1 and F2 < 1. Mean naming latencies for words in each of the frequency groups are shown in Fig. 2.

Figure 2. Naming latencies in Experiment 1. Syllable versus word frequency. [Axes: word onset latency in ms against syllable frequency (low, high), with separate lines for low-frequency and high-frequency words.]

There was a significant effect of vocabulary in the materials analysis, F1(3, 28) = …, F2(3, 48) = …, but no interactions of this variable with either syllable or word frequency. Effects of practice were evident in the significant decrease in naming latencies across the three blocks of a word group, F1(2, 56) ≈ 203, p < .001, F2(2, 96) =


W. Levelt, L. Wheeldon

318.8, p<.001, and across the five repetitions of a word within a block, Fx(4,112) = 25.7, p < .001, F2(4,192) = 23.8, p < .001. The effect of block did not interact with either word or syllable frequency effects (all Fs< 1). The effect of trial, however, showed an interaction with syllable frequency that approached significance by subjects, Ft(4,112) = 2.3, p < .06, F2(4,192) = 1.5. However, this interaction was due to variation in the size of the priming effect over trials but not in the direction of the effect and does not qualify the main result.2
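The exclusion rules used in the latency analyses (dropping correct trials that follow an error, then dropping latencies more than two standard deviations from the mean) can be made concrete with a small sketch. The record layout below is hypothetical, and the text does not say whether the two-standard-deviation criterion was applied per subject, per condition or over all data, so the sketch simply applies it to whatever list it is given.

```python
from statistics import mean, stdev

def exclude_trials(trials, n_sd=2.0):
    """Return latencies that survive the two exclusion rules.

    `trials` is a list of dicts with keys 'rt' (ms) and 'correct' (bool),
    in presentation order. This layout is an assumption for illustration.
    """
    kept = []
    previous_correct = True
    for trial in trials:
        # Drop errors, and also correct trials that directly follow an error.
        if trial["correct"] and previous_correct:
            kept.append(trial["rt"])
        previous_correct = trial["correct"]
    if len(kept) < 2:
        return kept
    centre, spread = mean(kept), stdev(kept)
    # Drop latencies more than n_sd standard deviations from the mean.
    return [rt for rt in kept if abs(rt - centre) <= n_sd * spread]
```

The paper then substitutes missing values by a weighted mean (Winer, 1971); that step is not reproduced here.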

Percentage error rate
High and low word frequency error rates were 2.6% and 3.0% respectively. High and low syllable frequency error rates were 2.7% and 2.9% respectively. A similar analysis carried out on percentage error rate (arc sine transformed) yielded no significant effects.
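As a brief illustration of the transformation mentioned for the error analysis: the text does not state which variant of the arc sine transformation was used, so the sketch below assumes the common form asin(sqrt(p)) applied to the error proportion p.

```python
import math

def arcsine_transform(percent_error):
    """Map a percentage error rate (0-100) to asin(sqrt(p)) in radians."""
    proportion = percent_error / 100.0
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("percentage must lie between 0 and 100")
    return math.asin(math.sqrt(proportion))

# Word frequency error rates reported for Experiment 1 (percent).
for label, rate in [("high word frequency", 2.6), ("low word frequency", 3.0)]:
    print(label, round(arcsine_transform(rate), 4))
```

The transformation stretches differences between small proportions, which is why it is often applied before an analysis of variance on error rates.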

Naming duration
A similar analysis was carried out on naming durations. High and low word frequency durations were 351.4 ms and 344.7 ms respectively. The 6.7 ms difference was significant over subjects, F1(1, 28) = 8.8, p < .01, F2 < 1. High and low syllable frequency durations were 326.8 ms and 369.3 ms respectively. The 42.5 ms difference was significant, F1(1, 28) = 253.7, p < .001, F2(1, 48) = 15.6, p < .001. Word and syllable frequency did not interact, F1 and F2 < 1.
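Effects in these experiments are tested both by subjects (F1) and by words (F2). The sketch below shows only the aggregation step that distinguishes the two analyses: trial latencies are first collapsed to one mean per subject and condition, or per word and condition. The field names and data layout are hypothetical, and no F ratio is computed.

```python
from collections import defaultdict
from statistics import mean

def cell_means(trials, unit):
    """Average latency per (unit, condition); unit is 'subject' or 'word'."""
    cells = defaultdict(list)
    for t in trials:
        cells[(t[unit], t["condition"])].append(t["rt"])
    return {key: mean(values) for key, values in cells.items()}

def effect(trials, unit):
    """Low-minus-high difference of the per-unit condition means."""
    by_condition = defaultdict(list)
    for (_, condition), value in cell_means(trials, unit).items():
        by_condition[condition].append(value)
    return mean(by_condition["low"]) - mean(by_condition["high"])
```

Calling `effect(trials, "subject")` and `effect(trials, "word")` on the same trials gives the two effect estimates on which a by-subjects and a by-words test would be based.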

Regression analyses
Regression analyses were carried out on the means data of the experimental words. In all regressions mean naming latency is the dependent variable. Simple regressions with both log word form frequency and log lemma frequency failed to reach significance (R = 0.2, p > .05). Of the syllable frequency counts only second syllable frequency counts yielded significant correlations: total log frequency (R = 0.3, p < .01) and position-dependent log frequency (R = 0.4, p < .001). Similarly number of phonemes in the second syllables and log second syllable CV structure frequency showed significant correlations with naming latency (both R = 0.3, p < .05). A multiple regression of naming latency with these three second syllable variables showed only a significant unique effect of log syllable frequency (p < .05). This pattern of results remained when only words with initial syllable stress were included in the regressions (n = 32).

Footnote 2. Main effects of block and trial were observed in the analyses of all the dependent variables reported. These practice effects were always due to a decrease in naming latencies, durations and error rates as the experiment progressed. In no other analysis did they significantly interact with frequency effects and they will not be reported.
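For readers who want to see what a simple regression of this kind involves, the following is a minimal least-squares sketch. The data values are invented for illustration and are not the experimental means.

```python
import math

def simple_regression(x, y):
    """Least-squares slope, intercept and correlation R for paired data."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Invented values: log second-syllable frequency and mean latency (ms).
log_freq = [4.0, 4.7, 7.8, 8.7]
latency = [612.0, 608.0, 596.0, 590.0]
slope, intercept, r = simple_regression(log_freq, latency)
print(f"slope={slope:.2f} ms per log unit, R={r:.2f}")
```

A negative slope here corresponds to faster naming for words with more frequent second syllables.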

Discussion
Apart from the expected word frequency effect, the experiment showed that there is a syllable frequency effect as well, amounting to about 15 ms. Bisyllabic words consisting of low-frequency syllables were consistently slower in naming than those consisting of high-frequency syllables. Moreover, this syllable frequency effect was independent of the word frequency effect, as predicted by the syllabary theory. The post hoc regression analyses suggest that second syllable frequency is a better predictor of naming latency than the frequency of the first syllable. Experiments 2 and 3 will explore this possibility in more detail. Not surprisingly, syllable complexity affected word durations, but there was also some evidence that complexity of the second syllable has an effect on naming latency. This issue will be taken up in Experiment 4.

EXPERIMENT 2: FIRST AND SECOND SYLLABLE FREQUENCY
There are theoretical reasons to expect that in bisyllabic word naming the frequency of the second syllable will affect naming latency more than the frequency of the first syllable. It is known that in picture naming bisyllabic target words are produced with longer naming latencies than monosyllabic target words. In a study by Klapp, Anderson, and Berrian (1973) the difference amounted to 14 ms. The effect cannot be due to response initiation, as the difference disappears in a delayed production task where subjects can prepare their response in advance of the "Go" signal to produce it. It must therefore have its origin in phonological encoding. Levelt (1989, p. 417) suggests that if in phonetic encoding syllable programs are addressed one by one, the encoding duration of a phonological word will be a function of its syllabicity. But the crucial point here is that, apparently, the speaker cannot or will not begin to articulate the word before its phonetic encoding is complete. If articulation was initiated following the phonetic encoding of the word's first syllable, no number-of-syllables effect should be found. Wheeldon and Lahiri (in preparation) provide further evidence that during the production of whole sentences articulation begins only when the first phonological word has been encoded. Making the same assumption for the present case - that is, that initiation of articulation will wait till both syllables have been accessed in the syllabary - it is natural to expect a relatively strong second syllable effect. The association process (see Fig. 1) creates phonological syllables successively. Each new syllable triggers access to the syllabary and retrieval of the corresponding phonetic syllable. Although retrieving the first syllable will be relatively slow for a low-frequency syllable, that will not become apparent in the naming latency; the response can only be initiated after the second syllable is retrieved. Retrieving the second syllable is independent of retrieving the first one. It is initiated as soon as the second syllable appears as a phonological code, whether or not the first syllable's gestural score has been retrieved. And articulation is initiated as soon as the second syllable's gestural code is available. First syllable frequency will only have an effect when retrieving that syllable gets completed only after retrieving the second syllable. This, however, is a most unlikely state of affairs. Syllables are spoken at a rate of about one every 200 ms. Wheeldon and Levelt (1994) have shown that phonological syllables are generated at about twice that rate, one every 100 ms. Our syllable frequency effect, however, is of the order of only 15 ms. Hence it is implausible that phonetic encoding of the second syllable can "overtake" encoding of the first one due to advantageous frequency conditions.

In this experiment we independently varied the frequency of the first and the second syllable in bisyllabic words. In one sub-experiment we did this for high-frequency words and in another one for low-frequency words.
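The timing argument above can be sketched numerically. The 100 ms gap between phonological syllables and the 15 ms frequency effect come from the text; the 50 ms base retrieval time is a hypothetical value chosen only to make the arithmetic visible, and the function is an illustration of the reasoning rather than the authors' model.

```python
def articulation_onset(retrieval_1, retrieval_2, syllable_gap=100.0):
    """Time (ms) at which both syllables' gestural scores are available.

    Retrieval of syllable 1 starts at 0; retrieval of syllable 2 starts
    `syllable_gap` ms later, when its phonological code appears.
    """
    finish_1 = retrieval_1
    finish_2 = syllable_gap + retrieval_2
    return max(finish_1, finish_2)

BASE = 50.0       # hypothetical retrieval time for a high-frequency syllable
PENALTY = 15.0    # frequency effect of the order reported in the text

for first in ("high", "low"):
    for second in ("high", "low"):
        r1 = BASE + (PENALTY if first == "low" else 0.0)
        r2 = BASE + (PENALTY if second == "low" else 0.0)
        print(first, second, articulation_onset(r1, r2))
```

Under these assumptions the onset time changes only with the second syllable's retrieval time. First syllable frequency would matter only if its retrieval took more than `syllable_gap` ms longer than that of the second syllable.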

Method
The vocabulary consisted of 96 bisyllabic Dutch nouns: 48 high word frequency, 48 low word frequency. Within each word frequency group there were four syllable frequency conditions (12 words each) constructed by crossing first syllable frequency with second syllable frequency (i.e., high-high, high-low, low-high and low-low). The criteria for assigning words to frequency groups were the same as in Experiment 1. Mean log frequencies and number of phonemes for the high- and low-frequency words in each syllable condition are given in Table 2. Two high word frequency vocabularies and two low word frequency vocabularies were constructed, each with six words from each syllable frequency condition. Each vocabulary was then divided into six four-word groups with one word from each condition. As in Experiment 1, these groups contained words which were phonologically and semantically unrelated. Each group was assigned a symbol set with four rotations and each of 48 subjects was assigned to one vocabulary and one symbol set. Each subject received 18 blocks of 24 trials: three blocks for each word group.

In this experiment word frequency was a between-subjects variable. This was necessary because of the extra syllable frequency conditions and the limited number of words a subject could accurately memorize and produce within an hour. Moreover, our major interest was in the pattern of results over the syllable frequency conditions for both high- and low-frequency words, rather than in the word frequency effect itself. In order to be able to compare baseline naming speed of subjects who received the high and low word frequency vocabularies, each subject received a calibration block of the same four words at the end of the experiment. The rest of the procedure was exactly the same as in Experiment 1. Forty-eight subjects were run; 24 received a high word frequency vocabulary (20 women and 4 men) and 24 received a low word frequency vocabulary (18 women and 6 men).

Table 2. Log syllable and word frequencies and mean number of phonemes for high- and low-frequency words in each of the First x Second syllable frequency groups of Experiment 2

Syllable freq.     No. phonemes      Syllable 1      Syllable 2      Word
1st x 2nd          Syl. 1  Syl. 2    POS    TOT      POS    TOT      WRD    LEM
High word frequency
High x High        2.8     2.7       7.3    7.8      7.8    8.7      3.8    4.0
High x Low         2.8     3.2       7.6    7.9      5.1    5.3      3.6    4.0
Low x High         3.1     2.8       4.9    5.2      8.3    8.9      3.8    4.0
Low x Low          3.0     3.6       4.9    5.3      4.8    5.2      3.7    4.0
Low word frequency
High x High        2.8     2.6       7.0    7.5      8.1    8.7      1.5    1.9
High x Low         2.7     3.2       7.5    8.1      4.0    4.5      1.2    1.5
Low x High         3.1     2.6       4.1    4.7      8.3    8.9      1.4    1.7
Low x Low          2.9     3.3       4.3    4.7      4.1    4.5      1.0    1.5

POS = position-dependent log frequency; TOT = total log frequency; WRD = word form log frequency; LEM = lemma log frequency.

Results

Exclusion of data
Data from four subjects were replaced due to high error rates. Data points were excluded and substituted according to the same principles as in Experiment 1. The first production of a word in each block was again counted as a practice trial and excluded from the analysis. 2.8% of data points were correct naming latencies following error trials. 1.8% of the data points were greater than two standard deviations from the mean.

Naming latency
Mean naming latency for the high word frequency group was 641.6 ms, which was 5.7 ms slower than the low word frequency group, 635.9 ms (see Table 3). This reverse effect of word frequency was insignificant, F1 and F2 < 1, and can be attributed to the random assignment of slower subjects to the high-frequency vocabularies. Mean naming latencies for the calibration block were: high word frequency, 659.3 ms; low word frequency, 624.5 ms. Subjects who received the high word frequency vocabularies were, therefore, on average 34.8 ms slower than the subjects who received the low word frequency vocabularies. This difference was significant by words only, F1(1, 46) = 1.9, F2(1, 3) = 52.1, p < .01.

Mean naming latencies and error rates for the syllable frequency conditions are shown in Table 3; the latency data are summarized in Fig. 3. The -2.7 ms effect of first syllable frequency was, unsurprisingly, insignificant, F1(1, 44) = 1.1, F2 < 1. The 11.5 ms effect of second syllable frequency was significant by subjects, F1(1, 44) = 18.6, p < .001, and again marginally significant by words, F2(1, 80) = 3.8, p = .053. The interaction of first and second syllable frequency was not significant, F1 and F2 < 1. However, there was a significant three-way word frequency by first and second syllable frequency interaction, but only in the subject analysis, F1(1, 44) = 6.1, p < .05, F2(1, 80) = 1.3. This was due to a by-subjects only interaction of first and second syllable frequency in the low-frequency word set, F1(1, 22) = 5.6, p < .05, F2(1, 40) = 1.2; words with high-frequency first syllables showed a smaller effect of second syllable frequency than words with low-frequency first syllables (5 ms and 21 ms respectively). Words with high-frequency second syllables showed a reverse effect of first syllable frequency (-14 ms) compared to a 2 ms effect for words with low-frequency second syllables. There was no main effect of vocabulary, F1 and F2 < 1. However, there was a significant interaction of second syllable frequency with vocabulary in the by-subject analysis, F1(1, 44) = 6.8, p < .05, F2(1, 80) = 1.4, due to differences in the size of the effect in the two vocabularies in both the high- and low-frequency word sets.

Table 3. Mean naming latency and percentage error (in parentheses) for words in the four syllable frequency conditions of Experiment 2. Means are shown for all words and for high- and low-frequency words separately. The effect of syllable frequency (low minus high) is also shown.

                          Syllable frequency
                          Low            High           Effect (low - high)
All words
  1st syllable            637.4 (1.9)    640.1 (2.1)    -2.7 (-0.2)
  2nd syllable            644.5 (2.3)    633.0 (1.8)    11.5 (0.5)
High-frequency words
  1st syllable            641.8 (2.0)    641.1 (2.4)     0.7 (-0.4)
  2nd syllable            646.5 (2.5)    636.5 (2.0)    10.0 (0.5)
Low-frequency words
  1st syllable            632.8 (1.8)    638.9 (1.8)    -6.1 (0.0)
  2nd syllable            642.3 (2.1)    629.3 (1.5)    13.0 (0.6)

[Figure 3. Naming latencies in Experiment 2. Syllable position (word-initial, word-final) versus syllable frequency. Axes: word onset latency in ms against syllable frequency (low, high), with a separate line for each syllable position.]

Naming duration
Naming durations for high- and low-frequency words were 346.8 ms and 316.6 ms respectively. The 30.2 ms effect was significant by words, F1(1, 44) = 3.5, p > .05, F2(1, 80) = 20.1, p < .001. There were also significant effects of first syllable frequency (high 329.1 ms, low 334.3 ms, F1(1, 44) = 12.7, p < .01, F2 < 1) and second syllable frequency (high 321.1 ms, low 342.3 ms, F1(1, 44) = 167.0, p < .001, F2(1, 80) = 9.8, p < .01). The interaction of first and second syllable frequency was only significant by subjects, F1(1, 44) = 12.0, p < .001, F2 < 1; the effect of frequency on second syllable durations was restricted to words with high first syllable frequencies.

Percentage error rate
Error rates are also shown in Table 3. They yielded only a significant effect of second syllable frequency over subjects, F1(1, 44) = 6.0, p < .05, F2(1, 80) = 2.4.
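The effect column of Table 3 is the low-frequency mean minus the high-frequency mean. As a small check on that arithmetic, the sketch below recomputes the latency effects for the high- and low-frequency word sets from the cell means given in the table; the dictionary layout is just a convenient encoding chosen for this example.

```python
# Mean naming latencies (ms) from Table 3, keyed by word set and syllable position.
TABLE_3 = {
    ("high-frequency words", "1st syllable"): {"low": 641.8, "high": 641.1},
    ("high-frequency words", "2nd syllable"): {"low": 646.5, "high": 636.5},
    ("low-frequency words", "1st syllable"): {"low": 632.8, "high": 638.9},
    ("low-frequency words", "2nd syllable"): {"low": 642.3, "high": 629.3},
}

def frequency_effect(cell):
    """Syllable frequency effect in ms: low-frequency minus high-frequency mean."""
    return round(cell["low"] - cell["high"], 1)

for (word_set, position), cell in TABLE_3.items():
    print(f"{word_set}, {position}: {frequency_effect(cell):+.1f} ms")
```

The second syllable effects are positive in both word sets, while the first syllable effects are close to zero or reversed.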

Discussion

Although not all vocabularies in this experiment yielded significant syllable frequency effects, the main findings were consistent with our expectations. Whatever there is in terms of syllable frequency effects was due to the second syllable only. The frequency of the first syllable had no effect on naming latencies. Although the average size of the frequency effect (12 ms) was of the order of magnitude obtained in Experiment 1 (15 ms), the complexity of the experiment apparently attenuated its statistical saliency. An interaction of first and second syllable frequency effects is not predicted by our model of syllable retrieval. This experiment did yield some indication of such an interaction. However, it was observed in one vocabulary only and never approached significance over items. While further investigation is necessary to rule out such an effect, we do not feel it necessary to amend our model on the basis of this result. The next experiment was designed to isolate the effect of second syllable frequency.


EXPERIMENT 3: SECOND SYLLABLE FREQUENCY

Method

Vocabulary
The experimental vocabulary consisted of 24 pairs of bisyllabic Dutch words. Members of a pair had identical first syllables but differed in their second syllable: one word had a high-frequency second syllable and one word had a low-frequency second syllable (e.g., ha-mer/ha-vik). High and low second syllable frequency words were matched for word frequency. No attempt was made to match second syllables for number of phonemes (see Table 4). Two matched vocabularies of 12 word pairs were constructed.

Design
Twelve pairs of abstract symbols of the form used in Experiment 1 were constructed. Each pair consisted of one simple symbol and one more complex symbol. The symbol pairs were assigned to one word pair in each vocabulary. Two sets for each vocabulary were constructed such that each word in a word pair was assigned to each symbol in its associated pair once.



Table 4. Log syllable and word frequencies for high- and low-frequency second syllable words in Experiment 3

                                       2nd syllable frequency
                                       High     Low
Log frequency
  Word form                            1.9      2.0
  Lemma                                2.1      2.2
  1st syllable, position dependent     6.8      6.8
  1st syllable, total                  7.2      7.2
  2nd syllable, position dependent     7.8      4.0
  2nd syllable, total                  8.7      4.7
Number of phonemes                     2.8      3.3

Within a vocabulary, words were grouped into six blocks of four words. Only one member of a word pair occurred within a block. None of the words within a block had the same initial phoneme and they were semantically unrelated. The associated symbol groups in each set were the same in each vocabulary. Each subject received 18 blocks of 24 trials: three blocks for each word group.

Procedure and subjects
Each subject was assigned randomly to a vocabulary and a word set. Presentation of the blocks within a set was rotated. The procedure was the same as in Experiments 1 and 2. Twenty-four subjects were tested: 18 women and 6 men.

Results

Exclusion of data
2.2% of the data were trials following an error and 1.8% of the data were greater than 2 standard deviations from the mean. These data were again excluded from the analyses.

Naming latencies
Mean naming latency for words with high-frequency second syllables was 622.7 ms, and for low-frequency second syllable 634.5 ms. The 11.8 ms effect of syllable frequency was significant, F1(1, 22) = 12.6, p < .01, F2(1, 44) = 4.7, p < .05. There was a main effect of vocabulary by words, F1 < 1, F2(1, 44) = 18.0, p < .001, due to slower reaction times to vocabulary A (640.1 ms) compared to vocabulary B (617.1 ms). There was also a significant interaction between syllable frequency and vocabulary by subjects only, F1(1, 22) = 4.5, p < .05, F2(1, 44) = 1.7, due to a larger frequency effect in vocabulary A (high 630.7 ms, low 649.5 ms) than in vocabulary B (high 614.8 ms, low 619.5 ms).

Naming durations
Mean naming duration for words with high-frequency second syllables was 351.5 ms, and for low-frequency second syllable 370.0 ms. The 18.5 ms difference was significant, F1(1, 22) = 106.0, p < .001, F2(1, 44) = 4.5, p < .05. The effect of vocabulary was significant by words, F1(1, 22) = 2.8, F2(1, 44) = 26.0, p < .001 (vocabulary A, 338.4 ms, vocabulary B 383.0 ms), but there was no interaction of vocabulary with syllable frequency, F1(1, 22) = 3.1, F2 < 1.

Percentage error rate
Mean percentage error rates were, for high-frequency second syllable 1.2%, and for low-frequency second syllable 1.6%. The only significant effect was of vocabulary (vocabulary A 1.8%, vocabulary B 1.0%), F1(1, 22) = 5.2, p < .05, F2(1, 44) = 5.1, p < .05.

Discussion
The present experiment reproduced the 12 ms second syllable effect obtained in Experiment 2, but now with satisfying statistical reliability. Together with the previous experiments, it supports the notion that the bulk, if not the whole, of the syllable frequency effect is due to the word-final syllable. Let us now turn to the other issue raised in the discussion of Experiment 1. Could it be the case that what we are measuring is not so much an effect of syllable frequency, but rather one of syllable complexity? In all of the previous experiments the second syllable frequency effect on naming latencies is accompanied by a similar effect on naming durations; that is, words with low-frequency second syllables have significantly longer naming durations than words with high-frequency second syllables. Moreover, the regression analyses of Experiment 1 showed that a syllable's frequency of occurrence correlates with the number of phonemes it contains. It is possible, therefore, that syllable complexity (defined in terms of number of phonemes to be encoded or in terms of articulation time) underlies the effects we have observed.

EXPERIMENT 4: SYLLABLE COMPLEXITY
The complexity issue is a rather crucial one. In the theoretical section of this paper we compared a direct route in phonetic encoding and a route via stored syllable programs. Of these, the former but not the latter would predict an effect of syllable complexity. The more complex a syllable's phonological structure, the more computation would be involved in generating its gestural score afresh from its phonological specifications. But no such thing is expected on the syllabary account. The syllabic gesture need not be composed; it is only retrieved. There is no reason to suppose that retrieving a more complex gestural score takes more time than retrieving a simpler one. There will, at most, be a mediated relation to complexity. There is a general tendency for more complex syllables to be less frequent in usage than simpler syllables. If indeed frequency is a determinant of accessing speed, then - even on the syllabary account - simple syllables will be faster than complex syllables. The present experiment was designed to test second syllable complexity as a potential determinant of phonetic encoding latency, but we controlled for syllable frequency in order to avoid the aforementioned confounding. We also controlled for word frequency.

Method

Vocabulary
The vocabulary consisted of 20 pairs of bisyllabic nouns. Each pair of words had the same initial syllable but differed in the number of phonemes in their second syllable (e.g., ge-mis [CVC], ge-schreeuw [CCCVVC]). Word pairs were also matched for word and syllable frequency (see Table 5). The 20 pairs were divided into two vocabularies of 10 pairs matched on all the above variables.

Design
As in Experiment 3, pairs of abstract symbols were constructed and assigned to one word pair in each vocabulary. Two sets for each vocabulary were again constructed such that each word in a word pair was assigned to each symbol in its associated pair once. Each vocabulary consisted of five blocks of four words. The rest of the design was the same as in Experiment 3, except that each subject received 15 blocks of 24 trials: three blocks for each word group.

Procedure and subjects
Each subject was again assigned randomly to a vocabulary and a word set. Presentation of the blocks within a set was rotated. The procedure was the same as in Experiments 1 and 2. Twenty subjects were tested: 13 women and 7 men.

[Table 5. Log syllable and word frequencies and mean number of phonemes for short and long words in Experiment 4. Columns: short and long second syllable. Rows: log frequency of word form, lemma, 1st syllable (position dependent and total) and 2nd syllable (position dependent and total), and number of phonemes (short 3, long 5).]

Results

Exclusion of data
Two subjects were replaced due to high error rates. Exclusion of data resulted in the loss of 5.6% of the data: 4.1% were trials following an error and 1.5% were outliers.

Analyses
Naming latencies were approximately 681 ms for simple words and 678 ms for complex words, with error rates of roughly 3% to 4%. The effect of complexity on naming latency was insignificant, F1 and F2 < 1, as was the effect on error rates (both Fs below 2). Mean word duration for the simple words was 270.0 ms, compared to 313.0 ms for the complex words; this difference was significant, F1(1, 18) ≈ 99, p < .0001, F2(1, 36) ≈ 15, p < .001.

Discussion
When syllable frequency is controlled for, the complexity (number of phonemes) of a word's second syllable does not affect its naming latency. This shows that complexity cannot be an explanation for the syllable frequency effect obtained in the previous three experiments. In addition, the lack of a complexity effect shows that either the direct route in phonetic encoding (see above) is not a (co-)determinant of naming latencies in these experiments, or that the computational duration of gestural scores is, in some way, not complexity dependent.

GENERAL DISCUSSION
The main findings of the four experiments reported are these: (i) syllable frequency affects naming latency in bisyllabic words; (ii) the effect is independent of word frequency; (iii) the effect is due to the frequency of the word's ultimate syllable; (iv) second syllable complexity does not affect naming latency, and hence cannot be the cause of the frequency effect. What are the theoretical consequences of these findings? We will first consider this issue with respect to the theoretical framework of phonological encoding sketched above. We will then turn to alternative accounts that may be worth exploring.

The syllabary theory reconsidered
It needs no further discussion that the experimental findings are in seamless agreement with the syllabary theory as developed above. In fact, no other theory of phonological encoding ever predicted the non-trivial finding that word and syllable frequency have additive effects on naming latency. The theory, moreover, provides natural accounts of the dominant role of the word-final syllable and of the absence of a syllable complexity effect. These explanations hinge on the theoretical assumption that syllabification is a late process in phonological encoding (in particular that there is no syllabification in the word form lexicon) and that gestural scores for syllables are retrieved as whole entities.

It is, however, not the case that the findings are also directly supportive for other aspects of the theory, such as the details of segmental and metrical spellout, the metrical character of phonological word formation and the particulars of segment-to-frame association (except for the assumption that this proceeds on a syllable-by-syllable basis). These aspects require their own independent justification (for some of which see Levelt, 1989, 1993). But there is one issue in phonological encoding that may appear in a new light, given this framework and the present results. It is the issue of underspecification. As pointed out above, Stemberger (1983) was amongst the first to argue for underspecification in a theory of phonological encoding. It could provide a natural account for speech errors such as in your really gruffy-scruffy clothes. Here the voicelessness of /k/ in scruffy is redundant. The lexicon might specify no more than the "archiphoneme" /K/, which can have both [k] and [g] as phonetic realizations; that is, the segment is unspecified on the voicing dimension. In the context of /s-r/, the realization has to be voiceless. But when, in a slip, the /s/ gets chopped off, the context disappears, and /K/ may become realized as [g].

The notion of underspecification was independently developed in phonological theory. Archangeli (1988) in particular proposed a theory of "radical underspecification", which claims that only unpredictable features are specified in the lexicon. But a major problem for any underspecification theory is how a full specification gets computed from the underspecified base. The solutions need not be the same for a structural phonological theory and for a process theory of phonological encoding. Here we are only concerned with the latter, but the proposed solution may still be of some relevance to phonological theory.

The syllabary theory may handle the completion problem in the following way. There is no need to complete the specifications of successive segments in a word if one condition is met. It is that each phonological syllable arising in the process of segment-to-frame association (see Fig. 1) corresponds to one and only one gestural score in the syllabary. Or in other words, even if a syllable's segments are underspecified, their combination can still be unique. This condition puts empirical constraints on the degree and character of underspecification. Given a theory of underspecification, one can determine whether uniqueness is preserved, that is, whether each phonological syllable that can arise in phonological encoding corresponds to only one phonetic syllable in the syllabary. In other words, the domain of radical redundancy should be the syllable, not any other linguistic unit (such as the lexical word). Moreover, the domain should not be potential syllables, but syllables that occur with sufficient frequency in the speaker's language use as to have become "overlearned". Different cut-off frequency criteria should be considered here. Another variant would be to limit the domain to core syllables, ignoring syllable suffixes (see below).
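The uniqueness condition stated here (each underspecified phonological syllable should correspond to exactly one entry in the syllabary) lends itself to a mechanical check. The sketch below is an illustration only: the toy syllabary, the ASCII transcription and the two underspecification rules are all hypothetical and are not taken from the theory itself.

```python
from collections import defaultdict

def collisions(syllabary, underspecify):
    """Group syllabary entries by underspecified form; return ambiguous groups."""
    groups = defaultdict(list)
    for syllable in syllabary:
        groups[underspecify(syllable)].append(syllable)
    return {form: entries for form, entries in groups.items() if len(entries) > 1}

def voicing_after_s(syllable):
    """Leave voicing unspecified (archiphoneme K) only for a velar stop after s."""
    out = []
    for i, seg in enumerate(syllable):
        after_s = i > 0 and syllable[i - 1] == "s"
        out.append("K" if seg in ("k", "g") and after_s else seg)
    return tuple(out)

def voicing_everywhere(syllable):
    """Leave voicing unspecified for every velar stop, regardless of context."""
    return tuple("K" if seg in ("k", "g") else seg for seg in syllable)

# Toy syllabary in a made-up ASCII transcription.
SYLLABARY = [("s", "k", "r", "u"), ("k", "r", "u"), ("g", "r", "u")]
print(collisions(SYLLABARY, voicing_after_s))
print(collisions(SYLLABARY, voicing_everywhere))
```

With the context-sensitive rule every underspecified form still picks out a single entry, whereas the context-free rule makes two entries indistinguishable. That is the kind of empirical constraint on underspecification the condition is meant to express.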

The syllabary theory is, of course, not complete without a precise characterization of how the syllabary is accessed, given a phonological syllable. What we have said so far (following Crompton, 1982, and Levelt, 1989) is that a syllable gesture is selected and retrieved as soon as its phonological specification is complete. In a network model (such as in Roelofs, 1992, or Levelt, 1992, but also mutatis mutandis in Dell's, 1988, model), this would require the addition of a bottom layer of phonetic syllable nodes. A syllable node's frequency-dependent accessibility can then be modelled as its resting activation. A strict regime has to be built in, in order to select phonetic syllables in their correct order, that is, strictly following a phonological word's segment-to-frame association. Although a word's second syllable node may become activated before the first syllable has been selected, selection of syllable one must precede selection of syllable two (and so on for subsequent syllables). Unlike phonological encoding, which involves the slightly error-prone process of assigning activated phonemes to particular positions in a phonological word frame, there are no frames to be filled in phonetic encoding. It merely involves the concatenation of successively retrieved syllabic gestures. This difference accounts for the fact that exchanges of whole syllables are almost never observed. Modelling work along these lines is in progress.

The successive selection of articulatory gestures does not exclude a certain overlap in their motor execution. Whatever there is in between-syllable coarticulation may be due to such overlap. The articulatory network probably computes an articulatory gesture that is a weighted average of the two target gestures in the range of overlap.

Alternative accounts
Let us now turn to possible alternative accounts of our data. They can best be cast as ranging over a dimension of "mixed models", which includes our own. The one extreme here is that all phonological encoding involves access to a syllabary. The other extreme is that a phonological word's and its syllables' gestural scores are always fully computed. Our own syllabary theory, as proposed above, is a mixed model in that we assume the computability of all syllables - new, low or high frequency. But there is always a race between full computation and access to stored syllable scores, where the latter process will normally win the race except for very low-frequency or new syllables. Hence, our theory predicts that there should be a syllable complexity effect for words that end on new or very low-frequency syllables.

But the balance between computation and retrieval may be a different one. More computation will be involved when one assumes that only core syllables are stored, whereas syllable suffixes are always computed. What is a core syllable? One definition is that it is a syllable that obeys the sonority sequencing principle.
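To make the core-syllable definition concrete, the sonority sequencing principle (sonority rises towards the nucleus and falls after it) can be written as a simple check. The five-class sonority scale and the toy one-letter-per-segment spellings below are assumptions made for this sketch; a real analysis would work on phonemic transcriptions and a motivated scale.

```python
# Coarse sonority classes (an assumption for this sketch):
# obstruents < nasals < liquids < glides < vowels.
SONORITY = {}
for rank, segments in enumerate(["ptkbdgfvsz", "mn", "lr", "jw", "aeiou"], start=1):
    for segment in segments:
        SONORITY[segment] = rank

def is_core_syllable(segments):
    """True if sonority rises strictly to a single peak and falls strictly after it."""
    ranks = [SONORITY[s] for s in segments]
    peak = ranks.index(max(ranks))
    rising = all(a < b for a, b in zip(ranks[:peak], ranks[1:peak + 1]))
    falling = all(a > b for a, b in zip(ranks[peak:], ranks[peak + 1:]))
    return rising and falling

for word in ["plant", "lpatn", "lens", "kats", "task", "apt"]:
    print(word, is_core_syllable(word))
```

Because obstruents form a single class here, a coda such as ts, sk or pt is a sonority plateau and fails the strict check, while plant and lens pass. A scale that ranks fricatives above stops would classify some of these cases differently.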

The sonority sequencing principle states that syllable-initial segments should be monotonically increasing in sonority towards the syllable nucleus (usually the vowel), and that syllable-final segments should be monotonically decreasing from the nucleus (see Clements, 1990, for a historical and systematic review of "sonority sequencing"). Phonetically a segment's sonority is its perceptibility, vowels being more sonorant than consonants, nasals being more sonorant than stops, etc. But sonority can also be defined in terms of phonological principles (Clements, 1990). On either of these sonority accounts the syllable /plant/ is a core syllable, whereas /lpatn/ is not; the latter violates the sequencing principle both in its onset and its offset. Though /lpatn/ is not a syllable of English, violations of sonority sequencing do occur in English syllables, as in cats, task or apt. Fujimura and Lovins (1978) proposed to treat such and similar cases as combinations of a core syllable plus an "affix", such as /cæt + s/, etc. Here the core obeys sonority sequencing, and the affix is added to it. The authors also gave other, more phonological reasons for distinguishing between core and affixes, not involving sonority. They proposed that English syllables can have only one place-specifying consonant following the nucleus. So, in a word like lens, s is a suffix, although the sonority principle is not violated here. A similar notion of "syllable appendix" was proposed by Halle and Vergnaud (1980).

It is clear where such affixes can arise in the process of segment-to-frame association discussed earlier. This will most naturally occur in word-final position when there is a "left over" consonantal segment that cannot associate to a following syllable (Rule 2b). The present version of a mixed theory would then be that as soon as a phonological core syllable is created in left-to-right segment-to-frame association, its phonetic score is retrieved from the syllabary. Any affixes will be computationally added to that score. An advantage of this theory is that the syllabary will drastically reduce in size. In the CELEX database for English (i.e., for citation forms of words) there are about 12,000 different syllables (counting both full and reduced syllables). But most of them have complex offset clusters. These will all be eliminated in a core syllabary. But a disadvantage is that the theory predicts the complexity effect that we didn't find in Experiment 4. There we varied syllables' complexity precisely by varying the number of segments in their consonant clusters (onset or coda), and this should have computational consequences on the present theory. Still, the experiment was not explicitly designed to test the affix theory; it is therefore premature to reject it without further experimentation.

Where Fujimura and Lovins (1978) only proposed to distinguish between syllable core and affix(es), Fujimura (1979) went a step further, namely to split up the core as well. In order to account for the different types of vowel affinity of the initial and final parts of the syllable (already observed in the earlier paper) he introduced the notion of demisyllable. The syllable core consists of an initial

demisyllable consisting of initial consonant(s) plus vowel, and a final demisyllable consisting of vowel plus following consonants. Hence, these demisyllables hinge at the syllabic nucleus. In this model, demisyllables are the domains of allophonic variation, of sonority and other relations between consonants and the vowels they attach to. Or more precisely, as Fujimura (1990) puts it, consonantal features are, in actuality, features of demisyllables. Hence, demisyllables, not phonemes, are the "minimal integral units". On this account "the complete inventory for segmental concatenation will contain at most 1000 entries and still reproduce natural allophonic variation" (Fujimura, 1976). We could call this inventory a demisyllabary. The speaker might access such a demisyllabary and retrieve syllable-initial and syllable-final gestures or gestural scores. Fujimura's model requires that, in addition, further computation of syllable affixes should be necessary, and we have another mixed model here. This latter part of the model will, of course, create the same complexity problem as discussed above. But as far as the demisyllable aspect is concerned, we can see no convincing arguments to reject such a model on the basis of our present results. It cannot be excluded a priori that our syllable frequency effect is, in actuality, a demisyllable frequency effect. In order to test this, new experiments will have to be designed, where demisyllable frequency is systematically varied.

In conclusion, although we have certainly not yet proven that speakers do have access to a syllabary, our theory has been productive in making non-trivial predictions that found support in a series of experiments. Any alternative theory should be able to account for the syllable frequency effect, its independence of word frequency, and the absence of syllable complexity effects.

References

Archangeli, D. (1988). Aspects of underspecification theory. Phonology, 5, 183-207.
Browman, C.P., & Goldstein, L. (1991). Representation and reality: Physical systems and phonological structure. Haskins Laboratory Status Report on Speech Research, SR-105/106, 83-92.
Clements, G.N. (1990). The role of the sonority cycle in core syllabification. In J. Kingston & M.E. Beckman (Eds.), Papers in laboratory phonology I. Between the grammar and physics of speech (pp. 58-71). Cambridge, UK: Cambridge University Press.
Crompton, A. (1982). Syllables and segments in speech production. In A. Cutler (Ed.), Slips of the tongue and language production (pp. 109-162). Berlin: Mouton.
Dell, G.S. (1988). The retrieval of phonological forms in production: Tests of predictions from a connectionist model. Journal of Memory and Language, 27, 124-142.
Fujimura, O. (1976). Syllables as concatenated demisyllables and affixes. Journal of the Acoustical Society of America, 59 (Suppl. 1), S55.
Fujimura, O. (1979). An analysis of English syllables as cores and suffixes. Zeitschrift fur Phonetik, Sprachwissenschaft und Kommunikationsforschung, 32, 471-476.

J.R. & Vergnaud.). 682. Mackay. Word frequency effects in speech production: Retrieval of syntactic information and of phonological form. L. NJ: Erlbaum. J.). S.C. Bell & J. (1991).T. Prosodic units in language production.. 29. 89. (1983). (1983).) Meyer. Hooper (Eds.C.J. 30.. Psychological Review. processes and representations. S. Meyer. Dordrecht: Foris. Keating (1988).-R. Prosodic phonology. 1978.F. (1978). (in press). J. Versprechen und Verlesen. R. B. Hayes. W.G. & Lovins. Response latencies in naming objects. 100.330 W. (1968). Lindblom. Levelt. Levelt. Fujimura. (1994)... D. Phonological facilitation in picture-word interference experiments: Effects of stimulus onset asynchrony and types of interfering stimuli. Journal of Experimental Psychology.M. 20. Shattuck-Hufnagel. & Mayer. (1982). Speech errors as evidence for a serial order mechanism in sentence production. 17. (1971). W. Demisyllables as sets of features: Comments on Clement's paper. Journal of Experimental Psychology: LMC. with introductory essay by A. Three dimensional phonology. (1992). Timing in speech production: With special reference to word form encoding.).B. The time course of phonological encoding in language production: The encoding of successive syllables of a word. J. Journal of Memory and Language. 17. E. American Journal of Psychology.W. Levelt.S. The production of speech (pp. 295-342).M..D. A. Wingfield. (1986).. In WE. (in preparation). H. Economy of speech gestures. (1989). 94. The time course of phonological encoding in language production: Phonological encoding inside a syllable. 81. Speaking: From intention to articulation. Manuscript submitted for publication. Levelt.. O. 1. The role of word structure in segmental serial ordering. A. (1992). K. Walker (Eds. Cognition 42 213-259. A spreading-activation theory of lemma retrieval in speaking. MacNeilage (Ed. B. (1979). W. Wheeldon. UK: Cambridge University Press. 83-105. Kingston & M. Winer. Saltzman. 226-234. 275-292. & Vogel. 
& Berrian. (1990). Underspecification in phonetics. Amsterdam: John Benjamins.J. Journal of Memory and Language. & Lahiri.. The problems of flexibility. W. Amsterdam: North-Holland.. Syllables as concatenative phonetic units. (1991). S. Jescheniak. Quarterly Journal of Experimental Psychology.J. In A. J. Skilled actions: A task-dynamic approach. L. Syllables and segments (pp. Speech errors and theoretical phonology: A review. Journal of Experimental Psychology. 42. Papers in laboratory phonology I.M. Psychological Review.T.A. M. Anderson. Phonology. A. & Levelt. Journal of Linguistic Research. 253-306.J. A. Cutler and D. fluency and speed-accuracy tradeoff. Compensatory lengthening in moraic phonology. (1980).. A. Bloomington: Indiana Linguistics Club. M. 1146-1160. Sentence processing: Psycholinguistic studies presented to Merrill Garrett (pp.M. Accessing words in speech production: Stages. & Levelt. (1993). reconsidered. Annals of the New York Academy of Sciences. I. & Wingfield. In P. R. New York: McGraw-Hill. R.B. 107-120). W. Implicit speech in reading. Wheeldon Fujimura. Meyer. 524-545. . MA: MIT Press. 84-106. Monitoring the time-course of phonological encoding. Human Perception and Performance.S. & Kelso. Cognition. B.). Cooper & E. Stuttgart: Goschensche Verlag. (Reissued. (1965). Nespor. & Schriefers. Statistical principles in experimental design. (1987). 483-506. Halle. Between the grammar and physics of speech (pp. Shattuck-Hufnagel. (1985).E.M. Effects of frequency on identification and naming of objects. Oldfield.P. 5.J. L. 283-295. (1973). A.J. In J. 69-89.. 1-22. Beckman (Eds. Cambridge. (1989). Cambridge. O. 217-245). Cognition 42 107-142. Meringer. A. 368-374. 377-381). (1990).S. Klapp.A. New York: Springer. Hillsdale. W.. (1992). 273-281. Roelofs. Wheeldon.S. Linguistic Inquiry. Stemberger. Fay.

Appendix 1. Vocabularies in Experiment 1

The four experimental vocabularies split into blockgroups containing one word from each of the four frequency groups. Within a blockgroup, words are phonologically and semantically unrelated.

                 VOCAB A     VOCAB B     VOCAB C     VOCAB D
GROUP 1   (HH)   constant    nadeel      geding      roman
          (LH)   neutraal    gordijn     triomf      borrel
          (HL)   cider       takel       kakel       hoeder
          (LL)   tarbot      concaaf     neuraal     soldeer
GROUP 2   (HH)   arme        koning      stilte      versie
          (LH)   client      sleutel     rapport     praktijk
          (HL)   nader       volte       bever       neder
          (LL)   vijzel      absint      horzel      causaal
GROUP 3   (HH)   boter       toren       natuur      teder
          (LH)   heuvel      nerveus     gratis      advies
          (HL)   kandeel     gemaal      proper      combo
          (LL)   giraffe     berber      concours    geiser
GROUP 4   (HH)   pater       heelal      kussen      gebaar
          (LH)   techniek    crisis      vijand      kasteel
          (HL)   gewei       reiger      adder       tegel
          (LL)   rantsoen    pingel      trofee      narcis
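The blockgroup design described in the appendix can be written down as a small data structure. The sketch below is illustrative only: it encodes vocabulary B as listed in the appendix and checks that every blockgroup contains exactly one word from each of the four frequency groups. The names `VOCAB_B`, `FREQUENCY_GROUPS` and `is_balanced` are assumptions made for this example and do not come from the original study.

```python
# Frequency-group labels as printed in Appendix 1.
FREQUENCY_GROUPS = ("HH", "LH", "HL", "LL")

# Vocabulary B: blockgroup number -> list of (word, frequency group) pairs.
VOCAB_B = {
    1: [("nadeel", "HH"), ("gordijn", "LH"), ("takel", "HL"), ("concaaf", "LL")],
    2: [("koning", "HH"), ("sleutel", "LH"), ("volte", "HL"), ("absint", "LL")],
    3: [("toren", "HH"), ("nerveus", "LH"), ("gemaal", "HL"), ("berber", "LL")],
    4: [("heelal", "HH"), ("crisis", "LH"), ("reiger", "HL"), ("pingel", "LL")],
}


def is_balanced(vocabulary):
    """Return True if every blockgroup has exactly one word per frequency group."""
    expected = sorted(FREQUENCY_GROUPS)
    return all(
        sorted(label for _, label in words) == expected
        for words in vocabulary.values()
    )


if __name__ == "__main__":
    print("Vocabulary B balanced:", is_balanced(VOCAB_B))
```

A check of this kind would flag a blockgroup in which one frequency label appears twice and another is missing.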

17 On the internal structure of phonetic categories: a progress report

Joanne L. Miller*
Department of Psychology, Northeastern University, Boston, MA 02115, USA

*Tel. (617) 373 3766, fax (617) 373 8714, e-mail jlmiller@nuhub.bitnet
This paper and the research reported herein were supported by NIH Grant DC 00130 and NIH BRSG RR 07143. The author thanks Peter D. Eimas for valuable comments on an earlier version of the manuscript.

Abstract

There is growing evidence that phonetic categories have a rich internal structure, with category members varying systematically in category goodness. Our recent findings on this issue, which are summarized in this paper, underscore the existence and robustness of this structure and indicate further that the mapping between acoustic signal and internal category structure is complex: just as in the case of category boundaries, the best exemplars of a given category are highly dependent on acoustic-phonetic context and are specified by multiple properties of the speech signal. These findings suggest that the listener's representation of phonetic form preserves not only categorical information, but also fine-grained information about the detailed acoustic-phonetic characteristics of the language.

Introduction

A major goal of a theory of speech perception is to explicate the nature of the mapping between the acoustic signal of speech and the segmental structure of the utterance. The basic issue can be framed as follows: how do listeners internally represent the phonetic categories of their language and how do they map the incoming speech signal onto these categorical representations during processing? Throughout the years, considerable emphasis has been placed on the discrete as opposed to continuous nature of phonetic categories and, in particular, on the boundaries between categories: where boundaries are located, why they are

& Kluender. Connino. Repp. . like perceptual/cognitive categories in general (Nosofsky. Samuel. 1977). that is. The challenge is to explicate the nature of these structured representations and to discover their role in processing. Pisoni. 1983. selective adaptation (Miller. Miller & Volaitis. The picture that is emerging. Widin. is one of a categorization process that maps acoustic information onto discrete. 1977. 1989. Lahiri & Marslen-Wilson. Rosch. 1991). 1974. & Cooper. in press). 1992). 1982. Volaitis & Miller. Harris. Samuel. that the discrimination of stimuli from a given phonetic category is not all that limited. and discrimination/generalization (Kuhl.334 J. 1991). then. Miller located where they are. Thus the processes that map the acoustic signal onto the categorical representations of speech do not necessarily produce a loss of detailed information about particular speech tokens (cf. 1977. The fundamental idea is that phonetic categories. In this paper I summarize our progress to date in doing so. Moreover. Pisoni & Tash. there is now a growing body of evidence that stimuli within a given phonetic category are not only discriminable from one another. Liberman. what kinds of factors shift boundaries around (Repp & Liberman. but that phonetic categories themselves have a rich internal structure. have a graded structure. cf. 1970). but highly structured categorical representations. van Hessen & Schouten. 1990). This focus on boundaries has historical roots. This graded internal structure is revealed both by tasks that assess the functional effectiveness of category members in such phenomena as dichotic competition (Miller. with some stimuli perceived as better exemplars of the category than others-it is far from the case that the stimuli within a phonetic category are perceptually equivalent. 1992. 
The basic idea was that during speech perception the listener maps the linguistically relevant acoustic properties onto discrete phonetic categories such that information about category identity is retained. & Viemeister. Evidence for internal category structure: goodness judgements Although our early work on this issue examined the differential effectiveness of category members in dichotic listening and selective adaptation tasks (Miller. 1978. Li & Pastore. under certain experimental conditions listeners can discriminate stimuli within a category remarkably well (Carney. and by tasks that assess overt judgements of category goodness (Davis & Kuhl. 1991. 1987). Much of the early research on speech categorization explored the phenomenon of categorical perception (Studdert-Kennedy. which emphasized the ease of discriminating stimuli that crossed a category boundary compared to the relative difficulty of discriminating stimuli that did not cross a category boundary. but the details of the underlying acoustic form are largely lost (cf. 1992. however. Medin & Barsalou. that belonged to the same phonetic category. Miller. It is now known. 1988. 1977. 1982). Kuhl. 1987). Schermer.

1977; Miller et al., 1983), more recently we have focused on overt category goodness judgements. A typical experiment proceeds as follows. We first create a speech series in which a phonetically relevant acoustic property is varied so as to range from one phonetic segment to another. For example, we create a series ranging from /bi/ to /pi/, with the /b/-/p/ voicing distinction specified by a change in voice onset time (VOT), the delay interval between the release of the consonant and the onset of periodicity corresponding to vocal fold vibration. As in typical speech experiments, the VOT values of the /bi/ and /pi/ endpoints are informally selected to be good category exemplars. Next, we extend the series by incrementing the critical acoustic property, in this case, VOT, to values far beyond those typically associated with a good member of one of the categories, /p/. This yields an extended series ranging from /bi/ through /pi/ to a breathy, exaggerated version of /pi/ (which we label */pi/). Note that for this particular series the phonetic category of interest, /p/, is bounded on one side by another phonetic category, /b/, and on the other by highly exaggerated instances of the target category, */p/. In other cases, the category of interest is bounded on both sides by other phonetic categories.

In our goodness task, listeners are presented randomized sequences of the extended series and asked to judge each exemplar for its goodness as a member of the /p/ category using a 1-10 rating scale; the higher the number, the better the exemplar. Representative data from such an experiment are shown in Fig. 1 (top panel). As VOT increases the ratings first increase and then decrease, with only a limited range of stimuli within the category obtaining the highest goodness ratings. In other words, the category has a fine-grained internal structure, showing systematic variation in category goodness.

We have now obtained goodness functions for a number of different phonetic contrasts, involving both consonants and vowels, specified by a variety of acoustic properties. Two further examples are shown in Fig. 1 (middle and bottom panels). Although the precise form of the goodness function does not remain invariant, all contrasts studied to date have yielded systematic variation in goodness judgements within the category. We tentatively conclude that graded internal structure is a general characteristic of phonetic categories.

Interestingly, we have found that the structure reflected by the goodness ratings is strongly correlated with the structure reflected in a categorization reaction time task: the higher the goodness rating, the less time it takes listeners to identify a within-category stimulus as a good exemplar of /p/ (r = .93) (Wayland & Miller, 1992). This suggests that the fine gradations within the category may play an important role in on-line speech processing and, perhaps, lexical access, issues we are currently pursuing. Furthermore, there is now evidence that such structure exists not only for adults but also for young infants. Kuhl and her colleagues (Kuhl, Williams, Lacerda, Stevens, & Lindblom, 1992) report structured, language-dependent categories for vowels in 6-month-old infants and we have preliminary evidence for structured consonantal categories in

3-month-old infants (Eimas & Miller, unpublished data). Whether such graded category structure emerges through exposure to language over the first months of life, as suggested by Kuhl (1993), or whether some rudiments of language-independent internal category structure are innately given by the biological endowment of the infant, to become fine-tuned with specific language experience, remains to be determined.

We should also point out that although the existence of internal category structure is now well established, the nature of the mental representation that underlies such structure is not known. Two major possibilities present themselves.

[Figure 1: three panels, with x-axes Voice-Onset-Time (ms), Closure Duration (ms) and Vowel Duration (ms).]
Fig. 1. Three examples of goodness functions for phonetic categories. Top panel: group goodness function for /pi/ along a /bi/-/pi/-*/pi/ series, specified by VOT. Based on Wayland and Miller (1992). Middle panel: group goodness function for /stei/ along a /sei/-/stei/-*/stei/ series, specified by closure duration. Based on Hodgson and Miller (1992). Bottom panel: group goodness function for /bæt/ along a /bɛt/-/bæt/-*/bæt/ series, specified by vowel duration. Based on Donovan and Miller (unpublished data).
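As an illustration only, the shape of a group goodness function like those in Fig. 1 can be summarized in a few lines of Python: average the 1-10 ratings at each stimulus value, find the best-rated stimulus, and check that ratings rise to a peak inside the series and then fall. The ratings below are invented for the example and are not data from the experiments; the function names are hypothetical.

```python
from statistics import mean


def group_goodness_function(ratings_by_stimulus):
    """Map each stimulus value (e.g. VOT in ms) to its mean rating across listeners."""
    return {value: mean(ratings) for value, ratings in sorted(ratings_by_stimulus.items())}


def peak_stimulus(goodness):
    """Return the stimulus value that received the highest mean rating."""
    return max(goodness, key=goodness.get)


def rises_then_falls(goodness):
    """True if mean ratings never drop before the peak, never rise after it,
    and the peak lies strictly inside the series."""
    values = [goodness[key] for key in sorted(goodness)]
    peak = values.index(max(values))
    rising = all(a <= b for a, b in zip(values[:peak], values[1:peak + 1]))
    falling = all(a >= b for a, b in zip(values[peak:], values[peak + 1:]))
    return rising and falling and 0 < peak < len(values) - 1


# Invented ratings from three hypothetical listeners at five VOT values (ms).
EXAMPLE_RATINGS = {
    10: [1, 2, 3],
    40: [5, 6, 7],
    80: [9, 10, 8],
    140: [6, 5, 7],
    220: [2, 3, 1],
}

if __name__ == "__main__":
    goodness = group_goodness_function(EXAMPLE_RATINGS)
    print("Mean ratings:", goodness)
    print("Best-rated VOT (ms):", peak_stimulus(goodness))
    print("Rises then falls:", rises_then_falls(goodness))
```

A series whose ratings simply keep increasing would fail the last check, which is the pattern the extended (exaggerated) end of each series is designed to rule out.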

such that the internal structure of the category is itself altered. the representation could be based on stored category exemplars. Our initial investigation of this issue focused on speaking rate. or prototype. When speakers talk they do not maintain a constant rate of speech (Crystal & House. goodness ratings would be based on the perceptual similarity of the test stimulus to the set of stored exemplars. in press). it could be that the listener stores an abstract summary description. by definition. a question that immediately arises is whether these pervasive context effects are limited to the region of the category boundary where. 1981). phonetic segment (Pisoni & Luce. The latter would indicate that contextual effects in perception entail a more comprehensive remapping between acoustic property and phonetic structure than shifts in boundary location alone can reveal. The basic finding is that boundary locations are flexible. As speakers slow down such that overall syllable duration becomes longer. A case in point is VOT. 1987). is relevant here as well. 1987). it is not currently known whether graded internal structure is properly characterized in terms of categories at a featural. the *It should also be noted that the longstanding debate in the literature over the unit of representation that underlies speech perception. This produces a potential problem for perception in that many phonetically relevant acoustic properties are themselves temporal in nature. and that listeners are sensitive to such variation during speech perception (Jusczyk. Much of the evidence for context effects in perception comes from studies examining boundary locations. 1984). or whether the influence of contextual factors extends beyond the boundary region.l The internal structure of phonetic categories is context dependent There is substantial evidence that the acoustic information specifying a given phonetic segment varies extensively with a host of contextual factors. 
but see Stevens & Blumstein. Grosjean. . 1990. variation in a relevant contextual factor results in a predictable. & Lomanto. In this case. Goodness ratings would be based on the perceptual similarity of the test stimulus to the stored prototype.On the internal structure of phonetic categories 337 First. speaker and speaking rate. Differentiating between these accounts of phonetic category representation may well prove difficult (see Li & Pastore. Alternatively. 1987) or syllable (Studdert-Kennedy. and change as speaking rate changes. 1981). That is to say. 1988). such as phonetic environment. presumably weighted for frequency (Nosofsky. systematic change in the location of the listener's category boundaries (Repp & Liberman. With a shift of emphasis from category boundaries to category centers. 1986. of the category (Rosch. for example whether linguistic feature (Stevens & Blumstein. 1978). Miller. there is ambiguity in category membership. as it has in the case of non-speech perceptual and cognitive categorization (Medin & Barsalou. segmental or syllabic level of representation. 1986. 1980). Perkell & Klatt.

Summerfield. just as speakers produce longer VOT values for longer syllables. so too do listeners require longer VOT values for stimuli to be perceived as prototypic category exemplars. Stimuli from the two series were presented for goodness judgements using our rating task. we have discussed only rate information specified by the syllable itself. We asked whether the listener's adjustment for rate is limited to the boundary region or affects the perception of stimuli well within the category (Miller & Volaitis. 1981). but tend to produce a wider range of VOTs (Miller et al. but in relation to syllable duration. But it is known that. by focusing on rate effects within the category rather than at the boundary. As speakers slow down. Indeed. Volaitis & Miller.338 J. a change in syllable-external rate also affects perception (Gordon. the listener's voiced-voiceless boundary shifts toward longer VOTs. 1979. that is. 1975. To do so. we obtained systematic goodness functions. Miller & Liberman. and the stimuli in the other were longer (325 ms). Summerfield. So far. There thus appears to be a close correspondence between acoustic alterations stemming from a change in rate during speech production and perceptual alterations in the internal structure of categorical representations. 1989. & Gans. 1990. measured at the category boundary: as syllable duration increases (reflecting a slowing of rate). in that the best exemplars spanned a wider range of VOT values for the longer syllables. the two rate effects appear similar in kind. reflecting a fast rate of speech. Our perceptual data mirrored this effect. Thus. Carrell. 1981). reflecting a slower rate of speech. they produce not only longer VOTs. we designated a best-exemplar range for each series. Volaitis & Miller. reflecting different speaking rates. This suggests that if listeners are to use VOT to distinguish voiced and voiceless stops most effectively. see also Volaitis & Miller. In order to quantify the effect of rate. 
The stimuli in one series were short (125 ms). Miller VOT values associated with stop consonants at syllable onset also become longer (Miller. with only a limited range of stimuli receiving the highest ratings. 1992). the basic phenomenon has proved highly robust. 1986. we have recently . Although there is continuing controversy over the nature of the perceptual mechanism underlying rate effects of this type (Fowler. For both series. syllable-internal rate. Pisoni. Green. defined as the range of stimuli receiving ratings that were at least 90% of the maximal rating given to any stimulus within the series. & Reeves. 1983. However. 1992). 1988. 1992). 1986. at least insofar as boundaries are concerned. Interestingly. the perceptual data matched the acoustic production data in yet another characteristic. Many studies have shown evidence for such rate-dependent processing. we created two extended /bi/-/pi/-*/pi/ series that differed from each other in overall syllable duration. other than the fact that the boundary shifts due to syllable-internal rate often appear to be larger than those due to syllableexternal rate. The effect of rate was clear: the best-exemplar range was shifted toward longer VOT values for the longer syllables. Summerfield. they should treat VOT not absolutely..

found evidence for a qualitative difference in the two kinds of rate effects (Wayland, Miller, & Volaitis, in press), evidence that supports the proposal in the literature that the two rate effects may derive from different underlying mechanisms (Port & Dalby, 1982; Summerfield, 1981). The main study was straightforward. We obtained goodness ratings for target syllables drawn from a single /bi/-/pi/-*/pi/ series embedded in fast and slow versions of a context sentence, "She said she heard here." Two main findings emerged. First, the change in sentential rate from fast to slow produced a shift in the best-exemplar range for /pi/ along the target series toward longer VOT values. Thus syllable-external rate, like syllable-internal rate, does alter the perception of stimuli well within the category. However, the change in sentential rate from fast to slow, unlike the change in syllable-internal rate, produced only a simple shift in the best-exemplar range, with no concomitant widening of the range. These findings support a dissociation between the two kinds of rate information. In addition, they raise the possibility that the shape of the phonetic category is determined by characteristics of the syllable itself, with syllable-external factors simply shifting the category, with its structure intact, along the acoustic dimension.

Support for the notion that syllable-internal factors are important in determining the shape (as well as location) of the category along an acoustic dimension comes from a study in which we directly examined the effect of syllable structure on goodness ratings (Volaitis, 1990; Volaitis & Miller, 1991). To set the stage for our perceptual study, we first conducted an acoustic analysis of consonant-vowel (CV) and consonant-vowel-consonant (CVC) syllables, obtained from asking speakers to produce tokens of /di, dis, ti, tis/ across a range of rates. As expected, for each of the four syllables, as speaking rate decreased such that overall syllable duration increased, so too did the VOT value of the initial consonant increase. However, VOT did not depend only on the duration of the syllable, but also on its structure (Weismer, 1979). In particular, for a given syllable duration, VOT was longer for /ti/ than /tis/, and tended to encompass a wider range of VOT values. If listeners are sensitive to this acoustic pattern, then for a given syllable duration the best exemplars of /ti/ should have longer VOT values than those of /tis/. This is precisely what happened in the perceptual study, in which we compared goodness ratings along a /di-ti-*ti/ and /dis-tis-*tis/ series, whose syllables had the same overall duration. Moreover, the shapes of the goodness functions for /ti/ and /tis/ were quite different, with the best-exemplar range being narrower for /tis/ than /ti/, again mirroring the trend in the acoustic data.

Yet further support for both the existence of context effects within the category, and a close correspondence between these context effects and the acoustic consequences of speech production, comes from a study in which we looked at the role of acoustic-phonetic context provided by the critical segment

itself in shaping the internal structure of a phonetic category (Volaitis & Miller, 1992). It has long been known that the VOT value of an initial stop consonant systematically varies with changes in place of articulation of that stop, such that VOT values become longer as place moves from labial (/b,p/) to velar (/g,k/) (Lisker & Abramson, 1964). Moreover, it is known that listeners are sensitive to this pattern, insofar as they shift their voiced-voiceless category boundaries toward longer VOT values as place changes from labial to velar (Lisker & Abramson, 1970). Our question was whether this context sensitivity in perception extends to the best exemplars of the category. The answer was yes. When listeners were asked for /p/ and /k/ goodness judgements, respectively, for stimuli from /bi-pi-*pi/ and /gi-ki-*ki/ series, they required longer VOT values to perceive the best tokens of /ki/ than /pi/.

Taken together, the findings provide strong support for the proposal that the listener's sensitivity to contextual factors is not a boundary phenomenon, but that context alters which stimuli are perceived to be the best category exemplars. In other words, there is not a single prototype for a given linguistic category (cf. Oden & Massaro, 1978). Note that this finding is compatible with either an exemplar-based or a prototype-based representational structure, with the caveat that any abstracted "prototype" must itself be context dependent. Our data also show that the systematic way in which context alters the best exemplars of the perceptual category corresponds closely to the way in which context alters the relevant acoustic properties during production; that is, even for stimuli within a category, listeners are finely tuned to the consequences of articulation. An important issue to be resolved is whether such attunement arises from a speech-specific mechanism that operates in terms of articulatory principles, as the motor theory of speech perception proposes (Liberman & Mattingly, 1985), an articulatory system that has evolved to take account of auditory processes (Diehl & Kluender, 1989), or a perceptual system that operates so as to be generally sensitive to the consequences of physical events in the world, with speech being just one of those events (Fowler, 1986).

So far we have focused on acoustically based contextual variation, rooted in the acoustic consequences of speech production. But not all context effects in speech perception are acoustically based. Some derive instead from the influence of higher-order linguistic variables: variables that themselves do not systematically alter the acoustic structure of the utterance. An example is lexical status. Consider a series of speech syllables that vary in initial consonant from /b/ to /p/, specified by a change in VOT. The critical aspect of the series is that one endpoint constitutes a word of the language, for example BEEF, whereas the other constitutes a non-word, for example PEEF. In a number of experiments using such series, it has been shown that listeners will tend to identify stimuli with potentially ambiguous phonetic segments in the vicinity of the /b/-/p/ boundary so as to render the real word of the language rather than the non-word, in this

example BEEF rather than PEEF. In other words, lexical status can produce a shift in category boundary location (Ganong, 1980; Pitt & Samuel, 1993). Interestingly, however, the literature suggests that the effect of lexical status, unlike the acoustically based context effects described earlier, may be limited to the region of the category boundary: The strength of the lexical effect appears to decrease as stimuli along a series move away from the region of the category boundary, with little or no influence often seen for the endpoint stimuli of a series, where phonetic identity is presumably clearly specified by the acoustic information.

In a recent set of experiments (Miller, Volaitis, & Burki-Cohen, unpublished data), we used extended VOT series, coupled with our goodness rating task, to test directly the influence of lexical status on internal category structure. In a preliminary experiment involving BEEF-PEEF and BEACE-PEACE series that varied in VOT, we confirmed the basic lexical effect: The /b/-/p/ category boundary was located at a longer VOT value on the BEEF-PEEF compared to the BEACE-PEACE series; that is, listeners tended to hear the stimuli near the boundary as BEEF in the first series and as PEACE in the second series. In the main experiment, we extended the series by increasing VOT to extreme values (as in Miller & Volaitis, 1989), such that listeners were presented with BEEF-PEEF-*PEEF and BEACE-PEACE-*PEACE series. They were asked to rate each exemplar for the goodness of /p/. Following the procedures used in Miller and Volaitis (1989, see above), for both series, we designated a best-exemplar range for /p/. The interesting finding was that the change in lexical status did not shift the entire best-exemplar range along the VOT series, as had the contextual factors of rate, syllable structure, and phonetic context, described above. Instead, there was only a marginally reliable shift for the peak of the function itself, and no reliable shift for the edge of the best-exemplar range beyond the peak (in the direction away from the boundary). Although only preliminary, these findings raise the possibility that higher-order contextual variables, unlike acoustically based contextual variables, do not substantially alter the mapping between acoustic signal and internal category structure.

The internal structure of phonetic categories is specified by multiple acoustic properties

Discussions of speech perception typically consider two major kinds of variation that complicate the mapping between acoustic signal and phonetic category. One type is variation due to contextual factors, considered above. The other type has to do with the finding that phonetic contrasts are multiply specified. Listeners appear to be exquisitely sensitive to the multiple acoustic consequences that arise from any given articulatory act, and use these multiple

properties to identify a given phonetic segment (Bailey & Summerfield, 1980; but see Stevens & Blumstein, 1981). As a case in point, consider the distinction between the presence and absence of a stop consonant in "say" (/sei/) versus "stay" (/stei/). Two primary properties underlying this contrast are the duration of the closure interval between the fricative and the vocalic portion of the syllable and the frequency of the first formant (F1) at its onset after closure. A short closure and high F1 onset specify "say", whereas a longer closure and lower F1 onset frequency specify "stay". Evidence that listeners use both properties in identifying "say" versus "stay" comes from trading relation studies, which reveal that manipulation of one variable can produce a shift in the category boundary along a continuum defined by the other variable. For example, Best, Morrongiello, and Robson (1981) created two "say"-"stay" series, in which the "say" versus "stay" distinction along each series was specified by a change in closure duration. The series differed from each other in F1 onset frequency. For each series, listeners identified stimuli with short closures as "say" and those with longer closures as "stay". However, the location of the "say"-"stay" boundary depended on the F1 onset frequency, such that for the series with the higher F1 onset the boundary was shifted toward longer closure durations. In other words, with a higher F1 onset frequency (which favors "say") listeners required more silence (which favors "stay") to hear the presence of the stop consonant.

As in the case of context effects, much of the evidence for the use of multiple properties comes from the examination of category boundaries. With respect to the nature of categories, the critical question is whether such trading relations are limited to the boundary region or whether properties also trade to define which stimuli are the best exemplars of the category. We have examined this issue by pairing our goodness procedure with a trading relation paradigm, focusing on the "say"-"stay" distinction described above (Hodgson, 1993; Hodgson & Miller, 1992, in preparation). We began by creating two "say"-"stay" series closely patterned after those of Best et al. (1981). We then extended each series by increasing closure duration to very long values, such that each series ranged from "say" through "stay" to an exaggerated version of "stay" (*"stay"), which sounded like "s...tay". Stimuli from the two series were randomized and presented to listeners for goodness judgements. The main finding was that the best exemplars of "stay" were located at longer closure durations for the high F1 onset series than the low F1 onset series: clear evidence for a within-category trading relation. And we have replicated this basic finding for an intervocalic voicing category, specified by closure duration and preceding vowel duration (Hodgson & Miller, 1991).

Note that the sensitivity of internal category structure to multiple properties, like the context-dependent nature of category structure described above, is compatible with either an exemplar-based or a prototype-based representational structure, with a similar caveat: any abstracted
"prototype" must itself not only be context dependent, but must also be multiply specified (cf. Massaro & Cohen, 1976).2

2 At first glance, our findings on within-category trading relations appear to disagree with those of Repp (1983), who did not find evidence for a within-category trading relation for the "say"-"stay" contrast (he did not test the intervocalic voicing contrast). However, unlike our goodness judgement task, his discrimination task did not require listeners to attend to the phonetic quality of the stimuli. Thus his listeners could have based their within-category discriminations on auditory information alone and, indeed, Repp assumed that they did just that. Taken together, the studies suggest that within-category trading relations do occur, but only when specifically phonetic processing, as required by our goodness judgement task, is invoked.

In three subsequent studies based on our extended "say"-"stay"-*"stay" series, we have examined the limits and robustness of the within-category trading relation and, in so doing, the robustness of category structure itself (Hodgson, 1993; Hodgson & Miller, in preparation). In the first of these studies, we explored the limits of the trading relation by systematically increasing the F1 onset value of the stimuli. We found that as F1 onset frequency becomes higher, the magnitude of the trading effect, measured as the shift in location of the best category exemplar along the series, also becomes larger, but at a price: the structure of the category begins to break down. Specifically, the rating function for a series with extreme F1 values is depressed, such that listeners no longer give very high ratings to any stimuli within the series; no stimuli are perceived as good exemplars of "stay". Of considerable interest will be to determine whether the F1 onset frequency at which category structure is lost is correlated with the decreasing probability of that F1 onset frequency value occurring in production.

In the second study of the series, we used background noise to examine the robustness of category structure. That is, we conducted our basic goodness experiment with our two original "say"-"stay"-*"stay" series, but presented the stimuli in a background of multitalker babble noise, a kind of noise that has been shown previously to produce changes in the way in which acoustic properties contribute to phonetic categorization (Miller & Wayland, 1993; Wardrip-Fruin, 1985). In the present case, however, the noise had no effect: both the within-category structure and the trading relation proved highly resilient, showing no signs of decline even at a relatively poor signal-to-noise ratio (0 dB). Thus the fine-grained internal structure of phonetic categories is far from fragile.

Finally, our third study in the series provides evidence that sinewave replicas of speech, which eliminate the rich harmonic structure of speech but preserve the basic time-varying patterns of the speech signal, can support the perception of graded category structure, as well as within-category trading relations. Patterning our stimuli after sinewave stimuli used by Best et al. (1981), we created sinewave replicas of our two original extended "say"-"stay"-*"stay" series. We presented the stimuli for "stay" goodness judgements to "speech" listeners, that is, to listeners who heard the sinewave
stimuli as speech. Although, as is typical in sinewave speech experiments, listeners gave a variety of response patterns, fully half of the listeners we tested provided highly systematic goodness functions that were remarkably similar to those we obtained for our original "say"-"stay"-*"stay" speech series, showing clear evidence of a within-category trading relation. This finding adds to the weight of evidence that graded category structure is an integral part of the representation of phonetic categories.

Conclusion

The research we have reviewed provides support for the claim that the stimuli within a phonetic category are far from perceptually equivalent. Rather, within-category stimuli vary systematically in category goodness, with the best exemplars of a given category themselves defined in terms of multiply specified, context-dependent properties. And this graded internal category structure is highly robust. It is resistant to noise and it is revealed even when the phonetic quality of the stimulus rests on a highly stylized "caricature" of speech that preserves critical phonetically relevant time-varying properties. These findings indicate that the listener's representation of the phonetic categories of language includes fine-grained detail about phonetic form. A major challenge before us is to determine the role played by this fine-grained knowledge in on-line speech processing and lexical access.

References

Bailey, P.J., & Summerfield, Q. (1980). Information in speech: Observations on the perception of [s]-stop clusters. Journal of Experimental Psychology: Human Perception and Performance, 6, 536-563.
Best, C.T., Morrongiello, B., & Robson, R. (1981). Perceptual equivalence of acoustic cues in speech and nonspeech perception. Perception and Psychophysics, 29, 191-211.
Carney, A.E., Widin, G.P., & Viemeister, N.F. (1977). Noncategorical perception of stop consonants differing in VOT. Journal of the Acoustical Society of America, 62, 961-970.
Crystal, T.H., & House, A.S. (1990). Articulation rate and the duration of syllables and stress groups in connected speech. Journal of the Acoustical Society of America, 88, 101-112.
Davis, K., & Kuhl, P.K. (1992). Best exemplars of English velar stops: A first report. In J.J. Ohala, T.M. Nearey, B.L. Derwing, M.M. Hodge, & G.E. Wiebe (Eds.), Proceedings of the International Conference on Spoken Language Processing (pp. 495-498). Alberta, Canada: University of Alberta.
Diehl, R.L., & Kluender, K.R. (1989). On the objects of speech perception. Ecological Psychology, 1, 121-144.
Fowler, C.A. (1986). An event approach to the study of speech perception from a direct-realist perspective. Journal of Phonetics, 14, 3-28.
Fowler, C.A. (1990). Sound-producing sources as objects of perception: Rate normalization and nonspeech perception. Journal of the Acoustical Society of America, 88, 1236-1249.

Cognition. Linguistic experience alters phonetic perception in infants by 6 months of age. 46. Journal of the Acoustical Society of America. Journal of the Acoustical Society of America. .. Schouton (Ed. Phonetica. & Barsalou. In S. & Pastore. Miller. New York: Wiley. In K.W. J. W. (1964). speech and language. 43. P. Science. X. 215-225. 457-465. 1967 (pp. & Cohen.E. de Boysson-Bardies. Articulation rate and its variability in spontaneous speech: A reanalysis and some implications. 20. Hamad (Ed. A. (in preparation). Ph. P. P. A. P. Jusczyk.M. In Proceedings of the Sixth International Congress of Phonetic Sciences./ s i / distinction. McNeilage..). P. 137-146. P. K. P. (1976). monkeys do not.H. J.). Kuhl. C. L. New York: Cambridge University Press. J.. Miller. (1988). A cross-language study of voicing in initial stops: Acoustical measurements. Human adults and human infants show a "perceptual magnet effect" for the prototypes of speech categories. D. Northeastern University.K. (1991). Audition. 1997. A. 563-567). Schermer. Word. P. Miller. & Miller.. 2464. Lacerda. Categorical perception. (1992). & Reeves. Li. Stevens. The motor theory of speech perception revised... J. & Marslen-Wilson. 606-608. Phonetica. Hodgson.. 6.. Some effects of later-occurring information on the perception of stop consonant and semivowel. 110-125. (1986)..L.L. F.. (1985).. Internal phonetic category structure depends on multiple acoustic properties: Evidence for within-category trading relations [Abstract].. Grosjean.. In M. 641-648. & Abramson. (1979).G. 505-512.W. R. 62. 93-107. (1992). 106-115. & J. (in press). W. (1991). J. J. de Schonen.D. Perception and Psychophysics. Internal structure of phonetic categories: Evidence for within-category trading relations. III (1980).). L.N.. Hodgson. Dordrecht: Kluwer. T. (1984). Liberman. K. A.P. C.S. Perception and Psychophysics. 259-274). Miller. The voicing dimension: Some experiments in comparative phonetics. thesis.L. (1977). 
Categorization processes and categorical perception. P. A.L. Journal of Experimental Psychology: Human Perception and Performance. (1989). P. Connine. 384-422.On the internal structure of phonetic categories 345 Ganong. 50.W. M. Lisker. Speech perception. & Liberman. & Volaitis.L. The contribution of fundamental frequency and voice onset time to the / z i / . Miller.. Williams. Lisker.K. (1986).F.L. J.. The mental representation of lexical form: A phonological approach to the recognition lexicon.. & J.L. Handbook of perception and human performance (pp. K. 43..S. (1983). Cognition.. 1-36. (1987). & Lomanto. In B.R. Effect of speaking rate on the perceptual structure of a phonetic category.L. 38.. Thomas (Eds. & Lindblom.P. Perception and Psychophysics.N.).. 25. Jusczyk. Developmental neurocognition: Speech and face processing in the first year of life (pp. Phonetic categorization in auditory word perception. & Kluender. (1993).C. L. Kuhl. F. Internal structure of phonetic categories: The role of multiple acoustic properties. Kuhl. Kaufman. & Miller. Berlin: Mouton-De Gruyter. J. J.M. 41.R. S. Properties of feature detectors for VOT: The voiceless channel of analysis. Perception and Psychophysics. P. 704-717. Evaluation of prototypes and exemplars in perceptual space for place contrast. 92.E. L. I. & Abramson. (1993). Speaking rate and segments: A look at the relation between speech production and speech perception for the voicing contrast. The role of multiple acoustic properties in specifying the internal structure of phonetic categories [Abstract]. 73. Green.. Boff. 60. Gordon. A possible auditory basis for internal structure of phonetic categories. L.L. B.L. Medin.E.K.. Massaro. 255. Hodgson. Miller. Prague.M.M. A. Journal of the Acoustical Society of America. (1970). Prague: Academia. Hodgson. K. (1991). 27/1-27/57). Journal of the Acoustical Society of America.A. 89. Morton (Eds. 
Innate predispositions and the effects of experience in speech perception: The native language magnet theory. Journal of the Acoustical Society of America. 21. 245-294.. Induction of rate-dependent processing by coarse-grained aspects of speech. Lahiri. D. & Mattingly. & Miller. 2124-2133.

.346 J. Perkell. (1982). L. Paper presented at 33rd annual meeting of the Psychonomic Society. (1993).. Consonant/vowel ratio as a cue for voicing in English. (1977). Miller (Eds. (1983).L. J. D. (1992). S.N. The effect of signal degradation on the status of cues to voicing in utterance-final stop consonants. Samuel. Volaitis. & Dalby. Hillsdale. Memory. Phonetic prototypes.. A. Hillsdale. C. 92. 341-361. Stevens. In E. & Cooper. Acoustic-phonetic representations in word recognition. Nosofsky. 172-191. Q. van Hessen. (1987).W. (1980).G.L. Influence of internal phonetic category structure in on-line speech processing.H. In S. & Luce. Speech Communication. Perception and Psychophysics 15. 92. 25. D. B. A.. 314-322.L. M. Studdert-Kennedy. Number 4. L. Harnad (Ed. Aerodynamics versus mechanics in the control of voicing onset in consonantvowel syllables. M. & Liberman. 1856-1868. M.B. L. Miller Miller.G. Studdert-Kennedy. Rosch & B.). 54. Journal of the Acoustical Society of America. 19.M. & Gans. Spring. Similarity. (1991). Summerfield. (1987). 1074-1095.. In H. . & Schouten. Categorical perception. 141-152.C.. Rosch. NJ: Erlbaum.E. Port.C.. Hillsdale.L. An empirical and meta-analytic evaluation of the phoneme identification task. (1988). (1990). & Klatt. S. & Samuel.H. & Wayland. T. frequency. (1992). Wayland. 3. 37-50... Articulatory rate and perceptual constancy in phonetic perception.). Principles of categorization. F. 1998. Ph. 1-38)..S. Volaitis. Journal of the Acoustical Society of America.C. 1907-1912. G.B. Pisoni. 723-735. Oden. Journal of Experimental Psychology: Human Perception and Performance. & Miller. Influence of a syllable's form on the perceived internal structure of voicing categories [Abstract]. D.M. (1970). Invariance and variability in speech processes. Department of Psychology. Journal of the Acoustical Society of America. Pisoni. 2. S. 14.D. 
Phonetic prototypes: Influence of place of articulation and speaking rate on the internal structure of voicing categories. In P. 89. Psychological Review. 85. J. Psychological Review. 7. & Miller. Perception of the duration of rapid spectrum changes in speech and nonspeech signals. Lloyd (Eds. R. & Massaro. (1981).J. (1981). (1983). 205-210. Cognition. K. Harris.. & Miller. D. J.). Q. E. The Queen's University of Belfast. Speech perception.). Summerfield. New York: Cambridge University Press. NJ: Erlbaum. A. R. D. A.H.H.D. 23. Kobe. Eimas and J. Journal of the Acoustical Society of America. Dichotic competition of speech sounds: The role of acoustic stimulus structure. 32. Journal of the Acoustical Society of America.. Perception and Psychophysics. & Tash. Japan.A.. (1978). (1982). Modeling phoneme perception. 34. B. J. Integration of featural information in speech perception. 1992.E.S.A. B. Repp. Journal of Experimental Psychology: Learning. Limits on the limitations of context conditioned effects in the perception of [b] and [wj. Language and Speech. thesis. and Cognition. Repp.J. Cognition and categorization. 21-52. 77.H. Effects of talker variability on speech perception: Implications for current research and theory. (1985).E. 54-65. Wardrip-Fruin.E. A. 31. Trading relations among acoustic cues in speech perception are largely a result of phonetic categorization.E..D. P. Pisoni. NJ: Erlbaum. K. Fujisaki (Ed. (1990). & Blumstein. Journal of Experimental Psychology: Human Perception and Performance. (1992). M.B. Repp. Perception and Psychophysics.S. II: A model of stop consonant discrimination. 234-249. Pisoni. (1986). Perception and Psychophysics.. J. Pitt. J. Volaitis. 699-725.M.B. Carrell. Some context effects in the production and perception of stop consonants. (1974). Motor theory of speech perception: A reply to Lane's critical review. (1978). (1975). (1993). 285-290.B. 45-66. and category representations. 
Proceedings of the International Conference on Spoken Language Processing. Perception and Psychophysics. Phonetic category boundaries are flexible.. Northeastern University. The search for invariant acoustic correlates of phonetic features. Liberman.. 77. St Louis. Reaction times to comparisons within and across phonetic categories.F. Speech Perception: Series 2. S. Perspectives on the study of speech (pp. J. 307-314. D.L.

Wayland, S.C., Miller, J.L., & Volaitis, L.E. (in press). The influence of sentential speaking rate on the internal structure of phonetic categories. Journal of the Acoustical Society of America.
Weismer, G. (1979). Sensitivity of voice-onset-time (VOT) measures to certain segmental features in speech production. Journal of Phonetics, 7, 197-204.

18 Perception and awareness in phonological processing: the case of the phoneme

José Morais*, Régine Kolinsky
Laboratoire de Psychologie expérimentale, Université Libre de Bruxelles, Av. Ad. Buyl 117, B-1050 Bruxelles, Belgium

*Corresponding author. Fax 32 2 6502209, e-mail jmorais@ulb.ac.be
The authors' work discussed in the present paper was supported by the Human Frontier Science Program (project entitled Processing consequences of contrasting language phonologies) as well as the Belgian Fonds National de la Recherche Scientifique (FNRS)-Loterie Nationale (convention nos. 8.4527.90 and 8.4505.93) and the Belgian Ministère de l'Education de la Communauté française ("Action de Recherche concertée" entitled Le traitement du langage dans différentes modalités: approches comparatives). The second author is Research Associate of the Belgian FNRS. Special thanks are due to all our collaborators, and in particular to Mireille Cluytens.
SSDI 0010-0277(93)00601-3

Abstract

The necessity of a "levels-of-processing" approach in the study of mental representations is illustrated by the work on the psychological reality of the phoneme. On the basis of both experimental studies of human behavior and functional imaging data, it is argued that there are unconscious representations of phonemes in addition to conscious ones. These two sets of mental representations are functionally distinct: the former intervene in speech perception and (presumably) production; the latter are developed in the context of learning alphabetic literacy for both reading and writing purposes. Moreover, among phonological units and properties, phonemes may be the only ones to present a neural dissociation at the macro-anatomic level. Finally, it is argued that even if the representations used in speech perception and those used in assembling and in conscious operations are distinct, they may entertain dependency relations.

Cognitive psychology is concerned with what information is represented mentally and how it is represented. In these twenty years or so of Cognition's life the issue of where, that is, at what levels of processing, particular types of information are represented has become increasingly compelling. This issue is crucial both to track the mental itinerary of information and to draw a correct
picture of mental structure. In spite of the tremendous development of the functional imaging technology, we are still unable to follow on a computer screen the multiple recodings of information accomplished in the brain. Thus, the experimental study of human behavior remains up to now the most powerful approach to the mind's microstructure. However, the "where" question may be even more difficult to answer than the "what" and "how" ones. As a matter of fact, we all know that what we register are intentional responses given under the request of the experimenter, so that the evidence arising from an experiment may be difficult to attribute to a particular stage of processing.

We take the phoneme issue as a good illustration both of the necessity of pursuing a "levels-of-processing" inquiry in the study of mental representations and of the misunderstandings and pitfalls this difficult study may be confronted with. No reader of Cognition doubts that he or she can represent phonemes mentally. Characters coming out of press or from the writer's hand are costumed phonemes, or at least may be described as such. But at how many processing level(s) and how deeply do phonemes live in our minds?

In the seventh volume of this journal, under the guest editorship of Paul Bertelson, our group demonstrated (at least we believe so) that the notion that speech can be represented as a sequence of phonemes does not arise as a natural consequence of cognitive maturation and informal linguistic experience (Morais, Cary, Alegria, & Bertelson, 1979). This claim was based on the discovery that illiterate adults are unable to manipulate phonemes intentionally, as evidenced by their inability to delete "p" from "purso" or add "p" to "urso". In a subsequent volume of this journal, it was reported that Chinese non-alphabetic readers share with illiterates the lack of phonemic awareness (Read, Zhang, Nie, & Ding, 1986) and that the metaphonological failure of illiterates seems to be restricted to the phoneme, since they can both manipulate syllables and appreciate rhyme (Morais, Bertelson, Cary, & Alegria, 1986). Later on, we observed that illiterates can compare short utterances for phonological length (Kolinsky, Cary, & Morais, 1987), which again suggests that conscious access to global phonological properties of speech utterances does not depend on literacy.

We were happy that Cognition's reviewers had understood the interest of our 1979 paper. Sadly, we had submitted a former version of it to another journal, which rejected it on the basis of the comments of one reviewer who could not believe in our results given that, as Peter Eimas and others had shown (e.g., Eimas, Siqueland, Jusczyk, & Vigorito, 1971), American babies can perceive "phonemic" distinctions, like between "ba", "da" and "ga", fairly well. We are not complaining about reviewers (almost every paper of the present authors has greatly improved following reviewers' criticisms); sincerely, we are almost grateful to the anonymous reviewer and presumably distinguished scholar who confounded perception and awareness. It was probably his or her reviewing that led us both to write: "the fact that illiterates are not aware of the phonetic
structure of speech does not imply, of course, that they do not use segmenting routines at this level when they listen to speech" (Morais et al., 1979, p. 330), and to conclude the paper stressing the need "to distinguish between the prevalence of such or such a unit in segmenting routines at an unconscious level and the ease of access to the same units at a conscious, metalinguistic level" (p. 331). In the following years, we progressively realized that, as far as our own work was concerned, the battle to distinguish between perceptual and postperceptual representations had just begun.

The work with illiterates and with non-alphabetic readers has contributed to nourish, if not to raise, the suspicion that the phoneme could be, after all, and despite our familiarity with it, a simple product of knowing an alphabet. Warren (1983) rightly called one's attention to the danger of introspection in this domain: "Our exposure to alphabetic writing since early childhood may encourage us to accept the analysis of speech into a sequence of sounds as simply the recognition of a fact of nature" (p. 287). In the same paper, he listed the "experimental evidence that phonemes are not perceptual units" (our italics). In this list, the fact that "illiterate adults cannot segment phonetically" (p. 289) appears at the top. Thus, he has erroneously taken observations from the conscious awareness level as evidence of perceptual reality or nonreality.

Some linguists have reached the same conclusion as far as the role of phonemes in the formal description of phonology is concerned. Kaye (1989), for instance, announces "the death of the phoneme" (head of a section, p. 149), in the context of an attempt to demonstrate that "a phonology based on non-linear, multileveled representations is incompatible with the notion of a phoneme" (pp. 153-154).

Is the phoneme dead? Did it ever exist otherwise than in the conscious thoughts of alphabetically literate minds? Are phonemes the make-up of letters rather than letters the make-up of phonemes? Like Orfeo, we have to face the illusions that constantly assault the visitors of perception.

One may use perceptual illusions to fight against experimenters' illusions. Fodor and Pylyshyn (1981) have convincingly argued for "the centrality, in perceptual psychology, of experiments which turn on the creation of perceptual illusions" (p. 161). Besides its ability to demonstrate the direction of causality between two correlated states, the production of an illusion implies that the perceiver has no full conscious control of the informational content of the illusion. However, information that is not consciously represented may, if it is represented at an unconscious perceptual level, influence the misperception. By looking at the informational content of the illusion, and having enough reasons to believe that part of this information cannot come from conscious representations, one is allowed to locate the representation of that part of the information at the unconscious perceptual system.

Following the logic of illusory conjunctions (cf. Treisman & Schmidt, 1982), we took advantage of the dichotic listening technique to elicit word illusions which
should result from the erroneous combination of parts of information presented to one ear with parts of information presented to the opposite ear (see Kolinsky, 1992, and Kolinsky, Morais, & Cluytens, in press, for detailed description of the methodology). If speech attributes can be wrongly combined, they must have been separately registered as independent units at some earlier stage of processing. In our situation, since the subject is asked either to detect a word target previously specified or to identify the word presented in one particular ear, his or her attention is not called upon any word constituent. This experimental situation can thus be used for testing of illiterate as well as literate people.

Recent data that we have obtained on Portuguese-speaking literate subjects (either European or Brazilian) indicate that the initial consonant of CVCV utterances is the attribute that "migrates" the most, compared to migrations of syllable and either voicing or place of articulation of the initial consonant (Kolinsky & Morais, 1993). Subsequent testing of Portuguese-speaking illiterate subjects, again from both Portugal and Brazil, yielded the same pattern of results (Morais, Kolinsky, & Paiva, unpublished). This means that, at least for Portuguese, consonants have psychological reality at the perceptual level of processing, and that the role of consonants in speech perception can be demonstrated in a population that is unable to represent them consciously. Phonemes are not a mere product of alphabetic literacy, at least (we insist) in Portuguese native speakers. The very same populations which allowed us to show that conscious representations of phonemes are prompted by the learning of alphabetic literacy provide also a clear suggestion that unconscious perceptual representations of phonemes can develop prior to the onset of literacy. Portuguese and Brazilian subjects possess unconscious representations of phonemes without disposing of conscious ones.

The reverse picture can also be found. Testing French native speakers on French material, we found low rates of initial consonant migration (Kolinsky, 1992; Kolinsky et al., in press). At the particular stage of processing tapped by the attribute migration phenomenon French-speaking listeners do not seem to represent consonants (the possibility that consonants are represented at other perceptual stages cannot be excluded); at least they do not do so as much as Portuguese-speaking listeners, but, given that they were university students, we may be confident that they possess conscious representations of phonemes.

This double dissociation supports the idea that conscious and unconscious representations of phonemes form two functionally distinct sets of mental representations. However, distinctiveness does not mean necessarily independence. Two functionally distinct systems may interact with each other. Thus, it is important to address both issues.

Phonological processing intervenes in different functions. It intervenes in perceiving and producing speech, in reading and writing, and in the different forms of metaphonological behavior.
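
The attribute-migration logic of the dichotic situation described above can be made concrete with a small sketch. It is purely illustrative and rests on assumptions introduced here: the items "kilo" and "bota" are hypothetical CVCV strings written with one letter per phoneme (they are not stimuli from the studies cited), and the function names are invented for the example. The sketch shows which illusory items would arise if the initial consonant, or the whole first syllable, migrated from one ear to the other, and how a false-detection rate could be tallied.

```python
from __future__ import annotations

VOWELS = set("aeiou")  # assumption: one letter stands for one phoneme


def is_cvcv(item: str) -> bool:
    """True if the item is consonant-vowel-consonant-vowel."""
    return (
        len(item) == 4
        and item[0] not in VOWELS and item[1] in VOWELS
        and item[2] not in VOWELS and item[3] in VOWELS
    )


def swap_initial_consonant(left: str, right: str) -> tuple[str, str]:
    """Items that would result if the two initial consonants exchanged ears."""
    if not (is_cvcv(left) and is_cvcv(right)):
        raise ValueError("both items must be CVCV")
    return right[0] + left[1:], left[0] + right[1:]


def swap_first_syllable(left: str, right: str) -> tuple[str, str]:
    """Items that would result if the two first syllables exchanged ears."""
    if not (is_cvcv(left) and is_cvcv(right)):
        raise ValueError("both items must be CVCV")
    return right[:2] + left[2:], left[:2] + right[2:]


def false_detection_rate(detections: list[bool]) -> float:
    """Proportion of target-absent trials on which the target was reported."""
    if not detections:
        raise ValueError("no trials given")
    return sum(detections) / len(detections)


# Hypothetical dichotic pair: "kilo" in the left ear, "bota" in the right ear.
consonant_illusions = swap_initial_consonant("kilo", "bota")
syllable_illusions = swap_first_syllable("kilo", "bota")
```

Under these assumptions, a target such as "bilo" is present in neither ear but can be assembled by an initial-consonant exchange; comparing false-detection rates across the kinds of exchange that could produce a target is one way to express which attribute "migrates" the most.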

The perception and production of speech require distinct systems. As a matter of fact, production can be affected by a lesion in Broca's area while leaving recognition of auditory words intact, and the reverse may be true for a lesion in Wernicke's area. Thus, separate representations are needed for input and output processing. As far as the independence issue is concerned, very little interference was observed on reading aloud words by having to monitor at the same time for a target word in a list of auditorily presented words (Shallice, McLeod, & Lewis, 1985). Both Shallice et al.'s finding and the fact that auditory word input does not activate the area specifically activated by word repetition and word reading aloud as shown by positron emission tomographic (PET) imagery (Petersen, Fox, Posner, Mintun, & Raichle, 1989) suggest that processing in the input system might be quite independent from processing in the output system.

The two input functions, that is, the recognition of written and spoken words, also use distinct phonological systems. Indeed, written word pronunciation does not activate the Wernicke's area that is activated during word repetition (Howard et al., 1992). Since behavioral studies have provided evidence of automatic activation of phonological representations of word constituents, including phonemes, during orthographic processing (Ferrand & Grainger, 1992; Perfetti & Bell, 1991; Perfetti, Bell, & Delaney, 1988), these phonological representations would be distinct and to a large extent independent from those involved in spoken word recognition. However, let us note that the automatic phonological activation during skilled word reading has probably very little in common, except from a developmental point of view, with the intentional assignment of phonological values to letters and groups of letters that is mostly used in the reading of illegal sequences or of long and phonologically complex pseudowords.

We have argued above that conscious representations of phonemes are distinct from the unconscious representations of phonemes used in spoken word recognition, and we might consider now whether or not they are distinct, too, from the representations of phonemes which are intentionally activated in reading. Is phonological dyslexia, which is characterized above all by a highly selective impairment in reading pseudowords and nonwords, concomitant with a severe deficiency in manipulating phonemes consciously? Based on the ability to perform phonemic manipulations displayed by a patient diagnosed as phonological dyslexic, Bisiacchi, Cipolotti, and Denes (1989) suggested that the representations involved in pseudoword reading and in phonemic manipulations are independent in skilled readers. However, the data are not convincing: the patient's impairment in pseudoword reading was slight. Moreover, she knew the phonemic values of all the letters of the alphabet, and she could read and write a very high number of meaningless syllables (cf. discussion in Morais, 1993). More recently, Patterson and Marcel (1992) suggested that the nonword
Kolinsky reading deficit "may be just one symptom of a more general disruption to phonological processing" (p.R. the deep dyslexic whose re-education was described by de Partz (1986) and who.) on the speech dichotic test we have designed for the induction of attribute migration errors (unpublished data). was unable to associate letters with phonemes. All the three phonological dyslexics (V. who displayed no effect of lexicality in either reading or writing. since all subjects were exceedingly poor at assembling three phonemes. 259). They presented the results of six phonological dyslexics. two of them were deep dyslexics).A. may not be the only mechanism involving phonological representations in word reading) seems thus to depend on the same phoneme representations that are evoked for the purpose of intentional. there were a number of single consonant confusions. Morais.. too. was only slightly impaired in his conscious phonemic abilities (in reading. unfortunately. all were also extremely poor at the phonemic tests. R. there might be no deficiency in accessing phonology from orthography. we repeat. the authors do not give any indication on the patients' knowledge of grapheme-phoneme correspondences. immediately before re-education.R. and one further phonological dyslexic (S. two of them performed much better on rhyming judgement and on tests requiring a conscious analysis of utterances into syllables than on the phonemic tests. More recently. all displaying a severe deficit in nonword reading.. performing around chance level. when the reading deficit spares the mechanism of phonological assembling. the first author has tested three phonological dyslexics (more exactly.S. but a large effect of regularity. 1993). suggesting a slight impairment in his grapho-phonological conversion procedure. but in those cases where phonological assembling is dramatically damaged phonemic awareness is not observed.S.D.S. phonemic awareness is still present. 
and no one of them displayed a regularity effect. In collaboration with Philippe Mousty. on both the intentional segmentation and assembling of phonemes. besides his impairment in the addressed procedure).354 J.). At least for the two patients who could delete initial phonemes. P. and one surface dyslexic on different metaphonological tests (see Morais.V.) were extremely poor at nonword reading. Re-education of the assembling procedure re-installs phonemic awareness. P. Their results were compared with those of control subjects of the same age and educational level. the patients behaving in the metaphonological tests like illiterate people. We wanted to .D. and R. V. conscious manipulations of phonemes. attained a very high level of performance on both pseudoword reading and phonemic analysis when we tested him a few years later. The nonword reading deficit of these patients might be due mainly to deficiency in assembling. An interesting dissociation was observed between these two tasks. The surface dyslexic (J. The assembling procedure in reading (which. Thus. Interestingly... P. we could test J. but two of them could delete the initial consonant of a short utterance on about 80% of the trials.

Thus.S. it occurs even under passive listening. in that area in a task requiring to decide whether two syllables ended or not with the same consonant.D. in comparison with passive listening of the same speech material.Perception and awareness in phonological processing 355 know if people phonologically impaired both in conscious phonemic analysis and in phonological assembling in reading would show the same pattern of attribute migration in speech recognition as normal listeners. given that it was clearly supported by the dissociation observed in illiterate people. We do not dispose of neuro-anatomical data about our patients which would be sufficiently precise to try to match the areas damaged with types of deficit. The results were very clear. What are we allowed to infer from these correlational data? The distinctiveness of conscious and unconscious representations of phonemes cannot be questioned. J. all the other patients displayed good word repetition but relatively low nonword repetition. It is a rather sophisticated metaphonological task. The alternative is to conceive that these two systems of representation.. Evans. Among the phonological dyslexics. and a low rate for initial consonant. a high rate of migrations for syllable. obtained an overall correct detection score as poor as the phonological dyslexics. Phonetic decoding is obligatorily and automatically triggered whenever people hear speech stimuli. had been diagnosed as Broca's aphasics. entertain dependency relations with each other. and PR. the true implication of Zatorre et al. who was good at repeating both words and nonwords. Yet.'s finding is that a part of Broca's area is involved in conscious phonemic analysis.A. the task used involves much more than phonetic decoding. though distinct. Meyer. that is.S. measured with PET. as Wernicke's. and Gjedde (1992) reported that phonetic decoding is accomplished in part of Broca's area near the junction with the premotor cortex. 
the representations used in speech perception and those used in assembling and in conscious operations. Thus. it may be useful to inspect the literature to evaluate how distant the areas supporting conscious and unconscious phonological representations could be from each other if they are not coincident. One interpretation is that the cerebral damage they had undergone was wide enough to affect two relatively localized systems of representation. followed by moderate rates for first vowel and voicing of the initial consonant. there are two ways to interpret the impairments observed in our phonological patients. However. that is. All the phonological dyslexics failed to obtain migrations for syllables.. which illiterates would be unable to perform. Activation of temporal parietal structures posterior to the sylvian fissure occurs during passive . The evidence comes from an increase of activation. Recently. V. Zatorre. thus precluding a clear association to one type of aphasia. and their migration rates for initial consonant were even lower than those obtained by the normals. but S. Thus. It should be noted that. with the exception of J. but he was the only patient who showed the normal pattern of migrations for French. the surface dyslexic.

thus a metaphonological but non-analytical task. suggesting that auditory processes dependent on the previously intact right hemisphere have for some time compensated for the left hemisphere damage. involving most of the superior temporal gyrus. It would be interesting to assess whether or not activation of an additional area is obtained with a discrimination task. The second stroke caused a specific deficit in an auditory lexical decision task. makes this area a good candidate for phonological encoding of words" (p. and with a subsequent right hemisphere infarct involving again the superior temporal gyrus (Praamstra. Comparison of discrimination and identification functions for vowels and consonants suggests that there was a phonetic deficit. even after the second stroke. Morais. on the other hand. it involves recognition. it seems that conscious and unconscious representations of phonemes rely on different-though. Our expectation. Since. 163). which requires the subject to decide whether two speech stimuli are the same or different. The conscious . Petersen et al. whereas frontal activation anterior to this fissure occurs for articulation. it is metaphonological). The task is formally similar to the one used by Zatorre et al. Interestingly. but it concerns syllables rather than phonemes. using visual input implicated the temporoparietal cortex. and it may therefore be much closer to perception). the patient was able. but it requires only the global matching of two conscious percepts (in this last sense. the neural circuits that subserve articulation appear to host phonemic awareness processes in people who know an alphabet. a rhyming task. We predict that it would not yield the anterior activation that Zatorre et al. Kolinsky listening. The neuro-anatomical and neuropsychological data available up to now do not indicate a consistent dissociation between phonological and metaphonological representations as unitary ensembles. 
The discrimination task implies an intentional judgement (in this sense. as reported by Petersen et al. as could be expected. Among phonological units and properties. (1989). This prediction is based on the fact that. As Petersen et al. Worth noting also is a case of word deafness in a patient with a left temporoparietal infarct. however. found. & Crul. presumably prior to the second stroke.356 J. Hagoort. in comparison with the passive listening situation. as shown by other neural imaging studies (cf. (1992).. is that activation elicited by the speech discrimination would be similar to that observed in passive listening. Maassen. R. so that the task may have been accomplished on the basis of the residual auditory and/or phonetic capacities. relatively close-brain areas. that is. 1989). to perform a (in some sense) metaphonological task requiring him to judge whether two disyllabic words began (ended) or not with the same syllable. comment. but not for auditory clicks or tones. "the activation of the left temporoparietal focus during passive auditory word presentation. 1991). despite the auditory and phonetic impairment. phonemes may be the only ones to present such a dissociation. Very precise phonetic decoding was probably not necessary in this syllablematching task.

& Kolinsky. Our results with illiterates using the migration phenomenon on the one hand.a land that remains unexplored and that is perhaps unexplorable with our present techniques. in preparation. 1991). the activation of unconscious representations ultimately constrains the elaboration of conscious representations. References Bisiacchi. Scliar-Cabral. However. 1992. & Denes. Walley. as well as orthographic effects. are consistent with the neural data. and even when the stimuli are simply presented against a noise background (Castro. from the processes of phonetic decoding. Castro. However. and Ressler's (1982) observation that listeners may avoid the illusion of phoneme restoration by focusing attention on the critical phoneme. (1989). Remember that orthographic representations may also influence rhyming judgements on spoken words (Seidenberg & Tanenhaus. As we discuss elsewhere (Morais & Kolinsky. Morais. and therefore it deserves further and more systematic exploration. the dependency relation may be trivial. 41 A. It would have a perceptual as well as a postperceptual reality. It seems that it is not a mere convention. the acquisition of phonemic awareness may elicit supplementary and perhaps more efficient procedures to cope with spoken words. 1979).. L. In languages in which perceptual processing at the phonemic level may be crucial for the quality of the global conscious speech percept. We found evidence for a (sometimes useful) strategy of listening based on attention to the phonemic structure of words in the dichotic listening situation. Other indications that attentional focusing on phonemes may lead to improved word recognition include Nusbaum. the stage of processing at which these influences occur remains an open question. written word input does not seem to activate the areas devoted to the perceptual processing of spoken words. Quarterly Journal of Experimental Psychology. 
The reverse dependency relation is theoretically more interesting.Perception and awareness in phonological processing 357 representations of phonemes appear to be selectively dissociated. Castro. . and the conscious phonemic manipulation tasks on the other hand.. The effects of strategies based on conscious phonemic representations. detection of phonemes (Taft & Hambly. 1987). Cipolotti.S. & Content. What functional dependency relations might the conscious and unconscious representations of phonemes entertain with each other? From unconscious to conscious representations. in spite of attempted murder. To conclude. Castro & Morais. Carrell. 293-319. the phoneme is still alive. Kolinsky. Impairment in processing meaningless verbal material in several modalities: The relationship between short-term memory and phonological skills. in press). may take place between perception and recognition . 1985) as well as the occurrence of phonological fusions in dichotic listening (see Morais. G. on a neural basis. P.

Cary.W. J.-L. & Kolinsky. Literacy training and speech segmentation. Morais. P. K. Morais. Morais." Cognition. R. S. J. 115. Cary. Nusbaum. Howard. de Gelder & J. & Raichle. Siqueland.. Research on Speech Perception Progress Report (Vol. Indiana University. 1769-1782. Alfabetizacao da fala. 473-485. T. Leong (Eds. Radeau (Eds.. Alegria. 45A. & Paiva. Cognitive Neuropsychology.. & Cluytens. A. J.E. P. L. Alegria. J. W. L.. NJ: Erlbaum. language and literacy.C. L. 24..C. Radeau (Eds.. J. Friston. Alegria. Positron emission . Brown. & Alegria. Kolinsky. La reconnaissance des mots dans les differentes modalites sensorielles: Etudes de psycholinguistique cognitive (pp. L.R. (1979). (1992). Phonology: A cognitive view. D. 39A. (1992).. Perfetti. R. 451-465. Patterson. Morais.. How direct is visual perception? Some reflections on (jibson's "Ecological Approach. Holender. Unpublished manuscript. Brain. M.C.M. M. 59-70. Mintun..E. P. J. S. P. (in preparation).. M. Fodor. La reconnaissance des mots chez les adultes illettres. L. Intermediate representations in spoken word recognition: Evidence from word illusion. & Frackowiak.A. (in press). P. Phonemic awareness.L. In B.. pp. Speech perception in infants. 353-372. Morais. 133-149). Morais... (1989)..M. (1992).T. & Bertelson. Porto: Instituto Nacional de Investigacjio Cientffica.). 45-64. Dordrecht: Kluwer. S. & Content.. & Grainger.. 3.).. S.. Phonology and orthography in visual word recognition: Evidence from masked nonword priming.. Journal of Memory and Language. Kolinsky. Speech and reading: Comparative approaches. (1992). Bloomington. J. & Morais. (1986). Phonological ALEXIA or PHONOLOGICAL alexia? In J. Bell... L. & Vigorito. 303-306.D. In R. K. Joshi & C..-L.. & Marcel. J. The phoneme's perceptual reality exhumed: Studies in Portuguese. Morais. Berlin. Kolinsky.). Petersen. Amsterdam: North-Holland.. Castro.. (1991). L. S. 
Phonetic activation during the first 40 ms of word identification: Evidence from backward masking and priming... 171. (1993). Re-education of a deep dyslexic patient: Rationale of the method and results. Analytic approaches to human cognition (pp. Z. Hillsdale. C. (1987). Weiller. R.P.. 8. Bertelson. (1971). IN: Department of Psychology. H. Kolinsky. J. R. 8. D. C.L. J.. Paris: Presses Universitaires de France. A. In R. Quarterly Journal of Experimental Psychology. Reading disabilities: Diagnosis and component processes (pp. Scliar-Cabral. In J.. Kolinsky. Jun?a de Morais. Intermediate representations in spoken word recognition: A cross-linguistic study of word illusions. & Kolinsky. R. (in press). & Pylyshyn. & Bell. Jun?a de Morais.... R. Analytic approaches to human cognition (pp.. 149-177. Jusczyk. Kolinsky Castro.). Morais. 83-103). Conjunctions errors as a tool for the study of perceptual processes. J. (1987).K. London: Erlbaum. Morais. Kaye. 323-331. Journal of Memory and Language.. & Morais. Morais (Eds. (1992).358 J. R. E. Quarterly Journal of Experimental Psychology.I. M. Posner. Morais. Awareness of words as phonological entities: The role of literacy. 59-80). & M. R. Journal of Memory and Language. Kolinsky. The consequences of phonemic awareness.. J. J. W. Cary. Kolinsky. & Ressler. 27. Ferrand. Science.. J.). 175-184). (1991). (1981). Segui (Eds. S. Castro. 7. R. J.. J. Perfetti. Automatic (prelexical) phonemic activation in silent word reading: Evidence from backward masking. D. R. de Partz. Fox. & Morais. K. (1988). 731-734). 9.A..D. & M. J. Proceedings of the 3rd European Conference on Speech Communication and Technology: Eurospeech'93 (pp. Eimas. Controlled perceptual strategies in phonemic restoration. (1982). J. 30. 223-232. Patterson. Wise. 139-196. Evidence for global and analytical strategies in spoken word recognition. J. (1989). A. (1986). The cortical localization of the lexicons. C. Amsterdam: North-Holland. Walley. 
The effects of literacy on the recognition of dichotic words. R. 133-149). Holender.... Applied Psycholinguistics. Cognition. (1993). & J. J. Castro.H. J. & Delaney. M. Does awareness of speech as a sequence of phones arise spontaneously? Cognition. Carrell.

tomographic studies of the processing of single words. Journal of Cognitive Neuroscience, 1, 153-170.
Praamstra, P., Hagoort, P., Maassen, B., & Crul, T. (1991). Word deafness and auditory cortical function. Brain, 114, 1197-1225.
Read, C., Zhang, Y., Nie, H., & Ding, B. (1986). The ability to manipulate speech sounds depends on knowing alphabetic writing. Cognition, 24, 31-44.
Seidenberg, M.S., & Tanenhaus, M.K. (1979). Orthographic effects on rhyme monitoring. Journal of Experimental Psychology: Human Learning and Memory, 5, 546-554.
Shallice, T., McLeod, P., & Lewis, K. (1985). Isolating cognitive modules with the dual task paradigm: Are speech perception and production separate processes? Quarterly Journal of Experimental Psychology, 37A, 507-532.
Taft, M., & Hambly, G. (1985). The influences of orthography on phonological representations in the lexicon. Journal of Memory and Language, 24, 320-335.
Treisman, A., & Schmidt, H. (1982). Illusory conjunctions in the perception of objects. Cognitive Psychology, 14, 107-141.
Warren, R.M. (1983). Multiple meanings of "phoneme" (articulatory, acoustic, perceptual, graphemic) and their confusions. In N.J. Lass (Ed.), Speech and language: Advances in basic research and practice (Vol. 9, pp. 285-311). New York: Academic Press.
Zatorre, R.J., Evans, A.C., Meyer, E., & Gjedde, A. (1992). Lateralization of phonetic and pitch discrimination in speech processing. Science, 256, 846-849.

19 Ever since language and learning: afterthoughts on the Piaget-Chomsky debate

Massimo Piattelli-Palmarini*
Dipartimento di Scienze Cognitive, Istituto San Raffaele, Via Olgettina 58, Milano 20132, Italy
Center for Cognitive Science, MIT, Cambridge MA 02139, USA

*Correspondence to: Massimo Piattelli-Palmarini, Dipartimento di Scienze Cognitive, Istituto San Raffaele, Via Olgettina 58, 20132, Milano, Italy.
I am in debt to Thomas Roeper for his invitation to give a talk on the Piaget-Chomsky debate to the undergraduates in linguistics and psychology, at the University of Massachusetts at Amherst, in April 1989. The idea of transforming it into a paper came from the good feedback I received during that talk, and from a suggestion by my friend and colleague Paul Horwich, a philosopher of science, who had attended, assuming that such a paper could be of some use also to the undergraduates at MIT. Steven Pinker reinforced that suggestion. Noam Chomsky carefully read the first draft, and made many useful suggestions in the letter from which I have quoted some passages here. Jerry Fodor stressed the slack that has intervened in the meantime between his present position and Chomsky's, inducing me to revise sections of the first draft (perhaps the revisions are not as extensive as he would have liked). Morris Halle and David Pesetsky also offered valuable comments and critiques. The ideas expressed here owe a lot to a lot of people, and it shows. I wish to single out, however, my special indebtedness to Noam Chomsky, Jerry Fodor, Jacques Mehler, Jim Higginbotham, Luigi Rizzi, Ken Wexler, Laura-Ann Petitto, Lila Gleitman, Paul Horwich, Steve Gould and Dick Lewontin. The work I have done during these years has been generously supported by the Alfred P. Sloan Foundation, the Kapor Family Foundation, the MIT Center for Cognitive Science, Olivetti Italy and the Cognitive Science Society. I am especially indebted to Eric Wanner for initial funding.

Abstract
The central arguments and counter-arguments presented by several participants during the debate between Piaget and Chomsky at the Royaumont Abbey in October 1975 are here reconstructed in a particularly concise chronological and "logical" sequence. Once the essential points of this important exchange are thus clearly laid out, it is easy to witness that recent developments in generative grammar, as well as new data on language acquisition, especially in the acquisition of pronouns by the congenitally deaf child, corroborate the "language specificity" thesis defended by Chomsky. By the same token these data and these new theoretical refinements refute the Piagetian hypothesis that language is constructed

upon abstractions from sensorimotor schemata. Moreover, in the light of modern evolutionary theory, Piaget's basic assumptions on the biological roots of cognition, language and learning turn out to be unfounded. In hindsight, all this accrues to the validity of Fodor's seemingly "paradoxical" argument against "learning" as a transition from "less" powerful to "more" powerful conceptual systems.

1. Introduction

This issue of Cognition offers a rare and most welcome invitation to rethink the whole field in depth, and in perspective. A fresh reassessment of the important Royaumont debate (October 1975) between Piaget and Chomsky may be of interest in this context. After all, the book has by now been published in ten languages, and it has been stated (Gardner, 1980) that the debate is "certainly a strong contender, as the initial milestone in the emergence of this field" (i.e., cognitive science). It is not for the co-organizer, with Jacques Monod, of that meeting, or for the editor of the proceedings (Piattelli-Palmarini, 1980) to say how strong the contender is. It is a fact, however, that many of us have witnessed over the years many impromptu re-enactments of arguments and counter-arguments presented in that debate, and that if one still wants to raise today the same kind of objections to the central ideas of generative grammar as Piaget, Cellerier, Papert, Inhelder, and Putnam raised at the time, the most effective counters to those objections are still basically the same that Chomsky and Fodor offered at Royaumont. In fact, one cannot possibly do a better job than the one they did. That debate also foreshadowed, for reasons that I shall come back to, much of the later debate on the foundations of connectionism (Pinker & Mehler, 1988; Fodor & Pylyshyn, 1988). Moreover, as time goes by, it is increasingly clear that the pendulum is presently swinging towards the innatist research program in linguistics presented at Royaumont by Chomsky (and endorsed by Mehler with data on acquisition), and away from even the basic, and allegedly most "innocent", assumptions of the constructivist Piagetian program.

Lifting, at long last, the self-imposed neutrality I considered it my duty to adopt while editing the book, I say here explicitly, and at times forcefully, what I studiously avoided to say there and then. What I will attempt to do here is support the Chomsky-Fodor line with further evidence that has become available in the meantime. I also wish to highlight some recent developments in linguistics and language acquisition that bear clear consequences on the main issues raised during the debate.

2. The debates within the debate

In hindsight, it is important to realize that there were at least four distinct Royaumont debates eventually collapsing into one, a bit like a swarm of virtual

particles collapsing into a single visible track in modern high-energy laboratories: the event that actually happened, the one Jean Piaget hoped would happen, the one which we, the organizers, thought would happen, and the one that Chomsky urged everyone not to let happen. Let me digress for a moment and sketch also these other "virtual" debates.

During the preparatory phase, Piaget made it clear that it had been his long-standing desire to meet with Chomsky at great length, and witness the "inevitable" convergence of their respective views. It was his original wording that there had to be a "compromis" between him and Chomsky. In fact, this term is recurrent throughout the debate. Piaget assumed that he and Chomsky were bound to agree in all important matters. As Piaget states in his "invitation" paper,¹ he thought there were powerful reasons supporting his assumption. I will outline these reasons in a simple sketch.

¹ In Language and Learning: The Debate between Jean Piaget and Noam Chomsky (hereinafter abbreviated as LL), pp. 23-24.

Reasons for the "compromise"

Piaget's assessment of the main points of convergence between him and Chomsky
- Anti-empiricism (in particular anti-behaviorism)
- Rationalism and uncompromising mentalism
- Constructivism and/or generativism (both assigning a central role to the subject's own internal activity)
- Emphasis on rules, principles and formal constraints
- Emphasis on logic and deductive algorithms
- Emphasis on actual experimentation (vs. armchair theorizing)
- A dynamic perspective (development and acquisition studied in real time, with real children)

Piaget's proposal was one of a "division of labour", he being mostly concerned with conceptual contents and semantics, Chomsky being (allegedly) mostly concerned with content-independent rules of syntactic well-formedness across different languages. Piaget considered that the potentially divisive issue of innatism was, at bottom, a non-issue (or at least not a divisive one) because he also agreed that there is a "fixed nucleus" (noyau fixe) underlying all mental activities, language included, and that this nucleus is accounted for by human biology. The only issue, therefore, was to assess the exact nature of this fixed nucleus and the degree of its specificity. The suggestion, voiced by Cellerier and Toulmin, was to consider two "complementary" strategies: the Piagetian one, which consisted of a minimization

of the role of innate factors, and the Chomskian one, consisting of a maximization of these factors: once more, a sort of division of labor.

It was interesting for all participants, and certainly unexpected to Piaget, to witness that, during the debate proper, the constant focus of the discussions was on what Piaget considered perfectly "obvious" ("allant de soi"): the nature and origin of this "fixed nucleus". He was heading for severe criticism from the molecular biologists present at the debate (especially from Jacob and Changeux) concerning his views on the origins of the fixed nucleus. And he was heading for major disagreements with Chomsky concerning the specificity of this nucleus. It can be safely stated that, while Piaget hoped for a reconciliatory settlement with the Massachusetts Institute of Technology (MIT) contingent about particular hypotheses and particular mechanisms concerning language and learning (and, in particular, the learning of language), he found himself, unexpectedly, facing insuperable disagreement about those very assumptions he hardly considered worth discussing, and which he believed were the common starting point (more on these in a moment). Piaget's imperception of these fundamental differences was, in essence, responsible for the vast gap between the debate he actually participated in, and the virtual debate he expected to be able to mastermind.

One of Piaget's secrets was his deep reliance on the intuitive, unshakeable truth of his hypothèses directrices (guiding hypotheses). These were such that no reasonable person could possibly reject them, not if he or she actually understood what they meant. One had the impression that, to the very end, Piaget was still convinced he had been misunderstood by Chomsky and Fodor. In Piaget's opinion, had they really understood his position, then it would have been unthinkable that the disagreement could still persist.

One could single out the most fundamental of Piaget's assumptions (Piaget, 1974) in words that are not his own, but which may well reflect the essence of what he believed:

Piaget's guiding hypothesis (hypothèse directrice)
- Life is a continuum
- Cognition is an aspect of life
therefore
- Cognition is a continuum

This is a somewhat blunt rendition, but it is close enough to Piaget's core message. Some of his former collaborators in the Geneva group, in 1985, expressed basic agreement that this was "a fair rendition" of Piaget's hypothèse directrice (as expressed, for instance in his 1967 book Biologie et Connaissance).²

² Bärbel Inhelder, personal communication.

As any historian of medieval logic could testify, if literally taken this version is a well-known logical fallacy (compare with the following):
- New York is a major metropolis
- Central Park is part of New York
therefore
- Central Park is a major metropolis

Decidedly, one does not want to impute to Piaget and his co-workers assent to a logical fallacy. Thus stated, it cannot pass as a "fair" reformulation. That would be too devious. A better reformulation, one that passes the logical test, would be the following:

A better heuristic version of Piaget's core hypothesis
- Life is (basically) auto-organization and self-stabilization in the presence of novelty
- Cognition is one of life's signal devices to attain auto-organization and self-stabilization
therefore
- Cognition is best understood as auto-organization and self-stabilization in the presence of novelty

This much seemed to Piaget to be untendentious and uncontroversial, but also very important. He declared, in fact, that this central hypothesis had guided almost everything he had done in psychology. In order better to understand where the force of the hypothesis lies, one must remember that he unreservedly embraced other complementary hypotheses and other strictly related assumptions. Here they are (again in a succinct and clear-cut reformulation):

Piaget's additional assumptions
I Auto-organization and self-stabilization are not just empty metaphors, but deep universal scientific principles captured by precise logico-mathematical schemes.
II There is a necessary, universal and invariable sequence of stepwise transitions between qualitatively different, fixed stages of increasing self-stabilization.
III The "logic" of these stages is captured by a progressive hierarchy of

The random process of standard Darwinian evolution is unable in principle (not just as a temporary matter of fact. Piaget's crucial assumptions about language The basic structure of language is continuous with. Corollary V Another theory of biological evolution is needed (Piaget's "third way". IV The necessary and invariant nature of these transitions cannot be captured by the Darwinian process of random mutation plus selection. and also constitute the logical premise of linguistic 3 LL. moreover they follow one another in a strict unalterable sequence. had their say. which subsume as particular instances the concepts and schemes of the previous stage. obviously. take place through the subject's active effort to generalize. various sensorimotor schemata. as we will see in a moment. and which grants the "necessity" of the mental maturational stages. Within this grand framework. p. 59. it is useful to emphasize what were Piaget's specific assumptions concerning learning and language: Piagefs crucial assumptions about learning The transitions (between one stage and the next) are formally constrained by "logical necessity" (fermeture logique) and actually. The transition is epitomized by the acquisition of more powerful concepts and schemes. The sensorimotor schemata are a developmental pre-condition for the emergence of language. differing both from Darwin's and Lamarck's). due to the present state of biology) to explain this strict "logical" necessity.366 M. unify and systematize a wide variety of different problem-solving activities. and is a generalizationabstraction from. Piaget believed that there is a kind of evolution that is "unique to man". equilibrate. One the last two points the biologists.3 These are what they are. "dynamically". . Piatelli-Palmarini inclusion between ascending levels of abstraction and generalization (each stage contains the previous one as a sub-set). and could not be anything else.

who boldly undertook the task of systematically defending Piaget against the onslaught. the one which the organizers molecular biologists with a mere superficial acquaintance of cognitive psychology and linguistics . . on account of the insurmountable problems presented by his core tenets. with Piaget growing increasingly impatient to pass onto more important and more technical matters. except possibly to Piaget himself (see his "Afterthoughts"). is: ignorance. though some cultures may fail to attain the top stages. 278-284.Ever since language and learning: afterthoughts on the Piaget-Chomsky debate 367 structures (word order. believe for a moment that some form of compromise could be reached? The simple answer to this. Syntax is derivative from (and a "mirror" of) these. because they too anticipated some kind of convergence. and so on). I think I can faithfully reconstruct it in a few sentences: What we (the biologists) thought we knew About Piaget: -There is a stepwise development of human thought. It was inevitable that Piaget should meet strong opposition on each of these assumptions. the agent/patient/instrument relation. How could that be? How could we.believed they were organizing.4 that no compromise could possibly be found. Conceptual links and semantic relations are the prime movers of language acquisition. as I said. It was closer to what Piaget had in mind than to the debate that actually took place. notably in their many spirited exchanges with Seymour Papert. the whole debate turned only on these assumptions. another virtual debate. but failing to do so. the biologists in the group. in retrospect. through fixed. In a sense. on their alleged joint force and on the overall structure of his argument. the subject/verb/object construction. What we thought we knew about the two systems was simple and basic. pp. 3. The debate was not the one Piaget had anticipated.N o t everything that appears logical and necessarily true to us adults is so 4 LL. 
Another virtual debate: the one the organizers thought they were organizing There was. from infancy to adulthood. . qualitatively dif