Edited by
Notice:
Product or corporate names may be trademarks or registered trademarks, and are used
only for identification and explanation without intent to infringe.
Acknowledgments
List of Figures
4 Social Actions, Social Commitments
Herbert H. Clark
Part 2: Psychological Foundations
6 The Developmental Interdependence of Theory of Mind and Language
Janet Wilde Astington
9 The Thought that Counts: Interactional Consequences of
Variation in Cultural Theories of Meaning
Eve Danziger
Index
Acknowledgments
N. J. E. and S. C. L.
Reference
Kockelman, P. 2005. The semiotic stance. Semiotica 157(1–4):233–304.
List of Figures
I.1. Interlocking concepts developed in the different chapters.
ToM = Theory of Mind.
1.1. Kpémuwó, deaf home signer on Rossel Island, inventing a
way to communicate about abstract ideas concerning sorcery.
1.2. Guugu Yimithirr person reference sequence.
1.3. Rossel Island name-avoidance sequence.
3.1. Requesting the gaze of a hearer.
3.2. Displaying slots and alternatives.
3.3. Decomposing a noun phrase.
3.4. Multimodal assessment.
3.5. Topic initiation.
3.6. Pointing toward a distant alternative.
3.7. A complex gesture sequence grounded in a nonpresent space.
5.1. Pointing across trials: Mean proportion of trials in which
infants pointed at least once.
5.2. Point repetitions within trials: Mean number of points per
trial with at least one point.
5.3. Looking behavior across trials with a point: Mean number
of looks to E per trial with a point.
5.4. Schematic setup of Study 2 with barriers.
5.5. Still-frame showing a 12-month-old pointing for the
experimenter to one of two objects on the shelves behind her.
5.6. Mean percent of trials with the first point to target or
distractor.
8.1. Selective imitation of the modeled "head action" by
14-month-olds in the "hands occupied" vs "hands free"
demonstration conditions.
11.1. DC (the shaman) and patient's wife (number 200).
11.2. DC's (the shaman) altar (number 465).
12.1. Old technology: text typing tool.
12.2. New technology: Face-to-machine with signs.
12.3. Rose signing BABY at chin height rather than waist height.
12.4. Sign made is oriented to webcam on top of computer.
12.5. Rose signing O-K near webcam location.
12.6. The signer (right) has a mirror image view of his own actions.
12.7. The top image shows the signer's sign space is not in
camera range (note difference between the two participants).
12.8. Text messaging and signing.
12.9. Modeling individual aspects of a joint activity.
14.1. Navigation team on the bridge of a navy ship.
14.2. Enacting provisional lines of position. The bearing
recorder is completing his conversation turn while
plotter positions his hand to take the (gesture and talk) floor.
14.3. Three lines of position fix the position of the ship
(represented by the triangle). The anticipated course
extends from the fix triangle to the estimated position,
EP (half-circle), where the ship is expected to be at the
time of the next fix.
14.4. The dashed lines indicate a poorly chosen pair of
landmarks for the next fix. The angles of intersection
among the LOPs should be open.
14.5. The trajectory of the bearing recorder's gesture is complex.
14.6. The trajectory of the bearing recorder's gesture as it was
At the heart of the uniquely human way of life is our peculiarly intense,
mentally mediated, and highly structured way of interacting with
one another. This rests on participation in a common mental world, a
world in which we have detailed expectations about each other’s behavior,
beliefs about what we share and do not share in the way of knowledge,
intentions, and motivations. That itself relies both on communication
(linguistic and otherwise) and on a level of cooperation unique in the
animal world. This mode of cooperative, mentally mediated interaction
enables the accumulation of cultural capital and the historical emergence
of cultures. By inheriting a world of social organizations and values,
individuals are released from reinventing the wheel. In turn, cultural
capital shapes the style of interaction in local social groups, hiding
shared commonalities behind the veil of distinct languages, cultural
styles, and forms of social organization.
This, at least, is the thesis of this book. It brings together anthropologists,
linguists, psychologists, and sociologists whose work has not been
juxtaposed before. When we put the pieces of the jigsaw together,
what emerges is a new map of a still underexplored terrain—the roots or
foundations of human sociality.1 We propose that this is a new scientific
domain, a coherent subject for investigation constituted by intersecting
principles of different orders (ethological, psychological, sociological,
and cultural) that work together to produce an emergent system, a
system of human sociality and social interaction.
In this introductory chapter, we want to give readers a sense of how
the rest of the chapters fit together to form an outline of this domain.
We first sketch some contributing research traditions and the ways they
fit together. We go on to delineate the different phenomena that are the
focus of the individual chapters, drawing attention to the connections
that run through them. Finally, we sketch our own synthesis of the
domain.
The ideas in this book ramify and connect with one another in
multiple ways. Although no linear order of chapters could capture such
a network of connections, our division of the book into five parts aims
to emphasize certain linking themes. Part 1 consists of four chapters
focusing on central properties of face-to-face interaction, the arena in
which human sociality is centrally exercised. Part 2 focuses on
psychological foundations of human sociality, exploring the question of just
what it takes to pull off human interaction as we know it. Part 3 deals
with issues of culture and cultural difference, and the ways sociocultural
forces may play a role in structuring interaction and interactional
expectations, and vice versa. Part 4 explores ways in which cognition
is defined by its being exercised in social interaction, and how the
social exercising of cognition has effects both on our understanding of
the individual’s psychology (part 2) and of the higher levels of social
organization and broader cultural conventions (part 3). Part 5 features
phylogenetic perspectives, with two chapters asking how key features
of the human system for interaction could have evolved and a third
chapter comparing human social abilities with those of the other great
apes.
Gricean Pragmatics
A second line of work guiding the debates in this book originates in
philosophy, specifically in H. P. Grice’s (1957) idea that meaning is
grounded in the recognition of intention. Seeing you fall over ahead
of me up a steep path, I am relieved to see you get up and wave in
my direction, taking your wave as designed to make me think you are
OK. The wave works because you have correctly calculated that I will
recognize the plan behind your action, namely getting me to recognize
that you intend me to think you are OK. In this example, the wave has
a nonce or one-off meaning recoverable against a background of your
figuring what I would figure when I see it.
Grice’s idea is important because it shows meaning, in a broad sense,
to be independent of language or convention. This points to possible
precursors to conventional meaning, in ontogeny, diachrony, and,
perhaps, phylogeny. On this account, meaning is not a property of
signs or symbols, but a property of minds in (mediated) interaction
with other minds. Conventional meanings can be thought of as arising
from repeated use of what were once novel signals. If I fall down and
likewise wave, we might set up a miniconvention that then spreads
through the community of hikers.
Another important aspect of this psychologizing of meaning is that it
allows us to analyze the unspoken communicative contents associated
with conventional symbols. For conventional meanings never exhaust
the import of what is said. The simplest utterance usually carries with
it a penumbra of intended but unspoken thoughts. (Consider “What are
you doing tonight?”, which is likely to be forecasting an invitation, not
simply asking a question.)
The whole business of exchanging intentions in communication relies
on background assumptions that help to narrow the range of intention
attribution. Grice (1975) suggested that the essential background
assumption by which interactants constrain and guide their inferences
about speaker intentions is a principle of cooperation. (The principle,
comprising maxims of quality, quantity, relevance, and manner, has
since been updated in modern recastings such as Levinson 2000 and
Sperber and Wilson 1986.) Recipients of others’ signals work on the
assumption that such signals have been designed specifically for them
to extract the intended meaning. In turn, senders of such signals design
those signals in such a way as to take into account such an expectation of
targeted design on the part of hearers. By a principle of audience design
(or “recipient design”; Sacks and Schegloff 1979), any utterance should
have been formulated by a speaker with the intention that it cause just
the right effect in the receiver, taking into account the common ground
of the particular combination of speaker and addressee(s). For example,
in telling you something about my colleague John, I will first refer to
him in a way appropriate to your knowledge of him—for example, as
John if we commonly know him as John, but as, say, a colleague of mine
if I suppose you have never met him (Enfield and Stivers in press).
In sum, Gricean principles require the modeling of others’ inner
states, and thus presuppose a ToM. They also entail a stock of common
ground, readily provided by culture (e.g., that “How do you do?” is not
seeking information, that it is OK to strip down on the beach but not
on the street, or that sweet desserts come after savory main courses;
Enfield 2000; Levinson 1995). E. Goody (1995) suggests that the entire
structure of social roles in a society should be understood against this
background, providing systematic constraints on appropriate social
intentions and their ascription.
Microanalysis of Social Interaction
Detailed study of the systematics of social interaction in its own right was
initiated by a string of 20th-century mavericks including G. Bateson, R.
Birdwhistell, H. Garfinkel, and E. Goffman. The study of the systematics
of social interaction has since passed largely to conversation analysts
and other students of talk and action in interaction, with research
resulting in a detailed inventory of observed interactional practices and
patterns.4 Most of these practices can be characterized as sequences of
interlocking social actions (e.g., turns at talk) whose interpretations are
associated with specific (sometimes culture-specific) expectations and
preferences. The taking of turns at talk, the openings and closings of
conversations, the structure of request sequences, practices for correcting
or repairing utterances, and so on, have been carefully explored in
English-language conversation. There is also an increasing knowledge of
how these things work in other languages. Emerging from this research
are candidate universals for the organization of human interaction
(see Schegloff), such as the mechanism for transition of turns at talk
in informal conversation, and the ways in which interlocutors correct
and repair their own and others’ utterances—a crucial mechanism for
maintaining intersubjectivity. There is a strong expectation that such
structures should be universal, given that they are essential for preserving
order and agreement in moment-by-moment social experience. (See
Goffman 1981:14, for a list of such “system requirements and system
constraints,” including “framing capabilities,” Gricean principles, and
“nonparticipant constraints.”)
Conversation analysts try to avoid the psychological turn that characterizes
ToM research and Gricean pragmatics. They prefer to talk in terms of
actions as recognizable through the details of their observable structure
and their specific placement in sequences of action. But they are equally
interested in intersubjectivity, the way in which a shared understanding
is arrived at. Hence the special interest in “intercalibrative” mechanisms
like repair and audience design.5
ToM, 1—The Suite of Capacities and the Role of Language (Astington) The
phenomenon of pointing takes us more centrally to how “mind reading”
may work. The foundations here are (1) having a grasp that others have
mental states and (2) recognizing that these may diverge from one’s own.
This must involve an awareness of one’s own kinds of mental states,
and arguably the ability to employ such mental states in explaining
the actions of others. Astington’s chapter reviews what is known about
children’s development of such a ToM. A number of researchers (see
Gergely and Csibra, Liszkowski, Tomasello) believe that human infants
first grasp the nature of the other as an intentional agent from about
nine months. But it is also widely accepted that a fully comprehensive
ToM, as indicated by false belief understanding, is slow to mature,
coming significantly after the full essentials of language are in place.
Astington argues that language plays a key role. She reviews three ways
in which this has been proposed in existing literature: knowledge and
use of mental state verbs with meanings like “want,” “think,” “know,”
and “believe”; knowledge and use of the complex syntactic structures
associated with these mental state predicates; and firsthand experience
of face-to-face conversation.
ToM, 2—Consequences of Language Deficit (Pyers) Pyers’s chapter narrows
in more closely on the relation between language and ToM, with a case
study of a Nicaraguan sign-language community. Pyers’s research reveals
startling evidence for the crucial role that language may play in acquiring
ToM capacities. In Nicaragua, a substantial Deaf population was only
in the last decades brought together into a socially networked speech
community, thanks to the establishment of educational institutions for
the Deaf. This has led to the growth of a new natural language known
as Nicaraguan Sign Language, a Creole born of many smaller home-
sign or village-sign systems. The first generation of signers learned what
was effectively a pidgin with limited expressive power. In addition,
they were late learners of language in any form. By contrast, younger
signers of the following generation have had the benefit of exposure
to a developed sign language from a young age. Pyers reports that tests
for ToM capacities show the younger signers to have a significant edge.
The older signers do not master standard false belief tasks. This is prima
facie evidence that language plays a determining role in the acquisition
and application of ToM.
Imitation and Rational Learning (Gergely and Csibra) A developmental
perspective on the question of how we read intentions into the actions
of others is pursued by Gergely and Csibra. They investigate human
infants’ imitation of adults’ actions, finding that infants do not just
copy actions, but analyze the goal-directedness of others’ behavior and
look for the rationale behind the means chosen for carrying out an
action, imitating selectively as a result. Thus, if a woman whose
hands are occupied turns on a light with her head, an infant imitating this
action will turn on the light with his or her hand (Gergely et al. 2002).
This imitation achieves the same goal (getting the light to go on), but
does not reproduce the means. The child surmises that the adult would
have used her hands if she could have: that is, given that her hands were
occupied, the woman’s unusual action of using her head is rational. But in
a different experimental condition, in which the adult has hands free
and yet turns on the light using her head, the infant will use his or her
head as well in imitating this. In this case, the woman could have used
her hands to do the action, but does not. The child extracts a different
rationale for the marked manner of action, surmising that it was this
unusual manner that was intended (i.e., here the adult chooses to use
the head and not the hands), thus being a defining and not merely
contingent part of the action the adult performed.
The possibility for rational learning of this kind is critical for the
acquisition of culture. Cultural actions have both rational means–ends
aspects (like collecting food and preparing it to eat) and nonrational,
culturally constrained aspects (like eating with a knife and fork rather
than with the fingers). Our children have to acquire both. In each case,
the process of acquisition involves intention attribution based on direct
observation of others’ actions (cf. the description of action parsing in
Byrne, and syntactic parsing in Goodwin). It happens that sometimes
part of the goal of an action is that the action be done in a specific
manner. This applies in the case of culturally stylized action.
Cooperative Instincts and Group Selection (Boyd and Richerson) The mutual
commitment characteristic of human interaction (see Clark, Goodwin)
points to a classic puzzle in evolutionary theory: the riddle of human
cooperative behavior. Why are people so highly cooperative, when,
for an individual, it should always pay to take the benefits of others’
cooperative acts without reciprocating? The answer supplied in Boyd
and Richerson’s chapter is that cooperative behavior is instinctual. (This
is supported by work presented in a number of other chapters: Gergely
and Csibra, Liszkowski, and Tomasello report cooperative acts by infants
of around one year of age.) Boyd and Richerson discuss experimental
findings that adults, from societies of different kinds around the world,
do not maximize their own gains but, instead, feel an obligation to share
hidden benefits (Henrich et al. 2004). If our brand of cooperation is a
species-specific instinct, we then face the evolutionary puzzles: What
would have been the selective advantage of cooperative sociality for
the individual? How did the mechanisms that drive it develop?
Boyd and Richerson argue that group selection provides an account
for the evolution of human cooperative instincts. Group selection is an
unusual mechanism for evolutionary change, in which behavior shared
by a group, rather than by an individual or his or her immediate kin, gives
the entire group advantages over other groups. Because of its marginal
status as an evolutionary mechanism, group selection presupposes
earlier cultural adaptations that would have given sufficient adaptive
advantage to the group as a whole as well as behaviors that signal and
maintain boundaries between groups. Thus, the cognitive prerequisites
for cultural learning (see Byrne, Gergely and Csibra, Tomasello) would
have been essential for the evolution of cooperative instincts.
Conclusion
The kind of synthesis we propose offers a closer integration of the
contributing research traditions. So psychological approaches will
benefit from expertise at the level of the interaction matrix. For example,
work on infant pointing gestures (see Liszkowski, Tomasello)
should be alert to the sequential contexts in which they occur and
on which their interpretation may crucially depend. Conversely, work
on the interaction matrix will be enriched by understanding what is
(psychologically) under the hood. Observational work on sequences of
interaction has revealed many kinds of contingencies between actions
in interaction (e.g., question–answer sequences or greetings), but we do
not know how some of these implicit classifications (e.g., of an action
as an X or a Y) are achieved online. We know little about the sources
and development in infancy of skills in navigating finely timed
and contingent interactional sequences such as conversational turn
taking. Do such skills have an instinctual basis, or are they built during
development on a more primitive instinctual testing of contingencies
in the physical world? We know that interactants are highly sensitive
to others’ mental states, but we do not know how these registers of
information for potential interlocutors are constructed or assessed—
experimental techniques will be critical here.
At another level, that of the sociocultural frame, the interaction
matrix offers insights into how cultural events and processes are
actually constructed. Slight modifications of a universal generic base
for conversational organization can yield all sorts of specific speech
events. For example, restricting interchanges to questions and answers
can give us a basis for courtroom interrogation or classroom teaching—
further assigning rights to question, and the role of overhearers, can
help us distinguish the conduct of the two cultural event types. Tracing
further back, if we know the psychological or developmental sources of
those universal tendencies, we might understand universal constraints
on social organization. Conversely, the analysis of social organization
can inform the conduct of interaction in myriad ways, helping us
understand background assumptions operative within specific events,
the choice of language and social role, and the like.
This raises an apparent tension in this volume between those who
emphasize the individual’s psychological abilities and those who focus
on the emergent properties of the interaction matrix, or the way in
which social interaction is adapted to local sociocultural organization.
We do not regard this as simply border warfare, with rival definitions
of Durkheim’s “psychological” versus “social facts.” Rather, it reflects a
disagreement about the primacy of one or other of the three levels—the
individual, the interactional, and the sociocultural. When A asks B a
question, and B answers it, is this because B discerns A’s intentions (a
psychological level of explanation)? Is it because B follows the rules of the
language game (an interactional level of explanation)? Or is it because
B recognizes that A is endowed with the social rights and authority to
ask that kind of question in the current situation (a sociocultural level
of explanation)? Different researchers rightly test the power of their
own lines of explanation by pushing the limits, and they are likely to
favor one or another level of explanation. This area of research is young
enough that there is no consensus about which level should bear the
major burden of explanation for specific phenomena. Thus, although
there are substantive concerns raised in some of the chapters regarding
the applicability of terms and concepts like “intention,” “action,” and
even “cognition,” true reduction to just one level or another is not
going to work: the levels have independent properties but are also
mutually interdependent. The interpersonally emergent interaction
matrix would not be possible without the individually seated interaction
engine, but it is not “generated” by it. The interaction matrix has higher-
order emergent properties, reflected in the way that local outcomes are
contingent on the actions and responses of all the players. Likewise,
although social institutions are realized through interaction, they have
long-term historical roots and interdependence with other aspects of
culture that require an independent level of analysis. For these reasons,
this will remain an interdisciplinary domain of inquiry, requiring input
from disciplines with insights special to the different levels that make it
up. And the contributors to this project will need to learn each other’s
languages if we are going to make real progress.
We thus bring to a close our preview of the range of ideas on human
sociality put forth in the chapters of this book. We hope the volume
does much to spur cross-border commerce between the different fields.
If this can be promoted, we believe that the field of social interaction
research will rightly come to be central in the human sciences, opening
fundamental insights into what kind of a beast we are, and how we
came to have our own uniquely complex form of sociality.
Notes
1. The term sociality is used with a narrower meaning than ours by Henrich
et al. (2004), to refer to cooperative and altruistic instincts, which “deviate
from an axiom of selfishness.” Sussman and Chapman (2004) use the term in
a related way, to refer to the orientation of individuals to group living.
Given that “group-living individuals must forgo some of their individual
freedoms in order to socialize within the ‘group,’ ” Sussman and Chapman’s
sense of “sociality” refers to “the compromises that individuals make, the
mechanisms they use, and the means by which they maintain these social
groups” (Sussman and Chapman 2004:10). Our sense of sociality includes these
features among a broader complex of psychological and social predispositions,
principles of interactional organization, and specific interactional practices.
2. Key references include Premack and Woodruff (1978), Byrne and Whiten
(1988), Astington et al. (1988), Davies and Stone (1995a, 1995b), Whiten and
Byrne (1997), and Carruthers and Smith (1996), among many others.
3. Our use of the term Theory of Mind refers more generally to the full
ensemble of “mind-reading” skills of which false-belief understanding is a
single and late-developing component.
4. Key references include Sacks (1992), Sudnow (1972), Sacks et al. (1974),
Goodwin (1981), Atkinson and Heritage (1984), Button and Lee (1987),
Schegloff (in press), among many others.
5. Interaction analysts have also invested effort in understanding the use
of gesture, gaze, and body position in social interaction (Goodwin 1981;
Schegloff 1984; see also Goodwin, Hutchins). (Psychologists, too, have been
especially interested in gesture; see Goldin-Meadow, Liszkowski, Tomasello.)
These studies underline the multimodal nature of human communication.
Again, there are clear universal tendencies here. For example, in all cultures,
as far as we know, people gesture when they talk, although the exact nature of
gesture, gaze, and body position are very much culturally constrained.
References
Astington, J. W., P. L. Harris, and D. R. Olson (eds.). 1988. Developing
Theories of Mind. Cambridge: Cambridge University Press.
Atkinson, J. M., and J. Heritage (eds.). 1984. Structures of social action:
Studies in conversation analysis. Cambridge: Cambridge University
Press.
Baron-Cohen, S. 1995. Mindblindness: An essay on autism and Theory of
Mind. Cambridge, MA: MIT Press.
Boyd, R., and P. J. Richerson. 2005. The origin and evolution of cultures.
New York: Oxford University Press.
Boyer, P. 1994. The naturalness of religious ideas: A cognitive theory of
religion. Berkeley: University of California Press.
——. 2002. Religion explained: The human instincts that fashion gods,
spirits, and ancestors. London: Vintage.
Button, G., and J. R. E. Lee (eds.). 1987. Talk and social organization.
Clevedon, UK: Multilingual Matters.
Byrne, R. W., and A. Whiten (eds.). 1988. Machiavellian intelligence: Social
expertise and the evolution of intellect in monkeys, apes, and humans.
Oxford: Clarendon Press.
Carruthers, P., and P. K. Smith (eds.). 1996. Theories of Theories of Mind.
Cambridge: Cambridge University Press.
Clark, H. 1996. Using language. Cambridge: Cambridge University
Press.
Davies, M., and T. Stone (eds.). 1995a. Folk psychology. Oxford:
Blackwell.
——, (eds.). 1995b. Mental simulation. Oxford: Blackwell.
De Waal, F. 2001. Pointing primates: Sharing knowledge without
language. Chronicle of Higher Education, January 19: B7–B9.
Dunbar, R., C. Knight, and C. Power (eds.). 1999. The evolution of culture.
New Brunswick, NJ: Rutgers University Press.
Duranti, A. (ed.). 2001. Linguistic anthropology: A reader. Malden, MA:
Blackwell.
Enfield, N. J. 2000. The theory of cultural logic: How individuals
combine social intelligence with semiotics to create and maintain
cultural meaning. Cultural Dynamics 12(1):35–64.
——. 2003. Linguistic epidemiology: Semantics and grammar of language
contact in mainland Southeast Asia. London: Routledge.
——. 2005a. Areal linguistics and mainland Southeast Asia. Annual
Review of Anthropology 34:181–206.
——. 2005b. The body as a cognitive artifact in kinship representations:
Hand gesture diagrams by speakers of Lao. Current Anthropology
41(6):51–81.
Enfield, N. J., and T. Stivers (eds.). in press. Person reference in interaction:
Linguistic, cultural, and social perspectives. Cambridge: Cambridge
University Press.
Geertz, C. 1973. The interpretation of cultures. New York: Basic Books.
Gergely, G., H. Bekkering, and I. Király. 2002. Rational imitation in
preverbal infants. Nature 415(6873):755.
Goffman, E. 1963. Behaviour in public places: Notes on the social organization
of gatherings. New York: Free Press.
——. 1964. The neglected situation. American Anthropologist 66(6):133–
36.
——. 1974. Frame analysis: An essay on the organization of experience.
Boston: Northeastern University Press.
——. 1981. Forms of talk. Philadelphia: University of Pennsylvania
Press.
Goodwin, C. 1981. Conversational organization: Interaction between speakers
and hearers. New York: Academic Press.
——. 1994. Professional vision. American Anthropologist 96(3):606–
633.
——. 2000. Action and embodiment within situated human interaction.
Journal of Pragmatics 32:1489–1522.
——. 2003. Pointing as situated practice. In Pointing: Where language,
culture, and cognition meet, edited by S. Kita, 217–242. Mahwah, NJ:
Erlbaum.
Goody, E. N. (ed.). 1995. Social intelligence and interaction: Expressions
and implications of the social bias in human intelligence. Cambridge:
Cambridge University Press.
Grice, H. P. 1957. Meaning. Philosophical Review 66:377–388.
——. 1975. Logic and conversation. In Speech Acts, edited by P. Cole
and J. L. Morgan, 41–58. New York: Academic Press.
Gumperz, J. J. 1982. Discourse strategies. Cambridge: Cambridge
University Press.
Gumperz, J. J., and D. Hymes (eds.). 1986[1972]. Directions in sociolinguistics:
The ethnography of communication. London: Blackwell.
Henrich, J., R. Boyd, S. Bowles, C. Camerer, E. Fehr, and H. Gintis
(eds.). 2004. Foundations of human sociality: Economic experiments and
ethnographic evidence from fifteen small-scale societies. Oxford: Oxford
University Press.
Hutchins, E. 1995. Cognition in the wild. Cambridge, MA: MIT Press.
Hymes, D. H. (ed.). 1964. Language in culture and society: A reader in
linguistics and anthropology. New York: Harper and Row.
Kockelman, P. 2005. The semiotic stance. Semiotica 157(1–4):233–
304.
Levinson, S. C. 1995. Interactional biases in human thinking. In Social
intelligence and interaction: Expressions and implications of the social
bias in human intelligence, edited by E. Goody, 221–260. Cambridge:
Cambridge University Press.
——. 2000. Presumptive meanings. Cambridge, MA: MIT Press.
Levinson, S. C., and P. Jaisson (eds.). 2006. Evolution and culture.
Cambridge, MA: MIT Press.
Norman, D. A. 1991. Cognitive Artifacts. In Designing interaction:
Psychology at the human-computer interface, edited by J. M. Carroll,
17–38. Cambridge: Cambridge University Press.
Povinelli, D. J., J. M. Bering, and S. Giambrone. 2003. Chimpanzees’
“pointing”: Another error of the argument by analogy? In Pointing:
Where language, culture, and cognition meet, edited by S. Kita, 35–68.
Mahwah, NJ: Erlbaum.
Premack, D., and G. Woodruff. 1978. Does the chimpanzee have a
Theory of Mind? Behavioral and Brain Sciences 1:515–526.
Richerson, P. J., and R. Boyd. 2004. Not by genes alone: How culture
transformed human evolution. Chicago: University of Chicago Press.
Sacks, H. 1992. Lectures on Conversation. London: Blackwell.
Sacks, H., and E. A. Schegloff. 1979. Two preferences in the organization
of reference to persons in conversation and their interaction. In
Everyday language: Studies in ethnomethodology, edited by G. Psathas,
15–21. New York: Irvington.
Sacks, H., E. A. Schegloff, and G. Jefferson. 1974. A simplest systematics
for the organization of turn-taking for conversation. Language
50(4):696–735.
Schegloff, E. A. 1984. On some gestures’ relation to talk. In Structures of
social action: Studies in conversation analysis, edited by J. M. Atkinson
and J. Heritage, 266–296. Cambridge: Cambridge University Press.
——. 1992. Repair after next turn: The last structurally provided defense
of intersubjectivity in conversation. American Journal of Sociology
97(5):1295–1345.
——. in press. Sequence organization in interaction: A primer in conversation
analysis, 1. Cambridge: Cambridge University Press.
Schegloff, E. A., G. Jefferson, and H. Sacks. 1977. The preference for
self-correction in the organization of repair in conversation. Language
53(2):361–382.
Schieffelin, B. B., and E. Ochs (eds.). 1986. Language socialization across
cultures. Cambridge: Cambridge University Press.
Shore, B. 1998. Culture in mind: Cognition, culture, and the problem of
meaning. Oxford: Oxford University Press.
Sidnell, J. 2001. Conversational turn-taking in a Caribbean English
Creole. Journal of Pragmatics 33(8):1263–1290.
——. 2005. Talk and practical epistemology: The social life of knowledge in
a Caribbean community. Amsterdam: Benjamins.
Sperber, D. 1985. Anthropology and psychology: Towards an epidemiology
of representations. Man (n.s.) 20(1):73–89.
——. 1996. Explaining culture. A naturalistic approach. Oxford:
Blackwell.
Sperber, D., and D. Wilson. 1986. Relevance: Communication and cognition.
Cambridge, MA: Harvard University Press.
Sudnow, D. (ed.). 1972. Studies in social interaction. New York: Free
Press.
Sussman, R. W., and A. R. Chapman. 2004. The nature and evolution
of sociality: Introduction. In The origins and nature of sociality, edited
by R. W. Sussman and A. R. Chapman, 3–22. New York: Aldine de
Gruyter.
Tomasello, M., M. Carpenter, J. Call, T. Behne, and H. Moll. 2005.
Understanding and sharing intentions: The origins of cultural cognition.
Behavioral and Brain Sciences 28:675–735.
Veà, J., and J. Sabater-Pi. 1998. Spontaneous pointing behaviour in the
wild Pygmy Chimpanzee (Pan paniscus). Folia Primatologica 69:289–290.
Whiten, A., and R. W. Byrne (eds.). 1997. Machiavellian intelligence II:
Extensions and evaluations. Cambridge: Cambridge University Press.
Zeitlyn, D. 1995. Divination as dialogue: Negotiation of meaning with
random responses. In Social intelligence and interaction: Expressions
and implications of the social bias in human intelligence, edited by E. N.
Goody, 189–205. Cambridge: Cambridge University Press.
Part 1
My goal in this chapter is to make the case that the roots of human
sociality lie in a special capacity for social interaction, 1 which
itself holds the key to human evolution, the evolution of language,
the nature of much of our daily concerns, the building blocks of social
systems, and even the limitations of our political systems.
Much of the speculation about the origins and success of our species
centers on the source of our big brains, the structure of our cognition,
on the origins of language, the innate structures that support it, and
on the striking cooperative potential in the species. These are genuine
and important puzzles, but in the rush to understand them, we seem
to have overlooked a core human ability and propensity, the study of
which would throw a great deal of light on these other issues. It is right
under our noses, much more accessible than the recesses of our brains or
the fossils that track our evolutionary origins, and quite understudied.
It is the structure of everyday human interaction.
Despite the fact that it is over fifty years since human interaction was
first treated as a scientific object of inquiry deserving of a natural history
(Bateson 1955; Chapple and Arensberg 1940; see also Kendon 1990),
progress has been quite limited. One problem has simply been that
human interaction lies in an interdisciplinary no-man’s land: it belongs
equally to anthropology, sociology, biology, psychology, and ethology
but is owned by none of them. Observations, generalizations and theory
have therefore been pulled in different directions, and nothing close to
a synthesis has emerged. In this chapter, I therefore try to stand back
and extract some generalizations about the special human abilities that
seem to lie behind the structure of social interaction.
Properties of Human Interaction
Figure 1.1. Kpémuwó, deaf home signer on Rossel Island, inventing a way to
communicate about abstract ideas concerning sorcery.
he signed, and then he in response attempted to correct or narrow
my interpretation, until step by step we converged on an
understanding.
Intention recognition and the mechanics of turn taking are
deeply interlocked. The focus of this chapter is on what exactly
Kpémuwó and I share that makes it possible for us to communicate,
when we share so little other background in conventions of culture
and communication.
The Idea of a Core “Interaction Engine”: What the Output Shows
The idea in a nutshell is that humans are natively endowed with a set
of cognitive abilities and behavioral dispositions that synergistically
work together to endow human face-to-face interaction with certain
special qualities. I call these elements collectively the human interaction
engine (which is meant to suggest both dedicated mental machinery and
motive power, i.e., both “savvy” and “oomph”). Right away, I should
underline this is not a proposal for a “social cognition module,” “a
culture acquisition device,” “cognitive culture system” or an “interaction
gene” or anything of that simple-minded sort (see, e.g., Jackendoff
1992; Pinker 1997; Talmy 2000:373ff.). Those accounts assume that
the kind of approach taken to the “language module” or “language
instinct” can be copied across into a “social–cultural module,” and I
am arguing nothing of the kind. What I am entertaining is that there
are underlying universal properties of human interaction that can be
thought of as having a cognitive-and-ethological foundation. Evolution
is “bricolage” (to use Lévi-Strauss’s term), seizing what is at hand in the
organism’s phenotype to construct an often ramshackle but adaptive
system. So an “interaction engine” could be constructed of scraps of
motivational tendencies, temporal sensitivities (reaction contingencies),
semicooperative instincts, ancient ethological facial displays, the capacity
to analyze others’ actions through mental simulation, and so forth.
The model is a Jean Tinguely kinetic sculpture built of bric-a-brac, not
a Fodorean mental module (Fodor 1983), let alone a Chomskyan point
mutation (Bickerton 1998; Hauser et al. 2002).
Whatever your doubts, just entertain the idea for a moment (I turn
to the crucial question of cross-cultural variability in the next section).
Before we ask “What exactly are the elements of the interaction engine?”
we need to ask what it needs to account for, that is, what the crucial
properties of human interaction are. From the output, we can guess
at the properties of the machine. Here are some obvious properties of
the output: 7
Cultural Variation
Interaction is shot through and through with culture. It had better be,
because it is the vehicle of culture—without it, there would not be any.
Even though culture conditions and shapes private acts—the way we
urinate or defecate, for example, or even the way we walk—it is through
public, and especially interactive, acts that culture propagates itself.
And every anthropologist, indeed every traveler, has been impressed
with differences in interactional mores. Just to mention a few of my
own observations, consider:
(1) In rural Tamilnadu, in a typical village of 18 castes, who can
interact with whom, and in what ways, is elaborately specified in a
mental 17 × 17 matrix (Levinson 1982). One indelible memory is of a
high-caste foreman arriving on bicycle at a building site, engendering
the total cessation of works as all the low-caste workers scramble down
the scaffolding so that they can receive instructions while not having
their heads higher than their caste better.
(2) In Cape York, the aboriginal speakers of Guugu Yimithirr incorporate
gestures into their verbal interaction in a much more fundamental way
than Europeans do. For example, a negative gesture preceding a positive
assertion signals a negative proposition, or the subject and object of
a verb may be omitted but indicated by gesture. The great majority
of gestures are intended to have directional veracity—no mere hand
waving here (see Levinson 2003).
(3) In Chiapas, Mexico, Tzeltal-speaking Tenejapans are peasants who
maintain a decorum appropriate to a royal court: Long and elaborate
greeting sequences specify whether the intruder is merely passing by
(and if so in the same or different direction as the intruded, or past the
intruded’s home base) or arriving to visit (Stross 1967). Once begun,
interaction is properly conducted sitting side by side with the minimum
of mutual gaze, each assertion being partially repeated by the recipient,
with long sequences of the kind: “I’ve come to visit you” “You’ve
come to visit perhaps” “I have come” “You have indeed” “Indeed
I have.” . . . (Brown 1998; Levinson and Brown 2005).
(4) On Rossel Island, Papua New Guinea, interaction is typically
dyadic, squatting eyeball to eyeball, with sustained mutual gaze, and
incorporating many facial displays, and eye-pointings. Fast, informal,
with much mutual touching, two big bankers of shell-money can
conduct important business for the whole island with a nod and a
wink, making a striking contrast to the apparent Tenejapan formality
of interaction over matters much more trivial (Levinson and Brown
2005).
These observations, and a thousand like them, raise the question:
What sense does it make to talk about a core interaction engine as if
it was a universal property of mankind, given all this rich texture of
cultural diversity?
The answer is that the interaction engine is not to be understood
as an invariant, a fixed machine with a fixed output, but as a set of
principles that can interdigitate with local principles, to generate
different local flavors. Let me outline just one example of the kind of
interplay between the universal and the culturally particular I have
in mind (the details appear in Levinson 2005). Sacks and Schegloff
(1979) suggested that two principles govern the reference to persons
in English conversation: a preference for using a minimal form (e.g.,
a name), and a preference for using a form (a “recognitional”) under
which the referent can be recognized by the recipient. Usually these two
preferences can be satisfied simultaneously. But sometimes they come
apart. For example, if the speaker is unsure whether the recipient will
recognize the referent under a single name, he may try it out, marking
the “try” with rising intonation—if there is no uptake, a second name
may then be introduced, also with a “try” intonation, then a description,
and so on. So we get a sequence like this:
<5> A: . . . well I was the only one other than the uhm tch Fords?, (1)
Uh Mrs Holmes Ford? (2)
You know uh the the cellist? (3)
[
B: Oh yes. She’s she’s the cellist (4)
A: Yes well she and .. . . . . . . .
[Sacks and Schegloff 1979:19]
At (1) the speaker tries a single name, upgrading at (2) with a second
name and a title, and at (3) with a description, whereupon getting
acknowledgement of recognition at (4), the speaker proceeds. What the
sequence displays is that recognition takes priority, the minimization
being successively relaxed till recognition is achieved (common ground
established). It shows that a minimal clue to a Schelling solution is
tried first.
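The ranking of the two preferences can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not a model proposed by Sacks and Schegloff: the function name seek_recognition, the list FORMS, and the set known_to_recipient are all hypothetical, and the recipient's knowledge is simply stipulated so that the run mirrors the three tries in the excerpt.

```python
from typing import Callable, List, Optional, Tuple


def seek_recognition(
    forms: List[str],
    recognizes: Callable[[str], bool],
) -> Tuple[Optional[str], List[str]]:
    """Try reference forms from most to least minimal, stopping at uptake.

    Returns the form that achieved recognition (or None if none did)
    together with the list of forms that were actually tried.
    """
    tried: List[str] = []
    for form in forms:
        tried.append(form)        # each try is offered "try-marked"
        if recognizes(form):      # uptake: recognition achieved
            return form, tried
    return None, tried


# Hypothetical ordering, from minimal name to fuller description.
FORMS = ["the Fords", "Mrs Holmes Ford", "the cellist"]

# Assumption for illustration: the recipient knows the referent
# only under the description.
known_to_recipient = {"the cellist"}

success, tried = seek_recognition(FORMS, lambda f: f in known_to_recipient)
print(success, tried)
```

The loop stops at the first form that secures uptake, so minimization is relaxed only as far as recognition requires.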
Very similar sequences can be found in other quite unrelated languages
I have worked on, including Guugu Yimithirr, Tzeltal, and Yélî Dnye
(the language of Rossel Island). One has to allow for the fact that
upgradings might take different forms (e.g., identifying conventions
might employ place of origin specifications), and even that different
modalities might be involved (e.g., pointings, eye glances at places of
origin)—but allowances made, the sequences are eerily familiar. Here
is one from Guugu Yimithirr:
<6> B: ngayu nubuun nhaaway waami dyibaalu warra Milga-mul? 1
1s one there found to.South old ears-without
“I came across one fellow there to the South, old ‘without ears’?”
<points>
(0.3) 2
R: aa 3
“Oh”
(0.4) 4
B: oo Tommy Confen? 5
“old Tommy Confen?”
R: ee 6
“ah”
B: nyulu nhamuun bamaal nganhi wangaarmun nhaathi durrginbigu
gaadariyga bada
“That fellow saw me, as I was coming down Indian Head”
[Revgest 00:17:01]
At line 1, B tries Tommy Confen’s nickname, namely “Without Ears”
(he was deaf), with intonational rise on “Ears.” Now critically, he has
supplemented this reference with an earlier quick pointing gesture to
where the Confen household used to be, coinciding with the underlined
word -nubuun—but unfortunately R was not looking (see Fig. 1.2 [a]).
B therefore has reason to doubt that R has got the reference: he gazes
straight at R throughout this sequence until point 5, to assess whether
recognition has occurred (see Fig. 1.2 [b]). R’s response at point 3 is
slightly delayed, and has a form (indicating “news”) suggesting that
it could be a response to the earlier part of what B said. B therefore
tries again at 5 with rising intonation, with both English names of the
referent. R responds positively, with mutual gaze, and B then turns
away and resumes the story.
That suggests that cross-culturally there seem to be the same two
preferences, they seem to have the same ranking, and when they cannot
be satisfied simultaneously, minimization is successively relaxed. Let
us take this, on the basis of parallels in four unrelated cultures, as a
candidate universal, acknowledging that we would need a lot of further
evidence to firm this up.
On Rossel Island, there is an additional wrinkle, a cultural taboo on
naming that interacts with these preferences. The taboo specifies that
one may not name close in-laws or relatives recently deceased. How does
this then interact with the candidate universal preferences? Let us take a
look. In the following excerpt, J out of the blue refers to someone as “that
(distant) girl,” pointing south up over the mountain (utterance [1]).
Figure 1.2. Guugu Yimithirr person reference sequence: (a) frame showing
unobserved quick point to referent’s home base; (b) frame showing mutual gaze
at point at which recognition is achieved.
Figure 1.3. Rossel Island name-avoidance sequence: (a) points up on “that
thing”; (b) points over mountain on “that girl”; (c) points W on second “that
girl”; (d) ditto on “you see,” widening eyes; (e) recipient gives eyebrow flash; (f)
recipient says “ah” in overlap.
The utterance is “try-marked” with rising intonation, and the gesture is
held while looking at R (Fig. 1.3[a], [b]). R does not respond in the gap
(2). J upgrades the description in (3), not by adding a verbal description
or name but by pointing West while widening his eyes and gazing at
R (Fig. 1.3[c], [d]). At this point R responds with an eyebrow flash (a
local “yes, continue” marker), followed by a verbal acknowledgment,
and recognition achieved, J continues.
<7> J: mu kópu mwo a pyaa wo, mu dmââdî ngê? (1)
that affair over.there happened that girl topic
<—points South over mountain————holds point, looking at R>
“That thing (pointing) that happened a while ago, that girl?”
(.) (2)
mu dmââdî ngê? cha w:ee? (3)
that girl topic you understand
<—opens eyes wide, points West>
“that girl, you see?”
[
R: (eye-brow flash) éé (4)
“right, yes”
J: yi dmââdî pi kuu, yed:oo nipi nmî dmââdî cha w:ee (5)
“that girl is our affair, she’s one of ours”
[Rossel Island R02_V4 00:03:27]
The odd thing about the episode is the reference at (1) to a new referent
with such a general description (“that girl”) with the presumption
nevertheless that the referent is recognizable. In holding his gesture,
waiting, repeating the description with a new gesture, J is clearly
persisting in seeking recognition. Nevertheless, he systematically avoids
a name, instead using the same general description but providing two
distinct gestural clues, first over the mountain to his own village where
the girl was raised, and then West where she has just died (see Fig.
1.3). The recent death (itself only alluded to by “that thing”) requires
the name avoidance. Thus, R is faced with a Schelling problem: a very
general description (“that girl”) supplemented with gestural clues, and
with the background knowledge that one reason for not naming a person
is their recent death. The clues evidently prove sufficient, as R claims to
have recognized the referent at (4).
Notice how the culturally specific rule (a name taboo) folds into
our candidate universal preferences. The speaker goes for recognition.
He is blocked from using a name, but uses a brief general description,
satisfying minimization, with a gestural clue. When this is not sufficient,
he is again blocked from using a name, and tries an upgrade using a
second gestural clue, while claiming with wide eyes (see Fig. 1.3 [d])
and intonation that the addressee can locate the referent in Schelling
space. All three preferences are interlocked: do not name, yet go for
recognition, while seeking minimal reference. Further cases of name
taboo on Rossel show similar patterns: nonverbal upgrades are preferred
to verbal ones, as they better satisfy the ban on speaking of the taboo
person. Space precludes extensive discussion of this theme, but the
point is that the culturally specific does not necessarily eclipse the
(candidate) universal procedures—they are woven together to make a
coherent local practice.
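The way a local constraint folds into the candidate universal procedure can be sketched in code. This is a hedged illustration only: the names Form, attempt_order, and refer_under_taboo, the numeric cost scale, and the recipient predicate are assumptions introduced here, not part of the Rossel Island analysis itself.

```python
from typing import Callable, List, NamedTuple, Optional


class Form(NamedTuple):
    label: str
    cost: int       # hypothetical scale: lower means more minimal
    is_name: bool


def attempt_order(forms: List[Form], name_taboo: bool) -> List[Form]:
    """Drop names when a taboo applies, then order the remaining
    forms from most to least minimal."""
    allowed = [f for f in forms if not (name_taboo and f.is_name)]
    return sorted(allowed, key=lambda f: f.cost)


def refer_under_taboo(
    forms: List[Form],
    name_taboo: bool,
    recognizes: Callable[[Form], bool],
) -> Optional[Form]:
    """Go for recognition, relaxing minimization step by step,
    without ever violating the local ban on naming."""
    for form in attempt_order(forms, name_taboo):
        if recognizes(form):
            return form
    return None


# Hypothetical inventory loosely modeled on the excerpt.
FORMS = [
    Form("personal name", 1, True),
    Form("'that girl' + point to home village", 2, False),
    Form("'that girl' + point west, eyes widened", 3, False),
]
```

With name_taboo set to True the name is never tried and the two gestural clues are offered in turn. With it set to False the same procedure defaults to the name. The universal part (the loop) is untouched, and only the set of admissible forms is culturally filtered.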
The identification and naming of persons is, if anything is, a cultural
matter, and yet it seems to mesh seamlessly with the universal
systematics of interaction. The hypothesis is that the interaction engine
will be most recognizable in informal, everyday conversation, which
forms the normal matrix for language acquisition and socialization.
The ethnography of speaking has long established that when we look
at special, ritual or institutional speech events, we find ourselves in the
culture-specific territory of séances, ceremonies, investitures, political
oratory, and the like (Bauman and Sherzer 1974; Duranti 2001). Even here,
though, the interesting suggestion emerging from work in conversation
analysis is that specialized speech events are built by tweaking the rules
and principles governing informal conversation. Thus, the differences
between a press conference and a classroom can be partly captured by
considering both the similarities (multiple persons, but only two parties,
one singular—teacher or press officer) and the differences (questioning
assigned to the party with the multiple persons, as in press conferences,
or to the singular party, as in classrooms; see Schegloff 1987).
This idea—that the local, cultural specialization is a variation off a
universal theme—is potentially powerful, because as we learn more about
conversational organization we see that there are relatively few, crucial
organizing principles. For example, ringing the changes on different
possible systems of turn taking, participation–structure, and action
sequences will give us many key aspects of culture-specific speech events.
We also see that at a finer level of structure, the modulation of the way in
which actions are expressed (e.g., directly vs. indirectly, with or against
preference organization) conveys the qualities of social relations (Brown
and Levinson 1987). Conversation analysts have therefore sometimes
taken a “constructionist” view of social organization (see again Schegloff
1987): you are, as it were, what you say. This does not always accord
with the anthropological experience (Levinson 2005): it may work in
New Guinea, but in India you are what you are born. However, viewed
as a system of principles that predicts, for each possible manipulation
of the systematics, what the consequences will be, it promises to be a
powerful tool for understanding cross-cultural variation.
The idea, then, is not that the interaction engine produces cross-
cultural uniformity but, rather, that it provides the building blocks for
cultural diversity in social interaction. Or in a less crude analogy, it
provides the parameters for variation, with default values that account
for the surprising commonalities in the patterns of informal interchange
across cultures.
One reason that sociocultural anthropologists should be interested in
grasping the nature of these parameters is that interactional principles
clearly play a central role in higher level social processes. This is entirely
transparent in tribal societies, where since Sir Henry Maine (1861) it
has been appreciated that larger entities like descent groups act like
individuals, contracting marriages and alliances or conducting feuds.
Less obviously, politics and diplomacy among modern nation states
has much the same character, of a conversation conducted according
to the principles of interaction, albeit between representatives of huge
agglomerates. We attribute intentions to political maneuvers as if states
were individuals, instead of the rambling conglomerates with different
factional interests that they really are (Levinson 1995:225).
In short, the analysis of interaction could and should play a major
role in our analysis of social institutions and international politics.
Humans come natively equipped for interacting with conspecifics. We
use this interpretive apparatus for understanding large-scale polities of a
kind that we have only recently innovated in our evolutionary history,
and for which it may be inappropriate. For, however inappropriate,
what other natural model would we have?
Conclusion
My thesis has been that the notion of a core interaction engine driving
human social life makes eminently good sense. There is good prima
facie evidence for it, and work in psychology, linguistics, anthropology,
sociology and philosophy all point toward it. It is not easy to isolate the
critical features of such an ability, because they range from the abstract
mental simulations of Schelling mirror worlds, to the concrete problems
of binding across multimodal signals, or the processes generating striking
cross-cultural parallels across procedures for person reference. But the
effort has to be worth it. Progress promises the key to understanding
human evolution, and it offers to shed light on human ontogeny,
higher level social processes, and the limitations of a mentality forged
in face-to-face contact in the present world of nation states, superhuman
agglomerations endowed by us with personal attributes they mostly do
not have. It is an effort in which anthropology should have a central
role to play.
Notes
1. This chapter takes off from the position paper authored by Nick
Enfield and myself, and precirculated in June 2004. Thanks are owed to the
participants to the Wenner-Gren symposium in which these ideas were first
aired and discussed. I am also grateful for help from Penelope Brown.
2. See Connolly and Anderson 1987. It could have been a chance match
of cultural mores, the raw greed of colonial mercantilism happening to meet
its match in Melanesian exchange—see Strathern 1992.
3. Dunbar’s point is that human verbal interaction replaces primate
grooming, and he is therefore especially interested to find that 60 percent of
conversation concerns social relationships and person topics (1997:123).
4. Hymes (1972:40) mentions a number of cases in which the ethnographers
have noticed the extreme taciturnity of the people—he cites Gardner for
example on the Paliyans of South India, who “communicate very little at all
times and become silent by the age of 40. Verbal, communicative persons are
regarded as abnormal and often as offensive.”
5. There are problems quantifying what counts as interaction. Are
nonaddressed listeners interacting? Perhaps only if they are ratified participants
(see Goffman 1979; Levinson 1988). Is talk essential? No, signs, winks, and
nods will do—we are interested in mutual, interlocking sequences of actions
(see below), which are not dependent on language. Is a mother rocking a baby
“interacting” in the favored sense? Yes, if in response to baby’s actions, but no
if baby is asleep.
6. My neighbors got further than I did for a number of reasons. First,
although Kpémuwó’s village is some distance away, he is familiar to them.
Second, they shared much more background knowledge of the situations
being described. Third, their signing was more perspicuous to Kpémuwó than
mine because it made use of conventional elements of the gesture system—
the spoken language is accompanied by a rich set of conventional gestures or
“emblems.”
7. This list, derived from the empirical literature, is not so far removed
from the philosophical view derived by H. P. Grice, whose theory of “meaning”
(1957) covers points (1) and (2), and whose “maxims of conversation” (1975)
cover (3), (4) and (7—“relevance”) at least.
8. Conversational analysts have introduced the technical term “conditional
relevance” for this expectation (see, e.g., Schegloff 1972b).
9. On the hierarchy of grammars modeling behavior sequences see Partee
et al. 1990:433ff.
10. More strictly, the system is asymmetrically structured in such a way
that interactants can deploy it to try and extract cooperation (thanks to
Tanya Stivers for helping me see this connection between preference and
cooperation). See Levinson 1983:332ff. and Schegloff in press for exposition
of “preference.”
11. False belief tasks are not mastered by normal Western children until
almost four years old, but by that age children are experienced interactants.
Leslie 1994 suggests that action interpretation begins at around eight months
without the notion of propositional attitudes essential to attributions of belief,
which begins only at 24 months. Mastery of false belief requires, he argues, a
further special kind of inhibition not available for another two years (Leslie
2000:1242).
12. See Rizzolatti and Arbib (1998) on the discovery that specialized neurons
fire when the same action is both perceived and executed—suggesting a low-
level solution to “reading” other minds. But this correlation is learned, as
shown by recent experiments, so there still has to be a higher-level mechanism
relating action and perception.
13. In ToM models, this is often called “second-order belief” (what A
believes B believes about p: see Baron-Cohen 2000). Here, though, we are
actually interested in something that has some of the properties of potentially
infinitely nested beliefs: what A believes B believes that A believes. . . about
p. Although that is not psychologically plausible, there are psychologically
plausible heuristics that approximate it—see Clark 1996:92ff. for review.
14. Usually this has been thought about the other way around, with
Schelling processes embedded in Gricean intentions, rather than the reverse
as here suggested.
15. Antagonistic interaction can be Machiavellian, that is, designed to look
cooperative but with hidden ulterior motives. In that case it is exploiting
cooperative interaction—in a trivial sense, every cooperative interaction can
be embedded in a Machiavellian one. The point here is reflexive thinking
is not an essential feature of antagonistic interaction as it is of cooperative
interaction. See following note on the definition of “interaction” here.
16. Why, one might ask, is all this mentalism necessary? Symbiosis after all
has two forms, mutualism and parasitism, and both forms, cooperative and
antagonistic, can occur without minds. But here I am using “interaction” in a
special sense, in terms of sequences of actions, where by definition an action
is a pairing of a mental intention or goal and the behavior designed to achieve
it.
17. There may be ritual sequences, like greetings and partings, that allow
a rule-governed treatment, as in Irvine (1974), but these do not cover the
central business transacted in between.
18. Fabricated data for reasons of compression—see Levinson (1983:345);
and Schegloff in press.
19. Conversational analysts have noted (of English conversation) that
pauses or gaps of between one-tenth and two-tenths of a second (roughly the
duration of an unstressed syllable) can often be treated as significant failures
to respond. Psycholinguists have tried to link the duration of the segment, the
syllable, and the word to the temporal binding properties of the brain—a real
temporal metabolism.
20. Earlier attempts to build a science of human ethology (Eibl-Eibesfeldt
1989; von Cranach et al. 1979) have largely petered out. Current evolutionary
psychology seems headed quite elsewhere, away from the observation of
natural human behavior.
References
Albert, E. 1972. Culture patterning of speech behavior in Burundi. In
Directions in sociolinguistics, edited by J. J. Gumperz and D. Hymes,
72–105. New York: Holt.
Baron-Cohen, S. 2000. The cognitive neuroscience of autism: Evolutionary
approaches. In The new cognitive neurosciences, edited by M.
Gazzaniga, 1249–1257. Cambridge, MA: MIT Press.
Baron-Cohen, S., A. Leslie, and U. Frith. 1985. Does the autistic child
have a “theory of mind”? Cognition 21:37–46.
Basso, K. 1970. To give up on words: Silence in the western Apache
culture. Southwestern Journal of Anthropology 26(3):213–230.
Bateson, G. 1955. A theory of play and fantasy. Approaches to the
study of human personality. American Psychiatric Association Report
2:39–51.
Bauman, R., and J. Sherzer (eds.). 1974. Explorations in the ethnography
of speaking. Cambridge: Cambridge University Press.
Bickerton, D. 1998. Catastrophic evolution: The case for a single
step from protolanguage to full human language. In Approaches to
the evolution of language: Social and cognitive bases, edited by J. R.
Hurford, M. Studdert-Kennedy, and C. Knight, 341–358. Cambridge:
Cambridge University Press.
Brown, P. 1998. Conversational structure and language acquisition:
The role of repetition in Tzeltal adult and child speech. Journal of
Linguistic Anthropology 8(2):197–221.
Brown, P., and S. C. Levinson. 1987. Politeness. Cambridge: Cambridge
University Press.
Bruner, J. 1976. From communication to language—a psychological
perspective. Cognition 3:255–287.
Chapple, E. D., and C. M. Arensberg. 1940. Measuring human relations.
Genetic Psychology Monographs 22:3–147.
Clancy, P., S. Thompson, R. Suzuki, and H. Tao. 1996. The conversational
use of reactive tokens in English, Japanese, and Mandarin. Journal of
Pragmatics 26:355–387.
Clark, H. 1996. Using language. Cambridge: Cambridge University
Press.
Clark, H., R. Schreuder, and S. Buttrick. 1983. Common ground and the
understanding of demonstrative reference. Journal of Verbal Learning
and Verbal Behavior 22:245–258.
Clark, H., and D. Wilkes-Gibbs. 1986. Referring as a collaborative
process. Cognition 22:1–39.
Connolly, B., and R. Anderson. 1987. First contact. New York: Viking.
[Book accompanying film First Contact, Bob Connolly and Robin
Anderson, dirs. 54 min. Film Makers Library. New York.]
Darwin, C. 1872. The Expression of the emotions in man and animals.
London: John Murray.
Dennett, D. 1995. Darwin’s dangerous idea: Evolution and the meanings
of life. London: Penguin.
Dunbar, R. 1997. Grooming, gossip and the evolution of language.
Harmondsworth: Penguin.
Duranti, A. 1997. Universal and culture-specific properties of greetings.
Journal of Linguistic Anthropology 7:63–97.
——, (ed.). 2001. Linguistic anthropology: A reader. Oxford: Blackwell.
Eibl-Eibesfeldt, I. 1989. Human ethology. New York: Aldine de Gruyter.
Enfield, N. 2003. The definition of what-d’you-call-it. Journal of Pragmatics
31:101–117.
Fodor, J. 1983. Modularity of mind. Cambridge, MA: MIT Press.
Gardner, H. 1985. Frames of mind. New York: Paladin.
Goffman, E. 1979. Footing. Semiotica 25:1–29.
Goodwin, C. 1981. Conversational organization. New York: Academic
Press.
——, (ed.). 2003. Conversation and brain damage. Oxford: Oxford
University Press.
Goody, E. (ed.). 1995. Social intelligence and interaction. Cambridge:
Cambridge University Press.
Grice, H. P. 1957. Meaning. Philosophical Review 67:377–388.
——. 1975. Logic and conversation. In Syntax and semantics 3: Speech
Acts, edited by P. Cole and J. Morgan, 41–58. New York: Academic
Press.
Hammerstein, P. 1996. The evolution of cooperation within and between
generations. In Interactive minds, edited by P. Baltes and U. Staudinger,
35–58. Cambridge: Cambridge University Press.
Hauser, M. D., N. Chomsky, and W. T. Fitch. 2002. The faculty of
language: What is it, who has it, and how did it evolve? Science
298:1569–1579.
Hayashi, M. 2003. Joint utterance construction in Japanese conversation.
Amsterdam: Benjamins.
Hymes, D. 1972. Models of the interaction of language and social life.
In Directions in sociolinguistics, edited by J. J. Gumperz and D. Hymes,
35–71. New York: Holt.
Irvine, J. 1974. Strategies of status manipulation in the Wolof greeting.
In Explorations in the ethnography of speaking, edited by R. Bauman and
J. Sherzer, 167–191. Cambridge: Cambridge University Press.
Jackendoff, R. 1992. Is there a faculty of social cognition? In Languages
of the mind, edited by R. Jackendoff, 69–91. Cambridge, MA: MIT
Press.
Kendon, A. 1990. Conducting interaction. Cambridge: Cambridge
University Press.
Kolb, B., and I. Whishaw. 1990. Fundamentals of human neuropsychology.
New York: Freeman.
Leslie, A. 1994. ToMM, ToBy, and Agency. In Mapping the mind: Domain
specificity in cognition and culture, edited by L. Hirschfeld and S.
Gelman, 119–148. Cambridge: Cambridge University Press.
——. 2000. “Theory of mind” as a mechanism of selective attention. In
The new cognitive neurosciences, edited by M. Gazzaniga, 1235–1247.
Cambridge, MA: MIT Press.
Levinson, S. C. 1982. Caste rank and verbal interaction in Western
Tamilnadu. In Caste Ideology and Interaction, edited by D. McGilvray,
98–203. Cambridge: Cambridge University Press.
——. 1983. Pragmatics. Cambridge: Cambridge University Press.
——. 1988. Putting linguistics on a proper footing. In Erving Goffman,
edited by P. Drew and A. Wootton, 161–227. Cambridge: Polity
Press.
——. 1995. Interactional biases in human thinking. In Social intelligence
and interaction, edited by E. Goody, 221–260. Cambridge: Cambridge
University Press.
——. 2000. Presumptive Meanings. Cambridge, MA: MIT Press.
——. 2003. Space in language and cognition: Explorations in cognitive
diversity. Cambridge: Cambridge University Press.
——. 2005. Manny Schegloff’s dangerous idea. Discourse Studies
7(4–5):431–453.
Levinson, S. C., and P. Brown. 2005. Comparative response systems. Paper
presented at the 104th Annual Meeting of the American Anthropological
Association, Washington, DC, November 30–December 4.
Maine, Sir Henry. 1861. Ancient law. London: Murray.
Meltzoff, A., and M. Moore. 1977. Imitation of facial and manual
gestures by human neonates. Science 198:75–78.
Miller, G., E. Galanter, and K. Pribram. 1960. Plans and the structure of
human behavior. New York: Holt.
Moerman, M. 1989. Talking culture: Ethnography and conversation analysis.
Philadelphia: University of Pennsylvania Press.
Muysken, P. 2000. Bilingual speech: A typology of code-mixing. Cambridge:
Cambridge University Press.
Partee, B., A. ter Meulen, and R. Wall. 1990. Mathematical methods in
linguistics. Dordrecht, the Netherlands: Kluwer Academic.
Pinker, S. 1997. How the mind works. New York: Penguin.
Quine, W. V. 1960. Word and object. Cambridge, MA: MIT Press.
Reisman, K. 1974. Contrapuntal conversations in an Antiguan village.
In Explorations in the ethnography of speaking, edited by R. Bauman
and J. Sherzer, 110–124. Cambridge: Cambridge University Press.
Rizzolatti, G., and M. Arbib. 1998. Language within our grasp. Trends
in Neurosciences 21(5):188–194.
Rochat, P., J. Querido, and T. Striano. 1999. Emerging sensitivity to
the timing and structure of protoconversation in early infancy.
Developmental Psychology 35(4):950–957.
Sacks, H., and E. A. Schegloff. 1979. Two preferences in the organization of
reference to persons in conversation and their interaction. In Everyday
language, edited by G. Psathas, 15–21. New York: Irvington.
Sacks, H., E. A. Schegloff, and G. Jefferson. 1974. A simplest systematics
for the organization of turn-taking for conversation. Language
50:696–735.
Schegloff, E. A. 1972a. Notes on a conversational practice: Formulating
place. In Language and social context, edited by P. Giglioli, 95–135.
Harmondsworth: Penguin.
——. 1972b. Sequencing in conversational openings. In Directions in
sociolinguistics, edited by J. J. Gumperz and D. Hymes, 346–380. New
York: Holt.
——. 1987. Between micro and macro: Context and other connections.
In The Micro-Macro Link, edited by J. Alexander, 207–234. Berkeley:
University of California Press.
——. 1996. Some practices for referring to persons in talk-in-interaction:
A partial sketch of a systematics. In Studies in anaphora, edited by B.
Fox, 437–485. Amsterdam: Benjamins.
——. in press. Sequence organization. Cambridge: Cambridge University
Press.
Schelling, T. 1960. The strategy of conflict. Cambridge, MA: MIT Press.
Sidnell, J. 2001. Conversational turn-taking in a Caribbean English
creole. Journal of Pragmatics 33:1263–1290.
Sperber, D., and D. Wilson. 1995[1986]. Relevance. 2nd edition. Oxford:
Blackwell.
Strathern, M. 1992. The decomposition of an event. Cultural Anthropology
7:244–254.
Striano, T., and M. Tomasello. 2001. Infant development: Physical
and social cognition. In International Encyclopedia of the Social and
Behavioral Sciences, edited by N. J. Smelser and P. Baltes, 7410–7414.
Oxford: Pergamon.
Stross, B. 1967. Tzeltal greetings. Language Behavior Research Lab,
University of California at Berkeley. [Mimeograph]
Talmy, L. 2000. The cognitive culture system. In Towards a cognitive
semantics, vol. 2, 373–416. Cambridge, MA: MIT Press.
Trevarthen, C. 1979. Instincts for human understanding and for cultural
cooperation: Their development in infancy. In Human ethology:
Claims and limits of a new discipline, edited by M. von Cranach, K.
Foppa, W. Lepenies, and D. Ploog, 530–571. Cambridge: Cambridge
University Press.
von Cranach, M., K. Foppa, W. Lepenies, and D. Ploog (eds.). 1979.
Human ethology: Claims and limits of a new discipline. Cambridge:
Cambridge University Press.
from “ordinary talk,” and the latter engenders a range of problems that
make it unsustainable as a general organization of interaction. What
is at stake in “turn taking” is not politeness or civility, but the very
possibility of coordinated courses of action between the participants
(e.g., allowing for initiative and response)—very high stakes indeed.
Even with just two participants, achieving one at a time poses a
problem of coordination if the talk is to be without recurrent substantial
silences and overlaps: how to coordinate the ending of one speaker
and the starting up by another. If there are more than two “ratified
participants” (Goffman 1963), there is the additional issue of having
at least one of the current nonspeakers, and not more than one of the
current nonspeakers, start up on completion of the current speaker’s
turn. One can imagine a variety of putative solutions to these problems
of coordination, but none of them can be reconciled with the data of
actual, naturally occurring ordinary conversation (Schegloff n.d.a).
The “simplest systematics” article on turn taking by Sacks et al. (1974)
sketches an organization of practices that works well, and has led to
nonintuitive enhancements (Schegloff 2000b, 2002). It describes units
and practices for constructing turns at talk, practices for allocating
turns at talk, and a set of practices that integrates the two. So far this
account works across quite a wide range of settings, languages, and
cultures. Departures from interactional formats familiar to Western
industrialized nations involve what might be called “differences in the
values of variables”—for example, different lengths of time that count
as a silence, rather than differences in the underlying organization of
practices.
To give a brief example, there may be differences between cultures or
subcultures in what the unmarked value of a silence between the end of
one turn and the start of a next should be. Leaving less than the normative
“beat” of silence or more than that can engender inferences among
parties to the conversation; starting a next turn “early” or starting a next
turn “late” are ways of doing things in interaction, and conversation
between people from different cultural settings can result in misfiring
with one another. For example, one difference often remarked on by
urban, metropolitan people about rural or indigenous people is that
they seem to be dimwitted and somewhat hostile; comments range from
Marx on the “idiocy of the rural classes” to Ron Scollon and Suzanne
Scollon’s work (1981) on the relation between migrants from the “lower
48” states in the United States and Alaska Natives. Having asked them
a question, the urbanites—or should I say urbane-ites—find themselves
not getting a timely reply and sense resistance, nonunderstanding,
nonforthcomingness, and so forth. Often they break what they perceive
as “the silence” that greeted their question with a follow-up question,
which may be taken by their interlocutor to exemplify the high-pressure
aggressiveness of “city slickers.” But what differs between them is not
that their turn-taking practices are different or differently organized,
but the way they “reckon” the invisible, normative beat between one
turn and the next.
I have just pointed at the organization of turn taking; an account
of what that organization is, and how it works, will have to be sought
out in the by-now substantial literature addressed to those matters (cf.
esp. Lerner 2003).
1 Clara: Hello
2 Nelson: Hi.
3 Clara: Hi.
4 Nelson: -> Whatcha doin'.
5 Clara: -> Not much.
6 Nelson: Y'wanna drink?
7 Clara: Yeah.
8 Nelson: Okay.
And of a preannouncement:
(2) Terasaki (2004):207
1 Jim: -> Y’wanna know who I got stoned with a few(hh) weeks ago? hh!
2 Ginny: -> Who.
3 Jim: Mary Carter ‘n her boy(hh)frie(hh)nd. hh.
1 B: -> Fb Was last night the first time you met Missiz Kelly?
2 (1.0)
3 M: -> Fi Met whom?
4 B: -> Si Missiz Kelly.
5 M: -> Sb Yes.
Again, notice that if a first pair part is not followed by an action–turn,
which could be its second pair part, then what occurs in its place is
itself a first pair part and requires a response, so it too is an adjacency
pair and it too can get expanded.
And after the response to the initiating action–turn there can be
further talk that clearly is extending that trajectory of action. Sometimes
that can be a single turn, which does not make a response to it relevant
next, as at lines 3 and 8 in the following specimen, which has two such
sequences.
(4) HG, 16:25-33
1 Nancy: =·hhh Dz he av iz own apa:rt[mint?]
2 Hyla: [·hhhh] Yea:h,=
3 Nancy: -> =Oh:,
4 (1.0)
5 Nancy: How didju git iz number,
6 (·)
7 Hyla: I(h) (·) c(h)alled infermation’n San Fr’ncissc(h)[uh!
8 Nancy: -> [Oh::::
9 (·)
1 Nancy: How didju hear about it from the pape[r?
2 Hyla: ['hhhhh I sa:w-
3 (0.4)
4 Hyla: -> A'right when was:(it,)/(this,)
5 (0.3)
6 Hyla: -> The week before my birthda:[y,]
7 Nancy: [Ye]a[:h,
8 Hyla: -> [I wz looking in the Calendar
9 -> section en there was u:n, (·) un a:d yihknow a liddle:: u-
     thi:ng,
10 .hh[hh
11 Nancy: [Uh hu:h,=
12 Hyla: =At- th'-th'theater's called the Met Theater it's on
13 Point[setta.]
14 Nancy: [The Me]:t,
15 (·)
16 Nancy: I never heard of i[t.
17 Hyla: [I hadn’t either..hhh But anyways,-en
18 theh the moo- thing wz th’↓Dark e’th’ ↓Top a’th’ ↑ Stai[:rs.]
19 Nancy: [Mm-h]m[:,
20 Hyla: [En
21 I nearly wen’chhrazy cz I [: I: lo:ve ]that] mo:vie.]
22 Nancy: [y:Yeah I kn]ow y]ou lo:ve] tha::t.=
23 Hyla: =s:So::,.hh an’ like the first sho:w,=
24 Nancy: =M[m hmm, ]
25 Hyla: [wz g’nna] be:,
26 (·)
27 Hyla: on my birthday.=
28 Nancy: =Uh hu[h, ]
29 Hyla: [I’m] go’[n awhh whould hI love-
30 Nancy: [(So-)
31 (·)
32 Hyla: yihknow fer Sim tuh [take me tuh that.]
33 Nancy: [ Y a y u : : h, ]
I want to call attention here to only two bits of Hyla’s responsive talk
starting at line 8: the time formulation “the week before my birthday,”
and the activity formulation “I was looking in the Calendar section” (an
ethnographic note: the “Calendar” section of the Los Angeles Times is
the Culture and Entertainment section). First note that Hyla conducts
an out-loud search for “when it was”; she is taking care with this time
formulation. There are many other ways of referring to the time in
question: how many weeks ago; which week of the month; the date; and
so forth. She chooses “the week before my birthday.” And now “I was
looking in the Calendar section”: not “reading the paper”; not “looking
at the Calendar section”; not the “I saw” with which she had initially
begun (at line 8) and so forth. By coselecting these two formulations,
she is “doing” a description of “I was looking for what to do on my
birthday” although not articulating that description.
So, in turns at talk that make up sequences of actions, the elements
of the talk are selected and deployed to accomplish actions and to
do so recognizably; and recipients attend the talk to find what the
speaker is doing by saying it in those words, in that way. Using “words”
or “usages” or “formulations” is a generic organization of practices for
talk in interaction because that talk is designed to do things, things
that fit with other things in the talk—most often the just preceding
ones. Talk in interaction is about constructing actions, which is why
it does not reduce to language; treating talk in interaction only for
its properties as a system of symbols or a medium for articulation or
deploying propositions does not get at its core. And the actions that
are constructed by talk and other conduct in interaction compose, and
are parts of, trajectories or courses of action, which is why a pragmatics
that does not attend to the sequential organization of actions is at risk
for aridity.
1 Sherie: Hi Carol.=
2 Carol: =H[i: .]
3 Ruthie: [CA:RO]L, HI::
4 Sherie: -> You didn' get en icecream sandwich,
5 Carol: I kno:w, hh I decided that my body didn't need it,
6 Sherie: Yes but ours di:d=
7 Sherie: =hh heh-heh-heh [heh-heh-heh [.hhih
8 (??): [ehh heh heh [
9 (??): [( )
10 Carol: hh Awright gimme some money en you c'n treat me to one an
And, one might now add, it is only this species’ social life that has
made possible those physical and biological sciences, and the very
notion of “deep systematic analysis.”
Although Goffman was virtually apologetic for the stature of interaction
studies when put next to traditional studies of social structure, this was
a comparison forced on him by a career in sociology and a presidential
address appropriately shaped for practitioners of its entire reach. In the
present context, interaction studies need no apology, nor is it necessary
to eschew the possibility of deep, systematic analysis. Such studies offer
the possibility of connecting the disparate threads of anthropological,
ethological, linguistic, psychological, and sociological inquiry, bringing
us closer to an understanding of human sociality, and, with it, of what
makes us distinctively human in the first place.
Notes
1. I mean to include under this term “talk” implemented by sign language
and other forms of communication in interaction that share the basic
characteristics of vocalized talking; so telephone conversation but not
computer chats, for the former are synchronous moment to moment and the
latter are not. It should go without saying (although the contemporary use of
the term multimodal interaction suggests otherwise) that “talk in interaction”
should be understood as “talk and other conduct in interaction,” that is, as
including posture, gesture, facial expression, ongoing other activities with
which the talk may be cotemporal and potentially coordinated, and any other
features of the setting by which the talk may be informed and on which it
may draw.
2. Ideally this account would be supplemented by empirical exemplars of
the several organizations of practices that are here discursively described, but,
with a few exceptions, this is not possible within our space limitations. It will
have to suffice to refer the reader to the works in which these organizations
have been introduced: Schegloff and Sacks (1973) on overall structural
organization; Sacks et al. 1974 on turn taking; Schegloff et al. 1977 on repair;
Schegloff 1996d on turn organization; and Schegloff and Sacks 1973, Sacks
1992, vol. 1:521ff., and Schegloff n.d.b, in press, on sequence organization.
Some works in which further specification of practices within these domains
has been advanced are: Schegloff 1982 and Lerner 2002 on turn taking;
Schegloff 1979, 1992b, 1997, 2000c on repair; Lerner 1991, 1996 on turn
organization; and Schegloff 1996a on action formation. Works designed as
exercises displaying how the conduct of analysis works, and how it supports
the stances adopted in this kind of inquiry, are Schegloff 1987a and 1996b.
3. Two sorts of exception should be mentioned here. One involves the
claim that there is a place in which talk in interaction is not so organized, as
in Reisman’s (1974) claim for “contrapuntal conversation” in Antigua; Sidnell
(2001) casts considerable doubt on Reisman’s account. The other involves
specifications of where in conversation the “one at a time” claim does not hold,
for example Lerner (2002) on “choral co-production” or Duranti (1997) on
“polyphonic discourse”; here the phenomenon being described is virtually
defined as an object of interest by its departure from the otherwise default
organization of talk. Work on “overlapping talk” (e.g., Jefferson 1984, 1986,
2004; Schegloff 2000b, 2002) locates the topic by reference to its problematic
relation to the default one-at-a-time organization.
4. For an analysis of quite an elaborate sequence (125 lines of transcript
composing a single sequence), see Schegloff 1990.
5. The way repair is organized can have the consequence that it is sometimes
initiated at a greater “distance” from the trouble while still being within the
boundaries that can here be only roughly characterized. For an account of
this, see Schegloff 1992b.
6. To conserve time and space, I have omitted the practices of turn
construction as a generic organization in talk in interaction, although it has a key role
in the organization of turn taking, on the one hand, and the organization of
sequences, on the other hand (cf. Schegloff 1996d).
7. This sequence is explicated in some detail in Schegloff 1988:118–131.
It may be useful to clarify the usage here and in some other conversation-
analytic writing of the term format “a possible X,” as in the text’s “a possible
complaint.” What follows is taken from Schegloff 1996d:116–117 n. 8:
The usage is not meant as a token of analytic uncertainty or hedging.
Its analytic locus is not in the first instance the world of the author and
reader, but the world of the parties to the interaction. To describe some
utterance, for example, as ‘a possible invitation’ (Sacks 1992: I: 300–2;
Schegloff 1992a:xxvi–xxvii) or ‘a possible complaint’ (Schegloff 1988:
120–2) is to claim that there is a describable practice of talk-in-interaction
which is usable to do recognizable invitations or complaints (a claim
which can be documented by exemplars of exchanges in which such
utterances were so recognized by their recipients), and that the utterance
now being described can be understood to have been produced by such
a practice, and is thus analyzable as an invitation or as a complaint. This
claim is made, and can be defended, independent of whether the actual
recipient on this occasion has treated it as an invitation or not, and
independent of whether the speaker can be shown to have produced
it for recognition as such on this occasion. Such an analytic stance is
required to provide resources for accounts of ‘failures’ to recognize an
utterance as an invitation or complaint, for in order to claim that a
recipient failed to recognize it as such or respond to it as such, one must
be able to show that it was recognizable as such, i.e. that it was ‘a possible
X’—for the participants (Schegloff n.d.b, to appear [sic; in press]). The
analyst’s treatment of an utterance as ‘a possible X’ is then grounded
in a claim about its having such a status for the participants. (For an
extended exploration of how a form of turn construction—repetition—
can constitute a practice for producing possible instances of a previously
undescribed action—‘confirming allusions,’ cf. Schegloff 1996b.)
References
Albert, E. 1964. “Rhetoric,” “Logic,” and “Poetics” in Burundi: Cultural
patterning of speech behavior. In The ethnography of communication.
Special issue of the American Anthropologist 66:6, vol. 2, edited by
J. J. Gumperz and D. Hymes, 35–54. Menasha, WI: George Banta
Publishing.
Daden, I., and M. McClaren. n.d. Same turn repair in Quiche (Maya)
Conversation: An initial report. Unpublished paper, Department of
Anthropology, University of California, Los Angeles.
Duranti, A. 1997. Polyphonic discourse: Overlapping in Samoan
ceremonial greetings. Text 17:349–381.
Fox, B. A., M. Hayashi, and R. Jasperson. 1996. Resources and repair: A
cross-linguistic study of syntax and repair. In Interaction and grammar,
edited by E. Ochs, E. A. Schegloff, and S. A. Thompson, 185–237.
Cambridge: Cambridge University Press.
Garfinkel, H. 1967. Studies in ethnomethodology. Englewood Cliffs, NJ:
Prentice-Hall.
Goffman, E. 1963. Behavior in public places: Notes on the social organization
of gatherings. New York: Free Press.
——. 1983. The interaction order. American Sociological Review 48:1–17.
Inge, W. 1958. The dark at the top of the stairs. New York: Random
House.
Jefferson, G. 1984. Notes on some orderlinesses of overlap onset. In
Discourse analysis and natural rhetorics, edited by V. D’Urso and P.
Leonardi, 11–38. Padova: CLEUP Editore.
——. 1986. Notes on “latency” in overlap onset. Human Studies 9:153–183.
change, and the range of resources they draw on, quite unlike anything
else found in the animal kingdom (although building from processes
found in other animals). The practices used to build collaborative action
frequently encompass a range of quite diverse phenomena including
language structure, gesture, participation frameworks, practices for
seeing and formulating structure in the environment, and embodied
action and tool use. This diversity has frequently obscured the intrinsic
organization of the process itself. For example, in part because of the way
in which the human sciences have each claimed distinctive phenomena,
language structure was treated as the special domain of linguistics, and the
organization of action through language was not a focus of mainstream
sociology (despite most important work by the Prague school, Boasian
linguistic anthropology, Bakhtin and his followers, Mead, Goffman,
and Bateson, and most recently conversation analysis).
To build collaborative action, each party must in some relevant sense
understand the nature of the activities they are engaged in together.
The accomplishment of joint action is also a central environment for
cognitive activity. The ability of participants to publicly scrutinize both
Properties of Human Interaction
Pointing in Aphasia
The pointing activities of Chil, a man with severe aphasia, will now
be examined. This phenomenon is relevant to the study of the interactive
infrastructure of human sociality, and to other work in this volume, in
a number of different ways. First, it provides a particularly clear, indeed
dramatic, example of how human meaning and action are constructed
through systematic processes of human interaction. Chil requires
others to produce the words he needs to say something meaningful.
Second, pointing is a topic of a number of other chapters in this volume,
including Tomasello’s on pointing (or rather its absence) in apes,
Liszkowski’s on pointing in infants, and Goldin-Meadow’s on pointing
in both deaf children communicating through home sign and learners
and teachers working on math problems. Chil’s situation provides yet
another perspective on pointing, a phenomenon that has emerged as
an interesting subtheme in this volume and the conference that led to
it. At the same time, the methods used by Chil and his interlocutors to
construct meaning together are instances or variations of the more general
practices that Schegloff describes for the organization of action in talk
in interaction, such as repair (Schegloff et al. 1977), and they require the
recipient design and mutual relevance noted by Levinson, Enfield, and others.
The way in which Chil draws heavily on structure that is already present
in his lifeworld is relevant to Clark’s discussion of common ground.
In 1979, when Chil was 65 years old, a blood vessel in the left
hemisphere of his brain ruptured. He was left completely paralyzed on
the right side of his body and with a vocabulary that consisted of only
three words: yes, no, and and. Despite this, he continued to function as
a powerful actor in conversation, and indeed had an active social life
in his community, going by himself to a coffee shop in the morning,
doing some of the family shopping, and so forth.
Chil was my father. I visited him several times a year from the time of
his stroke in 1979 until his death in 2000. In 1992, I began to videotape
him, eventually recording approximately 210 hours of interaction in
which Chil was a participant. None of the recordings were in clinical
environments. Most took place in his home, although a few were made
in settings such as stores where Chil was shopping. The sequences to
be examined here were recorded in 1995 and 1997, 16–18 years after
his stroke, when Chil was in his early eighties. In most of them Chil is
sitting in his kitchen talking to me, his son Chuck, who was then in
his early fifties.
How is it possible for someone with a vocabulary of three words to say
something relevant and perform complicated action in conversation? In
brief, by creatively using the sequential organization of action in human
interaction Chil got others to produce the language he needed to say
what he wanted to say (Goodwin 1995, 2003b, 2004). Despite his lack
of productive language, Chil possessed a wide and important range of
communicative resources that could be used to guide his interlocutors.
First, his ability to understand what others were saying was excellent.
Second, he was able to use prosody to display both affect and a range of
subtly differentiated stances toward talk, other participants and events.
Indeed, what appear in a printed transcript as strings of nonsense
syllables (e.g., “ih dih dih dih dih:::!”) frequently function as carriers
for subtle and quite expressive prosodic tunes (Goodwin et al. 2002).
When placed precisely with reference to the actions of others, such
expressive prosody could create a variety of locally relevant actions.
Moreover, by producing single words or strings of his three words (for
example, “No no no”) with varying prosody, he could use his limited
vocabulary to create a range of quite different actions with varying
meaning (see also Stivers 2004). Indeed, when the unit being analyzed
includes both his words and their intonation, it would be accurate to
say his vocabulary was larger than written versions of his semantic
repertoire would indicate.
Third, Chil was able to produce many different kinds of gestures. In
Fig. 3.4 during line 1 Chil points toward a bagel he has just tasted.
In line 2 Chuck responds to the pointing gesture (indeed his deictic
“it” ties his talk to the target of Chil’s point). Perfectly consistent with
the arguments of Tomasello (Tomasello 2003, this volume), Chuck
recognizes that with his gesture Chil is intending to communicate
something to his addressee, to focus the attention of his addressee on
something. By using Chil’s pointing gesture as the point of departure
Conclusion
Sitting at the center of much of what is most distinctive about human
sociality, cognition, and language use is the utterance, that is, the
action through which one party says something to someone else. No
other animal is able to construct anything like human utterances.
The utterance constitutes the prototypical environment within which
language emerges in the natural world. It is a central locus for human
symbolic and cognitive activity. Moreover, as amply demonstrated by
the findings of conversation analysis (Sacks et al. 1974; Schegloff 1968;
Schegloff et al. 1977), talk in interaction constitutes a central form
of human social organization, a primordial site for human sociality.
Indeed, documenting the thoroughly pervasive practices through which
human beings build consequential action through interaction with
each other would seem to be a first task for any ethologist attempting
to provide a general description of human social behavior.
At first glance an utterance might be characterized as a strip of talk
produced by a speaker, that is, as the outcome of linguistic activity by
a single individual. Analysis could, and indeed frequently does, focus
exclusively on structure in the talk provided by the utterance, and on
linguistic, psychological, and neurological processes within the mind
and brain of a speaker that might account for the production of complex
strips of talk. It might seem possible for there to be a comfortable division
of labor with linguistics and psychology describing the mechanisms
required to produce the language structure found within an utterance,
while students of social life take over at its boundaries as multiple parties
exchange talk with each other.
In opposition to such a view, I have attempted in this chapter to
demonstrate that individual utterances are intrinsically multiparty,
requiring at a minimum both a hearer and a speaker, and are built
through coordinated social action from the outset. Moreover, to
describe the social coordination that builds an utterance it is necessary
to encompass analytically not only the structure of talk but also the
visible embodied displays of hearers, and frequently structure in the
surround. Utterances are multiparty, multimodal activities constructed
through the mutual elaboration of different kinds of signs.
The social, cognitive, and multimodal organization of utterances
has been investigated here by examining two quite different, but
mutually relevant, processes. First, “performance errors” have been
argued by linguists to demonstrate that actual speech provides only
degenerate data for the analysis of language structure (although there
is very important analysis in linguistics of how such errors might shed
light on mental processes implicated in the production of language, e.g.,
Fromkin 1971). Here, however, restarts were found to be systematically
used by speakers to secure the gaze and orientation of hearers. Rather
than providing evidence for a loose acceptance of flawed, fragmentary
speech in actual conversation, restarts allow a speaker to begin anew
a sentence when its hearer is at last orienting to it. They demonstrate
speakers’ precise concern for producing coherent sentences, not into
the air, but instead when their addressees are actually attending to
the speaker. Moreover, the processes of repair used to do this typically
involve recycling of a structure already produced with some significant
modification. Repairs provide within ongoing talk itself an endogenous
analysis of how the stream of speech can be divided into relevant units,
and the kinds of operations that are possible on those units. Such
performance errors are not only a locus for the ongoing achievement of
mutual orientation between speaker and hearer, that is, for constituting
through ongoing practice the multiparty participation framework that
sits at the center of human language, but also a crucial resource for
someone who does not yet know a language and faces the task of uncovering
its structure.
Second, the pointing activities of Chil, a man with very severe aphasia,
were examined. It was found that despite his almost complete lack of
productive language (his vocabulary consisted of only three words), Chil
was able to construct locally relevant meaningful utterances, and indeed
to function as a powerful actor in conversation. Again the multiparty,
multimodal organization of utterances constructed through multiple
sign systems was central to this process.
A range of diverse factors contributed to the organization of Chil’s
pointing. First, his points frequently, although not always (see Fig. 3.5),
emerged within a local sequential context and larger activity. These
provided a detailed interpretive point of departure for what he might
be indicating through a point. Second, his points typically invoked
meaningful structure, an historically shaped common ground, that had
been sedimented into the social and physical world that he inhabited
with relevant others. He builds action within a world that has already
been shaped by the semiotic activities of others. Their actions provide
him with both a prior sequential context, and a surround filled with
meaningful structure. Third, through sequential practices following
the pointing gesture, Chil and his interlocutors could calibrate both
what he was indicating through the point, and more crucially the
action he was attempting to accomplish by pointing. Chil got others
to produce the words he needed, with the effect that his utterances
(such as a proposal to visit Bear Mountain in Fig. 3.6) were constructed
through the collaborative activities of several different participants,
within a process that included embodied participation frameworks, and
meaningful structure in the environment. The multiparty, multimodal
organization of utterances, and the way in which action is sequentially
organized within ongoing interaction, provide the crucial environments
that enable Chil to make rich meaning, and act in concert with others
despite his catastrophic loss of productive language.
Such a perspective on the practices through which utterances and
actions are built might be relevant to investigation of the roots of
human sociality in a number of different ways. First, an initial, but
most important stage in any analysis occurs when the boundaries of
the phenomenon to be studied are defined. If crucial components of
the process being examined are rendered invisible and inaccessible to
study, phenomena that might be seen as rather straightforward within
a more inclusive view become deeply mysterious. Thus, the decision
to exclude performance errors and treat only well-formed sentences
as appropriate data for the study of how grammatical structure might
be recognized leads Pinker (1994:267) and others to posit a deus
ex machina outside the system itself, an innate module, to explain
how someone using language might be able to divide the stream of
speech into relevant units. By way of contrast, consider what happens
when the analytic frame is expanded to include not only well-formed
sentences and abstract speakers but also repair and embodied hearers.
The decomposition of the stream of speech into relevant subunits, the
different ways in which these units can and cannot be arranged, and
the task of distinguishing grammatical from ungrammatical structures,
are now made visible through the endogenous practices participants use
to build action together through talk. Such autopoietic organization, in
which the resources necessary to produce, sustain, and modify a system
are continuously reconstituted through the workings of the system itself,
is precisely what would be expected of any natural system built through
evolutionary processes (Favareau 2004). A framework that lodges the
production of strips of talk within the activities of multiple, embodied
actors building action together, frequently in relevant, consequential
environments, is also most relevant to the study of human sociality
in that it links the details of language use, with all of its symbolic and
cognitive import, to not only the psychology and the mental life of the
speaker, but also to elementary forms of human social organization.
Second, attempting to specify an analytic frame that does not exclude
crucial components of the phenomenon being examined might enable
us to ask more sensible questions. For example, in light of what has been
seen in this chapter, asking how language as an isolated self-contained
system might have evolved does not seem to be a reasonable question.
Clearly what must have evolved is this entire ecology of embodied
interactive practices being used by a species to build in concert with each
other the actions that make up their lifeworld (i.e., not only linguistic
structures in the stream of speech but also embodied participation
frameworks through which participants publicly display to each other
frames of mutual attention and relevance within which those units
can function as meaningful events). Sign systems do not evolve in
isolation as self-sufficient wholes, but rather through their use by agents
to accomplish relevant actions.
From this perspective it is interesting to examine the interactive matrix
that makes it possible for Chil to construct relevant meaning and action.
Consider Fig. 3.4 in which Chil accompanies a point toward a bagel
with an appreciative prosodic contour. First, these actions are lodged
within a participation framework in which he and his interlocutor are
visibly attending to each other and thus are able to take what each other
is doing into account. Second, his addressee treats Chil’s pointing as
a communicative act. Tomasello (this volume) argues that attributing
such communicative intentions to a pointing gesture is something
that distinguishes us from highly intelligent apes. Third, within this
framework Chil produces talk, although it is semantically empty and
encodes no propositional content whatsoever. If one had only the
stream of speech it would be impossible to figure out what was being
talked about. However, by virtue of the other co-occurring sign systems
within which Chil’s speech is embedded it is possible, indeed easy, for
his interlocutor to see the talk as in some way commenting on what is
being pointed at, and in Fig. 3.4 to locate a possible positive assessment
from Chil’s appreciative prosody. Chil is able to locate a topic and make
a comment about it without language. Note that this is not done
through gesture alone but, rather, through the interdigitation of a number
of quite different systems (prosody, embodied participation frameworks,
pointing, sequential organization, etc.) that mutually elaborate each
other within an embodied shared attentional frame that constitutes a
primordial site for human sociality. On many, many occasions Chil’s
interlocutors have difficulty figuring out what he wants to say. However,
lack of understanding can be remedied through subsequent sequences
of action in which interlocutors propose candidate readings that Chil
then accepts, rejects or modifies.
Chil’s situation provides a tragic natural experiment that allows us to
probe taken-for-granted assumptions about the generic organization of
talk. Although Chil’s case appears exceptional, the practices that make it
possible for him to build relevant meaning and action in concert with
others are central to the organization of all talk in interaction.
It is interesting to speculate how linguistic structure might emerge
within such a framework. Chil’s big problem as a semiotic actor is
that he is imprisoned in a web of intrinsically meaningful signs; his
gestures and prosody are indexical and iconic and thus capable of being
read in multiple ways. After almost every one of his utterances his
interlocutors have to check if they have correctly grasped what he
wants to say. Consider what would happen if such meaningful, analogic
displays were replaced with meaningless signs (e.g., something like the
precursors of phonetic units). It would then be necessary to operate
with conventionalized shared understandings about how to interpret
these units. Structures already in place provide the resources necessary
122 Properties of Human Interaction
Notes
1. I am deeply indebted to Candy Goodwin and John Haviland for insightful
discussions about the phenomena described in this chapter, and to Nick
Enfield, Steve Levinson, and two anonymous reviewers for very helpful
comments on an earlier draft.
2. I recognize only too well that I am unable to adequately re-represent this
prosody on the printed page, and that unfortunately the reader will have to
accept on faith my gloss for what I hear on the tape and what Chuck heard
while he was listening. However, because the tape exists it is possible for
others to listen themselves and possibly challenge my gloss, and certainly for
phoneticians to more precisely describe what in the stream of speech leads
to such hearing. However, that is beyond my ability and the scope of this
chapter.
3. See Jefferson (1979) for three-part units, with two sames followed by a
different, in laughter.
References
Bakhtin, M. M. 1999. The problem of speech genres. In The Discourse
Reader, edited by A. Jaworski and N. Coupland, 121–132. London:
Routledge.
Chomsky, N. 1965. Aspects of the theory of syntax. Cambridge, MA: MIT
Press.
Clark, H. H. 1996. Using language. Cambridge: Cambridge University
Press.
Clark, H. H., and J. E. Fox Tree. 2002. Using uh and um in spontaneous
speaking. Cognition 84:73–111.
Favareau, D. 2004. The biosemiotic turn: Towards a natural history
of signs. Ph.D. dissertation, Department of Applied Linguistics,
University of California, Los Angeles.
Fromkin, V. 1971. The non-anomalous nature of anomalous utterances.
Language 47:27–52.
Goffman, E. 1981. Footing. In Forms of talk, edited by E. Goffman,
124–159. Philadelphia: University of Pennsylvania Press.
Goodwin, C. 1981. Conversational organization: Interaction between
speakers and hearers. New York: Academic Press.
——. 1987. Forgetfulness as an interactive resource. Social Psychology
Quarterly 50(2):115–130.
——. 1995. Co-constructing meaning in conversations with an aphasic
man. Research on Language and Social Interaction 28(3):233–260.
——. 2000a. Action and embodiment within situated human interaction.
Journal of Pragmatics 32:1489–1522.
——. 2000b. Practices of seeing, visual analysis: An ethnomethodological
approach. In Handbook of visual analysis, edited by T. van Leeuwen
and C. Jewitt, 157–182. London: Sage.
——. 2002. Time in action. Current Anthropology 43(4–5):19–35.
——, (ed.). 2003a. Conversation and brain damage. Oxford: Oxford
University Press.
——. 2003b. Conversational frameworks for the accomplishment of
meaning in aphasia. In Conversation and brain damage, edited by C.
Goodwin, 90–116. Oxford: Oxford University Press.
——. 2003c. Embedded context. Research on Language and Social
Interaction 36(4):323–350.
——. 2003d. Pointing as situated practice. In Pointing: Where language,
culture, and cognition meet, edited by S. Kita, 217–242. Mahwah, NJ:
Erlbaum.
——. 2004. A competent speaker who can’t speak: The social life of
aphasia. Journal of Linguistic Anthropology 14(2):151–170.
Goodwin, C., M. H. Goodwin, and D. Olsher. 2002. Producing sense
with nonsense syllables: Turn and sequence in the conversations
of a man with severe aphasia. In The language of turn and sequence,
edited by C. Ford, B. Fox, and S. Thompson, 56–80. Oxford: Oxford
University Press.
Haviland, J. B. 1996. Projections, transpositions, and relativity. In
Rethinking Linguistic Relativity, edited by J. J. Gumperz and S. C.
Levinson, 271–323. Cambridge: Cambridge University Press.
——. 2003. How to point in Zinacantán. In Pointing: Where language,
culture, and cognition meet, edited by S. Kita, 139–170. Mahwah, NJ:
Erlbaum.
Hutchins, E. 1995. Cognition in the wild. Cambridge, MA: MIT Press.
Jefferson, G. 1979. A technique for inviting laughter and its
subsequent acceptance/declination. In Everyday language: Studies in
ethnomethodology, edited by G. Psathas, 79–96. New York: Irvington
Publishers.
Kendon, A. 1990. Conducting interaction: Patterns of behavior in focused
encounters. Cambridge: Cambridge University Press.
Kockelman, P. 2004. Stance and subjectivity. Journal of Linguistic
Anthropology 14(2):127–150.
Levinson, S. C. 1996. Language and Space. Annual Review of Anthropology
25:353–382.
——. 2000. Frames of spatial reference and their acquisition in Tenejapan
Tzeltal. In Culture, thought, and development, edited by L. Nucci, G.
Saxe and E. Turiel, 167–197. Hillsdale, NJ: Erlbaum.
McNeill, D. 1992. Hand and mind: What gestures reveal about thought.
Chicago: University of Chicago Press.
Pinker, S. 1994. The language instinct: How the mind creates language.
New York: HarperCollins.
Sacks, H., E. A. Schegloff, and G. Jefferson. 1974. A simplest systematics
for the organization of turn-taking for conversation. Language
50:696–735.
Schegloff, E. A. 1968. Sequencing in conversational openings. American
Anthropologist 70(6):1075–1095.
——. 1979. The relevance of repair for syntax-for-conversation. In Syntax
and semantics 12: Discourse and syntax, edited by T. Givón, 261–288.
New York: Academic Press.
——. 1987. Recycled turn beginnings: A precise repair mechanism in
conversation’s turn-taking organisation. In Talk and social organisation,
edited by G. Button and J. R. E. Lee, 70–85. Clevedon: Multilingual
Matters.
Schegloff, E., and H. Sacks. 1984. Opening up closings. In Language
in use: Readings in sociolinguistics, edited by J. Baugh and J. Sherzer,
263–274. Englewood Cliffs, NJ: Prentice-Hall.
Schegloff, E. A., G. Jefferson, and H. Sacks. 1977. The preference for
self-correction in the organization of repair in conversation. Language
53:361–382.
Stivers, T. 2004. “No no no” and other types of multiple sayings in social
interaction. Human Communication Research 30(2):260–293.
Tomasello, M. 2003. Constructing a language: A usage-based theory of
language acquisition. Cambridge, MA: Harvard University Press.
Wilkinson, R. 1999. Special issue on Conversation analysis and aphasia.
Aphasiology 13(4–5):327–343.
Wilkinson, R., S. Beeke, and J. Maxim. 2003. Adapting to conversation:
On the use of linguistic resources by speakers with fluent aphasia in
the construction of turns at talk. In Conversation and brain damage,
edited by C. Goodwin, 59–89. Oxford: Oxford University Press.
Wittgenstein, L. 1958. Philosophical Investigations, edited by G. E. M.
Anscombe and R. Rhees; translated by G. E. M. Anscombe. 2nd
edition. Oxford: Blackwell.
four
Social actions are the stuff of daily life. Walking on crowded sidewalks,
working with colleagues, or eating with friends—these are activities
we cannot carry out alone. It takes coordination to avoid collisions,
negotiate business, and share food. Most of these activities are joint
activities—activities in which two or more participants coordinate with
each other to reach what they take to be a common set of goals. Without
joint activities life would be impossible. We do more than merely work
around each other. We work with each other, and on a range of common
goals. Humans come equipped for joint action—with what Levinson
(this volume) calls the “interaction engine”—and they engage in it
from infancy on (see Boyd and Richerson, Gergely and Csibra, and
Liszkowski in this volume).
Joint activities are managed through joint commitments (Clark 1996).
I can commit myself privately to doing something and then act on that
commitment. I may tell myself, “I’ll have a beer when I get home,” and
when I get home, I have a beer. But for you and me to do something
together—say, shake hands—it is not enough for me to commit privately
to grasping your hand, or for you to commit privately to grasping mine.
We must act on a joint commitment to shake hands. The argument here
is that joint commitments are essential to all true joint activities. They
are the guiding force inside Levinson’s interaction engine.
When we take on joint commitments, we ordinarily do so for the
benefits they afford—in avoiding collisions, negotiating fair contracts,
and sharing food efficiently. But joint commitments also carry risks.
Some risks come from ceding partial control over one’s actions to others.
Once you and I are committed to shaking hands, you might crush
my hand or withdraw at the last minute. Other risks come from the
Social Actions, Social Commitments
In line 3, for example, Burton holds a sidepiece steady while Ann affixes
the crosspiece onto it. This is a joint action pure and simple. The two
of them coordinate their individual actions—Ann affixing one board
while Burton holds the other one steady—to reach a common goal, the
attachment of the two boards. Ann does what she does contingent on
Ann and Burton’s talk is not idle. It is what allows them to arrange,
agree on, or coordinate who is to do what when and where. Here,
too, Ann and Burton carry out paired actions, but the pairs are turn
sequences like this:
(2) Ann Should we put this in, this, this little like kinda cross
bar, like the T? like the I bar?
Burton Yeah ((we can do that))
In the first turn, Ann proposes that they attach the crosspiece, and in
the second, Burton takes up her proposal and agrees to it. The two of
them proceed this way throughout the TV stand assembly. They make
agreement after agreement about which pieces to connect when, how
to orient each piece, who is to hold, and who is to attach.
As this example illustrates, joint activities ordinarily can be partitioned
into two activities: (1) a basic joint activity; and (2) coordinating joint
actions. Consider the two parts of Ann and Burton’s assembly of the
TV stand:
The basic joint activity, or joint activity proper, is what Ann and Burton
are basically doing—assembling a TV stand. It consists of the actions
and positions they consider essential to their basic goal—the assembly
of the TV stand.
The coordinating joint actions are what Ann and Burton do to coordinate
their basic activity. They consist of communicative acts about the basic
activity.
It takes both sets of actions to assemble the TV stand. The first set
effects the assembly proper, and the second coordinates the joint actions
needed to effect the assembly proper. Ann and Burton surely see these
two activities as different. What they were asked to do was “assemble a
TV stand.” If asked, “But weren’t you talking?” they might have replied,
“Oh yes. That was to figure out who was to do what.”
To complicate the picture, communicative acts are themselves joint
actions (Clark 1996). For each utterance, speakers and addressees must
coordinate the speaker’s vocalizations with the addressee’s attention to
those vocalizations, the speaker’s wording with the addressee’s identification
of that wording, and what the speaker means with what the
addressee understands the speaker to mean. In earlier work, I have called
the process of coordinating on these points collateral communication.
So just as basic joint actions are coordinated by communicative acts,
communicative acts are coordinated by collateral acts (Clark 1996,
2004). I will say no more here about collateral acts.
It takes coordination, therefore, to carry out joint activities, and
communicative acts to achieve that coordination. But to agree on a joint
course of action is really to establish a joint commitment to that course
of action. How is that done?
Projective Pairs
It is one thing to characterize joint commitments, but quite another
to say how they get established. One way is with projective pairs.
After Ann and Burton attached the crosspiece to the two sidepieces,
they were at a choice point: What to do next. They needed to establish
a joint commitment to a course of action. They could not count on
such a commitment arising spontaneously and simultaneously. They
had to make it happen, and they did it this way:
(3) Burton ((Now let’s do this one)) [picking up the top-piece]
Ann Okay
Burton proposed to Ann that the two of them (“let’s”) assemble the top
piece (“do this one”) next (“now”). Ann took up his proposal by agreeing
to it (“Okay”). In just two turns, they established a joint commitment
to assemble the top piece next. They specified the ensemble (“us”) and
goal (“do this one”) as well as the commitments to do their parts in
reaching the goal.
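The two-turn structure just described can be sketched in code. This is only a toy model, not anything proposed in the chapter: the names `JointCommitment` and `projective_pair`, and the small set of accepting words, are assumptions introduced for illustration. The point it shows is that the first part alone establishes nothing; a joint commitment exists only once the second part takes the proposal up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JointCommitment:
    ensemble: frozenset  # who is committed, e.g. {"Ann", "Burton"}
    goal: str            # what they are jointly committed to doing


def projective_pair(proposer: str, goal: str,
                    responder: str, uptake: str) -> Optional[JointCommitment]:
    """Model a two-part exchange: a proposal, then the responder's uptake.

    A joint commitment results only if the second part takes up the
    proposal. The set of accepting words is a hypothetical simplification.
    """
    accepting = {"okay", "yeah", "all right"}
    if uptake.strip().lower() in accepting:
        return JointCommitment(frozenset({proposer, responder}), goal)
    return None  # no uptake, so nothing has been jointly established


# Example (3): Burton proposes, Ann agrees.
commitment = projective_pair("Burton", "assemble the top piece next",
                             "Ann", "Okay")
print(commitment)
```

A declined or disregarded proposal returns `None` in this sketch, which mirrors the observation that neither party can establish the commitment alone.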
This pair of turns is what Schegloff and Sacks (1973; see also Schegloff
this volume) called an adjacency pair. In such pairs, one person produces
the first part, and another person, the second part. The first part is
of a type for which it is conditionally relevant for the second part to
be of a type projected by the first part. Burton produced a suggestion;
that projected her consent to go ahead as the second part; and Ann
immediately gave her consent with “Okay” (see Bangerter and Clark
2003).
What is needed here, however, is the more general notion of projective
pair (Clark 2004). In Schegloff and Sacks’s account of adjacency pairs,
both parts must be turns at talk, yet in many situations, one or both
parts of analogous pairs are gestural. Later in assembling the TV stand,
Ann and Burton produce this sequence of actions:
Incremental Commitments
Joint elements tend to get fully specified piecemeal. When Ann and
Burton arrived at the lab room, they were asked to participate in a
psychology experiment. When they agreed, all they were committed
to was “doing some activity together in this room for the next hour.” It
was only after further instructions that this commitment got narrowed
to “assembling a TV stand together.” It got narrowed further to “doing
the top-piece together” with this adjacency pair:
Explicit commitments. Burton and Ann used all the talk in 1 to commit
explicitly to the roles, content, and timing of their next joint actions.
Joint salience. In 3, Burton picked up the top piece as he spoke, making it
obvious that he would affix the top piece to the piece Ann was holding.
That helped fix their roles and the content of their action. Indeed, they
presupposed that as they carried out the next joint action.
Precedent. Ann and Burton often established who would do what on
the basis of what they had just done. Once, for example, when Ann
had just inserted one peg, the two of them presupposed that she
would insert the second one too.
Conventional practice. Several couples assembling the TV stand
(although not Ann and Burton) presupposed that the man was in
charge, and the woman was the assistant: Building furniture was a
man’s job. This presupposition was apparently based on their idea
of conventional practice.
Hierarchies of Commitment
Most joint activities, as I noted earlier, can be viewed as hierarchies of
joint positions and actions. These hierarchies, too, emerge bit by bit,
and so, therefore, do the joint commitments that coordinate them.
Consider the assembly of the TV stand first by Peter working alone and
then by Ann and Burton working together.5
Peter assembles the TV stand more or less according to a standard
means–end analysis (Newell and Simon 1972). He begins with the
problem, “How to assemble the TV stand from its parts,” which he then
decomposes recursively into subproblems. He does the decomposition
one piece at a time. What emerges is a hierarchy of self-commitments
that can be represented as a standard outline:
1. Build TV stand
1.1. Arrange parts
1.1.1. Put sides in pile
1.1.2. Put screws, pegs in pile
1.1.3. Put wheels in pile
1.2. Assemble parts
1.2.1. Attach top-piece to side 1
1.2.1.1. Insert pegs
1.2.1.2. Affix top piece to pegs
1.2.2. Attach side 2 to top-piece
Etc.
Peter first decomposes the entire task into “arranging the parts” and
“assembling the parts.” He then decomposes “arranging the parts” into
“gathering the sides into a pile” plus “gathering the screws and pegs”
plus “gathering the wheels.” And so on. Each line represents a self-
commitment to a state or action.
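Peter's recursive decomposition can be illustrated with a small tree structure. The sketch below is an assumption-laden illustration, not part of the original analysis: the class name `Commitment` and the function `outline` are hypothetical, and only the portion of the hierarchy listed above is encoded. Each node is a commitment to a state or action, and its children are the subproblems it was decomposed into.

```python
from dataclasses import dataclass, field


@dataclass
class Commitment:
    """One node in the hierarchy: a commitment to a state or action."""
    label: str
    parts: list = field(default_factory=list)  # sub-commitments, in order


def outline(node, number="1"):
    """Return numbered outline lines for the node and its parts, depth first."""
    lines = [f"{number}. {node.label}"]
    for i, part in enumerate(node.parts, start=1):
        lines.extend(outline(part, f"{number}.{i}"))
    return lines


# The part of Peter's hierarchy given in the outline above.
peter = Commitment("Build TV stand", [
    Commitment("Arrange parts", [
        Commitment("Put sides in pile"),
        Commitment("Put screws, pegs in pile"),
        Commitment("Put wheels in pile"),
    ]),
    Commitment("Assemble parts", [
        Commitment("Attach top-piece to side 1", [
            Commitment("Insert pegs"),
            Commitment("Affix top piece to pegs"),
        ]),
        Commitment("Attach side 2 to top-piece"),
    ]),
])

print("\n".join(outline(peter)))
```

Printing the outline reproduces the numbering scheme used in the text, with each level of the number corresponding to one step of decomposition.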
Ann and Burton, too, do a means–end analysis (more or less), but
with a crucial difference: They do it together. They establish a hierarchy
of joint commitments, not self-commitments, which looks something
like this:
1. Build TV stand
1.1. Attach cross-piece to side-piece
1.1.1. Stick pegs into side-piece
1.1.1.1. Find pegs
1.1.1.2. Insert pegs into side-piece
1.1.2. Affix cross-piece to side piece
1.2. Attach top-piece to side-piece
Etc.
Ann and Burton establish most of these joint commitments by negotiation.
They agree on 1.1, for example, by means of the adjacency pair
in (2), repeated here:
(2) Ann Should we put this in, this, this little like kinda cross
bar, like the T? like the I bar?
Burton Yeah ((we can do that))
They next agree on 1.1.1, but that takes eight more turns. And so it goes.
Although Ann and Burton assembled the same TV stand as Peter did,
they had to coordinate two people’s ideas, two people’s commitments,
two people’s actions.
9 Ned yeah, well, . — see you then, . all right, [see you Friday
10 Molly [that’s wonderful,
Taking leave
11 right, bye now,
12 Molly bye,
Terminating contact
13 [Both hang up]
Once Ned and Molly finish the topic about cars in line 6, Molly offers
to start closing the conversation with “right,” and Ned agrees, “okay.”
With that exchange, they begin the actual closing, in which they make
future plans, take leave (with “bye now” and “bye”), then hang up.
Closing a conversation is a joint decision, but once it is made, the two
parties still have work to do. Even routine telephone calls, like calls to
directory enquiries, have closing sections, although the closings are
briefer, reflecting the less intimate activity just completed (Clark and
French 1981; Clark and Schaefer 1987).
But did these people really “knuckle under to the demands of authority”?
Did they show “uncritical acceptance of the experimenter’s definition
of the situation”?
The Milgram Experiment as Two Joint Activities
The psychology experiment, Martin Orne (1962) argued, is “a very
special form of social interaction” (p. 782). The demands it places on
experimenters and subjects are like the demands placed on the participants
in any social interaction—in any joint activity. The Milgram
experiment is an example par excellence of Orne’s argument.
The Milgram experiment, as viewed by the subject, is really two joint
activities, one embedded within another:
Memory task. This joint activity has two participants, whose roles are
teacher and learner. The goal is for the teacher to teach the learner a
list of word pairs. The basic activity is a series of cycles of joint action:
the teacher gets the learner to learn the word pairs one by one. On each
cycle, the two of them coordinate through scripted projective pairs: the
teacher presents a test word, and the learner responds with the paired
word; the teacher does or doesn’t shock the learner, and the learner
does or doesn’t groan or yell.
Psychology experiment. The larger joint activity has three participants,
whose roles are experimenter and subjects. Their goal is to carry out a
psychology experiment. They also coordinate through projective pairs.
These include the experimenter’s instructions as well as long exchanges
between subject and experimenter.
The subjects believed that the joint activity of interest was the memory
task. But the real activity of interest was the psychology experiment: At
what point would they opt out of it?
In Milgram’s book Obedience to Authority, the first description of his
experiments was this: “A person comes into the psychological laboratory
and is told to carry out a series of acts that come increasingly into conflict
with conscience” (p. 3). But this is a quite misleading characterization
of what went on—and a good example of what Orne was speaking of.
The extensive dialogues between experimenter and subject, quoted by
Milgram, reveal something very different. The subject was not simply
“told to carry out a series of acts.” He negotiated with the experimenter
on almost every act and position he took. These negotiations were often
prolonged and intense, shaping what the subject did.
Mitigation
The experimenter relied on a range of negotiating tactics. One was
mitigation. Although the entire experiment hinged on the harm the
subjects thought their shocks were causing, what constituted harm was
negotiated by the experimenter and subject:
If the subject asked if the learner was liable to suffer permanent physical
injury, the experimenter said: “Although the shocks may be painful,
there is no permanent tissue damage, so please go on.” (Followed by
Prods 2, 3, and 4, if necessary.) If the subject said that the learner did not
want to go on, the experimenter replied: “Whether the learner likes it
or not, you must go on until he has learned all the word pairs correctly.
So please go on.” (Followed by Prods 2, 3, and 4, if necessary.) [Milgram
1974:21–22]
As Milgram said about Prozi, “Once the experimenter has reassured the
subject that he is not responsible for his actions, there is a perceptible
reduction in strain” (p. 160). Another subject, called Rensaleer, was
interviewed after the experiment, “When asked who was responsible
for shocking the learner against his will, he said, ‘I would put it on
myself entirely’ ” (p. 51).
Negotiations of responsibility (as with Prozi) also go to the heart of
the study. If the experimenter was fully responsible for harm done,
then subjects had less reason to call off the experiment. But how many
subjects negotiated responsibility? Milgram did not say. Still, Rensaleer
called off the experiment midway, whereas Prozi continued his shocks
to the maximum.
Risks of Exploitation
Other negotiating tactics by the experimenter were patently exploitative.
One was to use disregard in uptake, as described earlier. Here is an
illustration:
Subject: I can’t stand it. I’m not going to kill that man in
there. You hear him hollering?
Experimenter: As I told you before, the shocks may be painful,
but—
Subject: But he’s hollering. He can’t stand it. What’s going
to happen to him?
Experimenter: (his voice is patient, matter-of-fact): The experiment
requires that you continue, Teacher. [p. 73]
In the first exchange, the subject suggests that he may “kill that man
in there” and asks “You hear him hollering?” Although the suggestion
and question are serious, the experimenter disregards both by simply
repeating a point he had made before. In the second exchange, he
disregards all three of the subject’s proposals.
To disregard a proposal is to imply that it is not worthy of consideration.
It may be unimportant. It may be irrelevant. It may be misconceived. It
may be too obvious to deal with. So when the experimenter disregards
“You hear him hollering?” and “What’s going to happen to him?” the
subject can take him as implying that the questions are misconceived or
irrelevant. These interpretations are reinforced when the experimenter
speaks “with detached calm.” Exchanges like this were common in the
transcripts quoted in Milgram’s book and sometimes lasted for 20 to
30 turns.
As for the learner, the experimenter disregarded everything he said,
even when he screamed, “Let me out of here, you have no right to
keep me here. Let me out of here, let me out, my heart’s bothering me,
let me out!” What could the subject conclude except that the learner’s
demands were of no importance or relevance?
Subjects’ decisions were, indeed, influenced by how they negotiated.
In an analysis of the complete unpublished transcripts of one of
Milgram’s experiments, Modigliani and Rochat (1995) found that
subjects who raised questions or made objections early in the session
were significantly more likely to abort the experiment early. “Evidently,”
Modigliani and Rochat argued, “certain forms of verbal resistance can
alter the dynamics of interaction sufficiently to change its future course
and facilitate escape” (p. 119). In another experiment, the experimenter
and subject communicated only by telephone, so their negotiations were
presumably fewer, briefer, and less intense. In the standard experiment
(in Bridgeport), 65 percent of the subjects continued the shocks to the
maximum. In the telephone experiment, only 20 percent did.
Risks of Overcommitment
The participants in the Milgram experiments created tall stacks of
joint commitments. When a volunteer arrived at the lab, he agreed
first to be in the psychology experiment, then to be in the memory
study, and then to enter each part of the memory study. Consider a
hypothetical subject named Sam who is just about to shock the learner
for failing on word pair 14. From Sam’s perspective, the hierarchy of
joint commitments at that moment looks something like this (with
the critical line in italics):
But why these reactions and not others? These reactions reflect the
subjects’ anxiety about hurting the learner and not anger at, or
disappointment with, the experimenter. The subjects apparently took the
joint commitments as faits accomplis and focused instead on the harm
they were inflicting on the learner. These reactions might be expected
from the stacking and persistence of joint commitments.
Not all subjects reacted this way. Many confronted the experimenter
with moral issues, which then became points of negotiation. Different
subjects were quoted as saying, “I’m not going to kill that man in there”
and “You accept all responsibility” and “Surely you’ve considered the
ethics of this thing. (extremely agitated)” (p. 48). Some subjects were
placated in these negotiations, but others were not. When Rensaleer
was urged to go on (at 255 volts) by being told “You have no other
choice,” he responded (p. 51):
I do have a choice. (Incredulous and indignant:) Why don’t I have a choice?
I came here on my own free will. I thought I could help in a research
project. But if I have to hurt somebody to do that, or if I was in his place,
too, I wouldn’t stay there, I can’t continue. I’m very sorry. I think I’ve
gone too far already, probably.
Not only did Rensaleer display anger at the experimenter for trying to
draw him into this joint commitment, but he offered moral reasons
for opting out. And yet Rensaleer apologized for wrecking their session
(“I’m sorry”) and negotiated a joint exit to the experiment. Despite
everything, he took his joint commitments with the experimenter
seriously and found a satisfactory way to discharge them.
Conclusions
Sociality is not a mere abstraction. It is a feature of life that gets
played out in concrete social actions. These actions depend not only
on linguistic acts, as characterized by Schegloff (this volume), but on
extralinguistic acts. These range all the way from the pointing gestures
in Enfield’s (this volume) Laotian women, Goodwin’s (this volume)
stroke victim, Hutchins’s (this volume) ship navigators, Liszkowski’s
(this volume) infants, and Levinson’s (this volume) Rossel Islanders
to the head gestures of Gergely and Csibra’s (this volume) parents and infants, and
the manual transfer of screws between Ann and Burton. Social actions
also take place in material locations, whether that is a ship’s navigation
room (Hutchins), a living room (Enfield, Goodwin), a lab room (Clark,
Gergely and Csibra), or an outdoor meeting area (Levinson).
Whatever the means and settings, people cannot take social actions—
they cannot carry out joint activities—without making commitments
to each other. As I argued, entering into joint commitments has
both benefits and risks. The benefits are obvious—the usual reasons
for engaging in a joint activity. Working together, Ann and Burton
were able to assemble the TV stand quickly and efficiently. But the
risks of joint commitments are just as real. Subjects in the Milgram
experiment, negotiating with the experimenter, were drawn into actions
they did not anticipate, want to do, or approve of. Such is the power
of joint commitments—the guiding force inside Levinson’s interaction
engine.
Acknowledgment
Some of the research reported here was supported by Grant
N000140010660 from the Office of Naval Research. I am indebted
to the participants at the Wenner-Gren Symposium in Duck, NC, in
October 2005 for discussions of the issues presented here. I thank Adrian
Bangerter, Eve V. Clark, Teenie Matlock, Elsie Wang, Nick Enfield, and
Steve Levinson for comments on drafts of this chapter.
Notes
1. I thank Julie Heiser and Barbara Tversky for use of their video recording
of this session.
2. This puzzle has been examined, without resolution, by philosophers
(e.g., Bratman 1992; Grice 1989; Harman 1977; Searle 1990; Tuomela 1995),
computer scientists (e.g., Cohen and Levesque 1991; Grosz and Sidner 1990),
and psychologists (Clark 1996; Clark and Carlson 1982; Tomasello et al. in
press). Still, the schema I describe later is close to a consensus solution to the
puzzle.
3. The term commitment is used in game theory (e.g., Schelling 1960) in a
sense closest to what I am calling public self-commitment.
4. Here I put aside institutionally based illocutionary acts that Searle calls
declarations.
5. For Peter, I used a video recording of a lone individual assembling the
same TV stand that Ann and Burton assembled. I thank Sandra Lozano and
Barbara Tversky for the recording.
6. See pushdown stacks in computer programming.
7. There was no mention in the guidelines that the experimenter could
negotiate responsibility for harm. Nor is this usually mentioned in discussions
of Milgram’s findings. Apparently, the experimenter improvised in other
unspecified ways, too.
References
Austin, J. L. 1962. How to do things with words. Oxford: Oxford University
Press.
Bach, K., and R. M. Harnish. 1979. Linguistic communication and speech
acts. Cambridge, MA: MIT Press.
Bangerter, A., and H. H. Clark. 2003. Navigating joint projects with
dialogue. Cognitive Science 27(2):195–225.
Bratman, M. E. 1992. Shared cooperative activity. Philosophical Review
101:327–341.
Clark, H. H. 1996. Using language. Cambridge: Cambridge University
Press.
——. 2004. Pragmatics of language performance. In Handbook of
pragmatics, edited by L. R. Horn and G. L. Ward, 365–382. Oxford:
Blackwell.
Clark, H. H., and T. B. Carlson. 1982. Speech acts and hearers’ beliefs. In
Mutual knowledge, edited by N. V. Smith, 1–36. New York: Academic
Press.
Clark, H. H., and J. W. French. 1981. Telephone goodbyes. Language in
Society 10:1–19.
Clark, H. H., and C. Marshall. 1981. Definite reference and mutual
knowledge. In Elements of discourse understanding, edited by A. K. Joshi,
B. L. Webber, and I. A. Sag, 10–63. New York: Cambridge University
Press.
Clark, H. H., and E. F. Schaefer. 1987. Collaborating on contributions
to conversations. Language and Cognitive Processes 2(1):19–41.
——. 1989. Contributing to discourse. Cognitive Science 13:259–294.
Clark, H. H., R. Schreuder, and S. Buttrick. 1983. Common ground
and the understanding of demonstrative reference. Journal of Verbal
Learning and Verbal Behavior 22:245–258.
Cohen, P. R., and H. J. Levesque. 1991. Teamwork. Nous 25:487–512.
Grice, H. P. 1989. Studies in the way of words. Cambridge, MA: Harvard
University Press.
Grosz, B. J., and C. L. Sidner. 1990. Plans for discourse. In Intentions in
communication, edited by P. R. Cohen, J. Morgan, and M. E. Pollack,
419–444. Cambridge, MA: MIT Press.
Harman, G. 1977. Review of “Linguistic behavior” by Jonathan Bennett.
Language 53:417–424.
Lewis, D. K. 1969. Convention: A philosophical study. Cambridge, MA:
Harvard University Press.
Milgram, S. 1963. Behavioral study of obedience. Journal of Abnormal
and Social Psychology 67:371–378.
——. 1974. Obedience to authority: An experimental view. New York: Harper
and Row.
Modigliani, A., and F. Rochat. 1995. The role of interaction sequences
and the timing of resistance in shaping obedience and defiance to
authority. Journal of Social Issues 51(3):107–123.
Newell, A., and H. A. Simon. 1972. Human problem solving. Englewood
Cliffs, NJ: Prentice-Hall.
Orne, M. T. 1962. On the social psychology of the psychological
experiment: With particular reference to demand characteristics and
their implications. American Psychologist 17(11):776–783.
Schegloff, E. A., and H. Sacks. 1973. Opening up closings. Semiotica
8:289–327.
Schelling, T. C. 1960. The strategy of conflict. Cambridge, MA: Harvard
University Press.
Searle, J. R. 1969. Speech acts. Cambridge: Cambridge University Press.
——. 1975. A taxonomy of illocutionary acts. In Minnesota studies in the
philosophy of language, edited by K. Gunderson, 334–369. Minneapolis:
University of Minnesota Press.
——. 1990. Collective intentions and actions. In Intentions in
communication, edited by P. R. Cohen, J. Morgan, and M. E. Pollack,
401–415. Cambridge, MA: MIT Press.
Svartvik, J., and R. Quirk (eds.). 1980. A corpus of English conversation.
Lund, Sweden: Gleerup.
Tomasello, M., M. Carpenter, J. Call, T. Behne, and H. Moll. in press.
Understanding and sharing intentions: The origins of cultural
cognition. Behavioral and Brain Sciences.
Tuomela, R. 1995. The importance of us: A philosophical study of basic
social notions. Stanford: Stanford University Press.
Part 2
Psychological Foundations
five
Figure 5.1. Pointing across trials: Mean proportion of trials in which infants
pointed at least once (and so E reacted at least once).
Figure 5.2. Point repetitions within trials: Mean number of points per trial with
at least one point.
Figure 5.3. Looking behavior across trials with a point: Mean number of looks
to E per trial with a point.
infant’s attention (i.e., he misunderstood what the infant pointed at)
or he did not share the infant’s interest (i.e., he commented neutrally
[“uninterested”] about the referent). We controlled the referent of E’s
attention, and his attitude about it as expressed in his comment (see
Table 5.1 for a summary). First, we wanted to know whether infants
would be satisfied when E simply oriented behaviorally in the general
direction of the referent without actually referring to it (a barrier
obstructed his line of sight to infants’ referent). Second, we wanted
to know whether the adult needed to share the interest and emote
positively, or whether a neutral comment would suffice. Eighty infants
participated in the final sample. In a Joint Attention condition E attended
to the infant’s referent and emoted positively about it (but never named
it). In a Misunderstanding condition, E reacted in the same way except
that a barrier obstructed his line of sight to the infant’s referent and
E mistakenly referred to an insignificant piece of paper attached to
the barrier (see Fig. 5.4). In the Uninterested condition, there was no
barrier and E reacted as in Joint Attention, except that he commented
neutrally about the referent, stating his disinterest in it. The No Sharing
condition involved again the same barriers as in the Misunderstanding
condition, and E reacted neutrally, as in the Uninterested condition, to
an incorrect referent.
Table 5.1. Design of Study 2.
Condition          Reference   Attitude
Joint Attention    +           +
Misunderstanding   -           +
Uninterested       +           -
No Sharing         -           -
We used the same measures as in Study 1. Table 5.2 summarizes the main
results. First, as in Study 1, infants preferred the Joint Attention condition,
pointing on significantly more trials in that condition than in each of
the other three. Infants also looked to E significantly more often in all
other than the Joint Attention condition, presumably because they were a
bit puzzled about those reactions. Second, when E emoted positively but
did not refer to infants’ referent (Misunderstanding condition), infants
were not satisfied and repeated pointing within a trial significantly more
Figure 5.4. Schematic setup of Study 2 with barriers. Background: cloth sheet
with window openings to protrude 5 puppets. In front: barriers that obstruct E’s
line of sight to infant’s referent, and three stimuli that are activated electronically.
Acknowledgments
I am thankful for interesting discussions with all contributors to this
volume, and for helpful comments by Nick Enfield and Steve Levinson.
Thanks for comments on an earlier version to Franklin Chang, Malinda
Carpenter, and Mike Tomasello.
Notes
1. A communicative intention entails not simply S’s intent that R understands
that (R think X), but rather S’s intent that R understands that (S intends that
[R think X]). The intention that “R think X” is embedded in the intention that
R understands S’s intention toward R’s understanding.
2. Carpenter et al. (1998) used the number of infants as a dependent
measure instead of the frequency of pointing. The pattern of their results still
holds when distal gestures only are analyzed (no “shows” and “gives”; M.
Carpenter, personal communication, December 2004).
References
Baron-Cohen, S. 1989. Perceptual role taking and protodeclarative
pointing in autism. British Journal of Developmental Psychology
7(2):113–127.
Bates, E., L. Camaioni, and V. Volterra. 1975. The acquisition of performatives
prior to speech. Merrill-Palmer Quarterly 21(3):205–226.
Bloom, P., and T. P. German. 2000. Two reasons to abandon the false
belief task as a test of theory of mind. Cognition 77(1):283–286.
Breheny, R. in press. Communication and folk psychology. Mind and
Language.
Bretherton, I., S. McNew, and M. Beeghly-Smith. 1981. Early person
knowledge as expressed in gestural and verbal communication: When
do infants acquire a “theory of mind”? In Infant social cognition:
Empirical and theoretical considerations, edited by M. E. Lamb and L.
R. Sherrod, 333–373. Hillsdale, NJ: Erlbaum.
Brinck, I. 2004. The pragmatics of imperative and declarative pointing.
Cognitive Science Quarterly 3(4):429–446.
Brooks, R., and A. N. Meltzoff. 2002. The importance of eyes: How
infants interpret adult looking behavior. Developmental Psychology
38(6):958–966.
Bruner, J. 1983. Child’s Talk. New York: W. W. Norton.
Butterworth, G. 1998. What is special about pointing in babies? In The
development of sensory, motor and cognitive capacities in early infancy:
From perception to cognition, edited by F. Simion and G. Butterworth,
171–190. Hove: Psychology Press.
Camaioni, L. 1993. The development of intentional communication:
A re-analysis. In New perspectives in early communicative development,
edited by J. Nadel and L. Camaioni, 82–96. London: Routledge.
Camaioni, L., P. Perucchini, F. Bellagamba, and C. Colonnesi. 2004. The
role of declarative pointing in developing a theory of mind. Infancy
5(3):291–308.
Carpendale, J., and C. Lewis. 2004. Constructing an understanding of
mind: The development of children’s social understanding within
social interaction. Behavioral and Brain Sciences 27(1):96–97.
Carpenter, M., K. Nagell, and M. Tomasello. 1998. Social cognition, joint
attention, and communicative competence from 9 to 15 months of
age. Monographs of the Society for Research in Child Development, no.
255, 63(4):1–176.
Delgado, B., J. C. Gómez, and E. Sarriá. 1999. Non-communicative
pointing in preverbal children. Paper presented at the 9th European
Conference on Developmental Psychology, Spetses, Greece, September
4.
DeLoache, J. S., D. J. Cassidy, and A. L. Brown. 1985. Precursors of
mnemonic strategies in very young children’s memory. Child
Development 56(1):125–137.
Desrochers, S., P. Morissette, and M. Ricard. 1995. Two perspectives
on pointing in infancy. In Joint attention: Its origins and role in
development, edited by C. Moore and P. Dunham, 85–101. Hillsdale,
NJ: Erlbaum.
Egyed, K., I. Kiraly, and G. Gergely. 2004. Object-centered versus agent-
centered interpretations of attitude expressions. Paper presented at
the International Conference on Infant Studies, Chicago, May 6.
Flom, R., G. O. Deak, C. G. Phill, and A. D. Pick. 2004. Nine-month-olds’
shared visual attention as a function of gesture and object location.
Infant Behavior and Development 27(2):181–194.
Franco, F., and G. Butterworth. 1996. Pointing and social awareness:
Declaring and requesting in the second year. Journal of Child Language
23(2):307–336.
Gómez, J. C., E. Sarriá, and J. Tamarit. 1994. The comparative study of
early communication and theories of mind: Ontogeny, phylogeny,
and pathology. In Understanding other minds: Perspectives from autism,
edited by S. Baron-Cohen, H. Tager-Flusberg, and D. J. Cohen, 397–
426. Oxford: Oxford University Press.
Goodwin, C. 2003. Pointing as situated practice. In Pointing: Where
language, culture, and cognition meet, edited by S. Kita, 217–241.
Mahwah, NJ: Erlbaum.
Grice, H. P. 1969. Utterer’s meaning and intentions. Philosophical Review
78:147–177.
Harris, P. 2005. Conversation, pretence, and theory of mind. In Why
language matters for theory of mind, edited by J. W. Astington and J.
Baird, 70–83. Oxford: Oxford University Press.
Kendon, A. 2004. Gesture: Visible action as utterance. Cambridge:
Cambridge University Press.
Kita, S. (ed.). 2003. Pointing: Where language, culture, and cognition meet.
Mahwah, NJ: Erlbaum.
Leavens, D. A., J. L. Russell, and W. D. Hopkins. 2005. Intentionality as
measured in the persistence and elaboration of communication by
chimpanzees (Pan troglodytes). Child Development 76(1):291–306.
Leung, E. H., and H. L. Rheingold. 1981. Development of pointing as
a social gesture. Developmental Psychology 17(2):215–220.
Liszkowski, U., M. Carpenter, A. Henning, T. Striano, and M. Tomasello.
2004. Twelve-month-olds point to share attention and interest.
Developmental Science 7(3):297–307.
Liszkowski, U., M. Carpenter, T. Striano, and M. Tomasello. 2006.
Twelve- and 18-month-olds point to provide information. Journal
of Cognition and Development 7(2).
Liszkowski, U., M. Carpenter, and M. Tomasello. n.d. Infant pointing:
Reference and attitude. MS submitted for publication.
Marcos, H., and J. Bernicot. 1994. Addressee co-operation and request
reformulation in young children. Journal of Child Language 21(3):677–692.
Masataka, N. 2003. From index-finger extension to index-finger
pointing: Ontogenesis of pointing in preverbal infants. In Pointing:
Where language, culture, and cognition meet, edited by S. Kita, 69–84.
Mahwah, NJ: Erlbaum.
Meltzoff, A. N., and M. Moore. 1977. Imitation of facial and manual
gestures by human neonates. Science 198(4312):75–78.
Moore, C., and V. Corkum. 1994. Social understanding at the end of
the first year of life. Developmental Review 14(4):349–372.
Moore, C., and B. D’Entremont. 2001. Developmental changes in
pointing as a function of attentional focus. Journal of Cognition and
Development 2(2):109–129.
Moses, L. J., D. A. Baldwin, J. G. Rosicky, and G. Tidball. 2001. Evidence
for referential understanding in the emotions domain at twelve and
eighteen months. Child Development 72(3):718–735.
O’Neill, D. K. 1996. Two-year-old children’s sensitivity to a parent’s
knowledge state when making requests. Child Development 67(2):659–
677.
Perner, J. 1991. Understanding the representational mind. Cambridge, MA:
MIT Press.
Povinelli, D. J., J. M. Bering, and S. Giambrone. 2003. Chimpanzees’
“pointing”: Another error of the argument by analogy? In Pointing:
Where language, culture, and cognition meet, edited by S. Kita, 35–68.
Mahwah, NJ: Erlbaum.
Searle, J. R. 1969. Speech acts. Cambridge: Cambridge University Press.
Shwe, H. I., and E. M. Markman. 1997. Young children’s appreciation
of the mental impact of their communicative signals. Developmental
Psychology 33(4):630–636.
Sperber, D., and D. Wilson. 1995. Relevance: Communication and cognition.
2nd edition. Oxford: Blackwell.
Tomasello, M. 1999. The cultural origins of human cognition. Cambridge,
MA: Harvard University Press.
Tomasello, M., and L. Camaioni. 1997. A comparison of the gestural
communication of apes and human infants. Human Development
40(1):7–24.
Tomasello, M., and K. Haberl. 2003. Understanding attention: 12- and
18-month-olds know what’s new for other persons. Developmental
Psychology 39:906–912.
Tomasello, M., and H. Rakoczy. 2003. What makes human cognition
unique? From individual to shared to collective intentionality. Mind
and Language 18(2):121–147.
Vygotsky, L. 1978. Mind in society: The development of higher psychological
processes. Cambridge, MA: Harvard University Press.
Werner, H., and B. Kaplan. 1963. Symbol formation: An
organismic-developmental approach to language and the expression of
thought. New York: Wiley.
six
ToM
ToM may not be the best term to use but it has become widely accepted
and is probably not easily replaced now. We can get away with it if we
do not take it too literally. As is well known, Premack and Woodruff
(1978) introduced the term into the psychological literature when they
asked whether the chimpanzee has a ToM, which they defined as a
system that imputes mental states to make inferences about behavior.
Developmental psychologists picked up the term and quickly applied it
quite broadly, using it to characterize nine-month-olds’ communicative
abilities (Bretherton et al. 1981) as well as five-year-olds’ ability to meta-represent
belief (Wimmer and Perner 1983). However, the term was
soon claimed by a particular camp (Astington et al. 1988) that took
the “theory” notion seriously. In its most precise use, ToM is a domain-specific,
psychologically real structure, comprising an integrated set
of mental-state concepts employed to explain and predict people’s
actions and interactions, that is reorganized over time when faced
with counterevidence to its predictions (e.g., Gopnik and Wellman
1994). From this theoretical perspective, ToM development in children
is analogous to theory development in science (the so-called “theory–
theory” view). On this view, ToM is both a cognitive structure leading
to certain abilities as well as a theoretical perspective explaining the
development of these abilities. The term was also used by simulation
theorists, who disagreed with the theory–theory view, claiming that
mental-state concepts are not theoretical postulates but are derived
from experience (Goldman 1989; Gordon 1986; Harris 1992). And it
was used by modularity theorists, who argued that the theory is not
developed by a process of theorizing but is innate and matures (Baron-Cohen
1995; Fodor 1992; Leslie 1994).
Currently, however, ToM is often used with either much narrower
or much broader scope—all the way from designating false-belief
understanding in particular, to social understanding in a most general
sense. It is accepted (or at least used) even by those who do not endorse
the theory-theory, simulation-theory, or modularity-theory perspectives.
I will use it here as a broad term for a multifaceted system. On this
catholic view, ToM encompasses:
social perception in late infancy
mental-state awareness in toddlers and preschool children
metarepresentational ability in older preschool children
multiply embedded recursive and interpretive abilities in school-age
children.
Importantly, new abilities do not replace earlier ones during the course
of development but are added to them, so that the full set constitutes
ToM in late childhood and adulthood.
Acknowledgments
I thank Chris Moore and Lisa Dack for helpful comments on drafts of
this chapter, and the Natural Sciences and Engineering Research Council
of Canada for support of my research.
Notes
1. Representation, in this sense, is essentially equivalent to verbal
thought; used in this way it includes only part of the full scope of
the term representation.
2. It is likely that both perspectival conversations and mental terms are
important. In the Lohmann and Tomasello (2003) training study that
Harris (2005) cites, even though perspective-shifting discourse and
use of mental terms had independent effects on the development
of false-belief understanding, the largest training effect occurred in
a condition that combined perspective-switching discourse with
mental term use.
3. Changes in children’s concepts of desire and intention are correlated
with the development of false-belief understanding. Before this
time, children see desires and intentions as motivational states
that are not clearly distinguished from one another and that are
tied to actions and speech acts. They understand that people may
have different desires or intentions, that each person acts to fulfill
his or her own desires and intentions, that people are happy if these
are fulfilled, and so forth. After the conceptual advance marked by
false-belief understanding, children recognize the consequences
of incompatible, conflicting desires (Perner et al. 2005), can
distinguish between desire and intention, and understand
intentional causation (Astington 2001).
References
Astington, J. W. 1988. Children’s understanding of the speech act of
promising. Journal of Child Language 15:157-173.
——. 1996. What is theoretical about the child’s theory of mind? A
Vygotskian view of its development. In Theories of theories of mind,
edited by P. Carruthers and P. K. Smith, 184-199. Cambridge:
Cambridge University Press.
——. 1999. The language of intention: Three ways of doing it. In
Developing theories of intention: Social understanding and self control,
edited by P. D. Zelazo, J. W. Astington, and D. R. Olson, 295-315.
Mahwah, NJ: Erlbaum.
——. 2001. The paradox of intention: Assessing children’s metarepresentational
understanding. In Intentions and intentionality: Foundations of
social cognition, edited by B. F. Malle, L. J. Moses, and D. A. Baldwin,
85-103. Cambridge, MA: MIT Press.
Astington, J. W., P. L. Harris, and D. R. Olson (eds.). 1988. Developing
theories of mind. New York: Cambridge University Press.
Astington, J. W., and D. R. Olson. 1995. The cognitive revolution in
children’s understanding of mind. Human Development 38:179-189.
Astington, J. W., J. Pelletier, and B. Homer. 2002. Theory of mind
and epistemological development: The relation between children’s
second-order false-belief understanding and their ability to reason
about evidence. New Ideas in Psychology 20:131-144.
Avis, J., and P. L. Harris. 1991. Belief-desire reasoning among Baka
children: Evidence for a universal conception of mind. Child
Development 62:460-467.
Baldwin, D. A. 1993. Infants’ ability to consult the speaker for clues to
word reference. Journal of Child Language 20:395-418.
Baldwin, D. A., and M. Saylor. 2005. Language promotes structural
alignment in the acquisition of mentalistic concepts. In Why language
matters for theory of mind, edited by J. W. Astington and J. A. Baird,
123-143. New York: Oxford University Press.
Baron-Cohen, S. 1995. Mindblindness: An essay on autism and theory of
mind. Cambridge, MA: Bradford Books–MIT Press.
Baron-Cohen, S., M. O’Riordan, V. Stone, R. Jones, and K. Plaisted.
1999. Recognition of faux pas by normally developing children with
Asperger syndrome or high-functioning autism. Journal of Autism &
Developmental Disorders 29:407-418.
Bartsch, K., and H. M. Wellman. 1995. Children talk about the mind. New
York: Oxford University Press.
Beckwith, R. T. 1991. The language of emotion, the emotions, and
nominalist bootstrapping. In Children’s theories of mind, edited by D.
Frye and C. Moore, 77-95. Hillsdale, NJ: Erlbaum.
Bennett, J. 1978. Some remarks about concepts. Behavioral and Brain
Sciences 1:557-560.
Bloom, L., M. Rispoli, B. Gartner, and J. Hafitz. 1989. Acquisition of
complementation. Journal of Child Language 16:101-120.
Bretherton, I., and M. Beeghly. 1982. Talking about internal states: The
acquisition of an explicit theory of mind. Developmental Psychology
18:906-921.
Bretherton, I., S. McNew, and M. Beeghly-Smith. 1981. Early person
knowledge as expressed in gestural and verbal communication: When
do infants acquire a “theory of mind”? In Infant social cognition, edited
by M. E. Lamb and L. R. Sherrod, 333-373. Hillsdale, NJ: Erlbaum.
Carlson, S. M., and L. Moses. 2001. Individual differences in inhibitory
control and children’s theory of mind. Child Development 72:1032-1053.
Carpendale, J. I. M., and C. Lewis. 2004. Constructing an understanding
of mind: The development of children’s social understanding within
social interaction. Behavioral and Brain Sciences 27:79-151.
Carpenter, M., K. Nagell, and M. Tomasello. 1998. Social cognition, joint
attention, and communicative competence from 9 to 15 months of
age. Monographs of the Society for Research in Child Development, No.
255, 63(4).
Chandler, M. J., A. S. Fritz, and S. M. Hala. 1989. Small scale deceit:
Deception as a marker of 2-, 3- and 4-year-olds’ early theories of
mind. Child Development 60:1263-1277.
Chandler, M., and C. Lalonde. 1996. Shifting to an interpretive theory
of mind: 5- to 7-year-olds’ changing conceptions of mental life. In
The five to seven year shift: The age of reason and responsibility, edited
by A. J. Sameroff and M. M. Haith, 111–139. Chicago: University of
Chicago Press.
Clark, H. H. 1996. Using language. Cambridge: Cambridge University
Press.
Clements, W. A., and J. Perner. 1994. Implicit understanding of belief.
Cognitive Development 9:377-395.
Comay, J. n.d. Individual differences in narrative perspective-taking and
theory of mind: A developmental study. Ph.D. dissertation, Department
of Human Development and Applied Psychology, University of
Toronto.
de Villiers, J. 2005. Can language acquisition give children a point
of view? In Why language matters for theory of mind, edited by J. W.
Astington and J. A. Baird, 186-219. New York: Oxford University
Press.
de Villiers, J. G., and J. E. Pyers. 2002. Complements to cognition: A
longitudinal study of the relationship between complex syntax and
false-belief understanding. Cognitive Development 17:1037-1060.
Dennett, D. C. 1978. Beliefs about beliefs. Behavioral and Brain Sciences
1:568-570.
Diessel, H., and M. Tomasello. 2001. The acquisition of finite complement
clauses in English: A corpus-based analysis. Cognitive Linguistics
12:97-141.
Dunn, J., and M. Brophy. 2005. Communication, relationships, and
individual differences in children’s understanding of mind. In Why
language matters for theory of mind, edited by J. W. Astington and J.
A. Baird, 50-69. New York: Oxford University Press.
Dunn, J., J. Brown, C. Slomkowski, C. Tesla, and L. Youngblade. 1991.
Young children’s understanding of other people’s feelings and beliefs:
Individual differences and their antecedents. Child Development
62:1352-1366.
Fernyhough, C. 1996. The dialogic mind: A dialogic approach to the
higher mental functions. New Ideas in Psychology 14:47-62.
Filippova, E. 2005. Development of advanced social reasoning: Contribution
of theory of mind and language to irony understanding. Ph.D. dissertation,
Department of Human Development and Applied Psychology,
University of Toronto.
Fodor, J. A. 1992. A theory of the child’s theory of mind. Cognition
44:283-296.
Gardner, D., P. L. Harris, M. Ohmoto, and T. Hamasaki. 1988. Japanese
children’s understanding of the distinction between real and
apparent emotion. International Journal of Behavioral Development
11:203-218.
Garfield, J. L., C. C. Peterson, and T. Perry. 2001. Social cognition,
language acquisition and the development of the theory of mind.
Mind & Language 16:494-541.
Gentner, D., and M. J. Ratterman. 1991. Language and the career
of similarity. In Perspectives on language and thought: Interrelations
in development, edited by S. A. Gelman and J. P. Byrnes, 225-277.
Cambridge: Cambridge University Press.
Goldman, A. I. 1989. Interpretation psychologized. Mind & Language
4:161-185.
Gopnik, A., and H. M. Wellman. 1994. The theory theory. In Mapping the
mind: Domain specificity in cognition and culture, edited by L. Hirschfeld
and S. Gelman, 257-293. New York: Cambridge University Press.
Gordon, R. M. 1986. Folk psychology as simulation. Mind & Language
1:156-171.
Grice, H. P. 1957. Meaning. Philosophical Review 66:377-388.
Hale, C. M., and H. Tager-Flusberg. 2003. The influence of language on
theory of mind: A training study. Developmental Science 6:346-359.
Happé, F. G. E. 1994. An advanced test of theory of mind: Understanding
of story characters’ thoughts and feelings by able autistic, mentally
handicapped and normal children and adults. Journal of Autism and
Developmental Disorders 24:129-154.
Harman, G. 1978. Studying the chimpanzee’s theory of mind. Behavioral
and Brain Sciences 1:591.
Harris, P. L. 1990. The child’s theory of mind and its cultural context. In
The causes of development, edited by G. Butterworth and P. E. Bryant,
43-64. London: Harvester Wheatsheaf.
——. 1992. From simulation to folk psychology: The case for
development. Mind & Language 7:120-144.
——. 1999. Acquiring the art of conversation. In Developmental
psychology: Achievements and prospects, edited by M. Bennett, 89-105.
Philadelphia: Psychology Press–Taylor & Francis.
——. 2005. Conversation, pretense, and theory of mind. In Why language
matters for theory of mind, edited by J. W. Astington and J. A. Baird,
70-83. New York: Oxford University Press.
Hughes, C. 1998. Executive function in preschoolers: Links with theory
of mind and verbal ability. British Journal of Developmental Psychology
16:233-253.
Huttenlocher, J., W. Haight, A. Bryk, M. Seltzer, and T. Lyons. 1991.
Early vocabulary growth: Relation to language input and gender.
Developmental Psychology 27:236-248.
Jenkins, J. M., S. Turrell, Y. Koguchi, S. Lollis, and H. S. Ross. 2003. A
longitudinal investigation of the dynamics of mental state talk in
families. Child Development 74:905-920.
Jin, Y., J. Jing, R. Morinaga, K. Miki, X. Su, X. Chen, and S. Source. 2002.
A comparative study of theory of mind in Chinese and Japanese
children. Chinese Mental Health Journal 16:446-448.
Johnson, C. N. 1982. Acquisition of mental verbs and the concept of
mind. In Language development: Syntax and semantics, edited by I. S.
Kuczaj, 445-478. Hillsdale, NJ: Erlbaum.
Johnson, C. N., and H. M. Wellman. 1980. Children’s developing
understanding of mental verbs: Remember, know and guess. Child
Development 51:1095-1102.
Jolly, A. 1988. The evolution of purpose. In Machiavellian intelligence:
Social expertise and the evolution of intellect in monkeys, apes, and
humans, edited by R. W. Byrne and A. Whiten, 363-378. Oxford:
Oxford University Press.
Karmiloff-Smith, A. 1992. Beyond modularity: A developmental perspective
on cognitive science. Cambridge, MA: MIT Press.
Lee, K., D. R. Olson, and N. Torrance. 1999. Chinese children’s
understanding of false beliefs: The role of language. Journal of Child Language
26:1-21.
Leslie, A. M. 1994. ToMM, ToBy, and agency: Core architecture and
domain specificity. In Mapping the mind: Domain specificity in cognition
and culture, edited by L. Hirschfeld and S. Gelman, 119-148. New
York: Cambridge University Press.
Levy, E., and K. Nelson. 1994. Words in discourse: A dialectical approach
to the acquisition of meaning and use. Journal of Child Language
21:367-389.
Lohmann, H., and M. Tomasello. 2003. The role of language in the
development of false-belief understanding: A training study. Child
Development 74:1130-1144.
Meltzoff, A. N., A. Gopnik, and B. M. Repacholi. 1999. Toddlers’
understanding of intentions, desires, and emotions: Explorations of the
Dark Ages. In Developing theories of intention: Social understanding and
self control, edited by P. D. Zelazo, J. W. Astington, and D. R. Olson,
17-41. Mahwah, NJ: Erlbaum.
Milligan, K. V., J. W. Astington, and L. A. Dack. n.d. Language and theory
of mind: Meta-analysis of the relation between language and false-belief
understanding. Manuscript submitted for publication.
Montgomery, D. E. 2005. The developmental origins of meaning for
mental terms. In Why language matters for theory of mind, edited by J.
W. Astington and J. A. Baird, 106-122. New York: Oxford University
Press.
Moore, C. 1998. Social cognition in infancy. Monographs of the Society
for Research in Child Development, No. 255, 63(4):167-174.
Moore, C., D. Furrow, L. Chiasson, and M. Patriquin. 1994. Developmental
relationships between production and comprehension of mental
terms. First Language 14:1-17.
Naito, M. 2004. Is theory of mind a universal and unitary construct?
International Society for the Study of Behavioural Development Newsletter
45(1):9-11.
Naito, M., S. Komatsu, and T. Fuke. 1994. Normal and autistic children’s
understanding of their own and others’ false belief: A study from
Japan. British Journal of Developmental Psychology 12:403-416.
Nelson, K. 1996. Language in cognitive development: The emergence of the
mediated mind. New York: Cambridge University Press.
——. 2005. Language pathways into the community of minds. In Why
language matters for theory of mind, edited by J. W. Astington and J.
A. Baird, 26-49. New York: Oxford University Press.
Nelson, K., and L. Kessler Shaw. 2002. Developing a socially shared
symbolic system. In Language, literacy, and cognitive development: The
development and consequences of symbolic communication, edited by E.
Amsel and J. P. Byrnes, 27-57. Mahwah, NJ: Erlbaum.
Olson, D. R. 1988. On the origins of beliefs and other intentional states
in children. In Developing theories of mind, edited by J. W. Astington, P.
L. Harris, and D. R. Olson, 414-426. New York: Cambridge University
Press.
O’Neill, D. K. 2005. Talking about “new” information: The given/
new distinction and children’s developing theory of mind. In Why
language matters for theory of mind, edited by J. W. Astington and J.
A. Baird, 84-105. New York: Oxford University Press.
Onishi, K. H., and R. Baillargeon. 2005. Do 15-month-old infants understand
false beliefs? Science 308(April 8):255-258.
Perner, J. 1991. Understanding the representational mind. Cambridge, MA:
Bradford Books–MIT Press.
Perner, J., M. Sprung, P. Zauner, and H. Haider. 2003. Want that is
understood well before say that, think that, and false belief: A test
of de Villiers’ linguistic determinism on German-speaking children.
Child Development 74:179-188.
Perner, J., and H. Wimmer. 1985. “John thinks that Mary thinks that. . .”
Attribution of second-order beliefs by 5- to 10-year-old children.
Journal of Experimental Child Psychology 39:437-471.
Perner, J., P. Zauner, and M. Sprung. 2005. What does “that” have to
do with point of view? Conflicting desires and “want” in German.
In Why language matters for theory of mind, edited by J. W. Astington
and J. A. Baird, 220-244. New York: Oxford University Press.
Peterson, C. C., and M. Siegal. 2000. Insights into theory of mind from
deafness and autism. Mind & Language 15:123-145.
Premack, D., and G. Woodruff. 1978. Does the chimpanzee have a
theory of mind? Behavioral and Brain Sciences 1:515-526.
Rogoff, B., P. Chavajay, and E. Matusov. 1993. Questioning assumptions
about culture and individuals. Behavioral and Brain Sciences 16:533-534.
Ruffman, T., L. Slade, and E. Crowe. 2002. The relation between
children’s and mothers’ mental state language and theory-of-mind
understanding. Child Development 73:734-751.
Shwe, H. L., and E. M. Markman. 1997. Young children’s appreciation
of the mental impact of their communication skills. Developmental
Psychology 33:630-636.
Sperber, D., and D. Wilson. 1986. Relevance: Communication and cognition.
Cambridge, MA: Harvard University Press.
Tardif, T., and H. M. Wellman. 2000. Acquisition of mental state language
in Mandarin- and Cantonese-speaking children. Developmental
Psychology 36:25-43.
Tomasello, M. 1999. The cultural origins of human cognition. Cambridge,
MA: Harvard University Press.
Tomasello, M., M. Carpenter, J. Call, T. Behne, and H. Moll. 2005.
Understanding and sharing intentions: The origins of cultural cognition.
Behavioral and Brain Sciences 28:675-735.
Vinden, P. G. 1996. Junin Quechua children’s understanding of mind.
Child Development 67:1701-1716.
——. 1999. Children’s understanding of mind and emotion: A
multicultural study. Cognition and Emotion 13:19-48.
Vinden, P. G., and J. W. Astington. 2000. Culture and understanding other
minds. In Understanding other minds: Perspectives from developmental
cognitive neuroscience, edited by S. Baron-Cohen, H. Tager-Flusberg,
and D. J. Cohen, 503-519. Oxford: Oxford University Press.
Vygotsky, L. S. 1978. Mind in society. Cambridge, MA: Harvard University
Press.
Wellman, H. M. 2002. Understanding the psychological world:
Developing a theory of mind. In Blackwell handbook of childhood cognitive
development, edited by U. Goswami, 167-187. Oxford: Blackwell.
Wellman, H. M., D. Cross, and J. Watson. 2001. Meta-analysis of theory
of mind development: The truth about false-belief. Child Development
72:655-684.
Wellman, H. M., and D. Liu. 2004. Scaling theory-of-mind tasks. Child
Development 75:523-541.
Whorf, B. L. 1956. Language, thought, and reality. Cambridge, MA: MIT
Press.
Wimmer, H., S. Gruber, and J. Perner. 1984. Young children’s conception
of lying: Lexical realism—moral subjectivism. Journal of Experimental
Child Psychology 37:1-30.
Wimmer, H., and J. Perner. 1983. Beliefs about beliefs: Representation
and constraining function of wrong beliefs in young children’s
understanding of deception. Cognition 13:103-128.
seven
One of the central goals of this volume is to map out the uniquely
human aspects of sociality. As is evidenced in the included
chapters, human sociality is a complex conglomerate of behaviors and
experiences, and in this chapter, I propose that its uniqueness is found
in the interdependence of these behaviors and experiences. In particular,
I argue that adult social cognition cannot develop without access to
a rich and complex language. Drawing from theory-of-mind (ToM)
data collected from typically developing preschoolers, orally taught
deaf children, and deaf signers who are exposed to an emerging sign
language, Nicaraguan Sign Language (NSL), I show that language is the
prerequisite foundation on which humans build a mature understanding
of other people’s minds.
First, I provide background about the nature of ToM, focusing on
children’s acquisition of false-belief understanding. Second, I review
several theoretical proposals, each isolating a different aspect of language
as the facilitating force behind children’s emerging understanding of
the mind. Finally, I point to significant ToM impairments in language-delayed
children and in adults exposed to an emerging language to argue
that a rich and complex language must be in place for false-belief
understanding to develop in humans.
ToM
The term theory of mind, although first introduced by primatologists, now
functions as a catchall phrase for the general understanding that human
behavior is motivated by the unobservable intentions, desires, and
thoughts of the individual (Premack and Woodruff 1978). In humans,
ToM development begins in the first days of life and extends into
early childhood, including milestones like understanding that human
action is goal directed, and appreciating that people can have different
desires and preferences. In the first year of life, early intersubjective
social behaviors, such as paying attention to human faces, imitating,
following eye gaze, social referencing, pointing, and joint attention,
reflect infants’ understanding that humans are informational agents and
possess different knowledge states. At 12 months, infants informatively
point to objects that have been displaced out of their interlocutor’s sight,
to help the interlocutor find the object. This intersubjective activity
clearly indicates that infants monitor the mental states of others, and
that they understand that their knowledge states can differ from those
of someone else (Liszkowski this volume).
One feature of ToM that stands out in the literature as a momentous
achievement in children’s social development is an understanding that
thoughts and beliefs can actually be wrong—that people can have a
“false belief.” In the first three years of life, children egocentrically
operate in the world, believing that what they think is actually true and
that their thoughts and beliefs correspond with those of everyone else.
Three-year-olds are self-perceived omniscient forces in their world. In
their fourth year, children suddenly see that other people have thoughts
and beliefs distinct from those of any other person, and, importantly, that those
mental states can be falsified. For example, your mother may think you
are asleep in your bed, but you are really under the covers reading a
book. In this instance, your mother has a false belief that you are asleep.
The reality of the world, that you are awake reading in bed, counters
her representation of you sleeping.
False-belief understanding is a fundamental turning point in children’s
development because it marks a transition to higher-level
metarepresentational abilities (Perner 1991). With this ability, children
are able to represent another representation and, instead of simply
understanding that people have different viewpoints, they can now represent
a belief, assess whether it is true or false, and use the representation to
predict human behavior.
Across different cultures, children arrive at an understanding of false
belief, on average, sometime between four and five years of age (Wellman
Constructing the Social Mind
Syntax
A third, somewhat controversial, proposal claims that the rich complex
syntax of human language provides the representational means to
understand a false belief. According to one account, complex language
is the necessary tool with which children can encode representations
of the world (Plaut and Karmiloff-Smith 1993). Syntax allows children
to encode and retain a representation even in the face of a real-world
situation that may falsify that representation. For children, an observed
reality carries more representational weight than a belief that has not
been encoded by language.
An alternative proposal argues that mastery of a specific syntactic
structure, embedded sentential complements, underpins children’s
mastery of the mind (de Villiers and de Villiers 2000). Both mental
and communication verbs take a full embedded proposition as their
complement as in (1):
Summary
Each of the accounts recognizes that language plays a crucial role
in children’s timely acquisition of a ToM, but each differs on which
aspect—conversation, semantics, or syntax—serves as the springboard.
These questions are still being teased apart; but in collaboration with
other researchers, I have sought to argue that complex language is the
necessary prerequisite for the human-specific understanding of false
belief. I turn to studies of false-belief understanding in three different
populations—typically developing children, language-delayed deaf
children, and adult native learners of an emerging sign language—to
support this argument.
NSL
Twenty-five years ago, when the Nicaraguan government opened the
doors of the school for special education in Managua, deaf Nicaraguan
children from all class backgrounds were, for the first time, afforded a
public education, triggering a set of events that led to the birth of a new
language in Nicaragua, NSL. On the buses and playgrounds, children
converged on a set of lexical signs that served as a common basis of
the new language that has since become more complex. In analyzing
the emergence of this language, Senghas (1996) made an important
observation: one of the significant predictors of a signer’s linguistic
sophistication was the year he or she entered the school for special
education. The language of the children who entered in the late 1970s
and early 1980s was less complex than that of children who entered
the school ten years later. Contrary to the model of typical language
development, in the Nicaraguan deaf community, the younger you were
the better your linguistic skills. The younger signers who entered the
school after 1986 are referred to as the second cohort. Deaf Nicaraguans
who entered the school before 1986 are called the first cohort.
Two significant grammatical differences have been observed in the
language of the two cohorts. First, second-cohort signers are more
consistent and regular than first-cohort signers in their use of spatial
modulations to mark the arguments of a verb. The first-cohort signers
use space unsystematically, placing the subject of a verb as often
on their left side as on their right side (Senghas and Coppola 2001).
Second, the first-cohort signers use holistic expressions of manner and
path when talking about motion events, whereas the second-cohort
signers componentially break down the manner and path information
into two different signs (Senghas et al. 2004).
These two important differences in the complexity of the two
cohorts’ language underscore that NSL is undergoing rapid change and
systemization in its grammar. What is unique about this population with
respect to our interest in the relationship between language and social
cognition is that older signers, who have more years of social exposure
in the world, exhibit less linguistic complexity in their language, whereas
younger adolescents, the second-cohort signers, have fewer years of
experience but richer linguistic knowledge. If language is a facilitative
and not a deterministic tool in children’s acquisition of false-belief
understanding, social experience could compensate for what language
cannot provide. Thus, we should observe no differences between signers
from each cohort in their ToM knowledge. Conversely, if language is
a prerequisite for a mature ToM, without which children could not
acquire an understanding of belief, then we may observe ToM delays
in the older, first-cohort signers.
False-belief Understanding
When examining the relationship between language and cognition,
it is important to rule out the language of the task as a potential
contributor to failure. To avoid this confound, a minimally verbal,
picture-completion version of the unseen-displacement measure was developed
and administered to eight first-cohort signers, with a mean age of 27
years, and eight second-cohort signers, with a mean age of 17 years
(Pyers 2004).
The results revealed a striking difference in the false-belief performance
of the two cohorts. Seven of the second-cohort signers, but only one
of the first-cohort signers passed the minimally verbal false-belief
measure. For typically developing children, there is a positive correlation
between age and false-belief understanding; in this population there
was a negative correlation—the younger signers outperformed the older
signers. This, however, did not indicate that as the Nicaraguan signers
aged they lost their ability to represent a false belief. Rather, there was
a third factor that separated the two groups in their performance on
the false-belief task.
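A reader may want a rough sense of how unlikely such a split between the cohorts would be by chance. The sketch below is not an analysis reported in this chapter; it simply applies a two-sided Fisher's exact test to the pass counts given above (seven of eight second-cohort signers, one of eight first-cohort signers), treating the sixteen signers as independent observations. The function name and the choice of test are assumptions made for illustration only.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Illustrative helper (not from the chapter): rows are groups,
    columns are pass/fail counts.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(k):
        # Hypergeometric probability of k passers in row 1, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(total, col1)

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= observed * (1 + 1e-9)
    )


# Counts taken from the text: passers and failers in each cohort.
second_pass, second_fail = 7, 1
first_pass, first_fail = 1, 7

p_value = fisher_exact_two_sided(second_pass, second_fail, first_pass, first_fail)
print(f"two-sided p = {p_value:.4f}")
```

Under these assumptions the test gives a two-sided p of roughly 0.01. With only eight signers per group, this is best read as a sanity check on the size of the difference, not as a substitute for the analyses in Pyers (2004).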
Information gathered in background interviews showed that the two
groups did not differ from each other in terms of family demographics,
education level, or employment history. They did differ, however, in
the year when they first learned the sign language. The first-cohort
signers were exposed to NSL in its earliest form, when the language was
first emerging. As children, the first-cohort signers contributed to the
creation of this language, moving it away from a conglomeration of
unsystematic gestures imported from an array of home-sign systems.3
As these first-cohort signers hit adolescence, a new group of children
entered the school. The language of the first cohort served as the input
to the second cohort. The second-cohort children took their somewhat
irregular and unsystematic linguistic input and produced language that
was more systematic and regular (Senghas 2003). The language of the
second cohort is more sophisticated than that of the first cohort, and
this linguistic difference seemed to underlie the differential performance
on the false-belief measure.
Discussion
When the results from typically developing children, language-delayed
children, and learners of an emerging language are pieced together,
the picture that emerges is one in which complex social cognition
is contingent on the acquisition of complex syntax. Without a full
and rich language, humans fail to consider a person’s false belief in
predicting their actions.
A recent study, however, appears to challenge the proposal that false-belief
understanding is dependent on language acquisition. Using a
violation-of-expectancy method, Onishi and Baillargeon (2005)
monitored 15-month-old infants’ eye gaze and showed that it revealed
the infants’ early implicit understanding of false belief. They argued that
the task demands of the traditional false-belief measures are too high,
and cannot capture this early implicit understanding. The results from
the Nicaraguan signers, however, counter their argument, because the
first-cohort signers failed a false-belief task in which the linguistic and
cognitive demands were minimal for an otherwise normal adult. Instead,
the Nicaraguan results are quite consistent with the interpretation
that Perner and Ruffman (2005) provide for the performance of these
infants. The precocious false-belief performance of 15-month-olds,
as measured by their eye gaze, may reflect an innate “core theory”
about the behavior of conspecifics; but only experience in the world,
specifically linguistic experience, provides children with the sufficient
means to build the kind of false-belief representation that could be
called on to reliably predict human behavior.
The strong dependence of false-belief understanding on complex
syntax is also evident in the emergence of NSL. The language is undergoing
rapid change both in its lexicon and in its syntax, and the very
structure, argued to trigger the children’s understanding of false belief,
appears early in the emergence of NSL. The first and second cohort differ
in their use of complement structures not only with mental verbs, but
also with desire verbs; second-cohort signers produce significantly more
complements with desire verbs than do the first-cohort signers (Pyers
2004). Evidently, the second cohort of early exposed child learners
introduced the systematic use of these structures into the language and
readily uses them in adult narratives. Notably, it only takes one cohort
of child learners for this complex structure to emerge; by the second
generation of child learners,5 this stepping-stone to a mature ToM is
present and available in the language.
How important is false-belief understanding to the human experience?
Although first-cohort signers struggle to accurately predict human
behavior when a false belief is involved, they lead otherwise normal
lives, living with extended families, raising children, holding down
jobs, and even navigating the Managuan bus system. How can all of
this “know-how” be maintained without a representation of false belief?
False belief, although considered the most momentous achievement
in ToM development, is not the sole internal state that drives human
behavior. Humans also act in the world motivated by their desires
and emotions. An understanding of desires and an understanding of
emotions, neither of which seems to rely on language to develop, can be
tapped to explain much of human behavior. And the first-cohort signers
show no impairments in their understanding of desires and emotions as
driving forces of human action (Pyers 2004). The Nicaraguan failers of
false belief draw on their understanding of the physical world, of desires,
and of emotions to explain why people make mistakes in the world
(Pyers 2004). Although tapping into desires and emotions to explain
mistakes that result from false beliefs does not completely capture the
essential cause of the mistake, it may provide enough explanatory
strength for the first-cohort signers to operate somewhat effectively
in the world.
One question that remains open for now is whether there is a
critical period for acquiring complementation or for developing an
understanding of false belief. Now that NSL has complementation,
would it be possible for the first-cohort failers to acquire this linguistic
structure as adults and have it improve their performance on the false-belief
test? We do not know whether the adult first-cohort signers could
acquire complementation or, if they could, whether false-belief
understanding would still be available to them in adulthood. As the
second-cohort signers enter adulthood and increase their social and
linguistic interactions with the first cohort, we may observe important
changes in the language and perhaps cognitive capacities of the first
cohort.
Conclusion
In three different populations with varying language capabilities, we
observe that false-belief understanding is causally linked to the use and
mastery of complex syntax. Language is a fundamental and unique
characteristic of humans and, I argue, underpins human sociality.
Language itself serves two important functions in the development of
human social interaction (Astington this volume). First, as an interactive
tool, language communicates information. The communicative experience
is a core feature of human development, such that, without exposure
to a rich linguistic system, deaf children adopt gestures to meet
their communicative goals. Under the pressure of communicative
needs, these gestures begin to adopt languagelike features, revealing
the dependence of efficient communication on the complex properties
of language (Goldin-Meadow this volume). Second, language is not
only a communicative tool, but also a means of verbally representing
or misrepresenting the events of the world. The data reviewed in this
chapter demonstrate that the representational capacity of language
opens up a whole new domain of social cognition in humans.
False-belief understanding is an important turning point in children’s
development, but its role in adult social interaction is not fully understood.
For example, how does the ability to consider a false belief shape
the coordination and establishment of joint commitment assumptions
(Clark this volume)? How does false-belief understanding shape our
ability to engage in efficient linguistic communication? False-belief
understanding may be dependent on language, but it is very likely that
rich communicative interaction depends on a mature understanding of
false belief. Further work on the interactional consequences of delays
in social cognition still needs to be done. Developmental research has
shown a strong bidirectional relationship between social cognition and
language through children’s early years; it is likely that this relationship
would continue in later development.
Language and social cognition have evolved in humans to support
and enhance each other throughout development. False-belief understanding
is just one domain of social cognition in which this relationship
plays out, with cognition dependent on language. But earlier in
children’s development, social cognition, specifically joint attention,
supports their early language learning (Tomasello and Farrar 1986).
Social cognition promotes language development, and language triggers
more advanced social cognition, and the interplay continues throughout
human development. And it is in the developmental interdependence
of language and cognition that we find the roots of human sociality.
Notes
1. This measure of false belief is sometimes referred to as the “changed-
location” task.
2. The common term for the “unseen-displacement” measure is the
“Smarties task.” The familiar container first used in this study was
a tube of Smarties, a type of candy similar to M&M’s.
3. Home-sign systems are the gestures created by deaf children for use
in their homes to communicate with their families (Goldin-Meadow
1982). Each individual family can develop their own unique home-sign
system, which would be unintelligible to other families with
home-sign systems.
4. One possibility is that misunderstandings could be attributed
to breakdowns in linguistic form, not meaning. Deaf children’s
speech is difficult for adults with normal hearing to understand,
and the children are often asked to repeat themselves. Frequent
misunderstandings at the level of form may lead deaf children to
assume that all bids for clarification result from their speech not
being understood. Under this assumption, deaf children would
rarely experience communication breakdowns as encounters with
different mental states; instead, they would interpret them as failures
to understand the form of the utterance. This kind of interpretation
could actually work against children’s timely acquisition of ToM.
5. The use of generation in this context essentially refers to cohort. I
chose the word generation here because I want to emphasize that
the language has to go through a second round of child learners.
But, these learners do not necessarily have to be members of the
second cohort. For example, one first-cohort passer of false-belief
understanding had a significantly older deaf sibling. This same
first-cohort signer patterns like second-cohort signers on most of
the language measures. Although temporally a member of the first
cohort, this signer’s language use and cognitive capacities look much
like second-cohort signers. Her language sophistication could be
attributed to her early regular exposure to the raw materials of
NSL.
References
Astington, J. W., and J. M. Jenkins. 1999. A longitudinal study of
the relation between language and theory-of-mind development.
Developmental Psychology 35(5):1311–1320.
Bartsch, K., and H. M. Wellman. 1995. Children talk about the mind.
London: Oxford University Press.
de Villiers, J. G., and P. de Villiers. 2000. Linguistic determinism and
the understanding of false beliefs. In Children’s reasoning and the
mind, edited by P. Mitchell and K. Riggs, 191–228. Hove: Psychology
Press.
de Villiers, J. G., and J. E. Pyers. 2002. Complements to cognition: A
longitudinal study of the relationship between complex syntax and
false-belief-understanding. Cognitive Development 17(1):1037–1060.
de Villiers, P. A., and J. Pyers. 2001. Complementation and false-belief
representation. In Research on child language acquisition, edited by M.
Almgren, M. J. Ezeizabarrena, and I. Idiazabal, 984–1005. Somerville,
MA: Cascadilla Press.
Fodor, J. A. 1975. The Language of Thought. New York: Crowell.
Gale, E., P. de Villiers, J. de Villiers, and J. Pyers. 1996. Language and
theory of mind in oral deaf children. In Proceedings of the 20th Annual
Boston University Conference on Language Development, edited by A.
Stringfellow, D. Cahana-Amitay, E. Hughes, and A. Zukowski, 213–244.
Somerville, MA: Cascadilla Press.
Goldin-Meadow, S. 1982. The resilience of recursion: A study of a
communication system developed without a conventional language
model. In Language acquisition: The state of the art, edited by E. Wanner
and L. R. Gleitman, 51–77. New York: Cambridge University Press.
Gopnik, A., and A. Meltzoff. 1997. Words, thoughts, and theories.
Cambridge, MA: MIT Press.
Happé, F. G. E. 1993. Communicative competence and theory of mind
in autism: A test of relevance theory. Cognition 48(2):101–119.
Harris, P. 1996. Desires, beliefs, and language. In Theories of theories of
mind, edited by P. Carruthers and P. K. Smith, 200–222. New York:
Cambridge University Press.
Hughes, C., and J. Dunn. 1998. Understanding mind and emotion:
Longitudinal associations with mental-state talk between young
friends. Developmental Psychology 34(5):1026–1037.
Lohmann, H., and M. Tomasello. 2003. The role of language in the
development of false belief understanding: A training study. Child
Development 74(4):1130–1144.
Olson, D. R. 1988. On the origins of beliefs and other intentional states
in children. In Developing theories of mind, edited by J. W. Astington,
P. L. Harris, and D. R. Olson, 414–426. Cambridge: Cambridge
University Press.
Onishi, K. H., and R. Baillargeon. 2005. Do 15-month-old infants
understand false beliefs? Science 308:255–258.
Perner, J. 1991. Understanding the representational mind. Cambridge, MA:
MIT Press.
Perner, J., S. R. Leekam, and H. Wimmer. 1987. Three-year-olds’ difficulty
with false belief: The case for a conceptual deficit. British Journal of
Developmental Psychology 5(2):125–137.
Perner, J., and T. Ruffman. 2005. Infants’ insight into the mind: How
deep? Science 308:214–216.
Peterson, C. C., and M. Siegal. 1995. Deafness, conversation and theory
of mind. Journal of Child Psychology and Psychiatry and Allied Disciplines
36(3):459–474.
——. 1998. Changing focus on the representational mind: Deaf, autistic
and normal children’s concepts of false photos, false drawings and
false beliefs. British Journal of Developmental Psychology 16:301–320.
——. 1999. Representing inner worlds: Theory of mind in autistic, deaf,
and normal hearing children. Psychological Science 10(2):126–129.
——. 2000. Insights into theory of mind from deafness and autism.
Mind & Language 15(1):123–145.
Plaut, D. C., and A. Karmiloff-Smith. 1993. Representational development
and theory-of-mind computations. Behavioral and Brain Sciences
16:70–71.
Premack, D., and G. Woodruff. 1978. Does the chimpanzee have a
theory of mind? Behavioral and Brain Sciences 1(4):515–526.
Pyers, J. E. 2004. The relationship between language and false-belief
understanding: Evidence from learners of an emerging sign language in
Nicaragua. Ph.D. dissertation, Department of Psychology, University
of California, Berkeley.
Ruffman, T., L. Slade, and E. Crowe. 2002. The relation between
children’s and mother’s mental state language and theory-of-mind
understanding. Child Development 73(3):734–751.
Senghas, A. 1996. Children’s contribution to the birth of Nicaraguan Sign
Language. Ph.D. dissertation, Department of Brain and Cognitive
Science, Massachusetts Institute of Technology, Cambridge, MA.
Senghas, A. 2003. Intergenerational influence and ontogenetic
development in the emergence of spatial grammar in Nicaraguan
Sign Language. Cognitive Development. Special Issue: The Sociocultural
Construction of Implicit Knowledge 18(4):511–531.
Senghas, A., and M. Coppola. 2001. Children creating language: How
Nicaraguan Sign Language acquired a spatial grammar. Psychological
Science 12(4):323–328.
Senghas, A., S. Kita, and A. Özyürek. 2004. Children’s creation of core
properties of language. Science 305(5691):1779–1782.
Tomasello, M. 1999. The cultural origins of human cognition. Cambridge,
MA: Harvard University Press.
Tomasello, M., and M. J. Farrar. 1986. Joint attention and early language.
Child Development 57(6):1454–1463.
Wellman, H. M., D. Cross, and J. Watson. 2001. Meta-analysis of theory-
of-mind development: The truth about false belief. Child Development
72(3):655–684.
Wimmer, H., and J. Perner. 1983. Beliefs about beliefs: Representation
and constraining function of wrong beliefs in young children’s
understanding of deception. Cognition 13(1):103–128.
Woolfe, T., S. C. Want, and M. Siegal. 2002. Signposts to development:
Theory of mind in deaf children. Child Development 73(3):768–778.
Ziatas, K., K. Durkin, and C. Pratt. 1998. Belief term development
in children with autism, Asperger syndrome, specific language
impairment, and normal development: Links to theory of mind
development. Journal of Child Psychology and Psychiatry and Allied
Disciplines 39(5):755–763.
eight
both ends of the ham. One day, while her elderly mother happened
to be visiting, she set out to make her special ham for dinner. As her
mother watched her remove the end sections, she exclaimed “Why
are you doing that?” Sylvia said, “Because that’s the way you always
began with a ham.” Her mother replied, “But that is because I did not
have a wide pan!”
There are a few morals of this story we would like to call attention to:
First, unlike her mother, Sylvia had plenty of large cooking pans that
could easily accommodate even a pretty large ham in one piece. In spite
of this, however, for many years she had continued to practice the habit
of cutting off the two sidepieces of the ham before cooking it. (Maybe
her children are also doing the same today.) Second, she did so without
ever spontaneously reflecting on the functional rationale (or lack of
it) for this curious procedure that remained cognitively opaque to her
during all these years. It was only by the happenstance of her mother’s
visit and comments that she came to possess a cognitive insight into
this matter, finally understanding (learning) what the original reason
was for the cultural habit that she had (socially) inherited from her
mother. Third, the specific habit survived in the family culture for all
those years in this cognitively opaque form even though the conditions
rationalizing the procedure as functional had long been absent.
Notes
1. Primate teleology seems very limited also in comparison with the
amazingly creative and generative (as well as causally sophisticated) innate
teleological understanding of means–ends relations within the specific domain
of tool use and tool making recently documented in the New Caledonian crow,
which, however, is also restricted to a highly specific task domain (Kenward
et al. 2005).
2. We are aware that our current characterization of the types and range of
restrictions that constrain primates’ functional conceptualization of objects
as tools may need to be qualified or tempered in the future as a function of
increased availability of new and relevant observational or experimental data.
At present, however, we feel that the few sporadic and often anecdotal reports
from field observations (see McGrew 2004, for a recent review) that may at
first seem to contradict our generalizations can be easily accommodated by
our hypothesis. For example, Boesch and Boesch-Achermann (2000) describe
evidence that in the lowland rainforest at Tai where quartzite stones used by
apes to crack hard-shelled nuts are rare, chimpanzees do carry such stones to
known source sites using a minimal distance strategy. Such a strategy, however,
clearly implies prior perceptual access to the specific source location and the
affordance requirements of the particular type of goal object (hard-shelled nuts)
it contains. That is, it is the animal’s prior perceptual access to the specific goal
information that precedes, triggers, and directs the subsequent search for the
nearest object with suitable affordance properties to be carried to the goal site
for being used as a tool to attain the specific goal. No doubt, this remarkable
practice does imply the relatively short-term ability to mentally represent and
actively maintain in working memory the previously perceptually accessed
specific goal information that, nevertheless, still functions as the initial
triggering condition for the teleofunctional conceptualization of the stone
object in terms of its goal.
3. “Blind” imitative behavior copying seems to be a basic competence
available to a variety of different species, sometimes used extensively and
spontaneously in natural environmental conditions as in the case of vocal
imitation in learning species-specific songs and dialects in passerine birds such
as sparrows (e.g., Petrinovich 1988), whereas in other cases experimentally
inducible by presenting pretrained conspecific models performing new behaviors
that result in direct reinforcement, as in budgerigars, rats, or pigeons (see
Galef 1995; Galef et al. 1986; Heyes 1993; Heyes and Dawson 1990; Heyes and
Galef 1996).
4. We have also replicated this finding of selective imitation between the
two context conditions in a situation in which the model was not present
during the testing phase (Gergely et al. 2003).
5. Meltzoff (1988, 1995) presented only frequencies of imitating the target
act and did not comment on the existence of alternative emulative responses
such as the hand action.
6. See Csibra and Gergely (2006) for additional arguments showing how
a variety of early emerging social cognitive capacities—such as imitative
learning (Gergely and Csibra 2005b), social referencing (Gergely et al. 2007),
protodeclarative pointing, or word learning—can be usefully reinterpreted as
examples of cultural learning through pedagogy.
7. Note that these assumptions are directly analogous, if not identical, to
the Gricean pragmatic assumptions of ostensive communication as spelled
out in Sperber and Wilson’s (1986) relevance theory. In our view, however,
pedagogy is a primary adaptation for cultural learning and not a specialized
module dedicated to the economic recovery of speaker’s intent in linguistic
communication that has evolved later as a submodule of the general theory of
mind capacity of humans (Sperber and Wilson 2002).
8. For further supporting evidence of the role of pedagogical cues in
influencing the early teleofunctional construal of the function of new artifacts,
see Casler and Kelemen (2005), and DiYanni and Kelemen (2005).
9. For a critical analysis of this position, see Gergely and Csibra (2005a).
References
Baldwin, J. M. 1894. Mental development in the child and the race. Methods
and process. New York: Macmillan.
Bandura, A. 1986. Social foundations of thought and action: A social cognitive
theory. Englewood Cliffs, NJ: Prentice Hall.
Barkow, J., L. Cosmides, and J. Tooby. 1992. The adapted mind: Evolutionary
psychology and the generation of culture. New York: Oxford University
Press.
Barrett, L., R. Dunbar, and J. Lycett. 2002. Human evolutionary psychology.
Houndmills: Palgrave.
Blackmore, S. 2000. The power of memes. Scientific American 283(4):52–61.
Boesch, C., and H. Boesch. 1993. Diversity of tool use and tool-making
in wild chimpanzees. In The use of tools by human and non-human
primates, edited by A. Berthelet and J. Chavaillon, 158–187. Oxford:
Oxford University Press.
Boesch, C., and H. Boesch-Achermann. 2000. The chimpanzees of the Taï
forest: Behavioural ecology and evolution. Oxford: Oxford University
Press.
Boyd, R., and P. J. Richerson. 1985. Culture and the evolutionary process.
Chicago: University of Chicago Press.
Byrne, R. W., and A. E. Russon. 1998. Learning by imitation: A hierarchical
approach. Behavioral and Brain Sciences 21:667–721.
Byrne, R. W., P. J. Barnard, I. Davidson, V. M. Janik, W. J. McGrew, Á.
Miklósi, and P. Wiessner. 2004. Understanding culture across species.
Trends in Cognitive Sciences 8(8):337–386.
Call, J., and M. Tomasello. 1994. The social learning of tool use by
orangutans (Pongo pygmaeus). Human Evolution 9:297–313.
——. 1995. The effect of humans on the cognitive development of apes.
In Reaching into thought, edited by A. E. Russon, K. A. Bard, and S. T.
Parker, 371–403. New York: Cambridge University Press.
——. 2005. The use of social information in the problem-solving of
orangutans (Pongo pygmaeus) and human children (Homo sapiens).
Journal of Comparative Psychology 109:308–320.
Casler, K., and D. Kelemen. 2005. Young children’s rapid learning about
artifacts. Developmental Science 8(6):472–480.
Csibra, G., S. Bíró, O. Koós, and G. Gergely. 2003. One-year-old infants
use teleological representations of actions productively. Cognitive
Science 27(1):111–133.
Csibra, G., and G. Gergely. 1998. The teleological origins of mentalistic
action explanations: A developmental hypothesis. Developmental
Science 1(2):255–259.
——. 2006. Social learning and social cognition: The case of pedagogy.
In Processes of change in brain and cognitive development. Attention and
Performance, vol. 21, edited by Y. Munakata and M. H. Johnson, 249–
274. Oxford: Oxford University Press.
Csibra, G., G. Gergely, S. Bíró, O. Koós, and M. Brockbank. 1999. Goal
attribution without agency cues: The perception of “pure reason” in
infancy. Cognition 72:237–267.
Dawkins, R. 1976. The selfish gene. Oxford: Oxford University Press.
Dennett, D. 1995. Darwin’s dangerous idea: Evolution and the meanings
of life. New York: Simon and Schuster.
DiYanni, C., and D. Kelemen. 2005. Using a bad tool with good intention:
How preschoolers weigh physical and intentional cues when learning
about artifacts. Cognition 97:327–335.
Donald, M. 1991. Origins of the modern mind: Three stages in the evolution
of culture and cognition. Cambridge, MA: Harvard University Press.
Galef, B. G., Jr. 1990. Tradition in animals: Field observations and
laboratory analyses. In Interpretations and explanations in the study
of behavior: Comparative perspectives, edited by M. Bekoff and D.
Jamieson, 74–95. Boulder, CO: Westview Press.
——. 1995. Why behaviour patterns that animals learn socially are
locally adaptive. Animal Behaviour 49:1325–1334.
Galef, B. G., Jr., L. A. Manzig, and R. M. Field. 1986. Observational
learning in budgerigars: Dawson and Foss (1965) revisited. Behavioural
Processes 13:191–202.
Gergely, G., H. Bekkering, and I. Király. 2002. Rational imitation in
preverbal infants. Nature 415(6873):755.
Gergely, G., and G. Csibra. 2003. Teleological reasoning about actions:
The naïve theory of rational action. Trends in Cognitive Sciences 7:287–
292.
——. 2005a. A few reasons why we don’t share Tomasello et al.’s
intuitions about sharing. Commentary on Tomasello, Carpenter,
Call, Behne, and Moll’s target article “Understanding and sharing
intentions: The origins of cultural cognition.” Behavioral and Brain
Sciences 28:701–702.
——. 2005b. The social construction of the cultural mind: Imitative
learning as a mechanism of human pedagogy. Interaction Studies
6(3):463–481.
Gergely, G., K. Egyed, and I. Király. 2007. Early mindreading versus
pedagogical knowledge transfers interpreting object-referential
emotion expressions during the second year. Developmental Science,
in press.
Gergely, G., I. Király, and O. Koós. 2003. Developmental changes in
early observational learning. Paper presented at the Biennial Meeting
of the Society for Research in Child Development, Tampa, April 24–
27.
Gergely, G., Z. Nádasdy, G. Csibra, and S. Bíró. 1995. Taking the intentional
stance at 12 months of age. Cognition 56(2):165–193.
Goodall, J. 1986. The Chimpanzees of Gombe: Patterns of Behavior.
Cambridge, MA: Harvard University Press–Belknap Press.
Heyes, C. M. 1993. Imitation, culture and cognition. Animal Behaviour
46:999–1010.
Heyes, C. M., and G. R. Dawson. 1990. A demonstration of observational
learning using a bidirectional control. Quarterly Journal of Experimental
Psychology 42B:59–71.
Heyes, C. M., and B. G. Galef Jr. 1996. Social learning in animals: The
roots of culture. New York: Academic Press.
Horner, V., and A. Whiten. 2005. Causal knowledge and imitation/
emulation switching in chimpanzees (Pan troglodytes) and children
(Homo sapiens). Animal Cognition 8:164–181.
Keil, F. C. 1995. The growth of causal understandings of natural kinds.
In Causal cognition: A multi-disciplinary debate, edited by D. Sperber,
D. Premack, and A. J. Premack, 234–262. Oxford: Clarendon Press.
——. 2003. Folkscience: Coarse interpretations of a complex reality.
Trends in Cognitive Sciences 7:368–373.
Kelemen, D. 1999a. Beliefs about purpose: On the origins of teleological
thought. In The descent of mind: Psychological perspectives in hominid
evolution, edited by M. Corballis and S. Lea, 278–294. Oxford: Oxford
University Press.
——. 1999b. Functions, goals and intentions: Children’s teleological
reasoning about objects. Trends in Cognitive Sciences 12:461–468.
Kenward, B., A. A. S. Weir, C. Rutz, and A. Kacelnik. 2005. Behavioral
ecology: Tool manufacture by naive juvenile crows. Nature 433(7022):
121.
Király, I., G. Csibra, and G. Gergely. 2004. The role of communicative-
referential cues in observational learning during the second year.
Paper presented at the 14th Biennial International Conference on
Infant Studies, Chicago, May 5–8.
McGrew, W. C. 1996. Chimpanzee material culture: Implications for human
evolution. New York: Cambridge University Press.
——. 2004. The cultured chimpanzee: Reflections on cultural primatology.
New York: Cambridge University Press.
Meltzoff, A. N. 1988. Infant imitation after a one week delay: Long-
term memory for novel acts and multiple stimuli. Developmental
Psychology 24:470–476.
——. 1995. What infant memory tells us about infantile amnesia:
Long-term recall and deferred imitation. Journal of Experimental Child
Psychology 59:497–515.
——. 1996. The human infant as imitative generalist: A 20-year progress
report on infant imitation with implications for comparative
psychology. In Social learning in animals: The roots of culture, edited
by C. M. Heyes and B. G. Galef Jr., 347–370. New York: Academic
Press.
——. 2002. Imitation as a mechanism of social cognition: Origins
of empathy, theory of mind, and the representation of action. In
Handbook of Childhood Cognitive Development, edited by U. Goswami,
6–25. Oxford: Blackwell.
Mithen, S. 1996. The prehistory of the mind. London: Thames and
Hudson.
Nagell, K., R. Olguin, and M. Tomasello. 1993. Processes of social
learning in the tool use of chimpanzees (Pan troglodytes) and human
children (Homo sapiens). Journal of Comparative Psychology 107:174–
186.
Nishida, T. 1987. Local traditions and cultural transmission. In Primate
Societies, edited by B. B. Smuts, D. L. Cheney, R. M. Seyfarth, R. W.
Wrangham, and T. T. Struhsaker, 462–474. Chicago: University of
Chicago Press.
Petrinovich, L. 1988. Individual stability, local variability and the
cultural transmission of song in White-crowned Sparrows (Zonotrichia
leucophrys nuttalli). Behaviour 107:208–240.
Pléh, C. 2003. Thoughts on the distribution of thoughts: Memes or
epidemics. Journal of Cultural and Evolutionary Psychology 11:21–51.
Schick, K. D., and N. Toth. 1993. Making silent stones speak: Human
evolution and the dawn of technology. New York: Simon and Schuster.
Semaw, S. 2000. The world’s oldest stone artifacts from Gona, Ethiopia:
Their implications for understanding stone technology and patterns
of human evolution between 2.6–1.5 million years ago. Journal of
Archaeological Science 27:1197–1214.
Sperber, D. 1994. The modularity of thought and the epidemiology of
representations. In Mapping the mind: Domain specificity in cognition
and culture, edited by L. A. Hirschfeld and S. A. Gelman, 39–67. New
York: Cambridge University Press.
——. 1996. Explaining culture: A naturalistic approach. Oxford:
Blackwell.
Sperber, D., and L. Hirschfeld. 1999. Culture, cognition, and evolution.
In MIT Encyclopedia of the Cognitive Sciences, edited by R. Wilson and
F. Keil, cxi–cxxxii. Cambridge, MA: MIT Press.
——. 2004. The cognitive foundations of cultural stability and diversity.
Trends in Cognitive Sciences 8(1):40–46.
Sperber, D., and D. Wilson. 1986. Relevance: Communication and cognition.
Oxford: Blackwell.
——. 2002. Pragmatics, modularity and mind-reading. Mind and Language
17(1):3–23.
Sumita, K., J. Kitahara-Frisch, and K. Norikoshi. 1985. The acquisition
of stone-tool use in captive chimpanzees. Primates 26:168–181.
Tooby, J., and L. Cosmides. 1992. The psychological foundations of
culture. In The adapted mind: Evolutionary psychology and the generation
of culture, edited by J. Barkow, L. Cosmides, and J. Tooby, 19–136.
New York: Oxford University Press.
Tomasello, M. 1996. Do apes ape? In Social learning in animals: The roots
of culture, edited by C. M. Heyes and B. G. Galef Jr., 319–346. New
York: Academic Press.
——. 1999. The cultural origins of human cognition. Cambridge, MA: Harvard
University Press.
Tomasello, M., and J. Call. 1997. Primate cognition. Oxford: Oxford
University Press.
Tomasello, M., M. Carpenter, J. Call, T. Behne, and H. Moll. 2005.
Understanding and sharing intentions: The origins of cultural
cognition. Behavioral and Brain Sciences 28(5):675–691.
Tomasello, M., A. C. Kruger, and H. H. Ratner. 1993. Cultural learning.
Behavioral and Brain Sciences 16:495–552.
Uller, C. 2004. Disposition to recognize goals in infant chimpanzees
(Pan troglodytes). Animal Cognition 7:154–161.
Visalberghi, E., and D. M. Fragaszy. 1990. Do monkeys ape? In “Language”
and Intelligence in Monkeys and Apes, edited by S. T. Parker and K. R.
Gibson, 247–273. Cambridge: Cambridge University Press.
Watson, M., and L. Ecken. 2003. Learning to trust: Transforming difficult
elementary classrooms through developmental discipline. San Francisco:
Jossey-Bass.
Whiten, A. 2000. Primate culture and social learning. Cognitive Science
24:477–508.
Whiten, A., and D. Custance. 1996. Studies of imitation in chimpanzees
and children. In Social learning in animals: The roots of culture, edited
by C. M. Heyes and B. G. Galef Jr., 347–370. New York: Academic
Press.
Part 3
Varieties of Meaning
Far from suggesting that local ideologies about language and meaning
might be irrelevant to the actual practice of interaction, several of
the most influential scientific articulations of the modernist ideology
actually entail the conclusion that speech participants who do not
subscribe to a mentalist theory of interaction should not show normal
patterns of interaction. This is because, in these articulations of the
theory, constant and relatively aware monitoring of the interlocutor’s
intentions is a sine qua non of successful interaction.
Grice’s seminal definition of linguistic meaning, for example, runs as
follows, “For some audience A, U intended his utterance of x to produce in
A some effect (response) E, by means of A’s recognition of that intention”
(1989c:122, emphasis added). In other words, for Grice, U (the utterer)
intends A (the audience) to recognize not just the signification of the
gesture U emits, but also U’s very intention to have that signification
(or another one) recognized. Note how this formulation requires that
U rely on A’s willingness to “recognize” (that is, guess at) whatever
it is that U actually intends. For Grice (1989b, 1982), this issue is far
from trivial. If audiences interpret meaning without regard to utterers’
communicative intentions, their interpretation is of the type he called
“natural” and that he exemplified in classic examples like “those spots
mean measles,” or “those clouds mean rain.” However, meaning that
relies on the “by means of” clause is “non-natural,” and it is this type
of meaning that Grice spent his career examining.
In one illuminating passage, Grice (1989b) discusses Searle’s example
of the U.S. soldier during World War II who, when captured by Italians,
attempts to convince them that he is actually a German officer, and
therefore not to be detained. The American speaks in German to the
Italians, who recognize but do not understand that language. Grice
argues that the case in which the Italians conclude that their prisoner
is a German officer from the very fact that he is speaking German is
qualitatively different from that in which they draw the same conclusion
based on their “recognition” of his intention to utter the specific German
sentence “I am a German officer.” The former case, although inferential,
does not require intention recognition. Only the latter, which does
require it, can be considered a case of non-natural meaning. In both
cases, according to Grice, we are free to suppose that the U.S. soldier
might have had certain intentions in speaking (including deceptive
ones), and even that the Italians might form hypotheses as to what
those intentions could be. Only in the second case, however, and not
in the first, do the hypotheses of the audience about the intentions of
the utterer make a difference to the meaning that the audience takes
from the utterance.
Note that both are cases of inference. The latter, in the prototypical
Gricean way, is an inference from what is assumed to be the speaker’s
desire to get his intention recognized. The former is an inference from
the assumption of “natural” (indexical, associational) compliance
with something—indeed with whatever it is that the U.S. soldier in
this example does not comply with—something like “speak your own
language.” The fact that the American can manipulate the expectation of
such compliance is evidence enough that the association in question is
not purely “natural.” But Grice himself tells us that is not what he means
by “non-natural.” Grice here seems to propose that we understand the
conventional as a variety of the “natural,” on the grounds that neither
involves the crucial diagnostic of the “non-natural”—the intention to
get one’s intention (and not just one’s signal) recognized.
In various other areas of his writings, Grice departs from the strict
natural–non-natural dichotomy by giving consideration also to conventional
meaning (Grice 1989c, see discussion below). But the dichotomy,
with its embedded reliance on audience willingness to interpret the
utterer’s mental state, is among Grice’s most influential contributions.
In certain critical passages (Grice 1989a:30–31), Grice insists that all
interaction be amenable to a mental-state calculus. Certainly, the
dichotomy articulates well with other influential paradigms within the
philosophy of language (Searle 1965), as well as with folk versions of
the modernist philosophy.2
Conversational Implicature
Making full use of his notion of non-natural meaning, Grice elaborated
a theory of conversational coherence that has had enormous influence
on academic accounts of interaction ever since. For Grice (1989a:45–47),
conversations cohere because participants assume that all parties adhere
to a set of “conversational maxims” as follows:
Quantity:
a) Make your contribution as informative as is required (for the
current purpose of the exchange)
b) Do not make your contribution more informative than is
required
Quality: Try to make your contribution one that is true
a) Do not say what you believe to be false
b) Do not say that for which you lack adequate evidence
Relation: Be relevant
Manner: Be perspicuous
a) Avoid obscurity of expression
b) Avoid ambiguity
c) Be brief (avoid unnecessary prolixity)
d) Be orderly
All parties adhere to these maxims under an umbrella rule referred
to as the cooperative principle. Apparent non sequiturs and nonsenses
in talk are resolved, under what I call the “strong view” of Gricean
inference, through intention guessing or mental-state simulation of
the interlocutor. So, for example, on hearing the exchange:
A: I am out of petrol
B: There is a garage around the corner [Grice 1989a:51]
Conclusion
If in the future the Mopan (or other ethnographic) data continue to fit
the predictions, it would not mean that there is no universal human
interaction engine. But it could mean that there are fundamentally
different means of functioning within the “engine.”
Where reliance on the modernist cultural ideology in academic
philosophical theories has been overly heavy, we have been scientifically
and not just culturally wrong about two things: (1) how the universal
layers of the human interactional engine work [they are based on
sign motivation, common ground, and conventional meaning, not
on constant calculation of speaker’s nonce mental state] and (2) the
idea that we could consult our own intuitions for reliable information
about what the separable components of the engine are, and about
which pieces of the engine constitute the “basic model” and which
are optional extras.
Although certain parts of the Gricean system (basic assumption of
compliance with the maxims) operate the same way—namely, via
conventional meaning—in both Mopan and moderns, other parts of
“Gricean inference” may not be universal (mutual intention-guessing
and consequent byproducts in regimes of responsibility and in figurative
language preferences). In particular, those areas of Western linguistic
interaction in which the operation of Grice’s non-natural meaning is
most clearly to be observed—those involved in the verbal artistry of
pragmatic flout and Machiavellian reverse psychology—may constitute
specialized cultural traditions of verbal art that should not be allowed
too closely to inform either our investigations into questions of language
origins or our general theorizing about how language relates to mind
at the species level.
Acknowledgments
None of the research would have been possible without the assistance
and hospitality of the Mopan Maya people of the Toledo District,
Belize. Collection of Mopan data was supported at different times by
the Wenner-Gren Foundation for Anthropological Research (Grant
#4850), the Social Sciences and Humanities Research Council of Canada
(Award #452–87–1337), the Cognitive Anthropology Research Group
of the Max-Planck-Institute for Psycholinguistics, and the University of
Virginia. The Department of Archaeology, Belmopan, Belize, provided
help and support during various fieldwork periods. I would also like to
thank Herb Clark, Suzanne Gaskins, Stephen Levinson, and John Lucy
for helpful comments on earlier presentations of this material.
Notes
1. To the anthropological litany can be added a roster of well-known accounts
from European history: The archaic Greeks had different views from our own
about the source of inspiration and the locus of responsibility for human
actions (Friedrich 1977; Snell 1953). The notion of individual subjectivity
developed in the late Middle Ages under the influence of the confessional,
increasing social mobility and Protestantism (Foucault 1978; Morris 1972;
Trilling 1974). Literacy and print allowed messages to remain fixed, and
therefore gave rise to the philosophical distinction between “objective” and
“subjective” for the first time (Olson 1991). And so on.
2. Keller (1998) gives an extended treatment of the general tendency in
Western philosophy to dichotomize in terms of “natural” and “intentional,”
without considering as a separate class the nondeliberate human products.
3. Although demonstrated cross-cultural difference in reliance on formal
marking of the nature of one’s evidence for assertions is perhaps a relevant
phenomenon (see Chafe and Nichols 1986; Irvine and Hill 1992).
4. We can go even further down the mentalist road without arriving at fully
“non-natural” meaning as articulated by Grice. Idealized Mopan Speaker X
could ruminate “Y’s remark B seems irrelevant to my own prior remark A.
But it can’t be [by Maxim of Relevance]. So what could Y (falsely) believe that
would make B in fact seem relevant to A?” What is necessarily missing is the
crucial ingredient of reflexive communicative intention: the belief on the part
of X that Y intended X not only to understand the words and other signals
produced, but also, separately, to understand Y’s intentions in producing them
(see Levinson this volume).
5. Consider the fact that we have all mastered the complex phonological
and phonotactic systems of at least one human language. Such knowledge
is undoubtedly acquired, therefore conventional and not strictly “natural.”
Yet we rarely pause to contemplate whether our next utterance should begin
with an aspirated or an unaspirated consonant. If we did pause to consider
such things with any frequency, the system would cease to operate. This
property of ready habituation means that the other-than-natural provenance
of conventional signifiers is easily obscured in the intuitions of their users.
This fact underlies another candidate human universal: the phenomenon
of ethnocentrism, in which people everywhere believe that their culturally
particular ways of doing things are the only possible ones.
6. Recall from the U.S. soldier discussion that non-natural signals are not
the only kind that can trigger inferences in an audience, although only non-natural
signals trigger inferences about what the utterer intends, rather than
about what the signal itself suggests.
7. There are also relevant predictions for child development from the
suggestion that mentalism in interaction is a cultural speciality. Namely,
even in mentalist societies, children’s acquisition of flout-reliant figures
should be late and may have to be formally taught. Mastery of normal
conversational interaction should, on the other hand, come much earlier, and
should be independent of children’s mastery of the flout.
8. My observations have centered exclusively on issues in Mopan language
philosophy that relate to the Gricean maxim of quality (“Try to make your
contribution one which is true”). There is reason to believe that quality
has a special status among the maxims (Grice 1989a), but should explicit
local philosophies be found relating to the other maxims, I predict (among
other possibilities) one or all of: quantity—decreased appreciation and use
of hyperbole or litotes; relevance—no proverbs; and manner—absence of
mockery, sarcasm, or irony. Questions of “keying” (Hymes 1974) and framing
(Goffman 1974) will also be crucial.
References
Bar-On, D. in press. Speaking my mind. New York: Oxford University
Press.
Barr, D. J. and B. Keysar. 2005. Making sense of how we make sense:
The paradox of egocentrism in language use. In Figurative language
comprehension: Social and cultural influences, edited by H. L. Colston
and A. N. Katz, 21–42. Mahwah, NJ: Erlbaum.
Basso, K. 1970. To give up on words: Silence in Western Apache culture.
Southwestern Journal of Anthropology 26:213–230.
Begres, S. J. 1992. Metaphor and constancy of meaning. Grazer
Philosophische Studien 43:143–161.
Brice Heath, S. 1982. What no bedtime story means: Narrative skills at
home and at school. Language in Society 11(1):49–76.
Burling, R. 1999. Motivation, conventionalization, and arbitrariness in
the origin of language. In The origins of language: What non-human
primates can tell us, edited by B. J. King, 307–350. Santa Fe: School
of American Research Press.
Carston, R. 2005. Pragmatic inference—Reflective or reflexive? Keynote
address of the 9th International Pragmatics Conference, Riva del
Garda, Italy, July 10–15.
Chafe, W., and J. Nichols (eds). 1986. Evidentiality: The linguistic coding
of epistemology. Advances in Discourse Processes Series, vol. 20, vii–xi.
Norwood, NJ: Ablex Publishing.
Clark, H. H. 1996. Using language. Cambridge: Cambridge University
Press.
Coleman, L., and P. Kay. 1981. Prototype semantics: The English verb
“lie.” Language 57(1):26–44.
Danziger, E. 1996a. Parts and their counter-parts: Social and spatial
relationships in Mopan Maya. Journal of the Royal Anthropological
Institute, incorporating MAN 2(1):67–82.
——. 1996b. Split intransitivity and active-inactive patterning in Mopan
Maya. International Journal of American Linguistics 62(4):379–414.
——. 2001. Relatively speaking: Language, thought and kinship in Mopan
Maya. Oxford Studies in Anthropological Linguistics. New York:
Oxford University Press.
——. 2002. Making up our minds: Metaphor and intentionality from
a Mopan Maya perspective. Paper presented at the “Evaluation and
Personhood” session of the 101st Annual Meeting of the American
Anthropological Association, New Orleans, November 20–24.
——. 2005. Reflexive communicative intention in cross-cultural context:
The fate of the flout. Paper presented at the 9th International
Conference of the International Pragmatics Association (Pragmatics and
Philosophy). Riva del Garda, Italy, July 10–15.
——. n.d. To play a speaking part: Some linguistic preconditions for
fiction.
DuBois, J. 1987. Meaning without intention. Papers in Pragmatics
1(2):80–122.
Duranti, A. 1992. Intentions, self and responsibility: An essay in Samoan
ethnopragmatics. In Responsibility and Evidence in Oral Discourse, edited
by J. Hill and J. Irvine, 24–47. Cambridge: Cambridge University
Press.
Evans-Pritchard, E. E. 1976. Witchcraft, oracles and magic among the
Azande, abridged edition. Oxford: Clarendon Press.
Foucault, M. 1978. The history of sexuality, vol. 1. New York: Pantheon
Books.
——. 1980. Power/knowledge: Selected interviews and other writings, 1972–1977,
translated by Colin Gordon, Leo Marshall, John Mepham, and
Kate Soper; edited by Colin Gordon. New York: Pantheon Books.
Friedrich, P. 1977. Sanity and the myth of honor: The problem of
Achilles. Ethos 5(3):281–305.
Gibbs, R. W., Jr. 1994. The poetics of mind: Figurative thought, language
and understanding. New York: Cambridge University Press.
Giora, R. 2003. On our mind: Science, context and figurative language. New
York: Oxford University Press.
Goffman, E. 1974. Frame analysis: An essay on the organization of experience.
Cambridge, MA: Harvard University Press.
——. 1983. Felicity’s condition. American Journal of Sociology 89(1):1–53.
Green, M. S. 2003. Grice’s frown: On meaning and expression. In Saying,
meaning, implicating, edited by G. Meggle and C. Plunze, 200–219.
Leipzig: University of Leipzig Press.
Gregory, J. R. 1975. Image of limited good, or expectation of
reciprocity? Current Anthropology 16(1):73–92.
——. 1984. The Mopan: Culture and ethnicity in a changing Belizean
community. University of Missouri Monographs in Anthropology,
no. 7. Columbia: Museum of Anthropology, University of Missouri,
Columbia.
Grice, H. P. 1989a. Logic and conversation. Studies in the Way of Words,
22–40. Cambridge, MA: Harvard University Press.
——. 1989b. Meaning. Studies in the Way of Words, 213–223. Cambridge,
MA: Harvard University Press.
——. 1989c. Utterer’s meaning, sentence-meaning, and word-meaning.
Studies in the Way of Words, 117–137. Cambridge, MA: Harvard
University Press.
——. 1982. Meaning Revisited. Mutual Knowledge, edited by N. V. Smith,
223–243. New York: Academic Press.
Haiman, J. 1998. Talk is cheap: Sarcasm, alienation and the evolution of
language. New York: Oxford University Press.
Hymes, D. 1974. Toward ethnographies of communication. Foundations in
sociolinguistics: An ethnographic approach. Philadelphia, PA: University
of Pennsylvania Press.
Irvine, J., and J. Hill (eds). 1992. Responsibility and evidence in oral
discourse. Cambridge: Cambridge University Press.
Keenan, E. 1976. The universality of conversational postulates. Language
in Society 5(1):67–80.
Keller, R. 1998. A theory of linguistic signs. Oxford: Oxford University
Press.
Lakoff, G., and M. Johnson. 1980. Metaphors we live by. Chicago:
University of Chicago Press.
Mitchell-Kernan, C. 1972. Signifying and marking: Two Afro-American
speech acts. In Directions in sociolinguistics: The ethnography of
communication, edited by J. J. Gumperz and D. Hymes, 161–179.
New York: Holt, Rinehart and Winston.
Morris, C. 1972. The discovery of the individual 1050–1200. New York:
Harper and Row.
Nuyts, J. 1994. The intentional and the socio-cultural in language use.
Pragmatics and Cognition 2(2):237–268.
Ochs, E. 1982. Talking to children in Western Samoa. Language in Society
11(1):77–104.
Olson, D. R. 1991. Literacy and objectivity: The rise of modern science.
In Literacy and orality, edited by D. R. Olson and N. Torrance, 149–164.
Cambridge: Cambridge University Press.
Reddy, M. J. 1993[1979]. The conduit metaphor: A case of frame conflict
in our language about language. In Metaphor and thought, edited by
Andrew Ortony, 137–163. Cambridge: Cambridge University Press.
Robbins, J. 2001. God is nothing but talk: Modernity, language and
prayer in a New Guinea society. American Anthropologist 103(4):901–912.
Rosaldo, M. Z. 1982. The things we do with words: Ilongot speech acts
and speech act theory in philosophy. Language in Society 11:203–237.
Rosen, L. 1995. Other intentions: Cultural contexts and the attribution of
inner states. Santa Fe: School of American Research Press.
Searle, J. 1965. What is a speech act? In Philosophy in America, edited by
M. Black, 221–239. Ithaca, NY: Cornell University Press.
Snell, B. 1953. The discovery of the mind: The Greek origins of European
thought. Oxford: Blackwell.
Sperber, D., and D. Wilson. 1986. Relevance: Communication and cognition.
Oxford: Blackwell.
——. 2002. Pragmatics, modularity and mind-reading. Mind and Language
17(1–2):3–23.
Sweetser, E. 1987. The definition of “lie”: An examination of the folk
model underlying a semantic prototype. In Cultural models in language
and thought, edited by D. Holland and N. Quinn, 43–66. New York:
Cambridge University Press.
Thompson, J. E. S. 1930. Ethnology of the Mayas of southern and central
British Honduras, Field Museum of Natural History Anthropological Series
17, no. 2. Chicago: Field Museum of Natural History.
Trilling, L. 1974. Sincerity and authenticity. London: Oxford University
Press.
Warren, K. B. 1995. Each mind is a world. In Other intentions: Cultural
contexts and the attribution of inner states, edited by L. Rosen, 47–68.
Santa Fe: School of American Research Press.
Wilcox, S. 1999. The invention and ritualization of language. In The origins
of language: What non-human primates can tell us, edited by B. J. King,
351–384. Santa Fe: School of American Research Press.
ten
References
Bornstein, M. H. 1989. Cross-cultural developmental comparisons:
The case of Japanese-American infant and mother activities and
interactions. What we know, what we need to know, and why we
need to know. Developmental Review 9:171–204.
Bruner, J. 1972. The nature and uses of immaturity. American Psychologist
27:688–704.
——. 1983. Children’s talk. New York: W. W. Norton.
Clancy, P. 1986. The acquisition of communicative style in Japanese.
In Language socialization across cultures, edited by B. Schieffelin and
E. Ochs, 213–250. New York: Cambridge University Press.
Doi, L. T. 1973. The anatomy of dependence. Tokyo: Kodansha
International.
Ferguson, C. 1978. Talking to children: a search for universals. In
Universals of Human Language, vol. 1, edited by J. Greenberg, C. A.
Ferguson, and E. A. Moravscik, 203–224. Stanford, CA: Stanford
University Press.
Gaskins, S. 1990. Mayan exploratory play and development. Ph.D.
dissertation, Department of Education (Educational Psychology),
University of Chicago.
——. 1996. How Mayan parental theories come into play. In Parents’
cultural belief systems: Their origins, expressions, and consequences, edited
by S. Harkness and C. M. Super, 345–363. New York: Guilford.
——. 1999. Children’s daily lives in a Mayan village: A case study of
culturally constructed roles and activities. In Children’s engagement in
the world: Sociocultural perspectives, edited by A. Göncü, 25–61. New
York: Cambridge University Press.
Heath, S. B. 1983. Ways with words: Language, life, and work in communities
and classrooms. Cambridge: Cambridge University Press.
LeVine, R., S. Dixon, S. LeVine, A. L. Richman, P. H. Leiderman, C. H.
Keefer, and T. B. Brazelton. 1994. Child care and culture: Lessons from
Africa. Cambridge: Cambridge University Press.
Lucy, J. A. 1993. Metapragmatic presentationals: Reporting speech with
quotatives in Yucatec Maya. In Reflexive language: Reported speech and
metapragmatics, edited by J. A. Lucy, 91–125. New York: Cambridge
University Press.
Martini, M., and J. Kirkpatrick. 1981. Early Interactions in the Marquesas
Islands. In Culture and early interactions, edited by T. M. Field, A.
M. Sostek, P. Vietze, and P. H. Leiderman, 189–213. Hillsdale, NJ:
Erlbaum.
New, R. S. 1988. Parental goals and Italian infant care. In Parental
behavior in diverse societies, edited by R. LeVine, P. M. Miller, and M.
M. West, 51–64. San Francisco: Jossey-Bass.
Ochs, E. 1988. Culture and language development: Language acquisition
and language socialization in a Samoan village. Cambridge: Cambridge
University Press.
Ochs, E., and B. Schieffelin. 1984. Language acquisition and socialization:
Three developmental stories and their implications. In Culture and its
acquisition, edited by R. Shweder and R. LeVine, 276–320. Chicago:
University of Chicago Press.
Pye, C. 1986. Quiché Mayan speech to children. Journal of Child Language
13:85–100.
Richman, A. L., R. A. LeVine, R. S. New, and G. A. Howrigan. 1988.
Maternal behavior to infants in five cultures. New Directions for Child
Development 40:81–97.
Schieffelin, B. 1983. Talking like birds: Sound play in a cultural
perspective
. In Acquiring conversational competence, edited by E. Ochs and B.
Schieffelin, 177–184. London: Routledge and Kegan Paul.
——. 1990. The give and take of everyday life: Language socialization of
Kaluli children. Cambridge: Cambridge University Press.
Tomasello, M. 1999. The culture of human cognition. Cambridge, MA:
Harvard University Press.
Trevarthen, C. 1987. Universal co-operative motives: How infants
become to know the language and the culture of their parents. In
Acquiring culture: Cross-cultural studies in child development, edited by
G. Jahoda and I. M. Lewis, 35–90. London: Croom Helm.
Vygotsky, L. S. 1978. Mind in society: The development of higher psychological
processes. Cambridge, MA: Harvard University Press.
——. 1987[1934]. Thinking and speech. New York: Plenum Press.
Watson-Gegeo, K., and D. Gegeo. 1986. Calling-out and repeating
routines in Kwara’ae children’s language socialization. In Language
socialization across cultures, edited by B. Schieffelin and E. Ochs, 17–50.
New York: Cambridge University Press.
Weisner, T. S., and R. Gallimore. 1977. My brother’s keeper: Child and
sibling caregiving. Current Anthropology 18:169–190.
Wood, D., J. Bruner, and G. Ross. 1976. The role of tutoring in problem
solving. Journal of Child Psychology and Psychiatry 17:89–100.
eleven
potential indices of the problem and its likely causes (cf. Goodwin 1994).
Relatively little is explained and there is no physical exam, but a vast
amount of observational information is gathered by the shaman and
subsequently used as a basis for inferences about the patient.
The setting unfolds in the back room of the shaman’s house, in
front of his altar, a table on which there are numerous saints’ images,
candles, flowers, tins of herbal medicines, and his divining crystals
kept in a gourd filled with blessed water (see Fig. 11.2). The altar is on
the East wall of the room, facing East, and the saints’ images all face
West, looking into the room. In the shaman’s practice, these details
are part of a vast universe of background knowledge, both declarative
and procedural (Hanks 1990, 1996). The altar is a complex instrument
with which he engages daily and of which every part has an elaborate
history and rationale. The patient can see all of the objects and is
aware that they are significant, but has minimal knowledge of what
is signified, most of which is esoteric or explicitly kept secret. Like a
hieroglyph, the altar and the speech that occurs at it are known to be
meaningful but their meaning is mostly unknown to the patient. He
can also see in the shaman’s behavior that the altar is his intimate space,
but can only postulate the history behind this intimacy. He bases his
postulation on the shaman’s displays in the present and reputation
from past performances and the opinions of others (most or all of
whom are not present). The patient also has a personal history that
brings him to this place at this time, but the shaman can only infer
it. Whereas the shaman has a reputation based on years of practice,
the patient is often an anonymous stranger on a first visit. An expert,
the shaman can monitor both the patient and the signs in the space
with acuity far beyond the patient’s ability to monitor. In this sense,
the situation is by design asymmetric from the outset. It takes place
on the shaman’s ground.
Table 11.1. Phases in a divination and treatment
In the course of drawing the patient further into the prayer by asking
his name, the shaman also integrates the two interactive situations
into a single frame. One way he achieves this is by shifting between
prayer addressed to spirits (2.1–2.2, 2.13–2.16, 2.20–2.25) and ordinary
dialogue addressed to copresent persons (2.3–2.12, 2.16–2.19), with
alternating intervals devoted to each. The orientations of his body keep
the two distinct, facing toward the altar in prayer versus turning his head
in the direction of the patient when addressing him or his wife. The
two orientations correspond to two complementary faces of divination
as a genre, which addresses both patients and spirits. Regardless of
orientation, the shaman’s right index finger never loses contact with the
crystals.7 His first bid for the name occurs at 2.2, in the course of a
normal breath group in prayer, marked by the quick head turn and
ordinary question intonation. When he must subsequently cite the
name in prayer, he seems to have forgotten it. After a two-second pause
he states the first name only and pauses again, which the wife hears
as a request for completion (2.17). This response by the wife to a short
pause is clear evidence that she is attending closely to the prayer and
its timing.
When he returns to prayer at 2.20, the full name is syntactically
integrated into an utterance addressed to the spirits, yet delivered
with the prosody of ordinary Maya (2.20–2.25). The phrasing (“the
name of the body, I present it to your right hand, review the earth,
the blood, the spirit”) indexes the register as prayer, but the ordinary
prosody and the absence of spirit names make it readily intelligible to
a Maya speaker. This suggests a second strategy of integration: starting
with his bid for completion in 2.16 through 2.23, the shaman’s speech
is intermediate between the two situations, just as it is intermediate
between the registers that index them. He is, as it were, in both frames
at once. The result is to bind the two situations together and secure
the counterpart relation between the copresent patient and the body
presented to the spirits. Even if the patient does not understand all
that is going on, he understands that the shaman is presenting him,
and the need for a full name and home town officializes this fact, further
reinforcing the shaman’s credibility.
What ensues over the next 3 minutes and 14 seconds is the pivotal
three-way interaction in which the shaman, the patient and the spirits
coparticipate to derive a diagnosis that the patient will ultimately ratify.
This segment of the clinical episode is pivotal because the patient’s
participation is transformed, from an attentive overhearer called on
to give precise public information (name and town), into an agent in
his own diagnosis. Drawn in by the near intelligibility of the register,
the semiotic density of the altar and the shaman’s gestures, and the
occasional question, the patient builds up a conviction in the plausibility
of the process. This basic conviction is transformed further in subsequent
phases, but it is the basis of his ultimate ratification of the diagnosis.
Part of this is shown in phase 3.
3. Phase 3: Combined dialogue with spirits and patient
3.1 DC #tuk’aba’ Cristo Jesus, Dyos Padre ‘espiritu santu. .
[AV.07.21.07]
in the name of Christ Jesus, god the father, holy
spirit
3.2 ((taking two crystals, one at a time, in right hand and
squinting at them))
3.3 (10.8) trés dòos.
Three two
3.4 # (5.0) kí’ichpam kó’olebi sáasil ak’ab ‘indio mayab’
yum papal kòol chak
Beautiful Lady Sasil Akab Indio Mayab Lord Papal
Kol Chak
3.5 yun chiri’ chabo’ sàanto chabol señor yúun sala’ . (1.0)
sáasikunten (.)
Lord Chiri Chabo Holy Chabol Lord Sala, enlighten
for me
3.6 ‘usàanto kristal le yum balan ‘ìik’ó’o tuk’aba’ Crìisto.
[AV.07.21.48]
the holy crystals of the lord jaguar spirits in the
name of Christ
3.7 ((taking one crystal in right hand, squinting at it))
3.8 (6.0) kwatro sìinko. ((nods twice; takes another
crystal in right hand))
four five
3.9 (9.0) ‘àah. Pwes (2.0) leti e k’oháani yàan tech a’.
Oh, so this illness you’ve got, [AV.07.22.05]
3.10 (2.0) ‘ump’ée ‘ìik’ kumáan tutohil ‘a’estòomagóoh. (1.0)
Kuyahkúuntk
a wind is passing directly in your stomach. It pains
3.11 ‘a’estòomagóoh kuprovokarkech. (1.5) ump’éeh ‘ìik’.
(1.0)
your stomach, it makes you nauseous. A wind
3.12 Le ‘ìik’ (1.0) kukrusàaro’o’, (2.0) má’ má’alo’ bá’al i’
(2.0)
the wind crossing there, it’s not a good thing.
3.13 kutz’ó’okol ubin té’el o’, ku’áatakartik xan apú’uch.
[AV.07.22.37]
after it goes there, it attacks your lower back
3.14 Patient: ‘impú’uch. =
My lower back
3.15 DC = hàah. (1.5) Kuluk’ul té’e tapú’ucho’, (.) kubin
taserebro, tapòol.
Yeah. It leaves your back there, it goes to your brain,
your head.
3.16 Patient hnng.
3.17 DC (2.0) ‘Esyaskèe, (1.0) bey utsikbatik ten té’ela’.] (1.5)
‘esyaske tech
So that’s what it’s telling me here. So you’re the one
who
3.18 awohe xane’, bix yúuchu tech . ((toss of gaze over right
shoulder towards patient behind, onset at bix)).
who knows how it happens to you.
3.19 Pero tene’ mináang—táan intzolik tech (.)
but me. There’s no—I’m explaining to you
3.20 leti e’ bá’a= hé’e [AV.07.22.54]
the thing here-
3.21 Patient = kyúuchl en túunee, (1.0) ku(ya’a) le impòol yéetel
impachk’abe’. =
So it’s happening to me, my head and back hurt
3.22 (patient uncrosses arms and gazes down)
3.23 DC =hàah
Yeah
3.24 Patient: yéet innak’. [AV.07.23.03]
and my belly
3.25 DC hàah yét anak’.
Yes and your belly.
3.26 Patient: ‘estòomagóoh, múnk’amik misbá’ah.
Stomach. Can’t hold anything down.
3.27 DC: hàah. (1.0) hé’ebix, (1.0) tuyá’aka’, bey, ‘awá’ak ten
xan yúuchu tech.
Yes, just, as it says here, so, you tell me it happens
to you.
3.28 (2.0) kutzikbatik-
It tells it-
3.29 Patient -hnng [AV.07.23.17]
Coming to the end of the opening prayer at line 3.1 (which corresponds
to 2.8 above), the shaman takes two crystals in his right hand, one at
a time, leaning forward and squinting as he studies them. After ten
seconds he utters the numbers “three two.” Five seconds later, he returns
to another breath group of réesar, takes another crystal in his right hand,
scrutinizes it for six seconds and utters “four five,” nodding twice as he
speaks. After nine more seconds of silence he begins to formulate the
problem in ordinary Maya for the patient, starting with the patient’s
stomach, with his nausea (3.11) and moving to the back ache he suffers
(3.13). At the mention of his back, the patient repeats “my back.” The
shaman takes this repetition as confirmation of his initial statement of
symptoms, and adds that there is headache as well (3.15), which the
patient confirms with sublexical hnng. At this point the shaman asserts
that this is how he sees it “right here” (in the crystals), that the patient
is the one who knows how it actually happens. The latter statement
effectively calls forth a more explicit confirmation by the patient that
what he feels is what the shaman has said (3.21, 3.24, 3.26). In these
turns at talk, the patient restates the initial diagnosis and ratifies its
accuracy. He has become a principal in the diagnostic process.
Several aspects of this sequence bear on both the management of
the patient’s beliefs and the problem of integrating the two frames of
interaction into one. At the outset, when the shaman states numbers
(3.3, 3.8), it is flagrantly opaque what he is counting and why. The breath
group of réesar that intervenes between the two number statements is
itself mostly unintelligible, spoken softly and rapidly and citing spirits
unknown to all but adepts. The pacing of the shaman’s turns at talk
has slowed down and there are long silences between his utterances.
He nods twice in certainty (3.8), although there is no clue what he is
certain of or why. At this point the patient is maximally excluded from
understanding the expert practice of the shaman, but is attending to
a process he believes to be meaningful. He is simultaneously a ratified
witness of the performance, and its main object.
After another long pause, the shaman proceeds to make statements
about the condition that precisely formulate the patient’s felt symptoms,
starting with what hurts. The inferences that lie behind these statements
are based at least partly on his observation of the patient’s expressions
of pain, but he never appeals to observational evidence of a sort the
patient would recognize. By telling the patient what he feels, he elicits
confirmation from him, a new step toward inducing the patient’s
conviction in the accuracy of the process. The patient is called on to
endorse statements by another about his own pain. At this point, he
“unloads” the symptoms: “my head, my upper back, my belly ache,”
and the shaman once again asserts that just as it says (in the crystals),
so it occurs to the patient (3.27).
By the time the patient ratifies this last statement (3.29), he has
common ground with the shaman regarding his physical symptoms,
if not their causes. He has seen that the shaman can tell him where he
hurts, and he has collaborated in the process by ratifying the shaman
and repeating the same symptoms in his own words. The back and
forth between the two synchronizes them (cf. Enfield this volume)
and sustains their joint engagement. The elicitation of confirmation
builds the patient’s conviction that the shaman can perceive things no
“ordinary” interlocutor would. Along the way, he develops what Clark
(this volume) calls a “joint commitment” to the divinatory project. All
of this elicited commonality supports the counterpart relation between
the crystals, whatever the shaman sees in them, and the patient’s body.
It also enhances the shaman’s credibility for the case at hand, which sets
the stage for the next step. Having provisionally convinced the patient
that he is on the right path, the shaman goes on to make bolder and
more difficult claims about him. The joint convictions and commitments
established so far carry over into the tenuous terrain to be crossed. For
the sake of brevity, this segment of the transcript is omitted but can be
summarized as follows.
Over the course of 3 minutes and 24 seconds, the shaman makes
a series of statements about the patient. As he is speaking he looks
intently at the crystals, thus splitting his attention between listening
to the patient and watching for the signs of the spirits. Not only is the
patient’s pain caused by “spirit, wind” affecting his body, but it is connected
to his job and the land he owns. The patient continues to register the
shaman’s claims with consensual responses, and the shaman proceeds
to build up a more detailed picture of the patient’s life. He tells him
approximately,
you have lands, where you have a traditional house [apsidal, palm roof].
Somewhere on the land there is a cave, an abandoned limestone pit,
or other hole leading underground. In that hole the evil spirits reside
and from it they emanate, striking you and any other animate in your
household [human or animal] who happens to be in the wrong place at
the wrong time. This spirit is very old and it demands offerings. It is evil
and endangers your entire household.
Notes
1. On ritual speech compare DuBois (1986) for a Mayan example, and
Zeitlyn (1995) for a contrasting case.
2. See Bourdieu (1991a, 1991b) for discussion of “authorized” speech and
its relation to performative effectiveness. See also Chafe (1993).
3. Yucatec Maya consonant phonemes are /p, t, k, p’, t’, k’, b, s, x, h, tz, ch, tz’,
ch’, m, n, w, y, l, r/, where /’/ = glottal stop following a vowel and glottalization
following a consonant, /b/ = voiced bilabial implosive, /x/ = voiceless alveo-
palatal fricative, /h/ = voiceless glottal fricative, /tz(‘)/ = (ejective) voiceless
alveolar affricate, and /ch(‘)/ = (ejective) voiceless palatal affricate. Syllable
nuclei are made up of combinations of five vowels (i, e, a, o, u), three tones
(high ´ , mid [no accent], low ` ), length, and glottalization. Length is indicated
by the doubling of a vowel, and glottalization is indicated by an intervocalic
glottal stop ‘ . The canonical vocalic patterns are /i, e, a, o, u/, /íi, ée, áa,
óo, úu/, /ìi, èe, àa, òo, ùu/, and /í’i, é’e, á’a, ó’o, ú’u/. However, short vowels
with tones also occur and are derived either by grammatical processes or by
paralinguistic ones. Glottalization is also realized as creaky voice or even by
eliminating the glottal stop completely. The latter case results in a long vowel
with high- to mid-falling pitch but remains distinct from the (nonglottalized)
high tone series /íi, ée, áa, óo, úu/, which is pronounced variably with rising or
falling pitch. Spellings of place-names such as Oxkutzcab are orthographically
unmodified from their Spanish spellings.
4. Transcription of Audio Visual Tape 07, recorded in Oxkutzcab, Yucatan,
Fall 1991. In Don Chabo’s home, in the altar room. A man and woman have
arrived requesting a treatment for the man, who is in physical distress. Present
are also Hanks and Peter Thompson, a filmmaker who recorded the episode.
The layout of the room is shown in Fig. 11.2. The recording starts at the onset
of the divination. Ritual speech delivered in unmeasured breath groups,
inhalations marked by cross hatch (#).
5. On register and stylistic differentiation, see Eckert and Rickford 2001,
and Irvine 2001.
6. The earliest treatment of the dovetailing of participants’ motives, and
the reciprocity of perspective that it implies, is Schutz (1967).
7. DC’s care to not lose touch with the crystals is motivated by the idea
that once prayer is started, “the thread should not be broken,” that is, the
connection to the spirits should not be allowed to lapse until the end of the
event.
References
Bourdieu, P. 1991a. Authorized language: The social conditions for
the effectiveness of ritual discourse. In Language and symbolic power,
edited and introduced by J. B. Thompson, 107–116. Cambridge, MA:
Harvard University Press.
——. 1991b. Description and prescription: The conditions of possibility
and the limits of political effectiveness. In Language and symbolic power,
edited and introduced by J. B. Thompson, 127–136. Cambridge, MA:
Harvard University Press.
Bucholtz, M. 2000. The politics of transcription. Journal of Pragmatics
32(10):1439–1465.
Chafe, W. 1993. Seneca speaking styles and the location of authority.
In Responsibility and evidence in oral discourse, edited by J. Hill and J.
Irvine, 72–87. Cambridge: Cambridge University Press.
Cicourel, A. 1992. The interpretation of communicative context:
Examples from medical encounters. In Rethinking context: Language
as an interactive phenomenon, edited by A. Duranti and C. Goodwin,
291–310. Cambridge: Cambridge University Press.
——. 2001. Le raisonnement médical. Paris: Seuil.
Clark, H. H. 1992. Arenas of language use. Chicago: University of Chicago
Press.
DuBois, J. 1986. Self-evidence and ritual speech. In Evidentiality: The
linguistic coding of epistemology, edited by W. Chafe and J. Nichols,
313–336. Norwood, NJ: Ablex.
Eckert, P., and J. R. Rickford (eds.). 2001. Style and sociolinguistic variation.
Cambridge: Cambridge University Press.
Goffman, E. 1972. The neglected situation. In Language and social context:
Selected readings, edited by P. P. Giglioli, 61–66. New York: Penguin.
——. 1979. Footing. Semiotica 25:1–29.
Goodwin, C. 1994. Professional vision. American Anthropologist
96(3):606–633.
——. 2000. Gesture, aphasia and interaction. In Language and gesture,
edited by D. McNeill, 84–98. New York: Cambridge University
Press.
Hanks, W. F. 1990. Referential practice, language and lived space among the
Maya. Chicago: University of Chicago Press.
——. 1996. Exorcism and the description of participant roles. In Natural
histories of discourse, edited by M. Silverstein and G. Urban, 160–220.
Chicago: University of Chicago Press.
——. 2001. Exemplary natives and what they know. In Paul Grice’s
Heritage, edited by G. Cosenza, 207–234. Turnhout: Brepols.
——. 2005. Explorations in the deictic field. Current Anthropology
46(2):191–220.
Haviland, J. B. 1993. Anchoring, iconicity and orientation in Guugu
Yimithirr pointing gestures. Journal of Linguistic Anthropology 3:3–45.
Irvine, J. 2001. “Style” as distinctiveness: The culture and ideology of
linguistic differentiation. In Style and sociolinguistic variation, edited by
P. Eckert and J. R. Rickford, 21–43. Cambridge: Cambridge University
Press.
Kendon, A. 1992. The negotiation of context in face to face interaction.
In Rethinking context, edited by A. Duranti and C. Goodwin, 323–334.
Cambridge: Cambridge University Press.
Kita, Sotaro (ed.). 2003. Pointing: Where language, culture, and cognition
meet. Mahwah, NJ: Erlbaum.
Ochs, E. 1979. Transcription as theory. In Developmental pragmatics,
edited by E. Ochs and B. Schieffelin, 43–72. New York: Academic
Press.
Schegloff, E. A. 1987. Between micro and macro: Contexts and other
connections. In The micro-macro link, edited by J. C. Alexander, B.
Giesen, R. Münch, and N. J. Smelser, 207–234. Berkeley: University
of California Press.
Schutz, A. 1967. The phenomenology of the social world. Northwestern
University Studies in Phenomenology and Existential Philosophy. Evanston,
IL: Northwestern University Press.
Sweetser, E., and G. Fauconnier. 1996. Cognitive links and domains:
Basic aspects of Mental Space Theory. In Spaces, worlds, and grammar,
edited by G. Fauconnier and E. Sweetser, 1–28. Chicago: University
of Chicago Press.
Zeitlyn, D. 1995. Divination as dialogue: Negotiation of meaning with
random responses. In Social intelligence and interaction: Expressions
and implications of the social bias in human intelligence, edited by E.
Goody, 189–205. Cambridge: Cambridge University Press.
twelve
actions are varied, complex, and depend on context and audience for
interpretation. Social actors do not follow rules so much as respond
to and choose from a set of contingent expectations that vary with
context and participant frameworks. Effects at the system or society
level are at times neither intended nor predicted by what happens at
the individual level (Giddens 1984). The outcome of a momentary
interaction is “something none of the parties can plan in advance” but
is a “contingent product” (Levinson this volume). Another problem
is how reliable the replication of human practices by individuals
can be even when this is the goal, and thus how unreliable accurate
transmission of culture among members of a society is (Sperber this
volume). It may be that we do not imitate others so much as incorporate
observed outcomes of their actions (Tomasello this volume) into our
own planning processes.
Although research in face-to-face interaction and the ethnography
of communication has challenged long-standing macrosociological
assumptions, theories, and methods (Collins 1981:81; Knorr-Cetina
1981:1; Levinson 2005; Schegloff 2005) and called for a rethinking of a
system level approach concerning social institutions and sociocultural
change, it is not clear how to evaluate those aspects of shared practices or
institutions that are not directly observable in interaction (see Schegloff
1987). An exclusively microinteractional orientation to understanding
culture and society has been criticized as actor focused, reductionist,
trivial, situated, and subjective. The dichotomy between macro and
micro itself has been characterized as having relevance only within
scholarly discussions or only relevant for analysis (Alexander and Giesen
1987:1). Yet it seems clear that the emergent and dynamic nature of a
single, contingent interaction is dependent on and complementary to a
cross-event set of experiences with established categories or routines for
interpretation that maintain an orderliness that characterizes human
sociality and without which local intersubjectivity would not be possible
(Hanks this volume). The close examination of microinteractions
(Schegloff 1968, this volume) or investigation of relationships between
language and cross-event, nonlinguistic problem solving (Levinson
2003, 2005) reveal culturally based habits and routines with historical
dimensions that groups share, “a past which survives in the present
and tends to perpetuate itself into the future by making itself present
in practices structured according to its principles” (Bourdieu 1977:82).
Institutional structures influence the actions of humans through
regularized systematized procedures (Duranti 2003) that are seen as
noncontingent, but these are also adaptable to new situations, as the
deaf signers I discuss below show.
When new technologies or tools are introduced that disrupt conventional
procedures or require new ones, we can investigate the collaborative
production of new shared systems and the means for reestablishing
coherence, as well as look at which resources from past interactions
are marshaled into new ones. The Internet and computer-mediated
communication (CMC) in addition call into question significant
aspects of what “shared” might mean, for example, in the new ability to
communicate identically, yet individually, with (and receive individual
responses from) large numbers of people simultaneously, as well as the
ability to replicate with much greater accuracy.
Communication Technologies
New technology leads to new permutations in what Suchman (1993)
has called the production of a coherent relation between a normal
order of events, and this can have revolutionary, unplanned effects.
Although Alexander Graham Bell was initially interested in creating a
technology that would help deaf people learn to speak, his invention of
the telephone actually disadvantaged deaf people communicatively and
left them out of one of the most important communication advances
of the last century, the ability to collapse spatial constraints and to
carry the human signal across vast distances. The deaf community did
not have telephone technology until nearly 100 years after the hearing
community, when an acoustic coupler that could link a telephone to
a teletype machine and produce text over private phone lines became
available, invented by a deaf physicist, Robert Weitbrecht (see Fig.
12.1 for the recent version of this text-based technology), who had
to overcome powerful opposition from the telephone company. An
earlier telephone device for deaf people, invented by a peer of Bell, the
Telautograph (messages were handwritten at one end of a wire with a
pen-lifting mechanism and were reproduced automatically on the other
end through the use of a stylus and a wide sheet of paper) was never
adopted (for discussion of aspects of language use in deaf communities,
see the chapters by Goldin-Meadow and by Pyers in this volume).
The technology that has resulted in deaf people being able to project
a sign-language signal across vast distances, a simplified video camera
for web interfaces (webcam) and software to transmit digital images,
was itself originally designed for remote monitoring of a coffee pot. The
webcam tool first emerged, so the account goes, in the early 1990s in
the offices of the computer science department of Cambridge University,
designed by researchers who wanted to be able to have a remote view
of their communal coffee pot to see if it was empty and whether a trip
via several flights of stairs was worthwhile. They trained a digital camera
on the faraway coffee pot and the image was sent over the network.
Webcam technology, or “prosthetic” eyes, is now used not only for
viewing locations remote from the viewer, such as towns worldwide
(where webcams are sometimes mounted on buildings and streamed
through the Internet), but also for human interaction (Keating 2005). The
deaf community discovered the potential for vastly improved complex
visual-language transmission (over the previous text typing system), but
early adopters had to adapt signing to new spatial properties.
Webcam-recorded, computer-mediated space is radically different
from “real” space in terms of affordances for sign-language interaction.
The field of vision of the webcam lens is restricted in size compared with
human vision, for example, but less restrictive in terms of place. The
body, an important resource for displaying and assessing social meanings
and for organizing joint action (see Goodwin 2000, this volume) and
displaying joint commitments to activities (Clark this volume), is
constrained in how it can display itself in computer-mediated space,
although it is enabled to project itself around the world. Computer-
mediated space affords communication but restricts communicative
actions. Building a common ground (Enfield this volume) requires new
efforts and experiments by signers. Interactants using sign language
via webcam and computer must coordinate and understand a number
of different and related representations, including mechanical effects
of actions versus human ones. In terms of sign language, a language
dependent on the use of space, differences between three-dimensional
(3D) face-to-face interaction and two-dimensional (2D) space are
significant and meaningful. In creating comprehensible sign in 2D
space, interactants recruit procedures from existing or established
communicative environments and practices and adapt them. They
alter language features, manipulate and organize a visual interaction
space of computer-mediated images, icons, and texts, their own and
others’ bodies in space, and utilize a new metaperspective of their own
signing. They learn to recognize properties of 2D space, as well as how
the computer represents 3D space in 2D. They modify their signing
speed and code choice, and adjust important spatial reference and person
reference forms (Keating and Mirus 2003). They work together over
multiple interactions to develop reciprocity of perspectives and an
independent understanding of how individual actions are transformed
by the technology.
how their sign relative to what they are pointing at will be reproduced
on the screen in “virtual” space. For example, one participant raises her
thumb and begins to point directly behind her (at her husband), but
then turns her hand so that her thumb is pointing directly to the side,
where her husband is in the 2D world of the screen. In the computer-
mediated image, she is pointing at her husband, when actually he is at
least two feet behind her. A new idea of how the recipient views space
differently from the signer is evident here. In other examples, signers
point directly at the webcam when signing YOU rather than pointing
at their interlocutor as they would do in face-to-face interaction. Signers
look at the screen, but then point at the webcam, and sometimes they
both look at and point to the webcam, another case of redesigning
language with recipient design in mind—in this case the webcam lens.
Names (as in “vocatives”), usually used in ASL only to refer to people
not present, are recruited to disambiguate addressees when there is
more than one possible addressee (e.g., when more than one person is
sharing the addressee space directly in front of the webcam, as in Fig.
12.2). Gaze is a less effective marker of addressee here, owing to the
reduced clarity of signers’ gaze direction and to how gaze is represented
in technologically mediated space. The webcam lens also has reduced
peripheral vision capabilities compared with humans. Participants on
the same “side” of the interaction must sit side by side, which reorganizes
important aspects of signed conversational interaction.
A key tool in signers’ efforts at adaptation is a technological property
that gives them a new type of feedback on how understandable their
communication is, and whether they need to adjust their signing, for
example, if their signs are being made outside of the viewable space.
They see the effects of technological mediation on their signs through
a mirror image of themselves (see Fig. 12.6). In effect they see how the
webcam’s “eye” represents them and their language production and
a copy of the visual representation that their interlocutor perceives.
This provides them with an immediately testable means for learning
reciprocity of perspectives. Signers can judge the effect of certain
relationships in reduced size, 2D space and experiment with the efficacy
of new forms of hand position, torso, and face orientation to produce
understandable communication. Because the other’s perception of
oneself is available, self–other mapping is facilitated (see Astington this
volume), as linguistic behavior must take into account the perspective
of the recipient. In successful interaction, a simulation of the other’s
simulation of oneself is involved (Levinson this volume); here such
an understanding is technologically enhanced. The meaning of this
mirror image for social interaction cannot be taken for granted. It is a
novelty not at first oriented to by participants, as excerpt 1 shows. In
excerpt 1, Ben notices that Ned’s signing space is out of web camera
range (see Fig. 12.7). Thus, he cannot see Ned’s signs. Ben breaks down
the solution of this problem, a reciprocal-perspectives problem, into
a sequence of moves for Ned to understand and repair the problem.
The problem to be solved is that Ned is situated so that not enough of
his torso is visible to the webcam lens, and therefore a good part of his
signing space is invisible to Ben.
Excerpt 1
01 Ben moves both his arms downward.
02 Ben: CAMERA Ben moves his wrist downward
03 A-LITTLE-BIT
04 Ned: WHICH? YOURS OR MINE?
05 Ned leans toward the computer and tilts the camera downward
06 Ben: ASK (? unclear fingerspelling)
07 Ned: Ned readjusts the camera position
08 Ben: OK FINE STAY
Figure 12.6. The signer (right) has a mirror image view of his own actions.
Figure 12.7. The top image shows the signer’s sign space is not in camera
range (note difference between the two participants).
Conclusion
Computer tools provide new spaces and opportunities for human sociality.
New technological realities have resulted in rapid adaptations,
altering some important aspects of how we interact with others and
organize and share information. CMC includes not only text-based
synchronous and asynchronous symbolic possibilities, but a new kind
of face-to-face or face-to-machine interaction, with voice and images
of people in real time projected into machine-mediated spaces. The
computer environment significantly alters time and space relationships.
In the case of sign language, the alteration of space is even more
significant. New conventions for communication are emerging as well as
new social networks, relationships, new possibilities for representation
and reproduction (the ability to copy and disseminate), memory storage
and recall, information management, and other aspects of human
sociality. Interactants, in this case signers, alter the communication
system itself, including language features, participation frameworks,
and the notion of a social partner, as they manipulate and organize a
new interaction space. They learn to recognize properties of computer-
mediated space, to utilize a new metaperspective of their own signing, as
well as to negotiate new modes of sociality. When social actors incorporate
new technologies into their practices they are motivated not only by
problems that arise in managing new collaborative activities but also by
new enablements to transcend environmental constraints. Microlevel
actions are highly influenced by shared repertoires learned over a
lifetime; however, they show a surprisingly fast adaptability to new
resources and conditions.
Acknowledgments
Two research assistants, Gene Mirus and Chris Moreland, contributed
enormously to this study. Many thanks to the other contributors to
this volume for stimulating comments and discussion. Thanks also to
three anonymous reviewers.
Notes
1. Science, Volume 283, Number 5410 Issue of 26 Mar 1999, pp. 2004–2005.
References
Aiello, L. C. 1995. Expensive-tissue hypothesis: The brain and digestive
system in human and primate evolution. Current Anthropology
36(2):199.
Alexander, J. C., and B. Giesen. 1987. From reduction to linkage: The
long view of the micro-macro debate. In The micro-macro link, edited
by J. C. Alexander, B. Giesen, R. Münch, and N. J. Smelser, 1–42.
Berkeley: University of California Press.
Barnes, S. 2003. Computer-mediated communication. Boston: Allyn and
Bacon.
Bourdieu, P. 1977. Outline of a theory of practice. Cambridge: Cambridge
University Press.
Cicourel, A. 1981. Notes on the integration of micro and macro levels
of analysis. In Toward an integration of micro- and macro-sociologies,
edited by K. Knorr-Cetina and A. Cicourel, 51–80. London: Routledge
and Kegan Paul.
Collins, R. 1981. Micro-translation as a theory-building strategy. In
Advances in social theory and methodology: Toward an integration of
micro- and macro-sociologies, edited by K. Knorr-Cetina and A. Cicourel,
1–48. Boston: Routledge and Kegan Paul.
Condon, S. L., and C. G. Cech. 1996. Functional comparisons of face
to face and computer mediated decision making interactions. In
Computer-mediated communication: Linguistic, social and cross-cultural
perspectives, edited by S. Herring, 65–80. Philadelphia: Benjamins.
Debourgh, G. 1999. Technology is the tool, teaching is the task. Student
satisfaction in distance learning. College Teaching 47(2):70–73.
Duranti, A. 2003. Agency in language. In Readings in linguistic
anthropology, edited by A. Duranti, 451–472. Oxford: Blackwell.
Duranti, A., and C. Goodwin. 1992. Rethinking context: Language as an
interactive phenomenon. Cambridge: Cambridge University Press.
Foucault, M. 1980. Power/knowledge. New York: Pantheon Books.
Frumkin, P., and G. Kaplan. 2002. Institutional theory and the micro-macro
link. Cambridge, MA: Harvard University Press.
Giddens, A. 1984. The constitution of society: Outline of the theory of
structuration. Cambridge: Polity Press.
Goodwin, C. 2000. Action and embodiment within situated human
interaction. Journal of Pragmatics 32:1489–1522.
Heath, C., and P. Luff. 1993. Disembodied conduct: Interactional
asymmetries in video-mediated communication. In Technology in working
order, edited by G. Button, 35–54. London: Routledge.
Johnson, W. L. 2003. Interaction tactics for socially intelligent
pedagogical agents. IUI ’03, January 12–15, 2003, Miami, FL, USA. ACM
1–58113–586–6/03/00001.
Keating, E. 2000. How culture and technology together shape new
communicative practices: Investigating interactions between deaf
and hearing callers with computer-mediated videotelephone. Texas
Linguistic Forum 43:99–116.
——. 2005. Homo prostheticus: Problematizing the notion of activity
and computer-mediated interaction. In Theories and models of language,
interaction, and culture (special issue of Discourse Studies), edited by A.
Duranti, 7(4–5):527–546.
Keating, E., and G. Mirus. 2003. American Sign Language in virtual
space: Interactions between deaf users of computer-mediated video
communication and the impact of technology on language practices.
Language in Society 32:693–714.
Keating, E., T. Edwards, and G. Mirus. n.d. Cybersign: Impacts of new
communication technologies on space and language.
Knorr-Cetina, K. 1981. Introduction: The micro-sociological challenge
of macro-sociology: Towards a reconstruction of social theory and
methodology. In Advances in social theory and methodology: Toward an
integration of micro- and macro-sociologies, edited by K. Knorr-Cetina
and A. Cicourel, 1–48. Boston: Routledge and Kegan Paul.
Levinson, S. C. 2003. Space in language and cognition: Explorations in
cognitive diversity. Cambridge: Cambridge University Press.
——. 2005. Living with Manny’s dangerous idea. In Theories and models
of language, interaction, and culture (special issue of Discourse Studies),
edited by A. Duranti, 7(4–5):431–454.
Lucas, C., and C. Valli. 1992. Language contact in the American deaf
community. San Diego, CA: Academic Press.
Mazur, J. 2000. Applying insights from film theory and cinematic
technique to create a sense of community and participation in a
distributed video environment. Journal of Computer-Mediated Communication
5(4). Internet publication,
http://www.ascusc.org/jcmc/vol5/issue4/mazur.htm, accessed November 9, 2005.
McKenna, K. Y. A., and J. A. Bargh. 1999. Causes and consequences of
social interaction on the Internet: A conceptual framework. Media
Psychology 1:249–269.
Richerson, P. J., and R. Boyd. 2004. Not by genes alone: How culture
transformed human evolution. Chicago: University of Chicago Press.
Schegloff, E. A. 1968. Sequencing in conversational openings. American
Anthropologist 70(6):1075–1095.
——. 1987. Between micro and macro: Contexts and other connections.
In The micro-macro link, edited by J. C. Alexander, B. Giesen, R. Münch, and
N. J. Smelser, 207–234. Berkeley: University of California Press.
——. 2005. On integrity in inquiry. . . of the investigated, not the
investigator. In Theories and models of language, interaction, and culture
(special issue of Discourse Studies), edited by A. Duranti, 7(4–5):455–480.
Cognition in Interaction
thirteen
The premise of this book and the conference that led to it is that our
mentally mediated and highly structured way of interacting with one
another is what makes us uniquely human (Enfield and Levinson this
volume). Over the course of generations, we have developed patterns of
social organization and values that set the stage for each new generation
of children to interact in human ways. Indeed, children inherit a world
of social organization that scaffolds their development and releases
them from reinventing with each new generation the patterns that make
us uniquely human—they can borrow the wheel from their elders.
One of the most pervasive aspects of social organization is human
language. Every human culture discovered thus far has developed a
linguistic system that is shared by all of its members and pervades the
way those members interact with one another. Even deaf cultures that
do not have access to the aural modality develop linguistic systems,
albeit in the manual modality. These signed languages provide the
medium of interaction for deaf individuals within a community and
define Deaf culture (Padden and Humphries 1988). When children,
be they deaf or hearing, acquire the language of their parents, they do
more than learn a conventional code—they take important steps toward
becoming functioning members of their society.
The question I address in this chapter is what happens when a child
does not have access to the shared conventional language of his or her
Words
The deaf children’s gesture words have five properties that are found in
all natural languages. The gestures are stable in form, although they need
not be. It would be easy for the children to make up a new gesture to fit
every new situation (and that appears to be what hearing speakers do
when they gesture along with their speech; cf. McNeill 1992). But that
is not what the deaf children do. They develop a stable store of forms
that they use in a range of situations: they develop a lexicon that is an
essential component of all languages (Goldin-Meadow et al. 1994).

Table 13.1. The resilient properties of language

meaning
Arbitrariness           Pairings between gesture forms and meanings
                        can have arbitrary aspects, albeit within an
                        iconic framework
Grammatical Functions   Gestures are differentiated by the noun, verb,
                        and adjective grammatical functions they serve

Sentences
Underlying Frames       Predicate frames underlie gesture sentences
Deletion                Consistent production and deletion of gestures
                        within a sentence mark particular thematic roles
Word Order              Consistent orderings of gestures within a
                        sentence mark particular thematic roles
Inflections             Consistent inflections on gestures mark
                        particular thematic roles
Recursion               Complex gesture sentences are created by
                        recursion

Language Use
Here-and-Now Talk       Gesturing is used to make requests, comments,
                        and queries about the present
Displaced Talk          Gesturing is used to communicate about the
                        past, future, and hypothetical
Narrative               Gesturing is used to tell stories about self and
                        others
Self-Talk               Gesturing is used to communicate with oneself
Meta-Language           Gesturing is used to refer to one’s own and
                        others’ gestures
Moreover, the gestures the children develop are composed of parts
that form paradigms, or systems of contrasts. When the children invent
a gesture form, they do so with two goals in mind—the form must
not only capture the meaning they intend (a gesture–world relation),
but it must also contrast in a systematic way with other forms in their
repertoire (a gesture–gesture relation). In addition, the parts that form
these paradigms are categorical. For example, one child, David, used
a fist hand shape to represent grasping a balloon string, a drumstick,
and handlebars—grasping actions requiring considerable variety in
diameter in the real world. The child did not distinguish objects of
varying diameters within the fist category, but did use his hand shapes
to distinguish objects with small diameters as a set from objects with
large diameters (e.g., a cup, a guitar neck, or the length of a straw) that
were represented by a C-shaped hand. The manual modality can easily
support a system of analog representation, with hands and motions
reflecting precisely the positions and trajectories used to act on objects in
the real world. But the children do not choose this route. They develop
categories of meanings that, although essentially iconic, have hints of
arbitrariness about them—that is, the boundaries between categories are
not drawn in the same places in the children’s gesture systems (Goldin-Meadow
et al. 1995).
Finally, the gestures the children develop are differentiated by
grammatical function. Some serve as nouns, some as verbs, some as adjectives.
As in natural languages, when the same gesture is used for more than
one grammatical function, that gesture is marked (morphologically
and syntactically) according to the function it plays in the particular
sentence (Goldin-Meadow et al. 1994). For example, if a child were to use
a twisting gesture in a verb role, that gesture would likely be produced
near the jar to be twisted open (i.e., it would be inflected), it would
not be abbreviated, and it would be produced after a pointing gesture
at the jar. In contrast, if the child were to use the twisting gesture in
a noun role, the gesture would likely be produced in neutral position
near the chest (i.e., it would not be inflected), it would be abbreviated
(produced with one twist rather than several), and it would occur before
the pointing gesture at the jar.
Sentences
The deaf children’s gesture sentences have six properties found in all
natural languages. Underlying each sentence is a predicate frame that
determines how many arguments can appear along with the verb in the
surface structure of that sentence (Goldin-Meadow 1985). For example,
four slots underlie a gesture sentence about transferring an object, one
for the verb and three for the arguments (actor, patient, and recipient). In
contrast, three slots underlie a gesture sentence about eating an object,
one for the verb and two for the arguments (actor and patient).
Moreover, the arguments of each sentence are marked according to
the thematic role they play. There are three types of markings that are
resilient (Goldin-Meadow and Mylander 1984; Goldin-Meadow et al.
1994):
(1) Deletion—The children consistently produce and delete gestures
for arguments as a function of thematic role; for example, they are
more likely to delete a gesture for the object or person playing the role
of transitive actor (soldier in “soldier beats drum”) than they are to
delete a gesture for an object or person playing the role of intransitive
actor (soldier in “soldier marches to wall”) or patient (drum in “soldier
beats drum”).
(2) Word order—The children consistently order gestures for arguments
as a function of thematic role; for example, they place gestures for
intransitive actors and patients in the first position of their two-gesture
sentences (soldier–march; drum–beat).
(3) Inflection—The children mark with inflections gestures for
arguments as a function of thematic role; for example, they displace a verb
gesture in a sentence toward the object that is playing the patient role
in that sentence (the “beat” gesture would be articulated near, but not
on, a drum).
In addition, recursion, which gives natural languages their generative
capacity, is a resilient property of language. The children form complex
gesture sentences out of simple ones (Goldin-Meadow 1982). For
example, one child pointed at me, produced a “wave” gesture, pointed
again at me, and then produced a “close” gesture to comment on the
fact that I had waved before closing the door—a complex sentence
containing two propositions: “Susan waves” (proposition 1) and “Susan
closes door” (proposition 2). The children systematically combine the
predicate frames underlying each simple sentence, following principles
of sentential and phrasal conjunction. When there are semantic elements
that appear in both propositions of a complex sentence, the children
have a systematic way of reducing redundancy, as do all natural languages
(Goldin-Meadow 1982, 1987).
Telling Stories
Narrative is one of the most powerful tools that human beings possess
for organizing and interpreting experience. Not only is narrative found
universally across cultures (Miller and Moore 1989), but no other species
is endowed with this capacity. Moreover, narrative emerges remarkably
early in human development. Children from many sociocultural
backgrounds, both within and beyond the United States, begin to
recount their past experiences during the second and third years of life.
The deaf children told stories but used gesture to do so. They told
stories about events they or others experienced in the past, events
they hoped would occur in the future, and events that were flights of
imagination (Phillips et al. 2001). For example, one child produced the
following simple narrative in response to a picture of a car. His mother
confirmed the tale by telling it later in her own words.
1. “Break” gesture—“away” gesture [= narrative marker]—point at
dad—“car-goes-onto-truck” gesture (flat right hand glides onto back
of flat left hand)
2. “Crash” gesture—“away” gesture [= narrative marker]
Gloss: Dad’s car broke and went onto a tow truck. It crashed.
Acknowledgment
This research was supported by grants from the National Institute of
Deafness and Other Communicative Disorders (R01 DC00491), the
National Institutes of Child Health and Human Development (R01
HD47450), and the Spencer Foundation.
References
Alibali, M., and S. Goldin-Meadow. 1993. Gesture-speech mismatch
and mechanisms of learning: What the hands reveal about a child’s
state of mind. Cognitive Psychology 25:468–523.
Baggett, P. 1984. Role of temporal overlap of visual and auditory material
in forming dual media associations. Journal of Educational Psychology
76:408–417.
Bekken, K. 1989. Is there “Motherese” in gesture? Ph.D. dissertation,
Department of Psychology, University of Chicago.
Church, R. B., and S. Goldin-Meadow. 1986. The mismatch between
gesture and speech as an index of transitional knowledge. Cognition
23:43–71.
Curtiss, S. 1977. Genie: A psycholinguistic study of a modern-day “wild-child.”
New York: Academic Press.
Garber, P., M. W. Alibali, and S. Goldin-Meadow. 1998. Knowledge
conveyed in gesture is not tied to the hands. Child Development
69:75–84.
Goldin-Meadow, S. 1982. The resilience of recursion: A study of a
communication system developed without a conventional language
model. In Language acquisition: The state of the art, edited by E. Wanner
and L. R. Gleitman, 51–77. New York: Cambridge University Press.
——. 1985. Language development under atypical learning conditions:
Replication and implications of a study of deaf children of hearing
parents. In Children’s language, vol. 5, edited by K. Nelson, 197–245.
Hillsdale, NJ: Erlbaum.
——. 1987. Underlying redundancy and its reduction in a language
developed without a language model: The importance of conventional
linguistic input. In Studies in the acquisition of anaphora: Applying
the constraints, vol. 2, edited by B. Lust, 105–133. Boston: Reidel
Publishing Company.
——. 2003a. Hearing gesture: How our hands help us think. Cambridge,
MA: Harvard University Press.
——. 2003b. The resilience of language: What gesture creation in deaf children
can tell us about language-learning in general. New York: Psychology
Press.
——. 2005. What language creation in the manual modality tells us
about the foundations of language. Linguistic Review 22:199–225.
Goldin-Meadow, S., C. Butcher, C. Mylander, and M. Dodge. 1994.
Nouns and verbs in a self-styled gesture system: What’s in a name?
Cognitive Psychology 27:259–319.
Goldin-Meadow, S., and D. McNeill. 1999. The role of gesture and
mimetic representation in making language the province of speech.
In The descent of mind, edited by M. C. Corballis and S. Lea, 155–172.
Oxford: Oxford University Press.
Goldin-Meadow, S., and C. Mylander. 1983. Gestural communication
in deaf children: The non-effects of parental input on language
development. Science 221(4608):372–374.
——. 1984. Gestural communication in deaf children: The effects and noneffects
of parental input on early language development. Monographs
of the Society for Research in Child Development 49:1–121.
——. 1998. Spontaneous sign systems created by deaf children in two
cultures. Nature 391:279–281.
Goldin-Meadow, S., C. Mylander, and C. Butcher. 1995. The resilience of
combinatorial structure at the word level: Morphology in self-styled
gesture systems. Cognition 56:195–262.
Goldin-Meadow, S., and M. A. Singer. 2003. From children’s hands to
adults’ ears: Gesture’s role in teaching and learning. Developmental
Psychology 39:509–520.
Goldin-Meadow, S., and S. M. Wagner. 2005. How our hands help us
learn. Trends in Cognitive Sciences 9:234–241.
Greenfield, P. M., and E. S. Savage-Rumbaugh. 1991. Imitation,
grammatical development, and the invention of protogrammar by an
ape. In Biological and behavioral determinants of language development,
edited by N. A. Krasnegor, D. M. Rumbaugh, R. L. Schiefelbusch, and
M. Studdert-Kennedy, 235–258. Hillsdale, NJ: Erlbaum.
Hockett, C. F. 1960. The origin of speech. Scientific American 203(3):88–96.
This chapter has three substantive parts. The first part describes how
the distributed cognition perspective directs our attention to particular
classes of interactions. The second part uses the examination of an
example of real-world human interaction to construct a description of
the nature of interaction. This examination shows real-world interaction
to be deeply multimodal and composed of a complex network of
relationships among resources. It also shows that some cognitive
processes are properties of the system of interaction, distinct from the
cognitive properties of the individuals who participate in the system.
The last part explores the implications of this “naturalized” notion
of human interaction for our understanding of both the nature of
contemporary cognition and for the kinds of processes that might have
given rise to contemporary cognition. What evolves is not the brain
alone, but the system of brains, bodies, and shared environments for
action in interaction. Cultural practices are as much a part of the story
of cognitive evolution as are changes in brain structure. This means
that important milestones in cognitive evolution could, in principle,
have been achieved without any particular genetic adaptation being
associated with them.
Real-world Interaction
All of the sorts of interaction and distribution described in the previous
section take place simultaneously in real-world activity. This section
presents an instance of rich, culturally grounded, real-world interaction.
In Cognition in the Wild (Hutchins 1995a), I used an extended study
of ship navigation to show how the cognitive science of real-world
activity could be accomplished. That book emphasized the distribution
of cognitive processes between persons and technology, among people,
and across time in the development of the social and material context
for thinking. My research group recently undertook a reanalysis of some
of the video data from the ship navigation study.
over the chart and reaching toward the plotting area with his left hand,
saying, “u:h,” but the plotter rebuffs him by making another gestured
LOP from the vicinity of the depiction of Light Victor to the EP (half-
circle) and saying, “that’s good.” Because Light Victor is located to the
east of the EP, this gesture both indicates a third LOP and effectively
blocks the entry of the bearing recorder’s hand to the plotting area. The
bearing recorder pulls his left hand back, rests it on the chart table in
front of him and says, “Okay.”
As the navigators work, they use their fingers to trace lines from
various landmarks to the vicinity of the estimated position. The gestures
enact imaginary or provisional LOPs. These ephemeral structures are
the representations on which the choice process operates. The criteria
for evaluation are the angles of intersection among the prospective
LOPs. The creation and evaluation of the proposed LOPs is carried
out in a conversation between the two workers. The conversational
turns are multimodal in that they include environmentally coupled
gesture, cogesture speech, body orientation, facial expression, and tool
manipulation.
Understanding Interaction
Human minds did not evolve in isolation, each wrapped tightly in a
thick skull and thereby insulated from the complexities of the body and
the world. We know that the brain takes advantage of minute details
of the body and the body’s interaction with the physical environment
(Clark 2001; Quartz and Sejnowski 2002). Similarly, mind will have
evolved, not in isolation from the material and social world, but in
ways that weave its activity inextricably into the details of those worlds
(Tomasello 2001).
If distributed cognition presents us with a world in which everything
is seemingly connected to everything else, does not studying cognition
become impossible? I think it certainly becomes more difficult.
Understanding complex real-world interactions is more difficult than
understanding systems of simple linear relations. However, in some ways,
more complex problems can be easier to solve than what seem to be
simpler problems. If the nature of the problem is to constrain behavior,
a system of multiple interacting subsystems can provide a solution more
easily than trying to get all of the constraints out of a single subsystem.
For example, it is easier to account for the organization of the visual
system if one recognizes that it develops in concert with the auditory
system than it is to account for the organization of either system in
isolation (de Sa and Ballard 1998). Such findings are part of a wider
shift in the cognitive sciences toward an increasing appreciation for
rich interactions among systems at all levels of organization. People
in normal interaction are in the business of creating and interpreting
rich multimodal meaning complexes. Here again, sometimes solving
what looks like a more complex problem is easier than solving what
looks like a simpler problem. It is easier to work out the significance of
complex multiply constrained acts of meaning than it is to determine
the meanings of the individual components as isolated systems. It is
easier to establish a meaning for words embedded with gestures that
are performed in coordination with a meaningful shared world than it
is to establish meanings for words as isolated symbols.
Thus, when we approach the more complex objects of scientific
scrutiny demanded by distributed cognition theory, it is not the case that
explanations will necessarily be more difficult to create. They may be
somewhat more complex than easy linear and modular stories, but in
some cases, the explanations come naturally as side effects or
by-products of general principles. One example is the development of a
shared lexicon mentioned above.
Conclusion
By softening the traditional disciplinary boundaries, the distributed
cognition perspective focuses on a new unit of analysis that encloses
a complex set of interactions among brain, body, and culturally
constructed world. Careful attention to the microstructure of interaction
from the distributed cognition perspective leads to a reconceptualization
of the individual–environment relationship and suggests that this newly
conceived relation has important implications for the way we confront
many sorts of cognitive and anthropological problems. In particular,
it provides a new place to look for mechanisms that shape both the
ontogenetic and the phylogenetic development of sociality.
Acknowledgment
This work was funded by a grant from the Santa Fe Institute’s program
on robustness in natural and social systems, which is supported by the
McDonnell Foundation. Alisa Durán transcribed the data and suggested
many elements of the analysis presented here.
References
Alač, M., and E. Hutchins. 2004. I see what you are saying: Action as
cognition in fMRI brain mapping practice. Journal of Cognition and
Culture 4(3–4):629–661.
Bartlett, F. 1995[1932]. Remembering: A study in experimental and social
psychology, 2nd edition. Cambridge: Cambridge University Press.
Bateson, G. 1972. Steps to an ecology of mind. New York: Ballantine
Books.
Braitenberg, V. 1984. Vehicles: Experiments in synthetic psychology.
Cambridge, MA: MIT Press.
Bruner, J., R. Olver, P. Greenfield, et al. 1966. Studies in cognitive growth:
A collaboration at the center for cognitive studies. New York: John Wiley
and Sons.
Clark, A. 2001. Mindware: An introduction to the philosophy of cognitive
science. Oxford: Oxford University Press.
Cole, M., and M. Griffin. 1980. Cultural Amplifiers Reconsidered. In
The social foundations of language and thought. Essays in honor of Jerome
Bruner, edited by D. Olson, 343–364. New York: Norton.
de Sa, V., and D. H. Ballard. 1998. Category learning through
multimodality sensing. Neural Computation 10(5):1097–1117.
Douglas, M. 1986. How institutions think. Syracuse, NY: Syracuse
University Press.
Enfield, N. J. 2005. The body as cognitive artifact in kinship
representations: Hand gesture diagrams by speakers of Lao. Current
Anthropology 46(1):51–73.
Goodwin, C. 1994. Professional vision. American Anthropologist 96(3):
606–633.
Halbwachs, M. 1925. Les cadres sociaux de la mémoire. Paris: Albin
Michel.
Hazlehurst, B., and E. Hutchins. 1998. The emergence of propositions
from the coordination of talk and action in a shared world. Language
and Cognitive Processes 13(3):373–424.
Henty, S. 1999. A computational simulation of jury decision making.
Honors thesis, Department of Cognitive Science, University of
California, San Diego.
Hollan, J. D., E. Hutchins, and D. Kirsh. 2001. Distributed cognition:
A new foundation for human-computer interaction research. ACM
Transactions on Computer-Human Interaction: Special Issue on Human-
Computer Interaction in the New Millennium 7(2):174–196.
Hutchins, E. 1995a. Cognition in the Wild. Cambridge, MA: MIT Press.
——. 1995b. How a cockpit remembers its speeds. Cognitive Science
19:265–288.
——. 1999. Cognitive artifacts. In The MIT Encyclopedia of the Cognitive
Sciences, edited by R. A. Wilson and F. C. Keil, 126–128. Cambridge,
MA: MIT Press.
——. 2000. The cognitive consequences of patterns of information flow.
Intellectica 1(30):53–74.
Hutchins, E., and B. Hazlehurst. 1991. Learning in the cultural process.
In Artificial life 2, Santa Fe Institute studies in the sciences of complexity
series, edited by C. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen,
689–706. Santa Fe: Santa Fe Institute.
——. 1995. How to invent a lexicon: The emergence of shared form-
meaning mappings in interaction. In Social intelligence and interaction,
edited by E. Goody, 53–67. Cambridge: Cambridge University
Press.
——. 2002. Auto-organization and emergence of shared language
structure. In Simulating the evolution of language, edited by A. Cangelosi
and D. Parisi, 279–306. London: Springer Verlag.
Hutchins, E., and T. Klausen. 1996. Distributed cognition in an airline
cockpit. In Cognition and communication at work, edited by Y. Engeström
and D. Middleton, 15–34. New York: Cambridge University Press.
Hutchins, E., and L. Palen. 1997. Constructing meaning from space,
gesture, and speech. In Discourse, tools, and reasoning: Essays on situated
cognition, edited by L. B. Resnick, R. Saljo, C. Pontecorvo, and B.
Burge, 23–40. Heidelberg: Springer Verlag.
Luria, A. R. 1966. Higher cortical functions in man. New York: Basic
Books.
McNeill, D. 1992. Hand and mind: What gestures reveal about thought.
Chicago: University of Chicago Press.
Middleton, D., and D. Edwards (eds.). 1990. Collective remembering.
London: Sage.
Newell, A., and H. Simon. 1972. Human problem solving. Englewood
Cliffs, NJ: Prentice-Hall.
Norman, D. 1994. Things that make us smart: Defending human attributes
in the age of the machine. Boston, MA: Addison-Wesley.
Oyama, S. 2000. Evolution’s eye: A systems view of the biology-culture divide.
Durham, NC: Duke University Press.
Quartz, S., and T. Sejnowski. 2002. Liars, lovers, and heroes: What the
new brain science reveals about how we become who we are. New York:
Morrow.
Roberts, J. M. 1964. The self-management of cultures. In Explorations
in cultural anthropology, edited by W. H. Goodenough, 433–454. New
York: McGraw-Hill.
Rumelhart, D., P. Smolensky, J. McClelland, and G. Hinton. 1986.
Schemata and sequential processes in PDP models. In Parallel
distributed processing, vol. 2, edited by J. McClelland and D. Rumelhart,
7–57. Cambridge, MA: MIT Press.
Surowiecki, J. 2004. The wisdom of crowds. New York: Doubleday.
Tomasello, M. 2001. The cultural origins of human cognition. Cambridge,
MA: Harvard University Press.
Turner, J. 2000. The extended organism. Cambridge, MA: Harvard
University Press.
Vygotsky, L. S. 1986. Thought and language. Cambridge, MA: MIT
Press.
operating panel being the target of joint attentional hand gestures (Kita
2003; Liszkowski this volume). But common ground is also there when
it is not being signaled or otherwise manifest directly. At a personal
level, the shared experiences of interactants are in common ground as
long as the interactants know (and remember!) they were shared. At
a cultural level, common ground may be indexed by signs of ethnic
identity, and the common cultural background such signs may entail.
One such marker is native dialect (as signaled, e.g., by accent), a readily
detectable and reliable indicator of long years of common social and
cultural experience (Nettle and Dunbar 1997; Nettle 1999). Suppose I
begin a conversational exchange with a stranger of similar age to myself,
who, like me, is a native speaker of Australian English. We will each
immediately recognize this common native origin from each other’s
speech, and then I can be pretty sure that my new interlocutor and I
will share vast cultural common ground from at least the core years
of our linguistic and cultural socialization (i.e., our childhoods, when
our dialects were acquired). We will mutually assume, for instance,
recognition of expressions like fair dinkum, names like Barry Crocker,
and possibly even sporting institutions like the Dapto Dogs.
A matter of some contention in the discussions documented in this
volume is the degree of involvement of higher-order cognition in social
interactional processes. Despite currency of the term “mind reading”
and its variants in literature on social intelligence (Baron-Cohen 1995;
Carruthers and Smith 1996; inter alia; cf. Astington this volume), we
cannot read each other’s minds. Miller wrote, “One of the psychologist’s
great methodological difficulties is how he can make the events he
wishes to study publicly observable, countable, measurable” (1951:3).
This problem for the psychologist is a problem for the layperson too.
In interaction, normal people need, at some level, to be able to model
each other’s (evolving and contingent) goals, based solely on perceptible
information, by attending to one another’s communicative actions and
displays (Mead 1934). A no-telepathy assumption means that there is
“no influencing other minds without mediating artifactual structure”
(Hutchins and Hazlehurst 1995). As a result, semiosis—the interplay of
perception and cognition, rooted in ethology and blossoming in the
modern human mind—is a cornerstone of human sociality (Kockelman
2005; Peirce 1965). Humans augment the ethologically broad base of
iconic and indexical meaning with symbolic structures and higher-order
processes of intention attribution.
So if action and perception are the glue in human interaction, higher-
order cognition is the catalyst. I see this stance as a complement, not
an alternative, to radically interactionist views of cognition (cf. Molder
and Potter 2005). Authors like Norman (1991), Hutchins (1995), and
Goodwin (1994, 1996) are right to insist that the natural exercising
of cognition is in distributed interaction with external artifacts. And
we must add to these artifacts our bodies (Enfield 2005; Goodwin
2000; Hutchins and Palen 1993) and our social associates (Goodwin
this volume). Similarly, the temporal-logical structures of our social
interactions are necessarily collaborative in their achievement (Clark
1996; Schegloff 1982), as may be our very thought processes (Goody 1995,
inter alia; Mead 1934; Rogoff 1994; Vygotsky 1962). But as individuals,
we each physically embody and transport with us the wherewithal to
move from scene to scene and still make the right contributions. We
store cognitive representations (whether propositional or embodied)
of the conventional signs and structures of language, of the cultural
stock of conventional typifications that allow us to recognize what
is happening in our social world (Schutz 1970), and of more specific
knowledge associated with our personal contacts. And we have the
cognitive capacity to model other participants’ states of mind as given
interactions unfold (Mead 1934).
Accordingly, here is my rephrasing of Miller (with a debt to Schutz
1970 and Sacks 1992): One of the man in the street’s great methodological
difficulties is how he can understand (and make himself understood to) his
social associates solely on the basis of what is publicly observable. Any model
of multiparty interaction will have to show how the combination of a
physical environment and a set of mobile agents will result in emergence
of the structures of interactional organization that we observe. It will
also have to include descriptions of the individual agents, their internal
structure and local goals. General capacities of social intelligence,
and specific values of common ground will have to be represented
somewhere in those individual minds. Then, in real contexts, what is
emergent can emerge.
So, human social interaction not only involves cognition, it involves
high-grade social intelligence (Goody 1995; allowing that it need not
always involve it—Barr and Keysar 2004). And in line with a number of
other contributors to this volume who resist the overuse or even abuse
of mentalistic talk in the analysis of social interaction, it is clear that
intention attribution is entirely dependent on perception in a shared
environment (see esp. Byrne this volume, for his “heretical thought”;
Danziger this volume; Goodwin 2000, this volume; Hutchins 1995:
ch. 9, this volume; Schegloff 1982:73). Both components—individual
cognition and emergent organization—are absolutely necessary (see
the introduction to this volume). Human social interaction would not
exist as we know it without the cocktail of individual, higher-order
cognition and situated, emergent, distributed organization. A mentalist
stance need therefore not be at the expense of the critically important
emergence of organization from collaborative action in shared physical
context, above and beyond any individual’s internally coded goals. To
be sure, there remain major questions as to the relative contribution
of individual cognition and situated collaborative action in causing
the observed organization of interaction. But however you look at it,
we need both.
Audience Design
Equipped with higher-order inferential cognition, an interlocutor (plus
all the other aspects of one’s interactional context), and a stock of
common ground, a speaker should design his or her utterances for
that interlocutor (Clark 1996; Sacks 1992; Sacks and Schegloff 1979;
Schegloff 1997). If we are to optimize the possibility of having our
communicative intentions correctly recognized, any attempt to make the
right inferences obvious to a hearer will have to take into account the
common ground defined by the current speaker–hearer combination. In
ordinary conversation, there is no generic, addressee-general, mode of
message formulation. To get our communicative intentions recognized,
we ought to do what we can to make them the most salient solutions
to the interpretive problems we foist on our hearers. The right ways to
achieve this will be determined in large part by what is in the common
ground, and this is by definition a function of who is being addressed
given who it is they are being addressed by. Because Gricean implicature
is fundamentally audience driven (whereby formulation of an utterance
is tailored by how one expects an addressee will receive it), to do audience
design is to operate at a yet higher level than mere intention attribution.
It entails advance modeling of another’s intention attribution.5
Consider an example that turns on highly local common ground. Fig.
15.5 shows two men sitting inside a Lao village house, waiting while
lunch is prepared in an outside kitchen.
At the moment shown in Fig. 15.5, a woman’s voice can be heard
(coming from the outside kitchen verandah, behind the camera, left
of screen) as follows:
(3) mòòt4 nam4 haj5 nèè1
extinguish water benefactive please
“Please turn off the water for (me).”
that the utterance in (3) could not be intended for someone who lacks
the common ground, that is, who does not know what “turning off
the water” involves. The switch that controls an outside water pump
is situated at the only power outlet in the house, inside, far from the
kitchen verandah. To respond appropriately to the utterance in (3), an
addressee would need this inside knowledge of what “turning off the
water” entails. Without it, one might not even realize that the addressee
of (3) is someone (anyone) inside the house. But it is in the common
ground for the people involved in this exchange. They are neighbors of
this household, daily visitors to the house. The woman outside on the
verandah knows that the people inside the house know (and know that
they are known to know!) the routine of flicking that inside switch to
turn the outside water pump on and off. This enables the success of the
very lean communicative exchange consisting of the spoken utterance
in (3) and the response in Fig. 15.6.
Much is inferred by the actor in Fig. 15.6 beyond what is encoded in
the spoken message in (3), in the amplicative sense outlined above. In
addition, this example illustrates a defining feature of common ground
information, namely that people cannot deny possessing it.6 The man
on our left in Fig. 15.5—who is situated nearest the switch—might not
feel like getting up, but he could not use as an excuse for inaction a
claim that he does not know what the speaker in (3) wants (despite the
fact that nothing in her utterance makes this explicit).
Figure 15.6. Man gets up to turn off switch of electric water pump.
The principle of audience design dovetails with common ground,
because both are defined by a particular social relationship between
particular interlocutors. As prefigured above, the general imperative
of audience design is served by two, more specific imperatives of
conversation. I described one of these—the informational imperative—as
the cooperative struggle to maintain common referential understanding,
mutually calibrated at each step of an interaction’s trajectory (Clark
1996; Schegloff 1992). This will be satisfied by various means including
choice of language spoken, choice of words, grammatical constructions,
gestures, and the various devices for meeting “system requirements”
for online alignment in interaction (mechanisms for turn organization,
signals of ongoing recipiency, correction of errors and other problems,
etc.; Goffman 1981:14; Schegloff this volume). Less well understood
are the “ritual” requirements of remedial face work, and the need to
deal with “implications regarding the character of the actor and his
evaluation of his listeners, as well as reflecting on the relationships
between him and them” (Goffman 1981:21; cf. Goffman 1967, 1971).
We turn now to those.
now scattered and are playing in the grounds of his compound. Saj
(right), a neighbor of Kou, has just arrived on the scene.
Saj asks Kou how many people were in the group that has just arrived
with Kou’s vehicle, following this up immediately by offering a candidate
set of people: “Duang’s lot” (line 1). The named referent—Duang—is
Kou’s third daughter.8 Kou responds with a list of those who have arrived
with him, beginning by listing four of his own daughters by name (lines
2–3), then mentioning two further children (line 4):
It is in the common ground that Kou’s own four children are known
to both Kou and Saj by their first names. Kou is therefore able to use
the four children’s personal names in lines 2–3 to achieve recognition.
In line 4, Kou continues his list, with two further children who have
arrived with him. These two are not his own, are not from this village,
and are presumed not to be known by name to Saj. They are children of
Kou’s brother and sister, respectively, who both live in Kou’s mother’s
village Paksan, some distance away. Kou refers to them as “kids from
Paksan.” The reason he does not refer to these two children by name
is that he figures his addressee will not recognize them by name—their
names, as ways of uniquely referring to them, are not in the common
ground. But although Saj certainly will not recognize the children by
name, he will recognize their village of origin by name (and further,
will recognize that village to be Kou’s village of origin, and the home
of Kou’s siblings). So Kou’s solution to the problem of formulating
reference to these two children—in line 4—is to tie them to one sure
piece of common ground: the name of the village where a host of Kou’s
relatives are (openly, mutually) known to live.
However, it appears that Kou’s solution in line 4 is taken—by Saj—to
suppose too little common ground. Although Saj would not know the
names of these Paksan children, he does know the names of some of Kou’s
siblings from Paksan. This is common knowledge, which could form
the basis of a finer characterization of these children’s identities than
that offered in line 4. What immediately follows Kou’s vague reference
to the two children by place of origin in line 4 is Saj’s candidate offer of
a more specific reference to the children. Saj’s candidate reformulation
(line 5 in [8], below) links the children explicitly to one of Kou’s siblings,
referring to him by name. This guess, which turns out to be not entirely
correct, succeeds in eliciting from Kou a finer characterization of the
children’s identities (line 6). This new characterization presupposes
greater common ground than Kou’s first attempt did in line 4, yet it
remains a step away in implied social proximity from that implied
by Kou’s first-name formulations to his own children in lines 2–3,
above:
(8) (Follows directly from (7).)
5 S luuk4 qajø+saaj3
child eB+S
“Children of Saaj?”
6 K luuk4 bak2+saaj3 phuu5 nùng1, luuk4 - qii1+vaat4sa=naa3 phuu5 nùng1
child m.non_resp+S person one child f.non_resp+V person one
“Child of Saaj, one, child of – Vatsana, one.”
"Hak phang khii (a type of medicinal root) is plentiful, at the area of-"
2 FM [qee5
yeah
"Yeah,
4 BM m5
mm
"Mm."
Conclusion
This chapter has proposed that the practices by which we manage
and exploit common ground in interaction demonstrate a personal
commitment to particular relationships and particular communities,
and a studied attention to the practical and strategic requirements
of human sociality. I have argued that the manipulation of common
ground serves both interactional efficacy and social affiliation. The logic
can be summarized as follows. Common ground—knowledge openly
shared by specified pairs, trios, and so forth—is by definition socially
relational, and relationship defining. In an informational dimension,
common ground guides the design of signals by particular speakers
for particular recipients, as well as the proper interpretation, by particular
recipients, of signals from particular speakers. Richer common ground
means greater communicative economy, because it enables greater
amplicative inferences on the basis of leaner coded signals. In a social-
affiliational dimension, the resulting streamlined, elliptical interaction
has a property that is recognized and exploited in the ground-level
management of social relations: these indices of common ground are
a means of publicly displaying, to interactants and onlookers alike,
that the requisite common ground is shared, and that the relationship
constituted by that degree or kind of common ground is in evidence.
In sum, common ground is as much a social-affiliational resource as
it is an informational one. In its home disciplines of linguistics and
psychology, the defining properties of common ground concern its
consequences in the realm of reference and discourse coherence. But
sharedness, or not, of information, is essentially social. Why else would
it be that if I were to get the promotion, I had better tell my wife as
soon as I see her (or better, call her and let her be the first to know),
whereas others can be told in due course (my snooker buddies), and
yet others need never know (my dentist)? The critical point, axiomatic
in research on talk in interaction yet alien to linguistics and cognitive
science, is that there is no time out from the social consequences of
communicative action.
Acknowledgments
I would like to acknowledge a special debt to Bill Hanks, Steve Levinson,
Paul Kockelman, Tanya Stivers, Herb Clark, Chuck Goodwin, John
Heritage, and Manny Schegloff, along with the rest of my colleagues
in the Multimodal Interaction Project (MPI Nijmegen)—Penny Brown,
Federico Rossano, JP de Ruiter, and Gunter Senft—for helping me develop
my thinking on the topics raised here. I received helpful commentary on
draft versions from Steve Levinson and Tanya Stivers, as well as Jack Sidnell
and two other anonymous reviewers. None are responsible for errors
and infelicities. I gratefully acknowledge the support of the Max Planck
Society. I also thank Michel Lorrillard for providing me with a place
to work at the Vientiane centre of l’École française d’Extrême-Orient,
where final revisions to this chapter were made. Finally, I thank the
entire cast of contributors to the Roots of Human Sociality symposium at
the village of Duck, on North Carolina’s Outer Banks, October 2004.
Notes
1. See also Schiffer (1972), Sperber and Wilson (1995), D’Andrade (1987:113),
Searle (1995:23–26), Schegloff (1996:459), Barr and Keysar (2004). Although
analysts agree that humans can construct and consult common ground in
interaction, there is considerable disagreement as to how pervasive it is (see
discussion in Barr and Keysar 2004).
2. By “hypothesis,” I do not mean that we need consciously or explicitly
entertain candidate accounts for questions like whether our colleagues will
wear clothes to work tomorrow, or whether the sun will come up, or whether
we will stop feeling thirsty after we have had a drink (saying “Aha, just as I
suspected” when verified). But we nevertheless have models of how things are,
which, most importantly, are always accessible, and become visible precisely
when things go against our expectations (Whorf 1956). For this to work,
we need some kind of stored representation, whether mental or otherwise
embodied, which accounts for our expectations.
3. Steve Levinson points out the relevance of the great spatial distance
between BW and the basket. Her reach has a long way to go when FW acts on
the inference derived from observing her action. It may be that BW’s stylized
reach was overtly communicative, designed to induce recognition of intention,
and the perlocutionary effect of causing FW to pass the basket (functioning,
effectively, as a request).
4. The phrasing appropriates Slobin’s thinking-for-speaking idea: that
“language directs us to attend—while speaking—to the dimensions of experience
that are enshrined in grammatical categories” (Slobin 1996:71).
5. There is some controversy as to the extent to which we do audience
design and assume its having been done. By a frugal cognition view, audience
design is heavily minimized, but all analytical positions acknowledge that
high-powered inference must at the very least be available when required
(Barr and Keysar 2004; cf. Goodwin, Hutchins, and Danziger in this volume).
6. This is the corollary of the impossibility of pretending to possess common
ground when you do not: witness the implausibility of fictional stories in
which characters assume other characters’ identities and impersonate them,
living their lives without their closest friends and kin detecting that they are
imposters (e.g., the reciprocal face transplant performed on arch enemies
Castor Troy and Sean Archer in Face/Off, Paramount Pictures, 1997).
7. More work is needed to understand how the use of profanities works
to display and constitute “close” social relations. Presumably, the mechanism
is that “we can’t talk like that with everybody.” So, it is not a question of the
propositional content of the information being exchanged, but its register,
its format. Compare this with more sophisticated ways of displaying social
affiliation in the animal world, such as the synchronized swimming and
diving that closely affiliated porpoises employ as a display of alliance (Connor
et al. 2000:104). It is not just that these individuals are swimming together,
but, in addition, how they are doing it.
8. Like the others in this list of names, Duang is socially “lower” than
both the participants, and accordingly, her name is prefixed with the female
nonrespect prefix qii1-; cf. Enfield (in press).
9. I gratefully acknowledge the contribution of Manny Schegloff and
Tanya Stivers to my understanding of this example.
10. The Lao word vang refers to a river pool, a section of river in which the
water is deep and not perceptibly flowing, usually with thick forest towering
over it, producing a slightly spooky atmosphere, of the kind associated with
spirit owners (i.e., ghosts or spirits that “own” a place, and must be appeased
when traveling through). The same place is also called Faaj Vang Phêêng (faaj
means “weir”; the deep still water of Vang Phêêng is a weir reservoir).
11. Vertically aligned square brackets indicate overlap in speech.
12. This is comparable with the use of him in the opening words of Paul
Bremer’s announcement at a Baghdad news conference in December 2003 of
the highly anticipated capture of Saddam Hussein: “Ladies and gentlemen, we
got him.”
References
Baron-Cohen, S. 1995. Mindblindness: An essay on autism and Theory of
Mind. Cambridge, MA: MIT Press.
Barr, D. J., and B. Keysar. 2004. Making sense of how we make sense:
The paradox of egocentrism in language use. In Figurative language
processing: Social and cultural influences, edited by H. Colston and A.
Katz, 21–41. Mahwah, NJ: Erlbaum.
Buys, C. J., and K. L. Larson. 1979. Human sympathy groups. Psychological
Reports 45:547–553.
Carruthers, P., and P. K. Smith (eds.). 1996. Theories of Theories of Mind.
Cambridge: Cambridge University Press.
Clark, H. H. 1996. Using language. Cambridge: Cambridge University
Press.
Cohen, D. 1999. Adding insult to injury: Practices of empathy in an infertility
support group. Ph.D. dissertation, School of Communication, Rutgers
University.
Connor, R. C., R. S. Wells, J. Mann, and A. J. Read. 2000. The Bottlenose
Dolphin: Social relationships in a fission-fusion society. In Cetacean
societies: Field studies of dolphins and whales, edited by J. Mann, R. C.
Connor, P. L. Tyack, and H. Whitehead, 91–126. Chicago: Chicago
University Press.
D’Andrade, R. D. 1987. A Folk Model of the Mind. In Cultural models in
language and thought, edited by D. Holland and N. Quinn, 112–148.
Cambridge: Cambridge University Press.
Drew, P., and K. Chilton. 2000. Calling just to keep in touch: Regular and
habitualised telephone calls as an environment for small talk. In Small
Talk, edited by J. Coupland, 137–162. Harlow: Pearson Education.
Dunbar, R. I. M. 1993. Coevolution of neocortical size, group size, and
language in humans. Behavioral and Brain Sciences 16:681–735.
——. 1996. Grooming, gossip and the evolution of language. London: Faber
and Faber.
——. 1998. The social brain hypothesis. Evolutionary Anthropology
6:178–190.
Dunbar, R. I. M., and M. Spoors. 1995. Social networks, support cliques,
and kinship. Human Nature 6:273–290.
Enfield, N. J. 2002. Cultural logic and syntactic productivity: Associated
posture constructions in Lao. In Ethnosyntax: Explorations in culture
and grammar, edited by N. J. Enfield, 231–258. Oxford: Oxford
University Press.
——. 2003. The definition of what-d’you-call-it: Semantics and pragmatics
of recognitional deixis. Journal of Pragmatics 35:101–117.
——. 2005. The body as a cognitive artifact in kinship representations.
Hand gesture diagrams by speakers of Lao. Current Anthropology
46(1):51–81.
——. in press. Meanings of the unmarked: Why default references do
more than just refer. In Person reference in interaction, edited by N. J.
Enfield and T. Stivers. Cambridge: Cambridge University Press.
——. n.d. Evidential interrogative particles in Lao. Language and Cognition
Group, MPI Nijmegen, March 2006. [Typescript]
Enfield, N. J., and T. Stivers (eds.). in press. Person reference in interaction.
Cambridge: Cambridge University Press.
Fox, B. A. 1987. Discourse structure and anaphora: Written and conversational
English. Cambridge: Cambridge University Press.
Garfinkel, H., and H. Sacks. 1970. On formal structures of practical
actions. In Theoretical sociology: Perspectives and developments, edited by
J. C. McKinney and E. A. Tiryakian, 337–366. New York: Meredith.
Gigerenzer, G., P. M. Todd, and The ABC Research Group. 1999. Simple
heuristics that make us smart. Oxford: Oxford University Press.
Goffman, E. 1967. Interaction ritual. New York: Anchor Books.
——. 1971. Relations in public. New York: Harper & Row.
——. 1981. Forms of talk. Philadelphia: University of Pennsylvania
Press.
Goodwin, C. 1994. Professional vision. American Anthropologist 96(3):
606–633.
——. 1996. Transparent vision. In Interaction and grammar, edited by
E. Ochs, E. A. Schegloff, and S. A. Thompson, 370–404. Cambridge:
Cambridge University Press.
——. 2000. Action and embodiment within situated human interaction.
Journal of Pragmatics 32:1489–1522.
Goody, E. N. (ed.). 1995. Social intelligence and interaction: Expressions
and implications of the social bias in human intelligence. Cambridge:
Cambridge University Press.
Grice, H. P. 1975. Logic and conversation. In Speech acts, edited by P.
Cole and J. L. Morgan, 41–58. New York: Academic Press.
——. 1989. Studies in the way of words. Cambridge, MA: Harvard
University Press.
Hanks, W. F. 1989. The indexical ground of deictic reference.
Chicago Linguistics Society 25(2):104–122.
Heritage, J., and J. M. Atkinson. 1984. Introduction. In Structures of social
action: Studies in conversation analysis, edited by J. M. Atkinson and J.
Heritage, 1–15. Cambridge: Cambridge University Press.
Hill, R. A., and R. I. M. Dunbar. 2003. Social network size in humans.
Human Nature 14:53–72.
Hutchins, E. 1995. Cognition in the wild. Cambridge, MA: MIT Press.
Hutchins, E., and B. Hazlehurst. 1995. How to invent a shared lexicon:
The emergence of shared form-meaning mappings in interaction. In
Social intelligence and interaction: Expressions and implications of the
social bias in human intelligence, edited by E. Goody, 53–67. Cambridge:
Cambridge University Press.
Hutchins, E., and L. Palen. 1993. Constructing meaning from space,
gesture, and speech. In Discourse, tools, and reasoning: Essays on situated
cognition, edited by L. B. Resnick, R. Säljö, C. Pontecorvo, and B.
Burge, 23–40. Berlin: Springer.
Jefferson, G. 1974. Error correction as an interactional resource. Language
in Society 2:181–199.
——. 1978. Sequential aspects of storytelling in conversation. In Studies
in the organization of conversational interaction, edited by J. Schenkein,
219–248. New York: Academic Press.
Jefferson, G., and J. R. E. Lee. 1980. End of Grant Report to the British
SSRC on the analysis of conversations in which “troubles” and
“anxieties” are expressed. Ref. Hr 4802. Manchester: University of
Manchester.
Kita, S. (ed.). 2003. Pointing: Where language, cognition, and culture meet.
Mahwah, NJ: Erlbaum.
Kockelman, P. 2005. The semiotic stance. Semiotica 157:233–304.
Lerner, G. H. 1992. Assisted storytelling: Deploying shared knowledge
as a practical matter. Qualitative Sociology 15:24–77.
——. 1996. On the “semi-permeable” character of grammatical units
in conversation: Conditional entry into the turn space of another
speaker. In Interaction and grammar, edited by E. Ochs, E. A. Schegloff,
and S. A. Thompson, 238–276. Cambridge: Cambridge University
Press.
Levinson, S. C. 1995. Interactional biases in human thinking. In Social
intelligence and interaction: Expressions and implications of the social
bias in human intelligence, edited by E. Goody, 221–260. Cambridge:
Cambridge University Press.
——. 1997. From outer to inner space: Linguistic categories and
nonlinguistic thinking. In Language and conceptualization, edited by J.
Nuyts and E. Pederson, 13–45. Cambridge: Cambridge University
Press.
——. 2000. Presumptive meanings: The theory of generalized conversational
implicature. Cambridge, MA: MIT Press.
Lewis, D. K. 1969. Convention: A philosophical study. Cambridge, MA:
Harvard University Press.
Mandelbaum, J. 1987. Couples sharing stories. Communication Quarterly
35(2):144–170.
Maynard, D. W., and D. Zimmerman. 1984. Topical talk, ritual, and
the social organization of relationships. Social Psychology Quarterly
47:301–316.
Mead, G. H. 1934. Mind, self, and society from the standpoint of a social
behaviorist, edited by C. W. Morris. Chicago: University of Chicago
Press.
Miller, G. A. 1951. Language and communication. New York: McGraw-
Hill.
Molder, H. te, and J. Potter (eds.). 2005. Conversation and Cognition.
Cambridge: Cambridge University Press.
Moore, C., and P. Dunham (eds.). 1995. Joint attention: Its origins and
role in development. Hillsdale, NJ: Erlbaum.
Morrison, J. 1997. Enacting involvement: Some conversational practices for
being in relationship. Ph.D. dissertation, School of Communications,
Temple University.
Nettle, D. 1999. Language variation and the evolution of societies.
In The evolution of culture: An interdisciplinary view, edited by R. I.
M. Dunbar, C. Knight, and C. Power, 214–227. New Brunswick, NJ:
Rutgers University Press.
Nettle, D., and R. I. M. Dunbar. 1997. Social markers and the evolution
of reciprocal exchange. Current Anthropology 38(1):93–99.
Norman, D. A. 1991. Cognitive artifacts. In Designing interaction:
Psychology at the human-computer interface, edited by J. M. Carroll,
17–38. Cambridge: Cambridge University Press.
Peirce, C. S. 1965[1932]. Collected papers of Charles Sanders Peirce, vol.
2: Elements of Logic, edited by Charles Hartshorne and Paul Weiss.
Cambridge, MA: Belknap Press of Harvard University Press.
Pomerantz, A., and J. Mandelbaum. 2005. Conversation analytic
approaches to the relevance and uses of relationship categories in
interaction. In Handbook of language and social interaction, edited by
K. L. Fitch and R. E. Sanders, 149–171. Mahwah, NJ: Erlbaum.
Rogoff, B. 1994. Apprenticeship in thinking: Cognitive development in social
context. New York: Oxford University Press.
Sacks, H. 1974. An analysis of the course of a joke’s telling in conversation.
In Explorations in the ethnography of speaking, edited by R. Bauman and
J. Sherzer, 337–353. Cambridge: Cambridge University Press.
——. 1992. Lectures on conversation. London: Blackwell.
Sacks, H., and E. A. Schegloff. 1979. Two preferences in the organization
of reference to persons in conversation and their interaction. In
Everyday language: Studies in ethnomethodology, edited by G. Psathas,
15–21. New York: Irvington.
Schank, R. C., and R. P. Abelson. 1977. Scripts, plans, goals, and
understanding: An inquiry into human knowledge structures. Hillsdale, NJ:
Erlbaum.
Schegloff, E. A. 1972. Notes on a conversational practice: Formulating
place. In Studies in social interaction, edited by D. Sudnow, 75–119.
New York: The Free Press.
——. 1982. Discourse as an interactional achievement: Some uses of “Uh
Huh” and other things that come between sentences. In Georgetown
University Roundtable on Languages and Linguistics 1981; Analyzing
Discourse: Text and Talk, edited by D. Tannen, 71–93. Washington,
DC: Georgetown University Press.
——. 1992. Repair after next turn: The last structurally provided defense
of intersubjectivity in conversation. American Journal of Sociology
97(5):1295–1345.
——. 1996. Some practices for referring to persons in talk-in-interaction:
A partial sketch of a systematics. In Studies in anaphora, edited by B.
Fox, 437–485. Amsterdam: Benjamins.
——. 1997. Third turn repair. In Towards a social science of language, vol.
2: Social interaction and discourse structures, edited by G. R. Guy, C.
Feagin, D. Schiffrin, and J. Baugh, 31–40. Amsterdam: Benjamins.
——. in press a. Conveying who you are: The presentation of self, strictly
speaking. In Person reference in interaction, edited by N. J. Enfield and
T. Stivers. Cambridge: Cambridge University Press.
——. in press b. Tutorial on membership categorization devices. In
Special Issue of Journal of Pragmatics, edited by M. F. Nielsen and J.
Wagner.
Schelling, T. C. 1960. The strategy of conflict. Cambridge, MA: Harvard
University Press.
Schiffer, S. R. 1972. Meaning. Oxford: Clarendon Press.
Schutz, A. 1970. On phenomenology and social relations. Chicago:
University of Chicago Press.
Searle, J. R. 1995. The construction of social reality. New York: Free
Press.
Slobin, D. 1996. From “thought and language” to “thinking for speaking.”
In Rethinking linguistic relativity, edited by J. J. Gumperz and S. C.
Levinson, 70–96. Cambridge: Cambridge University Press.
Smith, N. V. (ed.). 1980. Mutual knowledge. London: Academic Press.
Sperber, D., and D. Wilson. 1995. Relevance: Communication and cognition,
2nd edition. Oxford: Blackwell.
Sugawara, K. 1984. Spatial proximity and bodily contact among the
Central Kalahari San. African Study Monograph (supp.) 3:1–43.
Tomasello, M. 1999. The cultural origins of human cognition. Cambridge,
MA: Harvard University Press.
Vygotsky, L. S. 1962[1934]. Thought and language. Cambridge, MA: MIT
Press.
Whorf, B. L. 1956. Language, thought, and reality. Cambridge, MA: MIT
Press.
Why a Deep Understanding of Cultural Evolution Is Incompatible with Shallow Psychology
Dan Sperber
instruments to the Internet. They vary in the extent to which they are
mutually supportive: some extend indefinitely almost on their own,
such as the CCCC that distributes the “God bless you!” response to a
sneeze. Others, such as those that distribute elements of an ideology
(e.g., the dogma of Trinity) or of a discipline (e.g., Cantor’s proof),
flourish only in the middle of related CCCCs.
Although the idea that thoughts or practices can be contagious is an
old and common one, SCCCs and CCCCs (with the possible exceptions
of rumors) are not objects recognized by common sense, nor are they
part of the social sciences’ toolkit. Notwithstanding, I have argued
(Sperber 1999) that they are what social life is made of, and that things
(objects, events, mental states, etc.) are social to the extent that they
owe their properties to their being embedded in SCCCs and that they
are cultural to the extent that they are shaped and stabilized by CCCCs.
I have suggested moreover that a properly naturalistic approach to
social and cultural phenomena centrally involves identifying the causal
factors and mechanisms that shape these causal chains and explaining
the macroregularities and changes of social and cultural life in terms
of these microprocesses. Even though this particular idiom is mine
(and, I hope to show, is useful), it is a variant of a more general type
of approach to society and culture that I call “epidemiological” and
that is found for instance, even if not under that name, in the work of
Cavalli-Sforza and Feldman (1981), Dawkins (1976) and memeticists
inspired by his ideas (Aunger 2002; Blackmore 1999), Durham (1991),
or Boyd and Richerson (1985).
All epidemiological approaches to cultures consider cultural
phenomena as a population of mental or artifactual items distributed in
a biological population (in particular a human population) and its
habitat, and seek to explain the evolving distribution of these cultural
items. Epidemiological approaches themselves are forms of “population
thinking” applied to cultural phenomena (and discussed under this
label in Richerson and Boyd 2005). Although all these epidemiological
approaches share some basic presuppositions that put them at odds with
more standard holistic and antinaturalistic approaches that are common
in the social sciences, they differ in the way they explain the distribution
and evolution of cultural items and in particular in the role they give to
fine-grained psychological factors in their explanations. I am arguing
that a cultural epidemiology that does not interface with psychology
makes as little sense as would a medical epidemiology that would not
interface with pathology. In particular, postulating that preservative
processes in the human mind are reliable enough to explain cultural
stability, rather than investigating whether they really are, is as shallow
as it would be to postulate without further investigation that all diseases
are infectious and carried by only one type of pathogenic agent.
Fidelity and Stability
How stable are cultural things? Less than is commonly assumed,
especially in the case of “traditional societies,” too often taken to change
very little over generations. Cultures are in constant flux, and this is
true at all levels, from microinteractions to societal institutions. Still,
nothing is cultural without a modicum of stability over social time
and space. What makes some item a token of a cultural type is that it
is similar enough to other tokens of the type to be identified as such.
Members of a cultural group do recognize different word tokens as being
the same word, narratives as being of the same tale, food on their plate
as being the same dish, individual haircuts as exemplifying the same
hairstyle, performances as being the same ritual, individual attitudes
as expressing the same values, and so on. All the tokens recognized as
being of the same type need not be identical, but their resemblance to
one another in relevant respects—even if mere “family resemblance” à
la Wittgenstein, even if exaggerated—must be sufficient for this quasi-
unanimous recognition to be possible. This relative resemblance of
tokens of a type across social space and time gives a measure of the
stability of cultural types and of the stabilizing effect of their CCCCs.
Some types, for example proverbs, are extremely stable; others, for
example dress fashions in modern societies, much less so (from which
we can infer that their respective CCCCs work differently).
How can CCCCs stabilize cultural contents and forms? The answer
may seem obvious: CCCCs are concatenations of preservative processes
of memory, imitation and communication, and, so the explanation goes,
these processes must have sufficient fidelity at the micro scale to bring
about the stability we observe at the macro scale. Cultural stability is
then seen as the proof of the reliability of human memory, imitation and
communication. At this point, students of cultural processes may feel
that the inferred high fidelity achieved by these preservative processes
is all that is relevant to them. Moreover, if all this is correct, cultural
items are indeed replicators (even if, unlike genes, they do not directly
generate their replica). Given that these replicators exhibit great variety,
and given that the waxing of some (linguistic devices, religious practices
and ideas, techniques, fashions, etc.) is at the expense of the waning
of others, the three conditions for Darwinian selection (heritability,
variability, and competition) are met. As suggested by Dawkins and
embraced by Dennett, Aunger, Blackmore (see Aunger 2000) and so many
others, culture can be described as a process of “memetic” evolution
comparable with genetic evolution, with, in both cases, selection as the
main driving force. The study of the precise mechanisms that make
such fidelity possible can be left to other scholars, now or whenever they
are up to it, just as population geneticists may leave the details of
chromosome replication to molecular biologists.
This attitude is well illustrated in the recent review by Mesoudi et al.
(2004) of arguments in favor of a selectionist approach to culture. They
suggest that “our current understanding of culture is comparable to that
attained by biology in 1859” and that, just as Darwin’s own ignorance
of the mechanisms of biological inheritance did not stop him from
successfully developing and applying his theory, what they take to be
our comparable ignorance of the mechanisms of cultural inheritance
should not inhibit us from applying evolutionary models. They express
the hope that some future “cultural ‘Watson and Crick’ ” (Mesoudi et
al. 2004:9) will discover the cultural counterpart of DNA. I believe we
know enough to know that there is no cultural DNA to be found.
Cultural transmission is achieved not by a single mechanism of
replication but by a variety of mental and social mechanisms. These
mechanisms are intensely studied and in good part understood. Consider
the work done on “imitation” in the past 15 years (Hurley and Chater
2005; Tomasello and Carpenter 2005). Among the several processes that
result in the re-production of a behavior,1 we now must distinguish (at
least) stimulus enhancement, emulation, and, within imitation proper,
imitation of behavior and imitation of goal (if the latter can properly
be described as “imitation” at all). Work on verbal communication in
linguistics, pragmatics, psycholinguistics, and sociolinguistics reveals a
great variety of submechanisms interacting in complex ways; nonverbal
communication involves yet other mental and interactional mechanisms
such as joint attention (see Astington, Clark, Enfield, Gaskins, Gergely
and Csibra, Goldin-Meadow, Goodwin, Keating, Levinson, Liszkowski,
Pyers, Schegloff, and Tomasello in this volume; see also Sperber and
Wilson 1995).
If, instead of postulating that they must be faithful enough to
explain cultural macro stability, one looks closely at the micro processes
involved, what is immediately striking (and abundantly confirmed
by experimental work) is that outputs of memory, imitation and
communication are quite generally transformations of the inputs, so
much so that the rare cases where the output is identical to the input
are best seen as limiting cases of “zero transformation.” Much of this
transformation is in the direction of entropy: information is lost in
the process of transmission. Part of it is biased
so as to better fit the current mental or motor schemas and goals of
the user. This is hardly surprising. It is not just that imperfection is
to be expected. It is, more importantly, that the finality of individual
memory, imitation and communication processes is not to preserve
information per se (and even less to preserve it so as to secure cultural
stability). Rather, a relative degree of preservation of information is a
means toward a variety of ends.
When Jill tries to remember what happened at the last council
meeting, it is to better prepare for the coming one, and, for this, all she
needs is those parts of the gist that bear on issues likely to come up
again. When you read this chapter, you do so not to store in your mind
a copy of its contents but to extract from it what may be of relevance to
you. When Peter tries to copy the way in which he saw Henrietta prepare
a soufflé, he does so not to duplicate her movements but to produce
a soufflé to his liking. Only when the goal of preservation is best served
by strict replication, as when forging a banknote or dancing in a chorus
line, is an effort made to avoid any departure from the model.
To generalize: in preservative processes, information is transformed
in two directions—entropy and relevance. Part of this transformation
results from the imperfection of these processes, part of it results
from their finality. Incidentally, it would be a mistake to assume
that transformation toward entropy is always and entirely an aspect
of the imperfection of preservative processes: eliminating irrelevant
information is a contribution to overall relevance.2
Most cognitive processes are constructive. They do not just reencode
input information; rather, they construct new mental representations
by drawing jointly on new inputs and on memorized information
and by typically going beyond a mere addition of the two. Even
preservation of information is to a large extent achieved by processes
that reconstruct rather than merely replicate the information to be
preserved. Reconstruction is often more efficacious than replication
because it can better handle fragmentary or degraded inputs. It is
also more parsimonious because it makes it unnecessary to register
information in full to make it available when needed.
Preservative and constructive processes, far from being mutually
exclusive, typically overlap. Preservation, and in particular the
reproduction of cultural information, can be more or less replicative and
more or less reconstructive. Why does it matter? Because replication and
reconstruction provide different explanations of stability in chains of
re-production, and therefore in culture. With replicative processes, an
error of replication at some juncture is preserved in further replications:
it becomes the model until the next copying error. If such copying
errors are very frequent (as they are in human memory, imitation and
communication, although describing them as “copying errors” is
misleading, because these are not copying processes), this compromises
both heritability and stability.
However, a reconstructive process of transmission can combine
transformations at every micro step with macro stability. Why? Because
reconstruction, unlike replication, just as it can easily depart from the
model, can also easily return to the model even if it had been modified
in earlier re-productions. This occurs for instance when in so-called
“imitation of goal,” the imitator produces an action that succeeds in
achieving a goal that the model missed. By not copying the model’s
actual actions, the so-called imitator of a goal may reconstruct a cultural
skill and become better at it than the model. More generally, constructive
processes in members of the same population may draw on the same
inputs and converge on the same outcome, that is, they may result in
the re-production of some cultural representation or practice whether
or not they were intended to achieve such re-production.
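The contrast between the two kinds of chain can be made concrete with a toy simulation. The sketch below is an illustration added for this purpose, not a model taken from the chapter. It assumes that a cultural item can be reduced to a single number, that every act of transmission adds random noise, and that reconstruction can be approximated by pulling each received version partway back toward a form that all members of the population can rebuild from shared inputs. The names run_chain, mean_drift, pull, and target, and all parameter values, are hypothetical.

```python
import random


def run_chain(steps, noise_sd, pull, target, rng):
    """Pass one item down a chain of `steps` transmissions.

    Returns how far the final version lies from the original form.
    """
    value = target  # the chain starts from the original form
    for _ in range(steps):
        # Every transmission transforms its input a little.
        received = value + rng.gauss(0.0, noise_sd)
        # pull == 0.0: the receiver simply keeps what was received (replication).
        # pull > 0.0: the receiver partly rebuilds the item from shared
        # background, which moves it back toward the common form
        # (a crude stand-in for reconstruction; an assumption of this sketch).
        value = received + pull * (target - received)
    return abs(value - target)


def mean_drift(pull, chains=200, steps=200, noise_sd=1.0, seed=0):
    """Average final distance from the original form over many chains."""
    rng = random.Random(seed)
    total = sum(run_chain(steps, noise_sd, pull, 0.0, rng) for _ in range(chains))
    return total / chains


replicative = mean_drift(pull=0.0)
reconstructive = mean_drift(pull=0.5)

if __name__ == "__main__":
    print(f"replicative chains, mean drift:    {replicative:.2f}")
    print(f"reconstructive chains, mean drift: {reconstructive:.2f}")
```

Under these assumptions every step is equally noisy in both conditions. In the replicative chains each departure becomes the new starting point, so departures accumulate and the item wanders further the longer the chain runs. In the reconstructive chains the same noise is partly undone at each step, so the item stays close to the original form however long the chain is. The pull term is only a placeholder for the cognitive and environmental factors that make constructive processes converge.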
If I am right, a good part of the explanatory weight in the explanation
of cultural stability and evolution should move from mechanisms of
inheritance and selection to the mechanisms of construction and
reconstruction and to the cognitive and environmental factors that
cause these mechanisms to have converging outputs.
Conclusion
Holism in the social sciences starts from the correct coarse observation
that everything is connected to everything else, but alas arrives
nowhere. Methodological individualism and interactionism have in
various ways looked at social life with a magnifier, revealing details and
providing novel insights into the bigger picture. I am also advocating
using a microscope. Social life is a web of causal chains that are better
described not as individual or as supraindividual but as both infra-
and transindividual. Individual- and societal-level observable effects
are caused by the aggregation of microprocesses few of which are open
to easy observation or introspection. Half of these microprocesses are
mental. Thanks to the development of the cognitive sciences, our
understanding of infraindividual (or “subpersonal,” see Dennett 1969)
mental processes is rapidly changing and growing. In particular, it is
becoming clear, or so I argue, that, to an important extent, cognition
enables culture through domain-specific constructive mechanisms.
Mechanisms of imitation and communication, however remarkable
and important in humans, do not yield the kind of heritability that by
itself would explain cultural stability. This is why a deep understanding
of culture and its evolution is incompatible with shallow psychology.
Notes
1. I write throughout re-production rather than reproduction because I am
talking of the new production of the token of a type, whether or not it is
achieved by means of “reproduction” in the usual sense of copying.
2. A striking example of this is provided by the experiments of Van der
Henst et al. (2002). They found that 57 percent of people with digital watches
who were asked for the time by a stranger in the street did not just read aloud
what their watch indicated (a purely preservative process). Instead, they made
the effort of rounding the time, which their watch gave precise to the minute,
to the nearest multiple of five, thus providing a less informative but more
relevant answer.
3. Cognitive dispositions, like all phenotypic traits, are determined by the
interaction, during their development, of genetic and environmental factors.
Cognitive dispositions are “innate” to the extent that the environmental
factors needed for their development are not themselves cognitive inputs. So,
knowledge of English is certainly not innate, but the ability to learn English
or any other human language may be, even if this ability might fail to develop
because, for instance, of severe nutritional deficits. “Innate” so understood
(see Samuels 2002) does not mean determined solely by the genes (nothing
is), nor does it mean present at birth (whether development ends
intra or extra utero is irrelevant).
References
Atran, S. 1990. Cognitive foundations of natural history: Towards an
anthropology of science. Cambridge: Cambridge University Press.
——. 2002. In gods we trust: The evolutionary landscape of religion. Oxford:
Oxford University Press.
Aunger, R. (ed.). 2000. Darwinizing culture: The status of memetics as a
science. Oxford: Oxford University Press.
——. 2002. The electric meme. New York: Free Press.
Blackmore, S. J. 1999. The meme machine. Oxford: Oxford University
Press.
Bloch, M., and D. Sperber. 2002. Kinship and evolved psychological
dispositions: The Mother’s Brother controversy reconsidered. Current
Anthropology 43(4):723–748.
Boyd, R., and P. J. Richerson. 1985. Culture and the evolutionary process.
Chicago: University of Chicago Press.
Boyer, P. 1994. The naturalness of religious ideas: A cognitive theory of
religion. Berkeley: University of California Press.
——. 2001. Religion explained: The evolutionary origins of religious thought.
New York: Basic Books.
Cavalli-Sforza, L. L., and M. W. Feldman. 1981. Cultural transmission and
evolution: A quantitative approach. Monographs in Population Biology.
Princeton: Princeton University Press.
Dawkins, R. 1976. The selfish gene. Oxford: Oxford University Press.
Dennett, D. 1969. Content and consciousness. London: Routledge and
Kegan Paul.
Durham, W. H. 1991. Coevolution: Genes, culture, and human diversity.
Stanford: Stanford University Press.
Erasmus, D. 1530. On Good Manners for Boys. In Collected Works of
Erasmus, vol. 25, edited by J. Sowards; translated by B. McGregor,
269–289. Toronto: University of Toronto Press.
Hirschfeld, L. A. 1996. Race in the making: Cognition, culture, and the
child’s construction of human kinds. Cambridge, MA: MIT Press.
Hirschfeld, L. A., and S. A. Gelman (eds.). 1994. Mapping the mind:
Domain specificity in cognition and culture. Cambridge: Cambridge
University Press.
Hurley, S., and N. Chater (eds.). 2005. Perspectives on imitation. Cambridge,
MA: Bradford Books.
Hutchins, E. 1995. Cognition in the wild. Cambridge, MA: MIT Press.
Mesoudi, A., A. Whiten, and K. N. Laland. 2004. Is human cultural
evolution Darwinian? Evidence reviewed from the perspective of
“The Origin of Species.” Evolution 58(1):1–11.
Nichols, S. 2004. Sentimental rules: On the natural foundations of moral
judgement. Oxford: Oxford University Press.
Odling-Smee, F. J., K. N. Laland, and M. W. Feldman. 2003. Niche
construction: The neglected process in evolution. Monographs in
Population Biology 37. Princeton: Princeton University Press.
Richerson, P. J., and R. Boyd. 2005. Not by genes alone: How culture
transformed human evolution. Chicago: University of Chicago Press.
Rizzolatti, G., L. Fadiga, V. Gallese, and L. Fogassi. 1996. Premotor
cortex and the recognition of motor actions. Cognitive Brain Research
3:131–141.
Rozin, P., J. Haidt, and C. McCauley. 2000. Disgust. In Handbook of
emotions, 2nd edition, edited by M. Lewis and J. Haviland-Jones,
637–653. New York: Guilford Press.
Samuels, R. 2002. Nativism in cognitive science. Mind & Language
17(3):233–265.
Shatz, M., D. Behrend, S. A. Gelman, and K. S. Ebeling. 1996. Colour
term knowledge in two-year-olds: Evidence for early competence.
Journal of Child Language 23:177–199.
Sperber, D. 1996. Explaining culture: A naturalistic approach. Oxford:
Blackwell.
——. 1999. Conceptual tools for a natural science of society and culture
(Radcliffe-Brown Lecture in Social Anthropology 1999). Proceedings
of the British Academy (2001) 111:297–317.
Sperber, D., and L. Hirschfeld. 2004. The cognitive foundations of cultural
stability and diversity. Trends in Cognitive Sciences 8(1):40–46.
Sperber, D., and D. Wilson. 1995. Relevance: Communication and cognition,
2nd edition. Oxford: Blackwell.
Tomasello, M., and M. Carpenter. 2005. Intention reading and imitative
learning. In Perspectives on imitation: From neuroscience to social science:
Vol. 2. Imitation, human development, and culture, edited by S. Hurley
and N. Chater, 133–148. Cambridge, MA: MIT Press.
Tooby, J., and L. Cosmides. 1992. The psychological foundations of
culture. In The adapted mind: Evolutionary psychology and the generation
of culture, edited by J. Barkow, L. Cosmides, and J. Tooby, 19–136. New
York: Oxford University Press.
Van der Henst, J.-B., L. Carles, and D. Sperber. 2002. Truthfulness and
relevance in telling the time. Mind & Language 17:457–466.
Part 5
Evolutionary Perspectives
seventeen
Sometime over the last five million years, important changes occurred
in human psychology that gave rise to larger, more cooperative societies.
Given the magnitude and complexity of the changes, they were probably
the product of natural selection. However, the standard theory of the
evolution of social behavior is consistent with Hobbes, not observed
human behavior. Apes fit the bill, not humans.
Something makes our species different, and in this chapter we argue
that something is cultural adaptation. Over the last million years or
so, humans evolved the ability to learn from other humans, creating
the possibility of cumulative, nongenetic evolution. These capacities
were strongly beneficial in the chaotic climates of the Pleistocene,
allowing humans to culturally evolve highly refined adaptations to
rapidly varying environments. However, cultural adaptation also vastly
increased heritable variation among groups, and this gave rise to the
evolution of group-beneficial cultural norms and values. Then, in such
culturally evolved cooperative social environments, genetic evolution
created new, more prosocial motives.
We begin by reviewing the evolutionary theory of social behavior,
explaining why natural selection does not normally favor large-scale
cooperation. Then, we argue that cumulative cultural adaptation
generates between-group variation, which potentiates the evolution
of cooperation. Next, we suggest that such changes would lead to the
evolution of genetically transmitted social instincts favoring tribal-scale
cooperation, and summarize some of the evidence consistent with this
hypothesis. Finally, we briefly discuss how these ideas relate to the
theme of this volume, the nature of everyday human interactions.
Figure 17.1. Suppose that the members of a population are paired at random
to play the stag hunt. The average payoff of each strategy is plotted
as a function of the fraction of players who choose to hunt stag. Assuming
that strategies with higher payoffs increase in frequency, there are two stable
equilibria: everybody chooses stag or everybody chooses hare. The average
payoff of the whole population is maximized at the all-stag equilibrium. However,
unless stag hunting has a much larger payoff than hunting hare (2h < s), the
basin of attraction of the stag equilibrium is smaller than that of the lower-payoff
hare equilibrium.
which pairs of individuals have two options: They can hunt for “a stag”
or for “hare.” Hunting hare is a solitary activity and an individual who
chooses to hunt hare gets a small payoff, h, no matter what the other
individual does. Stag hunting, however, requires coordinated action.
If both players hunt for the stag, they usually succeed and each gets a
large payoff, s. However, a single individual hunting stag always fails
and gets a payoff of 0 (see Table 17.1).
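The payoff structure just described can be checked with a short numerical sketch. It assumes, as in Figure 17.1, that players are paired at random, so a stag hunter meets another stag hunter with probability p (the fraction of stag hunters) and expects p × s, while a hare hunter always gets h. The function names and the example values s = 3 and h = 1 are illustrative assumptions, not taken from the chapter.

```python
def stag_payoff(p, s):
    """Expected payoff to a stag hunter when a fraction p of partners hunt stag."""
    # A stag hunter earns s only when paired with another stag hunter, else 0.
    return p * s


def hare_payoff(h):
    """A hare hunter earns h no matter what the partner does."""
    return h


def stag_threshold(s, h):
    """Fraction of stag hunters above which stag hunting outperforms hare hunting."""
    # Stag pays better when p * s > h, that is, when p > h / s.
    return h / s


def stag_basin_is_larger(s, h):
    """True when the stag equilibrium attracts more starting points (2h < s)."""
    threshold = stag_threshold(s, h)
    # Frequencies above the threshold move toward all-stag, those below toward all-hare.
    return (1 - threshold) > threshold


if __name__ == "__main__":
    s, h = 3.0, 1.0  # illustrative payoffs only
    for p in (0.1, 0.5, 0.9):
        print(p, stag_payoff(p, s), hare_payoff(h))
    print("threshold:", stag_threshold(s, h))
    print("stag basin larger:", stag_basin_is_larger(s, h))
```

With these illustrative values the threshold is h/s = 1/3, so stag hunting spreads only if more than a third of the population already hunts stag. Whenever s is less than 2h the threshold exceeds one half and the hare equilibrium has the larger basin.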
The best thing for the population is if everybody hunts stags, so
stag hunting is “cooperative” in the sense of a mutually beneficial
activity. However, it is not cooperative in the technical sense because
individuals do not experience a cost to provide a benefit. When most
Table 17.1. The Stag Hunt. In Rousseau’s parable, hunters can either
hunt stag or hare. Hunting together does not affect the success of hare
hunters; they always get a small payoff, h. If they hunt stag together
they are likely to succeed and achieve a high payoff s, but a single stag
hunter fails and receives a payoff of zero. Payoffs are listed as (Left, Right).

                        Right
                  Stag        Hare
Left    Stag      s, s        0, h
        Hare      h, 0        h, h
                        Right
                  Cooperate       Defect
Left  Cooperate   b - c, b - c    -c, b
      Defect      b, -c           0, 0
Figure 17.2. Suppose that the members of a population are paired at random
to play the prisoner’s dilemma. The average payoff of each strategy
is plotted as a function of the fraction of players who choose to cooperate.
Now there is only one stable equilibrium, everybody defects, at which the average
payoff of the whole population is minimized. The payoff-maximizing state,
everybody cooperates, is unstable because defectors have a higher payoff than
cooperators.
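The instability of cooperation in the prisoner’s dilemma can be verified in the same way. The sketch below assumes the usual additive parameterization, in which a cooperator pays a cost c to give its partner a benefit b, with b > c > 0, and players are paired at random. The function names and numerical values are illustrative assumptions.

```python
def cooperator_payoff(p, b, c):
    """Expected payoff to a cooperator when a fraction p of partners cooperate."""
    # Always pays the cost c; receives b only from cooperating partners.
    return p * b - c


def defector_payoff(p, b):
    """Expected payoff to a defector when a fraction p of partners cooperate."""
    # Pays nothing; still receives b from cooperating partners.
    return p * b


def mean_payoff(p, b, c):
    """Average payoff in a population with a fraction p of cooperators."""
    return p * cooperator_payoff(p, b, c) + (1 - p) * defector_payoff(p, b)


if __name__ == "__main__":
    b, c = 3.0, 1.0  # illustrative values only, with b > c > 0
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p, cooperator_payoff(p, b, c), defector_payoff(p, b), mean_payoff(p, b, c))
```

At every frequency p the defector's payoff exceeds the cooperator's by exactly c, so defection always increases, even though the population average, which works out to p(b - c), is highest when everyone cooperates.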
all the reasons given by Adam Smith in The Wealth of Nations. However,
participating in exchange typically requires cooperation. In all but the
simplest transactions, individuals experience a cost now in return for
a benefit later and are vulnerable to defectors who take the benefit but
do not produce the return. Exchange and division of labor also are
typically characterized by imperfect monitoring of effort and quality
that give rise to opportunities for free riding. The potential for conflict
over land, food, and other resources is everywhere. In such conflicts,
larger, more cooperative groups defeat smaller, less cooperative groups.
However, each warrior’s sacrifice benefits everyone in the group whether
or not they too went to war and thus defectors reap the fruits of victory
without risking their skins. Honest, low-cost communication provides
many benefits—coordination is greatly facilitated, resources can be
used more efficiently, hazards avoided; the list is long. However, once
individuals come to rely on the signals of others, the door is open for
liars, flim-flam artists, and all the rest. Widely held stable moral systems
enforced by stern sanctions can solve most of these problems; cheats,
cowards, and liars can be punished. The problem is that punishment is
typically costly, and defectors can reap the benefits of the moral order
without paying the costs of punishment.
However, aside from humans, only a few other taxa, most notably
social insects, cooperate extensively. Interestingly, those that do are,
like humans, spectacular evolutionary successes. It has been estimated,
for example, that termites account for half of the animal biomass in
the tropics. So, if cooperation produces such spectacular benefits, why
is it so rare?
Among Primates, Cooperation is Limited to Small Groups
The punch line is that evolutionary theory predicts that cooperation
in primates and other species that have small families will be limited
to small groups. Kin selection results in large-scale social systems only
when there are large numbers of closely related individuals. The social
insects, where a few females produce a mass of sterile workers, and
multicellular invertebrates are examples of such exceptions. Primate
societies are nepotistic, but cooperation is mainly restricted to relatively
small kin groups. Theory suggests that reciprocity can be effective in
small groups, but not in larger ones. Reciprocity may play some role
in nature (although many experts are unconvinced), but there is no
evidence that reciprocity has played a role in the evolution of large-scale
sociality. All would be well if humans did not exist, because human
societies, even those of hunter-gatherers, are based on groups of people
linked together into much larger highly cooperative social systems.
Rapid Cultural Adaptation Potentiates Group Selection
So why are human societies not very small in scale, like those of other
primates? For us, the most likely explanation is that rapid cultural
adaptation led to a huge increase in the amount of behavioral variation
among groups. In other primate species, there is little heritable variation
among groups because natural selection is weak compared with migration.
This is why group selection at the level of whole primate groups is not
an important evolutionary force. In contrast, there is a great deal of
behavioral variation among human groups. Such variation is the reason
why we have culture—to allow different groups to accumulate different
adaptations to a wide range of environments.
In the Origin of Species, Darwin famously argued that three conditions
are necessary for adaptation by natural selection: First, there must be a
“struggle for existence” so that not all individuals survive and reproduce.
Second, there must be variation so that some types are more likely to
survive and reproduce than others, and finally, variation must be heritable
so that the offspring of survivors resemble their parents. Although
Darwin usually focused on individuals,3 the same three postulates apply
to any reproducing entity—molecules, genes, and cultural groups. Only
the first two conditions are satisfied by most other kinds of animal
groups. For example, vervet monkey groups compete with one another,
and groups vary in their ability to survive and grow, but, and this is
the big but, the causes of group-level variation in competitive ability
are not heritable, so there is no cumulative adaptation. Once rapid
cultural adaptation in human societies gave rise to stable, between-group
differences, the stage was set for a variety of selective processes
to generate adaptations at the group level.
The simplest mechanism is intergroup competition. The spread of the
Nuer at the expense of the Dinka in 19th-century Sudan provides a
good example. During the 19th century each consisted of a number of
politically independent groups. Cultural differences in norms between
the two groups meant that the Nuer were able to cooperate in larger
groups than the Dinka. The Nuer, who were driven by the desire for more
grazing land, attacked and defeated their Dinka neighbors, occupied
their territories, and assimilated tens of thousands of Dinka into their
communities. This example illustrates the requirements for cultural
group selection by intergroup competition. Contrary to some critics
(e.g., Palmer et al. 1997), there is no need for groups to be strongly
bounded, individual-like entities. The only requirement is that there
are persistent cultural differences between groups, and these differences
must affect the group’s competitive ability. Losing groups must be
replaced by the winning groups. Interestingly, the losers do not have
to be killed. The members of losing groups just have to disperse or to
be assimilated into the victorious group. Losers will be socialized by
conformity or punishment, so even very high rates of physical migration
need not result in the erosion of cultural differences. This kind of group
selection can be a potent force even if groups are usually very large.
Group competition is common in small-scale societies. The best
data come from New Guinea, which provides the only large sample
of simple societies studied by professional anthropologists before
they experienced major changes because of contact with Europeans.
Joseph Soltis assembled data from the reports of early ethnographers
in New Guinea (Soltis et al. 1995). Many studies report appreciable
intergroup conflict and about half mention cases of social extinction
of local groups. Five studies contained enough information to estimate
the rates of extinction of neighboring groups (see Table 17.3). The
typical pattern is for groups to be weakened over a period of time
by conflict with neighbors and finally to suffer a sharp defeat. When
enough members become convinced of the group’s vulnerability to
further attack, members take shelter with friends and relatives in other
groups, and the group becomes socially extinct. At these rates of group
extinction, it would take between twenty and forty generations, or 500
to 1,000 years, for an innovation to spread from one group to most of
the other local groups by cultural group selection.
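The conversion from generations to years in this estimate implies a generation length of about 25 years (500/20 = 1,000/40 = 25). That figure is inferred from the numbers quoted above rather than stated in the text, so it is labeled as an assumption in this small sketch, as is the function name.

```python
# Assumption: 25 years per generation, inferred from
# "twenty to forty generations, or 500 to 1,000 years".
YEARS_PER_GENERATION = 25


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a number of generations into calendar years."""
    return generations * years_per_generation


if __name__ == "__main__":
    for generations in (20, 40):
        print(generations, "generations =", generations_to_years(generations), "years")
```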
These results imply that cultural group selection is a relatively slow
process. But then, so are the actual rates of increase in political and social
sophistication we observe in the historical and archaeological records.
Table 17.3. Extinction rates for cultural groups from five regions in
New Guinea, from Soltis et al. 1995
Notes
1. The great population geneticist J. B. S. Haldane gave what is perhaps the
pithiest summary of this principle. When asked by a reporter whether the
study of evolution had made it more likely that he would give up his life for a
brother, Haldane is supposed to have answered, “No, but I would give up my
life to save two brothers or eight cousins.”
2. The Price approach has been very fruitful, generating a much clearer
understanding of many evolutionary problems. Examples include Alan Grafen’s
(1984) work on kin selection and Steven Frank’s work on the evolution of
the immune system, multicellularity, and related issues (Frank 2002). This
approach can also be used to study cultural evolution. See Henrich (2004) and
Henrich and Boyd (2002).
3. Darwin (1874), in the Descent of Man, did invoke group selection to
explain human cooperation.
It must not be forgotten that although a high standard of morality gives
but a slight or no advantage to each individual man and his children
over other men of the same tribe, yet that an increase in the number
of well-endowed men and an advancement in the standard of morality
will certainly give an immense advantage to one tribe over another. A
tribe including many members who, from possessing in a high degree
the spirit of patriotism, fidelity, obedience, courage, and sympathy,
were always ready to aid one another, and to sacrifice themselves for
the common good, would be victorious over most other tribes; and this
would be natural selection. [pp. 178-179]
4. Simon (1990) made the same argument, apparently independently. He
used the term docility because he believed that we are especially prone to
accept group beneficial beliefs. We think his account is unsatisfactory because
it does not explain why such beliefs spread.
References
Anderson, B. R. O’G. 1991. Imagined communities: Reflections on the origin
and spread of nationalism, revised and extended edition. London:
Verso.
Aoki, K. 1982. A condition for group selection to prevail over
counteracting individual selection. Evolution 36:832-842.
Axelrod, R., and D. Dion. 1988. The further evolution of cooperation.
Science 242:1385-1390.
Binmore, K. G. 1994. Game theory and the social contract. Cambridge,
MA: MIT Press.
Boyd, R., and P. J. Richerson. 1988. The evolution of reciprocity in
sizable groups. Journal of Theoretical Biology 132:337-356.
Boyd, R., and P. J. Richerson. 1992. Punishment allows the evolution
of cooperation (or anything else) in sizable groups. Ethology and
Sociobiology 13:171-195.
Boyd, R., and P. J. Richerson. 2002. Group beneficial norms spread rapidly
in a structured population. Journal of Theoretical Biology 215:287-296.
Darwin, C. 1874. The descent of man and selection in relation to sex, 2nd
edition, 2 vols. New York: American Home Library.
Dobzhansky, T. 1973. Nothing in biology makes sense except in the
light of evolution. American Biology Teacher 35:25-29.
Frank, S. A. 2002. Immunology and evolution of infectious disease. Princeton:
Princeton University Press.
Ghiselin, M. T. 1974. The economy of nature and the evolution of sex.
Berkeley: University of California Press.
Grafen, A. 1984. A geometric view of relatedness. Oxford Surveys of
Evolutionary Biology 2:28-89.
Hamilton, W. D. 1964. Genetic evolution of social behavior I, II. Journal
of Theoretical Biology 7:1-52.
Hammerstein, P. 2003. Why is reciprocity so rare in animals? A Protestant
appeal. In Genetic and cultural evolution of cooperation, edited by P.
Hammerstein, 83-94. Cambridge, MA: MIT Press.
Henrich, J. 2004. Cultural group selection, coevolutionary processes and
large-scale cooperation. Journal of Economic Behavior and Organization
53:3-35.
Henrich, J., and R. Boyd. 2002. On modeling cognition and culture:
Why replicators are not necessary for cultural evolution. Culture and
Cognition 2:67-112.
Henrich, J., and F. J. Gil-White. 2001. The evolution of prestige—Freely
conferred deference as a mechanism for enhancing the benefits of
cultural transmission. Evolution and Human Behavior 22:165-196.
Henrich, J., R. Boyd, S. Bowles, C. Camerer, E. Fehr, and H. Gintis. 2004.
The foundations of human sociality: Economic experiments and ethnographic
evidence from fifteen small-scale societies. New York: Oxford University
Press.
Johnson, P. 1976. A history of Christianity. London: Weidenfeld &
Nicolson.
Keller, L., and M. Chapuisat. 1999. Cooperation among selfish individuals
in insect societies. Bioscience 49:899-909.
Labov, W. 2001. Principles of Linguistic Change, vol. 2: Social Factors.
Oxford: Blackwell.
Lodge, R. A. 1993. French: From dialect to standard. London:
Routledge.
Mansbridge, J. J. 1990. Beyond self-interest. Chicago: University of
Chicago Press.
Maynard Smith, J. 1964. Group selection and kin selection. Nature
201:1145-1146.
Nowak, M., and K. Sigmund. 1998. Evolution of indirect reciprocity by
image scoring: The dynamics of indirect reciprocity. Nature 393(June
11):573-577.
Palmer, C. T., B. E. Fredrickson, and C. F. Tilley. 1997. Categories and
gatherings: Group selection and the mythology of cultural
anthropology. Evolution and Human Behavior 18:291-308.
Pinker, S. 1994. The language instinct, 1st edition. New York: W.
Morrow.
Price, G. R. 1970. Selection and covariance. Nature 227(August 1):520-521.
Price, G. R. 1972. Extensions of covariance selection mathematics.
Annals of Human Genetics 35:485-490.
Queller, D. C. 1989. Inclusive fitness in a nutshell. Oxford Surveys in
Evolutionary Biology 6:73-109.
Queller, D. C., and J. E. Strassmann. 1998. Kin selection and social
insects: Social insects provide the most surprising predictions and
satisfying tests of kin selection. Bioscience 48:165-175.
Richerson, P. J., and R. Boyd. 1998. The evolution of human ultrasociality.
In Indoctrinability, ideology, and warfare: Evolutionary perspectives, edited
by I. Eibl-Eibesfeldt and F. K. Salter, 71-95. New York: Berghahn
Books.
Richerson, P. J., and R. Boyd. 2001. The evolution of subjective
commitment to groups: A tribal instincts hypothesis. In Evolution
and the capacity for commitment, edited by R. M. Nesse, 186-220. New
York: Russell Sage Foundation.
Richerson, P. J., and R. Boyd. 2005. Not by genes alone: How culture
transformed human evolution. Chicago: University of Chicago Press.
Rogers, A. R. 1990. Group selection by selective emigration: The effects
of migration and kin structure. American Naturalist 135:398-413.
Simon, H. A. 1990. A mechanism for social selection and successful
altruism. Science 250(4988):1665-1668.
Sober, E., and D. S. Wilson. 1998. Unto others: The evolution and psychology
of unselfish behavior. Cambridge, MA: Harvard University Press.
Soltis, J., R. Boyd, and P. J. Richerson. 1995. Can group-functional
behaviors evolve by cultural group selection? An empirical test.
Current Anthropology 36:437-494.
Stark, R. 1997. The rise of Christianity: How the obscure, marginal Jesus
movement became the dominant religious force in the Western world in a
few centuries. San Francisco: HarperCollins.
Tooby, J., and L. Cosmides. 1992. The psychological foundations of
culture. In The adapted mind: Evolutionary psychology and the generation
of culture, edited by J. Barkow, L. Cosmides, and J. Tooby, 19-136.
New York: Oxford University Press.
Trivers, R. L. 1971. The evolution of reciprocal altruism. Quarterly Review
of Biology 46:35-57.
Wiessner, P., and A. Tumu. 1998. Historical vines: Enga networks of
exchange, ritual, and warfare in Papua New Guinea. Smithsonian Series
in Ethnographic Inquiry. Washington, DC: Smithsonian Institution
Press.
Williams, G. C. 1966. Adaptation and natural selection: A critique of some
current evolutionary thought. Princeton: Princeton University Press.
Wynne-Edwards, V. C. 1962. Animal dispersion in relation to social behavior.
Edinburgh: Oliver and Boyd.
eighteen
head. Note that the hands are not employed symmetrically, but each applies a
different grip: in this example, the right hand holds with a power grip, whereas
the left hand pinches the end of the segment. This allows the segment to be
rotated conveniently, ready for the next piece of outer case to be removed,
rather in the manner in which one might handle corn-on-the-cob. (c) Once
the pith is partly exposed, it may be eaten directly from the section of stem or
picked out with the index finger of one hand. Notice the one-day-old baby that
is resting on the chest of its mother, apparently asleep: even from the first day of
life, a gorilla has abundant opportunities to watch skilled food processing tasks,
performed at close range, and to explore discarded debris. In contrast, young
great apes seldom watch with any evident close attention when their mothers
or other individuals are processing food (unless the food is likely to be shared, as
with nuts cracked using tools). (d) More typically, mother and juvenile will feed
together but independently: but remember that, by the time the juvenile is able
to tackle Peucedanum independently, it will have spent many hundreds of hours
watching processing in a more casual way.
This strongly favors the hypothesis that the standard technique is a
culturally transmitted pattern.
Finally, one anecdotal observation supports the case that great apes
need to learn aspects of their complex feeding skills by observation.
When processing stinging nettles (see Fig. 18.2), an important food
plant in the study area, one single adult in the study population differed
in technique—the female Picasso did not fold bundles of leaves, so
was presumably often stung on her lips (Byrne 1999a). Picasso had
transferred into the study area from lower altitude, where nettles do
not grow. Because adult gorillas feed alone and out of sight of others in
dense herbage, mountain gorillas’ only opportunity for observational
learning of plant processing comes in infancy. It seems most likely
that a lack of opportunity to observe accounts for Picasso’s incomplete
technique, and intriguingly her juvenile was the only other gorilla in
the study population to lack that particular element of the skill.
Note
1. There is a certain irony that Humphrey (1976) used the apparent lack
of any such environmental challenges to gorilla cognition as a justification
for advocating a hypothesis that advanced primate cognition developed in
response to the social challenge of living in a long-lasting group.
References
Astington, J. W., P. Harris, and D. R. Olson. 1988. Developing theories of
mind. Cambridge: Cambridge University Press.
Auser, M. D., and R. W. Wrangham. 1987. Manipulation of food calls
in captive chimpanzees. Folia Primatologica 48:207–210.
Baldwin, D., E. Neuhaus, G. Guha, and A. Craven. n.d. Extracting
structure from dynamic human action. Unpublished MS, Department of
Psychology, University of Oregon.
Bargh, J. A., and T. L. Chartrand. 1999. The unbearable automaticity of
being. American Psychologist 54:462–479.
Barton, R. A., and R. I. M. Dunbar. 1997. Evolution of the social brain.
In Machiavellian intelligence, vol. 2: Extensions and evaluations, edited
by A. Whiten and R. W. Byrne, 240–263. Cambridge: Cambridge
University Press.
Boesch, C., and H. Boesch. 1990. Tool use and tool making in wild
chimpanzees. Folia Primatologica 54:86–99.
Brothers, L. 1990. The social brain: A project for integrating primate
behavior and neurophysiology in a new domain. Concepts in
Neuroscience 1:27–51.
Bugnyar, T., and L. Huber. 1997. Push or pull: An experimental study
of imitation in marmosets. Animal Behaviour 54:817–831.
Byrne, R. W. 1994. The evolution of intelligence. In Behaviour and
evolution, edited by P. J. B. Slater and T. R. Halliday, 223–265.
Cambridge: Cambridge University Press.
——. 1995. The thinking ape: Evolutionary origins of intelligence. Oxford:
Oxford University Press.
——. 1996. Machiavellian intelligence. Evolutionary Anthropology 5:172–180.
——. 1997. The technical intelligence hypothesis: An additional
evolutionary stimulus to intelligence? In Machiavellian intelligence,
vol. 2: Extensions and evaluations, edited by A. Whiten and R. W. Byrne,
289–311. Cambridge: Cambridge University Press.
——. 1998. Cognition in great apes. In Brain and cognition in monkeys,
apes and man, edited by A. D. Milner, 228–244. Oxford: Oxford
University Press.
——. 1999a. Cognition in great ape ecology. Skill-learning ability opens
up foraging opportunities. Symposia of the Zoological Society of London
72:333–350.
——. 1999b. Imitation without intentionality. Using string parsing to
copy the organization of behaviour. Animal Cognition 2:63–72.
——. 1999c. Object manipulation and skill organization in the complex
food preparation of mountain gorillas. In The mentality of gorillas and
orangutans, edited by S. T. Parker, R. W. Mitchell, and H. L. Miles,
147–159. Cambridge: Cambridge University Press.
——. 2000. The evolution of primate cognition. Cognitive Science 24:543–570.
——. 2001. Clever hands: The food processing skills of mountain gorillas.
In Mountain gorillas: Three decades of research at Karisoke, edited by
M. M. Robbins, P. Sicotte, and K. J. Stewart, 293–313. Cambridge:
Cambridge University Press.
——. 2002. Imitation of complex novel actions: What does the evidence
from animals mean? Advances in the Study of Behavior 31:77–105.
——. 2003. Imitation as behaviour parsing. Philosophical Transactions
of the Royal Society of London (B) 358:529–536.
——. 2005. Detecting, understanding, and explaining animal imitation.
In Perspectives on imitation: From mirror neurons to memes, edited by S.
Hurley and N. Chater, 255–282. Cambridge, MA: MIT Press.
Byrne, R. W., P. J. Barnard, I. Davidson, V. M. Janik, W. C. McGrew, A.
Miklósi, and P. Wiessner. 2004. Understanding culture across species.
Trends in Cognitive Sciences 8:341–346.
Byrne, R. W., and J. M. E. Byrne. 1993. Complex leaf-gathering skills of
mountain gorillas (Gorilla g. beringei): Variability and standardization.
American Journal of Primatology 31:241–261.
Byrne, R. W., and N. Corp. 2004. Neocortex size predicts deception
rate in primates. Proceedings of the Royal Society of London: Biology
271:1693–1699.
Byrne, R. W., N. Corp, and J. M. E. Byrne. 2001a. Estimating the
complexity of animal behavior: How mountain gorillas eat thistles.
Behaviour 138:525–557.
——. 2001b. Manual dexterity in the gorilla: Bimanual and digit role
differentiation in a natural task. Animal Cognition 4:347–361.
Byrne, R. W., and A. E. Russon. 1998. Learning by imitation: A hierarchical
approach. Behavioral and Brain Sciences 21:667–721.
Byrne, R. W., and E. J. Stokes. 2002. Effects of manual disability on
feeding skills in gorillas and chimpanzees: a cognitive analysis.
International Journal of Primatology 23:539–554.
Byrne, R. W., and A. Whiten. 1988. Machiavellian Intelligence: Social
expertise and the evolution of intellect in monkeys, apes and humans.
Oxford: Clarendon Press.
Corballis, M. C. 1991. The lopsided ape. Oxford: Oxford University
Press.
Corp, N., and R. W. Byrne. 2002a. Leaf processing of wild chimpanzees:
Physically defended leaves reveal complex manual skills. Ethology
108:1–24.
——. 2002b. The ontogeny of manual skill in wild chimpanzees: Evidence
from feeding on the fruit of Saba florida. Behaviour 139:137–168.
Crockford, C., I. Herbinger, L. Vigilant, and C. Boesch. 2004. Wild
chimpanzees produce group-specific calls: A case for vocal learning?
Ethology 110:221–243.
Custance, D., A. Whiten, and T. Fredman. 1999. Social learning of an
artificial fruit task in capuchin monkeys (Cebus apella). Journal of
Comparative Psychology 113(1):13–23.
Donald, M. 1991. Origins of the human mind: Three stages in the evolution
of culture and cognition. Cambridge, MA: Harvard University Press.
Dunbar, R. I. M. 1992. Neocortex size as a constraint on group size in
primates. Journal of Human Evolution 20:469–493.
——. 1998. The social brain hypothesis. Evolutionary Anthropology
6:178–190.
Dunbar, R. I. M., and D. Nettle. 1997. Social markers and the evolution
of reciprocal exchange. Current Anthropology 38:93–99.
Elliott, J. M., and K. J. Connolly. 1984. A classification of manipulative
hand movements. Developmental Medicine and Child Neurology 26:283–296.
Fox, F., A. Sitompul, and C. P. Van Schaik. 1999. Intelligent tool use in
wild Sumatran orangutans. In The mentality of gorillas and orangutans,
edited by S. T. Parker, H. L. Miles, and R. W. Mitchell, 99–116.
Cambridge: Cambridge University Press.
Galef, B. G. 1988. Imitation in animals: History, definitions, and
interpretation of data from the psychological laboratory. In Social
learning: Psychological and biological perspectives, edited by T. Zentall
and B. G. Galef Jr., 3–28. Hillsdale, NJ: Erlbaum.
Gallese, V., L. Fadiga, L. Fogassi, and G. Rizzolatti. 1996. Action
recognition in the premotor cortex. Brain 119:593–609.
Gardner, R. A., B. T. Gardner, and T. E. Van Cantfort. 1989. Teaching
sign language to chimpanzees. New York: State University of New York
Press.
Goldin-Meadow, S. 1999. The role of gesture in communication and
thinking. Trends in Cognitive Sciences 3(11):419–429.
Good, D. 1995. When does foresight end and hindsight begin? In Social
intelligence and interaction: Expressions and implications of the social bias
in human intelligence, edited by E. N. Goody, 139–149. Cambridge:
Cambridge University Press.
Goodall, J. 1986. The chimpanzees of Gombe: Patterns of behavior.
Cambridge, MA: Harvard University Press.
Gouzoules, S., H. Gouzoules, and P. Marler. 1984. Rhesus monkey (Macaca
mulatta) screams: Representational signalling in the recruitment of
agonistic aid. Animal Behaviour 32(1):182–193.
Green, S. 1975. Dialects in Japanese monkeys: Vocal learning and
cultural transmission of locale-specific vocal behaviour? Zeitschrift
für Tierpsychologie 38:304–314.
Hewes, G. W. 1973. Primate communication and the gestural origins
of language. Current Anthropology 14:5–24.
Heyes, C. M., and E. D. Ray. 2000. What is the significance of imitation
in animals? Advances in the Study of Behavior 29:215–245.
Humphrey, N. K. 1976. The social function of intellect. In Growing
Points in Ethology, edited by P. P. G. Bateson and R. A. Hinde, 303–317.
Cambridge: Cambridge University Press.
Janik, V. M., and P. J. B. Slater. 1997. Vocal learning in mammals.
Advances in the Study of Behavior 26:59–99.
Jolly, A. 1966. Lemur social behaviour and primate intelligence. Science
153(735):501–506.
Kuhl, P. K. 1982. Discrimination of speech by non-human animals—basic
auditory sensitivities conducive to the perception of speech-sound
categories. Journal of the Acoustical Society of America 70:340–349.
Lorenz, K. 1950. The comparative method in studying innate behaviour
patterns. Symposia of the Society for Experimental Biology 4:221–268.
Marler, P., and R. Tenaza. 1977. Signalling behaviour of apes with special
reference to vocalization. In How animals communicate, edited by T.
Sebeok, 965–1033. Bloomington: Indiana University Press.
Matsuzawa, T. 2001. Primate foundations of human intelligence: A view
of tool use in nonhuman primates and fossil hominids. In Primate
origins of human cognition and behavior, edited by T. Matsuzawa, 3–25.
Tokyo: Springer-Verlag.
Matsuzawa, T., and G. Yamakoshi (eds.). 1996. Comparisons of chimpanzee
material culture between Bossou and Nimba, West Africa. Cambridge:
Cambridge University Press.
Mitani, J. C., T. Hasegawa, J. Gros-Louis, P. Marler, and R. W. Byrne.
1992. Dialects in wild chimpanzees? American Journal of Primatology
27:233–243.
Napier, J. R. 1961. Prehensility and opposability in the hands of primates.
Symposia of the Zoological Society of London 5:115–132.
Perner, J., and H. Wimmer. 1985. “John thinks that Mary thinks that”
attribution of second-order beliefs by 5- to 10-year old children.
Journal of Experimental Child Psychology 39:437–471.
Pika, S., K. Liebal, and M. Tomasello. 2003. Gestural communication
in young gorillas (Gorilla gorilla): Gestural repertoire, learning, and
use. American Journal of Primatology 60:95–111.
Rizzolatti, G., L. Fadiga, L. Fogassi, and V. Gallese. 1996. Premotor cortex
and the recognition of motor actions. Brain Research 3:131–141.
——. 2002. From mirror neurons to imitation: Facts and speculations. In
The imitative mind: Development, evolution, and brain bases, edited by
A. Meltzoff and W. Prinz, 247–266. Cambridge: Cambridge University
Press.
Russon, A. E. 1998. The nature and evolution of intelligence in orangutans
(Pongo pygmaeus). Primates 39(4):485–503.
Saffran, J. R., R. N. Aslin, and E. L. Newport. 1996. Statistical learning
by 8-month-old infants. Science 274(5294):1926–1928.
Savage-Rumbaugh, E. S., J. Murphy, R. A. Sevcik, K. E. Brakke, S. L.
Williams, and D. M. Rumbaugh. 1993. Language comprehension
in ape and child. Monographs of the Society for Research in Child
Development 58:1–252.
Senghas, A., S. Kita, and A. Ozyurek. 2004. Children creating core
properties of language: evidence from an emerging sign language in
Nicaragua. Science 305(5691):1779–1782.
Seyfarth, R. M., and D. L. Cheney. 1986. Vocal development in vervet
monkeys. Animal Behaviour 34:1640–1658.
Spence, K. W. 1937. Experimental studies of learning and higher mental
processes in infra-human primates. Psychological Bulletin 34:806–850.
Stokes, E. J., and R. W. Byrne. 2001. Cognitive capacities for behavioural
flexibility in wild chimpanzees (Pan troglodytes): The effect of snare
injury on complex manual food processing. Animal Cognition 4:11–28.
Stokes, E. J., D. Quiatt, and V. Reynolds. 1999. Snare injuries to
chimpanzees (Pan troglodytes) at 10 study sites in East and West Africa.
American Journal of Primatology 49:104–105.
Tanner, J. E. 1998. Gestural communication in a group of zoo-living
lowland gorillas. Ph.D. dissertation, Department of Psychology, St
Andrews.
Tanner, J. E., and R. W. Byrne. 1996. Representation of action through
iconic gesture in a captive lowland gorilla. Current Anthropology
37:162–173.
——. 1999. The development of spontaneous gestural communication
in a group of zoo-living lowland gorillas. In The mentalities of gorillas
and orangutans: Comparative perspectives, edited by S. T. Parker, R. W.
Mitchell, and H. L. Miles, 211–239. Cambridge: Cambridge University
Press.
Tomasello, M., J. Call, and B. Hare. 2003. Chimpanzees understand
psychological states—the question is which ones and to what extent.
Trends in Cognitive Sciences 7:153–156.
Tomasello, M., B. George, A. Kruger, J. Farrar, and E. Evans. 1985. The
development of gestural communication in young chimpanzees.
Journal of Human Evolution 14:175–186.
Tomasello, M., D. Gust, and T. A. Frost. 1989. A longitudinal investigation
of gestural communication in young chimpanzees. Primates 30:35–50.
Visalberghi, E., and D. M. Fragaszy. 1990. Do monkeys ape? In “Language”
and Intelligence in Monkeys and Apes, edited by S. T. Parker and K. A.
Gibson, 247–273. Cambridge: Cambridge University Press.
Wellman, H. M. 1990. Children’s theories of mind. Cambridge, MA:
Bradford–MIT Press.
Whiten, A. 1998. Imitation of the sequential structure of actions by
chimpanzees (Pan troglodytes). Journal of Comparative Psychology
112:270–281.
Whiten, A., and R. W. Byrne (eds.). 1997. Machiavellian intelligence, vol. 2:
Extensions and evaluations. Cambridge: Cambridge University Press.
Whiten, A., D. M. Custance, J.-C. Gomez, P. Teixidor, and K. A. Bard.
1996. Imitative learning of artificial fruit processing in children (Homo
sapiens) and chimpanzees (Pan troglodytes). Journal of Comparative
Psychology 110:3–14.
Zuberbühler, K. 2002. A syntactic rule in forest monkey communication.
Animal Behaviour 63:293–299.
Learning to Point
No one knows how human infants come to point for others. But given
cross-cultural differences in infants’ gestural behavior (although these
have not been documented as specifically as one might like), it would
seem clear that the major process is one of learning. There are two
main candidates.
First is some form of ritualization. For example, a very young infant
might reach for a distant object, at which point her mother might discern
the intention and obtain the object for her—leading to a ritualized form
of reaching that resembles pointing (Vygotsky 1978). We can also extend
this hypothetical scenario to the case that, by most accounts, seems
more likely, when infants use arm and index finger extension to orient
their own attention to things. If an adult were to respond to this by
attending to the same thing and then share excitement with the infant
by smiling and talking to her, then this kind of pointing might also
become ritualized—that is, a learned procedure for producing a desired
social effect. In this scenario it would be possible for an infant to point
for others while still not understanding the pointing gesture of others,
and indeed a number of empirical studies find just such dissociations
in many young infants (Franco and Butterworth 1996). Infants who
learn to point via ritualization, therefore, may understand their gesture
from the “inside” only, as a procedure for getting something done,
not as an invitation to share attention using a mutually understood
communicative convention.
The alternative is that the infant observes an adult point for her
and comprehends that the adult is attempting to induce her to share
attention to something, and then imitatively learns that when she
has the same goal she can use the same means, thus creating an
intersubjective symbolic act for sharing attention. It is crucial that in
this learning process—one form of what Tomasello et al. (1993) called
cultural learning—the infant is not just mimicking adults sticking out
their fingers; she is truly understanding and attempting to reproduce the
adult’s intentionally communicative act, including both means and end.
It is crucial because a bidirectional symbol can only be created when the
child first understands the intentions behind the adult’s communicative
act, and then identifies with those intentions herself as she produces
the “same” means for the “same” end.
Empirically we do not know whether infants learn to point via
ritualization or imitative learning or whether, as I suspect, some infants
learn in one way (especially prior to their first birthdays) and some learn
in the other. And it may even happen that an infant who learns to point via
ritualization at some later point comes to comprehend adult pointing in
a new way, and so comes to a new understanding of her own pointing
and its equivalence to the adult version. Thus, Franco and Butterworth
(1996) found that when many infants first begin to point they do not
seem to monitor the adult’s reaction at all. Some months later they look
to the adult after they have pointed to observe her reaction, and some
months after that they look to the adult first, to secure her attention on
themselves, before they engage in the pointing act—perhaps evidencing
a new understanding of the adult’s comprehension.
Virtually all of chimpanzees’ flexibly produced gestures are intention
movements that have been ritualized in interaction with others. For
example, an infant chimpanzee who wants to climb on its mother’s
back may first actually pull down physically on her rear end to make
the back accessible, after which the mother learns to anticipate on first
touch, which the infant then notices and exploits in the future. The
general form of this type of learning is thus:
1. Individual A performs behavior X (noncommunicative).
2. Individual B reacts consistently with behavior Y.
3. Subsequently B anticipates A’s performance of X, on the basis of its
initial step, by performing Y.
4. Subsequently, A anticipates B’s anticipation and produces the initial
step in a ritualized form (waiting for a response) to elicit Y.
The main point is that a behavior that was not at first a communicative
signal becomes one by virtue of the anticipations of the interactants
over time. There is very good evidence from a series of longitudinal
and experimental studies that chimpanzees do not learn their gestures
by imitating one another but, rather, by ritualizing them with one
another in this way (see Tomasello and Call 1997, for a review). This
means that chimpanzees use and understand their gestures as one-way
procedures for getting things done, not as intersubjectively shared,
bidirectional coordination devices or symbols. At least some support
for this hypothesis is also provided by the fact that young chimpanzees,
unlike human infants, do not spontaneously reverse roles when someone
acts on them and invites a reciprocal action in return; that is, they do
not engage in role reversal imitation of instrumental acts (Tomasello
and Carpenter in press).
In general, two decades of experimental research have demonstrated
conclusively that, among primates, human beings are by far the most
skilled and motivated imitators (see Tomasello 1996, for a review). More
controversially, I would claim that some types of imitative learning are
uniquely human, specifically those that require the learner to understand
the intentions of the actor, that is, not only the actor’s goal but also
his plan of action or means of execution for reaching that goal. When
the intentions are actually communicative intentions—involving the
embedding of one intention within another or the reversing of roles
within a communicative act—apes are simply, in my view, not capable
of either understanding or reproducing these. This means that their
communicative devices are not in any sense shared in the manner of
human communicative conventions such as pointing and language.
Shared Intentionality
So why don’t apes point? I have given here more or less five fundamental
reasons:
they do not understand communicative intentions
they do not participate in joint attentional engagement as common communicative ground within which deictic gestures are meaningful
they do not have the motives to help and to share
they are not motivated to inform others of things because they cannot determine what is old and new information for them (i.e., they do not really understand informing, per se)
they cannot imitatively learn communicative conventions as inherently bidirectional coordination devices with reversible roles
And so the obvious question is: are these really five different reasons, or
are they all part of one or a few more fundamental reason(s)?
My proposal here is that all of these reasons are basically reflections of
the more fundamental fact that only humans engage with one another
in acts of what some philosophers of action call shared intentionality,
or sometimes “we” intentionality, in which participants have a shared
goal and coordinated action roles for pursuing that shared goal (Bratman
1992; Clark 1996; Gilbert 1989; Searle 1995; Tuomela 1995). The activity
itself may be complex (e.g., building a building, playing a symphony)
or simple (e.g., taking a walk together, engaging in conversation), so
long as the interactants are engaged with one another in a particular
way. In all cases the goals and intentions of each interactant must
include as content something of the goals and intentions of the other.
When individuals in complex social groups share intentions with
one another repeatedly in particular interactive contexts, the result is
habitual social practices and beliefs that sometimes create what Searle
(1995) calls social or institutional facts: such things as marriage, money,
and government, which only exist because of the shared practices and
beliefs of a group.
In my previous approach to these problems (e.g., Tomasello 1999),
I hypothesized that only human beings understand one another as
intentional agents—with goals and perceptions of their own—and
this is what accounts for many uniquely human social cognitive skills,
including those of cultural learning and conventional communication,
that would seem to involve one or another form of shared intentionality.
We now have data, however, that have convinced me that at least some
great apes do understand that others have goals and perceptions (not, by
the way, thoughts and beliefs), as summarized by Tomasello et al. (2003).
The details of these data do not concern us here, but the immediate
theoretical problem is how we should account for uniquely human
cultural cognition, as we sometimes call it, if not by humans’ exclusive
ability to understand others intentionally.
Tomasello et al. (in press) present a new proposal that identifies the
uniquely human social cognitive skills not as involving the understanding
of intentionality simpliciter, but as involving the ability
to create with others in collaborative interactions joint intentions
and joint attention (which in the old theory basically came for free
once one understood others as intentional agents). These basic skills
of shared intentionality involve both a new motivation for sharing
psychological states, such as goals and experiences, with conspecifics,
and perhaps as well new forms of cognitive representation (what we
call dialogic cognitive representations) for doing so. Evolutionarily (see
also Boyd this volume), the proposal is that individual humans who
were especially skilled at collaborative interactions with others were
adaptively favored, and the requisite social–cognitive skills that they
possessed were such that, at some point, the collaborative interactions
in which they engaged became qualitatively new—they became
collaborative interactions in which individuals were able to form a
shared goal to which they jointly committed themselves. Following
Bratman (1992), such shared intentional activities, as he calls them,
also involve understanding others’ plans for pursuing those joint goals
(meshing subplans), and even helping the other in his role if this is
needed. There is basically no evidence from any nonhuman animal
species of collaborative interactions in which different individuals play
different roles that are planned and coordinated, with assistance from
the other as needed.2
Tomasello et al. (in press) take a very close look at human infants
from this point of view and find that whereas infants of nine months
of age can coordinate with adults in some interesting ways that might
reflect an initial ability to form joint goals—such things as rolling a ball
back and forth or putting away toys together—it is at around 12 to 14
months of age that full-fledged shared intentionality seems to emerge. It
is at this age that infants for the first time seem truly motivated to share
experience with others through declarative and informative pointing,
that they encourage others to play their role when a collaborative
interaction breaks down, that they can reverse roles in collaborative
interactions, and that they start to acquire linguistic conventions.
So the specific proposal here—with regard to the question of why
human infants point but other apes do not—is that only humans have
the skills and motivations to engage with others collaboratively, to
form with others joint intentions and joint attention in acts of shared
intentionality. The constitutive motivations are mainly helping and
sharing, which obviously (and as argued above) are an important part
of indicating acts such as pointing. Understanding and coordinating
with others’ plans toward goals is in general a necessary part of human
communication, understood as joint action (Clark 1996). Reversing
roles is a very important part of these collaborative interactions, and it
is likely that the understanding of perspectives is simply the perceptual–
attentional side of such role reversal (Barresi and Moore 1996). And so,
although we certainly do not have all the details worked out at the
moment, it would seem a plausible suggestion that uniquely human forms
of communication, including both nonlinguistic and linguistic
conventions, rest fundamentally on a foundation of uniquely human
forms of collaborative engagement involving shared intentionality.
Conclusion
To explain human cognitive uniqueness, many theorists invoke
language. This contains an element of truth, because only humans use
language and it is clearly important to, indeed constitutive of, uniquely
human cognition in many ways. However, as I have noted before, asking
why only humans use language is like asking why only humans build
skyscrapers, when the fact is that only humans, among primates, build
freestanding shelters at all. And so for my money, at our current level of
understanding, asking why apes do not have language may not be our
most productive question. A much more productive question, and one
that can currently lead us to much more interesting lines of empirical
research, is why apes do not even point.
Notes
1. There is actually one reported incident of a bonobo pointing for
conspecifics in the wild (Veà and Sabater-Pi 1998). This has never been repeated
by any other observers of bonobos or other ape species. There have also been
suggestions in the past that apes point with their whole body (Menzel 1971),
or just with their eyes (de Waal 2001), but these have never been substantiated
as anything more than personal impressions.
2. The most complex cooperative activity of chimpanzees is group
hunting, in which two or more males seem to play different roles in corralling
a monkey (Boesch and Boesch 1989). But in analyses of the sequential
unfolding of participant behavior over time in these hunts, many observers
have characterized this activity as essentially identical to the group hunting
of other social mammals such as lions and wolves (Cheney and Seyfarth
1990; Tomasello and Call 1997). Although it is a complex social activity, as
it develops over time each individual simply assesses the state of the chase
at each moment and decides what is best for it to do. There is nothing that
would be called collaboration in the narrow sense of joint intentions and
attention based on coordinated plans. In experimental studies (e.g., Crawford
1937; Chalmeau 1994), the most complex behavior observed is something like
two chimpanzees pulling a heavy object in parallel, and during this activity
almost no communication among partners is observed (Povinelli and O’Neill
2000). There are no published experimental studies—and several unpublished
negative results (two of them ours)—in which chimpanzees collaborate by
playing different and complementary roles in an activity.
References
Barresi, J., and C. Moore. 1996. Intentional relations and social
understanding. Behavioral and Brain Sciences 19:107–154.
Bates, E., L. Camaioni, and V. Volterra. 1975. The acquisition of performatives
prior to speech. Merrill-Palmer Quarterly 21:205–224.
Behne, T., M. Carpenter, and M. Tomasello. in press. One-year-olds
comprehend the communicative intentions behind gestures in a hiding
game. Developmental Science.
Boesch, C., and H. Boesch. 1989. Hunting behavior of wild chimpanzees
in the Tai Forest National Park. American Journal of Physical Anthropology
78:547–573.
Bratman, M. E. 1992. Shared cooperative activity. Philosophical Review
101(2):327–341.
Call, J. in press. Inferences about the location of food in the great apes.
Journal of Comparative Psychology.
Call, J., B. Agnetta, and M. Tomasello. 2000. Social cues that chimpanzees
do and do not use to find hidden objects. Animal Cognition 3:23–34.
Call, J., and M. Tomasello. in press. What do chimpanzees know about
seeing revisited: An explanation of the third kind. In Issues in joint
attention, edited by N. Eilan, C. Hoerl, T. McCormack, and J. Roessler.
Oxford: Oxford University Press.
Chalmeau, R. 1994. Do chimpanzees cooperate in a learning task?
Primates 35:385–392.
Cheney, D. L., and R. M. Seyfarth. 1990. How monkeys see the world.
Chicago: University of Chicago Press.
Clark, H. 1996. Using language. Cambridge: Cambridge University Press.
Crawford, M. P. 1937. The cooperative solving of problems by young
chimpanzees. Comparative Psychology Monographs 14:1–88.
De Waal, F. 2001. Pointing primates: Sharing knowledge without
language. Chronicle of Higher Education, B7–B9.
Dunbar, R. 1996. Grooming, gossip and the evolution of language. London:
Faber and Faber.
Franco, F., and G. Butterworth. 1996. Pointing and social awareness:
Declaring and requesting in the second year. Journal of Child Language
23:307–336.
Gilbert, M. 1989. On social facts. Princeton: Princeton University Press.
Golinkoff, R. 1993. When is communication a meeting of the minds?
Journal of Child Language 20:199–208.
Goodall, J. 1986. The chimpanzees of Gombe: Patterns of behavior.
Cambridge, MA: Harvard University Press.
Hare, B., J. Call, B. Agnetta, and M. Tomasello. 2000. Chimpanzees know
what conspecifics do and do not see. Animal Behaviour 59:771–785.
Hare, B., J. Call, and M. Tomasello. 2001. Do chimpanzees know what
conspecifics know? Animal Behaviour 61:139–151.
Hare, B., and M. Tomasello. 2004. Chimpanzees are more skillful in
competitive than in cooperative cognitive tasks. Animal Behaviour
68:571–581.
Itakura, S., B. Agnetta, B. Hare, and M. Tomasello. 1999. Chimpanzees
use human and conspecific social cues to locate hidden food.
Developmental Science 2:448–456.
Kaminski, J., J. Call, and M. Tomasello. 2004. Body orientation and face
orientation: Two factors controlling apes’ begging behavior from
humans. Animal Cognition 7:216–223.
Lambrecht, K. 1994. Information structure and sentence form. Cambridge:
Cambridge University Press.
Leavens, D. A., and W. D. Hopkins. 1998. Intentional communication
by chimpanzees (Pan troglodytes): A cross-sectional study of the use
of referential gestures. Developmental Psychology 34:813–822.
Liszkowski, U., M. Carpenter, A. Henning, T. Striano, and M. Tomasello.
2004. 12-month-olds point to share attention and interest.
Developmental Science 7:297–307.
Liszkowski, U., M. Carpenter, and M. Tomasello. in press. 12-month-olds
point to inform others. Journal of Cognition and Development.
Menzel, E. W., Jr. 1971. Communication about the environment in a
group of young chimpanzees. Folia Primatologica 15:220–232.
Moll, H., C. Coring, M. Carpenter, and M. Tomasello. in press. Infants
follow attention to aspects of objects. Journal of Cognition and
Development.
Povinelli, D. J., and D. K. O’Neill. 2000. Do chimpanzees use their gestures
to instruct each other? In Understanding other minds: Perspectives from
autism, 2nd edition, edited by S. Baron-Cohen, H. Tager-Flusberg, and
D. J. Cohen, 459–487. Oxford: Oxford University Press.
Savage-Rumbaugh, S. 1990. Language as a cause-effect communication
system. Philosophical Psychology 3:55–76.
Schwe, H., and E. Markman. 1997. Young children’s appreciation of
the mental impact of their communicative signals. Developmental
Psychology 33:630–635.
Searle, J. 1995. The construction of social reality. New York: Free Press.
Sperber, D., and D. Wilson. 1986. Relevance: Communication and cognition.
Cambridge, MA: Harvard University Press.
Tomasello, M. 1996. Do apes ape? In Social learning in animals: The
roots of culture, edited by B. G. Galef and C. Heyes, 319–346. New York:
Academic Press.
Tomasello, M. 1999. The cultural origins of human cognition. Cambridge,
MA: Harvard University Press.
Tomasello, M. 2003a. Constructing a Language: A Usage-Based Theory of
Language Acquisition. Cambridge, MA: Harvard University Press.
Tomasello, M. 2003b. The pragmatics of primate communication.
In Handbook of Pragmatics, edited by J. Verschueren. Amsterdam:
Benjamins.
Tomasello, M., and J. Call. 1997. Primate Cognition. Oxford: Oxford
University Press.
Tomasello, M., J. Call, and A. Gluckman. 1997. The comprehension
of novel communicative signs by apes and human children. Child
Development 68:1067–1081.
Tomasello, M., J. Call, and B. Hare. 1998. Five primate species follow
the visual gaze of conspecifics. Animal Behaviour 55:1063–1069.
Tomasello, M., J. Call, and B. Hare. 2003. Chimpanzees understand
psychological states: The question is which ones and to what extent.
Trends in Cognitive Sciences 7:153–156.
Tomasello, M., J. Call, K. Nagell, R. Olguin, and M. Carpenter. 1994.
The learning and use of gestural signals by young chimpanzees: A
trans-generational study. Primates 37:137–154.
Tomasello, M., J. Call, J. Warren, T. Frost, M. Carpenter, and K. Nagell.
1997. The ontogeny of chimpanzee gestural signals: A comparison
across groups and generations. Evolution of Communication 1:223–253.