
The Extension of Biology Through Culture
Edited by Andrew Whiten, Francisco J. Ayala, Marcus W. Feldman,
and Kevin N. Laland

Arnold and Mabel Beckman Center
Irvine, CA
November 16–17, 2016


This work is reprinted from the Proceedings of the National Academy of
Sciences of the United States of America, vol. 114, no. 30, pp. 7734–7737 and
pp. 7775–7922, July 25, 2017, and includes articles from the Arthur M.
Sackler Colloquium The Extension of Biology Through Culture, held
November 16–17, 2016, at the Arnold and Mabel Beckman Center in
Irvine, CA. The articles appearing in these pages were contributed by
speakers and attendees at the colloquium and were anonymously peer
reviewed. Any opinions, findings, conclusions, or recommendations
expressed in this work are those of the authors and have not been
endorsed by the National Academy of Sciences.

The National Academy of Sciences is a private, nonprofit, self-perpetuating
society of distinguished scholars engaged in scientific and engineering
research, dedicated to the furtherance of science and technology and to
their use for the general welfare. Upon the authority of the charter granted
to it by the US Congress in 1863, the Academy has a mandate that requires
it to advise the federal government on scientific and technical matters.

ISBN 10: 0-309-46316-5
ISBN 13: 978-0-309-46316-4

© Copyright by the National Academy of Sciences, USA
All rights reserved. Published 2017
Printed in the United States of America

National Academy of Sciences
Washington, DC
Arthur M. Sackler, M.D.
1913–1987

Born in Brooklyn, New York, Arthur M. Sackler was educated in the arts, sciences, and humanities at New York
University. These interests remained the focus of his life, as he became widely known as a scientist, art
collector, and philanthropist, endowing institutions of learning and culture throughout the world.
He felt that his fundamental role was as a doctor, a vocation
he decided upon at the age of four. After completing his
internship and service as house physician at Lincoln Hospital in
New York City, he became a resident in psychiatry at Creedmoor
State Hospital. There, in the 1940s, he started research
that resulted in more than 150 papers in neuroendocrinology,
psychiatry, and experimental medicine. He considered his
scientific research in the metabolic basis of schizophrenia his
most significant contribution to science and served as editor of
the Journal of Clinical and Experimental Psychobiology from 1950 to 1962. In 1960 he started
publication of Medical Tribune, a weekly medical newspaper that reached over one million
readers in 20 countries. He established the Laboratories for Therapeutic Research in 1938, a
facility in New York for basic research that he directed until 1983.
As a generous benefactor to the causes of medicine and basic science, Arthur Sackler built
and contributed to a wide range of scientific institutions: the Sackler School of Medicine
established in 1972 at Tel Aviv University, Tel Aviv, Israel; the Sackler Institute of Graduate
Biomedical Science at New York University, founded in 1980; the Arthur M. Sackler Science
Center dedicated in 1985 at Clark University, Worcester, Massachusetts; and the Sackler School
of Graduate Biomedical Sciences, established in 1980, and the Arthur M. Sackler Center for
Health Communications, established in 1986, both at Tufts University, Boston, Massachusetts.
His pre-eminence in the art world is already legendary. According to his wife Jillian, one of
his favorite relaxations was to visit museums and art galleries and pick out great pieces others
had overlooked. His interest in art is reflected in his philanthropy; he endowed galleries at the
Metropolitan Museum of Art and Princeton University, a museum at Harvard University, and
the Arthur M. Sackler Gallery of Asian Art in Washington, DC. True to his oft-stated
determination to create bridges between peoples, he offered to build a teaching museum in
China, which Jillian made possible after his death, and in 1993 opened the Arthur M. Sackler
Museum of Art and Archaeology at Peking University in Beijing.
In a world that often sees science and art as two separate cultures, Arthur Sackler saw them
as inextricably related. In a speech given at the State University of New York at Stony Brook,
“Some reflections on the arts, sciences and humanities,” a year before his death, he observed:
“Communication is, for me, the primum movens of all culture. In the arts. . . I find the emotional
component most moving. In science, it is the intellectual content. Both are deeply interlinked
in the humanities.” The Arthur M. Sackler Colloquia at the National Academy of Sciences pay
tribute to this faith in communication as the prime mover of knowledge and culture.
July 25, 2017 | vol. 114 | no. 30
PNAS
Proceedings of the National Academy of Sciences of the United States of America

The Extension of Biology Through Culture


www.pnas.org

The papers collected here result from the Arthur M. Sackler Colloquium of the National Academy of
Sciences, The Extension of Biology Through Culture. Complete information about this colloquium
and video recordings of most presentations are available on the NAS website at
http://www.nasonline.org/Extension_of_Biology_Through_Culture.

Contents

INTRODUCTION
7775 The extension of biology through culture
Andrew Whiten, Francisco J. Ayala, Marcus W. Feldman, and Kevin N. Laland

COLLOQUIUM PAPERS
7782 Cultural evolutionary theory: How culture evolves and why it matters
Nicole Creanza, Oren Kolodny, and Marcus W. Feldman
7790 Culture extends the scope of evolutionary biology in the great apes
Andrew Whiten
7798 Synchronized practice helps bearded capuchin monkeys learn to extend attention while learning a tradition
Dorothy M. Fragaszy, Yonat Eshchar, Elisabetta Visalberghi, Briseida Resende, Kellie Laity, and Patrícia Izar
7806 Older, sociable capuchins (Cebus capucinus) invent more social behaviors, but younger monkeys innovate more in other contexts
Susan E. Perry, Brendan J. Barrett, and Irene Godoy
7814 Gene–culture coevolution in whales and dolphins
Hal Whitehead
7822 Song hybridization events during revolutionary song change provide insights into cultural transmission in humpback whales
Ellen C. Garland, Luke Rendell, Luca Lamoni, M. Michael Poole, and Michael J. Noad
7830 Conformity does not perpetuate suboptimal traditions in a wild population of songbirds
Lucy M. Aplin, Ben C. Sheldon, and Richard McElreath
7838 A social insect perspective on the evolution of social learning mechanisms
Ellouise Leadbeater and Erika H. Dawson
7846 Cultural macroevolution matters
Russell D. Gray and Joseph Watts
7853 Pursuing Darwin’s curious parallel: Prospects for a science of cultural evolution
Alex Mesoudi
7861 Evolutionary neuroscience of cumulative culture
Dietrich Stout and Erin E. Hecht
7869 Identifying early modern human ecological niche expansions and associated cultural dynamics in the South African Middle Stone Age
Francesco d’Errico, William E. Banks, Dan L. Warren, Giovanni Sgubin, Karen van Niekerk, Christopher Henshilwood, Anne-Laure Daniau, and María Fernanda Sánchez Goñi
7877 Cumulative cultural learning: Development and diversity
Cristine H. Legare
7884 Young children communicate their ignorance and ask questions
Paul L. Harris, Deborah T. Bartz, and Meredith L. Rowe
7892 Changes in cognitive flexibility and hypothesis search across human life history from childhood to adolescence to adulthood
Alison Gopnik, Shaun O’Grady, Christopher G. Lucas, Thomas L. Griffiths, Adrienne Wente, Sophie Bridgers, Rosie Aboody, Hoki Fung, and Ronald E. Dahl
7900 How language shapes the cultural inheritance of categories
Susan A. Gelman and Steven O. Roberts
7908 Coevolution of cultural intelligence, extended life history, sociality, and brain size in primates
Sally E. Street, Ana F. Navarrete, Simon M. Reader, and Kevin N. Laland
7915 The evolution of cognitive mechanisms in response to cultural innovations
Arnon Lotem, Joseph Y. Halpern, Shimon Edelman, and Oren Kolodny

NEWS FEATURE
7734 Can animal culture drive evolution?
Carolyn Beans

Cover image: A juvenile capuchin monkey observes a skilled adult male eating a nut it has just broken using a hammerstone. Articles in the Sackler Colloquium on the Extension of Biology Through Culture explore social learning and cultural transmission in humans and nonhuman animals as well as the interplay between cultural and genetic evolution. See the Introduction to the Sackler Colloquium by Andrew Whiten et al. on pages 7775–7781. Image courtesy of Luca Antonio Marino (EthoCebus Project, Brazil).

COLLOQUIUM INTRODUCTION

The extension of biology through culture
Andrew Whiten (a,1), Francisco J. Ayala (b), Marcus W. Feldman (c), and Kevin N. Laland (d)

Biology is the study of life. How our understanding of the nature and evolution of living systems is being enriched and extended through new discoveries about social learning and culture in human and nonhuman animals is the subject of the collection of articles we introduce here.

Recent decades have revealed that social learning and the transmission of cultural traditions are much more widespread in the animal kingdom than earlier suspected, affecting numerous forms of functional behavior and creating a secondary form of evolution, built onto the better-known primary, genetically based form. New scientific approaches to the study of human cultural evolution have also emerged and become productive. However, these developments in the study of cultural phenomena in both human and nonhuman animals have yet to be seriously integrated into mainstream evolutionary biology. Here we offer an introductory overview of the background and scope of a collection of articles that report recent progress in these fields, and outline their proposed significance for biology at large.

The theoretical backbone of the life sciences, its central organizing principle, is of course evolution, by now rich in both theory and empirical support (1–3). The great synthesis of Darwin’s and Wallace’s evolutionary insights and early 20th century understanding of genetics that became known as the “Modern Synthesis” was achieved by a brilliant set of biologists mainly in the period 1938–1946 (4), and its principles have provided the core of evolutionary theory since that time (5). Thus, contemporary texts on “evolution” focus on such topics as mutation, genetically based inheritance, population genetics, genomics, and the natural and sexual selection pressures that shape gene frequencies, genotypes, and phenotypes (1, 2, 6, 7). Genes and their role in inheritance have come to be celebrated as the pivotal elements in evolution (8).

However, a second form of evolution was also recognized long ago, in the ways that cultural phenomena have changed in the course of human history, through a different form of inheritance: that in which people learn from others (social learning), including from previous generations. Darwin himself recognized the parallels between the evolution of culturally inherited languages and organic evolution (9, 10); indeed, evolutionary family trees of languages proposed by philologists long predated the Origin of Species, although they were further spurred by its publication (11–13).

During the 1970s and 1980s, first by Cavalli-Sforza and Feldman (14–16) and then Boyd and Richerson (17), the implications of the existence of the two forms of evolution, organic and cultural, were at last explored systematically and formally, through conceptual and mathematical modeling that formed a foundation for later empirical investigations. The present collection of papers opens with a contribution by Creanza et al. (18) that offers an overview of both the foundational studies in (human) cultural evolution and major developments in the period since. The early body of 20th century work laid out some of the ways in which cultural evolution: (i) echoes many core principles of organic evolution, yet (ii) also differs from it in dramatic ways that change evolutionary dynamics, and (iii) interacts with the genetically based phenomena to create new complexities (“gene–culture coevolution”). We return to discuss these further, below.

From a somewhat different perspective Maynard Smith and Szathmáry (19) distinguished a series of major transitions in the nature of evolution, such as the emergence of multicellularity and of sex, the most recent major transition being the emergence of (human) culture; and Dawkins (20) gave a name, “memes,” to cultural elements suggested to be the analogs of genetic replicators, a name that has been assimilated into popular culture. Other authors suggested “semes” (21), echoing semiotics, the study of signs and symbols.

We shall discuss such developments and subsequent related scientific progress further below, but for

(a) Centre for Social Learning and Cognitive Evolution, School of Psychology and Neuroscience, University of St. Andrews, St. Andrews KY16 9JP,
United Kingdom; (b) Department of Ecology and Evolutionary Biology, University of California, Irvine, CA 92697; (c) Department of Biology,
Stanford University, Stanford, CA 94305; and (d) Centre for Social Learning and Cognitive Evolution, School of Biology, University of St. Andrews,
St. Andrews KY16 9JP, United Kingdom
This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held
November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete
program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: A.W., F.J.A., M.W.F., and K.N.L. wrote the paper.
The authors declare no conflict of interest.
(1) To whom correspondence should be addressed. Email: a.whiten@st-andrews.ac.uk.

www.pnas.org/cgi/doi/10.1073/pnas.1707630114 PNAS | July 25, 2017 | vol. 114 | no. 30 | 7775–7781


the moment we make one observation: all of these writings on culture were focused on a single species, our own. That other species might exhibit the core properties of cultural transmission, and that this was worthy of scientific investigation, began to be recognized in a number of different lines of animal behavior research only around the middle of the last century. Moreover, evidence for animal social learning and tradition started to become substantial only in the most recent decades, and thus was not only unremarked in the Modern Synthesis but gained only minimal mention in the foundational works of cultural evolution (14, 17). The earliest reports for nonhuman animals (henceforth “animals”) were of novel traditions arising and spreading, notably bottle-top opening to drink milk by titmice (22) and washing food items in the sea by Japanese monkeys, referred to conservatively at the time as “pre-cultures” (23). An additional revelation was the discovery of bird song learning and the existence of local song dialects (24). Since these foundational studies, there has been a proliferation of studies documenting social learning and traditions in animals (25, 26), often referred to as “animal culture” (27).

The Discovery of Widespread Animal Culture

Research over the last half-century has led to the revelation that learning from others (social learning) is widespread in the animal kingdom and spans a great range of important functional contexts, including diet, feeding techniques, travel route selection, predator avoidance, vocal communication, migration, and mate and breeding site choices (26, 28). Hundreds of laboratory experimental studies have demonstrated social learning and transmission in a wide variety of animals. Social learning is now extensively documented in mammals (29), with a particular intensity of research studies in primates (30–33) and cetaceans (34–37), in birds (38–41), in fish (42), and in insects (43, 44). The fact that social learning has been shown to play important roles spanning multiple functional contexts (25–28) suggests that many animals are not simply acquiring one or a few behavioral patterns socially, but rather that social learning is central to their acquisition of adaptive behavior.

Social learning may lead to the spread of a behavior to other individuals, which is what defines cultural transmission, and the establishment of traditions that come to characterize whole groups, subgroups, or populations. However, social learning may also be transient, and lead to no such substantial population-level effects; for example, a monkey may learn from others that a particular tree is in fruit or that a snake is in a particular bush, which shapes its behavior for only a short period thereafter. Evidence for social learning, so widespread in animals, is thus not sufficient to demonstrate the larger phenomena of traditions and culture. Nevertheless, cultural diffusion has been further documented not only in all of the vertebrate research reviews listed above, but also in insects, at least in the laboratory. For example, Alem et al. (45) trained demonstrator bumble bees to pull a small piece of string to access an artificial flower providing food, and experiments then showed that many other bees that observed them acquired the novel technique, with this learning not shown by control bees that had no model from which to learn. Moreover, other bees learned from the earliest learners, and transmission across several such “cultural generations” was demonstrated. The extent to which this kind of transmission occurs in the wild is now a question that demands focused study.

In vertebrates, plentiful evidence exists that such cultural transmission occurs in nature, with substantial and indeed striking impact. For example, archaeological excavations have shown remains of nut-cracking materials dated to 4,300 y below the surface of the Taï Forest, where modern chimpanzees continue this practice, absent in most parts of Africa (31, 46); 27 y of observations have documented the spread from a few to over 600 humpback whales of a new form of hunting technique, lob-tail fishing (47); and new humpback songs have been found to emerge, spread quickly to whole populations, and pass in waves across the Pacific in consecutive years (36, 37).

That it has taken so long to begin to discover the broad scope of animal culture has several possible explanations. One important source of the progress made has been the achievement of long-term field studies. For example, just a half-century ago researchers knew next to nothing about the behavior of our closest primate relatives in the wild, but in recent times it has been possible to assemble data on each of several species from multiple, decades-long field studies to discover putative cultural differences amenable to more focused study, as now achieved for different genera of the great apes (31). The example of the long-term observations that identified the spread of the lob-tail hunting technique in humpback whales has already been mentioned (47). That study also illustrates the application of increasingly sophisticated statistical analyses to identify social learning through the way in which a novel behavior spreads along the lines of social networks [network-based diffusion analysis (26)], an approach now applied in other contexts, such as tool use in chimpanzees (48). This is one among a growing armory of such statistical approaches bearing fruit in the discipline (26).

The most compelling evidence of social learning comes from experiments, the most basic (but powerful) forms of which involve comparing an experimental condition in which animals have witnessed a trained model complete some novel task, with a control condition lacking a model, or comparing two experimental conditions in which models display different task solutions. Dyadic studies of this kind have a century-long history, but in recent decades modifications have been made to allow testing for the more culturally relevant phenomenon of the diffusion of behaviors across multiple individuals or even between groups. Several different experimental designs have since identified diffusion through social learning, both in laboratory and field conditions, and in an accelerating number and diversity of mammalian, avian, piscine, and insect species (49, 50).

However, this growing array of widespread cultural phenomena in animals appears to have gone largely unrecognized in mainstream texts on evolution (e.g., refs. 1, 6, 7, 19) or receives only minimal mention (2). One major goal of the present PNAS issue is to illustrate the scope of the findings on animal culture that now merit integration into evolutionary biology at large.

Building on a recent book-length review making the case that culture pervades numerous aspects of the life of whales and dolphins (34), two papers in the present collection describe recent progress in studies of these cetaceans. Garland et al. (37) focus on the transmission of complex song in humpback whales, presenting a painstaking and highly illuminating analysis of the process of song hybridization during the remarkable periods of “cultural revolution” found in this species. Behavioral hybridization has often been highlighted as a phenomenon differentiating cultural from genetically based evolution, although hybridization is far from unknown at the species level and recent research increasingly suggests that it has been much more common than previously suspected at the level of genetic transfer (51). The new humpback data reveal two different, specific ways in which



hybridization occurs, involving the application of systematic structural rules that the authors propose are similar to (and provide additional insights into) those identified in both birdsong and human language. Whitehead (35) goes on to review the evidence for gene–culture coevolution in both whales and dolphins, appraising the evidence that matrilineal cultural inheritance has been particularly influential in creating ecologically specialized communities in species, such as killer whales, in turn explaining low diversity in matrilineal mitochondrial DNA and regional variation in mitochondrial DNA haplotype distribution. We return to this study in addressing the issue of gene–culture coevolution further below.

Cultural transmission has long been recognized in the sphere of birdsong (24, 38) but birds have tended to be seen as “one-trick (cultural) ponies” in this respect. However, recent studies have identified cultural transmission across a much broader span of behavior (39, 40, 52, 53). A striking case is the high-fidelity spread of alternative foraging behaviors experimentally seeded in substantial communities of great tits (40). Here, Aplin et al. (41) show that these behaviors will evolve adaptively as payoffs change, and the authors present evidence that this occurs through an intriguing combination of conformist social learning and payoff-sensitive individual learning.

Recent evidence that insects also show not only social learning, but a capacity for cultural transmission spreading across communities, is reviewed here by Leadbeater and Dawson (44), leading them to appraise the potential consequences for the evolution of learning processes and the brain. The authors conclude that “Social insects are distant relatives of vertebrate social learners, but the research we describe highlights routes by which natural selection could coopt similar cognitive raw material within the animal kingdom.”

Primates have long been at the forefront in research on animal culture (27). In the present collection Whiten (31) reviews the diversity of complementary observational and experimental evidence for social learning and multiple-tradition cultures in the great apes. Although ape (and other nonhuman) culture does not encompass the elaborate levels of cultural evolution evident in humans, Whiten concludes that the accumulated evidence now exists for the principal implications of culture for evolutionary biology alluded to earlier: cultural evolution displays a number of properties evident in genetic evolution but through the different means of social learning. The interactions of genetic and cultural transmission are evolutionarily consequential.

Cultural evolution rests upon the complementary processes of innovation and selective transmission, but concentration of research to date on testing primates’ capacity for transmission has arguably resulted in neglect of the innovation (“mutation”) element, which is less susceptible to experimental manipulation in the laboratory, and challenging to record in the wild. Here Perry et al. (33) report on 10 y of observations of over 200 capuchin monkeys (Cebus capucinus) in 10 groups, with research explicitly focused on innovations as well as their transmission. Distinguishing four main functional categories of behavior, these researchers report 17 innovations in foraging and drinking, 9 in hygienic and other self-directed behaviors, 53 classed as investigative, and 49 as social behaviors. Just 21% of these large totals were picked up by others, indicating marked selectivity, which these authors dissect further. This research begins to identify the Darwinian processes of variation and selective transmission underlying cultural transmission in nonhuman species. As noted above, these processes can continue throughout life, contrasting with the genetic package transmitted at conception; this is the focus of a third primate study, on a different genus of capuchin monkey (Sapajus libidinosus) renowned for their tool-assisted nut-cracking behavior. Fragaszy et al. (32) trace the acquisition of this skill through the course of development over the several years needed to acquire competence, revealing the complex cycles of practice and attention to experts’ nut-cracking, with adults’ behavior helping to focus attention on rare elements of the skill that are critical to success.

Human Culture Is Special

The present issue includes several papers on a single species: our own. This focus clearly does not correspond with any proper proportional representation among animal species; the explanation is simply that the scope and penetration of culture are exceptional in humans (54–57), and because of this, human culture extends biology in many additional and extraordinary ways. Indeed, some of the consequences of human culture, like the destruction of other species’ habitats, climate change, and pollution, are already having (and in many cases have already had) major effects on the evolution, distribution, and extinction of major segments of the world’s biota (58, 59).

Studies of human cultures have been pursued by an even greater diversity of approaches than those sketched for animal culture above. There is of course a whole discipline of social and cultural anthropology for which, as the name implies, the target of study is culture, and this has often striven for forms of participant observation extending to self-immersion in different cultures, an approach actively avoided by most students of primate behavior, and often not even thinkable for more distantly related species. In contrast to common approaches in cultural anthropology, those working within the field of cultural evolution have created an array of different approaches and methods that are often more conventionally scientific. These include a greater emphasis on such elements as formal and mathematical modeling, quantification and statistical analysis of numerical data, hypothesis testing, and systematic experimentation (18, 25, 56, 60–63). There is not the space here to offer anything like a comprehensive review of the resulting discoveries, but we can outline something of their range, with selected illustrations.

Those studies that examine the earliest evidence for human culture have shown a continuing trend for markers of change to be found at ever earlier dates. For example, the earliest evidence of stone tool use has recently been pushed back from 2.6 to 3.4 million y ago (64), roughly the half-way point since our shared ancestry with Pan. Here, Stout and Hecht (65) develop models of early lithic culture that integrate its distinctive human elements and primate foundations, both behavioral and neural. Similarly, the beginnings of what has been labeled “symbolic culture,” indexed by such features as decorative items, like beads, have been pushed back from the previous “cave art” dates of around 30 ka to closer to 100 ka, or in the case of some elements like ochre, to even earlier dates, through a diversity of striking archaeological finds (66). Here, d’Errico et al. (67) provide a rich and detailed account of the ways in which cultural repertoires and their associated ecological niches differentiated and evolved in these periods.

Within historical times, the records of human culture have become amenable to some of the systematic and quantitative methods developed within evolutionary biology to reconstruct phylogenetic relationships at scales ranging from macroevolution to finer-grained speciation patterns (68). Such approaches have

been particularly powerfully applied to the differentiation and evolution of language groups (68, 69), but also to such diverse topics as the evolution of socio-political organization (70) and folk tales (71). Here, Gray and Watts (60) apply this approach to the evolution of religion, using this example to explore the analysis of cultural macroevolution.

As in the capuchin study of Fragaszy et al. (32), the psychological and social processes that allow human culture to be so distinctive need to be examined as individuals’ life histories unfold, and the affordances of the culture in which they develop are selectively assimilated and further modified. On the one hand, these processes are part of our species’ biology, their properties shaped during the millennia of evolutionary time over which our ancestors became increasingly and intensively dependent on cumulative cultural inheritance (54, 56, 57). In turn, these unique cultural processes operating in humans generate forms of life not hitherto witnessed in the natural world. To highlight and dissect some of these special cultural phenomena, we include in this issue four contributions that share a focus on ontogenetic development.

Legare (72) provides an overview of core features of human development that facilitate the adaptive transmission and refinement of culture, including the concept of “natural pedagogy,” whereby adults provide active support to cultural assimilation and children are predisposed to recognize and respond to this in particular ways, such as selective and discriminating copying (for example with respect to alternative cultural models), conformity, including the recognition of norms, and innovative flexibility. Other, complementary contributions in the present collection focus on more specific topics, including the active role that children come to play in recognizing their ignorance, as well as their knowledge, and systematically seeking information to remedy this (73); the ways in which related hypothesis testing changes through the long period of human development in relation to the stage of cognitive development and socio-ecological context (74); and the significance of language as both a product and medium of culture, illustrated by the linguistic labels and generics that provide spe-

nonhuman animals (79) to “rational imitation” (80) and “over-imitation” (81) in children. The contributions to this issue address all of these prospects conceptually and empirically in diverse and important ways. Here, we offer a brief introductory overview of the background foundations to these new explorations and updated reviews.

Cultural Phenomena Create a New Form of Evolution. The core of adaptive evolution through natural selection involves the triad of variation in characters, competitive selection of the best adapted to current circumstances among them (“survival of the fittest,” although relative reproductive success is what ultimately counts), and inheritance of those selected characters by descendants (82). Interwoven in cycles of these processes are three additional principles, notably the refinement of adaptations suited to the properties of ecological niches, the accumulation of complexes of these, and differentiation of descendant populations where they are sufficiently separated, for example by geography, ultimately leading to speciation. The latter three effects are manifested in the picture of organic evolution with which we are familiar, involving a broadly progressive complexification in life, from early bacteria to the sophisticated animals of today, and a vast diversity of living species, all displaying a remarkable fit to the ecological niche they so successfully inhabit. Current thinking in cultural evolution suggests that all these principles are active also in human cultural evolution (14, 16, 56, 61). Social learning and transmission provide the inheritance element and human invention the variants, the most successful of which are transmitted to future generations, generating cultural adaptations to environments around the world; and progressive, cumulative cultures show immense regional differentiation. Empirical evidence in support of these contentions has accumulated over recent decades, reviewed for example in refs. 61, 62, 78, 83, and 84, and is pursued further in the present collection (18, 63).

Such questions about cultural evolution have remained little
studied in the animal culture literature, which has instead been
cial forms of both the transmission fidelity and affordance for in-
focused on the more fundamental matter of establishing what
novation that permit cumulative culture (75).
cultural phenomena exist in a diversity of species, and what
How Culture Extends Biology transmission processes underpin these (25, 27, 34). Initial explo-
How the existence of culture extends our understanding of the rations of Darwinian dynamics in the case of animal culture (53)
scope and nature of living systems and their evolution was initially have taken the list of eight key properties extracted from the
analyzed in three major respects (14, 15, 17). First, cultural phe- Origin of Species (9) for testing with human data [the six listed
nomena provide a second inheritance system (76) built on the above, plus changes of function and convergent evolution (83)]
foundations of the primary, genetically based system, and this can and through examining studies of animal culture, concluded there
generate a second form of evolution in the sphere of culturally is evidence for all of them (although minimal and slow-developing
transmitted behaviors and artifacts. Second, because cultural compared with the most recent, cumulative cultures of humans).
transmission is mechanically different from genetic transmission in However, there is evidence that animal traditions with suboptimal
particular ways, such as horizontal diffusion among nonrelatives, it payoffs are sometimes, although seemingly not always (85), vul-
can have new and drastically different evolutionary consequences nerable to decay (41), implying the working of the core Darwinian
(62). Third, the two systems may interact in complex ways, the triad, and it seems likely that those animals for which there is now
phenomenon of gene–culture coevolution (16, 56, 77, 78). To evidence of multiple-tradition cultures are the descendants of
these three we can now add two other important dimensions. One lines of ancestors among whom these traditions were pro-
is that the accumulating evidence that social learning and cultural gressively added, surviving through their success, as in the case of
transmission are much more widespread and consequential over 4,300-y-old nut-cracking in chimpanzees, mentioned earlier
across the animal kingdom than earlier suspected, extends much (46). Nonetheless, experimental studies, for example, of mate-
more broadly the implications of the three effects outlined above, choice copying, show that animal social transmission can be
which were originally conceived with a focus on human culture. A evolutionarily consequential, even if short lived (86).
second is that studies increasingly dissect and delineate the In any case, it is becoming apparent that cultural phenomena
richness of the consequences of cultural evolution, and resulting play an important role in shaping many species’ adjustment to and
diversification of life forms. Examples of recent such discoveries exploitation of their environments, with likely significant evolu-
range from the elucidation of functional forms of teaching in tionary consequences that are the focus of current research.

7778 | www.pnas.org/cgi/doi/10.1073/pnas.1707630114 Whiten et al.


Cultural Evolution Includes Characteristics Absent in Genetically Based Evolution. Cultural evolution may display analogs of organic evolution outlined above, but it is also different in many fundamental respects, further extending the scope of the evolutionary processes that shape biological systems (14, 17). Notably, transmission is not only vertical, as in genetic inheritance from parent to offspring, but can be horizontal, between unrelated peers, or oblique, from unrelated individuals in the parental generation (14); moreover, because this involves neurally based learning rather than genetic change, such transmission can be quite rapid, as well illustrated by the case of “revolutions” in the songs of humpback whales, which may change annually yet quickly come to be sung by whole populations (37), and in a variety of human and animal cases further explored in this issue.

Furthermore, unlike the genetically packaged adaptive information inherited at conception, social information may be gathered throughout ontogeny and indeed across the lifespan, and in interaction with individual learning and practice, it can thus permit iterative and flexible forms of adaptation as circumstances change. Here, this is illustrated in analyses of extended ontogeny of difficult skills, like nut-cracking in primates (32, 48). Moreover, even innovations, the analog of mutations that become subject to selective adoption and further transmission to others, may be far from random; instead, they are often immediately functional, most clearly in the example of intelligent, goal-oriented human inventions, but also in the closest animal counterparts.

The social transmission process may itself be adaptively shaped by different biases in what is selectively assimilated, variously referred to as transmission biases (17) or social learning strategies (87). Examples include biases to copy behavioral routines where there is evidence they are successful, conformist copying of the majority (exploiting “the wisdom of the crowd”), and indirect biases, such as copying individuals on the basis of their reputation or group identity. Evidence for an array of such biases has accumulated in studies of both human and animal cultural transmission (88) and is further addressed in this collection (48, 72, 73).

Gene–Culture Coevolution. Empirical evidence for cultural practices creating selection pressures that feed back to affect biological evolution has been known for some time in the human case (17). Such ideas reach back further to the Baldwin effect, which proposed that a measure of plasticity in animals’ adjustment to their worlds during their lifetimes, including by learning, could create selection pressures for corresponding organic change (89), as well as to later notions of “behavioral drive” and “cultural drive” (90, 91) and niche construction (92). However, these ideas are becoming much refined and supported by extensive data in the age of genomics (93, 94).

Using a diversity of evidence from archaeology to neuroscientific investigations, Stout and Hecht (65) analyze how the cultures of the Stone Age shaped brain size and structure to create the cognitive and manual skills required for the evolution of greater sophistication in tool making. This kind of feedback may have been in existence for a very long time. Following a comparative phylogenetic analysis of primate brain and behavior data, Street et al. (95) conclude that cultural processes may have generated such selective feedback (see also ref. 31). Their analyses suggest that both brain expansion and high reliance on culturally transmitted behavior coevolved with sociality and extended lifespan in primates. This coevolution is consistent with the hypothesis that the evolution of large brains, sociality, and long lifespans has promoted reliance on culture, with reliance on culture in turn driving additional increases in brain volume, cognitive abilities, and lifespans in some primate lineages. Lotem et al. (96) further describe an explicit model that accommodates the shaping of cognition by culture, from basic building blocks of learning and data acquisition to phenomena such as language and tool use. The authors illustrate how learning and cognition will evolve in response to human cultural activities.

It has been proposed that cultural differentiation between groups can have knock-on effects on genetic differences, in the case of birdsong leading to segregated communities between which courtship and mating break down, ultimately leading to speciation (97). The most comprehensive analyses of such effects in cultural evolution among animals have been in whales and dolphins (34, 35). In some species, different migratory routes appear to be culturally transmitted from mothers to calves, thence reflected in diverging genetic make-ups. Most striking is the case of the differentiation of killer whale ecotypes characterized by alternative hunting strategies (targeted at seals vs. salmon and other fish, for example), song types, and residence patterns, that are proposed to be responsible for anatomical changes, such as different jaw types suited to alternative prey (35, 98).

The Present Issue: The Extension of Biology Through Culture

The papers that follow in this collection address the multiple topics alluded to in the introduction above. Papers in the earlier parts of the collection have a predominant focus on studies of animals, and the remainder a focus on the human case. These are sandwiched between an opening paper coauthored by one of the founders of the subject of cultural evolution, which offers an overview of core cultural evolution theory and empirical findings across human demography, population dynamics, and ecology (18), and a final complementary overview appraising progress made and prospects for the future of these endeavors (63).

Acknowledgments

The authors acknowledge the supplementary financial support to the Sackler Colloquium of November 2016, provided by the John Templeton Foundation.

1 Barton N, et al. (2007) Evolution (Cold Spring Harbor Lab Press, Cold Spring Harbor, NY).
2 Futuyma DJ (2013) Evolution (Sinauer Associates, Sunderland, MA), 3rd Ed.
3 Losos JB (2013) Princeton Guide to Evolution (Princeton Univ Press, Princeton, NJ).
4 Huxley JS (1942) Evolution, the Modern Synthesis (Allen & Unwin, London).
5 Mayr E (1982) The Growth of Biological Thought: Diversity, Evolution and Inheritance (Harvard Univ Press, Cambridge, MA).
6 Mayr E (2002) What Evolution Is (Weidenfeld & Nicolson, London).
7 Ridley M (2004) Evolution (Blackwell, Cambridge, MA), 3rd Ed.
8 Laland K, et al. (2014) Does evolutionary theory need a rethink? Nature 514:161–164.
9 Darwin C (1859) On the Origin of Species by Means of Natural Selection (Murray, London).
10 Darwin C (1871) The Descent of Man and Selection in Relation to Sex (Murray, London).
11 Jones W (1798) The third anniversary discourse, delivered 2nd February 1786: On the Hindus. Asiatick Researches 1:415–431.

Whiten et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7779
12 Schleicher A (1850) Linguistische Untersuchungen. 2. Teil: Die Sprachen Europas in systematischer Übersicht (HB König, Bonn).
13 Schleicher A (1869) Darwin Tested by the Science of Language (JC Hotten, London).
14 Cavalli-Sforza LL, Feldman MW (1973) Cultural versus biological inheritance: Phenotypic transmission from parents to children. (A theory of the effect of parental
phenotypes on children’s phenotypes.). Am J Hum Genet 25:618–637.
15 Cavalli-Sforza LL, Feldman MW (1981) Cultural Transmission and Evolution: A Quantitative Approach (Princeton Univ Press, Princeton, NJ).
16 Feldman MW, Cavalli-Sforza LL (1976) Cultural and biological evolutionary processes, selection for a trait under complex transmission. Theor Popul Biol
9:238–259.
17 Boyd R, Richerson P (1985) Culture and the Evolutionary Process (Univ of Chicago Press, Chicago).
18 Creanza N, Kolodny O, Feldman MW (2017) Cultural evolutionary theory: How culture evolves and why it matters. Proc Natl Acad Sci USA 114:7782–7789.
19 Maynard-Smith J, Szathmary E (1995) The Major Transitions in Evolution (Freeman, Oxford).
20 Dawkins R (1976) The Selfish Gene (Oxford Univ Press, Oxford).
21 Hewlett BS, de Silvestri A, Guglielmino CR (2002) Genes and semes in Africa. Curr Anthropol 43:313–321.
22 Fisher J, Hinde RA (1949) The opening of milk bottles by birds. Br Birds 42:347–357.
23 Kawai M (1965) Newly acquired pre-cultural behaviour of the natural troop of Japanese monkeys on Koshima Islet. Primates 2:1–30.
24 Marler P, Tamura M (1964) Culturally transmitted patterns of vocal behavior in sparrows. Science 146:1483–1486.
25 Whiten A, Hinde RA, Stringer CB, Laland KN, eds (2012) Culture Evolves (Oxford Univ Press, Oxford).
26 Hoppitt W, Laland KN (2013) Social Learning: An Introduction to Mechanisms, Methods and Models (Princeton Univ Press, Princeton, NJ).
27 Laland KN, Galef BG, eds (2009) The Question of Animal Culture (Harvard Univ Press, Cambridge, MA).
28 Galef BG, Whiten A (2017) The comparative psychology of social learning. APA Handbook of Comparative Psychology, eds Call J, Burghardt G, Pepperberg I,
Snowdon C, Zentall T (American Psychological Association, Washington, DC), pp 411–440.
29 Thornton A, Clutton-Brock T (2011) Social learning and the development of individual and group behaviour in mammal societies. Philos Trans R Soc Lond B Biol Sci
366:978–987.
30 Whiten A (2012) Social learning, traditions and culture. The Evolution of Primate Societies, eds Mitani J, Call J, Kappeler PM, Palombit RA, Silk JB (Chicago Univ
Press, Chicago), pp 682–700.
31 Whiten A (2017) Culture extends the scope of evolutionary biology in the great apes. Proc Natl Acad Sci USA 114:7790–7797.
32 Fragaszy DM, et al. (2017) Synchronized practice helps bearded capuchin monkeys learn to extend attention while learning a tradition. Proc Natl Acad Sci USA
114:7798–7805.
33 Perry SE, Barrett BJ, Godoy I (2017) Older, sociable capuchins (Cebus capucinus) invent more social behaviors, but younger monkeys innovate more in other
contexts. Proc Natl Acad Sci USA 114:7806–7813.
34 Whitehead H, Rendell L (2015) The Cultural Lives of Whales and Dolphins (Chicago Univ Press, Chicago).
35 Whitehead H (2017) Gene–culture coevolution in whales and dolphins. Proc Natl Acad Sci USA 114:7814–7821.
36 Garland EC, et al. (2011) Dynamic horizontal cultural transmission of humpback whale song at the ocean basin scale. Curr Biol 21:687–691.
37 Garland EC, Rendell L, Lamoni L, Poole MM, Noad MJ (2017) Song hybridization events during revolutionary song change provide insights into cultural
transmission in humpback whales. Proc Natl Acad Sci USA 114:7822–7829.
38 Catchpole CK, Slater PJB (2008) Bird Song: Biological Themes and Variations (Cambridge Univ Press, Cambridge, UK), 2nd Ed.
39 Slagsvold T, Wiebe KL (2011) Social learning in birds and its role in shaping a foraging niche. Philos Trans R Soc Lond B Biol Sci 366:969–977.
40 Aplin LM, et al. (2015) Experimentally induced innovations lead to persistent culture via conformity in wild birds. Nature 518:538–541.
41 Aplin LM, Sheldon BC, McElreath R (2017) Conformity does not perpetuate suboptimal traditions in a wild population of songbirds. Proc Natl Acad Sci USA
114:7830–7837.
42 Laland KN, Atton N, Webster MM (2011) From fish to fashion: Experimental and theoretical insights into the evolution of culture. Philos Trans R Soc Lond B Biol Sci
366:958–968.
43 Grüter C, Leadbeater E (2014) Insights from insects about adaptive social information use. Trends Ecol Evol 29:177–184.
44 Leadbeater E, Dawson EH (2017) A social insect perspective on the evolution of social learning mechanisms. Proc Natl Acad Sci USA 114:7838–7845.
45 Alem S, et al. (2016) Associative mechanisms allow for social learning and cultural transmission of string pulling in an insect. PLoS Biol 14:e1002564.
46 Mercader J, et al. (2007) 4,300-year-old chimpanzee sites and the origins of percussive stone technology. Proc Natl Acad Sci USA 104:3043–3048.
47 Allen J, Weinrich M, Hoppitt W, Rendell L (2013) Network-based diffusion analysis reveals cultural transmission of lobtail feeding in humpback whales. Science
340:485–488.
48 Hobaiter C, Poisot T, Zuberbühler K, Hoppitt W, Gruber T (2014) Social network analysis shows direct evidence for social transmission of tool use in wild
chimpanzees. PLoS Biol 12:e1001960.
49 Whiten A, Mesoudi A (2008) Review. Establishing an experimental science of culture: Animal social diffusion experiments. Philos Trans R Soc Lond B Biol Sci
363:3477–3488.
50 Whiten A, Caldwell CA, Mesoudi A (2016) Cultural diffusion in humans and other animals. Curr Op Psychol 8:15–21.
51 Shapiro JA (2017) Biological action in read-write genome evolution. Interface Focus, in press.
52 Mueller T, O’Hara RB, Converse SJ, Urbanek RP, Fagan WF (2013) Social learning of migratory performance. Science 341:999–1002.
53 Whiten A (2017) A second inheritance system: The extension of biology through culture. Interface Focus, in press.
54 Tomasello M (1999) The Cultural Origins of Human Cognition (Harvard Univ Press, Cambridge, MA).
55 Pagel M (2012) Wired For Culture: The Natural History of Human Communication (Allen Lane, London).
56 Henrich J (2015) The Secret of Our Success: How Culture is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter (Princeton Univ Press,
Princeton, NJ).
57 Laland KN (2017) Darwin’s Unfinished Symphony: How Culture Made the Human Mind (Princeton Univ Press, Princeton, NJ).
58 Boivin NL, et al. (2016) Ecological consequences of human niche construction: Examining long-term anthropogenic shaping of global species distributions. Proc
Natl Acad Sci USA 113:6388–6396.
59 Alberti M, et al. (2017) Global urban signatures of phenotypic change in animal and plant populations. Proc Natl Acad Sci USA, 10.1073/pnas.1606034114.
60 Gray RD, Watts J (2017) Cultural macroevolution matters. Proc Natl Acad Sci USA 114:7846–7852.
61 Mesoudi A, Whiten A, Laland KN (2006) Towards a unified science of cultural evolution. Behav Brain Sci 29:329–347, discussion 347–383.
62 Mesoudi A (2011) Cultural Evolution: How Darwinian Theory Can Explain Culture and Synthesize the Social Sciences (Univ of Chicago Press, Chicago).
63 Mesoudi A (2017) Pursuing Darwin’s curious parallel: Prospects for a science of cultural evolution. Proc Natl Acad Sci USA 114:7853–7860.
64 Harmand S, et al. (2015) 3.3-million-year-old stone tools from Lomekwi 3, West Turkana, Kenya. Nature 521:310–315.
65 Stout D, Hecht EE (2017) Evolutionary neuroscience of cumulative culture. Proc Natl Acad Sci USA 114:7861–7868.
66 d’Errico F, Stringer CB (2011) Evolution, revolution or saltation scenario for the emergence of modern cultures? Philos Trans R Soc Lond B Biol Sci 366:1060–1069.
67 d’Errico F, et al. (2017) Identifying early modern human ecological niche expansions and associated cultural dynamics in the South African Middle Stone Age. Proc
Natl Acad Sci USA 114:7869–7876.
68 Gray RD, Atkinson QD, Greenhill SJ (2011) Language evolution and human history: What a difference a date makes. Philos Trans R Soc Lond B Biol Sci
366:1090–1100.
69 Bouckaert R, et al. (2012) Mapping the origins and expansion of the Indo-European language family. Science 337:957–960.

70 Currie TE, Mace R (2011) Mode and tempo in the evolution of socio-political organization: Reconciling ‘Darwinian’ and ‘Spencerian’ evolutionary approaches in
anthropology. Philos Trans R Soc Lond B Biol Sci 366:1108–1117.
71 Ross RM, Atkinson QD (2016) Folktale transmission in the Arctic provides evidence for high bandwidth social learning among hunter-gatherer groups. Evol Hum
Behav 37:47–53.
72 Legare CH (2017) Cumulative cultural learning: Development and diversity. Proc Natl Acad Sci USA 114:7877–7883.
73 Harris PL, Bartz DT, Rowe ML (2017) Young children communicate their ignorance and ask questions. Proc Natl Acad Sci USA 114:7884–7891.
74 Gopnik A, et al. (2017) Changes in cognitive flexibility and hypothesis search across human life history from childhood to adolescence to adulthood. Proc Natl
Acad Sci USA 114:7892–7899.
75 Gelman SA, Roberts SO (2017) How language shapes the cultural inheritance of categories. Proc Natl Acad Sci USA 114:7900–7907.
76 Whiten A (2005) The second inheritance system of chimpanzees and humans. Nature 437:52–55.
77 Feldman MW, Laland KN (1996) Gene-culture coevolutionary theory. Trends Ecol Evol 11:453–457.
78 Richerson PJ, Boyd R (2005) Not by Genes Alone: How Culture Transformed Human Evolution (Univ of Chicago Press, Chicago).
79 Hoppitt WJE, et al. (2008) Lessons from animal teaching. Trends Ecol Evol 23:486–493.
80 Gergely G, Bekkering H, Király I (2002) Rational imitation in preverbal infants. Nature 415:755.
81 Lyons DE, Damrosch DH, Lin JK, Macris DM, Keil FC (2011) The scope and limits of overimitation in the transmission of artefact culture. Philos Trans R Soc Lond B
Biol Sci 366:1158–1167.
82 Lewontin RC (1970) The units of selection. Annu Rev Ecol Syst 1:1–18.
83 Mesoudi A, Whiten A, Laland KN (2004) Perspective: Is human cultural evolution Darwinian? Evidence reviewed from the perspective of the Origin of Species.
Evolution 58:1–11.
84 Richerson PJ, Christiansen MH (2013) Cultural Evolution: Society, Technology, Language and Religion (MIT Press, Cambridge, MA).
85 Warner RR (1988) Traditionality of mating-site preferences in a coral fish. Nature 335:719–721.
86 Gibson RM, Bradbury JW, Vehrencamp SL (1991) Mate choice in lekking sage grouse revisited: The roles of vocal display, female site fidelity, and copying. Behav
Ecol 2:165–180.
87 Laland KN (2004) Social learning strategies. Learn Behav 32:4–14.
88 Price EE, Wood LA, Whiten A (2016) Adaptive cultural transmission biases in children and nonhuman primates. Infant Behav Dev 48:45–53.
89 Baldwin JM (1902) Development and Evolution (Macmillan, New York).
90 Wyles JS, Kunkel JG, Wilson AC (1983) Birds, behavior, and anatomical evolution. Proc Natl Acad Sci USA 80:4394–4397.
91 Wilson AC (1985) The molecular basis of evolution. Sci Am 253:164–173.
92 Odling-Smee FJ, Laland KN, Feldman MW (2003) Niche Construction: The Neglected Process in Evolution (Princeton Univ Press, Princeton, NJ).
93 Richerson PJ, Boyd R, Henrich J (2010) Colloquium paper: Gene-culture coevolution in the age of genomics. Proc Natl Acad Sci USA 107:8985–8992.
94 Laland KN, Odling-Smee J, Myles S (2010) How culture shaped the human genome: Bringing genetics and the human sciences together. Nat Rev Genet
11:137–148.
95 Street SE, Navarrete AF, Reader SM, Laland KN (2017) Coevolution of cultural intelligence, extended life history, sociality, and brain size in primates. Proc Natl
Acad Sci USA 114:7908–7914.
96 Lotem A, Halpern JY, Edelman S, Kolodny O (2017) The evolution of cognitive mechanisms in response to cultural innovations. Proc Natl Acad Sci USA
114:7915–7922.
97 Grant BR, Grant PR (2002) Simulating secondary contact in allopatric speciation: An empirical test of premating isolation. Biol J Linn Soc Lond 76:545–556.
98 Foote AD, et al. (2016) Genome-culture coevolution promotes rapid divergence of killer whale ecotypes. Nat Commun 7:11693.

Cultural evolutionary theory: How culture evolves and
why it matters
Nicole Creanza (a,1), Oren Kolodny (b,1,2), and Marcus W. Feldman (b)
(a) Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235; and (b) Department of Biology, Stanford University, Stanford, CA 94305

Edited by Kevin N. Laland, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark April 29, 2017
(received for review January 16, 2017)

Human cultural traits—behaviors, ideas, and technologies that can be learned from other individuals—can exhibit complex patterns of transmission and evolution, and researchers have developed theoretical models, both verbal and mathematical, to facilitate our understanding of these patterns. Many of the first quantitative models of cultural evolution were modified from existing concepts in theoretical population genetics because cultural evolution has many parallels with, as well as clear differences from, genetic evolution. Furthermore, cultural and genetic evolution can interact with one another and influence both transmission and selection. This interaction requires theoretical treatments of gene–culture coevolution and dual inheritance, in addition to purely cultural evolution. In addition, cultural evolutionary theory is a natural component of studies in demography, human ecology, and many other disciplines. Here, we review the core concepts in cultural evolutionary theory as they pertain to the extension of biology through culture, focusing on cultural evolutionary applications in population genetics, ecology, and demography. For each of these disciplines, we review the theoretical literature and highlight relevant empirical studies. We also discuss the societal implications of the study of cultural evolution and of the interactions of humans with one another and with their environment.

cultural evolution | mathematical models | gene–culture coevolution | niche construction | demography

Human culture encompasses ideas, behaviors, and artifacts that can be learned and transmitted between individuals and can change over time (1). This process of transmission and change is reminiscent of Darwin’s principle of descent with modification through natural selection, and Darwin himself drew this explicit link in the case of languages: “The formation of different languages and of distinct species, and the proofs that both have been developed through a gradual process, are curiously parallel” (2, 3). Theory underpins most scientific endeavors, and, in the 1970s, researchers began to lay the groundwork for cultural evolutionary theory, building on the neo-Darwinian synthesis of genetics and evolution by using verbal, diagrammatic, and mathematical models (4–8). These models are, by necessity, approximations of reality (9), but because they require researchers to specify their assumptions and extract the most important features from complex processes, they have proven exceedingly useful in advancing the study of cultural evolution (10). Here, we review the field of cultural evolutionary theory as it pertains to the extension of biology through culture. We focus on human culture because the bulk of cultural evolutionary models are human-centric and certain processes such as cumulative culture seem to be unique to humans. However, numerous nonhuman species also exhibit cultural transmission, and we consider the areas of overlap between models of human and animal culture in Discussion.

The study of cultural evolution is important beyond its academic value. Cultural evolution is a fundamentally interdisciplinary field, bridging gaps between academic disciplines and facilitating connections between disparate approaches. For example, the advent of technologies for revealing genomic variation has led to a plethora of studies that measure association between DNA variants and traits that have major cultural components, such as years of schooling, marriage choices, IQ test results, and poverty. Perhaps because of the perceived greater precision of the genomic data, these culturally transmitted components have been relegated to the deep background, creating a misleading public portrayal of the traits as being predetermined by genetics (see, e.g., ref. 11). Models of the dynamics of interaction among culture, demography, and genetics, which uncover the complexities in the determination of these behaviors and traits, are crucial to remedy this potentially dangerous misinterpretation.

Here, we explore the ways in which cultural evolutionary theory and its applications enhance our understanding of human history and human biology, focusing on the links between cultural evolutionary theory and population genetics, human behavioral ecology, and demography. Throughout, we give examples of efforts to apply theory to data, linking models of cultural evolution to empirical studies of genetics, language, archaeology, and anthropology. For example, studies of cultural factors, including language and customs, help biologists interpret patterns of genetic evolution that might be misinterpreted if the cultural context were not taken into account. Finally, we outline several societal implications of cultural evolutionary theory.

Population Genetics and Cultural Evolution

Many of the first models of cultural evolution drew explicit parallels between culture and genes by modifying concepts from theoretical population genetics and applying them to culture. Cultural patterns of transmission, innovation, random fluctuations, and selection are conceptually analogous to genetic processes of transmission, mutation, drift, and selection, and many of the mathematical techniques used to study genetics can be useful in the study of culture (1, 12). However, these mathematical approaches had to be modified to account for the differences between genetic and cultural transmission. For example, we do not expect cultural transmission to follow the rules of genetic transmission strictly. Indeed, cultural traits are likely to deviate from all three laws of Mendelian inheritance: segregation, independent assortment, and dominance (13).

The simple observation that cultural traits need not conform to Mendelian inheritance is sufficient to produce complex evolutionary dynamics: If children are likely to reject a cultural trait that both of their parents possess, the frequency of that trait in the population may oscillate between generations (4). In addition, if two biological parents have different forms of a cultural trait, their child is not necessarily equally likely to acquire the

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.

Author contributions: N.C., O.K., and M.W.F. designed research, performed research, analyzed data, and wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. K.N.L. is a guest editor invited by the Editorial Board.

1 N.C. and O.K. contributed equally to this work.
2 To whom correspondence should be addressed. Email: okolodny@stanford.edu.

7782–7789 | PNAS | July 25, 2017 | vol. 114 | no. 30 www.pnas.org/cgi/doi/10.1073/pnas.1620732114


mother’s or father’s form of that trait (14). Further, a child can acquire cultural traits not only from its parents (vertical transmission) but also from nonparental adults (oblique) and peers (horizontal) (1, 12); thus, the frequency of a cultural trait in the population is relevant beyond just the probability that an individual’s parents had that trait (Fig. 1). In most cases, the more common a cultural trait is in the population, the more likely it is for an individual to have the opportunity to acquire it through social learning (15). However, the size of the population may also influence the continuing transmission, and thus survival, of a cultural trait (16). The relative importance of a population’s size, and its environmental context, for the retention and perhaps expansion of the cultural repertoire constitutes an ongoing debate (16–20).

Fig. 1. Cultural transmission is more complex than genetic transmission and may occur on short timescales, even within a single generation. [Figure panels: Human genes (parents, offspring); Human culture (parents, kin, non-kin, peers, societal norms, offspring).]

The Roles of Transmission and Innovation in Cultural Evolution. Thus far, we have made the analogy between alleles of a gene and forms of a cultural trait, implying that the cultural trait in question can be represented in a binary or discrete manner. Although this approximation is appropriate for some culturally transmitted traits, such as knowing or not knowing how to use a certain tool, or smoking or not smoking, some cultural traits are more naturally regarded as continuous or quantitative traits. For example, cultural norms and preferences, such as degree of risk tolerance, have been modeled as continuous traits (e.g., ref. 21), and knowledge of a tool or technique has usefully been represented in terms of a quantitative “skill level” (e.g., refs. 16, 22, and 23).

Like genes, cultural traits can be more or less adaptive depending on the environment and spread accordingly. An interesting question is the following: If a certain behavior may be either innate (i.e., genetically determined) or culturally acquired (and thus potentially responsive to the environment), which environmental patterns would favor the genetic transmission? Models predict that spatially varying environments will favor cultural transmission, whereas only highly stable environments would favor the genetic determination of the behavior (24–26). Cavalli-Sforza and Feldman note an important reason that genes, cultural traits, and environments should all be considered together: “Given the existence of individual plasticity in response to the environment, correlations between biological relatives are expected even if there is no genetic variation whatsoever” (14).

Unlike in genetics, where mutations are the source of new traits, cultural innovations can occur via multiple processes and at multiple scales (1, 27–29). Most of the models described above include the cultural transmission of existing traits without providing a mechanism for novel traits to be introduced to the population. In many models of social learning, new information enters a population via trial-and-error learning or individual interactions with the environment, and this information can then be culturally transmitted (30, 31). New cultural traits can also originate when existing traits are combined in novel ways, which can lead to exponential rates of cultural accumulation (32). Recent models represent innovation as the result of multiple interacting processes (27–29), and cultural traits can accumulate in punctuated bursts when these processes of innovation are interdependent: A truly groundbreaking innovation can pave the way for many related innovations and novel combinations (28). Such dynamics may explain some of the punctuated bursts that are observed in the archaeological record of stone tools, like the dramatic increases in complexity near the transitions from the Middle to the Upper Paleolithic and from the Paleolithic to the Neolithic (33–35), and may provide an account of the dynamics of technological development in historical times (36–39). In many models of cultural evolution, the frequency of one or more cultural traits is tracked over time, and the equilibrium properties are sought. However, recent research highlights the dynamics of cultural accumulation that occur in the transient phase before the system approaches an equilibrium (28). For example, if innovation processes are interdependent, as described above, the cultural repertoire can fluctuate dramatically before approaching an equilibrium because the loss or gain of a groundbreaking innovation can lead to the loss or gain of its related innovations as well (28). In addition, these models demonstrate how innovation processes can change the parameters, and therefore the dynamics, of cultural evolution, possibly altering the cultural equilibrium, if there is one (29). For example, a game-changing innovation, such as the transition from foraging to agriculture, could allow a population to feed many more people; thus, a cultural innovation can alter the size of the population, which is generally set as a fixed parameter in cultural evolutionary models (29). Such nonequilibrium dynamics arise, for example, in a recent comparison between modeling predictions and the archaeological record that showed that the frequencies of Neolithic pottery features over time are not consistent with a cultural system at equilibrium (40).

Fig. 2. Cultural, genetic, and environmental factors influencing evolution. [Figure labels: climate, environmental change, ultraviolet exposure, altitude; niche construction, selection pressures, pathogens, microbiome; demography, transmission dynamics, population size, innovations; flora and fauna, food sources, subsistence strategy, available resources; migration, genetic admixture, cultural connectivity.]

Linking Genetic and Cultural Evolution. As mentioned above, theoretical treatments of cultural transmission and evolution can usefully draw on concepts from theoretical population genetics, extending them to accommodate cultural processes. However, cultural and genetic evolutionary processes can also interact with one another and with the environment (Fig. 2), and elucidating the relative contributions of genes, culture, and environment to a phenotype can be very difficult (41). Extensive theoretical work has been devoted to characterizing these interactions, termed gene–culture coevolution (1, 42), culture–gene coevolution (43), dual inheritance theory (12, 44), or cultural niche construction (45, 46). When cultural and genetic evolution interact, the dynamics of both genetic and cultural traits are likely to be very different from those characteristic of only one mode of transmission (47, 48). Further, cultural traits can alter the selection pressures on genetic traits and vice versa: For example, genetic traits that are adaptive in one cultural background might not be adaptive in another (49, 50). The classic example of these interactions between cultural and genetic evolution is lactase persistence in adulthood: For much of human history, there was little reason to digest milk after weaning, and adults did not typically produce the enzyme that digests lactose. However, with the cultural

practice of cattle domestication and dairying, a genetic mutation that enabled the production of the lactase enzyme in adulthood was strongly favored by selection (51, 52).

Theoretical analyses show that gene–culture coevolution can be dynamically complex and surprisingly unpredictable. For example, a well-known finding in population genetics is that a fitness advantage to heterozygote genotypes maintains genetic variation in a population. However, it is not sufficient to maintain genetic variation for heterozygote offspring to be superior to homozygotes in their ability to acquire an advantageous cultural trait that is transmitted culturally by a parent (12). In fact, the fitness advantage to the culturally transmitted trait has to be sufficiently large that it overcomes imperfection in vertical cultural transmission. In a similar vein, Aoki et al. (26, 53) modeled the evolution of a genetic trait that increased the efficiency of teaching, defined as vertical transmission of a cultural trait. Genetic variation at this teaching locus could not be maintained with asexual haploid genetics and uniparental cultural transmission, but sexual haploid genetics and biparental cultural transmission could preserve both genetic polymorphism of the teaching locus and polymorphism of the cultural trait. These examples illustrate the theoretical complexity that emerges when standard population genetic theory is extended to include the interactions between genetic and cultural traits; the result is a highly nonlinear theory with complications not seen in purely biological theory.

The theoretical literature on gene–culture interactions has become increasingly relevant in the genomic era. Genome-wide association studies (GWAS) have shown many genomic associations with a wide array of complex phenotypes and have allowed detection of signals of genetic adaptation (54). However, GWA studies of behavioral phenotypes such as IQ, educational attainment, and life history should be interpreted with care (55–58). As the authors of one such study state: “Studies of genetic analyses of behavioural phenotypes have been prone to misinterpretation, such as characterizing identified associated variants as ‘genes for education.’ Such characterization is not correct for many reasons: Educational attainment is primarily determined by environmental factors” (55). Statistical relationships between genetic variants and behaviors need not be causal because assortative mating, spatial autocorrelation, and a shared environment can influence such relationships (55, 59–61). Twin studies of tobacco smoking point to interacting roles of genetics, environment, and assortative mating in the initiation and continuance of smoking (62). In large-scale studies of human health, environmental and cultural factors should also be considered because these could conflate the effects of genetics and ancestry with those of poverty, stress, racism, or socioeconomic status (63–65). For example, data from the large-scale Health and Retirement Study showed an association between African ancestry and hypertension: The prevalence of hypertension was eight percentage points higher in respondents with the highest quartile of African ancestry compared with those with the lowest quartile (63). However, controlling for a subset of factors related to socioeconomic status (childhood disadvantage, education, income, and wealth) explained ∼38% of this disparity, reducing it to a five-percentage-point difference (63).

Nonrandom Assortment and Biased Transmission

Many theoretical population genetic studies make the assumption that mating is random within a population. However, in real human populations, this assumption is often violated, as individuals tend to prefer mates with similar phenotypes, such as eye color (66), height, IQ (67), education level (61), and smoking status (68). Cultural evolutionary theory has led to significant advances in our understanding of the effects of nonrandom mating, revealing that the transmission and dynamics of cultural traits can be sensitive to both phenotypic and environmental assorting (41). Assortative mating, leading to an increased correlation between mates for genetic or cultural traits, can increase both genotypic and phenotypic variance in a population (69, 70). In addition, assortative mating (and other forms of homophily) acting on one cultural trait can influence the evolutionary dynamics of other cultural traits, facilitating the spread of rare cultural or genetic variants (71, 72). More generally, assorting can affect not just mate choice but many types of cultural interactions, termed “assortative meeting” (73). Empirical work supports this theoretical finding; for example, beneficial health behaviors spread more readily through a social network when individuals’ social contacts were more similar to themselves (74, 75). Culturally mediated assortment can also lead to biological differences: Partners that are more similar tend to have more offspring (76), thus increasing fitness, and assortative mating within highly homophilic groups affects the average length of homozygous DNA segments (59, 77), leading to the appearance of higher levels of inbreeding than might actually exist. Humans can also assort by language; however, studies of the interactions between language and genetic population structure show that the resulting dynamics can differ by population. For example, in some geographic regions, language boundaries do not seem to act as barriers to gene flow (78–80) whereas, in other places, assorting with respect to language seems to have had a large effect, and genetic similarity is more closely associated with language than with geographic distance (80–83). Assortative mating has had a measurable effect on human genomic architecture, and genetic and phenotypic correlations between partners are substantial (84).

In addition to choosing their mates nonrandomly, individuals can also choose their cultural role models; these cultural transmission biases affect the relationship between a trait’s frequency in the population and its likelihood of transmission (Fig. 3). For example, conformity bias is an exaggerated preference for the cultural variant practiced by the majority of the population, which can lead to an increasingly large majority over time (85, 86). Alternatively, individuals might preferentially seek out novel cultural traits, termed rarity bias or novelty bias (30). These frequency-dependent biases can lead to patterns of cultural diffusion in which the prevalence of a cultural trait can change dramatically over short timescales, producing logistic growth (“S-shaped” curves) of trait frequency over time (87, 88). Examples of cultural traits that are likely to exhibit frequency-dependent transmission are fashion trends (89), career choices (12), and baby names (90). Conformist transmission is likely to dominate when the environment is relatively stable and common cultural traits are well adapted to that environment (86, 91).

Other types of transmission biases reflect not how common a trait is in a population, but the characteristics of the people who have the trait. In the case of prestige bias, individuals attempt to acquire cultural traits that are perceived to be high quality by selectively learning from those individuals with high social rank (92). For example, in an experimental test, children were much more likely to choose an adult cultural role model if they had observed bystanders attending to the potential model rather than ignoring him or her (93); thus, even at a very early age, humans can assess such characteristics as prestige or social standing. Individuals can also use observations of success associated with a cultural trait, such as a fruitful hunt with a certain tool, to develop a preference for cultural role models that are

Fig. 3. Biased cultural transmission mechanisms, where orange and purple represent two forms of an arbitrary cultural trait. Conformity bias predicts that learners will copy the most common trait, and novelty bias predicts they will copy the most rare. Prestige bias predicts learners will copy an individual of high social status (indicated by a crown) whereas success bias predicts they will copy a successful individual (indicated by a gold medal). [Figure labels: Population, Learner; panels: Conformist bias, Novelty bias, Prestige bias, Success bias.]
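
As a rough illustration of the frequency-dependent biases depicted in Fig. 3, the following minimal Python sketch iterates a commonly used textbook recursion for biased transmission, p' = p + D p(1 − p)(2p − 1), where p is the frequency of one variant and D sets the strength and direction of the bias. The recursion, the parameter values, and the function names are assumptions chosen for illustration and are not taken from the models reviewed here; prestige and success biases, which depend on who carries a trait rather than on its frequency, are not represented.

```python
def biased_step(p, D):
    """Advance the frequency p of one cultural variant by one generation.

    D > 0 favors whichever variant is in the majority (conformity bias),
    D < 0 favors the rarer variant (novelty bias), and D = 0 is unbiased
    copying. This functional form is an illustrative assumption.
    """
    return p + D * p * (1.0 - p) * (2.0 * p - 1.0)


def trajectory(p0, D, generations):
    """Return the list of frequencies from generation 0 onward."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(biased_step(freqs[-1], D))
    return freqs


# Hypothetical starting point: the variant is held by 60% of the population.
conformist = trajectory(p0=0.6, D=0.3, generations=50)
novelty = trajectory(p0=0.6, D=-0.3, generations=50)

print(f"conformity bias, generation 50: {conformist[-1]:.3f}")
print(f"novelty bias, generation 50:    {novelty[-1]:.3f}")
```

Under these assumptions, a positive D carries the majority variant toward fixation, whereas a negative D draws the two variants toward equal frequency; a frequency of exactly one-half is left unchanged by either bias.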



demonstrably successful (30). This bias has been demonstrated experimentally (94, 95); for example, when individuals participated in simulated hunting with virtual arrowheads and then modified their arrowheads either by trial and error or imitation, copying successful individuals gave significantly better results than trial and error (94).

Models of Culture and Human Ecology

For thousands of generations humans have been carving their existence in the world with cultural tools that have become integral to their livelihoods, thereby shaping their environment at all scales, both intentionally and unintentionally. Attempting to answer the question of what are the extensions of human biology through culture leads to a striking conclusion: There are few aspects of human biology that have not been shaped by our culture. Human culture has also affected the biology, even the survival, of nonhuman species (96). In this section, we review a number of cases for which incorporating culture into models of ecoevolutionary dynamics has proven valuable for the interpretation, prediction, and, in some cases, direction of human ecology and of human impact on the ecosystem.

Human Niche Construction. Niche construction is a process in which organisms modify their environment in a way that alters the selective pressures that these organisms experience, thus affecting evolution (97). A special case of niche construction is cultural niche construction: the alteration of the environment through cultural practices, which may themselves evolve. Cultural niche construction involves complex dynamics in which selective pressures act on the culture itself, interacting with genetic evolution and the environment to influence the spread of both genetic and cultural traits (71). Because cultural change has the potential to occur faster than genetic adaptation, dynamics of niche construction that are driven by cultural traits play a prominent role in human evolution; yet, only in recent decades has cultural evolution begun to be explicitly incorporated into human evolutionary ecology (98). Studies that pioneered this approach showed how it can provide insight into the dynamics of the demographic transition in postindustrialized societies (e.g., refs. 1 and 99). For example, the reduction in birth rate during the demographic transition is often characterized as a paradox because, from a Darwinian fitness perspective, individuals should prefer to have more offspring, not fewer (100). However, if a cultural norm favoring small family size spreads, the fertility rate can drop as well, resulting in a culturally induced demographic transition (99, 101), which is a case where natural selection and cultural transmission seem to be in opposition.

The niche-construction approach has been productive in many other studies, such as those that describe culturally driven change at the ecosystem level: for example, the extinction of megafauna after the arrival of humans (102), the change of broad-scale landscapes as a result of cultivation in early and recent times (103–105), and the traditional use of fire as a means to manipulate the environmental dynamics in a way beneficial for humans (106, 107). Niche construction is also important in understanding the evolutionary dynamics driven by changes in the immediate environment that humans experience, such as via construction of shelters and production of clothing that enabled the expansion of humans into otherwise uninhabitable regions (108), and the use of fire for food handling, which allowed dramatic changes in subsistence and may even have led to significant change to the anatomy of the human jaw (109).

Major Cultural Shifts. A key aspect of human evolution is the change over time in human subsistence strategies. Several models consider the interaction of hunter-gatherers with the populations of organisms that they consume and how these interact over time. They propose that predation pressure can decrease a prey species’ population and exert selective pressures in favor of early reproduction at a smaller body size, potentially leaving a telltale pattern in the archaeological record. The result may be the prey species’ extinction, which forces humans to shift their diet in response. Such models as the Diet Breadth Model, the Broad Spectrum Revolution, and Nutritional Ecology (110–113) capture some of these processes, and, although they differ in many important dimensions, such as in the role they assign to plants in the diet, they share the realization that cultural dynamics, genetic evolution, and ecological processes must be considered jointly to understand human evolution. Studies in this tradition have also proposed how gradually changing cultural practices may have created the conditions that culminated in the Neolithic revolution, with the domestication of multiple plant and animal species and the subsequent changes in almost every aspect of human existence (114–116). An interesting niche-construction perspective of these topics is proposed by Smith and Zeder (117).

Models in Human Behavioral Ecology. Human behavioral ecology applies approaches that were developed with a focus on nonhuman species to the interpretation of human behavior (118). One of these approaches is based on optimality in behavior, and studies frequently devise models that capture human behavioral constraints and alternatives, as well as their associated payoffs, which are then considered jointly in predicting behavior or explaining the evolutionary underpinnings of observed behaviors, often under the assumption that humans behave in a way that maximizes their fitness. A broad range of empirical and theoretical studies of culturally determined behaviors bear directly on human fitness, past and present. Human ecological traits, such as life history profiles, subsistence strategies, mating preferences, economic decision making, and social structures (119–122), have been analyzed to predict individual behavior and to support potential intervention that might alter human behaviors at the societal level.

Interestingly, few studies in human ecology consider the dynamics of cultural evolution on which the studied behaviors depend; thus, for example, it is frequently assumed that alternative possible behaviors are available to the human group of interest when they might not be, such as different subsistence strategies. Similarly, with some notable exceptions (e.g., refs. 123–128), human behavioral ecology models often do not consider ecological and evolutionary dynamics that may depend on the studied behavior and that play out on intermediate and long timescales: For example, how would prey populations evolve over long periods of time in response to a certain human foraging strategy, and how would that feed back onto human strategy choice? We suggest that these aspects are promising avenues for further exploration.

Interspecies and Intergroup Dynamics. One of the hotly debated topics in human prehistory is the replacement of Neanderthals by modern humans ∼40,000 y ago. A recent study (129) proposed an ecocultural model that incorporated cultural differences between two competing species into Lotka–Volterra competition dynamics and showed that a difference in culture between moderns and Neanderthals could have driven the latter’s extinction. This model explicitly includes cultural evolutionary dynamics and shows that a difference in population sizes between moderns in Africa and Neanderthals in Eurasia could have led to a difference in the cultural complexity between the two populations, allowing the small groups of moderns that migrated out of Africa to gradually outcompete the larger population of Neanderthals that they encountered.

This pattern, with one group replacing another as a result of a culturally derived advantage, is likely to have taken place repeatedly throughout human history. Thus, for example, genetic evidence largely supports a scenario in which the Neolithic revolution spread throughout the world not by diffusion of farming practices among groups but by replacement of hunter-gatherer groups by farmers (130) (see also refs. 34, 131, and 132). A second revolution occurred 6,000 to 4,000 y ago, when the early Neolithic farmers were overwhelmed by Yamnaya invaders from the Russian Steppe, who had the cultural advantage of

transportation by horses (133, 134). Such dynamics, in which cultural adaptation to temporally variable conditions may play an important role, are also pervasive more recently: For example, competition between pastoralists and agriculturalists and replacement of one by the other are documented from biblical times to the present (135, 136).

Culture and Microbes. Models are also important in analyzing humans’ cultural and genetic coevolution with pathogens, the realm in which many of our species’ harshest evolutionary challenges have occurred. Some of the clearest signals of natural selection in the human genome are found near genes that are directly related to coping with diseases such as malaria (137, 138), Kuru (139), and others (140–142), and the understanding of their evolutionary dynamics is greatly enhanced when we are able to couple such genetic evidence with cultural dynamics that influenced them. Durham (143), for example, argues that yam farming practices in West Africa significantly increased standing water, thus increasing breeding sites for malaria-carrying mosquitoes, which led to high exposure to malaria and exerted selective pressure in favor of genetic variants that increase resistance to malaria. In the New Guinea highlands, cannibalism practices that were widespread until the 1940s drove the Kuru epidemic among the people of this region (144). A model of culture–pathogen interactions demonstrated that different behavioral regimes could shape dynamics of pathogenic bacteria, leading to nonintuitive outcomes (145). For example, antibiotic-resistant strains will spread throughout the population in the presence of ubiquitous antibiotic use whereas the WT bacteria have a fitness advantage if antibiotics are not used; however, if people modify their behavior by decreasing use of antibiotics when they become less effective, both WT and resistant pathogens can coexist (145).

A fast-growing body of research focuses on the host-associated microbiome: the communities of organisms, mostly bacteria, that live in and on eukaryotes. The dynamics of the microbiome can interact with those of its host, including genetic variation, cultural practices, and environmental context, further complicating the study of evolutionary processes. Thus, for example, the interaction between dairy farming and selection on the lactase persistence gene has become the poster child of gene–culture coevolution; however, lactose-using bacteria in humans’ digestive tracts are very likely to have played a prominent role in the emergence of dairy farming (146). Moreover, these bacteria continue to affect individuals who do not carry a genetic mutation that allows them to efficiently digest dairy in adulthood. Understanding how cultural practices influence human–microbe interactions may provide us not only with insight into the Neolithic farming revolution or early cattle domestication and related human evolution since then, but also with the necessary tools to make informed nutritional choices, such as those related to dairy utilization in our present lives. Thus, worldwide dietary recommendations stand to benefit significantly from an improved understanding of microbe–human interactions (147).

Demography and Cultural Evolution

The growth and age structure of human populations are both affected by norms and beliefs of their members. A predominantly agricultural lifestyle produced higher population growth than the hunting-gathering lifestyle it replaced (148, 149). This increased growth was most likely due to the spread of a complex of cultural traits (150) whose adoption may have created conditions that favored the accumulation of subsequent culturally transmitted behaviors (151, 152). Beginning in the late 19th century, parts of Europe, Asia, the United States, Australia, and New Zealand began to undergo a second demographic transition, which involved a change from a high birth rate, high mortality regime to a lower birth rate, low mortality regime. These changes were due to the spread of fertility-reducing and survival-increasing behaviors that became part of the developed countries’ cultures. Standard quantitative models of demographic change do not include within-population variation in behaviors that affect fecundity or mortality. Projections usually use fixed values for birth and death rates; however, religious preferences, marriage customs, dietary choices, population subdivision, and mortality profiles may affect fecundity but are usually not part of demographic models. Further, aspects of cultural transmission, such as prestige bias and the choice of nonparental cultural role models, can facilitate the spread of fertility-reducing behaviors (12, 153). Thus, cultural evolutionary approaches should be integrated into demography, especially the processes that have led to fertility decline (154).

Many models for life history analysis of humans divide the lifespan into an ordered series of age classes. These models first define the fertility rates of each age class and the survival rates from one age class to the next. Then, they iterate the number in each age class produced by these parameters to determine the dynamics of the population, including whether the number in each age class approaches a stable equilibrium, termed the stationary age distribution, or whether the population will grow or go extinct and at what rate (155).

Carotenuto et al. proposed a demo-cultural framework for such an age-structured population, in which each individual carried one variant of a dichotomous trait, say H or h, where H represents the presence of a socially learned behavior (for example, fertility control) and h is its absence (156). An individual of type H might also be more likely to survive into the next age class. This integration of demography and culture yields complex dynamics; for example, the trait H can persist in the population even if it lowers fertility, as long as the cultural transmission of H is reliable enough, or if H also sufficiently increases the chance of survival. Additional learning steps can also be added to age-structured models, such that vertical and horizontal transmission can occur at different rates for different age classes (101). In this case, horizontal learning accelerated the trait’s spread and led to faster population growth than vertical transmission alone.

An important outgrowth of demo-cultural modeling has been its application to the sex-ratio problem. In many places, the sex ratio at birth is strongly biased in favor of males and, in China and parts of India, has resulted in up to 120 male births for every 100 female births (157). This cultural preference for sons can be manifested in sex-selective abortion or withholding of resources from daughters. This bias has both economic and socio-cultural antecedents, as well as important ethical and demographic consequences (158).

Data on cultural transmission of son preference can be incorporated into formal demographic analysis (159), linking these data to real-world policy applications (160). Theoretical models can also aid in predicting the effects of policies: For example, one such model tracked the cultural transmission of the perceived present value of a son relative to a daughter, the sex ratio at birth, and their effects on demographic change (161). The results of this model suggest that interventions focused on peer-to-peer cultural transmission of a perceived higher value of daughters might complement existing economic incentives to support and educate daughters, with the goal of mitigating the effects of son preference.

The literature on the interaction between cultural transmission and formal demography is quite sparse. Given the large variety of customs that relate to birth and death rates in different human societies, population projections for the future needs of diverse populations should incorporate more cultural dynamics than is currently standard practice.

Discussion

With the extensive body of theoretical and empirical literature on cultural evolution, researchers in this field are now combining information from multiple disciplines and integrating disparate approaches. Part of this new frontier involves more fully bridging the divide between theory and data, as well as developing mathematical models that can aid in the interpretation of anthropological and archaeological information. In addition to

7786 | www.pnas.org/cgi/doi/10.1073/pnas.1620732114 Creanza et al.


aiding our understanding of human history, the study of cultural
transmission and evolution is extremely relevant in the modern era.
Insights from cultural evolution and the diffusion of innovations have
been coopted in advertising and social media to quantify the viral
spread of information (e.g., ref. 162). How can these cultural
evolutionary insights be better used for positive action and public
health? In addition, how can we better use knowledge about cultural
evolution to more fully understand patterns of human genetic variation
and population structure? As we continue to understand more about the
human genome, it becomes increasingly important to consider
environmental and cultural contexts as well as genetic variation;
however, in the study of gene-culture interactions, faulty logic or
racial biases about “causes” of human differences may be used and must
be cautiously guarded against (reviewed in ref. 163).

In this paper, we have reviewed aspects of human cultural evolutionary
theory, focusing on those that are most closely linked to the extension
of biology through culture. With this focus, we could not do adequate
justice to many important domains of cultural evolutionary theory. In
brief, many models of cultural evolution focus primarily on the
transmission of cultural traits and not on their interactions with
genes or fitness. These models include, but are not limited to, models
of social learning (e.g., refs. 164–167), models of language evolution
(e.g., refs. 168–171), empirically driven verbal models of human
evolution based on patterns in material culture (e.g., refs. 172–174),
and models of cultural dynamics within and between groups (e.g., refs.
86 and 175–178). In addition, we focused on human studies, although
cultural processes are present in many other species. For example,
social learning has been extensively studied in nonhuman animals, in
which behavioral strategies, such as producer and scrounger, and
cultural trajectories can be more clearly defined than in humans (166,
179). Cultural transmission also has large-scale evolutionary
implications for some nonhuman animals: For example, theoretical
studies suggest that nonrandom mating in birds based on culturally
transmitted songs could accelerate speciation (180, 181) and that
sexual selection on learned songs could influence evolution of the
neural underpinnings of learning (182). Recently, studies in a range of
animal species have shown that cultural practices can emerge, spread,
and change over time, potentially influencing individuals’ fitness
(183–187). Tool use among chimpanzees and capuchins (188–190) is one
such example, which also provides insight regarding the possible
origins of the early phases of our own species’ adaptation to the
“cultural niche” (191, 192).

In recent years, models that are used for decision making in various
fields, such as economics and public health, have begun to take
cultural evolution into account, and a growing number also incorporate
the modeling—verbal or mathematical—of the human ecosystem’s expected
coevolution with the spread of cultural practices. These models play a
prominent role in planned strategies related to climate change and
reduction of carbon emissions (193), in predicting global food
shortages and requirements (194), and in assessing the distribution of
new practices and technologies in agriculture (195, 196). In addition,
models in epidemiology have begun to integrate cultural transmission of
health practices and pathogen ecological dynamics with regard to drug
distribution and combating epidemics (e.g., ref. 197).

Deeper analysis of how human culture, human ecology, and the human
environment coevolve is necessary for understanding historical and
present dynamics, and for predicting future trends. These analyses will
provide much-needed tools for the planning and direction of such
dynamics. Humans’ worldwide well-being and that of the ecosystem we
live in depend on our ability to make such predictions and act
accordingly.

ACKNOWLEDGMENTS. We thank the John Templeton Foundation and Stanford
Center for Computational, Evolutionary, and Human Genomics for funding.

1. Cavalli-Sforza LL, Feldman MW (1981) Cultural Transmission and Evolution: A Quantitative Approach (Princeton Univ Press, Princeton).
2. Darwin C (1859) On the Origin of Species by Means of Natural Selection (Murray, London).
3. Darwin C (1888) The Descent of Man, and Selection in Relation to Sex (Murray, London).
4. Feldman MW, Cavalli-Sforza LL (1976) Cultural and biological evolutionary processes, selection for a trait under complex transmission. Theor Popul Biol 9:238–259.
5. Feldman MW, Cavalli-Sforza LL (1975) Models for cultural inheritance: A general linear model. Ann Hum Biol 2:215–226.
6. Blum HF (1978) Uncertainty in interplay of biological and cultural evolution: Man’s view of himself. Q Rev Biol 53:29–40.
7. Cavalli-Sforza L, Feldman MW (1973) Models for cultural inheritance. I. Group mean and within group variation. Theor Popul Biol 4:42–55.
8. Alland A, Jr (1972) Cultural evolution: The Darwinian model. Soc Biol 19:227–239.
9. Burnham KP, Anderson DR (1998) Model Selection and Inference: A Practical Information-Theoretic Approach (Springer, New York).
10. Haldane JBS (1964) A defense of beanbag genetics. Perspect Biol Med 7:343–359.
11. Guedes Jd, et al. (2013) Is poverty in our genes? Curr Anthropol 54:71–79.
12. Boyd R, Richerson PJ (1985) Culture and the Evolutionary Process (Chicago Univ Press, Chicago).
13. Mesoudi A (2017) Pursuing Darwin’s curious parallel: Prospects for a science of cultural evolution. Proc Natl Acad Sci USA 114:7853–7860.
14. Cavalli-Sforza LL, Feldman MW (1973) Cultural versus biological inheritance: Phenotypic transmission from parents to children. (A theory of the effect of parental phenotypes on children’s phenotypes). Am J Hum Genet 25:618–637.
15. Giraldeau L-A (1994) Social foraging: Individual learning and cultural transmission of innovations. Behav Ecol 5:35–43.
16. Henrich J (2004) Demography and cultural evolution: How adaptive cultural processes can produce maladaptive losses—The Tasmanian case. Am Antiq 69:197–214.
17. Henrich J, et al. (2016) Understanding cumulative cultural evolution. Proc Natl Acad Sci USA 113:E6724–E6725.
18. Vaesen K, Collard M, Cosgrove R, Roebroeks W (2016) Population size does not explain past changes in cultural complexity. Proc Natl Acad Sci USA 113:E2241–E2247.
19. Collard M, Ruttle A, Buchanan B, O’Brien MJ (2013) Population size and cultural evolution in nonindustrial food-producing societies. PLoS One 8:e72628.
20. Collard M, Buchanan B, Morin J, Costopoulos A (2011) What drives the evolution of hunter-gatherer subsistence technology? A reanalysis of the risk hypothesis with data from the Pacific Northwest. Philos Trans R Soc Lond B Biol Sci 366:1129–1138.
21. Bisin A, Verdier T (2010) The economics of cultural transmission and the dynamics of preferences. Handb Soc Econ 319:339–416.
22. Kobayashi Y, Aoki K (2012) Innovativeness, population size and cumulative cultural evolution. Theor Popul Biol 82:38–47.
23. Baldini R (2015) Revisiting the effect of population size on cumulative cultural evolution. J Cogn Cult 15:320–336.
24. Boyd R, Richerson PJ (1983) The cultural transmission of acquired variation: Effects on genetic fitness. J Theor Biol 100:567–596.
25. Aoki K, Feldman MW (2014) Evolution of learning strategies in temporally and spatially variable environments: A review of theory. Theor Popul Biol 91:3–19.
26. Aoki K, Wakano J, Feldman M (2005) The emergence of social learning in a temporally changing environment: A theoretical model. Curr Anthropol 46:334–340.
27. Fogarty L, Creanza N, Feldman MW (2015) Cultural evolutionary perspectives on creativity and human innovation. Trends Ecol Evol 30:736–754.
28. Kolodny O, Creanza N, Feldman MW (2015) Evolution in leaps: The punctuated accumulation and loss of cultural innovations. Proc Natl Acad Sci USA 112:E6762–E6769.
29. Kolodny O, Creanza N, Feldman MW (2016) Game-changing innovations: How culture can change the parameters of its own evolution and induce abrupt cultural shifts. PLOS Comput Biol 12:e1005302.
30. Henrich J, McElreath R (2003) The evolution of cultural evolution. Evol Anthropol 12:123–135.
31. Rendell L, et al. (2010) Why copy others? Insights from the social learning strategies tournament. Science 328:208–213.
32. Enquist M, Ghirlanda S, Jarrick A, Wachtmeister C-A (2008) Why does human culture increase exponentially? Theor Popul Biol 74:46–55.
33. Klein RG, Edgar B (2002) The Dawn of Human Culture (Wiley, New York).
34. Bar-Yosef O (1998) On the nature of transitions: The Middle to Upper Palaeolithic and the Neolithic revolution. Camb Archaeol J 8:141–163.
35. Roebroeks W (2008) Time for the Middle to Upper Paleolithic transition in Europe. J Hum Evol 55:918–926.
36. Darmstaedter L, Du Bois-Reymond R (1904) 4000 Jahre Pionier-Arbeit in den Exakten Wissenschaften (JA Stargardt, Berlin).
37. Aiyar S, Dalgaard C-J, Moav O (2008) Technological progress and regress in pre-industrial times. J Econ Growth 13:125–144.
38. Kuhn SL (2012) Emergent Patterns of Creativity and Innovation in Early Technologies: Origins of Human Innovation and Creativity (Elsevier, Oxford), pp 69–88.
39. Lehman HC (1947) The exponential increase of man’s cultural output. Soc Forces 25:281–290.
40. Crema ER, Kandler A, Shennan S (2016) Revealing patterns of cultural transmission from frequency data: Equilibrium and non-equilibrium assumptions. Sci Rep 6:39122.
41. Feldman MW, Cavalli-Sforza LL (1979) Aspects of variance and covariance analysis with cultural inheritance. Theor Popul Biol 15:276–307.

Creanza et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7787
42. Feldman MW, Laland KN (1996) Gene-culture coevolutionary theory. Trends Ecol Evol 11:453–457.
43. Chudek M, Henrich J (2011) Culture-gene coevolution, norm-psychology and the emergence of human prosociality. Trends Cogn Sci 15:218–226.
44. Richerson PJ, Boyd R (1978) A dual inheritance model of the human evolutionary process. I. Basic postulates and a simple model. J Soc Biol Struct 1:127–154.
45. Laland KN, Odling-Smee J, Feldman MW (2000) Niche construction, biological evolution, and cultural change. Behav Brain Sci 23:131–146, discussion 146–175.
46. Odling-Smee J, Laland KN, Feldman MW (2003) Niche Construction: The Neglected Process in Evolution (Princeton Univ Press, Princeton).
47. Laland KN, Kumm J, Van Horn JD, Feldman MW (1995) A gene-culture model of human handedness. Behav Genet 25:433–445.
48. Mesoudi A, Whiten A, Laland KN (2006) Towards a unified science of cultural evolution. Behav Brain Sci 29:329–347, discussion 347–383.
49. Rendell L, Fogarty L, Laland KN (2011) Runaway cultural niche construction. Philos Trans R Soc Lond B Biol Sci 366:823–835.
50. Laland KN, O’Brien MJ (2012) Cultural niche construction: An introduction. Biol Theory 6:191–202.
51. Feldman MW, Cavalli-Sforza LL (1989) On the theory of evolution under genetic and cultural transmission with application to the lactose absorption. Mathematical Evolutionary Theory, ed Feldman MW (Princeton Univ Press, Princeton), pp 145–173.
52. Ingram CJ, Liebert A, Swallow DM (2012) Population genetics of lactase persistence and lactose intolerance. eLS, 10.1002/9780470015902.a0020855.pub2.
53. Aoki K, Wakano J, Feldman M (2016) Gene-culture models for the evolution of altruistic teaching. On Human Nature: Biology, Psychology, Ethics, Policy, and Religion, eds Tibayrenc M, Ayala F (Academic, Amsterdam), pp 279–296.
54. Berg JJ, Coop G (2014) A population genetic signal of polygenic adaptation. PLoS Genet 10:e1004412.
55. Okbay A, et al.; LifeLines Cohort Study (2016) Genome-wide association study identifies 74 loci associated with educational attainment. Nature 533:539–542.
56. Benyamin B, et al.; Wellcome Trust Case Control Consortium 2 (WTCCC2) (2014) Childhood intelligence is heritable, highly polygenic and associated with FNBP1L. Mol Psychiatry 19:253–258.
57. Davies G, et al. (2011) Genome-wide association studies establish that human intelligence is highly heritable and polygenic. Mol Psychiatry 16:996–1005.
58. Minkov M, Bond MH (2015) Genetic polymorphisms predict national differences in life history strategy and time orientation. Pers Individ Dif 76:204–215.
59. Abdellaoui A, et al. (2015) Educational attainment influences levels of homozygosity through migration and assortative mating. PLoS One 10:e0118935.
60. Piffer D (2015) A review of intelligence GWAS hits: Their relationship to country IQ and the issue of spatial autocorrelation. Intelligence 53:43–50.
61. Domingue BW, Fletcher J, Conley D, Boardman JD (2014) Genetic and educational assortative mating among US adults. Proc Natl Acad Sci USA 111:7996–8000.
62. Maes HH, et al. (2006) Genetic and cultural transmission of smoking initiation: An extended twin kinship model. Behav Genet 36:795–808.
63. Marden JR, Walter S, Kaufman JS, Glymour MM (2016) African ancestry, social factors, and hypertension among non-Hispanic Blacks in the Health and Retirement Study. Biodemogr Soc Biol 62:19–35.
64. Paradies Y, et al. (2015) Racism as a determinant of health: A systematic review and meta-analysis. PLoS One 10:e0138511.
65. Nugent NR, Tyrka AR, Carpenter LL, Price LH (2011) Gene-environment interactions: Early life stress and risk for depressive and anxiety disorders. Psychopharmacology (Berl) 214:175–196.
66. Laeng B, Mathisen R, Johnsen JA (2007) Why do blue-eyed men prefer women with the same eye color? Behav Ecol Sociobiol 61:371–384.
67. Keller MC, et al. (2013) The genetic correlation between height and IQ: Shared genes or assortative mating? PLoS Genet 9:e1003451.
68. Treur JL, Vink JM, Boomsma DI, Middeldorp CM (2015) Spousal resemblance for smoking: Underlying mechanisms and effects of cohort and age. Drug Alcohol Depend 153:221–228.
69. Feldman MW, Cavalli-Sforza LL (1977) The evolution of continuous variation. II. Complex transmission and assortative mating. Theor Popul Biol 11:161–181.
70. Rice J, Cloninger CR, Reich T (1978) Multifactorial inheritance with cultural transmission and assortative mating. I. Description and basic properties of the unitary models. Am J Hum Genet 30:618–643.
71. Creanza N, Fogarty L, Feldman MW (2012) Models of cultural niche construction with selection and assortative mating. PLoS One 7:e42744.
72. Creanza N, Feldman MW (2014) Complexity in models of cultural niche construction with selection and homophily. Proc Natl Acad Sci USA 111(Suppl 3):10830–10837.
73. Eshel I, Cavalli-Sforza LL (1982) Assortment of encounters and evolution of cooperativeness. Proc Natl Acad Sci USA 79:1331–1335.
74. Centola D (2010) The spread of behavior in an online social network experiment. Science 329:1194–1197.
75. Centola D (2011) An experimental study of homophily in the adoption of health behavior. Science 334:1269–1272.
76. Thiessen D, Gregg B (1980) Human assortative mating and genetic equilibrium: An evolutionary perspective. Ethol Sociobiol 1:111–140.
77. Abdellaoui A, et al. (2013) Association between autozygosity and major depression: Stratification due to religious assortment. Behav Genet 43:455–467.
78. Hunley K, et al. (2008) Genetic and linguistic coevolution in Northern Island Melanesia. PLoS Genet 4:e1000239.
79. Hunley K, Long JC (2005) Gene flow across linguistic boundaries in Native North American populations. Proc Natl Acad Sci USA 102:1312–1317.
80. Srithawong S, et al. (2015) Genetic and linguistic correlation of the Kra–Dai-speaking groups in Thailand. J Hum Genet 60:1–10.
81. Barbujani G, Sokal RR (1990) Zones of sharp genetic change in Europe are also linguistic boundaries. Proc Natl Acad Sci USA 87:1816–1819.
82. Karafet TM, et al. (2016) Coevolution of genes and languages and high levels of population structure among the highland populations of Daghestan. J Hum Genet 61:181–191.
83. de Filippo C, et al. (2011) Y-chromosomal variation in sub-Saharan Africa: Insights into the history of Niger-Congo groups. Mol Biol Evol 28:1255–1269.
84. Robinson MR, et al. (2017) Genetic evidence of assortative mating in humans. Nat Hum Behav 1:0016.
85. Efferson C, Lalive R, Richerson PJ, McElreath R, Lubell M (2008) Conformists and mavericks: The empirics of frequency-dependent cultural transmission. Evol Hum Behav 29:56–64.
86. Henrich J, Boyd R (1998) The evolution of conformist transmission and the emergence of between-group differences. Evol Hum Behav 19:215–241.
87. Rogers EM (2010) Diffusion of Innovations (Simon and Schuster, New York).
88. Henrich J (2001) Cultural transmission and the diffusion of innovations: Adoption dynamics indicate that biased cultural transmission is the predominate force in behavioural change. Am Anthropol 103:992–1013.
89. Acerbi A, Ghirlanda S, Enquist M (2012) The logic of fashion cycles. PLoS One 7:e32541.
90. Acerbi A, Alexander Bentley R (2014) Biases in cultural transmission shape the turnover of popular traits. Evol Hum Behav 35:228–236.
91. Kendal J, Giraldeau LA, Laland K (2009) The evolution of social learning rules: Payoff-biased and frequency-dependent biased transmission. J Theor Biol 260:210–219.
92. Henrich J, Gil-White FJ (2001) The evolution of prestige: Freely conferred deference as a mechanism for enhancing the benefits of cultural transmission. Evol Hum Behav 22:165–196.
93. Chudek M, Heller S, Birch S, Henrich J (2012) Prestige-biased cultural learning: Bystander’s differential attention to potential models influences children’s learning. Evol Hum Behav 33:46–56.
94. Mesoudi A, O’Brien MJ (2008) The cultural transmission of Great Basin projectile-point technology II: An agent-based computer simulation. Am Antiq 73:627–644.
95. Mesoudi A (2011) An experimental comparison of human social learning strategies: Payoff-biased social learning is adaptive but underused. Evol Hum Behav 32:334–342.
96. Alberti M, et al. (2017) Global urban signatures of phenotypic change in animal and plant populations. Proc Natl Acad Sci USA, 10.1073/pnas.1606034114.
97. Laland KN, Brown GR (2006) Niche construction, human behavior, and the adaptive-lag hypothesis. Evol Anthropol 15:95–104.
98. Laland KN, Odling-Smee J, Feldman MW (2001) Cultural niche construction and human evolution. J Evol Biol 14:22–33.
99. Ihara Y, Feldman MW (2004) Cultural niche construction and the evolution of small family size. Theor Popul Biol 65:105–111.
100. Borgerhoff Mulder M (1998) The demographic transition: Are we any closer to an evolutionary explanation? Trends Ecol Evol 13:266–270.
101. Fogarty L, Creanza N, Feldman MW (2013) The role of cultural transmission in human demographic change: An age-structured model. Theor Popul Biol 88:68–77.
102. Barnosky AD, Koch PL, Feranec RS, Wing SL, Shabel AB (2004) Assessing the causes of late Pleistocene extinctions on the continents. Science 306:70–75.
103. Lansing JS, Cox MP, Downey SS, Janssen MA, Schoenfelder JW (2009) A robust budding model of Balinese water temple networks. World Archaeol 41:112–133.
104. Erickson CL (1992) Prehistoric landscape management in the Andean highlands: Raised field agriculture and its environmental impact. Popul Environ 13:285–300.
105. Delcourt PA, Delcourt HR (2004) Prehistoric Native Americans and Ecological Change: Human Ecosystems in Eastern North America Since the Pleistocene (Cambridge Univ Press, Cambridge, UK).
106. Boyd R (1999) Indians, Fire, and the Land in the Pacific Northwest (Oregon State Univ Press, Corvallis, OR).
107. Bliege Bird R, Bird DW, Codding BF, Parker CH, Jones JH (2008) The “fire stick farming” hypothesis: Australian Aboriginal foraging strategies, biodiversity, and anthropogenic fire mosaics. Proc Natl Acad Sci USA 105:14796–14801.
108. Roebroeks W, et al. (1992) Dense forests, cold steppes, and the palaeolithic settlement of Northern Europe. Curr Anthropol 33:551–586.
109. Wrangham RW (2009) Catching Fire: How Cooking Made Us Human (Basic Books, New York).
110. Stiner MC (2001) Thirty years on the “broad spectrum revolution” and paleolithic demography. Proc Natl Acad Sci USA 98:6993–6996.
111. Davis S, Rabinovich R, Goren-Inbar N (1988) Quaternary extinctions and population increase in western Asia: The animal remains from Biq’at Quneitra. Paléorient 14:95–105.
112. Hockett B, Haws JA (2005) Nutritional ecology and the human demography of Neandertal extinction. Quat Int 137:21–34.
113. Hardy BL (2010) Climatic variability and plant food distribution in Pleistocene Europe: Implications for Neanderthal diet and subsistence. Quat Sci Rev 29:662–679.
114. Flannery KV (1969) Origins and ecological effects of early domestication in Iran and the Near East. The Domestication and Exploitation of Plants and Animals, eds Ucko PJ, Dimbleby GW (Gerald Duckworth, London), pp 73–100.
115. Valla FR, Bar-Yosef O, eds (1991) The Natufian Culture in the Levant (International Monographs in Prehistory, Ann Arbor, MI).
116. Rowley-Conwy P, Layton R (2011) Foraging and farming as niche construction: Stable and unstable adaptations. Philos Trans R Soc Lond B Biol Sci 366:849–862.
117. Smith BD, Zeder MA (2013) The onset of the Anthropocene. Anthropocene 4:8–13.
118. Winterhalder B, Smith EA (2000) Analyzing adaptive strategies: Human behavioral ecology at twenty-five. Evol Anthropol Issues News Rev 9:51–72.

119. Henrich J, et al. (2001) In search of homo economicus: Behavioral experiments in 15 small-scale societies. Am Econ Rev 91:73–78.
120. Kaplan H, Hill K, Lancaster J, Hurtado AM (2000) A theory of human life history evolution: Diet, intelligence, and longevity. Evol Anthropol Issues News Rev 9:156–185.
121. Winterhalder B, Lu F, Tucker B (1999) Risk-sensitive adaptive tactics: Models and evidence from subsistence studies in biology and anthropology. J Archaeol Res 7:301–348.
122. Voland E (1998) Evolutionary ecology of human reproduction. Annu Rev Anthropol 27:347–374.
123. Belovsky GE (1988) An optimal foraging-based model of hunter-gatherer population dynamics. J Anthropol Archaeol 7:329–372.
124. Winterhalder B, Baillargeon W, Cappelletto F, Daniel IR, Prescott C (1988) The population ecology of hunter-gatherers and their prey. J Anthropol Archaeol 7:289–328.
125. Broughton JM (1997) Widening diet breadth, declining foraging efficiency, and prehistoric harvest pressure: Ichthyofaunal evidence from the Emeryville Shellmound, California. Antiquity 71:845–862.
126. Low BS, Heinen JT (1993) Population, resources, and environment: Implications of human behavioral ecology for conservation. Popul Environ 15:7–41.
127. Stiner MC, Munro ND, Surovell TA, Tchernov E, Bar-Yosef O (1999) Paleolithic population growth pulses evidenced by small animal exploitation. Science 283:190–194.
128. Stiner MC, Munro ND, Surovell TA (2000) The tortoise and the hare. Curr Anthropol 41:39–79.
129. Gilpin W, Feldman MW, Aoki K (2016) An ecocultural model predicts Neanderthal extinction through competition with modern humans. Proc Natl Acad Sci USA 113:2134–2139.
130. Skoglund P, et al. (2012) Origins and genetic legacy of Neolithic farmers and hunter-gatherers in Europe. Science 336:466–469.
131. Aoki K, Shida M, Shigesada N (1996) Travelling wave solutions for the spread of farmers into a region occupied by hunter–gatherers. Theor Popul Biol 50:1–17.
132. Patterson MA, Sarson GR, Sarson HC, Shukurov A (2010) Modelling the Neolithic transition in a heterogeneous environment. J Archaeol Sci 37:2929–2937.
133. Allentoft ME, et al. (2015) Population genomics of bronze age Eurasia. Nature 522:167–172.
134. Goldberg A, Günther T, Rosenberg NA, Jakobsson M (2017) Ancient X chromosomes reveal contrasting sex bias in Neolithic and Bronze Age Eurasian migrations. Proc Natl Acad Sci USA 114:2657–2662.
135. Wossink A (2009) Challenging Climate Change: Competition and Cooperation Among Pastoralists and Agriculturalists in Northern Mesopotamia (c. 3000-1600 BC) (Sidestone, Leiden, The Netherlands).
136. Spielmann KA, Eder JF (1994) Hunters and farmers: Then and now. Annu Rev Anthropol 23:303–323.
137. Kwiatkowski DP (2005) How malaria has affected the human genome and what human genetics can teach us about malaria. Am J Hum Genet 77:171–192.
138. Tishkoff SA, et al. (2001) Haplotype diversity and linkage disequilibrium at human G6PD: Recent origin of alleles that confer malarial resistance. Science 293:455–462.
139. Mead S, et al. (2003) Balancing selection at the prion protein gene consistent with prehistoric kurulike epidemics. Science 300:640–643.
140. Bustamante CD, et al. (2005) Natural selection on protein-coding genes in the human genome. Nature 437:1153–1157.
141. Sabeti PC, et al.; International HapMap Consortium (2007) Genome-wide detection and characterization of positive selection in human populations. Nature 449:913–918.
142. Enard D, Cai L, Gwennap C, Petrov DA (2016) Viruses are a dominant driver of protein adaptation in mammals. eLife 5:e12469.
143. Durham WH (1991) Coevolution: Genes, Culture, and Human Diversity (Stanford Univ Press, Stanford, CA).
144. Lindenbaum S (2015) Kuru Sorcery: Disease and Danger in the New Guinea Highlands (Routledge, Abingdon, UK).
145. Boni MF, Feldman MW (2005) Evolution of antibiotic resistance by human and bacterial niche construction. Evolution 59:477–491.
146. Walter J, Ley R (2011) The human gut microbiome: Ecology and recent evolutionary changes. Annu Rev Microbiol 65:411–429.
147. Szilagyi A, Galiatsatos P, Xue X (2016) Systematic review and meta-analysis of lactose digestion, its impact on intolerance and nutritional effects of dairy food restriction in inflammatory bowel diseases. Nutr J 15:67.
148. Bocquet-Appel J (2002) Paleoanthropological traces of a Neolithic demographic transition. Curr Anthropol 43:637–650.
149. Gage TBB, DeWitte S (2009) What do we know about the agricultural demographic transition? Curr Anthropol 50:649–655.
150. Ammerman AJ, Cavalli-Sforza LL (1984) The Neolithic Transition and the Genetics of Populations in Europe (Princeton Univ Press, Princeton).
151. Henn BM, Cavalli-Sforza LL, Feldman MW (2012) The great human expansion. Proc Natl Acad Sci USA 109:17758–17764.
152. Powell A, Shennan S, Thomas MG (2009) Late Pleistocene demography and the appearance of modern human behavior. Science 324:1298–1301.
153. Richerson PJ, Boyd R (1984) Natural selection and culture. Bioscience 34:430–434.
154. Colleran H (2016) The cultural evolution of fertility decline. Philos Trans R Soc Lond B Biol Sci 371:20150152.
155. Leslie PH (1948) Some further notes on the use of matrices in population mathematics. Biometrika 35:213–245.
156. Carotenuto L, Feldman MW, Cavalli-Sforza L (1989) Age structure in models of cultural transmission. Working paper (Morrison Institute for Population and Resource Studies, Stanford, CA), No 16.
157. Banister J (2004) Shortage of girls in China today. J Popul Res 21:19–45.
158. Tuljapurkar S, Li N, Feldman MW (1995) High sex ratios in China’s future. Science 267:874–876.
159. Li N, Feldman MW, Li S (2000) Cultural transmission in a demographic study of sex ratio at birth in China’s future. Theor Popul Biol 58:161–172.
160. Bhattacharjya D, Sudarshan A, Tuljapurkar S, Shachter R, Feldman M (2008) How can economic schemes curtail the increasing sex ratio at birth in China? Demogr Res 19:1831–1850.
161. Fogarty L, Feldman MW (2011) The cultural and demographic evolution of son preference and marriage type in contemporary China. Biol Theory 6:272–282.
162. Barkow JH, O’Gorman R, Rendell L (2012) Are the new mass media subverting cultural transmission? Rev Gen Psychol 16:121–133.
163. Creanza N, Feldman MW (2016) Worldwide genetic and cultural change in human evolution. Curr Opin Genet Dev 41:85–92.
164. Enquist M, Ghirlanda S (2007) Evolution of social learning does not explain the origin of human cumulative culture. J Theor Biol 246:129–135.
165. Rendell L, Fogarty L, Laland KN (2010) Rogers’ paradox recast and resolved: Population structure and the evolution of social learning strategies. Evolution 64:534–548.
166. Arbilly M, Weissman DB, Feldman MW, Grodzinski U (2014) An arms race between producers and scroungers can drive the evolution of social cognition. Behav Ecol 25:487–495.
167. Rendell L, et al. (2011) Cognitive culture: Theoretical and empirical insights into social learning strategies. Trends Cogn Sci 15:68–76.
168. Hruschka DJ, et al. (2009) Building social cognitive models of language change. Trends Cogn Sci 13:464–469.
169. Nowak MA, Krakauer DC (1999) The evolution of language. Proc Natl Acad Sci USA 96:8028–8033.
170. Gray RD, Greenhill SJ, Ross R (2007) The pleasures and perils of Darwinizing culture (with phylogenies). Biol Theory 2:1–26.
171. Kolodny O, Lotem A, Edelman S (2015) Learning a generative probabilistic grammar of experience: A process-level model of language acquisition. Cogn Sci 39:227–267.
172. Hovers E (2012) Invention, reinvention and innovation: The makings of Oldowan lithic technology. Origins of Human Innovation and Creativity, ed Elias S (Elsevier, Oxford).
173. Bar-Yosef O (2002) The Upper Paleolithic revolution. Annu Rev Anthropol 31:363–393.
174. Klein RG (2008) Out of Africa and the evolution of human behavior. Evol Anthropol 17:267–281.
175. Boyd R, Richerson PJ (2009) Voting with your feet: Payoff biased migration and the evolution of group beneficial behavior. J Theor Biol 257:331–339.
176. Wiens JJ, Hollingsworth BD (2000) War of the Iguanas: Conflicting molecular and morphological phylogenies and long-branch attraction in iguanid lizards. Syst Biol 49:143–159.
177. Borgerhoff Mulder M, et al. (2009) Intergenerational wealth transmission and the dynamics of inequality in small-scale societies. Science 326:682–688.
178. Fogarty L, Strimling P, Laland KN (2011) The evolution of teaching. Evolution 65:2760–2770.
179. Fehér O, Wang H, Saar S, Mitra PP, Tchernichovski O (2009) De novo establishment of wild-type song culture in the zebra finch. Nature 459:564–568.
180. Verzijden MN, et al. (2012) The impact of learning on sexual selection and speciation. Trends Ecol Evol 27:511–519.
181. Lachlan RF, Servedio MR (2004) Song learning accelerates allopatric speciation. Evolution 58:2049–2063.
182. Creanza N, Fogarty L, Feldman MW (2016) Cultural niche construction of repertoire size and learning strategies in songbirds. Evol Ecol 30:285–305.
183. Aplin LM, et al. (2015) Experimentally induced innovations lead to persistent culture via conformity in wild birds. Nature 518:538–541.
184. Rendell L, Whitehead H (2001) Culture in whales and dolphins. Behav Brain Sci 24:309–324, discussion 324–382.
185. Whiten A, et al. (1999) Cultures in chimpanzees. Nature 399:682–685.
186. Whitehead H (2017) Gene–culture coevolution in whales and dolphins. Proc Natl Acad Sci USA 114:7814–7821.
187. Perry SE, Barrett BJ, Godoy I (2017) Older, sociable capuchins (Cebus capucinus) invent more social behaviors, but younger monkeys innovate more in other contexts. Proc Natl Acad Sci USA 114:7806–7813.
188. Fragaszy D, Izar P, Visalberghi E, Ottoni EB, de Oliveira MG (2004) Wild capuchin monkeys (Cebus libidinosus) use anvils and stone pounding tools. Am J Primatol 64:359–366.
189. Whiten A, Horner V, de Waal FBM (2005) Conformity to cultural norms of tool use in chimpanzees. Nature 437:737–740.
190. Ottoni EB, Izar P (2008) Capuchin monkey tool use: Overview and implications. Evol Anthropol 17:171–178.
191. Whiten A (2011) The scope of culture in chimpanzees, humans and ancestral apes. Philos Trans R Soc Lond B Biol Sci 366:997–1007.
192. Whiten A (2017) Culture extends the scope of evolutionary biology in the great apes. Proc Natl Acad Sci USA 114:7790–7797.
193. Seneviratne SI, Donat MG, Pitman AJ, Knutti R, Wilby RL (2016) Allowable CO2 emissions based on regional and impact-related climate targets. Nature 529:477–483.
194. Fischer RA, Byerlee D, Edmeades G (2014) Crop Yields and Global Food Security (ACIAR, Canberra, Australia).
195. Garibaldi LA, et al. (2017) Farming approaches for greater biodiversity, livelihoods, and food security. Trends Ecol Evol 32:68–80.
196. Kassam A, Friedrich T, Shaxson F, Pretty J (2009) The spread of conservation agriculture: Justification, sustainability and uptake. Int J Agric Sustain 7:292–320.
197. Rhines AS (2013) The role of sex differences in the prevalence and transmission of tuberculosis. Tuberculosis (Edinb) 93:104–107.

Culture extends the scope of evolutionary biology in
the great apes
Andrew Whiten^a,b,1
^a Centre for Social Learning and Cognitive Evolution, University of St. Andrews, St. Andrews, KY16 9JP, United Kingdom; and ^b Scottish Primate Research Group, School of Psychology and Neuroscience, University of St. Andrews, St. Andrews, KY16 9JP, United Kingdom

Edited by Kevin N. Laland, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark April 29, 2017
(received for review January 14, 2017)

Discoveries about the cultures and cultural capacities of the great apes have played a leading role in the recognition emerging in recent decades that cultural inheritance can be a significant factor in the lives not only of humans but also of nonhuman animals. This prominence derives in part from these primates being those with whom we share the most recent common ancestry, thus offering clues to the origins of our own thoroughgoing reliance on cumulative cultural achievements. In addition, the intense research focus on these species has spawned an unprecedented diversity of complementary methodological approaches, the results of which suggest that cultural phenomena pervade the lives of these apes, with potentially major implications for their broader evolutionary biology. Here I review what this extremely broad array of observational and experimental methodologies has taught us about the cultural lives of chimpanzees, gorillas, and orangutans and consider the ways in which this knowledge extends our wider understanding of primate biology and the processes of adaptation and evolution that shape it. I address these issues first by evaluating the extent to which the results of cultural inheritance echo a suite of core principles that underlie organic Darwinian evolution but also extend them in new ways and then by assessing the principal causal interactions between the primary, genetically based organic processes of evolution and the secondary system of cultural inheritance that is based on social learning from others.

social learning | culture | evolutionary biology | chimpanzee | orangutan

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: A.W. wrote the paper.
The author declares no conflict of interest.
This article is a PNAS Direct Submission. K.N.L. is a guest editor invited by the Editorial Board.
^1 Email: a.whiten@st-andrews.ac.uk.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1620733114/-/DCSupplemental.
7790–7797 | PNAS | July 25, 2017 | vol. 114 | no. 30 www.pnas.org/cgi/doi/10.1073/pnas.1620733114

Recent decades have revealed social learning (learning from others) to be pervasive across the animal kingdom, with important implications for evolutionary biology at large (1) and the subject of the Sackler Colloquium published here (2). This article focuses on great apes: chimpanzee (Pan troglodytes), gorilla (Gorilla gorilla), and orangutan (Pongo pygmaeus). Other primates are dealt with elsewhere in the issue (3, 4). Despite an early report (5), we still know little about cultural phenomena in chimpanzees’ rarer sister species, the bonobo (Pan paniscus), so bonobos are omitted here. I also make only limited reference to human culture, although we are technically also great apes. Human culture is extensively treated in other papers in this issue.

I first survey the nature and scope of social learning and associated aspects of cultural transmission in great apes, concluding that the depth and diversity of observational and experimental evidence for cultural phenomena are unparalleled among nonhuman species. The evidence thus accumulated suggests that culture permeates the lives of the great apes in the breadth of behavioral repertoires affected and also in their time-depth. These properties may be evolutionarily significant. The authors of a comprehensive recent review of cetacean culture concluded that “Culture . . . is a major part of what the whales are” (ref. 6, p. 7; and see ref. 7). Such a statement is obviously true for our own species (8–11); here I examine the justifications for thinking the phrase also has validity for great apes.

Following a sister review ranging much more widely across both vertebrates and invertebrates (1), I take eight core principles of evolution illuminated by Darwin (12) and assess the extent to which they apply to cultural phenomena in the great apes (henceforth simply “apes”), as they do in humans (13). I then explore ways in which cultural inheritance goes yet further beyond these principles, creating new evolutionary phenomena. Finally I address interactions between the primary manifestations of organic evolution based on genetic inheritance and the “second inheritance system” (14) based on social learning. In a now long-standing body of literature for humans, this interaction has been called “gene–culture coevolution” (15); the logic of such coevolution (10, 16) may apply to other cultural animals (1, 7).

Diverse and Convergent Evidence for the Scope of Great Ape Culture

Geographic Variation in Traditions in the Wild. In 1986 Goodall began to chart differences in behavior patterns among chimpanzee study sites across Africa (17), proposing these differences as cultural variants when no genetic or environmental explanation was apparent (later called the “method of exclusion”). The approach became more comprehensive with time (18, 19), eventually benefitting from a systematic collaboration between multiple long-term research groups (20, 21). Similar collaborative analyses were soon achieved by orangutan field researchers (22) and more recently by a gorilla consortium (23). These analyses converged in reporting multiple cultural variants in all three genera: 39 in Pan, 24 in Pongo, and 23 in Gorilla. The variants spanned apes’ behavioral repertoires, including a great variety of tool use, food processing, and social behavior, as discussed further below. Further variants have continued to be reported intermittently for Pan (24) and Pongo, in the latter case leading to a revised tally of 26–35 variants, depending upon the criteria applied (25).

These surveys are vulnerable to false positives (it can be difficult to be sure that all alternatives to social learning have been excluded) and also to false negatives (cultural adaptations to local environmental properties may be inappropriately excluded) (26). However, these pioneering efforts provided essential platforms for more refined approaches, some incorporating both genetic and environmental variables into analyses (27). Other advances yielded confirmatory evidence for culture through (i) more focused microhabitat analyses for specific behaviors such as ant-dipping (28, 29); (ii) comparisons between neighboring communities sharing genes and habitat properties (30); and (iii) social learning experiments, as for nut-cracking (31, 32).

The broad geographic surveys thus provide an initially imperfect but progressively refined overall picture of ape cultural
repertoires. The approach has been systematically applied to spider monkeys (Ateles, reporting 23 cultural variants) (33) but not yet, to my knowledge, to other animals. Evidence exists for multiple cultural variants in other species such as killer whales, which display very different hunting repertoires (e.g., for fish versus seals), song repertoires, and migratory patterns (6), but systematic tabulations have yet to facilitate direct cross-species comparisons.

Intergroup Variation in Traditions in Captive Communities. A parallel approach has compared neighboring communities in captive contexts, with the advantage that genetic and environmental explanations for group differences can be dismissed more cleanly. For example, in the Chimfunshi chimpanzee sanctuary in Zambia, a bizarre habit of inserting a blade of grass into one ear and leaving it there spread in one group but not in others (34). Moreover, a distinctive “hand-clasp” form of grooming was absent in this and one other group but was customary in others, in which it additionally took different forms (35). Similar group contrasts were found in the means by which hard-shelled Strychnos fruits were opened (36). At the Yerkes Center in the United States a further contrast in hand-clasp grooming emerged and spread in one group over several years but remained absent in another group (37). These results reinforce those derived from the studies in the wild outlined above.

Quantitative Evidence for Vertical Mother-to-Offspring Transmission. A study of the ontogeny of using stem-tools for termite-fishing found that juvenile female chimpanzees spent significantly more time attending to their mother’s fishing than did their male peers (38). Consistent with the skills being learned by observation, the young females tended to master the technique a whole year ahead of the males, with a significant tendency to match even the length of probe their mother typically inserted into the mound (38).

Researchers studying orangutans have called the focused visual attention of juveniles “peering” (Fig. 1) (39). Building on studies documenting correlations between maternal and juvenile foraging profiles (40, 41), a suite of predictions were confirmed that were consistent with peering functioning to facilitate learning key survival skills (39). In foraging and nest-building contexts, in which peering is most frequent, it was found that (i) the frequency of peering in foraging contexts was predicted by the quantified complexity of processing operations and also by the skill’s rarity; (ii) peering was followed by a higher rate of exploration of the item concerned, as was also confirmed specifically for the use of sticks in foraging (seen only at one of the two sites studied); (iii) peering rose along with the learning of new skills and diminished as competence was achieved; (iv) peering at nest-building was followed by a rise in nest-building over the next hour; (v) developmentally, peering tracked the peak time of learning to make nests; and (vi) by about age 5 y, peering directed at a juvenile’s mother tipped below 50% and was directed more toward others, from whom there was still something to learn (39). Such observations offer a compelling case that juvenile apes’ close peering facilitates the learning of major life skills.

Fig. 1. Peering (39) by a juvenile orangutan as her mother extracts termites from dead wood. Image courtesy of Christiaan Conradie and Caroline Schuppli.

Quantitative Evidence for Horizontal Transmission. There is both intracommunity and intercommunity evidence for horizontal transmission in the wild. An example of the former was tracked by network-based diffusion analysis, confirming that a novel chimpanzee behavior, using moss as a water-sponge, spread from the alpha male along lines of social affiliation, providing quantitative circumstantial evidence for transmission (42). Examples of intercommunity transmission include (i) a significant acceleration in habituation to human observers in a chimpanzee community being newly habituated after two females immigrated from a well-habituated community (43); and (ii) the spread of ant-fishing to a new community after the immigration of a proficient individual from a neighboring community where fishing was habitual (44).

Quantitative and Qualitative Evidence for Investment in Transmission. Videos of termite-fishing have documented skilled chimpanzee mothers donating tools to less competent juveniles (Fig. S1), thus suffering a diminished duration and rate of termite-fishing while the recipient enjoyed improved fishing (45). The authors propose these observations meet commonly accepted criteria for a functional (as opposed to intentional) concept of “teaching.” They also document mothers orally splitting their tool lengthwise neatly to make two functional tools or bringing multiple tools and suggest these behaviors partially buffer mothers from the costs of youngsters’ demands. Alternatively it might be argued that these actions are essentially unnecessary and thus represent the more compelling evidence that the behavior has costs and therefore counts as teaching, even if the teaching is not as active as teaching by scorpion provision by meerkats (46) or beaching to catch seals by killer whales (6). However, the pattern of costs and benefits suggests that this support has positive fitness benefits for the young, and parallel reports concerning the use of tools for nut-cracking have also been described (47).

An earlier report described more active involvement in curbing youngsters’ exploration of potentially dangerous food-types. Hiraiwa-Hasegawa reported that when an infant, PN, reached to touch some fig leaves, “her mother, FT, took PN’s hand and moved it away from the leaves. As PN continued . . . FT took the leaves from PN’s hand, plucked all the leaves within her arm’s reach and dropped them to the ground” (ref. 48, p. 280). At least one other mother behaved similarly and “prohibited . . . infants only from feeding on the individual trees that they themselves never fed on.”

Dyadic Experimental Studies of Social Learning. Experimental reports of social learning by naive individuals from proficient models multiplied for more than a century in all great ape genera and have been tabulated and enumerated in successive reviews (49: n = 19 studies; 50: n = 33 further studies; 51: n = 25 studies comparing two or more ape species). The more recent experiments often adopt a highly informative two-action approach in which participants see one of two models, each trained to tackle a problem such as opening a foraging box in a different way. Ideally a third, no-model control group is included. This method has demonstrated social learning in captive chimpanzees (52), gorillas (53), and orangutans (54) that implies the copying of the



action (imitation) or movements of the manipulanda (emulation) that were seen, rather than the simpler process of mere enhancement of the manipulanda (55). These approaches have further dissected the particular social learning processes at work, a topic beyond the scope of this review (see refs. 56–59 for in-depth treatment). More relevant to the present discussion and reviewed further below are extensions of these approaches to track the successive cultural transmissions necessary to sustain traditions.

Nevertheless these dyadic experiments can importantly complement results from the field. For example, nut-cracking with natural hammer materials, found naturally only in West Africa (Fig. 2), is shown not to be explicable as an instinct in the West that is simply absent in the East, because East African chimpanzees exposed to proficient models became proficient, whereas controls in a no-model control condition did not, showing that the behavior is (socially) learned (Fig. 2) (31, 32).

Fig. 2. Convergent evidence for a culture of nut-cracking in chimpanzees. Evidence for nut-cracking is seen at multiple sites in West Africa (20, 21, 60) (white stars) but is absent at others (black stars). The gray star indicates an early report in Cameroon, which was not subsequently confirmed. Independent studies confirmed availability of raw materials at two such sites (61, 62). Experiments showed East African chimpanzees (two-letter ID codes) did not initially nut-crack (Phase 1), but when half of the population was exposed to a proficient model, they began to do so (Phase 2), and all did so once exposed (Phase 3) (31, 32).
[Fig. 2 panel labels: Nut-cracking recorded at eight sites in West Africa across ~500 km (60). No nut-cracking, but presence of all necessary raw materials confirmed: Boesch et al. (61), McGrew et al. (62). In an experiment in an island sanctuary, East African chimpanzees who do not nut-crack in the wild learned nut-cracking through observation (31, 32).]

Cultural Diffusion Experiments. Experiments focused on the broader phenomenon of cultural diffusion typically begin with models displaying different solutions to a task and then track the potential spread of the solutions in others who witness them. Alternative variants provide important complementary information (63, 64). For example, the “transmission chain” design pairs a first model (A) with a naive individual (B); then when B achieves some competence criterion, B becomes a model for a further individual (B for C, C for D, and so on). Achieving such configurations peacefully with apes requires sensitive experimental maneuvering, but transmission has been demonstrated along chains of up to five participants in chimpanzees (65) and orangutans (66), as well as in children (65). These experiments provide important models of transmission across cultural “generations,” implying a potential for cultural transmission across what would naturally be decades of ape life.

By contrast, “open diffusion” designs mimic transmission in the wild in which whole groups are exposed to alternative models; watching and copying is “open” to any individual in the group. This design has been applied in chimpanzees and children in several experiments (67–69), including two experiments in which tool-use behavioral variants were transmitted across three groups with significant fidelity (69). These results are important for interpreting putative cultures in the wild, such as the nut-cracking distributed across several hundred kilometers of West Africa, which would have required repeated intercommunity transmission.

How Pervasive Is the Role of Cultural Inheritance in the Lives of Great Apes?

Social Learning Shapes a Broad Repertoire of Cultural Variants. We can first examine how pervasive the role of cultural inheritance is in the lives of great apes by surveying the range of behaviors described in the cross-site comparisons summarized above (20–23, 27).

The chimpanzee lists of 1999–2001 include 30 different kinds of tool use ascribed to culture, all suggesting functional and adaptive payoffs beneficial to the performer’s biological fitness (14, 21, 24). Others have been reported since then, including sticks bitten and thus sometimes made sharp and used to stab or evict bush baby prey at Fongoli in Senegal (70, 71); a kit of stout tools used to make tunnels; and fine stems used to fish down these tunnels and extract termites from nests deep underground (72). A majority of such tools are used in food extraction, but others are used in hygienic actions such as wiping blood or semen off fur, in protective comfort roles such as creating leaf-cushions on wet ground, and in local courtship gambits such as bending small shrubs on the ground (20, 21). Other diverse items include forms of food processing without tools, ways of dispatching ectoparasites located during grooming, and grooming customs such as the hand-clasp that shows variant forms even in neighboring communities (73).

The orangutan list of 2003 (22) also includes a dozen different forms of tool use, several used for food extraction, such as holding a small stick in the mouth to extract seeds from Neesia fruits or using “leaf-gloves” to handle spiky fruit. Hygiene/comfort examples include using a leaf napkin to wipe off sticky latex. The list as a whole is diverse, incorporating forms of arboreal locomotion,

vocal sounds (some modified using leaves), variant nest constructions such as adding sun covers, and whether slow lorises are eaten (irrespective of lorises’ availability).

The recent gorilla list of putative cultural variants (23) also displays diversity that includes making bridges across water, rubbing fruit to clean it or remove spines, incorporating tree-slapping into displays, using teeth as a “fifth limb” in climbing, forms of bodily contact while traveling together, and forms of social play.

The Extent of Vertical Intergenerational Transmission. A detailed study of the foraging behavior of young wild orangutans before and after weaning concluded that their “diets were essentially identical to their mothers’ even though not all mothers had the same diet” . . . “immatures selectively observed their mothers during extractive foraging, which increased goal-directed practice but not general manipulation of similar objects, suggesting observational forms of learning of complex skills” (ref. 40, p. 62). This conclusion was reinforced by a later study focused on peering (39), referred to earlier. At 2–4 y of age, infants foraged with their mother more than 90% of the time; 94% of their feeding time occurred when the mother was also feeding; and 96% of their feeding was on the same items as the mother’s (39). The extent of cofeeding is clearly massive in preweaning years and is likely to engender the vertical social transmission of dietary profiles.

These years of mother–offspring association and cofeeding are typical of all the great apes and appear to lay down dietary preferences that change relatively little after weaning. Although the social learning implied may be as simple as enhancement of a food type by the mother’s feeding on it, such effects are likely to be profoundly important, because large diet-sets need to be mastered and selected from the even more vast options a tropical forest offers. This mastery includes avoiding the numerous plant parts that are toxic, selecting relatively nutritious options, and avoiding relatively poor ones. Chimpanzees may eat more than 300 different food types (species × parts) in a year (74); in the Lopé Park of Gabon, for example, fruit alone is taken from 114 different plants (75) selected from among many hundreds of potential food types available. The diet of gorillas may be similarly diverse; gorillas in the Alfi Mountains of Cameroon eat more than 200 different food types, including fruits, seeds, leaves, stems, pith, flowers, bark, roots, and invertebrates (75). For the orangutans of Tanjung Puting in Borneo, the figure is again more than 300 different food types (76). However, the dietary profiles of different populations may vary greatly, as suggested by earlier chimpanzee studies (77) and confirmed more recently in neighboring orangutan populations separated by a large river, which displayed 60% difference in diet, contrasting with intrapopulation homogeneity (78). Years of close apprenticeship to a mother who daily displays her knowledge of such a large but selective diet-set likely provide an important means of achieving an adaptive response to this challenging complexity.

Time Depth of Cultural Transmission. Long-term field sites have shown that techniques such as termite-fishing have continued across several generations during the half-century of research now achieved. However, this time-frame pales in comparison with the discoveries of real archaeological excavation, which in the Tai Forest of Ivory Coast reached a depth corresponding to 4,300 y, where remains of nut-cracking were identified (illustrated in figure 1 of ref. 1) beneath those currently generated on the surface by chimpanzees (79). Of course, this behavior may be very much older. Once such a beneficial technology becomes customary, it may continue in perpetuity, pending major ecological perturbation. This example suggests that ape cultural inheritance spans not only the breadth of behavioral repertoires outlined earlier but also a potentially significant time depth comparable to that familiar in organic evolution via genetic inheritance.

Does Ape Culture Instantiate a New Form of Evolution?

Ape culture may have pervasive effects in shaping the behavioral repertoires of successive generations in the ways reviewed above, but do these effects imply an extension of biology in the sense of instantiating a new form of evolution based not on genetic but on social inheritance? This development is what Dawkins proposed in his concept of culturally replicated “memes” as analogies of genes, creating a new form of evolution in the case of human culture (80). The idea of aspects of culture such as language evolving through variation, (cultural) inheritance, and selection goes back to Darwin’s own writings (81) and was highlighted as the tenth and latest of the major evolutionary transitions proposed by Maynard Smith and Szathmáry (82). Mesoudi et al. (13) tackled the issue in finding abundant evidence for counterparts in human culture of eight major principles Darwin set out in the Origin (12): variation, selection, inheritance, adaptation, accumulation of modifications, geographic variation, convergence, and changes of function. How does ape culture compare?

In addressing this question we must be clear about the phenomena for which we are querying a potential “evolution.” If a chimpanzee invents a better hammer for nut-cracking (perhaps using a stone rather than wood), this innovation may enhance that individual’s biological fitness, so that its genes are better represented in future generations, i.e., natural selection shapes biological evolution. However, if others copy use of the new tool, the fitness (reproductive success) of that cultural entity (stone-tool use) will be enhanced through its spread, and to that extent there is cultural evolution of this behavior. It is this second phenomenon that we are addressing here. Effects on an individual culture-bearer’s biological, inclusive fitness are a different matter to which we return in a later section. We can now consider the eight evolutionary principles noted above.

Variation, Selection, and Inheritance. The three principles of variation, selection, and inheritance can together be regarded as the core trinity of Darwinian evolution. Their joint working is an evolutionary algorithm that has been suggested to have the power to explain a multitude of phenomena beyond the living systems to which Darwin applied it (83, 84).

As we have seen above, there is plentiful evidence in the great apes for the feature of inheritance through social learning that provides sufficient fidelity to sustain traditions. There is also cultural variation, in part because, compared with gene replication, social learning is prone to imperfect copying. In the arrays of cultural variants among great apes discussed earlier in this article, there are plenty of behaviors that are displayed by many but not all individuals in a community (these behaviors are classed as “habitual” rather than “customary”).

By contrast, as yet there seems to be little direct recording of cultural evolutionary change through competition and selection within this variation. This absence of evidence of evolutionary change is perhaps unsurprising. During the human Stone Age, even when sophisticated, bilaterally symmetric Acheulian blades showed an advance over earlier crude Oldowan tools, they changed relatively little over a million years (85). If, as is plausible, such stability characterizes chimpanzee nut-cracking and other cultural variants of apes, then we will see little evidence of cultural selection in human lifetimes. Of course, organic evolutionary change is itself often slow compared with scientific lifetimes, although instructive exceptions have often followed human-caused environmental perturbations that create new selection pressures. The classic example is selection favoring dark morphs of peppered moths, better camouflaged against the sooty surfaces of the industrial revolution, and then flipping to favor light-colored morphs as the world became cleaner again.

Accordingly I have suggested that similar contexts of anthropogenic change may be fruitful for investigating cultural evolution in animals (1). Scientific experiments may offer a convenient instance. For example, in a pioneering cultural diffusion study, three juvenile chimpanzees were confronted with and avoided two novel objects (86). One youngster was then replaced with a



naive one, and this replacement was repeated, so after every third but their complexity suggests elementary forms of cumulative
such cycle the triplet contained different individuals. Never- culture, comparable perhaps to the achingly slow forms that
theless, approaches to the objects increased steadily and in later characterized the early hominin Stone Age.
generations of triplets became customary (Fig. S2). Accordingly,
here there was variation in boldness, inheritance insofar Geographic Variation. Because the Darwinian algorithm operates
as naive youngsters learned from bolder ones that the objects in different regions, organic characteristics differentiate, and
could be safely approached and explored, and competitive se- speciation may occur. Parallel effects occur in human cultural
lection favored small, progressive steps in boldness. Playing evolution (13, 88). As we have seen, great apes show evidence of
with the objects thus could evolve as the norm in the later gener- different traditions at geographically separated locations, and
ations composed of different youngsters than the original shy ones. there is evidence from all of the great ape genera that differences
In a counterpart from the wild, two individuals from a human- in putative cultural profiles are correlated with the geographic
habituated community of wild chimpanzees immigrated into a separation of communities (23, 24). As humans or other apes
neighboring community that scientists were beginning to habituate, disperse over greater distances, one would expect both genetic
and at that point habituation accelerated significantly (43). and cultural similarities to diminish, and indeed Langergraber
In these examples an initially common variant (caution) was et al. (89) showed that cultural variation in chimpanzees is also
replaced competitively by another (boldness) which was adap- correlated with genetic variation (but this correlation does not
tively superior (fear was unnecessary in these contexts). Like- mean genes explain the behavioral differences; see the supple-
wise, in all the cultural diffusion experiments with apes cited mentary information in ref. 79 for further discussion of this study).
earlier, improved foraging techniques spread across test groups Kamilar and Atkinson (90) demonstrated a nested structure in
to replace the less competitive behavioral state characterized by four samples of human cultural repertoires in North America and
their absence. Here, again, there was variation (although in this New Guinea, which would occur if, as people disperse, traits are
case the variation was engineered by the experimenter), in- sequentially added in or lost. Consistent with earlier cladistic
heritance via social learning, and competitive selection favoring analyses of the branching pattern of chimpanzee profiles (91),
the cultural spread of the new foraging technique. chimpanzees were found to display this pattern of nestedness
I suggest experiments such as the one reported in ref. 86 may across African sites. However, orangutans did not, in agreement
allow us to explore the capacity of apes to exemplify, even if in a with an earlier detailed orangutan study (27) and possibly reflecting
limited way, the operation of the Darwinian-trinity algorithm in a a greater preponderance of vertical, mother-to-offspring trans-
cultural context. Presumably, at some previous time, all the cases mission than horizontal transmission between communities.
of clearly beneficial cultural variants in the wild, such as nut-
cracking and other tool use, did not exist; so, where they are Convergent Evolution. The Darwinian algorithm delivers some
customary, their common use is likely to have arisen through the similar organic evolutionary outcomes in different places, despite
operation of this algorithm. different foundations. Cultural convergences of this kind appear
to occur at different layers of relatedness among apes. An ex-
Adaptation. The growth of boldness in the studies discussed ample within the same species is hand-clasp grooming, which has
above (43, 86) indicates culturally evolved instances of adapta- emerged and spread in some chimpanzee communities in the
tion, although the adaptive payoffs were likely only mild. In the wild, but not in others (20, 21, 73), as well as in an African
wild there is evidence that a more crucial level of adaptiveness sanctuary (35) and in groups in the United States (37). This
has been delivered. Chimpanzees in Bossou, Guinea, were shown hand-clasp grooming is not simply an individual invention be-
to be reliant on two forms of technology, in particular nut- cause within-group spread of the behavior has been documented,
cracking and pestle-pounding (a means of extracting nutritious indicating social transmission. Other convergences span different
pulp from the apex of palm trees) during the dry season when ape genera, such as use of fly swats and leaf napkins by both
fruit became scarce (87); these cultural variants allow these apes chimpanzees and orangutans; again, these behaviors are neither
to inhabit otherwise inadequate habitats. How often culturally species instincts nor individually learned, because they are ha-
inherited technology is this critical remains difficult to judge at bitual at some locations but are absent in the same species at
present, but many forms of tool use allow chimpanzees and other locations. Finally there are convergences between apes and
orangutans to gain otherwise unavailable foodstuffs. other primates, e.g., the use of stones as hammers to break open
Such adaptations concern the local physical environment. hard-cased food by distantly related long-tailed macaques (92)
Other adaptations may be societal. In a community of chimpan- and capuchins (3, 93).
zees that customarily practices hand-clasp grooming, it may be
adaptive to learn this behavior from those already using it; and Change of Function. Change of function is perhaps the category
where a particular courtship gambit such as leaf-clipping has be- for which we are most limited by lack of historical records. In
come common, it will likely be beneficial to adopt this behavior as humans, historical records suggest that, just as morphology can
an action already recognized by one’s potential mating partner. evolve to serve a new function (arms becoming wings, for ex-
ample), cultural elements may evolve new functions different
Accumulation of Modifications. Human cultural modifications ac- from their original one (10). A candidate in great apes is that in
cumulate in an elaborate fashion that has no match in other chimpanzees leaf-clipping (noisily shredding leaves with the
animals and display the most striking analogies with the richness teeth) is reported as a courtship bid in some communities but is
of the evolved forms of the living world (9, 10, 13, 16, 83, 84). used for other functions, such as play, in others (47); such vari-
Many authors assert that we are the only species to exhibit cu- ations suggest that some of these alternatives may have evolved
mulative culture (8, 9, 49, 56), but I suggest this view may be from each other or from a common ancestral function. However,
premature. For example, chimpanzees in Goualougo use a stout it is possible that the lack of evidence in this category is explained
stick to make a deep tunnel to reach subterranean termite nests by the lack of evidence concerning limited cultural cumulation,
and then use long stems to fish down the tunnels, first creating a noted above.
distinct brush tip effective for fishing by stripping the stem ends
through their teeth (72). They do so in a context in which the Culture Extends Biology into New Realms of Evolution
effective procedure is highly opaque, so it is difficult to see how The above discussion focuses on how cultural evolution may
the technique could have developed other than by a series of match the template for genetically based Darwinian evolution,
cumulative steps beginning with the more transparent context of but cultural transmission by social learning also extends the
fishing near the surface. Boesch (47) describes several other scope of biological systems by incorporating additional dimen-
candidates for cumulative cultural evolution in chimpanzees. sions of inheritance and evolution. Some of these dimensions
Direct evidence for the origins of such routines is lost in the past, have long been recognized in the literature concerning human

7794 | www.pnas.org/cgi/doi/10.1073/pnas.1620733114 Whiten


cultural evolution, including the fact that, in addition to intergenerational transmission shared with genetic inheritance, cultural transmission can be horizontal (within or between groups, and extending to nonkin) or oblique (with learning from nonrelatives in the prior generation) (15). Above I have reviewed some of the evidence for learning from parents, typically the mother, in apes (38–41, 45). Horizontal and oblique transmissions are commonly demonstrated in diffusion experiments (63–69) as well as in observational studies in the wild (42–44) and in sanctuaries (34). Such transmission makes cultural learning a powerful adaptive process, as does the fact that, because it hinges on neural rather than genetic changes, it can act very much faster; some important things can be learned observationally in a matter of minutes (94), although the acquisition of complex skills may require a more extended observational apprenticeship (32, 39).
Additionally, although adaptive information is inherited genetically in a package at conception (even if activation is later adaptively contingent on environmental inputs), the adaptive information that feeds into social learning can be temporally distributed in at least two major ways. First, cultural transmission can be Lamarckian-like, with the adaptive features acquired through one individual’s lifetime passed on to those who learn from them. Second, a learner can build up complex skills, such as some forms of ape tool use, progressively by repeated cycling through a process of observe, practice, observe again, and practice again. This pattern can be thought of as a spiral or helical process of learning in which cycles of observation and practice allow the learner to assimilate more in later observations than was possible in the earlier, more naive stages (Fig. 3) (32, 39).
Social learning also may be selective in the assimilation of information, variously referred to as “directed social learning” (95), “biased transmission” (15), or “social-learning strategies” (96), which can in principle shape adaptation and consequent evolutionary change, with no clear counterparts in the gene-based processes.
Evidence in great apes has been adduced for a number of the potential learning rules these analyses highlight (97). Evidence for a copy-the-majority rule, suggested by the apparent conformity of chimpanzees in diffusion experiments (68), came in further experiments showing that both children and chimpanzees would copy the choices of three other conspecifics rather than those of a single individual repeating the same act three times (98). Orangutans did not show this selectivity (98), possibly reflecting their less community-based social life. Evidence for recognizing and copying more successful or productive options came from further experiments with chimpanzees (99). It has been suggested that preferentially copying individuals of high rank could serve this function also, and two studies have shown chimpanzees preferring to copy a high-ranked rather than a lower-ranked individual (100, 101). Finally, a tendency to learn from kin is shown by the studies of peering reviewed earlier, which showed extensive learning from the mother during apes’ extended preweaning period (39, 41). After weaning, this behavior widened to include peering at the activities of others, a plausibly adaptive shift from initially learning basic information from parents and then later targeting others to learn more specific skills, a trend identified in both observational (102) and experimental (103, 104) studies of human children. However, we still have only limited understanding of when and why an ape opts to learn socially (or does not): For example, what determines when immigrants will either conform to local norms (30) or instead transmit their habitual skills to others (44)?

Interactions Between Genetic and Cultural Modes of Inheritance and Evolution
At the broadest level, culture extends biology insofar as some culturally transmitted behaviors are evolutionarily consequential, i.e., they have implications for practitioners’ survival, reproduction, and ultimate inclusive fitness (as opposed to the reproductive success of the cultural items themselves, discussed earlier). Some cultural variants that appear relatively frivolous, such as staring at one’s reflection in water in gorillas (23) or applying an autoerotic tool in orangutans (22), may have less evolutionary significance, but varied forms of tool use by orangutans and chimpanzees appear to be highly functional in gaining access to rich resources such as insect prey, nut kernels, and honey. Indeed, some of these behaviors appear to be vital for chimpanzees to exploit niches that would otherwise exclude them (87). Other culturally transmitted behaviors play functional roles in grooming, social interactions, and sexual courtship.
Another sense in which culturally transmitted behaviors may have been evolutionarily important concerns their effects on organic evolution. Cetacean researchers have proposed that cultural differentiation among whales has led to genetic differences (7, 105). For example, killer whales display eco-types that specialize in hunting alternative prey such as seals or fish using very different techniques, and different clans exhibit other behavioral differences in their songs and migratory/resident patterns, despite often being sympatric (6, 7, 105, 106). Such effects are suggested to have driven other morphological and genetic differentiation, ultimately leading to incipient speciation, because it becomes difficult for a member of one culture to enter another and successfully manage the different foraging and courtship requirements of that culture. This causal pathway would be an instance of “behavioral drive” (107–109), in which plasticity in behavior allows a species to exploit or create a new niche, in this case a culturally dependent one (e.g., hunting fish versus seal, a cultural drive). This niche in turn may create selection pressures acting on organic evolution, with effects such as the evolution of more robust jaws in the seal-hunters (6). Parallel hypotheses have been developed in the case of birdsong dialects driving speciation (110, 111).
Dramatically different specialisms such as seen in killer whales are not apparent among great apes, although the extent to which similar processes are at work, e.g., in contrasts between nut-cracking communities of chimpanzees and the nearest neighbors that do not crack, would repay attention. However, one principal effect of complex culture on organic evolution in apes has been proposed concerning encephalization and the cognitive sophistication it can provide: the cultural intelligence hypothesis.

Fig. 3. Helical curriculum model of skill development (after ref. 32). Over repeated cycles of observation-of-expert and practice, the social learner is able to assimilate more information from the expert and gradually improve his/her skill level. See the text for more explanation.
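
The observe-practice cycle of the helical model (Fig. 3) can be illustrated with a toy numerical sketch. It is not taken from ref. 32: the function name `helical_learning`, the parameters `base_uptake` and `practice_gain`, and the linear rule linking current skill to how much an observation yields are all assumptions chosen only to show the qualitative idea that a more skilled learner assimilates more from each later observation.

```python
def helical_learning(cycles, expert_skill=1.0, base_uptake=0.1, practice_gain=0.5):
    """Toy observe-then-practice loop (hypothetical parameters, illustration only).

    Returns a list of (assimilated, skill) pairs, one per cycle.
    """
    skill = 0.0
    trace = []
    for _ in range(cycles):
        # What the expert can still show the learner.
        gap = expert_skill - skill
        # Assumption: the fraction of the gap a learner can take in from one
        # observation rises linearly with the learner's current skill.
        uptake = base_uptake + (1.0 - base_uptake) * (skill / expert_skill)
        assimilated = uptake * gap
        # Assumption: practice consolidates a fixed fraction of what was observed.
        skill += practice_gain * assimilated
        trace.append((assimilated, skill))
    return trace


if __name__ == "__main__":
    for cycle, (assimilated, skill) in enumerate(helical_learning(8), start=1):
        print(f"cycle {cycle}: assimilated={assimilated:.3f} skill={skill:.3f}")
```

Under these assumed settings the amount assimilated rises over the first cycles, because the learner becomes better equipped to read the expert's actions, and then declines as little remains to be learned, while skill climbs steadily toward the expert's level without reaching it.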

The Cultural Intelligence Hypothesis
In accord with a previously advanced social (or Machiavellian) intelligence hypothesis, relative brain size in different primate species was found to be predicted by the typical size of their social group and the concomitant demands on social cognition (59, 112). Great apes do not fit this pattern, showing high relative and absolute brain sizes although gorillas and orangutans do not live in large communities. However, because all appear to display relatively complex cultures, the cultural intelligence hypothesis suggests that this complexity has selected for encephalization, either in a culture-first or an entwined culture–gene–brain coevolution scenario (112–115). One side of this proposition may be glossed as “culture makes you smart,” as is self-evident in the human case (9), insofar as present-day humans are smarter than those a century earlier by virtue of the cumulative cultural achievements from which they benefit. On a more modest scale, the same is proposed for the cultural endowment of great apes. The converse side of the proposal is that there is selection on the socio-cognitive capacities necessary to assimilate and store all the potential cultural repertoire available. In turn, it has been suggested that there will be correlated selection on technical and general intelligence, so as to benefit from the cultural input, as in intelligent tool-use, for example (114). One recently offered test of such ideas showed that when tested on the level playing field of zoo contexts, Sumatran orangutans scored higher on general intelligence than their Bornean cousins, as predicted by the more elaborate cultural repertoires of the Sumatran populations in the wild; moreover Sumatrans have brains that are 2–10% larger (116).

Summary and Conclusions
Research, particularly in the last two decades or so, has shown that a second inheritance system of social learning is widespread among animals, extending to all main classes of vertebrate and also to insects (1, 2). Apes merit a special focus, insofar as they have been subjected to an unmatched diversity and volume of observational and experimental studies by multiple research teams, whose work has revealed what appear to be the richest nonhuman cultural repertoires identified to date (although some cetaceans, e.g., killer whales, may show greater cultural differentiation). This article has attempted to indicate the scope of ape culture research and the key points of its discoveries, particularly with respect to the theme of the present issue: how these cultural phenomena may extend biology and its core evolutionary theory in particular. I have argued that the evidence supports the conclusion that the nature of social learning and its consequences in cultural transmission create new forms of evolution. These new forms echo well the established core principles of organic evolution but also go beyond them in a number of fundamental ways, such as horizontal transmission and inheritance of acquired characteristics, thereby extending the scope of evolutionary processes we must now entertain. Moreover the primary genetically based forms of evolution shaped and are also shaped by the consequences of this second inheritance system in complex ways we are only now starting to uncover.

ACKNOWLEDGMENTS. I thank Sarah Davis, Bill McGrew, Stephanie Musgrave, Crickette Sanz, and Stuart Watson for comments on early versions of this paper.

1. Whiten A (2017) A second inheritance system: The extension of biology through culture. Interface Focus, in press.
2. Whiten A, Ayala FJ, Feldman MW, Laland KN (2017) The extension of biology through culture. Proc Natl Acad Sci USA 114:7775–7781.
3. Fragaszy DM, et al. (2017) Synchronized practice helps bearded capuchin monkeys learn to extend attention while learning a tradition. Proc Natl Acad Sci USA 114:7798–7805.
4. Perry SE, Barrett BJ, Godoy I (2017) Older, sociable capuchins (Cebus capucinus) invent more social behaviors, but younger monkeys innovate more in other contexts. Proc Natl Acad Sci USA 114:7806–7813.
5. Hohmann G, Fruth B (2003) Culture in Bonobos? Between species and within species variation in behavior. Curr Anthropol 44:563–571.
6. Whitehead H, Rendell L (2015) The Cultural Lives of Whales and Dolphins (Univ of Chicago Press, Chicago).
7. Whitehead H (2017) Gene–culture coevolution in whales and dolphins. Proc Natl Acad Sci USA 114:7814–7821.
8. Pagel M (2012) Wired For Culture: The Natural History of Human Cooperation (Allen Lane, London).
9. Henrich J (2015) The Secret of Our Success: How Culture Is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter (Princeton Univ Press, Princeton, NJ).
10. Creanza N, Kolodny O, Feldman MW (2017) Cultural evolutionary theory: How culture evolves and why it matters. Proc Natl Acad Sci USA 114:7782–7789.
11. Legare CH (2017) Cumulative cultural learning: Development and diversity. Proc Natl Acad Sci USA 114:7877–7883.
12. Darwin C (1859) On the Origin of Species by Natural Selection (Murray, London).
13. Mesoudi A, Whiten A, Laland KN (2004) Perspective: Is human cultural evolution Darwinian? Evidence reviewed from the perspective of the Origin of Species. Evolution 58:1–11.
14. Whiten A (2005) The second inheritance system of chimpanzees and humans. Nature 437:52–55.
15. Boyd R, Richerson P (1985) Culture and the Evolutionary Process (Univ of Chicago Press, Chicago).
16. Mesoudi A (2017) Pursuing Darwin’s curious parallel: Prospects for a science of cultural evolution. Proc Natl Acad Sci USA 114:7853–7860.
17. Goodall J (1986) The Chimpanzees of Gombe: Patterns of Behavior (Harvard Univ Press, Cambridge, MA).
18. McGrew WC (1992) Chimpanzee Material Culture: Implications for Human Evolution (Cambridge Univ Press, Cambridge, UK).
19. Boesch C, Tomasello M (1998) Chimpanzee and human cultures. Curr Anthropol 39:591–614.
20. Whiten A, et al. (1999) Cultures in chimpanzees. Nature 399:682–685.
21. Whiten A, et al. (2001) Charting cultural variation in chimpanzees. Behaviour 138:1489–1525.
22. van Schaik CP, et al. (2003) Orangutan cultures and the evolution of material culture. Science 299:102–105.
23. Robbins MM, et al. (2016) Behavioural variation in gorillas: Evidence of potential cultural traits. PLoS One 11:e0160483.
24. Whiten A (2012) Social learning, traditions and culture. The Evolution of Primate Societies, eds Mitani J, Call J, Kappeler P, Palombit R, Silk J (Univ of Chicago Press, Chicago), pp 681–699.
25. van Schaik CP, et al. (2009) Orangutan cultures re-visited. Orangutans: Geographic Variation in Behavioral Ecology and Conservation, eds Wich SA, Atmoko SSU, Setia TM, van Schaik CP (Oxford Univ Press, Oxford, UK), pp 299–309.
26. Laland KN, Janik VM (2006) The animal cultures debate. Trends Ecol Evol 21:542–547.
27. Krützen M, Willems EP, van Schaik CP (2011) Culture and geographic variation in orangutan behavior. Curr Biol 21:1808–1812.
28. Schöning C, Humle T, Möbius Y, McGrew WC (2008) The nature of culture: Technological variation in chimpanzee predation on army ants revisited. J Hum Evol 55:48–59.
29. Möbius Y, Boesch C, Koops K, Matsuzawa T, Humle T (2008) Cultural differences in army ant predation by West African chimpanzees? A comparative study of microecological variables. Anim Behav 76:37–45.
30. Luncz LV, Boesch C (2014) Tradition over trend: Neighboring chimpanzee communities maintain differences in cultural behavior despite frequent immigration of adult females. Am J Primatol 76:649–657.
31. Marshall-Pescini S, Whiten A (2008) Social learning of nut-cracking behavior in East African sanctuary-living chimpanzees (Pan troglodytes schweinfurthii). J Comp Psychol 122:186–194.
32. Whiten A (2015) Experimental studies illuminate the cultural transmission of percussive technologies in Homo and Pan. Philos Trans R Soc Lond B Biol Sci 370:20140359.
33. Santorelli CJ, et al. (2011) Traditions in spider monkeys are biased towards the social domain. PLoS One 6:e16863.
34. van Leeuwen EJC, Cronin KA, Haun DB (2014) A group-specific arbitrary tradition in chimpanzees (Pan troglodytes). Anim Cogn 17:1421–1425.
35. van Leeuwen EJC, Cronin KA, Haun DB, Mundry R, Bodamer MD (2012) Neighbouring chimpanzee communities show different preferences in social grooming behaviour. Proc Biol Sci 279:4362–4367.
36. Rawlings B, Davila-Ross M, Boysen ST (2014) Semi-wild chimpanzees open hard-shelled fruits differently across communities. Anim Cogn 17:891–899.
37. Bonnie KE, de Waal FBM (2006) Affiliation promotes the transmission of a social custom: Handclasp grooming among captive chimpanzees. Primates 47:27–34.
38. Lonsdorf EV, Eberly LE, Pusey AE (2004) Sex differences in learning in chimpanzees. Nature 428:715–716.
39. Schuppli C, et al. (2016) Observational learning and socially induced practice of routine skills in immature orangutans. Anim Behav 119:87–98.
40. Jaeggi AV, et al. (2010) Social learning of diet and foraging skills by wild immature Bornean orangutans: Implications for culture. Am J Primatol 72:62–71.
41. Jaeggi AV, van Noordwijk MA, van Schaik CP (2008) Begging for information: Mother-offspring food sharing among wild Bornean orangutans. Am J Primatol 70:533–541.
42. Hobaiter C, Poisot T, Zuberbühler K, Hoppitt W, Gruber T (2014) Social network analysis shows direct evidence for social transmission of tool use in wild chimpanzees. PLoS Biol 12:e1001960.

43. Samuni L, Mundry R, Terkel J, Zuberbühler K, Hobaiter C (2014) Socially learned habituation to human observers in wild chimpanzees. Anim Cogn 17:997–1005.
44. O’Malley RC, Wallauer W, Murray CM, Goodall J (2012) The appearance and spread of ant fishing in the Kasekela chimpanzees of Gombe: A possible case of intercommunity cultural transmission. Curr Anthropol 53:650–663.
45. Musgrave S, Morgan D, Lonsdorf E, Mundry R, Sanz C (2016) Tool transfers are a form of teaching among chimpanzees. Sci Rep 6:34783.
46. Thornton A, McAuliffe K (2006) Teaching in wild meerkats. Science 313:227–229.
47. Boesch C (2012) Wild Cultures: A Comparison Between Chimpanzee and Human Cultures (Cambridge Univ Press, Cambridge, UK).
48. Hiraiwa-Hasegawa M (1990) A note on the ontogeny of feeding. The Chimpanzees of the Mahale Mountains: Sexual and Life History Strategies, ed Nishida T (Univ of Tokyo Press, Tokyo), pp 277–283.
49. Tomasello M, Call J, eds (1997) Primate Cognition (Oxford Univ Press, Oxford, UK).
50. Whiten A, Horner V, Litchfield CA, Marshall-Pescini S (2004) How do apes ape? Learn Behav 32:36–52.
51. Galef BG, Whiten A (2017) The comparative psychology of social learning. APA Handbook of Comparative Psychology, eds Call J, Burghardt G, Pepperberg I, Snowdon C, Zentall T (APA, Washington).
52. Whiten A, Custance DM, Gomez J-C, Teixidor P, Bard KA (1996) Imitative learning of artificial fruit processing in children (Homo sapiens) and chimpanzees (Pan troglodytes). J Comp Psychol 110:3–14.
53. Stoinski TS, Wrate JL, Ure N, Whiten A (2001) Imitative learning by captive western lowland gorillas (Gorilla gorilla gorilla) in a simulated food-processing task. J Comp Psychol 115:272–281.
54. Stoinski TS, Whiten A (2003) Social learning by orangutans (Pongo abelii and Pongo pygmaeus) in a simulated food-processing task. J Comp Psychol 117:272–282.
55. Whiten A, McGuigan N, Marshall-Pescini S, Hopper LM (2009) Emulation, imitation, over-imitation and the scope of culture for child and chimpanzee. Philos Trans R Soc Lond B Biol Sci 364:2417–2428.
56. Tennie C, Call J, Tomasello M (2009) Ratchetting up the ratchet: On the evolution of cumulative culture. Philos Trans R Soc Lond B Biol Sci 364:2405–2415.
57. Subiaul F (2016) What’s special about human imitation? A comparison with enculturated apes. Behav Sci (Basel) 6:13.
58. Whiten A (2017) Social learning and culture in child and chimpanzee. Annu Rev Psychol 68:129–154.
59. Whiten A, van de Waal E (2016) Social learning, culture and the ‘socio-cultural brain’ of human and non-human primates. Neurosci Biobehav Rev, 10.1016/j.neubiorev.2016.12.018.
60. Carvalho S, McGrew W (2010) The origins of the Oldowan: Why chimpanzees are still good models for technological evolution in Africa. Stone Tools and Fossil Bones, ed Domínguez-Rodrigo M (Cambridge Univ Press, Cambridge, UK), pp 201–221.
61. Boesch C, Marchesi P, Marchesi N, Fruth B, Joulian F (1994) Is nutcracking in wild chimpanzees a cultural behaviour? J Hum Evol 26:325–338.
62. McGrew WC, Ham RM, White LJT, Tutin CEG, Fernandez M (1997) Why don’t chimpanzees in Gabon crack nuts? Int J Primatol 18:353–374.
63. Whiten A, Mesoudi A (2008) Review. Establishing an experimental science of culture: Animal social diffusion experiments. Philos Trans R Soc Lond B Biol Sci 363:3477–3488.
64. Whiten A, Caldwell CA, Mesoudi A (2016) Cultural diffusion in humans and other animals. Curr Op Psychol 8:15–21.
65. Horner V, Whiten A, Flynn E, de Waal FBM (2006) Faithful replication of foraging techniques along cultural transmission chains by chimpanzees and children. Proc Natl Acad Sci USA 103:13878–13883.
66. Dindo M, Stoinski T, Whiten A (2011) Observational learning in orangutan cultural transmission chains. Biol Lett 7:181–183.
67. Whiten A, Horner V, de Waal FBM (2005) Conformity to cultural norms of tool use in chimpanzees. Nature 437:737–740.
68. Bonnie KE, Horner V, Whiten A, de Waal FBM (2007) Spread of arbitrary conventions among chimpanzees: A controlled experiment. Proc Biol Sci 274:367–372.
69. Whiten A, et al. (2007) Transmission of multiple traditions within and between chimpanzee groups. Curr Biol 17:1038–1043.
70. Pruetz JD, Bertolani P (2007) Savanna chimpanzees, Pan troglodytes verus, hunt with tools. Curr Biol 17:412–417.
71. Pruetz JD, et al. (2015) New evidence on the tool-assisted hunting exhibited by chimpanzees (Pan troglodytes verus) in a savannah habitat at Fongoli, Sénégal. R Soc Open Sci 2:140507.
72. Sanz C, Call J, Morgan D (2009) Design complexity in termite-fishing tools of chimpanzees (Pan troglodytes). Biol Lett 5:293–296.
73. Nakamura M, Uehara S (2004) Proximate factors of different types of grooming hand-clasp in Mahale chimpanzees: Implications for chimpanzee social customs. Curr Anthropol 45:108–114.
74. Inskipp T (2005) Chimpanzee (Pan troglodytes). World Atlas of Great Apes and Their Conservation, eds Caldecott J, Miles L (Univ of California Press, Berkeley, CA), pp 53–81.
75. Ferriss S (2005) Western gorilla (Gorilla gorilla). World Atlas of Great Apes and Their Conservation, eds Caldecott J, Miles L (Univ of California Press, Berkeley, CA), pp 105–127.
76. McConkey K (2005) Bornean orangutan (Pongo pygmaeus). World Atlas of Great Apes and Their Conservation, eds Caldecott J, Miles L (Univ of California Press, Berkeley, CA), pp 161–183.
78. Bastian ML, Zweifel N, Vogel ER, Wich SA, van Schaik CP (2010) Diet traditions in wild orangutans. Am J Phys Anthropol 143:175–187.
79. Mercader J, et al. (2007) 4,300-year-old chimpanzee sites and the origins of percussive stone technology. Proc Natl Acad Sci USA 104:3043–3048.
80. Dawkins R (1976) The Selfish Gene (Oxford Univ Press, Oxford, UK).
81. Darwin C (1871) The Descent of Man and Selection in Relation to Sex (Murray, London).
82. Maynard-Smith J, Szathmary E (1995) The Major Transitions in Evolution (Freeman, Oxford, UK).
83. Dennett D (1995) Darwin’s Dangerous Idea (Penguin, London).
84. Ridley M (2015) The Evolution of Everything (Fourth Estate, London).
85. Stout D (2011) Stone toolmaking and the evolution of human culture and cognition. Philos Trans R Soc Lond B Biol Sci 366:1050–1059.
86. Menzel EW, Jr, Davenport RK, Rogers CM (1972) Protocultural aspects of chimpanzees’ responsiveness to novel objects. Folia Primatol (Basel) 17:161–170.
87. Yamakoshi G (1998) Dietary responses to fruit scarcity of wild chimpanzees at Bossou, Guinea: Possible implications for ecological importance of tool use. Am J Phys Anthropol 106:283–295.
88. Gray RD, Watts J (2017) Cultural macroevolution matters. Proc Natl Acad Sci USA 114:7846–7852.
89. Langergraber KE, et al. (2011) Genetic and ‘cultural’ similarity in wild chimpanzees. Proc Biol Sci 278:408–416.
90. Kamilar JM, Atkinson QD (2014) Cultural assemblages show nested structure in humans and chimpanzees but not orangutans. Proc Natl Acad Sci USA 111:111–115.
91. Lycett SJ, Collard M, McGrew WC (2009) Cladistic analyses of behavioural variation in wild Pan troglodytes: Exploring the chimpanzee culture hypothesis. J Hum Evol 57:337–349.
92. Gumert MD, Malaivijitnond S (2012) Marine prey processed with stone tools by Burmese long-tailed macaques (Macaca fascicularis aurea) in intertidal habitats. Am J Phys Anthropol 149:447–457.
93. Fragaszy D, Izar P, Visalberghi E, Ottoni EB, de Oliveira MG (2004) Wild capuchin monkeys (Cebus libidinosus) use anvils and stone pounding tools. Am J Primatol 64:359–366.
94. Whiten A (1998) Imitation of the sequential structure of actions by chimpanzees (Pan troglodytes). J Comp Psychol 112:270–281.
95. Coussi-Korbel S, Fragaszy DM (1995) On the relation between social dynamics and social learning. Anim Behav 50:1441–1450.
96. Laland KN (2004) Social learning strategies. Learn Behav 32:4–14.
97. Price EE, Wood LA, Whiten A (2016) Adaptive cultural transmission biases in children and nonhuman primates. Infant Behav Dev, 10.1016/j.infbeh.2016.11.003.
98. Haun DB, Rekers Y, Tomasello M (2012) Majority-biased transmission in chimpanzees and human children, but not orangutans. Curr Biol 22:727–731.
99. Vale GL, Flynn EG, Lambeth SP, Schapiro SJ, Kendal RL (2014) Public information use in chimpanzees (Pan troglodytes) and children (Homo sapiens). J Comp Psychol 128:215–223.
100. Horner V, Proctor D, Bonnie KE, Whiten A, de Waal FBM (2010) Prestige affects cultural learning in chimpanzees. PLoS One 5:e10625.
101. Kendal R, et al. (2015) Chimpanzees copy dominant and knowledgeable individuals: Implications for cultural diversity. Evol Hum Behav 36:65–72.
102. Henrich J, Broesch J (2011) On the nature of cultural transmission networks: Evidence from Fijian villages for adaptive learning biases. Philos Trans R Soc Lond B Biol Sci 366:1139–1148.
103. Harris PL, Corriveau KH (2011) Young children’s selective trust in informants. Philos Trans R Soc Lond B Biol Sci 366:1179–1187.
104. Lucas AJ, et al. (2016) The development of selective copying: Children’s learning from an expert versus their mother. Child Dev, 10.1111/cdev.12711.
105. Carroll EL, et al. (2015) Cultural traditions across a migratory network shape the genetic structure of southern right whales around Australia and New Zealand. Sci Rep 5:16182.
106. Foote AD, et al. (2016) Genome-culture coevolution promotes rapid divergence of killer whale ecotypes. Nat Commun 7:11693.
107. Wyles JS, Kunkel JG, Wilson AC (1983) Birds, behavior, and anatomical evolution. Proc Natl Acad Sci USA 80:4394–4397.
108. Wilson AC (1985) The molecular basis of evolution. Sci Am 253:164–173.
109. Bateson PPG (2004) The active role of behavior in evolution. Biol Philos 19:283–298.
110. Grant BR, Grant PR (2002) Simulating secondary contact in allopatric speciation: An empirical test of premating isolation. Biol J Linn Soc Lond 76:545–556.
111. Grant PR, Grant BR (2002) Adaptive radiation of Darwin’s finches. Am Sci 90:130–139.
112. Whiten A, van Schaik CP (2007) The evolution of animal ‘cultures’ and social intelligence. Philos Trans R Soc Lond B Biol Sci 362:603–620.
113. van Schaik CP, Burkart JM (2011) Social learning and evolution: The cultural intelligence hypothesis. Philos Trans R Soc Lond B Biol Sci 366:1008–1016.
114. Burkart JM, Schubiger MN, van Schaik CP (2016) The evolution of general intelligence. Behav Brain Sci 1–65.
115. Street SE, Navarrete AF, Reader SM, Laland KN (2017) Coevolution of cultural intelligence, extended life history, sociality, and brain size in primates. Proc Natl Acad
77. Nishida T, Wrangham RW, Goodall J, Uehara S (1983) Local differences in plant Sci USA 114:7908–7914.
feeding habits of chimpanzees between the Mahale Mountains and Gombe Na- 116. Forss SIF, Willems E, Call J, van Schaik CP (2016) Cognitive differences between
tional Park, Tanzania. J Hum Evol 12:467–480. orang-utan species: A test of the cultural intelligence hypothesis. Sci Rep 6:30516.

Whiten PNAS | July 25, 2017 | vol. 114 | no. 30 | 7797


Synchronized practice helps bearded capuchin
monkeys learn to extend attention while
learning a tradition
Dorothy M. Fragaszy^a,1, Yonat Eshchar^a,b, Elisabetta Visalberghi^c, Briseida Resende^d, Kellie Laity^a, and Patrícia Izar^d
^aPsychology Department, University of Georgia, Athens, GA 30602; ^bDavidson Institute of Science Education, Weizmann Institute of Science, Rehovot, Israel 7610001; ^cInstitute of Cognitive Sciences and Technologies, National Research Council, Italy, Rome, Italy 00197; and ^dDepartment of Experimental Psychology, University of São Paulo, Butanta, 05508030, SP, Brazil

Edited by Andrew Whiten, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark May 29, 2017
(received for review January 23, 2017)

Culture extends biology in that the setting of development shapes the traditions that individuals learn, and over time, traditions evolve as occasional variations are learned by others. In humans, interactions with others impact the development of cognitive processes, such as sustained attention, that shape how individuals learn as well as what they learn. Thus, learning itself is impacted by culture. Here, we explore how social partners might shape the development of psychological processes impacting learning a tradition. We studied bearded capuchin monkeys learning a traditional tool-using skill, cracking nuts using stone hammers. Young monkeys practice components of cracking nuts with stones for years before achieving proficiency. We examined the time course of young monkeys’ activity with nuts before, during, and following others’ cracking nuts. Results demonstrate that the onset of others’ cracking nuts immediately prompts young monkeys to start handling and percussing nuts, and they continue these activities while others are cracking. When others stop cracking nuts, young monkeys sustain the uncommon actions of percussing and striking nuts for shorter periods than the more common actions of handling nuts. We conclude that nut-cracking by adults can promote the development of sustained attention for the critical but less common actions that young monkeys must practice to learn this traditional skill. This work suggests that in nonhuman species, as in humans, socially specified settings of development impact learning processes as well as learning outcomes. Nonhumans, like humans, may be culturally variable learners.

primates | attention | development | learning | tool use

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.

Author contributions: D.M.F., Y.E., E.V., B.R., and P.I. designed research; D.M.F., Y.E., and K.L. performed research; Y.E. and K.L. analyzed data; and D.M.F., E.V., and P.I. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. A.W. is a guest editor invited by the Editorial Board.

^1To whom correspondence should be addressed. Email: doree@uga.edu.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1621071114/-/DCSupplemental.

7798–7805 | PNAS | July 25, 2017 | vol. 114 | no. 30 | www.pnas.org/cgi/doi/10.1073/pnas.1621071114

Traditions are behaviors shared among members of a group that are reliably learned by new individuals, in part, with social support (1). The concept of tradition highlights behavioral consistency over time and across individuals. Here, we emphasize the origin of traditions in learning. Learned behavior is one manifestation of developmental plasticity, and generation of novel behavior is another. Novel behavioral variants may be learned by others, thus modifying existing traditions or leading to new ones (2). In this way, developmental plasticity enables heritable modification of behavior through learning, with the result that culture may extend biology (3, 4). Traditions appear to be widespread in the animal kingdom (5–7), leading theorists to propose that cultural evolution, rather than being restricted to recent human history, has general evolutionary significance (8, 9). Apes (6, 10) and monkeys provide several examples of traditions. Among monkeys, for example, vervet monkeys learn their mothers’ food-processing techniques (11) and white-faced capuchin monkeys learn odd social games, such as poking a finger in another monkey’s eye, that become established intermittently in groups (2, 12).

Given the essential role of learning in traditions, cognitive processes associated with learning that show developmental plasticity themselves should be incorporated into our understanding of how culture can extend biology (13–15). However, cognitive processes associated with learning are not yet well integrated into theories of cultural evolution and niche construction (16). Here, we illustrate how growing up in a group with a prevailing tradition of using tools in foraging could affect cognitive development in young monkeys in ways that support their learning this traditional skill. This work opens a bridge between the learning sciences and the field of cultural evolution.

Social Experience Influences the Development of Attention

In humans, social experiences are implicated in the development of attention, memory, and individual learning styles (16–18). Cultural influences on long-term memory development include, for example, the development of particular ways of chunking and rehearsing information to be remembered, such as the construction of “memory palaces” used by the ancient Greeks and the cultures that succeeded them (19) and the use of written lists and notes in the present day. Culture also impacts the development of working memory, which incorporates structures and processes used for the temporary storage and organization of information about events recently heard or seen or about activities recently performed (20, 21). Working memory is dependent on sustained attention, and therefore sensitive to attentional disruption. It is intimately related to motor processes, and limited in span to perhaps three to four “chunks” of information at one time. In humans, working memory typically lasts from a few seconds to more than 1 min depending on emotional salience; whether the information to be remembered is about events, actions, or space; and other factors (20).

It is thought that humans share with other primates general attentional and working memory capacity, although humans develop skills, many of which are cultural, to control attention, and thus to extend working memory (20, 22). Consider the deep attention humans develop through meditative practice and in the course of reading and writing, all of which are explicitly cultural skills. Cultural variation in learned attention and skilled actions in humans suggests we should seek similar variations in nonhuman primates and other taxa that, like humans, live in socially constructed


environments. To the extent that the development of sustained attention and working memory is biased by the actions of social partners, the process and outcome of social learning are likely to vary across groups and across species. Other features of socially constructed niches, such as the creation of artifacts, can also promote sustained attention by providing opportunities and reminders to practice particular actions, and thus add another source of cultural or taxonomic variation in socially biased learning (23). Small shifts in attentional processes and perceptual biases can shift the trajectory of learning with far-reaching consequences, as has been proposed for humans learning language (24).

Developing Sustained Attention for Manual Actions with Objects

Sustained attention and working memory develop throughout childhood in humans, in concert with neural development and social experience (25, 26). Young children extend the duration of sustained attention to an object while an older person shows attention to that object, suggesting how social interactions can support the development of sustained attention in the particular context of handling objects (27). The visual salience to humans of others’ hand movements is evident by the second year of life, when toddlers shift their visual attention toward another person handling an object from predominantly toward the face to predominantly toward the hands (28).

There are sound reasons to propose that nonhuman primates are also biased to attend to manual actions. Some species of nonhuman primates in captivity have been shown to attend to humans’ manual actions with objects (29–31); they are at least as likely to attend to manual activity of familiar conspecifics, if not more so. Manual activity is salient to all primates: Using the hands with visual guidance to collect food and bring it to the mouth is a primitive characteristic of primates, and all primates use their hands to explore objects and surfaces, as well as to contact others during social interactions, such as grooming and play (32). This fundamental feature of primate behavior is associated with a host of neuroanatomical, perceptuomotor, and cognitive attributes, including strong visual and proprioceptive salience to movements of their own hands and of the hands of others (31).

Through the observer’s bias to attend to others’ manual actions with objects, others’ actions can support the development of attentional control by young monkeys during particular manual activities. In this way, social partners can support young nonhuman primates that otherwise normally experience brief sustained attention to others and to their own activities, learning manual skills that require longer sustained attention. Certainly using tools qualifies as challenging enough to benefit from support of this kind. Acknowledging the cognitive dimension of the constructed niche in nonhuman taxa will strengthen cultural evolutionary theory. Social influence on the development of sustained attention is an appropriate early target for research in this direction. Even if the perceptual biases in primates favoring attention to actions of others are small, they could nevertheless powerfully affect learning trajectories, particularly when magnified by shifts in attention and memory (24).

We hypothesize that nonhuman primates and humans share strong susceptibility to tuning (i.e., extending, strengthening) attention and memory about manual actions and about objects via interest in others’ manual activity. However, nonhuman primates face a challenge in sustaining attention to actions with their hands that humans typically do not, or do not face to the same extent, which could be particularly important when learning a skill involving handling objects. Nonhuman primates’ attention to their own activity is typically disrupted every few seconds to scan the surroundings briefly (surveying the surroundings in this way is termed “vigilance” in the animal behavior literature) (33, 34). Vigilance, which functions to inform the perceiver about predators, conspecifics, and other relevant dynamic features in the environment, interferes with sustained attention to one’s own or another’s activity.

Evaluating Attention to Others’ Actions

Duration of gaze is a common measure of sustained attention to others’ actions in experimental settings, but this measure is not useful when the activity of interest is auditory or when the experimenter cannot see where the subject is looking. Temporally or spatially synchronized activity between two individuals, commonly measured in studies with nonhuman animals under the general heading of “social influence,” is an alternative measure of sustained attention in such cases. We can also measure the decline of synchronized activity once the “demonstrator” stops performing the activity. We propose that the onset of synchronized activity is an indirect measure of the onset of sustained attention and the decline of synchronized activity (i.e., a return to baseline rates of performing a particular behavior) is an indirect measure of the disruption of attention and the subsequent loss of the contents of working memory (in other words, forgetting).

Theories of social learning predict that an individual is aided in learning by attending to the actions or products of actions of another, but are silent about how quickly attention is drawn to the other’s actions or products, how long the influence lasts, or how the influence declines over time (35). Not surprisingly, contemporary studies of social learning have not addressed temporal properties of social influence. To our knowledge, Hoppitt et al. (36) provide the only published report to quantify the temporal decline of social influence on behavior in a learning context. In a field experiment, wild meerkats (Suricata suricatta) were presented with baited boxes that could be opened in two different ways. Some meerkats (demonstrators) were trained to open the boxes using a particular method. Subsequently, the boxes were presented to all members of the group, and the naive meerkats’ interactions with the boxes, as well as when they watched other meerkats opening the boxes, were recorded. The researchers found that individuals were more likely to interact with a box immediately after observing another meerkat interacting with it; the half-life of the effect was 20 s. Young meerkats spend much time with adults in the period when they are learning to forage on the hidden and dangerous scorpions that these animals capture and eat (37). Adults’ influence on young individuals in this period is thought to be necessary for meerkats to master their challenging foraging style (38).

We consider the relation in young monkeys between social influence on activity and attentional processes associated with learning. The activity in question relates to using stone hammers to crack nuts, seeds, or other encased foods, a technical tradition in several populations of wild bearded capuchin monkeys (Sapajus libidinosus) (39–42) (Fig. 1). Young capuchin monkeys, like young meerkats, master finding and feeding on hidden and sometimes noxious prey, and like meerkats, they are interested in and affected by others’ actions with objects (43). Thus, they are good candidates for studies of the temporal dynamics of social influence on behavior with objects and on the cognitive processes associated with learning traditional tool-using skills.

Temporal Dynamics of Social Influence as a Window on Sustained Attention

Our objective was to examine temporal dynamics of social influence on young monkeys’ behavior in a situation in which the young monkeys were practicing component actions of a traditional manual skill. Temporal dynamics of young monkeys’ behavioral coordination with others in this context reflect sustained attention to their own and others’ actions. The skill in question is cracking palm nuts using stone hammers, a precursor to the situation facing humans knapping stone (44). Social support in the form of instruction and demonstration aids in the acquisition of knapping, but these actions are not sufficient by themselves for people to learn to knap stone (45). Providing repeated

occasions for practice and self-discovery of movement solutions is a crucial dimension of social support for learning this complex perceptuomotor skill and human traditional skills more generally (13, 46).

Fig. 1. An adult bearded capuchin monkey has cracked a palm nut using a stone hammer on a log anvil and is removing and eating pieces of the kernel. A young monkey that cannot crack a nut itself watches closely. Image used with permission from Luca Antonio Marino, Roma Tre University (Rome, Italy).

We measured the temporal rise and fall of social influence on wild young bearded capuchin monkeys (S. libidinosus) in the company of adults that were cracking palm nuts by striking them repetitively with stone hammers. These monkeys start to interact with nuts and stones in the first year of life. They handle nuts, percuss nuts directly against a hard surface (hereafter, percuss), and strike nuts or nut shells with stones (hereafter, strike) for several years before they are able to crack palm nuts themselves (47, 48). Thus, young monkeys exhibit remarkable persistence in a foraging activity that they cannot perform effectively. We know that others’ cracking nuts is partially responsible for the young monkeys’ persistence. In one recent study, while other group members were cracking and eating nuts, monkeys 6 y of age or younger were threefold more likely to be near an anvil, quadrupled their rate of interaction with nuts, and doubled their rate of percussing and striking compared with times when no monkey in the group was cracking nuts. Interactions with objects other than nuts showed the opposite pattern (48).

The data for this report are taken from monkeys 6 y of age or younger belonging to a wild, habituated group of bearded capuchins observed in five periods over 2 y (Table 1). Members of this group of monkeys routinely crack nuts using stone hammers at many anvils scattered across their home range (49–51). In this study, one observer continuously recorded the focal young monkey’s behavior. A second observer concurrently recorded at intervals of 1 min the distance from each neighbor (within a 10-m zone) to the focal monkey, the identity and behavior of each neighbor, and the occurrence of cracking nuts (i.e., striking a nut with a stone, producing a sharp cracking noise) by any monkey in the group. The method allowed us to analyze the focal monkey’s behavior with nuts and stones and its presence near anvils in relation to the start, continuation, and end of cracking by other members of the group. Data collected while the group was in a frequently visited area with abundant anvils, hammers, and cracked shells, as detailed below, indicated that the temporal pattern of others’ influence on young monkeys’ activity with nuts and stones and their presence near anvils was associated specifically with the others’ activity with nuts (i.e., synchronization was not a byproduct of traveling in a cohesive group). Results are reported for n = 16 monkeys, unless otherwise noted.

Table 1. Subjects’ date of birth, sex, and body mass at each sampling time point

Name         Date of birth (mm/dd/yyyy)   Sex   Body mass in 2011, kg   Body mass in 2012, kg   Body mass in 2013, kg
Donzela      01/13/2013                   F     —                       —                       0.4
Patricia     01/11/2013                   F     —                       —                       0.5
Titia        01/03/2013                   F     —                       —                       0.7
Divina       11/07/2012                   F     —                       —                       —
Cachaça      03/15/2012                   M     —                       0.4                     1.1
Thais        02/01/2011                   F     —                       1.1                     1.3
Presente     02/15/2011*                  M     —                       1.0                     1.5
Chani        12/15/2010                   F     —                       1.0                     1.2
Coco         07/14/2009                   M     1.1                     1.4                     1.7
Paçoca       01/01/2009                   F     1.2                     1.3                     1.6
Pamonha      01/01/2009                   F     1.2                     1.4                     1.6
Doree        11/09/2007                   F     1.4                     1.6                     1.8
Pati         11/02/2007                   M     1.7                     2.1                     2.5
Cangaceiro   09/20/2007                   M     1.8                     2.1                     2.4
Catu         02/05/2007                   M     1.8                     2.1                     2.5
Tomate       12/01/2006                   M     1.8                     2.0                     2.3

F, female; M, male.
*Estimate.

Results

Manipulation of Nuts. Young monkeys manipulated nuts at the highest rate when others were cracking nuts (i.e., striking nuts with a stone) (median = 8.3 acts per 10 min) and at the lowest rate (median = 0.9 acts per 10 min) during periods when no others were cracking nuts, and had not been cracking nuts for at least 8 min. The difference between these rates was significant (estimate = 1.98, P < 0.0001). The onset of the effect of others’ cracking nuts was quick: Compared with the minute before others began to crack, the median probability that a young monkey would manipulate a nut doubled in the first minute after others began to crack, and remained doubled or more for at least 5 min when others continued to crack (Fig. 2). Movie S1 provides a video-clip of a young monkey handling a nut while and after another monkey is cracking a nut. During the 7 min after the others stopped cracking, the rate of manipulation of nuts declined exponentially (in $A_t = A_0 e^{-\beta t}$; estimates: $A_0$ = 9.96, P < 0.0001; β = 0.325, P = 0.0013; Fig. 3), where e is the base of the natural logarithm, β is the rate by which the dependent variable declines with time, t is the time since cracking in the group stopped, and $A_t$ (the dependent variable) is the rate or percentage



of time. The output from the model is $A_0$ (the strength of the effect on the dependent variable) and β. The half-life of the effect was 2.1 min. The rate of nut manipulation during the first to fifth minutes after other monkeys stopped cracking nuts was significantly higher than the rate during periods 8 min or longer after others stopped cracking nuts (i.e., the baseline rate) (P < 0.0001 for minutes 1–4 after other monkeys stopped cracking nuts, P = 0.0146 for minute 5). The rate of manipulation of nuts during the sixth and seventh minutes after others stopped cracking nuts and the rate during periods 8 min or longer after others stopped cracking nuts did not differ significantly.

Fig. 2. Probability that a young monkey (n = 11) manipulated a nut in the 1 min before the onset of striking a nut performed by another monkey (No cracking) and during each of the 5 min following the onset of striking a nut by another monkey. The boxes display the median and interquartile range, and whiskers indicate minimum and maximum values. The solid line within the box depicts the median. Circles indicate values of outliers.

Fig. 3. Rate per 10 min of manipulation of nuts by young monkeys (n = 16) when one or more other monkeys struck nuts (Cracking present) during each of the 7 min after others stopped striking a nut and in periods 8 min or longer after others stopped striking a nut (8 min or over). The exponential curve generated by model fitting is overlaid. The boxes display the median and interquartile range, and whiskers indicate minimum and maximum values. The solid line within the box depicts the median. Circles indicate values of outliers. The half-life of the decline occurred at 2.1 min.

The temporal pattern for young monkeys’ manipulation of other objects besides nuts was the opposite of their actions with nuts. Young monkeys manipulated other objects at the lowest rate when others were cracking nuts (median = 12.8 per 10 min) and at the highest rate during periods 8 min or longer after others stopped cracking nuts (median = 16.7 per 10 min). The difference between these rates was significant (estimate = 1.29, P < 0.0001). From the time when others were cracking nuts through 7 min after they stopped, juveniles’ rate of manipulation of other objects increased exponentially (in $A_t = A_0 e^{-\beta t}$; estimates: $A_0$ = 14.2, P < 0.0001; β = −0.03, P = 0.015), although the rate of manipulation of other objects in each of the first 7 min after others stopped cracking nuts did not differ from the rate in baseline periods (i.e., 8 min or longer after other monkeys stopped cracking nuts).

Percussing Nuts and Striking Nuts with a Stone. We looked at the rates of two actions that are related to cracking palm nuts: percussing a nut on a hard surface (percuss) and striking a nut placed on a hard surface with a stone (strike). The young monkeys’ rate of percussing was highest while others were cracking nuts (median = 1.3 per 10 min) and lowest during periods 8 min or longer after others stopped cracking nuts (median = 0.2 per 10 min). These rates were significantly different (estimate = 6.3, P < 0.0001). As was the case for manipulation of nuts, the onset of the effect of others’ cracking on young monkeys’ percussion was nearly immediate. Compared with the minute before others began to crack nuts, the probability that a young monkey would percuss a nut more than doubled in the first minute after others began to crack nuts (from <0.04 to >0.10) and remained at that level for at least 5 min while others were cracking (Fig. 4). From the time that others stopped cracking nuts, the immature monkeys’ rate of percussion declined exponentially for the next 7 min (in $A_t = A_0 e^{-\beta t}$; estimates: $A_0$ = 8.16, P = 0.026; β = 1.12, P = 0.014; Fig. 5). The half-life of the effect was 0.6 min, less than a third as long as the half-life for manipulation of nuts (2.1 min).

Four monkeys (all <1 y of age) did not strike a nut with a stone during focal observations, but 12 monkeys (all >1 y of age) did. For the latter monkeys, the rate of striking was highest when others were cracking nuts (median = 1.4 per 10 min) and lowest in periods 8 min or longer after others stopped cracking (median = 0.3 per 10 min; Fig. 6). The difference between these rates was significant (estimate = 4.3, P < 0.0001). In the minute after others stopped cracking, the median rate of striking by the young monkeys declined to zero, and for all minutes, the rate of young monkeys’ striking was not significantly different from periods 8 min or longer after others stopped cracking. The data did not fit an exponential model of decline.

Time Spent Near an Anvil. The percentage of time young monkeys spent near an anvil was highest when others were cracking nuts (median = 13.3%) and lowest during periods 8 min or longer after others stopped cracking (median = 2.6%) (Fig. 7). The difference between these percentages was significant (estimate = 6.06, P < 0.0001). From the time when others were cracking until 7 min after all others had stopped cracking, the percentage of time young monkeys spent near an anvil declined exponentially (in $A_t = A_0 e^{-\beta t}$; estimates: $A_0$ = 11.94, P < 0.0001; β = 0.25, P < 0.0001). The half-life of the effect was 2.8 min.

Discussion

Linking Learning Processes to a Tool-Using Tradition in Monkeys. Culture in a behavioral sense is present when, aided by social context, individuals consistently learn behaviors exhibited by others in their community (i.e., they have traditions). Culture can extend biology in an evolutionary sense when the traditions in question persist across generations, and when they have selective consequences. In parallel with natural selection, occasional spontaneous behavioral variants (arising from developmental plasticity) that afford some advantage and that are learned by

others can become established as new traditions. Learning a tradition is, by definition, dependent on the social context in which learning takes place, but it is equally dependent on the learning processes of the individual. Here, we draw attention to the power of social partners to influence not just the content of learning a tradition (i.e., how or when to perform a particular behavior) but also how learning itself takes place. Working from findings with humans that sustained attention is developmentally plastic and susceptible to social influence, we examined behavior of young monkeys, seeking evidence that social influences shape attention in young monkeys. Specifically, we examined temporal dynamics of social influence on young monkeys’ activity with the materials relevant to the tool-using tradition in their wild group (cracking nuts with stone hammers) to index their sustained attention to this task. Young monkeys were more likely to be near an anvil, and to manipulate, percuss, and strike nuts once others began to strike nuts to crack them (which we call cracking). The higher probability of these actions in young monkeys persisted through 5 min while others cracked nuts (5 min was the longest period for which we had sufficient data for these analyses). They continued to manipulate nuts at higher rates and were more likely to be present near an anvil for minutes after others stopped cracking nuts, with a half-life of more than 2 min for each variable. These findings indicate substantial lingering effects of social influence on young monkeys’ interest in nuts and anvils and general exploratory activity directed to nuts. In contrast, these young monkeys decreased manipulation of other objects while other monkeys cracked nuts.

challenging perceptuomotor skill. When others act in frequent bouts, and when these bouts are long-enduring, they support more frequent performance by young individuals of those actions for which attention is least well maintained and, subsequently, working memory is the most fragile. We provisionally identify this process as social scaffolding for learning attentional skills that support learning the least familiar component(s) of an activity. The socially supported practice of sustained attention in a given context can powerfully support development of longer periods of sustained attention that can be marshalled during other contexts. For example, human infants practice sustained attention while maintaining joint attention with a caregiver, and these common and frequent social interactions influence the development of sustained attention more generally (27). In the case of young bearded capuchin monkeys, we suggest that extending sustained attention for percussive actions with stones and nuts is one outcome of repeated prompting to perform these actions arising from others striking nuts while cracking them. We further suggest that the development of longer sustained attention to their own percussive activity supports the acquisition of nut-cracking using stone hammers, which is a signature tradition of tool use in some groups of bearded capuchin monkeys.

Socially Tuned Attention and Learning Traditions in Primates and Other Orders. This line of reasoning leads us to predict that tool-using traditions in nonhuman animals include actions that are infrequent outside of that activity (i.e., traditions are not composed solely of common actions), and to the related predictions that (i) the components of tool use traditions that are infrequent in species-typical activity outside of the traditional activity will be performed frequently by proficient tool users, and (ii) frequent performance by adults of uncommon actions will support practice of these particular actions by young individuals.

Taxonomic variation in social learning may follow from species-typical variations in attention in combination with temporal dynamics of action. We hypothesize two features of attention and memory distinguish primates from other orders: (i) Primates are more likely than other orders to have stronger sustained attention and working memory for actions with objects compared with other kinds of actions, and (ii) primates have stronger interest in others’ actions with objects than do other orders [although there may be other taxa with strong interest in

Fig. 4. Probability that a young monkey (n = 11) percussed a nut 1 min before the onset of striking a nut performed by another monkey (No cracking) and during each of the 5 min following the onset of striking a nut by another monkey. The boxes display the median and interquartile range, and whiskers indicate minimum and maximum values. The solid line within the box depicts the median. Circles indicate values of outliers.
Key to our argument that attention (and its related process,
working memory) can be enhanced by participating in socially
supported practice, others’ cracking was a strong facilitator of
young monkeys’ practicing percussion and striking nuts, but only
when others were cracking. The young monkeys percussed a nut
or struck a nut with a stone much less frequently than they
manipulated nuts; moreover, following the end of others’
cracking nuts, young monkeys reduced percussion and striking Fig. 5. Rate of direct percussion of nuts by young monkeys (n = 16) when
one or more other monkeys struck nuts (Cracking present) during the 7 min
quickly (a half-life of about 37 s for percussion and a median of after other monkeys stopped striking nuts and in periods 8 min or longer
zero after 1 min for striking). after others stopped striking a nut (8 min or over). The boxes display the
median and interquartile range; whiskers indicate minimum and maximum
Building Sustained Attention for the Least Likely Actions. These values. The solid line depicts the median. Circles indicate values of outliers.
findings illustrate how the frequency and temporal duration of The exponential curve generated by model fitting is overlaid. The half-life of
others’ actions can influence how young individuals learn a the decline occurred at 0.62 min.
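As an illustrative sketch only (not the analysis code used in the study, which relied on SAS/STAT), the half-lives quoted in the text and figure captions can be related to the decay constant β of the exponential model A_t = A_0 e^(-βt) given in Methods, where the half-life is ln(2)/β. The function names and the starting value normalized to 1.0 are assumptions made for illustration; only the 0.62-min half-life comes from the Fig. 5 caption.

```python
import math

def half_life_from_beta(beta):
    """Return the half-life ln(2)/beta of the decay A_t = A_0 * exp(-beta * t)."""
    return math.log(2) / beta

def beta_from_half_life(half_life):
    """Invert the relation: decay constant beta for a given half-life."""
    return math.log(2) / half_life

def decayed_value(a0, beta, t):
    """Evaluate A_t = A_0 * exp(-beta * t)."""
    return a0 * math.exp(-beta * t)

# Half-life of direct percussion reported in the Fig. 5 caption, in minutes.
half_life_min = 0.62
beta = beta_from_half_life(half_life_min)  # decay constant per minute

# Converting back to seconds gives roughly the 37 s quoted in the text.
half_life_s = half_life_from_beta(beta) * 60

# Hypothetical starting value normalized to 1.0: half remains after one half-life.
remaining = decayed_value(1.0, beta, half_life_min)
print(round(half_life_s, 1), round(remaining, 3))
```

Under these assumptions, a half-life of 0.62 min corresponds to about 37 s, which is consistent with the value given in the text for percussion.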

7802 | www.pnas.org/cgi/doi/10.1073/pnas.1621071114 Fragaszy et al.


Fig. 6. Rate of striking a nut with a stone by young monkeys (n = 16) when one or more monkeys in the group struck nuts (Cracking present) during the 7 min after other monkeys stopped striking and in periods 8 min or longer after others stopped striking a nut (8 min or over). The boxes display the median and interquartile range, and whiskers indicate minimum and maximum values. The solid line depicts the median. The exponential curve generated by model fitting is overlaid. Circles indicate values of outliers.

Fig. 7. Percentage of time spent within an arm’s length of an anvil by young monkeys (n = 16) when one or more other monkeys struck nuts (Cracking present) during the 7 min after other monkeys stopped striking and in periods 8 min or longer after others stopped striking a nut (8 min or over). The boxes display the median and interquartile range, and whiskers indicate minimum and maximum values. The solid line depicts the median. Circles indicate values of outliers. The exponential curve generated by model fitting is overlaid. The half-life of the decline occurred at 2.8 min.

others’ actions with objects, such as New Caledonian crows, which also use tools in foraging (52)]. By comparison, some cetacean species (e.g., Tursiops and Orcinus orca) are more likely than other mammals to have stronger sustained attention to auditory events and more immediate response to others’ actions, in keeping with the collaborative nature of their feeding activities and the efficiency of sound transmission in water (7). In general, the social dimension of the developmental niche is likely to contribute to tuning attention, and thus to bias learning, to favor those skills that are commonly practiced by the young individual’s social companions.

We hope this work prompts others to develop these ideas and to test them with data from diverse species. Conceptual and empirical work along these lines is needed to integrate developmental and psychological understanding of behavioral variation into theories of cultural evolution, niche construction, and evolutionary biology.

Concluding Remarks

Culture potentially extends biology insofar as the setting of development supports individuals’ learning traditions, and occasionally learning behavioral variants of these traditions arising in other individuals that become established as new traditions. Behavioral traditions are learned in social settings, and the attentional and memorial processes that underlie that learning are themselves shaped by social partners. To date, our attention on socially aided learning, traditions, and cultures in nonhuman species has focused on the form and function of traditional behaviors (e.g., foraging skills, social interactional patterns). We argue that we need to include social influences on the learning process itself in the scope of cultural inquiry, as cross-cultural educational psychologists have argued (18). Improving our understanding of the psychological processes supporting socially biased learning, and thus the traditions that animals acquire, must be part of advancing theory in cultural evolution. Working memory and attention are one set of linked cognitive processes available for study with respect to how and when learning occurs, and how the social setting of development influences learning processes.

We have hypothetically linked temporal dynamics of social influence to sustained attention and working memory. These cognitive processes are fundamental to learning, including learning a traditional skill, cracking nuts with a stone hammer in the case of the young monkeys that we studied. We suggested that repeated experiences of performing challenging parts of the action cycle relating to cracking nuts could lead to extended sustained attention and working memory for these actions. Iterative practice in socially supportive contexts such as we describe for bearded capuchin monkeys characterizes the learning context for culturally acquired instrumental skills in humans and other species (10). Suggestive examples of unlikely actions tightly synchronized with similar actions by another and of more gradual decline of more common actions after others have stopped the activity have been reported in apes. For example, a young chimpanzee, closely watching another chimpanzee cracking a nut with a stone, synchronously raised and lowered its arm (a “striking” action) while the other was striking (53), similar to what we observed with young capuchin monkeys striking nuts in the presence of others cracking nuts. Young orangutans follow peering at others at close range feeding or making nests with higher rates of exploratory actions with the food items handled by others or with nest materials over the next few minutes (54), exhibiting a temporal persistence of social influence on the order of what we report for capuchin monkeys manipulating nuts.

We further suggested that primates exhibit particular attentional biases toward actions with objects. This bias increases the likelihood that they learn about such actions from being with others in concert with practicing themselves. Attentional biases are likely to vary across taxa in accord with, for example, dominant modalities of communication and styles of foraging. Differences in where attentional biases impact learning processes could support taxonomic variation in the content of what the individual learns with others (e.g., songs in some species, food preferences and foraging techniques in other species). The eventual outcome of taxonomic variation in socially biased learning is that traditions develop in varied functional domains, and thus that culture is likely to extend biology in various taxa in different ways.

Methods

Study Site. This study was conducted at Fazenda Boa Vista and adjacent lands in the southern Parnaíba Basin (9°39′S, 45°25′W) in Piauí, Brazil. Palms are abundant in the area, and many produce fruit at ground level. Two species of palm nuts in particular were commonly cracked by the monkeys in this study: tucum (Astrocaryum campestre) and piassava (Orbignya spp.). A tucum nut is, on average, 46 mm in length, weighs 15.5 g, and has a 4.1-mm-thick shell, with a peak-force-at-failure of 5.6 kN. An average piassava nut is 61.3 mm long, weighs 50.6 g, and has a thicker and more resistant shell than a tucum nut: 6 mm with a peak-force-at-failure of 11.5 kN (55).

The naturally occurring stones used by the monkeys to crack nuts weigh, on average, 1.1 kg (range: 250 g to 2.5 kg) (56). They are quartz, quartzite,

Fragaszy et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7803
siltstone, or harder sandstone. The monkeys cracked nuts on naturally occurring anvils (boulders, exposed stone, or horizontal logs with a flat, or nearly flat, horizontal surface). Anvils are abundant in the region.

Subjects. At the beginning of the study, there were 11 immature monkeys in the group, aged from 3 mo to 4.5 y (Table 1). Five more infants were born during the study. At the beginning of the study, none of the subjects could crack open a whole nut of the more resistant palm species. The two oldest juveniles and, to some extent, two others mastered this skill during the study. Apart from the study subjects, the group included three adult males and five adult females. All but one female habitually cracked palm nuts. The body mass of each monkey was obtained as monkeys stood individually on a digital scale to drink from a bowl of water (57).

Data Collection. Data were collected in five discrete collection periods, each 6 to 9 wk, during three dry seasons (May–July) in 2011, 2012, and 2013 and two wet seasons (January–March) in 2012 and 2013. Observations were collected using two-person teams. One observer followed a focal subject to obtain a continuous record of its activities, including manipulation of nuts and other objects, and locations, specifically if the subject was near an anvil. Concurrently, the other member of the team recorded, as an instantaneous observation every minute, the identity, location, and activity of other monkeys within 10 m of the focal monkey. All observations lasted 20 min, or until the focal subject went out of view and could not be followed. Observations lasting <5 min were discarded.

Observers first learned to identify all members of the group and were subsequently trained in the method with one of the authors (Y.E.). Reliability for focal observations was calculated using Generalized Sequential Querier (GSEQ) software (www2.gsu.edu/∼psyrab/gseq/index.html). We used the time unit method, which compares the codes inserted by two observers and defines as a match any instant in which both observers used the same code within a time window of 5 s. For each observer and trainer pair, time unit kappa was at or above 0.7, which is considered highly reliable (58). Reliability for instantaneous observations of other monkeys near the focal monkey was tested separately for each aspect (identity, distance to the focal monkey, activity, and location) until agreement (sum agreement/agreement plus disagreement) was over 80% for each variable for 20 consecutive samples.

The protocol was reviewed and approved by the Institutional Animal Care and Use Committee of the University of Georgia. The study adheres to the code of best practices for field primatology set by the International Primatological Society and all applicable Brazilian regulations for the conduct of field research.

Data Analysis. For each subject in each collection period, we collected between 19 and 53 observations, which lasted cumulatively between 5.3 and 27.1 h. Observations were collated by subject for each season. Ten subjects appeared in all five collection periods. Data were collected as the monkeys traveled throughout their home range. The observations were exported from The Observer to GSEQ software to extract the frequency of different events (e.g., manipulation of nuts) at times when others in the group cracked (struck) nuts and at times when they did not.

With respect to change in the young monkeys’ activity following cessation of others’ cracking, we used general mixed linear models to evaluate the differences in activity under different conditions and exponential models to evaluate the temporal pattern of the effects. SAS/STAT 14.2 software was used for the analyses. We examined the rate of manipulation of nuts, manipulation of objects other than nuts, specific actions with nuts, and time spent near an anvil. The data are summarized in Dataset S1.

Our independent variables were the presence of nut-cracking activity (which involved striking nuts with stones) in the group (yes/no) and the time that had elapsed since this activity stopped (e.g., 0–1 min after the activity stopped, 1–2 min after the activity stopped, 2–3 min after the activity stopped). The dependent variables were (i) proportion of time the subject spent within an arm’s length of an anvil, (ii) manipulation of nuts, (iii) manipulation of other objects, (iv) rate of percussing a nut directly on a surface, and (v) rate of striking a nut with a stone. “Total time” is defined as all seconds of observation under a specific condition of the independent variable. For example, the total time of “3 min after activity stopped” includes all observations from 120 to 180 s after all monkeys in the group stopped cracking nuts. Rates for dependent variables in this period were calculated as the number of events divided by total observation time. The proportion of time near an anvil was calculated as the number of seconds spent there divided by total observation time. In the models, we treated the variables as count variables and used total time as an offset. For variables that did not distribute normally (tested with the Shapiro–Wilk test), the Poisson distribution was used. Subject identification was used as a random factor. Randomization of residuals was used to compensate for overdispersion.

We used the exponential model $A_t = A_0 e^{-\beta t}$ to describe the dynamics of the dependent variables with time, where $t$ is the time since cracking in the group stopped and $A_t$ (the dependent variable) is the rate or percentage of time. The output from the model is $A_0$ (the strength of the effect on the dependent variable) and $\beta$, from which the half-life can be calculated as $\ln(2)/\beta$. For each variable, we examined the goodness of fit of our data to this exponential model and determined the estimates for $A_0$ and $\beta$, as well as the half-life of the effect. We used the values of the independent variables when monkeys were striking nuts in each of the 7 min after this activity stopped; at times, we also used the values of the independent variables when monkeys were striking nuts 8 min or longer after this activity stopped or during observation that did not include striking at all. Data from all observation periods were used in analyses, except that data from the two wet seasons on young monkeys’ direct percussion of the nut and striking the nut with a stone were not used in analyses because these actions occurred very rarely in the data from these seasons.

We examined the effect of the onset of others’ cracking by tabulating young monkeys’ actions with nuts in the minute before others began cracking and in the 5 min following the onset of others’ cracking. The data are presented in Dataset S2. We present these data descriptively. We used 11 monkeys’ data for these tallies: 2-y-old, 3-y-old, and 4-y-old monkeys, which our previous analyses had indicated were affected more strongly by others’ striking than younger or older immature monkeys.

At each data collection period, one-quarter to one-half of observations were collected while the monkeys visited an area (∼30-m diameter) of their home range containing several large boulders, fallen logs, and areas of exposed stone that the monkeys habitually used as anvils to crack nuts. We use this area as our outdoor laboratory. Several hammer stones were present in this area, typically left by the monkeys on or near the anvils. Many other anvils with hammer stones were present within 200 m in the surrounding area. The monkeys were sometimes provisioned in the outdoor laboratory with nuts as part of ongoing experiments (59–62).

When in the outdoor laboratory, young monkeys had ample opportunity to spend time near anvils handling stones and nut shells independent of other monkeys’ activity. The influence of others in the group cracking nuts in their vicinity on the rate of manipulating nuts and on proximity to anvils by our subjects in the outdoor laboratory is approximately the same (manipulation: P = 0.0406, estimate = 2.33; proximity to anvils: P < 0.0001, estimate = 8.4) as in the full sample collected over the entire home range (estimate = 1.98 for manipulation, estimate = 6.06 for proximity to anvil). Young monkeys approached anvils and handled nuts most often while adults were cracking nuts, although anvils were available, and nut shells and hammer stones were equally present and available, when others were not cracking. They showed the opposite pattern for manipulating other objects. We conclude that the fine temporal influence of others’ nut-cracking on young monkeys’ activity with nuts and presence near anvils reported here is not a byproduct of synchronized travel of a cohesive group.

ACKNOWLEDGMENTS. We thank the assistants who helped collect the data and Marino Gomes de Oliveira and the Oliveira family for their help and permission to work on their land. We thank Marcus W. Feldman, Andrew Whiten, Kevin N. Laland, and Francisco J. Ayala for the opportunity to participate in the Sackler Colloquium “The Extension of Biology Through Culture.” We thank the statistical consulting service at the University of Georgia (UGA) and the UGA Dean’s Award for the funding of this service. This research was funded by the National Geographic Society, UGA, Coordenadoria de Aperfeiçoamento de Pessoal de Nível Superior, São Paulo Research Foundation (Grant 08/55684-3), and Brazilian National Council for Scientific and Technological Development (CNPq) (Contract 029088). Permission was granted for the research by Instituto Brasileiro do Meio Ambiente e dos Recursos Renováveis through Permit 28689 and by CNPq/Ministério da Ciência e Tecnologia Permit 0002547/2011.

1. Fragaszy DM, Perry S (2003) Towards a biology of traditions. Traditions in Nonhuman Animals: Models and Evidence, eds Fragaszy D, Perry S (Cambridge Univ Press, Cambridge, UK), pp 1–32.
2. Perry SE, Barrett BJ, Godoy I (2017) Older, sociable capuchins (Cebus capucinus) invent more social behaviors, but younger monkeys innovate more in other contexts. Proc Natl Acad Sci USA 114:7806–7813.
3. West-Eberhard MJ (2003) Developmental Plasticity and Evolution (Oxford Univ Press, Oxford).
4. Laland KN, et al. (2015) The extended evolutionary synthesis: Its structure, assumptions and predictions. Proc Biol Sci 282:20151019.
5. Fragaszy DM, Perry S (2003) The Biology of Traditions: Models and Evidence (Cambridge Univ Press, Cambridge, UK).
6. McGrew W (2004) The Cultured Chimpanzee: Reflections on Cultural Primatology (Cambridge Univ Press, Cambridge, UK).
7. Whitehead H, Rendell L (2015) The Cultural Lives of Dolphins and Whales (Univ of Chicago Press, Chicago).
8. Mesoudi A (2011) Cultural Evolution (Univ of Chicago Press, Chicago).
9. Mesoudi A (2017) Pursuing Darwin’s curious parallel: Prospects for a science of cultural evolution. Proc Natl Acad Sci USA 114:7853–7860.
10. Whiten A (2017) Culture extends the scope of evolutionary biology in the great apes. Proc Natl Acad Sci USA 114:7790–7797.
11. van de Waal E, Bshary R, Whiten A (2014) Field experiments show wild vervet monkey infants acquire maternal food-processing techniques. Anim Behav 90:41–45.
12. Perry S (2011) Social traditions and social learning in capuchin monkeys (Cebus). Philos Trans R Soc Lond B Biol Sci 366:988–996.
13. Stout D, Hecht EE (2017) Evolutionary neuroscience of cumulative culture. Proc Natl Acad Sci USA 114:7861–7868.
14. Lotem A, Halpern JY, Edelman S, Kolodny O (2017) The evolution of cognitive mechanisms in response to cultural innovations. Proc Natl Acad Sci USA 114:7915–7922.
15. Jablonka E, Lamb M (2014) Evolution in Four Dimensions (MIT Press, Cambridge, MA), 2nd Ed.
16. Flynn EG, Laland KN, Kendal RL, Kendal JR (2013) Target article with commentaries: Developmental niche construction. Dev Sci 16:296–313.
17. Rogoff B (1991) Apprenticeship in Thinking (Oxford Univ Press, Oxford).
18. Li J (2012) Cultural Foundations of Learning: East and West (Cambridge Univ Press, Cambridge, UK).
19. Yates FA (1966) The Art of Memory (Chicago Univ Press, Chicago).
20. Carruthers P (2013) Evolution of working memory. Proc Natl Acad Sci USA 110:10371–10378.
21. Atkinson RC, Schiffrin RM (1968) Human memory: A proposed system and its control processes. Psychol Learn Motiv 2:89–195.
22. O’Connell RG, et al. (2008) Self-alert training: Volitional modulation of autonomic arousal improves sustained attention. Neuropsychologia 46:1379–1390.
23. Fragaszy DM, et al. (2013) The fourth dimension of tool use: Temporally enduring artefacts aid primates learning to use tools. Philos Trans R Soc Lond B Biol Sci 368:20120410.
24. Thompson B, Kirby S, Smith K (2016) Culture shapes the evolution of cognition. Proc Natl Acad Sci USA 113:4530–4535.
25. Reynolds GD, Romano AC (2016) The development of attention systems and working memory in infancy. Front Syst Neurosci 10:15.
26. Pasternak T, Greenlee MW (2005) Working memory in primate sensory systems. Nat Rev Neurosci 6:97–107.
27. Yu C, Smith LB (2016) The social origins of sustained attention in one-year-old human infants. Curr Biol 26:1235–1240.
28. Fausey CM, Jayaraman S, Smith LB (2016) From faces to hands: Changing visual input in the first two years. Cognition 152:101–107.
29. Myowa-Yamakoshi M, Matsuzawa T (2000) Imitation of intentional manipulatory actions in chimpanzees (Pan troglodytes). J Comp Psychol 114:381–391.
30. Fragaszy DM, Deputte B, Cooper EJ, Colbert-White EN, Hémery C (2011) When and how well can human-socialized capuchins match actions demonstrated by a familiar human? Am J Primatol 73:643–654.
31. Rizzolatti G, Sinigaglia C (2008) Mirrors in the Brain: How Our Minds Share Actions, Emotions and Experience (Oxford Univ Press, Oxford).
32. Fragaszy DM, Crast J (2016) Functions of the hand in primates. The Evolution of the Primate Hand, Perspectives from Anatomical, Developmental, Functional, and Paleontological Evidence, eds Kivell T, Schmitt D, Lemelin P (Springer, New York), Vol 2, pp 313–344.
33. Davis RT (1974) Monkeys as perceivers. Primate Behavior: Developments in Field and Laboratory Research, ed Rosenblum L (Academic, New York), Vol 3.
34. Treves A (2000) Theory and method in studies of vigilance and aggregation. Anim Behav 60:711–722.
35. Heyes C (2012) What’s social about social learning? J Comp Psychol 126:193–202.
36. Hoppitt W, Samson J, Laland KN, Thornton A (2012) Identification of learning mechanisms in a wild meerkat population. PLoS One 7:e42044.
37. Thornton A, Hodge S (2008) The development of foraging microhabitat preferences in meerkats. Behav Ecol 20:103–110.
38. Thornton A, McAuliffe K (2006) Teaching in wild meerkats. Science 313:227–229.
39. Visalberghi E, Fragaszy DM (2013) The EthoCebus project. Stone tool use by wild capuchin monkeys. Tool Use in Animals. Cognition and Ecology, eds Sanz C, Call J, Boesch C (Cambridge Univ Press, Cambridge, UK), pp 203–222.
40. Ferreira RG, et al. (2009) On the occurrence of Cebus flavius (Schreber 1774) in the Caatinga, and the use of semi-arid environments by Cebus species in the Brazilian state of Rio Grande do Norte. Primates 50:357–362.
41. Canale GR, Guidorizzi CE, Kierulff MC, Gatto CA (2009) First record of tool use by wild populations of the yellow-breasted capuchin monkey (Cebus xanthosternos) and new records for the bearded capuchin (Cebus libidinosus). Am J Primatol 71:366–372.
42. Falotico T, Ottoni E (2016) The manifold use of pounding stone tools by wild capuchin monkeys of Serra da Capivara National Park, Brazil. Behav 153:421–442.
43. Fragaszy DM, Visalberghi E, Fedigan L (2004) The Complete Capuchin (Cambridge Univ Press, Cambridge, UK).
44. Mangalam M, Izar P, Visalberghi E, Fragaszy DM (2016) Task-specific temporal organization of percussive movements in wild bearded capuchin monkeys. Anim Behav 114:129–137.
45. Stout D, Khreisheh N (2015) Skill learning and human brain evolution: An experimental approach. Camb Archaeol J 25:867–875.
46. Whiten A (2015) Experimental studies illuminate the cultural transmission of percussive technologies in Homo and Pan. Philos Trans R Soc Lond B Biol Sci 370:20140359.
47. de Resende BD, Ottoni EB, Fragaszy DM (2008) Ontogeny of manipulative behavior and nut-cracking in young tufted capuchin monkeys (Cebus apella): A perception-action perspective. Dev Sci 11:828–840.
48. Eshchar Y, Izar P, Visalberghi E, Resende B, Fragaszy D (2016) When and where to practice: Social influences on the development of nut-cracking in bearded capuchins (Sapajus libidinosus). Anim Cogn 19:605–618.
49. Spagnoletti N, Visalberghi E, Ottoni E, Izar P, Fragaszy D (2011) Stone tool use by adult wild bearded capuchin monkeys (Cebus libidinosus). Frequency, efficiency and tool selectivity. J Hum Evol 61:97–107.
50. Verderane MP, Izar P, Visalberghi E, Fragaszy DM (2013) Socioecology of wild bearded capuchin monkeys (Sapajus libidinosus): An analysis of social relationships among female primates that use tools in feeding. Behaviour 150:659–689.
51. Howard AM, Nibbelink N, Bernardes S, Fragaszy DM, Madden M (2015) Remote sensing and habitat mapping for (Sapajus libidinosus): Landscapes for the use of stone tools. J Appl Remote Sens 9:096020–096020.
52. Holzhaider J, Hunt G, Gray D (2010) The development of pandanus tool manufacture in wild New Caledonian crows. Behaviour 147:553–586.
53. Fuhrmann D, Ravignani A, Marshall-Pescini S, Whiten A (2014) Synchrony and motor mimicking in chimpanzee observational learning. Sci Rep 4:5283.
54. Schuppli C, et al. (2016) Observational social learning and socially induced practice of routine skills in immature wild orangutans. Anim Behav 119:87–98.
55. Visalberghi E, et al. (2008) Physical properties of palm fruits processed with tools by wild bearded capuchins (Cebus libidinosus). Am J Primatol 70:884–891.
56. Visalberghi E, et al. (2007) Characteristics of hammer stones and anvils used by wild bearded capuchin monkeys (Cebus libidinosus) to crack open palm nuts. Am J Phys Anthropol 132:426–444.
57. Fragaszy DM, et al. (2016) Body mass in wild bearded capuchins, (Sapajus libidinosus): Ontogeny and sexual dimorphism. Am J Primatol 78:389–484.
58. Bakeman R, Deckner DF, Quera V (2005) Analysis of behavioral streams. Handbook of Research Methods in Developmental Science, ed Teti DM (Wiley, New York), pp 394–420.
59. Massaro L, Liu Q, Visalberghi E, Fragaszy D (2012) Wild bearded capuchin (Sapajus libidinosus) select hammer tools on the basis of both stone mass and distance from the anvil. Anim Cogn 15:1065–1074.
60. Fragaszy DM, Liu Q, Wright BW, Allen A, Brown CW (2013) Wild bearded capuchin monkeys (Sapajus libidinosus) strategically place nuts in a stable position during nut-cracking. PLoS One 8:e56182.
61. Hanna JB, et al. (2015) Kinetics of bipedal locomotion during load carrying in capuchin monkeys. J Hum Evol 85:149–156.
62. Liu Q, Fragaszy DM, Visalberghi E (2016) Wild capuchin monkeys spontaneously adjust actions when using hammer stones of different mass to crack nuts of different resistance. Am J Phys Anthropol 161:53–61, 1.

Older, sociable capuchins (Cebus capucinus) invent more social behaviors, but younger monkeys innovate more in other contexts

Susan E. Perryᵃ,ᵇ,¹, Brendan J. Barrettᶜ,ᵈ, and Irene Godoyᵉ

ᵃDepartment of Anthropology, University of California, Los Angeles, CA 90095-1553; ᵇBehavior, Evolution and Culture Program, University of California, Los Angeles, CA 90095-1553; ᶜAnimal Behavior Graduate Group, University of California, Davis, CA 95616-8522; ᵈDepartment of Anthropology, University of California, Davis, CA 95616-8522; and ᵉBehavioural Science Institute, Radboud University Nijmegen, 6500 HE Nijmegen, The Netherlands

Edited by Andrew Whiten, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark May 16, 2017 (received for review January 18, 2017)

An important extension to our understanding of evolutionary processes has been the discovery of the roles that individual and social learning play in creating recurring phenotypes on which selection can act. Cultural change occurs chiefly through invention of new behavioral variants combined with social transmission of the novel behaviors to new practitioners. Therefore, understanding what makes some individuals more likely to innovate and/or transmit new behaviors is critical for creating realistic models of culture change. The difficulty in identifying what behaviors qualify as new in wild animal populations has inhibited researchers from understanding the characteristics of behavioral innovations and innovators. Here, we present the findings of a long-term, systematic study of innovation (10 y, 10 groups, and 234 individuals) in wild capuchin monkeys (Cebus capucinus) in Lomas Barbudal, Costa Rica. Our methodology explicitly seeks novel behaviors, requiring their absence during the first 5 y of the study to qualify as novel in the second 5 y of the study. Only about 20% of 187 innovations identified were retained in innovators’ individual behavioral repertoires, and 22% were subsequently seen in other group members. Older, more social monkeys were more likely to invent new forms of social interaction, whereas younger monkeys were more likely to innovate in other behavioral domains (foraging, investigative, and self-directed behaviors). Sex and rank had little effect on innovative tendencies. Relative to apes, capuchins devote more of their innovations repertoire to investigative behaviors and social bonding behaviors and less to foraging and comfort behaviors.

innovation | Cebus capucinus | cultural evolution | phenotypic plasticity | learning

Behavioral innovation has long been a topic of interest for researchers dedicated to studying the evolution of culture, because it is a driver of cultural change (1, 2). The types of be-

Despite the obvious theoretical importance of innovation as a key element in cultural evolution, there are relatively few studies of this topic, particularly in wild populations, because of methodological difficulties in stimulating innovation experimentally or detecting innovations in observational studies (10). This paucity of information is partly because of the difficulty in creating operational definitions of innovation that can produce meaningful datasets for comparative analysis. Here, we loosely adopt the definitions by Reader and Laland (11) of innovation (the process) as “a process that results in new or modified learned behaviour and that introduces novel behavioural variants into a population’s repertoire” and innovation (the product) as “a new or modified learned behavior not previously found in the population” (ref. 11, p. 14). Following the definitions of Ramsey et al. (12) and van Schaik et al. (13), we emphasize that innovations are not part of the innate repertoire and do not arise predictably in all population members at certain points in the life history; also, they do not predictably emerge in all population members in response to particular social or ecological conditions. The definition by Ramsey et al. (12) and van Schaik et al. (13) differs from the definition by Reader and Laland (11), because it focuses on the individual rather than the population [i.e., Ramsey et al. (12) argue that multiple individuals within the same population could independently create the same behavior]. We take a compromise position, counting a behavior as an innovation if this is the first time that the behavior has been seen in a particular social group during the putative innovator’s lifetime.

Operational definitions of innovation and invention differ greatly across fields. Some define innovations as inventions that have
subsequently spread throughout the population via social learning
(14). We use the definition most commonly and currently used in
the animal behavior literature, in which transmission of a new
havioral traditions that are of greatest interest to evolutionary behavior is not a necessary part of our definition of innovation.
modelers are those starting with an innovation that then spreads We assume because of the extreme xenophobia exhibited by
via social learning. Understanding the characteristics of (i) be- white-faced capuchins (15) that widespread social transmission
havioral innovations (which are roughly analogous to genetic
mutations) and (ii) the individuals who invent these behaviors is
critical to understanding cultural evolution and its relationship to This paper results from the Arthur M. Sackler Colloquium of the National Academy of
genetic evolution. Innovation is also of interest to evolutionary Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the
biologists who study the role that learning plays in macroevolu- Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering
in Irvine, CA. The complete program and video recordings of most presentations are available
tion, because it is a type of phenotypic plasticity that can affect
on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
the direction of natural selection (3). Ability to innovate can
Author contributions: S.E.P. designed research; S.E.P., B.J.B., and I.G. performed research;
enhance reproductive success (for example, by enabling indi- S.E.P., B.J.B., and I.G. analyzed data; and S.E.P., B.J.B., and I.G. wrote the paper.
viduals to exploit new resources) (4–6). Innovation can generate The authors declare no conflict of interest.
the Baldwin effect, in which learned traits create recurring
This article is a PNAS Direct Submission. A.W. is a guest editor invited by the Editorial
phenotypes that select for morphological adaptations, eventually Board.
leading to speciation (3, 7). Innovation is also of interest as a Data deposition: The R-code and data used in the statistical analysis reported in this paper
correlate of intelligence more generally, and the ability to solve are available at https://github.com/bjbarrett/cebusinnovation2017.
novel cognitive problems presented by experimenters can be 1
To whom correspondence should be addressed. Email: sperry@anthro.ucla.edu.
positively associated with mating success [e.g., bowerbirds (8)] This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.
and parenting success [e.g., great tits (9)] in wild populations. 1073/pnas.1620739114/-/DCSupplemental.

7806–7813 | PNAS | July 25, 2017 | vol. 114 | no. 30 www.pnas.org/cgi/doi/10.1073/pnas.1620739114


COLLOQUIUM
PAPER
of innovations is inhibited by insufficient exposure to members of other social groups.

The thorniest challenges in the rapidly growing field of animal innovation research arise in devising methods for quantifying innovation rates and determining the characteristics of innovators. Most research on animal innovation thus far has involved either (i) analysis of anecdotes drawn from published literature on topics other than innovation (16–18) or (ii) experiments in either the field or the laboratory, in which experimenters present individuals with a novel problem to solve (8, 9, 19–22). The most prominent body of innovation research using observational data has focused on orangutans (Pongo) using developmental data from rehabilitants (23) or “geographic contrasts methods” akin to those used to diagnose probable social traditions (24, 25) to infer innovation rates by evaluating the patterning of within- and between-group behavioral variation in short-term studies of both captive and field populations (12, 13, 26). Few field studies of innovation are longitudinal or look at the properties of individuals that make them more, or less, likely to innovate within their lifetime.

There are advantages and disadvantages to all of these methods, and none of them seemed entirely suitable for our goals, which are to (i) document the kinds of behaviors present in the innovation repertoire in wild white-faced capuchins, (ii) investigate differences in rates of innovation across behavioral domains, (iii) determine what features of individuals (rank, sex, age, and sociality) predict propensity to innovate, and (iv) determine whether new behaviors become established components of individual or group behavioral repertoires. Our approach differs from previous approaches to documentation of behavioral innovation, in that it was implemented into the core data collection protocol over a 10-y period, for which there was dense behavioral sampling of multiple social groups in the same ecological context. This approach provides the advantage of enabling researchers to detect probable innovations in any behavioral domain and have sufficient time depth to know who the probable innovators are (details are in Methods).

There are several biologically relevant behavioral contexts, or domains, that are difficult to study experimentally. Studies have largely looked at the diffusion of novel foraging tasks for good reasons. Experimentally seeded innovations or problem-solving tasks provide controlled contexts, where both the latency of individuals innovating a solution and the social diffusion of innovations may be studied. Innovating solutions to novel tasks is also of obvious adaptive value. However, many innovations that are ecologically or socially relevant are difficult to study experimentally. Some animals have innovative social interaction, i.e., “social games,” which may serve as bond testing rituals and can be socially transmitted (27). Some wild animals display repetitive self-directed “quirks,” perhaps to self-soothe, akin to the proposed function of some stereotyped coping behaviors in captive and wild animals (28). Other behavioral innovations have no apparent immediate biological or ecological function.

Systematic observational study of innovations across a wide range of behavioral domains permits us to explore whether individual propensity to innovate is generalized or whether individuals will be differentially prone to innovate in different behavioral domains according to their ecology and life history. For example, it has been suggested that “necessity is the mother of invention” and hence, that individuals who are young, low-ranking, and/or socially peripheral will be more prone to inventing new foraging strategies (29); this hypothesis has yielded mixed results in past literature reviews (6). In general, there are few strong theoretical expectations about how age, sex, rank, and sociality affect innovation rates; we need natural history and observational studies to help guide theory.

Capuchin monkeys (genera Cebus and Sapajus) are expected to have unusually high innovation rates for myriad reasons. Comparative studies have shown that brain size covaries with innovation frequency in primates and birds (30, 31), and capuchins are notable for their large relative brain sizes (32). White-faced capuchins (Cebus capucinus) are also omnivorous extractive foragers (33, 34), an ecological trait positively related to innovation in callitrichids (35), and profit greatly from experimenting with new foods and feeding techniques. Capuchins have radiated over a large, ecologically diverse geographic area (34, 36), which suggests that they are capable of adapting their behavior to novel situations. They frequently interact with and investigate other species as predators, prey, and feeding competitors (37, 38). It has been suggested by Sol et al. (39) that innovative tendencies in birds have coevolved with life history in generalist species, with slow-developing, large-brained species prioritizing future over current reproduction and being more likely to innovate and produce plastic responses to changing conditions. If this model holds true in other taxa, capuchins should be excellent candidates for high innovation rate given their life histories and ecological niche.

White-faced capuchins live in multimale, multifemale social groups that are highly xenophobic, offering little opportunity to learn from members of other groups, except by migrating (40). Capuchins are notable for the high frequency of coalition formation and the creation of quirky social conventions that seem to serve as means of testing the social bonds that are likely to be important for enhancing fitness (27, 38, 40). The presence of social traditions in the foraging and social domains (41) implies that capuchins innovate as well.

Results

Innovations were classified in four categories or “behavioral domains”: foraging, self-directed, social, and investigative. The foraging category included 17 behaviors related to drinking water or processing food, 4 of which were independently invented in other groups and only 2 of which persisted in the innovator’s repertoire. One of these was a form of tool use: the use of leaves to wrap and scrub the urticating hairs off of Automeris caterpillars. Another was the use of the tail tip as a sponge to access water from tree holes too deep to reach with a hand or foot (Movie S1). The latter was invented only once during the period of 2007–2011 by a female who did this regularly but did not transmit it to other group members. This same tail-dipping behavior was independently invented in the buffer period (2002–2006) by monkeys in three other groups and spread to multiple individuals in one of those groups. (The buffer period was a time period during which we did not score innovations, but we used this time period to confirm whether behaviors seen in the subsequent 5-y period were truly new to that group. Details are in Methods.) It was not possible to reliably score food choice innovations and assign them to particular innovators, but the foraging category would have been much larger had we been able to include them.

The self-directed category included nine behaviors related to enhancing comfort, dental hygiene, self-soothing, and self-stimulation. Capuchins are prone to inventing “personal quirks,” especially involving clutching or poking some part of their own body for prolonged periods of time. These habits may persist for years and might be transmitted to other group members. There were many individuals in the 2007–2011 dataset that were still practicing postural quirks that they invented during the buffer period (2002–2006), and those are not evident in SI Appendix, Table S1. The “body part hold” category lumps together many different types of postural quirks; if we split this category more finely, there would be far more innovations in this domain.

The social category includes 47 forms of social interaction that are not part of the standard species repertoire. Eight of these were independently invented in multiple groups, and most of these inventions involved incorporation of behavioral elements from the foraging repertoire into the exploration or use of the interaction partner’s body (e.g., explorations of the partner’s orifices or mouthing parts of the partner). Some behaviors (e.g., dental examinations, eye poking, hand sniffing, sucking of body
parts, toy game, and hair game) have been invented in multiple groups over the years and have become well-established traditions in some groups; hence, many of these behaviors scored as innovations for this 2007–2011 time period were invented before 2007 in other groups and still in practice (and hence, not counted as innovations) during this time period. Both past work and the patterning of results in SI Appendix, Table S1 suggest that those social rituals that involve some discomfort or risk (e.g., having an appendage bitten or damaging an eye) are more prone to remain in individual repertoires and become established in group repertoires. In addition to the aforementioned behaviors that have been proposed as bond-testing behaviors (41), the social category of innovations includes social play, social displays, and some creative ways for females to regulate infant behavior.

The investigative category included creative manipulations of other species (e.g., porcupines, howler monkeys, and turtles), human artifacts, leaves, sand, sticks, water, rocks, and other inanimate objects as well as innovative ways of locomoting through the forest. Most of these behaviors had no obvious immediate purpose and gave the impression that the innovator was engaged in recreational creativity, exploring the affordances of the materials. To give a few examples, in “cow pie seesaw” (Movie S2), the young monkey flips over a dried piece of cattle dung the length of her body, so that the flat side is on top and the rounded side contacts the ground; then, she stands on top, rocking back and forth. In “mango hitting game,” a young monkey routinely finds mangos about one-half the size of her body on the ground and applies hard, two-handed strikes to them for 4–5 min at a stretch, throwing her body weight into the assault on the mango, with no apparent interest in eating it (SI Appendix). Less than 15% of investigative behaviors were seen more than once in individual repertoires, and only 18% were subsequently observed being practiced by the innovator’s groupmates.

The final dataset, described in detail in SI Appendix, Table S1, included 187 innovations, 127 of which were unique behavior types. Of the 187 innovations, 149 (80%) of them we never again saw performed by the innovator, and only 41 (22%) were seen performed by other monkeys in the same group (i.e., had the potential to have become a tradition during the observation period). Of the 127 unique innovations seen, 54 (42.5%) were in the investigative domain, 47 (37.0%) related to social behavior, 9 (7.1%) were self-directed behaviors, and 17 (13.4%) were foraging behaviors. At least one innovation, based on our conservative criteria, was scored in 117 of 234 individuals included in this dataset. Descriptions of all innovations are included in SI Appendix, and SI Appendix, Table S1 reports the distribution of these behavioral variants across social groups and individuals.

Innovations were rare, being observed, on average, less than once per individual per year in any particular domain. These annual rates may appear low, but our aim is to predict the properties of an individual that make him/her more prone to innovate in particular behavioral domains. If one ignores the properties of individuals or behavioral domains, annual group-wide innovation rates (Fig. 1), which have been the focus of many field studies of innovation, are much higher; however, we can learn much about individual innovators by taking this more detailed approach.

Fig. 1. Posterior predictions of annual innovation rate (number of innovations per year) for each group. Two-letter names of the social groups are at the top of each panel; vertical lines indicate PMs. n = 44 group-years. [Panels: AA, CU, FF, FL, LB, MK, NM, RF, RR, SP; x axis: estimated annual innovation rate per group.]

Our global model (referred to as mASRMG in Table 1) received overwhelming support compared with other models [having a Widely Applicable Information Criterion (WAIC) weight of 1.00] and suggests that sociality and age are the most important predictors of innovation (Table 1). From posterior median (PM) estimates and 89% credible intervals (89% CIs), our model suggests that marginal rates of innovation per individual per year (innovation rates) are highest in the social domain (PM = 0.122; 89% CI = 0.064–0.195) followed by the investigative domain (PM = 0.085; 89% CI = 0.047–0.147), foraging domain (PM = 0.028; 89% CI = 0.014–0.052), and self-directed domain (PM = 0.018; 89% CI = 0.008–0.037). Much of the variance in probability of innovating can be explained by individual identity (σid of αp = 1.01) and social group membership (σgroup of αp = 1.56). Importantly, differences between behavioral domains account for more variation in rates of innovation (Table 2) than between-individual or -group differences for all other varying effect parameters with the exception of βmalep (SI Appendix, Table S2). Because interpreting the coefficients of these models can be nonintuitive and because they interact multiplicatively to estimate innovation rates in each domain, we instead refer the reader to plots of model predictions (Figs. 2 and 3 and SI Appendix, Figs. S1–S4) for all estimated effects.

For our model predictions, we display the posterior predictions for each individual’s annual innovation rate in each behavioral domain. The y axes may represent (i) the estimated number of innovations per individual monkey per year or the joint probability, (1 − p) × λ (Figs. 2 and 3 and SI Appendix, Figs. S3A and S4A). In some cases, looking at the individual components of a zero-inflated Poisson (ZIP) model can be informative, and therefore, we also present (ii) the probability of innovating per year, 1 − p, (SI Appendix, Figs. S1 A–D, S2 A–D, S3B, and S4B) and (iii) the number of innovations per year estimated to be observed conditional on being an innovator, λ (SI Appendix, Figs. S1 E–H, S2 E–H, S3, and S4C).

Age. Age differentially predicts innovativeness across behavioral contexts. Younger individuals innovate at higher rates in the investigative, foraging, and self-directed domains, although the effect size is quite small for self-directed and foraging behaviors (Fig. 2 A–C). Older individuals are slightly more innovative in the social domain (Fig. 2D and Table 2). This effect was more heavily driven by the probability of being an innovator (SI Appendix, Fig. S1 A–D and Table S2) than the number of innovations conditional on being an innovator. Younger innovators seem slightly more likely to produce more innovations, conditional on being an innovator, but this effect is small (SI Appendix, Fig. S1 E–H) and near zero in most domains.

Table 1. WAIC estimates for all evaluated innovation models

Model     WAIC        dWAIC    wWAIC   SE
mASRMG    1,442.02    0        1       81.97
mA        1,478.34    36.32    0       84.38
mS        1,503.03    61.01    0       84.49
m         1,510.11    68.09    0       84.22
mRG       1,526.1     84.08    0       86.46
mM        1,570.25    128.23   0       88.39

Capital letters in model names correspond to predictors included: age (A), sociality (S), rank (R), sex (M), and group size (G). dWAIC, difference in WAIC scores from the highest ranked model; wWAIC, WAIC weight.
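
As a rough check on the wWAIC column of Table 1, the sketch below converts the reported dWAIC values into relative weights. It assumes the common Akaike-style convention, in which each weight is proportional to exp(−dWAIC/2); the text does not state which formula was used, so this is an illustration rather than a reproduction of the authors' R analysis. The function name `waic_weights` is hypothetical.

```python
import math

# dWAIC values transcribed from Table 1 (difference from the top-ranked model).
dwaic = {
    "mASRMG": 0.0,
    "mA": 36.32,
    "mS": 61.01,
    "m": 68.09,
    "mRG": 84.08,
    "mM": 128.23,
}


def waic_weights(deltas):
    """Turn WAIC differences into relative weights that sum to 1.

    Assumption: Akaike-style weights, w_i proportional to exp(-0.5 * dWAIC_i).
    """
    relative = {name: math.exp(-0.5 * d) for name, d in deltas.items()}
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}


weights = waic_weights(dwaic)
for name, weight in weights.items():
    # Print to two decimals, the precision implied by the table.
    print(f"{name:7s} {weight:.2f}")
```

Under that assumption, a gap of more than 36 WAIC units between the global model and its nearest competitor is enough for the global model's weight to round to 1 and every other weight to round to 0, which is consistent with the table.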

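The three quantities plotted from the zero-inflated Poisson (ZIP) model, namely the probability of innovating (1 − p), the rate conditional on being an innovator (λ), and their product (1 − p) × λ, can be illustrated with a minimal sketch. The symbols p and λ follow the text; the numeric values and the function names `zip_pmf` and `zip_mean` are hypothetical and are not estimates from the paper.

```python
import math


def zip_pmf(k, p, lam):
    """P(Y = k) under a zero-inflated Poisson.

    p   : probability of a structural zero (the individual is not an innovator)
    lam : Poisson rate of innovations per year, conditional on being an innovator
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    structural_zero = p if k == 0 else 0.0
    return structural_zero + (1.0 - p) * poisson


def zip_mean(p, lam):
    """Expected innovations per year: the joint quantity (1 - p) * lambda."""
    return (1.0 - p) * lam


# Hypothetical values chosen only for illustration.
p, lam = 0.9, 1.2

expected = zip_mean(p, lam)
# Cross-check the closed form against a direct sum over the distribution.
numeric = sum(k * zip_pmf(k, p, lam) for k in range(60))

print(f"P(innovator)         = {1.0 - p:.3f}")
print(f"rate given innovator = {lam:.3f}")
print(f"expected per year    = {expected:.3f} (numeric check {numeric:.3f})")
```

Separating the two components in this way shows how a predictor can change the overall rate mainly through the probability of being an innovator, mainly through the conditional rate, or through both.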
Table 2. Posterior mean estimates of varying effects of behavioral domain

                     Behavioral domain
Parameter       Foraging   Investigative   Self-directed   Social
αp              −0.12      −0.45           0.01            0.06
αl              −0.98      0.07            −1.38           0.49
βagep           2.37       3.96            2.97            −1.64
βagel           0.03       −0.52           −0.04           0.24
βsocialityp     −0.03      −0.31           0.23            −0.13
βsocialityl     −0.32      −0.09           −0.52           0.51
βrank.highp     0.14       −0.46           −0.02           0.00
βrank.highl     0.05       −0.32           0.15            −0.01
βrank.lowp      −0.37      −0.42           −0.44           0.19
βrank.lowl      −0.21      0.03            −0.21           0.13
βmalep          0.12       −0.57           0.33            −0.09
βmalel          −0.09      0.08            −0.26           0.20

Fig. 2. Joint model predictions for the effect of age on the number of innovations per individual per year in the domains of (A) foraging, (B) investigative, (C) self-directed and (D) social behaviors. Dark lines are at the PMs; lighter lines are 100 randomly sampled posterior predictions. n = 3,132 individual-years. [Panels: a. foraging, b. investigative, c. self-directed, d. social; x axis: log(age); y axis: individual innovation rate.]

Sociality. Sociality also differentially predicts innovation across behavioral domains. More social individuals showed higher rates of innovation in the social (Fig. 3D) domain. Less social individuals had higher innovation rates in the foraging (Fig. 3A), investigative (Fig. 3B), and self-directed (Fig. 3C) domains, although these effects are weak and less certain. More social individuals produced a greater number of innovations per year (conditional on being innovators) in the social domain (SI Appendix, Fig. S2H), whereas less social individuals showed a greater number of innovations in the self-directed domain (SI Appendix, Fig. S2G), and there was little effect of sociality on foraging (SI Appendix, Fig. S2E) and investigative (SI Appendix, Fig. S2F) behaviors.

Fig. 3. Joint model predictions for the effect of sociality on the number of innovations per individual per year. Dark lines are at the PMs; lighter lines are 100 randomly sampled posterior predictions. n = 3,132 individual-years. [Panels: a. foraging, b. investigative, c. self-directed, d. social; x axis: sociality scores (standardized); y axis: individual innovation rate.]

Sex. Males (PM = 0.034; 89% CI = 0.014–0.074) have slightly higher innovation rates than females (PM = 0.024; 89% CI = 0.011–0.045), ignoring between-domain variation. Males were slightly more likely to be innovators (SI Appendix, Fig. S3B). Within the social and investigative domains, males showed slightly higher innovation rates both overall and conditional on being innovators, but there were no discernable effects of sex in the foraging and self-directed domains (SI Appendix, Fig. S3 A and C). Where these small differences exist, they are uncertain and potentially of no biological significance.

Rank. Ignoring domain-specific effects, we found that innovation rate is highest in middle-ranking individuals (PM = 0.036; 89% CI = 0.017–0.070) followed by high-ranking individuals (PM = 0.026; 89% CI = 0.011–0.058) and lowest in low-ranking individuals (PM = 0.021; 89% CI = 0.009–0.48). Rank did not have consistent or strong effects on overall innovation rates within domains, although there was a slight tendency for midranking individuals to show higher innovation rates in the social and investigative domains (SI Appendix, Fig. S4A) relative to low- or high-ranking individuals; this pattern held for the rates of innovation conditional on being an innovator (SI Appendix, Fig. S4C). However, within each domain, low-rankers had a slightly higher probability of becoming innovators than mid- or high-ranked individuals (SI Appendix, Fig. S4B). Most of these rank-related domain-specific effects are relatively small and uncertain, and therefore, we hesitate to make strong claims about them.

Discussion

Utility of This Method for Identifying Innovations. On the surface, it would seem that white-faced capuchins have the largest repertoire of innovations of any primate species studied thus far, but we encourage caution in comparing the sizes of different species’ innovation repertoires or their individual innovation rates because of differences in methods between studies. The method that we used here differs from previous approaches in the following ways.

i) Researchers were vigilant for innovations and recorded them throughout the study period, likely resulting in fewer overlooked innovations than in other long-term studies. It also enabled rigorous recording of innovations across a
wider range of behavioral domains than would have been the case had we focused our data collection on a specialized research topic not specifically related to innovation in multiple behavioral domains. Despite efforts to record all possible types of innovations, we failed to record food choice innovations with sufficient rigor to include them in this analysis.

ii) This study generated a high density of behavioral observations compared with most primate field projects because of the year-round presence of a large number of well-trained data collectors. This higher sampling density suggests that, relative to other studies, more true innovations would have been observed and also, perhaps, that fewer false positives would have been generated (i.e., fewer rare species-typical behaviors labeled as innovations simply because they had not been observed in other practitioners because of low sampling density).

iii) The large number of groups monitored in essentially identical ecological circumstances with overlapping home ranges offered better opportunity than most studies for identifying innovations via comparison of presence vs. absence in groups exhibiting similar ecologies. With the exception of five of the innovations concerned with exploration of human artifacts, there is no reason to think that monkeys had differential opportunities to discover particular innovations. Having more groups in the sample might make it easier to find one that lacked some rare behavior, although the long-term nature of the study combined with the dense behavioral sampling probably mitigate this tendency for the large number of groups to produce false positives of innovation caused by presence/absence contrasts.

iv) The long-term nature of the study and the use of a 5-y buffer period before the observation period reduce the chance that behaviors will be falsely termed innovations when they are actually low-frequency behaviors already in the repertoire. If our study had been the length of a typical dissertation project (i.e., 1 y) and if we had assumed that behaviors were new to the practitioners if it was the first time that we had seen them performed in those groups, then we would have had 52% more innovations in our sample than we obtained by using the buffer period method and requiring each recorded innovation to be the first sighting by that individual in its group(s) of residence.

v) Our method is more conservative than the definition used by Ramsey et al. (12), which defines innovations as being new to individual repertoires but not necessarily new to group repertoires. We recognize the possibility of having independent inventions within a single social group, and we suspect that many true innovations in our sample have been discarded because of the suspicion that they may be the products of social learning. On the whole, we think that our method provides a more accurate technique for diagnosing innovations than alternative observational methods. However, overlooking independent inventions within the same social group is one way in which we likely underestimate innovation rates.

vi) Our approach looks at the properties that affect individual propensities to innovate and longitudinally tracks the rate of innovation over 5 y. Previous studies have looked at group-level differences or short windows of time. Our hierarchical statistical approach accounts for unequal sampling effort among individuals, estimates between- and within-individual variation, and avoids the potential problem of falsely making inferences about individual innovation rate from potentially spurious group-level effects.

Rates and Types of Innovation in Capuchin Monkeys. Evidence that individuals, on average, (i) innovate in any one of these four domains less than once per year, (ii) retain only about 20% of these innovations in their repertoires, (iii) transmit no more than 22% of their innovations to other group members, and (iv) generally produce innovations with no obvious utility may give an artificially low impression of the biological importance that innovation has for behavioral repertoires overall. Most groups have a current repertoire of bond-testing signals that have been steadily practiced by a subset of the group for many years, and these behaviors are not, for the most part, scored as innovations in this dataset, because they first appeared in the buffer period or even earlier. Food choice innovations (not reported here because of the interobserver reliability challenge of being able to correctly identify plants the first time that they are eaten) tend to be adopted quickly and remain in repertoires for long periods of time, such as is the case for chimpanzees (42) and various species of monkeys (29). Particularly useful food processing or drinking techniques, once invented, are likely to persist in repertoires for many years [or even centuries (43)]. Tail-dipping to access water in deep tree holes seems to have been independently invented in four groups at Lomas, originating during the 2002–2006 period in three of these groups (and therefore, not counting as an innovation according to our definition) but persisting during the 2007–2011 period in two of those three groups. Although many of the innovative behaviors recorded during this 5-y period seem to be aimless creativity with no obvious utilitarian goal in mind (aside from the foraging behaviors), it is important to remember that innovations, like mutations, may not be particularly beneficial on their own but may become exapted (44) and acquire a benefit when paired with a particular ecological or social context (even if the initial pairing is accidental). For example, a potentially risky behavior, such as sticking a finger deep into the eye socket of a partner, might yield benefits if it is incorporated into a dyadic ritual that tests the quality of an important social bond (27, 45). Additionally, many of the capuchin innovations in this dataset might inform the developing monkey about the affordances of objects or how his/her body relates to the environment, providing useful feedback, even if there is no practical value to permanently incorporating the new behavior into the behavioral repertoire.

How the Capuchin Innovation Repertoire Compares with Those of Chimpanzees and Orangutans. The profound methodological differences between studies of these species preclude precise quantitative comparisons of innovation rates in different behavioral domains, but we can at least make some crude qualitative comparisons between capuchins and the other two primate species for which innovations have been systematically cataloged in the wild. The Mahale chimpanzee researchers (42) present data on 26 novel behaviors (after excluding food choice to make their results more comparable with the other datasets) retrospectively extracted from their 43-y study of two chimpanzee communities. Although studying innovation was not an explicit part of their core data collection protocol, many researchers at Mahale described behavioral variation and novel behaviors in detail. They defined innovation as any behavior not seen in the first 15 y of research. Innovation in orangutans has been explicitly studied using the geographic contrasts method in short-term studies by van Schaik et al. (26) at multiple sites, producing a sample of 44 putative innovations. Comparison of the distribution of behaviors across domains in their datasets with the composition of the repertoire in the dataset of 127 unique innovations from Lomas Barbudal suggests that capuchins, relative to these ape species, devote a higher proportion of their creative energy to investigation of their environment and devising new social behaviors and a lower proportion to comfort-related behaviors and foraging. Orangutans are particularly prone to devising new variants on nest-building techniques, and even their novel acoustic behaviors seem to emerge primarily in a nest-building context. Both ape species are more innovative with regard to bodily comfort and hygiene than capuchins. Capuchins rarely seem to prioritize comfort: they do not build nests, and

7810 | www.pnas.org/cgi/doi/10.1073/pnas.1620739114 Perry et al.


they readily endure stings, bites, urticating hairs, and spiny vegetation in the process of acquiring food. Although nearly 4% of unique innovations may serve to mitigate physical discomfort, this percentage is low compared with both chimpanzees (15%) and orangutans (36%). Capuchins devoted about 42% of their innovation repertoire to the investigative domain compared with chimpanzees (31%) and orangutans (14%). Orangutans are curiously neophobic relative to both chimpanzees and capuchins, and the exploration of the environment that they engage in as young individuals is primarily focused on the mother’s activities rather than independent discovery (13), whereas capuchins engage in much object play and independent exploration of the affordances of the environment at an early age. The preponderance of investigative sorts of innovations in orangutans seems to be devoted to solving the problem of how to safely engage in arboreal locomotion as a large-bodied animal (26, 46), a problem that is less difficult for a small-bodied animal with a prehensile tail. Within the category of social innovations, capuchins were more prone to inventing behaviors related to social bonding, whereas chimpanzees placed more emphasis on aggressive displays. This emphasis is consistent with the importance of alliances and hence, bond tests in capuchins: males need allies for parallel dispersal and to acquire and maintain breeding positions, and females depend on allies to defend food resources and infants from potentially infanticidal males (40).

What Factors Affect Propensity to Innovate in Other Species? In this study, younger animals were more innovative in all behavioral domains except for social interaction, in which the tendency was reversed. This result is consistent with capuchins’ slow life history, generalist ecological niche, and complex social relationships. During a long juvenile phase, individuals have much to gain by experimenting with both new foods and more efficient ways of acquiring resources. As they age, capuchins increasingly have reason to form and test the quality of those social relationships that will prove critical for acquiring and protecting access to the food and social resources needed for enhancing reproductive success (40), and therefore, it makes sense that older animals are more prone to innovation in the social domain.
Studies of other species provide mixed results regarding the effect of age on innovative tendencies. A review of the published primate literature suggests that adults innovate more than immatures (16), and this pattern has been corroborated by experimental studies of innovation in callitrichids (22) and meerkats (20), in which young animals were less likely than adults to successfully solve a novel extractive foraging task, possibly because of insufficient development of dexterity. Chimpanzees are a notable exception to this general pattern; high rates of innovation, particularly of the social and investigative sort, are observed in immature chimpanzees (16, 29). Among human children, older children are better than younger children at solving novel problems (47). It is worth noting that, in this study, we were probably measuring something more akin to creativity and exploration rather than skill at solving a task, and it is possible that the innovations created by older capuchins could be argued to be more sophisticated in some way.
The other variable that had a strong effect on capuchins’ propensity to innovate was sociality: more social capuchins were more prone to inventing novel social interaction types, and more social monkeys were also slightly more likely to have their social innovations picked up by other group members (although we cannot currently say whether this is because of social learning) (SI Appendix, Fig. S5B). These results can probably be explained simply by the fact that more social individuals have more opportunities to experiment with novel forms of social interaction.
The existing literature on innovation does not have much to say about the effects of sociality per se, aside from predicting that technological innovations will be more common in peripheral animals who are less distracted by social life (29). However, the literature does address dominance rank as a predictive variable, and subordinates are often less socially central than dominants. Because there was not a strong rank effect in this dataset, we remain hesitant about making claims regarding rank mediating the relationship between sociality and innovative tendencies.
Although sex and rank (16, 20, 29) are often predictive of innovative tendencies in other species, neither variable had much explanatory power in the capuchin dataset. Capuchin males were slightly more likely to innovate in all domains, which is consistent with observations from other primates (16) and meerkats (20) but not guppies (48); however, these effects were so small and uncertain in the capuchin dataset that they are not likely to have any biological significance.
With a better understanding of innovation, the next logical questions to address are two questions linking this research to the social learning literature. What properties of innovations and what properties of innovators make the innovations most likely to be socially transmitted to other group members? Adequate answers to these questions are beyond the scope of this paper, but preliminary analyses suggest that age and group size of the innovator might be important drivers (SI Appendix, SI Text and Fig. S5).

Methods
Study Site and Subjects. The study was performed at Lomas Barbudal Biological Reserve in Guanacaste, Costa Rica, a tropical dry forest described in the work by Frankie et al. (49). This population of C. capucinus has been the subject of long-term study by Perry et al. (50) since 1990, starting with a single social group. The number of regularly monitored groups has since expanded by both fission of research groups and habituation of new groups to include 10 groups by the end of 2011.
One of the methodological problems in documenting innovations is the difficulty of knowing whether a particular behavior has ever been performed or observed by a particular individual. In a short-term study, when the researchers have little experience with the animals, many rare behaviors will be falsely scored as innovations simply because the researcher has not seen them previously. Ironically, such sampling biases may result in the misimpression that innovation rates are higher in short-term studies than long-term studies. Such problems are mitigated in this study by (i) using data from a long-term study; (ii) generating a high density of observations by using a large staff to collect year-round behavioral data, in which new behaviors were explicitly recorded; (iii) explicitly training observers to watch for and record new behaviors; and (iv) having a single observer with 26 y of experience collecting data on this population (S.E.P.) make the final determinations about which behaviors are truly new for each group of monkeys, with input from two additional long-term researchers (B.J.B. and I.G.).
Beginning in January of 2002, all research staff were directed to make freeform comments about any behaviors that did not neatly fit into the standard ethogram of species-specific behavior and explicitly mark comment lines as comment innovation when they thought they were seeing a behavior that they had never seen before in that group or a behavior that they thought was a unique behavioral tradition. Naturally, many behaviors seem new to relatively inexperienced observers, and therefore, not all of the behaviors initially coded as innovations were true innovations. Also, some behaviors not coded as innovations in the comments section were, in fact, true innovations. S.E.P., who personally collected 13,770 h of data on this population from 1990 to 2016, read through all data to determine whether behavioral sequences were likely to be true innovations (i.e., behaviors seen for the first time in that particular social group).
This research was performed in compliance with the laws of Costa Rica, and the protocol was approved by the University of California, Los Angeles Animal Care Committee (ARC 1996-122 and 2005-084 plus various renewals).

Buffer Period. We used a 5-y chunk of observational data (35,196 h collected between January 1, 2007 and December 31, 2011) to look for innovations. We used the 5 preceding years of data (∼37,514 h of observation collected during 2002–2006) as a “buffer period” (i.e., a period in which we could search for prior instances of behaviors that appeared to be innovations within the targeted 2007–2011 time period). If we had not left a large buffer period, we would have falsely concluded that far more behaviors within the time period of interest were innovations. SI Appendix, Table S1 reports, for each innovation, the number of groups in which it occurred, its persistence in the innovator’s repertoire, and whether it spread to other group members.
Data were collected by a team of 50 highly trained observers (∼12 per year), each of whom underwent a training period of approximately 3 mo of

Perry et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7811
dawn to dusk instruction and interobserver reliability testing before contributing data to the database. Interobserver reliability tests for monkey identifications and coding were repeated monthly throughout their tenures at the field site. Obviously, we could not train observers to reliably code innovations (behaviors that had not yet happened) in the same ways. We could, however, ensure that the observers recognized and reliably recorded basic motor movements, gestures, and 205 behaviors considered to be part of the species-typical behavioral repertoire. This repertoire was based on >6,000 h of observation invested in studying these monkeys during 1990–2001 before explicit recording of innovations began, thereby enabling better detection of the idiosyncratic behaviors.
Although we gave instructions to data collection teams to record any type of innovation in any behavioral domain, we decided not to include innovations regarding (i) choices of food or medicinal plants and (ii) vocal behavior in this analysis. Although we witnessed many instances of apparent innovation regarding use of novel foods and medicines, we lacked sufficient confidence in observers’ ability to accurately identify rarely used plants and insects or accurately identify rarely produced vocalizations.
A behavioral observation was scored as an innovation if it met the following criteria: (i) the behavior was absent in some of the groups that we regularly monitor, (ii) it was the first time that this behavior had been seen in this group during 2007–2011, and (iii) this behavior had not been seen in this group during the buffer period preceding 2007 during the lifetime of the putative innovator. In cases of group fission or migration, we required that the behavior not have been seen previously in the other groups of which the potential innovator had been a member. There was one class of behaviors on which an additional criterion was imposed: postural habits, such as clutching of one’s own body parts or sniffing one’s own hand (SI Appendix).
The number of innovations was also counted via two less conservative methods for methodological comparison. In one version, we eliminated the buffer period criterion, calling the first observation of a behavior in a particular group an innovation. This method yielded 263 innovations compared with 187 produced with the more conservative method described above. In an even less stringent version that yielded 282 innovations, we termed a behavior an innovation if it had not been seen in that group in the past year.
Another challenge in defining innovation is the “grain” problem (i.e., determining the descriptive breadth of behavioral categories or in other words, the extent to which to lump vs. split behaviors) (51). The grain problem is insoluble; the best that we can do is to use our intuitions about what the animals themselves seem to consider novel and observations of how these behaviors cluster in the repertoires of groups and individuals. Are novel actions used as part of a task already in the behavioral repertoire novelties? In our view, they usually were. Are the same actions applied to different objects? Unless the objects and contexts were quite radically different, we opted to lump these together (i.e., we did not designate them as innovations). This issue was most prominent in our decision-making when evaluating clutching of different body parts (a self-directed behavior), sucking of different body parts (a social behavior), and the “toy game,” which involves passing an object from mouth to mouth. In all of these cases, we chose to lump rather than split behaviors. In the case of object play (e.g., with sand, water, or rocks), different actions used by different individuals in interacting with these same substances were scored as different behaviors. Descriptions of the complete list of behaviors that were included in our analysis are in SI Appendix.
Innovations were classified in four categories or behavioral domains, the content of which is described in greater detail in Results: foraging, self-directed behavior, social behaviors, and investigative behaviors (exploration of the environment).

Data Structure. We created a different row for every individual monkey/year/behavioral domain combination, and a value was scored for the number of innovations observed for that combination; this number was the output variable. We measured four main predictor variables to determine what predicts the number of innovations per individual per year: age, sex, sociality, and rank. (i) Age. To calculate age, we subtracted the birth year of an individual from the year of observation and added one. Individuals born in the same year as the year of observation were excluded from the dataset. Age was log-transformed and centered for analysis. (ii) Sex. We coded sex as a dummy variable (one for males and zero for females). (iii) Sociality. Sociality was calculated using data from group scans, which were taken opportunistically for all group members, at intervals no closer together than 10 min. For each individual per calendar year, we calculated the proportion of group scans in which the individual was in proximity (i.e., within ∼400 cm) to at least one other group member other than their dependent offspring (i.e., for females, their offspring that are less than 1 y of age). Individuals with fewer than 20 scans per year were excluded from analysis. Sociality scores were standardized, so that a sociality score of one is 1 SD above the centered mean of zero. (iv) Rank. For individuals older than 3 y, rank was calculated using the EloRating package in R (52). We used outcomes from dyadic interactions involving avoids, cowers, flees, and supplants to determine dominance. We used default Elo parameters, with initial Elo scores set to 1,000, and the constant k set to 100. Rank was estimated using average Elo scores calculated for each individual per calendar year. Young juveniles less than 3 y old were given Elo scores of zero to assure a low rank within their social groups. Members of a group were divided up into tertiles, with the corresponding levels receiving rank categories of high, middle, and low. High and low ranks were estimated as dummy variables, with middle rank serving as the intercept-only reference category.

Statistical Methods. Our outcome variable, number of innovations, is a count variable with many zero values. Each monkey was observed over multiple years. Membership in a particular group may have affected propensities to innovate. Therefore, we analyzed these data using a series of hierarchical ZIP models. ZIP models are mixture models that use two probability distributions. One component assumes a Bernoulli distribution and estimates p: the probability of observing a zero. The other component assumes a Poisson distribution and estimates λ: the estimated mean of a Poisson distribution. ZIP models permit a mixture of causal factors to be evaluated and help better predict outcomes when there is a large number of zeros because of both the rarity of an event and false negatives. The joint likelihood of observing an innovation can be calculated by multiplying the likelihoods of the Bernoulli and Poisson outcomes and converting them to the real number scale using their corresponding link functions. We graphically present joint posterior predictions here (Figs. 2 and 3) along with model predictions of p and λ in SI Appendix (SI Appendix, Figs. S1–S4).
We looked at four predictors in this analysis: (i) age, (ii) sex, (iii) sociality, and (iv) rank. We also estimated unique offsets for each individual, because they differed in observation time or exposure. We analyzed six models, four of which corresponded to one of the single aforementioned predictors. The other two included a global model that looked at all four predictors and an intercepts-only model. In each model, we used varying intercepts for each individual (n = 234), social group (n = 10), and behavioral domain (n = 4). Varying slopes for domains and groups were estimated for all four predictors, and varying slopes for individuals were estimated for sociality and age. Group size was used as a covariate to control for the numerical likelihood that, in smaller groups, observed behaviors might be more likely to be scored as innovations because of our definition of uniqueness and that a greater number of innovations is more likely in larger groups. In another analysis, we used experimental year as a varying effect to see if there were any biases in data collection between field assistant cohorts that would change our inference. There were none, and therefore, we excluded these parameters from our final analyses to simplify the presentation of results.

Offsets and Exposure. Because observations of innovations were collected not only in focal follows but also, ad libitum, and because individuals varied in their likelihood of being observed because of data collection protocols or their visibility in the group, they also differed in exposure. To account for differences in exposure, we calculated an annual offset for each individual in each calendar year. Offsets for each individual ($O_i$) were estimated using

$$O_i = \log\left(\frac{G_i + F_i}{365}\right),$$

where $G_i$ is the number of instantaneous group scans calculated per calendar year for each individual $i$, and $F_i$ is the number of point samples collected at 2.5-min intervals during focal follows in a calendar year. These offsets were included alongside linear predictors in each model.
Models were fit using the map2stan function in the R package rethinking (53). Models were fit using Hamiltonian Markov chain Monte Carlo in r-STAN (v 2.14) (54) in R v. 3.3.2 (55). Models were compared with widely applicable information criteria (WAIC) using the compare function in rethinking. The corresponding code and data used for each model and graph production can be found through a link in SI Appendix.
To estimate group-level difference in annual innovation rate, we summed the number of innovations observed within each group and across all individuals and behavioral domains. Exposure rates for individuals within groups were summed within years. Counts of annual innovations per group were then fit with a hierarchical Poisson model in r-STAN that accounted for exposure rates using metrics previously described and varying intercepts for

each group, thus providing an estimate of annual innovation rate per year per group.

ACKNOWLEDGMENTS. The following field assistants contributed a year or more of data to the Lomas Barbudal Monkey Project dataset: L. Beaudrot, M. Bergstrom, R. Berl, A. Bjorkman, L. Blankenship, T. Borcuch, J. Broesch, J. Butler, F. Campos, C. Carlson, S. Caro, M. Corrales, N. Donati, C. Dillis, G. Dower, R. Dower, K. Feilen, K. Fisher, A. Fuentes J., M. Fuentes A., C. Gault, H. Gilkenson, I. Gottlieb, L. Hack, S. Herbert, C. Hirsch, C. Holman, S. Hyde, L. Johnson, S. Lee, S. Leinwand, T. Lord, K. Kajokaite, M. Kay, E. Kennedy, D. Kerhoas-Essens, E. Johnson, S. Kessler, S. MacCarter, J. Manson, W. Meno, C. Mitchell, Y. Namba, A. Neyer, C. O’Connell, J. C. Ordoñez, J., N. Parker, B. Pav, R. Popa, K. Potter, K. Ratliff, H. Ruffler, S. Sanford, M. Saul, I. Schamberg, C. Schmitt, A. Scott, J. Verge, A. Walker-Bolton, E. Wikberg, and E. Williams. We thank H. Gilkenson, W. Lammers, C. Dillis, M. Corrales, and R. Popa for helping to manage the field site. E. Wikberg and K. Kajokaite contributed a year or more of effort to organizing the dataset. D. Cohen created the MySQL database. R. Hong assisted in the organization and coding of the innovation data. This paper has benefitted from helpful discussions with D. Caillaud, M. Crofoot, M. Grote, K. Kajokaite, R. McElreath, J. Manson, K. Perry, B. Scelza, and M. J. West-Eberhard. We thank Marcus W. Feldman, Andrew Whiten, Kevin N. Laland, and Francisco J. Ayala for the opportunity to participate in this Sackler Symposium. We thank the Costa Rican park service (Sistema Nacional de Áreas de Conservación and Área de Conservación Tempisque), Hacienda Pelon de la Bajura, Hacienda Brin D’Amor, and the residents of San Ramon de Bagaces for permission to work on their land. This project is based on work supported by the funding provided to S.E.P. by the Max Planck Institute for Evolutionary Anthropology, National Science Foundation (NSF) Grants SBR-0613226 and BCS-0848360, two grants from the Leakey Foundation, two grants from the National Geographic Society, and multiple Committee on Research grants from the University of California, Los Angeles (UCLA). B.J.B. was supported by NSF Graduate Research Fellowship (GRF) Grant 1650042 and two Achievement Rewards for College Scientists Foundation, Northern California Chapter fellowships. I.G. was supported by the International Society for Human Ethology, a Ford Foundation fellowship, an NSF GRF, and multiple grants from UCLA. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF or other funding agencies.
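
As a rough illustration of the rank procedure described in Methods (initial Elo scores of 1,000, k = 100, and tertile rank categories), the sketch below applies the textbook logistic Elo update in Python. The ratings in the study were computed with the EloRating package in R, whose internal win-probability calculation may differ, so the function names (`elo_update`, `rate_interactions`, `rank_categories`), the 400-point scale, the handling of group sizes not divisible by three, and the toy interactions are all assumptions made for illustration. The sketch also returns final scores rather than the per-calendar-year averages used in the paper.

```python
from typing import Dict, Iterable, Tuple


def elo_update(winner: float, loser: float, k: float = 100.0) -> Tuple[float, float]:
    """Textbook logistic Elo update (assumed form, not the EloRating internals)."""
    # Probability that the eventual winner was expected to win.
    expected_win = 1.0 / (1.0 + 10 ** ((loser - winner) / 400.0))
    delta = k * (1.0 - expected_win)
    return winner + delta, loser - delta


def rate_interactions(
    interactions: Iterable[Tuple[str, str]],
    start: float = 1000.0,
    k: float = 100.0,
) -> Dict[str, float]:
    """Process (winner, loser) pairs in order; unseen individuals start at `start`."""
    scores: Dict[str, float] = {}
    for winner, loser in interactions:
        w = scores.get(winner, start)
        l = scores.get(loser, start)
        scores[winner], scores[loser] = elo_update(w, l, k)
    return scores


def rank_categories(mean_scores: Dict[str, float]) -> Dict[str, str]:
    """Split group members into tertiles by score: high, middle, low."""
    ordered = sorted(mean_scores, key=mean_scores.get, reverse=True)
    n = len(ordered)
    labels = ("high", "middle", "low")
    # Position-based tertiles; ties are broken by sort order (an assumption).
    return {name: labels[pos * 3 // n] for pos, name in enumerate(ordered)}


if __name__ == "__main__":
    # Hypothetical supplant/avoid outcomes written as (winner, loser).
    toy = [("A", "B"), ("A", "C"), ("B", "C")]
    scores = rate_interactions(toy)
    print(scores)
    print(rank_categories(scores))
```

Because the update is zero-sum, the total of all scores stays constant, which is a quick sanity check when adapting the sketch to real interaction logs.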

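The exposure offset and the zero-inflated Poisson (ZIP) mixture described in Methods can be sketched numerically as follows. This is a minimal Python illustration, not the model code used in the study (which was fit with map2stan in R). The function names, the use of a log link so that the offset adds to the linear predictor, and all example numbers are assumptions.

```python
import math


def exposure_offset(group_scans: int, focal_point_samples: int) -> float:
    """Annual offset O_i = log((G_i + F_i) / 365) for one individual-year."""
    return math.log((group_scans + focal_point_samples) / 365)


def zip_pmf(k: int, p_zero: float, lam: float) -> float:
    """Probability of observing k innovations under a ZIP mixture.

    p_zero is the Bernoulli probability of a structural zero;
    lam is the mean of the Poisson component.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero can come from either component of the mixture.
        return p_zero + (1.0 - p_zero) * poisson
    return (1.0 - p_zero) * poisson


def expected_count(p_zero: float, log_rate: float, offset: float) -> float:
    """Joint expectation (1 - p) * lambda, with log(lambda) = offset + log_rate.

    The log link is an assumption of this sketch.
    """
    lam = math.exp(offset + log_rate)
    return (1.0 - p_zero) * lam


if __name__ == "__main__":
    # Hypothetical individual-year: 300 group scans and 65 focal point samples.
    o_i = exposure_offset(300, 65)
    print("offset:", o_i)
    print("P(0 innovations):", zip_pmf(0, p_zero=0.3, lam=0.5))
    print("expected count:", expected_count(0.3, math.log(0.5), o_i))
```

An individual with more scans and focal samples has a larger offset and therefore a larger expected count at the same underlying rate, which is how unequal sampling effort is kept from being read as a difference in innovativeness.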
1. Imanishi K (1952) Man (Mainichi-Shinbunsha, Tokyo).
2. Kummer H (1971) Primate Societies: Group Techniques of Ecological Adaptation (AHM Publ Corp, Arlington Heights, IL).
3. West-Eberhard MJ (2003) Developmental Plasticity and Evolution (Oxford Univ Press, Oxford).
4. Giraldeau L-A, Caraco Y, Valone T (1994) Social foraging: Individual learning and cultural transmission of innovations. Behav Ecol 5:35–43.
5. Sol D (2003) Behavioural flexibility: A neglected issue in the ecological and evolutionary literature? Animal Innovation, eds Reader SM, Laland KN (Oxford Univ Press, Oxford), pp 62–82.
6. Reader SM, Morand-Ferron J, Flynn E (2016) Animal and human innovation: Novel problems and novel solutions. Philos Trans R Soc Lond B Biol Sci 371:371.
7. Wyles JS, Kundel JG, Wilson AC (1983) Birds, behavior, and anatomical evolution. Proc Natl Acad Sci USA 80:4394–4397.
8. Keagy J, Savard J-F, Borgia G (2009) Male satin bowerbird problem-solving ability predicts mating success. Anim Behav 78:809–817.
9. Cauchard L, Boogert NJ, Lefebvre L, Dubois F, Doligez B (2013) Problem-solving performance is correlated with reproductive success in a wild bird population. Anim Behav 85:19–26.
10. Whiten A, Ayala FJ, Feldman MW, Laland KN (2017) The extension of biology through culture. Proc Natl Acad Sci USA 114:7775–7781.
11. Reader SM, Laland KN (2003) Animal innovation: An introduction. Animal Innovation, eds Reader SM, Laland KN (Oxford Univ Press, Oxford), pp 3–35.
12. Ramsey G, Bastian ML, van Schaik C (2007) Animal innovation defined and operationalized. Behav Brain Sci 30:393–407.
13. van Schaik CP, et al. (2016) The reluctant innovator: Orangutans and the phylogeny of creativity. Philos Trans R Soc Lond B Biol Sci 371:371.
14. Fogarty L, Creanza N, Feldman MW (2015) Cultural evolutionary perspectives on creativity and human innovation. Trends Ecol Evol 30:736–754.
15. Perry S (1996) Intergroup encounters in wild white-faced capuchins, Cebus capucinus. Int J Primatol 17:309–330.
16. Reader SM, Laland KN (2001) Primate innovation: Sex, age, and social rank differences. Int J Primatol 22:787–805.
17. Lefebvre L, et al. (1998) Feeding innovations and forebrain size in Australasian birds. Behaviour 135:1077–1097.
18. Lefebvre L, Whittle P, Lascaris E, Finkelstein A (1997) Feeding innovations and forebrain size in birds. Anim Behav 53:549–560.
19. Boogert NJ, Reader SM, Hoppitt W, Laland KN (2008) The origin and spread of innovations in starlings. Anim Behav 75:1509–1518.
20. Thornton A, Samson J (2012) Innovative problem solving in wild meerkats. Anim Behav 83:1459–1468.
21. Aplin LM, et al. (2015) Experimentally induced innovations lead to persistent culture via conformity in wild birds. Nature 518:538–541.
22. Kendal RL, Coe RL, Laland KN (2005) Age differences in neophilia, exploration, and innovation in family groups of callitrichid monkeys. Am J Primatol 66:167–188.
23. Russon AE (2003) Innovation and creativity in forest-living rehabilitant orangutans. Animal Innovation, eds Reader SM, Laland KN (Oxford Univ Press, Oxford), pp 279–306.
24. Whiten A, et al. (1999) Cultures in chimpanzees. Nature 399:682–685.
25. van Schaik CP, et al. (2003) Orangutan cultures and the evolution of material culture. Science 299:102–105.
26. van Schaik CP, van Noordwijk MA, Wich SA (2006) Innovation in wild Bornean orangutans (Pongo pygmaeus wurmbii). Behaviour 143:839–876.
27. Perry S, et al. (2003) Social conventions in wild white-faced capuchin monkeys: Evidence for traditions in a neotropical primate. Curr Anthropol 44:241–268.
28. Koolhaas JM, et al. (1999) Coping styles in animals: Current status in behavior and stress-physiology. Neurosci Biobehav Rev 23:925–935.
29. Kummer H, Goodall J (1985) Conditions of innovative behaviour in primates. Philos Trans R Soc Lond B Biol Sci 308:203–214.
30. Overington S, Morand-Ferron J, Boogert NJ, Lefebvre L (2009) Technical innovations drive the relationship between innovativeness and residual brain size in birds. Anim Behav 78:1001–1010.
31. Navarrete AF, Reader SM, Street SE, Whalen A, Laland KN (2016) The coevolution of innovation and technical intelligence in primates. Philos Trans R Soc Lond B Biol Sci 371:371.
32. Stephan H, Bauchot R, Andy OJ (1970) Data on size of the brain and various brain parts in insectivores and primates. The Primate Brain, eds Noback CR, Montagna W (Appleton-Century-Crofts, New York), pp 289–297.
33. Perry S, Ordoñez Jiménez JC (2006) The Effects of Food Size, Rarity, and Processing Complexity on White-Faced Capuchins’ Visual Attention to Foraging Conspecifics, eds Hohmann G, Robbins M, Boesch C (Cambridge Univ Press, Cambridge, UK), pp 203–234.
34. Fragaszy DM, Visalberghi E, Fedigan LM (2004) The Complete Capuchin: The Biology of the Genus Cebus (Cambridge Univ Press, Cambridge, UK).
35. Day RL, Coe RL, Kendal JR, Laland KN (2003) Neophilia, innovation and social learning: A study of intergeneric differences in callitrichid monkeys. Anim Behav 65:559–571.
36. Lynch Alfaro JW, et al. (2012) Explosive Pleistocene range expansion leads to widespread Amazonian sympatry between robust and gracile capuchin monkeys. J Biogeogr 39:272–288.
37. Rose L, et al. (2003) Interspecific interactions between white-faced capuchins (Cebus capucinus) and other species: Preliminary data from three Costa Rican sites. Int J Primatol 24:759–796.
38. Perry S, Manson JH (2008) Manipulative Monkeys: The Capuchins of Lomas Barbudal (Harvard Univ Press, Cambridge, MA).
39. Sol D, Sayol F, Ducatez S, Lefebvre L (2016) The life-history basis of behavioural innovations. Philos Trans R Soc Lond B Biol Sci 371:371.
40. Perry S (2012) The behavior of wild white-faced capuchins: Demography, life history, social relationships, and communication. Adv Study Behav 44:135–181.
41. Perry S (2011) Social traditions and social learning in capuchin monkeys (Cebus). Philos Trans R Soc Lond B Biol Sci 366:988–996.
42. Nishida T, Matsusaka T, McGrew WC (2009) Emergence, propagation or disappearance of novel behavioral patterns in the habituated chimpanzees of Mahale: A review. Primates 50:23–36.
43. Haslam M, et al. (2016) Pre-Columbian monkey tools. Curr Biol 26:R521–R522.
44. Gould SJ, Vrba ES (1982) Exaptation - a missing term in the science of form. Paleobiology 8:4–15.
45. Zahavi A (1977) The testing of a bond. Anim Behav 25:246–247.
46. Russon AE, Kuncoro P, Ferisa A (2015) Tools for the trees: Orangutan arboreal tool use and creativity. Animal Creativity and Innovation, eds Kaufman AB, Kaufman JC (Elsevier, San Diego), pp 419–455.
47. Beck SR, Williams C, Cutting N, Apperly IA, Chappell J (2016) Individual differences in children’s innovative problem-solving are not predicted by divergent thinking or executive functions. Philos Trans R Soc Lond B Biol Sci 371:371.
48. Laland KN, van Bergen Y (2003) Experimental studies of innovation in the guppy. Animal Innovation, eds Reader SM, Laland KN (Oxford Univ Press, New York), pp 155–173.
49. Frankie GW, Vinston SB, Newstrom LE, Barthell JF (1988) Nest site and habitat preferences of Centris bees in the Costa Rican dry forest. Biotropica 20:301–310.
50. Perry S, Godoy I, Lammers W (2012) The Lomas Barbudal Monkey Project: Two decades of research on Cebus capucinus. Long-Term Field Studies of Primates, eds Kappeler P, Watts D (Springer, New York), pp 141–165.
51. Russon A, Andrews K, Huss B (2007) Innovation and the grain problem. Behav Brain Sci 30:423–433.
52. Neumann C, Kulik L (2014) EloRating: Animal Dominance Hierarchies by Elo-Rating. R Package, Version 0.43. Available at https://cran.r-project.org/package=EloRating. Accessed January 5, 2017.
53. McElreath R (2016) Statistical Rethinking: A Bayesian Course with Examples in R and Stan (Chapman and Hall/CRC, New York).
54. Stan Development Team (2016) RStan: The R Interface to Stan. R package version 2.14.1. (R Foundation for Statistical Computing, Vienna).
55. R Development Core Team (2013) R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna).

Gene–culture coevolution in whales and dolphins
Hal Whitehead^a,1
^a Department of Biology, Dalhousie University, Halifax, NS, Canada B3H 4R2

Edited by Marcus W. Feldman, Stanford University, Stanford, CA, and approved May 1, 2017 (received for review January 14, 2017)

Whales and dolphins (Cetacea) have excellent social learning skills as well as a long and strong mother–calf bond. These features produce stable cultures, and, in some species, sympatric groups with different cultures. There is evidence and speculation that this cultural transmission of behavior has affected gene distributions. Culture seems to have driven killer whales into distinct ecotypes, which may be incipient species or subspecies. There are ecotype-specific signals of selection in functional genes that correspond to cultural foraging behavior and habitat use by the different ecotypes. The five species of whale with matrilineal social systems have remarkably low diversity of mtDNA. Cultural hitchhiking, the transmission of functionally neutral genes in parallel with selective cultural traits, is a plausible hypothesis for this low diversity, especially in sperm whales. In killer whales the ecotype divisions, together with founding bottlenecks, selection, and cultural hitchhiking, likely explain the low mtDNA diversity. Several cetacean species show habitat-specific distributions of mtDNA haplotypes, probably the result of mother–offspring cultural transmission of migration routes or destinations. In bottlenose dolphins, remarkable small-scale differences in haplotype distribution result from maternal cultural transmission of foraging methods, and large-scale redistributions of sperm whale cultural clans in the Pacific have likely changed mitochondrial genetic geography. With the acceleration of genomics new results should come fast, but understanding gene–culture coevolution will be hampered by the measured pace of research on the socio-cultural side of cetacean biology.

gene–culture coevolution | Cetacea | cultural hitchhiking

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: H.W. designed research, performed research, analyzed data, and wrote the paper.
The author declares no conflict of interest.
This article is a PNAS Direct Submission.
^1 Email: hwhitehe@dal.ca.

Evolution is dependent on inheritance. Although the transmission of genes is primary, it is not the only mode of inheritance. As recently articulated in the extended evolutionary synthesis, inheritance extends beyond genes to include epigenetic inheritance, physiological inheritance, ecological inheritance, social transmission, and cultural inheritance, all of which can lead to heritability of phenotype (1). From the perspective of modern humans, cultural inheritance, which directly determines much of our own behavior as well as indirectly determining how we affect the global environment, has particular salience (2).

Culture, as an inheritance system, can be defined as behavior or information shared within a community that is acquired from conspecifics through some form of social learning (3), i.e., learning that is influenced by observation of, or interaction with, another animal or its products (4). Social learning comes in a range of forms including imitation, emulation, teaching, and local enhancement (5), all of which can promote behavioral similarity between learner and model. Culture may include a wide range of behavior, including foraging methods, vocalizations, diet selection, social behavior, movement, habitat use, social structure, and play (e.g., refs. 3 and 6); potentially higher-level attributes of culture, e.g., conservative cultures, exploratory cultures, or pacific cultures (refs. 3, 7, and 8), also can affect a range of behavior. Cultures may be stable over many generations or transitory fads. The learning is often from parent to offspring in the former and from peer to peer in the latter (9).

The different modes of inheritance do not act wholly independently. Phenotypic assimilation by one mode of inheritance may influence the transmission of traits by another mode. Genes produce the basic organismal phenotype, and this genetic phenotype places major constraints on the other modes of inheritance. Circumstances in which other inheritance mechanisms control the inheritance of genes are much less obvious but can have great evolutionary significance (10). Thus, there has been particular interest in gene–culture coevolution (11–13).

Theoreticians started to model potential interactions between genes and culture in the 1970s (14), and there has been much empirical and theoretical work since then (11, 13, 15, 16). However, gene–culture coevolution has rarely been formally defined. For the perspective of this article, I consider cases in which culture, i.e., group-specific social learning, affects the distribution of DNA found in a population. That is, we can reasonably suppose that the distribution of genes in a population would be different if individuals were not transmitting cultural information through social learning. Gene–culture coevolution includes very specific selective processes in which a particular cultural practice affects the evolution of a particular gene or genes. The most famous example is the coevolution of dairy farming and the lactase gene, allowing adult humans in dairy-farming cultures to digest milk products (17). However, gene–culture coevolution can also include general processes. If culture is driving significant parts of a species’ behavior, then this influence may constrain genetic evolution. For instance if cultural processes isolate groups and give them distinctive behavior, then this isolation may initiate or promote the course of speciation or reduce the diversity of genes that are transmitted in parallel with the cultural traits (18–20). Under this concept of gene–culture coevolution there will be circumstances in which the available information may not be definitive. Scenarios including or excluding culture may both be consistent with results.

Gene–culture coevolution has been considered almost entirely from the perspective of Homo sapiens, and the great majority of research has been aimed at the hypothesis that there are specific genes whose distribution has been affected by human cultural practices (11, 13, 15). Two of the best-documented cases are the dairy farming–lactase relationship mentioned above and a scenario relating human activities, such as clear-cutting of forests, with malaria incidence, and sickle cell anemia. Different cultural practices in different parts of the world that increase susceptibility to malaria are correlated with higher allele frequencies of the HbS variant of normal adult hemoglobin that in heterozygotes confers protection against malaria infection but in homozygotes results in sickle cell anemia (21, 22).

Population genomics has been able to highlight areas of DNA subject to selection, give an approximate date for selective sweeps, and provide links to functionality. The recent expansion of the genomic enterprise has revealed a great sway of recent selection on the genomes of modern humans, particularly in

7814–7821 | PNAS | July 25, 2017 | vol. 114 | no. 30 www.pnas.org/cgi/doi/10.1073/pnas.1620736114


populations derived following migration out of Africa (23). It has been hypothesized that an accelerated rate of adaptive substitutions over the last 40,000 y is linked to cultural innovations and consequent demographic group selection (24). These findings led authors to propose gene–culture coevolution as the dominant form of evolution in recent human history, because of the speed at which genetic change has taken place and correlation with the occupation of new niches and the establishment of new cultures (10, 13). However, a temporal correlation, and even a plausible scenario linking a particular selective gene to a cultural trait, does not demonstrate culture as causal to genetic change. Demographic history can be a vital factor in driving the genetic variation of contemporary populations. For instance, population bottlenecks can cause founder effects in both genes and culture, leading to correlations between specific genes and specific cultures that are not the result of gene–culture coevolution. There are, in fact, relatively few cases in which human culture has been irrefutably shown to have caused genetic change (13). That is not to say that gene–culture coevolution is rare, and indeed the overall temporal correlation between the recent explosion in the scope of human culture and an extremely accelerated rate of selection on genes seems to point toward a major effect. However, in most instances it is very difficult to provide a compelling empirical case that the interaction between culture and genetics is causal.

Even harder to substantiate are the proposals for more general effects of culture on patterns of human genetics. For instance, models have shown that the remarkably low levels of human genetic diversity could result from culturally mediated population structure (25) or from selectively important cultures being transmitted in parallel with neutral genes (26). However, there is no strong empirical evidence for either of these processes.

Although culture is clearly present in other species, it is dwarfed in almost all respects by its impact on modern humans (27–29). Thus, given the difficulties of demonstrating gene–culture coevolution in by far the most cultural species, there has been little effort to look elsewhere. However, gene–culture coevolution can theoretically operate in simple cultural systems (14), and there may be attributes of nonhuman species that could accentuate the effects of culture on genes or make them more discernible.

Birds offer one candidate taxon for nonhuman gene–culture coevolution. In oscine species, birdsong is largely a socially learned population phenomenon and thus represents culture (30). For instance, populations of the sharp-beaked Galápagos ground finch (Geospiza difficilis) have socially learned songs with an important role in mating but respond fully only to songs from their own island population (31). Thus, Grant and Grant suggest that cultural transmission is driving speciation (31). However, some other studies have found little evidence of links between cultural and genetic evolution in birds (e.g., ref. 32).

Of all nonhuman species, culture seems particularly prevalent and significant among Cetacea (3). The cetaceans include about 89 species of whales and dolphins, ranging in size from 1 to 30 m. They use a range of habitat from large rivers to the deep ocean, in polar, temperate, and tropical waters. They eat a wide range of marine animals, from copepods to large whales. Cetaceans are divided into the mysticetes or baleen whales that filter feed on schools of prey using baleen, and the odontocetes that use echolocation to locate and track single prey.

Although cetaceans are not easy to study, there is good evidence for cultural transmission in song, migrations, foraging behavior, social conventions, cooperative associations with humans, and play (3). Almost all cetacean species whose behavior has been much studied show possibly, or likely, culturally acquired behavior (3). For gene–culture coevolution, culture needs to be quite stable (fads need not apply) and to affect fitness directly or indirectly. When social groups with distinct cultures live sympatrically and these groups have differential reproductive success, then gene–culture coevolution can be driven powerfully by group selection (3, 27). Stable, sympatric social groups with distinctive cultures are the hallmark of two large odontocete species with matrilineal social systems, the killer whale (Orcinus orca) and the sperm whale (Physeter macrocephalus) (3). In these species, females, typically, and males, sometimes, spend their lives grouped with their mothers while both are alive, forming stable social units of about 10 animals. The social units are themselves elements of higher-level matrilineally based social tiers such as clans, communities, and ecotypes. Distinctive behavior, which is believed to be generally culturally transmitted, maps onto the different elements of the social tiers (3). This socio-cultural relationship sets up conditions in which gene–culture coevolution could lead to neutral or functional genes being found in different elements of the social system or to more general effects such as speciation or reductions in genetic diversity.

Here I examine four general hypotheses for gene–culture coevolution in whales and dolphins: (i) that culture has led to incipient speciation of killer whale ecotypes; (ii) that culture has driven the evolution of functional genes in killer whales; (iii) that low levels of mtDNA diversity in five species of matrilineal odontocete result from cultural hitchhiking; and (iv) that geographical patterns of mtDNA result from cultural behavior. As with gene–culture coevolution in humans, evidence is not always conclusive, and I will consider alternative mechanisms for the observed patterns.

Ecotype Radiations in Killer Whales?

Although killer whales as a species have an extremely diverse diet, ranging from herring to the largest whales, each killer whale is typically a member of an ecotype, and most ecotypes are extremely specialized in what they eat and often in how they catch it. Ecotypes are distinctive in a range of other behaviors, as well as morphology (20). In each of the North Pacific, North Atlantic, and Antarctic two or more ecotypes, each with a few hundred to a few thousand animals, are sympatric but socially isolated (20). Best known are the three North Pacific ecotypes: a salmon-eating specialist ecotype known as “residents,” a mammal-eating specialist or “transient” ecotype, and an “offshore” ecotype, which feeds upon sharks (20). All three ecotypes cluster in distinct mitochondrial and nuclear genome clades (33, 34) that diverged a few hundred thousand years ago (35). We have less complete information for the Antarctic, but there appear to be five ecotypes respectively specializing on minke whales (Balaenoptera acutorostrata) (ecotype A), seals (ecotype B1), penguins (ecotype B2), and fish (ecotype C and perhaps ecotype D) (20). These Antarctic ecotypes seem to have diverged more recently than those in the North Pacific, and ecotypes B1, B2, and C, at least, are part of a distinct and relatively shallow Antarctic clade (34, 35). The distinctions between the ecotypes have led Morin et al. to suggest that they should be considered separate species or subspecies (36), but this proposal has not been formally implemented.

Since the initial discovery of killer whale ecotypes, scientists have speculated about how they may have arisen. Although some issues are disputed, a consensus hypothesis about the fundamentals of ecotype formation has grown. This culturally driven ecological speciation scenario has been articulated most clearly by Riesch et al. (20), with support from more recent genomic studies (35). Ecological speciation needs (i) an ecological source of divergent selection; (ii) a form of reproductive isolation; and (iii) a mechanism linking divergent selection to reproductive isolation (37). The killer whale scenario is especially interesting because culture seems to play a key role in all three components.

Within the best-studied killer whale ecotypes there is evidence that different matrilineal social units can have distinctive ways of



life. For instance, the AT1 transient matriline of Prince William Sound, Alaska has particular ways of capturing fast-swimming Dall’s porpoises, Phocoenoides dalli (38). Such foraging methods are almost certainly socially learned, primarily within matrilineal units. As these foraging methodologies become more efficient through cultural selection, they will tend to become more specialized, so that matrilines, or groups of socially connected matrilines, develop new ways of life. These matrilines are thought to be the progenitors of the ecotypes (20). In concordance with this scenario, new genomic studies indicate that ecotypes started with small populations of related animals (35).

The second and third conditions of ecological speciation, the emergence of reproductive isolation between ecotypes, are key to the population structure of killer whales. Riesch et al. consider four mechanisms through which cultural divergence could lead to reproductive isolation among ecotypes (20). Most directly, cultural xenophobia could restrict mating patterns. Vocal dialects, which are socially learned and thus culture, are good candidates (20). These dialects appear to set barriers to mating with close associates, and, in accordance with a paradigm of mating only with animals whose vocalizations are similar but not too similar, also might prevent mating between proto-ecotypes (20). Second, the cultural specializations that come to define ecotypes would likely make between-ecotype dispersal difficult and often functionally impossible. Killer whales are culturally conservative (7); thus, for instance, adopting the specialized, collaborative behavior of a seal-eater might be very hard for a postweaning salmon-eater, who had already absorbed the conformist ways of her own lineage. Third, genetic drift within ecotypes could lead to postzygotic infertility or reduced fitness of hybrids. Such drift is particularly likely given the generally small sizes, and even smaller founding bottlenecks, of the ecotypes (20, 35). Riesch et al.’s (20) final mechanism promoting reproductive isolation among ecotypes is the selection on functional genes by different cultural practices. New genomic studies have found strong evidence for this selection (35), and this evidence is described in the next section.

Although ecological speciation driven by cultural specialization is the favored hypothesis for ecotype formation and is consistent with what is known of the behavior, ecology, genetics, and genomics of killer whales, the hypothesis is not proven. An alternative hypothesis is that nonecological cultural divergence might have predated the ecological specializations (20). For instance, proto-ecotypes could have been based on vocalization patterns of groups of killer whales that had similar diets, and then, once isolated, these groups could have begun to specialize ecologically. It is hard to envisage an alternative ecotype-generating scenario that does not have a cultural component.

Gene–Culture Coevolution of Functional Genes

We have evidence that culture has influenced the evolution of functional genes in killer whales, and killer whale ecotypes provide the substrate for these inferences. In fewer than 10,000 generations, killer whales have radiated from a single ancestral lineage to colonize regions from the Arctic to the Antarctic and all waters in between (34, 35). Until humans reached Antarctica, killer whales were the most widely distributed mammals and likely were subject to a diverse range of selective pressures associated with different habitats. The stable cultural traditions of killer whale ecotypes provide a rare and attractive system for investigating gene–culture coevolution in functional genes. However, as with most genomically derived putative linkages between genes and culture in humans (10), there is no definitive proof that cultural practice has affected selection pressure on the identified genes.

Genomic studies have looked at two particular contrasts: between fish-eating and mammal-eating ecotypes and between temperate and Antarctic ecotypes. Because diet in killer whales is almost certainly socially learned, and the ecotypes themselves likely arise from this cultural process (see previous section), the first contrast can reasonably be deemed cultural; the second is less directly so, but culture likely underpins almost all behavioral differences between ecotypes. Niche construction is the modification of the environment by an organism, so that the changes influence natural selection (39). In their review of gene–culture coevolution, Laland et al. (10) emphasized that the defining characteristic of niche construction is the modification of selection pressures and that this modification could be achieved through processes such as migration, dispersal, and habitat selection. Niche construction then provides a link between the cultures of killer whale ecotypes and their habitats.

Recent population genomic studies comparing the fish-eating resident and mammal-eating transient ecotypes of the North Pacific found a signature consistent with selection on genes associated with diet (35, 40). The regions of the genome that most differentiated members of the resident ecotype contained genes associated with digestive tract formation during developmental stages and carboxylic ester hydrolase activity, including the hydrolysis of long-chain fatty acid esters (35). A signature of selection on the CBS gene, which has a role in the methionine cycle, was found in the mammal-eating transient ecotype. Interestingly, a signature of selection on a different set of genes that play a key role in the methionine cycle was also found in the predominantly mammal-eating B1 ecotype from the Antarctic (35). Methionine is an essential amino acid that cannot be produced by the body and must be obtained through dietary intake of protein (41). Foote et al. (35) hypothesized that the more sporadic intake of large amounts of dietary protein by mammal-eating killer whales may exert selection pressures on the methionine cycle different from those experienced with a more temporally consistent intake of protein by fish-eating killer whales. Similar to the parallel examples of lactose tolerance and malaria resistance in different human populations, the signatures of selection on the same pathway in two independently derived mammal-eating ecotypes support the notion that selection may be associated with culturally acquired ecological niches.

Although the killer whales in the North Pacific inhabit relatively temperate waters, those found in Antarctic waters arguably inhabit the most extreme environment of this species’ range. The three Antarctic ecotypes included in a recent population genomics study share a recent common ancestor (34, 35). Targets of selection along the evolutionary branch to this ancestral Antarctic population include genes associated with adipose tissue development; interestingly, the same process has been under selection in the polar bear, Ursus maritimus, compared with the brown bear, Ursus arctos, which is suggestive of a link to adaptation to a colder climate (42).

The FAM83H gene, which regulates the keratin cytoskeleton, including that in epithelial cells (43), is one of the genes most differentiated in the Antarctic and North Pacific ecotypes, containing four fixed nonsynonymous substitutions in the Antarctic ecotypes (35). Antarctic killer whales are thought to have a slow regeneration of skin cells because of the thermal constraints of allowing blood to circulate to the outer epithelial cells (44). This slow skin regeneration is evidenced from the build-up of diatoms, giving Antarctic killer whales a yellowish hue compared with North Pacific killer whales (45). Genes associated with skin regeneration, such as FAM83H, therefore may be under selection because of this thermal constraint. Just as human cultural practices frequently buffer selective pressures from harsh environments (10), Antarctic killer whales seem to use behavioral adaptation to mitigate the impact of the Antarctic climate upon skin regeneration. Satellite tagging has revealed that these whales make regular round trips to warm subtropical waters; when they return they have lost the yellow hue of their diatom film, indicating they have regenerated the outer epithelial cells (44).



Several phenotypic distinctions between ecotypes appear functional, given the ecotypes’ different cultural behaviors, and are likely genetically based, but the candidate genes for these distinctions have not yet been identified. Most obviously, ecotypes that eat mammals are generally larger than those that eat fish (20). This difference in size could be related to the nutritional values and availability of the different foods, but it also may have a genetic component. In the North Pacific, mammal-eating transients seem to have more robust mouth parts than fish-eating residents (20). The type B1 Antarctic killer whales use a particularly extreme form of coordination, synchronous fluke beats, to create waves that knock seals off ice floes (46). They also have relatively larger white eyepatches than other ecotypes. Are these large eyepatches coincidence or the result of gene–culture coevolution?

Cultural Hitchhiking in Matrilineal Whales

This section summarizes information in ref. 47. As molecular genetics was applied to cetacean species during the 1990s, an unexpected pattern appeared. The four whale species known to have matrilineal social systems, the killer whale, sperm whale, and two pilot whale species (Globicephala macrorhynchus and Globicephala melas), possessed much lower diversities of the control region of their mtDNA than other cetacean species with comparable population sizes or latitudinal ranges (19). In the nearly two decades since then, many additional estimates of cetacean control region diversity have been published, for other species (including another presumed matrilineal species, the false killer whale, Pseudorca crassidens), with larger and more geographically dispersed samples, and greater coverage of the genome (47). The pattern still holds (Fig. 1). The range-wide mtDNA diversities in the matrilineal species are ∼29.8% of those of nonmatrilineal species with similar latitudinal ranges (47). For regional estimates the ratio is 16.6%.

Low genetic diversity usually invokes discussion of bottlenecks and selection. Both these default mechanisms have been used to explain the low mtDNA diversity of the matrilineal species of Cetacea (48–53). However, neither bottlenecks nor selection link the remarkably low diversity of the species to their matrilineal social systems. Furthermore, contrary to expectations from bottlenecks, the diversity of nuclear microsatellites is not obviously reduced in the matrilineal whales (although this result may be partially explained by the greater effective population size and higher mutation rates of microsatellites compared with mtDNA) (47). In sperm whales, Alexander et al. (48) found no evidence for selection in the control region relative to other regions of the mitochondrial genome, but this finding does not exclude a selective sweep originating in these other regions.

More satisfying than bottlenecks or selection are scenarios postulating important demographic processes operating primarily at the level of the matrilineal group, thus reducing the effective population size to something like the number of groups and thus lowering the expected genetic diversity (54–56). However, agent-based models found no realistic circumstances in which the demography of matrilineal groups could reduce genetic diversity, unless fitness differentials between groups were heritable (57). If there is a heritable component to fitness within matrilineal groups, then the conditions for reduced genetic diversity become much relaxed (57). This heritability could happen with selection on the mitochondrial genome, for instance in response to mitochondrial disease or deep diving (47, 48), but there is no need for a species to have a matrilineal social system for selection on genes to reduce mtDNA diversity, so this explanation is unsatisfying.

The heritable component to fitness does not need to be genetic; it just needs to be transmitted matrilineally. Killer and sperm whales are well known for their matrilineally transmitted cultures (3). Cultural hitchhiking is the hypothesis that these matrilineally transmitted cultures, with fitness differentials, have reduced diversity of the mitochondrial genes that are being transmitted in parallel (19). Thus cultural hitchhiking is a form of gene–culture coevolution. Agent-based models have shown that cultural hitchhiking can work in circumstances that seem realistic for the matrilineal whales (57) but do not necessarily imply that cultural hitchhiking is behind their low mtDNA diversity.

However, it is most parsimonious to assume that a factor common to the matrilineal species has led to reduced diversity. Because that factor does not seem to be the direct influence of matrilineality itself (57), I suspect that it is culture. Matrilineal social systems are particularly good substrates for the evolution of stable, group-specific cultures (3), and stable, group-specific cultures are the prerequisite for cultural hitchhiking (57). However, culture may affect genetic diversity through different paths in different species (18), as illustrated by the two best-known matrilineal species, sperm and killer whales.

Female and immature sperm whales use tropical and subtropical waters, where they live in matrilineally based social units (58). These units, in turn, are members of coda clans (59). The coda clans have distinctive dialects, movements, microhabitat use, and social behavior, as well as differential reproductive success (59–62). The clans are sympatric but do not associate with one another; they show no differences in nuclear microsatellites but do have distinct mtDNA haplotype distributions (63, 64). Thus, sperm whales fit the classic cultural hitchhiking scenario well, with clans being the cultural groups under selection, and cultural hitchhiking is the most parsimonious explanation for their low mitochondrial diversity. A selective sweep in the mitochondrial genome is also consistent with available results on sperm whales (48) but does not obviously provide a link with matrilineality.

Although killer whales are also matrilineal, their subdivision into ecotypes adds a layer of additional possible drivers of low genetic diversity. Population subdivision itself tends to reduce genetic diversity (65). The highly specialized ways of life of many of the ecotypes may make them particularly vulnerable to extirpation, removing characteristic mtDNA haplotypes in the process, and thus reducing diversity (65). The ecotypes have very different culturally transmitted behavior, one of the conditions for cultural hitchhiking, but their lifestyles are so different that it is hard to see how they would often be in competition (however, see ref. 66 for a scenario of indirect competition between ecotypes), so cultural hitchhiking at the ecotype level seems an incomplete explanation. Processes that reduce diversity within ecotypes would also affect the overall diversity of killer whales.

Fig. 1. Mean mtDNA nucleotide diversity (across estimates for a species with n > 100 covering > 25% of the species’ range or at least one ocean basin) against the latitudinal range of cetacean species. Nonmatrilineal species are indicated by a plus sign, and matrilineal species are indicated by circles. Gma, short-finned pilot whale; Gme, long-finned pilot whale; Oo, killer whale; Pc, false killer whale; Pm, sperm whale. Adapted from ref. 47.

Genomic studies give strong support for bottleneck effects during the founding of ecotypes, as well as selection driven by ecotype-specific cultural traits (35, 67). Cultural hitchhiking also could operate on socio-cultural groups within ecotypes, such as the communities of resident ecotype killer whales (68). However, at this time, our evidence for a cultural factor behind the low mtDNA diversity of killer whales lies primarily in its role in setting up the ecotypes and driving selection within them (see previous sections). In this respect, the killer whale situation corresponds quite closely to Premo and Hublin’s (25) suggestion that culturally transmitted barriers to mating and dispersal may have set up conditions for the reduction in human genetic diversity through bottlenecks and selection.

We know much less about the three other matrilineal species: the short-finned pilot whale, the long-finned pilot whale, and the false killer whale. Although the evidence for matrilineal social systems is quite good (69–71), we have not identified suitable socio-cultural groups on which cultural hitchhiking might operate, nor do we know whether such groups are barriers to mating, as with the killer whale ecotypes, or, alternatively, share nuclear genes through male dispersal, as with sperm whale coda clans. For these species we also have little idea as to what important matrilineally based cultural traits either provide a selection differential or define population structure. However, the low mtDNA diversity in these matrilineal species is unlikely to be a coincidence; culture is probably involved in some way.

Culture and the Geographical Distribution of Genes

The geography of the genes of H. sapiens has been thoroughly redistributed by culture. Both the neutral genes characteristic of northern Europeans and some functional ones (e.g., those for blond hair) are much more widely dispersed than would have been expected a millennium ago. This dispersal is a result of their linkage to the knowledge of oceanic navigation and weapons technology, as well as a drive to explore, all culturally acquired. In several cases the geographical distribution of cetacean genes has a strong cultural imprint. This geographical pattern happens most obviously when migration routes are learned matrilineally, and so migratory destinations may have distinctive mtDNA haplotype distributions. However, there are also more complex situations in which the link between geography and culture is not as direct.

Most beluga whale (Delphinapterus leucas) populations make seasonal migrations between shallow summering areas, where the young are born, and deeper-water wintering grounds. The young belugas are presumed to learn these migration routes from their mothers and repeat them sufficiently faithfully that summering grounds that were emptied by intense whaling remain almost unoccupied several generations later, even though they border the migration routes of other belugas whose summering grounds were not intensely whaled. So strong are these matrilineal cultural traditions that a population of belugas summering in western Hudson Bay, Canada, has mtDNA haplotypes traced to a Pacific refugium during the Wisconsin glaciation, whereas those summering in eastern Hudson Bay have an almost entirely different distribution of haplotypes that are presumed to derive from an Atlantic refugium (72). This distinction has been maintained over several thousand generations even though there are no physical barriers between the summering grounds, and the migration routes and wintering grounds of the two populations, where the animals mate, overlap, so that there is mating between them (73).

Baleen whales also show geographical genetic structure in mtDNA resulting from maternal learning of migration destinations. For instance, humpback whales (Megaptera novaeangliae) in the North Atlantic and North Pacific consistently use particular summer feeding grounds in temperate latitudes but breed in common tropical wintering areas. Calves use the same summering grounds as their mothers, but not necessarily their fathers, setting up mtDNA haplotype distinctions among feeding grounds (74). The wintering grounds of southern right whales (Eubalaena australis) have distinct mtDNA haplotype distributions, presumably because of maternal migratory fidelity, and Carroll et al.’s (75) analysis of gene distributions and stable isotopes suggests that the migratory cultures also affect genetic structure on feeding grounds.

Such cases of cultural migration routes and habitat use may not be uncommon among animal species such as birds and ungulates, although many traditional migrations are insufficiently faithful to set up clear genetic patterning (76). Such patterns can also be found without culture; for instance, salmon use chemical traces to home to their native streams and so set up clear geographical genetic distinctions. Using senses alone to home to native areas consistently is an alternative explanation for the geographical patterns of mtDNA in cetaceans. However, given the social nature of these species, and especially the long mother–calf relationship that typically includes several migrations, the cultural explanation is more parsimonious.

Geographical patterns of gene distributions can also be driven by cultural inheritance of traits that are not themselves about migration but affect habitat use secondarily. The population of bottlenose dolphins (Tursiops spp.) in western Shark Bay, Australia, shows very pronounced distributional differences for three main mtDNA haplotypes over scales of a few kilometers (Fig. 2). There are no barriers to movement between the primary ranges of the different haplotypes, and individuals could move between these ranges in less than an hour. Kopps et al. (77) relate the geographical patterns of mtDNA to foraging methods learned from the mother. One particular foraging type, “sponging,” in which a sponge is used as a tool to forage in relatively deep waters, is known to be learned almost entirely from the mother (78, 79), and in Kopps et al.’s (77) sample is restricted to just one haplotype characteristic of the deeper waters (haplotype E in Fig. 2).

Fig. 2. Segregation of bottlenose dolphin haplotypes by habitat in western Shark Bay, Australia. Survey colors represent haplotypes of dolphins. Each sighting of a sampled dolphin was plotted. Green represents land, white represents shallow water (<10 m), and gray represents deeper water (≥10 m). Adapted from ref. 77.

As illustrated by human colonization, culture can potentially affect the dynamics of gene distributions. Sperm whale populations are subdivided into cultural coda clans that are matrilineally based,
show distinct mtDNA haplotype distributions, and are often sympatric (59, 64). Between 1985 and 1995 there were two principal cultural coda clans off the Galápagos Islands, but in 2013 and 2014 these had been replaced by two different clans previously primarily found elsewhere in the Pacific (80). It seems that clan membership was a primary factor governing the movements of the sperm whales, and so the changes in their mitochondrial gene distributions (80). Thus, culture can have an important role in structuring the geographical dynamics of a population and, consequently, its genes.

Table 1. Potential cases of gene–culture coevolution in whales and dolphins

| Effect | Species | Brief description | Noncultural mechanisms/comments | Strength of evidence for gene–culture coevolution | Ref. |
|---|---|---|---|---|---|
| Speciation | | | | | |
| Ecotype radiations | Killer whale | A deep division of killer whales into ecotypes driven by foraging cultures | Perhaps other cultural behavior initiated ecotype formation | Strong | (20) |
| Gene–culture coevolution of functional genes | | | | | |
| Genes related to digestive tract | Killer whale | Differ between mammal-eating and fish-eating ecotypes | | Possible | (35) |
| Genes related to methionine cycle | Killer whale | Differ between mammal-eating and fish-eating ecotypes | Independent contrasts in North Pacific and Antarctic | Strong | (35) |
| Genes related to adipose tissue development | Killer whale | Differ between Antarctic and temperate ecotypes | Cultural role not direct | Possible | (35) |
| Genes related to skin regeneration | Killer whale | Differ between Antarctic and temperate ecotypes | Cultural role not direct | Possible | (35) |
| Size | Killer whale | Differences between mammal-eating, fish-eating and bird-eating ecotypes | No genes identified; could be environmental | Weak | (20) |
| Robustness of mouth parts | Killer whale | Differ between mammal-eating and fish-eating ecotypes | No genes identified; preliminary study | Weak | (20) |
| Coloration | Killer whale | Most enlarged white eye-patch in ecotype that uses most coordinated behavior | No genes identified | Weak | (20) |
| Cultural hitchhiking in matrilineal whales | | | | | |
| Low mtDNA diversity | Sperm whale | mtDNA has hitchhiked on selective cultural traits transmitted in parallel | Clans are strong candidate for cultural groups; could be caused by selective sweep in mtDNA or bottleneck (less supported) | Possible | (57) |
| Low mtDNA diversity | Killer whale | Ecotype population structure, plus additional cultural effects, bottlenecks and/or selection have reduced diversity | Culture very likely to have role(s); several possible routes; not necessarily classic cultural hitchhiking | Strong | (57) |
| Low mtDNA diversity | Pilot, false killer whales | mtDNA has hitchhiked on selective cultural traits transmitted in parallel | Little evidence other than correlation between matrilineal social system and low mtDNA diversity | Weak | (57) |
| Culture and gene geography | | | | | |
| Distinctive mtDNA distributions of summering areas | Beluga whale | Belugas follow their mothers on first migrations | Born in summer, could be purely environmental sensing, but unlikely | Strong | (72) |
| Distinctive mtDNA distributions of summering areas | Humpback whale | Humpbacks follow their mothers on first migrations | Born in winter, so, because first visit to summering grounds is with mother, this is social learning | Strong | (74) |
| Distinctive mtDNA distributions of wintering areas | Southern right whale | Right whales follow their mothers on first migrations | As born in winter, could be purely environmental sensing, but unlikely | Strong | (75) |
| Distinctive mtDNA distributions at small scales | Bottlenose dolphin | Dolphins learn specialized feeding techniques from their mothers, leading to habitat selection | Social learning of some foraging techniques well established | Strong | (77) |
| Change in mtDNA distribution caused by cultural clan redistribution | Sperm whale | Clans redistributed themselves over 30 y, changing mtDNA distribution | Genetic change inferred; redistribution could have been independent of clan membership, but unlikely | Possible | (80) |

Discussion

Given the difficulty of studying whale and dolphin behavior, there is a remarkable amount of evidence for cetacean culture (3). Some of this culture has driven gene evolution. The extent and nature of gene–culture coevolution in cetaceans is very uncertain, as it is in humans (13). Table 1 summarizes the ideas and evidence. Although the strength of evidence is variable (Table 1), culture seems to have driven the evolution of both neutral and functional cetacean genes in several quite different ways. In all cases, socially learned behavior affects how individuals interact with their environment or with each other and thus affects the transmission patterns or selection pressures on genes. Included is the relatively prosaic mechanism of young animals faithfully following their mothers on their first migrations and using these learned migrations for the rest of their lives to set up mtDNA distinctions between geographical habitats. However, there is also evidence for unusual processes: culture driving ecological speciation and cultural hitchhiking reducing genetic diversity.

There are strong parallels with what is known regarding human gene–culture coevolution and evidence for the evolution of functional genes in different killer whale ecotypes with distinct cultural behavior (35). However, the clear phylogenetic and cultural divisions between killer whale ecotypes make it easier, in some respects, to pin down gene–culture coevolution in these species than in the messier trajectory of human evolutionary history.

In 1996 Feldman and Laland (11) suggested that social learning was rarely stable enough in nonhumans to support long-standing cultural traditions upon which selection could act. Documenting cultural stability over generations is not easy in nonhumans. However, there is archaeological evidence for over 4,000 y of a particular type of nut-cracking by chimpanzees (Pan troglodytes) at one site in west Africa (81). Faithful transmission of cultures over many generations is implied by many of the patterns in cetacean genetics documented in Table 1. Examples include the strong mtDNA differences between beluga whales on different summering grounds and the linkage of functional genes with ecotype behavior in killer whales.

Although evidence for culture is present across the range of cetacean species whose behavior has been studied (3), there is considerable variation in how that culture is expressed. One constant, however, is a strong mother–calf bond, lasting at least many months and often much longer. This relationship is an excellent conduit for social learning and stable cultures (3). Migration routes, dialects, and foraging strategies are often learned from the mother and so are transmitted in parallel with mtDNA, potentially leading to gene–culture coevolution. However, in some of the larger odontocetes the matrilineal influence is much stronger. Females, and sometimes males, typically spend their lives with their mothers and other maternal relatives. Not only does this association increase the potential learning period and number of models, but the behavior within these matrilineal social units may become very stereotyped, through processes such as conformism and symbolic marking (82). These factors increase the range of ways culture can affect genes, especially the matrilineally transmitted mitochondrial genes. So there is a particular emphasis on killer whales, sperm whales, and the other matrilineal odontocetes. When matrilineal social structures with distinct and stable cultures cease exchanging mates, then nuclear genes become targets for gene–culture coevolution. Killer whales are the only species in which we know that sympatric matrilines have become reproductively isolated, and it is no coincidence that killer whales feature so prominently in Table 1. However, there may be other cetacean species in which mating barriers, either physical or social, between groups with different cultures could allow gene–culture coevolution on nuclear genes.

Some of the processes discussed here, particularly the cultural inheritance of migration routes driving the geography of genes, may be present in other nonhuman species. In species such as chimpanzees that have important stable foraging cultures (6, 81), the evolution of functional genes might be driven by culture; for instance, a group that uses a particular tool might evolve senses or musculature to make better use of it. In birds, culturally inherited songs can demarcate social relationships and mating opportunities, perhaps promoting speciation (31). However, this process is somewhat different from the culturally driven ecological speciation of killer whales, and the killer whale scenario is likely rare (20) and maybe unique. Cultural hitchhiking producing low genetic diversity may also be rare, because it requires quite stable groups of relatives with quite stable functionally important cultures producing consistent fitness differentials; cultural hitchhiking works much more efficiently if the cultural groups are sympatric (26, 57).

This review has focused on culture causing differences in genes within species. However, when cultural transmission becomes important, there may be species-wide selection for characteristics that make this transmission more effective. Humans as well as some species of Cetacea have remarkably large brains, as well as menopause. There have been suggestions that both traits could be adaptive in complex cultural systems (3, 83).

The prospects for our understanding of gene–culture coevolution in cetaceans, as well as other nonhumans, are good, but uneven. On the gene side, recent progress in sequencing technology has allowed the transition from targeting candidate genes to conducting genome-wide or whole-genome comparisons. This advance makes finding gene–culture coevolution much more likely, because studies are not limited to identifying genetic variants associated with a particular cultural behavior a priori. As costs go down and expertise builds, genomic studies are being used increasingly on cetaceans and are being deployed with added depth and scope (84). In contrast, studies of cetacean social structure and culture are naturally slow and painstaking and are not getting any cheaper. Some shortcuts are being used, such as inferring diet through stable isotopes and automated processing of recordings of vocalizations, but we also require much more detailed observation at sea. For major progress in understanding gene–culture coevolution, the socio-cultural side needs innovation, commitment, and investment.

ACKNOWLEDGMENTS. I thank Andrew Foote, who helped write and validate parts of this article, particularly the parts involving genomics, contributed alternative explanations, and also reviewed the article; two anonymous reviewers who provided detailed and constructive comments; and Francisco J. Ayala, Marcus W. Feldman, Kevin N. Laland, and Andrew Whiten for organizing such an interesting colloquium.

1. Laland KN, et al. (2015) The extended evolutionary synthesis: Its structure, assumptions and predictions. Proc Roy Soc B 282:20151019.
2. Maynard Smith J (1989) Evolutionary Genetics (Oxford Univ Press, Oxford, UK).
3. Whitehead H, Rendell L (2015) The Cultural Lives of Whales and Dolphins (Univ of Chicago Press, Chicago).
4. Heyes CM (1994) Social learning in animals: Categories and mechanisms. Biol Rev Camb Philos Soc 69:207–231.
5. Hoppitt W, Laland KN (2013) Social Learning: An Introduction to Mechanisms, Methods, and Models (Princeton Univ Press, Princeton, NJ).
6. Whiten A, et al. (1999) Cultures in chimpanzees. Nature 399:682–685.
7. Barrett-Lennard L (2011) Killer whale evolution: Populations, ecotypes, species, Oh my! J Am Cet Soc 40:48–53.
8. Sapolsky RM, Share LJ (2004) A pacific culture among wild baboons: Its emergence and transmission. PLoS Biol 2:E106.
9. Cavalli-Sforza LL, Feldman MW, Chen KH, Dornbusch SM (1982) Theory and observation in cultural transmission. Science 218:19–27.
10. Laland KN, Odling-Smee J, Myles S (2010) How culture shaped the human genome: Bringing genetics and the human sciences together. Nat Rev Genet 11:137–148.

11. Feldman MW, Laland KN (1996) Gene-culture coevolutionary theory. Trends Ecol Evol 11:453–457.
12. Cavalli-Sforza LL, Feldman MW (1981) Cultural Transmission and Evolution: A Quantitative Approach (Princeton Univ Press, Princeton, NJ).
13. Richerson PJ, Boyd R, Henrich J (2010) Colloquium paper: Gene-culture coevolution in the age of genomics. Proc Natl Acad Sci USA 107:8985–8992.
14. Feldman MW, Cavalli-Sforza LL (1976) Cultural and biological evolutionary processes, selection for a trait under complex transmission. Theor Popul Biol 9:238–259.
15. Creanza N, Feldman MW (2016) Worldwide genetic and cultural change in human evolution. Curr Opin Genet Dev 41:85–92.
16. Creanza N, Kolodny O, Feldman MW (2017) Cultural evolutionary theory: How culture evolves and why it matters. Proc Natl Acad Sci USA 114:7782–7789.
17. Tishkoff SA, et al. (2007) Convergent adaptation of human lactase persistence in Africa and Europe. Nat Genet 39:31–40.
18. Premo L (2012) Hitchhiker’s guide to genetic diversity in socially structured populations. Curr Zool 58:287–297.
19. Whitehead H (1998) Cultural selection and genetic diversity in matrilineal whales. Science 282:1708–1711.
20. Riesch R, Barrett-Lennard LG, Ellis GM, Ford JKB, Deecke VB (2012) Cultural traditions and the evolution of reproductive isolation: Ecological speciation in killer whales? Biol J Linn Soc Lond 106:1–17.
21. Flint J, et al. (1986) High frequencies of alpha-thalassaemia are the result of natural selection by malaria. Nature 321:744–750.
22. Piel FB, et al. (2010) Global distribution of the sickle cell gene and geographical confirmation of the malaria hypothesis. Nat Commun 1:104.
23. Fan S, Hansen ME, Lo Y, Tishkoff SA (2016) Going global by adapting local: A review of recent human adaptation. Science 354:54–59.
24. Hawks J, Wang ET, Cochran GM, Harpending HC, Moyzis RK (2007) Recent acceleration of human adaptive evolution. Proc Natl Acad Sci USA 104:20753–20758.
25. Premo LS, Hublin JJ (2009) Culture, population structure, and low genetic diversity in Pleistocene hominins. Proc Natl Acad Sci USA 106:33–37.
26. Whitehead H, Richerson PJ, Boyd R (2002) Cultural selection and genetic diversity in humans. Selection 3:115–125.
27. Richerson PJ, Boyd R (2005) Not by Genes Alone: How Culture Transformed Human Evolution (Univ of Chicago Press, Chicago).
28. Whiten A (2011) The scope of culture in chimpanzees, humans and ancestral apes. Philos Trans R Soc Lond B Biol Sci 366:997–1007.
29. Whiten A, Ayala FJ, Feldman MW, Laland KN (2017) The extension of biology through culture. Proc Natl Acad Sci USA 114:7775–7781.
30. Catchpole CK, Slater PJB (2008) Bird Song: Biological Themes and Variations (Cambridge Univ Press, Cambridge, UK).
31. Grant BR, Grant PR (2002) Simulating secondary contact in allopatric speciation: An empirical test of premating isolation. Biol J Linn Soc Lond 76:545–556.
32. Wright TF, Wilkinson GS (2001) Population genetic structure and vocal dialects in an amazon parrot. Proc Biol Sci 268:609–616.
33. Moura AE, et al. (2015) Phylogenomics of the killer whale indicates ecotype divergence in sympatry. Heredity (Edinb) 114:48–55.
34. Morin PA, et al. (2015) Geographic and temporal dynamics of a global radiation and diversification in the killer whale. Mol Ecol 24:3964–3979.
35. Foote AD, et al. (2016) Genome-culture coevolution promotes rapid divergence of killer whale ecotypes. Nat Commun 7:11693.
36. Morin PA, et al. (2010) Complete mitochondrial genome phylogeographic analysis of killer whales (Orcinus orca) indicates multiple species. Genome Res 20:908–916.
37. Rundle HD, Nosil P (2005) Ecological speciation. Ecol Lett 8:336–352.
38. Matkin C, Durban J (2011) Killer whales in Alaskan waters. J Am Cet Soc 40:24–29.
39. Laland KN, Odling-Smee J, Feldman MW (2000) Niche construction, biological evolution, and cultural change. Behav Brain Sci 23:131–146, discussion 146–175.
40. Moura AE, et al. (2014) Population genomics of the killer whale indicates ecotype evolution in sympatry involving both selection and drift. Mol Ecol 23:5179–5192.
41. Finkelstein JD (1990) Methionine metabolism in mammals. J Nutr Biochem 1:228–237.
42. Liu S, et al. (2014) Population genomics reveal recent speciation and rapid evolutionary adaptation in polar bears. Cell 157:785–794.
43. Forman OP, et al. (2012) Parallel mapping and simultaneous sequencing reveals deletions in BCAN and FAM83H associated with discrete inherited disorders in a domestic dog breed. PLoS Genet 8:e1002462.
44. Durban JW, Pitman RL (2012) Antarctic killer whales make rapid, round-trip movements to subtropical waters: Evidence for physiological maintenance migrations? Biol Lett 8:274–277.
45. Pitman RL, Ensor P (2003) Three forms of killer whales (Orcinus orca) in Antarctic waters. J Cetacean Res Manag 5:131–139.
46. Pitman RL, Durban JW (2012) Cooperative hunting behavior, prey selectivity and prey handling by pack ice killer whales (Orcinus orca), type B, in Antarctic Peninsula waters. Mar Mamm Sci 28:16–36.
47. Whitehead H, Vachon F, Frasier TR (2017) Cultural hitchhiking in the matrilineal whales. Behav Genet 47:324–334.
48. Alexander A, et al. (2013) Low diversity in the mitogenome of sperm whales revealed by next-generation sequencing. Genome Biol Evol 5:113–129.
49. Lyrholm T, Leimar O, Gyllensten U (1996) Low diversity and biased substitution patterns in the mitochondrial DNA control region of sperm whales: Implications for estimates of time since common ancestry. Mol Biol Evol 13:1318–1326.
50. Moura AE, et al. (2014) Killer whale nuclear genome and mtDNA reveal widespread population bottleneck during the last glacial maximum. Mol Biol Evol 31:1121–1131.
51. Hoelzel AR, et al. (2002) Low worldwide genetic diversity in the killer whale (Orcinus orca): Implications for demographic history. Proc Biol Sci 269:1467–1473.
52. Janik VM (2001) Is cetacean social learning unique? Behav Brain Sci 24:337–338.
53. Mesnick SL, et al. (1999) Culture and genetic evolution in whales. Science 284:2055a.
54. Amos W (1999) Culture and genetic evolution in whales. Science 284:2055a.
55. Tiedemann R, Milinkovitch M (1999) Culture and genetic evolution in whales. Science 284:2055a.
56. Siemann LA (1994) Mitochondrial DNA Sequence Variation in North Atlantic Long-Finned Pilot Whales, Globicephala melas. PhD thesis (Massachusetts Institute of Technology, Cambridge, MA).
57. Whitehead H (2005) Genetic diversity in the matrilineal whales: Models of cultural hitchhiking and group-specific non-heritable demographic variation. Mar Mamm Sci 21:58–79.
58. Christal J, Whitehead H, Lettevall E (1998) Sperm whale social units: Variation and change. Can J Zool 76:1431–1440.
59. Rendell LE, Whitehead H (2003) Vocal clans in sperm whales (Physeter macrocephalus). Proc Biol Sci 270:225–231.
60. Marcoux M, Rendell L, Whitehead H (2007) Indications of fitness differences among vocal clans of sperm whales. Behav Ecol Sociobiol 61:1093–1098.
61. Cantor M, Whitehead H (2015) How does social behavior differ among sperm whale clans? Mar Mamm Sci 31:1275–1290.
62. Whitehead H, Rendell L (2004) Movements, habitat use and feeding success of cultural clans of South Pacific sperm whales. J Anim Ecol 73:190–196.
63. Whitehead H (2003) Sperm Whales: Social Evolution in the Ocean (Univ of Chicago Press, Chicago).
64. Rendell L, Mesnick SL, Dalebout ML, Burtenshaw J, Whitehead H (2012) Can genetic differences explain vocal dialect variation in sperm whales, Physeter macrocephalus? Behav Genet 42:332–343.
65. Whitlock MC, Barton NH (1997) The effective size of a subdivided population. Genetics 146:427–441.
66. Baird RW, Abrams PA, Dill LM (1992) Possible indirect interactions between transient and resident killer whales: Implications for the evolution of foraging specializations in the genus Orcinus. Oecologia 89:125–132.
67. Foote AD, et al. (2011) Positive selection on the killer whale mitogenome. Biol Lett 7:116–118.
68. Ford JKB, Ellis GM, Balcomb KC (2000) Killer Whales (Univ of British Columbia, Vancouver).
69. Amos B, Schlötterer C, Tautz D (1993) Social structure of pilot whales revealed by analytical DNA profiling. Science 260:670–672.
70. Heimlich-Boran JR (1993) Social Organization of the Short-Finned Pilot Whale Globicephala macrorhynchus, with Special Reference to the Comparative Social Ecology of Delphinids. PhD thesis (Cambridge University, Cambridge, UK).
71. Baird RW, et al. (2008) False killer whales (Pseudorca crassidens) around the main Hawaiian Islands: Long-term site fidelity, inter-island movements, and association patterns. Mar Mamm Sci 24:591–612.
72. Brown Gladden JG, Ferguson MM, Clayton JW (1997) Matriarchal genetic population structure of North American beluga whales Delphinapterus leucas (Cetacea: Monodontidae). Mol Ecol 6:1033–1046.
73. Turgeon J, Duchesne P, Colbeck GJ, Postma LD, Hammill MO (2012) Spatiotemporal segregation among summer stocks of beluga (Delphinapterus leucas) despite nuclear gene flow: Implication for the endangered belugas in eastern Hudson Bay (Canada). Conserv Genet 13:419–433.
74. Baker CS, et al. (1990) Influence of seasonal migration on geographic distribution of mitochondrial DNA haplotypes in humpback whales. Nature 344:238–240.
75. Carroll EL, et al. (2015) Cultural traditions across a migratory network shape the genetic structure of southern right whales around Australia and New Zealand. Sci Rep 5:16182.
76. Harrison XA, et al. (2010) Cultural inheritance drives site fidelity and migratory connectivity in a long-distance migrant. Mol Ecol 19:5484–5496.
77. Kopps AM, et al. (2014) Cultural transmission of tool use combined with habitat specializations leads to fine-scale genetic structure in bottlenose dolphins. Proc Roy Soc B 281:20133245.
78. Mann J, Sargeant B (2003) The Biology of Traditions; Models and Evidence, eds Fragaszy DM, Perry S (Cambridge Univ Press, Cambridge, UK), pp 236–266.
79. Krützen M, et al. (2005) Cultural transmission of tool use in bottlenose dolphins. Proc Natl Acad Sci USA 102:8939–8943.
80. Cantor M, Whitehead H, Gero S, Rendell L (2016) Cultural turnover among Galápagos sperm whales. R Soc Open Sci 3:160615.
81. Mercader J, et al. (2007) 4,300-year-old chimpanzee sites and the origins of percussive stone technology. Proc Natl Acad Sci USA 104:3043–3048.
82. Cantor M, et al. (2015) Multilevel animal societies can emerge from cultural transmission. Nat Commun 6:8091.
83. van Schaik CP, Isler K, Burkart JM (2012) Explaining brain size variation: From social to cultural brain. Trends Cogn Sci 16:277–284.
84. Cammen KM, et al. (2016) Genomic methods take the plunge: Recent advances in high-throughput sequencing of marine mammals. J Hered 107:481–495.



Song hybridization events during revolutionary song
change provide insights into cultural transmission
in humpback whales
Ellen C. Garland(a,b,1), Luke Rendell(a,b), Luca Lamoni(a,b), M. Michael Poole(c), and Michael J. Noad(d)

(a) Centre for Social Learning and Cognitive Evolution, School of Biology, University of St. Andrews, St. Andrews, KY16 9TH, United Kingdom; (b) Sea Mammal Research Unit, School of Biology, University of St. Andrews, St. Andrews, KY16 8LB, United Kingdom; (c) Marine Mammal Research Program, Maharepa, Moorea 98728, French Polynesia; and (d) Cetacean Ecology and Acoustics Laboratory, School of Veterinary Science, University of Queensland, Gatton, QLD 4343, Australia

Edited by Kevin N. Laland, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark June 16, 2017
(received for review January 14, 2017)

Cultural processes occur in a wide variety of animal taxa, from insects to cetaceans. The songs of humpback whales are one of the most striking examples of the transmission of a cultural trait and social learning in any nonhuman animal. To understand how songs are learned, we investigate rare cases of song hybridization, where parts of an existing song are spliced with a new one, likely before an individual totally adopts the new song. Song unit sequences were extracted from over 9,300 phrases recorded during two song revolutions across the South Pacific Ocean, allowing fine-scale analysis of composition and sequencing. In hybrid songs the current and new songs were spliced together in two specific ways: (i) singers placed a single hybrid phrase, in which content from both songs were combined, between the two song types when transitioning from one to the other, and/or (ii) singers spliced complete themes from the revolutionary song into the current song. Sequence analysis indicated that both processes were governed by structural similarity rules. Hybrid phrases or theme substitutions occurred at points in the songs where both songs contained “similar sounds arranged in a similar pattern.” Songs appear to be learned as segments (themes/phrase types), akin to birdsong and human language acquisition, and these can be combined in predictable ways if the underlying structural pattern is similar. These snapshots of song

whale song is one of the most elaborate acoustic displays in the animal kingdom (21). The song is produced solely by adult males (22) and is therefore considered a product of sexual selection, even though the details of how it functions as a signal are still debated (23).

Song is organized in a nested hierarchy: single sounds are termed “units,” a sequence of units is grouped into a “phrase,” phrases are repeated to form a “theme,” and a number of different themes are usually sung in a set order to form the “song” (24). To move from one theme into another, a single “transitional phrase” is sometimes sung that contains content from the preceding and following themes (20). Different versions of the display (containing different themes) are termed “song types” (18). Within each population, there is usually strong conformity to a single song type at any point in time (25). However, the song is constantly changing (20), and all males must continuously incorporate these alterations to maintain the observed conformity. This slow and gradual change is a process of cultural evolution in which subtle changes occur over time at a population scale (20, 26).

Populations within an ocean basin sing similar songs, but the similarity depends on both geographic (27, 28) and temporal distances, as transmission of song changes across a region may take
change provide insights into the mechanisms underlying song learn- several years (18, 29, 30). In the western and central South Pacific
ing in humpback whales, and comparative perspectives on the evo- region, song also undergoes dramatic cultural “revolutions,” where
lution of human language and culture. the song type from a neighboring population is rapidly adopted by
all of the males in an adjacent population (18, 19). We have pre-
vocal learning | cultural transmission | song | cetacean | humpback whale viously described the rapid, repeated, and regular horizontal cul-
tural transmission of multiple song types, creating multiple song
revolutions across the western and central South Pacific region (18,
C ultural transmission has been shown in a wide variety of taxa,
spanning birds, fish, insects, cetaceans, and nonhuman pri-
mates (1, 2). We define culture in the broad sense as shared in-
29, 30). Among populations in any nonhuman animal, this is a very
rare, possibly unique, example of population-wide horizontal cul-
formation or behavior acquired through some form of social tural transmission where behavioral variants are transmitted rapidly
learning from conspecifics (3–5). Each of these studies has pro- and repeatedly (18). However, we know little regarding the un-
vided examples demonstrating a behavioral trait being passed derlying vocal and sequence learning mechanisms governing this
from one individual to another, and on occasion entire pop- extraordinary cultural phenomenon.
Mechanisms of vocal learning are far better understood for
ulations, through some form of social learning. Cetaceans show
human language acquisition and birdsong than for cetacean
some of the most sophisticated and complex vocal and cultural
behavior outside of humans (6, 7), including vocal learning, shared
traditions, and gene–culture coevolution. For example, southern
This paper results from the Arthur M. Sackler Colloquium of the National Academy of
right whales (Eubalaena australis) demonstrate strong migratory Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the
culture (8), whereas bottlenose dolphins (Tursiops truncatus and Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering
Tursiops aduncus) demonstrate the cultural transmission of tool in Irvine, CA. The complete program and video recordings of most presentations are available
on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
use (9, 10). Both sperm whales (Physeter macrocephalus) and killer
Author contributions: E.C.G., L.R., and M.J.N. designed research; E.C.G., M.M.P., and
whales (Orcinus orca) have culturally transmitted group vocaliza- M.J.N. performed research; E.C.G. contributed new reagents/analytic tools; E.C.G., L.L.,
tions that are maintained over decades (11, 12), and also appear to and M.J.N. analyzed data; and E.C.G., L.R., L.L., M.M.P., and M.J.N. wrote the paper.
undergo gene–culture coevolution (13–15). The authors declare no conflict of interest.
Humpback whales (Megaptera novaeangliae) possess multiple, This article is a PNAS Direct Submission. K.N.L. is a guest editor invited by the
independently evolving cultural traditions, including maternally Editorial Board.
directed site fidelity to breeding and feeding grounds (16), so- 1
To whom correspondence should be addressed. Email: ecg5@st-andrews.ac.uk.
cially learned feeding tactics (17), and song displays that are This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.
subject to cultural evolution and revolution (18–20). Humpback 1073/pnas.1621072114/-/DCSupplemental.

7822–7829 | PNAS | July 25, 2017 | vol. 114 | no. 30 www.pnas.org/cgi/doi/10.1073/pnas.1621072114


COLLOQUIUM
PAPER
vocalizations (7). Humpback whales are “vocal production learners,” as they are able to modify the form of their own vocal signals after experience with signals from other individuals (7). Vocal production learning is not widespread; thus far, only a few mammalian groups, including cetaceans, pinnipeds, bats, elephants, and humans have been shown to be capable of it (7). An important limitation when studying large cetacean species is the inability to conduct controlled laboratory experiments. Although learning and song production can be mapped to different pathways in the brain for songbirds, mice, and humans (to name a few) (31), this is not yet possible with large, free-roaming cetaceans. However, we can explore the learning mechanisms involved by examining the structure and arrangement of the song displays in detail, and comparing any rules we uncover governing the arrangement and learning of song to those currently known in human language acquisition and birdsong learning.

Statistical learning, where patterns and structure are identified based on the statistical information present in sensory stimuli, is a common human learning mechanism present in all sensory modalities (32). From a very young age human infants are able to detect, extract, and generalize statistical regularities (i.e., simple algebraic rules) from their auditory environment (32, 33), and understanding how they use this statistical information to learn language is a major research focus (32). The ability to detect transition probabilities, that is, the probability that a given sound or syllable follows another one (32, 33), is important in understanding word segmentation or grammar learning tasks. From a comparative perspective, recent work has demonstrated that zebra finches (Taeniopygia guttata) generate phonological categories that result in the song being easier for others to learn (34). Understanding how humpback whales learn their extended song sequences is therefore of interest in the comparative study of mechanisms for learning sequences and patterns in cultural vocal signals.

Segmentation, the chunking of sequences into smaller components (phrases or words) that can later be recombined, is another important mechanism in human language acquisition (32, 33, 35, 36). Songbirds have been shown to segment when learning their song displays (37–40). Segments are typically separated by longer pauses (silence), and these pauses may provide an emphasis that aids in memorization of segment chunks (39). In a recent review of human language and nonhuman animal communication, Birchenall (33) suggests that the process of segmentation may also be present in humpback whale song learning. Given the importance of segmentation to language acquisition and the presence of this mechanism in the learning of birdsong, this is a logical starting place to study humpback whale song learning.

Here, we present evidence that humpback whales use segmentation in song learning by examining recordings made during the process of learning a new song in the context of a song revolution event. Recording a whale in the act of changing his song is challenging; they are highly mobile and one cannot simply record all of an individual’s song during a 2- to 3-mo breeding season and >6,000-km migration. We therefore investigate some rare cases of song hybridization recorded during song revolution events to understand how individual whales transition between two different songs. These hybrid songs, which contain themes and elements from both the previous song and the new, revolutionary song, presumably represent a transition phase in the process by which singers change their song display to a new, completely different arrangement. We aim to identify if there are any underlying structural rules governing song change (e.g., segmentation, transition probabilities) that can provide insight into how new songs can be learned so rapidly. We hypothesize that new songs will be learned as segments if segmentation is a taxon-general mechanism (hypothesis 1). Identifying the level in the song hierarchy (phrase, theme, or song) that comprises a segment will provide important information as to how the song is memorized. Alternatively, parts of both song types may be spliced together in a random arrangement of new and old units. This would indicate that the structural arrangement of an individual’s song disintegrates to a babbling/subsong phase (41) before learning the new song arrangement, and that segmentation is not occurring. Additionally, we hypothesize that if segmentation occurs, then the combination of these segments from both song types by an individual will not be random (hypothesis 2). That is, the insertion of new song segments into the existing song will be at locations in the existing song where there is some structural similarity in the sound units, phrases, or themes of the old and new songs, rather than at random positions. This “similar sounds in similar arrangements” mechanism would be akin to word substitutions in humans, such as malapropisms, where an incorrect word with a similar sound is used in place of the correct word (42). To test these hypotheses, we first investigated how each singer displaying a hybrid song transitioned between song types, and second we quantified the similarity in arrangement between the themes from each song using sequence analysis metrics. We analyzed four hybrid songs recorded during two different song revolutions from two geographic locations (eastern Australia and French Polynesia). Thus far, these are the only examples of hybrid songs in over 20 y of fieldwork from five populations where song revolutions are known to occur regularly, and from which ∼1,500 song sequences representing at least 100,000 phrases have been analyzed. [A fifth hybrid song has been identified. This recording is of a very poor quality (low signal-to-noise ratio). Themes can be sporadically identified but the clear transitions between themes required for the current analysis is lacking. We therefore excluded this recording from analysis. The recording was from eastern Australia in 1997 as part of the pink/black song revolution presented in ref. 19.]

Results

Three separate datasets were included in the analysis, as each contained one or more hybrid songs. These spanned two geographic locations: Peregian Beach, eastern Australia (1996–1997 and 2002–2003), and Mo’orea, French Polynesia (2005); and two song revolutions: from pink to black (Australia 1996–1997), the “original” song revolution (19), and from blue to dark red [which occurred in Australia in 2002–2003 and French Polynesia in 2005 (20)]. Over 46 h of song from 50 singers and 4 song types (each given an arbitrary color label: blue, dark red, pink, and black, to be consistent with published analyses of these song types) were analyzed from French Polynesia (2005: 18 singers, 1 hybrid) and eastern Australia (1996–1997: 2 singers each based on the highest quality singer for each song type from 249 singers presented in ref. 19, 2 hybrids; and 2002–2003: 26 singers, 1 hybrid).

To identify if new songs were learned as segments (hypothesis 1), we first needed to classify each potential segment. Because there are multiple levels in the humpback song hierarchy, each being a potential basis for segmentation, we analyzed each level. First, individual sounds were classified into categories (i.e., unit types) (SI Methods and Tables S1 and S2). Then the stereotyped sequences of units that made phrases were established and further grouped into themes (SI Methods and Table S1). Themes from each song type were labeled 1 through 37 (Table 1; also see SI Methods and Table S1) following previous classification of these song types (18, 19, 29, 43, 44). The song type of origin (pink or black) for theme 11 was uncertain and thus remained unresolved, as it was not heard in any nonhybrid songs (Fig. 1, Table 1, and Table S1). The sequence of themes for each hybrid singer was established (Table 1). It is immediately obvious that the hybrid songs examined here comprised complete themes from the two different song types combined into a single song; segmentation occurred at the theme level.
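The observation that hybrid songs consist of complete themes from two song types can be illustrated with a small sketch. It assumes a theme sequence is available as a list of labels and that each theme maps to one song type. The mapping table, the example sequence, and the function names below are hypothetical and are not taken from Table 1 or from the authors' analysis code.

```python
from itertools import groupby

# Hypothetical theme -> song-type lookup. Labels follow the paper's naming,
# but this table is an illustrative assumption, not the authors' classification.
SONG_TYPE = {
    "24": "blue",
    "27": "blue",
    "31a": "dark red",
    "37a": "dark red",
    "37b": "dark red",
}


def song_type_runs(themes, song_type=SONG_TYPE):
    """Group consecutive themes that belong to the same song type."""
    return [
        (stype, list(run))
        for stype, run in groupby(themes, key=lambda t: song_type[t])
    ]


def switch_points(themes, song_type=SONG_TYPE):
    """Return (from_theme, to_theme) pairs where the song type changes."""
    return [
        (a, b)
        for a, b in zip(themes, themes[1:])
        if song_type[a] != song_type[b]
    ]


# Invented hybrid theme sequence, for illustration only.
hybrid = ["24", "37a", "37b", "31a", "27"]
runs = song_type_runs(hybrid)
switches = switch_points(hybrid)
print(runs)
print(switches)
```

If segmentation operates at the theme level, each run contains whole themes from one song type, and the switch points are the locations that a later similarity analysis would need to explain.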

Table 1. Theme sequence of hybrid songs from French Polynesia (2005) and eastern Australia (1997 and 2003)

/ Represents a transitional phrase between the two labeled themes. // Represents a break in recording for <1 min. SLA, surface level attenuation, where the
whale is breathing at the surface and the song content is difficult to hear and therefore uncertain. Themes are color-coded by song type. Note the song type
of origin for theme 11 was uncertain (colored gray). See Table S1 for description of theme content.
*Themes 31a and 31b repeated multiple times. No break in recording.

Themes 8a and 9b repeated multiple times. No break in recording.

Themes 8b and 9b repeated multiple times. No break in recording.

Given that hybrid songs contained theme segments from each song type, we investigated if there were any patterns to the arrangement of themes (hypothesis 2). To do this, we: (i) established the location of hybrid transitions in the song, (ii) investigated how each singer transitioned between the song types, and (iii) quantified the similarity of theme content using sequence analysis metrics to understand why a singer might switch at that particular location in the song.

To understand the location of theme transitions, the full sequence of themes from all singers was used to construct a first-order Markov model based on the frequencies of transition between phrases (Fig. 2). Transitions occurred between the pink and black song types at multiple locations in the song (Fig. 2A and Table 1) but, in contrast, transitions between the blue and dark red song types occurred only at two locations in the song (Fig. 1 and Fig. 2B). At these transition locations, singers often placed a transitional phrase between the two song types to mediate the transition (Tables 1 and 2). This single phrase combined the starting units from the preceding phrase with units from the following phrase (typically the ending units) (Fig. 1, Table 2, and Table S1).

We characterized the structural similarity, that is the similarity in the sequence of units that comprised each theme/phrase type (laid out in Table S1), between each pair of songs (e.g., blue vs. dark red) using the Levenshtein distance (LD), a common similarity metric in linguistic and humpback song comparisons (29, 43, 45, 46). In songs from the 2005 French Polynesia blue/dark red revolution, hierarchical clustering of themes showed a single location on the dendrogram where themes from both song types grouped together on a branch (Fig. 3A). This was where the singer of the hybrid song in the French Polynesian dataset switched between song types (Fig. 2 and Tables 1 and 2). In songs from the eastern Australia 2002 revolution involving the same song types, this pattern was not as clear because theme transitions did not occur at the most similar themes (Fig. 3B). Instead, theme transitions were mediated by a transitional phrase (Tables 1 and 2). Finally, in songs from the 1996 eastern Australia pink/black revolution, the dendrogram showed a single location where themes from both song types grouped together on a branch (Fig. 3C and SI Results). This was where the majority of transitions in hybrid songs occurred between the song types (Table 2). The hybrid singers replaced the next theme in the song sequence with a similarly arranged theme from the other song type (Fig. 1 and Table 2). The remaining theme transitions were either mediated by a transitional phrase or the mechanism of transition between the song types was unclear (Tables 1 and 2). Regardless, in addition to transitional phrases this final analysis strongly indicates that transitions between song types are not random and occur more often at locations where theme content is most similar.

Discussion

Hybrid songs are recorded extremely rarely but are of interest because they capture some part of the process by which singers change their song display from an older version (type) to a new, completely different arrangement. The hybrid songs presented here were all captured during song revolution events, when singers using both the old and new song types were in the same population. It is clear that new songs are learned as segments, confirming hypothesis 1 (see also ref. 33), indicating that segmentation is a learning mechanism found in the cetacean lineage. The way singers move between song types during singing

Fig. 1. Example spectrograms of hybrid transitional phrases, corresponding parent themes, and substituted themes from the blue and dark red song types (A and B), and the pink and black song types (C and D). A shows the theme progression (from left to right) of the transition from blue theme 24, through the hybrid 24/37a phrase into dark red theme 37a and then theme 37b (singer HYB1) (Table 1). B shows the theme progression from dark red theme 31a to blue theme 27, mediated by hybrid transition phrases 27/31a and 31a/27 [note the difference in arrangement depending on the direction of transition (singer HYB2)]. C shows the theme progression (from left to right) from pink theme 1, through hybrid phrase 1/7a into black theme 7a (singers HYB3 and HYB4) (Table 1), and the substituted pink theme 2. D shows the theme progression (from left to right) from black theme 9b, through hybrid phrase 9b/4 into pink theme 4 (singers HYB3 and HYB4) (Table 1). It also shows pink theme 3 and the unresolved theme 11. Spectrograms were 2,048-point fast Fourier transform (FFT), Hanning window and 75% overlap, generated in RAVEN PRO 1.4 (see also Audios S1–S4 for corresponding audio files).

bouts suggests that these displays are unlikely to be learned as a whole. Instead, songs are split into theme segments, and the fact that transitions between song types occur at specific points in the theme sequence suggests that each theme is learned as a separate entity. Segmentation or chunking of sequences is an important mechanism in human language acquisition (35), where a stream of utterances is segmented into smaller components (phrases or words) and later recombined (36). Songbirds have also been shown to segment their song displays (37–40) and statistically learn sound categories (34). Juvenile male songbirds may learn their song from one or more tutors as a sequence of syllable segments, which they recombine to form their own song (37–40). In humpback whales, our results suggest that a male learns the new song as theme segments, which he combines with older themes as he progressively learns the new song. The novelty-threshold hypothesis suggests that novelties in the song are adopted by singers once reaching a threshold prevalence (47), and therefore an individual male would need to hear a new song from multiple individuals before adopting the change. The male therefore has multiple potential models for each theme and a general overview of the “correct” sequence of the themes. The highly stereotyped nature of theme and phrase sequences, both of which we quantified as transition probabilities (e.g., Fig. 2 and ref. 48), strongly suggests humpback whales, like songbirds, use statistical learning in learning their song display (34).

In songbirds, segments are typically separated by longer pauses (silence), and these pauses may provide an emphasis that aids in memorization of segment chunks (39). This feature of pauses between segments of zebra finch song is also a feature of humpback whale song, as a phrase is delineated from the start of another phrase by a longer pause (24, 49). Given that a single

Fig. 2. First-order Markov model of theme transitions to understand hybridization between (A) pink/black song types (n = 2,222 phrase transitions, n =
4 individuals), and (B) blue/dark red song types (n = 8,852 phrase transitions, n = 46 individuals). Each node represents a theme or phrase type, color-coded by
song type. White nodes represent transitional phrases and dashed lines indicate transitions between song types. Arrows represent the direction of movement
and thicker lines indicate higher transition probabilities. Transitions between the pink and black song types (A) occurred at multiple locations (themes 1 to 7b,
9b to 4, 10a to 1, 4 to 10a, 10a to 5b, 8b to 4, and 8a to 5b). In contrast, transitions between the blue and dark red song types (B) occurred only at two specific
locations in the song: blue theme 27–dark red theme 31a (both directions), and blue theme 24–dark red theme 37a (one-way). Phrase repetitions are removed
from the figure for ease of display.

humpback whale song can last anywhere from 5 to 30 min (24), any aid in memorization of such a long display would be under strong selection. The repetition of phrases within themes introduces redundancy in the song, and likely aids memorization through repetition and reduced content. Furthermore, rhyme-like patterns in humpback song (50) appear similar to rhyme patterns in human poems or prose, which also aid recall (51). The question of how humpback whales remember their song display (they rarely sing the wrong thing) is still open. From playback studies we know humpback whales react more strongly to novel songs than to the song of the current year (see ref. 52). The whales can identify “same” from “different.” It would be interesting to explore how long their song memory lasts, as bottlenose dolphins have been shown to remember vocalizations (signature whistles of conspecifics) for over 20 y (53). Such a song memory could drive the directional change in song revolutions (to stop whales reverting back to the previous song type), leading to the broad-scale cultural phenomenon we observe (18).

Hybrid songs from both song revolutions contained themes from one song type that were spliced into the middle of the other song type (Table 1). There are multiple examples of such hybrid song production in songbirds at the boundary of two song dialect areas or the boundary between two closely related species (41). For example, orange-tufted sunbirds (Nectarinia osea) have sharp dialect boundaries, but a small number of birds along these boundaries sing songs from both dialects (i.e., hybrids) (54). Similarly, in the village indigobird (Vidua chalybeata), a species that undergoes continuous population-wide song evolution in some ways similar to humpback whale songs, males along dialect boundaries have been recorded singing hybrid songs that combined songs from each

Table 2. Theme transitions between two different song types in hybrid songs

The direction of transition (i.e., old song to new song, or vice versa), and the number of times this transition occurred as a percentage of the total number
of hybrid transitions for each pair of song types (taken from Table 1) are noted. The similarity in sound units or their arrangement is described along with
whether this similarity was supported in the dendrograms (i.e., both themes present on a branch). The presence of a transitional phrase is noted, and a description
of the potential mechanism assisting the transition is suggested.

dialect (55). In yellow-rumped caciques (Cacicus cela vitellinus), another species with continuous population-wide song evolution, males in a colony may occasionally incorporate a foreign song type as part of their yearly population dialect if the two colonies are closely situated (56). In another example, at the range interface of black-capped chickadees (Poecile atricapillus) and Carolina chickadees (Poecile carolinensis), birds from both species displayed bilingual or atypical repertoires (57). Clearly, segmentation is an important general mechanism in vocal learning present in multiple independent lineages.

Transitions between humpback whale song types were often mediated by a transitional phrase containing individual sound units from the previous and following phrases that were common to both song types (Figs. 1 and 2 and Tables 1 and 2). Transitional phrases are a neglected component of the song in general, as they are often excluded from analyses focused on delineating song types (49). The variable structure of transitional phrases can make them difficult to categorize, particularly if they are not routinely used in all transitions between themes. Nevertheless, it is clear this normal component of song organization is important to allow an ordered progression from one theme into another, regardless of the song types.

Transitions between song types were partially governed by structural similarity, based on the Markov model and sequence analysis (Figs. 2 and 3), rejecting random combinations of segments (hypothesis 2). The sequence analysis indicated that transitions or theme substitutions occurred more often in locations that contained “similar sounds arranged in a similar pattern” in old and new songs (Fig. 3). Themes either progressed into a similarly sounding theme of the other song type or replaced that similarly sounding theme altogether (Table 2). In addition to segmenting, song learning and change are partially governed by structural similarity rules where transitions or theme substitutions occur in locations that contain similar sounds arranged in a similar pattern (i.e., a “switch-when-similar” rule). Word substitutions in humans, such as malapropisms (the use of an incorrect word in place of a word with a similar sound) (42), are highly suggestive of a general mechanism. These transition points based on similarity could act as a point of reference or cue, allowing the singer to switch from the old into the new song at this position in the song. Such anchors are present in human vocal performances [e.g., oral traditions (51)], and single sounds or words and similar note arrangements are used to transition among songs in human music performances. Finally, the ability to jump from one song into another is also a feature of birdsong; for example, counter-singing allows a male to select a matching song of a rival male and switch to singing that song in an aggressive context (41). This skill strongly suggests the presence of an underlying mechanism allowing plasticity in vocal output shared among vocal learning species.

We suggest the switch-when-similar rule may be stronger and thus more important in one direction (i.e., old-to-new themes) (Table 2), assisting singers in learning new themes sequentially and in the “correct” order. The whale is attempting to learn the new display; this is very directional. The location in the song where old themes encroach back into the song display may be less important and is unlikely to be governed by this similarity rule (explaining the majority of dissimilar transitions backward). These new-to-old song transitions appear to be mediated more often by transitional phrases (Table 2).

The process of vocal production learning (7) of a completely new song type could occur through a number of structural changes to the song, as new themes must be learned and old themes removed. Multiple studies indicate that male humpback whales adhere to the current arrangement of the song (e.g., refs. 20 and 25). Importantly, once a new song is recorded in a population, all males switch to this new song (18, 19). Clearly, the song is learned as theme segments to aid in the learning of this

particularly important to our understanding of structurally arranged
vocal communication and the potential origins of human language.
Here, by investigating rare cases of song hybridization, where parts
of an existing song are spliced with a novel, revolutionary song, we
have unearthed a number of underlying structural rules governing
song change, including segmentation and transition/substitution of
themes based on the similarity in sound sequences. These rules
likely assist humpback whales in rapidly learning their complex and
ever-changing songs, and provide insights into the evolution of
human language and culture.

Methods
Song Recordings. All recordings covered the frequency range of humpback
whale song (see SI Methods for detailed recording settings). The units in each
recording were transcribed by a human classifier (E.C.G. or L.L.), and a subset
of units measured for a suite of acoustic parameters to ensure consistent
naming (45). As humpback whale song is highly stereotyped (24), units were
grouped into phrases, phrases into themes, and themes into song types. Pre-
vious studies have identified and quantified these four song types (pink, black,
blue, and dark red), the themes (labeled 1–37), and unit types within each, and
their cultural transmission across the western and central South Pacific (18, 19,
29, 43, 59).

Theme Transitions to Understand Song Sequences. For each recording, the
sequences of themes, including phrase repetitions, transitional phrases, and
hybrid phrases, were noted. Transition tables were calculated and a first-
order Markov model of phrase transition probabilities was constructed for
each song revolution using these data: pink to black and blue to dark red. The
2002–2003 eastern Australian and 2005 French Polynesian data were com-
bined, given that they represented the same song types (18, 29), and the aim
of this higher-level analysis was to identify positions within a song where a
singer may transition between two song types.
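The transition-table step can be sketched as follows. This is a minimal illustration of a first-order Markov estimate, assuming each recording has already been reduced to an ordered list of phrase or theme labels. The function name and the two example sequences are hypothetical, and the code is not the authors' implementation.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate first-order transition probabilities from label sequences.

    Each sequence is an ordered list of phrase/theme labels. The result maps
    every label to a dict of {next_label: probability}, where probabilities
    are relative frequencies of the observed transitions out of that label.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in counts.items()
    }


# Two invented sequences: one passes through a transitional phrase ("24/37a")
# into another song type, the other stays within the original song type.
songs = [
    ["24", "24/37a", "37a", "37b"],
    ["24", "25", "26"],
]
probs = transition_probabilities(songs)
print(probs["24"])  # {'24/37a': 0.5, '25': 0.5}
```

In a table built this way, transitions between song types show up as low-probability edges out of a node whose other edges stay within one song type, which is the kind of position the higher-level analysis aims to locate.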

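Structural similarity between themes is scored in this paper with the Levenshtein distance (LD) between unit sequences, reported as an index between 0 and 1. A minimal sketch of such a score follows. Dividing the distance by the length of the longer sequence is an assumption made here for illustration (see ref. 45 for the definition the authors use), and the unit names are invented.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Similarity in [0, 1]: 1 means identical sequences.

    Assumption: the distance is normalized by the longer sequence length.
    """
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


# Invented unit sequences standing in for two representative phrases.
phrase_old = ["moan", "moan", "cry", "cry"]
phrase_new = ["moan", "cry", "cry", "cry"]
print(levenshtein(phrase_old, phrase_new))  # 1
print(similarity(phrase_old, phrase_new))   # 0.75
```

A matrix of such pairwise scores is the kind of input that hierarchical clustering takes when themes are grouped into a dendrogram.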
Fig. 3. Dendrogram of bootstrapped (1,000) similarity matrices of average-linkage hierarchical clustered median unit sequences for each theme for (A) French Polynesia 2005 blue and dark red song types (CCC = 0.93), (B) eastern Australia 2002–2003 blue and dark red song types (CCC = 0.88), and (C) eastern Australia 1996–1997 pink and black song types (CCC = 0.95). Multiscale bootstrap resampling (AU, Left, red dot indicates P > 95%) and normal bootstrap probabilities (Right, green dot indicates P > 70%) are displayed. Branches with high AU values are strongly supported by the data. Dashed boxes indicate where themes from different song types appear together on a branch.

Structural Similarity of Themes. The LD or string edit distance is a powerful metric for comparing humpback whale song sequences, which we and others have used extensively to understand song similarity at all levels within the song hierarchy (29, 43, 45, 46, 59–61). The LD similarity index produces a measure of similarity (between 0 and 1) among multiple sequences of varying lengths, and provides an overall understanding of the similarity of all sequences (see ref. 45). Here, we compared the sequence of units (i.e., a phrase) to establish the most representative phrase for each theme based on the similarity in the sequence of units (see SI Methods for further information, and Table S1) (29, 43, 45, 46). These representative phrases for each theme (laid out in Table S1) were then compared between the two song types (pink vs. black or blue vs. dark red) to quantify the structural similarity among themes in an attempt to identify any underlying structural rules for the transitions highlighted in the Markov models. Similarity scores were hierarchically clustered and bootstrapped in R using the hclust function and the pvclust package (pvclust and pvrect functions) to ensure the resulting structure was stable and likely to occur (43, 45, 62). Branches with high bootstrap values (AU significance P > 95% and bootstrap probability significance P > 70%) are strongly supported
complex display. In male village indigobirds, immigrant males by the data, whereas lower values suggest variability in their division (45). As
add song types from their new dialect and then drop their old, a further test of how well each dendrogram represented the data, the
foreign song types within a year (55). We suggest humpback Cophenetic Correlation Coefficient (CCC) was also calculated. A CCC score of
over 0.8 is considered a good representation of the associations within the
whales may undertake a similar process by adding in new themes
data (63).
starting at similar locations and then progressively deleting the
old themes. Intense cultural conformity is likely influencing these
vocal displays, which are in turn also driven by sexual selection.
The presence of an innate template likely governs the underlying
processes and rules of song learning (58), overlaid with a more
flexible cultural component that governs what variant of the song
display to sing, regardless of the species. The details of how songs
change when there is a general conformity to a population song,
and how this process interacts with sexual selection that underlies
the humpback song display, are important questions for
future research.

Conclusions
Humpback whales provide a unique perspective for understanding
of animal culture. Their mammalian heritage also makes them

ACKNOWLEDGMENTS. This report is based on a presentation at the Sackler
Colloquium on “The Extension of Biology Through Culture”; we thank the
organizers of the Colloquium and Emma Carroll, Elena Miu, and two anonymous
reviewers for providing helpful comments on previous versions of this
manuscript. E.C.G. and this study were supported by a Newton International
Fellowship from the Royal Society of London; L.L. was supported by Leverhulme
Trust Research Project Grant RPG-2013-367; L.R. was supported by the Marine
Alliance for Science and Technology for Scotland (MASTS) pooling initiative.
MASTS is funded by the Scottish Funding Council (Grant HR09011) and
contributing institutions. Song recordings in eastern Australia were funded by the
Scott Foundation, the US Office of Naval Research, and the Australian Defence
Science and Technology Organization. We thank everyone involved with this
project. Some funding and logistical support was provided to M.M.P. by the
US National Oceanic Society, Dolphin & Whale Watching Expeditions (French
Polynesia), Vista Press, and the International Fund for Animal Welfare (via the
South Pacific Whale Research Consortium).

7828 | www.pnas.org/cgi/doi/10.1073/pnas.1621072114 Garland et al.


1. Laland KN, Janik VM (2006) The animal cultures debate. Trends Ecol Evol 21:542–547.
2. Whiten A, Ayala FJ, Feldman MW, Laland KN (2017) The extension of biology through culture. Proc Natl Acad Sci USA 114:7775–7781.
3. Rendell L, Whitehead H (2001) Culture in whales and dolphins. Behav Brain Sci 24:309–324, discussion 324–382.
4. Fragaszy DM, Perry S (2003) Preface. The Biology of Traditions: Models and Evidence, eds Fragaszy DM, Perry S (Cambridge Univ Press, Cambridge, UK), pp xiii–xvi.
5. Whiten A (2009) The identification and differentiation of culture in chimpanzees and other animals: From natural history to diffusion experiments. The Question of Animal Culture, eds Laland KN, Galef BG (Harvard Univ Press, Cambridge, MA), pp 99–124.
6. Whitehead H, Rendell L (2015) The Cultural Lives of Whales and Dolphins (Univ of Chicago Press, Chicago).
7. Janik VM (2014) Cetacean vocal learning and communication. Curr Opin Neurobiol 28:60–65.
8. Carroll EL, et al. (2015) Cultural traditions across a migratory network shape the genetic structure of southern right whales around Australia and New Zealand. Sci Rep 5:16182.
9. Krützen M, et al. (2005) Cultural transmission of tool use in bottlenose dolphins. Proc Natl Acad Sci USA 102:8939–8943.
10. Kopps AM, Krützen M, Allen SJ, Bacher K, Sherwin WB (2014) Characterizing the socially transmitted foraging tactic “sponging” by bottlenose dolphins (Tursiops sp.) in the western gulf of Shark Bay, Western Australia. Mar Mamm Sci 30:847–863.
11. Rendell LE, Whitehead H (2003) Vocal clans in sperm whales (Physeter macrocephalus). Proc Biol Sci 270:225–231.
12. Deecke VB, Ford JKB, Spong P (2000) Dialect change in resident killer whales: Implications for vocal learning and cultural transmission. Anim Behav 60:629–638.
13. Foote AD, et al. (2016) Genome-culture coevolution promotes rapid divergence of killer whale ecotypes. Nat Commun 7:11693.
14. Whitehead H (1998) Cultural selection and genetic diversity in matrilineal whales. Science 282:1708–1711.
15. Whitehead H (2017) Gene–culture coevolution in whales and dolphins. Proc Natl Acad Sci USA 114:7814–7821.
16. Baker CS, et al. (1990) Influence of seasonal migration on geographic distribution of mitochondrial DNA haplotypes in humpback whales. Nature 344:238–240.
17. Allen J, Weinrich M, Hoppitt W, Rendell L (2013) Network-based diffusion analysis reveals cultural transmission of lobtail feeding in humpback whales. Science 340:485–488.
18. Garland EC, et al. (2011) Dynamic horizontal cultural transmission of humpback whale song at the ocean basin scale. Curr Biol 21:687–691.
19. Noad MJ, Cato DH, Bryden MM, Jenner MN, Jenner KCS (2000) Cultural revolution in whale songs. Nature 408:537.
20. Payne K, Payne RS (1985) Large scale changes over 19 years in songs of humpback whales in Bermuda. Z Tierpsychol 68:89–114.
21. Wilson EO (2000) Sociobiology: The New Synthesis (Harvard Univ Press, Cambridge, MA), 25th Ed.
22. Glockner DA (1983) Determining the sex of humpback whales (Megaptera novaeangliae) in their natural environment. Communication and Behavior of Whales, ed Payne R (AAAS Selected Symposia Series, Boulder, CO), pp 447–464.
23. Herman LM (November 6, 2016) The multiple functions of male song within the humpback whale (Megaptera novaeangliae) mating system: Review, evaluation, and synthesis. Biol Rev Camb Philos Soc, 10.1111/brv.12309.
24. Payne RS, McVay S (1971) Songs of humpback whales. Science 173:585–597.
25. Payne K, Tyack P, Payne R (1983) Progressive changes in the songs of humpback whales (Megaptera novaeangliae): A detailed analysis of two seasons in Hawaii. Communication and Behavior of Whales, ed Payne R (AAAS Selected Symposia Series, Boulder, CO), pp 9–57.
26. Winn H, Winn L (1978) The song of the humpback whale Megaptera novaeangliae in the West Indies. Mar Biol 47:97–114.
27. Payne R, Guinee LN (1983) Humpback whale (Megaptera novaeangliae) songs as an indicator of “stocks.” Communication and Behavior of Whales, ed Payne R (AAAS Selected Symposia Series, Boulder, CO), pp 333–358.
28. Darling JD, Acebes JMV, Yamaguchi M (2014) Similarity yet a range of differences between humpback whale songs recorded in the Philippines, Japan and Hawaii in 2006. Aquat Biol 21:93–107.
29. Garland EC, et al. (2013) Quantifying humpback whale song sequences to understand the dynamics of song exchange at the ocean basin scale. J Acoust Soc Am 133:560–569.
30. Garland EC, et al. (2013) Humpback whale song on the Southern Ocean feeding grounds: Implications for cultural transmission. PLoS One 8:e79422.
31. Arriaga G, Zhou EP, Jarvis ED (2012) Of mice, birds, and men: The mouse ultrasonic song system has some features similar to humans and song-learning birds. PLoS One 7:e46610.
32. Romberg AR, Saffran JR (2010) Statistical learning and language acquisition. Wiley Interdiscip Rev Cogn Sci 1:906–914.
33. Birchenall LB (2016) Animal communication and human language: An overview. Int J Comp Psychol 29:1–27.
34. Fehér O, Ljubičic I, Suzuki K, Okanoya K, Tchernichovski O (2016) Statistical learning in songbirds: From self-tutoring to song culture. Phil Trans R Soc Lond B Biol Sci 372:20160053.
35. Jusczyk PW (1999) How infants begin to extract words from speech. Trends Cogn Sci 3:323–328.
36. Doupe AJ, Kuhl PK (1999) Birdsong and human speech: Common themes and mechanisms. Annu Rev Neurosci 22:567–631.
37. Williams H, Staples K (1992) Syllable chunking in zebra finch (Taeniopygia guttata) song. J Comp Psychol 106:278–286.
38. Takahasi M, Yamada H, Okanoya K (2010) Statistical and prosodic cues for song segmentation learning by Bengalese finches (Lonchura striata var. domestica). Ethology 116:481–489.
39. Spierings M, de Weger A, Ten Cate C (2015) Pauses enhance chunk recognition in song element strings by zebra finches. Anim Cogn 18:867–874.
40. Slabbekoorn H, And AJ, Bell DA (2013) Microgeographic song variation in island populations of the white-crowned sparrow (Zonotrichia leucophrys nutalli): Innovation through recombination. Behaviour 140:947–963.
41. Catchpole C, Slater PJB (2008) Bird Song: Biological Themes and Variations (Cambridge Univ Press, Cambridge, UK), 2nd Ed.
42. Fay D, Cutler A (1977) Malapropisms and the structure of the mental lexicon. Linguist Inq 8:505–520.
43. Garland EC, et al. (2012) Improved versions of the Levenshtein distance method for comparing sequence information in animals’ vocalisations: Tests using humpback whale song. Behaviour 149:1413–1441.
44. Smith JN, Goldizen AW, Dunlop RA, Noad MJ (2008) Songs of male humpback whales, Megaptera novaeangliae, are involved in intersexual interactions. Anim Behav 76:467–477.
45. Garland EC, Rendell L, Lilley MS, Poole MM, Noad MJ (2017) The devil is in the detail: Quantifying vocal variation in a complex, multi-levelled, and rapidly evolving display. Acoustical Soc Am, 10.1121/1.4991320.
46. Kershenbaum A, Garland EC (2015) Quantifying similarity in animal vocal sequences: Which metric performs best? Methods Ecol Evol 6:1452–1461.
47. Noad MJ (2002) The use of song by humpback whales (Megaptera novaeangliae) during migration off the east coast of Australia. PhD dissertation (University of Sydney, NSW, Australia).
48. Helweg DA, Herman LM, Yamamoto S, Forestell PH (1990) Comparison of songs of humpback whales (Megaptera novaeangliae) recorded in Japan, Hawaii, and Mexico during the winter of 1989. Sci Reports Cetacean Res 1:1–20.
49. Cholewiak DM, Sousa-Lima RS, Cerchio S (2013) Humpback whale song hierarchical structure: Historical context and discussion of current classification issues. Mar Mamm Sci 29:1–21.
50. Guinee LN, Payne KB (1986) Rhyme-like repetitions in songs of humpback whales. Ethology 79:295–306.
51. Rubin DC (1997) Memory in Oral Traditions: The Cognitive Psychology of Epic, Ballads, and Counting-Out Rhymes (Oxford Univ Press, Oxford).
52. Helweg DA, Frankel AS, Mobley JR, Jr, Herman LM (1992) Humpback whale song: Our current understanding. Marine Mammal Sensory Systems, eds Thomas JA, Kastelein RA, Supin AY (Plenum Press, New York), pp 459–483.
53. Bruck JN (2013) Decades-long social memory in bottlenose dolphins. Proc R Soc B 280:20131726.
54. Leader N, Wright J, Yom-Tov Y (2000) Microgeographic song dialects in the orange-tufted sunbird (Nectarinia Osea). Behaviour 137:1613–1627.
55. Payne RB (1985) Behavioral continuity and change in local song populations of village indigobirds Vidua chalybeata. Z Tierpsychol 70:1–44.
56. Trainer JM (1989) Cultural evolution in song dialects of Yellow-rumped Caciques in Panama. Ethology 80:190–204.
57. Sattler GD, Sawaya P, Braun MJ (2017) An assessment of song admixture as an indicator of hybridization in black-capped chickadees (Poecile atricapillus) and Carolina chickadees (P. carolinensis). Auk 124:926–944.
58. Cerchio S, Jacobsen JK, Norris TF (2001) Temporal and geographical variation in songs of humpback whales, Megaptera novaeangliae: Synchronous change in Hawaiian and Mexican breeding assemblages. Anim Behav 62:313–329.
59. Garland EC, et al. (2015) Population structure of humpback whales in the western and central South Pacific Ocean as determined by vocal exchange among populations. Conserv Biol 29:1198–1207.
60. Eriksen N, Tougaard J (2006) Analysing differences among animal songs quantitatively by means of the Levenshtein distance measure. Behaviour 143:239–252.
61. Helweg DA, Cato DH, Jenkins PF, Garrigue C, McCauley RD (1998) Geographic variation in South Pacific humpback whale songs. Behaviour 135:1–27.
62. Suzuki R, Shimodaira H (2004) An application of multiscale bootstrap resampling to hierarchical clustering of microarray data: How accurate are these clusters. 15th Annual International Conference of Genome Informatics, Posters and Software Demonstrations. Available at http://stat.sys.i.kyoto-u.ac.jp/prog/pvclust/. Accessed October 29, 2015.
63. Sokal RR, Rohlf FJ (1962) The comparison of dendrograms by objective methods. Taxon 11:33–40.

Garland et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7829
Conformity does not perpetuate suboptimal traditions
in a wild population of songbirds
Lucy M. Aplin (a,1), Ben C. Sheldon (a), and Richard McElreath (b,c)
(a) Edward Grey Institute, Department of Zoology, University of Oxford, Oxford OX1 3PS, United Kingdom; (b) Department of Human Behavior, Ecology, and
Culture, Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany; and (c) Department of Anthropology, University of California, Davis,
CA 95616

Edited by Kevin N. Laland, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark May 30,
2017 (received for review January 17, 2017)

Social learning is important to the life history of many animals,
helping individuals to acquire new adaptive behavior. However
despite long-running debate, it remains an open question
whether a reliance on social learning can also lead to mismatched
or maladaptive behavior. In a previous study, we experimentally
induced traditions for opening a bidirectional door puzzle box in
replicate subpopulations of the great tit Parus major. Individuals
were conformist social learners, resulting in stable cultural behaviors.
Here, we vary the rewards gained by these techniques to
ask to what extent established behaviors are flexible to changing
conditions. When subpopulations with established foraging traditions
for one technique were subjected to a reduced foraging
payoff, 49% of birds switched their behavior to a higher-payoff
foraging technique after only 14 days, with younger individuals
showing a faster rate of change. We elucidated the decision-making
process for each individual, using a mechanistic learning
model to demonstrate that, perhaps surprisingly, this population-level
change was achieved without significant asocial exploration
and without any evidence for payoff-biased copying. Rather, by
combining conformist social learning with payoff-sensitive individual
reinforcement (updating of experience), individuals and
populations could both acquire adaptive behavior and track environmental
change.

social learning | animal culture | conformity | Parus major

This paper results from the Arthur M. Sackler Colloquium of the National Academy of
Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the
Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering
in Irvine, CA. The complete program and video recordings of most presentations are available
on the NAS website at www.nasonline.org/Extension of Biology Through Culture.
Author contributions: L.M.A. and B.C.S. designed research; L.M.A. performed research;
L.M.A. and R.M. analyzed data; and L.M.A., B.C.S., and R.M. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. K.N.L. is a guest editor invited by the Editorial
Board.
(1) To whom correspondence should be addressed. Email: lucy.aplin@zoo.ox.ac.uk.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1621067114/-/DCSupplemental.

Social learning, the acquisition of behavior by observation of,
or interaction with, other individuals, is common to many
animal species. It provides a relatively cheap way of acquiring
valuable information and shields naive individuals from the risks
of engaging in trial and error learning. A range of studies have
further highlighted the crucial role of social learning in promoting
cultural behavior and shared traditions (1–4) and suggested
that the cultural inheritance of information across generations
may be an important component of the behavioral ecology of
some animals (1, 5, 6). However, social learning may also be
disadvantageous, if copied information is outdated or mismatched
to the observing individual. How individuals balance costs and
benefits of social learning has therefore been the focus of much
recent research aiming to understand how natural selection has
shaped learning (7).

The use of social learning “strategies” is one possible route
by which animals can combine and filter different kinds of
information to optimize learning outcomes (8–10). Here, individuals
use social cues, often from multiple conspecifics, to bias learning
in favor of better quality information. These include preferences
to copy kin or more prestigious or older individuals,
as well as conformist and payoff-biased social learning (11, 12).
Such learning strategies also have implications for the spread and
persistence of information in populations and for cultural evolution
more broadly (13, 14). For example, conformist transmission,
here defined as the disproportionate tendency to copy the
most common behavioral variant (15, 16), may evolve as a means
of providing naive individuals with a quick way of ascertaining
locally adaptive information. However, it may also have the outcome
of maintaining group differences in behavior, with within-group
traditions resilient to invasion by alternative variants.

Empirical evidence for conformity in nonhuman animals is
currently limited, but hints at a wide taxonomic occurrence, with
proposed cases in fish (17), birds (18), and primates (4, 19).
Furthermore, theoretical modeling has suggested that conformist
transmission should evolve under a wide range of conditions and
be particularly favored when environments are spatially heterogeneous
(15, 20). Yet if individuals are exclusively conformist,
then any new environmental change may result in a mismatch
with the majority behavior, leading to a perpetuation of suboptimal
or maladaptive traditions over time (21) and exaggerating
the disadvantages of social information use. Evolutionary modeling
has gone as far as to suggest that in socially learning animals,
coupling of conformist learning with environmental change
could lead to population collapse (22).

This apparent paradox of nonadaptive culture has thus been
the subject of much debate (23–28), with two individual-level
strategies proposed as a potential means of evading this evolutionary
trap. First, individuals could switch from socially learned
behavior to engaging in asocial learning (individual innovation)
when the rewards gained for performing the established tradition
is smaller than previously (7, 12, 27, 29). Second, individuals
could combine conformist tendencies with payoff-biased social
learning, using a “behavioral toolbox” of social learning strategies
when choosing what behavior to adopt, thus integrating information
about both the frequency of behavior and the relative
rewards gained by demonstrators (8, 30). Such a strategy has been
observed in laboratory experiments in humans (8). However,
empirical tests for the occurrence or emergence of suboptimal
or maladaptive traditions in animals have been limited (31, 32).

In a previous study, we demonstrated an influence of conformist
transmission on the social learning of novel foraging
techniques in wild great tits (Parus major) (18). There, we trained
two demonstrators in each of five subpopulations to one of two
equal alternative solutions to a novel task, where food could
be gained from a puzzle box by pushing a bidirectional door to
the right (technique A) or the left (technique B). Using automated
tracking of individuals and their choices, we then mapped

7830–7837 | PNAS | July 25, 2017 | vol. 114 | no. 30 www.pnas.org/cgi/doi/10.1073/pnas.1621067114


the spread and establishment of these seeded behaviors through
each subpopulation. Results showed a sigmoidal relationship
between a naive individual’s probability of adopting one of the
two techniques and its frequency in their foraging group, with
birds disproportionately likely to learn the majority technique.
Across each subpopulation, the most common behavior therefore
became increasingly prevalent over time (18). These strong
preferences at the subpopulation for one technique persisted
over the two generations measured, suggesting that technique
choice had become established as a stable cultural behavior.

Here, we vary the rewards gained by these techniques to investigate
whether a reliance on conformist social learning will result
in mismatched behavior after conditions change. First, in four
subpopulations, two with established traditions for technique A
(“left”) and two for technique B (“right”), we stocked all puzzle
boxes with a less preferred food reward over 2 days, so that solving
using either technique was rewarded with a lower payoff. This
“equal low payoffs” condition was designed to test whether individuals
would explore alternative foraging behaviors when confronted
with a lower payoff from their previously learned behavior.
Second, and immediately following this condition, all puzzle
boxes were stocked with unequal rewards for a period of 14 days.
Solving using the established tradition was thus rewarded with
this same less preferred food, but solving with the uncommon
behavioral variant now resulted in a high payoff. Whereas most
individuals had no experience of the uncommon solution, a small
minority of individuals already preferred it and so provided available
social information for the new difference between the two
techniques. Finally, all visits and behaviors at the puzzle boxes
were monitored using automated tracking. By quantifying the
decision-making process for each individual’s visit to the puzzle
box we then examined (i) whether individuals switched from conformist
to payoff-biased copying when observing others receiving
variable rewards, (ii) whether individuals flexibly adjusted
their behavior in response to learning about and gaining variable
rewards, and (iii) whether this resulted in a population-level shift
in behavior.

Results
Condition 1: Equal High Payoffs. A tradition for pushing a bidirectional
door either to the right (variant A, T1–T2) or to the left
(variant B, T3–T4) was experimentally induced in four subpopulations
of great tits [see Aplin et al. (18) for detailed methods]
(Fig. 1A). Either solving technique was rewarded with the same
highly preferred food, a live mealworm. In all subpopulations,
the large majority (87–98%) of solutions used the technique that
was introduced by the trained demonstrator at the beginning of
each experiment, with the population-level bias to this technique
becoming increasingly entrenched over time (18) (Fig. 1B).

Condition 2: Equal Low Payoffs. For a 2-day period following condition 1,
three puzzle boxes were distributed in each subpopulation
that rewarded both variants A and B with a less preferred
food, sunflower seed (Fig. S1) (18). Compared with condition 1,
there was no change in the overall proportion of each solution
performed in any subpopulation (Welch two-sample test: T1, t =
0.82, P = 0.42; T2, t = −0.16, P = 0.87; T3, t = −0.57, P =
0.57; T4, t = 1.42, P = 0.16) (Fig. 1B). Whereas these patterns
may have differed over a longer time frame, these results suggest
that, at least over 2 days [average number of solves per individual
across replicates = 22(14–30)] birds did not change their sampling
behavior in response to experiencing or observing lower
rewards.

Condition 3: Unequal Payoffs. Immediately following condition
2, puzzle boxes with unequal rewards were installed at all
sites/subpopulations over a period of 14 days. Here, solving using
the established tradition was rewarded with this same less preferred
food of sunflower seeds, whereas solving with the uncommon
variant was rewarded with live mealworms. Overall, there
were significantly more solutions of the alternative variant performance
in condition 3 than in the previous conditions (Welch
two-sample test, first vs. third condition: T1, t = −5.28, P < 0.001;
T2, t = −3.87, P < 0.001; T3, t = −3.75, P < 0.001; T4, t = −5.86,
P < 0.001) (Fig. 1B). In the last part of condition 1, an average
of 8(2–16)% of individuals either showed no preference or preferred
the alternative variant. For these individuals, their preference
did not change in condition 3. By contrast, 49(33–71)% of
other individuals switched to change their variant preference by
the end of the experimental period (proportion of all solves for
each individual over last 2 days).

Similarly to Aplin et al. (18), we analyzed the change in
individual and population preferences over time. First, as the
data were clearly bimodal (Fig. S2), a longitudinal clustering

[Fig. 1 image, panels A and B. Panel B axes: "Proportion of each variant in all solves" and "Probability of first using uncommon variant"; x axis: replicates T1–T4 under Equal High Payoffs, Equal Low Payoffs, and Unequal Payoffs; legend: Variant A, Variant B.]

Fig. 1. (A) A puzzle box where visiting individuals can slide the door open from either the blue/left side (variant A) or the red/right side (variant B) to
access a reward in a concealed feeder behind the door. The individual pictured is solving using variant A. Solving using either option can give the same
(equal condition) or different (unequal condition) rewards. Puzzle boxes record identity, contact duration, and solution choice and reset after each visit.
(B) Proportion of variant A or B used in each replicate (T1–T4) in three sequential conditions after variant A (T1–T2) or variant B (T3–T4) was initially
introduced by a trained demonstrator: (i) equally high payoffs for each solving option, proportion for last 5 days shown; (ii) equally low payoffs for each
solving option (2 days); and (iii) unequal payoffs, with the established tradition leading to a lower reward (14 days). Solid circles and error bars show mean
and 95% CI of the probability of individuals’ first solve in each condition being the uncommon variant.
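
The Welch two-sample comparisons reported for conditions 2 and 3 can be illustrated with a short sketch of the statistic itself. The samples below are hypothetical per-individual proportions of solves using the alternative variant, invented for illustration; the text does not state the exact unit of analysis, so this is an assumption rather than a reconstruction of the authors' test, and the function name is likewise hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike Student's t test, the two samples are not assumed to share a
    common variance.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical proportions of alternative-variant solves per individual.
condition_1 = [0.00, 0.10, 0.05, 0.00, 0.10]
condition_3 = [0.40, 0.60, 0.50, 0.70, 0.30]
t_stat, dof = welch_t(condition_1, condition_3)
```

A negative t in this ordering (first vs. third condition) corresponds to more alternative-variant solves in the later condition, matching the sign convention of the values reported in the text.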

[Fig. 2 image. y axis: "Proportion of solves are seeded variant"; x axis: "Time (1-14 days)"; legend: T1, T2, T3, T4.]

Fig. 2. The proportion of solutions using the seeded technique decreased
over time in each replicate, with individuals moving toward preferring the
previously uncommon technique. Each replicate is shown in a different
color/shape, and solid and open symbols represents the two distinct clusters
of individuals identified in the longitudinal clustering algorithm (solid
symbols, cluster 1; open symbols, cluster 2). Lines show the generalized
estimating equation model fit for each cluster/replicate.

[Fig. 3 image, panels A–D. (A) x axis: social learning weight (s); y axis: conformity exponent (lambda). (B) x axis: observed freq high option; y axis: probability of high option. (C) x axis: age (standardized); y axis: social learning weight. (D) x axis: age (standardized); y axis: updating rate.]

Fig. 3. Individual parameter estimates for the mechanistic learning model.
Each circle represents the posterior mean for an individual bird. (A) Strength
of conformity (λ) is negatively correlated with reliance on social learning
(s), but most individuals show some conformist bias (exponent above 1).
(B) Implied social learning influence functions (expression 5). The diagonal
line represents unbiased social learning. S-shaped curves are conformist
individuals. The weak influence of payoff bias shifts these curves upward in
the lower left corner. (C) Reliance on social cues tends to decline with age,
explained mainly by the presence of large values of s in the youngest
individuals. Individuals with low values are present at all ages. (D) Updating rate
g tends to decline with age. The youngest individuals can be highly responsive
to individual experience, whereas the oldest individuals change their
attraction scores more slowly.

algorithm was used to group individuals into two behavioral trajectories,
with 48% (0.33–0.67%) of individuals falling into cluster 1.
Clusters were then analyzed separately, using a generalized
estimating equation (GEE) model where the dependent variable
was the proportion of solves using the established technique on
each of 15 days and the explanatory variables were day, individuals
weighted by their total number of solves, and subpopulation
(Fig. 2). There was strong evidence in cluster 1 that the preference
for the established tradition decreased over time (pooled
replicate data; coefficient ± SEM = −0.27 ± 0.02, P < 0.001).
Cluster 2 showed a significant but much lower decreasing preference
for the established tradition (pooled replicate data;
coefficient ± SEM = −0.13 ± 0.02, P < 0.001) (Fig. 2). This bimodality
in the rate of change over time was related to age, with younger
individuals more likely to fall into cluster 1 [general linear mixed
model (GLMM): z140 = −2.94, P = 0.003]. There was no relationship
between cluster membership and any other measured
variables (sex GLMM, z116 = 0.26, P = 0.79; prior preference
strength GLMM, z116 = −0.89, P = 0.37; number of solves in
condition 1 GLMM, z116 = −0.89, P = 0.37). Finally, and in
support of this result, there was a negative correlation between
age and likelihood of preferring the high-payoff variant in the last
2 days of the experimental period [linear mixed model (LMM):
−0.07 ± 0.02, t = −3.83, P < 0.001] (Fig. S3).

Analysis of Learning Mechanisms. We next investigated the relative
contribution of different learning mechanisms to the observed
change in behavior during condition 3. To address these questions
statistically, we used a sequential learning model of a form
previously used to study the interaction of social and asocial
learning (8, 33). This framework allows for individual choices
to be modeled as products of time-varying interactions of different
modes of learning. Specifically, each solution decision was
modeled as a binary outcome in which the probability of an individual
choosing either option was a combination of both social
cues and accumulated experience. Social cues included the frequencies
of each behavior and the relative value of demonstrated
rewards and were calculated from activity immediately before a
given solve at a given puzzle box (18). Code sufficient to repeat
our results is available as an R package, wythamewa, that contains
the data, models, and simulation code, as well as code for
reproducing each figure to follow (Figs. 3–5).

Four model parameters were considered: s, g, λ, and y, representing,
respectively, the influence of social cues on choice,
the updating rate (how quickly an individual’s attraction to an
option responds to newly experienced payoffs), an individual’s
conformist exponent, and its payoff social learning bias (population
averages presented in Table S1). There was little evidence
for a payoff social learning bias in individuals’ behavior, with no
individuals weighting this parameter higher than 0.1 (Fig. S2). All
other parameters were more important, yet there was considerable
individual variation in each (Table S1 and Fig. S4). Solving
individuals ranged in their behavior from little use of social information
to placing more than half of their decision-making weight
on social cues (s). Similarly, the updating rate (g) ranged from
nearly zero to over 0.8. Notably, these two parameters were correlated
across individuals, indicating that birds that tended to learn
socially also updated their personal information more quickly.

In contrast to payoff social learning biases, the large majority
of individuals had a conformity exponent (λ) above 1, indicating
at least mild conformity in their use of social cues (above dotted
line, Fig. 3A). However, there was a strong negative correlation
with overall reliance on social learning (s), such that birds

who put lower weights on social learning were more strongly conformist
(Fig. 3A). It is easier to understand the impact of these individual
differences by translating the estimates for each individual into an implied
social learning function, as defined by expression 5. Fig. 3B shows these
implied functions, plotted for the posterior mean of each individual. Here,
an s-shaped curve corresponds to conformist learning. Whereas a few birds
appear slightly anticonformist, it must be noted that Fig. 3B shows posterior
means, with considerable uncertainty about the exact function of any one
individual. Inspecting individual functions with full uncertainty envelopes
confirms that there is no strong evidence for anticonformity in this
population, only for a minority of individuals whose behavior is consistent
with both weak anticonformity and weak conformity.

There was a consistent negative impact of age on both weight given to social
cues (s) and updating rate (g). In contrast, there were no consistent or
strong effects of age on conformity (λ) or on payoff bias (y). These
relationships are shown in Fig. 3 C and D and suggest that older individuals
were much less influenced by social cues and also changed their behavior more
slowly in response to changes in personal experience. This is in agreement
with the descriptive results: Older individuals switched to the high-payoff
variant more slowly. Younger individuals, in contrast, adapted more quickly;
these “adaptor” individuals showed the strongest use of social information
and simultaneously tended to be less conformist, yet also tended to update
their own attraction scores more quickly in response to new personal
experience.

Modeling the Link Between Individual Behavior and Population-Level Patterns.
The results in the previous section suggest a combination of conformist
social learning and payoff-sensitive individual reinforcement (updating) in
the population, with individual variation in all measures. Surprisingly, given
our initial hypotheses, we found no evidence of individual exploration or
payoff-biased social learning of sufficient strength to explain the patterns
of behavior change. Therefore, does the conformity present in the population
slow the rate of switching? Or does it instead help both individuals and the
population show adaptive responses to environmental change?

To test this, we used the same learning model as for data analysis and used
it as a forward evolutionary simulation. We simulated groups of individuals
learning together, with parameters for the weight of social cues, strength of
conformity, and updating rate. We first used these simulations to validate our
data analysis code and then explored the population dynamics arising from
different parameter settings. Finally, we computed selection gradients on the
parameters.

First, we consider three examples to show the role of conformist social
learning (Fig. 4). Each plot shows the time series of a simulated group of 10
learners, where payoffs switch for the variants at turn 1 and again at turn
30, with groups starting by preferring the low variant. Only conformist
strength λ is varied, with other parameters held at s = 0.5, g = 0.6, and
y = 0 (representative of the posterior distribution in the fitted model). When
conformity is turned off (λ = 1), individuals take a long time to stabilize on
the high-payoff variant and are slow to switch back when the environment
changes again (Fig. 4A). Similarly, when conformity is set very high (λ = 10),
adaptive learning is very slow, and most individuals fail to establish on the
high-payoff variant at all before the environment switches (Fig. 4B). However,
when conformity is set to a moderately high value (λ = 5), all individuals
stabilize on the high-payoff variant before turn 30 and quickly switch back
after the variants switch again (Fig. 4C).

Second, we ran a simulation using parameter values for 10 birds sampled from
the posterior distribution deduced from the experimental data (Fig. 4D). The
simulation shown here is representative, and if anything, the birds showed
less conformity than is optimal in this setting. Nevertheless, the amount of
conformity that is present, combined with payoff-sensitive updating, allows
individuals to track changes in behavioral variants.

[Figure 4: four panels (A–D), each plotting bird (1–10) against turn (0–60).
Panel titles: "Too little conformity," "Too much conformity," "Enough
conformity," and (D) "Individuals sampled from posterior distribution."]

Fig. 4. Simulations of the population consequences of mixes of conformist
social learning and individual reinforcement. In each plot, each row is an
individual agent and each column is a time period. Open and solid circles
represent alternative behavior. Before the vertical dashed line at turn 30,
solid is adaptive. After turn 30, open is adaptive. All groups of learners
initialized with nonadaptive attraction scores, s = 0.5, g = 0.6, and y = 0.
(A) λ = 1, no conformity. (B) λ = 10, high conformity. (C) λ = 5, intermediate
conformity. (D) Ten birds sampled from posterior distribution of the fitted
model.

Finally, selection gradients for social learning weight (s) and conformity
strength (λ) were used to determine whether selection favors larger or smaller
values of each parameter, conditional on the value of the other. We calculated
the selection

gradients by conducting 20,000 simulations at each of 63 combinations of s
and λ (for a total of 1,260,000 simulations) to compute the selection
differential of a mutant (Fig. 5). In each simulation, we considered the
difference in total payoffs between a mutant individual with parameters
s + δs or λ + δλ and an average common-type individual with parameters s and
λ. This difference defines a numerical estimate of the selection gradient for
an invader. The parameters g and y were again fixed at 0.6 and 0. We display
the results in Fig. 5 as a vector field. Selection adjusts combinations of s
and λ in the directions indicated by the arrows, with longer arrows indicating
stronger selection. The red dashed contour marks the combinations of s and λ
at which selection on conformity is neutral. Selection increases conformity
below this contour. The blue dashed contour marks the combinations where
selection on s is neutral. Selection increases s above this contour.
Therefore, selection favors more conformity for most of the gradient space,
becoming disadvantageous only at high weights of social learning. Social
learning weight increases above and to the left of the blue contour. Here,
social learning weight cannot increase from zero unless social learning is
slightly conformist (above the central blue contour); however, once conformity
is above a threshold value, higher social learning weight is favored only up
to a point. These processes combine to produce evolutionary dynamics that
favor conformity combined with an intermediate weighting of social learning
(Fig. 5).

In summary, the birds in the experiment obviously did not evolve in the
experiment, and we do not expect them to be precisely adapted to it.
Nevertheless, these simulations help us to understand why conformist social
learning, in combination with payoff-sensitive individual reinforcement,
facilitated the ability of individuals and groups to track environmental
change.

[Figure 5: vector field with social learning on the horizontal axis (0.0–0.6)
and conformity strength on the vertical axis (1.0–3.0).]

Fig. 5. Selection gradient on social learning weight and conformity,
represented as a vector field for social learning weight (horizontal axis) and
conformity (vertical axis). Selection increases conformity below the outer red
contours. Selection increases social learning above and to the left of the
central blue contour. For these parameter values, selection does not favor
much social learning, unless social learning is also conformist.

Discussion
Our experiment reveals that socially learned foraging traditions in great tits
are flexible in response to environmental change. Indeed, when subpopulations
with strongly established foraging traditions for a single behavioral variant
were subjected to a change in foraging payoffs, after just 14 days 49% of
birds switched their behavior to prefer an alternative higher-payoff variant,
with almost all individuals sampling this option. By modeling the
decision-making process for each individual we show that, perhaps
surprisingly, this population-level flexibility was achieved without
significant asocial sampling and despite an ongoing bias for conformity at the
individual level. Instead, switching depended on two factors. First, there was
an interaction between social information and personal experience, with
individuals that experienced the higher-payoff behavior having a strong
preference for that variant in future solves. Second, there was extensive
individual variation, with those individuals that relied more on social
information showing a weaker conformist bias. These factors allowed some
individuals to switch once fortuitously exposed to, and experiencing, the
high-payoff variant. These individuals then provided the correct social
information for others, leading to a positive feedback loop and eventual
population-level turnover.

There has been extensive speculation about whether a reliance on social
learning can lead to mismatched or out-of-date behavior (23, 25). Conformity
has been thought to potentially exacerbate this process, as conformist
individuals rely on an indirect cue of information quality (the proportion of
individuals exhibiting a behavior) rather than assessing the value of the
information itself (21, 22). However, there has been a paucity of empirical
evidence in nonhuman animals. In the only prior study, guppies (Poecilia
reticulata) were trained on a longer, suboptimal route to reach a feeding
station. This route preference was transmitted and persisted over several days
before eroding toward a faster route (31). It was assumed that this erosion
was associated with asocial learning, but this was not tested. The learning
mechanisms that individuals may be using to optimally exploit variable
environments were more explicitly tested in Rendell et al. (7), where a
computer tournament was used to compete different sets of learning strategies.
Winning strategies invested in social learning over asocial learning, learned
most when individuals experienced a reduced payoff, and relied on recently
acquired information. Most interestingly, strategies did not benefit in
variable environments either by using conformist learning or by preferentially
copying high payoffs.

Our experiment supports the findings from Rendell et al. (7) in two main ways.
First, we found no evidence that individuals used asocial sampling to change
established behavior. This is contrary to previous theory, which suggested
that individuals should use “critical social learning,” switching to
individual innovation under reduced payoffs (27, 34). In Rendell et al. (7),
indiscriminate social learning was adaptive because other agents were
“rational”; that is, they reliably demonstrated the highest-payoff behavior in
their repertoire. Again, our results reflect this finding; individuals
exhibited a clear payoff bias in their personal experience, preferring to use
the high-payoff technique once experiencing both possible options. Second, we
found no evidence that individuals were incorporating information about the
differences in payoffs achieved by different observed demonstrators. Thus, our
results, along with those of ref. 7, suggest that the evolution of adaptive
social learning strategies like “copy the high payoff” may not be necessary
for individuals and populations to cope with temporally variable environments.

Our results differ in one major respect from those of ref. 7; individuals
exhibited a conformist bias in their social information use. This agrees with
our previous experiment conducted under stable payoff conditions, where
individuals were also conformist in their social learning (18). This
conformity did not prevent the population from tracking environmental change.
Rather, our simulations demonstrated that the combination of conformist social
learning with the payoff-sensitive individual reinforcement

we observed allowed naive individuals to learn adaptive behavior, but then
actually promoted their ability to learn new information and switch behavior
once conditions changed. However, our simulations found that evolutionary
dynamics favored conformity only under intermediate social learning. This
reflects the model in Kandler and Laland (35), which also suggested that
conformity bias should be associated with a weaker influence of social
learning (20, 21).

Interestingly, we also found considerable individual variation in these
parameters, with individuals that used the most social information also
generally being less conformist. The role that this between-individual
variation plays in mediating population-level outcomes merits more
investigation. Additionally, future research should investigate whether these
underlying individual differences in learning behavior are consistent across
contexts and whether they relate to other correlates of behavior, for example
differences in personality or in developmental conditions (36, 37).

Our simulations of model parameters therefore suggest that a mix of conformist
social learning and individual reinforcement is sufficient to result in
population-level switches in groups of free-mixing individuals. However, it is
interesting to consider how social structure might have additionally
influenced this process in the wild population. Great tits show a
fission–fusion social structure, with extensive mixing and remixing of small
foraging flocks (38, 39) and with social information moving between
individuals via these foraging associations (18, 40). It seems likely that
this social system might have acted to increase the rate at which the
population could flexibly adjust. That is, if individuals occur in small
groups that frequently fission, then if there is any behavioral heterogeneity
within the population, then even as conformist learners, they will likely have
some opportunity to acquire this alternative information. In species with
highly modular networks, by contrast, social structure could instead act to
slow the rate at which individuals and populations could flexibly adjust their
socially learned behavior, with individuals repeatedly exposed to the same mix
of potential demonstrators when copying.

In addition to social structure, population demography could also influence
the speed at which populations can flexibly adjust socially learned behavior
in a variable environment. Whereas we found no sex differences in learning,
unlike in refs. 41 and 42, younger individuals in our population tended to
show a faster move away from the established low-payoff technique than older
individuals and had a higher probability of preferring the high-payoff
technique by the end of the experiment. All individuals had equal opportunity
in condition 1 to learn and practice the established behavior, and this result
was unrelated to previous experience. Rather, it appears that younger birds
were generally more likely than older individuals to use social information
and, once having experienced the high-payoff technique, were also more able to
flexibly adjust their behavior. As younger individuals are often also more
likely to disperse, such flexibility in behavior could be advantageous when
moving between new habitats (41). More broadly, future work should model the
effects of population demography and social network structure on the ability
of socially learning populations to track environmental change. Indeed, both
population demography and social structure could also potentially be
manipulated, and their effect experimentally tested.

In conclusion, we show that socially learned traditions in wild populations of
great tits will track environmental change. We further find that populations
can track payoffs while individuals remain conformist social learners and use
simulations to elucidate the mechanisms by which this counterintuitive outcome
occurs. Indeed, our results suggest that conformist social learning actually
helps the population adapt to and retain high-payoff behavior, provided it is
not too strong. This adds further weight to arguments that social learning
will be adaptive in a wide range of environments and contexts (7, 12). It is
intriguing to consider what circumstances might therefore promote the
perpetuation of maladaptive traditions. One possibility is that some kinds of
socially learned information might be more vulnerable to this, for example,
where matching group patterns is more important than the absolute adaptive
value of a behavior or where the adaptive value of a behavior is obscure or
delayed. Future work should continue to investigate how general these findings
are to other species, including humans, and explore their possible
implications for the adaptive significance of animal culture.

Materials and Methods
Study System. This study was conducted in a population of great tits
(P. major) at Wytham Woods, near Oxford (51°46′ N, 01°20′ W; 385 ha). This
population has been the subject of a long-term study; all resident great tits
were caught as chicks or adults and fitted with a British Trust for
Ornithology metal leg ring and a plastic leg ring encasing a passive
integrated transponder (PIT) tag (IB Technology). In addition to this main
marking scheme, regular mist netting targeted individuals immigrating into the
population and was also used to age and sex birds (by plumage). Immigrants
could be classed only as first year or older on plumage; however, as most
individuals disperse as relatively young individuals, in all analyses they
were assigned as their youngest possible age based on first capture date. From
autumn to winter, birds form loose flocks of unrelated individuals (38, 39)
with groups aggregating to exploit patchy food sources. In spring and summer,
great tits prefer insect prey, but switch to a seed-based diet in winter when
insects are less available [e.g., beech mast, Fagus sylvatica (40)]. All
experiments mimicked this diet, using live mealworms, unhusked sunflower
seeds, and peanut granules; previous work has established that mealworms are a
highly preferred food type, and sunflower seed is preferred to peanut granules
(18). All work was conducted with relevant ethics approval from the University
of Oxford, and by license holders from the British Trust for Ornithology and
Natural England (Natural England license numbers 20123075, 20131205,
20145171).

Experimental Apparatus. The social learning task consisted of a plastic box
containing a feeder that was accessed by sliding a bidirectional door either
to the left or to the right. The left side of this door was colored blue and
the right side red, and it had a raised front section to allow an easier grip.
A perch in front of the door functioned as a radio-frequency identification
(RFID) antenna registering the identity, visit duration, and action of each
visiting individual; these were recorded and controlled by a printed circuit
board (Stickman Technology) inside the box. One second after a bird was
recorded as having departed from the antenna, the door reset back to the
middle. When installed in the woodland, each puzzle box was surrounded by a
1 × 1-m cage with a 5 × 5-cm mesh to prevent access by larger species, and a
freely accessible bird feeder providing peanut granules was provided at 1 m
distance.

In experimental condition 1, the puzzle-box feeder contained live mealworms.
However, for experimental conditions 2 and 3, the puzzle box was modified to
provide two different rewards, depending on the solving technique. This
modification involved widening the door by ∼1.5 cm; however, no other changes
were made to the puzzle-box interface.

Experimental Design. A social learning and foraging experiment was conducted
in four relatively isolated subpopulations across the woodland, in 4-wk
periods between December 2013 and January 2014 (treatment 1 and treatments 3
and 4) and in a 4-wk period in January 2013 for treatment 2. First, two males
were caught from each subpopulation and trained in captivity to solve a novel
puzzle box: In two subpopulations (T1 and T2), they were trained to solve
using variant A (solving by pushing right from the blue side), whereas in T3
and T4 they were trained to solve using variant B (solving by pushing left
from the red side) (Fig. 1A). All birds were then released to act as the
initial demonstrators for this behavior, and three such puzzle boxes were
installed 250 m apart in each subpopulation. These then continuously operated
from dawn on Monday to dusk on Friday for a total of 20 days (18). In all
areas the solving behavior spread rapidly, with 68–83% (n = 37–96 per
subpopulation) of resident individuals solving either variant at least once.
Puzzle boxes were used frequently, with 7,945–12,411 rewarded visits per
subpopulation; for more detail see ref. 18.

In January and February 2014, these four replicates were then exposed to a
modified puzzle box providing changed rewards. In condition 2 (2 days), this
modified puzzle box provided sunflower seeds as a reward for solving using
either technique. This was followed by condition 3 (14 days): Here


the behavioral variant introduced in the initial experiment was rewarded with
sunflower seed, whereas the alternative technique was rewarded with live
mealworms. In three replicates, these conditions occurred immediately
following the initial experiment. In T2, the experiment occurred 1 y later;
however, the population was given 15 days of exposure to the puzzle box
immediately before condition 2, with 73% of resident individuals solving the
task (either variant) in this period. The results from this replicate were
similar to those from the three other replicates.

Statistical Analysis. To analyze the change over time in individual- and
population-level preferences in condition 3, we used a GEE model where the
dependent variable was the proportion of solves using the established
tradition and explanatory variables were day, replicate, and individuals,
weighted by their total number of solves per day. As the data distribution was
bimodal, we first divided the data into two clusters, using a longitudinal
clustering algorithm that fitted the data for the relative proportion of
events that were variant A for individuals over cumulative 2-h time periods.
This method was implemented using the R package kml3d.

Learning mechanisms underlying individual changes in behavior were analyzed
using a sequential learning model that modeled individual choices as products
of time-varying interactions of different modes of learning. The foundation of
this framework is the experience-weighted attraction learning model (43), but
with additional terms that allow behavior to be guided by social cues.
Specifically, we assume that the probability of observing a choice k at time t
by individual i is given by

$$p_{kit} = (1 - s_i)\, I_{kit} + s_i\, S_{kit}, \tag{1}$$

where $s_i$ is the influence of social cues on choice, $I_{kit}$ the
probability of choice k according only to accumulative individual attraction,
and $S_{kit}$ the probability of k according only to social cues. The
individual attractions are modeled as ordinary experience-weighted attractions
with a simple reinforcement model, such that the attraction score for an
option k at time t and individual i is given by

$$A_{ki,t} = (1 - g_i)\, A_{ki,t-1} + g_i\, \pi_k, \tag{2}$$

where $g_i$ is the importance of newly experienced payoff $\pi_k$. Therefore,
$g_i = 1$ when there is no influence of past experience. Here we estimate both
$g_i$ for each individual i and the unobservable payoff $\pi_k$ to each option
k. Attraction scores at time t imply choice probability by means of a softmax
choice rule:

$$I_{kit} = \frac{\exp(A_{ki,t})}{\sum_n \exp(A_{ni,t})}. \tag{3}$$

In fitting the model, we set the initial attraction scores at time t = 0 for
each individual to the empirical preferences from the first condition. This
accounts for the fact that most individuals begin the second condition with
strong preferences for the formerly high option.

Social cues at time t can influence choice by changing the probability
$S_{kit}$. In the simplest example, conformist learning is modeled as

$$S_{kit} = \frac{n_{kt}^{\lambda}}{\sum_m n_{mt}^{\lambda}}, \tag{4}$$

where $n_{kt}$ is the frequency of choice k among social cues at time t and
$\lambda = \lambda_i$ is individual i’s conformist exponent. When this
exponent is 1, social learning is unbiased by frequency and behavior is
sampled merely in proportion to its occurrence among cues. When, however, the
exponent exceeds 1, social learning is conformist. In this study, we consider
also payoff-biased social learning, which favors the highest-payoff choice
among choices observed at time t. Specifically, we construct a convex
combination of conformist and payoff-bias terms

$$S_{kit} = (1 - y_i)\, \frac{n_{kt}^{\lambda}}{\sum_m n_{mt}^{\lambda}} + y_i\,(1), \tag{5}$$

where $y_i$ is individual i’s reliance on payoff bias. When there is no
variation in social cues, we assume that $y_i = 0$, which means that payoff
bias is active only when observed payoffs vary.

We allow learning strategy to vary at the individual level, estimating $s_i$,
$g_i$, $\lambda_i$, and $y_i$ for each individual i in the sample. In each
case, we construct each parameter such that its log-odds are a linear
combination of an individual random effect and an age-specific offset. For
example, the submodel for $s_i$ is

$$\operatorname{logit}(s_i) = a_{i,1} + b_s x_i, \tag{6}$$

where $x_i$ is the standardized age of individual i and $a_i$ is a vector of
individual random effects, one for each parameter s, g, λ, and y. The
conformity exponent λ is given a log rather than a logit link.

Model fitting was performed using Hamiltonian Monte Carlo, as implemented in
version 2.12 of Stan (44), to draw samples from the posterior distribution. We
assessed convergence by inspection of the trace plots, Gelman–Rubin R̂, and an
estimate of the effective number of samples. Finally, model priors were
defined to be weakly informative and conservative, so that estimated effects
and correlations were shrunk slightly toward zero. Specifically, the averages
for s, g, λ, and y were assigned Normal(0,1) priors on the latent scale. The
standard deviations of each random effect were assigned Exponential(2) priors,
also on the latent scale. For the correlation matrix of random effects, we
used the LKJ family of distributions of matrices and assigned η = 3, which
shrinks correlations away from extreme values near −1 or +1 and toward zero.
For the unobserved payoff advantage of the high-payoff option, we assigned a
Cauchy(0,1) prior, which is essentially uninformative. Code sufficient to
repeat our results is available as an R package, wythamewa, that contains the
data, models, and simulation code.

ACKNOWLEDGMENTS. We thank Keith McMahon, Stephen Lang, and other members of
the Edward Grey Institute for help with various aspects of fieldwork and data
collection and Damien Farine for discussions leading to the formation of the
project and for developing the software for the puzzle boxes. This research
was supported by a grant from the Biotechnology and Biological Sciences
Research Council (BB/L006081/1) (to B.C.S.). L.M.A. was supported by a junior
research fellowship at St. John’s College, University of Oxford.
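
The learning model in expressions 1–5 of Materials and Methods, and the kind of group simulation described for Fig. 4, can be illustrated with a minimal Python sketch. This is not the authors' code (their implementation is the R package wythamewa). All function names, the payoff values, the starting attraction scores, the rule that each bird observes every choice from the previous turn, and the coding of the payoff-bias term as full weight on one "best" option are assumptions made here for illustration only.

```python
import math
import random


def update_attraction(prev, payoff, g):
    """Expression 2: blend the old attraction with the newly experienced payoff."""
    return (1.0 - g) * prev + g * payoff


def softmax(attractions):
    """Expression 3: convert attraction scores into individual choice probabilities."""
    top = max(attractions)  # subtracting the max avoids overflow in exp()
    weights = [math.exp(a - top) for a in attractions]
    total = sum(weights)
    return [w / total for w in weights]


def social_probs(counts, lam, y=0.0, best=None):
    """Expressions 4 and 5: conformist term mixed with a payoff-bias term.

    `best` is the index of the highest-payoff option among those observed.
    Putting all payoff-bias weight on it is an assumption of this sketch.
    """
    powered = [c ** lam for c in counts]
    total = sum(powered)
    conform = [p / total for p in powered]
    if best is None or y == 0.0:
        return conform
    return [(1.0 - y) * c + y * (1.0 if k == best else 0.0)
            for k, c in enumerate(conform)]


def choice_probs(attractions, counts, s, lam, y=0.0, best=None):
    """Expression 1: weight individual and social probabilities by s."""
    ind = softmax(attractions)
    if sum(counts) == 0:  # no cues observed: fall back on own experience (assumption)
        return ind
    soc = social_probs(counts, lam, y, best)
    return [(1.0 - s) * i + s * q for i, q in zip(ind, soc)]


def simulate_group(n_birds=10, n_turns=60, switch_turn=30, s=0.5, g=0.6,
                   lam=5.0, payoffs=(1.0, 2.0), seed=1):
    """Simulate two-option learning; returns one list of 0/1 choices per turn.

    Option 1 pays more until `switch_turn`, after which the payoffs swap.
    Payoff values and initial attractions are hypothetical.
    """
    rng = random.Random(seed)
    pay = list(payoffs)
    attr = [[1.0, 0.0] for _ in range(n_birds)]  # everyone starts preferring option 0
    prev = [0] * n_birds                         # previous turn's choices act as social cues
    history = []
    for t in range(n_turns):
        if t == switch_turn:
            pay.reverse()
        counts = [prev.count(0), prev.count(1)]
        choices = []
        for i in range(n_birds):
            p = choice_probs(attr[i], counts, s, lam)
            k = 0 if rng.random() < p[0] else 1
            attr[i][k] = update_attraction(attr[i][k], pay[k], g)
            choices.append(k)
        prev = choices
        history.append(choices)
    return history
```

For example, `social_probs([3, 1], lam=1.0)` returns `[0.75, 0.25]`, that is, copying in proportion to frequency, whereas `lam=2.0` gives approximately `[0.9, 0.1]`, exaggerating the majority. Calling `simulate_group` with `lam=1.0`, `lam=10.0`, and `lam=5.0` mirrors the three settings described for Fig. 4, although with these arbitrary payoff values the trajectories should not be expected to reproduce the published panels.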

1. Muller CA, Cant MA (2010) Imitation and traditions in wild banded mongooses. Curr Biol 20:1171–1175.
2. Whiten A, et al. (1999) Cultures in chimpanzees. Nature 399:682–685.
3. van Schaik CP, et al. (2003) Orangutan cultures and the evolution of material culture. Science 299:102–105.
4. van de Waal E, Borgeaud C, Whiten A (2013) Potent social learning and conformity shape a wild primate’s foraging decisions. Science 340:483–485.
5. Slagsvold T, Wiebe KL (2007) Learning the ecological niche. Proc Biol Sci 274:19–23.
6. Slagsvold T, Wiebe KL (2011) Social learning in birds and its role in shaping a foraging niche. Philos Trans R Soc Lond B Biol Sci 366:969–977.
7. Rendell L, et al. (2010) Why copy others? Insights from the social learning strategies tournament. Science 328:208–213.
8. McElreath R, et al. (2008) Beyond existence and aiming outside the laboratory: Estimating frequency-dependent and pay-off-biased social learning strategies. Philos Trans R Soc Lond B Biol Sci 363:3515–3528.
9. Boyd R, Richerson P (1985) Culture and the Evolutionary Process (Univ of Chicago Press, Chicago).
10. Henrich J, McElreath R (2003) The evolution of cultural evolution. Evol Anthropol 12:123–135.
11. Laland K (2004) Social learning strategies. Learn Behav 32:4–14.
12. Rendell L, et al. (2011) Cognitive culture: Theoretical and empirical insights into social learning strategies. Trends Cogn Sci 15:68–76.
13. Cantor M, Whitehead H (2013) The interplay between social networks and culture: Theoretically and among whales and dolphins. Philos Trans R Soc Lond B Biol Sci 368:20120340.
14. Rendell L, et al. (2011) How copying affects the amount, evenness and persistence of cultural knowledge: Insights from the social learning strategies tournament. Philos Trans R Soc Lond B Biol Sci 366:1118–1128.
15. Morgan TJH, Laland K (2012) The biological bases of conformity. Front Neurosci 6:1–7.
16. Aplin LM, et al. (2015) Counting conformity: Evaluating the units of information in frequency-dependent social learning. Anim Behav 110:e5–e8.
17. Pike TW, Laland K (2010) Conformist learning in nine-spined stickleback’s foraging decisions. Biol Lett 6:466–468.
18. Aplin LM, et al. (2015) Experimentally induced innovations lead to persistent culture via conformity in wild birds. Nature 518:539–541.
19. Whiten A, Horner V, de Waal FB (2005) Conformity to cultural norms of tool use in chimpanzees. Nature 437:737–740.
20. Nakahashi W, Wakano JY, Henrich J (2012) Adaptive social learning strategies in temporally and spatially varying environments: How temporal vs. spatial variation, number of cultural traits, and costs of learning influence the evolution of conformist-biased transmission, payoff-biased transmission, and individual learning. Hum Nat 23:386–418.
21. Henrich J, Boyd R (1998) The evolution of conformist transmission and the emergence of between-group differences. Evol Hum Behav 19:215–241.
22. Whitehead H, Richerson PJ (2009) The evolution of conformist social learning can cause population collapse in realistically variable environments. Evol Hum Behav 30:261–273.
23. Galef BG (1995) Why behaviour patterns that animals learn socially are locally adaptive. Anim Behav 49:1325–1334.
24. Galef BG (1995) A new model system for studying behavioural traditions in animals. Anim Behav 50:705–717.

25. Laland K (1996) Is social learning always locally adaptive? Anim Behav 52:637–640.
26. Franz M, Matthews LJ (2010) Social enhancement can create adaptive, arbitrary and maladaptive cultural traditions. Proc Biol Sci 277:3363–3372.
27. Enquist M, Eriksson K, Ghirlanda S (2007) Critical social learning: A solution to Rogers’s paradox of nonadaptive culture. Am Anthropol 109:727–734.
28. Giraldeau LA, Valone TJ, Templeton JJ (2002) Potential disadvantages of using socially acquired information. Philos Trans R Soc Lond B Biol Sci 357:1559–1566.
29. Kendal RL, Coolen I, van Bergen Y, Laland KN (2005) Trade-offs in the adaptive use of social and asocial learning. Adv Stud Behav 35:333–379.
30. Kendal J, Giraldeau LA, Laland K (2009) The evolution of social learning rules: Payoff-biased and frequency-dependent biased transmission. J Theor Biol 260:210–219.
31. Laland K, Williams K (1998) Social transmission of maladaptive information in the guppy. Behav Ecol 9:495–499.
32. Bates L, Chappell J (2002) Inhibition of optimal behaviour by social transmission in the guppy depends on shoaling. Behav Ecol 13:827–831.
33. McElreath R, et al. (2005) Applying evolutionary models to the laboratory study of social learning. Evol Hum Behav 26:483–508.
34. Rendell L, Fogarty L, Laland K (2010) Rogers’ paradox recast and resolved: Population structure and the evolution of social learning strategies. Evolution 64:534–548.
35. Kandler A, Laland K (2013) Tradeoffs between the strength of conformity and number of conformists in variable environments. J Theor Biol 332:191–202.
36. Mesoudi A, Chang L, Dall SR, Thornton A (2016) The evolution of individual and cultural variation in social learning. Trends Ecol Evol 31:215–225.
37. Farine DR, Spencer KA, Boogert NJ (2015) Early-life stress triggers juvenile zebra finches to switch social learning strategies. Curr Biol 25:2184–2188.
38. Farine DR, et al. (2015) The role of social and ecological processes in structuring animal populations: A case study from automated tracking of wild birds. R Soc Open Sci 2:150057.
39. Aplin LM, et al. (2015) Consistent individual differences in the social phenotypes of wild great tits, Parus major. Anim Behav 108:117–127.
40. Aplin LM, Farine DR, Morand-Ferron J, Sheldon BC (2012) Social networks predict patch discovery in a wild population of songbirds. Proc Biol Sci 279:4199–4205.
41. Aplin LM, Sheldon B, Morand-Ferron J (2013) Milk-bottles revisited: Social learning and individual variation in the blue tit (Cyanistes caeruleus). Anim Behav 85:1225–1232.
42. Brodin A, Urhan AU (2015) Sex differences in learning ability in a common songbird, the great tit - females are better observational learners than males. Behav Ecol Sociobiol 69:237–241.
43. Camerer C, Ho TH (1999) Experience-weighted attraction learning in normal form games. Econometrica 67:827–874.
44. Stan Development Team (2016) Stan Modeling Language Users Guide and Reference Manual, Version 2.12.0. Available at mc-stan.org/users/documentation/. Accessed June 19, 2017.

A social insect perspective on the evolution of social learning mechanisms
Ellouise Leadbeater (a,1) and Erika H. Dawson (b,2)

(a) School of Biological Sciences, Royal Holloway University of London, Egham TW20 0EX, United Kingdom; and (b) Laboratoire Evolution, Génomes, Comportement et Ecologie, CNRS, 91198 Gif-sur-Yvette, France

Edited by Kevin N. Laland, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark May 29, 2017 (received for review January 14, 2017)

The social world offers a wealth of opportunities to learn from others, and across the animal kingdom individuals capitalize on those opportunities. Here, we explore the role of natural selection in shaping the processes that underlie social information use, using a suite of experiments on social insects as case studies. We illustrate how an associative framework can encompass complex, context-specific social learning in the insect world and beyond, and based on the hypothesis that evolution acts to modify the associative process, suggest potential pathways by which social information use could evolve to become more efficient and effective. Social insects are distant relatives of vertebrate social learners, but the research we describe highlights routes by which natural selection could coopt similar cognitive raw material across the animal kingdom.

social learning | associative learning | observational conditioning | social insects | Bombus

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: E.L. and E.H.D. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. K.N.L. is a guest editor invited by the Editorial Board.
(1) To whom correspondence should be addressed. Email: Elli.Leadbeater@rhul.ac.uk.
(2) Present address: Unité de Modélisation Mathématique et Informatique des Systèmes Complexes, Unité Mixtes Internationales, Institut de Recherche pour le Développement/Université Pierre et Marie Curie 209, 93143 Bondy, France.
7838–7845 | PNAS | July 25, 2017 | vol. 114 | no. 30 | www.pnas.org/cgi/doi/10.1073/pnas.1620744114

An expanding body of work now shows that social learning (1), once considered the preserve of vertebrate species, is a feature of insect behavioral repertoires (2, 3). Insects not only learn about foraging skills, food preferences, brood hosts, and potential mates by responding to information provided inadvertently by others (4–11), but also transmit these behaviors further, such that they propagate through groups (8, 12) and possibly even through wild populations (13). Some of these phenomena appear similar to socially learned behavior patterns that have been described in vertebrates (11, 14) at least outside the context of imitation, and as such they are interesting extensions to the taxonomic distribution of social learning. However, there is more to be gained from these findings than the claim that “insects are smarter than we thought.” An insect perspective encourages questions about the basic cognitive raw material that underlies social learning.

Associative learning theory describes the rules that govern the strength of connections between neural representations of stimuli, and there is little doubt that these rules are taxonomically widespread (15–18). We now know that associative learning is a means to build a complex, representation-based picture of the world that is far from the simple stimulus–response/reinforcement paradigms through which it was originally characterized (19–22). Correspondingly, it has been repeatedly argued that associative explanations are powerful enough to encompass many cognitive phenomena (23–29), and social learning is no exception. Heyes in particular has highlighted that social learning mechanisms may be fundamentally associative (28–30; see also ref. 31). But the contention that no major qualitative leaps are required to distinguish at least some social learning processes from asocial learning is not an argument against evolutionary change. It has been argued that a likely pathway for cognitive evolution may be the accumulation of small quantitative changes that render domain-general processes a closer fit to the specific cognitive demands of a particular niche (32, 33). Because almost all animals interact with others at some point during their lifetime, these demands most likely include effective processing of social information. In other words, whereas there may be little evidence to support a polarized view of asocial- and social learning processes, it does not follow that the ability to learn socially is simply a useful byproduct of the fundamental ability to learn asocially that has remained untouched by natural selection. There is clear empirical evidence that natural selection can fine-tune associative processes to particular tasks (34–36).

For certain social learning processes that typify human behavior, such as imitation or learning through language or instruction, this mechanistic middle ground between adaptation- and preadaptation-based explanations for social learning is the subject of substantial empirical attention (37–40). However, this is not the case for those social learning mechanisms that underlie the majority of social information use outside the primates, such as local or stimulus enhancement, social facilitation, or goal emulation (41). The effects produced by these “simpler” processes are well-labeled (1), but these definitions often tell us little about how the social stimuli upon which they are based come to control behavior. For example, imagine that an individual observes a demonstrator using a tool to get food, and then uses the same tool to extract food for him or herself, without using the same exact actions. This fits a classic definition of “emulation” (1), because observation of a demonstrator interacting with objects in its environment has rendered the observer more likely to perform any actions that have a similar effect on those objects. An associative hypothesis might put forward that perhaps, through the demonstrator’s actions, the observer came to associate the tool with food, and thus manipulated the tool in its own manner (1). This is not an alternative option, but rather a different level of explanation: the label describes the effect, and associative learning theory puts forward a testable hypothesis as to why that effect occurs, on a mechanistic level (39). Understanding those mechanisms is critical to forming a picture of how natural selection has acted upon cognitive raw material, but as yet, such an understanding is lacking. As Galef (41) puts it, “an entire field of local enhancement awaits exploration.”

Addressing this issue first requires exploration of the basic mechanisms through which associative learning could produce adaptive social information use, and here we review a series of case studies carried out by ourselves and by other researchers that specifically illustrate how social learning effects can sit within an associative framework. In the light of claims that students of animal cognition might sometimes be “association-blind” (24), we hope that this approach proves useful not only in highlighting how associative learning theory can be applied to
empirical work on animal social learning, but also in stimulating consideration of how natural selection might modify associative mechanisms. Social learning involves upstream input mechanisms that define how social information is received, and downstream parameters that define how associations are weighted, processed, stored, retrieved, and applied, and all may be potential targets for selection (29, 32, 33). Given the phylogenetic distance between insects and vertebrates (and even between the diverse vertebrate groups within which social learning abilities are manifest), we do not infer that any evolutionary modifications would necessarily be conserved across lineages through common descent; such inference would have little foundation. Rather, our aim is to illustrate the cognitive toolbox from which social learning can be built. We hope to provide some food for thought for those researchers from both the invertebrate and the vertebrate world that are interested in the question of how associative learning could be shaped by natural selection to create something specifically, if not exclusively, social.

An Associative Framework
Bumblebees (Bombus spp.) and honey bees (Apis spp.) forage for nectar and pollen in rapidly changing floral landscapes, where floral reward levels vary not only between flower species but also with season, time of day, local pollinator abundance, and even shade patterns (42). Foragers visit thousands of flowers per day, and quickly learn about the stimuli that predict where to find rewards in the particular flower species and patches on which they forage. Information provided inadvertently by other foraging bees influences this learning process, and a simple example whereby social bees learn from their conspecifics about rewarding flower types provides a good introduction to an associative framework for studying social learning.

Worden and Papaj (5) allowed bumblebees (Bombus impatiens) to observe their foraging conspecifics through a screen. Those conspecifics foraged on only one of two available flower colors, and when the observers were later permitted to forage alone, they “copied” the color preferences of the demonstrators (5, 43, 44). This type of learning initially appears conceptually opaque, because learning theory is based upon the fundamental concept of prediction error, which requires a difference between predicted and experienced outcomes (45). In this observational paradigm, observers do not directly experience any rewarding outcome, nor had these laboratory bees ever had the chance to learn that matching the color choices of conspecifics was rewarding. Thus, the bees in this situation seem to have learned about flower color simply by observing the behavior of their conspecifics.

Second-order conditioning is an associative phenomenon that can potentially explain why animals respond to certain stimuli as though they have been directly associated with food rewards, when they have not. This process is best known for its use in psychological research to study the contents of learning (46), but perhaps the most illustrative example derives from the work of Pavlov (47), who famously trained dogs that the tick of a metronome (a conditioned stimulus, CS) predicted the arrival of food (an appetitive unconditioned stimulus, US+). When Pavlov later trained the same dogs that presentation of a black square (a second conditioned stimulus, CS2) predicted the sound of the metronome in the absence of food, he found that subsequent presentation of the square alone evoked salivation. In other words, the black square (CS2) had come to elicit the same response as the food (US+), despite the two never having been experienced together. An association had formed between the black square and either the food itself or the appetitive state induced by the conditioned state of the metronome (it remains unclear which) (20, 48), and this association constitutes a second-order conditioned relationship. Whereas in classic conditioning paradigms, it is an unconditioned response to a stimulus that comes to be elicited by a new stimulus, in second-order conditioning paradigms, it is a conditioned response that undergoes what is functionally the same effect.

Second-order conditioning is relevant to social learning because it provides a mechanism by which a learned response to a social stimulus, such as conspecific behavior or its products, could come to be elicited by a new asocial stimulus (49). Hence, any biologically important stimulus (US) that an animal has learned to associate with conspecific presence, or with a particular conspecific behavior pattern, or a vocalization (indeed, with any social stimulus at all) could become secondarily conditioned to a new asocial stimulus (a CS2). In Worden and Papaj’s bumblebee example (5), observers might “copy” the color preferences of demonstrators because they have previously learned to associate conspecifics (a CS) with food rewards (US+), perhaps because other bees are found at rewarding feeders or flowers. On subsequent observation of conspecifics on a particular flower color (a CS2), that flower color should be rendered attractive through second-order conditioning, even though it has not been directly paired with food.

We recently carried out an experiment to test this hypothesis (Fig. 1) (43), in which we controlled the previous social foraging experience of observer bees (Bombus terrestris). Individuals that had learned to associate conspecifics (CS) with sucrose (US+) behaved similarly to bees in previous experiments (5), “copying” the color preferences of the demonstrators that they had observed through a screen. However, bees that had learned to associate conspecifics (CS) with an aversive substance (US−) actively avoided those same colors, and bees that had never foraged with conspecifics were not influenced by conspecific choices. In other words, the results support the hypothesis that observing conspecifics through a screen influenced forage preferences through second-order conditioning.

Second-order conditioning is an associative mechanism, and the use of an associative framework to explain empirical results in social learning research is not new. Perhaps the best-known example comes from the work of Cook and Mineka (50–52), who demonstrated almost 30 years ago that juvenile rhesus monkeys (Macaca mulatta) can acquire a fear of snakes through observation of a frightened conspecific interacting with a snake, a phenomenon that they termed “observational conditioning.” As the name implies, these results are considered to reflect a classic conditioning process, whereby the sight of a frightened conspecific (US) elicits an unconditioned fear response in observers. This affective state becomes conditioned to stimuli that are experienced at the same time, in this case a snake model (CS), which thus also acquire the ability to elicit fear (28, 52). In other words, the snake is simply a conditioned stimulus that comes to elicit the fear response. Cook and Mineka (50) found that fear could be conditioned to snakes in this way, but not to flowers, implying that fear cannot be socially conditioned to any randomly chosen stimulus. However, this is not an argument against an associative explanation, because in primates snake stimuli are generally particularly easily conditioned to fear responses, whereas flower stimuli are not (53–56). For example, in humans, pairing of snake pictures with electric shock leads those pictures to later elicit heart-rate acceleration (indicating fear), whereas the same protocol involving flower pictures leads to heart-rate deceleration (indicating arousal of attention) (54, 56). It is thus not surprising that a fear response invoked by a social stimulus can be easily conditioned to snakes but was not detectable for flowers.

What is the link between the second-order conditioning and the observational conditioning process that we describe above? In both cases, a response to a social stimulus becomes conditioned to a new, asocial stimulus. In observational conditioning, an unconditioned response (here fear, elicited by seeing a frightened conspecific) becomes conditioned to a new stimulus. In second-order conditioning, a conditioned appetitive response becomes conditioned to a new stimulus. The two mechanisms are functionally analogous, and the difference lies in whether the initial response to the social stimulus is acquired through learning (a conditioned response) or unlearned


Fig. 1. Second-order conditioning of flower color preferences in bumblebees (43). (A) Observer bees were initially allowed to forage on a floral array in
which the presence of conspecifics (CS1) predicted either sucrose (US+; Upper) or quinine (US−; Lower). A third group completed this training in the absence
of any conspecifics (not pictured). (B) All bees then observed conspecifics (CS1) foraging on one flower color (CS2) and ignoring an alternative, through a glass
screen. (C) Finally, each observer was permitted to forage alone on the colored array.
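
The logic of the three training groups in Fig. 1 can be sketched with a simple delta-rule (Rescorla–Wagner-style) update. This is an illustrative toy model only, not the analysis used in ref. 43. The learning rate, the trial counts, the US values of +1 (sucrose) and −1 (quinine), and the assumption that the learned value of the conspecific cue (CS1) serves as the teaching signal for the flower color (CS2) are all hypothetical choices made for the example.

```python
def second_order_conditioning(us_value, paired=True,
                              n_phase1=20, n_phase2=5, alpha=0.3):
    """Toy delta-rule model; returns (v_cs1, v_cs2) after both phases.

    us_value, alpha and the trial counts are hypothetical illustration values.
    """
    v_cs1 = 0.0  # associative strength of the conspecific cue (CS1)
    v_cs2 = 0.0  # associative strength of the flower color (CS2)

    # Phase 1: foraging with conspecifics; CS1 is paired with the US.
    # Bees trained without conspecifics (paired=False) never update CS1.
    if paired:
        for _ in range(n_phase1):
            v_cs1 += alpha * (us_value - v_cs1)

    # Phase 2: observation through a screen; CS2 co-occurs with CS1, no US.
    for _ in range(n_phase2):
        v_cs2 += alpha * (v_cs1 - v_cs2)  # CS1's value acts as the reinforcer
        v_cs1 += alpha * (0.0 - v_cs1)    # CS1 is unreinforced, so it extinguishes
    return v_cs1, v_cs2


groups = {
    "sucrose": second_order_conditioning(+1.0),
    "quinine": second_order_conditioning(-1.0),
    "no_social_experience": second_order_conditioning(+1.0, paired=False),
}
for name, (v_cs1, v_cs2) in groups.items():
    print(f"{name}: V(CS1)={v_cs1:+.2f}, V(CS2)={v_cs2:+.2f}")
```

Under these assumptions the flower color ends with a positive value in the sucrose group, a negative value in the quinine group, and a value of zero in the group without social foraging experience, which is the qualitative pattern the experiment was designed to detect.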

predisposition [an unconditioned response; although Heyes (29) highlights that presumed predispositions may in fact reflect learning].

Note that although we have framed this argument in terms of classic conditioning, the same case can be made for operant conditioning (57). Matched-dependent behavior (58) describes instances whereby matching of another animal’s behavior (i.e., a response to a social stimulus) is rewarded (e.g., by finding food; a reinforcer) and thus increases in frequency, and Church (59) has shown that this response can become conditioned to the asocial stimulus that initially elicited the demonstrator’s behavior. The key point is that responses to social stimuli, be they acquired through classic conditioning, operant conditioning, or a history of natural selection, are conditioned to new asocial stimuli.

As an illustration, consider the following example (60). Social information from injured conspecifics, in the form of volatiles from stressed conspecifics, typically elicits aversion responses in foraging honey bees (Apis mellifera) (61). We recently found that bees not only avoid areas where they currently detect such volatiles, but also later avoid colored lights that were experienced at the same time (60) (Fig. 2). Thus, the avoidance response, which is initially elicited by the social stimulus, becomes conditioned to a new asocial stimulus. In the laboratory, we used colored lights as asocial stimuli; in the wild, floral features that predict the presence of sit-and-wait predators, such as crab spiders (Thomisidae spp.) could fulfill the same role.

It is important to reiterate that this associative framework complements, rather than adds to, the collection of processes that are labeled as social learning mechanisms, such as local or stimulus enhancement, or social facilitation (1). These labels describe effects, and learning theory offers an explanation for why these effects occur, rather than an additional alternative effect (39). In some cases, an associative framework is already


Fig. 2. Honey bees learn to respond to colored lights through exposure to alarm volatiles (60). (A) We used an assay whereby highly phototactic subjects
walked up a dark tube toward a colored light (balanced blue/green design; only blue is pictured). Warning triangle indicates the presence of volatiles from a
stressed conspecific. (B) In the training phase, bees in groups E1 (experimental) and E2 (control for sensitization/habituation effects) were slower to approach
the light, but only group E1 were slower in the testing phase. Thus, responses were conditioned to the specific stimuli that had been contiguous with stress
volatiles in the training phase.
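
The inference drawn from Fig. 2 rests on stimulus specificity: an associative account predicts slowed approach only to a light color that was contiguous with the volatiles, whereas a nonspecific change such as sensitization would be expected to affect approach to any light. A minimal sketch of that contrast follows. It is a hypothetical illustration, not the protocol or analysis of ref. 60: the delta-rule update, the parameter values, the linear mapping from associative strength to approach latency, and the comparison with a never-paired color are all assumptions.

```python
def train_with_volatiles(paired_color, n_trials=10, alpha=0.3, us_value=-1.0):
    """Return associative strengths per light color after training.

    Only the color present together with the alarm volatiles is updated.
    All parameter values are hypothetical.
    """
    strengths = {"blue": 0.0, "green": 0.0}
    for _ in range(n_trials):
        strengths[paired_color] += alpha * (us_value - strengths[paired_color])
    return strengths


def approach_latency(strength, baseline=10.0, gain=5.0):
    """Map associative strength to a latency in arbitrary units (assumed linear)."""
    return baseline - gain * strength  # negative strength gives a slower approach


strengths = train_with_volatiles("blue")
latency_paired = approach_latency(strengths["blue"])     # color paired with volatiles
latency_unpaired = approach_latency(strengths["green"])  # color never paired
print(latency_paired, latency_unpaired)
```

In this sketch only the paired color acquires an aversive value, so the slowed approach in the test phase is specific to that color, mirroring the reasoning in the caption.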

part of the definition of a particular process (e.g., observational conditioning or matched-dependent behavior) and for others (e.g., local enhancement, stimulus enhancement, emulation, or imitation) it is not. Furthermore, our focus here is on mechanisms that are taxonomically widespread within the animal kingdom, but similar arguments have been made for processes that are typically associated with the primate lineage [32, 39, 40; see also Lotem et al. (33), who make such an argument for imitation within this issue]. Moreover, this associative framework is relevant to any form of social stimulus, irrespective of whether it is visual, olfactory, or auditory, or whether it is a specific behavior pattern, bias, or expression, or simply physical presence. The key concept is that social stimuli elicit affective states, neural representations, or motor patterns, which can then be associated with contiguous asocial stimuli.

Salience of Social Stimuli
Above, we argued that responses to social stimuli can produce a functionally analogous outcome irrespective of whether they arise through unlearned predisposition or through learning, because they can be conditioned to new, asocial stimuli. However, the distinction between such unconditioned and conditioned responses to social stimuli is important for questions about evolved traits. A clear pathway for natural selection might be to implement small quantitative changes that render social associations more likely to be learned, such that information deriving from informative cues is acquired particularly rapidly. One means (but not the only means, as we discuss later) to do so could involve changes to upstream mechanisms that determine whether animals notice or pay attention to social stimuli. We begin with a focus on the question of whether social stimuli are more salient than less-ecologically informative alternatives.

The salience of a stimulus (the property that renders it conspicuous or noticeable, and thus likely to be attended to) is a key determinant of the speed with which it can be associated with other stimuli (62). Salience depends on the species-specific characteristics of the receiver’s sensory system, such that a stimulus that is salient for one taxonomic group may be less so for another (63). For example, carbon disulphide (a component of rat breath) is a particularly salient stimulus for young rats, and is quickly associated with any food flavors that are encountered at the same time (64). Experimental evolution paradigms provide evidence that environments where particular cue types are reliably useful for decision-making select for changes to the attentional or perceptual mechanisms that determine stimulus salience. For example, Drosophila lines for which learning about olfactory stimuli is the best method of identifying a good brood host develop enhanced sensitivity to such cues and ignore visual alternatives, and vice versa (36). In a social context, the question of whether social cues are especially salient has rarely been directly addressed [with the exception of Galef et al. (64)], but a number of bee studies touch upon the topic, with mixed results.

When two conditioned stimuli (e.g., a social and an asocial stimulus) are simultaneously paired with an unconditioned stimulus, the more salient stimulus overshadows learning about the less salient stimulus (47). Accordingly, an experiment by Dunlap and colleagues initially appears to suggest that conspecific presence might be a particularly salient stimulus for bumblebees (65). In this study, subjects (B. impatiens) were trained to find sucrose rewards in floral arrays where both asocial and social cues (floral color and pinned conspecifics, respectively) provided some indication of which flowers were rewarding, before assessing which of the two cue types the bees preferentially used on a test array. When both cue types had been equally reliable at predicting sucrose, the bees disproportionately favored social cues in the tests, and most surprisingly, they also used social cues even when floral cues had been more reliable. Bees only resorted to using floral cues if social cues had been entirely useless predictors of reward. These results would seem to indicate that the social stimulus is the more salient alternative, overshadowing any learning about the floral cue, and correspondingly, the authors discuss why the ecological niche of pollinators might favor reliance on social cues. However, given that floral color is also likely a very useful cue for bees, it seems surprising that social cues should be more salient, and consideration of these findings in the context of blocking (66), another mainstay of associative learning theory, raises a testable alternative explanation.

Association between a CS (e.g., flower color) and a US (e.g., sucrose) is typically impaired if that CS is presented together with another CS that has previously been associated with the US. This phenomenon could potentially explain Dunlap et al.’s (65) results, because the bees have prior experience of social foraging but not of the flower colors in question. As members of laboratory colonies that forage ad libitum from feeders placed in flight arenas when not participating in experiments, individuals would have had an opportunity to associate conspecific presence with reward, but no similar experience of flower colors. During the training, bees should thus learn little about the reliability of the floral CS, because the simultaneous presence of the social CS blocks learning about the asocial cue. The only situation where bees should learn about the floral CS would occur if the social CS became unreliable. In this situation, the association between conspecific presence and sucrose rewards should move toward extinction, allowing for new learning of the association between floral color and sucrose, exactly as Dunlap et al.’s results illustrate.

It may well be the case that bees treat asocial and social cues differently, and post hoc associative explanations should generate testable predictions to be useful. Here, one simple means to rule out a blocking explanation would be to closely control the previous foraging experience of the bees. In this study system, where individuals never leave the flight arena and the colony’s foraging needs can be met by providing sucrose and pollen directly into the nest, it is entirely possible to create individual foragers with no experience of social foraging (43, 67, 68). Finding the same effect under these circumstances would render the case for enhanced salience of social stimuli compelling, and a further experiment by Dawson and Chittka (68) employs such an approach. These authors compared the speed at which bumblebees (B. terrestris) learned to associate either a social CS (the presence of dead, pinned conspecifics) or an asocial CS (a coin/plastic disk/black wooden cuboid) with sucrose, in a free-flying choice paradigm. Bees were more likely to associate the social than asocial CS with reward, and were also more likely to use the social CS to identify rewarding flowers in a transfer test involving a novel target flower color. On realizing that their results could reflect the fact that subjects had previous experience of social foraging from laboratory feeders, Dawson and Chittka repeated their experiment using bees that had never foraged with others, and found that the effect was maintained. Interestingly, pinned honey bee (A. mellifera) demonstrators, which visit comparable but not identical floral resources, elicited similar results, suggesting that natural selection might lead to increased salience of cues that derive from useful heterospecifics, as well as conspecifics.

Social Associations
Salience is an important determinant of the ease with which an association is acquired, but it is by no means the only contributing factor. Whereas salience affects the extent to which a stimulus is made available for learning, learning itself requires that associations between neural representations of stimuli, affective states, or motor patterns, are formed around that stimulus. Natural selection might act upon the parameters that determine the number and timing of exposures that are required before a particular association is committed to memory (69). The key feature of such prepared learning is that certain combinations of stimuli, rather than any particular stimulus alone, elicit rapid learning. For example, consider the oft-cited “Garcia effect,” whereby rats rapidly learn to associate tastes but not

Leadbeater and Dawson PNAS | July 25, 2017 | vol. 114 | no. 30 | 7841
audible tones with gastrointestinal illness. The combination of taste and illness is critical here, and the same effect is not observed when tastes are associated with shock (35, 70), ruling out the suggestion that tastes are simply more salient stimuli than tones. In fact, Garcia and Koelling (35) found that tones were more easily associated with shock than taste was, potentially because loud noises are likely to be a relevant cue to imminent pain, whereas taste is not.

In a social learning context, there are clear candidate hypotheses concerning stimulus combinations that await investigation. For example, above we discussed evidence that bees might acquire associations between nectar rewards and social stimuli more rapidly than associations between nectar rewards and asocial stimuli (65, 68). A prepared learning hypothesis would predict that social stimuli might be particularly easy to associate with sucrose reward levels, but not with an aversive stimulus. Or alternatively, perhaps the sign of the CS–US relationship is also important, such that bees learn positive sucrose/social cue relationships easily but not negative ones. These latter two alternatives (that bees are particularly sensitive to social CSs when learning about where to find food, or that they are particularly likely to learn positive social CS-sucrose combinations) are qualitatively different traits. Theory correspondingly predicts that they should evolve under different circumstances (36, 71). To visualize the difference, consider again the Garcia effect (35). Although many stimuli might precede a feeling of illness, the true cause will often be related to recently eaten food, so an a priori prioritization of associations between taste and illness, rather than sound and illness, makes evolutionary sense. Now consider the situation where a particular flavor always predicts illness. Here, theory would predict the fixation of an aversion to that flavor, rather than prepared learning (71). Thus, a priori expectations and prepared learning about social stimuli are both means by which natural selection could facilitate social learning processes, but the ecological conditions that favor their evolution are distinct. Teasing apart the roles of stimulus salience, prepared learning, and a priori expectation is a task that invites empirical exploration in our social insect system and in others.

Retrieval and Implementation of Learned Associations
We have discussed potential pathways by which natural selection could modify stimulus salience, or the downstream learning parameters that influence memory formation, and suggested means by which such hypotheses could be explored. However, we have not yet touched upon the possibility that selection could produce modifications to the final retrieval or implementation of learned information. This is particularly important in light of the large volume of literature on “social learning strategies” that describes how animals use social information most often in those situations in which it is most beneficial (72, 73). We begin with an example that illustrates how associative processes could bring about such context-specificity.

As we have alluded to in an earlier example in this paper, foraging bees suffer predation by camouflaged sit-and-wait predators that ambush individuals as they land on flowers to feed. Dawson and Chittka (74) allowed bees to forage in environments that mimicked high or low risk of such predation, and found that bumblebees (B. terrestris) appear to use social information to identify safe flowers specifically in dangerous environments. Their subjects were initially trained on an array where landing on flowers of one color morph led to brief capture in a pressure trap (simulating spider attack), whereas an alternative color morph was safe. Note that the color morphs simulate flower species with different levels of spider occupancy, rather than spiders themselves, which are typically cryptic. When subsequently tested on an array containing only flowers of the dangerous morph, bees strongly preferred to land on the single flower where a live demonstrator could be seen feeding, an effect that was entirely absent when tested on flowers of the safe morph (Fig. 3). In other words, bees seemed to use social information adaptively when foraging on flower types where predation was a real threat. Bees could not have simply learned that the social cue was useful on dangerous flowers but irrelevant on safe ones because individuals were trained on the mixed array entirely alone.

How might associations between stimuli produce such context-specific and potentially adaptive behavior? Conditioned suppression is an associative phenomenon widely used to study the acquisition and extinction of fear, and it predicts exactly the effect that Dawson and Chittka (74) demonstrated in bumblebees (30). In the presence of a cue predicting an aversive stimulus that an animal would usually avoid, learned food-seeking behaviors are typically repressed, although they do not disappear altogether (75). Thus, a rat that has learned to press a lever for food typically reduces the frequency of pressing when a light that predicts foot shock is turned on (76). Similarly, when under predation pressure from a threat that is hard to directly detect, bumblebees increase both their latency to probe and rejection rate of all flowers of the color morph associated with danger (77, 78). With this in mind, picture a bee that enters a flight arena filled with flowers of the safe color morph, and happens to first come across an unoccupied flower. Because the bee has learned that the holes in the flowers reliably contain sucrose rewards, it is very likely to land and feed. If it instead first comes across an occupied flower where a demonstrator is foraging, it is also very likely to land and feed. Even if the presence of the demonstrator renders flowers much more attractive, the difference in acceptance rates between unoccupied (very attractive) and occupied (extremely attractive) flowers will be hard to detect experimentally, because all flowers are very likely to be accepted. Now consider an environment where the floral color morph predicts danger. All conditioned responses will be suppressed, such that flowers in general are less likely to be accepted. In this situation, if the presence of a demonstrator bee is attractive, the effect is much more likely to be detected experimentally (Fig. 4), because it is no longer the case that a bee almost invariably accepts the first flower that it encounters.

This associative hypothesis does not require that selection deriving from risky contexts has influenced the weight that bees ascribe to social information. It is simply an alternative to the suggestion that individuals strategically respond to the circumstances in which they find themselves by computing the likely pay-offs of social information use. However, our hypothesis again generates a testable prediction. If dangerous environments simply render the effect of an attractive social stimulus more detectable, the same should be true for an attractive asocial stimulus. Thus, replacing conspecific demonstrators with asocial stimuli that have previously been conditioned to sucrose should produce analogous results.

A study by Smolla et al. (79) employs exactly this approach in a different social learning context. Based on the premise that bees should use a “copy-when-uncertain” strategy, which follows from an agent-based model that they develop, these authors pretrained bees (B. terrestris) that either a social (model bee) or an asocial (green rectangle) cue predicted reward in a floral array. Half of the bees in each group were then trained that the floral array contained highly variable rewards and half learned that rewards were constant, in the absence of both cue types. When subsequently presented with a nonrewarding test array, those bees that had learned that rewards were variable used the social cue to find food, but those that had experienced the constant array did not. Their results thus support a “copy-when-uncertain” interpretation, but as pointed out above, the fact that more variable rewards render flowers less attractive (80, 81) means that the use of any cue, not just social ones, should be rendered more detectable. However, crucially, the difference between contexts was much less evident for the asocial cue and did not reach statistical significance. This difference seems unlikely to be attributable to greater salience or associability of the social cue, because Smolla et al. (79) state that pretraining was similarly successful for both cue types. Smolla et al.’s results invite further exploration that has not yet

7842 | www.pnas.org/cgi/doi/10.1073/pnas.1620744114 Leadbeater and Dawson


Fig. 3. Social information use varied with context (74). Bees were trained that sucrose could be obtained by extending the proboscis through holes demarked by colored squares. During training, one color (here yellow) predicted brief capture in a pressure trap. Bees were then tested in either a “dangerous” or a “safe” environment, where one live demonstrator was foraging at a single location.

been carried out, but their approach of comparing context-specific responses to social and asocial stimuli is a promising one that seems a useful way to evaluate the specific characteristics that govern learning about social stimuli.

Competing Hypotheses
As a whole, the insect-based studies that we have discussed illustrate the power of associative learning to generate adaptive behavior. There is nothing new about associative explanations for social learning phenomena; Heyes in particular has long championed this approach (24, 28–30, 39, 40). However, the fact that a hypothesis has explanatory power is not evidence of its truth, and consequently, debate arises over whether associative learning should be accredited with the status of a “default” explanation for social learning phenomena, to be assumed true unless proven false (24). Arguments that favor this approach in animal cognition are typically based on Morgan’s canon, which states that “no animal activity should be interpreted in terms of higher psychological processes if it can be fairly interpreted in terms of processes that stand lower in the scale of psychological evolution and development” (82), but this has led to debate over whether Morgan’s canon itself is up to the task (24).

Perhaps such a polarized perspective is less productive than it could be. If there were clear evidence that social learning processes were more efficient in species where social information repeatedly presents itself than in more solitary species (for

[Fig. 4 plot. Panels: (A) Safe, (B) Dangerous. Vertical axis: Probability of Acceptance. Horizontal axis: Unoccupied vs. Occupied flowers.]

Fig. 4. Conditioned suppression should render social effects more detectable when aversive stimuli are present. (A) Safe environment. We assume that bees have
learned to associate the feeding holes in artificial flowers with sucrose rewards. Therefore, in the absence of aversive stimuli, the chances of acceptance on en-
countering a flower are high (here set at 0.9, for illustration purposes). The presence of a demonstrator is also attractive (here set at 0.45), perhaps because bees have
previously learned to associate conspecifics with sucrose. If two appetitive conditioned stimuli are presented together, the subject’s expectation is equal to their
combined strength (62), so occupied flowers are very attractive, but a probability of acceptance cannot exceed 1. Thus, the detectable effect of demonstrator presence
is small (arrow). (B) Dangerous environment. The presence of an aversive stimulus (the dangerous flower color) reduces all responses for food (here, suppression ratio
has been set at 0.5). Thus, the difference in the probability of acceptance between the unoccupied and occupied flowers is now relatively more detectable (arrow).
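
The arithmetic behind Fig. 4 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' analysis: the names `acceptance_probability` and `demonstrator_effect` are hypothetical, the numbers (0.9, 0.45, 0.5) are the illustrative values given in the caption, and we assume that the suppression value acts as a simple multiplier on the summed associative strength before the probability is capped at 1.

```python
# Illustrative values taken from the Fig. 4 caption.
FLOWER_STRENGTH = 0.9         # attraction to the feeding hole alone
DEMONSTRATOR_STRENGTH = 0.45  # added attraction of a demonstrator
SUPPRESSION = 0.5             # assumed multiplier in the dangerous context


def acceptance_probability(flower, demonstrator=0.0, suppression=1.0):
    """Return the probability of accepting a flower.

    Assumption: appetitive strengths sum, the aversive context scales
    the sum by `suppression`, and a probability cannot exceed 1.
    """
    combined = flower + demonstrator
    return min(1.0, suppression * combined)


def demonstrator_effect(suppression):
    """Return (unoccupied, occupied, difference) for one context."""
    unoccupied = acceptance_probability(FLOWER_STRENGTH, 0.0, suppression)
    occupied = acceptance_probability(
        FLOWER_STRENGTH, DEMONSTRATOR_STRENGTH, suppression
    )
    return unoccupied, occupied, occupied - unoccupied


if __name__ == "__main__":
    for label, s in (("safe", 1.0), ("dangerous", SUPPRESSION)):
        u, o, d = demonstrator_effect(s)
        print(f"{label}: unoccupied={u:.3f} occupied={o:.3f} difference={d:.3f}")
```

Under these assumptions the demonstrator raises acceptance by only 0.1 in the safe environment, because the ceiling of 1 truncates the combined strength, but by 0.225 in the dangerous environment, where suppression moves both values away from the ceiling.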

example, in the social vs. solitary insects, which span multiple origins of sociality), it might be argued that the two are sufficiently different traits to justify discussion of evolutionary parsimony. However, such evidence is sparse. Most studies that compare learning efficiency across social and asocial contexts conclude that the two are highly correlated (83–85; but see also ref. 86). Here, we have suggested that a productive way forward might be to search for small quantitative differences between associative processes in asocial and social contexts, rather than qualitative leaps. Heyes (29) has highlighted that the pathway of least resistance for natural selection might be to modify the input mechanisms that determine which aspects of the world are made available for learning, contrasting such processes with learning mechanisms that determine how stimuli are linked and committed to memory. Increased salience of social stimuli is one such input mechanism, but increased salience of social stimuli could be evolutionarily advantageous for many reasons other than acquiring information, such as recognizing kin, selecting mates, or defending a territory (87). Perhaps more convincing evidence that natural selection has shaped responses to social information would be evidence of prepared learning about social stimuli, of unconditioned responses to social stimuli that were specific to a social learning context, or of social learning strategies that cannot be accounted for by associative learning theory.

Our choice of study taxon, the social insects, has meant that we have made little mention of processes that characterize mainly primate behavior, such as imitation and (to a lesser extent) emulation. Nonetheless, very recent work has begun to focus on potentially emulative behavior even in bees (8, 9), and it is not our intention to imply that such processes follow a different evolutionary pathway to other forms of social learning. In fact, associative explanations for imitation are prominent in the psychology literature [40, 88; see also Lotem et al. (33)]. Explaining how individuals copy a novel sequence of actions through imitation invokes a “correspondence problem” (89, 90) because the seen movements of others must somehow be matched to motor representations of self-movements. Associative models of imitation propose that such links could arise through previous experience of contingency between performing an action and seeing it performed (91). For example, an infant might often observe others smiling when she or he smiles, leading to association between the visual and motor representations of smiling. She or he will also typically observe a raised arm each time that she or he raises her or his own arm, or react to an unexpected surprise with the same expressions as others nearby. This elegant idea generates both theoretical debate and extensive empirical exploration. However, imitation is but one social learning mechanism, and this intensive exploration of mechanism is not the norm. It is surprising that for the vast majority of cases in which animals respond to social information, we understand very little about underlying mechanisms at all (41). To return to Galef’s statement that “an entire field of local enhancement awaits exploration” (41), we suggest that this exploration begins with a full understanding of how reported “simpler” social learning phenomena fit into an associative framework.

Conclusion
We became interested in insect social learning based on a thought-provoking anecdotal observation that the acceptance of a laboratory feeder by a single bumblebee seemed to render that feeder popular, rather than through careful choice of a study system. It was surprising to subsequently find that insects seemed capable of what we considered a relatively sophisticated cognitive process. However, questions about cognition in invertebrates inevitably involve detailed dissection of mechanisms, and a critical feature of this system has been the ability to closely control the previous social foraging experience of experimental subjects in a way that would not be possible for many species. We hope that the body of research presented above shows that there was value in the novel perspective offered by a rather unlikely model. The studies that we have described are based on a small-brained study organism, but one that excels at accomplishing the cognitive tasks that are relevant to its own ecological niche (92), that is highly social, and that offers exceptional experimental tractability. Other insect systems offer similar advantages for investigating social learning contexts that we have not mentioned in detail here; for example, a large literature now documents the existence of remarkably rapid and generalizable mate-choice copying in Drosophila fruit flies (3, 11, 12). It may well be the case that social learning abilities have traveled further along some evolutionary routes in other lineages, and less far in others, but the rules that govern associative learning are taxonomically widespread within the animal kingdom, and natural selection may well coopt the same cognitive raw material across multiple evolutionary lineages.

ACKNOWLEDGMENTS. We thank the organizers and funders of the Arthur M. Sackler Colloquium on “The Extension of Biology Through Culture,” from which this paper derives, the many participants at the colloquium who provided informative feedback and discussion, and Simon Reader for comments on an earlier draft of the manuscript. Several of the empirical studies presented here were coauthored by Lars Chittka, and the ideas that we have discussed owe much to his direction and guidance. E.L. is funded by European Research Council Starting Grant BeeDanceGap, and the empirical work described here also derives from an Early Career Fellowship from The Leverhulme Trust. E.H.D.’s current position is funded by a Fyssen foundation postdoctoral fellowship.

1. Hoppitt W, Laland K (2013) Social Learning: An Introduction to Mechanisms, Methods and Models (Princeton Univ Press, Princeton, NJ).
2. Grüter C, Leadbeater E (2014) Insights from insects about adaptive social information use. Trends Ecol Evol 29:177–184.
3. Leadbeater E, Chittka L (2007) Social learning in insects: From miniature brains to consensus building. Curr Biol 17:R703–R713.
4. Mery F, et al. (2009) Public versus personal information for mate copying in an invertebrate. Curr Biol 19:730–734.
5. Worden BD, Papaj DR (2005) Flower choice copying in bumblebees. Biol Lett 1:504–507.
6. Leadbeater E, Chittka L (2007) The dynamics of social learning in an insect model, the bumblebee (Bombus terrestris). Behav Ecol Sociobiol 61:1789–1796.
7. Leadbeater E, Chittka L (2008) Social transmission of nectar-robbing behaviour in bumble-bees. Proc Biol Sci 275:1669–1674.
8. Alem S, et al. (2016) Associative mechanisms allow for social learning and cultural transmission of string pulling in an insect. PLoS Biol 14:e1008529; erratum in 14(12):e1008529.
9. Loukola OJ, Perry CJ, Coscos L, Chittka L (2017) Bumblebees show cognitive flexibility by improving on an observed complex behavior. Science 355:833–836.
10. Battesti M, Moreno C, Joly D, Mery F (2015) Biased social transmission in Drosophila oviposition choice. Behav Ecol Sociobiol 69:83–87.
11. Dagaeff A-C, et al. (2016) Drosophila mate copying correlates with atmospheric pressure in a speed-learning situation. Anim Behav 121:163–173.
12. Battesti M, Moreno C, Joly D, Mery F (2012) Spread of social information and dynamics of social transmission within Drosophila groups. Curr Biol 22:309–313.
13. Goulson D, Park KJ, Tinsley MC, Bussiere LF, Vallejo-Marin M (2013) Social learning drives handedness in nectar-robbing bumblebees. Behav Ecol Sociobiol 67:1141–1150.
14. Danchin E, Giraldeau LA, Valone TJ, Wagner RH (2004) Public information: From nosy neighbors to cultural evolution. Science 305:487–491.
15. Thompson R, McConnell J (1955) Classical conditioning in the planarian, Dugesia dorotocephala. J Comp Physiol Psychol 48:65–68.
16. Kemenes G, Benjamin PR (1989) Appetitive learning in snails shows characteristics of conditioning in vertebrates. Brain Res 489:163–166.
17. Walters ET, Carew TJ, Kandel ER (1981) Associative learning in Aplysia: Evidence for conditioned fear in an invertebrate. Science 211:504–506.
18. Spatz HC, Emanns A, Reichert H (1974) Associative learning of Drosophila melanogaster. Nature 248:359–361.
19. Rescorla RA (1988) Pavlovian conditioning. It’s not what you think it is. Am Psychol 43:151–160.
20. Holland PC, Sherwood A (2008) Formation of excitatory and inhibitory associations between absent events. J Exp Psychol Anim Behav Process 34:324–335.
21. Timberlake W (1994) Behavior systems, associationism, and Pavlovian conditioning. Psychon Bull Rev 1:405–420.
22. Pearce JM, Bouton ME (2001) Theories of associative learning in animals. Annu Rev Psychol 52:111–139.
23. Dickinson A (2012) Associative learning and animal cognition. Philos Trans R Soc Lond B Biol Sci 367:2733–2742.

24. Heyes C (2012) Simple minds: A qualified defence of associative learning. Philos Trans R Soc Lond B Biol Sci 367:2695–2703.
25. Dwyer D, Burgess KV (2011) Rational accounts of animal behaviour? Lessons from C. Lloyd Morgan’s canon. Int J Comp Psychol 24:349–364.
26. Guilford T, Burt De Perera T (2016) An associative account of avian navigation. J Avian Biol 48:191–195.
27. Shettleworth SJ (2010) Clever animals and killjoy explanations in comparative psychology. Trends Cogn Sci 14:477–481.
28. Heyes CM (1994) Social learning in animals: Categories and mechanisms. Biol Rev Camb Philos Soc 69:207–231.
29. Heyes C (2012) What’s social about social learning? J Comp Psychol 126:193–202.
30. Heyes C, Pearce JM (2015) Not-so-social learning strategies. Proc Biol Sci 282:7.
31. Galef BG (1992) The question of animal culture. Hum Nat 3:157–178.
32. Lotem A, Halpern JY (2012) Coevolution of learning and data-acquisition mechanisms: A model for cognitive evolution. Philos Trans R Soc Lond B Biol Sci 367:2686–2694.
33. Lotem A, Halpern JY, Edelman S, Kolodny O (2017) The evolution of cognitive mechanisms in response to cultural innovations. Proc Natl Acad Sci USA 114:7915–7922.
34. Mery F, et al. (2007) Natural polymorphism affecting learning and memory in Drosophila. Proc Natl Acad Sci USA 104:13051–13055.
35. Garcia J, Koelling RA (1966) Relation of cue to consequence in avoidance learning. Psychon Sci 4:123–124.
36. Dunlap AS, Stephens DW (2014) Experimental evolution of prepared learning. Proc Natl Acad Sci USA 111:11750–11755.
37. Truskanov N, Lotem A (2017) Trial-and-error copying of demonstrated actions reveals how fledglings learn to ‘imitate’ their mothers. Proc Biol Sci 284:20162744.
38. Ho MK, MacGlashan J, Littman ML, Cushman F (March 21, 2017) Social is special: A normative framework for teaching with and learning from evaluative feedback. Cognition, 10.1016/j.cognition.2017.03.006.
39. Heyes C (2017) When does social learning become cultural learning? Dev Sci 20:e12350.
40. Cook R, Bird G, Catmur C, Press C, Heyes C (2014) Mirror neurons: From origin to function. Behav Brain Sci 37:177–192.
41. Galef BG (2013) Imitation and local enhancement: Detrimental effects of consensus definitions on analyses of social learning in animals. Behav Processes 100:123–130.
42. Heinrich B (1979) Bumblebee Economics (Harvard Univ Press, Cambridge, MA).
43. Dawson EH, Avarguès-Weber A, Chittka L, Leadbeater E (2013) Learning by observation emerges from simple associations in an insect model. Curr Biol 23:727–730.
44. Avarguès-Weber A, Chittka L (2014) Observational conditioning in flower choice copying by bumblebees (Bombus terrestris): Influence of observer distance and demonstrator movement. PLoS One 9:e88415.
45. Schultz W, Dickinson A (2000) Neuronal coding of prediction errors. Annu Rev Neurosci 23:473–500.
46. Rescorla R (1980) Pavlovian Second-Order Conditioning: Studies in Associative Learning (Lawrence Erlbaum Associates, Hillsdale, NJ).
47. Pavlov IP (1927) Conditioned Reflexes (Oxford Univ Press, Oxford).
48. Winterbauer NE, Balleine BW (2005) Motivational control of second-order conditioning. J Exp Psychol Anim Behav Process 31:334–340.
49. Chittka L, Leadbeater E (2005) Social learning: Public information in insects. Curr Biol 15:R869–R871.
50. Cook M, Mineka S (1989) Observational conditioning of fear to fear-relevant versus fear-irrelevant stimuli in rhesus monkeys. J Abnorm Psychol 98:448–459.
51. Mineka S, Cook M (1988) Social learning and the acquisition of snake fear in monkeys. Social Learning: Psychological and Biological Perspectives, eds Zentall TR, Galef BG (Lawrence Erlbaum Associates, Hillsdale, NJ), pp 51–75.
52. Mineka S, Cook M (1993) Mechanisms involved in the observational conditioning of fear. J Exp Psychol Gen 122:23–38.
53. Ohman A (2009) Of snakes and faces: An evolutionary perspective on the psychology of fear. Scand J Psychol 50:543–552.
54. Cook EW, 3rd, Hodes RL, Lang PJ (1986) Preparedness and phobia: Effects of stimulus content on human visceral conditioning. J Abnorm Psychol 95:195–207.
55. Tomarken AJ, Sutton SK, Mineka S (1995) Fear-relevant illusory correlations: What types of associations promote judgmental bias? J Abnorm Psychol 104:312–326.
56. Ohman A, Mineka S (2003) The malicious serpent: Snakes as a prototypical stimulus for an evolved module of fear. Curr Dir Psychol Sci 12:5–9.
57. Zentall TR, Galef BG, eds (1988) Social Learning: Psychological and Biological Perspectives (Lawrence Erlbaum Associates, Hillsdale, NJ).
58. Miller NE, Dollard J (1941) Social Learning and Imitation (Yale Univ Press, New Haven, CT).
59. Church RM (1968) Applications of behaviour theory to social psychology. Social Facilitation and Imitative Behavior, eds Simmel EC, Hoppe RA, Milton GD (Allyn & Bacon, Boston).
60. Dawson EH, Chittka L, Leadbeater E (2016) Alarm substances induce associative social learning in honeybees, Apis mellifera. Anim Behav 122:17–22.
61. Balderrama N, et al. (1996) A deterrent response in honeybee (Apis mellifera) foragers: Dependence on disturbance and season. J Insect Physiol 42:463–470.
62. Rescorla RA, Wagner AR (1972) A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. Classical Conditioning II: Current Research and Theory, eds Black AH, Prokasy WF (Appleton-Century-Crofts, New York), pp 64–99.
63. Shettleworth S (2010) Cognition, Evolution and Behavior (Oxford Univ Press, Oxford).
64. Galef BG, Jr, Mason JR, Preti G, Bean NJ (1988) Carbon disulfide: A semiochemical mediating socially-induced diet choice in rats. Physiol Behav 42:119–124.
65. Dunlap AS, Nielsen ME, Dornhaus A, Papaj DR (2016) Foraging bumble bees weigh the reliability of personal and social information. Curr Biol 26:1195–1199.
66. Kamin LJ (1969) Predictability, surprise, attention, and conditioning. Punishment and Aversive Behavior, eds Campbell BA, Church RM (Appleton-Century-Crofts, New York), pp 279–296.
67. Leadbeater E, Chittka L (2011) Do inexperienced bumblebee foragers use scent marks as social information? Anim Cogn 14:915–919.
68. Dawson EH, Chittka L (2012) Conspecific and heterospecific information use in bumblebees. PLoS One 7:e31444.
69. Domjan M, Galef BG (1983) Biological constraints on instrumental and classical-conditioning: Retrospect and prospect. Anim Learn Behav 11:151–161.
70. Dwyer DM (2015) Experimental evolution of sensitivity to a stimulus domain alone is not an example of prepared learning. Proc Natl Acad Sci USA 112:E385.
71. Dukas R (1998) Evolutionary ecology of learning. Cognitive Ecology, ed Dukas R (Univ of Chicago Press, Chicago), pp 129–174.
72. Laland KN (2004) Social learning strategies. Learn Behav 32:4–14.
73. Kendal RL, Coolen I, Laland KN (2009) Adaptive trade-offs in the use of social and personal information. Cognitive Ecology II, eds Dukas R, Ratcliffe JM, pp 249–271.
74. Dawson EH, Chittka L (2014) Bumblebees (Bombus terrestris) use social information as an indicator of safety in dangerous environments. Proc Biol Sci 281:20133174.
75. Annau Z, Kamin LJ (1961) Conditioned emotional response as a function of intensity of US. J Comp Physiol Psych 54:428–432.
76. Estes WK, Skinner BF (1941) Some quantitative properties of anxiety. J Exp Psychol 29:390–400.
77. Ings TC, Chittka L (2008) Speed-accuracy tradeoffs and false alarms in bee responses to cryptic predators. Curr Biol 18:1520–1524.
78. Lenz F, Ings TC, Chittka L, Chechkin AV, Klages R (2012) Spatio-temporal dynamics of bumblebees foraging under predation risk. Phys Rev Lett 108:098103.
79. Smolla M, Alem S, Chittka L, Shultz S (2016) Copy-when-uncertain: Bumblebees rely on social information when rewards are highly variable. Biol Lett 12:20160188.
80. Shafir S, Wiegmann DD, Smith BH, Real LA (1999) Risk-sensitive foraging: Choice behaviour of honeybees in response to variability in volume of reward. Anim Behav 57:1055–1061.
81. Seefeldt S, De Marco RJ (2008) The response of the honeybee dance to uncertain rewards. J Exp Biol 211:3392–3400.
82. Lloyd Morgan C (1909) An Introduction to Comparative Psychology (Walter Scott Publishing, London).
83. Lefebvre L, Giraldeau L-A (1996) Is social learning an adaptive specialization? Social Learning in Animals: The Roots of Culture, eds Heyes CM, Galef BG, Jr (Academic, London), pp 107–128.
84. Reader SM, Hager Y, Laland KN (2011) The evolution of primate general and cultural intelligence. Philos Trans R Soc Lond B Biol Sci 366:1017–1027.
85. Reader SM (2003) Innovation and social learning: Individual variation and brain evolution. Animal Biology 53:147–158.
86. Templeton JJ, Kamil AC, Balda RP (1999) Sociality and social learning in two species of corvids: the pinyon jay (Gymnorhinus cyanocephalus) and the Clark’s nutcracker (Nucifraga columbiana). J Comp Psychol 113:450–455.
87. Leadbeater E (2015) What evolves in the evolution of social learning? J Zool (Lond) 295:4–11.
88. Heyes C (2016) Homo imitans? Seven reasons why imitation couldn’t possibly be associative. Philos Trans R Soc Lond B Biol Sci 371:20150069.
89. Brass M, Heyes C (2005) Imitation: Is cognitive neuroscience solving the correspondence problem? Trends Cogn Sci 9:489–495.
90. Nehaniv CL, Dautenhahn K (2002) The correspondence problem. Imitation in Animals and Artifacts, eds Dautenhahn K, Nehaniv CL (MIT Press, Cambridge, MA).
91. Catmur C, Walsh V, Heyes C (2009) Associative sequence learning: The role of experience in the development of imitation and the mirror system. Philos Trans R Soc Lond B Biol Sci 364:2369–2380.
92. Chittka L, Thomson JD (2001) Cognitive Ecology of Pollination (Cambridge Univ Press, Cambridge, UK).

Cultural macroevolution matters
Russell D. Gray (a,b,c,1) and Joseph Watts (a,d)
(a) Department of Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Jena D-07745, Germany; (b) School of Psychology, University of Auckland, Auckland 1142, New Zealand; (c) Research School of the Social Sciences, Australian National University, Canberra, ACT 2601, Australia; and (d) Department of Experimental Psychology, University of Oxford, Oxford OX1 3PH, United Kingdom

Edited by Andrew Whiten, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark May 29, 2017
(received for review January 16, 2017)

Evolutionary thinking can be applied to both cultural microevolution and macroevolution. However, much of the current literature focuses on cultural microevolution. In this article, we argue that the growing availability of large cross-cultural datasets facilitates the use of computational methods derived from evolutionary biology to answer broad-scale questions about the major transitions in human social organization. Biological methods can be extended to human cultural evolution. We illustrate this argument with examples drawn from our recent work on the roles of Big Gods and ritual human sacrifice in the evolution of large, stratified societies. These analyses show that, although the presence of Big Gods is correlated with the evolution of political complexity, in Austronesian cultures at least, they do not play a causal role in ratcheting up political complexity. In contrast, ritual human sacrifice does play a causal role in promoting and sustaining the evolution of stratified societies by maintaining and legitimizing the power of elites. We briefly discuss some common objections to the application of phylogenetic modeling to cultural evolution and argue that the use of these methods does not require a commitment to either gene-like cultural inheritance or to the view that cultures are like vertebrate species. We conclude that the careful application of these methods can substantially enhance the prospects of an evolutionary science of human history.

cultural evolution | macroevolution | phylogenetics | religion | Big Gods

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: R.D.G. and J.W. designed research, performed research, analyzed data, and wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. A.W. is a guest editor invited by the Editorial Board.
Data deposition: The data reported in this paper have been deposited in the D-PLACE (https://d-place.org/home), Pulotu (https://pulotu.shh.mpg.de), and ABVD (https://abvd.shh.mpg.de/austronesian/) databases.
(1) To whom correspondence should be addressed. Email: gray@shh.mpg.de.

Darwin’s On the Origin of Species ends with the poetic phrase, “From so simple a beginning endless forms most beautiful and most wonderful have been, and are being, evolved” (1). The central challenge for evolutionary biology is to explain this diversity of endless forms. Evolutionary biologists tackle this task by studying both microevolution (changes in gene frequency within a population) and macroevolution (changes between species over much longer time periods). The aim is to have a mechanistic understanding of the evolution of biological diversity that integrates microlevel processes and macrolevel patterns. This work examines ways in which evolutionary thinking and methods can be extended into the realm of culture, extending the scope of biology to include questions that have traditionally been restricted to the humanities and social sciences. Human cultures also display a vast variety of most beautiful and most wonderful forms. We speak ∼7,000 different languages, engage in hundreds of different religious practices, build many different types of houses, exploit different resources for subsistence, use numerous different kinship systems, and abide by a striking array of marital, sexual, and child-rearing norms (2). The cultural processes that produce such striking cultural diversity must be explained. The field of cultural evolution is currently beginning to blossom (Fig. 1). There is a new cultural evolution society, a proposed journal, and an inaugural conference (3). However, with a few notable exceptions (4), much of the current work on cultural evolution focuses on microevolutionary processes. For example, in Dan Sperber’s influential book Explaining Culture: A Naturalistic Approach (5), cultural macroevolution rates only a passing mention on p. 2. More recently, in Lewens’ (6) otherwise masterful analysis of current work on cultural evolution, macroevolutionary phenomena again fail to feature. This article is a plea: a plea for the importance of studying cultural macroevolution. Although we completely understand the need for elegant empirical work and appropriate models of cultural change within populations, we should never forget that the large-scale patterns of diversity between cultures also cry out for evolutionary analyses and explanation. The macro really matters.

Big(ish) Data and Need for Computational Methods
It is a cliché these days to talk about big data transforming the social sciences. However, clichés can be true. Certainly, there are a growing number of global comparative cultural and linguistic databases, such as D-PLACE (2), DRH (7), WALS (8), ASJP (9), and Phoible (10), as well as relatively large regional databases, such as the Austronesian Basic Vocabulary Database (11), SAILS (12), Chirilla (13), and Pulotu (14). Although these databases might not technically qualify as “big data,” they are large enough to afford the application of the type of sophisticated computational methods that are often used in the biological sciences, such as network analysis of reticulate evolution, epidemiological models, and phylogenetic comparative methods. These methods can be used to compare the relative importance of different factors in the distribution of traits, model the underlying dynamics of evolutionary change, and infer the history of traits. The combination of big(ish) data and computational methods has the potential to transform the social sciences and humanities by enabling powerful quantitative tests of hypotheses that would have previously only been analyzable in much more limited ways.

To illustrate the promise of this approach, we present a recent study by Botero et al. (15) titled, “The Ecology of Religious Beliefs,” in which the authors examined the global distribution of moralizing high gods (MHGs): supernatural beings who are claimed to have created or govern all reality, intervene in human affairs, and enforce or support human morality (sometimes referred to as “Big Gods”). These gods are central to the Abrahamic religions, which include the two largest religious families in the world today, Christianity and Islam. Scholars have debated the social and physical environments in which MHGs most readily spread, and previous studies found rather contradictory results, with resource scarcity both positively and negatively associated with a belief in an MHG (16–18). These studies were limited by the use of crude metrics of ecology or indirect measures of

7846–7852 | PNAS | July 25, 2017 | vol. 114 | no. 30 www.pnas.org/cgi/doi/10.1073/pnas.1620746114


agricultural potential, relatively small sample sizes, and a failure to account for the nonindependence of societies as a result of spatial proximity and common ancestry (19). Botero et al. (15) extracted fine-grained environmental data, as well as cultural, linguistic, and geographic information from the open-access D-PLACE database (https://d-place.org) (2). They used multimodel inference to simultaneously evaluate the effects of environmental variables, shared ancestry, geographic proximity, and social structures on belief in MHGs in 583 societies from around the globe. Generalized linear models and generalized linear mixed models were fitted in R (20) by using the lme4 (21) and MuMIn (22) packages. The best-fitting models included spatial proximity, political complexity, animal husbandry, resource abundance, and resource stability. Belief in MHGs was more prevalent in societies from harsher environments and more likely in politically complex societies that had animal husbandry. Strikingly, this multimodel inference approach was able to predict the global distribution of belief in MHGs in a separate sample of cultures with an accuracy of 91%.

Fig. 1. Percentage of Google Scholar search results containing the term “cultural evolution” from 1950 to 2010. (Plot axes: Year, 1950 to 2010; Percentage of search results, tick marks from 0.5 to 2.5.)

Major Transitions: Big Questions for Big(ish) Data
John Maynard Smith and Eörs Szathmáry’s 1995 book, The Major Transitions in Evolution (23), is perhaps one of the most important and insightful contributions to evolutionary theory in the last 50 years. In this book, Maynard Smith and Szathmáry not only document fundamental changes in biological organization, such as the emergence of the genetic code, the origins of cells, the evolution of the eukaryotic cell, and multicellularity, they also show how these changes in biological organization change the way in which biological systems can evolve. The major transitions create entirely new evolutionary possibilities built upon new and more powerful ways of storing and transmitting information (24). According to Maynard Smith and Szathmáry (25), these transitions have at least five general properties:
1. Smaller entities form larger entities;
2. Smaller entities become differentiated as part of the larger entity;
3. The smaller entities are often unable to replicate in the absence of the larger entity;
4. The smaller entities can sometimes disrupt the development of the larger entity; and
5. New ways of transmitting information arise.

By themselves, large cross-cultural datasets and sophisticated computational methods are insufficient to create an exciting macroevolutionary science of human history. What is needed are big hypotheses and a powerful synthetic framework. We would like to suggest that much of the evolutionary thinking behind The Major Transitions in Evolution could be applied to cultural macroevolution (23).

Ten thousand years ago, most humans lived in small, kin-based, and relatively egalitarian groups (26). Today, we live in colossal nation states with distantly related members, complex hierarchical organization, and huge social inequality. Although kin selection and reciprocity explain a great deal of cooperation in the animal kingdom, these mechanisms break down in modern societies because the sheer scale of modern societies means that people can be anonymous and only distantly biologically related (27). The challenge is to explain the cultural forces that enabled this major transition in the size and complexity of human social groups to occur.

The potential of religious beliefs and practices to bind together social groups has long been recognized (28), although these functions have only recently been considered from an explicitly evolutionary perspective (29–31). One prominent theory is that belief in supernatural punishment, particularly by powerful and omnipotent Big Gods, inhibits selfishness and increases cooperation among adherents (32–34). The intuitive idea is that if people believe a punishing and moral supernatural agent is monitoring them, they are more likely to behave themselves. By facilitating cooperation in large groups of nonkin, beliefs in supernatural punishment are thought to have played a causal role in the emergence of large, complex human societies (33, 35, 36). It is crucial to emphasize that, at least as it was initially formulated, this hypothesis was both causal and directional. Big Gods were needed to make big societies. For example:

“It is no coincidence that the world is now dominated by a few great monotheisms, and that much human behaviour is influenced by the belief in a few high gods. To achieve a civilization of this scale, it was necessary to invent them” (36);

and

“One reason societies were able to develop cultural complexity in the first place is partly on account of the cooperative benefits attained through a belief in moralizing gods” (35).

In support of the Supernatural Punishment Hypothesis, a number of cross-cultural studies have shown that belief in MHGs is positively correlated with a range of measures of social complexity, such as political hierarchy, agriculture, and taxation systems (18, 35). On the face of it, the cross-cultural evidence for the Supernatural Punishment Hypothesis appears compelling. However, these studies have a number of important limitations (37). First, these studies do not actually get at the direction of causality. Although one possible explanation for these results is that MHGs facilitate social complexity, another is that social complexity makes cultures more likely to adopt MHGs. Second, these studies are based either on a single dataset called the Ethnographic Atlas or on a subset of this dataset known as the Standard Cross-Cultural Sample (17, 18, 35, 38). The MHGs in these datasets are almost all derived from the closely related family of Abrahamic religions: Christianity, Judaism, and Islam (37). These religions share a wide range of features, such as providing a universal rather than ethnocentric doctrine and encouraging fertility, and it is not clear whether it is an MHG specifically or some other part of these religions that is related to social complexity (37, 39). Third, cultures often inherit traits such as language, customs, oral traditions, and social norms from their ancestors (19). These relationships between cultures mean that cultures cannot be treated as statistically independent, a problem famously first pointed out by Francis Galton (40, 41).
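The nonindependence problem attributed to Galton (often called Galton's Problem) can be made concrete with a small simulation. The sketch below is a toy illustration, not an analysis from any of the cited studies: two binary traits are assigned independently to a few ancestral cultures, descendant cultures copy each ancestral trait with high fidelity, and a naive chi-square test then treats every descendant as an independent data point. The function names, the number of clades, the copying fidelity of 0.9, and the critical value of 3.841 (5% level, 1 degree of freedom) are all assumptions chosen for illustration.

```python
import random

def chi_square_2x2(pairs):
    """Pearson chi-square statistic for a list of (x, y) binary pairs."""
    n = len(pairs)
    n11 = sum(1 for x, y in pairs if x and y)
    nx = sum(x for x, _ in pairs)
    ny = sum(y for _, y in pairs)
    denom = nx * (n - nx) * ny * (n - ny)
    if denom == 0:  # one trait is invariant, so no association can be tested
        return 0.0
    return n * (n * n11 - nx * ny) ** 2 / denom

def simulate_tips(rng, clades=2, tips_per_clade=20, fidelity=0.9):
    """Each clade ancestor gets two independent traits; tips copy them noisily."""
    tips = []
    for _ in range(clades):
        ancestor = (rng.random() < 0.5, rng.random() < 0.5)
        for _ in range(tips_per_clade):
            tips.append(tuple(
                int(state if rng.random() < fidelity else not state)
                for state in ancestor
            ))
    return tips

def false_positive_rate(replicates=500, seed=1, critical=3.841, **kwargs):
    """Share of replicates in which the naive test 'finds' an association."""
    rng = random.Random(seed)
    hits = sum(
        chi_square_2x2(simulate_tips(rng, **kwargs)) > critical
        for _ in range(replicates)
    )
    return hits / replicates

if __name__ == "__main__":
    # 40 cultures descended from 2 ancestors versus 40 unrelated cultures.
    print("shared ancestry:", false_positive_rate(clades=2, tips_per_clade=20))
    print("independent    :", false_positive_rate(clades=40, tips_per_clade=1))
```

Under these assumptions, the naive test rejects independence far more often than the nominal 5% when all 40 cultures descend from two ancestors, whereas the rate stays close to the nominal level when each culture has its own ancestor. The traits never influence one another in either case; the excess is produced by shared ancestry alone.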

The studies mentioned above do not adequately account for Galton’s Problem, so the correlation observed between the presence of MHGs and social complexity might merely arise because of the historical relationships between cultures (42). Thus, to rigorously test hypotheses about the role of MHGs in driving the major transitions in human history, we need data from cultures with non-Abrahamic religions, as well as methods that avoid Galton’s Problem and can explicitly test causal predictions.

Phylogenetic methods have revolutionized the field of evolutionary biology (43). These methods solve Galton’s Problem by explicitly estimating ancestral state changes on phylogenetic trees (19, 41). Thus, there is no overcounting or undercounting of evolutionary events. Phylogenetic methods have recently been used to make inferences about things such as the ancestral state of postmarital residence patterns in Austronesian cultures (44), the evolution of political complexity (45), the effects of cultural ancestry on deforestation (46), and the links between cattle and matriliny (47). Mark Pagel and Andrew Meade introduced a method called “Discrete” in the program BayesTraits that models the evolution of two binary traits and tests between dependent and independent models of evolution (48, 49). In an independent model, the gains and losses of each trait are modeled separately from each other (Fig. 2A). In the dependent model, the rate at which a trait is gained or lost depends on the state of the other trait, as would be expected if there is a causal relationship between traits (Fig. 2 B and C). This approach gets at the direction of causality by inferring the temporal order in which traits tend to arise and the effects they have on one another (49).

Using this approach and data from the Pulotu database (14, 49), we recently tested a series of hypotheses about the role of religion in the emergence of social complexity (14, 50, 51). The Pulotu database contains quantitative variables documenting the traditional religious beliefs and practices as well as the social organization of 106 Austronesian cultures. Special care was taken in the coding of these data to ensure that, as far as possible, the coding reflected the state of the culture before conversion to major world religions and colonization (14). Previously published language-based phylogenies were used as a proxy for the population history of these cultures (52). These trees fit remarkably well with archaeological evidence that shows Austronesian-speaking cultures were some of the greatest ocean voyagers in human history, sailing from their homeland in Taiwan to settle on islands ranging in size from the 0.4-km² island of Anuta up to the 785,000-km² continental island of New Guinea (14, 53, 54). The archaeological, genetic, and linguistic evidence suggests that this expansion started ∼5,000 y ago and spread in a series of expansion pulses and pauses through Island South East Asia and the Pacific (52–55). The cultures that evolved on these islands ranged from small kin-based groups, such as the Berawan (56), up to federated kingdoms, such as Southern Toraja (57). Population sizes ranged from ∼200 people on Anuta (58) to approximately half a million people in the case of the Merina of Madagascar (59). No less diverse were their religious systems, with supernatural beliefs including anthropomorphic, animistic, and nature deities, and religious rituals ranging in scale from humble personal offerings to multiday community-wide festivals (14). Because Austronesian cultures were some of the last cultures in the world to have contact with major world religions, and their traditional beliefs were well documented, they provide an ideal sample for testing theories about the role of religion in the emergence of social complexity.

We ran two series of analyses to test the Supernatural Punishment Hypothesis (50). In the first, we tested the effect of Broad Supernatural Punishment on the evolution of political complexity. Agents counted in this test included a wide range of punishing and morally concerned supernatural agents, such as ancestral spirits, natural spirits (e.g., forest and sky gods), and mythical heroes, in addition to MHGs (14). Belief in Broad Supernatural Punishment was found in just over two-thirds of the cultures sampled. We found modest support for the coevolution of Broad Supernatural Punishment and political complexity, with Broad Supernatural Punishment facilitating the rise of political complexity, but not helping to sustain it. In the second series of analyses, we tested whether the specific belief in MHGs coevolved with political complexity. We were surprised to find evidence of MHGs in just 6 of the 96 traditional Austronesian cultures we studied. Although our analyses suggested that MHGs coevolved with political complexity, instead of MHGs driving political complexity, our results indicated that MHGs tended to be gained after political complexity had already emerged (Fig. 2C). Our analyses suggested that these MHGs had been gained only recently, and most of these MHGs occurred in regions where there had been early contact with Muslim traders. Although we

Fig. 2. An independent model (A) of evolution alongside the dependent model predicted by the Supernatural Punishment Hypothesis (B) and the dependent model resulting from analyses of traditional Austronesian cultures (C). The red figure represents the presence of an MHG, and the black figure represents the presence of political complexity (PC). Arrows indicate the rates of change between states, and the widths of the arrows are proportional to the sizes of the transition rates. (A) In independent models of evolution, the rate at which each trait is gained or lost is independent of the state of the other trait. In this example, cultures are more likely to gain PC than to lose it (rate c is lower than rate d). (B) In dependent models of evolution, the rate at which each trait is gained and lost can be dependent on the state of the other. In the model predicted by the Supernatural Punishment Hypothesis, the rate at which PC is gained is higher when an MHG is present (rate d) than when it is absent (rate b), and the rate at which PC is lost is lower when an MHG is present (rate g) than when an MHG is absent (rate e). (C) The resulting models from our analyses suggested that MHGs had little effect on the gain and loss of PC, but that MHGs were rarely gained in cultures without PC (rate a is lower than rate f).
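The dependent and independent models in Fig. 2 differ only in how the eight transition rates are constrained. The sketch below is a minimal illustration of that structure, not the BayesTraits implementation: it builds the 4 × 4 rate matrix over the joint states of two binary traits and checks whether a set of rates satisfies the independence constraint. The dictionary keys, function names, and numerical rates are hypothetical. The letter labels in the comments follow the caption's usage for the dependent models where it states them (a, b, d, e, f, g); assigning c and h to the loss of an MHG is an assumption.

```python
from itertools import product

# Joint states are (mhg, pc) pairs; 0 = absent, 1 = present.
STATES = list(product((0, 1), repeat=2))  # [(0, 0), (0, 1), (1, 0), (1, 1)]

def rate_matrix(rates):
    """Build the 4x4 instantaneous rate matrix for two binary traits.

    `rates` maps (trait, change, other_state) to a rate, where trait is
    "mhg" or "pc", change is "gain" or "loss", and other_state is the state
    of the other trait. Only one trait may change in a single step, so
    double transitions such as (0, 0) -> (1, 1) keep a rate of zero.
    """
    q = [[0.0] * len(STATES) for _ in STATES]
    for i, (mhg, pc) in enumerate(STATES):
        for j, (mhg2, pc2) in enumerate(STATES):
            if mhg != mhg2 and pc == pc2:      # MHG changes, PC held fixed
                q[i][j] = rates[("mhg", "gain" if mhg2 else "loss", pc)]
            elif pc != pc2 and mhg == mhg2:    # PC changes, MHG held fixed
                q[i][j] = rates[("pc", "gain" if pc2 else "loss", mhg)]
        q[i][i] = -sum(q[i])                   # each row sums to zero
    return q

def is_independent(rates, tol=1e-12):
    """True if no rate depends on the state of the other trait."""
    return all(
        abs(rates[(trait, change, 0)] - rates[(trait, change, 1)]) <= tol
        for trait in ("mhg", "pc")
        for change in ("gain", "loss")
    )

# Hypothetical rates shaped like the Supernatural Punishment Hypothesis
# prediction: PC is gained faster (d > b) and lost more slowly (g < e)
# when an MHG is present.
sph_like = {
    ("pc", "gain", 0): 0.2,   # rate b
    ("pc", "gain", 1): 0.8,   # rate d
    ("pc", "loss", 0): 0.4,   # rate e
    ("pc", "loss", 1): 0.1,   # rate g
    ("mhg", "gain", 0): 0.3,  # rate a
    ("mhg", "gain", 1): 0.3,  # rate f
    ("mhg", "loss", 0): 0.3,  # rate c (assumed label)
    ("mhg", "loss", 1): 0.3,  # rate h (assumed label)
}

# The independent model is the special case in which each pair of rates
# is forced to be equal, leaving four free parameters instead of eight.
independent = {key: sph_like[(key[0], key[1], 0)] for key in sph_like}

if __name__ == "__main__":
    for state, row in zip(STATES, rate_matrix(sph_like)):
        print(state, row)
    print(is_independent(sph_like), is_independent(independent))
```

Framing the independent model as a constrained version of the dependent model is what allows the two to be compared on the same data: evidence for coevolution amounts to evidence that the extra four rates are needed.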

excluded all clear cases of direct borrowings from Abrahamic religions, it is likely that the concept of an MHG was subtly borrowed and transferred to the names of indigenous deities (60).

Defenders of the Big Gods hypothesis might argue that, for some reason, the Austronesian cultures do not reflect the role MHGs played in the emergence of social complexity in other regions of the world. However, a closer examination of previous cross-cultural studies suggests otherwise. Of the 40 MHGs in the Standard Cross-Cultural Sample, 32 are of Christian or Islamic origin, and the remaining 8 either belong to other Abrahamic religions or were plausibly influenced by them (37). The Abrahamic religions arose ∼3,000 y ago, long after humans had begun forming large, sedentary, and complexly organized societies (26). Although there is a substantial body of experimental research showing that Abrahamic MHGs can increase cooperation within groups (61), the timing of their origin means that they cannot explain at least the initial emergence of social complexity in human history. This finding tells us that microlevel processes observed in contemporary cultures do not necessarily explain the macroevolutionary patterns observed in human history.

An alternative vein of scholarship has focused on the darker role of religion in human social life (62, 63). Archaeological, historical, and ethnographic records reveal that in early societies religious and political authority often overlapped (26), providing ample opportunities for elites to use religious systems toward their own ends. As a result, religious narratives in early human societies often legitimize the authority of those in power and involve rituals that benefit the elite at the expense of underclasses (64). A particularly gruesome example is the practice of ritualized human sacrifice that occurred in early human societies throughout the world (64–68). According to the Social Control Hypothesis (64, 66, 68), ritualized human sacrifice was used by social elites as a religiously sanctioned means of terrifying underclasses into obedience.

To test the Social Control Hypothesis, we went back to the Pulotu database (14), coded variables on human sacrifice and social stratification, and tested for their coevolution (51). The term “social stratification” refers to inherited differences in wealth and status and is thought to have been one of the earliest forms of hierarchical structuring to emerge in human history (26). We found human sacrifice to have been remarkably common in traditional cultures, occurring in almost half of those sampled (51). Typically, social elites orchestrated the sacrifices, with social underclasses becoming the victims. The results of our analyses showed that human sacrifice coevolved with social stratification and functioned to stabilize social inequality in general, as well as facilitating the emergence of rigid class systems (Fig. 3). This result does not imply that human sacrifice was necessarily functional for the whole group, nor that it would have these effects in modern societies, which have developed more sophisticated methods of sustaining social inequality. What our results do show is that ritual human sacrifice was used by social elites as a tool to maintain their social standing in the early stages of social complexity.

Overextension of Biological Metaphors and Methods?
The famous evolutionary biologist Richard Lewontin often liked to cite Rosenblueth and Wiener’s quip that, “The price of metaphor is eternal vigilance” (69). One of the things that Lewontin is particularly skeptical about is the metaphorical extension of evolutionary ideas to cultural history (70, 71). Part of this skepticism is driven by his opposition to Dawkins’ meme concept (72). Fracchia and Lewontin write (70):

“But, unlike genes, memes are not entities with an existence independent of the theory. They are a mental construct whose only defined property is to fill in the gap in an elaborate metaphor.”

However, Lewontin’s own three central principles for systems to evolve by natural selection (phenotypic variation, differential fitness, and inheritance) are satisfied by cultural systems (4, 73). As Henrich and Boyd have shown, adaptive cultural evolution does not require replicator-like inheritance systems (74).

There is, however, another line of skepticism that is sometimes directed against attempts to apply phylogenetic methods to cultural evolution. Lewontin’s former colleague, Stephen Jay Gould (75), put this view with characteristic vigor:

“Human cultural evolution proceeds along paths outstandingly different from the ways of genetic change. . . Biological evolution is constantly diverging; once lineages become separate, they cannot amalgamate (except in producing new species by hybridization – a process that occurs very rarely in animals). Trees are correct topologies of biological evolution. . . In human cultural evolution, on the other hand, transmission and anastomosis are rampant. Five minutes with a wheel, a snowshoe, a bobbin, or a bow and arrow may allow an artisan of one culture to capture a major achievement of another.”

Although Gould may have hugely overestimated how easy it is to reverse-engineer the manufacture of these items, the critique of cultural phylogenetics has not gone away. Recently, Norenzayan et al. (76) stated:

“We caution against rushing to embrace analytical techniques imported from genetic evolution – used to reconstruct species phylogenies – to cultural evolution. Cultural evolution is in some crucial respects unlike genetic evolution. . . Species, for example, are not subject to intergroup competition that creates massive and directed horizontal transmission of only some traits. Therefore, we think the first step should be to benchmark phylogenetic techniques to cultural history using known historical cases.”

To be clear, we are not advocating the blanket application of phylogenetics to all cultural phenomena. So, let us look more closely at what the legitimate concerns may or may not be. The statements above could be boiled down to four linked, but logically separate, claims:
1. Culture evolves differently from biology. Biological evolution is treelike, but in culture reticulation rules.
2. Cultures are not (vertebrate) species. Different aspects of culture will have quite different histories.
3. The estimation of phylogenetic trees will be biased by horizontal transmission.
4. The accuracy of cultural phylogenies has not been validated.

The first claim displays a shocking lack of knowledge of biology and human culture. There is a great deal of biology that does not fit tidily on the “tree of life.” Indeed, the tree of life has been mocked as the “tree of 1%” (77). A very significant amount of cross-lineage transfer occurs in biological evolution, especially in microbes (78). Mallet (79) estimated that there is hybridization in ∼10% of animal and 25% of plant species. Dagan and Martin’s (80) analysis of 190 prokaryotic genomes suggests that horizontal gene transfer has affected at least two-thirds of more than 57,000 gene families.

In the literature on cultural microevolution, there is evidence that the majority of social learning occurs between members of the same population, but the relative importance of parent-to-offspring and peer-to-peer social learning is debated (81–84). What matters for the application of phylogenetic methods are the resulting macroevolutionary patterns. Given that social learning occurs predominantly within a population, both peer-to-peer and parent-to-offspring learning can result in vertical transmission at the macroevolutionary level. The relative importance of vertical and horizontal transmission between populations is likely to vary across domains of culture, world regions, and periods of history. For example, the design of the internal combustion engine has been borrowed between cultural lineages. Conversely, basic vocabulary items, such as terms for hand and eye, lower numerals, and kinship terms, show clear evidence of vertical transmission down cultural lineages (85).

[Fig. 3 graphic. Panel A: phylogenetic tree with the following 93 cultures as tip labels: Atayal, Tsou, Ami, Bunun, Puyuma, Paiwan, Sama Dilaut, Tagbanwa, Palawan Batak, Bukidnon, Subanun, Isneg, Gaddang, Bontok, Ifugao, Kalinga, Tinguian, Ngaju, Merina, Bidayuh, Kayan, Berawan, Kelabit, Moken, Iban, Toba Batak, Southern Toraja, Minahasa, Eastern Toraja, Nias, Palau, Chamorro, Ata Tana ‘Ai, Kedang, Manggarai, Savu, Laboya, East Sumba, Tanimbar, Tetum, Atoni, Roti, Biak, Waropen, Lakalai, Wogeo, Manam, Dobu, Trobriand Islands, Mekeo, Motu, Nendo, Bughotu, Lau, Kwaio, To’abaita, Kwara’ae, Tolai, Buka, Cheke Holo, Choiseul, Simbo, Roviana, Lifou, Mare, Eromanga, Tanna, Mota, Southern Malekula, Kiribati, Marshall Islands, Pohnpei, Woleai, Chuuk, East Fiji, Tonga, Samoa, Anuta, Rennell and Bellona, East Futuna, Niue, West Futuna, Tikopia, Kapingamarangi, Ontong Java, Tokelau, Rapa Nui, Marquesas, Mangareva, Rarotonga, Maori, Hawaii, Tahiti. Panel B: rate diagram with transitions labeled a = gain of HS, b = gain of SS, c = loss of HS, d = gain of SS, e = loss of SS, f = gain of HS, g = loss of SS, h = loss of HS (HS, human sacrifice; SS, social stratification).]

Fig. 3. (A) Ancestral state reconstruction of human sacrifice and social stratification on a maximum clade credibility consensus tree of 93 Austronesian
languages. The circles at the tips of the tree represent the known traditional states of cultures, and the circles found across the nodes of the tree represent the
state of prehistoric cultures inferred by a Markov chain Monte Carlo analysis in BayesTraits. In the analysis, 4,200 of the most likely possible trees were used,
and the consensus tree is a summary of these trees for illustrative purposes. The gray at each of the internal nodes represents the proportion of trees sampled
without this node and provides an indication of phylogenetic uncertainty. (B) The resulting dependent model shows that cultures with ritualized human
sacrifice were less likely to lose social stratification than those that lacked human sacrifice (rate g is lower than rate e). Adapted from ref. 51.
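The ancestral state reconstruction in Fig. 3A depends on evaluating how probable the observed tip states are under a Markov model of trait gain and loss. The sketch below shows the underlying likelihood calculation (Felsenstein's pruning algorithm) for a single binary trait with fixed rates on a hypothetical four-tip tree. It is a simplified stand-in, not the published analysis, which sampled rates and trees by Markov chain Monte Carlo in BayesTraits and modeled two traits jointly. The tree, branch lengths, tip states, rates, and function names are all invented for illustration.

```python
import math

def transition_probs(gain, loss, t):
    """2x2 matrix P[i][j] = Pr(state j after time t | state i at the start)."""
    total = gain + loss
    pi1, pi0 = gain / total, loss / total      # stationary frequencies
    decay = math.exp(-total * t)
    return [
        [pi0 + pi1 * decay, pi1 * (1.0 - decay)],
        [pi0 * (1.0 - decay), pi1 + pi0 * decay],
    ]

def partial_likelihoods(node, tip_states, gain, loss):
    """Return [L0, L1]: likelihood of the tips below `node` given its state.

    A tip is a string name; an internal node is a list of
    (child, branch_length) pairs.
    """
    if isinstance(node, str):
        return [1.0 if tip_states[node] == state else 0.0 for state in (0, 1)]
    likes = [1.0, 1.0]
    for child, branch_length in node:
        below = partial_likelihoods(child, tip_states, gain, loss)
        p = transition_probs(gain, loss, branch_length)
        for state in (0, 1):
            likes[state] *= p[state][0] * below[0] + p[state][1] * below[1]
    return likes

def root_posterior_present(tree, tip_states, gain, loss):
    """Posterior probability that the trait was present at the root,
    using the stationary frequencies as the root prior (an assumption)."""
    l0, l1 = partial_likelihoods(tree, tip_states, gain, loss)
    pi1 = gain / (gain + loss)
    return pi1 * l1 / ((1.0 - pi1) * l0 + pi1 * l1)

# Hypothetical tree ((A, B), (C, D)) with invented branch lengths.
TREE = [
    ([("A", 1.0), ("B", 1.0)], 0.5),
    ([("C", 1.0), ("D", 1.0)], 0.5),
]

if __name__ == "__main__":
    tips = {"A": 1, "B": 1, "C": 1, "D": 0}
    print(root_posterior_present(TREE, tips, gain=0.3, loss=0.3))
```

In a Bayesian analysis of the kind summarized in the caption, a calculation of this type would be repeated across many sampled trees and rate values, and the node estimates averaged, which is how phylogenetic uncertainty enters the reconstruction.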

The second claim is more sensible, but does not undermine the use of phylogenetic methods. If anything, it points out an important role for their use: to assess the coherence of different aspects of culture. In their book chapter, “Are Cultural Phylogenies Possible?”, Boyd et al. (86) describe a range of positions along a continuum on the question of how integrated cultural histories are: (i) cultures are tightly integrated like vertebrate species; (ii) cultures contain a core of traditions that are tightly linked and vertically transmitted, with peripheral aspects that are less cohesive and marked by frequent borrowing; (iii) cultures contain some aspects that are bound together, but there are no core traditions; and (iv) cultures are collections of ephemeral entities. Just as biologists talk about “every gene having its own history,” and have developed methods to map these gene genealogies on to a species phylogeny, so cultural phylogeneticists could construct trees for different aspects of culture and evaluate their fit with population history (87, 88). For example, genealogies of religious beliefs, material culture, kinship systems, music genres, and styles of art could be mapped and compared with language-based cultural histories. Phylogenetic methods make the traditional social science debate about the extent to which a culture is an integrated whole testable.

The third objection, that the estimation of cultural phylogenies will be biased by horizontal transmission, is a quantitative issue that can be evaluated by simulation modeling. Greenhill et al. (89) simulated language phylogenies with different tree topologies, different borrowing scenarios, and different levels of borrowing. The results show that tree topologies constructed with Bayesian phylogenetic methods are robust to realistic levels of borrowing. Inferences about divergence dates were slightly less robust and showed a tendency to underestimate dates.

The final objection, that inferences from cultural phylogenetics have not been validated, is a fair enough concern, but one that applies to much of computational biology, and indeed the extrapolation of laboratory studies to the field. In brief, we will point out that the Austronesian language phylogenies built from basic vocabulary fit strikingly well with both archaeological (55) and recent genetic data (90, 91), both in terms of the sequence and the timing of the Austronesian expansion.

Conclusion
In the coming years, more quantitative phylogenies for the major language families will be published, and the number and richness of comparative cultural databases will undoubtedly grow (7, 92).

The series of studies we have discussed in this work illustrates how causal theories about the emergence of major transitions in human social organization can be tested with the combination of large quantitative cross-cultural data and computational phylogenetic methods. We do not claim that these methods are appropriate for all questions and for all spatial and temporal scales in cultural evolution. Instead, we suggest that, when they are used carefully in cases where there is clear historical signal, such as the Austronesian or Bantu expansions (52, 93), and where the inferences are triangulated with other lines of evidence (94), then they can make an important contribution to our understanding of cultural macroevolution. They can even be used to predict political and economic changes (95). Although there is much still to be done to integrate microlevel processes and macrolevel patterns, the macro not only matters, it is tractable.

ACKNOWLEDGMENTS. We thank colleagues Quentin Atkinson, Carlos Botero, Joseph Bulbulia, Michael Gavin, Simon Greenhill, and Oliver Sheehan for their important contributions to the joint work on the cultural evolution of religion discussed here. Olivier Morin and Kim Sterelny made useful comments on the manuscript. This work was supported by John Templeton Foundation Grant 28745; a PhD scholarship from the University of Auckland; and Marsden Fund Grant UOA1104.

1. Darwin C (1872) On the Origin of Species by Means of Natural Selection (John Murray, London), 6th Ed.
2. Kirby KR, et al. (2016) D-PLACE: A global database of cultural, linguistic and environmental diversity. PLoS One 11:e0158391.
3. The Evolution Institute (2016) A New Society for the Study of Cultural Evolution. Available at https://evolution-institute.org/project/society-for-the-study-of-cultural-evolution/. Accessed January 3, 2017.
4. Mesoudi A (2011) Cultural Evolution: How Darwinian Theory Can Explain Human Culture and Synthesize the Social Sciences (Univ of Chicago Press, Chicago).
5. Sperber D (1996) Explaining Culture: A Naturalistic Approach (Blackwell, Oxford).
6. Lewens T (2015) Cultural Evolution: Conceptual Challenges (Oxford Univ Press, Oxford).
7. Slingerland E, Sullivan B (2017) Durkheim with data: The Database of Religious History. J Am Acad Relig 85:312–347.
8. Haspelmath M (2005) The World Atlas of Language Structures (Oxford Univ Press, Oxford).
9. Wichmann S, Holman EW, Brown CH (2016) The ASJP Database. Version 17. Available at asjp.clld.org/. Accessed January 3, 2017.
10. Moran S, McCloy D, Wright R (2014) PHOIBLE Online (Max Planck Institute for Evolutionary Anthropology, Leipzig).
11. Greenhill SJ, Blust R, Gray RD (2008) The Austronesian Basic Vocabulary Database: From bioinformatics to lexomics. Evol Bioinform Online 4:271–283.
12. Muysken P, et al. (2016) South American Indigenous Language Structures (SAILS) Online (Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany). Available at sails.clld.org. Accessed January 3, 2017.
13. Bowern C (2016) Chirila: Contemporary and Historical Resources for the Indigenous Languages of Australia. Lang Doc Conserv 10:1–44.
14. Watts J, et al. (2015) Pulotu: Database of Austronesian supernatural beliefs and practices. PLoS One 10:e0136783.
15. Botero CA, et al. (2014) The ecology of religious beliefs. Proc Natl Acad Sci USA 111:16784–16789.
16. Snarey J (1996) The natural environment’s impact upon religious ethics: A cross-cultural study. J Sci Study Relig 35:85–96.
17. Brown C, Eff EA (2010) The state and the supernatural: Support for prosocial behavior. Struct Dyn 4:1–21.
18. Roes FL, Raymond M (2003) Belief in moralizing gods. Evol Hum Behav 24:126–135.
19. Mace R, Pagel M (1994) The comparative method in anthropology. Curr Anthropol 35:549–564.
20. R Core Team (2015) R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna).
21. Bates D, Maechler M, Bolker B, Walker S (2015) Package lme4. J Stat Softw 67:1–91.
22. Barton K (2015) MuMIn: Multi-model inference. R package, Version 1.15.1. Available at r-forge.r-project.org/projects/mumin/. Accessed January 3, 2017.
23. Maynard Smith J, Szathmáry E (1995) The Major Transitions in Evolution (Oxford Univ Press, Oxford).
24. Calcott B, Sterelny K (2011) A big picture of big pictures of life’s history. The Major Transitions in Evolution Revisited, eds Calcott B, Sterelny K (MIT Press, Cambridge, MA).
25. Szathmáry E, Maynard Smith J (1995) The major evolutionary transitions. Nature 374:227–232.
26. Flannery K, Marcus J (2012) The Creation of Inequality: How our Prehistoric Ancestors Set the Stage for Monarchy, Slavery, and Empire (Harvard Univ Press, Cambridge, MA).
27. Gintis H, Bowles S, Boyd R, Fehr E (2003) Explaining altruistic behavior in humans. Evol Hum Behav 24:153–172.
28. Durkheim E (1915) The Elementary Forms of the Religious Life (Allen & Unwin, London).
29. Sosis R (2009) The adaptationist-byproduct debate on the evolution of religion: Five misunderstandings of the adaptationist program. J Cogn Cult 9:315–332.
30. Bulbulia J (2004) The cognitive and evolutionary psychology of religion. Biol Philos 18:655–686.
31. Wiebe D (2008) Does talk about the evolution of religion make sense? Evolution of Religion: Studies, Theories and Critiques, eds Bulbulia J, et al. (Collins Foundation, Santa Margarita, CA), pp 339–346.
32. Johnson DD, Krüger O (2004) The good of wrath: Supernatural punishment and the evolution of cooperation. Polit Theol 5:159–176.
33. Norenzayan A (2013) Big Gods: How Religion Transformed Cooperation and Conflict (Princeton Univ Press, Princeton).
34. Schloss JP, Murray MJ (2011) Evolutionary accounts of belief in supernatural punishment: A critical review. Religion Brain Behav 1:46–99.
35. Johnson DDP (2005) God’s punishment and public goods: A test of the supernatural punishment hypothesis in 186 world cultures. Hum Nat 16:410–446.
36. Shariff AF, Norenzayan A, Henrich J (2011) The birth of high gods: How the cultural evolution of supernatural policing influenced the emergence of complex, cooperative human societies, paving the way for civilization. Evolution, Culture, and the Human Mind, eds Schaller M, Norenzayan A, Heine SJ, Yamagishi T, Kameda T (Psychology, New York), pp 119–136.
37. Atkinson Q, Latham A, Watts J (2015) Are Big Gods a big deal in the emergence of big groups? Religion Brain Behav 5:266–274.
38. Peoples HC, Marlowe FW (2012) Subsistence and the evolution of religion. Hum Nat 23:253–269.
39. Watts J, Bulbulia J, Gray RD, Atkinson QD (2016) Clarity and causality needed in claims about Big Gods. Behav Brain Sci 39:41–42.
40. Jordan FM (2013) Comparative phylogenetic methods and the study of pattern and process in kinship. Kinship Systems: Change and Reconstruction, eds McConvell P, Keen I, Hendery R (Univ of Utah Press, Salt Lake City), pp 43–58.
41. Mace R, Jordan F, Holden C (2003) Testing evolutionary hypotheses about human biological adaptation using cross-cultural comparison. Comp Biochem Physiol A Mol Integr Physiol 136:85–94.
42. Dow M, Eff E (2008) Global, regional, and local network autocorrelation in the standard cross-cultural sample. Cross-Cultural Res 42:148–171.
43. Freckleton RP, Harvey PH, Pagel M (2002) Phylogenetic analysis and comparative data: A test and review of evidence. Am Nat 160:712–726.
44. Fortunato L, Jordan F (2010) Your place or mine? A phylogenetic comparative analysis of marital residence in Indo-European and Austronesian societies. Philos Trans R Soc Lond B Biol Sci 365:3913–3922.
45. Currie TE, Greenhill SJ, Gray RD, Hasegawa T, Mace R (2010) Rise and fall of political complexity in island South-East Asia and the Pacific. Nature 467:801–804.
46. Atkinson QD, Coomber T, Passmore S, Greenhill SJ, Kushnick G (2016) Cultural and environmental predictors of pre-European deforestation on Pacific Islands. PLoS One 11:e0156340.
47. Holden CJ, Mace R (2003) Spread of cattle led to the loss of matrilineal descent in Africa: A coevolutionary analysis. Proc Biol Sci 270:2425–2433.
48. Pagel M (1994) Detecting correlated evolution on phylogenies: A general method for the comparative analysis of discrete characters. Proc Biol Sci 255:37–45.
49. Pagel M, Meade A (2006) Bayesian analysis of correlated evolution of discrete characters by reversible-jump Markov chain Monte Carlo. Am Nat 167:808–825.
50. Watts J, et al. (2015) Broad supernatural punishment but not moralising high gods precede the evolution of political complexity in Austronesia. Proc R Soc B Biol Sci 282:20142556.
51. Watts J, Sheehan O, Atkinson QD, Bulbulia J, Gray RD (2016) Ritual human sacrifice promoted and sustained the evolution of stratified societies. Nature 532:228–231.
52. Gray RD, Drummond AJ, Greenhill SJ (2009) Language phylogenies reveal expansion pulses and pauses in Pacific settlement. Science 323:479–483.
53. Kirch PV, Green RC (2001) Hawaiki, Ancestral Polynesia: An Essay in Historical Anthropology (Cambridge Univ Press, Cambridge, UK).
54. Ko AM, et al. (2014) Early Austronesians: Into and out of Taiwan. Am J Hum Genet 94:426–436.
55. Wilmshurst JM, Hunt TL, Lipo CP, Anderson AJ (2011) High-precision radiocarbon dating shows recent and rapid initial human colonization of East Polynesia. Proc Natl Acad Sci USA 108:1815–1820.
56. Huntington R, Metcalf P (1979) Celebrations of Death: The Anthropology of Mortuary Ritual (Cambridge Univ Press, Cambridge, UK).
57. Nooy-Palm H (1979) The Sa’dan-Toraja: A Study of Their Social Life and Religion (Martinus Nijhoff, The Hague).
58. Feinberg R (1991) Anuta. Oceania, Encyclopedia of World Cultures, ed Hays TE (G. K. Hall, New York), Vol II, pp 13–16.
59. Campbell G (1991) The state and pre-colonial demographic history: The case of late Nineteenth-Century Madagascar. J Afr Hist 32:425–445.
60. Buck PH (1952) The Coming of the Maori (Human Relations Area Files Press, New Haven, CT).
61. Shariff AF, Willard AK, Andersen T, Norenzayan A (2016) Religious priming: A meta-analysis with a focus on prosociality. Pers Soc Psychol Rev 20:27–48.
62. Marx K, Engels F (1975) Karl Marx and Friedrich Engels: Collected Works (International, New York).
63. Cronk L (1994) Evolutionary theories of morality and the manipulative use of signals. Zygon 29:81–101.

64. Carrasco D (1999) City of Sacrifice (Beacon, Boston).
65. Bremmer JN (2007) The Strange World of Human Sacrifice (Peeters, Leuven, Belgium).
66. Turner CG, Turner JA (1999) Man Corn: Cannibalism and Violence in the Prehistoric American Southwest (Univ of Utah Press, Salt Lake City).
67. Girard R (1987) Violent origins: Ritual killing and cultural formation. Violent Origins, eds Hamerton-Kelly R, Burkert W, Girard R, Smith J (Stanford Univ Press, Stanford, CA), pp 73–105.
68. Winkelman M (2014) Political and demographic-ecological determinants of institutionalised human sacrifice. Anthropol Forum 24:47–70.
69. Lewontin RC (2001) In the beginning was the word. Science 291:1263–1264.
70. Fracchia J, Lewontin RC (2005) The price of metaphor. Hist Theory 44:14–29.
71. Fracchia J, Lewontin RC (1999) Does culture evolve? Hist Theory 38:52–78.
72. Dawkins R (1976) The Selfish Gene (Oxford Univ Press, Oxford).
73. Lewontin RC (1970) The units of selection. Annu Rev Ecol Syst 1:1–18.
74. Henrich J, Boyd R (2002) On modeling cognition and culture: Why cultural evolution does not require replication of representations. J Cogn Cult 2:87–112.
75. Gould SJ (2010) An Urchin in the Storm: Essays About Books and Ideas (W. W. Norton, New York).
76. Norenzayan A, et al. (2016) The cultural evolution of prosocial religions. Behav Brain Sci 39:e1.
77. Dagan T, Martin W (2006) The tree of one percent. Genome Biol 7:118.
78. Shapiro JA (2016) Nothing in evolution makes sense except in the light of genomics: Read-write genome evolution as an active biological process. Biology (Basel) 5:E27.
79. Mallet J (2005) Hybridization as an invasion of the genome. Trends Ecol Evol 20:229–237.
80. Dagan T, Martin W (2007) Ancestral genome sizes specify the minimum rate of lateral gene transfer during prokaryote evolution. Proc Natl Acad Sci USA 104:870–875.
81. Tehrani JJ, Collard M (2009) On the relationship between interindividual cultural transmission and population-level cultural diversity: A case study of weaving in Iranian tribal population. Evol Hum Behav 30:286–300.
82. Hewlett BS, Cavalli-Sforza LL (1986) Cultural transmission among Aka Pygmies. Am Anthropol 88:922–934.
83. Henrich J, Broesch J (2011) On the nature of cultural transmission networks: Evidence from Fijian villages for adaptive learning biases. Philos Trans R Soc Lond B Biol Sci 366:1139–1148.
84. Aunger R (2000) The life history of culture learning in a face-to-face society. Ethos 28:445–481.
85. Haspelmath M, Tadmor U (2009) World Loanword Database (WOLD) (Max Planck Digital Library, Leipzig, Germany).
86. Boyd R, Borgerhoff-Mulder M, Durham WH, Richerson PJ (1997) Are cultural phylogenies possible? Human by Nature: Between Biology and the Social Sciences, eds Weingart P, Richerson P, Mitchell S, Maasen S (Lawrence Erlbaum Associates, Mahwah, NJ), pp 355–386.
87. Gray RD, Greenhill SJ, Ross RM (2007) The pleasures and perils of Darwinizing culture (with phylogenies). Biol Theory 2:360–375.
88. Gray RD, Bryant D, Greenhill SJ (2010) On the shape and fabric of human history. Philos Trans R Soc Lond B Biol Sci 365:3923–3933.
89. Greenhill SJ, Currie TE, Gray RD (2009) Does horizontal transmission invalidate cultural phylogenies? Proc R Soc B Biol Sci 276:2299–2306.
90. Lipson M, et al. (2014) Reconstructing Austronesian population history in Island Southeast Asia. Nat Commun 5:4689.
91. Lind J, Lindenfors P, Ghirlanda S, Lidén K, Enquist M (2013) Dating human cultural capacity using phylogenetic principles. Sci Rep 3:1785.
92. Turchin P, et al. (2015) Seshat: The global history databank. Cliodynamics J Quant Hist Cult Evol 6(1).
93. Currie TE, Meade A, Guillon M, Mace R (2013) Cultural phylogeography of the Bantu Languages of sub-Saharan Africa. Proc R Soc B Biol Sci 280:20130695.
94. Gray RD, Atkinson QD, Greenhill SJ (2011) Language evolution and human history: What a difference a date makes. Philos Trans R Soc Lond B Biol Sci 366:1090–1100.
95. Matthews LJ, Passmore S, Richard PM, Gray RD, Atkinson QD (2016) Shared cultural history as a predictor of political and economic changes among nation states. PLoS One 11:e0152979.

Pursuing Darwin’s curious parallel: Prospects for a science of cultural evolution
Alex Mesoudi (a,1)
(a) Human Biological and Cultural Evolution Group, Department of Biosciences, University of Exeter, Penryn, Cornwall TR10 9FE, United Kingdom

Edited by Marcus W. Feldman, Stanford University, Stanford, CA, and approved April 10, 2017 (received for review January 11, 2017)

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: A.M. wrote the paper.
The author declares no conflict of interest.
This article is a PNAS Direct Submission.
(1) Email: a.mesoudi@exeter.ac.uk.

In the past few decades, scholars from several disciplines have pursued the curious parallel noted by Darwin between the genetic evolution of species and the cultural evolution of beliefs, skills, knowledge, languages, institutions, and other forms of socially transmitted information. Here, I review current progress in the pursuit of an evolutionary science of culture that is grounded in both biological and evolutionary theory, but also treats culture as more than a proximate mechanism that is directly controlled by genes. Both genetic and cultural evolution can be described as systems of inherited variation that change over time in response to processes such as selection, migration, and drift. Appropriate differences between genetic and cultural change are taken seriously, such as the possibility in the latter of nonrandomly guided variation or transformation, blending inheritance, and one-to-many transmission. The foundation of cultural evolution was laid in the late 20th century with population-genetic style models of cultural microevolution, and the use of phylogenetic methods to reconstruct cultural macroevolution. Since then, there have been major efforts to understand the sociocognitive mechanisms underlying cumulative cultural evolution, the consequences of demography on cultural evolution, the empirical validity of assumed social learning biases, the relative role of transformative and selective processes, and the use of quantitative phylogenetic and multilevel selection models to understand past and present dynamics of society-level change. I conclude by highlighting the interdisciplinary challenges of studying cultural evolution, including its relation to the traditional social sciences and humanities.

cultural evolution | cumulative culture | gene–culture coevolution | human evolution | social learning

The formation of different languages and of distinct species, and the proofs that both have been developed through a gradual process, are curiously parallel. . .
Charles Darwin, The Descent of Man, p 90

This quote from Charles Darwin (1) draws a parallel between, on the one hand, the genetic evolution of species, and on the other, cultural change (i.e., changes in socially learned information, such as beliefs, knowledge, tools, technology, attitudes, norms, and, as Darwin mentions, languages). This idea is the basic premise of cultural evolution: Cultural change constitutes a Darwinian evolutionary process that shares key characteristics with the genetic evolution of species. The emergence of this second evolutionary process saw an unprecedented extension of genetic evolution by allowing organisms to adapt more rapidly to, and more powerfully create and shape, their environments.

Since the 1980s, this parallel between genetic and cultural evolution has been pursued by scholars from a range of disciplines across the social, behavioral, and biological sciences. In this article, I review the current state of this interdisciplinary effort, focusing on topics of major recent research interest. No new theories or findings are presented, but in presenting disparate strands of work alongside each other, I hope to identify links between strands and foster a synthetic evolutionary science of culture (2, 3) paralleling the interdisciplinary synthesis of the biological sciences in the early 20th century.

Historical Context
Darwin’s comment above was inspired by historical linguists of his time, who, even before publication of On the Origin of Species (4), were constructing tree-like schemas of extant languages explicitly based on the assumption of common descent (5). Although “evolutionary” ideas became popular ways of describing cultural change in the late 1800s, such ideas were confused. Many scholars erroneously saw evolution as inevitable progress along fixed stages of increasing complexity (e.g., from savagery to barbarism to civilization), drawing more from Herbert Spencer than from Darwin (6). There was also much confusion, in the absence of a clear understanding of genetics, about genetic and cultural inheritance. Theories were often literally Lamarckian, with ideas, artifacts, and words somehow thought to become part of the germ line through repeated use (7). Due to this confusion, as well as the misuse of pseudoevolutionary racial theories for distasteful political ends, early 20th century social scientists declared culture to be separate from “the organic” (8); the biological and social sciences went separate ways; and the notion of cultural evolution, or indeed any evolutionary basis for human behavior, fell from favor.

Cultural Microevolution
It was not until the 1970s and 1980s that a properly Darwinian theory of cultural change was formulated, first by Cavalli-Sforza and Feldman (9, 10) and then by Boyd and Richerson (11). This theory comprised quantitative models of cultural microevolution, describing the mechanisms by which cultural variation is transmitted from person to person, and the processes that change this variation over time within populations (Table 1), thus embodying the “population thinking” that characterizes Darwin’s approach.

Here, “culture” is defined as “information capable of affecting individuals’ behavior that they acquire from other members of their species through teaching, imitation, and other forms of social transmission” (12). “Social transmission,” “social learning,” and “cultural transmission” are used interchangeably to denote the nongenetic transfer of learned information from one individual to another. “Cultural trait,” “cultural variant,” and sometimes “meme” are used to refer to the information (e.g., ideas, attitudes, skills) that is transmitted. All of these terms hide huge complexity and caveats. Such simplification is typical of a modeling approach. This approach follows population genetics, which makes simplifying assumptions (e.g., infinitely large populations) to understand similarly complex genetic evolutionary processes. The simplification in both cases is tactical, aiming to understand complex processes in a piecemeal fashion and to formalize verbal arguments (13).

Some of the processes in Table 1 have parallels in genetic evolution. Selection-like “content” or “direct” biases favor the acquisition and transmission of some cultural variants over others due to their memorability or effectiveness (14, 15), just as some

www.pnas.org/cgi/doi/10.1073/pnas.1620741114 PNAS | July 25, 2017 | vol. 114 | no. 30 | 7853–7860


Table 1. Comparison of genetic and cultural evolution as originally modeled by Cavalli-Sforza and Feldman (10) and Boyd and Richerson (11), with citations to recent empirical tests or examples

Process        Genetic evolution                 Cultural evolution
Variation      Genetic mutation                  Undirected cultural mutation/invention (16)
               Recombination                     Recombination (133)
               -                                 Guided variation or transformation (109, 112)
Inheritance    Particulate                       Particulate or “meme-like” (134)
               -                                 Blending of continuous cultural variation (60)
               Vertical (parent to offspring)    Vertical cultural transmission (75)
               Horizontal gene transfer          Horizontal/oblique cultural transmission (76, 77)
Selection      Natural selection                 Cultural selection or direct/content biases (14, 15)
               -                                 Frequency-dependent biases (e.g., conformity) (88, 89)
               -                                 Indirect (e.g., success, prestige) biases (92, 99)
Migration      Gene flow                         Demic diffusion (movement of people with their cultural traits) (17)
               -                                 Cultural diffusion (movement of traits without people) (135)
Drift          Genetic drift                     Cultural drift (18)
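
To make one row of Table 1 concrete, the sketch below iterates a simple recursion for a frequency-dependent (conformist) bias acting on a trait with two variants. The recursion p' = p + D p (1 - p)(2p - 1) is a form commonly used in the cultural evolution modeling literature; it is assumed here for illustration and is not quoted from this article, and the names `conformist_step`, `iterate`, and the parameter `D` are hypothetical.

```python
def conformist_step(p, D):
    """One generation of conformist transmission for a two-variant trait.

    p: current frequency of variant A (between 0 and 1).
    D: strength of conformity (0 gives unbiased copying; assumed 0 <= D <= 1).
    """
    # The term (2p - 1) is positive when A is the majority variant and
    # negative when it is the minority, so common variants become more common.
    return p + D * p * (1.0 - p) * (2.0 * p - 1.0)


def iterate(p0, D, generations):
    """Apply conformist_step repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = conformist_step(p, D)
    return p


# A variant that starts in the majority spreads; one in the minority declines.
majority_outcome = iterate(0.6, 0.3, 200)
minority_outcome = iterate(0.4, 0.3, 200)
```

Under this recursion the outcome depends only on whether the variant starts above or below a frequency of one half, which is one way of stating the population-level consequence of conformity: it tends to homogenize a group around whichever variant is initially common.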

alleles have higher fitness than others. Random cultural “mutation” occurs where new variation arises randomly, such as via perceptual error (16), akin to random genetic mutation. Migration allows individuals to introduce novel variants to a population as they move (17), just as gene flow spreads alleles. Some cultural traits, such as first names, fit the expectations of neutral drift (18), just like some alleles.

However, this enterprise is not simply the unthinking transfer of models from genetic to cultural evolution. In many cases, cultural variation is generated, inherited, and changed in very different ways from genetic variation, and models have addressed these differences (10, 11). Examples include the blending of continuous, nonparticulate cultural variation; the systematic, nonrandom generation of cultural variation, or “guided variation”; frequency-dependent biases, such as conformity, where variants are adopted based on their commonness in the population; and indirect biases, where traits are adopted based on the characteristics of their bearers, such as success or prestige. Where possible, evidence from psychology, anthropology, sociology, and other fields was used to justify these processes (10, 11). However, the use of quantitative models went beyond typical theory in the social and behavioral sciences by (i) precisely and explicitly defining these processes rather than relying on imprecise verbal descriptions of phenomena (e.g., “conformity”) and (ii) exploring the population-level consequences of such processes, such as the consequences of frequency-dependent biases for between- and within-group cultural variation (19).

Cultural Macroevolution
In the 1990s, the study of cultural microevolution was supplemented by the study of cultural macroevolution, defined as long-term cultural change at or above the level of the society. Mace and Pagel (20) introduced the phylogenetic comparative method as a means to (i) reconstruct the cultural evolutionary history of a particular trait or set of traits and (ii) test functional hypotheses concerning the spread or distribution of cultural variation across societies while controlling for evolutionary history. The latter had been a problem within anthropology for over a century. In 1889, Francis Galton (21) pointed out that even if two traits (e.g., cattle-keeping and patriliny) often co-occur across many societies, this co-occurrence does not necessarily provide evidence that they are functionally associated (e.g., cattle-keeping causes patriliny), because all these societies may have culturally inherited this combination from a common ancestral society. Societies are not necessarily statistically independent data points, due to shared history. This problem is the same one facing biologists when comparing across species, and, in the meantime, biologists had developed methods for controlling for nonindependence due to common descent (22). Mace, Pagel, and others imported these methods to test functional evolutionary hypotheses in the same way, showing, for example, that cattle-keeping likely did cause a switch from matriliny to patriliny even after controlling for descent (23).

Concurrently, archaeologists began using phylogenetic methods to reconstruct the history of artifacts, such as projectile points (24), and, following Darwin’s original insight, others began reconstructing the history of language families (25). As with microevolution, the advantage here was in the use of quantitative methods borrowed from biology that were explicit in their assumptions about how to reconstruct historical relationships (e.g., maximum parsimony, maximum likelihood), repeatable and extendable by others, and easily scaled up to large datasets (26), in contrast to the informal, idiosyncratic, and subjective schemas of historical linguistics or archaeology.

Is It Evolution?
A common question is whether culture really evolves. This question comes from both social scientists skeptical of any kind of parallel with evolution, and biologists insistent that Darwin’s theory applies only to genetic evolution (27). Importantly, no one argues that genetic evolution and cultural evolution operate identically. From the outset, microevolutionary modelers incorporated processes unique to cultural change, such as one-to-many transmission (10) or nonrandomly guided variation (11). However, an examination of Table 1 indicates that the parallels are numerous enough to warrant an evolutionary theory of culture, as long as these differences are taken seriously. At its heart, cultural change is a process of inherited variation that changes due to selection, drift, migration, and other processes, which, in their details, may operate similarly to or differently from the genetic case.

Similarly, at the macroevolutionary level, it is sometimes argued that human culture is so riven with cross-lineage diffusion that it is not tree-like, and thus not amenable to phylogenetic methods (26). Although this argument may be true for some cultural domains, many, such as languages or some artifacts, have been shown to be tree-like due to strong intergenerational cultural descent (28). Moreover, cross-lineage blending is a common feature of genetic evolution when we look beyond our own kingdom to, say, prokaryotes, where horizontal gene transfer is rife (29). Indeed, network-based methods exist for dealing with non–tree-like data (30).

One indirect, but perhaps most important, test of the parallel between genetic and cultural change is whether methods borrowed from evolutionary biology, suitably modified, actually prove useful in explaining cultural change in a manner that adds to the findings of nonevolutionary methods. Table 2 lists such methods, which are further discussed throughout this article.

Evolution of Cultural Evolution
In parallel to the study of cultural change itself, that is, changes in the contents of culture, modelers have also examined when and why the capacity for cultural evolution evolved. Models

suggest social learning evolves when environments change at intermediate rates: too quickly for genes to track directly, but not so fast that socially learned information is outdated (11, 31). An influential model by Rogers (32) showed that whereas social learning readily evolves in such conditions, it does not increase mean population fitness, contrary to claims that culture is responsible for our species’ evolutionary success. Further models showed that social learning does enhance population fitness when it is cumulative; that is, individuals can learn via social learning what they could not possibly learn alone (33). This finding has led to extensive study of the factors that permit cumulative culture (discussed below). Finally, gene–culture coevolution models and data have examined how cultural evolution interacts with and affects genetic evolution (34). There is extensive evidence for culture-driven genetic change in humans, including agriculture-induced genetic adaptations for digesting starch, dairy products, and alcohol (34, 35). Overall, this research shows the extent to which culture extends genetic evolution by independently tracking environmental change that is too rapid for genes to track, by generating diverse cultural adaptations to those environmental challenges, and by driving genetic evolution.

Recent Research Trends

Cumulative Culture. Just 20 y ago, little was known about social learning and culture in nonhuman species. Many definitions of culture stated that it was unique to humans, making the notion of “nonhuman culture” nonsensical. Now, it is established that a range of species from diverse taxa exhibit social learning (36), as well as cultural traditions, where social learning generates long-lasting behavioral differences between groups (37).

However, there is still a gulf between the cultural achievements of humans and other species. Recent work has focused on cumulative culture, where knowledge is built up over successive generations to exceed anything that a single individual could invent alone (38, 39). This ability appears to be unique to humans: Although chimpanzees’ nut-cracking (36) or dolphins’ use of protective nose-sponges (40) does not seem to exceed what a lone chimpanzee or dolphin could invent alone, it is surely impossible for a single human to have discovered quantum mechanics, invented smartphones, visited the moon, or achieved any of the other feats that require standing on the shoulders of previous generations. As noted above, models of the evolution of culture show that cumulative culture is particularly effective at increasing mean population fitness beyond the population fitness of noncumulative cultural species (33). A major research question is therefore “What allows human culture to be uniquely cumulative?”

There has been much focus on high-fidelity social learning, which is needed to preserve modifications over successive generations such that they can accumulate (41, 42). It was initially suggested that imitation (i.e., the copying of bodily actions rather than products) was key to this high fidelity (43). This claim seems doubtful, given that chimpanzees can imitate tool use techniques with high enough fidelity for alternate techniques to stabilize in different groups (44). Rather than an “imitation vs. no-imitation” dichotomy, perhaps humans are more effective, spontaneous, or compulsive imitators (45). Other comparative work has suggested roles for prosociality and language-mediated teaching (46). As well as individual cognitive abilities, cumulative culture may also require favorable demography, such as larger populations (42, 47) or populations partially connected via migration (48, 49), as is typical of human societies (50). Currently, there is no consensus on which of these factors is key to explaining uniquely human cumulative culture. A combination of more than one factor is probably necessary, perhaps explaining why cumulative culture is confined to just one extant species.

Is “cumulative culture” synonymous with “cultural evolution”? In principle, evolutionary change can be noncumulative, involving changes in trait frequencies over time such as occurs with genetic drift or local adaptive changes in gene frequencies. In this sense, nonhuman, noncumulative cultural change can justifiably be called cultural evolution. However, genetic evolution is clearly also cumulative, involving the gradual accumulation of beneficial genetic modifications over time to produce complex adaptations, such as eyes or wings. Human cumulative cultural evolution bears a clear parallel with this form of cumulative genetic evolution. Indeed, the gradual accumulation of cultural innovations results in complex cultural adaptations, such as telescopes or airplanes, that resemble and rival complex genetic adaptations (12).

Ideas regarding the origin of cumulative culture can inform thinking about factors that might affect recent and ongoing cumulative cultural evolution. The invention of writing, followed by digital media, surely greatly increased the fidelity of social learning and, potentially, the speed of cumulative culture (51). Demographically, the threefold increase in the world population and increased global mobility in the past century should also have accelerated cumulative culture. However, there are also constraints. For example, as the amount of knowledge that is accumulated increases (which, by definition, it must), it should take longer for each new individual to acquire that knowledge. This increased acquisition cost may result in extended educational periods, and the eventual slowing down of cumulative culture as innovation becomes harder (52).

Demography. Biologists have long recognized that demographic factors, such as population size, structure, and interconnectedness, are crucial for understanding trajectories of evolutionary change (53). Although the effect of demography on cultural evolution was modeled in the 1980s (10), the past decade has seen a major focus, mostly in archaeology, on the way in which population structure affects patterns of cultural variation and the gain and loss of cultural complexity. Shennan (54) and Henrich (47) argued that population size has been a major determinant of cultural complexity in hunter-gatherers, often measured as the number of tools in a toolkit or the number of components per tool. Henrich (47) argued that the loss of toolkit complexity in Tasmania following isolation from Australia 12 kya was due to reduced effective population size, because the isolated population was too small to maintain complexity, given imperfect social learning. In his model of this process, each member of each new generation acquires the skill of the most skillful member of the previous generation with systematic loss due to copying error and some chance of improvement. Larger populations make the loss of skills due to imperfect copying less likely and improvements more likely. Shennan and coworkers (48, 54) argued that increasing population densities in Upper Paleolithic Europe around 45 kya caused the major increase

Table 2. Methods and concepts that have been adapted from evolutionary biology to study cultural change

Evolutionary biology                    Cultural evolution
Population genetic models               Cultural evolution (or gene–culture coevolution) models (10, 11)
Gene-based phylogenetics                Cultural phylogenetics (24, 26)
Comparative (cross-species) method      Comparative (cross-cultural) method (20)
Population dynamic models               Historical dynamic models (120)
Multilevel selection                    Multilevel cultural selection (122, 123)
Genetic drift                           Cultural drift (15, 18)
Multigeneration breeding experiments    Multigeneration transmission chain experiments (111, 136)
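
As an illustration of the first row of Table 2, the demographic model attributed to Henrich (47) in the Demography section can be sketched in a few lines. This is a minimal sketch, not the published implementation: the function names, the parameter names alpha and beta, their values, and the choice of Gumbel-distributed copying noise are all assumptions made here for illustration. Each learner copies the most skilled member of the previous generation; the typical copy falls short of the model, but a few learners exceed it by chance.

```python
import math
import random


def gumbel_draw(rng, loc, scale):
    """Draw one value from a Gumbel distribution by inverse-transform sampling."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return loc - scale * math.log(-math.log(u))


def simulate_best_skill(pop_size, generations=200, alpha=4.0, beta=1.0, seed=1):
    """Return the top skill level after `generations` rounds of copying.

    Hypothetical parameters (assumptions, not taken from the article):
      alpha: how far the most likely copy falls below the model's skill
             (systematic loss due to copying error)
      beta:  spread of copying outcomes (allows rare improvements)
    """
    rng = random.Random(seed)
    best = 0.0  # skill of the most skilled individual, arbitrary units
    for _ in range(generations):
        # Every learner copies the current best, with lossy, noisy transmission.
        learners = [gumbel_draw(rng, best - alpha, beta) for _ in range(pop_size)]
        best = max(learners)
    return best


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, round(simulate_best_skill(n), 1))
```

Under these assumptions, the expected change in top skill per generation is approximately -alpha + beta (ln N + 0.577) for a population of N learners, so with the values above small populations lose skill over time and large populations gain it, with the crossover near N = 30. This mirrors the qualitative claim in the text that larger populations make loss less likely and improvement more likely.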

Mesoudi PNAS | July 25, 2017 | vol. 114 | no. 30 | 7855


in symbolic and technological complexity seen in the archaeological record, rather than a genetic mutation that enhanced cognitive ability.

Subsequent tests of the link between demography and cultural complexity have been mixed. Analyses of tools (55) and languages (56) on islands in the Pacific Ocean are supportive, with toolkit complexity and word gain rate increasing with island size. Others have found environmental risk, not population size, to predict toolkit complexity (57). These findings remain to be reconciled. A recent critique arguing that “population size does not explain past changes in cultural complexity” (58) is surely too strong (59). Although not even the strongest advocates of demography would argue that population size should always predict cultural complexity, the notion that demography has no effect at all is surely also incorrect in light of the positive evidence cited above. Recent models integrating population density and mobility (not simply population size) as proxies of cultural transmission (49) offer promise for more robust tests of demographic hypotheses.

Experiments have also examined how population size affects cultural complexity (60–63). Experiments cannot prove whether demography affects cultural evolution in the real world, but they can test the validity of behavioral assumptions of demographic models and manipulate factors in a way we cannot in real life. For example, the one study not to find a link between group size and cultural complexity used a simple task that is unlikely to benefit from a large pool of demonstrators (61), highlighting the importance of task difficulty. Another showed that blending inheritance, where learners combine information from multiple demonstrators, also leads to a group size effect (60), providing an alternative mechanism to the assumption that people solely copy the most skilled group member (47).

Cultural Phylogenetics. The use of phylogenetic methods to reconstruct human history has been greatly extended in both methodology and subject matter. Methodologically, maximum parsimony has been superseded by Bayesian Markov chain Monte Carlo (MCMC) techniques that allow the testing of explicit evolutionary hypotheses such as whether artifacts exhibit core packages of traits that are inherited together (64); network-based methods to handle non–tree-like reticulation (30); and/or phylogeographic methods that explicitly model the spread of cultural traits in space, making assumptions about geographical constraints [e.g., water bodies as potential barriers to language diffusion (65)].

Such methods have been used to test hypotheses about human history with more rigor than traditional nonquantitative, nonevolutionary methods, shedding light on, for example, the origin of the Indo-European language family (55), whether similarities between hand-axe assemblages are caused by shared descent or convergence (66), the historical links between folktales from different regions (30), and even the recent evolution of computer programming languages (67). These analyses do not ignore existing work in traditional historical disciplines. They often test existing hypotheses, but using larger samples and more powerful statistical techniques. For example, Currie et al. (68) found support in South-East Asia and the Pacific for the hypothesis that political complexity increases in a unilinear sequence (69), moving from acephalous to simple chiefdoms, to complex chiefdoms, to states, without skipping stages, but with possible collapses down to any earlier stage (note that this latter point is evidence against the aforementioned Spencerian “progress” theories of social evolution, which posited that progress toward increasing complexity is an inevitable law). Similarly, Haynie and Bowern (70) used phylogenetic comparative analyses of Australian languages to test Berlin and Kay’s sequence model of color term acquisition (71), where languages first acquire terms for black/white, then red, then green/yellow, then blue, then brown. This model was generally supported, albeit with many exceptions to the sequence.

Empirical Tests of Social Learning Biases. Cultural evolution models contain assumptions about how people learn from one another. Some posit biases such as conformity, where people are disproportionately more likely to copy common cultural traits, or success or prestige biases, where people preferentially copy traits of successful or prestigious individuals (11). Others explore the consequences of learning from parents (vertical transmission), elders (oblique transmission), and peers (horizontal transmission) (10). Where possible, these assumptions are made using evidence from the social and behavioral sciences, such as the social learning literature in psychology (72) or the diffusion of innovations literature in sociology (73). However, such evidence was not collected with these models in mind, and is often unsuitable. For example, classic psychology studies of conformity confound social influence and personal judgment in a way that cannot reveal whether conformity in the sense modeled in the cultural evolution literature is present (74). Such differences may seem trivial, but they have significant population-level implications: Only conformity in the sense of disproportionately copying the majority can lead to within-group homogeneity (11). Consequently, the past decade has seen the use of field studies and laboratory experiments that test the assumptions and predictions of cultural microevolution models.

Field studies, typically conducted in small-scale societies where paths of social transmission are easier to trace, have examined whether and when people use vertical, oblique, or horizontal transmission to acquire key skills, beliefs, or knowledge. This research shows a cumulative refinement of findings. An initial study on the Aka, relying solely on retrospective self-report (asking people from whom they learned X), found substantial vertical transmission for most traits, including hunting skills, food gathering/preparation techniques, infant care methods, and sharing norms, with dancing and singing the only domains with substantial nonparental influence (75).

However, retrospective self-report is vulnerable to recall bias. Later studies used nonretrospective methods (e.g., asking to whom people would go to learn X) or non–self-report regressions to assess pairwise similarity across respondents, on the assumption that higher similarity indicates transmission between those individuals. These studies support a two-stage model of skill acquisition (76–79): People initially learn from their parents, and then update this knowledge by learning from older adults or peers later in life. Moreover, the updating stage typically targets highly skilled or knowledgeable individuals (76). Interestingly, this pattern resembles evidence regarding childhood learning in Western societies, where peers are more important than parents for the acquisition of key skills and knowledge (80). Such findings are also consistent with models of age-based learning schedules, which find that cumulative cultural evolution is facilitated when learning becomes increasingly oblique and horizontal with age (81). Nevertheless, these generalizations betray many exceptions, and effects may vary with domain (82).

Laboratory experiments have explored similar questions regarding from whom people learn. Experiments offer more control than field studies, albeit with reduced external validity. Participants face an unfamiliar task designed to resemble a real-life task faced by people past or present. They can solve this task through asocial learning and/or various forms of social learning (e.g., conformity, success bias). Such studies paint a consistent pattern despite using different tasks and protocols (74, 83–87). At least some people behave adaptively as predicted by models, by learning socially when appropriate (e.g., when asocial learning is costly) and in an appropriate manner (e.g., using success bias rather than copying at random). Moreover, some people exhibit adaptive flexibility, such as using payoff bias in a task to determine which of two choices yields higher payoffs, but frequency-based biases in a coordination task where it pays to match others’ choices (85).

However, across these experiments, there are unexplained individual differences in social learning use and often an underutilization of social information (74, 83–87). There is some

evidence linking this individual variation to other individual differences, such as personality (87) or intelligence (88, 89), but these correlations are weak and exploratory. There is also evidence of cross-cultural variation in social learning, specifically higher social learning in collectivistic East Asian societies than in individualistic Western societies (90, 91). Similar individual and cultural variation in nonexperimental data suggests that this finding is not just a laboratory artifact (92).

The causes and consequences of this individual and cultural variation are unclear. Individual variation in social learning is found in many species. It can be viewed on a continuum of phenotypic plasticity, from genetically polymorphic and developmentally fixed individual differences, to developmentally determined facultative responses to external environments or physiological state, to the associative learning of learning strategies (93). Whether individual variation in human social learning is nonadaptive noise, reflects frequency-dependent equilibria between information producers and scroungers with no group-level benefit (32), or is adaptive at the group level by maintaining a mix of innovation and tradition is unknown.

Cultural variation in human social learning suggests that we acquire norms via social learning that, in turn, affect our degree of social learning (93). This process may generate cultural dynamics entailing the “social learning of social learning” (94), rather than the typical modeling assumption that learning strategies are genetically inherited. The origin of this cultural variation is unknown, but it might be an historical and societal response to different rates of environmental change (95). Another line of work has identified intentional, institution-based mechanisms for generating innovation (96), suggesting similar flexibility in asocial learning. There is great scope to use the methods of cultural macroevolution outlined above to test these hypotheses and explain how cultural macroevolution, as well as genetic evolution, has shaped the behavioral responses measured in experiments, rather than assuming that participants are coming into the laboratory as “blank slates.”

Development. The brief mentions above of work with children belie rapid growth in the study of the developmental basis of cultural evolution (97). Such studies are crucial for understanding how people developmentally acquire the learning biases noted above to be individually and culturally variable, and how these learning biases interact with developing abilities, such as language and theory of mind. Experiments have shown that children are sophisticated social learners and exhibit biases predicted by models to be adaptive, such as preferentially learning from accurate over inaccurate individuals (98) and prestigious over nonprestigious individuals (99). There is work combining biases, finding that children copy groups over individuals when both are equally successful but copy successful individuals over unsuccessful groups, thus adaptively switching between frequency and success information (100). Other work has addressed the motivation for copying, with children more likely to imitate indiscriminately when tasks are presented as conventional rather than instrumental, such that the motivation is to affiliate with one’s group rather than acquire effective skills (101).

There are exceptions to these impressive skills, however. One study found that children copy adults over peers even when peers are more knowledgeable (102). In addition, there is similar individual (103) and cultural (90) variation as seen in adults, which has yet to be explained. Further work is needed to link the study of social learning in childhood and adulthood, ideally using models to link developmentally changing learning schedules to macroevolutionary patterns of cumulative culture (81).

Cultural Attraction. An ongoing debate has been over the relative role of preservative, selection-like processes and nonselective, transformative processes in explaining cultural change (104–107). Many of the cultural evolution models described earlier assume high-fidelity transmission plus random copying error or mutation, with selection-like processes, such as content or conformist biases, altering the frequency of cultural traits over time. Sperber and coworkers (105, 106) have argued that many instances of cultural change do not take this form. Instead, they argue, cultural transmission is transformative: People reconstruct what they learn from others according to their preexisting knowledge, cognitive or perceptual biases, or other factors. This process of transformation is known as cultural attraction, and the points at which representations converge are called cultural attractors (105, 106) [although ambiguities in these definitions are discussed elsewhere (108)].

For example, recall the phylogenetic analysis showing that Australian languages typically acquire color terms in the specific Berlin–Kay sequence, explaining cross-cultural regularities in color terminology (70). Cultural attraction offers a plausible microevolutionary explanation for this finding: People share perceptual systems that lead them to invent and transform color terms independently in the same way. This explanation is supported by experiments in which initially random artificial color terms were passed along chains of people (109). Each participant learned unfamiliar terms for each color, with these labels, including errors, passed to the next person as his or her learning set. In each of 30 independent chains, the artificial terminology converged on the predicted Berlin–Kay scheme, as each person transformed the labels in a systematic manner. Similar experiments have shown convergence toward universal patterns of grammatical structure (110), category learning (111) and bloodletting as a medical practice (112).

The cultural attraction approach moves the explanatory focus from the population to the individual level. Rather than explaining patterns of cultural diversity, stability, and change in terms of the differential selection of certain cultural variants (e.g., content biases) or differential copying of certain individuals (e.g., success bias, prestige bias), cultural attraction focuses on how individuals systematically transform representations. The latter is similar [but not identical (107)] to evolutionary psychologists’ notion of “evoked culture” (113), where genetically evolved cognitive biases cause the independent recurrence of genetically adaptive behavior, and the process of iterated learning, where repeated learning and transmission cause convergence on inductive biases or priors (111).

Although cultural attraction is sometimes presented as an alternative to cultural evolution, the two approaches are compatible (104). Many “standard” cultural evolution models, in fact, do not model transmission as high fidelity, and allow for transformation (47, 114). The notion of guided variation (11) is similar to the individual, nonrandom transformation described as cultural attraction, and can operate in parallel to the more selection-like transmission biases in Table 1. However, cultural attraction proponents have a valid point that, in practice, such transformative processes have not received adequate attention.

The relative influence of each likely varies with domain (104). Where there are clear inductive biases favoring certain representations, such as bloodletting or color terms, then explanations in terms of transformation/attraction will be useful. Where there are no clear intuitions or inductive biases, selection-like processes will be more important. Bloodletting, for example, has been replaced in many societies with surgical techniques that are the product of a long refinement and accumulation of unintuitive knowledge and skills. Many medical, scientific, and technological practices are the product of accidental invention followed by payoff-biased selection in the face of resistance due to conformity to prior practices or transformation back to intuitive attractors (115). Examples include glassmaking (116); musical instruments (12]); and the theory of evolution, an unintuitive idea that needs conscious effort to understand (118).

Even where there is clear evidence for inductive biases, as in the case of color terms (109), the prediction of cross-cultural universals is only partially upheld in real-world data, as shown by the many exceptions to the Berlin–Kay scheme identified by phylogenetic analyses (70). Further work might show these



exceptions to be determined by processes like migration or prestige bias. A Bayesian framework can be useful here, having been used to model both attraction-like inductive biases (111) and selection-like biases (119). This integration of selective and transformative processes, and use of both microevolutionary evidence regarding cognitive biases and macroevolutionary analyses of actual cultural diversity and change, holds great promise.

Historical and Contemporary Social Dynamics. As well as cultural phylogenetics, another approach to understanding historical change uses population dynamic models of the kind used within ecology to model changes in population size over time in response to births, deaths, predation, or migration (120). This approach is more useful when detailed temporal data are available, such as on the rise and fall of empires. Again, models and theories are not unthinkingly imported from ecology and applied to human societies. They are adapted to take into account the unique aspects of human culture, often drawing on traditional theories from history and sociology. However, the advantages of this approach are that (i) theories can be precisely quantified in a way that generates clear predictions, unlike verbal arguments, and (ii) these predictions can be empirically tested often across multiple societies, regions, and time periods, in a way that historians seldom do (120).

For example, Turchin (120) used population dynamic models to test competing theories for the rise and fall of empires in Europe during the period from AD 0–1900. One idea, proposed by Ibn Khaldun in the 14th century, can be interpreted as a theory of multilevel selection. Here, societies grow by solving collective action problems, such as building irrigation systems or organizing collective defense against enemies. Societies that more effectively solve such problems grow in size and defeat other less internally cooperative societies, eventually becoming empires. When societies are very large, however, there is overproduction of elites who fight among themselves for power, as well as a disconnect between the majority and the squabbling elites. This internal conflict allows another society to invade, typically one in a border region with higher internal cooperation due to its smaller size and common enemy (the larger empire). This new empire grows larger, internal cooperation breaks down, the new empire is itself eventually invaded, and the cycle continues.

Turchin (120) converted this verbal theory into a quantitative model, incorporating within-group cooperation as well as factors traditionally considered important, such as access to resources and geographical overreach (121). This model provided a better fit to historical data on the rise and fall of actual empires than models without cooperation. This work shows not only the value of quantitative cultural evolution models and empirical tests as applied to history but also support for the idea of multilevel cultural selection, where societies grow as a result of superior within-group cooperation that provides an advantage in between-group competition (11, 122). Subsequent work has found further support for this theory in a spatially explicit model tested with data beyond Europe (123).

More recently, Turchin (124) has applied these ideas to current societies, particularly the United States, extending structural-demographic theories from political science (125). Worryingly, elite overproduction, interelite conflict, and social inequality, which were key markers of low within-group cooperation and impending empire collapse in the past, have been increasing in the United States in recent decades. Examples of interelite conflict include decreased Republican-Democrat cooperation in government and the Tea Party challenge within the Republican Party. The success of self-styled antiestablishment figures, such as Donald Trump, is arguably due to rising social inequality and a disconnect between the majority of voters and the increasingly conflict-ridden political elites.

Predicting the future can never be done with complete certainty, for either genetic or cultural evolution. However, the work of Turchin (124) and others makes events like the unexpected election of Donald Trump more understandable, and provides early warning signs of societal collapse based on the long view of cultural evolution that we ignore at our peril. Continual progress toward more stable forms of political organization is far from inevitable (68), and institutions are fragile balancing acts between individual and group interests (122). Better understanding of this balance, within the long view of cultural evolution, is surely crucial for creating and maintaining sustainable societies.

Conclusions

Science itself is a cumulative cultural evolutionary process (126). Is the science of cultural evolution accumulating an increasingly reliable body of knowledge concerning human culture? I hope I have shown that it is. Initial claims derived from self-reports that cultural transmission is largely vertical have been replaced with a two-stage model in which parental knowledge is updated via horizontal or oblique transmission, often targeted in age- or skill-appropriate ways (76, 77). The common modeling assumption that social learning is under fixed genetic control is incompatible with experimental evidence of substantial individual and cultural variation, and is being revised (93). The “demographic turn” within archaeology, itself an improvement on unrealistic “single genetic mutation” explanations for increases in past cultural complexity, has been refined to focus on population density and migration rather than just population size (49). Long-standing hypotheses regarding the acquisition of color terms, origin of Indo-European languages, and trajectory of sociopolitical complexity have been tested using phylogenetic methods that are more powerful than informal comparisons using cherry-picked examples (26).

However, major questions remain, which is also the normal course of science. Ongoing work seeks to explain why only humans exhibit cumulative cultural evolution, the origins of individual and cultural variation in social learning and innovation, the relative influence of selection and attraction across domains, and the balance between individual and group interests that shape societal cohesion and stability. A welcome trend is to apply cultural evolution theory to real-world problems, including environmental sustainability (122), the social effects of new digital media (51) and society-level cooperation (124). The work reviewed here is, moreover, a small selection of cultural evolution research. There is no space to cover, say, economics (127), neuroscience (128), literature (129), or religion (130).

As well as the use of quantitative methods, often borrowed from biology (Table 2), a major strength of cultural evolution work is its interdisciplinarity. Findings from, say, experimental psychology can be applied to problems in archaeology (16). Conversely, a consideration of the population-level consequences of psychological constructs, such as conformity, highlights their limitations (74). This interdisciplinarity should be pushed further, especially by integrating microevolution and macroevolution. For example, phylogenetic analyses of actual color terminologies (70) reveal broad support but key exceptions to the universality predicted by experiments (109). Are these exceptions also predictable from individual cognition, or are other factors needed? More broadly, interdisciplinarity is facilitated by open science. Although data and analytical techniques have traditionally been kept within disciplines as protected knowledge, the public release of data and use of reproducible analyses (e.g., R scripts) encourage scholars from other disciplines to explore that data and familiarize themselves first-hand with key findings.

However, there is still an epistemological gulf between the scientific, evolutionary approach outlined here and much of the social sciences and humanities, which, after all, study the same cultural phenomena (131, 132). Most cultural anthropologists, sociologists, historians, and the like are reluctant to use the simplifying assumptions and reductionism inherent in quantitative models and methods, instead highlighting complexity and contradictions. There is also a reluctance to generalize across societies, regions, or time periods and, instead, a focus on specificity and uniqueness. There is also a reluctance to consider continuities

between human behavior and the behavior of other species, with the latter often perceived to be
“instinctive” or genetically determined in a way that human behavior is not [an overview of these
issues from both sides is provided elsewhere (131)].

This reluctance is, I think, misguided. The methods and approaches described here add to, rather
than detract from, traditional methods and knowledge. Phylogenetic analyses typically use existing
linguistic or ethnographic data to reconstruct historical relationships and test causal historical
hypotheses, but with greater rigor than is possible by considering single case studies or ignoring
Galton’s problem (20). Quantitative models are simply verbal arguments expressed more precisely and
unambiguously, and in a way that affords easier empirical testing. Those verbal arguments often
come from the social sciences and humanities (120, 121, 124, 125). There are also excellent
examples of traditional ethnography providing important corrections to cultural evolution theory,
such as Wilf’s (96) demonstration that cultural innovation can be institutionally driven.
Furthermore, cultural evolution offers a link to the wider evolutionary sciences without
inappropriate genetic determinism. Unlike sociobiology and evolutionary psychology approaches that
downplay transmitted culture (113), the work outlined here assumes that cultural evolution is
semiautonomous from genetic evolution, allowing rapid cultural adaptation to novel physical and
social environments without genetic change. Although some aspects of cultural diversity may reflect
genetically evolved, content-rich cognitive biases, many others reflect the accumulation of
modifications via content-independent learning rules such as success bias, leading to diverse,
historically contingent pathways of culturally inherited knowledge. The theory of cultural
evolution offers a means of taking culture seriously within a scientific, evolutionary framework.

ACKNOWLEDGMENTS. I thank Francisco Ayala, Marc Feldman, Kevin Laland, and Andrew Whiten for inviting
me to participate in the Sackler Colloquium upon which this special issue is based, and two
anonymous reviewers for helpful comments.

1. Darwin C (1871) The Descent of Man (John Murray, London); reprinted (2003) (Gibson Square, London).
2. Mesoudi A (2011) Cultural Evolution (Univ of Chicago Press, Chicago).
3. Mesoudi A, Whiten A, Laland KN (2006) Towards a unified science of cultural evolution. Behav Brain Sci 29:329–347, discussion 347–383.
4. Darwin C (1859) On the Origin of Species (John Murray, London); reprinted (1968) (Penguin, London).
5. van Wyhe J (2005) The descent of words: Evolutionary thinking 1780-1880. Endeavour 29:94–100.
6. Freeman D (1974) The evolutionary theories of Charles Darwin and Herbert Spencer. Curr Anthropol 15:211–221.
7. Stocking GW (1962) Lamarckianism in American social science: 1890-1915. J Hist Ideas 23:239–256.
8. Kroeber AL (1917) The superorganic. Am Anthropol 19:163–213.
9. Cavalli-Sforza LL, Feldman MW (1973) Cultural versus biological inheritance: Phenotypic transmission from parents to children. (A theory of the effect of parental phenotypes on children’s phenotypes). Am J Hum Genet 25:618–637.
10. Cavalli-Sforza LL, Feldman MW (1981) Cultural Transmission and Evolution (Princeton Univ Press, Princeton).
11. Boyd R, Richerson PJ (1985) Culture and the Evolutionary Process (Univ of Chicago Press, Chicago).
12. Richerson PJ, Boyd R (2005) Not by Genes Alone (Univ of Chicago Press, Chicago).
13. Servedio MR, et al. (2014) Not just a theory–the utility of mathematical models in evolutionary biology. PLoS Biol 12:e1002017.
14. Stubbersfield JM, Tehrani JJ, Flynn EG (2015) Serial killers, spiders and cybersex: Social and survival information bias in the transmission of urban legends. Br J Psychol 106:288–307.
15. Sindi SS, Dale R (2016) Culturomics as a data playground for tests of selection: Mathematical approaches to detecting selection in word use. J Theor Biol 405:140–149.
16. Kempe M, Lycett S, Mesoudi A (2012) An experimental test of the accumulated copying error model of cultural mutation for Acheulean handaxe size. PLoS One 7:e48333.
17. Hamilton MJ, Buchanan B (2007) Spatial gradients in Clovis-age radiocarbon dates across North America suggest rapid colonization from the north. Proc Natl Acad Sci USA 104:15625–15630.
18. Bentley RA, Hahn MW, Shennan SJ (2004) Random drift and culture change. Proc Biol Sci 271:1443–1450.
19. Mesoudi A (2009) How cultural evolutionary theory can inform social psychology and vice versa. Psychol Rev 116:929–952.
20. Mace R, Pagel M (1994) The comparative method in anthropology. Curr Anthropol 35:549–564.
21. Galton F (1889) Comment on Tylor, E. B., On a method of investigating the development of institutions, applied to laws of marriage and descent. J R Anthropol Inst 18:270.
22. Harvey P, Pagel M (1991) The Comparative Method in Evolutionary Biology (Oxford Univ Press, Oxford).
23. Holden CJ, Mace R (2003) Spread of cattle led to the loss of matrilineal descent in Africa: A coevolutionary analysis. Proc Biol Sci 270:2425–2433.
24. O’Brien MJ, Darwent J, Lyman RL (2001) Cladistics is useful for reconstructing archaeological phylogenies. J Archaeol Sci 28:1115–1136.
25. Gray RD, Jordan FM (2000) Language trees support the express-train sequence of Austronesian expansion. Nature 405:1052–1055.
26. Gray RD, Watts J (2017) Cultural macroevolution matters. Proc Natl Acad Sci USA 114:7846–7852.
27. Fracchia J, Lewontin RC (1999) Does culture evolve? Hist Theory 38:52–78.
28. Collard M, Shennan S, Tehrani J (2006) Branching, blending, and the evolution of cultural similarities and differences among human populations. Evol Hum Behav 27:169–184.
29. Doolittle WF, Bapteste E (2007) Pattern pluralism and the Tree of Life hypothesis. Proc Natl Acad Sci USA 104:2043–2049.
30. Tehrani JJ (2013) The phylogeny of Little Red Riding Hood. PLoS One 8:e78871.
31. Aoki K, Wakano JY, Feldman MW (2005) The emergence of social learning in a temporally changing environment. Curr Anthropol 46:334–340.
32. Rogers A (1988) Does biology constrain culture? Am Anthropol 90:819–831.
33. Boyd R, Richerson PJ (1995) Why does culture increase human adaptability? Ethol Sociobiol 16:125–143.
34. Laland KN, Odling-Smee J, Myles S (2010) How culture shaped the human genome: Bringing genetics and the human sciences together. Nat Rev Genet 11:137–148.
35. Henrich J (2015) The Secret of Our Success (Princeton Univ Press, Princeton).
36. Hoppitt W, Laland KN (2013) Social Learning (Princeton University Press, Princeton, NJ).
37. Whiten A (2017) Culture extends the scope of evolutionary biology in the great apes. Proc Natl Acad Sci USA 114:7790–7797.
38. Dean LG, Vale GL, Laland KN, Flynn E, Kendal RL (2014) Human cumulative culture: A comparative perspective. Biol Rev Camb Philos Soc 89:284–301.
39. Tennie C, Call J, Tomasello M (2009) Ratcheting up the ratchet: On the evolution of cumulative culture. Philos Trans R Soc Lond B Biol Sci 364:2405–2415.
40. Krützen M, et al. (2005) Cultural transmission of tool use in bottlenose dolphins. Proc Natl Acad Sci USA 102:8939–8943.
41. Lewis HM, Laland KN (2012) Transmission fidelity is the key to the build-up of cumulative culture. Philos Trans R Soc Lond B Biol Sci 367:2171–2180.
42. Kempe M, Lycett SJ, Mesoudi A (2014) From cultural traditions to cumulative culture: Parameterizing the differences between human and nonhuman culture. J Theor Biol 359:29–36.
43. Tomasello M (1999) The Cultural Origins of Human Cognition (Harvard Univ Press, Cambridge, MA).
44. Whiten A, McGuigan N, Marshall-Pescini S, Hopper LM (2009) Emulation, imitation, over-imitation and the scope of culture for child and chimpanzee. Philos Trans R Soc Lond B Biol Sci 364:2417–2428.
45. Lyons DE, Young AG, Keil FC (2007) The hidden structure of overimitation. Proc Natl Acad Sci USA 104:19751–19756.
46. Dean LG, Kendal RL, Schapiro SJ, Thierry B, Laland KN (2012) Identification of the social and cognitive processes underlying human cumulative culture. Science 335:1114–1118.
47. Henrich J (2004) Demography and cultural evolution. Am Antiq 69:197–214.
48. Powell A, Shennan S, Thomas MG (2009) Late Pleistocene demography and the appearance of modern human behavior. Science 324:1298–1301.
49. Grove M (2016) Population density, mobility, and cultural transmission. J Archaeol Sci 74:75–84.
50. Salali GD, et al. (2016) Knowledge-sharing networks in hunter-gatherers and the evolution of cumulative culture. Curr Biol 26:2516–2521.
51. Acerbi A (2016) A cultural evolution approach to digital media. Front Hum Neurosci 10:636.
52. Mesoudi A (2011) Variable cultural acquisition costs constrain cumulative cultural evolution. PLoS One 6:e18239.
53. Charlesworth B (2009) Fundamental concepts in genetics: Effective population size and patterns of molecular evolution and variation. Nat Rev Genet 10:195–205.
54. Shennan S (2001) Demography and cultural innovation. Camb Archaeol J 11:5–16.
55. Kline MA, Boyd R (2010) Population size predicts technological complexity in Oceania. Proc Biol Sci 277:2559–2564.
56. Bromham L, Hua X, Fitzpatrick TG, Greenhill SJ (2015) Rate of language evolution is affected by population size. Proc Natl Acad Sci USA 112:2097–2102.
57. Collard M, Buchanan B, O’Brien MJ (2013) Population size as an explanation for patterns in the Paleolithic archaeological record. Curr Anthropol 54:S388–S396.
58. Vaesen K, Collard M, Cosgrove R, Roebroeks W (2016) Population size does not explain past changes in cultural complexity. Proc Natl Acad Sci USA 113:E2241–E2247.



59. Henrich J, et al. (2016) Understanding cumulative cultural evolution. Proc Natl Acad Sci USA 113:E6724–E6725.
60. Kempe M, Mesoudi A (2014) An experimental demonstration of the effect of group size on cultural accumulation. Evol Hum Behav 35:285–290.
61. Caldwell CA, Millen AE (2010) Human cumulative culture in the laboratory: Effects of (micro) population size. Learn Behav 38:310–318.
62. Derex M, Beugin M-P, Godelle B, Raymond M (2013) Experimental evidence for the influence of group size on cultural complexity. Nature 503:389–391.
63. Muthukrishna M, Shulman BW, Vasilescu V, Henrich J (2014) Sociality influences cultural complexity. Proc Biol Sci 281:20132511.
64. Matthews LJ, Tehrani JJ, Jordan FM, Collard M, Nunn CL (2011) Testing for divergent transmission histories among cultural characters: A study using Bayesian phylogenetic methods and Iranian tribal textile data. PLoS One 6:e14810.
65. Bouckaert R, et al. (2012) Mapping the origins and expansion of the Indo-European language family. Science 337:957–960.
66. Lycett SJ (2009) Are Victoria West cores “proto-Levallois”? A phylogenetic assessment. J Hum Evol 56:175–191.
67. Valverde S, Solé RV (2015) Punctuated equilibrium in the large-scale evolution of programming languages. J R Soc Interface 12:20150249.
68. Currie TE, Greenhill SJ, Gray RD, Hasegawa T, Mace R (2010) Rise and fall of political complexity in island South-East Asia and the Pacific. Nature 467:801–804.
69. Johnson AW, Earle TK (2000) The Evolution of Human Societies (Stanford Univ Press, Stanford, CA).
70. Haynie HJ, Bowern C (2016) Phylogenetic approach to the evolution of color term systems. Proc Natl Acad Sci USA 113:13666–13671.
71. Berlin B, Kay P (1991) Basic Color Terms (Univ of California Press, Berkeley, CA).
72. Bandura A (1977) Social Learning Theory (Prentice Hall, Oxford).
73. Rogers E (1995) The Diffusion of Innovations (Free Press, New York).
74. Efferson C, Lalive R, Richerson PJ, McElreath R, Lubell M (2008) Conformists and mavericks. Evol Hum Behav 29:56–64.
75. Hewlett BS, Cavalli-Sforza LL (1986) Cultural transmission among Aka pygmies. Am Anthropol 88:922–934.
76. Henrich J, Broesch J (2011) On the nature of cultural transmission networks: Evidence from Fijian villages for adaptive learning biases. Philos Trans R Soc Lond B Biol Sci 366:1139–1148.
77. Hewlett BS, Fouts HN, Boyette AH, Hewlett BL (2011) Social learning among Congo Basin hunter-gatherers. Philos Trans R Soc Lond B Biol Sci 366:1168–1178.
78. Reyes-Garcia V, et al. (2009) Cultural transmission of ethnobotanical knowledge and skills. Evol Hum Behav 30:274–285.
79. Kline MA, Boyd R, Henrich J (2013) Teaching and the life history of cultural transmission in Fijian villages. Hum Nat 24:351–374.
80. Harris J (1995) Where is the child’s environment? Psychol Rev 102:458–489.
81. Lehmann L, Wakano JY, Aoki K (2013) On optimal learning schedules and the marginal value of cumulative cultural evolution. Evolution 67:1435–1445.
82. McElreath R, Strimling P (2008) When natural selection favors imitation of parents. Curr Anthropol 49:307–316.
83. McElreath R, et al. (2008) Beyond existence and aiming outside the laboratory: Estimating frequency-dependent and pay-off-biased social learning strategies. Philos Trans R Soc Lond B Biol Sci 363:3515–3528.
84. Mesoudi A (2011) An experimental comparison of human social learning strategies. Evol Hum Behav 32:334–342.
85. Molleman L, van den Berg P, Weissing FJ (2014) Consistent individual differences in human social learning strategies. Nature Commun 5:3570.
86. Morgan TJH, Rendell LE, Ehn M, Hoppitt W, Laland KN (2012) The evolutionary basis of human social learning. Proc Biol Sci 279:653–662.
87. Toelch U, Bruce MJ, Newson L, Richerson PJ, Reader SM (2014) Individual consistency and flexibility in human social information use. Proc Biol Sci 281:20132864.
88. Muthukrishna M, Morgan TJH, Henrich J (2016) The when and who of social learning and conformist transmission. Evol Hum Behav 37:10–20.
89. Shennan S, Wilkinson J (2001) Ceramic style change and neutral evolution. Am Antiq 66:577–593.
90. DiYanni CJ, Corriveau KH, Kurkul K, Nasrini J, Nini D (2015) The role of consensus and culture in children’s imitation of inefficient actions. J Exp Child Psychol 137:99–110.
91. Mesoudi A, Chang L, Murray K, Lu HJ (2015) Higher frequency of social learning in China than in the West shows cultural variation in the dynamics of cultural evolution. Proc Biol Sci 282:20142209.
92. Beheim BA, Thigpen C, McElreath R (2014) Strategic social learning and the population dynamics of human behavior. Evol Hum Behav 35:351–357.
93. Mesoudi A, Chang L, Dall SRX, Thornton A (2016) The evolution of individual and cultural variation in social learning. Trends Ecol Evol 31:215–225.
94. Acerbi A, Enquist M, Ghirlanda S (2009) Cultural evolution and individual development of openness and conservatism. Proc Natl Acad Sci USA 106:18931–18935.
95. Chang L, et al. (2011) Cultural adaptations to environmental variability. Educ Psychol Rev 23:99–129.
96. Wilf E (2015) Routinized business innovation: An undertheorized engine of cultural evolution. Am Anthropol 117:679–692.
97. Legare CH (2017) Cumulative cultural learning: Development and diversity. Proc Natl Acad Sci USA 114:7877–7883.
98. Birch SAJ, Vauthier SA, Bloom P (2008) Three- and four-year-olds spontaneously use others’ past performance to guide their learning. Cognition 107:1018–1034.
99. Chudek M, Heller S, Birch S, Henrich J (2012) Prestige-biased cultural learning. Evol Hum Behav 33:46–56.
100. Wilks M, Collier-Baker E, Nielsen M (2015) Preschool children favor copying a successful individual over an unsuccessful group. Dev Sci 18:1014–1024.
101. Herrmann PA, Legare CH, Harris PL, Whitehouse H (2013) Stick to the script: The effect of witnessing multiple actors on children’s imitation. Cognition 129:536–543.
102. Wood LA, Kendal RL, Flynn EG (2012) Context-dependent model-based biases in cultural transmission. Evol Hum Behav 33:387–394.
103. Burdett ERR, et al. (2016) Do children copy an expert or a majority? PLoS One 11:e0164698.
104. Acerbi A, Mesoudi A (2015) If we are all cultural Darwinians what’s the fuss about? Clarifying recent disagreements in the field of cultural evolution. Biol Philos 30:481–503.
105. Claidière N, Scott-Phillips TC, Sperber D (2014) How Darwinian is cultural evolution? Philos Trans R Soc Lond B Biol Sci 369:20130368.
106. Sperber D (1996) Explaining Culture: A Naturalistic Approach (Oxford Univ Press, Oxford).
107. Morin O (2015) How Traditions Live and Die (Oxford Univ Press, Oxford).
108. Buskell A (March 17, 2017) What are cultural attractors? Biol Philos, 10.1007/s10539-017-9570-6.
109. Xu J, Dowman M, Griffiths TL (2013) Cultural transmission results in convergence towards colour term universals. Proc Biol Sci 280:20123073.
110. Kirby S, Cornish H, Smith K (2008) Cumulative cultural evolution in the laboratory: An experimental approach to the origins of structure in human language. Proc Natl Acad Sci USA 105:10681–10686.
111. Griffiths TL, Kalish ML, Lewandowsky S (2008) Review. Theoretical and empirical evidence for the impact of inductive biases on cultural evolution. Philos Trans R Soc Lond B Biol Sci 363:3503–3514.
112. Miton H, Claidière N, Mercier H (2015) Universal cognitive mechanisms explain the cultural success of bloodletting. Evol Hum Behav 36:303–312.
113. Tooby J, Cosmides L (1992) The psychological foundations of culture. The Adapted Mind, eds Barkow JH, Cosmides L, Tooby J (Oxford Univ Press, London), pp 19–136.
114. Henrich J, Boyd R, Richerson PJ (2008) Five misunderstandings about cultural evolution. Hum Nat 19:119–137.
115. Muthukrishna M, Henrich J (2016) Innovation in the collective brain. Phil Trans R Soc B 371:20150192.
116. Macfarlane A, Martin G (2002) Glass: A World History (Univ of Chicago Press, Chicago).
117. Nia HT, et al. (2015) The evolution of air resonance power efficiency in the violin and its ancestors. Proc Math Phys Eng Sci 471:20140905.
118. Shtulman A (2006) Qualitative differences between naïve and scientific theories of evolution. Cognit Psychol 52:170–194.
119. Perreault C, Moya C, Boyd R (2012) A Bayesian approach to the evolution of social learning. Evol Hum Behav 33:449–459.
120. Turchin P (2003) Historical Dynamics (Princeton Univ Press, Princeton).
121. Collins R (1995) Prediction in macrosociology: The case of the Soviet collapse. Am J Sociol 100:1552–1593.
122. Waring TM, et al. (2015) A multilevel evolutionary framework for sustainability analysis. Ecol Soc 20:34.
123. Turchin P, Currie TE, Turner EAL, Gavrilets S (2013) War, space, and the evolution of Old World complex societies. Proc Natl Acad Sci USA 110:16384–16389.
124. Turchin P (2016) Ages of Discord (Beresta Books, Chaplin, CT).
125. Goldstone JA, et al. (2010) A global model for forecasting political instability. Am J Pol Sci 54:190–208.
126. Hull DL (1988) Science as a Process (Chicago Univ Press, Chicago).
127. Nelson RR, Winter SG (2002) Evolutionary theorizing in economics. J Econ Perspect 16:23–46.
128. Stout D (2013) Neuroscience of technology. Cultural Evolution: Society, Technology, Language, and Religion, ed Richerson PJ (MIT Press, Cambridge, MA), pp 157–174.
129. Moretti F (2013) Distant Reading (Verso Books, London).
130. Norenzayan A, et al. (2016) The cultural evolution of prosocial religions. Behav Brain Sci 39:e1.
131. Slingerland E, Collard M (2011) Creating Consilience: Integrating the Sciences and the Humanities (Oxford Univ Press, Oxford).
132. Ingold T, Palsson G (2013) Biosocial Becomings: Integrating Social and Biological Anthropology (Cambridge Univ Press, Cambridge, UK).
133. Youn H, Strumsky D, Bettencourt LMA, Lobo J (2015) Invention as a combinatorial process: Evidence from US patents. J R Soc Interface 12:20150272.
134. Howe CJ, et al. (2001) Manuscript evolution. Trends Genet 17:147–152.
135. Henrich J (2001) Cultural transmission and the diffusion of innovations. Am Anthropol 103:992–1013.
136. Mesoudi A, Whiten A (2008) Review. The multiple roles of cultural transmission experiments in understanding human cultural evolution. Philos Trans R Soc Lond B Biol Sci 363:3489–3501.

Evolutionary neuroscience of cumulative culture
Dietrich Stout(a,1) and Erin E. Hecht(b,c)

(a) Department of Anthropology, Emory University, Atlanta, GA 30322; (b) Center for Behavioral
Neuroscience, Georgia State University, Atlanta, GA 30303; and (c) Yerkes National Primate Research
Center, Emory University, Atlanta, GA 30302-5090

Edited by Marcus W. Feldman, Stanford University, Stanford, CA, and approved May 1, 2017 (received for review January 12, 2017)

Culture suffuses all aspects of human life. It shapes our minds and bodies and has provided a
cumulative inheritance of knowledge, skills, institutions, and artifacts that allows us to truly
stand on the shoulders of giants. No other species approaches the extent, diversity, and complexity
of human culture, but we remain unsure how this came to be. The very uniqueness of human culture is
both a puzzle and a problem. It is puzzling as to why more species have not adopted this manifestly
beneficial strategy and problematic because the comparative methods of evolutionary biology are ill
suited to explain unique events. Here, we develop a more particularistic and mechanistic
evolutionary neuroscience approach to cumulative culture, taking into account experimental,
developmental, comparative, and archaeological evidence. This approach reconciles currently
competing accounts of the origins of human culture and develops the concept of a uniquely human
technological niche rooted in a shared primate heritage of visuomotor coordination and dexterous
manipulation.

brain evolution | cultural evolution | archaeology | imitation

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The
Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman
Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program
and video recordings of most presentations are available on the NAS website at
www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: D.S. and E.E.H. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
(1) To whom correspondence should be addressed. Email: dwstout@emory.edu.

Modern humans live in a culturally constructed niche of artificial landscapes, structures,
artifacts, skills, practices, and beliefs accumulated over generations and beyond the ability of
any one individual to recreate in a lifetime (1, 2). Like the air we breathe, this cumulative
cultural matrix is so immersive that it is easy to forget it is there. However, this is the medium
through which we grow, act, and think, and it exerts profound influences on human life across a
range of behavioral (2), developmental (3), and evolutionary (4) scales. How did our species find
itself in this remarkable situation?

Niche construction is not unique to humans (5), and many animals reliably transmit behavioral
traditions across generations (1). In contrast, it is controversial whether any examples of
nonhuman cumulative culture exist and all can agree that no other species approaches the extent,
diversity, and complexity of human culture (6, 7). Two kinds of explanations have been proposed for
this disjunction. “Individual cognition” accounts (IC) propose that humans accumulate more complex
cultures primarily because the biological evolution of greater intelligence in human individuals
has promoted innovation and allowed mastery of more complex concepts and skills (7, 8).
Alternatively, “cultural evolution” accounts (CE) propose that the difference arises from uniquely
human psychological specializations for “high-fidelity” social learning [e.g., theory of mind
(ToM), imitation], which have enabled the lossless “ratchet-effect” of cultural accumulation to
supplant biology as humanity’s primary mode of adaptation (2, 9). Both views recognize a role for
social learning in reducing the costs of knowledge and skill acquisition, but they differ on the
phylogenetic uniqueness (e.g., ref. 7 vs. ref. 9) and transformative power (e.g., ref. 2 vs. ref.
8) of human high-fidelity social transmission. This plays out in starkly different visions of
cumulative culture as either a fundamental evolutionary transition (cf. ref. 10) that altered the
very medium of human adaption (2), or just another “unique or extreme” biological trait comparable
to “the elephant’s trunk, the narwhal’s tusk, the whale’s baleen, the platypus’s duckbill, and the
armadillo’s armor” (8). Neither option, however, makes it immediately clear why this exceptional
capacity for culture has evolved in humans and humans alone.

If there is one thing on which IC and CE agree, it is that cultural capacity is a good thing: It
has “undeniable practical advantages” (8) that have allowed our species to have “expanded across
the globe and ... occupy a wider range than any other terrestrial species” (2). Indeed, the
benefits are so substantial that even small “initial increments” (8) in this direction are expected
to generate powerful biocultural feedback leading to further brain and cognitive evolution (2, 8).
Proponents of a highly modular view of IC nevertheless argue that this feedback will lead to
coordinated enhancement across multiple domains (8). CE advocates similarly suggest that evolved
cognitive mechanisms (i.e., modules in a loose sense) for social learning will lead to more general
brain size and intelligence increases to deal with increased amounts of “valuable cultural
information” (2). So why are humans the only species to have fallen into this virtuous cycle?

Arguing from a CE perspective, Boyd and Richerson (11) suggest that cumulative culture is rare
because of the evolutionary costs of requisite social-learning mechanisms. According to this
argument, accumulation must begin with simple skills that are within the inventive potential of
individuals. Insofar as such simple skills could be socially learned using existing “low-fidelity”
mechanisms, any small increases in learning efficiency provided by new high-fidelity
social-learning mechanisms would be unlikely to pay for the (presumably high) metabolic and
developmental costs of those mechanisms. In this case, it would only be after accumulation had
already generated a sufficient body of complex, difficult-to-learn, and useful cultural content
that these expensive mechanisms would begin to pay for themselves. Boyd and Richerson (11) thus
suggest that high-fidelity social-learning capacities initially arose as a side effect (exaptation)
of some other adaptation, such as behavior prediction for Machiavellian social strategizing (12,
13). Although reasonable, this hypothesis is weakened by its reliance on assumptions regarding the
cost of social-learning mechanisms and the rarity of social-cognitive preadaptations. In fact, the
hypothesis does not so much explain the rarity of cumulative culture as shift the puzzle to
explaining the rarity of preadaptations, like behavior prediction, which might also be assumed to
be quite generally beneficial.

An alternative, IC-compatible proposal is that it is large brains in general that are expensive,
rather than social-learning capacities in particular. Thus, Pinker (8) argues that human cognitive
exceptionalism arises from the fortuitous fact that “hominid ancestors, more so than any other
species, had a collection of traits that had tilted the payoffs toward further investment in
intelligence.” Whether such “intelligence” is thought to be composed of discrete but tightly
coevolving innate modules (8) or a general-purpose

www.pnas.org/cgi/doi/10.1073/pnas.1620738114 PNAS | July 25, 2017 | vol. 114 | no. 30 | 7861–7868


capacity only “secondarily” specialized by experience (14), the premise is that bigger brains are
generally advantageous but can only evolve under a narrow set of conditions. Lifespans must be long
enough (i.e., mortality low enough) to reward early investments in growth and additional energetic
costs must be accommodated through increased intake or reallocation (4). For many species,
constraints such as small body size, unstable environments, and unavoidable mortality from
predation and disease may simply make large brains impractical. In this framework, high-fidelity
social learning can increase the payoffs of large brains (8, 15) but is not associated with its own
unique costs, as it is thought to rely on cognitive abilities that overlap (15) or evolved in
tandem with (8) asocial learning and problem solving. This approach actually parallels the
suggestion of Boyd and Richerson (11) that initial enhancements of social-learning capacities were
a byproduct of selection on other capacities, and similarly shifts the question back to identifying
the unique trait (4) or combination of traits (8) that pushed humans, and no others, into an
auto-catalytic coevolutionary feedback loop.

The strength of both IC and CE accounts is that they explain a wide range of distinctive human
traits in terms of a single coevolutionary process. However, this parsimony is offset by the
perceived need to posit a uniquely exceptional [and possibly even random (16)] initial cause
inaccessible to more general evolutionary explanation (8, 17, 18). An alternative is to consider
cumulative culture, not as a unitary capacity that is either present or absent, but as a complex
trait with a correspondingly complex history of gradual or piecemeal emergence. Although any event
that altered the evolutionary cost/benefit analysis of either brain expansion generally (4, 8) or
high-fidelity social learning specifically (2, 16) could theoretically have initiated runaway
biocultural coevolution in our lineage, there is no reason to assume this feedback would be
indefinitely self-sustaining once initiated, nor that it would necessarily produce constant
increase as opposed to more complex dynamics. In fact, both comparative biological evidence (4) and
cultural evolutionary models (19) indicate the potential for just such interactions and dynamics
and this is entirely consistent with the emerging paleoanthropological picture of multilineal,
intermittent, asynchronous change over human evolution (20, 21). This indication suggests a more
con-

on the complexity and difficulty of the production process. Transmission chains building spaghetti
towers and paper airplanes achieve sufficient fidelity for cumulative improvement even in purely
end-state emulative (i.e., reverse engineering) conditions (23), whereas more challenging tasks,
such as designing virtual “fishing nets” (24), building real weight-bearing devices (25), and
reproducing particular artifact forms (26), may require imitative copying of specific actions or
processes. Interactions between particular task demands, required fidelity, and sufficient
mechanisms are critical to the interpretation of comparative and evolutionary evidence, yet remain
underexplored and undertheorized. The objective is obviously to move from particularistic analyses
of specific behaviors to identification of general principles, yet it is not obvious how to design
or select experimental tasks to efficiently advance this goal.

One solution is to seek inspiration from the archaeological record of human evolution (e.g., refs.
26 and 27). As the name implies, this early “Paleolithic” evidence is dominated by stone tools.
These artifacts are valuable, not only because they endure but because they provide prolific and
fine-grained evidence of behavioral changes across a critical evolutionary interval during which
hominin brains tripled in volume to assume their modern proportions. Stone tools were key
components of premodern subsistence and survival strategies and likely helped to shape the very
course of this evolution. Experimental, comparative, and ethnographic evidence indicate that stone
tool-making (“knapping”) is a complex skill integrating demands for planning, problem solving, and
perceptual-motor coordination within a collaborative social context (28–31). It encompasses an
evolutionary continuum from early Paleolithic skills at or just beyond the limits of modern apes
(32) to the virtuoso craftsmanship of later prehistory (33). The study of knapping skill
acquisition and transmission is thus a promising avenue for evolutionarily grounded investigation
into the foundations of human cumulative culture, including the copying fidelity needed to explain
empirical patterns in the archaeological record (34, 35). To be clear, knapping is but one of many
evolutionarily relevant skills that might be studied, but it is one of which we have a good
archaeological record and which may reasonably be hoped to be representative of broader trends.

Knapping is a “reductive” technology involving the sequential
tingent evolutionary history, likely involving multiple inflection detachment of flakes from a stone core using precise ballistic
points in response to perturbations both intrinsic (19) and ex- strikes with a handheld hammer (typically stone, bone, or antler)
trinsic (20) to hominin behavior systems. If this is the case, un- to initiate controlled and predictable fracture. This means that
derstanding coevolutionary feedback dynamics (4, 8) and CE small errors in strike execution can have catastrophic, unrever-
processes (2, 19) will be necessary but not sufficient to explain the sible effects. Experiments by Bril and colleagues have shown that
actual path of human evolution, which will additionally require fracture prediction and control is a demanding perceptual-motor
the application of these general principles to explain particular skill reliably expressed only in expert knappers (28, 29). Building
historical contingencies (17). on this work, Stout and colleagues (31, 36, 37) found that even
In our view, this situation calls for a particularistic and mecha- 22 mo (x ̄ = 167 h) of knapping training produced relatively little
nistic approach to the study of cumulative culture. As exemplified evidence of perceptual-motor improvement, in contrast to clear
below, such an “evolutionary neuroscience” perspective evaluates gains in conceptual understanding.
comparative evidence of brain and behavioral variation in light of The key bottleneck in the social reproduction of knapping is
(i) evolutionary and developmental processes, (ii) primary archae- thus the extended practice required to achieve perceptual-motor
ological and paleontological evidence of evolutionary timing and competence. This requires mastery of relationships, for example
context, and (iii) the ethnographic, ethological, and experimental between the force and location of the strike and the morphology,
analogies needed to interpret this primary evidence. positioning, and support of the core (29, 38, 39), that are not
perceptually available to naïve observers and cannot be directly
High-Fidelity Social Reproduction communicated as semantic knowledge. Attempts to implement
To begin with, it is necessary to further dissect “cumulative cul- semantic knowledge of knapping strategies before perceptual-
ture” as a complex trait with heterogeneous cognitive and be- motor skill development are ineffective at best (40, 41), and
havioral prerequisites (cf. ref. 6) and capable of various degrees of such knowledge decays rapidly along knapping transmission chains
expression. Although there is broad agreement (2, 8, 9, 15, 22) when practice time is limited, even if explicit verbal teaching is
that accurate transmission is necessary for cultural accumulation, allowed (27). For observational learning, the challenge is to
the actual fidelity (i) required to accumulate particular behaviors, translate visual and auditory information of another’s actions to
(ii) associated with different social-learning mechanisms, and (iii) appropriate motor commands for one’s own body. This may be
typically exhibited by different species are all largely unknown accomplished by linking the observed behavior with preexisting
(22). Experimental studies of artifact reproduction in humans internal models of one’s own body and actions through associative
suggest that the required fidelity and relevant mechanisms depend learning and stimulus generalization (42, 43). Novel behaviors are

copied by breaking them down into familiar action elements (e.g., lift, turn, twist), matching these, and reassembling (44). Such matching always requires a degree of approximation, if only because no two bodies are identical, but where low-level differences are not critical it can enable fast imitation learning (13). Unfortunately, such low-level details do matter for knapping. Learners must thus proceed by incorrectly executing an observed gesture or communicated strategy, checking the actual vs. predicted outcome, and embarking on a lengthy process of behavioral exploration to (re)discover relevant task constraints and develop corresponding internal models through a self-conscious process of deliberate practice (cf. ref. 45). These learning challenges call for an iterative approach that alternates social-learning opportunities (observation, instruction) with motivated individual practice (46), as commonly seen in coaching or apprenticeship learning. Whiten (47) describes this as a “helical curriculum.”

This complexity is not easily reconciled with the dichotomy of high-fidelity process copying (imitation) vs. low-fidelity product copying (emulation) prevalent in discussions of cumulative culture (e.g., ref. 48). For knapping and similar crafts, what is actually reproduced across individuals is a flexible skill rather than an invariant formula, and what must be learned is a structured set of relations [compare with, for example, “affordances” (49)] between effective means and appropriate goals at multiple levels of task organization (50). This is achieved through a diverse helical curriculum in which many varieties of social learning may scaffold individual development of perceptual-motor skill and strategic competence (Fig. 1). Whiten and colleagues (e.g., see ref. 12) have identified a “portfolio” of such processes, ranging from copying bodily postures and gestures to object movement reenactment and end-state emulation. In humans, these may even extend to copying cultural norms of affect, behavior, and social affiliation (30). Although such copying may be a goal in itself when learning conventional behaviors, like human dance or the “do-as-I-do” experimental paradigm used with apes (12), in instrumental skill acquisition these processes are means to the end of learning subtle affordances and complex causal relations (51). Such learning can be further facilitated by (passive or active) niche construction, increasing exposure to relevant materials and situations (30, 52) and by teaching, potentially including the provision of practice opportunities, the direction of attention, and affective feedback, as well as intentional demonstration and explicit instruction (53).

Fig. 1. A learning cycle in the helical curriculum. Social resources both passive and pedagogical (53), together with constructed learning contexts (52), provide opportunities and structure for individual practice, which can include a portfolio (12) of processes ranging from “emulative” end-state copying to the imitation of specific body movements. This behavioral continuum maps roughly onto the differing contributions of dorsal and ventral processing streams in the primate brain.

Underlying this diversity of processes are a small number of fundamental psychological capacities required for efficient high-fidelity skill learning in a helical curriculum. These include: (i) the ability to relate fine details of actions and objects to complex goals during behavior observation and execution (46) and (ii) the prosocial motivation and ToM needed to support apprenticeship learning (9, 54, 55). In keeping with comparative (12) and archaeological (50) evidence, variation in these capacities is expected to be continuous rather than binary, and to be reflected in the complexity and difficulty of skills that can be accurately reproduced. The latter point suggests that the archaeological record of increasingly complex and difficult stone tool-making (31, 36, 50) could help adjudicate between (or reconcile) IC and CE accounts of human evolution, but only in the context of a solid inferential framework linking brains, behaviors, and evolutionary processes. Key questions include the extent and nature of overlap between processes supporting behavior execution, observation, and interpretation (e.g., ToM), and the relevance of evolutionary processes other than natural selection (e.g., CE). An emerging extended evolutionary synthesis (EES) effectively addresses both topics through its core concepts of constructive development and reciprocal causation (56).

An Extended Evolutionary Framework
As Deacon (57) provocatively put it, “brain evolution should be impossible.” How could random mutational changes to such a complex integrated system be anything other than catastrophic? The answer is that brain development is itself an evolutionary process of remarkable flexibility and adaptability. Basic patterning mechanisms and developmental selection (58) can produce functional systems even in the face of quite significant environmental or genetic perturbation. Such plasticity may be essential to the evolvability of larger brains, as developmental programs and network architectures are forced to accommodate shifting geometries together with massive increases in the number of potential neuronal connections to be specified. It also creates a medium for the multilevel interactions between genes, developmental mechanisms, environments both internal and external, and organismal behavior that the EES describes as “constructive development”
(56). Any viable evolutionary account of cumulative culture must address these dynamics.

The primate neocortex is divided into large-scale functional networks characterized by high within-network functional and anatomical connectivity (59, 60). In humans, these networks are organized in a processing hierarchy from concrete perceptual and motor functions to abstract, domain-general processing. This arrangement is realized anatomically as a cortex-wide gradient of topological distance and connectivity patterns extending from distinct peripheral sensorimotor cortices at one end to highly interconnected central association cortices at the other (61). Along this gradient, seven widely recognized networks can be arrayed into four organizational tiers: (i) visual and somatomotor, (ii) dorsal and ventral attention, (iii) multiple demand and limbic, and (iv) default mode (61).

The tethering hypothesis of Buckner and Krienen (62) proposes that this pattern arises from disproportionate expansion of the cortical mantle during evolutionary brain enlargement, leading to gaps between the chemical signaling gradients that pattern cortical differentiation during development. Developmental selection in these gaps fosters the emergence of “noncanonical” association networks primarily interconnected with each other rather than with more developmentally constrained peripheral sensorimotor systems. As expected, these relatively unconstrained association cortices are also relatively late developing (59, 63) and variable in connectivity across individuals (64). Indeed, comparative evidence indicates human-specific changes in the rate and timing of synaptogenesis, synapse elimination, and cortical myelination, resulting in increased plasticity into adulthood (65, 66). That nonspecific selection for increased brain size in the human lineage might have indirectly driven increased plasticity is suggested by evidence of low heritability for cortical morphology (sulcal dimensions) vs. overall brain size in humans, a pattern that contrasts with high heritability of both in chimpanzees (67). In any case, the human association cortex appears particularly sensitive to environmental and behavioral influences, providing a potent evolutionary feedback mechanism between organism and environment, which the EES refers to as reciprocal causation (56).

Such phenotypic flexibility is useful but may come at a cost (e.g., investments in learning or temporary phenotype–environment mismatches). Where possible, natural selection is expected to reduce costs by “canalizing” plastic responses to recurring environmental situations as automatic parts of normal development (68). The tethering hypothesis suggests that such innate specializations are most likely to be found in relatively heritable sensorimotor systems and with respect to behaviors/stimuli that have been relatively invariant over long periods of time. Because humans’ expanded association areas remain relatively plastic (64) and are both late developing (59) and phylogenetically recent (60), their derived cognitive features are less likely to be directly shaped by natural selection (phylogenetically constructed) and more likely to result from developmental side effects (developmental construction) and modifications to the structure of inputs they receive from more peripheral systems (phylogenetic or developmental inflection) (69). In theory, such modifications could arise through environmental as well as genetic inheritance, including persistent changes to the physical and social context of development brought about through niche construction (52, 70). For example, there is widespread agreement that the human brain lacks specific genetic adaptations for literacy, and yet learning to read reliably produces functional specialization for script perception in a particular region of the left ventral occipitotemporal cortex known as the “visual word form area” (3). Similar logic may apply to the enhanced mechanisms for complex action parsing and ToM that support apprenticeship learning in a helical curriculum.

Skilled actions, such as the ballistic strikes involved in stone knapping, often unfold too quickly to be guided by online sensory feedback and error correction. This limitation can be overcome through the use of internal models that predict movements and outcomes in advance (71), a simulation process supported by a distributed network of frontal, parietal, and occipitotemporal regions combining elements of the dorsal and ventral attention networks. As argued above, such models of self-action control likely form the basis for understanding and copying the observed actions of others through a process of matching, often referred to as “motor resonance” (42). The assembly of complex goal-oriented sequences from these elements is likely supported in the next tier of cortical organization by the multiple demand system (aka “frontoparietal control network”). This cognitive control network is thought to support general or “fluid” intelligence through its role in assembling structured mental programs from a series of subtasks (72), a critical process in skill learning as reviewed above. Together, these sensorimotor matching and control processes support the interactive behavioral alignment that is critical to human social learning, communication, cooperation, and bonding (73, 74).

It is debatable whether the application of increasingly sophisticated motor planning and cognitive control networks to social learning involved phylogenetic construction or is entirely explicable in terms of developmental construction and inflection (42, 75), but either scenario is consistent with the IC premise that social learning substantially overlaps with asocial-learning mechanisms and comes at little additional cost (15). It also fits well with the close integration of individual and social-learning processes in real-world skill acquisition, as envisioned by the helical curriculum. Advocates of CE, however, emphasize the additional importance of a specialized ToM capacity to allow truly cultural learning (9). Does this requirement imply additional evolutionary costs for cultural learning?

Although ToM is commonly thought of as a human specialization, depending on phylogenetically constructed neurocognitive mechanisms, Heyes and Frith (76) have recently argued that it is largely a product of developmental inflection and cultural evolution. On this account, low-level or “implicit” mind-reading capacities emerge directly from motor resonance properties of the action control system discussed above. Motor resonance provides the input needed to identify recurring relations between actions, outcomes, and internal states, and thus to predict behavior and infer intent. This would be largely explicable in terms of general mechanisms [e.g., statistical and associative learning (77)] acting within developmentally constructed systems, as favored by some IC accounts (15).

Explicit mind reading, in contrast, involves active reasoning about mental states: in other words, the “theory” part of “theory of mind.” Anatomically, this appears to be supported by regions of posterior cingulate, medial frontal, and lateral temporoparietal cortex associated with the so-called “default mode network” (DMN) (78). The DMN was initially identified as a set of regions that experience deactivation during attention-demanding tasks, but is increasingly recognized to make a positive contribution to abstract, internally directed tasks involving information retrieval and integration. Examples include introspection, social cognition, autobiographical memory, future planning, narrative comprehension, and goal-directed working memory. This functional profile reflects the fact that the DMN sits atop the cortical processing hierarchy: maximally distant from peripheral sensorimotor systems and dominated by internal connectivity with other association networks (61). As discussed above, its development and function are thus expected to be highly plastic and reliant on learning. In fact, Heyes and Frith (76) propose that learning explicit mental theories is an inherently cultural process requiring language-based instruction. Their view is supported by, among other things, evidence that individual and cross-cultural differences in caregiver use of mental state vocabulary are predictive of variation in children’s acquisition of ToM concepts, such as false belief, knowledge vs. ignorance, and difference of opinion. Insofar as mind-reading capacities are themselves seen
as critical to language acquisition (9) and both may have been important to Paleolithic skill reproduction (27, 55), this suggests a deep and densely reciprocal history of biological, cultural, and developmental interactions during the evolution of the capacities that support cumulative culture.

An EES perspective thus charts a middle path between IC and CE extremes. The action-parsing and ToM capacities that support high-fidelity transmission do have a biological basis (cf. ref. 8) and are indeed specialized and highly derived in humans (2), but they may also be “secondarily modular” (15) products of processes other than phylogenetic construction (9). A long history of reciprocal interaction between cultural and biological evolution confirms the importance of considering CE processes (2; contra ref. 8) but contradicts the view of cumulative culture origins as a single “key event” (2) involving “one and only one biological adaptation” (9). Disentangling the respective contributions of IC and CE changes will require evidence of the sequence, timing, and context of evolutionary developments.

Evolution of Primate Action Systems
The gold standard for reconstructing such evolutionary developments is phylogenetic inference from comparative evidence of extant species (Fig. 2). Specialized neural machinery for visuomotor integration is perhaps the quintessential primate adaptation to a diurnal life in the trees (75, 79). Success in this niche was supported by the emergence of two new brain regions. The ventral premotor cortex (PMv) appeared anterior to primary motor cortex, and allowed for the integration of visual input with new, higher-order control of movement sequences. In the temporal cortex, the middle temporal (MT) visual area (or V5) appeared as a specialized motion-processing region. These basic adaptations are present in all primates and were in place at or near the root of our clade.

Fig. 2. Species differences in action processing circuitry (Upper) and inferred phylogenetic origins (Lower).

From this early template, further adaptations emerged. Motion processing expanded from the MT to produce a dorsal stream of “vision for action” extending into the posterior parietal cortex (80). The internal models of movement in space processed in this stream provide the “how” needed to execute or copy (Fig. 1) bodily actions. A second ventral stream of “vision for perception” extends into the lateral and inferior temporal lobe. This “what” pathway supports the recognition of objects, individuals, and body parts critical for organizing goal-directed action and emulating observed outcomes.

Whereas the presence of both dorsal and ventral visual streams across macaques, chimpanzees, and humans reflects the ancient origins of these systems, structural and functional differences between species provide evidence of subsequent evolutionary changes. It is clear that, like other association cortices, temporal regions associated with the ventral stream underwent substantial enlargement over ape and human evolution (62) and this likely fostered new functional capacities, ranging from semantic processing to face recognition. However, we argue that it was a series of key evolutionary changes to the dorsal stream that enabled integration of increasingly fine action details and complex goals during behavior observation and execution, and thus supported the emergence of high-fidelity social learning and cumulative culture.

One such change was the emergence of new functional regions in the parietal cortex. Comparative fMRI studies with macaques and humans have identified portions of human intraparietal sulcus with novel sensitivity to 3D form-from-motion stimuli, as well as a patch of human anterior supramarginal gyrus (aSMG) specifically responsive to observed tool use (81). Both regions are likely relevant to a wide array of evolutionarily relevant object-manipulation behaviors, and both are known to be recruited by stone tool-making activities in modern humans (82,
83). The aSMG in particular is believed to support a confluence of object information from the ventral stream with dorsal stream kinematics to manage the novel action possibilities afforded by handheld tools (81). Although similar functional studies have not been conducted with chimpanzees, structural evidence from diffusion tensor imaging suggests similarity with humans. Whereas macaques show little or no connectivity between the aSMG and ventral stream object-processing cortex, in both humans and chimpanzees these connections are robust (84). This tool-relevant innovation appears to predate the chimpanzee–human split, as might be expected from the impressive tool-use capacities of modern chimpanzees (13).

Enhanced aSMG connectivity between dorsal and ventral streams is just one aspect of a broader pattern of changes (Fig. 2) in action circuitry over ape and human evolution (84). In macaques, this circuitry is dominated by ventral stream projections to the ventrolateral prefrontal cortex (vlPFC) along the inferior longitudinal fasciculus and extreme/external capsules, with relatively little connectivity through the dorsal stream. Dorsal stream projections to the vlPFC as well as the PMv through the middle and superior longitudinal fasciculi are better developed in chimpanzees and become quite pronounced in humans. Across these three taxa, there is thus a trend toward the addition of increasing dorsal stream inputs to the vlPFC in complement to established ventral stream connectivity. We have previously proposed that this dorsal stream enhancement may underlie the progressive elaboration of action-parsing capacities from macaques to chimpanzees and then humans (84, 85), as reflected by these species’ increasingly complex foraging skills (13) and capacities/propensities for bodily imitation (12).

Motor resonance mechanisms are least developed in macaques, which are not known to imitate manual actions. Macaque “mirror” neurons respond to observed action goals rather than detailed means of execution and are almost entirely unresponsive to actions that do not involve an object (75). In contrast, chimpanzee action observation activates nearly identical voxels to execution of the same movements, regardless of whether they produce a physical result on an object (85). This basic action-matching mechanism may thus predate the chimpanzee–human split. With respect to object-directed actions, however, chimpanzees retain a generally macaque-like pattern of brain response dominated by the “top-down” contributions of the frontal executive cortex (85, 86). Humans alone display a more distributed pattern of occipital, temporal, parietal, premotor, and prefrontal activation, reflecting an increased role for bottom-up perceptual representations incorporating kinematic and spatiotemporal details about object-directed actions (85). This functional difference, and the structural changes that support it, may be critical to the exceptional development of skill acquisition and cultural learning capacities in humans.

In humans but not chimpanzees or macaques, the core action perception circuitry includes a prominent projection to the superior parietal lobule, a region associated with awareness of one’s body in space (84). Furthermore, the third branch of the superior longitudinal fasciculus (SLFIII), which links the inferior parietal cortex with the PMv in monkeys, extends into more anterior regions of the vlPFC in humans, particularly in the right hemisphere (87). Chimpanzees again appear intermediate, with a weak but observable extension of SLFIII into vlPFC and no evidence of right-lateralization at the population level (87). Thus, a robust extension of SLFIII into the right hemisphere homolog of Broca’s area appears to be a human-specific adaptation. This region is an element of the multiple demand system discussed above, which is thought to support the assembly of complex, multistep action plans (88). The observed extension of human SLFIII would thus provide an anatomical substrate for the integration of kinematic details into complex action goals and sequences, as required for skill learning in a helical curriculum.

Such learning in humans requires a degree of bodily awareness sufficient to match variations in kinematic detail with desired outcomes during deliberate practice. A measure of such awareness that has been applied to other animals is the Mirror Self-Recognition (MSR) test (89). Unlike enculturated humans, mirror-naïve animals must discover de novo that the visual perception of their reflection corresponds to the sensorimotor representations of their own movements. As might be expected from the preceding review of action perception circuitry, macaques typically fail at MSR and chimpanzees are intermediate in performance, with some passing and some failing. In fact, chimpanzee MSR performance is predicted by individual variation in the degree of right-lateralization of SLFIII projections into the vlPFC. In other words, chimpanzees with more human-like SLFIII connectivity show more human-like MSR behavior (90). Because attending to one’s own movements is a critical element in the hypothesized construction of implicit mind reading from motor resonance (76), this finding suggests further links between dorsal stream evolution and the cognitive prerequisites of cultural learning.

What is not yet known is the extent to which any or all of these structural and functional differences between species are classic “adaptations” in the sense of being canalized products of phylogenetic construction vs. other evolutionary processes. Even in macaques, there is some evidence that extensive tool-training can produce plastic alterations in dorsal stream connectivity (91). Enlarged ape and human brains are expected to be more developmentally plastic and subject to inflection by somatic [e.g., bipedal locomotion, hand morphology (92)] and sensorimotor adaptations, and developmental niche construction (70). In fact, research with modern humans has shown that the acquisition of Paleolithic tool-making skills elicits plastic remodeling of dorsal stream white matter connections, including SLFIII’s projection into the right vlPFC, even in adults (37). Functionally, the gray matter targeted by this projection is recruited by execution (93) and observation (83) of relatively complex tool-making sequences of the kind that appeared with Late Acheulean handaxe technology after about 0.7 Mya (50). Such findings suggest that further experimental studies of Paleolithic tool-making may begin to fill in details of timing, mechanisms, and context of evolutionary changes that occurred since the chimpanzee–human divergence and are inaccessible to purely comparative methods.

Conclusion: An Evolving Technological Niche
The earliest stone tools (Fig. 2) predate evidence of brain expansion by hundreds of thousands of years (32, 94) during which their occurrence was extremely patchy, discontinuous, and lacking in evidence of progressive change. By 2.6 Mya, early Oldowan knapping provides some evidence for high-fidelity cultural transmission of particular methods (35), as well as increasing demands on visual, motor, and attentional systems (82), but the overall impression in this early period remains one of a tenuous and expendable technology at the edge of contemporary hominin capacities. It is only after about 2.0 Mya that stone tool-making appears to become more commonplace (as indicated by site frequency and geographic distribution), at which time it is accompanied by evidence of brain- and body-size increase (20). Further episodes of apparently correlated technological (50) and brain size (20, 95) change occurred with the appearance of increasingly skill-intensive Early (1.7 Mya) and Late (0.7 Mya) Acheulean knapping. Understanding what exactly changed at these various transitions is an important priority for future research and will ultimately require an integration of CE approaches to understanding technological change (19) and IC insights into the evolutionary economics of “expensive” brains (4), with mechanistic neuroscientific perspectives on evolving brain–behavior–culture interactions.
Such an approach highlights the central importance of embodied skills for object manipulation and modification to the high-fidelity social reproduction that IC and CE agree is critical to sustained biocultural feedback. Dorsal stream action systems constitute a critical substrate for the evolutionary-developmental cascade that constructs the action-parsing and ToM capacities needed for cumulative culture. These systems are themselves sensitive to inflected input from more peripheral somatic or sensorimotor adaptations and constructed niches that shape early object manipulation and visual experiences (96). Object manipulation and modification contribute to the construction of learning niches (Fig. 1) populated by the residues of past action (52), and provide a persistent external medium scaffolding production of the more complex and protracted action goals and sequences (46) that both require and reward cultural learning through a helical curriculum.
As we have seen, stone knapping requires the multilevel integration of bodily kinematics and object affordances into goal-oriented action sequences. This finding is supported by phylogenetic (87) and plastic (37) enhancements to action control systems and demands extended investment in deliberate practice (36). Oldowan knapping, although far from easy compared with ape tool-use (32), remains a forgiving technology that is relatively quickly acquired (36) because the limited contingency between successive actions allows substantial latitude for error (31). Given sufficient sensorimotor and action control capacities, it is likely that social-learning opportunities (15) and niche-construction processes (52) seen in socially tolerant nonhumans would be sufficient for Oldowan skill acquisition (cf. ref. 29). Nevertheless, intentional demonstration and linguistic instruction are clearly helpful (27), and such teaching did eventually become a vital part of human technological reproduction (53). Archaeological evidence cannot demonstrate that a particular form of teaching was essential at a given point in prehistory but does document transmission of quite complex and demanding techniques by Late Acheulean times (41), some of which modern humans find difficult to convey without explicit verbal instruction (40, 97).
Modern human teaching relies on social affiliative bonds that support extended interaction and motivate investment by teachers. Such investments are a special case of the alloparental contributions that subsidize human reproduction (4), and their affective substrates likely evolved as part of a more general prosocial package (98, 99) relying on strong affiliative bonds. The interindividual behavioral alignment facilitated by resonance (74) is a critical mechanism that promotes such affiliation across many species (73), and in humans also supports pragmatic collaboration, apprenticeship learning, and the development of ToM, as discussed above. In contrast to other animals, however, human affiliation routinely extends to include “ultrasocial” cooperation and sharing with nonkin enabled by the use of language to create social norms and identities, including purely symbolic affinal and fictive kinship ties (98, 99).
A substantial body of research has linked language evolution to exaptation of the same action control systems discussed here (e.g., ref. 100), often with an emphasis on gesture production and action “syntax” in the dorsal stream (55). Less emphasized has been the critical role that such high-fidelity production (whether manual, vocal, or artifact-mediated) plays in grounding semantic meaning (101). Consistent physical “tokens” anchor meaning and allow discrete recombination in a way that continuous associative networks, however complex, do not. In other words, concrete tokenization appears critical to the transition from broadly semantic to truly symbolic thought (57), as exemplified by the construction of precise mathematics from approximate numeracy using number words as tokens (102). Once learned, this externalizing symbolic “trick” can readily be internalized as a tool for thought, supporting such abstract concepts as social roles that can be filled by different actual individuals. The elaboration of action control systems supporting skilled action and behavioral alignment (74) thus emerges as a foundational adaptation enabling the modern human way of life.
This way of life might fairly be described as a human cultural niche; however, this term is already taken to refer to a somewhat more circumscribed (CE) theoretical proposal (2). We have thus suggested the term “technological niche” (36) to emphasize the critical role for object manipulation and modification indicated by evolutionary neuroscience, as well as the necessary integration of cultural evolutionary thinking with the economic logic of the expensive brain to address biocultural interactions across evolutionary time scales. As should be evident from the preceding discussion, “technological” is not meant to imply a narrow focus on artifacts apart from the broader processes that generate them. Indeed, the now-ubiquitous term “technology” did not even come into widespread use until well into the 20th century, when it was adopted to describe: (i) a newly emerging and radically transformative power going far beyond anything implied by extant conceptions of the “mechanical” or “useful” arts, and (ii) the new configurations of materials, social and economic relations, institutions, and regulations that come together to constitute a unitary “thing,” like “automotive technology” (103). In this sense, we find the term to be a particularly apt description of the content, scope, and power of coevolutionary relationships that conspired to produce the modern human condition.

1. Whiten A, Hinde RA, Laland KN, Stringer CB (2011) Culture evolves. Philos Trans R Soc Lond B Biol Sci 366:938–948.
2. Boyd R, Richerson PJ, Henrich J (2011) The cultural niche: Why social learning is essential for human adaptation. Proc Natl Acad Sci USA 108:10918–10925.
3. Dehaene S, Cohen L, Morais J, Kolinsky R (2015) Illiterate to literate: Behavioural and cerebral changes induced by reading acquisition. Nat Rev Neurosci 16:234–244.
4. Isler K, Van Schaik CP (2014) How humans evolved large brains: Comparative evidence. Evol Anthropol 23:65–75.
5. Odling-Smee FJ, Laland KN, Feldman MW (2003) Niche Construction: The Neglected Process in Evolution (Princeton Univ Press, Princeton).
6. Dean LG, Vale GL, Laland KN, Flynn E, Kendal RL (2014) Human cumulative culture: A comparative perspective. Biol Rev Camb Philos Soc 89:284–301.
7. Whiten A, Horner V, Marshall-Pescini S (2003) Cultural panthropology. Evol Anthropol 12:92–105.
8. Pinker S (2010) Colloquium paper: The cognitive niche: Coevolution of intelligence, sociality, and language. Proc Natl Acad Sci USA 107:8993–8999.
9. Tomasello M (1999) The Cultural Origins of Human Cognition (Harvard Univ Press, Cambridge, MA).
10. Szathmáry E, Smith JM (1995) The major evolutionary transitions. Nature 374:227–232.
11. Boyd R, Richerson P (1996) Why culture is common but cultural evolution is rare. Evolution of Social Behaviour Patterns in Primates and Man, eds Runciman W, Maynard Smith J, Dunbar R (Oxford Univ Press, Oxford), pp 77–93.
12. Whiten A, van de Waal E (December 26, 2016) Social learning, culture and the ‘socio-cultural brain’ of human and non-human primates. Neurosci Biobehav Rev, 10.1016/j.neubiorev.2016.12.018.
13. Byrne RW (2016) Evolving Insight: How It Is We Can Think About Why Things Happen (Oxford Univ Press, Oxford).
14. Burkart JM, Schubiger MN, van Schaik CP (July 28, 2016) The evolution of general intelligence. Behav Brain Sci, 10.1017/S0140525X16000959.
15. van Schaik CP, Isler K, Burkart JM (2012) Explaining brain size variation: From social to cultural brain. Trends Cogn Sci 16:277–284.
16. Henrich J, McElreath R (2003) The evolution of cultural evolution. Evol Anthropol 12:123–135.
17. Stout D, Human brain evolution: History or science? Rethinking Human Evolution, Vienna Series in Theoretical Biology, ed Schwartz JH (MIT Press, Cambridge, MA), in press.
18. Cartmill M (2002) Paleoanthropology: Science or mythological charter? J Anthrop Res 52:183–201.
19. Kolodny O, Creanza N, Feldman MW (2016) Game-changing innovations: How culture can change the parameters of its own evolution and induce abrupt cultural shifts. PLOS Comput Biol 12:e1005302.
20. Antón SC, Potts R, Aiello LC (2014) Human evolution. Evolution of early Homo: An integrated biological perspective. Science 345:1236828.
21. d’Errico F, Stringer CB (2011) Evolution, revolution or saltation scenario for the emergence of modern cultures? Philos Trans R Soc Lond B Biol Sci 366:1060–1069.
22. Lewis HM, Laland KN (2012) Transmission fidelity is the key to the build-up of cumulative culture. Philos Trans R Soc Lond B Biol Sci 367:2171–2180.

23. Caldwell CA, Schillinger K, Evans CL, Hopper LM (2012) End state copying by humans (Homo sapiens): Implications for a comparative perspective on cumulative culture. J Comp Psychol 126:161–169.
24. Derex M, Godelle B, Raymond M (2013) Social learners require process information to outperform individual learners. Evolution 67:688–697.
25. Wasielewski H (2014) Imitation is necessary for cumulative cultural evolution in an unfamiliar, opaque task. Hum Nat 25:161–179.
26. Schillinger K, Mesoudi A, Lycett SJ (2015) The impact of imitative versus emulative learning mechanisms on artifactual variation: Implications for the evolution of material culture. Evol Hum Behav 36:446–455.
27. Morgan TJ, et al. (2015) Experimental evidence for the co-evolution of hominin tool-making teaching and language. Nat Commun 6:6029.
28. Roux V, Bril B, Dietrich G (1995) Skills and learning difficulties involved in stone knapping. World Archaeol 27:63–87.
29. Nonaka T, Bril B, Rein R (2010) How do stone knappers predict and control the outcome of flaking? Implications for understanding early stone tool technology. J Hum Evol 59:155–167.
30. Stout D (2002) Skill and cognition in stone tool production: An ethnographic case study from Irian Jaya. Curr Anthropol 45:693–722.
31. Stout D, Hecht E, Khreisheh N, Bradley B, Chaminade T (2015) Cognitive demands of lower paleolithic toolmaking. PLoS One 10:e0121804.
32. Toth N, Schick K (2009) The Oldowan: The tool making of early hominins and chimpanzees compared. Annu Rev Anthropol 38:289–305.
33. Apel J, Knutsson K, eds (2006) Skilled Production and Social Reproduction: Aspects of Traditional Stone-Tool Technologies. (Societas Archaeologica Upsaliensis, Uppsala, Sweden).
34. Schillinger K, Mesoudi A, Lycett SJ (2016) Copying error, evolution, and phylogenetic signal in artifactual traditions: An experimental approach using “model artifacts”. J Archaeol Sci 70:23–34.
35. Stout D, Semaw S, Rogers MJ, Cauche D (2010) Technological variation in the earliest Oldowan from Gona, Afar, Ethiopia. J Hum Evol 58:474–491.
36. Stout D, Khreisheh N (2015) Skill learning and human brain evolution: An experimental approach. Camb Archaeol J 25:867–875.
37. Hecht EE, et al. (2015) Acquisition of Paleolithic toolmaking abilities involves structural remodeling to inferior frontoparietal regions. Brain Struct Funct 220:2315–2331.
38. Magnani M, Rezek Z, Lin SC, Chan A, Dibble HL (2014) Flake variation in relation to the application of force. J Archaeol Sci 46:37–49.
39. Faisal A, Stout D, Apel J, Bradley B (2010) The manipulative complexity of Lower Paleolithic stone toolmaking. PLoS One 5:e13718.
40. Putt SS, Woods AD, Franciscus RG (2014) The role of verbal interaction during experimental bifacial stone tool manufacture. Lithic Technol 39:96–112.
41. Stout D, Apel J, Commander J, Roberts M (2014) Late Acheulean technology and cognition at Boxgrove, UK. J Archaeol Sci 41:576–590.
42. Cook R, Bird G, Catmur C, Press C, Heyes C (2014) Mirror neurons: From origin to function. Behav Brain Sci 37:177–192.
43. Laland KN, Bateson P (2001) The mechanisms of imitation. Cybern Syst 32:195–224.
44. Buccino G, et al. (2004) Neural circuits underlying imitation learning of hand actions: an event-related fMRI study. Neuron 42:323–334.
45. Ericsson KA, Krampe RT, Tesch-Romer C (1993) The role of deliberate practice in the acquisition of expert performance. Psychol Rev 100:363–406.
46. Stout D (2013) Neuroscience of technology. Cultural Evolution: Society, Technology, Language, and Religion, Strungmann Forum Reports, eds Richerson PJ, Christiansen M (MIT Press, Cambridge, MA), pp 157–173.
47. Whiten A (2015) Experimental studies illuminate the cultural transmission of percussive technologies in Homo and Pan. Phil Trans R Soc B 370:20140359; erratum in Phil Trans R Soc B 371:20150436.
48. Tennie C, Call J, Tomasello M (2009) Ratcheting up the ratchet: On the evolution of cumulative culture. Philos Trans R Soc Lond B Biol Sci 364:2405–2415.
49. Gibson JJ (1979) The Ecological Approach to Visual Perception (Houghton Mifflin, Boston).
50. Stout D (2011) Stone toolmaking and the evolution of human culture and cognition. Philos Trans R Soc Lond B Biol Sci 366:1050–1059.
51. Legare CH, Nielsen M (2015) Imitation and innovation: The dual engines of cultural learning. Trends Cogn Sci 19:688–699.
52. Fragaszy DM, et al. (2013) The fourth dimension of tool use: Temporally enduring artefacts aid primates learning to use tools. Philos Trans R Soc B Biol Sci 368:20120410.
53. Hewlett BS, Roulette CJ (2016) Teaching in hunter-gatherer infancy. R Soc Open Sci 3:150403.
54. Burkart JM, Hrdy SB, Van Schaik CP (2009) Cooperative breeding and human cognitive evolution. Evol Anthropol 18:175–186.
55. Stout D, Chaminade T (2012) Stone tools, language and the brain in human evolution. Philos Trans R Soc Lond B Biol Sci 367:75–87.
56. Laland KN, et al. (2015) The extended evolutionary synthesis: Its structure, assumptions and predictions. Proc R Soc Biol 282:20151019.
57. Deacon TW (1997) The Symbolic Species: The Co-Evolution of Language and the Brain (WW Norton, New York).
58. Edelman GM (1987) Neural Darwinism (Basic Books, New York).
59. Power JD, Fair DA, Schlaggar BL, Petersen SE (2010) The development of human functional brain networks. Neuron 67:735–748.
60. Mantini D, Corbetta M, Romani GL, Orban GA, Vanduffel W (2013) Evolutionarily novel functional networks in the human brain? J Neurosci 33:3259–3275.
61. Margulies DS, et al. (2016) Situating the default-mode network along a principal gradient of macroscale cortical organization. Proc Natl Acad Sci USA 113:12574–12579.
62. Buckner RL, Krienen FM (2013) The evolution of distributed association networks in the human brain. Trends Cogn Sci 17:648–665.
63. Hill J, et al. (2010) Similar patterns of cortical expansion during human development and evolution. Proc Natl Acad Sci USA 107:13135–13140.
64. Mueller S, et al. (2013) Individual variability in functional connectivity architecture of the human brain. Neuron 77:586–595.
65. Preuss TM (2012) Human brain evolution: From gene discovery to phenotype discovery. Proc Natl Acad Sci USA 109:10709–10716.
66. Somel M, Liu X, Khaitovich P (2013) Human brain evolution: Transcripts, metabolites and their regulators. Nat Rev Neurosci 14:112–127.
67. Gómez-Robles A, Hopkins WD, Schapiro SJ, Sherwood CC (2015) Relaxed genetic control of cortical organization in human brains compared with chimpanzees. Proc Natl Acad Sci USA 112:14799–14804.
68. Murren CJ, et al. (2015) Constraints on the evolution of phenotypic plasticity: Limits and costs of phenotype and plasticity. Heredity (Edinb) 115:293–301.
69. Heyes C (2003) Four routes of cognitive evolution. Psychol Rev 110:713–727.
70. Flynn EG, Laland KN, Kendal RL, Kendal JR (2013) Target article with commentaries: Developmental niche construction. Dev Sci 16:296–313.
71. Wolpert DM, Doya K, Kawato M (2003) A unifying computational framework for motor control and social interaction. Philos Trans R Soc Lond B Biol Sci 358:593–602.
72. Duncan J (2010) The multiple-demand (MD) system of the primate brain: Mental programs for intelligent behaviour. Trends Cogn Sci 14:172–179.
73. Feldman R (2017) The neurobiology of human attachments. Trends Cogn Sci 21:80–99.
74. Hasson U, Frith CD (2016) Mirroring and beyond: Coupled dynamics as a generalized framework for modelling social interactions. Phil Trans R Soc B 371:20150366.
75. Tramacere A, Pievani T, Ferrari PF (November 16, 2016) Mirror neurons in the tree of life: Mosaic evolution, plasticity and exaptation of sensorimotor matching responses. Biol Rev Camb Philos Soc, 10.1111/brv.12310.
76. Heyes CM, Frith CD (2014) The cultural evolution of mind reading. Science 344:1243091.
77. Byrne R (1999) Imitation without intentionality. Using string parsing to copy the organization of behaviour. Anim Cogn 2:63–72.
78. Mars RB, et al. (2012) On the relationship between the “default mode network” and the “social brain”. Front Hum Neurosci 6:189.
79. Genovesio A, Wise SP, Passingham RE (2014) Prefrontal-parietal function: From foraging to foresight. Trends Cogn Sci 18:72–81.
80. Milner AD, Goodale MA (1995) The Visual Brain in Action (Oxford Univ Press, Oxford).
81. Orban GA, Caruana F (2014) The neural basis of human tool use. Front Psychol 5:310.
82. Stout D, Chaminade T (2007) The evolutionary neuroscience of tool making. Neuropsychologia 45:1091–1100.
83. Stout D, Passingham R, Frith C, Apel J, Chaminade T (2011) Technology, expertise and social cognition in human evolution. Eur J Neurosci 33:1328–1338.
84. Hecht EE, et al. (2013) Process versus product in social learning: Comparative diffusion tensor imaging of neural systems for action execution-observation matching in macaques, chimpanzees, and humans. Cereb Cortex 23:1014–1024.
85. Hecht EE, et al. (2013) Differences in neural activation for object-directed grasping in chimpanzees and humans. J Neurosci 33:14117–14134.
86. Denys K, et al. (2004) Visual activation in prefrontal cortex is stronger in monkeys than in humans. J Cogn Neurosci 16:1505–1516.
87. Hecht EE, Gutman DA, Bradley BA, Preuss TM, Stout D (2015) Virtual dissection and comparative connectivity of the superior longitudinal fasciculus in chimpanzees and humans. Neuroimage 108:124–137.
88. Duncan J, et al. (2000) A neural basis for general intelligence. Science 289:457–460.
89. Anderson JR, Gallup GG, Jr (2015) Mirror self-recognition: A review and critique of attempts to promote and engineer self-recognition in primates. Primates 56:317–326.
90. Hecht EE, Mahovetz LM, Preuss TM, Hopkins WD (2017) A neuroanatomical predictor of mirror self-recognition in chimpanzees. Soc Cogn Affect Neurosci 12:37–48.
91. Hihara S, et al. (2006) Extension of corticocortical afferents into the anterior bank of the intraparietal sulcus by tool-use training in adult monkeys. Neuropsychologia 44:2636–2646.
92. Kivell TL (2015) Evidence in hand: Recent discoveries and the early evolution of human manual manipulation. Phil Trans R Soc Lond B Biol Sci 370:20150105.
93. Stout D, Toth N, Schick K, Chaminade T (2008) Neural correlates of Early Stone Age toolmaking: Technology, language and cognition in human evolution. Philos Trans R Soc Lond B Biol Sci 363:1939–1949.
94. Harmand S, et al. (2015) 3.3-million-year-old stone tools from Lomekwi 3, West Turkana, Kenya. Nature 521:310–315.
95. Rightmire GP (2004) Brain size and encephalization in early to Mid-Pleistocene Homo. Am J Phys Anthropol 124:109–123.
96. Byrge L, Sporns O, Smith LB (2014) Developmental process emerges from extended brain-body-behavior networks. Trends Cogn Sci 18:395–403.
97. Gärdenfors P, Högberg A (2017) The archaeology of teaching and the evolution of Homo docens. Curr Anthrop 58:188–208.
98. Bogin B, Bragg J, Kuzawa C (2014) Humans are not cooperative breeders but practice biocultural reproduction. Ann Hum Biol 41:368–380.
99. Hill K, Barton M, Hurtado AM (2009) The emergence of human uniqueness: Characters underlying behavioral modernity. Evol Anthropol 18:187–200.
100. Arbib MA (2012) How the Brain Got Language: The Mirror System Hypothesis (Oxford Univ Press, New York).
101. Clark A (2006) Language, embodiment, and the cognitive niche. Trends Cogn Sci 10:370–374.
102. Dehaene S (1997) The Number Sense: How the Mind Creates Mathematics (Oxford Univ Press, New York).
103. Marx L (1997) Technology: The emergence of a hazardous concept. Soc Res 64:965–988.

Identifying early modern human ecological niche
expansions and associated cultural dynamics
in the South African Middle Stone Age
Francesco d’Errico (a,b,1,2), William E. Banks (a,c,1), Dan L. Warren (d), Giovanni Sgubin (e), Karen van Niekerk (b,f), Christopher Henshilwood (b,f), Anne-Laure Daniau (e), and María Fernanda Sánchez Goñi (e,g)
(a) CNRS, UMR 5199–De la Préhistoire à l’Actuel: Culture, Environnement et Anthropologie, Université de Bordeaux, 33615 Pessac Cedex, France
(b) Evolutionary Studies Institute, University of the Witwatersrand, Witwatersrand 2050, South Africa
(c) Biodiversity Institute, University of Kansas, Lawrence, KS 66045-7562
(d) Biocomplexity and Biodiversity Unit, Okinawa Institute of Science and Technology, Okinawa 904-0495 Japan
(e) CNRS, UMR 5805–Environnements et Paléoenvironnements Océaniques et Continentaux, Université de Bordeaux, 33615 Pessac Cedex, France
(f) Institute for Archaeology, History, Culture, and Religion, University of Bergen, 5020 Bergen, Norway
(g) École Pratique des Hautes Études, L’Université de Recherche Paris Sciences et Lettres, 75014 Paris, France

Edited by Marcus W. Feldman, Stanford University, Stanford, CA, and approved May 16, 2017 (received for review January 31, 2017)

The archaeological record shows that typically human cultural traits emerged at different times, in different parts of the world, and among different hominin taxa. This pattern suggests that their emergence is the outcome of complex and nonlinear evolutionary trajectories, influenced by environmental, demographic, and social factors, that need to be understood and traced at regional scales. The application of predictive algorithms using archaeological and paleoenvironmental data allows one to estimate the ecological niches occupied by past human populations and identify niche changes through time, thus providing the possibility of investigating relationships between cultural innovations and possible niche shifts. By using such methods to examine two key southern Africa archaeological cultures, the Still Bay [76–71 thousand years before present (ka)] and the Howiesons Poort (HP; 66–59 ka), we identify a niche shift characterized by a significant expansion in the breadth of the HP ecological niche. This expansion is coincident with aridification occurring across Marine Isotope Stage 4 (ca. 72–60 ka) and especially pronounced at 60 ka. We argue that this niche shift was made possible by the development of a flexible technological system, reliant on composite tools and cultural transmission strategies based more on “product copying” rather than “process copying.” These results counter the one niche/one human taxon equation. They indicate that what makes our cultures, and probably the cultures of other members of our lineage, unique is their flexibility and ability to produce innovations that allow a population to shift its ecological niche.

Middle Stone Age | Still Bay | Howiesons Poort | ecological niche modeling | paleoclimate

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: F.d. and W.E.B. designed research; F.d., W.E.B., D.L.W., and A.-L.D. performed research; F.d., W.E.B., D.L.W., G.S., K.v.N., and C.H. analyzed data; K.v.N. and C.H. provided and reviewed archaeological data; A.-L.D. and M.F.S.G. interpreted paleoclimatic data; and F.d., W.E.B., D.L.W., A.-L.D., and M.F.S.G. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
1 F.d. and W.E.B. contributed equally to this work.
2 To whom correspondence should be addressed. Email: francesco.derrico@u-bordeaux.fr.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1620752114/-/DCSupplemental.
www.pnas.org/cgi/doi/10.1073/pnas.1620752114 PNAS | July 25, 2017 | vol. 114 | no. 30 | 7869–7876

Research on animal behavior has made it clear that culture represents a second inheritance system that may have changed the dynamics of evolution on a broad scale (1–3). Understanding how this process has affected the evolution of our genus is a major challenge in paleoanthropology. In what ways, and through what phases of evolutionary history, has human culture extended beyond culture seen in other species? Are the cultural adaptations and associated cultural innovations that we observe in the archaeological record the direct consequence of our biological evolution, or are they the outcome of mechanisms largely independent of it? In our lineage, if cultural innovations were directly linked to classic Darwinian evolutionary processes, such as isolation, random mutation, selection, and speciation, one would expect a clear correspondence between the emergence of a new species and a related set of novel cultural behaviors. By shaping a new hominin species, natural selection would provide this species with a new cognitive setting resulting in the capacity for particular cultural innovations or behaviors. Such a mechanism would provide the possibility for cultural variability but would narrow its range of expression to the species’ biologically dictated potential. Although some would still argue that there is a direct link between cultural behavior and hominin taxonomy and, as a consequence, that the typically human secondary inheritance system only emerged with our species, archaeological and paleogenetic research conducted over the past 20 y challenges such a view.
First, for periods <200,000 years before the present (ka), it is difficult to attribute a particular cognition and resulting cultural behavior to a particular fossil species because paleogenetic evidence shows that significant interbreeding occurred between Neanderthals, Denisovans, and anatomically modern humans (AMHs) (4–6), thus blurring the concept of fossil species that many paleoanthropologists had in the past when interpreting morphological differences between human remains. Each new round of publications concerning paleogenetics shows that we are confronted with a complex network of genetic relationships rather than distinct and simple lines of evolutionary descent. There is no reason to assume that such a pattern did not characterize other phases of our lineage’s evolution.
Second, archaeological discoveries show that the cultural innovations generally seen as reflecting modern cognition and behavior did not emerge as a single package in conjunction with the appearance of our species in Africa. We know that AMHs emerged in Africa between 200 and 160 ka (7–9), but some behaviors considered as “modern” are present in Africa before this speciation event. Ochre use appears at around 300 ka (10), and laminar blade production is observed perhaps as early as 500 ka (11). Other modern cultural traits are only observed in the African archaeological record after ca. 100 ka. Such is the case with heating of stone to facilitate knapping or retouching, pressure-flaked bifacial projectile points, microlithic armatures, mastic-facilitated hafting of stone tools, formal bone tools, abstract engravings, the production of paint and pigment containers, personal ornaments, and primary burials (12–15). Furthermore, many key cultural innovations are present outside Africa well before AMH
dispersal. In Europe, Neanderthals used pigment at many sites by at least 250–200 ka. They also used complex lithic technologies, composite tools, and complex hafting techniques by at least 180 ka (16). At Bruniquel, France, Neanderthals broke and moved four tons of stalagmites to build a circular structure deep within a cave 176 ka (17). At a number of sites, starting at 130 ka, they used raptor claws and feathers, probably for symbolic activities (18, 19). They made abstract designs on a variety of media (20, 21). Neanderthals in the Near East and Europe engaged very early in a variety of funerary practices, including deliberate burials with simple grave goods. The last Neanderthals in Italy and France produced formal bone tools. They also produced a variety of personal ornaments consisting of animal teeth, fossils, and marine shells, some of which were colored with ochre (22, 23). Additionally, isolated occurrences of innovative cultural traits are recorded at much older sites in Europe and Asia (24), and well-established innovations (e.g., Middle Stone Age shell beads) disappear abruptly from the archaeological record and similar behaviors later reappear in different forms and sometimes on different media (14, 25).
This evidence demonstrates that typically modern human cultural traits emerged at different times, in different parts of the world, and among different hominin taxa. Such taxa appear more and more to be the phenotypic expression of a largely shared, plastic cognition (26, 27), and the emergence of typically human innovations appears to be the result of complex and nonlinear evolutionary trajectories that need to be understood and traced at regional scales.
It is clear that cultural innovations were triggered by several interconnected and dynamic factors, likely biological, environmental, and cultural. Because speciation does not appear to have played a role in the emergence of key innovations, we need to explore the potential for relationships between biology and culture at the population level, and particularly within those past African populations that first developed behaviors that incorporated suites of these traits. Such an endeavor, however, is handicapped on the biological side by a sparse Upper Pleistocene hominin fossil record, the absence of pre-Holocene paleogenetic data, and a long history of human presence and intracontinental dispersals that complicate interpretations of modern genetic data. Understanding how AMHs were biologically structured in the Middle Stone Age is also hampered by the fact that, as recently shown by genetic analyses (6, 28) highlighting the introgression of archaic genes into

In a previous study, we stressed the need to consider the relationship between past human cultures and environment as a dynamic process that occurred at a regional level (39). We argued that to do so, one needs to develop heuristic tools that enable the quantitative comparison and evaluation of individual cultural trajectories, their associated behavioral changes through time, and the mechanisms that operated behind such trends. This approach may allow for the identification of points in time during which human cultures substantially reorganized their second inheritance systems, thus moving closer to the system characteristic of historically known and present-day populations.
A regional cultural trajectory can be conceived of as a succession of cultural packages, which we term cohesive adaptive systems. A cohesive adaptive system is a cultural entity characterized by shared and transmitted knowledge reflected by a recognizable suite of cultural traits that a population uses to operate within both cultural and environmental contexts (39). This concept differs from the concept of “technocomplex” or “archaeological culture” commonly used in archaeology, in that exploited environmental conditions (i.e., the ecocultural niche) contribute to the definition of a past cultural adaptation. When faced with successive climate changes, a cohesive adaptive system can conserve, expand, or contract its ecological niche, with “ecological niche” being defined in the Grinnellian sense as the environmental and resource conditions suitable for a species or population (40). Associated cultural traits, and the way in which they were transmitted, may also evolve in such situations and highlight significant changes in the way in which culture influenced human populations. Research strategies have been developed to investigate such interactions.
Predictive algorithms, originally created in the field of ecology, are able to estimate the ecological niche occupied by a past cohesive adaptive system (i.e., the ecocultural niche) by using the geographic locations of archaeological sites where the cohesive adaptive system has been recognized along with chronologically relevant paleoenvironmental data. Using these data, the predictive algorithms first identify the environmental parameters shared among the archaeological sites and define the relationships between these parameters. These relationships are then used to estimate a cohesive adaptive system’s ecological niche. Another important capacity of these algorithms is that they can be used to examine niches between time periods, thereby allowing one to determine whether or not successive populations exploited different niches. By comparing the
the African gene pool, they were certainly not ubiquitous across material cultures of two or more successive cohesive adaptive sys-
the continent. To overcome such limitations, research has focused tems, and taking into account environmental frameworks within
on better defining the nature and chronology of the cultural en- which they operated, one can evaluate whether or not cultural in-
tities that may reflect past population structure and distributions novations were a response to environmental fluctuations. Equally as
(29, 30), in addition to documenting the complexity of innovations important, one can identify the degree of resilience of a cohesive
recorded in the Middle Stone Age and exploring their social and adaptive system to environmental change.
cognitive implications (31–33). Others have attempted to identify a The goal of this study is to apply this approach to two key
correspondence between environmental or climatic variability and Middle Stone Age archaeological cultures, the Still Bay (SB) and
the emergence of cultural innovations in the hope of identifying the HP of southern Africa. The SB represents the first known
causal links (34–38). These attempts, however, have no designed cultural adaptation in which technological and symbolic inno-
means, apart from recurrence, with which to verify the hypothesis vations of a complexity comparable to the innovations seen in
that climate may have influenced culture, to identify the suites of modern hunter-gatherers appears as a coherent and recognizable
environmental parameters (i.e., the ecological niche) within which package. After a possible hiatus, we observe a different archae-
each archaeological culture operated, or to evaluate how these ological culture, termed the HP, characterized by dramatically
relationships varied through space and time. The emergence of different and simplified lithic technology, as well as by markedly
key cultural innovations in our lineage may reflect changes in the different symbolic material culture. The available archaeological
nature of such relationships. Identifying and disentangling such and paleoenvironmental datasets of this period are of sufficient
relationships is a key challenge for the involved disciplines. The resolution to make this period of the Middle Stone Age an ideal
failure to do so may result in oversimplified scenarios. For ex- laboratory for exploring how typically human behavioral pack-
ample, Ziegler et al. (38) conclude that cultural innovations during ages arose and evolved in one particular region and for identi-
the Middle Stone Age in southern Africa were triggered by periods fying potential mechanisms at work.
of humidity that produced higher levels of biomass and consequent
increases in human population density. This scenario, however, Cultural and Chronological Contexts
only relies on the mean age of each culture and climatic conditions The SB. This archaeological culture, observed at sites located in
associated with those means, and it does not take into consider- coastal areas of southern Africa and predominantly concentrated
ation the full age range of each recognized archaeological culture. in southwestern regions (Fig. 1), is characterized by the production
Furthermore, their model insinuates hiatuses in the archaeological of bifacial foliate points, often made from fine-grained, nonlocal
record following the post-Howiesons Poort (HP) that are not seen lithic materials (Fig. 2A). At the key site of Blombos Cave, the
in most southern African archaeological sequences. majority of these points have been heat-treated before flaking with

7870 | www.pnas.org/cgi/doi/10.1073/pnas.1620752114 d’Errico et al.


COLLOQUIUM
PAPER
Fig. 1. Map of southern Africa indicating the loca-
tions of the SB (red circles) and HP (green triangles)
archaeological sites, the geographical coordinates of
which were used as occurrence inputs to estimate the
two cultures’ respective ecological niches. Sea level is
depicted at 70 m below present-day sea level.

hard and soft hammer percussion, and finished using a technique termed pressure flaking. The latter allows for more refined shaping of the object by giving the knapper better control over its final form. Modern-day experiments indicate that this knapping technology requires a long period of apprenticeship. SB bifaces were multifunctional and served as both projectiles and cutting tools. Examinations of SB lithic assemblages (41) show that these bifaces were often repeatedly resharpened and had long use-lives, indicating that they formed a curated component of the SB lithic toolkit. The SB is also the first archaeological culture in which formal bone tools (i.e., artifacts made of animal osseous material shaped with techniques, such as scraping, grinding, and incising, specifically conceived for these materials) are observed at multiple sites rather than as rare elements in single assemblages. Technological and functional studies show that the two different classes of tool, projectiles and awls (Fig. 2C), were produced with different techniques and that special attention was paid to the finishing of the bone projectile points, suggesting that they were highly valued and possible status items. The SB is also the first archaeological culture in southern Africa associated with personal ornaments. These ornaments take the form of marine shells (Nassarius kraussianus) that were deliberately perforated, stained with ochre, and strung together in a variety of arrangements (33) (Fig. 2B). Use-wear analyses indicate that they were worn for extended periods of time (42). Other elements of SB symbolic material culture include elaborately engraved abstract patterns on ochre pieces (Fig. 2D), as well as simpler engravings on bone items. Also present in assemblages are ochre pieces bearing traces indicating that they were processed to produce red powder (Fig. 2E), which likely was used for both functional and symbolic purposes.

With respect to chronology, a majority of SB sites have yielded optically stimulated luminescence (OSL) ages that range between 76 ka and 71 ka (34, 43–45). Debate exists as to the accuracy of this range due to older OSL and thermoluminescence (TL) dates from the Diepkloof rock shelter (45–48). Because the inexplicably older set of dates from Diepkloof remains a unicum, we will use the currently accepted chronology (45, 49, 50). Debate also exists as to whether this culture is technologically homogeneous or, instead, characterized by regional and temporal variability (41). This issue, however, remains open due to a lack of chronological resolution and the small number of contextually reliable archaeological assemblages.

The HP. This archaeological culture, observed in both coastal and inland regions of southwestern and northeastern South Africa (Fig. 1), is principally characterized by the presence of backed

ECOLOGY

Fig. 2. (Left) SB artifacts [bifacial points made of quartz and silcrete (A), perforated N. kraussianus shell beads (B), bone points and an awl (C), engraved
ochre fragments (D), and an ochre fragment shaped by grinding (E)]. (Right) HP artifacts [segment made of hornfels (F), segments made of quartz (G), flake
and segments bearing residues of mastic (H), engraved ostrich egg shells (I), ochre fragments shaped by grinding (J), and bone point and awls (K)]. Blombos
Cave (A and B), Sibudu Cave (F, G, and K), and Diepkloof Shelter (H–J) are shown. (Scale bar: 1 cm.) Images courtesy of: (A) ref. 41, (C) ref. 98, (D) ref. 12, (F) ref.
53, (G) ref. 99, (H) ref. 59, (I) ref. 61, (J) ref. 100, and (K) ref. 55. Fig. 2B courtesy of F.D. and C.H., and Fig. 2E courtesy of C.H.

d’Errico et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7871
blades and bladelets (i.e., lithic blades steeply retouched on one side to form crescent-shaped segments) (Fig. 2 F and G) that were predominantly used as components in composite hunting weapons. These tools, although not highly standardized dimensionally or morphologically, were made with a lithic reduction system that was geared toward the production of thin, straight blades, some of which were retouched to make this culture’s fossil directeur along with denticulated tools (29, 41, 51). Raw materials used for the lithic technology were predominantly local or near-local in origin, in clear contrast to what is seen for SB bifaces. Similar to the SB, however, HP groups also sometimes heated lithic raw materials before they were reduced to produce blades (52) and occasionally used pressure flaking (53). Bifacial points are absent in the HP, with the exception of a single site where specimens that are smaller and of lower quality have been recovered (54). Bone tools recovered from HP sites consist of awls, pressure flakers, shaped splintered pieces (pièces esquillées), and small projectile points (55) (Fig. 2K). It has been argued that HP backed segments and bone points were used as bow-delivered arrow points based on use-wear, fracture patterns, and morphometrics (56–58). The interpretation that these tools were hafted is supported by the presence of mastic remnants observed on some backed pieces (31, 59) (Fig. 2H). At present, with the exception of a perforated Conus shell found within an infant burial at Border Cave (60), personal ornaments are lacking in HP assemblages, and undisputed symbolic behavior is limited to the decoration of ostrich egg shell water containers with a variety of abstract designs made up of linear engravings (51, 61) (Fig. 2I). Red ochre (Fig. 2J), also sometimes incorporated into mastic mixtures, was widely used by HP groups.

The HP has predominantly been dated with OSL and TL techniques and appears to have lasted for a slightly longer period than the SB. HP dates range between roughly 66 ka and 59 ka (34, 51, 62). As with the SB, some OSL dates of the HP at Diepkloof are significantly older (47, 48) than the corpus of dates available from other South African sites, as well as from other OSL dates obtained at the same site (63). Based on the fact that the newly recalculated dates for the Diepkloof HP (63) cluster with the HP dates from other dated contexts (50), we will use the 66–59 ka range as the chronological interval for the HP in this study. Shortly after ca. 59 ka, we observe the appearance of the post-HP archaeological culture.

Paleoenvironmental Context. These two archaeological cultures occurred during two very different climatic phases (Fig. 3). At the orbital scale, the SB occurs in a phase of precession maximum during which one observes higher seasonality and an increase in precipitation in the Southern Hemisphere (64–67). To the contrary, the HP is contemporaneous with a decrease in precession, with the minimum reached toward its end (ca. 60–59 ka). This change resulted in lower seasonality and drier conditions (SI Appendix, Fig. S1). In addition to orbital climatic variability, SB and HP cultures were subjected to suborbital climatic fluctuations, the so-called Dansgaard–Oeschger (D-O) cycles expressed over Greenland by alternating cold stadials and temperate interstadials, as well as intermittent and extreme cooling episodes recorded in the North Atlantic, termed Heinrich Stadials (HSs). These millennial-scale events are also recorded in Antarctic paleoclimatic records.

The SB occurs during a period comprising Greenland Interstadial (GI) 20, Greenland Stadial (GS) 20, and GI 19 (68) (Fig. 3). This culture disappears from the archaeological record during the initial phase of GS 19 (GS 19.2). The HP appears toward the end of GS 19 and is present across GI 18 and GS 18 (ca. 64.4–59.4 ka, which corresponds to HS 6) (69). The suite of diagnostic elements characteristic of this archaeological culture is no longer present by ca. 59–58 ka, a period marked by rapid climatic oscillations (i.e., GI 17.1, GS 17.1, GI 16.2, GS 16.2). It is following this interval that the post-HP adaptation appears.

The impact of the D-O millennial-scale climatic variability and HSs on the Southern Hemisphere regional climates has recently been investigated. Model experiments and climate reconstructions suggest that GS and HS events resulted in increased sea surface temperatures and humidity in the South Atlantic and Southwestern Indian Ocean (70–74). For southern Africa, Ziegler et al. (38) examined the elemental composition of marine sediments from an Indian Ocean core and proposed that GS and HS events are characterized by increased erosion reflecting higher precipitation that triggered increases in vegetation cover and biomass. Recent research has provided direct data concerning vegetation cover and biomass for this region. Pollen and microcharcoal records from marine core MD96-2098, retrieved off southwestern Africa (refs. 65, 67 and this study), show repeated millennial-scale changes in humidity during the last glacial period that also indicate, within the uncertainties of the independent ice and marine chronologies, that GS and HS events were associated with increases in humidity. Such increases are inferred from peaks in microcharcoal concentration due to grass-fueled fires and decreases in pollen from vegetation characteristic of open environments, such as Nama Karoo and fine-leaved savanna (Fig. 3 D and E). However, when the entire chronological interval for both the SB and HP is taken into account, a more complex climatic pattern is observed, characterized by an alternation of wet and dry events. Despite this variability, the general pattern revealed by all available continental proxies across the entire range of each archaeological culture shows an overall trend toward higher humidity during the SB and generally drier conditions during the HP. The contradictory pattern proposed by Ziegler et al. (38) is probably due to the fact that they do not consider the entire range of these two cultures but, instead, only look at the humidity trends coincident with each culture’s mean age.

Materials and Methods
Paleoclimate Modeling. To estimate ecological niches exploited by the SB and HP, we used paleoclimatic and vegetation simulations produced by Woillez et al. (66) (SI Appendix, Paleoclimatic Simulations) for the periods of 72 ka and 60 ka. Because the two simulations are primarily constrained by orbital parameters and do not estimate suborbital variability, we used the 72 ka simulation to represent climatic and environmental conditions for the SB and the initial HP (ca. 66–63 ka) and the 60 ka simulation to represent conditions for the terminal HP (ca. 63–59 ka). The use of the 72 ka simulation as a proxy for climatic conditions of the initial HP is justified by the relatively high humidity observed at the onset of HS 6, as evidenced by vegetation, fire activity, and erosion proxies (Fig. 3 C–E, respectively). To estimate the SB and HP ecocultural niches, we used temperature of the coldest month, maximum precipitation, minimum precipitation, mean annual precipitation, mean annual temperature, and a measure of biomass from the relevant paleoclimatic simulations.

Ecological Niche Modeling and Hypothesis Testing. To reconstruct the potential ecological (ecocultural) niches exploited by the SB and HP and evaluate whether cultural changes between the two are associated with an ecological niche shift, we constructed a georeferenced list of archaeological sites with levels that can be securely attributed to one of these cultures (Fig. 1 and SI Appendix, Table S1). We then used these occurrence data to conduct tests using both Bioclim (75) and Maxent (76) predictive algorithms within the “dismo” R package (77, 78) (SI Appendix, Ecological Niche Modeling). We use these two algorithms to explore the differences seen when models are allowed to extrapolate freely into combinations of environments that were unavailable during model training (Maxent) versus models that are constrained so that they do not extrapolate beyond the minima and maxima of the marginal environmental distributions of the examined population (Bioclim). Due to Maxent’s ability to extrapolate, we anticipate that similarity between different target populations will generally be seen to be higher when environmental niches are modeled using Maxent as opposed to Bioclim. With these two algorithms, we reconstructed both SB and HP niches using relevant climatic outputs and simulated biomass from the 72 ka simulation and compared these results. We also reconstructed the HP niche using simulation outputs for 60 ka and compared these estimations with the estimations of the SB at 72 ka. A series of Monte Carlo randomization tests was conducted to assess the differences in the set of environments occupied by each culture. This approach is based on widely used methods in evolutionary ecology (the “background” or “similarity” test) (79, 80) that are used to assess whether two populations exhibit statistically significant differences in their environmental tolerances or associations (SI Appendix, Ecological Niche Modeling). We also conducted tests using measures of niche breadth (81, 82) to determine whether any observed differences between the two cultures’ environmental niches represent a statistically significant expansion of the niche. Because some of these evaluations were conducted using different climate layers for the SB

Fig. 3. Climate variability during the time interval between 90 ka and 40 ka encompassing the Middle Stone Age cultures SB (76–71 ka, blue rectangle) and HP (66–59 ka, red rectangle). Precession index (101) (A), North Greenland Ice Core Project δ18O curve on the GICC05 chronology (68) (B), Fe/Ca curve from core CD154-17-17K collected from the Eastern Cape margin indicating changes in river discharge (38) (C), microcharcoal particle concentration curve from core MD96-2098 collected off the Orange River on the western South African margin indicating changes in fire regime and precipitation (65) (D and this study), Nama Karoo and fine-leaved savannah pollen percentage record from core MD96-2098 indicating changes in precipitation (67) (E), and temperature curve for Antarctica from the European Project for Ice Coring in Antarctica ice core (102) (F). Arrows situated between curves in C and D indicate long-term trends in humidity during the SB and HP intervals.
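
The background similarity test outlined in Materials and Methods can be illustrated with a minimal Python sketch. It is not the implementation used in this study, which relied on Bioclim and Maxent models in the “dismo” R package and on the Latin hypercube modifications described in the SI Appendix. The sketch assumes that the I-statistic takes the form I = 1 − ½ Σ(√p_i − √q_i)² on suitability surfaces rescaled to sum to 1, substitutes a deliberately simple Gaussian surface (`toy_suitability`, a hypothetical helper) for a real niche model, and runs on synthetic data. All function and variable names are illustrative.

```python
import numpy as np


def warren_i(p, q):
    """I overlap between two suitability surfaces defined on the same grid.

    Each surface is rescaled to sum to 1; I runs from 0 (no overlap)
    to 1 (identical surfaces). The formula is an assumption of this sketch.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return 1.0 - 0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)


def toy_suitability(train_env, grid_env):
    """Hypothetical stand-in for Bioclim/Maxent: a Gaussian bump centred on
    the mean environment of the training sites (rows = sites, cols = variables)."""
    train_env = np.asarray(train_env, dtype=float)
    grid_env = np.asarray(grid_env, dtype=float)
    mu = train_env.mean(axis=0)
    sd = train_env.std(axis=0) + 1e-6  # guard against zero spread
    z = (grid_env - mu) / sd
    return np.exp(-0.5 * np.sum(z ** 2, axis=1))


def background_similarity_test(env_a, env_b, background_b, grid_env,
                               n_rep=199, seed=0):
    """Compare the observed A-B overlap with overlaps between A and models
    built from random draws of B's background (same sample size as B)."""
    rng = np.random.default_rng(seed)
    env_b = np.asarray(env_b, dtype=float)
    background_b = np.asarray(background_b, dtype=float)
    surf_a = toy_suitability(env_a, grid_env)
    observed = warren_i(surf_a, toy_suitability(env_b, grid_env))
    null = np.empty(n_rep)
    for k in range(n_rep):
        idx = rng.choice(len(background_b), size=len(env_b), replace=False)
        null[k] = warren_i(surf_a, toy_suitability(background_b[idx], grid_env))
    # Two-sided Monte Carlo P value: share of null overlaps lying at least
    # as far from the null mean as the observed overlap does.
    dev = np.abs(null - null.mean())
    p_value = (1 + np.sum(dev >= abs(observed - null.mean()))) / (n_rep + 1)
    return observed, null, p_value


# Synthetic demonstration: two environmental variables on a random grid.
demo_rng = np.random.default_rng(42)
grid = demo_rng.uniform(0.0, 1.0, size=(500, 2))
sites_a = demo_rng.normal(0.3, 0.05, size=(12, 2))  # stand-in for one culture
sites_b = demo_rng.normal(0.7, 0.05, size=(15, 2))  # stand-in for the other
obs, null, p = background_similarity_test(sites_a, sites_b, grid, grid)
print(f"observed I = {obs:.3f}, null mean = {null.mean():.3f}, P ~ {p:.3f}")
```

A low P value with an observed overlap below the null mean would correspond to niches that are less similar than expected by chance. Drawing the null samples from the background of only one culture keeps the comparison asymmetric, so in practice the test is usually run in both directions.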

and HP (72 ka and 60 ka, respectively), modifications that use Latin hypercube sampling were made to the background similarity tests (SI Appendix, Ecological Niche Modeling and Fig. S2).

Results
Niche estimations for the SB at 72 ka produced with Bioclim and Maxent both indicate a high probability of presence primarily restricted to the extreme southern and eastern portions of present-day South Africa (Fig. 4 A and B). The most noticeable difference is that the Maxent prediction includes areas in the southwestern Cape as well as immediately coastal regions along the southeastern and eastern coasts. This broader Maxent prediction is due to this algorithm’s propensity to extrapolate into environments not directly associated with the input occurrence data (i.e., archaeological sites). The predicted niches for the HP at 66 ka, produced with the proxy 72 ka outputs, include those regions predicted for the SB as well as more inland areas, including the Great Escarpment, the Highveld, and the Kaap Plateau, and broader areas within the southwestern Cape and western coastal regions (Fig. 4 C and D). The niche estimations for the HP at 60 ka remain geographically broader than the niche estimations for the SB and still include major inland plateaus but are visibly shifted toward the east and northeast (Fig. 4 E and F), which represent areas that were less affected by the eastward expansion of desert areas during Marine Isotope Stage (MIS) 4 (66).

Background similarity tests of overlap between the SB and HP niches both modeled with Maxent using the 72 ka climatic data produced no statistically significant result (SI Appendix, Fig. S3A and Table S2), meaning that their respective niches are not statistically different from one another. As pointed out above, this lack of significant difference between predictions is likely the result (Materials and Methods) of the algorithm used. To the contrary, these same tests using Bioclim found instead that SB and HP niche estimations using 72 ka climate outputs were less similar than expected by chance (I-statistic: P ∼ 0.022; SI Appendix, Fig. S3C and Table S2). Although HP niche estimates are slightly broader than niche estimates of the SB at 72 ka with both Maxent and Bioclim, these differences are not statistically significant (SI Appendix, Fig. S3 B and D and Table S2). Niche overlap between Maxent models for the SB at 72 ka and the HP at 60 ka was neither greater nor less than expected by chance (SI Appendix, Fig. S3E and Table S2). However, overlap of Bioclim predictions for the SB at 72 ka and the HP at 60 ka was significantly lower than would be expected by chance (I-statistic: P ∼ 0.013; SI Appendix,

Fig. 4. Ecological niche predictions for the SB archaeological culture at 72 ka (A and B), the HP archaeological culture at 66 ka (C and D), and the HP archaeological culture at 60 ka (E and F) produced with Bioclim and Maxent, respectively.
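
The contrast drawn in Materials and Methods between envelope models and freely extrapolating models can be made concrete with a short Python sketch of a Bioclim-style percentile envelope. It is a simplified illustration under stated assumptions, not the “dismo” implementation used to produce these maps: each grid cell is scored by how close its value lies to the median of the values at the occurrence sites, the least favorable variable determines the score, and any cell outside the observed minimum–maximum range of a variable scores 0. The function name `envelope_score` and the example variables are hypothetical.

```python
import numpy as np


def envelope_score(train_env, grid_env):
    """Percentile-envelope suitability in the spirit of Bioclim.

    train_env: (n_sites, n_vars) environmental values at occurrence sites.
    grid_env:  (n_cells, n_vars) environmental values at cells to score.
    Returns scores in [0, 1]; 1 at the training median of every variable,
    0 for cells outside the training range of any variable.
    """
    train = np.asarray(train_env, dtype=float)
    grid = np.asarray(grid_env, dtype=float)
    n_sites = train.shape[0]
    scores = np.ones(grid.shape[0])
    for j in range(train.shape[1]):
        col = np.sort(train[:, j])
        below = np.searchsorted(col, grid[:, j], side="left")         # count of values < x
        at_or_below = np.searchsorted(col, grid[:, j], side="right")  # count of values <= x
        pct = (below + at_or_below) / (2.0 * n_sites)  # mid-rank percentile
        folded = np.minimum(pct, 1.0 - pct)            # treat both tails alike
        scores = np.minimum(scores, 2.0 * folded)      # limiting variable wins
    return scores


# Synthetic example with two hypothetical variables:
# coldest-month temperature (deg C) and mean annual precipitation (mm).
sites = np.array([[8.0, 400.0], [10.0, 550.0], [12.0, 700.0]])
cells = np.array([
    [10.0, 550.0],  # at the median of both variables
    [9.0, 600.0],   # inside the envelope, off-centre
    [10.0, 900.0],  # wetter than any occurrence site
])
print(envelope_score(sites, cells))
```

Because the score drops to 0 as soon as one variable leaves the range seen at the occurrence sites, a model of this kind cannot predict presence in novel combinations of environments, which is the behavior the text attributes to Bioclim as opposed to Maxent.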

Fig. S3G and Table S2), indicating that the two cultures occupied different ecological niches. Change in niche breadth between Maxent predictions for the SB at 72 ka and the HP at 60 ka is not statistically different from random expectations, although the approximate P value is fairly low (P ∼ 0.11) (SI Appendix, Fig. S3F and Table S2), suggesting that a greater sample size might establish the HP niche at 60 ka as significantly broader than the SB niche at 72 ka. The difference in niche breadth for Bioclim models is greater than expected by chance (P ∼ 0.027) (SI Appendix, Fig. S3H and Table S2), indicating that the HP 60 ka niche is broader than the niche of the SB at 72 ka, and points to an ecological niche expansion.

Discussion and Conclusions
To what extent does this study allow us to understand how human culture extended beyond behavioral adaptations observed in other species? Most species exhibit niche conservatism, contraction, or, more rarely, extinction when faced with climate change (83–85). Human populations, however, are unique in their capacity for cumulative culture and associated complex cultural transmission strategies that potentially allow them to adapt to climate change and environmental reorganization via cultural means. We observe such a pattern between the SB and the HP of Southern Africa. The SB was a coastal adaptation that exploited a relatively narrow niche during mild climatic conditions across a large region. To exploit that niche, SB populations developed a variety of complex technologies and symbolic practices, some of which certainly entailed costly modes of cultural transmission. A number of SB cultural features, such as bifacial points and complex beadworking, could only be transmitted by communication and learning strategies that emphasize imitation (high-fidelity copying) over emulation (low-fidelity copying) (86, 87). HP populations significantly increased the breadth of their niche compared with SB populations. This expansion incorporated more arid and high-altitude inland environments and demonstrates their ability to cope successfully with the more arid climatic conditions and higher ecological risk associated with MIS 4, particularly its latter phase. This shift was made possible by developing a cohesive adaptive system reliant on more flexible technologies. The variety of used lithic raw materials, blank production techniques, and methods to

retouch and shape those blanks to produce segments, which vary in form and size, are indicative of a flexible toolkit, and one reliant on composite tools in this case. With effective hafting techniques, such a toolkit would have been easily repaired and maintained. Due to its modular nature, the HP toolkit could be effectively used in diverse environments. More importantly, the communicative strategies needed to transmit the knowledge necessary to perpetuate this technology can be based more on “product copying” (emulation) rather than “process copying” (imitation). In the latter, morphological similarity is associated with the same, or very similar, manufacturing techniques and sequences. For the former, one would expect to see artifacts that are morphologically similar despite being made from a variety of raw materials and techniques, as is observed in the HP. Such patterns could have been the result of a collapse of previously existing long-distance cultural networks, leading to the formation of more local “traditions,” which, again, is exactly what we observe in HP bone and lithic technologies (53, 55, 88). The mechanism or mechanisms that operated behind such a process remain unclear (e.g., demographic changes, population replacement, cultural drift). Although the SB and HP certainly had adaptive strategies in common, it is probable that their cultural transmission strategies differed. Considering the niche and technological changes observed between the two cultures, along with the expertise implicit in some SB technological innovations, we propose that training to create specialists, or “selective oblique transmission” (89), was used in the SB to convey these complex technologies effectively and that this strategy was not used, or was used to a much lesser degree, in the HP.

Numerous studies support the hypothesis that hunter-gatherer toolkit structure is driven, in part, by the risk of resource failure (i.e., more diverse and complex toolkits are associated with riskier environments) (90–92). Data do not always support this prediction, however, and it has been proposed that the impact of risk on toolkits is dependent on the scale of risk differences among the studied populations (93). The degree of reliance on copying (94), population size (95), and mobility (96) are other factors that may condition toolkit structure. None of these studies, however, is able to routinely predict what factors were implicated in shifts in toolkit structure among early AMHs or to address the issue of how past human niches may have changed when shifts in technology were concomitant with major climatic changes. The approach that we have applied here is an effective means with which to explore relationships between climate variability and cohesive adaptive systems at key moments in our evolutionary history. Its application to other regions and periods should allow us to follow, at regional scales, the complex interplay between cultural innovation, changes in modes of cultural transmission, and environmental variability. The results of the present study may be improved in the future by producing paleoclimatic simulations that capture millennial-scale environmental variability and by developing and using methods (e.g., date estimations, Bayesian age modeling) that would allow one to attribute archaeological site levels more precisely to millennial-scale climatic phases. Although the former is technically possible, using such models will not be productive as long as the latter remains beyond our grasp, at least at present. By capturing the main climatic trends characteristic of the end of MIS 5 and MIS 4, our paleoclimatic simulations appear appropriate for examining culture–environment relationships when one considers the degree of chronological uncertainty associated with the two targeted cultures.

Our results demonstrate that in some early AMH regional cultural trajectories, niche expansion was not always associated with cultural complexification (an opposite case is discussed in ref. 97). In this study’s case, complex cultural behaviors and inferred transmission strategies were replaced during a period of pronounced aridification with more flexible adaptations that were used to exploit a broader ecological niche. Increased cultural complexity and elaborated social learning strategies apparently were not always necessary for a culture to expand its ecological niche. Our findings support the view that the path followed by past human populations to produce adaptations and cultural traits, which most researchers would qualify as typically human, is not the outcome of classic Darwinian evolutionary processes in which the appearance of a new niche is often associated with a new species. Rather, the innovations characteristic of the HP represent cultural exaptation: innovations that use existing skills, techniques, and ideas in new ways. The consolidation of these innovations depends on a population’s ability to develop, when necessary, new modes of cultural transmission that allow such innovations to be maintained through time.

ACKNOWLEDGMENTS. This research was conducted with the financial support of the Agence Nationale de la Recherche ANR-10-LABX-52 and the European Research Council’s Advanced Grant TRACSYMBOLS 249587 awarded under the Seventh Framework Programme.

1. Whiten A, Ayala FJ, Feldman MW, Laland KN (2017) The extension of biology through culture. Proc Natl Acad Sci USA 114:7775–7781.
2. Whiten A (2017) Culture extends the scope of evolutionary biology in the great apes. Proc Natl Acad Sci USA 114:7790–7797.
3. Street SE, Navarrete AF, Reader SM, Laland KN (2017) Coevolution of cultural intelligence, extended life history, sociality, and brain size in primates. Proc Natl Acad Sci USA 114:7908–7914.
4. Sankararaman S, et al. (2014) The genomic landscape of Neanderthal ancestry in present-day humans. Nature 507:354–357.
5. Sankararaman S, Mallick S, Patterson N, Reich D (2016) The combined landscape of Denisovan and Neanderthal ancestry in present-day humans. Curr Biol 26:1241–1247.
6. Nielsen R, et al. (2017) Tracing the peopling of the world through genomics. Nature 541:302–310.
7. White TD, et al. (2003) Pleistocene Homo sapiens from Middle Awash, Ethiopia. Nature 423:742–747.
8. McDougall I, Brown FH, Fleagle JG (2005) Stratigraphic placement and age of modern humans from Kibish, Ethiopia. Nature 433:733–736.
9. Gronau I, Hubisz MJ, Gulko B, Danko CG, Siepel A (2011) Bayesian inference of ancient human demography from individual genome sequences. Nat Genet 43:1031–1034.
10. Watts I, Chazan M, Wilkins J (2016) Early evidence for brilliant ritualized display: Specularite use in the Northern Cape (South Africa) between ∼500 and ∼300 ka. Curr Anthropol 57:287–310.
11. Wilkins J, Chazan M (2012) Blade production ∼500 thousand years ago at Kathu Pan 1, South Africa: Support for a multiple origins hypothesis for early Middle Pleistocene blade technologies. J Archaeol Sci 39:1883–1900.
12. Henshilwood CS, d’Errico F, Watts I (2009) Engraved ochres from the Middle Stone Age levels at Blombos Cave, South Africa. J Hum Evol 57:27–47.
13. Mourre V, Villa P, Henshilwood CS (2010) Early use of pressure flaking on lithic artifacts at Blombos Cave, South Africa. Science 330:659–662.
14. d’Errico F, Stringer CB (2011) Evolution, revolution or saltation scenario for the emergence of modern cultures? Philos Trans R Soc Lond B Biol Sci 366:1060–1069.
15. Henshilwood CS, et al. (2011) A 100,000-year-old ochre-processing workshop at Blombos Cave, South Africa. Science 334:219–222.
16. Mazza PPA, et al. (2006) A new Palaeolithic discovery: Tar-hafted stone tools in a European Mid-Pleistocene bone-bearing bed. J Archaeol Sci 33:1310–1318.
17. Jaubert J, et al. (2016) Early Neanderthal constructions deep in Bruniquel Cave in southwestern France. Nature 534:111–114.
18. Romandini M, et al. (2014) Convergent evidence of eagle talons used by late Neanderthals in Europe: A further assessment on symbolism. PLoS One 9:e101278.
19. Radovčic D, Sršen AO, Radovčic J, Frayer DW (2015) Evidence for Neandertal jewelry: Modified white-tailed eagle claws at Krapina. PLoS One 10:e0119802.
20. Soressi M, D’Errico F (2007) Pigments, gravures, parures: Les comportements symboliques controversés des Néandertaliens. Les Néandertaliens. Biologie et Cultures, Documents préhistoriques 23, eds Vandermeersch B, Maureille B (Éditions du CTHS, Paris), pp 297–309. French.
21. Rodríguez-Vidal J, et al. (2014) A rock engraving made by Neanderthals in Gibraltar. Proc Natl Acad Sci USA 111:13301–13306.
22. Zilhão J, et al. (2010) Symbolic use of marine shells and mineral pigments by Iberian Neandertals. Proc Natl Acad Sci USA 107:1023–1028.
23. Caron F, d’Errico F, Del Moral P, Santos F, Zilhão J (2011) The reality of Neandertal symbolic behavior at the Grotte du Renne, Arcy-sur-Cure, France. PLoS One 6:e21545.
24. Joordens JCA, et al. (2015) Homo erectus at Trinil on Java used shells for tool production and engraving. Nature 518:228–231.
25. d’Errico F, et al. (2009) Out of Africa: Modern human origins special feature: Additional evidence on the use of personal ornaments in the Middle Paleolithic of North Africa. Proc Natl Acad Sci USA 106:16051–16056.
26. Dehaene S, Cohen L (2007) Cultural recycling of cortical maps. Neuron 56:384–398.
27. Stout D, Hecht EE (2017) Evolutionary neuroscience of cumulative culture. Proc Natl Acad Sci USA 114:7861–7868.
28. Racimo F, Sankararaman S, Nielsen R, Huerta-Sánchez E (2015) Evidence for archaic adaptive introgression in humans. Nat Rev Genet 16:359–371.

29. Henshilwood CS (2012) Late Pleistocene techno-traditions in Southern Africa: A review of the Still Bay and Howiesons Poort, c. 75–59 ka. J World Prehist 25:205–237.
30. Scerri EML, Groucutt HS, Jennings RP, Petraglia MD (2014) Unexpected technological heterogeneity in northern Arabia indicates complex Late Pleistocene demography at the gateway to Asia. J Hum Evol 75:125–142.
31. Wadley L, Hodgskiss T, Grant M (2009) From the cover: Implications for complex cognition from the hafting of tools with compound adhesives in the Middle Stone Age, South Africa. Proc Natl Acad Sci USA 106:9590–9594.
32. Wadley L, et al. (2011) Middle Stone Age bedding construction and settlement patterns at Sibudu, South Africa. Science 334:1388–1391.
33. Vanhaeren M, d’Errico F, van Niekerk KL, Henshilwood CS, Erasmus RM (2013) Thinking strings: Additional evidence for personal ornament use in the Middle Stone Age at Blombos Cave, South Africa. J Hum Evol 64:500–517.
34. Jacobs Z, et al. (2008) Ages for the Middle Stone Age of southern Africa: Implications for human behavior and dispersal. Science 322:733–735.
35. Osborne AH, et al. (2008) A humid corridor across the Sahara for the migration of early modern humans out of Africa 120,000 years ago. Proc Natl Acad Sci USA 105:16444–16447.
36. Armitage SJ, et al. (2011) The southern route “out of Africa”: Evidence for an early expansion of modern humans into Arabia. Science 331:453–456.
37. Compton JS (2011) Pleistocene sea-level fluctuations and human evolution on the southern coastal plain of South Africa. Quat Sci Rev 30:506–527.
38. Ziegler M, et al. (2013) Development of Middle Stone Age innovation linked to rapid climate change. Nat Commun 4:1905.
39. d’Errico F, Banks WE (2013) Identifying mechanisms behind Middle Paleolithic and Middle Stone Age cultural trajectories. Curr Anthropol 54:S371–S387.
40. Peterson AT, et al. (2011) Ecological Niches and Geographic Distributions (Princeton Univ Press, Princeton).
41. Soriano S, et al. (2015) The Still Bay and Howiesons Poort at Sibudu and Blombos: Understanding Middle Stone Age technologies. PLoS One 10:e0131127.
42. d’Errico F, Henshilwood C, Vanhaeren M, van Niekerk K (2005) Nassarius kraussianus shell beads from Blombos Cave: Evidence for symbolic behaviour in the Middle Stone Age. J Hum Evol 48:3–24.
43. Jacobs Z, Duller GAT, Wintle AG (2003) Optical dating of dune sand from Blombos Cave, South Africa: II–single grain data. J Hum Evol 44:613–625.
44. Jacobs Z, Duller GAT, Wintle AG, Henshilwood CS (2006) Extending the chronology of deposits at Blombos Cave, South Africa, back to 140 ka using optical dating of single and multiple grains of quartz. J Hum Evol 51:255–273.
45. Jacobs Z, Hayes EH, Roberts RG, Galbraith RF, Henshilwood CS (2013) An improved OSL chronology for the Still Bay layers at Blombos Cave, South Africa: Further tests of single-grain dating procedures and a re-evaluation of the timing of the Still Bay industry across southern Africa. J Archaeol Sci 40:579–594.
46. Tribolo C, et al. (2009) Thermoluminescence dating of a Stillbay–Howiesons Poort sequence at Diepkloof Rock Shelter (Western Cape, South Africa). J Archaeol Sci 36:730–739.
47. Guérin G, Murray AS, Jain M, Thomsen KJ, Mercier N (2013) How confident are we in the chronology of the transition between Howieson’s Poort and Still Bay? J Hum Evol 64:314–317.
48. Tribolo C, et al. (2013) OSL and TL dating of the Middle Stone Age sequence at Diepkloof Rock Shelter (South Africa): A clarification. J Archaeol Sci 40:3401–3411.
49. Archer W, Pop CM, Gunz P, McPherron SP (2016) What is Still Bay? Human biogeography and bifacial point variability. J Hum Evol 97:58–72.
50. Jacobs Z, Roberts RG (2017) Single-grain OSL chronologies for the Still Bay and Howieson’s Poort industries and the transition between them: Further analyses and statistical modelling. J Hum Evol 107:1–13.
51. Henshilwood CS, et al. (2014) Klipdrift Shelter, southern Cape, South Africa: Preliminary report on the Howiesons Poort layers. J Archaeol Sci 45:284–303.
52. Delagnes A, et al. (2016) Early evidence for the extensive heat treatment of silcrete in the Howiesons Poort at Klipdrift Shelter (Layer PBD, 65 ka), South Africa. PLoS One 11:e0163874.
53. de la Peña P (2015) Refining our understanding of Howiesons Poort lithic technology: The evidence from Grey Rocky Layer in Sibudu Cave (KwaZulu-Natal, South Africa). PLoS One 10:e0143451.
54. de la Peña P, Wadley L, Lombard M (2013) Quartz bifacial points in the Howiesons Poort of Sibudu. S Afr Archaeol Bull 68:119–136.
55. d’Errico F, Backwell LR, Wadley L (2012) Identifying regional variability in Middle Stone Age bone technology: The case of Sibudu Cave. J Archaeol Sci 39:2479–2495.
56. Backwell L, d’Errico F, Wadley L (2008) Middle Stone Age bone tools from the Howiesons Poort layers, Sibudu Cave, South Africa. J Archaeol Sci 35:1566–1580.
57. Lombard M, Phillipson L (2010) Indications of bow and stone-tipped arrow use 64,000 years ago in KwaZulu-Natal, South Africa. Antiquity 84:635–648.
58. Bradfield J, Lombard M (2011) A macrofracture study of bone points used in experimental hunting with reference to the South African middle stone age. S Afr Archaeol Bull 66:67.
59. Charrié-Duhaut A, et al. (2013) First molecular identification of a hafting adhesive in the Late Howiesons Poort at Diepkloof Rock Shelter (Western Cape, South Africa). J Archaeol Sci 40:3506–3518.
60. d’Errico F, Backwell L (2016) Earliest evidence of personal ornaments associated with burial: The Conus shells from Border Cave. J Hum Evol 93:91–108.
61. Texier P-J, et al. (2010) From the cover: A Howiesons Poort tradition of engraving ostrich eggshell containers dated to 60,000 years ago at Diepkloof Rock Shelter, South Africa. Proc Natl Acad Sci USA 107:6180–6185.
62. Wadley L, Mohapi M (2008) A segment is not a monolith: Evidence from the Howiesons Poort of Sibudu, South Africa. J Archaeol Sci 35:2594–2605.
63. Jacobs Z, Roberts RG (2015) An improved single grain OSL chronology for the sedimentary deposits from Diepkloof Rockshelter, Western Cape, South Africa. J Archaeol Sci 63:175–192.
64. Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC (1997) Orbital forcing of climate over South Africa: A 200,000-year rainfall record from the Pretoria saltpan. Quat Sci Rev 16:1125–1133.
65. Daniau A-L, et al. (2013) Orbital-scale climate forcing of grassland burning in southern Africa. Proc Natl Acad Sci USA 110:5069–5073.
66. Woillez M-N, et al. (2014) Impact of precession on the climate, vegetation and fire activity in southern Africa during MIS4. Climate of the Past 10:1165–1182.
67. Urrego DH, Sánchez Goñi MF, Daniau A-L, Lechevrel S, Hanquiez V (2015) Increased aridity in southwestern Africa during the warmest periods of the last interglacial. Climate of the Past 11:1417–1431.
68. Rasmussen SO, et al. (2014) A stratigraphic framework for abrupt climatic changes during the Last Glacial period based on three synchronized Greenland ice-core records: Refining and extending the INTIMATE event stratigraphy. Quat Sci Rev 106:14–28.
69. Sánchez Goñi MF, Bard E, Landais A, Rossignol L, d’Errico F (2013) Air-sea temperature decoupling in western Europe during the last interglacial-glacial transition. Nat Geosci 6:837–841.
70. Broccoli AJ, Dahl KA, Stouffer RJ (2006) Response of the ITCZ to Northern Hemisphere cooling. Geophys Res Lett 33:L01702.
71. Stouffer RJ, et al. (2006) Investigating the causes of the response of the thermohaline circulation to past and future climate changes. J Clim 19:1365–1387.
72. Barker S, et al. (2009) Interhemispheric Atlantic seesaw response during the last deglaciation. Nature 457:1097–1102.
73. Kanner LC, Burns SJ, Cheng H, Edwards RL (2012) High-latitude forcing of the South American summer monsoon during the Last Glacial. Science 335:570–573.
74. Marino G, et al. (2013) Agulhas salt-leakage oscillations during abrupt climate changes of the Late Pleistocene. Paleoceanography 28:599–606.
75. Nix HA (1986) A biogeographic analysis of Australian elapid snakes. Atlas of Elapid Snakes of Australia, Australian Flora and Fauna Series, ed Longmore R (Australian Government Publishing Service, Canberra), pp 4–15.
76. Phillips SJ, Anderson RP, Schapire RE (2006) Maximum entropy modeling of species geographic distributions. Ecol Modell 190:231–259.
77. Hijmans RJ, Phillips SJ, Leathwick J, Elith J (2017) dismo: Species Distribution Modeling. Available at https://cran.r-project.org/web/packages/dismo/index.html. Accessed January 11, 2017.
78. R Core Team (2013) R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna).
79. Warren DL, Glor RE, Turelli M (2008) Environmental niche equivalency versus conservatism: Quantitative approaches to niche evolution. Evolution 62:2868–2883.
80. Warren DL, Glor RE, Turelli M (2010) ENMTools: A toolbox for comparative studies of environmental niche models. Ecography 33:607–611.
81. Levins R (1968) Evolution in Changing Environments (Princeton Univ Press, Princeton).
82. Mandle L, et al. (2010) Conclusions about niche expansion in introduced Impatiens walleriana populations depend on method of analysis. PLoS One 5:e15297.
83. Parmesan C, Yohe G (2003) A globally coherent fingerprint of climate change impacts across natural systems. Nature 421:37–42.
84. Wiens JJ, Graham CH (2005) Niche conservatism: Integrating evolution, ecology, and conservation biology. Annu Rev Ecol Evol Syst 36:519–539.
85. Peterson AT (2011) Ecological niche conservatism: A time-structured review of evidence. J Biogeogr 38:817–827.
86. Tennie C, Call J, Tomasello M (2009) Ratcheting up the ratchet: On the evolution of cumulative culture. Philos Trans R Soc Lond B Biol Sci 364:2405–2415.
87. Whiten A, McGuigan N, Marshall-Pescini S, Hopper LM (2009) Emulation, imitation, over-imitation and the scope of culture for child and chimpanzee. Philos Trans R Soc Lond B Biol Sci 364:2417–2428.
88. Porraz G, et al. (2013) Technological successions in the Middle Stone Age sequence of Diepkloof Rock Shelter, Western Cape, South Africa. J Archaeol Sci 40:3376–3400.
89. d’Errico F, Banks WE (2015) The archaeology of teaching: A conceptual framework. Camb Archaeol J 25:859–866.
90. Torrence R (2001) Hunter-gatherer technology: Macro- and microscale patterns. Hunter-Gatherers: An Interdisciplinary Perspective, Biosocial Society Symposium Series, eds Panter-Brick C, Layton R, Rowley-Conwy P (Cambridge Univ Press, Cambridge, UK), pp 73–98.
91. Collard M, Kemery M, Banks S (2005) Causes of toolkit variation among hunter-gatherers: A test of four competing hypotheses. Can J Archaeol J Can D’Archéologie 29:1–19.
92. Read D (2008) An interaction model for resource implement complexity based on risk and number of annual moves. Am Antiq 73:599–625.
93. Collard M, Buchanan B, Morin J, Costopoulos A (2011) What drives the evolution of hunter-gatherer subsistence technology? A reanalysis of the risk hypothesis with data from the Pacific Northwest. Philos Trans R Soc Lond B Biol Sci 366:1129–1138.
94. Rendell L, et al. (2011) How copying affects the amount, evenness and persistence of cultural knowledge: Insights from the social learning strategies tournament. Philos Trans R Soc Lond B Biol Sci 366:1118–1128.
95. Kline MA, Boyd R (2010) Population size predicts technological complexity in Oceania. Proc R Soc Lond B Biol Sci 277:2559–2564.
96. Collard M, Buchanan B, O’Brien MJ, Scholnick J (2013) Risk, mobility or population size? Drivers of technological richness among contact-period western North American hunter-gatherers. Philos Trans R Soc Lond B Biol Sci 368:20120412.
97. Banks WE, d’Errico F, Zilhão J (2013) Human-climate interaction during the Early Upper Paleolithic: Testing the hypothesis of an adaptive shift between the Proto-Aurignacian and the Early Aurignacian. J Hum Evol 64:39–55.
98. d’Errico F, Henshilwood CS (2007) Additional evidence for bone technology in the southern African Middle Stone Age. J Hum Evol 52:142–163.
99. de la Peña P, Wadley L (2014) Quartz knapping strategies in the Howiesons Poort at Sibudu (KwaZulu-Natal, South Africa). PLoS One 9:e101534.
100. Dayet L, Texier P-J, Daniel F, Porraz G (2013) Ochre resources from the Middle Stone Age sequence of Diepkloof Rock Shelter, Western Cape, South Africa. J Archaeol Sci 40:3492–3505.
101. Laskar J, et al. (2004) A long-term numerical solution for the insolation quantities of the Earth. Astron Astrophys 428:261–285.
102. Jouzel J, et al. (2007) Orbital and millennial Antarctic climate variability over the past 800,000 years. Science 317:793–796.



COLLOQUIUM PAPER

Cumulative cultural learning: Development and diversity

Cristine H. Legareᵃ,¹

ᵃDepartment of Psychology, The University of Texas at Austin, Austin, TX 78712

Edited by Andrew Whiten, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark June 16, 2017
(received for review January 14, 2017)

The complexity and variability of human culture is unmatched by any other species. Humans live in culturally constructed niches filled with artifacts, skills, beliefs, and practices that have been inherited, accumulated, and modified over generations. A causal account of the complexity of human culture must explain its distinguishing characteristics: It is cumulative and highly variable within and across populations. I propose that the psychological adaptations supporting cumulative cultural transmission are universal but are sufficiently flexible to support the acquisition of highly variable behavioral repertoires. This paper describes variation in the transmission practices (teaching) and acquisition strategies (imitation) that support cumulative cultural learning in childhood. Examining flexibility and variation in caregiver socialization and children’s learning extends our understanding of evolution in living systems by providing insight into the psychological foundations of cumulative cultural transmission, the cornerstone of human cultural diversity.

cumulative culture | cultural evolution | cross-cultural comparison | teaching | imitation

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: C.H.L. wrote the paper.
The author declares no conflict of interest.
This article is a PNAS Direct Submission. A.W. is a guest editor invited by the Editorial Board.
¹Email: legare@austin.utexas.edu.
www.pnas.org/cgi/doi/10.1073/pnas.1620743114 PNAS | July 25, 2017 | vol. 114 | no. 30 | 7877–7883

Human and nonhuman animals engage in behaviors that are culturally created and subsequently transmitted (1–3). Long-term studies of nonhuman animal species in their natural habitats have demonstrated that many species respond to, and learn from, social information (4–9). Nonhuman animals also transmit group-specific behavior through what could be considered rudimentary forms of cultural transmission (10–14). However, cultural transmission in humans differs markedly from that in nonhuman animals in both its extent and structural complexity (15, 16). The psychological foundations of cultural complexity are multifaceted. Explanations require drawing together developmental, cross-cultural, and comparative research that extends biology through culture.

Culture is defined as “group-typical behaviors shared by members of a community that rely on socially learned and transmitted information” (17). Humans are “ultra” cultural (18); they live in culturally constructed niches filled with artifacts, skills, beliefs, and practices that have been inherited, accumulated, and modified over generations (19–22). What explains the technological and social complexity of human culture? A causal account must explain the distinguishing characteristics of human culture: It is cumulative, transmitted horizontally within groups and vertically across generations, and varies within and between populations. Cumulative culture is a process by which innovations are progressively incorporated into a population’s stock of skills and knowledge, generating more complex repertoires (23–26).

Cumulative culture requires psychological adaptations that ensure the high-fidelity transmission of knowledge, skills, and practices (27–29). However, innovation is also necessary to ensure cultural and individual adaptation to novel and changing challenges (30–33).

Cumulative cultural transmission accelerates innovation because each generation can build upon the technologies passed down by previous generations (34). Much of human technology is too complex and sophisticated to be recreated within individual lifetimes (35). The growth of cultural complexity is not exponential or linear but instead is a process of punctuated accumulation; it involves the conservation of some features, incremental innovation, and occasionally dramatic qualitative shifts (36).

The diversity of skills, practices, beliefs, and values among populations is another distinguishing feature of human culture. Cultural groups are heterogeneous populations of individuals that differ along complex ecological, social, and structural variables. Socially acquired and transmitted behaviors vary more distinctly among human populations than in any other species (37). Cultural variability is one of our species’ most distinctive features, and a causal account of human culture must explain its diversity. The psychological adaptations supporting cumulative cultural transmission are hypothesized to be universal features of human psychology, but they must be sufficiently flexible to support the acquisition of highly variable skill sets and behavioral repertoires (38).

What psychological adaptations explain the species-specific capacity to accumulate and build upon the cultural innovations of previous generations? To what extent do cultural transmission practices (teaching) and cultural acquisition strategies (imitation) vary across populations? How do caregivers use teaching to transmit information, skills, and practices to children? How do children use imitation to acquire the knowledge and skills of their groups? The objective of this paper is to answer these questions using data on teaching and imitation from developmental and cross-cultural research.

One potential explanation for cross-species variation in cultural complexity is social learning (2). Social learning is defined as “learning that is influenced by observation of or interaction with another animal (typically a conspecific) or its products” (39, p. 207). Young children are well equipped with a complex repertoire of social learning capacities (1, 40). Cumulative culture transmission requires a particular kind of social learning that allows the accumulation of successful modifications over time through a process of cultural ratcheting (41–43). For example, humans improved upon the Oldowan single-face stone tool, used more or less intact for a million years, by creating bifacial Acheulean handaxes with dramatically improved functionality, an example of cultural continuity followed by punctuated innovation (44).

Cumulative cultural learning is psychologically prepared by a set of adaptations that facilitate the transmission and acquisition of information within and across generations (29, 45–47). Teaching, high-fidelity imitation, and language are three linked abilities that work in concert to support cultural transmission in humans (48). Teaching and imitation reflect the distinction between instructed and imitative learning (43). Language allows


information transfer between individuals, supporting both teaching and imitation. These cognitive abilities are supported by a psychological system that has evolved to understand the minds of others and to navigate complex social group behavior (49–51). Well-documented cognitive biases reinforce cultural transmission, including preferences for similar others (homophily) (52) and proclivities for conformity (53), consensus (54–56), prestige (57, 58), and social norms (59–62).

If teaching and imitation provide the foundation for cumulative culture, they should be early developing and universal (63). They also must afford the capacity to respond flexibly to diverse ontogenetic contexts and cultural ecologies (38, 64–66). Understanding cultural continuity and variation in teaching and imitation provides insight into the process by which cumulative culture allows humans to adapt to highly diverse environments. Teaching and imitation conserve cultural knowledge, thus increasing the potential for innovation or modification at the group level, which further increases cultural complexity (67). Greater cultural complexity increases the repertoire of socially transmitted beliefs and practices, which increases the prevalence and necessity of teaching (46). High-fidelity transmission may be more central to maintaining cumulative culture than to innovation (68). Teaching and imitation conserve and transmit both group-specific behavioral repertoires and cultural innovations.

To understand variability in teaching and imitation better, research must be conducted on childrearing environments and practices across diverse populations. The great majority of psychological research has been conducted in populations that are unrepresentative of human culture globally and historically: those from Western, educated, industrialized, rich, and democratic (WEIRD) backgrounds (38, 69). A growing literature within developmental psychology and the anthropology of childhood aims to correct the bias in studying WEIRD populations within the discipline (18, 37, 38, 70–78).

Using evidence from cross-cultural, developmental research, I will first describe variation in cultural transmission practices and then cultural acquisition strategies. My objective is to provide insight into the origins of variation in cumulative cultural learning and the processes by which knowledge is acquired and transmitted during development. A comprehensive account of teaching and imitation requires systematic study of variations in childrearing practices and beliefs.

Variation in Cultural Transmission Practices
The universal goals of childrearing include promoting the survival, health, and cultural competency of children (79). Children, in collaboration with their caregivers and peers, interact in ways that ensure the transmission of cultural practices and beliefs across generations (37, 80). Teaching, defined as “a behavior in which one animal intends that another learn some skill or acquire some bit of information or knowledge that it did not have previously” (48, p. 374), promotes the efficient transfer of information and is a recurrent feature of human cultural transmission.

Human caregivers are unique among animals in their motivation to transmit information to children through teaching (81). Human adults expect children to learn and provide assistance when needed (82). The ability to share psychological states and intent with others (intersubjectivity and metacognition), to engage in mutually recognized, shared focus (joint attention), and to engage in collaborative and coordinated interactions contributes to efficient scaffolding and teaching (83–87).

Substantial quantitative and qualitative variation exists in teaching both within (88) and among (89) populations. For example, in different populations, caregivers respond differently to infants’ emotional displays (90), speak to and structure their infants’ social interactions and expectations in distinct ways (74, 91), and display variability in the modalities (e.g., physical, visual, vocal) used to transmit information to infants (92, 93).

Teaching is not a monolithic process. It consists of a repertoire of cultural transmission strategies that vary based on the kind of information or skill being transmitted, the effort required, and beliefs about how children learn. Kline (77) developed a taxonomy of teaching based on data from a teaching ethogram for cross-cultural human research (TEACH) to describe the variation and function of different pedagogical styles. For example, when teaching by social tolerance, a teacher grants the learner access for close observation. When teaching by opportunity provisioning, teachers provide access to activities that are too difficult or dangerous for the learner to explore independently, without modification. When teaching by evaluative feedback, the teacher provides positive or negative reinforcement of the learner’s behavior through positive or negative verbal or gestural feedback, positive or negative consequences, teasing, warning of danger, or commands to stop, to say, or to do. When teaching by social or local enhancement, the teacher directs the learner’s attention toward the task at hand. Direction of this kind can include calling attention to an object or person and commands to watch. When engaging in direct active teaching, the teacher makes relevant aspects of the task accessible or observable. Direct active teaching involves defining the boundaries of what is to be learned and can include direct communication, abstract communication, and demonstration.

Opportunity provisioning and direct active teaching are relatively effortful compared with the other types of teaching, because the behaviors required are less compatible with teachers’ ongoing behaviors, requiring an interruption of the teacher’s behavior at some cost. Some of the indicators in the TEACH ethogram are established in the literature as behavioral markers of teaching, including ostensive cues and behaviors that facilitate shared attention (48). Pedagogical style varies predictably based on the costs and benefits of the mode of transmission, the cultural domain and complexity, learner, and teacher identity (94, 95).

Direct active teaching has much in common with didactic pedagogy (i.e., adults structuring and guiding children’s learning). When engaging in direct active teaching, caregivers in WEIRD populations scaffold and manage children’s learning environment, often engaging in extensive face-to-face interaction, eye contact, and instruction (92). According to this pedagogical model, the caregiver is a teacher, a model of child socialization based on Western formal educational practices. The extensive reliance on direct active teaching may be a relatively recent historical phenomenon in caregiver–child interactions and reflects cultural beliefs about children’s learning, the structure of formal educational institutions, and the extensive body of abstract knowledge and skills children are expected to master (e.g., literacy and numerical computation). Direct active teaching is such a normative feature of cultural transmission in WEIRD populations that some are calling for limiting its use because of its potentially detrimental effects on self-directed discovery and exploration (96).

Populations vary widely in the amount of direct active teaching in which caregivers engage. For example, in Fiji, direct active teaching is relatively rare compared with less time-intensive and costly forms of teaching, such as teaching by social tolerance or allowing children to observe behaviors they may need to learn (77, 95, 97). Cross-cultural research in the United States and Vanuatu has demonstrated that, in contrast to caregivers from Vanuatu, caregivers from the United States rely heavily on verbal communication. They scaffold through language by asking children questions, encouraging planning, and providing high levels of verbal praise and encouragement. Caregivers in the United States also use extensive verbal instruction and repetition to introduce new objects or unfamiliar tasks and to establish common ground in information sharing (98). They engage in high levels of visual contact with children, consistent with previous research on joint attention in WEIRD populations. In contrast, caregivers

7878 | www.pnas.org/cgi/doi/10.1073/pnas.1620743114 Legare


from Vanuatu use substantially more nonverbal forms of communication, such as gesture and physical touch (92).

Ethnographic accounts of caregiver–child interaction emphasize variation in parental ethnotheories (i.e., cultural beliefs) about children’s capacity to self-educate through observational learning (99, 100). Children routinely engage in third-party observation of adult activity and often learn by close observation without being directly addressed or involved (101–104). Caregiver expectations that children will learn through attentive observation before participating affect the caregiver’s pedagogical style (105). If caregivers expect children to learn through observation instead of through interactive conversation, they may be less likely to engage in verbal scaffolding or direct active teaching (77). Formal education also impacts childrearing practices and values (76, 93, 106). These practices include how parents interact with their infants (107), direct children’s attention (108), and use verbal instruction (109).

There is substantial variation in how caregivers structure children’s learning opportunities (102, 110–112). The kinds of tasks caregivers and children engage in together vary (113), as does the amount of time children spend with nonparental caregivers and peers (79, 114). Cultural groups also vary in the degree to which children are segregated from or are participants in adult economic and social activity (100, 115). For example, children living in communities that rely on labor-intensive subsistence agriculture are expected to assist adults in subsistence-based labor (e.g., cooking, planting and harvesting crops, and helping with the childcare of younger siblings) at a young age (116).

Variation in cultural transmission practices also reflects the kinds of skills and behaviors children must acquire. For example, learning a complex or abstract skill often requires direct active instruction to acquire that skill efficiently. Caregivers play a critical role in transmitting the beliefs, skills, and practices of particular populations. Cultural transmission alone does not explain high-fidelity cultural acquisition. Young children are adept at acquiring the beliefs and practices of the groups they are born into, an extraordinary learning achievement that requires substantial flexibility (38). Next I review evidence for high-fidelity imitation, describe evidence for variation between populations, and discuss the implications for acquiring knowledge.

Variation in Cultural Acquisition Practices

Our species-typical proclivity for high-fidelity imitation is critical for cumulative cultural transmission (42, 117, 118). High-fidelity imitation plays a central role in both horizontal and vertical transmission of group-specific cultural practices. Young children possess cognitive and communication systems that support the transmission of complex technical skills and social conventions (46, 65, 75).

Children learn the skills and practices of their communities by imitating others. The ability and motivation to engage in high-fidelity copying allow children to acquire an extraordinary variety of skills and information they otherwise would not be able to acquire through direct exploration or experimentation alone (29). For acquired behavior to count as cultural, it must disseminate in a social group and remain stable across generations (119, 120). The conservation of knowledge and skills across generations supports individual and group-level innovation (121). The propensity for overimitation, or copying actions that are causally irrelevant to achieving an instrumental end goal (122, 123), develops early. Children often copy when uncertain about the underlying causal structure of a behavior. This proclivity is useful, given that a vast amount of behavior that children acquire is opaque from the perspective of physical causality (124, 125). High-fidelity imitation is an adaptive human strategy facilitating more rapid social learning of instrumental skills than would be possible if copying required a full causal representation of an event (126).

Cumulative culture requires the high-fidelity transmission of two qualitatively different behaviors: instrumental knowledge and skills (e.g., how to keep warm during winter) and social conventional knowledge and skills (e.g., how to perform a ceremonial dance) (127). Acquiring the behavior of other group members may be the function of an individual-level adaptation for imitation in our species. Thus, the transmission of cumulative culture across generations can be seen, in part, as a product of our propensity for imitative flexibility (128).

The unique demands of acquiring instrumental skills and social conventions provide insight into when children imitate and when they innovate. The objective of imitating instrumental behavior is reproducing the end goal by discerning which actions are causally relevant to producing the desired outcome (127). Attending to the causal relationship between the actions and the end goal allows for innovation and variability in the reproduction of the behavior and, as a result, lower-fidelity imitation. In contrast, the objective of imitating conventional behavior is reproducing all the steps in the process (129), which requires attending to the way in which the behavior ought to be executed. In contrast to imitating instrumental behaviors, imitating conventional behaviors requires consistently high-fidelity imitation. Children may encode causally irrelevant actions not because they think that they are causally efficacious in some way, or even to demonstrate shared intentions, but rather to conform to social conventions (130). Although learning an instrumental skill often allows for variability and innovation in methods of execution, learning social conventions requires close conformity to the way other group members perform the actions.

Imitation has social functions, such as encoding normative behavior (131). The adaptive benefits of group membership have favored individuals who engage in affiliative behaviors, such as high-fidelity imitation (132, 133). Children imitate social conventions as a means of affiliation with group members (127). High-fidelity imitation also may function as a reinclusion behavior in reaction to the threat of social exclusion from an in-group in childhood in ways that parallel the increase in motor mimicry following social exclusion by in-group members observed in adults (134). Children ostracized by in-group members display higher levels of anxiety and engage in higher imitative fidelity of a group convention than children ostracized by out-group members (135). They imitate instrumental tasks with higher fidelity when primed with ostracism (136, 137). When status or inclusion within a group is threatened, children may be particularly motivated to enhance their standing in a group through affiliative behavior such as high-fidelity imitation.

Imitation is used to acquire instrumental skills as well as to engage in social conventions such as rituals. However, it is often difficult to determine whether a behavior is instrumental or conventional based on observation of the behavior alone. For example, lighting a candle could have an instrumental goal (lighting a dark room) or a conventional goal (worshiping a deity). How do children determine whether a behavior is instrumental or conventional? Young children are highly sensitive to contextual variation in social information (138). Children use a number of social and contextual cues when making inferences about the goal of behavior. Cues to conventionality increase imitative fidelity. One is causal opacity (i.e., lack of a physical causal mechanism). A second is consensus (i.e., multiple actors performing the same actions). A third is synchrony (i.e., multiple actors performing the same actions at the same time) (56). Children are also highly sensitive to verbal cues to conventionality and to the presence of a social norm (65, 127, 139). Even infants are sensitive to language cues to conventionality (140).

There is both continuity and variation in imitative flexibility across populations (141–143). For example, children in industrialized, Western populations (e.g., the United States) and subsistence-based, non-Western populations (e.g., Vanuatu) imitate conventional

Legare PNAS | July 25, 2017 | vol. 114 | no. 30 | 7879


tasks with higher fidelity than instrumental tasks, an example of continuity. Children in Vanuatu, however, engage in higher imitative fidelity of instrumental tasks than children in the United States, a potential consequence of greater socialization for conformity in some populations than in others (139, 144).

Cues to conventionality also increase expectations for conformity and attention to behavioral variation. Children’s accuracy in detecting differences between the performances of two actors is greater when an action is interpreted as a social convention, potentially because of expectations for conformity to conventional behavior (127). The social conventionality of an action may trigger affiliative behavior through conformity, motivating greater attention to detail and alertness to deviations from procedure. Children also transmit conventional behavior with higher fidelity than instrumental tasks when teaching a peer (65).

Imitative flexibility improves over the course of childhood. For example, there are age-related improvements in object memory-based imitation between 2 and 5 y of age (145, 146). Children’s understanding of the social and contextual cues that distinguish instrumental from conventional behavior increases with age (127, 139). They may become more sensitive to these cues as a result of learning about social conventionality (50, 147). Understanding the development of imitative flexibility requires examining the extent to which caregivers scaffold this ability. Caregivers in the United States adjust their interactions with children according to the goal of the behavior. For example, they encourage higher-fidelity imitation of social conventions than of instrumental tasks and, conversely, encourage more creativity and innovation for instrumental tasks than for social conventions. They also engage in more encouragement, demonstration, and monitoring when teaching their children conventional tasks than when teaching instrumental behavior (148).

Adults across a wide range of global populations view high-fidelity imitation as an efficient method of learning. For example, parents in the United States and Vanuatu encourage children to conform to the behavior of others (144). However, there is variation in beliefs about the relationship between conformity and competency. For example, when evaluating US children, US adults are more likely to endorse low- than high-conformity children as intelligent, often citing creativity as a justification for their judgments. In contrast, Vanuatu adults are more likely to endorse high- than low-conformity children as intelligent and are more likely to endorse high-conformity children as well behaved than are US adults (144). The perceived relations between intelligence, conformity, and creativity vary across populations. Perceived intelligence is critical to social esteem and status within a group. The variation in beliefs about indicators of intelligence based on conformity and creativity has implications for the kinds of behaviors that are transmitted as well as for the kinds of behaviors associated with prestige within groups.

As novice learners, children must acquire the practices and beliefs of the group they are born into. To understand cultural continuity and variability in cultural acquisition strategies better, research must be conducted with children across diverse populations. There is substantial evidence that high-fidelity imitation is universal in human children. However, imitation, like teaching, is not a monolithic capacity. Efficient cultural learning requires flexible imitation of instrumental skills and social conventions. The skills and conventions children must acquire vary enormously among populations, as do expectations for conformity versus innovation. Cross-cultural data demonstrate that variation of this kind impacts the extent to which children engage in high-fidelity imitation versus innovation.

Recent decades have produced a large literature on cumulative cultural transmission. Children, in collaboration with their caregivers and peers, interact in ways that ensure the transmission of cultural practices and beliefs across generations. However, we currently lack a complete causal explanatory account of cultural variation. Next I describe future directions for research designed to explain variation between populations.

Explaining Cultural Variation

Despite evidence that cultural transmission practices and cultural acquisition strategies vary across populations, to date we cannot predict and explain the sources of this variation. Do caregivers in populations with lower levels of Western-style education engage in less direct active teaching and more observational learning? If so, is this difference explained by participation in Western-style education or other factors such as social organization or degree of market participation? Do caregivers in hierarchical populations engage in more active teaching than caregivers in egalitarian populations? Do peers, older siblings, or cousins use different teaching styles than caregivers, and does age-heterogeneity of the peer group impact learning? More research is needed to collect the kind of demographic and mixed-methodological data required to answer such questions.

Cultural groups vary along multiple continua. These include level of integration into the global economic marketplace, social organization, urbanicity, kinship networks, peer-group age heterogeneity, and formal versus informal education. Systematic comparisons among multiple groups will provide much stronger support for causal claims that a particular variable of interest is responsible for variation in dependent measures (149). For example, studying Melanesian populations in Yasawa Islands, Fiji, and Tanna, Vanuatu, would allow a comparison of populations that are similar in terms of subsistence agricultural practices and limited exposure to Western-style education but are different in terms of social organization (hierarchical social organization in Fiji versus egalitarian chiefdoms in Vanuatu). Conducting research with multiple populations that are similar along some variables but different along others will (i) reveal the impact of sources of variation on outcomes; (ii) prevent inadvertently describing idiosyncratic features of particular cultural contexts; and (iii) provide opportunities to reveal social and psychological processes not possible if data were collected only from a narrow range of populations.

The dearth of systematic research outside Western populations presents a major impediment to theoretical progress in the psychological sciences in general (47) and in the developmental sciences in particular (18). Despite growing recognition that most of what we know about child development is based on a very narrow sample of children, cross-cultural developmental studies are still rare, often unsystematic, and typically rely on convenience sampling (29, 150). A new path forward in developmental science is needed to understand better the ontogeny of a species that inhabits diverse cultural ecologies and faces complex adaptive problems.

Building a comprehensive understanding of cultural transmission practices and acquisition strategies requires studying cultural contexts that differ in theoretically relevant ways. There is a pressing need for systematic, cross-cultural, and mixed-methodological research on this topic. The lack of infrastructure for conducting research across multiple field sites has previously posed a major impediment to understanding cultural variation. Collaborative networks of international field sites are needed to generate data from diverse populations, an undertaking that requires the expertise and cooperation of multiple international and interdisciplinary partners. Another obstacle to conducting research of this kind is gaining approval to work in diverse populations. Connections need to be established with diverse communities and relationships developed based on trust and respect, an issue that is even more critical when working with children. In each community being studied, a network of local research assistants and translators must be established and maintained, and special care must be taken to ensure that the methodologies and stimuli used in research are culturally salient and appropriate.

Summary

The unparalleled intellectual success of humans is widely attributed to our ability for cumulative cultural transmission, a process by which we take the discoveries, behaviors, and inventions of others and build upon them further to create increasingly complex reserves of socially heritable knowledge and technology (121). Evidence for culture in nonhuman species continues to grow, but there are few candidate examples of cumulative culture outside humans’ distinctively complex achievements. Human culture is uniquely variable in nature, as exemplified by the extraordinary diversity across technological skills and social practices within and among populations. Human psychological flexibility provides the foundation for cultural diversity and is a prerequisite for cumulative culture. It allows humans to build upon established behaviors by relinquishing old solutions and flexibly switching to more productive or efficient ones (128).

Cultural transmission practices and acquisition strategies support cumulative culture: our species-specific capacity to accumulate and build upon the cultural innovations of previous generations. A comprehensive account of teaching and imitation requires systematic study of cultural variation and continuity in childrearing practices (151). Variable socialization strategies support different culturally specific childrearing goals. Teaching practices reflect the values, educational institutions, and skill sets of diverse cultural and ecological contexts. Children around the globe use imitation flexibly to acquire the specific practices, beliefs, and values of their groups. Future cross-cultural research on teaching and imitation will enrich our understanding of cognitive and social development and will substantially increase our knowledge about the developmental origins of a psychological hallmark of our species: cumulative cultural transmission.

The psychological foundations of cultural complexity are multifaceted. Explanations that extend biology through culture require drawing together developmental, cross-cultural, and comparative research (18, 38). The vast majority of studies attempting to elucidate the evolutionary origins and ontogenetic processes of cultural learning focus on children raised in WEIRD societies (69, 151). Children living in technologically complex cultural environments provide excellent opportunities to study the early-developing capacity to adopt, capitalize upon, and build increasingly sophisticated and opaque technologies. These populations, however, do not reflect the childrearing environments that Homo sapiens and their close ancestors experienced through much of their cultural evolutionary history or the diverse ways in which children are raised across the world today. Cross-cultural research not only highlights the varied effects of the ontogenetic environment on behavior and cognition but also strengthens claims about universal and phylogenetically endowed mechanisms.

Children experience enculturation from infancy through their interaction with caregivers, artifacts, and cultural institutions. For this reason, it is also necessary to look to other species to understand better how evolution has shaped the human mind. Chimpanzees, arguably the second most cultural extant species (81), are an ideal comparative sample for studying the mechanisms and processes that may be unique to human culture or inherited from our shared ancestors. Studying a wide age range of chimpanzees, raised in different environments, will also increase our understanding of the evolutionary origins of when and why developmental processes shaped the hominin mind.

Humans engage in a wider variety of socially acquired and transmitted behaviors that vary more distinctly across communities than any other animal species. Data from cross-cultural, comparative, and developmental research are needed to increase our understanding of the evolution of cumulative cultural transmission. Examining the flexibility and variation in cultural transmission practices and acquisition strategies provides insight into the psychological foundations of cumulative cultural transmission, the cornerstone of human cultural diversity.

1. Herrmann E, Call J, Hernàndez-Lloreda MV, Hare B, Tomasello M (2007) Humans have evolved specialized skills of social cognition: The cultural intelligence hypothesis. Science 317:1360–1366.
2. van Schaik CP, Burkart JM (2011) Social learning and evolution: The cultural intelligence hypothesis. Philos Trans R Soc Lond B Biol Sci 366:1008–1016.
3. Whiten A, van Schaik CP (2007) The evolution of animal ‘cultures’ and social intelligence. Philos Trans R Soc Lond B Biol Sci 362:603–620.
4. Aplin LM, et al. (2015) Experimentally induced innovations lead to persistent culture via conformity in wild birds. Nature 518:538–541.
5. Fragaszy D, Visalberghi E (2004) Socially biased learning in monkeys. Learn Behav 32:24–35.
6. Leadbeater E (2015) What evolves in the evolution of social learning? J Zool 295:4–11.
7. Perry S, et al. (2003) Social conventions in wild white-faced capuchin monkeys: Evidence for traditions in a neotropical primate. Curr Anthropol 44:241–268.
8. Plotnik JM, Lair R, Suphachoksahakun W, de Waal FBM (2011) Elephants know when they need a helping trunk in a cooperative task. Proc Natl Acad Sci USA 108:5116–5121.
9. Rendell L, et al. (2010) Why copy others? Insights from the social learning strategies tournament. Science 328:208–213.
10. Cantor M, et al. (2015) Multilevel animal societies can emerge from cultural transmission. Nat Commun 6:8091.
11. Fragaszy DM, Perry S (2003) The Biology of Traditions: Models and Evidence (Cambridge Univ Press, New York).
12. Garland EC, et al. (2013) Humpback whale song on the Southern Ocean feeding grounds: Implications for cultural transmission. PLoS One 8:e79422.
13. Laland KN, Galef BG (2009) The Question of Animal Culture (Harvard Univ Press, Cambridge, MA).
14. van Leeuwen EJC, Cronin KA, Haun DBM (2014) A group-specific arbitrary tradition in chimpanzees (Pan troglodytes). Anim Cogn 17:1421–1425.
15. Dean LG, Vale GL, Laland KN, Flynn E, Kendal RL (2014) Human cumulative culture: A comparative perspective. Biol Rev Camb Philos Soc 89:284–301.
16. Johnson-Pynn J, Fragaszy DM, Cummins-Sebree S (2003) Common territories in comparative and developmental psychology: Quest for shared means and meaning in behavioral investigations. Int J Comp Psychol 16:1–27.
17. Laland KN, Hoppitt W (2003) Do animals have culture? Evol Anthropol 12:150–159.
18. Nielsen M, Haun D (2016) Why developmental psychology is incomplete without comparative and cross-cultural perspectives. Proc Biol Sci 371:20150071.
19. Odling-Smee FJ, Laland KN, Feldman MW (2003) Niche Construction: The Neglected Process in Evolution (Princeton Univ Press, Princeton).
20. Scott-Phillips TC, Laland KN, Shuker DM, Dickins TE, West SA (2014) The niche construction perspective: A critical appraisal. Evolution 68:1231–1243.
21. Whiten A, Hinde RA, Laland KN, Stringer CB (2011) Culture evolves. Philos Trans R Soc Lond B Biol Sci 366:938–948.
22. Boyd R, Richerson PJ, Henrich J (2011) The cultural niche: Why social learning is essential for human adaptation. Proc Natl Acad Sci USA 108:10918–10925.
23. Pagel MD (2012) Wired for Culture: Origins of the Human Social Mind (W.W. Norton, New York).
24. Pradhan GR, Tennie C, van Schaik CP (2012) Social organization and the evolution of cumulative technology in apes and hominins. J Hum Evol 63:180–190.
25. Kurzban R, Barrett HC (2012) Behavior. Origins of cumulative culture. Science 335:1056–1057.
26. Whiten A, Erdal D (2012) The human socio-cognitive niche and its evolutionary origins. Philos Trans R Soc Lond B Biol Sci 367:2119–2129.
27. Creanza N, Kolodny O, Feldman MW (2017) Cultural evolutionary theory: How culture evolves and why it matters. Proc Natl Acad Sci USA 114:7782–7789.
28. Chudek M, Henrich J (2011) Culture-gene coevolution, norm-psychology and the emergence of human prosociality. Trends Cogn Sci 15:218–226.
29. Legare CH, Nielsen M (2015) Imitation and innovation: The dual engines of cultural learning. Trends Cogn Sci 19:688–699.
30. Arbilly M, Laland KN (2017) The magnitude of innovation and its evolution in social animals. Proc Biol Sci 284:20162385.
31. Carr K, Kendal RL, Flynn EG (2016) Eureka!: What is innovation, how does it develop, and who does it? Child Dev 87:1505–1519.
32. Henrich J, McElreath R (2007) Dual inheritance theory: The evolution of human cultural capacities and cultural evolution. Oxford Handbook of Evolutionary Psychology, eds Dunbar R, Barrett L (Oxford Univ Press, Oxford, UK), pp 555–570.
33. Lotem A, Halpern JY, Edelman S, Kolodny O (2017) The evolution of cognitive mechanisms in response to cultural innovations. Proc Natl Acad Sci USA 114:7915–7922.
34. Smaldino PE, Richerson PJ (2013) Human cumulative cultural evolution as a form of distributed computation. Handbook of Human Computation, ed Michelucci P (Springer, New York), pp 979–992.
35. Muthukrishna M, Henrich J (2016) Innovation in the collective brain. Proc Biol Sci 371:20150192.
36. Kolodny O, Creanza N, Feldman MW (2015) Evolution in leaps: The punctuated accumulation and loss of cultural innovations. Proc Natl Acad Sci USA 112:E6762–E6769.
37. Konner M (2010) The Evolution of Childhood: Relationships, Emotion, Mind (Harvard Univ Press, Cambridge, MA).



38. Legare CH, Harris PL (2016) The ontogeny of cultural learning. Child Dev 87:633–642.
39. Heyes CM (1994) Social learning in animals: Categories and mechanisms. Biol Rev Camb Philos Soc 69:207–231.
40. Nagell K, Olguin RS, Tomasello M (1993) Processes of social learning in the tool use of chimpanzees (Pan troglodytes) and human children (Homo sapiens). J Comp Psychol 107:174–186.
41. Mesoudi A (2017) Pursuing Darwin’s curious parallel: Prospects for a science of cultural evolution. Proc Natl Acad Sci USA 114:7853–7860.
42. Tennie C, Call J, Tomasello M (2009) Ratcheting up the ratchet: On the evolution of cumulative culture. Philos Trans R Soc Lond B Biol Sci 364:2405–2415.
43. Tomasello M (2016) The ontogeny of cultural learning. Current Opinion in Psychology 8:1–4.
44. Stout D, Hecht EE (2017) Evolutionary neuroscience of cumulative culture. Proc Natl Acad Sci USA 114:7861–7868.
45. Dean LG, Kendal RL, Schapiro SJ, Thierry B, Laland KN (2012) Identification of the social and cognitive processes underlying human cumulative culture. Science 335:1114–1118.
46. Hewlett BS, Roulette CJ (2016) Teaching in hunter-gatherer infancy. R Soc Open Sci 3:150403.
47. Mesoudi A, Chang L, Murray K, Lu HJ (2015) Higher frequency of social learning in China than in the West shows cultural variation in the dynamics of cultural evolution. Proc Biol Sci 282:201442209.
48. Tomasello M, Kruger AC, Ratner HH (1993) Cultural learning. Behav Brain Sci 16:495–552.
49. Bjorklund DF, Ellis BJ (2014) Children, childhood, and development in evolutionary perspective. Dev Rev 34:225–264.
50. Diesendruck G, Markson L (2011) Children’s assumption of the conventionality of culture. Child Dev Perspect 5:189–195.
51. Hrdy SB (2009) Mothers and Others: The Evolutionary Origins of Mutual Understanding (Harvard Univ Press, Cambridge, MA).
52. Haun DBM, Over H (2014) Like me: A homophily-based account of human culture. Epistemological Dimensions of Evolutionary Psychology, ed Breyer T (Springer, New York), pp 117–130.
53. Muthukrishna M, Morgan TJH, Henrich J (2016) The when and who of social learning and conformist transmission. Evol Hum Behav 37:10–20.
54. Claidière N, Whiten A (2012) Integrating the study of conformity and culture in humans and nonhuman animals. Psychol Bull 138:126–145.
55. Corriveau KH, Fusaro M, Harris PL (2009) Going with the flow: Preschoolers prefer nondissenters as informants. Psychol Sci 20:372–377.
56. Herrmann PA, Legare CH, Harris PL, Whitehouse H (2013) Stick to the script: The effect of witnessing multiple actors on children’s imitation. Cognition 129:536–543.
57. Chudek M, Heller S, Birch S, Henrich J (2012) Prestige-biased cultural learning: Bystander’s differential attention to potential models influences children’s learning. Evol Hum Behav 33:46–56.
58. Henrich J (2009) The evolution of costly displays, cooperation, and religion: Credibility enhancing displays and their implications for cultural evolution. Evol Hum Behav 30:244–260.
59. Kenward B (2012) Over-imitating preschoolers believe unnecessary actions are normative and enforce their performance by a third party. J Exp Child Psychol 112:195–207.
60. Haun DB, van Leeuwen EJ, Edelson MG (2013) Majority influence in children and other animals. Dev Cogn Neurosci 3:61–71.
61. Rakoczy H, Schmidt MFH (2013) The early ontogeny of social norms. Child Dev Perspect 7:17–21.
75. Callaghan T, et al. (2011) Early social cognition in three cultural contexts. Monogr Soc Res Child Dev, 10.1111/j.1540-5834.2011.00603.x.
76. Gaskins S, Paradise R (2010) Learning Through Observation. The Anthropology of Learning and Childhood, eds Lancy DF, Bock J, Gaskins S (Alta Mira, Lanham, MD), pp 85–117.
77. Kline MA (2015) How to learn about teaching: An evolutionary framework for the study of teaching behavior in humans and other animals. Behav Brain Sci 38:e31.
78. Shneidman L, Gaskins S, Woodward A (2016) Child-directed teaching and social learning at 18 months of age: Evidence from Yucatec Mayan and US infants. Dev Sci 19:372–381.
79. LeVine JL (2007) Ethnographic studies of childhood: A historical overview. Am Anthropol 109:247–260.
80. Otto H, Keller H (2014) Different Faces of Attachment: Cultural Variations on a Universal Human Need (Cambridge Univ Press, Cambridge, UK).
81. Whiten A (2017) Culture extends the scope of evolutionary biology in the great apes. Proc Natl Acad Sci USA 114:7790–7797.
82. Kruger AC, Tomasello M (1996) Cultural learning and learning culture. The Handbook of Human Development and Education, eds Olson D, Torrance N (Blackwell, Oxford, UK), pp 369–387.
83. Csibra G, Gergely G (2009) Natural pedagogy. Trends Cogn Sci 13:148–153.
84. Marsh KL, Richardson MJ, Schmidt RC (2009) Social connection through joint action and interpersonal coordination. Top Cogn Sci 1:320–339.
85. Nielsen M (2012) Imitation, pretend play, and childhood: Essential elements in the evolution of human culture? J Comp Psychol 126:170–181.
86. Shneidman L, Woodward AL (2016) Are child-directed interactions the cradle of social learning? Psychol Bull 142:1–17.
87. Tomasello M, Carpenter M, Call J, Behne T, Moll H (2005) Understanding and sharing intentions: The origins of cultural cognition. Behav Brain Sci 28:675–691, discussion 691–735.
88. Mesoudi A, Chang L, Dall SRX, Thornton A (2016) The evolution of individual and cultural variation in social learning. Trends Ecol Evol 31:215–225.
89. Hewlett BS, Fouts HN, Boyette AH, Hewlett BL (2011) Social learning among Congo Basin hunter-gatherers. Philos Trans R Soc Lond B Biol Sci 366:1168–1178.
90. Broesch T, Rochat P, Olah K, Broesch J, Henrich J (2016) Similarities and differences in maternal responsiveness in three societies: Evidence from Fiji, Kenya, and the United States. Child Dev 87:700–711.
91. Kärtner J, Crafa D, Chaudhary N, Keller H (2016) Reactions to receiving a gift—Maternal scaffolding and cultural learning in Berlin and Delhi. Child Dev 87:712–722.
92. Little EE, Carver LJ, Legare CH (2016) Cultural variation in triadic infant-caregiver object exploration. Child Dev 87:1130–1145.
93. Keller H, et al. (2006) Cultural models, socialization goals, and parenting ethnotheories: A multicultural analysis. J Cross Cult Psychol 37:155–172.
94. Demps K, Zorondo-Rodríguez F, García C, Reyes-García V (2012) Social learning across the life cycle: Cultural knowledge acquisition for honey collection among the Jenu Kuruba, India. Evol Hum Behav 33:460–470.
95. Kline MA, Boyd R, Henrich J (2013) Teaching and the life history of cultural transmission in Fijian villages. Hum Nat 24:351–374.
96. Bonawitz E, et al. (2011) The double-edged sword of pedagogy: Instruction limits spontaneous exploration and discovery. Cognition 120:322–330.
97. Kline MA (2016) TEACH: An ethogram-based method to observe and record teaching behavior. Field Methods, 10.1177/1525822X16669282.
98. Clark EV, Bernicot J (2008) Repetition as ratification: How parents and children place information in common ground. J Child Lang 35:349–371.
99. Harkness S, Super C (2002) Culture and Parenting, ed Bornstein M, Handbook of parenting (Erlbaum, Mahwah), pp 253–280.
62. Schmidt MFH, Butler LP, Heinz J, Tomasello M (2016) Young children see a single 100. Keller H (2007) Cultures of infancy (Lawrence Erlbaum Associates, Mahwah, NJ).
action and infer a social norm: Promiscuous normativity in 3-year-olds. Psychol Sci 27: 101. Chavajay P, Rogoff B (2002) Schooling and traditional collaborative social organi-
1360–1370. zation of problem solving by Mayan mothers and children. Dev Psychol 38:55–66.
63. Gergely G, Csibra G (2006) Sylvia’s recipe: The role of imitation and pedagogy in the 102. Mejía-Arauz R, Rogoff B, Dexter A, Najafi B (2007) Cultural variation in children’s
transmission of human culture. Roots of Human Sociality: Culture, Cognition and social organization. Child Dev 78:1001–1014.
Human Interaction, eds Enfield NJ, Levenson SC (Berg Publishers, Oxford, UK), pp 103. Peck JG, Gregory RJ (2005) A brief overview of the Old New Hebrides.
229–255. Anthropologist 7:269–282.
64. Barrett HC, et al. (2016) Small-scale societies exhibit fundamental variation in the 104. Tobin J, Hsueh Y, Karasawa M (2009) Preschool in Three Cultures Revisited: China,
role of intentions in moral judgment. Proc Natl Acad Sci USA 113:4688–4693. Japan, and the United States (Univ of Chicago Press, Chicago).
65. Clegg JM, Legare CH (2016a) Instrumental and conventional interpretations of be- 105. Rogoff B, Paradise R, Arauz RM, Correa-Chávez M, Angelillo C (2003) Firsthand
havior are associated with distinct outcomes in early childhood. Child Dev 87: learning through intent participation. Annu Rev Psychol 54:175–203.
527–542. 106. Correa-Chávez M, Rogoff B (2009) Children’s attention to interactions directed to
66. House BR, et al. (2013) Ontogeny of prosocial behavior across diverse societies. Proc others: Guatemalan Mayan and European American patterns. Dev Psychol 45:
Natl Acad Sci USA 110:14586–14591. 630–641.
67. Enquist M, Strimling P, Eriksson K, Laland K, Sjostrand J (2010) One cultural parent 107. Kärtner J, Keller H, Yovsi RD (2010) Mother-infant interaction during the first
makes no culture. Anim Behav 79:1353–1362. 3 months: The emergence of culture-specific contingency patterns. Child Dev 81:
68. Lewis HM, Laland KN (2012) Transmission fidelity is the key to the build-up of cu- 540–554.
mulative culture. Philos Trans R Soc Lond B Biol Sci 367:2171–2180. 108. Silva KG, Correa-Chávez M, Rogoff B (2010) Mexican-heritage children’s attention
69. Henrich J, Heine SJ, Norenzayan A (2010) The weirdest people in the world? Behav and learning from interactions directed to others. Child Dev 81:898–912.
Brain Sci 33:61–83, discussion 83–135. 109. LeVine RA, LeVine S, Schnell-Anzola B, Rowe M, Dexter E (2012) Literacy and
70. Arnett JJ (2008) The neglected 95%: Why American psychology needs to become less Mothering: How Women’s Schooling Changes the Lives of the World’s Children
American. Am Psychol 63:602–614. (Oxford Univ Press, New York).
71. Barrett HC, et al. (2013) Early false-belief understanding in traditional non-Western 110. Keller H Kärtner (2013) Development—The culture-specific solution of universal
societies. Proc Biol Sci 280:20122654. developmental tasks. Advances in Culture and Psychology, eds Gelfand ML, Chiu C-Y,
72. Berl REW, Hewlett BS (2015) Cultural variation in the use of overimitation by the Aka Hong YY (Oxford Univ Press, New York), pp 63–116.
and Ngandu of the Congo Basin. PLoS One 10:e0120180. 111. Ochs E, Schieffelin BB (2001) Language acquisition and socialization: Three de-
73. Blake PR, et al. (2015) The ontogeny of fairness in seven societies. Nature 528: velopmental stories and their implications. Linguistic Anthropology: A Reader, ed
258–261. Duranti A (Cambridge Univ Press, Cambridge, UK), pp 296–328.
74. Broesch T, Bryant GA (February 27, 2017) Fathers’ infant-directed speech in a small- 112. Whiting BB, Whiting JW (1975) Children of Six Cultures: A Psycho-Cultural Analysis
scale society. Child Dev, 10.1111/cdev.12768. (Harvard Univ Press, Cambridge, MA).

7882 | www.pnas.org/cgi/doi/10.1073/pnas.1620743114 Legare


COLLOQUIUM
PAPER
113. Cole M (1990) Cognitive development and formal schooling: The evidence from 132. Watson-Jones RE, Legare CH (2016) The social functions of group rituals. Curr Dir
cross-cultural research. Vygotsky and Education, ed Moll LC (Cambridge Univ Press, Psychol Sci 25:42–46.
Cambridge, UK), pp 89–110. 133. Wen N, Herrmann P, Legare C (2016) Ritual increases children’s affiliation with in-
114. Gaskins S (2006) Cultural perspectives on infant-caregiver interaction. The Roots of group members. Evol Hum Behav 37:54–60.
Human Sociality: Culture, Cognition, and Human Interaction, eds Enfield NJ, 134. Chartrand TL, Lakin JL (2013) The antecedents and consequences of human behav-
Levinson SC (Berg, Oxford), pp 279–298. ioral mimicry. Annu Rev Psychol 64:285–308.
115. Gauvain M, Munroe RL (2009) Contributions of societal modernity to cognitive de- 135. Watson-Jones RE, Whitehouse H, Legare CH (2016) In-group ostracism increases
velopment: A comparison of four cultures. Child Dev 80:1628–1642. high-fidelity imitation in early childhood. Psychol Sci 27:34–42.
116. Lancy DF (2015) The Anthropology of Childhood: Cherubs, Chattel, Changelings 136. Over H, Carpenter M (2009) Priming third-party ostracism increases affiliative imi-
(Cambridge Univ Press, Cambridge, UK). tation in children. Dev Sci 12:F1–F8.
117. Heyes C (2009) Evolution, development and intentional control of imitation. Philos 137. Watson-Jones RE, Legare CH, Whitehouse H, Clegg JM (2014) Task-specific effects of
Trans R Soc Lond B Biol Sci 364:2293–2298. ostracism on imitative fidelity in early childhood. Evol Hum Behav 35:204–210.
118. Schillinger K, Mesoudi A, Lycett SJ (2015) The impact of imitative versus emulative 138. Buchsbaum D, Gopnik A, Griffiths TL, Shafto P (2011) Children’s imitation of causal
learning mechanisms on artifactual variation: Implications for the evolution of action sequences is influenced by statistical and pedagogical evidence. Cognition
material culture. Evol Hum Behav 36:446–455. 120:331–340.
119. Claidière N, Sperber D (2010) Imitation explains the propagation, not the stability of 139. Clegg JM, Legare CH (2016b) A cross-cultural comparison of children’s imitative
animal culture. Proc Biol Sci 277:651–659. flexibility. Dev Psychol 52:1435–1444.
120. Mathew S, Perreault C (2015) Behavioural variation in 172 small-scale societies in- 140. Scott JC, Henderson AME (2013) Language matters: Thirteen-month-olds un-
dicates that social learning is the main mode of human adaptation. Proc Biol Sci
derstand that the language a speaker uses constrains conventionality. Dev Psychol
282:20150061.
49:2102–2111.
121. Henrich J (2015) The Secret of Our Success: How Culture Is Driving Human
141. Nielsen M, Tomaselli K (2010) Overimitation in Kalahari Bushman children and the
Evolution, Domesticating Our Species, and Making Us Smarter (Princeton Univ
origins of human cultural cognition. Psychol Sci 21:729–736.
Press, Princeton).
142. Nielsen M, Mushin I, Tomaselli K, Whiten A (2014) Where culture takes hold:
122. Horner V, Whiten A (2005) Causal knowledge and imitation/emulation switching in
“Overimitation” and its flexible deployment in Western, Aboriginal, and Bushmen
chimpanzees (Pan troglodytes) and children (Homo sapiens). Anim Cogn 8:164–181.
children. Child Dev 85:2169–2184.
123. Lyons DE, Damrosch DH, Lin JK, Macris DM, Keil FC (2011) The scope and limits of
143. Nielsen M, Mushin I, Tomaselli K, Whiten A (2016) Imitation, collaboration and their
overimitation in the transmission of artefact culture. Philos Trans R Soc Lond B Biol
interaction among Western and Indigenous Australian preschool children. Child Dev
Sci 366:1158–1167.
87:795–806.
124. Legare CH, Souza AL (2012) Evaluating ritual efficacy: Evidence from the supernat-
144. Clegg JM, Wen NJ, Legare CH (2017) Is non-conformity WEIRD? Cultural variation in
ural. Cognition 124:1–15.
125. Legare CH, Watson-Jones RE (2015) The evolution snd ontogeny of ritual. The adults’ beliefs about children’s competency and conformity. J Exp Psychol Gen 146:
Handbook of Evolutionary Psychology, ed Buss DM (Wiley & Sons, Hoboken, NJ), 428–441.
pp 829–847. 145. Rakoczy H, Brosche N, Warneken F, Tomasello M (2009) Young children’s un-
126. Toelch U, Bruce M, Newson L, Richerson P, Reader S (2014) Individual consistency and derstanding of the context-relativity of normative rules in conventional games. Br J
flexibility in human social information use. Proc Biol Sci 281:20132864. Dev Psychol 27:445–456.
127. Legare CH, Wen NJ, Herrmann PA, Whitehouse H (2015) Imitative flexibility and the 146. Subiaul F, Schilder B (2014) Working memory constraints on imitation and emula-
development of cultural learning. Cognition 142:351–361. tion. J Exp Child Psychol 128:190–200.
128. Davis SJ, Vale GL, Schapiro SJ, Lambeth SP, Whiten A (2016) Foundations of cumu- 147. Köymen B, et al. (2014) Children’s norm enforcement in their interactions with

PSYCHOLOGICAL AND
COGNITIVE SCIENCES
lative culture in apes: Improved foraging efficiency through relinquishing and peers. Child Dev 85:1108–1122.
combining witnessed behaviours in chimpanzees (Pan troglodytes). Sci Rep 6:35953. 148. Clegg JM, Legare CH (2017) Parents scaffold flexible imitation during early child-
129. Call J, Carpenter M, Tomasello M (2005) Copying results and copying actions in the hood. J Exp Child Psychol 153:1–14.
process of social learning: Chimpanzees (Pan troglodytes) and human children 149. Hewlett BS, Lamb ME, Shannon D, Leyendecker B, Schölmerich A (1998) Culture and
(Homo sapiens). Anim Cogn 8:151–163. early infancy among central African foragers and farmers. Dev Psychol 34:653–661.
130. Keupp S, Behne T, Rakoczy H (2013) Why do children overimitate? Normativity is 150. Jensen LA (2012) Bridging universal and cultural perspectives: A vision for de-
crucial. J Exp Child Psychol 116:392–406. velopmental psychology in a global world. Child Dev Perspect 6:98–104.
131. Over H, Carpenter M (2012) Putting the social into social learning: Explaining both 151. Nielsen M, Haun D, Kärtner J, Legare CH (2017) The persistent sampling bias in
selectivity and fidelity in children’s copying behavior. J Comp Psychol 126:182–192. developmental psychology: A call to action. J Exp Child Psychol 162:31–38.

Legare PNAS | July 25, 2017 | vol. 114 | no. 30 | 7883


Young children communicate their ignorance and ask questions
Paul L. Harris^a,1, Deborah T. Bartz^a, and Meredith L. Rowe^a
^a Graduate School of Education, Harvard University, Cambridge, MA 02138

Edited by Andrew Whiten, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark April 8, 2017 (received for review January 12, 2017)

Children acquire information, especially about the culture in which they are being raised, by listening to other people. Recent evidence has shown that young children are selective learners who preferentially accept information, especially from informants who are likely to be representative of the surrounding culture. However, the extent to which children understand this process of information transmission and actively exploit it to fill gaps in their knowledge has not been systematically investigated. We review evidence that toddlers exhibit various expressive behaviors when faced with knowledge gaps. They look toward an available adult, convey ignorance via nonverbal gestures (flips/shrugs), and increasingly produce verbal acknowledgments of ignorance (“I don’t know”). They also produce comments and questions about what their interlocutors might know and adopt an interrogative stance toward them. Thus, in the second and third years, children actively seek information from interlocutors via nonverbal gestures or verbal questions and display a heightened tendency to encode and retain such sought-after information.

ignorance | questions | children | communication

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: P.L.H., D.T.B., and M.L.R. designed research; D.T.B. performed research; P.L.H. and D.T.B. analyzed data; and P.L.H. and M.L.R. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. A.W. is a guest editor invited by the Editorial Board.
^1 To whom correspondence should be addressed. Email: paul_harris@gse.harvard.edu.
7884–7891 | PNAS | July 25, 2017 | vol. 114 | no. 30 | www.pnas.org/cgi/doi/10.1073/pnas.1620745114

An influential body of research in developmental psychology has shown that infants are cognitively attuned to stable properties of the world: They possess core knowledge (1), a set of concepts enabling them to make sense of events and transformations in the physical, biological, and psychological domains. Moreover, building on that core knowledge, young children gradually construct a variety of deeper conceptual and causal insights within each of those domains (2–4). Alongside this portrayal of the child as a young scientist who steadily builds up a coherent and objective conception of the natural world, recent developmental research has paid increasing attention to the ways in which infants and young children can also be viewed as anthropologists. They are cultural learners, receptive to information from other people, including caregivers, adult members of their group, and peers, especially regarding the distinctive languages, beliefs, and practices of the culture that they live in (5–8).

Much of this recent research has emphasized that despite their receptivity to the information provided by other people, young children are selective about their informants. More specifically, they appraise potential informants along a variety of dimensions, including their familiarity (9, 10), their prior accuracy (9, 11, 12), apparent group membership (13, 14), and degree of consensus (15–17). Such selectivity is likely to facilitate children’s acquisition of those beliefs and norms that are representative of their culture.

In highlighting children’s appraisal of, and receptivity to, potential informants, research on early cultural learning has tended to ignore children’s self-appraisals and their concomitant information seeking. However, unlike other species, cultural learning by human children is often based on the testimony and teaching of others (i.e., on nonverbal or verbal messages deliberately aimed at informing naive or ignorant learners). Granted that distinctive mode of cultural learning, it is plausible that even from an early age, children communicate gaps in their knowledge and ask for pertinent information. In this view, young children are not just selective recipients of the information that is made available by the surrounding community; they also remedy their own ignorance by adopting an interrogative stance toward potential informants.

Below, we review recent findings on children’s appraisal of informant consensus, highlighting the research lacuna just mentioned. We then turn to research focusing on young children’s appraisal of themselves, especially their states of ignorance, as well as the emergence of the interrogative stance.

Sensitivity to Consensus and Uncertainty

Research on children’s appraisal of informant consensus has drawn on approaches to cultural learning grounded in evolutionary theorizing. Adopting this approach, Morgan et al. (17) asked how far children would be swayed by varying degrees of consensus among their informants when making numerical judgments. Children ranging from 3 to 7 y of age were asked to say which of two displays, each containing 10–30 dots, was numerically greater. Consistent with prior findings (18), children were better at choosing the numerically larger display the greater the difference in size between the two displays, as indexed by the dot ratio (i.e., the ratio of the difference between the displays relative to the size of the smaller display); the gradient of this improvement in accuracy for easier compared with more difficult trials was steep for older children but shallow for younger children.

Having made a decision about any given pair of displays, children were offered feedback by 10 informants who each either agreed or disagreed with their decision, with the number of informants who agreed versus disagreed varying (from 0 to 10) from one trial to the next. Thus, children might be confronted with unanimous agreement with their initial decision, unanimous disagreement, or any split between those two extremes. After this social feedback, children were invited to make a second decision about the two displays, thereby completing that particular trial. All age groups were prone to stick with their initial decision, and, surprisingly, they did so even on difficult trials. Moreover, Fig. 1 shows that the tendency to stick with an initial decision became stronger with age, irrespective of the ease or difficulty of the trial. Nevertheless, children’s overall tendency to stick to their initial decision was tempered by their sensitivity to social feedback, and the pattern of that sensitivity changed considerably with age. Fig. 2 indicates that 7-y-olds displayed a so-called “conformist” bias: They were disproportionately sensitive to the majority opinion among informants. In contrast, 6-y-olds displayed a linear or proportionate response: The greater the number of informants choosing a given option, the greater was the likelihood that 6-y-olds responded similarly. Finally, younger children were swayed by unanimity among informants but showed little sensitivity to social feedback that fell short of unanimity. For example, their final decisions were roughly the same whether two of the 10 informants judged like them and eight did not, or the reverse. In sum, although the exact nature of children’s reactions to disagreement among informants changed sharply with age, children were sensitive to social feedback at all ages. Moreover, older children displayed the type of conformist bias (i.e., a disproportionate sensitivity to nontotal majorities) that evolutionary theory has identified as a highly effective strategy for widespread cultural dissemination (19). Hence, children’s tendency to stick with their initial judgment cannot be attributed to any overall insensitivity to social feedback.

Fig. 1. Probability that children stick with their initial decision for the case of five versus five informants such that whether or not children stick is based on the initial information they gathered via observation of the displays (asocial information) and their sticking tendency. Children tend to stick with their initial decision across all trial ratios (i.e., irrespective of trial difficulty), although the tendency to stick is slightly lower on the more difficult (low dot ratio) trials. The tendency to stick sharply increases with age; the oldest children (7-y-olds) display a >80% chance of sticking (color-coding of the ages of 3, 4, 5, 6, and 7 y is provided in Fig. 2). Reprinted with permission from ref. 17.

Fig. 2. Probability that children stick with a given decision (e.g., the right-hand side display of dots) for a trial of intermediate difficulty. The 7-y-olds show a conformist bias by responding disproportionately to majorities that fall short of unanimity. The 6-y-olds display a proportionate response to the number of informants endorsing their decision. Younger children, especially 3- and 4-y-olds, are only affected by informant feedback when there is complete unanimity; they are prone to ignore informant feedback when there is disagreement among informants. Reprinted with permission from ref. 17.

An alternative explanation of children’s tendency to stick, and to stick even on difficult trials, is that they lack an ability to monitor their own knowledge states. They treat what effectively amounts to a random judgment on difficult trials and a well-founded judgment on easy trials as more or less equivalent. In this view, young children ultimately have little ability to differentiate cognitive states that, in principle, ought to be quite distinct: notably, states of ignorance, in which only a guess can be made, and states of knowledge, in which a judgment can be made with a high probability of its being correct. If this hypothesis is correct, it implies that children are poor at weighing social feedback against their own asocial information. Having little awareness of the epistemic standing of their own asocial information, they do not appropriately calibrate their deference to social information.

However, even if young children are insensitive to the certainty versus uncertainty of their numerical judgments, it is unlikely that they are insensitive to the standing of their cognitive states across all domains of knowledge. Indeed, as elaborated below, recent evidence suggests that even 2-y-olds have some metacognitive awareness, especially in the context of ongoing dialogue with an interlocutor. In the next section, we argue that very young children display five interlinked abilities: (i) an understanding of the nature of communication, especially its power to convey information from an informant to a recipient; (ii) an ability to signal their ignorance to an interlocutor; (iii) an ability to talk cogently about knowledge and ignorance; (iv) an ability to communicate their desire for information via gestures and questions; and (v) an ability to monitor the extent to which the information requested of an informant does or does not remedy their ignorance.

An Early Understanding of Communication

In the course of the second year, when the ability of human infants to communicate with words remains limited, they nonetheless display a basic understanding of the way that communication works. They understand that requests and assertions can be communicated from one person to another so that the recipient is likely to end up with information that he or she can act upon, and may indeed favor relative to prior information based on first-hand observation (20). Thus, infants show some understanding of the way that communication can guide the actions and update the knowledge base of a recipient. Moreover, they display an understanding of the impact of communication, not simply when they seek information from a potential informant or when they supply information to a recipient but also when, as a third party, they witness or eavesdrop on an exchange between two other people. In such contexts, infants appear to encode the message supplied by the informant and to work out its likely impact on the actions and knowledge of the recipient.

The following body of experimental work illustrates these basic points. Krehm et al. (21) had infants aged 9 and 11 mo watch while an informant expressed her preference for one of two objects by reaching for and manipulating her preferred
object. A recipient then appeared who expressed no specific each of two boxes in succession and asking: “Is it here?” A second
preference for either object insofar as she handled both. In the adult answered with a nod to one query and a shake of the head to
subsequent test phase, the informant reappeared and pointed to the other. When infants were then prompted to find the object,
her preferred object as the recipient watched. Infants expressed they typically selected the correct box. Effectively, infants were
more surprise (by looking longer) when the recipient handed the able not only to note the difference between the two head gestures
informant her nonpreferred object as opposed to the one that of the second adult but to tie each gesture to a query about a
she had pointed at. By implication, infants expected the recipient particular location indicated by the first adult.
to understand which object the informant wanted, given her Taken together, these findings imply that infants aged 12–
pointing gesture, and to respond accordingly. A control condi- 18 mo possess a relatively abstract comprehension of the nature
tion consolidated this interpretation. If the informant gestured of communication. They realize that certain signals, such as
with an open fist rather than a point, or if the recipient closed pointing gestures, lexicalized speech, and head gestures, can
her eyes rather than watched the informant’s gesture, the se- provide information about what an informant wants or knows.
lective pattern of looking disappeared. This selective pattern was They expect the recipient of those communicative signals to
displayed by both 9- and 11-mo-olds, and indeed irrespective of construe and respond to them appropriately, by compliance if a
whether infants had started to point themselves. request has been made and by acting in relation to new in-
Martin et al. (22) obtained similar findings when the informant formation about the state of the world if it has been supplied. By
signaled that she wanted a preferred object, not via a pointing implication, human infants have a basic understanding of the way
gesture but by saying “koba.” This lexical item was unfamiliar to that communication conveys information between two interloc-
the 12-mo-olds being tested. Nevertheless, they tended to construe utors. Moreover, they do so at an age when their own production
it as a request by the informant for her preferred object, again as of spoken language remains limited. Accordingly, when infants
indexed by the pattern of looking that they displayed when the proceed to exploit the rich communicative power of language,
recipient did or did not comply in terms of the particular object they are likely to situate that power within a broad understanding
that she handed to the informant. Infants expressed more surprise of the way that human communication operates, particularly
(looked longer) if the recipient handed the informant the non- their realization that communication can function to provide an
preferred object. Control conditions indicated that infants’ con- interlocutor with information.
strual of the informant’s signal as a request was restricted to Granted that young infants understand how communication
speech-like utterances. If the informant coughed rather than operates, we may ask whether they build on that understanding
spoke, or produced a vocalization (“Oooh!”) rather than a lexical by actively eliciting information rather than remaining passive
item, the pattern of selective looking disappeared. observers or recipients of information that is on offer. To opti-
Thus, at the very beginning of the second year, infants are not mize the elicitation of information, it would be helpful for infants
inattentive or uncomprehending bystanders with respect to ongoing to possess four interrelated abilities: the ability to signal their
patterns of communication. They grasp that particular signals can own ignorance, to talk about knowledge and ignorance, to pro-
be interpreted as requests for a particular object. Moreover, their duce interrogative acts of communication, and to gauge the ad-
construal of dialogic communication is such that they expect the equacy of the replies received. In the following sections, evidence
recipient to interpret the requests appropriately and to act ac- will be reviewed showing that human children display each of
cordingly. This construal of dyadic communication is appropriately these abilities in the course of the second and third years, es-
confined to certain types of signals, notably a pointing gesture or the pecially in the context of an ongoing dialogue with an adult.
production of a lexical item (including one that is novel) rather than
a hand movement, a cough, or a vocalization.
Song et al. (23) asked if older infants, aged 18 mo, would understand not just a request for an object, as conveyed by a pointing gesture or lexical item, but an assertion, and notably an assertion that could, in principle, update the recipient’s knowledge base. Infants watched as an adult repeatedly placed a ball in a box, withdrew it, replaced it, and eventually left the room. A second adult who had witnessed the actions of the first adult then moved the ball to a cup and covered it with a lid. The first adult returned to retrieve the ball, but before her making any attempt to retrieve it, she was provided with information about its new location by the second adult: “The ball is in the cup.” Alternatively, the second adult made an uninformative remark that did not indicate the ball’s new location: “I like the cup.” In the informative condition, infants expressed surprise (looked longer) when the returning adult searched in the now empty box rather than in the cup where she had just been told that the ball was located. In the uninformative condition, by contrast, infants were more surprised if the returning adult appeared to know that the object was in the cup, as indexed by her searching there rather than in the box where she had left it. Moreover, in line with the findings for requests discussed above, a pointing response by the second adult was also construed by 18-mo-olds as an informative assertion that was likely to guide the search behavior of the returning adult.
Eighteen-month-olds also display some facility in decoding the information conveyed by head gestures as well as hand gestures. As in the studies described above, Fusaro and Harris (24) arranged for infants to witness a minidialogue between two adults and then probed their construal of that dialogue. One adult sought information about the location of a hidden object by pointing to

Signaling Ignorance
Nonhuman animals appear to possess at least some metacognitive capacity. They are capable of monitoring their own uncertainty in that they deliberately withhold a response when faced with a difficult discrimination between different choices (25). There is also evidence that chimpanzees and young children (aged 27–32 mo) appropriately seek out additional visual evidence in the context of uncertainty about the location of a hidden object. For example, if they have had the opportunity to observe in which of several tubes a desirable object has been hidden, both species search promptly in that particular tube. However, if they have not seen the hiding and do not know in which particular tube the object was hidden, they are likely to bend their head or body to look inside the available tubes before searching in the one where the hidden object can be seen; alternatively, they opt for a smaller reward in a known location (26, 27). By implication, both chimpanzees and children realize when they do not know, or have not seen, an object’s location and act accordingly. They proceed to gather more location information before searching accurately on the basis of that newly gathered information, or they opt for a less desirable object in a known location.
In these cases, neither the chimpanzee nor the child communicates ignorance to another individual. Rather, in the context of ignorance, they engage in visual exploration or opt out of searching. However, recent evidence indicates that human infants are capable of signaling their ignorance. Goupil et al. (28) trained 20-mo-old infants to ask their caregiver for guidance if they were uncertain of a hidden object’s location. More specifically, infants watched as a toy was hidden in one of two opaque containers. On so-called “possible” trials, the infant observed the

7886 | www.pnas.org/cgi/doi/10.1073/pnas.1620745114 Harris et al.


hiding of the object. By contrast, on so-called “impossible” trials, the hiding was carried out behind a curtain so that infants could not tell which container the object was hidden inside. In either case, the two containers were subsequently occluded for a delay ranging from 3 to 12 s and then uncovered once more. Infants were taught to indicate which container they had remembered the object to be in by pointing to its location. The container indicated was then moved forward so that the infant could either recover the toy (if correct) or find the container to be empty (if incorrect). Note that depending on the nature of the hiding, and on the length of time that the two containers were subsequently occluded, trials varied in terms of the likelihood that infants could know and remember where the object was hidden. Thus, on possible trials, especially when the delay was short, remembering the object’s location was relatively easy. However, on possible trials when the delay was longer, remembering was more difficult. Finally, on impossible trials, remembering was precluded because the initial hiding of the object had not been witnessed.
Half the infants were taught in a prior training session to ask their caregiver for help when needed. In this training session, their pointing responses on impossible trials were ignored. Instead, caregivers waited until infants turned to look them in the eyes and then provided help by pushing the correct container forward and saying: “Here it is, look.” Thus, infants were effectively taught that, when uncertain of the object’s location, they should turn to look at their caregiver, who would then help them to identify the correct container.
Several results showed that infants in the trained group produced this help-seeking signal in an appropriate fashion (i.e., when they were unsure of the object’s location). First, compared with infants in the untrained group, infants in the trained group proved to be more accurate on those occasions when they did point. This greater accuracy arose because, when they were uncertain, instead of pointing with a considerable risk of error, they were likely to seek help by looking toward their caregiver. In addition, the trained infants who asked for help were more likely to do so when the experimental setup created uncertainty. Thus, they were more likely to ask for help on impossible compared with possible trials, and, within the set of possible trials, they were more likely to ask for help if the containers had been occluded for a longer delay. Taken together, these findings provide strong evidence that infants aged 20 mo are able to monitor their ignorance or uncertainty and can learn to signal that uncertainty by gazing at a potential informant, notably a caregiver, in such circumstances.
Despite the impressive and systematic nature of such uncertainty monitoring and help seeking, the findings also point to the critical role of training. More specifically, control infants who received no initial training did sometimes look at their caregiver. However, such responses were no more frequent with greater delay lengths and were no more frequent for impossible compared with possible trials. Thus, even if these gaze responses were aimed at prompting help from the caregiver, there was no evidence that they signaled uncertainty because their production was not positively correlated with the experimental conditions producing uncertainty. By implication, although 20-mo-olds do spontaneously look toward a caregiver, and indeed may do so with the expectation that helpful information will be supplied, training might be needed if such looks are to be produced in a strategic fashion to signal uncertainty. More generally, this study provides persuasive evidence that infants have some awareness of their own uncertainty or ignorance, echoing findings with nonhuman primates, but it provides no evidence that they are prone to signal ignorance or uncertainty in a spontaneous fashion even if it shows that they can be trained to do so.
When do young children begin to signal their ignorance spontaneously? Limited observational evidence suggests that in the course of the second year, human toddlers will sometimes spontaneously express their ignorance via a distinctive nonverbal flip (or shrug) gesture. Acredolo and Goodwyn (29) report a case study of a child whose communicative gestures were studied from the age of 12 to 17 mo. Starting at the age of 15 mo, the child produced a gesture that appeared to signal ignorance. She shrugged her shoulders and flipped the palms of her hands upward and out to the side. However, because this case study involved a single child, it is unclear whether this gesture is widely used to signal ignorance or is produced in only a small minority of families and by a small minority of children. It is also unclear whether the child observed by these researchers was especially precocious in her communication skills or representative of the communication patterns displayed by typically developing toddlers.
To examine these issues, Bartz (30) analyzed data from 64 children included in the Language Development Project (31), a longitudinal study of early language development in which the families of the children constituted a representative sample of the US population in terms of education and socioeconomic status. The project researchers recorded 90-min sessions of children’s everyday interactions with caregivers in their homes every 4 mo from the age of 14 mo onward. The recordings from eight successive visits (at 14, 18, 22, 26, 30, 34, 38, and 42 mo of age) were analyzed with the goal of identifying the age of emergence and prevalence of flip gestures. Fig. 3 shows the cumulative number of children who had produced at least one of various types of flip gesture across this 28-mo period. Fig. 3 also shows the cumulative number of children whom coders judged to be expressing ignorance via their flip gesture. Finally, Fig. 3 shows the cumulative number of children who produced the explicit verbal utterance, “I don’t know.” Inspection of Fig. 3 shows that, consistent with the earlier case study of a single child, the flip gesture expressing ignorance emerged among some children in the course of the second year. At 22 mo, one-fifth of the sample had been observed producing a flip to signal ignorance, and by 42 mo, almost half had done so. Verbal statements of ignorance emerged somewhat later but rose sharply in frequency across the same period, eventually becoming more widespread.

[Fig. 3 plot: y axis, cumulative number of children (0–70); x axis, age point (14, 18, 22, 26, 30, 34, 38, and 42 mo); series: children who ever produced flips, children who ever produced I DON’T KNOW flips, and children who ever said “I don’t know.”]
Fig. 3. Cumulative number of children who ever produced a flip, produced an I DON’T KNOW flip, or said, “I don’t know” at each age point (14–42 mo).

These findings build on the findings of Goupil et al. (28) by showing that deliberate teaching and reinforcement are not required for the production of gestures signaling ignorance. Many children produce such a signal in the course of everyday interaction outside the laboratory. The results also raise the possibility that such signals are produced not just in the context of goal-directed behavior, such as in the search for a hidden object, but in the context of an ongoing dialogue in which an adult poses a question that the child is unable to answer. By implication, it would be wrong to assume that signals of ignorance arise only in problem-solving contexts where children face a practical dilemma or obstacle and turn to an adult for help in resolving it.

The data suggest that expressions of ignorance also occur in the context of conversation.
In an experimental study, this conclusion was examined more systematically (30). Children (aged 16–37 mo) were asked a series of questions by an adult, only some of which they could easily answer based on their existing knowledge. More specifically, they were shown a mix of pictures and asked the name for each of the entities depicted. Some pictures depicted familiar, easy-to-name entities (e.g., book, bird), whereas others depicted unfamiliar, hard-to-name entities (e.g., unusual hardware item). The pattern of responding was different for the unfamiliar entities compared with the familiar entities. Not only did children make more naming errors and produce more filled speech pauses (e.g., “umm”), but they were also more likely to look toward an adult (either the experimenter or their mother) to ask for information (e.g., “What’s that?”) or to say “I don’t know.” This differential pattern of responding was apparent among younger infants (16–27 mo) but was more systematic among older infants (28–37 mo), especially with respect to filled speech pauses and requests for information.
Taken together, these studies show that toddlers communicate their uncertainty in various ways. They communicate by looking toward an adult and by producing a filled speech pause, a flip gesture, an explicit affirmation of ignorance, or a question to an interlocutor. Admittedly, when they look at an adult or produce a filled pause or a flip, such responses might reflect behavioral uncertainty rather than metacognitive awareness of ignorance. However, such a parsimonious interpretation seems less appropriate when toddlers begin to affirm their ignorance verbally. Note also that there was only a modest developmental lag between the production of flips and the emergence of verbal affirmations of ignorance. In the next section, such affirmations are scrutinized in more detail.

Talking About Knowledge and Ignorance
The scope of children’s metacognitive awareness can be illuminated by analyzing their production of the cognitive verb “know” in the context of everyday conversations with caregivers (32).
Arguably, young children are aware only of gaps in their knowledge. They might have little or no awareness of when their information retrieval processes operate smoothly. For example, they might register occasions when they cannot readily respond to questions about an object’s name or location but ignore or fail to register occasions when they can successfully answer. In this view, children would be likely to deny that they have knowledge (“I don’t know. . .”) but unlikely to affirm that they do have knowledge (“I know. . .”). A further question concerns children’s insight into the cognitive states of other people. Do they talk about the ignorance or knowledge of other people in the same way as they talk about their own, or is there any asymmetry between talk about the self and talk about others?
To answer these questions, the spontaneous utterances of three children were analyzed. Two children were English-speaking (Adam, a middle-class, African-American child and Sarah, a white, working-class child), whose early language had been recorded by Brown (33) and his colleagues at regular intervals. The utterances of each child could be retrieved via the child language data exchange system, CHILDES (34). All utterances produced by the two children that included the mental verb know from the age of 27 mo (the age at which recordings had begun) to the age of 36 mo were analyzed. The third child, Qianqian (芊芊), was a Mandarin-speaking child whose utterances had been recorded and transcribed from the age of 16 to 39 mo by her mother, a psycholinguist. Qianqian’s production of the verb zhi1dao4 was analyzed. Similar to the phrase “know that” in English, zhi1dao4 is an epistemic verb that is used in the context of factual knowledge. [Note that, in contrast to English, Mandarin uses a different verb (i.e., hui4) for the phrase “know how” (as in “know how to dance”) (35)]. Did the three children simply use the word know by echoing its production in an immediately prior utterance by their interlocutor or did they introduce the word know into the conversation in an autonomous fashion? The same pattern emerged for all three children: The large majority of children’s references to the word know were autonomous, rather than echoes of their interlocutor’s prior utterance. Next, utterances were analyzed to determine whether children referred only to their own cognitive states or also made references to the cognitive states of their interlocutor or to the cognitive states of a third party. The majority of references were indeed to children’s own cognitive states. Nevertheless, children also referred quite often to the cognitive states of their interlocutor. By contrast, references to a third party, someone not participating in the conversation, were rare.
Granted that children talked about their own cognitive states as well as the cognitive states of their interlocutor, an analysis was conducted to assess whether the pragmatic function of the utterances was similar or different for these two persons. More specifically, the proportions of affirmations (“I know. . .” or “You know. . .”), denials (“I don’t know. . .” or “You don’t know. . .”), and questions (“Do I know. . .?” or “Do you know. . .?”) that involved a reference to the self compared with the interlocutor were compared. These proportions varied across the three pragmatic functions. In the case of affirmations, children produced them with respect to both the self and their interlocutor. Denials and questions, by contrast, exhibited a strongly asymmetrical pattern. Children often denied their own knowledge (“I don’t know. . .”) but very rarely denied the knowledge of their interlocutor (“You don’t know. . .”). Conversely, children often asked questions about their interlocutor’s knowledge (“Do you know. . .?”) but never asked questions about their own knowledge (“Do I know. . .?”). This asymmetry in the pattern of production for denials compared with questions was marked, but it was based on only three children. To establish its existence firmly, the utterances of a further eight English-speaking children drawn from the CHILDES database were also analyzed (36). An identical pattern emerged for all eight children: Denials were almost invariably produced with respect to the self rather than the interlocutor, whereas questions were invariably produced with respect to the interlocutor rather than the self.
Returning to the two questions guiding this study, the data show that 2-y-olds do not simply talk about their ignorance. They also affirm that they possess particular items of knowledge. In addition, the pattern of talk about the self is different from the pattern of talk about the interlocutor: Denials of knowledge are frequent for the self (“I don’t know”) but not for the interlocutor, and questions about knowledge are frequent for the interlocutor (“Do you know?”) but not for the self.
The exact explanation for this asymmetry warrants further investigation (36), but its existence points to the following possibility for early communication between young children and their interlocutors. On the one hand, children monitor their own cognitive states: They are aware of knowing some items of information and affirm possessing that knowledge, and they are also aware of lacking other items of information and deny having that knowledge. Their monitoring of other people’s knowledge is more circumspect. They sometimes affirm, but almost never deny, that an interlocutor knows something. Rather, they ask an interlocutor about what he or she knows. Given children’s awareness of what they do not know (as indexed by their explicit denials) combined with their receptivity to the possibility that an interlocutor might know (as indexed by their questions), and given also their understanding of the way that communication can pass knowledge between an informant and a recipient, it is feasible for them to turn to other people for information when they do not know something. In particular, it would make sense for them to ask information-seeking questions. In the next section, we review the onset of such questions.

The Onset of Information-Seeking Questions
A long tradition of developmental research has investigated the emergence of joint attention in infancy: the capacity to turn the head and eyes toward a target that is pointed out by a caregiver and the reciprocal capacity to call a caregiver’s attention to objects via pointing. The emergence of pointing follows a stable developmental timetable. At around 8 mo, whole-handed pointing starts to emerge, and at around 11 mo, index finger pointing starts to emerge across markedly different cultural settings (37). Despite this stable timetable, more hand gestures, including pointing gestures, were observed between caregivers and their infants in China than in Holland, and more were observed in Holland than in the Yucatan (38). In all three settings, however, pointing was a dyadic or reciprocal mode of engagement. Thus, when a caregiver pointed, the infant often reciprocated with a point to the same target (within 10 s), and vice versa.
The standard functional interpretation of infant pointing has been that it serves either to request an out-of-reach object or to establish joint attention to an object of interest (39). More recently, however, it has been proposed that pointing can serve an interrogative function (40). Thus, an infant point can imply not just “I want this” or “Look at this” but also “What is this?” with the expectation that the interlocutor will respond with pertinent information. Begus and Southgate (41) report evidence supporting this emphasis on the interrogative function of pointing. Sixteen-month-olds proved to be more likely to point to an object if an available informant appeared to be knowledgeable. A female informant sat facing the infant, and a series of novel objects was presented behind her but in view of the infant. Infants often pointed out these novel objects to her. They appeared to be signaling that they wanted her to provide information about the objects because they pointed out novel objects less often if she had proven to be an unreliable informant. Thus, they were less likely to point to the novel objects if she had previously named familiar objects incorrectly and now appeared unsure of the names of the novel objects. A follow-up study suggested that the experimenter’s prior naming errors were especially important in reducing infants’ interrogative points. If the experimenter simply called attention to the objects (e.g., “Wow, look at this!”) and then appeared unsure how to name the novel objects, infants still pointed them out.
When infants receive information in the wake of an interrogative point, how well do they process that information, and do they process it more effectively than unsolicited information? To examine these questions, Begus et al. (42) presented 16-mo-old infants with two objects at once. When infants pointed to one of the two objects, the experimenter modeled an action either on the indicated object or on the alternative object. After a 10-min delay, the demonstration object was presented again and infants were given an opportunity to imitate the action they had seen demonstrated. Infants reproduced the actions demonstrated on the objects they had pointed at significantly more than those actions demonstrated on the nonchosen objects. Moreover, this difference emerged even though infants had been equally attentive visually during the demonstrations on both types of objects. A follow-up experiment showed that this difference in copying was due to learning being facilitated when infants’ pointing was responded to rather than hindered when their pointing was ignored. Thus, even when infants’ pointing was not ignored and a single object was presented, copying was still inferior to when two objects were presented and the experimenter consistently offered a demonstration on the one that infants pointed to. By implication, infants’ pointing at a given object had been aimed at eliciting information about it and that information was better encoded than information they had not aimed to elicit.
Toddlers’ early word learning provides more evidence for the information-seeking role of pointing. Lucca and Wilbourn (43) presented 18-mo-olds with pairs of unfamiliar objects, and when the infants targeted one of them via selective pointing, reaching, or looking, the experimenter provided a novel name for the targeted object. Infants subsequently showed greater learning of names for the objects they had targeted via pointing compared with reaching or looking. By implication, infants were especially receptive to learning a novel name if it was supplied in the wake of their interrogative point toward it.
This early emerging disposition toward interrogative communication is not just a subtle behavior whose diagnosis requires the tools of the laboratory. Chouinard (44) asked parents of infants aged 12–17 mo and 18–23 mo to keep a diary in which they noted instances in which they judged their toddler to be asking a question. Despite the limited verbal ability of the infants, especially in the younger group, parents had little difficulty in identifying instances of questions. For example, one younger child was beside her mother when she was unpacking groceries. The child noticed a kiwi fruit, a fruit that was novel to her; picked it up; and, as she showed it to her mother with a puzzled expression, produced a vocalization “Uh?” The mother interpreted her communication as a question about the name or identity of the fruit, roughly: “What’s this?” Further analysis showed that such “questions” rose in frequency in the course of the second year.
Overall, the available evidence indicates that toddlers use several means in the course of the second year to prompt an interlocutor to supply them with information. They use pointing, showing, and vocalization, either separately or in combination, in advance of any capacity to fully formulate a question in words.

Monitoring an Informant’s Replies
We have argued that infants understand how communication can provide information and ask questions when they do not know something. Indeed, from the age of ∼18 mo onward, children ask an increasing proportion of questions aimed at gathering information as opposed to questions that make other types of requests (e.g., for permission, for clarification) (44). When asking such information-seeking questions, do they monitor the replies that they receive? In particular, do they differentiate between a satisfactory answer, one that dispels their ignorance, and an unsatisfactory answer that does not? To examine this issue, Chouinard (44) looked at what children said in reply to an informative answer versus an uninformative answer. When adults failed to supply the information they sought, children were likely to persist with their questions.
Extending this analysis, Frazier et al. (45) focused on the “why” or “how” questions (i.e., the explanation-seeking questions) of six English-speaking children whose language had been recorded regularly from 2 to 5 y of age. Children reacted differently depending on whether they received a satisfactory explanation or not. Following a satisfactory explanation, they were likely to acknowledge their agreement or to ask a new, follow-up question on the same topic. By contrast, when they were not given a satisfactory explanation, they were likely to persist with their initial question or to offer an explanation of their own. A follow-up study confirmed that explanatory information is also better remembered. Thus, when preschoolers received an explanation for a puzzling illustration, they were more likely to remember that information than a nonexplanation. Indeed, children often misremembered nonexplanations, converting them into explanations via appropriate elaboration (46).

Conclusions and Implications
In the course of the second year, children begin to communicate their doubt or ignorance in various ways: through nonverbal gestures, explicit statements of their ignorance (“I don’t know”), as well as information-seeking questions. Nonhuman primates also indicate their uncertainty: They act differently depending on whether they are sure or unsure of what to do next. In particular, they suppress responding in a situation where a mistaken response would impose costs. However, despite important parallels, notably the implication that all primates are able to monitor their own level of certainty or doubt, the two bodies of evidence

also diverge. Toddlers readily express doubt or ignorance in the context of communication; their flips, “I don’t know” utterances, and information-seeking questions are directed at an interlocutor. Conceivably, some of these signals could be produced when children are alone and faced with uncertainty. For example, having searched unsuccessfully, a toddler might produce a flip gesture, signaling uncertainty about the object’s location and/or where to search next. However, pending more evidence of children’s expressive gestures when they are alone, it is plausible that the majority of such metacognitive signals are produced in a communicative context, not as an adjunct to ongoing solo behavior. More generally, these signals appear to serve two interwoven functions. First, they provide an answer to an interlocutor’s question: an admission of ignorance that amounts to a well-formed turn in an ongoing conversation. Second, when formulated as questions, they convey not just ignorance but also a request that the interlocutor offer help by supplying missing information. That children aim to elicit missing information with their questions is underlined by their differential reactions to informative vs. uninformative replies.
Young children’s facility in communicating their ignorance and in asking questions appears to build on their foundational insight into the way that testimony works (20). As described earlier, infants aged 12–18 mo understand that someone who lacks information (e.g., regarding the location of an object) can be provided with that information via the gestures or vocalization of an informant. The present review indicates that toddlers and young children go beyond that basic insight. They produce avowals of ignorance and adopt an interrogative stance. Moreover, the interrogative stance appears to involve not simply the seeking of information from others via pointing and/or questions but an accompanying state of informational receptivity (i.e., a motivational readiness to encode and retain the information thereby elicited). Thus, information about object names or object functions is better retained if it is received in response to an interrogative gesture rather than supplied in an unsolicited fashion (42, 43), and explanatory information is better retained than nonexplanatory information (46).
Research on the child’s theory of mind has highlighted the important role of language and conversation in promoting children’s insight into the way that the mind works (47). The primary focus of that research has been on children’s developing insight into false beliefs. The present findings point to a more basic insight that conversation is likely to promote. In several of the studies reported above, children signaled their ignorance in the context of an ongoing conversation with an adult. A speculative but plausible implication is that involvement in conversation can serve as a constant tutorial for children with respect to the range and depth of their ignorance. To the extent that children are prone to engage in conversation with better informed interlocutors, they are likely to discover that their existing knowledge is limited and fragmentary, albeit open to expansion if they pose appropriate questions. Granted that children vary in the quantity of speech that they are exposed to by their caregivers (48), in the extent to which that speech is directive rather than discursive (49), is tightly focused on the immediate situation or includes an exploration of situations and events displaced from the here and now (50), and includes satisfactory answers to children’s causal questions (51), we can anticipate that children will grow up with markedly different assessments of the scope of human knowledge, the magnitude of their own comparative ignorance, and the potential role of question asking in mitigating that ignorance.

ACKNOWLEDGMENTS. Collection and coding of the data displayed in Fig. 3 were supported by the Eunice Kennedy Shriver National Institute of Child Health and Human Development of the NIH under Award P01HD040605 (Principal Investigator, Susan Goldin-Meadow). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

1. Spelke ES, Kinzler KD (2007) Core knowledge. Dev Sci 10:89–96.
2. Carey S (2009) The Origin of Concepts (Oxford Univ Press, New York).
3. Gopnik A, Wellman HM (2012) Reconstructing constructivism: Causal models, Bayesian learning mechanisms, and the theory theory. Psychol Bull 138:1085–1108.
4. Wellman HM (2014) Making Minds (Oxford Univ Press, New York).
5. Harris PL, Koenig MA (2006) Trust in testimony: How children learn about science and religion. Child Dev 77:505–524.
6. Harris PL (2012) Trusting What You’re Told: How Children Learn from Others (Harvard Univ Press, Cambridge, MA).
7. Legare CH, Harris PL (2016) The ontogeny of cultural learning. Child Dev 87:633–642.
8. Harris PL, Corriveau KH (2011) Young children’s selective trust in informants. Philos Trans R Soc Lond B Biol Sci 366:1179–1187.
9. Corriveau K, Harris PL (2009) Choosing your informant: Weighing familiarity and recent accuracy. Dev Sci 12:426–437.
10. Lucas AJ, et al. (December 29, 2016) The development of selective copying: Children’s learning from an expert versus their mother. Child Dev, 10.1111/cdev.12711.
11. Koenig MA, Clément F, Harris PL (2004) Trust in testimony: Children’s use of true and false statements. Psychol Sci 15:694–698.
12. Koenig MA, Harris PL (2005) Preschoolers mistrust ignorant and inaccurate speakers. Child Dev 76:1261–1277.
13. Kinzler KD, Corriveau KH, Harris PL (2011) Children’s selective trust in native-accented speakers. Dev Sci 14:106–111.
14. Chen EE, Corriveau KH, Harris PL (2013) Children trust a consensus composed of outgroup members–but do not retain that trust. Child Dev 84:269–282.
15. Corriveau KH, Harris PL (2010) Preschoolers (sometimes) defer to the majority in making simple perceptual judgments. Dev Psychol 46:437–445.
16. Corriveau KH, Fusaro M, Harris PL (2009) Going with the flow: Preschoolers prefer nondissenters as informants. Psychol Sci 20:372–377.
17. Morgan TJH, Laland KN, Harris PL (2015) The development of adaptive conformity in young children: Effects of uncertainty and consensus. Dev Sci 18:511–524.
18. Halberda J, Feigenson L (2008) Developmental change in the acuity of the “Number Sense”: The Approximate Number System in 3-, 4-, 5-, and 6-year-olds and adults. Dev Psychol 44:1457–1465.
19. Boyd R, Richerson PJ (1985) Culture and the Evolutionary Process (Univ of Chicago Press, Chicago).
20. Harris PL, Lane JD (2014) Infants understand how testimony works. Topoi (Dordr) 33:443–458.
21. Krehm M, Onishi KH, Vouloumanos A (2014) Infants under 12 months understand that pointing is communicative. J Cogn Dev 15:527–538.
22. Martin A, Onishi KH, Vouloumanos A (2012) Understanding the abstract role of speech in communication at 12 months. Cognition 123:50–60.
23. Song H-J, Onishi KH, Baillargeon R, Fisher C (2008) Can an agent’s false belief be corrected by an appropriate communication? Psychological reasoning in 18-month-old infants. Cognition 109:295–315.
24. Fusaro M, Harris PL (2013) Dax gets the nod: Toddlers detect and use social cues to evaluate testimony. Dev Psychol 49:514–522.
25. Smith JD (2009) The study of animal metacognition. Trends Cogn Sci 13:389–396.
26. Call J, Carpenter M (2001) Do apes and children know what they have seen? Anim Cogn 3:207–220.
27. Neldner K, Collier-Baker E, Nielsen M (2015) Chimpanzees (Pan troglodytes) and human children (Homo sapiens) know when they are ignorant about the location of food. Anim Cogn 18:683–699.
28. Goupil L, Romand-Monnier M, Kouider S (2016) Infants ask for help when they know they don’t know. Proc Natl Acad Sci USA 113:3492–3496.
29. Acredolo LP, Goodwyn SW (1985) Symbolic gesturing in language development: A case study. Hum Dev 28:40–49.
30. Bartz DT (2017) Young children’s meta-ignorance. EdD thesis (Harvard University, Cambridge, MA).
31. Goldin-Meadow S, et al. (2014) New evidence about language and cognitive development based on a longitudinal study: Hypotheses for intervention. Am Psychol 69:588–599.
32. Harris PL, Yang B, Cui Y (2017) “I don’t know”: Children’s early talk about knowledge. Mind Lang 32:283–307.
33. Brown R (1973) A First Language (Allen & Unwin, London).
34. MacWhinney B, Snow C (1985) The child language data exchange system. J Child Lang 12:271–295.
35. Tardif T, Wellman HM (2000) Acquisition of mental state language in Mandarin- and Cantonese-speaking children. Dev Psychol 36:25–43.
36. Harris PL, Ronfard S, Bartz D (2017) Young children’s developing conception of knowledge and ignorance: Work in progress. Eur J Dev Psychol 14:221–232.
37. Liszkowski U, Brown P, Callaghan T, Takada A, de Vos C (2012) A prelinguistic gestural universal of human communication. Cogn Sci 36:698–713.
38. Salomo D, Liszkowski U (2013) Sociocultural settings influence the emergence of prelinguistic deictic gestures. Child Dev 84:1296–1307.
39. Bates E, Camaione L, Volterra V (1975) The acquisition of performatives prior to speech. Merrill Palmer Q Behav Dev 21:205–226.
40. Southgate V, van Maanen C, Csibra G (2007) Infant pointing: Communication to cooperate or communication to learn? Child Dev 78:735–740.
41. Begus K, Southgate V (2012) Infant pointing serves an interrogative function. Dev Sci 15:611–617.
42. Begus K, Gliga T, Southgate V (2014) Infants learn what they want to learn: Responding to infant pointing leads to superior learning. PLoS One 9:e108817.

7890 | www.pnas.org/cgi/doi/10.1073/pnas.1620745114 Harris et al.


COLLOQUIUM
PAPER
43. Lucca K, Wilbourn MP (December 29, 2016) Communicating to learn: Infants’ pointing gestures result in optimal learning. Child Dev, 10.1111/cdev.12707.
44. Chouinard MM (2007) Children’s questions: A mechanism for cognitive development. Monogr Soc Res Child Dev 72:1–112; discussion 113–126.
45. Frazier BN, Gelman SA, Wellman HM (2009) Preschoolers’ search for explanatory information within adult-child conversation. Child Dev 80:1592–1611.
46. Frazier BN, Gelman SA, Wellman HM (2016) Young children prefer and remember satisfying explanations. J Cogn Dev 17:718–736.
47. Harris PL, de Rosnay M, Pons F (2005) Language and children’s understanding of mental states. Curr Dir Psychol Sci 14:69–73.
48. Huttenlocher J, Vasilyeva M, Waterfall HR, Vevea JL, Hedges LV (2007) The varieties of speech to young children. Dev Psychol 43:1062–1083.
49. Hart B, Risley T (1992) American parenting of language-learning children: Persisting differences in family-child interactions observed in natural home environments. Dev Psychol 28:1096–1105.
50. Rowe ML (2012) A longitudinal investigation of the role of quantity and quality of child-directed speech in vocabulary development. Child Dev 83:1762–1774.
51. Kurkul KE, Corriveau KH (January 27, 2017) Question, explanation, follow-up: A mechanism for learning from others? Child Dev, 10.1111/cdev.12726.

PSYCHOLOGICAL AND
COGNITIVE SCIENCES

Harris et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7891
Changes in cognitive flexibility and hypothesis search
across human life history from childhood to
adolescence to adulthood
Alison Gopnik (a, 1), Shaun O’Grady (a), Christopher G. Lucas (b), Thomas L. Griffiths (a), Adrienne Wente (a), Sophie Bridgers (c), Rosie Aboody (d), Hoki Fung (a), and Ronald E. Dahl (e)

(a) Department of Psychology, University of California, Berkeley, CA 94720; (b) School of Informatics, University of Edinburgh, Edinburgh EH1 2QL, United Kingdom; (c) Department of Psychology, Stanford University, Stanford, CA 94305; (d) Department of Psychology, Yale University, New Haven, CT 06520; and (e) School of Public Health, University of California, Berkeley, CA 94720

Edited by Andrew Whiten, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark April 8, 2017
(received for review January 18, 2017)

How was the evolution of our unique biological life history related to distinctive human developments in cognition and culture? We suggest that the extended human childhood and adolescence allows a balance between exploration and exploitation, between wider and narrower hypothesis search, and between innovation and imitation in cultural learning. In particular, different developmental periods may be associated with different learning strategies. This relation between biology and culture was probably coevolutionary and bidirectional: life-history changes allowed changes in learning, which in turn both allowed and rewarded extended life histories. In two studies, we test how easily people learn an unusual physical or social causal relation from a pattern of evidence. We track the development of this ability from early childhood through adolescence and adulthood. In the physical domain, preschoolers, counterintuitively, perform better than school-aged children, who in turn perform better than adolescents and adults. As they grow older learners are less flexible: they are less likely to adopt an initially unfamiliar hypothesis that is consistent with new evidence. Instead, learners prefer a familiar hypothesis that is less consistent with the evidence. In the social domain, both preschoolers and adolescents are actually the most flexible learners, adopting an unusual hypothesis more easily than either 6-y-olds or adults. There may be important developmental transitions in flexibility at the entry into middle childhood and in adolescence, which differ across domains.

causal reasoning | social cognition | cognitive development | adolescence | life history

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.

Author contributions: A.G., S.O., C.G.L., T.L.G., A.W., and R.E.D. designed research; A.G., S.O., A.W., S.B., R.A., and H.F. performed research; A.G., S.O., and C.G.L. analyzed data; and A.G. and S.O. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. A.W. is a guest editor invited by the Editorial Board.

(1) To whom correspondence should be addressed. Email: gopnik@berkeley.edu.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1700811114/-/DCSupplemental.

One of the most distinctive biological features of human beings is our unusual life history. Compared with our closest primate relatives, we have a dramatically extended childhood, including an exceptionally long middle childhood and adolescence. Moreover, humans have shorter interbirth intervals than our closest primate relatives, producing an even greater number of less-capable children (1). There is evidence for other human adaptations that helped cope with this flood of needy young. In contrast to our closest primate relatives, human children enjoy the benefits of care from three sources in addition to biological mothers: pair-bonded fathers (2), alloparents (3), and postmenopausal women, in particular, grandmothers (4).

It may seem evolutionarily paradoxical that humans would have developed a life history that includes such expensive and vulnerable young for such a long period. However, across many different species, including birds and both placental and marsupial mammals, there is a very general (although not perfect) correlation between relative brain size, intelligence and a reliance on learning, and an extended period of immaturity (5–7). This correlation suggests a relation between our distinctive human life history and our equally distinctive large brains and reliance on learning, particularly cultural learning. Such a relation between biology and culture would have been coevolutionary and bidirectional: life-history changes allowed changes in cultural learning, which in turn both allowed and rewarded extended life histories. In this way, culture could have extended biology.

A number of researchers have suggested that our life history is related to our learning abilities (8–10). But what might this relation be like in more detail? It is possible that the extended human childhood and adolescence is simply a waiting period in which a large brain can grow or cultural learning can take place (11). However, both developmental psychology and neuroscience suggest that there may be more substantive differences in learning and plasticity in different developmental periods, differences that could contribute to human intelligence and culture.

We argue that there may be a developmental trade-off between cognitive abilities that allow organisms to learn the structure of a new physical or social environment, abilities that are characteristic of children, and the more adult abilities that allow skilled action on a familiar environment. Empirical evidence suggests that children may sometimes be better, and particularly more flexible, learners than adults. Ideas from the literature on developmental neuroscience, machine learning, and cultural learning may help to characterize and explain these developmental differences more precisely.

We go on to test these ideas by examining cognitive flexibility across the developmental periods of preschool, middle-childhood, adolescence, and adulthood, in both the physical and social domain.

When Younger Learners Do Better. Younger learners usually have more difficulty with cognitive tasks than older children and adults. Young children have characteristic deficits in executive function, working memory, attentional focus, and control (12, 13). These are precisely the same abilities required for performing complex skilled actions swiftly and effectively in adulthood. Indeed, human children are so dependent on others partly because of their deficits in these areas.

7892–7899 | PNAS | July 25, 2017 | vol. 114 | no. 30 www.pnas.org/cgi/doi/10.1073/pnas.1700811114


However, at the same time that their executive abilities are so limited, human children learn a tremendous amount about the world easily and rapidly. They quickly and spontaneously learn about the causal structure of their physical and social environments, constructing intuitive theories of the physical, biological, and psychological world (e.g., ref. 14).

There is also empirical evidence that younger learners sometimes, counterintuitively, actually outperform older ones on learning tasks, showing more flexibility. Younger mice learn to reverse a learned rule more easily than postpubertal mice (15). Older monkeys show neural plasticity when they learn an auditory or tactile pattern, but only when the pattern is relevant to their goals; juveniles extract the patterns and demonstrate plasticity independently of goals (16).

Among humans, younger learners are more able to learn new linguistic distinctions than older learners (17, 18) and they are better at imagining new uses for a tool (19). Younger children also remember information that is outside the focus of goal-directed attention better than adults and older children (20, 21).

We have recently found that preschool children also outperform older children and adults on abstract social (22) and physical (23) causal learning problems (24). In particular, younger learners are more likely to infer an initially unlikely causal hypothesis from a pattern of evidence. These kinds of causal learning are especially relevant for human evolution. Theories of the evolution of cognition stress the adaptive value of human abilities to learn both the psychological and social causal relationships that are involved in “theory of mind” and “Machiavellian intelligence,” and the physical causal relationships that underpin tool use (25, 26).

These findings suggest empirically that children might be especially flexible learners. But why would this be?

Neuroscience: Trade-Offs Between Executive Function and Plasticity. Neuroscientists have investigated the origins of both the increased executive control and decreased plasticity that come with age. One set of developments involves synaptic changes. In the early period of development, many more new synaptic connections are made than in adulthood. With age some of these neural connections are strengthened but others are pruned, transforming a more flexible, sensitive, and plastic brain into a more effective and controlled one (27, 28).

Increasing executive control is also related to the development of prefrontal areas of the brain and their increasing connection to other brain areas. However, neuroscientists have also argued that strong frontal control has costs for exploration and learning (29). Interference with prefrontal control areas through transcranial direct current stimulation leads to a wider range of responses on a “divergent thinking” task (30), and during learning there is a characteristic release of frontal control (31).

The adolescent brain undergoes particular changes. There is significant maturational development in prefrontal areas and in areas thought to be involved in self-perception and social cognition (32), which may indicate increased plasticity. However, there is also evidence for enhanced consolidation and pruning in adolescence (33), which might suggest a period of less flexibility.

Computation: Trade-Offs Between Exploitation and Exploration, and Narrow and Broad Search. The trade-off between executive function and plasticity in the neuroscience literature parallels another trade-off that appears in machine learning. Reinforcement learning algorithms make an important distinction between periods of exploration, in which the system gathers information about potential actions and outcomes, and exploitation, in which information gathering is replaced by taking the actions most likely to maximize reward (34). Human life histories can be interpreted as a unique solution to the explore/exploit tension, with low executive control and high plasticity early in life maximizing exploration, and increased executive function and lower plasticity maximizing reward as we switch to exploitation.

The different strategies that learners might engage in, and their consequences for those learners, can also be characterized more precisely by considering cognitive development from the perspective of a probabilistic model approach to cognition. This approach, inspired by statistical methods that are widely used in artificial intelligence and machine learning, has become increasingly influential in cognitive science (e.g., refs. 35–40).

This approach applies particularly naturally to learning the causal structure of the environment. Probabilistic models of cognition use sophisticated causal models to specify the probability of observing a particular statistical pattern of evidence if a causal hypothesis is true (41). This makes it possible to use Bayesian inference to determine the probability that the hypothesis is true given that evidence (42). Rather than simply generating a yes or no decision about whether a particular hypothesis is true, Bayesian inference evaluates multiple hypotheses and assigns probabilities to those hypotheses (14, 35–40). Many studies have presented children with evidence patterns and alternative hypotheses that might explain those patterns, and found that children characteristically choose hypotheses that Bayesian inference suggests should be more probable (14).

However, Bayesian inference comes at a cost: the significant computational cost of evaluating hypotheses. It is impossible for any system, human or computer, to consider and compare all of the possible hypotheses relevant to a realistic learning problem. Computer scientists and statisticians often use “sampling” to help solve this problem, stochastically selecting some hypotheses rather than others, and there is evidence that people, including young children, do something similar (43–45).

The sampling process, however, presents learners with a dilemma. A learner can conduct a narrow search, only revising current hypotheses when the evidence is particularly strong and making small adjustments to accommodate new evidence. This strategy is most likely to quickly yield a “good enough” solution that will support immediate effective action. But it also means that the learner may miss a better alternative that is farther from the current hypothesis, such as a hypothesis about an unusual causal relation.

Alternatively, a learner can conduct a more exploratory search, moving to new hypotheses with only a small amount of evidence, and trying out hypotheses that are less like the current hypotheses. This strategy is less efficient if the learner’s starting hypothesis is reasonably good, and may mean that the learner wastes time considering unlikely possibilities. But it may also make the learner more likely to adopt genuinely new solutions.

There is a related contrast in the algorithms that are used in computer science. Drawing on an analogy to statistical physics, computer scientists have explored the consequences of using narrower “low-temperature” versus broader “high-temperature” searches. Continuing the analogy, “simulated annealing” (46) is one of the best ways of resolving the tension between these two strategies. Learners who begin with a broader higher-temperature search and gradually move to a narrower low-temperature search are most likely to find the optimal solution, just as in metallurgy heating a metal and then cooling it leads to the most robust structure. Moreover, as in physical cases of annealing, there may be multiple rounds of this process. We have argued for a similar developmental pattern with early broad exploratory sampling followed by a later narrower search (23, 24). Our hypothesis is that childhood and adolescence may be evolution’s way of performing simulated annealing, and hence resolving the explore/exploit trade-off.

Cultural Learning: Trade-Offs Between Imitation and Innovation. The causal learning problems where children do better can also be recast as cultural learning problems, and understood in relation to the cultural learning literature. Consider a learner who observes someone else performing a complicated series of actions with artifacts that produce an effect. The learner might approach this information in several ways. First, the learner might simply

reproduce the actions in detail. Alternatively, the learner might apply existing causal knowledge to the situation, and bring about the effect more directly. These two forms of learning have been the focus of the extensive “overimitation” literature, starting with the classic Horner and Whiten study (47).

Human preschoolers are sensitive to information about physical events and actors’ intentions in deciding how faithfully to imitate, and there are also developmental and cultural differences in how imitation takes place (48–52). Learners of all ages may use their existing causal and cultural knowledge to interpret the actions of another person and to decide whether and how faithfully to imitate those actions.

However, they might also use another person’s demonstration to discover a new or unexpected causal relationship. For example, consider a Pleistocene learner who sees an expert produce a flake from one side of a rock by hitting it on the other side (53), or a modern learner who watches an expert swipe to find a photo on a phone. The learner might simply imitate the demonstrator exactly. Alternatively, she might use her existing causal knowledge to bring about the result (hitting the rock at the place where she wants it to flake or using a keyboard command).

However, a learner might also use this information to infer an unexpected abstract causal principle (distant force or touch activation). She could then use this principle to design innovative actions beyond the demonstration, shaping other tools or trying other swipes for other commands. This kind of learning would enable learners both to adopt innovations in an intelligent way and to create innovations themselves.

This approach also applies to social and psychological causal learning. Imagine that a learner hears a complex narrative describing a series of human actions, again a classic cultural, as well as causal, learning scenario. The learner might simply encode the actions as they are described, recording what the actors did. She might interpret those actions in terms of an existing psychological schema. Alternatively, she might use the information in the narrative to infer new psychological or social relations.

As in the physical case, this last option might lead to both the adoption and creation of social and psychological innovations. Consider a learner who hears a story in which Sam and John live together and share a bedroom. She might interpret this story in terms of her existing cultural schemas (perhaps Sam and John are close friends with a small apartment). She might also, however, use the story to make a broader inference about the possibility of same-sex marriage.

These alternative forms of cultural learning exemplify the explore/exploit tension. The first two strategies, namely, exact imitation or reliance on causal knowledge, are likely to lead to quick and mostly effective actions. Entertaining the unlikely new causal relation is both more cognitively demanding and more risky. In the long run, however, it may confer an advantage in dealing with changing and variable environments.

Human learners of all ages may use all these strategies to some extent. However, our hypothesis is that learners at different developmental stages may be more or less likely to use different strategies. In particular, more protected and more behaviorally variable younger learners may be more likely to adopt new hypotheses than older learners. In fact, the causal learning tasks in our earlier research, in which younger learners do better than older ones, involve precisely these kinds of scenarios. Learners infer a new causal relation from a demonstration or narrative.

This developmental difference may also help resolve the tension between imitation and innovation in cultural learning (48). Human children are adept at imitation. However, the flexibility of childhood cognition may also help allow innovations to be adopted and to spread. Young children are rarely the source of complex technical innovations; actually designing and producing an effective tool, for example, is a challenging task that requires both innovation and executive skill (48, 54). However, innovations that are effortful and rare when they first appear within a generation can become effortlessly and widely adopted by the next generation. In fact, among nonhuman animals, cultural innovations are often first produced, adopted, and spread by juveniles (55–58).

Continuous Knowledge Acquisition vs. Discontinuous Developmental Transition. There are two complementary mechanisms that might lead to a developmental shift from broader exploration to narrower exploitation. One is simply the accumulation of knowledge itself. As we learn more and grow more confident in our beliefs, we are less likely to change those beliefs. From a Bayesian perspective, development proceeds from a relatively “flat” prior, where different hypotheses have more similar probabilities, to a more “peaked” distribution, where some hypotheses are much more likely than others, as a learner accumulates knowledge. In Bayesian models a flatter prior would automatically lead to broader search.

Another complementary possibility, building on the literature discussed above, is that maturation and general experience lead to different degrees of plasticity and flexibility and different search strategies, independent of accumulated knowledge. There might be nonlinear changes at points of developmental transition, such as the transition from early to middle childhood at around 6 y or in adolescence, rather than a simple continuous change with accumulated knowledge.

In particular, although adolescents have more accumulated experience than younger children, there is evidence, as noted above, that adolescence may also be a period of enhanced plasticity and learning (59, 60), especially for social domains (32, 61), in part through the privileging of social information processing and the salience of social rewards in decision making (62, 63). Cultural innovations, such as new socially significant forms of language, dress, or music, often first appear in adolescents. Adolescence might be an extra round of annealing in the social sphere. However, there is also evidence that adolescence may be a period of pruning and consolidation.

In fact, two contrasting developmental patterns characterize adolescence (64, 65). On some measures, such as cognitive control and self-regulation, there is a relatively linear trajectory from childhood through adolescence to adulthood. On others, such as sensation-seeking and risk-taking, both forms of exploration, there is a marked increase associated with the onset of puberty, and an inverted U pattern peaking in adolescence and then declining. There is extensive research on risk-taking and decision-making in adolescence but, to our knowledge, no research on causal learning.

Current Studies. We approach these questions by extending two earlier causal learning experiments. Where the original experiments contrasted preschoolers with either 6-y-old children or adults, we report results covering the entire developmental span from preschool to adulthood, with special focus on the transition to middle childhood and adolescence, periods not explored previously. This approach allows us to explore learning across human life history, and to ask whether there are distinctive developmental transitions.

Both experiments have the same logic. We contrast two hypotheses about how objects or people work: one that is initially more likely, at least for adults, and one that is more unusual. In Exp. 1 we contrast the hypothesis that individual objects activate a machine with the hypothesis that particular combinations of objects do. In Exp. 2 we contrast the hypothesis that someone took a risk because of their personal traits with the hypothesis that they took the risk because of the situation they encountered.

In one condition, participants receive covariation evidence that supports the likely hypothesis. In a second, otherwise identical condition, they receive covariation evidence that supports the unlikely hypothesis. In a third baseline condition, participants do not receive evidence either way. We record whether participants of different ages adopt the likely or unlikely hypothesis in each condition.
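
The logic of the three conditions, together with the contrast between “flat” and “peaked” priors and between high- and low-temperature search, can be illustrated with a small sketch. This is not the model used in the studies; it is a minimal two-hypothesis Bayesian example, and every number in it (the priors, the likelihoods) as well as the function names `posterior`, `temper`, and `choose` are hypothetical, chosen only for illustration. The `temper` function is one assumed way of representing search temperature, by raising probabilities to the power 1/T and renormalizing.

```python
import random

def posterior(prior_unlikely, lik_unlikely, lik_likely):
    """Bayes' rule for two hypotheses: returns P(unlikely hypothesis | evidence)."""
    prior_likely = 1.0 - prior_unlikely
    numerator = prior_unlikely * lik_unlikely
    return numerator / (numerator + prior_likely * lik_likely)

def temper(p, temperature):
    """Flatten (T > 1) or sharpen (T < 1) a two-hypothesis distribution."""
    a = p ** (1.0 / temperature)
    b = (1.0 - p) ** (1.0 / temperature)
    return a / (a + b)

# Hypothetical prior probabilities of the unusual hypothesis (assumptions).
FLAT_PRIOR = 0.4     # learner who finds the unusual hypothesis fairly plausible
PEAKED_PRIOR = 0.05  # learner strongly committed to the familiar hypothesis

# Hypothetical likelihoods of the observed covariation pattern under the
# (unlikely, likely) hypothesis in each of the three conditions.
CONDITIONS = {
    "supports_likely": (0.1, 0.9),
    "supports_unlikely": (0.9, 0.1),
    "baseline": (0.5, 0.5),  # no evidence either way
}

def choose(prior, condition, temperature=1.0, rng=random):
    """Sample one hypothesis from the (tempered) posterior."""
    p = temper(posterior(prior, *CONDITIONS[condition]), temperature)
    return "unlikely" if rng.random() < p else "likely"

if __name__ == "__main__":
    for name, prior in [("flat", FLAT_PRIOR), ("peaked", PEAKED_PRIOR)]:
        for cond, liks in CONDITIONS.items():
            p = posterior(prior, *liks)
            print(f"{name:6s} prior, {cond:17s}: P(unlikely | evidence) = {p:.2f}")
```

Under these assumed numbers, the same evidence for the unusual hypothesis moves the flat-prior learner well past even odds while the peaked-prior learner still favors the familiar hypothesis, and in the baseline condition the posterior simply equals the prior. Raising the temperature pulls the choice probabilities toward 0.5, which is one simple way to picture a broader, more exploratory search.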

The different conditions allow us to control for alternative
factors that might influence performance on these tasks. In the
first two conditions, supporting the likely hypothesis or the un-
likely one, the participants see similar agents perform similar
actions on similar objects; all that differs is the covariation be-
tween causes and effects. Moreover, both conditions require that
the learner attend to and use the particular pattern of data
presented in the demonstration. Whether they adopt the likely or
unlikely hypothesis, the learner still has to attend to the specific
details of the evidence to answer correctly. Differences in per-
formance, then, should reflect differences in causal learning
rather than more general information-processing, linguistic, or
motivational factors.

Exp. 1: Reasoning About the Causes of Physical Events


Fig. 2. Proportion of participants labeling test objects as “blickets” with SEs.

In an earlier study, Lucas et al. (23) found that, across three
different experiments, with different participants and designs,
preschool children learned an unusual abstract physical causal
relationship but adults had difficulty.
In the second experiment of that study, preschool children and allowed us to examine the transitions from early to middle
adults were presented with a machine that lights up when you place childhood, from middle childhood to adolescence, and from
certain patterns of blocks on top, and were told that “blicketness” adolescence to adulthood. Would there be differences between
makes the machine go. First, in a training trial, participants saw preschoolers and school-age children? Would adolescents be less
unambiguous covariation evidence suggesting that the machine flexible and more like adults? Or might they be more flexible
operated according to a general logical rule. In one condition, the than school-aged children and adults with the inverted U pat-
machine operated on a disjunctive “or” rule: each block either ac-
tern? Finally, would there be a continuous change as children
tivated the machine or did not. Accounts of adult causal reasoning
accumulated more knowledge or more discontinuous changes at
suggest that this disjunctive rule is the default assumption for adults
developmental transitions?
(e.g., ref. 66). In the other condition, the machine operated on a
more unusual conjunctive “and” rule: two blocks had to be placed Results.
on the machine at the same time to make it activate. Four-year-old

PSYCHOLOGICAL AND
COGNITIVE SCIENCES
Blicket judgments. We combined new data collected from younger
children and adults in both conditions then saw an ambiguous test school-aged children (6- to 7-y-olds), older preadolescent chil-
trial with new blocks that was consistent with either general prin- dren (9- to 11-y-olds), and young adolescents (12-to 14-y-olds)
ciple. In a baseline condition, participants only saw the ambiguous with the data from 4-y-olds and adults tested with the identical
trial without the training trials. In each condition, participants were
method in Lucas et al. (23).
then asked whether each block was or was not a “blicket” and were
If the observers believe the machine operates on an unusual
asked to activate the machine.
conjunctive rule, requiring multiple blickets to operate, they
Children learned the appropriate general rule in each condi-
should say that F, D, and possibly E are blickets and use multiple
tion and applied it to the ambiguous case. Adults applied the
objects to make the machine go. If observers believe that the
default disjunctive rule in the ambiguous case even when the
machine works on the “disjunctive” rule, in contrast, they should
earlier evidence weighed against it.
In Exp. 1 we used exactly the same methods across the entire say that F is a blicket but that D and E are not and put single
developmental range, including 6- to 7-y-olds, 9- to 11-y-olds, and objects on the machine. [The evidence that E is a blicket is less
12- to 14-y-olds. Fig. 1 provides a visual display of the pattern of strong than the evidence for D, so participants should be less likely
evidence used for training and test trials. to say that E is a blicket than D (23).] (See SI Appendix, Table S5
We extended the contrast between preschoolers and adults to for analysis of E judgments, consistent with these predictions.)
include school-aged children and adolescents. This approach Fisher’s exact tests revealed no significant differences between
conditions or ages for the unambiguous F object; as predicted all
of the age groups in all of the conditions said that F was a blicket
(means ranged from 0.7 to 0.96).
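
The predictions for the test objects under each rule can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' analysis: the trial sequence follows the test-trial labels of Fig. 1, but the activation outcomes (D alone and E alone do not activate the machine; DF and DEF do) are an assumption chosen to agree with the predictions stated above, the conjunctive rule is modeled as "at least two blickets present," and all names are hypothetical.

```python
from itertools import product

# Assumed test trials: (blocks placed on the machine, did the machine activate?).
# The outcomes are an illustrative assumption, not taken from the paper.
TEST_TRIALS = [
    ({"D"}, False),
    ({"D"}, False),
    ({"D"}, False),
    ({"E"}, False),
    ({"D", "F"}, True),
    ({"D", "E", "F"}, True),
    ({"D", "F"}, True),
]


def activates(blocks, blickets, rule):
    """Return True if the machine activates for these blocks under the given rule."""
    n_blickets = len(blocks & blickets)
    if rule == "disjunctive":   # any single blicket is enough
        return n_blickets >= 1
    if rule == "conjunctive":   # at least two blickets are needed together
        return n_blickets >= 2
    raise ValueError(f"unknown rule: {rule}")


def consistent_assignments(rule, trials=TEST_TRIALS, objects=("D", "E", "F")):
    """List every set of blickets that reproduces all observed trial outcomes."""
    results = []
    for flags in product([False, True], repeat=len(objects)):
        blickets = {obj for obj, is_blicket in zip(objects, flags) if is_blicket}
        if all(activates(blocks, blickets, rule) == outcome
               for blocks, outcome in trials):
            results.append(frozenset(blickets))
    return results


if __name__ == "__main__":
    for rule in ("disjunctive", "conjunctive"):
        options = [sorted(s) for s in consistent_assignments(rule)]
        print(rule, options)
```

Under these assumed outcomes, the disjunctive rule leaves only F as a blicket, whereas the conjunctive rule requires both D and F and leaves E undetermined, which mirrors the "F, D, and possibly E" prediction in the text.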
Fig. 2 presents the proportion of participants in each age
group labeling the critical D test object as a blicket by condition.
Because the dependent measure is a binary response, we used
comparisons of generalized linear models to identify the
statistical model with the best fit to the data. Results of model
comparisons can be found in SI Appendix, Table S4.

A model predicting the binary D judgment from condition and
age group with no interactions was the best fit to the data. Post hoc
tests using Tukey’s honest significant differences (HSD) for D
object judgments revealed a significant difference between the
conjunctive (M = 0.52, SE = 0.02) and the disjunctive (M = 0.13,
SE = 0.01; t = −0.391, P < 0.001) and baseline (M = 0.15, SE =
0.01; t = −0.374, P < 0.001) conditions, and there was no
significant difference between the disjunctive and baseline
conditions (t = −0.017, P = 0.923).

[Fig. 1 panel labels: training trials (both training conditions): A, B, C, AB, AC, BC; test trials: D, D, D, E, DF, DEF, DF.]
Fig. 1. Schematic of the procedure for Exp. 1. The yellow rectangle
represents the machine’s activation. “Disjunctive” training provides evidence of
the more common, disjunctive hypothesis. “Conjunctive” training provides
support for the less common conjunctive hypothesis. “Test” trials presented
ambiguous evidence about the “D” object.

In addition to the model comparisons, we conducted planned
comparisons for the theoretically crucial developmental
contrasts in the critical conjunctive condition, using Fisher’s exact

Gopnik et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7895
tests. These were the transition to school-age (4- vs. 6-y-olds) and
to adolescence (12- to 14-y-olds vs. 6- to 7- and 9- to 11-y-olds,
and vs. adults). Four-year-olds (M = 0.92, SE = 0.01) were
significantly more likely to label D a blicket than 6- to 7-y-olds (M =
0.56, SE = 0.02; P < 0.01). Six- to 7-y-olds and 9- to 11-y-olds
(M = 0.6, SE = 0.02; P = 1) did not differ but both 6- to 7-y-olds
and 9- to 11-y-olds labeled D as a blicket significantly more than
12- to 14-y-olds (M = 0.28, SE = 0.02; P < 0.05, in both cases).
However, the judgments of adolescents (12- to 14-y-olds) did not differ
significantly from the judgments of adults (M = 0.25, SE = 0.02; P = 1.0).
Thus, within the new data collected in this study, we saw some
evidence for both middle childhood and adolescent transitions.

Intervention choices. We also analyzed participants’ choices when
they were asked to activate the machine. Fig. 3 displays the
proportion of participants choosing multiple items, indicating that they
thought more than one object was necessary to activate the
machine. There was more variability in this open-ended response than
in the yes/no blicket judgments. However, the general pattern was
similar. In particular, adolescents and adults were more likely to
choose single objects to make the machine go, suggesting that they
had genuinely concluded that the machine worked disjunctively,
and did not simply use the word “blicket” differently than younger
participants.

Again, we used a generalized linear model (see SI Appendix,
Table S6 for details). The model with the best fit to
the data predicted the single vs. multiple object use from
condition, age group, and the interaction between condition and
age group.

As with the blicket judgment measure, we made planned
comparisons for the conjunctive condition using Fisher’s exact
tests, focusing on the school-aged and adolescent transitions.
These tests showed that 4-y-olds (M = 0.84, SE = 0.01) were
more likely to use multiple objects to activate the machine than
6- to 7-y-olds (M = 0.53, SE = 0.02; P < 0.05), again suggesting a
middle childhood transition. With this measure, 6- to 7-y-olds
and 9- to 11-y-olds (M = 0.63, SE = 0.02) did not differ
significantly from 12- to 14-y-olds and adolescents did not differ
significantly from adults.

Discussion. These results suggest that, in this task, as learners
grow older and have more experience they become less sensitive
to the evidence and more reliant on their prior beliefs. They
increasingly prefer disjunctive explanations to conjunctive ones,
even when the evidence weighs in the opposite direction.

The results from both the blicket judgments and interventions
suggest a developmental transition at the entry to middle
childhood and the blicket judgment results also suggest a transition at
adolescence, rather than just a continuous change with
increasing knowledge. School-aged children are similar to each
other and less flexible than preschoolers, adolescents and adults
are similar, and both are less flexible than preschoolers and
school-aged children (Fig. 2).

Exp. 2: Reasoning About the Causes of Actions

In the second experiment we turned from physical causality to
social and psychological causality. Classic findings in social
psychology show that Western adults attribute actions to the stable
internal personal traits of an actor despite countervailing
evidence, the “fundamental attribution error” (67). These findings
suggest that adults rely on existing causal hypotheses rather than
modifying those hypotheses in the face of evidence.

In one study, for example, an experimenter instructed half the
participants in a group to write and read aloud an essay
supporting Castro and the other half to write and read an essay
opposing him. Despite the obvious evidence that the essays were
the result of the situation, participants reported that people in
the first group were more left wing than those in the second (68).
Among adults, this trait bias tends to become stronger with age
(69) and it appears to be stronger in some cultures than others:
American and European middle-class participants show a stronger
trait bias than Hong Kong, Mainland Chinese, Japanese, and
Korean participants (70).

How does this bias develop in childhood? Seiver et al. (22)
presented preschool children with a scenario in which two dolls
either played or refused to play on two potentially risky toys. The
covariation evidence supported either a person or situation
attribution. Then they asked the children to explain why the actors
played or refused to play on the toys. Four-year-olds accurately
made person or situation attributions depending on the evidence.
Six-year-olds, however, showed a trait bias. They made more
person attributions than 4-y-olds, even when the covariation
information supported a situation attribution. In Exp. 2 we extend
this previous work to study the developmental changes in learning
over childhood and adolescence.

We included an adult sample to ensure that adults would
indeed show a trait bias in this task. Adding 9- to 11-y-old and 12-
to 14-y-old samples let us test whether the previously discovered
transition from 4- to 6-y-olds was part of a continuous
developmental decline, or reflected a particular transition into
middle childhood.

We could also examine adolescence. Like adults, adolescents
have extensive experience of their particular culture and the trait
assumptions that go with it. There might be a developmental
progression toward the adult pattern, as in Exp. 1. However,
adolescents are also especially sensitive to social information and
strongly motivated to explain peer behavior (59). They might be
more sensitive to social evidence, and more likely to override a
trait bias than adults. We might then expect something more like
the inverted U of risk-taking and sensation-seeking.

Results. We combined data from 9- to 11-y-olds, 12- to 14-y-olds,
and adults with the data from preschoolers and 6-y-old children
presented in Seiver et al. (22). We recorded how often partici-
pants explained the dolls’ actions in terms of situations—the
initially unlikely hypothesis—and used this to assign a “situation”
score from 0 to 2. Fig. 4 shows performance across age in the
person condition, where the evidence supports a trait attribution,
the situation condition, where the evidence supports a situation
attribution, and a baseline condition, which didn’t support either
explanation. Linear regression analyses were used to predict the
attribution score from age group and condition.
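
The scoring scheme can be illustrated with a short Python sketch. The per-explanation values (person = 0, interaction = 0.5, situation = 1) are those given in the Methods; the assumption that each participant contributes exactly two coded explanations (one per doll), which yields the 0 to 2 range, is an inference from the text, and the function and variable names are hypothetical.

```python
# Per-explanation values as described in the Methods section.
CODE_VALUES = {"person": 0.0, "interaction": 0.5, "situation": 1.0}


def situation_score(codes):
    """Sum the coded explanations for one participant.

    Assumes two explanations per participant (one per doll), so the
    result lies between 0 (all person) and 2 (all situation).
    """
    if len(codes) != 2:
        raise ValueError("expected exactly two coded explanations")
    return sum(CODE_VALUES[code] for code in codes)


if __name__ == "__main__":
    # Hypothetical participants, for illustration only.
    print(situation_score(["situation", "situation"]))
    print(situation_score(["person", "interaction"]))
```

Higher scores therefore indicate that a participant more often adopted the initially unlikely situation explanation.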

Fig. 3. Proportion of participants choosing either single or multiple items
for intervention choice with SEs.

Model comparisons showed that the model with the best fit to
the data predicted situation attribution score from age group
and condition, as well as interactions between the two variables
[F(14, 525) = 15.43, P < 0.001, adjusted R² = 0.273] (details of

7896 | www.pnas.org/cgi/doi/10.1073/pnas.1700811114 Gopnik et al.


the model comparisons can be found in SI Appendix, Table S12).
Fig. 4 plots the average situation attribution score for each age
group by condition.
As in Exp. 1, we also performed planned comparisons for the
crucial age transitions in the situation condition, using t tests.
The critical situation condition revealed whether participants
would adopt the unlikely situation hypothesis given evidence, or
would instead attribute actions to traits as they did in the person
and baseline conditions.
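
A planned two-group comparison of this kind can be sketched as follows. The text does not say which variant of the t test was used; this sketch assumes Welch's unequal-variance form (the default of R's `t.test`), computes only the statistic and its approximate degrees of freedom, and uses made-up scores. The function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error contributed by each group.
    se2_a = variance(group_a) / len(group_a)
    se2_b = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t, df


if __name__ == "__main__":
    # Hypothetical situation scores (0 to 2) for two age groups.
    adolescents = [2.0, 1.5, 1.0, 2.0, 1.0, 0.5]
    adults = [0.5, 0.0, 1.0, 0.0, 0.5, 1.0]
    t, df = welch_t(adolescents, adults)
    print(f"t = {t:.3f}, df = {df:.1f}")
```

The sign of t depends only on the order in which the two groups are passed, so a negative value simply means the first group had the lower mean.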
In the Seiver et al. (22) data the 6-y-olds, but not the 4-y-olds,
showed a trait bias in the situation condition, suggesting a
transition at school age. In this experiment, we also tested the
adolescent transition by comparing the 12- to 14-y-olds to 6- and
9-y-olds and to adults. The adolescents showed an interesting
pattern, unlike the pattern in Exp. 1, which appeared to be
responsible for the interaction effect in the model. Adolescent
responses in the situation condition differed both from adults and
younger children in an inverted U pattern. In the situation condition
12- to 14-y-olds (t = −4.1048, P < 0.001) made more situation
attributions than adults, and 12- to 14-y-olds also made
significantly more situation attributions than 6-y-olds (t = −2.34, P = 0.02),
although they were not significantly different from 9- to 11-y-olds.

Fig. 4. Average attribution scores by age group and condition with SEs. YO,
year old.

We also performed additional analyses using Tukey’s HSD
test. Participants in both the person (M = 0.2, SE = 0.02; t =
−0.531, P < 0.001) and baseline (M = 0.49, SE = 0.03; t = −0.531,
P < 0.001) conditions provided significantly fewer situation
attributions than those in the situation (M = 1.02, SE = 0.03) condition.
There was not a significant difference between the baseline and
person conditions, suggesting a trait bias.

Given the interaction, we also used Tukey’s HSD tests to
examine age differences separately for each condition. There were
no significant age differences in attribution scores in the person
condition; all age groups produced trait explanations when these
explanations were congruent with the data, and rarely made
situation attributions.

The baseline condition allowed us to assess participants’
judgments when no evidence was available (their “prior” in Bayesian
terms). Post hoc Tukey tests revealed that 4-y-olds (M = 0.93,
SE = 0.08) provided significantly more situation attributions than
both 12- to 14-y-olds (M = 0.24, SE = 0.06; t = −0.694, P = 0.001)
and adults (M = 0.38, SE = 0.05; t = −0.55, P = 0.004). Although
both 6-y-olds (M = 0.43, SE = 0.1; t = −0.49, P = 0.09) and 9- to
11-y-olds (M = 0.55, SE = 0.11; t = −0.386, P = 0.49) provided
fewer situation attributions than 4-y-olds, these differences did not
reach statistical significance. This finding suggested that a trait
bias developed around 6 y and was maintained with age.

Discussion. In the person condition, participants of all ages mostly
made trait attribution explanations, in accordance with the
evidence. In the baseline condition, with no evidence, there was a
decrease in situation explanations with age. Accumulating
experience may have led to a trait bias.

In the situation condition, in which the learners had to infer
the unusual hypothesis, there was an interesting developmental
reversal, with an inverse U pattern. Twelve- to 14-y-olds were
less likely to make trait attributions than either 6-y-olds or
adults. In other words, although the adolescents had developed
a strong bias to begin with, they overcame that bias when they
received contradictory evidence. The adolescents showed the
largest gap between the baseline condition and the situation
condition.

These findings support the idea that adolescents may be
particularly interested in discovering new social possibilities. This
finding is consistent with the fact that, compared with adults,
adolescents show greater activation in brain regions associated
with self-perception and social cognition (71, 72), and that
adolescents are often at the forefront of social change.

Finally, these results suggest that changes in flexibility are not
solely because of the accumulation of knowledge. The
adolescents should have accumulated more knowledge than the
younger children and this was reflected in their trait bias in the
baseline condition. However, the adolescents were also the most
flexible social thinkers; they were most able to overcome prior
biases in the face of new evidence.

General Discussion and Conclusions

These results support the suggestion that the extended human
period of immaturity allows a period of flexible hypothesis search
in cultural learning. In both studies, we also found some evidence
for developmental transitions, particularly from early to middle
childhood and at adolescence.

The crucial conditions involved cases where the evidence
and the existing hypotheses were in conflict, the conjunctive
condition in Exp. 1 and the situation condition in Exp. 2. In
both studies 4-y-olds and 6- to 7-y-olds were significantly
different in these conditions. In both studies, however, we did
not see significant differences between 6- to 7-y-olds and 9- to
11-y-olds.

Similarly, we found evidence for a transition in adolescence in
both studies in these conditions, but this transition went in
opposite directions. In the physical case, in the conjunctive
condition adolescents were similar to adults but less flexible than
either 6-y-olds or 9- to 11-y-olds. Like adults, the adolescents
seemed reluctant to revise physical knowledge they had already
acquired. In the social case, however, in the situation condition
adolescents were more flexible than either 6-y-olds or adults.
This finding is consistent with the idea that adolescents are more
tuned to the social domain than the physical one, and are willing
to entertain new social possibilities.

These findings also raise the question of the interaction
between biological and environmental factors in the unfolding of
life history. The findings in the baseline conditions suggest that
children are gradually accumulating more knowledge and that
this may play a role in the decline of cognitive flexibility.
However, the discontinuous pattern in the conjunctive and
situation conditions suggests that other factors also play a role.
Biological changes like puberty may play a role in the adolescent
transitions. There may also be more complex interactions
between the changing life experiences that come with different
developmental stages and hypothesis search and flexibility.
Adolescence is not only a time of biological change; it is also a time
of new social motivation and experience. Similarly, there is a
complex interaction between biological changes at around 6 y

and experiences such as school in our culture, or more informal
apprenticeships in cultures without formal schooling.

It is also plausible that a playful protected environment may
lead to more flexible, exploratory and childlike learning, even in
adulthood, and that even in childhood, stressful or resource-poor
environments may lead to less flexibility and a more adult-like
emphasis on exploitation (see, e.g., refs. 73 and 74).

These issues are all worthy of exploration, as are extensions of
these studies to new domains. The physical causal learning
results in Exp. 1 have been replicated in low socioeconomic status
preschoolers in Peru and the United States,* but more extensive
cross-cultural comparisons, including the social tasks and
extending to forager and small-scale agricultural cultures, would
also be important. The current findings do, however, suggest a
relation between biology and culture, in particular between the
distinctive childhood and adolescence of our life history and our
equally distinctive ability to learn about and create new social
and physical environments.

Methods
Data from the new participants in this study can be found on the Open
Science Framework (https://osf.io) under the profile for Shaun O’Grady.

Exp. 1.
Participants. Children aged 6- to 7-y-old (n = 90), 9- to 11-y-old (n = 90), and
12- to 14-y-old (n = 86) participated. We combined these new data with those
reported for preschoolers and adults in Exp. 2 of Lucas et al. (23) to
compare performance from preschool to adulthood. For all participants in both
experiments reported here, parents provided written informed consent
and the child participants provided either written assent (9- to 14-y-olds) or
verbal assent (4- to 7-y-olds) in accordance with protocols approved by the
University of California, Berkeley Committee for the Protection of
Human Subjects.

Procedure. Participants from each age group were randomly assigned to one
of three conditions: two training conditions (conjunctive and disjunctive
conditions) and a third condition with no training, termed the baseline
condition. In each condition the participants were shown nine different
blocks (A, B, C, A2, B2, C2, D, E, and F). Participants were presented with a
machine and were informed that “blicketness” makes the machine light up
and play music.

In both of the training conditions, the experimenter placed individual blocks or
combinations of blocks on the machine in the same order (Fig. 1). In the
conjunctive condition the machine only activated when the experimenter placed
both A and C on the machine at the same time, providing evidence that supports
a conjunctive rule about the machine’s operation. In the disjunctive condition
the machine activated any time either A or C was placed on the machine,
suggesting that only one of the two blocks was needed. After the two training
trials participants saw one test trial with three new items: D, E, and F. The test
trials provided ambiguous information that could support either the conjunctive
or disjunctive rule (i.e., D and F are both blickets or just F is a blicket). In the
baseline condition, participants were not given any prior training about the rule
for operating the machine, but instead were presented with two ambiguous test
trials. We recorded results from the second test trial but there were no
significant differences between them.

The three conditions only differed in the covariation between the blocks
and the machine. In all three conditions, at the end of both training and test
trials, the experimenter pointed to each item individually and asked the
participant if that item was a blicket or not a blicket. Finally, the
experimenter then gestured to the set of three objects and asked the participant,
“Which of these [gesturing to the three test objects] would you use to turn
on the machine?”

Exp. 2.
Participants. The same 9- to 11-y-olds (n = 90) and 12- to 14-y-olds (n = 86) in Exp.
1 also participated in this experiment. Order of administration of the tasks was
counterbalanced to avoid interference; there were no order effects. An
additional 240 adult participants were recruited for an online version of this
experiment via Amazon’s Mechanical Turk. We combined these data with the original
data from Seiver et al. (22) for 4- and 6-y-olds.

Procedure and coding. Participants were randomly assigned to one of three
conditions in which two dolls interacted with two toys. Subjects assigned to the
situation condition saw two dolls play on one toy four times and then saw those
same dolls avoid playing on a second toy four times. This pattern of covariation
should suggest that something about the situation caused the pattern of
actions (i.e., “her friend played on the bicycle” or “the trampoline is
dangerous”). Those assigned to the person condition saw one doll play on both toys
four times, whereas the other doll avoided playing on both toys four times.
This evidence should suggest that the actions resulted from an inherent
trait of the doll, and produce trait-based explanations, such as “she’s the
type of doll that gets scared/is brave” or “she knows how to ride a bike.”
Finally, in a baseline condition, participants saw one doll play on one toy
four times, whereas the other doll avoided the other toy four times.
Participants in this condition could not rely on covariation information to make
attributions because they had not seen how each doll acted on the other toy.
After they watched the dolls interact with the toys, each participant was asked
why each doll either played or did not play on the second toy.

Explanations referring to an enduring characteristic of the doll were coded
as “person” attributions and were given a score of 0 (e.g., “Because she
might be more brave than the other one”). When an explanation referenced
an aspect of the toy or situation, the response was coded as a “situation”
attribution and given a score of 1 (e.g., “The trampoline doesn’t have any
edges”). Some explanations referred to both personal traits and situational
factors and were coded as “interactions” and given a score of 0.5. See SI
Appendix, Table S9 for a list of example responses by category. Reliability
coding was conducted on 16% of the responses by a second coder who was
blind to condition, and interrater reliability was high (Cohen’s κ = 0.967, P <
0.001). Coded explanation responses for each participant were summed to
provide a “situation” attribution score for each participant.

Analyses. All analyses in both experiments were performed using the R
statistical programming language (75). Preliminary analyses revealed no
effect of block shape, doll name, toy, or the order in which the dolls played.
Linear regression models found no effect of gender of the participants or
the experimenter in either experiment (see SI Appendix, Tables S3 and S11).
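
The interrater reliability statistic reported for the explanation coding in Exp. 2, Cohen’s κ, compares observed agreement between two coders with the agreement expected by chance. A minimal Python sketch of that calculation follows; the example labels are invented for illustration and the function name is hypothetical. The analyses in the paper were run in R, so this is not the authors' code.

```python
from collections import Counter


def cohens_kappa(coder_1, coder_2):
    """Cohen's kappa for two equal-length sequences of category labels."""
    if len(coder_1) != len(coder_2) or not coder_1:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(coder_1)
    # Observed proportion of items on which the coders agree.
    observed = sum(a == b for a, b in zip(coder_1, coder_2)) / n
    # Chance agreement from each coder's marginal category frequencies.
    counts_1, counts_2 = Counter(coder_1), Counter(coder_2)
    categories = counts_1.keys() | counts_2.keys()
    expected = sum(counts_1[c] * counts_2[c] for c in categories) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Invented codes for six explanations, for illustration only.
    first = ["person", "situation", "interaction", "person", "situation", "person"]
    second = ["person", "situation", "interaction", "person", "person", "person"]
    print(round(cohens_kappa(first, second), 3))
```

Values near 1 indicate agreement well beyond chance, and a value of 0 indicates agreement no better than chance.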

ACKNOWLEDGMENTS. This work was funded by the National Science
Foundation Graduate Research Fellowship under Grant DGE 1106400 (to
S.O.); National Science Foundation Grant BCS-331620 (to A.G. and T.L.G.);
and grants from the Bezos Foundation and McDonnell Foundation (to A.G.).

*Wente AO, et al., Cognitive Development Society Meeting, October 9–11, 2015,
Columbus, OH.

1. Hill K, Kaplan H (1999) Life history traits in humans: Theory and empirical studies. Annu Rev Anthropol 28:397–430.
2. Chapais B (2009) Primeval Kinship: How Pair-Bonding Gave Birth to Human Society (Harvard Univ Press, Cambridge, MA).
3. Hrdy SB (2011) Mothers and Others (Harvard Univ Press, Cambridge, MA).
4. Hawkes K, O’Connell JF, Jones NG, Alvarez H, Charnov EL (1998) Grandmothering, menopause, and the evolution of human life histories. Proc Natl Acad Sci USA 95:1336–1339.
5. Bennett PM, Harvey PH (1985) Brain size, development and metabolism in birds and mammals. J Zool 207:491–509.
6. Weisbecker V, Goswami A (2010) Brain size, life history, and metabolism at the marsupial/placental dichotomy. Proc Natl Acad Sci USA 107:16216–16221.
7. Street SE, Navarrete AF, Reader SM, Laland KN (2017) Coevolution of cultural intelligence, extended life history, sociality, and brain size in primates. Proc Natl Acad Sci USA 114:7908–7914.
8. Bruner JS (1972) Nature and uses of immaturity. Am Psychol 27:687–708.
9. Konner M (2010) The Evolution of Childhood: Relationships, Emotion, Mind (Harvard Univ Press, Cambridge, MA).
10. Bjorklund DF, Green BL (1992) The adaptive nature of cognitive immaturity. Am Psychol 47:46–54.
11. Bogin BA, Smith BH (1996) Evolution of the human life cycle. Am J Hum Biol 8:703–716.
12. Munakata Y, Casey BJ, Diamond A (2004) Developmental cognitive neuroscience: Progress and potential. Trends Cogn Sci 8:122–128.
13. Carlson SM (2005) Developmentally sensitive measures of executive function in preschool children. Dev Neuropsychol 28:595–616.
14. Gopnik A, Wellman HM (2012) Reconstructing constructivism: Causal models, Bayesian learning mechanisms, and the theory theory. Psychol Bull 138:1085–1108.
15. Johnson C, Wilbrecht L (2011) Juvenile mice show greater flexibility in multiple choice reversal learning than adults. Dev Cogn Neurosci 1:540–551.
16. Buonomano DV, Merzenich MM (1998) Cortical plasticity: From synapses to maps. Annu Rev Neurosci 21:149–186.
17. Werker JF, Hensch TK (2015) Critical periods in speech perception: New directions. Annu Rev Psych 66:173–196.
18. Kuhl PK (2004) Early language acquisition: Cracking the speech code. Nat Rev Neurosci 5:831–843.

19. German TP, Defeyter MA (2000) Immunity to functional fixedness in young children. 48. Legare CH, Nielsen M (2015) Imitation and innovation: The dual engines of cultural
Psychon Bull Rev 7:707–712. learning. Trends Cogn Sci 19:688–699.
20. Plebanek DJ, Sloutsky VM (April 1, 2017) Costs of selective attention: When children 49. Williamson RA, Meltzoff AN (2011) Own and others’ prior experiences influence
notice what adults miss. Psychol Sci, 10.1177/0956797617693005. children’s imitation of causal acts. Cogn Dev 26:260–268.
21. Sloutsky VM, Fisher AV (2004) When development and learning decrease 50. Buchsbaum D, Gopnik A, Griffiths TL, Shafto P (2011) Children’s imitation of causal
memory. Evidence against category-based induction in children. Psychol Sci 15: action sequences is influenced by statistical and pedagogical evidence. Cognition 120:
553–558. 331–340.
22. Seiver E, Gopnik A, Goodman ND (2013) Did she jump because she was the big sister 51. Berl RE, Hewlett BS (2015) Cultural variation in the use of overimitation by the Aka
or because the trampoline was safe? Causal inference and the development of social and Ngandu of the Congo Basin. PLoS One 10:e0120180.
attribution. Child Dev 84:443–454. 52. Legare CH (2017) Cumulative cultural learning: Development and diversity. Proc Natl
23. Lucas CG, Bridgers S, Griffiths TL, Gopnik A (2014) When children are better (or at Acad Sci USA 114:7877–7883.
least more open-minded) learners than adults: Developmental differences in learning 53. Morgan TJH, et al. (2015) Experimental evidence for the co-evolution of hominin tool-
the forms of causal relationships. Cognition 131:284–299. making teaching and language. Nat Commun 6:6029.
24. Gopnik A, Griffiths TL, Lucas CG (2015) When younger learners can be better (or at 54. Beck SR, Apperly IA, Chappell J, Guthrie C, Cutting N (2011) Making tools isn’t child’s
least more open-minded) than older ones. Curr Dir Psychol Sci 24:87–92. play. Cognition 119:301–306.
25. Byrne RW (1995) The Thinking Ape: Evolutionary Origins of Intelligence (Oxford Univ 55. Aplin LM, et al. (2015) Experimentally induced innovations lead to persistent culture
Press, Oxford). via conformity in wild birds. Nature 518:538–541.
26. Byrne R, Whiten A (1989) Machiavellian Intelligence: Social Expertise and the Evolution of Intellect in Monkeys, Apes, and Humans (Oxford Univ Press, Oxford).
27. Huttenlocher PR (1990) Morphometric study of human cerebral cortex development. Neuropsychologia 28:517–527.
28. Huttenlocher PR (2009) Neural Plasticity (Harvard Univ Press, Cambridge, MA).
29. Thompson-Schill SL, Ramscar M, Chrysikou EG (2009) Cognition without control: When a little frontal lobe goes a long way. Curr Dir Psychol Sci 18:259–263.
30. Chrysikou EG, et al. (2013) Noninvasive transcranial direct current stimulation over the left prefrontal cortex facilitates cognitive flexibility in tool use. Cogn Neurosci 4:81–89.
31. Bassett DS, et al. (2011) Dynamic reconfiguration of human brain networks during learning. Proc Natl Acad Sci USA 108:7641–7646.
32. Blakemore SJ, Choudhury S (2006) Development of the adolescent brain: Implications for executive function and social cognition. J Child Psychol Psychiatry 47:296–312.
33. Lebel C, Beaulieu C (2011) Longitudinal development of human brain wiring continues from childhood into adulthood. J Neurosci 31:10937–10947.
34. Kaelbling LP, Littman ML, Moore AW (1996) Reinforcement learning: A survey. J Art Int Res 4:237–285.
35. Gopnik A, et al. (2004) A theory of causal learning in children: Causal maps and Bayes nets. Psychol Rev 111:3–32.
36. Perfors A, Tenenbaum JB, Griffiths TL, Xu F (2011) A tutorial introduction to Bayesian models of cognitive development. Cognition 120:302–321.
37. Tenenbaum JB, Kemp C, Griffiths TL, Goodman ND (2011) How to grow a mind: Statistics, structure, and abstraction. Science 331:1279–1285.
38. Gopnik A (2012) Scientific thinking in young children: Theoretical advances, empirical research, and policy implications. Science 337:1623–1627.
39. Xu F, Kushnir T (2013) Infants are rational constructivist learners. Curr Dir Psychol Sci 22:28–32.
40. Kushnir T, Xu F, eds (2012) Rational Constructivism in Cognitive Development (Academic, Cambridge, MA), Vol 43.
41. Pearl J (2009) Causality (Cambridge Univ Press, Cambridge, UK), 2nd Ed.
42. Griffiths TL, Chater N, Kemp C, Perfors A, Tenenbaum JB (2010) Probabilistic models of cognition: Exploring representations and inductive biases. Trends Cogn Sci 14:357–364.
43. Bonawitz E, Denison S, Griffiths TL, Gopnik A (2014) Probabilistic models, learning algorithms, and response variability: Sampling in cognitive development. Trends Cogn Sci 18:497–500.
44. Denison S, Bonawitz E, Gopnik A, Griffiths TL (2013) Rational variability in children’s causal inferences: The sampling hypothesis. Cognition 126:285–300.
45. Bonawitz E, Denison S, Gopnik A, Griffiths TL (2014) Win-stay, lose-sample: A simple sequential algorithm for approximating Bayesian inference. Cognit Psychol 74:35–65.
46. Kirkpatrick S, Gelatt CD, Jr, Vecchi MP (1983) Optimization by simulated annealing. Science 220:671–680.
47. Horner V, Whiten A (2005) Causal knowledge and imitation/emulation switching in chimpanzees (Pan troglodytes) and children (Homo sapiens). Anim Cogn 8:164–181.
56. Kawamura S (1959) The process of sub-culture propagation among Japanese macaques. Primates 2:43–60.
57. Aplin LM, Sheldon BC, McElreath R (2017) Conformity does not perpetuate suboptimal traditions in a wild population of songbirds. Proc Natl Acad Sci USA 114:7830–7837.
58. Perry SE, Barrett BJ, Godoy I (2017) Older, sociable capuchins (Cebus capucinus) invent more social behaviors, but younger monkeys innovate more in other contexts. Proc Natl Acad Sci USA 114:7806–7813.
59. Crone EA, Dahl RE (2012) Understanding adolescence as a period of social-affective engagement and goal flexibility. Nat Rev Neurosci 13:636–650.
60. Piekarski DJ, et al. (2017) Does puberty mark a transition in sensitive periods for plasticity in the associative neocortex? Brain Res 1654:123–144.
61. Blakemore SJ, Mills KL (2014) Is adolescence a sensitive period for sociocultural processing? Annu Rev Psychol 65:187–207.
62. Cardoos SL, et al. (2017) Social status strategy in early adolescent girls: Testosterone and value-based decision making. Psychoneuroendocrinology 81:14–21.
63. Braams BR, Peters S, Peper JS, Güroğlu B, Crone EA (2014) Gambling for self, friends, and antagonists: Differential contributions of affective and social brain regions on adolescent reward processing. Neuroimage 100:281–289.
64. Braams BR, van Duijvenvoorde ACK, Peper JS, Crone EA (2015) Longitudinal changes in adolescent risk-taking: A comprehensive study of neural responses to rewards, pubertal development, and risk-taking behavior. J Neurosci 35:7226–7238.
65. Steinberg L, et al. (2017) Around the world, adolescence is a time of heightened sensation seeking and immature self-regulation. Dev Sci, 10.1111/desc.12532.
66. Cheng PW (1997) From covariation to causation: A causal power theory. Psychol Rev 104:367–405.
67. Kelley HH (1967) Attribution theory in social psychology. Nebraska Symposium on Motivation 15:192–238.
68. Jones EE, Harris VA (1967) The attribution of attitudes. J Exp Soc Psychol 3:1–24.
69. Horhota M, Blanchard-Fields F (2006) Do beliefs and attributional complexity influence age differences in the correspondence bias? Soc Cogn 24:310–337.
70. Morris MW, Nisbett RE, Peng K (1995) Causal attribution across domains and cultures. Causal Cognition: A Multidisciplinary Debate, eds Sperber D, Premack D, Premack AJ (Clarendon Press, Oxford), pp 577–612.
71. Pfeifer JH, et al. (2009) Neural correlates of direct and reflected self-appraisals in adolescents and adults: When social perspective-taking informs self-perception. Child Dev 80:1016–1038.
72. Pfeifer JH, Lieberman MD, Dapretto M (2007) “I know you are but what am I?!”: Neural bases of self- and social knowledge retrieval in children and adults. J Cogn Neurosci 19:1323–1337.
73. Gee DG, et al. (2013) Early developmental emergence of human amygdala-prefrontal connectivity after maternal deprivation. Proc Natl Acad Sci USA 110:15638–15643.
74. Nettle D, Frankenhuis WE, Rickard IJ (2013) The evolution of predictive adaptive responses in human life history. Proc Biol Sci, 10.1098/rspb.2013.1343.
75. R Core Team (2012) R: A language and environment for statistical computing. (R Foundation for Statistical Computing, Vienna).

How language shapes the cultural inheritance of categories
Susan A. Gelman^a,1 and Steven O. Roberts^a
^a Department of Psychology, University of Michigan, Ann Arbor, MI 48109

Edited by Andrew Whiten, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark April 12, 2017
(received for review January 12, 2017)

It is widely recognized that language plays a key role in the transmission of human culture, but relatively little is known about the mechanisms by which language simultaneously encourages both cultural stability and cultural innovation. This paper examines this issue by focusing on the use of language to transmit categories, focusing on two universal devices: labels (e.g., shark, woman) and generics (e.g., “sharks attack swimmers”; “women are nurturing”). We propose that labels and generics each assume two key principles: norms and essentialism. The normative assumption permits transmission of category information with great fidelity, whereas essentialism invites innovation by means of an open-ended, placeholder structure. Additionally, we sketch out how labels and generics aid in conceptual alignment and the progressive “looping” between categories and cultural practices. In this way, human language is a technology that enhances and expands the categorization capacities that we share with other animals.

language | categories | essentialism | norms | children

It is broadly agreed that language is a distinctive human capacity and a powerful engine of cultural transmission. As such, language is important to the theme of this special issue (1). No matter how sophisticated the cultural transmission systems of nonhuman species (and they are astonishingly sophisticated; see papers in this issue) (2), they proceed without language. We can thus ask what language distinctively contributes to cultural transmission in humans and (more speculatively, but importantly) what language may distinctively contribute to cultural evolution in humans. Recent evidence from language learning in children provides new insights into these questions.

In this paper, we focus specifically on a key universal element of language, category labels (e.g., dogs, gold, women, Muslims), and their central role in the transmission and evolution of category representations. The argument, in brief, is that category labels work in an almost paradoxical way to ensure stability in the transmission process, but simultaneously to permit and even foster conceptual change. On the one hand, words are conventional and prescriptive, and provide a stable representation that is easily shared with great fidelity, but on the other hand, words have an open-ended “placeholder” structure that invites innovation. We suggest that this dual capacity contributes to what is distinctive in human cultural evolution.

Propositions vs. Presuppositions
Maynard Smith and Szathmáry argued that language is one of the major transitions in the evolution of complexity, specifically in the intergenerational transmission of information: “We accept [the origin of human speech] as being the decisive step in the origin of specifically human society” (3). Kirby et al. (4) similarly note that “Language is unique in being a system that supports unlimited heredity of cultural information, allowing our species to develop a unique kind of open-ended adaptability.” And Pagel (5) likewise refers to “language’s role in the transmission of the information that makes our societies possible.”

The most obvious way that language transmits information is via explicit declarative propositions (e.g., “You can crack open a nut using a rock”; “I’ll give you my money if you put down that gun”; “Don’t trust Joe. He lies constantly”), which can share ideas, negotiate trades, deceive enemies, impress potential mates, affect reputations, and so forth. The expressive capacity of human language is virtually unlimited because of its hierarchical, combinatorial structure (6). In contrast to the communication systems of other organisms (even those as impressive as whales, bees, or vervet monkeys), human language is generative: it permits infinitely many messages to be constructed out of a limited number of elements. This remarkably flexible system has obvious survival value, as it is used in the “cognitive arms race” of competitive feedback loops implicated in cooperative interactions that involve and must deal with cheating and cheating-detection (7).

However, much of what human language conveys is not explicitly articulated via propositional content, but rather is implied via presupposition, implicature, and other forms of inference (8). Four examples follow.

(i) Language marks social identity through variation. There are roughly 6,000 human languages around the globe, mutually unintelligible, and (with rare exception) fully learnable only in childhood. These aspects materially affect with whom one can communicate and coordinate, and from whom one can learn. Even among those who speak the same language, accent and dialect reveal a person’s cultural origins, and so serve as honest signals to identity, with consequences for whom others choose to interact with and which models others trust to imitate and learn from (9).

(ii) Language directs a person’s attention in the moment by means of structural features of the grammar. Different linguistic communities focus on different aspects of experience, and in so doing indicate what is important (10). For example, Japanese has an honorific system that requires a speaker to decide level of politeness; Quechua has an evidential system for expressing how a speaker comes to know something: directly seeing vs. hearsay. There is debate regarding the role of these differences on nonlinguistic cognition (11, 12). But at a minimum, these structural features affect a person’s thinking in the moment of speaking (13), including what information gets encoded and transmitted within a social interaction.

(iii) Language transmits information through a rich system of pragmatic implications (14, 15). Communication involves inferring the speaker’s intentions, a complex process that builds on theory-of-mind capacities (16). Pragmatic inferences not only allow a listener to infer a speaker’s meaning, but also to learn about properties of the world (17).

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: S.A.G. and S.O.R. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. A.W. is a guest editor invited by the Editorial Board.
^1 To whom correspondence should be addressed. Email: gelman@umich.edu.

7900–7907 | PNAS | July 25, 2017 | vol. 114 | no. 30 www.pnas.org/cgi/doi/10.1073/pnas.1621073114


(iv) Language provides cognitive tools that aid with recall, transmission, and manipulation of concepts that otherwise can be difficult to hold in mind. For example, number words help speakers of a language, such as English or Turkish, remember and communicate exact cardinalities of sets; speakers of a language without number words (such as Pirahã) perform poorly on numerical reasoning tasks that tap into these processes (18). In this way, language provides tools much like any other informational technology, such as Arabic numerals, written language, the abacus, or even computers (19).

Our focus is on this last sort of presupposition: language as a cognitive tool. We focus on how human languages represent categories, by means of two universal devices: labels and generics. We argue that these devices convey two important presuppositions: that categories are normative and that categories have essences. We further suggest that these presuppositions are both constraining (leading to stability) and generative (leading to innovation) in the process of cultural transmission.

Categories as Cultural Inheritance
Categories are mental representations in which perceptibly distinct entities are treated as alike (e.g., the category “apple” permits one to identify a variety of different apples as edible). Every animal species uses categories to organize their representations of experience, identify newly encountered instances, and make predictive inferences, from pigeons identifying food to voles identifying kin. Categories are also a foundational component of socially transmitted behaviors, such as tool use (categories are needed to identify potential tools), vocalizations to warn of danger (categories are needed to identify predators), or rituals to maintain group cohesion (social categories are needed to decide whom to copy) (20, 21).

For humans, categories themselves are a key part of our cultural inheritance, which is to say, they exhibit learned, socially transmitted variation that cannot be explained by genetic or environmental factors (2). We are not born with a fixed set of categories (no one is born knowing of screwdrivers, or that whales are mammals, or that girls wear pink). Nor do we simply pick up on discontinuities in the biological world; rather, human categories have a cultural overlay. We see this in categories of natural kinds, social kinds, and artifacts, all of which display tremendous linguistic, cultural, and regional variation. Classifications of the natural world vary in which animals, plants, or substances are classified as edible, which are classified as medicinal, and which are classified as clean/unclean (22). Classifications of the social world vary in how gender, race, and social hierarchies are organized (23, 24). Classifications of artifacts vary in the very entities there are to be classified, with distinct types of tools, clothing, furniture, and so forth, as well as category boundaries (25). And there is marked linguistic variation in the classification of dimensions of experience, including color, number, time, space, emotions, even senses (26, 27).

Although categories can be acquired asocially by individuals via direct observations and interactions with the world, human languages provide a socially transmitted system for efficiently communicating information about which categories there are, what belongs in those categories, and which attributes those categories possess. Universally, languages use two devices for the intergenerational transmission of categories: labels (names for categories, such as “shark” or “woman”) and generics (generalizations about named categories, such as “sharks attack swimmers” or “women are nurturing”).

Labels express concepts that have some cultural significance; whereas there are indefinitely many concepts one can generate (e.g., “items weighing more than 500 grams,” including vultures and the Oxford English Dictionary but not a small grapefruit), only a subset of these ideas are lexicalized, and of these, only a subset are maintained in a language over time (28). Words are distinctive to humans in their number (typically about 50,000 in an adult speaker, many of which are names for things), conceptual precision (e.g., chase vs. flee), and need to be learned (29). Children devote considerable time and effort to amassing words, learning roughly 14,000 words by age 6, which averages to learning nearly one new word every waking hour from 18 mo through 6 y of age (30). Although category membership can be inferred without language (e.g., recognizing an animal as a snake based on its shape and movement), labels have informational capacity beyond direct observation. Even for young children, they can convey surprising category membership of an individual item (e.g., that a legless lizard is not a snake) (31) or introduce wholly new categorical distinctions (e.g., to distinguish animals based on subtle variations in antennae rather than overall body shape) (32).

Generics are generalizations that refer to a category directly (e.g., “birds fly”) (33). In contrast to specific utterances (e.g., “Did you see that bird?”), generics convey information that extends beyond the current context and indeed in principle cannot be demonstrated directly. Importantly, generics are input to children’s developing knowledge systems, as they are frequent in child-directed speech and acquired by about 2.5 y of age (34, 35). As soon as children master the syntactic prerequisites for expressing generics in their language (e.g., in English: plurals, articles, and tense), they produce generics (“Does lions crawl?”; “I don’t like babies that cry”), comprehend generics as kind-referring and distinct from specific reference, and recall whether information was provided using generic or specific language (35–37). Generics have been attested in all documented languages, including pidgins and Pirahã (38–42). Children’s early capacity to learn generics is particularly striking, given that generic referents are abstract (one cannot point to a kind, only to instances of a kind) and their semantics cannot be reduced to a particular quantity (unlike “some,” “most,” or “all”) (43–45). For example, although “Lions have manes” is acceptable despite applying only to male lions, “Lions are male” is semantically unacceptable, and preschool children understand this (46).

Two Presuppositions: Norms and Essences
On a strict reading, labels communicate the category to which something belongs, and generics communicate some fact, opinion, or belief about a category. This is the explicit informational value of these expressions. However, labels and generics in actual interpersonal use imply more than these literal meanings, and indeed we would argue that appropriate use of these expressions requires understanding these implications. Next, we review two conceptual presuppositions embedded in the use of labels and generics: that categories are normative (i.e., conventional and prescriptive) and that categories have essences.

Categories as Normative. A social norm is a shared, socially constructed, context-specific rule that indicates what is (or is not) socially appropriate (21, 47). Labels are fundamentally normative in that they are conventional (i.e., shared with other members of the speech community, a principle required for their successful use) (48, 49). A person who did not appreciate the normative value of labels might arbitrarily substitute vocalizations of their own invention for words that they hear from others (e.g., you call that a hammer, but I’ll call it a blicket), and the whole communicative enterprise would never get off the ground. By the time of their first word, children appreciate the conventionality principle, expecting novel labels used by one speaker to be understood by others within the speech community (50). Not all behaviors are treated the way that labels are treated; for example, infants assume that preferences are individually varying rather than shared or conventional (51). Infants also appreciate that language operates via a division of linguistic labor, whereby more knowledgeable members of the community can be trusted to provide accurate labeling (52). From an early age, children are sensitive to social variation in labelers, for example preferentially accepting labels from adults over children and experts over novices (53), an expectation that fosters conformity. A powerful consequence of this principle is that even a simple relabeling can shift children’s label

use, with only minimal explanation, as in the following examples of parents looking through a picture book with their children (34):

i) Child: That’s kangaroo. (Pointing to an aardvark.)
Mother: Well, that looks like a kangaroo, but it’s called an aardvark.
Child: Aardvark.

ii) Child: That’s a snake. (Referring to an eel.)
Mother: It looks like a snake, doesn’t it? It’s called an eel. It’s like a snake, only it lives in the water. And there’s another one.

In experimental settings as well, children accept relabelings from others, even when they compete with perceptual evidence that children directly experience (54).

Generic information is likewise assumed to be conventional and shared with others rather than idiosyncratic, private, or subjective. Even in prelinguistic communication, an action that is displayed to others is more often assumed to be generic than information that is done for the actor himself or herself (55). With regard to language, young children treat generic statements as conveying information that is widely known (56–58). Generics are particularly frequent in pedagogical (information transmitting) contexts, such as book reading, and when taking on a pedagogical role, such as talking to a more ignorant interlocutor or pretending to be a teacher (59, 60). Although generics can express idiosyncratic or subjective perspectives (e.g., “Vegemite is delicious!”), expressing this generically implies a general truth, even to preschool children (61).

Category-referring language is normative in a stronger, prescriptive sense as well. That is, labels and generics imply that a feature linked to a category not only is but also should be (62, 63). This is particularly so for generic language, which expresses norms that may even compete with statistical observations: “Boys don’t cry” is deemed true (despite being demonstrably false) because it expresses a norm (64–66). Similarly, generics such as “Scientists care about the truth” express abstract values rather than descriptively accurate features (67). Parents likewise produce generics that express prescriptive norms that conflict with the reality in the moment (e.g., “Remember, we don’t stand up on chairs”; “Oh, no, you don’t pull on books”; see http://childes.talkbank.org/access/Eng-NA/Brown.html).

Generic language leads to normative judgments, even when the categories are novel and the content is innocuous. In a series of experiments, children 4–13 y of age learned of two novel groups that contrasted with one another in some harmless behavior, such as the music they listen to or the food they eat (e.g., Hibbles listen to one kind of music, and Glerks listen to another kind of music). Children reported that it was “not OK” for an individual to fail to conform to the group behavior (e.g., for a Hibble to listen to music that is more typical of Glerks) (68). In other words, children interpreted an unfamiliar descriptive regularity as if it were prescriptive (see also refs. 69–71 for additional evidence that descriptive and prescriptive norms are conflated in children’s and adults’ concepts). Language plays an important role in licensing this normative response: when the vignettes depicted individuals (not groups) that received category labels in either specific statements (e.g., “This Hibble listens to this kind of music”) or generic statements (e.g., “Hibbles listen to this kind of music”), children made normative judgments; when the vignettes depicted individuals without category labels or generics (e.g., “This listens to this kind of music”), children did not make normative judgments (72). Thus, category labels and generic statements license a prescriptive reading of novel, innocuous behaviors: they imply that members of the labeled group should behave a certain way. The establishment of norms is itself a mechanism that fosters the stability of group behaviors (70).

Essentialism.

[Essence is] the very being of anything, whereby it is what it is. And thus the real internal, but generally . . . unknown constitution of things, whereon their discoverable qualities depend, may be called their essence.
Locke (73)

A striking aspect of human categories, and of the words that express them, is that they often defy appearances: stick-bugs look like sticks, pyrite looks like gold. It is not surprising that scientific categories extend beyond the obvious, given that the natural world provides evolved mechanisms that lead appearances to mislead, including homologies, camouflage, mimicry, and sexual dimorphism. What is notable is that nonscientists, including children, share the expectation that categories have hidden structure and that words in ordinary language (e.g., bug, gold) capture this structure (74). This expectation contrasts with classic theories of cognitive development, which propose that young children are “perceptually bound” thinkers, and that concepts shift from similarity-based to conceptually based over development (75, 76).

We refer to this assumption as “psychological essentialism”: an intuitive belief that categories of the natural world share not just observable features, but also a deeper, nonobvious reality: they “carve up nature at its joints” (74, 77, 78). Thus, tigers share more than a certain size, gait, striped fur, and ferocity, but also internal parts, temperament, instincts, as well as an innate, unchanging tiger “essence.” This essence might be blood, DNA, or even an unspecified, unknown placeholder, an expectation that there is an essence without knowing what it might be. For example, young children report that an animal’s behavior is caused by its own insides or energy before they can have detailed expectations about the particular form that such causal force might take (79–81).

Evidence for psychological essentialism comes from research with adults as well as young children (74, 82). Even in infancy, children expect members of a category to share internal, nonobvious, or causal similarities, even in the face of superficial dissimilarities (31, 83–85). Boundaries between categories are treated as discontinuous and objectively correct, and category membership itself is viewed as immutable (24, 86–89). Category members are thought to have innate potential that resists environmental influences (90–92). Internal bodily organs are thought to have the power to modify the recipient’s behavior (93, 94). That essentialist beliefs have been documented in young children and across a variety of cultural contexts suggests that essentialism is a fundamental component of human cognition (23, 95–99). Although which categories are essentialized varies cross-culturally, especially for categories of people (such as race, gender, or ethnicity) (100), essentialism of both natural kinds and social kinds has been broadly and consistently documented (101–103). Essentialized social categories have important implications for evolutionarily significant behaviors in humans, including patterns of affiliation, mating, reproduction, and conflict. For example, essentialized social categories are often conceptualized as less human and more threatening than nonessentialized social categories (104, 105), and both children and adults are reluctant to share resources with members of essentialized out-groups (101, 106).

Essentialist expectations are linked to category labels. Hearing that a pterodactyl is a “dinosaur,” that a swaddled baby is a “boy,” or that a child received the heart of a “monkey” leads to the inference that the pterodactyl does not live in a nest, that the baby will grow up to like football regardless of its upbringing, and that the donated heart will confer a slight but inevitable uptick in one’s tendency to eat bananas. Essentialist expectations attach also to wholly novel labels applied to wholly novel categories (107–109). This is not to say that labels automatically trigger essentialist reasoning; they do not (74, 110). However, when a label is applied to a category that has some coherent conceptual basis (e.g., shared features) then essentialist beliefs follow (32).



Labels may play a particularly important role for social cate- Conformity and Innovation
gories, given how culturally variable they are (86, 111). Biological evolution requires inheritance and mutations. Simi-
Generics also facilitate the social transmission of essentialism. larly, cultural evolution requires both conformity and innovation
They have two semantic features that support essentialism: they (134–137), and we suggest that the linguistic presuppositions
express properties that are timeless and nonaccidental (e.g., discussed above—norms and essentialism—contribute to both
“birds have hollow bones”), and they minimize within-category these processes. Because labels and generics are fundamentally
variability (e.g., “birds lay eggs,” even though only adult females normative (conventional and prescriptive), they provide stable
do so). Preschool children appreciate both these points (46, 58). representations that are easily shared with great fidelity. Because
Moreover, hearing novel generics about novel categories leads to labels are understood to be conventional and shared among
more within-category inferences (36, 112), assumption of core members of one’s language group, a child who hears a word in
features (57), and essentialist inferences about that category, context rapidly maps an initial meaning on the basis of a single
above and beyond labeling per se (108, 109). exposure (30, 138), although a full understanding emerges more
Elsewhere we have argued that essentialism is a flawed on- slowly and gradually (30, 139). Because generic information is
tology that oversimplifies by underestimating within-category variability, overestimating between-category differences, and assigning too much causal significance to imputed essences (74). Viewing biological categories as immutable, and viewing variation as only superficial, contributes to persistent misconceptions about evolutionary processes, genetics, and other aspects of science (113–116). Attributing existing patterns of social inequities to hidden, inherent, and inalterable causes in individuals is an oversimplification that ignores structural and historical factors (117, 118) and contributes to a variety of social ills, including stereotyping, prejudice, and discrimination (102, 119–125). For example, essentialist beliefs about gender promote disrespect and lowered expectations toward girls and women in schools and academia (104, 126), and essentialist beliefs about “Blackness” predict the perception of Black people as less than human, which subsequently predicts greater violence toward Black children and increased rates of applying the death penalty toward Black men in the United States courts system (127, 128).

So then, why do we essentialize? In the words of Medin (78), “psychological essentialism is bad metaphysics . . . [but] may prove to be good epistemology.” In other words, essentialism is factually wrong but heuristically useful. Essentialism promotes learning and conceptual change by providing a placeholder structure that promotes the search for underlying causes and modifications over time. The evidence reviewed above demonstrates the placeholder notion of essentialized concepts in three interrelated respects: (i) children expect items with the same label to share nonobvious similarities that they have not yet learned; (ii) children are guided by labeling and generics even when in competition with children’s own direct experiences (e.g., “a whale isn’t a fish”; “boys don’t cry”); and (iii) labeling and generics operate according to a “division of linguistic labor,” whereby children defer to more expert others to inform them about the classifications and generalizations of experience (129–131). Importantly, these placeholder expectations, in turn, permit and promote conceptual innovation, because children’s classifications build upon the expertise of others, and because children are motivated to search for underlying causal similarities that members of a category share. That even young children view categories in essentialist ways suggests that categories are not just structures for organizing what is already known, but placeholders for further knowledge that is expected to accrue. The meaning of a word is not a list of known features or learned facts. Rather, a word serves as an invitation to form a category (132) and to extend and modify it with growing knowledge and expertise. “Dog” is not a tag for a fixed set of observed features, but rather a pointer to “things of that nature,” where the “nature” will be filled in via learning and input from others. Here we endorse Putnam’s (133) famous assertion that “‘meanings’ just ain’t in the head!” Words refer to placeholder concepts that do not have fixed content and thus can be modified. Language “is not a mirror of our inner states but a complement to them. It serves as a tool whose role is to extend cognition in ways that on-board devices cannot” (19).

viewed as not only descriptively accurate but also as prescriptively correct, children may judge that failure to conform to generically stated category regularities (such as acting at variance with one’s in-group) is wrong or even risks punishment (140). At the same time, the open-ended, placeholder structure of essentialism, also implied by labels and generics, invites category change. This is perhaps most evident in the ease with which children accept counterintuitive labels offered by knowledgeable others (e.g., accepting that a legless lizard is not a snake) as well as the inductive potential of labels and generics, wherein new information is rapidly learned and generalized to new instances. These presuppositions suggest a transmission process that fosters change, at the same time that it resists change in the transmission process.

In this section we suggest two additional mechanisms by which the linguistic representation of categories may promote cultural transmission and cultural evolution: they transform variable input into categorical representations, and (in the case of human kinds) they involve looping effects between categories and the people being categorized.

From Variable Input to Categorical Representations. A uniquely human aspect of language is that it takes variable, idiosyncratic experiences and transforms them into discrete, symbolic, shared representations (28). The world is a complex, continuously dynamic array of sensory inputs, and no two people experience identical environmental cues. The experience of categories is thus doubly variable: in the range of instances that an individual encounters and in the experiences of individuals across the language community. Language reduces and regularizes this remarkable variety. Consider the use of a simple word, “bird,” which extends from hummingbirds to dodos, from downy chicks to vicious birds of prey. We converge on a shared label, regardless of our varied experiences: which birds we have seen or heard, which ones we have owned or eaten, whether our experience comes from real-world encounters, plush toys, or Big Bird.

This gap between the variability of experience and the commonality of labels presents a puzzle: “If biological and real world constraints are not enough then how is it nevertheless possible for a group to arrive at a sufficiently shared set of conceptual distinctions to make language possible?” (141). In other words, the transmission of language requires conceptual alignment or compatible mental representations that are abstracted away from varying experiences and knowledge bases (142, 143).

We suggest that the manner in which labels and generics abstract away from experience aids in conceptual alignment. Category labels abstract away from the particulars that make individuals unique (a small poodle and a large Great Dane are both “dogs”), and generics abstract away from any particular context (“birds fly,” even when the only birds in sight are penguins) (144, 145). Speakers don’t require shared experiences to have a shared system of communication. A 12-mo-old infant and a biologist can communicate with the word “dog,” despite radically different understandings.

Generics are particularly well-suited for expressing abstract, shared representations because, as noted earlier, they systematically

underplay variations in experience by glossing over exceptions. Generics are not disconfirmed by counterexamples (the existence of a nonflying bird does not disconfirm the generic claim that birds fly) (45, 46), which means that generic messages can trump a listener’s personal experiences. People produce generics about features they consider conceptually important (e.g., dangerous or distinctive), even when they know them to be variably present in a category, but those who hear such generics (whether adults or young children) tend to assume that the feature is almost universally present among category members (43, 146). This results in systematic distortions in the transmission process, from variability to category-wide consistency.

The Looping Effect of Human Kinds. Hacking (147) speaks of a “looping effect” in social categories: specifically, that classifications of people have cognitive consequences for those that are classified, which feed back into these same classifications:

“To create new ways of classifying people is also to change how we can think of ourselves, to change our sense of self-worth, even how we remember our own past. This in turn generates a looping effect, because people of the kind behave differently and so are different. This is to say the kind changes, and so there is new causal knowledge to be gained and perhaps, old causal knowledge to be jettisoned. . . . that new knowledge in turn becomes part of what is to be known about members of the kind, who change again. . . . Kinds are modified, revised classifications are formed, and the classified change again, loop upon loop” (147).

We have sketched out linguistic mechanisms that may contribute to this looping effect. Labels and generics stake out categories, which then are altered through human action to reify such categories. In contrast to Hacking, however, we see this looping effect not only for categories to which one belongs, but also for categories of others. History is replete with modifications that differentiate groups. Thus, for example, male/female differences are exaggerated by differences in clothing, hairstyles, gait, bodily deformations (e.g., foot-binding), and styles of speech. Modifications may be imposed (e.g., Jews in World War II Germany being required to wear stars) or chosen (e.g., fashions worn by self-identified hipsters). Social groups may be physically separated, either by explicit policy (e.g., segregationist policies toward Blacks in the southern United States; Japanese internment camps in the United States during World War II) or by other practices and constraints (e.g., low-income families restricted to neighborhoods with unclean water and air). Concepts of human kinds may lead to a cyclical pattern in which cultural practices lead groups to appear more distinct from one another, which confirms the categorizations, leading to more differentiating practices, and so forth. Viewing social kinds as having deep differences has cycling effects on behaviors that contribute to the reality of that social kind.

Norms and Essentialism in Nonhuman Species

Are nonhuman animals also capable of learning categories with prescriptive implications and a nonobvious basis? This question is timely, given recent discoveries of remarkably sophisticated categorization and social transmission abilities in nonhuman animals (see other papers in this issue). For example, consider an ingenious experiment demonstrating that chimpanzees conform to cultural (descriptive) norms of tool use (148). The researchers first taught a high-ranking chimpanzee one of two manners of tool use to obtain food out of a puzzle box (e.g., using either a poking or a lifting motion). When let loose within the group, other members picked up the demonstrated solution strategy, even adhering to the method common in the group after having successfully used the alternative method. Certainly language was not required.

Nonhuman primates are also capable of categorizing based on nonperceptible features. For example, baboons engage in sophisticated categorizations of conspecifics, with dominance hierarchies that simultaneously rank by individual rank and family group using matrilineal kinship, friendships, and causal theories (149). These categories have a nonobvious basis (e.g., infants “inherit” the rank of the mother) and are learned (e.g., members need to learn which individuals fall into which group). Seyfarth and Cheney propose: “. . . when it comes to recognizing matrilineal kin groups, baboons are ‘essentialists’ . . . They act as if the members of kin groups ‘have essences or underlying natures that make them the things that they are’” (149).

Monkeys and great apes can also track category membership across radical featural transformations, and privilege kind (essential features) over superficial appearance (surface features). For example, one study presented rhesus macaques with food items in which the inner identity was transformed (e.g., an apple was disguised as a coconut) (150). After a piece of this transformed fruit was placed in a container, if the animal reached in and found a piece that matched the appearance rather than the inner kind, they continued to search for another piece, indicating that they had been expecting the sample to match the inner kind and not the appearance. The researchers interpret the findings as “evidence that macaques share this one primitive aspect of psychological essentialism” (150). Similarly, in another study, bonobos, orangutans, and chimpanzees viewed a transformation process in which one piece of food was disguised to look like another (e.g., a carrot slice was disguised as a banana slice) (151). When given a choice between a true piece of banana versus a disguised piece of carrot that only looked like a banana, animals preferred the true banana. The authors interpret this as “a kind of psychological essentialism, perhaps the phylogenetically and ontogenetically most basic one” (151). Again, language was not required to consider an appearance–reality conflict and to privilege the inner identity.

These impressive capacities demonstrate that humans share with at least other primates the ability to categorize based on subtle, nonperceptible cues, and the ability to conform to normative regularities (although conformity is substantially greater in humans) (152). Indeed, norms and essentialism may precede language in human development, as preverbal infants infer general ways of interacting with objects from pedagogical demonstrations, evaluate others based on their social interactions, categorize based on nonobvious features, and distinguish individuals from kinds (55, 80, 153–155).

Nonetheless, we suggest four key respects in which human language may be unique in fostering the social transmission and evolution of categories.

Efficiency in Transmitting Category Information. First and most obviously, labels and generic language ensure speed, fidelity, and ease of transmitting category information, by means of an overt and stable representational format. This would be difficult (perhaps impossible) to achieve by means of actions alone. (Note that language is not necessarily more efficient for transmitting all sorts of information. For example, showing the location of an object is likely more efficiently done by pointing; teaching weaving is likely more efficiently done by demonstration.) Consider the case of conveying that an item is not what it appears to be. The studies with nonhuman primates required a lengthy and rather elaborate shared context (the transformation process itself), carried out by an expert with special tools and procedural know-how. Someone who was not present during this demonstration would not have access to the relevant information. Contrast this with the human language case, which efficiently corrects a misconception with a single sentence (“This looks like a banana, but really it’s a carrot”). Anyone who hears the new label, even a nonexpert or young child, could then share it with others, ensuring a transmission chain. Consider, too, the case of conveying the scope of a feature: if eating a mushroom makes you sick, is it because of that particular mushroom (e.g., maybe it was rotten or was sprayed by pesticides) or mushrooms of that type more generally? Again, this is efficiently conveyed via generic

language (“Death cap mushrooms are poisonous”), but difficult (perhaps impossible) to convey nonlinguistically. Notably, generic information is conveyed equally well whether it expresses preference or avoidance, whereas nonlinguistic social learning mechanisms may be asymmetric in this regard (e.g., the Norway rat can learn which foods to try by sniffing the breath of conspecifics, but cannot learn which foods to avoid by sniffing the breath of a sick conspecific) (156, 157).

Conceptual Innovation and Change. Language can evoke conceptual change, not just providing new information (e.g., that dandelions are edible) but also abandoning an old classificatory framework (e.g., learning that plants are alive). Human classification systems undergo reorganizations throughout history, and naming patterns have shifted to accommodate these changes (158). It is less clear that nonhumans engage in conceptual change. Consider the sad case of seabirds that consume plastic they find in the ocean, resulting in poisoning and malnutrition. They do so because the chemical odor of the plastic is similar to that of dimethyl sulfide, a compound found in marine algae (159). The birds are effectively tricked into eating plastic because it smells like food. Thus, a categorization capacity that was useful for locating food went awry when the environment changed. It is not clear how one could convince seabirds to abandon this classification system, even when it’s a matter of survival.

Scope of Application. In nonhuman animals, the examples we have seen of sophisticated social transmission, adherence to group norms, and nonobvious categorizations fall within a narrow set of domains, primarily involving food and within-group social relations (e.g., mating, dominance). In contrast, human norms and essentialism extend beyond content with obvious survival value to include any aspect of experience. Essentialism applies to natural substances, living kinds, human social groups, personal characteristics, diseases, and in some respects even artifacts (31, 82, 113, 160–165). Similarly, normative expectations extend to a vast array of behaviors, including which clothing to wear, which music to listen to, or which games to play (68).

From Models to Morals. Although nonhuman animals are capable of conforming to high-ranking group members (copying modeled behaviors) and “punishing” others by retaliating when they are wronged, we are unaware of evidence that they display moral condemnation or punishment of nonconformity in others. For example, in one study with chimpanzees, an actor could punish a thief by depriving them of food reward (via trapdoor) (166). The actor only retaliated when their own food was stolen, not when another chimpanzee’s food was stolen. This is in sharp contrast to the findings with young children, who exhibit strong moral evaluations of others (47, 167, 168). One might say that social transmission processes in nonhuman animals provide models of what behaviors are possible (i.e., models), whereas social transmission processes in human animals provide models of what behaviors are appropriate (i.e., morals).

Conclusions

The evolution of culture involves not only behavioral practices and material artifacts, but also the representation of these practices and artifacts in the human mind, including categories. Cultural evolution (as opposed to mere change) entails an increase in diversity and complexity; it cannot just be the recycling of behaviors (169, 170). We suggest at least three ways that categories can be said to evolve. First, as technologies evolve, so do the categories they belong to and the labels that express them. For example, in English we have words for hammers, trucks, and violas (all invented technologies), and in many languages, old words are refitted to accommodate new inventions (e.g., “fire vehicle” means train in Mandarin). Young children have no difficulty acquiring these words; they do not lag behind the acquisition of categories that were in our distant evolutionary history. Second, as theories change and evolve, so do our labeled categories: a whale is no longer a fish; Pluto is no longer a planet; “female hysteria” is no longer a disease. And third, human kinds arguably become increasingly diversified and complex by means of “looping effects.”

Here, however, it is important to note that the evolution of categories can be negative as well as positive. Cumulative cultural change can be a good thing: tools get more sophisticated, social organizations get more complex, means of food production get more varied. But there is negative ratcheting as well, in the form of bigotry, polarization, and the perpetuation of social hierarchies.

Levinson notes the special role of language in the process of enculturation of cognition: “. . . language appears to play a crucial role [in how culture gets into the head]: it is learnt far earlier than most aspects of culture, is the most highly practiced set of cultural skills, and is a representation system that is at once public and private, cultural and mental” (171).

In the case of learning categories, we suggest that cumulative cultural evolution is enhanced by labels and generics, which provide a simple yet powerful means of passing along the wisdom (and prejudices) of prior generations. In this way, language enhances and expands (nonlinguistic) capacities to categorize that we share with other animals. A full understanding of this process will require studying how it intersects with a variety of other important cognitive capacities that are present early in human development, including theory of mind, alertness to testimony, attention to ritual, and a drive for causal understandings (134, 172–174).

ACKNOWLEDGMENTS. We thank Bruce Mannheim and two anonymous reviewers for very helpful comments on an earlier draft.

1. Lotem A, Halpern JY, Edelman S, Kolodny O (2017) The evolution of cognitive mechanisms in response to cultural innovations. Proc Natl Acad Sci USA 114:7915–7922.
2. Whiten A (2017) A second inheritance system: The extension of biology through culture. Interface Focus, in press.
3. Maynard Smith JM, Szathmáry E (1997) The Major Transitions in Evolution (Oxford Univ Press, Oxford).
4. Kirby S, Cornish H, Smith K (2008) Cumulative cultural evolution in the laboratory: An experimental approach to the origins of structure in human language. Proc Natl Acad Sci USA 105:10681–10686.
5. Pagel M (2017) Darwinian perspectives on the evolution of human languages. Psychon Bull Rev 24:151–157.
6. Chomsky N (1975) Aspects of the Theory of Syntax (MIT Press, Cambridge, MA).
7. Pinker S, Bloom P (1990) Natural language and natural selection. Behav Brain Sci 13:707–727.
8. Mannheim B (2015) The social imaginary, unspoken in verbal art. The Routledge Handbook of Linguistic Anthropology, ed Bonvillain N (Routledge, New York), pp 44–61.
9. Kinzler KD, Corriveau KH, Harris PL (2011) Children’s selective trust in native-accented speakers. Dev Sci 14:106–111.
10. Hill JH, Mannheim B (1992) Language and world view. Annu Rev Anthropol 21:381–406.
11. Gentner D, Goldin-Meadow S, eds (2003) Language in Mind: Advances in the Study of Language and Thought (MIT Press, Cambridge, MA).
12. Gleitman L, Papafragou A (2013) Relations between language and thought. Handbook of Cognitive Psychology, ed Reisberg D (Oxford Univ Press, New York).
13. Slobin DI (1991) Learning to think for speaking: Native language, cognition, and rhetorical style. Pragmatics 1:7–25.
14. Grice HP (1975) Logic and conversation. Syntax and Semantics 3: Speech Acts, eds Cole P, Morgan J (Academic, New York), pp 41–58.
15. Levinson SC (2000) Presumptive Meanings: The Theory of Generalized Conversational Implicature (MIT Press, Cambridge, MA).
16. Sperber D, Wilson D (2002) Pragmatics, modularity and mind-reading. Mind Lang 17:3–23.
17. Horowitz AC, Frank MC (2016) Children’s pragmatic inferences as a route for learning about the world. Child Dev 87:807–819.
18. Frank MC, Everett DL, Fedorenko E, Gibson E (2008) Number as a cognitive technology: Evidence from Pirahã language and cognition. Cognition 108:819–824.
19. Clark A, Chalmers D (1998) The extended mind. Analysis 58:7–19.
20. Watson-Jones RE, Legare CH (2016) The social functions of group rituals. Curr Dir Psychol Sci 25:42–46.
21. Legare CH (2017) Cumulative cultural learning: Development and diversity. Proc Natl Acad Sci USA 114:7877–7883.

22. Atran S, Medin DL (2008) The Native Mind and the Cultural Construction of Nature (MIT Press, Cambridge, MA).
23. Astuti R, Solomon GE, Carey S (2004) Constraints on conceptual development: A case study of the acquisition of folkbiological and folksociological knowledge in Madagascar. Monogr Soc Res Child Dev 69:1–135, vii–viii, discussion 136–161.
24. Diesendruck G, Goldfein-Elbaz R, Rhodes M, Gelman S, Neumark N (2013) Cross-cultural differences in children’s beliefs about the objectivity of social categories. Child Dev 84:1906–1917.
25. Malt BC, Majid A (2013) How thought is mapped into words. Wiley Interdiscip Rev Cogn Sci 4:583–597.
26. Barrett LF, Mesquita B, Gendron M (2011) Context in emotion perception. Curr Dir Psychol Sci 20:286–290.
27. Regier T, Kay P (2009) Language, thought, and color: Whorf was half right. Trends Cogn Sci 13:439–446.
28. Pagel M (2012) Wired for Culture: Origins of the Human Social Mind (Norton, New York).
29. Pinker S, Jackendoff R (2005) The faculty of language: What’s special about it? Cognition 95:201–236.
30. Carey S (1978) The child as word learner. Linguistic Theory and Psychological Reality, eds Bresnan J, Miller G, Halle M (MIT Press, Cambridge, MA), pp 264–293.
31. Gelman SA, Markman EM (1986) Categories and induction in young children. Cognition 23:183–209.
32. Gelman SA, Davidson NS (2013) Conceptual influences on category-based induction. Cognit Psychol 66:327–353.
33. Carlson GN, Pelletier FJ, eds (1995) The Generic Book (Univ Chicago Press, Chicago).
34. Gelman SA, Coley JD, Rosengren KS, Hartman E, Pappas A (1998) Beyond labeling: The role of maternal input in the acquisition of richly structured categories. Monogr Soc Res Child Dev 63:I–V, 1–148, discussion 149–157.
35. Gelman SA, Goetz PJ, Sarnecka BW, Flukes J (2008) Generic language in parent-child conversations. Lang Learn Dev 4:1–31.
36. Graham SA, Gelman SA, Clarke J (2016) Generics license 30-month-olds’ inferences about the atypical properties of novel kinds. Dev Psychol 52:1353–1362.
37. Gelman SA, Raman L (2007) This cat has nine lives? Children’s memory for genericity in language. Dev Psychol 43:1256–1268.
38. Maurer P, Meeuwis M; APiCS Consortium (2013) Generic noun phrases in subject function. The Atlas of Pidgin and Creole Language Structures, eds Michaelis SM, Maurer P, Haspelmath M, Huber M (Oxford Univ Press, New York), pp 114–117.
39. Everett DL (2009) Pirahã culture and grammar: A response to some criticisms. Lang 85:405–442.
40. Gelman SA, Sánchez Tapia I, Leslie SJ (2016) Memory for generic and quantified sentences in Spanish-speaking children and adults. J Child Lang 43:1231–1244.
41. Mannheim B, Gelman SA, Escalante C, Huayhua M, Puma R (2010) A developmental analysis of generic nouns in Southern Peruvian Quechua. Lang Learn Dev 7:1–23.
42. Tardif T, Gelman SA, Fu X, Zhu L (2012) Acquisition of generic noun phrases in Chinese: Learning about lions without an “-s”. J Child Lang 39:130–161.
43. Cimpian A, Brandone AC, Gelman SA (2010) Generic statements require little evidence for acceptance but have powerful implications. Cogn Sci 34:1452–1482.
44. Cimpian A, Gelman SA, Brandone AC (2010) Theory-based considerations influence the interpretation of generic sentences. Lang Cogn Process 25:261–276.
45. Leslie SJ (2008) Generics: Cognition and acquisition. Philos Rev 117:1–47.
46. Brandone AC, Cimpian A, Leslie SJ, Gelman SA (2012) Do lions have manes? For children, generics are about kinds rather than quantities. Child Dev 83:423–433.
47. Rakoczy H, Schmidt MF (2013) The early ontogeny of social norms. Child Dev Perspect 7:17–21.
48. Clark EV (1992) Conventionality and contrast: Pragmatic principles with lexical consequences. Frames, Fields, and Contrasts: New Essays in Semantic and Lexical Organization, eds Kittay EF, Lehrer A (Erlbaum, Hillsdale, NJ), pp 171–188.
49. Saussure FD (1915) Cours de Linguistique Générale (Payot, Paris).
50. Sabbagh MA, Henderson AM (2007) How an appreciation of conventionality shapes early word learning. New Dir Child Adolesc Dev (115):25–37.
51. Henderson AM, Woodward AL (2012) Nine-month-old infants generalize object labels, but not object preferences across individuals. Dev Sci 15:641–652.
52. Jaswal VK, Markman EM (2007) Looks aren’t everything: 24-month-olds’ willingness to accept unexpected labels. J Cogn Dev 8:93–111.
53. Koenig MA, Harris PL (2005) Preschoolers mistrust ignorant and inaccurate speakers. Child Dev 76:1261–1277.
54. Lane JD, Harris PL, Gelman SA, Wellman HM (2014) More than meets the eye: Young children’s trust in claims that defy their perceptions. Dev Psychol 50:865–871.
55. Csibra G, Gergely G (2009) Natural pedagogy. Trends Cogn Sci 13:148–153.
56. Cimpian A, Scott RM (2012) Children expect generic knowledge to be widely shared. Cognition 123:419–433.
57. Cimpian A, Markman EM (2009) Information learned from generic language becomes central to children’s biological concepts: Evidence from their open-ended explanations. Cognition 113:14–25.
58. Hollander MA, Gelman SA, Raman L (2009) Generic language and judgements about category membership: Can generics highlight properties as central? Lang Cogn Process 24:481–505.
59. Gelman SA, Tardif T (1998) A cross-linguistic comparison of generic noun phrases in English and Mandarin. Cognition 66:215–248.
60. Gelman SA, Ware EA, Manczak EM, Graham SA (2013) Children’s sensitivity to the knowledge expressed in pedagogical and nonpedagogical contexts. Dev Psychol 49:491–504.
61. Holubar TF, Markman EM (2013) Preschoolers’ understanding of preferences is modulated by linguistic framing. Cooperative Minds: Social Interaction and Group Dynamics, Proceedings of the 35th Annual Meeting of the Cognitive Science Society, eds Knauff M, Sebanz N, Pauen M, Wachsmuth I (Cognitive Science Society, Austin, TX), pp 603–608.
62. Orvell A, Kross E, Gelman SA (2017) How “you” makes meaning. Science 355:1299–1302.
63. Orvell A, Kross E, Gelman SA, That’s how ‘you’ do it: Generic you expresses norms in early childhood. J Exp Child Psych, in press.
64. Leslie SJ (2015) “Hillary Clinton is the only man in the Obama administration”: Dual character concepts, generics, and gender. Analytic Philos 56:111–141.
65. Prasada S, Dillingham EM (2009) Representation of principled connections: A window onto the formal aspect of common sense conception. Cogn Sci 33:401–448.
66. Wodak D, Leslie SJ, Rhodes M (2015) What a loaded generalization: Generics and social cognition. Philos Compass 10:625–635.
67. Knobe J, Prasada S, Newman GE (2013) Dual character concepts and the normative dimension of conceptual representation. Cognition 127:242–257.
68. Roberts SO, Gelman SA, Ho AK (2017) So it is, so it shall be: Group regularities license children’s prescriptive judgments. Cogn Sci 41:576–600.
69. Bear A, Knobe J (November 11, 2016) Normality: Part descriptive, part prescriptive. Cognition, 10.1016/j.cognition.2016.10.024.
70. Eriksson K, Strimling P, Coultas JC (2015) Bidirectional associations between descriptive and injunctive norms. Organ Behav Hum Decis Process 129:59–69.
71. Tworek CM, Cimpian A (2016) Why do people tend to infer “ought” from “is”? The role of biases in explanation. Psychol Sci 27:1109–1122.
72. Roberts SO, Ho AK, Gelman SA (2017) Group presence, category labels, and generic statements influence children to treat descriptive group regularities as prescriptive. J Exp Child Psychol 158:19–31.
73. Locke J (1959) An Essay Concerning Human Understanding (Dover, New York), Vol 2. Reprint.
74. Gelman SA (2003) The Essential Child: Origins of Essentialism in Everyday Thought (Oxford Univ Press, New York).
75. Fisher AV, Sloutsky VM (2005) When induction meets memory: Evidence for gradual transition from similarity-based to category-based induction. Child Dev 76:583–597.
76. Inhelder B, Piaget J (1964) The Early Growth of Logic in the Child (Norton, New York).
77. Keil FC, Richardson DC (1999) Species, stuff, and patterns of causation. Species: New Interdisciplinary Essays, ed Wilson RA (MIT Press, Cambridge, MA).
78. Medin DL (1989) Concepts and conceptual structure. Am Psychol 44:1469–1481.
79. Gottfried GM, Gelman SA (2005) Developing domain-specific causal-explanatory frameworks: The role of insides and immanence. Cogn Dev 20:137–158.
80. Setoh P, Wu D, Baillargeon R, Gelman R (2013) Young infants have biological expectations about animals. Proc Natl Acad Sci USA 110:15937–15942.
81. Simons DJ, Keil FC (1995) An abstract to concrete shift in the development of biological thought: The insides story. Cognition 56:129–163.
82. Haslam N, Rothschild L, Ernst D (2000) Essentialist beliefs about social categories. Br J Soc Psychol 39:113–127.
83. Booth AE (2014) Conceptually coherent categories support label-based inductive generalization in preschoolers. J Exp Child Psychol 123:1–14.
84. Sobel DM, Yoachim CM, Gopnik A, Meltzoff AN, Blumenthal EJ (2007) The blicket within: Preschoolers’ inferences about insides and causes. J Cogn Dev 8:159–182.
85. Walker CM, Lombrozo T, Legare CH, Gopnik A (2014) Explaining prompts children to privilege inductively rich properties. Cognition 133:343–357.
86. Rhodes M, Gelman SA (2009) A developmental examination of the conceptual structure of animal, artifact, and human social categories across two cultural contexts. Cognit Psychol 59:244–274.
87. Roberts SO, Gelman SA (2015) Do children see in Black and White? Children’s and adults’ categorizations of multiracial individuals. Child Dev 86:1830–1847.
88. Gelman SA, Wellman HM (1991) Insides and essences: Early understandings of the non-obvious. Cognition 38:213–244.
89. Keil FC (1989) Concepts, Kinds, and Cognitive Development (MIT Press, Cambridge, MA).
90. Meyer M, Gelman SA (2016) Gender essentialism in children and parents: Implications for the development of gender stereotyping and gender-typed preferences. Sex Roles 75:409–421.
91. Taylor MG, Rhodes M, Gelman SA (2009) Boys will be boys; cows will be cows: Children’s essentialist reasoning about gender categories and animal species. Child Dev 80:461–481.
92. Ware EA, Gelman SA (2014) You get what you need: An examination of purpose-based inheritance reasoning in undergraduates, preschoolers, and biological experts. Cogn Sci 38:197–243.
93. Meyer M, Gelman SA, Roberts SO, Leslie SJ (November 17, 2016) My heart made me do it: Children’s essentialist beliefs about heart transplants. Cogn Sci, 10.1111/cogs.12431.
94. Meyer M, Leslie SJ, Gelman SA, Stilwell SM (2013) Essentialist beliefs about bodily transplants in the United States and India. Cogn Sci 37:668–710.
95. Atran S, et al. (2001) Folkbiology doesn’t come from folkpsychology: Evidence from Yukatek Maya in cross-cultural perspective. J Cogn Cult 1:3–42.
96. Moya C, Boyd R, Henrich J (2015) Reasoning about cultural and genetic transmission: Developmental and cross-cultural evidence From Peru, Fiji, and the United States on how people make inferences about trait transmission. Top Cogn Sci 7:595–610.
97. del Río MF, Strasser K (2011) Chilean children’s essentialist reasoning about poverty. Br J Dev Psychol 29:722–743.
98. Sousa P, Atran S, Medin D (2002) Essentialism and folkbiology: Evidence from Brazil. J Cogn Cult 2:195–223.
99. Waxman S, Medin D, Ross N (2007) Folkbiological reasoning from a cross-cultural developmental perspective: Early essentialist notions are shaped by cultural beliefs. Dev Psychol 43:294–308.
100. Haslam N, Holland E, Karasawa M (2013) Essentialism and entitativity across cultures. Culture and Group Processes, eds Yuki M, Brewer M (Oxford Univ Press, New York), pp 17–37.
101. Rhodes M, Leslie SJ, Saunders K, Dunham Y, Cimpian A (February 22, 2017) How does social essentialism affect the development of inter-group relations? Dev Sci, 10.1111/desc.12509.

102. Diesendruck G, Menahem R (2015) Essentialism promotes children’s inter-ethnic bias. Front Psychol 6:1180.
103. Pauker K, Xu Y, Williams A, Biddle AM (2016) Race essentialism and social contextual differences in children’s racial stereotyping. Child Dev 87:1409–1422.
104. Haslam N, Whelan J (2008) Human natures: Psychological essentialism in thinking about differences between people. Soc Personal Psychol Compass 2/3:1297–1312.
105. Leyens JP, et al. (2001) Psychological essentialism and the differential attribution of uniquely human emotions to ingroups and outgroups. Eur J Soc Psychol 31:395–411.
106. Bastian B, Haslam N (2008) Immigration from the perspective of hosts and immigrants: Roles of psychological essentialism and social identity. Asian J Soc Psychol 11:127–140.
107. Gelman SA, Heyman GD (1999) Carrot-eaters and creature-believers: The effects of lexicalization on children’s inferences about social categories. Psychol Sci 10:489–493.
108. Gelman SA, Ware EA, Kleinberg F (2010) Effects of generic language on category content and structure. Cognit Psychol 61:273–301.
109. Rhodes M, Leslie SJ, Tworek CM (2012) Cultural transmission of social essentialism. Proc Natl Acad Sci USA 109:13526–13531.
110. Gelman SA, Waxman SR (2007) Looking beyond looks: Comments on Sloutsky, Kloos, and Fisher (2007). Psychol Sci 18:554–555, discussion 556–557.
111. Diesendruck G (2003) Categories for names or names for categories? The interplay between domain-specific conceptual structure and language. Lang Cogn Process 18:759–787.
112. Gelman SA, Star JR, Flukes J (2002) Children’s use of generics in inductive inferences. J Cogn Dev 3:179–199.
113. Gelman SA, Rhodes M (2012) Two-thousand years of stasis. How psychological essentialism impedes evolutionary understanding. Evolution Challenges: Integrating Research and Practice in Teaching and Learning about Evolution, eds Rosengren KS, Brem S, Evans EM, Sinatra G (Oxford Univ Press, Cambridge, UK), pp 3–21.
114. Shtulman A, Schulz L (2008) The relation between essentialist beliefs and evolutionary reasoning. Cogn Sci 32:1049–1062.
115. Dar-Nimrod I, Heine SJ (2011) Genetic essentialism: On the deceptive determinism of DNA. Psychol Bull 137:800–818.
116. Leslie SJ (2013) Essence and natural kinds: When science meets preschooler intuition. Oxford Studies in Epistemology 4:108–165.
117. Bonilla-Silva E (1997) Rethinking racism: Toward a structural interpretation. Am Sociol Rev 62:465–480.
118. Lee J, Bean FD (2007) Reinventing the color line: Immigration and America’s new racial/ethnic divide. Soc Forces 86:561–586.
119. Cimpian A, Salomon E (2014) The inherence heuristic: An intuitive means of making sense of the world, and a potential precursor to psychological essentialism. Behav Brain Sci 37:461–480.
120. Williams MJ, Eberhardt JL (2008) Biological conceptions of race and the motivation to cross racial boundaries. J Pers Soc Psychol 94:1033–1047.
121. Bastian B, Haslam N (2006) Psychological essentialism and stereotype endorsement. J Exp Soc Psychol 42:228–235.
122. Chao MM, Hong YY, Chiu CY (2013) Essentializing race: Its implications on racial categorization. J Pers Soc Psychol 104:619–634.
123. Gaither SE, et al. (2014) Essentialist thinking predicts decrements in children’s memory for racially ambiguous faces. Dev Psychol 50:482–488.
124. Ho AK, Roberts SO, Gelman SA (2015) Essentialism and racial bias jointly contribute to the categorization of multiracial individuals. Psychol Sci 26:1639–1645.
125. Kraus MW, Keltner D (2013) Social class rank, essentialism, and punitive judgment. J Pers Soc Psychol 105:247–261.
126. Leslie SJ, Cimpian A, Meyer M, Freeland E (2015) Expectations of brilliance underlie gender distributions across academic disciplines. Science 347:262–265.
127. Goff PA, Jackson MC, Di Leone BAL, Culotta CM, DiTomasso NA (2014) The essence of innocence: Consequences of dehumanizing Black children. J Pers Soc Psychol 106:526–545.
128. Goff PA, Eberhardt JL, Williams MJ, Jackson MC (2008) Not yet human: Implicit knowledge, historical dehumanization, and contemporary consequences. J Pers Soc Psychol 94:292–306.
129. Gelman SA (2009) Learning from others: Children’s construction of concepts. Annu Rev Psychol 60:115–140.
130. Lutz DJ, Keil FC (2002) Early understanding of the division of cognitive labor. Child Dev 73:1073–1084.
131. Markman EM, Jaswal VK (2003) Commentary on Part II: Abilities and assumptions underlying conceptual development. Early Category and Concept Development: Making Sense of the Blooming, Buzzing Confusion, eds Rakison D, Oakes L (Oxford Univ Press, New York), pp 384–402.
132. Waxman SR, Markow DB (1995) Words as invitations to form categories: Evidence from 12- to 13-month-old infants. Cognit Psychol 29:257–302.
133. Putnam H (1975) The meaning of ‘meaning’ Mind, Language, and Reality (Cambridge Univ Press, Cambridge, UK), pp 215–271.
134. Legare CH, Nielsen M (2015) Imitation and innovation: The dual engines of cultural learning. Trends Cogn Sci 19:688–699.
135. Tomasello M (2009) The Cultural Origins of Human Cognition (Harvard Univ Press, Cambridge, MA).
136. Creanza N, Kolodny O, Feldman MW (2017) Cultural evolutionary theory: How culture evolves and why it matters. Proc Natl Acad Sci USA 114:7782–7789.
137. d’Errico F, et al. (2017) Identifying early modern human ecological niche expansions and associated cultural dynamics in the South African Middle Stone Age. Proc Natl Acad Sci USA 114:7869–7876.
138. Markson L, Bloom P (1997) Evidence against a dedicated system for word learning in children. Nature 385:813–815.
139. Horst JS, Samuelson LK (2008) Fast mapping but poor retention by 24-month-old infants. Infancy 13:128–157.
140. Rhodes M (2012) Naïve theories of social groups. Child Dev 83:1900–1916.
141. Steels L (2011) Modeling the cultural evolution of language. Phys Life Rev 8:339–356.
142. Pickering MJ, Garrod S (2006) Alignment as the basis for successful communication. Res Lang Comput 4:203–228.
143. Stolk A, Verhagen L, Toni I (2016) Conceptual alignment: How brains achieve mutual understanding. Trends Cogn Sci 20:180–191.
144. Edmiston P, Lupyan G (2015) What makes words special? Words as unmotivated cues. Cognition 143:93–100.
145. Gelman SA, Raman L (2003) Preschool children use linguistic form class and pragmatic cues to interpret generics. Child Dev 74:308–325.
146. Brandone AC, Gelman SA, Hedglen J (2015) Children’s developing intuitions about the truth conditions and implications of novel generics versus quantified statements. Cogn Sci 39:711–738.
147. Hacking I (1995) The looping effects of human kinds. Causal Cognition: A Multidisciplinary Debate, eds Sperber D, Premack D, Premack AJ (Clarendon Press, Oxford, UK), pp 351–394.
148. Whiten A, Horner V, de Waal FB (2005) Conformity to cultural norms of tool use in chimpanzees. Nature 437:737–740.
149. Seyfarth RM, Cheney DL (2009) The evolution of social categories. Neurobiology of. Umwelt, eds Bethoz A, Christen A (Springer, Berlin), pp 69–87.
150. Phillips W, Shankar M, Santos LR (2010) Essentialism in the absence of language? Evidence from rhesus monkeys (Macaca mulatta). Dev Sci 13:F1–F7.
151. Cacchione T, Hrubesch C, Call J, Rakoczy H (2016) Are apes essentialists? Scope and limits of psychological essentialism in great apes. Anim Cogn 19:921–937.
152. Haun DB, Rekers Y, Tomasello M (2014) Children conform to the behavior of peers; other great apes stick with what they know. Psychol Sci 25:2160–2167.
153. Hamlin JK, Wynn K, Bloom P (2007) Social evaluation by preverbal infants. Nature 450:557–559.
154. Newman GE, Herrmann P, Wynn K, Keil FC (2008) Biases towards internal features in infants’ reasoning about objects. Cognition 107:420–432.
155. Dewar K, Xu F (2007) Do 9-month-old infants expect distinct words to refer to kinds? Dev Psychol 43:1227–1238.
156. Galef BG, McQuoid LM, Whiskin EE (1990) Further evidence that Norway rats do not socially transmit learned aversions to toxic baits. Anim Learn Behav 18:199–205.
157. Galef BG, Laland KN (2005) Social learning in animals: Empirical studies and theoretical models. Bioscience 55:489–499.
158. Love AC (2015) Conceptual Change in Biology (Springer, New York).
159. Savoca MS, Wohlfeil ME, Ebeler SE, Nevitt GA (2016) Marine plastic debris emits a keystone infochemical for olfactory foraging seabirds. Sci Adv 2:e1600395.
160. Regnier D (2015) Clean people, unclean people: The essentialisation of ‘slaves’ among the southern Betsileo of Madagascar. Soc Anthropol 23:152–168.
161. Gelman SA, Heyman GD, Legare CH (2007) Developmental changes in the coherence of essentialist beliefs about psychological characteristics. Child Dev 78:757–774.
162. Cooper JA, Marsh JK (2015) The influence of expertise on essence beliefs for mental and medical disorder categories. Cognition 144:67–75.
163. Gelman SA (2013) Artifacts and essentialism. Rev Phil Psychol 4:449–463.
164. Nemeroff C, Rozin P (1994) The contagion concept in adult thinking in the United States: Transmission of germs and of interpersonal influence. Ethos 22:158–186.
165. Newman GE (2016) An essentialist account of authenticity. J Cogn Cult 16:294–321.
166. Riedl K, Jensen K, Call J, Tomasello M (2012) No third-party punishment in chimpanzees. Proc Natl Acad Sci USA 109:14824–14829.
167. Riedl K, Jensen K, Call J, Tomasello M (2015) Restorative justice in children. Curr Biol 25:1731–1735.
168. Göckeritz S, Schmidt MH, Tomasello M (2014) Young children’s creation and transmission of social norms. Cogn Dev 30:81–95.
169. Lewis HM, Laland KN (2012) Transmission fidelity is the key to the build-up of cumulative culture. Philos Trans R Soc Lond B Biol Sci 367:2171–2180.
170. Heyes C (2016) Who knows? Metacognitive social learning strategies. Trends Cogn Sci 20:204–213.
171. Levinson SC (2005) Comment on: Cultural constraints on grammar and cognition in Piraha by Daniel L. Everett. Curr Anthropol 46:637–638.
172. Wellman HM (2014) Making Minds: How Theory of Mind Develops (Oxford Univ Press, New York).
173. Harris PL, Lane JD (2014) Infants understand how testimony works. Topoi 33:443–458.
174. Walker CM, Gopnik A (2014) Toddlers infer higher-order relational principles in causal learning. Psychol Sci 25:161–169.

Coevolution of cultural intelligence, extended life
history, sociality, and brain size in primates
Sally E. Street (a,b,1), Ana F. Navarrete (a), Simon M. Reader (c), and Kevin N. Laland (a,1)
(a) Centre for Social Learning and Cognitive Evolution, School of Biology, University of St. Andrews, St. Andrews KY16 9AJ, United Kingdom; (b) Department of Anthropology, Durham University, Durham DH1 3LE, United Kingdom; and (c) Department of Biology, McGill University, Montreal, QC H3A 1B1, Canada

Edited by Marcus W. Feldman, Stanford University, Stanford, CA, and approved May 30, 2017 (received for review January 15, 2017)

Explanations for primate brain expansion and the evolution of human cognition and culture remain contentious despite extensive research. While multiple comparative analyses have investigated variation in brain size across primate species, very few have addressed why primates vary in how much they use social learning. Here, we evaluate the hypothesis that the enhanced reliance on socially transmitted behavior observed in some primates has coevolved with enlarged brains, complex sociality, and extended lifespans. Using recently developed phylogenetic comparative methods we show that, across primate species, a measure of social learning proclivity increases with absolute and relative brain volume, longevity (specifically reproductive lifespan), and social group size, correcting for research effort. We also confirm relationships of absolute and relative brain volume with longevity (both juvenile period and reproductive lifespan) and social group size, although longevity is generally the stronger predictor. Relationships between social learning, brain volume, and longevity remain when controlling for maternal investment and are therefore not simply explained as a by-product of the generally slower life history expected for larger brained species. Our findings suggest that both brain expansion and high reliance on culturally transmitted behavior coevolved with sociality and extended lifespan in primates. This coevolution is consistent with the hypothesis that the evolution of large brains, sociality, and long lifespans has promoted reliance on culture, with reliance on culture in turn driving further increases in brain volume, cognitive abilities, and lifespans in some primate lineages.

cultural evolution | social learning | brain evolution | primates | phylogenetic comparative analysis

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: S.E.S. and K.N.L. designed research with contributions from A.F.N. and S.M.R.; S.E.S. performed research; S.E.S. analyzed data; and S.E.S., A.F.N., S.M.R., and K.N.L. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
(1) To whom correspondence may be addressed. Email: knl1@st-andrews.ac.uk or sallystreet13@gmail.com.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1620734114/-/DCSupplemental.

Brain expansion is unquestionably a distinctive feature of primate, and especially human, evolution. Primate brain expansion is evident regardless of whether the brain is measured in absolute terms, in relation to body size, or as the size of the neocortex relative to the rest of the brain (1), and irrespective of whether it is better characterized by variation in a single size dimension (2) or mosaic evolution of component parts (3). The striking variation in brain size in nonhuman primates, across three orders of magnitude (4), has long demanded an evolutionary explanation (5). Although the cognitive implications of cross-species variation in whole brain size remain contentious and require further investigation (5–7), evolutionary increases in overall brain size in primates reflect neuroanatomical changes that are plausibly linked to increases in general cognitive abilities. For instance, larger primate brains have more neurons in absolute terms (8–11), with coordinated expansion particularly in the neocortex and cerebellum (12), potentially supporting a greater diversity of cognitive functions (7, 10). In support of this idea, overall brain size increases with broad measures of cognitive ability in primates, including performance in laboratory tests of learning and cognition across primate genera (13) and performance in experimental measures of behavioral inhibition across primate species (14).

At ∼1,500 g (15), human brains are at least three times heavier than those of any other primate species (1). However, humans are also extreme in their long lifespan, social complexity, cognition, and cultural capabilities (16, 17), raising questions about whether large brains, long lives, complex cognition, and advanced cultural capabilities evolved independently or coevolved through directly reinforcing processes. Enlarged brains, enhanced cognition, and highly developed social learning abilities co-occur not only in primate species but also in some cetaceans and birds (18–22), raising the possibility of a key role for social learning and culture in brain evolution and intelligence in multiple, independent animal lineages (23–27).

Across primates, support for multiple, nonexclusive hypotheses for enlarged brain (particularly neocortex) size has been identified in comparative studies, emphasizing the roles of social complexity (e.g., group size) (28, 29), ecological intelligence (e.g., dietary complexity) (30, 31), technical intelligence (e.g., tool use and technical innovation) (21, 25, 32), and behavioral complexity (e.g., innovativeness, social learning, and tactical deception) (21, 25, 33).

Further, several comparative studies have found that larger brained primates have slower life histories, including longer juvenile periods and overall lifespans (e.g., ref. 29). Although mutually reinforcing evolutionary processes have been proposed to account for this association (16), recent comparative analyses suggest that lifespan increases with brain size in mammals instead due to developmental costs: i.e., it requires a longer period of maternal investment to support offspring with greater natal and postnatal brain growth, requiring a slower life history strategy of which longer lifespan is a by-product (34). Primates, however, are potentially distinct from most mammalian taxa in their unusually large, neuron-dense brains (8–11) and in the extensive occurrence of socially transmitted behavior exhibited in some lineages (e.g., refs. 35–37). Whether the association between extended life history and enlarged brain size is best explained by a cognitive or developmental mechanism in primates specifically remains to be explored. Further, despite many previous comparative analyses of brain size and relevant predictors in primates, comparative analyses have not yet directly explored the evolutionary relationships between brain expansion, cultural complexity, sociality, and longevity in analyses that include all of these variables, with control for relevant potentially confounding variables.

Here, in a comparative analysis of primate species, we directly test the widely held view that encephalization, sociality, longevity, and reliance on culture have coevolved (16, 23–27, 32, 38). We use a quantitative behavioral measure of reliance on culture: specifically, the number of unique reports (i.e., richness) of social learning per species from a sample of relevant published literature (21, 39) (henceforth referred to simply as “social learning”)

7908–7914 | PNAS | July 25, 2017 | vol. 114 | no. 30 www.pnas.org/cgi/doi/10.1073/pnas.1620734114


(see SI Appendix for further details on this measure). We use Bayesian phylogenetic mixed models to investigate a cluster of related hypotheses concerned with the evolutionary relationships between social learning, brain volume, group size, and lifespan. Below, we specify and test four predictions, each of which is independent; however, if all are supported, it would imply support for a cluster of related, and mutually consistent, ideas concerning the factors underlying evolutionary expansions of brains, cognition, and culture.

Prediction 1: Social Learning Increases with Absolute and Relative Brain Volume
This expectation follows from the hypotheses that (i) high levels of knowledge and skill are required for primates to exploit high-quality, difficult-to-access dietary resources, with these skills primarily acquired through social learning (16, 23–27, 32, 38, 40) and (ii) the energy so acquired is critical to developing and running a large brain (16, 27). Previous comparative analyses have identified positive associations of social learning with the absolute and relative sizes of brain components, primarily the neocortex (21, 25). Here, we extend these analyses to overall brain size measured as endocranial volume (ECV), allowing for much larger (at least threefold) sample sizes, far more representative of the diversity in brain size across primate species (4).

Prediction 2: Social Learning Increases with Longevity
This expectation follows from the hypotheses that (i) extended life history, particularly a longer lifespan and period of juvenile dependence, facilitates the acquisition, exploitation, and social transmission of life skills (16, 23, 40) and (ii) cultural knowledge promotes survival and long lives (25–27) by acting as a “cognitive buffer,” enhancing survival in challenging environmental conditions through behavioral responses (41, 42). Complex skills frequently take time to learn; therefore, longer lifespans potentially provide more time for relevant experience to accrue, more time for adults to benefit from knowledge acquired earlier in life, and more time for parents to pass on relevant skills to offspring (16, 23, 26, 27, 40). If an extended juvenile period in particular is critical for the acquisition of adaptive socially transmitted behavior (16), we expect that juvenile period has a strong association with social learning richness. However, costly investment in learning socially transmitted skills may pay off in later life only across a long reproductive lifespan (16); therefore, we may expect the association between social learning and longevity to be driven more strongly by increases in reproductive lifespan. If there is a specific relationship of social learning with longevity, not confounded by relationships of either with absolute or relative brain size, we should still find this association even when controlling for brain volume and body mass. Furthermore, if reliance on socially transmitted behavior is related to longevity via a cognitive buffer mechanism rather than as a by-product of a relationship between social learning, brain volume, and slower life history traits due to developmental constraints, this relationship should remain when controlling for the potentially confounding effect of maternal investment (measured as the sum of gestation and lactation periods) (34).

Prediction 3: Social Learning Increases with Group Size
This expectation follows from several theoretical and empirical analyses showing that large social groups support greater amounts of adaptive cultural knowledge (e.g., refs. 43–46) and broader hypotheses that stable social grouping supports the evolution of reliance on social learning (e.g., ref. 20). If the relationship of social learning to group size is not confounded by associations of either trait with absolute brain volume, relative brain volume, or longevity, this prediction should hold when controlling for brain volume, body mass, and longevity measures.

Prediction 4: Absolute and Relative Brain Volume Increases with Longevity
Across mammals, a relationship between adult brain mass and longevity is not supported when controlling for maternal investment, suggesting that developmental constraints associated with investing in large-brained offspring underpin this association (34). However, if associations of longevity with absolute and relative brain volume remain when maternal investment is included in analyses of primate species, the relationship between brain volume and lifespan is not confounded with maternal investment, thus potentially indicative of a cognitive, rather than solely developmental, mechanism underpinning this relationship in primates specifically, even if not across mammals more generally. Additionally, if longevity is related to brain volume independently of any potentially confounding effect of social group size, these associations should remain intact when group size is included in statistical models.

Results
Prediction 1: Social Learning and Brain Volume. As predicted, social learning richness increases with both absolute brain volume (<1% β coefficients in the posterior distribution crossing zero, n = 150) (SI Appendix, Table S1i) and relative brain volume (3% β crossing zero, n = 150) (SI Appendix, Table S1ii).

Prediction 2: Social Learning and Longevity. As predicted, social learning richness increases with longevity (<1% β crossing zero, n = 117) (SI Appendix, Table S2 A, i and Fig. 1A). We found no evidence that social learning increases with juvenile period length, however (58% β crossing zero, n = 101) (SI Appendix, Table S2 B, i and Fig. 1A). Rather, social learning increases with reproductive lifespan specifically (0% β crossing zero, n = 92) (SI Appendix, Table S2 C, i). Relationships between social learning and longevity, and between social learning and reproductive lifespan, remain intact when maternal investment (summed gestation and lactation time) is included as an additional predictor (2%, <1% β crossing zero, n = 87, n = 82, respectively) (SI Appendix, Table S2 A, ii and C, ii) whereas maternal investment itself does not predict social learning in these models (≥35% β crossing zero) (SI Appendix, Table S2 A, ii and C, ii). Relationships between social learning and longevity or reproductive lifespan are also not confounded by those between social learning and absolute or relative brain volume, as they remain when either brain volume or both brain volume and body mass are included as additional predictors (<1% β crossing zero, n = 111, n = 89) (SI Appendix, Table S2 A, iii and iv and C, iii and iv). However, brain volume itself does not predict social learning when included alongside longevity measures (>22% β crossing zero) (SI Appendix, Table S2 A, iii and iv and C, iii and iv).

Prediction 3: Social Learning and Group Size. As predicted, we found a positive association between group size and social learning (<1% β crossing zero, n = 167) (SI Appendix, Table S3i and Fig. 1A). This association is independent of the relationship between social learning and longevity or reproductive lifespan, as it remains when either of these life history traits is included (4% β crossing zero, 5% β crossing zero, n = 111, n = 89) (SI Appendix, Table S3 ii, A and B). The relationship between group size and social learning is also not confounded by the association of either trait with absolute or relative brain volume, because it remains when either brain volume or both brain volume and body mass are included as additional predictors (<4% β crossing zero, n = 140) (SI Appendix, Table S3 iii and iv). Both absolute and relative brain volume have, however, a weaker effect on social learning when group size is included as an additional predictor (2%, 7% β crossing zero) (SI Appendix, Table S3 iii and iv) compared with models without group size (<1%, 3% β crossing zero) (SI Appendix, Table S1 i and ii).
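
The Results above repeatedly ask whether an association "remains" once a potential confound (brain volume, body mass, maternal investment, or group size) is added as a further predictor. The logic of that check can be sketched as follows. This is only an illustration under strong assumptions: it uses ordinary least squares on simulated, hypothetical data, ignores phylogeny entirely, and is not the Bayesian phylogenetic mixed model used in the paper. The helper `fit_ols`, the variable names, and the effect sizes are all invented for the example.

```python
import numpy as np

def fit_ols(response, predictors):
    """Least-squares coefficients for an intercept plus each predictor column."""
    design = np.column_stack([np.ones(len(response)), *predictors])
    coefficients, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coefficients

# Simulated (hypothetical) species-level data on a log scale; not real primate data.
rng = np.random.default_rng(0)
n_species = 150
log_body_mass = rng.normal(0.0, 1.0, n_species)
log_longevity = 0.5 * log_body_mass + rng.normal(0.0, 1.0, n_species)
log_brain_volume = (0.7 * log_body_mass + 0.3 * log_longevity
                    + rng.normal(0.0, 0.1, n_species))

# Brain volume on longevity alone: the slope also absorbs the body-mass effect.
naive = fit_ols(log_brain_volume, [log_longevity])
# Adding body mass as a covariate gives the longevity slope for *relative* brain volume.
adjusted = fit_ols(log_brain_volume, [log_longevity, log_body_mass])

print("longevity slope, unadjusted:", round(float(naive[1]), 2))
print("longevity slope, adjusted for body mass:", round(float(adjusted[1]), 2))
```

In this toy setting the longevity slope shrinks once body mass is included but stays clearly positive, which is the pattern the text describes as an association "remaining" after control for a confound.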


Fig. 1. Posterior distributions of β coefficients for the effects of longevity, juvenile period, and group size on (A) social learning richness, (B) absolute brain
volume, and (C) relative brain volume (i.e., brain volume accounting for body mass). Here, we present effects from the simplest models, including only either
longevity, juvenile period, or group size as independent variables, together with research effort and body mass for the social learning model, and body mass
for the relative brain model. However, these results are not strongly affected by the inclusion of additional potentially confounding variables (Methods,
Results, and SI Appendix). Percentages indicate the share of posterior estimates that cross zero, i.e., that fall in the direction opposite to the predicted one, for each effect.
Distributions shifted substantially away from zero indicate evidence for effects of predictor variables in the corresponding direction whereas those centered
close to zero indicate little or no evidence for effects of predictor variables.
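
The "% β crossing zero" summary used in Fig. 1 and throughout the Results can be sketched in a few lines, assuming the posterior for an effect is available as an array of sampled β values. The function name `percent_crossing_zero` and the toy sample values are hypothetical and are not taken from the paper or its SI Appendix.

```python
import numpy as np

def percent_crossing_zero(beta_samples, predicted_sign=1):
    """Percentage of posterior samples on the opposite side of zero from the prediction."""
    samples = np.asarray(beta_samples, dtype=float)
    # A sample "crosses zero" when its sign is opposite to the predicted direction.
    crossing = samples * predicted_sign < 0
    return 100.0 * crossing.mean()

# Toy posterior (hypothetical): four draws, one of them negative.
toy_posterior = [0.5, 1.0, -0.1, 0.2]
print(percent_crossing_zero(toy_posterior))      # 25.0 when a positive effect is predicted
print(percent_crossing_zero(toy_posterior, -1))  # 75.0 when a negative effect is predicted
```

A small percentage means nearly all of the posterior mass lies on the predicted side of zero, which is how values such as "<1% β crossing zero" are read in the text.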

Prediction 4: Predictors of Brain Volume. We confirmed the expected positive association of absolute brain volume with social group size (3% β crossing zero, n = 151) (SI Appendix, Table S4i and Fig. 1B). Absolute brain volume also increases with longevity, juvenile period length, and reproductive lifespan (<1% β crossing zero, n = 112, n = 98, n = 90) (SI Appendix, Table S4 ii, A, B, and C and Fig. 1B). Relationships between absolute brain volume and longevity, juvenile period, and reproductive lifespan remain intact when maternal investment is included in the model, which itself also increases with brain volume (all <1% β crossing zero, n = 84, n = 86, n = 79) (SI Appendix, Table S4 iii, A, B, and C). Relationships between longevity, juvenile period, and reproductive lifespan with absolute brain volume are independent of the association of brain volume and group size, remaining intact when group size is included as an additional predictor (all <1% β crossing zero, n = 106, n = 95, n = 87) (SI Appendix, Table S4 iv, A, B, and C) whereas group size is a relatively weak predictor when included with longevity or reproductive lifespan (>6% β crossing zero) (SI Appendix, Table S4 iv, A and C).

Similarly, relative brain volume increases with social group size (4% β crossing zero, n = 151) (SI Appendix, Table S5i and Fig. 1C). Relative brain volume also increases with longevity, juvenile period, and reproductive lifespan (<1% β crossing zero, n = 112, n = 98, n = 90) (SI Appendix, Table S5 ii, A, B, and C and Fig. 1C). Again, associations between relative brain volume and all three life history measures remain intact when controlling for maternal investment, which itself also increases with relative brain volume (all <1% β crossing zero, n = 84, n = 86, n = 79) (SI Appendix, Table S5 iii, A, B, and C). The relationship between relative brain volume and life history length is not confounded by social group size because all three measures remain intact when group size is added to the model (<1% β crossing zero, n = 106, n = 95, n = 87) (SI Appendix, Table S5 iv, A, B, and C). When included with longevity or reproductive lifespan, however, group size is not strongly supported as a predictor of relative brain volume (>12% β crossing zero) (SI Appendix, Table S5 iv, A and C).

Discussion
We investigated the widely held view that cultural intelligence, extended life history, sociality, and brain size have coevolved in nonhuman primates (16, 23–27). Using Bayesian phylogenetic generalized linear mixed models, we found a positive relationship between reliance on culture (as measured by reported richness of social learning, corrected for research effort) and measures of both absolute and relative brain volume. Earlier studies had established positive relationships between primate social learning and both absolute and ratio measures of the size of the “executive brain” (combined neocortex and striatum volume) (25) and that social learning, as a component of a composite measure of general cognitive ability, increases with absolute and ratio measures of neocortex size and with executive brain ratio (21). Here, we found that these associations generalize further to overall brain size measured as endocranial volume, across a substantially larger (>3×) sample of primate species. Although its occurrence in insects demonstrates that large brains, in absolute terms, are not a prerequisite for social learning (47), evolutionary expansions in brain size may support more efficient, high-fidelity, or more diverse forms of social transmission (25), due to increases in, for instance, cross-modal integration of perceptual and motor information and the general computational power and flexibility required to implement sophisticated learning strategies (40, 48). Evolutionary expansion of the primate brain is driven substantially by visual specialization (5, 49, 50) and coordinated expansion of the neocortex and cerebellum, with likely corresponding increases in fine visuo-motor control, which may underpin the ability to replicate complex behavioral sequences inherent to high-fidelity social learning (5, 12). In turn, more effective social learning potentially allows individuals to garner high-quality dietary resources that can be invested in brain growth (16). Therefore, while the cognitive mechanisms underpinning social learning largely remain to be established (51), it remains highly plausible that evolutionary increases in overall brain size are associated with elevated social learning capabilities. However, our finding that brain volume does not predict social learning when accounting for longevity, together
Parameters from all statistical models are reported in full in SI with strong links between social learning and longevity, and longevity
Appendix, Tables S1–S5. All results reported in the main text and brain size, suggests that the association of social learning and
refer to models including great apes, but none of our main re- brain volume may be indirect, mediated via increased longevity.
sults are qualitatively affected by removing these species (n = 4) Results support a similar picture for relationships between social
from analyses (SI Appendix, Tables S1–S5). Variation in social learning, group size, and brain volume. While we cannot rule out
learning, longevity, group size, and brain volume data across the possibility that the reduced sample size and likely correspond-
primate genera is illustrated in Fig. 2. ing reduction in power with the inclusion of additional variables

7910 | www.pnas.org/cgi/doi/10.1073/pnas.1620734114 Street et al.


[Figure 2 image: phylogeny of primate genera (tip labels from Galagoides to Allenopithecus) with bars for social learning, brain volume, group size, and longevity; photograph panels A, B, and C.]

Fig. 2. Summary of raw data on social learning, absolute brain volume, group size, and longevity for 52 primate genera, using the consensus phylogeny from 10kTrees (65). For illustration purposes only, all data are summarized as genus-level means, standardized with minimum 0 and maximum 1. Also for illustration purposes only, social learning is displayed as a proportion of research effort whereas, in statistical analyses, social learning is controlled for research effort by including research effort as an independent variable. Images show (A) bearded capuchin (Cebus libidinosus), (B) chimpanzees (Pan troglodytes), and (C) guinea baboons (Papio papio), illustrating lineages that represent convergent coevolution of high social learning abilities, large brain volumes, complex social relationships, and long lifespans. (A) Courtesy of Flickr/Bart van Dorp, (B) courtesy of Flickr/USAID in Africa, and (C) courtesy of Flickr/William Warby.
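
The display scaling described in the Fig. 2 legend (genus-level means rescaled to a minimum of 0 and a maximum of 1) can be sketched as follows. This is only an illustrative sketch, not the authors' code: the function names and the longevity values are hypothetical and are not taken from the paper's datasets.

```python
from collections import defaultdict

def genus_means(records):
    """Average species-level values within each genus.

    `records` is an iterable of (genus, value) pairs (hypothetical layout).
    """
    grouped = defaultdict(list)
    for genus, value in records:
        grouped[genus].append(value)
    return {genus: sum(vals) / len(vals) for genus, vals in grouped.items()}

def min_max(values):
    """Rescale a dict of values so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # All genera identical: no spread to rescale, so return zeros.
        return {key: 0.0 for key in values}
    return {key: (val - lo) / (hi - lo) for key, val in values.items()}

# Made-up longevity values in years, for illustration only.
records = [("Pan", 50.0), ("Pan", 60.0), ("Cebus", 40.0), ("Microcebus", 15.0)]
scaled = min_max(genus_means(records))
print(scaled)
```

The same rescaling would be applied separately to each displayed variable, so the bars are comparable within a variable but not across variables.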

accounts for the loss of a direct relationship between social learning and brain size, these results are consistent with a previous exploratory phylogenetic path analysis showing that social learning and brain volume are related indirectly via links with dietary, social, or life history traits (32).

The positive relationship between social learning and longevity we identify supports the idea that longer lifespans provide species reliant on culture more time to learn novel skills, more time to “cash in” on those skills once learned, and more time to pass them on to their offspring (16, 23, 40). Additionally, longer lifespans may confer greater opportunity for behavioral innovations, providing the raw material for social transmission, because longer lifespans are positively associated with greater propensity to innovate in birds (52) and in primates (ref. 32, albeit indirectly). Culturally acquired knowledge is typically adaptive and may often promote growth and survival of both learners and their dependent young, and thereby extend lifespans (25–27) via a cognitive buffer effect whereby social learning allows individuals to adapt behaviorally to challenging environments (41, 42). These benefits may be sufficient to compensate for negative fitness consequences associated with reliance on social learning, such as increased risk of social transmission of parasites (39).

Although hypotheses for the coevolution of lifespan and culture propose that increases in both juvenile period and overall lifespan are related to reliance on culturally transmitted knowledge (e.g., ref. 16), here, we found that the association between social learning and longevity is driven by an increased reproductive lifespan, rather than an extended period of juvenile dependence. It remains possible that a link between extended juvenile periods and social learning capabilities will be identified in future studies using novel social learning measures, such as those based on experimental tests. Nonetheless, our current findings suggest that an extended reproductive lifespan, during which enhanced fitness benefits of earlier costly investment in learning skills for survival can be reaped, primarily drives the association between social learning and lifespan that we identify here.

Our finding that the relationship between longevity and social learning remains when measures of maternal investment are included in analyses supports these functional arguments and argues against an interpretation solely in terms of developmental constraints, in primates at least. Therefore, in primates, the combination of social learning with large brains may provide a cognitive buffer against environmental unpredictability, improving survival and permitting long lives. Primates may contrast with most mammalian lineages in this regard due to the unusually extensive reliance on culturally transmitted behavior seen in certain lineages (e.g., refs. 35–37), perhaps necessary for social learning to buffer individuals sufficiently against environmental risks.

Our finding of a positive relationship between social learning and group size supports the expectation that large, stable social groups support greater amounts of adaptive cultural knowledge and facilitate a greater reliance on social learning (20). Although this hypothesis is well-established in theoretical models (e.g., refs. 43, 44) and has found recent empirical support in human historical (45) and experimental (46) studies, previous comparative phylogenetic analyses have failed to find this relationship across primate species (21, 25). The fact that we found a positive association here most likely reflects the greater power of our analyses compared with earlier studies, due to the availability of a larger group size database (53) and phylogenetic comparative methods that adjust phylogenetic signal according to the traits included in the model (SI Appendix, Methods), contrasting with the older independent contrasts method, which effectively assumes a maximum level of phylogenetic signal and can therefore be overly conservative (54). The relationship between social learning and group size remains when longevity, brain volume, and body mass are included and therefore seems not to be simply a by-product of the relationship between group size and absolute or relative brain volume, or confounded by life history traits.

Both large social groups and extended longevity (including increases in juvenile period and reproductive and total lifespan) are associated with enlarged brain volume, whether measured in absolute terms or relative to body mass. Group size has proven a

Street et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7911
robust predictor of measures of brain size, particularly relative neocortex size (29, 55, 56), and it remains an important predictor of both absolute and relative whole brain volume, as well as social learning, in our analyses. Thus, our findings support previous studies claiming an important role for social intelligence in primate brain evolution (e.g., refs. 29, 55–57). However, when included together with longevity, longevity is independently related to brain volume whereas group size becomes a fairly weak predictor. This result may be significant because the association of brain volume and longevity is usually not regarded as directly causally relevant in brain evolution (e.g., ref. 29). Further, a recently published comparative analysis suggests that dietary factors, rather than sociality, are the primary drivers of increased relative brain size in primates (31). It remains to be seen whether these findings generalize to measures of neocortex volume, arguably more relevant to social intelligence (29, 55–57). Nonetheless, together, these results reinforce an emerging consensus that sociality is not the sole driver of primate brain evolution but rather is embedded in a nexus of evolutionary conditions that favor brain expansion, including dietary, ecological, life history, and behavioral factors (12, 16, 21, 25, 29, 32).

Across mammals more broadly, the relationship between adult brain mass and longevity is accounted for by patterns of maternal investment and is generally interpreted as a manifestation of developmental costs of producing larger brained offspring, rather than necessarily due to any cognitive or behavioral mechanism (34). Here, however, we found that the associations of longevity with absolute and relative brain volume remain when controlling for maternal investment. Therefore, in primates, compared with mammals in general (34), variation in adult brain size across species cannot be fully accounted for by patterns of maternal investment, and the relationship between brain size and lifespan is potentially indicative of a cognitive buffering (41, 42), rather than solely developmental, mechanism through which cultural intelligence facilitates survival. This contrast can perhaps be explained by divergent scaling relationships between brain volume and neuron number (potentially a more relevant correlate of cognitive capacity) (7, 10, 12) in primates compared with other mammalian lineages. Unlike nonprimate mammalian lineages, such as rodents, in which neuron size increases and neuron density decreases with increased brain volume, in primates, the number of neurons increases approximately isometrically with brain volume (8–11). Therefore, in primates, larger brains may confer stronger benefits in terms of increased cognitive function and behavioral flexibility compared with other mammalian lineages. Overall, together with the strong relationship between social learning and longevity, these findings are consistent with the hypotheses that cultural knowledge facilitates survival and that extended longevity facilitates the acquisition, exploitation, and social transmission of life skills (16, 23, 25–27, 40).

Our finding that longevity is a strong, and potentially causally significant, predictor of both brain volume and social learning richness is evocative of the argument that intelligence and life-history length have coevolved in humans because our intellectual abilities allowed us to exploit high-quality, but difficult-to-access, food resources, with the nutrients gleaned “paying” for brain growth, and with increased longevity favored because it allowed more time to cash in on complex, and difficult to master, foraging skills, with fitness benefits that pay off later in life (16). High levels of knowledge, skill, coordination, and strength are required to exploit the high-quality dietary resources consumed by humans and other apes. Consistent with this idea, the most common use of social learning in primates seems to be in acquiring foraging skills, as ∼50% of reports of social learning in a prior compilation occurred within the context of foraging (25, 58). Complex tool use and extractive foraging abilities require time to acquire, but, in larger brained animals, an extended learning phase, during which productivity is low, can be compensated for by higher productivity during the adult period, provided there is an intergenerational flow of both food and knowledge from old to young (59). Our results are therefore broadly consistent with a cultural intelligence explanation (23–27) manifested in particular primate lineages showing high reliance on social learning, in which selection for efficient social learning has allowed energy gains in diet, which in turn fueled brain growth, and generated selection for extended longevity.

Previous comparative phylogenetic analyses have found social learning to covary positively with rates of behavioral innovation and tool use in primates (21, 25). Additionally, the best supported graphs in exploratory phylogenetic path analyses link technical innovation directly to brain size and social learning and nontechnical innovation indirectly to brain volume via diet and life-history measures (32). Together with the current study, this body of findings is consistent with the hypothesis that cultural intelligence, as manifested by a cluster of behavioral traits, including social learning, innovation, and tool use, may have been a significant driver of primate brain evolution. However, we highlight two notes of caution in particular. First, the majority of primates exhibit comparatively little social learning (Fig. 2) (at least, as reflected in our database), which implies that any selection for cultural intelligence has operated primarily in a small number of large-brained primate lineages. Second, our social learning measure is largely based on observational reports, not controlled experimental tests, whereas social learning is challenging to identify from observation alone (21, 25). However, this approach provides a more naturalistic comparative measure of social learning in comparison with those based on experimental tests, representing a far broader range of primate behavioral diversity, necessary for large-scale comparative investigations (21, 25, 32, 39, 60). Results based on patterns of observational accounts of social learning across species should be valuable in informing and directing future, larger scale comparative experimental investigations of variation in social learning abilities across species (21, 39, 61).

One comprehensive way to interpret these findings is to recognize multiple waves of selection for enlarged brains and enhanced cognition in primates. In addition to selection for the cognitive skills required for complex social lives (29) and dietary niches (31) characteristic of some primate taxa, our results imply a likely later bout of selection for cultural intelligence among a restricted number of large-brained primate lineages. The latter notably include the great apes, but also other independent lineages such as capuchins and baboons (Fig. 2), as our results are not contingent on the inclusion of great apes. Plausibly, complex sociality and foraging may have led to the evolution of large-brained primate lineages, some of which passed a critical threshold in reliance on socially learned behaviors, leading to mutually reinforcing selection for increased brain size, cognitive abilities, and reliance on social learning and innovation, mediated by conferred increases in longevity and diet quality. The twin challenges of complex socioecological niches and reliance on culture may therefore best account for the evolution of large brains, advanced cognition, and extended lifespans in primates. However, our analyses do not allow the direction of causality to be inferred, and other interpretations, for instance, in which large brains evolved for other reasons, subsequently allowing for gains in social and cultural complexity, are equally supported by the findings presented here.

Our results do, however, strongly suggest a strong coevolutionary relationship among cultural intelligence, brain size, sociality, and life-history length in primates. Although we have focused here on nonhuman primates, broader comparative trends support the idea that enlarged brain size, general cognitive abilities, and reliance on culture may have coevolved in other long-lived, highly social lineages, including some birds (e.g., corvids and parrots) and toothed whales (18–20, 22). These associations may be mutually reinforcing (24), with positive feedback loops reaching their zenith in humans,

who are extreme in their encephalization, intelligence, culture, and lifespan (23, 62).

Methods

Data Compilation. All data used in analyses were obtained from existing published datasets, referenced in full below, with additional details in SI Appendix, Methods.

Endocranial volume (ECV, in cubic centimeters) and body mass (in grams) data were obtained from ref. 4. Because ECV reflects the interior volume of the cranial cavity, including not only the volume of the brain but also the volume of protective structures of the brain, such as the meninges (4), and does not allow for separate estimates of the volumes of individual brain components, it is a relatively crude brain measure (6). Nonetheless, ECV is strongly and near isometrically related to brain mass in primates (4), which is itself related approximately isometrically to neuron number (8–11). Moreover, brain volume estimates from ECV (hereafter “brain volume”) are available for around three times more primate species (n = 184 species) (SI Appendix, Methods) than for volumes of individual brain structures (neocortex, cerebellum, etc.; typically ∼60 species) (e.g., ref. 63), allowing for analyses far more representative of the range of interspecific variation in primate brain size (4). Further, because size estimates from brain tissue can be influenced by variation in environmental effects, such as the age and life experience of the individual, along with variation in preservation techniques (6), ECV may be a more consistent measure of species-typical brain size than those derived from direct measurements of volume or mass (4).

Data on social learning richness and a measure of research effort were obtained from ref. 21 via the DataDryad digital repository (64) (see SI Appendix, Methods for full details on the social learning measure, illustrative examples, and discussion of its reliability). Briefly, social learning richness is the number of reports of unique social learning behaviors per primate species, primarily from a literature sample of >4,000 articles from primate behavior journals (from 1925 to 2000) (21). Instances of social learning were identified using keywords (e.g., “social learning,” “cultural transmission,” and “traditional”) to minimize subjectivity in the collation of reports from the literature (21, 25). Although identifying social learning from literature reports of nonhuman primate behavior is inherently challenging, this approach allows for a quantitative behavioral measure of social learning across a large sample of diverse primate species, supporting far larger scale comparative analyses than would be possible using data from controlled experiments alone (21, 25, 32, 39, 60). Experimental approaches to measuring social learning across species are associated with their own particular challenges, especially in comparability and ecological validity of behavioral tests, and limited statistical power due to smaller sample sizes (21, 25, 61). We account for broad-scale species differences in research effort, here estimated using the number of papers published in the Zoological Record (between 1993 and 2001, total 7,288 articles) (21) (see SI Appendix, Methods for further information).

Data on social group size and life history traits (gestation length, weaning age, age of sexual maturity, and maximum longevity) were obtained from the PanTheria dataset (53). As a measure of maternal investment, we summed gestation length and weaning age (following ref. 34). Reproductive lifespan was calculated as age of sexual maturity subtracted from maximum longevity. Comparative datasets were matched to a dated consensus phylogeny for 301 primate species (10kTrees version 3, using GenBank taxonomy) (65). Taxonomic mismatches were resolved using the 10kTrees Translation table and the International Union for Conservation of Nature (IUCN) Red List website (66).

Statistical Analyses. To test predictions, we ran a series of statistical models in which the outcome variables were always either brain volume or social learning, fitting independent variables that correspond to specific predicted associations, along with appropriate potentially confounding variables. Accounting for the effects of multiple variables is essential in comparative studies of brain evolution, due to multiple potential correlates (29). We analyzed brain volume both in absolute terms, and relative to body mass, by variably including body mass as an additional predictor variable. Where social learning was the outcome variable, research effort was always included as a predictor to account for its effect on the number of records of social learning in the primate behavioral literature (21, 25). We also controlled for body mass in models in which life history traits predicted social learning as the outcome variable, due to the well-established association of larger adult body size with slower life histories (e.g., ref. 67). For models including longevity, we reran analyses including maternal investment as an additional predictor to account for its potentially confounding effect on brain volume and longevity (34). Namely, if associations of brain volume and/or social learning with longevity are confounded with maternal investment, we expect to find that, when included together with longevity, only maternal investment is a strong predictor of brain volume and/or social learning (as in ref. 34). Models including longevity as a predictor were also rerun using either juvenile period length (age of sexual maturity) or reproductive lifespan (longevity minus juvenile period), to investigate whether any identified relationships with longevity were driven by increases in juvenile period length, reproductive lifespan, or both. To investigate whether group size and longevity predicted brain volume and social learning independently of each other, we ran additional models in which both group size and longevity were included as predictors. We reran all analyses without great apes, a potentially influential group due to their high social learning richness and large brains (Fig. 2), and due to potential researcher biases toward identifying social learning in apes compared with monkeys (SI Appendix, Methods). We found that none of our key findings were affected, demonstrating that our results are robust to removal of potential outliers and to possible biases associated with this group (SI Appendix, Tables S1–S5).

To analyze data, we used Bayesian phylogenetic generalized linear mixed models, which allow for control for phylogenetic nonindependence and for modeling non-Gaussian response variables, using the R package MCMCglmm (68). Where brain volume was the response variable, Gaussian models were used with all variables log-10 transformed, diffuse normal priors for the fixed effects with a mean of 0 and a large variance (10^10), and inverse-Wishart priors for the phylogenetic and residual variance (with V = 1, ν = 0.002). Where social learning was the response variable, Gaussian models were not appropriate due to the highly skewed distribution of this variable; we therefore used Poisson models, with all predictor variables log-10 transformed and with nontransformed response variables. Poisson models used the same priors for the fixed effects and residual variance as for the Gaussian models, with a parameter-expanded prior (V = 1, ν = 1, αμ = 0, and αV = 25^2) for the phylogenetic random effect (68, 69). Although a large proportion of the species included in analyses had zero records of social learning, these species are still informative due to the inclusion of research effort in all models (SI Appendix, Methods). Further, preliminary analyses established that Poisson models without a zero-inflation term were appropriate for our data (SI Appendix, Methods).

Markov chain Monte Carlo (MCMC) analyses were run with a sufficient number of iterations and thinning to return effective sample sizes of >1,000 for all parameters (SI Appendix, Methods). Chain convergence and adequate performance were confirmed by visual inspection of trace plots and checking effective sample sizes. From each model, we report the mean h^2 (a measure of phylogenetic signal equivalent to Pagel’s λ) (70) and mean β coefficient estimate from posterior distributions. To assess the strength of evidence for fixed effects, we use the percentage of posterior β coefficient estimates crossing zero in the direction opposite to predictions (as in refs. 39, 71, and 72, for example). Posterior distributions shifted substantially away from zero in a positive or negative direction indicate support for positive or negative associations, respectively, between fixed effects and outcome variables. Conversely, posterior distributions centered on zero or overlapping substantially with zero indicate a lack of evidence for any relationship between the fixed effects and outcome variables. Here, all associations are predicted to be positive in direction. As a measure of model fit, we used a pseudo R^2, estimated as the squared Pearson’s correlation between fitted values and observed data (73). No analysis reported a variance inflation factor (VIF) above 5, demonstrating that multicollinearity was not a concern in our analyses (SI Appendix, Methods).

ACKNOWLEDGMENTS. We thank Chris Venditti for advice regarding the implementation of phylogenetic Poisson models. Research was supported in part by European Research Council Advanced Grant “Evoculture” 232823 (to K.N.L.), John Templeton Foundation Grant 23807 (to K.N.L. and S.M.R.), and Natural Sciences and Engineering Research Council of Canada Grants 418342-2012 and 429385-2012 (to S.M.R.).
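
As a rough illustration of the summary quantities described in Methods (the derived life history variables, the percentage of posterior β estimates crossing zero, and the pseudo R^2), a minimal Python sketch follows. It is not the authors' analysis code, which used the R package MCMCglmm. All function names and numbers here are hypothetical, and counting samples at or below zero (rather than strictly below) is an assumption made for this sketch.

```python
def reproductive_lifespan(max_longevity, age_sexual_maturity):
    """Maximum longevity minus age of sexual maturity (same time units)."""
    return max_longevity - age_sexual_maturity

def maternal_investment(gestation_length, weaning_age):
    """Sum of gestation length and weaning age (same time units)."""
    return gestation_length + weaning_age

def pct_crossing_zero(beta_samples):
    """Percentage of posterior samples opposite to a positive prediction.

    Assumption for this sketch: samples at or below zero count as crossing.
    """
    crossing = sum(1 for b in beta_samples if b <= 0)
    return 100.0 * crossing / len(beta_samples)

def pseudo_r2(fitted, observed):
    """Squared Pearson correlation between fitted and observed values."""
    n = len(fitted)
    mean_f = sum(fitted) / n
    mean_o = sum(observed) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(fitted, observed))
    var_f = sum((f - mean_f) ** 2 for f in fitted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov * cov / (var_f * var_o)

# Made-up posterior draws for one beta coefficient, for illustration only.
samples = [0.4, 0.2, -0.1, 0.3, 0.5, 0.1, 0.6, 0.2, 0.3, 0.4]
crossing = pct_crossing_zero(samples)
fit = pseudo_r2([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(crossing, fit)
```

With real model output, `samples` would be the stored posterior draws for one fixed effect, and a small percentage would correspond to the "<1% β crossing zero" style of result reported in the main text.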

1. Striedter GF (2005) Principles of Brain Evolution (Sinauer, Sunderland, MA).
2. Finlay BL, Darlington RB (1995) Linked regularities in the development and evolution of mammalian brains. Science 268:1578–1584.
3. Barton RA, Harvey PH (2000) Mosaic evolution of brain structure in mammals. Nature 405:1055–1058.
4. Isler K, et al. (2008) Endocranial volumes of primate species: Scaling analyses using a comprehensive and reliable data set. J Hum Evol 55:967–978.
5. Barton RA (2006) Primate brain evolution: Integrating comparative, neurophysiological, and ethological data. Evol Anthropol 15:224–236.
6. Healy SD, Rowe C (2007) A critique of comparative studies of brain size. Proc Biol Sci 274:453–464.
7. Chittka L, Niven J (2009) Are bigger brains better? Curr Biol 19:R995–R1008.
8. Herculano-Houzel S, Collins CE, Wong P, Kaas JH (2007) Cellular scaling rules for primate brains. Proc Natl Acad Sci USA 104:3562–3567.

9. Herculano-Houzel S (2009) The human brain in numbers: A linearly scaled-up primate brain. Front Hum Neurosci 3:31.
10. Herculano-Houzel S (2011) Brains matter, bodies maybe not: The case for examining neuron numbers irrespective of body size. Ann N Y Acad Sci 1225:191–199.
11. Herculano-Houzel S, Manger PR, Kaas JH (2014) Brain scaling in mammalian evolution as a consequence of concerted and mosaic changes in numbers of neurons and average neuronal cell size. Front Neuroanat 8:77.
12. Barton RA (2012) Embodied cognitive evolution and the cerebellum. Philos Trans R Soc Lond B Biol Sci 367:2097–2107.
13. Deaner RO, Isler K, Burkart J, van Schaik C (2007) Overall brain size, and not encephalization quotient, best predicts cognitive ability across non-human primates. Brain Behav Evol 70:115–124.
14. MacLean EL, et al. (2014) The evolution of self-control. Proc Natl Acad Sci USA 111:E2140–E2148.
15. Later W, et al. (2010) Is the 1975 Reference Man still a suitable reference? Eur J Clin Nutr 64:1035–1042.
16. Kaplan H, Hill K, Lancaster J, Hurtado AM (2000) A theory of human life history evolution: Diet, intelligence, and longevity. Evol Anthropol 9:156–185.
17. Boyd R, Silk JB (2012) How Humans Evolved (Norton, New York).
18. Emery NJ, Clayton NS (2004) The mentality of crows: Convergent evolution of intelligence in corvids and apes. Science 306:1903–1907.
19. Emery NJ (2006) Cognitive ornithology: The evolution of avian intelligence. Philos Trans R Soc B Biol Sci 361:23–43.
20. Rendell L, Whitehead H (2001) Culture in whales and dolphins. Behav Brain Sci 24:309–324.
21. Reader SM, Hager Y, Laland KN (2011) The evolution of primate general and cultural intelligence. Philos Trans R Soc B Biol Sci 366:1017–1027.
22. Hunt GR, Gray RD (2003) Diversification and cumulative evolution in New Caledonian crow tool manufacture. Proc Biol Sci 270:867–874.
23. Boyd R, Richerson PJ (1985) Culture and the Evolutionary Process (Univ of Chicago Press, Chicago).
24. Wilson AC (1985) The molecular basis of evolution. Sci Am 253:164–173.
25. Reader SM, Laland KN (2002) Social intelligence, innovation, and enhanced brain size in primates. Proc Natl Acad Sci USA 99:4436–4441.
26. Whiten A, van Schaik CP (2007) The evolution of animal “cultures” and social intelligence. Philos Trans R Soc B Biol Sci 362:603–620.
27. van Schaik CP, Burkart JM (2011) Social learning and evolution: The cultural intelligence hypothesis. Philos Trans R Soc Lond B Biol Sci 366:1008–1016.
28. Whiten A, Byrne RW (1997) Machiavellian Intelligence II: Extensions and Evaluations (Cambridge Univ Press, Cambridge, UK).
29. Dunbar RIM, Shultz S (2007) Understanding primate brain evolution. Philos Trans R Soc Lond B Biol Sci 362:649–658.
43. Henrich J (2004) Demography and cultural evolution: How adaptive cultural processes can produce maladaptive losses—the Tasmanian case. Am Antiq 69:197–214.
44. Powell A, Shennan S, Thomas MG (2009) Late Pleistocene demography and the appearance of modern human behavior. Science 324:1298–1301.
45. Kline MA, Boyd R (2010) Population size predicts technological complexity in Oceania. Proc Biol Sci 277:2559–2564.
46. Derex M, Beugin M-P, Godelle B, Raymond M (2013) Experimental evidence for the influence of group size on cultural complexity. Nature 503:389–391.
47. Leadbeater E, Chittka L (2007) Social learning in insects: From miniature brains to consensus building. Curr Biol 17:R703–R713.
48. Street SE, Laland KN (2017) Social learning, intelligence, and brain evolution. The Wiley Handbook of Evolutionary Neuroscience, ed Shepherd SV (John Wiley, Chichester, UK), pp 495–513.
49. Barton RA (1998) Visual specialization and brain evolution in primates. Proc Biol Sci 265:1933–1937.
50. Barton RA (2004) Binocularity and brain evolution in primates. Proc Natl Acad Sci USA 101:10113–10115.
51. Heyes C (2012) What’s social about social learning? J Comp Psychol 126:193–202.
52. Sol D, Sayol F, Ducatez S, Lefebvre L (2016) The life-history basis of behavioural innovations. Philos Trans R Soc Lond B Biol Sci 371:20150187.
53. Jones KE, et al. (2009) PanTHERIA: A species-level database of life history, ecology, and geography of extant and recently extinct mammals. Ecology 90:2648.
54. Carvalho P, Diniz-Filho JAF, Bini LM (2006) Factors influencing changes in trait correlations across species after using phylogenetic independent contrasts. Evol Ecol 20:591–602.
55. Dunbar RIM (1995) Neocortex size and group size in primates: A test of the hypothesis. J Hum Evol 28:287–296.
56. Dunbar RIM (1998) The social brain hypothesis. Evol Anthropol 6:178–190.
57. Dunbar RIM (1992) Neocortex size as a constraint on group size in primates. J Hum Evol 22:469–493.
58. Reader SM (2000) Social learning and innovation: Individual differences, diffusion dynamics and evolutionary issues. PhD dissertation (University of Cambridge, Cambridge, UK).
59. Kaplan HS, Robson AJ (2002) The emergence of humans: The coevolution of intelligence and longevity with intergenerational transfers. Proc Natl Acad Sci USA 99:10221–10226.
60. Lefebvre L, Reader SM, Sol D (2004) Brains, innovations and evolution in birds and primates. Brain Behav Evol 63:233–246.
61. Bates LA, Byrne RW (2007) Creative or created: Using anecdotes to investigate animal cognition. Methods 42:12–21.
62. Pagel M (2012) Evolution: Adapted to culture. Nature 482:297–299.
63. Reader SM, MacDonald K (2003) Environmental variability and primate behavioural
30. Clutton-Brock TH, Harvey PH (1980) Primates, brains and ecology. J Zool 190:309–323.
31. DeCasien AR, Williams SA, Higham JP (2017) Primate brain size is predicted by diet but flexibility. Animal Innovation, eds Reader SM, Laland KN (Oxford Univ Press, Oxford),
not sociality. Nat Ecol Evol 1:0112. pp 83–116.
32. Navarrete AF, Reader SM, Street SE, Whalen A, Laland KN (2016) The coevolution of 64. Reader SM, Hager Y, Laland KN (2011) Data from: The evolution of primate general
innovation and technical intelligence in primates. Philos Trans R Soc Lond B Biol Sci and cultural intelligence. Dryad Digital Repository. Available at dx.doi.org/10.5061/
371:20150186. dryad.t0q94. Accessed November 23, 2016.
33. Byrne RW, Corp N (2004) Neocortex size predicts deception rate in primates. Proc Biol 65. Arnold C, Matthews LJ, Nunn CL (2010) The 10kTrees Website: A new online resource
Sci 271:1693–1699. for primate phylogeny. Evol Anthropol 19:114–118.
34. Barton RA, Capellini I (2011) Maternal investment, life histories, and the costs of brain 66. IUCN (2016) The IUCN Red List of Threatened Species. Version 2016-2. Available at
growth in mammals. Proc Natl Acad Sci USA 108:6169–6174. www.iucnredlist.org. Accessed November 24, 2016.
35. Whiten A, et al. (1999) Cultures in chimpanzees. Nature 399:682–685. 67. West HER, Capellini I (2016) Male care and life history traits in mammals. Nat
36. van Schaik CP, et al. (2003) Orangutan cultures and the evolution of material culture. Commun 7:11854.
Science 299:102–105. 68. Hadfield JD (2010) MCMC methods for multi-response generalized linear mixed
37. Perry S, et al. (2003) Social conventions in wild white-faced capuchin monkeys. Curr models: The MCMCglmm R package. J Stat Softw 33:1–22.
Anthropol 44:241–269. 69. Hadfield J (2016) MCMCglmm Course Notes. Available at: ftp://cran.r-project.org/pub/
38. Henrich J (2015) The Secret of Our Success: How Culture Is Driving Human Evolution, R/web/packages/MCMCglmm/vignettes/CourseNotes.pdf. Accessed January 5, 2017.
Domesticating Our Species, and Making Us Smarter (Princeton Univ Press, Princeton). 70. Hadfield JD, Nakagawa S (2010) General quantitative genetic methods for compar-
39. McCabe CM, Reader SM, Nunn CL (2015) Infectious disease, behavioural flexibility and ative biology: Phylogenies, taxonomies and multi-trait models for continuous and
the evolution of culture in primates. Proc Biol Sci 282:20140862. categorical characters. J Evol Biol 23:494–508.
40. Laland KN (2017) Darwin’s Unfinished Symphony: How Culture Made the Human 71. Capellini I, Baker J, Allen WL, Street SE, Venditti C (2015) The role of life history traits
Mind (Princeton Univ Press, Princeton). in mammalian invasion success. Ecol Lett 18:1099–1107.
41. Sol D (2009) Revisiting the cognitive buffer hypothesis for the evolution of large 72. Allen WL, Street SE, Capellini I (2017) Fast life history traits promote invasion success
brains. Biol Lett 5:130–133. in amphibians and reptiles. Ecol Lett 20:222–230.
42. González-Lagos C, Sol D, Reader SM (2010) Large-brained mammals live longer. J Evol 73. Zheng B, Agresti A (2000) Summarizing the predictive power of a generalized linear
Biol 23:1064–1074. model. Stat Med 19:1771–1781.

7914 | www.pnas.org/cgi/doi/10.1073/pnas.1620734114 Street et al.


COLLOQUIUM
PAPER
The evolution of cognitive mechanisms in response to
cultural innovations
Arnon Lotem^a, Joseph Y. Halpern^b, Shimon Edelman^c, and Oren Kolodny^d,1
^aDepartment of Zoology, Tel Aviv University, Tel Aviv 6997801, Israel; ^bDepartment of Computer Science, Cornell University, Ithaca, NY 14850; ^cDepartment of Psychology, Cornell University, Ithaca, NY 14850; and ^dDepartment of Biology, Stanford University, Stanford, CA 94305

Edited by Kevin N. Laland, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark April 29, 2017
(received for review January 20, 2017)

When humans and other animals make cultural innovations, they also change their environment, thereby imposing new selective pressures that can modify their biological traits. For example, there is evidence that dairy farming by humans favored alleles for adult lactose tolerance. Similarly, the invention of cooking possibly affected the evolution of jaw and tooth morphology. However, when it comes to cognitive traits and learning mechanisms, it is much more difficult to determine whether and how their evolution was affected by culture or by their use in cultural transmission. Here we argue that, excluding very recent cultural innovations, the assumption that culture shaped the evolution of cognition is both more parsimonious and more productive than assuming the opposite. In considering how culture shapes cognition, we suggest that a process-level model of cognitive evolution is necessary and offer such a model. The model employs relatively simple coevolving mechanisms of learning and data acquisition that jointly construct a complex network of a type previously shown to be capable of supporting a range of cognitive abilities. The evolution of cognition, and thus the effect of culture on cognitive evolution, is captured through small modifications of these coevolving learning and data-acquisition mechanisms, whose coordinated action is critical for building an effective network. We use the model to show how these mechanisms are likely to evolve in response to cultural phenomena, such as language and tool-making, which are associated with major changes in data patterns and with new computational and statistical challenges.

tool-making | language evolution | niche construction | cognitive evolution | social learning

EVOLUTION | PSYCHOLOGICAL AND COGNITIVE SCIENCES

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: A.L., J.Y.H., S.E., and O.K. designed research, performed research, and wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. K.N.L. is a guest editor invited by the Editorial Board.
1 To whom correspondence should be addressed. Email: okolodny@stanford.edu.
www.pnas.org/cgi/doi/10.1073/pnas.1620742114 PNAS | July 25, 2017 | vol. 114 | no. 30 | 7915–7922

An open question in the study of culture and cognitive evolution is whether (and to what extent) cognitive mechanisms, especially those viewed as advanced or sophisticated, evolved in response to social-learning challenges or are merely the product of domain-general mechanisms (1–3). According to one view, still widely held in cognitive science and evolutionary psychology, cognitive adaptations take the form of specialized brain modules (or neuronal mechanisms) that evolved for specific, often social purposes, such as “imitation” (4, 5), “mind reading” (6, 7), “cheating detection” (8), or most famously, language acquisition (9, 10). These ideas have been criticized on theoretical and empirical grounds (11, 12), and the debate around them demonstrates our limited understanding of the evolution of cognition, its relationship to the evolution of social behavior and, in some organisms, culture.

The question of whether culture and social behavior shape the evolution of the brain is, in our view, best considered using the evolutionary framework of niche construction (13–16): that is, culture and social behavior change the ecological niche to which cognitive traits must adapt in the same manner that nest-building by birds changes the ecological niche in which their nestlings evolve. For example, animals’ ability to learn from each other may have initially been a by-product of domain-general associative learning mechanisms that did not evolve for social learning (1). However, as soon as these mechanisms enabled social learning and were recruited by it for regular use, social learning and its outcomes also became part of their ecological niche. From that moment on, learning mechanisms that need not have initially been specifically social were also selected according to their ability to support social learning. If that is the case, one can certainly claim that these mechanisms were adapted or shaped to serve their new social function (although using the term “evolved for” may still be premature without knowing the degree of genetic modification and specialization).

Similarly, when social learning enables the accumulation or spread of shared group behaviors, these days recognized as the formation of “culture” (17, 18), this culture becomes the new ecological niche for all of the learning mechanisms that contribute to it, and therefore has the potential to shape their evolution. Thus, in theory, given sufficient evolutionary time, cultural phenomena that are adaptive for the individual, and whose acquisition is supported by advanced learning or cognitive skills, such as the ability to imitate or to learn language, are expected to select for improvements in these cognitive skills (see also ref. 19). In practice, however, clear evidence showing the effect of culture on cognition is lacking, and alternative accounts for the evolution of advanced cognition and culture through domain-general learning principles cannot be ruled out (1, 2, 20). As a result, whether and how culture really shapes the evolution of cognition is still under debate.

In what follows, we first clarify some of the theoretical issues in this debate, using two recent controversies in the fields of language evolution and social learning. We then offer a process-level approach to cognitive evolution that may be useful in predicting what aspects of learning and cognition are likely to coevolve with culture. Finally, we use the model to demonstrate how cultural phenomena such as language and tool-making (each related to one of the two controversies discussed earlier) are likely to shape cognition, given their association with changes in data distribution and with new computational and statistical challenges.

Can Culture Evolve Without Shaping Cognition? On Parsimony, Likelihood, and Scientific Productivity

Evolution takes time, so it is clear that very recent cultural innovations, such as cars, computers, cellular phones, or the Internet, could not have yet generated detectable effects (or perhaps any effect at all) on the evolution of cognition. But what about relatively ancient cultural phenomena, such as song-learning in birds or tool-making and language acquisition in humans? Although there is evidence for the effect of human culture on biological traits and gene frequencies (21), evidence for specific effects of human culture on learning and cognitive mechanisms is mostly circumstantial. This evidence includes
signs of selection on genes implicated in brain growth, learning, and cognition (22–24) that may be attributed to human culture (21), and differences in gene expression in the brain between human and nonhuman primates (25, 26) that may be interpreted similarly. There is also recent evidence relating structural changes in the human brain to Paleolithic tool-making abilities (27, 28), but additional work is still needed to clarify the direction of causation between culture and cognition (we will return to discuss these findings toward the end of the paper). Finally, when animal culture is considered, a recent study (29) suggests an effect of culture on learning in songbirds: a larger repertoire size is found in species that developed open-ended learning ability.

The lack of clear empirical evidence for the effect of culture on cognition is not surprising, given that cognitive mechanisms and their genetic underpinning are still poorly understood, making it difficult to track their evolution (as opposed to that of clearly defined biological traits). As a result, much of the debate is focused on theoretical arguments of plausibility and likelihood, which may be interpreted differently by psychologists and evolutionary biologists. We use two examples of such controversies to illustrate this problem and to suggest a methodologically productive resolution.

Problem 1: The Evolution of Social-Learning Mechanisms. Social learning is broadly defined as learning that is influenced by observation or interaction with other individuals or with the products of their behavior (30). This definition leaves open the question of whether social-learning mechanisms have evolved specifically to serve their social function or whether they are domain-general associative learning mechanisms that are also used to learn socially. In a series of thought-provoking papers, Heyes and her colleagues (1, 2, 31–33) have demonstrated that most mechanisms of social learning and imitation that are normally viewed as specialized adaptations for social life (4, 5, 34) can also be explained by domain-general associative learning principles. In this light, and in the absence of convincing evidence to the contrary, they also suggested that there is no need to posit that these domain-general mechanisms were shaped by their social or cultural function (35). In other words, it would appear more parsimonious to assume that these mechanisms did not evolve beyond their initial domain-general form. However, this appeal to parsimony is somewhat misleading in evolutionary contexts and time scales, where changes are actually to be expected (36, 37). In fact, for most evolutionary biologists, it would appear highly unlikely that some learning mechanisms would be used for many generations to serve social functions, yet remain unaffected by this new social niche. It would almost be like expecting the surface of the moon to remain unmarked with craters after millions of years of exposure to space debris.

To explain why, imagine a population of finches that expands its range. The beaks of these finches have a certain morphology that evolved to handle seed types present in the original habitat; the expanded habitat includes novel types of seeds. In this setting, the assumption that after many generations there would be no evolutionary change in beak morphology is not at all parsimonious. For a biological trait to remain unchanged over time, an active process of stabilizing selection is necessary, otherwise it will change through directional selection or genetic drift. That is, the probability that the new types of seeds will not affect the previous stabilizing selection regime is extremely small.

Returning to the evolution of learning, and following the same reasoning, it is difficult to imagine how adding a new function for a basic learning mechanism would not affect its evolution. As in the beak example, the change may be subtle, and merely quantitative. This is in part because a bird has only one beak that must serve many different functions (from foraging for different types of food to feather-preening and nest-building); it cannot be specialized into different kinds of beaks. However, even in this case of a single multipurpose adaptation, every new function must affect evolution; we would expect a more significant change if specialization is possible. Similarly, even if all cognitive functions are supported by the same domain-general learning mechanisms, it is unlikely that social learning and culture could evolve without somehow affecting these mechanisms. In reality, of course, unlike in the beak example, learning mechanisms are not necessarily constrained to be uniform across all domains; there is plenty of evidence for adaptive specialization in associative learning mechanisms (e.g., refs. 38–40).

The key point in this evolutionary argument is that, even in the absence of supportive evidence, it is more parsimonious to assume that learning mechanisms were shaped by their social function and, if the relevant evidence is lacking, that we have simply failed to find it, than to assume the opposite. The assumption of no change requires us to posit a lack of genetic variance in learning mechanisms, which runs counter to substantial evidence (39, 41–45), or else to explain how the selection regime was miraculously unaffected by the new social niche. The assumption of evolutionary change may also be viewed as scientifically more productive, as it encourages further research (46). This implies that a useful working hypothesis should be that “culture did shape cognition” and that we have to find out how.

Demonstrating that social learning and the so-called “mirror neurons” phenomenon can be explained by associative learning principles (1, 2) is important and consistent with our view. But given the evolutionary argument above, the demonstration need not imply that these associative learning mechanisms did not evolve beyond their basic state (see also ref. 47). Indeed, the argument suggests that associative learning mechanisms are the building blocks of cognitive evolution, and are finely tuned to serve new social, cultural, and other advanced functions (48). This view may help us make the shift from postulating “black box” adaptations that evolve “for” particular social purposes to having credible process-level mechanistic models of cognitive evolution.

Problem 2: The Evolution of Language and Memory Constraints. The same evolutionary argument discussed above is also relevant to the question of whether or not human language shaped the evolution of cognition (16, 49–52). It implies that it is highly unlikely that the use of language for at least several thousand generations [or even more (53–55)] failed to affect some aspects of learning and cognition. The real question is not whether or not it did, but rather what aspects were affected and how. For example, Chater et al. have pointed out correctly that many aspects of human language change too fast for genetic evolution to respond (56). The authors used computer simulations to show that even in the presence of genetic variation, cultural conventions of language are like “moving targets” for natural selection, making the evolution of genetic adaptations to specific languages highly implausible. However, although this analysis makes a convincing argument against strong nativism (the claim that there is a significant innate component to human language), it also implies that genes for language can evolve if they serve general skills for language learning that are stable over time (49, 56, 57). In other words, the need to learn a language may still select for many general cognitive abilities, such as better memory, computational abilities, or greater attention to verbal input. Indeed, it does not seem possible that language could evolve without affecting such abilities.

Interestingly, in a more recent paper, Christiansen and Chater (58) have extended their view that language cannot shape the evolution of cognition by proposing that the limited sensory memory span (the window of time during which a linguistic utterance is retained in its entirety) creates a bottleneck that strongly constrains language and cannot evolve to become wider. The authors suggest that language has evolved to cope with this memory limitation, but that the evolution of this memory limitation was not affected by language. As we noted elsewhere (59), this assumption runs counter to the evolutionary argument above and to substantial evidence for genetic variance in memory parameters (60–63). In a reply to this criticism, Chater and Christiansen (64) explained that the memory bottleneck cannot be viewed as a genetically variable trait that can respond freely to selection because it “emerges from the computational architecture
of the brain.” This answer, however, merely kicks the can down the road, by moving the problem from the domain of memory to that of the computational architecture of the brain. The same arguments hold: although constrained by many factors, the computational architecture of the brain is shaped by the sum of selective pressures arising from the need to accommodate the multiple functions of the brain that influence the organism’s fitness. Even if the memory bottleneck emerges as a product of this architecture, it can still evolve as long as this architecture evolves. If the challenge of processing and using language, which affects individuals’ fitness and has been in place for thousands of generations, has played a role in shaping this architecture and its emergent properties, then, as one of these emergent properties, memory should have been affected. Specifically, as we proposed in the past (59) and explain further below, the selective pressure exerted by language learning may have acted to limit the working-memory buffer, as this may be useful for coping with the computational challenges involved in data segmentation and network construction.

Our two examples, social learning and memory bottleneck, suggest that, excluding very recent cultural innovations, it is unlikely that culture could have evolved without shaping learning and cognition. This forces us to think more specifically about how culture shapes cognition, which requires, as we claim next, adopting a process-level approach to cognitive evolution: that is, a mechanistic model that explains a behavior or an ability as the outcome of a process. Such a model may also help provide a useful structure for reexamining the two problems outlined above.

Why Do We Need a Process-Level Approach to Theorize About Culture and Cognition?

Whereas it is relatively easy to see how natural selection acts on clearly defined morphological traits, such as limbs, bones, or coloration, with cognitive traits that are not well understood, it is difficult to tell what is actually evolving. Cognition is not a physical trait, but an emergent property of processes that are carried out by multiple mechanisms, most of which involve learning. Thus, to consider how culture shapes the evolution of cognition, we must explain how such mechanisms work and how they can be modified by natural selection. The importance of using mechanistic models in the study of behavioral evolution is increasingly recognized (65–68), but most attempts to integrate evolutionary theory and cognition are still based on modeling the evolution of learning rules that are far too simple to capture complex cognition (69–74). To understand how culture shapes the evolution of cognitive mechanisms, such as those serving imitation, theory of mind, or language acquisition, it is necessary to have models that explain how such mechanisms work and how they could evolve.

Clearly, given the immense complexity of the brain, any attempt to propose a general process-level model of advanced cognition would be ambitious. However, we believe that it is possible and necessary to start by constructing models that capture some of the key working principles of advanced cognitive mechanisms in a manner that suffices to explain their evolution. An analogy that may clarify our approach is the apparent challenge in explaining the evolution of the eye. The vertebrate eye is highly complex; it is initially hard to see how it could have evolved. However, with a minimal understanding of how the eye works, the “magic” is removed (75). The basic eye model is a layer of photosensitive cells; the visual acuity it provides can gradually improve as it buckles into a ball-like shape, looking (and working) more and more like a pinhole camera. This sketch ignores many details, and is far from explaining everything about eyes and vision, but is sufficient to resolve the puzzle.

This is the kind of modeling approach that we seek for explaining cognitive evolution. Specifically, we do not seek a fully detailed neuronal-level model of brain and cognition. Instead, we want a minimal set of principles that suffice to explain how simple operational units, capable of only the most basic forms of learning, can jointly and gradually create the much more sophisticated mechanisms of advanced cognition. [Powerful algorithms using deep learning (76, 77), with less emphasis on biological and behavioral realism (78), are being developed and applied to challenging tasks, including: perceptual parsing, associative learning, and the learning of conceptual contingencies (79, 80).]

Once we have such a minimal model, we can consider how small variations in the basic operational units and their parameters can build better, or different, cognitive mechanisms and how this evolutionary process can be shaped by culture. Over the past few years, we have developed such a model and explored its ability to explain a range of phenomena. In the following sections, we briefly describe this model and use it to consider how culture may shape the evolution of cognition and, in particular, how such a model may help resolve the abovementioned controversies regarding social-learning mechanisms and memory constraints on language.

A Process-Level Model of Cognitive Evolution

The model presented here has already been described in several of our previous papers (81–87). Some of the main aspects of the model were implemented in a set of computer simulations, demonstrating a gradual evolutionary trajectory, from simple associative learning, to chaining, to seldom-reinforced continuous learning [in which a network model of the environment is constructed (84)], to complex hierarchical sequential learning that can support advanced cognitive abilities of the kind needed for language acquisition and for creativity (85, 87). For the latter, our modeling framework had to go well beyond chaining through second-order conditioning (84, 88). The success of a computer program, originally developed to simulate the behavior of animals learning to forage for food in structured environments (85), in reproducing a range of findings in human language (86) suggests that the model may be useful in the study of cognitive evolution. Thus far the model’s implementation has been limited, for simplicity, to an unsupervised learning mode with a learning phase, and then a test phase during which the learners act based on what they have learned. Its extension to accommodate iterated cycles of learning and action, which is necessary to capture the learning of behavioral contingencies through trial and error, is straightforward. Detailed pseudocode for our model is found in the supplementary material of refs. 85 and 86.

The model is based on coevolving mechanisms of learning and data acquisition that jointly construct a complex network that represents the environment and is used for computing adaptive responses to challenges in the environment. In particular, the network is used for search, prediction, decision making, and generating behavioral sequences (including language utterances, when applicable). The extent to which learners’ use of the network produced adaptive behaviors was measured in our implementation by foraging success [in the context of animal foraging (84, 85, 87)] and by a set of language performance scores [in the context of language learning (86)]. Although the production of adaptive behaviors depends on the structure of the network, this structure is not directly coded by genes and therefore cannot evolve directly. The components that may evolve over generations are the parameters of the learning and data-acquisition mechanisms that construct the network through interaction with the environment, and whose coordinated action, as we show below, is critical for building the network appropriately. [This coevolution is very much in the spirit of the notions of constructive development and reciprocal causation in the recently proposed “extended evolutionary synthesis” (89).]

Constructing a Network. We now briefly sketch the main principles that govern how the network is constructed. (This technical description may become clearer and more intuitive after reading the simplified example outlined in the next subsection and illustrated by Fig. 1.) We assume that what data are acquired by the learner is determined by its “data-acquisition mechanisms”: the collection of sensory, attentional, and motivational mechanisms that direct the learner to process and acquire whatever is deemed relevant. These mechanisms [also referred to as “input
mechanisms” (33)] determine the content and the distribution of different data items in the
input. For simplicity, we assume that the input takes the form of strings of symbols (i.e.,
linear sequences of discrete items), which are then processed through a limited working-memory
buffer, similar to the “phonological loop” in humans (90, 91), and tested for familiar segments
and statistical regularities among their components. This is done by the learning mechanisms in
a sequence of steps.

A data sequence is scanned for subsequences that recur within it and for previously learned
subsequences, and is segmented accordingly. This results in a series of chunks, which are either
previously known or are incorporated at this point into a network of nodes that represents the
world as it has been learned so far (nodes stand for objects or other meaningful units; the
links in the network represent their association in time and space). Weights are assigned to the
nodes and to the links to reflect their frequency of occurrence: links between nodes are
established whenever two nodes follow one another in the input. The weight of a node or a link
is increased whenever it is encountered in the input; the weight also decreases with time if it
is not encountered. This process ensures that only those units and relations that are
potentially meaningful are retained in memory, and spurious occurrences are forgotten. If a
node’s or link’s weight increases above a fixation threshold, its decay becomes highly
improbable. The probability that a data item is learned is thus determined by how frequently it
is encountered in the data, and by the parameters of weight increase and decrease. These
parameters create a window for learning, during which data can be either retained or discarded
from the network.

We assume that if a data sequence reaches the threshold weight for memory fixation, it remains
in memory and is not segmented any further. An intuitive example from language learning is a
word such as “backpack,” which would be fixed in memory if it were heard repeatedly without
prior exposure to instances of “back” or “pack” (not even within other sequences, such as “on my
back” or “in the pack”). If “back” or “pack” are heard often, then their partial commonality
with “backpack” would result in “backpack” being segmented into “back” and “pack” (with a
directed link between them; i.e., back→pack). The fixation of long sequences may have a positive
or a negative impact on a learner’s success, as discussed below and in refs. 81 and 87. Note
that the fixation of “backpack” does not prevent the formation of separate nodes for “back” and
“pack” following later observations.

In addition to breaking up segments to form smaller segments, a node can be formed by the
concatenation of smaller segments after they are repeatedly observed in succession. Thus, nodes
can be formed “top-down” directly from the raw input by segmentation, or “bottom-up” through
concatenation of previously learned units, creating a hierarchical structure, with potentially
multiple hierarchies that can be perceived as “sequences of shorter sequences” (see refs. 85 and
86 for more details). In both cases, the effects that memory parameters have on learning amount
to a test of statistical significance: natural and meaningful patterns are likely to recur and
thus pass the test, whereas spurious patterns decay and are forgotten.

A Simplified Example. To better understand the process of data segmentation and network
construction, a simplified example is illustrated in Fig. 1A. This example shows a network that
is constructed as the result of acquiring three specific strings of data, under the assumption
that the weight-increase parameter is 0.4, the fixation threshold is 1.0 (which means that a
data item reaches fixation after three successive observations, because 3 × 0.4 > 1), and the
weight-decrease (decay) parameter is 0.01 (i.e., the weight of a data item that is not yet
fixated decreases by 0.01 with each symbol that enters the input). It is also assumed that the
working-memory buffer can accommodate up to 24 symbols (which is the length of one data string
in the example in Fig. 1A), and that the strings in this illustration are separated by 30
additional (irrelevant) characters that prevent parts of any two of these strings from being
processed simultaneously in the memory buffer. The figure demonstrates that repeated sequences
within each string (highlighted by shades of gray for clarity) are segmented based on their
similarity and become the data units that form the nodes of the network. Directed links
represent past association between these units; thus, they represent statistical regularities
of the environment. For example, 98 always follows 756 and precedes 136, whereas 756 leads to
48361, 98, and 28 with equal probability. Despite the simplicity of this network, we can already
observe that 98 and 28 have a similar link structure: both are preceded by 756 and followed by
136. In our earlier work (85, 86), we showed how such similarity in link structure can be used
for generalization, for the construction of hierarchical representations, and for creativity
(87).

We stressed earlier that, according to our model, the coordinated action of learning and data
acquisition mechanisms and their evolution in response to typical input characteristics are
critical for building an effective network. This point is illustrated by Fig. 1B, where the same
data as in Fig. 1A are now distributed differently, leading to a radically different network
representation (although the learning parameters and the working-memory buffer size remain the
same). The distribution of the data input in Fig. 1B leads to the fixation of large
idiosyncratic data sequences and to poor link structure, which may hamper further learning and
generalization (81, 87). For example, no generalization can now be drawn for 98 and 28 because
each of them is “locked” within another segment. Recognizing

[Figure 1. Panel A: the input string 756483617569813675628136, presented three times, with
network nodes 756, 48361, 98, 136, and 28. Panel B: the input strings 756483617564836175648361,
756981367569813675698136, and 756281367562813675628136, with network nodes 75648361, 75698136,
and 75628136.]

Fig. 1. Data input in the form of three strings, and the network that is constructed as a
result of acquiring and processing this input using the learning mechanisms and parameter set
described in the text. (A) Each data string of 24 characters is composed of three nonidentical
subsequences of eight characters that share some common segments (highlighted using the same
shade of gray). The three strings are identical in this case, so labeling each subsequence of
eight characters as A, B, and C, respectively, allows describing the structure of the input as
ABC ABC ABC. (B) The same input as in A is distributed differently over time, which can be
described in short as AAA BBB CCC. This input leads to a completely different network structure
due to fixation of A, B, and C as long eight-character chunks. The weights of the nodes and the
links of the networks are not shown in the figure, but all of them exceed the fixation threshold
of 1.0, as the weight-increase parameter was set to 0.4 per occurrence.
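
The link structure described for Fig. 1A can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the segmentation into chunks is taken as given (the sketch does not implement the segmentation step or the weight dynamics), links are counted within each string only, and the names `links_from_chunks`, `transition_probs`, and `context` are hypothetical helpers rather than part of the model's published implementation.

```python
from collections import Counter, defaultdict

# Segmented form of one data string from Fig. 1A.
# Assumption: the segmentation is supplied by hand, not computed here.
STRING_A = ["756", "48361", "756", "98", "136", "756", "28", "136"]


def links_from_chunks(chunks):
    """Count directed links between consecutive chunks of one string."""
    links = defaultdict(Counter)
    for src, dst in zip(chunks, chunks[1:]):
        links[src][dst] += 1
    return links


def transition_probs(links, node):
    """Relative frequency of each chunk that directly follows `node`."""
    total = sum(links[node].values())
    return {dst: n / total for dst, n in links[node].items()}


def context(links, node):
    """Return the sets of predecessors and successors of `node`."""
    preds = {src for src, outs in links.items() if node in outs}
    succs = set(links[node])
    return preds, succs


# The same string is acquired three times; strings are processed separately,
# so no links are created across string boundaries.
links = defaultdict(Counter)
for _ in range(3):
    for src, outs in links_from_chunks(STRING_A).items():
        links[src].update(outs)

probs_756 = transition_probs(links, "756")
same_context = context(links, "98") == context(links, "28")

print(probs_756)
print("98 and 28 share predecessors and successors:", same_context)
```

Comparing the predecessor and successor sets of two nodes is one simple way to express the "similar link structure" that the text says can support generalization; the earlier work cited in the text (85, 86) should be consulted for the actual mechanism.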

7918 | www.pnas.org/cgi/doi/10.1073/pnas.1620742114 Lotem et al.


segments in novel input becomes more difficult (i.e., is less likely) if the memory
representation is based on large idiosyncratic units. The learner is then less likely to place
novel data in context and to perform further segmentation.

Note that if we change only the weight-increase parameter from 0.4 to 0.3, then the data input
of Fig. 1B would result in exactly the same network as in Fig. 1A and all of the problems that
we have just described would disappear. This is because the segment “75648361” would not reach
fixation after the first data string and would not decay completely before the second string is
acquired, so it would be segmented when 756 is encountered in the second string and again in the
third. A similar result can be achieved if we extend the working-memory buffer to include the
beginning of the next string (so that the fourth occurrence of “756” can split the “75648361”
and so on). This example shows how different combinations of data distribution and memory
parameters can generate quite different or quite similar networks. It also demonstrates how
relatively small modifications to the learning parameters or to the distribution of data input
can lead to major changes in the network. As we discussed elsewhere (82, 86), it is important to
bear in mind that not only may the learning parameters vary across individuals or species, but
they may also evolve to differ across different sensory modalities (or different learning
mechanisms) to better respond to the different distributions of data types in nature. Similarly,
the learning parameters may also be modulated by physiological and emotional state, giving a
higher increase in memory weight to important but relatively rare observations (82).

The learning mechanism described so far is sensitive to the order in which elements appear in
the data. For example, 756 and 576 are viewed as different data sequences. This may be important
for some data types, such as the sequence of actions of a particular hunting technique, the
phrases in a birdsong, or human speech. But for some other types of data it may be sufficient
(or even better) to classify two sequences as the same by merely recognizing some of their
similar components. For example, two instances of the same salad in a salad bar or a stand of
mango trees in the forest may be recognized based on a combination of stimuli, ignoring their
exact serial order (which may actually vary across instances). It is therefore possible that the
learning mechanisms may also differ in the set of parameters that determine how sensitive they
are to the exact serial order of data items. Clearly, a change in these parameters can also
influence the segmentation process and the structure of the network, an issue that will become
relevant again when we discuss language and tool-making, for which serial order is critically
important.

Finally, as explained earlier, our model does not pretend to capture cognitive mechanisms at the
neuronal level. The nodes and the links in our network do not correspond to neurons and
synapses. Nevertheless, the processes described in our model at the computational level can be
realized by neuronal structures and activities, and a representation of the proposed network may
exist in the brain. We can assume that the neuronal structures and brain circuits that realize
the network are ultimately affected by constraints of size and morphology that are at least
partly determined genetically. That is, adaptive changes in the data acquisition and the
learning mechanisms that can potentially lead to the construction of an extensive network in the
acoustic domain, for example, may be subject to physical constraints that are also genetically
determined. Over generations, genetic variants that are better at relaxing these physical
constraints and at meeting the demand for larger or more appropriate neuronal structures will be
favored by selection. This view of brain evolution is consistent with the “Baldwin effect” view
(92, 93), according to which genes may be selected based on how well they support adaptive
plastic processes, such as learning. Using this approach to address the question of how culture
shapes the evolution of the brain would imply that culture exerts selective pressure that shapes
learning and data-acquisition parameters, which in turn shape the structure of the constructed
network. Consequently, over evolutionary time scales, brain anatomy may be selected to better
accommodate the physical requirements of the constructed network. In the next two sections we
consider how this may have happened in the case of human language and stone-tool production,
reexamining in this light the two problems discussed earlier regarding memory constraints and
social-learning mechanisms.

The Case of Human Language and Memory Constraints

Although the question of how language has evolved is in itself the focus of extensive research
(e.g., refs. 94–96), here we focus on a more specific question: Given that language has evolved,
how can it shape the evolution of cognition? According to the process-level approach described
above, we should address this question in terms of how the need to learn a language, or to use
it, selects for possible changes in data-acquisition or learning mechanisms. The first expected
change, which is quite obvious, is in the data-acquisition mechanisms: we would expect that
attention to human speech, as well as to human gaze and gestures that can help to learn the
meaning of spoken words, would become even more important than before. Indeed, these
manifestations of social attention are very typical of human infants and young children (97,
98); impairments in such social attention skills are known to lead to problems in language
learning, as in the case of autism (e.g., ref. 99). Perhaps the most significant expected impact
of changes to the data distribution is on the segmentation process, and consequently on the
construction of the network: because we expect the data-acquisition and learning parameters to
coevolve, the evolution of language should also affect the memory parameters of the learning
mechanisms. Note that this consideration takes us back to the problem of language and memory
constraints discussed earlier in the paper. It is highly unlikely that the memory and learning
parameters that evolved before language existed were best suited for processing linguistic data.
Although certain plastic adjustment of these memory parameters on the basis of the learner’s
individual experience cannot be ruled out, it is unlikely that adaptation to language learning
and use over hundreds of generations did not also play a role in shaping the genetic basis of
these parameters’ values. This claim is supported by the known genetic heritability component in
various types of memory (60–63). The question to address next is how these parameters evolved as
a result of language evolution.

Intuitively, one would expect the challenge of language acquisition to require and to select for
a better working memory, leading to a view of a memory limit as a constraint rather than as an
adaptation (see problem 2, above). However, according to our model, there are at least two
reasons why a limited working memory may actually be adaptive. First, as explained earlier, the
parameters of weight increase and decrease create a window for learning that serves as a test of
statistical significance: natural and meaningful patterns are likely to recur and thus to pass
the test, whereas spurious patterns decay and are forgotten. According to this view, the reason
that it is typically difficult to learn a novel input from a single encounter is that the
mechanism of learning has evolved to expect more evidence before deciding whether an item should
be learned or ignored. The evolution of learning parameters that allow data items to reach
fixation in memory after a single encounter should be possible. There are in fact examples of
such “one-trial learning,” which, interestingly, occurs when rapid learning seems to be
adaptive, as in the context of fear learning and enemy recognition (100, 101) or in the case of
word “fast-mapping” in young children (102). However, in the case of large quantities of
sequential data, where all items may be equally important, proximity of repeated occurrences in
time and frequency of recurrence are the best first-resort tests of meaningfulness. The
selective pressure of language in the direction of smaller buffer sizes and moderate fixation
rates may explain why people are not better at memorizing sequential data verbatim (58), which
would require larger buffer sizes and more rapid fixation.
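
The "window for learning" created by the weight-increase and decay parameters can be made concrete with a small Python sketch. It replays the first string of Fig. 1B (AAA), in which the eight-symbol chunk 75648361 occurs three times in a row, using the parameter values given in the text (decay 0.01 per symbol, fixation threshold 1.0, 30 separator symbols). The bookkeeping is an assumption made for illustration: each observation adds the increase, each intervening symbol subtracts the decay, and a fixated item stops decaying. The function name `weight_after_repeats` is hypothetical.

```python
def weight_after_repeats(increase, repeats, gap, decay=0.01, threshold=1.0):
    """Return (weight, fixated) after `repeats` observations of one chunk.

    Assumed bookkeeping (not specified at this level of detail in the text):
    each observation adds `increase`; between successive observations `gap`
    symbols enter the input and each subtracts `decay`; once the weight
    exceeds `threshold` the chunk counts as fixated and no longer decays.
    """
    weight = 0.0
    for i in range(repeats):
        if i > 0:
            weight = max(0.0, weight - decay * gap)
        weight += increase
        if weight > threshold:
            return weight, True
    return weight, False


# Three back-to-back occurrences of an 8-symbol chunk (first string of Fig. 1B).
w_04, fixed_04 = weight_after_repeats(increase=0.4, repeats=3, gap=8)
w_03, fixed_03 = weight_after_repeats(increase=0.3, repeats=3, gap=8)

# Weight remaining after the 30 separator symbols that precede the next string.
w_03_later = max(0.0, w_03 - 0.01 * 30)

print(f"increase 0.4: weight {w_04:.2f}, fixated: {fixed_04}")
print(f"increase 0.3: weight {w_03:.2f}, fixated: {fixed_03}, "
      f"after separator: {w_03_later:.2f}")
```

Under these assumptions the 0.4 setting crosses the threshold on the third repeat, whereas the 0.3 setting stays below it and still has a positive weight when the next string arrives, which matches the qualitative behavior described in the text for the two parameter values.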

Second, the limited memory buffer can be viewed as representing an adaptive trade-off between
memory and computation. [Recall our earlier discussion of Christiansen and Chater’s memory
bottleneck (58).] We assume that a larger buffer that can accommodate more data can evolve, but
whether or not it will be adaptive depends on the kind of computations needed to process the
data in the buffer. In the case of linguistic input, serial order is important; the data must be
segmented into words or chunks based on recurring segments. This means that the learner has to
search and compare all possible chunks within the buffer, and to find possible matches between
these chunks and those represented already as nodes in the network (so as to put the incoming
data in context). The number of possible chunks in a linear sequence grows as the square of the
length of the sequence [there are N(N − 1)/2 possible chunks in a linear sequence of N data
items]. Thus, in a relatively small buffer of 5 data items, the learner has to find and compare
only 10 possible chunks, whereas in larger buffers of, say, 10 or 15 items, the learner has to
find and compare 45 or 105 possible chunks, respectively. Therefore, increasing the buffer size
leads to considerable computational cost that may not be justifiable. Depending on the number of
items that comprise a typical meaningful chunk in the language, it might be better to process a
sequence of data items by gradually scanning it with a small buffer of 5 items rather than with
a large buffer of 15 items.

The computational burden is likely to be much smaller if data input does not need to be
segmented accurately, but merely recognized and classified based on some characteristic
features. As we mentioned earlier, recognizing a particular salad in a salad bar or a typical
fruit tree in a forest may not require paying attention to the exact serial order of the data.
In this case, a larger memory buffer may not lead to such a sharp increase in computation. A
simple illustration of this phenomenon is given by the three nonsegmented sentences presented in
Fig. 2. English speakers who cannot read Hebrew or Japanese would most certainly try to segment
(almost automatically and subconsciously) the first sentence that is in English but not the next
two sentences that are in Hebrew and Japanese. Those would be classified quickly as “Gibberish
in a foreign language” or, more specifically, as “two sentences in Hebrew and Japanese that I
can’t read” (based on some distinctive features of Hebrew and Japanese letters). Clearly,
readers who know Hebrew or Japanese will segment those sentences automatically. The point of
this example is to demonstrate that the “ecological” context and the cultural background of a
learner can determine the level of computation applied to processing incoming data.

Some level of data segmentation was clearly required even before the evolution of language; for
example, when animals forage for food in structured environments (85) or need to learn or
interpret observed behavioral sequences (103). However, the evolution of language almost
certainly increased the proportion of data input that must be accurately segmented, thereby
imposing more significant computational requirements for a given memory buffer size. Individuals
with a genetic predisposition to use a smaller buffer may have been selected, which may explain
why the memory bottleneck is indeed so small. This hypothesis predicts that some of humans’
close relatives that do not possess language may be endowed with a larger working-memory buffer.
Indeed, a notable study on working memory for numerals in chimpanzees (104) shows that young
chimpanzees have a better capacity for numerical recollection than human adults. This ability
may be useful for fast recognition of objects or structures in the field that does not depend on
serial order. It may also improve rote learning of recent actions that might be helpful in
systematically searching for food without returning to a place that was just visited.
Interestingly, exceptional ability to retain in memory an accurate, detailed image of a complex
scene or pattern (also known as “eidetic imagery”) is more common among young children (105) and
autistic savants (106). Such an ability may facilitate rote learning at the expense of effective
segmentation and network representation (81). Regardless of whether or not we understand this
phenomenon correctly, it clearly shows that having a larger working-memory buffer is
biologically feasible and that genetic variants that possess this ability are already present in
human populations. The fact that they do not spread and become the norm suggests that a small
memory bottleneck is somehow more adaptive.

The Case of Tool-Making and the Evolution of Social-Learning Mechanisms

Cultural transmission of tool-making techniques depends on social-learning mechanisms. Whereas
learning some advanced techniques may involve teaching and verbal instruction (107), the ability
to make stone tools probably depended, initially at least, on social-learning mechanisms of the
type needed to facilitate imitation or emulation (108–110). How, then, could the evolution of
culturally transmitted techniques for making tools (i.e., the culture of “tool-making”) affect
the genetic evolution of such learning mechanisms? This question is a specific instance of the
more general question addressed earlier (Problem 1, above) of how using learning mechanisms for
social functions shapes their evolution. To answer this question, we should first consider how
imitation or emulation works. Here we try to explain it in terms of our model. We assume that
for imitation, the coupling between perception and action is developed through experience (see
also ref. 2). That is, when an individual repeatedly observes its own actions, it gradually, and
quite automatically, associates the perception and the motor experience of those actions.
Eventually, seeing another individual performing those actions activates the observer’s
representation of the motor experience of performing those actions [because the observed actions
are perceived as being similar to the (perceptual representation of) the individual’s own
actions, which are already coupled with the relevant motor experience]. Thus, the first expected
effect, which may be described in terms of data-acquisition mechanisms (33, 82), is an increase
in attention to the behavioral patterns of other individuals (in imitation) or to the outcomes
of their actions (in emulation). For imitation, it is also important to acquire much information
on self-actions to create the coupling between the perception of these actions and their motor
experience.

The next question is how to organize the acquired data in memory. In our framework, this amounts
to asking how the network should be constructed. In the case of imitation, there seem to be two
possibilities. One is to represent long sequences of observed actions or sensory experiences as
large chunks: exact copies of entire sequences that can then be executed. The other possibility
is to segment the observed behavior or sensory experience into smaller basic units, just as in
the process of language acquisition, and then compose them again into larger sequences in the
production process. The first possibility of exact imitation fits the notion of specialized
imitation ability that allows copying and executing complex behaviors accurately and almost
automatically. However, this approach leads to three problems. First, the expected effect of
exact imitation on the evolution of learning is that both the “working-memory buffer” and the
weight-increase parameter should increase in size. The working-memory buffer should be large
enough to capture the long sequences of observed behaviors, and the weight-increase parameter
should be high enough to allow rapid fixation in memory. This prediction does not seem to hold.
As discussed earlier, the human working-memory buffer is typically small, and complex patterns
require repeated encounters to be learned. The

Fig. 2. The sentence: “The number of possible chunks in a linear sequence = N(N − 1)/2” written
in a nonsegmented form in three different languages, English, Hebrew, and Japanese. See the
explanation in the main text.
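
The chunk counts quoted in the text (10, 45, and 105 for buffers of 5, 10, and 15 items) can be checked with a short Python sketch. It assumes that a "chunk" means a contiguous run of at least two items, which is the reading that reproduces N(N − 1)/2; the function names `contiguous_chunks` and `chunk_count` are hypothetical.

```python
def contiguous_chunks(items, min_len=2):
    """List every contiguous subsequence of `items` with at least `min_len` items.

    Assumption: a "chunk" is a contiguous run of two or more items.
    """
    n = len(items)
    return [tuple(items[i:j]) for i in range(n) for j in range(i + min_len, n + 1)]


def chunk_count(n):
    """Closed form given in the text: N(N - 1)/2."""
    return n * (n - 1) // 2


for n in (5, 10, 15):
    enumerated = len(contiguous_chunks(list(range(n))))
    print(f"buffer of {n:2d} items: {enumerated} chunks enumerated, "
          f"formula gives {chunk_count(n)}")
```

Tripling the buffer from 5 to 15 items therefore multiplies the number of candidate chunks by more than ten, which is the quadratic cost the argument relies on.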



second problem with exact imitation of novel sequences is whether or not it can work within the
framework of the associative account. To create the coupling between perception and action, the
learner must also produce the action, and this cannot be done before successful imitation
(because the perception of a unique novel sequence does not yet have a match in motor
representation that can be executed). Finally, the learning of long fixed-action sequences
reduces flexibility. It would require, for example, that the initial raw stone from which a tool
is to be produced be nearly identical to the raw stone that was used in the original learned
sequence, a highly unlikely occurrence.

The alternative possibility, which involves sequence segmentation and reassembly, is consistent
with the associative account and is also feasible. The learner first explores and practices a
large repertoire of simple behavioral actions, thereby creating the necessary coupling between
perception and action in a large repertoire of basic behavioral units. It can then concatenate
these basic units in many possible ways; for the purpose of imitation, it can concatenate them
to gradually match the complex behavior demonstrated by other individuals. The demonstrated
behavior is also segmented into familiar units, which can then be associated with familiar
actions, which helps in producing an imitation. This scenario is quite consistent with mounting
evidence and recent views of experience-based imitation and emulation (110, 111). It also
suggests that similar processes are involved both in language learning and in complex imitation,
which is in line with recent views according to which tool-making possibly preadapted the brain
to language learning (27).

Finally, our process-level approach may also help to explain recent studies linking
neuroanatomical changes in the brain to Paleolithic tool-making ability. These studies found
that the acquisition of tool-making abilities by experimental subjects involved specific
structural changes in the brain (27) and that these structures and regions in the brain are more
developed in humans than in chimpanzees (28). This evidence for a short-term plastic response
colocalized with structures that underwent recent evolutionary change strongly suggests a
process akin to the Baldwin effect, in which genetic variants are selected based on how well
they support the required plastic changes (92, 93). It is yet to be explained, however, how the
observed plastic changes improve tool-making abilities. As we suggested earlier, in our view,
such neuroanatomical changes have to do with the pathways and the representational systems that
are recruited to serve the construction of the network. The result should be a rich network that
represents sensory and perceptual experiences of various hand movements, stone tools in various
stages of completion, and the association of all these images and segments in time and space.
According to our model, having a rich, well-segmented, and well-connected network helps to put
new observations in context and to produce effective actions (81, 82, 87). Moreover, the ability
to create a well-segmented and well-connected network likely depends on appropriate settings of
the parameters of the data acquisition and the learning mechanisms that govern the dynamic
process of network construction. Thus, the fine-tuning of these mechanisms by natural selection
to produce the most effective network for the purpose of learning to make tools would be
precisely the manner in which the culture of tool-making shapes the evolution of cognition.

Conclusions

In this paper we embraced the view that cognitive mechanisms have evolved to accommodate, among
other tasks, the relatively new challenges of learning cultural constructs, such as language and
tool-making techniques or, simply put, that culture shaped cognition. We claim, however, that to
study how such cultural constructs shape cognitive evolution, a computationally explicit
process-level mechanistic model of learning may be required. We described such a model, one that
is based on coevolving mechanisms of learning and data acquisition that jointly construct a
complex network, capable of supporting a range of cognitive abilities. The effect of culture on
cognitive evolution is captured through small modifications of these coevolving learning and
data-acquisition mechanisms, whose coordinated action improves the network’s ability to support
the learning processes that are involved in cultural phenomena, such as language or tool-making.
Finally, we proposed that culture exerts selective pressure that shapes learning and data
acquisition parameters, which in turn shape the structure of the representation network, so that
over evolutionary time scales, brain anatomy may be selected to better accommodate the physical
requirements of the learned processes and representations.

ACKNOWLEDGMENTS. We thank the organizers and funders of the Arthur M. Sackler Colloquium on “The
Extension of Biology Through Culture.” We also thank two anonymous reviewers for highly
constructive comments. Funding was provided by the Israel Science Foundation Grant 871/15 (to
A.L.) and by NSF Grant CCF-1214844, Air Force Office of Scientific Research Grant
FA9550-12-1-0040, and Army Research Office Grant W911NF-14-1-0017 (to J.Y.H.). O.K. was supported
by the John Templeton Foundation Grant ID 47981 and by the Stanford Center for Computational,
Evolutionary, and Human Genomics.

1. Heyes C, Pearce JM (2015) Not-so-social learning strategies. Proc Biol Sci 282:20141709.
2. Cook R, Bird G, Catmur C, Press C, Heyes C (2014) Mirror neurons: From origin to function. Behav Brain Sci 37:177–192.
3. Leadbeater E (2015) What evolves in the evolution of social learning? J Zool (Lond) 295:4–11.
4. Iacoboni M, et al. (1999) Cortical mechanisms of human imitation. Science 286:2526–2528.
5. Rizzolatti G, Fogassi L, Gallese V (2001) Neurophysiological mechanisms underlying the understanding and imitation of action. Nat Rev Neurosci 2:661–670.
6. Leslie AM (1987) Pretense and representation: The origins of “theory of mind.” Psychol Rev 94:412–426.
7. Baron-Cohen S (2000) Theory of mind and autism: A fifteen year review. Understanding Other Minds, eds Baron-Cohen S, Tagar-Flusberg H, Cohen DJ (Oxford Univ Press, Oxford), Vol A, pp 3–20.
8. Cosmides L, Tooby J, Fiddick L, Bryant GA (2005) Detecting cheaters. Trends Cogn Sci 9:505–506, author reply 508–510.
9. Chomsky N (1965) Aspects of the Theory of Syntax (MIT Press, Cambridge, MA).
10. Tooby J, Cosmides L, Barrett HC (2005) Resolving the debate on innate ideas. The Innate Mind: Structure and Content, eds Carruthers P, Laurence S, Stich S (Oxford Univ Press, New York), pp 305–337.
11. Anderson ML (2010) Neural reuse: A fundamental organizational principle of the brain. Behav Brain Sci 33:245–266, discussion 266–313.
12. Bates E (1993) Modularity, domain specificity and the development of language. Discuss Neurosci 10:136–148.
13. Scott-Phillips TC, Laland KN, Shuker DM, Dickins TE, West SA (2014) The niche construction perspective: A critical appraisal. Evolution 68:1231–1243.
14. Laland KN, Odling-Smee FJ, Feldman MW (1999) Evolutionary consequences of niche construction and their implications for ecology. Proc Natl Acad Sci USA 96:10242–10247.
15. Odling-Smee FJ, Laland KN, Feldman MW (2003) Niche Construction: The Neglected Process in Evolution (Princeton Univ Press, Princeton, NJ).
16. Iriki A, Taoka M (2012) Triadic (ecological, neural, cognitive) niche construction: A scenario of human brain evolution extrapolating tool use and language from the control of reaching actions. Philos Trans R Soc Lond B Biol Sci 367:10–23.
17. Laland KN, Hoppitt W (2003) Do animals have culture? Evol Anthropol 12:150–159.
18. Laland KN, Janik VM (2006) The animal cultures debate. Trends Ecol Evol 21:542–547.
19. Stout D, Hecht EE (2017) Evolutionary neuroscience of cumulative culture. Proc Natl Acad Sci USA 114:7861–7868.
20. Solan Z (2005) Unsupervised learning of natural languages. Proc Natl Acad Sci USA 102:11629–11634.
21. Laland KN, Odling-Smee J, Myles S (2010) How culture shaped the human genome: Bringing genetics and the human sciences together. Nat Rev Genet 11:137–148.
22. Williamson SH, et al. (2007) Localizing recent adaptive evolution in the human genome. PLoS Genet 3:e90.
23. de Magalhães JP, Matsuda A (2012) Genome-wide patterns of genetic distances reveal candidate loci contributing to human population-specific traits. Ann Hum Genet 76:142–158.
24. Somel M, Liu X, Khaitovich P (2013) Human brain evolution: Transcripts, metabolites and their regulators. Nat Rev Neurosci 14:112–127.
25. Cáceres M, et al. (2003) Elevated gene expression levels distinguish human from non-human primate brains. Proc Natl Acad Sci USA 100:13030–13035.
26. Somel M, Rohlfs R, Liu X (2014) Transcriptomic insights into human brain evolution: Acceleration, neutrality, heterochrony. Curr Opin Genet Dev 29:110–119.
27. Hecht EE, et al. (2015) Acquisition of Paleolithic toolmaking abilities involves structural remodeling to inferior frontoparietal regions. Brain Struct Funct 220:2315–2331.

Lotem et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7921
28. Hecht EE, Gutman DA, Bradley BA, Preuss TM, Stout D (2015) Virtual dissection and 72. Lange A, Dukas R (2009) Bayesian approximations and extensions: Optimal decisions
comparative connectivity of the superior longitudinal fasciculus in chimpanzees and for small brains and possibly big ones too. J Theor Biol 259:503–516.
humans. Neuroimage 108:124–137. 73. Katsnelson E, Motro U, Feldman MW, Lotem A (2011) Evolution of learned strategy
29. Creanza N, Fogarty L, Feldman MW (2016) Cultural niche construction of repertoire choice in a frequency-dependent game. Proc Biol Sci 279:1176–1184.
size and learning strategies in songbirds. Evol Ecol 30:285–305. 74. Hamblin S, Giraldeau L-A (2009) Finding the evolutionarily stable learning rule for
30. Shettleworth SJ (2010) Cognition, Evolution, and Behavior (Oxford Univ Press, New York). frequency-dependent foraging. Anim Behav 78:1343–1350.
31. Heyes C (2010) Where do mirror neurons come from? Neurosci Biobehav Rev 34:575–583. 75. Nilsson DE, Pelger S (1994) A pessimistic estimate of the time required for an eye to
32. Heyes C (2016) Homo imitans? Seven reasons why imitation couldn’t possibly be evolve. Proc Biol Sci 256:53–58.
associative. Philos Trans R Soc Lond B Biol Sci 371:20150069. 76. Schmidhuber J (2015) Deep learning in neural networks: An overview. Neural Netw
33. Heyes C (2012) What’s social about social learning? J Comp Psychol 126:193–202. 61:85–117.
34. Laland KN (2004) Social learning strategies. Learn Behav 32:4–14. 77. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444.
35. Catmur C, Press C, Cook R, Bird G, Heyes C (2014) Authors’ response: Mirror neurons: 78. Edelman S (2016) The minority report: Some common assumptions to reconsider in
Tests and testability. Behav Brain Sci 37:221–241. the modelling of the brain and behaviour. J Exp Theor Artif Intell 28:751–776.
36. Felsenstein J (1983) Parsimony in systematics: Biological and statistical issues. Annu 79. Mnih V, et al. (2015) Human-level control through deep reinforcement learning.
Rev Ecol Syst 14:313–333. Nature 518:529–533.
37. Lotem A (1993) Secondary sexual ornaments as signals: The handicap approach and 80. Gu S, Lillicrap T, Sutskever I, Levine S (2016) Continuous deep Q-learning with model-
three potential problems. Etologia 3:209–218. based acceleration. Proceedings of The 33rd International Conference on Machine
38. Mery F, Belay AT, So AK, Sokolowski MB, Kawecki TJ (2007) Natural polymorphism Learning, eds Balcan MF, Weinberger KQ (Proceedings of Machine Learning Re-
affecting learning and memory in Drosophila. Proc Natl Acad Sci USA 104:13051–13055. search, New York), Vol 48, pp 2829–2838.
39. Dunlap AS, Stephens DW (2014) Experimental evolution of prepared learning. Proc 81. Lotem A, Halpern JY (2008) A data-acquisition model for learning and cognitive
Natl Acad Sci USA 111:11750–11755. development and its implications for autism. Cornell University Computing and In-
40. Garcia J, Kimeldorf DJ, Koelling RA (1955) Conditioned aversion to saccharin re- formation Science Technical Reports. Available at https://ecommons.cornell.edu/
sulting from exposure to gamma radiation. Science 122:157–158. handle/1813/10178. Accessed May 11, 2017.
41. Finkel D, Pedersen NL, McGue M, McClearn GE (1995) Heritability of cognitive abilities 82. Lotem A, Halpern JY (2012) Coevolution of learning and data-acquisition mecha-
in adult twins: comparison of Minnesota and Swedish data. Behav Genet 25:421–431. nisms: A model for cognitive evolution. Philos Trans R Soc Lond B Biol Sci 367:
42. Plomin R, Spinath FM (2002) Genetics and general cognitive ability (g). Trends Cogn 2686–2694.
Sci 6:169–176. 83. Goldstein MH, et al. (2010) General cognitive principles for learning structure in time
43. Briley DA, Tucker-Drob EM (2013) Explaining the increasing heritability of cognitive and space. Trends Cogn Sci 14:249–258.
ability across development: A meta-analysis of longitudinal twin and adoption 84. Kolodny O, Edelman S, Lotem A (2014) The evolution of continuous learning of the
studies. Psychol Sci 24:1704–1713. structure of the environment. J R Soc Interface 11:20131091.
44. Pearson-Fuhrhop KM, Minton B, Acevedo D, Shahbaba B, Cramer SC (2013) Genetic 85. Kolodny O, Edelman S, Lotem A (2015) Evolution of protolinguistic abilities as a by-
variation in the human brain dopamine system influences motor learning and its product of learning to forage in structured environments. Proc Biol Sci 282:
modulation by L-Dopa. PLoS One 8:e61197. 20150353.
45. Mery F (2013) Natural variation in learning and memory. Curr Opin Neurobiol 23:52–56. 86. Kolodny O, Lotem A, Edelman S (2015) Learning a generative probabilistic grammar
46. Stephens DW, Krebs JR (1986) Foraging Theory (Princeton Univ Press, Princeton, NJ). of experience: A process-level model of language acquisition. Cogn Sci 39:227–267.
47. Leadbeater E, Dawson EH (2017) A social insect perspective on the evolution of social 87. Kolodny O, Edelman S, Lotem A (2015) Evolved to adapt: A computational approach
learning mechanisms. Proc Natl Acad Sci USA 114:7838–7845. to animal innovation and creativity. Curr Zool 61:350–367.
48. Lotem A, Kolodny O (2014) Reconciling genetic evolution and the associative 88. Enquist M, Lind J, Ghirlanda S (2016) The power of associative learning and the
learning account of mirror neurons through data-acquisition mechanisms. Behav ontogeny of optimal behaviour. R Soc Open Sci 3:160734.
Brain Sci 37:210–211. 89. Laland KN, et al. (2015) The extended evolutionary synthesis: Its structure, assump-
49. Thompson B, Kirby S, Smith K (2016) Culture shapes the evolution of cognition. Proc tions and predictions. Proc Biol Sci 282:20151019.
Natl Acad Sci USA 113:4530–4535. 90. Baddeley A, Gathercole S, Papagno C (1998) The phonological loop as a language
50. Christiansen MH, Chater N (2008) Language as shaped by the brain. Behav Brain Sci learning device. Psychol Rev 105:158–173.
31:489–508, discussion 509–558. 91. Burgess N, Hitch GJ (1999) Memory for serial order: A network model of the pho-
51. Barbujani G, Sokal RR (1990) Zones of sharp genetic change in Europe are also lin- nological loop and its timing. Psychol Rev 106:551–581.
guistic boundaries. Proc Natl Acad Sci USA 87:1816–1819. 92. Baldwin JM (1896) A new factor in evolution. Am Nat 30:441–451.
52. Laland KN (2017) The origins of language in teaching. Psychon Bull Rev 24:225–231. 93. Weber BH, Depew DJ (2003) Evolution and Learning: The Baldwin Effect Reconsidered
53. Belfer‐Cohen A, Goren‐Inbar N (1994) Cognition and communication in the Le- (MIT Press, Cambridge, MA).
vantine Lower Palaeolithic. World Archaeol 26:144–157. 94. Dunbar R (1998) Grooming, Gossip, and the Evolution of Language (Harvard Univ
54. d’Errico F, et al. (2003) Archaeological evidence for the emergence of language, sym- Press, Cambridge, MA).
bolism, and music—An alternative multidisciplinary perspective. J World Prehist 17:1–70. 95. Pinker S (2003) Language as an adaptation to the cognitive niche. Stud Evol Lang 3:
55. Mellars P (2006) Why did modern human populations disperse from Africa ca. 16–37.
60,000 years ago? A new model. Proc Natl Acad Sci USA 103:9381–9386. 96. Premack D (1985) “Gavagai!” or the future history of the animal language contro-
56. Chater N, Reali F, Christiansen MH (2009) Restrictions on biological adaptation in versy. Cognition 19:207–296.
language evolution. Proc Natl Acad Sci USA 106:1015–1020. 97. Butterworth G, Jarrett N (1991) What minds have in common is space: Spatial
57. Chater N, Christiansen MH (2010) Language acquisition meets language evolution. mechanisms serving joint visual attention in infancy. Br J Dev Psychol 9:55–72.
Cogn Sci 34:1131–1157. 98. Scaife M, Bruner JS (1975) The capacity for joint visual attention in the infant. Nature
58. Christiansen MH, Chater N (2016) The Now-or-Never bottleneck: A fundamental 253:265–266.
constraint on language. Behav Brain Sci 39:e62. 99. Klin A, Lin DJ, Gorrindo P, Ramsay G, Jones W (2009) Two-year-olds with autism
59. Lotem A, Kolodny O, Halpern JY, Onnis L, Edelman S (2016) The bottleneck may be orient to non-social contingencies rather than biological motion. Nature 459:
the solution, not the problem. Behav Brain Sci 39:e83. 257–261.
60. Mueller ST, Krawitz A (2009) Reconsidering the two-second decay hypothesis in 100. Curio E, Ernst U, Vieth W (1978) Cultural transmission of enemy recognition: One
verbal working memory. J Math Psychol 53:14–25. function of mobbing. Science 202:899–901.
61. Cui J, Gao D, Chen Y, Zou X, Wang Y (2010) Working memory in early-school-age 101. Dunsmoor JE, Murty VP, Davachi L, Phelps EA (2015) Emotional learning selectively
children with Asperger’s syndrome. J Autism Dev Disord 40:958–967. and retroactively strengthens memories for related events. Nature 520:345–348.
62. Blokland GAM, et al. (2011) Heritability of working memory brain activation. 102. Medina TN, Snedeker J, Trueswell JC, Gleitman LR (2011) How words can and cannot
J Neurosci 31:10882–10890. be learned by observation. Proc Natl Acad Sci USA 108:9014–9019.
63. Vogler C, et al. (2014) Substantial SNP-based heritability estimates for working 103. Byrne RW (1999) Imitation without intentionality. Using string parsing to copy the
memory performance. Transl Psychiatry 4:e438. organization of behaviour. Anim Cogn 2:63–72.
64. Chater N, Christiansen MH (2016) Squeezing through the Now-or-Never bottleneck: Re- 104. Inoue S, Matsuzawa T (2007) Working memory of numerals in chimpanzees. Curr
connecting language processing, acquisition, change, and structure. Behav Brain Sci 39:e91. Biol 17:R1004–R1005.
65. McNamara JM, Houston AI (2009) Integrating function and mechanism. Trends Ecol 105. Conway ARA, Jarrold C, Kane MJ, Miyake A, Towse JN (2007) Variation in working
Evol 24:670–675. memory: An introduction. Variation in Working Memory, eds Conway ARA, Jarrold C,
66. Fawcett TW, Hamblin S, Giraldeau LA (2013) Exposing the behavioral gambit: The Kane MJ, Miyake A, Towse JN (Oxford Univ Press, Oxford), pp 3–17.
evolution of learning and decision rules. Behav Ecol 24:2–11. 106. Snyder AW, Mitchell DJ (1999) Is integer arithmetic fundamental to mental pro-
67. Kacelnik A, Bateson M (1997) Risk-sensitivity: Crossroads for theories of decision- cessing?: The mind’s secret arithmetic. Proc Biol Sci 266:587–592.
making. Trends Cogn Sci 1:304–309. 107. Morgan TJH, et al. (2015) Experimental evidence for the co-evolution of hominin
68. van den Berg P, Weissing FJ (2015) The importance of mechanisms for the evolution tool-making teaching and language. Nat Commun 6:6029.
of cooperation. Proc Biol Sci 282:20151382. 108. Whiten A (2000) Primate culture and social learning. Cogn Sci 24:477–508.
69. Trimmer PC, McNamara JM, Houston AI, Marshall JA (2012) Does natural selection 109. Heyes CM, Galef BG, Jr (1996) Social Learning in Animals: The Roots of Culture
favour the Rescorla-Wagner rule? J Theor Biol 302:39–52. (Academic, San Diego).
70. Mcnamara JM, Trimmer PC, Houston AI (2012) The ecological rationality of state- 110. Galef BG (2015) Laboratory studies of imitation/field studies of tradition: Towards a
dependent valuation. Psychol Rev 119:114–119. synthesis in animal social learning. Behav Processes 112:114–119.
71. Arbilly M, Motro U, Feldman MW, Lotem A (2010) Co-evolution of learning com- 111. Truskanov N, Lotem A (2017) Trial-and-error copying of demonstrated actions re-
plexity and social foraging strategies. J Theor Biol 267:573–581. veals how fledglings learn to ‘imitate’ their mothers. Proc Biol Sci 284:20162744.

7922 | www.pnas.org/cgi/doi/10.1073/pnas.1620742114 Lotem et al.


NEWS FEATURE

Can animal culture drive evolution?


Once the purview of humans, culture has been observed in all sorts of animals. But are these
behaviors merely ephemeral fads or can they shape the genes and traits of
future generations?
Carolyn Beans, Science Writer

In Antarctic waters, a group of killer whales makes a wave big enough to knock a seal from its ice floe. Meanwhile, in the North Atlantic, another killer whale group blows bubbles and flashes white bellies to herd a school of herrings into a ball. And in the Crozet Archipelago in the Southern Ocean, still another group charges at seals on a beach, grasps the prey with their teeth, and then backs into the water (1). Some researchers see these as more than curious behaviors or YouTube photo ops: they see cultural mores, introduced into populations and passed to future generations, that can actually affect animals’ fitness.

Killer whales, also known as orcas (Orcinus orca), have a geographic range stretching from the Antarctic to the Arctic. As a species, their diet includes birds, fish, mammals, and reptiles. But as individuals, they typically fall into groups with highly specialized diets and hunting traditions passed down over generations. Increasingly, scientists refer to these learned feeding strategies as culture, roughly defined as information that affects behavior and is passed among individuals and across generations through social learning, such as teaching or imitation (2).

Scientists once placed culture squarely in the human domain. But discoveries in recent decades suggest that a wide range of cultural practices, from foraging tactics and vocal displays to habitat use and play, may influence the lives of other animals as well (3). Studies attribute additional orca behaviors, such as migration routes and song repertoires, to culture (4). Other research suggests that a finch’s song (5), a chimpanzee’s nut cracking (3), and a guppy’s foraging route (6) are all manifestations of culture. Between 2012 and 2014, over 100 research groups published work on animal culture covering 66 species, according to a recent review (7).

Now, scientists are exploring whether culture may shape not only the lives of nonhuman animals but the evolution of a species. “Culture affects animals’ lives and their survival and their fitness,” says the review’s (7) coauthor, behavioral scientist Andrew Whiten of the University of St Andrews in Scotland. “We’ve learned that’s the case to an extent that could hardly have been appreciated half a century ago.” Based on work in whales, dolphins, and birds, some researchers contend that animal culture is likely a common mechanism underlying animal evolution. But testing this hypothesis remains a monumental challenge.

7734–7737 | PNAS | July 25, 2017 | vol. 114 | no. 30 www.pnas.org/cgi/doi/10.1073/pnas.1709475114

Riding a Cultural Wave

Animal populations essentially have two streams of information, genetic and cultural, explains ecologist and whale researcher Hal Whitehead of Dalhousie University in Canada. In the case of the cultural stream, he says, “things are being learned, sometimes from the mother, possibly from the father, as well as from peers and unrelated adults.” Whitehead and others want to understand how these streams interact. Lactose tolerance in humans is a classic example. Studies suggest that adult production of lactase, the enzyme necessary for digesting the sugar lactose in milk, coevolved with the cultural practice of dairy farming in Europe in the last 10,000 years (8).

Killer whales are divided into groups known as ecotypes, with highly specialized diets and hunting traditions passed down over generations. Here, a mammal-eating ecotype in the North Pacific hunts seal. Photograph by David Ellifrit, courtesy of Center for Whale Research.

Showing that culture can influence the distribution of genes in an animal population would confirm its role as an evolutionary driver, and Whitehead believes he may have found evidence for exactly that. In the 1990s, Whitehead observed that matrilineal whale species, whose daughters stick with their mothers for life, have low genetic diversity of mitochondrial DNA (9). He coined the term “cultural hitchhiking” to explain how this pattern might emerge. In these species, cultures are passed from mothers to offspring. If a cultural behavior increases a descendant’s chances of survival and reproduction, then this behavior would persist and become more common in the population. The maternal line’s particular mitochondrial DNA haplotype, which also passes directly from mother to offspring, would simultaneously become more common. “The culture is driving and the gene is riding along,” says Whitehead. “There is no particular functional linkage between them.” Whitehead demonstrated through computer models that cultural hitchhiking is a plausible explanation for reduced genetic diversity in matrilineal whale species.
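
The logic of cultural hitchhiking can be illustrated with a toy simulation. The sketch below is not Whitehead's model, whose details are not given here. It assumes a fixed number of females, discrete generations, strictly maternal transmission of both haplotype and culture, and a single matriline (haplotype 0) whose tradition gives its members a reproductive edge. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def haplotype_diversity(haplotypes):
    """Return 1 - sum(p_i^2): the chance that two random draws differ."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def simulate(n_females=200, n_matrilines=20, advantage=0.2,
             generations=100, seed=0):
    """Toy model in which mothers pass on both mtDNA haplotype and culture.

    Assumption: matriline 0 carries a beneficial cultural trait, so its
    members are chosen as mothers with relative weight 1 + advantage.
    The haplotype itself has no effect on fitness.
    Returns the list of haplotypes present after the final generation.
    """
    rng = random.Random(seed)
    # Start with equally sized matrilines, one haplotype per matriline.
    population = [i % n_matrilines for i in range(n_females)]
    for _ in range(generations):
        weights = [1.0 + advantage if h == 0 else 1.0 for h in population]
        # Each daughter copies her mother's haplotype (and hence her culture).
        population = rng.choices(population, weights=weights, k=n_females)
    return population


if __name__ == "__main__":
    # Compare drift alone (advantage 0) with a culturally favored matriline.
    for adv in (0.0, 0.2):
        final = simulate(advantage=adv)
        print(adv, round(haplotype_diversity(final), 3))
```

With the advantage set to 0, diversity erodes through drift alone. With a positive advantage, haplotype 0 typically spreads faster and diversity falls further, even though the haplotype contributes nothing to fitness: it simply rides along with the culture.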
Cultural hitchhiking, it seems, is also at work in a
population of bottlenose dolphins in western Shark
Bay, Western Australia, according to research by evo-
lutionary geneticist Michael Krützen of the University of
Zurich (10). In this population, some dolphins carry
sponges on their rostrums, most likely for protection as
they probe the rough seafloor for fish that they other-
wise couldn’t reach (11). This behavior is passed from
mothers to offspring through social learning and all
“sponging” dolphins in the population share the same
mitochondrial haplotype. Because the sponging dol-
phins primarily inhabit a deep channel where the
sponges occur, this culture appears to affect the fine-
scale geographic distribution of the mitochondrial
genes. “What is really exciting here is that the cultural practice of sponging has led to a change in the genetic make-up of the population when you look at mitochondrial DNA,” says Krützen.

For birds in the tanager family, like this magpie tanager (Cissopis leveriana) in Brazil’s Itatiaia National Park, song is a cultural trait that must be learned. Song evolves faster in this family than in the ovenbird family, whose species have innate song. Image courtesy of Daniel J. Field (University of Bath, Bath, United Kingdom).

An Evolutionary Force
Longstanding ecological and evolutionary theories suggest that culture could also more directly affect the evolution of traits, and even the making of species. Animal populations evolve through natural selection when a heritable trait, like beak size or fur color, varies and different versions of the trait allow some individuals to survive and reproduce more than others.

Animal culture has the potential to affect this process in a number of ways, says Whiten. For one, cultural innovations, such as tools or predator-avoidance tactics, could increase an animal’s survival and reproduction, buffering them against some selection pressures. But culture could also enable animals to colonize regions they otherwise couldn’t, exposing them to new selection pressures, such as novel temperatures, predators, or food sources. And culture could generate selection for animals to be better suited to a cultural behavior through physical changes, such as stronger arms for more powerful hammering, or cognitive ones, such as the ability to learn tool use by mirroring others. “And that, of course, may affect the evolution of the brain to match,” says Whiten. Furthermore, cultural differences, such as birdsong or migration patterns, could prevent groups from mating together, which could help maintain or even generate new species.

Or anyway, those are the working theories. Finding definitive evidence is a tricky prospect, though recent research in whales and birds offers some substantive support. Scientists refer to the many orca groups with distinct hunting strategies as ecotypes, subsets of a species that occupy unique ecological niches. New genomics technologies allow researchers to search for evolutionary consequences of these various hunting cultures. “We came into the genomics era and really wanted to see whether these cultural traditions in killer whales led to enough of a long-term selection pressure that you would actually see changes in the genome,” says evolutionary biologist Andrew Foote of Bangor University in the United Kingdom.

Foote and colleagues sequenced the genomes of 48 orcas across 5 ecotypes to identify whether the groups were truly genetically isolated, and whether their different cultures were associated with unique genomic changes (1). The sample included one mammal-eating and one salmon-eating ecotype from the North Pacific, and one mammal-eating, one penguin-eating, and one Antarctic toothfish-eating ecotype from the Antarctic. The researchers found that the groups were genetically distinct. “What is really surprising is just how differentiated the ones that live in the same area are,” says Foote. “The two North Pacific ones are really different genetically even though there is overlap in their range.”

Foote estimated that these ecotypes began diverging within the last 250,000 years. He traced some of the genetic differences among groups to gene variants possibly associated with adaptation to the hunting traditions of each ecotype, and the unique
geographic regions those ecotypes colonized. For example, the two mammal-eating ecotypes were each associated with gene variants that play key roles in regulating the metabolism of methionine, an essential amino acid that mammal eaters consume in a boom–bust cycle with influxes following each kill. And the ecotypes that live in the extreme cold of the Antarctic were associated with gene variants involved in the development of adipose tissue, which could protect individuals from the frigid climate.

Foote doesn’t believe the cultural barrier between orca ecotypes is long-lasting enough to divide groups into different species altogether. “They probably radiate and collapse and radiate and collapse,” he says. “Maybe one or two might escape that process and go on to become fully fledged species. But looking at what we know now, knowing that they have a relatively recent common ancestor, it would suggest that [splitting into species] never happened in the past.”

Speciation and Song

Birdsong, often used to identify mates, offers another robust means for probing culture’s impact on speciation and animal evolution. For some bird species, song is innate. For others, it’s essentially cultural, a trait that must be learned. In both cases, song evolves over time as it passes through generations. Evolutionary biologist Elizabeth Derryberry of the University of Tennessee, Knoxville, and Nicholas Mason, a doctoral candidate in ecology and evolutionary biology at Cornell University, studied two families of birds: the tanagers (Thraupidae) that learn song and the ovenbirds (Furnariidae) that are innate singers (12). What they found suggests that culture could play a sizeable role.

Derryberry and Mason analyzed nearly 4,500 song recordings across nearly 600 species within these families. For each recording, they measured eight vocal characters, including maximum volume, range of pitch, and length. By studying differences in these characters between species in the same family, the researchers estimated how quickly song evolved in different branches of the family tree. If song differed greatly between closely related species, for example, that would suggest a fast rate of song evolution. For each family, they merged this song dataset with a genetic one that showed rates of speciation; the idea was to identify any connection between the rates of song change and species divisions.

Derryberry and Mason found that when the rate of song evolution sped up in a branch of a family tree, so too did the rate of speciation. But song evolved 1.4 times faster in the tanager family, with cultural transmission of song, than in the ovenbird family, with innate song. Culture, therefore, might actually ramp up the pace of speciation.

Derryberry and Mason (12) acknowledge that they don’t know whether birdsong evolution drives speciation or vice versa. In one scenario, birdsong could diverge first, which would prevent individuals with different songs from mating together, setting their lineages on the path to becoming distinct species. Alternatively, the species could diverge first by some other mechanism, which would create strong natural selection for song divergence to follow. If song evolution comes first, then the faster birdsong evolves, the more rapidly species diverge. “My inkling is that rapid evolution of birdsong could contribute to speciation,” says Mason. “At the scale we are looking at, we look at patterns, so interpreting process becomes tricky.”

A separate, long-term study may offer insight into the speciation cause and effect. For four decades, evolutionary biologists Peter and Rosemary Grant of Princeton University carefully tracked the survival, mating, and reproductive success of about 12,000 individual birds in species of Darwin’s finches on the island of Daphne Major in the Galápagos (5). In these species, which belong to the tanager family included in Mason’s study, offspring learn song from their fathers.

In 1981, a male bird from a nearby island arrived on Daphne Major singing a song the Grants had never heard. Genetic analyses (conducted decades later with microsatellite data) suggested that it was possibly a hybrid of the medium ground finch (Geospiza fortis) and the cactus finch (Geospiza scandens), two species that were also found on Daphne Major. But this bird, which the Grants call “Big Bird,” was much larger than the parent species. Big Bird survived for 13 years in his new home and found six mates: the first three hybrids like himself, the last three all medium ground finches. Together with one of the medium ground finches, Big Bird produced offspring that bred only with one another, resulting in the beginnings of an incipient species that the Grants have now followed through six generations.

The bird’s unique song passed on through generations helped members of his lineage recognize one another as potential mates. “It’s very important that it’s had cultural transmission of song,” says Rosemary Grant. There is no agreed-upon standard for how many generations a lineage must remain reproductively isolated before it can be called a new species, so the Grants maintain only that the Big Bird lineage is a species in the making.

Many Unknowns

Despite such findings, well-documented examples of animal culture influencing evolution remain rare, even in humans’ closest relatives, primates. There’s little doubt that cultural differences and social learning are important to primates’ lives. But can such behaviors have an evolutionary impact?

Whiten cites work supporting an evolutionary version of the “cultural intelligence hypothesis” in primates: the idea that species with culturally rich communities will experience selection to enhance the cognitive abilities that support social learning, which would in turn require a larger brain capable of processing learning and storing learned information. This
larger brain could possibly result in increased overall intelligence. “You may get into a feedback cycle here; as you become more cultural, that selection pressure on the brain and cultural capacities then make you able to become more cultural, which in turn selects for greater brain size,” says Whiten.

Indeed, a recent study by Kevin Laland, of the University of St Andrews, and colleagues found that in primate species, reliance on social learning is positively correlated with brain volume, as well as social group size and lifespan (13). But the authors acknowledge that they cannot determine whether selection on social learning actually caused the evolution of larger brain size. It’s also possible that larger brains evolved first for some other purpose, and then cultural advances made possible by enhanced cognitive abilities followed.

Whitehead wouldn’t be surprised if researchers find support for gene-culture evolution in primates. “There is evidence that some cultural elements in chimpanzees have remarkable stability, and stability is a prerequisite for culture having a major effect on genetics.” To find that effect, he says researchers might test, for example, whether chimpanzee populations with a culture of using stones as tools also carry gene variants that enhance their abilities to use those tools, such as greater hand–eye coordination or muscle strength in parts of their bodies. But the field is still new and such proposals remain theoretical.

The paucity of examples could also indicate that animal culture is not quite influential or stable enough to routinely have an impact on evolution, says cultural evolutionist Peter Richerson of the University of California, Davis, who is president of the recently founded Cultural Evolution Society. “There [are] just more targets in the case of humans than in the case of other culture-bearing animals,” he says. “That doesn’t mean that we won’t find a lot of examples; I expect we will in the long run. But it still ought to be more spectacular in humans than in other animals.”

Even in humans, thus far there are only a few examples in which potential genetic changes have been conclusively linked to cultural variation. “The strong cases are where we see independent cultural evolution: dairy farming cropping up multiple places and the same genes popping up,” says Foote. “That’s missing in most tentative cases.”

One of the biggest challenges with animal studies is determining whether genetic differences between populations are really a response to culture or merely a signature of genetic drift: chance fluctuations in the frequencies of gene variants over time. “It’s not an easy task. You really need to know something about the demographic history of your species,” says Krützen. He calls Foote’s orca ecotypes study amazing, in part because it at least partially disentangled genetic drift from culturally driven natural selection. Ideally, Krützen would like to see evidence that this same gene is under selection in many different species all experiencing a similar selection pressure. Indeed, Foote notes that adipose tissue gene variants are also under selection in the polar bear when compared with the brown bear, which may similarly help buffer this Arctic animal against a frigid climate (14).

Even once scientists identify genes that may be under selection, they still must determine whether the genes connect back to culture. “One of the difficulties is that there aren’t that many genes that we actually know exactly what they do, even in humans, even less in other species, and certainly in nonmodel species like the whales,” says Whitehead. The conventional strategy for definitively determining a gene–phenotype connection entails experimentally altering, for example, a gene in orcas related to adipose tissue, and then recording the effects. “But, of course, we can’t do this,” Krützen says. He says that instead, scientists often attempt to assign function to a gene by comparing it to similar genes of known function in humans. This approach may work well for studying close animal relatives like primates. “The farther you go away from humans,” he says, “the harder this gets.”

Even while recognizing the research limitations, Krützen remains undeterred. “I’m convinced that as time goes by there will be more studies finding more evidence of genetic change based on culture,” he says. Foote is less certain. “I think a lot of it comes under ‘untested and unknown’,” he says. “And we have to keep an open mind as to what the alternative hypotheses are.”

1 Foote AD, et al. (2016) Genome-culture coevolution promotes rapid divergence of killer whale ecotypes. Nat Commun 7:11693.
2 Whiten A, Ayala FJ, Feldman MW, Laland KN (2017) The extension of biology through culture. Proc Natl Acad Sci USA 114:7775–7781.
3 Whiten A (2017) Culture extends the scope of evolutionary biology in the great apes. Proc Natl Acad Sci USA 114:7790–7797.
4 Whitehead H (2017) Gene–culture coevolution in whales and dolphins. Proc Natl Acad Sci USA 114:7814–7821.
5 Grant PR, Grant BR (2014) 40 Years of Evolution: Darwin’s Finches on Daphne Major Island (Princeton Univ Press, Princeton, NJ).
6 Laland KN, Williams K (1997) Shoaling generates social learning of foraging information in guppies. Anim Behav 53:1161–1169.
7 Galef BG, Whiten A (2017) The comparative psychology of social learning. APA Handbook of Comparative Psychology, ed Call J
(American Psychological Association, Washington, DC), pp 411–439.
8 Itan Y, Powell A, Beaumont MA, Burger J, Thomas MG (2009) The origins of lactase persistence in Europe. PLOS Comput Biol
5:e1000491.
9 Whitehead H (2003) Sperm Whales: Social Evolution in the Ocean (The Univ of Chicago Press, Chicago).
10 Kopps AM, et al. (2014) Cultural transmission of tool use combined with habitat specializations leads to fine-scale genetic structure in
bottlenose dolphins. Proc Biol Sci 281:20133245.
11 Krützen M, et al. (2014) Cultural transmission of tool use by Indo-Pacific bottlenose dolphins (Tursiops sp.) provides access to a novel
foraging niche. Proc Biol Sci 281:20140374.
12 Mason NA, et al. (2017) Song evolution, speciation, and vocal learning in passerine birds. Evolution 71:786–796.
13 Street SE, Navarrete AF, Reader SM, Laland KN (2017) Coevolution of cultural intelligence, extended life history, sociality, and brain
size in primates. Proc Natl Acad Sci USA 114:7908–7914.
14 Liu S, et al. (2014) Population genomics reveal recent speciation and rapid evolutionary adaptation in polar bears. Cell 157:785–794.



The Extension of Biology Through Culture
November 16–17, 2016
Arnold and Mabel Beckman Center, Irvine, CA
Organized by Andrew Whiten, Francisco J. Ayala, Marcus W. Feldman, and Kevin N. Laland

Program

Wednesday, November 16

Session I

Welcome Remarks
Marcus W. Feldman, Stanford University

Evolution and revolution in cetacean vocal culture: Lessons from humpback
whale song
Ellen C. Garland, University of St. Andrews

Gene–culture coevolution in whales and dolphins
Hal Whitehead, Dalhousie University

Cultural legacies: Unpacking the intergenerational transmission of information
in birds
Lucy M. Aplin, University of Oxford

What evolves in the evolution of social learning? A social insect perspective
Ellouise Leadbeater, Queen Mary University of London

Session II

Can culture reshape the evolution of learning and how?
Arnon Lotem, Tel Aviv University

What long-term field studies reveal of primate traditions
Susan E. Perry, University of California, Los Angeles

Can we identify a primate signature in social learning?
Dorothy M. Fragaszy, University of Georgia

The evolution of primate intelligence
Kevin N. Laland, University of St. Andrews

Distinctive Voices Public Lecture

How animal cultures extend the scope of biology: Tradition and learning from apes to
whales to bees
Andrew Whiten, University of St. Andrews

Thursday, November 17

Session III

Skill learning, neuroplasticity, and exaptation in the evolution of human tool-making
and language
Dietrich Stout, Emory University

The role of cultural innovations, learning processes, and ecological dynamics in shaping
Middle Stone Age cultural adaptations
Francesco d’Errico, University of Bordeaux

The ontogenetic foundations of cumulative cultural transmission
Cristine H. Legare, The University of Texas at Austin

“I don’t know”: Ignorance and question-asking as engines for cognitive development
Paul L. Harris, Harvard University

Session IV

Childhood as simulated annealing: How wide hypothesis exploration in an extended
childhood contributes to cultural learning
Alison Gopnik, University of California, Berkeley

How language shapes the nature of cultural inheritance
Susan A. Gelman, University of Michigan

Big data, cultural macroevolution, and the prospects for an evolutionary science of
human history
Russell D. Gray, Max Planck Institute for the Science of Human History

Ongoing prospects for a unified science of cultural evolution
Alex Mesoudi, University of Exeter

Concluding Remarks
Francisco J. Ayala, University of California, Irvine

Presentations from this colloquium can be viewed at
http://www.nasonline.org/Extension_of_Biology_Through_Culture
National Academy of Sciences
Sackler Colloquium Series

The Arthur M. Sackler Colloquia of the National Academy of Sciences
address scientific topics of broad and current interest, cutting across the
boundaries of traditional disciplines. Each year, three or four such colloquia
are scheduled, typically 2 days in length and international in scope.
Each colloquium is organized by a member of the Academy, often with
the assistance of an organizing committee, and features presentations
by leading scientists in the field and discussions with 100 or more
researchers with an interest in the topic. Colloquium presentations are
recorded and posted on the Sackler Colloquia website
www.nasonline.org/programs/sackler-colloquia and the Sackler
YouTube Channel www.youtube.com/sacklercolloquia. Many colloquia
also result in the publication of scholarly papers, usually published as a
collection in the Proceedings of the National Academy of Sciences
(PNAS). These colloquia are made possible by a generous gift from
Mrs. Jill Sackler, in memory of her husband, Arthur M. Sackler.
National Academy of Sciences
500 Fifth Street, NW
Washington, DC 20001
