The Extension of Biology Through Culture
Edited by Andrew Whiten, Francisco J. Ayala, Marcus W. Feldman,
and Kevin N. Laland
Irvine, CA
The papers collected here result from the Arthur M. Sackler Colloquium of the National Academy of
Sciences, The Extension of Biology Through Culture. Complete information about this colloquium
and video recordings of most presentations are available on the NAS website at
http://www.nasonline.org/Extension_of_Biology_Through_Culture.
Cover image: A juvenile capuchin monkey observes a skilled adult male eating a nut it has just broken using a hammerstone. Articles in the Sackler Colloquium on the Extension of Biology Through Culture explore social learning and cultural transmission in humans and nonhuman animals as well as the interplay between cultural and genetic evolution. See the Introduction to the Sackler Colloquium by Andrew Whiten et al. on pages 7775–7781. Image courtesy of Luca Antonio Marino (EthoCebus Project, Brazil).
Contents
INTRODUCTION
7775 The extension of biology through culture
Andrew Whiten, Francisco J. Ayala, Marcus W. Feldman, and Kevin N. Laland
COLLOQUIUM PAPERS
7782 Cultural evolutionary theory: How culture evolves and why it matters
Nicole Creanza, Oren Kolodny, and Marcus W. Feldman
7790 Culture extends the scope of evolutionary biology in the great apes
Andrew Whiten
7798 Synchronized practice helps bearded capuchin monkeys learn to extend
attention while learning a tradition
Dorothy M. Fragaszy, Yonat Eshchar, Elisabetta Visalberghi, Briseida Resende, Kellie Laity,
and Patrícia Izar
7806 Older, sociable capuchins (Cebus capucinus) invent more social behaviors,
but younger monkeys innovate more in other contexts
Susan E. Perry, Brendan J. Barrett, and Irene Godoy
7814 Gene–culture coevolution in whales and dolphins
Hal Whitehead
7822 Song hybridization events during revolutionary song change provide
insights into cultural transmission in humpback whales
Ellen C. Garland, Luke Rendell, Luca Lamoni, M. Michael Poole, and Michael J. Noad
7830 Conformity does not perpetuate suboptimal traditions in a wild
population of songbirds
Lucy M. Aplin, Ben C. Sheldon, and Richard McElreath
7838 A social insect perspective on the evolution of social learning mechanisms
Ellouise Leadbeater and Erika H. Dawson
7846 Cultural macroevolution matters
Russell D. Gray and Joseph Watts
7853 Pursuing Darwin’s curious parallel: Prospects for a science of cultural evolution
Alex Mesoudi
7861 Evolutionary neuroscience of cumulative culture
Dietrich Stout and Erin E. Hecht
7869 Identifying early modern human ecological niche expansions and associated
cultural dynamics in the South African Middle Stone Age
Francesco d’Errico, William E. Banks, Dan L. Warren, Giovanni Sgubin, Karen van Niekerk,
Christopher Henshilwood, Anne-Laure Daniau, and María Fernanda Sánchez Goñi
7877 Cumulative cultural learning: Development and diversity
Cristine H. Legare
7884 Young children communicate their ignorance and ask questions
Paul L. Harris, Deborah T. Bartz, and Meredith L. Rowe
7892 Changes in cognitive flexibility and hypothesis search across human life history
from childhood to adolescence to adulthood
Alison Gopnik, Shaun O’Grady, Christopher G. Lucas, Thomas L. Griffiths, Adrienne Wente,
Sophie Bridgers, Rosie Aboody, Hoki Fung, and Ronald E. Dahl
7900 How language shapes the cultural inheritance of categories
Susan A. Gelman and Steven O. Roberts
7908 Coevolution of cultural intelligence, extended life history, sociality, and brain size
in primates
Sally E. Street, Ana F. Navarrete, Simon M. Reader, and Kevin N. Laland
7915 The evolution of cognitive mechanisms in response to cultural innovations
Arnon Lotem, Joseph Y. Halpern, Shimon Edelman, and Oren Kolodny
NEWS FEATURE
7734 Can animal culture drive evolution?
Carolyn Beans
COLLOQUIUM INTRODUCTION
The extension of biology through culture
Andrew Whiten (a,1), Francisco J. Ayala (b), Marcus W. Feldman (c), and Kevin N. Laland (d)
Biology is the study of life. How our understanding of the nature and evolution of living systems is being enriched and extended through new discoveries about social learning and culture in human and nonhuman animals is the subject of the collection of articles we introduce here. Recent decades have revealed that social learning and the transmission of cultural traditions are much more widespread in the animal kingdom than earlier suspected, affecting numerous forms of functional behavior and creating a secondary form of evolution, built onto the better-known primary, genetically based form. New scientific approaches to the study of human cultural evolution have also emerged and become productive. However, these developments in the study of cultural phenomena in both human and nonhuman animals have yet to be seriously integrated into mainstream evolutionary biology. Here we offer an introductory overview of the background and scope of a collection of articles that report recent progress in these fields, and outline their proposed significance for biology at large.
The theoretical backbone of the life sciences, its central organizing principle, is of course evolution, by now rich in both theory and empirical support (1–3). The great synthesis of Darwin’s and Wallace’s evolutionary insights and early 20th century understanding of genetics that became known as the “Modern Synthesis” was achieved by a brilliant set of biologists mainly in the period 1938–1946 (4), and its principles have provided the core of evolutionary theory since that time (5). Thus, contemporary texts on “evolution” focus on such topics as mutation, genetically based inheritance, population genetics, genomics, and the natural and sexual selection pressures that shape gene frequencies, genotypes, and phenotypes (1, 2, 6, 7). Genes and their role in inheritance have come to be celebrated as the pivotal elements in evolution (8).
However, a second form of evolution was also recognized long ago, in the ways that cultural phenomena have changed in the course of human history, through a different form of inheritance: that in which people learn from others (social learning), including from previous generations. Darwin himself recognized the parallels between the evolution of culturally inherited languages and organic evolution (9, 10); indeed, evolutionary family trees of languages proposed by philologists long predated the Origin of Species, although they were further spurred by its publication (11–13).
During the 1970s and 1980s, first by Cavalli-Sforza and Feldman (14–16) and then Boyd and Richerson (17), the implications of the existence of the two forms of evolution, organic and cultural, were at last explored systematically and formally, through conceptual and mathematical modeling that formed a foundation for later empirical investigations. The present collection of papers opens with a contribution by Creanza et al. (18) that offers an overview of both the foundational studies in (human) cultural evolution and major developments in the period since. The early body of 20th century work laid out some of the ways in which cultural evolution: (i) echoes many core principles of organic evolution, yet (ii) also differs from it in dramatic ways that change evolutionary dynamics, and (iii) interacts with the genetically based phenomena to create new complexities (“gene–culture coevolution”). We return to discuss these further, below.
From a somewhat different perspective Maynard-Smith and Szathmary (19) distinguished a series of major transitions in the nature of evolution, such as the emergence of multicellularity and of sex, the most recent major transition being the emergence of (human) culture; and Dawkins (20) gave a name to cultural elements suggested to be the analogs of genetic replicators, “memes,” which has been assimilated into popular culture. Other authors suggested “semes” (21), echoing semiotics, the study of signs and symbols.
We shall discuss such developments and subsequent related scientific progress further below, but for
(a) Centre for Social Learning and Cognitive Evolution, School of Psychology and Neuroscience, University of St. Andrews, St. Andrews KY16 9JP, United Kingdom; (b) Department of Ecology and Evolutionary Biology, University of California, Irvine, CA 92697; (c) Department of Biology, Stanford University, Stanford, CA 94305; and (d) Centre for Social Learning and Cognitive Evolution, School of Biology, University of St. Andrews, St. Andrews KY16 9JP, United Kingdom
This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held
November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete
program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: A.W., F.J.A., M.W.F., and K.N.L. wrote the paper.
The authors declare no conflict of interest.
(1) To whom correspondence should be addressed. Email: a.whiten@st-andrews.ac.uk.
Whiten et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7777
been particularly powerfully applied to the differentiation and evolution of language groups (68, 69), but also to such diverse topics as the evolution of socio-political organization (70) and folk tales (71). Here, Gray and Watts (60) apply this approach to the evolution of religion, using this example to explore the analysis of cultural macroevolution.
As in the capuchin study of Fragaszy et al. (32), the psychological and social processes that allow human culture to be so distinctive need to be examined as individuals’ life histories unfold, and the affordances of the culture in which they develop are selectively assimilated and further modified. On the one hand, these processes are part of our species’ biology, their properties shaped during the millennia of evolutionary time over which our ancestors became increasingly and intensively dependent on cumulative cultural inheritance (54, 56, 57). In turn, these unique cultural processes operating in humans generate forms of life not hitherto witnessed in the natural world. To highlight and dissect some of these special cultural phenomena, we include in this issue four contributions that share a focus on ontogenetic development.
Legare (72) provides an overview of core features of human development that facilitate the adaptive transmission and refinement of culture, including the concept of “natural pedagogy,” whereby adults provide active support to cultural assimilation and children are predisposed to recognize and respond to this in particular ways, such as selective and discriminating copying (for example with respect to alternative cultural models), conformity, including the recognition of norms, and innovative flexibility.
Other, complementary contributions in the present collection focus on more specific topics, including the active role that children come to play in recognizing their ignorance, as well as their knowledge, and in systematically seeking information to remedy this (73); the ways in which related hypothesis testing changes through the long period of human development in relation to the stage of cognitive development and socio-ecological context (74); and the significance of language as both a product and medium of culture, illustrated by the linguistic labels and generics that provide special forms of both the transmission fidelity and affordance for innovation that permit cumulative culture (75).
How Culture Extends Biology
How the existence of culture extends our understanding of the scope and nature of living systems and their evolution was initially analyzed in three major respects (14, 15, 17). First, cultural phenomena provide a second inheritance system (76) built on the foundations of the primary, genetically based system, and this can generate a second form of evolution in the sphere of culturally transmitted behaviors and artifacts. Second, because cultural transmission is mechanically different from genetic transmission in particular ways, such as horizontal diffusion among nonrelatives, it can have new and drastically different evolutionary consequences (62). Third, the two systems may interact in complex ways, the phenomenon of gene–culture coevolution (16, 56, 77, 78). To these three we can now add two other important dimensions. One is that the accumulating evidence that social learning and cultural transmission are much more widespread and consequential across the animal kingdom than earlier suspected extends much more broadly the implications of the three effects outlined above, which were originally conceived with a focus on human culture. A second is that studies increasingly dissect and delineate the richness of the consequences of cultural evolution, and the resulting diversification of life forms. Examples of recent such discoveries range from the elucidation of functional forms of teaching in nonhuman animals (79) to “rational imitation” (80) and “overimitation” (81) in children. The contributions to this issue address all of these prospects conceptually and empirically in diverse and important ways. Here, we offer a brief introductory overview of the background foundations to these new explorations and updated reviews.
Cultural Phenomena Create a New Form of Evolution. The core of adaptive evolution through natural selection involves the triad of variation in characters, competitive selection of the best adapted to current circumstances among them (“survival of the fittest,” although relative reproductive success is what ultimately counts), and inheritance of those selected characters by descendants (82). Interwoven in cycles of these processes are three additional principles, notably the refinement of adaptations suited to the properties of ecological niches, the accumulation of complexes of these, and differentiation of descendant populations where they are sufficiently separated, for example by geography, ultimately leading to speciation. The latter three effects are manifested in the picture of organic evolution with which we are familiar, involving a broadly progressive complexification in life, from early bacteria to the sophisticated animals of today, and a vast diversity of living species, all displaying a remarkable fit to the ecological niche they so successfully inhabit. Current thinking in cultural evolution suggests that all these principles are active also in human cultural evolution (14, 16, 56, 61). Social learning and transmission provide the inheritance element and human invention the variants, the most successful of which are transmitted to future generations, generating cultural adaptations to environments around the world; and progressive, cumulative cultures show immense regional differentiation. Empirical evidence in support of these contentions has accumulated over recent decades, reviewed for example in refs. 61, 62, 78, 83, and 84, and is pursued further in the present collection (18, 63).
Such questions about cultural evolution have remained little studied in the animal culture literature, which has instead been focused on the more fundamental matter of establishing what cultural phenomena exist in a diversity of species, and what transmission processes underpin these (25, 27, 34). Initial explorations of Darwinian dynamics in the case of animal culture (53) have taken the list of eight key properties extracted from the Origin of Species (9) for testing with human data [the six listed above, plus changes of function and convergent evolution (83)] and, through examining studies of animal culture, concluded that there is evidence for all of them (although minimal and slow-developing compared with the most recent, cumulative cultures of humans). However, there is evidence that animal traditions with suboptimal payoffs are sometimes, although seemingly not always (85), vulnerable to decay (41), implying the working of the core Darwinian triad, and it seems likely that those animals for which there is now evidence of multiple-tradition cultures are the descendants of lines of ancestors among whom these traditions were progressively added, surviving through their success, as in the case of over 4,300-y-old nut-cracking in chimpanzees, mentioned earlier (46). Nonetheless, experimental studies, for example, of mate-choice copying, show that animal social transmission can be evolutionarily consequential, even if short lived (86).
In any case, it is becoming apparent that cultural phenomena play an important role in shaping many species’ adjustment to and exploitation of their environments, with likely significant evolutionary consequences that are the focus of current research.
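
As a rough illustration of the triad described under “Cultural Phenomena Create a New Form of Evolution” (variation, selection, and inheritance), the following Python sketch tracks the frequencies of a few cultural variants over repeated rounds of social learning. It is not a model taken from any of the cited works. Payoff-proportional copying (standing in for selection plus inheritance) and uniform innovation across variants (standing in for invention) are simplifying assumptions, and the names `step`, `run`, `payoffs`, and `innovation_rate` are hypothetical.

```python
def step(freqs, payoffs, innovation_rate):
    """One round of social learning.

    Assumption: learners copy a variant in proportion to its frequency
    times its payoff (selection + inheritance), then a fraction
    `innovation_rate` of learners invent a variant uniformly at random
    (variation).
    """
    mean_payoff = sum(f * w for f, w in zip(freqs, payoffs))
    copied = [f * w / mean_payoff for f, w in zip(freqs, payoffs)]
    k = len(freqs)
    return [(1 - innovation_rate) * c + innovation_rate / k for c in copied]


def run(freqs, payoffs, innovation_rate, rounds):
    """Iterate `step` for a fixed number of rounds and return the final frequencies."""
    for _ in range(rounds):
        freqs = step(freqs, payoffs, innovation_rate)
    return freqs


if __name__ == "__main__":
    # Three hypothetical variants; the third starts rare but has the highest payoff.
    final = run([0.98, 0.01, 0.01], [1.0, 1.1, 1.3], innovation_rate=0.01, rounds=200)
    print([round(f, 3) for f in final])
```

Under these assumptions the highest-payoff variant comes to dominate, while ongoing innovation keeps the other variants present at low frequency.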
1 Barton N, et al. (2007) Evolution (Cold Spring Harbor Lab Press, Cold Spring Harbor, NY).
2 Futuyma DJ (2013) Evolution (Sinauer Associates, Sunderland, MA), 3rd Ed.
3 Losos JB (2013) Princeton Guide to Evolution (Princeton Univ Press, Princeton, NJ).
4 Huxley JS (1942) Evolution, the Modern Synthesis (Allen & Unwin, London).
5 Mayr E (1982) The Growth of Biological Thought: Diversity, Evolution and Inheritance (Harvard Univ Press, Cambridge, MA).
6 Mayr E (2002) What Evolution Is (Weidenfeld & Nicolson, London).
7 Ridley M (2004) Evolution (Blackwell, Cambridge, MA), 3rd Ed.
8 Laland K, et al. (2014) Does evolutionary theory need a rethink? Nature 514:161–164.
9 Darwin C (1859) On the Origin of Species by Natural Selection (Murray, London).
10 Darwin C (1871) The Descent of Man and Selection in Relation to Sex (Murray, London).
11 Jones W (1798) The third anniversary discourse, delivered 2nd February 1786: On the Hindus. Asiatick Researches 1:415–431.
12 Schleicher A (1850) Linguistische Untersuchungen. 2. Teil: Die Sprachen Europas in systematischer Übersicht (HB König, Bonn).
13 Schleicher A (1869) Darwinism Tested by the Science of Language (JC Hotten, London).
14 Cavalli-Sforza LL, Feldman MW (1973) Cultural versus biological inheritance: Phenotypic transmission from parents to children. (A theory of the effect of parental
phenotypes on children’s phenotypes.). Am J Hum Genet 25:618–637.
15 Cavalli-Sforza LL, Feldman MW (1981) Cultural Transmission and Evolution: A Quantitative Approach (Princeton Univ Press, Princeton, NJ).
16 Feldman MW, Cavalli-Sforza LL (1976) Cultural and biological evolutionary processes, selection for a trait under complex transmission. Theor Popul Biol
9:238–259.
17 Boyd R, Richerson P (1985) Culture and the Evolutionary Process (Univ of Chicago Press, Chicago).
18 Creanza N, Kolodny O, Feldman MW (2017) Cultural evolutionary theory: How culture evolves and why it matters. Proc Natl Acad Sci USA 114:7782–7789.
19 Maynard-Smith J, Szathmary E (1995) The Major Transitions in Evolution (Freeman, Oxford).
20 Dawkins R (1976) The Selfish Gene (Oxford Univ Press, Oxford).
21 Hewlett BS, de Silvestri A, Guglielmino CR (2002) Genes and semes in Africa. Curr Anthropol 43:313–321.
22 Fisher J, Hinde RA (1949) The opening of milk bottles by birds. Br Birds 42:347–357.
23 Kawai M (1965) Newly acquired pre-cultural behaviour of the natural troop of Japanese monkeys on Koshima Islet. Primates 2:1–30.
24 Marler P, Tamura M (1964) Culturally transmitted patterns of vocal behavior in sparrows. Science 146:1483–1486.
25 Whiten A, Hinde RA, Stringer CB, Laland KN, eds (2012) Culture Evolves (Oxford Univ Press, Oxford).
26 Hoppitt W, Laland KN (2013) Social Learning: An Introduction to Mechanisms, Methods and Models (Princeton Univ Press, Princeton, NJ).
27 Laland KN, Galef BG, eds (2009) The Question of Animal Culture (Harvard Univ Press, Cambridge, MA).
28 Galef BG, Whiten A (2017) The comparative psychology of social learning. APA Handbook of Comparative Psychology, eds Call J, Burghardt G, Pepperberg I,
Snowdon C, Zentall T (American Psychological Association, Washington, DC), pp 411–440.
29 Thornton A, Clutton-Brock T (2011) Social learning and the development of individual and group behaviour in mammal societies. Philos Trans R Soc Lond B Biol Sci
366:978–987.
30 Whiten A (2012) Social learning, traditions and culture. The Evolution of Primate Societies, eds Mitani J, Call J, Kappeler PM, Palombit RA, Silk JB (Chicago Univ
Press, Chicago), pp 682–700.
31 Whiten A (2017) Culture extends the scope of evolutionary biology in the great apes. Proc Natl Acad Sci USA 114:7790–7797.
32 Fragaszy DM, et al. (2017) Synchronized practice helps bearded capuchin monkeys learn to extend attention while learning a tradition. Proc Natl Acad Sci USA
114:7798–7805.
33 Perry SE, Barrett BJ, Godoy I (2017) Older, sociable capuchins (Cebus capucinus) invent more social behaviors, but younger monkeys innovate more in other
contexts. Proc Natl Acad Sci USA 114:7806–7813.
34 Whitehead H, Rendell L (2015) The Cultural Lives of Whales and Dolphins (Chicago Univ Press, Chicago).
35 Whitehead H (2017) Gene–culture coevolution in whales and dolphins. Proc Natl Acad Sci USA 114:7814–7821.
36 Garland EC, et al. (2011) Dynamic horizontal cultural transmission of humpback whale song at the ocean basin scale. Curr Biol 21:687–691.
37 Garland EC, Rendell L, Lamoni L, Poole MM, Noad MJ (2017) Song hybridization events during revolutionary song change provide insights into cultural
transmission in humpback whales. Proc Natl Acad Sci USA 114:7822–7829.
38 Catchpole CK, Slater PJB (2008) Bird Song: Biological Themes and Variations (Cambridge Univ Press, Cambridge, UK), 2nd Ed.
39 Slagsvold T, Wiebe KL (2011) Social learning in birds and its role in shaping a foraging niche. Philos Trans R Soc Lond B Biol Sci 366:969–977.
40 Aplin LM, et al. (2015) Experimentally induced innovations lead to persistent culture via conformity in wild birds. Nature 518:538–541.
41 Aplin LM, Sheldon BC, McElreath R (2017) Conformity does not perpetuate suboptimal traditions in a wild population of songbirds. Proc Natl Acad Sci USA
114:7830–7837.
42 Laland KN, Atton N, Webster MM (2011) From fish to fashion: Experimental and theoretical insights into the evolution of culture. Philos Trans R Soc Lond B Biol Sci
366:958–968.
43 Grüter C, Leadbeater E (2014) Insights from insects about adaptive social information use. Trends Ecol Evol 29:177–184.
44 Leadbeater E, Dawson EH (2017) A social insect perspective on the evolution of social learning mechanisms. Proc Natl Acad Sci USA 114:7838–7845.
45 Alem S, et al. (2016) Associative mechanisms allow for social learning and cultural transmission of string pulling in an insect. PLoS Biol 14:e1002564.
46 Mercader J, et al. (2007) 4,300-year-old chimpanzee sites and the origins of percussive stone technology. Proc Natl Acad Sci USA 104:3043–3048.
47 Allen J, Weinrich M, Hoppitt W, Rendell L (2013) Network-based diffusion analysis reveals cultural transmission of lobtail feeding in humpback whales. Science
340:485–488.
48 Hobaiter C, Poisot T, Zuberbühler K, Hoppitt W, Gruber T (2014) Social network analysis shows direct evidence for social transmission of tool use in wild
chimpanzees. PLoS Biol 12:e1001960.
49 Whiten A, Mesoudi A (2008) Review. Establishing an experimental science of culture: Animal social diffusion experiments. Philos Trans R Soc Lond B Biol Sci
363:3477–3488.
50 Whiten A, Caldwell CA, Mesoudi A (2016) Cultural diffusion in humans and other animals. Curr Op Psychol 8:15–21.
51 Shapiro JA (2017) Biological action in read-write genome evolution. Interface Focus, in press.
52 Mueller T, O’Hara RB, Converse SJ, Urbanek RP, Fagan WF (2013) Social learning of migratory performance. Science 341:999–1002.
53 Whiten A (2017) A second inheritance system: The extension of biology through culture. Interface Focus, in press.
54 Tomasello M (1999) The Cultural Origins of Human Cognition (Harvard Univ Press, Cambridge, MA).
55 Pagel M (2012) Wired For Culture: The Natural History of Human Communication (Allen Lane, London).
56 Henrich J (2015) The Secret of Our Success: How Culture is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter (Princeton Univ Press,
Princeton, NJ).
57 Laland KN (2017) Darwin’s Unfinished Symphony: How Culture Made the Human Mind (Princeton Univ Press, Princeton, NJ).
58 Boivin NL, et al. (2016) Ecological consequences of human niche construction: Examining long-term anthropogenic shaping of global species distributions. Proc
Natl Acad Sci USA 113:6388–6396.
59 Alberti M, et al. (2017) Global urban signatures of phenotypic change in animal and plant populations. Proc Natl Acad Sci USA, 10.1073/pnas.1606034114.
60 Gray RD, Watts J (2017) Cultural macroevolution matters. Proc Natl Acad Sci USA 114:7846–7852.
61 Mesoudi A, Whiten A, Laland KN (2006) Towards a unified science of cultural evolution. Behav Brain Sci 29:329–347, discussion 347–383.
62 Mesoudi A (2011) Cultural Evolution: How Darwinian Theory Can Explain Culture and Synthesize the Social Sciences (Univ of Chicago Press, Chicago).
63 Mesoudi A (2017) Pursuing Darwin’s curious parallel: Prospects for a science of cultural evolution. Proc Natl Acad Sci USA 114:7853–7860.
64 Harmand S, et al. (2015) 3.3-million-year-old stone tools from Lomekwi 3, West Turkana, Kenya. Nature 521:310–315.
65 Stout D, Hecht EE (2017) Evolutionary neuroscience of cumulative culture. Proc Natl Acad Sci USA 114:7861–7868.
66 d’Errico F, Stringer CB (2011) Evolution, revolution or saltation scenario for the emergence of modern cultures? Philos Trans R Soc Lond B Biol Sci 366:1060–1069.
67 d’Errico F, et al. (2017) Identifying early modern human ecological niche expansions and associated cultural dynamics in the South African Middle Stone Age. Proc
Natl Acad Sci USA 114:7869–7876.
68 Gray RD, Atkinson QD, Greenhill SJ (2011) Language evolution and human history: What a difference a date makes. Philos Trans R Soc Lond B Biol Sci
366:1090–1100.
69 Bouckaert R, et al. (2012) Mapping the origins and expansion of the Indo-European language family. Science 337:957–960.
Cultural evolutionary theory: How culture evolves and
why it matters
Nicole Creanza (a,1), Oren Kolodny (b,1,2), and Marcus W. Feldman (b)
(a) Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235; and (b) Department of Biology, Stanford University, Stanford, CA 94305
Edited by Kevin N. Laland, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark April 29, 2017
(received for review January 16, 2017)
Human cultural traits—behaviors, ideas, and technologies that can between DNA variants and traits that have major cultural
be learned from other individuals—can exhibit complex patterns components, such as years of schooling, marriage choices, IQ test
of transmission and evolution, and researchers have developed results, and poverty. Perhaps because of the perceived greater
theoretical models, both verbal and mathematical, to facilitate precision of the genomic data, these culturally transmitted com-
our understanding of these patterns. Many of the first quantita- ponents have been relegated to the deep background, creating a
tive models of cultural evolution were modified from existing con- misleading public portrayal of the traits as being predetermined by
cepts in theoretical population genetics because cultural evolution genetics (see, e.g., ref. 11). Models of the dynamics of interaction
has many parallels with, as well as clear differences from, genetic among culture, demography, and genetics, which uncover the
evolution. Furthermore, cultural and genetic evolution can interact complexities in the determination of these behaviors and traits, are
with one another and influence both transmission and selection. crucial to remedy this potentially dangerous misinterpretation.
This interaction requires theoretical treatments of gene–culture
Here, we explore the ways in which cultural evolutionary
coevolution and dual inheritance, in addition to purely cultural
theory and its applications enhance our understanding of human
history and human biology, focusing on the links between cul-
evolution. In addition, cultural evolutionary theory is a natural
tural evolutionary theory and population genetics, human be-
component of studies in demography, human ecology, and many
havioral ecology, and demography. Throughout, we give examples
other disciplines. Here, we review the core concepts in cultural
of efforts to apply theory to data, linking models of cultural evo-
evolutionary theory as they pertain to the extension of biology lution to empirical studies of genetics, language, archaeology, and
through culture, focusing on cultural evolutionary applications in anthropology. For example, studies of cultural factors, including
population genetics, ecology, and demography. For each of these language and customs, help biologists interpret patterns of genetic
disciplines, we review the theoretical literature and highlight rel- evolution that might be misinterpreted if the cultural context were
evant empirical studies. We also discuss the societal implications of not taken into account. Finally, we outline several societal impli-
the study of cultural evolution and of the interactions of humans cations of cultural evolutionary theory.
with one another and with their environment.
Population Genetics and Cultural Evolution
|
cultural evolution mathematical models | gene–culture coevolution | Many of the first models of cultural evolution drew explicit
|
niche construction demography parallels between culture and genes by modifying concepts from
theoretical population genetics and applying them to culture.
Cultural patterns of transmission, innovation, random fluctua-
H uman culture encompasses ideas, behaviors, and artifacts
that can be learned and transmitted between individuals and
can change over time (1). This process of transmission and change is reminiscent of Darwin’s
principle of descent with modification through natural selection, and Darwin himself drew this
explicit link in the case of languages: “The formation of different languages and of distinct
species, and the proofs that both have been developed through a gradual process, are curiously
parallel” (2, 3). Theory underpins most scientific endeavors, and, in the 1970s, researchers began
to lay the groundwork for cultural evolutionary theory, building on the neo-Darwinian synthesis of
genetics and evolution by using verbal, diagrammatic, and mathematical models (4–8). These models
are, by necessity, approximations of reality (9), but because they require researchers to specify
their assumptions and extract the most important features from complex processes, they have proven
exceedingly useful in advancing the study of cultural evolution (10). Here, we review the field of
cultural evolutionary theory as it pertains to the extension of biology through culture. We focus
on human culture because the bulk of cultural evolutionary models are human-centric and certain
processes such as cumulative culture seem to be unique to humans. However, numerous nonhuman
species also exhibit cultural transmission, and we consider the areas of overlap between models of
human and animal culture in Discussion.

The study of cultural evolution is important beyond its academic value. Cultural evolution is a
fundamentally interdisciplinary field, bridging gaps between academic disciplines and facilitating
connections between disparate approaches. For example, the advent of technologies for revealing
genomic variation has led to a plethora of studies that measure association

tions, and selection are conceptually analogous to genetic processes of transmission, mutation,
drift, and selection, and many of the mathematical techniques used to study genetics can be useful
in the study of culture (1, 12). However, these mathematical approaches had to be modified to
account for the differences between genetic and cultural transmission. For example, we do not
expect cultural transmission to follow the rules of genetic transmission strictly. Indeed, cultural
traits are likely to deviate from all three laws of Mendelian inheritance: segregation, independent
assortment, and dominance (13).

The simple observation that cultural traits need not conform to Mendelian inheritance is sufficient
to produce complex evolutionary dynamics: If children are likely to reject a cultural trait that
both of their parents possess, the frequency of that trait in the population may oscillate between
generations (4). In addition, if two biological parents have different forms of a cultural trait,
their child is not necessarily equally likely to acquire the

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The
Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman
Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program
and video recordings of most presentations are available on the NAS website at
www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: N.C., O.K., and M.W.F. designed research, performed research, analyzed data,
and wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. K.N.L. is a guest editor invited by the Editorial Board.
1 N.C. and O.K. contributed equally to this work.
2 To whom correspondence should be addressed. Email: okolodny@stanford.edu.
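The oscillation mentioned in the text, in which children tend to reject a cultural trait that both of their parents possess, can be illustrated with a short simulation. This is a minimal sketch under stated assumptions, not the model of ref. 4: it assumes a dichotomous trait, random mating, and four hypothetical transmission probabilities (`b3`, `b2`, `b1`, `b0`) whose names and values are chosen here purely for illustration.

```python
def next_frequency(p, b3, b2, b1, b0):
    """Trait frequency among offspring after one generation of vertical transmission.

    p  : frequency of the trait among parents (random mating is assumed)
    b3 : probability a child acquires the trait when both parents have it
    b2 : probability when only the mother has it
    b1 : probability when only the father has it
    b0 : probability when neither parent has it
    All parameter names and values are illustrative assumptions.
    """
    both = p * p
    mother_only = p * (1 - p)
    father_only = (1 - p) * p
    neither = (1 - p) * (1 - p)
    return b3 * both + b2 * mother_only + b1 * father_only + b0 * neither


def simulate(p0, generations, b3, b2, b1, b0):
    """Return the list of trait frequencies from generation 0 onward."""
    trajectory = [p0]
    for _ in range(generations):
        trajectory.append(next_frequency(trajectory[-1], b3, b2, b1, b0))
    return trajectory


# Children mostly reject what both parents do (b3 low) and mostly adopt
# what neither parent does (b0 high).
trajectory = simulate(p0=0.9, generations=8, b3=0.1, b2=0.5, b1=0.5, b0=0.9)
for generation, p in enumerate(trajectory):
    print(f"generation {generation}: p = {p:.3f}")
```

With these particular values the recursion simplifies to p' = 0.9 − 0.8p, so the frequency alternates above and below 0.5 with shrinking amplitude. Other choices of the four probabilities give monotonic rather than oscillating trajectories.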
EVOLUTION
ANTHROPOLOGY
fluence the continuing transmission, and thus survival, of a cultural trait (16). The relative
importance of a population’s size, and its environmental context, for the retention and perhaps
expansion of the cultural repertoire constitutes an ongoing debate (16–20).

The Roles of Transmission and Innovation in Cultural Evolution. Thus far, we have made the analogy
between alleles of a gene and forms of a cultural trait, implying that the cultural trait in
question can be represented in a binary or discrete manner. Although this approximation is
appropriate for some culturally transmitted traits, such as knowing or not knowing how to use a
certain tool, or smoking or not smoking, some cultural traits are more naturally regarded as
continuous or quantitative traits. For example, cultural norms and preferences, such as degree of
risk tolerance, have been modeled as continuous traits (e.g., ref. 21), and knowledge of a tool or
technique has usefully been represented in terms of a quantitative “skill level” (e.g., refs. 16,
22, and 23).

Like genes, cultural traits can be more or less adaptive depending on the environment and spread
accordingly. An interesting question is the following: If a certain behavior may be either innate
(i.e., genetically determined) or culturally acquired (and thus potentially responsive to the
environment), which environmental patterns would favor the genetic transmission? Models predict
that spatially varying environments will favor cultural transmission, whereas only highly stable
environments would favor the genetic determination of the behavior (24–26). Cavalli-Sforza and
Feldman note an important reason that genes, cultural traits, and environments should all be
considered together: “Given the existence of individual plasticity in response to the environment,
correlations between biological relatives are expected even if there is no genetic variation
whatsoever” (14).

Unlike in genetics, where mutations are the source of new traits, cultural innovations can occur
via multiple processes and at multiple scales (1, 27–29). Most of the models described above
include the cultural transmission of existing traits without providing a mechanism for novel traits
to be introduced to the population. In many models of social learning, new information enters a
population via trial-and-error learning or individual interactions with the environment, and this
information can then be culturally transmitted (30, 31). New cultural traits can also originate
when existing traits are combined in novel ways, which can lead to exponential rates of cultural
accumulation (32). Recent models represent innovation as the result of multiple interacting
processes (27–29), and cultural traits can accumulate in punctuated bursts when these processes of
innovation are interdependent: A truly groundbreaking innovation can pave the

Fig. 2. Cultural, genetic, and environmental factors influencing evolution. [Diagram labels:
Climate, Environmental change, Ultraviolet exposure, Altitude; Niche construction, Selection
pressures, Pathogens, Microbiome; Demography, Transmission dynamics, Population size, Innovations;
Flora and fauna, Food sources, Subsistence strategy, Available resources; Migration, Genetic
admixture, Cultural connectivity.]

parison between modeling predictions and the archaeological record that showed that the frequencies
of Neolithic pottery features over time are not consistent with a cultural system at equilibrium
(40).

Linking Genetic and Cultural Evolution. As mentioned above, theoretical treatments of cultural
transmission and evolution can usefully draw on concepts from theoretical population genetics,
extending them to accommodate cultural processes. However, cultural and genetic evolutionary
processes can also interact with one another and with the environment (Fig. 2), and elucidating the
relative contributions of genes, culture, and environment to a phenotype can be very difficult
(41). Extensive theoretical work has been devoted to characterizing these interactions, termed
gene–culture coevolution (1, 42), culture–gene coevolution (43), dual inheritance theory (12, 44),
or cultural niche construction (45, 46). When cultural and genetic evolution interact, the dynamics
of both genetic and cultural traits are likely to be very different from those characteristic of
only one mode of transmission (47, 48). Further, cultural traits can alter the selection pressures
on genetic traits and vice versa: For example, genetic traits that are adaptive in one cultural
background might not be adaptive in another (49, 50). The classic example of these interactions
between cultural and genetic evolution is lactase persistence in adulthood: For much of human
history, there was little reason to digest milk after weaning, and adults did not typically produce
the enzyme that digests lactose. However, with the cultural
Creanza et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7783
practice of cattle domestication and dairying, a genetic mutation that enabled the production of
the lactase enzyme in adulthood was strongly favored by selection (51, 52).

Theoretical analyses show that gene–culture coevolution can be dynamically complex and surprisingly
unpredictable. For example, a well-known finding in population genetics is that a fitness advantage
to heterozygote genotypes maintains genetic variation in a population. However, it is not
sufficient to maintain genetic variation for heterozygote offspring to be superior to homozygotes
in their ability to acquire an advantageous cultural trait that is transmitted culturally by a
parent (12). In fact, the fitness advantage to the culturally transmitted trait has to be
sufficiently large that it overcomes imperfection in vertical cultural transmission. In a similar
vein, Aoki et al. (26, 53) modeled the evolution of a genetic trait that increased the efficiency
of teaching, defined as vertical transmission of a cultural trait. Genetic variation at this
teaching locus could not be maintained with asexual haploid genetics and uniparental cultural
transmission, but sexual haploid genetics and biparental cultural transmission could preserve both
genetic polymorphism of the teaching locus and polymorphism of the cultural trait. These examples
illustrate the theoretical complexity that emerges when standard population genetic theory is
extended to include the interactions between genetic and cultural traits; the result is a highly
nonlinear theory with complications not seen in purely biological theory.

The theoretical literature on gene–culture interactions has become increasingly relevant in the
genomic era. Genome-wide association studies (GWAS) have shown many genomic associations with a
wide array of complex phenotypes and have allowed detection of signals of genetic adaptation (54).
However, GWA studies of behavioral phenotypes such as IQ, educational attainment, and life history
should be interpreted with care (55–58). As the authors of one such study state: “Studies of
genetic analyses of behavioural phenotypes have been prone to misinterpretation, such as
characterizing identified associated variants as ‘genes for education.’ Such characterization is
not correct for many reasons: Educational attainment is primarily determined by environmental
factors” (55). Statistical relationships between genetic variants and behaviors need not be causal
because assortative mating, spatial autocorrelation, and a shared environment can influence such
relationships (55, 59–61). Twin studies of tobacco smoking point to interacting roles of genetics,
environment, and assortative mating in the initiation and continuance of smoking (62). In
large-scale studies of human health, environmental and cultural factors should also be considered
because these could conflate the effects of genetics and ancestry with those of poverty, stress,
racism, or socioeconomic status (63–65). For example, data from the large-scale Health and
Retirement Study showed an association between African ancestry and hypertension: The prevalence of
hypertension was eight percentage points higher in respondents with the highest quartile of African
ancestry compared with those with the lowest quartile (63). However, controlling for a subset of
factors related to socioeconomic status (childhood disadvantage, education, income, and wealth)
explained ∼38% of this disparity, reducing it to a five-percentage-point difference (63).

on one cultural trait can influence the evolutionary dynamics of other cultural traits,
facilitating the spread of rare cultural or genetic variants (71, 72). More generally, assorting
can affect not just mate choice but many types of cultural interactions, termed “assortative
meeting” (73). Empirical work supports this theoretical finding; for example, beneficial health
behaviors spread more readily through a social network when individuals’ social contacts were more
similar to themselves (74, 75). Culturally mediated assortment can also lead to biological
differences: Partners that are more similar tend to have more offspring (76), thus increasing
fitness, and assortative mating within highly homophilic groups affects the average length of
homozygous DNA segments (59, 77), leading to the appearance of higher levels of inbreeding than
might actually exist. Humans can also assort by language; however, studies of the interactions
between language and genetic population structure show that the resulting dynamics can differ by
population. For example, in some geographic regions, language boundaries do not seem to act as
barriers to gene flow (78–80) whereas, in other places, assorting with respect to language seems to
have had a large effect, and genetic similarity is more closely associated with language than with
geographic distance (80–83). Assortative mating has had a measurable effect on human genomic
architecture, and genetic and phenotypic correlations between partners are substantial (84).

In addition to choosing their mates nonrandomly, individuals can also choose their cultural role
models; these cultural transmission biases affect the relationship between a trait’s frequency in
the population and its likelihood of transmission (Fig. 3). For example, conformity bias is an
exaggerated preference for the cultural variant practiced by the majority of the population, which
can lead to an increasingly large majority over time (85, 86). Alternatively, individuals might
preferentially seek out novel cultural traits, termed rarity bias or novelty bias (30). These
frequency-dependent biases can lead to patterns of cultural diffusion in which the prevalence of a
cultural trait can change dramatically over short timescales, producing logistic growth
(“S-shaped” curves) of trait frequency over time (87, 88). Examples of cultural traits that are
likely to exhibit frequency-dependent transmission are fashion trends (89), career choices (12),
and baby names (90). Conformist transmission is likely to dominate when the environment is
relatively stable and common cultural traits are well adapted to that environment (86, 91).

Other types of transmission biases reflect not how common a trait is in a population, but the
characteristics of the people who have the trait. In the case of prestige bias, individuals attempt
to acquire cultural traits that are perceived to be high quality by selectively learning from those
individuals with high social rank (92). For example, in an experimental test, children were much
more likely to choose an adult cultural role model if they had observed bystanders attending to the
potential model rather than ignoring him or her (93); thus, even at a very early age, humans can
assess such characteristics as prestige or social standing. Individuals can also use observations
of success associated with a cultural trait, such as a fruitful hunt with a certain tool, to
develop a preference for cultural role models that are
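The conformity bias described in the text, an exaggerated preference for the majority variant that enlarges the majority over time, can be sketched with a simple frequency recursion. The update rule used here, p' = p + D·p(1 − p)(2p − 1), is one commonly used form for a two-variant trait and is an assumption of this sketch rather than something stated in the text; the function names and the conformity strength of 0.3 are likewise hypothetical.

```python
def conformist_step(p, strength):
    """One generation of conformist-biased transmission for a two-variant trait.

    p        : current frequency of variant A (between 0 and 1)
    strength : conformity strength in [0, 1]; 0 means unbiased copying
    Assumed update rule: p' = p + strength * p * (1 - p) * (2p - 1),
    which pushes p upward when A is in the majority and downward otherwise.
    """
    return p + strength * p * (1 - p) * (2 * p - 1)


def simulate(p0, generations, strength):
    """Return the list of frequencies of variant A from generation 0 onward."""
    frequencies = [p0]
    for _ in range(generations):
        frequencies.append(conformist_step(frequencies[-1], strength))
    return frequencies


# The same trait started as a modest majority and as a modest minority.
majority_start = simulate(p0=0.6, generations=60, strength=0.3)
minority_start = simulate(p0=0.4, generations=60, strength=0.3)

for generation in (0, 10, 20, 40, 60):
    print(
        f"generation {generation:2d}: "
        f"majority start = {majority_start[generation]:.3f}, "
        f"minority start = {minority_start[generation]:.3f}"
    )
```

Under this rule a variant that starts above 50% climbs toward fixation, with per-generation changes that first grow and then shrink, while the same variant started below 50% is driven out. Setting `strength` to 0 leaves the frequency unchanged, which corresponds to unbiased copying.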
affecting evolution (97). A special case of niche construction is cultural niche construction: the
alteration of the environment through cultural practices, which may themselves evolve. Cultural
niche construction involves complex dynamics in which selective pressures act on the culture
itself, interacting with genetic evolution and the environment to influence the spread of both
genetic and cultural traits (71). Because cultural change has the potential to occur faster than
genetic adaptation, dynamics of niche construction that are driven by cultural traits play a
prominent role in human evolution; yet, only in recent decades has cultural evolution begun to be
explicitly incorporated into human evolutionary ecology (98). Studies that pioneered this approach
showed how it can provide insight into the dynamics of the demographic transition in
postindustrialized societies (e.g., refs. 1 and 99). For example, the reduction in birth rate
during the demographic transition is often characterized as a paradox because, from a Darwinian
fitness perspective, individuals should prefer to have more offspring, not fewer (100). However, if
a cultural norm favoring small family size spreads, the fertility rate can drop as well, resulting
in a culturally induced demographic transition (99, 101), which is a case where natural selection
and cultural transmission seem to be in opposition.

The niche-construction approach has been productive in many other studies, such as those that
describe culturally driven change at the ecosystem level: for example, the extinction of megafauna
after the arrival of humans (102), the change of broad-scale landscapes as a result of cultivation
in early and recent times (103–105), and the traditional use of fire as a means to manipulate the
environmental dynamics in a way beneficial for humans (106, 107). Niche construction is also
important in understanding the evolutionary dynamics driven by changes in the immediate environment
that humans experience, such as via construction of shelters and production of clothing that
enabled the expansion of humans into otherwise uninhabitable regions (108), and the use of fire for
food handling, which allowed dramatic changes in subsistence and may even have led to significant
change to the anatomy of the human jaw (109).

Major Cultural Shifts. A key aspect of human evolution is the change over time in human subsistence
strategies. Several models consider the interaction of hunter-gatherers with the populations of
organisms that they consume and how these interact over time. They propose that predation pressure
can decrease a prey species’ population and exert selective pressures in favor of early
reproduction at a smaller body size, potentially leaving a telltale pattern in the archaeological
record. The result may be the

retical studies of culturally determined behaviors bear directly on human fitness, past and
present. Human ecological traits, such as life history profiles, subsistence strategies, mating
preferences, economic decision making, and social structures (119–122), have been analyzed to
predict individual behavior and to support potential intervention that might alter human behaviors
at the societal level.

Interestingly, few studies in human ecology consider the dynamics of cultural evolution on which
the studied behaviors depend; thus, for example, it is frequently assumed that alternative possible
behaviors are available to the human group of interest when they might not be, such as different
subsistence strategies. Similarly, with some notable exceptions (e.g., refs. 123–128), human
behavioral ecology models often do not consider ecological and evolutionary dynamics that may
depend on the studied behavior and that play out on intermediate and long timescales: For example,
how would prey populations evolve over long periods of time in response to a certain human foraging
strategy, and how would that feed back onto human strategy choice? We suggest that these aspects
are promising avenues for further exploration.

Interspecies and Intergroup Dynamics. One of the hotly debated topics in human prehistory is the
replacement of Neanderthals by modern humans ∼40,000 y ago. A recent study (129) proposed an
ecocultural model that incorporated cultural differences between two competing species into
Lotka–Volterra competition dynamics and showed that a difference in culture between moderns and
Neanderthals could have driven the latter’s extinction. This model explicitly includes cultural
evolutionary dynamics and shows that a difference in population sizes between moderns in Africa and
Neanderthals in Eurasia could have led to a difference in the cultural complexity between the two
populations, allowing the small groups of moderns that migrated out of Africa to gradually
outcompete the larger population of Neanderthals that they encountered.

This pattern, with one group replacing another as a result of a culturally derived advantage, is
likely to have taken place repeatedly throughout human history. Thus, for example, genetic evidence
largely supports a scenario in which the Neolithic revolution spread throughout the world not by
diffusion of farming practices among groups but by replacement of hunter-gatherer groups by farmers
(130) (see also refs. 34, 131, and 132). A second revolution occurred 6,000 to 4,000 y ago, when
the early Neolithic farmers were overwhelmed by Yamnaya invaders from the Russian Steppe, who had
the cultural advantage of
transportation by horses (133, 134). Such dynamics, in which cultural adaptation to temporally
variable conditions may play an important role, are also pervasive more recently: For example,
competition between pastoralists and agriculturalists and replacement of one by the other are
documented from biblical times to the present (135, 136).

Culture and Microbes. Models are also important in analyzing humans’ cultural and genetic
coevolution with pathogens, the realm in which many of our species’ harshest evolutionary
challenges have occurred. Some of the clearest signals of natural selection in the human genome are
found near genes that are directly related to coping with diseases such as malaria (137, 138), Kuru
(139), and others (140–142), and the understanding of their evolutionary dynamics is greatly
enhanced when we are able to couple such genetic evidence with cultural dynamics that influenced
them. Durham (143), for example, argues that yam farming practices in West Africa significantly
increased standing water, thus increasing breeding sites for malaria-carrying mosquitoes, which led
to high exposure to malaria and exerted selective pressure in favor of genetic variants that
increase resistance to malaria. In the New Guinea highlands, cannibalism practices that were
widespread until the 1940s drove the Kuru epidemic among the people of this region (144). A model
of culture–pathogen interactions demonstrated that different behavioral regimes could shape
dynamics of pathogenic bacteria, leading to nonintuitive outcomes (145). For example,
antibiotic-resistant strains will spread throughout the population in the presence of ubiquitous
antibiotic use whereas the WT bacteria have a fitness advantage if antibiotics are not used;
however, if people modify their behavior by decreasing use of antibiotics when they become less
effective, both WT and resistant pathogens can coexist (145).

A fast-growing body of research focuses on the host-associated microbiome: the communities of
organisms, mostly bacteria, that live in and on eukaryotes. The dynamics of the microbiome can
interact with those of its host, including genetic variation, cultural practices, and environmental
context, further complicating the study of evolutionary processes. Thus, for example, the
interaction between dairy farming and selection on the lactase persistence gene has become the
poster child of gene–culture coevolution; however, lactose-using bacteria in humans’ digestive
tracts are very likely to have played a prominent role in the emergence of dairy farming (146).
Moreover, these bacteria continue to affect individuals who do not carry a genetic mutation that
allows them to efficiently digest dairy in adulthood. Understanding how cultural practices
influence human–microbe interactions may provide us not only with insight into the Neolithic
farming revolution or early cattle domestication and related human evolution since then, but also
with the necessary tools to make informed nutritional choices, such as those related to dairy
utilization in our present lives. Thus, worldwide dietary recommendations stand to benefit
significantly from an improved understanding of microbe–human interactions (147).

Demography and Cultural Evolution
The growth and age structure of human populations are both affected by norms and beliefs of their
members. A predominantly agricultural lifestyle produced higher population growth than the
hunting-gathering lifestyle it replaced (148, 149). This increased growth was most likely due to
the spread of a complex of cultural traits (150) whose adoption may have created conditions that
favored the accumulation of subsequent culturally transmitted behaviors (151, 152). Beginning in
the late 19th century, parts of Europe, Asia, the United States, Australia, and New Zealand began
to undergo a second demographic transition, which involved a change from a high birth rate, high
mortality regime to a lower birth rate, low mortality regime. These changes were due to the spread
of fertility-reducing and survival-increasing behaviors that became part of the developed
countries’ cultures.

Standard quantitative models of demographic change do not include within-population variation in
behaviors that affect fecundity or mortality. Projections usually use fixed values for birth and
death rates; however, religious preferences, marriage customs, dietary choices, population
subdivision, and mortality profiles may affect fecundity but are usually not part of demographic
models. Further, aspects of cultural transmission, such as prestige bias and the choice of
nonparental cultural role models, can facilitate the spread of fertility-reducing behaviors (12,
153). Thus, cultural evolutionary approaches should be integrated into demography, especially the
processes that have led to fertility decline (154).

Many models for life history analysis of humans divide the lifespan into an ordered series of age
classes. These models first define the fertility rates of each age class and the survival rates
from one age class to the next. Then, they iterate the number in each age class produced by these
parameters to determine the dynamics of the population, including whether the number in each age
class approaches a stable equilibrium, termed the stationary age distribution, or whether the
population will grow or go extinct and at what rate (155).

Carotenuto et al. proposed a demo-cultural framework for such an age-structured population, in
which each individual carried one variant of a dichotomous trait, say H or h, where H represents
the presence of a socially learned behavior (for example, fertility control) and h is its absence
(156). An individual of type H might also be more likely to survive into the next age class. This
integration of demography and culture yields complex dynamics; for example, the trait H can persist
in the population even if it lowers fertility, as long as the cultural transmission of H is
reliable enough, or if H also sufficiently increases the chance of survival. Additional learning
steps can also be added to age-structured models, such that vertical and horizontal transmission
can occur at different rates for different age classes (101). In this case, horizontal learning
accelerated the trait’s spread and led to faster population growth than vertical transmission
alone.

An important outgrowth of demo-cultural modeling has been its application to the sex-ratio problem.
In many places, the sex ratio at birth is strongly biased in favor of males and, in China and parts
of India, has resulted in up to 120 male births for every 100 female births (157). This cultural
preference for sons can be manifested in sex-selective abortion or withholding of resources from
daughters. This bias has both economic and sociocultural antecedents, as well as important ethical
and demographic consequences (158).

Data on cultural transmission of son preference can be incorporated into formal demographic
analysis (159), linking these data to real-world policy applications (160). Theoretical models can
also aid in predicting the effects of policies: For example, one such model tracked the cultural
transmission of the perceived present value of a son relative to a daughter, the sex ratio at
birth, and their effects on demographic change (161). The results of this model suggest that
interventions focused on peer-to-peer cultural transmission of a perceived higher value of
daughters might complement existing economic incentives to support and educate daughters, with the
goal of mitigating the effects of son preference.

The literature on the interaction between cultural transmission and formal demography is quite
sparse. Given the large variety of customs that relate to birth and death rates in different human
societies, population projections for the future needs of diverse populations should incorporate
more cultural dynamics than is currently standard practice.

Discussion
With the extensive body of theoretical and empirical literature on cultural evolution, researchers
in this field are now combining information from multiple disciplines and integrating disparate
approaches. Part of this new frontier involves more fully bridging the divide between theory and
data, as well as developing mathematical models that can aid in the interpretation of
anthropological and archaeological information. In addition to
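The age-class procedure described in the demography section (define a fertility rate for each age class and a survival rate between consecutive classes, then iterate) can be sketched as follows. This is a minimal illustration, not the model of ref. 155 or the demo-cultural framework of ref. 156: the three age classes, the function names, and all numerical values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def project_one_step(counts, fertility, survival):
    """Advance an age-structured population by one time step.

    counts    : number of individuals in each age class, youngest first
    fertility : offspring produced per individual in each age class
    survival  : probability of surviving from class i to class i + 1
                (one entry fewer than counts; the oldest class dies out)
    """
    newborns = sum(f * n for f, n in zip(fertility, counts))
    # zip stops at the shorter list, so the oldest class contributes no survivors.
    survivors = [s * n for s, n in zip(survival, counts)]
    return [newborns] + survivors


def project(counts, fertility, survival, steps):
    """Return the population vector at every step, starting with the initial one."""
    history = [list(counts)]
    for _ in range(steps):
        history.append(project_one_step(history[-1], fertility, survival))
    return history


def age_distribution(counts):
    """Convert counts per age class into proportions of the total."""
    total = sum(counts)
    return [n / total for n in counts]


# Hypothetical parameters, chosen only for illustration.
fertility = [0.0, 2.0, 1.0]
survival = [0.5, 0.25]
history = project([100.0, 50.0, 20.0], fertility, survival, steps=300)

growth_rate = sum(history[-1]) / sum(history[-2])
print("population after one step:", history[1])
print("long-run growth factor per step:", round(growth_rate, 4))
print("long-run age distribution:", [round(x, 4) for x in age_distribution(history[-1])])
```

After enough iterations the proportions in each age class stop changing and the total population grows (or shrinks) by a constant factor per step. A demo-cultural extension along the lines described in the text would track two such vectors, one for each trait variant, with transmission moving individuals between them; that extension is not attempted here.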
based on patterns in material culture (e.g., refs. 172–174), and models of cultural dynamics within
and between groups (e.g., refs. 86 and 175–178)). In addition, we focused on human studies,
although cultural processes are present in many other species. For example, social learning has
been extensively studied in nonhuman animals, in which behavioral strategies, such as producer and
scrounger, and cultural trajectories can be more clearly defined than in humans (166, 179).
Cultural transmission also has large-scale evolutionary implications for some nonhuman animals: For
example, theoretical studies suggest that nonrandom

the human environment coevolve is necessary for understanding historical and present dynamics, and
for predicting future trends. These analyses will provide much-needed tools for the planning and
direction of such dynamics. Humans’ worldwide well-being and that of the ecosystem we live in
depend on our ability to make such predictions and act accordingly.

ACKNOWLEDGMENTS. We thank the John Templeton Foundation and Stanford Center for Computational,
Evolutionary, and Human Genomics for funding.
1. Cavalli-Sforza LL, Feldman MW (1981) Cultural Transmission and Evolution: A Quantitative Approach (Princeton Univ Press, Princeton).
2. Darwin C (1859) On the Origin of Species by Means of Natural Selection (Murray, London).
3. Darwin C (1888) The Descent of Man, and Selection in Relation to Sex (Murray, London).
4. Feldman MW, Cavalli-Sforza LL (1976) Cultural and biological evolutionary processes, selection for a trait under complex transmission. Theor Popul Biol 9:238–259.
5. Feldman MW, Cavalli-Sforza LL (1975) Models for cultural inheritance: A general linear model. Ann Hum Biol 2:215–226.
6. Blum HF (1978) Uncertainty in interplay of biological and cultural evolution: Man’s view of himself. Q Rev Biol 53:29–40.
7. Cavalli-Sforza L, Feldman MW (1973) Models for cultural inheritance. I. Group mean and within group variation. Theor Popul Biol 4:42–55.
8. Alland A, Jr (1972) Cultural evolution: The Darwinian model. Soc Biol 19:227–239.
9. Burnham KP, Anderson DR (1998) Model Selection and Inference: A Practical Information-Theoretic Approach (Springer, New York).
10. Haldane JBS (1964) A defense of beanbag genetics. Perspect Biol Med 7:343–359.
11. Guedes Jd, et al. (2013) Is poverty in our genes? Curr Anthropol 54:71–79.
12. Boyd R, Richerson PJ (1985) Culture and the Evolutionary Process (Chicago Univ Press, Chicago).
13. Mesoudi A (2017) Pursuing Darwin’s curious parallel: Prospects for a science of cultural evolution. Proc Natl Acad Sci USA 114:7853–7860.
14. Cavalli-Sforza LL, Feldman MW (1973) Cultural versus biological inheritance: Phenotypic transmission from parents to children. (A theory of the effect of parental phenotypes on children’s phenotypes). Am J Hum Genet 25:618–637.
15. Giraldeau L-A (1994) Social foraging: Individual learning and cultural transmission of innovations. Behav Ecol 5:35–43.
16. Henrich J (2004) Demography and cultural evolution: How adaptive cultural processes can produce maladaptive losses—The Tasmanian case. Am Antiq 69:197–214.
17. Henrich J, et al. (2016) Understanding cumulative cultural evolution. Proc Natl Acad Sci USA 113:E6724–E6725.
18. Vaesen K, Collard M, Cosgrove R, Roebroeks W (2016) Population size does not explain past changes in cultural complexity. Proc Natl Acad Sci USA 113:E2241–E2247.
19. Collard M, Ruttle A, Buchanan B, O’Brien MJ (2013) Population size and cultural evolution in nonindustrial food-producing societies. PLoS One 8:e72628.
20. Collard M, Buchanan B, Morin J, Costopoulos A (2011) What drives the evolution of hunter-gatherer subsistence technology? A reanalysis of the risk hypothesis with data from the Pacific Northwest. Philos Trans R Soc Lond B Biol Sci 366:1129–1138.
22. Kobayashi Y, Aoki K (2012) Innovativeness, population size and cumulative cultural evolution. Theor Popul Biol 82:38–47.
23. Baldini R (2015) Revisiting the effect of population size on cumulative cultural evolution. J Cogn Cult 15:320–336.
24. Boyd R, Richerson PJ (1983) The cultural transmission of acquired variation: Effects on genetic fitness. J Theor Biol 100:567–596.
25. Aoki K, Feldman MW (2014) Evolution of learning strategies in temporally and spatially variable environments: A review of theory. Theor Popul Biol 91:3–19.
26. Aoki K, Wakano J, Feldman M (2005) The emergence of social learning in a temporally changing environment: A theoretical model. Curr Anthropol 46:334–340.
27. Fogarty L, Creanza N, Feldman MW (2015) Cultural evolutionary perspectives on creativity and human innovation. Trends Ecol Evol 30:736–754.
28. Kolodny O, Creanza N, Feldman MW (2015) Evolution in leaps: The punctuated accumulation and loss of cultural innovations. Proc Natl Acad Sci USA 112:E6762–E6769.
29. Kolodny O, Creanza N, Feldman MW (2016) Game-changing innovations: How culture can change the parameters of its own evolution and induce abrupt cultural shifts. PLOS Comput Biol 12:e1005302.
30. Henrich J, McElreath R (2003) The evolution of cultural evolution. Evol Anthropol 12:123–135.
31. Rendell L, et al. (2010) Why copy others? Insights from the social learning strategies tournament. Science 328:208–213.
32. Enquist M, Ghirlanda S, Jarrick A, Wachtmeister C-A (2008) Why does human culture increase exponentially? Theor Popul Biol 74:46–55.
33. Klein RG, Edgar B (2002) The Dawn of Human Culture (Wiley, New York).
34. Bar-Yosef O (1998) On the nature of transitions: The Middle to Upper Palaeolithic and the Neolithic revolution. Camb Archaeol J 8:141–163.
35. Roebroeks W (2008) Time for the Middle to Upper Paleolithic transition in Europe. J Hum Evol 55:918–926.
36. Darmstaedter L, Du Bois-Reymond R (1904) 4000 Jahre Pionier-Arbeit in den Exakten Wissenschaften (JA Stargardt, Berlin).
37. Aiyar S, Dalgaard C-J, Moav O (2008) Technological progress and regress in pre-industrial times. J Econ Growth 13:125–144.
38. Kuhn SL (2012) Emergent Patterns of Creativity and Innovation in Early Technologies: Origins of Human Innovation and Creativity (Elsevier, Oxford), pp 69–88.
39. Lehman HC (1947) The exponential increase of man’s cultural output. Soc Forces 25:281–290.
40. Crema ER, Kandler A, Shennan S (2016) Revealing patterns of cultural transmission from frequency data: Equilibrium and non-equilibrium assumptions. Sci Rep 6:39122.
21. Bisin A, Verdier T (2010) The economics of cultural transmission and the dynamics of 41. Feldman MW, Cavalli-Sforza LL (1979) Aspects of variance and covariance analysis
preferences. Handb Soc Econ 319:339–416. with cultural inheritance. Theor Popul Biol 15:276–307.
Creanza et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7787
Culture extends the scope of evolutionary biology in the great apes
Andrew Whitenᵃ,ᵇ,¹
ᵃCentre for Social Learning and Cognitive Evolution, University of St. Andrews, St. Andrews, KY16 9JP, United Kingdom; and ᵇScottish Primate Research Group, School of Psychology and Neuroscience, University of St. Andrews, St. Andrews, KY16 9JP, United Kingdom
Edited by Kevin N. Laland, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark April 29, 2017 (received for review January 14, 2017)
Discoveries about the cultures and cultural capacities of the great apes have played a leading role in the recognition emerging in recent decades that cultural inheritance can be a significant factor in the lives not only of humans but also of nonhuman animals. This prominence derives in part from these primates being those with whom we share the most recent common ancestry, thus offering clues to the origins of our own thoroughgoing reliance on cumulative cultural achievements. In addition, the intense research focus on these species has spawned an unprecedented diversity of complementary methodological approaches, the results of which suggest that cultural phenomena pervade the lives of these apes, with potentially major implications for their broader evolutionary biology. Here I review what this extremely broad array of observational and experimental methodologies has taught us about the cultural lives of chimpanzees, gorillas, and orangutans and consider the ways in which this knowledge extends our wider understanding of primate biology and the processes of adaptation and evolution that shape it. I address these issues first by evaluating the extent to which the results of cultural inheritance echo a suite of core principles that underlie organic Darwinian evolution but also extend them in new ways and then by assessing the principal causal interactions between the primary, genetically based organic processes of evolution and the secondary system of cultural inheritance that is based on social learning from others.

social learning | culture | evolutionary biology | chimpanzee | orangutan

and see ref. 7). Such a statement is obviously true for our own species (8–11); here I examine the justifications for thinking the phrase also has validity for great apes.

Following a sister review ranging much more widely across both vertebrates and invertebrates (1), I take eight core principles of evolution illuminated by Darwin (12) and assess the extent to which they apply to cultural phenomena in the great apes (henceforth simply “apes”), as they do in humans (13). I then explore ways in which cultural inheritance goes yet further beyond these principles, creating new evolutionary phenomena. Finally I address interactions between the primary manifestations of organic evolution based on genetic inheritance and the “second inheritance system” (14) based on social learning. In a now long-standing body of literature for humans, this interaction has been called “gene–culture coevolution” (15); the logic of such coevolution (10, 16) may apply to other cultural animals (1, 7).

Diverse and Convergent Evidence for the Scope of Great Ape Culture

Geographic Variation in Traditions in the Wild. In 1986 Goodall began to chart differences in behavior patterns among chimpanzee study sites across Africa (17), proposing these differences as cultural variants when no genetic or environmental explanation was apparent (later called the “method of exclusion”). The approach became more comprehensive with time (18, 19), eventually benefitting from a systematic collaboration between multiple long-term research groups (20, 21). Similar collaborative analyses were soon achieved by orangutan field researchers (22) and more recently by a gorilla consortium (23). These analyses converged in reporting multiple cultural variants in all three genera: 39 in Pan; 24 in Pongo, and 23 in Gorilla. The variants spanned apes’ behavioral repertoires, including a great variety of tool use, food processing, and social behavior, as discussed further below. Further variants have continued to be reported intermittently for

Author contributions: A.W. wrote the paper.
The author declares no conflict of interest.
This article is a PNAS Direct Submission. K.N.L. is a guest editor invited by the Editorial Board.
¹Email: a.whiten@st-andrews.ac.uk.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1620733114/-/DCSupplemental.
Quantitative Evidence for Vertical Mother-to-Offspring Transmission. A study of the ontogeny of using stem-tools for termite-fishing found that juvenile female chimpanzees spent significantly more time attending to their mother’s fishing than did their male peers (38). Consistent with the skills being learned by observation, the young females tended to master the technique a whole year ahead of the males, with a significant tendency to match even the length of probe their mother typically inserted into the mound (38).
Researchers studying orangutans have called the focused visual attention of juveniles “peering” (Fig. 1) (39). Building on studies documenting correlations between maternal and juvenile foraging profiles (40, 41), a suite of predictions was confirmed that were consistent with peering functioning to facilitate learning key survival skills (39). In foraging and nest-building contexts, in which peering is most frequent, it was found that (i) the frequency of peering in foraging contexts was predicted by the quantified

ant-fishing to a new community after the immigration of a proficient individual from a neighboring community where fishing was habitual (44).

Quantitative and Qualitative Evidence for Investment in Transmission. Videos of termite-fishing have documented skilled chimpanzee mothers donating tools to less competent juveniles (Fig. S1), thus suffering a diminished duration and rate of termite-fishing while the recipient enjoyed improved fishing (45). The authors propose these observations meet commonly accepted criteria for a functional (as opposed to intentional) concept of “teaching.” They also document mothers orally splitting their tool lengthwise neatly to make two functional tools or bringing multiple tools and suggest these behaviors partially buffer mothers from the costs of youngsters’ demands. Alternatively it might be argued that these actions are essentially unnecessary and thus represent the more compelling evidence that the behavior has costs and therefore counts as teaching, even if the teaching is not as active as teaching by scorpion provision by meerkats (46) or beaching to catch seals by killer whales (6). However, the pattern of costs and benefits suggests that this support has positive fitness benefits for the young, and parallel reports concerning the use of tools for nut-cracking have also been described (47).
An earlier report described more active involvement in curbing youngsters’ exploration of potentially dangerous food-types. Hiraiwa-Hasegawa reported that when an infant, PN, reached to touch some fig leaves, “her mother, FT, took PN’s hand and moved it away from the leaves. As PN continued . . . FT took the leaves from PN’s hand, plucked all the leaves within her arm’s reach and dropped them to the ground” (ref. 48, p. 280). At least one other mother behaved similarly and “prohibited . . . infants only from feeding on the individual trees that they themselves never fed on.”
[Fig. 2 panel annotations: Nut-cracking recorded at eight sites in West Africa across ~500 km (60). In an experiment in an island sanctuary, East African chimpanzees who do not nut-crack in the wild learned nut-cracking through observation (31, 32).]
Fig. 2. Convergent evidence for a culture of nut-cracking in chimpanzees. Evidence for nut-cracking is seen at multiple sites in West Africa (20, 21, 60) (white stars)
but is absent at others (black stars). The gray star indicates an early report in Cameroon, which was not subsequently confirmed. Independent studies confirmed
availability of raw materials at two such sites (61, 62). Experiments showed East African chimpanzees (two-letter ID codes) did not initially nut-crack (Phase 1), but
when half of the population was exposed to a proficient model, they began to do so (Phase 2), and all did so once exposed (Phase 3) (31, 32).
dietary profiles.
These years of mother–offspring association and cofeeding are typical of all the great apes and appear to lay down dietary preferences that change relatively little after weaning. Although the social learning implied may be as simple as enhancement of a food type by the mother’s feeding on it, such effects are likely to be profoundly important, because large diet-sets need to be mastered and selected from the even more vast options a tropical forest offers. This mastery includes avoiding the numerous plant parts that are toxic, selecting relatively nutritious options, and avoiding relatively poor ones. Chimpanzees may eat more than 300 different food types (species × parts) in a year (74); in the Lopé Park of Gabon, for example, fruit alone is taken from 114 different plants (75) selected from among many hundreds of potential food types available. The diet of gorillas may be similarly diverse; gorillas in the Alfi Mountains of Cameroon eat more than 200 different food types, including fruits, seeds, leaves, stems, pith, flowers, bark, roots, and invertebrates (75). For the orangutans of Tanjung Puting in Borneo, the figure is again more than 300 different food types (76). However, the dietary profiles of different populations may vary greatly, as suggested by earlier chimpanzee studies (77) and confirmed more recently in neighboring orangutan populations separated by a large river, which displayed a 60% difference in diet, contrasting with intrapopulation homogeneity (78). Years of close apprenticeship to a mother who daily displays her knowledge of such a large but selective diet-set likely provide an important means of achieving an adaptive response to this challenging complexity.

Time Depth of Cultural Transmission. Long-term field sites have shown that techniques such as termite-fishing have continued across several generations during the half-century of research now achieved. However, this time-frame pales in comparison with the discoveries of real archaeological excavation, which in the Tai Forest of Ivory Coast reached a depth corresponding to

fitness (reproductive success) of that cultural entity, stone-tool use, will be enhanced through its spread, and to that extent there is cultural evolution of this behavior. It is this second phenomenon that we are addressing here. Effects on an individual culture-bearer’s biological, inclusive fitness are a different matter to which we return in a later section. We can now consider the eight evolutionary principles noted above.

Variation, Selection, and Inheritance. The three principles of variation, selection, and inheritance can together be regarded as the core trinity of Darwinian evolution. Their joint working is an evolutionary algorithm that has been suggested to have the power to explain a multitude of phenomena beyond the living systems to which Darwin applied it (83, 84).
As we have seen above, there is plentiful evidence in the great apes for the feature of inheritance through social learning that provides sufficient fidelity to sustain traditions. There is also cultural variation, in part because, compared with gene replication, social learning is prone to imperfect copying. In the arrays of cultural variants among great apes discussed earlier in this article, there are plenty of behaviors that are displayed by many but not all individuals in a community (these behaviors are classed as “habitual” rather than “customary”).
By contrast, as yet there seems to be little direct recording of cultural evolutionary change through competition and selection within this variation. This absence of evidence of evolutionary change is perhaps unsurprising. During the human Stone Age, even when sophisticated, bilaterally symmetric Acheulian blades showed an advance over earlier crude Oldowan tools, they changed relatively little over a million years (85). If, as is plausible, such stability characterizes chimpanzee nut-cracking and other cultural variants of apes, then we will see little evidence of cultural selection in human lifetimes. Of course, organic evolutionary change is itself often slow compared with scientific lifetimes, although instructive exceptions have often followed human-caused environmental perturbations that create new selection pressures. The classic example is selection favoring dark
4,300 y, where remains of nut-cracking were identified (illus- morphs of peppered moths, better camouflaged against the sooty
trated in figure 1 of ref. 1) beneath those currently generated on surfaces of the industrial revolution, and then flipping to favor
the surface by chimpanzees (79). Of course, this behavior may be light-colored morphs as the world became cleaner again.
very much older. Once such a beneficial technology becomes Accordingly I have suggested that similar contexts of anthro-
customary, it may continue in perpetuity, pending major ecological pogenic change may be fruitful for investigating cultural evolu-
perturbation. This example suggests that ape cultural inheritance tion in animals (1). Scientific experiments may offer a convenient
spans not only the breadth of behavioral repertoires outlined instance. For example, in a pioneering cultural diffusion study,
earlier but also a potentially significant time depth comparable to three juvenile chimpanzees were confronted with and avoided
that familiar in organic evolution via genetic inheritance. two novel objects (86). One youngster was then replaced with a
again. This pattern can be thought of as a spiral or helical process of learning in which cycles of observation and practice allow the learner to assimilate more in later observations than was possible in the earlier, more naive stages (Fig. 3) (32, 39).
Social learning also may be selective in the assimilation of information, variously referred to as “directed social learning” (95), “biased transmission” (15), or “social-learning strategies” (96), which can in principle shape adaptation and consequent evolutionary change, with no clear counterparts in the gene-based processes.
Evidence in great apes has been adduced for a number of the potential learning rules these analyses highlight (97). Evidence for a copy-the-majority rule, suggested by the apparent conformity of chimpanzees in diffusion experiments (68), came in further experiments showing that both children and chimpanzees would copy the choices of three other conspecifics rather than those of a single individual repeating the same act three times
Fig. 3. Helical curriculum model of skill development (after ref. 32). Over repeated cycles of observation-of-expert and practice, the social learner is able to assimilate more information from the expert and gradually improve his/her skill level. See the text for more explanation.
culturally transmitted behaviors are evolutionarily consequential, i.e., they have implications for practitioners’ survival, reproduction, and ultimate inclusive fitness (as opposed to the reproductive success of the cultural items themselves, discussed earlier). Some cultural variants that appear relatively frivolous, such as staring at one’s reflection in water in gorillas (23) or applying an autoerotic tool in orangutans (22), may have less evolutionary significance, but varied forms of tool use by orangutans and chimpanzees appear to be highly functional in gaining access to rich resources such as insect prey, nut kernels, and honey. Indeed, some of these behaviors appear to be vital for chimpanzees to exploit niches that would otherwise exclude them (87). Other culturally transmitted behaviors play functional roles in grooming, social interactions, and sexual courtship.
Another sense in which culturally transmitted behaviors may have been evolutionarily important concerns their effects on organic evolution. Cetacean researchers have proposed that cultural differentiation among whales has led to genetic differences (7, 105). For example, killer whales display eco-types that specialize in hunting alternative prey such as seals or fish using very different techniques, and different clans exhibit other behavioral differences in their songs and migratory/resident patterns, despite often being sympatric (6, 7, 105, 106). Such effects are suggested to have driven other morphological and genetic differentiation, ultimately leading to incipient speciation, because it becomes difficult for a member of one culture to enter another and successfully manage the different foraging and courtship requirements of that culture. This causal pathway would be an instance of “behavioral drive” (107–109), in which plasticity in behavior allows a species to exploit or create a new niche, in this case a culturally dependent one (e.g., hunting fish versus seal, a cultural drive). This niche in turn may create selection pressures acting on organic evolution, with effects such as the evolution of more robust jaws in the seal-hunters (6). Parallel hypotheses have been developed in the case of birdsong dialects driving speciation (110, 111).
Dramatically different specialisms such as seen in killer whales are not apparent among great apes, although the extent to which similar processes are at work, e.g., in contrasts between nut-cracking communities of chimpanzees and the nearest neighbors that do not crack, would repay attention. However, one principal effect of complex culture on organic evolution in apes has been proposed concerning encephalization and the cognitive sophistication it can provide: the cultural intelligence hypothesis.
1. Whiten A (2017) A second inheritance system: The extension of biology through culture. Interface Focus, in press.
2. Whiten A, Ayala FJ, Feldman MW, Laland KN (2017) The extension of biology through culture. Proc Natl Acad Sci USA 114:7775–7781.
3. Fragaszy DM, et al. (2017) Synchronized practice helps bearded capuchin monkeys learn to extend attention while learning a tradition. Proc Natl Acad Sci USA 114:7798–7805.
4. Perry SE, Barrett BJ, Godoy I (2017) Older, sociable capuchins (Cebus capucinus) invent more social behaviors, but younger monkeys innovate more in other contexts. Proc Natl Acad Sci USA 114:7806–7813.
5. Hohmann G, Fruth B (2003) Culture in Bonobos? Between species and within species variation in behavior. Curr Anthropol 44:563–571.
6. Whitehead H, Rendell L (2015) The Cultural Lives of Whales and Dolphins (Univ of Chicago Press, Chicago).
7. Whitehead H (2017) Gene–culture coevolution in whales and dolphins. Proc Natl Acad Sci USA 114:7814–7821.
8. Pagel M (2012) Wired For Culture: The Natural History of Human Communication (Allen Lane, London).
9. Henrich J (2015) The Secret of Our Success: How Culture Is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter (Princeton Univ Press, Princeton, NJ).
10. Creanza N, Kolodny O, Feldman MW (2017) Cultural evolutionary theory: How culture evolves and why it matters. Proc Natl Acad Sci USA 114:7782–7789.
11. Legare CH (2017) Cumulative cultural learning: Development and diversity. Proc Natl Acad Sci USA 114:7877–7883.
12. Darwin C (1859) On the Origin of Species by Means of Natural Selection (Murray, London).
13. Mesoudi A, Whiten A, Laland KN (2004) Perspective: Is human cultural evolution Darwinian? Evidence reviewed from the perspective of the Origin of Species. Evolution 58:1–11.
14. Whiten A (2005) The second inheritance system of chimpanzees and humans. Nature 437:52–55.
15. Boyd R, Richerson P (1985) Culture and the Evolutionary Process (Univ of Chicago Press, Chicago).
16. Mesoudi A (2017) Pursuing Darwin’s curious parallel: Prospects for a science of cultural evolution. Proc Natl Acad Sci USA 114:7853–7860.
17. Goodall J (1986) The Chimpanzees of Gombe: Patterns of Behavior (Harvard Univ Press, Boston).
18. McGrew WC (1992) Chimpanzee Material Culture: Implications for Human Evolution (Cambridge Univ Press, Cambridge, UK).
19. Boesch C, Tomasello M (1998) Chimpanzee and human cultures. Curr Anthropol 39:591–614.
20. Whiten A, et al. (1999) Cultures in chimpanzees. Nature 399:682–685.
21. Whiten A, et al. (2001) Charting cultural variation in chimpanzees. Behaviour 138:1489–1525.
22. van Schaik CP, et al. (2003) Orangutan cultures and the evolution of material culture. Science 299:102–105.
23. Robbins MM, et al. (2016) Behavioural variation in gorillas: Evidence of potential cultural traits. PLoS One 11:e0160483.
24. Whiten A (2012) Social learning, traditions and culture. The Evolution of Primate Societies, eds Mitani J, Call J, Kappeler P, Palombit R, Silk J (Univ of Chicago Press, Chicago), pp 681–699.
25. van Schaik CP, et al. (2009) Orangutan cultures re-visited. Orangutans: Geographic Variation in Behavioral Ecology and Conservation, eds Wich SA, Atmoko SSU, Setia TM, van Schaik CP (Oxford Univ Press, Oxford, UK), pp 299–309.
26. Laland KN, Janik VM (2006) The animal cultures debate. Trends Ecol Evol 21:542–547.
27. Krützen M, Willems EP, van Schaik CP (2011) Culture and geographic variation in orangutan behavior. Curr Biol 21:1808–1812.
28. Schöning C, Humle T, Möbius Y, McGrew WC (2008) The nature of culture: Technological variation in chimpanzee predation on army ants revisited. J Hum Evol 55:48–59.
29. Möbius Y, Boesch C, Koops K, Matsuzawa T, Humle T (2008) Cultural differences in army ant predation by West African chimpanzees? A comparative study of microecological variables. Anim Behav 76:37–45.
30. Luncz LV, Boesch C (2014) Tradition over trend: Neighboring chimpanzee communities maintain differences in cultural behavior despite frequent immigration of adult females. Am J Primatol 76:649–657.
31. Marshall-Pescini S, Whiten A (2008) Social learning of nut-cracking behavior in East African sanctuary-living chimpanzees (Pan troglodytes schweinfurthii). J Comp Psychol 122:186–194.
32. Whiten A (2015) Experimental studies illuminate the cultural transmission of percussive technologies in Homo and Pan. Philos Trans R Soc Lond B Biol Sci 370:20140359.
33. Santorelli CJ, et al. (2011) Traditions in spider monkeys are biased towards the social domain. PLoS One 6:e16863.
34. van Leeuwen EJC, Cronin KA, Haun DB (2014) A group-specific arbitrary tradition in chimpanzees (Pan troglodytes). Anim Cogn 17:1421–1425.
35. van Leeuwen EJC, Cronin KA, Haun DB, Mundry R, Bodamer MD (2012) Neighbouring chimpanzee communities show different preferences in social grooming behaviour. Proc Biol Sci 279:4362–4367.
36. Rawlings B, Davila-Ross M, Boysen ST (2014) Semi-wild chimpanzees open hard-shelled fruits differently across communities. Anim Cogn 17:891–899.
37. Bonnie KE, de Waal FBM (2006) Affiliation promotes the transmission of a social custom: Handclasp grooming among captive chimpanzees. Primates 47:27–34.
38. Lonsdorf EV, Eberly LE, Pusey AE (2004) Sex differences in learning in chimpanzees. Nature 428:715–716.
39. Schuppli C, et al. (2016) Observational learning and socially induced practice of routine skills in immature orangutans. Anim Behav 119:87–98.
40. Jaeggi AV, et al. (2010) Social learning of diet and foraging skills by wild immature Bornean orangutans: Implications for culture. Am J Primatol 72:62–71.
41. Jaeggi AV, van Noordwijk MA, van Schaik CP (2008) Begging for information: Mother-offspring food sharing among wild Bornean orangutans. Am J Primatol 70:533–541.
42. Hobaiter C, Poisot T, Zuberbühler K, Hoppitt W, Gruber T (2014) Social network analysis shows direct evidence for social transmission of tool use in wild chimpanzees. PLoS Biol 12:e1001960.
56. Tennie C, Call J, Tomasello M (2009) Ratcheting up the ratchet: On the evolution of cumulative culture. Philos Trans R Soc Lond B Biol Sci 364:2405–2415.
57. Subiaul F (2016) What’s special about human imitation? A comparison with enculturated apes. Behav Sci (Basel) 6:13.
58. Whiten A (2017) Social learning and culture in child and chimpanzee. Annu Rev Psychol 68:129–154.
59. Whiten A, van de Waal E (2016) Social learning, culture and the ‘socio-cultural brain’ of human and non-human primates. Neurosci Biobehav Rev, 10.1016/j.neubiorev.2016.12.018.
60. Carvalho S, McGrew W (2010) The origins of the Oldowan: Why chimpanzees are still good models for technological evolution in Africa. Stone Tools and Fossil Bones, ed Domínguez-Rodrigo M (Cambridge Univ Press, Cambridge, UK), pp 201–221.
61. Boesch C, Marchesi P, Marchesi N, Fruth B, Joulian F (1994) Is nutcracking in wild chimpanzees a cultural behaviour? J Hum Evol 26:325–338.
62. McGrew WC, Ham RM, White LJT, Tutin CEG, Fernandez M (1997) Why don’t chimpanzees in Gabon crack nuts? Int J Primatol 18:353–374.
63. Whiten A, Mesoudi A (2008) Review. Establishing an experimental science of culture: Animal social diffusion experiments. Philos Trans R Soc Lond B Biol Sci 363:3477–3488.
64. Whiten A, Caldwell CA, Mesoudi A (2016) Cultural diffusion in humans and other animals. Curr Opin Psychol 8:15–21.
65. Horner V, Whiten A, Flynn E, de Waal FBM (2006) Faithful replication of foraging techniques along cultural transmission chains by chimpanzees and children. Proc Natl Acad Sci USA 103:13878–13883.
66. Dindo M, Stoinski T, Whiten A (2011) Observational learning in orangutan cultural transmission chains. Biol Lett 7:181–183.
67. Whiten A, Horner V, de Waal FBM (2005) Conformity to cultural norms of tool use in chimpanzees. Nature 437:737–740.
68. Bonnie KE, Horner V, Whiten A, de Waal FBM (2007) Spread of arbitrary conventions among chimpanzees: A controlled experiment. Proc Biol Sci 274:367–372.
69. Whiten A, et al. (2007) Transmission of multiple traditions within and between chimpanzee groups. Curr Biol 17:1038–1043.
70. Pruetz JD, Bertolani P (2007) Savanna chimpanzees, Pan troglodytes verus, hunt with tools. Curr Biol 17:412–417.
71. Pruetz JD, et al. (2015) New evidence on the tool-assisted hunting exhibited by chimpanzees (Pan troglodytes verus) in a savannah habitat at Fongoli, Sénégal. R Soc Open Sci 2:140507.
72. Sanz C, Call J, Morgan D (2009) Design complexity in termite-fishing tools of chimpanzees (Pan troglodytes). Biol Lett 5:293–296.
73. Nakamura M, Uehara S (2004) Proximate factors of different types of grooming hand-clasp in Mahale chimpanzees: Implications for chimpanzee social customs. Curr Anthropol 45:108–114.
74. Inskipp T (2005) Chimpanzee (Pan troglodytes). World Atlas of Great Apes and Their Conservation, eds Caldecott J, Miles L (Univ of California Press, Berkeley, CA), pp 53–81.
75. Ferriss S (2005) Western gorilla (Gorilla gorilla). World Atlas of Great Apes and Their Conservation, eds Caldecott J, Miles L (Univ of California Press, Berkeley, CA), pp 105–127.
76. McConkey K (2005) Bornean orangutan (Pongo pygmaeus). World Atlas of Great Apes and Their Conservation, eds Caldecott J, Miles L (Univ of California Press, Berkeley, CA), pp 161–183.
77. Nishida T, Wrangham RW, Goodall J, Uehara S (1983) Local differences in plant feeding habits of chimpanzees between the Mahale Mountains and Gombe National Park, Tanzania. J Hum Evol 12:467–480.
J Phys Anthropol 149:447–457.
93. Fragaszy D, Izar P, Visalberghi E, Ottoni EB, de Oliveira MG (2004) Wild capuchin monkeys (Cebus libidinosus) use anvils and stone pounding tools. Am J Primatol 64:359–366.
94. Whiten A (1998) Imitation of the sequential structure of actions by chimpanzees (Pan troglodytes). J Comp Psychol 112:270–281.
95. Coussi-Korbel S, Fragaszy DM (1995) On the relation between social dynamics and social learning. Anim Behav 50:1441–1450.
96. Laland KN (2004) Social learning strategies. Learn Behav 32:4–14.
97. Price EE, Wood LA, Whiten A (2016) Adaptive cultural transmission biases in children and nonhuman primates. Infant Behav Dev, 10.1016/j.infbeh.2016.11.003.
98. Haun DB, Rekers Y, Tomasello M (2012) Majority-biased transmission in chimpanzees and human children, but not orangutans. Curr Biol 22:727–731.
99. Vale GL, Flynn EG, Lambeth SP, Schapiro SJ, Kendal RL (2014) Public information use in chimpanzees (Pan troglodytes) and children (Homo sapiens). J Comp Psychol 128:215–223.
100. Horner V, Proctor D, Bonnie KE, Whiten A, de Waal FBM (2010) Prestige affects cultural learning in chimpanzees. PLoS One 5:e10625.
101. Kendal R, et al. (2015) Chimpanzees copy dominant and knowledgeable individuals: Implications for cultural diversity. Evol Hum Behav 36:65–72.
102. Henrich J, Broesch J (2011) On the nature of cultural transmission networks: Evidence from Fijian villages for adaptive learning biases. Philos Trans R Soc Lond B Biol Sci 366:1139–1148.
103. Harris PL, Corriveau KH (2011) Young children’s selective trust in informants. Philos Trans R Soc Lond B Biol Sci 366:1179–1187.
104. Lucas AJ, et al. (2016) The development of selective copying: Children’s learning from an expert versus their mother. Child Dev, 10.1111/cdev.12711.
105. Carroll EL, et al. (2015) Cultural traditions across a migratory network shape the genetic structure of southern right whales around Australia and New Zealand. Sci Rep 5:16182.
106. Foote AD, et al. (2016) Genome-culture coevolution promotes rapid divergence of killer whale ecotypes. Nat Commun 7:11693.
107. Wyles JS, Kunkel JG, Wilson AC (1983) Birds, behavior, and anatomical evolution. Proc Natl Acad Sci USA 80:4394–4397.
108. Wilson AC (1985) The molecular basis of evolution. Sci Am 253:164–173.
109. Bateson PPG (2004) The active role of behavior in evolution. Biol Philos 19:283–298.
110. Grant BR, Grant PR (2002) Simulating secondary contact in allopatric speciation: An empirical test of premating isolation. Biol J Linn Soc Lond 76:545–556.
111. Grant PR, Grant BR (2002) Adaptive radiation of Darwin’s finches. Am Sci 90:130–139.
112. Whiten A, van Schaik CP (2007) The evolution of animal ‘cultures’ and social intelligence. Philos Trans R Soc Lond B Biol Sci 362:603–620.
113. van Schaik CP, Burkart JM (2011) Social learning and evolution: The cultural intelligence hypothesis. Philos Trans R Soc Lond B Biol Sci 366:1008–1016.
114. Burkart JM, Schubiger MN, van Schaik CP (2016) The evolution of general intelligence. Behav Brain Sci 1–65.
115. Street SE, Navarrete AF, Reader SM, Laland KN (2017) Coevolution of cultural intelligence, extended life history, sociality, and brain size in primates. Proc Natl Acad Sci USA 114:7908–7914.
116. Forss SIF, Willems E, Call J, van Schaik CP (2016) Cognitive differences between orang-utan species: A test of the cultural intelligence hypothesis. Sci Rep 6:30516.
Edited by Andrew Whiten, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark May 29, 2017
(received for review January 23, 2017)
Culture extends biology in that the setting of development shapes the traditions that individuals learn, and over time, traditions evolve as occasional variations are learned by others. In humans, interactions with others impact the development of cognitive processes, such as sustained attention, that shape how individuals learn as well as what they learn. Thus, learning itself is impacted by culture. Here, we explore how social partners might shape the development of psychological processes impacting learning a tradition. We studied bearded capuchin monkeys learning a traditional tool-using skill, cracking nuts using stone hammers. Young monkeys practice components of cracking nuts with stones for years before achieving proficiency. We examined the time course of young monkeys’ activity with nuts before, during, and following others’ cracking nuts. Results demonstrate that the onset of others’ cracking nuts immediately prompts young monkeys to start handling and percussing nuts, and they continue these activities while others are cracking. When others stop cracking nuts, young monkeys sustain the uncommon actions of percussing and striking nuts for shorter periods than the more common actions of handling nuts. We conclude that nut-cracking by adults can promote the development of sustained attention for the critical but less common actions that young monkeys must practice to learn this traditional skill. This work suggests that in nonhuman species, as in humans, socially specified settings of development impact learning processes as well as learning outcomes. Nonhumans, like humans, may be culturally variable learners.
primates | attention | development | learning | tool use
of how culture can extend biology (13–15). However, cognitive processes associated with learning are not yet well integrated into theories of cultural evolution and niche construction (16). Here, we illustrate how growing up in a group with a prevailing tradition of using tools in foraging could affect cognitive development in young monkeys in ways that support their learning this traditional skill. This work opens a bridge between the learning sciences and the field of cultural evolution.
Social Experience Influences the Development of Attention
In humans, social experiences are implicated in the development of attention, memory, and individual learning styles (16–18). Cultural influences on long-term memory development include, for example, the development of particular ways of chunking and rehearsing information to be remembered, such as the construction of “memory palaces” used by the ancient Greeks and the cultures that succeeded them (19) and the use of written lists and notes in the present day. Culture also impacts the development of working memory, which incorporates structures and processes used for the temporary storage and organization of information about events recently heard or seen or about activities recently performed (20, 21). Working memory is dependent on sustained attention, and therefore sensitive to attentional disruption. It is intimately related to motor processes, and limited in span to perhaps three to four “chunks” of information at one time. In humans, working memory typically lasts from a few seconds to more than 1 min depending on emotional salience; whether the information to be remembered is about events, actions, or space; and other factors (20).
with visual guidance to collect food and bring it to the mouth is a primitive characteristic of primates, and all primates use their hands to explore objects and surfaces, as well as to contact others during social interactions, such as grooming and play (32). This fundamental feature of primate behavior is associated with a host of neuroanatomical, perceptuomotor, and cognitive attributes, including strong visual and proprioceptive salience to movements of their own hands and of the hands of others (31).
Through the observer’s bias to attend to others’ manual actions with objects, others’ actions can support the development of attentional control by young monkeys during particular manual activities. In this way, social partners can support young nonhuman primates that otherwise normally experience brief sustained attention to others and to their own activities, learning manual skills that require longer sustained attention. Certainly using tools qualifies as challenging enough to benefit from support of this kind. Acknowledging the cognitive dimension of the constructed niche in nonhuman taxa will strengthen cultural evolutionary theory. Social influence on the development of sustained attention is an appropriate early target for research in this direction. Even if the perceptual biases in primates favoring attention to actions of others are small, they could nevertheless powerfully affect learning trajectories, particularly when magnified by shifts in attention and memory (24).
We hypothesize that nonhuman primates and humans share strong susceptibility to tuning (i.e., extending, strengthening) attention and memory about manual actions and about objects via interest in others’ manual activity. However, nonhuman primates face a challenge in sustaining attention to actions with their hands that humans typically do not, or do not face to the same extent, which could be particularly important when learning a skill involving handling objects. Nonhuman primates’ attention to their own activity is typically disrupted every few seconds to scan the surroundings briefly (surveying the surroundings in this way is termed “vigilance” in the animal behavior literature) (33, 34). Vigilance, which functions to inform the perceiver about predators, conspecifics, and other relevant dynamic features in the
method. Subsequently, the boxes were presented to all members of the group, and the naive meerkats’ interactions with the boxes, as well as when they watched other meerkats opening the boxes, were recorded. The researchers found that individuals were more likely to interact with a box immediately after observing another meerkat interacting with it; the half-life of the effect was 20 s. Young meerkats spend much time with adults in the period when they are learning to forage on the hidden and dangerous scorpions that these animals capture and eat (37). Adults’ influence on young individuals in this period is thought to be necessary for meerkats to master their challenging foraging style (38).
We consider the relation in young monkeys between social influence on activity and attentional processes associated with learning. The activity in question relates to using stone hammers to crack nuts, seeds, or other encased foods, a technical tradition in several populations of wild bearded capuchin monkeys (Sapajus libidinosus) (39–42) (Fig. 1). Young capuchin monkeys, like young meerkats, master finding and feeding on hidden and sometimes noxious prey, and like meerkats, they are interested in and affected by others’ actions with objects (43). Thus, they are good candidates for studies of the temporal dynamics of social influence on behavior with objects and on the cognitive processes associated with learning traditional tool-using skills.
Temporal Dynamics of Social Influence as a Window on Sustained Attention
Our objective was to examine temporal dynamics of social influence on young monkeys’ behavior in a situation in which the young monkeys were practicing component actions of a traditional manual skill. Temporal dynamics of young monkeys’ behavioral coordination with others in this context reflect sustained attention to their own and others’ actions. The skill in question is cracking palm nuts using stone hammers, a precursor to the situation facing humans knapping stone (44). Social support in the form of instruction and demonstration aids in the acquisition of knapping, but these actions are not sufficient by themselves for people to learn to knap stone (45). Providing repeated
Fragaszy et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7799
occasions for practice and self-discovery of movement solutions is a crucial dimension of social support for learning this complex perceptuomotor skill and human traditional skills more generally (13, 46).
We measured the temporal rise and fall of social influence on wild young bearded capuchin monkeys (S. libidinosus) in the company of adults that were cracking palm nuts by striking them repetitively with stone hammers. These monkeys start to interact with nuts and stones in the first year of life. They handle nuts, percuss nuts directly against a hard surface (hereafter, percuss), and strike nuts or nut shells with stones (hereafter, strike) for several years before they are able to crack palm nuts themselves (47, 48). Thus, young monkeys exhibit remarkable persistence in a foraging activity that they cannot perform effectively. We know that others’ cracking nuts is partially responsible for the young monkeys’ persistence. In one recent study, while other group members were cracking and eating nuts, monkeys 6 y of age or younger were threefold more likely to be near an anvil, quadrupled their rate of interaction with nuts, and doubled their rate of percussing and striking compared with times when no monkey in the group was cracking nuts. Interactions with objects other than nuts showed the opposite pattern (48).
Fig. 1. An adult bearded capuchin monkey has cracked a palm nut using a stone hammer on a log anvil and is removing and eating pieces of the kernel. A young monkey that cannot crack a nut itself watches closely. Image used with permission from Luca Antonio Marino, Roma Tre University (Rome, Italy).
The data for this report are taken from monkeys 6 y of age or younger belonging to a wild, habituated group of bearded capuchins observed in five periods over 2 y (Table 1). Members of this group of monkeys routinely crack nuts using stone hammers at many anvils scattered across their home range (49–51). In this study, one observer continuously recorded the focal young monkey’s behavior. A second observer concurrently recorded at intervals of 1 min the distance from each neighbor (within a 10-m zone) to the focal monkey, the identity and behavior of each neighbor, and the occurrence of cracking nuts (i.e., striking a nut with a stone, producing a sharp cracking noise) by any monkey in the group. The method allowed us to analyze the focal monkey’s behavior with nuts and stones and its presence near anvils in relation to the start, continuation, and end of cracking by other members of the group. Data collected while the group was in a frequently visited area with abundant anvils, hammers, and cracked shells, as detailed below, indicated that the temporal pattern of others’ influence on young monkeys’ activity with nuts and stones and their presence near anvils was associated specifically with the others’ activity with nuts (i.e., synchronization was not a byproduct of traveling in a cohesive group). Results are reported for n = 16 monkeys, unless otherwise noted.
Results
Manipulation of Nuts. Young monkeys manipulated nuts at the highest rate when others were cracking nuts (i.e., striking nuts with a stone) (median = 8.3 acts per 10 min) and at the lowest rate (median = 0.9 acts per 10 min) during periods when no others were cracking nuts, and had not been cracking nuts for at least 8 min. The difference between these rates was significant (estimate = 1.98, P < 0.0001). The onset of the effect of others’ cracking nuts was quick: Compared with the minute before others began to crack, the median probability that a young monkey would manipulate a nut doubled in the first minute after others began to crack, and remained doubled or more for at least 5 min when others continued to crack (Fig. 2). Movie S1 provides a video-clip of a young monkey handling a nut while and after another monkey is cracking a nut. During the 7 min after the others stopped cracking, the rate of manipulation of nuts declined exponentially (in $A_t = A_0 e^{-\beta t}$; estimates: $A_0$ = 9.96, P < 0.0001; β = 0.325, P = 0.0013; Fig. 3), where e is the base of the natural logarithm, β is the rate by which the dependent variable declines with time, t is the time since cracking in the group stopped, and $A_t$ (the dependent variable) is the rate or percentage
Table 1. Subjects’ date of birth, sex, and body mass at each sampling time point
Name Date of birth (mm/dd/yyyy) Sex Body mass in 2011, kg Body mass in 2012, kg Body mass in 2013, kg
F, female; M, male.
*Estimate.
Fig. 2. Probability that a young monkey (n = 11) manipulated a nut in the 1 min before the onset of striking a nut performed by another monkey (No cracking) and during each of the 5 min following the onset of striking a nut by another monkey. The boxes display the median and interquartile range, and whiskers indicate minimum and maximum values. The solid line within the box depicts the median. Circles indicate values of outliers.
of time. The output from the model is $A_0$ (the strength of the effect on the dependent variable) and β. The half-life of the effect was 2.1 min. The rate of nut manipulation during the first to fifth minutes after other monkeys stopped cracking nuts was significantly higher than the rate during periods 8 min or longer after others stopped cracking nuts (i.e., the baseline rate) (P < 0.0001 for minutes 1–4 after other monkeys stopped cracking nuts, P = 0.0146 for minute 5). The rate of manipulation of nuts during the sixth and seventh minutes after others stopped cracking nuts and the rate during periods 8 min or longer after others stopped cracking nuts did not differ significantly.
The temporal pattern for young monkeys’ manipulation of other objects besides nuts was the opposite of their actions with nuts. Young monkeys manipulated other objects at the lowest rate when others were cracking nuts (median = 12.8 per 10 min) and at the highest rate during periods 8 min or longer after others stopped cracking nuts (median = 16.7 per 10 min). The difference between these rates was significant (estimate = 1.29, P < 0.0001). From the time when others were cracking nuts through 7 min after they stopped, juveniles’ rate of manipulation of other objects increased exponentially (in $A_t = A_0 e^{-\beta t}$; estimates: A = 14.2, P < 0.0001; β = −0.03, P = 0.015), although the rate of manipulation of other objects in each of the first 7 min after others stopped cracking nuts did not differ from the rate in baseline periods (i.e., 8 min or longer after other monkeys stopped cracking nuts).
Time Spent Near an Anvil. The percentage of time young monkeys spent near an anvil was highest when others were cracking nuts (median = 13.3%) and lowest during periods 8 min or longer after others stopped cracking (median = 2.6%) (Fig. 7). The difference between these percentages was significant (estimate = 6.06, P < 0.0001). When others were cracking until 7 min after all others had stopped cracking, the percentage of time young monkeys spent near an anvil declined exponentially (in $A_t = A_0 e^{-\beta t}$; estimates: A = 11.94, P < 0.0001; β = 0.25, P < 0.0001). The half-life of the effect was 2.8 min.
Discussion
Linking Learning Processes to a Tool-Using Tradition in Monkeys. Culture in a behavioral sense is present when, aided by social context, individuals consistently learn behaviors exhibited by others in their community (i.e., they have traditions). Culture can extend biology in an evolutionary sense when the traditions in question persist across generations, and when they have selective consequences. In parallel with natural selection, occasional spontaneous behavioral variants (arising from developmental plasticity) that afford some advantage and that are learned by
Fragaszy et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7801
others can become established as new traditions. Learning a tradition is, by definition, dependent on the social context in which learning takes place, but it is equally dependent on the learning processes of the individual. Here, we draw attention to the power of social partners to influence not just the content of learning a tradition (i.e., how or when to perform a particular behavior) but also how learning itself takes place. Working from findings with humans that sustained attention is developmentally plastic and susceptible to social influence, we examined behavior of young monkeys, seeking evidence that social influences shape attention in young monkeys. Specifically, we examined temporal dynamics of social influence on young monkeys’ activity with the materials relevant to the tool-using tradition in their wild group (cracking nuts with stone hammers) to index their sustained attention to this task. Young monkeys were more likely to be near an anvil, and to manipulate, percuss, and strike nuts once others began to strike nuts to crack them (which we call cracking). The higher probability of these actions in young monkeys persisted through 5 min while others cracked nuts (5 min was the longest period for which we had sufficient data for these analyses). They continued to manipulate nuts at higher rates and were more likely to be present near an anvil for minutes after others stopped cracking nuts, with a half-life of more than 2 min for each variable. These findings indicate substantial lingering effects of social influence on young monkeys’ interest in nuts and anvils and general exploratory activity directed to nuts. In contrast, these young monkeys decreased manipulation of other objects while other monkeys cracked nuts.
Fig. 4. Probability that a young monkey (n = 11) percussed a nut 1 min before the onset of striking a nut performed by another monkey (No cracking) and during each of the 5 min following the onset of striking a nut by another monkey. The boxes display the median and interquartile range, and whiskers indicate minimum and maximum values. The solid line within the box depicts the median. Circles indicate values of outliers.
Key to our argument that attention (and its related process, working memory) can be enhanced by participating in socially supported practice, others’ cracking was a strong facilitator of young monkeys’ practicing percussion and striking nuts, but only when others were cracking. The young monkeys percussed a nut or struck a nut with a stone much less frequently than they manipulated nuts; moreover, following the end of others’ cracking nuts, young monkeys reduced percussion and striking quickly (a half-life of about 37 s for percussion and a median of zero after 1 min for striking).
Building Sustained Attention for the Least Likely Actions. These findings illustrate how the frequency and temporal duration of others’ actions can influence how young individuals learn a challenging perceptuomotor skill. When others act in frequent bouts, and when these bouts are long-enduring, they support more frequent performance by young individuals of those actions for which attention is least well maintained and, subsequently, working memory is the most fragile. We provisionally identify this process as social scaffolding for learning attentional skills that support learning the least familiar component(s) of an activity. The socially supported practice of sustained attention in a given context can powerfully support development of longer periods of sustained attention that can be marshalled during other contexts. For example, human infants practice sustained attention while maintaining joint attention with a caregiver, and these common and frequent social interactions influence the development of sustained attention more generally (27). In the case of young bearded capuchin monkeys, we suggest that extending sustained attention for percussive actions with stones and nuts is one outcome of repeated prompting to perform these actions arising from others striking nuts while cracking them. We further suggest that the development of longer sustained attention to their own percussive activity supports the acquisition of nut-cracking using stone hammers, which is a signature tradition of tool use in some groups of bearded capuchin monkeys.
Socially Tuned Attention and Learning Traditions in Primates and Other Orders. This line of reasoning leads us to predict that tool-using traditions in nonhuman animals include actions that are infrequent outside of that activity (i.e., traditions are not composed solely of common actions), and to the related predictions that (i) the components of tool use traditions that are infrequent in species-typical activity outside of the traditional activity will be performed frequently by proficient tool users, and (ii) frequent performance by adults of uncommon actions will support practice of these particular actions by young individuals.
Taxonomic variation in social learning may follow from species-typical variations in attention in combination with temporal dynamics of action. We hypothesize two features of attention and memory distinguish primates from other orders: (i) Primates are more likely than other orders to have stronger sustained attention and working memory for actions with objects compared with other kinds of actions, and (ii) primates have stronger interest in others’ actions with objects than do other orders [although there may be other taxa with strong interest in
Fig. 5. Rate of direct percussion of nuts by young monkeys (n = 16) when one or more other monkeys struck nuts (Cracking present) during the 7 min after other monkeys stopped striking nuts and in periods 8 min or longer after others stopped striking a nut (8 min or over). The boxes display the median and interquartile range; whiskers indicate minimum and maximum values. The solid line depicts the median. Circles indicate values of outliers. The exponential curve generated by model fitting is overlaid. The half-life of the decline occurred at 0.62 min.
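The half-lives quoted in the text and figure captions follow from the fitted decay rate β through the relation ln(2)/β given in Methods. The short Python sketch below reproduces that arithmetic for the reported β estimates; it is an illustration only, not the authors' SAS analysis, and the function names `half_life` and `decayed_value` are hypothetical.

```python
import math


def half_life(beta: float) -> float:
    """Half-life of A_t = A_0 * exp(-beta * t), computed as ln(2) / beta."""
    if beta <= 0:
        raise ValueError("a half-life is defined only for a positive decay rate")
    return math.log(2) / beta


def decayed_value(a0: float, beta: float, t: float) -> float:
    """Value of the exponential model A_t = A_0 * exp(-beta * t) at time t."""
    return a0 * math.exp(-beta * t)


# Decay rates (per minute) as reported in Results.
print(round(half_life(0.325), 1))  # nut manipulation: 2.1 min
print(round(half_life(0.25), 1))   # time near an anvil: 2.8 min

# The 0.62-min half-life for direct percussion, expressed in seconds.
print(round(0.62 * 60))            # 37 s

# After one half-life the modeled rate is half its starting value.
print(round(decayed_value(9.96, 0.325, half_life(0.325)), 2))  # 4.98
```

Because the half-life depends only on β, the starting value A0 affects the height of the fitted curve but not how quickly the social effect fades.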
those skills that are commonly practiced by the young individual’s social companions.
We hope this work prompts others to develop these ideas and to test them with data from diverse species. Conceptual and empirical work along these lines is needed to integrate developmental and psychological understanding of behavioral variation into theories of cultural evolution, niche construction, and evolutionary biology.
Concluding Remarks
Culture potentially extends biology insofar as the setting of development supports individuals’ learning traditions, and occasionally learning behavioral variants of these traditions arising in other individuals that become established as new traditions. Behavioral traditions are learned in social settings, and the attentional and memorial processes that underlie that learning are themselves shaped by social partners. To date, our attention on socially aided learning, traditions, and cultures in nonhuman species has focused on the form and function of traditional behaviors (e.g., foraging skills, social interactional patterns). We argue that we need to include social influences on the learning process itself in the scope of cultural inquiry, as cross-cultural educational psychologists have argued (18). Improving our understanding of the psychological processes supporting socially biased learning, and thus the traditions that animals acquire, must be part of advancing theory in cultural evolution. Working memory and attention are one set of linked cognitive processes available for study with respect to how and when learning occurs, and how the social setting of development influences learning processes.
We have hypothetically linked temporal dynamics of social influence to sustained attention and working memory. These cognitive processes are fundamental to learning, including learning a traditional skill, cracking nuts with a stone hammer in the case of the young monkeys that we studied. We suggested that repeated experiences of performing challenging parts of the action cycle relating to cracking nuts could lead to extended sustained attention and working memory for these actions.
Fig. 7. Percentage of time spent within an arm’s length of an anvil by young monkeys (n = 16) when one or more other monkeys struck nuts (Cracking present) during the 7 min after other monkeys stopped striking and in periods 8 min or longer after others stopped striking a nut (8 min or over). The boxes display the median and interquartile range, and whiskers indicate minimum and maximum values. The solid line depicts the median. Circles indicate values of outliers. The exponential curve generated by model fitting is overlaid. The half-life of the decline occurred at 2.8 min.
Methods
Study Site. This study was conducted at Fazenda Boa Vista and adjacent lands in the southern Parnaíba Basin (9°39′S, 45°25′W) in Piauí, Brazil. Palms are abundant in the area, and many produce fruit at ground level. Two species of palm nuts in particular were commonly cracked by the monkeys in this study: tucum (Astrocaryum campestre) and piassava (Orbignya spp.). A tucum nut is, on average, 46 mm in length, weighs 15.5 g, and has a 4.1-mm-thick shell, with a peak-force-at-failure of 5.6 kN. An average piassava nut is 61.3 mm long, weighs 50.6 g, and has a thicker and more resistant shell than a tucum nut: 6 mm with a peak-force-at-failure of 11.5 kN (55).
The naturally occurring stones used by the monkeys to crack nuts weigh, on average, 1.1 kg (range: 250 g to 2.5 kg) (56). They are quartz, quartzite,
siltstone, or harder sandstone. The monkeys cracked nuts on naturally occurring anvils (boulders, exposed stone, or horizontal logs with a flat, or nearly flat, horizontal surface). Anvils are abundant in the region.
Subjects. At the beginning of the study, there were 11 immature monkeys in the group, aged from 3 mo to 4.5 y (Table 1). Five more infants were born during the study. At the beginning of the study, none of the subjects could crack open a whole nut of the more resistant palm species. The two oldest juveniles and, to some extent, two others mastered this skill during the study. Apart from the study subjects, the group included three adult males and five adult females. All but one female habitually cracked palm nuts. The body mass of each monkey was obtained as monkeys stood individually on a digital scale to drink from a bowl of water (57).
Data Collection. Data were collected in five discrete collection periods, each 6 to 9 wk, during three dry seasons (May–July) in 2011, 2012, and 2013 and two wet seasons (January–March) in 2012 and 2013. Observations were collected using two-person teams. One observer followed a focal subject to obtain a continuous record of its activities, including manipulation of nuts and other objects, and locations, specifically if the subject was near an anvil. Concurrently, the other member of the team recorded, as an instantaneous observation every minute, the identity, location, and activity of other monkeys within 10 m of the focal monkey. All observations lasted 20 min, or until the focal subject went out of view and could not be followed. Observations lasting <5 min were discarded.
Observers first learned to identify all members of the group and were subsequently trained in the method with one of the authors (Y.E.). Reliability for focal observations was calculated using Generalized Sequential Querier (GSEQ) software (www2.gsu.edu/~psyrab/gseq/index.html). We used the time unit method, which compares the codes inserted by two observers and defines as a match any instant in which both observers used the same code within a time window of 5 s. For each observer and trainer pair, time unit kappa was at or above 0.7, which is considered highly reliable (58). Reliability for instantaneous observations of other monkeys near the focal monkey was tested separately for each aspect (identity, distance to the focal monkey, activity, and location) until agreement (sum agreement/agreement plus disagreement) was over 80% for each variable for 20 consecutive samples.
The protocol was reviewed and approved by the Institutional Animal Care and Use Committee of the University of Georgia. The study adheres to the code of best practices for field primatology set by the International Primatological Society and all applicable Brazilian regulations for the conduct of field research.
Data Analysis. For each subject in each collection period, we collected between 19 and 53 observations, which lasted cumulatively between 5.3 and 27.1 h. Observations were collated by subject for each season. Ten subjects appeared in all five collection periods. Data were collected as the monkeys traveled throughout their home range. The observations were exported from The Observer to GSEQ software to extract the frequency of different events (e.g., manipulation of nuts) at times when others in the group cracked (struck) nuts and at times when they did not.
With respect to change in the young monkeys’ activity following cessation of others’ cracking, we used general mixed linear models to evaluate the differences in activity under different conditions and exponential models to evaluate the temporal pattern of the effects. SAS/STAT 14.2 software was used for the analyses. We examined the rate of manipulation of nuts, manipulation of objects other than nuts, specific actions with nuts, and time spent near an anvil. The data are summarized in Dataset S1.
Our independent variables were the presence of nut-cracking activity (which involved striking nuts with stones) in the group (yes/no) and the time that had elapsed since this activity stopped (e.g., 0–1 min after the activity stopped, 1–2 min after the activity stopped, 2–3 min after the activity stopped). The dependent variables were (i) proportion of time the subject spent within an arm’s length of an anvil, (ii) manipulation of nuts, (iii) manipulation of other objects, (iv) rate of percussing a nut directly on a surface, and (v) rate of striking a nut with a stone. “Total time” is defined as all seconds of observation under a specific condition of the independent variable. For example, the total time of “3 min after activity stopped” includes all observations from 120 to 180 s after all monkeys in the group stopped cracking nuts. Rates for dependent variables in this period were calculated as the number of events divided by total observation time. The proportion of time near an anvil was calculated as the number of seconds spent there divided by total observation time. In the models, we treated the variables as count variables and used total time as an offset. For variables that did not distribute normally (tested with the Shapiro–Wilk test), the Poisson distribution was used. Subject identification was used as a random factor. Randomization of residuals was used to compensate for overdispersion.
We used the exponential model $A_t = A_0 e^{-\beta t}$ to describe the dynamics of the dependent variables with time, where t is the time since cracking in the group stopped and $A_t$ (the dependent variable) is the rate or percentage of time. The output from the model is $A_0$ (the strength of the effect on the dependent variable) and β, from which the half-life can be calculated as ln(2)/β. For each variable, we examined the goodness of fit of our data to this exponential model and determined the estimates for $A_0$ and β, as well as the half-life of the effect. We used the values of the independent variables when monkeys were striking nuts in each of the 7 min after this activity stopped; at times, we also used the values of the independent variables when monkeys were striking nuts 8 min or longer after this activity stopped or during observation that did not include striking at all. Data from all observation periods were used in analyses, except that data from the two wet seasons on young monkeys’ direct percussion of the nut and striking the nut with a stone were not used in analyses because these actions occurred very rarely in the data from these seasons.
We examined the effect of the onset of others’ cracking by tabulating young monkeys’ actions with nuts in the minute before others began cracking and in the 5 min following the onset of others’ cracking. The data are presented in Dataset S2. We present these data descriptively. We used 11 monkeys’ data for these tallies: 2-y-old, 3-y-old, and 4-y-old monkeys, which our previous analyses had indicated were affected more strongly by others’ striking than younger or older immature monkeys.
At each data collection period, one-quarter to one-half of observations were collected while the monkeys visited an area (∼30-m diameter) of their home range containing several large boulders, fallen logs, and areas of exposed stone that the monkeys habitually used as anvils to crack nuts. We use this area as our outdoor laboratory. Several hammer stones were present in this area, typically left by the monkeys on or near the anvils. Many other anvils with hammer stones were present within 200 m in the surrounding area. The monkeys were sometimes provisioned in the outdoor laboratory with nuts as part of ongoing experiments (59–62).
When in the outdoor laboratory, young monkeys had ample opportunity to spend time near anvils handling stones and nut shells independent of other monkeys’ activity. The influence of others in the group cracking nuts in their vicinity on the rate of manipulating nuts and on proximity to anvils by our subjects in the outdoor laboratory is approximately the same (manipulation: P = 0.0406, estimate = 2.33; proximity to anvils: P < 0.0001, estimate = 8.4) as in the full sample collected over the entire home range (estimate = 1.98 for manipulation, estimate = 6.06 for proximity to anvil). Young monkeys approached anvils and handled nuts most often while adults were cracking nuts, although anvils were available, and nut shells and hammer stones were equally present and available, when others were not cracking. They showed the opposite pattern for manipulating other objects. We conclude that the fine temporal influence of others’ nut-cracking on young monkeys’ activity with nuts and presence near anvils reported here is not a byproduct of synchronized travel of a cohesive group.
ACKNOWLEDGMENTS. We thank the assistants who helped collect the data and Marino Gomes de Oliveira and the Oliveira family for their help and permission to work on their land. We thank Marcus W. Feldman, Andrew Whiten, Kevin N. Laland, and Francisco J. Ayala for the opportunity to participate in the Sackler Colloquium “The Extension of Biology Through Culture.” We thank the statistical consulting service at the University of Georgia (UGA) and the UGA Dean’s Award for the funding of this service. This research was funded by the National Geographic Society, UGA, Coordenadoria de Aperfeiçoamento de Pessoal de Nível Superior, São Paulo Research Foundation (Grant 08/55684-3), and Brazilian National Council for Scientific and Technological Development (CNPq) (Contract 029088). Permission was granted for the research by Instituto Brasileiro do Meio Ambiente e dos Recursos Renováveis through Permit 28689 and by CNPq/Ministério da Ciência e Tecnologia Permit 0002547/2011.
1. Fragaszy DM, Perry S (2003) Towards a biology of traditions. Traditions in Nonhuman Animals: Models and Evidence, eds Fragaszy D, Perry S (Cambridge Univ Press, Cambridge, UK), pp 1–32.
2. Perry SE, Barrett BJ, Godoy I (2017) Older, sociable capuchins (Cebus capucinus) invent more social behaviors, but younger monkeys innovate more in other contexts. Proc Natl Acad Sci USA 114:7806–7813.
3. West-Eberhard MJ (2003) Developmental Plasticity and Evolution (Oxford Univ Press, Oxford).
4. Laland KN, et al. (2015) The extended evolutionary synthesis: Its structure, assumptions and predictions. Proc Biol Sci 282:20151019.
5. Fragaszy DM, Perry S (2003) The Biology of Traditions: Models and Evidence (Cambridge Univ Press, Cambridge, UK).
26. Pasternak T, Greenlee MW (2005) Working memory in primate sensory systems. Nat Rev Neurosci 6:97–107.
27. Yu C, Smith LB (2016) The social origins of sustained attention in one-year-old human infants. Curr Biol 26:1235–1240.
28. Fausey CM, Jayaraman S, Smith LB (2016) From faces to hands: Changing visual input in the first two years. Cognition 152:101–107.
29. Myowa-Yamakoshi M, Matsuzawa T (2000) Imitation of intentional manipulatory actions in chimpanzees (Pan troglodytes). J Comp Psychol 114:381–391.
30. Fragaszy DM, Deputte B, Cooper EJ, Colbert-White EN, Hémery C (2011) When and how well can human-socialized capuchins match actions demonstrated by a familiar human? Am J Primatol 73:643–654.
31. Rizzolatti G, Sinigaglia C (2008) Mirrors in the Brain: How Our Minds Share Actions, Emotions and Experience (Oxford Univ Press, Oxford).
32. Fragaszy DM, Crast J (2016) Functions of the hand in primates. The Evolution of the Primate Hand, Perspectives from Anatomical, Developmental, Functional, and Paleontological Evidence, eds Kivell T, Schmitt D, Lemelin P (Springer, New York), Vol 2, pp 313–344.
33. Davis RT (1974) Monkeys as perceivers. Primate Behavior: Developments in Field and Laboratory Research, ed Rosenblum L (Academic, New York), Vol 3.
34. Treves A (2000) Theory and method in studies of vigilance and aggregation. Anim Behav 60:711–722.
35. Heyes C (2012) What’s social about social learning? J Comp Psychol 126:193–202.
36. Hoppitt W, Samson J, Laland KN, Thornton A (2012) Identification of learning mechanisms in a wild meerkat population. PLoS One 7:e42044.
37. Thornton A, Hodge S (2008) The development of foraging microhabitat preferences in meerkats. Behav Ecol 20:103–110.
in wild New Caledonian crows. Behaviour 147:553–586.
53. Fuhrmann D, Ravignani A, Marshall-Pescini S, Whiten A (2014) Synchrony and motor mimicking in chimpanzee observational learning. Sci Rep 4:5283.
54. Schuppli C, et al. (2016) Observational social learning and socially induced practice of routine skills in immature wild orangutans. Anim Behav 119:87–98.
55. Visalberghi E, et al. (2008) Physical properties of palm fruits processed with tools by wild bearded capuchins (Cebus libidinosus). Am J Primatol 70:884–891.
56. Visalberghi E, et al. (2007) Characteristics of hammer stones and anvils used by wild bearded capuchin monkeys (Cebus libidinosus) to crack open palm nuts. Am J Phys Anthropol 132:426–444.
57. Fragaszy DM, et al. (2016) Body mass in wild bearded capuchins, (Sapajus libidinosus): Ontogeny and sexual dimorphism. Am J Primatol 78:389–484.
58. Bakeman R, Deckner DF, Quera V (2005) Analysis of behavioral streams. Handbook of Research Methods in Developmental Science, ed Teti DM (Wiley, New York), pp 394–420.
59. Massaro L, Liu Q, Visalberghi E, Fragaszy D (2012) Wild bearded capuchin (Sapajus libidinosus) select hammer tools on the basis of both stone mass and distance from the anvil. Anim Cogn 15:1065–1074.
60. Fragaszy DM, Liu Q, Wright BW, Allen A, Brown CW (2013) Wild bearded capuchin monkeys (Sapajus libidinosus) strategically place nuts in a stable position during nut-cracking. PLoS One 8:e56182.
61. Hanna JB, et al. (2015) Kinetics of bipedal locomotion during load carrying in capuchin monkeys. J Hum Evol 85:149–156.
62. Liu Q, Fragaszy DM, Visalberghi E (2016) Wild capuchin monkeys spontaneously adjust actions when using hammer stones of different mass to crack nuts of different resistance. Am J Phys Anthropol 161:53–61, 1.
Older, sociable capuchins (Cebus capucinus) invent
more social behaviors, but younger monkeys
innovate more in other contexts
Susan E. Perry(a,b,1), Brendan J. Barrett(c,d), and Irene Godoy(e)
(a) Department of Anthropology, University of California, Los Angeles, CA 90095-1553; (b) Behavior, Evolution and Culture Program, University of California, Los Angeles, CA 90095-1553; (c) Animal Behavior Graduate Group, University of California, Davis, CA 95616-8522; (d) Department of Anthropology, University of California, Davis, CA 95616-8522; and (e) Behavioural Science Institute, Radboud University Nijmegen, 6500 HE Nijmegen, The Netherlands
Edited by Andrew Whiten, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark May 16, 2017
(received for review January 18, 2017)
An important extension to our understanding of evolutionary processes has been the discovery of the roles that individual and social learning play in creating recurring phenotypes on which selection can act. Cultural change occurs chiefly through invention of new behavioral variants combined with social transmission of the novel behaviors to new practitioners. Therefore, understanding what makes some individuals more likely to innovate and/or transmit new behaviors is critical for creating realistic models of culture change. The difficulty in identifying what behaviors qualify as new in wild animal populations has inhibited researchers from understanding the characteristics of behavioral innovations and innovators. Here, we present the findings of a long-term, systematic study of innovation (10 y, 10 groups, and 234 individuals) in wild capuchin monkeys (Cebus capucinus) in Lomas Barbudal, Costa Rica. Our methodology explicitly seeks novel behaviors, requiring their absence during the first 5 y of the study to qualify as novel in the second 5 y of the study. Only about 20% of 187 innovations identified were retained in innovators’ individual behavioral repertoires, and 22% were subsequently seen in other group members. Older, more social monkeys were more likely to invent new forms of social interaction, whereas younger monkeys were more likely to innovate in other behavioral domains (foraging, investigative, and self-directed behaviors). Sex and rank had little effect on innovative tendencies. Relative to apes, capuchins devote more of their innovations repertoire to investigative behaviors and social bonding behaviors and less to foraging and comfort behaviors.
innovation | Cebus capucinus | cultural evolution | phenotypic plasticity | learning
Despite the obvious theoretical importance of innovation as a key element in cultural evolution, there are relatively few studies of this topic, particularly in wild populations, because of methodological difficulties in stimulating innovation experimentally or detecting innovations in observational studies (10). This paucity of information is partly because of the difficulty in creating operational definitions of innovation that can produce meaningful datasets for comparative analysis. Here, we loosely adopt the definitions by Reader and Laland (11) of innovation (the process) as “a process that results in new or modified learned behaviour and that introduces novel behavioural variants into a population’s repertoire” and innovation (the product) as “a new or modified learned behavior not previously found in the population” (ref. 11, p. 14). Following the definitions of Ramsey et al. (12) and van Schaik et al. (13), we emphasize that innovations are not part of the innate repertoire and do not arise predictably in all population members at certain points in the life history; also, they do not predictably emerge in all population members in response to particular social or ecological conditions. The definition by Ramsey et al. (12) and van Schaik et al. (13) differs from the definition by Reader and Laland (11), because it focuses on the individual rather than the population [i.e., Ramsey et al. (12) argue that multiple individuals within the same population could independently create the same behavior]. We take a compromise position, counting a behavior as an innovation if this is the first time that the behavior has been seen in a particular social group during the putative innovator’s lifetime. Operational definitions of innovation and invention differ greatly across fields. Some define innovations as inventions that have
ANTHROPOLOGY
reasons. Experimentally seeded innovations or problem-solving tasks provide controlled contexts, where both the latency of individuals innovating a solution and the social diffusion of innovations may be studied. Innovating solutions to novel tasks is also of obvious adaptive value. However, many innovations that are ecologically or socially relevant are difficult to study experimentally. Some animals have innovative social interaction, i.e., “social games,” which may serve as bond testing rituals and can be socially transmitted (27). Some wild animals display repetitive self-directed “quirks,” perhaps to self-soothe, akin to the proposed function of some stereotyped coping behaviors in captive and wild animals (28). Other behavioral innovations have no apparent immediate biological or ecological function.

Systematic observational study of innovations across a wide range of behavioral domains permits us to explore whether individual propensity to innovate is generalized or whether individuals will be differentially prone to innovate in different behavioral domains according to their ecology and life history. For example, it has been suggested that “necessity is the mother of invention” and hence, that individuals who are young, low-ranking, and/or socially peripheral will be more prone to inventing new foraging strategies (29); this hypothesis has yielded mixed results in past literature reviews (6). In general, there are few strong theoretical expectations about how age, sex, rank, and sociality affect innovation rates; we need natural history and observational studies to help guide theory.

Capuchin monkeys (genera Cebus and Sapajus) are expected to have unusually high innovation rates for myriad reasons. Comparative studies have shown that brain size covaries with innovation frequency in primates and birds (30, 31), and capuchins

dependently invented in the buffer period (2002–2006) by monkeys in three other groups and spread to multiple individuals in one of those groups. (The buffer period was a time period during which we did not score innovations, but we used this time period to confirm whether behaviors seen in the subsequent 5-y period were truly new to that group. Details are in Methods.) It was not possible to reliably score food choice innovations and assign them to particular innovators, but the foraging category would have been much larger had we been able to include them.

The self-directed category included nine behaviors related to enhancing comfort, dental hygiene, self-soothing, and self-stimulation. Capuchins are prone to inventing “personal quirks,” especially involving clutching or poking some part of their own body for prolonged periods of time. These habits may persist for years and might be transmitted to other group members. There were many individuals in the 2007–2011 dataset that were still practicing postural quirks that they invented during the buffer period (2002–2006), and those are not evident in SI Appendix, Table S1. The “body part hold” category lumps together many different types of postural quirks; if we split this category more finely, there would be far more innovations in this domain.

The social category includes 47 forms of social interaction that are not part of the standard species repertoire. Eight of these were independently invented in multiple groups, and most of these inventions involved incorporation of behavioral elements from the foraging repertoire into the exploration or use of the interaction partner’s body (e.g., explorations of the partner’s orifices or mouthing parts of the partner). Some behaviors (e.g., dental examinations, eye poking, hand sniffing, sucking of body
Perry et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7807
parts, toy game, and hair game) have been invented in multiple groups over the years and have become well-established traditions in some groups; hence, many of these behaviors scored as innovations for this 2007–2011 time period were invented before 2007 in other groups and still in practice (and hence, not counted as innovations) during this time period. Both past work and the patterning of results in SI Appendix, Table S1 suggest that those social rituals that involve some discomfort or risk (e.g., having an appendage bitten or damaging an eye) are more prone to remain in individual repertoires and become established in group repertoires. In addition to the aforementioned behaviors that have been proposed as bond-testing behaviors (41), the social category of innovations includes social play, social displays, and some creative ways for females to regulate infant behavior.

The investigative category included creative manipulations of other species (e.g., porcupines, howler monkeys, and turtles), human artifacts, leaves, sand, sticks, water, rocks, and other inanimate objects as well as innovative ways of locomoting through the forest. Most of these behaviors had no obvious immediate purpose and gave the impression that the innovator was engaged in recreational creativity, exploring the affordances of the materials. To give a few examples, in “cow pie seesaw” (Movie S2), the young monkey flips over a dried piece of cattle dung the length of her body, so that the flat side is on top and the rounded side contacts the ground; then, she stands on top, rocking back and forth. In “mango hitting game,” a young monkey routinely finds mangos about one-half the size of her body on the ground and applies hard, two-handed strikes to them for 4–5 min at a stretch, throwing her body weight into the assault on the mango, with no apparent interest in eating it (SI Appendix). Less than 15% of investigative behaviors were seen more than once in individual repertoires, and only 18% were subsequently observed being practiced by the innovator’s groupmates.

The final dataset, described in detail in SI Appendix, Table S1, included 187 innovations, 127 of which were unique behavior types. Of the 187 innovations, 149 (80%) of them we never again saw performed by the innovator, and only 41 (22%) were seen performed by other monkeys in the same group (i.e., had the potential to have become a tradition during the observation period). Of the 127 unique innovations seen, 54 (42.5%) were in the investigative domain, 47 (37.0%) related to social behavior, 9 (7.1%) were self-directed behaviors, and 17 (13.4%) were foraging behaviors. At least one innovation, based on our conservative criteria, was scored in 117 of 234 individuals included in this dataset. Descriptions of all innovations are included in SI Appendix, and SI Appendix, Table S1 reports the distribution of these behavioral variants across social groups and individuals.

Innovations were rare, being observed, on average, less than once per individual per year in any particular domain. These annual rates may appear low, but our aim is to predict the properties of an individual that make him/her more prone to innovate in particular behavioral domains. If one ignores the properties of individuals or behavioral domains, annual group-wide innovation rates (Fig. 1), which have been the focus of many field studies of innovation, are much higher; however, we can learn much about individual innovators by taking this more detailed approach.

Our global model (referred to as mASRMG in Table 1) received overwhelming support compared with other models [having a Widely Applicable Information Criterion (WAIC) weight of 1.00] and suggests that sociality and age are the most important predictors of innovation (Table 1). From posterior median (PM) estimates and 89% credible intervals (89% CIs), our model suggests that marginal rates of innovation per individual per year (innovation rates) are highest in the social domain (PM = 0.122; 89% CI = 0.064–0.195) followed by the investigative domain (PM = 0.085; 89% CI = 0.047–0.147), foraging domain (PM = 0.028; 89% CI = 0.014–0.052), and self-directed domain (PM = 0.018; 89% CI = 0.008–0.037). Much of the variance in probability of innovating can be explained by individual identity (σid of αp = 1.01) and social group membership (σgroup of αp = 1.56). Importantly, differences between behavioral domains account for more variation in rates of innovation (Table 2) than between-individual or -group differences for all other varying effect parameters with the exception of βmalep (SI Appendix, Table S2). Because interpreting the coefficients of these models can be nonintuitive and because they interact multiplicatively to estimate innovation rates in each domain, we instead refer the reader to plots of model predictions (Figs. 2 and 3 and SI Appendix, Figs. S1–S4) for all estimated effects.

For our model predictions, we display the posterior predictions for each individual’s annual innovation rate in each behavioral domain. The y axes may represent (i) the estimated number of innovations per individual monkey per year or the joint probability, (1 − p) × λ (Figs. 2 and 3 and SI Appendix, Figs. S3A and S4A). In some cases, looking at the individual components of a zero-inflated Poisson (ZIP) model can be informative, and therefore, we also present (ii) the probability of innovating per year, 1 − p, (SI Appendix, Figs. S1 A–D, S2 A–D, S3B, and S4B) and (iii) the number of innovations per year estimated to be observed conditional on being an innovator, λ (SI Appendix, Figs. S1 E–H, S2 E–H, S3, and S4C).

Age. Age differentially predicts innovativeness across behavioral contexts. Younger individuals innovate at higher rates in the investigative, foraging, and self-directed domains, although the effect size is quite small for self-directed and foraging behaviors (Fig. 2 A–C). Older individuals are slightly more innovative in the social domain (Fig. 2D and Table 2). This effect was more heavily driven by the probability of being an innovator (SI Appendix, Fig. S1 A–D and Table S2) than the number of innovations conditional on being an innovator. Younger innovators seem slightly more likely to produce more innovations, conditional on being an innovator, but this effect is small (SI Appendix, Fig. S1 E–H) and near zero in most domains.

[Fig. 1 panels: AA, CU, FF, FL, LB, MK, NM, RF, RR, SP; x axis: estimated annual innovation rate per group.]
Fig. 1. Posterior predictions of annual innovation rate (number of innovations per year) for each group. Two-letter names of the social groups are at the top of each panel; vertical lines indicate PMs. n = 44 group-years.

Table 1. WAIC estimates for all evaluated innovation models
Model     WAIC       dWAIC    wWAIC   SE
mASRMG    1,442.02   0        1       81.97
mA        1,478.34   36.32    0       84.38
mS        1,503.03   61.01    0       84.49
m         1,510.11   68.09    0       84.22
mRG       1,526.1    84.08    0       86.46
mM        1,570.25   128.23   0       88.39
Capital letters in model names correspond to predictors included: age (A), sociality (S), rank (R), sex (M), and group size (G). dWAIC, difference in WAIC scores from the highest ranked model; wWAIC, WAIC weight.
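
The WAIC weights in Table 1 can be reproduced approximately from the dWAIC column. The sketch below assumes the conventional Akaike-style weighting, in which each model's relative support is exp(−0.5 × dWAIC), normalized across models; the source does not state the formula that was used, so this is an illustration rather than the authors' code. The function name waic_weights is hypothetical; the dWAIC values are copied from Table 1.

```python
import math

# dWAIC values copied from Table 1 (difference from the top-ranked model).
D_WAIC = {
    "mASRMG": 0.0,
    "mA": 36.32,
    "mS": 61.01,
    "m": 68.09,
    "mRG": 84.08,
    "mM": 128.23,
}


def waic_weights(deltas):
    """Akaike-style weights (assumed formula): exp(-0.5 * dWAIC), normalized to sum to 1."""
    relative = {name: math.exp(-0.5 * d) for name, d in deltas.items()}
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}


weights = waic_weights(D_WAIC)
for name, w in weights.items():
    print(f"{name:8s} {w:.3f}")
```

Under this assumption, a gap of about 36 WAIC units already leaves the second-ranked model with a weight that rounds to zero, which is consistent with the wWAIC column.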
                 Behavioral domain
αp               −0.12   −0.45    0.01    0.06
αl               −0.98    0.07   −1.38    0.49
βagep             2.37    3.96    2.97   −1.64
βagel             0.03   −0.52   −0.04    0.24
βsocialityp      −0.03   −0.31    0.23   −0.13
βsocialityl      −0.32   −0.09   −0.52    0.51
βrank.highp       0.14   −0.46   −0.02    0.00
βrank.highl       0.05   −0.32    0.15   −0.01
βrank.lowp       −0.37   −0.42   −0.44    0.19
βrank.lowl       −0.21    0.03   −0.21    0.13
βmalep            0.12   −0.57    0.33   −0.09
βmalel           −0.09    0.08   −0.26    0.20

Sociality. Sociality also differentially predicts innovation across behavioral domains. More social individuals showed higher rates of innovation in the social (Fig. 3D) domain. Less social individuals had higher innovation rates in the foraging (Fig. 3A), investigative (Fig. 3B), and self-directed (Fig. 3C) domains, although these effects are weak and less certain. More social individuals produced a greater number of innovations per year (conditional on being innovators) in the social domain (SI Appendix, Fig. S2H), whereas less social individuals showed a greater number of innovations in the self-directed domain (SI Appendix, Fig. S2G), and there was little effect of sociality on foraging (SI Appendix, Fig. S2E) and investigative (SI Appendix, Fig. S2F) behaviors.

Sex. Males (PM = 0.034; 89% CI = 0.014–0.074) have slightly higher innovation rates than females (PM = 0.024; 89% CI = 0.011–0.045), ignoring between-domain variation. Males were slightly more likely to be innovators (SI Appendix, Fig. S3B). Within the social and investigative domains, males showed slightly higher innovation rates both overall and conditional on being innovators, but there were no discernable effects of sex in the foraging and self-directed domains (SI Appendix, Fig. S3 A and C). Where these small differences exist, they are uncertain and potentially of no biological significance.

[Fig. 3 axes: x, sociality scores (standardized); y, individual innovation rate. Panel titles recovered: c. self-directed, d. social.]
Fig. 3. Joint model predictions for the effect of sociality on the number of innovations per individual per year. Dark lines are at the PMs; lighter lines are 100 randomly sampled posterior predictions. n = 3,132 individual-years.

CI = 0.017–0.070) followed by high-ranking individuals (PM = 0.026; 89% CI = 0.011–0.058) and lowest in low-ranking indi-

ranked individuals (SI Appendix, Fig. S4B). Most of these rank-related domain-specific effects are relatively small and uncertain,

Fig. 2. Joint model predictions for the effect of age on the number of innovations per individual per year in the domains of (A) foraging, (B) investigative, (C) self-directed and (D) social behaviors. Dark lines are at the PMs; lighter lines are 100 randomly sampled posterior predictions. n = 3,132 individual-years.

i) Researchers were vigilant for innovations and recorded them throughout the study period, likely resulting in fewer overlooked innovations than in other long-term studies. It also enabled rigorous recording of innovations across a
wider range of behavioral domains than would have been the case had we focused our data collection on a specialized research topic not specifically related to innovation in multiple behavioral domains. Despite efforts to record all possible types of innovations, we failed to record food choice innovations with sufficient rigor to include them in this analysis.
ii) This study generated a high density of behavioral observations compared with most primate field projects because of the year-round presence of a large number of well-trained data collectors. This higher sampling density suggests that, relative to other studies, more true innovations would have been observed and also, perhaps, that fewer false positives would have been generated (i.e., fewer rare species-typical behaviors labeled as innovations simply because they had not been observed in other practitioners because of low sampling density).
iii) The large number of groups monitored in essentially identical ecological circumstances with overlapping home ranges offered better opportunity than most studies for identifying innovations via comparison of presence vs. absence in groups exhibiting similar ecologies. With the exception of five of the innovations concerned with exploration of human artifacts, there is no reason to think that monkeys had differential opportunities to discover particular innovations. Having more groups in the sample might make it easier to find one that lacked some rare behavior, although the long-term nature of the study combined with the dense behavioral sampling probably mitigate this tendency for the large number of groups to produce false positives of innovation caused by presence/absence contrasts.
iv) The long-term nature of the study and the use of a 5-y buffer period before the observation period reduce the chance that behaviors will be falsely termed innovations when they are actually low-frequency behaviors already in the repertoire. If our study had been the length of a typical dissertation project (i.e., 1 y) and if we had assumed that behaviors were new to the practitioners if it was the first time that we had seen them performed in those groups, then we would have had 52% more innovations in our sample than we obtained by using the buffer period method and requiring each recorded innovation to be the first sighting by that individual in its group(s) of residence.
v) Our method is more conservative than the definition used by Ramsey et al. (12), which defines innovations as being new to individual repertoires but not necessarily new to group repertoires. We recognize the possibility of having independent inventions within a single social group, and we suspect that many true innovations in our sample have been discarded because of the suspicion that they may be the products of social learning. On the whole, we think that our method provides a more accurate technique for diagnosing innovations than alternative observational methods. However, overlooking independent inventions within the same social group is one way in which we likely underestimate innovation rates.
vi) Our approach looks at the properties that affect individual propensities to innovate and longitudinally tracks the rate of innovation over 5 y. Previous studies have looked at group-level differences or short windows of time. Our hierarchical statistical approach accounts for unequal sampling effort among individuals, estimates between- and within-individual variation, and avoids the potential problem of falsely making inferences about individual innovation rate from potentially spurious group-level effects.

Rates and Types of Innovation in Capuchin Monkeys. Evidence that individuals, on average, (i) innovate in any one of these four domains less than once per year, (ii) retain only about 20% of these innovations in their repertoires, (iii) transmit no more than 22% of their innovations to other group members, and (iv) generally produce innovations with no obvious utility may give an artificially low impression of the biological importance that innovation has for behavioral repertoires overall. Most groups have a current repertoire of bond-testing signals that have been steadily practiced by a subset of the group for many years, and these behaviors are not, for the most part, scored as innovations in this dataset, because they first appeared in the buffer period or even earlier. Food choice innovations (not reported here because of the interobserver reliability challenge of being able to correctly identify plants the first time that they are eaten) tend to be adopted quickly and remain in repertoires for long periods of time, such as is the case for chimpanzees (42) and various species of monkeys (29). Particularly useful food processing or drinking techniques, once invented, are likely to persist in repertoires for many years [or even centuries (43)]. Tail-dipping to access water in deep tree holes seems to have been independently invented in four groups at Lomas, originating during the 2002–2006 period in three of these groups (and therefore, not counting as an innovation according to our definition) but persisting during the 2007–2011 period in two of those three groups. Although many of the innovative behaviors recorded during this 5-y period seem to be aimless creativity with no obvious utilitarian goal in mind (aside from the foraging behaviors), it is important to remember that innovations, like mutations, may not be particularly beneficial on their own but may become exapted (44) and acquire a benefit when paired with a particular ecological or social context (even if the initial pairing is accidental). For example, a potentially risky behavior, such as sticking a finger deep into the eye socket of a partner, might yield benefits if it is incorporated into a dyadic ritual that tests the quality of an important social bond (27, 45). Additionally, many of the capuchin innovations in this dataset might inform the developing monkey about the affordances of objects or how his/her body relates to the environment, providing useful feedback, even if there is no practical value to permanently incorporating the new behavior into the behavioral repertoire.

How the Capuchin Innovation Repertoire Compares with Those of Chimpanzees and Orangutans. The profound methodological differences between studies of these species preclude precise quantitative comparisons of innovation rates in different behavioral domains, but we can at least make some crude qualitative comparisons between capuchins and the other two primate species for which innovations have been systematically cataloged in the wild. The Mahale chimpanzee researchers (42) present data on 26 novel behaviors (after excluding food choice to make their results more comparable with the other datasets) retrospectively extracted from their 43-y study of two chimpanzee communities. Although studying innovation was not an explicit part of their core data collection protocol, many researchers at Mahale described behavioral variation and novel behaviors in detail. They defined innovation as any behavior not seen in the first 15 y of research. Innovation in orangutans has been explicitly studied using the geographic contrasts method in short-term studies by van Schaik et al. (26) at multiple sites, producing a sample of 44 putative innovations. Comparison of the distribution of behaviors across domains in their datasets with the composition of the repertoire in the dataset of 127 unique innovations from Lomas Barbudal suggests that capuchins, relative to these ape species, devote a higher proportion of their creative energy to investigation of their environment and devising new social behaviors and a lower proportion to comfort-related behaviors and foraging. Orangutans are particularly prone to devising new variants on nest-building techniques, and even their novel acoustic behaviors seem to emerge primarily in a nest-building context. Both ape species are more innovative with regard to bodily comfort and hygiene than capuchins. Capuchins rarely seem to prioritize comfort: they do not build nests, and
of age on innovative tendencies. A review of the published primate literature suggests that adults innovate more than immatures (16), and this pattern has been corroborated by experimental studies of innovation in callitrichids (22) and meerkats (20), in which young animals were less likely than adults to successfully solve a novel extractive foraging task, possibly because of insufficient development of dexterity. Chimpanzees are a notable exception to this general pattern; high rates of innovation, particularly of the social and investigative sort, are observed in immature chimpanzees (16, 29). Among human children, older children are better than younger children at solving novel problems (47). It is worth noting that, in this study, we were probably measuring something more akin to creativity and exploration rather than skill at solving a task, and it is possible that the innovations created by older capuchins could be argued to be more sophisticated in some way.

The other variable that had a strong effect on capuchins’ propensity to innovate was sociality: more social capuchins were more prone to inventing novel social interaction types, and more social monkeys were also slightly more likely to have their social innovations picked up by other group members (although we cannot currently say whether this is because of social learning) (SI Appendix, Fig. S5B). These results can probably be explained simply by the fact that more social individuals have more opportunities to experiment with novel forms of social interaction.

The existing literature on innovation does not have much to say about the effects of sociality per se, aside from predicting that technological innovations will be more common in peripheral animals who are less distracted by social life (29). However, the literature does address dominance rank as a predictive variable,

data on this population (S.E.P.) make the final determinations about which behaviors are truly new for each group of monkeys, with input from two additional long-term researchers (B.J.B. and I.G.).

Beginning in January of 2002, all research staff were directed to make freeform comments about any behaviors that did not neatly fit into the standard ethogram of species-specific behavior and explicitly mark comment lines as comment innovation when they thought they were seeing a behavior that they had never seen before in that group or a behavior that they thought was a unique behavioral tradition. Naturally, many behaviors seem new to relatively inexperienced observers, and therefore, not all of the behaviors initially coded as innovations were true innovations. Also, some behaviors not coded as innovations in the comments section were, in fact, true innovations. S.E.P., who personally collected 13,770 h of data on this population from 1990 to 2016, read through all data to determine whether behavioral sequences were likely to be true innovations (i.e., behaviors seen for the first time in that particular social group).

This research was performed in compliance with the laws of Costa Rica, and the protocol was approved by the University of California, Los Angeles Animal Care Committee (ARC 1996-122 and 2005-084 plus various renewals).

Buffer Period. We used a 5-y chunk of observational data (35,196 h collected between January 1, 2007 and December 31, 2011) to look for innovations. We used the 5 preceding years of data (∼37,514 h of observation collected during 2002–2006) as a “buffer period” (i.e., a period in which we could search for prior instances of behaviors that appeared to be innovations within the targeted 2007–2011 time period). If we had not left a large buffer period, we would have falsely concluded that far more behaviors within the time period of interest were innovations. SI Appendix, Table S1 reports, for each innovation, the number of groups in which it occurred, its persistence in the innovator’s repertoire, and whether it spread to other group members.

Data were collected by a team of 50 highly trained observers (∼12 per year), each of whom underwent a training period of approximately 3 mo of
dawn to dusk instruction and interobserver reliability testing before contributing data to the database. Interobserver reliability tests for monkey identifications and coding were repeated monthly throughout their tenures at the field site. Obviously, we could not train observers to reliably code innovations (behaviors that had not yet happened) in the same ways. We could, however, ensure that the observers recognized and reliably recorded basic motor movements, gestures, and 205 behaviors considered to be part of the species-typical behavioral repertoire. This repertoire was based on >6,000 h of observation invested in studying these monkeys during 1990–2001 before explicit recording of innovations began, thereby enabling better detection of the idiosyncratic behaviors.

Although we gave instructions to data collection teams to record any type of innovation in any behavioral domain, we decided not to include innovations regarding (i) choices of food or medicinal plants and (ii) vocal behavior in this analysis. Although we witnessed many instances of apparent innovation regarding use of novel foods and medicines, we lacked sufficient confidence in observers’ ability to accurately identify rarely used plants and insects or accurately identify rarely produced vocalizations.

A behavioral observation was scored as an innovation if it met the following criteria: (i) the behavior was absent in some of the groups that we regularly monitor, (ii) it was the first time that this behavior had been seen in this group during 2007–2011, and (iii) this behavior had not been seen in this group during the buffer period time period preceding 2007 during the lifetime of the putative innovator. In cases of group fission or migration, we required that the behavior not have been seen previously in the other groups of which the potential innovator had been a member. There was one class of behaviors on which an additional criterion was imposed: postural habits, such as clutching of one’s own body parts or sniffing one’s own hand (SI Appendix).

The number of innovations was also counted via two less conservative methods for methodological comparison. In one version, we eliminated the buffer period criterion, calling the first observation of a behavior in a particular group an innovation. This method yielded 263 innovations compared with 187 produced with the more conservative method described above. In an even less stringent version that yielded 282 innovations, we termed a behavior an innovation if it had not been seen in that group in the past year.

Another challenge in defining innovation is the “grain” problem (i.e., determining the descriptive breadth of behavioral categories or in other words, the extent to which to lump vs. split behaviors) (51). The grain problem is insoluble; the best that we can do is to use our intuitions about what the animals themselves seem to consider novel and observations of how these behaviors cluster in the repertoires of groups and individuals. Are novel actions used as part of a task already in the behavioral repertoire novelties? In our view, they usually were. Are the same actions applied to different objects? Unless the objects and contexts were quite radically different, we opted to lump these together (i.e., we did not designate them as innovations). This issue was most prominent in our decision-making when evaluating clutching of different body parts (a self-directed behavior), sucking of different body parts (a social behavior), and the “toy game,” which involves passing an object from mouth to mouth. In all of these cases,

their offspring that are less than 1 y of age). Individuals with less than 20 scans per year were excluded from analysis. Sociality scores were standardized, so that a sociality score of one is 1 SD above the centered mean of zero. (iv) Rank. For individuals older than 3 y old, rank was calculated using the EloRating package in R (52). We used outcomes from dyadic interactions involving avoids, cowers, flees, and supplants to determine dominance. We used default Elo parameters, with initial Elo scores set to 1,000, and the constant k set to 100. Rank was estimated using average Elo scores calculated for each individual per calendar year. Young juveniles less than 3 y old were given Elo Scores of zero to assure a low rank within their social groups. Members of a group were divided up into tertiles, with the corresponding levels receiving rank categories of high, middle, and low. High and low ranks were estimated as dummy variables, with middle rank serving as the intercept-only reference category.

Statistical Methods. Our outcome variable, number of innovations, is a count variable with many zero values. Each monkey was observed over multiple years. Membership in a particular group may have affected propensities to innovate. Therefore, we analyzed these data using a series of hierarchical ZIP models. ZIP models are mixture models that use two probability distributions. One component assumes a Bernoulli distribution and estimates p: the probability of observing a zero. The other component assumes a Poisson distribution and estimates λ: the estimated mean of a Poisson distribution. ZIP models permit a mixture of causal factors to be evaluated and help better predict outcomes when there is a large number of zeros because of both the rarity of an event and false negatives. The joint likelihood of observing an innovation can be calculated by multiplying the likelihoods of the Bernoulli and Poisson outcomes and converting them to the real number scale using their corresponding link functions. We graphically present joint posterior predictions here (Figs. 2 and 3) along with model predictions of p and λ in SI Appendix (SI Appendix, Figs. S1–S4).

We looked at four predictors in this analysis: (i) age, (ii) sex, (iii) sociality, and (iv) rank. We also estimated unique offsets for each individual, because they differed in observation time or exposure. We analyzed six models, four of which corresponded to one of the single aforementioned predictors. The other two included a global model that looked at all four predictors and an intercepts-only model. In each model, we used varying intercepts for each individual (n = 234), social group (n = 10), and behavioral domain (n = 4). Varying slopes for domains and groups were estimated for all four predictors, and varying slopes for individuals were estimated for sociality and age. Group size was used as a covariate to control for the numerical likelihood that, in smaller groups, observed behaviors might be more likely to be scored as innovations because of our definition of uniqueness and that a greater number of innovations is more likely in larger groups. In another analysis, we used experimental year as a varying effect to see if there were any biases in data collection between field assistant cohorts that would change our inference. There were none, and therefore, we excluded these parameters from our final analyses to simplify the presentation of results.

Offsets and Exposure. Because observations of innovations were collected not
we chose to lump rather than split behaviors. In the case of object play (e.g., only in focal follows but also, ad libitum, and because individuals varied in
with sand, water, or rocks), different actions used by different individuals in their likelihood of being observed because of data collection protocols or
interacting with these same substances were scored as different behaviors. their visibility in the group, they also differed in exposure. To account for
Descriptions of the complete list of behaviors that were included in our differences in exposure, we calculated an annual offset for each individual in
analysis are in SI Appendix. each calendar year. Offsets for each individual (Oi) were estimated using
Innovations were classified in four categories or behavioral domains, the
content of which is described in greater detail in Results: foraging, self- Gi + Fi
Oi = log ,
directed behavior, social behaviors, and investigative behaviors (explora- 365
tion of the environment).
where Gi is the number of instantaneous group scans calculated per calen-
dar year for each individual i, and Fi is the number of point samples collected
Data Structure. We created a different row for every individual monkey/year/
at 2.5-min intervals during focal follows in a calendar year. These offsets
behavioral domain combination, and a value was scored for the number of
were included alongside linear predictors in each model.
innovations observed for that combination; this number was the output
variable. We measured four main predictor variables to determine what Models were fit using the map2stan function in the R package rethinking
predicts the number of innovations per individual per year: age, sex, sociality, (53). Models were fit using Hamilton Markov Chain Monte Carlo in r-STAN
and rank. (i) Age. To calculate age, we subtracted the birth year of an indi- (v 2.14) (54) in R v. 3.3.2 (55). Models were compared with widely applicable
vidual from the year of observation and added one. Individuals born in the information criteria (WAIC) using the compare function in rethinking. The
same year as the year of observation were excluded from the dataset. Age was corresponding code and data used for each model and graph production can
log-transformed and centered for analysis. (ii) Sex. We coded sex as a dummy be found through a link in SI Appendix.
variable (one for males and zero for females). (iii) Sociality. Sociality was cal- To estimate group-level difference in annual innovation rate, we summed
culated using data from group scans, which were taken opportunistically for the number of innovations observed within each group and across all indi-
all group members, at intervals no closer together than 10 min. For each in- viduals and behavioral domains. Exposure rates for individuals within groups
dividual per calendar year, we calculated the proportion of group scans in were summed within years. Counts of annual innovations per group were
which the individual was in proximity (i.e., within ∼400 cm) to at least one then fit using a hierarchical Poisson model fit using r-STAN that accounted for
other group member other than their dependent offspring (i.e., for females, exposure rates using metrics previously described and varying intercepts for
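The exposure offset and the zero-inflated Poisson (ZIP) likelihood described in these methods can be illustrated with a minimal Python sketch. It is not the authors' code (their models were fit in R with rethinking and RStan). It assumes the standard ZIP mixture form, reads p as the Bernoulli probability of a structural zero, and assumes a log link with the offset added to the linear predictor. The function names and the example numbers are hypothetical.

```python
import math


def exposure_offset(group_scans: int, focal_points: int, days: int = 365) -> float:
    """Annual offset O_i = log((G_i + F_i) / 365) for one individual."""
    return math.log((group_scans + focal_points) / days)


def zip_log_likelihood(count: int, p_zero: float, linear_predictor: float,
                       offset: float = 0.0) -> float:
    """Log-likelihood of one count under a zero-inflated Poisson.

    p_zero (assumed to lie in [0, 1)) is the probability of a structural
    zero; the Poisson mean is lambda = exp(linear_predictor + offset).
    """
    lam = math.exp(linear_predictor + offset)
    if count == 0:
        # A zero arises either structurally or from the Poisson process.
        return math.log(p_zero + (1.0 - p_zero) * math.exp(-lam))
    # A positive count can only come from the Poisson component.
    log_poisson = count * math.log(lam) - lam - math.lgamma(count + 1)
    return math.log1p(-p_zero) + log_poisson


# Hypothetical individual-year: 500 group scans and 230 focal point samples.
offset = exposure_offset(500, 230)
ll_zero = zip_log_likelihood(0, p_zero=0.3, linear_predictor=-2.0, offset=offset)
ll_one = zip_log_likelihood(1, p_zero=0.3, linear_predictor=-2.0, offset=offset)
```

Working on the log scale multiplies the Bernoulli and Poisson contributions by adding their logarithms, which avoids numerical underflow when many individual-year rows are combined.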
1. Imanishi K (1952) Man (Mainichi-Shinbunsha, Tokyo).
2. Kummer H (1971) Primate Societies: Group Techniques of Ecological Adaptation (AHM Publ Corp, Arlington Heights, IL).
3. West-Eberhard MJ (2003) Developmental Plasticity and Evolution (Oxford Univ Press, Oxford).
4. Giraldeau L-A, Caraco Y, Valone T (1994) Social foraging: Individual learning and cultural transmission of innovations. Behav Ecol 5:35–43.
5. Sol D (2003) Behavioural flexibility: A neglected issue in the ecological and evolutionary literature? Animal Innovation, eds Reader SM, Laland KN (Oxford Univ Press, Oxford), pp 62–82.
6. Reader SM, Morand-Ferron J, Flynn E (2016) Animal and human innovation: Novel problems and novel solutions. Philos Trans R Soc Lond B Biol Sci 371:371.
7. Wyles JS, Kundel JG, Wilson AC (1983) Birds, behavior, and anatomical evolution. Proc Natl Acad Sci USA 80:4394–4397.
8. Keagy J, Savard J-F, Borgia G (2009) Male satin bowerbird problem-solving ability predicts mating success. Anim Behav 78:809–817.
9. Cauchard L, Boogert NJ, Lefebvre L, Dubois F, Doligez B (2013) Problem-solving performance is correlated with reproductive success in a wild bird population. Anim Behav 85:19–26.
10. Whiten A, Ayala FJ, Feldman MW, Laland KN (2017) The extension of biology through culture. Proc Natl Acad Sci USA 114:7775–7781.
11. Reader SM, Laland KN (2003) Animal innovation: An introduction. Animal Innovation, eds Reader SM, Laland KN (Oxford Univ Press, Oxford), pp 3–35.
12. Ramsey G, Bastian ML, van Schaik C (2007) Animal innovation defined and operationalized. Behav Brain Sci 30:393–407.
13. van Schaik CP, et al. (2016) The reluctant innovator: Orangutans and the phylogeny of creativity. Philos Trans R Soc Lond B Biol Sci 371:371.
14. Fogarty L, Creanza N, Feldman MW (2015) Cultural evolutionary perspectives on creativity and human innovation. Trends Ecol Evol 30:736–754.
15. Perry S (1996) Intergroup encounters in wild white-faced capuchins, Cebus capucinus. Int J Primatol 17:309–330.
16. Reader SM, Laland KN (2001) Primate innovation: Sex, age, and social rank differences. Int J Primatol 22:787–805.
17. Lefebvre L, et al. (1998) Feeding innovations and forebrain size in Australasian birds. Behaviour 135:1077–1097.
18. Lefebvre L, Whittle P, Lascaris E, Finkelstein A (1997) Feeding innovations and forebrain size in birds. Anim Behav 53:549–560.
19. Boogert NJ, Reader SM, Hoppitt W, Laland KN (2008) The origin and spread of innovations in starlings. Anim Behav 75:1509–1518.
20. Thornton A, Samson J (2012) Innovative problem solving in wild meerkats. Anim Behav 83:1459–1468.
21. Aplin LM, et al. (2015) Experimentally induced innovations lead to persistent culture via conformity in wild birds. Nature 518:538–541.
22. Kendal RL, Coe RL, Laland KN (2005) Age differences in neophilia, exploration, and innovation in family groups of callitrichid monkeys. Am J Primatol 66:167–188.
23. Russon AE (2003) Innovation and creativity in forest-living rehabilitant orangutans. Animal Innovation, eds Reader SM, Laland KN (Oxford Univ Press, Oxford), pp 279–306.
24. Whiten A, et al. (1999) Cultures in chimpanzees. Nature 399:682–685.
25. van Schaik CP, et al. (2003) Orangutan cultures and the evolution of material culture. Science 299:102–105.
26. van Schaik CP, van Noordwijk MA, Wich SA (2006) Innovation in wild Bornean orangutans (Pongo pygmaeus wurmbii). Behaviour 143:839–876.
27. Perry S, et al. (2003) Social conventions in wild white-faced capuchin monkeys: Evidence for traditions in a neotropical primate. Curr Anthropol 44:241–268.
28. Koolhaas JM, et al. (1999) Coping styles in animals: Current status in behavior and stress-physiology. Neurosci Biobehav Rev 23:925–935.
29. Kummer H, Goodall J (1985) Conditions of innovative behaviour in primates. Philos Trans R Soc Lond B Biol Sci 308:203–214.
30. Overington S, Morand-Ferron J, Boogert NJ, Lefebvre L (2009) Technical innovations drive the relationship between innovativeness and residual brain size in birds. Anim Behav 78:1001–1010.
31. Navarrete AF, Reader SM, Street SE, Whalen A, Laland KN (2016) The coevolution of innovation and technical intelligence in primates. Philos Trans R Soc Lond B Biol Sci 371:371.
32. Stephan H, Bauchot R, Andy OJ (1970) Data on size of the brain and various brain parts in insectivores and primates. The Primate Brain, eds Noback CR, Montagna W (Appleton-Century-Crofts, New York), pp 289–297.
33. Perry S, Ordoñez Jiménez JC (2006) The Effects of Food Size, Rarity, and Processing Complexity on White-Faced Capuchins’ Visual Attention to Foraging Conspecifics, eds Hohmann G, Robbins M, Boesch C (Cambridge Univ Press, Cambridge, UK), pp 203–234.
34. Fragaszy DM, Visalberghi E, Fedigan LM (2004) The Complete Capuchin: The Biology of the Genus Cebus (Cambridge Univ Press, Cambridge, UK).
35. Day RL, Coe RL, Kendal JR, Laland KN (2003) Neophilia, innovation and social learning: A study of intergeneric differences in callitrichid monkeys. Anim Behav 65:559–571.
36. Lynch Alfaro JW, et al. (2012) Explosive Pleistocene range expansion leads to widespread Amazonian sympatry between robust and gracile capuchin monkeys. J Biogeogr 39:272–288.
37. Rose L, et al. (2003) Interspecific interactions between white-faced capuchins (Cebus capucinus) and other species: Preliminary data from three Costa Rican sites. Int J Primatol 24:759–796.
38. Perry S, Manson JH (2008) Manipulative Monkeys: The Capuchins of Lomas Barbudal (Harvard Univ Press, Cambridge, MA).
39. Sol D, Sayol F, Ducatez S, Lefebvre L (2016) The life-history basis of behavioural innovations. Philos Trans R Soc Lond B Biol Sci 371:371.
40. Perry S (2012) The behavior of wild white-faced capuchins: Demography, life history, social relationships, and communication. Adv Study Behav 44:135–181.
41. Perry S (2011) Social traditions and social learning in capuchin monkeys (Cebus). Philos Trans R Soc Lond B Biol Sci 366:988–996.
42. Nishida T, Matsusaka T, McGrew WC (2009) Emergence, propagation or disappearance of novel behavioral patterns in the habituated chimpanzees of Mahale: A review. Primates 50:23–36.
43. Haslam M, et al. (2016) Pre-Columbian monkey tools. Curr Biol 26:R521–R522.
44. Gould SJ, Vrba ES (1982) Exaptation - a missing term in the science of form. Paleobiology 8:4–15.
45. Zahavi A (1977) The testing of a bond. Anim Behav 25:246–247.
46. Russon AE, Kuncoro P, Ferisa A (2015) Tools for the trees: Orangutan arboreal tool use and creativity. Animal Creativity and Innovation, eds Kaufman AB, Kaufman JC (Elsevier, San Diego), pp 419–455.
47. Beck SR, Williams C, Cutting N, Apperly IA, Chappell J (2016) Individual differences in children’s innovative problem-solving are not predicted by divergent thinking or executive functions. Philos Trans R Soc Lond B Biol Sci 371:371.
48. Laland KN, van Bergen Y (2003) Experimental studies of innovation in the guppy. Animal Innovation, eds Reader SM, Laland KN (Oxford Univ Press, New York), pp 155–173.
49. Frankie GW, Vinston SB, Newstrom LE, Barthell JF (1988) Nest site and habitat preferences of Centris bees in the Costa Rican dry forest. Biotropica 20:301–310.
50. Perry S, Godoy I, Lammers W (2012) The Lomas Barbudal Monkey Project: Two decades of research on Cebus capucinus. Long-Term Field Studies of Primates, eds Kappeler P, Watts D (Springer, New York), pp 141–165.
51. Russon A, Andrews K, Huss B (2007) Innovation and the grain problem. Behav Brain Sci 30:423–433.
52. Neumann C, Kulik L (2014) EloRating: Animal Dominance Hierarchies by Elo-Rating. R Package, Version 0.43. Available at https://cran.r-project.org/package=EloRating. Accessed January 5, 2017.
53. McElreath R (2016) Statistical Rethinking: A Bayesian Course with Examples in R and Stan (Chapman and Hall/CRC, New York).
54. Stan Development Team (2016) RStan: The R Interface to Stan. R package version 2.14.1. (R Foundation for Statistical Computing, Vienna).
55. R Development Core Team (2013) R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna).
Perry et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7813
Gene–culture coevolution in whales and dolphins
Hal Whitehead^{a,1}
^{a}Department of Biology, Dalhousie University, Halifax, NS, Canada B3H 4R2
Edited by Marcus W. Feldman, Stanford University, Stanford, CA, and approved May 1, 2017 (received for review January 14, 2017)
Whales and dolphins (Cetacea) have excellent social learning skills as well as a long and strong mother–calf bond. These features produce stable cultures, and, in some species, sympatric groups with different cultures. There is evidence and speculation that this cultural transmission of behavior has affected gene distributions. Culture seems to have driven killer whales into distinct ecotypes, which may be incipient species or subspecies. There are ecotype-specific signals of selection in functional genes that correspond to cultural foraging behavior and habitat use by the different ecotypes. The five species of whale with matrilineal social systems have remarkably low diversity of mtDNA. Cultural hitchhiking, the transmission of functionally neutral genes in parallel with selective cultural traits, is a plausible hypothesis for this low diversity, especially in sperm whales. In killer whales the ecotype divisions, together with founding bottlenecks, selection, and cultural hitchhiking, likely explain the low mtDNA diversity. Several cetacean species show habitat-specific distributions of mtDNA haplotypes, probably the result of mother–offspring cultural transmission of migration routes or destinations. In bottlenose dolphins, remarkable small-scale differences in haplotype distribution result from maternal cultural transmission of foraging methods, and large-scale redistributions of sperm whale cultural clans in the Pacific have likely changed mitochondrial genetic geography. With the acceleration of genomics new results should come fast, but understanding gene–culture coevolution will be hampered by the measured pace of research on the socio-cultural side of cetacean biology.

gene–culture coevolution | Cetacea | cultural hitchhiking

pendently. Phenotypic assimilation by one mode of inheritance may influence the transmission of traits by another mode. Genes produce the basic organismal phenotype, and this genetic phenotype places major constraints on the other modes of inheritance. Circumstances in which other inheritance mechanisms control the inheritance of genes are much less obvious but can have great evolutionary significance (10). Thus, there has been particular interest in gene–culture coevolution (11–13).
Theoreticians started to model potential interactions between genes and culture in the 1970s (14), and there has been much empirical and theoretical work since then (11, 13, 15, 16). However, gene–culture coevolution has rarely been formally defined. For the perspective of this article, I consider cases in which culture (i.e., group-specific social learning) affects the distribution of DNA found in a population. That is, we can reasonably suppose that the distribution of genes in a population would be different if individuals were not transmitting cultural information through social learning. Gene–culture coevolution includes very specific selective processes in which a particular cultural practice affects the evolution of a particular gene or genes. The most famous example is the coevolution of dairy farming and the lactase gene, allowing adult humans in dairy-farming cultures to digest milk products (17). However, gene–culture coevolution can also include general processes. If culture is driving significant parts of a species’ behavior, then this influence may constrain genetic evolution. For instance, if cultural processes isolate groups and give them distinctive behavior, then this isolation may initiate or promote the course of speciation or reduce the diversity of genes that are transmitted in parallel with the cultural traits (18–20). Under this concept of gene–culture coevolution there will be circumstances in which the available information may not be definitive. Scenarios including or exclud-

The author declares no conflict of interest.
This article is a PNAS Direct Submission.
^{1}Email: hwhitehe@dal.ca.
EVOLUTION
is causal.
Even harder to substantiate are the proposals for more general effects of culture on patterns of human genetics. For instance, models have shown that the remarkably low levels of human genetic diversity could result from culturally mediated population structure (25) or from selectively important cultures being transmitted in parallel with neutral genes (26). However, there is no strong empirical evidence for either of these processes.
Although culture is clearly present in other species, it is dwarfed in almost all respects by its impact on modern humans (27–29). Thus, given the difficulties of demonstrating gene–culture coevolution in by far the most cultural species, there has been little effort to look elsewhere. However, gene–culture coevolution can theoretically operate in simple cultural systems (14), and there may be attributes of nonhuman species that could accentuate the effects of culture on genes or make them more discernible.
Birds offer one candidate taxon for nonhuman gene–culture coevolution. In oscine species, birdsong is largely a socially learned population phenomenon and thus represents culture (30). For instance, populations of the sharp-beaked Galápagos ground finch (Geospiza difficilis) have socially learned songs with an important role in mating but respond fully only to songs from their own island population (31). Thus, Grant and Grant suggest that cultural transmission is driving speciation (31). However, some other studies have found little evidence of links between cultural and genetic evolution in birds (e.g., ref. 32).
Of all nonhuman species, culture seems particularly prevalent and significant among Cetacea (3). The cetaceans include about 89 species of whales and dolphins, ranging in size from 1 to 30 m. They use a range of habitat from large rivers to the deep ocean, in polar, temperate, and tropical waters. They eat a wide range of marine animals, from copepods to large whales. Cetaceans are divided into the mysticetes or baleen whales that filter feed on schools of prey using baleen, and the odontocetes that use echolocation to locate and track single prey.
Although cetaceans are not easy to study, there is good evidence for cultural transmission in song, migrations, foraging behavior, social conventions, cooperative associations with humans, and play (3). Almost all cetacean species whose behavior has been much studied show possibly, or likely, culturally acquired behavior (3). For gene–culture coevolution, culture needs to be quite stable (fads need not apply) and to affect fitness directly or indirectly. When social groups with distinct cultures

graphical patterns of mtDNA result from cultural behavior. As with gene–culture coevolution in humans, evidence is not always conclusive, and I will consider alternative mechanisms for the observed patterns.

Ecotype Radiations in Killer Whales?
Although killer whales as a species have an extremely diverse diet, ranging from herring to the largest whales, each killer whale is typically a member of an ecotype, and most ecotypes are extremely specialized in what they eat and often in how they catch it. Ecotypes are distinctive in a range of other behaviors, as well as morphology (20). In each of the North Pacific, North Atlantic, and Antarctic two or more ecotypes, each with a few hundred to a few thousand animals, are sympatric but socially isolated (20). Best known are the three North Pacific ecotypes: a salmon-eating specialist ecotype known as “residents,” a mammal-eating specialist or “transient” ecotype, and an “offshore” ecotype, which feeds upon sharks (20). All three ecotypes cluster in distinct mitochondrial and nuclear genome clades (33, 34) that diverged a few hundred thousand years ago (35). We have less complete information for the Antarctic, but there appear to be five ecotypes respectively specializing on minke whales (Balaenoptera acutorostrata) (ecotype A), seals (ecotype B1), penguins (ecotype B2), and fish (ecotype C and perhaps ecotype D) (20). These Antarctic ecotypes seem to have diverged more recently than those in the North Pacific, and ecotypes B1, B2, and C, at least, are part of a distinct and relatively shallow Antarctic clade (34, 35). The distinctions between the ecotypes have led Morin et al. to suggest that they should be considered separate species or subspecies (36), but this proposal has not been formally implemented.
Since the initial discovery of killer whale ecotypes, scientists have speculated about how they may have arisen. Although some issues are disputed, a consensus hypothesis about the fundamentals of ecotype formation has grown. This culturally driven ecological speciation scenario has been articulated most clearly by Riesch et al. (20), with support from more recent genomic studies (35). Ecological speciation needs (i) an ecological source of divergent selection; (ii) a form of reproductive isolation; and (iii) a mechanism linking divergent selection to reproductive isolation (37). The killer whale scenario is especially interesting because culture seems to play a key role in all three components. Within the best-studied killer whale ecotypes there is evidence that different matrilineal social units can have distinctive ways of
comparable population sizes or latitudinal ranges (19). In the nearly two decades since then, many additional estimates of cetacean control region diversity have been published, for other species (including another presumed matrilineal species, the false killer whale, Pseudorca crassidens), with larger and more geographically dispersed samples, and greater coverage of the genome (47). The pattern still holds (Fig. 1). The range-wide mtDNA diversities in the matrilineal species are ∼29.8% of that of nonmatrilineal species with similar latitudinal ranges (47). For regional estimates the ratio is 16.6%.
Low genetic diversity usually invokes discussion of bottlenecks and selection. Both these default mechanisms have been used to explain the low mtDNA diversity of the matrilineal species of Cetacea (48–53). However, neither bottlenecks nor selection link the remarkably low diversity of the species to their matrilineal social systems. Furthermore, contrary to expectations from bottlenecks, the diversity of nuclear microsatellites is not obviously reduced in the matrilineal whales (although this result may be partially explained by the greater effective population size and higher mutation rates of microsatellites compared with mtDNA) (47). In sperm whales, Alexander et al. (48) found no evidence for selection in the control region relative to other regions of the

Fig. 1. Mean mtDNA nucleotide diversity (across estimates for a species with n >100 covering >25% of the species’ range or at least one ocean basin) against the latitudinal range of cetacean species. Nonmatrilineal species are indicated by a plus sign, and matrilineal species are indicated by circles. Gma, short-finned pilot whale; Gme, long-finned pilot whale; Oo, killer whale; Pc, false killer whale; Pm, sperm whale. Adapted from ref. 47.

transmitted in parallel (19). Thus cultural hitchhiking is a form of gene–culture coevolution. Agent-based models have shown that cultural hitchhiking can work in circumstances that seem realistic for the matrilineal whales (57) but do not necessarily imply that cultural hitchhiking is behind their low mtDNA diversity.
However, it is most parsimonious to assume that a factor common to the matrilineal species has led to reduced diversity. Because that factor does not seem to be the direct influence of matrilineality itself (57), I suspect that it is culture. Matrilineal social systems are particularly good substrates for the evolution of stable, group-specific cultures (3), and stable, group-specific cultures are the prerequisite for cultural hitchhiking (57). However, culture may affect genetic diversity through different paths in different species (18), as illustrated by the two best-known matrilineal species, sperm and killer whales.
Female and immature sperm whales use tropical and subtropical waters, where they live in matrilineally based social units (58). These units, in turn, are members of coda clans (59). The coda clans have distinctive dialects, movements, microhabitat use, and social behavior, as well as differential reproductive success (59–62). The clans are sympatric but do not associate with one another; they show no differences in nuclear microsatellites but do have distinct mtDNA haplotype distributions (63, 64). Thus, sperm whales fit the classic cultural hitchhiking scenario well, with clans being the cultural groups under selection, and cultural hitchhiking is the most parsimonious explanation for their low mitochondrial diversity. A selective sweep in the mitochondrial genome is also consistent with available results on sperm whales (48) but does not obviously provide a link with matrilineality.
Although killer whales are also matrilineal, their subdivision into ecotypes adds a layer of additional possible drivers of low genetic diversity. Population subdivision itself tends to reduce genetic diversity (65). The highly specialized ways of life of many of the ecotypes may make them particularly vulnerable to extirpation, removing characteristic mtDNA haplotypes in the process, and thus reducing diversity (65). The ecotypes have very different culturally transmitted behavior (one of the conditions for cultural hitchhiking), but their lifestyles are so different that it is hard to see how they would often be in competition (however, see ref. 66 for a scenario of indirect competition between ecotypes), so cultural hitchhiking at the ecotype level seems an incomplete explanation. Processes that reduce diversity within ecotypes would also affect the overall diversity of killer whales.
Speciation
Ecotype radiations | Killer whale | A deep division of killer whales into ecotypes driven by foraging cultures | Perhaps other cultural behavior initiated ecotype formation | Strong (20)

Gene-culture coevolution of functional genes
Genes related to digestive tract | Killer whale | Differ between mammal-eating and fish-eating ecotypes | | Possible (35)
Genes related to methionine cycle | Killer whale | Differ between mammal-eating and fish-eating ecotypes | Independent contrasts in North Pacific and Antarctic | Strong (35)
Genes related to adipose tissue development | Killer whale | Differ between Antarctic and temperate ecotypes | Cultural role not direct | Possible (35)
Genes related to skin regeneration | Killer whale | Differ between Antarctic and temperate ecotypes | Cultural role not direct | Possible (35)
Size | Killer whale | Differences between mammal-eating, fish-eating and bird-eating ecotypes | No genes identified; could be environmental | Weak (20)
Robustness of mouth parts | Killer whale | Differ between mammal-eating and fish-eating ecotypes | No genes identified; preliminary study | Weak (20)
Coloration | Killer whale | Most enlarged white eye-patch in ecotype that uses most coordinated behavior | No genes identified | Weak (20)

Cultural hitchhiking in matrilineal whales
Low mtDNA diversity | Sperm whale | mtDNA has hitchhiked on selective cultural traits transmitted in parallel | Clans are strong candidate for cultural groups; could be caused by selective sweep in mtDNA or bottleneck (less supported) | Possible (57)
Low mtDNA diversity | Killer whale | Ecotype population structure, plus additional cultural effects, bottlenecks and/or selection have reduced diversity | Culture very likely to have role(s); several possible routes; not necessarily classic cultural hitchhiking | Strong (57)
Low mtDNA diversity | Pilot, false killer whales | mtDNA has hitchhiked on selective cultural traits transmitted in parallel | Little evidence other than correlation between matrilineal social system and low mtDNA diversity | Weak (57)

Culture and gene geography
Distinctive mtDNA distributions of summering areas | Beluga whale | Belugas follow their mothers on first migrations | Born in summer, could be purely environmental sensing, but unlikely | Strong (72)
Distinctive mtDNA distributions of summering areas | Humpback whale | Humpbacks follow their mothers on first migrations | Born in winter, so, because first visit to summering grounds is with mother, this is social learning | Strong (74)
Distinctive mtDNA distributions of wintering areas | Southern right whale | Right whales follow their mothers on first migrations | As born in winter, could be purely environmental sensing, but unlikely | Strong (75)
Distinctive mtDNA distributions at small scales | Bottlenose dolphin | Dolphins learn specialized feeding techniques from their mothers, leading to habitat selection | Social learning of some foraging techniques well established | Strong (77)
Change in mtDNA distribution caused by cultural clan redistribution | Sperm whale | Clans redistributed themselves over 30 y, changing mtDNA distribution | Genetic change inferred; redistribution could have been independent of clan membership, but unlikely | Possible (80)
1. Laland KN, et al. (2015) The extended evolutionary synthesis: Its structure, assumptions and predictions. Proc Roy Soc B 282:20151019.
2. Maynard Smith J (1989) Evolutionary Genetics (Oxford Univ Press, Oxford, UK).
3. Whitehead H, Rendell L (2015) The Cultural Lives of Whales and Dolphins (Univ of Chicago Press, Chicago).
4. Heyes CM (1994) Social learning in animals: Categories and mechanisms. Biol Rev Camb Philos Soc 69:207–231.
5. Hoppitt W, Laland KN (2013) Social Learning: An Introduction to Mechanisms, Methods, and Models (Princeton Univ Press, Princeton, NJ).
6. Whiten A, et al. (1999) Cultures in chimpanzees. Nature 399:682–685.
7. Barrett-Lennard L (2011) Killer whale evolution: Populations, ecotypes, species, Oh my! J Am Cet Soc 40:48–53.
8. Sapolsky RM, Share LJ (2004) A pacific culture among wild baboons: Its emergence and transmission. PLoS Biol 2:E106.
9. Cavalli-Sforza LL, Feldman MW, Chen KH, Dornbusch SM (1982) Theory and observation in cultural transmission. Science 218:19–27.
10. Laland KN, Odling-Smee J, Myles S (2010) How culture shaped the human genome: Bringing genetics and the human sciences together. Nat Rev Genet 11:137–148.
Pleistocene hominins. Proc Natl Acad Sci USA 106:33–37.
26. Whitehead H, Richerson PJ, Boyd R (2002) Cultural selection and genetic diversity in humans. Selection 3:115–125.
27. Richerson PJ, Boyd R (2005) Not by Genes Alone: How Culture Transformed Human Evolution (Univ of Chicago Press, Chicago).
28. Whiten A (2011) The scope of culture in chimpanzees, humans and ancestral apes. Philos Trans R Soc Lond B Biol Sci 366:997–1007.
29. Whiten A, Ayala FJ, Feldman MW, Laland KN (2017) The extension of biology through culture. Proc Natl Acad Sci USA 114:7775–7781.
30. Catchpole CK, Slater PJB (2008) Bird Song: Biological Themes and Variations (Cambridge Univ Press, Cambridge, UK).
31. Grant BR, Grant PR (2002) Simulating secondary contact in allopatric speciation: An empirical test of premating isolation. Biol J Linn Soc Lond 76:545–556.
32. Wright TF, Wilkinson GS (2001) Population genetic structure and vocal dialects in an amazon parrot. Proc Biol Sci 268:609–616.
33. Moura AE, et al. (2015) Phylogenomics of the killer whale indicates ecotype divergence in sympatry. Heredity (Edinb) 114:48–55.
34. Morin PA, et al. (2015) Geographic and temporal dynamics of a global radiation and diversification in the killer whale. Mol Ecol 24:3964–3979.
35. Foote AD, et al. (2016) Genome-culture coevolution promotes rapid divergence of killer whale ecotypes. Nat Commun 7:11693.
36. Morin PA, et al. (2010) Complete mitochondrial genome phylogeographic analysis of killer whales (Orcinus orca) indicates multiple species. Genome Res 20:908–916.
37. Rundle HD, Nosil P (2005) Ecological speciation. Ecol Lett 8:336–352.
38. Matkin C, Durban J (2011) Killer whales in Alaskan waters. J Am Cet Soc 40:24–29.
39. Laland KN, Odling-Smee J, Feldman MW (2000) Niche construction, biological evolution, and cultural change. Behav Brain Sci 23:131–146, discussion 146–175.
40. Moura AE, et al. (2014) Population genomics of the killer whale indicates ecotype evolution in sympatry involving both selection and drift. Mol Ecol 23:5179–5192.
41. Finkelstein JD (1990) Methionine metabolism in mammals. J Nutr Biochem 1:228–237.
42. Liu S, et al. (2014) Population genomics reveal recent speciation and rapid evolutionary adaptation in polar bears. Cell 157:785–794.
43. Forman OP, et al. (2012) Parallel mapping and simultaneous sequencing reveals deletions in BCAN and FAM83H associated with discrete inherited disorders in a domestic dog breed. PLoS Genet 8:e1002462.
44. Durban JW, Pitman RL (2012) Antarctic killer whales make rapid, round-trip movements to subtropical waters: Evidence for physiological maintenance migrations? Biol Lett 8:274–277.
45. Pitman RL, Ensor P (2003) Three forms of killer whales (Orcinus orca) in Antarctic waters. J Cetacean Res Manag 5:131–139.
46. Pitman RL, Durban JW (2012) Cooperative hunting behavior, prey selectivity and prey handling by pack ice killer whales (Orcinus orca), type B, in Antarctic Peninsula waters. Mar Mamm Sci 28:16–36.
47. Whitehead H, Vachon F, Frasier TR (2017) Cultural hitchhiking in the matrilineal whales. Behav Genet 47:324–334.
48. Alexander A, et al. (2013) Low diversity in the mitogenome of sperm whales revealed by next-generation sequencing. Genome Biol Evol 5:113–129.
49. Lyrholm T, Leimar O, Gyllensten U (1996) Low diversity and biased substitution patterns in the mitochondrial DNA control region of sperm whales: Implications for estimates of time since common ancestry. Mol Biol Evol 13:1318–1326.
Behav Genet 42:332–343.
65. Whitlock MC, Barton NH (1997) The effective size of a subdivided population. Genetics 146:427–441.
66. Baird RW, Abrams PA, Dill LM (1992) Possible indirect interactions between transient and resident killer whales: Implications for the evolution of foraging specializations in the genus Orcinus. Oecologia 89:125–132.
67. Foote AD, et al. (2011) Positive selection on the killer whale mitogenome. Biol Lett 7:116–118.
68. Ford JKB, Ellis GM, Balcomb KC (2000) Killer Whales (Univ of British Columbia, Vancouver).
69. Amos B, Schlötterer C, Tautz D (1993) Social structure of pilot whales revealed by analytical DNA profiling. Science 260:670–672.
70. Heimlich-Boran JR (1993) Social Organization of the Short-Finned Pilot Whale Globicephala macrorhynchus, with Special Reference to the Comparative Social Ecology of Delphinids. PhD thesis (Cambridge University, Cambridge, UK).
71. Baird RW, et al. (2008) False killer whales (Pseudorca crassidens) around the main Hawaiian Islands: Long-term site fidelity, inter-island movements, and association patterns. Mar Mamm Sci 24:591–612.
72. Brown Gladden JG, Ferguson MM, Clayton JW (1997) Matriarchal genetic population structure of North American beluga whales Delphinapterus leucas (Cetacea: Monodontidae). Mol Ecol 6:1033–1046.
73. Turgeon J, Duchesne P, Colbeck GJ, Postma LD, Hammill MO (2012) Spatiotemporal segregation among summer stocks of beluga (Delphinapterus leucas) despite nuclear gene flow: Implication for the endangered belugas in eastern Hudson Bay (Canada). Conserv Genet 13:419–433.
74. Baker CS, et al. (1990) Influence of seasonal migration on geographic distribution of mitochondrial DNA haplotypes in humpback whales. Nature 344:238–240.
75. Carroll EL, et al. (2015) Cultural traditions across a migratory network shape the genetic structure of southern right whales around Australia and New Zealand. Sci Rep 5:16182.
76. Harrison XA, et al. (2010) Cultural inheritance drives site fidelity and migratory connectivity in a long-distance migrant. Mol Ecol 19:5484–5496.
77. Kopps AM, et al. (2014) Cultural transmission of tool use combined with habitat specializations leads to fine-scale genetic structure in bottlenose dolphins. Proc Roy Soc B 281:20133245.
78. Mann J, Sargeant B (2003) The Biology of Traditions; Models and Evidence, eds Fragaszy DM, Perry S (Cambridge Univ Press, Cambridge, UK), pp 236–266.
79. Krützen M, et al. (2005) Cultural transmission of tool use in bottlenose dolphins. Proc Natl Acad Sci USA 102:8939–8943.
80. Cantor M, Whitehead H, Gero S, Rendell L (2016) Cultural turnover among Galápagos sperm whales. R Soc Open Sci 3:160615.
81. Mercader J, et al. (2007) 4,300-year-old chimpanzee sites and the origins of percussive stone technology. Proc Natl Acad Sci USA 104:3043–3048.
82. Cantor M, et al. (2015) Multilevel animal societies can emerge from cultural transmission. Nat Commun 6:8091.
83. van Schaik CP, Isler K, Burkart JM (2012) Explaining brain size variation: From social to cultural brain. Trends Cogn Sci 16:277–284.
84. Cammen KM, et al. (2016) Genomic methods take the plunge: Recent advances in high-throughput sequencing of marine mammals. J Hered 107:481–495.
Edited by Kevin N. Laland, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark June 16, 2017
(received for review January 14, 2017)
Cultural processes occur in a wide variety of animal taxa, from insects to cetaceans. The songs of humpback whales are one of the most striking examples of the transmission of a cultural trait and social learning in any nonhuman animal. To understand how songs are learned, we investigate rare cases of song hybridization, where parts of an existing song are spliced with a new one, likely before an individual totally adopts the new song. Song unit sequences were extracted from over 9,300 phrases recorded during two song revolutions across the South Pacific Ocean, allowing fine-scale analysis of composition and sequencing. In hybrid songs the current and new songs were spliced together in two specific ways: (i) singers placed a single hybrid phrase, in which content from both songs was combined, between the two song types when transitioning from one to the other, and/or (ii) singers spliced complete themes from the revolutionary song into the current song. Sequence analysis indicated that both processes were governed by structural similarity rules. Hybrid phrases or theme substitutions occurred at points in the songs where both songs contained “similar sounds arranged in a similar pattern.” Songs appear to be learned as segments (themes/phrase types), akin to birdsong and human language acquisition, and these can be combined in predictable ways if the underlying structural pattern is similar. These snapshots of song change provide insights into the mechanisms underlying song learning in humpback whales, and comparative perspectives on the evolution of human language and culture.
vocal learning | cultural transmission | song | cetacean | humpback whale
Cultural transmission has been shown in a wide variety of taxa, spanning birds, fish, insects, cetaceans, and nonhuman primates (1, 2). We define culture in the broad sense as shared information or behavior acquired through some form of social learning from conspecifics (3–5). Each of these studies has provided examples demonstrating a behavioral trait being passed from one individual to another, and on occasion entire populations, through some form of social learning. Cetaceans show some of the most sophisticated and complex vocal and cultural behavior outside of humans (6, 7), including vocal learning, shared traditions, and gene–culture coevolution. For example, southern right whales (Eubalaena australis) demonstrate strong migratory culture (8), whereas bottlenose dolphins (Tursiops truncatus and Tursiops aduncus) demonstrate the cultural transmission of tool use (9, 10). Both sperm whales (Physeter macrocephalus) and killer whales (Orcinus orca) have culturally transmitted group vocalizations that are maintained over decades (11, 12), and also appear to undergo gene–culture coevolution (13–15).
Humpback whales (Megaptera novaeangliae) possess multiple, independently evolving cultural traditions, including maternally directed site fidelity to breeding and feeding grounds (16), socially learned feeding tactics (17), and song displays that are subject to cultural evolution and revolution (18–20). Humpback whale song is one of the most elaborate acoustic displays in the animal kingdom (21). The song is produced solely by adult males (22) and is therefore considered a product of sexual selection, even though the details of how it functions as a signal are still debated (23).
Song is organized in a nested hierarchy: single sounds are termed “units,” a sequence of units is grouped into a “phrase,” phrases are repeated to form a “theme,” and a number of different themes are usually sung in a set order to form the “song” (24). To move from one theme into another, a single “transitional phrase” is sometimes sung that contains content from the preceding and following themes (20). Different versions of the display (containing different themes) are termed “song types” (18). Within each population, there is usually strong conformity to a single song type at any point in time (25). However, the song is constantly changing (20), and all males must continuously incorporate these alterations to maintain the observed conformity. This slow and gradual change is a process of cultural evolution in which subtle changes occur over time at a population scale (20, 26).
Populations within an ocean basin sing similar songs, but the similarity depends on both geographic (27, 28) and temporal distances, as transmission of song changes across a region may take several years (18, 29, 30). In the western and central South Pacific region, song also undergoes dramatic cultural “revolutions,” where the song type from a neighboring population is rapidly adopted by all of the males in an adjacent population (18, 19). We have previously described the rapid, repeated, and regular horizontal cultural transmission of multiple song types, creating multiple song revolutions across the western and central South Pacific region (18, 29, 30). Among populations in any nonhuman animal, this is a very rare, possibly unique, example of population-wide horizontal cultural transmission where behavioral variants are transmitted rapidly and repeatedly (18). However, we know little regarding the underlying vocal and sequence learning mechanisms governing this extraordinary cultural phenomenon.
Mechanisms of vocal learning are far better understood for human language acquisition and birdsong than for cetacean
This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: E.C.G., L.R., and M.J.N. designed research; E.C.G., M.M.P., and M.J.N. performed research; E.C.G. contributed new reagents/analytic tools; E.C.G., L.L., and M.J.N. analyzed data; and E.C.G., L.R., L.L., M.M.P., and M.J.N. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. K.N.L. is a guest editor invited by the Editorial Board.
(1) To whom correspondence should be addressed. Email: ecg5@st-andrews.ac.uk.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1621072114/-/DCSupplemental.
ECOLOGY
Here, we present evidence that humpback whales use segmentation in song learning by examining recordings made during the process of learning a new song in the context of a song revolution event. Recording a whale in the act of changing his song is challenging; they are highly mobile and one cannot simply record all of an individual’s song during a 2- to 3-mo breeding season and >6,000-km migration. We therefore investigate some rare cases of song hybridization recorded during song revolution events to understand how individual whales transition between two different songs. These hybrid songs, which contain themes and elements from both the previous song and the new, revolutionary song, presumably represent a transition phase in the process by which singers change their song display to a new, completely different arrangement. We aim to identify if there are any underlying structural rules governing song change (e.g., segmentation, transition probabilities) that can provide insight into how new songs can be learned so rapidly. We hypothesize that new songs will be learned as segments if segmentation is a taxon-general mechanism (hypothesis 1). Identifying the level in the song hierarchy (phrase, theme, or song) that comprises a segment will provide important
249 singers presented in ref. 19, 2 hybrids; and 2002–2003: 26 singers, 1 hybrid).
To identify if new songs were learned as segments (hypothesis 1), we first needed to classify each potential segment. Because there are multiple levels in the humpback song hierarchy, each being a potential basis for segmentation, we analyzed each level. First, individual sounds were classified into categories (i.e., unit types) (SI Methods and Tables S1 and S2). Then the stereotyped sequences of units that made phrases were established and further grouped into themes (SI Methods and Table S1). Themes from each song type were labeled 1 through 37 (Table 1; also see SI Methods and Table S1) following previous classification of these song types (18, 19, 29, 43, 44). The song type of origin (pink or black) for theme 11 was uncertain and thus remained unresolved, as it was not heard in any nonhybrid songs (Fig. 1, Table 1, and Table S1). The sequence of themes for each hybrid singer was established (Table 1). It is immediately obvious that the hybrid songs examined here comprised complete themes from the two different song types combined into a single song; segmentation occurred at the theme level.
Garland et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7823
Table 1. Theme sequence of hybrid songs from French Polynesia (2005) and eastern Australia (1997 and 2003)
/ Represents a transitional phrase between the two labeled themes. // Represents a break in recording for <1 min. SLA, surface level attenuation, where the
whale is breathing at the surface and the song content is difficult to hear and therefore uncertain. Themes are color-coded by song type. Note the song type
of origin for theme 11 was uncertain (colored gray). See Table S1 for description of theme content.
*Themes 31a and 31b repeated multiple times. No break in recording.
†Themes 8a and 9b repeated multiple times. No break in recording.
‡Themes 8b and 9b repeated multiple times. No break in recording.
Given that hybrid songs contained theme segments from each song type, we investigated if there were any patterns to the arrangement of themes (hypothesis 2). To do this, we: (i) established the location of hybrid transitions in the song, (ii) investigated how each singer transitioned between the song types, and (iii) quantified the similarity of theme content using sequence analysis metrics to understand why a singer might switch at that particular location in the song.
To understand the location of theme transitions, the full sequence of themes from all singers was used to construct a first-order Markov model based on the frequencies of transition between phrases (Fig. 2). Transitions occurred between the pink and black song types at multiple locations in the song (Fig. 2A and Table 1) but, in contrast, transitions between the blue and dark red song types occurred only at two locations in the song (Fig. 1 and Fig. 2B). At these transition locations, singers often placed a transitional phrase between the two song types to mediate the transition (Tables 1 and 2). This single phrase combined the starting units from the preceding phrase with units from the following phrase (typically the ending units) (Fig. 1, Table 2, and Table S1).
We characterized the structural similarity, that is the similarity in the sequence of units that comprised each theme/phrase type (laid out in Table S1), between each pair of songs (e.g., blue vs. dark red) using the Levenshtein distance (LD), a common similarity metric in linguistic and humpback song comparisons (29, 43, 45, 46). In songs from the 2005 French Polynesia blue/dark red revolution, hierarchical clustering of themes showed a single location on the dendrogram where themes from both song types grouped together on a branch (Fig. 3A). This was where the singer of the hybrid song in the French Polynesian dataset switched between song types (Fig. 2 and Tables 1 and 2). In songs from the eastern Australia 2002 revolution involving the same song types, this pattern was not as clear because theme transitions did not occur at the most similar themes (Fig. 3B). Instead, theme transitions were mediated by a transitional phrase (Tables 1 and 2). Finally, in songs from the 1996 eastern Australia pink/black revolution, the dendrogram showed a single location where themes from both song types grouped together on a branch (Fig. 3C and SI Results). This was where the majority of transitions in hybrid songs occurred between the song types (Table 2). The hybrid singers replaced the next theme in the song sequence with a similarly arranged theme from the other song type (Fig. 1 and Table 2). The remaining theme transitions were either mediated by a transitional phrase or the mechanism of transition between the song types was unclear (Tables 1 and 2). Regardless, in addition to transitional phrases this final analysis strongly indicates that transitions between song types are not random and occur more often at locations where theme content is most similar.
Discussion
Hybrid songs are recorded extremely rarely but are of interest because they capture some part of the process by which singers change their song display from an older version (type) to a new, completely different arrangement. The hybrid songs presented here were all captured during song revolution events, when singers using both the old and new song types were in the same population. It is clear that new songs are learned as segments, confirming hypothesis 1 (see also ref. 33), indicating that segmentation is a learning mechanism found in the cetacean lineage. The way singers move between song types during singing
(Table 1), and the substituted pink theme 2. D shows the theme progression (from left to right) from black theme 9b, through hybrid phrase 9b/4 into pink
theme 4 (singers HYB3 and HYB4) (Table 1). It also shows pink theme 3 and the unresolved theme 11. Spectrograms were 2,048-point fast Fourier transform
(FFT), Hanning window and 75% overlap, generated in RAVEN PRO 1.4 (see also Audios S1–S4 for corresponding audio files).
bouts suggests that these displays are unlikely to be learned as a whole. Instead, songs are split into theme segments, and the fact that transitions between song types occur at specific points in the theme sequence suggests that each theme is learned as a separate entity. Segmentation or chunking of sequences is an important mechanism in human language acquisition (35), where a stream of utterances is segmented into smaller components (phrases or words) and later recombined (36). Songbirds have also been shown to segment their song displays (37–40) and statistically learn sound categories (34). Juvenile male songbirds may learn their song from one or more tutors as a sequence of syllable segments, which they recombine to form their own song (37–40). In humpback whales, our results suggest that a male learns the new song as theme segments, which he combines with older themes as he progressively learns the new song. The novelty-threshold hypothesis suggests that novelties in the song are adopted by singers once reaching a threshold prevalence (47), and therefore an individual male would need to hear a new song from multiple individuals before adopting the change. The male therefore has multiple potential models for each theme and a general overview of the “correct” sequence of the themes. The highly stereotyped nature of theme and phrase sequences, both of which we quantified as transition probabilities (e.g., Fig. 2 and ref. 48), strongly suggests humpback whales, like songbirds, use statistical learning in learning their song display (34).
In songbirds, segments are typically separated by longer pauses (silence), and these pauses may provide an emphasis that aids in memorization of segment chunks (39). This feature of pauses between segments of zebra finch song is also a feature of humpback whale song, as a phrase is delineated from the start of another phrase by a longer pause (24, 49). Given that a single
Fig. 2. First-order Markov model of theme transitions to understand hybridization between (A) pink/black song types (n = 2,222 phrase transitions, n =
4 individuals), and (B) blue/dark red song types (n = 8,852 phrase transitions, n = 46 individuals). Each node represents a theme or phrase type, color-coded by
song type. White nodes represent transitional phrases and dashed lines indicate transitions between song types. Arrows represent the direction of movement
and thicker lines indicate higher transition probabilities. Transitions between the pink and black song types (A) occurred at multiple locations (themes 1 to 7b,
9b to 4, 10a to 1, 4 to 10a, 10a to 5b, 8b to 4, and 8a to 5b). In contrast, transitions between the blue and dark red song types (B) occurred only at two specific
locations in the song: blue theme 27–dark red theme 31a (both directions), and blue theme 24–dark red theme 37a (one-way). Phrase repetitions are removed
from the figure for ease of display.
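As a rough illustration of how a first-order Markov model like the one in Fig. 2 can be estimated, the following minimal Python sketch counts transitions between consecutive labels and normalizes them to probabilities. The function name `transition_probabilities` and the theme sequences are hypothetical and chosen only for illustration; they are not the authors' code or data.

```python
from collections import Counter, defaultdict

def transition_probabilities(sequences):
    """Estimate first-order transition probabilities from label sequences.

    Each sequence is a list of theme (or phrase) labels in sung order.
    Returns {current_label: {next_label: probability}}.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair every label with the label that immediately follows it.
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    probs = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        probs[current] = {nxt: n / total for nxt, n in followers.items()}
    return probs

# Hypothetical theme sequences (illustrative labels, not data from the study).
songs = [
    ["24", "25", "27", "31a", "31b"],
    ["24", "25", "27", "24"],
    ["27", "31a", "31b", "27"],
]
probs = transition_probabilities(songs)
print(probs["27"])  # theme 27 is followed by 31a in 2 of 3 cases, by 24 in 1 of 3
```

In a diagram of this kind, each key of `probs` would correspond to a node and each inner probability to the weight of an arrow.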
humpback whale song can last anywhere from 5 to 30 min (24), any aid in memorization of such a long display would be under strong selection. The repetition of phrases within themes introduces redundancy in the song, and likely aids memorization through repetition and reduced content. Furthermore, rhyme-like patterns in humpback song (50) appear similar to rhyme patterns in human poems or prose, which also aid recall (51). The question of how humpback whales remember their song display (they rarely sing the wrong thing) is still open. From playback studies we know humpback whales react more strongly to novel songs than to the song of the current year (see ref. 52). The whales can identify “same” from “different.” It would be interesting to explore how long their song memory lasts, as bottlenose dolphins have been shown to remember vocalizations (signature whistles of conspecifics) for over 20 y (53). Such a song memory could drive the directional change in song revolutions (to stop whales reverting back to the previous song type), leading to the broad-scale cultural phenomenon we observe (18).
Hybrid songs from both song revolutions contained themes from one song type that were spliced into the middle of the other song type (Table 1). There are multiple examples of such hybrid song production in songbirds at the boundary of two song dialect areas or the boundary between two closely related species (41). For example, orange-tufted sunbirds (Nectarinia osea) have sharp dialect boundaries, but a small number of birds along these boundaries sing songs from both dialects (i.e., hybrids) (54). Similarly, in the village indigobird (Vidua chalybeata), a species that undergoes continuous population-wide song evolution in some ways similar to humpback whale songs, males along dialect boundaries have been recorded singing hybrid songs that combined songs from each
The direction of transition (i.e., old song to new song, or vice versa), and the number of times this transition occurred as a percentage of the total number
of hybrid transitions for each pair of song types (taken from Table 1) are noted. The similarity in sound units or their arrangement is described along with
whether this similarity was supported in the dendrograms (i.e., both themes present on a branch). The presence of a transitional phrase is noted, and a description
of the potential mechanism assisting the transition is suggested.
dialect (55). In yellow-rumped caciques (Cacicus cela vitellinus), another species with continuous population-wide song evolution, males in a colony may occasionally incorporate a foreign song type as part of their yearly population dialect if the two colonies are closely situated (56). In another example, at the range interface of black-capped chickadees (Poecile atricapillus) and Carolina chickadees (Poecile carolinensis), birds from both species displayed bilingual or atypical repertoires (57). Clearly, segmentation is an important general mechanism in vocal learning present in multiple independent lineages.
Transitions between humpback whale song types were often mediated by a transitional phrase containing individual sound units from the previous and following phrases that were common to both song types (Figs. 1 and 2 and Tables 1 and 2). Transitional phrases are a neglected component of the song in general, as they are often excluded from analyses focused on delineating song types (49). The variable structure of transitional phrases can make them difficult to categorize, particularly if they are not routinely used in all transitions between themes. Nevertheless, it is clear this normal component of song organization is important to allow an ordered progression from one theme into another, regardless of the song types.
Transitions between song types were partially governed by structural similarity, based on the Markov model and sequence analysis (Figs. 2 and 3), rejecting random combinations of segments (hypothesis 2). The sequence analysis indicated that transitions or theme substitutions occurred more often in locations that contained “similar sounds arranged in a similar pattern” in old and new songs (Fig. 3). Themes either progressed into a similarly sounding theme of the other song type or replaced that similarly sounding theme altogether (Table 2). In addition to segmenting, song learning and change are partially governed by structural similarity rules where transitions or theme substitutions occur in locations that contain similar sounds arranged in a similar pattern (i.e., a “switch-when-similar” rule). Word substitutions in humans, such as malapropisms (the use of an incorrect word in place of a word with a similar sound) (42), are highly suggestive of a general mechanism. These transition points based on similarity could act as a point of reference or cue, allowing the singer to switch from the old into the new song at this position in the song. Such anchors are present in human vocal performances [e.g., oral traditions (51)], and single sounds or words and similar note arrangements are used to transition among songs in human music performances. Finally, the ability to jump from one song into another is also a feature of birdsong; for example, counter-singing allows a male to select a matching song of a rival male and switch to singing that song in an aggressive context (41). This skill strongly suggests the presence of an underlying mechanism allowing plasticity in vocal output shared among vocal learning species.
We suggest the switch-when-similar rule may be stronger and thus more important in one direction (i.e., old-to-new themes) (Table 2), assisting singers in learning new themes sequentially and in the “correct” order. The whale is attempting to learn the new display; this is very directional. The location in the song where old themes encroach back into the song display may be less important and is unlikely to be governed by this similarity rule (explaining the majority of dissimilar transitions backward). These new-to-old song transitions appear to be mediated more often by transitional phrases (Table 2).
The process of vocal production learning (7) of a completely new song type could occur through a number of structural changes to the song, as new themes must be learned and old themes removed. Multiple studies indicate that male humpback whales adhere to the current arrangement of the song (e.g., refs. 20 and 25). Importantly, once a new song is recorded in a population, all males switch to this new song (18, 19). Clearly, the song is learned as theme segments to aid in the learning of this
particularly important to our understanding of structurally arranged
vocal communication and the potential origins of human language.
Here, by investigating rare cases of song hybridization, where parts
of an existing song are spliced with a novel, revolutionary song, we
have unearthed a number of underlying structural rules governing
song change, including segmentation and transition/substitution of
themes based on the similarity in sound sequences. These rules
likely assist humpback whales in rapidly learning their complex and
ever-changing songs, and provide insights into the evolution of
human language and culture.
Methods
Song Recordings. All recordings covered the frequency range of humpback
whale song (see SI Methods for detailed recording settings). The units in each
recording were transcribed by a human classifier (E.C.G. or L.L.), and a subset
of units was measured for a suite of acoustic parameters to ensure consistent
naming (45). As humpback whale song is highly stereotyped (24), units were
grouped into phrases, phrases into themes, and themes into song types. Pre-
vious studies have identified and quantified these four song types (pink, black,
blue, and dark red), the themes (labeled 1–37), and unit types within each, and
their cultural transmission across the western and central South Pacific (18, 19,
29, 43, 59).
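
The Levenshtein distance used in Results to compare the unit sequences of themes can be sketched as follows. This is a minimal Python illustration of the standard dynamic-programming algorithm, not the authors' implementation. The `similarity` normalization (one minus the distance divided by the length of the longer sequence) is an assumption made here for illustration, and the unit names are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def similarity(a, b):
    """Normalized similarity in [0, 1] (assumed normalization:
    1 - distance / length of the longer sequence)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest

# Hypothetical unit sequences for two themes (illustrative names only).
theme_old = ["moan", "moan", "whoop", "grunt"]
theme_new = ["moan", "whoop", "whoop", "grunt"]
print(levenshtein(theme_old, theme_new))  # 1 (one substitution)
print(similarity(theme_old, theme_new))   # 0.75
```

A matrix of such pairwise distances between themes is the kind of input that hierarchical clustering takes to produce a dendrogram.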
Selected Symposia Series, Boulder, CO), pp 333–358.
28. Darling JD, Acebes JMV, Yamaguchi M (2014) Similarity yet a range of differences between humpback whale songs recorded in the Philippines, Japan and Hawaii in 2006. Aquat Biol 21:93–107.
29. Garland EC, et al. (2013) Quantifying humpback whale song sequences to understand the dynamics of song exchange at the ocean basin scale. J Acoust Soc Am 133:560–569.
30. Garland EC, et al. (2013) Humpback whale song on the Southern Ocean feeding grounds: Implications for cultural transmission. PLoS One 8:e79422.
31. Arriaga G, Zhou EP, Jarvis ED (2012) Of mice, birds, and men: The mouse ultrasonic song system has some features similar to humans and song-learning birds. PLoS One 7:e46610.
32. Romberg AR, Saffran JR (2010) Statistical learning and language acquisition. Wiley Interdiscip Rev Cogn Sci 1:906–914.
33. Birchenall LB (2016) Animal communication and human language: An overview. Int J Comp Psychol 29:1–27.
of humpback whales, Megaptera novaeangliae: Synchronous change in Hawaiian and Mexican breeding assemblages. Anim Behav 62:313–329.
59. Garland EC, et al. (2015) Population structure of humpback whales in the western and central South Pacific Ocean as determined by vocal exchange among populations. Conserv Biol 29:1198–1207.
60. Eriksen N, Tougaard J (2006) Analysing differences among animal songs quantitatively by means of the Levenshtein distance measure. Behaviour 143:239–252.
61. Helweg DA, Cato DH, Jenkins PF, Garrigue C, McCauley RD (1998) Geographic variation in South Pacific humpback whale songs. Behaviour 135:1–27.
62. Suzuki R, Shimodaira H (2004) An application of multiscale bootstrap resampling to hierarchical clustering of microarray data: How accurate are these clusters. 15th Annual International Conference of Genome Informatics, Posters and Software Demonstrations. Available at http://stat.sys.i.kyoto-u.ac.jp/prog/pvclust/. Accessed October 29, 2015.
63. Sokal RR, Rohlf FJ (1962) The comparison of dendrograms by objective methods. Taxon 11:33–40.
Conformity does not perpetuate suboptimal traditions
in a wild population of songbirds
Lucy M. Aplin^a,1, Ben C. Sheldon^a, and Richard McElreath^b,c
^a Edward Grey Institute, Department of Zoology, University of Oxford, Oxford OX1 3PS, United Kingdom; ^b Department of Human Behavior, Ecology, and Culture, Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany; and ^c Department of Anthropology, University of California, Davis, CA 95616
Edited by Kevin N. Laland, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark May 30,
2017 (received for review January 17, 2017)
Social learning is important to the life history of many animals, helping individuals to acquire new adaptive behavior. However, despite long-running debate, it remains an open question whether a reliance on social learning can also lead to mismatched or maladaptive behavior. In a previous study, we experimentally induced traditions for opening a bidirectional door puzzle box in replicate subpopulations of the great tit Parus major. Individuals were conformist social learners, resulting in stable cultural behaviors. Here, we vary the rewards gained by these techniques to ask to what extent established behaviors are flexible to changing conditions. When subpopulations with established foraging traditions for one technique were subjected to a reduced foraging payoff, 49% of birds switched their behavior to a higher-payoff foraging technique after only 14 days, with younger individuals showing a faster rate of change. We elucidated the decision-making process for each individual, using a mechanistic learning model to demonstrate that, perhaps surprisingly, this population-level change was achieved without significant asocial exploration and without any evidence for payoff-biased copying. Rather, by combining conformist social learning with payoff-sensitive individual reinforcement (updating of experience), individuals and populations could both acquire adaptive behavior and track environmental change.

social learning | animal culture | conformity | Parus major

locally adaptive information. However, it may also have the outcome of maintaining group differences in behavior, with within-group traditions resilient to invasion by alternative variants.
Empirical evidence for conformity in nonhuman animals is currently limited, but hints at a wide taxonomic occurrence, with proposed cases in fish (17), birds (18), and primates (4, 19). Furthermore, theoretical modeling has suggested that conformist transmission should evolve under a wide range of conditions and be particularly favored when environments are spatially heterogeneous (15, 20). Yet if individuals are exclusively conformist, then any new environmental change may result in a mismatch with the majority behavior, leading to a perpetuation of suboptimal or maladaptive traditions over time (21) and exaggerating the disadvantages of social information use. Evolutionary modeling has gone as far as to suggest that in socially learning animals, coupling of conformist learning with environmental change could lead to population collapse (22).
This apparent paradox of nonadaptive culture has thus been the subject of much debate (23–28), with two individual-level strategies proposed as a potential means of evading this evolutionary trap. First, individuals could switch from socially learned behavior to engaging in asocial learning (individual innovation) when the rewards gained for performing the established tradition are smaller than previously (7, 12, 27, 29). Second, individuals could combine conformist tendencies with payoff-biased social learning, using a “behavioral toolbox” of social learning strategies when choosing what behavior to adopt, thus integrating infor-
decision-making process for each individual’s visit to the puzzle box we then examined (i) whether individuals switched from conformist to payoff-biased copying when observing others receiving variable rewards, (ii) whether individuals flexibly adjusted their behavior in response to learning about and gaining variable rewards, and (iii) whether this resulted in a population-level shift in behavior.

Results
Condition 1: Equal High Payoffs. A tradition for pushing a bidirectional door either to the right (variant A, T1–T2) or to the left (variant B, T3–T4) was experimentally induced in four subpop-

T2, t = −3.87, P < 0.001; T3, t = −3.75, P < 0.001; T4, t = −5.86, P < 0.001) (Fig. 1B). In the last part of condition 1, an average of 8(2–16)% of individuals either showed no preference or preferred the alternative variant. For these individuals, their preference did not change in condition 3. By contrast, 49(33–71)% of other individuals switched to change their variant preference by the end of the experimental period (proportion of all solves for each individual over last 2 days).
Similarly to Aplin et al. (18), we analyzed the change in individual and population preferences over time. First, as the data were clearly bimodal (Fig. S2), a longitudinal clustering
[Fig. 1, panels A and B. Panel B axis labels: Variant A (0 to 1); replicates T1–T4 under Equal High Payoffs, Equal Low Payoffs, and Unequal Payoffs.]
Fig. 1. (A) A puzzle box where visiting individuals can slide the door open from either the blue/left side (variant A) or the red/right side (variant B) to
access a reward in a concealed feeder behind the door. The individual pictured is solving using variant A. Solving using either option can give the same
(equal condition) or different (unequal condition) rewards. Puzzle boxes record identity, contact duration, and solution choice and reset after each visit.
(B) Proportion of variant A or B used in each replicate (T1–T4) in three sequential conditions after variant A (T1–T2) or variant B (T3–T4) was initially
introduced by a trained demonstrator: (i) equally high payoffs for each solving option, proportion for last 5 days shown; (ii) equally low payoffs for each
solving option (2 days); and (iii) unequal payoffs, with the established tradition leading to a lower reward (14 days). Solid circles and error bars show mean
and 95% CI of the probability of individuals’ first solve in each condition being the uncommon variant.
Aplin et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7831
vidual choosing either option was a combination of both social
cues and accumulated experience. Social cues included the fre-
quencies of each behavior and the relative value of demonstrated
rewards and were calculated from activity immediately before a
[Fig. 2 graphic. y axis: proportion of solves that are the seeded variant; x axis: time (days 1–14).]
Fig. 2. The proportion of solutions using the seeded technique decreased over time in each replicate, with individuals moving toward preferring the previously uncommon technique. Each replicate is shown in a different color/shape, and solid and open symbols represent the two distinct clusters of individuals identified in the longitudinal clustering algorithm (solid symbols, cluster 1; open symbols, cluster 2). Lines show the generalized estimating equation model fit for each cluster/replicate.
nearly zero to over 0.8. Notably, these two parameters were correlated across individuals, indicating that birds that tended to learn socially also updated their personal information more quickly.
In contrast to payoff social learning biases, the large majority of individuals had a conformity exponent (λ) above 1, indicating at least mild conformity in their use of social cues (above dotted line, Fig. 3A). However, there was a strong negative correlation with overall reliance on social learning (s), such that birds
(Fig. 2). There was strong evidence in cluster 1 that the preference for the established tradition decreased over time (pooled replicate data; coefficient ± SEM = −0.27 ± 0.02, P < 0.001). Cluster 2 showed a significant but much lower decreasing preference for the established tradition (pooled replicate data; coefficient ± SEM = −0.13 ± 0.02, P < 0.001) (Fig. 2). This bimodality in the rate of change over time was related to age, with younger
[Fig. 3, panels A–D. Recoverable axis labels: social learning weight (s), observed freq high option, updating rate.]
reinforcement (updating) in the population, with individual variation in all measures. Surprisingly, given our initial hypotheses, we found no evidence of individual exploration or payoff-biased social learning of sufficient strength to explain the patterns of behavior change. Therefore, does the conformity present in the population slow the rate of switching? Or does it instead help

is optimal in this setting. Nevertheless, the amount of conformity that is present, combined with payoff-sensitive updating, allows individuals to track changes in behavioral variants.
Finally, selection gradients for social learning weight (s) and conformity strength (λ) were used to determine whether selection favors larger or smaller values of each parameter, conditional on the value of the other. We calculated the selection
[Fig. 4, panels A–D. Axes: bird (rows) by turn (0–60).]
Fig. 4. Simulations of the population consequences of mixes of conformist social learning and individual reinforcement. In each plot, each row is an
individual agent and each column is a time period. Open and solid circles represent alternative behavior. Before the vertical dashed line at turn 30, solid
is adaptive. After turn 30, open is adaptive. All groups of learners initialized with nonadaptive attraction scores, s = 0.5, g = 0.6, and y = 0. (A) λ = 1, no
conformity. (B) λ = 10, high conformity. (C) λ = 5, intermediate conformity. (D) Ten birds sampled from posterior distribution of the fitted model.
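
The kind of simulation summarized in Fig. 4 can be sketched as follows. This is a minimal illustration, not the authors' simulation code: it assumes two options, payoffs of 2.0 (high) and 0.0 (low), social cues taken from every bird's choice on the previous turn, uniform social probabilities when no cues exist, and reinforcement of the chosen option only. The function names (`conformist_term`, `softmax`, `simulate`) and the payoff values are hypothetical; s = 0.5, g = 0.6, y = 0, the switch at turn 30, and the nonadaptive starting attractions follow the caption.

```python
import math
import random


def conformist_term(counts, lam):
    """Social term n_k^lam / sum_m n_m^lam; uniform if no cues were observed (assumption)."""
    weights = [float(c) ** lam for c in counts]
    total = sum(weights)
    if total == 0.0:
        return [1.0 / len(counts)] * len(counts)
    return [w / total for w in weights]


def softmax(attractions):
    """Softmax choice rule over attraction scores (shifted for numerical stability)."""
    top = max(attractions)
    exps = [math.exp(a - top) for a in attractions]
    norm = sum(exps)
    return [e / norm for e in exps]


def simulate(n_birds=10, n_turns=60, switch_turn=30, s=0.5, g=0.6, lam=5.0,
             payoff_high=2.0, payoff_low=0.0, seed=1):
    """Return history[turn][bird] with choices 0 ("solid") or 1 ("open").

    Option 0 pays more before switch_turn and option 1 pays more afterward.
    Payoff values are illustrative assumptions; y = 0, so there is no payoff bias.
    """
    rng = random.Random(seed)
    # Nonadaptive start: every bird initially prefers option 1.
    attractions = [[payoff_low, payoff_high] for _ in range(n_birds)]
    last_choices = []  # no social cues exist before the first turn
    history = []
    for turn in range(n_turns):
        if turn < switch_turn:
            payoffs = [payoff_high, payoff_low]
        else:
            payoffs = [payoff_low, payoff_high]
        counts = [last_choices.count(0), last_choices.count(1)]
        social = conformist_term(counts, lam)
        choices = []
        for bird in range(n_birds):
            individual = softmax(attractions[bird])
            # Mix individual attraction and social cues with weight s.
            p_option0 = (1.0 - s) * individual[0] + s * social[0]
            k = 0 if rng.random() < p_option0 else 1
            # Reinforce only the option that was just experienced.
            attractions[bird][k] = (1.0 - g) * attractions[bird][k] + g * payoffs[k]
            choices.append(k)
        history.append(choices)
        last_choices = choices
    return history


if __name__ == "__main__":
    for lam in (1.0, 5.0, 10.0):
        final = simulate(lam=lam)[-1]
        print(f"lambda={lam}: share choosing option 1 on the last turn = {sum(final) / len(final):.2f}")
```

Varying `lam` across 1, 5, and 10 corresponds to the no-, intermediate-, and high-conformity panels; because the payoff scale and cue window are assumptions, the output should be read qualitatively only.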
gradients by conducting 20,000 simulations at each of 63 combinations of s and λ (for a total of 1,260,000 simulations) to compute the selection differential of a mutant (Fig. 5). In each simulation, we considered the difference in total payoffs between a mutant individual with parameters s + δs or λ + δλ and an average common-type individual with parameters s and λ. This difference defines a numerical estimate of the selection gradient for an invader. The parameters g and y were again fixed at 0.6 and 0. We display the results in Fig. 5 as a vector field. Selection adjusts combinations of s and λ in the directions indicated by the arrows, with longer arrows indicating stronger selection. The red dashed contour is the combinations of s and λ at which selection on conformity is neutral. Selection increases conformity below this contour. The blue dashed contour is the combinations where selection on s is neutral. Selection increases s above this contour. Therefore, selection favors more conformity for most of the gradient space, becoming disadvantageous only at high weights of social learning. Social learning weight increases above and to the left of the blue contour. Here, social learning weight cannot increase from zero unless social learning is slightly conformist (above the central blue contour); however, once conformity is above a threshold value, higher social learning weight is favored only up to a point. These processes combine to produce evolutionary dynamics that favor conformity combined with an intermediate weighting of social learning (Fig. 5).
In summary, the birds in the experiment obviously did not evolve in the experiment, and we do not expect them to be precisely adapted to it. Nevertheless, these simulations help us to understand why conformist social learning, in combination with payoff-sensitive individual reinforcement, facilitated the ability of individuals and groups to track environmental change.

Discussion
Our experiment reveals that socially learned foraging traditions in great tits are flexible in response to environmental change. Indeed, when subpopulations with strongly established foraging traditions for a single behavioral variant were subjected to a change in foraging payoffs, after just 14 days 49% of birds switched their behavior to prefer an alternative higher-payoff variant, with almost all individuals sampling this option. By modeling the decision-making process for each individual we show that, perhaps surprisingly, this population-level flexibility was achieved without significant asocial sampling and despite an ongoing bias for conformity at the individual level. Instead, switching depended on two factors. First, there was an interaction between social information and personal experience, with individuals that experienced the higher-payoff behavior having a strong preference for that variant in future solves. Second, there was extensive individual variation, with those individuals that relied more on social information showing a weaker conformist bias. These factors allowed some individuals to switch once fortuitously exposed to, and experiencing, the high-payoff variant. These individuals then provided the correct social information for others, leading to a positive feedback loop and eventual population-level turnover.
There has been extensive speculation about whether a reliance on social learning can lead to mismatched or out-of-date behavior (23, 25). Conformity has been thought to potentially exacerbate this process, as conformist individuals rely on an indirect cue of information quality (the proportion of individuals exhibiting a behavior) rather than assessing the value of the information itself (21, 22). However, there has been a paucity of empirical evidence in nonhuman animals. In the only prior study, guppies (Poecilia reticulata) were trained on a longer, suboptimal route to reach a feeding station. This route preference transmitted and persisted over several days before eroding toward a faster route (31). It was assumed that this erosion was associated with asocial learning, but this was not tested. The learning mechanisms that individuals may be using to optimally exploit variable environments were more explicitly tested in Rendell et al. (7), where a computer tournament was used to compete different sets of learning strategies. Winning strategies invested in social learning over asocial learning learned most when individuals experienced
learners, they will likely have some opportunity to acquire this alternative information. In species with highly modular networks, by contrast, social structure could instead act to slow the rate at which individuals and populations could flexibly adjust their socially learned behavior, with individuals repeatedly exposed to the same mix of potential demonstrators when copying.
In addition to social structure, population demography could also influence the speed at which populations can flexibly adjust socially learned behavior in a variable environment. Whereas we found no sex differences in learning, unlike in refs. 41 and 42, younger individuals in our population tended to show a faster move away from the established low-payoff technique than older individuals and had a higher probability of preferring the high-payoff technique by the end of the experiment. All individuals had equal opportunity in condition 1 to learn and practice the established behavior, and this result was unrelated to previous experience. Rather, it appears that younger birds were generally more likely than older individuals to use social information and, once having experienced the high-payoff technique, were also more able to flexibly adjust their behavior. As younger individuals are often also more likely to disperse, such flexibility in behavior could be advantageous when moving between new habitats (41). More broadly, future work should model the effects of population demography and social network structure on the ability of socially learning populations to track environmental change.
Indeed, both population demography and social structure could also potentially be manipulated, and their effect experimentally tested.
In conclusion, we show that socially learned traditions in wild populations of great tits will track environmental change. We further find that populations can track payoffs while individuals remain conformist social learners and use simulations to elucidate the mechanisms by which this counterintuitive outcome occurs. Indeed, our results suggest that conformist social learning actually helps the population adapt to and retain high-payoff behavior, provided it is not too strong. This adds further weight to arguments that social learning will be adaptive in a wide range

for Ornithology and the Natural England (Natural England license numbers 20123075, 20131205, 20145171).

Experimental Apparatus. The social learning task consisted of a plastic box containing a feeder that was accessed by sliding a bidirectional door either to left or to right. The left side of this door was colored blue and the right side red, and it had a raised front section to allow an easier grip. A perch in front of the door functioned as a radio-frequency identification (RFID) antenna registering the identity, visit duration, and action of each visiting individual; these were recorded and controlled by a printed circuit board (Stickman Technology) inside the box. One second after a bird was recorded as departed from the antenna, the door reset back to the middle. When installed in the woodland, each puzzle box was surrounded by a 1 × 1-m cage with a 5 × 5-cm mesh to prevent access by larger species, and a freely accessible bird feeder providing peanut granules was provided at 1 m distance.
In experimental condition 1, the puzzle-box feeder contained live mealworms. However, for experimental conditions 2 and 3, the puzzle box was modified to provide two different rewards, depending on the solving technique. This modification involved widening the door by ∼1.5 cm; however, no other changes were made to the puzzle-box interface.

Experimental Design. A social learning and foraging experiment was conducted in four relatively isolated subpopulations across the woodland, in 4-wk periods between December 2013 and January 2014 (treatment 1 and treatments 3 and 4) and in a 4-wk period in January 2013 for treatment 2. First, two males were caught from each subpopulation and trained in captivity to solve a novel puzzle box: In two subpopulations (T1 and T2), they were trained to solve using variant A (solving pushing right from the blue side), whereas in T3 and T4 they were trained to solve using variant B (solving pushing left from the red side) (Fig. 1A). All birds were then released to act as the initial demonstrators for this behavior, and three such puzzle boxes were installed 250 m apart in each subpopulation. These then continuously operated from dawn on Monday to dusk on Friday for a total of 20 days (18). In all areas the solving behavior spread rapidly, with 68–83% (n = 37–96 per subpopulation) of resident individuals solving either variant at least once. Puzzle boxes were used frequently, with 7,945–12,411 rewarded visits per subpopulation; for more detail see ref. 18.
In January and February 2014, these four replicates were then exposed to a modified puzzle box providing changed rewards. In condition 2 (2 days), this modified puzzle box provided sunflower seeds as a reward for solving using either technique. This was followed by condition 3 (14 days): Here
the behavioral variant introduced in the initial experiment was rewarded with sunflower seed, whereas the alternative technique was rewarded with live mealworms. In three replicates, these conditions occurred immediately following the initial experiment. In T2, the experiment occurred 1 y later; however, the population was given 15 days of exposure to the puzzle box immediately before condition 2, with 73% of resident individuals solving the task (either variant) in this period. The results from this replicate were similar to those from the three other replicates.

Statistical Analysis. To analyze the change over time in individual- and population-level preferences in condition 3, we used a GEE model where the dependent variable was the proportion of solves using the established tradition and explanatory variables were day, replicate, and individuals, weighted by their total number of solves per day. As the data distribution was bimodal, we first divided the data into two clusters, using a longitudinal clustering algorithm that fitted the data for the relative proportion of events that were variant A for individuals over cumulative 2-h time periods. This method was implemented using the R package kml3d.
Learning mechanisms underlying individual changes in behavior were analyzed using a sequential learning model that modeled individual choices as products of time-varying interactions of different modes of learning. The foundation of this framework is the experience-weighted attraction learning model (43), but with additional terms that allow behavior to be guided by social cues. Specifically, we assume that the probability of observing a choice $k$ at time $t$ by individual $i$ is given by

\[ p_{kit} = (1 - s_i)\, I_{kit} + s_i\, S_{kit}, \tag{1} \]

where $s_i$ is the influence of social cues on choice, $I_{kit}$ the probability of choice $k$ according only to accumulative individual attraction, and $S_{kit}$ the probability of $k$ according only to social cues. The individual attractions are modeled as ordinary experience-weighted attractions with a simple reinforcement model, such that the attraction score for an option $k$ at time $t$ and individual $i$ is given by

\[ A_{ki,t} = (1 - g_i)\, A_{ki,t-1} + g_i\, \pi_k, \tag{2} \]

where $g_i$ is the importance of newly experienced payoff $\pi_k$. Therefore, $g_i = 1$ when there is no influence of past experience. Here we estimate both $g_i$ for each individual $i$ and the unobservable payoff $\pi_k$ to each option $k$. Attraction scores at time $t$ imply choice probability by means of a softmax choice rule:

\[ I_{kit} = \frac{\exp(A_{kit})}{\sum_{n} \exp(A_{nit})}. \tag{3} \]

In fitting the model, we set the initial attraction scores at time $t = 0$ for each individual to the empirical preferences from the first condition. This accounts for the fact that most individuals begin the second condition with strong preferences for the formerly high option.
Social cues at time $t$ can influence choice by changing the probability $S_{kit}$. In the simplest example, conformist learning is modeled as

\[ S_{kit} = \frac{n_{kt}^{\lambda}}{\sum_{m} n_{mt}^{\lambda}}, \tag{4} \]

where $n_{kt}$ is the frequency of choice $k$ among social cues at time $t$ and $\lambda = \lambda_i$ is individual $i$’s conformist exponent. When this exponent is 1, social learning is unbiased by frequency and behavior is sampled merely in proportion to its occurrence among cues. When, however, the exponent exceeds 1, social learning is conformist. In this study, we consider also payoff-biased social learning, which favors the highest-payoff choice among choices observed at time $t$. Specifically, we construct a convex combination of conformist and payoff-bias terms

\[ S_{kit} = (1 - y_i)\, \frac{n_{kt}^{\lambda}}{\sum_{m} n_{mt}^{\lambda}} + y_i\,(1), \tag{5} \]

where $y_i$ is individual $i$’s reliance on payoff bias. When there is no variation in social cues, we assume that $y_i = 0$, which means that payoff bias is active only when observed payoffs vary.
We allow learning strategy to vary at the individual level, estimating $s_i$, $g_i$, $\lambda_i$, and $y_i$ for each individual $i$ in the sample. In each case, we construct each parameter such that its log-odds are a linear combination of an individual random effect and an age-specific offset. For example, the submodel for $s_i$ is

\[ \operatorname{logit}(s_i) = a_{i,1} + b_s x_i, \tag{6} \]

where $x_i$ is the standardized age of individual $i$ and $a_i$ is a vector of individual random effects, one for each parameter $s$, $g$, $\lambda$, and $y$. The conformity exponent $\lambda$ is given a log rather than a logit link.
Model fitting was performed using Hamiltonian Monte Carlo, as implemented in version 2.12 of Stan (44), to draw samples from the posterior distribution. We assessed convergence by inspection of the trace plots, Gelman–Rubin $\hat{R}$, and an estimate of the effective number of samples. Finally, model priors were defined to be weakly informative and conservative, so that estimated effects and correlations were shrunk slightly toward zero. Specifically, the averages for $s$, $g$, $\lambda$, and $y$ were assigned Normal(0,1) priors on the latent scale. The standard deviations of each random effect were assigned Exponential(2) priors, also on the latent scale. For the correlation matrix of random effects, we used the LKJ family of distributions of matrices and assigned $\eta = 3$, which shrinks correlations away from extreme values near −1 or +1 and toward zero. For the unobserved payoff advantage of the high-payoff option, we assigned a Cauchy(0,1) prior, which is essentially uninformative. Code sufficient to repeat our results is available as an R package, wythamewa, that contains the data, models, and simulation code.

ACKNOWLEDGMENTS. We thank Keith McMahon, Stephen Lang, and other members of the Edward Grey Institute for help with various aspects of fieldwork and data collection and Damien Farine for discussions leading to the formation of the project and for developing the software for the puzzle boxes. This research was supported by a grant from the Biotechnology and Biological Sciences Research Council (BB/L006081/1) (to B.C.S.). L.M.A. was supported by a junior research fellowship at St. John’s College, University of Oxford.
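
The choice rule described under Statistical Analysis (Eqs. 1–5) can be sketched in a few lines of Python. This is an illustrative sketch only and is not taken from the wythamewa package: the function names are hypothetical, the payoff-bias term is assumed to place all of its weight on the highest-payoff option observed (`best_option`), payoff bias is switched off when no such option is identified, and uniform social probabilities are assumed when no cues are observed.

```python
import math


def softmax(attractions):
    """Eq. 3: choice probabilities implied by attraction scores."""
    top = max(attractions)  # shift by the maximum for numerical stability
    exps = [math.exp(a - top) for a in attractions]
    norm = sum(exps)
    return [e / norm for e in exps]


def conformist_term(counts, lam):
    """Eq. 4: n_k^lam / sum_m n_m^lam; uniform if no cues were observed (assumption)."""
    weights = [float(c) ** lam for c in counts]
    total = sum(weights)
    if total == 0.0:
        return [1.0 / len(counts)] * len(counts)
    return [w / total for w in weights]


def social_term(counts, lam, y=0.0, best_option=None):
    """Eq. 5: convex combination of conformist and payoff-bias terms.

    Assumption: the payoff-bias term is an indicator on best_option, the
    highest-payoff option observed; with best_option=None, y is treated as 0.
    """
    conformist = conformist_term(counts, lam)
    if best_option is None:
        y = 0.0
    return [(1.0 - y) * c + y * (1.0 if k == best_option else 0.0)
            for k, c in enumerate(conformist)]


def choice_probs(attractions, counts, s, lam, y=0.0, best_option=None):
    """Eq. 1: mix individual and social probabilities with weight s."""
    individual = softmax(attractions)
    social = social_term(counts, lam, y, best_option)
    return [(1.0 - s) * i + s * so for i, so in zip(individual, social)]


def update_attraction(previous, payoff, g):
    """Eq. 2: reinforcement update of the attraction for the experienced option."""
    return (1.0 - g) * previous + g * payoff


if __name__ == "__main__":
    # Three of four observed solves used option 0; compare lambda = 1 and lambda = 2.
    for lam in (1.0, 2.0):
        print(lam, choice_probs([0.0, 0.0], [3, 1], s=1.0, lam=lam))
```

With cue counts of 3 and 1, an exponent of 1 reproduces the observed frequencies (0.75, 0.25), whereas an exponent of 2 exaggerates the majority (0.9, 0.1), which is the sense in which an exponent above 1 is conformist.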
1. Muller CA, Cant MA (2010) Imitation and traditions in wild banded mongooses. Curr Biol 20:1171–1175.
2. Whiten A, et al. (1999) Cultures in chimpanzees. Nature 399:682–685.
3. van Schaik CP, et al. (2003) Orangutan cultures and the evolution of material culture. Science 299:102–105.
4. van de Waal E, Borgeaud C, Whiten A (2013) Potent social learning and conformity shape a wild primate’s foraging decisions. Science 340:483–485.
5. Slagsvold T, Wiebe KL (2007) Learning the ecological niche. Proc Biol Sci 274:19–23.
6. Slagsvold T, Wiebe KL (2011) Social learning in birds and its role in shaping a foraging niche. Philos Trans R Soc Lond B Biol Sci 366:969–977.
7. Rendell L, et al. (2010) Why copy others? Insights from the social learning strategies tournament. Science 328:208–213.
8. McElreath R, et al. (2008) Beyond existence and aiming outside the laboratory: Estimating frequency-dependent and pay-off-biased social learning strategies. Philos Trans R Soc Lond B Biol Sci 363:3515–3528.
9. Boyd R, Richerson P (1985) Culture and the Evolutionary Process (Univ of Chicago Press, Chicago).
10. Henrich J, McElreath R (2003) The evolution of cultural evolution. Evol Anthropol 12:123–135.
11. Laland K (2004) Social learning strategies. Learn Behav 32:4–14.
12. Rendell L, et al. (2011) Cognitive culture: Theoretical and empirical insights into social learning strategies. Trends Cogn Sci 15:68–76.
13. Cantor M, Whitehead H (2013) The interplay between social networks and culture: Theoretically and among whales and dolphins. Philos Trans R Soc Lond B Biol Sci 368:20120340.
14. Rendell L, et al. (2011) How copying affects the amount, evenness and persistence of cultural knowledge: Insights from the social learning strategies tournament. Philos Trans R Soc Lond B Biol Sci 366:1118–1128.
15. Morgan TJH, Laland K (2012) The biological bases of conformity. Front Neurosci 6:1–7.
16. Aplin LM, et al. (2015) Counting conformity: Evaluating the units of information in frequency-dependent social learning. Anim Behav 110:e5–e8.
17. Pike TW, Laland K (2010) Conformist learning in nine-spined stickleback’s foraging decisions. Biol Lett 6:466–468.
18. Aplin LM, et al. (2015) Experimentally induced innovations lead to persistent culture via conformity in wild birds. Nature 518:539–541.
19. Whiten A, Horner V, de Waal FB (2005) Conformity to cultural norms of tool use in chimpanzees. Nature 437:737–740.
20. Nakahashi W, Wakano JY, Henrich J (2012) Adaptive social learning strategies in temporally and spatially varying environments: How temporal vs. spatial variation, number of cultural traits, and costs of learning influence the evolution of conformist-biased transmission, payoff-biased transmission, and individual learning. Hum Nat 23:386–418.
21. Henrich J, Boyd R (1998) The evolution of conformist transmission and the emergence of between-group differences. Evol Hum Behav 19:215–241.
22. Whitehead H, Richerson PJ (2009) The evolution of conformist social learning can cause population collapse in realistically variable environments. Evol Hum Behav 30:261–273.
23. Galef BG (1995) Why behaviour patterns that animals learn socially are locally adaptive. Anim Behav 49:1325–1334.
24. Galef BG (1995) A new model system for studying behavioural traditions in animals. Anim Behav 50:705–717.
A social insect perspective on the evolution of social
learning mechanisms
Ellouise Leadbeater^a,1 and Erika H. Dawson^b,2
^a School of Biological Sciences, Royal Holloway University of London, Egham TW20 0EX, United Kingdom; and ^b Laboratoire Evolution, Génomes, Comportement et Ecologie, CNRS, 91198 Gif-sur-Yvette, France
Edited by Kevin N. Laland, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark May 29, 2017
(received for review January 14, 2017)
The social world offers a wealth of opportunities to learn from others, and across the animal kingdom individuals capitalize on those opportunities. Here, we explore the role of natural selection in shaping the processes that underlie social information use, using a suite of experiments on social insects as case studies. We illustrate how an associative framework can encompass complex, context-specific social learning in the insect world and beyond, and based on the hypothesis that evolution acts to modify the associative process, suggest potential pathways by which social information use could evolve to become more efficient and effective. Social insects are distant relatives of vertebrate social learners, but the research we describe highlights routes by which natural selection could coopt similar cognitive raw material across the animal kingdom.

social learning | associative learning | observational conditioning | social insects | Bombus

the ability to learn socially is simply a useful byproduct of the fundamental ability to learn asocially that has remained untouched by natural selection. There is clear empirical evidence that natural selection can fine-tune associative processes to particular tasks (34–36). For certain social learning processes that typify human behavior, such as imitation or learning through language or instruction, this mechanistic middle ground between adaptation- and preadaptation-based explanations for social learning is the subject of substantial empirical attention (37–40). However, this is not the case for those social learning mechanisms that underlie the majority of social information use outside the primates, such as local or stimulus enhancement, social facilitation, or goal emulation (41). The effects produced by these “simpler” processes are well-labeled (1), but these definitions often tell us little about how the social stimuli upon which they are based come to control behavior. For example, imagine that an individual observes a demonstrator using a tool to get food, and then uses the same tool to extract food for him or
EVOLUTION
and quickly learn about the stimuli that predict where to find rewards in the particular flower species and patches on which they forage. Information provided inadvertently by other foraging bees influences this learning process, and a simple example whereby social bees learn from their conspecifics about rewarding flower types provides a good introduction to an associative framework for studying social learning.
Worden and Papaj (5) allowed bumblebees (Bombus impatiens) to observe their foraging conspecifics through a screen. Those conspecifics foraged on only one of two available flower colors, and when the observers were later permitted to forage alone, they “copied” the color preferences of the demonstrators (5, 43, 44). This type of learning initially appears conceptually opaque, because learning theory is based upon the fundamental concept of prediction error, which requires a difference between predicted and experienced outcomes (45). In this observational paradigm, observers do not directly experience any rewarding outcome, nor had these laboratory bees ever had the chance to learn that matching the color choices of conspecifics was rewarding. Thus, the bees in this situation seem to have learned about flower color simply by observing the behavior of their conspecifics.
Second-order conditioning is an associative phenomenon that can potentially explain why animals respond to certain stimuli as though they have been directly associated with food rewards, when they have not. This process is best known for its use in psychological research to study the contents of learning (46), but perhaps the most illustrative example derives from the work of Pavlov (47), who famously trained dogs that the tick of a metronome (a conditioned stimulus, CS) predicted the arrival of food (an appetitive unconditioned stimulus, US+). When Pavlov later trained the same dogs that presentation of a black square (a second conditioned stimulus, CS2) predicted the sound of the metronome in the absence of food, he found that subsequent presentation of the square alone evoked salivation. In other words, the black square (CS2) had come to elicit the same response as the food (US+), despite the two never having been experienced together. An association had formed between the black square and either the food itself or the appetitive state induced by the conditioned state of the bell (it remains unclear which) (20, 48), and this association constitutes a second-order conditioned relationship. Whereas in classic conditioning paradigms, it is an unconditioned response to a stimulus that comes to be elicited by a new stimulus, in second-order conditioning paradigms, it is a conditioned response that undergoes what is functionally the same effect.
associate conspecifics (CS) with an aversive substance (US−) actively avoided those same colors, and bees that had never foraged with conspecifics were not influenced by conspecific choices. In other words, the results support that observing conspecifics through a screen influenced forage preferences through second-order conditioning.
Second-order conditioning is an associative mechanism, and the use of an associative framework to explain empirical results in social learning research is not new. Perhaps the best-known example comes from the work of Cook and Mineka (50–52), who demonstrated almost 30 years ago that juvenile rhesus monkeys (Macaca mulatta) can acquire a fear of snakes through observation of a frightened conspecific interacting with a snake, a phenomenon that they termed “observational conditioning.” As the name implies, these results are considered to reflect a classic conditioning process, whereby the sight of a frightened conspecific (US) elicits an unconditioned fear response in observers. This affective state becomes conditioned to stimuli that are experienced at the same time, in this case a snake model (CS), which thus also acquire the ability to elicit fear (28, 52). In other words, the snake is simply a conditioned stimulus that comes to elicit the fear response. Cook and Mineka (50) found that fear could be conditioned to snakes in this way, but not to flowers, implying that fear cannot be socially conditioned to any randomly chosen stimulus. However, this is not an argument against an associative explanation, because in primates snake stimuli are generally particularly easily conditioned to fear responses, whereas flower stimuli are not (53–56). For example, in humans, pairing of snake pictures with electric shock leads those pictures to later elicit heart-rate acceleration (indicating fear), whereas the same protocol involving flower pictures leads to heart-rate deceleration (indicating arousal of attention) (54, 56). It is thus not surprising that a fear response invoked by a social stimulus can be easily conditioned to snakes but was not detectable for flowers.
What is the link between the second-order conditioning and the observational conditioning process that we describe above? In both cases, a response to a social stimulus becomes conditioned to a new, asocial stimulus. In observational conditioning, an unconditioned response (here fear, elicited by seeing a frightened conspecific) becomes conditioned to a new stimulus. In second-order conditioning, a conditioned appetitive response becomes conditioned to a new stimulus. The two mechanisms are functionally analogous, and the difference lies in whether the initial response to the social stimulus is acquired through learning (a conditioned response) or unlearned
Leadbeater and Dawson PNAS | July 25, 2017 | vol. 114 | no. 30 | 7839
Fig. 1. Second-order conditioning of flower color preferences in bumblebees (43). (A) Observer bees were initially allowed to forage on a floral array in
which the presence of conspecifics (CS1) predicted either sucrose (US+; Upper) or quinine (US−; Lower). A third group completed this training in the absence
of any conspecifics (not pictured). (B) All bees then observed conspecifics (CS1) foraging on one flower color (CS2) and ignoring an alternative, through a glass
screen. (C) Finally, each observer was permitted to forage alone on the colored array.
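The three-group design in Fig. 1 can be illustrated with a toy prediction-error model. The sketch below is an illustration only, not the analysis used in the cited study: it assumes a Rescorla-Wagner style update, and it assumes that during observation the learned value of the conspecific (CS1) stands in for the absent reward when the flower color (CS2) is updated. The function names, learning rate, and trial counts are all hypothetical.

```python
def rescorla_wagner_step(value, outcome, rate):
    """One prediction-error update: move `value` toward the experienced outcome."""
    return value + rate * (outcome - value)


def second_order_value(us_value, phase1_trials=20, phase2_trials=5, rate=0.3):
    """Return the value acquired by CS2 (flower color) after two phases.

    us_value: +1.0 for sucrose, -1.0 for quinine, 0.0 for bees that never
    foraged with conspecifics (hypothetical coding of the three groups).
    """
    # Phase 1: conspecific presence (CS1) is paired with the US.
    v_cs1 = 0.0
    for _ in range(phase1_trials):
        v_cs1 = rescorla_wagner_step(v_cs1, us_value, rate)

    # Phase 2: flower color (CS2) is paired with CS1, with no US present.
    v_cs2 = 0.0
    for _ in range(phase2_trials):
        # Assumption: CS1's current value acts as the reinforcer for CS2.
        v_cs2 = rescorla_wagner_step(v_cs2, v_cs1, rate)
        # CS1 itself is no longer reinforced, so its value extinguishes.
        v_cs1 = rescorla_wagner_step(v_cs1, 0.0, rate)
    return v_cs2


sucrose_group = second_order_value(+1.0)
quinine_group = second_order_value(-1.0)
naive_group = second_order_value(0.0)
```

Under these assumptions the color seen with demonstrators acquires a positive value in the sucrose group, a negative value in the quinine group, and no value in the group with no social foraging experience, which is the qualitative pattern that a second-order conditioning account predicts.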
predisposition [an unconditioned response; although Heyes (29) highlights that presumed predispositions may in fact reflect learning].
Note that although we have framed this argument in terms of classic conditioning, the same case can be made for operant conditioning (57). Matched-dependent behavior (58) describes instances whereby matching of another animal’s behavior (i.e., a response to a social stimulus) is rewarded (e.g., by finding food; a reinforcer) and thus increases in frequency, and Church (59) has shown that this response can become conditioned to the asocial stimulus that initially elicited the demonstrator’s behavior. The key point is that responses to social stimuli (be they acquired through classic conditioning, operant conditioning, or a history of natural selection) are conditioned to new asocial stimuli.
As an illustration, consider the following example (60). Social information from injured conspecifics, in the form of volatiles from stressed conspecifics, typically elicits aversion responses in foraging honey bees (Apis mellifera) (61). We recently found that bees not only avoid areas where they currently detect such volatiles, but also later avoid colored lights that were experienced at the same time (60) (Fig. 2). Thus, the avoidance response, which is initially elicited by the social stimulus, becomes conditioned to a new asocial stimulus. In the laboratory, we used colored lights as asocial stimuli; in the wild, floral features that predict the presence of sit-and-wait predators, such as crab spiders (Thomisidae spp.) could fulfill the same role.
It is important to reiterate that this associative framework complements, rather than adds to, the collection of processes that are labeled as social learning mechanisms, such as local or stimulus enhancement, or social facilitation (1). These labels describe effects, and learning theory offers an explanation for why these effects occur, rather than an additional alternative effect (39). In some cases, an associative framework is already
Fig. 2. Honey bees learn to respond to colored lights through exposure to alarm volatiles (60). (A) We used an assay whereby highly phototactic subjects
walked up a dark tube toward a colored light (balanced blue/green design; only blue is pictured). Warning triangle indicates the presence of volatiles from a
stressed conspecific. (B) In the training phase, bees in groups E1 (experimental) and E2 (control for sensitization/habituation effects) were slower to approach
the light, but only group E1 were slower in the testing phase. Thus, responses were conditioned to the specific stimuli that had been contiguous with stress
volatiles in the training phase.
to be learned, such that information deriving from informative cues is acquired particularly rapidly. One means (but not the only means, as we discuss later) to do so could involve changes to upstream mechanisms that determine whether animals notice or pay attention to social stimuli. We begin with a focus on the question of whether social stimuli are more salient than less-ecologically informative alternatives.
The salience of a stimulus (the property that renders it conspicuous or noticeable, and thus likely to be attended to) is a key determinant of the speed with which it can be associated with other stimuli (62). Salience depends on the species-specific characteristics of the receiver’s sensory system, such that a stimulus that is salient for one taxonomic group may be less so for another (63). For example, carbon disulphide (a component of rat breath) is a particularly salient stimulus for young rats, and is quickly associated with any food flavors that are encountered at the same time (64). Experimental evolution paradigms provide evidence that environments where particular cue types are reliably useful for decision-making select for changes to the attentional or perceptual mechanisms that determine stimulus salience. For example, Drosophila lines for which learning about olfactory stimuli is the best method of identifying a good brood host develop enhanced sensitivity to such cues and ignore visual alternatives, and vice versa (36). In a social context, the question of whether social cues are especially salient has rarely been directly addressed [with the exception of Galef et al. (64)], but a number of bee studies touch upon the topic, with mixed results.
When two conditioned stimuli (e.g., a social and an asocial stimulus) are simultaneously paired with an unconditioned stimulus, the more salient stimulus overshadows learning about the less salient stimulus (47). Accordingly, an experiment by Dunlap and colleagues initially appears to suggest that conspecifics present might be a particularly salient stimulus for bumblebees (65). In this study, subjects (B. impatiens) were trained to find sucrose rewards in floral arrays where both asocial and social cues (floral color and pinned conspecifics, respectively) provided some indication of which flowers were rewarding, before assessing which of the two cue types the bees preferentially used on a test array. When both cue types had been equally reliable at predicting sucrose, the bees disproportionately favored social cues in the tests, and most surprisingly, they also used social cues even when floral cues had been more reliable. Bees only resorted to using floral cues if social cues had been entirely useless predictors of reward. These results would seem to indicate that the social stimulus is the more salient alternative, overshadowing any
exactly as Dunlap et al.’s results illustrate.
It may well be the case that bees treat asocial and social cues differently, and post hoc associative explanations should generate testable predictions to be useful. Here, one simple means to rule out a blocking explanation would be to closely control the previous foraging experience of the bees. In this study system, where individuals never leave the flight arena and the colony’s foraging needs can be met by providing sucrose and pollen directly into the nest, it is entirely possible to create individual foragers with no experience of social foraging (43, 67, 68). Finding the same effect under these circumstances would render the case for enhanced salience of social stimuli compelling, and a further experiment by Dawson and Chittka (68) employs such an approach. These authors compared the speed at which bumblebees (B. terrestris) learned to associate either a social CS (the presence of dead, pinned conspecifics) or an asocial CS (a coin/plastic disk/black wooden cuboid) with sucrose, in a free-flying choice paradigm. Bees were more likely to associate the social than asocial CS with reward, and were also more likely to use the social CS to identify rewarding flowers in a transfer test involving a novel target flower color. On realizing that their results could reflect the fact that subjects had previous experience of social foraging from laboratory feeders, Dawson and Chittka repeated their experiment using bees that had never foraged with others, and found that the effect was maintained. Interestingly, pinned honey bee (A. mellifera) demonstrators, which visit comparable but not identical floral resources, elicited similar results, suggesting that natural selection might lead to increased salience of cues that derive from useful heterospecifics, as well as conspecifics.
Social Associations
Salience is an important determinant of the ease with which an association is acquired, but it is by no means the only contributing factor. Whereas salience affects the extent to which a stimulus is made available for learning, learning itself requires that associations between neural representations of stimuli, affective states, or motor patterns, are formed around that stimulus. Natural selection might act upon the parameters that determine the number and timing of exposures that are required before a particular association is committed to memory (69). The key feature of such prepared learning is that certain combinations of stimuli, rather than any particular stimulus alone, elicit rapid learning. For example, consider the oft-cited “Garcia effect,” whereby rats rapidly learn to associate tastes but not
audible tones with gastrointestinal illness. The combination of taste and illness is critical here, and the same effect is not observed when tastes are associated with shock (35, 70), ruling out the suggestion that tastes are simply more salient stimuli than tones. In fact, Garcia and Koelling (35) found that tones were more easily associated with shock than taste was, potentially because loud noises are likely to be a relevant cue to imminent pain, whereas taste is not.
In a social learning context, there are clear candidate hypotheses concerning stimulus combinations that await investigation. For example, above we discussed evidence that bees might acquire associations between nectar rewards and social stimuli more rapidly than associations between nectar rewards and asocial stimuli (65, 68). A prepared learning hypothesis would predict that social stimuli might be particularly easy to associate with sucrose reward levels, but not with an aversive stimulus. Or alternatively, perhaps the sign of the CS–US relationship is also important, such that bees learn positive sucrose/social cues relationships easily but not negative ones. These latter two alternatives (that bees are particularly sensitive to social CSs when learning about where to find food, or that they are particularly likely to learn positive social CS-sucrose combinations) are qualitatively different traits. Theory correspondingly predicts that they should evolve under different circumstances (36, 71). To visualize the difference, consider again the Garcia effect (35). Although many stimuli might precede a feeling of illness, the true cause will often be related to recently eaten food, so an a priori prioritization of associations between taste and illness, rather than sound and illness, makes evolutionary sense. Now consider the situation where a particular flavor always predicts illness. Here, theory would predict the fixation of an aversion to that flavor, rather than prepared learning (71). Thus, a priori expectations and prepared learning about social stimuli are both means by which natural selection could facilitate social learning processes, but the ecological conditions that favor their evolution are distinct. Teasing apart the roles of stimulus salience, prepared learning, and a priori expectation is a task that invites empirical exploration in our social insect system and in others.
Retrieval and Implementation of Learned Associations
We have discussed potential pathways by which natural selection could modify stimulus salience, or the downstream learning parameters that influence memory formation, and suggested means by which such hypotheses could be explored. However, we have not yet touched upon the possibility that selection could produce modifications to the final retrieval or implementation of learned information. This is particularly important in light of the large volume of literature on “social learning strategies” that describes how animals use social information most often in those situations in which it is most beneficial (72, 73). We begin with an example that illustrates how associative processes could bring about such context-specificity.
As we have alluded to in an earlier example in this paper, foraging bees suffer predation by camouflaged sit-and-wait predators that ambush individuals as they land on flowers to feed. Dawson and Chittka (74) allowed bees to forage in environments that mimicked high or low risk of such predation, and found that bumblebees (B. terrestris) appear to use social information to identify safe flowers specifically in dangerous environments. Their subjects were initially trained on an array where landing on flowers of one color morph led to brief capture in a pressure trap (simulating spider attack), whereas an alternative color morph was safe. Note that the color morphs simulate flower species with different levels of spider occupancy, rather than spiders themselves, which are typically cryptic. When subsequently tested on an array containing only flowers of the dangerous morph, bees strongly preferred to land on the single flower where a live demonstrator could be seen feeding, an effect that was entirely absent when tested on flowers of the safe morph (Fig. 3). In other words, bees seemed to use social information adaptively when foraging on flower types where predation was a real threat. Bees could not have simply learned that the social cue was useful on dangerous flowers but irrelevant on safe ones because individuals were trained on the mixed array entirely alone.
How might associations between stimuli produce such context-specific and potentially adaptive behavior? Conditioned suppression is an associative phenomenon widely used to study the acquisition and extinction of fear, and it predicts exactly the effect that Dawson and Chittka (74) demonstrated in bumblebees (30). In the presence of a cue predicting an aversive stimulus that an animal would usually avoid, learned food-seeking behaviors are typically repressed, although they do not disappear altogether (75). Thus, a rat that has learned to press a lever for food typically reduces the frequency of pressing when a light that predicts foot shock is turned on (76). Similarly, when under predation pressure from a threat that is hard to directly detect, bumblebees increase both their latency to probe and rejection rate of all flowers of the color morph associated with danger (77, 78). With this in mind, picture a bee that enters a flight arena filled with flowers of the safe color morph, and happens to first come across an unoccupied flower. Because the bee has learned that the holes in the flowers reliably contain sucrose rewards, it is very likely to land and feed. If it instead first comes across an occupied flower where a demonstrator is foraging, it is also very likely to land and feed. Even if the presence of the demonstrator renders flowers much more attractive, the difference in acceptance rates between unoccupied (very attractive) and occupied (extremely attractive) flowers will be hard to detect experimentally, because all flowers are very likely to be accepted. Now consider an environment where the floral color morph predicts danger. All conditioned responses will be suppressed, such that flowers in general are less likely to be accepted. In this situation, if the presence of a demonstrator bee is attractive, the effect is much more likely to be detected experimentally (Fig. 4), because it is no longer the case that a bee almost invariably accepts the first flower that it encounters.
This associative hypothesis does not require that selection deriving from risky contexts has influenced the weight that bees ascribe to social information. It is simply an alternative to the suggestion that individuals strategically respond to the circumstances in which they find themselves by computing the likely pay-offs of social information use. However, our hypothesis again generates a testable prediction. If dangerous environments simply render the effect of an attractive social stimulus more detectable, the same should be true for an attractive asocial stimulus. Thus, replacing conspecific demonstrators with asocial stimuli that have previously been conditioned to sucrose should produce analogous results.
A study by Smolla et al. (79) employs exactly this approach in a different social learning context. Based on the premise that bees should use a “copy-when-uncertain” strategy, which follows from an agent-based model that they develop, these authors pretrained bees (B. terrestris) that either a social (model bee) or an asocial (green rectangle) cue predicted reward in a floral array. Half of the bees in each group were then trained that the floral array contained highly variable rewards and half learned that rewards were constant, in the absence of both cue types. When subsequently presented with a nonrewarding test array, those bees that had learned that rewards were variable used the social cue to find food, but those that had experienced the constant array did not. Their results thus support a “copy-when-uncertain” interpretation, but as pointed out above, the fact that more variable rewards render flowers less attractive (80, 81) means that the use of any cue, not just social ones, should be rendered more detectable. However, crucially, the difference between contexts was much less evident for the asocial cue and did not reach statistical significance. This difference seems unlikely to be attributed to greater salience or associability of the social cue, because Smolla et al. (79) state that pretraining was similarly successful for both cue types. Smolla et al.’s results invite further exploration that has not yet
by colored squares. During training, one color (here yellow) predicted brief capture in a pressure trap. Bees were then tested in either a “dangerous” or a
“safe” environment, where one live demonstrator was foraging at a single location.
been carried out, but their approach of comparing context-specific responses to social and asocial stimuli is a promising one that seems a useful way to evaluate the specific characteristics that govern learning about social stimuli.
Competing Hypotheses
As a whole, the insect-based studies that we have discussed illustrate the power of associative learning to generate adaptive behavior. There is nothing new about associative explanations for social learning phenomena; Heyes in particular has long championed this approach (24, 28–30, 39, 40). However, the fact that a hypothesis has explanatory power is not evidence of its truth, and consequently, debate arises over whether associative learning should be accredited with the status of a “default” explanation for social learning phenomena, to be assumed true unless proven false (24). Arguments that favor this approach in animal cognition are typically based on Morgan’s canon, which states that “no animal activity should be interpreted in terms of higher psychological processes if it can be fairly interpreted in terms of processes that stand lower in the scale of psychological evolution and development” (82), but this has led to debate over whether Morgan’s canon itself is up to the task (24).
Perhaps such a polarized perspective is less productive than it could be. If there were clear evidence that social learning processes were more efficient in species where social information repeatedly presents itself than in more solitary species (for
[Figure 4: two panels, (A) Safe and (B) Dangerous; y-axis: Probability of Acceptance, scale 0.0 to 1.5.]
Fig. 4. Conditioned suppression should render social effects more detectable when aversive stimuli are present. (A) Safe environment. We assume that bees have
learned to associate the feeding holes in artificial flowers with sucrose rewards. Therefore, in the absence of aversive stimuli, the chances of acceptance on en-
countering a flower are high (here set at 0.9, for illustration purposes). The presence of a demonstrator is also attractive (here set at 0.45), perhaps because bees have
previously learned to associate conspecifics with sucrose. If two appetitive conditioned stimuli are presented together, the subject’s expectation is equal to their
combined strength (62), so occupied flowers are very attractive, but a probability of acceptance cannot exceed 1. Thus, the detectable effect of demonstrator presence
is small (arrow). (B) Dangerous environment. The presence of an aversive stimulus (the dangerous flower color) reduces all responses for food (here, suppression ratio
has been set at 0.5). Thus, the difference in the probability of acceptance between the unoccupied and occupied flowers is now relatively more detectable (arrow).
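The arithmetic behind Fig. 4 can be reproduced in a few lines. The sketch below uses the illustrative values given in the caption (0.9 for the flower, 0.45 for the demonstrator, 0.5 for suppression). It rests on two assumptions that the caption does not spell out: the suppression value is treated as a multiplier on the summed associative strength, and the cap at 1 is applied after suppression. Function and variable names are hypothetical.

```python
FLOWER_STRENGTH = 0.9         # learned flower-sucrose association (caption value)
DEMONSTRATOR_STRENGTH = 0.45  # learned conspecific-sucrose association (caption value)


def acceptance_probability(strengths, suppression=1.0):
    """Sum the appetitive strengths, scale by suppression, and cap within [0, 1]."""
    combined = sum(strengths) * suppression  # assumption: multiplicative suppression
    return min(1.0, max(0.0, combined))      # a probability cannot exceed 1


def demonstrator_effect(suppression):
    """Difference in acceptance between occupied and unoccupied flowers."""
    unoccupied = acceptance_probability([FLOWER_STRENGTH], suppression)
    occupied = acceptance_probability(
        [FLOWER_STRENGTH, DEMONSTRATOR_STRENGTH], suppression
    )
    return occupied - unoccupied


safe_effect = demonstrator_effect(suppression=1.0)       # about 0.1 (ceiling effect)
dangerous_effect = demonstrator_effect(suppression=0.5)  # about 0.225
```

With these numbers the demonstrator adds roughly 0.1 to acceptance in the safe environment, where the ceiling at 1 hides most of its strength, and roughly 0.225 in the dangerous environment. The same calculation applies unchanged to an asocial cue of equal strength, which is the prediction the text goes on to test.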
example, in the social vs. solitary insects, which span multiple origins of sociality), it might be argued that the two are sufficiently different traits to justify discussion of evolutionary parsimony. However, such evidence is sparse. Most studies that compare learning efficiency across social and asocial contexts conclude that the two are highly correlated (83–85; but see also ref. 86). Here, we have suggested that a productive way forward might be to search for small quantitative differences between associative processes in asocial and social contexts, rather than qualitative leaps. Heyes (29) has highlighted that the pathway of least resistance for natural selection might be to modify the input mechanisms that determine which aspects of the world are made available for learning, contrasting such processes with learning mechanisms that determine how stimuli are linked and committed to memory. Increased salience of social stimuli is one such input mechanism, but increased salience of social stimuli could be evolutionarily advantageous for many reasons other than acquiring information, such as recognizing kin, selecting mates, or defending a territory (87). Perhaps more convincing evidence that natural selection has shaped responses to social information would be evidence of prepared learning about social stimuli, of unconditioned responses to social stimuli that were specific to a social learning context, or of social learning strategies that cannot be accounted for by associative learning theory.
Our choice of study taxon, the social insects, has meant that we have made little mention of processes that characterize mainly primate behavior, such as imitation and (to a lesser extent) emulation. Nonetheless, very recent work has begun to focus on potentially emulative behavior even in bees (8, 9), and it is not our intention to imply that such processes follow a different evolutionary pathway to other forms of social learning. In fact, associative explanations for imitation are prominent in the psychology literature [40, 88; see also Lotem et al. (33)]. Explaining how individuals copy a novel sequence of actions through imitation invokes a “correspondence problem” (89, 90) because the seen movements of others must somehow be matched to motor representations of self-movements. Associative models of imitation propose that such links could arise through previous experience of contingency between performing an action and seeing it performed (91). For example, an infant might often observe others smiling when she or he smiles, leading to association between the visual and motor representations of smiling. She or he will also typically observe a raised arm each time that she or he raises her or his own arm, or react to an unexpected surprise with the same expressions as others nearby. This elegant idea generates both theoretical debate and extensive empirical exploration. However, imitation is but one social learning mechanism, and this intensive exploration of mechanism is not the norm. It is surprising that for the vast majority of cases in which animals respond to social information, we understand very little about underlying mechanisms at all (41). To return to Galef’s statement that “an entire field of local enhancement awaits exploration” (41), we suggest that this exploration begins with a full understanding of how reported “simpler” social learning phenomena fit into an associative framework.
Conclusion
We became interested in insect social learning based on a thought-provoking anecdotal observation that the acceptance of a laboratory feeder by a single bumblebee seemed to render that feeder popular, rather than through careful choice of a study system. It was surprising to subsequently find that insects seemed capable of what we considered a relatively sophisticated cognitive process. However, questions about cognition in invertebrates inevitably involve detailed dissection of mechanisms, and a critical feature of this system has been the ability to closely control the previous social foraging experience of experimental subjects in a way that would not be possible for many species. We hope that the body of research presented above shows that there was value in the novel perspective offered by a rather unlikely model. The studies that we have described are based on a small-brained study organism, but one that excels at accomplishing the cognitive tasks that are relevant to its own ecological niche (92), that is highly social, and that offers exceptional experimental tractability. Other insect systems offer similar advantages for investigating social learning contexts that we have not mentioned in detail here; for example, a large literature now documents the existence of remarkably rapid and generalizable mate-choice copying in Drosophila fruit flies (3, 11, 12). It may well be the case that social learning abilities have traveled further along some evolutionary routes in other lineages, and less far in others, but the rules that govern associative learning are taxonomically widespread within the animal kingdom, and natural selection may well coopt the same cognitive raw material across multiple evolutionary lineages.
ACKNOWLEDGMENTS. We thank the organizers and funders of the Arthur M. Sackler Colloquium on “The Extension of Biology Through Culture,” from which this paper derives, the many participants at the colloquium who provided informative feedback and discussion, and Simon Reader for comments on an earlier draft of the manuscript. Several of the empirical studies presented here were coauthored by Lars Chittka, and the ideas that we have discussed owe much to his direction and guidance. E.L. is funded by European Research Council Starting Grant BeeDanceGap, and the empirical work described here also derives from an Early Career Fellowship from The Leverhulme Trust. E.H.D.’s current position is funded by a Fyssen foundation postdoctoral fellowship.
1. Hoppitt W, Laland K (2013) Social Learning: An Introduction to Mechanisms, Methods and Models (Princeton Univ Press, Princeton, NJ).
2. Grüter C, Leadbeater E (2014) Insights from insects about adaptive social information use. Trends Ecol Evol 29:177–184.
3. Leadbeater E, Chittka L (2007) Social learning in insects: From miniature brains to consensus building. Curr Biol 17:R703–R713.
4. Mery F, et al. (2009) Public versus personal information for mate copying in an invertebrate. Curr Biol 19:730–734.
5. Worden BD, Papaj DR (2005) Flower choice copying in bumblebees. Biol Lett 1:504–507.
6. Leadbeater E, Chittka L (2007) The dynamics of social learning in an insect model, the bumblebee (Bombus terrestris). Behav Ecol Sociobiol 61:1789–1796.
7. Leadbeater E, Chittka L (2008) Social transmission of nectar-robbing behaviour in bumble-bees. Proc Biol Sci 275:1669–1674.
8. Alem S, et al. (2016) Associative mechanisms allow for social learning and cultural transmission of string pulling in an insect. PLoS Biol 14:e1008529; erratum in 14(12):e1008529.
9. Loukola OJ, Perry CJ, Coscos L, Chittka L (2017) Bumblebees show cognitive flexibility by improving on an observed complex behavior. Science 355:833–836.
10. Battesti M, Moreno C, Joly D, Mery F (2015) Biased social transmission in Drosophila oviposition choice. Behav Ecol Sociobiol 69:83–87.
11. Dagaeff A-C, et al. (2016) Drosophila mate copying correlates with atmospheric pressure in a speed-learning situation. Anim Behav 121:163–173.
12. Battesti M, Moreno C, Joly D, Mery F (2012) Spread of social information and dynamics of social transmission within Drosophila groups. Curr Biol 22:309–313.
13. Goulson D, Park KJ, Tinsley MC, Bussiere LF, Vallejo-Marin M (2013) Social learning drives handedness in nectar-robbing bumblebees. Behav Ecol Sociobiol 67:1141–1150.
14. Danchin E, Giraldeau LA, Valone TJ, Wagner RH (2004) Public information: From nosy neighbors to cultural evolution. Science 305:487–491.
15. Thompson R, McConnell J (1955) Classical conditioning in the planarian, Dugesia dorotocephala. J Comp Physiol Psychol 48:65–68.
16. Kemenes G, Benjamin PR (1989) Appetitive learning in snails shows characteristics of conditioning in vertebrates. Brain Res 489:163–166.
17. Walters ET, Carew TJ, Kandel ER (1981) Associative learning in Aplysia: Evidence for conditioned fear in an invertebrate. Science 211:504–506.
18. Spatz HC, Emanns A, Reichert H (1974) Associative learning of Drosophila melanogaster. Nature 248:359–361.
19. Rescorla RA (1988) Pavlovian conditioning. It’s not what you think it is. Am Psychol 43:151–160.
20. Holland PC, Sherwood A (2008) Formation of excitatory and inhibitory associations between absent events. J Exp Psychol Anim Behav Process 34:324–335.
21. Timberlake W (1994) Behavior systems, associationism, and Pavlovian conditioning. Psychon Bull Rev 1:405–420.
22. Pearce JM, Bouton ME (2001) Theories of associative learning in animals. Annu Rev Psychol 52:111–139.
23. Dickinson A (2012) Associative learning and animal cognition. Philos Trans R Soc Lond B Biol Sci 367:2733–2742.
e12350.
40. Cook R, Bird G, Catmur C, Press C, Heyes C (2014) Mirror neurons: From origin to function. Behav Brain Sci 37:177–192.
41. Galef BG (2013) Imitation and local enhancement: Detrimental effects of consensus definitions on analyses of social learning in animals. Behav Processes 100:123–130.
42. Heinrich B (1979) Bumblebee Economics (Harvard Univ Press, Cambridge, MA).
43. Dawson EH, Avarguès-Weber A, Chittka L, Leadbeater E (2013) Learning by observation emerges from simple associations in an insect model. Curr Biol 23:727–730.
44. Avarguès-Weber A, Chittka L (2014) Observational conditioning in flower choice copying by bumblebees (Bombus terrestris): Influence of observer distance and demonstrator movement. PLoS One 9:e88415.
45. Schultz W, Dickinson A (2000) Neuronal coding of prediction errors. Annu Rev Neurosci 23:473–500.
46. Rescorla R (1980) Pavlovian Second-Order Conditioning: Studies in Associative Learning (Lawrence Erlbaum Associates, Hillsdale, NJ).
47. Pavlov IP (1927) Conditioned Reflexes (Oxford Univ Press, Oxford).
48. Winterbauer NE, Balleine BW (2005) Motivational control of second-order conditioning. J Exp Psychol Anim Behav Process 31:334–340.
49. Chittka L, Leadbeater E (2005) Social learning: Public information in insects. Curr Biol 15:R869–R871.
50. Cook M, Mineka S (1989) Observational conditioning of fear to fear-relevant versus fear-irrelevant stimuli in rhesus monkeys. J Abnorm Psychol 98:448–459.
51. Mineka S, Cook M (1988) Social learning and the acquisition of snake fear in monkeys. Social Learning: Psychological and Biological Perspectives, eds Zentall TR, Galef BG (Lawrence Erlbaum Associates, Hillsdale, NJ), pp 51–75.
52. Mineka S, Cook M (1993) Mechanisms involved in the observational conditioning of fear. J Exp Psychol Gen 122:23–38.
53. Ohman A (2009) Of snakes and faces: An evolutionary perspective on the psychology of fear. Scand J Psychol 50:543–552.
54. Cook EW, 3rd, Hodes RL, Lang PJ (1986) Preparedness and phobia: Effects of stimulus content on human visceral conditioning. J Abnorm Psychol 95:195–207.
55. Tomarken AJ, Sutton SK, Mineka S (1995) Fear-relevant illusory correlations: What types of associations promote judgmental bias? J Abnorm Psychol 104:312–326.
56. Ohman A, Mineka S (2003) The malicious serpent: Snakes as a prototypical stimulus for an evolved module of fear. Curr Dir Psychol Sci 12:5–9.
57. Zentall TR, Galef BG, eds (1988) Social Learning: Psychological and Biological Perspectives (Lawrence Erlbaum Associates, Hillsdale, NJ).
58. Miller NE, Dollard J (1941) Social Learning and Imitation (Yale Univ Press, New Haven, CT).
59. Church RM (1968) Applications of behaviour theory to social psychology. Social Facilitation and Imitative Behavior, eds Simmel EC, Hoppe RA, Milton GD (Allyn & Bacon, Boston).
an indicator of safety in dangerous environments. Proc Biol Sci 281:20133174.
75. Annau Z, Kamin LJ (1961) Conditioned emotional response as a function of intensity of US. J Comp Physiol Psychol 54:428–432.
76. Estes WK, Skinner BF (1941) Some quantitative properties of anxiety. J Exp Psychol 29:390–400.
77. Ings TC, Chittka L (2008) Speed-accuracy tradeoffs and false alarms in bee responses to cryptic predators. Curr Biol 18:1520–1524.
78. Lenz F, Ings TC, Chittka L, Chechkin AV, Klages R (2012) Spatio-temporal dynamics of bumblebees foraging under predation risk. Phys Rev Lett 108:098103.
79. Smolla M, Alem S, Chittka L, Shultz S (2016) Copy-when-uncertain: Bumblebees rely on social information when rewards are highly variable. Biol Lett 12:20160188.
80. Shafir S, Wiegmann DD, Smith BH, Real LA (1999) Risk-sensitive foraging: Choice behaviour of honeybees in response to variability in volume of reward. Anim Behav 57:1055–1061.
81. Seefeldt S, De Marco RJ (2008) The response of the honeybee dance to uncertain rewards. J Exp Biol 211:3392–3400.
82. Lloyd Morgan C (1909) An Introduction to Comparative Psychology (Walter Scott Publishing, London).
83. Lefebvre L, Giraldeau L-A (1996) Is social learning an adaptive specialization? Social Learning in Animals: The Roots of Culture, eds Heyes CM, Galef BG, Jr (Academic, London), pp 107–128.
84. Reader SM, Hager Y, Laland KN (2011) The evolution of primate general and cultural intelligence. Philos Trans R Soc Lond B Biol Sci 366:1017–1027.
85. Reader SM (2003) Innovation and social learning: Individual variation and brain evolution. Animal Biology 53:147–158.
86. Templeton JJ, Kamil AC, Balda RP (1999) Sociality and social learning in two species of corvids: the pinyon jay (Gymnorhinus cyanocephalus) and the Clark’s nutcracker (Nucifraga columbiana). J Comp Psychol 113:450–455.
87. Leadbeater E (2015) What evolves in the evolution of social learning? J Zool (Lond) 295:4–11.
88. Heyes C (2016) Homo imitans? Seven reasons why imitation couldn’t possibly be associative. Philos Trans R Soc Lond B Biol Sci 371:20150069.
89. Brass M, Heyes C (2005) Imitation: Is cognitive neuroscience solving the correspondence problem? Trends Cogn Sci 9:489–495.
90. Nehaniv CL, Dautenhahn K (2002) The correspondence problem. Imitation in Animals and Artifacts, eds Dautenhahn K, Nehaniv CL (MIT Press, Cambridge, MA).
91. Catmur C, Walsh V, Heyes C (2009) Associative sequence learning: The role of experience in the development of imitation and the mirror system. Philos Trans R Soc Lond B Biol Sci 364:2369–2380.
92. Chittka L, Thomson JD (2001) Cognitive Ecology of Pollination (Cambridge Univ Press, Cambridge, UK).
Leadbeater and Dawson PNAS | July 25, 2017 | vol. 114 | no. 30 | 7845
Cultural macroevolution matters
Russell D. Gray (a,b,c,1) and Joseph Watts (a,d)
(a) Department of Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Jena D-07745, Germany; (b) School of Psychology, University of Auckland, Auckland 1142, New Zealand; (c) Research School of the Social Sciences, Australian National University, Canberra, ACT 2601, Australia; and (d) Department of Experimental Psychology, University of Oxford, Oxford OX1 3PH, United Kingdom
Edited by Andrew Whiten, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark May 29, 2017
(received for review January 16, 2017)
Evolutionary thinking can be applied to both cultural microevolution and macroevolution. However, much of the current literature focuses on cultural microevolution. In this article, we argue that the growing availability of large cross-cultural datasets facilitates the use of computational methods derived from evolutionary biology to answer broad-scale questions about the major transitions in human social organization. Biological methods can be extended to human cultural evolution. We illustrate this argument with examples drawn from our recent work on the roles of Big Gods and ritual human sacrifice in the evolution of large, stratified societies. These analyses show that, although the presence of Big Gods is correlated with the evolution of political complexity, in Austronesian cultures at least, they do not play a causal role in ratcheting up political complexity. In contrast, ritual human sacrifice does play a causal role in promoting and sustaining the evolution of stratified societies by maintaining and legitimizing the power of elites. We briefly discuss some common objections to the application of phylogenetic modeling to cultural evolution and argue that the use of these methods does not require a commitment to either gene-like cultural inheritance or to the view that cultures are like vertebrate species. We conclude that the careful application of these methods can substantially enhance the prospects of an evolutionary science of human history.

cultural evolution | macroevolution | phylogenetics | religion | Big Gods

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: R.D.G. and J.W. designed research, performed research, analyzed data, and wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. A.W. is a guest editor invited by the Editorial Board.
Data deposition: The data reported in this paper have been deposited in the D-PLACE (https://d-place.org/home), Pulotu (https://pulotu.shh.mpg.de), and ABVD (https://abvd.shh.mpg.de/austronesian/) databases.
(1) To whom correspondence should be addressed. Email: gray@shh.mpg.de.

Darwin’s On the Origin of Species ends with the poetic phrase, “From so simple a beginning endless forms most beautiful and most wonderful have been, and are being, evolved” (1). The central challenge for evolutionary biology is to explain this diversity of endless forms. Evolutionary biologists tackle this task by studying both microevolution (changes in gene frequency within a population) and macroevolution (changes between species over much longer time periods). The aim is to have a mechanistic understanding of the evolution of biological diversity that integrates microlevel processes and macrolevel patterns.
This work examines ways in which evolutionary thinking and methods can be extended into the realm of culture, extending the scope of biology to include questions that have traditionally been restricted to the humanities and social sciences. Human cultures also display a vast variety of most beautiful and most wonderful forms. We speak ∼7,000 different languages, engage in hundreds of different religious practices, build many different types of houses, exploit different resources for subsistence, use numerous different kinship systems, and abide by a striking array of marital, sexual, and child-rearing norms (2). The cultural processes that produce such striking cultural diversity must be explained. The field of cultural evolution is currently beginning to blossom (Fig. 1). There is a new cultural evolution society, a proposed journal, and an inaugural conference (3). However, with a few notable exceptions (4), much of the current work on cultural evolution focuses on microevolutionary processes. For example, in Dan Sperber’s influential book Explaining Culture: A Naturalistic Approach (5), cultural macroevolution rates only a passing mention on p. 2. More recently, in Lewens’ (6) otherwise masterful analysis of current work on cultural evolution, macroevolutionary phenomena again fail to feature. This article is a plea: a plea for the importance of studying cultural macroevolution. Although we completely understand the need for elegant empirical work and appropriate models of cultural change within populations, we should never forget that the large-scale patterns of diversity between cultures also cry out for evolutionary analyses and explanation. The macro really matters.

Big(ish) Data and Need for Computational Methods
It is a cliché these days to talk about big data transforming the social sciences. However, clichés can be true. Certainly, there are a growing number of global comparative cultural and linguistic databases, such as D-PLACE (2), DRH (7), WALS (8), ASJP (9), and Phoible (10), as well as relatively large regional databases, such as the Austronesian Basic Vocabulary Database (11), SAILS (12), Chirila (13), and Pulotu (14). Although these databases might not technically qualify as “big data,” they are large enough to afford the application of the type of sophisticated computational methods that are often used in the biological sciences, such as network analysis of reticulate evolution, epidemiological models, and phylogenetic comparative methods. These methods can be used to compare the relative importance of different factors in the distribution of traits, model the underlying dynamics of evolutionary change, and infer the history of traits. The combination of big(ish) data and computational methods has the potential to transform the social sciences and humanities by enabling powerful quantitative tests of hypotheses that would have previously only been analyzable in much more limited ways.
To illustrate the promise of this approach, we present a recent study by Botero et al. (15) titled, “The Ecology of Religious Beliefs,” in which the authors examined the global distribution of moralizing high gods (MHGs), supernatural beings who are claimed to have created or govern all reality, intervene in human affairs, and enforce or support human morality (sometimes referred to as “Big Gods”). These gods are central to the Abrahamic religions, which include the two largest religious families in the world today, Christianity and Islam. Scholars have debated the social and physical environments in which MHGs most readily spread, and previous studies found rather contradictory results, with resource scarcity both positively and negatively associated with a belief in an MHG (16–18). These studies were limited by the use of crude metrics of ecology or indirect measures of
macroevolution (23).
Ten thousand years ago, most humans lived in small, kin-
timodel inference approach was able to predict the global distribution of belief in MHGs in a separate sample of cultures with an accuracy of 91%.

Major Transitions: Big Questions for Big(ish) Data
John Maynard Smith and Eörs Szathmáry’s 1995 book, The Major Transitions in Evolution (23), is perhaps one of the most important and insightful contributions to evolutionary theory in the last 50 years. In this book, Maynard Smith and Szathmáry not only document fundamental changes in biological organization, such as the emergence of the genetic code, the origins of cells, the evolution of the eukaryotic cell, and multicellularity, but also show how these changes in biological organization change the way in which biological systems can evolve. The major transitions create entirely new evolutionary possibilities built upon new and more powerful ways of storing and transmitting information (24). According to Maynard Smith and Szathmáry (25), these transitions have at least five general properties:
1. Smaller entities form larger entities;
2. Smaller entities become differentiated as part of the larger entity;
3. The smaller entities are often unable to replicate in the absence of the larger entity;
4. The smaller entities can sometimes disrupt the development of the larger entity; and
5. New ways of transmitting information arise.

One reason societies were able to develop cultural complexity in the first place is partly on account of the cooperative benefits attained through a belief in moralizing gods (35).
In support of the Supernatural Punishment Hypothesis, a number of cross-cultural studies have shown that belief in MHGs is positively correlated with a range of measures of social complexity, such as political hierarchy, agriculture, and taxation systems (18, 35). On the face of it, the cross-cultural evidence for the Supernatural Punishment Hypothesis appears compelling. However, these studies have a number of important limitations (37). First, these studies do not actually get at the direction of causality. Although one possible explanation for these results is that MHGs facilitate social complexity, another is that social complexity makes cultures more likely to adopt MHGs. Second, these studies are either based on a single dataset called the Ethnographic Atlas or a subset of this dataset known as the Standard Cross-Cultural Sample (17, 18, 35, 38). The MHGs in these datasets are almost all derived from the closely related family of Abrahamic religions: Christianity, Judaism, and Islam (37). These religions share a wide range of features, such as providing a universal rather than ethnocentric doctrine and encouraging fertility, and it is not clear whether it is an MHG specifically or some other part of these religions that is related to social complexity (37, 39). Third, cultures often inherit traits such as language, customs, oral traditions, and social norms from their ancestors (19). These relationships between cultures mean that cultures cannot be treated as statistically independent, a problem famously first pointed out by Francis Galton (40, 41).
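
Galton's Problem can be made concrete with a toy example. The sketch below is illustrative only: the tree, the culture labels (A1 to B4), and the trait codings are hypothetical, and Fitch parsimony is swapped in as a simple stand-in for the likelihood-based and Bayesian methods used in the studies discussed here. It contrasts a naive tip-level count, which treats eight related cultures as eight independent data points, with the minimum number of evolutionary changes needed to explain the same data on the tree.

```python
# Hypothetical rooted binary tree of eight cultures, written as nested tuples.
CLADE_A = (("A1", "A2"), ("A3", "A4"))
CLADE_B = (("B1", "B2"), ("B3", "B4"))
TREE = (CLADE_A, CLADE_B)

# Hypothetical binary traits: 1 = present, 0 = absent.
moralizing_gods = {"A1": 1, "A2": 1, "A3": 1, "A4": 1,
                   "B1": 0, "B2": 0, "B3": 0, "B4": 0}
social_complexity = dict(moralizing_gods)  # perfectly matched at the tips


def fitch_changes(tree, tip_states):
    """Minimum number of state changes on a rooted binary tree (Fitch parsimony)."""
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # a tip: its state set is its observed state
            return {tip_states[node]}
        left, right = node
        left_set, right_set = visit(left), visit(right)
        shared = left_set & right_set
        if shared:                         # children agree: no change needed here
            return shared
        changes += 1                       # children disagree: one change is required
        return left_set | right_set

    visit(tree)
    return changes


# Naive view: eight cultures, eight matching observations.
naive_matches = sum(moralizing_gods[c] == social_complexity[c] for c in moralizing_gods)

# Phylogenetic view: each trait needs only one change on the tree.
gods_changes = fitch_changes(TREE, moralizing_gods)
complexity_changes = fitch_changes(TREE, social_complexity)
```

In this toy case the eight matching tips reduce to a single change in each trait on the branch separating the two clades, so the apparent eight-fold replication is really one coincidence of events. That is the sense in which shared ancestry inflates cross-cultural correlations.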
Gray and Watts PNAS | July 25, 2017 | vol. 114 | no. 30 | 7847
The studies mentioned above do not adequately account for Galton’s Problem, so the correlation observed between the presence of MHGs and social complexity might merely arise because of the historical relationships between cultures (42). Thus, to rigorously test hypotheses about the role of MHGs in driving the major transitions in human history, we need data from cultures with non-Abrahamic religions, as well as methods that avoid Galton’s Problem and can explicitly test causal predictions.
Phylogenetic methods have revolutionized the field of evolutionary biology (43). These methods solve Galton’s Problem by explicitly estimating ancestral state changes on phylogenetic trees (19, 41). Thus, there is no overcounting or undercounting of evolutionary events. Phylogenetic methods have recently been used to make inferences about things such as the ancestral state of postmarital residence patterns in Austronesian cultures (44), the evolution of political complexity (45), the effects of cultural ancestry on deforestation (46), and the links between cattle and matrilinity (47). Mark Pagel and Andrew Meade introduced a method called “Discrete” in the program BayesTraits that models the evolution of two binary traits and tests between dependent and independent models of evolution (48, 49). In an independent model, the gains and losses of each trait are modeled separately from each other (Fig. 2A). In the dependent model, the rate at which a trait is gained or lost depends on the state of the other trait, as would be expected if there is a causal relationship between traits (Fig. 2 B and C). This approach gets at the direction of causality by inferring the temporal order in which traits tend to arise and the effects they have on one another (49).
Using this approach and data from the Pulotu database (14, 49), we recently tested a series of hypotheses about the role of religion in the emergence of social complexity (14, 50, 51). The Pulotu database contains quantitative variables documenting the traditional religious beliefs and practices as well as the social organization of 106 Austronesian cultures. Special care was taken in the coding of these data to ensure that, as far as possible, the coding reflected the state of the culture before conversion to major world religions and colonization (14). Previously published language-based phylogenies were used as a proxy for the population history of these cultures (52). These trees fit remarkably well with archaeological evidence that shows Austronesian-speaking cultures were some of the greatest ocean voyagers in human history, sailing from their homeland in Taiwan to settle on islands ranging in size from the 0.4-km² island of Anuta up to the 785,000-km² continental island of New Guinea (14, 53, 54). The archaeological, genetic, and linguistic evidence suggests that this expansion started ∼5,000 y ago and spread in a series of expansion pulses and pauses through Island South East Asia and the Pacific (52–55). The cultures that evolved on these islands ranged from small kin-based groups, such as the Berawan (56), up to federated kingdoms, such as Southern Toraja (57). Population sizes ranged from ∼200 people on Anuta (58) to approximately half a million people in the case of the Merina of Madagascar (59). No less diverse were their religious systems, with supernatural beliefs including anthropomorphic, animistic, and nature deities, and religious rituals ranging in scale from humble personal offerings to multiday community-wide festivals (14). Because Austronesian cultures were some of the last cultures in the world to have contact with major world religions, and their traditional beliefs were well documented, they provide an ideal sample for testing theories about the role of religion in the emergence of social complexity.
We ran two series of analyses to test the Supernatural Punishment Hypothesis (50). In the first, we tested the effect of Broad Supernatural Punishment on the evolution of political complexity. Agents counted in this test included a wide range of punishing and morally concerned supernatural agents, such as ancestral spirits, natural spirits (e.g., forest and sky gods), and mythical heroes, in addition to MHGs (14). Belief in Broad Supernatural Punishment was found in just over two-thirds of the cultures sampled. We found modest support for the coevolution of Broad Supernatural Punishment and political complexity, with Broad Supernatural Punishment facilitating the rise of political complexity, but not helping to sustain it. In the second series of analyses, we tested whether the specific belief in MHGs coevolved with political complexity. We were surprised to find evidence of MHGs in just 6 of the 96 traditional Austronesian cultures we studied. Although our analyses suggested that MHGs coevolved with political complexity, instead of MHGs driving political complexity, our results indicated that MHGs tended to be gained after political complexity had already emerged (Fig. 2C). Our analyses suggested that these MHGs had been gained only recently, and most of these MHGs occurred in regions where there had been early contact with Muslim traders. Although we
Fig. 2. An independent model (A) of evolution alongside the dependent model predicted by the Supernatural Punishment Hypothesis (B) and the dependent model resulting from analyses of traditional Austronesian cultures (C). The red figure represents the presence of an MHG, and the black figure represents the presence of political complexity (PC). Arrows indicate the rates of change between states, and the width of the arrows is proportional to the size of the transition rates. (A) In independent models of evolution, the rate at which each trait is gained or lost is independent of the state of the other trait. In this example, cultures are more likely to gain PC than to lose it (rate c is lower than rate d). (B) In dependent models of evolution, the rate at which each trait is gained and lost can be dependent on the state of the other. In the model predicted by the Supernatural Punishment Hypothesis, the rate at which PC is gained is higher when an MHG is present (rate d) than when it is absent (rate b), and the rate at which PC is lost is lower when an MHG is present (rate g) than when an MHG is absent (rate e). (C) The resulting models from our analyses suggested that MHGs had little effect on the gain and loss of PC, but that MHGs were rarely gained in cultures without PC (rate a is lower than rate f).
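
The dependent and independent models can be written as rate matrices of a continuous-time Markov chain over the four combinations of two binary traits. The following is a minimal sketch under stated assumptions, not the BayesTraits implementation: the function names and all numerical rates are hypothetical, and the assignment of the letters a to h to transitions is inferred from the descriptions of panels B and C and from the legend of Fig. 3. The independent model is obtained by constraining the dependent one so that each trait's gain and loss rates ignore the state of the other trait.

```python
import math

# States are (trait1, trait2), e.g. (MHG, political complexity); 0 = absent, 1 = present.
STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]
INDEX = {state: i for i, state in enumerate(STATES)}


def dependent_q(a, b, c, d, e, f, g, h):
    """Rate matrix for the dependent model; only one trait changes at a time."""
    rates = {
        ((0, 0), (1, 0)): a,  # gain trait 1 while trait 2 is absent
        ((0, 0), (0, 1)): b,  # gain trait 2 while trait 1 is absent
        ((1, 0), (0, 0)): c,  # lose trait 1 while trait 2 is absent
        ((1, 0), (1, 1)): d,  # gain trait 2 while trait 1 is present
        ((0, 1), (0, 0)): e,  # lose trait 2 while trait 1 is absent
        ((0, 1), (1, 1)): f,  # gain trait 1 while trait 2 is present
        ((1, 1), (1, 0)): g,  # lose trait 2 while trait 1 is present
        ((1, 1), (0, 1)): h,  # lose trait 1 while trait 2 is present
    }
    q = [[0.0] * 4 for _ in range(4)]
    for (src, dst), rate in rates.items():
        q[INDEX[src]][INDEX[dst]] = float(rate)
    for i in range(4):
        q[i][i] = -sum(q[i])  # diagonal entry makes each row sum to zero
    return q


def independent_q(gain1, loss1, gain2, loss2):
    """Independent model: the dependent model with a = f, c = h, b = d, e = g."""
    return dependent_q(a=gain1, f=gain1, c=loss1, h=loss1,
                       b=gain2, d=gain2, e=loss2, g=loss2)


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_probs(q, t, squarings=10, terms=20):
    """P(t) = exp(Q t), using a Taylor series on a scaled matrix plus repeated squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    m = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


def two_state_gain_prob(gain, loss, t):
    """Closed-form P(0 -> 1 by time t) for a single binary trait, used as a check."""
    return gain / (gain + loss) * (1.0 - math.exp(-(gain + loss) * t))


# Hypothetical rates, chosen only for illustration.
q_ind = independent_q(gain1=0.3, loss1=0.1, gain2=0.5, loss2=0.2)
p_ind = transition_probs(q_ind, t=2.0)

# A pattern like panel C: trait 1 is rarely gained unless trait 2 is present (a < f),
# while trait 2 is gained and lost at the same rates whatever the state of trait 1.
q_dep = dependent_q(a=0.01, b=0.5, c=0.1, d=0.5, e=0.2, f=0.3, g=0.2, h=0.1)
p_dep = transition_probs(q_dep, t=2.0)
trait1_given_pc = p_dep[INDEX[(0, 1)]][INDEX[(1, 0)]] + p_dep[INDEX[(0, 1)]][INDEX[(1, 1)]]
trait1_without_pc = p_dep[INDEX[(0, 0)]][INDEX[(1, 0)]] + p_dep[INDEX[(0, 0)]][INDEX[(1, 1)]]
```

Under the independent model, the probability of moving from (0, 0) to (1, 1) factorizes into the product of the two single-trait probabilities, which gives a simple sanity check on the matrix construction. In an actual analysis the rates would be estimated from the tip data across a sample of trees and the fit of the two models compared; that step is not attempted here.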
forms of hierarchical structuring to emerge in human history (26). We found human sacrifice to have been remarkably common in traditional cultures, occurring in almost half of those sampled (51). Typically, social elites orchestrated the sacrifices, with social underclasses becoming the victims. The results of our analyses showed that human sacrifice coevolved with social stratification and functioned to stabilize social inequality in general, as well as to facilitate the emergence of rigid class systems (Fig. 3). This result does not imply that human sacrifice was necessarily functional for the whole group, nor that it would have these effects in modern societies, which have developed more sophisticated methods of sustaining social inequality. What our results do show is that ritual human sacrifice was used by social elites as a tool to maintain their social standing in the early stages of social complexity.

Overextension of Biological Metaphors and Methods?
The famous evolutionary biologist Richard Lewontin often liked to cite Rosenblueth and Wiener’s quip that, “The price of metaphor is eternal vigilance” (69). One of the things that Lewontin is particularly skeptical about is the metaphorical extension of evolutionary ideas to cultural history (70, 71). Part of this skepticism is driven by his opposition to Dawkins’ meme concept (72). Fracchia and Lewontin write (70):

But, unlike genes, memes are not entities with an existence independent of the theory. They are a mental construct whose only defined property is to fill in the gap in an elaborate metaphor.

However, Lewontin’s own three central principles for systems to evolve by natural selection (phenotypic variation, differential

horizontal transmission.
4. The accuracy of cultural phylogenies has not been validated.

The first claim displays a shocking lack of knowledge of biology and human culture. There is a great deal of biology that does not fit tidily on the “tree of life.” Indeed, the tree of life has been mocked as the “tree of 1%” (77). A very significant amount of cross-lineage transfer occurs in biological evolution, especially in microbes (78). Mallet (79) estimated that there is hybridization in ∼10% of animal and 25% of plant species. Dagan and Martin’s (80) analysis of 190 prokaryotic genomes suggests that horizontal gene transfer has affected at least two-thirds of >57,000 gene families.
In the literature on cultural microevolution, there is evidence that the majority of social learning occurs between members of the same population, but the relative importance of parent-to-offspring and peer-to-peer social learning is debated (81–84). What matters for the application of phylogenetic methods are the resulting macroevolutionary patterns. Given that social learning occurs predominantly within a population, both peer-to-peer and parent-to-offspring learning can result in vertical transmission at the macroevolutionary level. The relative importance of vertical and horizontal transmission between populations is likely to vary across domains of culture, world regions, and periods of history. For example, the design of the internal combustion engine has been borrowed between cultural lineages. Conversely, basic vocabulary items, such as terms for hand and eye, lower numerals, and kinship terms show clear evidence of vertical transmission down cultural lineages (85).
[Fig. 3 graphic. (A) Tree of Austronesian cultures, with tips running from Atayal to Tahiti. (B) Dependent-model transition rates between human sacrifice (HS) and social stratification (SS): a and f = gain of HS; c and h = loss of HS; b and d = gain of SS; e and g = loss of SS.]
Fig. 3. (A) Ancestral state reconstruction of human sacrifice and social stratification on a maximum clade credibility consensus tree of 93 Austronesian
languages. The circles at the tips of the tree represent the known traditional states of cultures, and the circles found across the nodes of the tree represent the
state of prehistoric cultures inferred by a Markov chain Monte Carlo analysis in BayesTraits. In the analysis, 4,200 of the most likely possible trees were used,
and the consensus tree is a summary of these trees for illustrative purposes. The gray at each of the internal nodes represents the proportion of trees sampled
without this node and provides an indication of phylogenetic uncertainty. (B) The resulting dependent model shows that cultures with ritualized human
sacrifice were less likely to lose social stratification than those that lacked human sacrifice (rate g is lower than rate e). Adapted from ref. 51.
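As a rough illustration of what the dependent model in Fig. 3B encodes, the eight rates can be arranged in a continuous-time Markov rate matrix over the four combinations of human sacrifice (HS) and social stratification (SS). The sketch below is not the analysis from ref. 51. Only the roles of rates g and e follow from the caption; the background state assigned to the other six rates, the rule that one trait changes at a time, and every numeric value are assumptions made for illustration.

```python
# States are (HS, SS) pairs: 1 = trait present, 0 = trait absent.
STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Which rate label governs which single-trait change.
# Only "e" and "g" are fixed by the caption; the rest are assumed.
TRANSITIONS = {
    ((0, 0), (1, 0)): "a",  # gain of HS without SS (assumed)
    ((0, 0), (0, 1)): "b",  # gain of SS without HS (assumed)
    ((1, 0), (0, 0)): "c",  # loss of HS without SS (assumed)
    ((1, 0), (1, 1)): "d",  # gain of SS with HS (assumed)
    ((0, 1), (0, 0)): "e",  # loss of SS without HS (from caption)
    ((0, 1), (1, 1)): "f",  # gain of HS with SS (assumed)
    ((1, 1), (1, 0)): "g",  # loss of SS with HS (from caption)
    ((1, 1), (0, 1)): "h",  # loss of HS with SS (assumed)
}

# Hypothetical values, chosen only so that g < e as the caption reports.
HYPOTHETICAL_RATES = {
    "a": 0.2, "b": 0.3, "c": 0.4, "d": 0.5,
    "e": 0.6, "f": 0.2, "g": 0.1, "h": 0.4,
}


def build_rate_matrix(rates):
    """Return the 4x4 rate matrix Q as a list of rows."""
    n = len(STATES)
    q = [[0.0] * n for _ in range(n)]
    for (src, dst), label in TRANSITIONS.items():
        q[STATES.index(src)][STATES.index(dst)] = rates[label]
    for i in range(n):
        # Diagonal is still zero here, so this is minus the off-diagonal sum.
        q[i][i] = -sum(q[i])
    return q


if __name__ == "__main__":
    for state, row in zip(STATES, build_rate_matrix(HYPOTHETICAL_RATES)):
        print(state, row)
```

Entries for simultaneous changes in both traits, such as (0, 0) to (1, 1), stay at zero. Comparing the g and e entries is the contrast the caption describes.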
The second claim is more sensible, but does not undermine the use of phylogenetic methods. If anything, it points out an important role for their use: to assess the coherence of different aspects of culture. In their book chapter, “Are Cultural Phylogenies Possible?”, Boyd et al. (86) describe a range of positions along a continuum on the question of how integrated cultural histories are: (i) Cultures are tightly integrated like vertebrate species; (ii) cultures contain a core of traditions that are tightly linked and vertically transmitted, with peripheral aspects that are less cohesive and marked by frequent borrowing; (iii) cultures contain some aspects that are bound together, but there are no core traditions; and (iv) cultures are collections of ephemeral entities. Just as biologists talk about “every gene having its own history,” and have developed methods to map these gene genealogies on to a species phylogeny, so cultural phylogeneticists could construct trees for different aspects of culture and evaluate their fit with population history (87, 88). For example, genealogies of religious beliefs, material culture, kinship systems, music genres, and styles of art could be mapped and compared with language-based cultural histories. Phylogenetic methods make the traditional social science debate about the extent to which a culture is an integrated whole testable.

The third objection, that the estimation of cultural phylogenies will be biased by horizontal transmission, is a quantitative issue that can be evaluated by simulation modeling. Greenhill et al. (89) simulated language phylogenies with different tree topologies, different borrowing scenarios, and different levels of borrowing. The results show that tree topologies constructed with Bayesian phylogenetic methods are robust to realistic levels of borrowing. Inferences about divergence dates were slightly less robust and showed a tendency to underestimate dates.

The final objection (have inferences from cultural phylogenetics been validated?) is a fair enough concern, but one that applies to much of computational biology, and indeed to the extrapolation of laboratory studies to the field. In brief, we will point out that the Austronesian language phylogenies built from basic vocabulary fit strikingly well with both archaeological (55) and recent genetic data (90, 91), both in terms of the sequence and the timing of the Austronesian expansion.

Conclusion
In the coming years, more quantitative phylogenies for the major language families will be published, and the number and richness of comparative cultural databases will undoubtedly grow (7, 92).
1. Darwin C (1872) On the Origin of Species by Means of Natural Selection (John Murray, London), 6th Ed.
2. Kirby KR, et al. (2016) D-PLACE: A global database of cultural, linguistic and environmental diversity. PLoS One 11:e0158391.
3. The Evolution Institute (2016) A New Society for the Study of Cultural Evolution. Available at https://evolution-institute.org/project/society-for-the-study-of-cultural-evolution/. Accessed January 3, 2017.
4. Mesoudi A (2011) Cultural Evolution: How Darwinian Theory Can Explain Human Culture and Synthesize the Social Sciences (Univ of Chicago Press, Chicago).
5. Sperber D (1996) Explaining Culture: A Naturalistic Approach (Blackwell, Oxford).
6. Lewens T (2015) Cultural Evolution: Conceptual Challenges (Oxford Univ Press, Oxford).
7. Slingerland E, Sullivan B (2017) Durkheim with data: The Database of Religious History. J Am Acad Relig 85:312–347.
8. Haspelmath M (2005) The World Atlas of Language Structures (Oxford Univ Press, Oxford).
9. Wichmann S, Holman EW, Brown CH (2016) The ASJP Database. Version 17. Available at asjp.clld.org/. Accessed January 3, 2017.
10. Moran S, McCloy D, Wright R (2014) PHOIBLE Online (Max Planck Institute for Evolutionary Anthropology, Leipzig).
11. Greenhill SJ, Blust R, Gray RD (2008) The Austronesian Basic Vocabulary Database: From bioinformatics to lexomics. Evol Bioinform Online 4:271–283.
12. Muysken P, et al. (2016) South American Indigenous Language Structures (SAILS) Online (Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany). Available at sails.clld.org. Accessed January 3, 2017.
13. Bowern C (2016) Chirila: Contemporary and Historical Resources for the Indigenous Languages of Australia. Lang Doc Conserv 10:1–44.
14. Watts J, et al. (2015) Pulotu: Database of Austronesian supernatural beliefs and practices. PLoS One 10:e0136783.
15. Botero CA, et al. (2014) The ecology of religious beliefs. Proc Natl Acad Sci USA 111:16784–16789.
16. Snarey J (1996) The natural environment’s impact upon religious ethics: A cross-cultural study. J Sci Study Relig 35:85–96.
17. Brown C, Eff EA (2010) The state and the supernatural: Support for prosocial behavior. Struct Dyn 4:1–21.
18. Roes FL, Raymond M (2003) Belief in moralizing gods. Evol Hum Behav 24:126–135.
19. Mace R, Pagel M (1994) The comparative method in anthropology. Curr Anthropol 35:549–564.
20. R Core Team (2015) R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna).
21. Bates D, Maechler M, Bolker B, Walker S (2015) Package lme4. J Stat Softw 67:1–91.
22. Barton K (2015) MuMIn: Multi-model inference. R package, Version 1.15.1. Available at r-forge.r-project.org/projects/mumin/. Accessed January 3, 2017.
23. Maynard Smith J, Szathmáry E (1995) The Major Transitions in Evolution (Oxford Univ Press, Oxford).
24. Calcott B, Sterelny K (2011) A big picture of big pictures of life’s history. The Major Transitions in Evolution Revisited, eds Calcott B, Sterelny K (MIT Press, Cambridge, MA).
25. Szathmáry E, Smith JM (1995) The major evolutionary transitions. Nature 374:227–232.
26. Flannery K, Marcus J (2012) The Creation of Inequality: How our Prehistoric Ancestors Set the Stage for Monarchy, Slavery, and Empire (Harvard Univ Press, Cambridge, MA).
27. Gintis H, Bowles S, Boyd R, Fehr E (2003) Explaining altruistic behavior in humans. Evol Hum Behav 24:153–172.
28. Durkheim E (1915) The Elementary Forms of the Religious Life (Allen & Unwin, London).
29. Sosis R (2009) The adaptationist-byproduct debate on the evolution of religion: Five misunderstandings of the adaptationist program. J Cogn Cult 9:315–332.
30. Bulbulia J (2004) The cognitive and evolutionary psychology of religion. Biol Philos 18:655–686.
31. Wiebe D (2008) Does talk about the evolution of religion make sense? Evolution of Religion: Studies, Theories and Critiques, eds Bulbulia J, et al. (Collins Foundation, Santa Margarita, CA), pp 339–346.
32. Johnson DD, Krüger O (2004) The good of wrath: Supernatural punishment and the evolution of cooperation. Polit Theol 5:159–176.
33. Norenzayan A (2013) Big Gods: How Religion Transformed Cooperation and Conflict (Princeton Univ Press, Princeton).
34. Schloss JP, Murray MJ (2011) Evolutionary accounts of belief in supernatural punishment: A critical review. Religion Brain Behav 1:46–99.
35. Johnson DDP (2005) God’s punishment and public goods: A test of the supernatural punishment hypothesis in 186 world cultures. Hum Nat 16:410–446.
36. Shariff AF, Norenzayan A, Henrich J (2011) The birth of high gods: How the cultural evolution of supernatural policing influenced the emergence of complex, cooperative human societies, paving the way for civilization. Evolution, Culture, and the Human Mind, eds Schaller M, Norenzayan A, Heine SJ, Yamagishi T, Kameda T (Psychology, New York), pp 119–136.
37. Atkinson Q, Latham A, Watts J (2015) Are Big Gods a big deal in the emergence of big groups? Religion Brain Behav 5:266–274.
38. Peoples HC, Marlowe FW (2012) Subsistence and the evolution of religion. Hum Nat 23:253–269.
39. Watts J, Bulbulia J, Gray RD, Atkinson QD (2016) Clarity and causality needed in claims about Big Gods. Behav Brain Sci 39:41–42.
40. Jordan FM (2013) Comparative phylogenetic methods and the study of pattern and process in kinship. Kinship Systems: Change and Reconstruction, eds McConvell P, Keen I, Hendery R (Univ of Utah Press, Salt Lake City), pp 43–58.
41. Mace R, Jordan F, Holden C (2003) Testing evolutionary hypotheses about human biological adaptation using cross-cultural comparison. Comp Biochem Physiol A Mol Integr Physiol 136:85–94.
42. Dow M, Eff E (2008) Global, regional, and local network autocorrelation in the standard cross-cultural sample. Cross-Cultural Res 42:148–171.
43. Freckleton RP, Harvey PH, Pagel M (2002) Phylogenetic analysis and comparative data: A test and review of evidence. Am Naturalist 160:712–726.
44. Fortunato L, Jordan F (2010) Your place or mine? A phylogenetic comparative analysis of marital residence in Indo-European and Austronesian societies. Philos Trans R Soc Lond B Biol Sci 365:3913–3922.
45. Currie TE, Greenhill SJ, Gray RD, Hasegawa T, Mace R (2010) Rise and fall of political complexity in island South-East Asia and the Pacific. Nature 467:801–804.
46. Atkinson QD, Coomber T, Passmore S, Greenhill SJ, Kushnick G (2016) Cultural and environmental predictors of pre-European deforestation on Pacific Islands. PLoS One 11:e0156340.
47. Holden CJ, Mace R (2003) Spread of cattle led to the loss of matrilineal descent in Africa: A coevolutionary analysis. Proc Biol Sci 270:2425–2433.
48. Pagel M (1994) Detecting correlated evolution on phylogenies: A general method for the comparative analysis of discrete characters. Proc Biol Sci 255:37–45.
49. Pagel M, Meade A (2006) Bayesian analysis of correlated evolution of discrete characters by reversible-jump Markov chain Monte Carlo. Am Nat 167:808–825.
50. Watts J, et al. (2015) Broad supernatural punishment but not moralising high gods precede the evolution of political complexity in Austronesia. Proc R Soc B Biol Sci 282:20142556.
51. Watts J, Sheehan O, Atkinson QD, Bulbulia J, Gray RD (2016) Ritual human sacrifice promoted and sustained the evolution of stratified societies. Nature 532:228–231.
52. Gray RD, Drummond AJ, Greenhill SJ (2009) Language phylogenies reveal expansion pulses and pauses in Pacific settlement. Science 323:479–483.
53. Kirch PV, Green RC (2001) Hawaiki, Ancestral Polynesia: An Essay in Historical Anthropology (Cambridge Univ Press, Cambridge, UK).
54. Ko AM, et al. (2014) Early Austronesians: Into and out of Taiwan. Am J Hum Genet 94:426–436.
55. Wilmshurst JM, Hunt TL, Lipo CP, Anderson AJ (2011) High-precision radiocarbon dating shows recent and rapid initial human colonization of East Polynesia. Proc Natl Acad Sci USA 108:1815–1820.
56. Huntington R, Metcalf P (1979) Celebrations of Death: The Anthropology of Mortuary Ritual (Cambridge Univ Press, Cambridge, UK).
57. Nooy-Palm H (1979) The Sa’dan-Toraja: A Study of Their Social Life and Religion (Martinus Nijhoff, The Hague).
58. Feinberg R (1991) Anuta. Oceania, Encyclopedia of World Cultures, ed Hays TE (G. K. Hall, New York), Vol II, pp 13–16.
59. Campbell G (1991) The state and pre-colonial demographic history: The case of late Nineteenth-Century Madagascar. J Afr Hist 32:425–445.
60. Buck PH (1952) The Coming of the Maori (Human Relations Area Files Press, New Haven, CT).
61. Shariff AF, Willard AK, Andersen T, Norenzayan A (2016) Religious priming: A meta-analysis with a focus on prosociality. Pers Soc Psychol Rev 20:27–48.
62. Marx K, Engels F (1975) Karl Marx and Friedrich Engels: Collected Works (International, New York).
63. Cronk L (1994) Evolutionary theories of morality and the manipulative use of signals. Zygon 29:81–101.
64. Carrasco D (1999) City of Sacrifice (Beacon, Boston).
65. Bremmer JN (2007) The Strange World of Human Sacrifice (Peeters, Leuven, Belgium).
66. Turner CG, Turner JA (1999) Man Corn: Cannibalism and Violence in the Prehistoric American Southwest (Univ of Utah Press, Salt Lake City).
67. Girard R (1987) Violent origins: Ritual killing and cultural formation. Violent Origins, eds Hamerton-Kelly R, Burkert W, Girard R, Smith J (Stanford Univ Press, Stanford, CA), pp 73–105.
68. Winkelman M (2014) Political and demographic-ecological determinants of institutionalised human sacrifice. Anthropol Forum 24:47–70.
69. Lewontin RC (2001) In the beginning was the word. Science 291:1263–1264.
70. Fracchia J, Lewontin RC (2005) The price of metaphor. Hist Theory 44:14–29.
71. Fracchia J, Lewontin RC (1999) Does culture evolve? Hist Theory 38:52–78.
72. Dawkins R (1976) The Selfish Gene (Oxford Univ Press, Oxford).
73. Lewontin RC (1970) The units of selection. Annu Rev Ecol Syst 1:1–18.
74. Henrich J, Boyd R (2002) On modeling cognition and culture: Why cultural evolution does not require replication of representations. J Cogn Cult 2:87–112.
75. Gould SJ (2010) An Urchin in the Storm: Essays About Books and Ideas (W. W. Norton, New York).
76. Norenzayan A, et al. (2016) The cultural evolution of prosocial religions. Behav Brain Sci 39:e1.
77. Dagan T, Martin W (2006) The tree of one percent. Genome Biol 7:118.
78. Shapiro JA (2016) Nothing in evolution makes sense except in the light of genomics: Read-write genome evolution as an active biological process. Biology (Basel) 5:E27.
79. Mallet J (2005) Hybridization as an invasion of the genome. Trends Ecol Evol 20:229–237.
80. Dagan T, Martin W (2007) Ancestral genome sizes specify the minimum rate of lateral gene transfer during prokaryote evolution. Proc Natl Acad Sci USA 104:870–875.
81. Tehrani JJ, Collard M (2009) On the relationship between interindividual cultural transmission and population-level cultural diversity: A case study of weaving in Iranian tribal population. Evol Hum Behav 30:286–300.
82. Hewlett BS, Cavalli-Sforza LL (1986) Cultural transmission among Aka Pygmies. Am Anthropol 88:922–934.
83. Henrich J, Broesch J (2011) On the nature of cultural transmission networks: Evidence from Fijian villages for adaptive learning biases. Philos Trans R Soc Lond B Biol Sci 366:1139–1148.
84. Aunger R (2000) The life history of culture learning in a face-to-face society. Ethos 28:445–481.
85. Haspelmath M, Tadmor U (2009) World Loanword Database (WOLD) (Max Planck Digital Library, Leipzig, Germany).
86. Boyd R, Borgerhoff-Mulder M, Durham WH, Richerson PJ (1997) Are cultural phylogenies possible? Human by Nature: Between Biology and the Social Sciences, eds Weingart P, Richerson P, Mitchell S, Maasen S (Lawrence Erlbaum Associates, Mahwah, NJ), pp 355–386.
87. Gray RD, Greenhill SJ, Ross RM (2007) The pleasures and perils of Darwinizing culture (with phylogenies). Biol Theory 2:360–375.
88. Gray RD, Bryant D, Greenhill SJ (2010) On the shape and fabric of human history. Philos Trans R Soc Lond B Biol Sci 365:3923–3933.
89. Greenhill SJ, Currie TE, Gray RD (2009) Does horizontal transmission invalidate cultural phylogenies? Proc R Soc B Biol Sci 276:2299–2306.
90. Lipson M, et al. (2014) Reconstructing Austronesian population history in Island Southeast Asia. Nat Commun 5:4689.
91. Lind J, Lindenfors P, Ghirlanda S, Lidén K, Enquist M (2013) Dating human cultural capacity using phylogenetic principles. Sci Rep 3:1785.
92. Turchin P, et al. (2015) Seshat: The global history databank. Cliodynamics J Quant Hist Cult Evol 6(1).
93. Currie TE, Meade A, Guillon M, Mace R (2013) Cultural phylogeography of the Bantu Languages of sub-Saharan Africa. Proc R Soc London B Biol Sci 280:20130695.
94. Gray RD, Atkinson QD, Greenhill SJ (2011) Language evolution and human history: What a difference a date makes. Philos Trans R Soc Lond B Biol Sci 366:1090–1100.
95. Matthews LJ, Passmore S, Richard PM, Gray RD, Atkinson QD (2016) Shared cultural history as a predictor of political and economic changes among nation states. PLoS One 11:e0152979.
Edited by Marcus W. Feldman, Stanford University, Stanford, CA, and approved April 10, 2017 (received for review January 11, 2017)
In the past few decades, scholars from several disciplines have pursued the curious parallel noted by Darwin between the genetic evolution of species and the cultural evolution of beliefs, skills, knowledge, languages, institutions, and other forms of socially transmitted information. Here, I review current progress in the pursuit of an evolutionary science of culture that is grounded in both biological and evolutionary theory, but also treats culture as more than a proximate mechanism that is directly controlled by genes. Both genetic and cultural evolution can be described as systems of inherited variation that change over time in response to processes such as selection, migration, and drift. Appropriate differences between genetic and cultural change are taken seriously, such as the possibility in the latter of nonrandomly guided variation or transformation, blending inheritance, and one-to-many transmission. The foundation of cultural evolution was laid in the late 20th century with population-genetic style models of cultural microevolution, and the use of phylogenetic methods to reconstruct cultural macroevolution. Since then, there have been major efforts to understand the sociocognitive mechanisms underlying cumulative cultural evolution, the consequences of demography on cultural evolution, the empirical validity of assumed social learning biases, the relative role of transformative and selective processes, and the use of quantitative phylogenetic and multilevel selection models to understand past and present dynamics of society-level change. I conclude by highlighting the interdisciplinary challenges of studying cultural evolution, including its relation to the traditional social sciences and humanities.

cultural evolution | cumulative culture | gene–culture coevolution | human evolution | social learning

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: A.M. wrote the paper.
The author declares no conflict of interest.
This article is a PNAS Direct Submission.
Email: a.mesoudi@exeter.ac.uk.

The formation of different languages and of distinct species, and the proofs that both have been developed through a gradual process, are curiously parallel. . .
Charles Darwin, The Descent of Man, p 90

This quote from Charles Darwin (1) draws a parallel between, on the one hand, the genetic evolution of species, and on the other, cultural change (i.e., changes in socially learned information, such as beliefs, knowledge, tools, technology, attitudes, norms, and, as Darwin mentions, languages). This idea is the basic premise of cultural evolution: Cultural change constitutes a Darwinian evolutionary process that shares key characteristics with the genetic evolution of species. The emergence of this second evolutionary process saw an unprecedented extension of genetic evolution by allowing organisms to adapt more rapidly to, and more powerfully create and shape, their environments.

Since the 1980s, this parallel between genetic and cultural evolution has been pursued by scholars from a range of disciplines across the social, behavioral, and biological sciences. In this article, I review the current state of this interdisciplinary effort, focusing on topics of major recent research interest. No new theories or findings are presented, but in presenting disparate strands of work alongside each other, I hope to identify links between strands and foster a synthetic evolutionary science of culture (2, 3) paralleling the interdisciplinary synthesis of the biological sciences in the early 20th century.

Historical Context
Darwin’s comment above was inspired by historical linguists of his time, who, even before publication of On the Origin of Species (4), were constructing tree-like schemas of extant languages explicitly based on the assumption of common descent (5). Although “evolutionary” ideas became popular ways of describing cultural change in the late 1800s, such ideas were confused. Many scholars erroneously saw evolution as inevitable progress along fixed stages of increasing complexity (e.g., from savagery to barbarism to civilization), drawing more from Herbert Spencer than from Darwin (6). There was also much confusion, in the absence of a clear understanding of genetics, about genetic and cultural inheritance. Theories were often literally Lamarckian, with ideas, artifacts, and words somehow thought to become part of the germ line through repeated use (7). Due to this confusion, as well as the misuse of pseudoevolutionary racial theories for distasteful political ends, early 20th century social scientists declared culture to be separate from “the organic” (8); the biological and social sciences went separate ways; and the notion of cultural evolution, or indeed any evolutionary basis for human behavior, fell from favor.

Cultural Microevolution
It was not until the 1970s and 1980s that a properly Darwinian theory of cultural change was formulated, first by Cavalli-Sforza and Feldman (9, 10) and then by Boyd and Richerson (11). This theory comprised quantitative models of cultural microevolution, describing the mechanisms by which cultural variation is transmitted from person to person, and the processes that change this variation over time within populations (Table 1), thus embodying the “population thinking” that characterizes Darwin’s approach.

Here, “culture” is defined as “information capable of affecting individuals’ behavior that they acquire from other members of their species through teaching, imitation, and other forms of social transmission” (12). “Social transmission,” “social learning,” and “cultural transmission” are used interchangeably to denote the nongenetic transfer of learned information from one individual to another. “Cultural trait,” “cultural variant,” and sometimes “meme” are used to refer to the information (e.g., ideas, attitudes, skills) that is transmitted. All of these terms hide huge complexity and caveats. Such simplification is typical of a modeling approach. This approach follows population genetics, which makes simplifying assumptions (e.g., infinitely large populations) to understand similarly complex genetic evolutionary processes. The simplification in both cases is tactical, aiming to understand complex processes in a piecemeal fashion and to formalize verbal arguments (13).

Some of the processes in Table 1 have parallels in genetic evolution. Selection-like “content” or “direct” biases favor the acquisition and transmission of some cultural variants over others due to their memorability or effectiveness (14, 15), just as some
alleles have higher fitness than others. Random cultural “mutation” occurs where new variation arises randomly, such as via perceptual error (16), akin to random genetic mutation. Migration allows individuals to introduce novel variants to a population as they move (17), just as gene flow spreads alleles. Some cultural traits, such as first names, fit the expectations of neutral drift (18), just like some alleles.

However, this enterprise is not simply the unthinking transfer of models from genetic to cultural evolution. In many cases, cultural variation is generated, inherited, and changed in very different ways from genetic variation, and models have addressed these differences (10, 11). Examples include the blending of continuous, nonparticulate cultural variation; the systematic, nonrandom generation of cultural variation, or “guided variation”; frequency-dependent biases, such as conformity, where variants are adopted based on their commonness in the population; and indirect biases, where traits are adopted based on the characteristics of their bearers, such as success or prestige. Where possible, evidence from psychology, anthropology, sociology, and other fields was used to justify these processes (10, 11). However, the use of quantitative models went beyond typical theory in the social and behavioral sciences by (i) precisely and explicitly defining these processes rather than relying on imprecise verbal descriptions of phenomena (e.g., “conformity”) and (ii) exploring the population-level consequences of such processes, such as the consequences of frequency-dependent biases for between- and within-group cultural variation (19).

Cultural Macroevolution
In the 1990s, the study of cultural microevolution was supplemented by the study of cultural macroevolution, defined as long-term cultural change at or above the level of the society. Mace and Pagel (20) introduced the phylogenetic comparative method as a means to (i) reconstruct the cultural evolutionary history of a particular trait or set of traits and (ii) test functional hypotheses concerning the spread or distribution of cultural variation across societies while controlling for evolutionary history. The latter had been a problem within anthropology for over a century. In 1889, Francis Galton (21) pointed out that even if two traits (e.g., cattle-keeping and patriliny) often co-occur across many societies, this co-occurrence does not necessarily provide evidence that they are functionally associated (e.g., cattle-keeping causes patriliny), because all these societies may have culturally inherited this combination from a common ancestral society. Societies are not necessarily statistically independent data points, due to shared history. This problem is the same one facing biologists when comparing across species, and, in the meantime, biologists had developed methods for controlling for nonindependence due to common descent (22). Mace, Pagel, and others imported these methods to test functional evolutionary hypotheses in the same way, showing, for example, that cattle-keeping did likely cause a switch from matriliny to patriliny even after controlling for descent (23).

Concurrently, archaeologists began using phylogenetic methods to reconstruct the history of artifacts, such as projectile points (24), and, following Darwin’s original insight, others began reconstructing the history of language families (25). As with microevolution, the advantage here was in the use of quantitative methods borrowed from biology that were explicit in their assumptions about how to reconstruct historical relationships (e.g., maximum parsimony, maximum likelihood), repeatable and extendable by others, and easily scaled up to large datasets (26), in contrast to the informal, idiosyncratic, and subjective schemas of historical linguistics or archaeology.

Is It Evolution?
A common question is whether culture really evolves. This question comes from both social scientists skeptical of any kind of parallel with evolution, and biologists insistent that Darwin’s theory applies only to genetic evolution (27). Importantly, no one argues that genetic evolution and cultural evolution operate identically. From the outset, microevolutionary modelers incorporated processes unique to cultural change, such as one-to-many transmission (10) or nonrandomly guided variation (11). However, an examination of Table 1 indicates that the parallels are numerous enough to warrant an evolutionary theory of culture, as long as these differences are taken seriously. At its heart, cultural change is a process of inherited variation that changes due to selection, drift, migration, and other processes, which, in their details, may operate similarly to or differently from the genetic case.

Similarly, at the macroevolutionary level, it is sometimes argued that human culture is so riven with cross-lineage diffusion that it is not tree-like, and thus not amenable to phylogenetic methods (26). Although this argument may be true for some cultural domains, many, such as languages or some artifacts, have been shown to be tree-like due to strong intergenerational cultural descent (28). Moreover, cross-lineage blending is a common feature of genetic evolution when we look beyond our own kingdom to, say, prokaryotes, where horizontal gene transfer is rife (29). Indeed, network-based methods exist for dealing with non–tree-like data (30).

One indirect, but perhaps most important, test of the parallel between genetic and cultural change is whether methods borrowed from evolutionary biology, suitably modified, actually prove useful in explaining cultural change in a manner that adds to the findings of nonevolutionary methods. Table 2 lists such methods, which are further discussed throughout this article.

Evolution of Cultural Evolution
In parallel to the study of cultural change itself, that is, changes in the contents of culture, modelers have also examined when and why the capacity for cultural evolution evolved. Models
previous generations. As noted above, models of the evolution of culture show that cumulative culture is particularly effective at increasing mean population fitness beyond the population fitness of noncumulative cultural species (33). A major research question is therefore “What allows human culture to be uniquely cumulative?”

There has been much focus on high-fidelity social learning, which is needed to preserve modifications over successive generations such that they can accumulate (41, 42). It was initially suggested that imitation (i.e., the copying of bodily actions rather than products) was key to this high fidelity (43). This claim seems doubtful, given that chimpanzees can imitate tool use techniques with high enough fidelity for alternate techniques to stabilize in different groups (44). Rather than an “imitation vs. no-imitation” dichotomy, perhaps humans are more effective, spontaneous, or compulsive imitators (45). Other comparative work has suggested roles for prosociality and language-mediated teaching (46). As well as individual cognitive abilities, cumulative culture

mostly in archaeology, on the way in which population structure affects patterns of cultural variation and the gain and loss of cultural complexity. Shennan (54) and Henrich (47) argued that population size has been a major determinant of cultural complexity in hunter-gatherers, often measured as the number of tools in a toolkit or the number of components per tool. Henrich (47) argued that the loss of toolkit complexity in Tasmania following isolation from Australia 12 kya was due to reduced effective population size, because the isolated population was too small to maintain complexity, given imperfect social learning. In his model of this process, each member of each new generation acquires the skill of the most skillful member of the previous generation with systematic loss due to copying error and some chance of improvement. Larger populations make the loss of skills due to imperfect copying less likely and improvements more likely. Shennan and coworkers (48, 54) argued that increasing population densities in Upper Paleolithic Europe around 45 kya caused the major increase
Table 2. Methods and concepts that have been adapted from evolutionary biology to study cultural change
Evolutionary biology                    Cultural evolution
Population genetic models               Cultural evolution (or gene–culture coevolution) models (10, 11)
Gene-based phylogenetics                Cultural phylogenetics (24, 26)
Comparative (cross-species) method      Comparative (cross-cultural) method (20)
Population dynamic models               Historical dynamic models (120)
Multilevel selection                    Multilevel cultural selection (122, 123)
Genetic drift                           Cultural drift (15, 18)
Multigeneration breeding experiments    Multigeneration transmission chain experiments (111, 136)
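As a rough illustration of the demographic model attributed to Henrich (47) in the text preceding Table 2, the Python sketch below lets every learner copy the most skilled member of the previous generation, with a systematic loss and some chance of improvement. The Gumbel-distributed copying outcome, the parameter names and values (alpha for the average loss, beta for the spread of outcomes), the random seed, and the population sizes are all assumptions made for this example, not values taken from the source.

```python
import math
import random


def simulate_max_skill(pop_size, generations=200, alpha=4.0, beta=1.0, seed=0):
    """Return the highest skill level after `generations` rounds of copying.

    Assumed setup (illustrative only): each learner's acquired skill is a
    Gumbel draw centred `alpha` below the best skill of the previous
    generation, with spread `beta`. Most draws fall short of the model
    (copying error); a few exceed it (improvement).
    """
    rng = random.Random(seed)
    best = 0.0  # skill of the most skilled individual, arbitrary units
    for _ in range(generations):
        draws = []
        for _ in range(pop_size):
            u = rng.random()
            while u <= 0.0:  # guard against log(0)
                u = rng.random()
            # Inverse-CDF sample from a Gumbel distribution.
            draws.append(best - alpha - beta * math.log(-math.log(u)))
        # The next generation copies the best of these learners.
        best = max(draws)
    return best


if __name__ == "__main__":
    for n in (10, 1000):
        print(n, round(simulate_max_skill(n), 1))
```

Under these assumed parameters, a population of 10 learners ends with a lower maximum skill than it started with, whereas a population of 1,000 ends higher. This mirrors the qualitative claim that larger populations make skill loss less likely and improvements more likely.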
biases noted above to be individually and culturally variable, and Although cultural attraction is sometimes presented as an al-
how these learning biases interact with developing abilities, such ternative to cultural evolution, the two approaches are compat-
as language and theory of mind. Experiments have shown that ible (104). Many “standard” cultural evolution models, in fact,
children are sophisticated social learners and exhibit biases do not model transmission as high fidelity, and allow for trans-
predicted by models to be adaptive, such as preferentially formation (47, 114). The notion of guided variation (11) is
learning from accurate over inaccurate individuals (98) and similar to the individual, nonrandom transformation described as
prestigious over nonprestigious individuals (99). There is work cultural attraction, and can operate in parallel to the more
combining biases, finding that children copy groups over indi- selection-like transmission biases in Table 1. However, cultural
viduals when both are equally successful but copy successful in- attraction proponents have a valid point that, in practice, such
dividuals over unsuccessful groups, thus adaptively switching transformative processes have not received adequate attention.
between frequency and success information (100). Other work The relative influence of each likely varies with domain (104).
has addressed the motivation for copying, with children more Where there are clear inductive biases favoring certain repre-
likely to imitate indiscriminately when tasks are presented as sentations, such as bloodletting or color terms, then explanations
conventional rather than instrumental, such that the motiva- in terms of transformation/attraction will be useful. Where there
tion is to affiliate with one’s group rather than acquire effective are no clear intuitions or inductive biases, selection-like pro-
skills (101). cesses will be more important. Bloodletting, for example, has
There are exceptions to these impressive skills, however. One been replaced in many societies with surgical techniques that are
study found that children copy adults over peers even when peers the product of a long refinement and accumulation of unintuitive
are more knowledgeable (102). In addition, there is similar in- knowledge and skills. Many medical, scientific, and technological
dividual (103) and cultural (90) variation as seen in adults, which practices are the product of accidental invention followed by
has yet to be explained. Further work is needed to link the study payoff-biased selection in the face of resistance due to confor-
of social learning in childhood and adulthood, ideally using mity to prior practices or transformation back to intuitive
models to link developmentally changing learning schedules to attractors (115). Examples include glassmaking (116); musical
macroevolutionary patterns of cumulative culture (81). instruments (117); and the theory of evolution, an unintuitive
idea that needs conscious effort to understand (118).
Cultural Attraction. An ongoing debate has been over the relative Even where there is clear evidence for inductive biases, as in
role of preservative, selection-like processes and nonselective, the case of color terms (109), the prediction of cross-cultural
transformative processes in explaining cultural change (104– universals is only partially upheld in real-world data, as shown
107). Many of the cultural evolution models described earlier by the many exceptions to the Berlin–Kay scheme identified by
assume high-fidelity transmission plus random copying error or phylogenetic analyses (70). Further work might show these
1. Darwin C (1871) The Descent of Man (John Murray, London); reprinted (2003) 29. Doolittle WF, Bapteste E (2007) Pattern pluralism and the Tree of Life hypothesis.
(Gibson Square, London). Proc Natl Acad Sci USA 104:2043–2049.
2. Mesoudi A (2011) Cultural Evolution (Univ of Chicago Press, Chicago). 30. Tehrani JJ (2013) The phylogeny of Little Red Riding Hood. PLoS One 8:e78871.
3. Mesoudi A, Whiten A, Laland KN (2006) Towards a unified science of cultural evo- 31. Aoki K, Wakano JY, Feldman MW (2005) The emergence of social learning in a
lution. Behav Brain Sci 29:329–347, discussion 347–383. temporally changing environment. Curr Anthropol 46:334–340.
4. Darwin C (1859) On the Origin of Species (John Murray, London); reprinted 32. Rogers A (1988) Does biology constrain culture? Am Anthropol 90:819–831.
(1968) (Penguin, London). 33. Boyd R, Richerson PJ (1995) Why does culture increase human adaptability? Ethol
5. van Wyhe J (2005) The descent of words: Evolutionary thinking 1780-1880. Sociobiol 16:125–143.
Endeavour 29:94–100. 34. Laland KN, Odling-Smee J, Myles S (2010) How culture shaped the human genome:
6. Freeman D (1974) The evolutionary theories of Charles Darwin and Herbert Spencer. Bringing genetics and the human sciences together. Nat Rev Genet 11:137–148.
Curr Anthropol 15:211–221. 35. Henrich J (2015) The Secret of Our Success (Princeton Univ Press, Princeton).
7. Stocking GW (1962) Lamarckianism in American social science: 1890-1915. J Hist Ideas 36. Hoppitt W, Laland KN (2013) Social Learning (Princeton Univ Press, Princeton,
23:239–256. NJ).
8. Kroeber AL (1917) The superorganic. Am Anthropol 19:163–213. 37. Whiten A (2017) Culture extends the scope of evolutionary biology in the great apes.
9. Cavalli-Sforza LL, Feldman MW (1973) Cultural versus biological inheritance: Phe- Proc Natl Acad Sci USA 114:7790–7797.
notypic transmission from parents to children. (A theory of the effect of parental 38. Dean LG, Vale GL, Laland KN, Flynn E, Kendal RL (2014) Human cumulative culture: A
phenotypes on children’s phenotypes). Am J Hum Genet 25:618–637. comparative perspective. Biol Rev Camb Philos Soc 89:284–301.
10. Cavalli-Sforza LL, Feldman MW (1981) Cultural Transmission and Evolution (Prince- 39. Tennie C, Call J, Tomasello M (2009) Ratcheting up the ratchet: On the evolution of
ton Univ Press, Princeton). cumulative culture. Philos Trans R Soc Lond B Biol Sci 364:2405–2415.
11. Boyd R, Richerson PJ (1985) Culture and the Evolutionary Process (Univ of Chicago 40. Krützen M, et al. (2005) Cultural transmission of tool use in bottlenose dolphins. Proc
Press, Chicago). Natl Acad Sci USA 102:8939–8943.
12. Richerson PJ, Boyd R (2005) Not by Genes Alone (Univ of Chicago Press, Chicago). 41. Lewis HM, Laland KN (2012) Transmission fidelity is the key to the build-up of cu-
13. Servedio MR, et al. (2014) Not just a theory–the utility of mathematical models in mulative culture. Philos Trans R Soc Lond B Biol Sci 367:2171–2180.
evolutionary biology. PLoS Biol 12:e1002017. 42. Kempe M, Lycett SJ, Mesoudi A (2014) From cultural traditions to cumulative culture:
14. Stubbersfield JM, Tehrani JJ, Flynn EG (2015) Serial killers, spiders and cybersex: Parameterizing the differences between human and nonhuman culture. J Theor Biol
Social and survival information bias in the transmission of urban legends. Br J Psychol 359:29–36.
106:288–307. 43. Tomasello M (1999) The Cultural Origins of Human Cognition (Harvard Univ Press,
15. Sindi SS, Dale R (2016) Culturomics as a data playground for tests of selection: Cambridge, MA).
Mathematical approaches to detecting selection in word use. J Theor Biol 405: 44. Whiten A, McGuigan N, Marshall-Pescini S, Hopper LM (2009) Emulation, imitation,
140–149. over-imitation and the scope of culture for child and chimpanzee. Philos Trans R Soc
16. Kempe M, Lycett S, Mesoudi A (2012) An experimental test of the accumulated Lond B Biol Sci 364:2417–2428.
copying error model of cultural mutation for Acheulean handaxe size. PLoS One 45. Lyons DE, Young AG, Keil FC (2007) The hidden structure of overimitation. Proc Natl
7:e48333. Acad Sci USA 104:19751–19756.
17. Hamilton MJ, Buchanan B (2007) Spatial gradients in Clovis-age radiocarbon dates 46. Dean LG, Kendal RL, Schapiro SJ, Thierry B, Laland KN (2012) Identification of the
across North America suggest rapid colonization from the north. Proc Natl Acad Sci social and cognitive processes underlying human cumulative culture. Science 335:
USA 104:15625–15630. 1114–1118.
18. Bentley RA, Hahn MW, Shennan SJ (2004) Random drift and culture change. Proc 47. Henrich J (2004) Demography and cultural evolution. Am Antiq 69:197–214.
Biol Sci 271:1443–1450. 48. Powell A, Shennan S, Thomas MG (2009) Late Pleistocene demography and the
19. Mesoudi A (2009) How cultural evolutionary theory can inform social psychology appearance of modern human behavior. Science 324:1298–1301.
and vice versa. Psychol Rev 116:929–952. 49. Grove M (2016) Population density, mobility, and cultural transmission. J Archaeol
20. Mace R, Pagel M (1994) The comparative method in anthropology. Curr Anthropol Sci 74:75–84.
35:549–564. 50. Salali GD, et al. (2016) Knowledge-sharing networks in hunter-gatherers and the
21. Galton F (1889) Comment on Tylor, E. B., On a method of investigating the devel- evolution of cumulative culture. Curr Biol 26:2516–2521.
opment of institutions, applied to laws of marriage and descent. J R Anthropol Inst 51. Acerbi A (2016) A cultural evolution approach to digital media. Front Hum Neurosci
18:270. 10:636.
22. Harvey P, Pagel M (1991) The Comparative Method in Evolutionary Biology (Oxford 52. Mesoudi A (2011) Variable cultural acquisition costs constrain cumulative cultural
Univ Press, Oxford). evolution. PLoS One 6:e18239.
23. Holden CJ, Mace R (2003) Spread of cattle led to the loss of matrilineal descent in 53. Charlesworth B (2009) Fundamental concepts in genetics: Effective population size
Africa: A coevolutionary analysis. Proc Biol Sci 270:2425–2433. and patterns of molecular evolution and variation. Nat Rev Genet 10:195–205.
24. O’Brien MJ, Darwent J, Lyman RL (2001) Cladistics is useful for reconstructing ar- 54. Shennan S (2001) Demography and cultural innovation. Camb Archaeol J 11:5–16.
chaeological phylogenies. J Archaeol Sci 28:1115–1136. 55. Kline MA, Boyd R (2010) Population size predicts technological complexity in Oce-
25. Gray RD, Jordan FM (2000) Language trees support the express-train sequence of ania. Proc Biol Sci 277:2559–2564.
Austronesian expansion. Nature 405:1052–1055. 56. Bromham L, Hua X, Fitzpatrick TG, Greenhill SJ (2015) Rate of language evolution is
26. Gray RD, Watts J (2017) Cultural macroevolution matters. Proc Natl Acad Sci USA affected by population size. Proc Natl Acad Sci USA 112:2097–2102.
114:7846–7852. 57. Collard M, Buchanan B, O’Brien MJ (2013) Population size as an explanation for
27. Fracchia J, Lewontin RC (1999) Does culture evolve? Hist Theory 38:52–78. patterns in the Paleolithic archaeological record. Curr Anthropol 54:S388–S396.
28. Collard M, Shennan S, Tehrani J (2006) Branching, blending, and the evolution of 58. Vaesen K, Collard M, Cosgrove R, Roebroeks W (2016) Population size does not
cultural similarities and differences among human populations. Evol Hum Behav 27: explain past changes in cultural complexity. Proc Natl Acad Sci USA 113:
169–184. E2241–E2247.
Edited by Marcus W. Feldman, Stanford University, Stanford, CA, and approved May 1, 2017 (received for review January 12, 2017)
Culture suffuses all aspects of human life. It shapes our minds and clear why this exceptional capacity for culture has evolved in hu-
bodies and has provided a cumulative inheritance of knowledge, mans and humans alone.
skills, institutions, and artifacts that allows us to truly stand on the If there is one thing on which IC and CE agree, it is that
shoulders of giants. No other species approaches the extent, cultural capacity is a good thing: It has “undeniable practical
diversity, and complexity of human culture, but we remain unsure advantages” (8) that have allowed our species to have “expanded
how this came to be. The very uniqueness of human culture is both across the globe and. . .occupy a wider range than any other ter-
a puzzle and a problem. It is puzzling as to why more species have restrial species” (2). Indeed, the benefits are so substantial that
not adopted this manifestly beneficial strategy and problematic even small “initial increments” (8) in this direction are expected to
because the comparative methods of evolutionary biology are generate powerful biocultural feedback leading to further brain
ill suited to explain unique events. Here, we develop a more and cognitive evolution (2, 8). Proponents of a highly modular view
particularistic and mechanistic evolutionary neuroscience approach of IC nevertheless argue that this feedback will lead to coordinated
to cumulative culture, taking into account experimental, develop- enhancement across multiple domains (8). CE advocates similarly
mental, comparative, and archaeological evidence. This approach suggest that evolved cognitive mechanisms (i.e., modules in a loose
reconciles currently competing accounts of the origins of human sense) for social learning will lead to more general brain size and
culture and develops the concept of a uniquely human technolog- intelligence increases to deal with increased amounts of “valuable
ical niche rooted in a shared primate heritage of visuomotor cultural information” (2). So why are humans the only species to
coordination and dexterous manipulation. have fallen into this virtuous cycle?
Arguing from a CE perspective, Boyd and Richerson (11)
brain evolution | cultural evolution | archaeology | imitation suggest that cumulative culture is rare because of the evolu-
tionary costs of requisite social-learning mechanisms. According
tural accumulation to supplant biology as humanity’s primary Whether such “intelligence” is thought to be composed of discrete
mode of adaptation (2, 9). Both views recognize a role for social but tightly coevolving innate modules (8) or a general-purpose
learning in reducing the costs of knowledge and skill acquisition,
but they differ on the phylogenetic uniqueness (e.g., ref. 7 vs. ref.
9) and transformative power (e.g., ref. 2 vs. ref. 8) of human This paper results from the Arthur M. Sackler Colloquium of the National Academy of
Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the
high-fidelity social transmission. This plays out in starkly differ- Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering
ent visions of cumulative culture as either a fundamental evo-
in Irvine, CA. The complete program and video recordings of most presentations are available
lutionary transition (cf. ref. 10) that altered the very medium of on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
human adaptation (2), or just another “unique or extreme” bi- Author contributions: D.S. and E.E.H. wrote the paper.
ological trait comparable to “the elephant’s trunk, the narwhal’s The authors declare no conflict of interest.
tusk, the whale’s baleen, the platypus’s duckbill, and the arma- This article is a PNAS Direct Submission.
dillo’s armor” (8). Neither option, however, makes it immediately 1
To whom correspondence should be addressed. Email: dwstout@emory.edu.
Stout and Hecht PNAS | July 25, 2017 | vol. 114 | no. 30 | 7863
(56). Any viable evolutionary account of cumulative culture must feedback and error correction. This limitation can be overcome
address these dynamics. through the use of internal models that predict movements and
The primate neocortex is divided into large-scale functional outcomes in advance (71), a simulation process supported by a
networks characterized by high within-network functional and distributed network of frontal, parietal, and occipitotemporal re-
anatomical connectivity (59, 60). In humans, these networks are gions combining elements of the dorsal and ventral attention net-
organized in a processing hierarchy from concrete perceptual works. As argued above, such models of self-action control likely
and motor functions to abstract, domain-general processing. This form the basis for understanding and copying the observed actions
arrangement is realized anatomically as a cortex-wide gradient of of others through a process of matching, often referred to as
topological distance and connectivity patterns extending from “motor resonance” (42). The assembly of complex goal-oriented
distinct peripheral sensorimotor cortices at one end to highly sequences from these elements is likely supported in the next tier
interconnected central association cortices at the other (61). of cortical organization by the multiple demand system (aka
Along this gradient, seven widely recognized networks can be “frontoparietal control network”). This cognitive control network is
arrayed into four organizational tiers: (i) visual and somatomotor, thought to support general or “fluid” intelligence through its role in
(ii) dorsal and ventral attention, (iii) multiple demand and limbic, assembling structured mental programs from a series of subtasks
and (iv) default mode (61). (72), a critical process in skill learning as reviewed above. Together,
The tethering hypothesis of Buckner and Krienen (62) proposes these sensorimotor matching and control processes support the
that this pattern arises from disproportionate expansion of the interactive behavioral alignment that is critical to human social
cortical mantle during evolutionary brain enlargement, leading to learning, communication, cooperation, and bonding (73, 74).
gaps between the chemical signaling gradients that pattern cortical It is debatable whether the application of increasingly sophis-
differentiation during development. Developmental selection in ticated motor planning and cognitive control networks to social
these gaps fosters the emergence of “noncanonical” association learning involved phylogenetic construction or is entirely explica-
networks primarily interconnected with each other rather than ble in terms of developmental construction and inflection (42, 75),
with more developmentally constrained peripheral sensorimotor but either scenario is consistent with the IC premise that social
systems. As expected, these relatively unconstrained association learning substantially overlaps with asocial-learning mechanisms
cortices are also relatively late developing (59, 63) and variable and comes at little additional cost (15). It also fits well with the
in connectivity across individuals (64). Indeed, comparative evi- close integration of individual and social-learning processes in
dence indicates human-specific changes in the rate and timing real-world skill acquisition, as envisioned by the helical curricu-
of synaptogenesis, synapse elimination, and cortical myelination, lum. Advocates of CE, however, emphasize the additional im-
resulting in increased plasticity into adulthood (65, 66). That portance of a specialized ToM capacity to allow truly cultural
nonspecific selection for increased brain size in the human lineage learning (9). Does this requirement imply additional evolutionary
might have indirectly driven increased plasticity is suggested by costs for cultural learning?
evidence of low heritability for cortical morphology (sulcal di- Although ToM is commonly thought of as a human specialization,
mensions) vs. overall brain size in humans, a pattern that contrasts depending on phylogenetically constructed neurocognitive mecha-
with high heritability of both in chimpanzees (67). In any case, the nisms, Heyes and Frith (76) have recently argued that it is largely a
human association cortex appears particularly sensitive to envi- product of developmental inflection and cultural evolution. On this
ronmental and behavioral influences, providing a potent evolu- account, low-level or “implicit” mind-reading capacities emerge di-
tionary feedback mechanism between organism and environment, rectly from motor resonance properties of the action control system
which the EES refers to as reciprocal causation (56). discussed above. Motor resonance provides the input needed to
Such phenotypic flexibility is useful but may come at a cost identify recurring relations between actions, outcomes, and internal
(e.g., investments in learning or temporary phenotype–environment states, and thus to predict behavior and infer intent. This would be
mismatches). Where possible, natural selection is expected to largely explicable in terms of general mechanisms [e.g., statistical and
reduce costs by “canalizing” plastic responses to recurring envi- associative learning (77)] acting within developmentally constructed
ronmental situations as automatic parts of normal development systems, as favored by some IC accounts (15).
(68). The tethering hypothesis suggests that such innate spe- Explicit mind reading, in contrast, involves active reasoning
cializations are most likely to be found in relatively heritable about mental states: in other words, the “theory” part of “theory
sensorimotor systems and with respect to behaviors/stimuli that of mind.” Anatomically, this appears to be supported by regions
have been relatively invariant over long periods of time. Because of posterior cingulate, medial frontal, and lateral temporopar-
humans’ expanded association areas remain relatively plastic ietal cortex associated with the so-called “default mode network”
(64) and are both late developing (59) and phylogenetically re- (DMN) (78). The DMN was initially identified as a set of regions
cent (60), their derived cognitive features are less likely to that experience de-activation during attention-demanding tasks,
be directly shaped by natural selection (phylogenetically con- but is increasingly recognized to make a positive contribution to
structed) and more likely to result from developmental side ef- abstract, internally directed tasks involving information retrieval
fects (developmental construction) and modifications to the and integration. Examples include introspection, social cogni-
structure of inputs they receive from more peripheral systems tion, autobiographical memory, future planning, narrative com-
(phylogenetic or developmental inflection) (69). In theory, such prehension, and goal-directed working memory. This functional
modifications could arise through environmental as well as ge- profile reflects the fact that the DMN sits atop the cortical
netic inheritance, including persistent changes to the physical processing hierarchy: maximally distant from peripheral senso-
and social context of development brought about through niche rimotor systems and dominated by internal connectivity with
construction (52, 70). For example, there is widespread agree- other association networks (61). As discussed above, its devel-
ment that the human brain lacks specific genetic adaptations for opment and function are thus expected to be highly plastic and
literacy, and yet learning to read reliably produces functional reliant on learning. In fact, Heyes and Frith (76) propose that
specialization for script perception in a particular region of the learning explicit mental theories is an inherently cultural process
left ventral occipitotemporal cortex known as the “visual word requiring language-based instruction. Their view is supported by,
form area” (3). Similar logic may apply to the enhanced mech- among other things, evidence that individual and cross-cultural
anisms for complex action parsing and ToM that support ap- differences in caregiver use of mental state vocabulary are pre-
prenticeship learning in a helical curriculum. dictive of variation in children’s acquisition of ToM concepts,
Skilled actions, such as the ballistic strikes involved in stone such as false belief, knowledge vs. ignorance, and difference of
knapping, often unfold too quickly to be guided by online sensory opinion. Insofar as mind-reading capacities are themselves seen
83). The aSMG in particular is believed to support a confluence Such learning in humans requires a degree of bodily awareness
of object information from the ventral stream with dorsal stream sufficient to match variations in kinematic detail with desired
kinematics to manage the novel action possibilities afforded by outcomes during deliberate practice. A measure of such aware-
handheld tools (81). Although similar functional studies have not ness that has been applied to other animals is the Mirror Self-
been conducted with chimpanzees, structural evidence from Recognition (MSR) test (89). Unlike enculturated humans,
diffusion tensor imaging suggests similarity with humans. mirror-naïve animals must discover de novo that the visual per-
Whereas macaques show little or no connectivity between the ception of their reflection corresponds to the sensorimotor
aSMG and ventral stream object-processing cortex, in both hu- representations of their own movements. As might be expected
mans and chimpanzees these connections are robust (84). This from the preceding review of action perception circuitry, ma-
tool-relevant innovation appears to predate the chimpanzee– caques typically fail at MSR and chimpanzees are intermediate
human split, as might be expected from the impressive tool-use in performance, with some passing and some failing. In fact,
capacities of modern chimpanzees (13). chimpanzee MSR performance is predicted by individual varia-
Enhanced aSMG connectivity between dorsal and ventral tion in the degree of right-lateralization of SLFIII projections
streams is just one aspect of a broader pattern of changes (Fig. 2) into the vlPFC. In other words, chimpanzees with more human-
in action circuitry over ape and human evolution (84). In ma- like SLFIII connectivity show more human-like MSR behavior
caques, this circuitry is dominated by ventral stream projections (90). Because attending to one’s own movements is a critical el-
to the ventrolateral prefrontal cortex (vlPFC) along the inferior ement in the hypothesized construction of implicit mind reading
longitudinal fasciculus and extreme/external capsules, with rela- from motor resonance (76), this finding suggests further links
tively little connectivity through the dorsal stream. Dorsal stream between dorsal stream evolution and the cognitive prerequisites
projections to the vlPFC as well as the PMv through the middle of cultural learning.
and superior longitudinal fasciculi are better developed in What is not yet known is the extent to which any or all of these
chimpanzees and become quite pronounced in humans. Across structural and functional differences between species are classic
these three taxa, there is thus a trend toward the addition of “adaptations” in the sense of being canalized products of phy-
increasing dorsal stream inputs to the vlPFC in complement to logenetic construction vs. other evolutionary processes. Even in
established ventral stream connectivity. We have previously macaques, there is some evidence that extensive tool-training
proposed that this dorsal stream enhancement may underlie the can produce plastic alterations in dorsal stream connectivity
progressive elaboration of action-parsing capacities from ma- (91). Enlarged ape and human brains are expected to be more
caques to chimpanzees and then humans (84, 85), as reflected developmentally plastic and subject to inflection by somatic [e.g.,
by these species’ increasingly complex foraging skills (13) and bipedal locomotion, hand morphology (92)] and sensorimotor
capacities/propensities for bodily imitation (12). adaptations, and developmental niche construction (70). In fact,
Motor resonance mechanisms are least-developed in macaques, research with modern humans has shown that the acquisition of
which are not known to imitate manual actions. Macaque “mirror” Paleolithic tool-making skills elicits plastic remodeling of dorsal
neurons respond to observed action goals rather than detailed stream white matter connections, including SLFIII’s projection
means of execution and are almost entirely unresponsive to actions into the right vlPFC, even in adults (37). Functionally, the gray
that do not involve an object (75). In contrast, chimpanzee action matter targeted by this projection is recruited by execution (93)
observation activates nearly identical voxels to execution of the and observation (83) of relatively complex tool-making se-
same movements, regardless of whether they produce a physical quences of the kind that appeared with Late Acheulean hand-
result on an object (85). This basic action-matching mechanism may axe technology after about 0.7 Mya (50). Such findings suggest
thus predate the chimpanzee–human split. With respect to object-directed actions, however, chimpanzees retain a generally macaque-like pattern of brain response dominated by the “top-down” contributions of the frontal executive cortex (85, 86). Humans alone display a more distributed pattern of occipital, temporal, parietal, premotor, and prefrontal activation, reflecting an increased role for bottom-up perceptual representations incorporating kinematic and spatiotemporal details about object-directed actions (85). This functional difference, and the structural changes that support it, may be critical to the exceptional development of skill acquisition and cultural learning capacities in humans.
In humans but not chimpanzees or macaques, the core action perception circuitry includes a prominent projection to the superior parietal lobule, a region associated with awareness of one’s body in space (84). Furthermore, the third branch of the superior longitudinal fasciculus (SLFIII), which links the inferior parietal cortex with the PMv in monkeys, extends into more anterior regions of the vlPFC in humans, particularly in the right hemisphere (87). Chimpanzees again appear intermediate, with a weak but observable extension of SLFIII into vlPFC and no evidence of right-lateralization at the population level (87). Thus, a robust extension of SLFIII into the right hemisphere homolog of Broca’s area appears to be a human-specific adaptation. This region is an element of the multiple demand system discussed above, which is thought to support the assembly of complex, multistep action plans (88). The observed extension of human SLFIII would thus provide an anatomical substrate for the integration of kinematic details into complex action goals and sequences, as required for skill learning in a helical curriculum.
that further experimental studies of Paleolithic tool-making may begin to fill in details of timing, mechanisms, and context of evolutionary changes that occurred since the chimpanzee–human divergence and are inaccessible to purely comparative methods.
Conclusion: An Evolving Technological Niche
The earliest stone tools (Fig. 2) predate evidence of brain expansion by hundreds of thousands of years (32, 94) during which their occurrence was extremely patchy, discontinuous, and lacking in evidence of progressive change. By 2.6 Mya, early Oldowan knapping provides some evidence for high-fidelity cultural transmission of particular methods (35), as well as increasing demands on visual, motor and attentional systems (82), but the overall impression in this early period remains one of a tenuous and expendable technology at the edge of contemporary hominin capacities. It is only after about 2.0 Mya that stone tool-making appears to become more commonplace (as indicated by site frequency and geographic distribution), at which time it is accompanied by evidence of brain- and body-size increase (20). Further episodes of apparently correlated technological (50) and brain size (20, 95) change occurred with the appearance of increasingly skill-intensive Early (1.7 Mya) and Late (0.7 Mya) Acheulean knapping. Understanding what exactly changed at these various transitions is an important priority for future research and will ultimately require an integration of CE approaches to understanding technological change (19) and IC insights into the evolutionary economics of “expensive” brains (4), with mechanistic neuroscientific perspectives on evolving brain–behavior–culture interactions.
1. Whiten A, Hinde RA, Laland KN, Stringer CB (2011) Culture evolves. Philos Trans R Soc Lond B Biol Sci 366:938–948.
2. Boyd R, Richerson PJ, Henrich J (2011) The cultural niche: Why social learning is essential for human adaptation. Proc Natl Acad Sci USA 108:10918–10925.
3. Dehaene S, Cohen L, Morais J, Kolinsky R (2015) Illiterate to literate: Behavioural and cerebral changes induced by reading acquisition. Nat Rev Neurosci 16:234–244.
4. Isler K, Van Schaik CP (2014) How humans evolved large brains: Comparative evi-
9. Tomasello M (1999) The Cultural Origins of Human Cognition (Harvard Univ Press, Cambridge, MA).
10. Szathmáry E, Smith JM (1995) The major evolutionary transitions. Nature 374:227–232.
11. Boyd R, Richerson P (1996) Why culture is common but cultural evolution is rare. Evolution of Social Behaviour Patterns in Primates and Man, eds Runciman W, Maynard Smith J, Dunbar R (Oxford Univ Press, Oxford), pp 77–93.
12. Whiten A, van de Waal E (December 26, 2016) Social learning, culture and the ‘socio-cultural brain’ of human and non-human primates. Neurosci Biobehav Rev, 10.1016/j.neubiorev.2016.12.018.
13. Byrne RW (2016) Evolving Insight: How It Is We Can Think About Why Things Happen (Oxford Univ Press, Oxford).
14. Burkart JM, Schubiger MN, van Schaik CP (July 28, 2016) The evolution of general intelligence. Behav Brain Sci, 10.1017/S0140525X16000959.
15. van Schaik CP, Isler K, Burkart JM (2012) Explaining brain size variation: From social
shifts. PLOS Comput Biol 12:e1005302.
20. Antón SC, Potts R, Aiello LC (2014) Human evolution. Evolution of early Homo: An integrated biological perspective. Science 345:1236828.
21. d’Errico F, Stringer CB (2011) Evolution, revolution or saltation scenario for the emergence of modern cultures? Philos Trans R Soc Lond B Biol Sci 366:1060–1069.
22. Lewis HM, Laland KN (2012) Transmission fidelity is the key to the build-up of cumulative culture. Philos Trans R Soc Lond B Biol Sci 367:2171–2180.
Stout and Hecht PNAS | July 25, 2017 | vol. 114 | no. 30 | 7867
23. Caldwell CA, Schillinger K, Evans CL, Hopper LM (2012) End state copying by humans (Homo sapiens): Implications for a comparative perspective on cumulative culture. J Comp Psychol 126:161–169.
24. Derex M, Godelle B, Raymond M (2013) Social learners require process information to outperform individual learners. Evolution 67:688–697.
25. Wasielewski H (2014) Imitation is necessary for cumulative cultural evolution in an unfamiliar, opaque task. Hum Nat 25:161–179.
26. Schillinger K, Mesoudi A, Lycett SJ (2015) The impact of imitative versus emulative learning mechanisms on artifactual variation: Implications for the evolution of material culture. Evol Hum Behav 36:446–455.
27. Morgan TJ, et al. (2015) Experimental evidence for the co-evolution of hominin tool-making teaching and language. Nat Commun 6:6029.
28. Roux V, Bril B, Dietrich G (1995) Skills and learning difficulties involved in stone knapping. World Archaeol 27:63–87.
29. Nonaka T, Bril B, Rein R (2010) How do stone knappers predict and control the outcome of flaking? Implications for understanding early stone tool technology. J Hum Evol 59:155–167.
30. Stout D (2002) Skill and cognition in stone tool production: An ethnographic case study from Irian Jaya. Curr Anthropol 45:693–722.
31. Stout D, Hecht E, Khreisheh N, Bradley B, Chaminade T (2015) Cognitive demands of lower paleolithic toolmaking. PLoS One 10:e0121804.
32. Toth N, Schick K (2009) The Oldowan: The tool making of early hominins and chimpanzees compared. Annu Rev Anthropol 38:289–305.
33. Apel J, Knutsson K, eds (2006) Skilled Production and Social Reproduction: Aspects of Traditional Stone-Tool Technologies. (Societas Archaeologica Upsaliensis, Uppsala, Sweden).
34. Schillinger K, Mesoudi A, Lycett SJ (2016) Copying error, evolution, and phylogenetic signal in artifactual traditions: An experimental approach using “model artifacts”. J Archaeol Sci 70:23–34.
35. Stout D, Semaw S, Rogers MJ, Cauche D (2010) Technological variation in the earliest Oldowan from Gona, Afar, Ethiopia. J Hum Evol 58:474–491.
36. Stout D, Khreisheh N (2015) Skill learning and human brain evolution: An experimental approach. Camb Archaeol J 25:867–875.
37. Hecht EE, et al. (2015) Acquisition of Paleolithic toolmaking abilities involves structural remodeling to inferior frontoparietal regions. Brain Struct Funct 220:2315–2331.
38. Magnani M, Rezek Z, Lin SC, Chan A, Dibble HL (2014) Flake variation in relation to the application of force. J Archaeol Sci 46:37–49.
39. Faisal A, Stout D, Apel J, Bradley B (2010) The manipulative complexity of Lower Paleolithic stone toolmaking. PLoS One 5:e13718.
40. Putt SS, Woods AD, Franciscus RG (2014) The role of verbal interaction during experimental bifacial stone tool manufacture. Lithic Technol 39:96–112.
41. Stout D, Apel J, Commander J, Roberts M (2014) Late Acheulean technology and cognition at Boxgrove, UK. J Archaeol Sci 41:576–590.
42. Cook R, Bird G, Catmur C, Press C, Heyes C (2014) Mirror neurons: From origin to function. Behav Brain Sci 37:177–192.
43. Laland KN, Bateson P (2001) The mechanisms of imitation. Cybern Syst 32:195–224.
44. Buccino G, et al. (2004) Neural circuits underlying imitation learning of hand actions: an event-related fMRI study. Neuron 42:323–334.
45. Ericsson KA, Krampe RT, Tesch-Romer C (1993) The role of deliberate practice in the acquisition of expert performance. Psychol Rev 100:363–406.
46. Stout D (2013) Neuroscience of technology. Cultural Evolution: Society, Technology, Language, and Religion, Strungmann Forum Reports, eds Richerson PJ, Christiansen M (MIT Press, Cambridge, MA), pp 157–173.
47. Whiten A (2015) Experimental studies illuminate the cultural transmission of percussive technologies in Homo and Pan. Phil Trans R Soc B 370:20140359; erratum in Phil Trans R Soc B 371:20150436.
48. Tennie C, Call J, Tomasello M (2009) Ratcheting up the ratchet: On the evolution of cumulative culture. Philos Trans R Soc Lond B Biol Sci 364:2405–2415.
49. Gibson JJ (1979) The Ecological Approach to Visual Perception (Houghton Mifflin, Boston).
50. Stout D (2011) Stone toolmaking and the evolution of human culture and cognition. Philos Trans R Soc Lond B Biol Sci 366:1050–1059.
51. Legare CH, Nielsen M (2015) Imitation and innovation: The dual engines of cultural learning. Trends Cogn Sci 19:688–699.
52. Fragaszy DM, et al. (2013) The fourth dimension of tool use: Temporally enduring artefacts aid primates learning to use tools. Philos Trans R Soc B Biol Sci 368:20120410.
53. Hewlett BS, Roulette CJ (2016) Teaching in hunter-gatherer infancy. R Soc Open Sci 3:150403.
54. Burkart JM, Hrdy SB, Van Schaik CP (2009) Cooperative breeding and human cognitive evolution. Evol Anthropol 18:175–186.
55. Stout D, Chaminade T (2012) Stone tools, language and the brain in human evolution. Philos Trans R Soc Lond B Biol Sci 367:75–87.
56. Laland KN, et al. (2015) The extended evolutionary synthesis: Its structure, assumptions and predictions. Proc R Soc Biol 282:20151019.
57. Deacon TW (1997) The Symbolic Species: The Co-Evolution of Language and the Brain (WW Norton, New York).
58. Edelman GM (1987) Neural Darwinism (Basic Books, New York).
59. Power JD, Fair DA, Schlaggar BL, Petersen SE (2010) The development of human functional brain networks. Neuron 67:735–748.
60. Mantini D, Corbetta M, Romani GL, Orban GA, Vanduffel W (2013) Evolutionarily novel functional networks in the human brain? J Neurosci 33:3259–3275.
61. Margulies DS, et al. (2016) Situating the default-mode network along a principal gradient of macroscale cortical organization. Proc Natl Acad Sci USA 113:12574–12579.
62. Buckner RL, Krienen FM (2013) The evolution of distributed association networks in the human brain. Trends Cogn Sci 17:648–665.
63. Hill J, et al. (2010) Similar patterns of cortical expansion during human development and evolution. Proc Natl Acad Sci USA 107:13135–13140.
64. Mueller S, et al. (2013) Individual variability in functional connectivity architecture of the human brain. Neuron 77:586–595.
65. Preuss TM (2012) Human brain evolution: From gene discovery to phenotype discovery. Proc Natl Acad Sci USA 109:10709–10716.
66. Somel M, Liu X, Khaitovich P (2013) Human brain evolution: Transcripts, metabolites and their regulators. Nat Rev Neurosci 14:112–127.
67. Gómez-Robles A, Hopkins WD, Schapiro SJ, Sherwood CC (2015) Relaxed genetic control of cortical organization in human brains compared with chimpanzees. Proc Natl Acad Sci USA 112:14799–14804.
68. Murren CJ, et al. (2015) Constraints on the evolution of phenotypic plasticity: Limits and costs of phenotype and plasticity. Heredity (Edinb) 115:293–301.
69. Heyes C (2003) Four routes of cognitive evolution. Psychol Rev 110:713–727.
70. Flynn EG, Laland KN, Kendal RL, Kendal JR (2013) Target article with commentaries: Developmental niche construction. Dev Sci 16:296–313.
71. Wolpert DM, Doya K, Kawato M (2003) A unifying computational framework for motor control and social interaction. Philos Trans R Soc Lond B Biol Sci 358:593–602.
72. Duncan J (2010) The multiple-demand (MD) system of the primate brain: Mental programs for intelligent behaviour. Trends Cogn Sci 14:172–179.
73. Feldman R (2017) The neurobiology of human attachments. Trends Cogn Sci 21:80–99.
74. Hasson U, Frith CD (2016) Mirroring and beyond: Coupled dynamics as a generalized framework for modelling social interactions. Phil Trans R Soc B 371:20150366.
75. Tramacere A, Pievani T, Ferrari PF (November 16, 2016) Mirror neurons in the tree of life: Mosaic evolution, plasticity and exaptation of sensorimotor matching responses. Biol Rev Camb Philos Soc, 10.1111/brv.12310.
76. Heyes CM, Frith CD (2014) The cultural evolution of mind reading. Science 344:1243091.
77. Byrne R (1999) Imitation without intentionality. Using string parsing to copy the organization of behaviour. Anim Cogn 2:63–72.
78. Mars RB, et al. (2012) On the relationship between the “default mode network” and the “social brain”. Front Hum Neurosci 6:189.
79. Genovesio A, Wise SP, Passingham RE (2014) Prefrontal-parietal function: From foraging to foresight. Trends Cogn Sci 18:72–81.
80. Milner AD, Goodale MA (1995) The Visual Brain in Action (Oxford Univ Press, Oxford).
81. Orban GA, Caruana F (2014) The neural basis of human tool use. Front Psychol 5:310.
82. Stout D, Chaminade T (2007) The evolutionary neuroscience of tool making. Neuropsychologia 45:1091–1100.
83. Stout D, Passingham R, Frith C, Apel J, Chaminade T (2011) Technology, expertise and social cognition in human evolution. Eur J Neurosci 33:1328–1338.
84. Hecht EE, et al. (2013) Process versus product in social learning: Comparative diffusion tensor imaging of neural systems for action execution-observation matching in macaques, chimpanzees, and humans. Cereb Cortex 23:1014–1024.
85. Hecht EE, et al. (2013) Differences in neural activation for object-directed grasping in chimpanzees and humans. J Neurosci 33:14117–14134.
86. Denys K, et al. (2004) Visual activation in prefrontal cortex is stronger in monkeys than in humans. J Cogn Neurosci 16:1505–1516.
87. Hecht EE, Gutman DA, Bradley BA, Preuss TM, Stout D (2015) Virtual dissection and comparative connectivity of the superior longitudinal fasciculus in chimpanzees and humans. Neuroimage 108:124–137.
88. Duncan J, et al. (2000) A neural basis for general intelligence. Science 289:457–460.
89. Anderson JR, Gallup GG, Jr (2015) Mirror self-recognition: A review and critique of attempts to promote and engineer self-recognition in primates. Primates 56:317–326.
90. Hecht EE, Mahovetz LM, Preuss TM, Hopkins WD (2017) A neuroanatomical predictor of mirror self-recognition in chimpanzees. Soc Cogn Affect Neurosci 12:37–48.
91. Hihara S, et al. (2006) Extension of corticocortical afferents into the anterior bank of the intraparietal sulcus by tool-use training in adult monkeys. Neuropsychologia 44:2636–2646.
92. Kivell TL (2015) Evidence in hand: Recent discoveries and the early evolution of human manual manipulation. Phil Trans R Soc Lond B Biol Sci 370:20150105.
93. Stout D, Toth N, Schick K, Chaminade T (2008) Neural correlates of Early Stone Age toolmaking: Technology, language and cognition in human evolution. Philos Trans R Soc Lond B Biol Sci 363:1939–1949.
94. Harmand S, et al. (2015) 3.3-million-year-old stone tools from Lomekwi 3, West Turkana, Kenya. Nature 521:310–315.
95. Rightmire GP (2004) Brain size and encephalization in early to Mid-Pleistocene Homo. Am J Phys Anthropol 124:109–123.
96. Byrge L, Sporns O, Smith LB (2014) Developmental process emerges from extended brain-body-behavior networks. Trends Cogn Sci 18:395–403.
97. Gärdenfors P, Högberg A (2017) The archaeology of teaching and the evolution of Homo docens. Curr Anthrop 58:188–208.
98. Bogin B, Bragg J, Kuzawa C (2014) Humans are not cooperative breeders but practice biocultural reproduction. Ann Hum Biol 41:368–380.
99. Hill K, Barton M, Hurtado AM (2009) The emergence of human uniqueness: Characters underlying behavioral modernity. Evol Anthropol 18:187–200.
100. Arbib MA (2012) How the Brain Got Language: The Mirror System Hypothesis (Oxford Univ Press, New York).
101. Clark A (2006) Language, embodiment, and the cognitive niche. Trends Cogn Sci 10:370–374.
102. Dehaene S (1997) The Number Sense: How the Mind Creates Mathematics (Oxford Univ Press, New York).
103. Marx L (1997) Technology: The emergence of a hazardous concept. Soc Res 64:965–988.
Edited by Marcus W. Feldman, Stanford University, Stanford, CA, and approved May 16, 2017 (received for review January 31, 2017)
The archaeological record shows that typically human cultural traits emerged at different times, in different parts of the world, and among different hominin taxa. This pattern suggests that their emergence is the outcome of complex and nonlinear evolutionary trajectories, influenced by environmental, demographic, and social factors, that need to be understood and traced at regional scales. The application of predictive algorithms using archaeological and paleoenvironmental data allows one to estimate the ecological niches occupied by past human populations and identify niche changes through time, thus providing the possibility of investigating relationships between cultural innovations and possible niche shifts. By using such methods to examine two key southern Africa archaeological cultures, the Still Bay [76–71 thousand years before present (ka)] and the Howiesons Poort (HP; 66–59 ka), we identify a niche shift characterized by a significant expansion in the breadth of the HP ecological niche. This expansion is coincident with aridification occurring across Marine Isotope Stage 4 (ca. 72–60 ka) and especially pronounced at 60 ka. We argue that this niche shift was made possible by the development of a flexible technological system, reliant on composite tools and cultural transmission strategies based more on “product copying” rather than “process copying.” These results counter the one niche/one human taxon equation. They indicate that what makes our cultures, and probably the cultures of other members of our lineage, unique is their flexibility and ability to produce innovations that allow a population to shift its ecological niche.
Middle Stone Age | Still Bay | Howiesons Poort | ecological niche modeling | paleoclimate
Research on animal behavior has made it clear that culture represents a second inheritance system that may have changed the dynamics of evolution on a broad scale (1–3). Understanding how this process has affected the evolution of our genus is a major challenge in paleoanthropology. In what ways, and through what phases of evolutionary history, has human culture extended beyond culture seen in other species? Are the cultural adaptations and associated cultural innovations that we observe in the archaeological record the direct consequence of our biological evolution, or are they the outcome of mechanisms largely independent of it?
In our lineage, if cultural innovations were directly linked to classic Darwinian evolutionary processes, such as isolation, random mutation, selection, and speciation, one would expect a clear correspondence between the emergence of a new species and a related set of novel cultural behaviors. By shaping a new hominin species, natural selection would provide this species with a new cognitive setting resulting in the capacity for particular cultural innovations or behaviors. Such a mechanism would provide the possibility for cultural variability but would narrow its range of expression to the species’ biologically dictated potential. Although some would still argue that there is a direct link between cultural behavior and hominin taxonomy and, as a consequence, that the typically human secondary inheritance system only emerged with our species, archaeological and paleogenetic research conducted over the past 20 y challenges such a view.
First, for periods <200,000 years before the present (ka), it is difficult to attribute a particular cognition and resulting cultural behavior to a particular fossil species because paleogenetic evidence shows that significant interbreeding occurred between Neanderthals, Denisovans, and anatomically modern humans (AMHs) (4–6), thus blurring the concept of fossil species that many paleoanthropologists had in the past when interpreting morphological differences between human remains. Each new round of publications concerning paleogenetics shows that we are confronted with a complex network of genetic relationships rather than distinct and simple lines of evolutionary descent. There is no reason to assume that such a pattern did not characterize other phases of our lineage’s evolution.
Second, archaeological discoveries show that the cultural innovations generally seen as reflecting modern cognition and behavior did not emerge as a single package in conjunction with the appearance of our species in Africa. We know that AMHs emerged in Africa between 200 and 160 ka (7–9), but some behaviors considered as “modern” are present in Africa before this speciation event. Ochre use appears at around 300 ka (10), and laminar blade production is observed perhaps as early as 500 ka (11). Other modern cultural traits are only observed in the African archaeological record after ca. 100 ka. Such is the case with heating of stone to facilitate knapping or retouching, pressure-flaked bifacial projectile points, microlithic armatures, mastic-facilitated hafting of stone tools, formal bone tools, abstract engravings, the production of paint and pigment containers, personal ornaments, and primary burials (12–15). Furthermore, many key cultural innovations are present outside Africa well before AMH
This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: F.d. and W.E.B. designed research; F.d., W.E.B., D.L.W., and A.-L.D. performed research; F.d., W.E.B., D.L.W., G.S., K.v.N., and C.H. analyzed data; K.v.N. and C.H. provided and reviewed archaeological data; A.-L.D. and M.F.S.G. interpreted paleoclimatic data; and F.d., W.E.B., D.L.W., A.-L.D., and M.F.S.G. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
¹F.d. and W.E.B. contributed equally to this work.
²To whom correspondence should be addressed. Email: francesco.derrico@u-bordeaux.fr.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1620752114/-/DCSupplemental.
hard and soft hammer percussion, and finished using a technique termed pressure flaking. The latter allows for more refined shaping of the object by giving the knapper better control over its final form. Modern-day experiments indicate that this knapping technology requires a long period of apprenticeship. SB bifaces were multifunctional and served as both projectiles and cutting tools. Examinations of SB lithic assemblages (41) show that these bifaces were often repeatedly resharpened and had long use-lives, indicating that they formed a curated component of the SB lithic toolkit. The SB is also the first archaeological culture in which formal bone tools (i.e., artifacts made of animal osseous material shaped with techniques, such as scraping, grinding, and incising, specifically conceived for these materials) are observed at multiple sites rather than as rare elements in single assemblages. Technological and functional studies show that the two different classes of tool, projectiles and awls (Fig. 2C), were produced with different techniques and that special attention was paid to the finishing of the bone projectile points, suggesting that they were highly valued and possibly status items. The SB is also the first archaeological culture in southern Africa associated with personal ornaments. These ornaments take the form of marine shells (Nassarius kraussianus) that were deliberately perforated, stained with ochre, and strung together in a variety of arrangements (33) (Fig. 2B). Use-wear analyses indicate that they were worn for extended periods of time (42). Other elements of SB symbolic material culture include elaborately engraved abstract patterns on ochre pieces (Fig. 2D), as well as simpler engravings on bone items. Also present in assemblages are ochre pieces bearing traces indicating that they were processed to produce red powder (Fig. 2E), which was likely used for both functional and symbolic purposes.
With respect to chronology, a majority of SB sites have yielded optically stimulated luminescence (OSL) ages that range between 76 ka and 71 ka (34, 43–45). Debate exists as to the accuracy of this range due to older OSL and thermoluminescence (TL) dates from the Diepkloof rock shelter (45–48). Because the inexplicably older set of dates from Diepkloof remains a unicum, we will use the currently accepted chronology (45, 49, 50). Debate also exists as to whether this culture is technologically homogeneous or, instead, characterized by regional and temporal variability (41). This issue, however, remains open due to a lack of chronological resolution and the small number of contextually reliable archaeological assemblages.
The HP. This archaeological culture, observed in both coastal and inland regions of southwestern and northeastern South Africa (Fig. 1), is principally characterized by the presence of backed
Fig. 2. (Left) SB artifacts [bifacial points made of quartz and silcrete (A), perforated N. kraussianus shell beads (B), bone points and an awl (C), engraved
ochre fragments (D), and an ochre fragment shaped by grinding (E)]. (Right) HP artifacts [segment made of hornfels (F), segments made of quartz (G), flake
and segments bearing residues of mastic (H), engraved ostrich egg shells (I), ochre fragments shaped by grinding (J), and bone point and awls (K)]. Blombos
Cave (A and B), Sibudu Cave (F, G, and K), and Diepkloof Shelter (H–J) are shown. (Scale bar: 1 cm.) Images courtesy of: (A) ref. 41, (C) ref. 98, (D) ref. 12, (F) ref.
53, (G) ref. 99, (H) ref. 59, (I) ref. 61, (J) ref. 100, and (K) ref. 55. Fig. 2B courtesy of F.D. and C.H., and Fig. 2E courtesy of C.H.
d’Errico et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7871
blades and bladelets (i.e., lithic blades steeply retouched on one side to form crescent-shaped segments) (Fig. 2 F and G) that were predominantly used as components in composite hunting weapons. These tools, although not highly standardized dimensionally or morphologically, were made with a lithic reduction system that was geared toward the production of thin, straight blades, some of which were retouched to make this culture’s fossil directeur along with denticulated tools (29, 41, 51). Raw materials used for the lithic technology were predominantly local or near-local in origin, in clear contrast to what is seen for SB bifaces. Similar to the SB, however, HP groups also sometimes heated lithic raw materials before they were reduced to produce blades (52) and occasionally used pressure flaking (53). Bifacial points are absent in the HP, with the exception of a single site where specimens that are smaller and of lower quality have been recovered (54). Bone tools recovered from HP sites consist of awls, pressure flakers, shaped splintered pieces (pièces esquillées), and small projectile points (55) (Fig. 2K). It has been argued that HP backed segments and bone points were used as bow-delivered arrow points based on use-wear, fracture patterns, and morphometrics (56–58). The interpretation that these tools were hafted is supported by the presence of mastic remnants observed on some backed pieces (31, 59) (Fig. 2H). At present, with the exception of a perforated conus shell found within an infant burial at Border Cave (60), personal ornaments are lacking in HP assemblages, and undisputed symbolic behavior is limited to the decoration of ostrich egg shell water containers with a variety of abstract designs made up of linear engravings (51, 61) (Fig. 2I). Red ochre (Fig. 2J), also sometimes incorporated into mastic mixtures, was widely used by HP groups.
The HP has predominantly been dated with OSL and TL techniques and appears to have lasted for a slightly longer period than the SB. HP dates range between roughly 66 ka and 59 ka (34, 51, 62). As with the SB, some OSL dates of the HP at Diepkloof are significantly older (47, 48) than the corpus of dates available from other South African sites, as well as from other OSL dates obtained at the same site (63). Based on the fact that the newly recalculated dates for the Diepkloof HP (63) cluster with the HP dates from other dated contexts (50), we will use the 66–59 ka range as the chronological interval for the HP in this study. Shortly after ca. 59 ka, we observe the appearance of the post-HP archaeological culture.
temperatures and humidity in the South Atlantic and Southwestern Indian Ocean (70–74). For southern Africa, Ziegler et al. (38) examined the elemental composition of marine sediments from an Indian Ocean core and proposed that GS and HS events are characterized by increased erosion reflecting higher precipitation that triggered increases in vegetation cover and biomass. Recent research has provided direct data concerning vegetation cover and biomass for this region. Pollen and microcharcoal records from marine core MD96-2098, retrieved off southwestern Africa (refs. 65, 67 and this study), show repeated millennial-scale changes in humidity during the last glacial period that also indicate, within the uncertainties of the independent ice and marine chronologies, that GS and HS events were associated with increases in humidity. Such increases are inferred from peaks in microcharcoal concentration due to grass-fueled fires and decreases in pollen from vegetation characteristic of open environments, such as Nama Karoo and fine-leaved savanna (Fig. 3 D and E). However, when the entire chronological interval for both the SB and HP is taken into account, a more complex climatic pattern is observed, characterized by an alternation of wet and dry events. Despite this variability, the general pattern revealed by all available continental proxies across the entire range of each archaeological culture shows an overall trend toward higher humidity during the SB and generally dryer conditions during the HP. The contradictory pattern proposed by Ziegler et al. (38) is probably due to the fact that they do not consider the entire range of these two cultures but, instead, only look at the humidity trends coincident with each culture’s mean age.
Materials and Methods
Paleoclimate Modeling. To estimate ecological niches exploited by the SB and HP, we used paleoclimatic and vegetation simulations produced by Woillez et al. (66) (SI Appendix, Paleoclimatic Simulations) for the periods of 72 ka and 60 ka. Because the two simulations are primarily constrained by orbital parameters and do not estimate suborbital variability, we used the 72 ka simulation to represent climatic and environmental conditions for the SB and the initial HP (ca. 66–63 ka) and the 60 ka simulation to represent conditions for the terminal HP (ca. 63–59 ka). The use of the 72 ka simulation as a proxy for climatic conditions of the initial HP is justified by the relatively high humidity observed at the onset of HS 6, as evidenced by vegetation, fire activity, and erosion proxies (Fig. 3 C–E, respectively). To estimate the SB and HP ecocultural niches, we used temperature of the coldest month, maximum precipitation, minimum precipitation, mean annual precipitation, mean annual temperature, and a measure of biomass from the relevant paleoclimatic simulations.
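The assignment of simulations to cultural phases described in the Paleoclimate Modeling paragraph can be written out as a small lookup. The following Python sketch is illustrative only and is not the authors' code: the identifier names, the use of the 76–71 ka SB age range as a lookup interval, and the choice to assign an age of exactly 63 ka to the initial HP are all assumptions made for the example.

```python
from dataclasses import dataclass

# The six predictors named in the text; the identifier spellings are hypothetical.
PREDICTORS = (
    "temperature_coldest_month",
    "maximum_precipitation",
    "minimum_precipitation",
    "mean_annual_precipitation",
    "mean_annual_temperature",
    "biomass",
)


@dataclass(frozen=True)
class Phase:
    name: str
    oldest_ka: float      # older bound of the phase, in ka
    youngest_ka: float    # younger bound of the phase, in ka
    simulation_ka: int    # paleoclimatic simulation used as a proxy


# Phase bounds follow the ranges quoted in the text (all "ca." values).
PHASES = (
    Phase("SB", 76, 71, 72),
    Phase("initial HP", 66, 63, 72),
    Phase("terminal HP", 63, 59, 60),
)


def simulation_for(age_ka: float) -> int:
    """Return which simulation (72 or 60 ka) stands in for a given age.

    Phases are checked in order, so the shared 63 ka boundary falls in the
    initial HP (an assumption; the text only gives approximate bounds).
    """
    for phase in PHASES:
        if phase.youngest_ka <= age_ka <= phase.oldest_ka:
            return phase.simulation_ka
    raise ValueError(f"{age_ka} ka falls outside the SB and HP intervals")
```

For example, `simulation_for(74)` and `simulation_for(65)` both select the 72 ka simulation, whereas `simulation_for(60)` selects the 60 ka simulation. An age in the gap between the two cultures (e.g., 68 ka) raises an error rather than guessing.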
Paleoenvironmental Context. These two archaeological cultures occurred during two very different climatic phases (Fig. 3). At the orbital scale, the SB occurs in a phase of precession maximum during which one observes higher seasonality and an increase in precipitation in the Southern Hemisphere (64–67). To the contrary, the HP is contemporaneous with a decrease in precession, with the minimum reached toward its end (ca. 60–59 ka). This change resulted in lower seasonality and drier conditions (SI Appendix, Fig. S1). In addition to orbital climatic variability, SB and HP cultures were subjected to suborbital climatic fluctuations, the so-called Dansgaard–Oeschger (D-O) cycles expressed over Greenland by alternating cold stadials and temperate interstadials, as well as intermittent and extreme cooling episodes recorded in the North Atlantic, termed Heinrich Stadials (HSs). These millennial-scale events are also recorded in Antarctic paleoclimatic records.
The SB occurs during a period comprising Greenland Interstadial (GI) 20, Greenland Stadial (GS) 20, and GI 19 (68) (Fig. 3). This culture disappears from the archaeological record during the initial phase of GS 19 (GS 19.2). The HP appears toward the end of GS 19 and is present across GI 18 and GS 18 (ca. 64.4–59.4 ka, which corresponds to HS 6) (69). The suite of diagnostic elements characteristic of this archaeological culture is no longer present by ca. 59–58 ka, a period marked by rapid climatic oscillations (i.e., GI 17.1, GS 17.1, GI 16.2, GS 16.2). It
Ecological Niche Modeling and Hypothesis Testing. To reconstruct the potential ecological (ecocultural) niches exploited by the SB and HP and evaluate whether cultural changes between the two are associated with an ecological niche shift, we constructed a georeferenced list of archaeological sites with levels that can be securely attributed to one of these cultures (Fig. 1 and SI Appendix, Table S1). We then used these occurrence data to conduct tests using both Bioclim (75) and Maxent (76) predictive algorithms within the “dismo” R package (77, 78) (SI Appendix, Ecological Niche Modeling). We use these two algorithms to explore the differences seen when models are allowed to extrapolate freely into combinations of environments that were unavailable during model training (Maxent) versus models that are constrained so that they do not extrapolate beyond the minima and maxima of the marginal environmental distributions of the examined population (Bioclim). Due to Maxent’s ability to extrapolate, we anticipate that similarity between different target populations will generally be seen to be higher when environmental niches are modeled using Maxent as opposed to Bioclim. With these two algorithms, we reconstructed both SB and HP niches using relevant climatic outputs and simulated biomass from the 72 ka simulation and compared these results. We also reconstructed the HP niche using simulation outputs for 60 ka and compared these estimations with the estimations of the SB at 72 ka. A series of Monte Carlo randomization tests was conducted to assess the differences in the set of environments occupied by each culture. This approach is based on widely used methods in evolutionary ecology (the “background” or “similarity” test) (79, 80) that are used to assess whether two populations exhibit statistically significant differences in their environ-
is following this interval that the post-HP adaptation appears. mental tolerances or associations (SI Appendix, Ecological Niche Modeling). We
The impact of the D-O millennial scale climatic variability and also conducted tests using measures of niche breadth (81, 82) to determine
HSs on the Southern Hemisphere regional climates has recently whether any observed differences between the two cultures’ environmental
been investigated. Model experiments and climate reconstructions niches represent a statistically significant expansion of the niche. Because some
suggest that GS and HS events resulted in increased sea surface of these evaluations were conducted using different climate layers for the SB
ANTHROPOLOGY
centage record from core MD96-2098 indicating
changes in precipitation (67) (E), and temperature
curve for Antarctica from the European Project for Ice
Coring in Antarctica ice core (102) (F). Arrows situated
between curves in C and D indicate long-term trends
in humidity during the SB and HP intervals.
and HP (72 ka and 60 ka, respectively), modifications that use Latin hypercube sampling were made to the background similarity tests (SI Appendix, Ecological Niche Modeling and Fig. S2).
Results
Niche estimations for the SB at 72 ka produced with Bioclim and Maxent both indicate a high probability of presence primarily restricted to the extreme southern and eastern portions of present-day South Africa (Fig. 4 A and B). The most noticeable difference is that the Maxent prediction includes areas in the southwestern Cape as well as immediately coastal regions along the southeastern and eastern coasts. This broader Maxent prediction is due to this algorithm’s propensity to extrapolate into environments not directly associated with the input occurrence data (i.e., archaeological sites). The predicted niches for the HP at 66 ka, produced with the proxy 72 ka outputs, include those regions predicted for the SB as well as more inland areas, including the Great Escarpment, the Highveld, and the Kaap Plateau, and broader areas within the southwestern Cape and western coastal regions (Fig. 4 C and D). The niche estimations for the HP at 60 ka remain geographically broader than the niche estimations for the SB and still include major inland plateaus but are visibly shifted toward the east and northeast (Fig. 4 E and F), which represent areas that were less affected by the eastward expansion of desert areas during Marine Isotope Stage (MIS) 4 (66).
Background similarity tests of overlap between the SB and HP niches, both modeled with Maxent using the 72 ka climatic data, produced no statistically significant result (SI Appendix, Fig. S3A and Table S2), meaning that their respective niches are not statistically different from one another. As pointed out above (Materials and Methods), this lack of a significant difference between predictions is likely a result of the algorithm used. In contrast, the same tests using Bioclim found that SB and HP niche estimations using 72 ka climate outputs were less similar than expected by chance (I-statistic: P ∼ 0.022; SI Appendix, Fig. S3C and Table S2). Although HP niche estimates are slightly broader than niche estimates of the SB at 72 ka with both Maxent and Bioclim, these differences are not statistically significant (SI Appendix, Fig. S3 B and D and Table S2). Niche overlap between Maxent models for the SB at 72 ka and the HP at 60 ka was neither greater nor less than expected by chance (SI Appendix, Fig. S3E and Table S2). However, overlap of Bioclim predictions for the SB at 72 ka and the HP at 60 ka was significantly lower than would be expected by chance (I-statistic: P ∼ 0.013; SI Appendix, Fig. S3G and Table S2), indicating that the two cultures occupied different ecological niches. Change in niche breadth between Maxent predictions for the SB at 72 ka and the HP at 60 ka is not statistically different from random expectations, although the approximate P value is fairly low (P ∼ 0.11) (SI Appendix, Fig. S3F and Table S2), suggesting that a greater sample size might establish the HP niche at 60 ka as significantly broader than the SB niche at 72 ka. The difference in niche breadth for Bioclim models is greater than expected by chance (P ∼ 0.027) (SI Appendix, Fig. S3H and Table S2), indicating that the HP 60 ka niche is broader than the niche of the SB at 72 ka, and points to an ecological niche expansion.
Fig. 4. Ecological niche predictions for the SB archaeological culture at 72 ka (A and B), the HP archaeological culture at 66 ka (C and D), and the HP archaeological culture at 60 ka (E and F) produced with Bioclim and Maxent, respectively.
d’Errico et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7873
Discussion and Conclusions
To what extent does this study allow us to understand how human culture extended beyond behavioral adaptations observed in other species? Most species exhibit niche conservatism, contraction, or, more rarely, extinction when faced with climate change (83–85). Human populations, however, are unique in their capacity for cumulative culture and associated complex cultural transmission strategies that potentially allow them to adapt to climate change and environmental reorganization via cultural means. We observe such a pattern between the SB and the HP of Southern Africa. The SB was a coastal adaptation that exploited a relatively narrow niche during mild climatic conditions across a large region. To exploit that niche, SB populations developed a variety of complex technologies and symbolic practices, some of which certainly entailed costly modes of cultural transmission. A number of SB cultural features, such as bifacial points and complex beadworking, could only be transmitted by communication and learning strategies that emphasize imitation (high-fidelity copying) over emulation (low-fidelity copying) (86, 87). HP populations significantly increased the breadth of their niche compared with SB populations. This expansion incorporated more arid and high-altitude inland environments and demonstrates their ability to cope successfully with the more arid climatic conditions and higher ecological risk associated with MIS 4, particularly its latter phase. This shift was made possible by developing a cohesive adaptive system reliant on more flexible technologies. The variety of lithic raw materials used, blank production techniques, and methods to
that we have applied here is an effective means with which to explore relationships between climate variability and cohesive adaptive
European Research Council’s Advanced Grant TRACSYMBOLS 249587 awarded under the Seventh Framework Programme.
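The envelope constraint, the overlap measure, and the background similarity test described under Ecological Niche Modeling and Hypothesis Testing can be illustrated with a small sketch. It is not the authors' analysis and does not use the dismo API; it is a pure-Python toy under stated assumptions. The envelope is taken to be a plain per-variable minimum-maximum box (so it never extrapolates beyond the observed ranges, as the text says of Bioclim), overlap is the I statistic of ref. 79 (I = 1 − H²/2, with H the Hellinger distance), and niche breadth uses the inverse-concentration form of Levins' measure (ref. 81). All function names, the toy grid, and the site coordinates are hypothetical.

```python
import math
import random


def envelope(points):
    """Per-variable (min, max) box around occurrence points (Bioclim-like constraint)."""
    return [(min(col), max(col)) for col in zip(*points)]


def in_envelope(cell, box):
    """True if every variable of the cell lies inside the box."""
    return all(lo <= v <= hi for v, (lo, hi) in zip(cell, box))


def suitability(cells, box):
    """Binary suitability over background cells, normalised to sum to 1."""
    raw = [1.0 if in_envelope(c, box) else 0.0 for c in cells]
    total = sum(raw)
    if total == 0:
        raise ValueError("envelope contains no background cell")
    return [r / total for r in raw]


def i_statistic(p, q):
    """Niche overlap I = 1 - H^2 / 2 (1 = identical, 0 = no overlap)."""
    h2 = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return 1.0 - 0.5 * h2


def levins_breadth(p):
    """Levins' breadth B = 1 / sum(p_i^2) for a normalised suitability vector."""
    return 1.0 / sum(a * a for a in p)


def similarity_test(sites_a, sites_b, cells, n_iter=199, seed=0):
    """Simplified background test (assumption: B's background is the whole grid).

    Compares the observed overlap between A and B with overlaps between A and
    pseudo-cultures built from random background cells. Returns the observed I
    and a one-sided Monte Carlo P value for 'less similar than expected'.
    """
    rng = random.Random(seed)
    p_a = suitability(cells, envelope(sites_a))
    observed = i_statistic(p_a, suitability(cells, envelope(sites_b)))
    null = []
    for _ in range(n_iter):
        fake_b = rng.sample(cells, len(sites_b))
        null.append(i_statistic(p_a, suitability(cells, envelope(fake_b))))
    # Small tolerance guards against floating-point noise near zero overlap.
    n_low = sum(v <= observed + 1e-12 for v in null)
    return observed, (1 + n_low) / (n_iter + 1)


# Toy background: a 10 x 10 grid of (temperature index, precipitation index).
cells = [(t, p) for t in range(10) for p in range(10)]
sites_a = [(1, 1), (2, 2), (3, 1)]   # hypothetical sites of culture A
sites_b = [(6, 7), (8, 8), (7, 6)]   # hypothetical sites of culture B

observed, p_low = similarity_test(sites_a, sites_b, cells)
breadth_a = levins_breadth(suitability(cells, envelope(sites_a)))
breadth_b = levins_breadth(suitability(cells, envelope(sites_b)))
```

With a binary envelope, Levins' breadth reduces to the number of background cells inside the box, so comparing `breadth_a` and `breadth_b` is a direct comparison of envelope size. A continuous suitability surface, such as a Maxent output, would plug into `i_statistic` and `levins_breadth` unchanged once normalised.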
1. Whiten A, Ayala FJ, Feldman MW, Laland KN (2017) The extension of biology through culture. Proc Natl Acad Sci USA 114:7775–7781.
2. Whiten A (2017) Culture extends the scope of evolutionary biology in the great apes. Proc Natl Acad Sci USA 114:7790–7797.
3. Street SE, Navarrete AF, Reader SM, Laland KN (2017) Coevolution of cultural intelligence, extended life history, sociality, and brain size in primates. Proc Natl Acad Sci USA 114:7908–7914.
4. Sankararaman S, et al. (2014) The genomic landscape of Neanderthal ancestry in present-day humans. Nature 507:354–357.
5. Sankararaman S, Mallick S, Patterson N, Reich D (2016) The combined landscape of Denisovan and Neanderthal ancestry in present-day humans. Curr Biol 26:1241–1247.
6. Nielsen R, et al. (2017) Tracing the peopling of the world through genomics. Nature 541:302–310.
7. White TD, et al. (2003) Pleistocene Homo sapiens from Middle Awash, Ethiopia. Nature 423:742–747.
8. McDougall I, Brown FH, Fleagle JG (2005) Stratigraphic placement and age of modern humans from Kibish, Ethiopia. Nature 433:733–736.
9. Gronau I, Hubisz MJ, Gulko B, Danko CG, Siepel A (2011) Bayesian inference of ancient human demography from individual genome sequences. Nat Genet 43:1031–1034.
10. Watts I, Chazan M, Wilkins J (2016) Early evidence for brilliant ritualized display: Specularite use in the Northern Cape (South Africa) between ∼500 and ∼300 ka. Curr Anthropol 57:287–310.
11. Wilkins J, Chazan M (2012) Blade production ∼500 thousand years ago at Kathu Pan 1, South Africa: Support for a multiple origins hypothesis for early Middle Pleistocene blade technologies. J Archaeol Sci 39:1883–1900.
12. Henshilwood CS, d’Errico F, Watts I (2009) Engraved ochres from the Middle Stone Age levels at Blombos Cave, South Africa. J Hum Evol 57:27–47.
13. Mourre V, Villa P, Henshilwood CS (2010) Early use of pressure flaking on lithic artifacts at Blombos Cave, South Africa. Science 330:659–662.
14. d’Errico F, Stringer CB (2011) Evolution, revolution or saltation scenario for the emergence of modern cultures? Philos Trans R Soc Lond B Biol Sci 366:1060–1069.
15. Henshilwood CS, et al. (2011) A 100,000-year-old ochre-processing workshop at Blombos Cave, South Africa. Science 334:219–222.
16. Mazza PPA, et al. (2006) A new Palaeolithic discovery: Tar-hafted stone tools in a European Mid-Pleistocene bone-bearing bed. J Archaeol Sci 33:1310–1318.
17. Jaubert J, et al. (2016) Early Neanderthal constructions deep in Bruniquel Cave in southwestern France. Nature 534:111–114.
18. Romandini M, et al. (2014) Convergent evidence of eagle talons used by late Neanderthals in Europe: A further assessment on symbolism. PLoS One 9:e101278.
19. Radovčić D, Sršen AO, Radovčić J, Frayer DW (2015) Evidence for Neandertal jewelry: Modified white-tailed eagle claws at Krapina. PLoS One 10:e0119802.
20. Soressi M, D’Errico F (2007) Pigments, gravures, parures: Les comportements symboliques controversés des Néandertaliens. Les Néandertaliens. Biologie et Cultures, Documents préhistoriques 23, eds Vandermeersch B, Maureille B (Éditions du CTHS, Paris), pp 297–309. French.
21. Rodríguez-Vidal J, et al. (2014) A rock engraving made by Neanderthals in Gibraltar. Proc Natl Acad Sci USA 111:13301–13306.
22. Zilhão J, et al. (2010) Symbolic use of marine shells and mineral pigments by Iberian Neandertals. Proc Natl Acad Sci USA 107:1023–1028.
23. Caron F, d’Errico F, Del Moral P, Santos F, Zilhão J (2011) The reality of Neandertal symbolic behavior at the Grotte du Renne, Arcy-sur-Cure, France. PLoS One 6:e21545.
24. Joordens JCA, et al. (2015) Homo erectus at Trinil on Java used shells for tool production and engraving. Nature 518:228–231.
25. d’Errico F, et al. (2009) Out of Africa: Modern human origins special feature: Additional evidence on the use of personal ornaments in the Middle Paleolithic of North Africa. Proc Natl Acad Sci USA 106:16051–16056.
26. Dehaene S, Cohen L (2007) Cultural recycling of cortical maps. Neuron 56:384–398.
27. Stout D, Hecht EE (2017) Evolutionary neuroscience of cumulative culture. Proc Natl Acad Sci USA 114:7861–7868.
28. Racimo F, Sankararaman S, Nielsen R, Huerta-Sánchez E (2015) Evidence for archaic adaptive introgression in humans. Nat Rev Genet 16:359–371.
29. Henshilwood CS (2012) Late Pleistocene techno-traditions in Southern Africa: A review of the Still Bay and Howiesons Poort, c. 75–59 ka. J World Prehist 25:205–237.
30. Scerri EML, Groucutt HS, Jennings RP, Petraglia MD (2014) Unexpected technological heterogeneity in northern Arabia indicates complex Late Pleistocene demography at the gateway to Asia. J Hum Evol 75:125–142.
31. Wadley L, Hodgskiss T, Grant M (2009) From the cover: Implications for complex cognition from the hafting of tools with compound adhesives in the Middle Stone Age, South Africa. Proc Natl Acad Sci USA 106:9590–9594.
32. Wadley L, et al. (2011) Middle Stone Age bedding construction and settlement patterns at Sibudu, South Africa. Science 334:1388–1391.
33. Vanhaeren M, d’Errico F, van Niekerk KL, Henshilwood CS, Erasmus RM (2013) Thinking strings: Additional evidence for personal ornament use in the Middle Stone Age at Blombos Cave, South Africa. J Hum Evol 64:500–517.
34. Jacobs Z, et al. (2008) Ages for the Middle Stone Age of southern Africa: Implications for human behavior and dispersal. Science 322:733–735.
35. Osborne AH, et al. (2008) A humid corridor across the Sahara for the migration of early modern humans out of Africa 120,000 years ago. Proc Natl Acad Sci USA 105:16444–16447.
36. Armitage SJ, et al. (2011) The southern route “out of Africa”: Evidence for an early expansion of modern humans into Arabia. Science 331:453–456.
37. Compton JS (2011) Pleistocene sea-level fluctuations and human evolution on the southern coastal plain of South Africa. Quat Sci Rev 30:506–527.
38. Ziegler M, et al. (2013) Development of Middle Stone Age innovation linked to rapid climate change. Nat Commun 4:1905.
39. d’Errico F, Banks WE (2013) Identifying mechanisms behind Middle Paleolithic and Middle Stone Age cultural trajectories. Curr Anthropol 54:S371–S387.
40. Peterson AT, et al. (2011) Ecological Niches and Geographic Distributions (Princeton Univ Press, Princeton).
41. Soriano S, et al. (2015) The Still Bay and Howiesons Poort at Sibudu and Blombos: Understanding Middle Stone Age technologies. PLoS One 10:e0131127.
42. d’Errico F, Henshilwood C, Vanhaeren M, van Niekerk K (2005) Nassarius kraussianus shell beads from Blombos Cave: Evidence for symbolic behaviour in the Middle Stone Age. J Hum Evol 48:3–24.
43. Jacobs Z, Duller GAT, Wintle AG (2003) Optical dating of dune sand from Blombos Cave, South Africa: II–single grain data. J Hum Evol 44:613–625.
44. Jacobs Z, Duller GAT, Wintle AG, Henshilwood CS (2006) Extending the chronology of deposits at Blombos Cave, South Africa, back to 140 ka using optical dating of single and multiple grains of quartz. J Hum Evol 51:255–273.
45. Jacobs Z, Hayes EH, Roberts RG, Galbraith RF, Henshilwood CS (2013) An improved OSL chronology for the Still Bay layers at Blombos Cave, South Africa: Further tests of single-grain dating procedures and a re-evaluation of the timing of the Still Bay industry across southern Africa. J Archaeol Sci 40:579–594.
46. Tribolo C, et al. (2009) Thermoluminescence dating of a Stillbay–Howiesons Poort sequence at Diepkloof Rock Shelter (Western Cape, South Africa). J Archaeol Sci 36:730–739.
47. Guérin G, Murray AS, Jain M, Thomsen KJ, Mercier N (2013) How confident are we in the chronology of the transition between Howieson’s Poort and Still Bay? J Hum Evol 64:314–317.
48. Tribolo C, et al. (2013) OSL and TL dating of the Middle Stone Age sequence at Diepkloof Rock Shelter (South Africa): A clarification. J Archaeol Sci 40:3401–3411.
49. Archer W, Pop CM, Gunz P, McPherron SP (2016) What is Still Bay? Human biogeography and bifacial point variability. J Hum Evol 97:58–72.
50. Jacobs Z, Roberts RG (2017) Single-grain OSL chronologies for the Still Bay and Howieson’s Poort industries and the transition between them: Further analyses and statistical modelling. J Hum Evol 107:1–13.
51. Henshilwood CS, et al. (2014) Klipdrift Shelter, southern Cape, South Africa: Preliminary report on the Howiesons Poort layers. J Archaeol Sci 45:284–303.
52. Delagnes A, et al. (2016) Early evidence for the extensive heat treatment of silcrete in the Howiesons Poort at Klipdrift Shelter (Layer PBD, 65 ka), South Africa. PLoS One 11:e0163874.
53. de la Peña P (2015) Refining our understanding of Howiesons Poort lithic technology: The evidence from Grey Rocky Layer in Sibudu Cave (KwaZulu-Natal, South Africa). PLoS One 10:e0143451.
54. de la Peña P, Wadley L, Lombard M (2013) Quartz bifacial points in the Howiesons Poort of Sibudu. S Afr Archaeol Bull 68:119–136.
55. d’Errico F, Backwell LR, Wadley L (2012) Identifying regional variability in Middle Stone Age bone technology: The case of Sibudu Cave. J Archaeol Sci 39:2479–2495.
56. Backwell L, d’Errico F, Wadley L (2008) Middle Stone Age bone tools from the Howiesons Poort layers, Sibudu Cave, South Africa. J Archaeol Sci 35:1566–1580.
57. Lombard M, Phillipson L (2010) Indications of bow and stone-tipped arrow use 64,000 years ago in KwaZulu-Natal, South Africa. Antiquity 84:635–648.
58. Bradfield J, Lombard M (2011) A macrofracture study of bone points used in experimental hunting with reference to the South African middle stone age. S Afr Archaeol Bull 66:67.
59. Charrié-Duhaut A, et al. (2013) First molecular identification of a hafting adhesive in the Late Howiesons Poort at Diepkloof Rock Shelter (Western Cape, South Africa). J Archaeol Sci 40:3506–3518.
60. d’Errico F, Backwell L (2016) Earliest evidence of personal ornaments associated with burial: The Conus shells from Border Cave. J Hum Evol 93:91–108.
61. Texier P-J, et al. (2010) From the cover: A Howiesons Poort tradition of engraving ostrich eggshell containers dated to 60,000 years ago at Diepkloof Rock Shelter, South Africa. Proc Natl Acad Sci USA 107:6180–6185.
62. Wadley L, Mohapi M (2008) A segment is not a monolith: Evidence from the Howiesons Poort of Sibudu, South Africa. J Archaeol Sci 35:2594–2605.
63. Jacobs Z, Roberts RG (2015) An improved single grain OSL chronology for the sedimentary deposits from Diepkloof Rockshelter, Western Cape, South Africa. J Archaeol Sci 63:175–192.
64. Partridge TC, Demenocal PB, Lorentz SA, Paiker MJ, Vogel JC (1997) Orbital forcing of climate over South Africa: A 200,000-year rainfall record from the Pretoria saltpan. Quat Sci Rev 16:1125–1133.
65. Daniau A-L, et al. (2013) Orbital-scale climate forcing of grassland burning in southern Africa. Proc Natl Acad Sci USA 110:5069–5073.
66. Woillez M-N, et al. (2014) Impact of precession on the climate, vegetation and fire activity in southern Africa during MIS4. Climate of the Past 10:1165–1182.
67. Urrego DH, Sánchez Goñi MF, Daniau A-L, Lechevrel S, Hanquiez V (2015) Increased aridity in southwestern Africa during the warmest periods of the last interglacial. Climate of the Past 11:1417–1431.
68. Rasmussen SO, et al. (2014) A stratigraphic framework for abrupt climatic changes during the Last Glacial period based on three synchronized Greenland ice-core records: Refining and extending the INTIMATE event stratigraphy. Quat Sci Rev 106:14–28.
69. Sánchez Goñi MF, Bard E, Landais A, Rossignol L, d’Errico F (2013) Air-sea temperature decoupling in western Europe during the last interglacial-glacial transition. Nat Geosci 6:837–841.
70. Broccoli AJ, Dahl KA, Stouffer RJ (2006) Response of the ITCZ to Northern Hemisphere cooling. Geophys Res Lett 33:L01702.
71. Stouffer RJ, et al. (2006) Investigating the causes of the response of the thermohaline circulation to past and future climate changes. J Clim 19:1365–1387.
72. Barker S, et al. (2009) Interhemispheric Atlantic seesaw response during the last deglaciation. Nature 457:1097–1102.
73. Kanner LC, Burns SJ, Cheng H, Edwards RL (2012) High-latitude forcing of the South American summer monsoon during the Last Glacial. Science 335:570–573.
74. Marino G, et al. (2013) Agulhas salt-leakage oscillations during abrupt climate changes of the Late Pleistocene. Paleoceanography 28:599–606.
75. Nix HA (1986) A biogeographic analysis of Australian elapid snakes. Atlas of Elapid Snakes of Australia, Australian Flora and Fauna Series, ed Longmore R (Australian Government Publishing Service, Canberra), pp 4–15.
76. Phillips SJ, Anderson RP, Schapire RE (2006) Maximum entropy modeling of species geographic distributions. Ecol Modell 190:231–259.
77. Hijmans RJ, Phillips SJ, Leathwick J, Elith J (2017) dismo: Species Distribution Modeling. Available at https://cran.r-project.org/web/packages/dismo/index.html. Accessed January 11, 2017.
78. R Core Team (2013) R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna).
79. Warren DL, Glor RE, Turelli M (2008) Environmental niche equivalency versus conservatism: Quantitative approaches to niche evolution. Evolution 62:2868–2883.
80. Warren DL, Glor RE, Turelli M (2010) ENMTools: A toolbox for comparative studies of environmental niche models. Ecography 33:607–611.
81. Levins R (1968) Evolution in Changing Environments (Princeton Univ Press, Princeton).
82. Mandle L, et al. (2010) Conclusions about niche expansion in introduced Impatiens walleriana populations depend on method of analysis. PLoS One 5:e15297.
83. Parmesan C, Yohe G (2003) A globally coherent fingerprint of climate change impacts across natural systems. Nature 421:37–42.
84. Wiens JJ, Graham CH (2005) Niche conservatism: Integrating evolution, ecology, and conservation biology. Annu Rev Ecol Evol Syst 36:519–539.
85. Peterson AT (2011) Ecological niche conservatism: A time-structured review of evidence. J Biogeogr 38:817–827.
86. Tennie C, Call J, Tomasello M (2009) Ratcheting up the ratchet: On the evolution of cumulative culture. Philos Trans R Soc Lond B Biol Sci 364:2405–2415.
87. Whiten A, McGuigan N, Marshall-Pescini S, Hopper LM (2009) Emulation, imitation, over-imitation and the scope of culture for child and chimpanzee. Philos Trans R Soc Lond B Biol Sci 364:2417–2428.
88. Porraz G, et al. (2013) Technological successions in the Middle Stone Age sequence of Diepkloof Rock Shelter, Western Cape, South Africa. J Archaeol Sci 40:3376–3400.
89. d’Errico F, Banks WE (2015) The archaeology of teaching: A conceptual framework. Camb Archaeol J 25:859–866.
90. Torrence R (2001) Hunter-gatherer technology: Macro- and microscale patterns. Hunter-Gatherers: An Interdisciplinary Perspective, Biosocial Society Symposium Series, eds Panter-Brick C, Layton R, Rowley-Conwy P (Cambridge Univ Press, Cambridge, UK), pp 73–98.
91. Collard M, Kemery M, Banks S (2005) Causes of toolkit variation among hunter-gatherers: A test of four competing hypotheses. Can J Archaeol J Can D’Archéologie 29:1–19.
92. Read D (2008) An interaction model for resource implement complexity based on risk and number of annual moves. Am Antiq 73:599–625.
93. Collard M, Buchanan B, Morin J, Costopoulos A (2011) What drives the evolution of hunter-gatherer subsistence technology? A reanalysis of the risk hypothesis with data from the Pacific Northwest. Philos Trans R Soc Lond B Biol Sci 366:1129–1138.
94. Rendell L, et al. (2011) How copying affects the amount, evenness and persistence of cultural knowledge: Insights from the social learning strategies tournament. Philos Trans R Soc Lond B Biol Sci 366:1118–1128.
95. Kline MA, Boyd R (2010) Population size predicts technological complexity in Oceania. Proc R Soc Lond B Biol Sci 277:2559–2564.
96. Collard M, Buchanan B, O’Brien MJ, Scholnick J (2013) Risk, mobility or population size? Drivers of technological richness among contact-period western North American hunter-gatherers. Philos Trans R Soc Lond B Biol Sci 368:20120412.
97. Banks WE, d’Errico F, Zilhão J (2013) Human-climate interaction during the Early Upper Paleolithic: Testing the hypothesis of an adaptive shift between the Proto-Aurignacian and the Early Aurignacian. J Hum Evol 64:39–55.
98. d’Errico F, Henshilwood CS (2007) Additional evidence for bone technology in the southern African Middle Stone Age. J Hum Evol 52:142–163.
99. de la Peña P, Wadley L (2014) Quartz knapping strategies in the Howiesons Poort at Sibudu (KwaZulu-Natal, South Africa). PLoS One 9:e101534.
100. Dayet L, Texier P-J, Daniel F, Porraz G (2013) Ochre resources from the Middle Stone Age sequence of Diepkloof Rock Shelter, Western Cape, South Africa. J Archaeol Sci 40:3492–3505.
101. Laskar J, et al. (2004) A long-term numerical solution for the insolation quantities of the Earth. Astron Astrophys 428:261–285.
102. Jouzel J, et al. (2007) Orbital and millennial Antarctic climate variability over the past 800,000 years. Science 317:793–796.
Edited by Andrew Whiten, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark June 16, 2017
(received for review January 14, 2017)
The complexity and variability of human culture is unmatched by any other species. Humans live in culturally constructed niches filled with artifacts, skills, beliefs, and practices that have been inherited, accumulated, and modified over generations. A causal account of the complexity of human culture must explain its distinguishing characteristics: It is cumulative and highly variable within and across populations. I propose that the psychological adaptations supporting cumulative cultural transmission are universal but are sufficiently flexible to support the acquisition of highly variable behavioral repertoires. This paper describes variation in the transmission practices (teaching) and acquisition strategies (imitation) that support cumulative cultural learning in childhood. Examining flexibility and variation in caregiver socialization and children’s learning extends our understanding of evolution in living systems by providing insight into the psychological foundations of cumulative cultural transmission, the cornerstone of human cultural diversity.
cumulative culture | cultural evolution | cross-cultural comparison | teaching | imitation
PSYCHOLOGICAL AND COGNITIVE SCIENCES
or linear but instead is a process of punctuated accumulation; it involves the conservation of some features, incremental innovation, and occasionally dramatic qualitative shifts (36).
The diversity of skills, practices, beliefs, and values among populations is another distinguishing feature of human culture. Cultural groups are heterogeneous populations of individuals that differ along complex ecological, social, and structural variables. Socially acquired and transmitted behaviors vary more distinctly among human populations than in any other species (37). Cultural variability is one of our species’ most distinctive features, and a causal account of human culture must explain its diversity. The psychological adaptations supporting cumulative cultural transmission are hypothesized to be universal features of human psychology, but they must be sufficiently flexible to support the acquisition of highly variable skill sets and behavioral repertoires (38).
What psychological adaptations explain the species-specific capacity to accumulate and build upon the cultural innovations of previous generations? To what extent do cultural transmission practices (teaching) and cultural acquisition strategies (imitation) vary across populations? How do caregivers use teaching to
learning a complex or abstract skill often requires direct active instruction to acquire that skill efficiently. Caregivers play a critical role in transmitting the beliefs, skills, and practices of particular populations. Cultural transmission alone does not explain high-fidelity cultural acquisition. Young children are adept at acquiring the beliefs and practices of the groups they are born into, an extraordinary learning achievement that requires substantial flexibility (38). Next I review evidence for high-fidelity imitation, describe evidence for variation between populations, and discuss the implications for acquiring knowledge.
Variation in Cultural Acquisition Practices
Our species-typical proclivity for high-fidelity imitation is critical for cumulative cultural transmission (42, 117, 118). High-fidelity imitation plays a central role in both horizontal and vertical transmission of group-specific cultural practices. Young children possess cognitive and communication systems that support the transmission of complex technical skills and social conventions (46, 65, 75).
Children learn the skills and practices of their communities by imitating others. The ability and motivation to engage in high-fidelity copying allows children to acquire an extraordinary variety of skills and information they otherwise would not be able to acquire through direct exploration or experimentation alone (29). For acquired behavior to count as cultural, it must disseminate in a social group and remain stable across generations (119, 120). The conservation of knowledge and skills across generations supports individual and group-level innovation (121). The propensity for overimitation, or copying actions that are causally irrelevant to achieving an instrumental end goal (122, 123), develops early. Children often copy when uncertain about the underlying causal structure of a behavior. This proclivity is useful, given that a vast amount of behavior that children acquire is opaque from the perspective of physical causality (124, 125). High-fidelity imitation is an adaptive human strategy facilitating more rapid social learning of instrumental skills than would be possible if copying required a full causal representation of an event (126).
favored individuals who engage in affiliative behaviors, such as high-fidelity imitation (132, 133). Children imitate social conventions as a means of affiliation with group members (127). High-fidelity imitation also may function as a reinclusion behavior in reaction to the threat of social exclusion from an in-group in childhood in ways that parallel the increase in motor mimicry following social exclusion by in-group members observed in adults (134). Children ostracized by in-group members display higher levels of anxiety and engage in higher imitative fidelity of a group convention than children ostracized by out-group members (135). They imitate instrumental tasks with higher fidelity when primed with ostracism (136, 137). When status or inclusion within a group is threatened, children may be particularly motivated to enhance their standing in a group through affiliative behavior such as high-fidelity imitation.
Imitation is used to acquire instrumental skills as well as to engage in social conventions such as rituals. However, it is often difficult to determine whether a behavior is instrumental or conventional based on observation of the behavior alone. For example, lighting a candle could have an instrumental goal (lighting a dark room) or a conventional goal (worshiping a deity). How do children determine whether a behavior is instrumental or conventional? Young children are highly sensitive to contextual variation in social information (138). Children use a number of social and contextual cues when making inferences about the goal of behavior. Cues to conventionality increase imitative fidelity. One is causal opacity (i.e., lack of a physical causal mechanism). A second is consensus (i.e., multiple actors performing the same actions). A third is synchrony (i.e., multiple actors performing the same actions at the same time) (56). Children are also highly sensitive to verbal cues to conventionality and to the presence of a social norm (65, 127, 139). Even infants are sensitive to language cues to conventionality (140).
There is both continuity and variation in imitative flexibility across populations (141–143). For example, children in industrialized, Western populations (e.g., the United States) and subsistence-based, non-Western populations (e.g., Vanuatu) imitate conventional
PSYCHOLOGICAL AND
COGNITIVE SCIENCES
hallmark of our species: cumulative cultural transmission. The psychological foundations of cultural complexity are multifaceted. Explanations that extend biology through culture require drawing together developmental, cross-cultural, and
transmission. Examining the flexibility and variation in cultural transmission practices and acquisition strategies provides insight into the psychological foundations of cumulative cultural transmission, the cornerstone of human cultural diversity.
1. Herrmann E, Call J, Hernàndez-Lloreda MV, Hare B, Tomasello M (2007) Humans have evolved specialized skills of social cognition: The cultural intelligence hypothesis. Science 317:1360–1366.
2. van Schaik CP, Burkart JM (2011) Social learning and evolution: The cultural intelligence hypothesis. Philos Trans R Soc Lond B Biol Sci 366:1008–1016.
3. Whiten A, van Schaik CP (2007) The evolution of animal ‘cultures’ and social intelligence. Philos Trans R Soc Lond B Biol Sci 362:603–620.
4. Aplin LM, et al. (2015) Experimentally induced innovations lead to persistent culture via conformity in wild birds. Nature 518:538–541.
5. Fragaszy D, Visalberghi E (2004) Socially biased learning in monkeys. Learn Behav 32:24–35.
6. Leadbeater E (2015) What evolves in the evolution of social learning? J Zool 295:4–11.
7. Perry S, et al. (2003) Social conventions in wild white-faced capuchin monkeys: Evidence for traditions in a neotropical primate. Curr Anthropol 44:241–268.
8. Plotnik JM, Lair R, Suphachoksahakun W, de Waal FBM (2011) Elephants know when they need a helping trunk in a cooperative task. Proc Natl Acad Sci USA 108:5116–5121.
9. Rendell L, et al. (2010) Why copy others? Insights from the social learning strategies tournament. Science 328:208–213.
10. Cantor M, et al. (2015) Multilevel animal societies can emerge from cultural transmission. Nat Commun 6:8091.
11. Fragaszy DM, Perry S (2003) The Biology of Traditions: Models and Evidence (Cambridge Univ Press, New York).
12. Garland EC, et al. (2013) Humpback whale song on the Southern Ocean feeding grounds: Implications for cultural transmission. PLoS One 8:e79422.
13. Laland KN, Galef BG (2009) The Question of Animal Culture (Harvard Univ Press, Cambridge, MA).
14. van Leeuwen EJC, Cronin KA, Haun DBM (2014) A group-specific arbitrary tradition in chimpanzees (Pan troglodytes). Anim Cogn 17:1421–1425.
15. Dean LG, Vale GL, Laland KN, Flynn E, Kendal RL (2014) Human cumulative culture: A comparative perspective. Biol Rev Camb Philos Soc 89:284–301.
16. Johnson-Pynn J, Fragaszy DM, Cummins-Sebree S (2003) Common territories in comparative and developmental psychology: Quest for shared means and meaning in behavioral investigations. Int J Comp Psychol 16:1–27.
17. Laland KN, Hoppitt W (2003) Do animals have culture? Evol Anthropol 12:150–159.
18. Nielsen M, Haun D (2016) Why developmental psychology is incomplete without comparative and cross-cultural perspectives. Proc Biol Sci 371:20150071.
19. Odling-Smee FJ, Laland KN, Feldman MW (2003) Niche Construction: The Neglected Process in Evolution (Princeton Univ Press, Princeton).
20. Scott-Phillips TC, Laland KN, Shuker DM, Dickins TE, West SA (2014) The niche construction perspective: A critical appraisal. Evolution 68:1231–1243.
21. Whiten A, Hinde RA, Laland KN, Stringer CB (2011) Culture evolves. Philos Trans R Soc Lond B Biol Sci 366:938–948.
22. Boyd R, Richerson PJ, Henrich J (2011) The cultural niche: Why social learning is essential for human adaptation. Proc Natl Acad Sci USA 108:10918–10925.
23. Pagel MD (2012) Wired for Culture: Origins of the Human Social Mind (W.W. Norton, New York).
24. Pradhan GR, Tennie C, van Schaik CP (2012) Social organization and the evolution of cumulative technology in apes and hominins. J Hum Evol 63:180–190.
25. Kurzban R, Barrett HC (2012) Behavior. Origins of cumulative culture. Science 335:1056–1057.
26. Whiten A, Erdal D (2012) The human socio-cognitive niche and its evolutionary origins. Philos Trans R Soc Lond B Biol Sci 367:2119–2129.
27. Creanza N, Kolodny O, Feldman MW (2017) Cultural evolutionary theory: How culture evolves and why it matters. Proc Natl Acad Sci USA 114:7782–7789.
28. Chudek M, Henrich J (2011) Culture-gene coevolution, norm-psychology and the emergence of human prosociality. Trends Cogn Sci 15:218–226.
29. Legare CH, Nielsen M (2015) Imitation and innovation: The dual engines of cultural learning. Trends Cogn Sci 19:688–699.
30. Arbilly M, Laland KN (2017) The magnitude of innovation and its evolution in social animals. Proc Biol Sci 284:20162385.
31. Carr K, Kendal RL, Flynn EG (2016) Eureka!: What is innovation, how does it develop, and who does it? Child Dev 87:1505–1519.
32. Henrich J, McElreath R (2007) Dual inheritance theory: The evolution of human cultural capacities and cultural evolution. Oxford Handbook of Evolutionary Psychology, eds Dunbar R, Barrett L (Oxford Univ Press, Oxford, UK), pp 555–570.
33. Lotem A, Halpern JY, Edelman S, Kolodny O (2017) The evolution of cognitive mechanisms in response to cultural innovations. Proc Natl Acad Sci USA 114:7915–7922.
34. Smaldino PE, Richerson PJ (2013) Human cumulative cultural evolution as a form of distributed computation. Handbook of Human Computation, ed Michelucci P (Springer, New York), pp 979–992.
35. Muthukrishna M, Henrich J (2016) Innovation in the collective brain. Proc Biol Sci 371:20150192.
36. Kolodny O, Creanza N, Feldman MW (2015) Evolution in leaps: The punctuated accumulation and loss of cultural innovations. Proc Natl Acad Sci USA 112:E6762–E6769.
37. Konner M (2010) The Evolution of Childhood: Relationships, Emotion, Mind (Harvard Univ Press, Cambridge, MA).
lative culture in apes: Improved foraging efficiency through relinquishing and combining witnessed behaviours in chimpanzees (Pan troglodytes). Sci Rep 6:35953.
129. Call J, Carpenter M, Tomasello M (2005) Copying results and copying actions in the process of social learning: Chimpanzees (Pan troglodytes) and human children (Homo sapiens). Anim Cogn 8:151–163.
130. Keupp S, Behne T, Rakoczy H (2013) Why do children overimitate? Normativity is crucial. J Exp Child Psychol 116:392–406.
131. Over H, Carpenter M (2012) Putting the social into social learning: Explaining both selectivity and fidelity in children’s copying behavior. J Comp Psychol 126:182–192.
peers. Child Dev 85:1108–1122.
148. Clegg JM, Legare CH (2017) Parents scaffold flexible imitation during early childhood. J Exp Child Psychol 153:1–14.
149. Hewlett BS, Lamb ME, Shannon D, Leyendecker B, Schölmerich A (1998) Culture and early infancy among central African foragers and farmers. Dev Psychol 34:653–661.
150. Jensen LA (2012) Bridging universal and cultural perspectives: A vision for developmental psychology in a global world. Child Dev Perspect 6:98–104.
151. Nielsen M, Haun D, Kärtner J, Legare CH (2017) The persistent sampling bias in developmental psychology: A call to action. J Exp Child Psychol 162:31–38.
Edited by Andrew Whiten, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark April 8, 2017
(received for review January 12, 2017)
Children acquire information, especially about the culture in which they are being raised, by listening to other people. Recent evidence has shown that young children are selective learners who preferentially accept information, especially from informants who are likely to be representative of the surrounding culture. However, the extent to which children understand this process of information transmission and actively exploit it to fill gaps in their knowledge has not been systematically investigated. We review evidence that toddlers exhibit various expressive behaviors when faced with knowledge gaps. They look toward an available adult, convey ignorance via nonverbal gestures (flips/shrugs), and increasingly produce verbal acknowledgments of ignorance (“I don’t know”). They also produce comments and questions about what their interlocutors might know and adopt an interrogative stance toward them. Thus, in the second and third years, children actively seek information from interlocutors via nonverbal gestures or verbal questions and display a heightened tendency to encode and retain such sought-after information.
ignorance | questions | children | communication
of the information that is made available by the surrounding community; they also remedy their own ignorance by adopting an interrogative stance toward potential informants.
Below, we review recent findings on children’s appraisal of informant consensus, highlighting the research lacuna just mentioned. We then turn to research focusing on young children’s appraisal of themselves, especially their states of ignorance, as well as the emergence of the interrogative stance.
Sensitivity to Consensus and Uncertainty
Research on children’s appraisal of informant consensus has drawn on approaches to cultural learning grounded in evolutionary theorizing. Adopting this approach, Morgan et al. (17) asked how far children would be swayed by varying degrees of consensus among their informants when making numerical judgments. Children ranging from 3 to 7 y of age were asked to say which of two displays, each containing 10–30 dots, was numerically greater. Consistent with prior findings (18), children were better at choosing the numerically larger display the greater the difference in size between the two displays, as indexed by the dot ratio (i.e., the ratio of the difference between the displays relative to the size of the smaller
children (7-y-olds) display a >80% chance of sticking (color-coding of the ages of 3, 4, 5, 6, and 7 y is provided in Fig. 2). Reprinted with permission from ref. 17.
The following body of experimental work illustrates these basic points. Krehm et al. (21) had infants aged 9 and 11 mo watch while an informant expressed her preference for one of two objects by reaching for and manipulating her preferred
number of informants choosing a given option, the greater was
the likelihood that 6-y-olds responded similarly. Finally, younger
children were swayed by unanimity among informants but
showed little sensitivity to social feedback that fell short of
unanimity. For example, their final decisions were roughly the
same whether two of the 10 informants judged like them and
eight did not, or the reverse. In sum, although the exact nature of
children’s reactions to disagreement among informants changed
sharply with age, children were sensitive to social feedback at all
ages. Moreover, older children displayed the type of conformist
bias (i.e., a disproportionate sensitivity to nontotal majorities)
that evolutionary theory has identified as a highly effective
strategy for widespread cultural dissemination (19). Hence,
children’s tendency to stick with their initial judgment cannot be
attributed to any overall insensitivity to social feedback.
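The contrast between a proportionate and a conformist response to informant majorities can be made concrete with a small numerical sketch. The function below is an illustrative assumption, not the model fitted in ref. 17: it uses a commonly used sigmoidal form in which the probability of adopting an option is x^a / (x^a + (1 - x)^a), where x is the fraction of informants endorsing that option and a is a hypothetical conformity exponent. With a = 1 the response is proportionate; with a > 1 the response to majorities that fall short of unanimity is disproportionate.

```python
def adoption_probability(majority_fraction: float, exponent: float = 1.0) -> float:
    """Probability of adopting the option endorsed by a fraction of informants.

    Hypothetical illustration only: exponent == 1 gives a proportionate
    response, exponent > 1 gives a conformist (disproportionate) response.
    """
    if not 0.0 <= majority_fraction <= 1.0:
        raise ValueError("majority_fraction must lie in [0, 1]")
    if exponent <= 0:
        raise ValueError("exponent must be positive")
    endorse = majority_fraction ** exponent
    oppose = (1.0 - majority_fraction) ** exponent
    return endorse / (endorse + oppose)


if __name__ == "__main__":
    n_informants = 10  # the text describes panels of 10 informants
    for k in range(n_informants + 1):
        x = k / n_informants
        proportionate = adoption_probability(x, 1.0)
        conformist = adoption_probability(x, 3.0)  # exponent 3 is an arbitrary choice
        print(f"{k:2d}/{n_informants} endorse: "
              f"proportionate={proportionate:.2f} conformist={conformist:.2f}")
```

Under these assumptions, an 8-of-10 majority yields an adoption probability of 0.80 with the proportionate rule and about 0.98 with an exponent of 3, which is the kind of disproportionate sensitivity to nontotal majorities described for the older children.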
An alternative explanation of children’s tendency to stick, and
to stick even on difficult trials, is that they lack an ability to
monitor their own knowledge states. They treat what effectively
amounts to a random judgment on difficult trials and a well-
founded judgment on easy trials as more or less equivalent. In
this view, young children ultimately have little ability to differen-
tiate cognitive states that, in principle, ought to be quite distinct:
notably, states of ignorance, in which only a guess can be made,
and states of knowledge, in which a judgment can be made with a
high probability of its being correct. If this hypothesis is correct, it
implies that children are poor at weighing social feedback against
their own asocial information. Having little awareness of the epistemic standing of their own asocial information, they do not appropriately calibrate their deference to social information.
However, even if young children are insensitive to the certainty versus uncertainty of their numerical judgments, it is unlikely that they are insensitive to the standing of their cognitive states across all domains of knowledge. Indeed, as elaborated below, recent evidence suggests that even 2-y-olds have some
Fig. 2. Probability that children stick with a given decision (e.g., the right-hand side display of dots) for a trial of intermediate difficulty. The 7-y-olds show a conformist bias by responding disproportionately to majorities that fall short of unanimity. The 6-y-olds display a proportionate response to the number of informants endorsing their decision. Younger children, especially 3- and 4-y-olds, are only affected by informant feedback when there is complete unanimity; they are prone to ignore informant feedback when there is disagreement among informants. Reprinted with permission from ref. 17.
Harris et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7885
object. A recipient then appeared who expressed no specific preference for either object insofar as she handled both. In the subsequent test phase, the informant reappeared and pointed to her preferred object as the recipient watched. Infants expressed more surprise (by looking longer) when the recipient handed the informant her nonpreferred object as opposed to the one that she had pointed at. By implication, infants expected the recipient to understand which object the informant wanted, given her pointing gesture, and to respond accordingly. A control condition consolidated this interpretation. If the informant gestured with an open fist rather than a point, or if the recipient closed her eyes rather than watched the informant’s gesture, the selective pattern of looking disappeared. This selective pattern was displayed by both 9- and 11-mo-olds, and indeed irrespective of whether infants had started to point themselves.
Martin et al. (22) obtained similar findings when the informant signaled that she wanted a preferred object, not via a pointing gesture but by saying “koba.” This lexical item was unfamiliar to the 12-mo-olds being tested. Nevertheless, they tended to construe it as a request by the informant for her preferred object, again as indexed by the pattern of looking that they displayed when the recipient did or did not comply in terms of the particular object that she handed to the informant. Infants expressed more surprise (looked longer) if the recipient handed the informant the nonpreferred object. Control conditions indicated that infants’ construal of the informant’s signal as a request was restricted to speech-like utterances. If the informant coughed rather than spoke, or produced a vocalization (“Oooh!”) rather than a lexical item, the pattern of selective looking disappeared.
Thus, at the very beginning of the second year, infants are not inattentive or uncomprehending bystanders with respect to ongoing patterns of communication. They grasp that particular signals can be interpreted as requests for a particular object. Moreover, their construal of dialogic communication is such that they expect the recipient to interpret the requests appropriately and to act accordingly. This construal of dyadic communication is appropriately confined to certain types of signals, notably a pointing gesture or the production of a lexical item (including one that is novel) rather than a hand movement, a cough, or a vocalization.
Song et al. (23) asked if older infants, aged 18 mo, would understand not just a request for an object, as conveyed by a pointing gesture or lexical item, but an assertion, and notably an assertion that could, in principle, update the recipient’s knowledge base. Infants watched as an adult repeatedly placed a ball in a box, withdrew it, replaced it, and eventually left the room. A second adult who had witnessed the actions of the first adult then moved the ball to a cup and covered it with a lid. The first adult returned to retrieve the ball, but before her making any attempt to retrieve it, she was provided with information about its new location by the second adult: “The ball is in the cup.” Alternatively, the second adult made an uninformative remark that did not indicate the ball’s new location: “I like the cup.” In the informative condition, infants expressed surprise (looked longer) when the returning adult searched in the now empty box rather than in the cup where she had just been told that the ball was located. In the uninformative condition, by contrast, infants were more surprised if the returning adult appeared to know that the object was in the cup, as indexed by her searching there rather than in the box where she had left it. Moreover, in line with the findings for requests discussed above, a pointing response by the second adult was also construed by 18-mo-olds as an informative assertion that was likely to guide the search behavior of the returning adult.
Eighteen-month-olds also display some facility in decoding the information conveyed by head gestures as well as hand gestures. As in the studies described above, Fusaro and Harris (24) arranged for infants to witness a minidialogue between two adults and then probed their construal of that dialogue. One adult sought information about the location of a hidden object by pointing to each of two boxes in succession and asking: “Is it here?” A second adult answered with a nod to one query and a shake of the head to the other. When infants were then prompted to find the object, they typically selected the correct box. Effectively, infants were able not only to note the difference between the two head gestures of the second adult but to tie each gesture to a query about a particular location indicated by the first adult.
Taken together, these findings imply that infants aged 12–18 mo possess a relatively abstract comprehension of the nature of communication. They realize that certain signals, such as pointing gestures, lexicalized speech, and head gestures, can provide information about what an informant wants or knows. They expect the recipient of those communicative signals to construe and respond to them appropriately, by compliance if a request has been made and by acting in relation to new information about the state of the world if it has been supplied. By implication, human infants have a basic understanding of the way that communication conveys information between two interlocutors. Moreover, they do so at an age when their own production of spoken language remains limited. Accordingly, when infants proceed to exploit the rich communicative power of language, they are likely to situate that power within a broad understanding of the way that human communication operates, particularly their realization that communication can function to provide an interlocutor with information.
Granted that young infants understand how communication operates, we may ask whether they build on that understanding by actively eliciting information rather than remaining passive observers or recipients of information that is on offer. To optimize the elicitation of information, it would be helpful for infants to possess four interrelated abilities: the ability to signal their own ignorance, to talk about knowledge and ignorance, to produce interrogative acts of communication, and to gauge the adequacy of the replies received. In the following sections, evidence will be reviewed showing that human children display each of these abilities in the course of the second and third years, especially in the context of an ongoing dialogue with an adult.
Signaling Ignorance
Nonhuman animals appear to possess at least some metacognitive capacity. They are capable of monitoring their own uncertainty in that they deliberately withhold a response when faced with a difficult discrimination between different choices (25). There is also evidence that chimpanzees and young children (aged 27–32 mo) appropriately seek out additional visual evidence in the context of uncertainty about the location of a hidden object. For example, if they have had the opportunity to observe in which of several tubes a desirable object has been hidden, both species search promptly in that particular tube. However, if they have not seen the hiding and do not know in which particular tube the object was hidden, they are likely to bend their head or body to look inside the available tubes before searching in the one where the hidden object can be seen; alternatively, they opt for a smaller reward in a known location (26, 27). By implication, both chimpanzees and children realize when they do not know, or have not seen, an object’s location and act accordingly. They proceed to gather more location information before searching accurately on the basis of that newly gathered information, or they opt for a less desirable object in a known location.
In these cases, neither the chimpanzee nor the child communicates ignorance to another individual. Rather, in the context of ignorance, they engage in visual exploration or opt out of searching. However, recent evidence indicates that human infants are capable of signaling their ignorance. Goupil et al. (28) trained 20-mo-old infants to ask their caregiver for guidance if they were uncertain of a hidden object’s location. More specifically, infants watched as a toy was hidden in one of two opaque containers. On so-called “possible” trials, the infant observed the
to more accurate on those occasions when they did point. This greater accuracy was because, when they were uncertain, instead of pointing with a considerable risk of error, they were likely to seek help by looking toward their caregiver. In addition, the trained infants who asked for help were more likely to do so when the experimental setup created uncertainty. Thus, they were more likely to ask for help on impossible compared with possible trials, and, within the set of possible trials, they were more likely to ask for help if the containers had been occluded for a longer delay. Taken together, these findings provide strong evidence that infants aged 20 mo are able to monitor their ignorance or uncertainty and can learn to signal that uncertainty by gazing at a potential informant, notably a caregiver, in such circumstances.
Despite the impressive and systematic nature of such uncertainty monitoring and help seeking, the findings also point to the critical role of training. More specifically, control infants who received no initial training did sometimes look at their caregiver. However, such responses were no more frequent with greater delay lengths and were no more frequent for impossible compared with possible trials. Thus, even if these gaze responses were aimed at prompting help from the caregiver, there was no evidence that they signaled uncertainty because their production was not positively correlated with the experimental conditions producing uncertainty. By implication, although 20-mo-olds do spontaneously look toward a caregiver, and indeed may do so with the expectation that helpful information will be supplied, training might be needed if such looks are to be produced in a strategic fashion to signal uncertainty. More generally, this study provides persuasive evidence that infants have some awareness of their own uncertainty or ignorance, echoing findings with nonhuman primates, but it provides no evidence that they are prone to signal ignorance or uncertainty in a spontaneous fashion even if it shows that they can be trained to do so.
When do young children begin to signal their ignorance spontaneously? Limited observational evidence suggests that in the course of the second year, human toddlers will sometimes spontaneously express their ignorance via a distinctive nonverbal
the flip gesture expressing ignorance emerged among some children in the course of the second year. At 22 mo, one-fifth of the sample had been observed producing a flip to signal ignorance, and by 42 mo, almost half had done so. Verbal statements of ignorance emerged somewhat later but rose sharply in frequency across the same period, eventually becoming more widespread.
These findings build on the findings of Goupil et al. (28) by showing that deliberate teaching and reinforcement are not required for the production of gestures signaling ignorance. Many children produce such a signal in the course of everyday interaction outside the laboratory. The results also raise the possibility that such signals are produced not just in the context of goal-directed behavior, such as in the search for a hidden object, but in the context of an ongoing dialogue in which an adult poses a question that the child is unable to answer. By implication, it would be wrong to assume that signals of ignorance arise only in problem-solving contexts where children face a practical dilemma or obstacle and turn to an adult for help in resolving it.
Fig. 3. Cumulative number of children who ever produced a flip, produced an I DON’T KNOW flip, or said, “I don’t know” at each age point (14–42 mo). [Plot: y axis, cumulative number of children (0–70); x axis, age point (14–42 mo); series: children who ever produced flips, children who ever produced I DON’T KNOW flips, children who ever said “I don’t know.”]
The data suggest that expressions of ignorance also occur in the context of conversation.
In an experimental study, this conclusion was examined more systematically (30). Children (aged 16–37 mo) were asked a series of questions by an adult, only some of which they could easily answer based on their existing knowledge. More specifically, they were shown a mix of pictures and asked the name for each of the entities depicted. Some pictures depicted familiar, easy-to-name entities (e.g., book, bird), whereas others depicted unfamiliar, hard-to-name entities (e.g., unusual hardware item). The pattern of responding was different for the unfamiliar entities compared with the familiar entities. Not only did children make more naming errors and produce more filled speech pauses (e.g., “umm”), but they were also more likely to look toward an adult (either the experimenter or their mother) to ask for information (e.g., “What’s that?”) or to say “I don’t know.” This differential pattern of responding was apparent among younger infants (16–27 mo) but was more systematic among older infants (28–37 mo), especially with respect to filled speech pauses and requests for information.
Taken together, these studies show that toddlers communicate their uncertainty in various ways. They communicate by looking toward an adult and by producing a filled speech pause, a flip gesture, an explicit affirmation of ignorance, or a question to an interlocutor. Admittedly, when they look at an adult or produce a filled pause or a flip, such responses might reflect behavioral uncertainty rather than metacognitive awareness of ignorance. However, such a parsimonious interpretation seems less appropriate when toddlers begin to affirm their ignorance verbally. Note also that there was only a modest developmental lag between the production of flips and the emergence of verbal affirmations of ignorance. In the next section, such affirmations are scrutinized in more detail.
Talking About Knowledge and Ignorance
The scope of children’s metacognitive awareness can be illuminated by analyzing their production of the cognitive verb “know” in the context of everyday conversations with caregivers (32). Arguably, young children are aware only of gaps in their knowledge. They might have little or no awareness of when their information retrieval processes operate smoothly. For example, they might register occasions when they cannot readily respond to questions about an object’s name or location but ignore or fail to register occasions when they can successfully answer. In this view, children would be likely to deny that they have knowledge (“I don’t know. . .”) but unlikely to affirm that they do have knowledge (“I know. . .”). A further question concerns children’s insight into the cognitive states of other people. Do they talk about the ignorance or knowledge of other people in the same way as they talk about their own, or is there any asymmetry between talk about the self and talk about others?
To answer these questions, the spontaneous utterances of three children were analyzed. Two children were English-speaking (Adam, a middle-class, African-American child and Sarah, a white, working-class child), whose early language had been recorded by Brown (33) and his colleagues at regular intervals. The utterances of each child could be retrieved via the child language data exchange system, CHILDES (34). All utterances produced by the two children that included the mental verb know from the age of 27 mo (the age at which recordings had begun) to the age of 36 mo were analyzed. The third child, Qianqian (芊芊), was a Mandarin-speaking child whose utterances had been recorded and transcribed
Did the three children simply use the word know by echoing its production in an immediately prior utterance by their interlocutor or did they introduce the word know into the conversation in an autonomous fashion? The same pattern emerged for all three children: The large majority of children’s references to the word know were autonomous, rather than echoes of their interlocutor’s prior utterance. Next, utterances were analyzed to determine whether children referred only to their own cognitive states or also made references to the cognitive states of their interlocutor or to the cognitive states of a third party. The majority of references were indeed to children’s own cognitive states. Nevertheless, children also referred quite often to the cognitive states of their interlocutor. By contrast, references to a third party, someone not participating in the conversation, were rare.
Granted that children talked about their own cognitive states as well as the cognitive states of their interlocutor, an analysis was conducted to assess whether the pragmatic function of the utterances was similar or different for these two persons. More specifically, the proportions of affirmations (“I know. . .” or “You know. . .”), denials (“I don’t know. . .” or “You don’t know. . .”), and questions (“Do I know. . .?” or “Do you know. . .?”) that involved a reference to the self compared with the interlocutor were compared. These proportions varied across the three pragmatic functions. In the case of affirmations, children produced them with respect to both the self and their interlocutor. Denials and questions, by contrast, exhibited a strongly asymmetrical pattern. Children often denied their own knowledge (“I don’t know. . .”) but very rarely denied the knowledge of their interlocutor (“You don’t know. . .”). Conversely, children often asked questions about their interlocutor’s knowledge (“Do you know. . .?”) but never asked questions about their own knowledge (“Do I know. . .?”).
This asymmetry in the pattern of production for denials compared with questions was marked, but it was based on only three children. To establish its existence firmly, the utterances of a further eight English-speaking children drawn from the CHILDES database were also analyzed (36). An identical pattern emerged for all eight children: Denials were almost invariably produced with respect to the self rather than the interlocutor, whereas questions were invariably produced with respect to the interlocutor rather than the self.
Returning to the two questions guiding this study, the data show that 2-y-olds do not simply talk about their ignorance. They also affirm that they possess particular items of knowledge. In addition, the pattern of talk about the self is different from the pattern of talk about the interlocutor: Denials of knowledge are frequent for the self (“I don’t know”) but not for the interlocutor, and questions about knowledge are frequent for the interlocutor (“Do you know?”) but not for the self.
The exact explanation for this asymmetry warrants further investigation (36), but its existence points to the following possibility for early communication between young children and their interlocutors. On the one hand, children monitor their own cognitive states: They are aware of knowing some items of information and affirm possessing that knowledge, and they are also aware of lacking other items of information and deny having that knowledge. Their monitoring of other people’s knowledge is more circumspect. They sometimes affirm, but almost never deny, that an interlocutor knows something. Rather, they ask an interlocutor about what he or she knows. Given children’s awareness of what they do not know (as indexed by their explicit denials) combined with their receptivity to the possibility that an interlocutor might know (as indexed by their questions), and
from the age of 16 to 39 mo by her mother, a psycholinguist. given also their understanding of the way that communication
Qianqian’s production of the verb zhi1dao4 was analyzed. Similar can pass knowledge between an informant and a recipient, it is
to the phrase “know that” in English, zhi1dao4 is an epistemic verb feasible for them to turn to other people for information when
that is used in the context of factual knowledge. [Note that, in they do not know something. In particular, it would make sense
contrast to English, Mandarin uses a different verb (i.e., hui4) for for them to ask information-seeking questions. In the next sec-
the phrase “know how” (as in “know how to dance”) (35)]. tion, we review the onset of such questions.
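The coding scheme described in this section, which crosses the referent of “know” (self vs. interlocutor) with the form of the utterance (affirmation, denial, or question), can be illustrated with a small tally. The sketch below is only an illustration under stated assumptions: the utterances are invented for the example and are not drawn from CHILDES, and the regular expressions and the helper names `code_utterance` and `tally` are hypothetical simplifications, not the coding procedure used in the studies cited above.

```python
import re
from collections import Counter

# Hypothetical toy utterances, invented for illustration (NOT CHILDES data).
UTTERANCES = [
    "I don't know where it is",
    "I know that one",
    "Do you know what this is?",
    "You know the song",
    "I don't know",
    "Do you know?",
]

# Assumed, simplified patterns. Order matters: questions and denials are
# checked before plain affirmations. Real transcripts need far richer coding.
PATTERNS = [
    ("question", re.compile(r"^do (i|you) know\b", re.IGNORECASE)),
    ("denial", re.compile(r"^(i|you) (don't|do not) know\b", re.IGNORECASE)),
    ("affirmation", re.compile(r"^(i|you) know\b", re.IGNORECASE)),
]


def code_utterance(text):
    """Return (referent, act) for a 'know' utterance, or None if no pattern fits."""
    for act, pattern in PATTERNS:
        match = pattern.search(text.strip())
        if match:
            referent = "self" if match.group(1).lower() == "i" else "interlocutor"
            return referent, act
    return None


def tally(utterances):
    """Count utterances in each (referent, act) cell of the coding scheme."""
    counts = Counter()
    for utterance in utterances:
        coded = code_utterance(utterance)
        if coded is not None:
            counts[coded] += 1
    return counts


if __name__ == "__main__":
    for (referent, act), n in sorted(tally(UTTERANCES).items()):
        print(f"{referent:12s} {act:12s} {n}")
```

With a table of this kind, the asymmetry reported above would appear as well-populated cells for self denials and interlocutor questions alongside empty or nearly empty cells for interlocutor denials and self questions.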
PSYCHOLOGICAL AND
COGNITIVE SCIENCES
naling that they wanted her to provide information about the objects because they pointed out novel objects less often if she had proven to be an unreliable informant. Thus, they were less likely to point to the novel objects if she had previously named familiar objects incorrectly and now appeared unsure of the names of the novel objects. A follow-up study suggested that the experimenter’s prior naming errors were especially important in reducing infants’ interrogative points. If the experimenter simply called attention to the objects (e.g., “Wow, look at this!”) and then appeared unsure how to name the novel objects, infants still pointed them out.
When infants receive information in the wake of an interrogative point, how well do they process that information, and do they process it more effectively than unsolicited information? To examine these questions, Begus et al. (42) presented 16-mo-old infants with two objects at once. When infants pointed to one of the two objects, the experimenter modeled an action either on the indicated object or on the alternative object. After a 10-min delay, the demonstration object was presented again and infants were given an opportunity to imitate the action they had seen demonstrated. Infants reproduced the actions demonstrated on the objects they had pointed at significantly more than those actions demonstrated on the nonchosen objects. Moreover, this difference emerged even though infants had been equally attentive visually during the demonstrations on both types of objects. A follow-up experiment showed that this difference in copying was due to learning being facilitated when infants’ pointing was responded to rather than hindered when their pointing was ignored. Thus, even when infants’ pointing was not ignored and a single object was presented, copying was still inferior to when two objects were presented and the experimenter consistently offered a demonstration on the one that infants pointed to. By implication, infants’ pointing at a given object had been aimed at eliciting information about it and that information was better encoded than information they had not aimed to elicit.
Toddlers’ early word learning provides more evidence for the information-seeking role of pointing. Lucca and Wilbourn (43) presented 18-mo-olds with pairs of unfamiliar objects, and when the infants targeted one of them via selective pointing, reaching,
increasing proportion of questions aimed at gathering information as opposed to questions that make other types of requests (e.g., for permission, for clarification) (44). When asking such information-seeking questions, do they monitor the replies that they receive? In particular, do they differentiate between a satisfactory answer, one that dispels their ignorance, and an unsatisfactory answer that does not? To examine this issue, Chouinard (44) looked at what children said in reply to an informative answer versus an uninformative answer. When adults failed to supply the information they sought, children were likely to persist with their questions.
Extending this analysis, Frazier et al. (45) focused on the “why” or “how” questions (i.e., the explanation-seeking questions) of six English-speaking children whose language had been recorded regularly from 2–5 y of age. Children reacted differently depending on whether they received a satisfactory explanation or not. Following a satisfactory explanation, they were likely to acknowledge their agreement or to ask a new, follow-up question on the same topic. By contrast, when they were not given a satisfactory explanation, they were likely to persist with their initial question or to offer an explanation of their own. A follow-up study confirmed that explanatory information is also better remembered. Thus, when preschoolers received an explanation for a puzzling illustration, they were more likely to remember that information than a nonexplanation. Indeed, children often misremembered nonexplanations, converting them into explanations via appropriate elaboration (46).
Conclusions and Implications
In the course of the second year, children begin to communicate their doubt or ignorance in various ways: through nonverbal gestures, explicit statements of their ignorance (“I don’t know”), as well as information-seeking questions. Nonhuman primates also indicate their uncertainty: They act differently depending on whether they are sure or unsure of what to do next. In particular, they suppress responding in a situation where a mistaken response would impose costs. However, despite important parallels, notably the implication that all primates are able to monitor their own level of certainty or doubt, the two bodies of evidence
Harris et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7889
also diverge. Toddlers readily express doubt or ignorance in the context of communication; their flips, “I don’t know” utterances, and information-seeking questions are directed at an interlocutor. Conceivably, some of these signals could be produced when children are alone and faced with uncertainty. For example, having searched unsuccessfully, a toddler might produce a flip gesture, signaling uncertainty about the object’s location and/or where to search next. However, pending more evidence of children’s expressive gestures when they are alone, it is plausible that the majority of such metacognitive signals are produced in a communicative context, not as an adjunct to ongoing solo behavior. More generally, these signals appear to serve two interwoven functions. First, they provide an answer to an interlocutor’s question: an admission of ignorance that amounts to a well-formed turn in an ongoing conversation. Second, when formulated as questions, they convey not just ignorance but also a request that the interlocutor offer help by supplying missing information. That children aim to elicit missing information with their questions is underlined by their differential reactions to informative vs. uninformative replies.
Young children’s facility in communicating their ignorance and in asking questions appears to build on their foundational insight into the way that testimony works (20). As described earlier, infants aged 12–18 mo understand that someone who lacks information (e.g., regarding the location of an object) can be provided with that information via the gestures or vocalization of an informant. The present review indicates that toddlers and young children go beyond that basic insight. They produce avowals of ignorance and adopt an interrogative stance. Moreover, the interrogative stance appears to involve not simply the seeking of information from others via pointing and/or questions but an accompanying state of informational receptivity (i.e., a motivational readiness to encode and retain the information thereby elicited). Thus, information about object names or object functions is better retained if it is received in response to an interrogative gesture rather than supplied in an unsolicited fashion (42, 43), and explanatory information is better retained than nonexplanatory information (46).
Research on the child’s theory of mind has highlighted the important role of language and conversation in promoting children’s insight into the way that the mind works (47). The primary focus of that research has been on children’s developing insight into false beliefs. The present findings point to a more basic insight that conversation is likely to promote. In several of the studies reported above, children signaled their ignorance in the context of an ongoing conversation with an adult. A speculative but plausible implication is that involvement in conversation can serve as a constant tutorial for children with respect to the range and depth of their ignorance. To the extent that children are prone to engage in conversation with better informed interlocutors, they are likely to discover that their existing knowledge is limited and fragmentary, albeit open to expansion if they pose appropriate questions. Granted that children vary in the quantity of speech that they are exposed to by their caregivers (48) in the extent to which that speech is directive rather than discursive (49), is tightly focused on the immediate situation or includes an exploration of situations and events displaced from the here and now (50), and includes satisfactory answers to children’s causal questions (51), we can anticipate that children will grow up with markedly different assessments of the scope of human knowledge, the magnitude of their own comparative ignorance, and the potential role of question asking in mitigating that ignorance.
ACKNOWLEDGMENTS. Collection and coding of the data displayed in Fig. 3 were supported by the Eunice Kennedy Shriver National Institute of Child Health and Human Development of the NIH under Award P01HD040605 (Principal Investigator, Susan Goldin-Meadow). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.
1. Spelke ES, Kinzler KD (2007) Core knowledge. Dev Sci 10:89–96.
2. Carey S (2009) The Origin of Concepts (Oxford Univ Press, New York).
3. Gopnik A, Wellman HM (2012) Reconstructing constructivism: Causal models, Bayesian learning mechanisms, and the theory theory. Psychol Bull 138:1085–1108.
4. Wellman HM (2014) Making Minds (Oxford Univ Press, New York).
5. Harris PL, Koenig MA (2006) Trust in testimony: How children learn about science and religion. Child Dev 77:505–524.
6. Harris PL (2012) Trusting What You’re Told: How Children Learn from Others (Harvard Univ Press, Cambridge, MA).
7. Legare CH, Harris PL (2016) The ontogeny of cultural learning. Child Dev 87:633–642.
8. Harris PL, Corriveau KH (2011) Young children’s selective trust in informants. Philos Trans R Soc Lond B Biol Sci 366:1179–1187.
9. Corriveau K, Harris PL (2009) Choosing your informant: Weighing familiarity and recent accuracy. Dev Sci 12:426–437.
10. Lucas AJ, et al. (December 29, 2016) The development of selective copying: Children’s learning from an expert versus their mother. Child Dev, 10.1111/cdev.12711.
11. Koenig MA, Clément F, Harris PL (2004) Trust in testimony: Children’s use of true and false statements. Psychol Sci 15:694–698.
12. Koenig MA, Harris PL (2005) Preschoolers mistrust ignorant and inaccurate speakers. Child Dev 76:1261–1277.
13. Kinzler KD, Corriveau KH, Harris PL (2011) Children’s selective trust in native-accented speakers. Dev Sci 14:106–111.
14. Chen EE, Corriveau KH, Harris PL (2013) Children trust a consensus composed of outgroup members–but do not retain that trust. Child Dev 84:269–282.
15. Corriveau KH, Harris PL (2010) Preschoolers (sometimes) defer to the majority in making simple perceptual judgments. Dev Psychol 46:437–445.
16. Corriveau KH, Fusaro M, Harris PL (2009) Going with the flow: Preschoolers prefer nondissenters as informants. Psychol Sci 20:372–377.
17. Morgan TJH, Laland KN, Harris PL (2015) The development of adaptive conformity in young children: Effects of uncertainty and consensus. Dev Sci 18:511–524.
18. Halberda J, Feigenson L (2008) Developmental change in the acuity of the “Number Sense”: The Approximate Number System in 3-, 4-, 5-, and 6-year-olds and adults. Dev Psychol 44:1457–1465.
19. Boyd R, Richerson PJ (1985) Culture and the Evolutionary Process (Univ of Chicago Press, Chicago).
20. Harris PL, Lane JD (2014) Infants understand how testimony works. Topoi (Dordr) 33:443–458.
21. Krehm M, Onishi KH, Vouloumanos A (2014) Infants under 12 months understand that pointing is communicative. J Cogn Dev 15:527–538.
22. Martin A, Onishi KH, Vouloumanos A (2012) Understanding the abstract role of speech in communication at 12 months. Cognition 123:50–60.
23. Song H-J, Onishi KH, Baillargeon R, Fisher C (2008) Can an agent’s false belief be corrected by an appropriate communication? Psychological reasoning in 18-month-old infants. Cognition 109:295–315.
24. Fusaro M, Harris PL (2013) Dax gets the nod: Toddlers detect and use social cues to evaluate testimony. Dev Psychol 49:514–522.
25. Smith JD (2009) The study of animal metacognition. Trends Cogn Sci 13:389–396.
26. Call J, Carpenter M (2001) Do apes and children know what they have seen? Anim Cogn 3:207–220.
27. Neldner K, Collier-Baker E, Nielsen M (2015) Chimpanzees (Pan troglodytes) and human children (Homo sapiens) know when they are ignorant about the location of food. Anim Cogn 18:683–699.
28. Goupil L, Romand-Monnier M, Kouider S (2016) Infants ask for help when they know they don’t know. Proc Natl Acad Sci USA 113:3492–3496.
29. Acredolo LP, Goodwyn SW (1985) Symbolic gesturing in language development: A case study. Hum Dev 28:40–49.
30. Bartz DT (2017) Young children’s meta-ignorance. EdD thesis (Harvard University, Cambridge, MA).
31. Goldin-Meadow S, et al. (2014) New evidence about language and cognitive development based on a longitudinal study: Hypotheses for intervention. Am Psychol 69:588–599.
32. Harris PL, Yang B, Cui Y (2017) “I don’t know”: Children’s early talk about knowledge. Mind Lang 32:283–307.
33. Brown R (1973) A First Language (Allen & Unwin, London).
34. MacWhinney B, Snow C (1985) The child language data exchange system. J Child Lang 12:271–295.
35. Tardif T, Wellman HM (2000) Acquisition of mental state language in Mandarin- and Cantonese-speaking children. Dev Psychol 36:25–43.
36. Harris PL, Ronfard S, Bartz D (2017) Young children’s developing conception of knowledge and ignorance: Work in progress. Eur J Dev Psychol 14:221–232.
37. Liszkowski U, Brown P, Callaghan T, Takada A, de Vos C (2012) A prelinguistic gestural universal of human communication. Cogn Sci 36:698–713.
38. Salomo D, Liszkowski U (2013) Sociocultural settings influence the emergence of prelinguistic deictic gestures. Child Dev 84:1296–1307.
39. Bates E, Camaione L, Volterra V (1975) The acquisition of performatives prior to speech. Merrill Palmer Q Behav Dev 21:205–226.
40. Southgate V, van Maanen C, Csibra G (2007) Infant pointing: Communication to cooperate or communication to learn? Child Dev 78:735–740.
41. Begus K, Southgate V (2012) Infant pointing serves an interrogative function. Dev Sci 15:611–617.
42. Begus K, Gliga T, Southgate V (2014) Infants learn what they want to learn: Responding to infant pointing leads to superior learning. PLoS One 9:e108817.
Changes in cognitive flexibility and hypothesis search across human life history from childhood to adolescence to adulthood
Alison Gopnik (a,1), Shaun O’Grady (a), Christopher G. Lucas (b), Thomas L. Griffiths (a), Adrienne Wente (a), Sophie Bridgers (c), Rosie Aboody (d), Hoki Fung (a), and Ronald E. Dahl (e)
(a) Department of Psychology, University of California, Berkeley, CA 94720; (b) School of Informatics, University of Edinburgh, Edinburgh EH1 2QL, United Kingdom; (c) Department of Psychology, Stanford University, Stanford, CA 94305; (d) Department of Psychology, Yale University, New Haven, CT 06520; and (e) School of Public Health, University of California, Berkeley, CA 94720
Edited by Andrew Whiten, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark April 8, 2017 (received for review January 18, 2017)
How was the evolution of our unique biological life history related to distinctive human developments in cognition and culture? We suggest that the extended human childhood and adolescence allows a balance between exploration and exploitation, between wider and narrower hypothesis search, and between innovation and imitation in cultural learning. In particular, different developmental periods may be associated with different learning strategies. This relation between biology and culture was probably coevolutionary and bidirectional: life-history changes allowed changes in learning, which in turn both allowed and rewarded extended life histories. In two studies, we test how easily people learn an unusual physical or social causal relation from a pattern of evidence. We track the development of this ability from early childhood through adolescence and adulthood. In the physical domain, preschoolers, counterintuitively, perform better than school-aged children, who in turn perform better than adolescents and adults. As they grow older learners are less flexible: they are less likely to adopt an initially unfamiliar hypothesis that is consistent with new evidence. Instead, learners prefer a familiar hypothesis that is less consistent with the evidence. In the social domain, both preschoolers and adolescents are actually the most flexible learners, adopting an unusual hypothesis more easily than either 6-y-olds or adults. There may be important developmental transitions in flexibility at the entry into middle childhood and in adolescence, which differ across domains.
causal reasoning | social cognition | cognitive development | adolescence | life history
culture would have been coevolutionary and bidirectional: life-history changes allowed changes in cultural learning, which in turn both allowed and rewarded extended life histories. In this way, culture could have extended biology.
A number of researchers have suggested that our life history is related to our learning abilities (8–10). But what might this relation be like in more detail? It is possible that the extended human childhood and adolescence is simply a waiting period in which a large brain can grow or cultural learning can take place (11). However, both developmental psychology and neuroscience suggest that there may be more substantive differences in learning and plasticity in different developmental periods, differences that could contribute to human intelligence and culture.
We argue that there may be a developmental trade-off between cognitive abilities that allow organisms to learn the structure of a new physical or social environment, abilities that are characteristic of children, and the more adult abilities that allow skilled action on a familiar environment. Empirical evidence suggests that children may sometimes be better, and particularly more flexible, learners than adults. Ideas from the literature on developmental neuroscience, machine learning, and cultural learning may help to characterize and explain these developmental differences more precisely.
We go on to test these ideas by examining cognitive flexibility across the developmental periods of preschool, middle-childhood, adolescence, and adulthood, in both the physical and social domain.
When Younger Learners Do Better. Younger learners usually have
Neuroscience: Trade-Offs Between Executive Function and Plasticity. Neuroscientists have investigated the origins of both the increased executive control and decreased plasticity that come with age. One set of developments involves synaptic changes. In the early period of development, many more new synaptic connections are made than in adulthood. With age some of these neural connections are strengthened but others are pruned, transforming a more flexible, sensitive, and plastic brain into a more effective and controlled one (27, 28).
Increasing executive control is also related to the development of prefrontal areas of the brain and their increasing connection to other brain areas. However, neuroscientists have also argued that strong frontal control has costs for exploration and learning (29). Interference with prefrontal control areas through transcranial direct current stimulation leads to a wider range of responses on a “divergent thinking” task (30), and during learning there is a characteristic release of frontal control (31).
The adolescent brain undergoes particular changes. There is significant maturational development in prefrontal areas and in areas thought to be involved in self-perception and social cognition (32), which may indicate increased plasticity. However, there is also evidence for enhanced consolidation and pruning in adolescence (33), which might suggest a period of less flexibility.
Computation: Trade-Offs Between Exploitation and Exploration, and Narrow and Broad Search. The trade-off between executive function and plasticity in the neuroscience literature parallels another trade-off that appears in machine learning. Reinforcement learning algorithms make an important distinction between periods of exploration, in which the system gathers information about potential actions and outcomes, and exploitation, in which information gathering is replaced by taking the actions most likely to maximize reward (34). Human life histories can be interpreted as a unique solution to the explore/exploit tension, with low executive control and high plasticity early in life maximizing exploration, and increased executive function and lower plasticity maximizing reward as we switch to exploitation.
current hypotheses when the evidence is particularly strong and making small adjustments to accommodate new evidence. This strategy is most likely to quickly yield a “good enough” solution that will support immediate effective action. But it also means that the learner may miss a better alternative that is farther from the current hypothesis, such as a hypothesis about an unusual causal relation.
Alternatively, a learner can conduct a more exploratory search, moving to new hypotheses with only a small amount of evidence, and trying out hypotheses that are less like the current hypotheses. This strategy is less efficient if the learner’s starting hypothesis is reasonably good, and may mean that the learner wastes time considering unlikely possibilities. But it may also make the learner more likely to adopt genuinely new solutions.
There is a related contrast in the algorithms that are used in computer science. Drawing on an analogy to statistical physics, computer scientists have explored the consequences of using narrower “low-temperature” versus broader “high-temperature” searches. Continuing the analogy, “simulated annealing” (46) is one of the best ways of resolving the tension between these two strategies. Learners who begin with a broader higher-temperature search and gradually move to a narrower low-temperature search are most likely to find the optimal solution, just as in metallurgy heating a metal and then cooling it leads to the most robust structure. Moreover, as in physical cases of annealing, there may be multiple rounds of this process. We have argued for a similar developmental pattern with early broad exploratory sampling followed by a later narrower search (23, 24). Our hypothesis is that childhood and adolescence may be evolution’s way of performing simulated annealing, and hence resolving the explore/exploit trade-off.
Cultural Learning: Trade-Offs Between Imitation and Innovation. The causal learning problems where children do better can also be recast as cultural learning problems, and understood in relation to the cultural learning literature. Consider a learner who observes someone else performing a complicated series of actions with artifacts that produce an effect. The learner might approach this information in several ways. First, the learner might simply
Gopnik et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7893
reproduce the actions in detail. Alternatively, the learner might apply existing causal knowledge to the situation, and bring about the effect more directly. These two forms of learning have been the focus of the extensive “overimitation” literature, starting with the classic Horner and Whiten study (47).
Human preschoolers are sensitive to information about physical events and actor’s intentions in deciding how faithfully to imitate, and there are also developmental and cultural differences in how imitation takes place (48–52). Learners of all ages may use their existing causal and cultural knowledge to interpret the actions of another person and to decide whether and how faithfully to imitate those actions.
However, they might also use another person’s demonstration to discover a new or unexpected causal relationship. For example, consider a Pleistocene learner who sees an expert produce a flake from one side of a rock by hitting it on the other side (53), or a modern learner who watches an expert swipe to find a photo on a phone. The learner might simply imitate the demonstrator exactly. Alternatively, she might use her existing causal knowledge to bring about the result (hitting the rock at the place where she wants it to flake or using a keyboard command).
However, a learner might also use this information to infer an unexpected abstract causal principle (distant force or touch activation). She could then use this principle to design innovative actions beyond the demonstration, shaping other tools or trying other swipes for other commands. This kind of learning would both enable learners to adopt innovations in an intelligent way and to create innovations themselves.
This approach also applies to social and psychological causal learning. Imagine that a learner hears a complex narrative describing a series of human actions, again a classic cultural, as well as causal, learning scenario. The learner might simply encode the actions as they are described, recording what the actors did. She might interpret those actions in terms of an existing psychological schema. Alternatively, she might use the information in the narrative to infer new psychological or social relations.
As in the physical case, this last option might lead to both the adoption and creation of social and psychological innovations. Consider a learner who hears a story in which Sam and John live together and share a bedroom. She might interpret this story in terms of her existing cultural schemas (perhaps Sam and John are close friends with a small apartment). She might also, however, use the story to make a broader inference about the possibility of same-sex marriage.
These alternative forms of cultural learning exemplify the explore/exploit tension. The first two strategies, namely, exact imitation or reliance on causal knowledge, are likely to lead to quick and mostly effective actions. Entertaining the unlikely new causal relation is both more cognitively demanding and more risky. In the long run, however, it may confer an advantage in dealing with changing and variable environments.
Human learners of all ages may use all these strategies to some extent. However, our hypothesis is that learners at different developmental stages may be more or less likely to use different strategies. In particular, more protected and more behaviorally variable younger learners may be more likely to adopt new hypotheses than older learners. In fact, the causal learning tasks in our earlier research, in which younger learners do better than older ones, involve precisely these kinds of scenarios. Learners infer a new causal relation from a demonstration or narrative.
This developmental difference may also help resolve the tension between imitation and innovation in cultural learning (48). Human children are adept at imitation. However, the flexibility of childhood cognition may also help allow innovations to be
that are effortful and rare when they first appear within a generation can become effortlessly and widely adopted by the next generation. In fact, among nonhuman animals, cultural innovations are often first produced, adopted, and spread by juveniles (55–58).
Continuous Knowledge Acquisition vs. Discontinuous Developmental Transition. There are two complementary mechanisms that might lead to a developmental shift from broader exploration to narrower exploitation. One is simply the accumulation of knowledge itself. As we learn more and grow more confident in our beliefs, we are less likely to change those beliefs. From a Bayesian perspective, development proceeds from a relatively “flat” prior, where different hypotheses have more similar probabilities, to a more “peaked” distribution, where some hypotheses are much more likely than others, as a learner accumulates knowledge. In Bayesian models a flatter prior would automatically lead to broader search.
Another complementary possibility, building on the literature discussed above, is that maturation and general experience lead to different degrees of plasticity and flexibility and different search strategies, independent of accumulated knowledge. There might be nonlinear changes at points of developmental transition, such as the transition from early to middle childhood at around 6 y or in adolescence, rather than a simple continuous change with accumulated knowledge.
In particular, although adolescents have more accumulated experience than younger children, there is evidence, as noted above, that adolescence may also be a period of enhanced plasticity and learning (59, 60), especially for social domains (32, 61), in part through the privileging of social information processing and the salience of social rewards in decision making (62, 63). Cultural innovations, such as new socially significant forms of language, dress, or music often first appear in adolescents. Adolescence might be an extra round of annealing in the social sphere. However, there is also evidence that adolescence may be a period of pruning and consolidation.
In fact, two contrasting developmental patterns characterize adolescence (64, 65). On some measures, such as cognitive control, and self-regulation, there is a relatively linear trajectory from childhood through adolescence to adulthood. On others, such as sensation-seeking and risk-taking, both forms of exploration, there is a marked increase associated with the onset of puberty, and an inverted U pattern peaking in adolescence and then declining. There is extensive research on risk-taking and decision-making in adolescence but, to our knowledge, no research on causal learning.
Current Studies. We approach these questions by extending two earlier causal learning experiments. Where the original experiments contrasted preschoolers with either 6-y-old children or adults, we report results covering the entire developmental span from preschool to adulthood, with special focus on the transition to middle childhood and adolescence, periods not explored previously. This approach allows us to explore learning across human life history, and to ask whether there are distinctive developmental transitions.
Both experiments have the same logic. We contrast two hypotheses about how objects or people work: one that is initially more likely, at least for adults, and one that is more unusual. In Exp. 1 we contrast the hypothesis that individual objects activate a machine with the hypothesis that particular combinations of objects do. In Exp. 2 we contrast the hypothesis that someone took a risk because of their personal traits with the hypothesis that they took the risk because of the situation they encountered.
In one condition, participants receive covariation evidence that supports the likely hypothesis. In a second, otherwise identical con-
adopted and to spread. Young children are rarely the source of dition, they receive covariation evidence that supports the unlikely
complex technical innovations; actually designing and producing hypothesis. In a third baseline condition, participants do not receive
an effective tool, for example, is a challenging task that requires evidence either way. We record whether participants of different ages
both innovation and executive skill (48, 54). However, innovations adopt the likely or unlikely hypothesis in each condition.
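As a minimal illustration, and not part of the original study materials, the Exp. 1 contrast between the two hypotheses can be sketched in Python. The block names (A, B, C) and the training order follow Fig. 1 and the Methods; the function names and the `ACTIVE_BLOCKS` constant are hypothetical labels introduced here for the sketch.

```python
# Training order taken from Fig. 1 and Methods: A, B, C, AB, AC, BC.
TRAINING_TRIALS = ["A", "B", "C", "AB", "AC", "BC"]

# Per Methods, A and C are the blocks that matter during training.
# (The constant name is an illustrative assumption.)
ACTIVE_BLOCKS = {"A", "C"}


def disjunctive(blocks):
    """Likely hypothesis: any single active block turns the machine on."""
    return any(b in ACTIVE_BLOCKS for b in blocks)


def conjunctive(blocks):
    """Unusual hypothesis: the machine needs both active blocks together."""
    return ACTIVE_BLOCKS.issubset(set(blocks))


def training_outcomes(rule):
    """Map each training trial to True (machine on) or False (machine off)."""
    return {trial: rule(trial) for trial in TRAINING_TRIALS}


if __name__ == "__main__":
    for name, rule in [("disjunctive", disjunctive), ("conjunctive", conjunctive)]:
        print(name, training_outcomes(rule))
```

Under the disjunctive rule every trial containing A or C activates the machine, whereas under the conjunctive rule only the AC trial does, so the two training conditions present the same trials with different covariation evidence.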
PSYCHOLOGICAL AND
COGNITIVE SCIENCES
children and adults in both conditions then saw an ambiguous test trial with new blocks that was consistent with either general principle. In a baseline condition, participants only saw the ambiguous trial without the training trials. In each condition, participants were then asked whether each block was or was not a “blicket” and were asked to activate the machine.
Children learned the appropriate general rule in each condition and applied it to the ambiguous case. Adults applied the default disjunctive rule in the ambiguous case even when the earlier evidence weighed against it.
In Exp. 1 we used exactly the same methods across the entire developmental range, including 6- to 7-y-olds, 9- to 11-y-olds, and 12- to 14-y-olds. Fig. 1 provides a visual display of the pattern of evidence used for training and test trials.
We extended the contrast between preschoolers and adults to include school-aged children and adolescents. This approach
[Fig. 1 panels: disjunctive and conjunctive training trials (A, B, C, AB, AC, BC); test trials (D, D, D, E, DF, DEF, DF).]
Fig. 1. Schematic of the procedure for Exp. 1. The yellow rectangle represents the machine’s activation. “Disjunctive” training provides evidence of the more common, disjunctive hypothesis. “Conjunctive” training provides support for the less common conjunctive hypothesis. “Test” trials presented ambiguous evidence about the “D” object.
Blicket judgments. We combined new data collected from younger school-aged children (6- to 7-y-olds), older preadolescent children (9- to 11-y-olds), and young adolescents (12- to 14-y-olds) with the data from 4-y-olds and adults tested with the identical method in Lucas et al. (23).
If the observers believe the machine operates on an unusual conjunctive rule, requiring multiple blickets to operate, they should say that F, D, and possibly E are blickets and use multiple objects to make the machine go. If observers believe that the machine works on the “disjunctive” rule, in contrast, they should say that F is a blicket but that D and E are not and put single objects on the machine. [The evidence that E is a blicket is less strong than the evidence for D, so participants should be less likely to say that E is a blicket than D (23).] (See SI Appendix, Table S5 for analysis of E judgments, consistent with these predictions.) Fisher’s exact tests revealed no significant differences between conditions or ages for the unambiguous F object; as predicted all of the age groups in all of the conditions said that F was a blicket (means ranged from 0.7 to 0.96).
Fig. 2 presents the proportion of participants in each age group labeling the critical D test object as a blicket by condition. Because the dependent measure is a binary response, we used comparisons of generalized linear models to identify the statistical model with the best fit to the data. Results of model comparisons can be found in SI Appendix, Table S4.
A model predicting the binary D judgment from condition and age group with no interactions was best fit to the data. Post hoc tests using Tukey’s honest significant differences (HSD) for D object judgments revealed a significant difference between the conjunctive (M = 0.52, SE = 0.02) and the disjunctive (M = 0.13, SE = 0.01; t = −0.391, P < 0.001) and baseline (M = 0.15, SE = 0.01; t = −0.374, P < 0.001) conditions, and there was no significant difference between the disjunctive and baseline conditions (t = −0.017, P = 0.923).
In addition to the model comparisons, we conducted planned comparisons for the theoretically crucial developmental contrasts in the critical conjunctive condition, using Fisher’s exact
Gopnik et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7895
tests. These were the transition to school-age (4- vs. 6-y-olds) and to adolescence (12- to 14-y-olds vs. 6- to 7- and 9- to 11-y-olds, and vs. adults). Four-year-olds (M = 0.92, SE = 0.01) were significantly more likely to label D a blicket than 6- to 7-y-olds (M = 0.56, SE = 0.02; P < 0.01). Six- to 7-y-olds and 9- to 11-y-olds (M = 0.6, SE = 0.02; P = 1) did not differ but both 6- to 7-y-olds and 9- to 11-y-olds labeled D as a blicket significantly more than 12- to 14-y-olds (M = 0.28, SE = 0.02; P < 0.05, in both cases). However, adolescents’ (12- to 14-y-olds) judgments did not differ significantly from the judgments of adults (M = 0.25, SE = 0.02; P = 1.0). Thus, within the new data collected in this study, we saw some evidence for both middle childhood and adolescent transitions.
Intervention choices. We also analyzed participants’ choices when they were asked to activate the machine. Fig. 3 displays the proportion of participants choosing multiple items, indicating that they thought more than one object was necessary to activate the machine. There was more variability in this open-ended response than in the yes/no blicket judgments. However, the general pattern was similar. In particular, adolescents and adults were more likely to choose single objects to make the machine go, suggesting that they had genuinely concluded that the machine worked disjunctively, and did not simply use the word “blicket” differently than younger participants.
Again, we used a generalized linear model (see SI Appendix, Table S6 for details). The model with the best fit to the data predicted the single vs. multiple object use from condition, age group, and the interaction between condition and age group.
As with the blicket judgment measure, we made planned comparisons for the conjunctive condition using Fisher’s exact tests, focusing on the school-aged and adolescent transitions. These tests showed that 4-y-olds (M = 0.84, SE = 0.01) were more likely to use multiple objects to activate the machine than 6- to 7-y-olds (M = 0.53, SE = 0.02; P < 0.05), again suggesting a middle childhood transition. With this measure, 6- to 7-y-olds and 9- to 11-y-olds (M = 0.63, SE = 0.02) did not differ significantly from 12- to 14-y-olds and adolescents did not differ significantly from adults.
Discussion. These results suggest that, in this task, as learners grow older and have more experience they become less sensitive to the evidence and more reliant on their prior beliefs. They increasingly prefer disjunctive explanations to conjunctive ones, even when the evidence weighs in the opposite direction.
The results from both the blicket judgments and interventions suggest a developmental transition at the entry to middle childhood and the blicket judgment results also suggest a transition at adolescence, rather than just a continuous change with increasing knowledge. School-aged children are similar to each other and less flexible than preschoolers, adolescents and adults are similar, and both are less flexible than preschoolers and school-aged children (Fig. 2).
Exp. 2: Reasoning About the Causes of Actions
In the second experiment we turned from physical causality to social and psychological causality. Classic findings in social psychology show that Western adults attribute actions to the stable internal personal traits of an actor despite countervailing evidence, the “fundamental attribution error” (67). These findings suggest that adults rely on existing causal hypotheses rather than modifying those hypotheses in the face of evidence.
In one study, for example, an experimenter instructed half the participants in a group to write and read aloud an essay supporting Castro and the other half to write and read an essay opposing him. Despite the obvious evidence that the essays were the result of the situation, participants reported that people in the first group were more left wing than those in the second (68). Among adults, this trait bias tends to become stronger with age (69) and it appears to be stronger in some cultures than others: American and European middle-class participants show a stronger trait bias than Hong Kong, Mainland Chinese, Japanese, and Korean participants (70).
How does this bias develop in childhood? Seiver et al. (22) presented preschool children with a scenario in which two dolls either played or refused to play on two potentially risky toys. The covariation evidence supported either a person or situation attribution. Then they asked the children to explain why the actors played or refused to play on the toys. Four-year-olds accurately made person or situation attributions depending on the evidence. Six-year-olds, however, showed a trait bias. They made more person attributions than 4-y-olds, even when the covariation information supported a situation attribution. In Exp. 2 we extend this previous work to study the developmental changes in learning over childhood and adolescence.
We included an adult sample to ensure that adults would indeed show a trait bias in this task. Adding 9- to 11-y-old and 12- to 14-y-old samples let us test whether the previously discovered transition from 4- to 6-y-olds was part of a continuous developmental decline, or reflected a particular transition into middle childhood.
We could also examine adolescence. Like adults, adolescents have extensive experience of their particular culture and the trait assumptions that go with it. There might be a developmental progression toward the adult pattern, as in Exp. 1. However, adolescents are also especially sensitive to social information and strongly motivated to explain peer behavior (59). They might be more sensitive to social evidence, and more likely to override a trait bias than adults. We might then expect something more like the inverted U of risk-taking and sensation-seeking.
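The planned comparisons in Exp. 1 rely on Fisher’s exact tests for 2 × 2 tables (age group by yes/no response). A minimal pure-Python sketch of a two-sided test is given below. The analyses in the paper were run in R, so this is only an illustration of the technique; the function name is hypothetical, and the counts in the usage example are invented because the text reports proportions rather than cell counts.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # HYPOTHETICAL counts, not the study's data: "blicket" yes/no responses
    # to object D in two made-up age groups.
    younger_yes, younger_no = 23, 2
    older_yes, older_no = 14, 11
    print(fisher_exact_two_sided(younger_yes, younger_no, older_yes, older_no))
```

The exact test is a natural choice here because the response is binary and some cells can be small, so large-sample approximations may be unreliable.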
amine age differences separately for each condition. There were no significant age differences in attribution scores in the person condition; all age groups produced trait explanations when these explanations were congruent with the data, and rarely made situation attributions.
The baseline condition allowed us to assess participants’ judgments when no evidence was available (their “prior” in Bayesian terms). Post hoc Tukey tests revealed that 4-y-olds (M = 0.93, SE = 0.08) provided significantly more situation attributions than both 12- to 14-y-olds (M = 0.24, SE = 0.06; t = −0.694, P = 0.001) and adults (M = 0.38, SE = 0.05; t = −0.55, P = 0.004). Although both 6-y-olds (M = 0.43, SE = 0.1; t = −0.49, P = 0.09) and 9- to 11-y-olds (M = 0.55, SE = 0.11; t = −0.386, P = 0.49) provided fewer situation attributions than 4-y-olds, these differences did not reach statistical significance. This finding suggested that a trait bias developed around 6 y and was maintained with age.
Discussion. In the person condition, participants of all ages mostly made trait attribution explanations, in accordance with the evidence. In the baseline condition, with no evidence, there was a decrease in situation explanations with age. Accumulating experience may have led to a trait bias.
In the situation condition, in which the learners had to infer the unusual hypothesis, there was an interesting developmental reversal, with an inverse U pattern. Twelve- to 14-y-olds were less likely to make trait attributions than either 6-y-olds or adults. In other words, although the adolescents had developed a strong bias to begin with, they overcame that bias when they received contradictory evidence. The adolescents showed the largest gap between the baseline condition and the situation condition.
These findings support the idea that adolescents may be particularly interested in discovering new social possibilities. This finding is consistent with the fact that, compared with adults, adolescents show greater activation in brain regions associated with self-perception and social cognition (71, 72), and that adolescents are often at the forefront of social change.
period of immaturity allows a period of flexible hypothesis search in cultural learning. In both studies, we also found some evidence for developmental transitions, particularly from early to middle childhood and at adolescence.
The crucial conditions involved cases where the evidence and the existing hypotheses were in conflict, the conjunctive condition in Exp. 1 and the situation condition in Exp. 2. In both studies 4-y-olds and 6- to 7-y-olds were significantly different in these conditions. In both studies, however, we did not see significant differences between 6- to 7-y-olds and 9- to 11-y-olds.
Similarly, we found evidence for a transition in adolescence in both studies in these conditions, but this transition went in opposite directions. In the physical case, in the conjunctive condition adolescents were similar to adults but less flexible than either 6-y-olds or 9- to 11-y-olds. Like adults, the adolescents seemed reluctant to revise physical knowledge they had already acquired. In the social case, however, in the situation condition adolescents were more flexible than either 6-y-olds or adults. This finding is consistent with the idea that adolescents are more tuned to the social domain than the physical one, and are willing to entertain new social possibilities.
These findings also raise the question of the interaction between biological and environmental factors in the unfolding of life history. The findings in the baseline conditions suggest that children are gradually accumulating more knowledge and that this may play a role in the decline of cognitive flexibility. However, the discontinuous pattern in the conjunctive and situation conditions suggests that other factors also play a role. Biological changes like puberty may play a role in the adolescent transitions. There may also be more complex interactions between the changing life experiences that come with different developmental stages and hypothesis search and flexibility. Adolescence is not only a time of biological change; it is also a time of new social motivation and experience. Similarly, there is a complex interaction between biological changes at around 6 y
and experiences such as school in our culture, or more informal apprenticeships in cultures without formal schooling.
It is also plausible that a playful protected environment may lead to more flexible, exploratory and childlike learning, even in adulthood, and that even in childhood, stressful or resource-poor environments may lead to less flexibility and a more adult-like emphasis on exploitation (see, e.g., refs. 73 and 74).
These issues are all worthy of exploration, as are extensions of these studies to new domains. The physical causal learning results in Exp. 1 have been replicated in low socioeconomic status preschoolers in Peru and the United States,* but more extensive cross-cultural comparisons, including the social tasks and extending to forager and small-scale agricultural cultures, would also be important. The current findings do, however, suggest a relation between biology and culture, in particular between the distinctive childhood and adolescence of our life history and our equally distinctive ability to learn about and create new social and physical environments.
Methods
Data from the new participants in this study can be found on the Open Science Framework (https://osf.io) under the profile for Shaun O’Grady.
Exp. 1.
Participants. Children aged 6- to 7-y-old (n = 90), 9- to 11-y-old (n = 90), and 12- to 14-y-old (n = 86) participated. We combined these new data with that reported for preschoolers and adults in Exp. 2 of Lucas et al. (23) to compare performance from preschool to adulthood. For all participants in both experiments reported here, parents provided written informed consent and the child participants provided either written assent (9- to 14-y-olds) or verbal assent (4- to 7-y-olds) in accordance with protocols approved by the University of California, Berkeley Committee for the Protection of Human Subjects.
Procedure. Participants from each age group were randomly assigned to one of three conditions: two training conditions (conjunctive and disjunctive conditions) and a third condition with no training, termed the baseline condition. In each condition the participants were shown nine different blocks (A, B, C, A2, B2, C2, D, E, and F). Participants were presented with a machine and were informed that “blicketness” makes the machine light up and play music.
In both of the training conditions, the experimenter placed individual blocks or combinations of blocks on the machine in the same order (Fig. 1). In the conjunctive condition the machine only activated when the experimenter placed both A and C on the machine at the same time, providing evidence that supports a conjunctive rule about the machine’s operation. In the disjunctive condition the machine activated any time either A or C were placed on the machine, suggesting that only one of the two blocks was needed. After the two training trials participants saw one test trial with three new items: D, E, and F. The test trials provided ambiguous information that could support either the conjunctive or disjunctive rule (i.e., D and F are both blickets or just F is a blicket). In the baseline condition, participants were not given any prior training about the rule for operating the machine, but instead were presented with two ambiguous test trials. We recorded results from the second test trial but there were no significant differences between them.
The three conditions only differed in the covariation between the blocks and the machine. In all three conditions, at the end of both training and test trials, the experimenter pointed to each item individually and asked the participant if that item was a blicket or not a blicket. Finally, the experimenter then gestured to the set of three objects and asked the participant, “Which of these [gesturing to the three test objects] would you use to turn on the machine?”
Exp. 2.
Participants. The same 9- to 11-y-olds (n = 90) and 12- to 14-y-olds (n = 86) in Exp. 1 also participated in this experiment. Order of administration of the tasks was counterbalanced to avoid interference; there were no order effects. An additional 240 adult participants were recruited for an online version of this experiment via Amazon’s Mechanical Turk. We combined these data with the original data from Seiver et al. (22) for 4- and 6-y-olds.
Procedure and coding. Participants were randomly assigned to one of three conditions in which two dolls interacted with two toys. Subjects assigned to the situation condition saw two dolls play on one toy four times and then saw those same dolls avoid playing on a second toy four times. This pattern of covariation should suggest that something about the situation caused the pattern of actions (i.e., “her friend played on the bicycle” or “the trampoline is dangerous”). Those assigned to the person condition saw one doll play on both toys four times, whereas the other doll avoided playing on both toys four times. This evidence should suggest that the actions resulted from an inherent trait of the doll, and produce trait-based explanations, such as “she’s the type of doll that gets scared/is brave” or “she knows how to ride a bike.” Finally, in a baseline condition, participants saw one doll play on one toy four times, whereas the other doll avoided the other toy four times. Participants in this condition could not rely on covariation information to make attributions because they had not seen how each doll acted on the other toy. After they watched the dolls interact with the toys, each participant was asked why each doll either played or did not play on the second toy.
Explanations referring to an enduring characteristic of the doll were coded as “person” attributions and were given a score of 0 (e.g., “Because she might be more brave than the other one”). When an explanation referenced an aspect of the toy or situation, the response was coded as a “situation” attribution and given a score of 1 (e.g., “The trampoline doesn’t have any edges”). Some explanations referred to both personal traits and situational factors and were coded as “interactions” and given a score of 0.5. See SI Appendix, Table S9 for a list of example responses by category. Reliability coding was conducted on 16% of the responses by a second coder who was blind to condition, and interrater reliability was high (Cohen’s κ = 0.967, P < 0.001). Coded explanation responses for each participant were summed to provide a “situation” attribution score for each participant.
Analyses. All analyses in both experiments were performed using the R statistical programming language (75). Preliminary analyses revealed no effect of block shape, doll name, toy, or the order in which the dolls played. Linear regression models found no effect of gender of the participants or the experimenter in either experiment (see SI Appendix, Tables S3 and S11).
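The Exp. 2 coding scheme (person = 0, situation = 1, interaction = 0.5, summed per participant) and the interrater reliability check can be sketched as follows. This is an illustrative Python sketch rather than the authors’ R code; the function names and the example labels are hypothetical, and the kappa function implements the standard two-coder formula κ = (p_o − p_e) / (1 − p_e).

```python
from collections import Counter

# Scores follow the coding scheme described in the text.
SCORES = {"person": 0.0, "situation": 1.0, "interaction": 0.5}


def situation_score(codes):
    """Sum the coded explanations given by one participant."""
    return sum(SCORES[c] for c in codes)


def cohens_kappa(coder1, coder2):
    """Cohen's kappa for two coders' labels on the same set of responses."""
    if len(coder1) != len(coder2) or not coder1:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(coder1)
    # Observed agreement: share of responses given the same label.
    observed = sum(x == y for x, y in zip(coder1, coder2)) / n
    # Chance agreement from each coder's marginal label frequencies.
    m1, m2 = Counter(coder1), Counter(coder2)
    expected = sum(m1[k] * m2[k] for k in m1) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical participant who gave one person and one situation explanation.
    print(situation_score(["person", "situation"]))
    # Hypothetical labels from two coders on four responses.
    print(cohens_kappa(["person", "person", "situation", "situation"],
                       ["person", "person", "situation", "person"]))
```

Summing the per-explanation scores gives a participant-level score in which higher values indicate more situation attributions.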
1. Hill K, Kaplan H (1999) Life history traits in humans: Theory and empirical studies. Annu Rev Anthropol 28:397–430.
2. Chapais B (2009) Primeval Kinship: How Pair-Bonding Gave Birth to Human Society (Harvard Univ Press, Cambridge, MA).
3. Hrdy SB (2011) Mothers and Others (Harvard Univ Press, Cambridge, MA).
4. Hawkes K, O’Connell JF, Jones NG, Alvarez H, Charnov EL (1998) Grandmothering, menopause, and the evolution of human life histories. Proc Natl Acad Sci USA 95:1336–1339.
5. Bennett PM, Harvey PH (1985) Brain size, development and metabolism in birds and mammals. J Zool 207:491–509.
6. Weisbecker V, Goswami A (2010) Brain size, life history, and metabolism at the marsupial/placental dichotomy. Proc Natl Acad Sci USA 107:16216–16221.
7. Street SE, Navarrete AF, Reader SM, Laland KN (2017) Coevolution of cultural intelligence, extended life history, sociality, and brain size in primates. Proc Natl Acad Sci USA 114:7908–7914.
8. Bruner JS (1972) Nature and uses of immaturity. Am Psychol 27:687–708.
9. Konner M (2010) The Evolution of Childhood: Relationships, Emotion, Mind (Harvard Univ Press, Cambridge, MA).
10. Bjorklund DF, Green BL (1992) The adaptive nature of cognitive immaturity. Am Psychol 47:46–54.
11. Bogin BA, Smith BH (1996) Evolution of the human life cycle. Am J Hum Biol 8:703–716.
12. Munakata Y, Casey BJ, Diamond A (2004) Developmental cognitive neuroscience: Progress and potential. Trends Cogn Sci 8:122–128.
13. Carlson SM (2005) Developmentally sensitive measures of executive function in preschool children. Dev Neuropsychol 28:595–616.
14. Gopnik A, Wellman HM (2012) Reconstructing constructivism: Causal models, Bayesian learning mechanisms, and the theory theory. Psychol Bull 138:1085–1108.
15. Johnson C, Wilbrecht L (2011) Juvenile mice show greater flexibility in multiple choice reversal learning than adults. Dev Cogn Neurosci 1:540–551.
16. Buonomano DV, Merzenich MM (1998) Cortical plasticity: From synapses to maps. Annu Rev Neurosci 21:149–186.
17. Werker JF, Hensch TK (2015) Critical periods in speech perception: New directions. Annu Rev Psych 66:173–196.
18. Kuhl PK (2004) Early language acquisition: Cracking the speech code. Nat Rev Neurosci 5:831–843.
36. Perfors A, Tenenbaum JB, Griffiths TL, Xu F (2011) A tutorial introduction to Bayesian models of cognitive development. Cognition 120:302–321.
37. Tenenbaum JB, Kemp C, Griffiths TL, Goodman ND (2011) How to grow a mind: Statistics, structure, and abstraction. Science 331:1279–1285.
38. Gopnik A (2012) Scientific thinking in young children: Theoretical advances, empirical research, and policy implications. Science 337:1623–1627.
39. Xu F, Kushnir T (2013) Infants are rational constructivist learners. Curr Dir Psychol Sci 22:28–32.
40. Kushnir T, Xu F, eds (2012) Rational Constructivism in Cognitive Development (Academic, Cambridge, MA), Vol 43.
41. Pearl J (2009) Causality (Cambridge Univ Press, Cambridge, UK), 2nd Ed.
42. Griffiths TL, Chater N, Kemp C, Perfors A, Tenenbaum JB (2010) Probabilistic models of cognition: Exploring representations and inductive biases. Trends Cogn Sci 14:357–364.
43. Bonawitz E, Denison S, Griffiths TL, Gopnik A (2014) Probabilistic models, learning algorithms, and response variability: Sampling in cognitive development. Trends Cogn Sci 18:497–500.
44. Denison S, Bonawitz E, Gopnik A, Griffiths TL (2013) Rational variability in children’s causal inferences: The sampling hypothesis. Cognition 126:285–300.
45. Bonawitz E, Denison S, Gopnik A, Griffiths TL (2014) Win-stay, lose-sample: A simple sequential algorithm for approximating Bayesian inference. Cognit Psychol 74:35–65.
46. Kirkpatrick S, Gelatt CD, Jr, Vecchi MP (1983) Optimization by simulated annealing. Science 220:671–680.
47. Horner V, Whiten A (2005) Causal knowledge and imitation/emulation switching in chimpanzees (Pan troglodytes) and children (Homo sapiens). Anim Cogn 8:164–181.
pubertal development, and risk-taking behavior. J Neurosci 35:7226–7238.
65. Steinberg L, et al. (2017) Around the world, adolescence is a time of heightened sensation seeking and immature self-regulation. Dev Sci, 10.1111/desc.12532.
66. Cheng PW (1997) From covariation to causation: A causal power theory. Psychol Rev 104:367–405.
67. Kelley HH (1967) Attribution theory in social psychology. Nebraska Symposium on Motivation 15:192–238.
68. Jones EE, Harris VA (1967) The attribution of attitudes. J Exp Soc Psychol 3:1–24.
69. Horhota M, Blanchard-Fields F (2006) Do beliefs and attributional complexity influence age differences in the correspondence bias? Soc Cogn 24:310–337.
70. Morris MW, Nisbett RE, Peng K (1995) Causal attribution across domains and cultures. Causal Cognition: A Multidisciplinary Debate, eds Sperber D, Premack D, Premack AJ (Clarendon Press, Oxford), pp 577–612.
71. Pfeifer JH, et al. (2009) Neural correlates of direct and reflected self-appraisals in adolescents and adults: When social perspective-taking informs self-perception. Child Dev 80:1016–1038.
72. Pfeifer JH, Lieberman MD, Dapretto M (2007) “I know you are but what am I?!”: Neural bases of self- and social knowledge retrieval in children and adults. J Cogn Neurosci 19:1323–1337.
73. Gee DG, et al. (2013) Early developmental emergence of human amygdala-prefrontal connectivity after maternal deprivation. Proc Natl Acad Sci USA 110:15638–15643.
74. Nettle D, Frankenhuis WE, Rickard IJ (2013) The evolution of predictive adaptive responses in human life history. Proc Biol Sci, 10.1098/rspb.2013.1343.
75. R Core Team (2012) R: A language and environment for statistical computing. (R Foundation for Statistical Computing, Vienna).
How language shapes the cultural inheritance
of categories
Susan A. Gelman(a,1) and Steven O. Roberts(a)
(a) Department of Psychology, University of Michigan, Ann Arbor, MI 48109
Edited by Andrew Whiten, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark April 12, 2017
(received for review January 12, 2017)
It is widely recognized that language plays a key role in the transmission of human culture, but relatively little is known about the mechanisms by which language simultaneously encourages both cultural stability and cultural innovation. This paper examines this issue by focusing on the use of language to transmit categories, focusing on two universal devices: labels (e.g., shark, woman) and generics (e.g., “sharks attack swimmers”; “women are nurturing”). We propose that labels and generics each assume two key principles: norms and essentialism. The normative assumption permits transmission of category information with great fidelity, whereas essentialism invites innovation by means of an open-ended, placeholder structure. Additionally, we sketch out how labels and generics aid in conceptual alignment and the progressive “looping” between categories and cultural practices. In this way, human language is a technology that enhances and expands the categorization capacities that we share with other animals.
language | categories | essentialism | norms | children
It is broadly agreed that language is a distinctive human capacity and a powerful engine of cultural transmission. As such, language is important to the theme of this special issue (1). No matter how sophisticated the cultural transmission systems of nonhuman species (and they are astonishingly sophisticated; see papers in this issue) (2), they proceed without language. We can thus ask what language distinctively contributes to cultural transmission in humans and (more speculatively, but importantly) what language may distinctively contribute to cultural evolution in humans. Recent evidence from language learning in children provides new insights into these questions.
In this paper, we focus specifically on a key universal element of language, category labels (e.g., dogs, gold, women, Muslims), and their central role in the transmission and evolution of category representations. The argument, in brief, is that category labels work in an almost paradoxical way to ensure stability in the transmission process, but simultaneously to permit and even foster conceptual change. On the one hand, words are conven-
nut using a rock”; “I’ll give you my money if you put down that gun”; “Don’t trust Joe—he lies constantly”), which can share ideas, negotiate trades, deceive enemies, impress potential mates, affect reputations, and so forth. The expressive capacity of human language is virtually unlimited because of its hierarchical, combinatorial structure (6). In contrast to the communication systems of other organisms (even those as impressive as whales, bees, or vervet monkeys), human language is generative: it permits infinitely many messages to be constructed out of a limited number of elements. This remarkably flexible system has obvious survival value, as it is used in the “cognitive arms race” of competitive feedback loops implicated in cooperative interactions that involve and must deal with cheating and cheating-detection (7).
However, much of what human language conveys is not explicitly articulated via propositional content, but rather is implied via presupposition, implicature, and other forms of inference (8). Four examples follow.
(i) Language marks social identity through variation. There are roughly 6,000 human languages around the globe, mutually unintelligible, and (with rare exception) fully learnable only in childhood. These aspects materially affect with whom one can communicate and coordinate, and from whom one can learn. Even among those who speak the same language, accent and dialect reveal a person’s cultural origins, and so serve as honest signals to identity, with consequences for whom others choose to interact with and which models others trust to imitate and learn from (9).
(ii) Language directs a person’s attention in the moment by means of structural features of the grammar. Different linguistic communities focus on different aspects of experience, and in so doing indicate what is important (10). For example, Japanese has an honorific system that requires a speaker to decide level of politeness; Quechua has an evidential system for expressing how a speaker comes to know something: directly seeing vs. hearsay. There is debate regarding the role of these differences on nonlinguistic cognition (11, 12). But at a minimum, these structural features affect a person’s thinking in the moment of speaking
tional and prescriptive, and provide a stable representation that (13), including what information gets encoded and transmitted
is easily shared with great fidelity, but on the other hand, words within a social interaction.
have an open-ended “placeholder” structure that invites in- (iii) Language transmits information through a rich system
novation. We suggest that this dual capacity contributes to what of pragmatic implications (14, 15). Communication involves in-
is distinctive in human cultural evolution. ferring the speaker’s intentions, a complex process that builds on
theory-of-mind capacities (16). Pragmatic inferences not only
Propositions vs. Presuppositions allow a listener to infer a speaker’s meaning, but also to learn
Maynard Smith and Szathmáry argued that language is one of about properties of the world (17).
the major transitions in the evolution of complexity, specifically
in the intergenerational transmission of information: “We accept
[the origin of human speech] as being the decisive step in the This paper results from the Arthur M. Sackler Colloquium of the National Academy of
Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the
origin of specifically human society” (3). Kirby et al. (4) similarly Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering
note that “Language is unique in being a system that supports in Irvine, CA. The complete program and video recordings of most presentations are available
unlimited heredity of cultural information, allowing our species on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
to develop a unique kind of open-ended adaptability.” And Pagel Author contributions: S.A.G. and S.O.R. wrote the paper.
(5) likewise refers to “language’s role in the transmission of the The authors declare no conflict of interest.
information that makes our societies possible.” This article is a PNAS Direct Submission. A.W. is a guest editor invited by the Editorial
The most obvious way that language transmits information is Board.
via explicit declarative propositions (e.g., “You can crack open a 1
To whom correspondence should be addressed. Email: gelman@umich.edu.
PSYCHOLOGICAL AND
COGNITIVE SCIENCES
For humans, categories themselves are a key part of our cultural inheritance, which is to say, they exhibit learned, socially transmitted variation that cannot be explained by genetic or environmental factors (2). We are not born with a fixed set of categories (no one is born knowing of screwdrivers, or that whales are mammals, or that girls wear pink). Nor do we simply pick up on discontinuities in the biological world; rather, human categories have a cultural overlay. We see this in categories of natural kinds, social kinds, and artifacts, all of which display tremendous linguistic, cultural, and regional variation. Classifications of the natural world vary in which animals, plants, or substances are classified as edible, which are classified as medicinal, and which are classified as clean/unclean (22). Classifications of the social world vary in how gender, race, and social hierarchies are organized (23, 24). Classifications of artifacts vary in the very entities there are to be classified, with distinct types of tools, clothing, furniture, and so forth, as well as category boundaries (25). And there is marked linguistic variation in the classification of dimensions of experience, including color, number, time, space, emotions, even senses (26, 27).

Although categories can be acquired asocially by individuals via direct observations and interactions with the world, human languages provide a socially transmitted system for efficiently communicating information about which categories there are, what belongs in those categories, and which attributes those categories possess. Universally, languages use two devices for the intergenerational transmission of categories: labels (names for categories, such as “shark” or “woman”) and generics (generalizations about named categories, such as “sharks attack swimmers” or “women are nurturing”).

Labels express concepts that have some cultural significance; whereas there are indefinitely many concepts one can generate (e.g., “items weighing more than 500 grams,” including vultures and the Oxford English Dictionary but not a small grapefruit), only a subset of these ideas are lexicalized, and of these, only a subset are maintained in a language over time (28). Words are distinctive to humans in their number (typically about 50,000 in an adult speaker, many of which are names for things), conceptual precision (e.g., chase vs. flee), and need to be learned

cally unacceptable, and preschool children understand this (46).

Two Presuppositions: Norms and Essences
On a strict reading, labels communicate the category to which something belongs, and generics communicate some fact, opinion, or belief about a category. This is the explicit informational value of these expressions. However, labels and generics in actual interpersonal use imply more than these literal meanings, and indeed we would argue that appropriate use of these expressions requires understanding these implications. Next, we review two conceptual presuppositions embedded in the use of labels and generics: that categories are normative (i.e., conventional and prescriptive) and that categories have essences.

Categories as Normative. A social norm is a shared, socially constructed, context-specific rule that indicates what is (or is not) socially appropriate (21, 47). Labels are fundamentally normative in that they are conventional (i.e., shared with other members of the speech community, a principle required for their successful use) (48, 49). A person who did not appreciate the normative value of labels might arbitrarily substitute vocalizations of their own invention for words that they hear from others (e.g., you call that a hammer, but I’ll call it a blicket), and the whole communicative enterprise would never get off the ground. By the time of their first word, children appreciate the conventionality principle, expecting novel labels used by one speaker to be understood by others within the speech community (50). Not all behaviors are treated the way that labels are treated; for example, infants assume that preferences are individually varying rather than shared or conventional (51). Infants also appreciate that language operates via a division of linguistic labor, whereby more knowledgeable members of the community can be trusted to provide accurate labeling (52). From an early age, children are sensitive to social variation in labelers, for example preferentially accepting labels from adults over children and experts over novices (53), an expectation that fosters conformity. A powerful consequence of this principle is that even a simple relabeling can shift children’s label
Gelman and Roberts PNAS | July 25, 2017 | vol. 114 | no. 30 | 7901
use, with only minimal explanation, as in the following examples of parents looking through a picture book with their children (34):

i) Child: That’s kangaroo. (Pointing to an aardvark.)
Mother: Well, that looks like a kangaroo, but it’s called an aardvark.
Child: Aardvark.

ii) Child: That’s a snake. (Referring to an eel.)
Mother: It looks like a snake, doesn’t it? It’s called an eel. It’s like a snake, only it lives in the water. And there’s another one.

In experimental settings as well, children accept relabelings from others, even when they compete with perceptual evidence that children directly experience (54).

Generic information is likewise assumed to be conventional and shared with others rather than idiosyncratic, private, or subjective. Even in prelinguistic communication, an action that is displayed to others is more often assumed to be generic than information that is done for the actor himself or herself (55). With regard to language, young children treat generic statements as conveying information that is widely known (56–58). Generics are particularly frequent in pedagogical (information transmitting) contexts, such as book reading, and when taking on a pedagogical role, such as talking to a more ignorant interlocutor or pretending to be a teacher (59, 60). Although generics can express idiosyncratic or subjective perspectives (e.g., “Vegemite is delicious!”), expressing this generically implies a general truth, even to preschool children (61).

Category-referring language is normative in a stronger, prescriptive sense as well. That is, labels and generics imply that a feature linked to a category not only is but also should be (62, 63). This is particularly so for generic language, which expresses norms that may even compete with statistical observations: “Boys don’t cry” is deemed true—despite being demonstrably false—because it expresses a norm (64–66). Similarly, generics such as “Scientists care about the truth” express abstract values rather than descriptively accurate features (67). Parents likewise produce generics that express prescriptive norms that conflict with the reality in the moment (e.g., “Remember, we don’t stand up on chairs”; “Oh, no, you don’t pull on books”; see http://childes.talkbank.org/access/Eng-NA/Brown.html).

Generic language leads to normative judgments, even when the categories are novel and the content is innocuous. In a series of experiments, children 4–13 y of age learned of two novel groups that contrasted with one another in some harmless behavior, such as the music they listen to or the food they eat (e.g., Hibbles listen to one kind of music, and Glerks listen to another kind of music). Children reported that it was “not OK” for an individual to fail to conform to the group behavior (e.g., for a Hibble to listen to music that is more typical of Glerks) (68). In other words, children interpreted an unfamiliar descriptive regularity as if it were prescriptive (see also refs. 69–71 for additional evidence that descriptive and prescriptive norms are conflated in children’s and adults’ concepts). Language plays an important role in licensing this normative response: when the vignettes depicted individuals (not groups) that received category labels in either specific statements (e.g., “This Hibble listens to this kind of music”) or generic statements (e.g., “Hibbles listen to this kind of music”), children made normative judgments; when the vignettes depicted individuals without category labels or generics (e.g., “This listens to this kind of music”), children did not make normative judgments (72). Thus, category labels and generic statements license a prescriptive reading of novel, innocuous behaviors: they imply that members of the labeled group should behave a certain way. The establishment of norms is itself a mechanism that fosters the stability of group behaviors (70).

Essentialism.
[Essence is] the very being of anything, whereby it is what it is. And thus the real internal, but generally . . . unknown constitution of things, whereon their discoverable qualities depend, may be called their essence.
Locke (73)

A striking aspect of human categories—and the words that express them—is that they often defy appearances: stick-bugs look like sticks, pyrite looks like gold. It is not surprising that scientific categories extend beyond the obvious, given that the natural world provides evolved mechanisms that lead appearances to mislead, including homologies, camouflage, mimicry, and sexual dimorphism. What is notable is that nonscientists, including children, share the expectation that categories have hidden structure and that words in ordinary language (e.g., bug, gold) capture this structure (74). This expectation contrasts with classic theories of cognitive development, which propose that young children are “perceptually bound” thinkers, and that concepts shift from similarity-based to conceptually based over development (75, 76).

We refer to this assumption as “psychological essentialism”: an intuitive belief that categories of the natural world share not just observable features, but also a deeper, nonobvious reality: they “carve up nature at its joints” (74, 77, 78). Thus, tigers share more than a certain size, gait, striped fur, and ferocity, but also internal parts, temperament, instincts, as well as an innate, unchanging tiger “essence.” This essence might be blood, DNA, or even an unspecified, unknown placeholder, an expectation that there is an essence without knowing what it might be. For example, young children report that an animal’s behavior is caused by its own insides or energy before they can have detailed expectations about the particular form that such causal force might take (79–81).

Evidence for psychological essentialism comes from research with adults as well as young children (74, 82). Even in infancy, children expect members of a category to share internal, nonobvious, or causal similarities, even in the face of superficial dissimilarities (31, 83–85). Boundaries between categories are treated as discontinuous and objectively correct, and category membership itself is viewed as immutable (24, 86–89). Category members are thought to have innate potential that resists environmental influences (90–92). Internal bodily organs are thought to have the power to modify the recipient’s behavior (93, 94). That essentialist beliefs have been documented in young children and across a variety of cultural contexts suggests that essentialism is a fundamental component of human cognition (23, 95–99). Although which categories are essentialized varies cross-culturally, especially for categories of people (such as race, gender, or ethnicity) (100), essentialism of both natural kinds and social kinds has been broadly and consistently documented (101–103). Essentialized social categories have important implications for evolutionarily significant behaviors in humans, including patterns of affiliation, mating, reproduction, and conflict. For example, essentialized social categories are often conceptualized as less human and more threatening than nonessentialized social categories (104, 105), and both children and adults are reluctant to share resources with members of essentialized out-groups (101, 106).

Essentialist expectations are linked to category labels. Hearing that a pterodactyl is a “dinosaur,” that a swaddled baby is a “boy,” or that a child received the heart of a “monkey” leads to the inference that the pterodactyl does not live in a nest, that the baby will grow up to like football regardless of its upbringing, and that the donated heart will confer a slight but inevitable uptick in one’s tendency to eat bananas. Essentialist expectations attach also to wholly novel labels applied to wholly novel categories (107–109). This is not to say that labels automatically trigger essentialist reasoning; they do not (74, 110). However, when a label is applied to a category that has some coherent conceptual basis (e.g., shared features) then essentialist beliefs follow (32).
in the United States courts system (127, 128).

So then, why do we essentialize? In the words of Medin (78), “psychological essentialism is bad metaphysics . . . [but] may prove to be good epistemology.” In other words, essentialism is factually wrong but heuristically useful. Essentialism promotes learning and conceptual change by providing a placeholder structure that promotes the search for underlying causes and modifications over time. The evidence reviewed above demonstrates the placeholder notion of essentialized concepts in three interrelated respects: (i) children expect items with the same label to share nonobvious similarities that they have not yet learned; (ii) children are guided by labeling and generics even when in competition with children’s own direct experiences (e.g., “a whale isn’t a fish”; “boys don’t cry”); and (iii) labeling and generics operate according to a “division of linguistic labor,” whereby children defer to more expert others to inform them about the classifications and generalizations of experience (129–131). Importantly, these placeholder expectations, in turn, permit and promote conceptual innovation, because children’s classifications build upon the expertise of others, and because children are motivated to search for underlying causal similarities that members of a category share. That even young children view categories in essentialist ways suggests that categories are not just structures for organizing what is already known, but placeholders for further knowledge that is expected to accrue. The meaning of a word is not a list of known features or learned facts. Rather, a word serves as an invitation to form a category (132) and to extend and modify it with growing knowledge and expertise. “Dog” is not a tag for a fixed set of observed features, but rather a pointer to “things of that nature,” where the “nature” will be filled in via learning and input from others. Here we endorse Putnam’s (133) famous assertion that “‘meanings’ just ain’t in the head!” Words refer to placeholder concepts that do not have fixed content and thus can be modified. Language “is not a mirror of our inner states but a complement to them. It serves as a tool whose role is to extend cognition in ways that on-board devices cannot” (19).

kinds) they involve looping effects between categories and the people being categorized.

From Variable Input to Categorical Representations. A uniquely human aspect of language is that it takes variable, idiosyncratic experiences and transforms them into discrete, symbolic, shared representations (28). The world is a complex, continuously dynamic array of sensory inputs, and no two people experience identical environmental cues. The experience of categories is thus doubly variable: in the range of instances that an individual encounters and in the experiences of individuals across the language community. Language reduces and regularizes this remarkable variety. Consider the use of a simple word, “bird,” which extends from hummingbirds to dodos, from downy chicks to vicious birds of prey. We converge on a shared label, regardless of our varied experiences: which birds we have seen or heard, which ones we have owned or eaten, whether our experience comes from real-world encounters, plush toys, or Big Bird.

This gap between the variability of experience and the commonality of labels presents a puzzle: “If biological and real world constraints are not enough then how is it nevertheless possible for a group to arrive at a sufficiently shared set of conceptual distinctions to make language possible?” (141). In other words, the transmission of language requires conceptual alignment or compatible mental representations that are abstracted away from varying experiences and knowledge bases (142, 143).

We suggest that the manner in which labels and generics abstract away from experience aids in conceptual alignment. Category labels abstract away from the particulars that make individuals unique (a small poodle and a large Great Dane are both “dogs”), and generics abstract away from any particular context (“birds fly,” even when the only birds in sight are penguins) (144, 145). Speakers don’t require shared experiences to have a shared system of communication. A 12-mo-old infant and a biologist can communicate with the word “dog,” despite radically different understandings.

Generics are particularly well-suited for expressing abstract, shared representations because, as noted earlier, they systematically
underplay variations in experience by glossing over exceptions. Generics are not disconfirmed by counterexamples (the existence of a nonflying bird does not disconfirm the generic claim that birds fly) (45, 46), which means that generic messages can trump a listener’s personal experiences. People produce generics about features they consider conceptually important (e.g., dangerous or distinctive), even when they know them to be variably present in a category, but those who hear such generics (whether adults or young children) tend to assume that the feature is almost universally present among category members (43, 146). This results in systematic distortions in the transmission process, from variability to category-wide consistency.

The Looping Effect of Human Kinds. Hacking (147) speaks of a “looping effect” in social categories: specifically, that classifications of people have cognitive consequences for those that are classified, which feed back into these same classifications:

“To create new ways of classifying people is also to change how we can think of ourselves, to change our sense of self-worth, even how we remember our own past. This in turn generates a looping effect, because people of the kind behave differently and so are different. This is to say the kind changes, and so there is new causal knowledge to be gained and perhaps, old causal knowledge to be jettisoned. . . . that new knowledge in turn becomes part of what is to be known about members of the kind, who change again. . . . Kinds are modified, revised classifications are formed, and the classified change again, loop upon loop” (147).

We have sketched out linguistic mechanisms that may contribute to this looping effect. Labels and generics stake out categories, which then are altered through human action to reify such categories. In contrast to Hacking, however, we see this looping effect not only for categories to which one belongs, but also for categories of others. History is replete with modifications that differentiate groups. Thus, for example, male/female differences are exaggerated by differences in clothing, hairstyles, gait, bodily deformations (e.g., foot-binding), and styles of speech. Modifications may be imposed (e.g., Jews in World War II Germany being required to wear stars) or chosen (e.g., fashions worn by self-identified hipsters). Social groups may be physically separated, either by explicit policy (e.g., segregationist policies toward Blacks in the southern United States; Japanese internment camps in the United States during World War II) or by other practices and constraints (e.g., low-income families restricted to neighborhoods with unclean water and air). Concepts of human kinds may lead to a cyclical pattern in which cultural practices lead groups to appear more distinct from one another, which confirms the categorizations, leading to more differentiating practices, and so forth. Viewing social kinds as having deep differences has cycling effects on behaviors that contribute to the reality of that social kind.

Norms and Essentialism in Nonhuman Species
Are nonhuman animals also capable of learning categories with prescriptive implications and a nonobvious basis? This question is timely, given recent discoveries of remarkably sophisticated categorization and social transmission abilities in nonhuman animals (see other papers in this issue). For example, consider an ingenious experiment demonstrating that chimpanzees conform to cultural (descriptive) norms of tool use (148). The researchers first taught a high-ranking chimpanzee one of two manners of tool use to obtain food out of a puzzle box (e.g., using either a poking or a lifting motion). When let loose within the group, other members picked up the demonstrated solution strategy, even adhering to the method common in the group after having successfully used the alternative method. Certainly language was not required.

Nonhuman primates are also capable of categorizing based on nonperceptible features. For example, baboons engage in sophisticated categorizations of conspecifics, with dominance hierarchies that simultaneously rank by individual rank and family group using matrilineal kinship, friendships, and causal theories (149). These categories have a nonobvious basis (e.g., infants “inherit” the rank of the mother) and are learned (e.g., members need to learn which individuals fall into which group). Seyfarth and Cheney propose: “. . . when it comes to recognizing matrilineal kin groups, baboons are ‘essentialists’ . . . They act as if the members of kin groups ‘have essences or underlying natures that make them the things that they are’” (149).

Monkeys and great apes can also track category membership across radical featural transformations, and privilege kind (essential features) over superficial appearance (surface features). For example, one study presented rhesus macaques with food items in which the inner identity was transformed (e.g., an apple was disguised as a coconut) (150). After a piece of this transformed fruit was placed in a container, if the animal reached in and found a piece that matched the appearance rather than the inner kind, they continued to search for another piece, indicating that they had been expecting the sample to match the inner kind and not the appearance. The researchers interpret the findings as “evidence that macaques share this one primitive aspect of psychological essentialism” (150). Similarly, in another study, bonobos, orangutans, and chimpanzees viewed a transformation process in which one piece of food was disguised to look like another (e.g., a carrot slice was disguised as a banana slice) (151). When given a choice between a true piece of banana versus a disguised piece of carrot that only looked like a banana, animals preferred the true banana. The authors interpret this as “a kind of psychological essentialism, perhaps the phylogenetically and ontogenetically most basic one” (151). Again, language was not required to consider an appearance–reality conflict and to privilege the inner identity.

These impressive capacities demonstrate that humans share with at least other primates the ability to categorize based on subtle, nonperceptible cues, and the ability to conform to normative regularities (although conformity is substantially greater in humans) (152). Indeed, norms and essentialism may precede language in human development, as preverbal infants infer general ways of interacting with objects from pedagogical demonstrations, evaluate others based on their social interactions, categorize based on nonobvious features, and distinguish individuals from kinds (55, 80, 153–155).

Nonetheless, we suggest four key respects in which human language may be unique in fostering the social transmission and evolution of categories.

Efficiency in Transmitting Category Information. First and most obviously, labels and generic language ensure speed, fidelity, and ease of transmitting category information, by means of an overt and stable representational format. This would be difficult (perhaps impossible) to achieve by means of actions alone. (Note that language is not necessarily more efficient for transmitting all sorts of information. For example, showing the location of an object is likely more efficiently done by pointing; teaching weaving is likely more efficiently done by demonstration.) Consider the case of conveying that an item is not what it appears to be. The studies with nonhuman primates required a lengthy and rather elaborate shared context (the transformation process itself), carried out by an expert with special tools and procedural know-how. Someone who was not present during this demonstration would not have access to the relevant information. Contrast this with the human language case, which efficiently corrects a misconception with a single sentence (“This looks like a banana, but really it’s a carrot”). Anyone who hears the new label—even a nonexpert or young child—could then share it with others, ensuring a transmission chain. Consider, too, the case of conveying the scope of a feature: if eating a mushroom makes you sick, is it because of that particular mushroom (e.g., maybe it was rotten or was sprayed with pesticides) or mushrooms of that type more generally? Again, this is efficiently conveyed via generic
essentialism extend beyond content with obvious survival value to include any aspect of experience. Essentialism applies to natural substances, living kinds, human social groups, personal characteristics, diseases, and in some respects even artifacts (31, 82, 113, 160–165). Similarly, normative expectations extend to a vast array of behaviors, including which clothing to wear, which music to listen to, or which games to play (68).

From Models to Morals. Although nonhuman animals are capable of conforming to high-ranking group members (copying modeled behaviors) and “punishing” others by retaliating when they are wronged, we are unaware of evidence that they display moral condemnation or punishment of nonconformity in others. For example, in one study with chimpanzees, an actor could punish a thief by depriving them of food reward (via trapdoor) (166). The actor only retaliated when their own food was stolen, not when another chimpanzee’s food was stolen. This is in sharp contrast to the findings with young children, who exhibit strong moral evaluations of others (47, 167, 168). One might say that social

Levinson notes the special role of language in the process of enculturation of cognition: “. . . language appears to play a crucial role [in how culture gets into the head]: it is learnt far earlier than most aspects of culture, is the most highly practiced set of cultural skills, and is a representation system that is at once public and private, cultural and mental” (171).

In the case of learning categories, we suggest that cumulative cultural evolution is enhanced by labels and generics, which provide a simple yet powerful means of passing along the wisdom (and prejudices) of prior generations. In this way, language enhances and expands (nonlinguistic) capacities to categorize that we share with other animals. A full understanding of this process will require studying how it intersects with a variety of other important cognitive capacities that are present early in human development, including theory of mind, alertness to testimony, attention to ritual, and a drive for causal understandings (134, 172–174).

ACKNOWLEDGMENTS. We thank Bruce Mannheim and two anonymous reviewers for very helpful comments on an earlier draft.
1. Lotem A, Halpern JY, Edelman S, Kolodny O (2017) The evolution of cognitive mechanisms in response to cultural innovations. Proc Natl Acad Sci USA 114:7915–7922.
2. Whiten A (2017) A second inheritance system: The extension of biology through culture. Interface Focus, in press.
3. Maynard Smith JM, Szathmáry E (1997) The Major Transitions in Evolution (Oxford Univ Press, Oxford).
4. Kirby S, Cornish H, Smith K (2008) Cumulative cultural evolution in the laboratory: An experimental approach to the origins of structure in human language. Proc Natl Acad Sci USA 105:10681–10686.
5. Pagel M (2017) Darwinian perspectives on the evolution of human languages. Psychon Bull Rev 24:151–157.
6. Chomsky N (1975) Aspects of the Theory of Syntax (MIT Press, Cambridge, MA).
7. Pinker S, Bloom P (1990) Natural language and natural selection. Behav Brain Sci 13:707–727.
8. Mannheim B (2015) The social imaginary, unspoken in verbal art. The Routledge Handbook of Linguistic Anthropology, ed Bonvillain N (Routledge, New York), pp 44–61.
9. Kinzler KD, Corriveau KH, Harris PL (2011) Children’s selective trust in native-accented speakers. Dev Sci 14:106–111.
10. Hill JH, Mannheim B (1992) Language and world view. Annu Rev Anthropol 21:381–406.
11. Gentner D, Goldin-Meadow S, eds (2003) Language in Mind: Advances in the Study of Language and Thought (MIT Press, Cambridge, MA).
12. Gleitman L, Papafragou A (2013) Relations between language and thought. Handbook of Cognitive Psychology, ed Reisberg D (Oxford Univ Press, New York).
13. Slobin DI (1991) Learning to think for speaking: Native language, cognition, and rhetorical style. Pragmatics 1:7–25.
14. Grice HP (1975) Logic and conversation. Syntax and Semantics 3: Speech Acts, eds Cole P, Morgan J (Academic, New York), pp 41–58.
15. Levinson SC (2000) Presumptive Meanings: The Theory of Generalized Conversational Implicature (MIT Press, Cambridge, MA).
16. Sperber D, Wilson D (2002) Pragmatics, modularity and mind-reading. Mind Lang 17:3–23.
17. Horowitz AC, Frank MC (2016) Children’s pragmatic inferences as a route for learning about the world. Child Dev 87:807–819.
18. Frank MC, Everett DL, Fedorenko E, Gibson E (2008) Number as a cognitive technology: Evidence from Pirahã language and cognition. Cognition 108:819–824.
19. Clark A, Chalmers D (1998) The extended mind. Analysis 58:7–19.
20. Watson-Jones RE, Legare CH (2016) The social functions of group rituals. Curr Dir Psychol Sci 25:42–46.
21. Legare CH (2017) Cumulative cultural learning: Development and diversity. Proc Natl Acad Sci USA 114:7877–7883.
Gelman and Roberts PNAS | July 25, 2017 | vol. 114 | no. 30 | 7905
22. Atran S, Medin DL (2008) The Native Mind and the Cultural Construction of Nature (MIT Press, Cambridge, MA).
23. Astuti R, Solomon GE, Carey S (2004) Constraints on conceptual development: A case study of the acquisition of folkbiological and folksociological knowledge in Madagascar. Monogr Soc Res Child Dev 69:1–135, vii–viii, discussion 136–161.
24. Diesendruck G, Goldfein-Elbaz R, Rhodes M, Gelman S, Neumark N (2013) Cross-cultural differences in children’s beliefs about the objectivity of social categories. Child Dev 84:1906–1917.
25. Malt BC, Majid A (2013) How thought is mapped into words. Wiley Interdiscip Rev Cogn Sci 4:583–597.
26. Barrett LF, Mesquita B, Gendron M (2011) Context in emotion perception. Curr Dir Psychol Sci 20:286–290.
27. Regier T, Kay P (2009) Language, thought, and color: Whorf was half right. Trends Cogn Sci 13:439–446.
28. Pagel M (2012) Wired for Culture: Origins of the Human Social Mind (Norton, New York).
29. Pinker S, Jackendoff R (2005) The faculty of language: What’s special about it? Cognition 95:201–236.
30. Carey S (1978) The child as word learner. Linguistic Theory and Psychological Reality, eds Bresnan J, Miller G, Halle M (MIT Press, Cambridge, MA), pp 264–293.
31. Gelman SA, Markman EM (1986) Categories and induction in young children. Cognition 23:183–209.
32. Gelman SA, Davidson NS (2013) Conceptual influences on category-based induction. Cognit Psychol 66:327–353.
33. Carlson GN, Pelletier FJ, eds (1995) The Generic Book (Univ Chicago Press, Chicago).
34. Gelman SA, Coley JD, Rosengren KS, Hartman E, Pappas A (1998) Beyond labeling: The role of maternal input in the acquisition of richly structured categories. Monogr Soc Res Child Dev 63:I–V, 1–148, discussion 149–157.
35. Gelman SA, Goetz PJ, Sarnecka BW, Flukes J (2008) Generic language in parent-child conversations. Lang Learn Dev 4:1–31.
36. Graham SA, Gelman SA, Clarke J (2016) Generics license 30-month-olds’ inferences about the atypical properties of novel kinds. Dev Psychol 52:1353–1362.
37. Gelman SA, Raman L (2007) This cat has nine lives? Children’s memory for genericity in language. Dev Psychol 43:1256–1268.
38. Maurer P, Meeuwis M; APiCS Consortium (2013) Generic noun phrases in subject function. The Atlas of Pidgin and Creole Language Structures, eds Michaelis SM, Maurer P, Haspelmath M, Huber M (Oxford Univ Press, New York), pp 114–117.
39. Everett DL (2009) Pirahã culture and grammar: A response to some criticisms. Lang 85:405–442.
40. Gelman SA, Sánchez Tapia I, Leslie SJ (2016) Memory for generic and quantified sentences in Spanish-speaking children and adults. J Child Lang 43:1231–1244.
41. Mannheim B, Gelman SA, Escalante C, Huayhua M, Puma R (2010) A developmental analysis of generic nouns in Southern Peruvian Quechua. Lang Learn Dev 7:1–23.
42. Tardif T, Gelman SA, Fu X, Zhu L (2012) Acquisition of generic noun phrases in Chinese: Learning about lions without an “-s”. J Child Lang 39:130–161.
43. Cimpian A, Brandone AC, Gelman SA (2010) Generic statements require little evidence for acceptance but have powerful implications. Cogn Sci 34:1452–1482.
44. Cimpian A, Gelman SA, Brandone AC (2010) Theory-based considerations influence the interpretation of generic sentences. Lang Cogn Process 25:261–276.
45. Leslie SJ (2008) Generics: Cognition and acquisition. Philos Rev 117:1–47.
46. Brandone AC, Cimpian A, Leslie SJ, Gelman SA (2012) Do lions have manes? For children, generics are about kinds rather than quantities. Child Dev 83:423–433.
47. Rakoczy H, Schmidt MF (2013) The early ontogeny of social norms. Child Dev Perspect 7:17–21.
48. Clark EV (1992) Conventionality and contrast: Pragmatic principles with lexical consequences. Frames, Fields, and Contrasts: New Essays in Semantic and Lexical Organization, eds Kittay EF, Lehrer A (Erlbaum, Hillsdale, NJ), pp 171–188.
49. Saussure FD (1915) Cours de Linguistique Générale (Payot, Paris).
50. Sabbagh MA, Henderson AM (2007) How an appreciation of conventionality shapes early word learning. New Dir Child Adolesc Dev (115):25–37.
51. Henderson AM, Woodward AL (2012) Nine-month-old infants generalize object labels, but not object preferences across individuals. Dev Sci 15:641–652.
52. Jaswal VK, Markman EM (2007) Looks aren’t everything: 24-month-olds’ willingness to accept unexpected labels. J Cogn Dev 8:93–111.
53. Koenig MA, Harris PL (2005) Preschoolers mistrust ignorant and inaccurate speakers. Child Dev 76:1261–1277.
54. Lane JD, Harris PL, Gelman SA, Wellman HM (2014) More than meets the eye: Young children’s trust in claims that defy their perceptions. Dev Psychol 50:865–871.
55. Csibra G, Gergely G (2009) Natural pedagogy. Trends Cogn Sci 13:148–153.
56. Cimpian A, Scott RM (2012) Children expect generic knowledge to be widely shared. Cognition 123:419–433.
57. Cimpian A, Markman EM (2009) Information learned from generic language becomes central to children’s biological concepts: Evidence from their open-ended explanations. Cognition 113:14–25.
58. Hollander MA, Gelman SA, Raman L (2009) Generic language and judgements about category membership: Can generics highlight properties as central? Lang Cogn Process 24:481–505.
59. Gelman SA, Tardif T (1998) A cross-linguistic comparison of generic noun phrases in English and Mandarin. Cognition 66:215–248.
60. Gelman SA, Ware EA, Manczak EM, Graham SA (2013) Children’s sensitivity to the knowledge expressed in pedagogical and nonpedagogical contexts. Dev Psychol 49:491–504.
61. Holubar TF, Markman EM (2013) Preschoolers’ understanding of preferences is modulated by linguistic framing. Cooperative Minds: Social Interaction and Group Dynamics, Proceedings of the 35th Annual Meeting of the Cognitive Science Society, eds Knauff M, Sebanz N, Pauen M, Wachsmuth I (Cognitive Science Society, Austin, TX), pp 603–608.
62. Orvell A, Kross E, Gelman SA (2017) How “you” makes meaning. Science 355:1299–1302.
63. Orvell A, Kross E, Gelman SA, That’s how ‘you’ do it: Generic you expresses norms in early childhood. J Exp Child Psych, in press.
64. Leslie SJ (2015) “Hillary Clinton is the only man in the Obama administration”: Dual character concepts, generics, and gender. Analytic Philos 56:111–141.
65. Prasada S, Dillingham EM (2009) Representation of principled connections: A window onto the formal aspect of common sense conception. Cogn Sci 33:401–448.
66. Wodak D, Leslie SJ, Rhodes M (2015) What a loaded generalization: Generics and social cognition. Philos Compass 10:625–635.
67. Knobe J, Prasada S, Newman GE (2013) Dual character concepts and the normative dimension of conceptual representation. Cognition 127:242–257.
68. Roberts SO, Gelman SA, Ho AK (2017) So it is, so it shall be: Group regularities license children’s prescriptive judgments. Cogn Sci 41:576–600.
69. Bear A, Knobe J (November 11, 2016) Normality: Part descriptive, part prescriptive. Cognition, 10.1016/j.cognition.2016.10.024.
70. Eriksson K, Strimling P, Coultas JC (2015) Bidirectional associations between descriptive and injunctive norms. Organ Behav Hum Decis Process 129:59–69.
71. Tworek CM, Cimpian A (2016) Why do people tend to infer “ought” from “is”? The role of biases in explanation. Psychol Sci 27:1109–1122.
72. Roberts SO, Ho AK, Gelman SA (2017) Group presence, category labels, and generic statements influence children to treat descriptive group regularities as prescriptive. J Exp Child Psychol 158:19–31.
73. Locke J (1959) An Essay Concerning Human Understanding (Dover, New York), Vol 2. Reprint.
74. Gelman SA (2003) The Essential Child: Origins of Essentialism in Everyday Thought (Oxford Univ Press, New York).
75. Fisher AV, Sloutsky VM (2005) When induction meets memory: Evidence for gradual transition from similarity-based to category-based induction. Child Dev 76:583–597.
76. Inhelder B, Piaget J (1964) The Early Growth of Logic in the Child (Norton, New York).
77. Keil FC, Richardson DC (1999) Species, stuff, and patterns of causation. Species: New Interdisciplinary Essays, ed Wilson RA (MIT Press, Cambridge, MA).
78. Medin DL (1989) Concepts and conceptual structure. Am Psychol 44:1469–1481.
79. Gottfried GM, Gelman SA (2005) Developing domain-specific causal-explanatory frameworks: The role of insides and immanence. Cogn Dev 20:137–158.
80. Setoh P, Wu D, Baillargeon R, Gelman R (2013) Young infants have biological expectations about animals. Proc Natl Acad Sci USA 110:15937–15942.
81. Simons DJ, Keil FC (1995) An abstract to concrete shift in the development of biological thought: The insides story. Cognition 56:129–163.
82. Haslam N, Rothschild L, Ernst D (2000) Essentialist beliefs about social categories. Br J Soc Psychol 39:113–127.
83. Booth AE (2014) Conceptually coherent categories support label-based inductive generalization in preschoolers. J Exp Child Psychol 123:1–14.
84. Sobel DM, Yoachim CM, Gopnik A, Meltzoff AN, Blumenthal EJ (2007) The blicket within: Preschoolers’ inferences about insides and causes. J Cogn Dev 8:159–182.
85. Walker CM, Lombrozo T, Legare CH, Gopnik A (2014) Explaining prompts children to privilege inductively rich properties. Cognition 133:343–357.
86. Rhodes M, Gelman SA (2009) A developmental examination of the conceptual structure of animal, artifact, and human social categories across two cultural contexts. Cognit Psychol 59:244–274.
87. Roberts SO, Gelman SA (2015) Do children see in Black and White? Children’s and adults’ categorizations of multiracial individuals. Child Dev 86:1830–1847.
88. Gelman SA, Wellman HM (1991) Insides and essences: Early understandings of the non-obvious. Cognition 38:213–244.
89. Keil FC (1989) Concepts, Kinds, and Cognitive Development (MIT Press, Cambridge, MA).
90. Meyer M, Gelman SA (2016) Gender essentialism in children and parents: Implications for the development of gender stereotyping and gender-typed preferences. Sex Roles 75:409–421.
91. Taylor MG, Rhodes M, Gelman SA (2009) Boys will be boys; cows will be cows: Children’s essentialist reasoning about gender categories and animal species. Child Dev 80:461–481.
92. Ware EA, Gelman SA (2014) You get what you need: An examination of purpose-based inheritance reasoning in undergraduates, preschoolers, and biological experts. Cogn Sci 38:197–243.
93. Meyer M, Gelman SA, Roberts SO, Leslie SJ (November 17, 2016) My heart made me do it: Children’s essentialist beliefs about heart transplants. Cogn Sci, 10.1111/cogs.12431.
94. Meyer M, Leslie SJ, Gelman SA, Stilwell SM (2013) Essentialist beliefs about bodily transplants in the United States and India. Cogn Sci 37:668–710.
95. Atran S, et al. (2001) Folkbiology doesn’t come from folkpsychology: Evidence from Yukatek Maya in cross-cultural perspective. J Cogn Cult 1:3–42.
96. Moya C, Boyd R, Henrich J (2015) Reasoning about cultural and genetic transmission: Developmental and cross-cultural evidence From Peru, Fiji, and the United States on how people make inferences about trait transmission. Top Cogn Sci 7:595–610.
97. del Río MF, Strasser K (2011) Chilean children’s essentialist reasoning about poverty. Br J Dev Psychol 29:722–743.
98. Sousa P, Atran S, Medin D (2002) Essentialism and folkbiology: Evidence from Brazil. J Cogn Cult 2:195–223.
99. Waxman S, Medin D, Ross N (2007) Folkbiological reasoning from a cross-cultural developmental perspective: Early essentialist notions are shaped by cultural beliefs. Dev Psychol 43:294–308.
100. Haslam N, Holland E, Karasawa M (2013) Essentialism and entitativity across cultures. Culture and Group Processes, eds Yuki M, Brewer M (Oxford Univ Press, New York), pp 17–37.
101. Rhodes M, Leslie SJ, Saunders K, Dunham Y, Cimpian A (February 22, 2017) How does social essentialism affect the development of inter-group relations? Dev Sci, 10.1111/desc.12509.
Brain Sci 37:461–480.
120. Williams MJ, Eberhardt JL (2008) Biological conceptions of race and the motivation to cross racial boundaries. J Pers Soc Psychol 94:1033–1047.
121. Bastian B, Haslam N (2006) Psychological essentialism and stereotype endorsement. J Exp Soc Psychol 42:228–235.
122. Chao MM, Hong YY, Chiu CY (2013) Essentializing race: Its implications on racial categorization. J Pers Soc Psychol 104:619–634.
123. Gaither SE, et al. (2014) Essentialist thinking predicts decrements in children’s memory for racially ambiguous faces. Dev Psychol 50:482–488.
124. Ho AK, Roberts SO, Gelman SA (2015) Essentialism and racial bias jointly contribute to the categorization of multiracial individuals. Psychol Sci 26:1639–1645.
125. Kraus MW, Keltner D (2013) Social class rank, essentialism, and punitive judgment. J Pers Soc Psychol 105:247–261.
126. Leslie SJ, Cimpian A, Meyer M, Freeland E (2015) Expectations of brilliance underlie gender distributions across academic disciplines. Science 347:262–265.
127. Goff PA, Jackson MC, Di Leone BAL, Culotta CM, DiTomasso NA (2014) The essence of innocence: Consequences of dehumanizing Black children. J Pers Soc Psychol 106:526–545.
128. Goff PA, Eberhardt JL, Williams MJ, Jackson MC (2008) Not yet human: Implicit knowledge, historical dehumanization, and contemporary consequences. J Pers Soc Psychol 94:292–306.
129. Gelman SA (2009) Learning from others: Children’s construction of concepts. Annu Rev Psychol 60:115–140.
130. Lutz DJ, Keil FC (2002) Early understanding of the division of cognitive labor. Child Dev 73:1073–1084.
131. Markman EM, Jaswal VK (2003) Commentary on Part II: Abilities and assumptions underlying conceptual development. Early Category and Concept Development: Making Sense of the Blooming, Buzzing Confusion, eds Rakison D, Oakes L (Oxford Univ Press, New York), pp 384–402.
132. Waxman SR, Markow DB (1995) Words as invitations to form categories: Evidence from 12- to 13-month-old infants. Cognit Psychol 29:257–302.
133. Putnam H (1975) The meaning of ‘meaning’ Mind, Language, and Reality (Cambridge Univ Press, Cambridge, UK), pp 215–271.
134. Legare CH, Nielsen M (2015) Imitation and innovation: The dual engines of cultural learning. Trends Cogn Sci 19:688–699.
135. Tomasello M (2009) The Cultural Origins of Human Cognition (Harvard Univ Press, Cambridge, MA).
136. Creanza N, Kolodny O, Feldman MW (2017) Cultural evolutionary theory: How culture evolves and why it matters. Proc Natl Acad Sci USA 114:7782–7789.
155. Dewar K, Xu F (2007) Do 9-month-old infants expect distinct words to refer to kinds? Dev Psychol 43:1227–1238.
156. Galef BG, McQuoid LM, Whiskin EE (1990) Further evidence that Norway rats do not socially transmit learned aversions to toxic baits. Anim Learn Behav 18:199–205.
157. Galef BG, Laland KN (2005) Social learning in animals: Empirical studies and theoretical models. Bioscience 55:489–499.
158. Love AC (2015) Conceptual Change in Biology (Springer, New York).
159. Savoca MS, Wohlfeil ME, Ebeler SE, Nevitt GA (2016) Marine plastic debris emits a keystone infochemical for olfactory foraging seabirds. Sci Adv 2:e1600395.
160. Regnier D (2015) Clean people, unclean people: The essentialisation of ‘slaves’ among the southern Betsileo of Madagascar. Soc Anthropol 23:152–168.
161. Gelman SA, Heyman GD, Legare CH (2007) Developmental changes in the coherence of essentialist beliefs about psychological characteristics. Child Dev 78:757–774.
162. Cooper JA, Marsh JK (2015) The influence of expertise on essence beliefs for mental and medical disorder categories. Cognition 144:67–75.
163. Gelman SA (2013) Artifacts and essentialism. Rev Phil Psychol 4:449–463.
164. Nemeroff C, Rozin P (1994) The contagion concept in adult thinking in the United States: Transmission of germs and of interpersonal influence. Ethos 22:158–186.
165. Newman GE (2016) An essentialist account of authenticity. J Cogn Cult 16:294–321.
166. Riedl K, Jensen K, Call J, Tomasello M (2012) No third-party punishment in chimpanzees. Proc Natl Acad Sci USA 109:14824–14829.
167. Riedl K, Jensen K, Call J, Tomasello M (2015) Restorative justice in children. Curr Biol 25:1731–1735.
168. Göckeritz S, Schmidt MH, Tomasello M (2014) Young children’s creation and transmission of social norms. Cogn Dev 30:81–95.
169. Lewis HM, Laland KN (2012) Transmission fidelity is the key to the build-up of cumulative culture. Philos Trans R Soc Lond B Biol Sci 367:2171–2180.
170. Heyes C (2016) Who knows? Metacognitive social learning strategies. Trends Cogn Sci 20:204–213.
171. Levinson SC (2005) Comment on: Cultural constraints on grammar and cognition in Piraha by Daniel L. Everett. Curr Anthropol 46:637–638.
172. Wellman HM (2014) Making Minds: How Theory of Mind Develops (Oxford Univ Press, New York).
173. Harris PL, Lane JD (2014) Infants understand how testimony works. Topoi 33:443–458.
174. Walker CM, Gopnik A (2014) Toddlers infer higher-order relational principles in causal learning. Psychol Sci 25:161–169.
Coevolution of cultural intelligence, extended life
history, sociality, and brain size in primates
Sally E. Street(a,b,1), Ana F. Navarrete(a), Simon M. Reader(c), and Kevin N. Laland(a,1)
(a) Centre for Social Learning and Cognitive Evolution, School of Biology, University of St. Andrews, St. Andrews KY16 9AJ, United Kingdom; (b) Department of Anthropology, Durham University, Durham DH1 3LE, United Kingdom; and (c) Department of Biology, McGill University, Montreal, QC H3A 1B1, Canada
Edited by Marcus W. Feldman, Stanford University, Stanford, CA, and approved May 30, 2017 (received for review January 15, 2017)
Explanations for primate brain expansion and the evolution of human cognition and culture remain contentious despite extensive research. While multiple comparative analyses have investigated variation in brain size across primate species, very few have addressed why primates vary in how much they use social learning. Here, we evaluate the hypothesis that the enhanced reliance on socially transmitted behavior observed in some primates has coevolved with enlarged brains, complex sociality, and extended lifespans. Using recently developed phylogenetic comparative methods we show that, across primate species, a measure of social learning proclivity increases with absolute and relative brain volume, longevity (specifically reproductive lifespan), and social group size, correcting for research effort. We also confirm relationships of absolute and relative brain volume with longevity (both juvenile period and reproductive lifespan) and social group size, although longevity is generally the stronger predictor. Relationships between social learning, brain volume, and longevity remain when controlling for maternal investment and are therefore not simply explained as a by-product of the generally slower life history expected for larger brained species. Our findings suggest that both brain expansion and high reliance on culturally transmitted behavior coevolved with sociality and extended lifespan in primates. This coevolution is consistent with the hypothesis that the evolution of large brains, sociality, and long lifespans has promoted reliance on culture, with reliance on culture in turn driving further increases in brain volume, cognitive abilities, and lifespans in some primate lineages.
cultural evolution | social learning | brain evolution | primates | phylogenetic comparative analysis
This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: S.E.S. and K.N.L. designed research with contributions from A.F.N. and S.M.R.; S.E.S. performed research; S.E.S. analyzed data; and S.E.S., A.F.N., S.M.R., and K.N.L. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
(1) To whom correspondence may be addressed. Email: knl1@st-andrews.ac.uk or sallystreet13@gmail.com.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1620734114/-/DCSupplemental.
Brain expansion is unquestionably a distinctive feature of primate, and especially human, evolution. Primate brain expansion is evident regardless of whether the brain is measured in absolute terms, in relation to body size, or as the size of the neocortex relative to the rest of the brain (1), and irrespective of whether it is better characterized by variation in a single size dimension (2) or mosaic evolution of component parts (3). The striking variation in brain size in nonhuman primates, across three orders of magnitude (4), has long demanded an evolutionary explanation (5). Although the cognitive implications of cross-species variation in whole brain size remain contentious and require further investigation (5–7), evolutionary increases in overall brain size in primates reflect neuroanatomical changes that are plausibly linked to increases in general cognitive abilities. For instance, larger primate brains have more neurons in absolute terms (8–11), with coordinated expansion particularly in the neocortex and cerebellum (12), potentially supporting a greater diversity of cognitive functions (7, 10). In support of this idea, overall brain size increases with broad measures of cognitive ability in primates, including performance in laboratory tests of learning and cognition across primate genera (13) and performance in experimental measures of behavioral inhibition across primate species (14).
At ∼1,500 g (15), human brains are at least three times heavier than those of any other primate species (1). However, humans are also extreme in their long lifespan, social complexity, cognition, and cultural capabilities (16, 17), raising questions about whether large brains, long lives, complex cognition, and advanced cultural capabilities evolved independently or coevolved through directly reinforcing processes. Enlarged brains, enhanced cognition, and highly developed social learning abilities co-occur not only in primate species but also in some cetaceans and birds (18–22), raising the possibility of a key role for social learning and culture in brain evolution and intelligence in multiple, independent animal lineages (23–27).
Across primates, support for multiple, nonexclusive hypotheses for enlarged brain (particularly neocortex) size has been identified in comparative studies, emphasizing the roles of social complexity (e.g., group size) (28, 29), ecological intelligence (e.g., dietary complexity) (30, 31), technical intelligence (e.g., tool use and technical innovation) (21, 25, 32), and behavioral complexity (e.g., innovativeness, social learning, and tactical deception) (21, 25, 33).
Further, several comparative studies have found that larger brained primates have slower life histories, including longer juvenile periods and overall lifespans (e.g., ref. 29). Although mutually reinforcing evolutionary processes have been proposed to account for this association (16), recent comparative analyses suggest that lifespan increases with brain size in mammals instead due to developmental costs: i.e., it requires a longer period of maternal investment to support offspring with greater natal and postnatal brain growth, requiring a slower life history strategy of which longer lifespan is a by-product (34). Primates, however, are potentially distinct from most mammalian taxa in their unusually large, neuron-dense brains (8–11) and in the extensive occurrence of socially transmitted behavior exhibited in some lineages (e.g., refs. 35–37). Whether the association between extended life history and enlarged brain size is best explained by a cognitive or developmental mechanism in primates specifically remains to be explored. Further, despite many previous comparative analyses of brain size and relevant predictors in primates, comparative analyses have not yet directly explored the evolutionary relationships between brain expansion, cultural complexity, sociality, and longevity in analyses that include all of these variables, with control for relevant potentially confounding variables.
Here, in a comparative analysis of primate species, we directly test the widely held view that encephalization, sociality, longevity, and reliance on culture have coevolved (16, 23–27, 32, 38). We use a quantitative behavioral measure of reliance on culture: specifically, the number of unique reports (i.e., richness) of social learning per species from a sample of relevant published literature (21, 39) (henceforth referred to simply as “social learning”)
EVOLUTION
sentative of the diversity in brain size across primate species (4).
Prediction 2: Social Learning Increases with Longevity
This expectation follows from the hypotheses that (i) extended life history, particularly a longer lifespan and period of juvenile dependence, facilitates the acquisition, exploitation, and social transmission of life skills (16, 23, 40) and (ii) cultural knowledge promotes survival and long lives (25–27) by acting as a “cognitive buffer,” enhancing survival in challenging environmental conditions through behavioral responses (41, 42). Complex skills frequently take time to learn; therefore, longer lifespans potentially provide more time for relevant experience to accrue, more time for adults to benefit from knowledge acquired earlier in life, and more time for parents to pass on relevant skills to offspring (16, 23, 26, 27, 40). If an extended juvenile period in particular is critical for the acquisition of adaptive socially transmitted behavior (16), we expect that juvenile period has a strong association with social learning richness. However, costly investment in learning socially transmitted skills may pay off in later life only across a long reproductive lifespan (16); therefore, we may expect the association between social learning and longevity to be driven more strongly by increases in reproductive lifespan. If there is a specific relationship of social learning with longevity, not confounded by relationships of either with absolute or relative brain size, we should still find this association even when controlling for brain volume and body mass. Furthermore, if reliance on socially transmitted behavior is related to longevity via a cognitive buffer mechanism rather than as a by-product of a relationship between social learning, brain volume, and slower life history traits due to developmental constraints, this relationship should remain when controlling for the potentially confounding effect of maternal investment (measured as the sum of gestation and lactation periods) (34).
Prediction 3: Social Learning Increases with Group Size
This expectation follows from several theoretical and empirical analyses showing that large social groups support greater amounts of adaptive cultural knowledge (e.g., refs. 43–46) and broader hypotheses that stable social grouping supports the evolution of reliance on social learning (e.g., ref. 20). If the relationship of social learning to group size is not confounded by associations of either trait with absolute brain volume, relative brain volume, or longevity, this prediction should hold when controlling for brain volume, body mass, and longevity measures.
Prediction 2: Social Learning and Longevity. As predicted, social learning richness increases with longevity (<1% β crossing zero, n = 117) (SI Appendix, Table S2 A, i and Fig. 1A). We found no evidence that social learning increases with juvenile period length, however (58% β crossing zero, n = 101) (SI Appendix, Table S2 B, i and Fig. 1A). Rather, social learning increases with reproductive lifespan specifically (0% β crossing zero, n = 92) (SI Appendix, Table S2 C, i). Relationships between social learning and longevity, and between social learning and reproductive lifespan, remain intact when maternal investment (summed gestation and lactation time) is included as an additional predictor (2%, <1% β crossing zero, n = 87, n = 82, respectively) (SI Appendix, Table S2 A, ii and C, ii) whereas maternal investment itself does not predict social learning in these models (≥35% β crossing zero) (SI Appendix, Table S2 A, ii and C, ii). Relationships between social learning and longevity or reproductive lifespan are also not confounded by those between social learning and absolute or relative brain volume, as they remain when either brain volume or both brain volume and body mass are included as additional predictors (<1% β crossing zero, n = 111, n = 89) (SI Appendix, Table S2 A, iii and iv and C, iii and iv). However, brain volume itself does not predict social learning when included alongside longevity measures (>22% β crossing zero) (SI Appendix, Table S2 A, iii and iv and C, iii and iv).
Prediction 3: Social Learning and Group Size. As predicted, we found a positive association between group size and social learning (<1% β crossing zero, n = 167) (SI Appendix, Table S3 i and Fig. 1A). This association is independent of the relationship between social learning and longevity or reproductive lifespan, as it remains when either of these life history traits is included (4% β crossing zero, 5% β crossing zero, n = 111, n = 89) (SI Appendix, Table S3 ii, A and B). The relationship between group size and social learning is also not confounded by the association of either trait with absolute or relative brain volume, because it remains when either brain volume or both brain volume and body mass are included as additional predictors (<4% β crossing zero, n = 140) (SI Appendix, Table S3 iii and iv). Both absolute and relative brain volume have, however, a weaker effect on social learning when group size is included as an additional predictor (2%, 7% β crossing zero) (SI Appendix, Table S3 iii and iv) compared with models without group size (<1%, 3% β crossing zero) (SI Appendix, Table S1 i and ii).
Street et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7909
Fig. 1. Posterior distributions of β coefficients for the effects of longevity, juvenile period, and group size on (A) social learning richness, (B) absolute brain
volume, and (C) relative brain volume (i.e., brain volume accounting for body mass). Here, we present effects from the simplest models, including only either
longevity, juvenile period, or group size as independent variables, together with research effort and body mass for the social learning model, and body mass
for the relative brain model. However, these results are not strongly affected by the inclusion of additional potentially confounding variables (Methods,
Results, and SI Appendix). Percentages indicate the percentage of posterior estimates that cross zero in the direction opposite to that predicted for each effect.
Distributions shifted substantially away from zero indicate evidence for effects of predictor variables in the corresponding direction whereas those centered
close to zero indicate little or no evidence for effects of predictor variables.
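The percentages reported in Fig. 1 can be illustrated with a minimal sketch. It assumes the posterior draws for a β coefficient are available as a plain list of numbers; the function name `pct_crossing_zero` and the simulated draws are hypothetical and are not taken from the paper or from MCMCglmm output.

```python
import random


def pct_crossing_zero(samples, predicted_sign=1):
    """Percentage of posterior draws lying on the opposite side of zero
    from the predicted direction (predicted_sign is +1 or -1)."""
    if not samples:
        raise ValueError("samples must be non-empty")
    # A draw counts against the prediction when its sign is opposite to it.
    against = sum(1 for beta in samples if beta * predicted_sign < 0)
    return 100.0 * against / len(samples)


# Simulated draws for illustration only (not the paper's posteriors).
rng = random.Random(0)
shifted = [rng.gauss(0.5, 0.2) for _ in range(5000)]   # shifted away from zero
centered = [rng.gauss(0.0, 0.2) for _ in range(5000)]  # centered on zero

print(f"shifted:  {pct_crossing_zero(shifted):.1f}% of draws cross zero")
print(f"centered: {pct_crossing_zero(centered):.1f}% of draws cross zero")
```

A distribution shifted well away from zero gives a small percentage, whereas one centered on zero gives a percentage near 50, matching the reading of the panels described in the caption.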
Fig. 2. Summary of raw data on social learning, absolute brain volume, group size, and longevity for 52 primate genera, using the consensus phylogeny
from 10kTrees (65). For illustration purposes only, all data are summarized as genus-level means, standardized with minimum 0 and maximum 1. Also for
illustration purposes only, social learning is displayed as a proportion of research effort whereas, in statistical analyses, social learning is controlled for
research effort by including research effort as an independent variable. Images show (A) bearded capuchin (Cebus libidinosus), (B) chimpanzees (Pan
troglodytes), and (C) guinea baboons (Papio papio), illustrating lineages that represent convergent coevolution of high social learning abilities, large
brain volumes, complex social relationships, and long lifespans. (A) Courtesy of Flickr/Bart van Dorp, (B) courtesy of Flickr/USAID in Africa, and (C) courtesy of Flickr/William Warby.
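The summary used for Fig. 2 (genus-level means rescaled to a minimum of 0 and a maximum of 1, with social learning shown as a proportion of research effort) can be sketched as follows. The records, field names, and helper functions are made up for illustration. Taking the proportion per species before averaging within each genus is an assumption, because the caption does not specify the order of operations.

```python
from collections import defaultdict


def genus_means(records, key):
    """Average a species-level field within each genus."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["genus"]].append(rec[key])
    return {genus: sum(vals) / len(vals) for genus, vals in grouped.items()}


def min_max(by_genus):
    """Rescale so the smallest genus mean maps to 0 and the largest to 1."""
    lo, hi = min(by_genus.values()), max(by_genus.values())
    if hi == lo:  # all genera identical: no spread to rescale
        return {genus: 0.0 for genus in by_genus}
    return {genus: (v - lo) / (hi - lo) for genus, v in by_genus.items()}


# Made-up species records for illustration; not values from the paper's dataset.
species = [
    {"genus": "GenusA", "social_learning": 10, "research_effort": 100, "longevity": 50.0},
    {"genus": "GenusA", "social_learning": 30, "research_effort": 100, "longevity": 60.0},
    {"genus": "GenusB", "social_learning": 0, "research_effort": 20, "longevity": 20.0},
    {"genus": "GenusC", "social_learning": 2, "research_effort": 40, "longevity": 30.0},
    {"genus": "GenusC", "social_learning": 4, "research_effort": 40, "longevity": 40.0},
]

# Assumption: the proportion is taken per species before averaging within genus.
with_rate = [
    dict(rec, sl_rate=rec["social_learning"] / rec["research_effort"])
    for rec in species
]

scaled_longevity = min_max(genus_means(with_rate, "longevity"))
scaled_sl_rate = min_max(genus_means(with_rate, "sl_rate"))
print(scaled_longevity)
print(scaled_sl_rate)
```

This rescaling is for display only. In the statistical models, research effort enters as a predictor rather than as a divisor.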
accounts for the loss of a direct relationship between social learning and brain size, these results are consistent with a previous exploratory phylogenetic path analysis showing that social learning and brain volume are related indirectly via links with dietary, social, or life history traits (32).
The positive relationship between social learning and longevity we identify supports the idea that longer lifespans provide species reliant on culture more time to learn novel skills, more time to “cash in” on those skills once learned, and more time to pass them on to their offspring (16, 23, 40). Additionally, longer lifespans may confer greater opportunity for behavioral innovations, providing the raw material for social transmission, because longer lifespans are positively associated with greater propensity to innovate in birds (52) and in primates (ref. 32, albeit indirectly). Culturally acquired knowledge is typically adaptive and may often promote growth and survival of both learners and their dependent young, and thereby extend lifespans (25–27) via a cognitive buffer effect whereby social learning allows individuals to adapt behaviorally to challenging environments (41, 42). These benefits may be sufficient to compensate for negative fitness consequences associated with reliance on social learning, such as increased risk of social transmission of parasites (39).
Although hypotheses for the coevolution of lifespan and culture propose that increases in both juvenile period and overall lifespan are related to reliance on culturally transmitted knowledge (e.g., ref. 16), here, we found that the association between social learning and longevity is driven by an increased reproductive lifespan, rather than an extended period of juvenile dependence. It remains possible that a link between extended juvenile periods and social learning capabilities will be identified in future studies using novel social learning measures, such as those based on experimental tests. Nonetheless, our current findings suggest that an extended reproductive lifespan, during which enhanced fitness benefits of earlier costly investment in learning skills for survival can be reaped, primarily drives the association between social learning and lifespan that we identify here.
Our finding that the relationship between longevity and social learning remains when measures of maternal investment are included in analyses supports these functional arguments and argues against an interpretation solely in terms of developmental constraints, in primates at least. Therefore, in primates, the combination of social learning with large brains may provide a cognitive buffer against environmental unpredictability, improving survival and permitting long lives. Primates may contrast with most mammalian lineages in this regard due to the unusually extensive reliance on culturally transmitted behavior seen in certain lineages (e.g., refs. 35–37), perhaps necessary for social learning to buffer individuals sufficiently against environmental risks.
Our finding of a positive relationship between social learning and group size supports the expectation that large, stable social groups support greater amounts of adaptive cultural knowledge and facilitate a greater reliance on social learning (20). Although this hypothesis is well-established in theoretical models (e.g., refs. 43, 44) and has found recent empirical support in human historical (45) and experimental (46) studies, previous comparative phylogenetic analyses have failed to find this relationship across primate species (21, 25). The fact that we found a positive association here most likely reflects the greater power of our analyses compared with earlier studies, due to the availability of a larger group size database (53) and phylogenetic comparative methods that adjust phylogenetic signal according to the traits included in the model (SI Appendix, Methods), contrasting with the older independent contrasts method, which effectively assumes a maximum level of phylogenetic signal and can therefore be overly conservative (54). The relationship between social learning and group size remains when longevity, brain volume, and body mass are included and therefore seems not to be simply a by-product of the relationship between group size and absolute or relative brain volume, or confounded by life history traits.
Both large social groups and extended longevity (including increases in juvenile period and reproductive and total lifespan) are associated with enlarged brain volume, whether measured in absolute terms or relative to body mass. Group size has proven a
robust predictor of measures of brain size, particularly relative neocortex size (29, 55, 56), and it remains an important predictor of both absolute and relative whole brain volume, as well as social learning, in our analyses. Thus, our findings support previous studies claiming an important role for social intelligence in primate brain evolution (e.g., refs. 29, 55–57). However, when included together with longevity, longevity is independently related to brain volume whereas group size becomes a fairly weak predictor. This result may be significant because the association of brain volume and longevity is usually not regarded as directly causally relevant in brain evolution (e.g., ref. 29). Further, a recently published comparative analysis suggests that dietary factors, rather than sociality, are the primary drivers of increased relative brain size in primates (31). It remains to be seen whether these findings generalize to measures of neocortex volume, arguably more relevant to social intelligence (29, 55–57). Nonetheless, together, these results reinforce an emerging consensus that sociality is not the sole driver of primate brain evolution but rather is embedded in a nexus of evolutionary conditions that favor brain expansion, including dietary, ecological, life history, and behavioral factors (12, 16, 21, 25, 29, 32).
Across mammals more broadly, the relationship between adult brain mass and longevity is accounted for by patterns of maternal investment and is generally interpreted as a manifestation of developmental costs of producing larger brained offspring, rather than necessarily due to any cognitive or behavioral mechanism (34). Here, however, we found that the associations of longevity with absolute and relative brain volume remain when controlling for maternal investment. Therefore, in primates, compared with mammals in general (34), variation in adult brain size across species cannot be fully accounted for by patterns of maternal investment, and the relationship between brain size and lifespan is potentially indicative of a cognitive buffering (41, 42), rather than solely developmental, mechanism through which cultural intelligence facilitates survival. This contrast can perhaps be explained by divergent scaling relationships between brain volume and neuron number (potentially a more relevant correlate of cognitive capacity) (7, 10, 12) in primates compared with other mammalian lineages. Unlike nonprimate mammalian lineages, such as rodents, in which neuron size increases and neuron density decreases with increased brain volume, in primates, the number of neurons increases approximately isometrically with brain volume (8–11). Therefore, in primates, larger brains may confer stronger benefits in terms of increased cognitive function and behavioral flexibility compared with other mammalian lineages. Overall, together with the strong relationship between social learning and longevity, these findings are consistent with the hypotheses that cultural knowledge facilitates survival and that extended longevity facilitates the acquisition, exploitation, and social transmission of life skills (16, 23, 25–27, 40).
Our finding that longevity is a strong, and potentially causally significant, predictor of both brain volume and social learning richness is evocative of the argument that intelligence and life-history length have coevolved in humans because our intellectual abilities allowed us to exploit high-quality, but difficult-to-access, food resources, with the nutrients gleaned “paying” for brain growth, and with increased longevity favored because it allowed more time to cash in on complex, and difficult to master, foraging skills, with fitness benefits that pay off later in life (16). High levels of knowledge, skill, coordination, and strength are required to exploit the high-quality dietary resources consumed by humans and other apes. Consistent with this idea, the most common use of social learning in primates seems to be in acquiring foraging skills, as ∼50% of reports of social learning in a prior compilation occurred within the context of foraging (25, 58). Complex tool use and extractive foraging abilities require time to acquire, but, in larger brained animals, an extended learning phase, during which productivity is low, can be compensated for by higher productivity during the adult period, provided there is an intergenerational flow of both food and knowledge from old to young (59). Our results are therefore broadly consistent with a cultural intelligence explanation (23–27) manifested in particular primate lineages showing high reliance on social learning, in which selection for efficient social learning has allowed energy gains in diet, which in turn fueled brain growth, and generated selection for extended longevity.
Previous comparative phylogenetic analyses have found social learning to covary positively with rates of behavioral innovation and tool use in primates (21, 25). Additionally, the best supported graphs in exploratory phylogenetic path analyses link technical innovation directly to brain size and social learning and nontechnical innovation indirectly to brain volume via diet and life-history measures (32). Together with the current study, this body of findings is consistent with the hypothesis that cultural intelligence, as manifested by a cluster of behavioral traits, including social learning, innovation, and tool use, may have been a significant driver of primate brain evolution. However, we highlight two notes of caution in particular. First, the majority of primates exhibit comparatively little social learning (Fig. 2) (at least, as reflected in our database), which implies that any selection for cultural intelligence has operated primarily in a small number of large-brained primate lineages. Second, our social learning measure is largely based on observational reports, not controlled experimental tests, whereas social learning is challenging to identify from observation alone (21, 25). However, this approach provides a more naturalistic comparative measure of social learning in comparison with those based on experimental tests, representing a far broader range of primate behavioral diversity, necessary for large-scale comparative investigations (21, 25, 32, 39, 60). Results based on patterns of observational accounts of social learning across species should be valuable in informing and directing future, larger scale comparative experimental investigations of variation in social learning abilities across species (21, 39, 61).
One comprehensive way to interpret these findings is to recognize multiple waves of selection for enlarged brains and enhanced cognition in primates. In addition to selection for the cognitive skills required for complex social lives (29) and dietary niches (31) characteristic of some primate taxa, our results imply a likely later bout of selection for cultural intelligence among a restricted number of large-brained primate lineages. The latter notably include the great apes, but also other independent lineages such as capuchins and baboons (Fig. 2), as our results are not contingent on the inclusion of great apes. Plausibly, complex sociality and foraging may have led to the evolution of large-brained primate lineages, some of which passed a critical threshold in reliance on socially learned behaviors, leading to mutually reinforcing selection for increased brain size, cognitive abilities, and reliance on social learning and innovation, mediated by conferred increases in longevity and diet quality. The twin challenges of complex socioecological niches and reliance on culture may therefore best account for the evolution of large brains, advanced cognition, and extended lifespans in primates. However, our analyses do not allow the direction of causality to be inferred, and other interpretations, for instance, in which large brains evolved for other reasons, subsequently allowing for gains in social and cultural complexity, are equally supported by the findings presented here.
Our results do, however, strongly suggest a strong coevolutionary relationship among cultural intelligence, brain size, sociality, and life-history length in primates. Although we have focused here on nonhuman primates, broader comparative trends support the idea that enlarged brain size, general cognitive abilities, and reliance on culture may have coevolved in other long-lived, highly social lineages, including some birds (e.g., corvids and parrots) and toothed whales (18–20, 22). These associations may be mutually reinforcing (24), with positive feedback loops reaching their zenith in humans,
Methods for full details on the social learning measure, illustrative examples, and discussion of its reliability). Briefly, social learning richness is the number of reports of unique social learning behaviors per primate species, primarily from a literature sample of >4,000 articles from primate behavior journals (from 1925 to 2000) (21). Instances of social learning were identified using keywords (e.g., “social learning,” “cultural transmission,” and “traditional”) to minimize subjectivity in the collation of reports from the literature (21, 25). Although identifying social learning from literature reports of nonhuman primate behavior is inherently challenging, this approach allows for a quantitative behavioral measure of social learning across a large sample of diverse primate species, supporting far larger scale comparative analyses than would be possible using data from controlled experiments alone (21, 25, 32, 39, 60). Experimental approaches to measuring social learning across species are associated with their own particular challenges, especially in comparability and ecological validity of behavioral tests, and limited statistical power due to smaller sample sizes (21, 25, 61). We account for broad-scale species differences in research effort, here estimated using the number of papers published in the Zoological Record (between 1993 and 2001, total 7,288 articles) (21) (see SI Appendix, Methods for further information).
Data on social group size and life history traits (gestation length, weaning age, age of sexual maturity, and maximum longevity) were obtained from the PanTHERIA dataset (53). As a measure of maternal investment, we summed gestation length and weaning age (following ref. 34). Reproductive lifespan was calculated as age of sexual maturity subtracted from maximum longevity. Comparative datasets were matched to a dated consensus phylogeny for 301 primate species (10kTrees version 3, using GenBank taxonomy) (65). Taxonomic mismatches were resolved using the 10kTrees Translation table and the International Union for Conservation of Nature (IUCN) Red List website (66).
Statistical Analyses. To test predictions, we ran a series of statistical models in which the outcome variables were always either brain volume or social learning, fitting independent variables that correspond to specific predicted associations, along with appropriate potentially confounding variables. Accounting for the effects of multiple variables is essential in comparative studies of brain evolution, due to multiple potential correlates (29). We analyzed brain volume both in absolute terms, and relative to body mass, by variably including body mass as an additional predictor variable. Where social learning was the outcome variable, research effort was always included as a predictor to account for its effect on the number of records of social learning in the primate behavioral literature (21, 25). We also
used with all variables log-10 transformed, diffuse normal priors for the fixed effects with a mean of 0 and a large variance (10¹⁰), and inverse-Wishart priors for the phylogenetic and residual variance (with V = 1, ν = 0.002). Where social learning was the response variable, Gaussian models were not appropriate due to the highly skewed distribution of this variable; we therefore used Poisson models, with all predictor variables log-10 transformed and with nontransformed response variables. Poisson models used the same priors for the fixed effects and residual variance as for the Gaussian models, with a parameter-expanded prior (V = 1, ν = 1, αμ = 0, and αV = 25²) for the phylogenetic random effect (68, 69). Although a large proportion of the species included in analyses had zero records of social learning, these species are still informative due to the inclusion of research effort in all models (SI Appendix, Methods). Further, preliminary analyses established that Poisson models without a zero-inflation term were appropriate for our data (SI Appendix, Methods).
Markov chain Monte Carlo (MCMC) analyses were run with a sufficient number of iterations and thinning to return effective sample sizes of >1,000 for all parameters (SI Appendix, Methods). Chain convergence and adequate performance were confirmed by visual inspection of trace plots and checking effective sample sizes. From each model, we report the mean h² (a measure of phylogenetic signal equivalent to Pagel’s λ) (70) and mean β coefficient estimate from posterior distributions. To assess the strength of evidence for fixed effects, we use the percentage of posterior β coefficient estimates crossing zero in the direction opposite to predictions (as in refs. 39, 71, and 72, for example). Posterior distributions shifted substantially away from zero in a positive or negative direction indicate support for positive or negative associations, respectively, between fixed effects and outcome variables. Conversely, posterior distributions centered on zero or overlapping substantially with zero indicate a lack of evidence for any relationship between the fixed effects and outcome variables. Here, all associations are predicted to be positive in direction. As a measure of model fit, we used a pseudo R², estimated as the squared Pearson’s correlation between fitted values and observed data (73). No analysis reported a variance inflation factor (VIF) above 5, demonstrating that multicollinearity was not a concern in our analyses (SI Appendix, Methods).
ACKNOWLEDGMENTS. We thank Chris Venditti for advice regarding the implementation of phylogenetic Poisson models. Research was supported in part by European Research Council Advanced Grant “Evoculture” 232823 (to K.N.L.), John Templeton Foundation Grant 23807 (to K.N.L. and S.M.R.), and Natural Sciences and Engineering Research Council of Canada Grants 418342-2012 and 429385-2012 (to S.M.R.).
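Three quantities defined in the Methods above (maternal investment, reproductive lifespan, and the pseudo R²) are simple enough to sketch directly. The function names below are hypothetical, the inputs are assumed to share one time unit, and the numbers in the usage lines are invented rather than drawn from PanTHERIA or from the fitted models.

```python
def maternal_investment(gestation_length, weaning_age):
    """Sum of gestation length and weaning age (same time unit assumed)."""
    return gestation_length + weaning_age


def reproductive_lifespan(max_longevity, age_sexual_maturity):
    """Maximum longevity minus age of sexual maturity (same time unit assumed)."""
    return max_longevity - age_sexual_maturity


def pseudo_r2(fitted, observed):
    """Squared Pearson correlation between fitted values and observed data."""
    n = len(fitted)
    if n != len(observed) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_f = sum(fitted) / n
    mean_o = sum(observed) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(fitted, observed))
    var_f = sum((f - mean_f) ** 2 for f in fitted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_f == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov * cov / (var_f * var_o)


# Invented values for illustration only.
print(maternal_investment(180.0, 300.0))        # e.g., both in days
print(reproductive_lifespan(40.0, 8.0))         # e.g., both in years
print(pseudo_r2([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0]))
```

Because the correlation is squared, this pseudo R² does not distinguish positive from negative agreement between fitted and observed values, so it is best read alongside the sign of the β estimates.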
1. Striedter GF (2005) Principles of Brain Evolution (Sinauer, Sunderland, MA).
2. Finlay BL, Darlington RB (1995) Linked regularities in the development and evolution of mammalian brains. Science 268:1578–1584.
3. Barton RA, Harvey PH (2000) Mosaic evolution of brain structure in mammals. Nature 405:1055–1058.
4. Isler K, et al. (2008) Endocranial volumes of primate species: Scaling analyses using a comprehensive and reliable data set. J Hum Evol 55:967–978.
5. Barton RA (2006) Primate brain evolution: Integrating comparative, neurophysiological, and ethological data. Evol Anthropol 15:224–236.
6. Healy SD, Rowe C (2007) A critique of comparative studies of brain size. Proc Biol Sci 274:453–464.
7. Chittka L, Niven J (2009) Are bigger brains better? Curr Biol 19:R995–R1008.
8. Herculano-Houzel S, Collins CE, Wong P, Kaas JH (2007) Cellular scaling rules for primate brains. Proc Natl Acad Sci USA 104:3562–3567.
9. Herculano-Houzel S (2009) The human brain in numbers: A linearly scaled-up primate brain. Front Hum Neurosci 3:31.
10. Herculano-Houzel S (2011) Brains matter, bodies maybe not: The case for examining neuron numbers irrespective of body size. Ann N Y Acad Sci 1225:191–199.
11. Herculano-Houzel S, Manger PR, Kaas JH (2014) Brain scaling in mammalian evolution as a consequence of concerted and mosaic changes in numbers of neurons and average neuronal cell size. Front Neuroanat 8:77.
12. Barton RA (2012) Embodied cognitive evolution and the cerebellum. Philos Trans R Soc Lond B Biol Sci 367:2097–2107.
13. Deaner RO, Isler K, Burkart J, van Schaik C (2007) Overall brain size, and not encephalization quotient, best predicts cognitive ability across non-human primates. Brain Behav Evol 70:115–124.
14. MacLean EL, et al. (2014) The evolution of self-control. Proc Natl Acad Sci USA 111:E2140–E2148.
15. Later W, et al. (2010) Is the 1975 Reference Man still a suitable reference? Eur J Clin Nutr 64:1035–1042.
16. Kaplan H, Hill K, Lancaster J, Hurtado AM (2000) A theory of human life history evolution: Diet, intelligence, and longevity. Evol Anthropol 9:156–185.
17. Boyd R, Silk JB (2012) How Humans Evolved (Norton, New York).
18. Emery NJ, Clayton NS (2004) The mentality of crows: Convergent evolution of intelligence in corvids and apes. Science 306:1903–1907.
19. Emery NJ (2006) Cognitive ornithology: The evolution of avian intelligence. Philos Trans R Soc B Biol Sci 361:23–43.
20. Rendell L, Whitehead H (2001) Culture in whales and dolphins. Behav Brain Sci 24:309–324.
21. Reader SM, Hager Y, Laland KN (2011) The evolution of primate general and cultural intelligence. Philos Trans R Soc B Biol Sci 366:1017–1027.
22. Hunt GR, Gray RD (2003) Diversification and cumulative evolution in New Caledonian crow tool manufacture. Proc Biol Sci 270:867–874.
23. Boyd R, Richerson PJ (1985) Culture and the Evolutionary Process (Univ of Chicago Press, Chicago).
24. Wilson AC (1985) The molecular basis of evolution. Sci Am 253:164–173.
25. Reader SM, Laland KN (2002) Social intelligence, innovation, and enhanced brain size in primates. Proc Natl Acad Sci USA 99:4436–4441.
26. Whiten A, van Schaik CP (2007) The evolution of animal “cultures” and social intelligence. Philos Trans R Soc B Biol Sci 362:603–620.
27. van Schaik CP, Burkart JM (2011) Social learning and evolution: The cultural intelligence hypothesis. Philos Trans R Soc Lond B Biol Sci 366:1008–1016.
28. Whiten A, Byrne RW (1997) Machiavellian Intelligence II: Extensions and Evaluations (Cambridge Univ Press, Cambridge, UK).
29. Dunbar RIM, Shultz S (2007) Understanding primate brain evolution. Philos Trans R Soc Lond B Biol Sci 362:649–658.
30. Clutton-Brock TH, Harvey PH (1980) Primates, brains and ecology. J Zool 190:309–323.
31. DeCasien AR, Williams SA, Higham JP (2017) Primate brain size is predicted by diet but not sociality. Nat Ecol Evol 1:0112.
32. Navarrete AF, Reader SM, Street SE, Whalen A, Laland KN (2016) The coevolution of innovation and technical intelligence in primates. Philos Trans R Soc Lond B Biol Sci 371:20150186.
33. Byrne RW, Corp N (2004) Neocortex size predicts deception rate in primates. Proc Biol Sci 271:1693–1699.
34. Barton RA, Capellini I (2011) Maternal investment, life histories, and the costs of brain growth in mammals. Proc Natl Acad Sci USA 108:6169–6174.
35. Whiten A, et al. (1999) Cultures in chimpanzees. Nature 399:682–685.
36. van Schaik CP, et al. (2003) Orangutan cultures and the evolution of material culture. Science 299:102–105.
37. Perry S, et al. (2003) Social conventions in wild white-faced capuchin monkeys. Curr Anthropol 44:241–269.
38. Henrich J (2015) The Secret of Our Success: How Culture Is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter (Princeton Univ Press, Princeton).
39. McCabe CM, Reader SM, Nunn CL (2015) Infectious disease, behavioural flexibility and the evolution of culture in primates. Proc Biol Sci 282:20140862.
40. Laland KN (2017) Darwin’s Unfinished Symphony: How Culture Made the Human Mind (Princeton Univ Press, Princeton).
41. Sol D (2009) Revisiting the cognitive buffer hypothesis for the evolution of large brains. Biol Lett 5:130–133.
42. González-Lagos C, Sol D, Reader SM (2010) Large-brained mammals live longer. J Evol Biol 23:1064–1074.
43. Henrich J (2004) Demography and cultural evolution: How adaptive cultural processes can produce maladaptive losses—the Tasmanian case. Am Antiq 69:197–214.
44. Powell A, Shennan S, Thomas MG (2009) Late Pleistocene demography and the appearance of modern human behavior. Science 324:1298–1301.
45. Kline MA, Boyd R (2010) Population size predicts technological complexity in Oceania. Proc Biol Sci 277:2559–2564.
46. Derex M, Beugin M-P, Godelle B, Raymond M (2013) Experimental evidence for the influence of group size on cultural complexity. Nature 503:389–391.
47. Leadbeater E, Chittka L (2007) Social learning in insects: From miniature brains to consensus building. Curr Biol 17:R703–R713.
48. Street SE, Laland KN (2017) Social learning, intelligence, and brain evolution. The Wiley Handbook of Evolutionary Neuroscience, ed Shepherd SV (John Wiley, Chichester, UK), pp 495–513.
49. Barton RA (1998) Visual specialization and brain evolution in primates. Proc Biol Sci 265:1933–1937.
50. Barton RA (2004) Binocularity and brain evolution in primates. Proc Natl Acad Sci USA 101:10113–10115.
51. Heyes C (2012) What’s social about social learning? J Comp Psychol 126:193–202.
52. Sol D, Sayol F, Ducatez S, Lefebvre L (2016) The life-history basis of behavioural innovations. Philos Trans R Soc Lond B Biol Sci 371:20150187.
53. Jones KE, et al. (2009) PanTHERIA: A species-level database of life history, ecology, and geography of extant and recently extinct mammals. Ecology 90:2648.
54. Carvalho P, Diniz-Filho JAF, Bini LM (2006) Factors influencing changes in trait correlations across species after using phylogenetic independent contrasts. Evol Ecol 20:591–602.
55. Dunbar RIM (1995) Neocortex size and group size in primates: A test of the hypothesis. J Hum Evol 28:287–296.
56. Dunbar RIM (1998) The social brain hypothesis. Evol Anthropol 6:178–190.
57. Dunbar RIM (1992) Neocortex size as a constraint on group size in primates. J Hum Evol 22:469–493.
58. Reader SM (2000) Social learning and innovation: Individual differences, diffusion dynamics and evolutionary issues. PhD dissertation (University of Cambridge, Cambridge, UK).
59. Kaplan HS, Robson AJ (2002) The emergence of humans: The coevolution of intelligence and longevity with intergenerational transfers. Proc Natl Acad Sci USA 99:10221–10226.
60. Lefebvre L, Reader SM, Sol D (2004) Brains, innovations and evolution in birds and primates. Brain Behav Evol 63:233–246.
61. Bates LA, Byrne RW (2007) Creative or created: Using anecdotes to investigate animal cognition. Methods 42:12–21.
62. Pagel M (2012) Evolution: Adapted to culture. Nature 482:297–299.
63. Reader SM, MacDonald K (2003) Environmental variability and primate behavioural flexibility. Animal Innovation, eds Reader SM, Laland KN (Oxford Univ Press, Oxford), pp 83–116.
64. Reader SM, Hager Y, Laland KN (2011) Data from: The evolution of primate general and cultural intelligence. Dryad Digital Repository. Available at dx.doi.org/10.5061/dryad.t0q94. Accessed November 23, 2016.
65. Arnold C, Matthews LJ, Nunn CL (2010) The 10kTrees Website: A new online resource for primate phylogeny. Evol Anthropol 19:114–118.
66. IUCN (2016) The IUCN Red List of Threatened Species. Version 2016-2. Available at www.iucnredlist.org. Accessed November 24, 2016.
67. West HER, Capellini I (2016) Male care and life history traits in mammals. Nat Commun 7:11854.
68. Hadfield JD (2010) MCMC methods for multi-response generalized linear mixed models: The MCMCglmm R package. J Stat Softw 33:1–22.
69. Hadfield J (2016) MCMCglmm Course Notes. Available at: ftp://cran.r-project.org/pub/R/web/packages/MCMCglmm/vignettes/CourseNotes.pdf. Accessed January 5, 2017.
70. Hadfield JD, Nakagawa S (2010) General quantitative genetic methods for comparative biology: Phylogenies, taxonomies and multi-trait models for continuous and categorical characters. J Evol Biol 23:494–508.
71. Capellini I, Baker J, Allen WL, Street SE, Venditti C (2015) The role of life history traits in mammalian invasion success. Ecol Lett 18:1099–1107.
72. Allen WL, Street SE, Capellini I (2017) Fast life history traits promote invasion success in amphibians and reptiles. Ecol Lett 20:222–230.
73. Zheng B, Agresti A (2000) Summarizing the predictive power of a generalized linear model. Stat Med 19:1771–1781.
Edited by Kevin N. Laland, University of St. Andrews, St. Andrews, United Kingdom, and accepted by Editorial Board Member Andrew G. Clark April 29, 2017
(received for review January 20, 2017)
EVOLUTION
PSYCHOLOGICAL AND COGNITIVE SCIENCES
When humans and other animals make cultural innovations, they also change their environment, thereby imposing new selective pressures that can modify their biological traits. For example, there is evidence that dairy farming by humans favored alleles for adult lactose tolerance. Similarly, the invention of cooking possibly affected the evolution of jaw and tooth morphology. However, when it comes to cognitive traits and learning mechanisms, it is much more difficult to determine whether and how their evolution was affected by culture or by their use in cultural transmission. Here we argue that, excluding very recent cultural innovations, the assumption that culture shaped the evolution of cognition is both more parsimonious and more productive than assuming the opposite. In considering how culture shapes cognition, we suggest that a process-level model of cognitive evolution is necessary and offer such a model. The model employs relatively simple coevolving mechanisms of learning and data acquisition that jointly construct a complex network of a type previously shown to be capable of supporting a range of cognitive abilities. The evolution of cognition, and thus the effect of culture on cognitive evolution, is captured through small modifications of these coevolving learning and data-acquisition mechanisms, whose coordinated action is critical for building an effective network. We use the model to show how these mechanisms are likely to evolve in response to cultural phenomena, such as language and tool-making, which are associated with major changes in data patterns and with new computational and statistical challenges.
tool-making | language evolution | niche construction | cognitive evolution | social learning
An open question in the study of culture and cognitive evolution is whether (and to what extent) cognitive mechanisms, especially those viewed as advanced or sophisticated, evolved in response to social-learning challenges or are merely the product of domain-general mechanisms (1–3). According to one view, still widely held in cognitive science and evolutionary psychology, cognitive adaptations take the form of specialized brain modules (or neuronal mechanisms) that evolved for specific, often social purposes, such as “imitation” (4, 5), “mind reading” (6, 7), “cheating detection” (8), or most famously, language acquisition (9, 10). These ideas have been criticized on theoretical and empirical grounds (11, 12), and the debate around them demonstrates our limited understanding of the evolution of cognition, its relationship to the evolution of social behavior and, in some organisms, culture.
The question of whether culture and social behavior shape the evolution of the brain is, in our view, best considered using the evolutionary framework of niche construction (13–16): that is, culture and social behavior change the ecological niche to which cognitive traits must adapt in the same manner that nest-building by birds changes the ecological niche in which their nestlings evolve. For example, animals’ ability to learn from each other may have initially been a by-product of domain-general associative learning mechanisms that did not evolve for social learning (1). However, as soon as these mechanisms enabled social learning and were recruited by it for regular use, social learning and its outcomes also became part of their ecological niche. From that moment on, learning mechanisms that need not have initially been specifically social were also selected according to their ability to support social learning. If that is the case, one can certainly claim that these mechanisms were adapted or shaped to serve their new social function (although using the term “evolved for” may still be premature without knowing the degree of genetic modification and specialization).
Similarly, when social learning enables the accumulation or spread of shared group behaviors, these days recognized as the formation of “culture” (17, 18), this culture becomes the new ecological niche for all of the learning mechanisms that contribute to it, and therefore has the potential to shape their evolution. Thus, in theory, given sufficient evolutionary time, cultural phenomena that are adaptive for the individual, and whose acquisition is supported by advanced learning or cognitive skills, such as the ability to imitate or to learn language, are expected to select for improvements in these cognitive skills (see also ref. 19). In practice, however, clear evidence showing the effect of culture on cognition is lacking, and alternative accounts for the evolution of advanced cognition and culture through domain-general learning principles cannot be ruled out (1, 2, 20). As a result, whether and how culture really shapes the evolution of cognition is still under debate.
In what follows, we first clarify some of the theoretical issues in this debate, using two recent controversies in the fields of language evolution and social learning. We then offer a process-level approach to cognitive evolution that may be useful in predicting what aspects of learning and cognition are likely to coevolve with culture. Finally, we use the model to demonstrate how cultural phenomena such as language and tool-making (each related to one of the two controversies discussed earlier) are likely to shape cognition, given their association with changes in data distribution and with new computational and statistical challenges.
Can Culture Evolve Without Shaping Cognition? On Parsimony, Likelihood, and Scientific Productivity
Evolution takes time, so it is clear that very recent cultural innovations, such as cars, computers, cellular phones, or the Internet, could not have yet generated detectable effects (or perhaps any effect at all) on the evolution of cognition. But what about relatively ancient cultural phenomena, such as song-learning in birds or tool-making and language acquisition in humans? Although there is evidence for the effect of human culture on biological traits and gene frequencies (21), evidence for specific effects of human culture on learning and cognitive mechanisms is mostly circumstantial. This evidence includes
This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “The Extension of Biology Through Culture,” held November 16–17, 2016, at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Extension_of_Biology_Through_Culture.
Author contributions: A.L., J.Y.H., S.E., and O.K. designed research, performed research, and wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. K.N.L. is a guest editor invited by the Editorial Board.
1 To whom correspondence should be addressed. Email: okolodny@stanford.edu.
outcome of a process. Such a model may also help provide a useful structure for reexamining the two problems outlined above.
Why Do We Need a Process-Level Approach to Theorize About Culture and Cognition?
Whereas it is relatively easy to see how natural selection acts on clearly defined morphological traits, such as limbs, bones, or coloration, with cognitive traits that are not well understood, it is difficult to tell what is actually evolving. Cognition is not a physical trait, but an emergent property of processes that are carried out by multiple mechanisms, most of which involve learning. Thus, to consider how culture shapes the evolution of cognition, we must explain how such mechanisms work and how they can be modified by natural selection. The importance of using mechanistic models in the study of behavioral evolution is increasingly recognized (65–68), but most attempts to integrate evolutionary theory and cognition are still based on modeling the evolution of learning rules that are far too simple to capture complex cognition (69–74). To understand how culture shapes the evolution of cognitive mechanisms, such as those serving imitation, theory of mind, or language acquisition, it is necessary to have models that explain how such mechanisms work and how they could evolve.
Clearly, given the immense complexity of the brain, any attempt to propose a general process-level model of advanced cognition would be ambitious. However, we believe that it is possible and necessary to start by constructing models that capture some of the key working principles of advanced cognitive mechanisms in a manner that suffices to explain their evolution. An analogy that may clarify our approach is the apparent challenge in explaining the evolution of the eye. The vertebrate eye is highly complex; it is initially hard to see how it could have evolved. However, with a minimal understanding of how the eye works, the “magic” is removed (75). The basic eye model is a layer of photosensitive cells; the visual acuity it provides can gradually improve as it buckles into a ball-like shape, looking (and working) more and more like a pinhole camera. This sketch ignores many details, and is far from explaining everything about eyes and vision, but is sufficient to resolve the puzzle.
This is the kind of modeling approach that we seek for explaining cognitive evolution. Specifically, we do not seek a fully detailed neuronal-level model of brain and cognition. Instead, we want a minimal set of principles that suffice to explain how simple operational units, capable of only the most basic forms of learning, can jointly and gradually create the much more sophisticated mechanisms of advanced cognition. [Powerful algorithms using deep
for language acquisition and for creativity (85, 87). For the latter, our modeling framework had to go well beyond chaining through second-order conditioning (84, 88). The success of a computer program, originally developed to simulate the behavior of animals learning to forage for food in structured environments (85), in reproducing a range of findings in human language (86) suggests that the model may be useful in the study of cognitive evolution. Thus far the model’s implementation has been limited, for simplicity, to an unsupervised learning mode with a learning phase, and then a test phase during which the learners act based on what they have learned. Its extension to accommodate iterated cycles of learning and action, which is necessary to capture the learning of behavioral contingencies through trial and error, is straightforward. Detailed pseudocode for our model is found in the supplementary material of refs. 85 and 86.
The model is based on coevolving mechanisms of learning and data acquisition that jointly construct a complex network that represents the environment and is used for computing adaptive responses to challenges in the environment. In particular, the network is used for search, prediction, decision making, and generating behavioral sequences (including language utterances, when applicable). The extent to which learners’ use of the network produced adaptive behaviors was measured in our implementation by foraging success [in the context of animal foraging (84, 85, 87)] and by a set of language performance scores [in the context of language learning (86)]. Although the production of adaptive behaviors depends on the structure of the network, this structure is not directly coded by genes and therefore cannot evolve directly. The components that may evolve over generations are the parameters of the learning and data-acquisition mechanisms that construct the network through interaction with the environment, and whose coordinated action, as we show below, is critical for building the network appropriately. [This coevolution is very much in the spirit of the notions of constructive development and reciprocal causation in the recently proposed “extended evolutionary synthesis” (89).]
Constructing a Network. We now briefly sketch the main principles that govern how the network is constructed. (This technical description may become clearer and more intuitive after reading the simplified example outlined in the next subsection and illustrated by Fig. 1.) We assume that what data are acquired by the learner is determined by its “data-acquisition mechanisms”: the collection of sensory, attentional, and motivational mechanisms that direct the learner to process and acquire whatever is deemed relevant. These mechanisms [also referred to as “input
Lotem et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7917
mechanisms” (33)] determine the content and the distribution of different data items in the input. For simplicity, we assume that the input takes the form of strings of symbols (i.e., linear sequences of discrete items), which are then processed through a limited working-memory buffer, similar to the “phonological loop” in humans (90, 91), and tested for familiar segments and statistical regularities among their components. This is done by the learning mechanisms in a sequence of steps.
A data sequence is scanned for subsequences that recur within it and for previously learned subsequences, and is segmented accordingly. This results in a series of chunks, which are either previously known or are incorporated at this point into a network of nodes that represents the world as it has been learned so far (nodes stand for objects or other meaningful units; the links in the network represent their association in time and space). Weights are assigned to the nodes and to the links to reflect their frequency of occurrence: links between nodes are established whenever two nodes follow one another in the input. The weight of a node or a link is increased whenever it is encountered in the input; the weight also decreases with time if it is not encountered. This process ensures that only those units and relations that are potentially meaningful are retained in memory, and spurious occurrences are forgotten. If a node’s or link’s weight increases above a fixation threshold, its decay becomes highly improbable. The probability that a data item is learned is thus determined by how frequently it is encountered in the data, and by the parameters of weight increase and decrease. These parameters create a window for learning, during which data can be either retained or discarded from the network.
We assume that if a data sequence reaches the threshold weight for memory fixation, it remains in memory and is not segmented any further. An intuitive example from language learning is a word such as “backpack,” which would be fixed in memory if it were heard repeatedly without prior exposure to instances of “back” or “pack” (not even within other sequences, such as “on my back” or “in the pack”). If “back” or “pack” are heard often, then their partial commonality with “backpack” would result in “backpack” being segmented into “back” and “pack” (with a directed link between them; i.e., back→pack). The fixation of long sequences may have a positive or a negative impact on a learner’s success, as discussed below and in refs. 81 and 87. Note that the fixation of “backpack” does not prevent the formation of separate nodes for “back” and “pack” following later observations.
In addition to breaking up segments to form smaller segments, a node can be formed by the concatenation of smaller segments after they are repeatedly observed in succession. Thus, nodes can be formed “top-down” directly from the raw input by segmentation, or “bottom-up” through concatenation of previously learned units, creating a hierarchical structure, with potentially multiple hierarchies that can be perceived as “sequences of shorter sequences” (see refs. 85 and 86 for more details). In both cases, the effects that memory parameters have on learning amount to a test of statistical significance: natural and meaningful patterns are likely to recur and thus pass the test, whereas spurious patterns decay and are forgotten.
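The weight rules described above (increase on each encounter, decay while an item goes unseen, and protection from decay once a fixation threshold is exceeded) can be illustrated with a minimal Python sketch. This is not the authors' implementation, whose pseudocode is in the supplementary material of refs. 85 and 86. The class and method names, the linear per-symbol decay, the rule that fixated items never decay, the removal of items whose weight reaches zero, and the parameter values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WeightedMemory:
    """Toy store of item weights with increase, decay, and fixation.

    All names and default values here are illustrative assumptions.
    """
    increase: float = 0.4    # weight added per encounter with an item
    decay: float = 0.01      # weight lost per input symbol while unfixated
    threshold: float = 1.0   # weight above which an item is fixated
    weights: dict = field(default_factory=dict)
    fixated: set = field(default_factory=set)

    def tick(self, n_symbols: int = 1) -> None:
        """Let n_symbols enter the input: unfixated items decay, and
        items whose weight drops to zero or below are forgotten."""
        for item in list(self.weights):
            if item in self.fixated:
                continue  # assumption: fixated items no longer decay
            self.weights[item] -= self.decay * n_symbols
            if self.weights[item] <= 0:
                del self.weights[item]

    def observe(self, item: str) -> None:
        """Register one encounter with an item (a node or a link)."""
        self.weights[item] = self.weights.get(item, 0.0) + self.increase
        if self.weights[item] > self.threshold:
            self.fixated.add(item)


def demo() -> WeightedMemory:
    memory = WeightedMemory()
    # A recurring pattern: encountered three times, eight symbols apart.
    for _ in range(3):
        memory.observe("recurring")
        memory.tick(8)
    # A spurious pattern: encountered once and never again.
    memory.observe("spurious")
    memory.tick(50)
    return memory
```

Under these assumed values, the recurring item passes the threshold on its third encounter, whereas the item seen only once decays away before it can be reinforced. This is the sense in which the memory parameters act as a test of whether a pattern recurs often enough, and closely enough in time, to be retained.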
A Simplified Example. To better understand the process of data segmentation and network construction, a simplified example is illustrated in Fig. 1A. This example shows a network that is constructed as the result of acquiring three specific strings of data, under the assumption that the weight-increase parameter is 0.4, the fixation threshold is 1.0 (which means that a data item reaches fixation after three successive observations, because 3 × 0.4 > 1), and the weight-decrease (decay) parameter is 0.01 (i.e., the weight of a data item that is not yet fixated decreases by 0.01 with each symbol that enters the input). It is also assumed that the working-memory buffer can accommodate up to 24 symbols (which is the length of one data string in the example in Fig. 1A), and that the strings in this illustration are separated by 30 additional (irrelevant) characters that prevent parts of any two of these strings from being processed simultaneously in the memory buffer. The figure demonstrates that repeated sequences within each string (highlighted by shades of gray for clarity) are segmented based on their similarity and become the data units that form the nodes of the network. Directed links represent past association between these units; thus, they represent statistical regularities of the environment. For example, 98 always follows 756 and precedes 136, whereas 756 leads to 48361, 98, and 28 with equal probability. Despite the simplicity of this network, we can already observe that 98 and 28 have a similar link structure: both are preceded by 756 and followed by 136. In our earlier work (85, 86), we showed how such similarity in link structure can be used for generalization, for the construction of hierarchical representations, and for creativity (87).
We stressed earlier that, according to our model, the coordinated action of learning and data acquisition mechanisms and their evolution in response to typical input characteristics are critical for building an effective network. This point is illustrated by Fig. 1B, where the same data as in Fig. 1A are now distributed differently, leading to a radically different network representation (although the learning parameters and the working-memory buffer size remain the same). The distribution of the data input in Fig. 1B leads to the fixation of large idiosyncratic data sequences and to poor link structure, which may hamper further learning and generalization (81, 87). For example, no generalization can now be drawn for 98 and 28 because each of them is “locked” within another segment. Recognizing
[Fig. 1 panels. (A) Input: three copies of the string 756483617569813675628136; network nodes: 756, 48361, 98, 136, 28. (B) Input: the strings 756483617564836175648361, 756981367569813675698136, and 756281367562813675628136; network nodes: 75648361, 75698136, 75628136.]
Fig. 1. Data input in the form of three strings, and the network that is constructed as a result of acquiring and processing this input using the learning mechanisms and parameter set described in the text. (A) Each data string of 24 characters is composed of three nonidentical subsequences of eight characters that share some common segments (highlighted using the same shade of gray). The three strings are identical in this case, so labeling each subsequence of eight characters as A, B, and C, respectively, allows describing the structure of the input as ABC ABC ABC. (B) The same input as in A is distributed differently over time, which can be described in short as AAA BBB CCC. This input leads to a completely different network structure due to fixation of A, B, and C as long eight-character chunks. The weights of the nodes and the links of the networks are not shown in the figure, but all of them exceed the fixation threshold of 1.0, as the weight-increase parameter was set to 0.4 per occurrence.
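The link structure described for Fig. 1 can be made concrete with a short Python sketch. It takes the segmentation shown in the figure as given (it does not compute the segmentation) and only counts directed links between consecutive chunks. The function names and the data layout are hypothetical and chosen for illustration; they are not part of the authors' implementation.

```python
from collections import Counter, defaultdict

# One Fig. 1A string, already segmented into the nodes shown in the figure.
CHUNKS_1A = ["756", "48361", "756", "98", "136", "756", "28", "136"]

# Fig. 1B: each string is three repeats of one fixated eight-character chunk.
CHUNKS_1B = [["75648361"] * 3, ["75698136"] * 3, ["75628136"] * 3]


def build_links(chunk_sequences):
    """Count directed links (a -> b) between consecutive chunks.

    Links are counted within each sequence only, never across sequences,
    mirroring the separation of strings assumed in the example.
    """
    links = defaultdict(Counter)
    for chunks in chunk_sequences:
        for a, b in zip(chunks, chunks[1:]):
            links[a][b] += 1
    return links


def transition_probabilities(links, node):
    """Relative frequency of each successor of `node`."""
    if node not in links:
        return {}
    total = sum(links[node].values())
    return {succ: count / total for succ, count in links[node].items()}


def context(links, node):
    """Return (predecessors, successors) of `node` as two sets."""
    preds = {a for a, succs in links.items() if node in succs}
    succs = set(links[node]) if node in links else set()
    return preds, succs


links_1a = build_links([CHUNKS_1A] * 3)  # three identical strings
links_1b = build_links(CHUNKS_1B)
```

With the Fig. 1A segmentation, `transition_probabilities(links_1a, "756")` assigns equal probability to 48361, 98, and 28, and `context` returns the same predecessor and successor sets for 98 and 28, which is the shared link structure that supports generalization. With the Fig. 1B segmentation, 98 and 28 never appear as nodes, so `context(links_1b, "98")` is empty and there is nothing to generalize over.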
rameters may also be modulated by physiological and emotional state, giving higher increase in memory weight to important but relatively rare observations (82).
The learning mechanism described so far is sensitive to the order in which elements appear in the data. For example, 756 and 576 are viewed as different data sequences. This may be important for some data types, such as the sequence of actions of a particular hunting technique, the phrases in a birdsong, or human speech. But for some other types of data it may be sufficient (or even better) to classify two sequences as the same by merely recognizing some of their similar components. For example, two instances of the same salad in a salad bar or a stand of mango trees in the forest may be recognized based on a combination of stimuli, ignoring their exact serial order (which may actually vary across instances). It is therefore possible that the learning mechanisms may also differ in the set of parameters that determine how sensitive they are to the exact serial order of data items. Clearly, a change in these parameters can also influence the segmentation process and the structure of the network, an issue that will become relevant again when we discuss language and tool-making, for which serial order is critically important.
Finally, as explained earlier, our model does not pretend to capture cognitive mechanisms at the neuronal level. The nodes and the links in our network do not correspond to neurons and synapses. Nevertheless, the processes described in our model at the computational level can be realized by neuronal structures and activities, and a representation of the proposed network may exist in the brain. We can assume that the neuronal structures and brain circuits that realize the network are ultimately affected by constraints of size and morphology that are at least partly determined genetically. That is, adaptive changes in the data acquisition and the learning mechanisms that can potentially lead to the construction of an extensive network in the acoustic domain, for example, may be subject to physical constraints that are also genetically determined. Over generations, genetic variants that are better in relaxing these physical constraints and in meeting the demand for larger or more appropriate neuronal structures will be favored by selection. This view of brain evolution is consistent with the “Baldwin effect” view (92, 93), according to which genes may be selected based on how well they support adaptive plastic processes, such as learning. Using this approach to address the question of how culture shapes the evolution of the brain would imply that culture exerts selective pressure that shapes learning and data-acquisition parameters, which in turn shape the structure of the constructed network. Consequently, over evolutionary time scales, brain anatomy may
segmentation process, and consequently on the construction of the network: because we expect the data-acquisition and learning parameters to coevolve, the evolution of language should also affect the memory parameters of the learning mechanisms. Note that this consideration takes us back to the problem of language and memory constraints discussed earlier in the paper. It is highly unlikely that the memory and learning parameters that evolved before language existed were best suited for processing linguistic data. Although certain plastic adjustment of these memory parameters on the basis of the learner’s individual experience cannot be ruled out, it is unlikely that adaptation to language learning and use over hundreds of generations did not also play a role in shaping the genetic basis of these parameters’ values. This claim is supported by the known genetic heritability component in various types of memory (60–63). The question to address next is how these parameters evolved as a result of language evolution.
Intuitively, one would expect the challenge of language acquisition to require and to select for a better working memory, leading to a view of a memory limit as a constraint rather than as an adaptation (see problem 2, above). However, according to our model, there are at least two reasons why a limited working memory may actually be adaptive. First, as explained earlier, the parameters of weight increase and decrease create a window for learning that serves as a test of statistical significance: natural and meaningful patterns are likely to recur and thus to pass the test, whereas spurious patterns decay and are forgotten.
According to this view, the reason that it is typically difficult to learn a novel input from a single encounter is that the mechanism of learning has evolved to expect more evidence before deciding whether an item should be learned or ignored. The evolution of learning parameters that allow data items to reach fixation in memory after a single encounter should be possible. There are in fact examples for such “one-trial learning,” that, interestingly, occurs when rapid learning seems to be adaptive, as in the context of fear learning and enemy recognition (100, 101) or in the case of word “fast-mapping” in young children (102). However, in the case of large quantities of sequential data, where all items may be equally important, proximity of repeated occurrences in time and frequency of recurrence are the best first-resort tests of meaningfulness. The selective pressure of language in the direction of smaller buffer sizes and moderate fixation rates may explain why people are not better at memorizing sequential data verbatim (58), which would require larger buffer sizes and more rapid fixation.
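The cost of the larger buffer sizes mentioned here can be made concrete with the count of candidate chunks that the paper uses in its discussion of the trade-off between memory and computation, N(N − 1)/2 for a linear sequence of N items. The Python sketch below enumerates the chunks directly and compares the result with that closed form. The function names are hypothetical, and restricting chunks to contiguous runs of at least two items is an assumption, adopted because it reproduces the stated formula.

```python
def candidate_chunks(buffer):
    """All contiguous chunks of length >= 2 in a linear sequence."""
    n = len(buffer)
    return [buffer[i:j] for i in range(n) for j in range(i + 2, n + 1)]


def chunk_count(n_items):
    """Closed-form count of candidate chunks, N(N - 1)/2."""
    return n_items * (n_items - 1) // 2


# Buffer sizes of 5, 10, and 15 items, as in the paper's comparison.
counts = {n: len(candidate_chunks(list(range(n)))) for n in (5, 10, 15)}
```

Because the count grows with the square of the buffer length, tripling the buffer from 5 to 15 items multiplies the number of chunks to search and compare roughly tenfold.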
Second, the limited memory buffer can be viewed as representing an adaptive trade-off between memory and computation. [Recall our earlier discussion of Christiansen and Chater’s memory bottleneck (58).] We assume that a larger buffer that can accommodate more data can evolve, but whether or not it will be adaptive depends on the kind of computations needed to process the data in the buffer. In the case of linguistic input, serial order is important; the data must be segmented into words or chunks based on recurring segments. This means that the learner has to search and compare all possible chunks within the buffer, and to find possible matches between these chunks and those represented already as nodes in the network (so as to put the incoming data in context). The number of possible chunks in a linear sequence grows as the square of the length of the sequence [there are N(N − 1)/2 possible chunks in a linear sequence of N data items]. Thus, in a relatively small buffer of 5 data items, the learner has to find and compare only 10 possible chunks, whereas in larger buffers of, say, 10 or 15 items, the learner has to find and compare 45 or 105 possible chunks, respectively. Therefore, increasing the buffer size leads to considerable computational cost that may not be justifiable. Depending on the number of items that comprise a typical meaningful chunk in the language, it might be better to process a sequence of data items by gradually scanning it with a small buffer of 5 items rather than with a large buffer of 15 items.
The computational burden is likely to be much smaller if data input does not need to be segmented accurately, but merely recognized and classified based on some characteristic features. As we mentioned earlier, recognizing a particular salad in a salad bar or a typical fruit tree in a forest may not require paying attention to the exact serial order of the data. In this case, a larger memory buffer may not lead to such a sharp increase in computation. A simple illustration of this phenomenon is given by the three nonsegmented sentences presented in Fig. 2. English speakers who cannot read Hebrew or Japanese would most certainly try to segment (almost automatically and subconsciously) the first sentence that is in English but not the next two sentences that are in Hebrew and Japanese. Those would be classified quickly as “Gibberish in a foreign language” or, more specifically, as “two sentences in Hebrew and Japanese that I can’t read” (based on some distinctive features of Hebrew and Japanese letters). Clearly, readers who know Hebrew or Japanese will segment those sentences automatically. The point of this example is to demonstrate that the “ecological” context and the cultural background of a learner can determine the level of computation applied to processing incoming data.
Some level of data segmentation was clearly required even before the evolution of language; for example, when animals forage for food in structured environments (85) or need to learn or interpret observed behavioral sequences (103). However, the evolution of language almost certainly increased the proportion of data input that must be accurately segmented, thereby imposing more significant computational requirements for a given memory buffer size. Individuals with a genetic predisposition to use a smaller buffer may have been selected, which may explain why the memory bottleneck is indeed so small. This hypothesis predicts that some of the humans’ close relatives that do not possess language may be endowed with a larger working-memory buffer. Indeed, a notable study on working memory for numerals in chimpanzees (104) shows that young chimpanzees have a better capacity for numerical recollection than human adults. This ability may be useful for fast recognition of objects or structures in the field that does not depend on serial order. It may also improve rote learning of recent actions that might be helpful in systematically searching for food without returning to a place that was just visited. Interestingly, exceptional ability to retain in memory an accurate, detailed image of a complex scene or pattern (also known as “eidetic imagery”) is more common among young children (105) and autistic savants (106). Such an ability may facilitate rote learning at the expense of effective segmentation and network representation (81). Regardless of whether or not we understand this phenomenon correctly, it clearly shows that having a larger working-memory buffer is biologically feasible and that genetic variants that possess this ability are already present in human populations. The fact that they do not spread and become the norm suggests that a small memory bottleneck is somehow more adaptive.
The Case of Tool-Making and the Evolution of Social-Learning Mechanisms
Cultural transmission of tool-making techniques depends on social-learning mechanisms. Whereas learning some advanced techniques may involve teaching and verbal instruction (107), the ability to make stone tools probably depended, initially at least, on social-learning mechanisms of the type needed to facilitate imitation or emulation (108–110). How, then, could the evolution of culturally transmitted techniques for making tools (i.e., the culture of “tool-making”) affect the genetic evolution of such learning mechanisms? This question is a specific instance of the more general questions addressed earlier (Problem 1, above) of how using learning mechanisms for social functions shapes their evolution. To answer this question, we should first consider how imitation or emulation works. Here we try to explain it in terms of our model. We assume that for imitation, the coupling between perception and action is developed through experience (see also ref. 2). That is, when an individual repeatedly observes its own actions, it gradually, and quite automatically, associates the perception and the motor experience of those actions. Eventually, seeing another individual performing those actions activates the observer’s representation of the motor experience of performing those actions [because the observed actions are perceived as being similar to the (perceptual representation of) the individual’s own actions, which are already coupled with the relevant motor experience]. Thus, the first expected effect, which may be described in terms of data-acquisition mechanisms (33, 82), is an increase in attention to the behavioral patterns of other individuals (in imitation) or to the outcomes of their actions (in emulation). For imitation, it is also important to acquire much information on self-actions to create the coupling between the perception of these actions and their motor experience.
The next question is how to organize the acquired data in memory. In our framework, this amounts to asking how the network should be constructed. In the case of imitation, there seem to be two possibilities. One is to represent long sequences of observed actions or sensory experiences as large chunks: exact copies of entire sequences that can then be executed. The other possibility is to segment the observed behavior or sensory experience into smaller basic units, just as in the process of language acquisition, and then compose them again into larger sequences in the production process. The first possibility of exact imitation fits the notion of specialized imitation ability that allows copying and executing complex behaviors accurately and almost automatically. However, this approach leads to three problems. First, the expected effect of exact imitation on the evolution of learning is that both the “working-memory buffer” and the weight-increase parameter should increase in size. The working-
memory buffer should be large enough to capture the long
sequences of observed behaviors, and the weight-increase pa-
rameter should be high enough to allow rapid fixation in mem-
Fig. 2. The sentence: “The number of possible chunks in a linear sequence = ory. This prediction does not seem to hold. As discussed earlier,
N(N − 1)/2” written in a nonsegmented form in three different languages, the human working-memory buffer is typically small, and com-
English, Hebrew, and Japanese. See the explanation in the main text. plex patterns require repeated encounters to be learned. The
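The count quoted in the Fig. 2 caption can be checked with a short sketch. It assumes that a "chunk" means any contiguous run of at least two items, which is one reading that reproduces N(N − 1)/2; the function names and the example string are illustrative and are not part of the model described here.

```python
def contiguous_chunks(sequence, min_len=2):
    """Return every contiguous run of at least min_len items.

    Assumption (not stated in the source): a "chunk" is any
    contiguous subsequence of two or more items.
    """
    n = len(sequence)
    return [
        sequence[start:end]
        for start in range(n)
        for end in range(start + min_len, n + 1)
    ]


def chunk_count(n):
    """Closed-form count from the Fig. 2 caption: N(N - 1)/2."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    unsegmented = "thecat"  # hypothetical unsegmented input
    chunks = contiguous_chunks(unsegmented)
    # Brute-force enumeration agrees with the closed form.
    assert len(chunks) == chunk_count(len(unsegmented))

    # Candidate chunks for a few hypothetical buffer sizes.
    for buffer_size in (5, 10, 20):
        print(buffer_size, chunk_count(buffer_size))
```

Under this reading, a buffer of 5 items admits 10 candidate chunks and a buffer of 20 admits 190, so the number of candidates grows roughly with the square of the buffer size.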
involved both in language learning and in complex imitation, which is in line with recent views according to which tool-making possibly preadapted the brain to language learning (27).

Finally, our process-level approach may also help to explain recent new studies linking neuroanatomical changes in the brain to Paleolithic tool-making ability. These studies found that the acquisition of tool-making abilities by experimental subjects involved specific structural changes in the brain (27) and that these structures and regions in the brain are more developed in humans than in chimpanzees (28). This evidence for a short-term plastic response colocalized with structures that underwent recent evolutionary change strongly suggests a process akin to the Baldwin effect, in which genetic variants are selected based on how well they support the required plastic changes (92, 93). It is yet to be explained, however, how the observed plastic changes improve tool-making abilities. As we suggested earlier, in our view, such neuroanatomical changes have to do with the pathways and the representational systems that are recruited to serve the construction of the network. The result should be a rich network that represents sensory and perceptual experiences of

ities. The effect of culture on cognitive evolution is captured through small modifications of these coevolving learning and data-acquisition mechanisms, whose coordinated action improves the network’s ability to support the learning processes that are involved in cultural phenomena, such as language or tool-making. Finally, we proposed that culture exerts selective pressure that shapes learning and data acquisition parameters, which in turn shape the structure of the representation network, so that over evolutionary time scales, brain anatomy may be selected to better accommodate the physical requirements of the learned processes and representations.

ACKNOWLEDGMENTS. We thank the organizers and funders of the Arthur M. Sackler Colloquium on “The Extension of Biology Through Culture.” We also thank two anonymous reviewers for highly constructive comments. Funding was provided by the Israel Science Foundation Grant 871/15 (to A.L.) and by NSF Grant CCF-1214844, Air Force Office of Scientific Research Grant FA9550-12-1-0040, and Army Research Office Grant W911NF-14-1-0017 (to J.Y.H.). O.K. was supported by the John Templeton Foundation Grant ID 47981 and by the Stanford Center for Computational, Evolutionary, and Human Genomics.
1. Heyes C, Pearce JM (2015) Not-so-social learning strategies. Proc Biol Sci 282:20141709.
2. Cook R, Bird G, Catmur C, Press C, Heyes C (2014) Mirror neurons: From origin to function. Behav Brain Sci 37:177–192.
3. Leadbeater E (2015) What evolves in the evolution of social learning? J Zool (Lond) 295:4–11.
4. Iacoboni M, et al. (1999) Cortical mechanisms of human imitation. Science 286:2526–2528.
5. Rizzolatti G, Fogassi L, Gallese V (2001) Neurophysiological mechanisms underlying the understanding and imitation of action. Nat Rev Neurosci 2:661–670.
6. Leslie AM (1987) Pretense and representation: The origins of “theory of mind.” Psychol Rev 94:412–426.
7. Baron-Cohen S (2000) Theory of mind and autism: A fifteen year review. Understanding Other Minds, eds Baron-Cohen S, Tagar-Flusberg H, Cohen DJ (Oxford Univ Press, Oxford), Vol A, pp 3–20.
8. Cosmides L, Tooby J, Fiddick L, Bryant GA (2005) Detecting cheaters. Trends Cogn Sci 9:505–506, author reply 508–510.
9. Chomsky N (1965) Aspects of the Theory of Syntax (MIT Press, Cambridge, MA).
10. Tooby J, Cosmides L, Barrett HC (2005) Resolving the debate on innate ideas. The Innate Mind: Structure and Content, eds Carruthers P, Laurence S, Stich S (Oxford Univ Press, New York), pp 305–337.
11. Anderson ML (2010) Neural reuse: A fundamental organizational principle of the brain. Behav Brain Sci 33:245–266, discussion 266–313.
12. Bates E (1993) Modularity, domain specificity and the development of language. Discuss Neurosci 10:136–148.
13. Scott-Phillips TC, Laland KN, Shuker DM, Dickins TE, West SA (2014) The niche construction perspective: A critical appraisal. Evolution 68:1231–1243.
14. Laland KN, Odling-Smee FJ, Feldman MW (1999) Evolutionary consequences of niche construction and their implications for ecology. Proc Natl Acad Sci USA 96:10242–10247.
15. Odling-Smee FJ, Laland KN, Feldman MW (2003) Niche Construction: The Neglected Process in Evolution (Princeton Univ Press, Princeton, NJ).
16. Iriki A, Taoka M (2012) Triadic (ecological, neural, cognitive) niche construction: A scenario of human brain evolution extrapolating tool use and language from the control of reaching actions. Philos Trans R Soc Lond B Biol Sci 367:10–23.
17. Laland KN, Hoppitt W (2003) Do animals have culture? Evol Anthropol 12:150–159.
18. Laland KN, Janik VM (2006) The animal cultures debate. Trends Ecol Evol 21:542–547.
19. Stout D, Hecht EE (2017) Evolutionary neuroscience of cumulative culture. Proc Natl Acad Sci USA 114:7861–7868.
20. Solan Z (2005) Unsupervised learning of natural languages. Proc Natl Acad Sci USA 102:11629–11634.
21. Laland KN, Odling-Smee J, Myles S (2010) How culture shaped the human genome: Bringing genetics and the human sciences together. Nat Rev Genet 11:137–148.
22. Williamson SH, et al. (2007) Localizing recent adaptive evolution in the human genome. PLoS Genet 3:e90.
23. de Magalhães JP, Matsuda A (2012) Genome-wide patterns of genetic distances reveal candidate loci contributing to human population-specific traits. Ann Hum Genet 76:142–158.
24. Somel M, Liu X, Khaitovich P (2013) Human brain evolution: Transcripts, metabolites and their regulators. Nat Rev Neurosci 14:112–127.
25. Cáceres M, et al. (2003) Elevated gene expression levels distinguish human from non-human primate brains. Proc Natl Acad Sci USA 100:13030–13035.
26. Somel M, Rohlfs R, Liu X (2014) Transcriptomic insights into human brain evolution: Acceleration, neutrality, heterochrony. Curr Opin Genet Dev 29:110–119.
27. Hecht EE, et al. (2015) Acquisition of Paleolithic toolmaking abilities involves structural remodeling to inferior frontoparietal regions. Brain Struct Funct 220:2315–2331.
Lotem et al. PNAS | July 25, 2017 | vol. 114 | no. 30 | 7921
28. Hecht EE, Gutman DA, Bradley BA, Preuss TM, Stout D (2015) Virtual dissection and comparative connectivity of the superior longitudinal fasciculus in chimpanzees and humans. Neuroimage 108:124–137.
29. Creanza N, Fogarty L, Feldman MW (2016) Cultural niche construction of repertoire size and learning strategies in songbirds. Evol Ecol 30:285–305.
30. Shettleworth SJ (2010) Cognition, Evolution, and Behavior (Oxford Univ Press, New York).
31. Heyes C (2010) Where do mirror neurons come from? Neurosci Biobehav Rev 34:575–583.
32. Heyes C (2016) Homo imitans? Seven reasons why imitation couldn’t possibly be associative. Philos Trans R Soc Lond B Biol Sci 371:20150069.
33. Heyes C (2012) What’s social about social learning? J Comp Psychol 126:193–202.
34. Laland KN (2004) Social learning strategies. Learn Behav 32:4–14.
35. Catmur C, Press C, Cook R, Bird G, Heyes C (2014) Authors’ response: Mirror neurons: Tests and testability. Behav Brain Sci 37:221–241.
36. Felsenstein J (1983) Parsimony in systematics: Biological and statistical issues. Annu Rev Ecol Syst 14:313–333.
37. Lotem A (1993) Secondary sexual ornaments as signals: The handicap approach and three potential problems. Etologia 3:209–218.
38. Mery F, Belay AT, So AK, Sokolowski MB, Kawecki TJ (2007) Natural polymorphism affecting learning and memory in Drosophila. Proc Natl Acad Sci USA 104:13051–13055.
39. Dunlap AS, Stephens DW (2014) Experimental evolution of prepared learning. Proc Natl Acad Sci USA 111:11750–11755.
40. Garcia J, Kimeldorf DJ, Koelling RA (1955) Conditioned aversion to saccharin resulting from exposure to gamma radiation. Science 122:157–158.
41. Finkel D, Pedersen NL, McGue M, McClearn GE (1995) Heritability of cognitive abilities in adult twins: comparison of Minnesota and Swedish data. Behav Genet 25:421–431.
42. Plomin R, Spinath FM (2002) Genetics and general cognitive ability (g). Trends Cogn Sci 6:169–176.
43. Briley DA, Tucker-Drob EM (2013) Explaining the increasing heritability of cognitive ability across development: A meta-analysis of longitudinal twin and adoption studies. Psychol Sci 24:1704–1713.
44. Pearson-Fuhrhop KM, Minton B, Acevedo D, Shahbaba B, Cramer SC (2013) Genetic variation in the human brain dopamine system influences motor learning and its modulation by L-Dopa. PLoS One 8:e61197.
45. Mery F (2013) Natural variation in learning and memory. Curr Opin Neurobiol 23:52–56.
46. Stephens DW, Krebs JR (1986) Foraging Theory (Princeton Univ Press, Princeton, NJ).
47. Leadbeater E, Dawson EH (2017) A social insect perspective on the evolution of social learning mechanisms. Proc Natl Acad Sci USA 114:7838–7845.
48. Lotem A, Kolodny O (2014) Reconciling genetic evolution and the associative learning account of mirror neurons through data-acquisition mechanisms. Behav Brain Sci 37:210–211.
49. Thompson B, Kirby S, Smith K (2016) Culture shapes the evolution of cognition. Proc Natl Acad Sci USA 113:4530–4535.
50. Christiansen MH, Chater N (2008) Language as shaped by the brain. Behav Brain Sci 31:489–508, discussion 509–558.
51. Barbujani G, Sokal RR (1990) Zones of sharp genetic change in Europe are also linguistic boundaries. Proc Natl Acad Sci USA 87:1816–1819.
52. Laland KN (2017) The origins of language in teaching. Psychon Bull Rev 24:225–231.
53. Belfer-Cohen A, Goren-Inbar N (1994) Cognition and communication in the Levantine Lower Palaeolithic. World Archaeol 26:144–157.
54. d’Errico F, et al. (2003) Archaeological evidence for the emergence of language, symbolism, and music–An alternative multidisciplinary perspective. J World Prehist 17:1–70.
55. Mellars P (2006) Why did modern human populations disperse from Africa ca. 60,000 years ago? A new model. Proc Natl Acad Sci USA 103:9381–9386.
56. Chater N, Reali F, Christiansen MH (2009) Restrictions on biological adaptation in language evolution. Proc Natl Acad Sci USA 106:1015–1020.
57. Chater N, Christiansen MH (2010) Language acquisition meets language evolution. Cogn Sci 34:1131–1157.
58. Christiansen MH, Chater N (2016) The Now-or-Never bottleneck: A fundamental constraint on language. Behav Brain Sci 39:e62.
59. Lotem A, Kolodny O, Halpern JY, Onnis L, Edelman S (2016) The bottleneck may be the solution, not the problem. Behav Brain Sci 39:e83.
60. Mueller ST, Krawitz A (2009) Reconsidering the two-second decay hypothesis in verbal working memory. J Math Psychol 53:14–25.
61. Cui J, Gao D, Chen Y, Zou X, Wang Y (2010) Working memory in early-school-age children with Asperger’s syndrome. J Autism Dev Disord 40:958–967.
62. Blokland GAM, et al. (2011) Heritability of working memory brain activation. J Neurosci 31:10882–10890.
63. Vogler C, et al. (2014) Substantial SNP-based heritability estimates for working memory performance. Transl Psychiatry 4:e438.
64. Chater N, Christiansen MH (2016) Squeezing through the Now-or-Never bottleneck: Reconnecting language processing, acquisition, change, and structure. Behav Brain Sci 39:e91.
65. McNamara JM, Houston AI (2009) Integrating function and mechanism. Trends Ecol Evol 24:670–675.
66. Fawcett TW, Hamblin S, Giraldeau LA (2013) Exposing the behavioral gambit: The evolution of learning and decision rules. Behav Ecol 24:2–11.
67. Kacelnik A, Bateson M (1997) Risk-sensitivity: Crossroads for theories of decision-making. Trends Cogn Sci 1:304–309.
68. van den Berg P, Weissing FJ (2015) The importance of mechanisms for the evolution of cooperation. Proc Biol Sci 282:20151382.
69. Trimmer PC, McNamara JM, Houston AI, Marshall JA (2012) Does natural selection favour the Rescorla-Wagner rule? J Theor Biol 302:39–52.
70. McNamara JM, Trimmer PC, Houston AI (2012) The ecological rationality of state-dependent valuation. Psychol Rev 119:114–119.
71. Arbilly M, Motro U, Feldman MW, Lotem A (2010) Co-evolution of learning complexity and social foraging strategies. J Theor Biol 267:573–581.
72. Lange A, Dukas R (2009) Bayesian approximations and extensions: Optimal decisions for small brains and possibly big ones too. J Theor Biol 259:503–516.
73. Katsnelson E, Motro U, Feldman MW, Lotem A (2011) Evolution of learned strategy choice in a frequency-dependent game. Proc Biol Sci 279:1176–1184.
74. Hamblin S, Giraldeau L-A (2009) Finding the evolutionarily stable learning rule for frequency-dependent foraging. Anim Behav 78:1343–1350.
75. Nilsson DE, Pelger S (1994) A pessimistic estimate of the time required for an eye to evolve. Proc Biol Sci 256:53–58.
76. Schmidhuber J (2015) Deep learning in neural networks: An overview. Neural Netw 61:85–117.
77. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444.
78. Edelman S (2016) The minority report: Some common assumptions to reconsider in the modelling of the brain and behaviour. J Exp Theor Artif Intell 28:751–776.
79. Mnih V, et al. (2015) Human-level control through deep reinforcement learning. Nature 518:529–533.
80. Gu S, Lillicrap T, Sutskever I, Levine S (2016) Continuous deep Q-learning with model-based acceleration. Proceedings of The 33rd International Conference on Machine Learning, eds Balcan MF, Weinberger KQ (Proceedings of Machine Learning Research, New York), Vol 48, pp 2829–2838.
81. Lotem A, Halpern JY (2008) A data-acquisition model for learning and cognitive development and its implications for autism. Cornell University Computing and Information Science Technical Reports. Available at https://ecommons.cornell.edu/handle/1813/10178. Accessed May 11, 2017.
82. Lotem A, Halpern JY (2012) Coevolution of learning and data-acquisition mechanisms: A model for cognitive evolution. Philos Trans R Soc Lond B Biol Sci 367:2686–2694.
83. Goldstein MH, et al. (2010) General cognitive principles for learning structure in time and space. Trends Cogn Sci 14:249–258.
84. Kolodny O, Edelman S, Lotem A (2014) The evolution of continuous learning of the structure of the environment. J R Soc Interface 11:20131091.
85. Kolodny O, Edelman S, Lotem A (2015) Evolution of protolinguistic abilities as a by-product of learning to forage in structured environments. Proc Biol Sci 282:20150353.
86. Kolodny O, Lotem A, Edelman S (2015) Learning a generative probabilistic grammar of experience: A process-level model of language acquisition. Cogn Sci 39:227–267.
87. Kolodny O, Edelman S, Lotem A (2015) Evolved to adapt: A computational approach to animal innovation and creativity. Curr Zool 61:350–367.
88. Enquist M, Lind J, Ghirlanda S (2016) The power of associative learning and the ontogeny of optimal behaviour. R Soc Open Sci 3:160734.
89. Laland KN, et al. (2015) The extended evolutionary synthesis: Its structure, assumptions and predictions. Proc Biol Sci 282:20151019.
90. Baddeley A, Gathercole S, Papagno C (1998) The phonological loop as a language learning device. Psychol Rev 105:158–173.
91. Burgess N, Hitch GJ (1999) Memory for serial order: A network model of the phonological loop and its timing. Psychol Rev 106:551–581.
92. Baldwin JM (1896) A new factor in evolution. Am Nat 30:441–451.
93. Weber BH, Depew DJ (2003) Evolution and Learning: The Baldwin Effect Reconsidered (MIT Press, Cambridge, MA).
94. Dunbar R (1998) Grooming, Gossip, and the Evolution of Language (Harvard Univ Press, Cambridge, MA).
95. Pinker S (2003) Language as an adaptation to the cognitive niche. Stud Evol Lang 3:16–37.
96. Premack D (1985) “Gavagai!” or the future history of the animal language controversy. Cognition 19:207–296.
97. Butterworth G, Jarrett N (1991) What minds have in common is space: Spatial mechanisms serving joint visual attention in infancy. Br J Dev Psychol 9:55–72.
98. Scaife M, Bruner JS (1975) The capacity for joint visual attention in the infant. Nature 253:265–266.
99. Klin A, Lin DJ, Gorrindo P, Ramsay G, Jones W (2009) Two-year-olds with autism orient to non-social contingencies rather than biological motion. Nature 459:257–261.
100. Curio E, Ernst U, Vieth W (1978) Cultural transmission of enemy recognition: One function of mobbing. Science 202:899–901.
101. Dunsmoor JE, Murty VP, Davachi L, Phelps EA (2015) Emotional learning selectively and retroactively strengthens memories for related events. Nature 520:345–348.
102. Medina TN, Snedeker J, Trueswell JC, Gleitman LR (2011) How words can and cannot be learned by observation. Proc Natl Acad Sci USA 108:9014–9019.
103. Byrne RW (1999) Imitation without intentionality. Using string parsing to copy the organization of behaviour. Anim Cogn 2:63–72.
104. Inoue S, Matsuzawa T (2007) Working memory of numerals in chimpanzees. Curr Biol 17:R1004–R1005.
105. Conway ARA, Jarrold C, Kane MJ, Miyake A, Towse JN (2007) Variation in working memory: An introduction. Variation in Working Memory, eds Conway ARA, Jarrold C, Kane MJ, Miyake A, Towse JN (Oxford Univ Press, Oxford), pp 3–17.
106. Snyder AW, Mitchell DJ (1999) Is integer arithmetic fundamental to mental processing?: The mind’s secret arithmetic. Proc Biol Sci 266:587–592.
107. Morgan TJH, et al. (2015) Experimental evidence for the co-evolution of hominin tool-making teaching and language. Nat Commun 6:6029.
108. Whiten A (2000) Primate culture and social learning. Cogn Sci 24:477–508.
109. Heyes CM, Galef BG, Jr (1996) Social Learning in Animals: The Roots of Culture (Academic, San Diego).
110. Galef BG (2015) Laboratory studies of imitation/field studies of tradition: Towards a synthesis in animal social learning. Behav Processes 112:114–119.
111. Truskanov N, Lotem A (2017) Trial-and-error copying of demonstrated actions reveals how fledglings learn to ‘imitate’ their mothers. Proc Biol Sci 284:20162744.
In Antarctic waters, a group of killer whales makes a wave big enough to knock a seal from its ice floe. Meanwhile, in the North Atlantic, another killer whale group blows bubbles and flashes white bellies to herd a school of herrings into a ball. And in the Crozet Archipelago in the Southern Ocean, still another group charges at seals on a beach, grasps the prey with their teeth, and then backs into the water (1). Some researchers see these as more than curious behaviors or YouTube photo ops: they see cultural mores, introduced into populations and passed to future generations, that can actually affect animals’ fitness.

Killer whales, also known as orcas (Orcinus orca), have a geographic range stretching from the Antarctic to the Arctic. As a species, their diet includes birds, fish, mammals, and reptiles. But as individuals, they typically fall into groups with highly specialized diets and hunting traditions passed down over generations. Increasingly, scientists refer to these learned feeding strategies as culture, roughly defined as information that affects behavior and is passed among individuals and across generations through social learning, such as teaching or imitation (2).

Scientists once placed culture squarely in the human domain. But discoveries in recent decades suggest that a wide range of cultural practices, from foraging tactics and vocal displays to habitat use and play, may influence the lives of other animals as well (3). Studies attribute additional orca behaviors, such as migration routes and song repertoires, to culture (4). Other research suggests that a finch’s song (5), a chimpanzee’s nut cracking (3), and a guppy’s foraging route (6) are all manifestations of culture. Between 2012 and 2014, over 100 research groups published work on animal culture covering 66 species, according to a recent review (7).

Now, scientists are exploring whether culture may shape not only the lives of nonhuman animals but the evolution of a species. “Culture affects animals’ lives and their survival and their fitness,” says the review’s (7) coauthor, behavioral scientist Andrew Whiten of the University of St Andrews in Scotland. “We’ve learned that’s the case to an extent that could hardly have been appreciated half a century ago.” Based on work in whales, dolphins, and birds, some researchers contend that animal culture is likely a common mechanism underlying animal evolution. But testing this hypothesis remains a monumental challenge.
An Evolutionary Force
Longstanding ecological and evolutionary theories suggest that culture could also more directly affect the evolution of traits, and even the making of species. Animal populations evolve through natural selection when a heritable trait, like beak size or fur color, varies and different versions of the trait allow some individuals to survive and reproduce more than others.

Animal culture has the potential to affect this process in a number of ways, says Whiten. For one, cultural innovations, such as tools or predator-avoidance tactics, could increase an animal’s survival and reproduction, buffering them against some selection pressures. But culture could also enable animals to colonize regions they otherwise couldn’t, exposing them to new selection pressures, such as novel temperatures, predators, or food sources. And culture could generate selection for animals to be better suited to a cultural behavior through physical changes, such as stronger arms for more powerful hammering, or cognitive ones, such as the ability to learn tool use by mirroring others. “And that, of course, may affect the evolution of the brain to match,” says Whiten. Furthermore, cultural differences, such as birdsong or migration patterns, could prevent groups from mating together, which could help maintain or even generate new species.

Or anyway, those are the working theories. Finding definitive evidence is a tricky prospect, though recent research in whales and birds offers some substantive support. Scientists refer to the many orca groups with distinct hunting strategies as ecotypes, subsets of a species that occupy unique ecological niches. New genomics technologies allow researchers to search for evolutionary consequences of these various hunting cultures. “We came into the genomics era and really wanted to see whether these cultural traditions in killer whales led to enough of a long-term selection pressure that you would actually see changes in the genome,” says evolutionary biologist Andrew Foote of Bangor University in the United Kingdom.

Foote and colleagues sequenced the genomes of 48 orcas across 5 ecotypes to identify whether the groups were truly genetically isolated, and whether their different cultures were associated with unique genomic changes (1). The sample included one mammal-eating and one salmon-eating ecotype from the North Pacific, and one mammal-eating, one penguin-eating, and one Antarctic toothfish-eating ecotype from the Antarctic. The researchers found that the groups were genetically distinct. “What is really surprising is just how differentiated the ones that live in the same area are,” says Foote. “The two North Pacific ones are really different genetically even though there is overlap in their range.”

Foote estimated that these ecotypes began diverging within the last 250,000 years. He traced some of the genetic differences among groups to gene variants possibly associated with adaptation to the hunting traditions of each ecotype, and the unique
1 Foote AD, et al. (2016) Genome-culture coevolution promotes rapid divergence of killer whale ecotypes. Nat Commun 7:11693.
2 Whiten A, Ayala FJ, Feldman MW, Laland KN (2017) The extension of biology through culture. Proc Natl Acad Sci USA 114:7775–7781.
3 Whiten A (2017) Culture extends the scope of evolutionary biology in the great apes. Proc Natl Acad Sci USA 114:7790–7797.
4 Whitehead H (2017) Gene–culture coevolution in whales and dolphins. Proc Natl Acad Sci USA 114:7814–7821.
5 Grant PR, Grant BR (2014) 40 Years of Evolution: Darwin’s Finches on Daphne Major Island (Princeton Univ Press, Princeton, NJ).
6 Laland KN, Williams K (1997) Shoaling generates social learning of foraging information in guppies. Anim Behav 53:1161–1169.
7 Galef BG, Whiten A (2017) The comparative psychology of social learning. APA Handbook of Comparative Psychology, ed Call J
(American Psychological Association, Washington, DC), pp 411–439.
8 Itan Y, Powell A, Beaumont MA, Burger J, Thomas MG (2009) The origins of lactase persistence in Europe. PLOS Comput Biol
5:e1000491.
9 Whitehead H (2003) Sperm Whales: Social Evolution in the Ocean (The Univ of Chicago Press, Chicago).
10 Kopps AM, et al. (2014) Cultural transmission of tool use combined with habitat specializations leads to fine-scale genetic structure in
bottlenose dolphins. Proc Biol Sci 281:20133245.
11 Krützen M, et al. (2014) Cultural transmission of tool use by Indo-Pacific bottlenose dolphins (Tursiops sp.) provides access to a novel
foraging niche. Proc Biol Sci 281:20140374.
12 Mason NA, et al. (2017) Song evolution, speciation, and vocal learning in passerine birds. Evolution 71:786–796.
13 Street SE, Navarrete AF, Reader SM, Laland KN (2017) Coevolution of cultural intelligence, extended life history, sociality, and brain
size in primates. Proc Natl Acad Sci USA 114:7908–7914.
14 Liu S, et al. (2014) Population genomics reveal recent speciation and rapid evolutionary adaptation in polar bears. Cell 157:785–794.
Program
Wednesday, November 16
Session I
Welcome Remarks
Marcus W. Feldman, Stanford University
Session II
How animal cultures extend the scope of biology: Tradition and learning from apes to
whales to bees
Andrew Whiten, University of St. Andrews
Thursday, November 17
Session III
The role of cultural innovations, learning processes, and ecological dynamics in shaping
Middle Stone Age cultural adaptations
Francesco d’Errico, University of Bordeaux
Session IV
Big data, cultural macroevolution, and the prospects for an evolutionary science of
human history
Russell D. Gray, Max Planck Institute for the Science of Human History
Concluding Remarks
Francisco J. Ayala, University of California, Irvine