Communicative Musicality 2009

00-Malloch-Prelims 9/10/08 12:04 PM Page i
This is the Hardback edition, published in 2008. The Paperback edition, with some corrections, was published in 2010.
Communicative
Musicality
Exploring the basis of
human companionship
Stephen Malloch
and
Colwyn Trevarthen
Great Clarendon Street, Oxford OX2 6DP
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide in
Oxford New York
Auckland Cape Town Dar es Salaam Hong Kong Karachi
Kuala Lumpur Madrid Melbourne Mexico City Nairobi
New Delhi Shanghai Taipei Toronto
With offices in
Argentina Austria Brazil Chile Czech Republic France Greece
Guatemala Hungary Italy Japan Poland Portugal Singapore
South Korea Switzerland Thailand Turkey Ukraine Vietnam
Oxford is a registered trade mark of Oxford University Press
in the UK and in certain other countries
Published in the United States
by Oxford University Press Inc., New York
© Oxford University Press 2008
The moral rights of the authors have been asserted
Database right Oxford University Press (maker)
First published 2008
All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
without the prior permission in writing of Oxford University Press,
or as expressly permitted by law, or under terms agreed with the appropriate
reprographics rights organization. Enquiries concerning reproduction
outside the scope of the above should be sent to the Rights Department,
Oxford University Press, at the address above
You must not circulate this book in any other binding or cover
and you must impose this same condition on any acquirer
British Library Cataloguing in Publication Data
Data available
Library of Congress Cataloging-in-Publication Data
Data available
Typeset by Cepha Imaging Private Ltd., Bangalore, India
Printed and bound in the United Kingdom by
Antony Rowe, Chippenham, Wiltshire
ISBN 978–0–19–856628–1
10 9 8 7 6 5 4 3 2 1
Contents
Author affiliations and biographies
1 Musicality: Communicating the vitality and interests of life
Stephen Malloch and Colwyn Trevarthen
Part 1 The origins and psychobiology of musicality
2 Root, leaf, blossom, or bole: Concerning the origin and adaptive function of music
Ellen Dissanayake
3 Music and how we became human—a view from cognitive semiotics: Exploring imaginative hypotheses
Per Aage Brandt
4 Ritual foundations of human uniqueness
Bjorn Merker
5 The evolution of music: Theories, definitions and the nature of the evidence
Ian Cross and Iain Morley
6 Tau in musical expression
David N. Lee and Benjaman Schögler
7 The neuroscience of emotion in music
Jaak Panksepp and Colwyn Trevarthen
8 Brain, music and musicality: Inferences from neuroimaging
Robert Turner and Andreas A. Ioannides
Part 2 Musicality in infancy
9 Infant rhythms: Expressions of musical companionship
Katerina Mazokopaki and Giannis Kugiumutzakis
10 Voices of shared emotion and meaning: Young infants and their mothers in Scotland and Japan
Niki Powers and Colwyn Trevarthen
11 ‘Music’ and the ‘action song’ in infant development: An interpretation
Patricia Eckerdal and Bjorn Merker
12 Early trios: Patterns of sound and movement in the genesis of meaning between infants
Benjamin S. Bradley
13 The effects of maternal depression on the ‘musicality’ of infant-directed speech and conversational engagement
Helen Marwick and Lynne Murray
14 The improvised musicality of belonging: Repetition and variation in mother–infant vocal interaction
Maya Gratier and Gisèle Apter-Danon
Part 3 Musicality and healing
15 Music for children in zones of conflict and post-conflict: A psychobiological approach
Nigel Osborne
16 Between communicative musicality and collaborative musicing: A perspective from community music therapy
Mercédès Pavlicevic and Gary Ansdell
17 Supporting the development of mindfulness and meaning: Clinical pathways in music therapy with a sexually abused child
Jacqueline Robarts
18 The human nature of dance: Towards a theory of aesthetic community
Karen Bond
19 Therapeutic dialogues in music: Nurturing musicality of communication in children with autistic spectrum disorder and Rett syndrome
Tony Wigram and Cochavit Elefant
Part 4 Musicality of learning in childhood
20 Musicality in talk and listening: A key element in classroom discourse as an environment for learning
Frederick Erickson
21 Spontaneity in the musicality and music learning of children
Nicholas Bannan and Sheila Woodward
22 Vitality in music and dance as basic existential experience: Applications in teaching music
Charlotte Fröhlich
23 Intimacy and reciprocity in improvisatory musical performance: Pedagogical lessons from adult artists and young children
Lori A. Custodero
Part 5 Musicality in performance
24 Bodies swayed to music: The temporal arts as integral to ceremonial ritual
Ellen Dissanayake
25 Towards a chronobiology of musical rhythm
Nigel Osborne
26 Musical communication: The body movements of performance
Jane Davidson and Stephen Malloch
27 Communicative musicality as creative participation: From early childhood to advanced performance
Helena Maria Rodrigues, Paulo Maria Rodrigues and Jorge Salgado Correia
Index
Author affiliations and biographies
Gary Ansdell
Nordoff-Robbins Music Therapy Centre
London, England
Gary Ansdell is currently Co-Head of Research at the Nordoff-Robbins Music Therapy Centre in
London, Co-Director of the MA in Music Therapy (Community Music Therapy/Nordoff-Robbins),
and Honorary Research Fellow in Community Music Therapy at the University
of Sheffield. A trainer and researcher, he works as a music therapist in adult psychiatry. He has
published widely, including (with Mercedes Pavlicevic) Community Music Therapy (Jessica
Kingsley, 2004).
Gisèle Apter-Danon
Université Denis Diderot Paris 7
France
Gisèle Apter-Danon is an infant and perinatal psychiatrist and researcher. She is Director of
Psychopathology and Psychiatric Research at Erasmus Hospital in Paris and Head of the
Perinatal Psychiatry Emergency Unit. She is Assistant Professor of Perinatal Psychopathology at
University Denis Diderot Paris 7 and Vice President of the Francophone World Association for
Infant Mental Health (WAIMH).
Nicholas Bannan
University of Western Australia
Nicholas Bannan is a composer and choral conductor who teaches Music Education at the
University of Western Australia (UWA). He directs The Winthrop Singers at St George’s College,
UWA, and pursues research that contributes to the interdisciplinary agenda for the study of
music as a feature of human evolution, while also applying this approach in aural pedagogy,
music therapy and children’s creative projects.
Karen Bond
Temple University
Philadelphia, USA
Karen Bond is Associate Professor and Coordinator of the Master of Education in Dance in the
Department of Dance, Temple University, Philadelphia. Formerly Senior Lecturer in Dance at the
University of Melbourne, she was a pioneer in the development of the field of dance therapy in
Australia. Her research and publications focus on participant engagement and meanings in dance.
Benjamin S. Bradley
Charles Sturt University
Australia
Ben Bradley is a Professor of Psychology whose concept of infancy assumes that the human
neocortex evolved through the demands of group living. His argument that the psyche is
primarily synchronically constituted has been worked out in participatory action research with teens,
historically in Visions of Infancy (Polity Press, 1989) and theoretically in Psychology and
Experience (CUP, 2005) and his work on infants’ acquisition of ‘thirdness’.
Per Aage Brandt
Case Western Reserve University
Cleveland, OH, USA
Per Aage Brandt is Professor of Cognitive Sciences and Modern Languages, and Director of the
Center for Cognition and Culture at Case Western Reserve University. He is the author of a dozen
books and more than 150 published papers on cognitive and semiotic theory of language,
grammar, aesthetics, art, and music. His work centres on the elaboration of a series of models for
describing patterns of meaning.
Jorge Salgado Correia
University of Aveiro
Portugal
With a background in philosophy and music, Jorge has studied in Portugal, Holland and
England. He specializes in contemporary flute music, and has given many world premieres
including works commissioned for him. In addition to a busy performing career, Jorge is a
Lecturer at the University of Aveiro. His publications include a chapter in The Science and
Psychology of Music Performance: Creative Strategies for Teaching and Learning (OUP, 2002).
Ian Cross
University of Cambridge
England
Ian Cross is Reader in Music and Science at the University of Cambridge where he is Director
of the Centre for Music and Science and a Fellow of Wolfson College. Initially trained as a
classical guitarist, he has published widely in the field of music cognition. His principal research
focus is music as a biocultural phenomenon, which involves collaboration with psychologists,
anthropologists, archaeologists and computational neuroscientists.
Lori A. Custodero
Teachers College, Columbia University
New York, USA
Lori Custodero is Associate Professor and Coordinator of Music and Music Education at
Columbia University Teachers College. Her work has focused on musical experiences, specifically
with infant and early-middle childhood interactions with adults as musicians, teachers and
parents. Her collaborations with local institutions in New York City such as Lincoln Center, and
across diverse international populations, provide generative opportunities for the theoretical
framing of real world practice.
Jane Davidson
University of Sheffield, England
and University of Western Australia
Jane Davidson is Professor of Music Performance Studies at the University of Sheffield and
Callaway-Tunley Chair of Music, University of Western Australia. She has undergraduate and
postgraduate degrees in music, vocal performance and contemporary dance. Author of more
than one hundred scholarly contributions on performance, expression, therapy and the
determinants of artistic abilities, she is a former editor of the journal Psychology of Music, and was Vice
President of the European Society for the Cognitive Sciences of Music from 2003 to 2006.
Ellen Dissanayake
University of Washington
Seattle, USA
Ellen Dissanayake is an independent scholar and writer whose interdisciplinary work contends
that the arts are evolved behaviours, inherent in human nature. Her books, What is Art
for? (1988), Homo aestheticus: Where art comes from and why (1992), and Art and intimacy:
How the arts began (2000), are all published by the University of Washington Press. She resides
in Seattle where she is an Affiliate Professor in the School of Music at the University of
Washington.
Patricia Eckerdal
Uppsala University Hospital
Sweden
Patricia Eckerdal is a physician at Uppsala University Hospital. She also has a degree of Bachelor
of Arts, Musicology. She has been affiliated to the Institute of Biomusicology, Östersund, Sweden
and to the Royal College of Music, Stockholm, Sweden, with the research project ‘Music in
Human Ontogeny’.
Cochavit Elefant
University of Bergen
Norway
Cochavit Elefant is Associate Professor of Music Therapy, Grieg Academy, University of Bergen,
and associate editor for the Nordic Journal of Music Therapy. She is one of the founders of and a
Music Therapist for the Israeli Rett Syndrome Evaluation and Treatment Team Center. In 2000,
she received an award from the International Rett Syndrome Association for her contribution to
that field.
Frederick Erickson
University of California
Los Angeles, USA
Frederick Erickson is George F. Kneller Professor of Anthropology of Education and (by
courtesy) Professor of Applied Linguistics at the University of California, Los Angeles. Initially
trained in composition and music history, he became a specialist in the use of video analysis
in interactional sociolinguistics and microethnography. His publications include Talk and
Social Theory: Ecologies of speaking and listening in everyday life (Polity Press, 2004) and numerous
articles.
Charlotte Fröhlich
University of Applied Sciences
North Western Switzerland
Charlotte Fröhlich is Professor of Music Pedagogy at the University of Applied Sciences,
north-western Switzerland. She has taught music in kindergarten, primary and secondary schools,
schools for special education, and universities in Germany and Switzerland. She trains teachers in
how to teach groups in movement and music for all ages. She is the immediate past chairperson
of the commission of Early Childhood Music Education (ECME) within ISME (International
Society of Music Education).
Maya Gratier
Université de Paris X
Nanterre
France
Maya Gratier is a lecturer (Maître de Conférences) in the Psychology of Music at the Université
Paris X – Nanterre. Her research focuses on the musicality of mother–infant vocal interaction
and on musical communication in improvised performance. She is affiliated to the Laboratoire
de Psychiatrie et de Psychopathologie Périnatale (L’Aubier).
Andreas A. Ioannides
Laboratory for Human Brain Dynamics
BSI, RIKEN
Wako, Japan, and
Laboratory for Human Brain Dynamics
Nicosia, Cyprus
Andreas Ioannides is Head of the Laboratory for Human Brain Dynamics (RIKEN) and Head
of Laboratory for Human Brain Dynamics in Nicosia, Cyprus. He trained in Physics and worked
in nuclear Physics until 1988. In 1987 his research turned to magnetoencephalography (MEG).
Since 1990 he has focused on MEG studies of normal human brain function and how it is
modified in pathology, pioneering tomographic analysis of MEG data.
Giannis Kugiumutzakis
University of Crete
Greece
Giannis Kugiumutzakis is Professor of Developmental Psychology and Epistemology of
Psychology in the Department of Philosophy and Social Studies, University of Crete. His research
focuses on the emotional foundations of infant imitation, arithmetic abilities, rhythmic and
playful or teasing behaviours, and imagination. He sees the musical arts as one constituent
of human intersubjective life, in essential contrast with other a-musical, polemic or agonistic
‘arts’ – two sides of the drama of human nature.
David Lee
University of Edinburgh
Scotland
David Lee is Professor Emeritus of Perception, Action and Development, and Director of
the Perception-Movement-Action Research Centre (PMARC), University of Edinburgh. He
investigates motor control in animals from protozoa to humans. His General Tau Theory
specifies how guidance of purposive movement is achieved by control, synchronization
and sequencing of the time to closure of spatial and force gaps between effectors and their goals.
He applies the theory in studying skilled performance, early development, and how to assist
people with movement disorders.
Stephen Malloch
MARCS Auditory Laboratories
University of Western Sydney
Australia
Stephen Malloch holds the position of Adjunct Fellow at MARCS Auditory Laboratories,
University of Western Sydney. After training as a violinist, he studied music analysis and
researched the structural role of timbre in music composition. Following his doctorate, research
has focused within the discipline of psychology on communicative musicality, in particular in
infant communication, music therapy and maternal postnatal depression. He now practices
as a coach and counsellor in Sydney, and works with organizations to develop more meaningful
communication.
Helen Marwick
National Centre for Autism Studies
University of Strathclyde
Scotland
Helen Marwick is a developmental psychologist and psycholinguist with a primary interest in the
development and social functions of human communication. She is Co-Director of the National
Centre for Autism Studies in the University of Strathclyde, and a lecturer in the department of
Childhood and Primary Studies, Faculty of Education, University of Strathclyde.
Katerina Mazokopaki
University of Crete
Greece
Katerina Mazokopaki has a doctorate in Developmental Psychology from the Department
of Philosophy and Social Studies, School of Philosophy, University of Crete, Greece. She is a
musician, a pianist and a teacher of piano. Her interest focuses on the intersubjective dynamics of
musicality in human experience, and this has led her to undertake research on spontaneous
rhythmic expressions of infants and their participation in musical companionship.
Bjorn Merker
Segeltorp, Sweden
Bjorn Merker is a neuroscientist with broad interests in behavioural biology. Since obtaining his
doctorate for studies on the hamster midbrain at MIT in 1980, he has worked on oculomotor
physiology in cats, on the primary visual cortex in macaques, on song development and mirror
self-recognition in gibbons, and on the evolutionary and developmental background to human
music. With Nils Wallin and Steven Brown he edited the interdisciplinary volume The Origins of
Music (The MIT Press, 2000).
Iain Morley
University of Cambridge
England
Iain Morley is based at the McDonald Institute for Archaeological Research, Cambridge
University, and is a Fellow of Darwin College. After initially studying psychology he moved into
palaeolithic archaeology and has specialized in researching the evolution of human cognition.
His areas of interest include the evolutionary origins and archaeology of music, the emergence of
ritual and religion, and the relationship between ritual and music.
Lynne Murray
University of Reading
England
Lynne Murray is Research Professor of Developmental Psychopathology, co-director of the
Winnicott Research Unit, University of Reading, as well as an honorary Senior Research Fellow in
the Department of Child and Adolescent Psychiatry at the University of Cambridge. Her research
focuses on the impact of parental psychiatric disorder on child development, particularly postnatal
depression, and the intergenerational transmission of psychopathology.
Nigel Osborne
Scotland
Nigel Osborne is Reid Professor of Music at Edinburgh University. His compositions have
featured in most major international festivals and have been performed by many of the leading
orchestras and ensembles around the world. He pioneered the use of music in therapy and
rehabilitation for children who are victims of conflict, his work being carried out in the Balkans,
the Caucasus, Africa and the Middle East.
Jaak Panksepp
Washington State University
USA
Jaak Panksepp is Baily Endowed Professor of Animal Well-Being Science at the College of
Veterinary Medicine at Washington State University. He investigates the neuroanatomical and
neurochemical mechanisms of emotional behaviours, with a focus on understanding how
separation responses, social bonding, social play, fear, anticipatory processes and drug craving are
organized in the brain, especially with reference to psychiatric disorders, such as depression. He
helped establish the new speciality area of affective neuroscience.
Mercedes Pavlicevic
Nordoff-Robbins Music Therapy Centre
London, England
Mercedes Pavlicevic is Co-Head of Research and Head of Education at Nordoff-Robbins Music
Therapy (UK). She is Associate Professor at the Music Department, University of Pretoria, South
Africa and Lecturer in Music Therapy at Queen Margaret University, Edinburgh. Her research
interests include music therapy as cultural work and group music therapy improvisation; and
as a practitioner, her focus is on community-based projects in collaboration with other arts
therapists.
Niki Powers
Scotland
Niki Powers is a post-doctoral researcher in child psychology and a teacher with an interest
in how experience is shared through physical activities of communication in infancy. She is
collaborating with Japanese colleagues in cross-cultural comparisons of infant care and nursery
schools and has a commitment to apply her research findings to assist young persons who find
communication difficult. At present she is working in a project to support young people who
have experienced trauma.
Jacqueline Robarts
Nordoff-Robbins Music Therapy Centre and City University
London, England
Jackie Robarts is a Senior Therapist at the Nordoff-Robbins Music Therapy Centre, London, and
teaches clinical improvisation on the Masters professional training programme. She specializes
in work within child and adolescent mental health. A former Research Fellow at City University,
her writing on the foundations of self and symbolization in music therapy is informed by clinical
work with children and adults with a wide range of conditions, particularly those with histories
of early trauma.
Helena Maria Rodrigues
CESEM-FCSH, New University of Lisbon
Portugal
Helena Rodrigues is Professor of Psychology of Music and Music Pedagogy in the Department of
Musical Sciences in the Faculty of Social and Human Sciences, New University of Lisbon. She is
also the artistic director of Theatrical Music Company (Companhia de Música Teatral),
a Portuguese music group producing interdisciplinary artistic projects.
Paulo Maria Rodrigues
University of Aveiro
Portugal
Paulo Rodrigues is Professor at the Department of Communication and Art, University of Aveiro,
Portugal. He founded Companhia de Música Teatral, a group that has created and developed
many educational and artistic projects that emerge from expanding music to the territories of
other artistic languages and technology. He is the Head of Education at Casa da Música, Porto,
Portugal.
Benjaman Schögler
Scotland
Benjaman Schögler is a research fellow at the Perception-Movement-Action Research Centre,
University of Edinburgh. A professional jazz musician, he is interested in music-making,
communication and the practicalities of being a human performer. He is engaged in research on
‘how’ we move, employing this analysis in creative performance and academic understanding
with particular focus on applied technology in music and the visual arts.
Colwyn Trevarthen
Scotland
Colwyn Trevarthen is Professor (Emeritus) of Child Psychology and Psychobiology at the
University of Edinburgh, where he has taught since 1971. He trained as a biologist, has a Ph.D. in
psychobiology from Caltech and was a Research Fellow at the Center for Cognitive Studies at
Harvard, where his infancy research began. His published work covers brain development, infant
communication and child learning, and emotional health. He is interested in how to help
parents, teachers and clinicians give the best care and companionship to young children.
Robert Turner
Max-Planck-Institute for Human Cognitive and Brain Sciences
Leipzig, Germany
Robert Turner is Director of the Department of Neurophysics at the Max-Planck-Institute for
Human Cognitive and Brain Sciences, in Leipzig. A physicist and mathematician by training, he
has designed magnetic resonance imaging (MRI) gradient coils, pioneered diffusion weighted
EPI, now used widely in clinical stroke evaluation, and co-invented functional MRI. He worked
on the first non-invasive study of brain changes due to learning music notation in 2002.
Tony Wigram
Aalborg University
Denmark
Tony Wigram is Professor of Music Therapy and Head of Ph.D. Studies in Music Therapy in the
Department for Communication and Psychology, Faculty of Humanities, University of Aalborg.
He is Principal Research Fellow in the Faculty of Music, Melbourne University, and Professor in
Music Therapy at Anglia Ruskin University, Cambridge. His research interests include clinical
assessment, autism spectrum disorders, and the documentation of methods and techniques used
in music therapy practice.
Sheila C. Woodward
University of Southern California
USA
Sheila C. Woodward is a professor of Music Education at the Thornton School of Music,
University of Southern California. An award-winning researcher, her work has been published
and presented internationally and includes a focus on music and well-being. This encompasses
research on music medicine, the fetus and neonate, early childhood and juvenile offenders.
She serves on the Board of Directors of the International Society for Music Education.
01-Malloch-Chap01 9/10/08 12:08 PM Page 1
Chapter 1
Musicality: Communicating the vitality and interests of life
Music expresses that which cannot be put into words and cannot remain
silent.
Victor Hugo
1.1 A brief history of discoveries

Four decades ago scientific interest began to focus on a new theory of how human will
and emotion are immediately shareable with others through gestures of the body and voice.
The handful of researchers who contributed to this new way of looking at human nature,
paediatricians, child psychiatrists, ethologists, anthropologists and social linguists, independently
making observations of mothers and infants in natural, mutually enjoyable communication,
considered the vitality of the communicative gestures themselves to be sufficient for the creation
of memorable stories. From these beginnings comes the account of communicative musicality
explored in this book.
Until the late 1960s, however, mainstream medical and psychological science were not inclined
to credit infants with complex skills or creative mental abilities, and certainly not with any active
sympathy for other persons’ thoughts or feelings. The role of a mother was seen to be that of a
provider of basic physiological protection and nourishment. Some, however, in propitious
circumstances and feeling free of the orthodox obligations of conventional medical and
psychological research and publication, began to think differently. From close observation of infant
behaviour, they started to question the prevailing view that thought first begins as solving
practical problems of object use, and that human communication is governed essentially by formal
generative principles and cognitive information processing of language.
When the infancy researchers reported discoveries of delicate expressions and sensitive
responses passing between young infants and their mothers, they described them in terms of rhythmic
patterns of engagement that could be represented as ‘musical’ or ‘dance-like’. Rather than using
terms to point to specific referents, they used metaphors for sympathetic movement such as
‘protoconversation’, ‘attunement’ and ‘acts of meaning’ to capture the dynamic and apparently
intentional phenomena of non-verbal communication that they observed. The cultural
anthropologist Mary Catherine Bateson described adult protoconversations with infants as performed
with a ‘delighted ritual courtesy’ (1979, p. 65), and she drew attention to the shared rhythmic
foundation for turn taking. Babies were found to be more aware of human presence and its
activity and affections than they were of physical objects or events, and this strong curiosity for
humans was expressed in responsive smiles, calls and gestures which excited their mothers and
‘captured’ them into the flow of the present moment of the exchange.
As the title of an historically important book that collected the new ideas of the 1970s put it,
what was being observed was communication Before Speech (Bullowa 1979), and speechless
infants and their mothers were extremely good at it. In her introductory chapter, Margaret
Bullowa referred to the then new concept of a child’s ‘communicative competence’, citing the
ideas of the anthropologist, musician and photographer Paul Byers. Byers writes of envisaging
‘a human and animal world that is communicationally related through the sharing of time forms
in multiple levels of behavioural organization’. He goes on to say that
The information carried by interpersonal rhythms does not move directly from one person to another.
Thus information cannot easily be conceptualized as messages since the information is always
simultaneously shared and always about the state of the relationship.
Byers (1976, p. 160)
He is neatly describing what we call ‘sympathy’. Bullowa describes Byers’ method as ‘detecting
the beat in each communicant’s speech or activity as one would in assigning the “signature” to a
stretch of music’ (Bullowa 1979, p. 16).
In the next two decades the picture grew, with more detailed analysis of infants’ vocalizations
and the particular style of speech mothers used to entrance and delight their infants. There were
experiments on infants’ and even foetuses’ perception of the musical features of sounds
resembling the prosody of the human speaking voice (for example, Stern 1974; Alegria and Noirot
1978; DeCasper and Fifer 1980; Fernald 1985, 1989; Papoušek 1987; Trehub 1987; Hepper 1988).
All this activity led to a transformation of developmental science, challenging in fundamental
ways both the accepted account of infancy and theories of the foundations of the adult mind.
‘Unsophisticated’ infants communicated with innate skill, compelling sympathetic responses
from their parents and generating cooperative narratives of emotion. Within a few months of
birth, infants were shown to begin to share a growing interest in the world of objects, and to
enjoy shared games with them. The babies were evidently possessed of an ‘innate intersubjectivity’,
one that led before the end of the first year to the learning of culturally conditioned meanings.
They took part in shared consciousness regulated by emotions of affection and enjoyment,
expressed and given meaningful form by rhythms of modulated movement. In Edinburgh, with
the help of gifted young collaborators entranced by the opportunity to make discoveries starting
from the simple premise that mother–infant play is intelligent and creative, Colwyn traced the
growth of infants’ motives for sharing intentions and feelings in human company (Trevarthen
et al. 1981). All the while, whether in Scotland, Nigeria, Germany, Sweden or Japan, researchers
found that mothers spoke to infants with similar rhythms and intonation, and infants moved in
sympathy (e.g. Fernald et al. 1989; Kuhl et al. 1997; Masataka 1993; Mundy-Castle 1980;
Papoušek 1992; Papoušek et al. 1991).
A new insight was then brought to the research. It began in 1996 with Stephen, sitting in a
windowless office on the upper floor of Edinburgh University Psychology Department, at the very
start of a post-doctoral research programme with Colwyn. He began by listening to tapes of
mothers and their babies ‘chatting’ with each other, recorded by Colwyn many years earlier. One
of the first tapes was of the vocal interactions of a 6-week-old Scottish girl, Laura, and her mother
(Figure 1.1).
As I listened, intrigued by the fluid give and take of the communication, and the lilting speech of the
mother as she chatted with her baby, I began to tap my foot. I am, by training, a musician, so I was very
used to automatically feeling the beat as I listened to musical sounds. There was no doubt in my mind
[Figure 1.1: three stacked panels on a common time axis, 0 to 26 seconds.
Spectrographic analysis: frequency on a log2 scale (22.05 kHz; -5.300 dB/colour), with C4 marked.
Bar lengths in seconds: 1.53, 1.53, 1.58, 1.53, 1.63, 1.48, 1.53, 1.53, 1.58, 1.63, 1.53, 1.53, 1.63, 1.53, 1.53, 1.53, 1.53.
Utterance numbers: 1 to 17. Mother’s utterances, in order: come on; again; come on then; that’s clever; oh yes; is that right; well tell me some more then; tell me some more then; orh; come on; ch ch ch; ch ch ch; egoo; goo.
Pitch: C3, C4 and C5 marked.
Timbre: roughness, width, sharpness.]
Fig. 1.1 Spectrograph, pitch and timbre plots of Laura, a six-week-old female infant, and her mother
conversing together. The vocalizations are represented on the spectrograph by fundamental frequency
and overtones. The mother’s utterances are written below the spectrograph. Each utterance is num-
bered directly above the utterance text, and these numbers cross-reference with Figure 1.2. The pitch
C4 (261.63 Hz) is indicated by a horizontal line that crosses the spectrograph, and C4 along with C3
and C5 are indicated on the pitch plot. A rectangle around an utterance number and around a vocal-
ization on the spectrograph, and on pitches and timbre measures, indicates an utterance by the
baby. Numbers at the top of the spectrograph indicate the duration of the ‘bars’. Bars are deter-
mined by the occurrence of important acoustic events—vocalization onset or offset, top or bottom
of a pitch ‘bend’, or word emphasis. A dashed bar-line indicates no vocal event marks its placement,
but its duration is inferred from the duration of the surrounding bars.
The pitch plot is indicated by small circles. Whether the circles are black, grey or white indicates the
strength of the pitch of the sound—in other words, how close the sound is to a pure harmonic
spectrum. The darker the circle, the more ‘pitched’ the sound is.
As timbre is a multidimensional attribute of sound, the timbre plot shows three complementary
timbre measures. Roughness is a measure of the degree of ‘beating’ between acoustic partials.
Width is a measure of how ‘expansive’ or ‘narrow’ a sound is heard to be. Sharpness is related to
the relative position of a sound’s loudness centroid within its spectrum. Note how the timbre of the
mother’s voice changes after each of her infant’s vocalizations. Immediately after all three infant
vocalizations most of the timbre measures for the mother’s voice drop. This may indicate the
mother’s wish to signal to her infant that she has heard her and make her voice more like her
infant’s. More detailed explanation and analysis will be found in Malloch 1999. (Figure adapted
from Malloch 1999.)
that the melodious speech of the mother had a certain musical quality to it. It suddenly dawned on me
that I was tapping my foot to human speech—not something I had ever done before, or even thought
possible. I replayed the tape, and again, I could sense a distinct rhythmicity and melodious give and
take to the gentle promptings of Laura’s mother and the pitched vocal replies from Laura.
This was at the very beginning of my post-doctoral research, and I was yet to read the work of such
researchers as the Papoušeks, who years earlier had talked about the musical nature of mother–infant
communication. A few weeks later, as I walked down the stairs to Colwyn’s main lab, the words
‘communicative musicality’ came into my mind as a way of describing what I had heard.
To demonstrate this melodic and rhythmic co-creativity, spectrograms and pitch plots were
generated of the interactions, and precise measurements taken of the onset and offset times of
the vocalizations of both mother and baby (see Figure 1.1). From this work, reported in Malloch
(1999), the theory of communicative musicality found precise formulation in terms of three
parameters: pulse, quality and narrative.
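As an illustration of how regular the pulse of this exchange is, the bar lengths printed along the top of Figure 1.1 can be summarized numerically. The following is a minimal sketch in Python, not the analysis reported in Malloch (1999): the values are transcribed from the figure, and the function name `pulse_summary` and the use of the coefficient of variation as an index of regularity are assumptions made here for illustration.

```python
from statistics import mean, pstdev

# Bar lengths in seconds, transcribed from the top of Figure 1.1.
BAR_LENGTHS = [1.53, 1.53, 1.58, 1.53, 1.63, 1.48, 1.53, 1.53, 1.58,
               1.63, 1.53, 1.53, 1.63, 1.53, 1.53, 1.53, 1.53]


def pulse_summary(bars):
    """Summarize a sequence of bar durations (hypothetical helper).

    Returns the number of bars, their total and mean duration, the
    population standard deviation, and the coefficient of variation
    (standard deviation divided by mean), used here as a rough,
    assumed index of how steady the pulse is.
    """
    m = mean(bars)
    sd = pstdev(bars)
    return {"n": len(bars), "total": sum(bars), "mean": m, "sd": sd, "cv": sd / m}


summary = pulse_summary(BAR_LENGTHS)
for key, value in summary.items():
    print(key, round(value, 4))
```

On these figures the bars sum to a little over 26 seconds, in line with the time axis of Figure 1.1, and their spread is small relative to their mean.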
‘Pulse’ is the regular succession of discrete behavioural events through time, vocal or gestural,
the production and perception of these behaviours being the process through which two or more
people may coordinate their communications, spend time together, and by which we may antici-
pate what might happen and when it might happen. ‘Quality’ refers to the modulated contours of
expression moving through time. These contours can consist of psychoacoustic attributes of
vocalizations—timbre, pitch, volume—or attributes of direction and intensity of the moving
body. These attributes of quality will often co-occur multimodally, such that a wave of the hand
will accompany a ‘swoop’ of the voice. Daniel Stern et al. (1985) have written on this in terms of
‘vitality contours’. Pulse and quality combine to form ‘narratives’ of expression and intention.
These ‘musical’ narratives allow adult and infant, and adult and adult, to share a sense of sympathy
and situated meaning in a shared sense of passing time. The dramatic narrative structure of the
exchange between Laura and her mother can be seen in the shape of the pitch contour of their
exchange (Figure 1.2). From vocalizations centred on C4 at the start of the exchange, the mother
takes her cue from Laura’s upward moving vocalization by abruptly moving her pitch to C5.
This sudden upwards movement is ‘reprised’ by the mother during the rising pitch ‘swoop’ of
utterance number 9. From here till the end, the pitch level slowly descends back to C4, reflected
in the downwards pitch movement of Laura (utterance 11). In Figure 1.2 it is also suggested that
the narrative structure may be thought of in a ‘classical’ four-part evolution of a story, through
Introduction, Development, Climax and Resolution. The ‘poetic form’ of protoconversation,
its rhythmic and prosodic regulation, has been recognized by David Miall and Ellen
Dissanayake (2003).
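The pitch labels used in this description (C3, C4, C5) can be related to measured fundamental frequency on the log2 scale of Figure 1.1. The sketch below, in Python, takes C4 = 261.63 Hz from the figure caption; the equal-tempered semitone grid and the helper names `octaves_from_c4` and `nearest_note` are assumptions introduced here for illustration, not the procedure used in Malloch (1999).

```python
import math

C4_HZ = 261.63  # reference pitch given in the caption of Figure 1.1
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def octaves_from_c4(freq_hz):
    """Distance from C4 in octaves: the log2 of the frequency ratio."""
    return math.log2(freq_hz / C4_HZ)


def nearest_note(freq_hz):
    """Name the nearest equal-tempered note (assumed tuning grid)."""
    semitones = round(12 * octaves_from_c4(freq_hz))
    octave = 4 + semitones // 12   # floor division also handles notes below C4
    name = NOTE_NAMES[semitones % 12]
    return f"{name}{octave}"


for freq in (C4_HZ / 2, C4_HZ, 2 * C4_HZ):
    print(round(freq, 2), nearest_note(freq), octaves_from_c4(freq))
```

A doubling of frequency corresponds to one octave, so the mother’s move from C4 to C5 is a step of one unit on the log2 axis.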
It will be clear that when discussing communicative musicality we are using the words
‘musicality’ and ‘musical’ in a very particular way. When we talk of the ‘musicality’ of
mother–baby interaction, we are not talking of what we generally understand to be music, with
its known composers and performers. Music is moulded by the forces of culture, such that a song
from a rainforest tribe in Brazil will sound very different to a Beethoven symphony, which in turn
will sound very different from the output of a composer such as Stockhausen. When we talk of
musicality we are pointing to the innate human abilities that make music production and appre-
ciation possible (Blacking 1969/1995). And not only music, but also dance and any other human
endeavour that could be considered one of the temporal arts, such as religious ceremonies or
theatre—all instances of ‘the human seriousness of play’ (Turner 1982). We define musicality as
expression of our human desire for cultural learning, our innate skill for moving, remembering
and planning in sympathy with others that makes our appreciation and production of an endless
variety of dramatic temporal narratives possible—whether those narratives consist of specific
cultural forms of music, dance, poetry or ceremony; whether they are the universal narratives of
[Figure 1.2: pitch plot, C3 to G5, against time in seconds (0 to 25), with utterance numbers 1–17 above the time axis, divided into four narrative sections.]
INTRODUCTION: 1 Come on; 2 Again; 3 Come on then; 4 That’s clever; 5 INFANT; 6 INFANT
DEVELOPMENT: 7 Oh yes!; 8 Is that right?; 9 Well tell me some more then
CLIMAX: 10 Tell me some more then; 11 INFANT; 12 Ooorrh; 13 Come on; 14 Ch ch ch ch (with INFANT)
RESOLUTION: 15 Ch ch (with INFANT); 16 Ahgoo; 17 Goo
Fig. 1.2 Photos show the expressions of Laura and her mother in dialogue. The pitch plot is a com-
pressed version of that shown in Figure 1.1 indicating how the narrative demonstrates four parts:
Introduction, Development, Climax and Resolution. Utterance numbers appear immediately above
the time axis and in the table. (Adapted from Malloch 1999.)
a mother and her baby quietly conversing with one another; whether it is the wordless emotional
and motivational narrative that sits beneath a conversation between two or more adults or
between a teacher and a class. In the coordination of practical tasks, a shared, intuitively commu-
nicated understanding is necessary for success. It is our common musicality that makes it possi-
ble for us to share time meaningfully together, in its emotional richness and its structural
holding, and for us to participate with anticipation and recollection of pleasure in the ‘imitative
arts’ as explained by Adam Smith (1777/1982).
This is the sense in which communicative musicality is explored by the authors of this volume.
It is also the sense in which the word ‘musicality’ was introduced into the literature on infant
communication by Mechtild and Hanuš Papoušek (1981), who, with sensitive acoustic analysis of
vocalizations shared with their infant daughters, described the ‘intuitive musicality’ of parenting,
and its role in the development of cultural and linguistic awareness in infancy and early child-
hood. Through the 1980s they made a fine account of what it means for us to share melodies and
controlled expressions of feeling in the voice. This inspired our work.
In the autumn of 1997, Irène Deliège, Editor of Musicae Scientiae, attended a meeting in the
Music Department at the University of Edinburgh, and discussed with Colwyn questions
pertaining to the psychology of music, with particular regard to the temporal and rhythmic
features in vocal and gestural interactions of infants with their parents. This led to collaboration
with the URPM (Unité de Recherche en Psychologie de la Musique) of the University of Liège. In
1998 a symposium on ‘Rhythm, Musical Narrative, and Origins of Human Communication’ was
offered by Irène and Colwyn at the URPM by invitation of the Second International CASYS
Conference (Computing Anticipatory Systems) on the theme of ‘Anticipation, Cognition and
Music’. The papers of the symposium became the Special Issue of Musicae Scientiae, published by
ESCOM (European Society for the Cognitive Sciences of Music) in 2000 (a special issue dated
1999/2000). Papers by Björn Merker, Ben Schögler, Maya Gratier, Stephen and Colwyn were
included. Also represented were Marc Wittmann and Ernst Pöppel who reported their work on
temporal mechanisms of the brain in communication, with special reference to music perception
and performance, and Louise Robb who described application of acoustic analysis to study the
changes in musicality of the voice of a depressed mother and the effects on her infant.
1.2 Musicality and the energies of the Self

The power of musicality to facilitate and energize meaning in communication is poignantly
expressed in music and dance therapy—its ability to play a vital role in this nurturing of the Self
points strongly to its intrinsic role in our biological–psychological make-up (Trevarthen and
Malloch 2000; Sacks 2007). Music and dance, with their progressions from regularity and
predictability to novelty and surprise and back again, can provide a safe, supportive environment
in the ‘present moment’ (Stern 2004) for those for whom interactions with others are fraught
with complexities and difficulties. For traumatized children and those whose development has
taken them towards communicative isolation, such as those with autism and Rett syndrome, the
engagement of their musicality by another can be a lifeline to human sociality. Our shared musi-
cality can be harnessed to our intention to reach out to others, and in this we see the powerful
healing nature of our desire for companioning others through time, even when those others may
have no language and exceedingly limited communication abilities. Our musicality serves our
need for companionship just as language serves our need for the sharing of facts and practical
actions with things. Here Ian Cross’s description of music’s ‘floating intentionality’ comes in
useful (Cross 1999; Chapter 5, this volume). Music, he says, complements language by providing
us with a means for sharing coordinated, embodied space and time while lessening the potential
for disagreements based on the particularity or ‘discretising’ of verbal meaning to which Per Aage
Brandt refers (Chapter 3, this volume). We can ‘agree’ in the shared embodied space of music and
dance, whereas we may disagree in the shared objective space of a verbal discussion because our
version of ‘reality’ differs from that of another. Musicality’s nature of engaging one with an other,
or many with many, intersubjectively, is intrinsic to musicality’s healing potential.
As discussed by Patricia Eckerdal and Björn Merker (Chapter 11, this volume), an infant is
inducted into his or her culture through participation in games. In this environment of ritual
learning, babies practise the gestures of song and ‘ceremonial’ movements, and show them off
with pride to people they trust (Trevarthen 2002). They appreciate musical jokes, making others
laugh (Malloch 1999; Reddy 2003). At the same time as particular forms of others’ actions begin
to be salient in the infant’s awareness, a baby fears misunderstanding in the presence of strangers,
and may withdraw in shame when communication fails (Trevarthen 2002). This shame or
shyness is important—it acts to protect the growth of cultural meanings invented and shared
with intimately known and constant companions. It limits misunderstanding. We also believe
adult human relationships and negotiations, including those of creative art, are worked out along
a pride–shame continuum, with dynamic balance between interacting wills and imaginations
(Scheff 1988). Thus, healthy pride and shame, reacting to the appreciations, and misunderstand-
ings or judgements of others, is an important dimension along which flows infant or adult ability
for expressing the various cultural narratives of communicative musicality. Of course to become
fixed for any length of time at either end of this continuum, ceasing to respond intuitively to the
needs of the social environment, can severely impair our ability to share meaning with others,
with potentially debilitating effects. We see this in cases of maternal postnatal depression and
loneliness due to cultural dislocation where a mother’s sense of Self moves towards what could
be called ‘stagnant shame’ (Marwick and Murray, Chapter 13, and Gratier and Apter-Danon,
Chapter 14, this volume). We see the repercussions of adult ‘stagnant pride’ in the emotional
damage brought to children by war (Osborne, Chapter 15, this volume). And we also see in these
examples the power of music and of our innate musicality to bring free flow once more to the
human psyche. However, a pride–shame continuum is not sufficient to account for the wide
range of human experiences that can be negotiated through communicative musicality.
Taking our cue from eastern philosophies such as Buddhism, the work of the western theolo-
gian Martin Buber (for example, Buber 1923/1958) and Chapter 16 of this volume by Mercédès
Pavlicevic and Gary Ansdell, we believe that along with a pride–shame continuum, our experi-
ence of music and the temporal arts can also show us a separation–interconnection continuum.
Through what Mercédès and Gary call ‘collaborative musicing’ people can move to awareness of
their intrinsic interconnectedness. This is a state where our sense of separateness moves towards
a sense of being an inseparable part of community. They describe an experience they call ‘“multi-
subjective”, in the sense that we both lose and retain our subjectivity within the collective “I”’ (see
page 369). The ability of dance and music to contain paradoxical viewpoints, as described by Ian
Cross and Iain Morley (Chapter 5, page 68) may well contribute to this process. Thus, we propose
a model (necessarily a simplification of actual experience) where the human disposition for com-
municative musicality operates within the tension of two continua lying at right angles to each
other. One is the pride–shame continuum (Scheff 1988), which holds our progress in cultural
learning. The other is the separation–interconnection continuum, which holds our sense of
degree of interrelationship or ‘belonging’ with other people (perhaps all people), of concordant
actions in society, and of skilful uses of objects. Both engage an inherent consciousness of human
relationship, ‘the second person psychology’ that gives meaning to the psychologies of individual
(first person) and objective (third person) action and experience (Reddy 2008). We maintain that
to become fixed or ‘stagnant’ at either end of either continuum provokes discord, both for the
person and for those in their environment. A healthy human psyche will flow freely along these
intersecting continua as the needs of a situation are perceived.
This ‘floating’ and energetic ‘flow’ of intentions with all their affective colour are lived by young
children as a source of energy and inspiration for both play and learning, as an intrinsic part of
their ‘musical culture’ (Bjørkvold 1989). Contributors to this volume who are teachers welcome this
vitality of expression with its generous sharing; and recognizing that adults must participate in the
rhythms and accents of two-way improvisation of meaning for learning to flourish, the authors
consider how the musicality of both teacher and pupil helps their teaching practices throughout the
curriculum. We believe education to be a collaborative task requiring intentional participation in
actions, discoveries and feelings in the human time of shared movement.
1.3 A dichotomy and the way forward

The companionship in discovery of the musical impulses in human beings has been an enjoyable
and encouraging one, but the story is far from finished. Our scientific curiosity has faced us with
two mysterious dichotomies that were familiar to Aristotle and a long line of his predecessors—
the mind–body question, and the problem of sympathy.
◆ How do our swift and ethereal thoughts move our heavy and intricately mobile bodies so our
actions obey the spirit of our conscious, speaking Self?
◆ How do we appreciate mindfulness in one another and share what is in each other’s personal
and coherent consciousness when all we may perceive is the touch, the sight, the hearing, the
taste and smell of one another’s bodies and their moving?
The authors of this book all believe that progress to finding answers to these questions must
acknowledge, as a first step, that we move with rhythm, and that this movement simultaneously
makes the measure of time from ‘inside us’; we tell one another measured stories with emotion-
ally expressive grace—with what we call musicality. This musicality communicates because we
meet as actors first who detect the source of human movements in their form, subjectively—
before we debate, explain, reason the imaginative and hopeful stories that our minds make up as
reconstructions of objective reality ‘out there’.
We believe humans move under the coordinated and integrated control of a time keeping,
energy regulating Intrinsic Motive Pulse (IMP) (Trevarthen 1999). The brain is a network of
dynamic systems all obedient to a scale of rhythms that flow in unison, orchestrating their effec-
tive actions to fulfil the future-sensitive (motivated) desires and recollecting past experiences of
being. There is no other way all these muscles of my body could work in collaborative efficiency,
initiating and executing their forces in synchrony and succession in the present moment, modi-
fying inclinations and desires for the future that are founded on experiences past. This is the way
intentions come to be, and it is also the way they are perceived in others. We can only cooperate
in relationships or social groups by sympathetic harmonization and synchronization with this
time-creating IMP, dancing together ‘in one time’ with its rhythms and respecting the qualities of
its tensions and future-oriented impulses and melodies which we share.
As David Lee shows us (Lee 2005; Chapter 6, this volume), scientific analysis of movement and
of perception-in-action shows we are not just information processors. Rather are we, and all ani-
mal organisms, manufacturers of information in consciousness, generators of the prospects of
our growth and movement in space and time, estimating our needs in a world of varied poten-
tialities that have to be evaluated as imagined goals. The sharing of action enables these goals to
have meaning, and the innate neurochemistry of emotions that Jaak Panksepp has explored
(Panksepp 1998; Panksepp and Trevarthen, Chapter 7, this volume) enables us to give self-related
values to life’s experience, and to communicate them. What matters to us is what feels right in
movement, and what makes common sense in action.
In short, this book explores a particular way of thinking about how the human mind and
human body work together and are intimately interdependent; it investigates how we share life
and make the meaning of our culture in communities. It presents many kinds of evidence to
support the view that we are evolved to know, think, communicate, create new things and care for
one another in movement—through a sense of being in rhythmic time with motives and in tune
with feelings to share the energy and harmony of meaning and of relating. The authors, with
Whitehead (1926), Langer (1942, 1953), Gibson (1979), Lakoff and Johnson (1999) and the
phenomenologists (Merleau-Ponty 1962; Husserl 1964), bring the creative experience of time
in movement of our whole organism, and in the sensations of a moving body (Damasio 1999),
back into the theory of the work of the mind and of conscious perception. They want to balance
the attention paid in contemporary psychology to the input of structure and information that is
perceived as objective food for thought and the subject matter of language. They develop ideas
about the intuitive subjective processes that generate a moving consciousness, and about the
artful and informative stories these motives tell, intersubjectively.
The way forward has been made clear by fine analysis with open curiosity of how an infant and
mother or father share their purposes and feelings with touch, sight and sound, and by evidence
from movement science of the primary dimensions of motor images generated in the brain. We
live, think, imagine and remember in movement. To capture the essence of movement and its
values we use the metaphor of ‘musicality’. To recognize that our experience in movement is
shared by a compelling sympathy we call this activity ‘communicative’. We believe that our
learning, anticipating and remembering, our infinite varieties of communication including
spoken and written language are all given life by our innate communicative musicality.
References
Alegria J and Noirot E (1978). Neonate orientation behaviour towards the human voice. Early Human
Development, 1, 291–312.
Bateson MC (1979). The epigenesis of conversational interaction: A personal account of research
development. In M Bullowa, ed. Before speech: The beginning of human communication, pp. 63–77.
Cambridge University Press, London.
Bjørkvold J-R (1989). The muse within: Creativity and communication, song and play from childhood through
maturity. Harper Collins, New York.
Blacking J (1969/1995). The value of music in human experience. The 1969 Yearbook of the International
Folk Music Council. (Republished in P Bohlman and B Nettl, eds, 1995, Music, culture and experience:
Selected papers of John Blacking. Chapter One, Expressing human experience through music.) University
of Chicago Press, Chicago, IL.
Buber M (1923/1958). I and Thou. Translated by RG Smith. Charles Scribner and Sons, New York.
Bullowa M (ed.) (1979). Before speech: The beginning of human communication. Cambridge University
Press, London.
Byers P (1976). Biological rhythms as information channels in interpersonal communication behavior.
In PPG Bateson and PH Klopfer, eds, Perspectives in ethology, pp. 135–164. Plenum, New York.
Cross I (1999). Is music the most important thing we ever did? Music, development and evolution.
In SW Yi, ed., Music, mind and science, pp. 10–39. Seoul National University Press, Seoul.
Damasio AR (1999). The feeling of what happens. Body, emotion and the making of consciousness.
William Heinemann, London.
DeCasper AJ and Fifer WP (1980). Of human bonding: newborns prefer their mothers’ voices. Science,
208(4448), 1174–1176.
Fernald A (1985). Four-month-old infants prefer to listen to motherese. Infant Behavior and Development,
8, 181–195.
Fernald A (1989). Intonation and communicative intent in mothers’ speech to infants: Is the melody the
message? Child Development, 60, 1497–1510.
Fernald A, Taeschner T, Dunn J, Papoušek M, Boysson-Bardies B de and Fukui I (1989). A cross-language
study of prosodic modifications in mothers’ and fathers’ speech to preverbal infants. Journal of Child
Language, 16, 477–501.
Gibson JJ (1979). The ecological approach to visual perception. Houghton Mifflin, Boston, MA.
Hepper PG (1988). Fetal ‘soap’ addiction. Lancet, 1, 1347–1348.

Husserl E (1964). The phenomenology of internal time-consciousness. (Translated by JS Churchill). Indiana
University Press, Bloomington, IN.
Kuhl PK, Andruski JE, Chistovich LA et al. (1997). Cross-language analysis of phonetic units in language
addressed to infants. Science, 277(5326), 684–686.
Lakoff G and Johnson M (1999). Philosophy in the flesh: The embodied mind and its challenges to Western
thought. Basic Books, New York.
Langer SK (1942). Philosophy in a new key: A study in the symbolism of reason, rite and art. Harvard
University Press, Cambridge, MA.
Langer SK (1953). Feeling and form: A theory of art developed from philosophy in a new key. Routledge and
Kegan Paul, London.
Lee DN (2005). Tau in action in development. In JJ Rieser, JJ Lockman and CA Nelson, eds, Action as an
organizer of learning and development, pp. 3–49. Erlbaum, Hillsdale, NJ.
Malloch S (1999). Mothers and infants and communicative musicality. Musicae Scientiae (Special Issue
1999–2000), 29–57.
Masataka N (1993). Relation between pitch contour of prelinguistic vocalisations and communicative
functions in Japanese infants. Infant Behavior and Development, 16(3), 397–401.
Merleau-Ponty M (1962). Phenomenology of perception. Routledge and Kegan Paul, London.
Miall DS and Dissanayake E (2003). The poetics of babytalk. Human Nature, 14, 337–364.
Mundy-Castle A (1980). Perception and communication in infancy: A cross-cultural study. In D Olson, ed.,
The social foundations of language and thought, pp. 231–253. Norton and Co., New York.
Panksepp J (1998). Affective neuroscience: The foundations of human and animal emotions. Oxford
University Press, New York.
Papoušek M (1987). Models and messages in the melodies of maternal speech in tonal and non-tonal
languages. Abstracts of the Society for Research in Child Development, 6, 407.
Papoušek M (1992). Early ontogeny of vocal communication in parent–infant interactions. In H Papoušek,
U Jurgens and M Papoušek, eds, Nonverbal vocal communication. Comparative and developmental
approaches, pp. 230–261. Cambridge University Press, Cambridge.
Papoušek M and Papoušek H (1981). Musical elements in the infant’s vocalization: their significance for
communication, cognition, and creativity. In LP Lipsitt and CK Rovee-Collier, eds, Advances in infancy
research, Vol. 1, pp. 163–224. Ablex, Norwood, NJ.
Papoušek M, Papoušek H and Symmes D (1991). The meanings and melodies in motherese in tone and
stress languages. Infant Behavior and Development, 14, 415–440.
Reddy V (2003). On being the object of attention: implications for self–other consciousness. Trends in
Cognitive Sciences, 7(9), 397–402.
Reddy V (2008). How infants know minds. Harvard University Press, Cambridge MA/London.
Sacks O (2007). Musicophilia: Tales of music and the brain. Random House, New York/Picador, London.
Scheff TJ (1988). Shame and conformity: the deference-emotion system. American Sociological Review, 53, 395–406.
Smith A (1777/1982). Of the nature of that imitation which takes place in what are called the imitative arts.
In WPD Wightman and JC Bryce, eds, Essays on philosophical subjects, pp. 176–213. (With Dugald
Stewart’s account of Adam Smith, edited by IS Ross. General editors, DD Raphael and AS Skinner)
Liberty Fund, Indianapolis, IN.
Stern DN (1974). Mother and infant at play: The dyadic interaction involving facial, vocal and gaze
behaviours. In M Lewis and LA Rosenblum, eds, The effect of the infant on its caregiver, pp. 187–213.
Wiley, New York.
Stern DN (2004). The present moment: In psychotherapy and everyday life. Norton, New York.
Stern DN, Hofer L, Haft W and Dore J (1985). Affect attunement: the sharing of feeling states between
mother and infant by means of intermodal fluency. In T Field and N Fox, eds, Social perception in
infants, pp. 249–268. Ablex Publishing Corporation, Norwood, NJ.
Trehub SE (1987). Infants’ perception of musical patterns. Perception and Psychophysics, 41, 635–641.
Trevarthen C (1999). Musicality and the intrinsic motive pulse: Evidence from human psychobiology and
infant communication. Musicae Scientiae (Special Issue 1999–2000), 155–215.
Trevarthen C (2002). Origins of musical identity: evidence from infancy for musical social awareness.
In RAR MacDonald, DJ Hargreaves and D Miell, eds, Musical identities, pp. 21–38. Oxford University
Press, Oxford.
Trevarthen C and Malloch S (2000). The dance of wellbeing: Defining the musical therapeutic effect.
Nordic Journal of Music Therapy, 9(2), 3–17.
Trevarthen C, Murray L and Hubley P (1981). Psychology of infants. In J Davis and J Dobbing, eds,
Scientific foundations of clinical paediatrics, 2nd edn, pp. 235–250. Heinemann Medical Books, London.
Turner V (1982). From ritual to theatre: The human seriousness of play. Performing Arts Journal
Publications, New York.
Whitehead AN (1926). Science and the modern world. Lowell Lectures, 1925. Cambridge University Press,
Cambridge.
Part 1
The origins and psychobiology of musicality
The joyful and adventurous experience of music and dance is an integral part of all human
societies. Indeed, their ubiquity can lead us to take their existence for granted: we are human,
therefore we dance and sing. Yet why and how are we musical? What is the evolutionary history of
the appreciative hearing and skilful production of our musicality? Is the doing of music and
dance helpful, or maybe even vital for our survival? And if musicality is an intrinsic aspect of
being human, how does it express itself in our living, feeling, thinking ‘being’? What foundations
for it may be found in our psyche and biology?
The authors of Part 1 address these questions from different positions: evolutionary theory
and archaeological evidence, investigation of human and ape cultures, semiotics, biology, mathe-
matics and brain science. A theme that all chapters have in common is an emphasis on the vital
role innate musicality has played and continues to play in creating and sustaining human social
relationships, with their cooperative imagining and myth-making. Social relationships enable us
to achieve, think, and imagine more than we could alone. They sustain, enrich and nourish our
lives. For Ellen Dissanayake (Chapter 2), the seeds of our musicality lie in love, in mutuality—the
behavioural and emotional cooperation between two individuals who need each other. While she
is talking particularly of the intimate relationship between mother and infant, Per Aage Brandt
(Chapter 3) talks of the importance of the sound of a person’s name to stand for that person in
the context of an adult love relationship—the person is seen by the other as possessing special
beauty or ‘meaning’. As Per says, ‘proper names should be understood from the point of view of
the musicality of personhood … [they] ‘mean’ or refer to the affect (love) that first made an
individual into a person’ (page 34).
Social relationships are, however, not only about dyadic attachments; they are also about
relationships within groups, and all authors stress the vital role musicality has played in facilitating
group cohesion, and the creativity of imagination expressed in a group’s social history. Ellen
Dissanayake writes of the function of group ritual, involving music and dance, in allaying our
anxieties of the unknown while contributing to spiritual beliefs and practices. Björn Merker
(Chapter 4) argues for the importance of vocal learning in humans and other animals, most
conspicuously in some families of birds, in the evolution of the social mind. He argues that our
endlessly elaborated ritual culture, emotionally enriched by displays of music and dance
composed of elements of gesture, pitch and rhythm invented by the human mind, constitutes a
decisive difference between us and our ape relatives whose imitations are fewer, more practical
and far simpler. He believes that the elaboration of ritual activities prepared the way for the
unparalleled inventiveness of meaning in language:
Through [human] ritual, the core concerns of life are attired in fancy dress and complex gestures as
concrete, living proof that life does not hang by a mere thread, that there are resources beyond those
needed for the bare maintenance of life itself.
(Page 52, this volume)
Björn raises issues for the theory of the evolution of musical invention that are invitations
for new research, especially regarding the development of rituals of musicality in children
(as elaborated in Merker and Eckerdal, Chapter 11).
One of the ways society protects its coherence and the mutual support for its members is
through the ritualization of non-confrontational interactions. Social relationships are evolved to
favour cooperation (Axelrod and Hamilton 1983). In common with all authors of this section,
Ian Cross and Iain Morley (Chapter 5) consider music as embodied expressive movement, and, in
particular, point to the ambiguous reference of musical intentions and the role that management
of uncertainty in musical creation plays in nurturing pleasurable social cohesion. It is this
ambiguity, or ‘floating intentionality’, as contrasted to the specific references of language, that they
argue is music’s unique contribution to our social environment. Floating intentionality is
experienced through the life of the body: ‘the primary determinant of musical experience might well be
how the perceived sounds fit with the temporal structures experienced in a moving human body’
(page 72).
If we perceive the forms and meanings of music ‘through’ the body, then there is the question
of how this occurs. David Lee and Benjamin Schögler (Chapter 6), using the mathematics of
tau theory, point to psychological and biological universals generated in the body and brain that
underpin this process—indeed all processes of intelligently moving from one position to another
with ‘prospective awareness’, be it the movement of the voice, hand or whole body: ‘Ultimately,
perception and cognition have inherent relationships to the generation of motivated
psychological time ... which is the key to social communication in all animals’ (page 84). Universal
features in the affective biology of animal life, linking human elaborative musicality with the
life-sustaining emotions and behaviours of other species and their neurochemical foundations, are
investigated by Jaak Panksepp and Colwyn Trevarthen (Chapter 7). Infectious laughter of playful
baby rats or children, the whining cry of a hurt puppy or child, the sad calls of a lonely chicken or
lover, the growl of a wolf or angry parent, the shriek of an eagle or the triumphant shout of a
successful athlete, and the accompanying modulations of rhythm in gesture, are all generated
within neurochemistries we share in the core of our brains with other species. They remain active
in our relationships, and in the unique elaboration of our collaborative arts and work: ‘[these
universal features] help explain how music supports our social life, and how our musical
preferences can define our “identity” in society’ (page 105). The forms we prefer are those that
have gained recognition through their appeal to our group’s need to share beauty. They become
precious as our rituals.
Again turning to processes within the body, but this time to the dynamic processes that can be
detected in the whole living, thinking, perceiving, acting human brain, Robert Turner and
Andreas Ioannides (Chapter 8), in their review of the functional brain imaging literature, show
how new information is being gleaned about the innateness of musicality and its possible
relationships to speech and language. In this new field of science the temporal essence of music and
dance, the rhythm of its making and its perception, challenges both the methods by which
integrative activity of nervous tissues can be detected and the theory of how essential mental
processes of many kinds are held together in one time throughout the vast array of elements that
is the human brain. As Oliver Sacks argues in Musicophilia (Sacks 2007), our brains have musical
processes deeply embedded in them, processes essential for ordinary communication as well as
for professional musicianship. These intrinsically communicative processes stimulate the
imagination and give rise to powers of invention, or can lead to strange disorders of awareness; they
can be recruited to help individuals with disturbed motives, thinking and social feelings regain
control of their minds and their fellowship with other people (see the chapters in Part 3).
References
Axelrod R and Hamilton W (1983). The evolution of cooperation. Science, 211, 1390–1396.
Sacks O (2007). Musicophilia: Tales of music and the brain. Random House, New York/Picador, London.
Chapter 2
Root, leaf, blossom, or bole:
Concerning the origin and adaptive function of music
Ellen Dissanayake
2.1 Introduction
In an earlier survey of ideas about the adaptive function of art (Dissanayake 1994), I invoked the
old analogy of blind men examining an elephant to describe what kind of creature it is. The
concept of art, I said, is similarly composed of a variety of features, some as different from one
another as the elephant’s trunk from its ear or tail. Yet to discuss the subject of art’s function
(or origin) cogently, we need to know what it is that we are referring to—we need to have an idea
of the larger whole.
It is the same with ‘music’, which—like ‘art’—is not a word or concept in many human
societies, even ones that conspicuously engage in what we would call music or art. Like the
editors and other contributors to the present volume, I take as my subject here music as a component of
communicative musicality, a term that offers a new way of thinking about that complex,
many-faceted entity that we call music. As the essays in this volume seek to demonstrate, musicality is a
psychobiological capacity that underlies all human communication, including music. It has
evolved to become a universal characteristic of human nature. As a component of human
nature—part of what makes us human—musicality and music itself have an evolutionary origin
and function. Like speech or toolmaking, there must have been a time when they did not exist.
From what antecedent abilities did they arise, and why?
Non-evolutionary explanations for the origin of music have probably existed for thousands of
years—we know, for example, that many cultures consider it to be a gift of the gods, or of a
particular god. However, under the influence of the Enlightenment, and especially after the
publication of Charles Darwin’s ideas in the second half of the nineteenth century, more
‘scientific’, quasi-evolutionary sources and functions for music were proposed. It is useful to
describe these and two recent specifically evolutionary proposals, before offering my own
hypothesis on the origin and purpose of music.
To organize my discussion, I use an analogy from another large living thing, in the present case
vegetable rather than animal. In his poem Among school children, William Butler Yeats asks,
O chestnut-tree, great-rooted blossomer,
Are you the leaf, the blossom or the bole?
W B Yeats (1928)
In seeking to understand music as it emerged from the great-rooted whole of musicality,
I consider the early quasi-evolutionary speculations as individual leaves, and a current influential
theory from evolutionary psychology as blossom (music as ornament to attract sexual partners)
and—indulging in poetic license or interpolation for the purposes of my discussion—as burl
(music as non-adaptive by-product of other adaptations). My own hypothesis about music
I liken to the bole or trunk that supports the entire chestnut tree, which Yeats implies is more
than the sum of its various parts. The root, which nourishes the entire tree, is communicative
musicality, as described in other essays in the present volume. As with any illustrative analogy,
I beg the reader’s indulgence and ask that it not be taken further than I take it here.
2.2 Leaf: early speculations

In the second half of the nineteenth century, writings about music began to include speculations
about its evolutionary origin and what we now call its adaptive function (although formulated
more in terms of what music was observed to accomplish). These suggestions were not presented
as scientific hypotheses, but instead were ideas derived in large measure from observations of
musical behaviour in ‘primitive’ societies. As is well known, Darwin’s theory of descent with
modification by means of natural selection had reverberating and not always beneficial effects
on subsequent thought about humans. Ideas about ‘progress’ and ‘improvement’, based on
misunderstandings and the misapplication of Darwin’s theory, were used to support racialist
‘explanations’ for human social and intellectual differences, eventually tarring all of the subject of
human evolution with the brush of racism and ethnocentrism (Degler 1991). A century later, few
humanists or even social scientists seek to propose evolutionary origins and functions for human
behaviour, and the concept of human nature itself is suspect.
Nevertheless, my colleagues and I, engaged in adaptationist studies of the various arts
(e.g., Aiken 1998; Brown 2000a, b; Carroll 2004; Coe 2003; Dissanayake 1995/1992, 2000a, b;
Miller 2000a, b; Mithen 2005), are concerned to propose plausible hypotheses about their
evolutionary origin and, especially, their adaptive function(s). Considering what others before us have
suggested is a useful starting point, even though early theorists did not feel obliged to address
what would have caused or promoted the emergence of music in human individuals or societies
in the first place. Here, on the chestnut tree of music, I place these and other such suggestions
about the origin and function of music as leaves, which give nourishment and rustle pleasantly,
but are to be replaced after their seasonal usefulness is past.
Most speculations about the biological origin and function of music, from the late eighteenth
century to the present day, have looked to human vocal expression and communication: for
example, emotional outcries that are inherently strongly moving or alarming—weeping,
sobbing, calls for help, the ups and downs of excited speech, shouts of joy, and so forth
(Combarieu 1894; Lacépède 1785; Spencer 1857)—what Eibl-Eibesfeldt (1975, pp. 498–499)
later called ‘innate releasing mechanisms’ (and see Panksepp and Trevarthen, Chapter 7, this
volume).
Additionally, some theorists traced the beginnings of music to rhythmic sound (Rowbotham
1880), or a rhythmic impulse, arising from the general appetite for exercise (Wallaschek 1893)
and used to facilitate and promote work such as pulling nets or pounding grain (Bücher 1899).
Other scholars thought that rhythm alone was not ‘music’, which required tones and intervals
(Hornbostel 1975; Stumpf 1911).
Music also has been traced to sounds from human activity—hunting calls that may imitate
animal cries and birdcalls (Lucretius 1937, in Rowbotham 1880, p. 661; Geist 1978); signalling
(Révész 1941; Stumpf 1911), as across valleys and distances with hoots and hollers (Hall-Craggs
1969); play (Bücher 1910); and accompaniments to dance and festal excitement (Stumpf 1911)—
these sounds gradually acquiring refinement and social purpose when used to ease and pace
work and ritual.
Other theories have looked to human speech itself—to tone languages such as Chinese or
Yoruba, where different pitches of the same syllable are semantically significant (Kuttner 1990;
Schneider 1957), or to the preverbal babbling of babies (cited in McLaughlin 1970). There has
been a lengthy debate, which still continues, about which came first in human evolution: music
(Darwin 1874; Monboddo 1774; Rousseau 1761) or speech (e.g., Pole 1924/1879; Spencer 1857).
Although suggestions like these sprouted with some profusion, they were primarily
armchair speculations, often influenced by unexamined Western presuppositions about aesthetic
experience. For example, Max Weber (1958, p. 40), writing about the peculiar rational properties
of developed Western music, contrasted it with ‘primitive music’, which was used for socially
important and practical ends—apotropaic (protective) and exorcistic—rather than for accessing
‘the sphere of pure aesthetic enjoyment’.
Most scholars today who address the adaptive function of the arts are concerned with the
visual arts or literature, not music. Insofar as the earliest ‘literature’ (oral literature) was probably
inseparable from song and movement, one can suggest shared evolutionary origins or adaptive
functions for vocal music and the earliest recitation or oratory. I know of only a handful of
evolutionarily informed hypotheses about music’s place in evolution—Brown (2000a, b), Cross
(2003), Dissanayake (2000a, b), Hagen and Bryant (2003), Huron (2001), Merker (2000), Morley
(2002), Miller (2000a, b), and Mithen (2005). Pinker (1997, 2002) contends that music is not an
adaptation (Section 2.4, this chapter). This essay is not the place for comparison and evaluation
of these hypotheses, except for those of Miller and Pinker, which have attracted popular attention
and are discussed in the following two sections.
2.3 Blossom: music as sexual ornament and costly signal

A widely disseminated adaptationist hypothesis of the arts is that of Geoffrey Miller, who has
specifically addressed the subject of music according to sexual selection theory (Miller 2000a, b).
Because the flowers of a tree are essential for reproduction and are generally showy, I treat the
sexual selection hypothesis as a ‘blossom’ of musicality.
Miller does not suggest a specific evolutionary origin for music or the other arts. He begins
with Darwin’s (1874) suggestion that human music might have evolved as a courtship display
by invoking recent theoretical formulations in evolutionary psychology, notably ‘costly signalling’
theory and contemporary understanding of Darwin’s own, now accepted, theory of ‘sexual
selection’. Restrictions of space prevent more than a brief outline of these ideas here.
Remarking on the apparently useless ornamental accoutrements of other species—especially
the beautifully coloured, marked, and often enlarged crests, tails, wings, and other body parts of
male birds—Darwin wondered whether human art, although different from culture to culture,
was the same sort of thing. Yet why would either evolve? Features that are superfluous (as music
and the other arts appeared to be) take time and energy from more obviously useful activities,
such as finding food, mating, or resting. What is more, they draw attention to themselves, even
attracting predators. Heavy antlers or luxurious tails impede locomotion and divert energy from
vital activities. They would seem to reduce fitness, not contribute to it.
Miller applies ‘costly signalling’ theory (Zahavi and Zahavi 1997) to this apparent problem.
The ornaments of birds and animals honestly advertise their fitness, because the strength and
vigour required for their display cannot be ‘faked’ by less well-endowed males. Females who
prefer (i.e., find beautiful or pleasing) such signals of genetic superiority and who choose their
bearers as mates will produce similarly well-endowed male and female offspring with similar
preferences. Through succeeding generations, these ornaments are ‘selected for’ and become
established as species traits, despite their apparent liability. Similarly, Miller claims that human
music—song, dance, and the virtuosity to do them well—has evolved through sexual selection
by females.
Competition is inherent in this model, since females choose the male with the most
extravagant endowment—whether the brightest, tallest crest, the most vigorously quivered tail, the most
splendidly ornamented bower, or the most complex and sonorous song. Sometimes male birds
compete directly—for territories as well as mates—with their songs and displays; at other times
they perform before a female who may, however, prefer another, better endowed suitor.
Miller’s hypothesis of the competitive use of musical behaviour as male display, which I have
only summarized here, has evidence in its favour. In many, if not all human societies, young
males show their vigour, beauty, and sexual desirability through song and dance, and they
achieve status through these and other accomplishments. About the time Miller was born, Curt
Sachs (1962) described competitive uses of music in ‘primitive societies’: the Chippewa admire a
singer’s ‘expanded range’; in Hawaii, deep and powerful chest resonance is noticed; Kikuyu
women give a good flautist food and drink as signs of appreciation (1962, p. 134). In the
Trobriand Islands, a good male singer is a success with women: ‘The throat is a long passage like
the wila [cunnus] and the two attract each other. A man who has a beautiful voice will like
women very much and they will like him’ (Malinowski 1929, p. 478). Other such examples can be
easily found.
I suggest, however, that male competitive display for females is not the origin of music, nor its
primary function. To begin with, ‘musicality’ in the animal world is used not only by competing
males, but notably by monogamous pairs for bonding and territory maintenance—for example,
duetting in gibbons (Geissmann 2000; Merker, Chapter 4, this volume) or courtship ‘dances’ in
birds such as cranes, where both partners participate. As Sugiyama and Scalise Sugiyama (2003,
p. 182) assert, costly signals may operate ‘on several frequencies, capable of sending a variety of
messages’, not only mate value. Costly ceremonial arts, which include music and dance, may
display kinship, generosity and sociality—as well as skill and male competence (Ottenberg 1989,
p. 180) or group prestige (van Damme 1996, p. 270), since groups as well as individuals engage in
costly signalling. Excess and cost are not only signals that say ‘Look at me, I can afford this
extravagance’. They can also signal ‘I [or we as a group] really care about this message, and I am
[we are] putting my money where my mouth is.’ Expenditure of time, materials, and resources—
in ceremonial performances composed of expensive, rare, or labour-intensive artefacts,
costumes, structures, and performances—indicates to members of the group its zeal and strength of
purpose for achieving the outcome that the ceremony is intended to provide. Thus art-filled
ceremonies may appear luxurious, superfluous and pathological—as do male costly displays
for mates—but they have fitness benefits other than mating advantage that are arguably as
important (Section 2.5, this chapter).
Miller’s hypothesis emphasizes creativity and virtuosity as unfakable endowments of which
music and dance are evidence. However, although originality is valued for its own sake in some
societies, human art is typically conservative, not idiosyncratic (Coe 2003). Its manifestations
derive from the ancestors or supernatural spirits and must be reproduced accurately so that they
will work as intended. Skill, too, is usually admired, but may also incite envy. Cole and Aniakor
(1984) report that an Umunze Igbo slit-drum artist was so skilled that he was sacrificed so that he
would not carve a bigger drum for another village (in van Damme 1996, p. 348). Among the
Baule, carvers were usually cripples—those who were unable to farm (van Damme 1996, p. 232).
These are perhaps exceptional cases, and may be interpreted as proving the rule that music is a
sign of male fitness; however, they indicate that at least in the visual arts, male display may be
only one—and not the most important—function of the arts.
Music and other arts in premodern societies have many contexts and uses that belie a sexual
selection function. They are frequently displayed and performed in single-sex groups. Older
males or females may be considered the best artists or performers. A glance at the artefacts
on display in any ethnographic museum suggests that they and their accompanying music are
frequently used for occasions that are as likely to create fear or awe as sexual interest.
Steven Brown (2000b, 2002) has pointed out other problems with a strict sexual selection
argument for music. Darwin’s theory was meant to explain sexually dimorphic traits—it is male birds
who attract attention to their skilful song and who have costly ornaments. Yet human musical
ability is possessed equally by females—as Miller notes, though he emphasizes male
music-making—and both sexes typically produce music. In some East Asian societies, girls are the
primary singers during courtship—e.g., the Moso (Namu and Mathieu 2003)—and in many
others, the courting pair together engage in antiphonal love dialogues that allow them to
coordinate body rhythms and otherwise assess their physical and emotional compatibility,
e.g., Hmong (Catlin 1992), Kmhmu (Proschan 1992), Maranao (Cadar 1975), and Moso (Namu
and Mathieu 2003).
Miller’s hypothesis addresses extreme talent, but musicality is a general human capacity that
benefits all, not only a few virtuosos. The entire hypothesis is concerned less with music than
with virtuosic capacity and the benefits of having a large brain. By treating all products of the
human brain as a form of sexual signalling, Miller (2000b) offers no cognitive function that is
primary to music (or any other art)—no need that is fulfilled by its specific character that is not
equally fulfilled by any other skill or display behaviour (Carroll 2004, p. xxi).
Another serious difficulty with the sexual selection argument is that it leaves no room for
cooperative uses of music, which are by far the most noticeable in the world’s musics—as the
pioneer ethnomusicologist John Blacking emphasized (e.g., Blacking 1995, p. 31; see also
Pavlicevic and Ansdell, Chapter 15, and Dissanayake, Chapter 22, this volume). Steven Brown
makes the important point that ‘the two most salient features of music, compared to any
other form of vocal communication in nature, are its use of pitch-blending and temporal
synchronization’ (Brown 2000b, p. 297). These features facilitate coordination and cooperation
among individuals, and it is difficult to see how they could have arisen through competitive
interactions. Brown speculates that these two cognitive capacities may have evolved specifically
for coordination and emotional unification among individuals in a group.
2.4 Burl: music as functionless by-product

William James (1890, p. 419) considered music to be ‘a mere incidental peculiarity of the nervous
system, with no teleological significance’. A century later, Steven Pinker (1997) echoes James and
others like him who consider the arts to be superfluous by-products of other adaptations—
rather like the hemispherical woody outgrowths or burls that sometimes form, like a wen or
tumour, on the trunks of trees. The canonical example of this view is Pinker’s analogy with
strawberry cheesecake, which humans evolved to like because during the Pleistocene, when sugar and
fat were rare, it was advantageous to prefer high-calorie, energy-rich foodstuffs when they were
available, rather than to be satisfied with tubers or leaves. (Our atavistic liking for such foods
today has little, if any, adaptive function and indeed is maladaptive if overindulged.) Like sugar,
fat, alcohol, recreational drugs, masturbation and pornography, the arts—including music—
exploit cravings that in other contexts are or were adaptive. Even though they are non-functional,
we like these concentrated doses of sensory and mental delight because they allow us ‘[to press]
our pleasure buttons’ (Pinker 1997, p. 525).
Pinker’s analogy is clever and amusing. Sometimes music, like eating, is an indulgence.
However, familiarity with the range of uses of arts in small-scale subsistence societies—and
apparently by our Paleolithic ancestors in what is now France and Spain, who have left evidence
of artful behaviour in remote parts of deep caves—disposes of any suggestion that making and
experiencing the arts are simply for pressing pleasure buttons. Laments, funerary arts, crawling
through a kilometre of narrow, wet, dark tunnels to paint (or view and perform ceremonies
before) bison and other animals on cave walls—these are not things people do for entertainment
and fun. The sheer amount of time, individual energy and material resources devoted to
ceremonial behaviour—in which music, dance, visual decor and literary language all combine—
indicates that pleasure is not the only or even greatest reason for its centrality and persistence
in the overwhelming number of human societies for which we have historical (and sometimes
prehistorical) evidence. Music engenders emotional states—fellow-feeling, affirmation, solemnity,
tears, and intimations of transcendence—that are not reducible to self-gratifying pleasure.
2.5 Bole: from protomusic to music

My argument about the evolutionary origin and function of music has two parts: a protomusical
and a musical stage. I present it here as the bole or trunk from which the entire chestnut tree,
rooted in communicative musicality, spreads forth its many branches in space and time.
2.5.1 Protomusic in communicative musicality

The emergence of protomusical capacities in humans, I suggest, derives in part from the
consequences of being a creature that showed two incompatible anatomical modifications during its
evolution: walking upright and developing an enlarged brain. A bipedal stance requires changes
in bones and muscles that were originally used for quadrupedal locomotion. Among these
alterations is a narrowed pelvis, making for difficulties at parturition for both the mother and
her large-brained infant—a trend that was well underway by 1.6 million years ago (Falk 2004,
p. 499). The solution (or compromise) was that selection favoured infants who were born at an
increasingly premature state. Over evolutionary time, hominid babies gradually became much
more helpless at birth than those of other primates (Dunbar 1996, pp. 128–129).
At birth and with lactation, the release of hormones such as opioids and oxytocin ensures that
mammalian mothers are devoted to caring for their infants (Miller and Rodgers 2001; Pederson
et al. 1992). It appears, however, that hominid mothers of helpless, demanding offspring required
additional insurance to guarantee that they would willingly devote constant attention and care to
them for years. How else to explain why shortly after birth (and at least as early as four weeks),
human mothers and infants universally engage in dyadic species-specific interactions that serve
to coordinate their behaviour and emotions—what Malloch and Trevarthen call ‘communicative
musicality’ (see Panksepp and Trevarthen, Chapter 7, this volume)? This interaction is more than
the lilting, simplified utterances of ‘motherese’ or ‘infant-directed speech’. It includes concurrent
special facial expressions and movements of the head and body as well.
‘Musicality’ is an appropriate label, in that the interactions are organized in bouts (phrases)
over time and in time, using such musical features as melodic vocal contours, rhythmic and
regularized vocalizations and body movements, and expressive dynamic contrasts and variations
in space (large–small, up–down) and time (fast–slow, short–long), with behavioural ‘rests’ or
silences between bouts. The interactions are a multimodal (or multichannelled) ‘performance’ of
the mother, in which vocal, facial and bodily movements occur all together, temporally organized
according to a common pulse.
Although mother–infant interaction is well-studied, my hypothesis about its relevance to
the evolution of music emphasizes three intertwined points that invite further investigation:
(a) the noteworthy nature of the signals presented by the mother; (b) the infant’s strong and
untaught receptivity to the signals; and (c) the infant’s active contribution to the communication.
The visual, vocal and kinesic elements used multimodally in ‘packages’ by mothers are simplified,
repeated, exaggerated and elaborated versions of adult communicative signals—what one might
assume is necessary for attracting and holding the interest of an immature baby who requires
stimulation and emphasis to pay attention. However, putting it this way diminishes the role of
the baby in eliciting and preferring precisely these kinds of signals which, interestingly, are all
similar to, and possibly derived from, affiliative expressions that adults use with each other in
normal positive social interchange: open mouth, eyebrow flash, smile, looking at, head bob
backward, body leaning toward, head nodding, soft high-pitched undulant vocalizations,
touches, pats and kisses—many of which, incidentally, are present in some form in affiliative or
submissive contexts in primate societies (Dissanayake 2000b, p. 41; King 2004).
Far from adults ‘teaching’ babies to like these sorts of signals, babies encourage us to interact
with them in a way that we would never think of using with other adults or even older children. It
seems reasonable to assume that a mother’s simplifications, repetitions, exaggerations and
elaborations of common affiliative communicative signals would have reinforced—through
proprioceptive feedback (Scherer and Zentner 2001, pp. 371–372)—affiliative neural circuits in her own
brain at the same time as she was communicating her feelings of love and attachment to her
infant. (See Panksepp and Trevarthen, Chapter 7, and Turner and Ioannides, Chapter 8, this
volume, for a description of the neurobiology of both partners in the mother–infant interchange,
and a fuller account of its ontogeny.) Such reinforcement would have been adaptive both for
maternal reproductive success and infant survival. In a book about the origins of music that
appeared after the present chapter was prepared, Steven Mithen (2005) also suggests that evolved
interactions between mothers and infants provided abilities that were used for communal
singing and dancing in early human societies (Homo ergaster [1.8 million years ago] and Homo
heidelbergensis [0.5 million years ago]). This chapter is not the place to discuss the points of
similarity and difference in our hypotheses.
I suggest that the elements or operations of communicative musicality as they developed
phylogenetically in interactions between ancestral mothers and infants—simplification or
formalization, repetition, exaggeration, elaboration (and, for older infants, manipulation of their
expectation, or surprise) of simultaneous vocal, visual, and kinesic expressions—are the origins
of the capacities later used by humans in making and responding to music. Long before adult
individuals themselves intentionally began to make what we call music (and the other temporal
arts, which in small-scale societies are generally performed all together), these operations were
adaptive between caretakers and infants, serving to coordinate behaviour and emotion and, by so
doing, to conjoin or bond the pair. (Many scholars have described other adaptive benefits of the
behaviour to infants; some of these are listed in Dissanayake [2000a, p. 393].)
In this scheme, the capacities for eventual music originate not in sex (i.e., sexual display), but
in love or ‘mutuality’ (Dissanayake 2000b): the behavioural and emotional coordination between
two individuals who need each other for their own individual reasons—for the baby, survival
and, for the mother, reproductive success. In other words, as communication richly endowed
with communicative musicality, the original function of simplified, repeated, exaggerated and
elaborated signals presented multimodally by mothers to infants would have been to reinforce
concord, not to compete for or seduce mates.
2.5.2 Music as interpersonal coordination and conjoinment

The second part of my argument—how our ancestors distilled music from protomusic—arises
from observing the most common context in which music occurs in ‘societies of intimates’
(Givón and Young 2002)—traditional foraging (hunting and gathering) societies that were the
sole institutional form of human society until six to eight thousand years ago. In these and more
recent small-scale groups, music and the other arts are an integral part of human ritual
ceremonies (see also Merker, Chapter 4, and Dissanayake, Chapter 24, this volume).
I have described ceremonies as ‘collections of arts’, without which a ceremony would not
exist (Dissanayake 2000b). As in mother–infant interaction, where vocalization does not take
place apart from facial expression and bodily movement, the collection of arts in ceremonies
(singing, chanting, intoning, playing an instrument, dancing, keeping time by striking or moving
to a beat) occurs and has its effects simultaneously. Early ethnographers of the arts (e.g., Boas
1925, p. 329; Hornbostel 1975/1905, p. 270), like countless subsequent ethnographers, noted
the close relationship in small-scale societies between music, poetic language and expressive
movement.
In proposing an evolutionary account of music, one need not posit a single origin in, say,
emotional outcries or a predilection to rhythmic movement. In contrast to other evolutionary or
quasi-evolutionary hypotheses of music’s origin and function, mine considers the first music to
have consisted of simultaneously presented vocal, visual and kinesic ordinary behaviours that
were to some degree altered—simplified or formalized, repeated, exaggerated, elaborated, and
sometimes manipulated to delay (or otherwise confound) expectation—making them non-
ordinary (Section 2.6 of this chapter).
These emotionally powerful alterations of ordinary behaviours—first developed and per-
formed spontaneously (that is, unintentionally, without being taught) as the communicative
musicality of mother–infant interactions—can be called aesthetic or protoaesthetic (i.e., musical
or protomusical). They are, additionally, what artists in every medium intentionally do to attract
attention and to create and shape emotion. It remains, however, to suggest how and why early
human adults began to use and expand their protoaesthetic or protomusical capacities and sensi-
tivities—originally evolved from communicative musicality in mother–infant mutuality—in
religious ceremonies.
Earlier I described the hominid trend towards developing an enlarged brain with increasingly
complex neuronal connections. Among the sophisticated mental abilities that larger brains made
possible in ancestral humans were the expansion of memory (remembering significant events,
both desired and feared) and foresight (the ability to predict and plan). Rather than simply
respond instinctively and contingently, like other animals, to current and changing conditions of
hunger, danger, illness, and other important survival-related states, early humans—with memory
and foresight—at some point would wish to do something about uncertainty (Malinowski 1948;
Per Brandt, Chapter 3, this volume).
An expanded awareness of past and future, and a concomitant concern with cause and effect,
provide the ground and motivation for what we call religion—in brief, the concern about why
good and bad things happen, how they got that way, and what can be done about them. Such
concerns and the arts appear to have developed together. One could say, in fact, that ceremonies
composed of music and associated arts are the behavioural or expressive counterpart of religious
doctrine and belief, providing something ‘special’ (shaped, embellished) to do for humans
cognisant of and attempting to cope with the problems and uncertainties of mortal existence,
whether past, present or future. In ceremonies, the temporal arts, based on the protoaesthetic
operations of communicative musicality, could similarly coordinate and conjoin individuals,
providing emotional reassurance that the group’s efforts would prevail (Dissanayake, Chapter 24,
this volume).
2.6 Control of anxiety

Pace sexual selection arguments, the costly messages that groups and individuals transmit in
ceremonies are less for attracting prospective mates than for attracting spirits, ancestors, and
other forces that affect their lives and can bestow success in hunting, protection in warfare,
prosperity, fertility, traversing important life stages, healing, and so forth. The question here is
why fantastic religious beliefs and practices persist when they (to a modern scientific mind)
do not attract spirits who can give assistance. That is, what is the ultimate function of costly
ceremonial behaviours? I suggest that, by joining with others in music- and art-filled ceremonial
behaviour, individuals may have felt a greater sense of coping with the uncertain circumstances
addressed by the ceremony, and that the effects of the stress response were thereby better ameliorated
than for those who went their own isolated, anxious ways. Psychologists have found that the
feeling of control has considerable positive effects on health and ageing (e.g., Maier and Seligman
1976; Peterson, Seligman and Vaillant 1988) as does the presence of social support (Uchino,
Cacioppo and Kiecolt-Glaser 1996).
Hormones released during prolonged stress are debilitating to a wide range of somatic
functions, including immune system activity, mental performance, growth and tissue repair, and
reproductive physiology and behaviour (Sapolsky 1992). However, the physiological and neuro-
logical effects of entraining brain and body with others—through the vocal, visual, and kinesic
behaviours and aesthetic operations that evolved to establish communicative musicality and
ultimately music/art in ceremonial practice—require and establish a sense of behavioural
control and actually could enable our ancestors to cope emotionally with uncertainty (Mithen
2005, p. 220).
Both infants and adults engage in repetitive kinesic and vocal behaviours for self-soothing,
even to the point of pathological states such as rocking and head-banging (Perry and Pollard
1998). Even captive animals that perform pathological-appearing repetitive behaviours are found
to have lower levels of stress than their counterparts that do not move stereotypically and repeat-
edly (Charmove and Anderson 1989). Mothers use the protomusical operations of communi-
cative musicality to soothe and regularize emotional states in their infants. There is a large
literature on music being used by individuals in modern societies to ‘regulate, enhance, and
change qualities and levels of emotion’ (DeNora 2001, p. 169).
My suggestion here and elsewhere (Dissanayake 1995/1992) that the arts help individuals
cope with anxiety antedates and resembles E.O. Wilson’s suggestion that general intelligence
enables behavioural flexibility, which has been adaptive, and at the same time produces confu-
sion and uncertainty (Wilson 1998, p. 225). For Wilson, the arts are designed ‘to create order and
meaning from the chaos of daily existence [and to] nourish our craving for the mystical’ (Wilson
1998, p. 232). We should not overlook the behavioural and emotional contributions of the arts to stress
reduction in ancestral ceremonial contexts, which derive from the protomusical operations
(or mechanisms of communicative musicality) that originally facilitated conjoinment in
mother–infant mutuality. That is, the adaptive benefit to humans of reassurance through behav-
ioural and emotional coordination should be emphasized as much as cognitive ordering and
understanding (e.g., Taylor 2002, pp. 48, 79, 133; Dissanayake, Chapter 24, this volume).
How might a temporally organized, ceremonialized response to uncertainty have originated?
The human tendency to come together is especially great under stressful circumstances (Taylor
2002, p. 77). Both Malinowski (1922) and Mead (1976/1930) describe small groups in what is
now Papua New Guinea, huddling together and chanting charms in a sing-song voice to abate the
violence of a storm. Having ‘something to do’ in a time of stress, such as moving and vocalizing
rhythmically with one or more companions, would be more soothing—and safer—than going
one’s own isolated, anxious way. If the storm abated without mishap, one can imagine the
chanting becoming more formalized and elaborated during subsequent storms. Another
plausible model for an origin of early human music is the lament, a widespread performance by
individuals or groups in which the natural behaviour of weeping and moaning in grief at the loss,
by death or separation, of a loved one became formalized and elaborated in song/poetry/
movement, and shared with others to relieve feelings of helplessness, individual isolation,
despair, and the anxiety attendant on the interruption death makes to the life of an individual or
group. Even the spontaneous reactions of individuals in the United States after the September
2001 attacks illustrate the therapeutic nature of participation in temporally organized and
elaborated behaviour—listening with others to song, liturgy and poetry, walking solemnly and
formally while holding candles, flowers and flags, or composing poetry to be placed with quiet
ceremony in public places.
2.7 Concluding remarks

I have proposed that the primary adaptive function for mother–infant interaction in which
communicative musicality is so evident—coordinating and reinforcing emotional coordination
of the pair and promoting their mutual feelings of conjoinment—was similarly adaptive when
additionally shaped and elaborated in ceremonial uses of music, although in a group rather than
dyadic context. Such psychobiological cohesion makes it possible for individuals, dyads and
groups to flourish.
My hypothesis fits in with stimulating ideas about music aiding group coordination offered by
Brown (2000a, b), Freeman (2000), Benzon (2001) and Mithen (2005), and is a plausible
antecedent of or alternative to other hypotheses about music’s function. While developing my
ideas, I discovered another scholar who in passing mentioned mother–infant interaction as a
possible source of music (Hodges 1996, p. 46), and after preparing this chapter I found that
Mithen (2005) expressly considered mother–infant interaction to be critical in the evolution of
human music. He also considers the therapeutic effects of music for stress reduction and healing
in Neanderthals and observes that modern humans make music under conditions of adversity
(Mithen 2005, p. 236). Ian Cross (2003; Chapter 5, this volume) suggests that music may have
originated in part as a result of processes concomitant with increasing neotenization (i.e., the
persistence of juvenile traits into adulthood) during hominid evolution, and Jaak Panksepp and
Günther Bernatzky (2002) identify the evolution of song with the attachment/affiliation function
of affection-seeking or affection-expressing vocalizations.
Merker (2000; Chapter 4, this volume) emphasizes unusually developed human abilities that
antedate and contributed to the evolution of human music, including the unique ability to keep
time to a common pulse. His account begins not with hominid mother–infant interaction, but
with the synchronous chorusing and foot-stomping of late Miocene ancestors, which he suggests
contributed to group coordination of males for the purpose of attracting females and for
competing against other groups. Hagen and Bryant (2003), in a paper that is admirably
illustrated with ethnographic examples, propose that music and dance may have evolved as a
coalition-signalling system, perhaps originating from coordinated territorial defence signals.
McNeill (1995) remarks on the ‘muscular bonding’ that occurs with ‘keeping together in time’ in
dance and military drill. Although these scholars emphasize ancestral music as coordinating and
strengthening bonds between males so they can compete with other groups, this function does
not belie a hypothesis of music’s origin in the protomusical performances of mother–infant
interaction.
Benzon (2001) suggests that music, by recruiting so many different parts of the brain, enables
neural circuits to achieve coherent temporal form, and that this coherence is subjectively experi-
enced as pleasurable and satisfying, relieving the anxiety of incoherence. Benzon examines
several suggestions for evolutionary origin (although he does not consider the protomusical
components and operations in mother–infant interaction) and concludes that although bio-
logical adaptation may have played a role in the evolution of the precursors to ‘musicking’
(his term for music behaviour), he finds the effects of culture on music to be more relevant
(Benzon 2001, p. 190).
The phenomenon of human music is an ancient and mighty tree with many branches, leaves,
flowers—and a burl or two. From its roots in communicative musicality, its bole, or trunk, rises
as a thick compendium of mechanisms that foster emotional communion and conjoinment.
In turn, these mechanisms support the superstructure of music with its variety of biological,
social and cultural manifestations and purposes—some of them far from and even different from
their source, as in the solitary rewards of listening to or making music alone (Kivy 1990). This
view of human music as rooted in communicative musicality helps us to appreciate music’s
emotional and transformative power in human experience and to understand its antiquity and
unique importance in our species.
References
Aiken NE (1998). The biological origins of art. Praeger, Westport, CT.
Benzon W (2001). Beethoven’s anvil: Music in mind and culture. Basic Books, New York.
Blacking J (1995). Music, culture, and experience: Selected papers of John Blacking, edited and with an
introduction by R Byron. University of Chicago Press, Chicago, IL.
Boas F (1925). Stylistic aspects of primitive literature. Journal of American Folklore, 38, 329–339.
Brown S (2000a). The ‘musilanguage’ model of music evolution. In NL Wallin, B Merker and S Brown, eds,
The origins of music, pp. 271–300. MIT Press, Cambridge, MA.
Brown S (2000b). Evolutionary models of music: from sexual selection to group selection. In NS
Thompson and F Tonneau, eds, Perspectives in ethology XIII: Evolution, culture, and behavior, pp.
231–281. Plenum, New York.
Brown S (2002). The great debates: Rameau vs. Rousseau, Spencer vs. Darwin, Miller vs. Brown. Paper
presented in session on evolutionary musicology, International Musicological Society Meetings, Leuven,
Belgium, 1–7 August, 2002.
Bücher K (1899). Arbeit und Rhythmus, 2nd edn. BG Teubner, Leipzig (original publication 1896).
Bücher K (1910). Die Entstehung der Volkswirtschaft. H Laupp, Tübingen.
Cadar UH (1975). The role of Kulintang in Maranao society. Ethnomusicology, 2, 49–62.
Carroll J (2004). Literary Darwinism: Literature and the human animal. Routledge, New York and London.
Catlin A (1992). Homo Cantens: why Hmong sing during interactive courtship rituals. Selected Reports in
Ethnomusicology, 9, 43–60.
Charmove AS and Anderson JR (1989). Examining environmental enrichment. In EF Segal, ed., Housing,
care and psychological well being of captive and laboratory animals, pp. 183–202. Noyes Publications,
Park Ridge, NJ.
Coe K (2003). The ancestress hypothesis: Visual art as adaptation. Rutgers University Press, New Brunswick, NJ.
Cole H and Aniakor CC (1984). Igbo arts: Community and cosmos. Museum of Cultural History, University
of California, Los Angeles, CA.
Combarieu J (1894). Les rapports de la musique et de la poésie considérées au point de vue de l’expression.
Flammarion, Paris.
Cross I (2003). Music and evolution: Consequences and causes. Contemporary Music Review, 22(3), 79–89.
Damme W van (1996). Beauty in context: Towards an anthropological approach to aesthetics. Brill, Leiden.
Darwin C (1874). The descent of man and selection in relation to sex, 2nd edn. AL Burt, New York.
Degler CN (1991). In search of human nature: The decline and revival of Darwinism in American social
thought. Oxford University Press, New York.
DeNora T (2001). Aesthetic agency and musical practice: New directions in the sociology of music. In PN
Juslin and JA Sloboda, eds, Music and emotion: Theory and research, pp. 161–180. Oxford University
Press, Oxford.
Dissanayake E (1994). Chimera, spandrel, or adaptation: Conceptualizing art in human evolution. Human
Nature, 6, 99–117.
Dissanayake E (1995). Homo aestheticus: Where art comes from and why. University of Washington Press,
Seattle, WA (original publication 1992).
Dissanayake E (2000a). Antecedents of the temporal arts in early mother-infant interaction. In NL Wallin,
B Merker and S Brown, eds, The origins of music, pp. 389–410. MIT Press, Cambridge, MA.
Dissanayake E (2000b). Art and intimacy: How the arts began. University of Washington Press, Seattle, WA.
Dunbar R (1996). Grooming, gossip and the evolution of language. Faber, London.
Eibl-Eibesfeldt I (1975). Ethology: The biology of behavior, 2nd edn, translated by Erich Klinghammer. Holt,
Rinehart and Winston, New York.
Falk D (2004). Prelinguistic evolution in early hominins: whence motherese? Behavioral and Brain Sciences,
27(4), 491–503.
Freeman WJ (2000). A neurological role of music in social bonding. In NL Wallin, B Merker and S Brown,
eds, The origins of music, pp. 411–424. MIT Press, Cambridge, MA.
Geissmann T (2000). Gibbon songs and human music from an evolutionary perspective. In NL Wallin,
B Merker and S Brown, eds, The origins of music. MIT Press, Cambridge, MA.
Geist V (1978). Life strategies, human evolution, environmental design. Springer, New York.
Givón T and Young P (2002). Cooperation and interpersonal manipulation in the society of intimates.
In M Shibatani, ed., The grammar of causation and interpersonal manipulation, pp. 23–56. John Benjamins,
Amsterdam.
Hagen E and Bryant GA (2003). Music and dance as a coalition-signaling system. Human Nature, 14, 21–51.
Hall-Craggs J (1969). The aesthetic content of bird song. In RA Hinde, ed., Bird vocalizations, pp. 367–381.
Cambridge University Press, Cambridge.
Hodges DA (1996). Human musicality. In DA Hodges, ed., Handbook of music psychology, 2nd edn,
pp. 29–68. IMR Press, San Antonio, TX.
Hornbostel EM von (1975). The problems of comparative musicology. In KP Wachsmann, D Christensen
and H-P Reinecke, eds, Hornbostel Opera Omnia I: 247–270. Nijhoff, The Hague. Translated by
R Campbell (original publication 1905).
Huron D (2001). Is music an evolutionary adaptation? Annals of the New York Academy of Sciences, 930, 43–61.
James W (1890). Principles of psychology, Vol. 2, Henry Holt, New York.
King BJ (2004). The dynamic dance: Nonvocal communication in the African great apes. Harvard University
Press, Cambridge, MA.
Kivy P (1990). Music alone: Philosophical reflections on the purely musical experience. Cornell University Press,
Ithaca, NY.
Kuttner FA (1990). The archaeology of music in ancient China: 2000 years of acoustical experimentation, ca.
1400 BC–AD 750. Paragon, New York.
Lacépède M le comte de (1785/1970). La poétique de la musique. Slatkine Reprints, Geneva (original
publication 1785).
Lucretius Carus Titus (1937). De rerum natura, English translation by RC Trevelyan. Cambridge University
Press, Cambridge.
Maier SF and Seligman MEP (1976). Learned helplessness: Theory and evidence. Journal of Experimental
Psychology: General, 105, 3–47.
Malinowski B (1922). Argonauts of the Western Pacific. Routledge and Kegan Paul, London.
Malinowski B (1929). The sexual life of savages. G Routledge and Sons, London.
Malinowski B (1948). Magic, science, and religion. Beacon Press, Boston, MA.
McLaughlin T (1970). Music and communication. Faber, London.
McNeill WH (1995). Keeping together in time: Dance and drill in human history. Harvard University Press,
Cambridge, MA.
Mead M (1976). Growing up in New Guinea. Morrow, New York (original publication 1930).
Merker B (2000). Synchronous chorusing and human origins. In NL Wallin, B Merker and S Brown, eds,
The origins of music. MIT Press, Cambridge, MA.
Miller G (2000a). Evolution of human music through sexual selection. In NL Wallin, B Merker and S
Brown, eds, The origins of music, pp. 329–360. MIT Press, Cambridge, MA.
Miller G (2000b). The mating mind: How sexual choice shaped the evolution of human nature. Doubleday,
New York.
Miller WB and Rodgers JL (2001). The ontogeny of human bonding systems: Evolutionary origins, neural
bases, and psychological mechanisms. Kluwer, Dordrecht.
Mithen S (2005). The singing Neanderthals: The origins of music, language, mind and body. Weidenfeld and
Nicolson, London.
Monboddo JBL (1774). Of the origin and progress of language, Vol. 1. Balfour, Edinburgh.
Morley I (2002). Evolution of the physiological and neurological capacities for music. Cambridge
Archaeological Journal, 12, 195–216.
Namu YE and Mathieu C (2003). Leaving mother lake: A girlhood at the edge of the world. Little, Brown,
Boston, MA.
Ottenberg S (1989). Boyhood rituals in an African society: An interpretation. University of Washington Press,
Seattle, WA.
Panksepp J and Bernatzky G (2002). Emotional sounds and the brain: The neuro-affective foundations of
musical appreciation. Behavioural Processes, 60, 133–155.
Pedersen CA, Caldwell JD, Jirikowski GF and Insel TR (eds) (1992). Oxytocin in maternal, sexual and
social behaviors. Annals of the New York Academy of Sciences Vol. 652.
Perry BD and Pollard R (1998). Homeostasis, stress, trauma, and adaptation: a neurodevelopmental view
of childhood trauma. Child and Adolescent Psychiatric Clinics of North America, 7, 33–51.
Peterson C, Seligman MEP and Vaillant GE (1988). Pessimistic explanatory style is a risk factor for physical
illness: a thirty-five-year longitudinal study. Journal of Personality and Social Psychology, 55, 23–27.
Pinker S (1997). How the mind works. Norton, New York.
Pinker S (2002). The blank slate: The modern denial of human nature. Viking, New York.
Pole W (1924). The philosophy of music, 4th edn. Harcourt Brace, New York (original publication 1879).
Proschan F (1992). Poetic parallelism in Kmhmu verbal arts: From texts to performances. Selected Reports
in Ethnomusicology, 9, 1–31.
Révész G (1941/1953). Introduction to the psychology of music, translated by GIC de Courcy. Longmans
Green, London (original publication 1941).
Rousseau JJ (1761/1986). Essay on the origin of languages which treats of melody and musical imitation.
In JH Moran and A Gode, eds, On the origins of language, pp. 5–74. University of Chicago Press,
Chicago, IL (original publication 1761).
Rowbotham JF (1880). The origin of music. Contemporary Review, 38, 647–664.
Sachs C (1962). The wellsprings of music. M Nijhoff, The Hague.
Sapolsky RM (1992). Neuroendocrinology of the stress response. In JR Becker, SM Breedlove and
D Crews, eds, Behavioral Endocrinology, pp. 287–324. MIT Press, Cambridge, MA.
Scherer KR and Zentner MR (2001). Emotional effects of music: production rules. In PN Juslin and
JA Sloboda, eds, Music and emotion: Theory and research, pp. 361–392. Oxford University Press, Oxford.
Schneider M (1957). Primitive music. In E Wellesz, ed., New Oxford history of music I: Ancient and oriental
music, pp. 1–82. Oxford University Press, Oxford.
Spencer H (1857). The origin and function of music. Fraser’s Magazine, 56, 396–408.
Stumpf C (1911). Die Anfänge der Musik. JA Barth, Leipzig.
Sugiyama LS and Scalise Sugiyama M (2003). Social roles, prestige, and health risk: social niche specialization
as a risk-buffering strategy. Human Nature, 14, 165–190.
Taylor SE (2002). The tending instinct: How nurturing is essential to who we are and how we live. Henry Holt,
New York.
Uchino BN, Cacioppo JT and Kiecolt-Glaser JK (1996). The relationship between social support and
physiological processes: a review with emphasis on underlying mechanisms and implications for health.
Psychological Bulletin, 119, 488–531.
Wallaschek R (1893). Primitive music: An inquiry into the origin and development of music, song, instru-
ments, dances and pantomimes of savage races. Longmans Green, London.
Weber M (1958). The rational and social foundations of music. Southern Illinois University Press,
Carbondale, IL (original work published 1921).
Wilson EO (1998). Consilience: The unity of knowledge. Knopf, New York.
Yeats WB (1928). The Tower. Macmillan, London.
Zahavi A and Zahavi A (1997). The handicap principle: A missing piece of Darwin’s puzzle. Oxford University
Press, Oxford.
Chapter 3
Music and how we became
human—a view from
cognitive semiotics
Exploring imaginative hypotheses
Per Aage Brandt
3.1 Introduction
On the evidence from palaeontology, our species, Homo sapiens, was biologically stable and phys-
iologically modern 160,000 years ago (Stringer 2003). When glaciation stopped 150,000 years
later, agriculture, writing, and history emerged—cultural life based on a symbolically represented
shared past. Somewhere in the middle of this long period, perhaps about 50,000 years ago,
humans apparently began to ‘make sense’ together—to symbolize, paint, speak, and form kinship
systems that held communities together—and, according to the scenario I propose, perhaps first
made music. It is commonly estimated that in the Upper Palaeolithic (40,000–10,000 years ago,
during the Würm glaciation), humans equipped with Aurignacian culture technology and
cooking by fire began to paint in caves, to dance and to make musical sounds; they may have
chosen pitches, rhythms and melodic forms by beating on resonant objects, blowing in hollowed
objects, and striking stalactites to create pitched sounds. (For a broader evolutionary context see
Wallin et al. 2001).1
There are several ways to arrive at the hypothesis that musical practice preceded the symbolic,
or intentionally semiotic, message-signalling practices of modern humans. In the next section,
I present arguments for this hypothesis.
3.2 Facing death and danger

Memory-based feelings—such as those related to the collective commemoration of the dead,
to ritual forms of imaginary communication with remembered persons, and hence the cult
of ancestors, belief in their existence as spirits and ghosts, and experiences implied in the convo-
cation of these spirits—are probably the ancestors of modern ‘gods’. Especially in situations of
collective crisis, such immaterial beings are called on through ceremonial performances, activa-
ting the genres of human sensitivity and activity that we now call religious. These events are, in
all known cultural communities, linked to musical performances (Merker Chapter 4, Cross and
Morley Chapter 5, Dissanayake Chapter 24, this volume).
1 Beaune (1995, pp. 220–225) mentions the flutes found in Aurignacian and later caves (Isturitz) and
the stalagmite indentations in caves such as Pech-Merle, Portel and Clastres, suggesting the use of these
formations as petrophones; acoustic analyses have shown the good resonance in caves such as Niaux,
Fontanet and Portel, especially in the areas where parietal figurations appear. She notes however that such
uses remain difficult to prove.
Singing, the articulation of the human voice into stable tones and intervals, links the emotions
of the breath with the rhythm of body movement. The ‘discretization’ that transforms an original
glissando into a series of distinct tonal steps is crucial to the change from shouting to chanting
and singing. The shared experience of articulate singing and of the song-imitating sounds
of melodic and rhythmic instruments universally affects our embodied minds by creating
‘non-pragmatic states’, i.e., states of non-functionality—of contemplation, exaltation or even
trance—that are typically expected and presupposed in situations of sacredness: celebration,
commemoration and invocation.
Collective musical practices also form the aesthetic framing of many trivially pragmatic (work-
related) forms of negotiation and cooperation, such as the institutional genres of functional
verbal communication; these can still entail occasional hymnic singing, performative chanting,
ceremonial choreography and gestural control (as, for example, in conveying politeness). School
assemblies, parades, even contemporary TV news programmes, are examples of quasi-pragmatic
uses of music or musicality used to consolidate feelings of community, to placate social fears and
to confirm security.
3.3 Traces of music in language

In all known languages, regular intonation patterns connect lexical items and syntactic
constructions. Linguistic expressivity includes and integrates levels of ‘musical’ phrasing, from
syllabic quantity, stress and tone, to clause melodies and syntactic emphasis, and from there to global
intonational profiles marking utterance modes and discourse genres (such as ‘narrative’ versus
‘argumentative’, ‘exhortative’, ‘imperative’). In modern phonetics, it is accepted that intonation
profiles universally distinguish imperative, interrogative, affirmative and affective modes of utterance
meaning (for example, see Bolinger 1983). Dialogical rhythms of turn-taking and attunement
to emotionally determined styles of legato, staccato and rubato phrasing in different tempi are
important for the proper use of language in conversation and in the performance of speech acts
(Fonagy 2001).
These constitutive ‘suprasegmental’ structures or dynamic features may be a residue of
antecedent and still-active underlying forms of musical expressivity, although it is of course
impossible to find conclusive evidence for this hypothesis from a lost past. Clause embedding
(such as the insertion of completive, relative and adverbial phrases in a matrix sentence) is freely
phraseable in oral expression by changes in tone and tempo: in ordinary speech, we sponta-
neously ‘sing’ the overall structure of our grammatical sentences in accordance with the intersub-
jective circumstances and our purpose. Remarkably, there is no well-established theory of the
origin of this phenomenon. Nevertheless, we need only pay attention to the role of playful
singing and rhyming in infant and toddler language acquisition, from early babbling up to
the multiclause stage, to provide a strong demonstration of the formative force of expressive
musicality (Powers and Trevarthen, Chapter 10, this volume).
3.4 Language into music

The transformation of sentences into verses occurs universally, and is always understood as a
specific poetic device. Poetry exists in all known cultures as an aesthetic genre of oral expression,
in which the text is framed by some sense of music: poetry is chanted, sung or solemnly recited,
often to a background of accompanying music. Even when the music seems to vanish into a silent
metric pattern, leaving the pattern of the ‘feet’ of unaccompanied verse as a formal framework for
the poetic genre of signification (e.g., in the academic poetry of the past five centuries of Western
culture), this rhythmic framing or integration remains phenomenologically constitutive of the
poetic. In poetry, language and music are made one by a surprisingly smooth mapping from the
former to the latter. If language already ‘contains’ and builds on musical phrasing and musical time,
this transposition is easier to explain (Turner and Pöppel 1999; Miall and Dissanayake 2003).
Language is a distinct activity of the socialized human mind. It acquires a triple compositionality
of its own: phonetic, syntactic and semantic components in the ‘structure’ of language
enable us to think and share ideas of absent, past, distant things. We do not currently know in
detail how our mental and neural architecture has shaped the relation of language to music (but
see Turner and Ioannides, Chapter 8, this volume). We do not yet know if music and language
evolved independently, or if language could have evolved without music. Nevertheless, it remains
a plausible hypothesis that language emerged ‘embedded’ in music, implying that poetry
preceded prose (Cross and Morley, Chapter 5, this volume).
To our counterfactual imagination, it appears that if these two semi-automatic communication
systems, language and music, had been and stayed mutually unconnected, they would both
be reduced to functional signalling systems little different from those of many animals, with
limited referential or explicit narrative power. I believe there is something in music, or musicality,
that language needs structurally in order to be symbolic (in the technical sense); that is, language
needs musicality to be able to intentionally refer to states of affairs outside the deictic ‘here and
now’ of persons in communication. This ‘something’ includes in particular the invocational
effect of rhythm in expressive movement.
The following sections present more specific ideas on the role of music in the constituting of
humans as a ‘symbolic species’ (Deacon 1997).
3.5 An indispensable emotional background to naming identities

There is overwhelming evidence of a fundamental, stable and primordial connection between
music and feelings and, in particular, emotional states related to the inter-human affective state
we call love. Linguistico–musical compositions in the world literature of scores and the texts of
songs, lieder, hymns, dramatic works, ballads, operas, that is, language-related musical creations,
generally show a constant semantic preference for this affective category as a thematic focus.
Poetry in world literature is predominantly ‘about’ this particular theme and the affective state
of love.
Such a semantic binding to a specific preferential domain of content calls for semiotic
reflection. What is being signalled in love songs everywhere? There must be a very strong connection
between this realm of affective state between persons and ‘musicality’. My rather unromantic
suggestion follows.
Once the technology of tools and weapons allowed our species to extend its respective territories
of operation, namely male long-distance hunting and female short-distance roaming, and
especially fishing, something like what we call ‘couples’, or adult parenthood partners, must have
endured longer periods of separation. Fishing and local foraging allow more sedentary living
habits and thus favour stationary nursing. The fine motor digital skills of females, manifested in
the production of adornments and fishing tools, could also have been developed during the same
period of early symbolic constitution (Cleyet-Merle 1990). The human concept of parenthood,
family relations and stable partnership—the notion of a ‘loving couple’—presupposes a capacity
to recall and recognize the ‘(significant) other’, to identify the beloved’s face and person and,
eventually, to associate these permanently with a given proper name.
34 PER AAGE BRANDT
Names, in this sense, are not generally used for referring to trivial artefacts or objects and
animals, but primarily to persons, and hence to personal belongings and territories. ‘Proper
names’ and ‘common nouns’ are linguistically and semiotically distinct. Nouns pertain to the
natural mental process of categorization, whereas proper names are grounded in speech acts
and possessive intersubjective relations—to interpersonal ties. However, proper names have
additional, absolutely decisive semiotic qualities. They make it possible to designate the numerical
identity of one particular individual, and thus to signify the singularity of a given individual
entity, not just the qualitative or useful properties of that entity—precisely what we do when
naming persons. The significance of the philosophical distinction between numerical and qualitative
identity is not commonly understood in contemporary ‘materialistic’ culture. ‘Sameness’
refers either to an individual’s continuous existence through time (staying the same) or to
a property shared by several individuals (that are ‘like’ one another). I am ‘me’ by numerical
identity, and I am a certain person by qualitative identity. This account leaves unattended the
special affective and intermental relations between persons.
Whether the entity is a person or not, once the principle of naming is installed, the signified
singularity of an item makes it possible to ‘cognize’ it as an abstract ontological entity, a ‘countable’
being, perceived with a numerical (i.e., radically individual) self-identity. The named entity is
stable through time, precisely like a ‘love for a lifetime’ addressing the (same) ‘one and only’
person. This emotional binding to oneness is a cognitive capacity that appears to be only vaguely
present in other species, and one that humans in certain psychopathological states can lose.
Capgras’ syndrome in paranoid schizophrenia (Huang et al. 1999) is a central example; milder
forms are common in cases of paranoia in love relations. Jealousy may in fact be the most
frequent manifestation of all.
Nostalgic songs expressing a longing for an absent beloved person appear common to all texted
music. The name of the beloved is a quasi-obligatory part of such songs. A contemporary jazz
songbook will include such songs as ‘I loves you Porgy’, ‘Dindi’, ‘Stella by Starlight’, ‘Michelle’,
‘My Funny Valentine’ and ‘Sweet Lorraine’. Grieving songs recalling deceased loved persons
generally follow the same pattern, and just as vividly evoke the spiritual presence of the person
thus designated. Names are small phonetic songs in themselves, and the melody of a name-song
can identify a person (a thematic principle exploited in opera and cinema). When we vocally call
each other at a distance, the melodic aspect of the sound sequence is particularly efficient.
Something like the note series C-A-A-F might often be heard as the melody for calling ‘Se-bas-ti-an’,
with reduced versions such as A-A-F for ‘Jo-na-than’, and just A-F for ‘John-ny’ (see also Rainey
and Larsen 2002).
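The way the calling melody shortens with the name can be sketched in a few lines of Python. This is only an illustration, and it rests on an assumption that goes beyond the three examples given: that a shorter name simply takes the last notes of the four-note series C-A-A-F. The names FULL_CALL and call_melody are hypothetical.

```python
from typing import List, Tuple

# Note series given in the text for a four-syllable name ('Se-bas-ti-an').
FULL_CALL: List[str] = ["C", "A", "A", "F"]


def call_melody(syllables: List[str]) -> List[Tuple[str, str]]:
    """Pair each syllable of a called name with a note.

    Assumption (not stated in the text): a shorter name keeps the
    final notes of the full series, as the three examples suggest.
    Only names of two to four syllables are covered by those examples.
    """
    n = len(syllables)
    if not 2 <= n <= len(FULL_CALL):
        raise ValueError("only two to four syllables are illustrated")
    notes = FULL_CALL[-n:]  # last n notes of C-A-A-F
    return list(zip(syllables, notes))


if __name__ == "__main__":
    for name in (["Se", "bas", "ti", "an"], ["Jo", "na", "than"], ["John", "ny"]):
        print(call_melody(name))
```

On this reading, the falling close A-F is the stable core of the call, and longer names add notes in front of it.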
The point here is that proper names should be understood from the point of view of the
musicality of personhood: these nominal entities are arbitrary, emphatically conventional,
symbolic signs established by performative rituals, and basically ‘mean’ or refer to the affect
(love) that first made an individual into a person, a subject inscribed in kinship relations and
recognized as a singular and personalized being. Names are, of course, intimately related to
parental feelings, to the procedures of ‘giving’ names, analogous to the idea of ‘giving’ life,
and especially to the existence of a universal practice of voiced interaction between infant and
parents (Trevarthen and Malloch 2002; Dissanayake, Chapter 2, this volume). Similarly, we find
animals carrying a proper name more difficult to eat than anonymous creatures. Their name
makes them a ‘person’. However, this love-borne nominalism and personalized orientation
in music and poetry, by which music inherently seems to ‘think about’ love for someone (and
the love that seems to think about music for them), needs in its turn a grounding in additional
semiotic factors and circumstances of communication, such as those we will consider in the
following sections.
3.6 Homunculus in the artistic and musical sign

The global musical sign constitutes in itself, in the motivation behind it, an important prerequisite
for its emotional use. Let me explain this semiotic phenomenon after first presenting what is
perhaps a more familiar pictorial analogy.
A painting, for example a landscape, offers first an iconic (pictorial) relation between a canvas
framing a complex multitude of graphic and chromatic events appearing on the painted surface,
and a framed view of the depicted landscape, as seen from a window, or from another limited
vantage point. The landscape in question may be a real place whose name appears in the title of
the painting—a representation of an existing geographic locality—or it may be a pure invention
by the painter. To the observer, it shows a fragment of a particular kind of possible ‘world’.
It offers a supposedly intentional glimpse into this world in such a way that the properties of the
glimpse illustrate the general character of the whole it refers to. That is, the part symbolizes an
underlying, more general whole: it ‘stands for’ the place.
Thus, the initial icon gives rise to an intentional act of symbolization, and the landscape
painting is now a symbol of a character, style or atmosphere, or a state of mind, in a spatial
habitat. Since the painting in front of us addresses our attention without further specification,
we ‘read’ it as an unspecified, existential index: a human mind was there and as a materialized
symbolizer is still here with us now, through the presence of his work, showing us the place.
Symbolization always yields the metonymic presence of the symbolizer. Inversely, it may be true
to say that the semiotics of metonymy always involves or stands for an act of symbolization of
some sort.
A painting may thus be represented as a cascade of sign functions, IC (icon) → SY (symbol) → IN
(index), where the initial percept, its icon, is again a sign or symbol whose content is yet a third sign,
the index, that contains the presence of the ‘ghost’ (spirit) of the artist (Figure 3.1).
This tripartite sign produced by pictorial iconicity can be compared to what happens in the
experience of musical performance.
1 The rhythmic and melodic gesture will suggest a body making that gesture. In this sense, the
auditive form, as an icon, means (signifies) the bodily gesture, even if the movement is not
actually shown, but only ‘played’ and heard in musical sound.
2 The idea of bodily gesturing in the sound will be a symbol; it will mean and signify a person
in a corresponding state of mind and emotive movement.
3 Since this very abstract affective meaning, or symbolic content of the iconic sign, occurs
at the very moment of hearing the music, it will ultimately, as an index, yield to those who
are sharing the musical experience a feeling of the presence of the ghost, or spirit, or avatar
(or whatever we might call it) of that musician.
Fig. 3.1 The cascade of sign functions for a painting: icon (IC), symbol (SY) and index (IN).
[Diagram labels: Painting; IC: paint, motif (landscape); SY: style of motif, state of mind; IN: feeling of presence, mind.]
Fig. 3.2 The cascade of sign functions for a musical event: icon (IC), symbol (SY) and index (IN).
[Diagram labels: Music played; IC: auditive, gesture; SY: style of gesture, state of mind; IN: feeling of presence, mind.]
Thus, in the case of a musical event, there would be a corresponding sign cascade (Figure 3.2).
The cascade format of semiotic meaning in process is clear in these and related cases of art.
However, the semiotic cascade may also be cognitively active in other forms of communication
by explicitly expressive signs, whether arbitrarily coded or not—such as facial expressions,
theatrical gestures of politeness, pragmatic signposts, signboards. The particular interest that
humans take in the artistic cascade, however, is undoubtedly due to the forceful feeling created by
a particularly elaborate iconic stance in art, by which the symbolic function is built into the
content of the icon and therefore made immanent and disembodied, so that the symbolized
emotional state of mind does not carry the signature of the performer, but will instead remain an
immanent semantic property of the artistic piece of work. The participants will be able to feel,
sense or accept the emotion of the state of mind in question without ‘being in it’. The subject of
the mind whose presence is felt by the participants is what I propose to call a ‘homunculus’, an
imaginary persona or ‘virtual other’ experienced as immanent in the work of art. When art is
associated with cultural, institutional and discursive practices of different kinds, including
religion, the authority of a voice experienced as emanating from an artistic expression will then
be associated with the abstract homunculus—whose disembodied status will endow it with a
particular symbolic force, perhaps accounting for the dynamic effect that we call ‘sacredness’.
In the evolution of cultural practices, I claim that the necessary presence, at first, of such authority-
yielding symbolic forces—especially in the execution of performative acts and rituals—stems
from the semiotic homunculus. Music generates sacredness. Furthermore, it is probable that visual
cascades appear in evolution subsequent to auditory cascades. It is possible to derive the symbolic
meaning from the iconic content only to the extent that different modes of representation can be
perceived as ‘styles’ or graphically manifested expressive gestures (responsible for strokes, colours,
contours and light) characterizing variable mental ‘styles’ or perceptive modes of seeing. By
contrast, musical rhythms (corresponding to strokes), soundings (colours), and melodic phrasing
(contours) directly inform our bodies of the way to dance in order to unfold their meaning;
we immediately grasp the state of their homuncular, out-of-body mental being, or spirit. Meaning,
as distinct from the fact of someone who ‘means’ something by saying it or playing it, is homuncular.
It transcends its performer (see Chapter 5 by Cross and Morley, this volume, on ‘floating
intentionality’).
In so far as this privilege of auditory imagery has always been a property of our motor-based
perception of temporal events, music may have guided other expressive modalities and eventually
language; the voice heard in different forms of enunciation (such as irony, bathos, imperatives
and interrogatives) is indirectly, theatrically linked to the speaker, and directly related to this
homuncular symbolic force. Implicit narrators in fiction and humour, impersonal bureaucratic
formulaicity and juridical textuality all rely on homuncular enunciation. The law ‘speaks’, or
rather chants, and we can sympathize with this authoritative voice or mock it by letting it sound
like good or bad music.
There is a structured process in the architecture of the human mind that ‘does’ semiotic
cascades and the expressive body codings associated with them, and that represents the virtual,
homuncular other in relation to the Ego (the Self of which the subject is aware). Let me outline
briefly and speculatively the general semiotic view that underlies this line of thinking about
cognitive aesthetics and musicology.
3.7 Mental architecture and the communicative role of music

The human mind organizes knowledge about the spatial and temporal world, including, most
intimately, the body that hosts it. Just as importantly, it organizes the functional and expressive
acts of its individual host as an embodied person in society, namely in a society of persons
sharing significant homunculi, and being moved by them, while sharing imagery and music.
Thus, we perceive and also perform. To account for this double perspective of our subjectivity in
theoretical or philosophical terms is highly complicated; current research is far from having an
elaborate model at its disposal for orienting the required technical and empirical investigations.
Nevertheless, there are certain elementary principles that have begun to emerge, allowing us to
form an initial, minimally ordered view of what the mental brain is doing. Two dimensions must
be distinguished:
1 A ‘vertical’ dimension in which afferent integration builds up content from input, and efferent
integration builds up our agentive programmes as output. In this sense, borrowing terms
from neuroscience, we could speak of afferent cognition and efferent cognition: experience
and intention. (Afferent means bearing or conducting inwards; in neurology, conveying
impulses toward the central nervous system. Efferent means conducting outward from an
organ; conveying impulses to an effector).
2 A ‘horizontal’ dimension, in which different levels of mental work are articulated, both separated
and connected in function.
(a) In afferent cognition, five superimposed levels of distinct and relatively independent
conscious meaning production, as a minimum, appear to be operating in parallel:
(i) perception, which precedes
(ii) categorization and conceptual categories, which precede
(iii) situational scenario formation, also called narrative cognition;
(iv) comparative and reflective recall, which constitute a fourth level of consciousness, and
(v) an ultimate level of free-floating imagination and such phenomena as ‘offline’ representations,
ideas and daydreams.
In this order, each level presupposes systematic access to the products of the preceding level.
(b) Efferent cognition, the last level of the outward oriented process, shapes our bodily
actions in the surroundings we perceive. It must be closely related to the first level of the
afferent process by some sort of bridge, creating a shared level, since specific sensory
perceptions (gestalts) can directly and spontaneously trigger or confirm certain gestures
and reflexes. These are typically deictical moves by which we apply volition, positive or
negative, to what we sense, in order to better perceive it as the prospect that we had when
we acted in that way. Behind this level of deixis and volition, or underlying it, afferent
categorization must be connected to efferent object-oriented motor routines by a second
bridge; the bridge on this second level, between afference and efference, may therefore be
related to lexical structure in language. On a still deeper level, efference prepares
sequences of acts that express superordinate intentional meanings, connected to afferent
situational understanding by a narrative organizer or ‘planner’ of temporal experiences
(related to semio-syntactic structure in language). Underlying this level, our semiotic
body finds its affective tonus (emotional attitude), by which it reflexively supports our
ongoing acts and action sequences; this emotional attitude could be connected to the
variations of enunciation in language. Afferent imagination is matched by efferent pulses
of rhythm; pure rhythmic attention, stepping into the expected experience of intended
acts, may be the afferent–efferent bridge. This might seem a strange claim, but we may
think of imagination as creating expectant states of impatience, and rhythm, including
tapping by the fingers and feet, as connected phenomena; or we may think of the way in
which depressive or ecstatic phantasizing (imaginary thinking) affects the tempo of our
iterative routines. This last connection between imagination and rhythm must interconnect
offline representational awareness and online presence-oriented awareness on a
bridge of what philosophers might want to call a pure phenomenological consciousness
(here just called ‘attention’).
The hypothetical model of our mental architecture is represented in Figure 3.3.
The phenomena that semioticians and philosophers refer to as forms of meaning are mental
contents that belong neither to the afferent nor to the efferent line exclusively, but may float
freely from side to side, precisely like the forms of structure characterizing language. Linguistic
structures seem valid as principles of organization in both directions, since we listen and speak
through the same grammatical forms. Only in foreign language acquisition (and in early childhood)
do we observe a significant difference in afferent and efferent competence. We are, incidentally,
normally better at reading (hearing) than at writing (speaking) a foreign language (and toddlers
understand, frighteningly, more than they can say!). This difference is probably due to the role
of consciousness of others’ actions and expressions in language learning; for many reasons, it is
easier to attend to reception (afferent content) than to production (efferent content) if one is
oriented to the reception of a message intended by someone. Other individuals are apparently
more salient in the afferent than in the efferent line of processing (and children live, inquisitively,
as our wards).
Fig. 3.3 A hypothetical model of our human mental architecture.
[Diagram labels: afferent flow (sensation) and efferent flow (behavior), over somatic receptors and impulses, paired at five levels: gestalt perception and expressive gesture, bridged by deixis; concept categorization and object orientation, bridged by the lexical strand; situation narrativization and intentional act planning, bridged by the syntactic strand; recall and reflection and emotion, bridged by enunciation; imagination (fantasizing) and rhythm, bridged by attention.]
If the architectural hypothesis presented here is solid, then music is essentially both a matter of
auditory perception and of deep, abstract ideation, an ideation that originates with the impulse
to move, i.e., with action. Whereas auditory events (noises and environmental sounds) are generally
perceived to be integrated into multimodal clusters of objectal concepts, since ‘things’ yield
multimodal sensations, musical sounds are perceived as tones, which have rhythmic meaning as
beats. We need to ask what the particular principle underlying this truly strange fact could be.
Of course, the ‘strange fact’ is comparable to what happens in visual art and pictorial and graphic
iconism in general: the visual mode is kept separate from other possible sensory gestalts offered
by the source of perception; otherwise there would be no ‘image’. The auditory percept is
thus carried through all standard instances—categorization, narrativization, reflection and
imagination—without being absorbed by contextual meaning, and is then interpreted as an
event manifesting the spiritual presence of some being.
We may explain this symbolico–aesthetic miracle simply by stressing that musical sound
is perceived as an intentional gesture, i.e., as a ‘symptom’ of someone moving in a particular
expressive way. Since it is immediately understood as an intentional expression, attention is
drawn toward the category: Other Person’s Conscious Doing. Since musical sounds are knowingly
produced, the actual Other Person playing is conceived of as particularly self-conscious,
so that there are inherently three intentional processes occurring at the same time: a listener’s
conscious attending, a player’s conscious attending to what is played, and the consciousness
invested in the music that the player attends to while playing it. This last intentional instance is
precisely what the listener foregrounds. It is not the player’s autocontrolling (the technique),
but the musical flow that the player intends to control, and that is thus objectified during its
production as an autonomous instance: the meaning of what is played.
This rather tricky phenomenological analysis may be fundamental to the general understanding
of our topic, so I will rephrase it a couple of times. To play or paint something (instead of
just performing intransitively) is to embody and inhabit this something and to experience it as a
pre-existing efference that the actual efference emulates or reactivates. It is very particularly this
pre-existing ‘intentionality’ that the musical or pictorial experiencers focus on, beyond the
performer’s own efference. The triple subjectivity generically built into the process stems from
the performer’s normative project. Since the performer creates ‘something’, and thereby could
either fail or succeed in giving birth to it, the meaning immanent in the something is saved by the
performance; the feeling of a precarious, fragile transcendent intentionality quite naturally
accompanies the aesthetic display.
Thus, every act of symbolization is a normative performance project and entails the feeling of
transcendence, which is more directly and clearly present in the musical here-and-now experience
than in any other circumstance. Symbolization may thus be derived from the primordial
musical practice of humans.
There is still, however, a constitutive aspect of symbolicity that needs to be elucidated: how did
we manage to isolate symbols as discrete single signifiers and then to conceive of their combinations
as formulaic sequences? Where could this discretization and this idea of concatenation have
entered human cognition? Again, music may have been a structural source of these formal cognitive
inventions, as I will try briefly to show in the following, concluding section.
3.8 Names and numbers: from metric and rhythmic time to calendar time

In most music, the rhythmic organization in which instrumentalists, singers and dancers
anchor their performing consists of finite temporal units that can be described as recursive bars
(measures) comprising a short sequence of regular pulses, or beats. These bars form a shared
reference for the performers and allow them to synchronize their expressions (see Lee and
Schögler, Chapter 6, this volume). The finiteness of the bar makes it possible to conceptualize a
temporal flow as a highly structured recursive process of nested metric cycles. The encompassing
multi-bar units are normally related to melodic wholes, and there are further compositional,
multimelodic wholes, united or separated by specific scales and harmonic preferences (see
Osborne, Chapter 25, this volume, on chronobiology). For the sake of demonstration, Figure 3.4
shows a 12-bar blues chord schema.
Such a construction is only possible because the beats of the bars are numbered (named),
so that a musician can count ‘one-two-three-four, two-two-three-four, three-two-three-four …’,
the first ‘one’ referring both to the beat and to the bar—a double present, so to speak. This is
already in itself a numerical system: it is tetradic, comparable to the decimal or the binary
systems, and the possibility of identifying a unit by at least two recursive parameters is what
makes the unit as symbolic as a person’s name, including first and family names. Once we are able
to name a beat, within a closed list of possible names, we can conceptualize the temporal
moment as a ‘place’ in time: a recurrent place as something to return to, something immaterial
that is still there ‘as time goes by’; different persons’ presence in the future can coincide or significantly
not coincide. Planning becomes possible, or intuitive motor planning is realized.
The calendar is born.
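The counting scheme just described can be sketched as a small program. The Python fragment below is a minimal illustration, not a model proposed in this chapter: it treats a running pulse count as a two-parameter 'place' (bar and beat), with four beats to the bar and twelve bars to the chorus. The chord labels are taken as read from Figure 3.4, and the names Place, place_in_time and spoken_count are hypothetical.

```python
from typing import NamedTuple

BEATS_PER_BAR = 4      # the 'tetradic' base described in the text
BARS_PER_CHORUS = 12   # one 12-bar blues chorus

# Chord label for each bar, as read from Figure 3.4
# (T tonic, S subdominant, D dominant, 7 added seventh).
CHORDS = ["T", "S7", "T", "T7", "S7", "S7", "T", "T", "D7", "S7", "T", "D7"]

WORDS = ["one", "two", "three", "four", "five", "six",
         "seven", "eight", "nine", "ten", "eleven", "twelve"]


class Place(NamedTuple):
    """A named 'place' in musical time (all numbers 1-based)."""
    chorus: int
    bar: int
    beat: int
    chord: str


def place_in_time(pulse: int) -> Place:
    """Convert a 0-based running pulse count into chorus, bar and beat."""
    if pulse < 0:
        raise ValueError("pulse must be non-negative")
    bar_index, beat_index = divmod(pulse, BEATS_PER_BAR)
    chorus_index, bar_in_chorus = divmod(bar_index, BARS_PER_CHORUS)
    return Place(chorus_index + 1, bar_in_chorus + 1,
                 beat_index + 1, CHORDS[bar_in_chorus])


def spoken_count(pulse: int) -> str:
    """The musician's count: on beat one, the bar number is spoken instead."""
    place = place_in_time(pulse)
    if place.beat == 1:
        return WORDS[place.bar - 1]
    return WORDS[place.beat - 1]


if __name__ == "__main__":
    print(" ".join(spoken_count(p) for p in range(12)))
    print(place_in_time(47), place_in_time(48))
```

In this sketch, pulse 47 falls on bar twelve, beat four of the first chorus, and pulse 48 returns to bar one, beat one of the next: the same named place comes round again, whether or not any note is played there.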
The elementary miracle is that the place will be there whether or not someone pays it a visit:
the beat, and equally the bar, exist even if they are empty! An empty (unmarked, unplayed) beat
is an auditory event that we do not hear; it is an acoustic ghost, one could say. It exists plainly and
numerically, and I contend that this is how plain natural numbers might have come into
existence: as beats to fill or leave unfilled. A named beat is a numerator with an unfilled, pronominal
denominator. The embodied origin of mathematics might thus be the nested cyclicity of
musical rhythm.
Fig. 3.4 Representation of a blues chorus. Bar numbers are shown by the upper-case spelt-out
numbers around the outside of the circle; beat numbers in each bar are represented by the
lower-case spelt-out numbers surrounding the small circle at bar ONE. T, tonic chord; D, dominant
chord; S, subdominant chord; 7 indicates the addition of the chord’s seventh.
[Diagram labels, bars in order: ONE (T), TWO (S7), THREE (T), FOUR (T7), FIVE (S7), SIX (S7), SEVEN (T), EIGHT (T), NINE (D7), TEN (S7), ELEVEN (T), TWELVE (D7); beats at bar ONE: one, two, three, four.]
We note that Lakoff and Núñez (2000, p. 52) prefer to believe that numbers are grounded in
‘subitizing’ our fingers. Perhaps they could originate from ‘stepping out’ the base of a building or
the space for a game or dance. There may be many possibilities, but all will depend on a sense of
time and nested rhythms, in making all kinds of moves, step by step in groups. Music may have
brought this feature of animal movement into systematic human consciousness.
The metric underpinning of poetic rhythm—beyond the quibble of feet, tones, accents and
quantity in culturally distinct poetics—is exactly the same beat-based temporal cognition.
Here is a stanza by Robert Burns (from On Mary, Queen of Scots, written in 1791; in Noble
and Hogg 2001):
O! Soon, to me, may Summer suns
Nae mair light up the morn!
Nae mair, to me, the Autumn winds
Wave o’er the yellow corn!
And in the narrow house o’ death
Let Winter round me rave;
And the next flow’rs, that deck the Spring,
Bloom on my peaceful grave!
Four beats organize each verse as a bar:

1 2 3 4
O! Soon, to me, may Summer suns
Nae mair light up the morn! – [4]
Note the empty fourth beat [4] in line 2. The syntactic accentuation at the close would oppose
the realization of these rhythmic beats by strongly stressed syllables:
And the next flow’rs …
Bloom on my …
Here, the linguistically unaccentuated morphemes would be grotesquely overstressed if their
stress were to follow the four-beat rhythm; instead, they are to be pronounced in a slightly slower
tempo and with an artificially equalized half-stressed weight, a counter-accentual solution that
yields a perceptible poetic effect.
Let me present one more example, a famous Japanese haiku by Matsuo Bashô (1644–1694)
featured in Stryk and Ikemoto (1977, p. 91):
Furu ike ya [an old pond and]
Kawazu tobikomu [a frog jumps]
Mizu no oto [water’s sound]
The translation is as follows: Old pond, / leap-splash – / a frog. (See also:
http://www.teeweg.de/de/literatur/basho/furuikeya.htm)
The verses of a haiku have 5 + 7 + 5 syllables. The stressed voicing of these lines, however,
imposes a four-beat measure:
1    2    3    4
Fu   ike  ya   [4]
Ka   zu   bi   mu
Mi   no   to   [4]
The result is that the final void [4], the empty beat following ‘oto’, becomes the temporal
place of the splashing beat: a poetic trick that consists in animating the void, or rather
semanticizing the ‘pure’ temporal slot.
We know that music has always been associated, transculturally, with the hours of the day
and the night; in fact, the notion of hour and day is due to the same nested cyclicity as the
musical metric itself. The names of hours are mostly numerical, and this is often also the case
for days (e.g., Portuguese weekdays: segunda-feira, terça-feira, quarta-feira, quinta-feira,
sexta-feira – Monday, Tuesday etc.). Sociocultural conceptualizations of time are isomorphic
with time’s musical form of schematization. It is evident that calendars, using names of divine
entities as numbers, are built out of exactly the same symbolic substance. I conclude that symbolization
springs from temporal cognition, and that temporal cognition serves the ‘time in the
mind’ that gives music its rhythms.
3.9 A last remark and concluding thoughts

Let me add a last remark on tones. The discretization of tonal sounds, already specified as tones,
not noises, by their formants (overtones), and the melodic combination of tones of different
pitch, as produced by musical instruments perceived as analogous to the vocalizing human voice,
probably occurred when they were connected to beats. A tone manifesting a beat calls for subse-
quent tones representing other beats of the same cycle or multicycle (cf. the blues cogwheel
in Figure 3.4). Thus, the length of the tone comes to refer to the beats of the bar as a metric,
quantitative scale; because there is no cognitive continuity or gradual transition from one beat to
the next, the tonal signifier of the beat will be cognized as a discontinuous, discrete sound event
with a determinable onset, followed by a new onset of a tone, same or distinct, or by a pause
(a void beat).
Since the rhythmic organization is serial, finite and cyclic, this alliance of tone and beat leads to
the invention and stabilization of finite scales – series of notes separated by stable intervals and
united by their affinity as sets of elements that combine syntactically into cognitively and
emotionally clear melodic forms. As soon as a note is integrated in a scale, it acquires a name, e.g.,
do – re – mi, c – d – e. Scales are sometimes associated with affective moods and social situations
in such a way that a musical culture will dispose of different scales felt as appropriate for corre-
spondingly different moods and situations (such as are explicit in the genres of flamenco music
and the ragas of India). In a sense, these scale systems are psychological and sociological ‘theories’
in themselves. They ‘interpret’ significant moments of shared human time, with universal
emotional appeals as well as conventions of acceptance.
Discretization (tones are discrete units, not glissandos) and finitization (beats are members
of finite recursive series, not elements of an unending train) are thus basic aspects of the genesis
of symbolic expressions. When the human voice finally stabilizes the sets of linguistic sounds we
call phonemes, it does so within a phenomenology of syllables, but on this phonotactic level—
more easily experienced than single consonants and vowels—discretization and, to a certain
extent, finitization likewise take place. The syllabic phenomenon, including the naturalness with
which we articulate words by dividing them into syllabic sequences, could be an effect of the
musical binding of tone and beat. Syntactic phrase formation would be an additional melodic
superstructure.
The reproducibility of melodic phrases and their easy interpersonal transmission, due to the
particular refinement of our auditory memory for effects of action, makes them appropriate
for the interpersonal monitoring of attention to situations. Since melodic integration does not
eliminate the discontinuity of its syllabico–lexical components, the tension between separate
words and integrative clauses, as between tones and melody (a dynamic principle exploited
in thematic variation), generates what we call grammar. Grammar is not a system, but rather a
constant crisis of separation: words do not dissolve into phrases or clauses, but instead tend
toward discreteness and autonomy. Oral phrases are thus often completed by gesture and intona-
tion rather than explicit wording. We can start a sentence with explicit wording, and continue
with a ‘nanana and nanana …’ that every hearer will understand. The superordinate intonational
utterance profile would eventually represent a supplementary expressive unification or homoge-
nization of discrete units, rooted in affect and rhythm, as we have seen.
Musicality of action and consciousness is possibly even the factor that unified the lexical
(object-oriented) and the syntactic (propositional and evaluative, subject-oriented) components
of language and thereby created the very logos of our species. The first manifestations of
language would therefore have been what we now call poetry. It may have been that the cave
paintings in the deep, acoustically rewarding halls, where no signs of household are found, were
the scores of the recitals and musical performances that shaped human culture. Their superim-
positions of figures may be similar to melodic superimpositions in baroque fugues. Synaesthesia
is now understood to be common in the perception of beauty, as it is in any active experience in a
richly stimulating world. I refer the reader to Ramachandran and Hubbard (2001) for a particu-
larly interesting study of and reflection on synaesthesia, art and language. Perhaps Cro-Magnon
humanity emerged as a bouquet of baroque cultures, first using petrophones (resounding stones)
or stalactites, flutes and drums, then bowed strings and animal horns, to create and animate the
cognitive and emotional architectures that eventually grounded imagination and rationality and
opened the way to language.
In contemporary and future research, musicology and many different forms of cognitive and
semiotic studies may collaborate along the lines of these and similar evolutionary scenarios,
both elaborating imaginative hypotheses and finding yet more empirical arguments in favour of
a coherent reconstruction of the origins of human symbolization. What is already overwhelm-
ingly probable is that symbolization is grounded in temporal cognition, and that the human
conceptualization of time is grounded in music.
References
Beaune SA de (1995). Les hommes au temps de Lascaux. 40000–10000 avant J.-C. Editions Hachette, Paris.
Bolinger D (1983). Intonation and gesture. American Speech, 58(2), 156–174.
Brandt PA (2004). Spaces, domains, and meaning. Essays in cognitive semiotics. Series European Semiotics,
4. Peter Lang, Bern, Switzerland.
Cleyet-Merle J-J (1990). La préhistoire de la pêche. Edition Errance, Paris.
Deacon T (1997). The symbolic species. The co-evolution of language and the brain. Norton, New York.
Fonagy I (2001). Languages within language: An evolutive approach. Foundations of Semiotics 13.
John Benjamins, Amsterdam/Philadelphia.
Huang T-L, Liu C-Y and Yang Y-Y (1999). Capgras Syndrome: Analysis of nine cases. Psychiatry and
Clinical Neurosciences, 53, 455–460.
Lakoff G and Núñez RE (2000). Where mathematics comes from. How the embodied mind brings mathematics
into being. Basic Books, New York.
Miall DS and Dissanayake E (2003). The poetics of babytalk. Human Nature, 14(4), 337–364.
Noble A and Hogg PS (eds) (2001). The Canongate Burns: The complete poems and songs of Robert Burns.
Canongate, Edinburgh.
Rainey DW and Larsen JD (2002). The effects of familiar melodies on initial learning and long-term
memory for unconnected text. Music Perception, 20(2), 173–186.
Ramachandran VS and Hubbard EM (2001). Synaesthesia – A window into perception, thought and
language. Journal of Consciousness Studies, 8(12), 3–34.
Stringer C (2003). Out of Ethiopia. Nature, 423(6941), 692–693.
Stryk L and Ikemoto T (1995). The Penguin book of Zen poetry. Penguin Books Ltd, Harmondsworth.
Trevarthen C and Malloch S (2002). Musicality and music before three: Human vitality and invention
shared with pride. Zero to Three, 23(1), 10–18.
Turner F and Pöppel E (1999). The neural lyre: Poetic meter, the brain and time. In RS Gwynn, ed.,
New expansive poetry, pp. 86–119. Story Line Press, Ashland, OR.
Wallin NL, Merker B and Brown S (eds) (2001). The origins of music. MIT Press, Cambridge, MA.
Chapter 4
Ritual foundations of human uniqueness
Bjorn Merker
4.1 Introduction
This chapter presents reflections on the natural history of human culture. They had their origin
in a puzzle generated by a study of music in mother–infant interaction which I conducted with
Colwyn Trevarthen in the late 1990s. Since the pieces of that puzzle have finally fallen into place
to reveal an unexpected perspective on the nature and origins of human culture, I will take this
opportunity to step back from the motivating details in infant development—covered in
Eckerdal and Merker, Chapter 11, this volume—to paint that evolutionary perspective in broad
strokes. In so doing, I will draw a crucial distinction (apart from language) between the culture
possessed by our closest relatives among animals, the apes, and the culture we possess as humans.
The distinction will turn out to bear on our understanding of the evolutionary background to
the origins of language, the uniqueness of human culture, and the nature and role of imitation in
cultural learning. I will relate this distinction to the rare but well-studied biological phenomenon
of vocal learning. Finally, a three-tiered conception of human culture will be presented, following
a sketch of the curious evolutionary trajectory by which it was assembled to make us the unique
species we are today.
4.2 Culture and ritual culture

Culture, in the sense of durable behavioural traits shared by a subpopulation of a species on
account of intergenerational transmission by non-genetic means, was once thought to be a
distinguishing and unique trait of humans. Nowadays, this belief, along with other putative
markers of human uniqueness—such as the manufacture and use of tools, and bipedalism—has
had to yield to progress in behavioural biology and palaeontology. Intergenerational cultural
transmission is present not only among monkeys (Imanishi 1957; Itani 1958) and apes (Whiten
et al. 1999), but has been documented in non-primate species as well (Fragaszy and Perry 2003).
Nevertheless, there are compelling reasons not to abandon the notion that human culture is
unique. One of these is, of course, our possession of speech and language: all animals communicate,
but only humans talk, a distinction that holds at least when ontogenies are completed in the nat-
ural habitat. Behind this distinction, I suggest, lurks another more fundamental, and thus far
neglected difference between the psychological underpinnings of human culture and those of
our closest relatives among the apes. Our culture, but not that of chimpanzees, is a ritual culture.
Since the distinction has not been part of comparisons between human and animal cultures,
I shall dwell on it at some length.
A ritual culture is one in which certain behaviours, whatever their purpose, goal or function
might be, have a ‘correct’ form, in the sense that one particular acquired mode of execution
among many possible alternatives is an obligatory part of its performance, without necessarily
being superior to its alternatives in an instrumental or practical sense. The form of the ritual in
this sense is arbitrary, and it is the obligatory nature of arbitrary form that distinguishes ritual
from instrumental behaviour (see Merker 2005 for additional formal detail). In instrumental
behaviour, the goal—the practical end result—is primary; in principle, any mode of execution
that achieves an invariant outcome (the purpose or function of the behaviour) is adequate,
although considerations of efficiency may lead one mode of execution to be preferred to another.
This is not so in ritual: there, the exact form and particulars of execution are primary, and the
achievement of an invariant outcome counts for nothing if the form of the ritual is violated. One
can satisfy one’s hunger perfectly well by eating without etiquette. Yet in so doing, one has
violated ritual form and runs the risk of being censured for rudeness. Success here is defined by
correct execution; i.e., the criterial measure of adequacy is adherence to the socially approved
form of the ritual itself and not outcome or utility for the eater alone. Ritual is thus not to be
equated with either habit or convention. These are not typically defined in terms of obligatory
formal and particular modes of execution, although the distinction may be less clear for some
conventions.
Nor is ritual in any way to be confused with the term ‘ritualization’, as used in ethology to
explain the formation of a behaviour pattern. The ethological term refers to the evolutionary
process whereby a facet of behaviour is turned into a ritualized innate action pattern (display
signal). This chapter will exclusively concern learned, cultural rituals—behaviours that no child
is born with.
In the sense indicated above, ritual as ritual has no purpose outside itself: the purpose of the
ritual is that it be performed, and performed correctly (Staal 1989). This is not to say that its
participants do not believe or insist on its efficacy with regard to external goals or functions.
Quite the contrary: this presumption is deeply embedded in the performance of, for example,
religious rituals, the performers of which will insist that the specific form in which the particular
ritual is performed, far from being arbitrary or optional, is the one and only proper and effica-
cious way of achieving an external purpose. In the case of rituals such as wedding ceremonies,
this presumption is intrinsic to the ritual itself: by its correct performance, one simply is married.
What remains constant throughout is the insistence on the obligatory nature of a particular
form, independent of outcomes.
The culture of our closest relatives, the apes, abounds in instrumental cultural traditions such
as termite or ant fishing and other forms of tool use, while ritual culture proper is conspicuously
scarce among them, although perhaps not absent altogether. Young chimpanzees learn termite
fishing by observing an adult—typically the mother—fashioning a fishing stick and using it to
feed off the termite mound. Yet in so doing, the young do not copy the details of the mother’s
movements, nor does she go about her task in a fixed way each time. Termite fishing as such
is not a ritual: it is a purposeful instrumental cultural tradition whose individual instances
are flexibly adapted to the layout of the environment and the happenstance of the moment in
accordance with utility. One might say that chimpanzees are too practical, even too ‘rational’, to
burden the execution of this feeding technique with unnecessary fixity of form. What might a
termite-fishing ritual look like? It would suffice if, for example, each time a mother had fashioned
her stick, but before inserting it into the termite mound, she raised her stick vertically high into
the air on a stretched arm and uttered a grunt before proceeding to feed. If young ones acquired
this behaviour through learning, and if, as a result, the behaviour were endemic to a subpopula-
tion of chimpanzees, but absent (or present in a distinctly different form) in other subpopula-
tions, this would qualify as a ritual. Two cultural variants of chimpanzee insect fishing recorded
in the wild in the form of ‘manually wiping insects off the stick’ (Goodall 1986, at Gombe)
versus ‘mouthing them off the stick’ (Sugiyama 1993, at Bossou; Boesch 1996, at Taï Forest)
come closer to the sense of ritual I am attempting to define, since they involve an apparently
arbitrary difference in how things are done, in this case, the part of the body employed in
removing termites from the stick. Yet to draw firm conclusions about their status as ritual
requires us to know many details of the possible constraining influences of utility on the execu-
tion of the behaviours, such as differences in the behaviour or other characteristics of the species
of insect being consumed, or in the plant species used to fashion sticks, as illustrated by the
following example.
Mountain gorillas strip stinging nettles of their leaves for food (Byrne and Byrne 1993; Byrne
and Russon 1998). This is done in a complex, bimanual and hierarchically organized fashion
which is efficient both in terms of gathering speed and minimizing the sting of the nettles by
wrapping the worst stings inside each leaf. Stripping invariably proceeds from the base to the tip
of stems. Might this be an arbitrary but fixed direction qualifying as a nettle-stripping ritual? No,
because nettle anatomy constrains this part of the behaviour as well, not only in that the base is
sturdier than the tip, but in that nettle stinging hairs are oriented at an oblique angle from base to
tip on both stems and leaves. This makes the base-to-tip direction of stripping a means to avoid
breaking the stinging hairs and releasing their poison, rendering the operation safe (as I know
from personal experience). The mountain gorilla’s direction of nettle stripping is thus one of
many instances of ape instrumental behaviour.
Ape cultural traditions (in the sense established earlier) appear to be concentrated largely, but
not exclusively, in types of behaviour which, among humans, would be classified under material
culture, subsistence techniques, and other practically oriented areas of life. Of the 39 behaviours
qualifying as cultural traditions in a recent major survey of chimpanzee culture (Whiten et al.
1999) 34 behaviours—87 per cent—were of this type. They involved directly instrumental acts
aimed at food procurement, processing or consumption (19), personal comfort (9), attention-
getting (5) and aimed throwing (1). The remaining five behaviours involved communicatory
signals and displays, such as ‘leaf clipping’ (making a distinctive noise by tearing a leaf by the use
of fingers or incisors) and the ‘rain dance’ (an excited group gathering at the start of rain). Since,
as we shall see, the domain of displays is the natural arena most hospitable to the emergence
of ritual, it is of interest that leaf clipping traditions show not only variants of execution, but
different uses, in different chimpanzee populations. They are thus far probably the most
promising candidates for behaviours qualifying as bona fide rituals among our closest non-
human relatives. Group displays such as the rain dance and the chimpanzee carnival are likewise
of interest in this regard.
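The proportions quoted above can be verified with a few lines of arithmetic. This is only a sketch: the counts are those reported in the text for the Whiten et al. (1999) survey, while the dictionary and variable names are hypothetical labels introduced here.

```python
# Counts of instrumental traditions as reported in the text (Whiten et al. 1999).
instrumental_counts = {
    "food procurement, processing or consumption": 19,
    "personal comfort": 9,
    "attention-getting": 5,
    "aimed throwing": 1,
}
total_traditions = 39  # all behaviours qualifying as cultural traditions

# Sum the instrumental categories and derive the remaining share.
instrumental_total = sum(instrumental_counts.values())
communicative_total = total_traditions - instrumental_total
instrumental_share = 100 * instrumental_total / total_traditions

print(instrumental_total, communicative_total, round(instrumental_share))
```

The four instrumental categories sum to 34, which leaves the five communicatory signals and displays and gives the stated share of roughly 87 per cent.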
Turning to humans, we find a radically different situation as far as the evidence for ritual
behaviour is concerned. Compared with the meagre evidence for ritual proper among the great
apes, human culture abounds in ritual, in the small as in the large (van Gennep 1908; Durkheim
1912; Radcliffe-Brown 1961; Gluckman 1965; Turner 1969). Table 4.1 provides a sample of
common types of human ritual attested to not here and there, but across many cultures, each
ritual assuming a form specific to a given culture.
While some of these rituals occur in the context of the practical affairs of life, many are clearly
dissociated from practical matters of utility. It is also noteworthy that the ostensible purpose
of some rituals in practical domains, such as personal hygiene, is subverted through the ritual
manner of their performance. Thus, in ritual cleanliness it is not essential to actually clean the
body parts involved. What is of the essence is to ‘go through the motions’ properly, irrespective of
whether the body parts are in fact contaminated or whether the ritual motions accomplish any
removal of contaminants. In other words, the purpose of an instrumental act can be compromised
or lost through its cultural ritualization, a striking instance and illustration of the
fundamental distinction between instrumental culture and ritual culture.
Table 4.1 Schematic tabulation of major categories of human ritual practice
Types of ritual | Religious | Religious and/or secular | Largely secular
Rites proper | Spells, incantations and mantras | Rites of passage, including marriage, naming, etc. | Institutional rites (guilds, associations, corporations, etc.)
 | Magic and divination | Mortuary and funerary rites | Children’s play rituals
 | Liturgy | Ritual cleanliness | Courting rituals
 | Prayer rites | Ritual body adornment | Food ceremonial
 | Sacrificial rites | Festival rites | Etiquette
 | Consecration rites | Oaths, vows and curses | Mottos and slogans
 | Cleansing and exorcism | Rites of greeting and parting | Rites of the hunt
 | Healing rites | Ruling rites (dynastic, national, etc.) | Rites of negotiation
 | Blessing rites | Rites of rebellion | Rites of war and peace
 | Invocations | | Sports and games ceremonial
Ritual arts | Song, music, dance, drill, recitation/poetry, iconography
Notwithstanding the problematic relation between ritual and utility, human societies invest
considerable resources in the rituals they sustain and through which they partly define them-
selves as distinct cultures. The full Agnichayana sacrifice of the traditional Vedic culture of India
is a formally structured 12-day progression of complex interwoven chanting and ritual perfor-
mances requiring months of preparation and rehearsal. It involves 17 priests, each specialized
in the recitation of particular branches of the massive Vedic corpus of hymns and sacrificial
formulas on which the rite draws (Staal 1989, 1993). For millennia, a considerable portion of the
Brahmin caste of the Indian subcontinent has devoted its intellectual resources to the syllable-
perfect memorization and correct recitation of this textual corpus and the preservation of the
many rituals in the course of which it is recited. Such practices are not confined to civilizations
such as the Vedic: hunter-gatherer cultures such as those of the Australian aborigines feature the
memorized transmission of a corpus of sacred songs, rituals and associated objects and myths.
Acquiring these ritual vehicles of tradition demands a substantial investment of time and energy
on the part of initiates, and may involve undergoing severe bodily torture to prove worthiness for
becoming a carrier of the sacred lore (Elkin 1945; Strehlow 1947). Any identifiable subdivision of
humanity is likely to develop and practise rituals—witness children, who everywhere engage in
their own particular rites as part of their play and games (see Opie and Opie 1960, for a striking
sample from England). Rituals in the domain of mother–infant interaction are dealt with in
Eckerdal and Merker, Chapter 11, this volume.
By definition, a ritual is a specific subset of the set of possible actions in a given situation
(no action at all being one of these). Acquiring a ritual requires the learning of that particular,
culture-specific form. This imposes the need to invest learning resources in its acquisition by
those who would perform it. To know the correct forms and contexts of simple things such as the
greeting rituals proper in a given culture may appear unremarkable to those within the culture.
Yet, as any foreign visitor knows, they are anything but free gifts of our humanity. They must be
acquired over a considerable stretch of ontogenetic experience and in close interaction with
people such as parents, peers and teachers. Because this normally takes place as only one of innu-
merable interwoven facets of growing up in a given culture, the expenditure of the requisite
learning resources may remain largely invisible to those acquiring the ritual in this way.
It becomes second nature, and comes to attention only when a ritual miscarries, as in cases of
mistaken identity or context.
The learnt nature of rituals has another aspect: in the sense in which the form of rituals was
said here to be arbitrary, they provide a particularly appropriate forum for, and in some cases an
acute need for, actual, deliberate teaching and instruction, and its complement, ‘imitation’
(Eckerdal and Merker, Chapter 11, this volume; Merker 2005). This is clearly the case in the
elaborate Vedic rituals already referred to. Brahmin students spend years of their childhood and
adolescence in apprenticeship under a teacher whose active guidance and instruction helps them
to memorize thousands of sacred verses and to perfect their command of the ritual lore. But
teaching and instruction also feature in the acquisition of innumerable far less demanding
rituals in other cultural contexts, such as learning the proper steps for different genres of dance,
or the correct forms of etiquette in different social circumstances. Deliberate teaching has been
regarded as a milestone in cultural learning (Tomasello et al. 1993), being rare in chimpanzees
but ubiquitous in humans. I suggest that the ritual nature of human culture accounts for this
difference and provides a far simpler and more robust interpretive framework for the human
practice of deliberate teaching than a formal assimilation into presumptive stages in the development
of the so-called ‘theory of mind’ (Premack and Woodruff 1978).
Through its conspicuous content of ritual performances, human culture has a content largely
absent from ape culture, and it is this specifically ritual content whose acquisition in many cases
calls for deliberate instruction. Short of ritual, the inherent scope of explicit instructional teaching
in human affairs would accordingly be limited (Rogoff et al. 2003). Moreover, I suggest that the
development of instructional teaching or mentoring for ritual purposes is primary and that
teaching had its origin in this context, to be extended only secondarily to the acquisition of non-
ritual forms of knowledge and skill. Much the same can be said of imitation: its utility is high in
ritual compared with instrumental circumstances. Observational learning is often perfectly
adequate for the acquisition of many practical skills and traditions, while the duplication of the
arbitrary features of ritual (close to the heart of the definition of ritual) requires imitative
learning in the ordinary dictionary sense of imitation. The failure to distinguish instrumental
and ritual culture and traditions, and the kinship of both teaching and imitation to ritual, has
burdened the concept of imitation with unnecessary complexity and controversy (e.g., Whiten
and Byrne 1991; Tomasello et al. 1993; Call and Tomasello 1995; Meltzoff 1996; Byrne and
Russon 1998; Miklosi 1999). The ritual perspective on imitation obviates the need to construe
imitation in instrumental and, by extension, survival terms. It allows us to adhere to the dictionary
definition of imitation rather than to encumber it with constructs related to model intentions.
That an imitator, in addition to imitative capacity, may have the cognitive capacity to infer
motives on the part of a model is, accordingly, a separate matter, without intrinsic bearing on the
issue of imitation as such (Eckerdal and Merker, Chapter 11, this volume).
I have noted that rituals proper distinguish themselves in formal rather than in functional or
utility terms; moreover, that some rituals compromise the ostensible purpose of the instrumental
domain to which they belong, and that the functionally irrelevant features they contain, as well as
the costs of their acquisition and performance, impose an extraneous burden on instrumental
behaviour for which thus far no apparent compensatory benefit or purpose has been indicated.
This dearth of apparent utility arising from the form of the ritual itself holds a hidden key to the
nature, function and origins of ritual. To recover this key, we need to remind ourselves that,
although ritual proper is scarce among our closest evolutionary relatives, the great apes, it is by
no means altogether absent from non-human nature. On the contrary, spectacular examples of
elaborate cultural ritual are scattered across the animal kingdom, and an appreciation of these
evolutionary analogues of our own ritual culture will help us to understand not only its origins
but its function and formal powers. It is time to pose the question ‘whence and wherefore ritual?’
4.3 Whence and wherefore ritual?

Humans are vocal learners, while our nearest relatives among the apes are not. This means that we,
but not they, have the capacity to learn to shape our vocal output to match the pattern of auditory
models received through the sense of hearing. Our ability to do so covers broad domains of
structured sound, within limits set largely by the range of our voices. We do so with every word we
know how to pronounce, we do so when we sing, and when we imitate animate and inanimate
sound sources of whatever kind. If this capacity seems unremarkable to us, it is only because we
possess it. Yet in phylogenetic terms, it is exceedingly rare. Mammals excel in their capacity to
learn, yet vocal learning is a rarity among them (see review by Janik and Slater 1997). Beyond a
few mammals such as humans, whales and seals, it is birds that supply the striking examples of the
ability to learn the patterns of songs or calls from auditory models. Even then, only 3 of the
24 orders of birds feature species with this capacity, namely parrots (Todt 1975; Pepperberg 1981),
hummingbirds (Baptista and Schuchmann 1990) and the large group of oscine (but not
suboscine) songbird species (Kroodsma and Baylis 1982; Kroodsma 1988).
In purely descriptive terms, and disregarding many details and species differences, the first
stage in song-learning on the part of a songbird hatchling consists of listening to the (learned!)
pattern of song produced by a (typically) conspecific model. At a later point in development,
it starts producing ill-formed jumbles of song elements in a practice phase called ‘subsong’,
functionally analogous to human infant babbling (Marler 1970; Doupe and Kuhl 1999). This is
followed by a stage called ‘plastic song’, and eventually, at ‘song crystallisation’, the young bird
duplicates the model pattern in its own output with high fidelity. In so doing it has performed
the quintessential act of cultural learning, and given proof of an instance of spectacularly
deferred imitation. The resulting song tradition is subpopulation-specific (Nottebohm 1972),
and has a form of which the particulars are not dictated by the purpose served by the behaviour;
yet for a given individual in a given population, those particulars are obligatory. In other words,
the behaviour has all of the hallmarks of cultural ritual described above. This applies to whale
song no less than to birdsong (Payne 2000). Whale song, in addition, features individual innova-
tion that is copied by all members of the singing group, resulting in ongoing repertoire turnover.
Among bird species, also, there are numerous patterns of vocal learning besides that sketched
above (Marler and Mundinger 1971), including the acquisition of voluminous individual
repertoires composed of hundreds and even thousands of individual song types (Kroodsma
and Parker 1977). Some species, such as parrots and mynah birds, copy sounds of any kind
(Baylis 1982).
Vocal learning for song thus supplies us with paradigmatic examples of ritual culture in
animals. To know the origin and mechanism of forms of imitative learning that faithfully dupli-
cate its model to result in the lasting acquisition of a formally patterned behaviour as an integral
part of a cultural tradition with ritual characteristics, we need not scour the behaviour of apes for
tenuous examples: the trees surrounding those apes resound with striking instances in the
singing of birds! The focus on apes in this regard may derive from the mistaken notion that adap-
tations are like phylogenetic heirlooms, and that our closest relatives are thus the natural place to
look for the origin of all of our own traits. However, such an approach must, by definition, fail in
the case of the so-called diagnostic features of a species, i.e., those traits that distinguish a species
from its closest evolutionary relatives. Vocal learning, song and speech are such traits in humans,
and to search for them among the great apes is futile.1 Instead, all we need to do to make sense of
the radical difference between ape and human cultures with regard to ritual content is to assume
that we acquired a crucial enabling adaptation for ritual behaviour at or after the divergence of
the lines that led to humans and chimpanzees. I suggest that vocal learning for song—having
arisen time and again by independent evolution in scattered clusters of mammalian and bird
species, and clearly present in us but not in chimpanzees—supplies that enabling adaptation. In
this respect I am only updating, with a focus on ritual, a proposal long since made in relation to
the origins of language by Darwin (1871), and emphasized by students of birdsong since the
1970s (Marler 1970, 2000; Nottebohm 1975, 1976; Doupe and Kuhl 1999; Wilbrecht and
Nottebohm 2003; Jarvis 2004; Merker 2005; Merker and Okanoya 2007).
If we ask, then, about the evolutionary setting that has given rise to vocal learning for song
and the functions it serves across scattered lineages of animals, we land far from the utilities of an
individual animal’s survival-oriented negotiation of its physical environment. Instead, we land in
the realm of the social, and more specifically in that vast theatre of animal communicative signals
and displays which houses all of nature’s aesthetic extravaganzas—the peacock’s tail no less than
the 1800 melodies in the repertoire of a brown thrasher (Kroodsma and Parker 1977). They serve
an animal in its pursuit of those intimate social matters to do with wooing or choosing a mate,
fending off rivals and reproducing successfully. This is so because the purpose of display is to
impress: to show that one is worthy of being taken seriously as a mate, a fighter or a provider.
There are many ways of doing so concretely, but a common denominator of achieving this
through display is that in one way or another the display requires resources over and above what
is required for merely ‘getting by’ (Zahavi and Zahavi 1997). This can be done by engaging in
activities requiring unusual amounts of energy, degrees of persistence, levels of complexity, or
learning capacity. In the vocal sphere, this would mean unusually loud, persistent or complex
(learned) calling. A long bout of learned song combines all of these: loudness (consider the
size of birds in relation to their vocal output), persistence (10 hours per day is not exceptional in
the mating season), the complexity of structural patterns and repertoire (Catchpole and Slater
1995) and the learning capacity that allowed their acquisition. What looks like frivolous aesthetic
excess thus turns out to have a serious purpose at the heart of the central reproductive drama
of life.
The lack of apparent utility of human rituals is, by this interpretation, exactly that: an appearance.
Beneath the frivolity and excess apparent to a superficial view, behind the lavish expenditure of
time and energy on what may strike an outsider as ‘mere ritual’, lies a subtle, hidden dynamic.
1 As an example, the paucity of clear evidence for vocal learning in the great apes has been taken to support a
gestural origin of language in humans (see discussion in Janik and Slater 1997). Yet it remains a mystery on
such an account how a species already in possession of language embodied in gesture (because of its lack of
vocal learning) could ever switch to vocal language without already having the capacity for vocal learning.
The lack which favoured gesture must be conveniently forgotten to explain how the original gestural mode
of language (visual–manual) was replaced by a radically different mode (auditory–vocal) at reasonable cost.
Assuming an absence of the capacity for vocal learning at that point forces us to assume that the advantage
of disencumbering the hands from their involvement in gestural language was so great that it would pay the
evolutionary costs of developing the capacity for vocal learning de novo, and this to achieve
nothing more than replacing one form of functional language with another. Such consequences are easily
avoided by recognizing the error from which they flow, namely the presupposition that our closest living
relatives necessarily must exhibit some version of our own distinctive traits.
Through ritual, the core concerns of life are attired in fancy dress and complex gestures as
concrete, living proof that life does not hang by a mere thread, that there are resources beyond
those needed for the bare maintenance of life itself, and that those who have managed to harness
those resources are worth taking seriously, and may even have cause to celebrate. Accordingly,
there is a celebratory aspect to many rituals. Through them, we may be said to celebrate signifi-
cant facets of our lives: what we are and what we have, what we are about to do or become, what
we have received or given, what we have achieved or mean to achieve, and whom or what we
hold dear, are committed to or are making a commitment to, in one or another context and
through one ritual or another. In one sense celebration is frivolity and in another sense a serious
affirmation. A celebration consumes resources and does not in itself achieve that which is being
celebrated. Yet celebration is not without consequence: for one thing, it announces and affirms
that which is being celebrated. In ritual form, it lets participants and observers from afar know
what is the case; it is a display for all to see, and is proof of the wherewithal for its performance.
To achieve its end of being taken seriously—of impressing—the display must not be cheap or
fake, or indeed even able to be faked. The apparently excessive lavishness and its cost guarantee
unfakeability, whether those costs be in time, energy, material resources or learning effort (Zahavi
and Zahavi 1997); this applies to the singing of a songbird in the mating season no less than to
our own ritual performances.
The formal properties and complexity of various types of ritual—including the songs acquired
through vocal learning—provide a major means for making ritual unfakeable. To participate in
the mutual hand-clapping rituals developed by many children’s playground cultures, you must
have practised them. Their complex obligatory stereotypy ensures that every mistake is instantly
obvious. You have either learned the moves or you have not, and children do coach one another
to perfect their practice of the rite. You either know it or you do not. The acquisition of learned
complexity presupposes the requisite resources in time, effort and attention, and successful
performance is therefore proof of past and present application. Formality makes command of
the ritual easily judged, and complexity ensures that the requisite skills require application. This
is true of even trivial examples, such as the forms of greeting that are proper to a given culture
and context. As we noted, these skills have to be acquired by paying attention, and performing
them proves that you once did so, and that you now care. In this, rituals supply the formal cement
and culture-specific currency of our social intercourse, a set of unforgeable shorthand certificates
of competence in the culture, produced for all to see when we demonstrate command of a ritual
by our participation in it.
4.4 A generalized ritual propensity

Striking similarities between animal and human cultural ritual notwithstanding, there are
differences. Human rituals are typically, although not exclusively, shared, in the sense that their
performance involves two or more participants jointly engaged in the ritual. Some can indeed be
practised in solitude, such as rituals of sorcery, but these are rare compared with the many social
rituals that humans practise. The animal rituals of song are generally not shared in this sense, but
are typically performed individually and ‘broadcast’, although examples of chorusing behaviour
do exist (see for example Staicer et al. 1996). Among animals, the learned rituals we are
concerned with here are also generally confined to the pair of modalities used in song, namely
hearing and voice. This is not to deny that these rituals may be accompanied by visual signals
such as plumage displays and bodily movement (Williams 2004, pp. 20–24), but to emphasize the
pre-eminence of the vocal modality with regard to the complexity of learned patterns it supports
in song-learning species. As originally suggested by Thorpe, the auditory–vocal modality has the
advantage that it allows more complete monitoring (by ear) of self-produced output for compar-
ison with model patterns than other expressive modalities (Thorpe 1961, p. 69). The issue is
hardly trivial, because it provides an exact parallel to the auditory–vocal deployment of the
modalities of human speech (Marler 1970; Doupe and Kuhl 1999). But human learned ritual is
not generally confined in this way to a fixed pair of modalities. Rather, vision, touch, manual
dexterity, the extension of the hands in tools, and the human body as a whole, by itself or coordi-
nated with those of others, are also recruited in various combinations for use in many human
rituals. As such, they supply the ‘instruments’ of what Merlin Donald has discussed in terms of
our unique human capacity for expressive mimesis (Donald 1991), intimately related to our
propensity for ritual. Some of the expressive latitude of human rituals compared with those of
animals undoubtedly is a secondary consequence of our possession of language, yet there may be
a more fundamental reason, particularly since it also bears on the social orientation of human
rituals. It takes us to a peculiar behavioural setting within which vocal learning may have evolved
in humans.
Not only must we picture that setting to have been highly social from the very outset—witness
the sociality we share with chimpanzees—but it may have featured a peculiar elaboration of the
physical displays that accompany distance calls in the apes. These include stomping, branch-
shaking and drumming (Geissmann 2000, pp. 118–19). I have suggested a biologically plausible
selection pressure that converted our ancestral distance calls into genuinely cooperative and beat-
based synchronous chorusing at an early point in our divergence from the common ancestor
(Merker 1999, 2000). This change in the form of hominid distance calls would presumably not
have broken their inherent linkage (common among the apes) to vigorous physical display.
As accompaniment to rhythmic group-chanting, such physical displays presumably took the
form of a kind of dancing, paced by the same even beat needed to maintain the synchrony of
voices in the group chant (Fraisse 1982), a beat most easily derived from locomotor rhythms
(Merker 1999). Whenever vocal learning evolved among our ancestors, it is thus likely to have
had such a group display of rhythmic chanting and dancing as its behavioural setting. This may
have resulted in human vocal learning with a difference: around the core of learned control of the
voice, the same mimetic (‘conformal’) motive that ensures copying fidelity in song (Merker 2005)
would have been linked to other aspects of the behavioural display that furnished the setting for
its evolution. This linkage to postural and locomotor behaviours featured in a synchronous
group display would have provided a possible route by which the evolution of human vocal
learning was expanded into a capacity for bodily mimesis more generally (cf. Donald 1991).
If so, this new capacity for extending vocal learning to bodily (postural and gestural) expressive
learning would have turned us into a species of ‘imitative generalists’ (Meltzoff 1996). Presumably,
the vocal, the gestural and the social aspects of group display were never separate, nor are they
kept separate in the conceptual categories of a number of non-Western peoples even today. Their
languages subsume rhythm, song, dance and ritual (celebration) under a unitary concept, in
striking agreement with the origins scenario I have proposed (Merker 1999). This is true of the
ngoma of the Bantu languages no less than of the saapup of the Blackfoot native Americans or the
mousiké of classical Greece. The latter included poetry and dance besides our notion of ‘music’,
which derives its name from the broader Greek original.
Our ancestors might have developed a facility for bodily mimesis entirely without vocal learning,
were it not for the ultimate role of the voice and its group synchrony in the very setting within
which our capacity for expressive learning then evolved. Given that capacity, its content would be
defined by local traditions carried from generation to generation by cultural learning, providing
a vehicle for the elaboration of human ritual culture long before language entered the picture,
and perhaps setting the stage for its emergence.
As already noted, human rituals are typically social; that is, performance is participatory.
The forms of rituals and the criteria for proper performance are thus not a matter of individual
choice and preference, but are intrinsic to the culture that defines itself through them. This is
what it means for their forms to be obligatory: we have no choice in the matter, because to
participate is to participate in this way, or not to take part at all. No referee is needed to adjudicate; the
form itself performs this function through its stereotypy and complexity. Yet once the obligatory
form is anchored in tradition, it provides a forum in which individuals may excel through the
manner of their performance of a ritual they already command. Since the basic form is fixed by
the pattern of the ritual itself, this takes the form of the ease, elegance or virtuosity with which it
is carried through.
It is the fixity of form that provides the common standard by which differences in execution
can be apprehended and appreciated. The steps of a popular dance style allow stylistic latitude in
execution, from the merely passable to the elegant. The elegant dancer observes the same essen-
tials of form as a less accomplished one, but goes beyond them to stylistic elaboration of the
basics. At the level of virtuosity, this can even extend to the introduction of structural elements
that are not part of the received ritual pattern. If so, these prove their status as virtuoso embel-
lishment rather than mistake by fitting seamlessly into the framework of the received form. These
modes of execution tend to impress, for obvious reasons, and this invites emulation and attempts
to surpass even this performance. A dynamic of elaboration and change is thus inherent in ritual
in its cultural context, tending to push its development in the direction of differentiation and
complexity.
4.5 The three-tiered vehicle of human culture

We have arrived, then, at an ancestral species possessed of a generalized capacity for expressive
mimesis organized around the core mechanism of vocal learning, engaged in culturally transmitted
group rituals featuring a shared repertoire of strings of structured nonsense syllables, or song.
This situation provides an ideal and pattern-rich starting point for the ‘holistic’ type of mecha-
nism proposed to play a role in language origins on developmental grounds by Alison Wray
(1998; also Hurford 2000), and explored in computer simulations of the emergence of language
in multigeneration populations of learning agents (Batali 1998, 2002; Kirby 1998, 2000, 2001,
2002). The computer simulations disclose that the process of intergenerational transmission
itself, via the agency of the ‘learner bottleneck’ (an aspect of the so-called ‘poverty of the stimulus’,
Smith 2003), possesses unexpected powers to formally structure the content of initially random
strings into shared and efficient grammars matching the cognitive world of its agents. This takes
place without the intervention of differential reinforcement of outcomes or natural selection in
the process, which accordingly is a purely historical, cultural one.
Merker and Okanoya (2007) have provided an account of how such an historical dynamic
might convert the originally unsemanticized song-strings of our ritual ancestors into the gram-
maticalized forms of human language (e.g., DeLancey 1994; Lehmann 1995; Bybee 1998;
Campbell 1998). This is not the place to retrace that account, except to note that it shows our
species arriving at human language inadvertently, by a process of the contextual differentiation
and segmentation of a song-string repertoire employed not for the communication of meanings,
but—as before—to impress potential mates and rivals (cf. Miller 2000). Whether by this or other
means, the fact remains that we did eventually arrive at the biological singularity of human
language, and with its help developed a mode of culture which in its richness and powers exceeds
the cultural traditions documented in the rest of the animal kingdom. It follows that no fewer
than three cultural levels, or tiers, are needed to capture the distinctive content of this uniquely
human mode of culture.
The basic level in this tripartite division of human culture is instrumental culture, which we
share with apes, and might honour with the name ‘ape culture’. It includes the many intuitive and
implicit instrumental habits of body and mind that we acquire in the course of growing up in a
culture, such as whether to sleep in beds or on grass, how to find or catch suitable foods and
which ones to cook, how to fashion and use a variety of implements, and which kinds of
activities might be pursued for a living, to name but a few among innumerable culturally
determined dimensions of life belonging to this instrumental level. It includes much of what belongs
to human material culture and subsistence techniques. In principle, these differ from tool use
among the apes not so much in kind as in degree of differentiation and elaboration. Their lineage
in this respect tends to be obscured from our view by the influence of the third level of human
culture, language, the presence of which creates new possibilities of elaboration for both of the
prior levels upon which it rests. This is one of the signal powers of language, but it does not alter
the basic fact that both prior levels may flourish without it, and cannot be assimilated to it without
obscuring the nature of language. Ape or instrumental culture includes learned facets of our
non-verbal expressiveness through voice, posture and gesture when they are not ritualized. The
possession of instrumental culture requires no special mechanisms beyond a sufficient learning
capacity of the kind conferred by the cerebral cortex that we share with other mammals (Merker
2004), or the kind of learning capacity underpinning feeding innovation in birds, which is
separate from the brain nuclei of the song system (Lefebvre et al. 1997; Timmermans et al. 2000).
The next level or tier is that of ritual culture, to the explication of which this chapter has been
devoted. Not much more needs to be said about it here than that it requires a special mechanism,
one that ensures the duplication—irrespective of instrumental considerations or utility—of the
form of a ritual model. Besides the requisite learning capacity on both the receptive (auditory,
posterior) and productive (vocal, frontal) sides of brain organization, it needs a mechanism that
ensures the matching of the two in such a way that the output comes to conform to the input.
Since a lengthy practice phase typically intervenes between song exposure and mastery, it seems
that the process requires a motive not to terminate practice until an adequate match has been
achieved. I call this motive the ‘conformal motive’ (Merker 2005), a formulation particularly
suited to characterize the motivational underpinnings of the broader mimetic capacity that I
have suggested encompasses both vocal learning and bodily mimesis in the human ancestral line.
It confers a readiness to accept and incorporate into one’s own behaviour what amount to essen-
tially arbitrary patterns, by duplicating them through a learning process. This readiness is far
from a free gift of nature, as shown by the paucity of good instances of such duplication among
chimpanzees in the wild (Tomasello et al. 1993).
Acquiring this readiness as an innate motivational predisposition may rank as the single most
crucial adaptation that our species has evolved in the sphere of behaviour. Being recent, it may
also be susceptible to malfunction. Certain manifestations of obsessive–compulsive as well as
autism spectrum disorders may be interpretable in this light as instances of developmental
pre-emption of socially shared rituals by idiosyncratic, personal ones. In addition, something
analogous may occur in schizophrenia at the level of the meta ritual of language in a failure to
adhere to socially shared patterns of thought. Properly functioning, however, the predisposition
of the conformal motive provides the pivot of ritual culture by making us want to adopt the
forms of the culture in which we aspire to membership. It is also an essential means to the acqui-
sition of what might be called the ‘meta ritual’ of language.
The third and final level of our tripartite culture is language itself, our most powerful and
problematic attainment. As Schopenhauer reminds us:
The animal can never stray far from the path of nature, for its motives lie only in the world of perception,
where only the possible, only the actual indeed, finds room. On the other hand, all that is merely
imaginable or conceivable, and consequently also what is false, impossible, absurd, and senseless, enters
into abstract concepts, into thoughts, ideas, and words.
(1844/1969, v. II, p. 69)
The simple grammatical device of negation virtually guarantees this. Language therefore not only
gives us unprecedented powers of reference and communication about the entire empirical
world of perception, but encumbers us with ‘questions such as cannot be answered by any empirical
employment of reason, or by principles thence derived’ (Kant 1781/1966, p. 14).
This lack of limit on what conceptions can be put into words, provided they conform to grammar
in how they are expressed, is the reason I alluded to language as meta ritual. Language shares
with the ritual mode out of which it grew an insistence on proper form, yet differs from it by its
emancipation from the finite particularity of rituals. Language is our general-purpose ritual for
constituting and communicating thought, a truly awesome novelty in nature. Through its powers
it has come to infiltrate every dimension of our cognitive equipment, and through it to colonize
both of the prior two tiers of the human vehicle of culture. Almost all we do is at least accompanied,
although not necessarily mediated, by language, and through the instrumentality of language, the
other two tiers have reached degrees of differentiation and complexity inconceivable without it.
As we come to understand more about the liabilities encumbering this crowning achievement of
human culture, it may eventually take its rightful place as but one of its three major components.
It is utterly dependent upon the other two, but they are not, in principle, dependent on it for their
existence. It is only fitting, then, that language repay some of its debt to these its elders by lending
them the use of its own unique powers. The tendency of the latecomer to pretend to sole dominion
of our minds should not be allowed to obscure entirely from our view the three-tiered structure
of culture that is ours and of which it forms but a part. Nor should we forget that the entire
tripartite edifice of culture itself is erected on an even more basic foundation: the nature that is
ours as creatures of an evolutionary history vastly antedating our cultural history, and without
which the latter would be nought.
Acknowledgements
The encouragement of Colwyn Trevarthen has helped me to pursue the ideas in this chapter. I am
also indebted to Pär Segerdal for the inspiration provided me by his manuscript on Kanzi,
language, and the Pan/Homo culture. Reading it pushed me over the threshold to the conception
of human culture presented here. The writing of this chapter was supported by a grant from the
Bank of Sweden Tercentenary Foundation to Guy Madison and the author, and the research on
which it is based was supported by an earlier grant by the same foundation to the author.
References
Baptista LF and Schuchmann K (1990). Song learning in the Anna hummingbird (Calypte anna). Ethology,
84, 15–26.
Batali J (1998). Computational simulations of the emergence of grammar. In JR Hurford, M Studdert-Kennedy
and C Knight, eds, Approaches to the evolution of language: Social and cognitive bases, pp. 405–426,
Batali J (2002). The negotiation and acquisition of recursive grammars as a result of competition among
exemplars. In T Briscoe, ed., Linguistic evolution through language acquisition: Formal and computational
models, pp. 111–172. Cambridge University Press, Cambridge.
Baylis JR (1982). Avian vocal mimicry: Its function and evolution. In DE Kroodsma and EH Miller, eds,
Acoustic communication in birds, pp. 51–83. Academic Press, New York.
Boesch C (1996). The emergence of cultures among wild chimpanzees. Proceedings of the British Academy,
88, 251–268.
Bybee J (1998). A functionalist approach to grammar and its evolution. Evolution of Communication,
2, 249–278.
Byrne RW and Byrne JME (1993). Complex leaf-gathering skills of mountain gorillas (Gorilla g. beringei):
Variability and standardization. American Journal of Primatology, 31, 241–261.
Byrne RW and Russon AE (1998). Learning by imitation: A hierarchical approach. Behavioral and Brain
Sciences, 21, 667–721.
Call J and Tomasello M (1995). Use of social information in the problem solving of orangutans
(Pongo pygmaeus) and human children (Homo sapiens). Journal of Comparative Psychology,
109, 308–320.
Campbell L (1998). Historical linguistics. An introduction. Edinburgh University Press, Edinburgh.
Catchpole CK and Slater PJB (1995). Bird song: Biological themes and variations. Cambridge University
Press, Cambridge.
Darwin C (1871). The descent of man and selection in relation to sex. D Appleton & Company, New York.
DeLancey S (1994). Grammaticalization and linguistic theory. In J Gomez de Garcia and D Rood, eds,
Proceedings of the 1993 Mid-America Linguistics Conference and Conference on Siouan/Caddoan
Languages, pp. 1–22. Department of Linguistics, University of Colorado, Boulder, CO.
Donald M (1991). Origins of the modern mind. Harvard University Press, Cambridge, MA.
Doupe AJ and Kuhl PK (1999). Birdsong and human speech: Common themes and mechanisms. Annual
Review of Neuroscience, 22, 567–631.
Durkheim É (1912/1995). The elementary forms of religious life. Free Press, New York.
Elkin AP (1945/1980). Aboriginal men of high degree. University of Queensland Press, Brisbane.
Fragaszy DM and Perry S (eds) (2003). The biology of traditions. Models and evidence. Cambridge
University Press, Cambridge.
Fraisse P (1982). Rhythm and tempo. In D Deutsch, ed., The psychology of music, pp. 149–180. Academic
Press, New York.
Geissmann T (2000). Gibbon song and human music from an evolutionary perspective. In NL Wallin,
Gluckman M (1965). Politics, law and ritual in tribal society. Blackwell Publishers, Oxford.
Goodall J (1986). The chimpanzees of Gombe: Patterns of behavior. Harvard University Press,
Cambridge, MA.
Hurford JR (2000). The emergence of syntax (editorial introduction to section on syntax). In C Knight,
M Studdert-Kennedy and J Hurford, eds, The evolutionary emergence of language: Social function and the
origins of linguistic form, pp. 219–230. Cambridge University Press, Cambridge.
Imanishi K (1957). Identification: A process of enculturation in the subhuman society of Macaca fuscata.
Primates, 1, 1–29.
Itani J (1958). On the acquisition and propagation of a new food habit in the troop of Japanese monkeys
at Takasakiyama. In K Imanishi and S Altmann, eds, Japanese monkeys: a collection of translations,
pp. 52–65. University of Alberta Press, Edmonton, Canada.
Janik VM and Slater PJB (1997). Vocal learning in mammals. Advances in the Study of Behavior, 26, 59–99.
Jarvis ED (2004). Learned birdsong and the neurobiology of human language. In HP Ziegler and
P Marler, eds, Behavioral neurobiology of birdsong, pp. 749–777. Annals of the New York Academy of
Sciences, 1016.
Kant I (1781/1966). Critique of pure reason. Translated by F Max Müller. Doubleday (Anchor Books),
Garden City, New York.
Kirby S (1998). Language evolution without natural selection: From vocabulary to syntax in a population
of learners. Technical Report, Edinburgh Occasional Papers in Linguistics, 98–1, Department of
Linguistics, University of Edinburgh.
Kirby S (2000). Syntax without natural selection: How compositionality emerges from vocabulary in a
population of learners. In C Knight, M Studdert-Kennedy and J Hurford, eds, The evolutionary
emergence of language: Social function and the origins of linguistic form, pp. 303–323. Cambridge
Kirby S (2001). Spontaneous evolution of linguistic structure: An iterated learning model of the emergence
of regularity and irregularity. IEEE Transactions on Evolutionary Computation, 5, 102–110.
Kirby S (2002). Learning, bottlenecks and the evolution of recursive syntax. In T Briscoe, ed., Linguistic
evolution through language acquisition: Formal and computational models, pp. 173–204. Cambridge
Kroodsma DE and Baylis JR (1982). A world survey of evidence for vocal learning in birds. In DE Kroodsma
and EH Miller, eds, Acoustic communication in birds, pp. 311–337. Academic Press, New York.
Kroodsma DE and Parker LD (1977). Vocal virtuosity in the brown thrasher. Auk, 94, 783–785.
Kroodsma DE (1988). Song types and their use: developmental flexibility of the male blue-winged warbler.
Ethology, 79, 235–247.
Lefebvre L, Whittle P, Lascaris E and Finkelstein A (1997). Feeding innovations and forebrain size in birds.
Animal Behaviour, 53, 549–560.
Lehmann C (1995). Thoughts on grammaticalization, 2nd, revised edn. LINCOM Europa, München.
Premack DG and Woodruff G (1978). Does the chimpanzee have a theory of mind? Behavioral and
Brain Sciences, 1, 515–526.
Marler P (1970). Bird song and speech development: could there be parallels? American Scientist,
58, 669–673.
Marler P (2000). Origins of music and speech: Insights from animals. In NL Wallin, B Merker and
S Brown, eds, The origins of music, pp. 31–48. The MIT Press, Cambridge, MA.
Marler P and Mundinger P (1971). Vocal learning in birds. In H Moltz, ed., The ontogeny of vertebrate
behavior, pp. 389–449. Academic Press, New York.
Meltzoff AN (1996). The human infant as imitative generalist: A 20-year progress report on infant imitation
with implications for comparative psychology. In CM Heyes and BG Galef, eds, Social learning in animals:
The roots of culture, pp. 347–370. Academic Press, San Diego, CA.
Merker B (1999). Synchronous chorusing and the origins of music. Musicae Scientiae (Special Issue
1999–2000), 59–74.
Merker B (2000). Synchronous chorusing and human origins. In NL Wallin, B Merker and S Brown, eds,
The origins of music, pp. 315–328. The MIT Press, Cambridge, MA.
Merker B (2004). Cortex, countercurrent context, and dimensional integration of lifetime memory. Cortex,
40, 559–576.
Merker B (2005). The conformal motive in birdsong, music and language: An introduction. In G Avanzini,
L Lopez, S Koelsch and M Majno, eds, The neurosciences and music II: From perception to performance,
pp. 17–28. Annals of the New York Academy of Sciences, 1060.
Merker B and Okanoya K (2007). The natural history of human language: bridging the gaps without
magic. In C Lyon, CL Nehaniv and A Cangelosi, eds, Emergence of communication and language,
pp. 403–420. Springer Verlag, London.
Miklosi A (1999). The ethological investigation of imitation. Biological Reviews, 74, 347–377.
Miller GF (2000). The mating mind: How sexual choice shaped the evolution of human nature. Doubleday,
New York.
Nottebohm F (1972). The origins of vocal learning. American Naturalist, 106, 116–140.
Nottebohm F (1975). A zoologist’s view of some language phenomena, with particular emphasis on vocal
learning. In EH Lenneberg and E Lenneberg, eds, Foundations of language development, pp. 61–103.
Academic Press, New York.
Nottebohm F (1976). Discussion paper. Vocal tract and brain: A search for evolutionary bottlenecks.
In SR Harnad, HD Steklis and J Lancaster, eds, Origins and evolution of language and speech,
Opie I and Opie P (1960). The lore and language of schoolchildren. Oxford University Press, Oxford.
Payne K (2000). The progressively changing songs of humpback whales: A window on the creative process
in a wild animal. In NL Wallin, B Merker and S Brown, eds, The origins of music, pp. 135–150.
The MIT Press, Cambridge, MA.
Pepperberg I (1981). Functional vocalization by an African grey parrot. Zeitschrift für Tierpsychologie,
55, 139–160.
Radcliffe-Brown AR (1961). Structure and function in primitive society. Free Press, Glencoe, IL.
Rogoff B, Paradise R, Arauz RM, Correa-Chávez M and Angelillo C (2003). Firsthand learning through
intent participation. Annual Review of Psychology, 54, 175–203.
Schopenhauer A (1844/1969). The world as will and representation, 2 vols. Translated by EFJ Payne. Dover,
New York.
Staal F (1993). From meanings to trees. Journal of Ritual Studies, 7, 11–32.
Staal F (1989). Rules without meaning. Ritual, mantras and the human sciences. Peter Lang, New York.
Staicer CA, Spector DA and Horn AG (1996). The dawn chorus and other diel patterns in acoustic
signalling. In DE Kroodsma and EH Miller, eds, Ecology and evolution of acoustic communication
in birds, pp. 426–453. Comstock, Ithaca, NY.
Smith K (2003). Learning biases and language evolution. In S Kirby, ed., Language evolution and computation.
Proceedings of the Workshop on Language Evolution and Computation, 15th European Summer School
on Logic, Language and Information, Vienna, Austria, pp. 22–31.
Strehlow TGH (1947). Aranda traditions. Melbourne University Press, Melbourne.
Sugiyama Y (1993). Local variation of tools and tool use among wild chimpanzee populations.
In A Berthelet and J Chavaillon, eds, The use of tools by human and non-human primates,
pp. 175–187. Clarendon Press, Oxford.
Thorpe WH (1961). Bird-song. Cambridge University Press, Cambridge.
Timmermans S, Lefebvre L, Boire D and Basu P (2000). Relative size of the hyperstriatum ventrale is the
best predictor of feeding innovation rate in birds. Brain, Behavior and Evolution, 56, 196–203.
Todt D (1975). Social learning of vocal patterns and modes of their application in grey parrots
(Psittacus erithacus). Zeitschrift für Tierpsychologie, Suppl. 4, 1–100.
Tomasello M, Kruger AC and Ratner HH (1993). Cultural learning. Behavioral and Brain Sciences,
16, 495–552.
Turner V (1969). The ritual process. Aldine, Hawthorne, New York.
van Gennep A (1908/1960). The rites of passage. University of Chicago Press, Chicago, IL.
Whiten A and Byrne RW (1991). The emergence of metarepresentation in human ontogeny and primate
phylogeny. In A Whiten, ed., Natural theories of mind, pp. 267–281. Blackwell Publishers, Oxford.
Whiten A, Goodall J, McGrew et al. (1999). Cultures in chimpanzees. Nature, 399, 682–685.
Wilbrecht L and Nottebohm F (2003). Vocal learning in birds and humans. Mental Retardation and
Developmental Disabilities Research Reviews, 9, 135–148.
Williams H (2004). Birdsong and singing behavior. In HP Ziegler and P Marler, eds, Behavioral neurobiology
of birdsong, pp. 1–30. Annals of the New York Academy of Sciences, 1016.
Wray A (1998). Protolanguage as a holistic system for social interaction. Language and Communication,
18, 47–67.
Zahavi A and Zahavi A (1997). The handicap principle: a missing piece of Darwin’s puzzle. Oxford University
Press, Oxford.
Chapter 5
The evolution of music:
Theories, definitions and the nature of the evidence
Ian Cross and Iain Morley
5.1 Introduction
It is nowadays uncontroversial among scientists that there is biological continuity between
humans and other species. However, much of what humans do is not shared with other animals.
Human behaviour seems to be as much motivated by inherited biology as by acquired culture,
yet most musical scholarship and research has treated music solely from a cultural perspective.
Over the past 50 years, cognitive research has approached the perception of music as a capacity of
the individual mind, and perhaps as a fundamentally biological phenomenon. This psychology of
music has either ignored, or set aside as too tough to handle, the question of how music becomes
the cultural phenomenon it undoubtedly is. Indeed, only over the past 10 years or so has the
question of the ‘nature’ of culture received serious consideration, or have the operations of mind
necessary for cultural learning explicitly engaged the attention of many cultural researchers
(D’Andrade 1995; Shore 1996). The problem of reconciling ‘cultural’ and ‘biological’ approaches
to music, and indeed to the nature of mind itself, remains.
One way of tackling this problem is to view music from an evolutionary perspective. The idea
that music could have evolutionary origins and selective benefits was widely speculated on in the
early part of the twentieth century, in the light of increasing bodies of ethnographic research and
Darwinian theory (e.g., Wallaschek 1893). This approach fell rapidly out of favour in the years
before the Second World War, for political as much as for scientific reasons, with the repudiation
of biological and universalist ideas in anthropological and musicological fields (Plotkin 1997).
However, evolutionary thinking has again become central in a range of sciences and in recent
philosophical approaches, and music’s relationship to evolutionary processes has been increasingly
explored over the past two decades (see also Dissanayake, Chapters 2 and 24; Brandt,
Chapter 3; Merker, Chapter 4, this volume).
5.2 Music in evolutionary thinking

Previous writings on the evolution of capacities for music have made one of two assumptions:
either music is a by-product of other cognitive and physiological adaptations, or there are benefits
associated with musical behaviour in its own right. Views advocating non-adaptive roots
for music have been prominent over the past 20 years. A widely publicized view (Pinker 1997)
proposes that the complex sound patterns of music make stimulating use of adaptations for
language, emotion and fine motor control, which evolved independently through selective
pressures not associated with any functions peculiar to music.
Music may not be essential for survival, as eating or breathing are, but, like talking, may confer
a selective benefit and express a motivating principle that has great adaptive power. Music may
have developed from functions evolved for particular life-supporting purposes as a specialization
that elaborates and strengthens those same purposes. As Huron (2001, p. 44) puts it, ‘If music
is an evolutionary adaptation, then it is likely to have a complex genesis. Any musical adaptation
is likely to be built on several other adaptations that might be described as pre-musical or
proto-musical.’
Let us consider the theories that have been proposed to explain how our capacity for music
may have evolved.
5.2.1 Music promotes group cohesion

Roederer (1984), like H. Papousek (1996) and Dissanayake (2000; and Chapters 2 and 24, this
volume), proposes that music developed from mother–infant communication. The musical
manner of their interaction, he suggests, strengthens emotional bonds between mother and
infant, and practises the extraction of speech information from the musical components of talk,
such as vowels, inflections and the pitch cues cultivated in some oriental languages. Roederer
notes that music can transmit emotional information to many people at once, equalizing the
emotional state of the group, which results in a bonding effect between the group members. This
is an effect clearly identified earlier in Blacking (1969).
Sloboda (1985) observes that all cultures require the cognitive and social organization of
practices and mental techniques for survival, and that while modern cultures have ‘many
complex artefacts that help us to externalize and objectify the organizations we need and value’
(Sloboda 1985, p. 267), in non-literate societies the ‘organizational structures’ must be evidenced
and expressed primarily in terms of the expressive ways that people interact with one another.
For example, music can provide a mnemonic framework for the knowledge of a community, as
well as a way of expressing the structure of social relations (Dissanayake, Chapters 2 and 24;
Merker, Chapter 4, this volume).
5.2.2 Music is a product of group selection

The potential function of music in selection at the level of the group needs to be assessed in the
light of the extensive debate within recent evolutionary thinking on the nature and existence of
mechanisms of selection at the group level. Shennan (2002), in a comprehensive evaluation
of models of evolutionary selection applicable to theories of human prehistory, observes that
selection can occur at numerous levels, including that of the group. Group behaviours affect the
social environment in which individuals live, feed and breed. As Shennan puts it,
All theoretical schools, including those that are sceptical about other levels of evolutionary process
than that of individual inclusive fitness, recognize that such [individual] interests may often be served
by co-operating rather than competing with other individuals of the same species.
(2002, p. 213)
In consequence of frequent interaction with the same people, an individual’s behaviours are
likely to acquire the form of approved prosocial norms that emerge within a population.
Adherence to these norms can benefit the members of the group by giving additional rewards
for behaviours that they choose to undertake as individuals (Bowles and Gintis 1998). In
other words, optimal behaviours for the well-being of an individual can be determined
through engagement with conspecifics, as well as between each individual and their non-human
environment. In a social species, the likelihood that individuals will survive to procreate or have a
high rate of procreation depends on their ‘cultural fitness’—how they behave in relation to others
in their social group, not just their physical fitness.
Behaviours that contribute to ‘group cohesiveness’ may make other cooperative behaviours
more likely. Bowles and Gintis (1998) have demonstrated through their game theory model that
‘populations [without a centralizing control] whose interactions are structured in such a way that
coordination problems are successfully overcome will tend to grow, to absorb other populations,
and to be copied by others’ (Shennan 2002, p. 216).
It can be seen that the emergence of musical behaviours as a prosocial norm assisting
coordination within a group could lead to the growth of such groups and the spread of those
behaviours. Not only could musical behaviours become a behavioural norm in their own right,
but, because of their foundations in powerful motives for social awareness and expressive
behaviours, those individuals with well-developed capacities for musical action and perception
should also be best at identifying and engaging with other norms of social interactive behaviour.
Therefore, in theory, musical behaviours fit well with the models of selection, at both individual
and group levels, that demonstrate how the development and spread of musical behaviours
is possible.
5.2.3 Socio-emotional bonding is favoured by evolution of musical signalling
Brown (2000a) proposes that music and language have common origins in a single communicative
system (see also Scherer 1991). According to Brown, both music and language can be
conceived of as functioning at ‘phonological’ and ‘meaning’ levels. The stream of sonic events
that constitutes spoken language is interpreted as lexical items by means of a ‘phonological
system’, the output of which feeds into a ‘propositional system’ for the production of speech,
within which units have both relational (syntactical) and referential (semantic) value.
Analogously, a phonological system can be conceived of as transforming a stream of musical
sounds into discrete entities (e.g., motifs, harmonic configurations), which are ‘fed into a system
of pitch-blending syntax that specifies a set of relationships between sound patterns and
emotions … [which] deals with the issues of sound emotion, tension and relaxation, rhythmic
pulse and the like’ (p. 274). In this model, systems for dealing with sound information in both
music and speech are identified as dissociated but employing comparable tiers of ‘processing’,
each derived from a common set of hypothetical principles for interpreting and generating
phrasing in action and experience (see also Brandt, Chapter 3, this volume).
Brown proposes that this common set of principles arose first as a unitary vocal communicative
medium, ‘musilanguage’, and that language and music then became separate capacities
through a process of divergence and functional specialization. Language came to fulfil propositional
functions such as the expression of ‘truth values’, whereas music came to constitute a
pre-eminently social or interpersonal phenomenon. He suggests that, ‘the principal function of
music making is to promote group cooperation, coordination and cohesion’ (p. 296).
Brown subsequently (2000b) adds the notion of music as reinforcing ‘groupishness’, which he
defines as a ‘suite of traits that favour the formation of coalitions, promote cooperative behaviour
towards group members and create the potential for hostility towards those outside the group’
(p. 252). Music supports these traits through the opportunities that it offers for the formation and
manifestation of group identity, for the conduct of collective thinking (as in the transmission of
group history and planning for action), for group coordination through synchronization (the
sharing of time between members of a group), and for group catharsis (the collective expression
and experience of emotion). Ultimately, Brown sees music as having become established
in human cultures through its role as ‘ritual’s reward system’; music, for him, is a type of
‘modulatory system acting at the group level to convey the reinforcement value of these
activities … for survival’ (p. 257). For Brown, music’s survival value is thus not immediate and
individual, but lies in its ability to promote group cohesion.
A different position is adopted by Hagen and Bryant (2003), who suggest that rather than
causing social cohesion, music and dance signal social cohesion achieved by other means. Hagen
and Bryant’s overall thesis is that
For humans and human ancestors, musical displays may have … functioned, in part, to defend territory
(and perhaps also to signal group identity), and that these displays may have formed the evolutionary
basis for the musical behaviours of modern humans.
(2003, p. 25)
They propose that music and dance act as indicators of group stability and the ability to carry out
complex coordinated actions (as exemplified, perhaps, in the New Zealand All Blacks’ football
cry, haka). They propose that the time needed to create and practise music and dance corresponds
to the quality of the coalition performing them, indicating how much time they have
devoted to preparing their skill.
Hagen and Bryant justify their position, and reject other explanations, on the grounds that
musical behaviours cannot contribute directly to the cohesion of a group, because they are not a
good indicator of an individual’s ability to contribute to the group’s survival. However, this view
of group cohesion purely in terms of immediately perceived costs and benefits of group membership
ignores emotional bonding and the loyalty engendered by a mutual emotional experience.
Individuals may already have established their credibility within a group, in terms of their ability
to contribute to its survival, but this provides no indication of their likelihood of doing so, or to
whom they will direct their assistance. The ability of music to act as a forum for the practice of
integrated, complex, coordinated group activities resulting in a powerful sense of membership
and trust provides a coherent explanation as to why these behaviours persisted at a group level.
One of the manifestations of this role may have been ‘coalition signalling’, and this may even have
led to its perpetuation; however, this is unlikely to have been the primary selective force for
music’s development.
At a psychobiological and individual level, rather than a behavioural and social level, musical
experience has been linked with the release and action of life-sustaining regulatory hormones.
Freeman (1995) reports that the neuropeptide transmitter oxytocin aids in the formation of
strong positive emotional memories and in the supplanting of negative emotional memories,
having its strongest effects during trauma or ecstasy. Oxytocin is released into the brain in
females during lactation, and is produced by males and females following sexual orgasm. It mediates
in interpersonal bonding, both pair-bonding and mother–infant bonding. Critically,
Freeman suggests that oxytocin is likely to be released while a person is merely listening to music.
This would provide a strong neurological rationale for the role of music in the formation
of social bonds, both in intimate interactions between people and in group musical activities
such as crowd chants (Huron 2001; see Panksepp and Trevarthen, Chapter 7, and Osborne,
Chapter 25, this volume).
5.2.4 Music promotes sexual selection

Charles Darwin proposed that the evolution of music in humans has its roots in courtship songs.
He believed that the vocalizations with the greatest pitch changes made by apes tend to be
produced by males when soliciting mates (Darwin 1871). Miller (2000, 2001) argues that musical
behaviours can indicate sexual fitness, signalling status, age, physical well-being and fertility. He
suggests that dancing reveals aerobic fitness, coordination, strength and health; voice control may
reveal self-confidence and status; rhythmic ability may indicate the ‘capacity for sequencing complex
movements reliably’, whilst virtuosic performance per se ‘may reveal motor coordination,
capacity for automating complex learned behaviours, and having the time to practise’ (Miller
2000, p. 340). The last characteristic may also, in young adults, signal sexual availability, as it
implies a lack of parenting demands. These properties of musical and dramatic displays could
lead to aesthetic preferences for particular forms of those behaviours, which leads Miller (2000)
to propose that
Any aspect of music that we find appealing might also have been appealing to our ancestors, and if it
was, that appeal would have set up sexual selection pressures in favour of musical productions that
fulfilled those preferences.
(2000, p. 342)
This logic, however, implies that any musical trait for which there is a preference will subsequently
be selected for by sexual selection. To this, an important qualifier should be applied: by
definition, selection, sexual or otherwise, for a particular trait can occur only if that trait can arise
by mutation of a gene and can be inherited. Behaviours and skills (for example, a particular language
or music) can be transmitted in other ways. In addition, if sexual selection was responsible
for the evolution of motives that cause most humans to find features of music aesthetically
appealing, then we would expect convergence in behaviours of musical expression, and in the
aspects of them that give pleasure. While musical behaviours are found in all cultures and share
dynamic features and social motivations and uses, aesthetic preferences are often culture-specific.
Miller argues that ‘If one can perceive the quality, creativity, virtuosity, emotional depth and
spiritual vision of somebody’s music, sexual selection through mate choice can notice it too’ (p. 355);
however, he admits that such rationales are speculative. While his thesis is presented as a call for
empirical testing, Miller’s hypothesis of the fitness-display properties of music does intuitively
make sense. It could provide a mechanism by which musical behaviours may have become
refined, perpetuated and spread in human evolution. His theory attempts to explain how the
forms of musical behaviour may have evolved in the species, rather than how musical forms
became appealing. It may be that the core factor in the appreciation of the quality of the musical
behaviour (and its creativity and virtuosity in artistically developed forms) is its very ‘emotional
depth’, i.e., the extent to which its perception elicits a compelling emotional response, and that
this experience of emotion might not be a product of sexual selection. (See Dissanayake, Chapter 2;
Lee and Schögler, Chapter 6, on emotional expression in movement, this volume.)
5.3 The need for a comprehensive definition of music

Clearly, theories of how music may have evolved are divergent. For Pinker, music is a technology,
ultimately dispensable, with no evolutionary significance. For Roederer, Sloboda, Brown and
Hagen and Bryant, music may have had significant adaptive roles in selection at the group level,
while for Miller, music may well have played a part in sexual selection. Nevertheless, all of these
theories rely on what Huron (2001, p. 44) has described as ‘the nebulous rubric music’. They provide
no clear demarcation of what is intended by the term ‘music’. It seems that these authors are
employing something like a standard dictionary definition of music, such as ‘the art of combining
sounds of voices or instruments so as to achieve beauty of form and expression of emotion’
and a ‘pleasant sound’, both from the Concise Oxford English Dictionary (Sykes 1982), or ‘the art or
science of arranging sounds in notes and rhythms to give a desired pattern or effect’ – from the
Penguin Dictionary of Music (Jacobs 1972).
For contemporary musicologists and ethnomusicologists, these definitions are unsatisfactory.
They could apply to, say, a CD recording of a Beethoven string quartet, or a live performance by a
rock band such as Coldplay. It is unclear whether the dictionary definition would embrace
the musical intentions of a contemporary composer such as Brian Ferneyhough, the sonic
surface of contemporary popular forms such as electronic dance music, or the drum and dance
music of a shamanic ritual in Borneo. To the musicologist and ethnomusicologist, these phenomena
are indubitably musical, but ‘sounds combined so as to produce beauty of form and
expression of emotion’ scarcely captures what can be considered to be musical in them. Several
of the scientific conceptions of how musical behaviours and appreciations arose in evolution (for
example that of Miller, 2000 and 2001) appear implicitly to define music according to current
Western musical practices, where music is produced by few and consumed by many.
All of these notions of music reveal themselves to be ideological constructs rooted in the
workings of broader socio-economic and political forces, which are dynamic, changing processes.
As Magrini (2000) notes, changes in the ways in which music is manifested result in the discouragement
of alternative and often older ways of engaging with music, particularly as an active
element in everyday life. An ‘inhibition’ of musical practices may occur through processes of the
reification of elements in cultural models of engagement with music; this occurs as the role of
the music consumer—as opposed to that of a participant or everyday practitioner in musical
activity—is created, then enhanced and eventually enforced, by institutionalizing or commodifying
the processes of knowledge acquisition. Music-making may thus be inhibited through the loss of
roles, contexts, situations and practices, and the impoverished models of music and its social roles
that result may all too easily be taken by music scholars to represent all possible kinds of music.
Before assessing the relationship between music and evolution, it is essential to frame the object
of study in a different way—to perceive music in all of its manifestations.
5.3.1 Music across cultures and times

All known cultures have or have had something that can be regarded as music. To be more
precise, in the words of John Blacking (1995, p. 224), ‘every known human society has what
trained musicologists would recognize as music’ (our emphasis). Across cultures and over time,
the forms and significances of music are extremely diverse. In many, perhaps most, non-Western
cultures, music requires overt action and active group engagement; the differentiation and
specialization of the roles of performer and audience might almost be considered a minority
practice. In most cultures, music is employed not just in entertainment and courtship, but as an
essential component of ritual, often marking transitions between different stages of life (e.g.,
from adolescence to adulthood), as well as consequential events such as funerary rituals and
seasonal festivals. It may function in the maintenance of oral traditions by virtue of its mnemonic
powers. And it seems that in most, if not all, cultures, interactions between caregivers and infants
have features that can be interpreted as musical.
Music appears to be something of a universal social fact. However, as the continuation of
the quotation from Blacking above makes clear, ‘there are some societies [not confined to the
African continent] that have no word for music or whose concept of music has a significance
quite different from that generally associated with the word “music”’. It is notable, moreover, that
where a term exists, in a non-Western society, that embraces the activities that a Western musicologist
might conceive of as music, for example the Igbo nkwa (Waterman 1991), that meaning
tends not to differentiate between music and dance.
In general, it seems that practices that are recognizable as music in societies beyond contemporary
global Western culture are characterized by their use of sound and movement together. They
tend to involve collective performance: that is, they are characterized in terms of not only sound
and action, but interaction between the music makers. They are marked by (1) an apparent
‘non-efficaciousness’, in that their immediate and evident consequences are not observable
through material change in the local environment or in the subsequent behaviours of the
participants, and (2) ‘embeddedness’ in a wide range of everyday and special practices. In most,
if not all cases, they also manifest significant hedonic value (Panksepp and Trevarthen, Chapter 7,
this volume).
Accepting that something like music—even if not discretely identified as such by its
practitioners—is in all human cultures, the definitions in our dictionaries seem clearly
unsatisfactory. ‘Music’ as a universal human behaviour is marked by sound, action, interaction,
non-efficacy, and a multiplicity of social functions and emotional effects. These characteristics
will now be assessed in more detail, to arrive at an operational definition of music that might
enable its relationship (if any) to evolutionary processes to be addressed comprehensively.
5.3.2 Music as embodied expressive movement

Since the advent of sound recording, listening, with no overt and observable behaviour on the
part of the listener, has been the paradigmatic mode of engagement with music in Western
societies. However, before the advent of sound recording, the notion of music as involving action
would have seemed self-evident. While it may seem trivial to suggest that music entails activity in
its making, there are many instances where music’s sonic patterns are not just caused by actions,
but have a structure and identity that is inseparable from the doing and regulation of the actions
themselves. This is evident in the studies by Blacking (1961) of southern African kalimba thumb-
piano music, where he showed that the melodies can, on occasion, depend more on the sequence
of movements involved in the production of the melody than on the pitch patterns produced.
Similar findings are reported by Baily (1985) for the repertoires performed on Afghani dutars,
and Nelson (2002) for melodic patterns in blues guitar solos.
In these three instances, action on the part of a performer is an integral component of the
identity of the music. Several instances can be cited where actions by participants, in situations
where the performer–audience distinction is absent, constitute a framework essential for the
intelligibility of musical sound patterning (e.g., Stobart and Cross 2000). A recent meta-analysis
of neuroscientific studies of music perception (Janata and Grafton 2003) demonstrates that
passive musical perception appears to involve areas of the brain associated with motor behaviour,
perhaps elicited by the sound sequences of music mirroring aspects of physical movement
(Scherer and Zentner 2001, pp. 377–78; Benzon 2001). Music seems better understood, not as
abstract patterns of sound contemplated in immobility, but as a thoroughly embodied activity of
human agents (Lee and Schögler, Chapter 6; Turner and Ioannides, Chapter 8, this volume).
5.3.3 Music as entraining others, and engaging them in movement

Most of the contexts in which music occurs are not only active but participatory, involving the
overt and active engagement of people in musical group activities. An intrinsic component of
this participation is ‘entrainment’ (Clayton et al. 2004), which involves the coordination in time
of one participant’s musical behaviours with those of others. This process appears to involve
the perceptual inference or abstraction of a regular periodic pulse or beat from a sequence of
rhythmic events, and the intuitive or cognitive organization of the timing of actions and sounds
around the motivating pulse. It orientates attention prospectively to the time points presented in
the pulse, with a concomitant periodic modulation of the amount of attentional resources
devoted to tracking the temporal flow of the music, again orientated around the pulse (Drake
et al. 2000).
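The inference of a regular pulse from a sequence of rhythmic events, as described above, can be illustrated with a small computational sketch. What follows is a toy model only, not a method proposed by the authors cited: it assumes that events are given as onset times in seconds, tries a range of candidate pulse periods, and prefers the period at which the onsets cluster most tightly at one point in the cycle. The function name infer_pulse, the search range and the step size are all illustrative assumptions.

```python
import cmath
import math


def infer_pulse(onsets, min_period=0.3, max_period=1.0, step=0.005):
    """Return (period, strength) for the candidate pulse that best fits the onsets.

    Toy model: for each candidate period, every onset is mapped to an angle
    within the pulse cycle. If the onsets fall at a consistent point in the
    cycle, the mean of the corresponding unit vectors has a length close to 1;
    if they are scattered around the cycle, the length is close to 0.
    """
    best_period, best_strength = None, -1.0
    n_steps = int(round((max_period - min_period) / step))
    for i in range(n_steps + 1):
        period = round(min_period + i * step, 6)
        vectors = [cmath.exp(2j * math.pi * t / period) for t in onsets]
        strength = abs(sum(vectors) / len(vectors))
        if strength > best_strength:
            best_period, best_strength = period, strength
    return best_period, best_strength


# Hypothetical example: eight events spaced half a second apart.
example_onsets = [i * 0.5 for i in range(8)]
period, strength = infer_pulse(example_onsets)
print(period, strength)
```

The search range is restricted on purpose: any isochronous sequence also fits pulses at integer subdivisions of its period, so a model of this kind needs some prior limit on plausible tempi, here supplied by hand.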
According to a cognitive interpretation, pulse abstraction facilitates an optimal use of attentional
resources over time. Experiments show that events occurring in temporal alignment with
the inferred pulse are detected and identified more easily than events that occur out of phase with
the pulse (Jones and Yee 1993). What is conceived as the ‘attentional load’ is modulated in time in
accordance with the pulse the subject infers. At a neurophysiological level, the experience of pulse
seems intimately related to the different ranges of timing in the coordination of gross and fine
movements (Thaut 2005). Entrainment to an external pulse may be either volitional (under
conscious control) or preconscious (Stephan et al. 2002).
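The distinction between events that are aligned with an inferred pulse and events that fall out of phase with it can be made concrete in the same spirit. The sketch below assumes that a pulse is fully described by a period and the time of one beat, and it treats an event as aligned if it lies within a fixed tolerance of the nearest beat. The function names and the 50 ms tolerance are hypothetical choices for illustration. The sketch only classifies events; it does not model their ease of detection.

```python
def distance_to_pulse(event_time, period, first_beat=0.0):
    """Absolute time in seconds between an event and the nearest beat of the pulse."""
    offset = (event_time - first_beat) % period
    return min(offset, period - offset)


def is_aligned(event_time, period, first_beat=0.0, tolerance=0.05):
    """True if the event lies within `tolerance` seconds of a beat (assumed threshold)."""
    return distance_to_pulse(event_time, period, first_beat) <= tolerance


# Hypothetical events judged against a pulse with a period of 0.5 s.
for t in (1.02, 1.25, 2.48):
    print(t, is_aligned(t, period=0.5))
```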
We conclude that musical interaction between human participants is rooted in intuitive, mind-
generated processes of pulse abstraction/generation within the individuals. These processes
implement the optimal allocation (modulation in time) of attentional resources and may focus
experience in hierarchical temporal structures. The perceptual processes are integral to
the prospective temporal control of periodic motor behaviour. Music as an interactive social
behaviour thus affords the means for synchronizing the deployment of a participant’s experience
of moving with that of other participants, facilitating the individual and the collective
(intersubjective) focus on specific moments and sequential patterns in the temporal unfolding of
the music (see Osborne, Chapter 25, this volume).
5.3.4 The ambiguity of musical intentions and a definition of musical meaning
A broad interpretation of these entrainment processes, or the prospective perceptual control of
socially engaged musical movements, might impute similar characteristics to language.
Conversational language also relies on features that coordinate the timing of an individual’s
behaviours with those of others, as well as synchronizing the deployment of participants’
attention (Auer et al. 1999). In language, however, the meaning of an utterance with reference to
an object in the world can be specified with some precision; this is not the case for music.
The ‘outside’ meaning or denotational significance of music can rarely be pinned down
unambiguously. As John Blacking (1995) noted, ‘the “same” sound patterns … can … have different
meanings within the same society because of different social contexts’ (p. 237); in Langer’s (1942)
words, in music, the ‘actual function of meaning, which calls for permanent contents, is not
fulfilled; for the assignment of one rather than another possible meaning to each form is never
explicitly made’ (p. 195). In effect, the same piece of music can bear quite different meanings for
performer and listener; it might even bear multiple disparate simultaneous meanings for a single
participant. Music, to a much greater degree than language, appears to have a ‘floating intentionality’
(Cross 1999), gathering meaning from the contexts in which it happens, or where and how it is
remembered to have happened, and in turn contributing meaning to those contexts.
While language can articulate complex propositions that can be interpreted as referring exclu-
sively to particular states of affairs in the world, which may have ‘truth value’ in respect of these,
this is not the case for music. Although possessing a similar potential to language for the articula-
tion of complex syntactic structures in action and awareness of action, music never seems to
achieve direct or unequivocally interpretable reference to things beyond itself. While music can be
interpreted as referring both to itself and beyond itself (as possessing both ‘sense’ and ‘reference’,
after Frege 1952), it is only in respect of its perceived reference to itself (its sense) that its ambiguity
may be minimized or entirely resolved (Cross 2005; Brandt, Chapter 3, this volume).
As music flows in time, it presents rhythmic and melodic patterns that may give rise to expec-
tations for listeners or participants as to how and when it will continue. In the rhythmic flow of
the music, those expectations may be realized or abrogated. Thus, music can generate allusion
to future possibilities of unfolding; when those future possibilities become actualities, the
significance of those earlier musical events may become clear, their sense (at least partially)
disambiguated, giving rise to what Meyer (1956) has called music’s evident meanings.
Those patterns of evident meaning, together with the music’s sonic and gestural qualities as it
unfolds, may also yield a degree of reference, this time beyond the music itself. They may result in
the elicitation of emotion or the evocation of specific conceptual–intentional complexes in the
mind—complexes of ideas with which aspects of the music have become associated through indi-
vidual experience or cultural convention, or because of biosocial predispositions (Cross 2005;
Lavy 2001; Morley 2003, pp. 150–162). But while those conceptual–intentional complexes may
themselves be complex, they are neither propositional nor decomposable in relation to definite
objects of human thought and action. Their experience is likely to vary from participant to
participant, taking form in what Meyer (1956) referred to as connotative complexes. Their sense
and reference are not bound to a specific situation or set of circumstances, but rather to a range of
situations, as a particular emotional or affective mind–brain–body state may be relevant to a range
of circumstances for any one individual (Oatley and Johnson-Laird 1998). Hence, while aspects of
music’s sense may (retrospectively) be disambiguated, its objective reference cannot.
In certain circumstances, however, music can appear to bear meanings in much the same way
as language. Results from functional brain imaging studies support this conclusion. Koelsch et al.
(2004) demonstrated that music can elicit brain responses similar to those elicited by language in
respect of ‘semantic mismatches’, although the responses following a musical context were less
consistent than those following a linguistic context. Music and language both mean; they can
both function in the conceptual–intentional domain as acts of meaning. Nevertheless, language
can express more semantically decomposable propositions; it can refer unambiguously to
complex states of affairs in the world. Music embodies and exploits an essential ambiguity, and in
this respect, language and music may be at complementary poles of a communicative continuum,
meeting somewhere near poetry (Cross 2003c). This inherent ambiguity—together with the
quality of the actions and interactions that were noted earlier as integral to music—suffices to
differentiate music from language, enabling it to be efficacious for individuals and for groups in
contexts where language would be unproductive or impotent, precisely because of the need for
language to be interpreted unambiguously (Brandt, Chapter 3, this volume).
Hence, music might be defined broadly and operationally as embodying, entraining, and
transposably intentionalizing time in sound and action (Cross 2003a), typically expressed
by voices and instruments that articulate patterns in pitch, rhythm and timbre, and involving
correlated gestural patterns of movement that may or may not be oriented towards sound
production. This definition is not intended as an alternative to conventional dictionary
definitions; such definitions effectively delimit those aspects of music that appear significant
within recent Western culture. The broad definition is intended to delineate those attributes that,
in every community, appear to distinguish music from other spheres of human activity in a way
that might enable its relationships to cultural and biological processes to be evaluated. It is not
intended to be either constitutive or essentialist.
5.4 The communal functions of musical actions

Music, as broadly defined above, is capable of engaging and rewarding communities, groups
and individuals. In collective musical behaviour, individuals act and experience what they do
in shared, purposeful time. The experience of the coordinated nature of the collective activity is
likely to engender a strong sense of group identity with the communication of pleasure. Music
both entrains movement and experience, and allows each participant to interpret its significances
for himself or herself, independently, without the integrity of the collective musical behaviour being
undermined. Music’s ambiguity—its ‘floating intentionality’—in the self and for or with others,
may thus be highly advantageous for groups, serving as a medium for participation and
contributing to the maintenance of social flexibility.
A clue to music’s efficacy for the individual might be found in Meyer’s (1956) suggestion that
music does not merely embody metaphors, but is a ‘metaphorizing medium’ through which
seemingly disparate concepts may be experienced as related and become part of a transforming
experience of the self. Music appears to constitute a medium that facilitates access to, and the
formation of, conceptual–intentional complexes and metaphorical representations that may
apply to many individual and social circumstances. As Meyer puts it:
Music does not [for example] present the concept or image of death itself. Rather it connotes that rich
realm of experience in which death and darkness, night and cold, winter and sleep and silence are all
combined and consolidated into a single connotative complex… What music presents is not any one
of these metaphorical events but rather that which is common to all of them, that which enables them
to become metaphors for one another. Music presents a generic event, a ‘connotative complex’, which
then becomes particularized in the experience of the individual listener.
(1956, p. 265)
Thus, music can be interpreted as facilitating the formation of conceptual–intentional complexes
across multiple domains of experience, providing a synthetic medium that can bind together the
experiences of disparate situations and concepts in whole forms that cannot be decomposed into
sets of discrete propositions. This may be of particular significance where two or more domains
of experience with fundamentally irreconcilable characteristics appear to coexist, as may be
encountered in ritual or religious contexts (Cross 2003c; Merker, Chapter 4, this volume).
5.4.1 The developmental value of music

While music can function as a concept-linking medium for mature members of a culture, we
suggest that it is also powerfully effective in infancy and in childhood, for the individual and
for pairs or groups. ‘Protomusical behaviours’ (M. Papoušek 1996) have been identified as the
foundation of the ability infants have to interact with others predictively, to exercise the capacity
for Trevarthen’s ‘primary intersubjectivity’ (1979, 1980, 1999; Dissanayake, Chapter 2, this
volume). For older children and adults, musical behaviours can be interpreted as providing ways
of interacting that—by virtue of their ambiguity, or flexible significance—are likely to minimize
social conflict. As a group of children play together musically, for each child the significance of
their own and others’ musical behaviour can be quite different and individual; yet the integrity of
the overall musical interaction, and the pleasure gained, need not be compromised. Music’s
ambiguity allows for the exploration and rehearsal of skills in interacting with others, minimi-
zing the risks of engaging in conflict or misunderstanding, risks that would be more likely were
the medium linguistic with unambiguous reference. Musical play can be a way to exercise and
acquire social competence and confidence in cost-free and mutually rewarding interaction.
In early childhood, protomusical and protolinguistic abilities are intimately interlinked, sharing
many features and relying on common systems in children’s cognitions and behaviours. As children
develop the capacity for ostensive/inferential communication, the extent to which vocal and gest-
ural behaviours can substitute for one another in linguistic contexts is increasingly constrained;
utterances become more fixed and unambiguous in their significance and meaning. In contrast,
protomusical and musical behaviours retain a degree of ambiguity or transposability in their
‘aboutness’, particularly in the babbling stage (Elowson et al. 1998). This ambiguity is evident in
the capacity of prelinguistic utterances to reflect or engage with the temporal dynamics of the
joint actions, physical events, experienced affective states and changes of affective state that can
be shared in social exchanges. The elements of protomusical behaviour can be associated,
for infants and children, with any or all of a wide range of types of event in their experience of
the world.
In what is still the only large-scale study of children’s music and musicality in a non-Western
context, Blacking (1967) notes that music subserves primarily social functions for the children of
the Venda society in southern Africa: ‘Most Venda children are competent musicians … and yet
they have no formal musical training. They learn music by imitating the performances of adults
and other children’ (p. 29). In a society where music is chiefly manifested as interactive behaviour
that plays an especially significant role in structuring social relations in both ritual and everyday
contexts (Blacking 1976), the musicality that emerges from enculturative processes has profound
effects on children’s socialization. Blacking’s findings relate directly to research on how children
learn all manner of knowledge and skills in different cultures, and specifically to the prevalence of
‘intent participation learning’ in the majority of societies (Rogoff et al. 2003), particularly where
there is little or no institutionalized schooling. While the Venda culture that Blacking studied
might be regarded as exceptional in the importance that it accords to music in structuring social
relations, music seems equally socially significant in many other non-Western societies, such as
those of the rural Andes (Stobart 1996), or the partially urbanized and heteroglot cultures of
north-west China (in the form of hua’er songs—Yang 1994). Music and activities exhibiting
musicality in infancy and childhood can be conceived of as providing a medium through which
social flexibility may be acquired and sustained.
Music may also aid development of the individual’s cognitive flexibility. Over the past 20 years,
cognitive psychologists have found that infants do not come into the world as blank slates
(Spelke 1999); neonates are predisposed to pick up and to process experience in quite specific
ways. Capacities for consciousness of events and objects emerge too rapidly to be explained by
the operation of a general-purpose learning mechanism, and their adaptive purpose is now
abundantly evident. Moreover, it has been shown that infants assimilate information pertaining
to the use of physical objects and events quite differently from how they acquire and manage
their intentions toward people and social events. For example, very young children may show a
highly developed capacity to reason about the social world at a level that may not be manifested
in their reasoning about physical objects (Donaldson 1992; Cummins 1998). It could be said that
infants come primed for ‘physics’ and primed for ‘psychology’, each in domain-specific ways.
Yet infants and children ultimately acquire what can be thought of as a domain-general compe-
tence that is useful for grasping meanings in any kind of cultural context. We suggest that music,
or rather protomusical behaviour, is efficacious in the emergence of this domain-general cultural
competence by virtue of its ambiguity—its transposability or floating intentionality. Infants
not only emerge into the world primed for investigation of what a psychological scientist might
identify as physics and psychology, but predisposed to engage in music-like activities in
their interactions with caregivers, which are neither or both of these. Thus, the foci and signifi-
cances of these protomusical activities—inherent musicality—can lie equally in either domain
(Cross 1999): it seems probable that they operate at a more fundamental motivating level,
enhancing the likelihood of integration of information across physical and social experience, and
facilitating the formation of a general competence not tied to any cognitively specialized domain
(Cross 2005).
There is tentative evidence for this suggestion in the positive correlations between IQ and the
engagement in musical activities found in studies reviewed by Schellenberg (2003). His own
more rigorously conducted study (Schellenberg 2004) shows that engaging in music lessons leads
to a small but statistically significant enhancement of IQ. While this evidence suggests that music
has a limited effect on the intellectual capacities of some individuals, it is also possible that, for
Schellenberg’s participants, the formal Western music lesson (which tends to take a form very
similar to a school lesson) provides a highly culture-specific learning context that minimizes
the extent to which the apparent social efficacy of music can be explored and exercised (for an
exception to this learning context, see Fröhlich, Chapter 22, this volume).
We conclude that music and language, while different parts of the human communicative
toolkit, both provide purposeful syntactic frameworks that serve human needs of joint action
and interaction. Similar capacities underlie their use, including the capacity to produce complex
and hierarchically structured sequences of events (sounds and actions) and to abstract structure
from such patterns produced by others. However, where language and music diverge is in the
ways in which the structures of those patterns are endowed with significance. In language,
considerations of reference and of relevance with regard to states of affairs in the world (Sperber
and Wilson 1986) are paramount. In music, unambiguous reference and relevance are much less
significant; the primary determinant of musical experience might well be how the perceived
sounds fit with the temporal structures experienced in a moving human body.
5.4.2 Is musicality a universal human talent, and if so, what kind of talent?
Our account of the functions of music presumes that music is not only culturally but humanly
universal, i.e., that not only do all known cultures engage in practices that are recognizable as
musical, but that all individuals of those cultures have the capacity for musicality. This assump-
tion would be seriously undermined were evidence to be found that a significant proportion of
normally developing individuals in any human population were incapable of displaying musical
behaviours. On the basis of current evidence, we believe that this is not the case.
In many traditional societies, a capacity to engage in musical activities appears to be expected
of all their members (Blacking 1995; Arom 1991). While it is accepted that some people will be
more adept or creative than others, a capacity for music is expected of all, like a capacity for
speech. In contemporary Western societies, a similar situation prevails: even individuals who feel
that they have no capacity to engage in overt musical behaviours are generally expected to have
the capacity to listen to music with a degree of appreciation.
There are, nevertheless, persons who are classified as amusical—who appear, when tested, to
lack the capacity to engage with or comprehend the sounds produced by musical behaviours.
This deficit may be consequent on a brain trauma, but some individuals with no identifiable
neurological damage also appear to lack musical capacities, as defined by particular tests (Peretz
2003). These individuals typically show a dissociation between their capacities to deal with infor-
mation in the pitch and time domains, frequently exhibiting more profound deficits in the
processing of melody than of rhythm. Peretz suggests that an inability to process fine-grained
pitch differences inhibits the development of a capacity to engage in musical activities, a condi-
tion that she defines as ‘amusia’. While earlier studies (e.g., Kalmus and Fry 1980) suggested that
some five per cent of the normal population are amusical, evidence from the application of a
more sophisticated test instrument—the Montréal Battery of Evaluation of Amusia (MBEA)—
suggests that amusia is extremely rare: only 2 per cent of those tested had scores more than two
standard deviations below the mean, but even here, performance was around 70 per cent correct
(Peretz et al. 2003).
There appears to be no strong evidence against the view that musicality is a universal human attribute.
However, very little scientific research into the possession of musical capacities has been
conducted outside the confines of contemporary Western society, and for a wider picture one
must rely on the ethnographic record. From the evidence presented in the ethnographic and
scientific literature taken together, we conclude that, as with language, all humans (with a very
few rare exceptions) have the capacity to engage in musical behaviours.
In view of the extent to which music appears entwined with other domains of human
behaviour, it seems feasible to suggest that this human capacity for music may comprise a
number of components, which may have come about under the influence of a range of different
evolutionary pressures. The integrated suite of behavioural capacities that constitutes modern
human musicality might have a variety of sources in prehistoric adaptive changes.
Pinker’s (1997) description of music as a technology with no evolutionarily adaptive value, a
view apparently predicated on the notion that music consists simply of sonic patterns, is unac-
ceptable to us. As we have seen, music cannot be reduced to patterns of sound, and its effects
appear more far-reaching than simple and immediate hedonic response in individuals. Miller’s
(2000) sexual selection theory, which focuses on music as display, may well describe some of
the ways in which musicality was adaptive in human evolution. However, as evident from the
foregoing, music is more than display: it typically involves coordinated interaction in individual
performance. It seems highly likely that music plays a significant role in forming and maintaining
group cohesion among humans, as Brown (2000b) suggests, by virtue of its capacity to entrain
activity, and its floating intentionality. Despite differences, there appear to be close functional
correspondences between music and language, which support Brown’s (2000a) suggestion that
they share a common and deeply rooted evolutionary origin.
5.4.3 Altriciality and play

Considered as a universal human behaviour, music does appear to have significant proximate
effects; however, these effects are not necessarily to be equated with ultimate causes. To evaluate
music’s status in processes of human evolution, it is also necessary to consider how musical
behaviours might have become part of the human behavioural repertoire. We propose that
processes of progressive juvenalization evident in the later hominid lineage may have spurred the
emergence of behaviours that are central to the modern human faculty for music. In the hominid
lineage, each successive species appears to have been more altricial than its predecessors, with a
progressively longer proportion of the total lifespan spent in increasingly differentiated juvenile
states (Bogin 1999).
Joffe (1997) has shown that primate species with complex social organizations are more likely
to be altricial; she proposes that a complex social organization is enabled by an extension of the
learning period in which members of a species manage their social interaction in more flexible
ways. A significant feature of the behaviour of juvenile animals, particularly of predatory
or social species, is play, which can be identified as action and interaction that appears to be
purposeless (Bekoff 1998), carried out within a world largely constructed by the participants.
Play usually involves the employment of functional behaviours in modified forms, and when
used among individuals it requires the negotiation of cooperative agreement (Bekoff 1998). Play
enables juveniles to learn to deal with their environment by testing features of it through action,
and to acquire the skills necessary to engage with conspecifics when rehearsing and elaborating
skills of social interaction. It is also self-stimulating fun in its own right (Panksepp and Burgdorf
2003). Play thus has many musical features and comparable individual and social efficacy. Hanuš
Papoušek (1996, pp. 46–47) describes infant and early childhood musical behaviours as forms of
play involving higher-level integrative processes that act to nurture ‘exploratory competence’.
Vocal play, in the form of babbling, does not appear to be unique to humans; Elowson et al.
(1998) note that this behaviour occurs in juvenile pygmy marmosets, and that response from
a caregiving adult is more likely when the juvenile is vocalizing, and suggest that pygmy
marmoset babbling has relevance to understanding the evolutionary processes of human vocal
development. It may be that an association between vocal play and a positive caregiving response
privileges the social function of these types of play.
We suggest that in an increasingly altricial lineage, the need to accommodate to population
structures with an increasing proportion of members with access to juvenile modes of cognition,
motivation and behaviour (other factors being equal) may have favoured the emergence of some-
thing like musicality as a means of assimilating the value of those juvenile modes of exploratory
cognition into the adult behavioural repertoire, while regulating its modes of expression. Given
that play is a particular feature of the behaviour of juveniles in social mammals, and that it is
likely to have positive survival value for members of those species who engage in it, group
behaviours that both enable and regulate play, co-opting its utility into the adult repertoire,
are likely to have some adaptive or exaptive value. Music can be interpreted as one of these
mechanisms, emerging under the selection pressures of the progressive extension and stage-
differentiation of the juvenile period in the later hominid lineage.
5.5 The archaeological record

Archaeological evidence is clear: musical behaviours have been a part of human life for many
millennia. Modern humans in Europe were manufacturing musical pipes from the bones of birds
at least 36,000 years ago, and the sophistication of these instruments exceeds that of many
medieval and contemporary examples of such pipes (Scothern 1992). It seems likely that when
modern humans arrived in Europe around 40,000 years ago, they had already developed instru-
mental musical behaviours; it is likely that instruments were in use far earlier, and that musical
behaviours that made use of the voice and body movements had a long history prior to the devel-
opment of musical artefacts.
From 30,000 years ago, however, there is a marked increase in the evidence for musical activities,
including rasps, percussion instruments, many more bone pipes, and in the evidence that rocks
and caves were exploited for their acoustic properties (Cross and Watson 2006; Morley 2003).
These musical activities seem to have been widespread, often occurring in what appear to be loci
of intense human activity, which includes the making of graphical art. The evidence—fragmentary
as it is—suggests that musical performance was a group activity, rather than one involving a
select few individuals. The differential preservation of bone over other organic materials is likely
to bias the record, and with the focus of archaeological research on Europe, the rest of the Old
World that was occupied by anatomically modern humans has been neglected. There is the
possibility that objects used for sound production have yet to be identified. Increasingly sophisticated
analysis (e.g., d’Errico et al. 2003) and methods of excavation, and experimental work on the
potential sound-producing properties of archaeological materials (cf. Cross et al. 2002) should
help to fill out the record of musical activities in prehistory. However, we do know enough to
assert that musical behaviours are extremely ancient, probably dating at least to the emergence of
behavioural complexity in anatomically modern Homo sapiens.
While a fully integrated capacity for musicality is evident in early modern humans, musicality
appears to be made up of a number of psychological capacities, including those for the produc-
tion and perception of complex sequences of sounds and actions, for social entrainment, and
for creatively engaging with patterns of sounds and actions—all manifestations of multiple
intentionalities. The palaeo–anatomical and archaeological records suggest that these different
capacities arose at different times in the hominid lineage that leads to modern humans (Morley
2002; 2003).
The evidence suggests that our nearest primate relatives have few capacities that could be
interpreted as musical. Chimpanzees and bonobos lack the phonational capacity for the produc-
tion of complex vocal signals, partly because of their very different physiques (Morley 2002), and
there is no evidence that either species can entrain to regular patterns of visual or sonic stimuli
(however, see Fitch 2006). A recent survey of systems of animal communication (Seyfarth and
Cheney 2003) concludes that even among primates, the interpretability of vocal signals by
conspecifics is generally bound so tightly to the awareness of present circumstances that they
cannot be regarded as referential. Calls that might be conceived of as conveying disembedded
information to conspecifics are better thought of as expressing an individual’s affective state,
without reference or intention to inform others. As the authors note, ‘In sum, a variety of results
argue that, in marked contrast to humans, nonhuman primates do not produce vocalizations in
response to their perception of another individual’s ignorance or need for information’ (p. 159).
It appears that although some non-human primates, notably gibbons, can produce complex and
long sequences of sound and action, a key element of musicality—the engagement with the
intentionalities of such sequences—is absent (Merker, Chapter 4, this volume).
The likelihood of significant continuities between the lifeways of other primates and of
australopithecines (currently the oldest known ancestor genus leading to modern humans)
suggests that no significant components of a human faculty for music emerged with this latter
group of species, although it might be hypothesized that the move to bipedalism laid some of the
foundations for a capacity for entrainment in rhythmic stepping and gesturing. Recent evolu-
tionary thinking (see Wood and Collard 1999) interprets the very early humans Homo habilis (and
possibly Homo rudolfensis) (from 2 million years before the present) as manifesting a high degree
of continuity with australopithecine lifeways and capacities; however, the archaeology associated
with the species shows significant changes in the evidence for toolmaking and the transmission
of traditions of tool manufacture. While H. habilis and H. rudolfensis remains are fragmentary
and their interpretation is debated, the manufacture and use of tools suggests that the species had
more muscularly developed hands, perhaps with a longer thumb, than did their predecessor
species, and a greater degree of refinement in the control of manual movement (Wilson 1998).
These capacities are likely to have allowed for the beginnings of finely controlled expressive
manual gesture, an intrinsic component of all modern human communicative systems.
With Homo ergaster and Homo erectus (from about 1.8 million years before the present), major
changes occurred; brain size reached around 1000 cc, and body size and configuration approxi-
mated those of modern humans. H. ergaster and H. erectus had more complex lifeways and toolkits
than their precursors, and a vast increase in geographical range. The capacity for the much-
enhanced control of phonation—conferred by a barrel-shaped chest, the enhanced articulatory
capacities of the vocal system, and the presence of an ear canal of modern proportions—suggests
that vocal sounds were increasingly significant for this species. This may indicate significant
changes in social life, perhaps marking the emergence of a rich vocal repertoire to replace other
forms of interpersonal interaction (in conformance with Dunbar’s [1992] ‘grooming-to-gossip’
model). The evidence also suggests that some foundational components of musicality were in
place, most likely expressed in the use of vocal sounds to articulate complex emotion states in the
regulation of social relations, and possibly to convey referential information.
It was not until the appearance of Homo heidelbergensis (c.700 to 500 kyr BP), however, that we
find the fully modern vocal tract, together with an auditory system that is maximally sensitive
to speech frequencies (Martinez et al. 2004). This coadaptation suggests that vocal sounds were
crucially significant for this species, more so than other environmental sounds. This can be
construed as a refinement of earlier H. ergaster capacities, which is supported by evidence for the
production and use of an expanded range of artefacts. This advance in creativity is likely to have
been manifested in the capacity to produce and perceive increasingly complex vocal sounds and
sequences, including behaviours that we might identify as singing.
Following the emergence of anatomically modern Homo sapiens, which dates to some
150 kyr BP, we ultimately find evidence for symbolic intelligence or ‘fully modern sapiens
behaviour’ (Henshilwood and Marean 2003), and unambiguous evidence of musical behaviours.
These behaviours are built on cognitive, physiological and behavioural foundations that emerged
in the preceding hominid species, as outlined above. At what point these behaviours can be
considered symbolic, in the sense of having the capacity to indicate meaning through an arbi-
trary coupling of sign and referent, is open to debate, but the capabilities probably emergent
in H. ergaster, and then developed in H. heidelbergensis, would have featured strong associations
between emotional content and vocal and physical gesture. Symbolic culture, in which signs enter
into a web of interrelationships that come to constitute a significant feature of the ecology of the
human mind (Chase 1999), emerged with modern Homo sapiens.
Thus, we suggest that the emergence and development of complex manual and vocal gesture,
under the conditions of greater social complexity associated with H. ergaster and H. erectus,
constituted the foundations of what would come to be melodic vocalization, i.e., singing. It seems
likely that the production and perception of complex sequences of sounds with the voice was
very important by the time of H. heidelbergensis, and that the social roles of such vocalizations,
including the potential to rehearse and refine social interactions, were built on subsequently, to
become a part of music and language in the fully symbolic culture that emerged in modern
humans.
5.6 Conclusions
The evolutionary story can be read as indicating that a version of Brown’s (2000a) musilanguage
may have emerged with H. ergaster, perhaps restricted to the exchange of social information, with
a further development of a capacity for more general reference with H. heidelbergensis. It seems
likely that the divergence between music and language arose first in modern humans, with
language emerging to fulfil communicative, ostensive and propositional functions with immediate
efficacy. Music, operating over longer timescales, emerged to sustain (and perhaps also to foster)
the capacity to manage social interactions, while providing a matrix for the integration of infor-
mation across domains of human experience. We propose that music and language enabled the
emergence of modern human social and individual cognitive flexibility (Cross 1999). We regard
both music and language as subcomponents of the human communicative toolkit—as two
complementary mechanisms for the achievement of productivity in human interaction though
working over different timescales and in different ways.
While the selection pressures for the emergence of language are widely regarded as self-evident
(Pinker 1994), those for music appear less well understood, perhaps because the effects of music
appear less immediate and direct, or obvious, than do those of language (Mithen 2005).
However, we suggest that a degree of adaptation to changes in the rate of individual maturation
evident in the later hominid lineage may be a factor that led to the human capacity for musicality,
distinct from, and perhaps foundational in respect of, language (Cross 2003b).
Musical capacities are built on fundamentally important social and physiological mechanisms
and, at an essential level, are processed as such. Music uses capacities crucial in situations of
social complexity; the vocal, facial and interactive foundations of these capabilities are evident in
other higher primates, and such capacities would have become increasingly important and
sophisticated as group size and complexity increased. Vocal emotional expression, interaction,
and sensitivity to others’ emotional state would have been selectively important abilities; individ-
uals in which these capabilities were more developed would have been selectively favoured.
Fundamentally integrated into the planning and control of complex sequences of vocalizations,
and related to the prosodic rhythm inherent in such sequences, is rhythmic motor coordination.
The motor system is primed in the instigation of such vocal behaviours, and corporeal gesture is
consequently incorporated into the execution of the vocal behaviour.
In terms of their potential selective advantages, developed musical behaviours could confer an
advantage on individuals in terms of sexual selection, due to their foundations in the capacities
to communicate emotionally and effectively, to empathize, to bond and elicit loyalty.
Musical abilities have the potential to be a proxy for an individual’s likelihood of having strong
social networks and loyalties, and of contributing to a group. Musical behaviour also has the
potential to be a mechanism for stimulating and maintaining those networks and loyalties;
because of the stimulation of shared emotional experience as a consequence of participation
in musical activities, it can engender strong feelings of empathic association and group
membership. Musical or protomusical behaviour has the potential to make use of several cogni-
tive capacities at once, relying on the integration and control of biological, psychological, social
and physical systems; it gives the opportunity to practise and develop these integrated skills in a
context of limited risk.
Full (specialized, as opposed to proto-) musical behaviours, with foundations
in social interaction, emotional expression, and the fine control and planning of corporeal and
vocal musculature, lend themselves extremely well to integrating important cognitive skills. The
execution of musical activities could become increasingly important and beneficial on both indi-
vidual and group levels, with increasing social complexity within and between groups. Because
music production and perception is processed by the brain in ways that are complex and related
to interpersonal interaction and the formation of social bonds, it stimulates many associated
functions. It seems that musical participation, even without lyrics or symbolic associations, can
act on the brain in ways that are appealing to humans, because of its vicarious stimulation of
fundamentally important human interactive capacities.
While this model for the emergence of musicality appears to fit well with the evidence available
from ethnographic, cognitive, comparative, palaeo-anatomical and archaeological sources, other
ecologically observable behaviours suggest that further facets of the evolutionary story require
consideration. The investigation of the origins, emergence and nature of musical behaviours in
humans is in its early stages, and has more to reveal. It concerns an element of human behaviour
that, in contrast with Pinker’s (1997) opinion, the vast majority of people would miss very much
if they were suddenly bereft of it. It would be impossible to do away with music without removing
many of the abilities of social cognition that are fundamental to being human.
References
Arom S (1991). African polyphony and polyrhythm. Cambridge University Press, Cambridge.
Auer P, Couper-Kuhlen E and Muller K (1999). Language in time: The rhythm and tempo of spoken
language. Oxford University Press, Oxford.
Baily J (1985). Music structure and human movement. In P Howell, I Cross & R West, eds, Musical structure
and cognition, pp. 237–258. Academic Press, London.
Bekoff M (1998). Playing with play: What can we learn about cognition, negotiation and evolution?
In DD Cummins & C Allen, eds, The evolution of mind, pp. 162–182. Oxford University Press, Oxford.
Benzon W (2001). Beethoven’s anvil: Music, mind and culture. Basic Books, New York.
Blacking J (1961). Patterns of Nsenga kalimba music. African Music, 2(4), 3–20.
Blacking J (1967). Venda children’s songs: A study in ethnomusicological analysis. Witwatersrand University
Press, Johannesburg.
Blacking J (1969). The value of music in human experience. Yearbook of the International Folk Music
Council, 1, 33–71.
Blacking J (1976). How musical is man? Faber, London.
Blacking J (1995). Music, culture and experience. University of Chicago Press, London.
Bogin B (1999). Patterns of human growth, 2nd edn. Cambridge University Press, Cambridge.
Bowles S and Gintis H (1998). The moral economy of community: structured populations and the
evolution of pro-social norms. Evolution and Human Behaviour, 19, 3–25.
Brown S (2000a). The ‘musilanguage’ model of music evolution. In N Wallin, B Merker and S Brown, eds,
Brown S (2000b). Evolutionary models of music: From sexual selection to group selection.
In F Tonneau & NS Thompson, eds, Perspectives in ethology 13: Behavior, evolution and culture,
pp. 231–281. Plenum Publishers, New York.
Chase P (1999). Symbolism as reference and symbolism as culture. In C Knight, R Dunbar and C Power,
eds, The evolution of culture: An interdisciplinary view, pp. 34–49. Edinburgh University Press,
Edinburgh.
Clayton M, Sager R and Will U (2004). In time with the music: The concept of entrainment and its
significance for ethnomusicology. ESEM Counterpoint, 1, 1–82.
Cross I (1999). Is music the most important thing we ever did? Music, development and evolution.
In S W Yi, ed., Music, mind and science, pp. 10–39. Seoul National University Press, Seoul.
Cross I (2003a). Music and biocultural evolution. In M Clayton, T Herbert and R Middleton, eds,
The cultural study of music: A critical introduction, pp. 19–30. Routledge, London.
Cross I (2003b). Music and evolution: causes and consequences. Contemporary Music Review, 22(3), 79–89.
Cross I (2003c). Music, cognition, culture and evolution. In I Peretz and R Zatorre, eds, The cognitive
neuroscience of music, pp. 42–56. Oxford University Press, Oxford.
Cross I (2005). Music and meaning, ambiguity and evolution. In D Miell, R MacDonald and D Hargreaves,
eds, Musical Communication, pp. 27–43. Oxford University Press, Oxford.
Cross I, Zubrow E and Cowan F (2002). Musical behaviours and the archaeological record: a preliminary
study. In J Mathieu, ed., Experimental archaeology: Replicating past objects, behaviors and processes,
pp. 25–34. British Archaeological Reports International Series 1035. Archaeopress, Oxford.
Cross I and Watson A (2006). Acoustics and the human experience of socially organised sound. In C Scarre
and G Lawson, eds, Acoustics, space and intentionality: Identifying intentionality in the ancient use of
acoustic spaces and structures, pp. 107–116. McDonald Institute for Archaeological Research, Cambridge.
Cummins DD (1998). Social norms and other minds: the evolutionary roots of higher cognition.
In DD Cummins and C Allen, eds, The evolution of mind, pp. 30–50. Oxford University Press, Oxford.
D’Andrade R (1995). The development of cognitive anthropology. Cambridge University Press, Cambridge.
D’Errico F, Henshilwood C, Lawson G, et al. (2003). Archaeological evidence for the emergence of
language, symbolism, and music – an alternative multidisciplinary perspective. Journal of World
Prehistory, 17(1), 1–70.
Darwin C (1871). The descent of man and selection in relation to sex. Murray, London.
Dissanayake E (2000). Antecedents of the temporal arts in early mother–infant interactions. In N Wallin,
Donaldson M (1992). Human minds: An exploration. Allen Lane/Penguin Books, London.
Drake C, Jones MR and Baruch C (2000). The development of rhythmic attending in auditory sequences:
attunement, referent period, focal attending. Cognition, 77, 251–288.
Dunbar R (1992). Neocortex size as a constraint on group size in primates. Journal of Human Evolution,
22, 469–493.
Elowson AM, Snowdon CT and Lazaro-Perea C (1998). ‘Babbling’ and social context in infant monkeys:
parallels to human infants. Trends in Cognitive Sciences, 2, 31–37.
Fitch W Tecumseh (2006). The biology and evolution of music: a comparative perspective. Cognition,
100(1), 173–215.
Foley RA (1995). Humans before humanity. Blackwell, Oxford.
Freeman WJ (1995). Societies of brains. A study in the neurobiology of love and hate. Erlbaum, Mahwah, NJ.
Frege G (1952). On sense and reference. In P Geach and M Black, eds, Translations from the Philosophical
Writings of Gottlob Frege. Blackwell, Oxford.
Hagen EH and Bryant GA (2003). Music and dance as a coalition signaling system. Human Nature,
14(1), 21–51.
Henshilwood CS and Marean CW (2003). The origin of modern human behavior: critique of the models
and their test implications. Current Anthropology, 44(5), 627–651.
Huron D (2001). Is music an evolutionary adaptation? Annals of the New York Academy of Sciences, 930, 43–61.
Jacobs A (1972). New dictionary of music, 2nd edn. Penguin Books, Harmondsworth.
Janata P and Grafton ST (2003). Swinging in the brain: Shared neural substrates for behaviors related to
sequencing and music. Nature Neuroscience, 6(7), 682–687.
Joffe TH (1997). Social pressures have selected for an extended juvenile period in primates. Journal of
Human Evolution, 32(6), 593–605.
Jones MR and Yee W (1993). Attending to auditory events: The role of temporal organization.
In S McAdams and E Bigand, eds, Thinking in sound, pp. 69–112. Oxford University Press, Oxford.
Kalmus A and Fry DB (1980). On tune deafness (dysmelodia): Frequency, development, genetics and
musical background. Annals of Human Genetics, 43(4), 369–382.
Koelsch S, Kasper E, Sammler D, Schulze K, Gunter T and Friederici AD (2004). Music, language and
meaning: brain signatures of semantic processing. Nature Neuroscience, 7(3), 302–307.
Langer S (1942). Philosophy in a new key. Harvard University Press, Cambridge, MA.
Lavy M (2001). Emotion and the experience of listening to music: A framework for empirical research,
Ph.D. thesis. University of Cambridge. Available at http://www.scribblin.gs
Magrini T (2000). From music-makers to virtual singers: New musics and puzzled scholars. In D Greer, ed.
Musicology & sister disciplines, pp. 320–330. Oxford University Press, Oxford.
Martinez I, Rosa M, Arsuaga J-L et al. (2004). Auditory capacities in Middle Pleistocene humans from the
Sierra de Atapuerca in Spain. Proceedings of the National Academy of Sciences, 101(27), 9976–9981.
Meyer LB (1956). Emotion and meaning in music. University of Chicago Press, London.
Miller G (2000). Evolution of human music through sexual selection. In N Wallin, B Merker and S Brown,
Miller G (2001). The mating mind: How sexual choice shaped the evolution of human nature. Vintage/Ebury,
London.
Mithen S (2005). The singing Neanderthals: The origins of music, language, mind and body.
Weidenfeld & Nicolson, London.
Archaeological Journal, 12(2), 195–216.
Morley I (2003). The evolutionary origins and archaeology of music: An investigation into the prehistory of
human musical capacities and behaviours. Ph.D. thesis. University of Cambridge, Cambridge. Darwin
College Research Reports, DCRR-002, available online at www.dar.cam.ac.uk/dcrr/
Nelson S (2002). Melodic improvisation on a twelve-bar blues model: an investigation of physical and historical
aspects, and their contribution to performance. Ph.D. thesis. City University London, Department of
Music, London.
Oatley K and Johnson-Laird PN (1998). The communicative theory of the emotions: Empirical tests,
mental models and implications for social interactions. In JM Jenkins, K Oatley and NL Stein, eds,
Human emotions: A reader, pp. 84–97. Blackwell, Oxford.
Panksepp J and Burgdorf J (2003). ‘Laughing’ rats and the evolutionary antecedents of human joy?
Physiology and Behavior, 79, 533–547.
Papoušek H (1996). Musicality in infancy research: Biological and cultural origins of early musicality.
In I Deliège and JA Sloboda, eds, Musical beginnings, pp. 37–55. Oxford University Press, Oxford.
Papoušek M (1996). Intuitive parenting: A hidden source of musical stimulation in infancy. In I Deliège
and JA Sloboda, eds, Musical beginnings, pp. 88–112. Oxford University Press, Oxford.
Peretz I (2003). Brain specialization for music: New evidence from congenital amusia. In I Peretz and
R Zatorre, eds, The cognitive neuroscience of music, pp. 192–203. Oxford University Press, Oxford.
Peretz I, Champod AS and Hyde K (2003). Varieties of musical disorders: The Montréal Battery of
Evaluation of Amusia. Annals of the New York Academy of Sciences: The Neurosciences and Music,
999, 58–75.
Pinker S (1994). The language instinct. Allen Lane, London.
Pinker S (1997). How the mind works. Allen Lane, London.
Plotkin H (1997). Evolution in mind. Allen Lane, London.
Roederer JG (1984). The search for a survival value of music. Music Perception, 1, 350–356.
Rogoff B, Paradise R, Arauz RM, Correa-Chávez M and Angelillo C (2003). First-hand learning through
Schellenberg EG (2003). Does exposure to music have beneficial side effects? In I Peretz and R Zatorre, eds,
The cognitive neuroscience of music, pp. 430–448. Oxford University Press, Oxford.
Schellenberg EG (2004). Music lessons enhance IQ. Psychological Science, 15(8), 511–514.
Scherer KR and Zentner MR (2001). Emotional effects of music: Production rules. In P Juslin and JA Sloboda,
eds, Music and emotion: theory and research, pp. 361–392. Oxford University Press, Oxford.
Scherer KR (1991). Emotion expression in speech and music. In J Sundberg, L Nord and R Carlson, eds,
Music, Language, Speech and Brain, pp. 146–156. MacMillan Press, Basingstoke.
Scothern PMT (1992). The music-archaeology of the palaeolithic within its cultural setting. Ph.D. thesis.
University of Cambridge, Cambridge.
Seyfarth RM and Cheney DL (2003). Signalers and receivers in animal communication. Annual Review of
Psychology, 54, 145–173.
Shennan S (2002). Genes, memes and human history. Thames and Hudson, London.
Shore B (1996). Culture in mind: Cognition, culture, and the problem of meaning. Oxford University Press,
Oxford.
Sloboda JA (1985). The musical mind. Oxford University Press, Oxford.
Spelke E (1999). Infant cognition. In RA Wilson and FC Keil, eds, The MIT encyclopedia of cognitive sciences,
pp. 402–404. MIT Press, Cambridge, MA.
Sperber D and Wilson D (1986). Relevance: Communication and cognition. Blackwell, Oxford.
Stephan KM, Thaut MH, Wunderlich G et al. (2002). Conscious and subconscious sensorimotor
synchronization – prefrontal cortex and the influence of awareness. NeuroImage, 15, 345–352.
Stobart HF (1996). Tara and Q’iwa: Worlds of sound and meaning. In MP Baumann, ed., Cosmología y
música en los Andes (Music and cosmology in the Andes), pp. 67–81. Biblioteca Iberoamericana and
Vervuert Verlag, Madrid and Frankfurt.
Stobart HF and Cross I (2000). The Andean anacrusis? Rhythmic structure and perception in Easter songs
of Northern Potosí, Bolivia. British Journal of Ethnomusicology, 9(2), 63–94.
Sykes JB (1983). Concise Oxford dictionary, 7th edn. Oxford University Press, Oxford.
Thaut MH (2005). Rhythm, human temporality, and brain function. In D Miell, R MacDonald and
D Hargreaves, eds, Musical Communication, pp. 171–191. Oxford University Press, Oxford.
Trevarthen C (1979). Communication and cooperation in early infancy. A description of primary
intersubjectivity. In M Bullowa, ed., Before speech: The beginning of human communication,
pp. 321–347. Cambridge University Press, London.
Trevarthen C (1980). The foundations of intersubjectivity: Development of interpersonal and cooperative
understanding in infants. In D Olson, ed., The social foundation of language and thought, pp. 316–342.
Norton, New York.
Trevarthen C (1999). Musicality and the intrinsic motive pulse: Evidence from human psychobiology and
Wallaschek R (1893). Primitive music. Longmans, Green & Co., London.
Waterman CA (1991). Uneven development of African ethnomusicology. In B Nettl and PV Bohlman, eds,
Comparative musicology and anthropology of music. University of Chicago Press, London.
Wilson FR (1998). The hand: How its use shapes the brain, language, and human culture. Pantheon Books,
New York.
Wood B and Collard M (1999). The human genus. Science, 284, 65–71.
Yang M (1994). On the hua’er songs of north-western China. Yearbook for Traditional Music, 26, 100–116.
Chapter 6
Tau in musical expression

David N. Lee and Benjaman Schögler
6.1 Musical expression

What is it that makes a great actor, musician, prima ballerina or performance artist? Many
and varied talents are at work, but in every case it is individuals’ ability to communicate by
controlled body movements that distinguishes their art. Creative artistic expression often excites
the communication of complex ideas and emotions through a manipulation of ‘narrative’ in
different modalities simultaneously, thereby stimulating affect (Malloch 2005). The influence
that dynamic art has over us can be hard to describe, but when it is successful, viewers or listeners
are moved as they experience the non-verbal ‘message’ of the picture, film or performance.
Something in the pattern of flow in the movement of music, dance or gesture communicates
directly and elicits emotion and, sometimes, sympathetic movement. What is this something?
How is it translated and shared between us?
Our ability to respond with matching feelings to expressive gestures experienced in different
modalities lies at the heart of this mystery. Interest in analysing musical expression has a long
history (Clarke 1988, 1999; Clynes 1973; Dogantan 2002; Krumhansl 1996, 2002). In addition to
helping us articulate an understanding of music, knowledge of how musical expression is
achieved has crucial implications for brain science (Molinari et al. 2003; Panksepp and Bernatzky
2002; Zatorre and Krumhansl 2002), and observations on brain activity related to the awareness
and the performance of music are yielding remarkable insights (Buccino et al. 2004; Keysers et al.
2003; Peretz and Zatorre 2005; Schlaug 2001).
Recent research has focused on the effects of separately measured expressive qualities or
dimensions in music (pitch, rhythm, timing, timbre, tension) and has made comparisons with
the corresponding motor actions in performers and the perceptions of listeners (Baily 1985;
Friberg et al. 2000; Dahl and Friberg 2003; Mitchell and Gallaher 2001; Shove and Repp 1995;
Todd 1994). Other research concerns the design of interactive systems where, for example,
gestures or dance are converted into sound (Camurri et al. 2000, 2003). However, as Camurri
et al. point out, there is a need for a more detailed understanding of all bodily movements
involved in communicative expression—particularly, we argue, for understanding the pattern of
flow in expressive movement. In this chapter we concentrate on expression in the sounds of
music, but our results may also help the understanding of expression across other art forms that
employ speech and gesture (Donald 1999; MacNeilage 1999; Dissanayake, Chapter 2, and Brandt,
Chapter 3, this volume).
6.2 Toward a science of expressive movement

All gestures and intentional vocalisations are ultimately actions of the musculature.
Donald (1999, p. 41)
Analysis of dynamic emotional exchanges in human activity—particularly in the dynamic arts of
drama, music and dance—can provide science with a wealth of information on the psychological
processes of perception and cognition. Ultimately, perception and cognition have inherent
relationships to the generation of motivated psychological time, the time of moving and experi-
encing that is regulated emotionally (Donald 1999; Schögler 1999), which is the key to social
communication in all animals (MacLean 1990; Panksepp and Bernatzky 2002).
Artists, in their myriad incarnations, act as mediators for emotion and aesthetic awareness,
manipulating patterns of expression and their experience in different modalities. For thousands
of years, dancers and musicians have sought to engage others and tell stories through the
movement of their bodies, creating and responding to the ‘emotional narrative’ in music
(Hanna 1979). The exchange between musician and dancer has invited our enquiry by methods
that record in real time the communication between performers, as illustrated in Figure 6.1.
We attempt to answer such questions as: how does an expression or gesture pass from the mind
of a musician into the body of a dancer through the medium of sound?
The rich expressive information of music has long been the focus of research into the principles
of artistic creativity and aesthetics (Camurri et al. 2003; Dogantan 2002; Hagendoorn 2004;
Iyer 2004; Stevens et al. 2003), and such phenomena have attracted the attention of psychologists
Fig. 6.1 Cyclic flow of musical expression through movement, sound and sight. The expression
passes from the player (a) through the medium of sound (b) to the dancer (c) and thence back to
the player through vision.
with diverse research interests (e.g., Gibson 1966; Schmidt, Carello and Turvey 1990; Trevarthen
1999). It is widely agreed now that information about the creative act of expression—engaging
physiological, muscular, neural and behavioural systems—is of central importance to our under-
standing of the arts, and more fundamentally, to psychology and all forms of communication
(e.g., Panksepp and Bernatzky 2002). Yet it is also evident that we lack a coherent theory of
expressive motor control to integrate information from these fields, one that is sufficiently sensi-
tive to their multidimensional and temporal nature. We need to examine music in terms of the
processes by which our bodies create it.
Music, as it is usually understood, is sound created by human action, and it is the precise
nature of the flow or regulated change of body movement that determines musical expression
(although there are exceptions in some forms of electronic music). Musical sound may be
sustained momentarily in a discrete tone or chord. However, even in these apparently static
elements, the sound is restive; tones performed by the voice or in playing an instrument always
fluctuate in loudness, pitch and timbre, and this physiologically regulated variation is employed
to great effect when a performer is moving between tones expressively. Musical expression,
we shall argue, is the manner in which the performed sound changes within and between tones
(see Brandt, Chapter 3, this volume, on the role of expression in communication of meaning).
Performers create this expression out of their own musical sense of emotional control in
moving (Scholes 1960; Clynes 1973). Singers and instrumentalists achieve musical expression by
the way they move—by how they modulate their vocal apparatus, how they draw the bow across
the strings, how they depress the piano keys. Finely controlled movements, in interaction with the
physics of the body and musical instrument (if one is used), sculpt musical expression. It is a
remarkable fact that very similar musical expression can be achieved using quite different move-
ments on different instruments (for example, bowing versus pressing a key), which, in turn,
produce quite different sound qualities (violin versus piano). Furthermore, a person can pick up
the expression in the music and ‘mirror’ that same expression in gesture or dance.
We propose that musical expression is embodied in certain measurable expressive variables in
the flow of movement and sound that are invariant across different means of enactment, as
shown in Figure 6.1. Our quest in this chapter is to identify, in movement and sound, some of
these common expressive variables generated in the brain of the performer and in the controlled
execution of the movement.
6.3 General tau theory

Our ideas are rooted in theories of intrinsic and perceptual guidance of movement of Bernstein
(1967), Gibson (1966), and Lashley (1951) and, in particular, in General Tau Theory (Lee 1998,
2005), which was developed from these theories. The theory is supported by experiments with
infants and adults, with many animal species, and spanning a wide range of skills (Lee, Craig and
Grealy 1999; Lee 2005). First, we will outline the theory. Then we will report experiments on
musical performance that test the theory and demonstrate the analytical methods we use to
explore expression in performance.1
1 The experiments reported in this chapter are part of the ongoing research of the Perception-in-Action
Laboratories, University of Edinburgh. The experiments on singing and bass playing were carried out,
under Lee’s and Schögler’s supervision, by R Berger, P Biggs, B Harvey, J Scriven, and E Ward for their
Honours Dissertations in Psychology 2004. The singing experiments were carried out in the Speech
Science Research Centre, Queen Margaret University College, Edinburgh with the help of N Hewlett and
colleagues. We are also indebted to Professor Murray Campbell for blowing his trombone.
Musical performance requires the generation of movements, and an awareness of the sound
produced by those movements. Consider someone playing a tune from memory on the piano.
The activity involves both intrinsic guidance, knowing how to move, and perceptual guidance
through hearing and other senses. The tune and a prescription of how it is to be played come
from within the pianist, but information through hearing and/or touch and/or vision is also
necessary to ensure that the fingers follow the pianist’s musical intent.
6.4 Motion/perception gaps

A basic concept in general tau theory identifies the gap between a current state of the body or
a part of the body, or of awareness, and a goal state. Any purposeful movement involves control-
ling the closure of many gaps in different dimensions. For example, a pianist must control, with
finger, hand and arm movements, the closure of the ‘distance gap’ between the rest position of the
key and the aimed-for goal position (the hammer-release point of the piano mechanism—after
this point, the action is purely mechanical). In singing a melody, a singer controls the closure of
gaps in fundamental frequency between successive tones (f0 gaps) by regulating the expulsion
of air and changing the actions of muscles in the chest, throat, vocal cords and mouth. Violin
playing requires, among other things, the control of the closure of the distance gap between
the initial position of the bow and its stopping point, to produce a desired sound from the
instrument. Playing or singing a crescendo requires the control of the closure of an intensity
gap to the goal intensity. Thus, gaps can be perceived and measured in different dimensions—
such as distance, frequency, intensity. A musician must control gaps in several dimensions
simultaneously and rapidly, as when skilfully playing a violin.
6.5 Tau of a gap

From an evolutionary perspective, it might be expected that, in the interests of reliability and
efficiency, the perceptuo-motor systems of animals would measure all gaps in a common
dimension. Identifying such a core measure in human movement would help explain how
musicians can produce beautiful effects with gestures that simultaneously affect the different
dimensions of pitch, duration, intensity and timbre of perceived sound. General tau theory
posits that the common dimension for measuring all gaps is time. The theory proposes that, in
principle, to control the closure of a gap, an animal requires only temporal information about
that gap, and that other information about the size and speed of closure of the gap is not needed.
The special temporal measure that provides sufficient information to guide the closure of a gap is
the time-to-closure of the gap at the current closure rate. This variable is named ‘tau’ after the
Greek letter τ (Lee 1976).
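As a rough illustration of this definition, the sketch below estimates tau for a sampled gap. It assumes the convention that tau is the gap divided by its rate of change, so that tau comes out negative while the gap is closing and its magnitude is the time-to-closure at the current closure rate. The function name `tau_of_gap`, the sampling interval and the central-difference estimate of the rate are illustrative assumptions, not a method taken from the experiments reported here.

```python
def tau_of_gap(gap, dt):
    """Estimate tau for a gap sampled every dt seconds.

    gap: sequence of gap sizes in any one dimension (distance, frequency,
         intensity). Assumed convention: tau = gap / (rate of change of gap),
         so tau is negative while a positive gap is shrinking.
    Returns tau at each interior sample (first and last samples are skipped).
    """
    taus = []
    for i in range(1, len(gap) - 1):
        rate = (gap[i + 1] - gap[i - 1]) / (2.0 * dt)  # central difference
        taus.append(gap[i] / rate if rate != 0 else float("inf"))
    return taus


# Hypothetical example: a 10 cm gap closing at a constant 5 cm/s.
dt = 0.1
gap = [10.0 - 5.0 * (i * dt) for i in range(11)]
taus = tau_of_gap(gap, dt)
```

Only the ratio of gap size to closure rate enters the calculation, so the same function applies unchanged whether the gap is measured in centimetres, hertz or decibels; in every case the result is a time.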
6.6 Tau-coupling gaps

Returning to the example of piano playing, general tau theory proposes that when a key is
pressed, the tau of the gap between the key and its goal position is sensed by the pianist, and that
this sensory feedback is compared in the pianist’s brain with the ‘ideal’ tau of a motion gap
generated by the brain to prescribe the desired pattern of movement of the key. This intrinsic tau
is referred to as τG, or tauG, which is realized in the brain as a patterned flow of electrical energy
through an assembly of neurons (Lee 2005). TauG is defined by a particular mathematical
function. By sending appropriate tau information to the muscles that control the fingers, the
brain regulates τX (the tau of the gap, X, between the key and its goal position) in an attempt to
keep the ratio, τX/τG, constant during the movement at a value, kX,G, set by the brain. In other
words, the brain attempts to maintain the relationship:
(1) τX = kX,G τG
This is an example of tau-coupling, where kX,G is the ‘coupling factor’. We refer to equation 1
as the tauG-guidance equation, and gap-closing movements that follow this equation as
tauG-guided movements. For a gap that starts at rest and ends at rest, tauG is specified by the
equation (see footnote 2):
(2) τG = ½(t − TG²/t)
where TG is the duration of the tauG-guide, and time, t, runs from zero at the start of closure
of the gap (Lee 1998). Empirical evidence for tauG-guidance of this form is summarized in
Lee (2005).
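The derivation behind equation 2, which footnote 2 describes in words, can be sketched as follows. We assume that the gap, G, is measured as a positive quantity that shrinks to zero, and we introduce a constant acceleration, a, purely for the derivation; it cancels in the final step.

```latex
% Gap G closes from rest with constant acceleration a over the duration T_G.
\begin{align}
G(t) &= \tfrac{1}{2}\, a \left( T_G^{2} - t^{2} \right), \qquad 0 \le t \le T_G \\
\dot{G}(t) &= -a\, t \\
\tau_G(t) &= \frac{G(t)}{\dot{G}(t)}
           = -\frac{T_G^{2} - t^{2}}{2t}
           = \frac{1}{2} \left( t - \frac{T_G^{2}}{t} \right)
\end{align}
```

Under this convention τG is negative throughout the movement and rises to zero at t = TG, the moment of closure.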
TauG-guidance is a central concept in general tau theory. The tauG-guidance or tau-
coupling equation (1) prescribes the pattern of flow of a tauG-guided movement. To achieve this
controlled movement, a person modulates their movement so as to keep the two streams of tau
information (τX, the tau of the gap under control, as registered by that person’s perceptual
systems, and τG, the tauG-guide, as generated in the brain) in the constant ratio, kX,G, prescribed
by equation (1). Guiding the movement in this way allows a person to control the manner in
which they close a gap (such as fast, slow, jerky, smooth, with harsh acceleration or deceleration).
For example, the higher the value of kX,G the more gradually the movement starts and the more
abruptly it ends (Lee 2005): i.e., the more precipitous the movement. TauG-guidance also enables
the person to time the closure of a gap (e.g., when a tone is struck, plucked, bowed or strummed)
because the duration of a tauG-guide is set by the nervous system, and the gap reaches closure as
tauG reaches its end. In short, general tau theory can help us interpret how musicians play the
right tone in the right way at the right time, and that should enable us to measure expression in
performance.
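The effect of the coupling factor on the form of a movement can be illustrated with a short simulation. The sketch below assumes perfect coupling and uses the closed form X(t) = AX(1 − (t/TG)²)^(1/kX,G), which we obtained by integrating equation 1 with equation 2; it is our derivation rather than a formula given in this chapter, and the function names and parameter values are hypothetical.

```python
def tau_g(t: float, t_g: float) -> float:
    """Equation 2: tauG at time t (0 < t <= t_g) for a guide of duration t_g."""
    return 0.5 * (t - t_g ** 2 / t)


def guided_gap(t: float, a_x: float, t_g: float, k: float) -> float:
    """Gap size at time t when tau_X = k * tau_G holds exactly (assumed closed form)."""
    remaining = max(0.0, 1.0 - (t / t_g) ** 2)  # clamp tiny negative rounding errors
    return a_x * remaining ** (1.0 / k)


def peak_speed_fraction(k: float, a_x: float = 1.0, t_g: float = 1.0, n: int = 2000) -> float:
    """Fraction of the guide duration at which closure speed peaks (central differences)."""
    dt = t_g / n
    best_i, best_speed = 1, 0.0
    for i in range(1, n):
        speed = abs(guided_gap((i + 1) * dt, a_x, t_g, k)
                    - guided_gap((i - 1) * dt, a_x, t_g, k)) / (2.0 * dt)
        if speed > best_speed:
            best_i, best_speed = i, speed
    return best_i / n


# Numerical check that the closed form obeys equation 1 at mid-movement (k = 0.5).
t, h = 0.5, 1e-6
x_dot = (guided_gap(t + h, 1.0, 1.0, 0.5) - guided_gap(t - h, 1.0, 1.0, 0.5)) / (2.0 * h)
ratio = (guided_gap(t, 1.0, 1.0, 0.5) / x_dot) / tau_g(t, 1.0)
print(f"tau_X / tau_G at mid-movement: {ratio:.3f}")

# The speed peak moves later in the movement as the coupling factor increases.
peaks = {k: peak_speed_fraction(k) for k in (0.3, 0.5, 0.7)}
for k, fraction in peaks.items():
    print(f"k = {k}: closure speed peaks at {fraction:.2f} of the movement duration")
```

In this sketch a higher coupling factor shifts the speed peak later, which is one way of seeing why such movements start more gradually and end more abruptly.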
6.7 Tau in the nervous system

If movements are intrinsically guided by tauG, there must be tauG information coursing through
the nervous system. This nervous information is likely to be some (mathematical) function of
‘neural power’ (the rate of flow of electrical energy through ensembles of neurons, either as
trains of electrical pulses or as action potentials), since the nervous system functions by
modulating neural (that is, electrical) power. However, tauG could not correspond to neural power as
such, because the dimensions do not match: tauG is measured in time units, whereas neural
power is measured in power units, i.e., the rate of change of energy over time. But tauG could be
encapsulated neurally as the tau of a neural power gap. The hypothesis was tested by analysing
neural power (spike-rate) data collected from the motor cortex and parietal cortex Area 5
of monkeys when they were carrying out purposeful reaching movements (Lee et al. submitted).
In both regions of the cortex, a neural power gap was found whose ‘tau melody’ (the temporal
pattern of tau) was proportional to the tauG melody and to the tau melody of the gap between
the monkey’s hand and the target as it reached out. In the motor cortex, the neural-tau melody
coincided with the hand-movement-tau melody, indicating that it was guiding the movement.
In sensory parietal cortex Area 5, the neural-tau melody followed the movement-tau melody,
indicating that it was monitoring the movement.
2 Equation (2) for τG corresponds to the tau of a gap, G, that closes from rest with constant acceleration
toward its goal. The formula for τG is derived from Newton’s equations of motion. Constant acceleration
of closure of G is hypothesized because constant acceleration is a simple and common form of regular
motion to which animals are exposed (e.g., falling under gravity) and so might have been assimilated into
their nervous systems as a base form of gap closure onto which other forms of gap closure could be built.
6.8 TauG in musical expression

To control the right tone at the right time in the right way, there are, in addition to kX,G, two other
parameters of a tauG-guided movement that determine the dynamic form or expression of the
closure of a gap, X. These are AX, the initial amplitude of the gap, and TG, the duration of the tauG-
guide. The amplitude, AX, and duration, TG, of a tauG-guided movement have different impacts on
the sound produced by a musician, depending on the instrument they are playing and the style of
their playing. For example, due to the different mechanical properties of the instruments, the
amplitude of foot movement has a very different end result in playing the piano (using the sustain
pedal or damping action) compared with using the kick-drum pedal or high hat in percussion.
In the art of composition, a variety of qualities corresponding to metre, rhythm, melody and
harmony—all crucial components for communicating an artistic image or emotion in music—
must be indicated in the score. In performing or improvising music, all such aspects are
expressed through a musician’s interpretation of the piece, in short, by the way they move to
create the sound. Thus, we argue that though metre, rhythm, melody and harmony are the
fundamental building blocks of composition, musicians’ movements are the source of expression
in performance. Moreover, the conventional building blocks of the musical craft would not exist
were it not for the body movements they represent.
A useful example of how a performer creates musical expression can be seen in jazz, particularly
in its early years. Jazz musicians first used current and popular tunes, or compositions based
on them, to communicate their expressive artistic feelings and images. They would take a
composition as a vehicle for their expression, playing it in their own unique, personal way. The
figure of the tune would still be recognizable, as, say, ‘My favourite things’, but when played by
John Coltrane, the tune would become the voice of a new genre of music. The performer’s own
expressions defined the heard artistic creation and its message for a listener; the score or
composition was a vehicle for that communication.
6.9 Hypotheses
In applying general tau theory to explore musical expression, our working hypothesis is that the
following will occur.
1 Musicians will tauG-guide the closure of gaps, X, to create sounds (following equation 1),
and will regulate the values of the tau-coupling factor, kX,G, the initial amplitude of the gap,
AX, and the duration, TG, of the tauG-guide, to convey expression.
2 These movements generate tauG-guided sounds with related, though not necessarily identical,
kX,G, AX, TG values.
3 The same kX,G, AX, TG values are present in the sound and can be perceived by a listener, who
can then move to the music, making tauG-guided movements with related kX,G, AX, TG values.
In short, we hypothesize that kX,G, AX, TG are expressive variables used in musical interaction
such as illustrated in Figure 6.1. We also hypothesize that additional expressive information is
provided by the temporal profile of the ratio τX/τG (tauX/tauG). As a tauG-guided movement
unfolds, a musician attempts to keep the ratio, τX/τG, equal to a goal value, kX,G. We write the
ratio, τX/τG, as κX,G (kappaXG). In practice, κX,G will generally vary over time, because the
variable forces coming from within the musician and from the environment will require
the musician to make adjustments to their movement to keep the value of κX,G close to that
of kX,G. The perceptual feedback a musician receives allows them to monitor τX, the tau of
the gap, X, that they are controlling, and hence κX,G, and to compare this with the goal value, kX,G,
in much the same way as a tightrope walker senses the lateral position/movement of their centre
of gravity and relates this to the current position of the rope. A skilful tightrope walker moves
with energetically efficient style (or grace), shifting their weight systematically from side to
side to maintain balance. Likewise, a skilled musician making a tauG-guided movement may
be expected to shift the value of κX,G systematically around the goal value of kX,G as the
movement evolves, and to do so in a manner that befits the musical expression they are
conveying.
This hypothesis can be tested by plotting κX,G against time for the duration of the tauG-guided
movements of differing musical expression, and determining whether or not the κX,G profiles
differ between musical expressions. Consistencies in the way κX,G profiles unfold for different
expressive purposes might provide a glimpse of the underlying expressive image or plan, that is,
the emotive gesture an artist is conveying through their movements in their chosen art, whether
brush strokes, pirouettes, vibrato or soaring vocal phrases.
We will now describe studies of musical performance that explore and test this approach.
They have generated measures of kX,G, TG, and κX,G, illustrating how general tau theory can
help us to understand what is involved in producing the right sounds in the right way at the
right time.
6.10 Singing f0 glides

In this study (Schögler et al. 2008), two accomplished female singers sang Pergolesi’s duet Vanne,
Vale, Dico Addio, unaccompanied. The singing was legato, which meant that when moving from
one tone to the next, the voice glided (rapidly) through the range of fundamental frequencies (f0)
between the tones, producing an ‘f0 glide’. They performed the duet in separate vocal booths to
enable isolated recordings to be made of each singer’s individual performances. The singers wore
headphones so they could hear each other as they performed. Acoustic and laryngograph
recordings were made of the performance. The laryngograph (Fourcin 1981) directly records the
vocal fold pattern in the larynx as an ‘Lx waveform’. The sound wave file produced by the
laryngograph is a direct reflection of the muscular activity of the vocal folds. This sound file
was used to compare a singer’s control of the fundamental frequency, f0, of their voice against
the control of movement of their vocal folds.
To measure the way in which the singers glided between tones, a computer program, ‘Praat’
(Boersma and Weenink 2000), first converted, at 500 Hz, the recorded sound into a fundamental
frequency (f0) graph for each of the performances. The f0 graph was then inspected and each
transition between the tones measured, using a tauG analysis program (see footnote 3) to determine
whether the way an f0 gap, X, changed could be explained by the singer’s tauG-guiding of the
closure of the gap, following the formula of Equation 1, τX = kX,G τG. Consider, for example, the f0 glide from E down
to D sharp in the first bar of the phrase illustrated in Figure 6.2. The singer sings Va-le with one
continuous movement of the fundamental frequency, f0, of her voice.
3 The tauG analysis program was written by G-J Pepping.

[Figure 6.2: sound wave and f0 trace (200–700 Hz) against time (3.5–16 s) for the sung phrase
‘Va-ne, va-le di-co ad-di-o, Ma ram-men-ta ch’il cor mi-o’, with a magnified section
(5–6 s, 630–690 Hz) in which an f0 gap is marked.]
Fig. 6.2 Delineation of f0 glides (fundamental frequency glides) linking adjacent notes. The
sound wave recorded from the singer was computer-analysed to yield the f0 profile shown.
f0 glides are indicated by the large rapid shifts in f0 that are aligned with the note changes in
the musical score.
We tauG-analysed 225 acoustic f0 glides using the method described in Figure 6.3. The means
(standard deviations) of the tauG-guidance measures were (1) the percentage of the gap, X, that
was tauG-guided = 99.03% (1.54%), and (2) the percentage of the variance in the data explained
by the tauG-guidance equation 1 = 98.7% (1.4%); kX,G = 0.552 (0.115). Thus, the data strongly
support the hypothesis that the singers tauG-guided their f0 glides.
Box 6.1 Measuring tauG–guidance

Figure 6.3 overleaf shows the steps taken in analysing gap-closing data to measure (1) the
degree to which the closure of a gap is tauG-guided, and (2) the kinematic form of the gap
closure. To illustrate the procedure, we use data from a typical f0 glide taken from the singing
study reported in this chapter. However, the same procedure applies to the analysis of any gap
closure. Figure 6.3(a) shows how the f0 gap, X, and its rate of change, Ẋ (obtained by the
numerical differentiation of X) changed during the f0 glide. The vertical lines indicate the
start and end of the f0 glide; they correspond to the time points when Ẋ just exceeded 10% of
its peak value (the peak is negative in Figure 6.3(a) because the f0 glide is downward). These
cut-offs are used to eliminate the noisy estimates of Ẋ at low values. Figure 6.3(b) shows how
τX, the tau of the f0-gap, X, covaried over time with τG (computed from equation 2). Figure
6.3(c) plots τX against τG. The line through the data points is the result of applying a recursive
linear regression algorithm. The algorithm derives two measures of the degree to which
closure of the gap is tauG-guided, namely the ‘% gap tauG-guided’ and the ‘% variance
explained’. The ‘% gap tauG-guided’ is the highest percentage of data points, up to the end
point, that fit the tauG-guidance model (Equation 1) with less than 5% of the variance
unaccounted for (i.e., with r² of the linear regression greater than 0.95). In Figure 6.3(c), this
is 97.6% (i.e., all but the left-most point, which is not shown). The ‘% variance explained’ by
the tauG-guidance model equals 100 times the r² of the linear regression computed by the
algorithm. In Figure 6.3(c), this is 99.0%. The slope of the linear regression computed by
the algorithm (0.655 in Figure 6.3c) is an estimate, k̂X,G, of the coupling ratio, kX,G, in the
tauG-guidance Equation 1.
In the tauG-guidance of closure of a gap, X, kX,G measures the precipitousness of the gap
closure, in the following sense: (i) the higher the value of kX,G, the longer the initial
acceleration and the shorter and steeper the terminal deceleration; (ii) when 0 < kX,G ≤ 0.5, the
arrival at the end of the gap will be gentle, because Ẋ will then be zero; however, when kX,G > 0.5,
the end of the gap will be reached with collision because, in this case, Ẋ will be greater than
zero at the end of the gap closure (Lee 1998). As an example, imagine being driven between traffic
lights in a tauG-guided manner, accelerating from rest to a peak speed and then braking to
stop. The higher the value of kX,G, the more precipitous will be the approach to the lights, and
the scarier the drive!
The style used by an animal or person attempting to follow the tauG-guidance equation
(equation 1) is measured by the kappa profile (κX,G) of the movement, where κX,G = τX/τG at
each sample time. For example, Figure 6.3(d) plots the smoothly undulating kappa profile of
the f0 glide shown in Figure 6.3(a). In general, a kappa profile will undulate, because it is
physically impossible for a person or animal to keep κX,G (or indeed any variable) absolutely
constant when closing a gap. (If κX,G were kept constant, the kappa profile would be straight
and horizontal, and κX,G would equal kX,G during the whole movement.) Therefore, to keep
κX,G sufficiently close to the goal value, kX,G, κX,G must be varied around the goal value in a
controlled manner. Thus, we conjectured that a skilled singer tauG-guiding an f0 glide, or a
skilled bass player tauG-guiding an intensity glide, may shift the value of κX,G smoothly
around the goal value kX,G with a style that befits the musical expression they want to convey.
We tested this hypothesis in studies reported in this chapter, at the same time measuring the
degree to which f0 glides and intensity glides were tauG-guided, and the kinematics of those
movements.
[Figure 6.3, panels: (a) the f0 gap, X (Hz), and its rate of change, Ẋ (Hz s⁻¹), against time (s);
(b) τX and τG (s) against time (s); (c) τX (s) against τG (s); (d) κX,G against time (s).]
Fig. 6.3 TauG-analysis procedure.
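The analysis steps of Box 6.1 can be sketched in code as follows. This is a simplified illustration, not the analysis program used in the studies: it assumes that the 10% cut-offs have already been applied, so that the first and last samples mark the start and end of the glide, and it assumes an ordinary least-squares regression with an intercept. The function names, the synthetic glide and all parameter values are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def linear_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line through (xs, ys); returns (slope, r_squared)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0.0 or syy == 0.0:
        return 0.0, 0.0
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


def tau_g_analysis(times: Sequence[float], gap: Sequence[float],
                   r2_min: float = 0.95) -> Optional[Dict[str, object]]:
    """Assumes times[0] is the start of the glide (t = 0) and times[-1] its end (t = T_G)."""
    t_g = times[-1] - times[0]
    tau_x: List[float] = []
    tau_g: List[float] = []
    for i in range(1, len(times) - 1):
        t = times[i] - times[0]
        # Numerical differentiation of the gap by central differences.
        x_dot = (gap[i + 1] - gap[i - 1]) / (times[i + 1] - times[i - 1])
        if x_dot == 0.0:
            continue  # tau_X is undefined while the gap is not changing
        tau_x.append(gap[i] / x_dot)
        tau_g.append(0.5 * (t - t_g ** 2 / t))  # equation 2
    n_total = len(tau_x)
    # Recursive regression: drop the earliest points until r^2 exceeds the criterion.
    for first in range(n_total - 2):
        k_hat, r2 = linear_fit(tau_g[first:], tau_x[first:])
        if r2 > r2_min:
            kappa = [tx / tg for tx, tg in zip(tau_x[first:], tau_g[first:])]
            return {
                "k_hat": k_hat,
                "pct_gap_guided": 100.0 * (n_total - first) / n_total,
                "pct_variance": 100.0 * r2,
                "kappa_profile": kappa,
            }
    return None


# Synthetic glide (hypothetical): a 60 Hz gap closed in 80 ms, sampled at 500 Hz,
# generated from perfect tauG-guidance with a coupling factor of 0.55.
k_true, a_x, duration, rate = 0.55, 60.0, 0.08, 500
n_samples = int(round(duration * rate))
times = [i / rate for i in range(n_samples + 1)]
gap = [a_x * max(0.0, 1.0 - (t / duration) ** 2) ** (1.0 / k_true) for t in times]

result = tau_g_analysis(times, gap)
print(f"estimated coupling factor: {result['k_hat']:.3f}")
print(f"% gap tauG-guided: {result['pct_gap_guided']:.1f}")
print(f"% variance explained: {result['pct_variance']:.1f}")
```

Because the synthetic glide is generated from equation 1 itself, the estimated slope should lie close to the value used to generate it; with recorded data, the kappa profile returned here is what would be plotted as in Figure 6.3(d).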
6.11 Stressing f0 glides

To determine whether musical stress was related to the form of a f0 glide, the singers examined
the score using their musical judgement and identified 20 unstressed and 20 stressed transitions
between adjacent notes. The f0 glides of these 40 transitions were then tauG-analysed. Analysis
of the f0 glides showed (Figure 6.4a) that k̂X,G was significantly higher for the stressed f0 glides
(t(19) = –3.699, p < 0.05). However, there was no significant difference between the durations of
the stressed and unstressed f0 glides. Thus, the results indicate that kX,G was a parameter of
expression in the f0 glides, but TG was not. The higher values of kX,G in the stressed f0 glides
meant that the stressed f0 glides started more gradually, the rate of change of f0 (the ‘f0-velocity’)
peaked later and the terminal ‘f0 deceleration’ was higher and more abrupt (Lee 1998). The
singers also used different styles in stabilizing around their chosen kX,G value when singing
stressed and unstressed f0 glides. The κX,G (kappaXG) profiles of the stressed f0 glides in
Figure 6.4b were significantly higher during approximately the first 60% of the f0 glide (p < 0.05,
two-tailed t-test; see footnote 4). If maintained, a higher κX,G at the beginning promises a more
abrupt ending to the movement. Thus, it seems that stress was added to an f0 glide by making the
glide appear particularly precipitous at the beginning.
4 The significance of the difference in the kappaXG profiles was tested by applying a t-test to each successive
time point in the profiles.
[Figure 6.4, panels: (a) mean k̂X,G for the unstressed and stressed f0 glides; (b) mean κX,G against
normalized time (0–50) for the stressed and unstressed f0 glides.]
Fig. 6.4 Stressed versus unstressed f0 glides. (a) Mean values of k̂X,G, estimating the coupling factor
in a singer’s tauG-guided f0 glides. Vertical bars represent standard errors. The significantly
higher mean k̂X,G for the musically stressed f0 glides indicates that they approached their end more
precipitously. (b) Mean time-normalized κX,G (kappaXG) profiles of the musically stressed and
unstressed f0 glides, measuring the different styles of control of the f0 glides. Vertical bars represent
standard errors. Thicker sections of the curves indicate statistically significant differences between
the profiles (p < 0.05 two-tailed t-test). The significantly higher and rising mean value of κX,G for
the musically stressed f0 glides during the first half of the f0 glides indicates different styles of control
of the stressed and unstressed f0 glides: the stressed f0 glides were particularly precipitous at the
beginning of the glide.
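The comparison of kappa profiles across glides of different durations can be sketched as below. The sketch resamples each profile onto a common normalized time base by linear interpolation, computes the mean and standard error at each time point, and forms a t statistic at each successive time point in the manner of footnote 4. The choice of linear interpolation, the use of an unequal-variance (Welch) statistic and the four short profiles are our assumptions for illustration; they are not data or methods taken from the study.

```python
import math
from typing import List, Sequence, Tuple


def time_normalize(profile: Sequence[float], n_points: int = 50) -> List[float]:
    """Resample a profile to n_points by linear interpolation over normalized time."""
    if len(profile) < 2 or n_points < 2:
        raise ValueError("need at least two samples and two output points")
    last = len(profile) - 1
    resampled = []
    for j in range(n_points):
        position = j * last / (n_points - 1)
        lower = min(int(position), last - 1)
        fraction = position - lower
        resampled.append(profile[lower] * (1.0 - fraction) + profile[lower + 1] * fraction)
    return resampled


def mean_and_se(profiles: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Mean and standard error at each time point across equally long profiles."""
    n = len(profiles)
    means, errors = [], []
    for column in zip(*profiles):
        mean = sum(column) / n
        variance = sum((value - mean) ** 2 for value in column) / (n - 1)
        means.append(mean)
        errors.append(math.sqrt(variance / n))
    return means, errors


def pointwise_t(group_a: Sequence[Sequence[float]],
                group_b: Sequence[Sequence[float]]) -> List[float]:
    """Unequal-variance t statistic at each successive time point (an assumed variant)."""
    mean_a, se_a = mean_and_se(group_a)
    mean_b, se_b = mean_and_se(group_b)
    stats = []
    for ma, sa, mb, sb in zip(mean_a, se_a, mean_b, se_b):
        spread = math.hypot(sa, sb)
        stats.append((ma - mb) / spread if spread > 0.0 else float("nan"))
    return stats


# Hypothetical kappa profiles of unequal length, for illustration only.
stressed = [[0.70, 0.80, 0.75, 0.60, 0.55], [0.65, 0.85, 0.70, 0.60]]
unstressed = [[0.50, 0.55, 0.60, 0.55], [0.45, 0.50, 0.60, 0.60, 0.50, 0.55]]

t_values = pointwise_t([time_normalize(p) for p in stressed],
                       [time_normalize(p) for p in unstressed])
print(f"t statistic at the first normalized time point: {t_values[0]:.2f}")
```

Converting each t statistic to a p value requires the t distribution, which is omitted here to keep the sketch within the standard library.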
6.12 Laryngeal f0 glides

The acoustic and laryngograph data for 42 f0 glides were tauG-analysed for one of the singers and
were found to be practically identical: the mean (SD) of kX,G was 0.563 (0.103) for both acoustic
and laryngograph recordings, the mean (SD) percentage gap tauG-guided was 99.23% (1.27%) for
the voice and 99.09% (1.32%) for the laryngograph, and the κX,G profiles were virtually
superimposed (Figure 6.5). This confirms that the tauG guidance of the f0 glides took place in the
larynx as a consequence of the nervous system regulating the tension in the laryngeal muscles.
It is reasonable to assume that the fundamental frequency (f0) generated in the larynx is a power
function of the tension in the vocal folds. Thus, tauG guiding the tension in the vocal folds would
result in tauG guiding the f0 glide (Lee 1998). The data therefore support the hypothesis that f0
glides are tauG-guided by the nervous system’s tauG-guiding the tension in the laryngeal muscles.
[Figure 6.5: mean κX,G against normalized time (0–50) for the voice and laryngograph records.]
Fig. 6.5 Mean time-normalized κX,G (kappaXG) profiles of the acoustic and laryngograph
records of a singer’s f0 glides were not significantly different, indicating that the acoustic f0
glides were generated in the larynx. Vertical bars represent standard errors.
[Figure 6.6: the singer’s hand movements are recorded by a Selspot camera and active marker; her
vocal performance is recorded by a microphone and professional digital multi-track.]
Fig. 6.6 Recording a singer’s voice together with her hand gestures to measure the relationship
between the two.
6.13 Singing and gesturing f0 glides

A professional jazz singer was invited to record an a cappella version of the song ‘The beat goes
on’ (Sonny and Cher). Her singing was recorded in a traditional studio setting, but with one
additional requirement: she was asked to try and move her right hand up and down in a manner
that matched her vocal performance, paying particular attention to her movements in pitch.
Thus, as she moved from a high tone to a low tone, her hand would move down, and vice versa.
The song involves a repeating blues structure affording multiple comparisons to be made across
individual pitch transitions.
The singer’s vertical ‘hand glides’ when gesturing were recorded at 500 Hz on a Selspot™
motion capture system at the same time as her vocal performance was recorded on audio
(Figure 6.6). Her f0 glides were graphed (using Praat software) at 500 Hz, together with the
hand glides that accompanied them, and both were tauG-analysed as described in Figure 6.3.
Sixteen pairs of f0 glides and hand glides were obtained from the performance. The mean and
standard error plots of κX,G for the 16 f0 glides, and the 16 hand glides, are plotted in Figure 6.7
against normalized time. Despite the fact that the hand glides often lasted about twice as long as
the f0 glides, the time-normalized temporal profiles of κX,G were very similar. This indicates that
voice and hand were being independently tauG-guided with similar goal kX,G values, and similar
patterns of control of κX,G.
[Figure 6.7: mean κX,G against normalized time (0–50) for the hand glides and the f0 glides.]
Fig. 6.7 Mean time-normalized κX,G (kappaXG) profiles of a jazz singer’s f0 glides and of
her accompanying hand glides (vertical gestures of her hand). Vertical bars represent standard
errors. The two profiles are not significantly different, indicating a common style of control of
the f0 glides and hand glides.
6.14 Trombone f0 glides

In singing, the source of an f0 glide (the larynx) lies hidden in the body, but in playing the
trombone, a key part of the action is brought to the surface and can be recorded optically.
Trombonists produce f0 glides by moving the trombone slide, and by regulating the tension in
their lips. In this study, an amateur trombonist played an f0 glide between two tones a tone apart.
The f0 glide was played quite quickly, both with and without moving the trombone slide. The
motion of the trombone slide was recorded at 500 Hz on a Selspot™ motion-capture system and
was synchronized with the acoustic record, from which the changing f0 was computed using the
Praat software. The glide of the trombone slide and the acoustic f0 glides were tauG-analysed,
as described in Figure 6.3. Table 6.1 shows strong evidence that the f0 glides using only the lips,
the f0 glides using the trombone slide, and the glide of the trombone slide were all tauG-guided
and used similar kX,G values, not significantly different from one another, averaging around 0.55.
In a second study, the trombonist played a Mozart tune slowly. The glide of the trombone
slide and the resulting f0 glides between successive tones of the tune were recorded and tauG-
analysed as in the preceding study. Again, there was strong evidence that both were tauG-guided
(Table 6.2). What was particularly interesting was that a single glide of the trombone slide
frequently encompassed a group of tones and hence several f0 glides. Although the trombone
slide did not stop at each tone in a group, the acoustic record showed that the tones were, in fact,
each held for a brief period. The player was presumably regulating the tension in his lips on each
tone of the phrase so as to produce a reverse f0 glide to cancel the f0 glide that would otherwise
have resulted from the continuous glide of the trombone slide.
Table 6.1 Means (SDs) of tauG-guidance parameters when playing a f0 glide on a trombone
between two tones a tone apart

                                 % variance explained   % gap tauG-guided   k̂X,G
f0 glide using only lips         98.8 (1.0)             99.2 (0.9)          0.519 (0.059)
f0 glide using trombone slide    96.2 (1.0)             98.9 (1.7)          0.551 (0.078)
Glide of trombone slide          98.2 (1.5)             100 (0)             0.572 (0.053)

Table 6.2 Means (SDs) of tauG-guidance parameters when playing f0 glides on a trombone in a
Mozart tune

                                 % variance explained   % gap tauG-guided   kX,G
f0 glides                        97.1 (1.5)             90.9 (18.2)         0.506 (0.168)
Glide of trombone slide          97.7 (1.2)             99.7 (0.5)          0.395 (0.077)

6.15 Bowing intensity glides

Thus far, we have seen how tau information derived from sound and movement can provide
insights into how musicians create the right tone, and, furthermore, in the example of
the singers, how this information can help us to explore the intensity of musical expression. The
following experiments with a double-bass player (Schögler et al. 2008) employed the kind of
paradigm for movement recording illustrated in Figure 6.1, to explore further a manipulation
of expression or mood. In playing the double bass, a musician obtains the correct pitch by
pressing the fingers of the left hand on the strings, while the right hand controls the movement of
the bow across the strings. The analysis of the singers’ f0 glides demonstrates that tauG-analysis
can provide information on expression that is not dependent on the production of a particular
tone or change in tones, but more importantly is related to how a sound is made. The bow
movement, which affects how the sound of the double bass is made, is the focus of the following
experiment. The left hand of a bass player also plays a vital role in the manipulation of the
performance (in vibrato, and to sustain, or modulate the sound); however, for simplicity, and as a
first step, we focussed on the musician’s bowing movement of the right hand. How the
bow moves across a string modulates the ‘attack’ on the tone (the primary phase of the sound
intensity). When a tone is made by stroking a taut string, the intensity of the sound evolves in
three distinct phases of the action: attack, sustain and decay. In the attack phase, the sound level
rises rapidly to a peak level, in what we shall call an ‘intensity glide’. The intensity of sound then
gradually declines during the sustain phase and rapidly decreases during the decay. The
frequency spectrum is different in the three phases of the sound, and the attack phase contains a
great deal of the information crucial to the character of the sound heard through all three
phases (Galembo et al. 2001). A skilled musician varies how a tone is attacked and stressed to
produce different acoustic effects. The attack of a tone also determines when that tone is
perceived to occur. The relationship between bowing movement and attack in double-bass
playing was investigated in the following experiment.
A professional bass player bowed several times a key phrase from Tchaikovsky’s ‘The dance of the
sugar plum fairy’ in two moods: ‘happy’ and ‘sad’. The sound was recorded using professional audio
multitracking facilities and the glides of the bow across the strings were recorded at 500 Hz on a
Selspot™ motion-capture system synchronized with the acoustic record, from which the changing
intensity was computed using Praat software (Figure 6.8). The intensity glide during the attack phase
and the ‘bow glide’ that produced it were tauG-analysed, applying the method described in
Figure 6.3. In all, 146 intensity glides and accompanying bow glides were analysed and found to be
tauG-guided (the mean percentage gap tauG-guided was 86.2% for the intensity glides and 89.2%
for the bow glides, with the percentage variance explained by tauG-guidance being greater than 95%).
Figures 6.9(a) and 6.9(b) show that, for both the intensity glides and the bow glides, the values
of the tauG-guidance parameters kX,G and TG were significantly higher in the ‘sad’ compared
with the ‘happy’ mood (p < 0.001, t-test). There was also a significant difference between the κX,G
(kappaXG) profiles for the two moods (Figure 6.10), the intensity glide and bow glide both being
more precipitous in the sad mood. Thus, the data indicate that, as for the singers’ f0 glides, the
intensity glides during the attacks on tones and the bow glides that produced them were each
tauG-guided by the musician’s nervous system, and the values of the kX,G and TG parameters
were each increased to shift from a happy to a sad mode of expression.
6.16 Playing in time

Playing in time with others involves communicating and maintaining a shared pulse and flow
of musical expression. In orchestral performances, this can be achieved when all musicians
mf
3200
Bow x
displacement
(Selspot units)
2800
Sound
wave
75
Intensity
(db)
50
4.28 Time (s) 10.28
Magnified section
5–6 seconds
3200 74
Intensity
Bow x Intensity
displacement (db)
(Selspot units) x - Displacement
2800
5 Time (s) 6
Fig. 6.8 Delineation of upward moving intensity glides at the beginnings of tones. The bow
movement and sound wave were recorded from an electric bass. The sound wave was computer-
analysed to yield the intensity profile shown. Intensity glides are indicated by large rapid rises in
intensity aligned with the onset of notes in the musical score, which correspond to the bow glides
(large displacements of the bow across the strings).
Intensity glides accompanying bow glides Intensity glides accompanying bow glides
1 1
0.8 0.8
Mean duration TG (s)

Bow glides
Bow glides
Meank^X,G
0.6 0.6
Intensity glides
0.4 0.4
Intensity glides
0.2 0.2
0 0
‘Happy’ ‘Sad’ ‘Happy’ ‘Sad’
(a) (b)
Fig. 6.9 (a) Mean values of k̂X,G, estimating the coupling factor in a bass player’s tauG-guided bow
glides and resultant intensity glides when playing a tune ‘sadly’ and ‘happily’. Vertical bars represent
standard errors. The statistically significant higher mean k̂X,G for the ‘sad’ rendition indicates more
precipitous approaches to the ends of the bow glides and resultant intensity glides. (b) Mean
durations, TG, of the bass player’s tauG-guided bow glides and resultant intensity glides. Vertical
bars represent standard errors. TG was longer in the ‘sad’ rendition, in line with the general tempo,
which was slower.
focus on the rhythmic ‘dance’ of the conductor, while also listening to and watching other
orchestral members. In smaller ensembles, for example those of jazz or chamber music, there is
no conductor, so, more than in an orchestral goup, information in the behaviour of each
musician communicates the subtleties of shared pulse and musical expression to each other. This
information allows the musicians to prospectively control their movements so that they can
synchronize with each other both in pulse and expression. In short, they need to perceive what
the others are about to do. At the level of beat, this would be possible if, as our studies indicate, all
players’ movements, and the resulting musical sounds, were tauG-guided, because each player
[Figure 6.10 panels: (a) Bow glides, (b) Intensity glides. Vertical axis: κX,G (0 to 1.2); horizontal axis: normalized time (0 to 50); separate curves for ‘Sad’ and ‘Happy’.]
Fig. 6.10 (a) Mean time-normalized κX,G (kappaXG) profiles of the tauG-guided bow glides, and
(b) resultant intensity glides for the ‘sad’ and ‘happy’ renditions depicted in Figure 6.9. Vertical bars
represent standard errors. Thicker sections of the curves indicate statistically significant differences
between the profiles. The significantly higher and rising mean value of κX,G for the ‘sad’ bow glides
and resultant intensity glides during the first half of the glides indicates different styles of control of
the ‘sad’ and ‘happy’ glides. The ‘sad’ glides were made particularly precipitous at the beginning,
paralleling the difference between sung stressed and unstressed f0 glides (Figure 6.4b).
would then be able to perceive the initial part of a tauG-guided movement of a co-musician, and
be able to extrapolate the rest of the movement.
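As a rough illustration of how the initial part of a tauG-guided movement could, in principle, specify the rest, the sketch below assumes the form in which tauG-guidance is usually written in general tau theory (Lee 1998): tauG(t) = ½(t − TG²/t), with the gap coupled to it as tauX = kX,G·tauG. Under those assumptions t·tauX is linear in t², so a straight-line fit to early samples recovers both kX,G and TG. The function names, the synthetic glide and its parameter values are illustrative assumptions, not the analysis procedure used in our studies.

```python
# Sketch: recover k and T of a tauG-guided gap from its first 30%.
# Assumes tauG(t) = 0.5 * (t - T**2 / t) and tauX = k * tauG (Lee 1998 form).
# All names and numbers here are illustrative, not taken from the chapter.


def gap(t, k, amplitude, duration):
    """Gap size X(t) for a tauG-guided closure: A * (1 - t^2/T^2)^(1/k)."""
    return amplitude * (1.0 - (t / duration) ** 2) ** (1.0 / k)


def gap_rate(t, k, amplitude, duration):
    """Analytic rate of change dX/dt of the gap defined above."""
    u = 1.0 - (t / duration) ** 2
    return -(2.0 * amplitude * t) / (k * duration ** 2) * u ** (1.0 / k - 1.0)


def tau(t, k, amplitude, duration):
    """tau of the gap: time-to-closure at the current closure rate, X / (dX/dt)."""
    return gap(t, k, amplitude, duration) / gap_rate(t, k, amplitude, duration)


def fit_k_and_duration(times, taus):
    """Least-squares line through (t^2, t*tau): slope = k/2, intercept = -k*T^2/2."""
    xs = [t * t for t in times]
    ys = [t * tx for t, tx in zip(times, taus)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    k_est = 2.0 * slope
    duration_est = (-intercept / slope) ** 0.5
    return k_est, duration_est


# Synthetic glide: k = 0.6, amplitude 1.0, duration 0.4 s (made-up values).
TRUE_K, TRUE_A, TRUE_T = 0.6, 1.0, 0.4
early_times = [TRUE_T * i / 100.0 for i in range(1, 31)]  # first 30% only
early_taus = [tau(t, TRUE_K, TRUE_A, TRUE_T) for t in early_times]

k_hat, t_hat = fit_k_and_duration(early_times, early_taus)
print(f"estimated k = {k_hat:.3f}, estimated duration = {t_hat:.3f} s")
```

With noise-free synthetic data the fit returns the generating values; with real recordings the early taus would be noisy and the estimate correspondingly less certain.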
The tauG information perceived by musicians as they play together could also form the basis of
larger-scale rhythmic interpretations of pulse in the following way. Each player produces a thread
of tauG-guided movements and sounds, each with its own duration, TG, which will vary by tone
and by instrument. The players weave their tauG threads together to form a braid with a shared
pulse. When weaving a fabric, the colouring or patterning of the textile is created by the different
elements of the weave, and subtle adjustments made at the micro level can have dramatic global
effects. It is similar with the organic rhythm between musicians as they play together. As different
voices are added to an ensemble, they can change the essential rhythmic character of a piece in
the way they combine. The main point is that rhythm is not based on clock-time, but rather on
the action-times in the tauG threads created by the players, and these are flexible in relation to
clock-time.
We believe that the tauG-guidance of movements and sounds makes it possible for two or
more musicians to be in time with the underlying rhythm and expression of the piece. Their
musical movements and sounds can then unite into a single act that creates something more than
the simple sum of its parts, like the movements of two tango dancers. This is what is meant by
being ‘in the groove’—a feeling of fitting perfectly in a way that is unique to the actions of the
moment and dependent on being with someone else. Two players performing a duet are no
longer two individuals doing separate things, but are a united dyad: the result is both functional
and beautiful.
6.17 Synchronizing intensity glides in improvised jazz duets

Research into the perception of musical pulse by both musicians and non-musicians confirms
the importance of accent and dynamic change in the perceptual segmentation of a stream of
sound (Deliège 1987). In music theory, accents are defined as ‘melodic’ or ‘dynamic’, the dynamic
accent being a generator of salience accessible in all modalities of music, and common to both
the percussive drumming traditions of Africa and Western tonal music. Basic dynamic accents in the
control of instrument or voice do not require complex modulations of pitch and timbre, but can
be expressed entirely in the changing intensity of sounds, which are easily perceived by infant
and adult, jazz improviser or naive listener. Thus, a simple way for musicians to mark a dynamic
accent in joint performance is for them to produce synchronous peaks of intensity. These consti-
tute identifiable shared gap-closing goals that can be subjected to appropriate tauG-analyses.
In this way, the concept of tauG-guidance of movement and sound can be used to help decipher the
musicians’ art and to help understand how they are able to play together.
We have employed tauG-analysis to measure the coordination of sounds between players
in jazz duets, and to gain precise information about how the changing sound produced by
the individual musicians is combined in the more coherent performance of the dyad (Schögler
1998, 1999, 2003). TauG-analyses were made of the control of tone production at points of
synchrony in the performance. It was assumed that each performer accurately controlled the
sounds he/she made to arrive at specific points in musical time. As described in the study of
double-bass playing, the intensity of musical sounds evolves through three distinct phases:
attack, sustain and decay. The attack phase, which comprises an intensity glide, contains informa-
tion crucial to the character of the sound and determines when that sound is perceived to occur.
Five duets were performed by musicians separated in two studios linked by sound alone (thus the
musicians were unable to see each other). The following instrument pairs were recorded for
analysis: (i) kit drums and electric bass, (ii) kit drums and double bass, (iii) kit drums and
Fig. 6.11 The changing loudness at a point of synchrony of the combined sounds and the
individual sounds produced by two musicians improvising together on bass and drums.
[Axes: loudness (sones) against time (s), around 106.6 to 106.7 s; curves labelled Combined sound, Drums and Bass, with the point of synchrony marked.]
electric guitar, (iv) electric guitar and double bass, and (v) kit drums and double bass. Each duet
was recorded digitally using a multitrack facility that recorded both the combined performance
and the individual sounds of the musicians. Algorithms developed for micro-analytic research on
infant communication to extract pitch, timbre and loudness (Malloch et al. 1997; Malloch 2001)
were applied to the recordings of the improvised duets to generate a measure of the changing
loudness, in sones,5 with a resolution of 100 Hz. Subsequent inspection of the changing loudness
in the two performances and their combination identified moments of synchrony as coincident
peaks in loudness.
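The identification of coincident peaks was made by inspection, but the idea can be sketched programmatically. The example below is a hypothetical stand-in, not the procedure we used: it treats a peak as a strict local maximum above an assumed loudness floor, and counts two peaks as coincident when they fall within an assumed tolerance of one 10 ms frame. The loudness values are invented for illustration.

```python
# Sketch: coincident loudness peaks in two tracks sampled at 100 Hz.
# The loudness values are invented; floor and tolerance are assumed parameters.

FRAME_RATE_HZ = 100


def peak_frames(loudness, floor=0.0):
    """Indices of strict local maxima that exceed a loudness floor."""
    return [
        i
        for i in range(1, len(loudness) - 1)
        if loudness[i - 1] < loudness[i] > loudness[i + 1] and loudness[i] > floor
    ]


def coincident_peaks(track_a, track_b, tolerance_frames=1, floor=0.0):
    """Pairs of peak frames, one per track, no more than tolerance_frames apart."""
    peaks_b = peak_frames(track_b, floor)
    pairs = []
    for pa in peak_frames(track_a, floor):
        close = [pb for pb in peaks_b if abs(pa - pb) <= tolerance_frames]
        if close:
            pairs.append((pa, min(close, key=lambda pb: abs(pa - pb))))
    return pairs


# Made-up loudness traces (sones), one value per 10 ms frame.
drums = [1.0, 1.2, 3.8, 2.1, 1.4, 1.1, 1.0, 2.9, 1.3, 1.0]
bass = [0.8, 0.9, 1.5, 3.1, 1.6, 1.0, 0.9, 0.9, 1.0, 0.8]

for frame_a, frame_b in coincident_peaks(drums, bass, tolerance_frames=1, floor=2.0):
    print(
        f"drums peak at {frame_a / FRAME_RATE_HZ:.2f} s, "
        f"bass peak at {frame_b / FRAME_RATE_HZ:.2f} s"
    )
```

The second drum peak has no bass partner within the tolerance, so only the first pair is reported as a candidate point of synchrony.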
Figure 6.11 shows the changing loudness at a point of synchrony of the combined sounds and
the individual sounds produced by the two musicians improvising together on bass and drums.
The taus of the three intensity glides from the base level to the peak were plotted against tauG, as
illustrated in Figure 6.3(b). These plots were subjected to a tauG-analysis to ascertain the degree of
tauG-guidance of the intensity glides in the attacks of the musicians’ sounds. The combined
sound, even though it was the result of the activity of two players, showed a higher percentage of
the gap tauG-guided, and with less variability, than the players’ individual sounds. The means (SD)
were 99.06% (5.33%) and 97.63% (9.56%), respectively. The difference in mean percentages was not
statistically significant, but the marked decrease in variance shown in the analysis of the combined
sound was significant (F(495, 256) = 3.21, p < 0.001). This marked difference indicates a higher
degree of coherence in the tauG information perceivable in the combined sound (the actual
performance) than in the constituent elements creating it, thus demonstrating the players’
cooperation in sound production to a shared rhythmic goal. In reference to points of synchronous
activity, the musicians’ individual performances were apparently subordinated to their shared
sense of time. The resulting tauG-guide perceivable in the actual performance is more stable,
providing a cohesive ‘time of action’ through which the players coordinate their conjoint activity.
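The reported F value can be checked approximately from the standard deviations given above, since a variance-ratio F statistic is the larger sample variance divided by the smaller. The sketch assumes that the test was a simple two-sample variance ratio and that the degrees of freedom (495 for the individual sounds, 256 for the combined sound) are sample sizes minus one; neither assumption is spelled out in the text.

```python
# Sketch: variance-ratio check from the reported standard deviations.
# Assumption: F is the ratio of the two sample variances (larger over smaller).

sd_individual = 9.56  # SD of percentage tauG-guided, individual sounds (%)
sd_combined = 5.33    # SD of percentage tauG-guided, combined sound (%)

f_ratio = (sd_individual ** 2) / (sd_combined ** 2)
print(f"F = {f_ratio:.2f}")
```

Because the standard deviations are themselves rounded to two decimal places, the recomputed ratio can differ from the published value in the last digit; it agrees with the reported 3.21 to within that rounding.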
6.18 Expressive kX,G in jazz duets

As well as providing the necessary perceptual substrate to play in time, the tauG-guidance at
points of synchronous activity suggests how those moments of synchrony are produced in the
5 The sones scale of perceived loudness is the current basis for the national and international measurement
of loudness (Campbell and Greated 1994).
sonic landscape of the piece. Regulation of kX,G is hypothesized to be a means of modulating
expression or, in other words, the expressive qualities of the musical gestures in the piece.
However, if kX,G is to be accepted as a measure of the expressive components of musical gestures,
it must be shown to be a parameter that is systematically varied by musicians and not simply a
product of the acoustic properties of the instrument. Through the course of an improvisation,
moments of synchrony occur at a variety of rhythmic and narrative positions in the piece. If the
kX,G values for the tauG analyses of the individual and combined sounds were to be normally
distributed around a central value (the null hypothesis), this would indicate that the moments of
synchrony were due to acoustic properties of the instruments. To test this, the duet between
electric guitar and drums was tauG-analysed at 1000 Hz to provide a more detailed picture of
how kX,G is varied in an improvised jazz duet. No significant relationships were found between
kX,G and the position in the duet or rhythmic structure. However, when the three tauG-analyses
of the individual and combined sounds of the two musicians were subjected to a means cluster
analysis, the kX,G data demonstrated three key groupings around kX,G values of 0.32, 0.56 and
0.88, which suggests that the musicians used three distinct classes of tauG-guided movements.
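To show what grouping kX,G values around three centres involves, the sketch below runs a plain one-dimensional k-means procedure. It assumes that the cluster analysis was of the k-means type, which the text does not confirm. The twelve values are synthetic, chosen only so that their group means fall near the reported centres; they are not data from the duets.

```python
# Sketch: one-dimensional k-means on made-up k values.
# The data are synthetic and only illustrate the procedure.


def kmeans_1d(values, centres, iterations=50):
    """Plain Lloyd's algorithm in one dimension, from given starting centres."""
    centres = list(centres)
    groups = [[] for _ in centres]
    for _ in range(iterations):
        groups = [[] for _ in centres]
        for v in values:
            nearest = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # Keep the old centre if a group happens to be empty.
        new_centres = [
            sum(g) / len(g) if g else c for g, c in zip(groups, centres)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return centres, groups


synthetic_k = [0.28, 0.31, 0.33, 0.36, 0.52, 0.55, 0.57, 0.60, 0.84, 0.87, 0.89, 0.92]
centres, groups = kmeans_1d(synthetic_k, centres=[0.2, 0.5, 0.8])
print([round(c, 2) for c in centres])
```

The starting centres are arbitrary; with real data one would want to try several starting points and check that three clusters fit better than two or four.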
6.19 Prospects
Applying general tau theory to study musical expression clarifies how expression is achieved, and
allows it to be measured. Using the theory, we have sought to understand what it is in the
patterned flow of movement that lies at the root of music (and communication in general) that is
modulated to express feeling and beauty. The key ideas are:
1 Any skilled movement consists of guiding the closure of gaps between the current state and a
goal state.
2 Guiding the closure of a gap requires perceptual information only about the tau of the gap
(the time-to-closure of the gap at the current rate of closure), this information being readily
available in all sensory modalities.
3 Humans (and animals) tauG-guide the closure of gaps using a special tauG ‘formula’ that is
generated in their nervous system and is expressible as a mathematical function.
The tauG-guidance of a gap, X, has three parameters (kX,G, AX, TG), the values of which can be
regulated by a person or animal to control how the gap closes. kX,G determines the temporal
shape of the gap closure (i.e., how it evolves over time), AX is the maximum amplitude of the gap,
X, and TG is the duration of the tauG-guide. The temporal shape of the gap closure is also regu-
lated by adjusting the κX,G (kappaXG) profile of the movement, which measures the style of
control of the tauG-guidance. Our hypothesis is that in performance, musicians generate tempo-
rally overlapping tauG-guided gaps when moving on their instruments or with their voice, and
thereby create the temporally overlapping sound gaps that occur in the music. The amodal
nature of tau enables coordination and control of musical expression across different means
of enacting the music, and the tauG-guidance of movements structures the rhythm, melody,
harmony, texture and feeling of the music.
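For reference, the three parameters can be seen at work in the form in which tauG-guidance is usually written in general tau theory (Lee 1998). The equations below are stated on that assumption, with time t running from 0 to TG and the gap starting at its maximum amplitude AX; they are a summary sketch, not a derivation from this chapter.

```latex
\begin{align}
\tau_X(t) &= \frac{X(t)}{\dot{X}(t)}, \\
\tau_G(t) &= \frac{1}{2}\left(t - \frac{T_G^{2}}{t}\right), \qquad 0 < t \le T_G, \\
\tau_X(t) &= k_{X,G}\,\tau_G(t)
  \;\Longrightarrow\;
  X(t) = A_X\left(1 - \frac{t^{2}}{T_G^{2}}\right)^{1/k_{X,G}}.
\end{align}
```

In this form a larger kX,G keeps the gap open for longer and closes it more abruptly near the end, which is the sense in which a higher kX,G marks a more precipitous approach to the end of a glide.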
Our experiments have shown that the parameters kX,G and TG of the tauG-guidance of the
closure of a gap, X, and the κX,G (kappaXG) profile of the gap closure, all relate to expressive movements and
sounds in music. However, the durations of those expressive movements and sounds (e.g., f0
glides, intensity glides, and bow glides) were all relatively short (less than 1 second). Further
experiments are needed to test whether the theory also applies to longer expressive movements
and sounds, such as crescendos, diminuendos, accelerandos, ritardandos, and general phrasing.
More work on comparing different instruments and different styles of music is required to gain
a comprehensive understanding of how musicians create tauG-guided sounds by their
tauG-guided movements. We believe that using the theory and the experimental approach that
we have described in this chapter will lead to a better understanding of how human movement
sculpts the beauty and expressiveness of music.
By examining, with the aid of general tau theory, how musical communion between two people
is possible, we arrive at a juncture where the question of what is being controlled becomes more
important. Many professional musicians would say they do not think about their movements
at all, but they do think of the music they are playing. Their movements enact the intention or
emotion they experience and want to express. We have presented control in musical performance
as translating the closure of a physical gap, as in a bow glide, into the closing of a musical gap,
as in an intensity glide. However, it may be that what we really witness in musical performance
is the closing of emotional gaps that singers, instrumentalists and dancers convey by the way
they move.
The remarkable coherence we discovered in the kX,G graphs over time indicates a source of
meaning in music, immediately evident, for example, in jazz improvisation. How the motifs and
gestural rhythmic forms evolve in such social creative activities is a current research topic in the
Perception-Movement-Action Research Centre at Edinburgh University. With more comprehen-
sive analysis of different instruments, we expect it will provide further information about how
tau variables are manipulated to communicate ‘stories of meaning’ in music. A different piece of
the puzzle will be examined by research on dance. Studying not only how musicians create sound
and how listeners perceive their playing, but how dancers respond to those sounds, will enable us
to complete the study of the movement–music–movement cycle.
Acknowledgements
The research was supported by a grant from the University of Minnesota, and the writing by a
Leverhulme Trust Fellowship to the first author.
References
Baily J (1985). Music structure and human movement. In P Howell, I Cross and R West, eds, Musical
structure and cognition, pp. 237–258. Academic Press, London.
Bernstein NA (1967). The co-ordination and regulation of movements. Pergamon Press, Oxford.
Boersma P and Weenink D (2000). Praat 3.9: A system for doing phonetics by computer.
http://www.praat.org
Buccino G, Vogt S, Rotzl A, Fink GR, Zilles K, Freund H-J and Rizzolatti G (2004). Neural circuits under-
lying imitation learning of hand actions: An event-related fMRI study. Neuron, 42, 323–334.
Campbell M and Greated C (1994). The musician’s guide to acoustics. Oxford University Press, Oxford.
Camurri A, Hashimoto S, Ricchetti M, Trocca R, Suzuki K and Volpe G (2000). EyesWeb – toward gesture and
affect recognition in interactive dance and music systems. Computer Music Journal, 24, 57–69.
Camurri A, Leman M, Mazzarino B, Vermeulen V, Voogdt L De and Volpe G (2003). Relationship between
musical audio, perceived qualities, and motoric responses – a pilot study. In R Bresin, ed., Proceedings of
the International Stockholm Acoustic Conference 2003 (SMAC03), Stockholm, pp. 631–633. Royal Swedish
Academy of Music, Stockholm, Sweden.
Clarke EF (1988). Generative principles in music performance. In J Sloboda, ed., Generative processes in
music, pp. 1–27. Clarendon Press, Oxford.
Clarke EF (1999). Rhythm and timing in music. In D Deutsch, ed., The psychology of music, 2nd edn,
pp. 437–500. Academic Press, New York.
Clynes M (1973). Sentics: Biocybernetics of emotion communication. Annals of the New York Academy of
Sciences, 220(3), 55–131.
Dahl S and Friberg S (2003). What can the body movements reveal about a musician’s emotional intention?
Proceedings of Stockholm Music Acoustics Conference, pp. 599–602, August 6–9, Stockholm.
Deliège I (1987). Grouping conditions in listening to music: An approach to Lerdahl and Jackendoff’s
grouping preference rules. Music Perception, 4, 325–360.
Dogantan M (2002). Mathis Lussy: A pioneer in studies of expressive performance. Varia Musicologica, 1,
Peter Lang, Bern, Switzerland.
Donald M (1999). Preconditions for the evolution of protolanguages. In MC Corballis and SEG Lea, eds,
The descent of mind: Psychological perspectives on hominid evolution, pp. 138–154. Oxford University
Press, Oxford.
Fourcin AJ (1981). Laryngographic assessment of phonatory function. In SHA Report 11, pp. 116–127.
The American Speech Language Hearing Association, Maryland.
Friberg A, Sundberg J and Frydén L (2000). Music from motion: Sound level envelopes of tones expressing
human locomotion. Journal of New Music Research, 29(3), 199–210.
Galembo A, Askenfelt A, Cuddy L and Russo F (2001). Effects of relative phase on pitch and timbre in
piano bass range. Journal of the Acoustical Society of America, 110(3), 1649–1666.
Gibson JJ (1966). The senses considered as perceptual systems. Houghton Mifflin, Boston, MA.
Hagendoorn I (2004). Some speculative hypotheses about the nature and perception of dance and
choreography. Journal of Consciousness Studies, 11, 79–110.
Hanna JL (1979). To dance is human. A theory of nonverbal communication. University of Texas Press,
Austin, TX.
Iyer V (2004). Improvisation, temporality and embodied experience. Journal of Consciousness Studies,
11, 159–173.
Keysers C, Kohler E, Umilta MA, Nanetti L, Fogassi L and Gallese V (2003). Audiovisual mirror neurons
and action recognition. Experimental Brain Research, 153, 628–636.
Krumhansl CL (1996). A perceptual analysis of Mozart’s Piano Sonata, K. 282: Segmentation, tension and
musical ideas. Music Perception, 13, 401–432.
Krumhansl CL (2002). A link between cognition and emotion. Current Directions in Psychological Science,
11, 45–50.
Lashley KS (1951). The problem of serial order in behavior. In LA Jeffress, ed., Cerebral mechanisms in
behavior: the Hixon symposium, pp. 112–136. Wiley, New York.
Lee DN (1976). A theory of visual control of braking based on information about time to collision.
Perception, 5, 437–459.
Lee DN (1998). Guiding movement by coupling taus. Ecological Psychology, 10(3/4), 221–250.
Lee DN, Craig CM and Grealy MA (1999). Sensory and intrinsic coordination of movement. Proceedings of
the Royal Society of London, Series B, 266, 2029–2035.
Lee DN, Georgopoulos AP, Lee TM and Pepping G-J (submitted). A neural formula that directs movement.
MacLean PD (1990). The triune brain in evolution: Role in paleocerebral functions. Plenum Press, New York.
MacNeilage PF (1999). Whatever happened to articulate speech? In MC Corballis and SEG Lea, eds,
The descent of mind: Psychological perspectives on hominid evolution, pp. 116–137. Oxford University
Press, Oxford.
Malloch S (2001). Timbre and technology: An analytical partnership. Contemporary Music Review,
19, 155–172.
Malloch S (2005). Why do we like to dance and sing? In R Grove, C Stevens and S McKechnie, eds, Thinking
in four dimensions: Creativity and cognition in contemporary dance, pp. 14–28. Melbourne University
Press, Melbourne.
Malloch S, Sharp D, Campbell DM, Campbell AM and Trevarthen C (1997). Measuring the human voice:
Analysing pitch, timing, loudness and voice quality in mother/infant communication. Proceedings of
The International Symposium of Musical Acoustics, Edinburgh, 19–22 August 1997, Vol. 19, Part 5,
pp. 495–500. Curran Associates, Redhook, New York.
Mitchell RW and Gallaher MC (2001). Embodying music: Matching music and dance in memory.
Music Perception, 19, 65–85.
Molinari M, Leggio ML, DeMartin M, Cerasa A and Thaut MH (2003). Neurobiology of rhythmic motor
entrainment. Annals of the New York Academy of Sciences, 999, 313–321.
Panksepp J and Bernatzky G (2002). Emotional sounds and the brain: The neuro-affective foundations of
Peretz I and Zatorre RJ (2005). Brain organization for music processing. Annual Reviews of Psychology,
56, 89–114.
Schlaug G (2001). The brain of musicians: A model for functional and structural adaptation. Annals of the
New York Academy of Sciences, 930, 281–299.
Schmidt RC, Carello C and Turvey MT (1990). Phase transitions and critical fluctuations in the visual
coordination of rhythmic movements between people. Journal of Experimental Psychology: Human
Perception and Performance, 16, 227–247.
Schögler BW (1998). Music as a tool in communications research. Nordic Journal of Music Therapy,
7(1), 40–49.
Schögler BW (1999). Studying temporal co-ordination in jazz duets. Musicae Scientiae (Special Issue
1999–2000), 75–92.
Schögler BW (2003). The pulse of communication in improvised music. In R Kopiez, AC Lehmann,
I Wolther and C Wolf, eds, Proceedings of the 5th Triennial European Society for the Cognitive Sciences of
Music Conference, 8–13 September, Hannover University of Music and Drama, Germany. Hannover
University of Music and Drama, Hannover, Germany.
Schögler B, Pepping G-J and Lee DN. TauG-guidance of transients in expressive musical
performance. Experimental Brain Research (in press).
Scholes PA (1960). The Oxford companion to music. Oxford University Press, London.
Shove P and Repp B (1995). Musical motion and performance. Theoretical and empirical perspectives.
In J Rink, ed., The practice of performance, pp. 55–83. Cambridge University Press, Cambridge.
Stevens C, Malloch S, McKechnie S and Steven N (2003). Choreographic cognition. The time-course
and phenomenology of creating dance. Pragmatics and Cognition, 11, 276–326.
Todd NP (1994). The kinematics of musical expression. Journal of the Acoustical Society of America,
97, 1940–1949.
Trevarthen C (1999). Musicality and the intrinsic motive pulse: Evidence from human psychobiology
and infant communication. Musicae Scientiae (Special Issue 1999–2000), 155–215.
Zatorre RJ and Krumhansl CL (2002). Mental models and musical minds. Science, 298, 2138–2139.
Chapter 7
The neuroscience of emotion in music
Jaak Panksepp and Colwyn Trevarthen
7.1 Overture: why humans move, and communicate, in musical–emotional ways
Music moves us. Its rhythms can make our bodies dance and its tones and melodies can stir
emotions. It brings life to solitary thoughts and memories, can comfort and relieve loneliness,
promote private or shared happiness, and engender feelings of deep sadness and loss. The sounds
of music communicate emotions vividly in ways beyond the ability of words and most other
forms of art. It can draw us together in affectionate intimacy, as in the first prosodic song-like
conversations between mothers and infants. It can carry the volatile emotions of human attach-
ments and disputes in folk songs and grand opera, and excite the passions of crowds on great
social occasions.
These facts challenge contemporary cognitive and neural sciences. The psychobiology of music
invites neuroscientists who study the emotional brain to unravel the neural nature of affective
experience, and to seek entirely new visions of how the mind generates learning and memory—
to reveal the nature of ‘meaning’. The science of communicative musicality—the dynamic forms
and functions of bodily and vocal gestures—helps us enquire how the motivating impulses of
music can tell compelling stories. This also leads us to ask how human music relates to our
animal emotional heritage and the dynamic instinctual movements that communicate emotions.
Research on the emotional systems of animals is bringing us closer to explanations of the still
mysterious aspects of human affective experiences, and hence the emotional power of music.
Through unfathomed neurochemical responses in the brain, the sounds of music can bring joy
and dull the jab of pain, as endogenous opioids and many other affective chemistries
are recruited in musically entrained minds (Panksepp and Bernatzky 2002; Panksepp 2005c).
With the aid of animal research on these emotional systems and how they interact with our
cognitive abilities, we may find a new view of communicative musicality as a form of playful and
endlessly inventive social behaviour that helps build, epigenetically, the social brains of our
children, to facilitate mental and physical health and learning permeated by prosocial affects
(Panksepp 2001, 2007b).
We review the evidence that the emotions of animals, their neurochemical basis and the
body movements and vocalizations by which they are expressed, may reveal the deep sources of
musical feeling in human beings. They help explain how music supports our social life, and how
our musical preferences can define our ‘identity’ in society. We relate the comparative neuro-
science of emotions to the rhythmic musicality of mother–infant communication. By considering
the effects of brain injuries and genetic disorders in children’s brain development, we
conclude that the acquired cognitive appreciation of music depends upon subcortical emotional
systems that uniquely engage the cerebral hemispheres. We note the teaching and healing powers
of music, and relate communication of emotions in music to motives for cultural learning,
especially to the evolution of language acquisition.
7.1.1 Musical meaning: the motives of musical culture

Although we can find expressive actions in animals that have rhythm and affective tone, the art
of music, as most of us understand it, is uniquely human. We thrive in unique dancing, symbol-
seeking, storytelling ways. Our communicative displays and performances exhibit the special
multilevel time of human moving, with stepping feet, gesturing hands and visible and audible
messages of exceptionally quick and versatile eyes, face and vocal system, all obedient to the slower
rhythms of the whole gliding, swaying body, that becomes adept at generating affective narratives
(Trevarthen 1999; see also Osborne, Chapter 15, this volume). Music is, from the beginning of a
child’s development, the polyrhythmic sound of the human body in adventurous and creative self-
possessed activity that loves to communicate what it imagines. The times of its compositions span
the range from fractions of a second to many minutes or hours (Kühl 2007; Trevarthen 2008a). As
development proceeds, emotions come to express moments of wonder, longing, joy, rage, pride,
fear or gentle affection, and these are woven into complexes of rhythm and melody that may
become unforgettable and precious memories in the art of an historic culture, inseparable from
rituals of many kinds that rule the way humans come to move and feel together (Turner 1982;
Blacking 1995). We acquire intense and lasting preferences for particular pieces or kinds of music.
Thus, music motivates our loyalties and place in society (MacDonald et al. 2002).
The precise relationship between emotions and cognitions in music remains obscure and
controversial in modern psychology. Nevertheless, it is evident that, before its ‘disembedded’
information and structure is perceived with the aid of symbols and analytic thought, we know
music first as lived and felt experience in the body, ‘embedded’ in intersubjective and cultural
dynamics. This leads some who place high value on articulate rationality and symbolic commu-
nication in language to dismiss music as frills—inessential experiences, merely for entertain-
ment. These scholars appear not to appreciate the real, everyday power of music to move us, and
our need to share it (Sacks 2006).
Music blends readily with so many important dimensions of human lives in community,
suggesting, on the contrary, it evolved as a prime facilitator of our social communication,
our learning and the creation of cultural meaning (Blacking 1976, 1988, 1995; Bjørkvold 1992;
Schubert 1996; Wallin, Merker and Brown 2000; Donald 2001; Mithen 2005; Cross 1999, 2007;
Kühl 2007). Far from being an inconsequential emotional side-effect of the rational and
informative consciousness of a Cartesian thinker, appreciating order in the measured tones of
Apollo’s lyre, the emotional response to a musical narrative is a cerebral force that drives
and shapes memory, imagination and thinking in passionate Dionysian social encounters, as
well as one for recreative escape from demanding tasks and enjoyment of good stories
(Freeman 2000).
There are certain aspects of the so-called ‘inner life’—physical or mental—which have formal properties
similar to those of music—patterns of motion and rest, of tension and release, of agreement and
disagreement, preparation, fulfilment, excitation, sudden change, etc.
Langer (1942, p. 228), quoted by Kühl (2007, p. 223)
Musical art is cultivated from the primary energy of human meaning-and-pleasure-seeking
action, on the syntactic time base of which also rest the logical processes of reasoning and the
mercurial references of language. That is how musical meaning is created and transmitted, and
why it means so much.
Musical forms, it is true, readily become associated with particular cognitive representations
or meanings in their enactive contexts, which Kühl (2007, p. 50) calls ‘ur-semantics’. Its melodies
and harmonies evoke images as well as emotions—remembered scenes and persons, adventures,
reveries and relationships. It recalls special moments, an ‘art of times’ charged with emotions of
interpersonal relating (Imberty 2000, 2005). But these important outward manifestations of
musical meaning are never subordinate to, or just made of, any attachment to external particulars,
even when its forms of expression are obedient to elaborate conventions of a musical literature,
the score. Music remains ‘about itself ’, and the emotions in its making (Trevarthen 2008b; see
Cross and Morley Chapter 5, and Rodrigues, Rodrigues and Correia Chapter 27, this volume).
Every human brain senses musical–emotional meanings many months before it becomes
a facilitator of linguistic–propositional signs. For a child, musical expression is as natural as
moving itself. If anything in the higher human brain has a genetically preordained evolutionary
history, it is the fundamental urge to communicate in the temporal cadences of emotional
movements, with endlessly creative protomusical dynamics, helping us understand why music is so
widely treasured. At its very roots, musicality is part of our particular animal nature, an
emotional–cognitive heritage that can be reconstituted into an endless variety of cultural pastries.
Musical meaning seems, indeed, to be the evolutionary and ontogenetic parent of linguistic terms
and functions (Brown 2000, Mithen 2005), not just a frivolous younger relative. The first coos and
word-like babblings of babies have musical/poetic structure matched by the intuitive parental
encouragement that is attuned to them (Papoušek and Papoušek 1981, Stern et al. 1985, Papoušek
1994, Miall and Dissanayake 2003). In song, music is a natural partner for the cadenced language of
poetry, the prosodic expression of emotions in dynamic intersubjective synchrony that Ivan Fonagy
(2001) calls ‘languages within language’, the ‘distant past still present in live speech’.
Language in its complexity enables the speaker to reflect mental content belonging to different phases of
his emotional and intellectual development. In each creative verbal act—and in a wider sense, all speech
acts are creative—we must first descend to deeper, thus earlier, ontogenetic and phylogenetic layers.
Fonagy (2001, pp. 687–688)
Moreover, the style and rhythmic pattern of music can define a person, role or group in society
(MacDonald et al. 2002). Its rhythmic cycles facilitate the joint accomplishment of routine chores
on which, we assume, the social cooperation that fostered the agricultural revolution in human
prehistory was based, along with any talk about what was done. Once basic bodily needs are ful-
filled, music, with dance, often becomes a central cultural passion that helps close-knit groups of
humans move and think together, sharing pleasurable narrations of vitality (Becker 2004;
Benzon 2001; Schubert and McPherson 2006). The power of particular configurations of music,
especially favourite or loved pieces, to inspire, heal or teach proves that musical expressions, and
their communication, can engage the core mechanisms of the brain that regulate well-being in
body and mind, and that guide the formation of self-confident associations and memories in
affectionate relationships (Pratt and Spintge 1996; Pratt and Grocke 1999; Peretz and Zatorre
2003; Klockars and Peltomaa 2007; Osborne, Chapter 25, this volume).
Whatever our evolutionary story may have been, we believe the instinctual–emotional core of
the brain, essential for all these phenomena of music, has ancestry in the minds of living animals.
Evidence from human brain mapping highlights how emotionally moving music resonates
robustly within subcortical emotional circuits homologous with those of other species (Blood
and Zatorre 2001). Even if humans are the only species that makes and appreciates music, we find
that the rhythms and basic sounds of musicality are evident in the sometimes long and intricate
social displays of other animals (Rogers and Kaplan 2000; Wallin, Merker and Brown 2000,
Section II). That, too, opens new paths to a neuroscience of music.
7.1.2 A possible evolution of human musicality

We believe that the evolutionary roots of musicality must lie in the repetitive rhythms and emo-
tions at the source of moving. Especially important are the emotional sounds by which birds and
mammals communicate. We mammals are social creatures who depend critically on resonance
with the inner purposes and concerns of others. At times we need to call on others for help,
request affective sustenance and companionship, and share the essential tasks of mating and care
for defenceless young. Musical dynamics resemble the dynamics of emotive movements and
feelings evident in the ritual behaviours of communicating animals (Darwin 1872; Tinbergen
1951; MacLean 1990; Rogers and Kaplan 2000). Throughout our evolutionary journey, as in our
infancy, the sounds of emotions connected us and guided our relationships and collaborations,
and our actions in harmonious, joyful play, and in conflict or distress (Dissanayake 2000;
Mithen 2005).
We, along with Merlin Donald (2001) and Ole Kühl (2007), believe that an innate rhythmic
musicality provides the prosodic background for all of our lyrical urges to communicate
affectively, however artificial and sophisticated the techniques of that communication and
its symbolization may become; however disembodied or abstracted from bodily activity and its
perceived context. Young children and adolescents advertise their enthusiasm for life, their cre-
ativity, self-confidence, longing and sociability, in displays of musical exuberance (Bjørkvold
1992; Miller 2000; MacDonald et al. 2002; Miell, MacDonald and Hargreaves 2005; Custodero
2005; Schubert and McPherson 2006). An infant just six months old can demonstrate infectious
glee in the performance of the actions of a clapping song, looking for affectionate praise
(Trevarthen 2002; and see Gratier and Danon, Chapter 14, this volume), and a musical joke
can be appreciated, indeed performed, two months before that (Stern 1990, 1999; Malloch 1999,
p. 47), when an infant is beginning to show flirtatious coyness (Reddy 2003; Reddy and
Trevarthen 2004). Untold young children have expressed their epic affective imaginations in
dramatic song, with no audience but Mother Nature.
It appears likely, therefore, that human musicality evolved as an evolutionary exaptation of
social–emotional systems that became the medium by which our ancestors harmoniously coordi-
nated not only intimate engagements, but also ambitious group activities, as in hunting large and
dangerous animals, harvesting crops, defending small family groups in a hostile world, and
teaching the young (Cross, 2007; Brandt Chapter 3, Cross and Morley Chapter 5, this volume).
Ellen Dissanayake suggests that the intimate communication between helpless but intelligent
infants and their sensitive and responsive mothers was the place where human musicality first
grew, and that this was the source of other cooperative social uses of expressive behaviour
(Dissanayake 2000; Dissanayake, Chapters 2 and 24, this volume). These possibilities (Roederer
1984; Storr 1992; Wallin, Merker and Brown 2000, Sections III and IV; Mithen 2005) may forever
be lost in a psycho-evolutionary past, the traces of which cannot be deciphered with any certainty
(Wallin 1991; see Cross and Morley, Chapter 5, this volume). However, the study of the
emotional dynamics of other living creatures may eventually help reveal the affective evolutionary
and developmental foundations of human musicality.
7.1.3 The comparative psychobiology of musicality

First, we assume that the intuitive affective responses of humans to music involve brain processes
still active in other living animals. Thus, most mammals and birds may exhibit protomusicality, and
we propose that we might be able to probe their living brain systems and emotional activities as
Rosetta stones to decipher our psycho-evolutionary past in detail (Panksepp 1998a, b, 2003; see
Merker Chapter 4, Cross and Morley Chapter 5, and Turner and Ioannides Chapter 8, this volume).
For instance, the brain mechanisms for birdsong appear to grow like the brain mechanisms for
practice and learning of speech in children. Research on the parts of the brain of a bird necessary
for the performance, hearing and learning of song has revealed that young birds must hear and
experiment with motor performances of their own song if they are to learn from a tutor how to
maintain the more elaborate adult song and how to discriminate the songs of other individuals,
skills that establish them as mature members of a community (Marler and Doupe 2000). This
active vocal learning of expressions of self and others, and its brain mechanisms, has been
compared to an infant learning to speak (Doupe and Kuhl 1999). In neither case is the neural
mechanism of motivation for subjective and intersubjective regulation of emotional expression
in the voice well understood.
Evidence from experiments designed to clarify the associations between core emotional
processes and conscious experiences of music supports the ideas of Clynes (1977, 1995) and
Krumhansl (1997) concerning the appreciation of ‘sentic forms’, i.e., sensed movement shapes,
organized temporally in the brain, with different emotional force and narrative significance.
These forms appear to obey universal dynamic principles of emotional movements, and they
regulate the power and efficiency of instinctual actions in animals (Clynes 1982; Lee et al. 1999;
Lee 2005; Schögler and Trevarthen 2007; Lee and Schögler, Chapter 6, this volume). All healthy
animal movements are rhythmic, with qualities of power and grace. These features become
accentuated in emotional communication, most conspicuously in human music.
Animals and humans communicate with motor organs that are adapted for both regulation of
selective consciousness by orienting and focusing special sense organs, and for emotional communi-
cation (Trevarthen 2001). Especially expressive of human intentions, interests and feelings are the
hands, eyes, face and voice—movements which give conscious guidance to volatile inner states or
‘motives’ (MacLean 1990; Scherer 1986; Goldin-Meadow and McNeill 1999; Zei Pollermann 2002).
Thus, while accepting the greater complexity of our human actions and experience, we would relate
human song and music making, as well as the rhythmic rituals of communication by gesture and
dance, to the instinctive affiliative calls, vocal expressions of passion and displays of intentions in
body movement of other highly social animals (Darwin 1872; Wallin, Merker and Brown 2000,
Section III; see Dissanayake Chapter 2, Brandt Chapter 3, Merker Chapter 4, and Pavlicevic and
Ansdell Chapter 16, this volume).
It is through melodic emotional intonations that our species first masters one of its most
productive skills—how to chat (Bateson 1979; Trevarthen 1974, 1993, 1998, 2005). In the
last trimester of gestation preferences for and recognitions of rhythmic and melodious vocaliza-
tions emerge, long before any sense is made of propositional speech (Fifer and Moon 1995;
Lecanuet 1996). The development of musical responsivity can be monitored by musically
induced changes in spontaneous rhythmic movements of infants (Condon and Sander 1974;
Condon 1979; Trevarthen 1999; Mazokopaki and Kugiumutzakis, Chapter 9, this volume),
as well as by recording accompanying autonomic changes. For instance, Chang and Trehub
(1977) report that before infants are six months old, they exhibit heart-rate changes
when melodic contours are changed. Zentner and Kagan (1996) report visual avoidance in
4-month-old infants exposed to dissonant harmonic stimuli, which is not evident in response to
consonant stimuli.
Young babies are riveted by the melodious flow of a mother’s lullaby, and their bodies become
entrained to musical rhythms. They are profoundly sensitive to the contingency and authenticity
of a communicative partner’s rhythm of expression, and to the sympathy of the feelings
expressed by gestural movements and in tone of voice (Murray and Trevarthen 1985; Nadel et al.
1999; Robb 1999, Trevarthen 1995, 2005; Powers and Trevarthen Chapter 10, Marwick and
Murray Chapter 13, Gratier and Danon Chapter 14, this volume).
7.1.4 Why, then, is human ‘musicality’ so mysterious, and such a new idea? How have we deluded ourselves?

Some writers have asked why the essential emotional musicality of our nature has been so
neglected in scientific analysis of human life and the human mind and brain (Cross 1999, 2003;
Mithen 2005). Such blind spots in our established scientific ways of thinking may reflect
the over-intellectualized views of mind engendered by the computer model-based cognitive
revolution. The primal embodiment of human communication seems largely forgotten, except in
recent developmental studies. This culturally promoted neglect contrasts with the more cohesive
intersubjective social practices and beliefs in the East (Becker 2004) and in Africa (Blacking 1988,
Frøshaug and Aahus 1995; see Woodward and Bannan Chapter 21, and Rodrigues, Rodrigues and
Correia Chapter 27, this volume). Likewise, much modern brain science is not concerned with
the affective mechanisms of the brain that give music meaning.
To overcome this bias, a plausible scientific theory of innate or intuitive musicality is needed.
Thus, we advance the view that human musicality represents a distinct motive system, related to
the imaginative capacity of other social animals for cooperative moving and practical affective
living in community. Among humans, this motive energizes a parable-making imagination of
mind that Turner (1996) calls ‘literary’, but which is equally ‘poetic’, ‘musical’ or ‘theatrical’
(Brandt Chapter 3, and Cross and Morley Chapter 5, this volume).
Brown (2000), Donald (2001) and Stokoe (2001) have proposed that mimesis—the expressive
use of the body and gesture to portray recalled or imagined thoughts and experiences as projects
in engagement with the world—was the first truly human form of communication. The symbolic
power of the art of gestural rhythm and lyrical–affective expressions are clearly evident in
traditional classic Indian dance forms (Hejmadi et al. 2000), and are nascent in the performative
talents of toddlers with their highly energetic bodies (Bjørkvold 1992; Custodero 2005).
This dynamic, fully human mode of liveliness, with its prosocial humour and joy, deserves to
be recollected and revalued in intellectualized subcultures of the West (Bjørkvold 1992;
Frøshaug and Aahus 1995). Without appreciating this socializing aliveness, we may generate
inadequate visions of the human brain, and hence the human mind, which may promote cultural
degeneration.
7.2 The socio-emotional psychobiology of music

7.2.1 Animal calls and mother–infant dialogues are musical, and serve learning
Two phenomena indicate that the neurobiological foundations of music are social–emotional.
First, music-like vocal communications between animals serve essential social functions. In many
species, the amount of vocal activity that has evolved to advertise sexual attractiveness and
generate social arousal is phenomenal (Bradbury and Vehrencamp 1998; Hauser 1996). Second,
the emotional use of the voice is the primary medium by which mothers coax their babies into
the human cultural world. Here, there is something new, beyond the regulation of sexual
reproduction and ‘inclusive fitness’.
The melodious chat of infant directed speech and dancing musical games and songs requires
the invention and learning of rhythms, affective melodies, and intersubjective harmonies that tell
of possible adventures, and that connect an infant’s awareness to the cultural and epigenetic
history of its parents’ community going back many generations (see Gratier and Danon,
Chapter 14, this volume). Their rhythms and prosodic sounds, charged with emotion, are made
meaningful in imaginative ways. Infants respond to their mothers precisely with their own vocal
expressions in ‘protoconversations’. Even on their own, without adult assistance, infants in groups
can express their feelings and can regulate the drama and meaning of their relationships in
musical ways (Bradley, Chapter 12, this volume).
A momentous affective–cognitive transition was achieved in the evolution of mind when
mothers and infants started to communicate in this way (Fernald 1992a, b; Papoušek et al. 1991;
Dissanayake 2000). This affect-sharing is well-described as the ‘cradle of thought’ (Hobson
2002). It is likely to have evolved in association with the executive thinking required to regulate
the use of clever hands, and their incorporation into gestural communication of subtle tides of
self-sensing and intention (Donald 2001; Pollick and de Waal 2007). Indeed, newborn infants
make many complex and highly expressive hand movements, which remain, as yet, little studied
(Trevarthen 1986; Rönnqvist and Hofsten 1994).
The ancestral dynamics motivating the social communication of vitality and well-being in
animal-made sound may be essential preparation in our brains for the emergence of human
communicative musicality and musical meaning, and then for language (Kühl 2007; Brandt
Chapter 3, Cross and Morley Chapter 5, this volume). Even rats and mice produce emotionally
attractive sounds during joyous, playful and sexual interactions (Panksepp and Burgdorf 2003;
Holy and Guo 2005). Our close ancestors lived in arboreal canopies where sound was an efficient
way to coordinate cohesive group activities, to periodically reinforce and re-establish social
bonds, and to sustain dominance/submission patterns in ways that minimized physical
injury (Hauser 2000, Seyfarth and Cheney 2003; also see Richman’s [1987] descriptions of the
social ‘songs’ of Gelada monkeys, and the report on the songs of gibbons recorded by Merker and
Cox [1999]).
Although the emotional calls of other primates are more stereotyped than those of humans
(Marshall and Marshall 1976; Hauser 2000), they have evolved to mediate subtle messages of vital
importance in the maintenance of cooperative groups (see the social functions of the nuanced
calls of vervet monkeys revealed in the work of Cheney and Seyfarth 1990). They transmit infor-
mation for the collective awareness of opportunities or dangers, as well as individual identity and
rank in the group (Seyfarth and Cheney 2003; Merker, Chapter 4, this volume). Such primal
urges for prosodic social communication may have served as an essential foundation for the
evolution of human musicality that led to language.
Musicality certainly both aids learning of non-musical skills, and uses them. New information is
comparatively easily acquired when encoded in affective musical forms (Panksepp and Bernatzky
2002; Shepard 1999)—an effect evident even in profoundly retarded children (Farnsworth 1969;
Merker and Wallin 2001; Wigram and Elefant, Chapter 19, this volume). Musical training can
apparently strengthen brain functions at many morphological levels (Schlaug et al. 2005; Turner
and Ioannides, Chapter 8, this volume), including the subcortical systems for hearing speech and
music (Musacchia et al. 2007). Indeed, the bodily movement facilitated by music, as well as the
accompanying autonomic and mood changes, show how powerfully music can engage with the
core regulation of complex movements, conscious awareness, concepts and memories useful for
survival in a human community. Affective states reflect the energetic states, or ‘vitality affects’
(Stern 1999), of the nervous system that are not explained by information-processing metaphors
(Ciompi and Panksepp 2005; Panksepp 2005a, b).
The priority of dynamic ‘relational affects’ (Stern 1993) over cognition remains a salient
aspect of our personal and cultural lives. As Adam Smith (1777/1982) said, musical sounds, with
theatre or dance, are ‘our most pleasurable inventions’, and many tens of thousands will seek
to commune in musical environments that the socially talented can create. By contrast, serious
cognitive communications will rarely attract a massive audience, even if the speaker is eminent in
the field.
7.2.2 Human musicality and physical exuberance of interactive play: important distinctions

One fundamental brain process for helping weave individuals into the social fabric among
mammals, and some birds, is social play (Bekoff and Byers 1998; Burghardt 2005; Panksepp
1998a). Rough-and-tumble physical play and/or chase-and-dodge teasing play are intrinsic,
experientially refined faculties of every mammalian species, which help promote social affilia-
tions and epigenetic development of fully social brains.
In the first formal experimental analysis of physical play of children, without the biasing
presence of material toys, we investigated whether the presence of music (in this case joyous Irish
jigs) would energize this most fundamental form of social engagement (Scott and Panksepp
2003). Surprisingly, it did not. Neither did it facilitate laughter, even though an intensification of
various dancing locomotor effects was evident. Thus, primitive physical playful engagements
were not unconditionally facilitated, but expressive dancing movements were (see Mazokopaki
and Kugiumutzakis, Chapter 9, this volume). Music was apparently communicating at a different
level from the primitive childhood urges for physical play, perhaps partly because such play is
largely subcortically organized (Panksepp et al. 1994).
However, there are many forms of play. It seems music is a medium by which young
humans first indulge in a potentiality for the drama of imaginative and imitative symbolic play
(Turner 1974, 1982), for the art and story-making imaginativeness in self-expression that may be
unique to our species, and that may be first evident in mother–infant sing-along interactions
(Dissanayake 2000; Eckerdal and Merker, Chapter 11, this volume). This encourages us to make
two critical distinctions concerning the complexity of human emotionality and musicality—
distinctions that mark an evolutionary advance from the primal sociability of other mammals.
First, musicality seems directly linked to the highly articulated adventurousness of thought
in metaphor that guides all forms of human acting and attending, which is enhanced aestheti-
cally by intrinsic emotions. Second, and secondarily, music seeks the complexity of learned ritual
and acquires narrations of rhythm and melody (Kühl 2007; Trevarthen 2008b), eventually
involving artificial subtleties of technique and remembering or ‘notating’ (Brandt, Chapter 3;
Merker, Chapter 4, this volume). If so, we again must consider musicality to be a profoundly
foundational and evolutionarily rooted aspect of Art, and of what is uniquely cognitive and
rational in human nature—a liminal or transitional activity working creatively between physical
exuberance and rationality (Turner 1983).
7.2.3 Musical passions: Intrinsic emotional assessment of the communicated meanings or values of moving

All integrated actions and interests of animals have emotional control, balancing the anticipated
risks and benefits of behaving. These emotional valuations have evolved settings of sensitivity
and expressive form that define a set of adaptive affective neural systems, operating with different
specialized neurotransmitter codes; these systems are integrated with the bodily hormone-
producing systems that diffuse information through the body to regulate and protect the vital
state of its somatic and visceral organs (Panksepp 1998a).
Diverse affective neural systems are the regulators of physiological resources and prospective
awareness in control of moving and sensing in the present (Bernstein 1967; Jeannerod 1994;
Lee 1998, 2005; Lee and Schögler, Chapter 6, this volume). The motor regulating system, with its
emotions, animates the recalling of experiences of moving, the imagining of an active future, and
the dreaming of impossible reflections on the intentions and emotions of moving (Solms 1997;
Levin 2004). It links the time-keeping functions and neurodynamics of brainstem dopamine and
other emotional action systems (Holstege et al. 1996) with the more cognitively conscious
environment–sensing–learning and thoughtful regions of the forebrain.
The felt emotions of music probably arise in this motivating core of the brain, which includes
the basic emotional brain systems that have been identified using localized electrical stimulation
of various mammalian brains (Panksepp 1998a). These emotional networks mediate receptive
SEEKING, FEAR of danger, RAGE when access to resources is compromised, LUST to ensure
reproduction, CARE to assure nurturance of the young, PANIC/separation distress by which
young signal caretakers when they are lost, physical PLAY to promote exercise of social skills and
to build prosocial brains, all interacting with a core SELF that helps represent a primordial
neurosymbolic representation of the body within the brain. The last four social emotions are
especially important in promoting and shaping creative communal activities (Panksepp 1998a).1
These systems promote all instinctual actions and valuation of the world. At the psychological
level, they reflect our basic affective capacities—the ability to engage in the world with great
interest, to become enraged if our freedom of action is limited, to become scared if our actions
lead to personal harm, to feel lust, care and joy for social engagement, and to feel the sting of pain
when losing things we value, especially people we love (Panksepp 1998a, 2003c). All of these
feelings, or shades of them, can be evoked by music.
Many recent findings on how music modulates or recruits activity in the human brain
(recently reviewed by Peretz and Zatorre 2003) are outside the scope of this chapter, since they
have not sought to identify dynamic emotional effects and regulations in the brain, but have been
exclusively concerned with practical musical perception and cognition and the acquisition of
musical skills.
Even as certain rhythmic elements of music that convey affective change are being clarified
(Gabrielsson 1995; Peretz et al. 1998), only preliminary evidence from human brain imaging
informs how brain emotional changes facilitate music appreciation (Blood and Zatorre 2001).
We may need to categorize emotions differently, taking into consideration the many emergent
emotions that arise when basic emotions interact with cognitive structures. The language-based,
state-describing ‘categorical affects’ (such as happy, sad, angry, disgusted) may not optimally
define the emotions experienced in music. It is also necessary to focus on the dynamic and the
relational affects that characterize the sensations of energy and grace in the movement and inter-
personal messages of emotionally charged gestures (Stern 1993, 1999).
Music communicates feeling qualities, transmitted with distinctive activation contours, which
are, ‘captured by such kinetic terms as ‘crescendo,’ ‘decrescendo,’ ‘fading’, ‘exploding,’ ‘bursting,’
‘elongated,’ ‘fleeting,’ ‘pulsing,’ ‘wavering,’ ‘effortful,’ ‘easy,’ and so on’ (Stern 1993, p. 206). In
Stern’s terms, these give ‘vitality forms’ to the emotions, which are probably homologous with the
‘sentic forms’ described in musical expression of more categorical feelings by Manfred Clynes
(Clynes 1995; Clynes and Nettheim 1982). These can be related to the affective neuroscience
of basic emotions (Panksepp 1998a). Some musical intervals are consistently perceived by
listeners, who are asked to put words to their emotional qualities, as ‘monotonous’, ‘sad’, ‘joyful’,
‘disharmonious’, ‘calm’, ‘whole’, ‘resolute’, etc., terms which can be related to different ways the
body moves—‘up–down’, ‘out–in’, ‘tense–relaxed’, ‘repulse–receive’, ‘asymmetric–symmetric’,
‘straight–round’, ‘cheerful–gloomy’ (Krantz 2007). The musical intervals clearly evoke the same
ideas of different qualities or efforts of moving in different listeners.
1 Capitalizations designate basic emotional systems, to (i) avoid part/whole confusions, (ii) alert readers to
the claim that these may be necessary brain systems for those types of emotional behaviours and feelings,
although by no means sufficient for all of the emotional manifestations that may arise from those systems
in real-world activities, and (iii) highlight that specific psychobehavioural brain systems are the referents
of these labels.
Although there is evidence from electroencephalography (EEG) that brain activity is affected
by music (e.g., Petsche 1996, Sarnthein et al. 1997), we still know little about how affective prop-
erties of music modify human brain activity (e.g., Hodges 1995; Panksepp and Bekkedal 1997;
Panksepp and Bernatzky 2002). Presumably, links will be found between the regulatory evalua-
tions of emotions and the rhythmic modulations of motor activity, with their multimodal
prospective perceptual control (Lee and Schögler, Chapter 6, this volume). It is clear that a wide
array of brain activities become involved in the production and perception of music (Kühl 2007;
Turner and Ioannides, Chapter 8, this volume).
7.2.4 Musical narrative or ‘adventure’: the intentional core of the brain, and its time

Rhythmic processes, paced by adaptable biological clocks that respond to environmental
contingencies, are a conspicuous feature of organized tissues, including neural nets, and espe-
cially in the regulation of vital functions serving metabolism and the energy economy of the
body—most obviously in beating of the heart and breathing, but also in the control of diurnal
rhythms of sleep and wakefulness, and anticipations of the changing seasons (Bernardi and
Sleight 2007). All animals locomote by moving their bodies rhythmically, and the rhythms of
flying birds, scurrying mice, plodding elephants, walking, trotting or galloping horses, or gibbons
swinging between the trees invite imitation in music.
Music depends upon the rhythmic measure of expressive movements in time, and the tensions
created by combining rhythms (Osborne, Chapter 15, this volume). The ‘architecture’ and ‘narra-
tion’ of moving psychological time is manifested in the measured rhythms of human action,
experience and communication, real or imagined—with its emotional qualities and their relation
to vital functions of the body (Trevarthen 2008a). These psychobiological processes are measured
in three bands or ranges of physical or scientific time: (1) for the felt and imagined ‘extended
present’ (from 10 seconds to years); (2) for the conscious ‘psychological present’ (Stern 2004),
with its rhythmic motor control coupled to the physiological rhythms of breathing and varia-
tions in heart rate (0.3 to 7 seconds); and (3) for ‘reflex experiences’ and ‘just noticeable
differences’ too fast to be regulated by movements that are prospectively controlled in awareness
(5 to 200 milliseconds). (For detail and the sources of this information see Trevarthen 1999).
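As a rough illustration only, and not part of the cited analysis, the three bands quoted above can be written as a small lookup. The sketch below is in Python. The names BANDS and time_band are hypothetical, the limits are simply the figures given in the text, and the open upper limit standing in for 'years' is an assumption. Durations that fall between the quoted ranges (8 seconds, for example) are reported as unclassified rather than forced into a band.

```python
from typing import Optional

# Band limits in seconds, taken from the ranges quoted in the text.
# The upper limit of the 'extended present' ("years") is left open (assumption).
BANDS = [
    ("reflex experience", 0.005, 0.2),
    ("psychological present", 0.3, 7.0),
    ("extended present", 10.0, float("inf")),
]


def time_band(duration_s: float) -> Optional[str]:
    """Return the name of the band containing duration_s.

    Returns None when the duration falls in a gap between the quoted
    ranges or below the shortest one.
    """
    for name, low, high in BANDS:
        if low <= duration_s <= high:
            return name
    return None


if __name__ == "__main__":
    # A few sample durations in seconds, chosen purely for illustration.
    for sample in (0.05, 1.0, 8.0, 30.0):
        print(sample, time_band(sample))
```

The lookup treats the limits as inclusive and makes no attempt to interpolate across the gaps, since the text gives the bands as ranges and does not say how intermediate durations should be assigned.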
The time of musical narrative, which Imberty (2000) calls the macrostructure or ‘story-
without-words’ of music, is related to the times of expressive behaviour that form ‘protonarrative
envelopes’ of intuitive vocal and gestural play between infants and their mothers (Stern 1985,
1995; Malloch 1999). The period corresponding to a stanza or verse of 20 to 40 seconds may be
regulated in the brain, as gamma waves or parasympathetic cycles, to control autonomic
functions of the heart and breathing. It continues through sleep to produce fluctuating rates of
breathing and heartbeat, as well as electrical activity of the cerebral cortex that might be related
to the rehearsal and consolidation of memories in dreaming (Delamont et al. 1999). In wakeful-
ness the narrative cycle is charged and modulated for intersubjective meaning with the
‘microtonal’ and ‘microtemporal’ variations of emotion that express urgency and facility in
skilled control of moving within the voice of a singer or the playing fingers of an instrumental
performer, and in the hearing of a listener (Imberty 1981, 2000; Gabrielsson and Juslin 1996;
Juslin 1997, 2001; Kühl, 2007; Osborne, Chapter 15, this volume). Music can assist the synchro-
nization of physiological functions of respiration and heart activity and bring improvement
in locomotor activity, and it can improve cognitive and memory processes by brain synchronization.
Despite individual differences in subjective preferences, the physiological effects of music are
often predictable (Bernardi and Sleight 2007).
7.2.5 Musical sympathy: the intersubjectivity of movements and emotions

This anatomy and physiology of intentions (i.e., emotional intentions-in-action; see Panksepp
2003c) help regulate social collaboration intersubjectively. The core emotions of vertebrate
brains have evolved to resonate among emotionally interacting individuals. They constitute the
primordial core self (Panksepp 1998a, b). They establish self–other defining motivations in higher
medial regions of the brain implicated in the development of both self-referential information
processing and sociable self-awareness (Schore 1994; Northoff et al. 2006; Schilbach et al. 2006).
In the course of the evolutionary process that established social communication of intentions
and feelings between animal selves, polymodal areas of the higher regions of the forebrain come
to act in resonance with the intentions and feelings of other subjects, constituting what are called
‘mirror’ representations, which however do more than reflect what other subjects are intending
to do (Gallese 2001; Gallese, Keysers and Rizzolatti 2004; Jeannerod 2004; Rizzolatti et al. 2006;
Molnar-Szakacs and Overy 2006; Bråten 2007). Being emotionally controlled, these other-within-self
intersubjective representations establish sympathetic resonances, and intersubjective
contagions, probably by intrinsic affective systems situated much lower than the neocortex (Watt
and Pincus 2004), making complementary adjustments to the intelligence and feelings expressed
in gestures of other bodies and sensed by sight, sound and touch through neocortical processes
that are epigenetically programmed by experience. This cerebral machinery of emotional
self–other awareness (Thompson 2001; Reddy 2003) is the ancestor of much more than a
‘language acquisition device’ (Rizzolatti and Arbib 1998); it is the motivator of sociocultural exis-
tence and its moral foundations, and of each individual’s urge from infancy to learn cultural
skills, including those of language (Trevarthen 2004; Bråten and Trevarthen 2007).
7.2.6 An interlude: the genetics of the musical and social mind in Williams syndrome

Certain neurologically impaired children, the development of whose intellectual competence is
seriously compromised, retain musical and social desires (Sacks 2007). This is evident in
Williams syndrome children, who show musical talents and social/linguistic urges in spite of
severe handicaps in areas of spatial comprehension and performance (Mervis et al. 1999).
Children who have this rare developmental disorder, affecting both boys and girls with a prevalence
of about 1 in 20,000, are outgoing and socially joyous and communicative though handicapped
in movement and practical intelligence. Their unique cluster of symptoms appears to arise
from anomalies in crucial growth-regulating genes (for a summary, see Peterson and Panksepp
2004). Williams children are usually mentally retarded, with an overall IQ of about 50 (that in
some exceptional individuals approaches normality). They are deficient in visuospatial skills, but
often perform well above their mental age in auditory–social skills of speech and music. Their
characteristic physical, physiological and psychological features were first recognized as a distinct
syndrome in 1961, by Dr J.C.P. Williams, a New Zealand pediatrician (Williams et al. 1961). The
genetics of this disorder have been well detailed, with three to five major disrupted genes identi-
fied in the 7q11.23 stretch of chromosome 7 (Nickerson et al. 1995; Frangiskakis et al. 1996;
Peoples et al. 1996). Apparently, these defective genes impair forebrain development and prevent
normal spatial–cognitive skills from emerging, while releasing other perceptuomotor and social
abilities more dependent on hearing (Jernigan et al. 1993; Bellugi 2001).
Although Williams children will never be able to read, write or do mathematics well, many
have a remarkable knack for music, dance and performance, along with simple but highly embel-
lished forms of storytelling. Williams children, unlike most children with autism, are eager to
communicate their feelings in expressive ways. The cognitive strengths and weaknesses of
Williams children are also quite distinct from those of Down’s children. Williams children may
have difficulty walking down the stairs, but be able to coordinate superbly the movements
needed to play a musical instrument. Even though they cannot draw an elephant or bicycle, they
can describe them vividly. To their descriptions of everyday objects, they often add a richness
of detail and emotional meaning rarely matched by normally developing children, who seem
prosaic or reserved by comparison (for overview, see Bellugi 2001).
Many Williams children, like some children with autism, exhibit perfect pitch perception—the
ability to identify, and name, an isolated sound exactly as a note in a musical scale (Levitin and
Bellugi 1998). This ability is lost in most adults but can be learned (or re-learned) as a skill.
Paradoxically, perception of the absolute pitch level of an isolated note, rather than a sensibility
for the relative pitch of musical notes in groups, which adults generally find easier, is demon-
strated in infants (Saffran and Griepentrog 2001). Evidently children with developmental brain
disorders fail to make a reorganization of pitch awareness; this is presumably linked to learned
discrimination of meaning-rich pitch transitions in speech. Although the analytical skills of
Williams children for musical form may not be of a high order, they are remarkably engaged with
music or song as a means of emotional expression in sound (Don et al. 1999; Hopyan et al. 2001).
The psychological phenotypes of these children highlight how dramatically social emotionality,
musicality and language urges go together in the brain.
In other cases of abnormal brain development, for instance in the presence of severe higher
cerebral impairments that would lead to persistent vegetative states in adults, a well cared-for
child, with essentially no higher brain regions intact, can remain emotionally conscious, and
their distress can still be soothed by music and by the musical expressions of caring others
(Shewmon et al. 1999; Merker 2006). We commonly accept that normal musical appreciation is
totally interpenetrant with certain cognitive abilities, but there is a fundamental musicality that
seems to be more basic, supporting the hypothesis that our love of music is strongly linked to our
genetic heritage of core emotional and motivational processes (see Wigram and Elefant, Chapter 19,
this volume, for an example of the power of musical communication to win over profound
mental handicap, and Robarts, Chapter 17, this volume, for accounts of how music therapy can
ameliorate emotional havoc caused by abuse).
7.3 The neuroscience of music

7.3.1 Visualizing the brain processes of intersubjective sympathy
Electroencephalic recordings have shown that 8-week-old infants’ brains respond to the sight of
a woman’s face with activity in cortical zones that will later acquire socially important concepts and
skills; skills not only to recognize different people and their personal characteristics by their face or
voice, but to see and hear their communicative gestures and to articulate and comprehend expres-
sive language (Tzourio-Mazoyer et al. 2002). Babies’ brains become active in the same regions as are
aroused in adults when they understand one another’s speech and hand gestures (Willems and
Hagoort 2007). Research on adult music learners shows that brain regions associated with musical
skills are comparable to those involved in acquiring speech and language (Schlaug et al. 2005;
Molnar-Szakacs and Overy 2006; Turner and Ioannides, Chapter 8, this volume).
This stunning information on the ‘embodiment’ of human communicative talents is radically
changing our view of how higher brain functions emerge. But the location and organisation of
networks of the human brain that generate the whole-person emotional ‘time in the mind’ of
moving, thinking, remembering and imagining (Clynes 1982; Clynes and Nettheim 1982;
Wittmann and Pöppel 1999) remain obscure. This lack is a large part of the mystery surrounding
musicality and its biology.
7.3.2 Affective neuroscience, cognitive neuropsychology of the hemispheres and consciousness of music
Our spontaneously musical nature suggests that the higher information-processing activities of
the human mind are not essential for arousing our affective experiences (Zajonc 2004; Sacks
2006, 2007). Affective consciousness is sufficiently distinct from cognitive consciousness (highly
interactive as they obviously are), that if we conflate the two, we can easily misunderstand
how core affective states are created in the brain (Ciompi and Panksepp 2005; Panksepp
2003a, b, 2005a, b), and how higher emotions emerge through cognitive interactions. The effects
of differently located cerebral injuries confirm this view.
The widespread brain mechanisms of musicality can be distorted or confused by brain injury
(Sacks 2007), but can also survive the severe depletion of the cognitive brain. Maurice Ravel, who
lost his ability to write music following left hemisphere trauma, continued to conceive and enjoy
music, as have many other individuals with similar brain damage (Peretz et al. 1998). Other people
with restricted brain lesions have lost many intellectual and executive functions but retained
musical appreciation (Peretz and Zatorre 2005; Stewart et al. 2006). Musicality is intrinsically
asymmetric in the human brain. Most of us have, since infancy, spontaneously appreciated vocal
prosody and enjoyed music more with our right than with our left hemispheres. We engage with
the emotions of others more with our right than left hemispheres (for references see Trevarthen
1984, 1996; Storr 1992; Schore 1994; Siegel 1999). These asymmetries are also evident in the ways
we gesture with our hands to express our thoughts and feelings (Kimura 1982; MacNeilage 1999).
Infants are, of course, born with more affective than cognitive competence. We come into the
world with a dynamic and interpersonal mental and emotional life that is concentrated more in deep
subcortical emotional and affective limbic forebrain systems, which are intensely active (Chugani 1998;
Panksepp 1998b). Hearing music or speech is not just ‘in’ the auditory neocortex. The infant
brain subcortical systems of voice appreciation are certainly more active than the auditory cortex,
and can be compared directly to the vocal auditory systems in other species (Ploog 1992; Hauser
2000). An obligatory brainstem waystation for auditory processing, the inferior colliculus of the
midbrain roof, lies over the periaqueductal grey (PAG), where all emotional systems converge on
a coherent self-representation of the organism—a primordial core consciousness (Damasio 1999;
Panksepp 1998a, b; Merker 2005). That brain region clearly mediates affective processes in all
mammals (Bagri et al. 1992). This is where a mother’s voice may leave its first affective imprints,
and it is evidently active in the refinement of hearing that comes with practice of musicianship
(Musacchia et al. 2007). It is richly endowed with opiate receptors (Panksepp and Bishop 1981)
that help mediate social attachments including our attunements to voices of those we love, and
hence, by parallel reasoning, to certain types of music (Panksepp 1995) (Figure 7.1).
7.3.3 Cerebral asymmetry of musical awareness, and its emotional foundations
Before they begin to develop their more propositional, linguistically defined left-hemisphere
abilities, babies exhibit a right-hemisphere dominance that is prepared to engage and interact
emotionally with loving caretakers (Schore 1994, 1998). Young babies, at 3 months, show greater
brain arousal effects of music than 12-month olds, and as in adults, positive and negative
emotions tend to arouse respectively the left and right hemisphere functions of infants (Trainor
and Schmidt 2003).
[Figure 7.1 diagram labels: convergence of somatic information (vision, hearing, touch) in the superior colliculus (SC); convergence of emotional information (panic, fear, rage, sex/nurturance, seeking) in the periaqueductal gray (PAG); MLR; motor.]
Fig. 7.1 Convergence of the major emotional systems on the self-coordinating mechanism of the
periaqueductal grey (PAG) in the midbrain.
The neurology of music (e.g., Critchley and Henson 1977; Steinberg 1995; Sacks 2007) has long
recognized that the right emotional–prosodic hemisphere (Bogen 1969) is more influential than
the linguistic–propositional left hemisphere in affective musical appreciation and expression
(Peretz 1990; Perry et al. 1999; Zatorre 1984). It appears that the aspects of music that are more ana-
lytical and learned rely more on developmental potentialities of the left neocortex (Peretz 1990;
Sergent et al. 1992), which acquires a special relationship with more focused and serially patterned
movements (Kimura 1982). Thus spontaneous, affective musical responsivity is typically mediated
more by the right hemisphere of the adult brain, suggesting an intimate relationship throughout
development between dynamic, self-regulating emotional functions and musical processes.
There is a long ancestry of asymmetry in physiological and socio-emotional regulations of
vertebrate brains (Bradshaw and Rogers 1993; Quaranta et al. 2007), the right side of the brain as
far back as the amphibia (Malashichev and Rogers 2002) demonstrating a specialization for
‘trophotropic’ energy-conserving regulation of well-being, while the left is adapted more for
‘ergotropic’ energy expending engagement with the environment. These asymmetries are innate
in human infants and function throughout life in both self-regulations and the emotional con-
trol of communication and personality (Davidson and Hugdahl 1995; Trevarthen 1996;
Davidson 2001; Tucker 2001). They are rooted in asymmetric systems of the brainstem associated
with the hypothalamo–pituitary–adrenal (HPA) and the sympathetic–adrenal medullary (SAM)
regulations, both coupling neurochemical and hormonal mediators, the HPA being concerned
with a ‘distress’ or ‘conservation withdrawal’ response to a threat of stress, the SAM being an
active ‘effort’ system that promotes fleeing and fighting (Trevarthen et al. 2006).
7.3.4 How the musical–emotional brain grows: prenatal origins of intrinsic motives and feelings
‘Primordial motive systems appear in subcortical and limbic systems of the embryo before the cerebral
cortex. These are presumed to continue to guide the growth of a child’s brain after birth. We propose
that an ‘intrinsic motive formation’ (IMF) is assembled prenatally and is ready at birth to share emotion
with caregivers for regulation of the child’s cortical development, upon which cultural cognition and
learning depend.’
Trevarthen and Aitken (1994, p. 599)
Our core time sense and control of effort in whole-body biomechanics—integrating a flexible
trunk with head, arms and legs—is given its rhythmic control, coherence and regulation of energy
by a widespread Intrinsic Motive Formation (IMF) (Trevarthen and Aitken 1994) (Figure 7.2).
The IMF is defined as the assembly of nerve systems that activates, integrates and steers move-
ments and aims the perceptual guidance of the moving body, coordinating the limbs and balanc-
ing the whole in relation to the inertial forces of the parts and the forces arising from contact
with the external media, selecting goals by aiming the focus of awareness of diverse sensory
organs (Trevarthen and Aitken 1994, 2003; Trevarthen 1997). It arises in the brainstem, the core
of the embryonic central nervous system, mapping all representations of the motor organs and
sensory fields somatotopically, i.e., in relation to the polarity and symmetry of the body
(Trevarthen 1985). The integrative anatomy of the IMF, and its rhythmic activity, is established in
the embryonic brain before sensory or motor nerves connect the organic vitality and implicit
intelligence of the body with conditions in the outside world. The fetal brain is already ‘inten-
tional’ in this experience-seeking sense (Zoia et al. 2007).
[Figure 7.2 diagram labels: cultural evaluations made in community (technology, science; art: drama, music, dance); extero-ceptive feelings of self–object appraisal (O); altero-ceptive moral feelings of relating (P); proprio-ceptive self-regulating feelings, pleasure–pain (S); emotion systems: play/joy, social affection, seeking and exploring, nurturance and maternal care, panic/separation distress, rage, fear, sexuality, social attack.]
Fig. 7.2 The intrinsic motive formation (Trevarthen and Aitken 1994) coordinates the vital states
of human being and directs engagements of the embodied Self (S) with the environment. It has
three different systems of emotional regulation of movements and perceptions: proprioceptive for
feelings of the well-being of the body; exteroceptive for feelings of engagement with the
objects (O) of the physical world; and alteroceptive for sympathetic feelings for the intentions and
emotions of other persons (P). Human musical activities and experiences are part of the cultural
process that develops both the technology and art of a historical community by communication of
both the practical and emotional aspects of all these three regulatory systems (Trevarthen 1998, 2005).
The basic emotions systems (Panksepp 1998a) are indicated as they relate to the body, to the
experience of physical objects (on the left) or to communication with other subjects (on the right).
As the cerebral hemispheres mature to regulate the dynamics of conscious imagination and
thought, and the detailed experience of both speech and music (Callan et al. 2006), they do so
within the integrative control of the subcortical brain (Merker 2006). Every motor impulse or
plan is evaluated emotionally with regard to its value for the present and future well-being of the
organism. The IMF projections also modulate and collaborate with the immense integrative
powers of the cerebellum, which gives precision and order to the timing of motor actions
throughout musculoskeletal mechanisms of the whole body in movement, and under prospec-
tive control by all senses (Bell et al. 1997; Bell 2001). Through the integration of the IMF
infrastructure, in concert with the affective system of the brain (Panksepp 1998a), multisensory
information is given meaning in relation to the antecedent intentions, attentions and conscious
experiences of the maturing infrastructure of the self (Merker 2005, 2006; Northoff et al. 2006).
7.3.5 Tracing musicality beneath cognition in the subcortical reaches of the human brain/mind
Much classical work in human neuropsychology comes from the study of brain-damaged people.
However, it is now possible to image the effects of brain activity in people without brain damage,
including asymmetries of hemispheric involvement, which are correlated with musically induced
affective experiences (Turner and Ioannides, Chapter 8, this volume). For instance, pleasurable
and disagreeable aspects of consonance and dissonance in sounds or melodies are related to the
arousal of activity in particular regions of the brain (Blood et al. 1999). Positron emission tomog-
raphy (PET) imaging has proved impressive in highlighting the degree to which activity in sub-
cortical areas of the human brain, that are homologous with those long implicated in animal
emotionality, contribute to high levels of affect (e.g., Blood and Zatorre 2001). A worthy hypoth-
esis is that music, while serving active communication of dynamic and affective processes in the
mind and involving all levels of the brain to do so, depends on the arousal of basic emotional
feelings in the core self associated with brainstem neural processing directly concerned with the
neuro-humoral regulation of bodily states (Panksepp 1998b). It is unlikely that high-level cogni-
tive processes of environmental awareness and intricate acquired skills can, alone, support the
affective states of musical appreciation or their instinctive communication. Rewards of music
may arise partly from the brain dopamine systems (Menon and Levitin 2005) that integrate the
search for and appreciation of reward (Alcaro et al. 2007).
In addition to correlative analysis of brain changes, it is important to directly manipulate brain
chemistries to get some appreciation for the causal infrastructure of feelings evoked by music.
By using pharmaceuticals such as naloxone and naltrexone to modify synaptic transmission in
brain opioid systems, investigators have already evaluated whether the opioid pleasure system of
the brain helps mediate music appreciation. As already noted, the emotions evoked by music can
be markedly diminished by opioid blockade (Goldstein 1980), while modest doses of major
tranquillizers that suppress peripheral autonomic effects have not been found to diminish the
emotional impact of music (Harrer and Harrer 1977, p. 216). Although such work remains in its
infancy, our working hypothesis is that the general emotional effects may arise from changes in
central biogenic amine systems, while the more specific moods and emotions are conveyed by
intrinsic neuropeptide systems. Likewise, the literature on the effects of music on many bodily
systems and processes is growing steadily (e.g., Pratt and Grocke 1999; Kreutz et al. 2004; Stefano
et al. 2004; Klockars and Peltomaa 2007), and in section 7.3.7 we will consider work on peak
emotional experiences with music that produce the bodily feeling of chills.
Overall, the localization of the deep-seated generators of emotion by functional brain
imaging will require more sensitive and more responsive techniques than we now possess
(Turner and Ioannides, Chapter 8, this volume). It is already clear that basic emotional arousal
results in neural resonances that are very widely distributed in the brain, and fast changing.
Emotional circuits resemble tree-like structures, with trunks and roots in subcortical areas, and
widespread canopies in cortical regions (Panksepp 1998a) (Figure 7.3). Accordingly, we presume
music accesses emotional systems at many levels and has whole-body effects.
Skilled auditory processing of musical information requires many higher reaches of the brain,
linking auditory temporal lobe inputs into amygdala and basal ganglia (as in the motivational
circuitry of the nucleus accumbens), and involving frontal, parietal and limbic–cingulate cortical
regions. These function in close association with the refined integration of motor dynamics and
their multimodal sensory regulation in the cerebellum (Blood et al. 1999; Blood and Zatorre
2001; Menon and Levitin 2005).
Many higher brain systems are essential for the cultivated appreciation of music, and for
skilled performance, as well as for the complex information processing that is essential for
musical intelligence, as measured by psychometric tests (Penhune et al. 1999; Peretz et al. 1994;
Peretz and Zatorre 2003). As one becomes a skilled musician, the brain control of music appreciation
apparently shifts from right hemispheric affective to more left hemisphere analytical skills
(Zatorre 1984).
[Figure 7.3 diagram labels: cingulate; basal forebrain; temporal lobe amygdala; PAG; panic; seeking expectancies; fear and rage; BNST; MFB; VTA; VMH; POA; higher frontal and amygdaloid inputs; sensory inputs; hormonal inputs.]
Fig. 7.3 Emotions mediate between body and mind, regulating intentions, awareness and
wellbeing, in the individual and with the community. Schematic of subcortical emotional systems
of the mammalian brain (A), and one system responsible for regulating sexual behaviour
in the rodent brain (B). BNST = bed nucleus of the stria terminalis; POA = pre-optic area;
VMH = ventromedial nucleus of the hypothalamus; VTA = ventral tegmental area; MFB = medial
forebrain bundle; PAG = periaqueductal grey (Panksepp 1998).
In short, there can be no restricted brain ‘module’ of musical intelligence, but many wide-
spread emotional systems that transmit the affective qualities of music throughout the brain.
Thus, although there are bound to be specific evolutionary adaptations for musical appreciation
in higher regions of the brain, the emotional power of music may be largely dependent on
premusical emotional adaptations of the brain. The ability of a musical pulse to arouse a desire
for bodily movements and to induce various autonomic changes is congruent with the powerful
subcortical influences of music (Hodges 1995; Blood and Zatorre 2001).
7.3.6 Music-induced emotions in the real time of brain activity

Skilled musicians and everyday listeners alike recognize the basic emotional and motivating
content of music (e.g., Juslin 1997, 2001; Robazza et al. 1994; Imberty 2000). Even little children
are quite adept at identifying emotional themes in music (Terwogt and Van Grinsven 1991). Most
humans, whether adults or children, distinguish four named emotions that move our bodies
differently—happiness, sadness, anger and fear—and these same emotions can easily be
conveyed as distinct by the dynamics and tonality of music with considerable confidence.
Moreover, performing musicians can also improvise dynamic ‘portraits’ of these individual
emotions (Gabrielsson and Juslin 1996; Gabrielsson and Lindstroem 1995; Juslin 1997, 2001;
Nielzen and Cesarec 1982). However, understanding these foundations is just a beginning,
because it is clear that movement in music conveys a rich panoply of ‘vitality affects’, as defined
by Stern (1999), that are related to the feelings of the body moving in itself and in the world
(Krantz 2007; Lee and Schögler, Chapter 6, this volume).
While a great body of data claiming to trace steps in musical information processing by the
brain has now been published (for a summary and discussion, see Peretz and Zatorre 2003), little
work has been conducted to determine how the motivations and emotions of music instigate and
modify higher cerebral neuronal activities as they come about, keeping in mind that functional
Magnetic Resonance Imaging (fMRI) and PET studies do not directly monitor nerve cell firings,
recording only changes of local metabolic physiologies in cortical tissues, and with limited spatial
and temporal resolution. Electroencephalography (EEG) or magnetoencephalography (MEG)
are the only techniques that can directly monitor real-time electrical activity of the human brain
(Petsche et al. 1998; Turner and Ioannides, Chapter 8, this volume).
To help fill that gap, Panksepp and Bekkedal (1997) evaluated topographic EEG changes
in response to standardized ‘sad’ and ‘happy’ selections from Terwogt and Van Grinsven (1991). These
optimal segments of happy and sad music were repeated about 30 times, with a topographic
analysis of the whole cerebral surface. The repetition allowed them to use the sensitive
event-related desynchronization and synchronization (ERD and ERS) algorithms developed by
Pfurtscheller et al. (1990).
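The ERD and ERS measures referred to above can be illustrated with a minimal sketch. It is not the procedure of Pfurtscheller et al. (1990) or of Panksepp and Bekkedal (1997), whose details are not given here. It assumes that each repetition has already been band-pass filtered to the band of interest (for example the alpha range), and it adopts one common sign convention, in which band power in a test interval is expressed as a percentage change from a reference interval, so that negative values indicate desynchronization and positive values indicate synchronization. The function name and its arguments are hypothetical.

```python
def erd_ers_percent(trials, ref_slice, test_slice):
    """Percentage change in band power, test interval versus reference interval.

    trials: list of equal-length lists of band-pass-filtered samples,
            one list per repetition of the stimulus (assumed input format).
    ref_slice, test_slice: slice objects selecting reference and test samples.
    Convention assumed here: negative = ERD, positive = ERS.
    """
    if not trials:
        raise ValueError("at least one trial is required")
    n_samples = len(trials[0])
    if any(len(t) != n_samples for t in trials):
        raise ValueError("all trials must have the same length")

    # Square each sample to get instantaneous power, then average over trials.
    power = [sum(t[i] ** 2 for t in trials) / len(trials) for i in range(n_samples)]

    ref_power = power[ref_slice]
    test_power = power[test_slice]
    if not ref_power or not test_power:
        raise ValueError("reference and test intervals must not be empty")

    ref = sum(ref_power) / len(ref_power)
    act = sum(test_power) / len(test_power)
    return 100.0 * (act - ref) / ref


# Toy data: amplitude halves in the second half of every repetition,
# so band power falls from 4 to 1 (a desynchronization).
toy_trials = [[2, -2, 2, -2, 1, -1, 1, -1] for _ in range(30)]
change = erd_ers_percent(toy_trials, slice(0, 4), slice(4, 8))
print(change)  # -75.0
```

Averaging power over many repetitions is what makes such a measure usable with short musical segments, which is presumably why the roughly 30 repetitions mattered. A real analysis would add filtering, artifact rejection and smoothing over time, none of which is shown here.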
The results were variable. Within the sensitive alpha range (8–12 Hz), there were only modest
tendencies, primarily in females, for happy music to induce less cortical arousal (more synchro-
nizations) and sad music to produce more arousal (more desynchronizations), especially in the
posterior, multimodal sensory regions of the cortex (Panksepp and Bekkedal 1997). In males,
this pattern was reversed. However, subjects had no personal ‘relationships’ with these musical
selections. When repeated with self-selected ‘loved’ music, the brain changes were more robust.
Happy music produced more robust event-related synchronizations (i.e., decreased cortical
arousal) and sad music produced more event-related desynchronizations (i.e., increased cortical
arousal). This pattern is reasonable from the perspective that during sad emotional states, people
have more cognitive and anxiety-provoking issues to dwell on, which would be expected to facil-
itate ERDs. During happy feelings (of relaxation), no such cognitive arousal is needed. However,
there was no clearly sustained laterality effect comparable to those described by Davidson (1992),
and in studies of simple repeated tones and melodies (Breitling et al. 1987). It appears that music
engages with episodic rather than declarative memory systems, which construct personal life
histories of emotion-rich experiences (Tulving and Markowitsch 1998), and thus it may ‘play’
creatively and socially between the hemispheres, engaging and enhancing the use of their
complementary mental aptitudes (Turner 1982, 1983).
It is to be emphasized that cortical measures such as the above give no good indication of what
may be transpiring in subneocortical emotional systems—the primary-process generators for
affective states (Liotti and Panksepp 2004; Panksepp 2000a, 2005a). For that, we need to use less
direct estimates of neural activity, such as the PET and fMRI approaches already noted (Blood
and Zatorre 2001; Menon and Levitin 2005).
7.3.7 Bodily feelings from music: with a focus on ‘chills’

Substantial experimental work has examined the effects of music on the regulations inside the
body (for a summary of early work on effects on autonomic function see Critchley and Henson
1977; for more recent research, see Hodges 1995; Steinberg 1995). The effects on bodily dynamics
are expected simply because music so effectively arouses and changes both the autonomic
functions and emotional feelings that are associated with preparations for purposeful moving
(Jeannerod 1994). However, it is increasingly clear that different individuals commonly show
different physiological responses to music (Nyklicek et al. 1997; VanderArk and Ely 1992),
confirming that personality or habits of motivation, and episodic memory for embodied life
events, are critical components of how people respond to music and to other socio-emotional
experiences. This variability, so common in human psychophysiology and brain imaging work
(Barrett 2006), may clarify why some people prefer affective engagement, and others more
detached cognitive perspectives to music appreciation (Storr 1992). The intimate interpersonal
‘sympathetic’ regulations of vital bodily functions shape, from before birth, our differences in
social boldness or timidity, in self-confidence and the need for intimate support, even for twins
conceived with the same genes (Piontelli 2002; Trevarthen et al. 2006); they presumably also
affect our musicality and tastes in music.
One dramatic and consistent bodily effect induced by music is the feeling of shivers or chills
many people experience when they are intensely moved, as by emotionally powerful music
(Sloboda 1991), especially bittersweet songs of unrequited love and longing, and of patriotic
pride arising from the commemoration of lost warriors, which may reflect group-cohesion
dynamics (Panksepp 1995). These feelings are experienced as intense and desirable peak affective
experiences. Parenthetically, other highly aversive chill-evoking sounds, such as fingernails scraping
across a blackboard (Halper et al. 1986), are presumably generated by a different brain response
from the chills discussed here.
There is substantial individual variability in the incidence of this response. Typically, people
experience more chills to musical selections with which they have a pre-existing emotional
relationship, but chills can be rapidly established to new emotionally moving music, suggesting it
is a response based on attachments that individuals develop to the music they enjoy. Some
people—perhaps those who are less socially emotional and more alexithymic—rarely experience
musically evoked chills. Most, however, delight in the experience. Females exhibit the response
more than males, perhaps because the response is dependent on ‘interior’ socio-emotional
sensitivities, and they are more likely to have such experiences from music they experience
as sad, expressive of loneliness or loss, rather than music they experience as happy and more
sociable (Panksepp 1995).
High-pitched sustained crescendos are ideal stimuli for evoking chills. One hypothesis is that the
response is triggered in part because such sounds resemble the separation cry of babies, the
primal care-soliciting signal that attracts social care and attention, especially
by mothers. Musically evoked chills may arise from the resonance of our brain separation-distress
systems that mediate the painful emotional impact of social loss (Panksepp 1981, 2003c). In part,
the affective impact of this response may reflect homeostatic thermoregulatory adjustments trig-
gered by separation experiences, which promote motivational urgency for social reunion. The evo-
lutionary roots of social motivation are partly linked to thermoregulatory networks of the brain
(Panksepp 1998a). The sound of a lost child sends chilly shivers down our spine. This may promote
urges for reunion and the re-establishment of social warmth with body-to-body contact (Figure 7.4).
Indeed, positively valenced music to the left ear (and right brain) tends to increase body
temperature, while negatively valenced music has the opposite effect (McFarland and Kennison
1989). Musical performances that evoke chills blend a wistful sense of loss with the possibility of
reunion and redemption. Such aesthetic experiences remind us of our humanness—our profound
social attachments and loving dependencies, our relatedness to other people and nature.
Since pharmacologically induced opiate receptor blockades can reduce the incidence of chills
(Goldstein 1980), the chill response is partly due to changes in endogenous opioid activity in the
brain. The directionality remains ambiguous—the chill experience may follow either a rush of
endorphins or, conversely, perhaps a precipitous decline in endogenous opioid activity (Panksepp
1995). A recent PET imaging study of the human brain indicates that sadness is accompanied by
opioid activity in the limbic system (Zubieta et al. 2003). Blood and Zatorre’s (2001) work has high-
lighted abundant arousal in the socio-emotional limbic regions of the brain during chill-evoking
music. Activity in subcortical emotion-regulating regions, such as the ventral striatum and midbrain
periaqueductal gray, correlated positively with positive affective responses.
Work on the psychobiology of chills has barely begun, but is beginning to captivate a new gen-
eration of investigators who are working out the physiological nature of the response and the
aspects of music most likely to evoke such feelings (Craig 2005; Grewe et al. 2007; Guhn et al.
2007). Besides helping to illuminate the nature of musical aesthetics, such musically induced brain
responses could clarify the addictive nature of music (Figure 7.5).
Fig. 7.4 A chick reacts to being held in warm hands by falling asleep.
[Figure 7.5 panels: musical score (flute, clarinet in A, bassoon, viola, piano/cello/double bass, violin 1, violin 2) marked with piano solo ends, orchestra entrance, flute entrance and chill passage; skin conductance (microsiemens); average heart rate (beats/minute); number of chills per phrase (phrases 1–11); volume (dB); time axis in seconds, with music begins and excerpt begins marked.]
Fig. 7.5 From Guhn et al. 2007. Excerpt from Mozart’s Piano Concerto (K488), 2nd movement,
measures 11–18, including the passage, measures 16–17 (top graphic), where listeners experienced ‘chills’.
Mean skin conductance curves in microsiemens, relative to baseline, for the Chill Group (n = 10;
higher curve) and No Chill Group (n = 11; lower curve) (skin conductance). Mean heart rate curve of all
participants in beats per minute (n = 27) (average heart rate); time reference axis in seconds. Bars
indicating the number of participants (total of n = 16) that experienced a chill during the designated phrases,
1–11 (Chills); and volume curve of the music in decibels. Note that the length of the music score does
not coincide with the length of the played musical excerpt. Also, piano, cello, and double bass parts are
alike, and therefore notated in one system for space reasons.
7.3.8 Music and the neurochemistry of social attractions and ‘addictions’
The seduction of music, and the knack that musical experiences have of being unforgettable, might lead
us to wonder whether music can stimulate the neurochemistries of memories and emotions
which have evident adaptive value. There is increasing evidence that this is so. Social bonding
in animals is controlled partly by feelings of separation distress regulated by brain opioids, an
affective state that may promote addictive urges (Panksepp 1981). Similar brain dynamics have
been affirmed in humans with the demonstration that human sadness is accompanied by opioid
activity outside the neocortex (Zubieta et al. 2003). If the emotional appeal of music relies exten-
sively on the activation of important social emotional processes, some of which, such as
mother–infant bonding, are addictive (Panksepp 1998a), then music may have comparable
addictive features that may be beneficial, or even essential, to a socially bonded life.
In addition to opioids, infant–mother bonding has strong oxytocinergic components
(Panksepp 1998a), and it is increasingly clear that adult attractions are promoted by similar brain
chemistries (Insel and Young 2001), leading to their comparison with the addictive aspects of
brain seeking desires (Alcaro et al. 2007), urges mediated by dopamine (Insel 2003). Both phar-
macological and brain imaging studies indicate that brain opioid and dopamine systems partici-
pate in peak musical experiences (Blood et al. 1999; Blood and Zatorre 2001; Goldstein 1980).
This is only the glimmer of a complex neurochemical cascade, yet to be detailed. If we take an
evolutionary perspective to our musical nature, we are led to the conclusion that the emotional
experience of music is ultimately based on the melodious powers of the human voice, first
evident in the loving duets of mother and child. Our understanding of this is not demeaned by
the recognition that our uniquely human emotional wealth remains grounded on the ancient
neurochemistry of our animal passions.
7.3.9 Neurochemistry of affective systems in the animal brain—music for other species?
The evidence supports the idea that hearing music engages innate neurochemical systems that
facilitate social processes, but it is difficult to measure activity in the transmitter systems of the
human brain, even though neuropharmacological work can sometimes provide indirect infor-
mation about their activity (Goldstein 1980). Investigators have usually measured peripheral
plasma or salivary products, to assess, for example, cortisol (e.g., Kreutz et al. 2004; VanderArk
and Ely 1992). Unfortunately, such peripheral measures are unlikely to reflect brain transmitter
dynamics with fidelity.
Methodological difficulties such as these have led us to expend much effort to determine how
music affects the brain and behavior of experimental animals. Initially, we (along with many
others) had the naive hope that some common laboratory animals might enjoy our music, or
accept it as reward. Neither we nor anyone else, to our knowledge, has obtained compelling
evidence that animals like human music. This does not mean that they are not affected by musi-
cal stimuli; they certainly can be (Panksepp and Bernatzky 2002; Chikahisa et al. 2006, 2007).
After we shifted our focus from laboratory rats, whose vocal emotional communications are
typically in the ultrasonic range (20–60 kHz) to newborn domestic chicks, who communicate
within our own auditory range, strikingly consistent results have been obtained.2
2 Regrettably, most of the work is still unpublished, and is only noted in passing in review papers, because
such research is presently viewed by influential journals as of fringe significance. For a summary of the
findings, see Panksepp and Bernatzky (2002, pp. 147–148).
Just as mothers calm fussy babies by singing to them, we have found that music reduces
the separation cries young domestic chicks emit when they are briefly isolated from social
companions, and this calming effect is eliminated if animals are induced into a hyperemotional,
agitated state with intracerebral kainic acid (Figure 7.6). Since neuropeptides, such as the
endogenous opioids, oxytocin and prolactin, are very effective in alleviating chicks’ separation
distress, we might anticipate that the music was activating these endogenous neurochemical
mediators in the chicks’ brains. Unfortunately, the release of such low-concentration chemistries
of the brain is very difficult to measure. However, one of the peptides, oxytocin, placed directly
in the brain produces the same fixed-action patterns in young birds as does music. When we
administer oxytocin, or the avian equivalent vasotocin, directly into the ventricular system (since
neuropeptides do not readily cross the blood–brain barrier), young chicks show dramatic eleva-
tions of relaxed behaviours—yawning, head-flicking, feather-ruffling and wing-flapping
(Panksepp 1992); these effects are also commonly observed during the exposure of the birds to
music (Panksepp and Bernatzky 2002).
That endogenous opioids, oxytocin and prolactin are presently the most powerful neuropep-
tides to reduce separation distress (Panksepp 1998a), and are important for establishing social
bonds (Carter 1998; Insel 1997; Nelson and Panksepp 1998), suggests that music experienced as
calming and seductive may also release these chemicals in the human brain. We wonder whether
the elevated levels of feather-ruffling evoked by oxytocin in chicks (as well as the ‘wet-dog shakes’
in most animals during opioid withdrawal) may have physiological relations to chills evoked by
music. These ideas await the development of oxytocin receptor antagonists that can be deployed
in human research.
In humans music engages such a range of emotional states that it surely activates a vast
symphony of neurochemical changes. In addition to interacting with comfort-regulating
neuropeptide systems, music may also interact with generalized arousal and attention systems of
wakeful consciousness, such as those based on norepinephrine and serotonin that regulate more
environment-focused emotional responses (Panksepp 1986). We evaluated the efficacy of
Auditory Integration Training, a music-based treatment for early childhood autism (Rimland
and Edelson 1995; Waldhoer et al. 1995) to reduce stress in chicks. Major effects were demonstrated
on brain norepinephrine attention systems (increased transmitter synthesis), with more
modest effects on dopamine and serotonin (for a summary, see Panksepp and Bernatzky 2002,
pp. 147–148). This work opens up the further possibility that musical stimulation can modulate
the expression of specific genes in the brain, and potentially effect permanent epigenetic changes
by regulating the methylation of certain genes. Other emotional states have influenced gene
expression profiles (e.g., Kroes et al. 2006).
Fig. 7.6 Reduction of distress calls of domestic chicks by music. [Plot: distress vocalizations per 3 mins (0–400) across successive test blocks (Silence, Music, Silence, Music) for kainic acid (0.25 µg) and control chicks.]
Given the clear effects of music on chick affective neurochemistry and behaviour, we inquired
whether chicks and rats exhibited any preference for human music. We never did obtain robust
evidence for this. To this day, there is insufficient evidence that any other species likes human
music. Nevertheless, that music can facilitate human brain dopamine synthesis, reduce blood
pressure, and alleviate Parkinsonian symptoms (Bernatzky et al. 2004; Sutoo and Akiyama 2004),
leads us to wonder whether some aspect of the effects of music on brain regulation might not be
usefully studied in animal models, and may even find application in animal care.
In pursuing ideas of cross-species aesthetics, we should not neglect animals’ own, species-
specific forms of emotional communications and their own rhythms of expression (Hauser
1996). They are likely to find the socio-emotional sounds they produce more attractive than the
ones we generate (e.g., Bradbury and Vehrencamp 1998). For instance, the playful 50 kHz chirps
of rats, totally inaudible to us, and so abundant during their rough-and-tumble play and sexual
solicitation (especially during playful tickling by skilled humans) may resemble the laughter
which expresses shared excitement, relief and joy in apes and humans (Hooff 1989; Panksepp
2005c, 2007a). These laughing sounds serve as social attractants for young rats (Burgdorf et al.
2007; Panksepp and Burgdorf 2003). By amplifying and stylizing such acoustic signals as are
emotionally relevant for other species, we might increase preferences above and beyond the ani-
mals’ attractions to natural calls. If so, perhaps we could create simple socially attractive messages
of ‘protomusic’ for other species—perhaps by using musically stylized clucking sounds for chick-
ens, squealing/snorting sounds for pigs, meows for cats, 50 kHz chirps for rats, and so forth.
Might experience with such protomusic modulate their social tendencies, as does the affective
quality of the speech of experienced animal handlers? Might the sound of preferred music make
their lives in monotonous laboratory and confined farm situations more pleasant?
Animal models have dramatically advanced our understanding of genetics (Ridley 2003), the
mechanisms of learning (Kandel 2006), and the fundamental neurobiological nature of human
emotions (Panksepp 1998a). Might they eventually give us a theory of emotional motives
for music?
7.4 The applied psychobiology of music

7.4.1 Musical affect and the training of musical intelligence and skill: beyond animal signs to composed art
Human musicality is both innate and powerfully teachable: we make music and share it with
children in cultural forms, and children at play create and teach one another their own ‘intuitive’
musical culture (Bjørkvold 1992). The making of music opens up creative psychological spaces
within and between humans that do not, as far as we know, exist in other animals. Perhaps the
critical evolutionary novelty in humans relates to the polyrhythmia of bodily expression released
by the evolution of bipedal walking—to the intrinsic motive pulse of walking, marching, skipping,
dancing, waltzing people, with two hands and ten fingers free as complementary messengers of
intricate thought and skilled purposefulness (McNeill 1992; Trevarthen 1999). There is evidence
from studies of chimpanzee gestures that flexible sign-making by ritualized hand movements
may have preceded the evolution of speech, that gestural signing led to making audible signs by
mouth that could become symbols (Pollick and de Waal 2007). After all, a sign in music is, in a
sense, ‘audible gesture’, whether made by mouth or with an instrument. But, in addition, some
new sense of the extended messages of gesture that can be learned as stories is involved in the
genesis of music—what we have been characterizing as its ‘narrative power’ (Kühl 2007; Imberty
and Gratier 2008).
Perhaps the polyrhythmic texture of human musicality grew from the complex new manual
skills needed for primate tree-climbing and highly flexible foraging; these are activities in which
hand and finger movements move rapidly in well-planned fugues of complementary industry,
the two cerebral hemispheres taking different roles in the planning, control and learning of the
manipulative skill, and in planning strategies for exploratory or creative activity (Trevarthen
1978, 1995). Musical composition and performance certainly depend on the human urge to
discover, create and practice new intricacies of skill for moving by hand and mouth, and for
hearing acts that have been accomplished (Donald 2001; Schögler and Trevarthen 2007). There is
theory and evidence from functional brain imaging that in hominids the development of
new brain systems for coordination between body movement and expressions of hand and
mouth were crucial in the evolution of speech and language (MacNeilage 1999; Willems and
Hagoort 2007).
Music arises from a distinctly human instinct for cultural inventiveness, of action and thought
as art—together making up new and valued creations of mystery and imagination (Dissanayake
1988), building a shared habitus of meanings (Bourdieu 1990; Gratier 2008). All musical
inventions, however spontaneous or unpremeditated, tend to adopt, elaborate and remember
discrete conventions of execution and composition. Even babies, only 6 months old, want the
ritual movements of ‘their’ baby song to be performed ‘correctly’, with the ‘proper’ interpersonal
coordination (Trevarthen 2002; Merker Chapter 4, Eckerdal and Merker Chapter 11, Gratier and
Apter-Danon Chapter 14, this volume).
Conventional musical tones (i.e., notes) are artefacts of a musical tradition, derived indirectly
from natural emotional sounds (Brandt, Chapter 3, this volume). Musical notation is, of course, a
rational tool, developed to fabricate and communicate musical forms on paper. In reality, none of
the musical notes, as played by a trained score-reading musician, will be just the ‘pure’ sounds as
represented on the printed page. All varieties of emotional animal sounds, and all human vocal-
izations, including those of singing, as well as the sounds of an expressive instrumental perform-
ance, are modulated, with fluctuations of intensity and timbre; these modulations are essential to
the expressiveness of such sounds (Lee and Schögler, Chapter 6, this volume).
As a cultural medium, music derives additional satisfactions from cognitive processes and
rational transformations. Musicians gain command of their creations by learning the conven-
tional elements, and by discovering the skill of modulating tonal identities to blend emotions in
exciting ways (Brandt, Chapter 3, this volume). However, while we are a species motivated to
invent artificial meanings that become the arts, techniques, rituals and languages of our shared
world, we never lose the taste for spontaneously expressed emotions, such as may occur in the
heat of the moment during a performance by a highly skilled musician, or during a rollicking
play song sung by a mother to her infant. The ‘dynamics’ and ‘harmonies’ of these emotions
powerfully affect the quality of our relating.
7.4.2 The aesthetic foundations of musical performance

How we perform music with skill or compose it with intelligence, and how we are moved
by music, entail different neuropsychological abilities, cognitions and skills, but both creation
and enjoyment share affective foundations. Our immediate experiences of emotions prepare for
cognitive elaborations, such as those reinforced by cultural conventions that are essential
for the full appreciation of musical art (Merker Chapter 4, Turner and Ioannides Chapter 8, this
volume).
Skilled musicality has, as well as complexity, the special neurally mediated quality of beauty,
telling a memorable story with the aesthetic appeal of a graceful presentation. When charged
with communicative significance the grace of moving becomes highly emotive, both personally
and interpersonally. Beauty is valued because it can be shared. It makes human works and natural
objects ‘special’ (Dissanayake 1988).
Art is concerned with the direct communication of the pleasure of creating shareable
experiences and objects. It enhances rituals and ‘stories’ of performance without regard for
practical products; that is what distinguishes it from ‘technique’. However, the skills of advanced
artistic performance combine art and technique (Flohr and Trevarthen, 2007; Rodrigues et al.
Chapter 27, this volume), and learning such skills changes parts of the brain that store elaborated
representations of action and experience in the massively adaptable tissues of the cerebellum
and cerebral neocortex (Schlaug et al. 2005). Learning musical perceptual or executive knowledge
and skills must involve systems of the brain at all levels, and not all of these can be called
‘emotional’, although all may be subject to emotional evaluative influences (Molnar-Szakacs
and Overy 2006).
Playful arts and rituals most probably have special value for development of the child’s
brain and of skills that are valued in healthy societies (d’Aquili and Laughlin 1979). Early
education in active, enjoyable musical experience, with opportunities to acquire and share fluent
emotive expressions of musicality, may have profound positive consequences for the rest of
a child’s mental apparatus and for overall development (Bjørkvold 1992; Custodero 2005;
Flohr and Trevarthen 2007). Such enjoyable activities may activate neuronal growth factors
within the brain, and also epigenetically invigorate brain systems that promote life-long
satisfaction with being alive, thereby diminishing depression. The role of arts in early childhood
education has diminished markedly, especially in the United States, as ‘no child left behind’
politics of ‘back to basics’ has trumped emotional engagements with the arts (though
see Fröhlich, Chapter 22, this volume, for an exception). To delete music and the arts from the
school curricula, with instruction limited to rational and technical skills, may be tantamount to
leaving every child behind—a false and insensitive economy (Rodrigues et al. Chapter 27, this
volume).
7.4.3 The emotional effects of music in the regulation of mood, movement, and thought
Music is highly effective for mood induction (Camp et al. 1989; Kenealy 1988; Mayer et al. 1995;
Stratton and Zalanowski 1991), and more robust effects are achieved when one uses participant-
selected rather than experimenter-selected music (Carter et al. 1995; Thaut and Davis 1993).
When formally evaluated, the mood changes induced outlast the music by only about 10 minutes
(Panksepp and Bernatzky 2002). This is about as long as the Mozart Effect on spatial reasoning
tasks (Rauscher and Shaw 1998), supporting the conclusion that both effects are mediated
simply by non-specific attention-focusing arousal effects of music. Newborn infants may be
calmed by music, especially if they show signs of actively listening to it (Standley 1998).
Parenthetically, calming effects beneficial to learning have also been seen in laboratory rats
(Chikahisa et al. 2006, 2007). Changing affective dynamics and the stimulation of ‘generalised
amodal cognition’ (Kühl 2007) may underlie the widely heralded Mozart Effect (see Crncec et al.
2006, for an overview of the literature on the cognitive and academic effects of music listening
in children).
Positive moods evoked by music can facilitate creative output (Adaman and Blaney 1995), but
there are difficult measurement issues to be resolved (Asmus 1985). It is less doubtful that the
pleasure derived from having one’s creative musicality applauded by others as fascinating
and beautiful is a very powerful reinforcer of emotions of pleasure and well-being displayed in
participation with music from early childhood (Bjørkvold 1992; Trevarthen 2002; Custodero
2005; Mazokopaki and Kugiumutzakis, Chapter 9, this volume).
The impact of music on the control of bodily movement is both immediate and profound,
and young children spontaneously move to music without instruction (Bjørkvold 1992;
Scott and Panksepp 2003b; see Lee and Schögler Chapter 6, Mazokopaki and Kugiumutzakis
Chapter 9, Fröhlich Chapter 22, and Custodero Chapter 23, this volume). The relationship
between emotions, distinct types of action tendencies, and the sounds we make is ancient
and fundamental to our nature (Todd 1985). These relations are instantiated in the creative
power and affective immediacy of dance as well as music. Much ancestral knowledge of
our species was traditionally transmitted through ritualized chanting and dance, which can
represent, in captivating metaphorical ways, how ritualized sequences of complex culturally
important actions should be conducted (Turner 1974, 1982; Lakoff and Johnson 1980; Donald
2001; Mithen 2005; Cross 2007; and Brandt Chapter 3, Merker Chapter 4, Cross and Morley
The ability of rhythmic music to promote and coordinate powerful bodily actions has surely
served as an impetus for the biological-cultural coevolution of music and dance. In many
cultures no semantic distinction is made between music and dance. For instance in the Igbo
language nkwa denotes dancing, singing and playing instruments. There is no concept of a music
solely of sound (Cross 2001; Cross and Morley, Chapter 5, this volume). The same was true for
the word musikè (μουσική) in ancient Greek, which signified music, poetry and dance (Storr
1992), all celebrated in the Dionysian rites.
Endless varieties of self-regulating and self-expressive movements are added by arms and
hands, which also aid the regulation and transmission of thoughts and concepts as they gesture
in intricate coordination with eyes, facial movements and the intonations of the voice
(Nespoulous et al. 1986; Varela et al. 1991; McNeill 1992; Goldin-Meadow and McNeill 1999;
Gallese and Lakoff 2005).
Disorders of communication due to rhythmic body movement, as in the dopamine deficits of
Parkinson’s disease, can be partly alleviated with music. Symptomatic relief of motor difficulties,
including involuntary movements that disrupt expressive gestures, has been noted during
exposure to the insistent rhythms of music (Sacks 1973). Clinical reports have been affirmed with
more rigorous approaches (Bernatzky et al. 2004; Pacchetti et al. 1998; Lee and Schögler, Chapter 6,
this volume). Indeed, physical exercise and music can promote the synthesis of dopamine
in animal brains (Sutoo and Akiyama 2003, 2004). The therapeutic use of musical activities to
promote relief from effects of severe trauma results in improved motor control as well as
emotional and social benefits bringing self-confidence and joy (Robarts Chapter 17, Osborne
7.5 Conclusion: affective regulations in the syntax and semantics of music and language
The importance of musicality for education brings us back to the question of the relationship
between communicative musicality and language, and the brain systems involved. Infant studies
show that musical communication exists between human infants and their mothers before
the emergence of propositional speech (Trehub 2006). With maturation, the diverse
emotional–musical communications of the infant separate into two streams—propositional
speech flows toward the left hemisphere, while the prosodic–emotional stream flows more force-
fully into the right hemisphere (Callan et al. 2006; Turner and Ioannides, Chapter 8, this volume).
If we dissect this argument into component parts, a case emerges for the evolutionary source of
motives for human language as indicated in the emotional sounds of other species, that led to
the pre-human emergence of a form of shared meaning comparable with the communicative
musicality of infancy. This can be summarized as follows:
1 Animals communicate with emotional sounds, and with greater subtlety than usually
imagined (Burgdorf and Panksepp 2006; Panksepp and Burgdorf 2003), but only affectively
(Panksepp 1998a, b; Wallin, Merker and Brown 2000, Section II; Fitch 2006).
2 Music is the ‘language’ of emotions, and its affective power arises from subcortical emotional
systems (Blood and Zatorre 2001; Menon and Levitin 2005; Panksepp and Bernatzky 2002).
3 A protomusical competence, coupling manual gestures with vocal gestures in narrations
leading to protolanguage (Halliday 1975) precedes language in development of the human
mind (Fernald 1992a, b; Trehub et al. 1984; Malloch 1999; Trevarthen 1999).
4 Communication by musical vocal gestures and vocal and manual language capacities remain
tightly coupled and they engage overlapping processes in the brain (Callan et al. 2006;
Schwartz et al. 2003; Turner and Ioannides, Chapter 8, this volume).
It is clear that the prosodic aspects of vocal expression are not only supremely important in
leading the infant into comprehension and production of speech, but in language learning
throughout life (Fonagy 2001). The same rhythmic phrases and affective tones, melodies and
prosody, of intentional activity are shared in music and language from their earliest stages
through the most complex elaborations. The brain that learns language is an organ of intersub-
jective collaboration, and to this end has systems of emotional regulation that are fundamentally
musical.
Thus, it is reasonable to envision that human language differentiated from our initial
affective–musical motivations guiding emerging cognitive abilities. Our provisional conclusion,
like those of others who have thought outside the box of ‘the language instinct’, is that not only
did our inborn musical nature derive from our more ancient socio-emotional nature
(Panksepp 1998b, 2005a), but the emergence of language was preconditioned by our capacity
for emotional feelings (Shanahan 2007). Vocal affective communication may have been a
precondition for the evolution of propositional communication by speech.
7.5.1 Coda: the immediate future of bio-musical research

The relationship between social processes and our enchantment with music, as emphasized
throughout this chapter and this volume, informs us about the nature and importance of shared
emotions. Music can amplify our sense of our unique place in animal nature—our capacity to
appreciate the profound joy, sadness, power and wonder of this life and its moral complexity.
It can become a critical ingredient in our sense of power and our triumphant feelings of victory,
or our sense of compassion and responsibility toward those who suffer. Music amplifies our awe
at the vast beauty of our physical, social and mental universes. Thereby it readily becomes a
natural part of religious tradition that may also rely deeply on the biology of social bonds
(Ostow 2007).
One day, when our still ruthlessly reductionistic neuroscientific culture begins to recognize and
accept that our souls are profoundly biological (Panksepp 1998b), we will truly understand how
music touches and transports the human spirit. Then we will begin to understand how music,
while it also easily captures pounding erotic rhythms and patriotic fury, can enrich our capacity
for gentle communion and forgiveness—our search for solace and grace. The evolutionary roots
of all our musicality are deeply embedded in the evolved passionate nature of our minds, all of
which is grounded in complex bodily representations which we have here encapsulated in terms
such as the core SELF with central SEEKING urges transported through Intrinsic Motive
Formations. We will have to find ways to study the large-scale neurodynamic rhythms of the
living brain to gain a deeper understanding of how the emotions that move us are elaborated by
such neural networks (Panksepp 2000b).
Our love of music ultimately reflects the ancestral ability of our mammalian brain to transmit
and receive sounds of emotion in movement—sounds that can arouse affective feelings that are
implicit indicators of adaptive vitality. Many emotional calls and cries we make were evolution-
arily designed to communicate whether certain actions or events were likely to promote or hinder
our well-being and survival. Among brain areas long implicated in the evolutionary generation
of expressions of emotionality in animal brains (Panksepp 1998a, 2005a), modern brain imaging
has revealed dramatic and deep subcortical foundations for peak musical experiences (Blood
and Zatorre 2001; Menon and Levitin 2005). An understanding of how music arouses the
emotional/affective processes of the brain will eventually provide a scientific understanding of
how we come to love music, and what benefits—perhaps the vitality of language itself—we gain
from such loving attachments and affective relationships with sound.
The role of subcortical emotional systems in our love affair with music remains greatly
underestimated, as it was in consciousness studies until recently (Panksepp 1998b, 2005a, b; Denton
2006; Merker 2006). Without the intrinsic ancestral dynamics of emotional systems, learned
musical facility remains affectively flat, its intricacies becoming only an intellectual exercise or a
muscular tour de force.
Upon such fundamental emotional foundations artists can construct simple melodies we
will never forget, or magnificent sonic spaces, to fill cathedrals or concert halls or outdoor rock
festivals. They have created cultural musical traditions reaching far beyond simple affective or
evolutionary concerns (Becker 2004). Musical meaning is eventually embedded in these cultural
creations and the neuro-affective structures of our minds. Any attempt to understand music in
either evolutionary or neurophysiological terms will, of course, be a reductive approximation and
fall short of explaining the wealth of musical sound constructed in the diverse sociocultural
dimensions of aesthetics. If we are to establish a basic psychobiological knowledge for this difficult
field, we must be content, first, with provisional simplifications of the natural complexities.
Through successive empirically guided theoretical approximations, we may generate some
lasting understanding of the mental apparatus by which our passions, intelligence and skill are
coordinated into the artistic whole.
Experience and education massively expand these potentials, awakening an appetite for endless
cultural inventions. In this way music moves us on, in a world of shared meanings of many kinds.
References
Adaman JE and Blaney PH (1995). The effects of musical mood induction on creativity. Journal of Creative
Behavior, 29, 95–108.
Alcaro A, Huber R and Panksepp J (2007). Behavioral functions of the mesolimbic dopaminergic system:
An affective neuroethological perspective. Brain Research Reviews, 56(2), 283–321.
Asmus EP (1985). The development of a multidimensional instrument for the measurement of affective
responses to music. Psychology of Music, 13, 19–30.
Bagri A, Sandner G and Di Scala G (1992). Wild running and switch-off behavior elicited by electrical
stimulation of the inferior colliculus: Effect of anticonvulsant drugs. Pharmacology Biochemistry and
Behavior, 39, 683–688.
Barrett LF (2006). Solving the emotion paradox: Categorization and the experience of emotion. Personality
and Social Psychology Review, 10, 20–46.

development. In M Bullowa, ed., Before speech: the beginning of human communication, pp. 63–77.
Becker J (2004). Deep listeners: Music, emotion, and trancing. Indiana University Press, Bloomington, IN.
Bekoff M and Byers JA (1998). Animal play: Evolutionary, comparative and ecological approaches. Cambridge
Bell C, Bodznick D, Montgomery J and Bastian J (1997). The generation and subtraction of sensory
expectations within cerebellum-like structures. Brain, Behavior and Evolution, 50, 17–31.
Bell CC (2001). Memory-based expectations in electrosensory systems. Current Opinion in Neurobiology,
11, 481–487.
Bellugi U (ed.) (2001). Journey from cognition to brain to gene: Perspectives from Williams syndrome. MIT
Benzon WL (2001). Beethoven’s anvil: Music in mind and culture. Basic Books, New York.
Bernardi L and Sleight P (2007). Music and biological rhythms. In M Klockars and M Peltomaa, eds, Music
meets medicine. Acta Gyllenbergiana VII, pp. 29–41. The Signe and Ane Gyllenberg Foundation,
Helsinki.
Bernatzky G, Bernatzky P, Hesse HP, Staffen W and Ladurner G (2004). Stimulating music increases
motor coordination in patients afflicted with Morbus Parkinson’s. Neuroscience Letters, 361, 4–8.
Bernstein N (1967). Coordination and regulation of movements. Pergamon, New York.
Blacking J (1976). How musical is man? Faber and Faber, London.
Blacking J (1988). Dance and music in Venda children’s cognitive development. In G Jahoda and
I M Lewis, eds, Acquiring culture: Cross-cultural studies in child development, pp. 91–112. Croom Helm,
Beckenham, Kent.
Blacking J (1995). Music, culture and experience. University of Chicago Press, London.
Blood AJ and Zatorre RJ (2001). Intensely pleasurable responses to music correlate with activity in
brain regions implicated in reward and emotion. Proceedings of the National Academy of Sciences,
98, 11818–11823.
Blood AJ, Zatorre RJ, Bermudez P and Evans AC (1999). Emotional responses to pleasant and unpleasant
music correlate with activity in paralimbic regions. Nature Neuroscience, 2, 322–327.
Bogen JE (1969). The other side of the brain. II: An appositional mind. Bulletin of the Los Angeles
Neurological Society, 34, 135–162.
Bourdieu P (1990). The logic of practice. Stanford University Press, Palo Alto, CA.
Bradbury JW and Vehrencamp SL (1998). Principles of animal communication. Sinauer Assocs,
Sunderland, MA.
Bradshaw JL and Rogers LJ (1993). The evolution of lateral asymmetries, language, tool use and intellect.
Bråten S (2007). On being moved: From mirror neurons to empathy. John Benjamins,
Amsterdam/Philadelphia.
Bråten S and Trevarthen C (2007). Prologue: From infant intersubjectivity and participant movements to
simulations and conversations in cultural common sense. In S. Bråten, ed., On being moved: From mirror
neurons to empathy, pp. 21–34. John Benjamins, Amsterdam/Philadelphia.
Breitling D, Guenther W and Rondot P (1987). Auditory perception of music measured by brain electrical
activity mapping. Neuropsychologia, 25, 765–774.
Brown S (2000). The ‘Musilanguage’ model of language evolution. In NL Wallin, B Merker and S Brown, eds,
Burgdorf J, Wood PL, Kroes RA, Moskal JR and Panksepp J (2007). Neurobiology of 50-kHz ultrasonic
vocalizations in rats: Electrode mapping, lesion, and pharmacological studies. Behavioral Brain Research,
182, 274–283.
Burgdorf J and Panksepp J (2006). The neurobiology of positive emotions. Neuroscience and Biobehavioral
Reviews, 30, 173–187.
Burghardt GM (2005). The genesis of animal play. MIT Press, Cambridge, MA.
Callan DE, Tsytsarev V, Hanakawa T et al. (2006). Song and speech: Brain regions involved with perception
and covert production. Neuroimage, 31, 1327–1342.
Camp CJ, Elder ST, Pignatiello M and Rasar LA (1989). A psychophysiological comparison of the Velten
and musical mood induction techniques. Journal of Music Therapy, 26, 140–154.
Carter CS (1998). Neuroendocrine perspectives on social attachment and love. Psychoneuroendocrinology,
23, 779–818.
Carter FA, Wilson JS, Lawson RH and Bulik CM (1995). Mood induction procedure: Importance of
individualizing music. Behavior Change, 12, 159–161.
Chang H and Trehub SE (1977). Auditory processing of relational information by young infants. Journal of
Experimental Child Psychology, 24, 324–331.
Cheney DL and Seyfarth RM (1990). The representation of social relations by monkeys. Cognition,
37, 67–96.
Chikahisa S, Sano A, Kitaoka K, Miyamoto KI and Sei H (2007). Anxiolytic effect of music depends on
ovarian steroid in female mice. Behavioral Brain Research, 179(1), 50–59.
Chikahisa S, Sei H, Morishima M et al. (2006). Exposure to music in the perinatal period enhances
learning performance and alters BDNF/TrkB signaling in mice as adults. Behavioral Brain Research,
169, 312–319.
Chugani HT (1998). A critical period of brain development: Studies of cerebral glucose utilization with
PET. Preventive Medicine, 27, 184–188.
Ciompi L and Panksepp J (2005). Energetic effects of emotions on cognitions—complementary psychobio-
logical and psychosocial findings. In R Ellis and N Newton, eds, Consciousness and emotion, pp. 23–55.
John Benjamins, Amsterdam/Philadelphia.
Clynes M (1977). Sentics, the touch of emotion. Doubleday Anchor, New York.
Clynes M, ed. (1982). Music, mind, and brain. Plenum, New York.
Clynes M (1995). Microstructural musical linguistics: composers’ pulses are liked most by the best musicians.
Cognition, 55, 269–310.
Clynes M and Nettheim N (1982). The living quality of music: Neurobiologic basis of communicating
feeling. In M Clynes, ed., Music, mind, and brain, pp. 47–82. Plenum, New York.
Condon WS (1979). Neonatal entrainment and enculturation. In M Bullowa, ed., Before speech: The beginnings
of human communication, pp. 131–148. Cambridge University Press, Cambridge.
Condon WS and Sander LS (1974). Neonate movement is synchronized with adult speech: Interactional
participation and language acquisition. Science, 183, 99–101.
Craig DG (2005). An exploratory study of physiological changes during chills induced by music. Musicae
Scientiae, 9, 273–287.
Critchley M and Henson RA eds (1977). Music and the brain. Charles C Thomas, Springfield, IL.
Crncec R, Wilson S and Prior M (2006). The cognitive and academic benefits of music to children: Facts
and fiction. Educational Psychology, 26(4), 579–594.
In Suk Won-Yi ed., Music, mind and science, pp. 10–29. Seoul National University Press, Seoul.
Cross I (2001). Music, cognition, culture and evolution. Annals of the New York Academy of Sciences,
903, 28–42.
Cross I (2003). Music and biocultural evolution. In M. Clayton, T. Herbert and R. Middleton, eds,
The cultural study of music: a critical introduction, pp. 19–30. Routledge, London.
Cross I (2007). Music, culture and evolution. In M Klockars and M Peltomaa, eds, Music meets medicine.
Acta Gyllenbergiana VII, pp. 5–13. The Signe and Ane Gyllenberg Foundation, Helsinki.
Custodero LA (2005). Observable indicators of flow experience: A developmental perspective on musical
engagement in young children from infancy to school age. Music Education Research, 7(2), 185–209.
d’Aquili EG and Laughlin CD (1979). The neurobiology of myth and ritual. In EG d’Aquili, CD Laughlin
and J McManus, eds, The spectrum of ritual: A biogenetic structural analysis, pp. 152–182. Columbia
Damasio AR (1999). The feeling of what happens: Body and emotion in the making of consciousness. Harcourt
Brace, New York.
Darwin C (1872/1998). The expression of the emotions in man and animals, 3rd edn. Oxford University
Press, New York.
Davidson RJ (1992). Anterior cerebral asymmetries and the nature of emotion. Brain and Cognition,
2, 125–151.
Davidson RJ (2001). Toward a biology of personality and emotion. Annals of the New York Academy of
Sciences, 935, 191–207.
Davidson RJ and Hugdahl K (1995). Brain asymmetry. MIT Press, Cambridge, MA.
Delamont RS, Julu POO and Jamal GA (1999). Periodicity of a noninvasive measure of cardiac vagal tone
during non-rapid eye movement sleep in non-sleep-deprived and sleep-deprived normal subjects.
Journal of Clinical Neurophysiology, 16(2), 146–153.
Denton D (2006). The primordial emotions: The dawning of consciousness. Oxford University Press,
Oxford.
Dissanayake E (1988). What is art for? University of Washington Press, Seattle, WA and London.
Dissanayake E (2000). Art and intimacy: How the arts began. University of Washington Press, Seattle and
London.
Don AJ, Schellenberg E and Rourke BP (1999). Music and language skills of children with Williams
syndrome. Child Neuropsychology, 5, 154–170.
Donald M (2001). A mind so rare. Norton, New York.
Doupe A and Kuhl PK (1999). Birdsong and speech: Common themes and mechanisms. Annual Review of
Neuroscience, 22, 567–631.
Farnsworth P (1969). The social psychology of music. Iowa State University Press, Ames, IA.
Fernald A (1992a). Meaningful melodies in mothers’ speech to infants. In Papoušek H, Jürgens U and
Papoušek M, eds, Nonverbal vocal communication: comparative and developmental aspects, pp. 262–282.
Cambridge University Press, Cambridge/Editions de la Maison des Sciences de l’Homme, Paris.
Fernald A (1992b). Human maternal vocalizations to infants as biologically relevant signals: An evolutionary
perspective. In J Barkow, L Cosmides and J Tooby eds, The adapted mind, pp. 392–428, Oxford
Fifer WP and Moon CM (1995). The effects of fetal experience with sound. In J-P Lecanuet, WP Fifer,
NA Krasnegor and WP Smotherman, eds, Fetal development: A psychobiological perspective, pp. 351–366.
Erlbaum, Hillsdale, NJ.
Fitch WT (2006). Production of vocalizations in mammals. In K Brown, ed., Encyclopedia of language and
linguistics, pp. 115–121. Elsevier, Oxford.
Flohr J and Trevarthen C (2007). Music learning in childhood: Early developments of a musical brain and
body. In F Rauscher and W Gruhn, eds, Neurosciences in music pedagogy, pp. 53–100. Nova Biomedical
Books, New York.
Fonagy I (2001). Languages within language. An evolutive approach. Foundations of Semiotics 13. John
Benjamins, Amsterdam/Philadelphia.
Frangiskakis JM, Ewart AK, Morris CA et al. (1996). LIM-kinase1 hemizygosity implicated in impaired
visuospatial constructive cognition. Cell, 86, 59–69.
Freeman W (2000). A neurobiological role of music in social bonding. In NL Wallin, B Merker and
S Brown, eds, The origins of music, pp. 411–424. MIT Press, Cambridge, MA.
Frøshaug OB and Aahus A (1995). When the moment sings: The muse within with Africa as a mirror.
Video, with Jon-Roar Bjørkvold, by ‘Visions’, Wergelandsvein, 23, 0167 Oslo, Norway. (In Norwegian,
English, Spanish, Portuguese and French.)
Gabrielsson A (1995). Expressive intention and performance. In R Steinberg, ed., Music and the mind
machine, pp. 35–47. Springer, Berlin.
Gabrielsson A and Juslin PN (1996). Emotional expression in music performance: Between the performer’s
intention and the listener’s experience. Psychology of Music, 24, 68–91.
Gabrielsson A and Lindstroem E (1995). Emotional expression in synthesizer and sentograph performance.
Psychomusicology, 14, 94–116.
Gallese V (2001). The ‘Shared Manifold’ hypothesis: From mirror neurons to empathy. Journal of
Consciousness Studies, 8(5–7), 33–50.
Gallese V and Lakoff G (2005). The brain’s concepts: The role of the sensory–motor system in reason and
language. Cognitive Neuropsychology, 22, 455–79.
Gallese V, Keysers C and Rizzolatti G (2004). A unifying view of the basis of social cognition. Trends in
Cognitive Sciences, 8, 396–403.
Goldin-Meadow S and McNeill D (1999). The role of gesture and mimetic representation in making
language. In MC Corballis and EG Lea, eds, The descent of mind: Psychological perspectives on hominid
evolution, pp. 155–172. Oxford University Press, Oxford.
Goldstein A (1980). Thrills in response to music and other stimuli. Physiological Psychology, 3, 126–29.
Gratier M (2008). Grounding in musical interaction: Evidence from jazz performances. Musicae Scientiae,
Special Issue. In press.
Grewe O, Nagel F, Kopiez R and Altenmüller E (2007). Listening to music as a re-creative process: physio-
logical, psychological, and psychoacoustical correlates of chills and strong emotions. Music Perception,
24, 297–314.
Guhn M, Hamm A and Zentner M (2007). Physiological and musico-acoustic correlates of the chill response.
Music Perception, 24, 170–180.
Halliday MAK (1975). Learning how to mean: Explorations in the development of language. Edward Arnold,
London.
Halper DL, Blake R and Hillenbrand J (1986). Psychoacoustics of a chilling sound. Perception and
Psychophysics, 39, 77–80.
Harrer G and Harrer H (1977). Music, emotion and autonomic arousal. In M Critchley and RA Henson,
eds, Music and the brain, pp. 202–216. Charles C Thomas, Springfield, IL.
Hauser MD (1996). The evolution of communication. MIT Press, Cambridge, MA.
Hauser MD (2000). The sound and the fury: Primate vocalizations as reflections of emotion and
thought. In NL Wallin, B Merker and S Brown, eds, The origins of music, pp. 77–102. MIT Press,
Cambridge, MA.
Hejmadi A, Davidson RJ and Rozin P (2000). Exploring Hindu Indian emotional expressions: Evidence for
accurate recognition by Americans and Indians. Psychological Science, 11, 183–187.
Hobson P (2002). The cradle of thought: Exploring the origins of thinking. Macmillan, London.
Hodges DA (ed.) (1995). Handbook of music psychology. IMR Press, San Antonio.
Holstege G, Bandler R and Saper CB eds (1996). The emotional motor system (Progress in brain research,
Volume 107). Elsevier, Amsterdam.
Holy TE and Guo Z (2005). Ultrasonic songs of male mice. PLoS Biology, 3, e386.
Hooff JARAM van (1989). Laughter and humour, and the ‘duo-in-uno’ of nature and culture. In W Koch, ed.,
The Nature of Culture, pp. 120–149. Proceedings of the International and Interdisciplinary Symposium,
Ruhr-Universität, Bochum, October 7–11, 1986. Brockmeyer, Bochum.
Hopyan T, Dennis M, Weksberg R and Cytrynbaum C (2001). Music skills and the expressive interpretation
of music in children with Williams-Beuren syndrome: Pitch, rhythm, melodic imagery, phrasing and
musical affect. Child Neuropsychology, 7, 42–53.
Imberty M (1981). Les écritures du temps. Dunod, Paris.
Imberty M (2000). The question of innate competencies in musical communication. In NL Wallin,
Imberty M (2005). La musique creuse le temps. De Wagner à Boulez: Musique, psychologie, psychanalyse.
L’Harmattan, Paris.
Imberty M and Gratier M (eds) (2008). Musicae Scientiae, Special issue on musical narrative. In press.
Insel T (1997). The neurobiology of social attachment. American Journal of Psychiatry, 154, 726–735.
Insel TR (2003). Is social attachment an addictive disorder? Physiology and Behavior, 79, 351–357.
Insel TR and Young LJ (2001). The neurobiology of attachment. Nature Reviews Neuroscience,
2, 129–136.
Jeannerod M (1994). The representing brain: Neural correlates of motor intention and imagery. Behavioral
and Brain Sciences, 17(2), 187–245.
Jeannerod M (2004). Vision and action cues contribute to self-other distinction. Nature Neuroscience,
7(5), 422–423.
Jernigan TL, Bellugi U, Sowell E, Doherty S and Hesselink JR (1993). Cerebral morphologic distinctions
between Williams and Down syndromes. Archives of Neurology, 50, 186–191.
Juslin PN (1997). Can results from studies of perceived expression in musical performances be generalized
across response formats? Psychomusicology, 16, 77–101.
Juslin PN (2001). Communicating emotion in music performance: A review and theoretical framework.
In PN Juslin and J Sloboda, eds, Music and emotion: Theory and research, pp. 309–337. Oxford
University Press, Oxford.
Kandel E (2006). Psychiatry, psychoanalysis, and the new biology of mind. American Psychiatric Publishing
Inc., New York.
Kenealy P (1988). Validation of a music mood induction procedure: Some preliminary findings. Cognition
and Emotion, 2, 41–48.
Kimura D (1982). Left-hemisphere control of oral and brachial movements and their relation to
communication. Philosophical Transactions of the Royal Society, London, Series B, 298, 135–149.
Klockars M and Peltomaa M (2007). Music meets medicine. Acta Gyllenbergiana VII. The Signe and
Ane Gyllenberg Foundation, Helsinki.
Krantz G (2007). Mental responses to music. In M Klockars and M Peltomaa, eds, Music meets medicine.
Acta Gyllenbergiana VII, pp. 103–113. The Signe and Ane Gyllenberg Foundation, Helsinki.
Kreutz G, Bongard S, Rohrmann S, Hodapp V and Grebe D (2004). Effects of choir singing or listening
on secretory immunoglobulin A, cortisol, and emotional state. Journal of Behavioral Medicine,
27, 623–635.
Kroes RA, Panksepp J, Burgdorf J, Otto NJ and Moskal JR (2006). Social dominance-submission gene
expression patterns in rat neocortex. Neuroscience, 137, 37–49.
Krumhansl CL (1997). An exploratory study of musical emotions and psychophysiology. Canadian Journal
of Experimental Psychology, 51, 336–352.
Kühl O (2007). Musical semantics. European Semiotics: Language, Cognition and Culture, No. 7. Peter
Lang, Bern.
Lakoff G and Johnson M (1980). Metaphors we live by. University of Chicago Press, Chicago, IL.
Langer S (1942). Philosophy in a new key. Harvard University Press, Cambridge, MA.
Lecanuet J-P (1996). Prenatal auditory experience. In I Deliege and J Sloboda, eds, Musical beginnings:
origins and development of musical competence, pp. 3–34. Oxford University Press, Oxford,
New York, Tokyo.
Lee DN (1998). Guiding movement by coupling taus. Ecological Psychology, 10(3–4), 221–250.
Lee DN, Craig CM and Grealy MA (1999). Sensory and intrinsic coordination of movement. Proc R Soc
London B, 266, 2029–2035.
Levin FM (2004). Psyche and brain: The biology of talking cures. International Universities Press,
Madison, CT.
Levitin DJ and Bellugi U (1998). Musical abilities in individuals with Williams syndrome. Music Perception,
15, 357–389.
Liotti M and Panksepp J (2004). On the neural nature of human emotions and implications for biological
psychiatry. In J Panksepp, ed., Textbook of biological psychiatry, pp. 33–74. Wiley, New York.
MacDonald RAR, Hargreaves DJ and Miell D (eds) (2002). Musical identities. Oxford University Press,
Oxford.
MacLean PD (1990). The triune brain in evolution: Role in paleocerebral functions. Plenum Press, New York.
MacNeilage PF (1999). Whatever happened to articulate speech? In MC Corballis and EG Lea, eds,
The descent of mind: Psychological perspectives on hominid evolution, pp. 116–137. Oxford University
Press, Oxford.
Malashichev YB and Rogers LJ (eds) (2002). Behavioural and morphological asymmetries in amphibians
and reptiles. Laterality, 7(3), 195–229.
Malloch S (1999). Mother and infants and communicative musicality. Musicae Scientiae (Special Issue
1999–2000), 29–57.
Marler P and Doupe AJ (2000). Singing in the brain. Proceedings of the National Academy of Sciences of the
USA, 97(7), 2965–2967.
Marshall JT Jr and Marshall ER (1976). Gibbons and their territorial songs. Science, 199, 235–237.
Mayer JD, Allen JP and Beauregard K (1995). Mood inductions for four specific moods: A procedure
employing guided imagery vignettes with music. Journal of Mental Imagery, 19, 133–150.
McFarland RA and Kennison R (1989). Asymmetry in the relationship between finger temperature changes
and emotional state in males. Biofeedback and Self Regulation, 14, 281–290.
McNeill D (1992). Hand and mind: What gestures reveal about thought. University of Chicago Press,
Chicago, IL.
Menon V and Levitin DJ (2005). The rewards of music listening: response and physiological connectivity of
the mesolimbic system. Neuroimage, 28, 175–184.
Merker B (2005). The liabilities of mobility: A selection pressure for the transition to cortex in animal
evolution. Consciousness and Cognition, 14, 89–114.
Merker B (2006). Consciousness without a cerebral cortex: A challenge for neuroscience and medicine.
Behavioral and Brain Sciences, 30, 63–134.
Merker B and Cox C (1999). Development of the female great call in Hylobates gabriellae: A case study.
Folia Primatologica, 70, 97–106.
Merker B and Wallin NL (2001). Musical responsiveness in Rett disorder. In A Kerr and I Witt Engerström,
eds, Rett disorder and the developing brain, pp. 327–338. Oxford University Press, Oxford.
Mervis CB, Morris CA, Bertrand J and Robinson BF (1999). Williams syndrome: Findings from an
integrated program of research. In H Tager-Flusberg, ed., Neurodevelopmental disorders, pp. 65–110.
MIT Press, Cambridge, MA.
Miell D, MacDonald R and Hargreaves D (eds) (2005). Musical communication. Oxford University Press,
Oxford.
Miller G (2000). The mating mind: How sexual choice shaped the evolution of human nature. Doubleday
Books, New York.
Nicholson, London.
Molnar-Szakacs I and Overy K (2006). Music and mirror neurons: from motion to ‘e’motion. Social
Cognitive and Affective Neuroscience, 1(3), 235–241.
Murray L and Trevarthen C (1985). Emotional regulation of interactions between two-month-olds
and their mothers. In TM Field and NA Fox, eds, Social perception in infants, pp. 177–197.
Ablex, Norwood, NJ.
Musacchia G, Sams M, Skoe E and Kraus N (2007). Musicians have enhanced subcortical auditory and
audiovisual processing of speech and music. Proceedings of the National Academy of Sciences, USA,
104, 15894–15898.
Nadel J, Carchon I, Kervella C, Marcelli D and Réserbat-Plantey D (1999). Expectancies for social
contingency in 2-month-olds. Developmental Science, 2(2), 164–173.
Nelson E and Panksepp J (1998). Brain substrates of infant-mother attachment: Contributions of opioids,
oxytocin, and norepinephrine. Neuroscience and Biobehavioral Reviews, 22, 437–452.
Nespoulous J-L, Perron P and Lecours AR (eds) (1986). The biological foundation of gestures: Motor and
semiotic aspects. Erlbaum, Hillsdale, NJ.
Nickerson E, Greenberg F, Keating MT, McCaskill C and Shaffer LG (1995). Deletions of the elastin gene
at 7q11.23 occur in approximately 90% of patients with Williams syndrome. American Journal of
Human Genetics, 56, 1156–1161.
Nielzen S and Cesarec Z (1982). Emotional experience of music by psychiatric patients compared with
normal subjects. Acta Psychiatrica Scandinavica, 65, 450–460.
Northoff G, Henzel A, de Greck M, Bermpohl F, Dobrowolny H and Panksepp J (2006). Self-referential
processing in our brain—A meta-analysis of imaging studies of the self. Neuroimage, 31, 440–457.
Nyklicek I, Thayer JF and Van Doornen LJP (1997). Cardiorespiratory differentiation of musically-
induced emotions. Journal of Psychophysiology, 11, 304–321.
Ostow M (2007). Spirit, mind, and brain: A psychoanalytic examination of spirituality and religion. Columbia
Pacchetti C, Aglieri R, Mancini F, Martignoni E and Nappi G (1998). Active music therapy and Parkinson’s
disease: methods. Functional Neurology, 13, 57–67.
Panksepp J (1981). Brain opioids: A neurochemical substrate for narcotic and social dependence. In S
Cooper, ed., Progress in theory in psychopharmacology, pp. 149–175. Academic Press, London.
Panksepp J (1986). The neurochemistry of behavior. Annual Review of Psychology, 37, 77–107.
Panksepp J (1992). Oxytocin effects on emotional processes: separation distress, social bonding, and
relationships to psychiatric disorders. Annals of the New York Academy of Sciences, 652, 243–252.
Panksepp J (1995). The emotional sources of ‘chills’ induced by music. Music Perception, 13, 171–207.
Panksepp J (1998a). Affective neuroscience: The foundations of human and animal emotions. Oxford
Panksepp J (1998b). The periconscious substrates of consciousness: Affective states and the evolutionary
origins of the SELF. Journal of Consciousness Studies, 5, 566–582.
Panksepp J (2000a). Affective consciousness and the instinctual motor system: The neural sources of
sadness and joy. In R Ellis and N Newton, eds, The caldron of consciousness: Motivation, affect and
self-organization, Advances in Consciousness Research, pp. 27–54. John Benjamins,
Panksepp J (2000b). The neurodynamics of emotions: An evolutionary-neurodevelopmental view.
In MD Lewis and I Granic, eds, Emotion, self-organization, and development, pp. 236–264.
Cambridge University Press, New York.
Panksepp J (2001). The long-term psychobiological consequences of infant emotions: Prescriptions for the
21st century. (reprinting from Infant Mental Health Journal, 2001, 22, 132–173.) NeuroPsychoanalysis,
3, 140–178.
Panksepp J (2003a). At the interface between the affective, behavioral and cognitive neurosciences:
Decoding the emotional feelings of the brain. Brain and Cognition, 52, 4–14.
Panksepp J (2003b). An archeology of mind: The ancestral sources of human feelings. Soundings, 86, 41–69.
Panksepp J (2003c). Can anthropomorphic analyses of ‘separation cries’ in other animals inform us about
the emotional nature of social loss in humans? Psychological Review, 110, 376–388.
Panksepp J (2005a). Affective consciousness: Core emotional feelings in animals and humans. Consciousness
and Cognition, 14, 19–69.
Panksepp J (2005b). On the embodied neural nature of core emotional affects. Journal of Consciousness
Studies, 12, 158–184.
Panksepp J (2005c). Beyond a joke: From animal laughter to human joy? Science, 308, 62–63.
Panksepp J (2007a). Neuroevolutionary sources of laughter and social joy: Modeling primal human
laughter in laboratory rats. Behavioral Brain Research, 182, 231–244.
Panksepp J (2007b). Can PLAY diminish ADHD and facilitate the construction of the social brain?
Journal of the Canadian Academy of Child and Adolescent Psychiatry, 16(2), 5–14.
Panksepp J and Bekkedal MYV (1997). The affective cerebral consequence of music: Happy vs sad effects
on the EEG and clinical implications. International Journal of Arts Medicine, 5, 18–27.
Panksepp J and Bernatzky G (2002). Emotional sounds and the brain: the neuro-affective foundations of
Panksepp J and Bishop P (1981). An autoradiographic map of (3H) diprenorphine binding in rat brain:
Effects of social interaction. Brain Research Bulletin, 7, 405–410.
Panksepp J and Burgdorf J (2003). ‘Laughing’ rats and the evolutionary antecedents of human joy?
Physiology and Behavior, 79, 533–547.
Panksepp J, Nelson E and Siviy S (1994). Brain opioids and mother–infant social motivation.
Acta Paediatrica Supp, 397, 40–46.
Papoušek M and Papoušek H (1981). Musical elements in the infant’s vocalization: Their significance for
communication, cognition, and creativity. In LP Lipsitt and CK Rovee-Collier, eds, Advances in Infancy
Research, vol. 1, pp. 163–224. Ablex, Norwood, NJ.
Papoušek M, Papoušek H and Symmes D (1991). The meanings of melodies in motherese in tone and
stress language. Infant Behavior and Development, 14, 414–440.
Papoušek M (1994). Melodies in caregivers’ speech: A species specific guidance towards language. Early
Development and Parenting, 3, 5–17.
Penhune VB, Zatorre RJ and Feindel WH (1999). The role of auditory cortex in retention of rhythmic
patterns as studied in patients with temporal lobe removals including Heschl’s gyrus. Neuropsychologia,
37, 315–331.
Peoples R, Perez-Jurado L, Wang YK, Kaplan P and Francke U (1996). The gene for replication factor C
subunit 2 (RFC2) is within the 7q11.23 Williams syndrome deletion. American Journal of Human
Genetics, 58, 1370–1373.
Peretz I (1990). Processing of local and global musical information by unilateral brain-damaged patients.
Brain, 113, 1185–1205.
Peretz I and Zatorre R (eds) (2003). The cognitive neuroscience of music. Oxford University Press,
New York.
Peretz I and Zatorre RJ (2005). Brain organization for music processing. Annual Review of Psychology,
56, 89–114.
Peretz I, Gagnon L and Bouchard B (1998). Music and emotion: perceptual determinants, immediacy, and
isolation after brain damage. Cognition, 68, 111–141.
Peretz I, Kolinsky R, Tramo M et al. (1994). Functional dissociations following bilateral lesions of auditory
cortex. Brain, 117, 1283–1301.
Perry DW, Zatorre RJ, Petrides M, Alivisatos B, Meyer E and Evans AC (1999). Localization of cerebral
activity during simple singing. NeuroReport, 10, 3453–3458.
Peterson B and Panksepp J (2004). The biological psychiatry of childhood disorders. In Panksepp J, ed.,
Textbook of biological psychiatry, pp. 393–436. Wiley, New York.
Petsche H, Lindner K, Rappelsberger P and Gruber G (1988). The EEG: An adequate method to concretize
brain processes elicited by music. Music Perception, 6, 133–160.
Petsche H (1996). Approaches to verbal, visual and musical creativity by EEG coherence analysis.
International Journal of Psychophysiology, 24, 145–159.
Pfurtscheller G, Klimesch W, Berghold A, Mohl W and Schimke H (1990). Event-related desynchronization
(ERD) correlated with cognitive activity. In ER John, ed., Machinery of the mind, pp. 243–251. Birkhäuser,
Boston, MA.
Piontelli A (2002). Twins: From fetus to child. Routledge, London.
Ploog D (1992). The evolution of vocal communication. In H Papoušek, U Jürgens, and M Papoušek, eds,
Nonverbal vocal communication: Comparative and developmental aspects, pp. 3–13. Cambridge
University Press, Cambridge/New York.
Pollick AS and de Waal FBM (2007). Ape gestures and language evolution. Proceedings of the National
Academy of Sciences, 104(19), 8184–8189.
Pratt RR and Grocke DE (eds) (1999). MusicMedicine, vol. 3 – Music medicine and music therapy:
Expanding horizons. MMB Music, Saint Louis, MO.
Pratt RR and Spintge R (eds) (1996). MusicMedicine, vol. 2. MMB Music, Saint Louis, MO.
Quaranta A, Siniscalchi M and Vallortigara G (2007). Asymmetric tail-wagging responses by dogs to
different emotional stimuli. Current Biology, 17(6), R199–R201.
Rauscher F and Shaw GL (1998). Key components of the Mozart effect. Perceptual and Motor Skills,
86, 835–841.
Reddy V (2003). On being the object of attention: implications for self–other consciousness. Trends in
Cognitive Sciences, 7(9), 397–402.
Reddy V and Trevarthen C (2004). What we learn about babies from engaging with their emotions.
Zero to Three, 24(3), 9–15.
Richman B (1987). Rhythm and melody in Gelada vocal exchanges. Primates, 28, 199–223.
Ridley M (2003). The agile gene. HarperCollins, New York.
Rimland B and Edelson SM (1995). A pilot study of auditory integration training in autism. Journal of
Autism and Developmental Disorders, 25, 61–70.
Rizzolatti G and Arbib MA (1998). Language within our grasp. Trends in Neurosciences, 21, 188–194.
Rizzolatti G, Fogassi L and Gallese V (2006). Mirrors in the mind. Scientific American, 295(5), 30–37.
Robazza C, Macaluso C and D’Urso V (1994). Emotional reactions to music by gender, age, and expertise.
Perceptual and Motor Skills, 79, 939–944.
Robb L (1999). Emotional musicality in mother–infant vocal affect, and an acoustic study of postnatal
depression. Musicae Scientiae (Special Issue, 1999–2000), 123–151.
Roederer JG (1984). The search for the survival value of music. Music Perception, 1, 350–356.
Rogers LJ and Kaplan G (2000). Song, roars and rituals: Communication in birds, mammals and other
animals. Harvard University Press, Cambridge, MA.
Rönnqvist L and Hofsten C von (1994). Neonatal finger and arm movements as determined by a social and
an object context. Early Development and Parenting, 3, 81–94.
Sacks O (1973). Awakenings. Dutton, New York.
Sacks O (2006). The power of music. Brain, 129, 2528–2532.
Sacks O (2007). Musicophilia: Tales of music and the brain. Random House, New York/Picador, London.
Saffran JR and Griepentrog GJ (2001). Absolute pitch in infant auditory learning: Evidence for develop-
mental reorganization. Developmental Psychology, 37, 74–85.
Sarnthein J, von Stein A, Rappelsberger P, Petsche H, Rauscher FH and Shaw GL (1997). Persistent
patterns of brain activity: An EEG coherence study of the positive effect of music on spatial-temporal
reasoning. Neurological Research, 19, 107–116.
Scherer KR (1986). Vocal affect expression: A review and a model for future research. Psychological Bulletin,
99, 143–165.
Schilbach L, Wohlschläger AM, Newen A et al. (2006). Being with others: Neural correlates of social
interaction. Neuropsychologia, 44(5), 718–730.
Schlaug G, Norton A, Overy K and Winner E (2005). Effects of music training on brain and cognitive
development. Annals of the New York Academy of Sciences, 1060, 219–230.
Schögler B and Trevarthen C (2007). To sing and dance together. In S Bråten, ed., On being moved:
From mirror neurons to empathy, pp. 281–302. John Benjamins, Amsterdam/Philadelphia.
Schore AN (1994). Affect regulation and the origin of the self: The neurobiology of emotional development.
Erlbaum, Hillsdale, NJ.
Schore AN (1998). The experience-dependent maturation of an evaluative system in the cortex.
In KH Pribram, ed., Brain and values: Is a biological science of values possible?, pp. 337–358.
Erlbaum, Mahwah, NJ.
Schubert E (1996). Enjoyment of negative emotions in music: An associative network explanation.
Psychology of Music, 24(1), 18–28.
Schubert E and McPherson GE (2006). The perception of emotion in music. In GE McPherson, ed.,
The child as musician: A handbook of musical development, pp. 193–212. Oxford University Press, Oxford.
Schwartz DA, Howe CQ and Purves D (2003). The statistical structure of human speech sounds predicts
musical universals. Journal of Neuroscience, 23, 7160–7168.
Scott E and Panksepp J (2003). Rough-and-tumble play in human children. Aggressive Behavior,
29(6), 539–551.
Sergent J, Zuck E, Terriah S and MacDonald B (1992). Distributed neural network underlying musical
sight-reading and keyboard performance. Science, 257, 106–109.
Seyfarth RM and Cheney DL (2003). Meaning and emotion in animal vocalizations. Annals of the New York
Academy of Sciences, 1000, 32–55.
Shanahan D (2007). Language, feeling, and the brain. New Brunswick, NJ and London, Transaction
Publishers.
Shepard R (1999). Cognitive psychology of music. In PR Cook, ed., Music, cognition, and computerized
sound, pp. 21–35. MIT Press, Cambridge, MA.
Shewmon DA, Holmes GL and Byrne PA (1999). Consciousness in congenitally decorticate children:
developmental vegetative state as self-fulfilling prophecy. Developmental Medicine and Child Neurology,
41, 364–374.
Siegel D (1999). The developing mind: Toward a neurobiology of interpersonal experience. Guilford Press,
New York.
Sloboda J (1991). Music structure and emotional response: Some empirical findings. Psychology of Music,
19, 110–120.
In WPD Wightman and JC Bryce, eds, Essays on philosophical subjects, pp. 176–213. Liberty Fund,
Indianapolis, IN.
Solms M (1997). What is consciousness? Journal of the American Psychoanalytic Association, 45, 482–489.
Standley JM (1998). The effect of music and multi-modal stimulation on developmental responses of
premature infants in neonatal intensive care. Pediatric Nursing, 24, 532–538.
Stefano GB, Zhu W, Cadet P, Salamon E and Mantione KJ (2004). Music alters constitutively expressed
opiate and cytokine processes in listeners. Medical Science Monitor, 10, MS18–27.
Steinberg R (ed.) (1995). Music and the mind machine. Springer, Berlin.
Stern DN (1985). The interpersonal world of the infant: A view from psychoanalysis and developmental
psychology. Basic Books, New York.
Stern DN (1990). Joy and satisfaction in infancy. In RA Glick and S Bone, eds, Pleasure beyond the pleasure
principle, pp. 13–25. Yale University Press, New Haven, CT.
Stern DN (1993). The role of feelings for an interpersonal self. In U Neisser, ed., The perceived self: Ecological
and interpersonal sources of self-knowledge, pp. 205–215. Cambridge University Press, New York.
Stern DN (1995). The motherhood constellation. Basic Books, New York.

Stern DN (1999). Vitality contours: The temporal contour of feelings as a basic unit for constructing the
infant’s social experience. In P Rochat, ed., Early social cognition: Understanding others in the first months
of life, pp. 67–90. Erlbaum, Mahwah, NJ.
Stern DN, Hofer L, Haft W and Dore J (1985). Affect attunement: The sharing of feeling states between
mother and infant by means of inter-modal fluency. In TM Field and NA Fox, eds, Social perception in
infants, pp. 249–268. Ablex, Norwood, NJ.
Stewart L, von Kriegstein K, Warren JD and Griffiths TD (2006). Music and the brain: disorders of
musical listening. Brain, 129, 2533–2553.
Stokoe WC (2001). Language in hand: Why signs came before speech. Gallaudet University Press,
Washington, DC.
Storr A (1992). Music and the mind. Ballantine Books, New York.
Stratton VN and Zalanowski AH (1991). The effects of music and cognition on mood. Psychology of Music,
19, 121–127.
Sutoo D and Akiyama K (2003). Regulation of brain function by exercise. Neurobiology of Disease, 13, 1–14.
Sutoo D and Akiyama K (2004). Music improves dopaminergic neurotransmission: demonstration based
on the effect of music on blood pressure regulation. Brain Research, 1016, 255–262.
Terwogt MM and Van Grinsven F (1991). Musical expressions of mood states. Psychology of Music, 19, 99–109.
Thaut MH and Davis WB (1993). The influence of subject-selected versus experimenter-chosen music on
affect, anxiety, and relaxation. Journal of Music Therapy, 30, 210–233.
Thompson E (ed.) (2001). Between ourselves: second-person issues in the study of consciousness. Charlottesville,
VA/Thorverton, UK: Imprint Academic. Also published in the Journal of Consciousness Studies,
8, Number 5–7.
Tinbergen N (1951). The study of instinct. Clarendon Press, Oxford.
Todd N (1985). A model of expressive timing in tonal music. Music Perception, 3, 33–57.
Trainor LJ and Schmidt LA (2003). Processing emotions induced by music. In I Peretz and R Zatorre, eds,
The cognitive neuroscience of music, pp. 310–324. Oxford University Press, New York.
Trehub SE (2006). Infants as musical connoisseurs. In GE McPherson, ed., The child as musician, pp. 33–49.
Oxford University Press, Oxford.
Trehub SE, Bull D and Thorpe LA (1984). Infants’ perception of melodies: The role of melodic contour.
Child Development, 55, 821–830.
Trevarthen C (1974). The psychobiology of speech development. In EH Lenneberg, ed., Language and
brain: Developmental aspects—Neurosciences Research Program Bulletin, vol. 12, pp. 570–585.
Neuroscience Research Program, Boston, MA.
Trevarthen C (1978). Manipulative strategies of baboons and the origins of cerebral asymmetry.
In M Kinsbourne, ed., The asymmetrical functions of the brain, pp. 329–391. Cambridge University
Press, New York and London.
Trevarthen C (1984). Hemispheric specialization. In SR Geiger et al. eds, Handbook of Physiology; (Section 1,
The Nervous System); Volume 2, Sensory Processes. (Section Editor, I Darian-Smith), pp. 1129–1190.
American Physiological Society, Washington, DC.
Trevarthen C (1985). Neuroembryology and the development of perceptual mechanisms. In F Falkner and
JM Tanner, eds, Human growth, 2nd edn, pp. 301–383. Plenum, New York.
Trevarthen C (1986). Form, significance and psychological potential of hand gestures of infants.
In J-L Nespoulous, P Perron and AR Lecours, eds, The biological foundation of gestures: Motor and
semiotic aspects, pp. 149–202. Erlbaum, Hillsdale, NJ.
Trevarthen C (1993). The function of emotions in early infant communication and development.
In J Nadel and L Camaioni, eds, New perspectives in early communicative development, pp. 48–81.
Routledge, London.
Trevarthen C (1995). Mother and baby – seeing artfully eye to eye. In R Gregory, J Harris, D Rose and
P Heard, eds, The artful eye, pp. 157–200. Oxford University Press, Oxford.
Trevarthen C (1996). Lateral asymmetries in infancy: Implications for the development of the hemispheres.
Neuroscience and Biobehavioral Reviews, 20(4), 571–586.
Trevarthen C (1997). Foetal and neonatal psychology: Intrinsic motives and learning behaviour.
In F Cockburn, ed., Advances in perinatal medicine, pp. 282–291. Parthenon, New York.
Trevarthen C (1998). The concept and foundations of infant intersubjectivity. In S Bråten, ed.,
Intersubjective communication and emotion in early ontogeny, pp. 15–46. Cambridge University Press,
Cambridge.
Trevarthen C (1999). Musicality and the intrinsic motive pulse: evidence from human psychobiology and
infant communication. Musicae Scientiae (Special Issue, 1999–2000), 157–213.
Trevarthen C (2001). The neurobiology of early communication: intersubjective regulations in human
brain development. In AF Kalverboer and A Gramsbergen, eds, Handbook on brain and behavior in
human development, pp. 841–882. Dordrecht, The Netherlands, Kluwer.
Press, Oxford.
Trevarthen C (2004). How infants learn how to mean. In M Tokoro and L Steels, eds, A learning zone of
one’s own, pp. 37–69. (SONY Future of Learning Series). IOS Press, Amsterdam.
Trevarthen C (2005). Action and emotion in development of the human self, its sociability and cultural
intelligence: Why infants have feelings like ours. In J Nadel and D Muir, eds, Emotional development,
pp. 61–91. Oxford University Press, Oxford.
Trevarthen C (2008a). Human biochronology: On the source and functions of ‘musicality’. In R Haas and
V Brandes, eds, Proceedings of the Mozart and Science Conference, Baden, October, 2006. Springer,
Vienna, in press.
Trevarthen C (2008b). The musical art of infant conversation: Narrating in the time of sympathetic
experience, without rational interpretation, before words. Musicae Scientiae (Special Issue), M Imberty
and M Gratier, eds, in press.
Trevarthen C and Aitken KJ (1994). Brain development, infant communication, and empathy disorders:
intrinsic factors in child mental health. Development and Psychopathology, 6, 599–635.
Trevarthen C and Aitken KJ (2003). Regulation of brain development and age-related changes in infants’
motives: The developmental function of ‘regressive’ periods. In M Heimann, ed., Regression periods in
human infancy, pp. 107–184. Erlbaum, Mahwah, NJ.
Trevarthen C, Aitken KJ, Vandekerckhove M, Delafield-Butt J and Nagy E (2006). Collaborative
regulations of vitality in early childhood: Stress in intimate relationships and postnatal psycho-
pathology. In D Cicchetti and DJ Cohen, eds, Developmental psychopathology, volume 2,
Developmental neuroscience, pp. 65–126, 2nd edn. Wiley, New York.
Tucker DM (2001). Motivated anatomy: A core-and-shell model of corticolimbic architecture.
In G Gainotti, ed., Handbook of neuropsychology, 2nd edn, vol. 5: Emotional behavior and its disorders,
pp. 125–160. Elsevier, Amsterdam.
Tulving E and Markowitsch HJ (1998). Episodic and declarative memory: Role of the hippocampus.
Hippocampus, 8, 198–204.
Turner M (1996). The literary mind: The origins of thought and language. Oxford University Press,
New York/Oxford.
Turner V (1974). Dramas, fields and metaphors. Cornell University Press, Ithaca, NY.
Turner V (1983). Play and drama: The horns of a dilemma. In FE Manning, ed., The world of play,
pp. 217–224. Proceedings of the 7th Annual Meeting of the Association of the Anthropological Study
of Play. Leisure Press, West Point, NY.
Tzourio-Mazoyer N, De Schonen S, Crivello F et al. (2002). Neural correlates of woman face processing by
2-month-old infants. Neuroimage, 15, 454–461.
VanderArk SD and Ely D (1992). Biochemical and galvanic skin responses to music stimuli by college
students in biology and music. Perceptual and Motor Skills, 74, 1079–1090.
Varela F, Thompson E and Rosch E (1991). The embodied mind: Cognitive science and human experience.
MIT Press, Cambridge, MA.
Waldhoer M, Panksepp J, Pruitt D, et al. (1995). An animal model of auditory integration training (AIT).
Society for Neuroscience Abstracts, 21, 736.
Wallin NL (1991). Biomusicology: Neurophysiological, neuropsychological, and evolutionary perspectives on
the origins and purposes of music. Pergamon Press, Stuyvesant, NY.
Watt DF and Pincus DI (2004). Neural substrates of consciousness: Implications for clinical psychiatry.
In J Panksepp, ed., Textbook of biological psychiatry, pp. 627–660. Wiley, Hoboken, NJ.
Willems RM and Hagoort P (2007). Neural evidence for the interplay between language, gesture, and
action: A review. Brain and Language, 101, 278–289.
Williams JCP, Barratt-Boyes BG and Lowe JB (1961). Supravalvular aortic stenosis. Circulation,
24, 1311–1318.
Wittmann M and Pöppel E (1999). Temporal mechanisms of the brain as fundamentals of communication,
with special reference to music perception and performance. Musicae Scientiae, (Special Issue,
1999–2000), 13–28.
Zajonc RB (2004). Exposure effects: An unmediated phenomenon. In ASR Manstead, N Frijda and A
Fischer, eds, Feelings and emotions: The Amsterdam symposium, pp. 194–203. Cambridge University
Press, Cambridge, UK.
Zatorre RJ (1984). Musical perception and cerebral function: A critical review. Music Perception, 2, 196–221.
Zei Pollermann B (2002). A place for prosody in a unified model of cognition and emotion. Proceedings of
Laboratoire Parole et Langage [Speech Prosody] CNRS. Aix-en-Provence, France: Université de
Provence. Available from www.lpl.univ-aix.fr/sp2002/oral.htm.
Zentner MR and Kagan J (1996). Perception of music by infants. Nature, 383, 29.
Zoia S, Blason L, D’Ottavio G et al. (2007). Evidence of early development of action planning in the human
foetus: a kinematic study. Experimental Brain Research, 176, 217–226.
Zubieta JK, Ketter TA, Bueller JA et al. (2003). Regulation of human affective responses by anterior
cingulate and limbic mu-opioid neurotransmission. Archives of General Psychiatry, 60, 1145–1153.
Chapter 8
Brain, music and musicality:

Inferences from neuroimaging
Robert Turner and Andreas A. Ioannides
The man that hath no music in himself

Nor is not moved with concord of sweet sounds
Is fit for treasons, stratagems and spoils;
The motions of his spirit are dull as night
And his affections dark as Erebus:
Let no such man be trusted.
Shakespeare, The Merchant of Venice
8.1 Introduction
We approach the question of human musicality by considering the spatial and temporal patterns
of brain activity that may be specific to the human experience of music. Commonalities in any
activity or behaviour across people and cultures point towards biologically rather than culturally
defined competencies of mind and brain. As brain scientists, we wish to discriminate such innate
cerebral competencies for music from conventional, culturally learned skills, and to make sense
of the functional localizations for both types of activity in particular parts of the brain that are
amenable to study using neuroimaging techniques.
There is an important parallel here with studies of language in which, over the past century, the
question of innateness has been much debated, especially with reference to the theory of genera-
tive grammars and the possibility of a brain-based ‘language instinct’. Since instrumental music
and some songs lack explicit reference to things experienced in the world, yet excite powerful
cognitive and emotional effects, it is likely that the study of music, and especially its foundations
in the motives for musicality, can provide insights into the innate competencies of the brain that
enable the production and interpretation of the ordered sound sequences that constitute spoken
language.
In any given culture, the classification and terminology used by educated people in character-
izing the structured, ordered and intentional sequences of sounds understood as ‘music’ take on
a unique authority. In Western studies of music, terms such as ‘pitch’, ‘rhythm’, ‘melody’, and
‘timbre’ tend to be regarded as applicable to music found in any human group, whether or not
that group itself employs corresponding concepts in their music. Studies seeking universal prin-
ciples of brain organization or brain activation for these Western concepts could potentially
be taking an answer for granted. However, it is outside the scope of this chapter to review the
ethnomusicological literature on musical terminology worldwide (and see Cross and Morley,
Chapter 5, this volume). For the purposes of describing the corpus of research exploring brain
representations of musical competencies, we will accept the conventional terminology widely
used to identify features of music in Western cultures—melody, harmony, rhythm, chord, attack,
timbre, pitch, tempo and tonality—because this research corpus has done just that. Even where
the terms of analysis do not correspond to human universals, much can still be learned about
how our brains internalize the collective representations that make up a musical culture, by
studying the brain processes by which these musical concepts come to derive their unquestion-
able character.
Music, as a structured and intentional succession of movement-produced sounds, is clearly
comparable to other complex products of human activity, including both speech and non-
spoken language (Dissanayake Chapter 2, Brandt Chapter 3, this volume). These areas are now
the subject of intensive research using the neuroimaging techniques that we will discuss below,
such as magnetoencephalography (MEG) and functional magnetic resonance imaging
(fMRI). However, until recently, these methods have been little used to investigate music and
musicality—partly because music and musicality have been seen by influential language-focused
scholars as less important to human life, and partly because they (especially the motivational
processes of musicality) are more difficult to fit into easy categories of object or stimulus.
For some time, imaging techniques have been established that can tell us where and when the
brain has increased activity. More recently, new techniques have been proposed that can also
quantitatively explore patterns in regional electrical brain activity. While the neural
correlates of musical expression and performance are at least as interesting as those of musical
perception, we will, given the limitations of current research methods, address mainly the latter
in this chapter.
Used in combination, these powerful imaging techniques can address the following questions:
◆ Are the experienced effects of stimulation by music and language related in the brain, and
if so, how?
◆ What aspects of music correspond to our basic ‘hard-wired’ (i.e., early developed and
regulatory) brain capabilities?
◆ Are specific parts of the adult brain specialized for perceiving components of musical
awareness, such as those we define as pitch, harmony and rhythm?
◆ How does brain activity at different timescales relate to different aspects of musical activity or
stimulation, and how are these dynamic events integrated into a unified and powerful percept
of a musical mood, theme or narrative?
◆ What changes occur in the brain when we learn to read music or to play an instrument?
◆ How does the system of sound awareness relate to the functions of the emotion system, to
cause us to be profoundly moved by certain qualities of music?
There is a plethora of techniques for mapping brain activity. Here we will review non-invasive
methods that allow the localization and timing of human brain activity over spatial and temporal
scales that are relevant to the study of music. These methods fall into two categories: those which
track changes in cerebral blood flow or blood oxygenation correlated with changes in neuronal
activity, and those which directly detect changes in electrical or magnetic signals caused by
neuronal activity. We begin with a summary of the main physical principles involved, and outline
advantages and limitations for research on psychological events (see Table 8.1).
Table 8.1 Techniques for functional brain imaging

Source of measurements of neural activity
  PET: Local increase of radioactive oxygen carried by blood flow
  fMRI: Disturbance of imposed magnetic field by changes in deoxygenated blood in active brain, observed by detecting MRI signal from protons in tissue water
  EPI: Very fast version of fMRI
  EEG: Direct measurement of electrical activity of synchronized neuron groups, with electrode arrays attached to the scalp
  MEG: Detection of generators of neural activity by magnetic coils in an array round the subject’s head

Temporal resolution
  PET: 2 minutes
  fMRI: 4–8 seconds, detecting events over 0.2 sec
  EPI: As for fMRI
  EEG: Fractions of milliseconds, but averaging reduces this
  MEG: Fractions of milliseconds, but averaging reduces this

Spatial resolution
  PET: 5 mm
  fMRI: 1–3 mm
  EPI: As for fMRI
  EEG: 10 mm or more, depending on number of sources
  MEG: A few millimeters, with careful computation

Restrictions of participant’s activity and environment
  PET: Participant in prone position on bed, head firmly immobilized
  fMRI: Participant in prone position on bed, head firmly immobilized
  EPI: As for fMRI
  EEG: Immobile in comfortable chair
  MEG: Immobile in comfortable chair. Enclosed in chamber of scanner

Stimuli or response conditions
  PET: ≤ 12 tracer injections, one condition each
  fMRI: Several stimuli or response conditions applied while subject is scanned for 15–20 minutes
  EPI: As for fMRI
  EEG: Nil
  MEG: Nil

Advantages
  PET: Silent
  fMRI: Comparatively good spatial and temporal resolution
  EPI: As for fMRI
  EEG: Silent and non-invasive. Excellent temporal resolution allowing discrimination of rhythmic components over a wide range. Relatively cheap and available
  MEG: Silent and non-invasive. Most direct view of brain activity. Expensive and not widely available

Disadvantages
  PET: Potentially toxic tracer substance. Poor spatial and temporal resolution
  fMRI: Loud noise of magnetic coils. Overcome by intermittent ‘sparse scanning’ with coils periodically switched off
  EPI: As for fMRI
  EEG: Problems of computation to locate neural activity. Low conductivity of the skull makes the extraction of detailed information about the generators difficult. Needs MRI to correct for individual brain anatomy. Complex processing to interpret data
  MEG: Problems of computation to locate neural activity. Needs MRI to correct for individual brain anatomy. Complex processing to interpret data

8.2 Blood flow response methods: PET and fMRI

The first type of technique encompasses two modalities: positron emission tomography (PET)
(tomography is the process of forming cross-sectional images or maps of activity corresponding
to slices of the brain) and fMRI (Turner and Jones 2003). These two modalities involve very
different physical principles. PET tracks radioactive decays of rapidly decaying positron-emitting
nuclei in radioactive compounds injected into the bloodstream that make their way into
the brain. Most PET studies of brain function have used water molecules labelled with the
isotope oxygen-15, which decays into the stable nucleus nitrogen-15, emitting a positron in the
process. The positron quickly comes to rest and mutually annihilates with an electron. Before
annihilation, the positron travels an average distance of about 5 mm, which sets the intrinsic
limit for the spatial accuracy of PET. The annihilation event produces two high-energy photons
that travel (at the speed of light) away from the emission site in opposite directions. Rings of
detectors register these photons, recording the time and position of detection. The nearly simul-
taneous detection of the two photons identifies them as originating from the same positron anni-
hilation event. A mathematical algorithm converts these data into a map of average cerebral
blood flow (CBF) during the oxygen-15 half-life of 122 seconds, the time by which half of the
nuclei have decayed. In brain-mapping experiments, volunteer participants are given up to
12 injections of the relatively harmless oxygen-15 labelled water, and their brains are scanned
while they are presented with a stimulus or task (the experimental ‘condition’) to test such factors
as their awareness, thoughts or memories. Local changes of CBF due to variations in neuronal
activity can be inferred using precise statistical techniques that provide parametric maps of the
location and intensity of activity. These maps have a spatial resolution of about 5 mm at best, and
due to demanding radiation dose restrictions, it is usually necessary to average results from
several volunteers to obtain significant results in brain-mapping studies. This limits the sensiti-
vity of the method for immediate, labile and idiosyncratic experiences.
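The role of the 122-second half-life can be made concrete with a short calculation. The following is a minimal sketch, assuming simple exponential decay and ignoring any biological washout of the tracer; the function name is illustrative and does not come from the chapter.

```python
HALF_LIFE_S = 122.0  # oxygen-15 half-life quoted in the text, in seconds


def fraction_remaining(t_seconds, half_life=HALF_LIFE_S):
    """Fraction of the original nuclei that have not yet decayed after t_seconds."""
    return 0.5 ** (t_seconds / half_life)


if __name__ == "__main__":
    for t in (0, 61, 122, 244, 600):
        print(f"{t:4d} s: {fraction_remaining(t):.3f} of the tracer nuclei remain")
```

Under these assumptions, less than 4 per cent of the labelled nuclei remain ten minutes after an injection, so very little activity from one injection carries over into a scan made ten minutes later.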
PET has the great advantage for studies of music and human brain that it is entirely silent in
operation, by contrast with fMRI (see below). Its main disadvantages are the requirement of
injections of a radioactive tracer (limiting any given volunteer to only one set of 12 scans in a
lifetime), the relatively poor spatial resolution, and the poor temporal resolution of about two
minutes, which is the time it takes to hear six verses of an average song, or to read this paragraph
at least six times.
Functional magnetic resonance imaging (fMRI) is based on magnetic resonance imaging
(MRI). This widely available medical imaging technique provides images of the tissue distribu-
tion of hydrogen nuclei (protons) in water molecules in any organ. In the presence of a very
uniform, large, steady magnetic field, the protons can be resonantly excited by a radio-frequency
magnetic field of appropriate frequency. The frequency of precession of these spins can be
controlled by imposing gradients of intensity in the magnetic field, generated by ‘gradient coils’
driven by large rapidly switched currents. These magnetic field gradients allow the distribution of
the water molecules to be mapped. The tiny voltage induced in a receiver coil by the precessing
protons is detected with sensitive radiofrequency electronics, and converted into a high-resolution
image of a chosen slice of tissue. The MR image intensity depends mainly on the density of
water protons, and is modulated by intrinsic local properties of the tissue, such as biochemical
composition and magnetic heterogeneity.
The different properties of different human tissues are of particular importance for imaging
effects in brain tissue. Deoxyhaemoglobin, found in venous blood, has a greater magnetic
susceptibility than tissue and oxyhaemoglobin; that is, if a person’s head is inside a large magnet,
the net magnetic field in the capillaries and small veins of the brain will be very slightly larger
than anywhere else in the head. This disturbs the otherwise uniform magnetic field, causing a
decrease in the MR image intensity around the smallest veins. The venous deoxyhaemoglobin
thus acts as an endogenous paramagnetic MRI contrast agent. Any intervention that changes the
oxygenation of venous blood can thus be observed as a change in MR image intensity.
Using a very fast MRI technique known as echo-planar imaging (EPI), these changes can be
followed second by second, and thus the technique can be used for tracking changes in brain
activity. Focal increases in neuronal activity give rise to increases in blood flow that surpass
changes in oxygen consumption, so that the net oxygenation of blood leaving an area where
neurons are more active is increased. Thus, this venous blood becomes more similar in its mag-
netic properties to the surrounding tissue, the magnetic field becomes more uniform, and there
is a local rise in MR image intensity, the so-called BOLD (Blood Oxygenation Level-Dependent)
response. At typical MRI static field strengths of 1.5 Tesla (a measure of magnetic field intensity),
this increase is about 4 per cent of the total image intensity, and is thus quite easily observable.
Because the level of blood flow accurately reflects locally increased neuronal activity, and changes
in oxygen demand correlate well with these electrical changes, fMRI can provide quite accurate
localization of neuronal activity. However, the change due to blood flow (the haemodynamic
response function) is slow compared with that of the neuronal activity that causes it. Typically, it
takes 4–8 s to build up and decay, compared with neuronal time constants of a few tens of
milliseconds. Nevertheless, it is possible to sample the entire brain with a spatial resolution of
3 mm in about 3 s. Studies are usually conducted by presenting participants with several succes-
sive tasks or conditions while they are continuously scanned for periods of 15–20 minutes, and
estimating the effects statistically.
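The slowness of the haemodynamic response relative to the neuronal event that drives it can be illustrated by convolving a brief event with a smooth response curve. This is only a sketch under stated assumptions: the gamma-shaped curve and its parameters (chosen so that it peaks 5 s after the event) are hypothetical stand-ins, not a response function given in the chapter.

```python
import math

DT = 0.1  # simulation time step, in seconds


def hrf(t, shape=6.0, scale=1.0):
    """Hypothetical gamma-shaped response curve; it peaks at (shape - 1) * scale seconds."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t / scale) / (math.gamma(shape) * scale ** shape)


def convolve(signal, kernel, dt=DT):
    """Discrete convolution of signal with kernel, truncated to len(signal)."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue  # skip silent samples
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * k * dt
    return out


n = int(round(30 / DT))                 # 30 s of simulated time
neural = [0.0] * n
event_index = int(round(1.0 / DT))      # brief neuronal event at t = 1 s
neural[event_index] = 1.0 / DT          # unit-area impulse

kernel = [hrf(j * DT) for j in range(int(round(25 / DT)))]
bold = convolve(neural, kernel)

peak_index = max(range(n), key=lambda i: bold[i])
delay = (peak_index - event_index) * DT
print(f"Simulated response peaks about {delay:.1f} s after the neuronal event")
```

With these assumed parameters the simulated response peaks several seconds after an event that itself lasts only a fraction of a second, which is the mismatch of timescales described above.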
The great advantages of fMRI for studies of music and brain are the relatively high spatial
resolution (3 mm or better), the reasonable temporal resolution (a few seconds), the high sensi-
tivity over wide areas of the brain (it takes only a few minutes of scanner time to record activity
in many brain areas), the complete non-invasiveness (allowing for an indefinite number of repeat
scans), and the wide availability of MRI scanners. The main disadvantage is the quite loud noise
produced by the vibration of the gradient coils as their current is varied. Though the sound is
repetitive, it can still be distracting and can mask specific frequencies in musical stimuli. Methods
for shielding this noise from participants’ ears during auditory presentation are far from perfect,
although active noise cancellation methods have shown promise. One method of reducing this
problem is known as ‘sparse scanning’. Because of the slowness of the haemodynamic response, it
is feasible to increase the time delay between scans by up to 8 s, presenting the auditory stimuli
during the periods of silence. When the brain is scanned, the blood flow and BOLD response to
the auditory stimulus are only just reaching their maxima and are thus easily detectable. Sparse
scanning has been the mainstay of much of the fundamental imaging research on brain auditory
response.
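The timing logic of sparse scanning can be sketched as a simple schedule in which each stimulus falls inside a silent gap and each scan begins a fixed delay after stimulus onset. All durations below (an 8 s gap, a 2 s scan, a 3 s stimulus and a 5 s delay to the scan) are illustrative assumptions, as are the class and function names; only the 8 s upper bound on the gap comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    stimulus_onset: float
    stimulus_offset: float
    scan_onset: float
    scan_offset: float


def sparse_schedule(n_trials, silent_gap=8.0, scan_duration=2.0,
                    stimulus_duration=3.0, lead=5.0):
    """Build a sparse-scanning timetable (all times in seconds).

    lead is the assumed delay from stimulus onset to scan onset, chosen so
    that the scan samples the response as it approaches its maximum.
    """
    if not stimulus_duration <= lead <= silent_gap:
        raise ValueError("stimulus must fit inside the silent gap before the scan")
    cycle = silent_gap + scan_duration
    trials = []
    for i in range(n_trials):
        scan_onset = i * cycle + silent_gap
        stimulus_onset = scan_onset - lead
        trials.append(Trial(stimulus_onset, stimulus_onset + stimulus_duration,
                            scan_onset, scan_onset + scan_duration))
    return trials


if __name__ == "__main__":
    for trial in sparse_schedule(3):
        print(trial)
```

The check at the top of the function enforces the one constraint that matters here: the stimulus must begin after the previous scan has finished and end before the next one starts, so that it is heard without scanner noise.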
A further disadvantage of fMRI, compared with MEG and EEG (see below), is its low temporal
resolution. Brain events can be located accurately in time only to about half a second, a very long
time compared with typical cortical processing and transit times between brain areas that
are usually in the range of a few to tens of milliseconds, and in comparison with the timing of
significant events in music.
8.3 Electrophysiological methods: EEG and MEG

The most important electrophysiological methods are electroencephalography (EEG) and
magnetoencephalography (MEG). MEG and EEG use detectors with high temporal resolution,
capable of recording signals with a resolution of a fraction of a millisecond. MEG and EEG
signals are direct measures of brain activity, but relating these measures to their underlying
neuronal generators requires elaborate computation. Before discussing the difficult aspects of
interpreting MEG and EEG signals, we describe three properties that ease the burden and
provide useful data about how these signals should be interpreted.
First, the signals propagate from the source in the brain to the detector at the speed of light, so
the data recorded are, for all practical purposes, the consequences of instantaneous changes in
the brain’s electrical activity. Second, the detected signal is normally well above instrumental
noise, generated, for example, by thermal fluctuations. Since the activity of one or a few
neurons is too small to produce a detectable signal, it follows that the observed changes in electrical
activity are produced by sources with compact spatial organization and near-synchronous
timing. Third, raw signals can be naturally analysed into many components. These include slow
components with time constants of many tens to hundreds of milliseconds, and fast components
lasting just a few milliseconds (Ioannides et al. 2005).
Typically, MEG and EEG studies have used high- and low-pass filtering of the signal, and the
averaging of many single trials, which removes the signal components arising from fast, intermit-
tent activity. Thus, the results reflect only the slow components in the signal that are closely time-
locked to external stimuli. Better analysis of non-averaged single-trial MEG data has
demonstrated that both slow and fast intermittent activity correspond to generators in brain
areas known by other methods to be activated by the tasks and stimuli used in experiments
(Ioannides 2001). In short, both the properties of the signal and evidence from localization
studies from single-trial MEG data suggest that the generators at each point of time are both
focal and sparse. They must be focal, because only nearby generators can be sufficiently spatially
(and possibly temporally) organized to produce coherent electrical activity. They must be sparse,
because the brief activation of these focal generators is consistently seen across single trials. They
occur in highly variable sequence from trial to trial, yet they lead to an overall signal with
well-defined temporal structure. Recent publications based on possibly related observations on
raw EEG data (Freeman and Holmes 2005) and high density arrays of electrodes placed on
primary sensory cortices of animals (Freeman 2005) provide intriguing insights that may be
particularly important for advancing our understanding of how music excites the brain.
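The effect of trial averaging described above can be illustrated with a small simulation. The sketch below is not the analysis used in the cited studies. The waveform shapes, burst duration, latency range, and trial count are all assumptions chosen only for illustration. Each simulated trial contains a slow component that is time-locked to the stimulus, plus a brief burst whose latency and polarity vary from trial to trial.

```python
import math
import random

N_SAMPLES = 500          # one sample per millisecond (assumed)
N_TRIALS = 400           # assumed number of single trials

def slow_component(t):
    """Time-locked slow wave: a Gaussian bump peaking at 250 ms (assumed shape)."""
    return math.exp(-((t - 250) / 60.0) ** 2)

def make_trial(rng):
    """One single trial: slow wave plus a 5 ms burst at a random latency and sign."""
    onset = rng.randrange(50, 450)      # burst latency varies across trials
    sign = rng.choice((-1.0, 1.0))      # burst polarity varies across trials
    trial = [slow_component(t) for t in range(N_SAMPLES)]
    for t in range(onset, onset + 5):
        trial[t] += sign * 2.0          # burst is twice the height of the slow peak
    return trial

def average(trials):
    """Sample-by-sample mean across trials."""
    n = len(trials)
    return [sum(tr[t] for tr in trials) / n for t in range(N_SAMPLES)]

rng = random.Random(0)
trials = [make_trial(rng) for _ in range(N_TRIALS)]
avg = average(trials)

# Largest deviation of the average from the pure slow wave: what is left of the bursts.
residual = max(abs(avg[t] - slow_component(t)) for t in range(N_SAMPLES))
# Largest deviation within one single trial: the burst at full size.
single_trial_peak = max(abs(trials[0][t] - slow_component(t)) for t in range(N_SAMPLES))
```

In each single trial the burst is the largest feature of the signal, yet in the average it shrinks to a small residual while the time-locked slow wave survives intact. This is the sense in which averaged data reflect only the slow, stimulus-locked components.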
To relate the measured signal to what is happening in the brain, one must solve two problems,
each with its own difficulties. The first is the ‘forward problem’, from source to signal, i.e., the
computation to identify the signal generated by a given source configuration in the brain activity.
The relevant physics is well-understood, involving only the classical laws of electromagnetism
and the details of the coil or electrode arrangement making up each MEG and EEG detector.
However, the details for the conductivity of the intervening space between generators and
detectors must be taken into account: these can be messy, as they usually are when biological
structures—such as the head, with its bones and cavities—are involved. In this respect, MEG has
a distinct advantage over EEG. The MEG signal detected by a given sensor can be computed
remarkably accurately with simple models of the generators, e.g., a sphere model with the centre
extracted from the curvature of the inner part of the skull close to the sensor. For EEG, however,
a very accurate definition of the conductivity is required throughout the head.
The second problem is the ‘inverse problem’, from signal to source, i.e., the task of recon-
structing the generators, given the signal. It has been known for a long time that, in its purest
mathematical expression, the bioelectromagnetic inverse problem has no unique solution (von
Helmholtz 1853). However, in practice, given the sparse nature of the generators, a reliable
estimate of where they are can be obtained using very broad and reasonable constraints on the
assumed nature of the generators, specifically limiting the maximum strength and maximizing
the information extracted from the signal (Ioannides et al. 1990; Taylor et al. 1999). It is, in fact,
possible to recover tomographic estimates or maps of activity from individual timeslices of
single-trial MEG data (Ioannides 2001; Ioannides et al. 2005). These single-trial tomographic
solutions have revealed that the brain activity excited by the perception of stimuli is very
dynamic. Responses to repeated presentations of identical stimuli differ from trial to trial,
even for very basic sensory stimuli, as we have repeatedly demonstrated for the auditory (Liu
et al. 1998), somatosensory (Ioannides et al. 2002a) and visual (Laskaris et al. 2003) modalities.
The results demonstrate that, even for simple sensory stimuli, the route from the peripheral
nerves to the cortex involves competition within systems of different neuronal networks, some of
which may be beyond the reach of available imaging techniques. However, the range of responses
for a given stimulus is often sufficiently limited to allow the classification of responses in one
area, and to separate out the nodes and interactions of at least part of the underlying system
(Ioannides et al. 2002a). The averaged data represent a mixture of different processes, even for
individual subjects. In contrast, detailed single-trial analysis reveals not only common loci of
activity across subjects, but common sequences of interactions in individuals, both for normal
processing (Ioannides et al. 2002a) and its changes in pathology (Ioannides et al. 2004).
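The non-uniqueness of the inverse problem, and the way a constraint selects one solution, can be shown with a toy linear system. This is a minimal sketch under stated assumptions: the lead field values are invented, and the minimum-norm constraint used here is a standard textbook choice, swapped in for illustration. It is not the method of the studies cited above, which limit the maximum strength of the generators and maximize the information extracted from the signal.

```python
def forward(lead_field, sources):
    """Forward problem: signal at each sensor for a given source vector."""
    return [sum(g * s for g, s in zip(row, sources)) for row in lead_field]

def minimum_norm(lead_field, signal):
    """Minimum-norm inverse for a 2-sensor system: s = L^T (L L^T)^-1 b."""
    r0, r1 = lead_field
    a = sum(x * x for x in r0)
    b = sum(x * y for x, y in zip(r0, r1))
    d = sum(y * y for y in r1)
    det = a * d - b * b
    # Solve the 2x2 system (L L^T) w = signal with the explicit inverse.
    w0 = (d * signal[0] - b * signal[1]) / det
    w1 = (a * signal[1] - b * signal[0]) / det
    return [w0 * x + w1 * y for x, y in zip(r0, r1)]

# Hypothetical lead field: 2 sensors (rows), 3 candidate sources (columns).
L = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]

sources_a = [1.0, 0.0, 1.0]     # two outer sources active
sources_b = [0.0, 1.0, 0.0]     # only the middle source active
signal_a = forward(L, sources_a)
signal_b = forward(L, sources_b)

# The constraint picks the smallest source vector consistent with the signal.
estimate = minimum_norm(L, signal_a)
```

Both source configurations produce the same signal at the two sensors, so the signal alone cannot distinguish them. The constrained estimate reproduces the signal as well, but it is a third configuration, which is why the choice of constraint matters for interpretation.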
In terms of localization, recent studies have demonstrated that tomographic reconstructions of
MEG data and post-reconstruction statistical analysis can recover the generators close to the
cortical surface with an accuracy of a few millimetres (Moradi et al. 2003). Even for deep sources,
as long as they are distant from the centre of the head, the generators can be recovered well
enough to distinguish, for example, the gaze centres on either side of the brainstem (Ioannides
et al. 2004). Tomographic reconstructions of single-trial MEG data and post-reconstruction
analysis produce an extraordinary wealth of information about the brain. That said, the resources
required for the computation and storage of the data are huge, and this limits most investigations
to the study of just a few subjects.
8.4 State of the art, and problems of implementation

Given the uncertainties outlined above, we believe that there is a need for improved standards of
research planning and the more circumspect interpretation of data. We need better models and
better methods of analysis. The analysis of MEG and EEG data continues to rely on oversimpli-
fied assumptions, partly because of the computational demands of single-trial tomographic
analysis and partly for historical rather than scientific reasons (Ioannides 1995). Even today, most
MEG studies use equivalent current dipoles (ECD)—point source models—to describe brain
activity (a current dipole is completely described by six parameters: three to establish its position
within the head, two to define its orientation and one to define its strength). Usually one or more
ECDs are fitted to MEG signals that have been heavily filtered and averaged. Following the
successful use of baseline conditions in PET and fMRI, many electrophysiological studies have
localized ECDs from difference signals between the averages of different conditions. Admittedly,
some interesting results have been reported using this approach, as we will describe below.
However, describing brain activations elicited by complex linguistic and music stimuli by estab-
lished methods becomes increasingly dubious and difficult to justify conceptually, as long as the
methodology relies on averaging the signals, grand-averaging across subjects, obtaining differ-
ences of averages between conditions, and modelling with point-like sources.
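The six parameters of an equivalent current dipole mentioned above can be written down explicitly. The sketch below is illustrative only: the class name CurrentDipole, the spherical-angle convention for orientation, and the units are assumptions, not taken from any particular analysis package.

```python
import math
from dataclasses import dataclass

@dataclass
class CurrentDipole:
    # Position within the head (three parameters), in metres (assumed unit).
    x: float
    y: float
    z: float
    # Orientation (two parameters): polar and azimuthal angles in radians.
    theta: float
    phi: float
    # Strength (one parameter), in ampere-metres (assumed unit).
    strength: float

    def moment(self):
        """Return the dipole moment vector (qx, qy, qz)."""
        return (
            self.strength * math.sin(self.theta) * math.cos(self.phi),
            self.strength * math.sin(self.theta) * math.sin(self.phi),
            self.strength * math.cos(self.theta),
        )

# Hypothetical dipole of 10 nAm pointing along the x axis.
dipole = CurrentDipole(x=0.0, y=0.05, z=0.06, theta=math.pi / 2, phi=0.0, strength=10e-9)
qx, qy, qz = dipole.moment()
```

However rich the underlying activity, a fit of this kind returns only these six numbers per dipole, which is the limitation discussed in the text.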
As we have mentioned, EEG and MEG measurements are completely silent and non-invasive.
The measurements can be repeated as often as a participant is willing to put up with the inconven-
ience of relative immobility in a comfortable chair. EEG is widely available and cheap to run, but
the low conductivity of the skull makes the extraction of detailed information about the generators
difficult. MEG provides a more direct view of the brain, but it is expensive and not widely available.
MEG and EEG on their own access only functional information. Additional information must be
provided from a separate MRI scan for each individual’s brain anatomy, to show the generators in
the context of the background anatomy at whatever spatial accuracy EEG or MEG can provide. The
direct relationship to neuronal activity is obviously a desirable property, but it also has drawbacks.
In the case of music, where so much of the brain is excited, many different neuronal networks
are in play in response to the music at the same time, so separating out distinct contributions
requires complex processing. The non-uniqueness of the inverse problem turns out to be less of a
problem in practice than in theory, at least for MEG. However, the use of simplified models to
define the generators poses a major problem in interpreting many of the results. These models
are founded on assumptions that the time course of evoked events will be repeated, and that the
generators of detected effects are fixed and focal in nature, conditions that clearly are not realized
in the brain. Until recently, the computational demands of more robust and powerful methods
limited their use to a few laboratories. Today, the computational disadvantages are reduced, but
they are still serious enough that reliance on simpler and less reliable methods of analysis remains
prevalent.
8.5 Gathering the threads thus far

Much effort is currently devoted to combining haemodynamic techniques (to study the forces
involved in the circulation of blood) and electrographic techniques to interpret evoked changes
in the brain. The most straightforward way of doing this is to use the MEG and/or EEG signal to
recover the time course of regional sources after constraining them to the locations where foci of
activity have been identified in the same or similar experiments by fMRI. However, the different
techniques rely on very different mechanisms, and provide mappings of events spanning very
different timescales. It is all too easy to combine different techniques without caution, and to end
up with a result that is determined primarily by the limitations rather than advantages of each
method. There is much to be said for advancing each technique as far as possible to reap its fullest
benefits before combining their data, as has recently been demonstrated (Moradi et al. 2003).
8.6 Talk and music

Before we look at the correlates of components in musical stimuli with detected brain activity, it
is worth considering the relationship of music and spoken language. Recent results from electro-
physiology suggest common processes for syntactic (rhythmic and dynamic) motivating features
of music and speech (Besson and Schon 2001) and for aspects of pitch perception. In contrast,
some imaging studies suggest that certain features of musical perception are strongly lateralized
in a different way from those identified for language perception, as has long been claimed by
neuropsychologists (Wallin 1991). We will return to this controversy, since it may provide
valuable information on the innateness or developmental origin of human musicality and its
relationship to acquired musical skills.
Both music and language are generated in time and contain basic units or elements of inten-
tional action that are planned in time. Cognitive theory supposes that these elements must first
be defined or recognized, organized according to rules, and integrated into a ‘phrase’ that is
correct both in terms of grammatical and semantic rules. However, all of these features of expres-
sive organization in communication by music and speech (or text) are effects of the intrinsic
prospective control of movements by the brain, and their perception is part of the experience of
acting in motivated ways (Lee and Schögler, Chapter 6, this volume).
For the purposes of the current chapter, we consider words (or syllables) and music notes (or
chords) as the perceived elements of language and music, respectively. Much has been written on
the similarities and differences between language and music, as controlled processes of human
expression for communication. As we will see, much of the evidence about the similarity of the
recognition of organized form in music and speech suggests a partially shared ‘syntactic produc-
tion/analysis’ apparatus. Evidence presented in Part 2 of this volume shows that use of musical forms
of production and reception in communication precedes language in development, and lesion
studies suggest at least some degree of dissociation between the motivations for music and language
in the human brain (Peretz 2002). However, much of the neural apparatus is likely to be shared, as it
relates to the regulation of all body actions and their monitoring in all sensory modalities: it has been
strongly argued by Besson and Schon (2001) and others that a prerequisite for understanding how
our brains deal with language is first to understand how they deal with the dynamic intentions and
the narrative power of music.
In relation to the question of innate musicality in humans, this area of discussion presents
another twist. If, as is claimed by Chomsky and his followers (e.g., Pinker 2000), language learning
were quasi-instinctive for humans, and if it were true that music and language occupy much the
same brain areas, one might argue a fortiori that musical production and perception, or the
impulses for their development, are also quasi-instinctive, perhaps representing a critical step in
the process of leading to language acquisition after infancy, and indeed in human evolution
(Dissanayake Chapters 2 and 24, Cross and Morley Chapter 5, this volume).
8.6.1 PET and fMRI studies

In one of the few direct comparisons between language and music-like stimuli using functional
brain imaging, Binder and colleagues used fMRI to contrast passive and active listening to words
and tone sequences (Binder et al. 1996). In the passive condition, participants heard in alternating
periods only the background scanner noise or the background scanner noise with English words
and random tone sequences. In the active condition, participants performed either a semantically
based word decision or a tone-pattern analysis. The study identified several left hemisphere
cerebral areas that respond more strongly to word conditions, including the superior temporal
sulcus, middle temporal gyrus, angular gyrus and lateral frontal lobe. In contrast, the planum
temporale responded equally well to tones and words in the passive listening condition, and
more strongly to tones during active listening. These authors concluded that the planum tempo-
rale is likely to be involved in early auditory processing, while specifically linguistic functions are
mediated by multimodal association areas distributed elsewhere in the left hemisphere.
Parenthetically, Tzourio-Mazoyer et al. (2002) provided evidence that the traditional language
areas—Broca’s and Wernicke’s areas—are not uniquely specialized for mature speech and
language. These two areas were activated when 2-month-old babies looked at pictures of a
woman’s face. The proficiency of 2-month-olds in ‘protoconversation’, which employs the same
parts of the body as are used in adult speech and hand sign language, also demonstrates that
these parts of the brain are active in regulation and perception of forms of vocal expression, even
the prelinguistic ones that infants use.
The best-known proponent at present of a music-specific lateralization of brain function
based on neuroimaging studies is Robert Zatorre. Most of his work has been done using PET. The
musical tasks performed during scanning include pitch judgements within melodies (Zatorre
et al. 1994), auditory imagery for tunes (Halpern and Zatorre 1999), and the reproduction of
tonal rhythm patterns. From these results, he argues that tonal processing, an essential ingredient
of music, but not necessarily of the textuality of highly practised and/or literate language, is
predominantly a right-hemispheric task. However, since laterality in these studies was inferred by
the questionable technique of visually examining arbitrarily thresholded z-maps, rather than by
making an explicit statistical comparison between homologous regions in each hemisphere, these
findings are not definitive.
By contrast, a recent fMRI study showed greater activation in left-hemispheric primary
auditory cortex (Heschl’s gyrus) for monaurally presented pure tones that are sinusoidally
modulated at 5 Hz (Devlin et al. 2003). This study used a robust method for assessing laterality,
in which a lateralization index was repeatedly calculated for several different significance
thresholds. In another fMRI study, Levitin and Menon (2003) compared listening to normal
musical excerpts with listening to scrambled versions of the same pieces, which differed mainly in
their temporal coherence. They found activation in areas generally accepted to be related to
language perception, and also in their right-hemisphere homologues, with no clear lateralization.
A further fMRI study by Koelsch et al. examined the cortical response to unexpected musical events
that disobey normal syntax, such as abrupt changes in key. These authors found widespread areas
activated, with good overlap with traditional language areas, including Broca’s and Wernicke’s
areas, but with no distinct lateralization (Koelsch et al. 2002).
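The dependence of laterality estimates on the significance threshold can be made concrete. The sketch below assumes a common convention, LI = (L - R) / (L + R), where L and R are the numbers of supra-threshold voxels in homologous left and right regions. The exact index used in the studies cited above may differ, and the z-scores are invented for illustration.

```python
def lateralization_index(left, right, threshold):
    """LI = (L - R) / (L + R), where L and R count supra-threshold voxels."""
    n_left = sum(1 for v in left if v > threshold)
    n_right = sum(1 for v in right if v > threshold)
    total = n_left + n_right
    if total == 0:
        return None          # no voxel survives the threshold: index undefined
    return (n_left - n_right) / total

# Hypothetical z-scores for voxels in homologous left and right regions.
left_z = [1.2, 2.5, 3.1, 3.8, 4.4, 5.0]
right_z = [0.9, 1.7, 2.6, 3.3, 3.6, 2.1]

# Recompute the index at several thresholds instead of trusting a single one.
thresholds = [2.0, 3.0, 4.0]
indices = {th: lateralization_index(left_z, right_z, th) for th in thresholds}
```

With these invented values the index rises from about 0.11 at a threshold of 2.0, through about 0.33 at 3.0, to 1.0 at 4.0. The same data therefore look weakly or completely left-lateralized depending on an arbitrary cut-off, which is why inspecting a single thresholded map is a questionable basis for claims of laterality.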
Summarizing recent data, it appears that homologous areas in both hemispheres can carry out
some of the processing required for musical perception, while left-hemisphere areas have become
specialized for acquiring control of articulations of language during development. Interestingly,
there are many accounts (Abo et al. 2004; Yonemoto 2004; Blank et al. 2003; Calvert et al. 2000;
Woods et al. 1988) of right-hemisphere structures—anatomically symmetric to normal
left-hemisphere language areas—that become active during recovery of language capabilities
following left-hemisphere stroke. This suggests that not much further reorganization of these
areas is required for them to support language production and reception.
8.6.2 EEG and MEG studies of language and music

Studies comparing music and language with MEG and EEG have demonstrated that the evolu-
tion of effects occurs through distinct time-shared stages. We consider three temporal stages of
processing (Table 8.2). We acknowledge from the outset that these stages are convenient ways of
summarizing the probable cascades of activity, some of which will straddle the boundaries of the
stages as we define them; we also recognize that many subdivisions of timing are possible within
each stage.
The first stage is completed soon after 100 ms; it corresponds with what is conceived as the
initial processing of the physical properties of the sound that make up the linguistic or music
element. The second stage extends from the first stage to around 200 ms; it deals with (semi-)
automatic processing of rules that link the element just analysed to an online, tentative frame-
work for the grammatical structure of the yet-to-be completed phrase. The third stage reflects
processes that integrate the element with the surrounding context or require a previous sentence
parcellation to be re-evaluated. The third stage covers latencies after 250 ms; it must be analysed
further into three parts—one for latencies between 250 and 400 ms, one for latencies around
400 ms, and one for latencies around 600 ms. The separation of processing into temporal stages
for the different processes of music and language is not surprising when one sees that all human
motor activity shares the same fundamental hierarchy of rhythms (Trevarthen 1999). Thus,
beyond the analysis of the physical properties of a stimulus, the timing of the successive stages of
‘processing’ matches nicely with different ‘step rates’ in walking or thinking, from presto to
andante, or alternatively to the duration of a single act and its prospective perceptual monitoring.
It has been argued that the thinking process itself—the ‘mindness’ created by brain function—is
a by-product of the evolutionary embedding of motor activities and their associated rhythms
(Llinas 2001).
We will next discuss briefly the main conclusions regarding the similarities and differences in
language and music processing for each of these stages, focusing, in chronological order, on the
work of groups that have studied extensively both linguistic and music responses with MEG and
EEG. The first significant contributions were from EEG in the linguistic domain, beginning with
the discovery of the N400 component by Kutas and Hillyard (1980; 1984). This is a centroparietal
negative ERP (event-related potential) component; it is elicited by unexpected words, especially
semantic errors, beginning around 250 ms and peaking around 400 ms, hence the name. The
N400 corresponds to the second part of our third stage, and is associated with semantic integra-
tion (knowing what the thing is). The P600 is a positive deflection that, like the N400, is over
centroparietal electrodes and starts fairly early after the onset of an unexpected word, but that
peaks later, around 600 ms (Osterhout and Holcomb 1992). The P600 corresponds to the third
part of the third stage and is associated with syntactic rather than semantic processing (knowing
where it and oneself is). The P600 is strong when a re-evaluation of earlier processing is required,
such as the following of ‘garden-path’ sentence structure.

Table 8.2 Temporal stages of processing for music and language
Processing times: early components (< 250 ms) at < 100 ms and 100–200 ms; late components
(> 250 ms) at 250–400 ms, ~ 400 ms and ~ 600 ms.
Speech
  < 100 ms: Initial processing 40–100 ms (P1m); in auditory cortex.
  100–200 ms: Early left anterior negativities (ELAN) following syntactic incongruity or
    ‘phrase structure violation’. Mismatch negativity (MMN) for a change in a repeated
    auditory signal.
  250–400 ms: Left anterior temporal negativity (LAN) for incongruities in syntax/grammar.
  ~ 400 ms: N400, centro-parietal negative event-related potential to unexpected words and
    semantic errors. Associated with semantic integration (knowing what the thing is).
    N400 priming effect for language, in posterior middle temporal gyrus. N400 for
    incongruent words sung in tune.
  ~ 600 ms: P600, a positive centro-parietal deflection after the onset of an unexpected word.
    Associated with syntactic rather than semantic processing (knowing where it and oneself
    is). Re-evaluation of earlier processing, checking meaning in a sentence.
Music sounds
  < 100 ms: Initial processing 40–100 ms (P1m); in auditory cortex.
  100–200 ms: Early right anterior negativities (ERAN) following unexpected chords, and
    right MMN for a change in a repeated chord.
  250–400 ms: Right anterior-temporal negativity (RATN) for incongruous musical sound.
  ~ 400 ms: N400 priming effect for music, in posterior middle temporal gyrus.
  ~ 600 ms: P600 for musical incongruities and for correct words sung out of tune.
Movements of musical performance
  (listed left to right across the time columns) Trills; Vibrato, arpeggios; Presto pulse/beat;
    Andante pulse or beat.

Except for some early work by Besson and Macar (1987), until recently there were relatively
few studies of evoked effects of music incongruity (Besson et al. 1994; Janata 1995). The first
study that directly compared music and language claimed that linguistic and music incongruities
elicited positivities around 600 ms that were statistically indistinguishable for both language and
music (Patel et al. 1998). In the same paper, an earlier music-specific ERP component was
observed that showed anterotemporal right-hemisphere lateralization, termed right anterior-
temporal negativity (RATN). This component resembled the left anterior negativities (LAN)
reported earlier for language-related syntactic processing (Friederici 1995). The LAN and RATN
correspond to the first part of our third stage.
The similarity between late evoked components elicited by music and language has been
reported in a series of recent studies. In Besson and Schon (2001), 200 excerpts from French
operas were used with either the last word sung in or out of tune, or the last word replaced by a
semantically incongruent word sung in or out of tune. The results demonstrated an N400 for
incongruent words sung in tune and a P600 for correct words sung out of tune. The double
incongruity (incorrect words sung out of tune) elicited both an N400 and a P600, not signifi-
cantly different from the sum of the effects observed for each incongruity separately. The near
additivity of these effects suggests that, despite the similarity in language- and music-induced
N400 and P600 effects, relatively independent neural processing underpins the semantic aspects
of language and the harmonic aspects of music.
In Koelsch et al. (2004), behavioural measures and the N400 ERP component elicited by a
visually presented target word were used to quantify the priming effect of either semantically
related or unrelated preceding sentences or music. The semantic relationship between words and
music was chosen either on the basis of self-reports of the composers or on the basis of musicological
terminology. For example, the musical prime for the word ‘narrowness’ used intervals set in
closed position (covering a narrow pitch range, and being dissonant), and the prime for the word
‘wideness’ used intervals set in open position (covering a wide pitch range). The N400 priming
effect was similar for language and music in terms of time course and strength. ECD (equivalent
current dipole) localization of the difference signal (‘semantically related’ subtracted from
‘semantically unrelated’ evoked potentials) produced very similar results for both language and
music, localizing the source in the posterior portion of the middle temporal gyrus.
Early anterior negativities (< 250 ms) following syntactic incongruity, in the front half of
the brain, have been labelled ELAN and ERAN, depending on whether they are recorded
primarily over the left or right hemisphere, respectively. The ELAN was first identified in lin-
guistic syntactic violations, e.g., following phrase structure violation (Friederici 1995). The local-
ization of the magnetic correlate of these decreases in EEG signal, using tomographic analysis of
MEG data with limited head coverage (Gross et al. 1998) and with ECD analysis from full-head
coverage (Friederici et al. 2000), showed early activity in both left and right auditory cortex and
in left and right inferior frontal cortex, corresponding to Broca’s area and its right-hemisphere
homologue. These studies were followed by experiments using a sequence of five chords,
introducing unexpected variation (Neapolitan chords) in the third and fifth chord to study the
music analogue of the linguistic syntactic violation. The unexpected chords elicited anterior
right negativities, ERAN, at the same latency (around 200 ms) as the linguistic violation effect on
the left (ELAN), followed by a bilateral negative signal, greatest around 500–550 ms. The ERAN
was identified with both EEG and MEG, but the later signal was evident with only EEG.
ECDs were localized in the middle of Heschl’s gyrus bilaterally from the average MEG signal
of the in-key chords around 200 ms, a location within or near the primary auditory cortex
(Maess et al. 2001). In the same early time period, ECDs were localized in more anterior
regions around Broca’s area and its right-hemisphere homologue, using the same stimulus as
for the ERAN signal, the difference between the Neapolitan chords and the in-key chords
(Maess et al. 2001).
To fully appreciate the ELAN and ERAN results, we must contrast them with the well-studied
‘mismatch negativity’ (MMN) (Naatanen 1992). MMN is elicited by a change (‘deviant’) in a
repeating auditory stimulus. It is an ‘automatic’ response, since it can be elicited even when
participants’ attention is engaged in a different task. MMN is seen as a negative ERP wave
between 100 and 200 ms in the difference signal between the deviant and standard ERP
waveforms. A direct comparison between encoding of elementary musical and phonetic sounds
was made using MEG recording in an MMN paradigm (Tervaniemi et al. 1999). It was found,
only within the right hemisphere, that the MEG counterpart of MMN (MMNm) elicited by an
infrequent chord change was stronger than the MMNm elicited by a phoneme change. The ECD
model was used separately for phonemes and chords to describe the early MEG signal (the
increase in signal occurring between 40 and 100 ms, termed P1m) and the MMNm (the differ-
ence signal between frequently repeated and deviant items). The MMNm ECD localizations
differed slightly for phonemes and chords, and more noticeably for binaural stimulation, but for
P1m did not differ significantly for phonemes and chords. However, a PET study using very
similar stimuli showed an even more lateralized response, suggesting a left-hemisphere special-
ization for the processing of phonetic stimuli and a right-hemisphere specialization for the
processing of chords.
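The definition of the MMN as a difference signal can be sketched numerically. The example below uses invented averaged waveforms in arbitrary units. The Gaussian shapes, latencies, and amplitudes are assumptions for illustration only and do not reproduce data from the studies cited above.

```python
import math

def gaussian(t, centre, width, amplitude):
    """Gaussian-shaped deflection used as a stand-in for an ERP component."""
    return amplitude * math.exp(-((t - centre) / width) ** 2)

times = list(range(0, 400))                 # latency in ms, 1 ms steps (assumed)

# Hypothetical averaged ERPs: both share an early response at 100 ms;
# only the deviant carries an extra negativity centred at 150 ms.
standard = [gaussian(t, 100, 20, 1.0) for t in times]
deviant = [gaussian(t, 100, 20, 1.0) + gaussian(t, 150, 25, -1.5) for t in times]

# Difference signal: deviant minus standard.
difference = [d - s for d, s in zip(deviant, standard)]

# Most negative point of the difference wave within the 100-200 ms window.
window = [(difference[t], t) for t in times if 100 <= t <= 200]
mmn_amplitude, mmn_latency = min(window)
```

Because the response common to both stimuli cancels in the subtraction, only the deviant-specific negativity remains, here peaking at 150 ms. Note that the subtraction operates on averaged waveforms and so inherits the limitations of averaging discussed earlier in the chapter.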
The traditional interpretations of ERAN and MMN results have been rather different. MMN is
thought to be an index for the neural traces of short-term auditory memory, which must be
related to ongoing awareness of the environment, while ERAN is proposed as an index of early
syntactic processing, which will be more concerned with prospective control of participants’
actions. However, a case can be made for the equivalence of ERAN and MMN. Both ERAN and
MMN can be elicited automatically by deviant auditory stimuli. Evidence for anticipatory
auditory cortex activity within a sequence of tones with variable ISI (inter-stimulus interval)
was recently presented (Ioannides et al. 2003). If the auditory system cannot easily distinguish
anticipatory neural activity from the memory trace set up by a repetitive stimulus, then much of
the argument for the distinction between MMN and ERAN, and between ‘memory trace’ and
‘prospective perceptual control’, evaporates. For now, we note that the shared properties for
ERAN and MMN include an overlap in latency and similarity in the spatial distribution of the
corresponding ERP components. The evidence thus far cannot eliminate the possibility that
additional processes, specific to syntactic ‘cognitive’ processing, give rise to ERP components in
ERAN that are not present in MMN (Koelsch et al. 2001).
The few studies that looked at the processing of sound stimuli before the ERAN and MMN
range have reported no differences between the processing of language and music. For example,
the ECD localization for the P1m response was not significantly different for phonemes and
chords (Tervaniemi et al. 1999).
8.6.3 Innateness and plasticity at different temporal scales: results from electrophysiology

The first year of human postnatal life is characterized by enormous, actively engaged brain
plasticity (Dawson and Fischer 1994; Trevarthen 2004a). During this first year, infants, while
intending to communicate, pick up the statistical and prosodic patterns in a culturally formed
language input, thus discovering phonemes and words (Trevarthen 2001, 2004b). Through their
highly intelligent social interaction with other humans, their speech learning accelerates in a way
that has been compared to communicative learning in songbirds (Kuhl 2004), but which differs
fundamentally in the shared intentions it serves. Changes in capacity to learn new speech sounds
occur more slowly after early childhood, but ‘plasticity’ for auditory learning is still present in
adults. Although the evidence suggests that ‘hard-wired’ changes leading to new neural ‘maps’ are
slow to occur (requiring activity-dependent changes at the synaptic level), and changes in grey
matter (that can be detected by imaging the local anatomy at very high resolution) are slower
still, adequately motivated learning can still be fast, notably in 2-year-olds’ ‘one-trial’ learning of
new words (Carey and Bartlett 1978; Halberda 2003). A catchy tune can make an indelible impression
the first time it is encountered, and this is as true for infants as it is for adults (Trainor 1996;
Trevarthen 2002).
In many studies of long-term changes in responsiveness to stimuli, a single ECD is fitted to the
averaged MEG or EEG data, and the derived dipole strength is regarded as a quantitative measure of
the plasticity effect. However, it is not always clear whether the increase in ECD strength arises
because the real-time activation is stronger in each single trial, or because the single-trial activity is
better ‘organized’ (e.g., better time-locked to the external stimuli). As we have seen, some studies that
involve the exposure of naive participants to sequences of tones (and presumably music) for hours
or days have tended to use real-time measures. These studies point to the reorganization of back-
ground rhythms in distributed cortical representations, especially in the gamma band of frequencies
(~ 40 Hz), as the most likely change that correlates with performance (Bosnyak et al. 2004).
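The ambiguity in interpreting the strength of a source fitted to averaged data can be shown with two invented cases. In this sketch the Gaussian response shape, its width, and the latency spread are assumptions chosen for illustration. The peak of the across-trial average stands in for the fitted dipole strength.

```python
import math

def response(t, latency, amplitude, width=20.0):
    """Hypothetical single-trial response: a Gaussian bump (arbitrary units)."""
    return amplitude * math.exp(-((t - latency) / width) ** 2)

def average_peak(latencies, amplitude):
    """Peak of the across-trial average for the given single-trial latencies."""
    times = range(0, 400)                   # latency axis in ms
    n = len(latencies)
    avg = [sum(response(t, lat, amplitude) for lat in latencies) / n for t in times]
    return max(avg)

# Case A: weaker single-trial responses, perfectly time-locked at 200 ms.
locked_peak = average_peak([200] * 101, amplitude=1.0)

# Case B: single-trial responses twice as strong, latencies spread over 150-250 ms.
jittered_peak = average_peak(list(range(150, 251)), amplitude=2.0)
```

Although every single trial in case B is twice as strong as in case A, the averaged peak is smaller in case B (roughly 0.7 against 1.0), because the responses are less well time-locked. An increase in averaged strength after training could therefore reflect either stronger activation or better temporal organization, and averaged data alone cannot separate the two.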
Some indication of the specificity of innate neural competence for musical perception, as com-
pared with language perception, is provided by the intrasurgical studies of Chauvel and colleagues.
During open-skull procedures with conscious patients, they intracerebrally recorded the auditory
evoked potentials from the primary and secondary auditory cortices of both hemispheres of the
human brain, in response to syllables or tones. In one study, they showed that the response to the
acoustic elements of syllables is lateralized. The effect of voiced and voiceless syllables was distin-
guishable in the left, but not in the right Heschl’s gyrus (HG) and planum temporale (PT); only
the evoked potentials in the left HG, and to a lesser extent in PT, reflected a sequence of responses
to the different components of the syllables. This acoustic temporal effect in the brain was not lim-
ited to speech sounds, but applied also to non-verbal sounds mimicking the temporal structure of
the syllable (Liégeois-Chauvel et al. 1999). In a second study, clear spectrally organized tonotopic
maps were observed (that is, spatially organized according to gradations of sound frequency), with
distinct separations between different frequency-processing regions in the right hemisphere. In
contrast, tonotopic organization was less evident in the left hemisphere, where different regions
were involved in response to a range of frequencies (Liégeois-Chauvel et al. 2001).
Ioannides and colleagues (2003) used MEG to compute the so-called echoic memory trace
(EMT)—the short-term retention of stimulus-related information. They used trains of auditory
stimuli designed specifically for computing the EMT lifetime and its contextual sensitivity, and
observed time-dependent EMTs with different lifetimes at different latencies, suggesting the
existence of multiple neural delay lines. The longer EMT lifetimes showed handedness and
gender-dependent interhemispheric asymmetry. Specifically, all participants except left-handed
males showed longer EMT lifetimes in the left hemisphere. These EMT results, together with the
results of Liégeois-Chauvel et al. (2001), suggest that very fundamental properties of the auditory
system, likely to determine how language and music are first perceived in the brain, are differ-
ently established in the left and right hemispheres of the adult human brain.
8.6.4 Conclusions
The differences between electrophysiological and imaging studies are not easy to reconcile. Patel
(2003) makes a valiant effort in this regard by proposing a common neural substrate for music
and linguistic ‘syntactic processing’, possibly in the frontal lobe and related to the production of
speech and musical sounds. He proposes that this neural substrate operates on distinct linguistic
and musical representations of experience, very likely localized in posterior parietal areas.
Clearly, from Chauvel’s work, innate or early-maturing networks in primary auditory areas are
also involved in bringing about these distinct representations. Our discussion of the similarities
between MMN and ERAN points to a similar conclusion to that of Patel, but with more emphasis
on the role of anticipatory mechanisms.
This picture is consistent with the evidence that a newborn human is already music-
competent, with specific areas in both hemispheres capable of interpreting structured sequences
of sounds—especially those with distinct rhythms linked with the body rhythms of cardiac
pulsation, respiration, gestural movement, and walking—and that this shared rhythmic sense, or
communicative musicality, facilitates the later acquisition of language through the elaboration of
expressive communication between babies and intimate companions. Lateral prefrontal areas,
which include Broca’s area, are innately configured to be involved in the anticipations and predic-
tions that define the syntactic structures of both linguistic and musical expression. It is also clear
that the processing requirements specific to language or music, such as the distinctions between
very similar consonants needed for semantic precision, require access to early acoustic analysis in
the auditory cortex with distinct hemispheric lateralization for different features.
8.7 Cortical specializations for components of music perception

8.7.1 Pitch and melody
Several careful brain imaging studies have addressed the following questions.
1 What brain areas are selectively involved in processing notes of identifiable pitch as
compared with noise?
2 What areas of the cortex are ‘tonotopic’ (i.e., map tones in a systematic pattern)?
3 What brain areas are involved in the detection of the fundamental musical aspects of
pitch—i.e., which octave (pitch height) and what note within the octave (pitch chroma)?
4 What brain areas are concerned with listening to melodies compared with simple repeated
note sequences and noise?
There are strong reasons to believe that none of these questions is seriously ethnocentric, or
even anthropocentric. From studies of animals, especially monkeys (Morel et al. 1993; Brugge
1985), it is clear that tonotopy is a universal feature of the primate brain, and that pitch discrimination
is an important component of the recognition of sound sources, with obvious survival
value. Pitch, like colour, is a conscious percept, constructed by the brain; this percept bears a
relationship to the frequencies that are picked up by the ear, but this is not necessarily a simple
relationship—for instance, there is the well-known example of the ‘missing fundamental’, in
which a listener will confidently claim to have heard the fundamental note of a harmonic series
of pitches played together, even when the fundamental is physically missing. The studies that
address the questions posed above will now be presented.
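The ‘missing fundamental’ can be made concrete with a short numerical sketch. It assumes a simple autocorrelation-based periodicity estimate as a stand-in for pitch perception; the sampling rate, the 200 Hz fundamental, and the choice of harmonics 2 to 5 are illustrative assumptions.

```python
import numpy as np

fs = 8000                      # sampling rate in Hz (illustrative)
f0 = 200.0                     # fundamental that is NOT present in the signal
t = np.arange(4000) / fs       # 0.5 s of signal

# Harmonic complex containing only harmonics 2-5 (400, 600, 800, 1000 Hz).
signal = sum(np.sin(2 * np.pi * k * f0 * t) for k in range(2, 6))

# The spectrum has essentially no energy at the fundamental frequency.
spectrum = np.abs(np.fft.rfft(signal))
energy_at_f0 = spectrum[int(f0 * len(signal) / fs)]

# Autocorrelation: the strongest peak marks the common period of the harmonics.
ac = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
min_lag = int(fs / 1000)       # ignore lags shorter than 1 ms
max_lag = int(fs / 50)         # search up to 20 ms
best_lag = min_lag + int(np.argmax(ac[min_lag:max_lag]))
estimated_pitch_hz = fs / best_lag

print(f"relative energy at f0: {energy_at_f0 / spectrum.max():.1e}")
print(f"estimated periodicity: {estimated_pitch_hz:.1f} Hz")
```

The waveform repeats at the period of the absent fundamental, so a periodicity-based estimate recovers 200 Hz even though no spectral component is present there.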
1 The definitive fMRI experiment comparing pitched sound with noise was performed by
Griffiths et al. (1998). To avoid the artificial character of a pure tone, and to provide emphasis
for the percept of pitch rather than the response to auditory frequency, a stimulus consisting
of iterated noise was used, which has a very broad frequency spectrum and can easily
be adjusted to give varying degrees of perceived pitch strength. This allowed Griffiths and
colleagues to perform a parametric experiment, in which the temporal regularity (equivalent
to the strength of the perceived pitch) was systematically increased, and brain areas were
found that showed a corresponding increase. These brain areas were located in the primary
auditory cortex (Heschl’s gyrus) bilaterally, with no activation in brain areas that are earlier
in the auditory pathway (inferior colliculus, medial geniculate body).
2 It is apparent that our auditory cortices provide spatial maps of sound frequency. While this
was carefully explored in one of the earlier MEG studies (Pantev et al. 1988), the use of fMRI
has provided much greater detail (Formisano 2003; Talavage 2004), showing several tono-
topic gradients in early auditory areas, consistent with results in owl monkeys (Brugge 1985;
Recanzone et al. 1993). The implication is that distinguishable sound frequencies are natu-
rally important in human hearing, which enables us to perceive meaningful structures of
sounds based on sequences of notes.
3 There is abundant evidence that the perception of octave equivalence (in which the sound
frequency is precisely doubled) is a human universal—it is even found in rhesus monkeys
(Wright et al. 2000), which can generalize tonal melodies (but only tonal) across octave
transpositions. Tones within each octave are uniquely consonant with their corresponding
tones in other octaves, while within each octave there is a subjectively similar cycle of notes,
whatever scale is considered. In an fMRI study, Warren et al. (2003) investigated the brain
areas involved in perceptual changes along these two dimensions of pitch variation, known as
pitch height (which octave?) and pitch chroma (which note within the octave?). In their
experiment testing musically untrained adults, the stimuli were harmonic complexes, in
which chroma and height could be varied continuously, while the total energy and spectral
region remained fixed. This controlled properly for these important auditory variables, leav-
ing the perceptual aspect of pitch as the important experimental variable. They found that
change in pitch chroma, but not in height, activates bilateral areas in front of Heschl’s gyrus
in the planum polare, while change in pitch height, but not in chroma, activates bilateral
areas in the posterior planum temporale. While Heschl’s gyrus itself is involved in all audi-
tory experience, these regions, which bracket the primary auditory cortex on the superior
surface of the temporal lobe, represent distinct brain substrates for processing the two musical
dimensions of pitch (Figure 8.1).
Is this specialization of auditory cortex innate or acquired? The participants in this study
were not musically trained, but as adults they would already have been exposed to a great
deal of mainly Western music, and as we will see later in this chapter, our adaptable brains are
capable of permanent change in the process of acquiring many musically related skills. Thus,
without performing similar studies of very young infants, with almost no musical experi-
ence, this question is impossible to answer definitively. However, the cross-culturally univer-
sal perception of the octave interval strongly suggests that there are brain regions that are
‘prewired’ to be sensitive to this feature of pitch, and there is no reason to suppose that the
regions identified as activated in this study differ across cultures.
4 The earliest imaging study to investigate whether there are particular brain areas concerned
with melody was a PET study by Zatorre et al. (1994), in which participants listened to
simple melodies and acoustically matched noise sequences. The results showed that cerebral
blood flow increased in the right superior temporal and right occipital cortices. The authors
concluded that specialized neural systems in the right superior temporal cortex participate in
the perceptual analysis of melodies, but as with several other early studies, appropriate statis-
tical methods were not used to evaluate lateralization. These areas can be considered to form
part of the pervasive mirror or ‘sympathy’ system of the brain for all sensory modalities
(Decety and Chaminade 2003; Jeannerod 2004).
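The parametric manipulation described in item 1 can be sketched numerically. The example assumes the common delay-and-add construction of iterated noise, with circular shifts for simplicity; it is not a reproduction of the stimuli of Griffiths et al. (1998), and the delay and iteration counts are illustrative. Temporal regularity is measured here as the normalized autocorrelation at the delay.

```python
import numpy as np

def iterated_noise(n_samples, delay, n_iterations, seed=0):
    """Broadband noise passed through n delay-and-add stages (circular shift)."""
    rng = np.random.default_rng(seed)
    y = rng.standard_normal(n_samples)
    for _ in range(n_iterations):
        y = y + np.roll(y, delay)      # add a delayed copy of the signal to itself
    return y / np.std(y)

def regularity_at_delay(x, delay):
    """Normalized (circular) autocorrelation of x at the given lag."""
    x = x - x.mean()
    return float(np.dot(x, np.roll(x, delay)) / np.dot(x, x))

delay = 80                              # samples; 5 ms at an assumed 16 kHz rate
iterations = (0, 1, 4, 16)
regularity = [
    regularity_at_delay(iterated_noise(32000, delay, n), delay) for n in iterations
]

for n, r in zip(iterations, regularity):
    print(f"{n:2d} iterations: regularity at delay = {r:.2f}")
```

In this construction the spectrum remains broadband while the regularity at the delay grows with the number of iterations, which is what makes a graded pitch-strength manipulation possible.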
[Figure 8.1: panels (a)–(d). Panel labels: pitch–noise, noise–silence; all ∆ chroma; all ∆ height; ∆ chroma only, ∆ height only. Histogram axis: size of effect (0 to 0.8), shown for anterior and posterior areas.]
Fig. 8.1 Statistical parametric maps for Warren et al. (2003). The 90 per cent probability boundaries
for primary auditory cortex are outlined (black). (a) Broadband noise contrasted with silence
(noise–silence, green) activates extensive bilateral superior temporal areas including both medial and
lateral Heschl’s gyrus (HG). The pitch-producing stimuli contrasted with noise (pitch–noise, lilac)
produce more restricted bilateral activation in lateral HG, planum polare (PP), and planum temporale
(PT). (b) Pitch chroma change (∆ chroma) contrasted with fixed chroma (all ∆ chroma, red) activates
bilateral areas in lateral HG, PP, and anterolateral PT. (c) Pitch height change (∆ height) contrasted
with fixed height (all ∆ height, blue) activates bilateral areas in lateral HG and anterolateral PT. (d) Voxels
(volume elements) in b and c activated both by pitch chroma change and pitch height change have
been exclusively masked. Pitch chroma change but not height change (∆ chroma only, red) activates
bilateral areas anterior to HG in PP; pitch height change but not chroma change (∆ height only,
blue) activates bilateral areas in posterior PT. These areas represent distinct brain substrates for
processing the two musical dimensions of pitch. The relative magnitude of the blood oxygen
level-dependent (BOLD) signal change in anterior and posterior areas is shown for each of the
contrasts of interest (right). The height of the histogram columns represents the mean size of effect
(signal change) relative to global mean signal for the contrasts ∆ chroma-only (red) and ∆ height-only
(blue) at the peak voxels for each contrast in the right hemisphere; vertical bars represent the
standard error of the mean size of effect. The histograms demonstrate opposite patterns of pitch
chroma and pitch height processing in the anterior and posterior auditory areas. (Adapted and
reproduced with permission.) (See also colour plate 1.)
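The two pitch dimensions contrasted in this experiment can be expressed as a simple decomposition of log-frequency. The sketch below is only an illustration of the definitions of pitch height and pitch chroma; the function name and the 440 Hz reference are assumptions, and it does not describe how the harmonic-complex stimuli of Warren et al. (2003) were constructed.

```python
import math

def height_and_chroma(freq_hz, ref_hz=440.0):
    """Split a frequency into octave number (height) and position within the octave (chroma).

    ref_hz is an arbitrary reference (assumed here to be 440 Hz).
    """
    octaves = math.log2(freq_hz / ref_hz)   # distance from the reference in octaves
    height = math.floor(octaves)            # which octave
    chroma = octaves - height               # where within the octave, in [0, 1)
    return height, chroma

for f in (220.0, 440.0, 880.0, 660.0, 1320.0):
    h, c = height_and_chroma(f)
    print(f"{f:7.1f} Hz -> height {h:+d}, chroma {c:.3f}")
```

Frequencies related by a doubling differ only in height and share the same chroma, which is the sense in which tones an octave apart are equivalent.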
Another study by the Griffiths group (Patterson et al. 2002) addressed this question, this time
with an fMRI experiment that allowed more detailed investigation to identify the main stages of
melody processing in the auditory pathway. Spectrally matched sounds that produce no pitch,
fixed pitch, or melody were all found to activate Heschl’s gyrus and planum temporale. Within
this region, sounds with pitch produced more activation than those without pitch only in the
lateral half of Heschl’s gyrus. When the pitch was varied to produce a melody, there was activa-
tion in regions beyond Heschl’s gyrus and planum temporale, specifically in the superior tempo-
ral gyrus and planum polare, with significantly more activation in the right hemisphere, in
partial agreement with Zatorre’s findings. Unsurprisingly, regions in the planum polare are also
specifically involved in pitch perception within the octave (chroma). The results support the view
that there is a hierarchy of pitch processing in which the centre of activity moves forward and
more laterally away from the primary auditory cortex as the processing of melodic sounds
proceeds. An analogous hierarchical functional distribution has been identified in visual areas at
the rear of the brain, where areas further forward are concerned with progressively more complex
features of a visual scene (Zeki 1993).
Again, the question remains whether these brain areas have undergone a special adaptation as a
result of experience or whether they are congenitally earmarked for the task they perform.
Perhaps cross-cultural studies could be helpful here, if possible involving participants from
cultures where far less emphasis is placed on melody in music. It is intriguing to hypothesize that
the observed lateralization arises because language processing comes to dominate left hemisphere
auditory areas during development, leaving the homologous right hemisphere areas to music by
default.
8.7.2 Timbre
The term ‘timbre’ describes the harmonic content of sounds as they evolve through time during
the playing of each note (McAdams et al. 1995). It is this feature of a sound that allows us to
distinguish one resonant source from another when the other perceptual features—pitch, loud-
ness and duration—are held constant. Thus, the perception of timbre is likely to be relatively
species-invariant, since it may well be required for potentially survival-dependent auditory
identification, and for the detection of expressive effort and level of excitement or emotional
modulation in calls.
Changes in timbre are an inherent part of a musical performance, and they play an important
role in the emotional effects of music, which we will discuss in more depth later. In a well-
designed fMRI study, Menon et al. (2002) used selected sound stimuli to investigate the neural
correlates of timbre perception. They presented non-musician volunteer subjects with the same
melodies performed with contrasting timbres, and compared the resulting brain activations. The
two timbres comprised tones with (a) a fast attack, low spectral centroid, and no spectral flux,
and (b) a slow attack, higher spectral centroid, and greater spectral flux. Loudness and subjective
pitch were carefully controlled. The participants’ low-level task was a key-press response at the
end of each short melody. The results indicated that both left and right hemispheres are involved
in timbre processing, challenging the conventional notion that the elementary attributes of
musical perception are predominantly lateralized to the right hemisphere.
Significant timbre-related brain activation was found in well-defined regions of posterior
Heschl’s gyrus and superior temporal sulcus, extending into the circular insular sulcus. Although
the extent of activation was not significantly different between left and right hemispheres,
temporal lobe activations were significantly posterior in the left hemisphere when compared
with those of the right, suggesting a functional asymmetry in their respective contributions to
timbre processing. Apart from Heschl’s gyrus activation, which appears to be common to most
auditory perceptions, the timbre-specific activation areas appear to be different from those of
pitch and melody; they are generally deeper and lower in the temporal lobe, perhaps indicating
greater involvement with the emotive effects of the sounds. (For an analysis of the physical infor-
mation that defines the different emotional intensity of expressive movements, including those of
musical performance, see Lee and Schögler, Chapter 6, this volume).
8.7.3 Rhythm
A considerable proportion of music in most cultures has definable rhythms. It is probable that
the perception of rhythm is mediated by specific, innately predefined brain areas. However, as
with melody, cultures have developed highly sophisticated elaborations of rhythmic motifs.
Rhythmic movement and sounds are intrinsically part of animal life—heartbeat and respira-
tion must occur cyclically to give tissue continuous nourishment, and animal locomotion and
chewing are repetitive. Indeed, all movements of animals, even those such as jellyfish or worms,
without jointed body members, have a cyclic time control or rhythm (Llinas 2001). For all animal
movement, increases in physiological arousal are accompanied by increases in frequency, providing
music-makers with a uniquely natural means of modulating arousal in their listeners.
Rhythmic sounds have a powerful entraining effect on listeners (Molinari et al. 2003), encoura-
ging a response in the form of rhythmical movements or dance (Cross and Morley, Chapter 5,
this volume).
Primate species share a brain system known as the ‘mirror system’ (Rizzolatti et al. 1996;
Iacoboni et al. 2001). In a given participant’s brain, neurons forming part of this system produce
action potentials whether the individual is performing a specific movement, or whether another
conspecific is viewed performing the movement. Rhythmical sounds easily evoke in the
imagination the movements required to produce them, and thus it would not be surprising if
they activate areas of cortex involved in movement generation, specifically the premotor cortex
and supplementary motor areas. Perhaps even more pertinent for understanding the brain
mechanisms of musical communication and art is the increasing evidence that mirror systems
in the brain are conveying information between participants about dynamic emotional
states, qualities of purpose expressed in movement, and the goals of actions (Adolphs 2003;
Gallese 2001).
There have been few brain imaging studies that address directly the question of what brain
regions are specifically responsible for discrimination of rhythms, either in music or in the wider
context of regularly repeated actions. Parsons (2001) performed a PET study in which musicians
and non-musicians discriminated pairs of rhythms with respect to pattern, tempo, meter, or
duration. Among other areas, the cerebellum appears to be a crucial component of the rhythm
perception network. Sakai et al. (1999) used fMRI to explore the brain areas involved in main-
taining the short-term memory of a rhythmical sequence of sounds, while varying the comple-
xity of the rhythm. They found that for simple rhythms, as commonly found in music, activation
occurred in predominantly left-hemisphere premotor and inferior parietal areas in the cerebrum,
together with right anterior cerebellum. These are areas important in ‘mirroring’ intentions of
movements. Overy et al. (2004), in a ground-breaking fMRI study, investigated melody and
rhythm perception in children of about 6 years old. By contrast with earlier studies of rhythm
discrimination in adults, based on the effect of cortical lesions (Samson et al. 2001), this study
found no strong cerebral lateralization to the left hemisphere. Overy argues that this lateraliza-
tion is more likely to develop during maturation.
The production of regular movements, for instance finger tapping, has received much more
attention from imaging neuroscientists (Rivkin et al. 2003; Dhamala et al. 2003; Ullen et al. 2003),
but space does not permit a full discussion here. (See Lee and Schögler, Chapter 6, this volume,
for information on brain systems that are involved in the regulation of the timing of movements
and their perceptual control).
8.8 Temporal aspects of music (and how to study them)

Music and time are inseparable. Notes, pitch and chords are defined by the natural frequencies of
their elements (‘real’ or ‘virtual’; e.g., missing fundamental), and hence they are intrinsically
based on the temporal notion of oscillation at a very fine temporal scale (Osborne, Chapter 25,
this volume). Melody is defined by the temporal arrangement of musical elements over longer
time periods, and melodies are linked together in patterns specific to musical themes, phrases,
songs and composers, that span timescales from seconds to a composer’s lifetime, i.e., the full
range of conscious experience and its recollection. Rhythm, with elaborated narratives of
rhythms, provides the defining organizational principle of temporal sequences. It is therefore
clear that all elements of music are connected with unfolding sequences in time.
Brain activity is also characterized by events and oscillations (chemical and electrical) that
unfold in time at different timescales. Two ingredients are therefore highly desirable for the
analysis of brain activity elicited by music. First, a method is needed to describe in a quantifiable
way the temporal attributes of music in all its aspects. It is especially important to allow for struc-
ture over wide temporal scales and to include variations expressing the performer’s interpreta-
tion that are known to have a profound effect on the way the music is perceived. Second, brain
activity needs to be described in ways that produce regional time-courses of activity that can be
analysed in exactly the same way as music, so that correlations can be found between the temporal
organization of the piece of music, its specific expression in a given performance, and the brain
activity it elicits. We will emphasize a new methodology that embodies these two ingredients,
after summarizing the achievements and limitations of other recently proposed methods that
also allow flexibility in the stimulus content.
8.8.1 Conventional analysis, and moves in the right direction

Methods relying on haemodynamic processes lack the temporal resolution to capture the high
and moderate frequencies encountered in music. MEG and EEG do have the required temporal
resolution, but their conventional use does not lend itself easily to such an investigation. Even
studies relying on musical phrases containing a harmonically, melodically, or rhythmically incon-
gruous note or chord fall short of the mark. Although such approaches rely on elaborated cogni-
tive categories and concepts of music theory, they can only probe the natural processes
of expectancy at a restricted level, identifying isolated responses to musical violations, rather
than probing the responses evoked during the generation of experience of the local and global
attributes of an actual piece of music.
The steady-state response (SSR) of individual MEG or EEG channels has been used to study
the temporal neural correlates of the auditory sequences in a number of recent studies. This
approach provides a measure of the real-time variation elicited by a continuous stimulus, albeit
often an indirect one. The evidence from these studies suggests that phase reorganization plays a
more important role than amplitude changes, as the following two examples illustrate.
Patel and Balaban (2000) have studied the SSR elicited by the amplitude modulation of
melody-like sequences. They found that energy changes in the MEG signal did not track the
stimulus structure well. In contrast, the phase of the MEG signal at particular sensor locations
showed marked resemblance to the contour of the pitch-time series. The distribution of sensors
showing this phase tracking of the auditory sequences was bilateral, but with a ‘statistical
tendency’ for higher density in the right hemisphere. A similar approach was followed in Bosnyak
et al. (2004) to address the question of plasticity in the adult human brain. Adult non-musicians
were trained to discriminate small changes in the carrier frequency of 40 Hz amplitude modu-
lated pure tones. Although changes in the amplitude of the transient evoked response were
identified, no change in the SSR amplitude was identified, and only changes in the phase of the
SSR were observed after training.
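The distinction between amplitude and phase changes in a steady-state response can be sketched as follows. The example assumes that the SSR is summarized by the complex Fourier coefficient at the 40 Hz modulation frequency; the synthetic ‘before’ and ‘after’ signals and their phase values are invented for illustration and are not data from Bosnyak et al. (2004).

```python
import numpy as np

def ssr_amplitude_phase(signal, fs, f_mod=40.0):
    """Amplitude and phase of the signal component at the modulation frequency."""
    t = np.arange(len(signal)) / fs
    # Project the signal onto a complex exponential at f_mod.
    coeff = 2.0 * np.mean(signal * np.exp(-2j * np.pi * f_mod * t))
    return float(np.abs(coeff)), float(np.angle(coeff))

fs = 1000                                   # sampling rate in Hz (illustrative)
t = np.arange(2000) / fs                    # 2 s, a whole number of 40 Hz cycles

# Synthetic responses: same amplitude, different phase (made-up values).
before = np.cos(2 * np.pi * 40.0 * t + 0.2)
after = np.cos(2 * np.pi * 40.0 * t + 0.9)

amp_before, phase_before = ssr_amplitude_phase(before, fs)
amp_after, phase_after = ssr_amplitude_phase(after, fs)

print(f"amplitude change: {amp_after - amp_before:+.3f}")
print(f"phase change:     {phase_after - phase_before:+.3f} rad")
```

A measure based on amplitude alone would report no change between these two signals, whereas the phase of the same coefficient shifts clearly.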
Working at the level of the SSR of individual channels limits what can be deduced from the
data. For MEG in particular, the signal recorded by any one sensor is sensitive to generators that
can be very distant from each other, so changes at the level of sensors will be more apparent if
large-scale synchrony is established. Conversely, a single focal generator produces a dipolar radial
field pattern; therefore, a strong MEG signal will be recorded at two separate locations (the
distance between the sensors of maximum sensitivity increases with the depth of the focal
source). Temporal synchronization between areas cannot be reliably concluded on the basis of
measures of synchronization between sensors at different locations, because each single
focal generator will create strong correlation (with no time delay) in the output of sensors at
different locations.
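The pitfall described above can be demonstrated with a toy forward model. The sketch assumes a single source seen by two sensors with opposite-signed gains plus independent sensor noise; the gain and noise values are arbitrary assumptions and do not represent a realistic MEG lead field.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# One focal generator with an arbitrary (white-noise) time course.
source = rng.standard_normal(n)

# Two sensors on either side of the dipolar pattern: opposite-signed gains
# (illustrative values) plus independent sensor noise.
sensor_a = +1.0 * source + 0.3 * rng.standard_normal(n)
sensor_b = -0.8 * source + 0.3 * rng.standard_normal(n)

zero_lag_corr = float(np.corrcoef(sensor_a, sensor_b)[0, 1])

# Cross-correlation over all lags: the extremum sits at zero lag.
xc = np.correlate(sensor_a - sensor_a.mean(), sensor_b - sensor_b.mean(), mode="full")
lags = np.arange(-(n - 1), n)
best_lag = int(lags[np.argmax(np.abs(xc))])

print(f"zero-lag correlation between sensors: {zero_lag_corr:.2f}")
print(f"lag of strongest cross-correlation:   {best_lag} samples")
```

The two sensors are strongly (anti-)correlated with no delay although only one generator is active, so sensor-level correlation by itself says nothing about synchronization between distinct brain areas.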
Two recent publications exploited the availability of full tomographic descriptions of brain
activity, millisecond by millisecond, elicited by continuous music (Ioannides et al. 2002b;
Popescu et al. 2004). For the tomographic analysis, the CURRY 4.5 (Compumedics Neuroscan)
source localization software was used, with a minimum L2-norm constraint for the currents
and the L-curve regularization. Five right-handed male participants were used, none of whom
had formal music training. The music stimulus was a 2 minute 50 second solo piano piece played
at a moderate tempo. It had a simple anapaestic metre, i.e., its basic metrical foot consisted
of crotchet, crotchet, minim (i.e., ratio of 1:1:2). The piece was initially unfamiliar to the
participants, and became familiar through a training procedure that each participant underwent
before actually listening to the excerpts used for the analysis. The procedure ensured that the
material was neither novel nor overlearnt, but rather equally familiar to each participant. The
piece was divided into motifs that could be used for memorization and recall. The MEG signal
was then recorded while the participants listened to two motifs each lasting 10 seconds (motif I
and motif II). The MEG signal was recorded for 20 repetitions of each of the two motifs.
The first study (Ioannides et al. 2002b) used the average and single trial MEG signal from the
20 repetitions of motifs I and II. The average MEG signal at the beginning of the motif showed a
dipolar pattern on the lateral surface (Figure 8.2a), and the tomographic solutions showed loci of
activity in and around the auditory cortices (Figure 8.2b), stronger on the right. The average
MEG signal was computed separately for sensors around the maximum and minimum of the
dipolar pattern for each hemisphere, using five sensors for each sum as marked by the pentagons
in Figure 8.2a. The differences between the averages around the maximum and minimum of each
dipolar pattern define a virtual sensor (VS) for each hemisphere. The VSs can be applied to the
single-trial MEG signal to obtain a real-time estimate of activity in the left and right auditory and
nearby cortices. Figure 8.3 shows that the frequency spectra for the time series representing the
sound of music and the activity in the left and right auditory cortex (as described by the two VSs)
shared similar peaks, especially on the right, as would be expected by the results of Patel and
Balaban (2000).
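The virtual sensor construction described above amounts to a difference of two sensor averages. The following sketch uses toy data with hypothetical sensor indices and gains (none of these values come from the study) to show how such a difference yields a cleaner single-trial estimate than any one sensor.

```python
import numpy as np

def virtual_sensor(meg, max_sensors, min_sensors):
    """Difference between the mean signal near the field maximum and near the minimum.

    meg has shape (n_sensors, n_times); the index lists are hypothetical.
    """
    meg = np.asarray(meg)
    return meg[max_sensors].mean(axis=0) - meg[min_sensors].mean(axis=0)

rng = np.random.default_rng(3)
n_times = 2000
# Toy source time course: a 7 Hz oscillation sampled at an assumed 1 kHz.
source = np.sin(2 * np.pi * 7.0 * np.arange(n_times) / 1000.0)

# Toy single-trial data: 5 sensors see +source, 5 see -source, 2 see noise only.
gains = np.zeros(12)
gains[:5] = 1.0
gains[5:10] = -1.0
meg = gains[:, None] * source[None, :] + 0.5 * rng.standard_normal((12, n_times))

vs = virtual_sensor(meg, max_sensors=[0, 1, 2, 3, 4], min_sensors=[5, 6, 7, 8, 9])

vs_corr = float(np.corrcoef(vs, source)[0, 1])
single_corr = float(np.corrcoef(meg[0], source)[0, 1])
print(f"correlation with source, single sensor:  {single_corr:.2f}")
print(f"correlation with source, virtual sensor: {vs_corr:.2f}")
```

Because the two sensor groups carry the source with opposite sign, the difference adds the signal while averaging reduces the independent noise.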
The second study extended these results in two ways. First, the frequency spectra were com-
puted for regional brain activations extracted from the tomographic solutions. The results
showed that musical attributes at different temporal scales are processed in distributed and par-
tially overlapping networks: low frequencies are encountered in networks distributed in the ante-
rior part of the temporal lobes and frontal areas (Figure 7 in Popescu et al. 2004). These networks
presumably deal with ‘higher-order patterns’ formed by slow features of the melody. The results
showed further that high frequency features corresponding to individual notes are analysed in
regions within and around the auditory cortex, as already demonstrated in the earlier study
(Ioannides et al. 2002b), and in motor areas, specifically in primary sensorimotor area (SM1),
premotor area and supplementary motor area (SMA) (Figure 8 in Popescu et al. 2004).
A novel analysis was introduced and applied to one of the two motifs to capture the temporal
characteristics of the music score and brain activity. For the purpose of this analysis, the motif
was divided into four melodic contour segments (A–D); the first three segments (A–C), each
lasting a little over two seconds, were analysed in detail. The performance-rhythm of each
segment was characterized by temporal deviations from the reference interval ratio (DRIR),
introduced either by the performer’s artistic expression or the limitation/restriction of his physi-
cal capabilities. The mean interval ratios were 1.0:1.2:2.0 in segment A, 1.0:1.3:2.4 in segment B
and 1.0:1.1:2.1 in segment C; that is, segment C corresponds best to the anapaestic 1:1:2 interval
ratio defined on the score (crotchet, crotchet, minim), segment A comes close to it, and segment B
deviates the most. The partitioning of the 10-second motif therefore reveals two switches in
note-duration ratios that occur during the musical motif: the first switch marks the transition from a
close-to-metrical segment (segment A) to a segment characterized by higher DRIR (segment B),
whereas the second switch marks the transition from segment B (with higher DRIR) to a
segment with the smallest DRIR (segment C). Full details on the music score and its decomposition
can be found in Popescu et al. (2004).
[Figure 8.2: (a) magnetic field pattern at 110.4 ms over the left (L) and right (R) sides of the head; (b) auditory cortex activation at 110.4 ms, left (L) and right (R).]
Fig. 8.2 (a) Contour plot of the magnetic field over the left (L) and right (R) side of the head. The
signal topography is dipolar, i.e., the pattern separates clearly into positive and negative values
corresponding to magnetic field in and out of the plane. The sensors around the maxima and minima
of this nearly dipolar pattern are marked by regular pentagons pointing in opposite directions. This
signal topography is consistent with a rather focal generator somewhere in the brain in the region
between the two extrema, which corresponds to the primary auditory cortex and/or the surrounding
auditory association areas. An estimate of the activity around each auditory cortex can be calculated
directly from the signal values at each time slice by taking the difference between the averages of
the signal recorded for each half of the brain by each set of sensors (each set marked on the figure
with pentagons that point in the same direction). (b) Activity from tomographic reconstructions
of the signal shown in (a) is superimposed on renderings of the left and right cortical surfaces.
(See also colour plate 2.)
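The ordering of the three segments can be checked with a few lines of arithmetic on the mean interval ratios quoted in the text. The exact definition of DRIR is given in Popescu et al. (2004) and is not reproduced here; the Euclidean distance used below is a hypothetical stand-in chosen only for illustration.

```python
import math

# Reference ratio from the score: crotchet, crotchet, minim.
REFERENCE = (1.0, 1.0, 2.0)

# Mean interval ratios reported in the text for the three analysed segments.
segments = {
    "A": (1.0, 1.2, 2.0),
    "B": (1.0, 1.3, 2.4),
    "C": (1.0, 1.1, 2.1),
}

def deviation_from_reference(ratio, reference=REFERENCE):
    """Euclidean distance to the reference ratio (an assumed, illustrative measure)."""
    return math.sqrt(sum((r - ref) ** 2 for r, ref in zip(ratio, reference)))

deviations = {name: deviation_from_reference(r) for name, r in segments.items()}
ranking = sorted(deviations, key=deviations.get)   # smallest deviation first

for name in ranking:
    print(f"segment {name}: deviation {deviations[name]:.3f}")
```

With this stand-in measure the segments order as C, A, B from most to least metrical, in agreement with the description in the text.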
The new measure, introduced in Popescu et al. (2004), was to quantify the similarity in
the temporal structures of music and brain activity through the correlation between
1.0
Motif I
AMsignal
0.8
Left VS
Magnitude
0.6
Right VS
0.4
0.2
0.0
0.5
Motif II AMsignal
0.4
Left VS
Magnitude
0.3
Right VS
0.2
0.1
0.0
2 3 4 5 6 7 8 9 10
Frequency (Cycles / second)
Fig. 8.3 Frequency spectra for the amplitude envelope of two musical motifs and for the auditory
cortex activity (derived from the tomographic solutions) show a high correlation in the 7 Hz frequency
band, which roughly corresponds to the repetition rate of the piano key sounds. The match at other
frequencies is better for the right hemisphere (adapted from Ioannides et al. 2002b).
two beat-spectra (Popescu et al. 2004). The beat spectrum provides a robust characterization of
the time-series across different timescales. When applied to a music score, the beat spectrum
encapsulates the musical performance rhythm in accordance with its perceptual features. The beat
spectrum is derived directly from the amplitude modulation of a time series (Todd 1994),
namely from self-similarity measures of the multi-resolution wavelet decomposition of the
amplitude envelope of the music signal (Foote and Uchihashi 2001). The same analysis is applied
to the time courses of regional brain activity to obtain a corresponding beat spectrum. The beat
spectrum captures the rhythmic properties of the amplitude modulation: rhythmically transient
responses that preserve the stimulus periodicities will produce similar beat spectra. The beat
spectrum is sufficiently sensitive to capture small changes in rhythm on the sound track of differ-
ent musical excerpts produced by the expressive performance of the artist.
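As a rough illustration of the comparison described above, the following Python sketch computes a simplified beat spectrum for two time series and correlates the results. It is a minimal sketch under stated assumptions, not the procedure of Popescu et al. (2004): the self-similarity measure used here is a plain autocorrelation of the amplitude envelope, standing in for the multi-resolution wavelet decomposition named in the text. The function names (`beat_spectrum`, `pearson`), the sampling rate and the synthetic 7 Hz and 3 Hz envelopes are all hypothetical.

```python
import math

def beat_spectrum(envelope, max_lag):
    """Simplified beat spectrum: normalized autocorrelation of an amplitude
    envelope at lags 1..max_lag. Assumption: this replaces the wavelet-based
    self-similarity measure described in the text."""
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [x - mean for x in envelope]
    energy = sum(c * c for c in centered)
    if energy == 0:
        raise ValueError("envelope is constant; no rhythm to analyse")
    spectrum = []
    for lag in range(1, max_lag + 1):
        acc = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        spectrum.append(acc / energy)
    return spectrum

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical signals, 5 seconds sampled at 100 samples per second.
fs = 100
t = [i / fs for i in range(5 * fs)]
music_env = [1.0 + math.cos(2 * math.pi * 7 * x) for x in t]               # 7 Hz envelope
brain_act = [0.5 + 0.3 * math.cos(2 * math.pi * 7 * x - 0.8) for x in t]   # same rhythm, delayed and scaled
other_env = [1.0 + math.cos(2 * math.pi * 3 * x) for x in t]               # different rhythm (3 Hz)

bs_music = beat_spectrum(music_env, max_lag=fs)
bs_brain = beat_spectrum(brain_act, max_lag=fs)
bs_other = beat_spectrum(other_env, max_lag=fs)

r_same = pearson(bs_music, bs_brain)   # response preserves the stimulus periodicity
r_diff = pearson(bs_music, bs_other)   # periodicities differ
print(f"same rhythm: {r_same:.3f}, different rhythm: {r_diff:.3f}")
```

Because the autocorrelation discards phase and overall scale, a delayed and attenuated response that keeps the stimulus periodicity still yields a beat spectrum close to that of the stimulus, which is the property the correlation measure relies on.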
We made use of this sensitivity by studying how the new measure changed for different
brain areas with changes in DRIR in the three segments of the motif. For each participant, the
correlation was computed between the beat spectra for each music segment and each of six
brain areas—three motor-related areas in each hemisphere. Figure 8.4 shows the mean correla-
tion for each segment and the statistical significance for the two changes (between segments A–B
and B–C). The results demonstrate that when external rhythm is close to metrical (i.e., for
small DRIR), the coherent mode of oscillatory activity is encountered in the left hemisphere.
The increased synchronization between the internal and external rhythms might result in a
subsequent mental stability of the rhythm percept and the efficient self-generation or retrieval of
these rhythms, and easier memorization and reproduction of these rhythmical patterns (Essens
and Povel 1985; Sakai et al. 1999). The change to high DRIR reduces the rhythm-tracking in the
left but not the right hemisphere. Hence, the right hemisphere dominates when high DRIR
rhythms are encountered, which is consistent with better tracking of non-metrical rhythms
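[Figure 8.4 data: mean beat-spectrum correlation for each brain area and music segment]
Hemisphere  Area   Seg. A  Seg. B  Seg. C  Change A–B         Change B–C
Left        SMA    0.62    0.43    0.39    Decrease
Left        Broca  0.55    0.41    0.63    (Almost) Decrease  Increase
Left        SM1    0.7     0.4     0.71    Decrease           Increase
Right       SMA    0.49    0.41    0.39
Right       Broca  0.64    0.64    0.55
Right       SM1    0.56    0.54    0.47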
Fig. 8.4 Measures expressing the relationship between the temporal organization of musical sound
and brain activity. Each measure corresponds to the correlation between two beat spectra. The first
beat-spectrum is extracted from the time series for the regional activation of one brain area. Six brain
areas were used: three on each side, namely left and right supplementary motor area (SMA), left and
right Broca area and left and right primary sensorimotor area (SM1). The results for one area are
shown in one row in the figure. The second beat-spectrum is extracted from the time series of a
music segment. Three successive music segments (A–C) were used. The first (A) and third (C) music
segments were more regular in their rhythmic properties than the middle segment (B). The values
of the mean beat-spectrum correlation for each segment are shown in one column. A solid arrow
between successive segments marks a statistically significant change in the beat-spectrum correlation,
while a dashed arrow corresponds to a trend that narrowly fails to reach significance.
(Roland et al. 1981). However, the overall conclusion from the beat spectrum analysis is that the
processing of rhythm is not confined to only one hemisphere, in agreement with recent studies
(Peretz 1990; Sakai et al. 1999).
The full dynamics of brain activity can only be appreciated in the real-time tomographic
displays of brain activity as the music is played. Single-trial analysis of the MEG signal has been
performed using magnetic field tomography (MFT) (Ioannides et al. 1990; Ioannides 2001).
Statistical parametric maps (SPMs) of these solutions were computed for 500 ms windows,
comparing brain activity during music listening with the pre-stimulus period. The SPM analysis
was made using the MFT solutions derived separately for each of 20 repetitions of the motif.
The 500 ms window was run at the same rate as the music so the resulting audiovisual display
shows changes in brain activity unfolding in real time (click on ‘MUSIC and animation’ at
http://www.hbd.brain.riken.jp/auditorymusic.htm). The display shows the slow build-up of
activity, beginning in motor-related areas, that eventually engulfs much of the brain and outlasts
the music.
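The windowed comparison with the pre-stimulus period can be sketched in Python as follows. This is an illustration under stated assumptions rather than the SPM analysis actually used: the text does not specify the test statistic, so a paired t statistic across repetitions is used as a stand-in, applied to a single simulated activity time course per repetition rather than to full MFT solutions. The function names, the sampling rate, the 0.25 s step between windows and the simulated ramp of activity are hypothetical; only the 500 ms window and the 20 repetitions come from the text.

```python
import math
import random

def window_mean(series, start, size):
    """Mean of `size` samples of `series` beginning at index `start`."""
    return sum(series[start:start + size]) / size

def paired_t(active, baseline):
    """Paired t statistic across trials for (active - baseline)."""
    diffs = [a - b for a, b in zip(active, baseline)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)

def windowed_t(trials, fs, baseline_s, window_s, step_s):
    """t statistic for each sliding window, comparing the window mean with
    each trial's own pre-stimulus mean (assumed stand-in for SPM)."""
    base_n = int(round(baseline_s * fs))
    size = int(round(window_s * fs))
    step = int(round(step_s * fs))
    baselines = [window_mean(trial, 0, base_n) for trial in trials]
    stats = []
    start = base_n
    while start + size <= len(trials[0]):
        active = [window_mean(trial, start, size) for trial in trials]
        stats.append(paired_t(active, baselines))
        start += step
    return stats

# Simulated data (hypothetical): 20 repetitions, 0.5 s of pre-stimulus noise
# followed by 2 s in which activity builds up slowly on top of the noise.
random.seed(0)
fs = 100
baseline_s, music_s = 0.5, 2.0

def simulate_trial():
    pre = [random.gauss(0, 1) for _ in range(int(baseline_s * fs))]
    post = [i / fs + random.gauss(0, 1) for i in range(int(music_s * fs))]
    return pre + post

trials = [simulate_trial() for _ in range(20)]
t_values = windowed_t(trials, fs, baseline_s, window_s=0.5, step_s=0.25)
for k, t_val in enumerate(t_values):
    print(f"window starting {k * 0.25:.2f} s after onset: t = {t_val:.1f}")
```

Stepping the window through the recording in time order yields a sequence of statistics that can be displayed alongside the music, which is the idea behind the real-time display described above.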
In summary, tomographic analysis of MEG responses to real music demonstrates that very large
areas of the brain are activated when we listen to music. These activations differ in the left and
right hemispheres; the left hemisphere is more engaged when regular rhythms are encountered.
The activity in different brain areas reflects musical structure over different timescales; auditory
and motor areas closely follow the low-level, high-frequency musical structure. In contrast, frontal
areas contain a slower response, presumably playing a more integrative role. All of these results
show that listening to music simultaneously engages distant brain areas in a cooperative way
across time. This might be one reason why music has such a profound impact on humans.
8.9 Learning musical techniques

The difficulty of assigning innate status to a particular brain competence is underlined by a
number of studies of brain changes that result from training. Clearly, our brains are made to take
the imprint of experience, and especially human-created experience, or there would be much less
point in having them. Thus, it can become very difficult to disentangle nature from nurture. It is
understandable that in forming a neural imprint corresponding to a particular skill, our brains
will engage in heightened activity in areas that are in some sense the most appropriate to the task.
The outline of cortical functional segregation—especially the elaborate functional adaptations of
subcortical systems, including emotional and communicative adaptations with which we are
born—inevitably regulates which areas are brought into play in acquisition of a learned skill
(Panksepp and Trevarthen, Chapter 7, this volume). Neurons that fire together, wire together, as
Hebb (1949) pointed out, and thus areas that are exercised mature into more fully specialized
regions that are fine-tuned for efficient and accurate performance.
Electrophysiological measures of changes can be traced at different temporal scales in the adult
brain. Pantev and colleagues used MEG techniques in a number of studies to investigate the
changes that occur in the human auditory cortex when a skill is acquired, such as when learning
to play a musical instrument (Pantev et al. 2003). These studies showed that increased neuromag-
netic response to musical stimuli was correlated with the age at which musicians began to prac-
tice, and that this response was preferentially enhanced for the timbre of the instrument on
which the musician was trained (comparing violinists and trumpeters). In one of the studies,
short-term laboratory training, involving learning to perceive virtual pitch (perceived frequency
created through processing by the brain) instead of spectral pitch (frequency that is physically
present), showed that the switch to perceiving virtual pitch was correlated with a significantly
stronger gamma band frequency response, combined with a shift towards a more medial source
location (Schulte et al. 2002).
Using fMRI and making very careful measurements of the amount of grey matter at any given
part of the brain has enabled the tracking of brain changes resulting from practice, in particular
musical practice. In the forerunner fMRI study of this work, volunteer participants were asked to
learn the same sequence of finger-to-thumb tapping for 10 minutes each day, similar to a five-
finger piano keyboard exercise (Karni et al. 1995). The participants were scanned each week while
slowly performing the tapping sequence they were learning, and also when performing a very
similar but not practised control sequence. When tested for speed and accuracy, over the first
three weeks the performance level outside the scanner reached a plateau. During this time,
the brain areas in primary motor cortex (M1) that were activated during the practised sequence
grew larger by comparison with those for the unpractised sequence. Several weeks after the end
of the study, when tested again in the scanner, the participants showed that this increase in area,
like the new skill associated with it, had remained unaltered, suggesting a permanent change in
brain organization, with more neural tissue devoted to the trained task.
This study was followed by one that was more explicitly music-related—the investigation
of brain changes associated with the learning of musical notation (Stewart et al. 2003); music
notation-naive participants were trained for 12 weeks up to Grade One standard in sight-reading
(the first grade of the Associated Board, UK). Before training, they were scanned while looking at
a musical score and performing a simple keyboard task that did not require an understanding of
the notation; after training, they were again scanned, this time performing an exactly analogous
keyboard task, which now required a reading of the score appropriately. When the results were
averaged across all 12 participants’ brains, the only significant changes after training were a
task-related increase in brain activity in the right dorsal parietal lobe, and a decrease in activity in
the hippocampus. The parietal increase can be interpreted as increased specialization of this
region of cortex to enable the automatic spatial decoding of the score from a vertical arrange-
ment of notes to a horizontal placement of fingers. The hippocampal decrease can be seen as a
familiarity-driven reduction of activity in this region, which is well-known to be important in the
laying down of long-term memories.
It is now becoming apparent that part of the process of cortical specialization is a quantitative
change in the amount of grey matter in task-specific areas. This has been demonstrated in a series
of studies of increasing sophistication and convincingness, including Gaser and Schlaug (2003),
who showed that professional musicians have increased grey matter density in several brain
regions, dependent on the particular skills required. From an early age, musicians learn complex
motor and auditory skills, which they practise extensively from childhood throughout their
entire careers. Using a voxel-by-voxel morphometric technique and high-resolution structural
MR images, Schlaug and his colleagues found that grey-matter volume increases in motor,
auditory, and visual–spatial brain regions when comparing professional keyboard players with a
matched group of amateur musicians and non-musicians. Schneider et al. (2002) made careful
measurements using MR images of Heschl’s gyrus in 12 professional musicians, 12 amateur
musicians, and 13 non-musicians, and used MEG to record the evoked response to pure tones in
the same participants. The results showed that the evoked response, and the size of Heschl’s
gyrus, were both strongly correlated with the musical experience of the participants. In Sluming
et al. (2002), increased grey matter density was identified in Broca’s area of the left hemisphere of
orchestral musicians, compared with non-musician controls. Since Broca’s area and its right
hemisphere homologue appear to be involved with musical syntax (Levitin and Menon 2003),
and sight-reading unquestionably involves Broca’s area (Parsons 2001), this is entirely consistent
with the other studies in this area.
It has been argued that some of these regional differences could be attributable to innate
predisposition, but a very recent study has shown highly significant morphometric changes in
the visual motion areas V5 and parietal cortex in participants who learned to juggle over a period
of three months (Draganski et al. 2004).
The studies we have reviewed support the notion that differences between brain areas in
different participants related to musical skill, including volumetric differences, are largely
acquired rather than innate. One may ask in this context, then, whether the question of human
brain musicianship is not so much, ‘Is the brain hard-wired for music production?’ as ‘Can
the brain be trained to produce music?’. The answer, for musicianship, would appear to be
‘Yes, the brain can be trained to produce music’. It is abundantly clear that what the hearer, at
least in our culture, enjoys as music is well-tuned to what the musician can produce, which could
then support the idea that part of musical enjoyment is bound up with the system of neurons
that inherently equips us to move with the actions of others—that in hearing music we can
vicariously participate in its production, which is often a fulfilling and satisfying experience
(Cross and Morley, Chapter 5, this volume).
8.10 Music and emotion

This moves us on to a crucial question: what motivates us to experience music? Since musical
structures provide no specific information, make no obvious short-term structural changes in
our tissue, and are emphatically transient, why should we spend so much time listening to music,
why should music accompany most forms of entertainment, and why should the music industry
comprise such a large proportion of the economy? The answer to these questions can only be that
music is a powerful regulator of our emotions. Music makes us feel, in a controllable and safe
way. It allows us to experience feelings of great joy and sadness without the costs associated with
the social and personal events that would otherwise be required to induce these emotions. It can
calm and soothe us, stir us up, ravish us with its beauty.
It is well known that the human brain structures most involved with emotion are old, in evolu-
tionary terms. While it is still quite uncertain (pace Orpheus!) whether any other species experi-
ence music in the way that we do, it seems likely that our emotional response to music is
underlain by innate processes taking place in ancient brain structures, which receive complex
modulation as we mature and are exposed to the characteristic musical forms of our own
cultures (Panksepp and Trevarthen, Chapter 7, this volume).
Because musical forms often have clearly defined structures, one element of emotional
response is surely related to anticipation and resolution (Meyer 1956). Many areas of our brains
appear to be concerned with planning, whether in the feedforward circuitry entailed in motor
control (Wolpert et al. 1995), in the phenomenon of ‘priming’, where perceptual expectations are
set up in specific cortical areas, or in the prefrontal areas associated with selection for action
(Rowe and Passingham 2001) and working memory (Cohen et al. 1997) (Lee and Schögler,
Krumhansl (1990, 2002), following Lerdahl and Jackendoff (1983), has carefully analysed the
expectations set up by musical passages, showing that the degree of tension can be quantified.
There may well be an autonomic response, with emotional force, when a high degree of tension
built up in a musical passage is eventually resolved in a satisfying way. A further source of emo-
tional response may occur when the music takes an unanticipated direction, which is only later
seen as logically implicit in what has gone before. This can provoke a type of ‘aha!’ experience.
Whether the perception of musical tension has innate foundations, or whether the expectations
that are generated are entirely brought about by exposure to a culture’s typical musical forms, is a
question still without answer. Infants very quickly learn to recognize a ‘favourite’ song; this is
facilitated by the way the song is encountered as part of a meaningful interaction between a baby
and its adult companion. The structure of ‘motherese’ songs that are sung to babies is highly
adapted to the babies’ preferences (Trevarthen and Malloch 2002), and often take a dramatic
form leading to a strong cadence or an ending with physical involvement (e.g., ‘Round and
Round the Garden’), suggesting that brain circuits which provide a sense of anticipation are
active, even in early infancy.
Perhaps the most powerfully rewarding emotional experience that music can provide has been
given the name of ‘chills’ (e.g., Panksepp 1995). This describes a response to specific musical
passages which induce an intensely pleasurable, euphoric sensation that can bring tears to the
eyes and shivers up the spine (and see Panksepp and Trevarthen, Chapter 7, this volume).
Because such chills are clear, discrete events and are often highly reproducible for a specific piece
of music in a given individual, they provide a good model for neuroimaging studies of emotional
responses to music. A seminal study by Blood and Zatorre (2001) explored this phenomenon
using PET. Ten musically trained students, equal numbers of males and females, were interviewed
to discover which musical passages reliably elicited the chills response. One such passage was
selected for each participant, making sure that it did not elicit the chills for any of the other
participants, for whom it could act as a control stimulus. All musical passages were purely
instrumental and came from the Western classical tradition.
The PET data revealed a network of brain areas that are closely involved in the chills
experience. Some of these, including several that have previously been associated with reward,
showed increased activity, and others decreased activity, when compared with listening to
the control musical passages that did not elicit this experience. Regional CBF (cerebral blood
flow) increases were found in left ventral striatum (which includes the nucleus accumbens) and
dorsomedial midbrain, and decreases were found in right amygdala, left hippocampus/amygdala
and ventromedial prefrontal cortex. CBF increases correlated with chills intensity were also observed in paralimbic
regions (bilateral insula, right orbitofrontal cortex) and in regions associated with arousal
(thalamus and anterior cingulate) and motor processes (supplementary motor area and
cerebellum). The pattern of activity correlating with music-induced chills is similar to that
observed in other brain imaging studies of euphoria and/or pleasant emotion. Dopaminergic
activity in the nucleus accumbens appears to be the common mechanism underlying the reward
response to all naturally rewarding stimuli (e.g., food and sex) and to drugs with euphorogenic
properties and/or abuse potential. Activity in the insula is associated with subjective feeling states
involving representation of bodily responses elicited by emotional events (Critchley et al. 2004).
When such an extensive network of brain areas is seen to be involved with a particular experi-
mental condition, interpretation becomes difficult, especially with the relatively poor spatial
resolution of PET scanning. Thus, it is not yet possible to identify areas in this network that are
specific for music-induced emotional response. However, it is encouraging to be able to observe
a distinct brain-state correlate for the powerful feelings that are subjectively reported. Many
of these areas are phylogenetically ancient. The human genre of music has somehow enlisted
these areas, and it will be a pressing task of music and brain research to discover how this comes
about.
8.11 Importance and innateness of music: insights from imaging studies

Brain imaging research related to music is rapidly growing in volume. Increasing numbers of
neuroscientists have recognized that there is a dearth of empirical research in this area, and that
many important questions can be raised relating to what seems to be a human need for music.
This brief survey has focused on a few of these questions, summarized here.
Language and music. It has become clear that while language and music share wide areas of
neuronal substrate, there are differences, even at the level of primary auditory cortex. Generally
these differences take the form of greater bilaterality for music than for language; language tends
to be strongly left-lateralized, especially in right-handed males. This is consistent with a view that
the capacity to be affected by music, when compared to language, is more likely to be innate, and
is supported by specialized brain areas. Language, by this argument, is a highly specialized subset
of musical cognition.
Features of music. Several experiments have shown that certain elements of music, such as
pitch, have specific neural representations. There is good reason to believe that the neural appa-
ratus that supports these features is predefined, even though it may become much more effective
as a result of musical experience.
Acquisition of musical skill. The limited number of truly longitudinal studies using brain imaging
techniques makes it hard to assert that there are human brain areas that are uniquely adapted
for learning musical techniques. It is more likely that areas which have a general competence,
for example for control of hand movements or vocal production and articulation, become special-
ized as a result of extensive practice, and there is increasing evidence that this specialization is
accompanied by an increased bulk of neuronal tissue in some parts of the cerebral cortex.
Music and emotion. Music can induce powerful and primordial emotions in humans which are
experienced as rewarding and pleasurable. Brain imaging techniques have allowed localization of
activity associated with such emotions, and the areas involved are consistent with those involved
with emotional responses to other types of stimuli. There remains the burning question: how and
why can a structural sequence of non-referential sounds produce such a powerful response?
8.12 Future work

The imaging neuroscience of music is a research area that is experiencing rapid growth. The non-
invasiveness of MEG and fMRI, and the continued existence of deep questions regarding the
power and purpose of music, have encouraged increasing numbers of eager young researchers to
enter this field. However, only a small fraction of published and planned studies are relevant to
the study of the innateness of musicality. Most are concerned with music as an acquired cultural
skill, which may or may not shed light on whether the brain structures involved are uniquely
suited to the task.
When it comes to assessing electrophysiological methods, there is an urgent need to bring
together two types of information. On the one hand, studies with large numbers of participants
using averaged EEG and/or MEG data and simple models for the generators show that the brain
deals with language and music in remarkably similar ways. On the other hand, studies using
very detailed single-trial tomographic analysis, but of only a few participants, show a much
more dynamic picture of brain activity. These studies reveal distinct differences in how the
left and right hemispheres share the processing of simple tone sequences, linguistic or musical
material, depending crucially on the fine details of the material (e.g., expectation and rhythmic
properties). Between these extremes, the data from fMRI and PET show a degree of specialization
for music and language material, but their low temporal resolution makes the establishment of
contact between the haemodynamic and electrophysiologically based methods difficult without
sacrificing the finer points of each. The synthesis of this wealth of neuroimaging data and
the establishment of further associations with behaviour is one of the challenges that the field
now faces.
Whatever technique is used, there is a particular need for longitudinal neuroimaging studies of
the development of musical perceptions and skills, especially in ecologically valid contexts, such
as mother–baby interactions. These could allow us to determine the way in which musical
expression may precede language, and to relate music with rhythmical movement and expressive
gestures that are adaptive, minimizing the influence of acculturation. MEG lends itself better
than fMRI to such studies, because it is silent in operation and the scanning conditions can be
adapted more easily to suit young babies and their mothers. However, much can be achieved
using fMRI, provided that attention is given to the comfort of the participants. Because musical
skills are often learned much later than language skills, there are rich opportunities for further
longitudinal studies of music training, like that of Stewart et al. (2003), described above. Changes
in brain organization relating to keyboard fluency and levels of rhythm complexity could be
studied, both with fMRI and MEG.
Further work on the relationship of music and language perception and production is highly
desirable. Existing studies do not often control the tasks or stimuli sufficiently well to pinpoint
and fully characterize the differences in brain structures involved. By contrast, Callan et al. (2006)
compare brain responses to hearing familiar songs with those to hearing spoken versions of the
same songs by the same speakers. This experimental design involves approximate controls for
semantic content, and for timbre and auditory source; results show remarkably small differences
in brain activation.
Song itself needs further study. Neuroimaging studies of song and singing may approach
the question of human innate musicality more directly than research involving such culturally
relative auditory sources as musical instruments.
Still further studies could investigate the relationship between brain areas involved in the
perception and production of rhythmical sounds, and those dealing with other controlled and
repetitive movements. Research on the brain’s mirror systems is still in its infancy, and further
experiments explicitly addressing the integration of hearing and sight in a musical context would
be of great interest, especially with regard to sympathetic brain responses to expressive forms of
moving (Calvo-Merino et al. 2005). For example, an fMRI or MEG study where participants
view dancers moving to music, in which in some conditions the music is incongruous with the
movements, could identify brain areas responsible for this integration. Finding activations in
areas associated with the autonomic system would provide evidence of the innateness of the
association of auditory and motor rhythms.
Other research might include studies of musically induced emotion, comparing brain activa-
tions produced by passages selected for their emotional effects with emotionally laden natural
sounds, such as the cry of a baby, the coo of a dove, the crash of breaking glass, and a cry of joy.
This could help to discriminate the pathways by which musical perception enters our innate
emotional networks, which have clearly evolved in close relationship with our requirements for
survival.
References
Abo M, Senoo A, Watanabe S, Miyano S et al. (2004). Language-related brain function during word repetition
in post-stroke aphasics. Neuroreport, 15(2), 1891–1894.
Adolphs R (2003). Cognitive neuroscience of human social behaviour. Nature Reviews Neuroscience,
4(3), 165–178.
Besson M and Macar F (1987). An event-related potential analysis of incongruity in music and other
non-linguistic contexts. Psychophysiology, 24, 14–25.
Besson M and Schon D (2001). Comparison between language and music. Annals of the New York Academy of Sciences,
930, 232–258.
Besson M, Faita F and Requin J (1994). Brain waves associated with musical incongruities differ for
musicians and non-musicians. Neuroscience Letters, 168, 101–105.
Binder JR, Frost JA, Hammeke TA, Rao SM and Cox RW (1996). Function of the left planum temporale in
auditory and linguistic processing. Brain, 119, 1239–1247.
Blank SC, Bird H, Turkheimer F and Wise RJ (2003). Speech production after stroke: the role of the right
pars opercularis. Annals of Neurology, 54(3), 310–320.
Blood AJ and Zatorre RJ (2001). Intensely pleasurable responses to music correlate with activity in brain
regions implicated in reward and emotion. Proceedings of the National Academy of Sciences USA,
98(20), 11818–11823.
Bosnyak DJ, Eaton RA and Roberts LE (2004). Distributed auditory cortical representations are modified
when non-musicians are trained at pitch discrimination with 40 Hz amplitude modulated tones.
Cerebral Cortex, 14(10), 1088–1099.
Brugge JF (1985). Patterns of organization in auditory cortex. Journal of the Acoustical Society of America,
78(1/2), 353–359.
Callan DE, Tsytsarev V, Hanakawa T, Callan AM, Katsuhara M, Fukuyama H and Turner R (2006).
Song and speech: Brain regions involved with perception and covert production. Neuroimage,
31, 1327–1342.
Calvert GA, Brammer MJ, Morris RG, Williams SC, King N and Matthews PM (2000). Using fMRI to
study recovery from acquired dysphasia. Brain and Language, 71(3), 391–399.
Calvo-Merino B, Glaser DE, Grezes J, Passingham RE and Haggard P (2005). Action observation and
acquired motor skills: An fMRI study with expert dancers. Cerebral Cortex, 15(8), 1243–1249.
Carey S and Bartlett E (1978). Acquiring a single new word. Proceedings of the Stanford Child Language
Conference, 15, 17–29.
Cohen JD, Perlstein WM, Braver TS, Nystrom LE, Noll DC, Jonides J and Smith EE (1997). Temporal
dynamics of brain activation during a working memory task. Nature, 386, 604–608.
Critchley HD, Wiens S, Rotshtein P, Ohman A and Dolan RJ (2004). Neural systems supporting interoceptive
awareness. Nature Neuroscience, 7(2), 189–195.
Dawson G and Fischer KW (eds) (1994). Human behavior and the developing brain. The Guilford Press,
New York.
Decety J and Chaminade T (2003). Neural correlates of feeling sympathy. Neuropsychologia, 41, 127–138.
Devlin JT, Raley J, Tunbridge E, Lanary K, Floyer-Lea A, Narain C, Cohen I, Behrens T, Jezzard P,
Matthews PM and Moore DR (2003). Functional asymmetry for auditory processing in human primary
auditory cortex. Journal of Neuroscience, 23(37), 11516–11522.
Dhamala M, Pagnoni G, Wiesenfeld K, Zink CF, Martin M and Berns GS (2003). Neural correlates of the
complexity of rhythmic finger tapping. Neuroimage, 20(2), 918–926.
Draganski B, Gaser C, Busch V, Schuierer G, Bogdahn U and May A (2004). Neuroplasticity: changes in
grey matter induced by training. Nature, 427(6972), 311–312.
Essens PJ and Povel DJ (1985). Metrical and nonmetrical representations of temporal patterns. Perception
and Psychophysics, 37, 1–7.
Foote J and Uchihashi S (2001). The beat spectrum: A new approach to rhythm analysis. Proceedings of
IEEE International Conference on Multimedia and Expo. Paper available from
http://rotorbrain.com/foote/papers/allpapers.html
Formisano E, Kim DS, Di Salle F, van de Moortele PF, Ugurbil K and Goebel R (2003). Mirror-symmetric
tonotopic maps in human primary auditory cortex. Neuron, 40(4), 859–869.
Freeman WJ (2005). Origin, structure and role of background EEG activity. Part 3. Neural frame
classification. Clinical Neurophysiology, 116, 1118–1129.
Freeman WJ and Holmes MD (2005). Metastability, instability, and state transition in neocortex. Neural
Networks, 18, 497–504.
Friederici AD (1995). The time course of syntactic activation during language processing: A model based
on neuropsychological and neurophysiological data. Brain and Language, 50(3), 259–281.
Friederici AD, Wang Y, Herrmann CS, Maess B and Oertel U (2000). Localization of early syntactic
processes in frontal and temporal cortical areas: A magnetoencephalographic study. Human Brain
Mapping, 11(1), 1–11.
Gallese V (2001). The ‘Shared Manifold’ hypothesis: From mirror neurons to empathy. In E Thompson, ed.,
Between ourselves: Second-person issues in the study of consciousness, pp. 33–50. Imprint Academic,
Charlottesville, VA/Thorverton, UK.
Gaser C and Schlaug G (2003). Brain structures differ between musicians and non-musicians. The Journal
of Neuroscience, 23(27), 9240–9245.
Griffiths TD, Buchel C, Frackowiak RS and Patterson RD (1998). Analysis of temporal structure in sound
by the human brain. Nature Neuroscience, 1(5), 422–427.
Gross J, Ioannides AA, Dammers J, Maess B, Friederici AD and Muller-Gartner HW (1998). Magnetic
field tomography analysis of continuous speech. Brain Topography, 10(4), 273–281.
Halberda J (2003). The development of a word-learning strategy. Cognition, 87, B23–B34.
Halpern AR and Zatorre RJ (1999). When that tune runs through your head: A PET investigation of
auditory imagery for familiar melodies. Cerebral Cortex, 9(7), 697–704.
Hebb DO (1949). The organization of behavior. Wiley, New York.
Iacoboni M, Koski LM, Brass M, Bekkering H, Woods RP, Dubeau MC, Mazziotta JC and Rizzolatti G
(2001). Reafferent copies of imitated actions in the right superior temporal cortex. Proceedings of the
National Academy of Sciences USA, 98(24), 13995–13999.
Ioannides AA (1995). Estimates of 3D brain activity ms by ms from biomagnetic signals: Method (MFT),
results and their significance. In E Eiselt, U Zwiener and H Witte, eds, Quantitative and topological EEG
and MEG analysis, pp. 59–68. Universitaetsverlag Druckhaus-Maayer GmbH, Jena.
Ioannides AA (2001). Real time human brain function: Observations and inferences from single trial
analysis of magnetoencephalographic signals. Clinical EEG, 32, 98–111.
Ioannides AA, Bolton JPR and Clarke CJS (1990). Continuous probabilistic solutions to the biomagnetic
inverse problem. Inverse Problems, 6, 523–542.
Ioannides AA, Fenwick PBC and Liu LC (2005). Widely distributed magnetoencephalography spikes
related to the planning and execution of human saccades. Journal of Neuroscience, 25, 7950–7967.
Ioannides AA, Kostopoulos GK, Laskaris NA, Liu LC, Shibata T, Schellens M, Poghosyan V and
Khurshudyan A (2002a). Timing and connectivity in the human somatosensory cortex from single trial
mass electrical activity. Human Brain Mapping, 15, 231–246.
Ioannides AA, Poghosyan V, Dammers J and Streit M (2004). Real-time neural activity and connectivity in
healthy individuals and schizophrenia patients. NeuroImage, 23, 473–482.
Ioannides AA, Popescu M, Otsuka A, Bezerianos A and Liu LC (2003). Magnetoencephalographic evidence
of the inter-hemispheric asymmetry in echoic memory lifetime and its dependence on handedness and
gender. NeuroImage, 19(3), 1061–1075.
Ioannides AA, Popescu M, Otsuka A, Abrahamyan A and Deliège I (2002b). Using neuroimaging to study
neural correlates of music over wide spatial and temporal scales. In C Stevens, D Burnham,
G McPherson, E Schubert and J Renwick, eds, Proceedings of the 7th International Conference
on Music Perception and Cognition – ICMPC7, Sydney, July, pp. 677–680. Australian Music and
Psychology Society (AMPS), Sydney NSW and Causal Productions, Adelaide, SA. Published
as CD Rom only.
Janata P (1995). ERP measures assay the degree of expectancy violation of harmonic contexts in music.
Journal of Cognitive Neuroscience, 13, 1–17.
Jeannerod M (2004). Visual and action cues contribute to the self-other distinction. Nature Neuroscience,
7(5), 421–422.
Karni A, Meyer G, Jezzard P, Adams MM, Turner R and Ungerleider LG (1995). Functional MRI evidence
for adult motor cortex plasticity during motor skill learning. Nature, 377(6545), 155–158.
Koelsch S, Gunter TC, Schroger E, Tervaniemi M, Sammler D and Friederici AD (2001). Differentiating
ERAN and MMN: An ERP study. Neuroreport, 12(7), 1385–1389.
Koelsch S, Gunter TC, v Cramon DY, Zysset S, Lohmann G and Friederici AD (2002). Bach speaks:
a cortical ‘language-network’ serves the processing of music. Neuroimage, 17(2), 956–966.
Koelsch S, Kasper E, Sammler D, Schulze K, Gunter T and Friederici AD (2004). Music, language and
meaning: Brain signatures of semantic processing. Nature Neuroscience, 7(3), 302–307.
Krumhansl CL (1990). Cognitive foundations of musical pitch. Oxford University Press, New York.
Krumhansl CL (2002). Music: A link between cognition and emotion. Current Directions in Psychological
Science, 11(2), 45–50.
Kuhl PK (2004). Early language acquisition: Cracking the speech code. Nature Reviews Neuroscience,
5(11), 831–843.
Kutas M and Hillyard SA (1980). Reading senseless sentences: brain potentials reflect semantic incongruity.
Science, 207(4427), 203–205.
Kutas M and Hillyard SA (1984). Brain potentials during reading reflect word expectancy and semantic
association. Nature, 307(5947), 161–163.
Laskaris N, Liu LC and Ioannides AA (2003). Single-trial variability in early visual neuromagnetic
responses: an explorative study based on the regional activation contributing to the N70m peak.
NeuroImage, 20(2), 765–783.
Lerdahl F and Jackendoff R (1983). A generative theory of tonal music. MIT Press, Cambridge, MA.
Levitin DJ and Menon V (2003). Musical structure is processed in ‘language’ areas of the brain: A possible
role for Brodmann Area 47 in temporal coherence. Neuroimage, 20(4), 2142–2152.
Liégeois-Chauvel C, de Graaf JB, Laguitton V and Chauvel P (1999). Specialization of left auditory cortex
for speech perception in man depends on temporal coding. Cerebral Cortex, 9, 484–496.
Liégeois-Chauvel C, Giraud K, Badier JM, Marquis P and Chauvel P (2001). Intracerebral evoked potentials
in pitch perception reveal a functional asymmetry of the human auditory cortex. Annals of the New York
Academy of Sciences, 930, 117–132.
Liu LC, Ioannides AA and Mueller-Gaertner HW (1998). Bi-hemispheric study of single trial MEG signals
of the human auditory cortex. Electroenceph Clin Neurophysiol, 106, 64–78.
Llinas RR (2001). I of the vortex: From neurons to self. MIT Press, Cambridge, MA.
Maess B, Koelsch S, Gunter TC and Friederici AD (2001). Musical syntax is processed in Broca’s area:
An MEG study. Nature Neuroscience, 4(5), 540–545.
McAdams S, Winsberg S, Donnadieu S, De Soete G and Krimphoff J (1995). Perceptual scaling of
synthesized musical timbres: Common dimensions, specificities, and latent subject classes. Psychological
Research, 58, 177–192.
Menon V, Levitin DJ, Smith BK, Lembke A, Krasnow BD, Glazer D, Glover GH and McAdams S (2002).
Neural correlates of timbre change in harmonic sounds. Neuroimage, 17(4), 1742–1754.
Meyer LB (1956). Emotion and meaning in music. University of Chicago Press, Chicago, IL.
Molinari M, Leggio MG, De Martin M, Cerasa A and Thaut M (2003). Neurobiology of rhythmic motor
entrainment. Annals of the New York Academy of Sciences, 999, 313–321.
Moradi F, Liu LC, Cheng K, Waggoner RA, Tanaka K and Ioannides AA (2003). Consistent and precise
localization of brain activity in human primary visual cortex by MEG and fMRI. NeuroImage,
18, 595–609.
Morel A, Garraghty PE and Kaas JH (1993). Tonotopic organization, architectonic fields, and connections
of auditory cortex in macaque monkeys. Journal of Comparative Neurology, 335(3), 437–459.
Näätänen R (1992). Attention and brain function. Erlbaum, Hillsdale, NJ.
Osterhout L and Holcomb PJ (1992). Event-related brain potentials elicited by syntactic anomaly. Journal
of Memory and Language, 31, 785–804.
Overy K, Norton AC, Cronin KT, Gaab N, Alsop DC, Winner E and Schlaug G (2004). Imaging melody
and rhythm processing in young children. Neuroreport, 15(11), 1723–1726.
Panksepp J (1995). The emotional sources of ‘chills’ induced by music. Music Perception, 13(2), 171–207.
Pantev C, Hoke M, Lehnertz K, Lutkenhoner B, Anogianakis G and Wittkowski W (1988). Tonotopic
organization of the human auditory cortex revealed by transient auditory-evoked magnetic fields.
Electroencephalogr Clin Neurophysiol, 69(2), 160–170.
Pantev C, Ross B, Fujioka T, Trainor LJ, Schulte M and Schulz M (2003). Music and learning–induced
cortical plasticity. Annals of the New York Academy of Sciences, 999, 438–450.
Parsons LM (2001). Exploring the functional neuroanatomy of music performance, perception, and
comprehension. Annals of the New York Academy of Sciences, 930, 211–231.
Patel AD (2003). Language, music, syntax and the brain. Nature Neuroscience, 6(7), 674–681.
Patel AD and Balaban E (2000). Temporal patterns of human cortical activity reflect tone sequence
structure. Nature, 404(6773), 80–84.
Patel AD, Gibson E, Ratner J, Besson M and Holcomb PJ (1998). Processing syntactic relations in language
and music: an event-related potential study. Journal of Cognitive Neuroscience, 10(6), 717–733.
Patterson RD, Uppenkamp S, Johnsrude IS and Griffiths TD (2002). The processing of temporal pitch and
melody information in auditory cortex. Neuron, 36(4), 767–776.
Peretz I (1990). Processing of local and global musical information in unilateral brain damaged patients.
Brain, 113, 1185–1205.
Peretz I (2002). Brain specialization for music. Neuroscientist, 8(4), 372–380.
Pinker S (2000). The language instinct: How the mind creates language. HarperCollins Publishers, New York.
Popescu M, Otsuka A and Ioannides AA (2004). Dynamics of brain activity in motor and frontal cortical
areas during music listening: A magnetoencephalographic study. NeuroImage, 21, 1622–1638.
Recanzone GH, Schreiner CE and Merzenich MM (1993). Plasticity in the frequency representation of
primary auditory cortex following discrimination training in adult owl monkeys. Journal of
Neuroscience, 13(1), 87–103.
Rivkin MJ, Vajapeyam S, Hutton C, Weiler ML, Hall EK, Wolraich DA, Yoo SS, Mulkern RV, Forbes PW,
Wolff PH and Waber DP (2003). A functional magnetic resonance imaging study of paced finger
tapping in children. Pediatric Neurology, 28(2), 89–95.
Rizzolatti G, Fadiga L, Gallese V and Fogassi L (1996). Premotor cortex and the recognition of motor
actions. Brain Res Cogn Brain Res, 3(2), 131–141.
Roland PE, Skinhøj E and Lassen NA (1981). Focal activation of human cerebral cortex during auditory
discrimination. Journal of Neurophysiology, 45, 1139–1151.
Rowe JB and Passingham RE (2001). Working memory for location and time: Activity in prefrontal area 46
relates to selection rather than maintenance in memory. Neuroimage, 14(1/1), 77–86.
Sakai K, Hikosaka O, Miyauchi S, Takino R, Tamada T, Iwata NK and Nielsen M (1999). Neural represen-
tation of a rhythm depends on its interval ratio. Journal of Neuroscience, 19(22), 10074–10081.
Samson S, Ehrle N and Baulac M (2001). Cerebral substrates for musical temporal processes. In Zatorre RJ
and Peretz I, eds, The biological foundations of music, pp. 166–178. New York Academy of Science,
New York.
Schneider P, Scherg M, Dosch HG, Specht HJ, Gutschalk A and Rupp A (2002). Morphology of Heschl’s
gyrus reflects enhanced activation in the auditory cortex of musicians. Nature Neuroscience, 5(7),
688–694.
Schulte M, Knief A, Seither-Preisler A and Pantev C (2002). Different modes of pitch perception and
learning-induced neuronal plasticity of the human auditory cortex. Neural Plasticity, 9(3), 161–175.
Sluming V, Barrick T, Howard M, Cezayirli E, Mayes A and Roberts N (2002). Voxel-based morphometry
reveals increased gray matter density in Broca’s area in male symphony orchestra musicians.
Neuroimage, 17(3), 1613–1622.
Stewart L, Henson R, Kampe K, Walsh V, Turner R and Frith U (2003). Brain changes after learning to read
and play music. NeuroImage, 20(1), 71–83.
Talavage TM, Sereno MI, Melcher JR, Ledden PJ, Rosen BR and Dale AM (2004). Tonotopic organization
in human auditory cortex revealed by progressions of frequency sensitivity. Journal of Neurophysiology, 91(3),
1282–1296.
Taylor JG, Ioannides AA and Muller-Gartner HW (1999). Mathematical analysis of lead field expansions.
IEEE Transactions on Medical Imaging, 18, 151–163.
Tervaniemi M, Kujala A, Alho K, Virtanen J, Ilmoniemi RJ and Naatanen R (1999). Functional specializa-
tion of the human auditory cortex in processing phonetic and musical sounds: A magnetoencephalo-
graphic (MEG) study. Neuroimage, 9(3), 330–336.
Todd NPM (1994). The auditory primal sketch: A multiscale model of rhythmic grouping. Journal of
New Music Research, 23(1), 25–70.
Trainor LJ (1996). Infant preferences for infant-directed versus non-infant-directed play songs and
lullabies. Infant Behavior and Development, 19, 83–92.
Trevarthen C (2001). The neurobiology of early communication: Intersubjective regulations in human
brain development. In AF Kalverboer and A Gramsbergen, eds, Handbook on brain and behavior in
human development, pp. 841–882. Kluwer, Dordrecht, The Netherlands.
Trevarthen C (2002). Origins of musical identity: Evidence from infancy for musical social awareness.
In R MacDonald, DJ Hargreaves and D Miell, eds, Musical identities, pp. 21–38. Oxford University Press,
Oxford.
Trevarthen C (2004a). Brain development. In RL Gregory, ed., Oxford companion to the mind, 2nd edn,
pp. 116–127. Oxford University Press, Oxford/New York.
Trevarthen C (2004b). Language development: Mechanisms in the brain. In G Adelman and BH Smith, eds,
Encyclopedia of neuroscience, 3rd edn, CD-ROM, Article Number 397. Elsevier Science, Amsterdam.
Trevarthen C and Malloch S (2002). Musicality and music before three: Human vitality and invention
shared with pride. Zero to Three, 10–18.
Turner R and Jones T (2003). Techniques for imaging neuroscience. British Medical Bulletin, 65, 3–20.
Tzourio-Mazoyer N, De Schonen S, Crivello F and Reutter B (2002). Neural correlates of woman face
processing by 2-month-old infants. Neuroimage, 15, 454–461.
Ullen F, Forssberg H and Ehrsson HH (2003). Neural networks for the coordination of the hands in time.
Journal of Neurophysiology, 89(2), 1126–1135.
Von Helmholtz H (1853). Ueber einige Gesetze der Vertheilung elektrischer Stroeme in koerperlichen
Leitern, mit Anwendung auf die thierisch-elektrischen Versuche. Ann Phys Chem, 89, 211–233, 353–377.
Wallin N (1991). Biomusicology: Neurophysiological, neuropsychological, and evolutionary perspectives on the
origins and purposes of music. Pendragon Press, New York.
Warren JD, Uppenkamp S, Patterson RD and Griffiths TD (2003). Separating pitch chroma and pitch
height in the human brain. Proceedings of the National Academy of Sciences USA, 100(17), 10038–10042.
Wolpert DM, Ghahramani Z and Jordan MI (1995). An internal model for sensorimotor integration.
Science, 269(5232), 1880–1882.
Woods RP, Dodrill CB and Ojemann GA (1988). Brain injury, handedness, and speech lateralization in a
series of amobarbital studies. Annals of Neurology, 23(5), 510–518.
Wright AA, Rivera JJ, Hulse SH, Shyan M and Neiworth JJ (2000). Music perception and octave generaliza-
tion in rhesus monkeys. Journal of Experimental Psychology General, 129(3), 291–307.
Yonemoto K (2004). Language-related brain function during word repetition in post-stroke aphasics.
Neuroreport, 15(12), 1891–1894.
Zatorre RJ, Evans AC and Meyer E (1994). Neural mechanisms underlying melodic perception and memory
for pitch. Journal of Neuroscience, 14(4), 1908–1919.
Zeki S (1993). A vision of the brain. Blackwell Scientific Publications, Oxford.
Part 2
Musicality in infancy
Infants are born eager to communicate in sympathy with others through movements expressive
of their feelings and intentions. If the adult pays close intuitive attention, an infant will ‘show’
through their, at first, rudimentary gestures how the adult needs to act in order for meaningful
communication to occur. Infancy marks the birthplace of communicative musicality as defined
and discussed in this book, both in development of the theory itself as a new way of under-
standing human communication, and in its natural ontogeny—from innate beginnings to mastery
of meaningful performances in childhood and adulthood.
Part 2 provides a group of complementary viewpoints on the communicative skills of infants
that underlie the unique human talents for the temporal arts, and provides insights into how they
learn musical behaviours as a special kind of human cultural activity. Whether responding to
recorded music, participating in musical games, interacting with other infants, or demonstrating
their reliance on a healthy musicality in their primary caregiver, infants show an extraordinary
curiosity and engagement with the environment of human sound and movement. We are only
just beginning to appreciate and understand it, and many issues remain challenging as we search
for interpretations.
First we are offered reflections on stories of musicality at the beginning of time and in the early
months of life. Katerina Mazokopaki and Giannis Kugiumutzakis (Chapter 9) describe how tales
linking the origins of music with manifestations of cosmic harmony have probably been in
existence as long as humans have made legends of their shared experience. In experiments on
how infants respond through movement to recorded music, Katerina and Giannis endeavour to
understand the beginnings of musical needs in each of us: ‘our research seeks to understand the
power of music … we observe how the rhythms of music engage and move infants’ (page 186).
Their demonstrations of the curiosity and delight an infant shows when music is heard introduce
the idea of a ‘virtual musical other’ in the infant’s mind, a companion whose presence
comes alive in the human-made story of the sound. Niki Powers and Colwyn Trevarthen
(Chapter 10) also seek to better understand the beginnings of our musicality and its sharing.
They compare the tones of vowel sounds made by mothers and infants in Japan and Scotland to
observe the effect of cultural differences in playful engagements of mothers with 4-month-olds:
‘Long before they can speak, infants begin adapting to the parental culture . . . and the family
responds, giving objects and actions a clear sharable sense for the learner by offering rhythmic
participation in rituals and tasks’ (page 209). The authors find that vowel sounds expressed in
musical ways are both a means of engaging emotions between adult and infant and a vehicle for
enculturation in how to use feelings to share activities.
The talent of infants for actively participating in their ritual induction into the musical art of
their native culture is explored by Bjorn Merker and Patricia Eckerdal (Chapter 11). The authors
argue that music, in particular a genre of baby song, plays a vital role as a vehicle for infant
cultural learning: ‘Our suggestion is that the infant’s primary gate of admission to the ritual level
of human culture is the action song and related games with a formal structure’ (page 251). Here
infants’ abilities as ‘imitative generalists’ are vital. The authors view infants as consummate vocal
learners, whose capacities for song and speech are vocal resources which are over and above a
basic repertoire of non-verbal vocal expressiveness, an ability shared with all animals that use
non-verbal social signalling. Bjorn and Patricia, taking a somewhat different view from other
contributors, define music specifically as the learned ritual use of patterns of discrete tones and
measured rhythms, and they propose that while infants, like other animals, possess attuned audi-
tory perceptions of sounds as expressions of emotion, they do not have ‘musicality’ until they can
begin to ‘perform’—to sing a traditional melody or move to a ritual action song.
‘Induction’ [in the musical culture] starts with exposure to the ritual form without requiring formal
contributions by the infant, progresses to the infant’s contributing simple bodily gestures such as
raising the hands, and ends, perhaps years later, with full mastery of the performance, a performance
to which the child, on growing up, will introduce his or her own immature offspring in the cycle of
ritual culture. The infant … is making a decisive break with our ape ancestry and entering the ground
of a truly human, ritual culture.
The authors point out that the process by which infant non-verbal expressive calls move into
specific musical form requires much detailed research.
Helen Marwick and Lynne Murray (Chapter 13) and Maya Gratier and Gisèle Apter-Danon
(Chapter 14) report the changes that occur when a mother’s musicality is diminished by a depres-
sive illness (Chapter 13) or through cultural dislocation (Chapter 14). Helen and Lynne provide a
thorough review of the literature concerned with the characteristics of musicality—timing and
expression—in adult–infant communication, and consider the repercussions on this musicality
of postnatal depression, and subsequent effects on the infant. Maya and Gisèle find diminish-
ment in mothers’ expressive musicality as those mothers lose their sense of ‘belonging’ when they
leave their country of birth, and also in mothers with bipolar disorder who find all relationships
difficult. By extension, the authors coin the term ‘proto-habitus’ which captures ‘all the
projectable styles and routines that mothers and infants establish over time as they interact…
rooted in cultural styles that the mother brings with her from her own community of belonging’
(page 304). Proto-habitus is the mother’s and infant’s embodied belonging experienced through
time within a space they share. Its development is weakened when the mother drifts towards
unresponsiveness due to loneliness or depression.
All of the chapters discussed so far consider the caregiver–infant dyad as the primary object of
interest, as does the majority of the infancy literature. Ben Bradley (Chapter 12), however,
considers this dyadic relationship as a specialized form of ‘group relatedness’ and of musicality in
communication. By investigating infant interactions within a group of three peers, all under
1 year old, Ben can study the very beginnings of group dynamics and explore whether musicality
of relating is to be found when there is no adult ‘expert’ present on whom an infant can rely to
scaffold their behaviour.
The innate abilities of infants have been explored in detail in Western psychology since the
1960s. Many of the fruits of these labours are to be found in the chapters of Part 2, but the job is
unfinished; there is still much work to be done to translate the eventful story of our very first
engagements with communicative musicality.
Chapter 9
Infant rhythms: Expressions of musical companionship
Katerina Mazokopaki and Giannis Kugiumutzakis
9.1 Introduction: from the ‘music of the spheres’ to rhythms of musical communication with infants
Many scholars accept that it is in our nature to be musical—that the sound of music, and the ritual
use of music in the life of the community, must have accompanied human beings from the evolu-
tionary origins of mankind. Some believe that the talent for song and dance is what first made
the evolution of social life of human beings different from other animals, and that with this talent
arose the capacity to use language and make culture (Donald 1991; Brown 2000; Mithen 2005).
In our evolutionary past music may well have been present to celebrate every stage of life: from
gestation until death, and beyond. There is evidence that the pleasures of making and hearing
musical sounds were known to the Cro-Magnon people 40,000 years ago (Farb 1978; Cross and
Morley, Chapter 3, this volume).
However, in many aspects, the nature of this human appreciation and need to make music is
still a mystery, and some have sought a source outside human or animal nature to explain it. It
has been thought that music, because it depends on the creation of sound from the resonances of
rhythmically activated bodies or musical instruments, might be a property of the dynamic
universe, sharing the intrinsic periods and harmonies of physical phenomena. Theories linking
music to manifestations of cosmic harmony, beyond human nature, are recorded from the begin-
ning of historical time (Theodorakis 2007). Pythagoras’s natural law of 2500 years ago, known as
the Harmony of the Spheres, connected music, the theory of numbers and astronomy. It was
believed that since the celestial bodies, including Earth, move in perpetual orbits, they must
create tones varying in pitch, like notes plucked on the strings of the lyre with measured length.
In his theory of music, Plato (1963) distinguishes a ‘true’ or ‘pure’ harmony, based on ‘harmonic’
relations between numbers. Indeed, developments in neurosciences and biomusicology are
converging on a new synthesis of ideas concerning the mathematical regularities of nature ‘in
which the Pythagorean whole integer ratios and resonance phenomena indeed emerge as the
rock bed upon which music scales, harmonies and tonal relations appear to rest’ (Merker 2006).
Our musical sense appears to have connection with cosmic, physical and mathematical
harmonies.
The Pythagoreans recognized the special place of music in the well-being of the human spirit.
They stressed that music, with its under- and overtones and the different kinds of rhythms and
melodies, influences emotions. It leads the soul to excitement, relaxation, amusement, pleasure,
suspense, drowsiness, stimulation, vivification and enthusiasm. Through musical movements it is
made whole, sinks in quiescence, is enflamed with pathos, feels mercy and modesty, and is led to
harmony. Aristotle observed that music has a direct influence on our soul, offering joy, catharsis,
and promoting intellectual cultivation (Kugiumutzakis 2007).
Many of the theories of cosmic harmony intend to explain the pleasurable harmony we
perceive in ourselves when we hear musical sounds—in song created by the voice, or in melodies
made by the vibrations of the strings of musical instruments played by hand. The ideas of the
influence of musical forms over the emotions and health of the soul that originated in ancient
Greece are remarkably similar to the concepts of the equally old Vedic texts of India, still influen-
tial in Indian understanding of music and its emotional and therapeutic influences (Rowell 1992;
Deva 1995; Inayat Khan 2005). We, too, know that music has extraordinary powers to strengthen
the human spirit, to overcome disordered impulses for movement, and to heal (Trevarthen and
Malloch 2000; and see chapters in Part 3, this volume), and the findings of brain science support
this view (Sacks 2007; Turner and Ioannides, Chapter 8, this volume).
While humankind has revered music for millennia, and felt its origins to be in underlying
cosmic unity present from the beginning of time, a unity of which the human community is a
part, our research seeks to understand the power of music by studying its manifestations in an
infant, within spontaneous and unsophisticated human experience; we observe how the rhythms
of music engage and move infants. The first author, as a musician and a developmental psychologist,
is delighted to find that contemporary ethnomusicologists (Blacking 1979), musicologists
(Bjørkvold 1992), psychologists (Donald 2001), and developmental psychologists (Papoušek
1996; Trevarthen 1999), believe that musicality—the appreciation of rhythmic and melodious
patterns produced by the voice or instruments—is a need or motive of all humans, an essential
part of us. She has undertaken research on the reactions and creative actions of infants who
are excited by music and mothers’ singing, with the advice of the second author, a professor of
psychology whose research concerns the origins of communication in early childhood, before
language (Kugiumutzakis 1993, 1998, 1999, 2007). He is persuaded that the imitative sympathy
infants show for the actions and expressions of other persons from birth is, among other things,
profoundly musical.
Twenty-five years ago, while making a study of infant imitation for his PhD thesis
(Kugiumutzakis 1985), he observed several interesting linked phenomena, including imitation of
facial expressions with the grasping of the researcher’s face at 5.5–6 months, and the imitation of
vocal sounds with their rhythms of expression in neonates less than 45 minutes old. The models
were imitated as part of felt movements in the imitator’s body. The newborn babies were pre-
sented the vocal models /a/, /m/ and /ang/, each a short sound repeated five times in a rhythmical
manner, four short and one long, more emphasized. Neonates clearly tried to imitate when an
open vowel sound /a/ was presented, but they did not do so, at a statistically significant level, for
the sounds /m/ and /ang/, apparently sensing what they could and could not achieve. In addition,
several neonates, when apparently making an effort to imitate the /a/ sound, chose to reproduce
the rhythm by repeating an unstructured sound between /ae/ and /m/ in the same
temporal pattern as the model. In a longitudinal study through months 2 to 6, many infants
imitated clearly both the three sounds /a/, /m/ and /ang/ and the rhythm of their emission
(Kugiumutzakis 1985).
Although these observations on the two ways of imitating—to reproduce the sound or the
rhythm—were characterized as ‘an important result’, and it was noted that the imitations took
place in an ‘emotional’ frame, the main research focus of the early 1980s, given the prevailing
belief, endorsed by Piaget and Skinner, that neonates could not possibly know how to imitate, was
to test the hypothesis that neonatal imitation exists (Kugiumutzakis 1985, No. 376, p. 4; No. 377,
p. 6; No. 378, pp. 12–14 and summary p. 14). Consideration of the co-appearance of imitation of
the sound and the rhythm of a group of sounds was left for the future (Kugiumutzakis 1993,
p. 45, 2007). Reflecting on the apparent motivation of neonates and older infants to experiment
with their efforts to imitate and to experience an emotional involvement, the experimenter
(Kugiumutzakis 1983, 1998) found that Aristotle (384–322 BC) considered imitation to be
an innate human capacity/potentiality, made perfect by habit. Like his predecessors, Aristotle
was sure that activation of the imitative arts, and especially music, improves the strength and
refinement of emotions. His words have been translated as follows:
Imitation is natural to man from childhood, one of his advantages over the lower animals being this,
that he is the most imitative creature in the world, and learns at first by imitation. And it is also natural
for all to delight in works of imitation … Imitation, melody and rhythm are natural to us.
Aristotle Poetics, 1448B, in Sifakis (2001, pp. 38–39)
9.2 The innate musicality of infants

Although at any given time and in every society, the skilled experts in the cultivation of music are
few, we all have the basic ability to express ourselves, to respond and to communicate in musical
ways. It is therefore necessary, we believe, for psychologists and teachers of music to re-examine
the source of musical ability, to estimate more closely musical experience and expression not only
as a talent that a few possess, a gift for learning that needs to be systematically trained to come to
fruition, but as an intuitive expressive ability and communicative need to be found, and actively
appreciated in all people (Flohr and Trevarthen 2007; Malloch 1999; Mazokopaki 2007;
Woodward and Bannan Chapter 21, Fröhlich Chapter 22 and Custodero Chapter 23, this
volume). Trevarthen (1999) describes musicality as the psychobiological source of music that
originates from the ‘intrinsic motive pulse’ (IMP) that motivates movement and consciously
directed action of the individual, as well as shared social experience.
In the past 20 years, a number of researchers have sought the origins of human musical
ability in infancy, and especially in infants’ first experiences with their parents (Papoušek 1996;
Papoušek and Papoušek 1981; Trehub 1990, 2000; Trevarthen 1999). Mothers’ songs, musical
games, dancing, and rhythmic gestures are shared with infants many months before words gain
meaning for them, and they excite their interest, please them and motivate them to act (Trainor
1996). Before birth, the human fetus can hear musical and non-musical sounds, and may learn to
recognize distinctive features of song or of music made by instruments from the seventh month
of gestation (DeCasper and Spence 1986; Lecanuet 1996; Shetler 1990). We should view musicality
and the enjoyment of music as a psychosocial need that derives from innate motives for sympathetic
understanding and cooperation between individuals and generations (Bjørkvold 1992;
Blacking 1969/1995, 1979). Musicality, by attracting sympathy through the pulse of moving,
brings minds into companionship, affirms one’s social identity and creates unforgettable narratives
of feeling in community (Trevarthen 2002; Trevarthen and Malloch 2002).
Psychological research on the development of musical ability in infancy can be divided into
two categories, according to the emphasis researchers give to the actions and awareness of the
infant individual, or to the collaborative exchange of expressions in the communicative
parent–infant dyad. The first line of research asks what is an infant’s ability, in the laboratory, to
perceive and distinguish musical stimuli presented in a controlled manner. In general, these tests
are done with babies over 3 months of age, when they have developed good musculoskeletal support
of the head and a clear sense of the location of events either side. The tests use an infant’s
natural curiosity about new events to track orienting preferences. Trehub and her colleagues have
used head orienting to a loudspeaker, presenting a change in sound patterns to show that infants
are able to perceive just those features of musical sound that are significant in Western European
music. Infants pay attention to relational aspects of melodies, encode the contour of a melody
across variations in pitch levels and intervals, and are more precise in perceiving the differences
between diatonic melodies than melodies that violate the common conventions of music. They
are sensitive to differences in tempo and they can discriminate rhythmic sequences independent
of tempo. They group components of tone sequences on the basis of similarities in pitch, timbre
or loudness, experiencing Gestalt grouping effects in much the same way as adults (Thorpe and
Trehub 1989; Thorpe et al. 1988; Trainor and Trehub 1992; Trehub 1987, 1990; Trehub and
Thorpe 1989; Trehub, Endman and Thorpe 1990; Trehub, Thorpe and Trainor 1990).
The second line of research focuses on the search for motives and intentions that lead the
infant and its parents to express themselves musically together, within spontaneous social and
communicative contexts. Intentional participation in musical exchanges can be studied in
natural parent–infant play from birth. In infant psychology, music has become a model to guide
the analysis and understanding of the communicative and emotional components of interaction
between the two companions (Malloch 1999; Papoušek 1996; Papoušek 1987; 1994; Papoušek
and Papoušek 1981; Stern 1974, 1992, 1993, 1999; Stern et al. 1985; Trevarthen 1999, 2001, 2002,
2003, 2004a, 2004b; Trevarthen and Malloch 2000; Trevarthen, Powers and Mazokopaki 2006).
Infants have shown themselves to be proficient improvisors in musical engagements with playful
parents (Trevarthen and Malloch 2002; Gratier and Danon, Chapter 14, this volume).
Developmental research on musical and paralinguistic components of the way mothers talk to
their young infants has revealed a distinctive melodious way of expression, described as intuitive
‘motherese’ (or, more impersonally, ‘infant-directed speech’), of which the expressive
prosodic elements have been confirmed in many different languages. The appearance of the same
intuitive motherese in different languages and cultures confirms the universality of the motives
for this way of speaking to infants, notwithstanding the expected, developmentally and culturally
crucial individual differences concerning the mothers, the languages, the civilizations and the
nations (Fernald and Simon 1984; Fernald et al. 1989; Grieser and Kuhl 1998; Masataka 1992;
Papoušek and Papoušek 1981; Stern, Spieker and Mackain 1982; Powers and Trevarthen, Chapter 10,
this volume).
During the first weeks after birth, the infant can take intense interest in the musical prosody of
motherese and join in actively by attempting to synchronize his or her expressions with those of
the mother (Beebe et al. 1985; Trevarthen 1999; Marwick and Murray Chapter 13, and Gratier
and Danon Chapter 14, this volume). Descriptive microanalysis of spontaneous interaction of
infants with their parents has shown that both participants are actively responsible for their coordination
in improvised dialogues by the use of precise vocal, kinetic and emotional components
of their behaviour (Tronick 2005). Observation and analysis of the mutual regulation in
mother–infant communication leads to the theory of communicative musicality (Malloch 1999;
Trevarthen 1999; Trevarthen and Malloch 2002), defined as the ability that allows both infant and
mother to sustain a coordinated relationship in time and to share a jointly constructed narrative
of moving. Any lack of rhythmic organization in the mother’s expressions and responses, such as
that due to emotional difficulties of a depressed mother, may cause the infant to fail to correspond
in a predictable way, which further compromises their engagement (Robb 1999; Marwick
and Murray Chapter 13, and Gratier and Danon Chapter 14, this volume). It is believed that the
fundamental forms of artistic expression in the art of music and all of the temporal arts grow
from this innate capacity for communicative musicality, that is an expression of motives seeking
sympathy and companionship that can be observed as an intrinsic organizing principle for all
forms of human communication (Dissanayake 2000; Trevarthen and Malloch 2002).
The interest of developmental psychologists has also been focused on searching for characteristics,
and possible functions, of the songs addressed to infants by parents everywhere.
Comparative studies have shown that the baby songs not only have common characteristics that
are clearly evident in various cultures, but exhibit minor variations depending on the way the
song is performed, on cultural traditions and on the emotional messages it transmits (Trainor
1996; Trehub, Unyk and Trainor 1993a, b; Trehub et al. 1997; Trevarthen 1999). Research on
musicality in the voice addressed to infants is in progress and is focusing systematically on the
analysis of singing in spontaneous communicative contexts (Part Two, this volume). Both lines of
research converge on the idea that we are born sensitive to the company of the person who moves
in the music.
This chapter describes part of the results of a longitudinal study in Crete that focused on the
development, during the first year of infancy, of (a) spontaneous rhythmic expressions of the infants
in the absence and presence of music, and (b) responses to maternal songs and the infants’
participation in free play with their mothers (Mazokopaki 2007).
9.3 How rhythm is defined

Rhythm is the fundamental element of music, and it has an intimate part to play in relation
to other aspects of music, such as melody, harmony and timbre. Even though it is clear that
rhythmic organization is essential to music, as well as to language, and indeed to all natural
movements and cultural practices, it is difficult to give a generally accepted definition of rhythm
(Osborne, Chapter 25, this volume). The difficulty derives from the fact that rhythm in sound
refers to a complex reality, in the organization of which several variables, such as duration,
intensity, and pitch are combined (Fraisse 1982). Rhythm derives from the two Greek words,
ῥυθμός (rhythm) and ῥέω (to flow). Plato, abstract formalist though he was, connected this
meaning of ‘flowing’ to body movements, and he defined rhythm as ‘the order in movement’
(Fraisse 1982).
In essence, rhythm is made up of regularities in both temporal and intensive patterns. It refers
to the grouped organization within perception of elements of a pulse (Fraisse 1978, 1982;
Krumhansl 2000; Lerdahl and Jackendoff 1983). A series of identical sounds heard repeating
through a certain time with constant intervals is spontaneously perceived as grouped in clusters
of two, three or four elements, the grouping being related to the basic frequency of the sounds.
Because nothing objectively specifies the grouping, it is called ‘subjective rhythm’ (Fraisse 1978,
1982). ‘Objective rhythm’ arises when actual differences in intervals or contrasts are introduced
into the sequence of sounds. These differences can be a lengthening of an interval between two
elements, an increase in intensity, or a change in pitch between successive sounds (Fraisse 1982).
The ear, or mind, is searching for rhythmic patterns, and can perceive one when there is no objective
rhythmic pattern (see Osborne, Chapter 25, this volume, on the chronobiology of music).
Subjective rhythm can be perceived if the interval between successive events is neither too short
(much less than 115 ms), in which case the sequence will be perceived as a single, continuous
or shaped event, nor so long that the events are perceived as independent or unconnected in
time (Fraisse 1978, 1982). Generally, the lower limit duration between two elements, according to
many researchers (e.g., Fraisse 1982), is placed at around 100 ms, and the upper limit is commonly
taken to be between 1500 and 2000 ms; however, there is disagreement about how long
this interval may be before the sense of rhythmic connectedness is lost. The perceived connectedness
depends on the expectancy of the listener, what they sense about what is happening, or
how movements that make the sound are being performed. The lower limit, also, is affected by
musical practice and experience (Kühl 2007).
Disregarding rhythms of day and night and of the seasons, which have great biological and
psychological importance, the perception of rhythms generated by human body movements
reflects many other features of grouping and dynamic succession, and these are richly exploited
for dramatic and aesthetic effect in the arts of dance and music.
9.4 The sense of rhythm from movement of the body in communication

As a concept, as an essential part of music, and as personal experience, rhythm, and the sharing of
rhythm in a group, has a central position in Mikis Theodorakis’s theory of music and life. He
describes the origin of rhythm in music as follows:
All rhythms in folk dances start from the need of the human body to express itself in movement – of
legs, body, hands. Moreover, the movement usually takes place in a group. These two elements – body
movement and group movement (which requires sympathetic co-ordination) – define the rhythmic
patterns … the most simple rhythmic schema is the walking in isochronal rhythmic intervals, sometimes
slow sometimes fast … Rhythm is the relation between two or more beats. Once the first rhythmic
schema has been found, the need for group co-ordination necessitates its repetition. Thus, we have
a second element: the repetition of the original schema … [which] never changes … and thus the
participants in the group dance can move in synchrony.
Theodorakis (1983, pp. 55–57)
We are interested here in the developmental source of this natural rhythm and its sociability—
in the temporal measures that emerge in control of an infant’s interest and action in the immediate
‘conscious present’, within the space of a few tens of seconds (Stern 2004). Rhythmic
expression in movement of an infant is defined as a series of events in time, produced by activity
of the body (vocal sounds, gestures or whole-body movements, or combinations of these), where
a stable beat can be perceived, and the various elements have a degree of regularity and organization.
The succession of elements that make up a given rhythmic sequence of infant vocalizations
and/or movements should be distinct from one another, but not so far apart in time that they
could be perceived as disconnected (Osborne, Chapter 25, this volume). We take the value of
2 seconds as the upper limit of the interval of time between two elements in a rhythmic
succession (Mazokopaki 2007).
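The 2-second criterion can be illustrated with a minimal sketch. It assumes that event onsets have already been coded as times in seconds; the function name and the onset values are hypothetical, and the sketch checks only the gap criterion, not whether a stable beat is perceived.

```python
from typing import List

MAX_GAP_S = 2.0  # upper limit between successive elements, as stated in the text


def split_into_sequences(onsets_s: List[float],
                         max_gap_s: float = MAX_GAP_S) -> List[List[float]]:
    """Group event onset times (seconds) into candidate rhythmic sequences.

    A new sequence starts whenever the gap to the previous event
    exceeds max_gap_s.
    """
    sequences: List[List[float]] = []
    for t in sorted(onsets_s):
        if sequences and t - sequences[-1][-1] <= max_gap_s:
            sequences[-1].append(t)   # still connected to the current sequence
        else:
            sequences.append([t])     # gap too long: start a new sequence
    return sequences


# Hypothetical onset times of leg kicks, in seconds (illustrative values only).
onsets = [0.0, 0.6, 1.2, 1.8, 5.0, 5.5, 6.0]
sequences = split_into_sequences(onsets)
```

With these illustrative values, the 3.2-second gap between 1.8 s and 5.0 s separates the events into two candidate sequences.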
The rhythmic experience of infants should be considered not only in relation to the production
and perception of sound, whether musical or non-musical, but as an integrated expression of the
total effect of the vocal and kinetic pulse and the emotional quality that is produced and sensed in
the body when movements are made (Malloch 1999; Trevarthen 1999). Starting from this theoretical
position, i.e., with an interest in rhythmic motives that generate and regulate all movements,
we investigate how infants, as performers, convey their feelings and emotions rhythmically,
through the modulation of voice and body movement, both when they are on their own and when
they listen to music, and we set out to record how these rhythms develop during the first year.
9.5 A method for studying the development, and sharing, of the sense of rhythm

Our study investigated the following during the first year:
(a) The development of spontaneous infant vocal and bodily rhythms, and the emotional facial
expressions of participation in the rhythmic experience, in two conditions. In Condition 1,
the infant was filmed alone while the mother was out of the room. In Condition 2, the infant
was alone, but this time a tape-recording of a traditional Greek baby song was played in the
same room.
(b) The development of rhythmic ‘narratives’ in mother–infant interaction, when the mother is
spontaneously singing to her infant. These were recorded in Condition 3, a period of free
play between mother and infant in the absence of recorded music (Table 9.1).
Table 9.1 The three conditions of the study
Condition   Description                                   Duration in minutes
1           Infant alone/no music/mother out of room      2
2           Infant alone/baby songs/mother out of room    2
3           Mother and infant in free play                6
Total                                                     10
We present here a summary of results from the first two conditions, focusing on the development
of spontaneous rhythmic vocalizations and movements of the infants when they are on
their own, amusing themselves without musical sounds (Condition 1), and when they are on
their own, listening to recorded music (Condition 2). More specifically, the results presented here
refer to: (i) the frequencies of the infant rhythmical expressions, (ii) the developmental course of
infant rhythms during the first year, (iii) the kinds of rhythmical expressions, (iv) the duration of
simple and complex rhythms, and (v) the emotional expressions before, during and after the
rhythmic experience.
This longitudinal study was conducted in Crete. Fifteen mother–infant pairs were recorded in
their homes 11 times from the second to the tenth month of age in the three conditions. Eight
infants were boys (four first-born and four second-born) and seven were girls (four first-born and three
second-born). All infants were full-term babies of normal birth weight. From months two to
four, when development is rapid, the recordings were made every 15 days, and thereafter they
were made monthly until the tenth month. The recordings were made with a Panasonic NV-MS4
SVHS camera and a Sony digital audio tape-recorder (DAT TCD-D8) with one microphone.
A room in the home that was very familiar to the infant was selected. The mothers were told that
the aim of the study was to observe them playing with their infants, and that they would participate
in the research after a recording had been made of their infant alone for comparison. None
of the mothers had received a formal education in music. A pilot study was carried out with
10 infants to test the procedure. Two infants at each age were observed at 1, 2, 4, 6 and 8 months.
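The recording schedule of the main study can be checked with a short sketch. It treats 15 days as half a month, which is an assumption, and the session total assumes that no session was missed, which the text does not state.

```python
def recording_ages_months() -> list:
    """Ages (in months) at which each mother-infant pair was recorded."""
    # Every 15 days (taken here as half a month) from month 2 to month 4 ...
    fortnightly = [2 + 0.5 * i for i in range(5)]   # 2, 2.5, 3, 3.5, 4
    # ... and monthly thereafter until month 10.
    monthly = list(range(5, 11))                    # 5, 6, 7, 8, 9, 10
    return fortnightly + monthly


ages = recording_ages_months()
sessions_per_pair = len(ages)            # should match the 11 recordings reported
total_sessions = 15 * sessions_per_pair  # 15 pairs, assuming no missed sessions
```

The schedule yields 11 ages per pair, in agreement with the 11 recordings reported.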
9.5.1 The recording conditions

In Condition 1, the infant was recorded for 2 minutes in the presence of an ‘uncommunicative’
researcher making the video recording, while the mother was out of sight and silent. Depending
on his or her age, the infant was seated on the floor or in a baby seat, or lying or standing in the
cradle or cot, or on the mother’s bed.
In Condition 2, the infant was again alone, as in Condition 1, but now a tape recording of a
traditional Greek baby song was presented for 2 minutes at a moderate volume from a player in
the same room. The song presented was changed every 2 months, each song at a slightly faster
tempo than the preceding one.
We chose four baby songs, which the pilot study showed mothers were singing often to their
infants during free play. All four songs had a simple melody, without large movements of pitch
and dramatic changes. The first three songs (The fisherman’s little boat, The small boat and
Madam Maria) had a simple 4/4 rhythm; the last song (The little lemon tree) had a 7/8 rhythm of
a traditional Greek dance. At the end of each song, there was a pause of 10 seconds, and the song
started again from the beginning for a few seconds. This was done to observe infant behaviour
during the pause.
9.5.2 Data analysis

The rhythmic expressions of the infant were measured with the aid of a Video-Logger Event
Recorder (Macleod, Morse and Burford 1993). This technique facilitates the microanalysis of
recorded behaviours, to determine when they occur, their durations, and their sequences and
co-occurrences. Measurements are made to an accuracy of 1/25th of a second.
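An accuracy of 1/25th of a second corresponds to a resolution of 40 ms. The sketch below assumes that events are coded as video frame indices at 25 frames per second; the frame rate and the function names are illustrative assumptions, not a description of the Video-Logger software.

```python
FRAME_RATE_HZ = 25  # assumed: one coded frame per 1/25 s

# Smallest time difference that can be resolved, in milliseconds.
resolution_ms = 1000 / FRAME_RATE_HZ


def frame_to_seconds(frame_index: int) -> float:
    """Convert a frame index (counted from 0) to a time in seconds."""
    return frame_index / FRAME_RATE_HZ


def duration_seconds(onset_frame: int, offset_frame: int) -> float:
    """Duration between a coded onset frame and a coded offset frame."""
    return (offset_frame - onset_frame) / FRAME_RATE_HZ
```

Under these assumptions, a behaviour coded from frame 100 to frame 180 lasts 80 frames, or 3.2 seconds.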
In the present study the following categories of rhythmic action were coded:
1 Rhythmic vocalizations. A sequence of vocalizations or pre-speech sounds in time, demonstrating
a clear and predictable beat.
2 Rhythmic hand gestures:
(a) General rhythmic activity of one or both hands, starting with hands close to body and
finishing with arms extended, raised and lowered to touch the same or a different part of
the body.
(b) Specific rhythmic gestures such as patting, flapping, clapping, rotation of the palm while
the fingers were slightly bent, palm opening–closing, movement of the palm with all the
fingers closed (punch), tickling of the second and third finger on a surface and a movement
of two or more fingers in irregular order (Trevarthen and Marwick 1982).
3 ‘Dance’ movements:
(a) General rhythmic activity of one or both legs while the infant is lying on their back or
seated, such as kicking by the flexing and extending of the knees, cyclic movement of the
legs, banging of the legs on a surface.
(b) Rocking of the trunk combined often with the shaking of the shoulders and the head,
forward–backward or right–left.
(c) Combined movements of legs and trunk, such as bouncing while standing or seated.
4 Combinations of rhythmic hand gestures and dance movement. The infant moving with varied
combinations of the above subcategories of hand and body movements.
5 Combinations of rhythmic vocalizations, hand gestures and dance movements. Combinations of
vocalizations with the subcategories of hand and body movements.
Rhythm or order in a temporal succession occurs when elements are grouped over hierarchically
organized intervals of time (Fraisse 1982). As in speech, where syllables, phrases,
sentences and larger rhythmic units are distinguished, any pattern of coordinated
body movement can be analysed in smaller elements, and a rhythmic sequence of movements
can be experienced as organized into a group, depending on how we perceive the
periodicity of the pulse in the movement. The organization of human body rhythmic movements
reflects many complicated features of grouping and dynamic succession. In the
present study of the organization of infants’ rhythms, periodic movements were classified
into two patterns:
(a) Simple rhythmic patterns (sustained or continuous). The rhythm is organized as a single
continuing sequence in which we can perceive a stable pulse sustained through the whole
sequence; e.g., (- – - – - – - –).
(b) Complex rhythmic patterns (not sustained or non continuous). The rhythmic sequence is
organized in two or more kinds of groups, of the same repeated movement, with
pauses between the groups. The pauses between two successive groups are longer than
the intervals between the elements in either group; e.g., (- – -) [(- -) (- -)] (- -).
The durations of Simple and Complex rhythmic patterns were measured.
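The distinction between the two patterns can be sketched in code. The study's classification rested on observers' judgement; the rule below is only one possible operationalization, and the `pause_ratio` threshold of 1.5 and the onset values are hypothetical.

```python
from statistics import median
from typing import List


def classify_pattern(onsets_s: List[float], pause_ratio: float = 1.5) -> str:
    """Label a rhythmic sequence as 'simple' or 'complex'.

    A sequence is called complex when at least one interval between
    successive events is clearly longer than the typical (median) interval,
    i.e. when a pause divides the sequence into groups.
    """
    if len(onsets_s) < 3:
        raise ValueError("need at least three events to compare intervals")
    intervals = [b - a for a, b in zip(onsets_s, onsets_s[1:])]
    typical = median(intervals)
    has_pause = any(iv > pause_ratio * typical for iv in intervals)
    return "complex" if has_pause else "simple"


def pattern_duration(onsets_s: List[float]) -> float:
    """Duration from the first to the last event of the sequence."""
    return onsets_s[-1] - onsets_s[0]


# Illustrative onset times in seconds (not data from the study).
steady = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]              # one sustained pulse
grouped = [0.0, 0.5, 1.0, 2.2, 2.7, 3.2, 4.4, 4.9]   # groups separated by pauses

steady_label = classify_pattern(steady)
grouped_label = classify_pattern(grouped)
```

A median-based threshold is used so that a few long pauses do not distort the estimate of the typical interval within groups.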
The emotional expressions of the infants were analysed for 5 seconds before the rhythmic
expression, during it, and for 5 seconds after it. The following categories were
coded:
(a) Surprise: Wide-open eyes, open or loosely closed mouth, the corners of the mouth are
pulled slightly downwards so the upper lip has an inverse U shape, with raised or knitted
eyebrows. Surprise is often expressed in response to the presentation of a novel, strange
and undifferentiated input (Izard 1978).
(b) Interest: Unsmiling face with open eyes and intense looking; lips loosely open or closed,
the corners of the mouth sometimes slightly downward (Kokkinaki 1998). The direction
of interest was categorized according to the orientation of the infant’s attention: to his or
her body, to the musical sound or its source, or elsewhere (e.g., to the room, or to the
researcher). Interest in musical sound was characterized by looking about, searching for
the sound source, or fixation (concentration) of looking on the source or other object,
very often with the body still for attentive listening.
(c) Pleasure: A happy relaxed face, or a gentle smile with a slightly open mouth, slightly
stretched lips and cheeks drawn upwards; eyes open wide. These expressions are often
combined with pleasure vocalizations (Kokkinaki 1998).
(d) Joy: Shown by a broad smile, often with laughter. Joy was categorized as an intense
expression of pleasure: the mouth is open with wrinkles on either side, the eyes are open
but narrowed with wrinkles under them, the cheeks are bulging, and the baby often
makes intense playful sounds.
(e) Excitement: In comparison with joy, excitement is characterized by laughter of higher
intensity and similar facial expressions, which in this case express elation. Excitement is
often combined with intense body movements.
(f) Neutral: Characterized by an unsmiling, relaxed face with no signs of vocalizations or
body movement. The infant has an indifferent expression unrelated to the self, to
surroundings, or to the musical sound (see also Kugiumutzakis et al. 2005).
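The three analysis periods (5 seconds before, during, and 5 seconds after a rhythmic expression) can be sketched as follows. The data layout, a list of time-stamped emotion codes, and all names and values are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

WINDOW_S = 5.0  # seconds analysed before and after each rhythmic expression


def phase_of(t: float, rhythm_start: float, rhythm_end: float,
             window_s: float = WINDOW_S) -> Optional[str]:
    """Return 'before', 'during' or 'after' for time t, or None if outside."""
    if rhythm_start - window_s <= t < rhythm_start:
        return "before"
    if rhythm_start <= t <= rhythm_end:
        return "during"
    if rhythm_end < t <= rhythm_end + window_s:
        return "after"
    return None


def tally_emotions(coded: List[Tuple[float, str]],
                   rhythm_start: float,
                   rhythm_end: float) -> Dict[str, Counter]:
    """Count coded emotion labels in each of the three analysis periods."""
    counts = {"before": Counter(), "during": Counter(), "after": Counter()}
    for t, label in coded:
        phase = phase_of(t, rhythm_start, rhythm_end)
        if phase is not None:
            counts[phase][label] += 1
    return counts


# Illustrative codes (time in seconds, emotion category); not study data.
coded = [(8.0, "interest"), (11.0, "pleasure"), (12.5, "joy"),
         (15.0, "interest"), (30.0, "neutral")]
counts = tally_emotions(coded, rhythm_start=10.0, rhythm_end=13.2)
```

In this illustration the code at 30.0 s falls outside all three periods and is not counted.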
9.6 How infants responded to the music, and expressed their feelings

9.6.1 Frequencies of all infant rhythms
We observed 471 rhythmic expressions in the absence of music (Condition 1), and 653 in
reaction to the sound of music (Condition 2) (Table 9.2).
All infants produced rhythms in Condition 1. Only three infants, Subjects 10, 11 and 12
(marked * in Table 9.2), all boys, made fewer rhythmic movements when the music was played. Analysis with
a paired samples t-test confirmed that infants, as a group, produced significantly more rhythmic
activity when they heard the songs (t = –2.989, p = 0.01). There was no significant difference
between boys and girls within the condition, but by comparing the two conditions for each
sex separately we found that girls in Condition 2 exhibited more rhythmic expressions
(t = –2.957, p = 0.025), a difference that was not found with the boys.
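The paired comparison can be recomputed from the per-infant counts in Table 9.2. The sketch below uses only the Python standard library and reports the t statistic without a p value. It takes the Condition 2 count for infant 5 as 27, an assumption based on the fact that this value agrees with the column total of 653 and with the reported t values.

```python
from math import sqrt
from statistics import mean, stdev

# Rhythmic expressions per infant (infants 1-7 are girls, 8-15 are boys).
# Infant 5 in Condition 2 is taken as 27 (assumption, see the note above).
cond1 = [15, 46, 52, 10, 14, 29, 13, 17, 43, 53, 39, 23, 56, 35, 26]
cond2 = [29, 81, 58, 12, 27, 48, 15, 61, 59, 44, 34, 16, 87, 54, 28]


def paired_t(a, b):
    """Paired-samples t statistic for a minus b (df = len(a) - 1)."""
    diffs = [x - y for x, y in zip(a, b)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error


t_all = paired_t(cond1, cond2)            # all 15 infants, df = 14
t_girls = paired_t(cond1[:7], cond2[:7])  # girls only, df = 6
```

Under that assumption the statistic is about -2.989 for all infants and about -2.957 for the girls, matching the values reported above.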
Table 9.2 Frequencies of infant rhythmic expressions in Conditions 1 and 2
Infant   Sex    Condition 1 (without music)   Condition 2 (with music)
1        Girl   15                            29
2        Girl   46                            81
3        Girl   52                            58
4        Girl   10                            12
5        Girl   14                            27
6        Girl   29                            48
7        Girl   13                            15
8        Boy    17                            61
9        Boy    43                            59
10       Boy    53                            44*
11       Boy    39                            34*
12       Boy    23                            16*
13       Boy    56                            87
14       Boy    35                            54
15       Boy    26                            28
Totals          471                           653
9.6.2 Developmental course of infant rhythms

We applied two multivariate repeated measures models with two factors to test for differences in
the distribution of rhythmic expressions by age. The first model searches for differences between
the 11 age categories when the condition is maintained constant. The second searches for differences
between conditions, when age remains constant (Stevens 2002).
In Condition 1, when there was no music, the infants displayed more rhythmic movements as
they became older, with maxima at 3 months and 9 months (Figure 9.1). The analysis showed
that the increase from the 8th to the 9th month (F = 4.789, p < 0.05) and the decrease from the 9th to the 10th month
(F = 5.259, p < 0.05) were significant, and the value at 9 months was significantly different from
those for all the other ages (Table 9.3). Clearly there is an important increase in spontaneous
rhythmic movements at 9 months for this group of infants when they are amusing themselves on
their own. Infants are undergoing both physical and mental developments at this age, which
transform their communication and learning (Trevarthen and Aitken 2003).
Fig. 9.1 Infant rhythmic expressions in Conditions 1 and 2, by age. [Line graph: frequency (numbers of rhythmic movements, 0–100) against age (2–10 months), with one line for Condition 1 (no music) and one for Condition 2 (with music).]
Table 9.3 Difference between 9 months and all the other ages in Condition 1
Months      F        p
2 vs. 9     22.488   0.000
2.5 vs. 9   10.725   0.006
3 vs. 9     5.774    0.031
3.5 vs. 9   13.319   0.003
4 vs. 9     8.687    0.011
5 vs. 9     5.410    0.036
6 vs. 9     4.439    0.049
7 vs. 9     5.545    0.034
8 vs. 9     4.789    0.046
10 vs. 9    5.259    0.038
In Condition 2, where the infants were hearing recorded music, the developmental course of
rhythmic behaviours shows large ups and downs during the first 5 months (Figure 9.1). The analysis
showed that the drop in frequency of moving between 3 and 3.5 months (F = 7.549, p < 0.05) and
the rise between 3.5 and 4 months (F = 6.829, p < 0.05) are both significant. Furthermore, the level at
3.5 months was found to be significantly lower than those for the other ages, with the exception of
2.5 and 5 months (Table 9.4). The decrease at 5 months was significant in comparison with the 8th
(F = 8.739, p = 0.01), the 9th (F = 5.263, p < 0.05) and the 10th months (F = 0.045, p < 0.05). The
infant responses changed in complex ways with age, but it is clear that the fall in rhythmic engage-
ment with the music at 3.5 months is an accurate indicator of a developmental change. Again, this
correlates with known changes in body and brain at this age (Trevarthen and Aitken 2003).
Table 9.4 Difference between 3.5 months and all the other ages, in Condition 2
Months       F        p
2 vs. 3.5    4.877    0.044
2.5 vs. 3.5  3.621    0.078
3 vs. 3.5    7.549    0.016
4 vs. 3.5    6.829    0.020
5 vs. 3.5    2.516    0.135
6 vs. 3.5    7.746    0.015
7 vs. 3.5    9.964    0.007
8 vs. 3.5    15.307   0.002
9 vs. 3.5    11.735   0.004
10 vs. 3.5   16.900   0.001
While there are significant differences between Condition 1 and Condition 2 at 2 months
(F = 5.895, p < 0.05), 3.5 months (F = 7.273, p < 0.05) and 8 months (F = 8.253, p < 0.05), the analysis
revealed no systematic interaction between the two conditions. Spontaneous rhythmic activity
when there is no music is apparently different from the rhythms expressed when the infants hear
music (Figure 9.1).
9.6.3 Different kinds of infant rhythms

In both conditions, there were polyrhythmic expressions of different kinds, including vocalizations,
hand gestures, ‘dancing’ movements, hand gestures with ‘dancing’ movements, and combinations
of vocalizations, hand gestures and ‘dancing’ (Figure 9.2). In Condition 1, hand gestures were
significantly more frequent than the other kinds of rhythmic expression (Table 9.5). Analysis by
an independent two-sample t-test showed that in Condition 1 boys produced significantly more
rhythmic gestures than girls (t = 4.633, p = 0.000). In Condition 2, the music stimulated ‘dance’
movements and, to a lesser degree, hand gestures (Figure 9.2). Analysis showed a significant difference
between vocalizations and hand gestures (F = 18.430, p = 0.001) and between vocalizations and
dance movements (F = 4.599, p < 0.05). Interestingly, vocalizations decreased when music was
heard, presumably because the infants were listening. No significant differences were found
between boys and girls in this condition. Comparing the various rhythmic expressions between
the two conditions, we found a significant difference only in ‘dancing’ movements in Condition 2
(F = 8.805, p = 0.01). We found that boys produced more ‘dancing’ movements involving their
whole bodies (t = –2.634, p < 0.05) and girls produced more rhythmic hand gestures (t = –3.523,
p < 0.05) in Condition 2 than in Condition 1.
9.6.4 Durations of infant rhythmic expressions

In Condition 1, the mean duration of simple rhythmic sequences was 3.20 seconds (SD = 0.62)
and the mean duration of the complex rhythmic sequences was 6.33 seconds (SD = 2.38). The
difference was significant (t = –5.22, p < 0.001). In Condition 2, the mean duration of the
simple sequences was 3.03 seconds (SD = 0.56) and the mean duration of the complex rhythmic
sequence was 6.33 seconds (SD = 1.69). Again, the difference was significant (t = –6.97,
p < 0.001).
Comparison of the durations of the simple and complex patterns (a) between conditions,
and (b) from the 2nd to the 10th month both within and between conditions, showed no significant
differences. The temporal patterns of infants’ simple and complex rhythms in expressive
movement were stable during the first year, both within and between the conditions.
Neither age nor hearing music in Condition 2 changed the infants’ sense of time for these
expressive activities (Table 9.6).
Fig. 9.2 Kinds of infant rhythmic expressions in Conditions 1 and 2. 1, Vocalizations; 2, hand gestures; 3, dancing movements; 4, combined hand gestures and dancing movements; 5, combined vocalizations, hand gestures and ‘dancing’. [Bar chart: frequency (0–250) for each kind of rhythm (1–5), with separate bars for Condition 1 (no music) and Condition 2 (with music).]
Table 9.5 Comparisons between infants’ rhythmic hand gestures and other kinds of rhythm in Condition 1
Kinds of infants’ rhythms                             F        p
Hand gestures vs vocalizations                        11.696   0.004
Hand gestures vs dance movements                      5.123    0.040
Hand gestures vs gestures and dance                   8.507    0.011
Hand gestures vs vocalizations, gestures and dance    17.132   0.001
9.6.5 Infants’ expressions of emotion when hearing music

Measurement of the frequencies of repetitive activity of infants as they listen to music has limited
meaning, as it does not represent the grace, beauty, surprise, satisfaction and excitement they
evidently experienced in hearing the ‘story’ or ‘drama’ of the music. A full description of the ways
they expressed their curiosity and feelings tells much more. Here, we can give only a preliminary
account of their emotions of appreciation.
Microanalysis combined with the observation of the video showed that when the tape recorder
started playing, even very young infants, at about 3 months, stopped whatever they were doing
and purposefully turned their head about in a ‘seeking’ manner, searching for the source of the
sound, with characteristic facial expressions of interest and in many cases surprise.
Panos, a 9-month-old boy, shows the usual reactions that follow this (Figure 9.3). After his first
surprise (photo top left), he remained motionless and very attentive to the sound for a few
seconds (photo top right). Then, suddenly, there was a change in his facial expression, which
started with a sweet smile that expressed pleasure and satisfaction and progressively became a
broader smile, expressing joy and excitement (photo bottom left). These expressions were followed
by active participation through lively rhythmic movements and playful vocal sounds
(photo bottom right).
We found that in both Conditions 1 and 2, compared with a period of 5 seconds before and
5 seconds after, there was an increased expression of positive emotions of pleasure and joy when
the infants were acting rhythmically. In Condition 2, interest in sound appeared many times
throughout the playing of the song, but during the rhythmic response we found a decrease of
expressions of interest and an increase of surprise, pleasure, joy and excitement. After the rhythmic
response, while the infants continued to listen to the song, surprise, pleasure, joy and excitement
decreased and interest in sound again increased, as if the infant was looking for or
expecting ‘something’ to come, and getting ready to start a rhythmic activity again.

Table 9.6 Mean time (sec) of Simple and Complex rhythmic patterns in Conditions 1 and 2, by age
Age (months)          2      2.5    3      3.5    4      5      6      7      8      9      10     M      SD
Condition 1  Simple   3.95   3.24   2.94   2.35   3.21   2.40   3.32   2.58   3.99   3.47   3.18   3.20   0.62
Condition 1  Complex  16.68  8.52   5.57   5.12   7.17   5.93   8.47   5.08   4.97   5.44   8.95   6.33   2.38
Condition 2  Simple   2.54   2.69   3.20   3.22   3.11   2.33   3.33   3.16   3.44   2.97   3.13   3.03   0.56
Condition 2  Complex  10.12  7.71   6.55   --     7.79   7.94   5.80   4.97   12.42  5.81   5.01   6.33   1.69

Although we did not correlate infant emotional expressions with specific acoustic or musical
elements in the songs, such as the changing values of pitch, loudness and tempo, the expressive
patterns of the face together with the gestures, the movements of the limbs and the tone of vocal
sounds suggest the range of emotional experiences of the infants while they were listening to the
song. In the beginning, the appearance of a novel musical presence attracted their interest and
elicited an expression of surprise, with characteristic opening of the mouth, wide open eyes and
knitted eyebrows, as if wondering ‘What’s going on?’, or ‘Who is that?’ After they had become still
to listen attentively and seemed to have finished ‘exploring’, ‘feeling’ or ‘thinking about’ what they
had listened to, they expressed pleasure or joy with a beautiful smile, as if saying ‘Yes, I do like
you!’. The intensity of these emotional expressions, as well as the strength and the tempo of
the rhythmic movements that often followed, varied in relation with the temperament and the
personal style of expression of the infants (Figures 9.3, 9.4 and 9.5).
Summarizing the description of the performance of the infants when they were experiencing
the ‘narration’ of the music, we distinguish the following range of emotional ‘activities’:
◆ Searching for (by the turn of the body, the head and the eyes) and orientation to the source
of sound
◆ Interest and attentive listening to the music by becoming still
◆ Surprise and curiosity for the unexpected ‘events’ of the song (the beginning, the pause we
inserted at the end of the first part of the song, and the repetition of the introduction)
◆ Pleasure sounds, or ‘talking’, addressed to musical sound
◆ Graceful repetitive gestures and movements in eager response
◆ Characteristic expressions of pleasure, joy and sometimes enthusiasm
◆ Occasional appearances of self-consciousness or coyness.
Fig. 9.3 Nine-month-old Panos responding to music. He looks surprised, smiles a greeting, then
moves rhythmically, beats with his hand and vocalizes.
Fig. 9.4 Georgos, 3.5 months, listens, responds with pleasure and gestures a performance to the
sound of music.
9.7 Discussion
9.7.1 Infants’ rhythmic expressions
Natural or intuitive musicality of human beings is described in this chapter by charting the
spontaneous rhythmic expressions of infants, at home, in two conditions: when they are on
their own, amusing themselves while a quiet researcher records them, and when music comes
on in the room. We have recorded and counted the repetitive body movements and vocalizations
of the infants, and their facial expressions of different emotions accompanying the
rhythms.
Infants, we found, could generate various graceful, simple or more complex and gentle or
vigorous rhythms in the two conditions. In Condition 1, they could perfectly well express
themselves happily with rhythmic musicality in a quiet room. The spontaneous emergence of
rhythmic expressions in the absence of music is interpreted as motivation of the intrinsic motive
pulse (IMP) that moves one into action, coordinating vocal and body activity in the Self and
social experience with Others (Trevarthen 1999; Trevarthen and Malloch 2002). The finding
that infants produced significantly more rhythmic activity when they heard the songs of
Condition 2 confirms the motivating force of music in a human being, even from infancy, but
most importantly indicates the ability of infants to be attracted by, to desire, to respond to
and celebrate the music by making sympathetic or ‘synrhythmic’ movements of their body
(Trevarthen et al. 2006; ‘synrhythmia’ is defined as the direct regulation of psychological states in
intersubjectivity).1 It is interesting that the girls were more rhythmic than the boys in their expressions
to the presence of the music. Perhaps we can relate this to the many kinds of evidence
suggesting that infant girls are inherently more socially inclined than boys, but this needs more
systematic investigation of how boys and girls use the experience of music in their play and
communication.
Fig. 9.5 Above: Katerina, 9 months, responds to music by looking, smiling and ‘flying’ into action.
Below: Anna, 10 months, standing in her cot, is surprised by the music, then smiles and starts
dancing and singing vigorously.
9.7.2 Development of infants’ movements in the first year

The changes in the rhythmic movements shown in Figure 9.1 indicate that the awareness infants
have of their bodies and how they may be used to carry out intentions undergoes changes at
particular ages. This is confirmed by knowledge of the normal course of development of
activities of different kinds: to orient to or track objects, to reach and grasp for them, to manipulate
them, to crawl, stand and walk (Trevarthen and Aitken 2003). There are also related changes in
communication and Self–Other awareness, involving the development of more discriminating
attention to the eye gaze of a partner, more lively changes of expression, more versatile vocaliza-
tion and, after 6 months, the elaboration of learned utterances and gestures. Infants 5 to 7 months
of age enjoy lively game rituals and show pleasure in being admired for the tricks they have
learned (Malloch 1999; Trevarthen 1999). After 9 months, there is a marked change in awareness
of other people’s intentions and a willingness to cooperate with them (Trevarthen 2001, 2004a).
9.7.3 The multimodality of infants’ rhythms: the impulse to dance

The spontaneous rhythms of infants in both conditions were expressed through voice, hand
gestures, dance movements and their combinations (hand gestures with dance movements or
vocal expressions, hand gestures and dance movements all together). This polyrhythmic/multimodal
expression indicates that all of the above modes are from the beginning of life functional and
they serve, in any level of maturity, subjective and intersubjective needs. In the absence of music
the prevalence of hand gestures and their significant differences compared to each of the other
kinds of rhythmic expression indicates that the kinetic system of the arms and hands is, from the
2nd to the 10th month, more developed than the vocal. Moreover, the appearance of the rhyth-
mic combinations shows the efforts of the infants to coordinate their available rhythms, as if a
solo rhythmic expression is not enough to express the IMP of their movements in Condition 1
and the coordination of the IMP with music in Condition 2 (Trevarthen 1999).
The finding of the significant difference in dancing movements between the two conditions,
and the absence of significant differences in the other kinds of rhythmic expressions between
conditions, shows that the music we presented—cheerful baby music—seems to motivate self-
organizing or self-experiencing activity of the whole body, more than the other kinds of rhythmic
expressions such as might be made to give expressive messages directed to other persons being
addressed by the child. The increase in rhythmic movements of the body reflects the internal
impulse to dance and the pleasure of participation in the musical experience.
9.7.4 The timing of infants’ rhythms

Rhythmic behaviour depends on the timing of movements of the body, and there is a hierarchy
of periods that are coordinated between different parts of the body in the production of rhythms.
Separate movements, equivalent to syllables or musical notes or chords, have regulated timing at
periodicities from 1 to 3 per second, and they are grouped in phrase units of about 3 to 5 seconds
in duration. We measured the durations of simple and complex rhythmic bursts of activity when
infants were on their own in a quiet room and when they heard music.
Whether moving to entertain themselves or joining in with the music, the simple rhythmic
sequences of expression lasted for about 3 seconds and the mean duration of the complex
sequence was about 6 seconds. A ‘phrase’ of 3 seconds seems to be a basic temporal pattern in
human activity and communication that is recognized in many works on musical timing. In
mother–infant vocal interaction, a ‘bar’ structure has been found lasting about 1.6 to 3.3 seconds
(Malloch 1999). In music and poetry, a phrase unit of about 3 seconds has been revealed, which
appears to be a basic temporal element used by our brains to organize experience (Wittmann and
Pöppel 1999). The duration of spontaneous vocal phrases of infants has been found between
3 and 4.5 seconds (Lynch et al. 1995). In our study, the temporal unit of 3 seconds duration for a
simple rhythmic performance was a feature of all the movements the infants made and was not
significantly influenced by age and the presence of music. While the presence of music seems to
increase the rhythmic activity, to differentiate its development, to motivate dance movements
and to provoke intense and varied emotions, it does not seem to affect this fundamental
periodicity, a finding that seems to reflect the constancy of the regulatory activity of the IMP.
Indeed this interval, of around 3 to 5 seconds, has been defined as the ‘psychological present’
of conscious experience (Trevarthen 2005; Osborne, Chapter 25, this volume). That complex,
polyrhythmic expressions were about twice as long suggests that the infants could ‘compose’
longer rhythmic experiences by doubling the basic three-second phrase.
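The 'about twice as long' relationship can be checked with simple arithmetic on the mean durations reported in the M column of Table 9.6. The following is only an illustrative sketch: the dictionary layout and the function name are assumptions introduced here, not part of the study's analysis, and the values are copied from the table as printed.

```python
# Mean durations in seconds, copied from the M column of Table 9.6.
# The data structure and names below are illustrative assumptions.
REPORTED_MEANS = {
    "Condition 1": {"simple": 3.20, "complex": 6.33},
    "Condition 2": {"simple": 3.03, "complex": 6.33},
}


def complex_to_simple_ratio(means):
    """Return the complex/simple mean-duration ratio for each condition."""
    return {
        condition: values["complex"] / values["simple"]
        for condition, values in means.items()
    }


ratios = complex_to_simple_ratio(REPORTED_MEANS)
for condition, ratio in ratios.items():
    # A ratio near 2 is consistent with a doubling of the basic phrase.
    print(f"{condition}: complex/simple = {ratio:.2f}")
```

With the reported means, the ratio is about 1.98 in Condition 1 and about 2.09 in Condition 2, which is consistent with the interpretation that complex sequences span roughly two basic phrases.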
9.7.5 The evolution of emotions in the infants’ rhythmic ‘dialogue’ with the musical other
Infants certainly did not listen to the songs merely as receptive auditors who were discriminating
musical features. They attended carefully and then acted as good ‘musical performers’. They were
listening very attentively in the beginning of the song, looking for the source of sound, expressing
surprise and interest. They seemed to need to explore the musical sound a little, and then they
participated by dancing and singing, expressing pleasure and increasing joy. They shared rhythms
and emotions.
The increase of the emotional expressions of pleasure and joy during the rhythmic activity both
with music and without it reflects the emotional motivation of infant rhythmic expression. More
specifically, in the presence of music the infants’ rhythms were motivated mostly by the emotions of
interest (in sound), of surprise, of pleasure, of joy and excitement. However, during the rhythmic
activity, there was a decrease of interest, an increase of pleasure and joy and a smaller increase of
surprise and excitement, indicating the enjoyment of being in the rhythm with the music. After the
end of the rhythmic activity, the interest in sound was increased, indicating probably the attentive
listening of the infant to the musical sound, maybe before he or she starts a new burst of activity.
The ‘musical’ performance of the infant, the rhythmic show we have described apparently to
amuse the Self, confirms the primary social or communicative role of musicality. In the absence of
music (Condition 1) the spontaneous appearance of the rhythms may reflect a subjective infant
expectation for filling the mind’s companion space with music or with the singing mother—
a deep wish to pass, by means of music, from the state of subjectivity to intersubjectivity.
According to Bråten’s theory (1988, 1992), the Virtual Other is an intrinsic complementary
perspective of a non-particular companion that exists in the absence of the actual Other, and fills
in the perspective of the infant when he or she is alone. Based on our results, we assume that the
Virtual Other in the infant’s mind can be a more generalized Other with musical and cultural
meaning. The spontaneous emergence of infants’ rhythmic expressions in the absence of recorded
music in Condition 1 might be showing the infant’s intrasubjective expectation that waits to be
filled by the actual intersubjective musical communication, the actual Musical Other. Spontaneous
multimodal rhythms may be inviting the music, the mother or even the silent researcher who is
recording the infant behaviour, to a potential ‘musical’ sharing.
In Condition 2, when the actual music of recorded songs starts, the infant’s invitation has been
recognized and accepted, the intersubjective expectation has been confirmed, the space of the
virtual cultural Other has been filled by real music, the game of sharing has started and has to
be completed by both coordinating partners—infant and music, inside the nest of positive
emotions. Infants not only expect music, but respond to it with the culturally expected mode of
performing to music—by dancing movements, plus rhythmic vocalizations and hand gestures,
and their combinations. The song is greeted as a Musical Other who invites the infants into an
adventure of discovery and creation. In joint companionship with a real and present person, the
infant is even more motivated to express and develop musicality than in the individual experience
of solitary play, although it is clear that a baby has the imagination to create a musical experience
for himself or herself. In the intersubjective mother–infant communication, both partners
are able to sustain a coordinated relationship through time, and share jointly constructed narra-
tives of moving through many seconds (Malloch 1999; Stern and Gibbon 1980; Stern 1992, 1999;
Trevarthen 1999). By anticipating the changes, each partner contributes to the improvisation and
development of the interaction in a way that leads to coordinated enjoyment of a common social
experience (Trevarthen 2002; Gratier and Danon, Chapter 14, this volume). When the infant
listens alone to a song, it is the presence of music, the melody and the dramatic form of narration
that acquire meaning and that enable the infant to become proficient in appreciation and
production of musicality, by ‘knowing’ what he or she is doing with and for the Musical Others.
In a more advanced state of participation (Condition 3; Mazokopaki 2007) the singing mother
enters the companion spaces in the infant’s mind—simultaneously she becomes both mother
and their shared musical art (Dissanayake 2000). Infants are born expecting not only an affective
talking mother, but even more a singing mother, sharing personal emotions, cultural meanings
and synrhythmic modes of coexistence. They share a critical moment (Stern 2004), where nature
and culture prove their indivisible synrhythmia and its enormous role in evolution of the species
and ontogenesis (Gratier and Danon, Chapter 14, this volume). Microanalysis of the videos and
spectrographic analysis of mother–infant singing showed us that mothers in Crete sing various
kinds of song, such as ballades, modern songs, invented songs, songs following traditional forms,
and mostly baby songs. Infants were excited by their mothers’ singing. They often recognized the
melody, watched, and anticipated the musical changes by joining in rhythmic vocalizations and
body movements. The development of mother–infant singing seems to generate a dialogical
form based on the rhythmic phrase unit of mother’s singing, which matches the natural
rhythmic intuitions of the infant’s mind (Mazokopaki 2007).
We have proposed that for the infant, there are no such separate things as ‘my imitative ability’,
‘my rhythmical ability’, or ‘my arithmetic ability’ (Tsourtou and Kugiumutzakis 2003;
Kugiumutzakis 2007). It is we, the researchers, who divide the infant mind for methodological
and other reasons. The developing sense of Self is cohesive, in the infant and in communication,
from the start, unless the child is unwell and distressed, when confused and fragmentary behaviours
may appear. Imitation, rhythms, rhythmic imitations, melodorhythmic structures and
lyrical music (Brown 2000; Miller 2000; Molino 2000; Theodorakis 2007) are, all together, basic
talents for the growth of music, language, ‘mind reading’, empathy, sympathy and intersubjectivity
in a child, as they certainly were in the early evolutionary stages of the process of hominization,
called by Merlin Donald (1991) ‘mimetic culture’. Mousiké in ancient Greek culture means
melody, movement, word/talk (logos), rhythm, dance and poetry, and their ‘common denominator
is pulse-based rhythmicity’ (Merker 2000, p. 320), with the rhythm revealing the brain’s capacity
‘for sequencing complex movements reliably’ (Miller 2000, p. 340; see also Richman 2000). Young
infants possess a natural bias to attach to cardinal musical features like ‘good’ rhythms (Trehub
2000), but at the same time they combine rhythmical expressions with sympathetic imitation,
and, we find, the discrimination of small numbers and small-integer ratios (Tsourtou
and Kugiumutzakis 2003). It is the simultaneous ‘co-appearances’ of many human abilities in
infancy that show the way future research should approach the fate-spinning phenomenon of
psychological intersubjective synrhythmia.1
1 Trevarthen distinguishes the physiological regulation from psychological regulation in the intimate
relations developing between a mother and her fetus and infant. Both regulations are functioning before
birth and through infancy (Trevarthen et al. 2006). The first was named amphoteronomos or amphotero-
nomic regulation, and the second synrhythmia or synrhythmic regulation. Amphoteronomic regulation
engages their bodies across frontiers of hormonal and physicochemical traffic with the amniotic fluid and
through the placenta, as well as in close physicochemical contact after birth. The word amphoteronomic
conveys the idea of combined or mutually dependent self-regulations, between the autonomic systems of
two organisms in one physical ‘containment’. It was chosen because the words αμφότεροι (both), άμφω
(both together) and αμφοτέρωθεν (bilateral, two-sided), with νόμος (law: a stable, regular manner of
behaviour of natural, social or personal events) describe with sufficient precision the mutual, unconscious
processes of the two living systems. Synrhythmic regulation engages the emerging psychological motives
in the infant’s mind with the expressions of mind states of the mother. The word syn-rhythmia, or
synrhythmic regulation, was chosen because συν (together with, plus) and ρυθμός (rhythm, regularity
of recurrence, periodicity) convey the idea of the mutual psychosocial regulation of intentions and
experience between mother and infant, using, among other musical modes, their coordinated
rhythms (Kugiumutzakis 2007, p. 368; Trevarthen et al. 2006; Panksepp and Trevarthen, Chapter 7, this
volume).
References
Beebe B, Jaffe J, Feldstein S, Mays K and Alson D (1985). Inter-personal timing: The application of an adult
dialogue model to mother–infant vocal and kinesic interactions. In FM Field and N Fox, eds, Social
Perception in Infants, pp. 249–268. Ablex, Norwood, NJ.
Bjørkvold J-R (1992). The Muse within: Creativity and communication, song and play from childhood
through maturity. Harper Collins, New York.
Blacking J (1969/1995). The value of music in human experience. The 1969 Yearbook of the International
Folk Music Council. (Republished in P Bohlman and B Nettl, eds, 1995, Music, culture and experience:
selected papers of John Blacking. University of Chicago Press, Chicago, IL).
Bråten S (1988). Dialogic mind: The infant and adult in protoconversation. In M Cavallo, ed., Nature,
cognition and system, pp. 187–205. Kluwer Academic Publications, Dordrecht.
Bråten S (1992). The virtual other in infants’ minds and social feelings. In AH Wold, ed., The dialogical
alternative, pp. 77–97. Scandinavian University Press, Oslo.
Brown S (2000). The ‘Musilanguage’ model of music evolution. In NL Wallin, B Merker and S Brown, eds,
DeCasper AJ and Spence M (1986). Prenatal maternal speech influences newborns’ perception of speech
sounds. Infant Behavior and Development, 9, 133–150.
Deva BC (1995). The music of India: A scientific study. Munshiram Manoharlal Publishers, Delhi.
Dissanayake E (2000). Antecedents of the temporal arts in early mother–infant interaction. In NL Wallin,
Donald M (1991). Origins of the modern mind. Harvard University Press, Cambridge and London.
Donald M (2001). A mind so rare. Norton, New York.
Farb P (1978). Humankind – a history of the development of man. Jonathan Cape, London.
Fernald A and Simon T (1984). Expanded intonation contours in mothers’ speech to newborns.
Developmental Psychology, 20, 104–113.
Fernald A, Taeschner T, Dunn J, Papousek M, Boysson-Bardies B and Fukui I (1989). A cross-language
study of prosodic modifications in mothers’ and fathers’ speech to preverbal infants. Journal of Child Language,
16, 477–501.
Books, New York.
Fraisse P (1978). Time and rhythm perception. In EC Carterette and MP Friedman, eds, Handbook of
perception, vol. 8, pp. 203–253. Academic Press, New York.
Fraisse P (1982). Rhythm and tempo. In D Deutsch, ed., The psychology of music, pp. 149–180. Academic
Press, New York.
Grieser DL and Kuhl PK (1988). Maternal speech to infants in a tonal language: Support for universal
prosodic features in motherese. Developmental Psychology, 24, 14–20.
Inayat Khan H (2005). The music of life: The inner nature and effects of sound. Omega, New Lebanon, NY.
Izard CE (1978). Emotions as motivations: An evolutionary-developmental perspective. In Nebraska
Symposium on Motivation, pp. 163–200. University of Nebraska Press, Lincoln, NE.
Kokkinaki T (1998). Emotion and imitation in early infant–parent interaction: a longitudinal and
cross-cultural study. Ph.D. Thesis, University of Edinburgh.
Krumhansl CL (2000). Rhythm and pitch in music cognition. Psychological Bulletin, 126(1), 159–179.
Kugiumutzakis G (1983). Imitative phenomena: a new challenge. MA Thesis, Department of Psychology,
Uppsala University, Sweden.
Kugiumutzakis G (1985). The origin, development and function of the early infant imitation. Ph.D. Thesis,
Department of Psychology, Uppsala University, Sweden.
Kugiumutzakis G (1993). Intersubjective vocal imitation in early mother–infant interaction. In J Nadel and
L Camaioni, eds, New perspective in early communication development, pp. 22–47. Routledge, London.
Kugiumutzakis G (1998). Neonatal imitation in the intersubjective companion space. In S Bråten, ed.,
Cambridge.
Kugiumutzakis G (1999). Genesis and development of early infant mimesis to facial and vocal models.
In J Nadel and G Butterworth, eds, Imitation in infancy, pp. 36–59. Cambridge University Press, Cambridge.
Kugiumutzakis G (2007). Imitation, numbers and rhythms. In G Kugiumutzakis, ed., Universal harmony –
science and music. In honour of Mikis Theodorakis, pp. 235–294. Crete University Press, Heraklion.
(In Greek translation).
Kugiumutzakis G, Kokkinaki T, Markodimitraki M and Vitalaki E (2005). Emotions in early mimesis.
In J Nadel and D Muir, eds, Emotional development, pp. 161–182. Oxford University Press, Oxford.
Kühl O (2007). Musical semantics. European Semiotics: Language, Cognition and Culture No. 7.
Peter Lang, Bern.
Origins and development of musical competence, pp. 3–34. Oxford University Press,
Oxford/New York/Tokyo.
Lerdahl F and Jackendoff R (1983). A generative theory of tonal music. MIT Press, Cambridge, MA.
Lynch MP, Oller DK, Steffens ML and Buder EH (1995). Phrasing in prelinguistic vocalisations.
Developmental Psychobiology, 28, 3–25.
Macleod H, Morse D and Burford B (1993). Computer support for behavioural event recording and
transcription. Psychology Teaching Review, 2(2), 112–116.
1999–2000), 29–57.
Masataka N (1992). Pitch characteristics of Japanese maternal speech to infants. Journal of Child Language,
19, 213–223.
Mazokopaki K (2007). Oi rizes tis musikotitas: I anaptyxi ton epikoinoniakon vrefikon rythmon apo ton 2o eos
ton 10o mina (The roots of musicality: the development of infant communicative rhythms from the 2nd
until the 10th month). Ph.D. Thesis, Department of Philosophy and Social Studies, University of Crete.
Merker B (2006). Why music and whence its harmonies? Paper presented at the Symposium Music and
Universal Harmony in Hersonisos, Crete, 10–11 March, 2006.
Miller G (2000). Evolution of human music through sexual selection. In NL Wallin, B Merker and S Brown,
Mithen S (2005). The singing neanderthals: The origins of music, language, mind and body. Weidenfeld and
Nicholson, London.
Molino J (2000). Toward an evolutionary theory of music and language. In NL Wallin, B Merker and
In I Deliege and J Sloboda, eds, Musical beginnings: Origins and development of musical competence,
Papoušek M (1987). Models and messages in the melodies of maternal speech in tonal and non-tonal
languages. Abstracts of the Society for Research in Child Development, 6, 407.
Papoušek M (1994). Melodies in caregivers’ speech: A species specific guidance towards language.
Early Development and Parenting, 3, 5–17.
Papoušek M and Papoušek H (1981). Musical elements in the infant’s vocalizations: Their significance for
communication, cognition and creativity. In LP Lipsitt, ed., Advances in infancy research, 1, pp. 163–224.
Ablex, Norwood, NJ.
Plato (1963). Politeia (The Republic). Etaireia Hellinikon Ekdoseon, Athens.

Richman B (2000). How music fixed ‘nonsense’ into significant formulas: On rhythm, repetition, and
meaning. In NL Wallin, B Merker and S Brown, eds, The origins of music, pp. 301–314. MIT Press,
Cambridge, MA.
Robb L (1999). Emotional musicality in mother-infant vocal affect, and an acoustic study of postnatal
depression. Musicae Scientiae (Special Issue 1999–2000), 123–153.
Rowell L (1992). Music and musical thoughts in early India. University of Chicago Press, London.
Sacks O (2007). Musicophilia: Tales of music and the brain. Random House, New York/Picador, London.
Shetler DJ (1990). The inquiry into prenatal musical experience. In FR Wilson and FH Roehmann, eds,
Music and child development, pp. 44–62. MMB Music, Saint Louis, MO.
Sifakis GM (2001). Aristotle on the function of tragic poetry. Crete University Press, Herakleion.
Wiley, New York.
Stern DN (1992). L’enveloppe prénarrative: Vers une unité fondamentale d’expérience permettant d’
explorer la réalité psychique du bébé. Revue Internationale de Psychopathologie, 6, 13–63.
Stern DN (1993). The role of feelings for an interpersonal self. In U Neisser, ed., The perceived self: Ecological
and interpersonal sources of self-knowledge, pp. 205–215. Cambridge University Press, New York.
Stern DN and Gibbon J (1980). Temporal expectancies of social behaviours in mother–infant play.
In E Thoman, ed., Origins of infant social responsiveness, pp. 409–429. Erlbaum, New York.
mother and infant by means of inter-modal fluency. In T Field and N Fox, eds, Social perception in
Stern DN, Spieker S and MacKain K (1982). Intonation as signals in maternal speech to prelinguistic
infants. Developmental Psychology, 18, 727–735.
Stevens J (2002). Applied multivariate statistics for the social sciences, 4th edn. Lawrence Erlbaum, Mahwah, NJ.
Theodorakis M (1983). I anatomia tis musikis (The anatomy of music). Gnoseis, Athens.
Theodorakis M (2007). Sympantiki armonia (Universal Harmony). In G Kugiumutzakis, ed., Universal
harmony – science and music. In honour of Mikis Theodorakis, pp. 75–102. Crete University Press, Heraklion.
Thorpe LA and Trehub SE (1989). Duration illusion and auditory grouping in infancy. Developmental
Thorpe LA, Trehub SE, Morrongiello BA and Bull D (1988). Perceptual grouping by infants and preschool
children. Developmental Psychology, 24, 484–491.
Trainor LJ (1996). Infant preferences for infant-directed versus non-infant-directed playsongs and lullabies.
Infant Behavior and Development, 19, 83–92.
Trainor LJ and Trehub SE (1992). A comparison of infants’ and adults’ sensitivity to Western musical
structure. Journal of Experimental Psychology: Human Perception and Performance, 18, 394–402.
Trehub SE (1990). Human infants’ perception of auditory patterns. International Journal of Comparative
Trehub SE (2000). Human processing predispositions and music universal. In NL Wallin, B Merker and
Trehub SE (1987). Infants’ perception of musical patterns. Perception and Psychophysics, 41, 635–641.
Trehub SE and Thorpe LA (1989). Infants’ perception of rhythm: Categorization of auditory sequences by
temporal structure. Canadian Journal of Psychology, 43(2), 217–229.
Trehub SE, Endman MW and Thorpe LA (1990). Infants’ perception of timbre: classification of complex
tones by spectral structure. Journal of Experimental Child Psychology, 49, 300–313.
Trehub SE, Thorpe LA and Trainor LJ (1990). Infants’ perception of good and bad melodies.
Psychomusicology, 9, 5–19.
Trehub SE, Unyk AM and Trainor LJ (1993a). Adults identify infant-directed music across cultures. Infant
Behavior and Development, 16, 193–211.
Trehub SE, Unyk AM and Trainor LJ (1993b). Maternal singing in cross-cultural perspective. Infant
Trehub SE, Unyk AM, Kamenetsky SB, Hill DS, Trainor LJ, Henderson JL and Saraza M (1997). Mothers’
and fathers’ singing to infants. Developmental Psychology, 33, 500–507.
Trevarthen C (2001). Intrinsic motives for companionship in understanding: Their origin, development
and significance for infant mental health. International Journal of Infant Mental Health, 22 (1–2),
95–131.
In R MacDonald, DJ Hargreaves, and D Miell, eds, Musical identities, pp. 21–38. Oxford University
Press, Oxford.
Trevarthen C (2003). Making sense of infants making sense. Intellectica: Revue de l’ Association pour la
Recherche Cognitive, 2002/1, 34, 161–188.
Trevarthen C (2004a). How infants learn how to mean. In M Tokoro and L Steels, eds, A learning zone of
one’s own, pp. 37–69. IOS Press, Amsterdam.
Trevarthen C (2004b). Learning about ourselves, from children: Why a growing human brain needs
interesting companions. Research and Clinical Centre for Child Development, Annual Report 2002–2003,
26, 9–44. Graduate School of Education, Hokkaido University.
Trevarthen C (2005). Action and emotion in development of cultural intelligence: Why infants have
feelings like ours. In J Nadel and D Muir, eds, Emotional development, pp. 61–91. Oxford University Press, Oxford.
Trevarthen C (2007). Harmony in meaning: How infants use their innate musicality to find companions in
culture. In G Kugiumutzakis, ed., Universal harmony – science and music. In honour of Mikis Theodorakis,
pp. 353–410. Crete University Press, Heraklion. (In Greek).
Trevarthen C and Aitken KJ (2003). Regulation of brain development and age-related changes in infants’
Trevarthen C, Aitken KJ, Vandekerckhove M, Delafield-Butt J and Nagy E (2006). Collaborative regulations
of vitality in early childhood: Stress in intimate relationships and postnatal psychopathology.
In D Cicchetti and DJ Cohen, eds, Developmental psychopathology, volume 2, Developmental
neuroscience, pp. 65–126, 2nd edn. Wiley, New York.
The Nordic Journal of Music Therapy, 9(2), 3–17.
Trevarthen C and Marwick H (1982). A method for analyzing mother–infant communication.
In C Trevarthen and H Marwick, Cooperative understanding in infants. Unpublished Project
Report to Spencer Foundation, Chicago.
Trevarthen C, Powers N and Mazokopaki K (2006). Investigating the rhythms and vocal expressions of
infant musicality in Crete, Japan and Scotland. In M Baroni, AR Addressi, R Caterina and M Costa, eds,
Proceedings of the 9th International Conference on Music Perception and Cognition (ICMPC9), Bologna,
Italy, August 22–26, 2006.
Tronick EZ (2005). Why is connection with others so critical? The formation of dyadic states of consciousness:
coherence governed selection and the co-creation of meaning out of messy meaning making. In J Nadel
and D Muir, eds, Emotional development, pp. 293–315. Oxford University Press, Oxford.
Tsourtou V and Kugiumutzakis G (2003). Anaptyxiakes taseis stin proimi arithmitiki ikanotita
(Developmental tendencies in early arithmetic ability). Psychologika Themata, 9(1), 24–54.
with special reference to music perception and performance. Musicae Scientiae (Special Issue
1999–2000), 13–28.
Chapter 10
Voices of shared emotion and meaning:
Young infants and their mothers in Scotland and Japan
Niki Powers and Colwyn Trevarthen
10.1 The journey to meaning in calls of relating

A journey of cultural discovery has already begun when an infant is born—from the inborn
powers of human expression and experience within the growing Self toward mastery of habits
that have been cultivated by generations of forebears; they who have learned and taught their
knowledge and skills so that their children may assimilate a history of many Others’ lives and
meanings (Gomes-Pedro 2002). To discover what other people know, a child has to participate in
feelings with them, not just exchange information about the world they are in together (Halliday
1975; Jahoda and Lewis 1988; Bruner 1996; Rogoff 2003; Rogoff et al. 2003).
After many years of close observation, we now know that a baby starts the journey as an innately
musical/poetical being, moving and hearing with pulse and rhythm, immediately sensitive to the
harmonies and discords of human expression, in the Self and in companionship with close
Others (Miall and Dissanayake 2003). Infants have an intuitive capacity for sharing implicit emo-
tional meaning in the rituals of human relating (Bateson 1979; Stern et al. 1999). Mothers, too,
are specially equipped in body and brain to meet the creative vitality of this elementary human
being in affectionate and highly vocal intimacy (Klaus and Kennell 1976; Papoušek and Bornstein
1992; Papoušek 1996; Stern 1995, 2000). All affectionate humans, but especially mothers, are
inspired by love and caring made manifest in their musical and poetic vocalizations and gestures
when they play with an infant (Dissanayake 2000). As the baby uses innate wisdom to acquire
more explicit learned understanding, following impulses to be part of the meaning of life with
other humans, the voice of a sociable Self is discovered, and this vocal personality is joined by the
voices of those special Others who share the way to the speaking of a particular language
(Bullowa 1979; Gratier and Trevarthen 2007; Gratier and Danon Chapter 14, this volume).
Long before they can speak, infants begin adapting to the parental culture, learning simple
habits of expression, and the family responds, giving objects and actions a clear shareable sense
for the learner by offering rhythmic participation in rituals and tasks. Adults, provided they are
not stressed and unsure in themselves, are naturally ready to teach their ideas and methods to
young children. Indeed, toddlers can assist the infant’s sociocultural learning as the baby and
their brother or sister share the rhythms and emotions of communication in affectionate playful
ways. The journey between nature and culture is navigated with innate human sympathy for a
peculiar inventiveness of human moving and human feelings. Its passage is made easy and joyful,
or more difficult and even dangerous and painful, by the emotions on both sides—in the child
and in their companions—emotions for which the human voice is a versatile instrument of
expression and of Self-and-Other awareness (Panksepp and Trevarthen Chapter 7, this volume).
We have studied the emotions expressed in the sounds of voices of infants and adults, and how
the pitch and durations of these sounds change as the infant attends to the habits of sympathetic
older persons and becomes aware of the common sense in the talk around them. We report here
an investigation of the expressive features of extended voice sounds in two cultures. We start by
outlining the nature of infant intersubjectivity, then review evidence that regulation of communication
between mothers and young infants is mediated substantially by intuitive emotions that
both express in their vocalizations, especially in vowel-like vocalizations or sustained phonations.
Finally, we describe analyses of voice sounds made by mothers and young infants playing in
Japan and Scotland, and experiments undertaken in Scotland to identify acoustic features of their
voices that vary with emotion. Thus we explore basic dimensions of the musicality of human
communication many months before language develops, and test the idea that culture-related
patterns, and patterns of individuality, may arise through the infants’ adaptation to their partic-
ular human environment and through the differences they encounter in the adults’ manners of
speaking and playing.
10.1.1 Vocal expression of feelings and the nature of infant intersubjective communication
Do the expressive forms of vowels in the mother tongue regulate intersubjectivity with an infant?
This question may seem straightforward, but it has complex implications that still hold mys-
teries in spite of several decades of intensive research. It assumes that when an infant interacts
with a parent they do so as two conscious intending subjects—that is, the infant as well as the adult
is conscious and purposefully attending and responding to regulate their own affective state and
to maintain mutual involvement in an engagement of motives with the Other. The answer to this
question relies upon a theory of the intuitive subjective and intersubjective processes that would
be necessary for such a mutual regulation of activity and consciousness to occur (Trevarthen
1979, 1998, 2004a; Trevarthen and Reddy 2007).
There are five core concepts in the theory of how intersubjectivity develops (Aitken and
Trevarthen 1997; Trevarthen 1998; Trevarthen and Aitken 2001, 2003):
1 A notion of the background self-regulation of physiological state, arousal of voluntary activ-
ity and level of attention of the infant.
2 A model of the motives of both the infant and caregiver as individual subjects—of how their
impulses integrate and coordinate their body movements and facilitate selective awareness by
focusing and directing coherent attention.
3 An assumption that stimulation of active interest and awareness, and emotions of ‘seeking’
are necessary for learning.
4 A theory that between immature infant and caregiver there is an intersubjective attachment
motivation—a shared affectionate concern for each other’s well-being that tends to bring
both parties together in pleasurable or loving relationship, and causes them to feel distress at
separation.
5 An assumption of a further intersubjective companionship motivation, activating interest and
pleasure in shared experiences and goal-directed ‘seeking’ activities in relation to the world of
places, things and other persons.
Research over the last few decades has found that, beyond being surprisingly clever in perceiving
and acting on their own on non-living physical events and objects outside their bodies (e.g. Gopnik
et al. 1999), infants perceive people as living partners and respond in highly adaptive coordinated
and directed ways, communicating expression, imitating intentions and sharing consciousness and
emotions (e.g. Bateson 1971; Bullowa 1979; Trevarthen 1978, 1979, 1994, 1998; Stern 1974, 1977,
2000; Papoušek and Papoušek 1981; Papoušek and Bornstein 1992; Legerstee 1992; Nadel and
Butterworth 1999; Trevarthen and Aitken 2001; Beebe and Lachmann 2002; Nadel and Muir 2005;
Tronick 2005; Mazokopaki and Kugiumutzakis Chapter 9, this volume). Infants know when they
are the focus for communicative behaviour, and are affected by it. They are sensitive to the shifting
emotions of the other person and they express their own motivations and feelings toward them
(Donaldson 1978, 1992; Trevarthen et al. 1981; Trevarthen 1984; Stern 2000; Legerstee 2005). All
this is evidence for the theory that infants possess innate intersubjectivity.
However, having recognized that infants gain a special consciousness of the motives and
feelings of other persons from perceiving their movements as motivated action (Legerstee 2005;
Trevarthen and Reddy 2007), we realize that the fundamental features of behaviour that support
this process are far from clear. We can only begin to make an account of the essential components,
which are rooted in the way a human body is formed and how it moves and thinks under the
control of a Self–Other-sensing human brain (Thompson 2001; Bråten 2007).
It appears that parent–infant communication is greatly facilitated by the mutual coordination
of rhythmical temporal patterning, or ‘kinematics’ of vocalizations and gestures (Trevarthen
1986). This is what makes such communication musical and poetic (Miall and Dissanayake
2003). It also depends on discrimination of particular physiognomic forms made by expressive
movement of different parts of the body, for example facial expressions, vocalizations and move-
ments of the fingers and hands. Newborn infants have a remarkable sensibility for such expres-
sions; they can imitate them as separate and different forms of action (Kugiumutzakis 1998;
Meltzoff and Moore 1999; Nagy and Molnár 2004; Mazokopaki and Kugiumutzakis Chapter 9,
this volume). There are innate mechanisms that facilitate awareness of other people’s emotions
by sensitivity to the timing of their expression, and by matching the anatomical forms of differ-
ent movements that are adapted to signal many different emotions (Darwin 1872; Bråten 1998,
2007; Trevarthen 1986, 1999; Nadel and Butterworth 1999). It is important to emphasize that
infants’ imitations are conscious psychological actions—they are intentional from birth, being
guided by their effects, attentive to the contingency and form of their partners’ responses, and
regulated by emotions (Kugiumutzakis 1993; Nagy and Molnár 2004; Reddy and Trevarthen
2004; Kugiumutzakis et al. 2005; Trevarthen 2005a; Trevarthen and Reddy 2007; Mazokopaki and
Kugiumutzakis Chapter 9 and Marwick and Murray Chapter 13, this volume).
These mechanisms of sympathetic engagement of minds, which work between persons and are formed by
innate motivational, emotional, sensory-motor and intersubjective systems (Panksepp 1998;
Trevarthen 2001a), are adapted to come to life and function within a cultural framework of
meanings, and they facilitate learning of ‘how to behave’ in meaningful ways. They become elaborated
or enriched in structure and significance, without losing their basic dynamic emotional features.
Different human communities have different ways of cultivating or evaluating these primary acts
of communication, leading the communications to differentiate into many subtle elements and
forms. Infants’ motives and emotions are adapted to take up and evaluate the special behaviours
of other persons, so they soon come to anticipate certain signs and rituals of play and joint
experience, and connected patterns of these, that are special to their parents’ culture (Mundy-
Castle 1980; Trevarthen 1988, 2004b).
10.1.2 Vowels as key elements in the communication of emotion and meaning by sound
Many animals make vocal sounds—to announce where they are, to indicate their identity and
social status, their vigour and attentiveness, their health and reproductive fitness, their amiability,
fear or aggressiveness (Cheney and Seyfarth 1990; Papoušek et al. 1992; Payne 2000; Wallin et al.
2000; Manning 2004; Merker Chapter 11, this volume). They use their voice to establish, coordi-
nate and regulate encounters with members of their ecological community, of both their own
and other species.
Animal sound is a rich source of proprioceptive stimulation for regulation of the Self,
monitoring of the muscular actions of the body in engagement with the world and keeping track
of vital energy needs. It offers a bridge from one body to another for intimate exchange of the
experience of motive states, with a power to excite physically that is as direct as touch.
Vocalizations, especially, have evolved to transmit the primary affective information that
regulates social encounters (Panksepp and Bernatzky 2002). In rough and tumble play between
juveniles experimenting with how to move with one another, sounds of joy or distress are made that
express how physical contacts are experienced (Panksepp and Trevarthen, Chapter 7, this
volume).
In this world of animal sound the human voice is exceptionally rich in what it can communicate.
Its tone and texture or quality, whether in natural spontaneous conversation and singing or in
cultivated oratory and operatic performance, identifies the person and signals their sex and age,
their state of vitality and well-being or fatigue and sickness, and their intersubjectivity (Karpf
2006). The rhythms, loudness, pitch, timbre and melodies of vocalization express the interper-
sonal or moral emotions of affection and love, dislike and anger, admiration, jealousy, and pride
and shame; the cognitive or investigative emotions of interest and curiosity, confidence, determi-
nation or fear; and the aesthetic self-regulating emotions of appetite and pleasure and comfort, or
disgust and pain (Trevarthen 1993).
Paradoxically, by a process of rational reduction, the Self-regulating emotions are often
assumed to be primary in psychology, and they are taken to lead to the differentiation or learning
of more complex cognitive emotions. Acquisition of the interpersonal, moral emotions is then
assumed to depend on the other two, the primary and the cognitive, and thus on a considerable
period of social experience and training, notwithstanding their obvious adaptive value for a
dependent infant with intelligence adapted to seek social support and knowledge. An alternative
view sees basic complex emotions as of fundamental importance in the regulation of human
awareness (Draghi-Lorenz et al. 2001). We expect them to be expressed by infants and to be
appreciated by them (Reddy and Trevarthen 2004).
Theories of musicality in parent–infant communication (Papoušek 1996; Malloch 1999;
Trehub and Nakata 2002) identify dynamic parameters of vocalizations that favour intersubjec-
tive awareness and its emotional regulation from early in life. Temporal and acoustic features of
infant and parent vocal interactions foster a sympathetic coordination, enabling subjective feel-
ings inside one subject’s body to ‘touch’ and ‘move’ others. The dynamics of the voice are the
dynamics of breathing and of the transforming resonance of vocal organs. That is what is ‘felt’ in
vocalizations. The physiognomic features of voicing cause the sounds to have different emotional
qualities.
Infants’ sounds and those of parents speaking affectionately or disapprovingly to infants
tend to have a slow rhythm and a singing expression, with many drawn out dynamic call-like
elements (Trehub 1987, 1990, 2003; Fernald 1989, 1993). It seems likely that sustained phona-
tions are especially important, both to mark and coordinate the rhythms and affective tone of
communication, and to shape the drama of emotional narratives. We propose that emotional
information contained in what can be identified as vowel sounds in the natural vocalizations of
infants and mothers at play provides the infant with a psychological measure of their own and
others’ affective states, and a means by which both mother and infant can actively participate in
and regulate psychological interactions (Panksepp and Trevarthen, Chapter 7, this volume).
In sum, there is a wealth of evidence from research of the past 30 years that feelings about their
relationship with other persons are experienced by very young infants, and signalled by them in
movements of face, voice and hands. Furthermore, these actions appear to be reliably understood
by adults as emotions (Papoušek 1996; Powers 2001; Draghi-Lorenz et al. 2001; Legerstee 1992).
We have studied vowel-like sounds of both babies and their mothers to further explore both of
these propositions.
10.2 Background to a research project to test emotions in intuitive
vocal engagements with infants less than 6 months of age
We selected infant vowel sounds to clarify their function in human communication at the stage
before a child has awareness of language. We assume that the messages of speech are carried on a
prosody or musicality that has a fundamental competence to convey interpersonal messages that
are adapted to regulate relationships and the cooperation of mental states, intersubjectively and
emotionally (Fonagy 2001; Kühl 2007; Marwick and Murray Chapter 13, this volume). It is our
intention to clarify the first ‘vocabulary’ of emotion and the melodic ‘syntax’ of communication
that we assume must underlie the more rational and explicit communication by words that is
acquired after infancy. We believe that the primary state of these functions linking imitation,
emotion and learning has not been understood by the dominant cognitive school of psychology
(Mazokopaki and Kugiumutzakis Chapter 9, this volume; and see Merker and Eckerdal, Chapter 11,
this volume, for a discussion of the relationship between animal cries and any ‘musicality’ in
infant vocalizations).
10.2.1 Learning how to speak after knowing how to communicate: the functions of vowels
As the infant becomes aware of the segmentation of speech, and starts to find interest in the
communicative purpose of words that combine elements of vocalization in ever-changing ways
to express an endless variety of interests and purposes, the prosody of parental speech tutors the
infant in how sounds in the vocal stream may be shaped and interrupted by movements of the
tongue, jaws and lips (Kuhl 1983, 1994; Fernald 1989, 1992; Jusczyk 2001; Powers 2001).
Consonants and silences define syllable structure and tempo of speech (Crystal 1997). Sustained
vowel sounds and their modulations are salient features of speech in all languages (Ladd 1996),
and their melody or ‘song’ of expression conveys the emotional quality and intensity of psychological
information about changing motive states, social contacts and relationships. There is
abundant evidence that infants are highly sensitive and responsive to the melody of affectionate
mothers’ talk many months before words make any sense to them (Fernald 1989; Marwick and
Murray Chapter 13, Gratier and Danon Chapter 14, this volume).
Vowels are defined by linguists as speech components that are made with an open vocal tract,
the sound of the voice being usually emitted for between 0.1 and 0.5 seconds, except in especially
strong forms of expression such as calling or singing, when phonation may be sustained for
several seconds. They are varied in pitch, loudness and resonance by changes in breathing, in the
tension of the vocal cords, and in the configurations of the whole vocal tract. Their intonation is
influenced by the preceding and following acts of vocalization, and they convey sequential and
segmental cues to the listener that aid speech perception (Ladd 1996; Kuhl et al. 1997). They also
offer information about the gender, social attitude, psychological state and age of the speaker
(Clarkson et al. 1996; Shimura et al. 1996). Their dynamic features and resonant qualities express
emotions and the purposeful adjustments of intentional energy, as well as serving as carriers for
the intricate articulations that formulate the infinite variety of information in speech. Vowels also
resemble the sounds infants make; before being ‘sculpted’ in speech they develop from the first
quasi-resonant calls of infants (Locke 1993; Goldfield 2000; Oller 1986; Oller and Eilers 1992).
The first emotional vocalizations produced by infants, ‘extended vocables’ (Locke 1994), contain
many features of canonical vowel sounds. We propose that acoustic information in these calls
provides infants with a means by which they can sense their own psychological states, understand
others’ psychological states, begin to mediate in subtle personal relationships, and learn the
cultural meanings that are expressed and formed through the acquired skills of communication
by speech.
Infants’ vocal organs are immature, the vocal centres in the brain are still growing and the oral
articulatory skills are rudimentary. But babies are born with a powerful and delicately modulated
range of coos, calls and cries, and they appear to have highly developed sensitivity for the emo-
tions in others’ voices. A fetus can hear and learn to discriminate the sounds of the mother’s
speech, receiving auditory stimulation from 22 weeks, just past mid-term gestation. Thus, the
baby can later recognize the mother’s speech, showing preferential orienting responses to the
location of her voice at birth (Alegria and Noirot 1978; DeCasper and Fifer 1980; Clifton et al.
1981; Querleu et al. 1984). From birth, many months before they can speak, infants make vowel-
like sounds called ‘quasi-resonant nuclei’ (Oller 1986). Their sounds then develop in range of
expression and in rhythmic regulation.
Languages differ in the way articulations constrain vowels and separate or join them. The
learning of the motor and perceptual skills of speaking and hearing speech for a particular
language is guided by the special ‘musical’ way parents talk to infants, known as ‘infant-directed
speech’ (IDS) or ‘motherese’, and begins to have effects on their hearing and imitations around the
middle of the first year (Fernald 1989, 1993; Papoušek and Papoušek 1981; Papoušek et al. 1985;
Papoušek and Bornstein 1992; Trevarthen 1999; Kuhl 1998). Vowels in IDS of adults demonstrate
hyperarticulation (having extended harmonics or formants), which facilitates their discrimination
by infants, who show a poorer performance on perceptual tasks when harmonic information is
limited (Clarkson et al. 1996). It is possible, however, that infants may be drawn to vowel sounds
not just because they are stimuli rich in features that they find perceptually attractive or salient,
but for the functional intersubjective reason that, as phonologically stressed components of the
parental language, they help the infant to make sense of the changing purposes and feeling in
communications addressed to them and made manifest by all kinds of expressive movements
(Fisher and Tokura 1996). Infants show particular sensitivity for the affect content of infant-
directed speech, which is carried strongly in vowels (Oller 1986; Kitamura and Burnham 1998).
10.2.2 The basic acoustics of learning to speak: the developing musicality of shared vocal expression
The complexity of human communication is such that no one feature of expressive behaviour
can explain it. We express ourselves to receptive others in all the ways we move about, gesticulate
or grimace (Darwin 1872; Siegman and Feldstein 1979; Key 1982; McNeill 1992), but the voice,
driven by the movements of breathing, gives the richest and the most intimate and immediate
information on our inner state of mind and body and how this is changing.
The following parameters of the voice have been identified as important in sustaining
emotional regulations and conscious interest between an infant and an adult.
10.2.2.1 Timbre
About 95 per cent of the energy of vocalization is spent in the production of extended vocables or
vowels, and muscular production and control of this energy in the vocal organs determines the
quality or timbre of the voice. Lapp (2003, p. 11) describes timbre or voice quality, which gives ‘a
flavour’ and identity to sound, as the ‘subtlest of all its descriptors that can be heard’. Timbre, or
spectral complexity, plays a major role in the expression and perception of emotion in speech
and song, as it does for instrumental music (Malloch 1999; Trehub and Nakata 2002).
Importantly, with pitch and loudness, the timbre of the voice signals the intensity of expressed
emotions (Jonsson et al. 2001). Goldfield (2000) describes infants’ exploration of acoustic vowel
space as an emergent system in the infant’s proprioceptive control of movement—the energy (or
resonance) of vowel sounds provides infants with a means by which they can ‘explore their own
actions’ (p. 433).
10.2.2.2 Timing
Malloch demonstrates that a 6-week-old can enter a rhythmic communicative partnership with a
parent, sharing in the regulation of characteristic syllable and phrase elements (Malloch 1999;
Malloch and Trevarthen, Chapter 1, this volume). There are important developments in timing
and expression (pitch variation and features of timbre) in the first 6 months of the infant’s life.
Parents’ speech to infants changes as the infant becomes more alert, more exploratory, and capable
of reacting to and imitating more lively and playful sounds and actions. The talk with the infant
becomes faster and covers a wider range of rhythms and qualities of expression (Malloch 1999),
the parents often imitating the infant’s sounds, following the infant’s developments (Trevarthen
et al. 1999).
10.2.2.3 Narratives
In their ‘conversations’, parents and infants mutually adjust the pulse of their vocalizations and
vary the quality of their expression systematically to produce a narrative lasting tens of seconds
(Malloch 1999), anticipating and regulating the cycles of emotional intensity in what Stern
has called ‘proto-narrative envelopes’ (1985). Jaffe et al. (2001) measured parameters of ‘time
vocalising’, ‘pause’ and ‘switching pause’ to demonstrate how adults slow and regulate or
‘infantise’ the rhythm of speech when talking to an infant. Vocal timing of both adults and
infants changes when they converse with ‘coordinated interpersonal timing’. Thus, infants can
actively regulate their behaviour in time, to respond synchronously or in alternation to parents’
vocalizations. The musical features of a mother’s speech or song tempt the infant to predict and
anticipate narrative patterns, and the special infant-directed speech register provides a percep-
tually attractive and functionally useful means by which infants can begin to segment the
speech stream (Werker and McLeod 1989; Cooper and Aslin 1990; Pegg et al. 1992; Trainor 1996;
Rock et al. 1999). Infants are sensitive to changes in the pitch, intonation, timing and rhythm of
speech before the sense of words is appreciated (Locke 1993; Fassbender 1996; Trehub 1990;
Trehub et al. 1997).
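The timing parameters named above can be illustrated with a short sketch. It is not the procedure of Jaffe et al. (2001); it only assumes the commonly used definitions, in which a 'pause' is a joint silence between two vocalizations by the same speaker and a 'switching pause' is a joint silence between vocalizations by different speakers. The speaker labels, the `Vocalization` record and the function `timing_summary` are hypothetical names introduced for illustration, and simultaneous vocalizing is simply ignored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vocalization:
    speaker: str   # hypothetical label, e.g. "mother" or "infant"
    start: float   # onset in seconds
    end: float     # offset in seconds


def timing_summary(vocalizations):
    """Sum vocalization, pause and switching-pause durations in seconds."""
    ordered = sorted(vocalizations, key=lambda v: v.start)
    summary = {"vocalization": 0.0, "pause": 0.0, "switching_pause": 0.0}
    for v in ordered:
        summary["vocalization"] += v.end - v.start
    # Compare each vocalization with the one that follows it in time.
    for prev, nxt in zip(ordered, ordered[1:]):
        gap = nxt.start - prev.end
        if gap <= 0:
            continue  # overlapping or abutting sounds: no joint silence
        # Same speaker resumes: pause. Other speaker takes over: switching pause.
        key = "pause" if prev.speaker == nxt.speaker else "switching_pause"
        summary[key] += gap
    return summary


# Invented example exchange (times in seconds), for illustration only.
example = [
    Vocalization("mother", 0.0, 1.5),
    Vocalization("mother", 2.0, 3.0),
    Vocalization("infant", 3.5, 4.0),
    Vocalization("mother", 5.0, 6.0),
]
result = timing_summary(example)
```

In this invented exchange the silence inside the mother's turn counts as a pause, while the silences on either side of the infant's call count as switching pauses. Comparing such totals between infant-directed and adult-directed talk is one simple way to make the slowing of speech rhythm measurable.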
10.2.3 The function of emotions in relationships and cultural learning
Within a few months after birth the infant learns to identify a particular individual as an attach-
ment figure, responding to the sensitivity and consistency of the care they receive from that
person. Brain systems motivating the infant’s activity, attention and emotions enlist maternal sup-
port for physiological well-being, psychological experience and development (Hofer 1987; Kraemer
1992; Panksepp 1998; Panksepp et al. 1997; Porges 2005), helping the infant form working models
of parental support (Bowlby 1988; Schore 1994). ‘Intrinsic motives’ regulate the growth of psycho-
logical experience through action on the environment and special sensitivity to the emotions of
human expression guides parenting (Trevarthen 2001b; Trevarthen and Aitken 1994).
Extended vocal sounds allow the infant to be actively and emotionally involved with other
people nearby, even when they are out of sight, helping them build internal representations of the
highly significant dimensions of affective and psychological experience (Bowlby 1988), and soon,
through pleasurable and creative relationships of ‘companionship’ with persons of all ages, to
represent meaningful knowledge and skills (Trevarthen and Hubley 1978; Trevarthen 1984,
2005b). These representations are maintained by the infant’s awareness of his or her own actions,
the shared situation, and contingent behavioural feedback from others. At first, in intimate spon-
taneous play, the parent responds to the affective expression of the infant, and the infant (pro-
vided the parent is attentive and ‘sensitive’ or sympathetic) can actively regulate the interaction to
maintain emotional closeness. These first emotional representations of ‘moments’ of close
engagement are the very beginnings of the creation of self-esteem and emotional well-being
(Stern 1985, 1990, 2000, 2004).
Vygotsky (1962) describes human speech as developing within ‘affective, expressive vocal
reactions’ (p. 40). However, focused as he was on cultural learning, he states that although
emotional aspects of this development are a means to make ‘psychological contact’ they are, he
believes, ‘far removed from intentional, conscious attempts to inform or influence others’ (ibid.).
However, we believe there are several important features of infant behaviours that reveal a
natural link between intuitive emotional factors and conscious intentions to communicate
experiences.
When even a very young infant imitates expressions of an adult in interactions, or makes
complementary expressions, they appear to do so in a consciously controlled way (Nagy and
Molnár 2004; Trevarthen 2005a; Tronick 2005; Trevarthen and Reddy 2007), evidently attempting
to ‘understand’ and learn from what may be described as a sense of ‘the correspondence between
innate brain representations of the “self” and “other” as potentially equivalent’ (Trevarthen and
Aitken 1994, p. 599). Their communicative engagements have been described as ‘the cradle of
thought’ (Hobson 2002). Donald (2001), considering the evolution of the human mind, suggests
that through mimesis of communicative actions, ‘interlinking the infant’s attentional system with
those of other people’ (p. 255), infants can formulate participative routines that become more
complex over time. Reciprocal imitation would appear to be the essential foundation for com-
munication and learning of even very simple experiences and skills (Meltzoff 1985; Meltzoff and
Moore 1999). Normally developing infants are soon skilled at participating with high affect in
traditional or improvised games and songs with persons they have come to know well, and they
show strong emotional investment in performance of what they have learned (Trevarthen 2002;
Eckerdal and Merker Chapter 11, this volume).
10.2.4 The growth of innate sympathy in co-consciousness: problems with research on infants’ emotions
Notwithstanding the revolution in scientific understanding of the innate motives for human
sociability and a common sense of shared purposes and concerns, there remains a stubborn
theoretical resistance to any claim that young infants feel and can express emotions, or that they
have sensitivity for mental states of interest, intention and feelings active in other persons. Any
capacity in an infant for the interpersonal or moral feelings that are of primary importance in
regulation of responsible social life, or of the aesthetic emotions that aid both recreation and
sharing of meaning, has been denied on the dubious grounds that such emotions need to be
reasoned about by learned cognitions, or codified and interpreted by language.
Draghi-Lorenz et al. (2001) review theories of the development of emotion in infants, and record
a ‘widely agreed’ classification separating ‘basic emotions’ (usually interest, disgust, joy, distress,
anger, sadness, surprise and fear) and ‘non-basic’ emotions (shame, embarrassment, coyness, shyness,
empathic concern, sadism, guilt, jealousy, envy, pride, contempt, gratitude, etc.). The first basic set are
believed to develop in infancy, the non-basic emotions appearing in the second year, or later. It is
evident that what are assumed to be basic or primary emotions are conceived as reactive states in an
individual who is provoked by stimuli. This conventional theory of emotional development takes it
for granted that complex cognitive representations of interpersonal experience and social training
are required for the second set. There are psychobiological grounds for questioning this theory, and
observation of behaviours of infants when they are in normal engagements with other people
strongly supports the view that it is invalid. If infants’ emotions rely on immediate representation
or awareness of other persons and their emotions, and not on learned understanding of the other
and instruction in social conventions, then non-basic emotions may be found in some form from
birth, and they at first may be essentially the same in different cultures.
Psychobiological evidence indicates that ‘the experience of affect reflects a more ancient form
of consciousness than that which sub-serves most of our cognitive abilities’ (Panksepp 2001,
p. 14) and that emotion and cognition are integrated processes dependent on each other, but still
discrete and separately measurable. Emotional experience modulates motives as subconscious
and purposeful processes of a coherent agent or Self (Ochsner and Barrett 2001). Perception
informs purposeful processes that are evaluated by emotion, and cognitive representations,
executive strategies and interpretative schemes are formed from, not causes of, this consciousness
of the self in action, and the intuitive emotions regulate communication with other people.
10.2.5 Receiving/interpreting infants’ calls and promoting the growth of meaning: adults’ responses to infants’ expressions
Much of the research on what adults perceive in infant vocalizations has focused on sounds of
distress, assuming that infants vocalize just to get care or comfort. However, some studies have
looked at what experienced adults (i.e. parents and child care professionals) can consistently
understand from a wider range of happier sounds that infants produce. Shimura et al. (1996)
coded infant vocalizations as either positive or negative and found that parents, nursery staff,
students and even young children aged 2–3 years could correctly match the coding
for vocalizations produced by 2-month-old infants expressing comfort, discomfort, or pleasure.
Again, the aim was to study emotions as manifestations of states within the Self.
Papoušek (1992) found that fathers, mothers, speech therapists and 8-year-old children could
reliably match the emotional content of 2-month-old infants’ vocalizations. The infant vocaliza-
tions were previously coded on scales for comfort/joy and discomfort/cry and Papoušek found that
all participants could distinguish between comfort and discomfort and they often made judge-
ments about the level of intensity of the emotion they perceived. The evidence that adults can
consistently perceive and identify a range of emotions in the sounds that infants make supports
the hypothesis that infants are signalling subtle information about their psychological state—that
they are actively expressing complex interpersonal emotions. Furthermore, if adults can consis-
tently judge the emotions indicated by infant vowel sounds, it is probably because they feel a
sympathetic emotional response to what they hear. The infants’ emotions expressed during
exchanges with adults will also indicate how they feel about the engagement, not just how they
feel within themselves.
Our research reported here shows that adults do experience a wide variety of emotional
responses to infant vocalizations, and that they make interpretations of them as the baby’s
expressions of human social feelings, and there is published evidence supporting this finding.
Mechthild Papoušek (1992) found that American and Chinese mothers expressed ‘intuitive
didactic caregiving tendencies’ (p. 243) in response to infant vocalizations coded as neutral,
comfort, joy, discomfort or cry. She divided mothers’ vocal responses into categories of
reward/greeting, encouraging a turn, encouraging imitation, evaluating infant state, reassuring of
mothers’ presence, readiness to intervene, soothing or discouraging. Although there were some
differences, mothers of both cultures tended to differentiate their response to match the emotion
expressed in the infant vocalization. Papoušek (loc. cit.) concluded that adults do attribute
meaning to the communicative sounds of infants, and that this has the evolutionary advantage of
increasing the likelihood of security for the infant.
We propose that emotion between mothers and infants is expressed in the variation of pitch,
timbre and rhythm of extended vocables and vowel sounds, which convey information about the
moment-to-moment dynamics of intrinsic motives, including motives for human relating. The
sounds are energized by breathing and modulated by the complex muscular control of the
resonant spaces of the vocal tract and articulatory movements that constrict or interrupt the air
flow. The dynamic feature of pitch modulation in the vowel sounds of IDS transmits the affective
state of the speaker, which clearly serves to maintain the infant’s interest in familiar games and
personal ‘proto-conversational narratives’. Comparable features of extended vocables express
the infant’s affective state. The two sides of the interpersonal process regulate the psychological
experience of the infant, and lay the foundation for emotional well-being and later cognitive
development of the child (Stern 1999, 2000).
10.3 Differences of emotional or interpersonal culture: Japan and Scotland
Emotions in human communication have universal neurobiological foundations (Holstege et al.
1996; Panksepp 1998), but they are adapted to be modified through learning in use (Harris 1994;
LeDoux 2002), and the controlled expression of feeling and cultural rules for appropriate display
of emotion in different social situations and in different relationships soon influence infant
behaviours (Gratier and Danon Chapter 14, this volume). The Japanese and English languages
have conspicuous differences in rhythm and syllable structure, and in the range of vowel sounds
(Ladd 1996). The Japanese mora, the expressive element that determines the weight or stress of
a syllable, is composed of vowel sounds which are varied in duration and pitch and their
combinations, in much more subtle ways than is the case for syllables in the stress-timed
language of English, to convey different meanings (Cutler and Takeshi 1999; Kozasa 2002, 2004).
The Japanese and Anglo-American cultures also differ greatly in social customs, moral attitudes
and educational aims (Koizumi 1989; Bierhoff 2002).
Societal attitudes towards emotional expression governing group bonding and shared
behaviour lead to ‘deep enculturation’ of the infant (Donald 2001, p. 256). In Japanese society,
emotional cues are strongly controlled, and there are clear rules for how people relate to each
other, depending on degree of familiarity, relative social position and whether the interaction
takes place in public or private. Similar rules exist in Scotland and, indeed, in all societies, but in
Japan there are traditional social constraints on how emotion is expressed in specific settings
dependent upon different interpretations of life experience. This is exemplified by the traditional
Japanese principle of Kokoro. Kokoro defines a philosophy of the integrated or coherent meaning
of human nature. It considers ‘heart’, ‘mind’ and a ‘sense of knowing’ to be inseparably linked,
in contrast to the ‘mind–body’ dualism of Western thinking. This belief shapes Japanese ideas
about child rearing, affects the ways in which emotional expression is communicated to infants,
and informs the way in which infants are conceived as perfectly complete human ‘spirits’
(Nakano 1997).
Japanese mothers adopt a different attitude to their infants in comparison to Anglo-Saxon
mothers, giving greater value to interpersonal aspects of the baby’s behaviour and needs, and less
attention to the baby’s cognitive interest in objects and events or their ‘intelligence’. They direct
attention to different topics and activities with their infants, and their vocal expression is differ-
ent (Shimura and Imaizumi 1995; Bloom and Masataka 1996). An important virtue in Japanese
society is to extend positive sympathy or ‘kind consideration’ (Nakano 1997). Lewis (1995), for
example, found that an important part of elementary education is to ‘minimise competition and
help children develop the feeling that we’re all in it together’ (p. 7). In accord with Kokoro, this
belief stresses the importance of harmonizing emotionally with others and of knowing what it means to
be a kind, responsible member of a school community. It reflects the higher value given to collec-
tive prosocial behaviours and their cultivation in Eastern cultures, in comparison with the
importance given to individual self-expression and achievement in the West (Bierhoff 2002).
Bornstein et al. (1992) found that while mothers in France, Japan and the United States showed
a number of similarities in the way they interact with 5-month-old infants, culturally specific
patterns of responsiveness were evident. All imitated ‘non-distressed’ vocalizations produced by
their infants, all behaved in a nurturing way when their infants were distressed, and in all
three cultures mothers encouraged their infants to explore the environment. Differences lay
primarily in the ways in which mothers followed their infant’s gaze. American mothers responded
more often to their infants by directing their infants’ attention to the environment than the
mothers in France and Japan. Japanese mothers responded more frequently when their infants
engaged in social looking and more often by looking at their infants’ faces than either American
or French mothers. Masataka (1993) found that Japanese mothers were particularly expressive
with their voices, using rising pitch contours frequently when speaking to their infants, and the
shape of the contours correlated with the mother’s communicative intention, rising to encourage
and attract their infants, or falling to express concern. These intonational features are particularly
important in Japanese speech to differentiate between words of different meaning (Kozasa 2004).
10.4 Measures of emotion in voices of mothers and infants, and their uses in play
To contribute to a clearer understanding of how acoustic features of communication can regulate
shared psychological experience between infants and mothers, the following goals were set for
our research:
1 To identify the types of vowel sounds produced by mothers and their 4-month-old infants
and how these are related to the parameters that appear to be important for communication
of emotions in natural playful encounters in Japan and Scotland.
2 To investigate how infants in Scotland respond to different emotional tones in their mother’s
speech, and how adults, male and female, parents and non-parents, understand or respond to
the emotional content of infants’ vowel sounds.
10.4.1 Similarities and differences in acoustic features and meanings of vowels as mothers play with infants in Japan and Scotland
Six English-speaking and six Japanese-speaking mother–infant dyads, three boys and three girls
in each country, were recorded on digital video in their own homes, with no other family members
present, when the infants were 4 months old (plus or minus 2 weeks), an age when infants
are highly attentive to their mothers’ playful communication and beginning to learn different
patterns of movement and vocal expression, but before the time when they appear to be adapting
their hearing to the special features of speech sounds (Trevarthen and Aitken 2003). Filming was
arranged at a time to suit mother and infant so that we could record at least 15 minutes of inti-
mate communication while infants were alert. Mothers’ ages were similar in the two countries:
mean age was 32.3 years in Scotland and 33.5 years in Japan. All but one of the Japanese mothers
was married and at home with her child full time. Most of the Scottish mothers lived with their
partners and had a full or part-time job. All mothers who worked were on maternity leave at the
time of filming.
We selected for acoustic analysis the sounds that occurred in small games in which a commu-
nicative topic, story or ritual episode was played out through repeating patterns of gesture and
expression. Figure 10.1 shows examples of Japanese and Scottish mothers playing with their
infants during this kind of game. The mother would try to attract the infant’s attention to her
game by pushing the infant’s feet, coordinating the rhythmical movement of her hands on the
infant’s feet with a rhythmical sing-song voice (a); or she would tickle the baby’s hands (b); or
hold the baby in the air and exchange playful voice sounds and face expressions (c and d). The
patterns of the game were repeated in inviting ways as little rituals, sometimes increasing in
tempo to maintain excitement and engagement, and were similar in the two countries.
The recordings from Japan were brought to Scotland where analysis of the corpus from both
countries was carried out. Segments 30 seconds in length, when mother and infant were judged
by the experimenters to be attentive to each other in play, were chosen from the first 10 minutes
of each digital recording, and each of these was subjected to acoustic analysis, using PRAAT
computer software (Boersma and Weenink, www.praat.org) to measure pitch, intensity and
duration for vowels or vowel-like vocalizations of both mothers and infants. We measured the
relationships between these acoustic features of vowel sounds made by the mothers and infants
in Scotland and Japan, and we compared these features to see if they varied in consistent ways
when mothers and infants were engaged versus disengaged in play (Section 10.4.2). The vowels of
Japanese and Scottish mothers were compared to discover if they varied the length, pitch and
intensity of their vowels in different ways to express emotional and motivational messages
reported in this section.
Fig. 10.1 Mothers playing at home with 4-month-old babies, in Japan and in Scotland. Panels (a), (b), (c) and (d).
(See also colour plate 3.)
Vowels were identified by listening to the tapes and supported by inspection of the spectrograms.
The beginning and ending of each vowel were measured using PRAAT. Vowel lengths were
classified as extended (E), over 250 milliseconds, long (L), 151 to 250 ms, and short (S), between
50 and 150 ms.
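
The three duration classes just defined can be expressed as a small rule. The following is only an illustrative sketch, not the procedure used in the study: the function name is hypothetical, durations falling between 150 and 151 ms are assumed here to count as long, and sounds shorter than 50 ms, which the text does not classify, are returned as None.

```python
from typing import Optional


def classify_vowel(duration_ms: float) -> Optional[str]:
    """Assign a vowel duration to the classes described in the text.

    E (extended): over 250 ms
    L (long):     151 to 250 ms (assumed here to start just above 150 ms)
    S (short):    50 to 150 ms
    Durations under 50 ms are not classified in the text; None is returned.
    """
    if duration_ms > 250:
        return "E"
    if duration_ms > 150:
        return "L"
    if duration_ms >= 50:
        return "S"
    return None


# Mean durations reported later in this section, used only as sample inputs.
for label, ms in [("Japanese mothers", 201), ("Scottish mothers", 247),
                  ("Japanese infants", 449), ("Scottish infants", 447)]:
    print(label, ms, classify_vowel(ms))
```

On this rule the mothers' mean durations fall in the long class and the infants' mean durations in the extended class.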
In both countries mothers made many more of the identified vowel sounds than infants, and
Japanese mothers, excluding one exceptionally quiet mother whose infant made no extended
vowel-like sounds, were more vocal than the mothers in Scotland (Table 10.1). Most of the
infants made few sounds, but one Japanese infant and two in Scotland were noticeably more
vocal than the others.
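
As a check on the counts in Table 10.1, the column totals and the mothers' share of all vowel sounds can be recomputed directly. This is a minimal sketch with hypothetical variable and function names. It assumes that the 'exceptionally quiet' Japanese mother is the one listed with 15 sounds, which the text does not state explicitly.

```python
# Counts per dyad in 30 seconds of play, transcribed from Table 10.1.
TABLE_10_1 = {
    ("Japan", "Mothers"): [15, 43, 32, 73, 40, 37],
    ("Japan", "Infants"): [0, 17, 2, 4, 0, 1],
    ("Scotland", "Mothers"): [33, 29, 22, 45, 23, 35],
    ("Scotland", "Infants"): [0, 6, 1, 2, 16, 30],
}

totals = {key: sum(counts) for key, counts in TABLE_10_1.items()}


def mother_share(country: str) -> float:
    """Fraction of all counted vowel sounds in a country made by mothers."""
    mothers = totals[(country, "Mothers")]
    infants = totals[(country, "Infants")]
    return mothers / (mothers + infants)


# Mean sounds per mother; the Japanese mean leaves out the dyad with 15
# sounds (assumed to be the exceptionally quiet mother).
japan_counts = TABLE_10_1[("Japan", "Mothers")]
japan_mean_excl_quiet = (sum(japan_counts) - 15) / (len(japan_counts) - 1)
scotland_counts = TABLE_10_1[("Scotland", "Mothers")]
scotland_mean = sum(scotland_counts) / len(scotland_counts)

print(totals)
print(round(mother_share("Japan"), 3), round(mother_share("Scotland"), 3))
print(japan_mean_excl_quiet, round(scotland_mean, 1))
```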
Figure 10.2 shows that the vowel sounds of infants were higher pitched and more intense than
those of their mothers, and the infants’ sounds were less modulated. The mean pitch of Japanese
mothers’ sounds (312 Hz) was higher than that of the Scottish mothers (275 Hz), who, however,
made a larger proportion of exceptionally high ‘outlier’ sounds, indicated by small circles in
Figure 10.2. Only the group of sounds made by Scottish mothers had a mean pitch below Middle
C (C4, 261.63 Hz). Research on mothers’ voices and on the preferences of infants shows that
sounds pitched in the octave above C4 are normally used for happy playful speech (Trehub 1990,
2003). Mothers suffering from depression tend to let their voices fall below C4 (Robb 1999;
Marwick and Murray Chapter 13, this volume). The intensities of sounds made by Japanese and
Scottish mothers were similar (68 dB and 65 dB respectively) though the Japanese mothers
ranged to slightly higher levels. Statistical analysis confirmed that overall both pitch (p = 0.003)
and intensity (p = 0.0001) of Japanese mothers were significantly higher than those of Scottish
mothers. Infants in both countries made more intense sounds at about the same level (mean
values were 72 dB for Japanese infants, and 71 dB for those in Scotland). No significant differ-
ences were found for acoustic features of infant vowel sounds recorded in the two countries.
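
The references to Middle C and the octave above it can be made concrete by converting a frequency to its distance in semitones from C4. The sketch below assumes equal temperament and uses the C4 value given in the text. The function names are hypothetical and this is not part of the PRAAT analysis.

```python
import math

C4_HZ = 261.63  # Middle C, as given in the text


def semitones_from_c4(freq_hz: float) -> float:
    """Signed distance from Middle C in equal-tempered semitones."""
    return 12.0 * math.log2(freq_hz / C4_HZ)


def in_octave_above_c4(freq_hz: float) -> bool:
    """True if the frequency lies from C4 up to, but not including, C5."""
    return C4_HZ <= freq_hz < 2.0 * C4_HZ


# 312 Hz is the mean pitch reported for the Japanese mothers;
# 523.25 Hz is the value given for C5 in the caption of Fig. 10.2.
print(round(semitones_from_c4(312.0), 1), in_octave_above_c4(312.0))
print(round(semitones_from_c4(523.25), 2))
```

A positive value means the sound lies above Middle C and a negative value means it lies below.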
Table 10.1 Numbers of vowel sounds made by mothers and infants in Japan and Scotland in
30 seconds of play
          Japan                Scotland
          Mothers   Infants    Mothers   Infants
          15        0          33        0
          43        17         29        6
          32        2          22        1
          73        4          45        2
          40        0          23        16
          37        1          35        30
Totals    240       24         187       55
Fig. 10.2 The acoustic features pitch and intensity of vowel sounds made by mothers and infants in
Japan and Scotland. C4 = middle C, 261.63 Hz; C5 = octave above middle C, 523.25 Hz.
[Panels: Pitch (Hz) and Intensity (dB), each shown for mothers and infants in Japan and Scotland.]
Figure 10.3 shows that Japanese mothers mainly made sounds that were short or long in
duration, with a mean duration of 201 ms, but some of their sounds were very long, even beyond
one second. This supports the impression that the Japanese mothers were expressive over a wider
range of times. Scottish mothers had a slightly higher mean duration (247 ms) with fewer very
long sounds. They appear to be making their voices more expressive by discontinuous variations
of pitch. The infants in both countries made much longer sounds than their mothers, most of
their vowels lying in the extended range, and their mean durations were above 300 milliseconds
(449 ms in Japan, and 447 ms in Scotland). The durations of these infant sounds are outside the
normal range of vowels in speech, more like affective calls.
Pitch and intensity both varied significantly in relationship with duration for the vowel sounds
of both mothers and infants. Japanese infants made extended vowel sounds at a higher pitch than
the Scottish infants or the mothers of both countries, and Scottish infants had a higher pitch in
their long and short vowel sounds than other groups. These differences suggest that the infants
were already developing different ways of expressing themselves in play with their mothers.
Fig. 10.3 Durations of vowel-like sounds made by mothers and infants in Japan and Scotland,
and the distribution of durations for each group.
[Panels: duration in seconds, and number of extended (E), long (L) and short (S) sounds, for mothers and infants in Japan and Scotland.]
10.4.2 The relationship between engagement in play and the sounds of vowels
The emotional quality of engagement was judged by an intuitive assessment of the behaviours
seen in the videos from both Japan and Scotland. Periods of engaged and disengaged communication
in the episodes of play were identified for each 30-second segment, and judgements were made
independently in Scotland by two raters from the visible gestures and expressions, using only the
video data, without sound.
The idea of engagement is based on the theory of ‘attunement’ (Stern et al. 1985). Engagement
was defined as established ‘when mother and infant shared intimate moments of close mutual
awareness of each other or of a shared interest’. Disengagement was defined as ‘when the infant’s
attention moved away from the mother or from the communicative interaction’. One rater, who
had not been involved in the work, was completely blind to the purpose of the experiment and
had not been trained to make psychological evaluations of this kind, was asked to identify
periods of communication where the mother and infant were emotionally engaged or disengaged
according to these definitions. After a short period of practice this rater and the researcher (the
first author) were able to independently and consistently identify engaged and disengaged
periods of communication. A selection of 15 per cent of all segments was categorized by both
raters with 100 per cent concordance.
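
Concordance of this kind can be computed as simple percentage agreement between the two raters' labels. The following is a minimal sketch under that assumption. The function name and the example labels are hypothetical and are not the study's data, and the text does not say which agreement statistic was used.

```python
from typing import Sequence


def percent_agreement(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Percentage of segments given the same label by both raters."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of segments")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical labels for four segments, for illustration only.
first_rater = ["engaged", "disengaged", "engaged", "engaged"]
second_rater = ["engaged", "disengaged", "engaged", "engaged"]
print(percent_agreement(first_rater, second_rater))
```

Percentage agreement does not correct for agreement expected by chance, so a chance-corrected statistic may be preferred when the categories are very unevenly distributed.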
The intensity of the mothers’ sounds was higher in engaged situations (p < 0.0001), as was
their duration (p < 0.0001), but the pitch of their voices was slightly higher in disengaged situations,
possibly as mothers made attempts to re-engage their infants. For infants’ vowel sounds there was
no significant difference in acoustic features between the two levels of engagement, although
trends were in the same direction as for mothers.
Figure 10.4 summarizes the findings for the vowel sounds made during periods of engagement
and disengagement, and also shows the numbers of vowel sounds of the three durations:
extended (E), long (L), and short (S), for all mothers and all infants. Engaged periods were
characterized by a preponderance of extended vowels, and when the mothers and infants were
disengaged, their utterances were more frequently short.
Fig. 10.4 Left; vowel-like vocalizations of mothers and infants in periods of engagement and
disengagement in play, in Japan and Scotland. Right; frequency of extended (E), long (L) and
short (S) vowels in engaged and disengaged periods.
[Panels: number of sounds for mothers and infants in Japan and Scotland, and frequency of vowels by duration class, each split into engaged and disengaged periods.]
A chi square analysis confirmed that, according to the judgements made by the two raters,
Scottish mothers had significantly more engaged interactions (66 per cent) than Japanese
mothers (34 per cent) (p = 0.0001). Scottish infants were also significantly more engaged (82 per
cent) than Japanese infants (7.7 per cent; p = 0.0001). The intensity of vowel sounds was higher
(p = 0.0001), and they were of longer duration (p = 0.006) during engaged communication.
Mothers and infants in Japan and in Scotland appear to be coordinating the intensity and
duration of their vowel sounds during emotional communication.
10.4.3 How a mother’s voice changes when she recites a nursery rhyme to her infant in different moods
We wanted to test if a mother’s vocal expressions of different moods would affect the emotions of
her infant, and to find out how she would change the parameters of her vowel sounds to convey
contrasting emotions. Inspired by the success of the ‘still face’ and double video replay perturbations
in revealing infants’ sensitivity to loss of the normal affectionate quality or contingency of
the mothers’ behaviours (Tronick et al. 1978; Murray and Trevarthen 1985), we devised an
‘emotional voice’ paradigm. The first purpose was to find out if acoustic features of the mother’s
vowels would vary consistently with the emotional intent of her speaking. The paradigm was also
intended to find whether or not the infant noticed changes in the quality of the mother’s voice
when she was required to change affective features of her speech. The experiment was carried out
to provide a description of how infants behaved in response to affective changes in the mother’s
voice, and, further, to see whether the infants’ affective expressions might correspond to
emotions expressed in the mother’s voice.
Scottish mothers were asked to recite the words of a nursery rhyme to their infants, acting in
three different voices to express ‘happy’, ‘sad’ and ‘bored’ emotions. Eleven mothers and their
infants took part who had not previously taken part in experiments for this research. There were
six female infants aged between 5 and 9 months (mean age 7.2 months) and five male infants
aged between 3 and 8 months (mean age 5.4 months). The mean age of all infants was 6.7 months.
Mothers were invited to visit the Infant Laboratory in the Psychology Department at Edinburgh
University with their infant and the infant was placed in a baby seat facing the mother. The
purpose of the study was explained to the mothers, and each was asked to speak, not sing, a
familiar nursery rhyme, Round and Round the Garden, to her baby four times, each time ‘acting’
one of the three emotions in the following order:
Condition 1 = ‘Happy’ voice 1
Condition 2 = ‘Sad’ voice
Condition 3 = ‘Happy’ voice 2
Condition 4 = ‘Bored’ voice
The second Happy condition was included to ensure that the infant had the chance to recover
from any emotional reaction to the Sad condition. The infant’s reactions were videotaped. The
mother was not filmed but a recording of her voice was taken from the videotape. Two types of
analysis were carried out:
◆ Acoustic analysis of the mothers’ vocal expressions, measuring pitch levels, intensities and
durations.
◆ Rating of non-vocal responses of the infant to their mother’s voice by observers of the video.
The aim was to examine how the mother’s ‘acting’ of the different emotions influenced the
infants’ responses so as to gain a measure of emotional matching between mother and infant.
Fig. 10.5 Upper right; the text of the nursery rhyme that the mothers recited, indicating the
introduction (I), development (II), climax (III) and resolution (IV) of the narrative. Left and below right;
three pitch plots of one mother reciting the stanza in the three moods. C4 = Middle C, 261.63 Hz.
I Round and Round the Gar-den
II Ran a Te-ddy BEAR.
III One Step Two Step
IV And a Tick-l-y Un-der THERE
[Pitch plots for Happy, Sad and Bored: pitch axis from 50 to 700 with C4 marked, time axis from 0 to 10 seconds, phrases labelled I to IV.]
Figure 10.5 shows the text of the nursery rhyme that the mothers recited and illustrates its
classical form as a traditional four-line stanza with iambic meter, with rhyming words at the ends
of the second and fourth lines. The words, and expressive devices, such as the rubato used to
build excitement in the third line, and the accelerando with which the fourth line is usually
chanted, convey the four classical divisions of a narrative: introduction, development, climax and
resolution, labelled I, II, III and IV in Figure 10.5. Normally the final event is an affectionate
‘attack’ on the infant with the word ‘Tickley’. In our experiment the mother was required to sim-
ply recite the story to her baby, and not to play any action game.
Figure 10.5 also shows three pitch plots of a mother reciting the stanza in the three moods, and
illustrates the differences in timing of her performances. For all 11 mothers the average dura-
tions, in seconds, of their recitations in the three moods were as follows: Happy 7.78; Sad 9.82;
Bored 8.02. The measures for the two Happy conditions were virtually identical. The mother
whose recitations are shown in Figure 10.5 is relatively slow for all three conditions, but she, too,
speaks longest for the Sad condition.
In the Happy condition the narrative pattern of the mother’s voice shown in Figure 10.5
exhibits an introduction phase where she sets up the rhythm for the rhyme, a development where
she builds the pitch and lets it fall and build again, a climax in which her pitch reaches its maxi-
mum level, before the resolution where her pitch falls at the end of the phrase. The infant’s par-
ticipation would appear to depend on an attention to the pattern of development through the
stanza, which commonly ranges between 10 and 20 seconds in length, lullabies being longer than
lively action songs (Trevarthen 1999). In the Sad condition this mother’s pitch is lower, the into-
nation of her voice is much flatter, showing less expression, and in the introductory phrases she
does not build toward the climax. Her speech is slow. The pitch of her voice lies below Middle C
and there is no fall at the end of the final phrase. For the Bored condition she again sets her
speech mainly below Middle C. The introductory phrase falls steadily and the second part of
the rhyme is said in a ‘sighing’ voice, which causes the pitch plot to break up towards the end of
the rhyme. A falling pitch is evident at the end, but was incorporated into the sighing voice and
did not give an impression of relaxation, rather one of weakness.
An analysis of the mothers’ voices was carried out with PRAAT phonetic analysis software to
see if there were consistent differences in acoustic dimensions of pitch and intensity in isolated
vowel sounds depending on the emotional condition. Figure 10.6 shows the means and standard
deviations for pitch range and intensity for the three moods.
Pitch variation appears to be important in the expression of different emotions. Positive and
negative emotional conditions were clearly different. A paired samples t-test showed that the
pitch ranges of Happy and Sad conditions differed (p = 0.027), as did those for Happy and Bored
(p = 0.007). There was, however, no significant difference between the Sad and Bored conditions.
The durations of vowel sounds also varied with the mood performed by the mother. The Sad
condition had more extended and long vowels than the other conditions, and there were more
short vowels in the Happy and Bored conditions than in the Sad condition. Similar differences
appear in Figure 10.5 in the lengths of the phrases that make up the four lines of the poem. In
both Sad and Bored performances this mother protracted the two first lines, the introduction
and the development of the narrative, and in the Sad condition the third line or climax is longer,
lacking the urgency needed for the build up to the conclusion.
10.4.4 How the infants reacted to the mothers’ changes of mood
Figure 10.7 illustrates a 5-month-old baby girl, recorded on a different occasion, showing intense
interest in her mother’s recitation of this same rhyme in a Happy mood, and presents a spectro-
gram of the final line showing how this infant anticipated the end. She vocalized in synchrony
with the mother’s last word ‘bear’, making a sound similar to ‘aaeer’ at a pitch slightly higher
than the mother’s sound, which comes to conclusion on middle C (C4). It is notable that the
infant’s sound has a rising-falling contour like the mother’s sound but less strongly modulated.
This suggests both the immaturity of the infant’s production, and her skill in synchronizing the
narrative of expressive motives with those of the mother.
Fig. 10.6 Means and standard deviations for the pitch range and intensity of the mothers’
vocalizations as they recited the nursery rhyme in the three moods: Happy, Sad and Bored.
[Panels: Pitch range (Hz) and Intensity (dB) for the Happy, Sad and Bored conditions.]
Fig. 10.7 A 5-month-old baby girl in Scotland is attentive when her mother recites the action
song Round and Round the Garden, and a spectrogram of the final line showing how this infant
synchronized a matching sound with her mother’s last word ‘bear’ at a pitch slightly higher than
the mother’s sound, which ends on middle C (C4).
[Spectrogram: frequency (log2 scale, C4 marked) against time in seconds, with the infant’s (I) and mother’s (M) sounds labelled over the words ‘Ran the Teddy Bear’; time spanned 2.76 s.]
We tried to find out if the infants’ responses to their mothers’ voices varied systematically according
to emotional condition. We asked three individuals, two female non-parents and one male
parent, who were blind to the purpose of the study, to view the videos of infants when they were
listening to their mothers recite the nursery rhyme. The three observers saw the video clips in a
randomized sequence without sound, and with no information about the mother’s behaviour.
They were required to code each of the 11 infants’ behavioural responses to their mothers’ voices in the four
conditions: ‘Happy 1’, ‘Sad’, ‘Happy 2’ and ‘Bored’. Each time the raters’ coding of infant gestures
corresponded with the emotion the mother was trying to imitate, it was counted as a ‘Matching’
score. Matching scores for the Happy 1 and Happy 2 conditions were identical and so from this
point on, analysis is carried out using only the data from the Happy 1 condition.
In the Happy condition 54.5 per cent of raters’ judgements were Matching, and in the
Bored condition a similar value of 60.6 per cent was obtained. For the Sad condition, however,
only 12.1 per cent of judgements matched, indicating that infants’ responses to this condition
were much less likely to reflect the emotion expressed by the mother.
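The Matching percentages are consistent with 33 judgements per condition (3 raters each coding 11 infants). The Python sketch below illustrates that arithmetic. The raw counts 18, 20 and 4 are not given in the text; they are back-calculated assumptions chosen because they reproduce the reported percentages, and the function name `matching_percentage` is hypothetical.

```python
# Assumption: each of 3 raters coded each of 11 infants once per condition,
# giving 33 judgements per condition.
RATERS = 3
INFANTS = 11
JUDGEMENTS_PER_CONDITION = RATERS * INFANTS  # 33

# Assumed (back-calculated) counts of Matching judgements; not reported in the text.
assumed_matching_counts = {"Happy": 18, "Bored": 20, "Sad": 4}


def matching_percentage(matching: int, total: int) -> float:
    """Return the share of Matching judgements as a percentage, to one decimal."""
    return round(100.0 * matching / total, 1)


percentages = {
    condition: matching_percentage(count, JUDGEMENTS_PER_CONDITION)
    for condition, count in assumed_matching_counts.items()
}

for condition, pct in percentages.items():
    print(f"{condition}: {pct} per cent Matching")
```

With so few judgements per condition, each single judgement shifts a percentage by about three points, so the gap between Happy and Bored is small while the gap to Sad is large.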
10.5 The other side of the emotional bridge

10.5.1 Adults perceiving emotions in infants’ voices
Tests of distinct emotions perceived by adults, and especially parents, in the vocalizations of
infants as young as 2 months support the proposition that, even with limited powers to produce
learned sounds with articulations resembling language, infants in different countries have emo-
tions and express them in systematic ways with their voice, and parents are gifted with intuitive
responses to them. Young infants are active in regulating intersubjective contact with a parent
and can make subtle modifications of their voice sounds to express changing motives and
feelings, in conjunction with a wide range of other movements. These vocal abilities of infants,
especially the production of extended vowel-like sounds modulated by emotion, and parental
sympathetic interest are part of the human endowment for communicative musicality.
To further explore emotion in this early musicality of the voice, to identify some of the salient
features and their use in communication, we asked adults to rate the expressions in a selection of
isolated vowel sounds made by unidentified infants. Cries, calls and coos were selected from the
recordings of five infants from English-speaking families in the corpus of a previous
study (Powers 2001). Each chosen vocalization was an extended vocable resembling a vowel
sound. There were two vocalizations from each of five infants, one previously judged to be
positive and one negative by two independent raters. The infants were two boys at 23 and 44 weeks,
and three girls at 23, 28 and 46 weeks.
The emotional content of each vocalization was assessed from one minute of a video interac-
tion between the infant and his or her mother according to a tested category system (Trevarthen
and Marwick 1982) for definition of affect. Two independent raters used both acoustic and visual
aspects of the video to make their decision about the positive or negative emotion in each inter-
action from moment to moment, as the video was advanced for one second at a time. Reliability
was established at 93 per cent.
One vowel sound was isolated from each of the ten recordings, giving a positive and a
negative vocalization for each of the five infants.
The study was run via the Internet. Participants were 158 English-speaking adults of both sexes,
parents and non-parents, each of whom was asked to identify the emotional content of the ten infant
vocalizations they heard as isolated sounds in a recording. Table 10.2 summarizes the ages, sex
and parental status of the participants. Their ages ranged from 18 to 59 years. The non-parents
were much younger than the parents and there were also fewer male parents.
The ten vocalizations were presented in a random order for each listener. The listeners were asked
to rate each sound as either ‘Positive’ or ‘Negative’, or as ‘None’ (containing no discernible
emotional message), and to feel free to listen to the vocalization as many times as they needed to
before making their decision. Positive vocalizations were defined as expressions that convey the
feelings of ‘happiness’, ‘joy’ or ‘pleasure’; Negative vocalizations were defined by any emotional expression
considered to be ‘sad’ or ‘unhappy’. After choosing the description of emotion, they were asked to
write down details of the emotion that they thought they had heard in the sound, and also to
provide in writing details of any emotion they had felt when hearing the infant vocalization.
Table 10.2 Participants in the study of emotion in infants’ vocalizations
Sex      Parental status   Number   Mean age (years)
Female   Non-parent        59       21
Female   Parent            36       42
Male     Non-parent        45       24
Male     Parent            18       44
All judgements of emotion in the infants’ sounds were categorized as ‘Matching’ (when in
agreement with the researcher’s previous coding), ‘Non-matching’ (not in agreement),
and ‘None’ where no distinct emotion was perceived in the sound. The dataset consisted of 1,580
judgements (158 participants each listening to 10 sound files). For 1,100 of these judgements,
participants reported that they perceived a clear positive or negative emotion. There were 874
Matching judgements, 218 Non-matching judgements, and 498 None judgements. A chi-square
analysis showed that overall participants could match the previous codings of emotion
(p = 0.0001). Sex and parental status were entered into the model to see if these factors would be
predictive of judgement type. Females were more likely than males to make Matching judgements,
although this difference was not significant (p = 0.068). Females were also more likely than males
to make None judgements, and this difference was significant (p = 0.05). Parental status was not
predictive of judgement type.
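The text does not say which table the chi-square analysis was run on. As an illustration only, the Python sketch below checks the participant total from Table 10.2 and the size of the dataset, then computes a one-degree-of-freedom goodness-of-fit statistic comparing the reported Matching and Non-matching counts against an even split. The even-split null hypothesis and the helper name `chi_square_even_split` are assumptions, not the authors’ stated procedure.

```python
# Participant counts from Table 10.2.
participants = {
    ("Female", "Non-parent"): 59,
    ("Female", "Parent"): 36,
    ("Male", "Non-parent"): 45,
    ("Male", "Parent"): 18,
}
n_participants = sum(participants.values())
n_sounds = 10
n_judgements = n_participants * n_sounds

# Judgement counts as reported in the text.
matching = 874
non_matching = 218


def chi_square_even_split(a: int, b: int) -> float:
    """Goodness-of-fit chi-square for two counts against a 50/50 split (1 d.f.)."""
    expected = (a + b) / 2
    return (a - expected) ** 2 / expected + (b - expected) ** 2 / expected


statistic = chi_square_even_split(matching, non_matching)

# The critical value of chi-square with 1 d.f. at p = 0.001 is about 10.83.
exceeds_critical = statistic > 10.83

print(n_participants, n_judgements)
print(round(statistic, 2), exceeds_critical)
```

A test of this kind asks only whether clear judgements agreed with the earlier coding more often than they disagreed; the effects of sex and parental status described above would need a model with those factors included.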
The written descriptions given by listeners were collated, and descriptions from Matching,
Non-matching and None responses were analysed. All words associated with emotion were coded
according to the following categories:
◆ Body feelings. Indicating hunger, pain, tiredness, discomfort.
◆ Expressions of emotion. Happiness or sadness – laughter, crying, about to cry, thinking about
laughing, starting to cry.
◆ Communicative engagement. Joining in company – playing a game, being tickled, upset needing
attention, seeking communication, greeting a person, playing with voice, acknowledgment,
recognition.
◆ Self-regulating or motive state. Expressing interest, wonder, or enjoyment; behaviour that is
vigorous, relaxed, restless, playful. This category also included descriptions that reinforced the
participant’s original judgement of the sound as Positive or Negative, for example happy or sad.
Descriptions that did not make sense were excluded from all analyses.
Females were more likely than males to give all these types of description when they expanded
on their initial judgement of emotion contained in the infant vowel sounds. Parental status was
not associated with particular differences in description of infants’ feelings. The proportions of
descriptions in Figure 10.8 show that all groups make more ‘Self-regulation’ descriptions
of the emotion that they perceive in the infant vowel sounds, and female non-parents give more
than the other groups. In this population mothers appear more inclined to hear infant sounds as
seeking ‘communicative engagement’, certainly more than fathers do. Males appear more likely to
attribute emotions related to ‘body feelings’. Simple happy or sad ‘expressions of emotion’ are
represented only in small proportion by all four groups of adults, suggesting that they are
inclined to hear in these isolated infant sounds more complex messages of state of mind and
motives seeking relationship to others.
[Figure: proportions (vertical axis 0 to 50) of description types (Self-regulation or motive state,
Communicative engagement, Body feelings, Expressions of emotion, None) for Female Non-parents,
Mothers, Male Non-parents and Fathers.]
Fig. 10.8 Descriptions made by adults, mothers, fathers and male and female non-parents, of the
feelings expressed in isolated vocalizations of infants heard on a recording.
Statistical analyses revealed that female non-parents were significantly less likely than other
participant groups to give body feelings descriptions compared to communicative engagement
or expressions of emotion descriptions. Non-parents, female and male, were significantly more
likely to give expressions of emotion descriptions and communicative engagement descriptions
of the emotion they perceived in the infant vocalizations. When participants made ‘Matching’
judgments, which agreed with the experimenter’s categorization of the emotion in the sound,
they were more likely to give communicative engagement and self-regulation descriptions.
An unexpected result of the analysis of verbal descriptions was that participants who judged
the infant vocalizations as containing ‘No Emotion’ still went on to provide written descriptions,
often giving detailed emotional descriptions of the infant’s state, contradicting their ‘None’
judgment.
The judgments of the adults did not always match the experimenter’s codings of the positive or
negative emotional content of infants’ sounds. Non-matching responses were similar across all
participant groups, but it is interesting to note that there was a significant association between
parental status and disagreement about the emotion expressed. ‘Parents’ were more likely to
make a Non-matching judgement when the vocalization contained a positive emotional message.
In other words, they showed a bias to perceive a vocalization as positive.
When participants made Matching judgements they were significantly more likely to give body
feelings descriptions (p = 0.0001), communicative engagement descriptions (p = 0.0001) and
self-regulation descriptions (p = 0.0001).
10.5.2 How adults feel when they hear infants’ voice sounds
Adult participants were asked if they felt any emotional response to the infant vocalizations that
they heard and their replies were categorized according to the following descriptions:
1 Caring: A need or desire to take action, for example – to soothe, comfort, pick up, attend to,
pull faces/make sounds for the infant.
2 Sympathetic: A direct emotional message, for example, expressing concern, happiness,
curiosity, shared pleasure.
3 Interpreting: A conceptual response extending the verbal description to refer to other factors
such as the environment or to articulate more complex inferences about the stage of develop-
ment or the infant’s actions or intentions, for example – ‘infant changing her boundaries’,
‘the baby is desperate to succeed’, ‘the infant requires something but I don’t know what’,
‘could imagine the baby being in my arms and looking up at me’.
Any written responses that made no clear sense were excluded from these analyses, and
participants’ emotional responses sometimes contained words from several of the categories. The
responses were compared with the initial classification of the sounds as expressive of positive or
negative emotions and grouped under the three headings: Matching, Non-matching or None.
In Figure 10.9 it appears that males exhibit fewer caring responses to the infant vocalizations than
female participants. A statistical test confirmed that female sex was associated with a caring
response (p = 0.003) and being a parent was also associated with caring (p = 0.029). Figure 10.9
also shows that when participants gave a None judgement, a higher number of them reported
feeling ‘No response’ to the infant vocalizations.
[Figure: left panel, percentage (0–75) of feelings reported by Female, Male, Non-parent and Parent
participants; right panel, number (0–400) of feelings reported for Matching, Non-matching and None
judgements. Legend: Sympathetic, Caring, Interpreting, No response.]
Fig. 10.9 Left: feelings reported by adults in response to the infant sounds. Right: proportions of
different feelings reported for emotions that were Matching or Non-matching in comparison with
the experimenter’s evaluations of the sounds, and when no emotion was recognized (None).
Participants who gave Matching responses, agreeing with the assigned categorization of the
emotions, were significantly more likely to report they felt caring (p = 0.0001) and sympathetic
(p = 0.0001) to the infants’ sounds, and they were also more willing to give interpreting responses
(p = 0.003), than when they gave Non-matching judgements.
10.6 How the tones of voices communicate feelings of companionship between infants and their mothers from the start, and lead to the musicality of meaning in a particular culture

We have explored only one small aspect, one element or dimension, of the musicality of relating
with an infant in the early months, before the habits of culture have an easily detected effect, but
already at a time when babies are attending to fine distinctions in their mothers’ vocalizations,
including culture-specific ones, and stimulating their mothers to heighten the expressive,
prosodic or ‘musical’ forms of their vocalizations. By measuring the variations in pitch, intensity
and duration of extended calls resembling the vowels in speaking we have found evidence that,
notwithstanding differences in individual personalities on both sides, which we also recorded,
infants and their mothers cooperate in establishing a subtle link between their minds by modu-
lating their voice sounds. Infants a few months old can convey emotions that adults recognize,
and if the mother does not speak a story to them in a natural happy way, but acts sad or bored,
the infants show reactions that others perceive as a loss of sympathy of engagement.
These results, while they go beyond widely accepted theories of infants’ emotional and social
intelligence, are in accord with sensitive clinical reports of the first human conversations and
their early developments. When a mother greets her newborn baby with cooing sounds and
rhythmic touches, a happy alert baby responds in ways that inspire affection and caring sympathy
or love in the mother (Klaus and Kennell 1976; Brazelton 1979), and any present company can
share this. Thus begins a unique very human kind of friendship—the first occasion in which the
child will find a companion who will understand and become the teacher of what to do and
know in the world. It depends on the female intuitions of the mother for sympathetic emotional
response, on her personality, on learning from her infant and on her social support and influ-
ences from her culture. If another person, man or woman, is seeking intimate communication
with a young baby, they will need to find similar sympathy and respect for the baby as a person.
All humans are born with this need and ability to get in touch with sounds and visible signs of
their companions’ changing feelings, not just with the ability to mirror their movements. We are
therefore not surprised that fathers, too, and women and men who are not parents, can experience
similar emotions to mothers when they listen to isolated calls of infants, or that mothers are gener-
ally more inspired to caregiving than fathers, who feel strong sympathy for the infant, and that par-
ents have acquired a little more understanding of how to accept care for a baby than non-parents.
Through the busy months of infancy, before talking, a baby practices a programmable kind of
inter mind–body play that, as Bjorn Merker (Chapter 4, this volume) reminds us, no ape or
singing bird or whale can master—the beginning of a play with the rhythmic actions and tones of
inventive ritual that has no innate boundaries. We found differences in the sounds passing
between mothers and 4-month-olds in Japan and Scotland when they were playing together, and
these might be related to the contrasting moral attitudes or beliefs, and the differences in speech
prosody, in these two cultures. The wider range of durations and pitch levels in Japanese
mothers’ speech compared with the English-speaking mothers in Scotland might be expressive of
both the special characteristics of Japanese language, and the importance given in traditional
Japanese society to respect for the emotions of other persons. This special respect for emotions is
evident in Japanese concepts of how infants feel and how they should be addressed. Certainly the
infants and their mothers in each culture matched one another in the ways they exchanged
feelings in their sounds. They seem to be beginning to share a simple proto-habitus of sound art
of the kind that Gratier and Danon (Chapter 14, this volume) propose as the way to ‘belonging’
in a human community.
However, such differences are but variations of universal human motives and needs, such as
those Charles Darwin observed when he carried out research by correspondence with people in
many distant lands to compare carefully emotional behaviours in human societies (Darwin 1872;
Bowlby 1991). The main message is that normal happy infants and their mothers use their voice
tones cooperatively, with ‘communicative musicality’ to sustain a harmony and synchronicity of
emotional ‘narrative envelopes’ as Daniel Stern (1992, 1999) has so felicitously called them.
Before they are 6 months old infants enjoy learning and ‘showing off’ ways of play that others
appreciate (Trevarthen 2002). We submit that this is the first stage of a proud storytelling, ritual-
and myth-making that leads within a few months to a distinctly human fascination with
intricate, arbitrary, ritual elaborations of feeling and thinking. Without the musicality that
infants and their affectionate adult companions share, the fabricated meanings of culture and
language, including its music, would be inaccessible.
Note
The findings reported in this chapter are the product of research carried out by Niki Powers
for her Ph.D., based on her proposal to make an acoustic study of vowel sounds made by
mothers and infants using material from both Japan and Scotland. Her research was supported
by a Post-Graduate Studentship from the Economic and Social Research Council of the UK.
References
Aitken KJ and Trevarthen C (1997). Self–other organization in human psychological development.
Development and Psychopathology, 9, 651–675.
Alegria J and Noirot E (1978). Neonate orientation behavior towards the human voice. Early Human
Bateson MC (1971). The interpersonal context of infant vocalization. Quarterly Progress Report of the
Research Laboratory of Electronics, MIT, 100, 170–176.
development. In M Bullowa, ed., Before speech: The beginning of human communication, pp. 63–77.
Beebe B and Lachmann FM (2002). Infant research and adult treatment: Co-constructing interactions.
Academic Press, London.
Bierhoff H-W (2002). Prosocial behaviour. Psychology Press, Hove, East Sussex.
Bloom K and Masataka N (1996). Japanese and Canadian impressions of vocalising infants. International
Journal of Behavioural Development, 19(1), 89–99.
Boersma P and Weenink D (1992–2001). Praat: A system for doing phonetics by computer. Institute of
Phonetic Sciences. University of Amsterdam. Available from http://www.praat.org.
Bornstein MH, Tamis-LeMonda CS, Tal J et al. (1992). Maternal responsiveness to infants in three societies
– the United States, France and Japan. Child Development, 63(4), 808–821.
Bowlby J (1969). Attachment and loss: Volume one. Attachment. Basic Books, New York.
Bowlby J (1981). Attachment and loss. Volume three: Loss, sadness, and depression. Basic Books, New York.
Bowlby J (1988). Developmental psychiatry comes of age. American Journal of Psychiatry, 145, 1–10.
Bowlby J (1991). Charles Darwin: A new biography. Pimlico, London.
Bråten S (ed.) (1998). Intersubjective communication and emotion in early ontogeny, pp. 372–382.
Bråten S (ed.) (2007). On being moved: From mirror neurons to empathy, pp. 21–34. John Benjamins,
Brazelton TB (1979). Evidence of communication during neonatal assessment. In M Bullowa,
ed., Before speech: The beginning of human communication, pp. 79–88. Cambridge University Press,
London.
Bruner JS (1996). The culture of education. Harvard University Press, Cambridge, MA.
Press, London.
Carter CS, Ahnert L, Grossman KE et al. (eds) (2005). Attachment and bonding: A new synthesis. Dahlem
Workshop Report, 92. The MIT Press, Cambridge, MA.
Cheney DL and Seyfarth RM (1990). The representation of social relations by monkeys. Cognition,
37, 67–96.
Clarkson MG, Martin R and Miciek SG (1996). Infants’ perception of pitch: Number of harmonics.
Infant Behaviour and Development, 19, 191–197.
Clifton RK, Morrongiello BA, Kulig J and Dowd J (1981). Newborns’ orientation toward sound: Possible
implications for cortical development. Child Development, 52, 833–838.
Cohen AJ, Thorpe LA and Trehub SE (1987). Infants’ perception of musical relations in short transposed
tone sequences. Canadian Journal of Psychology, 41(1), 33–47.
Cooper RP and Aslin RN (1990). Preference for infant-directed speech in the first month after birth.
Crystal D (1997). The Cambridge encyclopedia of language, 2nd edn. Cambridge University Press,
Cambridge.
Cutler A and Otake T (1999). Pitch accent in spoken-word recognition in Japanese. Journal of the
Acoustical Society of America, 105, 1877–1888.
Darwin C (1872). The expression of the emotions in man and animals. John Murray, London.
DeCasper AJ and Fifer WP (1980). Of human bonding: Newborns prefer their mother’s voices.
Science, 208, 1174–1176.
Dissanayake E (2000). Art and intimacy: How the arts began. University of Washington Press, Seattle and
London.
Donald M (2001). A mind so rare: The evolution of human consciousness. Norton, New York.
Donaldson M (1978). Children’s minds. Fontana/Collins, Glasgow.

Donaldson M (1992). Human minds: An exploration. Allen Lane/Penguin Books, London.
Draghi-Lorenz R, Reddy V and Costall A (2001). Rethinking the development of ‘nonbasic’ emotions:
A critical review of existing theories. Developmental Review, 21(3), 236–304.
Ejiri K and Masataka N (1999). Synchronization between preverbal vocal behaviour and motor action in
early infancy II: An acoustical examination of the functional significance of the synchronization.
Japanese Journal of Psychology, 69(6), 433–440.
Ejiri K and Masataka N (2001). Co-occurrence of preverbal vocal behaviour and motor action in early
infancy. Developmental Science, 4(1), 40–48.
Fassbender C (1996). Infants’ auditory sensitivity towards acoustic parameters of speech and music.
In I Deliège and J Sloboda, eds, Musical beginnings: Origins and development of musical competence,
pp. 56–87. Oxford University Press, Oxford, New York, Tokyo.
Fernald A (1989). Intonation and communicative intent in mother’s speech to infants: Is the melody the
Fernald A (1992). Meaningful melodies in mothers’ speech to infants. In H Papoušek, U Jürgens and
M Papoušek, eds, Nonverbal vocal communication: Comparative and developmental aspects, pp. 262–282.
Fernald A (1993). Approval and disapproval: Infant responsiveness to vocal affect in familiar and
unfamiliar languages. Child Development, 64, 657–674.
Fisher C and Tokura H (1996). Acoustic cues to grammatical structure in infant-directed speech: Cross-
linguistic evidence. Child Development, 67(6), 3192–3218.
Fogel A, Messinger DS, Dickson KL and Hsu HC (1999). Posture and gaze in early mother–infant
communication: Synchronization of developmental trajectories. Developmental Science, 2(3), 325–332.
Fonagy I (2001). Languages within language. An evolutive approach. Foundations of Semiotics 13. John
Benjamins, Amsterdam/Philadelphia.
Gergely G and Watson JS (1999). Early socio-emotional development: Contingency perception and the
social-biofeedback model. In P Rochat, ed., Early social cognition: Understanding others in the first
months of life, pp. 101–136. Erlbaum, Mahwah NJ.
Goldfield EC (2000). Exploration of vocal tract properties during serial production of vowels by full term
and preterm infants. Infant Behaviour and Development, 23(3–4), 421–439.
Gomes-Pedro G (2002). The child in the twenty-first century. In G Gomes-Pedro, K Nugent, G Young and
B Brazelton, eds, The infant and family in the twenty-first century, pp. 3–23. Brunner-Routledge, New
York/Hove, UK.
Gopnik A, Meltzoff A and Kuhl PK (1999). The scientist in the crib: What early learning tells us about the
mind. Harper Collins, New York.
Gratier M and Trevarthen C (2007). Voice, vitality, and meaning: On the shaping of the infant’s utterances
in willing engagement with culture. International Journal for Dialogical Science, 2, 169–81. Available at
http://ijds.lemoyne.edu/journal/2_1/IJDS.2.1.11.Gratier_Trevarthen.html
Guedeney A and Fermanian J (2001). A validity and reliability study of assessment and screening for
sustained withdrawal reaction in infancy: The Alarm Distress Baby Scale. Infant Mental Health Journal,
22(5), 559–575.
London.
Harris PL (1994). The child’s understanding of emotion: Developmental change and the family
environment. Journal of Child Psychology and Psychiatry, 35, 3–28.
Hobson P (2002). The cradle of thought: Exploring the origins of thinking. Macmillan, London.
Hofer MA (1987). Early social relationships. A psychobiologist’s view. Child Development, 58, 633–647.
Holstege G, Bandler R and Saper CB (eds) (1996). The emotional motor system, Progress in brain research,
volume 107. Elsevier, Amsterdam.
Jaffe J, Beebe B, Feldstein S, Crown C and Jasnow M (2001). Rhythms of dialogue in infancy: Coordinated
timing and social development. SRCD Monographs, 66(2) Serial No. 264, 1–132.
Jahoda G and Lewis IM (1988). Acquiring culture: Cross-cultural studies in child development. Croom Helm,
Beckenham, Kent.
Jonsson C, Clinton D, Fahrman M, Mazzaglia G, Novak S and Sörhus K (2001). How do mothers signal
shared feeling-states to their infants? An investigation of affect attunement and imitation during the
first year of life. Scandinavian Journal of Psychology, 42(4), 377–381.
Jusczyk PW (2001). In the beginning was the word. In F Lacerda, C von Hofsten and M Heimann, eds,
Emerging cognitive abilities in early infancy, pp. 173–192. Erlbaum, Mahwah, NJ.
Kaitz M and Maytal H (2005). Interactions between anxious mothers and their infants: An integration of
theory and research findings. Infant Mental Health Journal, 26(6), 570–597.
Karpf A (2006). The human voice. Bloomsbury, London.
Key MR (ed.) (1982). Non-verbal communication today: Current research. Mouton, New York.
Kitamura C and Burnham D (1998). The infant’s response to maternal vocal affect. In C Rovee-Collier,
LP Lipsitt and H Hayne, eds, Advances in Infancy Research, 12, 221–36.
Klaus M and Kennell J (1976). Maternal–infant bonding. Mosby, St Louis.
Koizumi T (1989). The attitudes of Japanese children and the effects of parental behaviour. Journal of
Moral Education, 18(3), 218–231.
Kozasa T (2002). Duration and F0 cues for vowel length in Japanese. Journal of the Acoustical Society of
America, 111, 2365.
Kozasa T (2004). Duration and pitch cues in Japanese mora. Speech Prosody, March 23–26, 2004.
Nara ISCA Archive, http://www.isca-speech.org/archive.
Kraemer GW (1992). A psychobiological theory of attachment. Behavioural and Brain Sciences,
15(3), 493–541.
Kugiumutzakis G (1993). Intersubjective vocal imitation in early mother-infant interaction. In J Nadel and
L Camioni, eds, New perspectives in early communicative development, pp. 23–47. Routledge, London.
Kugiumutzakis G (1998). Neonatal imitation in the intersubjective companion space. In S Bråten, ed.,
Cambridge.
Kugiumutzakis G, Kokkinaki T, Markodimitraki M and Vitalaki E (2005). Emotions in early mimesis.
In J Nadel and D Muir, eds, Emotional development, pp. 161–182. Oxford University Press, Oxford.
Kühl O (2007). Musical semantics, European Semiotics: Language, Cognition and Culture, No. 7. Peter
Lang, Bern.
Kuhl PK (1983). Perception of auditory equivalence classes for speech in early infancy. Infant Behaviour
and Development, 6(2–3), 263–285.
Kuhl PK (1994). Learning and representation in speech and language. Current Opinion in Neurobiology,
4(6), 812–822.
Kuhl PK (1998). Language, culture and intersubjectivity: The creation of shared perception. In S Bråten,
ed., Intersubjective communication and emotion in early ontogeny, pp. 297–315. Cambridge University
Press, Cambridge.
Kuhl PK, Andruski JE, Chistovich LA et al. (1997). Cross-language analysis of phonetic units in language
addressed to infants. Science, 277(5326), 684–686.
Ladd DR (1996). Intonational phonology. Cambridge University Press, Cambridge.
Lapp D (2003). The physics of musical instruments. Wright Center For Innovative Science Education,
Tufts University, Medford, Massachusetts. Available as pdf from http://staff.tamhigh.org/lapp/
Origins and development of musical competence, pp. 3–34. Oxford University Press, Oxford,
New York, Tokyo.
LeDoux JE (2002). Emotion, memory, and the brain. Scientific American, 12, 62–71.
Legerstee M (1992). A review of the animate–inanimate distinction in infancy: Implications for models of
social and cognitive knowing. Early Development and Parenting, 1, 59–67.
Legerstee M (2005). Infants’ sense of people: Precursors to a theory of mind. Cambridge University Press,
Cambridge.
Lewis C (1995). Educating hearts and minds. Cambridge University Press, Cambridge.
Locke JL (1993). The child’s path to spoken language. Harvard University Press, Cambridge, MA and
London.
Locke JL (1994). The biological building blocks of spoken language. In JA Hogan and JJ Bolhuis, eds,
Causal mechanisms of behavioural development, pp. 300–324. Cambridge University Press, Cambridge.
1999–2000), 29–57.
Analysing pitch, timing, loudness and voice quality. Proceedings of the Institute of Acoustics,
19(5), 495–500.
Mangelsdorf S, McHale JL, Diener M, Heim Goldstein L and Lehn L (2000). Infant attachment:
Contributions of infant temperament and maternal characteristics. Infant Behaviour and Development,
23, 175–196.
Manning A (2004). The sound of life. BBC Radio 4, July 2004. CD produced by S Blunt for the Open
University, Milton Keynes.
Masataka N (1993). Relation between pitch contour of prelinguistic vocalisations and communicative
functions in Japanese infants. Infant Behaviour and Development, 16(3), 397–401.
Masataka N (1995). The relation between index-finger extension and the acoustic quality of cooing in
3-month-old infants. Journal of Child Language, 22(2), 247–257.
McNeill D (1992). Hand and mind: What gestures reveal about thought. University of Chicago Press,
Chicago, IL.
Meltzoff AN (1995). Understanding the intentions of others: Re-enactment of intended acts by
18-month-old children. Developmental Psychology, 31, 838–850.
Meltzoff AN and Moore MK (1999). Persons and representation: Why infant imitation is important for
theories of human development. In J Nadel and G Butterworth, eds, Imitation in infancy, pp. 9–35.
Miall DS and Dissanayake E (2003). The poetics of babytalk. Human Nature, 14(4), 337–364.
Mundy-Castle A (1980). Perception and communication in infancy: A cross-cultural study. In D Olson, ed.,
The social foundations of language and thought, pp. 231–253. Norton and Co., New York.
Murray L and Trevarthen C (1985). Emotional regulation of interactions between two-month-olds
and their mothers. In TM Field and N Fox, eds, Social perception in infants, pp. 177–197. Ablex,
Norwood, NJ.
Nadel J and Butterworth G (eds) (1999). Imitation in infancy. Cambridge University Press, Cambridge.
Nadel J and Muir D (eds) (2005). Emotional development. Oxford University Press, Oxford.
Nadel J, Carchon I, Kervella C, Marcelli D and Reserbat-Plantey D (1999). Expectancies for social
Nagy E and Molnár P (2004). Homo imitans or Homo provocans? Human imprinting model of neonatal
imitation. Infant Behaviour and Development, 27(1), 54–63.
Nakano S (1997). Heart-to-heart (inter-Jo-) resonance: A concept of intersubjectivity in Japanese everyday
life. Annual Report, 1995–1996, No. 19. Research and Clinical Center for Child Development, Faculty of
Education, Hokkaido University, Japan, pp. 1–14.
Ochsner K and Barrett L (2001). A multiprocess perspective on the neuroscience of emotion. In T Mayne
and G Bonanno, eds, Emotions: Current issues and future directions. Emotions and social behaviour,
pp. 38–81. The Guilford Press, New York.
Oller DK (1986). Metaphonology and infant vocalisations. In B Lindblom and R Zetterstrom, eds,
Precursors of early speech, pp. 21–35. Macmillan, Basingstoke.
Oller DK and Eilers RE (1992). Development of vocal signalling in humans. In H Papoušek, U Jürgens and
M Papoušek, eds, Nonverbal vocal communication: comparative and developmental aspects, pp. 174–191.
Panksepp J (2001). The long-term psychobiological consequences of infant emotions: prescriptions for
the 21st century. Infant Mental Health Journal, 22, 132–173. (Reprinted in Neuropsychoanalysis,
3, 140–178).
Panksepp J and Bernatzky G (2002). Emotional sounds and the brain: the neuro-affective foundations of
Panksepp J, Nelson E and Bekkedal M (1997). Brain system for the mediation of social separation-distress
and social reward. Evolutionary antecedents and neuropeptide intermediaries. In CS Carter,
II Lederhendler and B Kirkpatrick, eds, The integrative neurobiology of affiliation. Annals of the New
York Academy of Sciences, 807, pp. 78–101. The New York Academy of Sciences, New York.
Papoušek H (1996). Musicality in infancy research: biological and cultural origins of early musicality
Papoušek H and Bornstein MH (1992). Didactic interactions: intuitive parental support of vocal and
verbal development in human infants. In H Papoušek, U Jürgens and M Papoušek, eds, Nonverbal vocal
communication. Comparative and developmental approaches, pp. 209–229. Cambridge University Press,
Cambridge.
Papoušek H and Papoušek M (2002). Parent infant speech patterns. In G Gomes-Pedro, K Nugent,
G Young and B Brazelton, eds, The infant and family in the twenty-first century, pp. 101–108.
Brunner-Routledge, New York/Hove, UK.
Papoušek H, Jürgens U and Papoušek M (eds) (1992). Nonverbal vocal communication: comparative and
developmental aspects. Cambridge University Press, Cambridge/Editions de la Maison des Sciences de
l’Homme, Paris.
Papoušek M (1992). Early ontogeny of vocal communication in parent-infant interactions. In H Papoušek,
U Jürgens and M Papoušek, eds, Nonverbal vocal communication. Comparative and developmental
approaches, pp. 230–261. Cambridge University Press, Cambridge.
Papoušek M (1994). Melodies in caregivers’ speech: A species specific guidance towards language. Early
Research, 1, pp. 163–224. Ablex, Norwood, NJ.
Papoušek M and Papoušek H (1991). The meanings of melodies in motherese in tone and stress languages.
Infant Behaviour and Development, 14, 415–440.
Papoušek M, Papoušek H and Bornstein MH (1985). The naturalistic vocal environment of young infants:
On the significance of homogeneity and variability in parental speech. In TM Field and N Fox, eds,
Social perception in infants, pp. 269–298. Ablex, Norwood, NJ.
Payne K (2000). The progressively changing songs of humpback whales: A window on the creative process
in a wild animal. In NL Wallin, B Merker and S Brown, eds, The origins of music, pp. 135–150. The MIT
Pegg JE, Werker JF and McLeod PJ (1992). Preference for infant-directed over adult-directed speech:
Evidence from 7-week-old infants. Infant Behavior and Development, 15, 325–345.
Petitto LA and Marentette PF (1991). Babbling in the manual mode: evidence for the ontogeny of
language. Science, 251, 1493–1496.
Porges SW (2005). The role of social engagement in attachment and bonding: A phylogenetic perspective.
In CS Carter, L Ahnert, KE Grossman et al., eds, Attachment and bonding: A new synthesis (Dahlem
Workshop Report 92), pp. 33–54. The MIT Press, Cambridge, MA.
Powers N (2001). Intrinsic musicality: Rhythm and prosody in infant-directed voices. Annual Report,
1999–2000, No. 23, pp. 1–19. Research and Clinical Center for Child Development, Faculty of
Education, Hokkaido University, Japan.
Querleu C, Lefebvre C, Titran M, Renard X, Morrillion M and Crepin G (1984). Réactivité du nouveau-né
moins de deux heures de vie à voix maternelle. Journal de Gynecologie, Obstetrique et Biologie de la
Reproduction, 13, 125–134.
Reddy V and Trevarthen C (2004). What we learn about babies from engaging with their emotions.
Zero to Three, 24(3), 9–15.
Rock AML, Trainor LJ and Addison TL (1999). Distinctive messages in infant-directed lullabies and
play songs. Developmental Psychology, 35, 527–534.
Rogoff B (1998). Cognition as a collaborative process. In D Kuhn and RS Siegler, eds, Handbook of child
psychology, volume 2: Cognition, perception and language, pp. 679–744. Wiley, New York.
Rogoff B (2003). The cultural nature of human development. Oxford University Press, Oxford.
Rogoff B, Paradise R, Arauz RM, Correa-Chávez M and Angelillo C (2003). Firsthand learning through
Shimura Y and Imaizumi S (1995). Emotional information in young infants’ vocalisations. Proceedings of
the International Congresses of Phonetic Sciences, 3, 412–415.
Shimura Y, Saito K, Imaizumi S and Yamamuro C (1996). Interrelationships between prosody and gesture
in vocal communication in the early years: Emotion perception and speaker identification. In The
Emergence of Human Cognition and Language, 3, 225–230. Annual report, March 1996, Grant-in-Aid
for Scientific Research, Ministry of Education, Science, Sport and Culture, Japan.
Siegman AW and Feldstein S (1979). Nonverbal behaviour and communication. Erlbaum, Hillsdale, NJ.
behaviours. In M Lewis and LA Rosenblum, eds, The effect of the infant on its caregiver, pp. 187–213.
Wiley, New York.
Stern DN (1977). The first relationship: infant and mother. Harvard University Press, Cambridge, MA.
Stern DN (1985). The interpersonal world of the infant: A view from psychoanalysis and developmental
Stern DN (1992). L’enveloppe prénarrative: Vers une unité fondamentale d’expérience permettant
d’explorer la réalité psychique du bébé. Revue Internationale de Psychopathologie, 6, 13–63.
Stern DN (1995). The motherhood constellation: A unified view of parent-infant psychotherapy. Basic Books,
New York.
infant’s social experience. In P Rochat, ed., Early social cognition: understanding others in the first
months of life, pp. 67–90. Erlbaum, Mahwah, NJ.
psychology, 2nd edn with new Introduction. Basic Books, New York.
Stern DN, Bruschweiler-Stern N, Harrison AM et al. (1999). The process of therapeutic change involving
implicit knowledge: Some implications of developmental observations for adult psychotherapy. Infant
Mental Health Journal, 19(3), 300–308.
Stern DN, Spieker S and MacKain K (1982). Intonation as signals in maternal speech to prelinguistic
infants. Developmental Psychology, 18, 727–735.
Stern DN, Hofer L, Haft W and Dore J (1985). Affect attunement: The sharing of feeling states between
Thompson E (ed.) (2001). Between ourselves: second-person issues in the study of consciousness. Imprint
Academic, Charlottesville, VA/Thorverton, UK. Also published in the Journal of Consciousness Studies,
8, Number 5–7.
Trehub SE (1987). Infants’ perception of musical patterns. Perception and Psychophysics, 41(6), 635–641.
Trehub SE (1990). The perception of musical patterns by human infants: The provision of similar patterns
by their parents. In MA Berkley and WC Stebbins, eds, Comparative perception; vol. 1, mechanisms,
pp. 429–459. Wiley, New York.
Trehub SE (2003). Musical predispositions in infancy: An update. In I Peretz and R Zatorre, eds,
The cognitive neuroscience of music, pp. 3–20. Oxford University Press, New York.
Trehub SE and Nakata T (2002). Emotion and music in infancy. Musicae Scientiae (Special Issue
2001–2002), 37–61.
Trehub SE, Schellenberg EG, Glenn E and Hill DS (1997). The origins of music perception and cognition:
A developmental perspective. In I Deliège, J Sloboda et al., eds, Perception and cognition of music,
pp. 103–128. Psychology Press, Hove, UK.
Trevarthen C (1978). Modes of perceiving and modes of acting. In JH Pick, ed., Psychological modes of
perceiving and processing information, pp. 99–136. Erlbaum, Hillsdale, NJ.
intersubjectivity. In M. Bullowa, ed., Before speech: The beginning of human communication,
Trevarthen C (1984). Emotions in infancy: Regulators of contacts and relationships with persons.
In K Scherer and P Ekman, eds, Approaches to emotion, pp. 129–157. Erlbaum, Hillsdale, NJ.
Trevarthen C (1986). Form, significance and psychological potential of hand gestures of infants.
In J-L Nespoulous, P Perron and AR Lecours, eds, The biological foundation of gestures: Motor and
semiotic aspects, pp. 149–202. Erlbaum, Hillsdale, NJ.
Trevarthen C (1988). Universal cooperative motives: How infants begin to know language and skills of
culture. In G Jahoda and IM Lewis, eds, Acquiring culture: Ethnographic perspectives on cognitive
development, pp. 37–90. Croom Helm, London.
Trevarthen C (1993). The function of emotions in early infant communication and development.
In J Nadel and L Camaioni, eds, New perspectives in early communicative development, pp. 48–81.
Routledge, London.
Trevarthen C (1994). Infant semiosis. In W Nöth, ed., Origins of semiosis: Sign evolution in nature and
culture, pp. 219–252. Mouton de Gruyter, New York.
Trevarthen C (1998). The concept and foundations of infant intersubjectivity. In S Bråten, ed.,
Cambridge.
Trevarthen C (2001). The neurobiology of early communication: intersubjective regulations in human
brain development. In AF Kalverboer and A Gramsbergen, eds, Handbook on brain and behavior in
Press, Oxford.
Trevarthen C (2004a). Infancy, mind in. In RL Gregory, ed., Oxford companion to the mind, 2nd edn,
pp. 455–464. Oxford University Press, Oxford, New York.
Trevarthen C (2004b). How infants learn how to mean. In M Tokoro and L Steels, eds, A learning zone of
one’s own, pp. 37–69. (SONY Future of Learning Series). IOS Press, Amsterdam.
Trevarthen C (2005a). First things first: infants make good use of the sympathetic rhythm of imitation,
without reason or language. Journal of Child Psychotherapy, 31(1), 91–113.
Trevarthen C (2005b). Stepping away from the mirror: Pride and shame in adventures of companionship.
Reflections on the nature and emotional needs of infant intersubjectivity. In CS Carter, L Ahnert,
KE Grossman et al., eds, Attachment and bonding: A new synthesis (Dahlem Workshop Report 92),
pp. 55–84. The MIT Press, Cambridge, MA.
intrinsic factors in child mental health. Development and Psychopathology, 6, 599–635.
Trevarthen C and Aitken KJ (2001). Infant intersubjectivity: Research, theory and clinical applications.
Annual Research Review, Journal of Child Psychology and Psychiatry, 42(1), 3–48.
Trevarthen C and Aitken KJ (2003). Regulation of brain development and age-related changes in infants’
Trevarthen C and Hubley P (1978). Secondary intersubjectivity: confidence, confiding and acts of
meaning in the first year. In A Lock, ed., Action, gesture and symbol, pp. 183–229. Academic Press,
New York.
Trevarthen C and Marwick H (1982). Cooperative understanding in infants. Project report to the Spencer
Foundation of Chicago. Department of Psychology, The University of Edinburgh.
Trevarthen C and Reddy V (2007). Consciousness in infants. In M Velmans and S Schneider, eds,
A companion to consciousness, pp. 41–57. Blackwells, Oxford.
Trevarthen C, Kokkinaki T and Fiamenghi GA Jr (1999). What infants’ imitations communicate: With
mothers, with fathers and with peers. In J Nadel and G Butterworth, eds, Imitation in infancy,
pp. 127–185. Cambridge University Press, Cambridge.
Trevarthen C, Murray L and Hubley PA (1981). Psychology of infants. In J Davis and J Dobbing, eds,
Scientific foundations of clinical paediatrics, 2nd edn, pp. 211–274. Heinemann Medical, London.
Tronick EZ (2005). Why is connection with others so critical? The formation of dyadic states of
consciousness: coherence governed selection and the co-creation of meaning out of messy meaning
making. In J Nadel and D Muir, eds, Emotional development, pp. 293–315. Oxford University Press,
Oxford.
Tronick EZ, Als H, Adamson L, Wise S and Brazelton TB (1978). The infant’s response to entrapment
between contradictory messages in face-to-face interaction. Journal of the American Academy of Child
Psychiatry, 1, 1–13.
Vygotsky LS (1962). Thought and language. MIT Press, Cambridge, MA.
Werker JF and McLeod PJ (1989). Infant preference for both male and female infant-directed talk:
A developmental study of attentional affective responsiveness. Canadian Journal of Psychology,
43, 230–246.
Chapter 11
‘Music’ and the ‘action song’ in infant development: An interpretation
Patricia Eckerdal and Bjorn Merker
11.1 Introduction
After more than a century of study, the newborn’s developmental path to maturity is still
imperfectly understood. For example, the use of song and musical play in adult–infant interaction
is cross-culturally ubiquitous (Trehub and Trainor 1998; Unyk et al. 1992; Falk 2004), yet the
question of what functional role this common form of interaction with infants has in their
emotional, social and cognitive development is only in the process of being formulated (Papousek
et al. 1991; Papaeliou and Trevarthen 1994; Trevarthen 1999; Malloch 1999; Mozgot 2003).
Motivated by an interest in gathering descriptive material as an aid to thought on this functional
issue, we decided to record mother–infant interaction in the home setting in a collaborative
project with Colwyn Trevarthen. In brief, 25 mother–infant dyads were filmed continuously for
one to two hours during routine activities in the home when infants were 6, 9 and 12 months of
age to obtain samples of the everyday use of music in interaction with infants for subsequent
analysis. The filming was conducted by the first author in the province of Jämtland in the north
of Sweden in 1998 and 1999, and resulted in a total of 120 hours of digital video recordings
of diverse behavioural content. This material was then classified behaviourally, with particular
focus on sequences containing song, music or rhythmic activity as a basis for additional study,
employing frame-by-frame analysis, sonograms and other methods. Additional information
regarding data collection and analysis is provided in Appendix I.
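As a rough consistency check on the figures just given, the following minimal sketch works through the arithmetic. It assumes, as the description suggests but does not state outright, that every dyad was filmed once at each of the three ages; the variable and function names are ours and purely illustrative.

```python
# Illustrative sanity check of the recording figures reported in the text.
# Assumption (not stated explicitly in the chapter): each of the 25 dyads
# was filmed exactly once at each of the three ages.

DYADS = 25                        # mother-infant dyads (from the text)
AGES_MONTHS = (6, 9, 12)          # filming ages (from the text)
SESSION_HOURS_RANGE = (1.0, 2.0)  # "one to two hours" per session
REPORTED_TOTAL_HOURS = 120.0      # total video reported in the text


def session_count(dyads, ages):
    """Number of filming sessions if every dyad is filmed at every age."""
    return dyads * len(ages)


def total_hours_bounds(sessions, hours_range):
    """Lower and upper bound on total recording time."""
    shortest, longest = hours_range
    return sessions * shortest, sessions * longest


def mean_session_hours(total_hours, sessions):
    """Average session length implied by the reported total."""
    return total_hours / sessions


sessions = session_count(DYADS, AGES_MONTHS)
low, high = total_hours_bounds(sessions, SESSION_HOURS_RANGE)
mean_hours = mean_session_hours(REPORTED_TOTAL_HOURS, sessions)

print(f"sessions: {sessions}")
print(f"possible total: {low:.0f} to {high:.0f} hours")
print(f"implied mean session length: {mean_hours:.1f} hours")
```

Under that assumption the reported total of 120 hours falls inside the range the design allows, and corresponds to sessions averaging a little over an hour and a half.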
In this chapter we are not concerned primarily with reporting specific results of this study, but
rather with giving an account of the conceptual issues with which we had to grapple to interpret,
and sometimes even simply to categorize, the many music-related behavioural sequences we
captured on film. We came to the project fully cognizant of the extensive evidence documenting
the receptive musical precocity of the human infant (summarized in Trehub 2000; Ilari 2002). By
weaving song into the routines by which they care for and interact with their infants, mothers
show an intuitive awareness of their infant’s receptivity to music. Using singing when an infant is
upset or distracted also implies a recognition that such stimuli are salient for infants. The infants
in our study did not, of course, respond by singing back to the mother; they were too young for
that. They showed their engagement, excitement or calming in varied responses to songs and
other music by the various vocal, postural and gestural means of non-verbal expressiveness that
form part of our native equipment. We call the time lag between infant receptivity to music in
others’ behaviour and their own expression of music the ‘developmental paradox of music’, and
will have more to say about it in what follows.
The distinction just drawn between music receptivity and expression is crucially dependent on
what one means by ‘music’. For this reason, we had to deal explicitly with the problem of what
would count as a musical structure in our classification. We see this as especially important
because the question that motivated our study ultimately concerns the possible roles of such
structures in the process of learning through which we are ‘inducted’ into our complex natal
culture (Tomasello et al. 1993a; Merker 2005, and Chapter 4, this volume). Such questions cannot
be answered without a workable way of distinguishing specifically musical structures from the
many forms of non-verbal communicative expressiveness humans command on a largely innate
basis (as do all vocal animals, whether or not in addition there is anything resembling song in
their repertoire; see Marler 2004, and the next section). Yet arriving at workable criteria for what
constitutes a musical structure is not a trivial matter: the lack of an agreed definition of ‘music’ in
musicology is notorious (Merker 2002). As we shall see, the issue is far from intractable, and also
bears upon the characterization of so-called infant-directed speech.
As if such quandaries did not suffice, our engagement with the video materials confronted us
with even larger issues. In retrospect, it seems obvious that to understand the developmental role
of some facet of mother–infant interaction it is helpful to know where development is heading,
and what end-state it is ‘meant’ to achieve. Otherwise we may fail to recognize the bearing of a
particular behaviour on a crucial aspect of adult cultural competence. Accordingly, the answer to
some developmental questions will depend on how human culture is defined and characterized.
There was one strand of our material which especially raised this larger question by both
stimulating and resisting our efforts to ‘make sense of it’ in terms of the cultural categories we brought
to our work. That strand was the type of ‘artificially’ stereotyped combination of action schemata
with song we came to designate ‘action songs’. It was this interpretive puzzle surrounding action
songs that inspired the perspective on human culture presented by the second author (Chapter 4,
this volume), and we shall outline our interpretation of the relationship between music, action
songs, and human culture towards the end of this chapter. Meanwhile, the following preliminaries
will provide essential background to that interpretation.
11.2 The non-verbal foundations of human communication

There is a rich domain of human vocal (and postural–gestural) communication that is
completely independent of and more basic than both language and music. It vastly antedates
both of these in our prehistory, the human behaviours being in most respects, we believe,
homologous and continuous with modes of communication in our non-human primate relatives, and
even with those of mammals more generally (see, e.g., Marler 1955, 2004; Todt 1988; Macedonia
and Evans 1993; Hauser 1996; Evans 1997; Owings and Morton 1998; Jürgens 1998, 1999). In
some respects, this continuity of expressive functions extends far back into our pre-mammalian
vertebrate ancestry. These modes of non-verbal social signalling serve a range of communicative
functions. They include the expression of a variety of emotional and motive states that arise
within the individual (Scherer 1986; Panksepp and Trevarthen, Chapter 7, this volume). Some
non-verbal signals carry so-called ‘functional reference’ to activities such as finding food (Marler
et al. 1986; Hauser 2000) and detecting a predator (Struhsaker 1967; Gyger et al. 1987). Others
regulate social interactions like courtship (Wickler 1974), play (Bekoff 1972), and contact
between an ambulatory mother and offspring through interactive calling (Marler 2004; Searby
and Jouventin 2003). Vocal calls used in these ways are predominantly innate (Jürgens 1998,
1999), although some, for example the alarm calls of vervet monkeys (Seyfarth and Cheney
1980), may be refined and differentiated by learning. Although calls tend to be unitary sound
gestures, they often exhibit systematic variation in such characteristics as dynamics and intensity,
which convey subtle emotional nuances and temporal fluctuations in motive state (Scherer 1986;
Hauser 2000). They are sensitive to context and may be repeated when the evoking state persists,
but are essentially devoid of syntactic complexity in the sense of using call combinations to
convey compound meanings (Marler 2000). Calls are generally understood by all conspecifics
within range, and include alarm calls (which may be differentiated according to predator class),
isolation calls, contact calls, begging calls, food calls (which may be differentiated according
to food quality or type), threats, and calls used in appeasement, flight, courtship and the care of
offspring (cf. Marler 2004).
The human capacities for song and speech are vocal resources which (like those for song
in singing animals) have been added over and above this basic repertoire of non-verbal vocal
expressiveness, as depicted schematically in Figure 11.1.
The human repertoire of non-verbal expressiveness is likely to be no less rich than are its
animal homologues. The dominance of language as a human expressive modality has, however,
relegated the study of this important aspect of our behavioural biology to the sidelines. One
searches in vain, for example, for even a simple systematic inventory of the human repertoire of
non-verbal expressive calls. An informal list includes at least the following (asterisks mark those
less likely to be produced by infants, although this remains to be determined in several of the
cases, and some are matters of definition or age): coos, lip-smacking, tongue-smacking,
high-pitched forceful pure-tone glissandos, peeps, hoots*, whines, whimpers, moans*, grunts, groans*,
growls*, gurgles, gasps, sighs, coughs*, howls*, shouts, screams, panting, whooping*, yelping*,
warbles, trills*, squeals, snickers*, giggles, laughter, sobbing, ‘oooohs and aaaahs’, breathy
exhalation, glottal-stop exhalation, glottal-stop inhalation*, and hisses. Some of these may be culturally
based, at least in their communicative use, and many undergo modification through their
interactive use in the first months of life, but most are found in every human society, and many of
their equivalents are attested again and again in the call repertoires of various species of
animals—extending even to ‘laughter’ and ‘giggling’ (Burgdorf and Panksepp 2001). Thus, a rich
innate repertoire of calls is available as a primary medium for the vocal expression of motive
states in humans no less than in animals. This is the domain of the natural schemata (Morton
1977; Papousek 1996, p. 95; Cordes 1997, 1998; Stern 1999; Cohen 2000, 2003; Scherer 1995;
Scherer et al. 2001) by which, in part, our innate feelings and impulses clothe themselves in
projected sound and bodily movement for others to apprehend through shifts and finely graded
fluctuations in the ebb and flow of vocal and gestural dynamics, reflecting functional processes
from the autonomic to the cerebral. Here is something deserving the name of ‘a language of
emotions’, although, of course, it is neither a language in a formal sense nor strictly confined to
emotions, since it includes phenomena such as the social timing signals of play (discussed further
below). This repertoire of calls and dynamic schemata is the source from which both music and
language draw the means to convey emotional meaning, and they do so in similar ways (Juslin
and Laukka 2003; Scherer 1995; Burling 1993; Brandt, Chapter 3 and Cross and Morley, Chapter 5,
this volume, discuss distinctions between music and language).
Fig. 11.1 Schematic depiction of the relation between human non-verbal expressiveness, song and
music, and language. [Diagram regions: ‘Song and music’, ‘Language’, ‘Non-verbal expressiveness’.]
Each region in the diagram has concrete instantiations among the varieties of
human expressiveness. The central region would contain, for example, ‘lyrics sung “with feeling”’.
A schematism such as the use of rising pitch, increasing loudness, and quickening pace to
signal mounting excitement can occur in music, language and non-verbal vocal expression; as we
share the latter version with animals, and it is understandable to us in its non-verbal form quite
apart from music and language, this would seem to be the original on which the other two draw.
The use in music of such expressive devices is neither an essential nor a defining feature of music.
The classic treatment of this issue by Hanslick is still unsurpassed (Hanslick 1854). Rather, the
immediacy with which such devices are understood, even when clothed in the formal structures
of music, reflects their origin in the primary domain of basic non-verbal expressiveness. When
we really are sad, we do not sing a sad song, but cry and sob convulsively; when we are really
angry, we do not merely raise our talking voice, but burst the bounds of language by shouting
and screaming ‘incoherently’. Whistling a bright tune tends to signal leisurely contentment, while
great happiness makes us jump, flail, laugh and squeal with delight. The squeals may use a pure,
mellifluous tone, but that by itself does not, of course, mean that they are music (as we shall see
in greater detail in the next section). Parenthetically, let us note that the use in music of contours
and dynamics from the primary domain of non-verbal expressiveness may have encouraged the
metaphorical use of ‘musical’ for some of the phenomena of non-verbal expressiveness on which
music models emotional effects (Papousek 1996, p. 94; Papaeliou and Trevarthen 1994; Papousek
et al. 1991). The tendency is also manifested in the speech-oriented literature on vocal
development by the habit of referring to the pitch contours of non-verbal calls as ‘melodic contours’, again
without structural relation to musical melody as such (Grieser and Kuhl 1988; Hsu et al. 2000,
p. 3, referring to results by D’Odorico). To avoid the confusion that may result from such
terminological assimilation of phenomena across domains, there is a need for structural criteria
allowing them to be distinguished. They will be discussed in the next section.
Human infants are richly endowed with a range of non-verbal modes of vocal expressiveness
(Prescott 1975; Stark 1978; Kent and Murray 1982; D’Odorico 1984; Marcos 1987; Stark et al.
1993; Nwokah et al. 1994; McCune et al. 1996; Hsu et al. 2000), a sample being included in the list
of ‘human calls’ above. Given the relative bodily immaturity of the human infant, vocal
expressiveness supplies a primary medium for the infant’s interactive engagement with others. An
example from our material is the excited squeal by which a 6-month-old girl conducts a brief
‘protoconversation’ with her mother in a book-reading situation, depicted in Figure 11.2.
Human infants appear far more vocal than the infants of non-human primates such as
chimpanzees (Falk 2004), and may be compared to hatchling birds in this as in other respects
(Alexander 1990). Infant facility in this regard includes—as it does in all animals—not only the
vocal signals, but also their appropriate contextual cueing and use in interaction with others, as
illustrated in the ‘protoconversational’ example in Figure 11.2(a). This includes their timing, which in
interactive contexts extends to smooth reciprocal timing, as exhibited in turn-taking and
expectations of contingency (Stern 1977; Rubin et al. 1983; Murray and Trevarthen 1985; Fogel 1993;
MacDonald 1993). Focused attunement to the partner, flexible reactivity and smooth mutual
adjustment in the timing of movements, and the taking of turns, are all featured in animal play
(Bekoff 1972; Bekoff and Byers 1998) and in other interactive social behaviours such as contact
calling. The masters of animal play are the young of the species, often preferring one another to
adults as partners in play (for primates, see Biben and Suomi 1993).
The interactive timing of mother–infant interaction thus finds in animal play and contact
calling a natural homologue and model within the more general framework of non-verbal
communication, and requires no special explanatory effort invoking musical or other categories.
[Figure 11.2: annotated sonogram in two rows, covering 0–9.5 s and 10–19.5 s. x-axis: time (s); y-axis: frequency (kHz, 0–2.5). Panels (a)–(d) carry the Swedish transcription of the mother’s speech and song and mark the infant’s squeal.]
Fig. 11.2 Annotated sonogram of a 20-second continuous sound segment taken from our video
corpus. Time in seconds is on the x-axis, frequency content in kHz on the y-axis. The fundamental
is black; partials and noise are grey (see text for analysis). A 6-month-old girl sits on the lap of her
mother ‘reading’ a children’s picture book whose pages the mother turns. Part (a): At the beginning
of the segment the mother turns a page to reveal the picture of a sheep, and exclaims ‘and the
sheep again’ in a soft ordinary speaking voice. The girl utters a delighted high-pitched squeal,
which the mother affirms with an excited ‘yeeees’ using the prosody of infant-directed speech. It is
classified as ‘protoconversation’ in our analysis. Part (b): The mother produces a life-like imitation
of bleating, using a highly aspirated, staccato voice (reflected in the characteristic spectral content
of the tracing), inserting the comment ‘that’s the way it sounds’ in mid performance. With the last
‘bäää’ the mother returns to a more normal voice, as reflected in the spectral tracing. Part (c): The
mother now repeats pairs of ‘bäää’ sounds with song-like intonation (classified as ‘song fragment’
in our analysis), kissing the girl’s head between pairs. Part (d): The mother sings the Swedish version
of ‘Baa, Baa, Black Sheep’ (whose sheep is white, quite in keeping with the picture book).
The cultural goods of songs and other explicitly musical content often featured in human
adult–infant play are introduced by the adult, again exemplified by the mother’s stage-wise
introduction of a children’s song into the play sequence depicted in Figure 11.2, parts (c) and (d). The
infant can then rely on its native communicative equipment (which is likely to include specifically
human motive mechanisms underpinning vocal learning; see Merker 2005 and below), as well as on
learned expectancies generated by predictable features of repeated musical structures, to fit
non-verbal vocalizations and other reactions into this adult-supplied framework (Papousek
1996; Trevarthen 1999; Malloch 1999). We shall consider this and related issues further in the
final section of this chapter.
Similar considerations apply to the interpretation of the distinctive prosodic features of
infant-directed speech, the mode of address adults use in communicating with infants (Fernald and
Simon 1984; Fernald and Kuhl 1987; Grieser and Kuhl 1988; Fernald et al. 1989). Its prosodic
peculiarities, such as high pitch with exaggerated intonation contours and pitch range (e.g., the
excited ‘jaaaa’ by which the mother acknowledges her daughter’s squeal in Figure 11.2), have been
called ‘musical’, but we believe they are only metaphorically so. Rather, they are typical instances
of the non-verbal vocal modes through which we convey emotion, in common with animals.
When mothers (and other adults) resort to such prosody to engage with the prelinguistic infant,
they are putting to use, on an intuitive basis, only their own and the infant’s native equipment of
emotional communication, as sketched in the foregoing. The prosodic features of infant-directed
speech are thus first and foremost instances of an emotional mode of address, as recently shown
by Trainor et al. (2000). In addition to serving this primary, emotionally based, communicative
function, the features of infant-directed speech may contribute to language acquisition through
effects such as supplying prosodic cues to parsing (Gerken 1996) and enhancing the perceptual
salience of vowels (Kuhl et al. 1997; Xu et al. 2006). These functions would amount to little unless
the speech stream held the infant’s attention, and that, we suggest, is accomplished by
infant-directed speech on the basis of an ancient linkage between voice and motive state which we share
with other animals, and which is strikingly embodied in the dynamics of speech directed to
infants. Thus far, none of the infant capacities and behaviours we have dealt with are in any
way specifically related to music. Rather, we have been at pains to sketch, however briefly, the
complex and sophisticated communicative background within which and on the basis of which
additional and specifically human manifestations of ritual culture such as song, instrumental
music, dance and language (see Merker, Chapter 4, this volume, for a detailed discussion of ritual
culture) play out their developmental choreography. In attempting to trace the developmental
emergence of these specifically human arts—which differ from the domain of non-verbal expres-
siveness by their obligatory dependence on learning—we believe that it would be fatal to
confound them with elements of the common communicative heritage we share with primates,
mammals and even vertebrates. We do not abandon that heritage in acquiring our new-fangled
human arts, but retain it as a secure and basic communicative foundation for all else we do.
Hence, we have made these prefatory remarks to provide background argument to the task of
characterizing specifically human ritual resources such as music, to which we now turn.
11.3 Music as music

Darwin (1871, p. 590) called the human faculties for the enjoyment and production of musical
notes ‘amongst the most mysterious with which [man] is endowed’. His mention of ‘notes’ is well
taken, because unlike most animal song, which consists of units which themselves are tiny
phrases or vocal gestures, the melodies of music typically consist of sequences of simple notes.
A note is a pitch to which one can return. That is what distinguishes ordinary phonation from
intonation: intonation is the ability to return to the same pitch repeatedly. It is an essential condi-
tion for the formation of melodies, which are sequences of such ‘returnable-to’ stations within
the continuum of pitch, and finding them or not determines whether you sing in or out of tune.
Even the most fluid ‘portamento’ avails itself of these stations, lest it be musically out of tune. The
distinction is unique to music, and taking it seriously goes some way towards accounting for the
remarkable properties of music. It is unique in availing itself of the simplest conceivable unit of
vocal production—the single note—as the grist for its pattern mill. By throwing away most of the
continuum of pitch to retain small sets of these simple pitch-stations as elements of its patterns,
it conquers for itself the infinitely rich universe of melody (Merker 2002, 2006).
Music has done the same thing it has done for pitch—the categorization of a continuum—for
time, in the device of rhythmicity. In the same way in which all melodic figures combine unique
pitches, so all rhythmic figures combine discrete durations. This is achieved by discretizing the
time continuum through the simple device of the musical pulse. The repetitive ‘unit of time’ sup-
plied by the musical pulse, or tactus, underlies all rhythmic (‘measured’) music. Its patterns,
accordingly, consist of ‘durations with proportional values’ (Arom 1991, p. 179ff), related to the
pulse through subdivision and multiplication. Again, the combination of such elements results in
a boundless diversity of potential rhythmic patterns. Combining these patterns with those of
melody gives us the universe of music as we know it. Its structural secret, accordingly, is the
boundless pattern possibilities made available through the radical simplification of categorically
subdividing the continua of pitch and time into small sets of finite elements whose combinations
form melodies and combinations of melodies without end (Merker 2002; and see Brandt,
Chapter 3, this volume, on ‘discretization’).
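The categorization of the two continua can be illustrated with a minimal sketch. It assumes, purely for illustration, an equal-tempered division of the octave into twelve steps referenced to 440 Hz, and a fixed pulse length; neither assumption comes from the argument above, which is neutral as to tuning system, and all function names are hypothetical.

```python
import math
from fractions import Fraction

# Illustrative assumptions only: 12 equal steps per octave, reference 440 Hz.
REFERENCE_HZ = 440.0
STEPS_PER_OCTAVE = 12


def nearest_pitch_station(freq_hz):
    """Map a continuous frequency onto the nearest discrete pitch station.

    Returns (step, deviation_cents): the station index relative to the
    reference pitch, and the distance of the input from that station in cents.
    """
    steps = STEPS_PER_OCTAVE * math.log2(freq_hz / REFERENCE_HZ)
    step = round(steps)
    deviation_cents = (steps - step) * (1200 / STEPS_PER_OCTAVE)
    return step, deviation_cents


def nearest_proportional_duration(duration_s, pulse_s, max_factor=4):
    """Map a continuous duration onto a value proportional to the pulse.

    Candidate values are simple multiples (1, 2, ... max_factor) and simple
    subdivisions (1/2, ... 1/max_factor) of the pulse; the closest candidate
    on a logarithmic scale is returned as a Fraction of the pulse.
    """
    candidates = [Fraction(m, 1) for m in range(1, max_factor + 1)]
    candidates += [Fraction(1, d) for d in range(2, max_factor + 1)]
    ratio = duration_s / pulse_s
    return min(sorted(candidates), key=lambda c: abs(math.log(ratio / float(c))))
```

A sung pitch of 450 Hz, for example, falls on the same station as 440 Hz but lies some 39 cents sharp of it, which is the sense in which a note can be "out of tune" relative to a station.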
We are not claiming that music either originated in or is created through a process of piece-
wise assembly from collections of elementary, discrete units, but only that its most distinctive
structural feature is its reliance on sequences of discrete pitches and durations. These are traits
unique to music as music, which are absent from other forms of human communication. They
therefore provide an invaluable point of departure for distinguishing musical structures from
other structures in contexts where such distinctions might be difficult to make, as in infant devel-
opment (Papousek 1996, p. 104). The publication by the second author, cited above (Merker
2002), grew directly out of the need, in our project, for making such distinctions, as the issue had
not been clearly articulated in the extant musicological literature. To highlight the importance of
this issue, imagine, if you will, that the various studies that have established the receptive musical
precocity of the human infant (Trehub 2000) had employed as test stimuli not the standard
musical stimuli actually used (with their discrete pitches and note durations), but instead a vari-
ety of synthesized analogs of the squeals and pure tone glissandos by which infants express their
delight and excitement in communicative interactions (Papousek 1996). Interesting as such stud-
ies might be—in fact they deserve to be done but have not—under those circumstances, what
reason would there be to think that the results pertained to the specifically musical abilities of
infants?
Matters are no different when we turn to expressive behaviour. To call an expressive act
musical, we clearly need some way to distinguish musical expressiveness from the rich back-
ground of non-verbal expressiveness commanded by infants. The musically unique structural
features just mentioned allow us to do so. In the pitch domain, the behaviour would have to at
least aim at something tantamount to intonation, even if imperfectly achieved. That is, if the
possibility of singing out of tune is not a relevant factor in production, we have no assurance of
being on the grounds of music as far as pitch is concerned. Expressive acts traversing sets of dif-
ferent discrete pitches to which the voice returns (melodies) would accordingly always qualify, as
would a replicable approximation of the correct intonation of a single note, provided its pitch
was not the purely passive result of essentially physiological conditions. Evidence regarding imi-
tative pitch-matching by infants is as yet inconclusive. Thus, Kessen et al. (1979) reported such
matching in infants of 3 to 6 months of age, while Siegel et al. (1990), studying infants from
8 months to 1 year of age, and McRoberts and Best (1997) in a 14-month longitudinal case study
starting at 3 months of age, concluded against significant pitch-matching by infants (although
adults were found to adjust to infants in the last study). In view of the centrality of intonation as
a distinctive marker for music production, the issue deserves further study, and to be extended to
participants of younger age. Since sequences of pitch changes reflecting no more than the direc-
tional changes of a melody are structurally related to that melody, such sequences would also
qualify as ‘aiming’ at the production of a musical structure. In the complete absence of melody-
related structure (as in purely percussive rhythms) some evidence for time discretization by
a unit pulse (and, of course, any structure built on it) would suffice to call it musical.
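The contour criterion just described can be sketched as a comparison of directional changes. This is an illustration only: the function names and the 50-cent tolerance for treating two successive pitches as "the same" are hypothetical choices, and the sketch is not the sonogram-based procedure used in the study.

```python
import math


def cents_between(f1_hz, f2_hz):
    """Signed interval from f1 to f2 in cents (positive means upward)."""
    return 1200 * math.log2(f2_hz / f1_hz)


def contour(pitches_hz, same_pitch_cents=50.0):
    """Reduce a pitch sequence to its directional changes.

    Each step is coded +1 (up), -1 (down) or 0 (within the tolerance,
    i.e. treated as a return to the same pitch).
    """
    directions = []
    for previous, current in zip(pitches_hz, pitches_hz[1:]):
        interval = cents_between(previous, current)
        if abs(interval) <= same_pitch_cents:
            directions.append(0)
        else:
            directions.append(1 if interval > 0 else -1)
    return directions


def shares_contour(produced_hz, model_hz, same_pitch_cents=50.0):
    """True if the produced sequence follows the model's directional changes."""
    return contour(produced_hz, same_pitch_cents) == contour(model_hz, same_pitch_cents)
```

On this reading, a production can match the directional changes of a model melody while missing every one of its intervals, which is the weakest form of "aiming" at a melodic structure considered here.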
In no way do we mean that criteria such as these should be applied in the manner of a stern
and demanding music teacher who denies that a pupil who is singing out of tune and without
a steady beat is producing music. Even such a teacher would probably be willing to admit that
the pupil is in fact trying to sing, although the results are not deemed acceptable from the
standpoint of musical pedagogy. And ‘trying to sing’ is what we propose to define, in principle,
by aiming at the kind of structures described above, whether in the melodic or rhythmic domain.
In practical terms, difficulties remain when it comes to ascertaining the earliest manifestations
of such expressiveness. They may, in fact, be indistinguishable from the early stages in the devel-
opment of speech, as pointed out by Papousek (1996, p. 104). This difficulty should not be
underestimated, specifically in relation to the stage of canonical babbling in infant development.
Once the difficulty has been identified, there may yet be ways to distinguish the precursors of
musical expressiveness from those of speech on the one hand, and from our ordinary modes of
non-verbal expressiveness on the other (e.g., Sundin 1977, 1998). One approach might be to
fine-comb longitudinal developmental records backwards (i.e., in the same infant), starting from
unequivocal instances of each vocal class (song and speech), and to trace each such instance back
through its individual developmental history to its earliest recognizable beginnings, using
sonogram-based behavioural methods. Such an approach lies, however, entirely in the future.
11.4 The developmental paradox of music

Thus far, and short of the more demanding kind of search just alluded to, the first signs we have
found of structures classifiable as musical by the criteria outlined above occur at 12 months of
age in our corpus. The second year of life would thus be a time for the emergence of explicit
musical expressiveness, because by 2 years of age it is not unusual for children to produce
hummed or sung pitch contours that are clearly recognizable, even to the untrained ear, as
approximations of familiar children’s songs (unpublished personal observations; McKernon
1979; Papousek 1996, p. 105; Stadler Elmer 2000). Because infants, as already mentioned, display
a receptive sensitivity to musical structures far earlier than this—as early as it is convenient to test
them after birth (Trehub 2000)—it seems that a puzzling period of ‘musical silence’ intervenes
between the infant’s first receptive and first expressive competence in matters musical. We believe
this to be the case, and call it ‘the developmental paradox of music’. As we shall see, the paradox
readily resolves in light of the distinctions we have made here.
In contrast with our expressive call repertoire, which we command as a largely free gift of our
biological nature, the melodies of song and music are cultural goods and hence must be learned.
Their pattern specifics originate in and are elaborated through cultural traditions, and they are
acquired through a process of cultural learning (Merker 2005). Even when invented on the spot,
this is done against a background of acquired familiarity with musical materials (Merker 2006).
The patterns of music, in not being given by nature, are inherently dependent on learning
processes in ways that expressive calls are not. However, it is not always clearly recognized that the
receptive and the expressive parts of this process pose radically different demands on learning
mechanisms when the task of the learning process is not only to acquire familiarity with the
pattern in question in the perceptual–cognitive sense, but to duplicate it through an efferent
modality such as the voice.
The receptive capacity to acquire familiarity with auditory patterns is shared by all higher
animals by virtue of their sophisticated auditory systems, which exhibit a fundamental similarity
in neural organization, even across a taxonomic gap as wide as that between mammals and birds
(Farries 2001; Jarvis 2004). This commonality in auditory organization presumably underlies the
remarkable similarity between the macaque and human perception of melodies and tonality
(Wright et al. 2000; Merker 2006). It is this receptive kind of familiarity, based on perceptual
learning, that is tested by the preferential looking method and related techniques (Graham and
Clifton 1966; Trehub 2000). Human infants’ performance in this regard relies on the precocious
maturation of their auditory system, which is the only sensory modality whose subcortical path-
ways have completed their myelination (i.e., structural maturation) before birth (Moore et al.
1995; Yakovlev and Lecours 1967).
Yet to learn to recognize, discriminate, classify and compare melodies implies nothing about
the ability to duplicate those known melodies with the help of the voice. Our facility with
language (and song) makes it seem natural that what we know we can also express, yet as
discussed at some length by the second author (Merker, Chapter 4, this volume), the capacity to
learn to duplicate heard patterns with the voice is an exceedingly rare capacity in the animal
kingdom. It is most richly represented in birds, but even among them it is only present in 3 out of
the 24 different orders of bird species. The capacity is known as vocal learning, and among
mammals, who excel at learning in other respects, it is only humans along with some species of
whales, seals and bats who possess this ability (reviewed by Janik and Slater 1997). It is conspicu-
ously absent from chimpanzees and all other primates as it is from most mammals, yet we
depend on this ability for every word we know how to pronounce, for every melody we know
how to sing, and for every sound we are capable of imitating.
The human capacity for vocal learning, and its extension in our propensity for ritual mimesis
(Merker 2000, 2005, Chapter 4, this volume; Donald 1993, 1998; Meltzoff 1996), is the missing
piece of the puzzle of human uniqueness. This was recognized long ago with respect to a signal
feature of that uniqueness, namely human speech and language, by students of bird song
(Marler 1970; Nottebohm 1975, 1976; Doupé and Kuhl 1999; Jarvis 2004). Human speech would
be impossible without vocal learning, and so, of course, would human song, since singing
involves the duplication by voice of a heard auditory model possessing arbitrary (in the sense
of culturally determined) pattern characteristics. From our present point of view, it is of consid-
erable interest that in vocal learning in nature, as it meets us in well-studied cases of bird
song, the perceptual acquisition phase of song not only precedes the production phase, but is
separated from it by a considerable span of developmental time (Thorpe 1961; Williams 2004).
Only part of that interval is filled by a phase of practice (technically called subsong and
plastic song), in which the young bird produces first jumbles of ill-formed phrases, then
more complex song patterns, eventually to approximate and finally duplicate the adult song it
has stored in memory from the acquisition phase. In the temporal gap between perceptual
acquisition and expressive production of bird song, we have something resembling the
‘musical silence’ that we have claimed reigns in the first year of life of the human infant. But
what of the practice phase that occupies part of the interval in birds? It can hardly be called
silence.
The subsong phase of avian vocal development has been compared to the babbling phase in
the development of the human infant (Marler 1970; Nottebohm 1975; Doupé and Kuhl 1999;
Wilbrecht and Nottebohm 2003). Yet the developmental literature links the stages of babbling in
human infants to their development of language and not to song (e.g., de Boysson-Bardies et al.
1989; Kent and Miolo 1995; Vihman 1996; Jusczyk et al. 1998). The presence of speech in humans
adds a new, important task to the infant’s capacity for vocal learning. Given this capacity, the
devotion of the babbling stage to speech acquisition may simply reflect the fact that much of the
vocal material the human infant is exposed to during early learning consists of speech. This
makes the relative proportions of linguistic and musical material to which infants are exposed of
interest as a determinant of what their expressive use of the voice will be shaped to approximate
and to duplicate.
It is therefore of some interest to determine the proportions of time devoted to speech as
compared with other modes of communication in adult interactions with infants. We made an
estimate of this based on our recordings, so far only for our 9- and 12-month material. To do so,
we defined ‘active time’ as those portions of our recordings in which adults or siblings were
interactively engaged with the infant, and averaged them over the two age groups. Of this active
time, 71 per cent was taken up by infant-directed speech, some 10 per cent by song of all kinds
(including improvisations by mothers), and an additional 3 per cent by instrumental music and
non-sung rhymes. The remaining 16 per cent of active time was devoted to other forms of
neither sung nor spoken vocal expressiveness and ‘horsing around’. These proportions may, of
course, differ at earlier and later ages, yet at all ages parents and others devote much of their
face-to-face interaction time with infants to infant-directed speech, with its characteristic
emotional modulations (Trainor et al. 2000). The infant’s system of vocal learning, which may in
fact be ‘primed’ by the face-to-face format of intimate interaction (Tzourio-Mazoyer et al. 2002;
Lewkowicz 1999), is thus supplied with a great amount of speech input, and shapes its output
accordingly. A learning system is well advised not to ignore the statistical properties of its input at
any scale available to it, including the most global (Elman 2004).
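The arithmetic behind an estimate of this kind can be sketched as follows. The sketch is hypothetical in several respects: the function names are invented for illustration, and it assumes that averaging "over the two age groups" means an unweighted mean of the per-age percentages, which is not specified above.

```python
def percentages_of_active_time(seconds_by_mode):
    """Convert annotated durations per communicative mode into percentages
    of the total active time for one age group."""
    total = sum(seconds_by_mode.values())
    if total <= 0:
        raise ValueError("active time must be positive")
    return {mode: 100.0 * secs / total for mode, secs in seconds_by_mode.items()}


def mean_over_age_groups(per_age_percentages):
    """Unweighted mean of per-age percentages (an assumption of this sketch).

    A mode missing from one age group is counted as 0 per cent for that group.
    """
    modes = set()
    for percentages in per_age_percentages:
        modes.update(percentages)
    n = len(per_age_percentages)
    return {
        mode: sum(p.get(mode, 0.0) for p in per_age_percentages) / n
        for mode in modes
    }
```

Because each group's percentages sum to 100, the averaged figures do so as well, as do the four proportions reported above (71, 10, 3 and 16 per cent).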
Yet speech is not the only kind of vocal expressiveness that infants acquire. We know that
eventually they will sing, and singing no less than speech is dependent on vocal learning. This
means that song is subject to the developmental logic of the vocal learning mechanism, even
when a good part of its resources are devoted to language acquisition. A considerable time gap
between perceptual learning and competent production is therefore expected in the case of song
no less than in the case of speech, and this resolves, we suggest, the apparent developmental
paradox of music. It poses the interesting challenge of distinguishing musical babbling from
speech babbling in the infant’s second half-year of life (Sundin 1977, 1998; Papousek 1996).
Against this background, we turn, then, at long last, to the topic announced in our title, the
action songs of infancy and their developmental interpretation.
11.5 The action song in ritual perspective

All of the issues concerning the innate and acquired underpinnings of infant communicative
expressiveness we have sketched were forced on us in trying to classify and understand the
well-known phenomena of adult–infant interaction in the first year of life. As already indicated
in our introduction, our greatest challenge in this regard turned out to be the action song. Action
songs combine melody, words and schematized sequences of obligatory bodily action (such as
knee jogging, hand clapping, finger games and pantomime) into a narrative sequence that
provides scope for the infant to participate in predictable ways in interaction with an adult
(Trevarthen 1979, 1999; Trevarthen and Hubley 1978). This participation spans from no more
than the infant’s squeals of delight when, at the end of a verse, a build-up of dynamic tension
culminates in the tickling or mock dropping of the infant, over more active participation such as
hand clapping, to the infant’s duplication of finger-games or a succession of bodily movements
pantomiming key points in an evolving verse narrative (as in ‘Itsy Bitsy Spider’, the most frequent
action song in our corpus). The more familiar we became with the behavioural content of our
recordings, the more our interest came to centre on the significance of these action songs in
adult–infant interaction. Unlike other contents, these seemed to become less understandable the
more we scrutinized them, a tell-tale sign of approaching a phenomenon with inadequate
conceptual categories.
Our difficulties with these commonplaces of adult–infant play were several, and all con-
verged on rendering elusive a working hypothesis regarding the function of action songs in
infant development. Seen as hypothetical language teaching devices, the lyrics of these songs
appear ill-suited to their purpose. They frequently contain idiosyncratic and marginally
grammatical constructions, as well as nonsense words. Moreover, the lyrics are often subordi-
nated to musical or metric aspects of the song, that is, to simple temporal patterns of melody and
verse metrics. Yet, somehow, the music does not seem to be the point either. The adult performer
commonly takes their role as singer or poet quite lightly: melodies are sung out of tune (in one
such instance, the mother, a musician, had only moments before entertained her infant by
playing the cello with excellent intonation), and stanzas are begun without completing the metric
count of the previous line. Any part of the performance might be interrupted at any time—even
in mid-phrase—to interact with or react to the infant in alternative ways, including responses to
infant ‘disobedience’ (Reddy 1991). As is usual in well-functioning adult–infant interaction,
the integrity of communicative contact with the infant seems to take precedence over other
considerations, such as the formal or aesthetic aspects of song performance. The action song,
it seems, is not meant to entertain the infant passively, as was apparently the cello solo just
referred to.
The point of many an interruption seemed to be to ensure the infant’s participation in
the action song: to return the infant’s attention to the performance when it strayed, and to
shower praise on the little one for every hint of joining in and performing the ‘moves’ of the song.
As the name ‘action song’ indicates, the interaction seems geared to recruit the infant’s active
participation. The question is: active participation in what? Not a communicative exchange as
such; the common non-verbal communication between parent and infant functions quite
smoothly without the trappings of the action song, as the non-verbal interventions interrupting
the song demonstrate. In fact, the formal, stereotypic aspects of the action song impose a mild
burden on the infant beyond what his or her native non-verbal communicative expressiveness is
equipped to shoulder; this occasions the interruptions and the need for encouragement, relying
on the non-verbal communicative capacity the infant already possesses (see Appendix II for this
pedagogical aspect). Nor is the infant being recruited to engage in an act of play as such: play also
functions perfectly well without the formal aspects of the action song, as can be seen in smoothly
executed spontaneous play sequences involving the infant both before and after more awkwardly
executed episodes of infant participation in action songs.
As the reader familiar with Chapter 4 of this volume may have guessed, and as we eventually
came to realize, these observations are most easily interpretable as the process of first introduc-
tion to active participation in a human ritual. Here, the non-verbal communicative competence
shared by parent and infant becomes a means of instruction for the infant’s first participation in
ritual. Were human culture only ape culture, then the non-verbal communicative exchanges and
the spontaneous play conducted with their help would suffice, and would provide adequate
guidance to developing competence in the culture. But human culture has two levels beyond
ape culture, namely the ritual and the linguistic levels, and the action songs of infancy are the
paradigmatic ‘baby rituals’ of a ritual culture. With them come formal structure, actual teaching,
and the need for imitative fidelity as already discussed in the earlier chapter.
Our suggestion is that the infant’s primary gate of admission to the ritual level of human
culture is the action song and related games with a formal structure. They provide a first forum
for the infant’s inclusion in and sharing of a ritual performance, beginning with things like the
clapping of hands as a ritual sign of approval and excitement. These shared performances form a
watershed in the initiation of the infant into the natal culture. They do not supply a teaching
device for anything beyond themselves. Their ‘rule structures’ (Bruner and Sherwood 1976)—the
form or syntax of the ritual (Staal 1989)—are rules not for life, but for this specific part of it,
valid only within the clearly demarcated confines of particular action songs. Their purpose is
mere participation in the arbitrary form itself, since the primary purpose of a ritual is to
be performed rather than to achieve an instrumental end beyond itself. The smallest
step of infant progress in mastering the ‘moves’ of the action song is greeted by excited expres-
sions of praise and encouragement from the adult partner. With these exclamations, parents wel-
come a new member to the ritual culture to which they belong; and the little one beams with joy
and pride over having raised his or her hands high in the air at the right moment in the right
verse of the action song, or, perhaps, freezes in mid-move on realizing that this was the move
for the next verse, and not the present one, in which the hands should have gone down,
pantomiming rain (even though this latter significance may be understood only by the adult).
Here is the developmental, social and cultural context that makes sense of the human infant’s
capacities as an ‘imitative generalist’ (Meltzoff 1996), capacities for which chimpanzees have no
use because they live in an ape culture and not a ritual culture. Their cultural traditions
are largely confined to instrumental behaviours, for which observational learning, or ‘intent
participation’ (Rogoff et al. 2003) generally suffices, and imitation as such is of marginal utility.
Instead, imitation comes into its own in a culture featuring the one function that cannot be
accomplished without it, namely, the high-fidelity duplication and permanent acquisition
of truly arbitrary but obligatory forms of behaviour, such as the culturally transmitted complex
display behaviours of learned song in birds and whales, and human rituals such as song and
dance. For these, imitation serves as an enabling device by supplying the duplicative entry
point—supported by what the second author has dubbed a ‘conformal motive’ (Merker 2005)—
for the long-term mimetic acquisition of adult competence in ritual performance.
The ritual perspective on imitation relieves us of the need to educe a function or purpose for
imitation outside itself. The purpose of imitation is not the episodic online production of a
likeness of a novel model behaviour involving tool use or other purposes, as staged in some
laboratory studies of imitation (Meltzoff 1988; Nagell et al. 1993; Tomasello et al. 1993b; Call and
Tomasello 1995; Whiten et al. 1996; see also Miklosi 1999 for trenchant commentary); instead, it
serves the permanent acquisition of the frivolous aesthetics of display behaviour, where the very
point of the performance is exhibiting command of the model in all of its arbitrary formal
detail. The action songs of infancy share this lack of instrumental utility, combined with formal
exuberance, as do human rituals more generally. However, in contrast with the imitation experts
among animal species, whose imitative feats are typically confined to elaborate vocal learning
(but see Williams 2004), human infants and adults are imitative generalists (Meltzoff 1996). We
go beyond the learned control of the voice to draw the full range of our expressive motoric
equipment into the orbit of the duplicative logic of the vocal learning mechanism (Merker,
Chapter 4, this volume). The action songs of infancy embody this expanded scope of human
imitation in their multimodal, multi-effector engagement between infant and adult in the
unfolding pattern stereotypies of the song.
The action song thus epitomizes the broad scope of mimesis deployed in human ritual. It
draws on bodily posture and movement, gesture, manual dexterity, song and verbal ability, and
their interactive coordination into, at acquisition asymptote, a sophisticated formal choreogra-
phy of mutually attuned behaviour of which chimpanzee culture—even disregarding the
melodic and verbal components of the human performance—shows hardly a glimmer. In
humans, the process of ‘induction’ is a protracted one. At the outset, the infant lacks some of the
capacities essential for full performance. This is conspicuously so for melodic song and language,
and for some motoric competences. The performance (pantomime) aspects of an action song
may therefore have different versions adapted to the infant’s age, while the global form of the
action song remains invariant. We illustrate this by a developmental classification of different
performance versions of ‘Itsy Bitsy Spider’ in Appendix II.
To summarize, we suggest that an action song like ‘Itsy Bitsy Spider’ provides a culturally trans-
mitted developmental vehicle for introducing infants to the specifically ritual aspect of human
culture. The repeated enactment of its global form laid down in melody and lyrics constitutes a
longitudinal framework capable of absorbing successively maturing capacities of the infant into a
formal framework providing both easily recognizable structural continuity and repeated occa-
sions to celebrate progressively accumulating successes over developmental time. ‘Induction’
starts with exposure to the ritual form without requiring formal contributions by the infant
(Appendix II), progresses to the infant’s contributing simple bodily gestures such as raising the
hands, and ends, perhaps years later, with full mastery of the performance, a performance to
which the child, on growing up, will introduce his or her own immature offspring in the cycle of
ritual culture. The infant can rightly be proud of each step of this accomplishment, because in
taking them, the infant, even before language competence, is making a decisive break with our
ape ancestry and entering the ground of a truly human, ritual culture. This, we now believe, is the
developmental significance of the action song in infant development, a significance which eluded
us until our own understanding of the nature of human culture had undergone a reconstellation,
alerting us to ritual culture as a distinctive stratum of our cultural inheritance, as detailed in the
earlier chapter.
Appendix I
Video documentation of mother–infant interaction
The basic data of the study were collected by filming (SONY DSR-PD1P digital camcorder
equipped with remote stereo microphone SONY ECM-909A) 25 mother–infant dyads, occasion-
ally with the additional participation of other family members, during routine, everyday activi-
ties in the home at a time when infants were 6, 9 and 12 months of age. Families were recruited
through newspaper advertisements and notices posted at local maternity centres throughout the
thinly populated provinces of Jämtland and Härjedalen in the north of Sweden in 1998 and 1999.
Families volunteered for the study on their own initiative by contacting the first author by phone
or email. This resulted in a wide range of socio-economic and educational levels being
represented among participants. As the study was conducted by an Institute for Biomusicology, no
secret could be made of the fact that the study related to music, although during an introductory
visit to participating families before the 6-month filming session, it was emphasized that the
purpose of the filming was to document ordinary everyday activities involving the infant, without a
particular agenda external to the habitual life of the family. During this visit, the primary care-
taker filled out an extensive questionnaire, and in most cases filming was also begun during this
first visit, although these preliminary recordings, made at a time when infants were around
3 months old, were not included in the data analysis. The regular filming sessions started when
the infant was 6 months old, and took place by appointment at the family’s convenience in a
single filming session per age level, lasting between one and two hours of essentially continuous
filming. We took care to stay as much as possible in the background during filming, and
whenever possible the camera was left running in an optimal position without the presence of
the operator in the room.
The resulting footage amounted to a total of 120 hours of digital video recordings, featuring
highly diverse behaviours. The entire corpus was subjected to three separate content analyses
focused on different behavioural aspects of interest. The first was the annotation of the video
time code by a set of behavioural categories and comments designed to give an overall inventory
of the type of situations in a given recording session. The categories of this inventory classified
sequences containing music into those consisting of children’s songs, other standard songs,
improvised song, song fragments (exemplified in Figure 11.2, part c), ditties/rhymes, and other
music. The annotations also gave brief statements of the overall behavioural situation (e.g., x on floor playing
with pot), of the mother’s behaviour, of the infant’s behaviour, and of apparent function (e.g.,
attempt to make contact), and included a column for miscellaneous information that might be relevant to
determining the significance of the sequence. An example of an annotation of a 58-second
segment of tape under the heading ‘infant behaviour’ is as follows (in translation): ‘Drumming
on tambourine; when mother starts playing, infant stops to look at mother’s drum. Further stray
drumming, chatting, lots going on. Attention on mother most of the time’. This inventory was
used for identifying sequences subjected to more detailed scrutiny, by means such as sonograms
for vocal behaviour (an example of which is in Figure 11.2) or the detailed plotting of interactive
movements in active play sequences.
A second inventory of the entire corpus focused on the various forms of communicative
behaviour used by the adult during active engagement with the infant, an analysis, that is,
focused on communication rather than music. It distinguished singing, other music, nursery
rhymes, horsing around, infant-directed speech, other vocal behaviour, and silence. The percent-
ages of active time devoted to infant-directed speech versus song and other modes of communi-
cation given in the main text of this chapter are based on this inventory.
A third inventory identified every sequence containing rhythmic, repetitive, and iterative
behaviour of any body part (such as bobbing, swaying, waving, banging) on the part of
the infant. These sequences were classified according to whether or not they occurred to music. The movement
characteristics of 93 such sequences (56 with music; 37 without) were analysed statistically frame
by frame.
Appendix II
Adapting ritual form to developmental progression: Alternate
versions of ‘Itsy Bitsy Spider’
There are indications in our material that the performance (pantomime) aspect of action songs
such as ‘Itsy Bitsy Spider’ is adapted to the age and development of the infant. This has subsequently
been confirmed by checking the many instances of ‘Itsy Bitsy Spider’ sequences involving
infants posted on the searchable Internet video archive YouTube (www.youtube.com, searched
for via ‘Itsy Bitsy Spider’). On this basis, we propose the following preliminary classification of
alternate ways of performing this action song:
Version one, first few months

The infant is lying down in a play situation. The adult, making a ‘spider hand’, manually ‘climbs up’
the infant’s body with crawling movements starting at the infant’s feet. At the ‘water spout’ ending
of the first line of the song (typically sung in all versions) the infant is tickled beneath the chin.
‘Down comes the rain’ is performed by a single stroking movement with spread fingers covering
the infant’s face and proceeding downwards over his or her body. At ‘Up comes the sun’, the adult
places her or his hand with a fan-like spread of the fingers right in front of the infant’s face, and
finally repeats, for the last line of the song, the gestures of the first line, again ending in a tickle.
Version two, mid-infancy

Adult and infant face one another, sitting close together on the floor or with the infant perched
on the adult’s knee. The adult physically guides the infant in an assisted performance, in which
the adult manually holds the infant’s hands, actively moving the infant’s hands and arms through
the stages of the pantomime sequence while singing the song. The adult may alternate this
assisted performance with adult solo performances at close quarters to the infant.
Version three, late infancy

Adult and infant face one another without any requirement for physical contact. Both perform
the pantomime sequence together, in parallel. This is the version of ‘Itsy Bitsy Spider’ with which
most of us are familiar.
Video examples selected to illustrate the three movement patterns (irrespective of infant age)
are available at the following URLs: http://www.youtube.com/watch?v=SQJpjGaous4;
http://www.youtube.com/watch?v=Zx40pmNQVs; http://www.youtube.com/watch?v=aM0KjxhA4us.
The global form of the ritual marked by its melody and words is identical in all three versions.
What differs, drastically, is the performance aspect—the pantomime movements that implement
the action aspect of the song. These appear to be conspicuously and most appropriately adapted
to the developmental competences of the infant. Version one relies entirely on the infant’s spon-
taneous responses and expressiveness—principally laughter and excitement on being tickled,
along with his or her anticipatory reactions based on repeated experience with the ritual. In this
version, the global form of the ritual is being most tangibly impressed on the infant by bodily
touch, hearing and sight, without requiring independent motoric initiatives on his or her part.
Subsequent versions extend the scope of the space over which the action song is enacted from the
infant’s body outwards, and increasingly recruit active motoric participation and initiative by
the infant. Although the infant is largely passive in Version two, we note that the pantomime requires
his or her consent in the sense of not resisting the manual guidance of the adult. In Version three,
the infant is independently active as a full partner in a joint performance. This developmental
progression underscores the point we made in the main text concerning the action song as a
developmental vehicle for induction into the ritual aspects of human culture.
Acknowledgements
We thank Colwyn Trevarthen for his encouragement and support, and we thank him and
Stephen Malloch for many valuable discussions in the course of our work. The research on which
this chapter is based was supported by a grant from the Bank of Sweden Tercentenary Foundation.
References
Alexander RD (1990). How did humans evolve? Reflections on the uniquely unique species. University of
Michigan Museum of Zoology Special Publication, 1, 1–38.
Arom S (1991). African polyphony and polyrhythm: Musical structure and methodology. Cambridge
University Press, Cambridge, UK.
Bekoff M (1972). The development of social interaction, play, and metacommunication in mammals:
An ethological perspective. Quarterly Review of Biology, 47, 412–434.
Bekoff M and Byers JA (eds) (1998). Animal play: Evolutionary, comparative, and ecological approaches.
Biben M and Suomi SJ (1993). Lessons from primate play. In K MacDonald, ed., Parent-child play:
description and implications, pp. 185–196. State University of New York Press, Albany, NY.
Bruner JS and Sherwood V (1976). Peekaboo and the learning of rule structures. In JS Bruner, A Jolly and
K Sylva, eds., Play: Its role in development and evolution, pp. 277–285. Basic Books, New York.
Burgdorf J and Panksepp J (2001). Tickling induces reward in adolescent rats. Physiology and Behavior,
72, 167–173.
Burling R (1993). Primate calls, human language, and nonverbal communication. Current Anthropology,
34, 25–53.
Call J and Tomasello M (1995). Use of social information in the problem solving of orangutans
(Pongo pygmaeus) and human children (Homo sapiens). Journal of Comparative Psychology, 109,
308–320.
Cohen D (2000). More on the meaning of natural schemata: Their role in shaping types of directionality.
In J Sloboda and S O’Neill, eds, Proceedings of the Sixth International Conference on Music Perception and
Cognition, 5–10 August 2000, Keele, UK. Keele University, Department of Psychology, Keele, UK.
Cohen D (2003). Incorporating natural schemata into musical analysis. Orbis Musicae, 13, 195–211.
Cordes I (1997). Observations on some correspondences between ethnic music and animal calls.
In A Gabrielsson, ed., Proceedings of the Third Triennial ESCOM Conference, 7–12 June 1997,
Uppsala, Sweden, pp. 235–40. The Department of Psychology, University of Uppsala, Sweden.
Cordes I (1998). Melodische Kontur und emotionaler Ausdruck in Wiegenliedern [Melodic contour and
emotional expression in lullabies]. In KE Behne, G Kleinen and H de la Motte-Haber, eds,
Musikpsychologie, Vol. 13, Musikalischer Ausdruck, pp. 26–54 [Music psychology. Musical expression].
Hogrefe Verlag, Göttingen.
D’Odorico L (1984). Non-segmental features in prelinguistic communications: An analysis of some types of
infant cry and non-cry vocalizations. Journal of Child Language, 11, 17–27.
Darwin C (1871). The descent of man and selection in relation to sex. D Appleton and Company, New York.
De Boysson-Bardies B, Halle P, Sagart L and Durand C (1989). A cross-linguistic investigation of vowel
formants in babbling. Journal of Child Language, 16, 1–17.
Donald M (1993). Origins of the modern mind. Harvard University Press, Cambridge, MA.
Donald M (1998). Mimesis and the executive suite: Missing links in language evolution. In JR Hurford,
M Studdert-Kennedy and C Knight, eds, Approaches to the evolution of language: Social and cognitive
bases, pp. 44–67. Cambridge University Press, Cambridge.
Doupé AJ and Kuhl PK (1999). Birdsong and human speech: Common themes and mechanisms. Annual
Review of Neuroscience, 22, 567–631.
Elman JL (2004). Generalization from sparse input. Proceedings of the 38th Annual Meeting of the Chicago
Linguistic Society, pp. 175–200. Chicago Linguistic Society, Chicago, IL (Dated 2002, published 2004).
Evans CS (1997). Referential signals. In DH Owings, MD Beecher and NS Thompson, eds, Perspectives in
ethology, Volume 12, pp. 99–143. Plenum Press, New York.
Falk D (2004). Prelinguistic evolution in early hominins: Whence motherese? Behavioral and Brain Sciences,
27, 491–541.
Farries MA (2001). The oscine song system considered in the context of the avian brain: Lessons learned
from comparative neurobiology. Brain, Behavior and Evolution, 58, 80–100.
Fernald A and Kuhl P (1987). Acoustic determinants of infant preference for motherese speech. Infant
Fernald A, Taeschner T, Dunn J, Papousek M, Boysson-Bardies B and Fukui I (1989). A cross-language
Language, 16, 977–1001.
Fogel A (1993). Developing through relationships. University of Chicago Press, Chicago, IL.
Gerken L (1996). Prosody’s role in language acquisition and adult parsing. Journal of Psycholinguistic
Research, 25, 345–365.
Graham FK and Clifton RK (1966). Heart-rate change as a component of the orienting response.
Psychological Bulletin, 65, 305–320.
Grieser DL and Kuhl PK (1988). Maternal speech to infants in a tonal language: Support for universal
Gyger M, Marler P and Pickert R (1987). Semantics of an avian alarm call system: The male domestic fowl,
Gallus domesticus. Behaviour, 102, 15–40.
Hanslick E (1854). Vom Musikalisch-Schönen. Ein Beitrag zur Revision der Aesthetik der Tonkunst. Weigel,
Leipzig. Published in English in 1986 as On the Musically Beautiful: A Contribution Towards the Revision
of the Aesthetics of Music. Translated by Geoffrey Payzant. Hackett Publishing Company, Indianapolis, IN.
Hauser MD (1996). The evolution of communication. MIT Press, Cambridge, MA.
Hauser MD (2000). The sound and the fury: Primate vocalizations as reflections of emotion and thought.
In NL Wallin, B Merker and S Brown, eds, The origins of music, pp. 77–102. MIT Press, Cambridge, MA.
Hsu H-C, Fogel A and Cooper RB (2000). Infant vocal development during the first 6 months: speech
quality and melodic complexity. Infant and Child Development, 9, 1–16.
Ilari BS (2002). Music perception and cognition in the first year of life. Early Child Development and Care,
172, 311–322.
Janik VM and Slater PJB (1997). Vocal learning in mammals. Advances in the Study of Behavior, 26, 59–99.
Jarvis ED (2004). Learned birdsong and the neurobiology of human language. In HP Ziegler and
P Marler, eds, Behavioral neurobiology of birdsong, pp. 749–777. Annals of the New York
Academy of Sciences, 1016.
Jürgens U (1998). Neuronal control of mammalian vocalization: With special reference to the squirrel
monkey. Naturwissenschaften, 85, 376–388.
Jürgens U (1999). Primate communication: Signaling, vocalization. In G Adelman and BH Smith, eds,
Encyclopedia of neuroscience, pp. 1694–1697. Elsevier, Amsterdam.
Jusczyk PW, Houston D and Goodman M (1998). Speech perception during the first year. In A Slater, ed.,
Perceptual development: Visual, auditory, and speech perception in infancy, pp. 357–388.
Psychology Press, Hove.
Juslin P and Laukka P (2003). Communication of emotions in vocal expression and music performance:
Different channels, same code? Psychological Bulletin, 129, 770–814.
Kent RD and Miolo G (1995). Phonetic abilities in the first year of life. In P Fletcher and B MacWhinney,
eds, The handbook of child language, pp. 303–334. Blackwell, Cambridge, MA.
Kent RD and Murray AD (1982). Acoustic features of infant vocalic utterances at 3, 6, and 9 months.
Journal of the Acoustical Society of America, 72, 353–365.
Kessen W, Levine J and Wendrich KA (1979). The imitation of pitch in infants. Infant Behavior and
Kuhl PK, Andruski JA, Chistovich IA et al. (1997). Cross-language analysis of phonetic units in language
addressed to infants. Science, 277, 684–686.
Lewkowicz DJ (1999). Infants’ perception of the audible, visible and bimodal attributes of talking and
singing faces. Proceedings of the Audio-Visual Speech Processing Conference, 7–9 August 1999. University
of California, Santa Cruz. (http://mambo.ucsc.edu/avsp99)
MacDonald K (ed.) (1993). Parent–child play: Description and implications. State University of New York
Press, Albany, NY.
Macedonia JM and Evans CS (1993). Variation among mammalian alarm call systems and the problem of
meaning in animal signals. Ethology, 93, 177–197.
1999–2000), 29–57.
Marcos H (1987). Communicative functions of pitch range and pitch direction in infants. Journal of Child
Language, 14, 255–268.
Marler P (1955). Characteristics of some animal calls. Nature, 176, 6–8.
Marler P (1970). Bird song and speech development: Could there be parallels? American Scientist,
58, 669–673.
Marler P (2000). Origins of music and speech: Insights from animals. In NL Wallin, B Merker and S Brown,
eds, The origins of music, pp. 31–48. The MIT Press, Cambridge, MA.
Marler P (2004). Bird calls: Their potential for behavioral biology. In HP Ziegler and P Marler, eds,
The behavioral neurobiology of birdsong, pp. 31–44. Annals of the New York Academy of Sciences, 1016.
Marler P, Dufty A and Pickert R (1986). Vocal communication in the domestic chicken: I. Does a sender
communicate information about the quality of a food referent to a receiver? Animal Behaviour,
34, 188–193.
McCune L, Vihman MM, Rough-Hellichius L, Delery DB and Gogate L (1996). Grunt communication in
human infants (Homo sapiens). Journal of Comparative Psychology, 110, 27–37.
McKernon PE (1979). The development of first songs in young children. New Directions for Child
Meltzoff AN (1988). Infant imitation and memory: Nine-month-olds in immediate and deferred tests.
Meltzoff AN (1996). The human infant as imitative generalist: A 20-year progress report on infant imitation
with implications for comparative psychology. In CM Heyes and BG Galef, eds, Social learning in
animals: The roots of culture, pp. 347–370. Academic Press, San Diego, CA.
Merker B (2002). Music: the missing Humboldt system. Musicae Scientiae, 6, 3–21.
Merker B (2005). The conformal motive in birdsong, music and language: an introduction. In G Avanzini,
L Lopez, S Koelsch and M Majno, eds, The neurosciences and music II: From perception to performance,
Merker B (2006). Layered constraints on the multiple creativities of music. In I Deliege and G Wiggins, eds,
Musical creativity: Multidisciplinary research in theory and practice, pp. 25–41. Psychology Press, Hove, UK.
Miklosi A (1999). The ethological investigation of imitation. Biological Reviews, 74, 347–377.
Moore JK, Perazzo LM and Braun A (1995). Time course of axonal myelination in the human brainstem
auditory pathway. Hearing Research, 87, 21–31.
Morton ES (1977). On the occurrence and significance of motivational-structural rules in some bird and
mammal sounds. American Naturalist, 111, 855–869.
Mozgot VG (2003). Auditory imprinting in shaping an individual’s music world. In R Kopiez, AC Lehmann,
I Wolther and C Wolf, eds, Proceedings of the 5th Triennial ESCOM Conference 8–13 September 2003,
pp. 599–602, Hanover University of Music and Drama, Germany.
Murray L and Trevarthen C (1985). Emotional regulation of interactions between two-month-olds and
their mothers. In T Field and N Fox, eds, Social perception in infants, pp. 177–197. Ablex, Norwood, NJ.
Nagell K, Olguin RS and Tomasello M (1993). Process of social learning in the tool use of chimpanzees
(Pan troglodytes) and human children (Homo sapiens). Journal of Comparative Psychology, 107, 174–186.
Nottebohm F (1975). A zoologist’s view of some language phenomena, with particular emphasis on vocal
learning. In EH Lenneberg and E Lenneberg, eds, Foundations of language development, pp. 61–103.
Nottebohm F (1976). Discussion paper. Vocal tract and brain: A search for evolutionary bottlenecks.
In SR Harnad, HD Steklis and J Lancaster, eds, Origins and evolution of language and speech,
Nwokah EE, Hsu H, Dobrowolska O and Fogel A (1994). The development of laughter in mother–infant
communication: timing parameters and temporal sequences. Infant Behavior and Development, 17, 23–35.
Owings DH and Morton ES (1998). Animal vocal communication: A new approach. Cambridge University
Press, Cambridge.
Papaeliou C and Trevarthen C (1994). The infancy of music. Musical Praxis, 1, 19–33.
Papousek M (1996). Intuitive parenting: A hidden source of musical stimulation in infancy. In I Deliege
and JI Sloboda, eds, Musical beginnings: Origins and development of musical competence, pp. 88–108.
Papousek M, Papousek H and Symmes D (1991). The meaning of melodies in motherese in tone and stress
languages. Infant Behavior and Development, 14, 415–440.
Prescott R (1975). Infant cry sound: Developmental features. Journal of the Acoustical Society of America,
57, 1186–1191.
Reddy V (1991). Playing with others’ expectations: teasing and mucking about in the first year. In Whiten A,
ed., Natural theories of mind, pp. 143–158. Blackwell, Oxford.
Rogoff B, Paradise R, Arauz RM, Correa-Chávez M and Angelillo C (2003). First-hand learning through
Rubin KH, Fein GG and Vandenberg B (1983). Play. In EM Hetherington, ed., Handbook of child
psychology, pp. 693–774. John Wiley and Sons, New York.
Scherer KR (1986). Vocal affect expression: A review and a model for future research. Psychological Bulletin,
99, 143–165.
Scherer KR (1995). How emotion is expressed in speech and singing. Proceedings of the International
Conferences of Phonetic Science, 3, 90–96.
Scherer KR, Banse R and Wallbott HG (2001). Emotion inferences from vocal expression correlate across
languages and cultures. Journal of Cross-Cultural Psychology, 32, 76–92.
Searby A and Jouventin P (2003). Mother–lamb acoustic recognition in sheep: A frequency coding.
Proceedings of the Royal Society B: Biological Sciences, 270, 1765–1771.
Seyfarth RM and Cheney DL (1980). The ontogeny of vervet monkey alarm-calling behavior: A preliminary
report. Zeitschrift für Tierpsychologie, 54, 37–56.
Siegel GM, Cooper M, Morgan JL and Brenneise-Sarshad R (1990). Imitation of intonation by infants.
Journal of Speech and Hearing Research, 33, 9–15.
Staal F (1989). Rules without meaning. Ritual, mantras and the human sciences. Peter Lang, New York.
Stadler Elmer S (2000). Spiel und Nachahmung—Über die Entwicklung der elementaren musikalischen
Aktivitäten. [Play and imitation – On the development of basic musical activities.] HBS Nepomuk Verlag,
Aarau, Switzerland.
Stark RE (1978). Features of infant sounds: The emergence of cooing. Journal of Child Language,
5, 379–390.
Stark RE, Bernstein LE and Demorest ME (1993). Vocal communication in the first 18 months of life.
Journal of Speech and Hearing Research, 36, 548–558.
Stern D (1977). The first relationship. Harvard University Press, Cambridge, MA.
Stern D (1999). Vitality contours: The temporal contour of feelings as a basic unit for constructing the
Struhsaker TT (1967). Auditory communication among vervet monkeys (Cercopithecus aethiops).
In SA Altmann, ed., Social communication among primates, pp. 281–324. University of Chicago Press,
Chicago, IL.
Sundin B (1977). Barnets musikaliska värld: påverkan och utveckling i förskoleåldern. [The musical world of
the child: Influences and development in preschoolers.] Liber Läromedel, Lund, Sweden.
Sundin B (1998). Musical creativity in the first six years: a research project in retrospect. In B Sundin,
GE McPherson and G Folkestad, eds, Children composing: Research in music education, pp. 35–56.
Malmo Academy of Music, Lund University, Lund.
Thorpe WH (1961). Bird song. Cambridge University Press, Cambridge.
Todt D (1988). Serial calling as a mediator of interaction processes: Crying in primates. In D Todt,
P Goedeking and D Symmes, eds, Primate vocal communication, pp. 88–107. Springer-Verlag, Berlin.
Tomasello M, Kruger AC and Ratner HH (1993a). Cultural learning. Behavioral and Brain Sciences,
16, 495–552.
Tomasello M, Savage-Rumbaugh S and Kruger AC (1993b). Imitative learning of actions on objects by
children, chimpanzees and enculturated chimpanzees. Child Development, 64, 1688–1705.
Trainor LJ, Austin CM and Desjardins RN (2000). Is infant-directed speech prosody a result of the vocal
expression of emotion? Psychological Science, 11, 188–195.
Trehub S (2000). Human processing predispositions and musical universals. In NL Wallin, B Merker and
Trehub SE and Trainor LJ (1998). Singing to infants: Lullabies and playsongs. In C Rovee-Collier and
L Lipsitt, eds, Advances in infancy research, pp. 43–77. Ablex, Norwood, NJ.
intersubjectivity. In M Bullowa, ed., Before speech: The beginning of human communication, pp. 321–347.
Trevarthen C and Hubley P (1978). Secondary intersubjectivity: Confidence, confiding and acts of
meaning in the first year. In A Lock, ed., Action, gesture and symbol: The emergence of language,
pp. 183–229. Academic Press, London.
Tzourio-Mazoyer N, De Schonen S, Crivello F, Reutter B, Aujard Y and Mazoyer B (2002). Neural
correlates of woman face processing by 2-month-old infants. Neuroimage, 15, 454–461.
Unyk AM, Trehub SE, Trainor LJ and Schellenberg EG (1992). Lullabies and simplicity: A cross-cultural
perspective. Psychology of Music, 20, 15–28.
Vihman MM (1996). Phonological development: The origins of language in the child. Blackwell Publishers,
Cambridge, MA.
Whiten A, Custance DM, Gomez JC, Teixidor P and Bard KA (1996). Imitative learning of artificial fruit
processing in children (Homo sapiens) and chimpanzees (Pan troglodytes). Journal of Comparative
Wickler W (1974). The sexual code. Weidenfeld and Nicolson, London.
Wilbrecht L and Nottebohm F (2003). Vocal learning in birds and humans. Mental Retardation and
Developmental Disabilities Research Reviews, 9, 135–48.
Williams H (2004). Birdsong and singing behavior. In HP Ziegler and P Marler, eds, Behavioral neurobiology
of birdsong, pp. 1–30. Annals of the New York Academy of Sciences, 1016.
Wright AA, Rivera JJ, Hulse SH, Shyan M and Neiworth JJ (2000). Music perception and octave generalization
in Rhesus monkeys. Journal of Experimental Psychology: General, 129, 291–307.
Xu N, Burnham D and Kitamura C (2006). Tone hyperarticulation in Cantonese infant-directed speech.
Paper presented at the Eleventh Australasian International Conference on Speech Science and
Technology, University of Auckland, Auckland, New Zealand, 6–8 December 2006.
http://www.assta.org/sst/2006/viewabstract.php?id=95
Yakovlev PI and Lecours A-R (1967). The myelogenetic cycles of regional maturation of the brain.
In A Minkowski, ed., Regional development of the brain in early life, pp. 3–70. Blackwell Scientific
Publications, Oxford.
Chapter 12
Early trios: Patterns of sound and
movement in the genesis of meaning
between infants
Benjamin S. Bradley
12.1 Introduction
This chapter examines the extent to which spontaneous communication between babies can be
said to be musical. Previous research on communicative musicality in infancy has focused on
babies interacting with adults, analysing their responses to a parent who chats or sings to them
(e.g., Malloch 1999; Trevarthen and Malloch 2002; Powers and Trevarthen, Chapter 10, this
volume). Clearly, the musicality of the adult partner’s behaviour can be a source of or stimulus
for musical responses from the infant. Here, I discuss the kinds of sound and movement gener-
ated by babies in threesomes, without adults present. In particular, I illustrate how the analysis of
video data may begin to address two questions about musicality: In what ways can infant sound-
making, during interaction with others, be said to express relational meaning? To what degree do
babies exhibit self-conscious comodulation when vocalizing in groups? Before I consider the
data, I will introduce the conceptual backing for my research on early communication and the
observational paradigm I use.
12.2 Infants and musicality

The study of infants has led directly to current ways of thinking about communicative musicality
(Malloch 1999). In 1970, Habermas seminally demonstrated that Chomsky’s take on language
acquisition missed out the crucial dimension of human communication, the ‘intersubjectivity’
that is afforded by what Habermas called ‘dialogue-constitutive universals’ (Habermas 1970).
He argued there was a fundamental psychical sociality to human communication that could
not be conceived under the monologic rubric of Chomsky’s (1959) ‘language acquisition device’.
He thereby gave new theoretical coherence to the work of pioneering researchers on infancy who,
since the mid-1960s, had been gathering evidence that refuted the then-dominant image of
the newborn as an asocial information processor. Most notable were small babies’ easy involve-
ment in phatic (that is, socially expressive, demonstrative and emotive) conversations with
adults, their dismay when these conversations were interrupted, and evidence that newborns
have a precocious ability to imitate manual and facial gestures (e.g., Tissaw 2007; Condon
and Sander 1974; Murray and Trevarthen 1985; Stern 1971, 2000; Tronick 1989). In short, very
young infants showed interest in and adaptations for responding to the expressions of other
people. During the 1970s, Trevarthen (1979a, 1998; Trevarthen and Hubley 1978) and others
(e.g., Ryan 1974; Bruner 1975) combined these findings into a developmental theory of innate
intersubjectivity. Trevarthen proposed that babies were born with a motivation for sharing mental
states with others, and that this grew from having a purely interpersonal focus in the
early months to draw in objects and symbols, or ritualized games during the second six months
of life, ultimately providing the psychical foundation for human involvement in all forms of
cultural meaning (including language). By employing the word ‘culture’, Trevarthen (1979b)
made a sweeping claim concerning how arbitrary ideas and conventional forms of behaviour are
learned. More recently, Malloch and he have focused empirical research on the communicative
foundations for one product of culture, music, giving new detail to Trevarthen’s initial claims
about the dynamic engagements between infants and others. Not only are babies innately
conversational and cooperative at an early stage in many creative forms of expression; they are
innately musical as well! Or rather, and this may make all the difference when one is analysing
infant behaviour, they ‘exhibit communicative musicality’ (Malloch 1999; Trevarthen and
Malloch 2000).
An important distinction must be made in the study of infants’ entrance into the culture of
music (and see Malloch and Trevarthen, Chapter 1, this volume). Although babies may not be the
slightest bit ‘musical’ in the way one might view a promising school-age pianist, they may yet
show the first signs of whatever it takes to do what is commonly called ‘enjoy music’. For example,
babies are very sensitive to rhythm, liking and even entering into some rhythms while not liking
others (Smitherman 1969; Ejiri 1998; Jaffe et al. 2001; Bahrick et al. 2004; and see Mazokopaki
and Kugiumutzakis, Chapter 9, and Gratier and Danon, Chapter 14, this volume).
Darwin (1877) likened the ‘language’ of infants to the song of the gibbon, which raises
a question crucial to the study of the nature of musicality: are animals musical? Or is the
language of beasts just that, a language of fixed emotional meanings with no bearing at all on
the origin of the art of music? In this essay, I adopt Darwin’s side of the case, showing that
findings about the acoustic displays of animals can be assumed to have a significant bearing on
the empirical study of infant musicality (see Merker, Chapter 4, and Panksepp and Trevarthen,
Chapter 7, this volume). At the risk of putting too simple a gloss on science’s existing findings
about animal song, bestial music has what we might call an obvious erotic component; that is,
a component that has to do with the celebration of attractions (and repulsions) between
conspecifics (Bradley 1981, 1989, 1991; Selby and Bradley 2003a). On these grounds, our research
looks for infant music in vocalizations synchronous with the attractions that develop between
babies in trios.
Initially, two lines of research extended out from Trevarthen’s (1979b) theoretical proposals.
One has sought descriptively to elaborate the forms of intersubjective development from birth
up to the school years (e.g., Bråten and Trevarthen 2007); the other has sought to explain
the ‘how’ of infants’ capacity for intersubjective connectedness (e.g., Rochat 2004). The concept
of communicative musicality belongs to the second line. Inspired by advances in brain
science (e.g., Panksepp 1998) and the analysis of coordinated movement (e.g., Bernstein 1967;
Lee 1998), Trevarthen and Malloch (2000) propose that our bodies are organized to express the
dynamic patterns of our minds in inherently communicative forms. In particular, the activities
that express the motives of an individual are synchronized within a single integrated timeframe
which is intrinsically shareable (Trevarthen 1986; see Gratier and Danon, Chapter 14, this
volume).
One way of illustrating the intrinsic communicativeness of coordinated human action is
by reference to tau theory, which defines the essential prospective control of a movement to a goal
in terms of a single time-space function of ‘gap closure’ generated in the brain (Lee 1998; Lee and
Schögler, Chapter 6, this volume). While the theory of ‘tau-coupling’ has been shown to
deal effectively with the coordination and integration of a complex of movements made by
one person, it applies with equal simplicity to the synchronization of movements made by
different people. However, there are many steps between proposing, on the one hand, that the
coordination of individual human actions renders such actions both rhythmical (synchronized
within a single timeframe) and communicative (shareable by others) and proposing, on the
other, that such coordination renders human actions musical. Inevitably, researchers interested in
the communicative musicality of babies preface their studies by distinguishing between those
defining features of what is indubitably music that are common to what they call ‘musicality’, and
those that are not. The pioneers of this area, Papoušek and Papoušek (1981, p. 182 ff), made the
founding claim: ‘the bare fundamental voicing in the newborn’s non-cry vocalization acquires
the features of musical sounds from the first months on’. The features referred to include timbre,
rhythm, tonal variety, pitch range, pitch preference, protomelodic intonation contours, and the
use of intervals. It is a far cry, however, from saying that the newborn voice soon has the appurte-
nances of musical sound, to saying that babies use their voices musically. In this connection, the
Papoušeks noted their daughter Tanya’s capacity to match pitch and imitate intonation contours
from as early as 2 months of age (cf. Kugiumutzakis 1993, 1999). This shows an early phase of
self-consciousness in voice production. By 13 months, Tanya was beginning to join her parents in
the singing of simple nursery chants and songs, showing she had become capable of actively
reflecting on prosody, rhythm and pitch.
More recently, Trevarthen (1999) has analysed a film showing how a blind Swedish 5-month-old
appeared to move her left hand in rhythmical accompaniment to her mother’s singing of baby
songs (see Preisler and Palmer 1986 on the role of the voice in blind-infant–mother interaction).
The infant appeared to match the melodic line and phrasing of the song with what Trevarthen
described as ‘conducting’ movements of wrist and fingers, and doing so a fraction of a second
ahead of the corresponding expressions of the mother. This and other descriptive material from
infant–mother baby talk (e.g., Bradley and Trevarthen 1978) suggests that young infants ‘take
critical interest in coordinating their limited repertoire of movements to the musicality of maternal
expressions’ (Trevarthen 1999, p. 173).
Picking up on Trevarthen’s and Bruner’s initial arguments (following Darwin 1872; Habermas
1970; Ryan 1974), the idea of innate communicative musicality implies that infant sound-making
has not just physical dimensions (of such factors as pitch, amplitude and rhythm), but an inter-
subjective topography. In this vein, the aim of the project described here is to show the value of
a new ‘symmetrical’ observational paradigm (Selby and Bradley 2003a) by which some of the
key intersubjective sites for musical expression in the second six months of life can be mapped.
In particular, data are used to examine whether the rudiments of two distinctive aspects of music
can be discovered in the sound-making in a group of babies on their own: relational meaning,
and the auditory self-consciousness of music-makers.
1 Relational meaning. Singing is perhaps one of the most powerful forms of relational commu-
nication known to humankind. Whether it be Elton John’s Rocket man or Verdi’s Un di, felice,
eterea, song can evoke an enormous range of overwhelming and subtle emotions, whether for
the singer, the onlie begetter,1 one concert-goer among thousands, or a lone radiohead in the
boondocks of night. Song is quintessentially lyrical—that is, social and relational.2 It affects
others. So, to what extent is the sound-making of infants lyrical or erotic, serving as a vehicle
for specific relational attraction?
2 Auditory self-consciousness of music-makers (cf. ‘auditory imagination’, Gardner 1949). Even in
solo performance, musicians must monitor the sounds they make against their internalized
standard of the music they are reading or the notes they have in their heads. The need for
such monitoring is even more obviously the case in collective music. Whether they are playing
together a well-known Beethoven quartet, harmonizing an old folk tune or improvising
free-form jazz, musicians can only make good music if they can comodulate the sounds
they make—rhythm, pitch, volume—to the sounds being made by the other musicians in
their ensemble. Similarly, if we wish to claim that babies exhibit collective musicality when
in groups, we need to examine whether the babies show any signs of deliberately and self-
consciously regulating the ways they vocalize or percuss with or without regard to the sound-
making of the other group members. Indeed, it is only when vocalizing or percussing
with others that auditory self-regulation can be observed in infants. Although babies do
vocalize when alone or in an undirected way when in groups, there is no way of knowing
whether they are attempting to follow a standard, unless that standard is overt because set by
someone else.
1 Shakespeare’s sonnets were published by Thomas Thorpe in 1609, who wrote a dedication to their ‘onlie
begetter’, a person whose identity has eluded subsequent historians.
2 ‘Lyrical’ in the sense of ‘directly expressing the poet’s own thoughts and sentiments’, expressive of conta-
gious ‘poetic enthusiasm’ (Onions 1973, vol. i, p. 1253).
12.3 The infants-in-groups paradigm

The step from music to musicality takes research beyond a simple spectrographic taxonomy of
what babies can do with their voices, or what preferences they may have for the musical features
of their mother’s speech and song. The starting point becomes rather the observation that infant
sound-making is sociable.
As in Bowlby’s attachment theory, with its ‘shared dyadic programme’, research on infant
musicality has largely restricted its attention to dyadic engagements (Bradley 1991). Like devel-
opmental research in psycholinguistics, the overriding investigative interest has been in the way
in which infants use and/or acquire song in the course of their interactions with their parents,
asking how they respond to the traditional ditties and lullabies handed down from parent to
child over generations. What behaviour of a parent most engages them? When do they begin to
join in? When do they begin to sing independently of their parents?
This direction in research has two limitations. One is that it can assume that the music babies
respond to and (potentially) make is only that defined and handed down to them by adults:
infants are ‘fitting into’ or elaborating on pre-existing, other-determined patterns of sound and
rhythm. Secondly, when the infant sound-making is analysed in a dyad, this excludes considera-
tion of sound-making (or intersubjectivity) as a more collective enterprise betokening not a
dyadic but a general relational socio-emotional capacity, or ‘sociability’, which may be more
fundamental (Selby and Bradley 2003a).
Thus, the dominant image is of a parent singing and the baby listening or joining in by
‘conducting’ the music with their hands or feet, or cooing in time and in tune. Tellingly, even
when sound was recorded in trios, by Stern and his colleagues (Stern et al. 1975, p. 90) in their
study of a mother’s vocalizing in unison and alternation with her twins, analysis remained
dyadic: ‘only interactions between the mother and infant were scored; i.e., all triadic interactions
were excluded’. Yet music has claim to be traditionally a collective enterprise, being often created
in groups larger than two. Witness the traditions of communal chanting and singing, big-band
music, rock and jazz combos, orchestras and choirs (see Dissanayake, Chapter 2, Pavlicevic and
Ansdell, Chapter 16, this volume).
In our previous research on babies in groups, we have demonstrated that babies within the
same ‘intersubjective space’ (Bradley 2005) are able to enter into relationships that betoken
awareness of more than one other at the same time. In other words, by 9 months of age, there is
evidence for a ‘clan’ or ‘group’ mentality in infants, something quite different in form from the
kind of ‘shared dyadic programme’ which Bowlby (1982, p. 378) hypothesized to underpin
the growth of humans’ sociability (Selby and Bradley 2003a, b; Bradley and Selby 2004).
Furthermore, by adopting a more rigorous descriptive procedure, what we call two-stage
case analysis (incorporating an interpretive approach, see Selby and Bradley 2003a), we are able
to generate evidence that babies are capable of producing specifiable supradyadic socio-
emotional meanings over the course of their transactions with each other. Interestingly, these
meanings can rarely have a pragmatic function in Halliday’s (2003, p. 22) sense of requiring
‘some answer in deeds’ to have been successful. They appear to fall more into Halliday’s
‘mathetic’ or declarative category, as for example, when baby Ann vocalized while both looking
and pointing at Joe (see below).
Finally, without a symmetrical descriptive paradigm, there is a problem with hypotheses such
as Mueller’s ‘semantic primacy hypothesis’ (Mueller 1991, p. 316; Vandell and Mueller 1995)
which proposes ‘a system of meaning that precedes verbal communication [that is] crucial in
understanding the emergence of language’. Preverbal communicative behaviour of the kind
pertinent to the hypothesis of ‘semantic primacy’ in infancy cannot be discovered by frequency-
counts of generic categories defined a priori by adult researchers. Neither can the meaning of any
single communicative act be established by inferential statistics. Rather, we need to find a way of
discovering whether meanings are really generated differently in each group of infants, changing
over time as the group’s ‘conversation’ progresses—and what these created meanings are. This is
particularly the case if we believe that infant musicality may convey specific socio-emotional
meanings, as argued above. Specific, idiosyncratic group-generated meanings need to be sought
out by more sensitive means than can be provided by a set of one-size-fits-all categories (e.g.,
‘socially directed behaviour’ – Vandell et al. 1980), useful though these measures can sometimes
be. Hence: the iterative approach to the description of the communicative behaviour of infants,
which I employ in examining infant musicality in the groups discussed below.
Our recordings are produced as follows. Three infants who are unknown to each other and
between the ages of 6 and 9 months are put together in strollers forming an equilateral triangle
within a recording studio: each baby is within foot-touching distance of the other two (Figure 12.1).
All that passes is then recorded by two digital cameras. The trio is dissolved only when either a
mother or an experimenter – watching from the next room on a closed-circuit monitor – deems that
the interaction is ‘over’ (most usually because one or more of the babies appears bored or frustrated).
Thus far, we have studied 45 babies (15 trios) in this way, the trios lasting 12 minutes on average
(range 5–22 minutes).
The findings are extracted as follows. A prima facie ‘thick’ description (Geertz 1973) of each
group session is made that focuses on meanings and feelings (e.g., that baby A ‘preferred’ baby B
to baby C). More fine-grained behavioural evidence is then sought which potentially supports or
challenges these interpretations. The behavioural categories used in this second phase of analysis
are defined in a way that allows the establishment of interobserver reliability. This two-stage
process is similar to the description by Fivaz-Depeursinge and Corboz-Warnery of multiple
readings of ‘family affective communications’, backed up by microreadings of the same episodes
(1999, p. 161). The final inevitably provisional interpretation is then advocated on the basis of all
available evidence. (Our approach parallels the process for attaining proof in civil law, which
presents an interpretation supported by a preponderance of available evidence.) Alternative
explanations can then be tested by seeing which make most sense of the observed behaviours
(this approach is illustrated in detail in Reddy 1991).
Fig. 12.1 Configuration of babies and cameras in the infants-in-groups paradigm.
I now illustrate how this two-stage approach casts light on the two questions posed at
the start of this chapter – that is: In what ways can infant sound-making, during interaction
with others, be said to express relational meaning? To what degree do babies exhibit self-
conscious comodulation when vocalizing in groups? In this I build on results published
elsewhere (Selby and Bradley 2003a), that:
◆ over the duration of a ‘conversation’, communication within baby trios may transform or add
to the initial meaning of an action;
◆ babies may become involved in triangular conversations where the behaviour of one
shows simultaneous awareness of both other members of the trio; we call this ‘three-way
linking’;
◆ babies in trios show both attractions and repulsions to others over the course of a conversa-
tion, lasting for, say, 10 minutes.
The additional points to be discussed here have the following logic.
◆ I introduce a way of representing different dynamics of attraction in infant trios.
◆ I make an empirical distinction in infants’ musicality between directed vocalization and
undirected vocalization, and debate its links with non-musical and musical utterance.
◆ I illustrate the relation between this distinction and two different kinds of infant ‘singing’:
directed and undirected.
◆ I consider evidence for self-conscious aural collaboration between babies.
12.4 Music and the dynamics of attraction in two trios

Attraction between babies is normally conceived dyadically, often with the infant–mother rela-
tionship as its assumed prototype (e.g., Denham et al. 1991). Few if any studies have targeted the
often fast-forming dynamics of intersubjective attraction observable in infant groups. The data
presented here have two foci: gaze and vocalization. Gaze is perhaps the best-defined index of
preference. While theories of the psychological functions of looking are manifold and complex, it
would not be heretical to propose looking at another as predominantly betokening interest,
liking, curiosity, arousal, if also, though more rarely, fear, loathing, displacement and the desire to
escape from what Chance (1962) called the field of ‘agonism’ (Bradley 1981; Latour and Woolgar
1986). That the forms of attraction and repulsion characteristic of an ‘eternal triangle’ are funda-
mental in human social relationships is shown by great plays such as Hamlet and Oedipus Rex, as
well as by classical psychoanalytic theory (Britton 1989).
I will first examine different versions of two possible dynamics of gaze observable in a triad.
Our examination refers to the data in Tables 12.1a and 12.1b. They are drawn from recordings of
two groups: ‘Red Hat’ (12 minutes long; named because the dominant character, Ann, wore a
red hat – see Figure 12.2) and ‘Cats’ Chorus’ (14 minutes long; named because of the trio’s vocal
virtuosity).
Two main behavioural categories were coded. Interobserver reliability was calculated by a
second observer, who coded 50 instances of each of the reported behaviours chosen at random.
Cohen’s Kappa was above the acceptable level (0.70) in each case: gaze at another baby
(durations, k = 0.74; occurrences, k = 0.82), vocalizations (durations, k = 0.80; occurrences,
k = 0.85; and groupings, k = 0.77).
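As a rough illustration of how agreement figures of this kind are obtained, the sketch below computes Cohen's Kappa for two observers who each assign one category to every instance. The function name, the category labels and the short example sequences are hypothetical and are not data from this study.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equal-length sequences of category labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(coder_a)
    # Observed agreement: share of instances given the same label by both coders.
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal proportions, summed over labels.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical codes for ten vocalizations (not taken from the recordings).
first = ["directed", "directed", "undirected", "directed", "undirected",
         "directed", "directed", "undirected", "directed", "directed"]
second = ["directed", "directed", "undirected", "undirected", "undirected",
          "directed", "directed", "undirected", "directed", "directed"]
kappa = cohens_kappa(first, second)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw percentage agreement for the agreement two coders would reach by chance, which is why it is lower than the simple proportion of matching codes.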
A vocalization was defined as any voiced sound (including coughs and crying but not unvoiced
sounds, e.g., unvoiced sighs). A vocalization was coded as ‘directed’ if it fulfilled the criteria for
socially directed behaviours, that is, vocalizations ‘accompanied within 3 seconds by one or more
discrete looks at the same person’ (Tremblay-Leveau and Nadel, 1996, p. 149; by the same token,
vocalizations were defined as responses if they occurred within 3 seconds after a partner’s
initiation). This means that the same vocalization can sometimes be directed at both of the other
babies if its onset is followed within 3 seconds by looks at both (as sometimes occurs). Following
this conventional scoring of ‘socially directed behaviour’, vocalizations were counted as ‘new’
if they occurred more than 5 seconds after the previous vocalization (Selby and Bradley 2003a).
All other vocalizations were coded as undirected.
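The coding rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the record types, field names and example events are hypothetical, and ‘accompanied within 3 seconds’ is read here as a look by the vocalizer that begins no more than 3 seconds after the vocal onset. Which earlier vocalization counts as ‘the previous’ one is left to the caller.

```python
from dataclasses import dataclass

DIRECTED_WINDOW_S = 3.0        # a look must follow vocal onset within this window
NEW_VOCALIZATION_GAP_S = 5.0   # gap after which a vocalization counts as 'new'

@dataclass
class Look:
    looker: str
    target: str
    start: float  # seconds from session start

@dataclass
class Vocalization:
    vocalizer: str
    onset: float  # seconds from session start

def directed_targets(voc, looks):
    """Return the set of babies a vocalization is coded as directed at.

    Assumption: only looks made by the vocalizer that start within
    3 s after the vocal onset count. An empty set means 'undirected'.
    """
    return {
        look.target
        for look in looks
        if look.looker == voc.vocalizer
        and 0.0 <= look.start - voc.onset <= DIRECTED_WINDOW_S
    }

def is_new(voc, previous):
    """A vocalization is 'new' if it starts more than 5 s after the previous one."""
    return previous is None or voc.onset - previous.onset > NEW_VOCALIZATION_GAP_S

# Hypothetical events, not taken from the recordings.
looks = [Look("Ann", "Joe", 10.5), Look("Ann", "Mona", 12.0), Look("Joe", "Ann", 30.0)]
v1 = Vocalization("Ann", 10.0)  # followed by looks at both Joe and Mona
v2 = Vocalization("Ann", 20.0)  # no look by Ann within 3 s, so undirected
print(directed_targets(v1, looks), directed_targets(v2, looks), is_new(v2, v1))
```

Returning a set rather than a single target reflects the point made above that one vocalization can be directed at both of the other babies.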
Table 12.1(a) Looking data in the Red Hat trio (12 minutes’ duration)
Looking time in seconds.

                                   Joe    Ann    Mona   Total looking at other babies
Joe looks at                       –      285    122    407
Ann looks at                       365    –      120    485
Mona looks at                      348    296    –      644
Totals: baby looked at by others   713    581    242

Table 12.1(b) Looking data in the Cats’ Chorus trio (14 minutes’ duration)
Looking time in seconds.

                                   Jim    Barbara   Mary   Total looking at other babies
Jim looks at                       –      224       215    439
Barbara looks at                   238    –         293    531
Mary looks at                      269    202       –      471
Totals: baby looked at by others   507    426       508
Vocalizations may also be responded to by both other babies, and they often are. Vocalizations
were coded as ‘musical’ if they included complex changes of pitch, were not at all raucous, and
were deemed attractive to the ear. Satisfactory agreement was gained on this measure (k = 0.73),
although clearly it excludes a great deal of rhythmic sound that a more inclusive criterion might
also classify as musical. Rhythm is something that is not necessarily restricted to the auditory
domain, which makes coding it difficult and defining it as ‘musicality’ contentious (for example,
is a baby who repetitively looks down between her legs or kicks out her legs doing this
‘musically’?). To take account of these possible non-vocal manifestations of rhythm would
have required analysis of all physical movements made by the babies, which was not technically
possible in this study.
Fig. 12.2 Ann and Joe play ‘footsie’ in the Red Hat trio. (See also colour plate 4.)
To begin in the simplest fashion, I reduced these data to a single ‘coefficient of attraction’
by dividing the duration of gaze at each baby’s favoured other by the duration of gaze at their
less preferred baby. In Figure 12.3, we can see that Joe’s (9 months old) coefficient is 2.34 toward
Ann (9 months old), Ann’s is 3.04 toward Joe, and Mona’s (6 months old) is 1.18 toward Joe.
This creates a pattern of exclusion that babies can react to in various ways (Selby and Bradley
2003a): two are attracted to each other and the third is the ‘gooseberry’ (i.e., the one left out).
Alternatively, we have the conformation that occurs in the Cats’ Chorus, a kind of Catherine
wheel of attraction: Jim (8 months old) is attracted to Barbara (9 months old) with a coefficient
of 1.04, Barbara is attracted to Mary (9 months old) at 1.23, and Mary is attracted to Jim at 1.33.
Fig. 12.3 (a) Patterns of attraction in the Red Hat trio. (b) Patterns of attraction in the Cats’
Chorus trio. The length and direction of the arrows represent the coefficient of attraction (see text)
for each baby; each coefficient is also displayed numerically. Black circles show the babies who
looked most at others (the baby’s total looking at both other babies combined); grey circles show
who gazed second most at others; and white circles show the babies who looked at others the
least (data on which these figures are based are shown in Tables 12.1a and 12.1b).
Figure 12.3a (Red Hat) shows a more asymmetrical pattern of looking than that in the Cats’
Chorus (Figure 12.3b) – as represented by the far higher coefficients in Red Hat than Cats’ Chorus.
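The arithmetic behind the coefficient of attraction can be checked with a short sketch. The looking times are transcribed from Tables 12.1a and 12.1b; the dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# Looking times in seconds, transcribed from Tables 12.1(a) and 12.1(b).
LOOKING = {
    "Red Hat": {
        "Joe": {"Ann": 285, "Mona": 122},
        "Ann": {"Joe": 365, "Mona": 120},
        "Mona": {"Joe": 348, "Ann": 296},
    },
    "Cats' Chorus": {
        "Jim": {"Barbara": 224, "Mary": 215},
        "Barbara": {"Jim": 238, "Mary": 293},
        "Mary": {"Jim": 269, "Barbara": 202},
    },
}

def attraction(gaze_by_target):
    """Return (favoured baby, coefficient) for one baby's looking times.

    The coefficient is gaze at the favoured baby divided by gaze at the
    less preferred baby, so it is always at least 1.
    """
    favoured = max(gaze_by_target, key=gaze_by_target.get)
    less_preferred = min(gaze_by_target, key=gaze_by_target.get)
    return favoured, gaze_by_target[favoured] / gaze_by_target[less_preferred]

coefficients = {
    trio: {baby: attraction(times) for baby, times in babies.items()}
    for trio, babies in LOOKING.items()
}

for trio, babies in coefficients.items():
    for baby, (favoured, coeff) in babies.items():
        print(f"{trio}: {baby} -> {favoured} {coeff:.2f}")
```

A coefficient near 1 means a baby divided its gaze almost evenly between the other two, which is the situation in the Cats’ Chorus; the larger values in Red Hat reflect the mutual preference of Joe and Ann.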
This picture can be informatively juxtaposed to data on the number of babies’ vocalizations.
Jim is the most frequent vocalizer in the Cats’ Chorus (Table 12.2b). While his vocalizations are
relatively effective in attracting the others’ visual attention (31 out of his 41 cries got at least one
of the other babies to look at him), particularly his frequently blown ‘raspberries’, Barbara and
Mary preferred to vocalize to each other rather than to Jim, having attraction coefficients of 2.7 and 3.5,
respectively (based on number of vocalizations). Overall, Jim vocalizes only slightly more to
Barbara than to Mary (coefficient = 1.2) and the largest proportion (41 per cent) of his vocaliza-
tions are directed at neither. By contrast, Mary (18 per cent) and Barbara (19 per cent) make
proportionately few undirected vocalizations. A similar picture emerges for the Red Hat trio
(Table 12.2a). Ann utters more than 200 vocalizations during the session, largely in bursts
of between two and eight staccato, open-mouthed ‘Ah!’ sounds, the massive preponderance
being directed at Joe (coefficient of attraction = 12.1). Joe marginally prefers Ann to Mona (coef-
ficient = 1.2), although the biggest proportion of his sound-making is undirected (54 per cent).
Mona prefers Joe to Ann (coefficient = 2.0). Both Mona (14 per cent) and Ann (6 per cent) make
proportionately few undirected sounds.
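The percentages of undirected vocalizations quoted above follow directly from the counts in Tables 12.2a and 12.2b. The sketch below reproduces them; the variable and function names are illustrative only.

```python
# (undirected, total) vocalization counts transcribed from Tables 12.2(a) and 12.2(b).
COUNTS = {
    "Joe": (13, 24), "Ann": (14, 223), "Mona": (1, 7),
    "Jim": (17, 41), "Barbara": (6, 32), "Mary": (2, 11),
}

def undirected_percentage(undirected, total):
    """Undirected vocalizations as a whole-number percentage of all vocalizations."""
    return round(100 * undirected / total)

shares = {baby: undirected_percentage(u, t) for baby, (u, t) in COUNTS.items()}

for baby, share in shares.items():
    print(f"{baby}: {share} per cent undirected")
```

In each trio one baby (Joe, Jim) accounts for most of the undirected sound-making, while the other two direct nearly all of their vocalizations at a partner.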
The musical utterances identified in both trios occurred in two kinds of intersubjective setting.
The majority (73 per cent—24 out of 33) were directed at another baby, and the minority were
undirected. Directed musical utterances were most common (63 per cent of all directed musical
utterances—15 out of 24) where the vocalizer had just initiated or was in the midst of initiating
engagement in response to an overture from the other baby (often when smiling with eyebrows
raised; cf. Eibl-Eibesfeldt 1968). These ‘flirtatious’ utterances are usually brief (mean = 1.9 sec-
onds; SD = 1.2 seconds), although they may lead into a longer sentence of sing-song babbling.
They typically occur in delighted response to an initiative from the other baby.
Alternatively, babies may vocalize when they are looking at the ceiling or to a point in
space. These undirected ‘songs’ are on average considerably longer (mean = 6.7 seconds;
SD = 3.4 seconds) than their directed counterparts. They typically occur when the baby has
withdrawn from immediate engagement with the other babies, is apparently relaxed and perhaps
oblivious of his or her surroundings, and suggest that the baby is playing with sound. Barbara
made four utterances of this kind, on two occasions interspersing a long utterance with a long
‘A-a-a-a-a-a-a’ rhythm by patting her hand on and off her open mouth in the same manner that
small children make ‘Red Indian’ war cries when pretending to fight in ‘cowboys and Indians’.
Mary sang a long and intricate lyric to her hand (which was held up approximately 5 centimetres
from her mouth for 11.1 seconds). Here, we see music being linked with the beginnings of the
‘capacity to be alone’ in the presence of others, which Winnicott (1958) saw as fundamental to
the development of the ‘true self’. Other authors, such as Buckholz and Helbraun (1999) and
Sander et al. (1979) underline the need for both engaged interaction and ‘open space’ times when
the infants’ behaviours are not directly joined with the behaviour of the others present, so that
they are free to explore ‘transitional space’ (Winnicott 1974). In short, we have observed two
contrasting intersubjective settings for musical vocalization in trios of 9-month-olds. One
celebrates enthusiastic responsive engagement with another; the other is associated with
‘time out’ from direct engagement.

Table 12.2(a) Numbers of vocalizations in the Red Hat trio

          Directed      Undirected   Musical          Total vocalizations
Joe       To A = 6      13           Directed = 0     24
          To M = 5                   Undirected = 2
Ann       To J = 193    14           Directed = 10    223
Mona      To J = 4      1            0                7
          To A = 2

Table 12.2(b) Numbers of vocalizations in the Cats’ Chorus trio

          Directed      Undirected   Musical          Total vocalizations
Jim       To B = 13     17           Directed = 2     41
Barbara   To J = 7      6            Directed = 6     32
          To M = 19                  Undirected = 4
Mary      To J = 2      2            Directed = 6     11
          To B = 7                   Undirected = 1
12.5 The beginnings of coordinated sound-making

The two trios under discussion illustrate two different ways in which sound can be coordinated
in infant groups.
12.5.1 Red Hat trio

I will not comment here on instances of direct imitation, although the infants we observe
sometimes show a very precise capacity for imitation with simultaneous regard to pitch, phrasing
and tempo. The skills underpinning imitation are well illustrated in the vocal interactions
we recorded from the Red Hat trio. The main structure of this conversation was given by Ann’s
very many different groupings of rather similar staccato utterances (Ah! Ah! Ah! etc.), largely
all occurring at the same pitch and occurring in regular rhythmic bursts of between two and
eight calls. These utterances were often repeated: for example, a sequence of three three-beat
utterances followed by a four-beat utterance followed by six two-beat utterances. Overall, there
were 44 single vocalizations, 35 two-beat utterances, 13 three-beat utterances, 4 four-beat
utterances, 3 five-beat, 3 six-beat, 3 seven-beat and 1 eight-beat utterance. Some of these were
co-produced.
The vast majority (87 per cent) of Ann’s vocalizations were directed at Joe. From time to time,
Joe (or Mona) would vocalize, much in the same style as Ann, and sometimes synchronously
with Ann. Alternatively, another baby might echo the rhythm of an utterance that Ann had just
made. Depiction of the sound wave forms allows us visually to represent the rhythm (the timing
and length) of these utterances. Figure 12.4 shows a two-beat utterance by Ann (pulses numbered
1 and 2) which is quickly echoed (at a similar pitch) by Joe (pulses 3 and 4). This by itself under-
lines the interest in and capacity for detecting and matching rhythms in 9-month-olds engaged
in spontaneous vocalization.
Fig. 12.4 Sound wave representation of vocalizations in the Red Hat trio. Joe echoes Ann, and Ann’s
response. The pulses are numbered for identification. (Image produced using CoolEdit 96 software.)
[Waveform: amplitude (−100 to 100) against time (7:18 to 7:31); pulses numbered 1 to 12.]
Fig. 12.5 Red Hat trio: a previous five-beat duet with Joe is repeated solo by Ann.
[Waveform: amplitude (−100 to 100) against time (6:18 to 6:31); pulses numbered 1 to 10.]
However, the phenomenon did not end there. Typically, Ann would quickly follow up vocalizations
by the other babies, reasserting or elaborating on the rhythmical structure she and her
peers had just co-produced. Thus, in Figure 12.4 we can see how a two-pulse utterance by
Ann (pulses 1 and 2), when echoed by Joe (pulses 3 and 4), is quickly repeated by Ann four times,
with variations in emphasis and timing (pulses 5 and 6, 7 and 8, 9 and 10, and 11 and 12).
On another occasion, a five-beat phrase (pulses 1 to 5 in Figure 12.5) was collaboratively
produced in a brief duet by Ann and Joe, in which Joe entered accurately on the second, third and
fifth pulses of Ann’s phrase. Ann then quickly (1.5 seconds later) repeated a five-beat phrase
with almost identical timing (pulses 6 to 10). The only other five-beat phrase in the entire
twelve-minute session came four and a half minutes after these two. Both of these examples are
early illustrations of others’ sound-makings being aurally/orally ‘internalized’ or psychically
‘ingested’ by a 9-month-old, but actively, musically and therefore actually ‘in public’ (Morss
1988). In another example (Figure 12.6), one pair of two-beat utterances was co-produced by
all three babies. Ann vocalized briefly and then Mona made a longer vocalization (pulses 1
and 2). Six seconds later, Ann briefly vocalized again (pulse 3), after which Joe vocalized at greater
length (pulse 4), collaboratively creating a similarly stressed ‘echo’ of the previous collaborative
sound-pattern. This marked short–long rhythm was not otherwise seen in any two-beat
utterance during this session.
Fig. 12.6 Two co-produced two-beat utterances involving all three babies in the Red Hat trio.
[Waveform: amplitude (−100 to 100) against time (8:22 to 8:35); pulses numbered 1 to 4.]
Ann’s barrage of staccato vocalizations does not fall under the definition of ‘musical’ employed
here. Yet Ann undoubtedly was strongly attracted to Joe (see Selby and Bradley 2003a for details
in support of this claim) and was musically excited when he responded to her overtures. Hence, a
less stringent, less exclusive criterion for what is musical than the one used here might easily have
included Ann’s rhythmical vocalizations (but see the discussion above).
12.5.2 The Cats’ Chorus trio

Daniel Stern and his colleagues reported from their studies of 4-month-old twins ‘the parallel
emergence of two separate modes of vocal communication [in infancy], which differ structurally
and functionally’: vocalizing in alternation and in unison (Stern et al., 1975, p. 96). From a larger
biological perspective, it is not unusual for different species such as birds to develop both
antiphonal and synchronous or coaction communication modes to serve different purposes (e.g.,
Thorpe 1961). Stern and his co-authors found that ‘the coaction pattern occurs almost twice as
frequently as the alternating pattern’, although all mother–baby dyads appeared capable of per-
forming in both modes. However, the psychological significance of vocalizing in unison remains
unclear from Stern et al.’s study. As they note, unison may grace many circumstances: shared
delight, love, riot, collective anger and sadness.
To show that it is possible psychologically to differentiate between alternating and synchronous
vocalizations, as Stern wishes, we need to have inspected some clear, detailed examples of babies
vocalizing in unison (but not crying), and in alternation. In this connection, the most striking
feature of the recording called the Cats’ Chorus is an almost symphonic sequence of synchronous
vocalizations from all three of the trio over a period of two minutes just before the close of
the 14-minute session (i.e., from 11 m, 30 s to 13 m, 30 s of the interaction). The early part of the
session had been taken up by much ‘squawking’ from Jim and, later, a mutually intriguing
game of ‘footsie’ between Mary and Barbara. Mary had spent the first eight minutes of the
meeting in silence, although she was visually engaged. Barbara (average length of directed
look (dl) = 5.1 s) had been the most friendly and euphonic of the babies. She was the one
who ‘sang’ to Jim when he finally responded with a directed smile to the last of her many failed
initiations addressed to both of her partners. Mary (average dl = 4.5 s) had been ‘cooler’, and
Jim (average dl = 3.9 s) had been noisier, more abrupt and fleeting. The final period of the
session, leading up to and including the cats’ chorus (from which this trio got its name), was
described as follows:
Prima facie description

At last (8 m 10 s), Barbara seems to be getting tired of all this – her sounds are grizzling too
now, and she hides her eyes and her face looks tired. She sits back. Jim is still making noises.
Barbara’s noises begin to grizzle less and be more like babbling. Mary responds to this contin-
uous burble with a sneeze-like noise, while Barbara continues singsong. Both Mary and Jim
look at Barbara. Mary is looking distressed and withdrawn too as Barbara seems to have
stopped trying to interact directly. Mary makes crying-like noises and Jim responds with high
and melodious ‘too, too, too … ‘. Jim’s sounds develop into guttural ones, raspberries and then
trilling, throaty noises, too. Mary makes two sort of sobbing noises, then ‘too, too, too’ in a
melodious way.
Then Barbara responds to some of Mary’s vocalizations by sing-songing, sitting forwards
again towards her, and bringing her arms up and down in her characteristic way. It is as though
she has regrouped to face another foray into social life. Continues in these ways (9 m 53 s). All
three intermittently show they are aware and sometimes responding to the others including
their vocalizations, but rarely is there more than one baby making a noise at any one time.
They all seem to be getting more frustrated and grizzly. Barbara continues occasional ‘hello’
noises and overtures. Mary at one point is grasping at the stroller strap and grizzling. At the
same time ‘too’ noises and melodious longer notes continue from time to time.
But then they start to make noises at the same time as each other. Jim is providing raspberry
noises. Barbara is looking at him making low, almost grizzly noises continually, and after a
while Mary adds a fluting ‘aeeeh’. Mary’s second vocalization is a musical one (11 m 20 s), but
when Jim stops his raspberrying, she is more grizzly, Barbara stops her continuous noise, then
restarts ‘by, by, by, by…’. Both the girls stop then Barbara restarts, as does Jim. He stops shortly
while Barbara develops her vocalization into ‘ah, ah, ah’s’ which become higher and higher.
The other two look at her, and Jim makes a raspberry. Barbara continues her vocalizations,
but becoming more upset and looking at Jim. Mary joins in ‘ya, ya, ya, ya,…’ and Jim blows a
raspberry. Barbara’s noises get higher and Jim provides more raspberries. Barbara stops then
restarts and Jim starts to grizzle and raspberry and make ‘ahh ah’ noises at the same time as
Barbara. Barbara begins to wail in earnest and the session closes.
Even in this first phase of our interpretive procedure, there are no obvious features, structurally
and functionally (Stern et al., 1975, p. 96) distinguishing synchronous vocalizations from
alternating ones. The three babies, when chorusing together, appear to be continuing to do
simultaneously what they had previously been doing in alternation. Thus, Jim’s raspberries occur
in both types of sound-making. (Like Barbara’s greeting ‘Ey-yo!’ [Hello] to both Jim and Mary
near the start of the session, Jim’s raspberries are a ‘trick’ that he has imported from his
life outside the recording studio. Maybe this is a party piece that goes down well at home.)
Certainly, by the time the unison begins in earnest four-fifths of the way through the interaction,
the novelty of Jim’s raspberries is well-worn, though they are still effective in gaining Barbara’s
attention. Clearly, then, vocalizing at the same time as each other does not necessarily imply the
comodulation of sound that is essential to vocalizing musically in unison. On the other hand, two
of the sound sequences made during the three-way vocalizations we have dubbed cats’ chorus
were deemed to be musical when applying the definition of musicality introduced above.
Obviously, a full ‘diachronic’ history of the trio’s interaction to date may not be required to
address the question of whether there is genuine choral unison in recordings such as the cats’
chorus. This could be done in the same way as was done for the Red Hat trio above: by demon-
strating a complex sharing of rhythmical understandings, for example, or finding an intelligible
interplay of melodic contour and pitch. The point being made here, however, is that any such
comodulation, were it to be demonstrated, would be something that existed in addition to the
socio-emotional interplay that creates the content of both trios’ interactions. In short, contrary to
Stern et al. (1975), I conclude that the content of the cats’ chorus is wine crushed from the same
vine of liking and frustration that soaked the preceding eleven minutes of the babies’ interaction.
In the main, Jim continues doing in the cats’ chorus much the same that he had been doing
before it (i.e., watching the girls, looking around, wriggling and squawking). Mary has just found
her voice and is exploring it. And Barbara, though still well-disposed to Jim and Mary, is tiring.
Other interpersonal dynamics may also be in operation. Thus, it sometimes appears that an
infant uses sound in a parental function to calm and reassure a distressed peer, mirroring and
modulating what is heard from another. The fleeting periods of euphony we may detect in this
kind of cats’ chorus must therefore draw their inspiration from and embellish deeper longer-
lasting socio-emotional dynamics – dynamics of the kind that are at the heart of all human and
animal lyrical utterance.
12.6 Conclusion
This chapter argues that detailed case-analysis of the spontaneous vocalizations of infants in
infant-only groups can cast a new and valuable light on the origins of musicality. Two features of
infant–infant communication in particular have been considered: the relation of music-making
to early psychical attractions and the beginnings of coordinated sound-making. With regard to
the ‘erotic’ or affiliative behaviours, I have made a distinction between directed ‘music’ made on
the upbeat of attraction and undirected ‘music’ made during time out from immediate intersub-
jective engagements. With regard to sonic coordination, I have presented illustrative evidence of
sophisticated collaborations in rhythmical sound-making between 9-month-olds. On the other
hand, I have argued on empirical grounds that vocalizing at the same time is not necessarily
choral (or musical). And, if it is choral, such sound-making owes its musicality as much to the
socio-emotional dynamics that it embellishes as to any freestanding capacity for auditory
comodulation.
Most researchers agree that we have to approach the question of ‘what music is’ in a step-wise
manner if we wish to establish whether and how infants are musical. I conclude from our work on
infants in groups that among the first steps we must take are those towards understanding
the collective dimension of infants’ intersubjective being; for it is in intersubjectivity that the
inspiration, power and appeal of music most characteristically lie.
Acknowledgements
Thanks to Jane Selby for conducting the first phase of the relational analysis of the two trios,
for improving drafts and for her enthusiasm for the project; to Colwyn Trevarthen for advice in
setting up our infant lab, perfecting the diagrams and his enthusiasm for the work; and to Stephen
Malloch for his support, interest and help, especially with the sound-editing software.
References
Bahrick LE, Lickliter R and Flom R (2004). Intersensory redundancy guides the development of selective
attention, perception, and cognition in infancy. Current Directions in Psychological Science, 13, 99–102.
Bernstein N (1967). The coordination and regulation of movements. Pergamon, Oxford.
Bowlby J (1982). Attachment, 2nd edn. Penguin Books, Harmondsworth, UK.
Bradley BS (1981). Negativity in early infant–adult exchanges and its developmental significance.
In WP Robinson, ed., Communication in Development, pp. 1–37. Academic, London.
Bradley BS (1989). The asymmetric involvement of infants in social life: Consequences for theory.
Revue Internationale de Psychologie Sociale, 2, 61–81.
Bradley BS (1991). Infancy as paradise. Human Development, 34, 35–54.
Bradley BS (2005). Psychology and experience. Cambridge University Press, Cambridge.
Bradley BS and Selby JM (2004). Observing infants in groups: The clan revisited. The International Journal
of Infant Observation, 7, 107–122.
Bradley BS and Trevarthen C (1978). Babytalk as an adaptation to the infant’s communication.
In N Waterson and C Snow, eds, The development of communication, pp. 75–92. Wiley, London.
Bråten S and Trevarthen C (2007). Prologue: From infant intersubjectivity and participant movements to
simulation and conversation in cultural common sense. In S Bråten, ed. On being moved: From mirror
neurons to empathy, pp. 21–34. John Benjamins, Amsterdam.
Britton R (1989). The missing link: Parental sexuality in the Oedipus complex. In J Steiner, ed., The Oedipus
complex today, pp. 83–101. Karnac, London.
Bruner JS (1975). The ontogenesis of speech acts. Journal of Child Language, 2, 1–19.
Buckholz ES and Helbraun E (1999). A psychobiological developmental model for an ‘alone time’ need in
infancy. Bulletin of the Menninger Clinic, 63, 143–158.
Chance MRA (1962). The interpretation of some agonistic postures: The role of ‘cut-off’ acts and postures.
Symposium of the Zoological Society of London, 8, 71–89.
Chomsky N (1959). Review of Skinner’s Verbal behavior. Language, 35, 26–58.
Condon WS and Sander LW (1974). Neonate movement is synchronized with adult speech: interactional
anticipation and language acquisition. Science, 183, 99–101.
Darwin CR (1872). The expression of the emotions in man and animals. Murray, London.
Darwin CR (1877). A biographical sketch of an infant. In HE Gruber and PH Barrett, eds, Darwin on man:
A psychological study of scientific creativity (1974), pp. 464–474. Wildwood, London.
Denham SA, Renwick SM and Holt RW (1991). Working and playing together: Prediction of preschool
socio-emotional competence from mother–child interaction. Child Development, 62, 242–249.
Eibl-Eibesfeldt I (1968). Ethology of human greeting behavior. Zeitschrift für Tierpsychologie,
25(6), 727–744.
Ejiri K (1998). Rhythmic behavior and the onset of canonical babbling in early infancy. Japanese Journal of
Fivaz–Depeursinge E and Corboz-Warnery A (1999). The primary triangle: A developmental systems view
of mothers, fathers and infants. Basic Books, New York.
Gardner H (1949). The art of T. S. Eliot. Cresset Press, London.
Geertz C (1973). The interpretation of cultures: Selected essays. Hutchinson, London.
Habermas J (1970). Towards a theory of communicative competence. In HP Dreitzel, ed., Recent Sociology,
No. 2, pp. 114–148. Macmillan, London.
Halliday MAK (2003). The language of early childhood, volume 4, edited by J Webster. Continuum, London.
Jaffe J, Beebe B, Feldstein S, Crown CL and Jasnow MD (2001). Rhythms of dialogue in infancy: Coordinated
timing in development. Monographs of the Society for Research in Child Development, 66, 1–131.
Kugiumutzakis G (1993). Neonatal imitation in the intersubjective companion space. In J Nadel and
L Camaioni, eds, New perspectives in early communicative development, pp. 23–47. Routledge, London.
In J Nadel and G Butterworth, eds, Imitation in infancy, pp. 127–185. Cambridge University Press,
Cambridge.
Latour B and Woolgar S (1986). Laboratory life: The construction of scientific facts. Princeton University
Press, Princeton, NJ.
Lee DN (1998). Guiding movement by coupling taus. Ecological Psychology, 10, 221–250.
1999–2000), 29–57.
Morss JR (1988). The public world of childhood. Journal for the Theory of Social Behaviour, 18, 323–343.
Mueller E (1991). Toddlers’ peer relations: shared meaning and semantics. In W Damon, ed., Child
development today and tomorrow, pp. 177–197. Jossey-Bass, San Francisco, CA.
Murray LM and Trevarthen C (1985). Emotional regulation of interactions between two-month-olds and
their mothers. In TM Field and NA Fox, eds, Social perception in infants. Ablex, Norwood, NJ.
Onions CT (ed.) (1973). The shorter Oxford English dictionary (2 vols). Clarendon Press, Oxford.
communication, cognition and creativity. In LP Lipsitt and CK Rovee-Collier, eds, Advances in infancy
research, vol. I., pp. 163–224. Ablex, Norwood, NJ.
Preisler G and Palmer C (1986). The function of vocalization in early parent-blind child interaction.
In B Lindblom and R Zetterstrom, eds, Precursors of early speech, pp. 269–277. Macmillan, Basingstoke.
Reddy V (1991). Playing with others’ expectations: Teasing and mucking about in the first year. In A Whiten,
ed., Natural theories of mind: Evolution, development and simulation of everyday mindreading, pp. 143–158.
Basil Blackwell, Cambridge, MA.
Rochat P (2004). Emerging co-awareness. In G Bremner, ed., Essays in honor of George Butterworth,
Ryan J (1974). Early language development: towards a communicational analysis. In MPM Richards, ed.,
The integration of a child into a social world, pp. 185–213. Cambridge University Press, Cambridge.
Sander LW, Stechler G, Burns P and Lee A (1979). Change in infant and caregiver variables over the first
two months of life: integration of action in early development. In E Thoman, ed., Origins of the infant’s
social responsiveness, pp. 806–836. Erlbaum, Hillsdale, NJ.
Selby JM and Bradley BS (2003a). Infants in groups: A paradigm for the study of early social experience.
Human Development, 46, 197–221.
Selby JM and Bradley BS (2003b). Infants in groups: extending the debate. Human Development, 46, 247–249.
Smitherman C (1969). The vocal behavior of infants as related to the nursing procedure of rocking.
Nursing Research, 18, 256–258.
Stern DN (1971). A micro-analysis of mother–infant interaction: behaviour regulating social contact
between a mother and her 3.5-month-old twins. Journal of the American Academy of Child Psychiatry,
13, 402–421.
psychology, 2nd edn. Basic Books, New York.
Stern DN, Jaffe J, Beebe B and Bennett SL (1975). Vocalising in unison and in alternation: Two modes of
communicating within the mother-infant dyad. In D Aronson and RW Rieber, eds, Developmental
psycholinguistics and communication disorders, pp. 89–100. New York Academy of Sciences, New York.
Thorpe WH (1961). Bird song: The biology of vocal communication and expression in birds. Cambridge
Tissaw MA (2007). Making sense of neonatal imitation. Theory and Psychology, 17, 217–242.
Tremblay-Leveau H and Nadel J (1996). Exclusion in triads: Can it serve ‘meta-communicative’ knowledge
in 11- and 23-month old children? British Journal of Developmental Psychology, 14(2), 145–158.
Trevarthen C (1979a). Communication and cooperation in early infancy: a description of primary
intersubjectivity. In M Bullowa, ed., Before speech: The beginning of interpersonal communication,
Trevarthen C (1979b). Instincts for human understanding and cultural cooperation: Their development in
infancy. In M von Cranach, K Foppa, W Lepenies and D Ploog, eds, Human ethology: Claims and limits
of a new discipline, pp. 530–571. Cambridge University Press, Cambridge.
Trevarthen C (1986). Development of intersubjective motor control in infants. In MG Wade and HTA
Whiting, eds, Motor development in children: Aspects of coordination and control, pp. 229–261. Martinus
Nijhoff, Dordrecht.
Trevarthen C (1998). The concept and foundations of intersubjectivity. In S Bråten, ed., Intersubjective
communication and emotion in early ontogeny, pp. 15–46. Cambridge University Press, Cambridge.
Trevarthen C and Hubley PA (1978). Secondary intersubjectivity: Confidence, confiding and acts of
meaning in the first year. In A Lock, ed., Action, gesture and symbol: The emergence of language,
pp. 183–229. Academic, London.
Trevarthen C, Sheeran L and Hubley PA (1975). Psychological actions in early infancy. La Recherche,
6, 447–458.
Tronick EZ (1989). Emotions and emotional communication in infants. American Psychologist,
44, 112–119.
Vandell DL and Mueller EC (1995). Peer play and friendships during the first two years. In HC Foot and
AJ Chapman, eds, Friendship and social relations in children, pp. 181–208. Transaction,
New Brunswick, NJ.
Vandell DL, Wilson KS and Buchanan NR (1980). Peer interaction in the first year of life: An examination
of its structure, content and sensitivity to toys. Child Development, 51, 481–488.
Winnicott DW (1958). The capacity to be alone. International Journal of Psycho-Analysis, 39, 416–420.
Winnicott DW (1974). Playing and reality. Harmondsworth, Penguin.
Chapter 13
The effects of maternal depression on
the ‘musicality’ of infant-directed speech
and conversational engagement
Helen Marwick and Lynne Murray
13.1 Introduction
In this chapter, we look at the effect of the mental illness of depression on the ‘musicality’ of a
mother’s voice as she talks with her young infant, and discuss the possible impact of this on the
infant’s emotional and cognitive development. The term musicality has been used to capture fun-
damental characteristics of human vocal communication—the shared expressiveness in timing,
phrasing, intonation and voice quality. All of these dimensions can be identified in the earliest
caregiver–infant talk and play. We first summarize the characteristics of musicality—timing and
expression—in human communication in general, and then consider its special place in the vocal
expressiveness of adults talking to infants, and on the part of infants themselves. Finally, we
review findings on the impact of depression on maternal expressiveness, their implications for
how we should understand conversational engagements between depressed mothers and their
infants, and the significance for diagnosis and therapy.
13.2 Musicality in human communication

The prosodic and paralinguistic features of speech in conversation, such as intonation, tempo,
phrasing, loudness, pausing, turn-taking and voice quality, are of central importance in the com-
munication of emotion, attitudes, intentions and ideas (Halliday 1967; Searle 1969; O’Connor
and Arnold 1973; Lyons 1977; Crystal 1975, 1979). These are understood to be the key vocal
elements underpinning interpersonal understanding in communication (Bolinger 1964; Halliday
1975; Searle 1969; Crystal 1975, 1979; Bruner 1983; Marwick 1987).
The direction and range of pitch movement and levels of pitch height that make up the intona-
tion contour of an utterance (see Figure 13.1), or the succession of utterances within discourse,
and the type of phonation (e.g., whisperiness, harshness), the extent of laryngeal tension, loud-
ness, pace, and pausing within and around utterances afford—separately or in combination—
important potential meaning contrasts in the vocal expression of emotion, attitude and intent.
Together with the phonemic and propositional content, they embody components of the subjec-
tive state, interpersonal emotion and intent, and the communicative purpose of an utterance
(Crystal 1979; Marwick 1987).
Fig. 13.1 Descriptions of different forms of intonation (following Marwick 1987): simple fall, simple rise,
jump fall, jump rise, slope fall, slope rise, undulating fall, undulating rise, and level.
The complex interplay of expressive elements reveals crucial regularities in expressive function.
For example, pitch excursions signal emphasis within an utterance (Fry 1958; Daw 1977),
functioning contrastively to indicate ‘new’ or ‘given’ information and ‘focus’ within a conversational
topic, and ‘shared’ and ‘unshared’ knowledge and presuppositions within the context of discourse
(Halliday 1970; Brazil 1975; Coulthard 1977; Brown et al. 1980). In this way, they underpin
successful mutual understanding of contextual reference, meaning and expectations. Similarly,
a general raising of pitch level is associated with the strong emotional involvement of the speaker
in the utterance content (Scherer and Oshinsky 1977), and emotional meaning is found to be
reliably communicated by combinations of acoustic cues and prosodic features (Scherer and
Oshinsky 1977; van Bezooijen 1984). For example, anger is conveyed by high pitch level and wide
pitch range, fast tempo and loud voice, and sadness by low pitch and narrow pitch range, falling
pitch contour, slow tempo and ‘soft’ voice (Scherer 1979). Similar parameters distinguish musical
and gestural expressions of different emotions, and the intensity and interpersonal value of com-
munications (see Panksepp and Trevarthen, Chapter 7, this volume).
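The cue combinations just listed for anger and sadness can be written out as a small lookup, which makes the point that it is the combination of cues, rather than any single cue, that carries the emotional meaning. The following Python sketch is only an illustration: the feature names, the categorical values and the simple matching rule are assumptions introduced here, and it should not be read as the procedure used by Scherer (1979) or by any other study cited above.

```python
# Cue profiles as summarized in the text (after Scherer 1979).
# Feature names and the matching rule are hypothetical, for illustration only.
CUE_PROFILES = {
    "anger": {
        "pitch_level": "high",
        "pitch_range": "wide",
        "tempo": "fast",
        "loudness": "loud",
    },
    "sadness": {
        "pitch_level": "low",
        "pitch_range": "narrow",
        "pitch_contour": "falling",
        "tempo": "slow",
        "loudness": "soft",
    },
}


def match_score(observed, profile):
    """Return the fraction of a profile's cues that the observed utterance shares."""
    hits = sum(1 for cue, value in profile.items() if observed.get(cue) == value)
    return hits / len(profile)


def best_match(observed):
    """Return the emotion whose profile is best matched, or None if no cue matches."""
    scores = {emotion: match_score(observed, profile)
              for emotion, profile in CUE_PROFILES.items()}
    emotion = max(scores, key=scores.get)
    return emotion if scores[emotion] > 0 else None
```

An utterance coded as high in pitch, wide in range, fast and loud would match the anger profile on all four cues, whereas an utterance sharing only its low pitch level with the sadness profile would match on one cue in five.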
Papoušek and Papoušek (1981) demonstrate that prosodic patterns associated with certain
moods and affects remain unchanged from birth. Nevertheless, no simple direct association is
found in adult communicative conversation between any one particular intonation pattern and
any one grammatical function, conversational intention or interpersonal affect (Bolinger 1958;
Crystal 1979; Marwick 1987). Consistent specificity of function within the context of shared
experience and interpersonal intention is, however, indicated for some expressive forms. The ris-
ing intonation contour, for example, is held to indicate the incompleteness of an utterance, topic,
or idea in communication (Bolinger 1964; Lindsey 1981), and could thus be considered, depend-
ing on other contextual and expressive features such as accompanying voice quality, to signal a
conversational continuation, as found in conversational narrative (Britain and Newman 1992),
to invite or direct the other’s participation, or to convey the attitude of ‘demanding a response’
(Brown 1977). In each case, the modulation of the voice functions to maintain or heighten com-
municative involvement.
Conversational discourse analysis describes the prosodic development of topic and narrative
(Brandt, Chapter 3, and Erickson, Chapter 20, this volume), and recent work on conversational
interactions of mothers and young infants shows the parameters of discourse narrative to be in
place from the earliest weeks (Malloch 1999).
13.3 Infant acoustic preferences and sensitivities

Infants respond to prosodic and paralinguistic features in speech from birth. Discrimination of
pitch, intensity and temporal differences has been demonstrated by 3 to 5 days (Stratton and
Connolly 1973), and in newborns (Eisenberg 1976). Eisenberg found that low frequency sounds
addressed to newborns have a soothing effect, and that higher frequency sounds produce distress.
Blass (1987) found that newborns became attentive to a vocalized click sound and calmed to a
‘shh’ sound, but they did not attend to the sound of a triangle, or to a ‘psst’ sound. Studies by
Kearsley (1973), Webster et al. (1972) and Hutt et al. (1968) showed that infants are sensitive to
the normal pitch range of speech addressed to them. A very early preference, possibly arising
from experience in utero, has been observed in newborns for their mother’s voice (DeCasper and
Fifer 1980); and Fifer is reported by Stern (1985) to suggest that it is voice quality that underlies
this discrimination.
A number of studies indicate that infants are aware of narrative patterns in the register of
infant-directed speech. Mehler and colleagues (1978, 1979) found that 3-week-old infants could
recognize their mother’s voice, but not if she was reading from right to left, and therefore with
abnormal prosodic features of rhythm and intonation. Condon and Sander (1974) found that
infant movement was precisely synchronized with prosodic elements, such as rhythm, intensity,
stress and juncture or contingency, in adult speech. Discrimination of the location of stress, of
special significance for learning English, which is a stress-timed language, has been found at 1 to
4 months (Spring and Dale 1977), and different rhythmic patterns are distinguished at 3 to 4
months (Demany et al. 1977), as are tonal sequences at 5 months (Chang and Trehub 1977).
Kaplan (1969) found discrimination of rising and falling intonations between 4 and 8 months,
and Morse (1972) found that infants recognize intonation and acoustic cues for place of articula-
tion in the second month.
Taken together, the parameters that appear to be important for infants when listening to
speech have the characteristics of music. Two-month-old infants show interest in differences in
musical timbre (Michel 1973), and Caine (1991) and Standley (1998) found that vocal music
stimulation with premature and low birthweight neonates resulted in reduced stress behaviours,
increased weight gain and earlier hospital discharge. Early infant sensitivity to rhythm, timbre
and other prosodic features in the expressiveness of others’ voices plays a key role in establishing
and maintaining interpersonal connectedness, and may therefore be assumed to be of impor-
tance for the development of purposeful behaviour, emotional well-being and all conventional
forms of communication, including language.
13.4 Special features of vocal expressions addressed to infants

The prosody of speech to infants has distinctive and systematically modified characteristics in
comparison with talk between adults. The average pitch level is higher (Ferguson 1964; Blount
and Padgug 1977), the pitch patterns are characterized by smoothly gliding contours and
expanded pitch excursions (Papoušek and Papoušek 1981; Stern et al. 1982; Fernald and Simon
1984), and the amplitude is more strongly modulated (Cooper and Aslin 1990). Fernald and
Simon (1984) found that mothers’ speech to 3- to 5-day-old infants contained more whispered
speech, and a greater amount of prosodic repetition of intonation contour and whisper phona-
tion in utterance clusters. The expanded pitch contours consisted predominately of rising unidi-
rectional pitch glides, and fewer falling unidirectional pitch glides and bell shaped rise–fall or
fall–rise contours. Timing of speech to infants is also distinctive. Maternal speech rate is slower
(Papoušek et al. 1985), the utterances are regularly spaced (Beebe et al. 1979), and short, with
short pauses between them (Stern et al. 1977, 1983). Maternal response time to infant
vocalizations is found to be very short, typically less than a second (Stern et al. 1983; Bettes 1988).
Microanalysis reveals that mother and infant may share the pulse precisely (Malloch 1999;
Gratier and Apter-Danon, Chapter 14, this volume).
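A timing measure of the kind reported here, the delay between the end of an infant vocalization and the start of the mother's next utterance, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the input format (lists of times in seconds taken from an annotated recording) and the one-second criterion as a parameter are introduced here for the example, and the sketch does not reproduce the measurement procedures of Stern et al. (1983) or Bettes (1988).

```python
from bisect import bisect_left


def response_latencies(infant_offsets, mother_onsets):
    """Delay (s) from each infant vocalization offset to the next maternal onset.

    Both arguments are hypothetical lists of times in seconds. Infant
    vocalizations that are followed by no maternal utterance are skipped.
    """
    onsets = sorted(mother_onsets)
    latencies = []
    for offset in infant_offsets:
        i = bisect_left(onsets, offset)  # first maternal onset at or after the offset
        if i < len(onsets):
            latencies.append(onsets[i] - offset)
    return latencies


def proportion_within(latencies, limit=1.0):
    """Proportion of latencies shorter than `limit` seconds (0.0 if there are none)."""
    if not latencies:
        return 0.0
    return sum(1 for latency in latencies if latency < limit) / len(latencies)
```

Given the made-up times `response_latencies([1.0, 4.0, 9.0], [1.5, 4.25, 11.0])`, the first two maternal responses follow within a second and the third does not, so `proportion_within` would report two in three.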
These prosodic modifications are also found in the speech addressed to infants by adults who
are not parents (Fernald and Simon 1984), and in the speech of older children to infants (Sachs
and Devin 1976). They have been observed across languages, including a tonal language very dif-
ferent from English, in the speech of parents to preverbal infants (Fernald et al. 1989; Grieser and
Kuhl 1988). Mothers’ songs to young infants also show cross-cultural similarities of structure,
rhythm, melody and tempo (Trehub et al. 1993).
Speech to infants appears to be sensitive to their age. Thus, Stern et al. (1983) found that
prosodic modifications in speech directed to infants were more exaggerated at 4 months than at
earlier or later stages. Trevarthen and Marwick (1986), noting less variability of voice quality and
a predominance of ‘breathy’ phonation in maternal communication with a 6-week-old infant in
contrast with that of an 18-week-old infant, report systematic changes throughout the first year
within maternal vocal expression in relation to the developmental stage of the infant. An interest-
ing correlate of the changes in the first 6 months, indicating how age-related developments in the
infant may drive changes in parents’ behaviours, is found in imitations infants make of voice
sounds modelled to them in isolation. While the imitation of visible gestures of tongue protru-
sion and mouth opening declines from the neonate stage to the second month and then increases
after 5 months, the imitation of vocalizations becomes easier to elicit after 2 months, then
declines at 5 months (Kugiumutzakis 1999).
An important aspect of infant-directed speech is that it is both sensitive to, and helps regulate,
infant state and emotions. Papoušek et al. (1985) observed much parental imitative pitch match-
ing of the infant, and found that parents with 3-month-old infants used only a small number of
distinctive intonation patterns, predominately unidirectional rising glides, unidirectional falling
glides, and bell-shaped rise–fall contours, frequently repeated throughout the communicative
exchange. These could either activate the general state and motor activity of the infant, or soothe
and calm the infant, with falling contours prevalent in response to infants’ ‘fussy’ states, and ris-
ing, sinusoidal and bell-shaped contours prevalent when infants were pleasantly excited and
responsive. Similarly, Stern et al. (1982) found that mothers’ use of pitch contour was related to
an infant’s behaviour and mood, with a rising contour used to seek the infant’s attention, and a
rising–falling contour to maintain the infant’s interest. Stern et al. suggest that the contrasting
patterns of rise and fall of pitch in the bell-shaped and sinusoidal contours offer the infant a pat-
tern of build-up and reduction in stimulus level, which has been shown to evoke positive infant
affect (Emde et al. 1978; Sroufe and Waters 1976). Brazelton et al. (1974) and Papoušek et al.
(1986) observe that, in joyful interactions with young infants, the parent must modulate the level
of stimulation to avoid infant exhaustion and discomfort. Marwick et al. (1984) found that
moment-to-moment changes in mother–infant interactive engagement were systematically
reflected in altered settings of maternal voice quality features, whereas settings were maintained
throughout an episode of interactive engagement where interpersonal intentions remained con-
stant. For example, lax laryngeal tension with whispery voice was associated with gentle and
mutually absorbed playful engagement, and moderate laryngeal tension with whispery voice was
associated with seeking the infant’s attention or inciting more boisterous playful games. Stern,
Dore and Marwick (Stern 1985/2000; Trevarthen and Marwick 1986) found movement in inter-
personal intention and affect to be directly reflected by a change in maternal voice quality feature
setting; moreover, this was found to be more systematically organized at 4 months than at
1 year—that is, when infants are beginning to join in rituals of ‘action’ games or songs (Trevarthen
and Hubley 1978; Eckerdal and Merker, Chapter 11, this volume).
Infant preferences for varieties of infant-directed speech are reported to be clear in the first
month after birth (Cooper and Aslin 1990), and by 4 months of age, the mother’s special way of
talking is essential to the regulation of playful engagements (Fernald 1985; Werker and McLeod
1989; Stern 1990). Fernald and Kuhl (1987) identified both the level of pitch and pitch modula-
tion as the most influential prosodic elements for the infant’s preference.
13.5 Early vocalizations and reciprocal imitation

Intonation contours are considered to be the earliest form of communicative or linguistic struc-
turing of infant vocalizations and cries (Lenneberg 1967; Crystal 1975; Halliday 1975; Lieberman
1967; D’Odorico 1984). Stark et al. (1975) and Oller (1980) isolate a number of acoustic and
articulatory parameters for the description of early cries and vocalizations, including features of
pitch, loudness, breath direction and glottal and supraglottal restriction. Papoušek and Papoušek
(1981), observing the vocalizations of their infant daughter, found that (i) rising–falling pitch
contours were present in cries from birth, (ii) falling terminal glides on vowel sounds appeared in
the second and third months during quiet waking, and (iii) rising contours with steep glides on
squealing sounds and melodious intonation patterns appeared from the fourth month. The pitch
levels of infant vocalizations have been found to alter in response to the voice the infant is hear-
ing (Webster et al. 1972; Lieberman 1967). Papoušek et al. (1985) observed infants to match vocal
pitch at 2 months, and Kessen et al. (1979) report the same in 3- to 6-month-olds. Wendrich
(1981) observed pitch-matched vocalizations in response to sung tones in infants aged 3 to
6 months, and Kuhl and Meltzoff (1982) found pitch contour imitation at 5 months. Summers
(1984) found that 6-month-olds could discriminate changes in melody, and Reis (1982) found
infants of 7 months able to sing in pitch-matched and harmonic tones in response to a vocal cue.
Parents imitate infant sounds, providing sounds of a musicality that infants appreciate best
(Papoušek and Papoušek 1989; Trehub 1990). Papoušek et al. (1985) found mutual vocal imita-
tion between parents and their 3-month-old infants. Similarly, Trevarthen (1979, 1999) analysed
intricate reciprocal imitations in the vocal expression and other expressive behaviours of mother
and infant in a protoconversation at 6 weeks, and found they exhibited precise rhythmic timing
in the alternation of maternal and infant utterances; Malloch et al. (1997) confirmed a shared
periodicity in timing and rhythm in this same vocal interchange, providing a methodological
foundation for his theory that infants may participate actively in communicative musicality
(Malloch 1999).
In the second half of the first year, the infant regains motor control of voicing that appears
reduced between 4 and 6 months (Kugiumutzakis 1999), and imposes intonation contours upon
babbling (Lenneberg 1967; Kaplan and Kaplan 1970; Halliday 1975; Stark 1979; Oller 1980;
Papoušek and Papoušek 1981); De Boysson-Bardies et al. (1984) found that, during these
months, adult judges could identify infants from their own linguistic community on the basis of
metaphonological cues, such as voice quality and tonal contrasts, which were present within long
and coherent intonation patterns of infant babbling productions. Continuity from babbling to
later language development, in terms of phonotactic pattern and choice of sounds, is also
reported by Oller et al. (1976), and Oller and Eilers (1982).
From around 7 months onwards, infants have been observed to use intonation and voice quality
to produce and accompany certain systematic interpersonal meaning contrasts within communi-
cation, such as a high-pitched rising contour used as an invitation to mutual play, a horizontal
contour with vibrations being a ‘nagging’ request (Papoušek and Papoušek 1981) and a falling
intonation accompanying a protest vocalization (Carter 1978), or being used for demanding
(Menn 1976) or labelling (Dore 1975). Rising intonation is also associated with questioning
utterances (Menyuk 1971; Weeks 1978), requesting or offering (von Raffler-Engel 1973; Menn 1976),
requiring a response (Halliday 1975) or requesting an answer (Dore 1975). Investigators agree that
intonational idioms are more stable than the accompanying segmental component (Dore et al.
1976; Crystal 1979), and Bruner (1983) further clarified understanding of preverbal and early ver-
bal intonation use by demonstrating that the contour shape and direction of early request-type
utterances were dependent on specifiable contextual features. Prosodic emphasis is used to mark
‘new’ information as early as the two-word stage (Wieman 1976; MacWhinney and Bates 1978),
and infants are observed to follow the intonation use of their mothers throughout their second
year. By 28 months, they produce functionally clear conversational utterances, using the complex
patterns of intonational expression, contextually unambiguous in function, that are found in
the interpersonal communication of their mothers (Marwick 1987).
13.7 The function of specialized vocalizations to infants

The studies outlined above indicate that mothers’ intonation and voice in speech to infants are
characterized by distinctive and systematically modified prosodic and paralinguistic features,
which are not only perceptually relevant to the infant and attended to, but interpersonally rele-
vant, engaging an infant’s affects in supportive ways. Thus, they are considered to be fundamental
to the intersubjective quality of the relationship of the carer and infant, encouraging mutual
attunement and enabling mutual affective regulation and communicative involvement. Shared
prosodic focus and shared excursions of prosodic direction and flow evidently facilitate joint
emotional and conceptual referencing and the all-important mutual understanding of commu-
nicative affect and intention within cooperative interaction. In this way, vocal communication
underpins later interpersonal, linguistic and cognitive development in the infant (Bruner 1983;
Trevarthen and Marwick 1986; Papoušek and Papoušek 1997; Stern 1985/2000).
In sum, the systematic adjustments of prosodic and paralinguistic characteristics in infant-
directed speech of carers in non-clinical populations accord with infant preferences and capaci-
ties to discriminate. They are attuned to infant tolerance limits within the dynamics of
engagement, they enhance communicative attention and engagement, facilitating infant partici-
pation, and they regulate infant mood state (Brazelton et al. 1974; Kagan 1970; Papoušek and
Papoušek 1981; Stern et al. 1982; Trevarthen and Marwick 1986; Werker and McLeod 1989;
Trevarthen 1999; Tronick and Weinberg 1997). Thus, they are adapted to mediate in a partner-
ship of interpersonal communicative motivations (Stern 1985/2000; Papoušek and Papoušek
1997; Trevarthen and Aitken 2001), and are a natural part of the regulation of joint intentions
and the learning of cultural meanings (Trevarthen 1988, 2005).
As their communicative interactions increasingly involve shared attention to the world around
them and action upon objects in it, the communicative and expressive behaviours in successful
interactions between carer and infant lead to the development, through expressive focus such as
prosodic emphasis and word repetition, of joint conceptual and linguistic reference to objects
and events. This, in turn, facilitates the development of shared meaning and cooperative under-
standing, which are fundamental to the communicative, interpersonal and linguistic functions of
a human world (Trevarthen and Hubley 1978; Bruner 1983; Fernald and Simon 1984; Papoušek
and Papoušek 1997).
Differences in the vocal expressiveness of mothers’ speech to male and female infants provide
further information on the role of specialized parental vocalizations to infants, and of mothers’
responses to infants’ behaviours. Sex differences have been found at 6 months in the affective
expressiveness and regulatory behaviours of infants in communicative interactions (Weinberg
and Tronick 1996), and in infants’ listening to non-vocal tones and musical stimuli (Kagan and
Lewis 1965). Studies of carer–infant interactions reveal that mothers synchronize with, and
match, male infant behaviours to a greater extent than they do for the behaviours of female
infants (Malatesta and Haviland 1982; Tronick and Cohn 1989; Murray et al. 1993). This is
argued to reflect an attuned maternal responsiveness to the differences in the expressive and
regulatory behaviours of boys and girls in interpersonal interactions (Tronick and Weinberg 1997).
Malatesta and Haviland (1982) suggest that it reflects the increased responsive effort the
mother must make to reach the same level of successful interaction achieved with female infants,
the mother’s motivation being to achieve optimal success in communicative engagement and
mutual involvement.
Studies in which the mother’s communication is experimentally disrupted or perturbed and
made non-contingent to the infant’s efforts of communication, so that she appears and sounds
unresponsive (Tronick et al. 1978; Murray and Trevarthen 1985; Nadel et al. 1999) prove the
active role of the infant. The infant’s demonstrations of protest, avoidance and distress confirm
that the infant is both sensitive to the quality of communicative engagement and motivated to
achieve mutually attuned communicative involvement with the mother in a tightly regulated
joint performance (Figure 13.2). Thus, the cooperation of the primary carer in supporting and
encouraging the active participation of the communicatively pre-adapted infant, through sensi-
tive attunement to infant state and changes in expressive behaviour, is shown to be fundamental
to the achievement of successful communicative interactions and learning (Bruner 1983;
Papoušek and Papoušek 1997).
13.8 Communication and expressiveness of mothers with depression

It is clear that a depressive illness distorts a mother’s participation in communication with a
young infant. Maternal postnatal depression can transform both the style and quality of commu-
nicative behaviours of mothers in interactions with their infants and the infants’ developmental
prospects. Depressed mothers are withdrawn and unresponsive to the infant, or exhibit intrusive
and interfering, or hostile behaviours (Cohn et al. 1986; Field 1984; Murray et al. 1993; Murray
et al. 1996a). Infants of depressed mothers show similarly disturbed affect and dysregulated
behaviours in interactions (Cohn et al. 1986; Field et al. 1988; Field 1992, 1997; Tronick and
Weinberg 1997), and develop deficits on measures of emotional and cognitive functioning
(Cogill et al. 1986; Murray 1992; Murray et al. 1993; Sharp et al. 1995; Stanley et al. 2004).
A prospective longitudinal study by Murray and colleagues on the effects of maternal depres-
sion on mother–infant interactions and infant development, revealed that maternal mental state
had a profound impact on the quality of maternal engagement. The depressed mothers were less
sensitively focused on their infants’ experience and they made more responses that were rejecting
or affectively discordant with the infants’ behaviour (Murray et al. 1996a). A micro-analysis of
infant behaviour showed that, by 2 months, infants of depressed mothers responded rapidly to
momentary discordances in maternal responsiveness, their own behaviour becoming dysregu-
lated. Later, the infants of depressed mothers were found to be more likely to be insecurely
attached to their mothers and to have behavioural problems, and they performed less well on
object concept tests and the Bayley Scales of Mental Development at 18 months (Bayley 1969,
1993). Infant outcomes were poorer when the postpartum depressive episode was the mother’s
first depressive episode, and was more directly related to the birth of the infant (Murray 1992;
Murray et al. 1996a).
The effects of a mother’s emotional state are particularly evident in vocal communication
when the depressed mothers are compared with non-depressed mothers. A study by Murray
and colleagues found that infants of mothers who showed higher rates of infant-focused speech
in conversation at 2 months achieved higher scores on the Bayley scales at 18 months (Murray
et al. 1993), the highest rates of this speech style being in control group mothers of male infants.
In contrast, the lowest rates of infant-focused speech occurred in conversations of depressed
mothers of male infants, and these infants showed poorer outcome than female case infants on
measures of both emotional and cognitive development. It was thus demonstrated that the asso-
ciation between maternal depression, the sex of the infant and infant cognitive development was
mediated by the quality of early communication with the infant. Particularly damaging for the
well-being of the infant was a low level of infant-focused speech, which, it was argued, reflects the
mother’s underlying preoccupation with her own experience, and her difficulty in maintaining
involvement in her infant’s experience.
Murray suggested that the quality of interaction of mothers experiencing a first episode
of depression was poorer because the depression focused on the infant and the mother’s
attitude to the infant, whereas depression of mothers who had had a previous episode was more
[Figure 13.2. (a) Diagram of the double video apparatus: monitors showing the image of the mother and of the baby, cameras behind thin half-mirror glass, speakers carrying each partner’s voice, and videotape recorders VTR M and VTR B. (b) Photographs from the warm-up (50 seconds), live (1 minute) and replay periods, labelled with times in seconds.]

Fig. 13.2 (a) The double video method to test a young infant’s sensitivity to the contingent respon-
siveness of the mother in protoconversation and play. Infant and mother are in separate rooms in
which they see and hear images of each other via television. Images on the two monitors are
viewed by reflection on inclined glass plates through which full face recordings are made by video
cameras, not visible to the participants. Voices are projected from behind the images by concealed
loudspeakers. The expressions of mother and infant are recorded on separate videotape recorders
(VTR, marked M and B respectively). (b) An 8-week-old baby girl easily communicated with her mother
using the apparatus. Images are labelled with the time in seconds from the start of the communication.
After approximately 50 seconds they were interacting playfully, the baby smiling, cooing and laugh-
ing as the mother gently teased her with an affectionate voice. From this point, 1 minute of their
engagement was chosen for replay. The tape of the mother’s behaviour was rewound and replayed
1 minute later to the infant. The photographs in the bottom row show that the baby was watchful
in the replay, made some attempts to enter into a dialogue, but failed. She brought her hands to
her mouth and looked away. Occasionally she made fleeting smiles (as at 110 s), but her expres-
sions and vocalizations were indicative of distress. Comparison of the infant’s appearance at 78 sec-
onds in the live interchange with that at the 79-second point of the replay, when the mother was
enjoying a game, clearly shows her discomfort 20 seconds after the start of the replay.
generally focused. The quality of the maternal communicative interaction at 2 months was also
found to be predictive of cognitive outcome at 5 years (Murray et al. 1996b). Murray and Cooper
(1997) showed that depressed maternal behaviour in interactions was mainly determined not by
the infant’s behaviour, but by maternal mental state.
13.9 Effects of depression on expressiveness in adult–adult speech and adult–infant speech

The prosodic and paralinguistic characteristics of speech are affected by psychiatric disorder, and
vocal indicators of depression include changes in voice quality, speech rate, pitch range, pitch
level, intonation contour and intensity level, with particular characteristics of these features dis-
tinguishing different syndromes, and different phases in the manic depressive cycle (Scherer
1986). The speech of depressed adults is characterized by extremes of variation in speech rate
(Scherer 1979; Teasdale et al. 1980; Godfrey and Knight 1984; Hoffman et al. 1985), abnormal
voice quality (Ostwald 1965) and reduced pitch range and increased downward pitch contours. It
is described as flat, slow and monotonous (Ostwald 1961, 1965; Beck 1972). Notably, the acoustic
correlates of depressive speech are very similar to those found to be associated with the percep-
tion and simulation of sorrow or sadness in non-depressed populations. They include low pitch
and narrow pitch range, downward pitch contour, slow tempo (Scherer and Oshinsky 1977;
Williams and Stevens 1972; van Bezooijen 1984), soft voice (Scherer 1987), voicing irregularities
with occasional whisper (Williams and Stevens 1972) and ‘creak’ phonation (van Bezooijen
1984).
If these characteristics of intonation and voice quality in the speech of a depressive adult,
symptomatic of feelings of sorrow or sadness, are presented when a depressed mother talks to her
infant, not only will they express the mother’s underlying subjective state and communicative
motivation, but they are also likely to greatly distort the distinctive prosodic and paralinguistic
modifications of such functional importance in non-clinical mother–infant communication. The
modifications of prosody and affective expression typical of motherese or infant-directed speech
support the infant’s participation in dialogue, regulate infant affective engagement, and facilitate
concordant understanding, communicative effectiveness and the establishment of joint emo-
tional and conceptual referencing. We can assume that the changes in a depressed mother’s talk-
ing will have important implications for the development of joint linguistic reference, and later
interpersonal and cognitive functioning of her child.
Bettes (1988) measured temporal and intonational features in the interactions of mildly to
moderately depressed and non-depressed mothers with infants 3 to 4 months old. This revealed that
the depressed mothers were significantly slower in responding to infant vocalizations, had more
variable pause and utterance lengths, and showed less use of the prosodically exaggerated intona-
tion contours characteristic of infant-directed speech (Fernald and Simon 1984). Bettes sug-
gested that the increased response latency, and the variable length of pauses and utterances found
in the depressed mothers would interfere with the infant’s perception of contingency in maternal
vocalizations. Experimental disruption in contingent maternal responsiveness had been shown
by Murray and Trevarthen (1985) to significantly lower infant affect and attention. Bettes also
argued that the non-exaggerated intonation shapes would reduce the affective signals available to
the infant, and concluded that both temporal and intonational features in maternal interaction
are a mechanism by which maternal depression could influence infant developmental outcome.
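The temporal measures discussed here can be made concrete with a small sketch. It is not Bettes’s procedure: the interval format, the function name `timing_summary` and the example onset and offset times are all assumptions made for illustration. The sketch takes infant and maternal vocalizations as (onset, offset) pairs in seconds and reports the mean latency of the mother’s response to each infant vocalization, together with the coefficient of variation of her utterance and pause lengths.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (0.0 for an empty or zero-mean list)."""
    if not values or mean(values) == 0:
        return 0.0
    return pstdev(values) / mean(values)


def timing_summary(infant, mother):
    """Summarize maternal timing from (onset, offset) intervals in seconds.

    Both lists are assumed to be sorted by onset and non-overlapping.
    """
    # Latency: gap between the end of an infant vocalization and the
    # onset of the next maternal utterance that begins after it.
    latencies = []
    for _, infant_end in infant:
        following = [on for on, _ in mother if on >= infant_end]
        if following:
            latencies.append(following[0] - infant_end)

    utterances = [off - on for on, off in mother]
    # Pause: silence between consecutive maternal utterances.
    pauses = [mother[i + 1][0] - mother[i][1] for i in range(len(mother) - 1)]

    return {
        "mean_latency": mean(latencies) if latencies else None,
        "utterance_cv": coefficient_of_variation(utterances),
        "pause_cv": coefficient_of_variation(pauses),
    }


# Hypothetical data, invented for illustration only.
infant_vocalizations = [(0.0, 0.5), (3.0, 3.4)]
mother_utterances = [(1.0, 2.0), (4.0, 5.0), (6.0, 7.0)]
summary = timing_summary(infant_vocalizations, mother_utterances)
```

On measures of this kind, a longer mean latency and larger coefficients of variation would correspond to the slower and more variable timing reported for the depressed mothers.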
Bettes’s findings are corroborated by subsequent studies. A significant lessening in contingent
interpersonal responsiveness in the expressive timing of depressed mothers is also reported by
Zlochower and Cohn (1996), who compared clinically depressed and non-depressed mothers
interacting with their 4-month-old infants, and by Stanley et al. (2004). In this last study,
it was shown that the lack of contingency in depressed mothers’ interactions with their infants
was associated with impaired infant learning of contingent associations in general. With regard
to speech quality, Robb (1999), in a single case–control example, applied an acoustic analysis,
making spectrographic and pitch plots, to describe the pulse, quality and narrative (Malloch
1999) of the speech of a depressed and a well mother to their infants at 8 weeks and 6 months
postpartum. At 8 weeks, in contrast to the well mother, the speech of the mother who was
depressed was characterized by a slow pulse rate, long pauses, and stretched intonation contours
of low, descending, pitch; however, as the depressed mother began to recover, around 6 months,
her speech to her infant became more like that of the well mother (Figure 13.3). Notably, unlike
the vocalizations of the well mother’s infant, those of the infant whose mother was depressed
were, like his mother’s, also infrequent, short and of low pitch.
Further prosodic features in maternal expressiveness were investigated by Marwick
et al. (2008), who compared the speech of depressed and non-depressed mothers during face-to-
face conversational interactions with their 10-week-old infants. Maternal depression was found
to significantly affect the frequencies of certain types of intonation forms and voice quality, and
the patterns of sequencing of intonation contours. Depressed mothers were found to have more
falling intonation contours and a greater amount of sequential repetition of intonation forms
within their conversational interaction than the control mothers, and they also had more ‘creaky’
voice phonation. Creaky voice is a phonation type associated with sorrow and sadness. The
known perceptual receptiveness of the infant to changes of quality of voice would indicate a
special sensitivity to the lowered affect in the mother’s speech; moreover, the mother’s persever-
ant repetition in intonation can be understood to preclude an attuned focus on her infant’s
responsiveness, and it is likely that the mutual regulation of mood and involvement, which
is strongly mediated by prosodic contrast, will suffer (see Tronick 2005, on the regulation of
‘co-consciousness’, and Gratier and Apter-Danon, Chapter 14, this volume).
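Two of the measures named in this paragraph, the frequency of falling contours and the amount of sequential repetition of intonation forms, can be illustrated with a minimal sketch. It is not the coding scheme of Marwick et al.: the contour labels, the function name `contour_profile` and the example sequence are hypothetical, and a "repetition" is assumed here to mean a contour identical to the one immediately before it.

```python
def contour_profile(contours):
    """Return the share of falling contours and the sequential repetition rate."""
    if not contours:
        return {"falling_share": 0.0, "repetition_rate": 0.0}
    falling = sum(1 for c in contours if c == "fall")
    # A repetition is a contour identical to the one immediately before it.
    repeats = sum(1 for prev, cur in zip(contours, contours[1:]) if prev == cur)
    transitions = len(contours) - 1
    return {
        "falling_share": falling / len(contours),
        "repetition_rate": repeats / transitions if transitions else 0.0,
    }


# Hypothetical sequence of coded contours, invented for illustration only.
sequence = ["fall", "fall", "rise", "fall", "fall", "fall"]
profile = contour_profile(sequence)
```

A sequence dominated by one repeated falling form scores high on both measures, which is the pattern described above for the depressed mothers.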
A number of studies have related the style and quality of the communication of depressed
mothers to corresponding features of participation and attention in their infants. Robb (1999)
found that the expressiveness of the infant of a depressed mother was imitative of that of the
mother, resulting in a less reciprocal and coordinated interaction because both mother and infant
were unresponsive (Figure 13.3). It can be seen, therefore, that the process of interpersonal
imitation, or sympathy, in prosodic expressiveness, which in positive circumstances can join
participants in shared enjoyment, in other conditions can lead both participants into fragmented
connectivity and low affect.
The consequences for infant attention and learning of the lowered expressiveness of mothers
with depression are indicated in two studies. Stanley et al. (2004) found the degree to which
depressed mothers’ responses lacked contingency was predictive of poor infant associative
learning in a conditioning experiment. Similarly, Kaplan et al. (1999) found that experimentally
prescribed samples of child-directed speech segments taken from the speech of depressed moth-
ers in structured object–play interaction did not, when presented to infants of non-depressed
mothers within a conditioned attention experimental paradigm, constitute effective stimuli to
associative learning. By contrast, the child-directed speech collected from non-depressed moth-
ers did promote associative learning in the infant subjects of this study. Kaplan et al. found that
the fundamental frequency of the final portion of the selected speech segment of the mothers
with a greater number of depressive symptoms was significantly less modulated than that of the
other mothers; the authors suggested that a weak final pitch modulation may fail to increase the
infants’ state of arousal sufficiently to enable efficient or complete processing of, or attention to,
the information required. Correspondingly, they suggest that these results indicate that the child-
directed speech of depressed mothers may be less effective in directing the infant’s attention and
[Figure 13.3. Four pitch plots, each showing pitch in note names from C3 to the G above C5 against time from 0 to 3 seconds, with the mothers’ words printed beneath. (a) Non-depressed dyad, baby 8 weeks old. (b) Non-depressed dyad, baby 6 months old. (c) Depressed dyad, baby 8 weeks old. (d) Depressed dyad, now well, baby 6 months old.]
Fig. 13.3 Pitch plots of the voices of two mothers and their infants from recordings made by Lynne
Murray (Robb 1999). One mother (c) had postnatal depression when her baby was 8 weeks old.
The two mothers were again recorded when their babies were 6 months old and the mother who
had been depressed was fully recovered and happy with her infant (d). The well mother is speaking
playfully with her baby at both ages (a and b). Her voice is rhythmic and shows large excursions of
pitch, over more than an octave, with rising or lifting contours. Her pitch moves in the octave and a
half above Middle C (C4). When mother (c) is depressed, with her 8-week-old, her speech is nega-
tive, not rhythmic and shows large downward movement to below C4 (the shaded area). Her
infant’s vocalizations are also depressed. Four months later she has a lively, happy voice as she plays
a chasing game with her infant. Vocalizations of the infants are enclosed in rectangles. It is clear
that they tend to match the intonations and pitch levels of their mothers.
maintaining the infant’s intention or interest on the task during object–play contexts of interac-
tion. This could influence learning and later performance on cognitive tasks.
Kaplan et al. (2001, 2002) found that mothers diagnosed with depression produced less modu-
lation in fundamental frequency in comparison with well mothers, when compared on experi-
mentally isolated target speech samples. Robb’s (1999) data indicate that the depressed and well
mother both made wide excursions of fundamental frequency and psychometrically defined
pitch, but differed in the absolute pitch range and inclination of pitch modulations. Other studies
report that in stressful contexts, maternal pitch modulation can increase in depressed mothers.
Reissland et al. (2003) report, using a matched-pairs design, that in the context of being asked to
read a given story book to their young infants of less than 1 year old, mothers with self-reported
depression had significantly greater mean pitch height and greater pitch modulation than moth-
ers without depression. Furthermore, whereas mothers without depression had the same mean
pitch when both reading to their child and addressing their child, depressed mothers spoke with
a significantly higher voice when addressing their child compared with when reading the book.
Reissland et al. argue that changes in the depressed mothers’ pitch may reflect the stress they felt
when confronted with the task, in the same way as the anxiety associated with waiting for a doc-
tor’s visit was found by Breznitz and Sherman (1987) to significantly increase the amount of
speech used by depressed mothers of 3-year-olds; non-depressed mothers showed no such effect.
Breznitz and Sherman observed that the children of the depressed mothers made fewer vocaliza-
tions than children of mothers without depression, and argued that this reflected the depressed
mothers’ manner of communication. The effect of the anxiety-provoking situation on the vocal
expressiveness of mothers in these studies seemed to impair their capacity to share a focus of
interest and attune to feelings of their child.
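Comparisons of pitch level and pitch range such as these depend on converting fundamental frequency into a musical or logarithmic scale, as in the note names used in Figure 13.3. The sketch below shows one common way of doing so. It is not taken from any of the studies cited: it assumes equal temperament with A4 = 440 Hz, writes Middle C as C4, treats zero values as unvoiced frames, and uses invented frequency values.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def hz_to_midi(f0_hz):
    """Convert frequency in Hz to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69 + 12 * math.log2(f0_hz / 440.0)


def note_name(f0_hz):
    """Nearest equal-tempered note name, with Middle C written as C4."""
    midi = round(hz_to_midi(f0_hz))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def pitch_range_semitones(f0_track):
    """Distance in semitones between the lowest and highest voiced f0 values."""
    voiced = [f for f in f0_track if f > 0]  # 0 marks unvoiced frames (assumption)
    return 12 * math.log2(max(voiced) / min(voiced))


# Hypothetical f0 values in Hz, invented for illustration only.
track = [0.0, 220.0, 261.63, 330.0, 440.0, 0.0]
lowest_note = note_name(min(f for f in track if f > 0))
excursion = pitch_range_semitones(track)
```

Expressing range in semitones rather than Hz allows the size of a pitch excursion to be compared between voices with different mean pitch, while the note names locate where in the register the excursion lies.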
13.10 Conclusions, and possible long-term consequences of loss of ‘musicality’ in the voice of a depressed mother

Studies of the vocal expressiveness of mothers with depression show distinctive abnormalities in
contingent timing, pacing, voice quality, loudness, intonation contour and repetition of intona-
tion contour. These changes transform the specialized vocal expressiveness used intuitively by
non-depressed adults in speech addressed to infants, and can be expected to affect, perhaps in a
lasting way, the quality of interpersonal engagement between mother and child. It is argued that
the distinctive characteristics of depressed mothers’ intonation and voice quality within early
interpersonal interactions may be contributing factors, or mediators, of known effects of mater-
nal depression upon the developing child.
The changes in the mother’s speech associated with depression directly reflect significant dif-
ferences in maternal communicative motivation, subjective state, interpersonal awareness and
communicative purpose. They must profoundly affect the mother’s conscious involvement with,
and attunement to, the subjective and interpersonal state and affective engagement of the infant.
Changes in the structure and patterning of the mother’s communication with her infant will
determine opportunities for active mutual interpersonal attunement. How well joint affective
and conceptual understanding is established depends on mutual regulation of affect state and
conceptual focus. It requires interpersonally aligned intentions that will lead to concordant effec-
tiveness in communication. Characteristics of depressed mothers’ vocal rhythms, intonation and
voice quality when talking to their young infants during the early phase of more serious postnatal
depression manifestly do influence the infant’s immediate intersubjective experience and com-
municative motivations. Those characteristics can also be expected to significantly affect the
interpersonal, cognitive and linguistic development of the infant; delineating the nature of these
effects is an important research topic.
To date, as summarized above, studies have generally focused on the consequences of distur-
bances in depressed mothers’ communication for infant and child cognitive development and
learning. Murray (Murray et al. 1993, 1996b) has proposed a number of mechanisms whereby
cognitive deficits may be brought about in infants of depressed mothers. First, consistent with
evidence from normal samples (e.g., Lewis and Goldberg 1969; Dunham et al. 1989), a number of
studies have shown the contingency of the mother’s response to the infant, an element often lack-
ing in depression, to be particularly important (Murray et al. 1993; Zlochower and Cohn 1996;
Stanley et al. 2004). This can be presumed to be because a sense of consistent links between the
infant’s own behaviour and maternal responses, fundamental to perceiving associations between
self-generated activity and events in the world in general, and embedded in learning, fails to be
fostered. Second, in normal interactions, a number of maternal speech characteristics serve to
sustain infant attention, the capacity for which is also a basic component of good cognitive
functioning. These include the way in which maternal responses are modulated over time, with a
degree of repetition, small changes consistent with fluctuations in infant state (Brazelton
et al. 1974), and particular intonational features, such as the tendency to show rising, or
rising–falling, intonation contours (Stern et al. 1982). Where these speech qualities are lacking, as
they are more likely to be where the mother is depressed, it may thus be to the detriment of the
infant’s developing attention span. As Tronick (2005) and Stern (1985/2000) have described,
maternal responses to the infant in early interactions not only mirror back the infant’s own
expressions and gestures, but elaborate on them, thus providing the infant with an enriched ver-
sion of their original experience. Where the mother is unable to respond to her infant because of
her own preoccupations, the infant will lack such enrichment.
With regard to a child’s emotional development, rather different processes are likely to operate.
Here, a mother’s support for the development of her infant’s self-regulation of emotions is likely to
be of particular importance. Tronick and Weinberg (1997) have described, for example, how
depressed mothers may have difficulty in appropriately supporting their infant to recover from the
normal moments of mismatched communication that are typical of communication between well
mothers and their infants; instead, the depressed mother may either act in an intrusive, overbearing,
way, causing the infant to become further distressed, or may fail to respond at all, leaving the infant
unassisted. The longitudinal study of Murray and colleagues found that expressed hostility towards
an infant in the interactions of depressed mothers in the first few months was predictive of emotion-
ally dysregulated infant behaviour later in the first year; this, in turn, provoked further maternal
intrusive and hostile contacts, thus perpetuating a vicious cycle of contact that culminated in raised
levels of conduct disorder and hyperactivity at ages 5 and 8 years (Morrell and Murray 2003). In this
same study, early, 2-month, maternal-expressed hostility was predictive of the 5-year-old children of
depressed mothers themselves showing depressive cognitions (Murray et al. 2001), suggesting that
the effects of this facet of maternal disturbance may be quite pervasive.
Since depression itself is not normally manifest before the teenage years, there has been little
research into the role of early mother–infant interactions in the development of depressive
episodes in the offspring. Nevertheless, it can be hypothesized that consistent exposure in the first
few months postpartum to those parameters of vocal quality that signal sadness may play an
important role. Thus, just as infants become rapidly sensitized to the distinctive features of
speech that signal their parents’ own particular language, so they may become sensitized by early
exposure to particular emotions carried in the voice. This proposition is consistent with a num-
ber of studies showing that children of depressed mothers, and particularly girls, are highly
attuned to others’ distress (Radke-Yarrow et al. 1994; Murray et al. 2006); nevertheless, further
research is needed to explore these possible associations.
13.11 Implications for clinical practice

Although good tools to screen for postnatal depression are available to health-care professionals
(e.g., the Edinburgh Postnatal Depression Scale, or EPDS (Cox et al. 1987)), many episodes of
postnatal depression go undetected and untreated (Murray et al. 2004). This
may be partly because women are reluctant to confide how they are feeling to the relevant
health-care professionals, and may try to put on a ‘brave face’. This is particularly unfortunate since
evidence suggests that vulnerable women who are unwilling to engage in support services are more
likely to have poor outcomes for themselves and their infants than similarly vulnerable women
who do engage (Murray et al. 2003). Thus, increasing professional awareness of the effects of
depression on speech qualities may enable the illness to be better identified and treated. This is an
important area for future clinical research.
References
Bayley N (1969). The Bayley scales of infant development. The Psychological Corporation, New York.
Bayley N (1993). The Bayley scales of infant development, 2nd edn. The Psychological Corporation, San
Antonio, TX.
Beck AT (1972). Depression: Causes and treatment. University of Pennsylvania Press, Philadelphia, PA.
Beebe B, Gerstman L, Carson B, Dolins M, Zigman A, Rosenweig H, Faughey K and Korman M (1979).
Rhythmic communication in the mother–infant dyad. In M Davis, ed., Interaction rhythms,
pp. 79–100. Human Sciences Press, New York.
Bettes BA (1988). Maternal Depression and motherese: Temporal and intonational features. Child
Blass E (1987). What babies know, and noises parents make. Science, 237, 726.
Blount B and Padgug E (1977). Prosodic, paralinguistic and interactional features in parent–child speech:
English and Spanish. Journal of Child Language, 4, 67–86.
Bolinger D (1958). Intonation and grammar. Language and Learning, 8, 31–37.
Bolinger D (1964). Around the edge of language: Intonation. Harvard Educational Review, 34, 282–293.
Reprinted in Bolinger D, ed., Intonation (1972). Penguin, Harmondsworth.
Brazelton TB, Koslowski B and Main M (1974). The origins of reciprocity: The early mother–infant
interaction. In M Lewis and LA Rosenblum, eds, The effect of the infant on its caretaker,
pp. 49–76. John Wiley, New York and London.
Brazil D (1975). Discourse intonation. Discourse analysis monographs, 1. English Language Research,
University of Birmingham.
Breznitz N and Sherman T (1987). Speech patterning of natural discourse of well and depressed mothers
and their young children. Child Development, 58, 395–400.
Britain D and Newman J (1992). High rising terminals in New Zealand English. Journal of the
International Phonetic Association, 22, 1–11.
Brown G (1977). Listening to spoken English. Longman Group Ltd, London.
Brown G, Currie KL and Kenworthy J (1980). Questions of intonation. Croom Helm, London.
Bruner J (1983). Child’s talk: Learning to use language. Norton, New York.
Caine J (1991). The effects of music on the selected stress behaviours, weight, caloric and formula intake,
and length of hospital stay of premature and low birth weight neonates in a newborn intensive care
unit. Journal of Music Therapy, 28(4), 180–192.
Carter A (1978). The development of systematic vocalisations prior to words: A case study. In N Waterson
and CE Snow, eds, Development of communication, pp. 127–138. Wiley, Chichester.
Chang H and Trehub S (1977). Auditory processing of relational information by young infants. Journal of
Cogill SR, Caplan HL, Alexandra H, Robson KM and Kumar R (1986). Impact of maternal postnatal
depression on cognitive development in young children. British Medical Journal, 292, 1165–1167.
Cohn JF, Matias R, Tronick EZ, Connell D and Lyons-Ruth K (1986). Face-to-face interactions of
depressed mothers and their infants. In EZ Tronick and T Field, eds, Maternal depression and infant
disturbance, pp. 31–45. Jossey-Bass, San Francisco, CA.
Condon WS and Sander LW (1974). Neonate movement is synchronized with adult speech: Interactional
Cooper R and Aslin R (1990). Preference for infant-directed speech in the first month after birth. Child
Coulthard M (1977). An introduction to discourse analysis. Longman, London.
Cox JL, Holden JM and Sagovsky R (1987). Detection of post-natal depression: Development of the
10-item Edinburgh Post-natal Depression Scale. British Journal of Psychiatry, 150, 782–786.
Crystal D (1975). The English tone of voice. Edward Arnold, London.
Crystal D (1979). Prosodic development. In P Fletcher and M Garman, eds, Language acquisition,
Daw H (1977). The perception of linguistic stress. Unpublished Masters Thesis, University of Edinburgh.
De Boysson-Bardies B, Sagart L and Durand C (1984). Discernible differences in the babbling of infants
according to target language. Journal of Child Language, 11(1), 1–15.
DeCasper AJ and Fifer WP (1980). Of human bonding: Newborns prefer their mother’s voices. Science,
208, 1174–1176.
Demany L, McKenzie B and Vurpillot E (1977). Rhythm perception in early infancy. Nature,
266, 718–719.
D’Odorico L (1984). Non-segmental features in pre-linguistic communication: An analysis of some types
of infant cry and non-cry vocalisations. Journal of Child Language, 11(1), 17–27.
Dore J (1975). Holophrases, speech acts and language universals. Journal of Child Language, 2, 21–40.
Dore J, Franklin MB, Miller RT and Ramer ALH (1976). Transitional phenomena in early language
acquisition. Journal of Child Language, 3, 13–28.
Dunham P, Dunham F, Hurshman A and Alexander T (1989). Social contingency effects on subsequent
perceptual cognitive tasks in young infants. Child Development, 60, 1486–1496.
Eisenberg RB (1976). Auditory competence in early life: The roots of communicative behaviour. University
Park Press, Baltimore, MD.
Emde RN, Campos J, Reich J and Gaensbauer TJ (1978). Infant smiling at five and nine months: Analysis
of heart rate and movement. Infant Behavior and Development, 1, 26–35.
Ferguson CA (1964). Baby talk in six languages. American Anthropologist, 66, 103–14.
Fernald A (1985). Four-month-old infants prefer to listen to motherese. Infant Behaviour and Development,
8, 181–95.
Fernald A and Kuhl P (1987). Acoustic determinants of infant preference for motherese speech.
Infant Behaviour and Development, 10, 279–293.
Fernald A, Taeschner T, Dunn J, Papoušek M, de Boysson-Bardies B and Fukui I (1989). A cross-language
language, 16, 477–501.
Field T (1984). Early interaction between infants and their postpartum depressed mothers. Infant
Behaviour and Development, 7, 517–522.
Field T (1992). Infants of depressed mothers. Development and Psychopathology, 4, 49–66.
Field T (1997). The treatment of depressed mothers and their infants. In L Murray and PJ Cooper, eds,
Postpartum depression and child development, pp. 221–236. Guilford Press, New York.
Field T, Healy B, Goldstein S, Perry S, Bendell D, Schanberg S, Zimmerman EA and Kuhn C (1988).
Infants of depressed mothers show ‘depressed’ behaviour even with non-depressed adults. Child
Fry DB (1958). Experiments in the perceptions of stress. Language and Speech, 1, 126–151.
Godfrey HPD and Knight RG (1984). The validity of actomotor and speech activity measures in the
assessment of depressed patients. British Journal of Psychiatry, 145, 159–163.
Grieser DL and Kuhl P (1988). Maternal speech to infants in a tonal language: Support for universal
Halliday MAK (1967). Intonation and grammar in British English. Mouton, The Hague.
Halliday MAK (1970). A course in spoken English: Intonation. Oxford University Press, London.
Halliday MAK (1975). Learning how to mean: Explorations in the development of language.
Edward Arnold, London.
Hoffman GMA, Ganze JC, and Mendlewicz J (1985). Speech pause time as a method for the evaluation of
psychomotor retardation in depressive illness. British Journal of Psychiatry, 146, 535–538.
Hutt SC, Hutt C, Lenard HG, Bernuth HV and Muntjewerff WJ (1968). Auditory responsitivity in the
human neonate. Nature, 218, 888–890.
Kagan J (1970). The determinants of attention in the infant. American Scientist, 58(3), 298–306.
Kagan J and Lewis M (1965). Studies of attention in the human infant. Merrill-Palmer Quarterly,
11, 95–127.
Kaplan EL (1969). The role of intonation in acquisition of language. Unpublished Doctoral dissertation,
Cornell University.
Kaplan EL and Kaplan GA (1970). The pre-linguistic child. In J Elliot, ed., Human development and
cognitive processes, pp. 359–381. Holt, Rinehart and Winston, New York.
Kaplan PS, Bachorowski J, Smoski MJ and Hudenko WJ (2002). Infants of depressed mothers, although
competent learners, fail to learn in response to their own mothers’ infant-directed speech. Psychological
Science, 13(3), 268–271.
Kaplan PS, Bachorowski J and Zarlengo-Strouse P (1999). Child-directed speech produced by mothers
with symptoms of depression fails to promote associative learning in 4-month-old infants. Child
Kaplan PS, Bachorowski J, Smoski MJ and Zinser M (2001). Role of clinical diagnosis and medication use
in effects of maternal depression on infant-directed speech. Infancy, 2(4), 537–548.
Kearsley R (1973). The newborn’s response to auditory stimulation: A demonstration of orienting and
defensive behaviour. Child Development, 44, 582–590.
Kessen W, Levine J and Wendrich KA (1979). The imitation of pitch by infants. Infant Behaviour and
In J Nadel and G Butterworth, eds, Imitation in infancy, pp. 36–59. Cambridge University Press,
Cambridge.
Kuhl PK and Meltzoff AN (1982). The bimodal perception of speech in infancy. Science, 218, 1138–1141.
Lenneberg EH (1967). Biological foundations of language. Wiley, New York.
Lewis M and Goldberg S (1969). Perceptual–cognitive development in infancy: A generalized expectancy
model as a function of the mother–infant interaction. Merrill-Palmer Quarterly, 15, 307–316.
Lieberman P (1967). Intonation, perception and language. MIT Press, Cambridge, MA.
Lindsey G (1981). Intonation and pragmatics. Journal of the International Phonetic Association,
11(1), 2–21.
Lyons J (1977). Semantics. Cambridge University Press, Cambridge.
MacWhinney B and Bates E (1978). Sentential devices for conveying giveness and newness: A cross-
cultural developmental study. Journal of Verbal Learning and Verbal Behaviour, 17, 539–558.
Malatesta CZ and Haviland JM (1982). Learning display rules: The socialization of emotion expression in
infants. Child Development, 53, 991–1003.
1999–2000), 29–57.
Analysing pitch, timing, loudness and voice quality in mother/infant communication. Proceedings of the
Institute of Acoustics, 19(5), 495–500.
Marwick H (1987). The intonation of mothers and children in early speech. Doctoral Thesis, University of
Edinburgh.
Marwick H, McKenzie J, Laver J and Trevarthen C (1984). Voice quality as an expressive system in
mother-to-infant communication: a case study. Work in Progress, 17, Department of Linguistics,
University of Edinburgh.
Marwick H, Martins C and Murray L (submitted 2008). Altered characteristics of intonation and voice
quality in maternal depressive state interpersonal expression.
Mehler J, Bertoncini J, Barriere M and Jassik-Gerschenfeld D (1978). Infant recognition of mother’s
voice. Perception, 7, 491–497.
Mehler J and Bertoncini J (1979). Infants’ perception of speech and other acoustic stimuli. In J Morton
and JC Marshall, eds, Psycholinguistics Series – 2, pp. 67–105. Elek Science, London.
Menn L (1976). Pattern, control and contrast in beginning speech: A case study in the development of word
form and word function. Ph.D. Thesis, University of Illinois.
Menyuk P (1971). The acquisition and development of language. Prentice-Hall, New Jersey.
Michel P (1973). The optimum development of musical abilities in the infant years of life. Psychology of
Music, 1(2), 14–20.
Morrell J and Murray L (2003). Postnatal depression and the development of conduct disorder and
hyperactive symptoms in childhood: A prospective longitudinal study from 2 months to 8 years.
Journal of Child Psychology and Psychiatry, 44(4), 489–508.
Morse P (1972). The discrimination of speech and non-speech stimuli in early infancy. Journal of
Murray L (1992). The impact of post-natal depression on infant development. Journal of Child Psychology
and Psychiatry, 33, 543–561.
Murray L and Cooper PJ (eds) (1997). Postpartum depression and child development. Guilford, New York.
and their mothers. In TM Field and NA Fox, eds, Social perception in infants, pp. 177–97.
Ablex, Norwood, NJ.
Murray L, Halligan SL, Adams GC, Patterson P and Goodyer I (2006). Socioemotional development in
adolescents at risk for depression: the role of maternal depression and attachment style. Development
and Psychopathology, 18, 489–516.
Murray L, Hipwell A, Hooper R, Stein A and Cooper PJ (1996b). The cognitive development of five-year-
old children of postnatally depressed mothers. Journal of Child Psychology and Psychiatry, 37, 927–935.
Murray L, Kempton C, Woolgar M and Hooper R (1993). Depressed mothers’ speech to their infants and
its relation to infant gender and cognitive development. Journal of Child Psychology and Psychiatry,
34(7), 1083–1101.
Murray L, Stanley C, Hooper R, King F and Fiori-Cowley A (1996a). The role of infant factors in
postnatal depression and mother–infant interactions. Developmental Medicine and Child Neurology,
38(2), 109–119.
Murray L, Woolgar M and Cooper PJ (2004). Detection and treatment of postpartum depression in
primary care. Community Practitioner, 77(1), 13–17.
Murray L, Woolgar M, Cooper PJ and Hipwell A (2001). Cognitive vulnerability in five-year-old children
of depressed mothers. Journal of Child Psychology and Psychiatry, 42(7), 891–899.
Murray L, Woolgar M, Murray J and Cooper PJ (2003). Self-exclusion from health care in women at high
risk for postpartum depression. Journal of Public Health Medicine, 25(2), 131–137.
Nadel J, Carchon I, Kervella C, Marcelli D and Reserblat-Plantey D (1999). Expectancies for social
O’Connor JD and Arnold GF (1973). Intonation of colloquial English, 2nd edn. Longman, London.
Oller DK (1980). The emergence of the sounds of speech in infancy. In G Yeni-Komshian, G Kavanagh and
C Ferguson, eds, Child phonology, perception and production, pp. 93–112. Academic Press, New York.
Oller DK and Eilers RE (1982). Similarities of babbling in Spanish and English learning babies. Journal of
Child Language, 9, 565–577.
Oller DK, Wieman LA, Doyle WJ and Ross C (1976). Infant babbling and speech. Journal of Child
Language, 3, 1–11.
Ostwald PF (1961). Sounds of emotional disturbance. Archives of General Psychiatry, 5, 587–592.
Ostwald PF (1965). Acoustic methods in psychiatry. Scientific American, 212(3), 82–91.
Papoušek H and Papoušek M (1989). Forms and functions of vocal matching in interactions between
mothers and their precanonical infants. First Language, 9, 137–158.
Papoušek H and Papoušek M (1997). Fragile aspects of early social interaction. In L Murray and
PJ Cooper, eds, Postpartum depression and child development, pp. 35–53. Guilford Press, New York.
Papoušek H, Papoušek M and Koester LS (1986). Sharing emotionality and sharing knowledge:
A microanalytic approach to parent–infant communication. In CE Izard and PB Read, eds, Measuring
emotions in infants and children, vol 2, Cambridge University Press, New York.
Papoušek M and Papoušek H (1981). Musical elements in the infant’s vocalisation: Their significance for
Research, 1, 163–224.
Papoušek M, Papoušek H and Bornstein M (1985). The naturalistic vocal environment of young infants:
on the significance of homogeneity and variability in parental speech. In TM Field and N Fox, eds,
Radke-Yarrow M, Zahn-Waxler C, Richardson D, Susman A and Martinez P (1994). Caring behaviour in
children of clinically depressed and well mothers. Child Development, 65, 1405–1414.
Reissland N, Sheperd J and Herrera E (2003). The pitch of maternal voice: A comparison of mothers
suffering from depressed mood and non-depressed mothers reading books to their infants. Journal of
Child Psychology and Psychiatry, 44(2), 255–261.
Ries NLL (1982). An analysis of the characteristics of infant-child singing expressions. Dissertation
Abstracts International (University Microfilms No. AAT-8223568).
Robb L (1999). Emotional musicality in mother–infant vocal affect, and an acoustic study of postnatal
Sachs J and Devin J (1976). Young children’s use of age-appropriate speech styles in social interaction and
role playing. Journal of Child Language, 3, 221–245.
Scherer K (1979). Personality markers in speech. In K Scherer and H Giles, eds, Social markers in speech.
Scherer KR (1986). Vocal affect expression: A review and a model for future research. Psychological
Bulletin, 99(2), 143–165.
Scherer KR (1987). Vocal assessment of affective disorders. In JD Maser, ed., Depression and expressive
behavior, pp. 57–82. Erlbaum, Hillsdale, NJ.
Scherer KR and Oshinsky JS (1977). Cue utilisation in emotion attribution from auditory stimuli.
Motivation and Emotion, 1, 333–346.
Searle JR (1969). Speech acts: An essay in the philosophy of language. Cambridge University Press,
Cambridge.
Searle JR (1976). A classification of speech acts. Language in Society, 5, 1–23.
Sharp D, Hay D, Pawlby S, Schmucher G, Allen H and Kumar R (1995). The impact of postnatal
depression on boys’ intellectual development. Journal of Child Psychology and Psychiatry,
36, 1315–1336.
Spring DR and Dale PS (1977). Discrimination of linguistic stress in early infancy. Journal of Speech and
Hearing Research, 20, 224–232.
Sroufe LA and Waters E (1976). The ontogenesis of smiling and laughter: A perspective on the
organization of development in infancy. Psychological Review, 83, 173–189.
Standley JM (1998). Pre and perinatal growth and development: Implications of music benefits for
premature infants. International Journal of Music Education, 31, 1–13.
Stanley C, Murray L and Stein A (2004). The effect of postnatal depression on mother–infant interaction,
infant response to the still-face perturbation and performance on an instrumental learning task.
Development and Psychopathology, 16, 1–18.
Stark R (1979). Prespeech segmental feature development. In P Fletcher and M Garman, eds, Language
acquisition, pp. 149–173. Cambridge University Press, Cambridge.
Stark RE, Rose SN and McLagen M (1975). Features of infant sounds: The first eight weeks of life. Journal
of Child Language, 2, 205–221.
Stern DN (1985/2000). The interpersonal world of the infant: a view from psychoanalysis and developmental
psychology, 2nd edn, with new Introduction. Basic Books, New York.
Stern D (1990). Diary of a baby. Basic Books, New York.
Stern D, Beebe B, Jaffe J and Bennett S (1977). The infant’s stimulus world during social interaction:
a study of caregiver behaviours with particular reference to repetition and timing. In HR Schaffer, ed.,
Studies in mother–infant interaction, pp. 177–202, Academic Press, New York.
Stern D, Spieker S and Mackain K (1982). Intonation contours as signals in maternal speech to
prelinguistic infants. Developmental Psychology, 18, 727–735.
Stern D, Spieker S, Barnett RK and Mackain K (1983). The prosody of maternal speech: Infant age and
context-related changes. Journal of Child Language, 10, 1–15.
Stratton P and Connolly K (1973). Discrimination by newborns of the intensity, frequency and temporal
characteristics of auditory stimuli. British Journal of Psychology, 64, 219–232.
Summers EK (1984). The categorization and conservation of melody in infants (concept, reinforcement,
music). Dissertation Abstracts International (University Microfilms No. AAT-8501103).
Teasdale JD, Fogarty SJ and Williams JM (1980). Speech rate as a measure of short-term variation in
depression. British Journal of Social and Clinical Psychology, 19, 271–278.
by their parents. In MA Berkley and WC Stebbins, eds, Comparative perception; vol. 1, Mechanisms.
Trehub SE, Unyk AM and Trainor LJ (1993). Maternal singing in cross-cultural perspective. Infant
Trevarthen C (1979). Communication and cooperation in early infancy: A description of primary
intersubjectivity. In M Bullowa, ed., Before speech: the beginnings of human communication,
culture. In G Jahoda and I M Lewis, eds, Acquiring culture: Ethnographic perspectives on cognitive
Trevarthen C (1999). Musicality and the intrinsic motive pulse: evidence from human psychobiology and
infant communication. Musicae Scientiae, Special Issue, 155–215.
Trevarthen C (2005). First things first: Infants make good use of the sympathetic rhythm of imitation,
without reason or language. Journal of Child Psychotherapy, 31(1), 91–113.
Trevarthen C and Aitken K (2001). Infant intersubjectivity: Research, theory, and clinical applications.
Annual Research Review. The Journal of Child Psychology and Psychiatry and Allied Disciplines,
42(1), 3–48.
London.
Trevarthen C and Marwick H (1986). Signs of motivation for speech in infants, and the nature of a
mother’s support for development of language. In B Lindblom and R Zetterstrom, eds, Precursors of
early speech, pp. 279–308. Macmillan, Basingstoke.
Tronick EZ (2005). Why is connection with others so critical? The formation of dyadic states of
consciousness: coherence governed selection and the co-creation of meaning out of messy meaning
making. In J Nadel and D Muir, eds, Emotional development, pp. 293–315. Oxford University Press,
Oxford.
Tronick EZ and Cohn JF (1989). Infant–mother face-to-face interaction: Age and gender differences in
coordination and occurrence of miscoordination. Child Development, 60, 85–92.
Tronick EZ and Weinberg MK (1997). Depressed mothers and infants: Failure to form dyadic states of
consciousness. In L Murray and PJ Cooper, eds, Postpartum depression and child development,
pp. 54–81. Guilford Press, New York.
between contradictory messages in face-to-face interaction. Journal of the American Academy of Child
van Bezooijen R (1984). The characteristics and recognizability of vocal expression of emotions. Foris,
Dordrecht.
von Raffler Engel W (1973). The development from sound to phoneme in child language. In C Ferguson
and D Slobin, eds, Studies of child language development, pp. 9–12. Holt, Rinehart and Winston,
New York.
Webster RL, Steinhardt MH and Senter MG (1972). Changes in infants’ vocalisations as a function of
differential acoustic stimulisation. Developmental Psychology, 7, 39–43.
Weeks TE (1978). Intonation as an early marker of meaning. Presented at the First International
Congress for the study of Child Language, Tokyo.
Weinberg MK and Tronick EZ (1996). Infant affective reactions to the resumption of maternal interaction
after the still-face. Child Development, 67, 905–914.
Wendrich KA (1981). Pitch imitation in infancy and early childhood: Observations and implications.
Doctoral dissertation, University of Connecticut.
Werker JF and McLeod PJ (1989). Infant preference for both male and female infant-directed talk:
A developmental study of attentional affective responsiveness. Canadian Journal of Psychology,
43, 230–246.
Wieman LA (1976). Stress patterns of early child language. Journal of Child Language, 3, 283–286.
Williams CE and Stevens KN (1972). Emotions and speech: Some acoustic correlates. Journal of the
Acoustical Society of America, 52, 1238–1250.
Zlochower AJ and Cohn JF (1996). Vocal timing in face-to-face interaction of clinically depressed
and non-depressed mothers and their 4-month-old infants. Infant Behaviour and Development,
19, 371–374.
Chapter 14
The improvised musicality of belonging:

Repetition and variation in
mother–infant vocal interaction
Maya Gratier and Gisèle Apter-Danon
One launches forth, hazards an improvisation. But to improvise is to

join with the World, or meld with it. One ventures forth from home on the
thread of a tune. Along sonorous, gestural, motor lines that mark the
customary path of a child and graft themselves onto or begin to bud
‘lines of drift’ with different loops, knots, speeds, movements, gestures,
and sonorities.
Deleuze and Guattari (1987, pp. 311–312)
14.1 Introduction
In their spontaneous interactions, mothers and infants build repertoires of communicative
motifs based on repetition and variation of expressive units that carry meaning. This shared
repertoire grows as it is performed, and brings a sense of pride in mutual understanding and a
sense of belonging. We offer a description of the ‘feeling of belonging’ between mothers and
infants as a subtle and dynamic balance between sameness and novelty, between well-known tra-
jectories and adventurous detours. This is at the essence of communicative musicality.
In our first study, we find evidence that mothers who have lost their sense of place and confi-
dence as a result of unhappy emigration experiences have difficulty in sustaining lively and excit-
ing vocal exchange with their babies. In our second study, we present findings and a case study
from a growing database of audio and video recordings of mothers diagnosed with borderline
personality disorders interacting with their 3-month-old infants. These contrasting investiga-
tions show that when a mother has a confused perception of herself, which may or may not be
considered pathological, her vocal interaction with her baby loses vitality, becoming more rigid
and repetitive. In a sense, it loses its temporal ‘flow’: mother and infant no longer seem able to
share ‘inner time’, neither to consolidate their relationship nor to develop new pathways for
shared experiences. This chapter has two main aims. One is to support the view that temporally
coordinated expression (coordinated rhythm, prosody and interactive dynamics) forms the basis
of a spontaneous communicative musicality in the first months of life, in part by showing how it
can be disrupted. The other is to offer perspectives on the natural roots of musicality in individu-
als and in communities.
14.2 Performing a sense of belonging through shared musicality

14.2.1 Research findings
The first author conducted a study of 60 mother–infant dyads from France, India and the United
States, observed in their homes during spontaneous face-to-face interaction (Gratier 2001, 2003).
Thirty mothers living in the countries of their birth (10 from France, 10 from India and 10 from
the US) were compared with 30 who had recently emigrated from India to the US. The infants
were all first-born and between 2 and 5 months old. The aim of the study was twofold. First, by
comparing the spontaneous vocal interactions of French, Indian and American dyads living in
their own cultures, we wanted to show that vocal exchanges present both cross-cultural similari-
ties and cultural specificities. Using measurements of vocal events made with the aid of spectro-
graphs and pitch plots, we found that in the three cultural contexts, mothers and infants produce
narrative-like episodes of vocal exchange lasting between 12 and 30 seconds, made up of shorter
phrase-length segments of 2 to 6 seconds, in turn organized around recurrent units lasting about
one second (Figure 14.1). These findings are consistent with the literature, presented later in the
chapter, and suggest that communicative musicality, which we take as the creative interpersonal
coordination of expression in time generated by the brain and active body, constitutes a crucial
basis for intersubjective experience (Malloch 1999, 2005; Trevarthen 1999).
Based on the work of Malloch (1999), we define the temporal coordination between mothers
and infants in terms of ‘pulse’, ‘phrase’ and ‘narrative episode’. Pulse refers to the stable recurrence
of an implicit interval between vocalization onsets. Phrases are sequences of vocalization and
pause, by mother, infant or both, which are bounded by longer pauses. Narrative episodes are
longer cycles of shared excitement that have a beginning, a development and an end (for further
details on these definitions and the way the analysis was performed, see Gratier 2003). As shown
in Table 14.1, the mean durations of these three hierarchical temporal units are almost the same
for the four groups. This supports the hypothesis of an early, universal motivation to organize
expression in time in order to share experience.

[Figure 14.1: pitch plot. Vertical axis: Pitch (Hz), 0 to 700. Horizontal axis: Time (s), 0 to 19.393.
Markings: narrative episode, phrase boundaries, pulse markers, infant vocalization. Utterance labels
include ‘Hey Lucy girl’, ‘Are you having a good day?’, ‘Talk to me’, ‘Ba ba ba ba::’, ‘Oh Lucy’ and ‘Yes girl’.]
Fig. 14.1 Spectrograph, generated using the software Praat (Boersma and Weenink 2000), showing
‘pulse’ (here with a 1-second pulse interval), ‘phrase’ and ‘narrative episode’ for 20 seconds of a
vocal exchange between a Californian mother and her 2-month-old (Gratier 2003).

Table 14.1 Group similarities in three hierarchical temporal units derived from acoustic analysis of
mother–infant vocal interaction
Durations               French dyads   Indian dyads   American dyads   Immigrant dyads
                        Mean (SD)      Mean (SD)      Mean (SD)        Mean (SD)
Pulse (ms)              902.8 (130)    805.8 (154)    903.9 (117)      870.1 (163)
Phrase (ms)             3130.4 (237)   3017.6 (291)   2991.8 (250)     2989.1 (168)
Narrative episode (s)   23.7 (3)       24.8 (4)       23.6 (5)         24.1 (4)
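The definitions of ‘pulse’ and ‘phrase’ given above can be made concrete with a small computational sketch. The Python fragment below is an illustration only and is not the procedure of Gratier (2003): the pause threshold of 0.8 seconds, the function names and the example timings are all assumptions introduced here. It takes vocalizations as (onset, offset) pairs in seconds, estimates a pulse as the median interval between successive onsets, and groups vocalizations into phrases wherever the silent gap exceeds the assumed threshold.

```python
from statistics import median

# Hypothetical example data: (onset, offset) of each vocalization, in seconds.
EXAMPLE_VOCALIZATIONS = [(0.0, 0.6), (1.0, 1.5), (2.0, 2.7), (4.0, 4.5), (5.0, 5.6)]


def estimate_pulse(vocalizations):
    """Median interval (s) between successive vocalization onsets."""
    onsets = sorted(onset for onset, _ in vocalizations)
    intervals = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
    return median(intervals)


def split_into_phrases(vocalizations, pause_threshold=0.8):
    """Group vocalizations into phrases bounded by pauses longer than
    pause_threshold (an assumed value, not taken from the study)."""
    phrases = []
    current = []
    current_end = None  # latest offset seen in the current phrase
    for onset, offset in sorted(vocalizations):
        if current and onset - current_end > pause_threshold:
            phrases.append(current)
            current = []
            current_end = None
        current.append((onset, offset))
        current_end = offset if current_end is None else max(current_end, offset)
    if current:
        phrases.append(current)
    return phrases


def phrase_duration(phrase):
    """Duration (s) from the first onset to the last offset of a phrase."""
    return max(offset for _, offset in phrase) - phrase[0][0]
```

With the example timings, the pulse estimate is 1 second and the five vocalizations fall into two phrases, the first containing three vocalizations and the second two. A real analysis would work from onsets and offsets measured on spectrographs and pitch plots, and the choice of pause threshold would need to be justified empirically.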
Our analyses highlighted cultural variations in the ways mothers and infants organize their vocal
exchanges. Each cultural group seems to reflect specificities that can be seen as consistent with tacit
rules of verbal conversation within their cultures. For example, Indian dyads spend more time
vocalizing simultaneously and thus have less clearly marked turns in their exchanges than French
and American dyads (F(2, 27) = 3.4, p < 0.05). Another notable finding is that the duration of
pauses between turns in the three contexts is close to that in adult conversation in each of these
cultures (F(2, 27) = 6.3, p < 0.01) (Kerbrat-Orecchioni 1994). Table 14.2 presents all of the findings
concerning cultural specificities of vocal exchange of the dyads in France, India and the US.
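Two of the measures discussed here, the percentage of time spent vocalizing simultaneously and the duration of pauses between turns, can be illustrated with a short sketch. The Python code below is a simplified illustration under stated assumptions, not the analysis used in the study: each speaker's vocalizations are given as (onset, offset) pairs in seconds, intervals belonging to one speaker are assumed not to overlap each other, and the function names and example timings are hypothetical.

```python
# Hypothetical example data: (onset, offset) in seconds for each speaker.
EXAMPLE_MOTHER = [(0.0, 1.0), (2.0, 3.0)]
EXAMPLE_INFANT = [(1.25, 1.75), (2.5, 3.5)]


def overlap_percentage(mother, infant, total_duration):
    """Percentage of the recording during which both partners vocalize."""
    overlap = 0.0
    for m_on, m_off in mother:
        for i_on, i_off in infant:
            # Length of the intersection of the two intervals (0 if disjoint).
            overlap += max(0.0, min(m_off, i_off) - max(m_on, i_on))
    return 100.0 * overlap / total_duration


def between_speaker_pauses(mother, infant):
    """Silent gaps (s) at points where the speaker changes."""
    turns = sorted(
        [(on, off, "mother") for on, off in mother]
        + [(on, off, "infant") for on, off in infant]
    )
    pauses = []
    for (_, prev_off, prev_who), (next_on, _, next_who) in zip(turns, turns[1:]):
        # Count only a change of speaker separated by silence, not overlap.
        if prev_who != next_who and next_on > prev_off:
            pauses.append(next_on - prev_off)
    return pauses
```

For the example data and an assumed recording length of 10 seconds, the partners overlap for 0.5 seconds (5 per cent of the time) and there are two between-speaker pauses of 0.25 seconds each. The third change of speaker is an overlap and so contributes no pause.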
The second aim of the study was to examine the idea that these culturally determined ways of
interacting not only are supported by, but in turn support, a communicative musicality. We studied
the vocal interactions of immigrant mothers who had experienced a loss of confidence in the
infant-care practices promoted in their own culture as a result of ‘cultural conflict’ (Greenfield et al.
2000; Stork 1994) between American and Indian representations of mothering. A measure of cultural
conflict was derived from questionnaire and interview data and from measures of self-esteem
and reported social support. We found that for Indian immigrant mothers with no signs of cultural
conflict, that is, those who felt supported and had confidence in the quality of the care they provided for
their infants (which could be based on Indian, American or bicultural practices), the temporal
coordination of vocal interaction was analogous not only to that of Indian mothers living
in India, but also to that of the American and French dyads living in their own cultures. The
interactions of these immigrant dyads presented cultural features such as turn-taking style
and between-speaker pause length that situated them between Indian and American patterns
(Table 14.2). These findings support the idea that acculturation is a dynamic process of integrating
two cultural styles (Berry et al. 1992).

Table 14.2 Group differences showing culture-specific traits of protoconversation
                                           French dyads   Indian dyads   American dyads   Immigrant dyads
                                           Mean (SD)      Mean (SD)      Mean (SD)        Mean (SD)
Percentage time spent in overlap           3.2 (2.9)      9.4 (8.1)      5.1 (4.2)        4.7 (8.1)
Duration (ms) of between-speaker pause     389.4 (182)    268.8 (66)     473 (123)        331.5 (121)
Percent total mother vocalization          46.6 (7.1)     42.9 (16.4)    49.9 (7.3)       52 (15)
Percent total infant vocalization          5.4 (3)        13 (10.5)      9.4 (5.7)        5 (6.4)
Percent verbal maternal vocalization       82.7 (12.7)    52.6 (16.8)    72 (23.8)        58.3 (23.5)
Percent non-verbal maternal vocalization   17.3 (12.7)    47.4 (16.6)    32 (21.9)        42.2 (23.9)

The vocal interactions of mothers who did exhibit signs of cultural conflict were markedly dif-
ferent: their temporal organization was less coherent, with fewer clearly distinguishable narrative
episodes, and it lacked an improvisational quality. We defined a temporal unit called ‘expressive
micro-shift’, which is a fraction of a pulse unit, as a measure to assess the ‘expressive timing’ or
improvisational quality of the vocal exchanges (Gratier 2003). Our analyses revealed that the
interactions of the cultural-conflict dyads were more rigidly rhythmic in the metronomic sense
(t(58) = 2.7, p < 0.01). They presented greater regularity and recurrence at the level of the short-
est temporal unit (the pulse, lasting around one second) and less organization at the level of the
longest unit (the narrative episode). Mothers who had experienced a loss of self-confidence
through a confused sense of belonging were far more predictable in their communicative expres-
sions, and their infants in turn were far less adventurous or creative in theirs.
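One way to make ‘rigidly rhythmic in the metronomic sense’ concrete is to measure how much successive pulse intervals vary. The Python sketch below is not the expressive micro-shift measure of Gratier (2003), whose computation is not described here. It substitutes a simpler and admittedly cruder index, the coefficient of variation of inter-onset intervals, and the onset times are invented purely for illustration.

```python
from statistics import mean, pstdev


def inter_onset_intervals(onsets):
    """Return the gaps (in seconds) between successive onset times."""
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]


def timing_variability(onsets):
    """Coefficient of variation of the inter-onset intervals.

    A value of 0.0 means perfectly metronomic timing; larger values mean
    the intervals deviate more from their own average.
    """
    intervals = inter_onset_intervals(onsets)
    if len(intervals) < 2:
        raise ValueError("need at least three onsets to assess variability")
    return pstdev(intervals) / mean(intervals)


# Hypothetical onset times (seconds) for two vocal sequences.
rigid = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]          # one pulse every second, exactly
expressive = [0.0, 0.9, 2.05, 2.95, 4.1, 5.0]   # same span, subtly stretched and compressed

print(timing_variability(rigid))
print(timing_variability(expressive))
```

On an index of this kind, the more metronomic sequence scores lower; a single number of this sort says nothing about organization at the level of the narrative episode, which would need a separate measure.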
14.2.2 Towards a definition of ‘belonging’

Based on these findings, we propose that ‘belonging’ comes to life in musical communication in
the first months of life—and possibly prenatally—and that, in important ways, it is both cultur-
ally and musically derived. Belonging is based on interactive motifs and styles shared by the com-
munity into which the infant is born, and that the infant begins to embody spontaneously
through regular and intimate communication with close kin. But belonging goes beyond what is
culturally given as a right of birth. It allows the infant to explore new inventive ways of expressing
and sharing experiences. The infant is motivated right from the start to acquire and incorporate
culturally meaningful ways of tuning in to others (Trevarthen 1988, 1993); but soon, his or her
own particular style of belonging, with all of its improvisational vitality, becomes an ongoing
motivating drive for development and learning. The feeling of belonging is acquired through
musical engagement and attunement (Stern et al. 1985), and opens up new spaces for an intimate
communication supporting culturally based personal styles of ‘being together in time’.
Through ongoing, long-term intersubjective encounters, mother and infant learn to sense,
through all of their sense modalities, the future trajectories of each other’s expressive movements.
They acquire what we have called ‘protohabitus’, borrowing from Bourdieu’s notion of habitus as
social dispositions and practices that have an improvised quality (Bourdieu 1977).1 Protohabitus
is made up of all the projectable styles and routines that mothers and infants establish over time
as they interact. It is a variable repertoire of embodied habits rooted in cultural styles that the
mother brings with her from her own community of belonging. We can think of this repertoire
as akin to that used by jazz musicians in their improvisation, made of ‘licks’, riffs and in part tac-
itly learned ‘etiquette’2 (and see Lee and Schögler, Chapter 6, this volume).
In our model of protohabitus, the infant’s sense of belonging is based largely on active body
sense, not directly on elaborated cognitive or linguistic processes. In our definition of the ‘sense
of belonging’, the infant’s confidence in sensing what will happen when he or she acts a certain
way grows within a shared protohabitus of expressive activities, which in turn is supported by the
mother’s belonging and confidence in a cultural community to which her expressive movements
are already adapted. A common sense of belonging guides mother’s and infant’s anticipations of
each other’s expressions through a protohabitus that is continually revised and adjusted.
Protohabitus in action, and the emotional sense of belonging, are linked in dynamic equilibrium,
so that each supports the other through motive-guided change. When a mother loses confidence
in her own community’s cultural practices, a shared protohabitus with her infant, uniquely
improvised within well-established cultural traditions, is harder to define, and so is an
intersubjective sense of belonging.

1 ‘The particular set of culturally determined bodily dispositions which have no representative content but
through a ‘regulated improvisation’ unconsciously guide our perceptions, actions and representations is
known as ‘habitus’.’ (Bourdieu 1977, p. 78).
2 ‘Veterans refer to the discrete patterns in their repertory storehouses as vocabulary, ideas, licks, tricks, pet
patterns, crisps, clichés, and, in the most functional language, things you can do.’ (Berliner 1994, p. 102).
Figure 14.2 describes the dialectic relationship between the sense of belonging and its improv-
isational quality in vocal interaction. We suggest that the sense of belonging is both a source of
personal confidence—‘well-being’—and a powerful motivating force generated in mother’s and
infant’s awareness of sharing a set of culturally derived expressive forms that are both predictable
and afford playful variation. Protohabitus thus constitutes the structure or set of themes through
which novel practices and variations take shape and are appreciated. The sense of belonging and
the confidence and pleasure it produces must be bounded within what Keith Sawyer (2000, 2001)
has called ‘the improvisation zone’. In other words, an interaction that lacks protohabitus, that
lacks the creation of shared structure, does not foster a sense of belonging; nor, for similar
reasons, does an interaction that lacks a temporal improvisational quality. Belonging requires a
balance between the known and the new, repetition and creativity, structure and variation. In
much the same way, a piece of music or a novel becomes uninteresting and ‘stuck in time’ if all it
presents are known or repeated elements; if it presents no familiarity at all, it becomes
unintelligible and confusing (Imberty 2005).
14.3 Intersubjective timing and improvisation

14.3.1 Anticipating temporal units and weaving time
Studies of the temporal aspects of expression in the communication of young infants suggest
that optimal forms of timing and rhythm are not so much periodic as improvisational (Malloch
1999; Stern 1985; Trevarthen and Malloch 2002). They have shown that the flexible yet predictable
organization of polyrhythms of expression in time supports affective involvement and learning in
preverbal infants, and that it may contribute to shape the first units of meaning perceived by them
(Dominey and Dodane 2004; Kuhl 2004). Infant-directed speech, for example, while always
changing, is implicitly motivated towards setting up recurrent temporal units and frames of expectation
to facilitate the infant’s perception of temporal structure in what will happen (Bruner 1979; Fernald
1989; Papoušek et al. 1985). But it also plays on prosodic contour and affective response patterns,
as do all human expressive performances (Lee and Schögler, Chapter 6, this volume).

Fig. 14.2 Belonging as form and process: sensing future expressive trajectories through protohabitus
and improvisation. [Diagram not reproduced. Its labels are: ‘Protohabitus’ (interactive motifs, routines,
styles; etiquette; repertoires of licks; practices and habits; embodied knowing; tacit ways of being;
rhythmic and prosodic signatures); Sense of BELONGING; Improvisational quality (playful temporal and
prosodic coordination; prosodic modulation; emulation/imitation; variations in speed; variations in
timbre); self-confidence and motivation; ability to anticipate within the ‘improvisation zone’.]
Adults tend to ‘package’ their multimodal expressions into units lasting between 1.5 and 5 seconds
(Beebe and Gerstman 1980; Stern 1982, 2000; Trevarthen 1999). Infant vocalization and other
expressive behaviours also appear to be structured in comparable units of time. Lynch et al. (1995)
found that, at the age of 2–12 months, infants’ utterances are organized in units of 3–4.5 seconds.
Many analyses of vocal interaction between mother and baby suggest a clear patterning of
expression in time corresponding to 2–5-second cycles of activity and pause (Beebe and Gerstman 1980;
Gratier 1999, 2001; Lynch et al. 1995; Stern 1999, 2000, 2004). This natural phrasing of interaction
may constitute a primary given structure for communication based on the common experience of
the ‘felt present’, described most notably by James (1890/1992) and Husserl (1964).
Studies of vocal interaction in the first six months of life have shown that it is also organized
around shorter, more rhythmic, temporal units lasting 500–1000 ms (Beebe, Stern and Jaffe 1979;
Gratier 2003; Feldstein et al. 1993; Malloch 1999; Stern et al. 1977; Stern 1982; Trevarthen 1999).
Researchers have also described longer ongoing narrative-like episodes lasting 20–30 seconds
that are characterized by periods of initiation, development, climax and conclusion (Gratier
2003; Malloch 1999; Trevarthen 1999). These studies are evidence that infants’ sense of timing in
action, like that of adults, is organized hierarchically, so that they may simultaneously predict the
occurrence of complex stimuli that engage several rhythmic levels, much in the same way that
when listening to music we can often predict—or simultaneously sense and synchronize with—
the occurrence of the next beat and of the next phrase, and the evolution of the narrative cycle
(Gratier 1999; Trevarthen 1999). Taken together, an infant’s capacity to parse stimuli
and anticipate events and an adult’s intuitive segmentation of expression at matching levels
facilitate, we believe, the generation of highly creative, improvised non-verbal dialogues.
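The hierarchy of temporal units described above can be summarized as three nested duration ranges. The Python sketch below is only a reading aid: the ranges are those reported in this section (rhythmic units of 500–1000 ms, 2–5-second cycles of activity and pause, narrative-like episodes of 20–30 seconds), while the labels, the names TEMPORAL_LEVELS and classify_unit, and the decision to return None for durations that fall between ranges are our assumptions.

```python
# Duration ranges in seconds, taken from the figures reported in the text.
# The level labels and the hard boundaries are simplifying assumptions.
TEMPORAL_LEVELS = [
    ("pulse", 0.5, 1.0),
    ("phrase", 2.0, 5.0),
    ("narrative episode", 20.0, 30.0),
]


def classify_unit(duration_s):
    """Return the level whose range contains duration_s, or None.

    Durations falling in the gaps between the reported ranges are left
    unclassified rather than forced into the nearest level.
    """
    for label, low, high in TEMPORAL_LEVELS:
        if low <= duration_s <= high:
            return label
    return None


for seconds in (0.6, 3.5, 25.0, 10.0):
    print(seconds, "->", classify_unit(seconds))
```

Real interaction is unlikely to respect such sharp cut-offs; the point is only that a single stream of behaviour can be described at several time scales at once.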
14.3.2 Expressive timing and the improvisation zone

The term ‘expressive timing’ is borrowed from the literature on musical performance and
improvisation, and can be used to describe the particular quality or ‘energy’ of temporal organi-
zation in spontaneous mother–infant interaction. Expressive timing in music refers to the small
deviations from strict metronomic rhythm that musicians use to impart vitality and expressive-
ness to their playing (Clarke 1989; Iyer 1998; Kühl 2007). We propose that a loss of expressive
timing in a mother’s behaviour implies an over-predictability in interaction, which does not cap-
ture the infant’s attention and expressive potential.
An infant’s active construction of experience bridging between past and future is further con-
firmed by the study of the temporal organization of face-to-face interaction and by studies of
infants’ perception of tempo and musicality. Various studies of mother–infant communication
point to the existence of recurrent temporal units embedded within the interactive flow. They cor-
respond to cycles of attention (Brazelton et al. 1974), to units of vocal and kinesic activity (Fogel
1988; Stern et al. 1977; Trevarthen 1999; Tronick and Weinberg 1997) and to pauses in turn-taking
sequences (Beebe et al. 1985, 1988). We know that 2- to 4-month-olds have a preference for musical
tempi of around one beat per 600 milliseconds, moderato to andante; this corresponds to the mean
tempo to which we naturally entrain, or to what Paul Fraisse has called our ‘spontaneous tempo’
(Baruch and Drake 1997; Fraisse 1982; Mazokopaki and Kugiumutzakis, Chapter 9, this volume).
What is most interesting to us in terms of the infant’s perception of time, however, is that
3-month-old infants display an attentional preference for moderate or variable contingency over
perfectly rigid contingency, and that the fastest learning of novel actions, in engagement with
events the infants are attempting to control, occurs in situations of moderate contingency
(Bruner and Sherwood 1975; Watson 1979). This preference for moderate contingency can be
related to an infant’s daily experiences with social stimuli that present both identifiable regularity
and exciting nuance (Hane et al. 2003; Watson 1985); it supports a highly adaptive form of inter-
subjective engagement and learning.
An important study of vocal interaction between 4-month-olds and their mothers revealed the
significance of mid-range vocal rhythm coordination. Infants with both high and low levels of
vocal rhythm coordination—defined by the degree of coordination of sounds and silences
between mother and infant—exhibited insecure attachment styles at 12 months; by contrast,
infants in the mid-range coordination category were securely attached (Jaffe et al. 2001). A mid-
range vocal coordination at 4 months has also been shown to predict language development at
24 months (Hane and Feldstein 2005). Mid-range coordination between mother and infant
leaves room for creative expression and collaborative exploration. The temporal ‘elasticity’ it
affords is a crucial basis for the negotiation of purposes between mother and infant. Studies of
‘peekaboo play’ with young infants provide further evidence of the central motivating role
of well-balanced repetition and variation. Around the world, adults and infants play peekaboo
games to explore the fascinating tension between repetition and novelty, anticipation and sur-
prise. These games frame social and emotional co-regulation, and support, or exercise, the
infant’s social awareness (Bruner and Sherwood 1975; Fernald and O’Neil 1993; Greenfield 1972;
Rochat et al. 1999). A mid-range predictability in the peekaboo game provides greatest enjoy-
ment to the infant (Fernald and O’Neil 1993). It seems then that intimate communication and
intersubjective engagement are associated with less rigid and more flexible interactive timing, in
a range that corresponds to an internally generated sense of time for the prospective control of
movements (Lee and Schögler, Chapter 6, this volume).
As we have seen, the temporal organization of mother–infant interaction is not strictly or con-
stantly rhythmic: it does not present wholly predictable temporal patterns, but rather a form of
timing that stimulates frames of expectation and generates improvisation zones, a timing that is
at once clearly structured and subtly varied. Expressive timing in spontaneous face-to-face inter-
action, between mothers and infants, and perhaps in any interactive practice, constitutes the basis
for implicitly defining the boundaries of an ‘improvisation zone’ (Sawyer 2001) beyond which
communication fails because it no longer engages the other’s speculative mind. We can conceive
the limits of the intersubjective sharing of experience—by means of conversation, music or
non-verbal interaction—as a space beyond the boundaries of which interactions are either too
predictable or too chaotic to be intelligible or shared (Sawyer 2001).
Daniel Stern (1982, 1999) has pointed out that the modulated, variable timing of
mother–infant interaction is perfectly suited to maintaining and regulating the infant’s attention
and emotions. Highly repetitive timing is used at times to elicit the infant’s attention, but if it
persists, the infant becomes bored and turns away; similarly, unstructured timing fails to capture
the infant’s attention and is invariably upsetting.3
3 Stern likens mother–infant interaction to a jazz duet. He writes: ‘There is something about the deviation
from the beat, the irregularity or variation from the expected regularity that is “expressive” of feelings and
thoughts. This is perhaps most dramatic in jazz, where such deviations are a conventionalized feature of
the style, and much of the excitement can be generated by fluctuations of falling behind and then getting
ahead of the beat and then slipping back into it’ (Stern 1982, p. 104).
14.3.3 The vital importance of repetition and variation

A number of researchers have been interested in exploring this improvisational quality of
mother–infant interaction and, indeed, a few recent studies suggest that mothers and babies rely
on timing mechanisms similar to those jazz musicians use to adjust their behaviour to each other (Malloch
1999; Schögler 1999, 2002; Stern 2000). Improvisation in various musical genres is based on the
establishment of a common beat and a shared temporal structure within which each musician
introduces meaningful changes (Bailey 1992; Iyer 1998). It is built from such devices as turn-
taking, antiphony, imitation and synchronous play; the introduction of novelty or change is
always negotiated between the musicians. Furthermore, jazz musicians rely on predetermined
‘licks’ and ‘etiquette’, which constitute the ‘culture’ and ‘aesthetic’ within which they play, and
within which their play is meaningful to each other and to their audience (Becker 2000;
Iyer 1998). They rely on expressive timing and synchrony to set up frames of expectation, lines of
tension, and gratifying resolutions (Iyer 1998; Schögler 2002).
Improvisation can be seen as fundamental to many of our spontaneous activities. Conversations
are improvisations of a kind; we follow rules of conversation, but every conversation is unique,
although some are more scripted and others more creative (Sawyer 2001). Mothers and infants set
up temporal structures within which they play with time, and establish and rely on unique and
culturally shaped interactive routines. Like musicians, they use shared cultural ‘licks’ to generate
new patterns within a well-defined improvisation zone (Brandt, Chapter 3, Bradley, Chapter 12,
and Erickson, Chapter 20, this volume).
As we have already suggested, the dual actions of anticipating known forms and processes, and
perceiving and producing novel forms and processes based on them, impart a crucial dynamism
and vitality to ongoing interaction. This holds true, again, for multiple forms of interaction: pre-
verbal, conversational, musical and kinesic. Repetition implies appropriation and projection, and
contributes to building a dynamic and assertive sense of self. Variation of repeated forms and
processes transforms the known into the new, moves it forward, and draws or invites it into a
space for creative dialogue and exploration. Studies in the cognitive psychology of music have
shown that variations in timing, intensity and duration in musical performance serve to enhance
aspects of musical structure by facilitating listeners’ segmentation of musical sequences
(Drake and Palmer 1993). In the case of mother–infant interaction, variation of known forms
may contribute to strengthening a shared protohabitus.
Another important aspect of dynamic patterns of repetition and variation is their inbuilt
flexibility. Certain expectations are built up from recurrent, repeated elements, and at any given
point, in spontaneous interaction, there are various possible ‘next events’. However, there are also
events that are not possible, or not perceived to be lawful within a particular, implicitly agreed-on
interactive context. The well-documented phenomenon of ‘repair’ in mother–infant interaction
(Tronick and Cohn 1989) attests to the moment-to-moment awareness of the existence of a
range of possible or ‘lawful’ expressions. There is, so to speak, a common sense or shared intu-
ition of what is right and wrong in interaction and, when an element is out of bounds, mother
and infant work cooperatively to fix the problem. They do this in much the same way as jazz
musicians.
During collaborative play, musicians often miss each other’s cues or play something
unintended, and these slips and mistakes are usually immediately picked up by their colleagues.
However, a good jazz musician must be brave enough to risk mistakes, and the final status of
the so-called mistake is difficult to ascertain because the cooperative flexibility of a musical
ensemble often contributes to blurring the boundary between expectation and creativity
(Duranti and Burrell 2004). In jazz, ‘mistakes’ are part of the music, and the ability to ‘repair’
them during a performance has become an important index of proficiency (Monson 1996;
Iyer 1998). Imberty (2005) underscores the centrality of an ‘ornamented and diversified
regularity’ in creating our experience of time and meaning while we are listening to music.
Interaction styles that balance repetition and variation also favour mutual engagement and
interactive flow, because they support graduated change, change that remains at all times mean-
ingful within its situated context, its history, aura and potential. In short, repetition and varia-
tion are of vital importance in any interaction because they constitute the basic architecture for
the experience of sharing existential time, or, as Alfred Schutz (1962, p. 116) put it, of ‘growing
older together’ within the space of an interaction.
14.3.4 What is intersubjective time?

Psychologists, who by and large have ignored the question of time and temporal experience,
tend to adhere to the philosophical distinction between physical measurable time and psychologi-
cal subjective time. However, in the light of much convincing research in the social sciences,
aimed at the analysis of social interaction and socially constructed meaning, it seems crucial
for psychologists to acknowledge and to study a third form of time, namely intersubjective time.
Studies of mother–infant interaction, and of collaborative group practices such as improvised
musical performance, provide valuable insights into this form of temporal experience.
Intersubjective time is surely central to all of our questions about consciousness, cognition
and culture.
We believe the infant’s experience within the flow of interaction is of an unfolding future
that resonates with a recent and continually reconstructed past. As Donaldson put it, individ-
ual human consciousness grows through ‘point’, ‘line’, ‘core construct’ and ‘transcendent’
modes as memories and thinking explore further and further out from the ‘here and now’
(Donaldson 1992). However, all understanding is rooted in and develops from present embod-
ied experience. Daniel Stern (1999, 2004) uses the term ‘vitality affect’ to describe the way we
sense the temporal contour of experience as it arises. It is both a narrative of ‘becoming’ and
the experience of a ‘now’. Vitality affects are made up of particular patterns of tension, often
involving periods of heightened excitement and ending in a dénouement. The beginning, the
evolution and the end of short-lived non-verbal stories weave together as past, present and
future in predictable yet variable ‘feeling forms’ (Langer 1953). They occupy the space of a
present moment with its past and future horizons and, in interaction, give structure to moth-
ers’ and infants’ moment-to-moment sensing of each other. They are like the trajectories of
intentions travelling to their immediate goals (Stern 2000). Vitality affects are compared to
musical phrases, both in terms of their average duration (2 to 5 seconds) and in terms of their
experiential qualities, and contribute to setting up ‘moments of meeting’ that emerge as mean-
ingful and motivating crystallizations of shared experience (Stern 2004). The ability to sense
the temporal contours of our own feelings and their potential for intimacy grows richer within
authentic encounters with trustworthy partners. An inner sense of the temporal contouring of
experience, infused with both aesthetic and moral qualities, is derived from lived-through
intersubjective time.
In his 1951 paper entitled ‘Making music together: A study in social relationship’, Alfred Schutz
(1962) describes what he calls the ‘mutual-tuning-in relationship’. It is a type of relationship that
is actualized in intersubjective experience, either with live partners in face-to-face encounters, or
in the presence of works of art conveying the communicative intent of their creators. According
to Schutz, the relationship is
established by the reciprocal sharing of the Other’s flux of experiences in inner time, by living
through a vivid present together, by experiencing this togetherness as a ‘We’. Only within this
experience does the Other’s conduct become meaningful to the partner tuned in on him – that is,
the Other’s body and its movements can be and are interpreted as a field of expression of events
within his inner life.
Schutz (1962, p. 118)
Sharing inner time through common experiences is described as a unique type of experience.
Intersubjective time is not entirely disconnected from objective measurable time in the way
that most people think of subjective time. On the contrary, it constitutes a fascinating and
complex challenge for psychologists in the sense that, if we think of it as a form of experience,
it is largely based on observable and easily measurable events, events governed by objective
reality ‘out there’, that can be observed from within interactive encounters. What, then, is the
relationship between the temporal organization of interactive events and the experiences of
‘sharing time’ through them?
Without doubt, intersubjective time must be analysed with respect to its real-time physical
embodiment and the ‘feeling’ of its motivational force. Interactants rely on an intersubjective
time of acting together to give form and meaning to co-constructed experiences. The perception
of meaning is inherently connected to the experience of an ‘unfolding’ in sympathetically shared
or ‘melded’ inner time. In ideal conditions, it is through the expressive confluence of ‘inner time’
that mothers and infants actively engage in shared practices containing both historical thickness
(protohabitus) and the excitement of shared discovery in a common reality.
14.4 Meaning in belonging

14.4.1 Musicality and narrative
‘Narrative’ has been identified as one of the key components of communicative musicality
(Malloch 1999), but the term narrative used within this context is in need of clarification. It does
not imply the activity of recounting or reconstructing real or imagined events, referentially;
it implies only a particular format that is a fundamental intentional means of structuring and
conveying meaning. The narrative format is a vehicle for a purposeful progression of emotional
expression and is inseparable from a temporal trajectory that has expectations and excitements
(Imberty 2005; Stern 2004). As many researchers have pointed out, narrative content and narra-
tive format must be considered as fundamentally human forms and activities (Bruner 1987;
Burke 1945; Ochs and Capps 2001; Ricoeur 1983–1985). Narratives may be thought of as vectors
of intentionality, providing the impulse to connect experiences into meaningful, memorable and
recognizable wholes (Bruner 1990). Narratives, taken as stories about reality that are told within
a particular format, and which therefore have linguistic content, derive meaning from the
existence of a tacitly acknowledged stock of assumptions and expectations about the nature of
reality that are agreed between tellers and hearers (Bruner 2002). Similarly, narratives taken as
‘lived non-verbal stories’ (Stern 2004), which have a narrative format but a non-linguistic
content, become meaningful within a simpler, but more vitally compelling framework of shared
assumptions and expectations (Brandt, Chapter 3, Merker, Chapter 4, and Cross and Morley, this volume).
Research on interaction between mothers and preverbal infants shows that rhythmic
expression and vocal prosody provide meaningful content for their musical lived-through narra-
tives. Detailed acoustic analysis reveals intricate narrative-like, or dramatic dynamics of pitch
coordination between 2-month-olds and their mothers (Trevarthen 1999; Trevarthen and
Malloch 2002). Temporal phases of introduction, development, climax and conclusion in
sequences bounded by longer pauses generate non-verbal narrative episodes lasting between
20 and 30 seconds (see Malloch and Trevarthen, Chapter 1, this volume). The effects of narrative
tension and resolution are produced as a coherent and gradual rising and falling of vocal pitch, and
often the mother’s verbal discourse offers insight into other features of the non-verbal units of
narrative. Mother and infant search for and maintain a changing feeling for the expressive pulse
throughout a narrative cycle. Infants often vocalize on the beat, at the end of a phrase, or at a
point of crescendo and relaxation, and their bodies move or ‘dance’ in time with the musicality of
the mother’s expressions (Malloch 1999; Trevarthen and Malloch 2002). By making joint narra-
tives of action and emotion, mother and infant come to share history and to invoke community.
14.4.2 The polysemic and non-discursive roots of belonging

If the infant does perceive and partake in the construction of narrative-like forms and in vitality
affects, we must reflect on the purpose, and value, of these creative experiences. It is possible that
the realm of meaning that the infant accesses through the temporal and qualitative coordination
of expressive behaviour is akin to the particular kind of meaning we access through musical
experience, or more generally through forms of aesthetic perception, which are by and large pre-
sentational and connotative rather than representational—a semiotic form that the philosopher
Susanne Langer (1942) describes as ‘non-discursive meaning’. Music, according to Langer, is a
semiotic mode which is as prevalent and normal as language, but whose units of meaning are
context-dependent, temporally based, and untranslatable. Music has a vital purpose in life.
Although it lacks the denotational specificity of language, music is our most powerful medium
for expressing emotion and the ‘ambivalences and intricacies of inner experience’ (Langer 1942,
p. 100). In line with Langer’s philosophy, we consider the musicality of interaction as a funda-
mental feature of preverbal and verbal communication, and as constitutive of the activity of
human intelligence. Musicality in the first months guides the infant mind into a world of mean-
ing in action before anything specific outside the engagement is talked about.
We propose that in the first six months of life, an infant does not acquire knowledge of inten-
tional relations as much as a knowing of intentional practice that is based on embodied processes
of intention and actualized within temporal frameworks of interacting minds. Stern (2004)
defines implicit knowing as ‘non-symbolic, non-verbal, procedural, and unconscious in the sense
of not being reflectively conscious’. We further suggest that, in the first months of life, implicit
knowing, which is grounded in an intersubjective sense of time, is woven into a sense of belong-
ing. In other words, the infant’s expressive movements are guided by a set of culturally derived
and musically shaped habits and practices. Belonging is implicit knowing that takes into account
the cultural tones of every individual’s ways of moving and meaning. Mothers and infants make
sense to each other (and make meaning for each other) in the course of their exchanges, through
musical narrative forms that structure experience in and of time, and that build on expanding
and changing shared repertoires of communicative form.
We know that young infants are attracted to the temporal trajectories of human intention.
They are tuned into the contours of vocal expression (Papoušek 1996), the durations of phrases
perceived as vitality affects (Stern 2000, 2004) and the stages of narrative engagement (Trevarthen
1999). By 7 months, they accurately parse and segment ongoing linguistic and musical material
(Jusczyk and Krumhansl 1993; Krumhansl and Jusczyk 1990). Around 10 months, infants parse
a complex flow of action along boundaries that coincide with the initiation and completion
of intentional movements (Baldwin et al. 2001). Trajectories of intention and expression,
then, must be considered as dynamic vectors that structure intersubjective time and give mean-
ing to shared experience.
From about 9 months of age, within affectionate relationships of mutual attention and confi-
dence, infants begin to grasp fixed patterns of intentional relations through referential activity
(Bruner 1979, 1990; Tomasello et al. 1993; Tomasello 1999; Trevarthen and Hubley 1978). These
are ‘acts of meaning’ (Halliday 1975), activities that pave the road for the acquisition of language
and for the crystallized meaning of words (Bruner and Sherwood 1975; Markus et al. 2000). It is
around this age that infants are thought to begin to acquire cultural know-how, because culture is
usually thought of as a fixed set of properties that are added to mental activity, rather than as
constitutive of the human psyche. As they develop, infants learn to physically demonstrate the
course of their sensing of others’ intentionality, as is clear from activities involving joint and
mutual attention. Meanings become increasingly public, conscious and contained. The non-
discursive semiotic field that the young infant accesses, however, constitutes the foundation for
all meaning-making activity and coexists with linguistic referential meaning throughout life.
Indeed, linguistic meaning is always embedded in multiple semiotic fields (Goodwin 2003),
many of which present non-verbal forms of ‘tacit knowing’ (Polanyi and Prosch 1975; Stern
2004). Thus, a highly specific infant semiosis or protosymbolic representation is co-constructed
in intersubjective experience⁴ (Trevarthen and Hubley 1978; Trevarthen 1980, 1988, 1994).
Important insights into the nature of infant semiosis may be obtained from close scrutiny of
the literature on musical meaning (Meyer 1956; Imberty 1981; Kivy 1990; Kühl 2007). More
interesting still might be insights gleaned from the much more limited research on musical inter-
action and group interaction processes (Monson 1996; Sawyer 2003). We provide a few pointers
for further study in this domain. One fundamental aspect of musical meaning is its inherent pol-
ysemia (having multiple meanings). In fact the polysemic nature of music, and of other art
forms, is inseparable from its aesthetic quality (Cross 2001).
Studies of real-time jazz performance and collected interviews of jazz musicians show that
meaning is intimately connected with a shared sense of belonging (Becker 2000; Duranti and
Burrell 2004; Monson 1996). Musicians can only make sense of performances because, on the
one hand, their improvisations are based on known structures, themes and habits, and on the
other hand, they make more or less clear reference to historically rooted repertoires. Thus, they
are making meaning at multiple levels and with more or less transparency. A particular riff may
carry strong connotations of a well-known event in the history of jazz; it may be considered by
all musicians as a direct reference to that event. Or it may carry multiple connotations that are
perceived and taken up by some but not all the performers. In addition to these forms of more or
less referential meaning in improvised musical performance, the very fact of the polysemia that is
inherent to particular musical sequences itself procures aesthetic meaning and may be played
upon by the musicians. Finally, the act of sharing tacit references non-verbally through online
multimodal coordination is in itself a meaningful experience, and reinforces the bonds and
intimacy between the performers.⁵ In pre-composed written music, too, meaning largely
depends on the listener’s perception of the style of the piece, which is also a particular way of
sensing time, based on universal abilities to structure and segment expressive sound (Imberty
1981). The culture of music is created as it is perceived, with ‘human presence’ (Brandt, Chapter 3,
and Mazokopaki and Kugiumutzakis, Chapter 9, this volume).
4 ‘The syntax of verbal expression in speech or text is derivative of, or built upon, a nonreferential process
that regulates the changes and exchanges of motivation and feeling between subjects in all communica-
tion where cooperative awareness is being created. This is the level of semiotic process at which infants
communicate’ (Trevarthen 1994, p. 240).
5 This seems to be what John Blacking is alluding to when he writes that ‘the chief function of music is to
involve people in shared experiences within the framework of their cultural experience’ (Blacking 1973, p. 48).
It may be useful to distinguish the terms ‘meaning’ and ‘meaningfulness’. Many human experi-
ences may indeed be considered highly meaningful, despite not having a clear meaning attached
to them. For example, there is an important distinction to be made between grasping new mean-
ings through the activity of conversing with someone, which results in understanding and
insight, and feeling that one has participated in a meaningful conversation, one which carries res-
onance beyond the words and thoughts that were exchanged. In other words, the sense of con-
nectedness and mutuality is in and of itself meaningful. Meaning and meaningfulness are highly
related, since they are both engendered by actual or implicit intersubjective experience. Thus,
the feeling of belonging is at the same time meaningful, meaning-driven and meaning-oriented.
We belong together in meaningful experience.
14.3 The morals and aesthetics of belonging

The meaning of connectedness attests to its aesthetic quality. There is something special and
deeply humanly enjoyable about being in ‘synch’ with others, or being ‘in the groove’ or ‘on the
same wavelength’. There may indeed be a strong link between our experience of belonging
through ongoing dynamic communicative practices and the enjoyment of art.
Dissanayake (2000a) proposes common phylogenetic origins for connectedness, ritual and art
(and see Dissanayake, Chapter 24, this volume). Schutz (1962) considers the experience of art to
be inherently intersubjective because it involves an immediate convergence of the fluxes of inner
time of the beholder of a work of art and of its creator. In both successful real-life encounters
with others and art that personally pleases us, we seem to sense or sympathize with others’
hidden intentionalities. Even when encountering a static work of art—a painting, for instance—
but one which has been ascribed that particular aesthetic stance that makes it a ‘work of art’, our
appreciation of the object itself seems to simultaneously conjure up a general feeling of connect-
edness. This has been described in the experience of ‘flow’ by Csikszentmihalyi (1990) or of
‘peak experience’ by Maslow (1971).
If we take the analogy between improvised performance and mother–infant vocal interaction
one step further, we may suppose that a culturally derived improvised sense of belonging is akin
to the culture or aesthetics of a musical tradition. The aesthetics of a particular musical genre
can indeed be abstracted artificially, described and written about, as a set of norms, processes,
and styles with their specific histories. However, it is meaningful and productive only as it is actu-
alized in performance or composition, and with an audience that appreciates it in mind. The
culture of any art form is a dynamic nexus that holds and shapes individual instances of its
expression.
There may be a powerful connection between belonging and beauty (Panksepp and Trevarthen,
Chapter 7, this volume). Musicians feel most connected during their performances when what
they produce is both meaningful and in harmony with the aesthetics of their tradition—and vice
versa. However, the most acclaimed music (or work of art in general) goes slightly beyond the
aesthetic traditions it is rooted in. When we say that a work of art is ‘avant-garde’, we may mean
that it lies on that edge between what is recognizable and totally transformed, that it is, teasingly,
at the boundary of what we are able to connect with. The affection-making, bond-strengthening
humour of jokes and games in play is on that boundary too (Reddy 1991).
Mothers often claim to be powerfully moved by the connectedness they experience with their
babies, and there is a sense in which some interactions are more beautiful than others. Bill
Condon (1982) spoke of the choreographic beauty of some of the filmed interactions he studied
image by image, as opposed to others—and their beauty was connected to their particular form
of coordination and the interactants’ well-being. The norms of expectation encountered in
improvised performance and in mother–infant interaction can also be viewed in moral terms
(Duranti and Burrell 2004; Trevarthen 1986). Alessandro Duranti and Kenny Burrell (2004)
analysed videotapes of improvised musical exchanges and performances, as well as conversations
surrounding them. They describe the morality of jazz improvisation as those tacitly agreed-upon
rules that define what is considered aesthetically and morally ‘good’. In their view, the moral and
aesthetic qualities musicians describe are associated with a sense that the music adheres to and
reinforces a shared aesthetics and, at the same time, emerges from a personal commitment to
authenticity, honesty and empathy, or other-awareness (Duranti and Burrell 2004). Jazz musicians often
complain of the overpowering egos of some, of their arrogance, self-centredness or greed, which
can come in the way of successful performance. There is an interesting merging of aesthetic and
moral canons in musical performance.
14.4 Musicality as ‘holding’

Movements of bodies, their uses and expressions, carry imprints of belonging. The history of a
community, even its ancient history, is alive in movement. Marcel Mauss (1934) describes this
very form of embodied culture when he talks about ‘techniques of the body’. Human expression
carries elements of past communicative traditions that are simultaneously relived and renewed in
contemporary contexts. The confused sense of belonging of a mother who is psychologically
detached from her parental community potentially inhibits this sociohistorical flow of implicit
knowing, which inspires confidence and at the same time opens up frames for negotiation and
creative exploration. Through musical multimodal exchanges with their mothers, infants access
the rich and coherent communicative traditions of their communities of belonging, and use
them as themes on which to improvise new modes of intersubjective engagement. Expressive
timing in mother–infant interaction, the framing device that contains and structures intersubjec-
tive experience, must be both clearly defined and flexible in nature. Through sensitive timing,
mothers hold their infants’ attention, regulate their emotions, and create a microculture of pre-
dictable routines and styles. Musicality, then, can be considered as a form of psychological ‘hold-
ing’ that encompasses the handling of the baby (Winnicott 1971).
Falk (2004) suggests that, early in the evolution of humankind, mothers were able to be physi-
cally separated from their immature infants thanks to the development of this capacity for vocal
holding, an interactive capacity of adults, who spontaneously talk to, coo and sing to infants, and
of infants, who orient and attend with particular interest to these vocal expressions (Panksepp
and Trevarthen, Chapter 7, this volume). However, we must also bear in mind that the role of
vocal interaction with infants varies a great deal across the world and through history. We are
suggesting that it is the expressive timing of interaction with infants, by vocalization or any other
expressive means, that constitutes the primary source for the experience of ‘holding’ described by
Winnicott, whatever the form of the interaction. In fact, interaction styles may evolve and change
as infants develop but must maintain this sensitive musical quality in timing (Mazokopaki and
Kugiumutzakis, Chapter 9, and Powers and Trevarthen, Chapter 10, this volume). Infants who are
considered securely attached at 12 months are perhaps those that received a continuous and
coherent ‘musical holding’. This idea is supported by Jaffe et al. (2001), in their study of vocal
timing and attachment in the first year of life.
Implicitly transmitted ‘techniques of the body’ guide a mother’s actions and movements when
she interacts with or cares for her baby. The infant’s intuitive responsiveness and pleasure will
give a new mother confidence and a sense of expertise in her day-to-day activities. Beyond that, a
mother’s culturally inherited know-how and her sense of belonging to a community is a holding
environment for her, just as she herself provides a holding environment for her infant. A mother
who does not know who she is disturbs the temporal historical flow of implicit knowing, which
may finally disrupt the infant’s spontaneous sense of narrative time. At an acoustic level, we can
imagine that a feeling of ‘home’ is generated by the very musicality of interaction: the security
and holding afforded by recognizable shared routines, rhythmic patterns and familiar expressive
dynamics, as well as stable and predictable vocal pitch contouring and timbre. Feeling at home in
a community of caring and trusting people supports the negotiation of a musical holding within
interactive contexts. The psychoanalyst Didier Anzieu (1995) provides a fascinating analysis of
this holding function of the sound-world the infant inhabits. He suggests that the first level in the
construction of the sense of self is based on a ‘sound envelope’ that must be coherent, predictable
and comforting.
Studies of the dynamics of non-verbal adult interaction in various cultures point to specific
cultural microrhythms of communication. Microanalyses of conversations between people from
different cultural backgrounds reveal slight misattunements and dissynchronies between them
(Condon 1982; Gumperz 1981). The anthropologist E.T. Hall (1983) suggests that our core cul-
ture is expressed in these very subtle and unconscious ways, through a silent language, when we
interact with others. Our intuitive cultural ways of being together play a vital role in generating
and maintaining community and cooperative behaviour. Within our particular communities of
belonging, we are better equipped to sense the future of particular expressions through a feeling
for their established dynamics and trajectories. We sense and anticipate the possible ways in
which our feelings and our thoughts will, or should, develop.
The importance of timing and of the expressive qualities conveyed by the voice and body may
be linked to the universal affiliative powers of music and its meaning (Dissanayake 2000a;
Kühl 2007). Dissanayake (2000b) suggests that we find meaning in art and are moved by it
because it resonates with an ancestral and vital need for communion. This thesis is supported by
Donald’s (1993) view of human cultural cognition as stemming from emotional mimetic com-
munion and a motivation to belong to meaningful cultural communities. The idea that mimesis
and transmission are rooted in shared rhythm was foreseen by Marcel Jousse (1969), an unusual
thinker—both a Jesuit priest and an anthropologist. He presented a theory of rhythm as a consti-
tutive element of humanity, continually linking mankind to the cosmos and nature. Rather than
reason, rhythm carries the greatest power to humanize (for him ‘I rhythm therefore I am’), and it
acts as the primary vector for transmission through mimetic action. This relates to the ideas of
the Norwegian musicologist Bjørkvold (1992), who has studied children’s musical culture created
in the playground by toddlers and preschool children, outside adult supervision and established
musical art. Bjørkvold says the child’s motto is ‘canto ergo sum’ (‘I sing therefore I am’). Rituals
around the world rely on music and dance to generate experiences of identity and belonging, and
to mark time (Blacking 1973).
In contrast, some rituals found in a variety of cultural contexts, particularly those involving
altered states of consciousness, are aimed at disrupting identity and breaking the flow of mun-
dane time; these rituals are most often characterized by extremely repetitive actions, music or
incantations (Rouget 1994). The study of mothers with borderline personality disorder to which
we will now turn picks up on this ethnomusicological insight connecting loss of identity with
repetition and stagnation of rhythm.
14.5 Fragmented time and self in borderline mothers’ interactions

14.5.1 Situating borderline personality disorder
We became interested in the vocal interactions of mothers presenting borderline personality dis-
order (BPD) because of the particular connection between identity, belonging and expressive
timing we observed in the study of immigrant mothers. BPD, as defined by the American
Psychiatric Association (Diagnostic and Statistical Manual of Mental Disorders [DSM-IV-TR]
2000), is largely characterized by breaches in the continuity and coherence of interpersonal rela-
tionships. As with other personality disorders, however, the signs of this condition are not always
clearly demarcated. They are woven into the fabric of a person’s personality structure with its
own ontogenetic trajectories. It may not be easy to disentangle the defining features of BPD from
other idiosyncratic personality traits, nor is it clear if the defining signs of the disorder are always
expressed discretely or are coloured and shaped by individual personality features.
BPD is, above all, a disorder of identity (Kernberg 1975), characterized most often by intense
and chaotic relationships, impulsive behaviour, labile and extreme emotional reactions, separation
anxiety and intrusiveness, and, perhaps paradoxically, fears of being swallowed up or abandoned
by others. It is a disorder of the regulation of intimacy and can be observed both in the
micromoments of a meeting, and in the lines of a person’s lifetime or personal narrative history
(Apter-Danon 2004). These histories of people with BPD are often marked by sudden and vio-
lent breaks in their relationships. Actual experiences of loss are common—of being abandoned
physically or psychologically, or of abuse during childhood. In many cases, there are recurrent
disruptions of intergenerational transmission in the families of borderline persons and we may
speculate that this lack of transmission—of love, of knowledge, and also perhaps of tacit know-
ing—is at the very heart of the difficulties they experience. Women with borderline personality
disorder seem to lack a holding environment and coherent internal working models to guide and
shape new relationships.
Thus, we draw a parallel between mothers with BPD and mothers who experience cultural
conflict. However, we must bear in mind that the level of suffering and the adverse consequences
for the infant are much greater in the case of BPD mothers, as their condition has deep roots and
is long-lasting. In most cases, immigrant mothers rapidly regain confidence and a renewed sense
of belonging once they have tuned into their new contexts. Mothers with BPD, on the other
hand, have implicit knowledge of repeated experiences of emptiness, impeachment of their emo-
tions by significant others, and no relational scaffolding. For these mothers, having a baby brings
into play a number of issues that are at the core of their suffering, such as fears of abandonment
and enmeshment, feelings of inadequacy related to repeated interpersonal failure, or a sense of
narcissistic emptiness. Yet many women with BPD seek out motherhood. Having a baby can
bring them a subjective sense of comfort, well-being and connectedness. It offers an opportunity
(real or imagined) for the repair of their relatedness.
14.5.2 Preliminary research findings

First, we present findings from a large-scale longitudinal research project on interactions of bor-
derline mothers and infants from 3 to 18 months of age (Apter-Danon and Candillis 2005).
We then present results from our acoustic analysis of a small number of audio recordings of six
borderline and six control mother–infant dyads.
Microanalyses were first performed on 18 videotaped recordings of borderline mothers and
their 3-month-old infants, based on the ‘still-face’ paradigm (Tronick et al. 1978), and on
18 recordings of mother–infant dyads that presented no signs of pathology. These analyses indi-
cate that mothers with BPD hold their babies less comfortably, rock them more, and hold them
closer than control mothers. Lower degrees of contingency in mutual-gaze orientation and less
smiling further distinguished the two groups (Apter-Danon 2004).
Fig. 14.3 Maternal interactive behaviours for mothers with BPD and control mothers. [Bar chart; vertical axis 0 to 70; categories: Change position, Very close, Very far, Avert, Smile, Vocalization; series: BPD, Controls.]
Figure 14.4 Infant interactive behaviours for infants of mothers with borderline personality disorder (BPD) and for infants of control mothers. [Bar chart; vertical axis 0 to 80; categories: Look mother, Look away, Touch mother, Self-comfort, Smile; series: BPD, Controls.]
A second microanalytic study compared nine borderline mothers with nine control mothers.
The results showed that BPD mothers are generally more intrusive and that they attribute
negative states to their infants more frequently. They vocalize and smile less, avert their gaze
more, and vary the distance between themselves and the baby more (Figure 14.3). Infants of
borderline mothers appear less motivated by social engagement and have fewer positive expres-
sions during interaction than infants of control mothers. They look at their mothers and smile
less than control infants. In addition, they explore objects and their environments less, touch
their mothers less, and are less efficient at self-soothing when upset by the still-face episode
(Figure 14.4). (Note that the data presented in Figures 14.3 and 14.4 are descriptive, and due to
small sample size statistical analyses were not performed.) The interactive styles of mothers
with BPD and their infants are clearly distinguishable from those both of mothers with other
psychological disturbances and their infants, and of control dyads (Apter-Danon 2004).
We further analysed two minutes of vocal interaction for six BPD dyads and six control dyads
(a number of audio recordings had to be discarded as their quality was insufficient for reliable
acoustic analysis), using the acoustic analysis methods described in Section 14.2.1. Differences
between the two groups’ vocal interaction patterns are presented in Table 14.3. It appeared that
the protoconversational format is less clearly defined among borderline dyads, with mothers and
infants taking fewer vocal turns than control dyads. The whole interactive sequence is marked by
fewer phrase units in the borderline sample, probably due to a lack of clear segmentation in the
mothers’ speech. However, the average duration of a phrase unit is similar in the two groups
(3.4 s versus 3.1 s). An interesting difference concerns the number of semantic–prosodic repetitions in the
mother’s speech to her infant—that is, the number of times she repeats the same word or group
of words with almost unvarying prosody. We found this number to be much higher among mothers with
borderline personality disorder, who shift to a new topic less frequently than control
mothers. We also found that borderline mothers and their infants vocalize at the same time much less
than control dyads, suggesting an overall lack of musicality and positive engagement.
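The size of these group differences can be made concrete with simple arithmetic on the means reported in Table 14.3. The following is only an illustrative sketch, not part of the original analysis: the dictionary keys and function names are hypothetical, and the ratios are purely descriptive. With six dyads per group they carry no inferential weight.

```python
# Group means transcribed from Table 14.3 as (borderline dyads, control dyads).
# The labels and helper names below are illustrative assumptions.
TABLE_14_3 = {
    "turns between mother and infant": (4.0, 7.83),
    "phrase units": (10.8, 17.2),
    "length of phrase units (s)": (3.4, 3.1),
    "semantic-prosodic repetitions": (34.0, 21.0),
    "semantic shifts": (8.0, 12.0),
    "simultaneous vocalization (ms)": (960.0, 4800.0),
}


def control_to_borderline_ratio(borderline, control):
    """Return how many times larger the control mean is than the borderline mean."""
    return control / borderline


def summarize(table):
    """Map each measure to its control/borderline ratio, rounded to two decimals."""
    return {
        name: round(control_to_borderline_ratio(borderline, control), 2)
        for name, (borderline, control) in table.items()
    }


if __name__ == "__main__":
    for name, ratio in summarize(TABLE_14_3).items():
        print(f"{name}: control/borderline = {ratio}")
```

A ratio above 1 means the control dyads show more of the measure; a ratio below 1, as for semantic–prosodic repetitions, means the borderline dyads show more.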
In the following section, examples from a qualitative analysis of these interactions are pre-
sented. They illustrate the most striking finding of this series of pilot studies: that borderline
mothers tend to adhere to an overly rigid and repetitive interactive style, one that lacks the flexi-
bility and expressiveness that we see as fundamental aspects of communicative musicality. It is an
interactive style that restrains the potential for creative engagement to such an extent that infants
must either retract and withdraw or furiously express their frustration (see clinical case studies
by Apter-Danon 2004).
14.5.2.1 Being stuck and unstuck in time: examples from acoustic analyses
Looking at these interactions through the temporal lenses of acoustic analysis, it became clear
that borderline mothers present highly specific and distinguishing vocal interactive styles. We
were surprised to find strong similarities between the borderline dyads, despite the surface vari-
ability in the mothers’ clinical presentations. Further studies are under way to test and extend
findings derived from these preliminary acoustic analyses (Delavenne et al. in press).
Table 14.3 Results from the acoustic analysis of vocal interactions between mothers with borderline
personality disorder (BPD) and their infants and control mothers and infants

Measure                                                         Borderline dyads   Control dyads
                                                                Mean (SD)          Mean (SD)
Mean number of turns between mother and infant                  4 (4)              7.83 (4.5)
Mean number of phrase units                                     10.8 (8.4)         17.2 (1.6)
Mean length of phrase units (s)                                 3.4 (2.5)          3.1 (1.8)
Mean number of semantic–prosodic repetitions                    34                 21
Mean number of semantic shifts                                  8                  12
Mean duration of simultaneous mother–infant vocalization (ms)   960                4800

Fig. 14.5 Pitch plot representing 27 seconds of a vocal interaction between a ‘borderline’ mother
and her 3-month-old infant, Lois, showing phrase durations in milliseconds. [Axes: pitch (Hz), ticks from 50 to 700, against time (s), 0 to 27. Markers: phrase boundary, infant vocalization, simultaneous vocalization (pitch not shown).]
Phrase durations (ms): Phrase 1: 2820; Phrase 2: 3540; Phrase 3: 3540; Phrase 4: 3890; Phrase 5: 5530; Phrase 6: 3990; Phrase 7: 3530.
TRANSCRIPT
Bon:jour petit bonhomme (Phrase 1) bon:jour petit bonhomme dit le soleil (Phrase 2)
bon:jour petit bonhomme dit le soleil (Phrase 3) bon:jour petit bonhomme dit le soleil
(Phrase 4) bon:jour – [infant vocalization] oooh:: ça va ? (Phrase 5) bon:jour petit
bonhomme dit le soleil (Phrase 6) bon:jour petit bonhomme dit le soleil (Phrase 7)

Based on detailed acoustic analysis of 2-minute segments of vocal interaction of six borderline
and six control dyads, we observed that infants of borderline mothers vocalize much less,
and that borderline mothers’ vocal expressions present high degrees of repetitiveness, at the
semantic, the prosodic and the temporal levels. Figure 14.5 is an example of the combined seman-
tic, prosodic and temporal repetitiveness of a borderline mother’s speech to her infant. This spec-
trograph depicts 27 seconds of vocal exchange between 3-month-old Lois and his mother. The
mother repeats the same phrase over and over, a phrase that she in fact repeats many more times
over the course of the ten minutes of interaction we recorded. The temporal organization of these
27 seconds is marked on Figure 14.5 through the addition of vertical lines. It appears that the
phrasing of this segment of exchange is extremely static, as compared with that of a control dyad.
It is also quite clear from this spectrograph that there is remarkably little prosodic variation
between one repeated phrase and the next. Figure 14.6 highlights the durations and prosodic con-
tours of two almost identical consecutive phrases (phrases 6 and 7 in Figure 14.5).
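The temporal regularity described here can be quantified from the seven phrase durations printed in Figure 14.5. The sketch below is illustrative only and is not the authors' method. It assumes that all seven durations are in milliseconds (only the first is labelled ‘ms’ in the figure), and the function names are hypothetical. No control-dyad durations are reported in the text, so it offers no group comparison.

```python
from statistics import mean, pstdev

# Phrase durations transcribed from Fig. 14.5 (phrases 1 to 7).
# Assumption: all values are in milliseconds; only phrase 1 is labelled "ms".
PHRASE_MS = [2820, 3540, 3540, 3890, 5530, 3990, 3530]


def coefficient_of_variation(durations):
    """Population standard deviation divided by the mean (a unitless spread measure)."""
    return pstdev(durations) / mean(durations)


def successive_differences(durations):
    """Absolute change in duration from each phrase to the next."""
    return [abs(later - earlier) for earlier, later in zip(durations, durations[1:])]


if __name__ == "__main__":
    # Phrase 5 is the one the infant interrupts; compare the spread with and without it.
    without_phrase_5 = PHRASE_MS[:4] + PHRASE_MS[5:]
    print("total duration (ms):", sum(PHRASE_MS))
    print("successive differences (ms):", successive_differences(PHRASE_MS))
    print("CV, all phrases:", round(coefficient_of_variation(PHRASE_MS), 3))
    print("CV, without phrase 5:", round(coefficient_of_variation(without_phrase_5), 3))
```

The durations sum to 26 840 ms, consistent with the 27-second segment. Phrases 2, 3 and 7 differ by at most 10 ms, and most of the variability in the sequence comes from phrase 5, the one to which the infant contributes.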
Fig. 14.6 Superimposed prosodic contours of phrases 6 and 7. [Axes: pitch (Hz), ticks from 50 to 700, against time (s), 0 to 2.]
Figure 14.7 Phrases 4 and 5 of the vocal interaction between Lois and his mother. [Axes: pitch (Hz), ticks from 50 to 700, against time (s), 0 to about 9.4; phrases 4 and 5 are labelled. Markers: phrase boundary, infant vocalization, simultaneous vocalization (pitch not shown).]
Figure 14.7 is a close-up of the fourth and fifth phrases of the 27-second segment shown
in Figure 14.5. Examination of the fifth phrase, to which the infant unexpectedly contributes,
reveals an interesting interactive pattern. At the beginning of phrase 5, the mother is about
to restate her repeated utterance ‘bonjour petit bonhomme, dit le soleil’, but the infant emits a
short vocalization exactly on cue at the same time as she does (shown in Figure 14.7 by the square
indicating a simultaneous vocalization). This appears to interrupt the mother momentarily but
she catches up and concludes with the ascending–descending part of the prosodic contour of her
utterance. The infant vocalizes again, more forcefully, right at the cusp of the prosodic turn in
‘bon-jour’. This second vocalization appears to disrupt the mother’s output and, in her stead, the
infant produces a long flat vocalization which is, prosodically and temporally, remarkably in line
with the contour of what was meant to follow from the mother: ‘petit bonhomme’. By inserting
his own collaborative utterance, the infant seems to set up the opportunity for a dialogue, but,
in order not to risk a complete breakdown of the exchange, he must strictly adhere to the
pre-established pattern. Lois’ effort is temporarily successful: after his long vocalization, his mother
imitates him with a matching long and flat non-verbal coo that is also consistent with the pre-
established prosodic and temporal template. The exchange occurs within the space of a phrase
and may be considered meaningful and motivating for the infant. In subsequent phrases, how-
ever (not presented in the figures), the mother picks up where she left off and the infant is again
‘left in the lurch’, with very little opportunity to participate in an interaction.
In a healthy interaction, the infant must anticipate moments when it is meaningful and/or
motivationally appropriate for him to vocalize. There are generally multiple possible points of
entry in an exchange, both for infant and for mother, and at any of these anticipated possible
points of entry, the infant again has multiple expressive possibilities. For example, a soft, long
vocalization might fit into the overall expressive pattern of the exchange just as well as a series of
shorter coos. There is room for expressive variation within the framework of the exchange. It is
by nature a polysemic co-construction, and its subtle meaning and intent unfolds as it is explored.
The case that we have just examined, although only a fragment, illustrates a pattern that emerges
repeatedly in our data: Lois seems to sense a possible point of entry into the ‘exchange’. But his
expressive potential appears to be severely curtailed by the inbuilt rigidity and repetitiveness of the
interactive format established by the mother. It is quite remarkable not only that he succeeds in
generating a short vocal exchange, but also that he should have the will and know-how to do so. We believe that
this illustrates how a baby can be drawn into the maladaptive behaviour of the mother.
14.5.3 Authenticity and belonging

People with borderline personality disorder are often described as lacking authenticity. Their
shifting moods and emotions and their impulsive behaviour make it very difficult for others to
anticipate their behaviour or state of being. This difficulty with anticipation creates for the other
a strong sense of interpersonal disconnection that often goes hand in hand with an impression of
being intentionally manipulated. We have tried to allude to the particular form of alienation
experienced and manifested by the BPD individual: an alienation from intimate intersubjective
encounters founded on a difficulty with sharing inner time and on an essentially disrupted, dis-
continuous sense of inner time.
BPD mothers’ disrupted personal time-flow hinders their ability to put their experiences into
narrative form, to connect the narrative threads through which the actions of a coherent and
credible ‘self’ can be projected. Many researchers have highlighted the link between narrative and
identity (Gergen and Gergen 1988; Ochs and Capps 1997; Ricoeur 1991). In early childhood, lan-
guage is acquired together with a deepening sense of personal identity through the recounting of
personal and communal stories (Nelson 1989). Only certain types of experiences are reportable
within the narrative format (Labov 1982; Ochs and Capps 1997). In general, the recounting of
personal experiences is entwined with a motivation and effort to make the shared narrative cred-
ible and coherent for others, as well as affectively acceptable. At the same time, this activity of
trustworthy, authentic storytelling produces the very matter from which our memories are made
and constitutes the core mechanism of self-knowledge and projection. Individuals who do not
convey this stance towards authenticity in their narrative recounting are perceived as lacking a
cohesive sense of self or as deceitful (Gergen and Gergen 1988). Borderline individuals appear
inauthentic to others both because of their highly changeable and unpredictable ways of being
and because of a difficulty in organizing their experiences into credible narrative form. By delin-
eating and encapsulating specific events in the past and connecting them with enduring selves
and potential futures, narrative forms weave time. The disconnected time in which the BPD indi-
vidual lives fosters a general lack of meaningfulness and of a unified, articulated and flowing self.
It is not surprising that young infants lose trust and confidence in their borderline mothers,
becoming themselves aloof, quiet and inexpressive (Apter-Danon 2004). At the same time our
microanalyses of face-to-face interaction provide examples of highly resilient infants who work
hard to pick up and to reconnect the buds of narrative line in their mothers’ monotonous expres-
sions. Some succeed and their mothers might then learn through them to trust and be trusted in
the building of relationships (see the clinical case studies in Apter-Danon 2004). Others fail but
do not necessarily give up for good. The case of 3-month-old Lois provides such an example.
Jazz musicians and other musicians are particularly preoccupied with the issue of authenticity,
and we may once again draw insights from their experiences. We have already discussed how a
‘moral’ dimension in jazz playing is derived from principles of personal honesty and sincerity
(Duranti and Burrell 2004). There is another equally important sense in which a performance or
a composition can be considered ‘inauthentic’. Music is most often thought of as inauthentic if it
is not clearly affiliated with or rooted in a known or familiar tradition. The history of jazz is intimately
linked with these issues of authenticity, and goes far beyond musical tradition into questions of
race and belonging (MacDonald et al. 2002). To some, it still appears inauthentic for white
musicians to play jazz (despite the recognized talent of these musicians). Similarly today,
hip-hop culture, which was born largely from racial tension and economic strife during the 1970s, sits
uneasily within white, middle-class contexts. At the same time, it plays with issues of appropria-
tion and authenticity through sampling and manipulation of black and white musical forms
(Keil and Feld 1994). However, many musicians seem to argue that to play a particular kind of
music requires a credible claim on it by the musicians, through cultural transmission, heredity or
authentic motivated practice (Maira 2002; Monson 1996). Thus, authenticity in music is impor-
tant both at personal and subcultural levels. However, this commitment to authenticity has been
severely criticized by postcolonial theorists who argue in favour of multiplicity and heterogene-
ity. An increasingly important question raised by such debates is whether musicians have to
belong to be authentic. This question can easily be reversed.
The reason why borderline mothers have trouble partaking in the musicality of interaction,
improvising and creating new forms of intimacy, we suggest, is that they do not have a clear feel-
ing of belonging. We further propose that this is due primarily to a fragmented, discontinuous
sense of inner time. Without shareable inner time, mother and infant lack a protohabitus that
constitutes the impulse for creative dynamic exchange. Many researchers have described the
strong parallels between mother–infant and therapist–patient interactions (Beebe and Lachmann
2002; Stern 2004). Therapy with borderline patients that focuses on the negotiation of shared
timing and on building implicit ways-of-being-together may provide sufficient holding for a
mother to launch musical dialogues with her infant (for descriptions of therapeutic treatment,
see Apter-Danon 2004).
14.6 Conclusions
The fine-grained acoustic analysis of spontaneous vocal interaction between mothers and infants
opens spaces for understanding how expression unfolds within shared temporal frames to
sustain both a sense of belonging and a sense of adventure. By presenting research based on sim-
ilar methodologies but associated with fundamentally different theoretical fields and practical
outcomes, we have argued that communicative musicality—and particularly its improvisational
quality—is intimately linked to the experience of belonging. In both studies, we highlighted the
dialectic relationship between musicality and belonging, a relationship that appears to hold true
for musical improvisation. A sense of belonging, or of sharing implicit and embodied ways-
of-being-together, constitutes the springboard from which creative variations can take form; and
at the same time, it is through new and efficient forms of expression that belonging and what we
have called protohabitus are dynamically renewed.
In both the studies—of troubled immigrant mothers and of mothers with borderline person-
ality disorder—we showed that when the mother experiences a loss of belonging, vocal interac-
tion between mother and baby can lose its improvisational vitality, becoming highly repetitive
and predictable. We further suggest that the difficulty mothers experience with their sense of
belonging is closely related to their sense of time and narrative. Immigrant mothers who feel
uprooted live temporarily in a disconnected world; they need time to reconnect the time of the
place they came from with that of the place they came to, and to spin new stories in which one
cultural self falls into step with another. Nostalgia thickens time, makes it feel heavy and
slow-moving, and bitterly highlights the irreversibility of time (Jankélévitch 1974). Borderline
mothers suffer primarily from a difficulty with their inner sense of time and the production of
personal narrative. Here, it is a disruption of self in time, not place in time, which undoes the
sense of belonging.
Further research is clearly needed in this area, and these findings must be taken as pointers
towards more detailed hypotheses. Our thesis supports other researchers’ views that musicality in
interaction is a fundamentally humanizing activity through which we continually live. The study
of musicality constitutes a unique means of accessing and unravelling pathological as well as
healthy interaction, for infants and for adults. These studies make it quite clear that new impulses
can be given to dyads that have lost their musicality, and that musicality cuts the most natural
path out of suffering and loneliness.
Acknowledgements
The research reported here on the interactions of borderline mothers and their infants was
financed by the French Ministry of Health (PHRC).
References
American Psychiatric Association (2000). Diagnostic and statistical manual of mental disorders, revised 4th
edn. American Psychiatric Association, Washington DC.
Anzieu D (1995). Le Moi-Peau. Dunod, Paris.
Apter-Danon G (2004). De l’insubjectivité à l’intrapsychique: Etude des interactions précoces des mères
‘borderlines’ et de leurs bébés de 3 mois. Unpublished doctoral dissertation, Université Paris VII.
Apter-Danon G and Candillis D (2005). A challenge for perinatal psychiatry: Therapeutic management of
maternal borderline personality disorder and their very young infants. Clinical Neuropsychiatry,
2(5), 302–314.
Bailey D (1992). Improvisation: Its nature and practice in music. Da Capo Press, New York.
Baldwin DA, Baird JA, Saylor MM and Clark MA (2001). Infants parse dynamic action. Child
Development, 72(3), 708–717.
Baruch C and Drake C (1997). Tempo discrimination in infants. Infant Behavior and Development,
20(4), 573–577.
Becker H (2000). The etiquette of improvisation. Mind, Culture, and Activity, 7(3), 171–176.
Beebe B, Alson D, Jaffe J, Feldstein S and Crown C (1988). Vocal congruence in mother–infant play.
Journal of Psycholinguistic Research, 17, 245–259.
Beebe B and Gerstman L (1980). The ‘packaging’ of maternal stimulation in relation to infant facial-visual
engagement: A case study at four months. Merrill-Palmer Quarterly, 26(4), 321–339.
Beebe B, Jaffe J, Feldstein S, Mays K and Alson D (1985). Inter-personal timing: The application of an
adult dialogue model to mother–infant vocal and kinesic interactions. In FM Field and N Fox, eds,
Beebe B and Lachmann F (2002). Infant research and adult treatment: Co-constructing interactions.
Analytic Press, Hillsdale, NJ.
Beebe B, Stern DN and Jaffe J (1979). The kinesic rhythms of mother–infant interactions. In AW
Siegman and S Feldstein, eds, Of speech and time: Temporal speech patterns in interpersonal contexts,
pp. 23–24. Erlbaum, Hillsdale, NJ.
Berliner PF (1994). Thinking in jazz: The infinite art of improvisation. The University of Chicago Press,
Chicago, IL.
Berry JW, Poortinga YH, Segall MH and Dasen PR (1992). Cross-cultural psychology: Research and
applications. Cambridge University Press, Cambridge.
Bjørkvold J-R (1992). The muse within: Creativity and communication, song and play from childhood
Boersma P and Weenink D (2000). Praat: A system for doing phonetics by computer.
http://www.fon.hum.uva.nl/praat/
Bourdieu P (1977). Outline of a theory of practice. Cambridge University Press, Cambridge.
Brazelton TB, Koslowski B and Main M (1974). The origins of reciprocity: The early mother–infant
interaction. In M Lewis and LA Rosenblum, eds, The effect of the infant on its caregiver, pp. 49–76.
Wiley, New York.
Bruner JS (1979). Learning how to do things with words. In D Aronson and R Rieber, eds, Psycholinguistic
research, pp. 265–284. Erlbaum, Hillsdale, NJ.
Bruner JS (1987). Life as narrative. Social Research, 54(1), 11–32.
Bruner JS (1990). Acts of meaning. Harvard University Press, Cambridge, MA.
Bruner JS (2002). Making stories: Law, literature, life. Farrar, Straus and Giroux, New York.
Bruner J and Sherwood V (1975). Early rule structure: The case of peekaboo. In JS Bruner, A Jolly and
K Sylva, eds, Play: Its role in evolution and development, pp. 277–285. Penguin, Harmondsworth.
Burke K (1945). A grammar of motives. Prentice-Hall, New York.
Clarke EF (1989). The perception of expressive timing in music. Psychological Research, 51, 2–9.
Condon WS (1982). Cultural microrhythms. In M Davis, ed., Interaction rhythms: Periodicity in
communicative behavior, pp. 77–102. Human Sciences Press, New York.
Cross I (2001). Music, cognition, culture and evolution. Annals of the New York Academy of Sciences,
930, 28–42.
Csikszentmihalyi M (1990). Flow: The psychology of optimal experience. Harper Perennial, New York.
Delavenne A, Gratier M, Devouche E and Apter-Danon G (in press). Phrasing and fragmented time in
‘pathological’ mother–infant vocal interaction. In M Imberty and M Gratier (eds), Musicae Scientiae.
Special Issue ‘Narrative in music and interaction’.
Deleuze G and Guattari F (1987). A thousand plateaus. University of Minnesota Press, Minneapolis, MN.
Dissanayake E (2000a). Antecedents of the temporal arts in early mother–infant interaction. In N Wallin,
Dissanayake E (2000b). Art and intimacy: How the arts began. University of Washington Press, Seattle, WA.
Dominey PF and Dodane C (2004). Indeterminacy in language acquisition: The role of child-directed
speech and joint attention. Journal of Neurolinguistics, 17, 121–145.
Donald M (1993). Human cognitive evolution: What we were, what we are becoming. Social Research,
60, 143–170.
Donaldson M (1992). Human minds: An exploration. Penguin, London.
Drake C and Palmer C (1993). Accent structures in music performance. Music Perception, 10, 343–378.
Duranti A and Burrell K (2004). Jazz improvisation: A search for hidden harmonies and a unique self.
Ricerche di Psicologia, 3, 71–101.
Falk D (2004). Prelinguistic evolution in early hominins: Whence motherese? Behavioral and Brain
Sciences, 27, 491–503.
Feldstein S, Jaffe J, Beebe B, Crown CL, Jasnow M, Fox H and Gordon S (1993). Coordinated timing in
adult–infant vocal interactions: A cross-site replication. Infant Behavior and Development, 16, 455–470.
Fernald A and O’Neill DK (1993). Peekaboo across cultures: How mothers and infants play with voices,
faces, and expectations. In K MacDonald, ed., Parent–child play: Descriptions and implications,
pp. 259–285. State University of New York Press, Albany, NY.
Fernald A (1989). Intonation and communicative interest in mother’s speech to infants: Is the melody the
Fogel A (1988). Cyclicity and stability in mother–infant face-to-face interaction: A comment on Cohn and
Tronick (1988). Developmental Psychology, 24(3), 393–395.
Fraisse P (1982). Rhythm and tempo. In D Deutsch, ed., The psychology of music, pp. 149–180.
Gergen KJ and Gergen MM (1988). Narrative and the self as relationship. Advances in Experimental Social
Goodwin C (2003). Pointing as situated practice. In S Kita, ed., Pointing: Where language, culture and
cognition meet, pp. 217–241. Lawrence Erlbaum, Mahwah, NJ.
Gratier M (1999). Expression of belonging: The effect of acculturation on the rhythm and harmony of
mother–infant vocal interaction. Musicae Scientiae (Special Issue 1999–2000), 93–122.
Gratier M (2001). Rythmes et appartenances culturelles: Etude acoustique des échanges vocaux entre mères et
bébés autochtones et migrants. Unpublished doctoral dissertation. Université René Descartes (Paris V).
Gratier M (2003). Expressive timing and interactional synchrony between mothers and infants:
Cultural similarities, cultural differences, and the immigration experience. Cognitive Development,
18, 533–554.
Greenfield PM (1972). Playing peekaboo with a four-month-old: A study in the role of speech and
non-speech sounds in the formation of a visual schema. The Journal of Psychology, 8, 287–298.
Greenfield PM, Quiroz B and Raeff C (2000). Cross-cultural conflict and harmony in the social
construction of the child. New Directions for Child and Adolescent Development, 87, 93–108.
Gumperz JJ (1981). Ethnic differences in communicative style. In CA Ferguson and S Brice Heath,
eds, Language in the USA, pp. 430–445. Cambridge University Press, Cambridge.
Hall ET (1983). The dance of life: The other dimension of time. Anchor Press/Doubleday, Garden City, NY.
Halliday MAK (1975). Learning how to mean: Explorations in the development of language. Edward
Arnold, London.
Hane AA and Feldstein S (2005). The divergent functions of early maternal and infant vocal coregulation in
the growth of communicative competence. Paper presentation, Second Biennial Meeting, Society for
Research in Child Development, Atlanta, USA, April 7–10.
Hane AA, Feldstein S and Dernetz VH (2003). The relation between coordinated interpersonal timing and
maternal sensitivity with four-month-old infants. Journal of Psycholinguistic Research, 32(5), 525–539.
Husserl E (1964). The phenomenology of internal time-consciousness. Translated by JS Churchill. Indiana
University Press, Bloomington, IN.
Imberty M (1981). Les écritures du temps: Sémantique psychologique de la musique (tome 2). Dunod, Paris.
Imberty M (2005). La musique creuse le temps. De Wagner à Boulez: Musique, psychologie, psychanalyse.
L’Harmattan, Paris.
Iyer V (1998). Microstructures of feel, macrostructures of sound: Embodied cognition in West African and
African-American musics. Unpublished doctoral dissertation. University of California, Berkeley.
Jaffe J, Beebe B, Feldstein S, Crown CL and Jasnow MD (2001). Rhythms of dialogue in infancy.
Monographs of the Society for Research in Child Development, 66(2), (Serial No. 265).
James W (1992). Principles of psychology, vols 1 and 2. Dover, New York (Original work published 1890).
Jankélévitch V (1974). L’irréversible et la nostalgie. Flammarion, Paris.
Jousse M (1969). L’Anthropologie du Geste. Editions Resma, Paris.
Jusczyk PW and Krumhansl CL (1993). Pitch and rhythmic patterns affecting infants’ sensitivity to
musical phrase structure. Journal of Experimental Psychology: Human Perception and Performance,
19, 627–640.
Keil C and Feld S (1994). Music grooves. The University of Chicago Press, Chicago, IL.
Kerbrat-Orecchioni C (1994). Les interactions verbales, Tome III. Armand Colin, Paris.
Kernberg OF (1975). Borderline conditions and pathological narcissism. Jason Aronson, New York.
Kivy P (1990). Music alone. Cornell University Press, Ithaca, NY.
Krumhansl CL and Jusczyk PW (1990). Infants’ perception of phrase structure in music. Psychological
Science, 1(1), 70–73.
Kühl O (2007). Musical semantics. European Semiotics: Language, Cognition and Culture, No. 7.
Peter Lang, Bern.
Kuhl P (2004). Early language acquisition: Cracking the speech code. Nature Reviews Neuroscience, 5, 831–843.
Labov W (1982). Speech actions and reactions in personal narrative. In D Tannen, ed., Georgetown
University round table on language and linguistics 1981. Analyzing discourse: Text and talk, pp. 219–247.
Georgetown University Press, Washington, DC.
Langer S (1942). Philosophy in a new key: A study in the symbolism of reason, rite and art. Harvard
Langer S (1953). Feeling and form: A theory of art developed from philosophy in a new key. Routledge and
Kegan Paul, London.
Lynch MP, Oller DK, Steffens ML and Buder EH (1995). Phrasing in prelinguistic vocalisations.
Developmental Psychobiology, 28, 3–25.
Oxford.
Maira S (2002). Desis in the house: Indian American youth culture in New York City. Temple University
Press, Philadelphia, PA.
1999–2000), 29–57.
Malloch S (2005). Why do we like to dance and sing? In R Grove, C Stevens and S McKechnie, eds,
Thinking in four dimensions: Creativity and cognition in contemporary dance, pp. 14–28. Melbourne
University Press, Melbourne.
Markus J, Mundy P, Morales M, Delgado C and Yale M (2000). Individual differences in infant skills as
predictors of child–caregiver joint attention and language. Social Development, 9, 302–315.
Maslow AH (1971). The farther reaches of human nature. Penguin, New York.
Mauss M (1934). Sociologie et Anthropologie. PUF, Paris.
Meyer LB (1956). Emotion and meaning in music. The University of Chicago Press, Chicago, IL.
Monson I (1996). Saying something: Jazz improvisation and interaction. The University of Chicago Press,
Chicago, IL.
Nelson K (ed.) (1989). Narratives from the crib. Harvard University Press, Cambridge, MA.
Ochs E and Capps L (1997). Narrative authenticity. Journal of Narrative and Life History, 7(1–4), 83–89.
Ochs E and Capps L (2001). Living narrative. Harvard University Press, Cambridge, MA.
Papoušek H, Papoušek M and Bornstein MH (1985). The naturalistic environment of young infants:
On the significance of homogeneity and variability in parental speech. In TM Field and N Fox, eds,
Papoušek M (1996). Intuitive parenting: A hidden source of musical stimulation in infancy. In I Deliège
and J Sloboda, eds, Musical Beginnings: Origins and development of musical competence, pp. 88–112.
Polanyi M and Prosch H (1975). Meaning. The University of Chicago Press, Chicago, IL.
Reddy V (1991). Playing with others’ expectations: Teasing and mucking about in the first year.
In A Whiten, ed., Natural theories of mind: Evolution, development and simulation of everyday
mindreading, pp. 143–158. Blackwell, Oxford.
Ricoeur P (1983–1985). Temps et récit, vols 1–3. Editions Seuil, Paris.
Ricoeur P (1991). Soi même comme un autre: interview by G. Jarczyk. Revue du Collège International de
Philosophie, 1–2, 225–237.
Rochat P, Querido JG and Striano T (1999). Emerging sensitivity to the timing and structure of
protoconversation in early infancy. Developmental Psychology, 35(4), 950–957.
Rouget G (1994). La musique et la transe. Gallimard, Paris.
Sawyer KR (2000). Improvisational cultures: Collaborative emergence and creativity in improvisation.
Mind, Culture, and Activity, 7(3), 180–185.
Sawyer KR (2001). Creating conversations: Improvisation in everyday discourse. Hampton Press,
Cresskill, NJ.
Sawyer KR (2003). Group creativity: Music, theatre, collaboration. Lawrence Erlbaum, Mahwah, NJ.
1999–2000), 75–92.
Schögler BW (2002). The pulse of communication in improvised jazz duets. Unpublished doctoral
dissertation, The University of Edinburgh, U.K.
Schutz A (1962). Collected papers, vol. 1. Edited by Arvid Brodersen. Martinus Nijhoff, The Hague.
Stern DN (1982). Some interactive functions of rhythm changes between mother and infant. In M Davis,
ed., Interaction rhythms: Periodicity in communicative behavior, pp. 101–117. Human Sciences Press,
New York.
infant’s social experience. In P Rochat, ed., Early social cognition: Understanding others in the first
months of life, pp. 67–90. Erlbaum, Mahwah, NJ.
Stern DN (2000). Putting time back into our considerations of infant experience: A microdiachronic view.
Infant Mental Health Journal, 21(1–2), 21–28.
Stern DN (2004). The present moment in psychotherapy and everyday life. Norton, New York.
Stern DN, Beebe B, Jaffe J and Bennett SL (1977). The infant’s stimulus world during social interaction:
A study of caregiver behaviors with particular reference to repetition and timing. In HR Schaffer, ed.,
Studies in mother–infant interaction, pp. 177–202. Academic Press, New York.
Stork HE (1994). Gestes de maternage en situation d’immigration. Bulletin de Psychologie, XLVIII(419),
278–287.
Tomasello M (1999). The cultural origin of human cognition. Harvard University Press, Cambridge, MA.
Tomasello M, Kruger AC and Ratner HH (1993). Cultural learning. Behavioral and Brain Sciences, 16(3),
495–552.
understanding of infants. In D Olson, ed., The social foundations of language and thought: Essays in
honour of J.S. Bruner, pp. 316–342. W.W. Norton, New York.
Trevarthen C (1986). Brain science and the human spirit. Zygon, 21, 161–200.
culture. In G Jahoda and IM Lewis, eds, Acquiring culture: Ethnographic perspectives on cognitive
Trevarthen C (1993). Predispositions to cultural learning in young infants. Behavioral and Brain Sciences,
16, 534–535.
Trevarthen C (1994). Infant semiosis. In W Nöth, ed., Origins of semiosis: Sign evolution in nature and
culture, pp. 219–252. Mouton de Gruyter, New York/Berlin.
meaning in the first year. In A Lock, ed., Action, gesture and symbol: The emergence of language,
pp. 183–229. Academic Press, London.
between contradictory messages in face-to-face interaction. American Academy of Child Psychiatry,
17, 1–13.
Tronick EZ and Cohn JF (1989). Infant-mother face-to-face interaction: Age and gender differences in
coordination and the occurrence of miscoordination. Child Development, 60, 85–92.
Tronick EZ and Weinberg MK (1997). Depressed mothers and infants: Failure to form dyadic states of
consciousness. In L Murray and PJ Cooper, eds, Postpartum depression and child development,
pp. 54–81. The Guilford Press, London.
Watson JS (1979). Perception of contingency as a determinant of social responsiveness. In E Thoman, ed.,
The origins of social responsiveness, pp. 33–64. Erlbaum, Hillsdale, NJ.
Watson JS (1985). Contingency perception in early social development. In TM Field and NA Fox, eds,
Winnicott DW (1971/1992). Playing and reality. Brunner-Routledge, Hove, East Sussex.
Part 3
Musicality and healing

Music and dance can transport us to a happier, more ‘inclusive’ disposition, and in Part 3, it is
argued that music and dance can do much more than that – the skilful engagement of our musi-
cality can bring healing to our sense of self. Engaging with a person’s musicality ‘attunes to the
essential efforts that the mind makes to regulate the body, both in its inner neurochemical,
hormonal and metabolic processes, and in its purposeful engagements with the objects of the
world, and with other people’ (Trevarthen and Malloch 2000). Music and dance are an especially
potent form of interaction and healing for those who are beyond the reach of talk.
Children who have suffered war (Chapter 15), children from dysfunctional families (Chapter 16),
a child who has been subjected to overwhelming abuse (Chapter 17), children who are
deaf-blind (Chapter 18), and children whose own development suddenly turns against them
(Chapter 19)—all are shown to benefit significantly from the skilful therapeutic use of musicality
in music and dance. Nigel Osborne (Chapter 15) tells of his experience in Bosnia-Herzegovina of
bringing music to children suffering loss, physical and psychological damage and post-traumatic
stress disorder due to the 1992–1995 war. Interwoven through his tale is the development of
a biopsychosocial model of the benefits of musical experiences for children in situations of
conflict. The children’s responses were immediate and unambiguous.
It was not unusual for generally melancholic and reticent groups to leave a [music-making] session
laughing and dancing, or groups containing large numbers of hyperactive children to leave calmed
and focused. Sometimes it seemed that the experience of music had some kind of hotline to the biology
and psychology of traumatized children.
Mercédès Pavlicevic and Gary Ansdell (Chapter 16) demonstrate the importance of commu-
nity music-making—‘collaborative musicing’. They point to a problem—traditional psychology
prioritizes the individual, and thus supports a somewhat asocial, acultural perspective. In their
examples, they illustrate
concrete situation[s] where people can uniquely meet in and through music. These musicing events
stand as beacons of hope against all of the precluding factors (illness, social inequality, fear, and
cultural fragmentation) that now militate against human companionable or communal meeting.
Similarly, Karen Bond (Chapter 18) develops a model of the empowering nature of the
‘aesthetic community’ for a group of deaf-blind children: an aesthetic belonging accessed
through the medium of dance therapy. In her belief that all people are predisposed to aesthetic
experience, she wonders ‘if an entrenched tendency to valorize verbal intelligence makes us miss
something in children such as those presented here … perhaps in many people … perhaps in
ourselves’ (page 417, ellipses in original).
The authors of Chapters 17 and 19 are concerned with the therapeutic use of music with
individual children. Jacqueline Robarts (Chapter 17) tells the story of a young child whose
early sexual abuse was so extensive that she became psychotic, unable to communicate meaningfully with
others: ‘When the intersubjective sense of self is devastated at its core by such early relational
trauma, music, used with clinical perception, may reach a child and work constructively in an
evolving, musically mediated therapeutic relationship’ (page 377). Three years into a total of
seven years of regular music therapy, the child began to express her emotions in ways that were
no longer overwhelming for her: ‘She was increasingly able to reflect on her feelings, using com-
plete sentences in a normal tone of voice, rather than fragmented words screamed or whispered
as though triggered by a hallucination in which the past invaded the present’ (page 395). Tony
Wigram and Cochavit Elefant (Chapter 19) describe the role of music therapy for assessment and
treatment of children with severe developmental disorders – autism and Rett syndrome. For Rett
children, even though they experience a drastic developmental regression, music ‘can be helpful
in developing social relatedness, attention, primary communication and in stimulating move-
ment, functional hand usage and learning’ (page 429).
The chapters in Part 3 showcase the extraordinary resilience of the human spirit, even within
the most profound experiences of human isolation. Our shared musicality can function as a
bridge between those called to help and those in need.
Reference
Chapter 15
Music for children in zones of conflict and post-conflict: A psychobiological approach
Nigel Osborne
15.1 Introduction
15.1.1 Historical background
The idea of using the creative arts to help children deal with experiences of danger and conflict is
by no means new. Therapeutic art was a focus of activity for the children’s colonies (colonias
infantiles) of the Spanish Civil War (1936–39), and work of courage and vision was initiated by
artists for children in the camps and ghettos of Nazi-occupied Europe; examples include painter
Friedl Dicker-Brandeisová and composer Hans Krása in Terezin (Theresienstadt), and physician
Henryk Goldszmit, alias writer ‘Janusz Korczak’, in the Warsaw ghetto.
A very different wave of creative arts intervention, however, emerged from the conflicts and
genocides of the last two decades of the twentieth century. This wave of intervention has been
relatively systematic, with an emerging methodology, and the initiatives of individuals have
been embraced by bodies such as non-governmental organizations (NGOs), governmental
agencies and professional associations. I have been closely involved with one strand of this
development, based principally on music, which began in Bosnia-Herzegovina in 1993. I offer a
brief overview of the difficulties faced by children during and after war, followed by a short
history of the project as background to a proposed approach to the intervention and a paradigm
for intervention, research and development.
15.1.2 Children, war and trauma

War may have devastating consequences for children—for their physical and mental health, as
well as for their social, economic and political well-being. There are problems of severe physical
trauma for children who are the direct or indirect victims of gunfire, mortars, mines, chemical
weapons and other munitions; there are also the indirect physical consequences of the conditions
of war, including problems of disease and nutrition. These are areas where creative-arts interven-
tions are, naturally, of limited use at the initial, emergency stages (in certain cases and conditions
there may be a marginal, complementary role in distraction and relaxation). They are of most use
in palliative care and phases of rehabilitation: aspects of these will be discussed below.
The most prevalent outcome for children of the experience of war, however, is mental/
emotional trauma, which may or may not be linked to physical traumas. The most common
form of this trauma is defined clinically as post-traumatic stress disorder (PTSD). The Diagnostic
and statistical manual of mental disorders, 4th edition (DSM-IV; American Psychiatric Association
1994) identifies four diagnostic criteria:
A. Exposure to a profoundly traumatic event, including sensations of terror, horror or helplessness;
B. The subsequent recall and re-experiencing of the event, including intrusive recollections,
distressing dreams, acting or feeling as if the event were recurring, psychological distress at
exposure to cues or physiological reactivity at exposure to cues (one symptom needed for
diagnosis);
C. Avoidance and numbing symptoms, including avoidance of thoughts or feelings, avoidance
of activities, places or people, inability to recall important aspects of the trauma, diminished
interest in activities, detachment or estrangement, restricted range of affect and a sense of a
foreshortened future (three symptoms needed);
D. Hyperarousal symptoms, including difficulty falling or staying asleep, irritability or out-
bursts of anger, difficulty concentrating, hypervigilance and exaggerated startle response
(two symptoms needed).
For diagnosis, these symptoms must persist for more than one month and cause clinically
significant distress or impairment in social or occupational functioning. There may be associated
features, usually more prevalent or detectable in adults than in children, including guilt over acts
of commission or omission, survivor guilt, reduction in awareness of surroundings, ‘derealization’
and ‘depersonalization’. Significantly, for the purposes of this chapter, there are clear physio-
logical and neurophysiological indicators associated with PTSD in both adults and children,
including accelerated heart rate, marginal increase in systolic and diastolic blood pressure,
heart-rate variability, delayed heart-rate recovery from shock, respiratory irregularities,
dysregulation of hormonal and neurotransmitter systems related to stress and relaxation, and altered
movement repertoires. PTSD may also overlap with or be comorbid with other difficulties,
including major depression and attention deficit hyperactivity disorder. This chapter presents
evidence for hypotheses that music may have a modest role to play in helping children to deal with
both the psychological and physiological symptoms of mental trauma, in both initial emergency
and palliative and rehabilitation phases (cf. Robarts, Chapter 17, this volume).
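The symptom thresholds listed above (one symptom from cluster B, three from C, two from D, persistence beyond one month, and clinically significant distress or impairment) amount to a simple counting rule. The following is a minimal sketch of that rule only, written in Python; the class and field names are hypothetical, introduced here for illustration, and the sketch is in no sense a diagnostic instrument.

```python
from dataclasses import dataclass


@dataclass
class SymptomSummary:
    """Hypothetical summary of one assessment (names are illustrative)."""
    exposure: bool                 # criterion A: exposure to a traumatic event
    reexperiencing: int            # number of cluster B symptoms present
    avoidance_numbing: int         # number of cluster C symptoms present
    hyperarousal: int              # number of cluster D symptoms present
    duration_months: float         # how long the symptoms have persisted
    significant_impairment: bool   # clinically significant distress or impairment


def meets_listed_thresholds(s: SymptomSummary) -> bool:
    """Apply the counting rule as summarized in the text above."""
    return (
        s.exposure
        and s.reexperiencing >= 1      # one symptom needed
        and s.avoidance_numbing >= 3   # three symptoms needed
        and s.hyperarousal >= 2        # two symptoms needed
        and s.duration_months > 1.0    # more than one month
        and s.significant_impairment
    )
```

A summary with one B symptom, three C symptoms and two D symptoms lasting six weeks with impairment satisfies the rule; the same summary with only two C symptoms does not.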
Unfortunately, the influence of the physical and mental traumas of war may reach beyond the
initial victims. Children who are not the direct victims of conflict may experience loss, mental
trauma and dysfunction among surviving families and carers, trauma in society at large, and
continuing social tensions and economic difficulties, all of which may create the conditions
for secondary trauma. To these problems may be added the toll of stress-related illnesses in
the adult population. In Bosnia-Herzegovina, for example, there have been significant rises in
the incidence of acute coronary syndrome, cancers and perinatal mortality both during the
war (1992–95) and following the war (Bergovec et al. 2005; Drljević and Mehmedbašić 2005;
Nermina 2005; Tomić and Galić 2005; Fatusić et al. 2005).
At the non-clinical extreme of children’s difficulties in situations of conflict, there are the
relatively simple consequences of extreme unhappiness, humiliation, fear and loss—including
grief, mourning, sadness, lack of self-esteem, erosion of sense of identity, loss of hope, lack of
trust, anger and reluctance to communicate. Sometimes these feelings and behaviours are
difficult to distinguish from the clinical symptoms of PTSD, and there is much debate among
fieldworkers about where precisely trauma begins and ends. Some argue that it may be unwise to
medicalize simple and all too natural human reactions. Others suggest that these behaviours and
symptoms may be seen as part of a continuum, or points along a set of parallel continua relating
to different symptoms, with thresholds where clinical diagnosis and intervention may become
helpful for the children. The advantage of music in these circumstances is that it is richly polyvalent,
and that if the intervention is implemented sensitively, it may serve the whole continuum of
needs in a safe and reliable manner.
MUSIC FOR CHILDREN IN ZONES OF CONFLICT AND POST-CONFLICT: A PSYCHOBIOLOGICAL APPROACH 333
15.1.3 Background to the intervention

The first phase of the project comprised a small number of creative workshops for children
organized in collaboration with local artists in the besieged city of Sarajevo. The decision to start
the work was the result of a subjective but shared perception. There were no diagnoses of
children’s mental health available at the time; but a study of 1,505 children in Sarajevo in 1994
found that 48 per cent of girls and 38 per cent of boys over the age of 13, and 38 per cent of girls
and 34 per cent of boys under 13 had PTSD (Husain 2000). A longitudinal study carried out
between 1993 and 1997 revealed that 78 per cent of children had experienced traumatic events
(Đapić and Stuvland 2000). In a survey of 364 internally displaced children in Central Bosnia
conducted by the Department of Social Medicine, Harvard, 94 per cent of children from the
Sarajevo region met the criteria of DSM-IV for PTSD (Goldstein et al. 1997).
The objectives of the work were to offer children and young people a diversion and distraction
from the brutalized conditions of life in the city, and opportunities for creative expression,
relaxation and joy. In the initial stages, the workshops involved musical creative work: creative musical
games, composing melodies, and inventing musical textures and narratives. Later, more emphasis
was placed on performance and collaboration with other art forms. The reactions of the children
to the work were strongly positive, and their parents highly supportive. This first phase
culminated in ambitious performances in the National Theatre and Chamber Theatre during ceasefires
in the first months of 1995.
Meanwhile, a second phase of work was already under way in Mostar. The Washington
Agreement of March 1994 had brought an end to major hostilities, and schools were under
reconstruction by the European Union. However, on the Bosniak side of the new Federation,
few teachers of creative arts remained. Following pilot work and negotiations with the Ministry
of Education early in 1995, it was agreed that the project would take over music and other
creative arts hours in the primary school timetable, to deliver both the essentials of the national
curriculum and therapeutically directed content.
Several important collaborations were established at this stage. The first was with the charity
War Child UK, which planned to build a music centre in Mostar, and a second with Apeiron,
a group of young artists in East Mostar who were to form the basis of a trainee work force.
Finally, music students from the University of Edinburgh and later from the Hannover-based
group Musiegt were recruited, alongside individual volunteers, to support the work in both short
and long-term placements.
The content of the work was determined by the agreement with the ministry. Sessions were
planned to take account of the national curriculum as far as was possible. The therapeutic
content was based largely on song, drawn both from local traditions and from the resources of
world music. This was what the young Bosnian trainee volunteers were most comfortable doing,
and the repertoire offered opportunities for rich and safe aesthetic journeys for the children, and
the chance for both enlivenment and relaxation.
The objective was not to treat PTSD, but to help the population at large deal with a variety
of aspects of personal and social damage and loss, among which symptoms of PTSD played a
role for some children. The children’s responses were immediate and unambiguous. It was
not unusual for generally melancholic and reticent groups to leave a session laughing and
dancing, or groups containing large numbers of hyperactive children to leave calmed and
focused. Sometimes it seemed that the experience of music had some kind of hotline to the
biology and psychology of traumatized children, effecting certain immediate, direct and very
obvious changes. It would be fair to say that in the non-clinical therapeutic aspects of the work,
the children led the process by their responses (cf. chapters on music teaching and learning in
Part 4, this volume).
In 1997, War Child UK opened the Pavarotti Music Centre, and in 1998 a purpose-built
department of clinical music therapy opened within the Centre. Although the department grew
out of the experiences of the general therapeutic project, it was sui generis a new departure and an
independent concern that has documented its history in its own clinical and professional terms
(Lang et al. 2002). The lesson of the Mostar experience is that a strong clinical base is essential for
the proper functioning of an effective, generally therapeutic outreach programme.
The concerns of this chapter are with the general, non-clinical intervention. By the end of the
1990s, a rapidly growing War Child Netherlands had taken over responsibility from War Child
UK for such creative arts interventions, and the manifest success of the programme in Bosnia-
Herzegovina had led to requests to implement the work elsewhere. In 1999, work began in
Albania and Kosovo, and was soon followed by projects in Georgia, Chechnya, the Sudan, Sierra
Leone, Israel/Palestine and many other regions of Africa and Asia. War Child UK has now
resumed a leading role in the area of clinical music therapy in Mostar.
15.1.4 Research, assessment and responsibility

The principal problem was, and remains, that the intervention has evolved and expanded some
way ahead of reflection, assessment and research. I argue that this problematic deficit has been
acknowledged and dealt with in a responsible manner. The work continued only because of the
support and endorsement of ministries of health and education, psychosocial services, schools,
families and the children themselves. It was mentored by individuals with long and worldwide
experience of educational development and clinical and non-clinical therapeutic intervention.
Significantly, music itself is a generally secure and self-regulating human activity with well-
documented benefits for personal well-being and very few potentially damaging side-effects. The
simplest of sensitivities among workshop leaders to questions such as levels of musical and
acoustical energy and the comfort of individual children within groups appear to have been
sufficient to avoid potential difficulties. In 16 years of experience as both workshop leader and
observer with many thousands of traumatized children, I can recall only one instance of a child
visibly distressed by the work. Over the same period, to my knowledge, there have been no
reports of negative side-effects from psychologists, social workers or teachers responsible for the
children’s welfare.
The obstacles to establishing satisfactory assessment procedures in the early stages of the
project were of two kinds: the day-to-day demands of urgent and difficult work, and structural
problems in the traditional quantitative research methodologies. These included, for example,
problems of conducting assessments among shifting populations of refugees or internally
displaced persons, and the reliability of psychometric measures and blind testing in extreme
circumstances of human experience. Furthermore, there were ethical difficulties both in
establishing control groups and in subjecting sensitive relationships built on fragile trust to potentially
invasive investigation. In certain circumstances, for example in the camps for people internally
displaced from Abkhazia to Western Georgia, teams were welcomed only because they were
musicians, offering ‘their hearts and their souls’. At the time (1999), any suggestion of official
international intervention, assessment or monitoring led to immediate rejection.
A promising assessment model has been developed by Mary Anne Kochenderfer under the
supervision of the School of Medicine and Department of Music at the University of Edinburgh.
This takes a scientific qualitative approach, based on sensitively prepared, in-depth interviews
and questionnaires directed towards the whole environment of the child, including family, carers
and educators; the method also invites the children to participate creatively in their own
assessment. In addition, in certain post-conflict environments (for example, Bosnia-Herzegovina
or Kosovo), conditions have now become stable enough, and the work sufficiently well
established, for more traditional, quantitative approaches to be tested responsibly. This work is
now in preparation.
Crucially, in recent years, a body of relevant medical evidence has begun to emerge—evidence
that enables connections to be made between specific symptoms of trauma and specific
biological, psychological and social effects of music. On the medical/psychiatric side, there is a
long-established body of detailed research in the pathology of PTSD conducted across large
populations of adults, principally among war veterans in the US, and to a lesser extent among
children; this has been effective in supporting the development of diagnostic criteria, although
much important research—for example, in the neuroscience of PTSD—remains preclinical. On
the musical side, there is a rapidly growing body of evidence from music/medical research,
particularly in areas such as neurophysiology and endocrinology, where autonomic, metabolic,
cortical and subcortical activity associated with musical experience has been identified
(see Panksepp and Trevarthen, Chapter 7, this volume). To this may be added a growing body of
relevant research in music therapy, music psychology, clinical psychology, psychobiology, the
social sciences and education.
Here is the core concern of this chapter. I argue that it is now possible to relate these findings
both to specific symptoms of PTSD and, equally importantly, to general issues of well-being for
children in situations of conflict. Linking clinical evidence for general autonomic dysregulation
among traumatized children to preclinical evidence that the experience of music may help to
modulate and regulate the autonomic nervous system leads to the hypothesis that music may
have a role to play in helping children to deal with problems associated with environments of
conflict—a hypothesis supported by consistent anecdotal evidence from the field. The approach
offers a non-invasive way forward for research into music and trauma in children, and for the
further development of practical methods and methodologies.
The initial approach described here is essentially psychobiological. It is primarily concerned
with those symptoms of trauma and effects of music most closely related to the body, and to the
most intimate connections between mind and body. The approach embraces a number of
interrelated hypotheses that are proposed as possible bases for practice and further research. This
psychobiological perspective takes its place in a wider, biopsychosocial paradigm that seeks to link
physiological, psychological and social concerns in a single, integrated model. The model defines
a space in which practitioners may feel confident in the potential of their work to effect positive
change, and where the development of practical methods and methodologies may take place with
the general support of current scientific research.
15.2 A psychobiological approach

15.2.1 The ear: hearing and listening
It is summer 1994, in Mostar, Bosnia-Herzegovina. The Washington Agreement has been signed at
the beginning of March, but there is still shelling across the boulevard which divides the city, and
there are waves of mortar fire from paramilitaries in the mountains. I work with children in the
cellars of solid old Austro-Hungarian buildings. Although in East Mostar there is hardly a roof left on
a house, or a wall left standing undamaged, these cellars are relatively safe, cool in the summer heat
and, for musicians, wonderfully resonant.
Today, I am working with a group of 20 children, helping to create a group musical improvisation
using hand-held percussion, inspired by the river Neretva, following its course from
its source in the mountains bordering Montenegro down to the sea. We have reached white
water rapids, somewhere in Eastern Herzegovina, and the music has become exuberantly wild
and loud. The children are playing energetically and provocatively, grinning and relishing a
moment of permitted anarchy; however, I notice that nine-year-old Nermina has covered her ears.
Her look is intense and concentrated rather than distressed. I calm the torrent of the Neretva in as
measured and musical a way as I can, and soon we arrive in the tranquil, looking-glass waters of
Jablanica Lake.
I have noticed that Nermina finds it hard to deal with loud noises. I do not know whether this
is physical damage to her ears or something related more to traumatic recall, but I decide to set
up some ‘safe’ environments in which she can explore musical sound. We improvise together
with an ocean drum—a large, sealed, tambourine-like instrument with ball-bearings inside.
At its quietest, it whispers like a rock pool; at its loudest, it crashes like ocean waves. Nermina seems
to enjoy improvising in the ‘risky’ area of soft to medium loud. She is in complete control of the
music, but seems somehow to be pushing at her own frontiers of tolerance. I introduce some
games where individual children modulate or sculpt the sound volume of their fellow musicians
by moving their arms up and down, or by moving their bodies through spaces marked on the floor—
the nearer to the musicians, the louder they play; the further away, the quieter. Nermina is keen
to participate, and once again seems to enjoy being in complete control of the sound levels that
surround her.
I have no idea whether our exercises are helping, but after a few sessions Nermina stops
covering her ears.
In many ways, music is multisensory. Tactile and visual cues, for example, play important roles
in musical experience, like feeling the vibration of an instrument, sensing the touch of a piano, or
following (watching) a conductor. But the core material and essential sensory foci of music are,
of course, sound, the ear, hearing and listening.
Children who are victims of war may experience difficulties with hearing and listening. Those
who are exposed to explosions or heavy arms fire in close proximity—like the children of East
Mostar who lived through one of the most intensive bombardments of a civilian population in
history (spring 1993 to spring 1994)—may suffer physical injury to their ears. Blast over-pressure
may rupture the tympanic membrane, dislocate or fracture the ossicle chain, or damage sensory
structures of the basilar membrane, leading to temporary or permanent hearing loss, or other
auditory disorders (Patterson and Hamernik 1997). I am aware of no published studies of
hearing loss in Mostar at the time of the bombardment, and would be surprised to find any, given the
conditions. However, there have been systematic studies elsewhere, for example of a single
explosion in Myyrmanni shopping mall, Vantaa, Finland in 2002, where 7 people were killed and
160 injured, of whom 44 suffered ear trauma. Otological examination of 29 of these patients
found that 66 per cent had tinnitus, 55 per cent hearing loss, 41 per cent pain in the ears, 28 per cent
sound distortion, and 41 per cent a combination of tinnitus and hearing loss. Ear injury was
recorded in patients as far as 70 metres away from the three-kilogram ammonium nitrate blast
(Mrena et al. 2004). In general, areas of frequency loss seem to correspond roughly to the
frequency spectrum of the explosions and weapons; for example, in a recent study of exposure to
impulse noise among soldiers during military service (Konopka et al. 2005), the highest levels of
weaponry noise were recorded at frequencies 1.6–16.0 kHz, and hearing loss after military
service at an average of 6 dB at 10–12 kHz. Ylikoski (1987) found that 75 per cent of conscripts
had a hearing loss in the high-frequency region above 2 kHz; the remaining 25 per cent recorded
deficits at lower frequencies, associated more often with impulse noise from large-calibre
weapons and lower-frequency explosions. There is also evidence that acute auditory stress may
cause vestibular symptoms, including balance disorders (Cassandro et al. 2003; findings related,
ironically, to very loud music), which are potentially related to vestibular functions such
as motion sensing and spatial orientation, posture and muscle tone, and the vestibular–
ocular reflex.
Other hearing difficulties among traumatized children, including discomfort with certain
sounds, may be more psychological or psychobiological in nature; these difficulties tend to be
related to the principal symptom clusters of PTSD. Certain sounds may be associated with
traumatic recall, and become partners in fear conditioning. The most obvious and common
examples involve gunfire and explosions. However, for some children in Chechnya and
Ingushetia, it is the sound of helicopters that causes distress; it is likely here that the sensory
information converges in the lateral–basolateral complex of the amygdala, where it activates
systems of both startle and stress (Coupland 2000). This may in turn be related to symptoms in
the hyperarousal cluster, and in particular to exaggerated acoustic startle response.
Music is highly processed sound that may draw on a wide range of frequencies throughout the
range of human audition. On the whole, music is concerned with ‘beautiful’ sound, at
frequencies and amplitudes that give pleasure to the listener. When it comprises challenging sound, this
usually forms part of a carefully constructed and prepared aesthetic programme or emotional
narrative. Occasionally, what may be regarded as a shock outside the context of music may
become a stimulating surprise within. It is possible for live music to respond fluidly to the
reactions of its listeners. What appears uncomfortable may soon be taken away.
The first hypothesis is that music may help traumatized children exercise their hearing in safe
and enjoyable ways. This may take the form of simple listening exercises: listening to natural
sounds, or identifying and describing hidden sounds and instruments, which may in turn
encourage concentration and attention. It may involve sound play and composition. In the
early days of the Sarajevo project, I carried a small hand-held percussion kit into the city in a
rucksack: bells, wind chimes, chime bars, triangles, agogos, cowbells, flexatones, maracas, claves,
tambours and tambourines. It was a musical palette sufficient to work creatively with children
exploring different frequencies and intensities of pitch and colour. Sometimes we worked purely
musically; at other times we created musical narratives—the courses of rivers, the pathways
of stories.
It is, of course, possible to help children deal with specific difficulties. Improvisations with the
ocean drum, musical sculpting and spatial games may help children who have difficulties with
loud or sudden noise to acquire self-regulated and aesthetic ways of dealing with it. For children
with hearing deficits as a result of otological trauma, there is the possibility of exploring and
exercising a rich and wide range of frequency and audition through the safe and attractive
medium of musical sound.
15.2.2 The autonomic nervous system: the heart

In the late winter of 1998, I am making my way in the early evening down the Rustaveli Avenue in
Tbilisi, Georgia, with workshop colleagues Roxana Pope and Toni Pesikan. Some distance ahead, in a
square, I see a group of eight or nine ragamuffin children shambling towards us. The adults on the
pavement look uneasy, and give them a wide berth. These are street children, orphans, mostly from
conflicts of the 1990s in such regions as Abkhazia and South Ossetia, who have now turned to
survival in the cities, begging, petty crime and prostitution. As they approach us they break into a
stampede and we are noisily mobbed and roughly embraced to cries of ‘Toni!’, ‘Roxanna!’, ‘Nigel!’.
The bystanders appear alarmed, even horrified: we feel privileged, and secretly rather proud.
They are our friends from the well-run Tbilisi drop-in centre. These children are in a state
of permanent physical and mental arousal. The process may have begun with traumatizing
experiences of war, but it continues in the turmoil of day-to-day survival on the streets. We are
working creatively, in collaboration with students from the Tbilisi Academy of Music, writing songs
and inventing soundscapes: small musical documents of emotional lives, tokens of personal
achievement and self-respect; but the core of the work is the quest for transforming musical journeys. We may
begin with focused clapping games, then move to exciting drumming and movement exercises
(perhaps a Caucasian dance or a Brazilian samba), then to calming and reflective music (a Georgian
healing song or a Native American lullaby). After the focusing exercises, and then the high energy,
the children seem ready to relax in an atmosphere of calm, concentration and clarity. We have no way
of looking inside their hearts and minds to determine the rate of their pulse or the speed of their
thoughts, but we can see, hear and sense profound physical and psychological changes. The
transformations are rapid and radical.
A substantial body of research into the influence of PTSD on the heart has been carried out
among war veterans in the United States. There is consistent evidence that subjects with a PTSD
diagnosis have a significantly higher resting heart-rate than do control subjects with no such
diagnosis (e.g., Buckley et al. 2004). A study of women veterans showed a mean resting heart-rate
of 83.9 beats per minute for those with PTSD diagnosis, as opposed to 77.5 for the control group
(Forneris et al. 2004). It is significant that these differences appeared regardless of age, body mass
index, race or medication. The connection between trauma and heart rate was further
underlined in a recent meta-analytic examination of basal cardiovascular activity in PTSD (Buckley
and Kaloupek 2001), in which 34 studies of 2670 participants showed consistently raised heart
rates among those with PTSD, with the greatest effect sizes for comparison among those with
chronic PTSD. Trauma may also affect the delay with which the heart returns to its normal
resting rate after a sudden shock or startle. In a recent study of veterans (Kibler and Lyons 2004),
the heart-rate recovery delay after acoustic startle was linearly related to the severity of PTSD
diagnosis: the more severe the trauma, the consistently slower the recovery. Buckley and
Kaloupek’s meta-analytic examination also found an association between PTSD and elevations
in systolic and diastolic blood pressure, although the effect was less marked than for heart rate. In
a study of 118 Vietnam combat veterans (Beckham et al. 2002), participants with PTSD showed
significantly higher diastolic blood pressure response than the control group when asked to
choose and recall a memory that caused them to be angry.
Trauma is associated with cardiac arrhythmias (irregularities in the heart beat). In two studies
involving spectral analysis of heart-rate variability among participants diagnosed with PTSD
(Cohen et al. 1998, 2000), there was clear evidence of autonomic dysregulation (disturbance of
the balance between arousal and repose in the autonomic nervous system, which controls
involuntary and self-regulating functions of the body such as secretory glands, heart, oxygen
supply, and digestive and metabolic functions). This dysregulation took the form of a baseline
autonomic hyperarousal state: a chronic shift to a permanent state of arousal in the autonomic
nervous system. It involved increased tone in the sympathetic nervous system (the division of the
autonomic nervous system that accelerates the heartbeat in situations such as crisis). At the same
time, there was decreased tone in the parasympathetic autonomic nervous system, which
innervates the heart from the brainstem via the vagus nerve, and slows the heart down to prepare for
functions such as energy storage, digestion and immune responses. The 1998 study indicated that
subjects with PTSD were locked into autonomic hyperactivation to the extent that they were
unable to marshal further arousal or stress in the presence of stimuli that induce stress responses
in participants with no PTSD diagnosis.
There is a common perception that evidence concerning the relationship between music and
the heart is controversial or contradictory. In fact, the evidence is relatively consistent. In an
informal review I conducted recently of 50 refereed papers on music and the heart, 49 of them
found that music significantly changed the behaviour of the heart; only one, carried out in a
geriatric ward, found no correlation. Furthermore, there is reasonable consistency in the
findings. There is consistent evidence that certain kinds of fast-tempo, exciting music, for example
techno-music, may significantly increase heart rate and systolic blood pressure (e.g., Gerra et al.
1998), and in the short-term may decrease the activation of the parasympathetic nervous system
(Iwanaga et al. 2005). There is also a strong link to movement; when combined with exercise,
for example, music may increase sympathetic nerve activity (Urakawa and Yokoyama 2005).
Paradoxically and significantly, in the long term, fast and exciting music may also decrease
perceived tension and increase perceived relaxation (Iwanaga et al. 2005).
There is an overwhelming body of evidence that music, and especially music generally felt to
be relaxing, may slow the heart and lower blood pressure (e.g., Updike and Charles 1987; Byers
and Smyth 1997; Cardigan et al. 2001; Knight and Rickard 2001; Aragon et al. 2002; Mok and
Wong 2003; Lee et al. 2005). It appears that listening to relaxing music may not only slow heart
rate, but significantly reduce its variability (Escher and Evequoz 1999). In general, music seems to
have a richer and more intimate relationship with the parasympathetic than with the
sympathetic nervous system (Iwanaga and Tsukamoto 1997). Under certain conditions at low,
parasympathetic rhythmic frequencies (say, between 48 and 42 beats per minute), harmonic
synchronization or entrainment between musical pulse and heart beat may occur (Reinhardt
1999). This harmonic synchronization (simple ratios of speed of musical pulse to heart rate, such
as 1:1, 1:2 and 2:3) appears to play a role in individual tempo preferences. People show preference
for tempi close to their own heart rates in normal daily situations: between 70 and 100 beats
per minute, in a relationship 1:1; in the case of unfamiliar rhythmic patterns, they seem to
enjoy pulses with simple harmonic relationships with their heart beat, for example 1:2 or 2:3
(Iwanaga 1995).
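The simple ratios mentioned above are easy to work through numerically. The sketch below, in Python, reads each ratio as musical pulse to heart rate, so that 1:2 means a musical pulse at half the heart rate; the function names are hypothetical and the calculation is an illustration of the arithmetic only, not a method drawn from the studies cited.

```python
from fractions import Fraction

# Ratios of musical pulse to heart rate named in the text: 1:1, 1:2 and 2:3.
RATIOS = (Fraction(1, 1), Fraction(1, 2), Fraction(2, 3))


def harmonic_tempi(heart_rate_bpm):
    """Return the musical tempo (beats per minute) at each simple ratio."""
    return {
        f"{r.numerator}:{r.denominator}": float(r * heart_rate_bpm)
        for r in RATIOS
    }


def closest_ratio(tempo_bpm, heart_rate_bpm):
    """Return the listed ratio whose tempo lies nearest to the given tempo."""
    return min(RATIOS, key=lambda r: abs(tempo_bpm - float(r * heart_rate_bpm)))


if __name__ == "__main__":
    # For a heart rate of 84 beats per minute the three tempi are
    # 84 (1:1), 42 (1:2) and 56 (2:3) beats per minute.
    print(harmonic_tempi(84))
```

On this reading, a slow pulse of 42 beats per minute stands in a 1:2 relationship to a heart beating at 84.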
Perhaps it is the very richness and fluidity of music’s special relationship with the heart that
has made research appear contradictory. As documented above, there is reliable evidence that
differing experiences in music may both speed up and slow down the heart, regulate it, raise and
lower blood pressure, reinforce and prolong the effects of movement and exercise, reduce heart
rate variability, and in certain circumstances entrain and harmonize the rhythms of the heart.
Simultaneously, music has a vital relationship with other functions of the autonomic nervous
system—with respiration, movement and the metabolism, and with the emotions, memory and
cognition, all of which impact collectively on and interact reciprocally with the activity of
the heart.
This brings me to the second hypothesis. Based on the juxtaposition of clinical evidence,
preclinical evidence and practice, I propose that for traumatized children music may ‘exercise’
the heart in supple, fluid and enjoyable ways, and that this in turn may have a beneficial effect on
symptoms of trauma. This applies in particular to autonomic dysregulation, where children appear to
be locked in low-level hyperactivation and hyperarousal. I argue that both the active/performing
and passive/listening experience of music may have an unlocking effect on the autonomic
nervous system, helping to modestly increase and decrease tone in the sympathetic and
parasympathetic divisions in safe, flexible ways, and ultimately helping to relax and regulate the system as a
whole. This may include other autonomic functions, such as the secretion of saliva and the
activity of the gastrointestinal tract; indeed, there is clear evidence of the effect of music on galvanic
skin conductance (e.g., VanderArk and Ely 1993).
For traumatized children, musical experience may be focused at the musical/psychobiological
location closest to them, and at tempi similar to or in harmony with the speeds of their hearts.
I propose that it may also embark on sympathetic and parasympathetic adventures in rhythm
and tempo, from excitement to entrainment, with high and low rhythmicity, singing, playing,
moving, dancing and listening, and safe and well-navigated journeys in speed and time.
15.2.3 Respiration
It is late spring 1995, again in Mostar, more than a year since the signing of the Washington
Agreement; but there is still sporadic shelling across the boulevard and mortar fire from the moun-
tains. On a visit to the orphanage in Zalik, I am surprised to find a rock’n’roll band rehearsing—four
young men in their mid-to-late teens, half a drum kit, and battered acoustic guitars, none fully
strung. I meet Miro, Omer, Zlatko and Bijeli. No one says much. Omer, who is bodily huge, offers
the occasional cavernous monosyllable; Miro makes an effort to form sentences, but the words are
scrambled in a breathless stammer. I feel there is a real originality and musicianship in their compo-
sitions and their playing; as I emerge from the orphanage to the glare of the sun and the depressing
thud of detonations, a voice in my head (I am not normally given to such delusions) declares, ‘This is
the beginning of a great rock’n’roll story.’
The members of the group subsequently became volunteers in our training programme for young
people to work with children. Miro took up singing, and whether in consequence or by coincidence,
his speech began to improve quite dramatically, although he still seemed to see himself as a worthless
and loveless diabolus with two horns and a tail. This changed one day in August 1996. I had invited
the team to a summer camp at Portonovo, near Ancona in Italy. Miro, with his bright blue eyes and
curly blond hair, tanned from a summer swimming in the Neretva, wandered on to the beach and
found himself surrounded by an adoring group of Italian girls in bikinis. I saw a dark cloud rise up
from around Miro’s head and disperse forever in the clear Adriatic sky.
All four of the Zalik rockers took part in the opening of the Pavarotti Centre in Mostar in 1997,
when they shared a platform with the likes of Bono, Zucchero and Brian Eno. Omer now has a signif-
icant international career as a drummer and percussionist, Bijeli works in music, therapy and educa-
tion, Zlatko is an air traffic controller who still writes radically original songs and sings in clubs, and
Miro is very much the man-about-town in Mostar, still successfully chatting up the girls, still singing,
and in demand from time to time as a toastmaster and public speaker.
Surprisingly, there is relatively little research into PTSD and respiratory difficulties. The litera-
ture recognizes an association between experiences such as flashbacks, or the exaggerated startle
response, and hyperventilation—that is, rapid breathing leading to a reduction of the carbon
dioxide concentration of arterial blood. Most studies in this area, however, are concerned with
ASD—acute stress disorder (e.g., Nixon and Bryant 2005). Similarly, studies of dyspnoea, or
clinically observed laboured breathing, appear mainly in the general literature on stress (e.g.,
Donker et al. 2002). The most researched area related to respiration and PTSD is respiratory
sinus arrhythmia (RSA—the fluctuation in the heart rate controlled in part by the effect of
breathing on the vagus nerve). The evidence suggests that PTSD sufferers have low vagal activity
in reaction to challenge (Sahar et al. 2001) and that decreased RSA may be a response to a
traumatic reminder (Sack et al. 2004).
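As a purely illustrative aside, RSA can be put into numbers. The sketch below is a minimal example in Python, not a method taken from the studies cited: it builds an invented heart-rate trace that rises and falls once per breath, and estimates RSA as the average peak-to-trough difference in heart rate per breath cycle, one simple time-domain measure. The function names, the sinusoidal shape and all the values are assumptions chosen for illustration only.

```python
import math


def synthetic_heart_rate(breaths, samples_per_breath, mean_bpm, rsa_amplitude_bpm):
    """Return a list of heart-rate samples (beats per minute).

    Assumption: the heart rate swings sinusoidally around mean_bpm,
    completing exactly one swing per breath cycle.
    """
    total = breaths * samples_per_breath
    return [
        mean_bpm + rsa_amplitude_bpm * math.sin(2 * math.pi * i / samples_per_breath)
        for i in range(total)
    ]


def peak_to_trough_rsa(rates, samples_per_breath):
    """Average, over complete breath cycles, of (highest - lowest) heart rate."""
    differences = []
    last_start = len(rates) - samples_per_breath
    for start in range(0, last_start + 1, samples_per_breath):
        cycle = rates[start:start + samples_per_breath]
        differences.append(max(cycle) - min(cycle))
    return sum(differences) / len(differences)


# Two invented traces: a large swing per breath and a small one.
larger_swing = synthetic_heart_rate(12, 20, mean_bpm=70, rsa_amplitude_bpm=5)
smaller_swing = synthetic_heart_rate(12, 20, mean_bpm=85, rsa_amplitude_bpm=1)

print(round(peak_to_trough_rsa(larger_swing, 20), 2))
print(round(peak_to_trough_rsa(smaller_swing, 20), 2))
```

In this toy comparison the second trace, with a faster mean rate and a smaller swing per breath, stands in for decreased RSA; it is not clinical data.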
The science is clearly in its early stages, but any therapist or volunteer who has spent time
in the field has noticed that a small but significant number of traumatized children have irregu-
larities in breathing, manifested in a range of difficulties in speech, including breathlessness,
hesitancy or, in certain cases, mild stammering and stuttering (although the latter are problems
which may be respiratory ‘by proxy’ rather than by origin).
The centre for rhythmic control of breathing lies in groups of neurons in the medulla oblongata,
the lowest part of the brainstem. The pulse is generated by either the firing of autonomous
pacemaker neurons (as yet unidentified) or possibly neural networks, and modulated by
feedback from lung and chest-wall reflexes and a variety of peripheral controls. Pathways descend
along the spinal axis, in separate inspiratory and expiratory axons, to receptors in the thoracic
and abdominal viscera, which control muscle groups concerned with breathing. This is the
rhythmic generator of automatic breathing, where we are not particularly aware of our breathing
patterns, and is the state in which we spend most of our time.
There is, however, a critical, third pathway: the corticospinal fibres that descend from the
motor area of the cerebral cortex along the spinal axis to the receptor sites for breathing. This is
the pathway for voluntary breathing, including activities such as speaking, singing, holding
breath and voluntary hyperventilation. Crucially, for an understanding of respiratory problems
and PTSD, it is also the pathway for certain cortically processed ventilatory responses to anxiety
and fear.
Automatic and, in some instances, voluntary breathing may be affected by the more general
influence of the vagus nerve; in particular, the section of the vagi in the neck before entrance to
the vertebral column. Vagal activity may reduce breathing frequency and increase the tidal
volume of breath. It is significant that parasympathetic fibres are distributed widely through the
respiratory as well as cardiovascular systems. It is a reasonable assumption that the reduction in
vagal tone frequently observed in PTSD, and often associated with raised heart rate and
decreased respiratory sinus arrhythmia, is also associated with decreased vagal modulation of
breath frequency and volume (with the possibility of paradoxical effects). It seems likely that
trauma may alter in some ways both voluntary and automatic functions of breathing; hence,
perhaps, the subjective observations of those who work with traumatized children in the field.
Music has a singularly privileged and effective position in relation to these problems. Musical
experience in general, and singing in particular, has the capacity to modify and regulate both
automatic and voluntary breathing patterns. The results of research in automatic respiratory
changes through listening to music are very similar to those for the cardiovascular system—
ventilation increases with faster tempi and simpler rhythmic structures, and decreases both in
pauses and with slower tempi (e.g., Bernardi et al. 2006). Results have shown that the effect is
universal, but stronger for musicians than for non-musicians.
In relaxation training with deep-diaphragmatic breathing, relaxing music has been shown
to deepen breathing and quicken relaxation; concentrations of carbon dioxide in the arterial
blood are normalized with decreased respiration rate. The combined psychophysiological indices
of the research suggest that music ‘potentiates the hypometabolic counterarousal state’ (Fried
1990a, b).
Voluntary breathing underpins much musical performance, above all singing, where patterns
of inspiration and expiration both control and are controlled by the musical breath/phrase, and
where the fluid flow of breath through the respiratory airways must be sustained evenly and at a
relatively high pressure. At low-level physical activity, lung volume is only 10–15 per cent of vital
capacity; for strenuous activity it may be as high as 50 per cent; singing long phrases, however,
may require the full capacity of the lungs. Singing is undoubtedly the ‘normal’ human activity
that offers the most complete and precisely regulating exercise for the lungs and for breath
control. Recent research (Grape et al. 2003) has demonstrated that singing may have wider
benefits for general well-being including increased concentrations of oxytocin, a peptide
hormone secreted by the posterior pituitary and associated with the contraction of the uterus
during labour, lactation and sensations of joy and peace. (Endocrinal aspects of music and
trauma will be investigated in subsequent sections on the metabolism and the emotions.)
This leads to the third hypothesis, based on a small body of clinical evidence, and a large body of
preclinical research and practice: that music, and particularly singing, helps traumatized
children to deal with respiratory difficulties of both an automatic and voluntary nature.
Traumatized children may sometimes find it difficult to sing well. I recall the enthusiastic but
‘monotonal’ singing of children who had survived the siege of East Mostar (spring 1993 to spring
1994). Sensitive choices of repertoire, with fewer pitches and strong rhythmic content, provided a
way through. Most important is joy in the experience of the act of singing itself. Interesting and
exciting journeys through different and contrasting tempi may exercise both automatic and vol-
untary respiratory responses, and do so in parallel with the closely related concerns of the heart. In
time, it is possible to work on voluntary control of breathing patterns through paying attention to
breath/phrases: first, normal lengths, then long breaths. This may lead to the opportunity to intro-
duce some of the more traditional, but nonetheless essential values of singing teaching, such as
proper support and diaphragmatic–costal breathing. Where it is appropriate to explore a musical
repertoire that is calm, reflective and encourages slow, deep breathing, there is the opportunity to
work with relaxation in a way that embraces not only the respiratory and cardiovascular systems,
but other vital functions, including aspects of the metabolism.
15.2.4 Bodily movement

In autumn 1995, following the final ceasefire of the war in Bosnia, I took my first group of
students from Edinburgh to expand and develop our schools programme in Mostar, and to act
as peer support for the young Bosnian outreach team. A visit to Sarajevo was also planned.
A colleague from the charity Première Urgence had described problems at the isolated Pazarić
Hospital in the hills outside Sarajevo, where there were both children and adults—some mentally ill,
some with profound learning difficulties—all forced together by necessity and the ravages
of war.
I had managed to secure transport with a rather colourful NGO called The Serious Road Trip,
which had worked through the most dangerous parts of the war delivering aid in brightly painted
trucks, decorated with sunflowers, rainbows and whirligigs. They had given us the use of a blindingly
psychedelic Dodge bus, and had thrown in, for good measure, an excellent and very funny clown
to keep us company. This surreal combination worked wonders at what remained of the joyless
paramilitary roadblocks. It also made our arrival at Pazarić something of a sensation, and provoked
an ecstatic greeting from the patients at the gate of the hospital, which was to become a regular
feature of our visits. Another eagerly anticipated feature (by 1995 standards) was the banquet of
ćevapčići and rakija provided by Danilo, the Director of the hospital. Danilo was the only doctor
caring for more than 300 seriously ill patients, and supported by just two qualified nurses. He was a
Bosnian Serb, one of many unsung heroes who shunned genocidal nationalism and stayed and cared
for their multiconfessional neighbours. (When Danilo died, in 2002, the patients insisted on writing
and performing a song for him; it began: ‘Danilo was a good man; he looked after everyone in the
house’.)
At times in our work, there are moments of overwhelming chaos. I recall in the late winter of 1998,
in Zugdidi in Western Georgia, arriving at a camp of internally displaced people from Abkhazia, who
had been burned out of their houses more than once. The camp was located in an abandoned factory
with poor sanitation and toxic waste still smouldering in the yards. We arrived in the late afternoon,
with darkness falling and no power supply, to be greeted in a massive industrial atrium by thousands
of milling, expectant human shadows. I recall the seemingly irretrievable chaos of muddy fields on
the Kosovar–Macedonian border, teeming with hundreds of recently arrived refugee children.
The overwhelming chaos in Pazarić Hospital came when more than 100 patients crowded into
the hospital’s modest social activities space. What do you do in such impossible circumstances?
One answer is that you sing and play with as much contagious joy and energy as you can muster.
Music is extraordinarily effective in these situations. Perhaps it is just a natural and inoffensive way
of raising your voice and attracting attention. But music also generates trust: if someone sings to you,
they clearly mean you no harm; they make themselves vulnerable, ‘bare their soul’ and offer
sympathy, empathy and a kind of care and love. Then there is the power of music to bring social
cohesion—by consent—from chaos, and both to synchronize and to entrain.
We began, on this occasion, with songs from ex-Yugoslavia and Bosnia-Herzegovina, songs that
people knew and could join in—folk songs, popular songs, national songs (of an ethnically inclusive
nature) and art songs. Then we moved on to our African repertoire: highly rhythmic songs with just
a few words or syllables, which can be learnt, literally, as they are sung. We added djembes, and some
of the patients spontaneously joined in the drumming. Then everyone began to sing, move and
dance—adults with physical disabilities, children with profound learning difficulties, lawyers,
labourers, plasterers and professors who carried deep wounds of the war in their minds, triumphs and
disasters of sartorial and social grace, all moving to the same pulse, bodies and thoughts, for this
special moment in time, in synchrony.
Over the next several months, the work in Pazarić was developed and refined with properly
planned and organized individual and small-group work; but our eccentric big-jam sessions were to
continue, by popular demand, for many years to come.
The power of music to entrain, coordinate and synchronize movement has been a critical
part of the work from the beginning. During 1993 and 1994, the village of Blagaj, formerly a
well-to-do country suburb and summer get-away from Mostar, was cruelly brutalized—
overflowing with refugees, bombarded intensively by both Croatian and Serbian militias, and for
periods cut off from the main Bosnian defence forces and all aid. By the time of the Washington
Agreement in 1994, there was an angry and resentful atmosphere in the village. Local and interna-
tional offers of intervention were often summarily rejected. But our team was warmly welcomed,
because we were musicians, and I suspect because of our close association with War Child, and its
stubbornly regular deliveries of fresh bread to the village, often through hailstorms of bullets, from its
mobile bakery at the former Hepok vineyard headquarters. (Playwright Tom Stoppard, a leading
patron of War Child, described this combination of music and bread as a ‘wholesome partnership of
basic essentials’.)
When we first met the children, many were wildly distracted, uncoordinated and hyperactive. We
attempted to meet the children at their level of energy and activity with highly rhythmic musical
performance and dance. The children seemed to enjoy these opportunities to express their energies
and inclinations in powerful and structured ways. We were able gradually to slow down, focus and
help contain that energy. Often, children who entered our sessions in violently scattered hyperactivity
left in calm control of their bodies. For children who were withdrawn or sluggish, we worked in the
opposite way, meeting them at their level of energy, and leading them to more energetic repertoires of
coordinated movement. Although post-Dayton Bosnia is burdened with many difficulties, Blagaj has
now returned to being a peaceful, urbane and well-ordered place. The children are mostly well turned
out, relaxed, attentive, bright and creative.
The physical movement behaviours of children who are victims of conflict extend from the
norms expected for non-traumatized children to extremes of low and high activity. These
extremes appear to be closely related to two of the three principal symptom clusters of PTSD, and
to certain comorbid and/or overlapping disorders.
The first is the cluster of avoidance symptoms of PTSD, which are traditionally associated with
a numbing of responsiveness, avoidance of thoughts, feelings and situations which may be
connected to the trauma, and a diminished interest in participating in significant activities
(DSM-IV—see above). These symptoms may manifest themselves in physical activity, in ‘reluctance’
or withdrawn sluggishness, and for some children in a pattern of subdued movement not far
removed from the symptoms of depression, ranging from demoralization to major depression
(Yule 1994; Brent et al. 1995).
At the other extreme, the hyperarousal symptom cluster includes irritability, difficulty in con-
centrating, exaggerated startle response and hypervigilance (DSM-IV). It is a short step from this
symptom cluster to the symptoms of attention-deficit hyperactivity disorder (ADHD). Recent
research has pointed to a significant association between these disorders in both adults (Adler et
al. 2004) and children (Famularo et al. 1996). This appears to be consistent with the view of many
fieldworkers that repertoires of physical activity for children who are victims of conflict extend to
extremes of both withdrawn/subdued, and disruptive/hyperactive behaviour.
Music and body movement are intimately linked (Osborne, Chapter 25, and Lee and Schögler,
Chapter 6, this volume). Certain connections between sound and movement seem to be hard
wired, such as the acoustic startle response, which involves a direct pathway from the dorsal
cochlear nucleus to the inferior colliculus and spinal systems (Meloni and Davis 1998; Li et al.
1998). Other connections include the recruitment of a complex loop of functions, such as the
auditory cortex of the temporal lobes which process rhythm (e.g., Peretz and Kolinsky 1993), the
right anterior secondary auditory cortex for retention of patterns (Penhune et al. 1999), and
the premotor cortex, the basal ganglia, the vestibular system and the cerebellum for the ‘musical’
control of movement (Turner and Ioannides, Chapter 8, this volume). Studies of mother–infant
interaction (Stern 1974, 1999; Beebe et al. 1979; Trevarthen 1979, 1986, 1999; Stern et al. 1985;
Trehub 1990; Fernald 1989; Papoušek et al. 1990; Papoušek 1994; Trainor 1996; and Part 2, this
volume) have shown that the first coordinated movement in young babies is triggered or
attracted by ‘musical’ cues from the prosodic vocalizations of mothers. At the opposite end of
human life and experience, music has the powerful and well-documented effect of cueing coordi-
nated movement in certain phases of Parkinson’s Disease (e.g., Pachetti et al. 2000). The physical
effect of musical rhythm may be complemented and augmented by timbral colour, melodic and
harmonic movement, metrical patterns of anticipation and stress, biomechanical articulation,
auditory tau (Lee 2005; Lee and Schögler, Chapter 6, this volume) and vitality affect (Stern 1999).
This is perhaps what Tia DeNora describes as the ‘prosthetic’ effect of music (DeNora 2000): the
capacity of music to externalize, coordinate and at times replace the body’s internal control of
patterns of movement.
This brings me to the fourth hypothesis, based on a combination of practice, clinical and
preclinical evidence, which is that music may help traumatized children process and control
uncomfortable extremes of movement behaviour. Music of clear rhythmic profile is useful here,
and the combination of music and movement is highly effective, through performing music,
and through singing and dancing more generally. For children with high levels of physical activ-
ity, music of high rhythmic energy (e.g., West African drumming) is a good starting point, from
which journeys through more varied and balanced levels of rhythmic energy and movement may
begin. Similarly, for withdrawn and subdued children, gentle music of low rhythmic energy (e.g.,
lullabies or love songs) may be an appropriate place to start, gradually increasing energy and
speed, and embarking on adventures in exciting but regulated physical movement and musical
power. For larger groups of children, music may also entrain, coordinate and synchronize
movement between individuals, offering the excitement, satisfaction, security, comradeship and
cohesion of playing and moving rhythmically together.
The working hypothesis and the functional principle here are the same as for the autonomic
nervous system, heart rate, blood pressure and respiration: music may exercise, unlock, coordi-
nate and regulate the body through a variety of experiences in melodic/harmonic movement,
pulse, metre, energy and speed.
15.2.5 Basal metabolism: stress, relaxation, the emotions

It is May 1999, and my colleague Dee Isaacs and I travel to Albania to make an assessment for War
Child Netherlands concerning the situation among refugee children still pouring over the border from
Kosovo. As far as I am concerned, any assessment relating to the creative arts/health intervention is
unlikely to give a useful picture without pilot work with the children themselves, so Dee and I travel
to work in a camp north of Tirane, with a population of refugees recently arrived from the Gjakova
(Đakovica) region.
I begin by spending some time visiting families in their tents. These camps are always very
different from the image portrayed in the media—the despair, discomfort and humiliation are far
greater, but the human dignity is far more robust. Along one row of tents I am greeted by a late mid-
dle-aged couple, smartly dressed in spite of the dust and stifling heat. They apologize that they have to
invite me into a tent and not into their house (which had been vandalized and dynamited, over the
border in Kosovo, some days before). I have to promise that one day I will visit them in their own
home, wherever that may be. Inside the tent, everything is immaculate, the beds are made up neatly
with crisp sheets, and coffee is prepared on a primus stove, with Kosovar attention to detail, as if in
the kitchens of a gourmet hotel. Later, they proudly show me the only possessions they have brought
with them: their university diplomas, from Prishtina and Belgrade.
We assemble the children, about 80 of them, in one of only two buildings on the site, a former
school for the surrounding farms. We begin with my relatively small repertoire of Albanian/Kosovar
songs, and introduce percussion instruments. The atmosphere is expectant and heady, and I find
myself in a situation that I have experienced many times before. A nearly palpable, highly charged
wave of emotional energy comes from the children; it is almost a physical sensation, which I struggle
to explain. I remember it from the early days of the siege of Sarajevo, and I have felt it often since, for
example in the Caucasus mountains, and in 2004 among Palestinian children on the West Bank. The
most palpable part of the sensation is in the children’s body language and their eyes, which seem to be
saying, ‘This is the sort of thing we want. Keep going.’
It is part of what I regard as the central mystery of the work. I feel I can account for the longer-term
psychological and social processes of the intervention. For example, I know that music is a cognitively
and emotionally rich form of non-verbal communication. The potential of music to help share
and shape expressions of inner states of mind and body, and hence (under the right conditions and
over an appropriate period of time) to help transform the mental and physical lives of children
has been well documented in the literature of music therapy. It is easy to see how the cultivation
of musical communication may help children to develop and grow in their capacities to communicate
in other ways. The psychosocial benefits for children of musical/creative achievement—in terms of
self-esteem, trust, identity, hope and social cohesion—are self-evident and supported by an increas-
ing body of research, particularly in the domain of education. This is not to forget more practical
benefits in spatial awareness, motor skills, concentration and general cognitive development. We
celebrate these things every year in our summer camps for chronically traumatized children and
children with special needs, bringing together therapeutic and outreach teams from the whole of the
Balkan region.
What remains a mystery is the precise nature of the ‘hotline’ music seems to have to our more
immediate physical, neurophysiological and psychological responses. This is why I have taken a
particular interest in the theory of communicative musicality (Malloch 1999; Trevarthen and
Malloch 2000), and in psychobiology and subcortical thresholds of physical and mental activity and
response. The basic regulation of functions such as body movement, respiration and the autonomic
nervous system is located primarily at these thresholds, and it is at subcortical thresholds that the
neurophysiological and endocrinal/musical hotline seems able to make an immediate and profound
connection with our metabolism and the deepest biological foundations of our emotions, feelings and
moods. (For the purposes of this discussion, I define emotions as transformed or heightened states of
body and mind that may precede or follow real or imagined actions or events, feelings as reflections
on thoughts and emotions, and moods as more sustained states or predispositions of metabolism and
emotion.)
The Ancient Greek ritual of katapontismos involved diving into the ocean and being immersed in
water. In many Greek myths, it is implicitly associated with the experience of music (Mâche 1993),
for example, in the story of the musician Arion who was captured by pirates and escaped by singing a
song, diving headlong into the sea and returning to Greece on the back of a dolphin. These myths can
be seen in some ways as anticipating contemporary musical neuroscience: the idea, for example, that
musical experience is an immersion and saturation of cognitive, emotional and physical systems of
the human mind and body. The mythology also suggests an interesting and useful metaphor, one I
have called on many times—the idea of musical emotion as an ocean.
There is consistent evidence for endocrinal dysregulation associated with PTSD. The best
documented is dysregulation of the hypothalamic–pituitary–adrenal (HPA) axis, associated with the
body’s reaction to stress. In the normal experience of stress, sensory information associated with stress or
fear (for example, stressful sounds or images), or signals from the neocortex associated with the
cognition of stressful experience, reach the central nucleus of the amygdala. From there, stress
messages travel along the bed nucleus of the stria terminalis to the hypothalamus, which secretes
corticotropin-releasing hormone (CRH) into the blood of the portal circulation. This in turn
triggers the release of adrenocorticotropic hormone (ACTH) from the anterior pituitary gland
into the bloodstream. ACTH reaches receptors in the adrenal gland on top of the kidney which
then releases cortisol, a steroid, glucocorticoid hormone, into the general blood circulation.
Through the general circulation, cortisol is carried to receptor sites in the immune system as well
as to organs associated with the ‘fight or flight’ reaction. An important regulation of the HPA
axis occurs in the hippocampus, where glucocorticoid receptors respond to elevated levels of
cortisol and set in motion a process of inhibition of the release of CRH, ACTH and hence of
cortisol itself.
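The cascade and its negative feedback can be sketched as a toy numerical model. The Python example below is a minimal illustration under stated assumptions, not a physiological model from the literature: the three quantities follow one another with simple first-order lags, the feedback is represented by a single invented inhibition factor acting on CRH and ACTH release, and every name and constant (`simulate_hpa`, `feedback_gain`, the step size) is hypothetical.

```python
def simulate_hpa(stress, feedback_gain=1.0, steps=2000, dt=0.1):
    """Return the simulated cortisol level at each time step.

    Toy assumptions (not measured physiology):
    - CRH release is driven by a constant stress signal,
    - ACTH release follows CRH, and cortisol release follows ACTH,
    - cortisol inhibits the release of both CRH and ACTH
      (standing in for the receptor-mediated feedback described above).
    """
    crh = 0.0
    acth = 0.0
    cortisol = 0.0
    history = []
    for _ in range(steps):
        # Inhibition factor: 1.0 with no cortisol, smaller as cortisol rises.
        inhibition = 1.0 / (1.0 + feedback_gain * cortisol)
        # Each level relaxes towards its (inhibited) input: simple Euler steps.
        crh += dt * (stress * inhibition - crh)
        acth += dt * (crh * inhibition - acth)
        cortisol += dt * (acth - cortisol)
        history.append(cortisol)
    return history


without_feedback = simulate_hpa(stress=1.0, feedback_gain=0.0)
with_feedback = simulate_hpa(stress=1.0, feedback_gain=1.0)

print(round(without_feedback[-1], 3))
print(round(with_feedback[-1], 3))
```

In this sketch, cortisol settles at a lower level when the feedback is switched on, and raising the hypothetical `feedback_gain` lowers that level further. The numbers have no units and carry no clinical meaning.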
There is strong evidence for the significant dysregulation of the HPA axis in both the initial
experience of trauma and in diagnosed PTSD, but this evidence is paradoxical. As might be
expected, cortisol levels are often high directly after traumatic experiences. A recent study of
children exposed to trauma (Delahanty et al. 2005) showed high initial urinary cortisol levels in
the immediate aftermath of trauma, and an association of these high levels with an elevated risk
of subsequent development of acute PTSD symptoms, particularly in boys. For longer-term
trauma, however, the evidence is counterintuitive. Most research shows that chronic PTSD is
associated with lowered levels of cortisol, the opposite of what might be expected, as shown in
a study of child victims of the Armenian earthquake, carried out five years after the event
(Goenjian et al. 1996).
There are a number of possible explanations for this. The paradox may be linked to the theory
of adrenal exhaustion (Selye 1980), or tonic inhibition of the HPA axis through chronic adapta-
tion of the organism to the source of stress. The theory of enhanced negative feedback inhibition
of cortisol (Yehuda 2000) suggests that the effect is due to increased glucocorticoid receptor
sensitivity in the hippocampus, resulting in over-inhibition of the HPA axis, and therefore of
adrenal secretion of cortisol. Another study (Rasmusson et al. 2003) identifies the adrenal
neurosteroid dehydroepiandrosterone (DHEA) as a potential mediator in HPA axis adaptation to
extreme stress and in psychiatric symptoms related to PTSD. Whatever the case, it is important
to understand the feelings of children whose minds and bodies may be reacting to traumatic
experience in many ways—in the experience of stress and fear, in accelerated heart rate, irregular
breathing patterns, hyperactive or sluggish behaviour, disturbed sleep or hypervigilance—and
who are locked into an inhibited biochemistry. This inhibition is at the level of their ability to
process stress biologically and to feel and create the preconditions for action.
There are several types of neurotransmitter and neuromodulator associated with PTSD
(Coupland 2000). CRH, the hormone active in the HPA stress axis, shows increased levels in the
cerebrospinal fluid of trauma patients, and may be associated with the major depression
frequently comorbid with PTSD. Raised levels of glutamate neurotransmitters may be connected
to dissociative amnesia and reduced hippocampal volumes observed in PTSD (literally atrophy
of the hippocampus, which has a vital role in learning and memory); this links directly to the
enhanced negative feedback theory. Reduced levels of gamma-aminobutyric acid (GABA) may
be related to alcoholism and depression—both, again, disorders comorbid with PTSD.
Noradrenaline, a neurotransmitter synthesized from dopamine, is associated with attention,
working memory and arousal-related symptoms. There is evidence for noradrenergic hyperactiv-
ity in PTSD, and this may play a role both in reinforcing fear and hypervigilance and in disrupting
focused attention. Dopamine is a catecholamine neurotransmitter associated with motivation,
reward and the avoidance of threat. The presence of stress and activation of threat systems in
PTSD may disrupt the dopamine firing system and lead to impaired motivation, emotional
numbing, social estrangement, and also possibly psychotic symptoms often linked to trauma—in
particular, psychotic depression and schizophrenia.
Acetylcholine is an amine that serves as a neurotransmitter in the central and peripheral
nervous systems, including in the control of memory. Changes in levels observed in PTSD may
contribute to hypervigilance and nightmares. Serotonin is an amine neurotransmitter with an
important role in sleep, appetite and the control of aggression. The depletion of serotonin is asso-
ciated with impulsiveness, irritability, aggression and suicidal tendencies, which may once again
relate to PTSD symptoms. Finally, the opioid circuits are associated with hedonic responses and
blocking pain. Stress may trigger activity in these circuits, leading to cycles of dependence and
withdrawal from these endogenous opium-imitating neurotransmitters, and ultimately to
dysfunctions associated with PTSD.
There is a robust body of neuroendocrinal evidence that the experience of music may help to
regulate the HPA axis and modulate cortisol levels. As in the evidence for music and heart rate,
the effect depends on the kind of music. For example, listening to techno music appears to raise
cortisol levels (Gerra et al. 1998), and the music built into video games increases the cortisol/stress
response during game playing (Hebert et al. 2005). As in the case of the heart, the more common
effect of music is to lower or stabilize levels of stress, and consequently levels of cortisol (e.g.,
Miluk-Kolasa et al. 1994; Schneider et al. 2001; Uedo et al. 2004; Nilsson et al.
2005). This endocrinal evidence is supported by evidence from electroencephalograms (EEG),
functional magnetic resonance imaging (fMRI) and positron emission tomography (PET),
which shows activity associated with music and emotional responses to music in key components
of the HPA axis, in particular the amygdala and hippocampus (e.g., Wieser and Mazzola 1986;
Blood and Zatorre 2001; Brown et al. 2004; Baumgartner et al. 2006; Koelsch et al. 2006).
Although there is no evidence that music may in any way help reverse the atrophy of the
hippocampus observed in PTSD, there are intriguing indications that exposure to music—in this
case probably best defined as processed harmonic sound—may increase neurogenesis in the
hippocampus of developing rats (Panksepp and Trevarthen, Chapter 7, this volume).
There is a small but growing body of research on the effect of music on the neurotransmitters
and neuromodulators associated with PTSD. Once again, the influence of music seems to be
both modulatory and regulatory. There is evidence, for example, that listening to techno music
increases plasma norepinephrine levels (Gerra et al. 1998), whereas slow music decreases
levels (Yamamoto et al. 2003). A recent study on music, dopamine and blood pressure (Sutoo
and Akiyama 2004) suggests that music may regulate various brain functions through dopamin-
ergic neurotransmission, and may therefore be effective in rectifying symptoms of diseases
and disorders associated with dopamine dysfunction. There is evidence of a connection between
the experience of music and changes in both levels of release and central intercellular content
of serotonin (e.g. Evers and Suhr 2000), of a relationship between musical sensation and the
opioids (e.g. Blood and Zatorre 2001; Stefano et al. 2004) and of increased levels of melatonin
in Alzheimer’s patients participating in music therapy (Kumar et al. 1999). Finally, there is
growing interest in the peptide hormone oxytocin, which appears to have a role beyond its
traditional association with childbirth and lactation, and may be linked to more general feelings
of well-being, and indeed to the modulation of feelings of fear and stress in women, men
and children. A study of two groups of singers, one professional and one amateur (Grape et al.
2003) found that oxytocin concentrations increased significantly in both groups after a singing
lesson.
This leads to the fifth hypothesis, based on the juxtaposition of clinical evidence, preclinical
evidence, practical experience and observation, which is that music may play a modest role in
modulating and regulating neuroendocrine systems and the circuits of neurotransmitters
associated with trauma. For children with raised levels of cortisol, and those chronically locked
into low levels, listening to music and participating in music may help to exercise the HPA
axis in healthy and safe ways, both raising and lowering cortisol levels in the short term, and in
the long term reducing related symptoms of stress, such as heart rate, uncomfortable move-
ment repertoires, and respiration difficulties. Although the study of neurotransmission
and PTSD is at a speculative, preclinical stage and the evidence for the potential influence of
musical experience sketchy, there are grounds to hypothesize that music may have a small role
to play in regulating these systems, especially in the case of CRH, noradrenaline, dopamine and
the opioids.
These endocrine and neural axes are not just systems that happen to be associated with both
trauma and music; they constitute a crucial part of the fundamental psychobiological threshold
of all mood, feeling and emotion. Some aspects of the threshold are almost entirely biological
and subcortical; for example, the triggering of the acoustic startle effect (which travels almost
directly from the ear to spinal motor systems), the initiation of fear in relation to certain images
and sounds through pathways from the ear to the amygdala, and ‘chills’ in the opioid circuits that
result from musical stimuli (Blood and Zatorre 2001). Other functions of the threshold may be
mediated by the cognitive and reflective activity in the neocortex (Sloboda 1991; Peretz 2001).
The result is a complex set of surges and cross-currents in the neural pathways and flow of
endocrine information through the brain, which music seems to entrain, attract and somehow
embody.
Music appears to navigate the ocean of emotion in fluid ways, at times riding the surface, at
times diving to the depths. Music is non-verbal and has no need to land on the islands of emotion
labelled by linguists and psychologists as fear, hate, anger or love. It may course safely through
harbours, straits or underwater caverns; it may glide harmlessly through reflections of dangerous
mountain ranges on the water or simply explore the open sea.
The emotional power of music lies not only in its rhythms and tempi, but in melody,
the flow of intonation and expression, in its vitality affect (Stern 1999), character and valency,
and in harmony, timbre, structures, patterns, cognitive challenges and surprises. Some of these
features operate at the acquired level of culture and in the experiences of individuals. Others are
more universal and basic. In the history of human beings, musical cultures have investigated
different aspects of musical emotion in different places at different times. In some ways they
constitute a kind of global musical genome of complementary and mutually enhancing qualities:
for example, the harmonic/emotional invention of European music in relation to the melodic/
emotional qualities of Indian music or rhythmic/metabolic features (alongside the sophisticated
melodic, textural and harmonic features) of African music.
In work with traumatized children, the widest possible range of musical/emotional potential is
essential. Effective work with these children usually requires a wide repertoire of local cultural
traditions, including popular, folk and classical music, and whatever useful materials may be
appropriated, adapted or translated from neighbouring cultures and other world cultures. These
experiences are liberating for children who have been trapped by war in isolation and confined
spaces, and may be fashioned into safe but exciting musical voyages through the ocean of
emotion, beginning close to the emotional coordinates of the children, then heading out to sea to
circumnavigate treacherous rocks and islands and explore distant horizons.
15.3.1 A psychobiological loop

The effects of music on basal metabolism, emotion, feeling, mood and related symptoms of PTSD
are linked closely to the following issues of hearing, sensing, the heart, respiration and movement.
◆ Hearing may modulate movement, the heart, respiration and basal metabolism.
◆ The heart may modulate the basal metabolism and may condition respiration and movement.
◆ Respiration may modulate the heart and basal metabolism, condition movement and generate
sound.
◆ Movement may modulate heart rate, respiration and metabolism and generate sound.
◆ The basal metabolism may modulate the heart and respiration and may condition movement.
There are clear neural and endocrinal connections within the loop: the sympathetic autonomic
nervous system is largely synergic with the stress reaction of the HPA axis, and with neural
systems controlling excited movement and heightened voluntary and automatic respiration. The
parasympathetic nervous system is largely synergic with hormonal systems for relaxation, and
neural systems promoting regulated movement and respiration. The whole wide and rich range
of human emotion is associated with equally varied repertoires of movement, respiration
patterns and behaviours of the heart.
As has been described above, music appears to engage these systems—to make our
bodies move and feel—through a combination of direct subcortical and indirect neocortical
connections. An elegant theory that puts this science in a broader context is the theory of
communicative musicality. The theory evolved from the study of ‘musical’ prosodic interactions
between mothers and young babies, including the observation that musical prosody attracts
or entrains the first coordinated vocalizations and gestures of babies (Stern 1974, 1999;
Beebe et al. 1979; Trevarthen 1979, 1986, 1999; Trehub 1987, 1990; Fernald 1989; Papoušek
et al. 1990, 1991; Papoušek 1994; Trainor 1996; Section 2, this volume). The core of communica-
tive musicality, however, relates more to phenomenological theories of intersubjectivity
(Stern 2004). Musical prosodic utterance is understood as a vocal/sonic expression or
consequence of an individual’s state of mind, basal metabolism and motivation. Through
communicative musicality, and in particular through ‘amphoteronomy’ (Trevarthen et al. 2006),
these expressions and consequences are responded to and shared by others, in their mutually
controlled physiology.1 The act of imitating or responding to these signals of body state engages
parallel systems in the psychobiology of the individual or individuals who choose to engage
(Stern et al. 1985).
Although music may target specific parts of the psychobiological loop, for example with vocal
exercises to improve respiration or rhythmic exercises to help regulate movement repertoires, the
most powerful effects are holistic. The musical journeys and voyages that bring together the
healthy exploration and exercise of the heart, respiration, movement repertoires and emotion are
the most effective for children’s general health and well-being, and the most richly musical.
15.3.2 A biopsychosocial paradigm

The psychobiological loop takes its place in a proposed biopsychosocial paradigm for a
musical/creative arts intervention for children who are victims of conflict. In the paradigm, the
psychobiological concerns described above are connected, together with their associated
symptoms, to psychological concerns, including cognition, memory, communication and hope,
and related symptoms of trauma such as poor concentration, amnesia, avoidance, detachment
and depression. These lead to psychosocial concerns involving identity, trust, self-belief and
creativity, and associated symptoms such as depersonalization, anger and a lack of trust,
self-confidence and motivation. These are linked directly to both social and biosocial concerns such as
socialization, social communication, attachment, social cohesion and synchronization.
It is beyond the scope of this chapter to review the role of music in relation to these wider
concerns in any detail. (There is relevant discussion in the literature of music therapy, music
psychology, music education, ethnomusicology and the sociology and philosophy of music.)
However, I argue that just as music may operate holistically within the psychobiological loop, it
may also work holistically in the paradigm as a whole.
Let us imagine a typical workshop session with traumatized children. The children have been
composing a piece of music for voices and percussion with an interesting melody and
strong rhythmic ideas, and are about to perform it to an audience of peers or parents/carers.
As they start to play and sing, a number of processes unfold. The first is that the children are
synchronized, performing together in the same social space and time. The experience of music
offers a powerful focus for social cohesion and communication. This focus helps to reinforce the
children’s social identity. If they are victims of conflict, this social identity is very likely to have
been undermined or fragmented. In group musical performance of material the children have
created themselves, social identity may proudly be affirmed.
The children are also engaged in a process of trust. This will have begun earlier, in the sharing
of a creative activity. Now, in the act of performing together, it achieves a formal expression.
Similarly, the creative process will have provided the conditions for a sense of achievement and
for enhanced self-esteem and self-belief, all of which are now celebrated through performance
in a safe social/public space. In the process of musical composition, the children will have
communicated, sympathized and empathized with one another and expressed their feelings.
These transactions are now embodied in the musical work itself and its performance. Throughout
the process, and particularly now in performance, the children are exercising musical memory
and musical cognition, both of which are complex and demanding forms of more general
memory and cognition, frequently impaired in trauma. They are experiencing joy in the
sensation of making music and satisfaction in accomplishing the mental and physical/coordinative
challenges of the task. They are experiencing the emotional journey that has run through
the processes of both composition and performance. The children have entered the psychobiological
loop, where they are almost certainly aware of physical sensations and changes: in the
transformation of their metabolism, in their movements to the music, in their patterns of respiration
as they sing, and in the change of their heart rate. This leads back, full circle, to synchronization
and the pleasure, satisfaction, excitement and potentiation of thinking, feeling and
moving together dynamically, in both consciously mobile synrhythmia of communicative
expression and amphoteronomy of vital functions (Trevarthen et al. 2006; see footnote 1). As far
as I know, only music can bring all of these qualities together in this way, simultaneously, in a
shared instant.
It is a commonplace that music brings sounds, voices and pitches together in beautiful shared
moments called harmony. For traumatized children, it is perhaps even more significant that
music may bring together our biological, psychological and social lives in simultaneity, synergy
and harmony in moments which are both aesthetically beautiful and humanly transforming.
1 In each environment, the vitality of a developing child, first as an embryo and fetus inside the mother’s
body, and then in the interpersonal community after birth, is dependent on regulation or ‘government’
across a succession of frontiers with the human world. The first environment concerns physiological
states: the term amphoteronomic means ‘governing together’ in a two-way relationship or ‘containment’
(amphora = container; nomic = governing); it is contrasted with autonomic, meaning physiological
self-regulation. The direct regulation of psychological states, in intersubjectivity, Trevarthen defines as
synrhythmic, which leads to sharing symbolic awareness of culture and language. Sympathy in motives
and emotions is made possible by the coalescence of the dynamic regularities or rhythms of movement.
Thus, the term synrhythmic means ‘regulating together’, ‘rhythm’ referring to regularity ‘through time’, as in the flowing
of a river.
References
Adler LA, Kunz M, Chua HC, Rotrosen J and Resnick SG (2004). Attention-deficit/hyperactivity disorder
in adult patients with posttraumatic stress disorder (PTSD): Is ADHD a vulnerability factor? Journal of
Attention Disorders, 8(1), 11–16.
American Psychiatric Association (1994). Diagnostic and statistical manual of mental disorders, 4th edn.
American Psychiatric Association, Washington DC.
Aragon D, Farris C and Byers JF (2002). The effects of harp music in vascular and thoracic surgical
patients. Alternative Therapies in Health and Medicine, 8(5), 52–60.
Baumgartner T, Lutz K, Schmidt CF and Jancke L (2006). The emotional power of music: How music
enhances the feeling of affective pictures. Brain Research, 1075 (1), 151–164.
Beckham JC, Vrana SR, Barefoot JC, Feldman ME, Fairbank J and Moore SD (2002). Magnitude and
duration of cardiovascular responses to anger in Vietnam veterans with and without posttraumatic
stress disorder. Journal of Consulting and Clinical Psychology, 70(1), 228–234.
Beebe B, Stern D and Jaffe J (1979). The kinesic rhythm of mother–infant interactions. In AW Siegman and
S Feldstein, eds, Of speech and time; temporal speech patterns in interpersonal contexts, pp. 23–34.
Erlbaum, Hillsdale, NJ.
Bergovec PA, Heim I, Vasilj I, Jembrek-Gostovid M, Bergovec PA and Simad M (2005). Acute coronary
syndrome and the 1992–95 war in Bosnia and Herzegovina: a 10-year retrospective study. Military
Medicine, 170(5), 431–434.
Bernardi L, Porta C and Sleight P (2006). Cardiovascular, cerebrovascular and respiratory changes
induced by different types of music in musicians and non-musicians: The importance of silence. Heart,
92(4), 445–452.
Blood AJ and Zatorre RJ (2001). Intensely pleasurable responses to music correlate with activity in brain
regions implicated in reward and emotion. Proceedings of the National Academy of Sciences USA,
98(20), 11818–11823.
Brent DA, Perper JA, Moritz G, Liotus L, Richardson D, Canobbio R, Schweers J and Roth C (1995).
Posttraumatic stress disorder in peers of adolescent suicide victims: Predisposing factors and
phenomenology. Journal of the American Academy of Child and Adolescent Psychiatry, 34(2), 209–215.
Brown S, Martinez MJ and Parsons LM (2004). Passive music listening spontaneously engages limbic and
paralimbic systems. Neuroreport, 15(13), 2033–2037.
Buckley TC, Holahan D, Greif JL, Bedard M and Suvak M (2004). Twenty-four-hour ambulatory assessment
of heart rate and blood pressure in chronic PTSD and non-PTSD veterans. Journal of Traumatic Stress,
17(2), 163–171.
Buckley TC and Kaloupek DG (2001). A meta-analytic examination of basal cardiovascular activity in
posttraumatic stress disorder. Psychosomatic Medicine, 63(4), 585–594.
Byers JF and Smyth KA (1997). Effect of a musical intervention on noise annoyance, heart rate, and blood
pressure in cardiac surgery patients. American Journal of Critical Care, 6(3), 183–191.
Cardigan ME, Caruso NA, Haldeman SM, McNamara ME, Noyes DA, Spadafora MA and Carroll DL
(2001). The effects of music on cardiac patients on bed rest. Progress in Cardiovascular Nursing,
16(1), 5–13.
Cassandro E, Chiarella G, Catalano M, Gallo LV, Marcelli V, Nicastri M and Petrolo C (2003). Changes in
clinical and instrumental vestibular parameters following acute exposition to auditory stress.
Acta Otorhinolaryngologica Italica, 23(4), 251–256.
Cohen H, Benjamin J, Geva AB, Matar MA, Kaplan Z and Kotler M (2000). Autonomic dysregulation in
panic disorder and in post-traumatic stress disorder: Application of power spectrum analysis of heart
rate variability at rest and in response to recollection of trauma or panic attacks. Psychiatry Research,
96(1), 1–13.
Cohen H, Kotler M, Matar MA, Kaplan Z, Loewenthal U, Miodownik H and Cassuto Y (1998). Analysis of
heart rate variability in posttraumatic stress disorder patients in response to a trauma-related reminder.
Biological Psychiatry, 44(10), 1054–1059.
Coupland NJ (2000). Neurotransmitters and brain mechanisms. In D Nutt, J Davidson and J Zohar, eds,
Posttraumatic stress disorder: diagnosis, management and treatment, pp. 68–99. Martin Dunitz, London.
Đapić R and Stuvland R (2002). Longitudinal study of the war-related traumatic reactions of children in
Sarajevo in 1993, 1995 and 1997. In S Powell and E Duraković-Belko, eds, Sarajevo 2000: The
Psychosocial Consequences of War, Results of empirical research from the territory of former Yugoslavia,
pp. 156–160. D. O. O. OTISAK, Sarajevo, Bosnia-Herzegovina.
Delahanty DL, Nugent NR, Christopher NC and Walsh M (2005). Initial urinary epinephrine and cortisol
levels predict acute PTSD symptoms in child trauma victims. Psychoneuroendocrinology, 30(2), 121–128.
DeNora T (2000). Music in everyday life. Cambridge University Press, Cambridge, New York.
Donker GA, Yzermans CJ, Spreeuwenberg P and Van der Zee J (2002). Symptom attribution after a plane
crash: Comparison between self-reported symptoms and GP records. British Journal of General Practice,
52(484), 917–922.
Drljević K and Mehmedbašić S (2005). The frequency of female genital cancer at the Gynecological
Department of the Cantonal Hospital in Zenica – before, during and after the war in Bosnia-
Herzegovina. (In Bosnian). Medicinski Arhiv, 59(3), 183–187.
Escher J and Evequoz D (1999). Music and heart rate variability. Study of the effect of music on heart rate
variability in healthy adolescents. (In German.) Schweizerische Rundschau für Medizin Praxis,
88(21), 951–952.
Evers S and Suhr B (2000). Changes of the neurotransmitter serotonin but not of hormones during short
time music perception. European Archives of Psychiatry and Clinical Neuroscience, 250(3), 144–147.
Famularo R, Fenton T, Kinscherff R and Augustyn M (1996). Psychiatric comorbidity in childhood post
traumatic stress disorder. Child Abuse and Neglect, 20(10), 953–961.
Fatusić Z, Kurjak A, Grgić G and Tulumović A (2005). The influence of war on perinatal and maternal
mortality in Bosnia and Herzegovina. Journal of Maternal–Fetal and Neonatal Medicine, 18(4), 259–263.
Fernald A (1989). Intonation and communicative interest in mother’s speech to infants: Is the melody the
Forneris CA, Butterfield MI and Bosworth HB (2004). Physiological arousal among women veterans with
and without posttraumatic stress disorder. Military Medicine, 169(4), 307–312.
Fried R (1990a). Integrating music in breathing training and relaxation: I. Background, rationale and
relevant elements. Biofeedback and Self-Regulation, 15(2), 161–169.
Fried R (1990b). Integrating music in breathing training and relaxation: II. Applications. Biofeedback and
Self-Regulation, 15(2), 171–177.
Gerra G, Zaimović A, Franchini D, Palladino M, Giucastro G, Real N, Maestri O, Caccavari R,
Delsignore R and Brambilla F (1998). Neuroendocrine responses of healthy volunteers to
‘techno-music’: Relationships with personality traits and emotional state. International Journal of
Psychophysiology, 28(1), 99–111.
Goenjian AK, Yehuda R, Pynoos RS, Steinberg AM, Tashjian M, Yang RK, Najarian LM and Fairbanks LA
(1996). Basal cortisol, dexamethasone suppression of cortisol and MHPG in adolescents after the 1988
earthquake in Armenia. American Journal of Psychiatry, 153(7), 929–934.
Goldstein RD, Wampler NS and Wise PH (1997). War experiences and distress symptoms of Bosnian
children. Pediatrics, 100(5), 873–878.
Grape C, Sandgren M, Hansson LO, Ericson M and Theorell T (2003). Does singing promote well-being?:
An empirical study of professional and amateur singers during a singing lesson. Integrative Physiological
and Behavioral Science, 38(1), 65–74.
Hebert S, Beland R, Dionne-Fournelle O, Crete M and Lupien SJ (2005). Physiological stress response to
video-game playing: The contribution of built-in music. Life Sciences, 76(20), 2371–2380.
Husain SA (2000). Posttraumatic stress reactions in the children and adolescents of Sarajevo during the
war. From a symposium held at the Faculty of Philosophy in Sarajevo, 7 and 8 July.
Iwanaga M (1995). Relationship between heart rate and preference for tempo of music. Perceptual and
Motor Skills, 81(2), 435–440.
Iwanaga M, Kobayashi A and Kawasaki C (2005). Heart rate variability with repetitive exposure to music.
Biological Psychology, 70(1), 61–66.
Iwanaga M and Tsukamoto M (1997). Effects of excitative and sedative music on subjective and
physiological relaxation. Perceptual and Motor Skills, 85(1), 287–296.
Kibler JL and Lyons JA (2004). Perceived coping ability mediates the relationship between PTSD severity
and heart rate recovery in veterans. Journal of Traumatic Stress, 17(1), 23–29.
Knight WEJ and Rickard NS (2001). Relaxing music prevents stress-induced increases in subjective anxiety,
systolic blood pressure and heart rate in healthy males and females. Journal of Music Therapy,
38(4), 254–272.
Koelsch S, Fritz T, V Cramon DY, Muller K and Friederici AD (2006). Investigating emotion with music:
An fMRI study. Human Brain Mapping, 27(3), 239–250.
Konopka W, Pawlaczyk-Luszczyńska M, Sliwińska-Kowalska M, Grzanka A and Zalewski P (2005).
Effects of impulse noise on transiently evoked otoacoustic emission in soldiers. International Journal of
Audiology, 44(1), 3–7.
Kumar AM, Tims F, Cruess DG, Mintzer MJ, Ironson G, Loewenstein D, Cattan R, Fernandez JB,
Eisdorfer C and Kumar M (1999). Music therapy increases serum melatonin levels in patients with
Alzheimer’s disease. Alternative Therapies in Health and Medicine, 5(6), 49–57.
Lang L, McInerney U and Monaghan R (2002). Supervision—processes in listening together—an experience
of distance supervision of work with traumatised children. In J Sutton, ed., Music, music therapy and
trauma, pp. 211–231. Jessica Kingsley, London.
Lee OK, Chung YF, Chan MF and Chan WM (2005). Music and its effect on the physiological responses
and anxiety levels of patients receiving mechanical ventilation: a pilot study. Journal of Clinical Nursing,
14(5), 609–620.
Li L, Korngut LM, Frost BJ and Beninger RJ (1998). Prepulse inhibition following lesions of the inferior
colliculus: Prepulse intensity functions—selective uptake and axonal transport of D-[3H] aspartate.
Physiology and Behavior, 65(1), 133–139.
Mâche F-B (1993). Music, myth and nature, or The dolphins of Arion. Translated by Susan Delaney. Harwood
Academic Publishers, New York.
1999–2000), 29–57.
Meloni EG and Davis M (1998). The dorsal cochlear nucleus contributes to a high intensity component of
the acoustic startle reflex in rats. Hearing Research, 119(1–2), 69–80.
Miluk-Kolasa B, Obmiński Z, Stupnicki R and Golec L (1994). Effects of music treatment on salivary cortisol
in patients exposed to pre-surgical stress. Experimental and Clinical Endocrinology, 102(2), 118–120.
Mok E and Wong KY (2003). Effects of music on patient anxiety. AORN Journal, 77(2), 396–397, 401–406,
409–410.
Mrena R, Paakkonen R, Back L, Pirvola U and Ylikoski J (2004). Otologic consequences of blast
exposure: a Finnish case study of a shopping mall bomb explosion. Acta Otorhinolaryngologica,
124(8), 946–952.
Nermina O (2005). Cancer incidence in the Sarajevo region. Medicinski Arhiv, 59(4), 250–254.
Nilsson U, Unosson M and Rawal N (2005). Stress reduction and analgesia in patients exposed to
calming music postoperatively: A randomized controlled trial. European Journal of Anaesthesiology,
22(2), 96–102.
Nixon RD and Bryant RA (2005). Induced arousal and reexperiencing in acute stress disorder. Journal of
Anxiety Disorders, 19(5), 587–594.
Pachetti C, Aglieri R, Mancini F, Martignoni E and Nappi G (2000). Active music therapy and Parkinson’s
disease: Methods. Functional Neurology, 13(1), 57–67.
Papoušek M (1994). Melodies in caregivers’ speech: A species-specific guidance towards language. Early
Papoušek M, Bornstein MH, Nuzzo C, Papoušek H and Symmes D (1990). Infant responses to prototypical
melodic contours in parental speech. Infant Behavior and Development, 13, 539–545.
Papoušek M, Papoušek H and Symmes D (1991). The meanings and melodies in motherese in tone and
stress languages. Infant Behavior and Development, 14, 415–440.
Patterson JH Jr and Hamernik RP (1997). Blast overpressure induced structural and functional changes in
the auditory system. Toxicology, 121(1), 29–40.
37(3), 215–231.
Peretz I (2001). Listen to the brain: the biological perspective on musical emotions. In P Juslin and
J Sloboda, eds, Music and emotion: Theory and research, pp. 105–134. Oxford University Press, London.
Peretz I and Kolinsky R (1993). Boundaries of separability between rhythm in music discrimination:
A neuropsychological perspective. The Quarterly Journal of Experimental Psychology, 46(2), 301–325.
Rasmusson AM, Vythilingam M and Morgan CA 3rd (2003). The neuroendocrinology of posttraumatic
stress disorder: New directions. CNS Spectrums, 8(9), 651–667.
Reinhardt U (1999). Investigations into synchronisation of heart rate and musical rhythm in relaxation
therapy in patients with cancer pain. (In German.) Forschende Komplementärmedizin,
6(3), 135–141.
Sack M, Hopper JW and Lamprecht F (2004). Low respiratory sinus arrhythmia and prolonged
psychophysiological arousal in posttraumatic stress disorder: Heart rate dynamics and individual
differences in arousal regulation. Biological Psychiatry, 55(3), 284–290.
Sahar T, Shalev AY and Porges SW (2001). Vagal modulation of responses to mental challenge in posttrau-
matic stress disorder. Biological Psychiatry, 49, 637–643.
Schneider N, Schedlowski M, Schürmeyer TH and Becker H (2001). Stress reduction through music in
patients undergoing cerebral angiography. Neuroradiology, 43(6), 472–476.
Selye H (1980). Selye’s guide to stress research. Van Nostrand Reinhold, New York.
Sloboda J (1991). Musical expertise. In KA Ericsson and J Smith, eds, Toward a general theory of expertise:
Prospects and limits, pp. 153–172. Cambridge University Press, Cambridge, MA.
Stefano GB, Zhu W, Cadet P, Salamon E and Mantione KJ (2004). Music alters constitutively expressed
opiate and cytokine processes in listeners. Medical Science Monitor, 10(6), MS18–27.
behaviours. In M Lewis and LA Rosenblum, eds, The effect of the infant on its caregiver, pp. 187–213.
Wiley, New York.
Sutoo D and Akiyama K (2004). Music improves dopaminergic neurotransmission: demonstration based
on the effect of music on blood pressure regulation. Brain Research, 1016(2), 255–262.
Tomic V and Galic M (2005). Perinatal mortality at the University Hospital Mostar during the period
1999 to 2003. (In Bosnian.) Medicinski Arhiv, 59(6), 354–357.
Trehub SE (1987). Infants’ perception of musical patterns. Perception and Psychophysics, 41(6), 635–641.
by their parents. In MA Berkley and WC Stebbins, eds, Comparative perception; Vol. 1, Mechanisms,
intersubjectivity. In M Bullowa, ed., Before speech: The beginning of human communication, pp. 321–347.
Trevarthen C (1986). Development of intersubjective motor control in infants. In MG Wade and HTA
Whiting, eds, Motor development in children: Aspects of coordination and control, pp. 209–261. Martinus
Nijhoff, Dordrecht, Holland.
Trevarthen C, Aitken KJ, Vandekerckhove M, Delafield-Butt J and Nagy E (2006). Collaborative regulations
of vitality in early childhood: Stress in intimate relationships and postnatal psychopathology. In
D Cicchetti and DJ Cohen, eds, Developmental psychopathology, Volume 2: Developmental neuroscience,
2nd edn, pp. 65–126. Wiley, New York.
Uedo N, Ishikawa H, Morimoto K, Ishihara R, Narahara H, Akedo I, Ioka T, Kaji I and Fukuda S (2004).
Reduction in salivary cortisol level by music therapy during colonoscopic examination.
Hepato-Gastroenterology, 51(56), 451–453.
Updike PA and Charles DM (1987). Music Rx: Physiological and emotional responses to taped music
programs of preoperative patients awaiting plastic surgery. Annals of Plastic Surgery, 19(1), 29–33.
Urakawa K and Yokoyama K (2005). Music can enhance exercise-induced sympathetic dominancy assessed
by heart rate variability. Tohoku Journal of Experimental Medicine, 206(3), 213–218.
VanderArk SD and Ely D (1993). Cortisol, biochemical, and galvanic skin responses to music stimuli of
different preference values by college students in biology and music. Perceptual and Motor Skills,
77(1), 227–234.
Wieser HG and Mazzola G (1986). Musical consonances and dissonances: Are they distinguished
independently by the right and left hippocampi? Neuropsychologia, 24(6), 805–812.
Yamamoto T, Ohkuwa T, Itoh H, Kitoh M, Terasawa J, Tsuda T, Kitagawa S and Sato Y (2003). Effects of
pre-exercise listening to slow and fast rhythm music on supramaximal cycle performance and selected
metabolic variables. Archives of Physiology and Biochemistry, 111(3), 211–214.
Yehuda R (2000). Neuroendocrinology. In D Nutt, J Davidson and J Zohar, eds, Posttraumatic stress
disorder: Diagnosis, management and treatment, pp. 53–67. Martin Dunitz, London.
Ylikoski J (1987). Audiometric configurations in acute acoustic trauma caused by firearms. Scandinavian
Audiology, 16(3), 115–120.
Yule W (1994). Posttraumatic stress disorder. Plenum, New York.
Chapter 16
Between communicative musicality and
collaborative musicing: A perspective
from community music therapy
Mercédès Pavlicevic and Gary Ansdell
The meaning of the world can only be acquired in communication
and collaboration with other people. There is no such thing as meaning
found entirely by a single self. Meaning has to be communicated,
or communicable.
Trevarthen (2003, p. 67)
Regardless of how we define intersubjectivity, it must operate for
groups as well as dyads. The couple is a subsystem of the basic units
of evolutionary adaptiveness: the family and the tribe.
Stern (2004, p. 98)
No person moves directly from protomusicality to musicking.
Musicking, based on human protomusicality involves appropriation
of music as culture.
Stige (2003, p. 173)
16.1 Introduction
The marriage between communicative musicality and improvisational music therapy was osten-
sibly made in heaven. Music therapists in search of a new explanatory and legitimating theory
came back from conferences in the 1980s with the fruits of Trevarthen and Stern’s work. They felt
their search had ended. We, too, are among the many music therapists to admire the elegant
synthesis Malloch and Trevarthen subsequently made in their theory of communicative musical-
ity, which they described as ‘a foundation for a theory of music therapy’ (Trevarthen and Malloch
2000, p. 5) based on the belief that ‘Communicative Musicality is the source of the music thera-
peutic experience and its effects’ (p. 3).
In this chapter, we look at this theory from the perspective of a recent development in music
therapy called Community Music Therapy (Pavlicevic and Ansdell 2004; Stige 2003), a practice
for which Culture-Centred Music Therapy (Stige 2002) is the metatheory. This latest shift is
towards a more culture-centred, context-sensitive and reflexive orientation; it works clinically
across a continuum from the traditional music therapy dyad through to communal musicing¹ in
social contexts. We suggest that for this new approach, communicative musicality provides a nec-
essary, but not sufficient, theoretical platform. What further theory is necessary for accounting
for how music therapy works in broader contexts, and at a more social and cultural level beyond
dyadic forms of relatedness? We suggest—as the beginning of an answer to this question—a
model for coupling such subsequent musical and social development, by way of cultural learning
(musicianship) and direct social participations (musicing). We call this further function of music
‘collaborative musicing’.
We are working here partly from largely undeveloped hints by Trevarthen, Stern and Malloch
that comment on culture, learning and context as factors elaborating the basic capacity of com-
municative musicality in broader social structures; we draw more explicitly from the recent pio-
neering theoretical work on community music therapy by music therapy theorist Brynjulf Stige.
16.2 From music therapy and communicative musicality to
community music therapy and collaborative musicing
16.2.1 Shifting practice, shifting theory
Historical and metatheoretical work on music therapy (Ansdell 1999, 2003; Stige 2002, 2003;
Ruud 1980, 1998) has shown just how flexible and pragmatic music therapy has been down the
ages in matching practice to theory, or theory to practice. Over several millennia, music has been
linked to healing practices via religious or philosophical epistemology (Horden 2000; Gouk
2000). Music therapy can therefore be seen as a discursively produced social construction,
responding to its time, place and purpose by telling various mythical, theoretical and scientific
stories about how and why music helps people. By relating itself to current treatment theories,
modern music therapy (half a century old now as an acknowledged international discipline) has
continued this pattern, forging interdisciplinary links and borrowing theory to account for what
it is doing, and why this should be necessary in specific social contexts. Equally, it could be argued
that music therapy practice has reciprocally inspired interdisciplinary theory in areas such as
developmental psychology, and the psychology and sociology of music, each of which has found
in music therapy practice an exemplary link between music, human relatedness, society and cul-
ture, and their connection to health and well-being (Aldridge 2004; DeNora 2000, 2003;
Macdonald et al. 2002; Ruud 1998; Stige 2003; Miell et al. 2005).
One of the interesting sub-stories of this perennial music/therapy narrative has been the
mutual relationships between developmental psychology, psychotherapy and music therapy over
the past 20 years. Both Trevarthen’s and Stern’s work was seemingly tailor-made for music ther-
apy—as empirically supported theories which gave both an explanation of, and a justification
for, music therapy practice. In the past 20 years, this psychobiological narrative (in its mature
elaboration as the theory of communicative musicality in Malloch’s and Trevarthen’s work) has
been a rich influence on music therapy theory, training, research and practice
(see Pavlicevic 2000; Ansdell and Pavlicevic 2005 for an overview and bibliography of this work;
see Pavlicevic 1991, 1997, 2000, 2001 for Pavlicevic’s research based partly on this theory).
1 We use the unconventional form ‘musicing’ (along with musics and musicers) to emphasize an unconven-
tional form of thinking about music. Along with other recent theorists (Small 1998; Elliott 1995),
we argue that music is primarily an activity rather than an object. An alternative spelling ‘musicking’ is
found in the work of other authors.
Taking stock of the situation, our view is that this apparent theoretical success story has had
mixed consequences for contemporary music therapy. The positive consequences have been
widely discussed and elaborated in music therapy literature. We share the view that the theory of
communicative musicality has provided an elegant link between human social and musical
processes of great use to thinking about music therapy (Ansdell and Pavlicevic 2005). It was as if
developmental psychologists and music therapists had quite naturally converged on this shared
territory from complementary angles: the developmental psychologists understanding human
communication through music, and music therapists understanding more about ‘music’s help’
for people through the processes of human protocommunication.
What, then, could be the problems with this? One drawback, we suggest, has to do with the
epistemological origin of the theory itself; the other has to do with the particular uses and selections
music therapists have made of it, perhaps at variance with the authors’ original intentions.
To take the second point first: many music therapists’ initial use of Stern and Trevarthen’s work
tended to reduce music in music therapy to ‘just’ preverbal protomusic. This was linked with a
project of redescribing music therapy as a form of psychotherapy, in which the purely psycholog-
ical relationship between therapist and client was privileged. Musical communication was seen as
just a means of establishing this psychological therapeutic relationship, which was seen as the key
healing agent. This use of early interaction theory emphasized the nature of music–therapeutic
dyadic relationships, at the cost of attention to groups and communal events in music therapy,
and did not pay sufficient attention to social and cultural perspectives on music therapy
(although this situation has recently been balanced a little with publications on group work in
music therapy—Davies and Richards 2002; Pavlicevic 2003). Overall, this perspective stood
starkly against our clinical experience as music therapists, experience that highlighted how strik-
ing the specifically musical relationship was with clients, and how such a relationship could natu-
rally and flexibly range between intimate musical companionship and the broader musical
community according to circumstance, need, and physical and cultural context.
More generally, we wonder whether the theories of communicative musicality and conven-
tional music therapy share roots in what has been critiqued as the psy-complex, ‘the set of theo-
ries and practices which reproduces this society from the base up, from the individual out …’
(Parker and Spears 1996, p. 1). Psychology’s basic unit of study—the individual—is subsequently
reflected in our therapeutic culture, which is arguably constructed on the basis of defining clients
(and their needs), and therapists (and their solutions), from an individualized and somewhat
asocial, acultural perspective.
Plenty of objections could be made to the statements in the last paragraph. Surely communica-
tive musicality has emerged from developmental psychology, which is highly attuned to dyadic
relationships and contains many hints pointing to how crucial culture and context are to success-
ful intersubjective communication. And surely psychodynamic music therapy (which first made
most use of this theory) has successfully moved from a one-person to a two-person psychology
(Alvarez 1992). Each discipline is rather a dyadic psychology, on which, it is argued, broader
social and cultural communication is built.
We will leave these points here for the time being, to elaborate our critique as this chapter
proceeds, but with the following question: Has the largely dyadic focus of the research and theory of
communicative musicality left music therapy theory, which it has heavily influenced, with too narrow
a perspective? If so, what else might be needed?
16.2.2 From musical companionship to musical community

These questions have been brought into sharper focus recently by the arrival of community
music therapy, which draws on a more sociocultural perspective of music and health.
Many music therapists are realizing how this emerging approach both describes and directs
music therapy in ways more congruent with current practice needs, and engages well with
parallel moves in other academic and psychosocial disciplines. Seen from a metatheoretical
perspective, music therapy is similar to other disciplines, undergoing a shift in practice and
theory about every 30 years. This latest shift has nevertheless been motivated by a combination of
significant changes in social, political, cultural and intellectual areas (Stige 2003, 2004).
One key influence behind the development of a more cultural orientation to music therapy has
been the reorientation over the past 10 years of many of the musical and paramusical disciplines,
and their shift in thinking about how music, social life, and well-being relate to each other
(Ansdell 1997, 2001, 2004). By the mid-1990s, an important paradigm shift was occurring in
musicology—from the study of musical texts in isolation to looking at musicing people,
performances, culture and context, improvisation, and the many uses of music in and as social
action. Musicology’s overall interest was now in music in communication and culture. For the
first time, music therapy had access to serious thinking about music and musicing that genuinely
matched its own projects and research. A similar shift was happening in music psychology and
music sociology—both towards a more ecological view of music-in-context (shifting investiga-
tive focus to music in everyday life practices), and increasingly developing cross-disciplinary
theory which related musicology, ethnomusicology and the music-focused genres of the psycho-
logical and social sciences (DeNora 2001, 2003; Davidson 2004; Clarke 2003; Macdonald et al.
2002; Miell et al. 2005; Clarke and Cook 2004; Clarke 2005).
For some music therapists, these changes within and without music therapy have led to a
rethinking of the place and significance of communicative musicality in the rapidly developing
interdisciplinary jigsaw of music in human life—of which we (among others) consider music
therapy to be a small but important segment (Stige 2003; Ansdell and Pavlicevic 2005). Thus, our
question for the rest of this chapter is first: Where is the place of communicative musicality in
this new jigsaw? and second: What, if anything, develops from this theory, to help us to under-
stand the broader practices of community music therapy, and the experiences people report from
these practices?
16.2.3 Reframing communicative musicality for community
music therapy
Trevarthen and Malloch (2000, p. 4) imply a further ongoing developmental trajectory from
protomusicality when they state that ‘a child enters the musical culture by having a natural
talent for the “outward signs of human communication”’. Both naturally, and in the specific
frame of music therapy, there is the development of ‘sympathetic human company’ through
the musical coordination of minds in time (Trevarthen 1999). The result of this is, in their
appealing phrase, ‘musical companionship’. This state in turn motivates the ‘rapid learning of
culture-related skills’. However, while Trevarthen and Malloch gesture towards this further
sociocultural realm, their musical ‘dance of wellbeing’ remains essentially styled as a dyadic
pas-de-deux. What of the ensemble dance? Is this just a multiplication of dyadic musical
companionship, or something more? Do we move beyond communicative musicality when the
ensemble is reached?
An interesting evaluation of how the theory of communicative musicality relates to broader
contemporary thinking about people and music is given by the evolutionary music psychologist
Ian Cross (directly referencing Trevarthen’s work):
The mature musical competences exhibited by members of a culture are grounded in infant protomu-
sical capacities. Culture, in the form of specific modes of interaction conditioned by shared ways of
understanding, shapes and particularizes protomusical behaviours and propensities into specific
forms for specific functions. The potential for multiplicity of meaning embodied in protomusical
activity is likely to underwrite though not to direct or determine a culture’s musical ontologies.
Cross (2003, p. 27)
This formulation addresses what is built on the platform that communicative musicality
creates, and answers the intuitive disquiet we had about music therapists’ earlier uses of this the-
ory—the fact that for them music in music therapy seemed to stop with protomusic! Instead, as
characterized by Cross here, we clearly see how the mature cultural elaboration of the basic
capacity of communicative musicality relates to our communal lives.
Within music therapy, Brynjulf Stige’s (2003) integrative theorizing is probably the most
exacting treatment yet of this area, and is written explicitly to provide an understanding of com-
munity music therapy processes. His critique of the uses of communicative musicality theory in
music therapy is similar to ours. ‘No person’, writes Stige, ‘moves directly from protomusicality
to musicking. Musicking, based on human protomusicality involves appropriation of music as
culture’ (2003, p. 173). Figure 16.1 shows Stige’s basic model, which expresses a dynamic relation-
ship between protomusicality (based on communicative musicality theory), musics (as a diver-
sity of cultural artifacts affording a variety of tools), and musicking.
Stige’s central argument is to put the psychobiological foundation of musicality in relation
to the cultivated aspects of musical–cultural learning and the varieties of interaction
(dyadic through to communal) in musicing:
There is no legitimate foundation for a music therapy theory neglecting the conventional and social
aspects of musicking. It is not possible to go directly from phylogeny to ontogeny, and it is not helpful
to consider music as ‘natural’ or ‘preverbal’ or ‘preconventional’ communication only … While I will
not neglect the possibility of preconventional aspects in interaction based on the shared human capac-
ity for interaction through sound and movement, I suggest we will have a much better notion of the
power of musicking if we are sensitized to the conventional and postconventional aspects of music
therapy musicking also.
Stige (2003, p. 170)
Fig. 16.1 Relationships between protomusicality, musics, and musicking (Stige 2002, p. 83). [Diagram labels: Musicking, Microgenesis, Ontogeny, Phylogeny, Cultural history, Protomusicality, Musics.]
16.3 A model of musical–social development for community
music therapy
The model we present now has many similarities to Stige’s, and a few differences. We present it as
a way of actively expressing the relationship between social and musical experience, as we observe
this happening in community music therapy practice. We suggest that two functions are
activated by this relationship: music in the service of human communication, and music in the
service of human collaboration. Here, it is useful to reflect on the root of each word: the Latin col-
laborare means working together, and communicare is sharing with, and being together.
In this schema (derived from a composite of ideas from Stige 2002, 2003; Elliott 1995; Small
1998; Blacking 1973; Becker 2001; Benzon 2001; Benson 2003; DeNora 2000, 2003) there is a pro-
gressive development from a basic psychobiological capacity—as suggested by the theory of
communicative musicality—through to a facility developed in cultural learning, which in turn
facilitates the social activity of musicing with and for others: collaborative musicing. Here are our
suggested definitions of each level.
MUSICALITY is a core human capacity, and a basic response to and engagement with the
human world.
1 It is a phylogenetic, adaptive mechanism based on the biogrammar of the human brain and
the body’s repertoire of actions and gestures.
2 It affords basic human intersubjective communication through communicative musicality—
‘the art of human companionable communication’ (Malloch 1999), a functional capacity.
MUSICIANSHIP is a cultivated facility of musicality-in-action within sociocultural contexts.
1 It involves the skilful coupling of musicality to specific musical cultures, traditions, games,
techniques and artefacts.
2 This happens through the affordances offered by situated musics, and its skilled musicers,
and the appropriation of these by individuals (in short, the process of communicating and
generating musical knowing through musical doing).
MUSICING is a universal activity of musicianship in action.
1 It is the taking part in any way in musical activity, automatically relating to shared forms of
human musicality, but to specific traditions of musicianship.
2 It draws both its motivation and its meaning from specific social and culturally related needs,
functions and occasions, and always relates to a context of some kind.
3 Its basic reality is performance (whether solitary or communal), which creates relationships
between people, things and concepts.
This model implies that each slice of the inverted pyramid in Figure 16.2 is needed for attain-
ing the subsequent level: capacity leading to facility to activity. However, the traffic is two-way:
musicing stretches musicianship, which stimulates musicality. This two-way flow clarifies one of
the tasks of community music therapy: to address any of these in the interest of social and
musical development, with the therapist at times working from the top segment downwards.
Alongside this, we give the following very basic model of social development (Figure 16.3),
which Ansdell (2002) presented in illustration of the basic principle of community music
therapy.
Its simplicity suggests two basic points in the context of this chapter: first, as developmental
theory has convincingly shown, the ‘I’ is hypothetical, with an essentially relational I/You
dyad beginning the journey of personal development. In terms of social development as
interpreted by music therapy previously, the line has stopped in the middle—at the attainment of
‘We’, of ‘musical companionship’ and the ideal therapeutic dyad. We have suggested that community
music therapy works further along this potential social continuum, helping people also to
realize the ‘Us’ of ‘musical community’.
Fig. 16.2 The inverted pyramid of musical development. [Levels from top to bottom: Musicing (activity), the situatedness of occasions and performances; Musicianship (facility), the affordances and appropriations of musics and musicers; Musicality (capacity), the mobilization of core musicality via protomusicality; ‘Core’ is marked at the bottom.]
We now want to go one stage further and look at how the formulations presented thus far in
this section might interact. Figure 16.4 maps the two dimensions we have already established:
social experience (Figure 16.3) linked with musical experience (Figure 16.2).
The origin we designate ‘Core musicality’ (c)—a basic abstract capacity (with an equally
abstract designation of ‘I’) leading along the M axis with a trajectory of ‘ideal’ development
moving towards ‘musical community’. However, this trajectory is made up of two two-ended
arrows, symbolizing the movement back and forth, and between and around any of these.
The areas inside the dotted arcs represent the active relationship between the two forms of
experience. The arcs are permeable, symbolizing, again, that any phase also influences any other.
The area (a) towards the upper y-axis indicates that you cannot have musicing (as active musical
participation) without the concomitant social development to support this. Equally, area (b) on
the x-axis shows how a higher level of social development indicates at least the possibility
of social musicing; to put it another way, musicality could not remain unactivated at this level of
social development.
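Fig. 16.3 Social development. [A line running from Individual to Communal through four points: ‘I’, I/You, We, Us.]
Fig. 16.4 Towards musical community. [X-axis: social experience (‘I’, I/You, We, Us). Y-axis: musical experience (Musicality, Musicianship, Musicing). A diagonal M-axis runs from core musicality (c) at the origin towards musical community, with the labels ‘Communicative’ and ‘Collaborative’; areas (a), (b), (d) and (e) are marked.]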
From this model, we make a hypothesis: that this naturally incremental relationship between
musical and social experience incorporates two related but separately identifiable functions that
unfold along the M-axis of ideal development: communication and collaboration. We mean by
this that core musicality, (c), naturally becomes communicative (I/You); that musical compan-
ionship (‘We’) facilitates the development of musicianship (d); and that increasingly elaborate
forms of musicing become naturally collaborative (e). To put it another way, the relationship
between musical and social experience generates, and is generated by, musical communication
and musical collaboration.
Where, then, is communicative musicality in this diagram? We suggest it forms a territory above
the circle at the origin (c), linking a mobilizing of musicality in the service of communication—
at the incipient ‘I/You’ level. A further arc outwards from this would then express how—as
the dyad takes in elements of musical culture (e.g., in mother’s vocalizations and nursery
songs)—communication begins to service the development of musicianship (the expression
of musicality in and as culture). And thus, the successive arcs take us up the diagonal M-axis.
When we reach ‘We’, a genuine musical partnership has been built on the platform of commu-
nicative musicality and the ongoing cultural induction of musicianship. At this point (d) moving
towards (e), true musicing becomes possible, if perhaps only in dyadic relationships or small-scale
contexts.
We now suggest that a further function of the music/sociality relationship comes into being:
not just communication, but collaboration. As a partner to communicative musicality, we are
calling this collaborative musicing—the outward and audible sign of musical community.
Collaborative musicing builds community through making music together.
Why are we distinguishing communication and collaboration? In the remainder of this
chapter, we will illustrate how what we will call collaborative musicing is an identifiably different
function of the musical/social experience/development interaction. While it is clearly only
possible on the scaffolding of musical communication, it is not merely the accumulation of
dyadic musical communications, but the facilitation of paradyadic musical experiences. We
suggest that the phenomenology of collaborative musicing is different enough to warrant both a
separate formulation, and the attention of separate theoretical modelling.
A few caveats: we are not saying that communication and collaboration are simply discrete
functions; rather, the communicative function transforms into the collaborative. Moreover, the
traffic is not just from communicative to collaborative: in many circumstances there may be an
oscillation between these modes, as there is between the aspects of each of the axes: we live
between I/You/We/Us, similarly between musicality/musicianship/musicing.
Of particular relevance to us as music therapists is how illness and deprivation can impact upon
the ideal model presented in this section. For example, someone may need help to gain access
(for the first time, or again following trauma) to the affordances of the communicative or collabo-
rative functions of music. That is, they may need help repairing communicative musicality
through the cultivation of musical companionship (symbolized by the two-way arrows on the
M-axis in Figure 16.4). Equally, a person may need help in cultivating (or re-cultivating) the
means of collaborative musicing, in order to give access to their (musical) culture and community.
16.4 Three musical events

In this section, we will look at some musical events in terms of the model sketched in the previous
section. We use the term ‘musical event’ here in the specific sense that Tia DeNora (2003, p. 49) has
suggested: that is, ‘as an indicative scheme for how we might begin to situate music as it is
mobilised in action and as it is associated with social effects’. The following three events have been
chosen to reflect on contrasting social, cultural and contextual differences and similarities.
16.4.1 Group life: a music therapy event

This event happens in South Africa, and is a weekly music group with older teenagers
(aged 15–19) who are preparing for a public performance of Djembe drumming. They are from
highly dysfunctional families, and attend a community centre as part of a social rehabilitation
programme. The music therapist responsible for this performance (the first author) starts this
practice session with a free improvisation.
There are six of us, each with a Djembe drum. We stand in a circle, our drums in the middle,
and our eyes are closed. There is shuffling, sniffing, and we finally settle into a strong silence,
which increases in intensity and has a slightly hard, expectant quality.
Jamie bursts into rapid beating and instantly, the rest of the group (bar myself) jumps in.
Jamie’s playing has the same intensity and slight hardness as the silence before the music <1>.
All continue playing in rapid, accentuated sforzando mode, while I give the occasional tap on my
Djembe, listening closely to the absence of space or breathing in the music <2> (numbers refer to
sections in the improvisation).
Alfred and Helene stop playing <3>. Jamie continues, while Hannah gets louder and faster.
S’bongile, Jamie and Hannah each seem to be in their own musical world, overlapping acciden-
tally every now and then <4>. I now adjust my playing to a steady, strong and relaxed pulse,
slightly increasing the volume of my playing, while continuing to listen closely to each person
<5>. I become aware that Alfred, S’bongile and Helene have begun playing rhythmically around
and with my pulse. Our joint playing begins to take shape, and it feels as though part of the group
is beginning to gel and move together <6>. Jamie stops his fast busy playing, and at that moment,
Hannah takes over my pulse <7>. There’s the beginning of an accelerando and crescendo in our
playing together, which feels natural and relaxed, created by us all. Jamie has tentatively joined us,
playing quietly every now and then <8>.
The music now builds rapidly between us <9>, quickly gathering momentum and tension until
an enormous climax, and everyone bursts out laughing, pointing to one another’s instruments
and commenting, with much energy and vitality, on what good music that was <10>.
Here is musicing in the interests of social life (Figure 16.4). We see the shift from disconnection
<1–4> where the players remain a collection of isolated individuals (Figure 16.5a), towards com-
panionship <5–7> where multiple communicative acts happen at various points between some,
but not all the players (Figure 16.5b), and then collaboration <7–10>. At the point of collaboration
(Figure 16.5c), there is a working together that is beyond individuals and dyads, in which all
players come concurrently into a qualitatively distinctive entity and activity. We also see musi-
cianship in action: teenagers harnessing a cultural genre (Djembe drumming) which is socially
‘cool’ and which ultimately brings them together musically and socially. The example, however,
makes it clear that to separate the musical and the social does not make sense. From the initial
shuffling, and restless, hard quality of playing in parallel worlds, where the players’ core musical-
ity is not yet harnessed towards companionship, the therapist aims to galvanize the players
towards being a socially and musically bonded group. To do this, she draws from her musician-
ship, which, while based on her personal musicality <1–2>, connects to other affordances: the
music artefacts of their shared cultural world and the sociocultural space they work in.
Gradually, the players—including the therapist herself—begin to shift through micro-moments
of companionship <5>, <7>; through dyadic work (Jamie and the rest; the therapist and the
group; Hannah and the therapist); towards collaborative musicing, when the narrative, interest-
ingly, switches explicitly to musicing. Here is a community—social and musical—moving as one.
A vital role for the therapist is to listen closely. Through this, she attains an apperception of the
players’ core musicality through their musicing (which is mediated by the musicianship within
their own musical traditions). Her skill as a musician-therapist enables her to mobilize both
musicing and the group’s social development (M-axis, Figure 16.4). In the second part of the
improvisation <6–9>, we see the gradual shaping from multiple experiences of time, of parallel,
simultaneous communicative events, towards a collaborative texture of one inner and outer time,
one inner and outer music, one inner and outer mind, in which each and all of the others are
mutually orientated, but with each having arrived at such a point through individual pathways of
musicality–musicianship–musicing. There is thus a condition of both difference and unity—
varieties of togetherness that create musical communitas—where music therapist and client dis-
tinctions are levelled in what has become a seamless exchange in guiding and shaping music, and
in being guided and shaped in musicing.

Fig. 16.5 Forms of group life. (a) ‘I’, ‘I’, ‘I’ (isolation); (b) Between ‘we’ and ‘us’ (between
communication and collaboration); (c) Musical community.
The spilling over of intensity from the music to the laughter <10> is a continuation of inter-
subjective, interactive meaning that continues after the music has ended. Here is a sign that the
profound experience of collaborative musicianship has been in the service of group life. The shift
towards collaboration is recognized by all—as evidenced by the delighted laughter and mutual
recognition of ‘what good music that was’—what fun it was, how good it felt. This, we need to
remember, from a group of socially deviant young people with a fundamental distrust of one
another (and of adults). The group is now warmed up and ready to begin refining their perform-
ance piece.
16.4.2 A night at the opera: a traditional Western musical event

The event is a performance of Mozart’s Così fan tutte at a London opera house. I sit next to one
person I know, but also with several hundred strangers, who share with me only a wish to
relatively silently participate in this performance of music that most know and love. The simple
production and lively performance bring out the everyday miracle of Mozart’s drama: how the
music affords a vicarious experience of what each character on stage knows and feels about
themselves and each other, and how this shifts as the drama shifts the music, and as the music
shifts the drama.
This delight climaxes for me (and I sense for the hundreds of other audience members I share
this experience with) when the plot reaches the famous ensemble in the Finale of Act I. Three
duets of characters coincide in a sextet: Ferrando and Guglielmo in disguise and pretending to be
poisoned (‘From my desire to laugh my lungs are about to burst’); Fiordiligi and Dorabella,
duped but yielding (‘I can resist no longer’); Despina and Don Alfonso, knowing outsiders
(‘Humour them out of kindness’). Six subjectivities, three pairs of response counterpointed into
a harmonized ensemble – a unique unity of diversity.
Here on stage is a group of characters who, within the ensemble, both retain their separate
voices, and at the same time blend musically together, as well as in various combinations. The
audience’s experience seemingly parallels this: we know and feel them together, as one music,
while also simultaneously as separate, unique voices and characters, with their own motive,
relationship to each other, their own trajectory in and out of the musical fabric of the ensemble.
But we, the audience, also experience a shift in our social experience—that is, our experience of
the high point is both individual and simultaneously shared; it is an ‘Us-together’ (Figures 16.5c
and 16.3), not just hundreds of ‘I’s’ (Figure 16.5a). Writing this is clumsy and complex;
experiencing it is simple and obvious. The Act ends, the audience has been musically quickened;
they, like the characters on stage, have changed together by sharing something important
together, if only while the music lasted.
On the surface, there may seem few parallels between this visit to a conventional Western
opera in London, and the first musical event involving drumming teenagers on a rehabilitation
programme in Africa. Yet beneath the considerable surface differences, do we find quite similar
processes and consequences in terms of our model’s linking of forms of musical and social
experience?
Two radical ways of rethinking this kind of musical event can help us here. First, Christopher
Small’s (1998) suggestion that (i) musical works are for creating performances (not that
performances are for works); (ii) that listening is as much a form of musicing as any other;
(iii) that musicing is about establishing and elaborating multiple relationships through sounds.
Second, Bruce Benson’s (2003) recent suggestion that musicing is more accurately imagined as
an improvisational dialogue between composer, performer and audience, stretching across
time and space, while being utterly context-specific, its only reality being the time and place
of performance. Putting these two ideas together, we have the idea of a fluid continuum
of musicing, which comprises composing–performing–audiencing as one musical–social
system.
Just how this happens is complicated by the fact that at least three levels of social–musical
relationships occur in this musical event: (i) those within Mozart’s music, which then affords
(ii) certain relationships between the performers, which in turn affords (iii) a further set of
relationships between the performers and the audience (and between the audience members
themselves). It is typical in Western ways of thinking about musical performances to cast
the audience in the role of passive spectators, having a vicarious (or borrowed) experience of the
performers’ high-level musicality, musicianship and musicing. This may be wrong!
First, let us look at the musicing on stage. Here, the musicality of the performers is evident:
they use their musicianship to bring Mozart’s unique cultural artefact to life again, here and now
in the time of the performance. Take just the ensemble mentioned above—here, the particular
affordance of the style, genre and form (homophonic polyphony) allows six people to sing
together in a harmonic blend of separate musical voices, simultaneously characterizing them-
selves (musically/dramatically) against each other while together producing a harmonious
synthesis. That is, this ensemble musically models a particular way of people being with each
other (an intersubjective matrix), which is perhaps witnessed by the audience as a paradigm of
unity in diversity. This was indeed seen as unique and exemplary by the pioneer sociologist
Alfred Schutz (1976), who saw Mozart’s music as creating new meaning about social relation-
ships, enacting a community of meaning in which the individual self coexists with the social,
collaborative experience of being human:
Mozart … uses this specific device of the art form of opera in order to present in immediacy the inter-
subjective relations in which his characters are involved. In spite of their diversified reactions to the
common situation, in spite of their individual characteristics, they act together, feel together, will
together as a community, as a We. This does not mean, of course, that they act, feel, or will the same, or
with equal intensity … [but] even in antagonism they are bound together in an intersubjective situa-
tion of a community, in a We.
Schutz (1976, p. 199)
However, there is a second important dimension to our opera event: the increasing mobiliza-
tion of the audience’s musicality and musicianship in participatory listening, by means of which
not just musical but social effects come about (though not by externally communicating by
singing together with the performers—at least not in London!). Nevertheless, each audience
member’s musicality is drawn out through their listening participation, each offering what we
might call receptive musicianship, according to their different levels of training or awareness of
classical music and the forms, conventions, techniques and expressive devices of Mozart operas
and the opera performers here today. Through this mobilization of musicality and musicianship,
there comes a mounting level of overall musical communication (between performers, between
performers and audience) through what Benson (2003) calls the ‘improvisation of musical
dialogue’—the shared co-creation of this musicing occasion by Mozart (vicariously), along with
the cast and the audience (actually).
Thus, when the music works (that is, when the performers’ and audience’s communicative musicianship
works together), suddenly something else happens in the opera house: something shared by
all—what we are calling the level of collaborative musicing. Here, Mozart, the cast and the audi-
ence become an ‘Us’, not a collection of ‘I’s’, but a spontaneous and precious musical community.
All intuitively know that this is really why they are here: not just to reproduce Mozart, not just to
give a fine performance of the role, but to afford a unique human experience of sympathy
through musicing. This seems like a beacon of social hope: this musical being together in sound,
which is both a reflection and an enaction of community.
16.4.3 Traditional African ceremony: a social event, with music

We’re guests from the metropole, Western musicians and music therapy practitioners, and the villagers
welcome us to their rural African village. Around midnight, after long speeches of welcome and a
splendid feast, music seems to arrive from beyond the hills, beyond the night, and the multi-clustered
collection of young, old, and very small, have plugged into an invisible force and become one moving
organism.
We, the visitors, listen and watch, and after two hours, wish it were all over. The rhythm doesn’t
change, and neither do the dynamics, the tempo, or even the melody, which is an ongoing repetition of
four notes, descending. We’re unenthralled and looking at our watches.
Someone from ‘our’ group then joins the dancing, moving, singing circle. Others follow. We begin
moving in the music, with it; it remains, still, somewhere outside of us. Then we stop hearing it alto-
gether. We seem to have become music and it us, and we have become one-another: one musicing
timeless placeless mass, transcending our separate skins, towards one collective Self.
It’s dawn. We are puzzled, exhilarated—way beyond exhaustion—where were we? Was it a trance,
did we die, did we leave our ‘selves’ somewhere else? What happened to that unenthralling music?
What wonderful, warm people, how embraced, intimate we all felt with one another … back in the
city, it takes days to recover.
(Journal note, Mercedes Pavlicevic)
While initially insisting on a socially inflexible and distant stance, and on remaining silent
listeners to music (which is fine at the opera in London), the visitors’ group remains unable to
overcome the social–cultural–musical differences between the two groups. This stance sustains a
polarization of (a) and (b) (Figure 16.4), unactivated towards or by each other. Here are two
events at once: the energized collaborative and the visitors’ listening and watching, and making
sense of this event from the outside, through their culturally fixed stance. Here, the visitors
impose their Western culture-specific musical experiences, in which music happens in the mind,
rather than also happening within and between bodies.
Eventually, though, they are irresistibly drawn in. The moving bodies and shared emotions
galvanize their individual and shared musicality—core and communicative ((c) and (d) in
Figure 16.4)—and their bodies begin to move despite their initial disquiets ((d) and (e) in Figure
16.4). All become bound together: a kind of musical entrainment coordinates all towards one
another, co-creators of one musical and social world (Figure 16.4, M-axis). At this point, the
musical grammar and content cease to matter: communicative musicality, portrayed through
drumming, singing and dancing, is neurologically irresistible, and shifts effortlessly into
collaborative musicing (Figure 16.5c).
John Blacking (1977) wrote about just this kind of experience:
By joining in the dance, I was able to experience what the Venda claimed: to play one’s part in the pipe
melody correctly whilst moving in harmony with others in a large crowd of performers and spectators,
generates individuality in community, and so combines self with others in a way that is fundamental
to the existence of Venda culture and society … The experience was often ecstatic: not only did we
dance; we were sometimes danced.
Blacking (1977, p. 56)
Here is an experience that might be called multisubjective, in the sense that we both lose and
retain our subjectivity within this collective ‘I’. Through collaborative musicing, not only do the
visitors lose their sense of ‘us-and-them’, not only are they danced, but they become part of
and create the music that musics everyone. They are not simply responsive puppets, activated
separately to move collectively. All collaborate socially, musically (the two are inseparable) in a
highly fluid multisubjective event that moves towards the suspending of time (as chronos), space,
culture, age, and social status.
What do these three musical events convey? At one level, they show the sheer diversity of
human musicing – where the nature and quality of the social–musical experience is inextricably
tied to culture, tradition and context. On another level, we think that the events show what Judith
Becker (2004) calls ‘limited universals’ of human engagement with music: how, that is, musical-
ity, musicianship and musicing couple in very similar ways whatever the cultural or contextual
variables are, and how this coupling serves human social process through the two mutually sup-
porting functions of communication and collaboration—two essentials for our creative being in
the world together.
16.5 Beyond dyads? Suggested interdisciplinary links

This speculative model, along with the three musical events in this chapter, suggests a way of
placing communicative musicality within a broader musical phenomenology. We hope it may
help people think about the kinds of situations a music therapist (among others) might find
themself in: working along a continuum between cultivating intimate musical companionship
with an isolated individual and working with groups and communal musical events in
specific sociocultural contexts, creating or sustaining musical community. Though a relatively
new theoretical perspective for music therapists, it will probably remind colleagues in other disci-
plines of various current attempts to explore many of the questions it raises. There is sadly no
room in the final section of this chapter to expand in detail on these areas, so we limit ourselves
here to suggesting some promising links to complementary disciplinary areas, which we hope
others will take up in the near future. A collaborative synthesis awaits.
16.5.1 Biomusicology/anthropology
Under the new disciplinary umbrella of biomusicology (for an overview see Wallin et al. 2000) is
an interesting current dispute in the ‘origins of music’ debate: whether musicing is an adaptive
mechanism (as compared with language—the debate can be followed in the Nordic Journal of
Music Therapy). Against Steven Pinker’s infamous view that music is non-adaptive ‘evolutionary
cheesecake’, other evolutionists have countered that music is not an epiphenomenon, but is
central to our species’ development (Dunbar 2004; Cross 2003; Cross and Morley, Chapter 5, this
volume). Music, that is, understood not as private, passive pleasure, but as active musicing which
afforded our ancestors communal bonding and strengthening, a way of extending dyadic groom-
ing to the whole group. Dunbar speculates that, in particular, group ‘proto-singing’ was an adap-
tive capacity with a distinct role in human evolution—generating a charismatic collectivity of
emotional enactment. This view is echoed in Judith Becker’s (2004) recent study on the relation-
ship between music, emotion and trancing, in which she proposes that powerfully communal
experiences can be generated through particular kinds of ‘deep listening’ in ritual contexts, ones
that need a joint neurological and cultural description to account for their effects. These two
scholars stand in an interesting parallel to the central argument of this chapter when compared
with the ethologist Ellen Dissanayake’s (2001; Chapter 2, this volume) use of communicative
musicality theory to suggest a more dyadic motive for music’s role as an adaptation. The debate
will doubtless continue, and we should not forget the still extremely pertinent work by earlier
anthropologists of music and theatre such as John Blacking (1973, 1977) and Victor Turner
(1982, 1987), whose provocative theoretical syntheses remain of prime relevance to this field.
16.5.2 Cognitive neuroscience

Meanwhile, brain science and psychobiological perspectives are working on speculative and
empirically demonstrated theories of just how collective emotional, communicational and musi-
cal states occur, and how it is possible to juxtapose (and perhaps synthesize) phenomenological
and explanatory approaches. A popular summary of some of this material is made by cognitive
neuroscientist William Benzon in his book Beethoven’s anvil: Music in mind and culture (2001).
This gives a detailed, if speculative, perspective on the possible underlying neurodynamics of
communal musical states of mind, and the importance of these in generating and modulating
human social–emotional situations:
Music thus becomes a means of communal play, of communal dreaming. It is a group activity in which
the interactions between individuals are as precisely timed and orchestrated as those within a single
brain. The individuals are physically separate, but temporally integrated. It is one music, one dance.
Benzon (2001, p. 164)
Benzon’s equal focus on musicing as a neurological and a cultural phenomenon gives a poten-
tial biological foundation for further discussion of the connections between communicative
musicality and collaborative musicing. Judith Becker (2004) is one of several recent musical
scholars to engage with neuroscience, this time using Antonio Damasio’s (1999) theory of
consciousness in relation to emotional process.
Equally important is the new thinking coming from psychobiology and branches of psy-
chotherapy connecting to brain science. Trevarthen’s (1999) hypothesis of a brain-generated
‘intrinsic motive pulse’ (IMP) surely has important connections both to Benzon’s structural cou-
pling of people during musicing, and also to Daniel Stern’s (2004) recent elaboration and exten-
sion of his theory of the psychobiological bases of intersubjectivity through a phenomenological
analysis of the temporality of the present moment and the possibilities of interpersonal meeting
that flow from this. Although still dyadically based, Stern’s latest work looks towards a collective
form of this perspective: the intersubjective matrix, which creates a form of collective conscious-
ness called ‘intersubjective consciousness [which] is socially based’ (Stern 2004, p. 132):
Participation in rituals, artistic performances, spectacles, and communal activities like dancing or
singing together all can result in a transient (real or imagined) intersubjectivity. All participants
assume that others experience what is happening roughly as they do … an imagined intersubjective
contact passes between them, and along with it a sense of psychic belonging. They have not only
enjoyed an event, but have also been immersed in the human intersubjective matrix and confirmed
their self-identity.
Stern (2004, p. 109)
16.5.3 Social psychology and sociology of music

The social psychology and sociology of music have both experienced a seeming paradigm shift
recently, and have ended up in just the territory that this chapter presents. Recent psychology of
music (for a summary, see Clarke 2003, 2005; Dibben 2003; Davidson 2004) has moved from an
almost exclusive modelling of individual perceptual aspects of music in a cognitive model to a
more social and ecological perspective, which investigates musical materials in shared and cultur-
ally contexted everyday use, both at individual and collective levels. Much of this theorizing could
be of direct relevance to the material in this chapter.
This body of literature in turn links and overlaps with newer forms of theory and research in
music sociology—effectively a new sociology of music that has been heralded by the work of Tia
DeNora (2000, 2003). In particular, her modelling of how music can penetrate social life at the
microlevel of motivation, emotional modelling and collective engagement through the mecha-
nisms of ‘musical affordance’ structures and ‘musical appropriations’ by actors could be a strong
theoretical platform for linking evolving musical practices such as community music therapy
with strong traditions of empirical research in this branch of sociology.
This latest work should not make us forget the work on music by pioneer sociologist Alfred
Schutz (1976), whose key paper, Making music together, anticipates several of the theories
mentioned in all disciplines in this section by several decades. Schutz’s social phenomenology of
musicing may well be of major interest to those wishing to further theorize around the issues in
this chapter. His conception of the musical event, the communicative relationship between
composer, performers and audience, is radical, and his further theorizing of the intersubjective
‘tuning-in relationship’ through the sharing of time and space, we believe, points exactly to the
right territory:
Communicating with one another presupposes … the simultaneous partaking of the partners in vari-
ous dimensions of outer and inner time—in short in growing older together … the analysis of the
social relationship involved in making music together might contribute to the clarification of the tun-
ing-in relationship and the process of communication as such.
Schutz (1976, p. 178)
16.5.4 Musicology
Musicology as a discipline has undergone a similar paradigm shift in the previous decade,
towards the so-called new musicology (Williams 2001). In this, it has moved towards many of the
areas of interest shared by music therapists. In particular, authors such as Small (1998), Cook
(1998) and Benson (2003) have suggested radical ways of conceptualizing the relationship
between musical works, performance and reception. The most radical aspect of this (and one
which is a foundation of this chapter) is that it is simply musicing which is the universal of
music—a participation which includes all musical activity. Further work on such metatheoretical
perspectives in music will aid connection with biological and sociological conceptions of musical
events and processes.
16.5.5 Social philosophy

Another form of epistemological corrective—this time to individualism—could come from the
diverse area of those writing about social and relational aspects from a philosophical perspective.
For example, we discussed earlier the suggestion that a critique of communicative musicality
could be made on the basis of its origin in a psychological frame that has been called the
psy-complex, that is, the tendency to begin with the individual, the intrapsychic, and to work
outwards to society and culture. Against this, an alternative view may be useful, coming from radical
psychology related partly to the earlier Russian tradition of Bakhtin, Volosinov and Vygotsky
(Parker and Spears 1996; Newman and Holzman 1997). Here, the assumption is that the social
comes first in communicative development and praxis: ‘Social psychology in fact is not located
anywhere within (in the ‘souls’ of communicating subjects) but entirely and completely with-
out—in the word, the gesture, the act’ (Volosinov, in Parker and Spears 1996, p. 116).
Their ideas about the primacy of the social and communal, and about the nature of the interindividual
territory of cultural/communal acts, have influenced the growing field of cultural psychology and
its use for rethinking aspects of music therapy (Stige 2003).
Last is the important work of the theologian and social philosopher Martin Buber on the con-
nections between dialogical and community relationship. This relates in many ways to all of the
other perspectives above in its simple but profound characterization of how all human commu-
nication happens in a realm that is between You and I and Us. Buber’s dialogical philosophy has
been taken as of great relevance to conventional music therapy previously (Ansdell 1995; Garred
2006). It could be equally relevant to the current project of theorizing community music therapy,
especially Buber’s elaboration of his dialogical principle into a principle of community, which is
an original and a phenomenologically accurate description rather than just a utopian ideology:
‘This community is no union of the like-minded, but a genuine living together of men of similar
or complementary natures but of differing minds. Community is the overcoming of otherness in
living unity’ (Buber, in Kramer 2003, p. 77).
16.6 Conclusions
‘Music’, wrote the ethnomusicologist Charles Keil (Keil and Feld 1994), ‘is our last and best
source of participatory consciousness, and it has this capacity not just to model but maybe to
enact some ideal communities’. In the examples we have offered in this chapter, ‘ideal commu-
nity’ means not a utopia, but a concrete situation where people can meet uniquely in and
through music. These musicing events stand as beacons of hope against all of the precluding fac-
tors (illness, social inequality, fear, and cultural fragmentation) that now militate against human
companionable or communal meeting. The social and political upheavals of our time, the
refugee crises and the stress on urban environments are forcing us all to reframe what it means to
belong to a social group, and what it means to communicate and collaborate with one another.
Many music therapists now work not only with the ill, but also with those whose ‘problem’ is social,
cultural and political—those exiled from their cultures, their musics, and their homes. More than
ever, musicing needs to be in the service of generating communities, addressing social fragmenta-
tion, rebuilding trust and social bonding.
There is a growing consilience of perspective between the varying materials we have
presented in this chapter: concerning the broadening view of music therapy (characterized as
community music therapy), which is in turn finding itself, not coincidentally, in alignment
with much interdisciplinary theory on the nature and functions of music and musicing in
the human lifeworld. In a nutshell, this broadening view is an ecological one—music is
interdependent with the interlocking systems of life: biological, intimate, cultural, social and
spiritual.
In this new landscape, it seems clear that Malloch’s and Trevarthen’s communicative musicality
can be seen as a cornerstone of current attempts to theoretically model how musicing can hold
such power and hope for the modern world. It is also clear that while communicative musicality
may indeed be, as Trevarthen and Malloch stated, ‘a foundation for a theory of music therapy …
the source of the music therapeutic experience and its effects’, it perhaps needs further theoretical
architecture on top of this foundation to provide an adequate working account of the full
phenomenology of music’s powers. It has further become clear, through both the examples and the
varying interdisciplinary theory essayed in this chapter, that an epistemological orientation that
critiques the traditional dyadic preconceptions of the psy-complex may be necessary to recast
theory on music’s full range and potential.
It is not coincidental, then, that community music therapy both finds great inspiration in the
theory of communicative musicality and is able to use this theory critically as a foil to help
develop broadening models, such as we have experimentally presented in this chapter. Probably
all of us have the same interest at heart: to explore fully how Keil’s vision of musical community
can be promoted. Whatever our professional angle, most of us are certain that music does indeed
have the power to afford that cardinal human need: to take part, together.
Acknowledgements
For conceptual and technical help with this chapter we would like to thank Simon Procter
and Rachel Verney. This work was developed as part of the Research Department Programme
of the Nordoff-Robbins Music Therapy Centre, London, and the ‘Music and health in late
modernity’ international research collaboration (Research Council of Norway project number:
158700/530).
References
Aldridge D (2004). Health, the individual, and integrated medicine: Revisiting an aesthetic of health care.
Jessica Kingsley Publishers, London.
Alvarez A (1992). Live company: Psychoanalytic psychotherapy with autistic, borderline, deprived and abused
children. Tavistock/Routledge, London and New York.
Ansdell G (1995). Music for life. Jessica Kingsley, London.
Ansdell G (1997). Musical elaborations: What has the New Musicology to say to music therapy? British
Journal of Music Therapy, 11(2), 36–44.
Ansdell G (1999). Challenging premises. British Journal of Music Therapy, 13(2), 72–76.
Ansdell G (2001). Musicology: Misunderstood guest at the music therapy feast? In D Aldridge, G
DiFranco, E Ruud and T Wigram, eds, Music therapy in Europe, pp. 17–33. Ismez, Rome.
Ansdell G (2002). Community music therapy and the winds of change [online]. In Voices: A world forum
for music therapy. http://www.voices.no/discussions/discm4_03.html
Ansdell G (2003). The stories we tell: Some metatheoretical reflections on music therapy. Nordic Journal of
Music Therapy, 12(2), 152–159.
Ansdell G (2004). Rethinking music and community: Theoretical perspectives in support of Community
Music Therapy. In M Pavlicevic and G Ansdell, eds, Community Music Therapy, pp. 65–90. Jessica
Kingsley, London.
Ansdell G and Pavlicevic M (2005). Musical companionship, musical community: Music therapy and
the process and values of musical communication. In R Macdonald, D Hargreaves and D Miell, eds,
Musical communication, pp. 193–214. Oxford University Press, Oxford.
Becker J (2001). Anthropological perspectives on music and emotion. In PN Juslin and JA Sloboda, eds,
Music and emotion: Theory and research, pp. 135–160. Oxford University Press, Oxford.
Becker J (2004). Deep listeners: Music, emotion, and trancing. Indiana University Press, Bloomington, IN.
Benson B (2003). The improvisation of musical dialogue: A phenomenology of music. Cambridge University
Press, Cambridge.
Benzon W (2001). Beethoven’s anvil: Music in mind and culture. Basic Books, New York.
Blacking J (1973). How musical is man? University of Washington Press, Seattle, WA.
Blacking J (ed.) (1977). The anthropology of the body. Academic Press, New York.
Clarke E (2003). Music and psychology. In M Clayton, T Herbert and R Middleton, eds, The cultural study
of music: A critical introduction, pp. 113–123. Routledge, London.
Clarke E (2005). Ways of listening: An ecological approach to the perception of musical meaning. Oxford
University Press, Oxford.
Clarke E and Cook N (2004). Empirical musicology: Aims, methods, prospects. Oxford University Press,
New York.
Cook N (1998). Music: A very short introduction. Oxford University Press, Oxford.
Cross I (2003). Music and biocultural evolution. In M Clayton, T Herbert and R Middleton, eds,
The cultural study of music: A critical introduction, pp. 19–30. Routledge, New York and London.
Damasio A (1999). The feeling of what happens: Body and emotion in the making of consciousness.
Harcourt, London.
Davidson JW (2004). What can the social psychology of music offer community music therapy?
In M Pavlicevic and G Ansdell, eds, Community music therapy, pp. 114–128. Jessica Kingsley, London.
Davies A and Richards E (eds) (2002). Music therapy and group work: Sound company. Jessica Kingsley,
London and Philadelphia.
DeNora T (2000). Music in everyday life. Cambridge University Press, Cambridge.
DeNora T (2001). Aesthetic agency and musical practice: New directions in the sociology of music and
emotion. In P Juslin and J Sloboda, eds, Music and emotion, pp. 161–180. Oxford University Press,
Oxford.
DeNora T (2003). After Adorno. Cambridge University Press, Cambridge.
Dibben N (2003). Musical materials, perception, and listening. In M Clayton, T Herbert and R Middleton,
eds, The cultural study of music: A critical introduction, pp. 113–123. Routledge, London.
Dissanayake E (2001). An ethological view of music and its relevance to music therapy. Nordic Journal of
Dunbar R (2004). The human story: A new history of mankind’s evolution. Faber, London.
Elliott D (1995). Music matters. Oxford University Press, Oxford.
Garred R (2006). Music therapy: A dialogical perspective. Barcelona Publishers, Gilsum, NH.
Gouk P (2000). Musical healing in cultural contexts. Ashgate, Aldershot.
Horden P (2000). Music as medicine: The history of music therapy since antiquity. Ashgate, Aldershot.
Keil C and Feld S (1994). Music grooves. University of Chicago Press, Chicago, IL.
Kramer P (2003). Martin Buber’s ‘I and thou’: Practicing living dialogue. Paulist Press, New Jersey.
Oxford.
Malloch S (1999). Mothers and infants and communicative musicality. Musicae Scientiae (Special Issue
1999–2000), 29–57.
Miell D, Macdonald R and Hargreaves D (2005). Musical communication. Oxford University Press, Oxford.
Newman F and Holzman L (1997). The end of knowing: A new developmental way of learning. Routledge,
London.
Parker I and Spears R (eds) (1996). Psychology and society: Radical theory and practice. Pluto Press,
London.
Pavlicevic M (1991). Music in communication: Improvisation in music therapy. Unpublished Ph.D.
dissertation, University of Edinburgh.
Pavlicevic M (1997). Music therapy in context. Jessica Kingsley, London.
Pavlicevic M (2000). Improvisation in music therapy: Human communication in sound. Journal of Music
Therapy, 37(4), 269–285.
Pavlicevic M (2001). A child in time and in health: Guiding images for music therapy practice. British
Pavlicevic M (2003). Groups in music: Strategies from music therapy. Jessica Kingsley, London.
Pavlicevic M and Ansdell G (2004). Community music therapy. Jessica Kingsley, London.
Pinker S (1997). How the mind works. Penguin Books, London.
Ruud E (1980). Music therapy and its relationship to current treatment theories. Magnamusic-Baton,
St. Louis, MO.
Ruud E (1998). Music therapy: Improvisation, communication and culture. Barcelona Publishers,
Gilsum, NH.
Schutz A (1976). Mozart and the philosophers. In A Brodersen, ed., Collected Papers II: Studies in social
theory, pp. 179–200. Martinus Nijhoff, The Hague.
Small C (1998). Musicking: The meanings of performing and listening. Wesleyan University Press,
Hanover, NH.
Stern D (2004). The present moment in psychotherapy and everyday life. Norton, New York and London.
Stige B (2002). Culture-centered music therapy. Barcelona Publishers, Gilsum, NH.

Stige B (2003). Elaborations towards a notion of community music therapy. Unpublished Ph.D. dissertation,
Department of Music and Theatre, University of Oslo.
Stige B (2004). Community music therapy: culture, care and welfare. In M Pavlicevic and G Ansdell, eds,
Community music therapy, pp. 91–113. Jessica Kingsley, London.
Trevarthen C (2003). Neuroscience and intrinsic psychodynamics: Current knowledge and potential for
therapy. In J Corrigall and H Wilkinson, eds, Revolutionary connections: Psychotherapy and neuroscience,
pp. 53–78. Karnac Books, London.
Turner V (1982). From ritual to theatre: The human seriousness of play. PAJ Publications, New York.
Turner V (1987). The anthropology of performance. PAJ Publications, New York.
Williams A (2001). Constructing musicology. Ashgate, Aldershot.
Chapter 17
Supporting the development of mindfulness and meaning: Clinical pathways in music therapy with a sexually abused child
Jacqueline Robarts
17.1 Introduction
In this chapter, I present music as a primary catalyst in the development of the therapeutic
relationship and in the therapeutic processes of change. I describe how music therapy can assist a
psychotic child with a history of early childhood sexual abuse. When the intersubjective sense of
self is devastated at its core by such early relational trauma, music, used with clinical perception,
may reach a child and work constructively in an evolving, musically mediated therapeutic
relationship. I consider, in the light of interdisciplinary perspectives, how it is possible for music
to be used to assist a child’s emotional regulation and to nurture ‘symbolization’—the capacity
to mean—where this has been disrupted by early relational trauma. I argue that music has a
generative and architectonic or ‘structuring’ role in this work. Music can be used to organize or
regulate emotions, and can assist the interpersonal attachment and sharing of meaning, which
are foundational to the development of mind (Siegel 1999; Schore 2001). The basic emotional
resonance and regulation that music therapy can engender are relevant for work with all kinds of
children, but are especially significant for children who are hard to engage by other means.
Clinically perceptive listening and the skilled use of improvisation offer a richly creative resource
that can respond to the child within an evolving music–therapeutic relationship.
This chapter describes how music therapy can assist the creation and restoration of meaning, when
meaning and a cohesive sense of self are lost or impaired. The power of the music–therapeutic
process is in the music itself, and it is impossible to convey its richness fully in words. It resides in the
implicit, preverbal realm of feeling and relating, in the therapist’s presence and receptivity to the
child, and in the forms and qualities of the child’s emotional and behavioural responses. Children
enact and express their feelings, responses, and all kinds of motivations primarily in bodily movement,
gesture and vocalization. These primary forms of self-experience and expression are essentially
musical—rhythmic and tonal. This is why children’s emotional experience can be so readily reached
and affected by music, and why music continues to be an intrinsic motivating part of human experience
throughout the lifespan. I consider first what musicality is, and the nature of music as therapy.
17.2 The therapeutic potential of music draws on our natural musicality
Many, if not all, of music’s essential processes can be found in the constitution of the human body and
its patterns of interaction with other bodies in society.
Blacking (1973)
Music, found in every culture, has its roots in a capacity for sharing the rhythms and emotions of
expressive bodily movement (Blacking 1973; Trevarthen 1999; Malloch 2005). This musicality, as
many contributors to this volume argue, appears to be innate in us, functioning as an essential
emotional resonance in our first relationships, and enlivening the social community. Music is an
essential part of our identity as human beings. As ethnomusicologist John Blacking reminds us,
the musicality in voice and gesture of everyday living in human society motivates, but is not
dependent on, acquired musical skills.
Modern music therapy is based on these premises: that we are all innately musical, that
musicality is robustly rooted in our brain, and that it can survive severe neurological trauma and
impairment (Bruscia 1998a; Wigram and De Backer 1999a, b; Sutton 2002; Darnley-Smith and
Patey 2003; Pavlicevic and Ansdell, Chapter 16, this volume). Music therapy has been defined as
‘a systematic process of intervention wherein the therapist helps the client(s) to achieve health,
using musical experiences and the relationships that develop through them as dynamic forces of
change’ (Bruscia 1998a, p. 47). The basic kinematic and tonal properties of music, evoking our
innate musicality, provide the natural medium for music’s healing effect, a resource from which
the music therapist draws to develop a therapeutic relationship, led by the client’s needs.
Our bodily movements show rhythmic features and dynamic modulations that arise directly from
our emotional states and motivational impulses (see Panksepp and Trevarthen, Chapter 7, this
volume). There is a musical–emotional tone in our voice, not only when we are singing, but in
speaking, laughing, crying. ‘Musical expression’ is thus isomorphic with ‘human expression’
(Aldridge 1996). Before there is any conscious effort or intentionality to ‘play music’ or to express
oneself in musical performance, music’s communicative properties afford an involuntary sympathetic
field of relatedness, or emotional resonance. Musical expression connects individual experiences
at a deep implicit level, and it can give expressive form to the emotions of children who are, by
all other means, hard to reach. The natural responsiveness to music, and how it may be developed in
improvisation, has been described by the pioneers of music therapy in their work with disabled,
autistic and emotionally disturbed children (Alvin and Warwick 1991; Nordoff and Robbins 2004,
2007; and see Pavlicevic and Ansdell, Chapter 16, Wigram and Elefant, Chapter 19, this volume).
17.3 Communicative musicality in the context of music therapy

The function of music in helping to develop an infant’s sense of self and shared meaning in
close relationships has been defined as communicative musicality within a psychobiological and
developmental theory of human intersubjectivity (Stern 1985; Papousek 1996; Malloch 1999;
Trevarthen 1999, 2002). The theory offers a scientific explanation of a basic premise of music
therapy—that an innate musicality resides in all of us and does not depend on musical training.
Furthermore, research inspired by the theory has demonstrated the interresponsive improvisational
character of human emotional expression originating in motor impulses of the body that
are active in both preverbal and non-verbal communication (Malloch 1999; Trevarthen and
Malloch 2000; Trevarthen and Schögler 2006).
However, while accepting that the natural feelings of musicality inherent in human intersubjectivity
are essential to the ‘music–therapeutic effect’, I believe that to understand the work of
music therapy with clients of all ages, we also need to appreciate the ‘creative–constructive’, highly
adaptable applications of improvised music with people for whom the easy ebb and flow of communicative
musicality in relationship are not yet a reality, or have been weakened (see Gratier and
Danon, Chapter 14, this volume). Where early development has gone awry, where there are
constitutional, genetic or environmental influences impairing or inhibiting the natural flow of
communication and relationship, developmental knowledge and psychodynamic understanding
contribute important dimensions to the music therapy process. We need to explain how feelings
that are not expressed in overt musical activity or behaviour may be attended to and may change
in the dynamics of relationship within the musical communication, as both sounded and
unsounded phenomena of relating. In a therapeutic setting, there is much in awareness between
any two human beings that needs to be worked with below the threshold of consciousness and
overt behaviour, musical or otherwise, as I will describe in the case material that follows.
17.4 Emotions in ‘the dance of well-being’ and the co-construction of meaning
The biological roots of musicality, evident in the first vocal impulses and gestures of an infant,
which are imbued with the rhythmic, dynamic shaping of action in time, are intrinsically linked
to the emergence of shared meaning, or ‘symbolization’ (Dissanayake 2000 and Chapter 2, this
volume; Peretz 2001; Trehub 2001; Wallin et al. 2000; and see Brandt, Chapter 3 and Merker,
Chapter 4, this volume). Infancy research substantiates the concept of a robust and responsive
musicality of communion that is present at birth and continues to function throughout the lifespan,
creating and shaping meaning in relationships. ‘The universal features of human musicality, its
timing, emotive expression and intersubjective sympathy, are clear signs of innate motives, and
music functions everywhere as a primary motivating force in human life’ (Trevarthen 1999). This
‘dance of wellbeing’ (Trevarthen and Malloch 2000) may be observed in the musical–dynamic
forms intrinsic to all human sociability and in the collaborative pursuit of shared interests.
Animated conversations, with their eloquent gestures, body movements and vocal expressions,
form a spontaneously orchestrated symphony of emotional/motor impulses in sight and sound,
capturing and directing attention. Significantly, it is the regulatory, organizational function of
emotional ‘attunement’ (Stern 1985) that leads to the emergence of meaning in relationship and
the infant’s developing capacity ‘to mean’ (Halliday 1975).
The infant is a virtuoso performer in his attempts to regulate both the level of stimulation from the
caregiver and the internal level of stimulation in himself. The mother is also a virtuoso in her moment-
by-moment regulation of the interaction. Together they evolve some exquisite dyadic patterns.
Stern (1985, p. 109)
At the heart of empathy (or, more correctly, sympathy) in experience is emotional resonance,
which accommodates both attunement and misattunements as the relationship evolves, and continues
after the direct interaction itself (Siegel 1999, p. 281). These felt, living experiences of relatedness
are fundamentally aesthetic, as well as moral, concerned with ‘knowing’ or ‘appreciating’,
and they constitute the ‘cradle of thought’ (Hobson 2002), a phrase that captures the containing,
regulating, nurturing and transforming nature of these primitive, yet nonetheless subtle forms of
relatedness in the vitality of communicating minds.
Communicative musicality defines the motivating roots of human intersubjectivity, within
which cultural meaning may flourish (Malloch 1999). In his research on the musical and sponta-
neously improvisational inflections of the features of pitch, timing and timbre characterizing
early infant–parent vocalizations, Malloch identified the intrinsic organizing principle of com-
municative movements in normal infant–carer interactions. In its organizing function, commu-
nicative musicality can be viewed as expressing the embodiment of emotional feelings and
thoughts in relationships between persons and between minds, arising from a ‘mirroring’ of
motor impulses and perceptual expectations (Gallese and Lakoff 2005). These impulses and
expectations take form in sympathetic interpersonal activity occurring over time. The temporal
frame is intrinsic to the development of reciprocity of all forms of action and is the defining
feature of mutuality, or ‘the self in intersubjectivity’ (Beebe and Lachmann 1988; Beebe et al.
2000; Stern 1985, 1995, 2004; Trevarthen 1979, 1993; see also Lee and Schögler, Chapter 6, this
volume, on the source of time and expression in movements, and their role in communication).
The regulating, organizing function of music as it flows through time has also been described
as a central feature of music therapy (Sears 1968/1996).
In a basic sense, music therapy offers the individual the experiencing of events in certain ways… Although
the past experiences of the individual may serve as a basis (often a very important one) for organizing the
therapeutic situation, that situation always begins in the present and goes into the future. No therapist
can change the past experiences of the individual, but he can organize a present situation so that the
effect of the past is altered for a more adequate future. It is in this sense – that of the present going into
the future – that the word ‘experience’ has been selected for use.
Sears (1996, p. 34)
Sears’s own emphasis is in italics; the words that I have made bold italic are, in my view, especially
relevant to work with sexually abused, early traumatized children.
Sears’s concept of music therapy as providing experiences that bring order and guide change
was developed further by Nordoff and Robbins (2004, 2007). They also recognized the integrative
potential of music to help children who found relationship impossible or difficult, and they
set about exploring in individual and group music therapy the effects of improvising music as a
form of communication and relationship. As music therapists and researchers, Nordoff and
Robbins explored the dynamic features of the musical relationship as it evolved in engagements
with a wide range of children.1 They eloquently describe the integrative properties and processes
inherent in music therapy in their concept of ‘the music child’:
The Music Child is…the individualized musicality inborn in each child: the term has reference to the
universality of musical sensitivity – the heritage of complex sensitivity to the ordering and relation-
ship of tonal and rhythmic movement; it also points to the distinctly personal significance of each
child’s musical responsiveness. Often with severely or profoundly handicapped children musical
responses are initially fragmentary or reflexive or in various ways tied into their conditions to appear
as stereotyped, perseverative, or compulsive musically stimulated behaviour. One cannot yet speak of
the Music Child in these children. Only when some communicative direction or some responsive
order, some perceptive openness or some freedom from conditioning habitual activity develops, can it
be said the Music Child is ‘being awakened’, is being ‘synthesized’.
Nordoff and Robbins (1977, p. 1)
Nordoff and Robbins also emphasize the power of music to organize the developing functions
of the child’s personality:
The term Music Child denotes an organization of receptive, cognitive, and expressive capabilities that
can become central to the organization of the personality insofar as a child can be stimulated to use
these capabilities with a significant extent of self-involvement. Such an involvement, creatively and
responsively fostered, induces the functions of recognition, perception, and memory; intelligence,
purposefulness, confidence come spontaneously into expression as the child becomes deeply, personally
involved. He becomes emotionally involved, not only in the particular music itself or in his activity in it,
but also in his own self-realization and self-integration within all the therapy situation holds for him.
Nordoff and Robbins (1977, pp. 1–2)
1 This work was funded by National Institute of Mental Health research grant MHPG 982 from 1962 to 1967
at the University of Pennsylvania. It is fully reported in Nordoff and Robbins’s 1977 text, Creative music
therapy, the revised edition of which (Nordoff and Robbins 2007) includes six CDs of individual music
therapy sessions that engage children with a wide range of disabilities and conditions.
Among the many possibilities for meeting children in music and engaging them in play, ‘sharing
the basic beat’ was a feature of the music–therapeutic relationship that Nordoff and Robbins
found to be crucial. From a developmental perspective, the shared beat or pulse is an intuitive
feature of infant–carer communication that, in its organizing function, creates mutuality or sym-
pathy (Trevarthen 1980, 1999).
With atypically developing and emotionally disturbed children (Condon and Ogston 1966; Evans
1986), this fundamental organizing feature of sympathy is almost invariably not established, or is
established only in highly idiosyncratic and unstable ways; here music therapy can do much to help.
The power of music to organize or regulate a person’s emotion and attention, and to support
an interpersonal–intermusical relationship in music therapy, is particularly valuable in work with
children whose capacity for emotional communication and play is impoverished or damaged
(Gold et al. 2004). Evasive, avoidant and emotionally disturbed behaviour prevents easy commu-
nication and shared play; moods are labile and emotional/behavioural responses volatile or
unduly passive. Extreme examples of such children are those who have experienced early familial
abuse, including neglect and fear, severe physical pain or even torture, and a persistent lack of
safety or trust. These children may become psychotic and lose a sense of reality. They may survive
by residing in states of mindlessness and timelessness as ways of coping with the terrors
they involuntarily re-experience. These memories reside in the preverbal or implicit realm of
experience, and can be triggered by the slightest event, such as a smell, a sudden movement, or an
encounter that for anyone else would be ordinary or unnoticed. These children need help with
the most basic emotional processes, including self- and interregulatory experiences, to create the
time and space in which reality and some cohesion of mind can be experienced. Only then can
the preverbal domain and sense of self become sufficiently cohesive for thought and ideas to
come into play, as symbolization, and become meaningful—that is, able to be experienced,
communicated and reflected on.
Music can work with dynamic expressive and sensory levels of experience in relationship from
which trust, a sense of safety, and new patterns of healthy attachment or intersubjectivity can
begin to develop. Through the creative clinical use of improvisation, music can provide an inter-
personal framework, a living experience of interpersonal connectedness. Such experiences
cannot be prescribed or didactically generated. However, within the aesthetic, emotionally regu-
lating experiences afforded by the properties of music and the clinical skills and receptivity of the
therapist, the musical relationship can bring the necessary integrative experiences of being and
being-with (Ansdell 1995; Austin 2001; Bunt 1994; Darnley-Smith and Patey 2003; Etkin 1999;
Nordoff and Robbins 2004, 2007; Pavlicevic 1997, 1999, 2001; Robarts 1994, 1998, 1999, 2000,
2003, 2006; Rogers 2003; Wigram 2003; Wigram and De Backer 1999a,b; see Brandt, Chapter 3,
this volume, on the formal processes of meaning in communication).
Music therapists frequently work with people for whom there has been a breakdown of commu-
nication and the capacity for relationship, or where these capacities of mind and meaning have to
be built as if for the first time. For children whose natural pathways of symbolization have been
blocked, affecting identity and a cohesive sense of self, music therapy can offer a means of forming
or regaining meaning by constructing capacities for symbolization that grow from the implicit
realm of emotional communication (Nordoff and Robbins 2004, 2007; Pavlicevic 1997, 1999;
Robarts 1998, 2003, 2006; Wigram and De Backer 1999a). For adults, too, where early normal
development has been thwarted for whatever reason, or where later experiences have led to
reduced capacities for communication and relationship, music therapy can play a vital role in
restoring the person’s sense of self in relationship, using communication where there is no need
for words, but equally where words arise in music, finding greater meaning and authenticity
(Ansdell 1995; Austin 2001; Hadley 2003; Wigram and De Backer 1999b).
17.5 Meeting in music: clinical improvisation and the creative ‘now’
In music therapy, improvisation is used in ways that adapt to or ‘meet’ the child or adult’s
responses. A musical–emotional, communicative, therapeutically oriented relationship is formed
that evolves in highly individual ways according to each person’s needs. This ‘clinical improvisation’
may be freely developed in response to a client, or to the therapeutic situation, and structured
improvisations as well as pre-composed music may be used according to the client’s needs
(Nordoff and Robbins 1983; Pavlicevic 1997; Wigram 2003; see Wigram and Elefant, Chapter 19,
this volume). It requires the creative, resourceful skills of a music therapist, who must be able to
enter into a musical relationship with therapeutic understanding in which music is the process not
the product—although a recognizable musical product may well result from this endeavour, often
with a particular beauty arising from its freshness and felt immediacy. At the heart of the musical
relationship in therapy is a particular kind of listening and witnessing, sympathetic (not in the
sentimental sense in which this word is often used, but in the sense of being receptive, responding
intuitively) while also observant of the musical and extramusical features of the relationship. The
music therapist uses the properties of music and the power of its aesthetic forms, with a creative
awareness of the intra- and interpersonal fields of relationship. He or she improvises music with
clinical perception to meet and respond to a client’s responses as they evolve, not to generate a
musical performance. This ‘meeting’ encompasses the total music therapy setting in which a
therapist helps a client to experience acceptance and understanding, offering a tacit or explicit
invitation to explore through playing—or simply finding a way to enable a client who initially
does not or cannot play, just to be within music.
A useful analogy may be drawn here with Daniel Stern’s concept of an ‘emergent moment’ of
interpersonal experience; an emergent moment is:
a subjective chunk of experience that is constructed by the mind as it is being lived. One experiences
oneself as being ‘in’ a moment. It organizes the diverse simultaneous happenings that are registered
during a motivated event. In this sense, the moment is an emergent property of mind.
Stern (1995, p. 96)
In helping to construct or reconstruct a traumatized child’s healthy capacity for being and
relating, imagining, playing and thinking, musical improvisation tries to bring organization
and coherent moments of feeling and living, a sense of being in one’s own body as an aware and
confident agent. Paradoxically, it can also offer experiences of freedom from structure, creating a
sense of space and time in which to be alone, while in company. The senses of separateness and
togetherness are confused in the early abused child, whose experiences of being and relating have
been deformed and derailed at their visceral core. These primary experiences of self and other
can begin to be explored again in music, and out of this communication a coherent sense of self,
a sense of continuity of self-experience and physical and emotional body boundaries, may grow
(see Wigram and Elefant, Chapter 19, this volume).
Meeting in the music therapy relationship has been described as the ‘creative now’, where the
integration of affective and cognitive capacities of self may take place (Nordoff and Robbins 2007;
Robbins and Forinash 1991). This is a concept close to Stern’s idea of ‘now’ moments, in which
implicit levels of relating give rise to living experiences of change (Stern 1998, 2004). The palette of
natural expressive musicality may be augmented and enriched in music therapy, providing new
emotional landscapes of meeting, perhaps creating a different mood, offering a temporal frame-
work to help develop a client’s playing, however fragmented this may be. Sometimes a therapist’s
quality of listening creates the kind of space and stillness in which a child or adult may begin to
hear themselves with a fresh awareness. These are known means of therapeutic change, always
manifesting in highly individual ways.
Co-active musical improvisation is a form of shared play, a negotiation that is developed through
the therapist’s listening to, exploring and working with the qualities and characteristics presented
in the client’s being and being-with, as well as in the client’s not being and not being-with.
Attending, listening and waiting form as much a part of music therapy as active reflecting, matching
or mirroring, enhancing, and interpreting (musically or verbally, as the clinical situation demands).
Music therapy works with all of the senses, with the inner movements of the self of both client
and therapist that come into immediate expression, as well as with the more subtle levels of being
that do not reveal themselves so readily. In playing a musical instrument, tapping on a drum or
woodblock, or plucking the strings of a guitar, a person experiences an extension of themselves
as an agent. Sensory and motivational roots of feeling are set in motion and augmented in their
resonance by contact with the instrument, which immediately becomes a sounding board for
experiences of ‘I’ and ‘me’—here, now, in this moment.
Within the music–therapeutic relationship, the musical and psychodynamic-expressive
structures of self, both intra- and interpersonal, come into play. Music therapy can help to sustain
motivation, develop trust, explore new ways of being, feeling and relating, express and regulate
emotions, and work sensitively with a client’s self-protecting defences and other habitual patterns
of behaviour (Bruscia 1998b; Pavlicevic 1997, 1999, 2001; Robarts 1994, 1998, 1999, 2000, 2003,
2006; Tyler 2002, 2003). Emotional regulation is a central feature of music therapy. A field of
emotional resonance is generated by experiencing the properties of music itself—tone, rhythm,
melody, pulse, tempo, and metre—and by the ways in which a therapist listens and responds to a
child within the range of musical and non-musical events that present in the therapy situation.
Music can reach into the realm of the preverbal self, where it can work on creating deeply rooted
creative–constructive change in impaired attachment patterns.
17.6 The implicit realm of emotions and the creative construction of meaning
The preverbal self that is reached so completely by music can be identified with the domain of
procedural or implicit memory, the implicit realm of experiencing that is the first sense of self.
Developing in the early intimacy of infant–parent relating, the preverbal self is a sought-for social
construction of implicit relational knowing (Emde et al. 1991; Stern 1995; Siegel 1999). Its vitality
is tested throughout the lifespan; it undergoes modifications in response to life experiences. When
the preverbal self is traumatized in early development, the neural ‘templates’ and mutual influence
structures of the motivating sense of self and self in relation are damaged, with lasting
consequences. If we consider the self as layers of memory laid down by experience, from the
bodily procedural or implicit level to the episodic level, where specific events and details of
experience reside and may be retrieved or may arise involuntarily in a person’s consciousness,
then traumatic assaults on a child’s person, especially if persistent over many years, will
have a dramatic effect on the workings of self as memory (Tulving and Markowitsch 1998). ‘States
then become traits’ (Perry et al. 1995).
For the person who has suffered early trauma that has become embodied as part of the
body–mind self, often beyond verbal recall or ‘explicitness’, music and singing
can be a powerful healing process (Austin 2001; Etkin 1999; Robarts 2003; Rogers 2003; Sutton 2002).
Forging new links or connectedness at the implicit level is a central aspect of the emotional resonance
and regulation afforded by music. Examples of such musical processes may involve an almost
imperceptible contracting or extending of a phrase, a thickening or thinning of harmonic texture,
a therapist’s supporting or offering more or less pace, faster or slower tempo, and in particular the
quality of touch on the instruments (such as light, strong or sustained), and the emotional quality
conveyed through pitch (high, medium or low) and timbre (colours or textures) that the therapist
brings into her voice or instrumental playing to meet the client.
Musical idiom and scale form are also significant in terms of the kind of emotional resonance
that needs to be generated or responded to. The therapist needs to judge, ‘Is the music too “spicy”
or too “bland”?’ Is the child responding to a four-bar phrase, or is this too much, and would a
three-note motif be something the child could take in and engage with more easily? All of this
musical–interpersonal evaluation contributes to the panoramic view as well as the telescopic
focusing of the therapist’s listening, sensing and responding. It becomes instinctive, interwoven
in the therapeutic process. These are some of the subtle levels of working towards connectedness
or linking, joining up at a dynamic sensory–emotional level with the child. They have particular
significance in working with the foundations of meaning in both children and adults, where
there is an impoverished, fragile or traumatized sense of self.
17.7 Early childhood sexual abuse and post-traumatic stress disorder (PTSD)
Sexual abuse in early childhood has a devastating impact with lasting consequences for the devel-
oping child. Destruction of the child’s fundamental sense of safety and sense of self leads to
deviant, perverse development of sensory–motor–emotional pathways. Researchers of early
childhood trauma observe that traumatic experience is organized in memory on sensorimotor
and affective levels (van der Kolk and Fisler 1995; van der Kolk et al. 1995). They describe the
dissociation that occurs between emotional arousal and intentionality or goal directed action.
Unable to interpret the meaning of their emotional arousal, feelings themselves become endowed with
a negative valence: because no release can be found in adaptive action, emotions merely become
reminders of one’s inability to affect the outcome of one’s life.
van der Kolk et al. (1995, p. 10)
In the worst cases, sexually abused children are likely to have suffered extreme physical pain
tantamount to torture, throughout their early years from adults on whom the child is dependent,
and from whom they may be unprotected. The ‘sense of self ’ in such children is thus likely to be
undeveloped, incohesive, fragmented, perverse, with severe emotional–behavioural disturbance
and learning disabilities.
Trauma studies by Perry et al. (1995) have shown that psychobiological response to trauma
consists of two separate response patterns: hyperarousal and dissociation. Where a split or
‘dissociation’ occurs between emotional arousal and intentionality or goal-directed action, the
child loses that sense of a self that is ‘going on being’, a self that holds together, and is cohesive and
integrated. Dissociated experience is experience that has a separate reality, cut off from human
relatedness and symbolic functions of thought and language. States of hyperarousal, even those
excited by positive feelings such as curiosity or joy, may act as triggers for dissociation. Schore
(2001) has coined the term ‘early relational trauma’, differentiating this from the impact of
single-event trauma and trauma in later life. He explains the distinction in neurobiological
terms:
Early trauma alters the development of the right brain, the hemisphere that is specialized for the
processing of socio-emotional information and bodily states. The early maturing right cerebral cortex
is dominant for attachment functions and stores an internal working model of the attachment
relationship. An enduring developmental impairment of this system would be expressed as a severe
limitation of the essential activity of the right hemisphere—the control of vital functions supporting
survival and enabling the organism to cope actively and passively with stressors.
Schore (2001, p. 206)
For children who have suffered traumatic sexual abuse throughout their early years, innate
communicative musicality—the spontaneous responsiveness and interplay that music generates
between human beings—is compromised by the disruption of the core motivational system
along with the capacity for socio-emotional regulation that is part of the interregulatory dyadic
system in which mind and meaning grow (Trevarthen et al. 2006).
As described above, emotional regulatory processes in music are central to music therapy in
the creating of a space to think and mean in the flow of time and movement of both the child’s
and the therapist’s spirit. These processes are highly individual for each child and therapist. These
are some of the many musical considerations that arise from moment to moment in music
therapy, which occur simultaneously with other dimensions of feeling and thinking with and
about the child in the therapy room.
17.8 Clinical pathways of symbolization in music therapy: poietic processes
In my work with children who find relating and assimilating new experiences difficult, I became
aware of a core of clinical pathways in the therapeutic process. Among my main teachers were
the disturbed children themselves, who had histories of familial sexual abuse, leading to post-
traumatic stress disorder (PTSD). Children with PTSD may present their emotional disturbance
in chaotic behaviour, or in total emotional withdrawal, unable to find inner calmness or stable
vitality. Their volatile emotions and complex reactions to the therapist and the therapy setting
present many challenges and dilemmas in developing a therapeutic alliance, not least in attending
to the somatic levels at which trauma and stress may find expression (Wickham and West 2002;
Rogers 2003; van der Kolk 2003; and see Osborne, Chapter 15, this volume).
In trying to capture the essence of therapy for abused children, and the function of aesthetic form
in therapeutic change, I sometimes liken music therapy processes to the construction of a poem.
The Greek poiein—the root of our word ‘poetry’—means to make or construct. In music therapy,
as I conceive it, the ‘poem’ is the therapeutic relationship, encompassing the many layers of meaning
and transformations made within that relationship, within the client and within the therapist. This
is not intended as an idealistic or fanciful notion, but in the sense of poem-making, a process of
crafting and honing one’s senses, perceptions and creative skills to forge new horizons of meaning.
The dynamic forms of timing, intensity, colour of voice and intonation are aesthetic experiences
that bring about a state of stillness or spaciousness in which listening to the self is a felt experience.
Using improvisation in music therapy, a therapist can work directly with a child’s emotions and
behaviour to develop experiences of self-regulation, healthy attachment, and a capacity for play
(Bargiel 2004; Robarts 1998, 2003; Wheeler and Stultz 2001). The form and order that can be
experienced in music are not rigid, but are dynamic, growing or diminishing in intensity, duration
and temporal/spatial structures, while retaining salient aspects of constancy and predictability. Here
the biological link between music and the emotions takes on particular significance, requiring
careful attention by the therapist to the child’s responses, and the musical dynamics and psychody-
namics of relationship. Elsewhere, I have described the use of short duration musical ‘shapes’ and
motives in relation to work with autistic children (Robarts 1998) and with anorexic adolescents
(Robarts 1994; Robarts and Sloboda 1994). This shaping of small units of time can often help
children who find the taking in of experiences of relationship difficult, and who can only cope
with clear, easy to assimilate dynamic-expressive structures.
My clinical pathways model of poietic processes in music therapy gradually assumed a recogniz-
able hierarchical structure. It offered me an overview and an ‘experience-near’ way of evaluating
therapeutic change that was consonant with my way of working to help the children, adolescents
and adults I assisted, especially in terms of a child’s developing capacities for relating in meaning or
symbolizing. Symbol formation arises from the basic interregulatory field of emotional communi-
cation and relatedness, linking interests and feelings; in music therapy, I describe this phenomenon
as the ‘tonal-rhythmic field of sympathetic resonance’ (Robarts 2000, 2003). This field manifests
itself due to basic human sympathy resonating to and with the properties of music and sound,
through the use of the voice and the playing of musical instruments, which in turn augment experi-
ences of self and relationship. From this musical ground of relationship, subsequent levels of
symbol formation develop, until autobiographical narratives may present spontaneously.
The pathway of symbolization in music therapy is a dynamic trajectory from evoked levels
(pre-intentional, unconscious responses at the implicit level of experiencing) to more
self-directed, intentional modes of expression, where imagination comes more fully into play as
autobiographical narratives. These processes are illuminated by Stern’s (1994) model of the
infant’s representational world, where the building of meaning and self-coherence grows from
the sensory–motor–affective schemas of lived experience that are evoked and enacted sponta-
neously leading to autobiographical narrative. Damasio (1999) has also referred to autobio-
graphical narratives arising from felt, lived experiences at a sensory-motor level.
In music therapy with my child and adult clients, rather than seeking to arouse conscious
associations with music, imagining and remembering, or musing on an experienced past, I am
dealing with much earlier levels of being and relating, where the abused child’s sense of self has
been ‘dis-membered’, and where, without therapeutic intervention, ‘re-membering’ (putting
together) is invariably a painful experience.
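All traumatized patients seem to have the evolution of their lives checked; they are attached to an
insurmountable obstacle. Unable to integrate traumatic memories, they seem to have lost their capacity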
to assimilate new experiences as well.
Janet (1925, p. 660, cited in van der Kolk and van der Hart 1989, p. 1533)
Where the child has no mental space to be, let alone imagine or reflect, his or her being and relating
must first be addressed. If used with care and clinical skill to avoid the re-traumatization of
those children whose foundations of self have been damaged, music can set in motion new self-
experiences as an integrative, formative pathway from visceral and sensory experience to conscious
expression. In this way, a sounder sense of self starts to develop in the present—a self that can begin
to withstand the pain of the past, to some extent work through it, and then perhaps be helped to
move on from the trauma of abuse. This is, of course, far from being an untrammelled process.
It depends on factors that at best may support the therapeutic process, but at worst may threaten to
invade and sabotage a child’s overall therapeutic programme and care.
Despite each child’s individuality, I have found that the central processes or fields of related-
ness/symbolization, where music acted most significantly, comprised three fields of emergent
relatedness and symbolization, acting as a bridge from one field to the other that could be crossed
to and fro as necessary in the therapeutic process. The following fields were identifiable, although
constantly overlapping and interweaving in the evolving clinical situation:
Field 1: a tonal–rhythmic field of sympathetic resonance forming the ground of relatedness and
regulation underlying development of meaning;
Field 2: the emergence of motive signalling and the building of more aware and stabilizing
expressive forms of relatedness; and
Field 3: autobiographical narratives, arising spontaneously in song or other symbolic expressive
forms, which may indicate that Fields 1 and 2 have been assimilated, a sign of integration and
symbol formation.
The three fields represent different levels of relating and playing (different strata of symbolizing)
within the music–therapeutic relationship, from implicit to explicit levels of symbolizing. Field 3
may remain fragmentary and tenuous until Fields 1 and 2 are consolidated. Sometimes Field 1
assumes prominence in the building of the musical–relational phenomena of Fields 2 and 3. Some
children are more readily engaged at Field 3 in a symbolic form of expression (a drawing, an image,
a story) to broach the more immediate emotional and relational contact of Fields 1 and 2.
17.9 How music therapy brought change for a sexually abused traumatized child
In work with sexually abused children, the immediacy of connectedness that music therapy can
bring must be handled with extreme care. Connectedness denotes the most basic experiences of
relatedness and integration where intra- and interpersonal experience needs to be carefully
regulated. Experiences of being and being-with tend to be overwhelming for children trauma-
tized by early abuse, resulting in dissociation or dissociative states. Trauma brings about complex
psychological and neurological conditions in which the impact of hyperarousal accompanying
fear or terror leads to the numbing of feelings, over-compliance, and the withdrawal of
feelings (for a proper introduction to trauma and its impact neurologically, psychologically, and
developmentally, see Herman 1992; Perry et al. 1995; Schore 2001, 2003; Fosha 2003; Siegel 2003;
van der Kolk 2003). The child presents as remote, unmindful of the present, in fact mindless.
These neurobiological mechanisms of dissociative states characteristic of sexually abused
children can be worked with directly within a musical–therapeutic relationship. Even without the
essential element of the music, the following case material conveys the centrality of the music in
the therapeutic relationship.
17.10 Sally²
Sally displayed all of the major symptoms of PTSD (DSM-IV-TR – American Psychiatric
Association 2000): dissociative states, persistent symptoms of increased arousal, poor capacity to
self-regulate emotions, persistent avoidance of stimuli associated with the trauma, the numbing
of feelings, and the persistent re-experiencing of the traumatic event. In addition, she displayed
alterations in perception of the perpetrator (this is especially common where the perpetrator is a
primary carer), alterations in relations with others, and alterations in systems of meaning
(Herman 1992, p. 121).
Born a ‘floppy’ baby, Sally was the third in a family of four children. The family lived on
an inner-city council estate. Her mother lived on social support and had already had a difficult
life, including a history of violent relationships, alcoholism and prostitution. From the age of
2-and-a-half to 7 years, Sally was sexually abused by two men in the household, one being her
mother’s partner. The assaults were tantamount to torture, in terms of the extreme pain and
physical violence persistently inflicted on this child. When the abuse was finally disclosed, the
perpetrators were imprisoned and all of the children were placed on the Child Protection At Risk
register. Despite the volatile and sometimes violent relationships between the adults, a strong
bond of affection sustained the family. Sally’s mother’s new partner was someone with whom all
the children were developing a good relationship.
2 The child’s name and details that might identify her or her family have been changed to ensure anonymity,
while retaining the salient features of the clinical material.
Sally attended a school for children with severe learning disabilities. Although her language
and comprehension indicated moderate learning disabilities, her behaviour was such that she
could not function at that level. Sally was doubly incontinent, with poor motor control, severe
problems of attention, and limited expressive language, but showed she could understand simple
conversations. She had a squint, her gaze often swivelling towards the ceiling, where it would
linger as she stood limp, passive, often uttering a terrible, hollow wail, followed by piercing
screams and hysterical laughter. At first, Sally’s remote manner, volatile behaviour, marked diffi-
culties in social interaction, and obsessive–compulsive habits suggested to some that she was
autistic as well as severely emotionally disturbed. Only when the full extent of her early suffering
was disclosed was she understood to be psychotic and suffering from PTSD.
At school, a behavioural programme was set up for her, involving clear boundaries, with strict
routines. The programme emphasized that Sally was always to be given clearly communicated
choices to enable her to feel secure by sensing that she had some control over situations, building
her sense of autonomy and trust. Sally was frequently so disturbed that she would seek to harm
herself, often banging her head against a wall. It required two adults to hold her to prevent her
from injuring herself. Inevitably, this was experienced by Sally as being overpowered by her
abusers, increasing her own distress and that of her helpers. Sally’s response to being given help
often spiralled out of control, culminating in her taking all her clothes off, and running to put her
head down the toilet, and flushing it. She was obsessive about washing her hands and the rest of
her body until her skin bled. For Sally to remain in school, a one-to-one support was required at
all times. She had great difficulty in trusting people and was scared of men and of dogs, so that
normal outings in public would often trigger screaming and panic.
Sally received individual psychotherapy from the age of 7 to 9 years old, but this was curtailed
when Sally’s mother was unable to take her daughter regularly. From the age of 7 until she
was 14 years old, Sally received individual music therapy once weekly from me, initially lasting
30 minutes and, from the fourth year of therapy, increasing to 40 minutes. Continuity of sessions
was achieved only because Sally was brought by her learning mentor. I would have liked to
see Sally for at least two sessions a week, but this was not possible. Ongoing liaison work between
Sally’s teacher, her learning mentor and me was vital. Sally could not have used her weekly
music therapy sessions with me in the ways that she did, had she not had a daily 15-minute
session with her learning mentor at the end of each school day. In this way, Sally was supported,
listened to and given the experience of her feelings and emerging thoughts being ‘held’
throughout the week. Sally’s mother was wary of meetings with professionals, but with the
support of Sally’s learning mentor, gradually became more trusting and met with me on several
occasions. She refused to engage in therapy herself.
17.11 Music therapy with Sally

In the first two years of music therapy, Sally’s sense of herself and her trust in the therapeutic
relationship developed slowly. Sally’s sense of her self was growing in terms of her body sense and
body boundaries, sense of self-agency and trust in me, but she showed only intermittent episodes
of coherent action or thought. Much of the time, she seemed to be experiencing ‘falling’, or
‘dark holes’, her screams and kicking often directed at me, but carried out in a ‘cut-off ’, remote
state. Frequently, after throwing herself to the floor, she stared up at the ceiling, uttering a few
words repeatedly in her hollow tone: ‘stairs’, ‘light off ’, ‘dark’, ‘break your neck’, ‘fucking bastard’,
and a long drawn out guttural ‘da-a-a-a-a’ (possibly her word for Daddy, but I was never sure;
it could have been a more drawn out expression of ‘dark’). Her trauma was replayed in this
fragmented way, with sudden shifts of mood, impulses and withdrawal into remoteness. I had to
learn how to listen, yet be firm and intervene if she was harming herself, and whenever possible
not waver or react, even when I knew she was experiencing me as one of her abusers. I had
to accept and acknowledge to her how she felt, while trying to comfort her in her distress. I had to
explore ways of helping her move forward from her habitual, self-destructive states, and to offer
her new experiences to build a more ordinary, healthy sense of self.
At first unable to modulate/regulate her impulses and feelings, Sally gradually began to stroke
rather than kick or hit the musical instruments. She would then look at her hand wonderingly,
almost amazed, as if recognizing her hand as her own for the first time. The various musical
instruments proved to be vital intermediaries of relationship, offering her sensory experiences of
herself in sound that helped her begin to develop a sense of herself. Her sensory explorations
seemed to help her not only recognize but to ‘own’ her bodily experiences as she began to trust
the act of playing using her hands, fingers, mouth, feet and arms. The resonance of musical
instruments extended her self-awareness and held her interest in ways that created space and
time for thought. By the end of the second year of music therapy, she had begun to use some
words that reflected her pleasure in music: for example, she would say ‘swimming’ or ‘riding’ as
she played; ‘safe’ became a word she used when she became more able to settle and engage in a
shared experience of quiet play.
In her third year of therapy, 10-year-old Sally’s capacities for emotional self-regulation and
symbolizing became more firmly established. She seemed more stable, at home in her body, able
to stay in the here and now of the therapy room, without such frequent dissociating and
psychotic states of mind. The case material of this time, however, shows the tenuousness of her
ability to sustain these capacities, and how quickly they fragmented or became diffuse.
17.12 Significant episodes from a session in Sally’s third year of therapy
Five episodes are highlighted from one 30-minute session in Sally’s third year of therapy (when
Sally was 10 years old) to illustrate significant moments in the music therapy process. These
episodes represent integrative poietic processes of affect regulation and symbol formation in
music therapy, traversing all three fields of poiesis. A brief summary of each episode is given in
bullet points before the more detailed description.
17.12.1 Episode 1: beginning of session

◆ Sally runs around the room. Sally shows ambivalent feeling states: anger, hysteria, hollow
laughter, kicking out at me.
◆ Sally’s movements start to be influenced by the pulse of the music, and her responses become
more focused.
◆ Moments of cohesion.
◆ I musically match the intensity and ambivalence of Sally’s feelings.
◆ I introduce stability of pulse and harmonic tension/texture.
After a few seconds of running around the room in a distressingly chaotic state, with hollow,
wavering laughter, Sally brushes against a cymbal, knocking it over. This does not appear to have
hurt her, but it brings her awareness into the moment where she appears to stop and reflect and
look around. Then she cries. The music is dissonant, but lightly textured, rhythmically working
to slow the tempo of Sally’s fast running around the room. My role is to support her, listen, and
provide a safe framework in which she may find stillness. I remain seated at the piano, sometimes
silent, sometimes playing according to my sense of whether she needs quietness or someone to
accompany her. She sometimes looks over to me. Silence is also a form of acknowledgement and
offers a space in which Sally hears herself. I hope that my stillness will reassure her. Very readily,
I am made to feel like one of her abusers. I have to steady myself, ground myself to counter this
feeling, yet accept it until she begins to trust me more.
This is an example of Field 1, in which music is used to create a tonal–rhythmic field of sympa-
thetic resonance to meet her mood, while working to regulate and stabilize her emotional state,
her running and her feeling out of control.
17.12.2 Episode 2: working with Sally’s alternating stability and distress
◆ Sally’s intermittent screaming and repose seem to inhabit a dream state.
◆ She quietens and whispers something that sounds like ‘Cry it’ or ‘Try it’ as she sits on the floor.
◆ I place a small woodblock and a clave near her, tapping the woodblock twice.
◆ Sally plays sporadically, alternating her playing with screaming/vocalizing.
◆ There is an intermittent regulating influence of musical pulse on Sally’s playing, providing a
framework for recurring silences that she fills with screaming—she listens to her screams.
Sally’s crying intensifies into a hollow wail; she then lets out a piercing scream. She does not
seem to be screaming at me, because she is now lying on the floor, her body still, and is staring at
the ceiling. She listens to her scream, then screams again.
She whispers very quietly, ‘Cry it’ (or possibly, ‘Try it’). I echo this quietly, but with a slightly
questioning intonation. She sits up and looks around her, as if waking from a dream. I sense that
she is not screaming at me, so much as remembering her screaming—experiences from her
past that she relives. I have the sense that she is exploring this re-experiencing of her screaming,
listening to it in the safety of the therapy room, rather as a baby listens to her own babbling.
Sally’s screams have a hollow, frighteningly sinister ring to them. I have the impression that she
appreciates my listening to her and not moving either away or towards her. I listen, but every now
and then, I play a three-note motive—a simple musical statement; I use it as a way of ‘calling out’
to her without using my voice, yet letting her know I am here and listening to her.
This is an example of Field 2: offering a musical motive as a way of providing emotional and
communicative experiences that have clear form. Islands of connectedness and basic emotional
self-regulation begin to form. Fragments of her traumatic memories seem to be replayed in her
mind. While responding to these, I work to build new experiences of relationship in a bearable
reality of a boundaried present, i.e. a state temporally and spatially structured at micro and
macro levels through the music and the music–therapeutic relationship.
I go over to her, hold out a small woodblock and place a clave near her. I begin singing softly
a simple children’s song that I then improvise, singing without words, and in a minor key.
Sally starts to tap sporadically in tempo to my singing. Her beating is unsteady, but becoming
more sustained in repetitions that last 10–15 seconds at a time. I move back to the piano where
I slightly increase the vigour of the music to support her physical/emotional response. She is
beginning to experience herself in relationship, responding with a certain alertness and self-
control, achieving a self-regulation that is generally hard for her to sustain in socially intimate
situations. I am very conscious of the effect this new co-active experience in music may have on
her, because she tends to avoid any close contact. In physical proximity, her motor impulses tend
to become chaotic. I am hoping that the way in which I am providing music will give her an
experience of steadiness that she can assimilate as she responds to the stability of its forms.
I pause, allowing silence and space; the stillness seems to add to her experience and sense of being
in control, and does not overwhelm her.
Here Field 1 is becoming more stabilized, shifting tenuously into the more formed, shared
activity of Field 2, which brings greater awareness, control and intentionality.
Sally’s emotionally steadier state continually falters, alternating with her ‘remembered scream’,
which is followed by hollow-sounding vocalizing and a gagging sound at the back of her throat.
I listen, wait and occasionally respond to her, trying to reassure her by singing a two-note motive
that responds to her sounds, but at a lower vocal register. I let her know that I am listening.
Her experiences of herself in the here and now as she plays alternate with her remembered
screams, which seem to be halfway between her habitual dissociative states and her very
disturbed emotions; she now allows herself to feel and communicate these emotions to me.
I sense it is important that I stay still, some distance from her, and witness these alternating
episodes of her screaming and playing. My steadiness and non-action seem to be all-important at
this point. In these moments Sally is no longer dissociating from the reality of her feelings, her
terror about her traumatic past. She can remain for longer episodes in a present that is not only
sympathetic to her, but is providing a structure for her to feel safe and protected enough for her
to begin to use her thinking capacities. I sense she is beginning to experience some peace when
she is held in the present, albeit with her harrowing memories.
Fields 1 and 2 are now increasingly stable, but need to be continually re-established from
moment to moment.
17.12.3 Episode 3: Sally begins to express her feelings through musically supported activity, without dissociating
◆ Sally stamps her feet, then wildly kicks at the tambourine I hold out to her.
◆ The intensity and form of my singing supports Sally’s kicking and her accompanying feelings
of anger.
◆ This leads to vocal exchanges between us.
◆ Sally lies on the floor, continuing to kick strongly upwards to the tambourine that I hold for
her. She is at first wild and angry, then stabilizes and begins to kick like a small child; normal
experiences of early play and enjoyment of her kicking develop.
◆ A sense of trust in me develops as she responds to the accented pulse of the rhythmic music
supporting her kicking.
A few minutes have elapsed since the events described in Episode 1. Sally’s emotions and behav-
iour have become volatile, erupting into wild running and hollow, almost guttural laughing. I am
concerned to find a musical way in which she can experience her feelings without becoming
overwhelmed and dissociating at the very moment of her experience. I can see her now, experi-
encing self-control and self-soothing. She is stamping and wildly kicking. I hold a tambourine to
her flailing feet, and, in my singing to her, encourage her to direct her kicks there.
Field 3 is being touched on, building on Fields 1 and 2. Field 3 is still tenuous, but she appears
to be dissociating only slightly, although because her dissociating is less extreme, it is sometimes
hard to determine if it is present. She is expressing anger, rage, pain and terror—the feelings that
usually cause her to split off into dissociative states and self-harming behaviour.
17.12.4 Episode 4: vocalizing exchanges and the development of a sense of agency; memories (autobiographical narratives) are experienced with anger and pain, but with less extreme dissociating
◆ I support Sally’s vocalizing and mood, initially from the piano, harmonizing with her sounds.
◆ Her vocalizing begins to develop into a mixture of speech and song.
◆ The trauma of abuse is always present, but with less dissociation as she now repeatedly dwells
on the experience.
◆ Sally half-sings, half-says: ‘Light…careful’, (shouts) ‘fucking bastard’.
◆ Her kicking continues with increased intensity, supported musically and by the tambourine
held to receive her kicks.
◆ As in Episode 3, Sally begins to show normal enjoyment of kicking, and to delight in this
experience.
◆ Sally’s kicking continues with intensity, and becomes steady, focused and intentional;
eventually there is a clear qualitative shift into normal enjoyment of kicking, then stamping
on the floor like a little girl.
◆ Twice, Sally initiates steady stamping, showing her engaging with a new sense of her develop-
ing autonomy, self-agency and control in musically interresponsive activity.
Sally gets up from the floor and runs off around the room. From the piano, I accompany her
movements to try to help her regulate her emotional state, using close-textured harmonies in
short phrases followed by silences. I try to give her space yet engage her awareness, and eventually
draw her back into a more grounded expression of her feelings. However, this does not happen.
In silence, Sally returns to sit on the floor and begins to sing-speak again in her hollow-sounding
voice. It is a voice, not of an innocent child, but the whispering, gasping utterance of a child in a
state of grief, pain and terror. At no time does she indicate that she wants to leave the room;
rather, she seems intent on using the music room and the time with me to vent her feelings and
her memories of her torture. She starts to express these memories in a few words interspersed by
screams. She listens to the echo of each scream. From the piano I offer a slow, gentle pulse using
warm harmonies with intermittent silences to create a temporal framework that both supports
her and gives her space in which to hear herself.
I say this to her, in words: ‘That feels good—being cross…being angry’. I felt at the time that neither
‘cross’ nor ‘angry’ were adequate in describing her raging kicks. But my speaking voice is in many
ways an important contrast within the music and the silence. I use my speaking voice as another facet
of the music, and to acknowledge in words her feelings, where necessary. I am using words as a sym-
bolic form of musical motive that can be held onto and taken in by Sally. Words bring another
dimension to the musical interaction—one that may help further regulate the musical experience,
modifying it to make it easier for Sally to internalize fundamental experiences of relatedness.
In this Episode, Fields 1 and 2 are stabilized and she begins to express herself in words (Field 3).
Here, the whole pathway of symbolization from Fields 1 to 3 starts to hold together, and indicates
that she is achieving a level of integration in that moment. This recurs in later sessions with an
increasing range of emotional expression that is less habitual and impulsive. She increasingly uses
words in complete phrases and, eventually, sentences. Here, we see the habitual and fragmented
memories or hallucinations of her trauma being contained while at the same time starting to
give way to newly emerging experiences of herself and her relationships in the present; i.e., the
creation of new, healthy pathways of self and self in relation.
17.12.5 Episode 5: end of session—calmer and tired

◆ Sally sits limply on a chair at the piano close to me.
◆ I hold a hand-chime for her to play (otherwise she would be likely to throw it away, as this is a
fairly new experience).
◆ I provide a gentle, flowing, left-hand accompaniment using a three-beat phrase, ending with a
pause that invites her to play; this is a gentle, clear structure that steadily engages her.
◆ A few minutes later, she half-sings a few inaudible words, then in a resonant voice sings
‘Da–rk’, sustaining the vowel-sound over my changing harmonies; she does not split off into
the kind of screaming that has often accompanied these hallucinations of her past experiences
invading the present.
◆ Sally then sings/whispers: ‘Bye-bye … time to go … light shining … ’, before suddenly coming
to a different level of consciousness, saying in a quite ordinary voice: ‘Sweating!’, while examin-
ing her hot, rather flushed hand.
At the end of this session, after her volatile, extremely disturbed feelings and play, Sally is sitting on
a chair beside me at the piano. She is swaying her body to a gentle lulling song I am playing. I use a
rocking motive with harmonies that are warm and solidly in one tonality to support her quiet vocal
sounds. Her bodily response is steady enough for me to hope she might be able to play, extending
her experiences of herself in relationship with the music and myself, and focusing her self in the
reality of present time and space. While keeping the accompaniment in my left hand, I offer her a
hand chime with my right hand, pausing briefly to show her how it is played. She quickly under-
stands how to pull back and release the hoop of rubber-covered metal so that it hits the metal chime
with a singing tone. Rather than leaving her to explore freely, which is something that she cannot
yet do without losing the flow of the activity, I offer her the chime at each phrase end, so that she
has the experience of structure, forming and shaping a shared experience that she can tolerate and
begin to anticipate with focused awareness and enjoyment. Sally is momentarily calm, supported by
the music and her playing. She sings fragments from her disturbed memories expressed earlier in
the session: ‘Da-a-a-rk…bye-bye…time to go…light shining’. I accompany her long sustained
vowel-sounds with sonorous hymn-like harmonies that offer a strong tonal centre and soulful
solemnity to her singing. Her sudden noticing of her hot hand and her comment ‘Sweating!’ is
a lovely moment when she is able to be fully aware without her awareness and connectedness
triggering her habitual anxiety and dissociation. She leaves the room quiet and contented.
Without the music itself, this case material may seem a little dry. In words, I can only attempt to
describe some of the ways in which music was used creatively in this context—in a way that is
adaptive, responsive, yet framing and giving context to a disturbed child’s feelings. These episodes
give a sense of how she was met in music, how her feelings were supported but also changed by the
use of music and through the different fields of relatedness in the music–therapy relationship. All-
important was the growth of Sally’s capacity for self-regulating within the musical-relational
frame. As this grew, so did her sense of reality, her trust in herself and in others in the present.
17.13 Summary of change

In the course of the first three years of music therapy, Sally changed from being a highly
disturbed, easily re-traumatized and dissociating child to one who could allow herself to
respond to and assimilate new experiences, particularly basic sensory experiences that formerly
would easily overwhelm her. Her feelings of anger, despair and sadness were now expressed
less impulsively and more coherently, both in her musical responses, which were now less
fragmented, and in words. She was starting to have fun in music, although this experience still
required careful regulation by means of clear structures or phrasing, using solid harmonic
textures. This served to sustain a stability of responsiveness in Sally that could not be achieved
with bland, consonant musical accompaniment. Once persistently self-harming, screaming, or
completely passive and remote, she was now beginning to play and participate in group activities
at school. Her expressive language was becoming more fluent and coherent, as were her drawing,
reading and writing.
In music therapy with Sally, I first aimed to provide a space that she could experience as predictable
and sufficiently safe, to explore the emotions that constantly assailed her in her day-to-day
life. Through my intent listening and providing temporal musical ‘containers’, she experienced
security and ‘being held’ by the music (in terms of structure, form, timbres and so forth).
I worked directly with her feelings as they were manifested in the room; I tried to engender
shared musical experiences through which she was able to build a sense of continuity, a sense of
herself in terms of body boundaries, listening, self-agency, self-reliance, being animated without
being terrified, and being able to remember and want to repeat an experience. These experiences
began to enable her to play, to touch and explore sensations within a musical framework
that helped to define a new reality for her. She could begin to tolerate and trust, initiate shared
play, and show her sense of fun and enjoyment in music. This was not easily sustained, but
grew in the four years that I saw her, until she was 14 years old. Her songs expressed her
emotions, and enabled exploration and a degree of resolution of her feelings of shame and guilt,
sulliedness and rage.
My account of music therapy with Sally describes how re-creative processes were set in motion
to help repair the damage done to this child during formative stages of her development; how the
medium of music played a vital role in helping her recover—or build—a bodily, emotional and
mental sense of self. I have emphasized the potency of music and musical engagement from psycho-
biological and developmental points of view; these are also intrinsic to my understanding of the
psychodynamic phenomena underpinning the growth of trust and meaning in any therapeutic
relationship. This poietic clinical model of change has developed from my experiences with
children like Sally, and maps different fields of relating in music therapy as a guideline when the
territory of relationship is tenuous and direction is far from clear. It is not intended to be didactic;
rather, it is a way of thinking about certain aspects of the therapeutic process where early states of
mind and mindlessness prevail.
Ultimately, my understanding of poiesis (creative–constructive change) in music therapy is
that it is the art of listening which lies at the root of clinical musical perception and action; it is
the art of tuning into and connecting with the soul of the child, with the spirit or will where one
perceives a quickening or a deadness; it is the art of sensing what is beneath the surface, beyond
the audible and visible; it is the capacity to both sense and make sense of what is emerging
between and within oneself and the child from moment to moment, from session to session.
I have described some of the subtle developments of musical interaction that, although initially
fragmented, gradually begin to take shape in a musical relationship that responds to the child’s
qualities and needs. As these developments take shape, so does the growth of mind and meaning
within the musical relationship. The child becomes more able to symbolize, to form thoughts,
to express feelings at non-verbal, preverbal and verbal levels. Early preverbal symbolization is a
realm of early formation of the child’s interpersonal world. Here, music can help develop proto-
forms of thinking and mindfulness. Such a musical relationship can assist the development of
an intersubjective and cohesive sense of self. As I have described above, music can activate and
regulate basic forms of sensory, motor and emotional experiences in an interpersonal relationship.
The aesthetic musical subtleties of shaping and inviting, offering an interpersonal context for
such experiences, can help a traumatized child to internalize rather than throw out such basic
experiences of self and other. Therapy may last for months or, as in the case of Sally, for years.
It therefore requires professional commitment and dedication, and support for the therapist in
the form of regular clinical supervision. This kind of process, as I have already emphasized, does
not succeed alone, but can contribute uniquely to the range of help that such children need.
17.14 Aspects of therapeutic change

Music therapy with Sally focused on developing her capacity for relationship and trust through
experiences of the predictability and variation that form the essence of music. By listening to her
and accompanying her through her traumatized, disturbed states of mind, where she was very hard
to reach, I was able to help her experience stillness and space unassailed by anything or anyone,
yet a space in which she felt supported. Music therapy helped her to develop a tolerance of normal
sensory and play experiences that brought with them, in the first instance, a basic sense of body
boundaries and physical safety.
Experiencing the cause and effect of her actions at a sensory–motor–emotional level was
motivated by music, bringing a flow to and modifying her impulses. The musical instruments
themselves provided her with sensory experiences over which she felt she had control, yet she
could still experience their immediacy and vitality, linking her touching, listening, seeing, being
and being-with, all in an instant. A new self-cohesion and integration began to form in Sally.
There were marked improvements in emotional regulation in shared and anticipatory play,
where she could now become emotionally involved, tolerate more variations in play, and need
less adaptive timing and framing of musical support from myself. Her motor-coordination and
eye–hand coordination improved. Her rage and terror continued to erupt from time to time, but
were more contained. She was increasingly able to reflect on her feelings, using complete
sentences in a normal tone of voice, rather than fragmented words screamed or whispered as
though triggered by a hallucination in which the past invaded the present.
There were other gains, too, for Sally: the development of a longer attention span, greater concen-
tration, intentionality, and capacity for engaging in sustained interresponsive play. She began to
show a sense of humour and a growing capacity to explore and feel safe in new experiences. She
began to sing and speak about herself in the present tense, without continual dissociation and
flashbacks. Along with this improvement and the capacity to form symbols—thoughts, words
and sentences—came the next stage of therapy (not reported here), which worked with her sense
of guilt and shame and feeling dirty, which she expressed in improvised songs. Now that she
could hold thoughts in her mind and reflect, it was possible to work with her both musically and
verbally. This final stage of work continued for another four years.
17.15 Aspects of educational development

Sally’s progress at school was outlined in an individual educational profile (IEP) when she was
11 years old. It reported Sally’s improved concentration and ability to sustain shared interactive
play and turn-taking with her peers. It noted that she was emotionally more stable, less prone to
impulsive shifts in mood. Her motor skills were maturing, and eye contact much improved. Her
behaviour was generally more emotionally stable, although she was still frightened of men and
dogs. Her impulsiveness and poor capacity for listening had shifted towards greater stability
and ability to cooperate. She could think and reflect on her feelings and actions more coherently.
Her words had become clearer and were used with meaning, communicating her needs and
feelings. She could read and write simple sentences, and had acquired basic numeracy. Her
persistent hand washing and other obsessive behaviour had almost disappeared. She enjoyed
cooking as well as horse-riding and swimming. Her play showed more imagination and a
decrease in repetitive, habitual patterns. The IEP requested that music therapy continue as part of
Sally’s statement of special educational needs.
Music therapy alone did not achieve all these changes, but it contributed, particularly at the
sensory–motor–emotional level, in helping Sally to experience herself and others with increasing
awareness and trust, with a sense of continuity in the flow and vitality of life, where once terror
and hallucinations of the past had prevailed.
17.16 Conclusion
The clinical potential of music therapy can be more fully appreciated when music, musicality and
emotional expression are understood as being biologically based or part of our human nature.
I have considered how children’s ways of playing and relating in music, and playing with musical
instruments, offer insights and opportunities for change from sensory, preverbal levels to more
sophisticated, explicit levels of expression. In music, there is a two-way channel between the
sensory realm, from which meaning (or symbolizing) emerges, and the symbolic
realm of imagination and play, articulate and full of ideas.
I have examined how music can be used clinically within a music–therapeutic relationship to
generate changes in a child’s being and relating, where this has been traumatized at a core level
by early sexual abuse. Because music can both reach and regulate the core of our being, for the
traumatized child it can work to support and transform the distorted and disrupted foundations
of the bodily–emotional self. As part of a multidisciplinary programme, music therapy with
sexually abused children seeks to help a child to build new patterns of being and being with, as
well as working through the trail of devastation left by early trauma. Thus, a coherent sense of
self begins to form. From this self-coherence, or connectedness, children can begin to find a safe
space within as well as between themselves and others that can be felt as stable, yet flexible in its
vitality.
In music as in life, there is a need for variation and repetition, a sense of continuity but
not stagnation. Only when this stable, flexible foundation is in place can children with past
trauma begin to play and take in new experiences. While the past cannot be changed, its hold
over abused children can be modified, so that they can develop a sense of identity, including
bodily boundaries, autonomy and resilience. By being reached and understood through their
inborn musicality within a music therapy relationship, sexually abused children may be helped to
develop a more cohesive, sounder sense of self and to join the ‘dance of wellbeing’ to which every
child has a right.
Acknowledgement
I am grateful for permission to use in this chapter material from my paper Music therapy with
sexually abused children (2006). Clinical Child Psychology and Psychiatry, 11(2), 249–269.
References
Aldridge D (1996). Music therapy research in practice and medicine: From out of the silence. Jessica Kingsley,
London.
Alvin J and Warwick A (1991). Music therapy. Oxford University Press, Oxford.
American Psychiatric Association (2000). Diagnostic and statistical manual of mental disorders. Revised
4th edn. American Psychiatric Association, Washington, DC.
Ansdell G (1995). Music for life: Aspects of creative music therapy with adult clients. Jessica Kingsley, London.
Austin D (2001). In search of the self: The use of vocal holding techniques with adults traumatized as
children. Music Therapy Perspectives, 19, 22–30.
Bargiel M (2004). Lullabies and play songs: Theoretical considerations for an early attachment music
therapy intervention through parental singing for developmentally at-risk infants. Voices: a world forum
for music therapy. http://www.voices.no/mainissues/mi40004000143.html.
Beebe B, Jaffe J, Lachmann F, Feldstein S, Crown CL and Jasnow MD (2000). Systems models in
development and psychoanalysis: The case of vocal rhythm and coordination and attachment.
Infant Mental Health Journal, 21(1–2), 99–122.
Beebe B and Lachmann F (1988). The contribution of mother–infant mutual influence to the origins of
self- and object-relationships. Psychoanalytic Psychology, 5(4), 305–337.
Bruscia KE (1998a). Defining music therapy. Barcelona, Gilsum, NH.
Bruscia KE (1998b). The dynamics of music psychotherapy. Barcelona, Gilsum, NH.
Bunt L (1994). Music therapy: An art beyond words. Routledge, London.
Condon WS and Ogston W (1966). Sound film analysis of normal and pathological behavior patterns.
Journal of Nervous and Mental Disorders, 14, 338–347.
Heinemann, London.
Darnley-Smith R and Patey H (2003). Music therapy. Sage, London.
Dissanayake E (2000). Antecedents of the temporal arts in early mother–infant interaction. In NL Wallin,
B Merker and S Brown, eds, The origins of music. MIT Press, Cambridge, MA.
Emde RN, Biringen Z, Clyman RB and Oppenheim D (1991). The moral self of infancy: affective core and
procedural knowledge. Developmental Review, 11, 251–270.
Etkin P (1999). The use of creative improvisation and psychodynamic insights in music therapy with an
abused child. In T Wigram and J De Backer, eds, Clinical applications of music therapy in developmental
disability, paediatrics and neurology, pp. 155–165. Jessica Kingsley, London.
Evans JR (1986). Dysrhythmia and disorders of learning and behaviour. In JR Evans and M Clynes, eds,
Rhythm in psychological, linguistic and musical processes, pp. 249–274. Charles C Thomas, Springfield, IL.
Fosha D (2003). Dyadic regulation and experiential work with emotion and relatedness in trauma and
disorganized attachment. In MF Solomon and DJ Siegel, eds, Healing trauma: Attachment, mind, body,
and brain, pp. 221–281. W.W. Norton, New York.
Gallese V and Lakoff G (2005). The brain’s concepts: The role of the sensory-motor system in reason and
language. Cognitive Neuropsychology, 22, 455–479.
Gold C, Voracek M and Wigram T (2004). Effects of music therapy for children and adolescents with
psychopathology: A meta-analysis. Journal of Child Psychology and Psychiatry, 45(6), 1054–1063.
Hadley S (ed.) (2003). Psychodynamic music therapy: Case studies. Barcelona Publishers, Gilsum, NH.
London.
Herman J (1992). Trauma and recovery: The aftermath of violence – from domestic abuse to political terror.
Basic Books, New York.
Hobson RP (2002). The cradle of thought: Exploring the origins of thinking. Pan, London.
Janet P (1925). Psychological healing, vols 1, 2. Macmillan, New York. (Original publication Les médications
psychologiques, vols 1–3, 1919. Félix Alcan, Paris.)
Malloch S (2005). Why do we like to dance and sing? In R Grove, C Stevens and S McKechnie, eds, Thinking
in four dimensions: Creativity and cognition in contemporary dance, pp. 14–28. Melbourne University
Press, Melbourne.
Malloch SN (1999). Mothers and infants and communicative musicality. Musicae Scientiae (Special issue
1999–2000), 29–57.
Nordoff P and Robbins C (1977). Creative music therapy: Individualized treatment for the handicapped child.
John Day, New York.
Nordoff P and Robbins C (1983). Music therapy in special education. The John Day Company, New York.
Nordoff P and Robbins C (2004). Therapy in music with handicapped children. Barcelona Publishers,
Gilsum, NH.
Nordoff P and Robbins C (2007). Creative music therapy: A guide to fostering clinical musicianship. Revised
edn. First published 1977, John Day, New York. Barcelona Publishers, Gilsum NH.
pp. 37–55. Oxford University Press, Oxford, New York, Tokyo.
Pavlicevic M (1997). Music therapy in context: Music, meaning and relationship. Jessica Kingsley, London.
Pavlicevic M (1999). Music therapy – intimate notes. Jessica Kingsley, London.
Pavlicevic M (2001). A child in health and time: Guiding images in music therapy. British Journal of
Peretz I (2001). Listen to the brain: A biological perspective on musical emotions. In Patrick N Juslin and
John A Sloboda, eds, Music and emotion: Theory and research, pp. 105–134. Oxford University Press,
Oxford.
Perry BD, Pollard RA, Blakley TL, Baker WL and Vigilante D (1995). Childhood trauma, the neurobiology
of adaptation, and ‘use-dependent’ development of the brain: How states become traits. Infant Mental
Health Journal, 16(4), 271–291.
Robarts JZ (1994). Towards autonomy and a sense of self: music therapy and the individuation process in
relation to children and adolescents with early onset anorexia nervosa. In D Dokter, ed., Arts therapies
and clients with eating disorders, pp. 229–246. Jessica Kingsley, London.
Robarts JZ (1998). Music therapy and children with autism. In C Trevarthen, K Aitken, D Papoudi and
J Robarts, eds, Children with autism: Diagnosis and interventions to meet their needs, pp. 172–202.
Jessica Kingsley, London.
Robarts JZ (1999). Clinical and theoretical perspectives on poietic processes in music therapy with
reference to Nordoff and Robbins’ case study of Edward. Nordic Journal of Music Therapy,
8(2), 192–199.
Robarts JZ (2000). Music therapy and adolescents with anorexia nervosa. Nordic Journal of Music Therapy,
9(1), 3–12.
Robarts JZ (2003). The healing function of improvised song in music therapy with a child survivor of early
trauma and sexual abuse. In Susan Hadley, ed., Psychodynamic music therapy: Case studies, pp. 141–182.
Barcelona, Gilsum, NH.
Robarts JZ (2006). Music therapy and sexually abused children. Clinical Child Psychology and Psychiatry,
11(2), 249–269.
Robarts JZ and Sloboda A (1994). Perspectives on music therapy with people suffering from anorexia
nervosa. Journal of British Music Therapy, 8(1), 9–15.
Robbins C and Forinash M (1991). A time paradigm: Time as a multilevel phenomenon in music therapy.
Music Therapy, 10(1), 46–57.
Rogers P (2003). Working with Jenny: stories of gender, power and abuse. In S Hadley, ed., Psychodynamic
music therapy: Case studies, pp. 123–140. Barcelona, Gilsum, NH.
Schore AN (2001). The effects of early relational trauma on right brain development, affect regulation, and
infant mental health. Infant Mental Health Journal, 22, 201–269.
Schore AN (2003). Early relational trauma, disorganized attachment, and the development of a predisposi-
tion to violence. In MF Solomon and DJ Siegel, eds, Healing trauma: Attachment, mind, body, and brain,
pp. 107–167. W.W. Norton, New York.
Sears W (1996). Processes in music therapy. First published 1968, in E Thayer Gaston, ed., Music in therapy,
Macmillan, New York. Nordic Journal of Music Therapy, 5(1), 33–42.
Siegel DJ (1999). The developing mind: How relationship and the brain interact to shape who we are.
Guilford Press, New York and London.
Siegel DJ (2003). An interpersonal neurobiology of psychotherapy: The developing mind and the resolution
of trauma. In MF Solomon and DJ Siegel, eds, Healing trauma: attachment, mind, body, and brain,
pp. 1–56. WW Norton, New York.
Stern DN (1995). The motherhood constellation: A unified view of parent–infant psychotherapy.
BasicBooks/HarperCollins, New York.
Stern DN (1998). The process of therapeutic change involving implicit knowledge: Some implications of
developmental observations for adult psychotherapy. Infant Mental Health Journal, 19(3), 300–308.
Stern DN (2004). The present moment in psychotherapy and everyday life. WW Norton, New York and London.
Sutton JP (2002). Music, music therapy and trauma: International perspectives. Jessica Kingsley, London.
Trehub S (2001). Musical predispositions in infancy. In R Zatorre and I Peretz, eds, The biological
foundations of music, pp. 1–16. Annals of the New York Academy of Sciences, New York.
intersubjectivity. In M Bullowa, ed., Before speech: The beginnings of human communication,
understanding in infants. In D Olsen, ed., The social foundations of language and thought: Essays in honor
of J. S. Bruner, pp. 316–342. New York, W.W. Norton.
Trevarthen C (1993). The self born in intersubjectivity: The psychology of an infant communicating.
In U Neisser, ed., The perceived self: Ecological and interpersonal sources of self-knowledge, pp. 121–173.
In RAR Macdonald, DJ Hargreaves and D Miell, eds, Musical identities, pp. 21–38. Oxford University
Press, Oxford.
Trevarthen C and Malloch SN (2000). The dance of wellbeing: Defining the musical therapeutic effect.
Trevarthen C and Schögler B (2006). Musicality and the creation of meaning: Infants’ voices and jazz
duets show us how, not what, music means. In CM Grund, ed., Cross-disciplinary studies in music and
meaning, pp.. Indiana University Press, Bloomington, IN.
Trevarthen C, Aitken KJ, Vandekerckhove M, Delafield-Butt J and Nagy E (2006). Collaborative regulations
of vitality in early childhood: Stress in intimate relationships and postnatal psychopathology.
In D Cicchetti and DJ Cohen, eds, Developmental psychopathology, Volume 2, Developmental
neuroscience, 2nd edn, pp. 65–126. Wiley, New York.
Tulving E and Markowitsch HJ (1998). Episodic and declarative memory: Role of the hippocampus.
Hippocampus, 8, 198–204.
Tyler H (2002). The music prison: Music therapy with a disabled child who had experienced trauma.
In JP Sutton, ed., Music, music therapy and trauma: International perspectives, pp. 175–92.
Tyler H (2003). Being Beverley: Music therapy with a troubled eight-year-old girl. In S Hadley, ed.,
Psychodynamic music therapy: Case studies, pp. 37–51. Barcelona, Gilsum, NH.
van der Kolk B (2003). Posttraumatic stress disorder and the nature of trauma. In MF Solomon and
DJ Siegel, eds, Healing trauma: Attachment, mind, body, and brain, pp. 168–195. WW Norton, New York.
van der Kolk BA and Fisler R (1995). Dissociation and the fragmentary nature of traumatic memories:
overview and exploratory study. Journal of Traumatic Stress, 8(4), 505–525.
van der Kolk BA and van der Hart O (1989). Pierre Janet and the breakdown of adaptation in psychological
trauma. American Journal of Psychiatry, 146(12), 1530–1540.
van der Kolk BA, van der Hart O and Burbridge J (1995). Approaches to the treatment of PTSD. Available
at http://www.trauma-pages.com/vanderk.htm, accessed 27 November 2007.
Wallin NL, Merker B and Brown S (eds) (2000) The origins of music. MIT Press, Cambridge, MA.
Wheeler BL and Stultz S (April 2001). The development of communication: Developmental levels of
children with and without disabilities. European Music Therapy Congress, Naples, Italy. Available on
Info-CD Rom IV, University of Witten-Herdecke (2002) and at http://www.musictherapyworld.net/
Wickham RE and West J (2002). Therapeutic work with sexually abused children. Sage, London.
Wigram T (2003). Improvisation: Methods and techniques for music therapy clinicians, educators, and
students. Jessica Kingsley, London.
Wigram T and De Backer J (1999a). Clinical applications of music therapy in developmental disability,
paediatrics, and neurology. Jessica Kingsley, London.
Wigram T and De Backer J (1999b). Clinical applications of music therapy in psychiatry. Jessica Kingsley,
London.
Chapter 18
The human nature of dance:
Towards a theory of aesthetic
community
Karen Bond
18.1 Introduction
My first theoretical musings about whether dance might be part of our human biological
endowment date to my daughter’s infancy, a time of sensory–emotional delights, full of breathy
duets of dipping and twirling, finger dancing and shared rhythms. I had a distinct sense that she
was initiating these dances, and I remember thinking, ‘I wonder if we’re wired for this.’ I forged a
deeper connection between dance and biology two decades later during an extended inquiry into
the meanings and effects of dance for six non-verbal children with deaf-blindness (Bond 1991,
1994a). In the process of this journey into uncharted territory, I began to peruse literature related
to the origins of the arts and aesthetic behaviour (Eibl-Eibesfeldt 1973, 1989; Cobb 1977; Maquet
1986; Dissanayake 1988, 1992; Rentschler et al. 1988), concluding that my research supported a
bioaesthetic theory of human dance.
This chapter provides an overview of the research, with particular focus on an emergent social
phenomenon that I term ‘aesthetic community’. Over the course of an intensive dance
programme, children and participating adults began to look to me like a cohesive communicative
unit (Bond 1991, 1994a). They demonstrated a high degree of synchronous action, developing a
collective style of movement. Group affect in terms of excitement, humour and playfulness
evolved into a general ethos of celebration. A kind of work ethic was present in participants’
shared commitment to dance content, including strenuous weight-based activities (such as
partner pushing, pulling and counter-balancing). Considering the severity of the children’s
communication challenges, this finding may be salient to our understanding of the human
nature of dance.
18.2 Pioneers of aesthetic community

On a fine spring day, I climbed a flight of stairs to a residential unit in a school for the blind.
I turned a corner into a space where four boys and two girls, sitting on the floor in close
proximity, but facing different directions, were creating rhythmic designs with their hands.
Choice of this place and these children for an in-depth examination of dance felt serendipitous.
On reflection, I identified a complex of altruistic, scholarly and aesthetic intentions. In the role of
dance fieldwork supervisor for pre-service teachers, I had become an advocate for populations
that may not have ready access to dance in a social context. I knew also that I wanted to assess the
value of dance for children with disabilities. Like ethologist Eibl-Eibesfeldt (1989), who studied
people with deaf-blindness to develop hypotheses about the biological foundations of human
behaviour, I thought that this study might yield insights into the human nature of dance.
Ranging in age from 5 to 8 years, the six children were survivors of maternal rubella, a viral
embryopathy that can cause severe congenital defects. Born partially deaf and blind, they
presented a range of other debilitating conditions, including cerebral palsy and cardiovascular
dysfunctions. Two had endured failure-to-thrive syndrome during their neonatal lives.
Observing them for the first time, I noticed that each child displayed a unique vocabulary of
gestures and rhythms that appeared to be oriented towards light. As a student of Balinese
dance, I was reminded of dances of south-east Asia that feature ostinato rhythms,
hand gestures and facial expression. I understood that the children’s stereotypical movements
were considered maladaptive and that reducing those behaviours was and continues to be a
major educational concern. Van Dijk (2001, p. 1) asserts, ‘All educational researchers agree that
stereotyped behaviour patterns interfere with learning.’ Idiosyncratic as the children appeared, it
seemed to me that these dance-like expressions had intrinsic value to the children and
were therefore a starting point for dance. I reasoned that these self-initiated, well-practised
embodiments were a source of pleasure, consistency and form in the children’s potentially
chaotic life experiences. Just like my infant daughter, it seemed that these children, too, had been
‘born to dance’.
18.3 The population

Humans with both distance senses impaired experience enormous challenges to communication.
The development of articulate speech is unusual, and severely affected individuals may demonstrate
a lack of interest in social interaction, especially with peers. Communication skills are
therefore a primary focus of educational programming (Enos 1995; Aitken et al. 2000; Jones
2002). Goode (1994) asserts that institutionalized children are in danger of chronic undersocialization
with peers, and therefore of having no access to ‘kids’ culture’, a social–physical space
where children construct their own lives independently of adults (pp. 182–83). Pease (2000)
observes, ‘Nowhere are the devastating effects of deaf-blindness more evident than in the area of
communication’, even for ‘relatively able youngsters’ (pp. 36–37), noting that early failures to
communicate may establish a downward spiral into a habit of social isolation. This chapter
describes the reverse: a process that grows outward from isolation to community, a development
grounded in group dance within a child-centred, therapeutic framework.
18.4 Dance for people with sensory impairment: theoretical foundations
The underpinnings of dance content and methods were forged from literature in the then
emerging contemporary fields of dance therapy and special arts education, in particular, material
related to dance for children with sensory impairments. Table 18.1 lists concepts and strategies
gleaned from a unique body of 1970s and 1980s writings by pioneer practitioners. There is
congruence across these descriptive sources. All embrace the assumption that dance is accessible
to all human beings. Methods draw on individual movement characteristics and developmental
approaches that ‘move connectedly and meaningfully from one level of experience to the
next’ (Delaney 1977, p. 150). All writers stress the importance of a safe environment in
which children with sensory challenges can explore their individual expressive potential.
Goals are oriented to holistic concerns, as evidenced in linguistic items such as whole body,
multisensory, expression of emotions, physical presence, freedom, self-confidence and relationship
skills. Table 18.1 delineates specific dance content for persons with sensory impairment,
including breath awareness, rhythm, the fundamentals of locomotion, posture, gesture,
weight experiences, qualities of movement (dynamics and space), and social interaction through
movement.
THE HUMAN NATURE OF DANCE: TOWARDS A THEORY OF AESTHETIC COMMUNITY 403
Table 18.1 Dance for children with sensory impairments: goals, content and methods of pioneer
practitioners

Kratz (1973) (VI)
Fundamentals of locomotion
Quality of movement
Tension and release
Linear/directional movement
Body position in space
Focus more on body than on space or time
‘Running is a birth right’

Delaney (1977, 1980) (SC)
Importance of therapeutic relationship
Developmental approach
Expression of emotions through dance
Mirroring/imitation
Safety: known space and trust in therapist

Canner (1980) (VI/MI)
Body awareness
Relationship skills
Dynamic contrasts to widen range
Strength
Flow of movement
Tactile stimulation
Blind mannerisms as expressive behaviour

Mason (1980) (VI)
Synchronous interaction: encourage use of residual vision, body and eye contact, rhythm and space
Work from movement characteristics of the population
Body sounds
Circle forms
Body weight, alignment

Berrol (1981) (DSI/MI)
Gross sensory–motor system: overt responses of the whole body
Sensory integration: cross-modal presentation to the strongest modalities
Slow pace
Repetition
Ostinato singing
Importance of vestibular experiences: rocking, rolling, swaying, swinging, turning

Bernstein (1981) (GT)
Flow of breath
Multisensory stimulation
Laban elements
Developmental approach (Kestenberg)
Whole body response

Weisbrod (1974) (VI)
Breath
Qualities of physical presence and strength
Freedom, range, and self-confidence
Whole body locomotion
Momentum
Body universals: body weight, breath, contrasts of open/closed, self/not-self, front/back, right/left (Laban)
Rhythm and multisensory experiences
Space exploration
Increase range of movement
Relationship awareness
Sound, music
Work from existing movement styles

Pisciotta (1980) (HI)
Special competencies of hearing impaired children: concentration, absence of gender bias, physical
expressiveness due to reliance on facial expressions in communication
Rhythm and gesture

Reber (1980) (HI)
Laban-based creative movement programme for young children with hearing impairments
Dance as a ‘universal language’

Leventhal (1979, 1980) (SC)
Sequential movement experiences enhance learning readiness
‘Holding environment’: secure spatial boundaries, repetition, attention to developmental patterning
Spontaneous motor expressions
Rhythm
Replicable session formats
Therapeutic relationship as a ‘dyadic union’
Elements of dance: space, time, weight and flow (Laban)

DSI, dual sensory impairment; GT, general text; HI, hearing impairment; MI, multiple impairment; SC, special child;
VI, visual impairment.
Regarding quality of movement, many early practitioners were influenced by the comprehensive
movement theories of Rudolf Laban (1879–1958). A keen observer of human movement,
Laban posited that kinetic energy, or ‘effort’, is expended according to a mover’s inner attitude to
the motion factors of flow, space, weight and time (Laban 1960; Laban and Lawrence 1974;
Bartenieff with Lewis 1980; Davies 2001). Further, all movement tends into space, as the mover
responds to internal motivations and external stimuli. Human movement is a vital, rhythmic
process that is at once adaptive and expressive (Davis 1983).
An assumption of Laban theory is that individuals have characteristic ways of moving in terms
of body, energy and space, and that movement behaviour is the observable manifestation of
physical, emotional, mental and social processes (Amighi Kestenberg et al. 1999; Loman 2005).
Thus, the expansion of movement function and expression can be correlated with growth
in other aspects of being. These assumptions permeate the early literature in dance therapy
(Koch 1984), including sources presented in Table 18.1. Laban’s theories continue to be adapted
and applied in a range of fields from dance to career counselling.
While available background literature provided a wealth of in-depth practitioner insights,
I found no empirical research (in English) on the effectiveness of dance for children with sensory
impairment. Only one study examined dance/movement therapy with the deaf-blind population
(Berrol 1981), a case study of four teenage girls. The author highlighted the challenge of meeting
the complex needs of individuals in the context of an effective group process, a problem I came
to know well. A literature update for this chapter indicates a continuing lack of research. For
example, a search of the entire collection of the American Journal of Dance Therapy (1977–2007)
reveals only four published articles on dance for people with sensory impairments, the most recent
being Frost (1984). To the best of my knowledge, therefore, this chapter presents unchallenged
perspectives on the value and meanings of dance for non-verbal children with deaf-blindness.
At the time of my fieldwork in the mid-1980s, the contemporary field of dance therapy was in
its infancy in Australia. It was exciting to be part of a new discourse that sought to define dance
therapy in the Australian context. Borrowing from Serlin (1993), some voices in the debate sought
to honour the ‘root images’ of dance emanating from Australia’s ancient aboriginal culture. Still
others stressed that the therapeutic relationship is at the core of dance therapy, emphasizing its
psychotherapeutic functions (Delaney 1980; Leventhal 1979, 1980). My professional mentor and
Australian dance therapy pioneer Johanna Exiner advocated strongly that the embodied art of
dance itself is central to its therapeutic significance (Exiner and Kelynack 1994; Bond 2008).
As noted above, dance has an ancient role as a therapeutic modality (Serlin 1993; El Guindy
and Schmais 1994). Working with both individuals and groups, contemporary dance therapists
tap into our human non-verbal predilection to dance for the purposes of healing, perceptual
integration, transformation and meaningful social relationships (Loman 2005).
A broad definition of dance framed the research reviewed in this chapter. Here, dance is
conceptualized as intentional non-verbal behaviour that expresses, through the dynamic patterns of
special movements in space, a heightened felt sense of self and/or environment. Children’s
spontaneous natural actions and idiosyncratic mannerisms were viewed as part of their vocabulary of
special movements that might be extended through dance (Cobb 1977).
18.5 The empirical design: research as an art–science duet

The study of six non-verbal children in dance was launched with an experimental research
design. The formal experimental aim was to evaluate the influence of group dance (hereafter
referred to as ‘Dance’ and described in detail in section 18.6.1) on children’s social and task
engagement. In spite of the small number of participants and the impossibility of random
sampling, an interest in conceptualizing children as social groups and identifying criteria for
behavioural engagement suggested an experimental starting point. I reasoned that the combination
of personal, direct involvement and interpretation with the formal coding and categorizing
of behaviour would provide a rigorous research strategy. Inspired by a host of writers advocating
for integration of the epistemologies of science and art (Young 1974; Maquet 1979; Eisner 1981;
Capra 1982; Salk 1983; Bohm and Peat 1987), I described this synthesis of formal measurement
with open-ended experiential inquiry as an ‘art–science duet’ (Bond 1991).
A two-group, repeated-measures crossover design (Campbell and Stanley 1963; Creswell 2003)
was implemented to allow comparison of children’s engagement in two contexts: Dance and Play
(Play is described in section 18.6.2). Ethically this was considered a suitable design, as children
received both programmes. Each group experienced five weeks of one context (Dance 1 or
Play 1) followed by five weeks of the other (Play 2 or Dance 2). Pretests were conducted in the
first week of each experimental period (weeks 1 and 8). Thirty-minute sessions were held four
times a week for a total of 20 sessions per programme per experimental period.
18.5.2 Operationalizing engagement in dance and play

To generate a representative sample of behaviour, the sessions were video-recorded approximately
once a week. Three independent observers, including the researcher, coded the video data.
To enhance validity, appropriate instruments were designed; these drew on pilot observations
and developmental literature for the deaf-blind population (van Dijk 1977; McInnes and Trefry
1982; Kates et al. 1981; Stillman and Battle 1986). Table 18.2 lists 16 non-verbal engagement
criteria with their operational definitions. These cover a dimension from high to low engagement
and a range of behavioural qualities and skills. We observed children in 30-second intervals for
the presence and absence of engagement behaviours over approximately 14 hours of video.
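The interval coding described above can be illustrated with a minimal sketch. It assumes that each criterion is recorded simply as present or absent in each 30-second interval, and that a child's score is the share of intervals in which the behaviour was seen. The chapter does not say how scores were tallied, so the function names, the example codes and the proportion measure are all hypothetical.

```python
from typing import Dict, List

INTERVAL_SECONDS = 30  # interval length stated in the text


def intervals_in(minutes: float, interval_seconds: int = INTERVAL_SECONDS) -> int:
    """Number of whole observation intervals in a recording of the given length."""
    return int(minutes * 60 // interval_seconds)


def presence_rate(codes: List[bool]) -> float:
    """Share of intervals in which a behaviour was coded as present (assumed measure)."""
    if not codes:
        raise ValueError("no intervals coded")
    return sum(codes) / len(codes)


# Hypothetical codes for one child over six intervals (illustrative only, not study data).
example: Dict[str, List[bool]] = {
    "focus": [True, True, False, True, True, True],
    "resistance": [False, False, False, True, False, False],
}

rates = {criterion: presence_rate(codes) for criterion, codes in example.items()}
```

Under these assumptions a 30-minute session contains at most 60 intervals, and 14 hours of video about 1680.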
18.6 Content and methods of dance and play

Goode paints a picture of inhumane institutional practice in the 1970s, when behaviour modification
was used routinely in the treatment of ‘deviant’ children on grounds of normalization:
Children’s own ways of being, their own choices and preferences, were totally ignored in the programs
designed for them. Most professionals did not believe that children operated from a valid perspective
or had ideas about things worth examining.
Goode (1994, p. 15)
Education methods for individuals with deaf-blindness have changed since the 1970s. Current
practice is more person-centred, acknowledging the importance of working with children on
their own terms, which are essentially bodily in nature. Aitken proposes:
Through … reciprocal actions, responding to what the child does without deciding beforehand what
the child is expected to do or to follow, the child is encouraged to learn that he can influence the other
person: interaction has begun.
Aitken (2000, p. 33)
Although these values were not found in institutional practices 20 years ago, they were intrinsic
to both Dance and Play. The role of adults was to affirm children’s initiatives and responses,
honouring their freedom to reject content offered and encouraging them to ‘be themselves’ in
each setting.
Table 18.2 Non-verbal engagement in Dance and Play: operational definitions of selected variables
Participation modes: expressions of degrees, qualities and skills of engagement with self,
tasks, and persons.
Light-oriented behaviour: observable attention to or movement towards light as a dominant feature
of engagement during an observation interval.
Movement mannerisms: idiosyncratic, repetitive gestural or postural movement behaviour as a
dominant feature of engagement during an observation interval.
Imperviousness: oblivious to persons and tasks; observable disengagement with the environment; the
child ignores environmental stimuli.
Resistance: observable increase in bodily tension in relation to or rejection of tasks and persons,
including aggressive behaviour.
Passive tolerance: detached cooperation with tasks and persons with a low level of exertion or
investment of self.
Receptiveness: child is observably open to environmental stimuli; a quality of relaxed awareness; there
may be a subtle quality of enjoyment or pleasure.
Focus: direct looking, listening or bodily orientation to environmental stimuli; non-manneristic behaviour
directed away from self.
Active involvement with tasks: a general sustained quality of self-initiated constructive and
appropriate engagement with relevant tasks.
Active involvement with persons: a general sustained quality of self-initiated social engagement with
one or more persons.
Sharing: discrete instances of social initiative; for example, giving, receiving, helping, displaying,
requesting, turn-taking or demonstrating; assumes active involvement with persons.
Imitation: child follows, mirrors, or reproduces behaviour initiated by another person.

Exploration modes: self-initiated engagement of an embodied exploratory nature; may or may
not be task-related.
Near space: intentional movement within reach space of the body; an observable emphasis on expressive
reaching, shaping, changes of level or body position in relation to the close environment.
Far space: intentional movement that extends beyond reach space; locomotor movement showing
awareness of spatial pathway, distance, or destination.
Rhythm: energized patterns of movement with intentional use of rhythmic components such as repetition,
phrasing, and accents. Manneristic movement may be included when used in a constructive fashion,
i.e. for more than self-stimulation.
Gesture: intentional movement of isolated body part(s) for functional and expressive purposes. Manneristic
gesture may be included when used in a constructive fashion, i.e. for more than self-stimulation.
Object exploration: self-initiated physical engagement with objects, including boundaries and structures
of the working space; for example, windows, walls, light switches, video camera, furniture.
Dance and Play were developed and led by the researcher and the teacher-director of the
deaf-blind education unit, respectively. A guiding value in both contexts was to provide the
least restrictive, most responsive environment possible. Treatment innovations included the rotation
of adult partners so that children were offered a variety of relationships, and the
physical–affective reflection of children’s movement (for example, mirroring and echoing), an
established method of dance therapy practice (Adler 1968; Duggan 1978; Schmais 1985).
18.6.1 Dance
My background in creative dance, modern dance, world dance forms, contact improvisation
(a form based on the sharing of body weight), Laban Movement Analysis and children’s music
education techniques was put to good use in dancing with deaf-blind children. A variety of
sound stimuli were employed, including recorded music, percussion instruments, vocalization
and body percussion (e.g., clapping, finger snapping and foot slapping). Some children were more
responsive to sound stimuli than others. Music seemed particularly important for adult participants,
all of whom appeared to perceive a close affinity between music and dancing, often singing
or humming during sessions. Audio speakers were placed on the floor so that children could pick
up sound vibration. Following is the replicable six-part format for a half-hour session of Dance:
1. Greeting circle (3 minutes). Sessions commence with a rhythmic name chant and passing of
a small mirror. Participants sit in a circle to focus attention and encourage social interaction.
Adults perform the chant: ‘Name, name, what’s your name? Marla . . . Tap, tap, tap’ (on the
floor with hands). Each child is invited to hold the mirror while his or her name is being
chanted and to then pass it to a peer.
2. Multisensory warm-up (5 minutes). Partners sit close together. Adults offer rhythmic tactile
experience such as shoulder rubbing and back tapping. Taking cues from children, a variety
of movement qualities may be explored (gentle to firm, lingering to abrupt, circular to linear),
and attention is paid to phrasing. Body-part articulations emphasize tension release.
Facing each other, partners hold hands and adults model an expansive, audible breath
rhythm. This ‘breathing together’ extends to arm and hand gestures. ‘Hand dances’ are a feature
of the programme, often beginning with adult mirroring of children’s ‘signature gestures’
and moving into a reciprocal conversational quality.
3. Whole body in space (4 minutes). Standing up, adults model whole body movements suggested
by the leader (e.g., bouncing, tilting, rocking, swaying, swinging and balancing).
Developments into locomotion and elevation range from walking, marching and running to
tiptoeing, hopping, jumping, and adult-assisted ‘flying’. Contrasting spatial forms (pairs and
groups, shapes, pathways, levels, dimensions and planes) and dynamic qualities (force, speed,
and energy flow) are offered to extend participants’ expressive range.
4. Child leadership time (9 minutes). Children are free to follow their own impulses and
designs, mostly unmediated by external standards of behaviour. Children’s leadership is
affirmed through empathic reflection of their actions, postures, and vocalizations, including
idiosyncratic mannerisms. This evolves into elaborated forms of support for children’s dance
development (e.g., echoing, exaggerating, contrasting, and being audience to a child’s
performance).
5. Dancing with gravity (6 minutes). Themes focus on body weight (sensing, centring, shifting)
and acceleration in crawling, rolling, balancing, falling, swinging, turning, and push-pull
(counter-tension) activities. Support dances develop through giving and taking of weight in
pairs and groups.
6. Farewell circle (3 minutes). Dance sessions begin and end in a circle. Here the group joins
hands for rhythmic stepping. The aim is to facilitate social interaction between children
through shared rhythm and spatial structure. At the end the circle condenses inward for a
final good-bye that encourages eye contact.
18.6.2 Play
Play was based on an existing plan to create a ‘kindergarten’ environment that would be less
structured than what children were used to in their regular programme. Teachers were interested
in seeing whether children would explore familiar materials without adult direction. A selection
of play media was provided for each session. Like Dance, Play sessions begin and end with a
group experience designed to encourage sociability. Extension of activities is encouraged through
imitation and modelling, and adults stay close enough for immediate contact should the child
indicate an interest in interaction.
18.7 Experimental results

Quantitative analysis was limited to the use of descriptive statistics to show comparisons within
and between individuals, groups, and programmes over time. Individual results for each session
were obtained by averaging the scores of the two external observers. Overall individual means
were derived from the results of each session attended. Group scores were calculated for each
session by averaging individual results. Overall group means were then calculated from group
scores for the six sessions observed in each experimental setting.
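The averaging steps above can be sketched as follows. The sketch assumes that absences are simply skipped when averaging, and it uses invented scores for two hypothetical children over three sessions (the study observed six per setting). The data layout and function names are illustrative assumptions, not the study's actual procedure.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple

ObserverPair = Tuple[float, float]
# ratings[child] holds one entry per observed session; None marks an absence.
Ratings = Dict[str, List[Optional[ObserverPair]]]
Results = Dict[str, List[Optional[float]]]


def individual_session_results(ratings: Ratings) -> Results:
    """Average the two external observers' scores for each child and session."""
    return {
        child: [None if pair is None else mean(pair) for pair in sessions]
        for child, sessions in ratings.items()
    }


def overall_individual_means(results: Results) -> Dict[str, float]:
    """Mean over the sessions each child actually attended."""
    return {
        child: mean(r for r in sessions if r is not None)
        for child, sessions in results.items()
    }


def group_session_scores(results: Results) -> List[float]:
    """For each session, average the results of the children who were present."""
    n_sessions = len(next(iter(results.values())))
    return [
        mean(s[i] for s in results.values() if s[i] is not None)
        for i in range(n_sessions)
    ]


# Invented scores for illustration only (not data from the study).
ratings: Ratings = {
    "child_a": [(10, 12), (8, 8), None],
    "child_b": [(6, 8), (10, 14), (12, 12)],
}

results = individual_session_results(ratings)
individual = overall_individual_means(results)
group = group_session_scores(results)
overall = mean(group)  # overall group mean across observed sessions
```

Keeping the session-level group scores as a separate step mirrors the order given in the text: the overall group mean is taken over sessions, not over all individual results pooled together.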
18.7.1 Social and task engagement

Figure 18.1 presents summaries of the two global engagement variables: active involvement with
persons and active involvement with tasks (see Table 18.2). The figure illustrates children’s overall
self-initiated social and task engagement in Dance, constituting an important landmark on
the journey towards illumination of aesthetic community. Individual mean scores are plotted
for each observed session (Observations 1–6) in Dance and Play (D1–D6; P1–P6), the total
number of observations of social and task engagement never exceeding 15 in any session. Vertical
plots allow comparison of each child’s response to treatments over the available observations
(D1/P1, etc.). Group configurations clearly illustrate the consistency of high social and task
engagement in Dance.
General Play configurations for social and task engagement show less consistency than those
in Dance. There are more within-group and within-child differences in Play, so it is difficult
to describe a coherent pattern of group engagement. This condition may be more ‘normal’ in
the study population. Specialists report that severely affected deaf-blind individuals are noted
for significant day-to-day changes in behaviour (Stillman and Battle 1986; Jones 2002).
However, the striking consistency of high engagement observed in Dance indicates that
non-verbal deaf-blind children have the potential for sustained involvement in constructive
activity.
18.7.2 Emergent categories of engagement

The ranking of group mean outcomes revealed general patterns in the data, which resulted
in an inductive regrouping of the 16 engagement criteria into 4 categories: high engagement,
engagement moderators, engagement skills, and personal style. In Dance, the high engagement
variables of focus, receptiveness, active involvement with tasks, active involvement with persons
and near space all clustered consistently, appearing on average in more than 85 per cent of
observations. Considering the five variables as a cluster, Dance appeared to provide an integrative
mode of engagement for the children, one that involved self, space (including social space)
and task.
The engagement moderators, imperviousness, resistance and passive tolerance, appeared consistently
low in Dance, recorded in less than 20 per cent of observations in all children. Remaining
variables were dispersed between high and low range. Some appeared to be skill-based (sharing,
imitation and far space), while others had personal expressive significance (rhythm, gesture,
movement mannerisms and light-oriented behaviour).
[Fig. 18.1 (a) Social engagement in Dance and Play (panel ‘With persons’); (b) Task engagement in
Dance and Play (panel ‘With tasks’). In each panel the horizontal axis lists children D1–D3 and P1–P3
(Group one) and P4–P6 and D4–D6 (Group two), with separate series for OBS 1 to OBS 6 and a vertical
scale from 0 to 1.5. D–Dance; P–Play; OBS–observation.]
At this point, quantitative results began to dovetail with impressions gained through direct
observation. In both Dance and Play, it seemed clear that observable individual preferences
were influential in mediating children’s engagement. The greatest individual differences,
independent of programmes, were in the above criteria that I came to call ‘personal style’. There
is material on personal style in the fields of dance, religion and psychology. Ritual scholar
Grimes (1995) notes that ‘style is inferred from observed action’ (p. 88). Lyons (1987) suggests
‘even simple actions may express one’s behavioral style’ (p. 209), arguing further for a somatically
anchored psychology:
If there is one scandalous lapse in contemporary psychological theory, it may be the omission of any
reference to the lived, living, working, behaving body … the whole functioning body as it is lived by
each individual. Minds are aspects of bodies, not more and no less.
Lyons (1987, pp. 4–5)
Bohm and Peat (1987) distinguish between an ‘outside order of development’ involving ‘evolution
in a sequence of successions’ and a ‘more inward order out of which the manifest form of things
can emerge creatively’ (p. 151). In the study reported here, an experiential process unfolded that
encompassed children’s and adults’ interacting with innovative treatments. An outside order
was present in the use of replicable session formats and systematic observation. The chapter
turns now to the ‘inward order’ reflected in a group process that took place within the replicable
framework of Dance.
To achieve comprehensive understanding, qualitative data were gathered through participant
observation and systematically coded (Miles and Huberman 1984; Lincoln and Guba 1985).
Data sources include my audiotaped running commentary during sessions, video recordings of
children in Dance as described above, a written field journal, written impressions recorded
concurrently with time-sampling, formal interviews and informal conversations with school
staff, and school documents. Qualitative analysis of multiple sources provided an elaborated
perspective on child engagement, adult engagement, and the Dance context.
Elaboration of the personal style category (rhythm, gesture, movement mannerisms and
light-oriented behaviour) was particularly important to the interpretive process. Through the
line-by-line coding of written field records backed up by video data, the category was expanded
to include bodily characteristics (posture, gait, rhythm, gesture, voice, personal mannerisms);
sensory/perceptual preferences; affective style (a dimension of bodily openness/resistance); and
interests and aptitudes. Data were coded also for Laban’s effort (flow, space, weight, time), shape,
and space criteria (Dell 1977; Bartenieff with Lewis 1980).
18.9 Three wayfarers on the road to aesthetic community

Non-verbal children with deaf-blindness have less access than most humans to an external dance
culture. There was no point, therefore, in offering the six children options to learn a codified
dance form. Instead, as described above, we attempted to meet them in their own embodied
realities. Nevertheless, for Madeleine, the ‘right dance’ (the dance that would match her unique
personal style, after Sparshott 1988) was never fully crystallized. Preoccupation with controlling
and manipulating the environment dominated her engagement. Madeleine displayed
well-established, ritualistic patterns of social behaviour that were often in conflict with the
developing social rituals of Dance.
Marc and Madeleine were more alike in their well-armoured body attitudes than either of
them was to Damien with his articulated and flowing quality of movement. However, Marc’s
affinities for rhythm and light were major stylistic differences between him and Madeleine. Marc
was more independent and self-contained than Madeleine and showed less impulsivity. Noted
for his resistance to most educational interventions, in Dance, Marc appeared to liberate his
potential for release, concentration, sociability and joy. Several school staff expressed surprise at
his transformation, but a teacher commented, ‘I knew it was there; I knew he had more going
for him than people think.’ As for Madeleine, emotional volatility, coupled with physical and
cognitive rigidity, appeared to compromise her full engagement in Dance.
Both Marc and Damien showed transformation in Dance, but in very different ways. Damien
became the classicist, a creator of forms in time and space (Bond 1994a). Marc became the
ecstatic: less analytic, more buoyantly musical. These were surprises, since Damien’s favourite
movement ritual of running on the spot while shaking his arms, hands and head was energy-based,
while Marc’s ritualized hand movements took place in a posture that was close to the
ground, full of sharp angles, an immovable tripod. These enduring representations suggested
recuperative and protective functions, a kind of bodily intelligence at work. Potentials for
transformation were also implicated in these statements of self. The dynamic excitement expressed in
Damien’s signature ritual also revealed a certain cool containment in its asocial impersonality.
Marc’s side-lying triangular pose was strong, like concrete, but there was evidence of passion and
warmth in the vibratory hand-to-eye gesture that enlivened it.
Compared with Madeleine’s driven, often lurching abruptness and propensity to collapse in a
heap without warning, Damien and Marc exhibited multisensory awareness and pleasure in
moving. Overall, Damien portrayed a logical approach to tasks, for example in his cumulative
creation of repeatable sequences of movement and careful guidance of partners in different
spatial directions (Bond 1994a). He had an eye for kinetic design. His profound hearing impairment
may have facilitated this style. He appeared to be relatively unaware of auditory stimuli,
as compared with other children.
It is difficult to discern the source of Damien’s form-making ability, except on the grounds of a
neurobiological predisposition to dance and aesthetic experience (Kealiinohomoku 1976; Cobb
1977; Hanna 1979; Maquet 1986; Eibl-Eibesfeldt 1988; Siegfried 1988; Dissanayake 1988, 1992,
2000; Hagendoorn 2003; McKechnie and Grove 2005); this is what Dissanayake (1992) describes
as our human capacity for ‘making special’. As for Marc, he grew a smile that would not quit, and
transferred his love of jumping (he would often sneak away to the outdoor trampoline) to Dance,
reminding me of Annie Dillard’s (1974) description of ecstatic experiences in the Virginia woods:
‘in and out of Shadow Creek, upstream and down, exultant, in a daze, dancing, to the twin silver
trumpets of praise’ (p. 271).
Marc and Damien surrendered to group process in Dance, whereas Madeleine was more
resistant. Over time, however, even she showed signs of surrender, in spite of her dogmatic social
policies. Peer social awareness and contact increased, as did auditory focus. Independent
observers noted trust in partner work during the final session. Madeleine initiated an uncharac-
teristically slow and elegant dance during the last week, reminiscent of a courtly pavanne, provid-
ing a glimpse of her potential for self-transformation through Dance. Engagement in movement
for its own sake, imitation, elaboration, and a sense of performance, or self-presentation, were
observed in all children.
18.10 The role of aisthesis

Writing the case profiles further highlighted the importance of personal style as a mediator of
engagement. The empirically grounded construct of personal style seemed connected to aesthetic
perception. In a search for relevant literature I discovered the concept of aisthesis in Maletic’s
(1982) study of the foundations of style in choreography. She notes that aisthesis (from the Greek
‘to perceive’) involves immediacy of bodily receptivity without comparison to cultural traditions,
whereas in aesthetic perception, cultural values are at play. Further, where the aisthetic and the
aesthetic are in synthesis, self-environment duality may diminish, a phenomenon observed in
Dance that informed my conception of aesthetic community. In the field of contemporary
pragmatism, the philosophical method of somaesthetics is concerned with enhancing aisthesis in
order to improve critical self-awareness (Shusterman 2000, 2005). Shusterman (2005) suggests
that a clear somaesthetic appreciation of our embodied biases, our habitual ways of feeling,
acting and thinking, may be the ‘only reasonable starting points’ for ‘projects of intelligent
reform’ (p. 71).
Since individuals with both distance senses impaired have limited access to culture in a macro
sense (let alone a dance culture), it seems likely that for these individuals aisthesis remains
the primary basis for creating personal and social meaning throughout life. Standards of taste,
personal satisfaction and form may have more of a somatic than a cultural basis for non-verbal
people with deaf-blindness. Indeed, such individuals may be exemplars of somaesthetic
self-knowledge in that aesthetic standards must be expressed through the body. To the extent
that ‘projects of reform’ (Shusterman 2005) are imposed from the outside, such an aesthetic
may look like expression of anticulture. If sensory deprivation is the source of stereotypical
behaviours in people with sensory impairments (Berrol 1981; van Dijk 2001), this may be a
direct comment on the inadequacy of existing culture to support bioaesthetic preferences. In the
present study, children’s high level of social and task engagement, implying cognitive investment
in Dance, appeared to be associated with Dance’s aesthetic accommodation of each child’s ais-
thetic style.
18.11 Further contextual influences on participants’ engagement in dance
18.11.1 The power of ritual
As described above, our work with children in Dance occurred within a structured, replicable
format. Through reflection, repetition and elaboration, we sought to affirm and extend content
that appeared to be intrinsically motivating. Early in the programme, I began to use the words
‘ritual’ and ‘ritualistic’ in field notes. Grimes (1996) notes that rituals generally deal with recur-
rent situations and repeated observations are needed to understand them (see Merker, Chapter 4,
and Dissanayake, Chapter 24, this volume, for extended discussions on the nature of ritual). In
Dance, a ritual process was elucidated through multiple observations, drawing on perspectives of
distance (quantitative measurement) and closeness (qualitative interpretation).
Hanna (1979) suggests that the repetition phenomenon in dance is arresting and that due to its
multisensory nature, dance may be contagious. Multisensory embodiment is at the core of ritual.
Grimes (1995, p. 60) writes that ritual action is ‘thick with sensory meaning’, the body being ‘the
central fact of ritual in all circumstances’. His dynamic conception of ‘ritualizing’ is pertinent to
what transpired in Dance, including a ‘here-we-go-again’ quality that stimulates participant
anticipation. Grimes notes that when ‘meaning, communication or performance become more
important than function and pragmatic end, ritualization has begun to occur’ (p. 36). The evolu-
tion of a ritual process in Dance may have been supported by repetitive research and treatment
structures, and the ritualistic nature of children’s personal stylistic expressions.
Ritual is a polymorphous phenomenon that has wide-ranging contemporary applications in
religion, therapy, education, theatre and politics (Schechner 1993; Grimes 1995, 1996, 2000;
Dissanayake 2000; Doty 2000; McCauley and Lawson 2002; Schilbrack 2004; Franko 2007). Doty
(2000) suggests that human society is essentially ritualistic. Rituals may convey or strengthen
individual identity or status, establish new social rankings, promote group cohesion, unlock
communal energies that might otherwise remain latent, and allow the expression of feelings
where verbal discourse needs augmentation or is non-existent, as with non-verbal populations.
Rituals provide means of self-organization, promoting the integration of biology and culture. We
observed all of these phenomena in Dance.
Dissanayake (2000) observes that cultural rituals perform the same functions for participants
that mothers do intuitively for infants: ‘engage their interest, involve them in a shared rhythmic
pulse, and thereby instil feelings of closeness and communion’ (p. 64). Her analysis of the ritual-
ized, rhythmic bonding language of mother–infant communication, which she suggests is the
basis of all human arts, is salient to this study of non-verbal children in Dance.
18.11.2 Ritualizing voice

Similar to mothers’ unselfconscious vocalizing with infants, spontaneous adult vocal
expression was common in Dance, particularly in greetings and farewells. In greetings, ritualistic
vocalizing was part of the design in the form of a repetitive name chant. In the farewell circle
dance, vocalization evolved over time. An adult noted, ‘This is becoming a ritual—we are
unselfconscious.’ Forms of vocalization included spontaneous singing, ostinato chanting,
laughter, verbal affirmations and the reflection of children’s vocal and movement behaviour.
The audio record contained adult singing in every session. Adults sang along with children’s
manneristic vocalizations and accompanied many activities with voice. From the field journal:
‘Adults vocalize breaths… they are synchronizing this now as a group.’ During ‘breathing
together’ (in the multisensory warm-up), an audible whole-group breath rhythm was sometimes
evident. Both children and adults were attentive to these moments of intimacy.
18.11.3 Dancing with gravity

Quieting of children’s manneristic vocalizations was observed during weight theme segments,
which focused on exploration of body weight through tilting, rocking, swinging, balancing,
rolling, and counter-tension activities. Children showed more independent task-engagement in
these segments than in others, as evidenced in perseverance after adult withdrawal, non-verbal
requests for activities, risk-taking, and self-presentation. Weight-based activities were calming for
Madeleine and Marc. As noted in the field journal, ‘These kids have a lot of time for rocking.’
18.12 From ritualizing to community

Discussing the origins of ritual, Moore and Yamamoto (1988) suggest that ‘movements repeated
often enough, in the same location and context… began to acquire communal meaning and to
pass from the world of private experience into the public world of shared meanings’ (p. 105).
This is an apt description for what was observed in Dance. Ritual process seemed integral to the
development of the group social phenomenon I am calling aesthetic community, a component of
which was the emergence of peer social interaction.
18.12.1 Peer social engagement

We observed the development of peer interaction in both groups. Children exhibited sensory
awareness of each other through touching, looking, sniffing, listening and movement synchrony.
Sometimes tactile initiatives appeared to be visually motivated, with individuals drawn to colours
or patterns. Children also interacted by leaning into each other, tracking each other’s movements,
actively focusing on peers’ ‘performance’ (including smiling and clapping), and listening to peer
vocalizations. Children were aware of other pairs and would sometimes initiate others’ move-
ment with their own partner. Evidence of group synchrony included clapping together, unison
dancing, and incidents of group rhythm, falling, and crawling.
Positive affection demonstrated between peers in Dance included patting, stroking, kissing,
hugging, smiling and laughing. Aggressive behaviours included striking, biting, scratching, grab-
bing and pinching; however, a pattern of diminishing peer aggression was observed. For example,
Marla’s propensity for pinching and clawing disappeared in the last week of Dance. This coin-
cided with an episode of synchrony in which she and Madeleine fell simultaneously to the floor
during the farewell circle (Madeleine’s typical response). This duet fall appeared to be initiated by
Marla. In the next session, there was a marked improvement in Marla’s attitude towards
Madeleine. The field journal reported ‘Social interaction with Madeleine; smiling at everything,
making physical contact, quite affectionate… still attacking with hands, but much lighter, not so
aggressive.’
The next day, Marla took Madeleine’s hand twice in Dance, the second time with a smile.
Incidents of peer awareness, synchrony, contact, and task engagement were recorded for each day
of Week 5 in Dance, and on the final day both children continued to dance after the farewell
circle was finished. Enos (1995) notes that peer relationships may be highly elusive for people
with deaf-blindness, as even the establishment of mutual attraction (a baseline of friendship)
is difficult to achieve. It appeared that in Dance two young girls with severe communication
disorders ‘fell for each other’ within the safety of a circle dance.
18.12.2 Adult engagement

Adult engagement also grew over the weeks. An initial competitive element gave way to unbiased
commitment to both Dance and Play. Early field notes recorded teachers’ concerns about
programmes being ‘pitted against each other’ and discomfort mirroring children’s mannerisms:
‘T is uncomfortable with reflective approach, telling me “It’s very hard; goes against everything
I’ve been taught” ’. By Week 4 of Dance 1, positive adult engagement was being exhibited
consistently in terms of humour, kinaesthetic empathy, vocal participation, and acknowledgement
of children and the programme. Adults danced together spontaneously from time to time,
leaving children to their own pursuits. The following are spontaneous teacher comments
recorded during sessions in Dance 1:
I wish Dance could go on forever.
I love to be able to turn off during the Dance session and not have to evaluate kids.
L: Time is already up—so quickly.
H: Always when you’re having fun.
It really makes it worth it just seeing those faces.
18.12.3 Dance leadership

Group leadership in Dance was interactive and developmental. One constant was the use of
continuous descriptive feedback to propel content and encourage kinaesthetic empathy. At a staff
seminar, a non-participant asked whether the infusion of unorthodox methods might have been
the reason for the success of Dance (novelty effect). Turner’s (1977) study of ritual structure in
traditional cultures illuminates the role of the outsider in stimulating change. In my ignorance of
institutional norms, I may have embodied the ‘ideological idiot’ (Grimes 1995), a figure who
mediates the dichotomy of traditional versus creative. In terms of ritual process, as leader, I held
power as custodian of a special object, a mirror, releasing it at the same time each meeting.
As already noted, modifications to the normal social structure were incorporated systemati-
cally in both programmes, including status reversal (child leadership) and variety of child–adult
relationships; however, the phenomenon of aesthetic community was not observed in Play. Even
though the use of unorthodox methods does not fully explain the success of Dance or the
observed differences between Dance and Play, these social innovations are considered salient to
the emergence of aesthetic community.
18.13 Features of aesthetic community: a realm of shared embodiment
The construct of aesthetic community was suggested initially through the synthesis of
experimental and qualitative findings. In Dance, an evolving interaction took place between
participants, sensory–physical content, pedagogical innovations, and the formal structure that
framed each session. Features of aesthetic community described at the beginning of the chapter
include the emergence of shared aesthetic values, an ethos of spontaneous celebration, and a
social work ethic. These and other aspects of aesthetic community are discussed next.
Siegfried (1988) studied children’s free play for insights into the origins of dance, noting, ‘the
system to which the dancers and their movements relate must be created by the dancers them-
selves and is the result of a group process’ (p. 118). In Dance, group process was child-driven, but
also involved a merging of child and adult expressive preferences. In Dance, shared aesthetic
values were seen in a high degree of movement synchrony and the emergence of a collective style
of movement. Table 18.3 presents the elements of group style that crystallized in Dance.
A quality of multisensory engagement was present in Dance. Rhythmic repetition and integra-
tion of movement, sound and touch evoked sustained attention. As suggested earlier, this general
quality of multisensory receptivity suggests that aisthesis, the direct perception of immediate
bodily experience, was a predominant mode of engagement for participants (Maletic 1982;
Shusterman 2005).
All children displayed distinctive stylistic characteristics of posture, gesture, gait, rhythm, voice
and spatial patterns, and the movement mannerisms composed of these. Idiosyncrasies of motor
control were present in all six children. In Dance, the concept of bodily idiosyncrasy was normal-
ized as intrinsic to the art form, becoming a component of group style. Recent writings in deaf-
blind education affirm the communicative potential of non-symbolic behaviour (Goode 1994;
Aitken et al. 2000; Jones 2002). In Dance, it seemed clear that children employed personal man-
nerisms to express aisthetic preferences. All children showed an affinity for light and, except for
Madeleine, seemed delighted when adults acknowledged this special interest. All children chose
to integrate personal mannerisms with the mode of Dance, opening up the self-oriented focus of
aisthesis to the possibility of aesthetic community.
Sensitivity of fingers, hands, arms and faces was a group phenomenon. Children employed
gesture in a signatory fashion, to express affect (openness/resistance; pleasure/displeasure), and
for exploratory and communicative purposes. Children’s distinctive, well-practised rhythmic
mannerisms, including vocalizations, were relatively easy to perceive and to mirror. Rhythm and
shared vocalizing became bonding elements in Dance, creating a bridge to communal festivity. All
children exhibited a preference for weight-based themes and moving in the vertical dimension
and plane. The affirmation of children’s preferences for verticality and weight-sensing in Dance
seemed to release their potential for interactive self-presentation. Group style was thus performative.
Table 18.3 Characteristics of group style in dance
Multisensory
Aisthetic perception dominant
Special movement as norm
Pervasive rhythmicity
Expressive fingers, hands, arms and faces
Light orientation
Ritualistic
Non-verbal, vocal
Tensile strength
Impulsive
Playful
Heightened affect: openness and resistance
Flow and weight primary
Vertical forms primary
Space/horizontal forms secondary
Circular time
Performative
Interactive
Group synchrony
Child-centered, with merging of child and adult stylistic preferences

Children appeared to live in a highly personal realm of time, showing little concern for pacing
themselves in relation to environmental structures. Overall, a quality of timelessness evolved in
Dance (a challenge for the leader who had to stay in the flow of a creative process while keeping
to a tight session schedule). Adults expressed relief in being able to ‘forget about time’ and were
often surprised that a session was over. Stamatelos (1984) relates the aesthetic experience to a loss
of temporal sense. Through deep involvement, a timeless quality may emerge.
Unison dancing leads the self out of its solitary boundaries. In it, I am supported by something larger
than self. I experience my body expanding, as warm to the world and to others… In dancing with
others, the self visibly multiplies. The world expands.
(p. 195)
18.13.1 Work ethic

A work ethic was present in Dance, one that did not dichotomize playfulness as anti-work
(Turner 1982; Grimes 1995). For adults, the child-centred methods of Dance were hard work. All
reported physical discomforts and one took up aerobic dance to enhance her fitness. The Dance
space was often too hot or too cold and always too small. Adults had to cope with the pressure of
being videotaped on a regular basis. In spite of these challenges, a good-humoured commitment
to meet the children on their own terms prevailed. Such commitment required physical and
emotional immediacy, as well as the willingness to be challenged, surprised and fatigued.
Children worked hard as well, as observed in extended focus, commitment to content,
and increasing engagement in social interaction with both peers and adults. Some children’s
commitment extended to the time between sessions, as exemplified by Donald, who would often pull an
adult participant into the Dance space for some rocking or slow turning.
18.13.2 Physical environment

The small, well-lighted Dance space may have promoted the refinement of repetitive and
intensive forms. The pervasiveness of children’s high near space engagement in Dance supports
this interpretation. Close observation showed that proximity, low-level movement and close-
range eye-level content (e.g., hand dances) promoted interaction. Children’s attraction to light
was affirmed as appropriate to the Dance context.
18.13.3 Social innovations

Social innovations included an ethos of equality, outsider as leader, and the systematic reversal of
conventional status roles during child leadership segments. These strategies seemed to liberate both
children and adults into a liminal, or threshold, space of transformative engagement (Turner 1982).
Regarding ritual process, rituals often have instructional functions (Grimes 1995; Doty 2000), and
it became apparent that children (once they became used to the idea) used child leadership time
and other moments during the sessions to teach their aisthetic preferences to their adult partners.
18.13.4 Greetings and farewells

Sparshott (1988) suggests that times of joining and leaving dance always require special
attention. The Dance greeting, based on a rhythmic name chant and the passing of a small mirror
in a circle, was the most ritualistic aspect of Dance in the traditional sense; it was repeated fairly
exactly each session. The farewell circle dance fits Doty’s (2000) concept of ritual as an ‘infinitely
varying chain of action’ (p. 50). While the circle form remained constant, content evolved
through a group process (steps, rhythms, relationships and affective style). Even though Dance
was child-centred, the farewell circle in particular evolved through negotiation between adults
and children. Shared aesthetic values emerged, as seen in the high degree of rhythmic synchrony
and heightened affect. In the linked circle, movement flows without beginning and without end;
this may have facilitated the timeless quality of aesthetic community.
18.14 Communitas
During post-data analysis of the literature, I discovered that aesthetic community resembles
Turner’s (1982) communitas, a pattern of social interaction where human capacities are liberated
from the ‘encumbrances of role, status, or reputation’ (p. 44). Communitas accommodates
individual differences and participants place a high value on openness and personal authenticity.
In the space of communitas, individuals experience unmediated absorption in a freely chosen
event, and there is often a spontaneous, playful or celebratory quality of interaction. Further,
communitas emerges through a ritual process. Turner (1982) posited that the formality and
repetition of ritual process allow a freedom in which creative impulses can find expression.
Turner’s theories continue to have wide application across many disciplines, from theatre to
wilderness studies (St John 2006; Sharpe 2005).
18.14.1 A path to aesthetic community

Figure 18.2 illustrates a generative process of engagement in Dance from the core of personal
aisthetic perception through to aesthetic community. Dance was child-centred, and thus influenced
by children’s aesthetic perception as reflected in personal style. In its accommodation of personal
style, Dance facilitated integration of child and environment, evidenced in high levels of social
and task engagement. In this dynamic context, self-transformation was observed in all
children. Finally, an aesthetic community evolved, a place where the boundaries of personal style
seemed to soften, allowing an authentic blending of child and adult preferences.
18.15 Reflections
Some individuals and groups may be limited in verbal potential. Given effective outlets, however,
their affective, bodily, aesthetic and interpersonal capabilities may surprise us. Even with the
current attention to bodily-kinaesthetic experience as a valid mode of knowledge construction
(Gardner 1983, 1999), I continue to wonder if an entrenched tendency to valorize verbal intelli-
gence makes us miss something in children such as those presented here. Perhaps in many
people… perhaps in ourselves.
Fig. 18.2 A path for aesthetic community in dance. [Figure labels, in the order given in the text: aisthetic perception, personal style, social and task engagement, self-transformation, aesthetic community.]
As noted by Goode (1994), working with the non-verbal deaf-blind population challenges
entrenched biases about formal symbolic language and its role in human life. He asserts,
‘The acultural child provides us with an opportunity to appreciate our acultural self, our own
aculturality, which is something that every human possesses’ (p. 190). Such children remind me
that I build my everyday social world through embodiment. Since this study, I have gone on to
explore qualities and meanings of aesthetic community in a range of dance environments (Bond
1994b; Bond and Deans 1997; Bond 2001; Bond and Etwaroo 2005; Bond and Richard 2005) and
the phenomenon has been illuminated in a Taiwanese dance education setting (Wu 2005).
I continue to align with theorists who seek to ‘return thought to the body’ (Eagleton 1990,
p. 43). For me, as for Eagleton, this is necessarily an aesthetic project; and my exemplars of best
practice to date are six young non-verbal children with deaf-blindness. Crowther (1993) suggests
that the whole aesthetic realm is driven by the human need for transformation, which he views as
a biological given. Sheets-Johnstone (1994) concurs that the drive for transformative knowledge
is highly personal, asserting: ‘Our bodies are, in fact… the primal form on which we model our
thinking’ (p. 328). Maffesoli (1996) describes an ethic of the aesthetic, grounded in empathy,
communitarian desire, and shared emotion, the ‘melody of social rhythm’ (p. ix). This social
aesthetic draws strength from daily living and is preoccupied with quality of life. It is not driven
by grand ideological causes, which the deaf and blind have no access to and many sighted
and hearing people have lost interest in, feeling ‘defrauded by an essentially rationalist
modernity’ (p. 20).
If all human beings are predisposed to aesthetic experience (I can see no reason to believe
otherwise), it is understandable that children with deaf-blindness would resonate with dance.
Such individuals have limited access to other arts whose practice is distanced from the body.
It was a privilege to work so closely with six non-verbal children for whom movement appeared
to be its own reward. For me, these young people illuminated a human potential for living life as
artful embodiment.
References
Adler J (1968). The study of an autistic child. In BK Weiss, ed., Combined Proceedings of the Third and
Fourth Annual Conference of the American Dance Therapy Association, pp. 43–48. American Dance
Therapy Association, Washington, DC.
Aitken S (2000). Understanding deafblindness. In S Aitken, M Buultjens, C Clark, JT Eyre and L Pease, eds,
Teaching children who are deafblind: Contact, communication and learning, pp. 1–34. David Fulton
Publishers, London.
Aitken S, Buultjens M, Clark C, Eyre JT and Pease L (eds) (2000). Teaching children who are deafblind:
Contact, communication and learning. David Fulton Publishers, London.
Amighi Kestenberg J, Loman S, Lewis P and Sossin M (1999). The meaning of movement: Developmental
and clinical perspectives of the Kestenberg movement profile. Gordon and Breach Publishers,
New York.
Bartenieff I with Lewis D (1980). Body movement: Coping with the environment. Gordon and Breach
Publishers, New York.
Bernstein PL (1981). Theory and methods in dance-movement therapy, 3rd edn. Kendall/Hunt Publishing
Company, Dubuque, IA.
Berrol CF (1981). A neurophysiological approach to dance/movement therapy: Theory and practice.
The American Journal of Dance Therapy, 4(1), 72–84.
Bohm D and Peat FD (1987). Science, order and creativity. Bantam Books, New York.
Bond KE (1991). Dance for nonverbal children with dual sensory impairments. Ph.D. Thesis, La Trobe
University, Bundoora, Australia.
Bond KE (1994a). Personal style as a mediator of engagement in dance: Watching terpsichore rise. Dance
Research Journal, 26(1), 15–26.
Bond KE (1994b). How ‘wild things’ tamed gender distinctions. Journal of Physical Education, Recreation
and Dance, 65(2), 32–38.
Bond KE (2001). ‘I’m not an eagle, I’m a chicken!’ Young children’s experiences of creative dance.
Early Childhood Connections, 7(1), 41–51.
Bond KE (2008). Honoring Hanny Kolm Exiner (1918–2006): Dancer, philosopher, and visionary educator.
In T Hagood, ed., Legacy and dance education: An anthology of essays and interviews on values, practices
and people. Cambria Press, New York.
Bond KE and Deans J (1997). Eagles, reptiles, and beyond: A co-creative journey in dance. Childhood
Education, 73(6), 366–371.
Bond KE and Etwaroo I (2005). ‘If I really see you…’ Experiences of identity and difference in a higher
education setting. In V Marcow-Speiser and MC Powell, eds, Crossing boundaries: The arts, education
and social action, pp. 87–99. Peter Lang Publishers, Cambridge.
Bond KE and Richard B (2005). ‘Ladies and gentlemen: What do you see? What do you feel?’ A story of
connected curriculum in a third grade dance education setting. In L Overby and B Lepczyk, eds, Dance:
Current selected research, 6, pp. 85–133. AMS Press, New York.
Campbell D and Stanley J (1963). Experimental and quasi-experimental designs for research on teaching.
In N Gage, ed., Handbook of research on teaching, pp. 171–246. Rand McNally, Chicago.
Canner N (1980). Movement therapy with multi-handicapped children. In M Leventhal, ed., Movement and
growth: Dance therapy for the special child, pp. 53–56. New York University Center for Educational
Research, New York.
Capra F (1982). The turning point: Science, society and the rising culture. Simon and Schuster, New York.
Cobb E (1977). The ecology of imagination in childhood. Columbia University Press, New York.
Creswell J (2003). Research design: Qualitative, quantitative, and mixed method approaches. Sage, Thousand
Oaks, CA.
Crowther P (1993). Art and embodiment: From aesthetics to self-consciousness. Oxford University Press,
Oxford.
Davies E (2001). Beyond dance: Laban’s legacy of movement analysis. Brechin Books, London.
Davis M (1983). An introduction to the Davis Nonverbal Communication Analysis System (DaNCAS).
American Journal of Dance Therapy, 6, 49–73.
Delaney W (1977). Dance therapy: Selected materials for professional preparation. University Microfilms
International, AAT 1310870, Ann Arbor. (ProQuest document number 761753591)
Delaney W (1980). The use of dance and music in therapy. Unpublished conference paper, Third National
Symposium of the Australian Musicological Society. Perth, Western Australia.
Dell C (1977). A primer for movement description using effort-shape and supplementary concepts. Dance
Notation Bureau, New York.
Dijksterhuis A (2005). Why we are social animals: the high road to imitation as social glue. In S Hurley and
N Chater, eds, Perspectives on imitation: From cognitive neuroscience to social science, 2, pp. 207–220. MIT
Dillard A (1974). Pilgrim at Tinker Creek. Harper and Row Publishers, New York.
Dissanayake E (1988). What is art for? University of Washington Press, Seattle, WA.
Dissanayake E (1992). Homo aestheticus: Where art comes from and why. University of Washington Press,
Seattle, WA.
Dissanayake E (2000). Art and intimacy: How the arts began. University of Washington Press,
Seattle, WA.
Doty W (2000). Mythography: The study of myths and rituals, 2nd edn. University of Alabama Press,
Tuscaloosa, AL.
Duggan D (1978). Goals and methods in dance therapy with severely multiply-handicapped children.
American Journal of Dance Therapy, 2(1), 31–34.
Eagleton T (1990). The significance of theory. Blackwell, Cambridge.
Eibl-Eibesfeldt I (1973). The expressive behaviour of the deaf-and-blind-born. In M von Cranach and
I Vine, eds, Social communication and movement, pp. 163–194. Academic Press, London.
Eibl-Eibesfeldt I (1988). The biological foundations of aesthetics. In I Rentschler, B Herzberger and
D Epstein, eds, Beauty and the brain: Biological aspects of aesthetics, pp. 29–70. Birkhauser Verlag, Basel.
Eibl-Eibesfeldt I (1989). Human ethology, translated by P Wiessner-Larsen and A Heunemann. Aldine de
Gruyter, New York.
Eisner E (1981). On the differences between scientific and artistic approaches to qualitative research.
Educational Researcher, 10, 5–9.
El Guindy H and Schmais C (1994). The Zar: An ancient dance of healing. American Journal of Dance
Therapy, 16(2), 107–120.
Enos J (1995). Building relationships with friends and other community members. In J Everson, ed.,
Supporting young adults who are deaf-blind in their communities, pp. 185–202. Paul H Brookes
Publishing Company, Baltimore, MD.
Exiner J and Kelynack D (1994). Dance therapy redefined: A body approach to therapeutic dance. Charles C.
Thomas Publisher, Springfield, IL.
Fraleigh S (1987). Dance and the lived body. University of Pittsburgh Press, Pittsburgh, PA.
Franko M (ed.) (2007). Ritual and event: Interdisciplinary perspectives. Routledge, New York.
Frost M (1984). Changing movement patterns and lifestyle in a blind, obsessive compulsive. American
Journal of Dance Therapy, 7, 15–31.
Gardner H (1983). Frames of mind: The theory of multiple intelligences. Basic Books, New York.
Gardner H (1999). Intelligence reframed: Multiple intelligences for the 21st century. Basic Books,
New York.
Goode D (1994). A world without words: The social construction of children born deaf and blind. Temple
University Press, Philadelphia, PA.
Grimes R (1995). Beginnings in ritual studies, 2nd edn. University of South Carolina Press, Columbia, SC.
Grimes R (1996). Readings in ritual studies. Prentice-Hall, Englewood Cliffs, NJ.
Grimes R (2000). Deeply into the bone: Re-inventing rites of passage. University of California Press,
Berkeley, CA.
Hagendoorn IG (2003). The dancing brain. Cerebrum, 5(2), 19–34.
Hanna JL (1979). To dance is human: A theory of nonverbal communication. University of Texas Press,
Austin, TX.
Hurley S and Chater N (eds) (2005). Perspectives on imitation: From cognitive neuroscience to social science.
Jones C (2002). Evaluation and educational programming of students with deafblindness and severe
disabilities, 2nd edn. Charles C Thomas, Springfield, IL.
Kates L, Schein JD and Wolf EG (1981). Assessment of deaf-blind children: A study of the use of the
‘behavior rating instrument for autistic and other atypical children’. Viewpoints in Teaching and
Learning, 57(1), 54–63.
Kealiinohomoku J (1976). Theory and methods for an anthropological study of dance. University Microfilms
AAT 7621511, Ann Arbor. (ProQuest document number 760483951)
Koch N (1984). Content analysis of leadership variables in dance therapy. American Journal of Dance
Therapy, 7, 58–75.
Kratz L (1973). Movement without sight. Peek Publications, Palo Alto, CA.
Laban R (1960). Mastery of movement, 2nd edn. Macdonald and Evans, London.
Laban R and Lawrence FC (1974). Effort. Macdonald and Evans, London.
Lamb W and Watson E (1979). Body code: The meaning in movement. Routledge and Kegan Paul, London.
Leventhal MB (1979). Structure in dance therapy: a model for personality integration. Dance Research
Annual, X, 173–82.
Leventhal MB (1980). Dance therapy as treatment of choice for the emotionally disturbed and learning
disabled child. Journal of Physical Education and Recreation, 51, 33–35.
Lincoln Y and Guba E (1985). Naturalistic inquiry. Sage Publications, Beverly Hills, CA.
Loman S (2005). Dance/movement therapy. In C Malchiodi, ed., Expressive therapies, pp. 68–89. Guilford
Press, New York.
Lyons J (1987). Ecology of the body: Styles of behavior in human life. Duke University Press, Durham, NC.
Maffesoli M (1996). The contemplation of the world: figures of community style, translated by S Emanuel.
University of Minnesota Press, Minneapolis, MN.
Maletic V (1982). On the aisthetic and aesthetic dimensions of the dance: A methodology for researching dance
style. Ph.D. Dissertation, Ohio State University, Columbus, Ohio.
Maquet J (1986). The aesthetic experience. Yale University Press, New Haven, CT.
Mason K (1980). Observations on dance therapy as a viable treatment modality for visually handicapped
individuals. In S Fitt and A Riordan, eds, Focus on dance IX: Dance for the handicapped, pp. 37–42.
American Alliance for Health, Physical Education, Recreation and Dance, Reston, VA.
McCauley R and Lawson E (2002). Bringing ritual to mind: Psychological foundations of cultural forms.
McInnes J and Trefry J (1982). Deaf-blind infants and children: A developmental guide. University of
Toronto Press, Toronto.
McKechnie S and Grove R (eds) (2005). Thinking in four dimensions. University of Melbourne Press,
Melbourne.
Miles M and Huberman M (1984). Qualitative data analysis: A sourcebook of new methods. Sage Publications,
Beverly Hills, CA.
Moore CL and Yamamoto K (1988). Beyond words: Movement observation and analysis. Gordon and Breach
Publishers, New York.
Pease L (2000). Creating a communicating environment. In S Aitken, M Buultjens, J Clark, T Eyre and
L Pease, eds, Teaching children who are deaf-blind, pp. 35–82. David Fulton Publishers, London.
Pisciotta A (1980). The case for dance for the deaf. In S Fitt and A Riordan, eds, Focus on dance IX: Dance
for the handicapped, pp. 25–28. American Alliance for Health, Physical Education, Recreation and
Dance, Reston, VA.
Reber R (1980). Creative movement for the young hearing-impaired child. In S Fitt and A Riordan, eds,
Focus on dance IX: Dance for the handicapped, pp. 29–32. American Alliance for Health, Physical
Education, Recreation and Dance, Reston, VA.
Rentschler I, Herzberger B and Epstein D (eds) (1988). Beauty and the brain: Biological aspects of aesthetics.
Birkhauser Verlag, Basel.
Salk J (1983). Anatomy of reality: Merging of intuition and reason. Columbia University Press, New York.
Schechner R (1993). The future of ritual: Writings on culture and performance. Routledge, London.
Schilbrack K (ed.) (2004). Thinking through rituals: Philosophical perspectives. Routledge, New York.
Schmais C (1985). Healing processes in group dance therapy. American Journal of Dance Therapy, 8, 17–36.
Serlin I (1993). Root images of healing in dance therapy. American Journal of Dance Therapy, 14(1), 65–76.
Sharpe E (2005). Delivering communitas: Wilderness adventure and the making of community. Journal of
Leisure Research, 37(3), 255–280.
Sheets-Johnstone M (1994). The roots of power: Animate form and gendered bodies. Open Court,
Chicago, IL.
Shusterman R (2000). Performing live: Aesthetic alternatives for the ends of art. Cornell University Press,
Ithaca, NY.
Shusterman R (2005). Making sense and changing lives: Directions in contemporary pragmatism.
Journal of Speculative Philosophy, 19(1), 63–72.
Siegfried W (1988). Dance, the fugitive form of art: aesthetics as behavior. In I Rentschler, B Herzberger
and D Epstein, eds, Beauty and the brain: Biological aspects of aesthetics, pp. 117–148. Birkhauser Verlag,
Basel.
Sparshott F (1988). Off the ground: First steps to a philosophical consideration of dance. Princeton University
Press, Princeton, NJ.
St John G (ed.) (2006). Victor Turner and contemporary cultural performance. Berghahn, New York.
Stamatelos T (1984). Peaks and plateaus of the mentally retarded. The Arts in Psychotherapy, 11, 109–15.
Stillman R and Battle C (1986). Developmental assessment of communication abilities in the deaf-blind.
In D Ellis, ed., Sensory impairments in mentally handicapped people, pp. 319–338. Croom Helm,
Beckenham.
Turner V (1977). The ritual process: Structure and antistructure, 2nd edn. Routledge and Kegan Paul,
London.
van Dijk J (1977). What we have learned in 12.5 years: Principles of deaf-blind education. In M Sopers-
Jurgens, ed., Confrontation between the young deaf blind child and the outer world, pp. 1–10. Swets and
Zeitlinger, Lisse.
van Dijk J (2001). Which predictors play an important role in deaf-blind education? The National
Information Clearinghouse on Children who are Deaf-Blind
http://www.dblink.org/lib/topics/vandijk9a.htm
Weisbrod J (1974). Body movement and the visually impaired person. In K Mason, ed., Dance therapy:
Focus on dance VII, pp. 49–52. American Association for Health, Physical Education and Recreation,
Washington, DC.
Wu Y (2005). Dancing with little spirits: A journey towards enhancement of pedagogical relationship and
intersubjectivity in a third grade dance education setting in Taiwan. Ph.D. Dissertation, Temple University,
Philadelphia, USA.
Young JZ (1974). An introduction to the study of man. Oxford University Press, Oxford.
Chapter 19
Therapeutic dialogues in music:
Nurturing musicality of communication
in children with autistic spectrum
disorder and Rett syndrome
Tony Wigram and Cochavit Elefant
19.1 Introduction: support for communication – principles and techniques
The development of clinical music therapy over the past 50 years has equipped the trained
practitioner with methods and techniques for using both precomposed and improvised music in
ways that have wide application (Bruscia 1987; Wigram 2004; Wigram et al. 2002). In Europe,
a tradition of improvised music-making—stimulating dialogues of expression in musical form—
promotes the development of a musical relationship between a therapist and a patient or group
of patients (Alvin 1975; Nordoff and Robbins 1977; Priestley 1994). Listening to precomposed
music, and singing or composing songs, is frequently used in palliative care and for the manage-
ment of terminal illness (Aasgaard 2005; O’Brien 2005). Whether the therapist uses precomposed
or improvised music, we believe the quality of musical engagement and its clinical benefits
depend on engaging with the motives of communicative musicality considered to be the founda-
tion for the healing process (Trevarthen and Malloch 2000). Human communicative musicality,
evident in parent–child interactions from birth (Malloch 1999; Trevarthen and Malloch 2002),
comes alive with people of all ages when music is shared as the medium for therapeutic dialogue.
To understand the power of music to heal, it should be conceived as communication that can
engage human emotions and thoughts profoundly. But how can the sounds of a human voice
or performance on a musical instrument have such an effect? This is not a question for which
psychology can offer an easy answer.
Whether music functions as a ‘language’ is a matter of debate. A musical dialogue may be
neither oral nor vocal, and usually it has no definite semantic or referential meaning. Sloboda
commented that it would be ‘very difficult to agree on a set of criteria for demonstrating that a
person had “understood” some music’ (Sloboda 1990, p. 6). Yet musical expression can certainly
mediate intimate and creative dialogic encounters between people, linking their motives and
emotions.
The literature of music psychology addresses the teaching of musical skills in children much as
a developmental linguist might study the learning of formal features of speech or writing
(Hargreaves 1990, p. 63). This science is not typically looking at the child’s musical production to
analyse a dialogical process of mutual engagement (see, however, Gratier and Danon Chapter 14,
Erickson Chapter 20, Woodward and Bannan Chapter 21, and Custodero Chapter 23, this
volume). In an original approach, Schögler (1998) demonstrated a clear connection between the
art of music and basic communications research, by comparing the intuitive expressive interac-
tions of infants with parents to the dynamic skills of jazz musicians improvising in duets.
This idea of a non-verbal dialogue mediated by musical expressions was further explored in a
commentary by Hallan Tønsberg and Hauge (1998) who related Schögler’s theory of rhythmic
temporal synchrony and interaction, and Stern’s concept of the ‘attunement’ of behaviours
between people (Stern et al. 1985), to their own analysis of the interplay of simultaneous and
contingent utterances between congenitally deaf-blind children and their adult partners (Hauge
and Hallan Tønsberg 1998).
Holck (2002, 2004) has made a detailed analysis of musical interaction in therapy, with partic-
ular reference to the autistic and developmentally disabled population. She commented that, ‘in a
well-functioning dialogue, the nonverbal and often implicit visual and auditory cues ensure good
continuation without interruptions or overlapping’ (Holck 2004, p. 45). Holck went on to
say that ‘in mutual interplay, both partners participate in turn-organization, and therefore an
analysis of cues indicating turn-taking and turn-yielding can give information on the partici-
pants’ social skills, whether or not the dialogue is verbal’. Holck’s analyses of music therapy
sessions, consisting of ‘horizontal’ analysis looking at musical interaction over time, and ‘vertical’
analysis looking at different forms of interactional behaviour occurring simultaneously, demon-
strate the development of ‘interaction themes’, and both turn-taking dialogue and a form of
‘simultaneous dialogue’ unique to musical interplay. Her analysis shows that well-regulated
musical dialogue is characteristic not only of artistic or recreational music making, but of clinical
music therapy.
The deliberate use of controlled musical dialoguing as a therapeutic method has been defined
as ‘a process where therapist and client or a group of clients communicate through their musical
play’ (Wigram 2004, pp. 97–106);¹ two main forms are distinguished.
1 ‘Turn-taking dialogues: making music together where the therapist or client in a variety of
ways, musical or gestural, can cue each other to take turns. This “turn-taking” style of
dialogue requires one to pause in their playing and give musical space to each other’
(Wigram 2004, p. 98).
2 ‘Continuous “free-floating” dialogues: making music in a continuous musical dialogic
exchange—a free-floating dialogue. Here the participants, therapist and client, play more or
less continuously and simultaneously. In their playing musical ideas and dynamics are heard
and responded to, but without pause in the musical process’ (Wigram 2004, p. 98).
One can imagine that, just as in a verbal conversation, there are several ways in which the
dialogue can develop between the participants. First, they can time their contributions in various
ways.
1 The therapist and client take turns to play, taking over from each other immediately, without
pause.
2 The therapist and client take turns, with pauses between their statements.
3 The therapist or client interrupts the ‘conversation’.
4 Therapist and client overlap each other (‘talk’ at the same time), in a harmonious manner.
5 The client makes long statements and the therapist, in very short phrases, gives the
equivalent of grunts or ‘ah-ha’ responses.
¹ The following principles are described for a therapist working with a single client, but they can also be
applied to work with a group of clients.
In addition, the emotional quality of their contributions can differ:

6 The therapist’s musical style in the dialogue may be sympathetic (similar in manner or
feeling) to the style of the client, or, conversely, the client responds in a sympathetic manner
to the statements of the therapist.
7 The therapist’s playing in the dialogue is very unsympathetic and oppositional/confronta-
tional to the client, or vice versa (Wigram 2004).
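The timing patterns in items 1, 2 and 4 above can be made concrete with a small sketch. It is purely illustrative and is not a method described by Wigram or Holck: the phrase representation, the function name `classify_transitions` and the 0.25-second pause threshold are all assumptions chosen for the example. Each phrase is recorded as a player with start and end times in seconds, and each change of player is labelled by the gap between the two phrases.

```python
from typing import List, Tuple

# Hypothetical representation: (player, start_seconds, end_seconds).
Phrase = Tuple[str, float, float]


def classify_transitions(phrases: List[Phrase],
                         pause_threshold: float = 0.25) -> List[str]:
    """Label each change of player as 'overlap', 'immediate' or 'paused'.

    pause_threshold is an assumed value, not a clinical standard.
    """
    ordered = sorted(phrases, key=lambda p: p[1])  # order by start time
    labels = []
    for previous, following in zip(ordered, ordered[1:]):
        prev_player, _, prev_end = previous
        next_player, next_start, _ = following
        if prev_player == next_player:
            continue  # the same player carries on, so no turn is exchanged
        gap = next_start - prev_end
        if gap < 0:
            labels.append("overlap")    # both play at once (item 4)
        elif gap <= pause_threshold:
            labels.append("immediate")  # taking over without pause (item 1)
        else:
            labels.append("paused")     # a pause between statements (item 2)
    return labels


session = [
    ("therapist", 0.0, 2.0),
    ("client", 2.1, 4.0),
    ("therapist", 5.0, 6.0),
    ("client", 5.5, 7.0),
]
print(classify_transitions(session))
```

A labelling of this kind says nothing about the emotional quality of the exchange (items 6 and 7), which depends on musical content rather than timing.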
Musical dialogue is a natural developmental outcome of the impulses for sharing communica-
tive expression typical of normally developing children. However, in improvisational music-
making with clinical populations, musical dialogues do not always develop automatically or
easily. For example, some autistic clients find it extremely difficult to engage in dialogues because
they cannot follow or respond in normal turn-taking exchanges. Clients with Rett syndrome
show marked delay of response, and their uncontrolled movements disrupt the natural timing of
interactions. Those with Asperger’s syndrome typically ‘talk’ so much that they do not pause for
long enough to listen to what somebody else has to say.
By using techniques such as interjecting (waiting for a space in the client’s music and filling
in the gap) and making spaces (leaving spaces within one’s own improvising for the client to
interject their own material) (Bruscia 1987, p. 535), the therapist can engage inattentive clients in
dialoguing, leading to a conversation or argument style of improvisational music-making in
which the playing together can become directly communicative as a game. Communication can
also be facilitated through modelling—playing and demonstrating something in a way that
encourages the client to imitate, match or extend some musical idea (Wigram 2004, p. 99).
These and other techniques can be used to promote the initiation, development and progres-
sion of a dialogue. By supporting the natural motives of communicative musicality, they organize
the harmonic, rhythmic, melodic and dynamic musical cues or gestures in shared patterns of
activity. They create a potential for mutual experience resembling the intuitive sympathy of com-
munication that is an essential element in the developmental process of all children. They are
capable of encouraging collaborative responses, even when pathologies such as autism and Rett
syndrome present significant barriers to expressive communication and comprehension.
19.2 The technique of improvisational therapy

The methods of musical improvisation by which the potential for communicative musicality in
people with autism spectrum disorder (ASD) can be discovered, drawn out, explored, developed
and then integrated and incorporated into everyday engagements with other people rely on
musical technique and therapeutic method, and on the controlled attunement to a client’s
expressed intentions and feelings. The development of the required ‘toolbox’ of clinical tech-
niques and skills in musical improvisation, and their effective use in therapy, requires a system-
atic and comprehensive training. Effective therapeutic practice is learned through sensitive
attention to clients as individuals, and the recognition of their different experience. The theory of
musical improvisation therapy has been well documented; it is taught as a set of methods and
techniques that have proved appropriate and effective for achieving specified therapeutic goals
(Bruscia 1987; Wigram 2004). By the deliberate choice of a therapeutic method, we specify the
means by which those goals will be attained, and which well-practised musical techniques will be
used as tools.
Simple styles of playing—such as melody dialogues, two-chord accompaniments, walking
basses (tonal and atonal), sixths with octave grounds, jazz, pentatonic and Spanish-style
frameworks—are easily learnt by therapists; they are supported by therapeutic responses sensitive
to the initiatives and emotions of a client, such as matching, supporting, frameworking, and
grounding. Controlled transitions in therapeutic improvisation help a client or group of clients
change and develop their musical expression (Wigram et al. 2002, pp. 278–279). Frameworking,
discussed below, offers a planned musical structure to the expressions of a client, which can
have the goal of enhancing the music aesthetically or guiding the client in a new direction. Jazz
frameworks, illustrated below, offer a predictable but nevertheless creative and flexible structure
that is attractive for clients with autism, attention-deficit hyperactivity disorder (ADHD) and
Asperger syndrome, for whom the experience of predictability can be a critical need. As in games
with children, structure balanced against unpredictability plays a regulating role in the clinical
process, and depends on the skills of the therapist to engage with the impulses of the client musi-
cally (see Erickson, Chapter 20 and Custodero, Chapter 23, this volume, for a comparison with
the function of responsive yet structured musicality in the teaching of young children, and
Gratier and Danon, Chapter 14, this volume, for a comparison of mother–infant dialogues to
jazz improvisation).
19.3 Frameworks to support communicative musicality for children with autism spectrum disorder
Many clinical reports and a few systematic studies in the music therapy literature present
evidence for the stimulation of communication in children with autism (Gold et al. 2006).
A consistent increase in communicative expressions and responses over 10 sessions during
improvisational music therapy was found in a study by Edgerton (1994) of 11 children with
autism. Other studies have explored factors that may influence the efficacy of music therapy for
autism, including the involvement of family members (Müller and Warwick 1993; Oldfield
2004), attention to the developmental level of communication (Perry 2003), the regulation of
turn-taking and visual attention (Plahl 2000; Bunt 1994), the systematic control of dynamic form
in the music (Pavlicevic 1997), and the detailed use of case analyses to describe the interaction
process of playing, turn-taking and timing with individual clients (Robarts 1998; Wigram 1999).
Below, we review improvisational therapy and structured music-making to describe how a
therapist can use these methods as tools to develop communication. Music therapy can be used
for assessment and to improve the diagnosis of developmental disability, and the analysis of
musical activity can play a unique role in demonstrating strengths and difficulties and in identi-
fying potential flexibility, responsivity and mastery of social skills in children with autism
(Oldfield 2004; Wigram 2002).
Children with ASD, or various pervasive developmental disorders and developmental disability,
can show musical creativity and can benefit from its encouragement (Wigram 2004). However,
they tend to be rigid and repetitive in their behaviour, because they seek predictability in experi-
ence that enables them to feel secure. Parents, carers and educational staff are aware that less
challenging behaviour occurs when the environment meets the expectations of these children,
and that learning has a greater chance of taking place in a clear and accepted structure.
Music, particularly improvised music-making, has the advantage of combining a foundation-
giving structure with measured flexibility and unpredictability; this can help children with ASD
to learn, by degrees, how to manage when their world becomes less predictable. Improvisational
music therapy can also take a child with an ASD back to the early, prelinguistic stage, when
the exchange of simple sounds, beginning with the sounds a child will make for his or her own
enjoyment, stimulates an interior communicative dialogue—one that is understandable and
enjoyable for the child. In this way, the sharing of experience, joint attention to meaning, engage-
ment in purposes with others, and relationships of companionship, trust and affection are built.
The creation of an appropriate musical structure to enable a child to engage, or in response to
a child’s music, is natural and helpful during improvising (intentionally or unintentionally), and
is highly relevant in music therapy practice, where clients need, for one reason or another, a clear
musical frame. Children with ASD demonstrate a need for structure, which music contains in
many forms including melody, harmony, rhythm, phrasing, and dynamics.
The technique of frameworking, that is, the provision of a clear musical framework for the improvised
material of a client or group of clients in order to create or develop a specific type of musical
structure (Wigram 2004, p. 118), can inspire and encourage, or stabilize and contain. Among the
64 techniques of music therapy described by Bruscia (1987) is experimenting, which he defines as ‘providing
a structure or idea to guide the client’s improvising, and having the client explore the possibilities
therein’. Frameworking is a more directive or structuring technique for the
communication of ideas and experiences in sound. It is not primarily sympathetic in its purpose,
although the frame provided must be responsive to the feelings and mood of the client, and
modulated to enhance further interaction. Frameworks contain structure to the degree that the
music is formed in a way that allows the child to predict and consequently join in. Therefore,
a framework is a musical type or style that contains varied levels and complexity of structure.
Creating a framework assumes the development of a musical structure.
The following case vignette exemplifies the use of jazz frameworks. Jazz has a constant,
underpinning rhythmic stability or pulse, which is typically ‘played against’, syncopated,
challenged, but remains foundationally solid. A walking bass may support the harmonic
direction of the music. Jazz music also often includes a clear, repeated frame of harmonies,
such as the cycle of fifths, a compelling harmonic sequence used in many styles of music from
popular to classical. Together with the predictable rhythmic structure, the expressive frame
invites the listener to anticipate the musical direction, and to enjoy the many melodic and stylis-
tic variations that take place within this frame. Musical ornamentation—syncopation, silent
beats or bars, off beat melody—enlivens and colours the music in a flexible way (Wigram 2004,
pp. 121–25).
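The predictability of the cycle of fifths can be shown with a short sketch. It assumes the common reading of the cycle in which each chord root falls a perfect fifth to the next; the function name `cycle_of_fifths` and the sharp-only note spelling are illustrative choices, not taken from the source. Starting on D it yields the roots D, G, C, F, which match the chord roots in the therapist's accompaniment in Table 19.1.

```python
from typing import List

# Pitch classes spelled with sharps only (an assumption made for simplicity).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def cycle_of_fifths(start: str, length: int) -> List[str]:
    """Return `length` chord roots, each a perfect fifth below the previous one."""
    index = NOTE_NAMES.index(start)
    roots = []
    for _ in range(length):
        roots.append(NOTE_NAMES[index])
        # Falling a fifth (7 semitones) reaches the same pitch class
        # as rising a fourth (5 semitones).
        index = (index + 5) % 12
    return roots


print(cycle_of_fifths("D", 4))  # ['D', 'G', 'C', 'F']
print(cycle_of_fifths("E", 4))  # ['E', 'A', 'D', 'G']
```

Because every step is the same interval, a listener who has heard two or three chords can anticipate the next root, which is the property the framework relies on.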
In the short case vignette that follows, drawn from case material of a specialist tertiary service
for the diagnosis and management of children with a variety of disorders, most of which fall
within the autistic spectrum, this style of jazz framework was effective in evoking behaviours
expressive of communicative musicality, despite significant pathology.
Case study 19.1: Joel

Joel is a 7-year-old boy with autism, whose case was more fully reported from a different perspective in a
previous publication (Wigram 2002). At referral from a consultant paediatrician, he was described as a boy
who demonstrated poor use of direct eye contact, a lack of socially imitative play, an inability to share enjoy-
ment with others, stereotyped ritualistic play, and was poor at relating to other people, especially his peers.
He appeared unable to use non-verbal behaviour to regulate social interaction.
The opening experience of the session finds Joel exploring the grand piano. He is particularly interested in
watching the hammers come up when pressing down the keys: such a preoccupation with the mechanical
function of an object is often found in children with autism. The musical engagement that follows—when
the therapist (Tony Wigram) joins in on a second piano—is set out in Table 19.1.
The harmonic structure shown in Table 19.1 uses the cycle of fifths. Joel improvised his melody and
rhythmic patterns over the stable, jazzy accompaniment provided, fitting his melodies into the structure.
The improvisation lasted only around 65 seconds, but in this time Joel demonstrated evidence of interactive
engagement, both musically and through the number of times he visually referred to the therapist.
The musical dialogue continued on the two pianos. Joel began to play rigid sequences on the black notes, to
which the therapist provided a pentatonic harmonic frame. Joel worked his way up to the top of the piano,
gradually slowing down as he reached the top. The therapist again provided a two-chord accompaniment.
A short transition led into a melodic improvisation by Joel, supported by a jazz chordal
framework from the therapist (Table 19.2).
The structure of the harmony led to a clear, mutually anticipated musical dialogue between therapist and
client. This simultaneous style of dialogue emerged because the framework began to use the cycle of fifths
more clearly, within jazzy 12-bar harmonic cycles. In reference to the description above of dialogue as a
method, this dialogue did not begin as turn-taking, but developed as a continuous, free-floating exchange of
musical ideas, which requires both therapist and client to incorporate the other’s musical material quickly
into their own playing.
These short samples of musical invention with an autistic child illustrate how a musical
structure can provide the necessary framework for drawing out the musicality of the client, and
indicate how the particular style of either tonal or atonal jazz offers a creative and flexible framework
for this. The evidence of autistic spectrum disorder was present in the other assessment
sessions undertaken as part of the multidisciplinary diagnostic procedure with this boy (speech
and language therapy and cognitive psychology), and was also evident in the way Joel established
melodic patterns in his playing. But the flexible style, typical of jazz, and the predictable
harmonic direction, allowed him to anticipate how he could ‘fit in’ his musical production with
the initiatives of the therapist. This fitting in, or matching, is part of the musical dynamic that
draws out or invites the expression of communicative musicality. The use of the jazz framework
became effective in providing structure while allowing flexibility.
There are many more examples from clinical assessments that could be cited here, in which
the harmonic and rhythmic structures both engage the person and offer opportunities to tease
out or release the potential communicative musicality from within confining pathological
patterns of behaviour. While these patterns are evident in one form in ASD, they present in a very
different, yet related way with another, more severe pathology: Rett syndrome.
Table 19.1 Two-piano improvisation between the therapist and Joel: emerging structure over a two-chord accompaniment. Where the chords are specified in the therapist column (e.g., D minor 7), each chord represents a four-beat bar in a fast quaver beat. The number 7 represents a seventh in the chord
Client | Therapist
Random bass-notes | Falling melody in triplets in treble
Fast repeated notes in treble | Melody-matching with Joel in treble
Repeated note (A) in the rhythm of the therapist’s accompaniment | Accompanying ‘um-cha’ chords: D minor 7 D minor 7
Begins a melody that goes up and down in the treble of his piano on a rhythm of: –.. –.. | G major 7 G major 7 / D minor 7 D minor 7 / G major 7 G major 7
Continues his melody and rhythmic pattern, remaining in the same tempo and dynamic as the piano | C major 7 C major 7 / F major 7 F major 7
Plays with the flat of his hands, with alternate hands on the keys in a pulsed beat | E major 7 A major 7 / D major 7 G major 7
Continues with a melody (with one finger) but slows a little. Goes into chord playing. Uses both hands simultaneously (looks in the piano), reverts to melody, jumping around with one finger of each hand all over the piano | D minor 7 D minor 7 / G major 7 G major 7 / D minor 7 D minor 7 (strong) / G major 7 G major 7 / C major 7 C major 7
Continues melody with repeated notes, matching therapist’s rhythm, tempo and accents … STOPS | F major 7 F major 7 / E major 7… Held and sustained
Pulls up a chair to sit down |
Table 19.2 Continuation of the improvisation between the therapist and Joel, now within a jazz framework
Client | Therapist
Pentatonic melody up the piano, repeating notes until he reaches the top | B Minor 7 G Major 7 / B Minor 7 G Major 7
Pause… Notes in the bass of the piano | G Major G Major
Transition: random notes, without direction | Transition: octaves in the treble, then a chromatic modulation down the piano
Pentatonic melody in right hand with repeated notes matching the tempo. Melody moves downward in a stepwise pattern (still on the black notes and with right hand) | Two-chord improvisation (jazz style): E Minor 7 A Major 7 E Minor 7 A Major 7 / E Minor 7 A Major 7 E Minor 7 A Major 7
Melody continues with repeated notes | D Minor 7 G Major 7 C Major D Major 7
At the change of key, Joel stamps his feet several times | G Major 7 G Major 7 G Major 7 G Major 7
Continues with melody and chords | C Major 7 C Major 7 G Major 7 G 7–E 7
Joel establishes a new rhythm in his melody, using a .-.-.-.- pattern | A Major 7 D Major 7 G Major 7 D Major 7 / G Major 7 G Major 7 G Major 7 G Major 7
At the harmonic cue of a change of key, Joel goes into alternate hand chords, and plays repeated chords with the piano (anticipated by Joel from the harmonic and rhythmic pattern in the music) | C Major 7 C Major 7 G Major 7 E Major 7 / A Major 7 D Major 7 G Major 7 G Major 7
19.4 Reinforcing communicative musicality to help children with Rett syndrome
Rett syndrome (first described in Rett 1966) is a genetic disorder affecting mainly females (Amir
et al. 2000). It leaves a child with severe movement and coordination disadvantages, preventing
participation in rhythmic, natural interactions, and severely restricting voluntary activity
(Hagberg et al. 1983, 1993; Kerr and Witt Engerström 2001). Nevertheless, music is greatly loved
and appreciated by children with Rett syndrome, and music therapy has long been considered an
effective and indicated treatment for affected children and adults. In particular, it can be helpful
in developing social relatedness, attention, primary communication and in stimulating
movement, functional hand usage and learning (Elefant 2001; Elefant and Wigram 2005; Hadsell
and Coleman 1988; Montague 1986; Wesecky 1986; Wigram 1991).
Communication between an infant and his or her primary caregiver makes an essential contri-
bution to the development of the child’s psychological capacities throughout later development.
When the infant has developmental delay or impairment, the weakening of this process may
affect social, communication, motor or intellectual functioning, which can result in the inability
to create coherency in the self, and thus difficulties in organizing experiences, feelings and
emotional patterns in relationships (Stern 2000). A lack of communicative abilities in the infant
is associated with increased difficulties in integrating experience with actions and across
modalities. This might be clinically evident in very little or no engagement by eye contact, and
limited emotional sharing by any means of expression (Pavlicevic 1997).
Despite the fact that individuals with Rett syndrome all become afflicted with a severe develop-
mental disability, most appear to develop normally at first (Einspieler et al. 2005; Burford 2005;
Nomura et al. 2005). The appearance of clear diagnostic abnormality typically occurs between
6 months and 2 years of age (Hagberg et al. 1993), but there is a wide range of clinical
severity reflecting variability of changes in the brain (Kerr and Witt Engerström 2001). When the
progress of the condition is compared with Daniel Stern’s (2000) account of the development of
‘the five senses of self’ in infancy, it is evident that many girls with Rett syndrome do acquire
what Stern defines as ‘an emergent self’, ‘the core self with others’ and ‘the intersubjective self’,
and that some may even have begun to develop ‘the verbal self’.
With the knowledge that a girl with Rett syndrome apparently experiences a normal
development at the beginning of her life, we can presume that her primary caregiver will have
interacted with her as she would with a normal baby. This means that both child and adult
will have the emotional experience of learning to attune to one another through preverbal
communication. In the development of a normal child, the primary caregiver typically plays,
sings and shows emotions towards his or her baby, who in return replies in smiles, gestures and
vocalizations. The infant and the parent sympathize with each other’s facial expressions and gestures,
and explore different vocal interactions through ‘affect attunement’ (Stern et al. 1985; Stern
2000). They find pleasure in the experience of interacting with communicative musicality
(Malloch 1999; Trevarthen and Malloch 2002). Up to the stage where the syndrome comes to full
expression, this is also the case for infants with Rett disorder (Trevarthen and Burford 2001).
As a result of the drastic regression that typically occurs in a girl with Rett syndrome at Stage II
of this disorder (the ‘destructive stage’, usually around 18 months), there is a change in her inter-
actions with others and in their responses and expressions towards her. This stormy period,
unsettling both for the parents and for the girl (Kerr and Witt Engerström 2001), temporarily
interrupts the flow in emotional communication that mediates human contact.
19.4.1 Preferences and their development in music therapy for girls with Rett syndrome
In research conducted by the second author, differences in song preferences between seven girls
with Rett syndrome who began to show the disorder at different ages support a developmental
interpretation (Elefant 2001, 2002). It was found that two girls whose onset of Rett syndrome
occurred early, at about 9 months of age—the stage of ‘secondary intersubjectivity’ (Stern 2000;
Trevarthen and Hubley 1978)—preferred songs that were slower in tempo and had fewer
dynamic, rhythmic and melodic changes. Typically, these songs were sustained with few
surprises, like the lullabies and the soft speech of a caregiver seeking communication with a very
young infant. On the other hand, five girls whose onset of Rett was around 15–24 months of
age—the beginning of the ‘verbal self ’ (Stern 2000)—preferred more complex songs with fast
tempi and greater variability in rhythm, dynamics and melody, together with vocal humour or playfulness.
It seems that a child with an early onset of Rett syndrome will not have experienced all of Stern’s
five senses of self (Stern 2000), unlike a child who had the chance of normal interaction with her
primary caregiver through the two years of infancy.
Primary caregiver–infant interactions are similar to therapist–client interactions. An affectionate
mother will attune her communication with her baby in response to the emotional states
and developmental stages of the infant. Similarly the therapist will respond appropriately to each
child’s emotional expressiveness and level of maturity.
Using songs with children with developmental disabilities is as natural and appropriate as
a mother singing to her infant. The songs are linguistically simple and repetitive, relying on
non-verbal rather than verbal communication, reflecting the child’s expressions. Dialogues are
sustained when the therapist, taking the score of a composed and structured song as a base, strives,
in the way she sings, to be attuned to the child’s facial expression, body movement, gestures and
vocalization.
In the study reported as ‘Enhancing communication in girls with Rett syndrome through songs in
music therapy’, 18 familiar and unfamiliar songs were presented (Elefant 2001, 2002, 2004). One of
the purposes of the study was to determine whether girls with Rett syndrome are able to make intentional
choices. They first indicated a choice of a song out of two or four picture symbols or words
(depending on individual ability) that represented songs about animals and other topics, followed by
confirmation of their choice after the order of the symbols had been randomly changed out of sight
of the girl. The girls expressed their choice by eye gazing, or pointing with their nose or hand. They
expressed their feelings for the music by an array of communicative acts: smiling, laughing, turning
their head away or crying. The study lasted 5 months (20–30-minute sessions three
times a week) and included baseline, intervention and maintenance trials, followed by three
additional maintenance trials (2, 6 and 12 weeks after the intervention had ended). All songs were based
on repetitive elements, to provide a foundation on which a child with developmental disability could
be supported by a therapist. This frame provided the security a relationship needs to develop inter-
subjective rapport, trust and attachment. After presenting the child with this safe ‘container’, both
child and therapist were free to interact musically in more playful and experimental ways.
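The choice-and-confirmation procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the study's actual protocol or software: the function name `confirmed_choice`, the `pick` callable (which stands in for the girl's eye gaze or pointing) and the song labels are all invented for the example.

```python
import random


def confirmed_choice(symbols, pick, shuffle=random.shuffle):
    """Return the song chosen twice in a row, or None if the picks differ.

    symbols: song names shown as picture symbols or words (two or four).
    pick:    hypothetical callable standing in for the girl's eye gaze or
             pointing; it receives the displayed order and returns a song.
    shuffle: in-place reordering applied out of the girl's sight.
    """
    first = pick(list(symbols))
    reordered = list(symbols)
    shuffle(reordered)  # a random shuffle may occasionally leave the order unchanged
    second = pick(reordered)
    return first if first == second else None


songs = ["train", "spider", "bus", "frog"]  # invented song labels

# A girl who follows the song itself confirms her choice.
by_song = confirmed_choice(songs, lambda shown: "train")

# A girl who always indicates the first position does not, once the order changes.
by_position = confirmed_choice(songs, lambda shown: shown[0],
                               shuffle=lambda xs: xs.reverse())
```

The point of the reordering step is that only a choice tied to the song, rather than to a position on the display, survives confirmation.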
Analysis of recorded musical interactions between the therapist and the girls with Rett
syndrome confirmed that the regulating motives of communicative musicality described by
Trevarthen and Malloch (2002) were activated as therapist and child conversed emotionally
with one another by sharing songs. As the therapist sang songs to the girls, they responded in
individual ways by movements of the whole body, facial expressions, movements of the limbs,
hand gestures and vocalizations. Each time the same song was sung, it was as if a new narrative
was being told that attracted the child’s attention. In one session, for example, the girl might
respond happily and with vitality when the therapist sang the song she had selected. In another
session, she might remain passive.
In each case, the girl’s response influenced the therapist’s singing, causing her to vary the
tempo and expression of her playing as she kept attentive and attuned to the girls’ facial and
bodily gestures. This sympathizing of musical performance occurred with no conscious intention,
and was brought to light only after the conclusion of the study, during song analysis. The same
responsive adjustment of timing and expression takes place between caregiver and baby in
normal affectionate and playful interactions (Burford and Trevarthen 1997; Stern 2000). The girls’
behaviours confirm that a child with Rett syndrome, having experienced such interactions in
the affectionate communications of early infancy, retains sensitivity to contingent and attuned
expressive behaviour of another person, without the presence of language (Merker and
Wallin 2001).
The study found that girls with Rett syndrome have song preferences. The songs were categorized
according to the number of times each was chosen by the participants, as described above,
and the confirmed choices for each song were summed and rank-ordered from the most to the
least preferred songs in the whole group. The five most preferred songs were then compared with
the five least preferred songs, and their structures analysed to determine their musical features.
Furthermore, to test the assumption that the degree of normal development of the ‘self’ evident
in children with Rett syndrome corresponds with the age of onset of the Rett disorder, children with
different ages of onset were compared.
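The ranking step described above (summing confirmed choices per song, ordering the songs, then comparing the most and least preferred groups on a musical feature) might be sketched as follows. All names and numbers here are assumptions made for illustration; none of the data comes from the study, and the group size is reduced to two so that the toy example stays short.

```python
from collections import Counter
from statistics import mean


def rank_songs(all_songs, confirmed_choices):
    """Order songs from most to least often chosen (ties broken by name)."""
    counts = Counter(confirmed_choices)  # songs never chosen count as 0
    return sorted(all_songs, key=lambda song: (-counts[song], song))


def compare_groups(ranking, feature, n=5):
    """Mean of a numeric feature for the n most and n least preferred songs."""
    top, bottom = ranking[:n], ranking[-n:]
    return mean(feature[s] for s in top), mean(feature[s] for s in bottom)


# Invented data: four songs, a list of confirmed choices, and a tempo per song.
songs = ["a", "b", "c", "d"]
choices = ["a", "a", "a", "b", "b", "c"]
tempo_bpm = {"a": 120, "b": 130, "c": 100, "d": 90}

ranking = rank_songs(songs, choices)
top_mean, bottom_mean = compare_groups(ranking, tempo_bpm, n=2)
```

Any other feature that can be expressed as a number per song could be passed in place of `tempo_bpm`; descriptive features such as melodic richness would need a coding scheme first.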
Of the many song features analysed, the most influential factors determining song preference
were found to be familiarity, tempo, rhythmic and tempo variability, dynamic expression,
melodic richness, vocal sounds (such as ‘buzz’, ‘oops!’, ‘toot-toot’ and ‘weeee’) and playfulness.
Information on each of these features will now be presented.
Familiarity: Songs already familiar to the girls were strongly represented in the most preferred
song group, while unfamiliar songs were prominent in the least preferred song group. This finding
is supported by other researchers and clinicians who find that individuals with Rett syndrome
can become more animated and generate more communication and are more responsive when
familiar songs are heard (Braithwaite and Sigafoos 1998; Elefant and Lotan 1998; Hadsell and
Coleman 1988; Woodyatt and Ozanne 1992, 1994; Merker and Wallin 2001).
Tempo: There was a dramatic difference in tempo between the five most and the five least
preferred songs. The mean tempo of the five most preferred songs was 145 beats per minute, and
of the five least preferred songs 84 beats per minute. There may be various reasons for the girls’
preference for the songs with faster tempo, but one simple explanation is that their preference is
age-dependent—non-Rett children in the same age group also prefer music with a fast rather
than a slow tempo (LeBlanc 1981; LeBlanc and Cote 1983; LeBlanc and McCrary 1983; Sims
1987). This finding gives rise to the notion that those children who showed this preference had
experienced normal early caregiver–infant interaction and had continued, in some degree, along
the normal route of a developing ‘self’ when they were toddlers. An additional explanation for
tempo preference could be age of Rett onset. The girls were between 4 and 10 years old. When the
onset was later, between 18 and 24 months, the girls preferred songs more appropriate to their
chronological age. When the onset was earlier than 18 months—typically resulting in more
severe disability—preferences were for songs more appropriate for infants.
Tempo and rhythm variability: The most preferred songs had more complex rhythms, with
marked rhythmic energy and tempo changes, while the least preferred songs had rhythms that
remained static, with almost no rhythmical development. Variations in rhythm and tempo in
songs can add tension and elicit emotional and physical responses in a listener, as early events in
the musical sequence generate expectancies about events that will occur later (Fraisse 1982;
Martin 1972; Meyer 1956). The girls in the study were attentive to the variability of rhythm and
tempo, and became emotionally and physically active when these songs were performed—by
moving their bodies and by facial expressions, smiles and laughter. It seemed clear that, through
their understanding of the progressive narrative of the music, they were excited to look for a
communicative interaction with the therapist.
Melody: Melody is a very important component of musical expression in music therapy
(Aldridge 1999). Girls with Rett syndrome with no verbal means to express their emotions can
communicate that they are actively listening and reacting to the vitality in the melodies of the
songs. The melodic developments were more varied in the most preferred songs. The girls
seemed to be attentive to these melodic developments. It may be supposed that when a melodic
motif is repeated, it provides security, as it is predictable and invites anticipation, but changes
and surprises in the melody keep the song interesting and satisfying. All of the least preferred
songs had repeated and predictable melodic motifs which did not elicit excitement.
Vocal play: All of the preferred songs had distinctive types of vocal representation or mimesis,
some imitating the motions or sounds of objects or animals with nonsense vocalizations
and evocative changes in pitch. Songs with vocal imitations and play bring fun to the music,
and elicited many different emotional and communicative responses from the girls. Hearing
a musically well-balanced song offers order and meaning, and can create a state in which
the whole being of a girl with Rett syndrome attunes to the music. The child becomes open to
her surroundings, communicative, ready to engage with her environment (Figure 19.1).
In retrospect, it is unsurprising to find that children with Rett syndrome are able to experi-
ence clear preferences and can express their likes and dislikes in music, despite very severe
neurological impairment, and that their preferences were consistent with the songs’ musical
elements. A general characterization of the less preferred songs would be relaxing and cradling,
in the style of lullabies that are used to pacify babies and young toddlers. In contrast, most
of the preferred songs can be categorized as play or action songs, such as those that are
popular with children at the kindergarten level. At the average age of 7, these girls with Rett
syndrome preferred songs that are appropriate for normal children of the same age, or a little
younger.
Fig. 19.1 Hearing a musically well-balanced song offers order and meaning, and a Rett child can
become open to her surroundings, communicative, and ready to engage with her environment
(Cochavit with Yaffa and Ella).
19.4.2 Structured music as a catalyst for enhancing communicative musicality
Variations in song making and changes in vocalization are tools at the disposal of the therapist
when he or she uses precomposed songs with children and, in particular, with children with Rett
syndrome. As the therapist strove to be true to the music, holding the distinctive tempo, rhythm
and melody of the song at the centre of her work, she was attuned to the emotions and commu-
nicative impulses the girls demonstrated in response to her singing. These responses were imme-
diately reflected in the style in which the song was performed. Responsive change in the
emotional expression of the singing gave the song different meanings on different occasions, as if
a new story were told each time the song was sung. Rhythmic variations (ritardandos, acceleran-
dos, fermatas and pauses) were introduced into the playing. The way the therapist performed the
song reflected the therapist’s conscious or unconscious feelings about the girl’s responses, which
were different each time a song was played.
In the communicative musicality of mother–child interaction, the child grows emotionally and
socially, and the relationship between the caregiver and the child changes with each encounter
(Malloch 1999; Trevarthen 2002). Such growth and development is not linear or predictable.
There are organic transformations of the child’s motives that affect how the parent responds, and
there are many reasons why either or both of them will have different moods and feelings of
sociability at different times (Trevarthen 2001). ‘Musical interaction through songs helps to
establish a basic sense of inter-subjectivity through which a child can, from early on, make an
impact on another’ (Ruud 1998, p. 60). In a similar way, as the relationship between therapist and
client developed, the songs reflected the child’s mood and feeling and promoted her self-
awareness (Figure 19.2).
A series of sessions in a case example of a child with Rett syndrome will illustrate how the
child’s part in the musical communication can change from session to session over many weeks.
In this account, the therapist, Cochavit Elefant, is presented in the first person.
Case study 19.2: Ann

Ann, a 9-year-old child with Rett syndrome, participated in the study reviewed above (Elefant 2001, 2002).
A number of songs were sung to Ann, chosen according to her expressions of preference.
The ‘train’ song was introduced to her one week before the end of the study, and it immediately became
a favourite. Ann chose it on all 12 occasions it was offered as a choice (four times during
intervention and eight times during the last maintenance periods: 2, 6 and 12 weeks after the intervention
had ended). This song is fast in tempo, syncopated, with variable tempo and rhythm and a wide range
of melodic phrasings, and with repetition of vocal sounds of ‘toot toot’ at a particular point in the musical
narrative to signal the sound of the train.
Ann’s emotional and communicative expressions showed transformations in her awareness of the music
and of me, and our interactions developed and changed over a number of presentations in several weeks
(Figure 19.3). During the first hearing, Ann demonstrated her unfamiliarity with the song. Her facial expres-
sion was minimal and she had very little eye contact with me during the first two verses of the song. In the
second verse, she moved out of her seat and walked towards the exit door. Ann may have been confused.
She had chosen the train song with nose pointing, but may have expected to hear a different train song. The
context of the song was also unfamiliar to her and she may have been communicating this by removing
herself from the situation. Feeling I had lost contact with Ann, I began to accelerate the tempo during the
third verse, hoping that this change would bring her back. As my playing changed, Ann returned and placed
herself in front of me. She smiled and giggled slightly after hearing the sound ‘toot toot’. Apparently, the
acceleration of tempo and the amusing sound caught her attention and interest. A few days later, Ann chose
the song for a third time. During its presentation Ann’s emotional and communicative response increased.
While listening attentively, Ann kept eye contact with me; at first she smiled as she anticipated the ‘toot toot’
and then burst into laughter immediately following the sound; in later sessions, she laughed before the arrival
of the sound. Her laughter became stronger and longer, and her head and body swayed from side to side as
I picked up each little nuance in her body gesture and facial expression, and reflected it through variations in
tempo, dynamic and timbre. Our interaction went beyond the score of the prescribed song: it developed
synchronicity in attunement and a flawless coordination of movement in time. The sympathetic perform-
ance of the song held both my and the girl’s emotions. These were powerful moments of ‘becoming’ in one
intimate body of sound, in one musical space that gave freedom of expression to two separate people. After a
few weeks of these moving and meaningful experiences, Ann’s responses to this song gradually declined.
Her long and deep laughs gave way to short laughs and smiles with very little body movement. Thus, interest
in this song receded.
There is a significant ending to the shared life of the train song. During the second verse of the
song, Ann got up from her chair and went to the exit door where she remained until the end of
the song. I had informed Ann that this would be our last meeting. The performance of the song
contained all the expressive and emotional elements Ann had brought into the song over the past
weeks. She must have understood the meaning of ‘closure’. She expressed this understanding by
leaving the therapist and the song, before they left her.
Fig. 19.2 As the relationship between therapist and client develops, the songs reflect the child’s
mood and feeling and promote her self-awareness (Ann and Cochavit—for a description of therapy
with Ann, see Case study 19.2). (See also colour plate 5.)
This case demonstrates the emergence and decline over time of an emotional and communicative
relationship, a process reflecting acknowledgement and acceptance of companionship
through the medium of a structured song. This communication of mutual awareness and
pleasure in one another’s company resembles a typical baby game that changes as parent and
infant learn one another’s performance and expectations of pleasure. It shows that intimacy
of experience can be achieved with children with Rett syndrome through songs in a controlled
and responsive music therapy approach. This case also underlines how a composed song offers a
stable foundation of musical structure for supporting the initial stage of a therapeutic relation-
ship, and how a prescribed form of narrative, once it becomes too familiar, can be restrictive. The
music therapist who intends to communicate through the song can be too committed to the
form, lyrics and structure of the song. To invite a lively relationship and to explore the potential
emotional space, a less firmly structured interaction may be needed. Improvisational music ther-
apy for girls with Rett syndrome should offer an open container—one that can take up and
develop musical qualities, with variations in tempo and rhythm, vocal play, and dynamic change,
choosing what proves to be appealing to them. Individuals with Rett syndrome display a rich
emotional palette, and a skilled music therapist has much to work with to reach the child and
communicate with her, so they can join their expressions in an interactive musical duet that
brings both of them pleasure.
19.5 Music therapy for assessment of girls with Rett syndrome, and as support for other therapists and teachers
The aim in music therapy is to build a musical relationship with a client and, within that
relationship, to find ways of fulfilling their emotional and communicative needs, helping them
develop their vitality and well-being. An assessment is made of a person’s general response to
music and musical expression—finding which instruments they are most responsive to, if they
respond more to vocal sound, and how they react to changes in frequency, rhythm, tempo and
volume. Observation is made of how they accept turn-taking, sharing instruments, musical
improvisation (both tonal and atonal), and what happens if the therapist mirrors or reflects the
various musical sounds made by the client.
Fig. 19.3 Ann’s evolving responses to ‘The Train Song’ over a number of presentations.
[Figure: score of the song ‘Train’ (Allegro = 132, mf; chords A, D and E7; lyric ‘Puff Puff, Toot Toot,
off we go’) above a chart of Ann’s responses in sessions 1 to 6 (11/6, 14/6, 21/6, 1/7, 5/8 and 15/9),
with repeated presentations within sessions. Key: S & L = Smile & laugh; L = Laugh; BS = Big smile;
further symbols mark ‘Toot Toot’, Walk toward door, Looks up at therapist and Long laughter.]
Women and girls with Rett syndrome tend to display intense anxiety in unfamiliar places, and
with unknown people and events. Those who come to a Rett Therapy Clinic naturally feel as
though they are on display—the object of discussion—and perhaps they are aware of their own
severe disabilities in the face of all of the ambulant, normally functioning adults around them.
In their past experience, adults have often played the role of ‘inspectors’,
as the girls underwent a multitude of tests at medical facilities.
Free improvisation in music therapy is a disciplined technique, and is intentionally an open
or receptive process in which interactively meaningful music can be created that the client
can understand and from which support can be gained. By conveying the feeling of a musical
phrase, a rhythm or a melody, a musician can help a client with extremely limited capacity
for response to feel contact at a level they can appreciate and share. Thus, understanding and
reflecting the feelings of the client becomes a crucial part of the process—first of assessment, and
then of therapy.
A National Rett Therapy Clinic has been operating in the UK since 1992, initially as a clinic
held four times a year in the Harper House Children’s Service of Horizon NHS Trust, and since
1998 as a regional clinic under the Wolfson Centre of Great Ormond Street Hospital for Sick
Children in London. A team of specialists, including a paediatrician, a physiotherapist, and
speech, language, occupational and music therapists, undertake multidisciplinary assessment to
advise at the tertiary level on the therapeutic management of Rett syndrome.
The Israeli Rett Centre also implements a multidisciplinary approach. Both teams have found
that it is best for the client and the therapeutic programme if the assessment starts with a music
therapy session, to greet the child and invite her to respond. This approach of using musical com-
munication helps the child to overcome her rejection of an unfamiliar place and promptly gives
the team a picture of the girl as a functional, communicative individual, helping the other thera-
pists by providing a wide range of evidence to inform their own specialist practice (Figures 19.4,
19.5 and 19.6).
We have developed musical assessment with this population to demonstrate how communica-
tion and openness to a child’s experience can be achieved through musical interaction, leading to
clinical engagement (Wigram 1991, 1995). Depending on the needs of individual girls, they may
then have feeding, communication and physical assessments, with physiotherapy, occupational
therapy and medical treatment. When different practitioners work together in this way, a problem
can be viewed in several different sessions from different perspectives. For example, communica-
tion may be examined through the lenses of music therapy and the feeding assessment; hand use
can be observed in music therapy, physiotherapy and occupational therapy assessments.
The process of assessment is not the same as that of therapy (Wigram 1991, 1995). In assessment,
one is attempting to gather in a short period of time a significant amount of information that
will help in the making of decisions about the future treatment of a client. The aim is to obtain an
overview of the client’s abilities and interests—to gain a whole impression of the client as a
person, not only their functional and physical disabilities, but their motivation, attention, interest
in their environment, and readiness for communication. In cooperation with other members of
the assessment team, the music therapist looks for and tests the following abilities:
◆ gross and fine motor skills
◆ attention, attention span and focus
◆ concentration and general awareness
◆ general non-verbal interaction—turn-taking, eye contact
◆ expressive and receptive communication skills
◆ areas of personal interest and motivation.
Fig. 19.4 During assessment: Cochavit dramatizing a story with Ella. (See also colour plate 5.)
Fig. 19.5 During assessment: Cochavit dramatizing a story with Yaffa. (See also colour plate 5.)
Although each client with Rett syndrome presents with individual needs that require varied
specialist attention, these are the six essential topics for all forms of assessment for clients who
have developmental disability.
Fig. 19.6 During assessment: Cochavit develops a musical relationship with Ella. (See also
colour plate 5.)
At the Harper House Clinic, the music therapy session almost always starts at the piano. Since
the Israeli Rett syndrome evaluations are carried out at each child’s educational centre, the therapist
uses a guitar or a keyboard. At both places, the therapist starts by improvising gently, trying
to do this sympathetically, intentionally reflecting the feelings of the client at the beginning of
the session. When the child seems to have opened up and attuned to the music therapist, a short
song is introduced to express welcome or to say, in musical terms, ‘Hello!’. In the next stage, the
child is encouraged to take part in a musical interaction—for instance, they are invited to put
their hands on the keys of the piano, or on a guitar or tambourine, and the therapist begins to engage
them, helping them to improvise on the musical instrument by moving their hand. A young child
will probably be sitting on the knees of another member of the team, usually either the physio-
therapist or the occupational therapist.
In many cases, the guitar is the instrument of choice, given the child’s motor disabilities.
A gentle, sustained sound can easily be coaxed from the guitar. It is important when inviting
actions on the piano or the guitar to encourage sustained sounds, to aid the child’s awareness
that they have made the sound, and that only a small amount of movement can create a rich and
sustained sound. Music-making books or audio tapes brought by the parents might be used, and
sometimes the music therapist will find it helpful to use instruments such as a drum or tam-
bourine.
If a drum is used, the child is first seated in front of it with their hands resting on it, and
they are encouraged to move gently on the instrument when it is struck, so that they can experi-
ence the vibration through their hands. Then, if they have not already begun to do so, they are
prompted or supported to touch and hit the surface of the drum with their dominant hand, or
with both hands together. If work had begun on the piano, we would first have found out how
the child wants to play, and which hand is dominant. Drum playing gives a combined tactile and
auditory experience for the client; the stimulation felt through the surface of the instrument can
be interesting enough to attract the attention of a child to the feeling of the sound and lead to
anticipation of this sensation.
Other instruments might also be introduced to enliven the communication, such as wind
chimes—a sequence of brass bars suspended on a wooden frame—which make a sustained,
attractive sound. Amplification by a microphone can support any vocal sounds that might lead to
the more articulate communicative skills we want to encourage.
When a communicative musical engagement has been achieved, and the child and music
therapist have become friends, the music therapist continues to support the child’s interactions
with other members of the multidisciplinary team in the evaluation process.
Case study 19.3: Claire

Claire is a 6-year-old girl with Rett syndrome. She is non-verbal, her spine is developing a scoliotic curvature,
and she cannot stand unsupported. During the initial interview, Claire was twisting her hands together and
occasionally clasping them. Her gaze was vacant and her body was moving constantly. She appeared unaware
of the people in the room, and that they were talking about her.
The first assessment was through a 30-minute session in music therapy, observed by the parents and the
rest of the team through a video link. Her responses, reported by the therapist, Tony Wigram, in the first
person, were confirmed by an analysis of the video recording.
Claire was sitting on my knee at the piano, and I was playing the ‘Good morning’ song gently to her on
the instrument. She seemed very aware, and began to rest her hands on the keys of the piano. She was
moving her right hand in order to play handfuls of notes. The song allowed pauses between phrases,
and Claire hit the piano keys one or two times during each pause, turning to look at me. Her timing in
this turn-taking was musically appropriate.
Claire started to vocalize, making calling and laughing sounds in response to a simple song I was
singing to her, ‘Claire—let’s play’. I was improvising on the piano, first with a single note, then with a
repetitive melodic phrase. I was leaning with my head against the piano so that Claire would look at me
while she was playing, since her gaze tended to wander and fixate on a light. There were many pauses in
this improvisation, in which Claire made vocal sounds or responded by playing with her hands on the
keys of the piano. At one point, she leant on the piano and stopped. Fast, accelerating music immediately
provoked her back to playing. Her vocal sounds also increased, and up to this point in the assessment her
previously continuous hand clasping and plucking had ceased, as both hands were busy on the piano.
As I transferred my playing to the guitar, Claire maintained consistent eye contact with me, smiling
and laughing and producing vocal sounds. I was singing to her, improvising melodies, changing the
quality and timbre of my voice to stimulate her, and she was very amused and responsive to this. The
physiotherapist held Claire’s pelvis steady and supported her upper arm, at the same time stabilizing
her shoulder; this allowed Claire to feel secure enough to use her right hand to touch and strum the
strings of the guitar.
When we moved to the large tympani drum, Claire initially rested her hands on the instrument.
I beat gently on the drum with the open palm of my hand, then moved up her arms to tap a rhythm.
This focused her attention and she immediately began to play, with both hands, which surprised her
parents and me, as she rarely uses her right hand. My vocalizing with her drum playing was apparently
effective in keeping her interest, because she sustained both eye contact and motivated playing.
Subsequently, I gently restrained her left hand for a short time to encourage her to use her right hand,
which she spontaneously continued to do when restraint was removed. Her grip is usually poor, but
she persevered with her right hand, and on two successive occasions beat once on the drum with a
short-handled beater, followed by two beats on the third attempt.
The musical engagements in all of the activities described above, instrumental and vocal, are
examples of turn-taking dialogues. Claire was particularly responsive to this, and there
was increasing immediacy in her taking her turn as she settled into the musical patterns of
the interaction. Analysis of the video revealed that Claire was demonstrating good anticipation skills
and a well-developed awareness of cause and effect and of object permanence.
By seating Claire on the knee of the physiotherapist who was holding her hips and torso, we
could evaluate the extent of the hypotonia of her trunk (back weakness) and determine how
much support she needed to maintain her in a functional sitting position. This informed the sub-
sequent physiotherapy and occupational therapy assessment, leading to recommendations
regarding seating and posture, and the use of a spinal brace. Claire’s response to having her left
hand held down, which improved function of her free right hand, allowed predictions to be made
about the potential efficacy of intermittent splinting of one arm for improving hand function in
the other limb.
The music therapy assessment revealed important indicators of latent communicative respon-
siveness and intentionality, and showed Claire’s sense of humour, her understanding of and
capacity to participate in turn-taking and sharing, as well as a range of vocal skills that could be
developed, better understood and responded to. She evidently could be attentive to what was
happening, even if she was not looking directly at it. Besides the evidence of a significant poten-
tial for developing her physical skills and extending the range of her movements, Claire’s vocal-
ization and laughter proved a lively capacity for emotional expression. The most valuable lesson
was evidence of Claire’s communicative abilities. This interaction continued for some 18 minutes,
a lengthy time to sustain a communicative musical engagement at first meeting between two
persons, one of whom has a severe communication delay.
In this example we see a severely handicapped child with Rett syndrome, who has no expressive
language, ‘talking’ with an attuned music therapist, sharing the impulses and emotions of
communicative musicality.
19.6 Conclusion: music supports the foundations of communication

Music therapy enables one to take a view of the whole person. We believe the development of any
child’s activity is fundamentally a musical process, and research and clinical reports demonstrate
that the experience of being active in music can be an effective therapeutic tool—one that draws
on the motives of innate musicality. In the practical music-making sessions, much significant and
exciting improvement is obtained because the environment of music therapy inspires a child
with the desire and motivation to do many things. We are consistently and pleasurably surprised
at the amount of communication, engagement, functional physical activity, lack of resistance,
prosocial behaviour, and lively emotional expression that can be seen in the course of both struc-
tured and unstructured musical experiences with severely handicapped children.
Music therapy, however, is not always a time of creative joy and laughter. For children and adults
who come to music therapy struggling with a serious disability or disorder, it is also a time when
they may express fear, pain, anger and frustration. Then there is a further need for the therapy ses-
sion to offer a space in which such emotions will be accepted (Robarts, Chapter 17, this volume).
While many clinical examples recorded on video show the pleasure that the children and adults
can obtain from making sounds and developing dialogues, both in music and in other preverbal
methods of contact, there are moments when they appear sad, withdrawn and perhaps
even unable to cope with the situation. For this experience, they need support, nurturing,
and sympathetic response that will help them appreciate that the therapist can understand their
needs and their feelings, and share them. Sympathy is naturally mediated by communicative
musicality, which is grounded in the rhythmic engagement of motives and emotions.
Music is a universal human form of communication that has the capacity to overcome linguis-
tic, physical, mental and cognitive barriers to understanding with others. Mother and infant
communicate by the exchange of coordinated expressions by touch, sound and vision. Even at
birth, an infant shows the foundational impulses of communication, seeking eye contact and
vocalizing in synchrony with his or her parent’s infant-directed speech. When severe develop-
mental disability is evident from birth or at a very young age, the child and parent might not be
able to interact and share their motives and emotions in harmonious ways. This, in turn, may
affect the child’s development of a confident and able ‘self ’. The child may appear to others as
mute, unemotional and with little understanding of surroundings.
In its professional use as therapy, music can be employed as a prepared or composed container,
setting boundaries, structuring, supporting and guiding the client. On the other hand, it can offer
a free space for expressive improvisation—one that can mirror the client’s feelings beyond the
boundaries of his or her disability, allowing the two human beings to share company as equals.
When the skilled music therapist prepares an inviting environment, presents the client with
appropriate opportunities, and stays attuned to the client’s needs and abilities, the stage is set for
a live and intimate conversation.
We have shown in this chapter that the principles of human motivation called communicative
musicality can be harnessed as a powerful tool that can aid even the emotionally barricaded
child with autism to converse on the piano with his music therapist. We have seen the severely
handicapped child with Rett syndrome become able to freely ‘chat’ with the singing of her music
therapist when the setting gives her responsive support. When meeting a speechless client, the
music therapist has the tools to promote communicative musicality, thus enabling a person to
give their meaning a sound, and to sense that it has been received.
References
Aasgaard T (2005). Assisting children with malignant blood disease to create and perform their own songs.
In F Baker and T Wigram, eds, Songwriting: Methods, techniques and clinical applications for music
therapy clinicians, educators and students, pp. 154–180. Jessica Kingsley Publishers, London.
Aldridge G (1999). The implications of melodic expression for music therapy with a breast cancer patient.
In D Aldridge, ed., Music therapy in palliative care, pp. 135–153. Jessica Kingsley Publishers, London.
Alvin J (1975). Music therapy, revised edn. John Clare Books, London.
Amir RE, Van den Veyver IB, Schultz R et al. (2000). Influence of mutation type and X chromosome inacti-
vation on Rett syndrome phenotypes. Annals of Neurology, 47, 670–679.
Braithwaite M and Sigafoos J (1998). Effects of social versus musical antecedents on communication
responsiveness in five children with developmental disabilities. Journal of Music Therapy,
35(2), 88–104.
Bruscia K (1987). Improvisational models of music therapy. Charles C Thomas, Springfield, IL.
Bunt L (1994). Music therapy: An art beyond words. Routledge, London.
Burford B (2005). Perturbations in the development of infants with Rett disorder and the implications
for early diagnosis. Brain and Development, 27(Suppl. 1), S3–S7.
Burford B and Trevarthen C (1997). Evoking communication in Rett syndrome: Comparisons with
conversations and games in mother–infant interaction. European Child and Adolescent Psychiatry,
6(Suppl. 1), 26–30.
Edgerton CL (1994). The effect of improvisational music therapy on the communicative behaviors of
autistic children. Journal of Music Therapy, 31, 31–62.
Einspieler C, Kerr AM and Prechtl HF (2005). Abnormal general movements in girls with Rett disorder:
The first four months of life. Brain and Development, 27(Suppl. 1), S8–S13.
Elefant C (2001). Speechless yet communicative: Revealing the person behind the disability of Rett
syndrome through clinical research on songs in music therapy. In D Aldridge, G Di Franco, E Ruud and
T Wigram, eds, Music therapy in Europe, pp. 113–128. ISMEZ, Rome.
Elefant C (2002). Enhancing communication in girls with Rett syndrome through songs in music therapy.
Unpublished Ph.D. thesis, Aalborg University.
Elefant C (2004). The use of single case designs in testing a specific hypothesis. In D Aldridge, ed.,
Case study designs in music therapy, pp. 145–162. Jessica Kingsley, London.
Elefant C and Lotan M (1998). Music and physical therapies in Rett syndrome: A transdisciplinary
approach (In Hebrew). Issues in Special Education and Rehabilitation Journal, 13(2), 89–97.
Elefant C and Wigram T (2005). Learning ability in children with Rett syndrome. Journal of Brain and
Fraisse P (1982). Rhythm and tempo. In D Deutsch, ed., The psychology of music, pp. 149–180. Academic
Press, New York.
Gold C, Wigram T and Elefant C (2006). Music therapy for autistic spectrum disorder (Cochrane Review).
The Cochrane Library, Issue 2 2006. John Wiley and Sons Ltd, Chichester, UK.
Hadsell NA and Coleman KA (1988). Rett syndrome: A challenge for music therapists. Music Therapy
Perspectives, 5, 52–56.
Hagberg B, Aicardi J, Dias K and Ramos O (1983). A progressive syndrome of autism, dementia, ataxia,
and loss of purposeful hand use in girls. Rett’s syndrome: Report of 35 cases. Annals of Neurology,
14, 471–479.
Hagberg B, Anuret M and Wahlstrom J (eds) (1993). Rett syndrome – clinical and biological aspects, Clinics
in developmental medicine, No. 127. Mac Keith Press/Cambridge University Press, London and
Cambridge.
Hallan Tønsberg GE and Hauge TS (1998). A response to ‘Music as a tool in communications research’.
Hargreaves D (1990). The developmental psychology of music. Cambridge University Press, Cambridge.
Hauge TS and Tønsberg GE (1998). Musikalske Aspekter i foerspraaklig samspill. Skaadalen Publications
Series No. 3. Oslo: Skaadalen Resource Centre, The Research and Development Unit.
Holck U (2002). ‘Kommunikalsk’ Samspil i Musikterapi [‘Commusical’ interplay in music therapy. Qualitative
video analyses of musical and gestural interactions with children with severe functional limitations,
including children with autism]. Unpublished Ph.D. thesis, Aalborg University.
Holck U (2004). Turn-taking in music therapy with children with communication disorders. British
Kerr A and Witt Engerström I (eds) (2001). Rett disorder and the developing brain. Oxford University Press,
Oxford.
LeBlanc A (1981). Effects of style, tempo, and performing medium on children’s music preference. Journal
of Research in Music Education, 29, 28–45.
LeBlanc A and Cote R (1983). Effects of tempo and performing medium on children’s music preference.
Journal of Research in Music Education, 31, 57–66.
LeBlanc A and McCrary J (1983). Effect of tempo on children’s music preference. Journal of Research in
Music Education, 31, 283–294.
1999–2000), 29–57.
Martin J (1972). Rhythmic (hierarchical) versus serial structure in speech and other behavior. Psychological
Review, 79, 487–509.
Merker B and Wallin NL (2001). Musical responsiveness in Rett disorder. In A Kerr and IW Engerström,
eds, Rett disorder and the developing brain, pp. 327–338. Oxford University Press, Oxford.
Montague J (1986). Music therapy in the treatment of Rett syndrome. Publication of the National Rett
Syndrome Association, Glasgow.
Müller P and Warwick A (1993). Autistic children and music therapy. The effects of maternal involvement
in therapy. In M Heal and T Wigram, eds, Music therapy in health and education, pp. 214–243.
Nomura Y, Kerr A and Witt Engerström I (2005). Rett syndrome; Early behavior and possibilities for
intervention. Proceedings of the 2nd International Scientific Research Workshop – from Basic Neuroscience
to Habilitation and Treatment – Infant Behavior. Brain and Development, 27(Suppl. 1), S101.
Nordoff P and Robbins C (1977). Creative music therapy. Harper and Row, New York.
O’Brien E (2005). Songwriting with adult patients in oncology and clinical haematology wards. In F Baker
and T Wigram, eds, Songwriting: Methods, techniques and clinical applications for music therapy clini-
cians, educators and students, pp. 180–206. Jessica Kingsley Publishers, London.
Oldfield A (2004). Music therapy with children on the autistic spectrum, approaches derived from clinical
practice and research. Unpublished Ph.D. thesis, Anglia Polytechnic University, Cambridge
Pavlicevic M (1997). Music therapy in context: Music, meaning and relationship. Jessica Kingsley Publishers,
London.
Perry MR (2003). Relating improvisational music therapy with severely and multiply disabled children to
communication development. Journal of Music Therapy, XL(3), 227–246.
Plahl C (2000). Entwicklung fördern durch Musik. Evaluation Musiktherapeutischer Behandlung.
[Development through Music. Assessment of Music Therapy Treatment.] Unpublished Ph.D. thesis,
1999. Waxman, Münster.
Priestley M (1994). Essays on analytical music therapy. Barcelona Publishers, Phoenixville, PA.
Rett A (1966). Über ein eigenartiges hirnatrophisches Syndrom bei Hyperammonämie im Kindesalter
(On an unusual brain atrophic syndrome with hyperammonemia in childhood). Wiener Medizinische
Wochenschrift, 116, 723–726.
Robarts JZ (1998). Music therapy for children with autism. In C Trevarthen, K Aitken, D Papoudi and
J Z Robarts, eds, Children with autism. Diagnosis and interventions to meet their needs, pp. 172–202.
Ruud E (1998). Music therapy: Improvisation, communication, and culture. Barcelona Publishers, Gilsum,
NH.
Schögler B (1998). Music as a tool in communications research. Nordic Journal of Music Therapy, 7(1), 40–49.
Sims WI (1987). Effect of tempo on music preference of preschool through fourth grade children.
In C K Madsen and C A Prickett, eds, Applications of research in music behaviour, pp. 15–25.
The University of Alabama Press, Tuscaloosa, AL.
Sloboda J (1990). Music as a language. In F Wilson and F Roehmann, eds, Music and child development,
pp. 28–43. MMB Music Inc., St. Louis.
mother and infant by means of inter-modal fluency. In Field TM and Fox NA, eds, Social perception in
Stern DN (2000). The interpersonal world of the infant. Basic Books, New York.
Trevarthen C (2001). Intrinsic motives for companionship in understanding: Their origin, development
and significance for infant mental health. Infant Mental Health Journal, 22(1–2), 95–131.
In R MacDonald, J David, DJ Hargreaves and Dorothy Miell, eds, Musical identities, pp. 21–38.
Trevarthen C and Burford B (2001). Early communication and the Rett disorder. In Alison Kerr and
Ingergerd Witt Engerström, eds, Rett disorder and the developing brain, pp. 303–326. Oxford University
Press, Oxford.
New York.
Wesecky A (1986). Music therapy for children with Rett syndrome. American Journal of Medical Genetics,
24, 253–257.
Wigram T (1991). Music therapy for a girl with Rett’s syndrome: Balancing structure and freedom.
In K Bruscia, ed., Case studies in music therapy, pp. 39–55. Barcelona Publishers, Gilsum, NH.
Wigram T (1995). Assessment and diagnosis in music therapy. In T Wigram, B Saperston and R West, eds,
The art and science of music therapy: A handbook, pp. 181–194. Harwood Academic Publications,
London/Toronto.
Wigram T (1999). Assessment methods in music therapy: A humanistic or natural science framework?
Wigram T (2002). Indications in music therapy: Evidence from assessment that can identify the
expectations of music therapy as a treatment for autistic spectrum disorder (ASD): meeting the
challenge of evidence based practice. British Journal of Music Therapy, 16(1), 11–28.
Wigram T (2004). Improvisation: Methods and techniques for music therapy clinicians, educators and
students. Jessica Kingsley Publications, London.
Wigram T, Nygaard Pedersen I and Bonde LO (2002). A comprehensive guide to music therapy. Theory,
clinical practice, research and training. Jessica Kingsley Publications, London.
Woodyatt G and Ozanne A (1992). Communication abilities and Rett syndrome. Journal of Autism and
Developmental Disorders, 22, 155–73.
Woodyatt G and Ozanne A (1994). Intentionality and communication in four children with Rett
syndrome. Australia and New Zealand Journal of Developmental Disabilities, 19, 173–183.
Part 4
Musicality in childhood learning
My aim is to show, although this is not generally attended to, that the
roots of all sciences and arts in every instance arise as early as in the
tender age, and that on these foundations it is neither impossible
nor difficult for the whole superstructure to be laid; provided
always that we act reasonably as with a reasonable creature.
John Amos Comenius (1633), The School of Infancy.
Quoted by Quick (1894)
As infants grow into children, so their innate musicality develops new forms through their accept-
ing ideas of the culture that surrounds them. This process has started inside the womb with
their first listening to the sounds from the surrounding environment. However, the process of
transformation to an awareness of meaning takes hold in earnest when the child begins formal edu-
cation—a system that will be more or less accepting of the child’s innate will to learn in activity.
If we are born to dance and sing, then we could think it is incumbent on our education system to
encourage this dancing, singing humanity, especially in the teaching of the temporal arts, nurturing
‘the muse within’ as the Norwegian musicologist and teacher Jon-Roar Bjørkvold eloquently
urges us to do (Bjørkvold 1992). Unfortunately, the early memories of learning music for too
many adults are either of being told to keep still and pay attention while they were instructed in the
mechanics of music theory, which appear to have very little to do with the musicality of a lively
body and mind, or of chaos, as a teacher did encourage sound
and movement but failed to provide any satisfying form into which the musical energy could be
channelled and shared.
The authors of Part 4 provide an alternative scenario for childhood education, whatever the
discipline. These authors propose ways of teaching children that honour their musical creativity,
grounded in their innate communicative musicality. From that starting point, children can
be introduced into the richness of their cultural heritage. Frederick Erickson (Chapter 20) demon-
strates that musicality is a vital engine of the teaching exchange, regardless of the topic being taught:
The musicality of social interaction functions to enable participants to coordinate their attention and
action and to signal interest in various kinds of information to one another. This is true for ordinary
informal conversation and for conversation that takes place in more formal circumstances, such as in
classrooms … Meaning is always situated meaning, pointing simultaneously to referential and social
or interpersonal contents.
It is the musicality of the teacher–pupil exchanges that supports the pupil’s motivation to engage
and learn.
Charlotte Fröhlich (Chapter 22) and Lori Custodero (Chapter 23) both paint pictures of
adult–child musical interactions where the musicality of the child is the starting point.
In my role as teacher [writes Charlotte Fröhlich] I attempt to find teaching principles, not methods of
instruction, that make artistic experiences possible without nipping them in the bud. My purpose is
to elucidate artistic experience for both children and adults, and thus to support and empower their
delivery in the classroom … However, if these principles are adopted, a by-product is that we, as teachers,
will need to accept what seem to be detours in our teaching.
In the last sentence she is warning that if a teacher truly wishes to creatively engage with
children’s musical exuberance, then the course of a lesson will not be predictable!
Lori Custodero documents the exploratory improvisational and social exchanges of two adult
musicians as they spontaneously create music. Communicative musicality inhabits the intimacy
between them: ‘Through the compelling qualities of organized sound and the rewards of making
such sound, we are drawn to others through revelations of our common humanity, and to
artistry through realizations of uncommon accomplishments’ (page 513). She describes her
observations of children’s spontaneous musical play, and concludes that
… spontaneous musical behaviours that are part of a child’s private world are often ignored or misin-
terpreted as intentional disruption or nondirected behaviour … [But] perhaps because it is present so
early in life, music provides ways for children to be experts early on, thereby providing the substance
for sharing in more equitable ways with adults, who tend to control much of children’s lives.
(Pages 520–521, this volume)
Nicholas Bannan and Sheila Woodward (Chapter 21) provide a thorough review of the literature
on musicality in infancy leading to childhood: ‘We focus … on how the infant’s instinctive
participation in musical behaviour may be fostered throughout childhood, and can develop into
mature expressive musicianship’ (page 465). By presenting the evidence for the continuity
between the musicality of infancy and childhood, the authors provide strong theoretical reasons
for treating the musicality of children with respect and with a wish to nurture. The innate
musicality of childhood is too precious to be nipped in the bud by adults who value highly the
elaborated form of culture but overlook the very aliveness in human nature that brought forth
our culture in the first place.
References
Comenius JA (1633). The school of infancy: An essay on the education of youth during the first six years.
Translated by D Benham. London, 1858. Republished in 2003 by Kessinger Publishing, Whitefish MT.
Quick RH (1894). Essays on educational reformers. Longmans, Green and Co, London.
Chapter 20
Musicality in talk and listening:
A key element in classroom discourse
as an environment for learning
Frederick Erickson
20.1 Introduction: how classroom talk and listening is musical

I will begin by reviewing some basic principles. I assume as a premise that human social interac-
tion is organized musically, i.e., speaking and listening behaviour is performed in real time in
patterns of regular rhythm, and pitch and volume changes in speech are part of the same verbal
and non-verbal timing frame. This ‘musicality’ is what linguists call ‘speech prosody’. The musi-
cality of social interaction functions to enable participants to coordinate their attention and
action and to signal interest in various kinds of information to one another (Auer et al. 1999;
Scollon 1982). This is true for ordinary informal conversation and for conversation that takes
place in more formal circumstances, such as in classrooms.
Musicality gives emphasis and contrast in the performance of talk. This is behaviourally realized
in three aspects of vocal sound production: volume, pitch, and quality or timbre. These behav-
iours sketch patterns of timing and of emotional key across connected strips of talk that range
from a single breath group within one speaking turn (e.g., a shift between irony and seriousness
across successive breath groups in a single utterance) to forms of sound contour (Gestalten) that
are sustained across a number of turns as these are exchanged among multiple interlocutors.
Expression in the voice is coupled with the prosody of gestures, the shifts of direction in gaze, and
the maintenance and change of postural positioning and interpersonal distance during face-to-face
interaction. These visible aspects of musicality in movement can be thought of as being more akin
to dance than to song. Although this kinesic (body motion) prosody will not be treated as a
primary focus in the following discussion, it bears mention here, because the points of contrast
and emphasis that are performed kinesically (including the onsets and offsets of gaze toward other
interlocutors) contribute to the overall gestalts of timing and emotional keying in speech.
The discussion that follows is guided by four fundamental theoretical insights about human
social interaction. The first comes from Gregory Bateson, concerning the nature of communica-
tion in face-to-face social interaction (Bateson 1954). Bateson claims that the behavioural
conduct of social interaction communicates information continually about the state of the
momentary relations among interactional participants at the same time as it imparts referential
information. Thus every ‘bit’ of referential message is accompanied by a meta-message bit
concerning social relations in the interaction at that present moment. Considering the importance
now given to intersubjective foundations of language (e.g., Rommetveit 1998), which aspect
should be considered ‘message’ and which should be considered ‘meta’ may need rethinking.
Yet Bateson’s crucial point remains pertinent: every effective message contains in its very
behavioural form of expression cues that point indexically toward its interpretation, both in
terms of literal referential meaning and metaphoric social meaning. Conversely, no communica-
tive act occurs decontextualized from the intentions of its utterer, i.e., without pointing from
within itself to the situational context within which that communicative act makes sense. Thus,
in human communication, meaning is always situated meaning, pointing simultaneously to
referential and social or interpersonal contents.
The second, more abstract and mechanistic, theoretical insight comes from H. A. Simon, con-
cerning inherent limits on human information processing in real time. Humans are not neurologi-
cally capable of processing the unreduced complexity of information signals to which we are
exposed (Simon 1979; Newell and Simon 1972). To prevent being continually flooded with more
information bits than we can process, we simplify the informational surround by ‘disattending’ to
some of the bits, and by ‘chunking’ the potentially separable bits into larger units or sets.
The third theoretical insight comes from the work of those who study communication behaviour
itself (e.g., Sacks et al. 1974), and from phenomenological philosophy (e.g., Merleau-Ponty
1945/1962) concerning the priority of the intending subject’s experience of the present moment in
the real-time conduct of communication. The present moment establishes a centring point (now
and here) from which we anticipate, at the leading edge of a temporal horizon before us, a ‘next’
moment to come, and from which we recall (at the following edge of the temporal horizon behind
us) a ‘prior’ moment that has just occurred. From the materialist perspective of Newell and Simon
and the theory of limited information processing capacity, all we can process cognitively is a selection
from the information bits impinging on us in the present moment, as well as a few we recall from the
past moment and a few that we anticipate may occur in an immediately next moment. This exhausts
our attentional capacity. In other words, we cannot pay attention continuously—there must be an
ebb and flow in our attention across strips of real time. (For a theory of musicality as prospective
control of action in mind-created time, see Lee and Schögler, Chapter 6, this volume.)
My assumption is that rhythmic patterns of speech and body motion point us to next moments
that are of particular importance informationally. Each successive now moment is a moment in a
process of created real time. I have described elsewhere the implications for communication of
the phenomenological prominence of each now moment within the continuous stream of our
individual experience of real time:
The construction of oral texts in talk [involves] fundamentally linear processes … We can have a gen-
eral sense of where the talk is going next because of our experience in similar events in the past …
[But] we can never be absolutely sure where the interaction will turn … We can get hints of this and
give them to others, verbally and nonverbally, but the hints are sketchy and ambiguous … thus now
and ‘next’—next moment back in time and next moment forward … provide a set of fundamental
building blocks for the construction of interaction.
Erickson (2004, pp. 3–5)
Especially when we are one of the focal speakers of the moment, as we react in our speaking to
the reactions of our listeners, those listeners are also reacting to us. Thus, speaking and listening
are reflexively related in an ecology of mutual influence. In the ‘now’ of the immediately present
moment, all the parties engaged in interaction are adjusting their actions to one another in the
light of what they perceive the others to be doing at that moment as well as in the light of what
others were perceived to be doing in the moment just past.
The continual process of mutual checking and mid-course correction is what makes interaction social,
i.e. it enables the actions of various parties to fit together as reciprocal and complementary … [this] is
a process that not only takes place within the real-time conduct of the interaction but underlies or
enables it.
Erickson (2004, pp. 3–5)
Let us consider an aspect of the social ecology of interaction that is especially characteristic of
school classrooms—a fundamental asymmetry between speakers in their communicative rights
and obligations. In contrast to the division of interactional labour in casual conversation, one of
the participants in classrooms, the teacher, assumes control of both the social processes of inter-
action and of its main referential informational content—the successive topics and discourse
sequences in talk (cf. Sinclair and Coulthard 1975; Mehan 1979; Cazden 2001). This asymmetry
is not unique to classrooms. It also happens in such settings as courtrooms and clinic visits with a
physician. But as we will see when we consider examples of classroom interaction, the teacher’s
control of both the content and the conduct of the official ‘on the record’ talk that occurs is a
distinctive aspect of the organization of classroom discourse.
Talk in social interaction always constitutes a potential learning environment, with
three dimensions or aspects of its fundamental organization: action that communicates literal
meaning, action that communicates social meaning, and the semiotic means (signal systems and
language) by which meanings themselves are communicated. In classroom talk, this process is
controlled by the teacher to direct the children’s learning. At each successive now moment in time
during the course of interaction, all three of these aspects of organization operate together
(Figure 20.1).
Figure 20.1 shows on its vertical axis a succession of present moments in real time (i.e., now
moments) T-1 to T-n. On its horizontal axis are shown the three aspects or dimensions which, in
the experience of particular moments in real time, constitute the ‘meaning’ of the moment:
literal, combined with social, and the features within signal systems (e.g., a phoneme, an iconic
gesture, a written word or numeral displayed on a chalkboard) by which those meanings are
realized together in communicative performance.
In ordinary conversation, at any moment, attention can be focused mainly on the literal/refer-
ential aspect of meaning or on the social aspect, or on both simultaneously. In classroom conver-
sation that may also be true, but the primary focus of attention in classrooms is often on the
semiotic signal system itself as a means of representing subject matter or content information.
Thus, for a few moments in time—for instance, during mathematics instruction—deliberate
attention may focus on details of mathematical symbols; during literacy instruction, attention
may focus on details of letters, punctuation marks, or sentence grammar. This foregrounding of
the semiotic means of representation itself is not unique to classroom discourse, but it seems
to occur more frequently in classrooms as part of the technique of instruction than in casual
conversation outside classrooms.
The teacher uses vocal musicality (as well as gesture, gaze, and postural positioning) to
summon students’ collective attention to crucial now and next moments in the communicative
behaviour stream, and in the collaborative thinking that is going on—moments at which impor-
tant new information will be provided. The teacher uses explicit and implicit cues and formulaic
utterances to direct collective attention to matters of subject matter content and to matters
of social relations. In large and small group conversations, students may also do analogous kinds
of things.
Real-time course     Literal    Social     Performed semiotic
of interaction       meaning    meaning    signal system feature(s)
T-1                  x          x          x
T-2                  x          x          x
T-n                  x          x          x
Fig. 20.1 Meaning dimensions in classroom learning environments.
20.2 Examples of teachers and young children talking in class

Here, I will present examples of teacher and student talk from early-grade classrooms in
the United States, in which the musicality of their uttering points implicitly to crucial moments
in which new information is being communicated verbally and non-verbally. In the first few
examples, the new information mainly concerns social participation relations; as the examples
continue, the new information involves social relations but also concerns subject-matter content.
All of the examples come from classrooms in which the children are from 5 to 7 years old. Basic
ideas and skills from the curriculum are being introduced to students and practised by them, and
at the same time basic forms of social participation in the classroom are being presented to and
repeatedly rehearsed with the students.
20.2.1 ‘Not a creature was stirring …’

The first example comes from a kindergarten/first grade classroom in which the age range
was from 6 to 7. On a day in the third week of December, right before vacation, the teacher was
reading aloud from a picture book. She was seated on a child-sized chair at the front of the
classroom, with the students sitting around her on a rug, in a circle formation. The teacher held
the book in her left hand with the pages facing out toward the children so they could see the
illustrations as she read aloud. Looking to her left at the pages she was about to read, the teacher
began to utter the first lines of the text, Clement Moore’s poem The night before Christmas.
In the transcripts that follow, two dots within a line indicate a comma-length pause and four
dots indicate a sentence-terminal length pause. A single dot indicates less than a comma-length
pause; in the following instance, this signifies an abrupt shift to an interpolated side comment, to
‘rein in’ one of the children.
’Twas the night before Christmas ..
And all through the house ..
not a creature was stirring.
LOUIE ….
Not even a mouse ….
It is worthwhile pointing out the (unintended) irony of the teacher’s reprimanding of the
student immediately after the line ‘not a creature was stirring’. The reprimand to Louie (who was
wriggling as the teacher had begun reading) was given force by the way in which the uttering of
the word ‘Louie’ deviated from the overall pattern of musicality in the reading of the poem text
aloud. The teacher read the opening lines of the poem with the prosody appropriate to them—
at a medium pitch level, with four feet per line and syllables mostly pronounced in triplets
(i.e., dactylic tetrameter).
Louie, a kindergartner (thus less experienced at sitting still in the circle formation than were
the first graders) had not yet settled down kinesically. He was moving back and forth on the
rug as the other children sat still, verbally and non-verbally. Accordingly the teacher interpolated
a reprimand, ‘LOUIE ..’ (see Figure 20.2). The volume-emphasized word ‘Louie’ was spoken at a
lower pitch than all of the other volume-emphasized, stressed syllables that had preceded it:
‘night’, ‘Christ-’, and ‘all’. The word ‘house’, said with a medium low–rise in pitch at the end of the
first line (technically this is called ‘continuing intonation’), was uttered at a slightly lower pitch
than the other stressed syllables had been, but still the pitch of ‘house’ was higher than that of
‘Louie’. Moreover, the two syllables of ‘Louie’ were spoken in a low–low intonation pattern, rather
than in the low–rise intonation pattern of ‘house’.
[Figure: musical transcription, with triplet markings, of ‘not a crea-ture was stir-ring’, the interpolated ‘LOUIE’, and ‘not e-ven a mouse’.]
Fig. 20.2 The teacher interpolated a reprimand.
Not only was the word ‘Louie’ set apart from the sound-gestalt of the words that preceded it by
being uttered at a distinctively lower pitch and with considerably louder volume, its sound
quality also was distinct: it was uttered with a more harsh, constricted-throat vocal quality. Non-
verbal emphasis accompanied this verbal emphasis; as the teacher said ‘Louie’, she looked to her
right, away from the book page and across the circle to where Louie was sitting, transfixing him
with a piercing glance. The word ‘Louie’ received even greater emphasis by being uttered
abruptly—slightly ahead of the cadential beat that had been previously established by the regular
timing pattern of the metric poetry lines. In this timing pattern, the intervals between each
successive volume-stressed syllable (‘night’, ‘Christ-’, ‘all’ and ‘house’) were almost identical in
duration. Thus, the stressed syllables established a cadence-like beat within the overall sequence
of syllables. The rate of this cadence was a sedate largo—approximately one stressed syllable a
second. (That this cadence-like pattern is not only found in the recitation of metric poetry, but is
a common occurrence in the spoken prose of motherese (or infant-directed speech), has been
demonstrated for mother–child interaction, by, for example, Malloch (1999), Trevarthen and
Malloch (2002), and Gratier (1999); as well as for speech among adults and between adults and
school-age children by Erickson and Shultz (1982), Scollon (1982), and Erickson (1982, 1986,
1992, 1996, 2003, 2004). See also Auer et al. (1999), for a slightly differing but related claim
concerning the rhythm patterns of spoken discourse, and Hall (1983), especially his discussion
on synchrony and group cohesion, pp. 154–156).
In the abruptness of its uttering, the word ‘Louie’ came slightly ahead of the beat that had been
projected by the previous sequence of stressed syllables (see also the discussion of an analogous
instance of abruptness in Erickson and Shultz 1982, pp. 111–117). Within the timing gestalt of
that sequence, the word ‘Louie’ is heard as coming just a bit too soon. Then, after a one-beat
pause (equivalent to a sentence-terminal pause), the teacher returned to the timing pattern of the
poetry, finishing the line in the cadential formula with triplets: ‘not even a mouse’.
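The notion of an utterance arriving ‘slightly ahead of the beat’ can be made concrete with a small sketch. The Python fragment below is illustrative only: the onset times are hypothetical values chosen to resemble a cadence of about one stressed syllable per second, not measurements from the recording, and the function names are invented for this illustration. It estimates the cadence from the intervals between stressed syllables, projects the next beat, and reports how far an interpolated word falls from that projected beat.

```python
from statistics import mean


def project_next_beat(onsets):
    """Estimate the cadence period from stressed-syllable onsets (in seconds)
    and return (period, projected time of the next beat)."""
    if len(onsets) < 2:
        raise ValueError("need at least two onsets to estimate a cadence")
    # Intervals between each successive pair of stressed syllables
    intervals = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
    period = mean(intervals)
    return period, onsets[-1] + period


def beat_offset(onsets, event_time):
    """Signed distance of an event from the projected beat.
    A negative value means the event came early (ahead of the beat)."""
    _, expected = project_next_beat(onsets)
    return event_time - expected


# Hypothetical onsets: roughly one stressed syllable per second
stressed_onsets = [0.0, 1.0, 2.0, 3.0]
# Hypothetical onset of an interpolated word, a little before the next beat
interjection_onset = 3.8

period, expected = project_next_beat(stressed_onsets)
offset = beat_offset(stressed_onsets, interjection_onset)
print(f"period={period:.2f}s expected={expected:.2f}s offset={offset:+.2f}s")
```

A negative offset corresponds to the hearing described above, in which the word comes just a bit too soon relative to the beat that the preceding stressed syllables had projected.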
By interrupting the pitch gestalts of the sequence of stressed syllables:
night         Christ-mas    all           house
(med. high)   (low–rise)    (med. high)   (low–rise)
and by interrupting the volume gestalt (with LOUIE uttered much more loudly than the
preceding syllables) and by interrupting the voice-quality gestalt (kindly, open-throat ‘teacher
read-aloud’ voice shifting to annoyed, constricted ‘teacher reprimand’ voice) and by interrupting
the timing gestalt of a four-foot line of poetry by interpolating a word outside the text immedi-
ately after the syllable marking the second foot in the line had been uttered, and by interrupting
the continuity of the teacher’s previous postural and gaze configuration through the sudden
redirection of the teacher’s gaze and head/shoulder position away from the book and across the
circle toward Louie, the illocutionary force of the reprimand was redundantly underscored
in multiple behavioural ways. What was being communicated was an implicit but powerful
directive: ‘Stop wiggling!’. By contrast, see the canonical version of the poem text as presented in
Figure 20.3.
[Figure: musical transcription, with triplet markings. T: Twas the night be-fore Christ-mas and all through the house not a crea-ture was stir-ring not e-ven a mouse]
Fig. 20.3 Canonical version of Twas the night before Christmas.
20.2.2 ‘Have we discovered tha::::t …’

The second example comes from a mathematics lesson in the same kindergarten–first
grade classroom from which the previous example came. The teacher was reviewing the mathe-
matical notions of set and set property. The teacher had placed blocks on the rug and she and
the first-grade students sat in a circle looking at the blocks in the centre of the circle they
had made with their bodies. The blocks were arranged on the rug in two sets. In one of the
sets, regardless of size and colour (large, small, green, yellow) all of the blocks had the same
shape: triangles. In the other set of blocks, shape varied (circles, squares, rectangles) but all were
the same colour (yellow). Two blocks were anomalous in the context of these two arrays: yellow
triangles.
The substantive point toward which this lesson was tending was a new concept—a set whose
members could belong simultaneously to other sets. Such a set would be named, at the end of
the lesson, an ‘intersecting set’. A rope ring had been looped around the set whose property was
shape (and at this point the yellow triangles were included in that set). A rope ring was looped
around the other set, all of whose members were yellow (but not including the anomalous pair
of yellow triangles in the other set). Discussion was animated, with children placing the pair
of yellow triangles first with the other triangles and then with the other blocks that were all
yellow (Figure 20.4).
At this juncture in the discussion, the teacher was ready to sum up. She said, ‘What have we
decided … have we decided tha::::t ….’ (the successive colons indicate elongation of the vowel in
a ‘sound stretch’).
In addition to the unusual feature of the sound stretch, the word ‘that’ was pronounced
with volume emphasis, and with a pitch shift (starting higher than the pitch level of the previous
syllables and concluding lower than the previous syllables, i.e., high–low shift), and with a change
in vocal quality from open throat to a more constricted, harsher sound. This is a statistically
infrequent way (i.e., a marked way) to say ‘that’ in this grammatical context. The more frequent
(i.e., unmarked) way of uttering would be to say ‘that’ at the same syllable speed as in the words
that preceded it and then to continue on at the same syllable speed into the new clause that began
with the word ‘that’: ‘have we decided that this set has the property of ’.
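[Figure: two rope rings on the rug. Left ring, ‘Triangle blocks’ (set property: shape), containing green and yellow triangles; right ring, ‘Yellow blocks’ (set property: colour), containing yellow blocks. Key: y = yellow, g = green.]
Fig. 20.4 Blocks and rope rings illustrating sets and set properties.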
Contrasting an utterance with what has immediately preceded it, at any given now moment in
the course of speaking, is a way of signalling markedness—something special about a referent
that may need to be interpreted non-literally. Such contrasting can be manifested by both verbal
and non-verbal means. The verbal means include a shift upward or downward in volume, a shift
upward or downward in pitch, a shift in the tempo of syllable production (i.e., the rate of sylla-
bles per second) faster or slower than those in the immediately prior strip of talk, and a shift in
voice quality/emotional ‘key’ (e.g., from straightforward to ironic). The possible non-verbal
means of producing contrast with the behavioural form of what has occurred immediately
before the present moment include shifts in a previously sustained postural configuration, shifts
in gaze toward or away from other interlocutors, changes in the direction and amplitude of
gestures, and changes in interpersonal distance.
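The idea that markedness arises from contrast with the immediately prior strip of talk can be sketched computationally. The Python fragment below is a rough illustration under stated assumptions, not a method used in this chapter: the measurement values are hypothetical, the channel names are invented, and the rule of flagging a value that lies more than two standard deviations from the mean of the prior strip is an arbitrary threshold chosen for the example.

```python
from statistics import mean, pstdev


def is_marked(prior_values, current, threshold=2.0):
    """Return True if `current` contrasts with the prior strip, i.e. lies more
    than `threshold` standard deviations from the mean of `prior_values`."""
    mu = mean(prior_values)
    sigma = pstdev(prior_values)
    if sigma == 0:
        # A perfectly flat prior strip: any change at all counts as contrast
        return current != mu
    return abs(current - mu) / sigma > threshold


def contrast_profile(prior, current, threshold=2.0):
    """Check each channel (pitch, volume, tempo, ...) separately.
    `prior` maps channel name -> list of earlier values;
    `current` maps channel name -> the value at the present moment."""
    return {ch: is_marked(prior[ch], current[ch], threshold) for ch in current}


# Hypothetical measurements for the syllables just before the present moment
prior = {
    "pitch_hz": [220, 225, 215, 222],
    "volume_db": [60, 61, 59, 60],
    "syllables_per_sec": [4.0, 4.2, 3.9, 4.1],
}
# Hypothetical present syllable: lower and louder, at an unchanged tempo
current = {"pitch_hz": 180, "volume_db": 72, "syllables_per_sec": 4.0}

profile = contrast_profile(prior, current)
print(profile)
```

In this invented case the pitch and volume channels are flagged as contrasting and the tempo channel is not. Voice quality and the non-verbal channels would need other kinds of coding and are left out of the sketch.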
What ordinarily happens is that contrast in the overall gestalts of verbal and non-verbal behav-
iour will occur at major junctures between topics, or when emotional tone has changed. Within a
given strip of interaction in which a given topic is discussed or a certain emotional tone has been
maintained, there tends to be little behavioural contrast. Rather, each successive topic segment
usually has a distinct behavioural continuity—an overall gestalt sustained across the full duration
of that segment. Thus, shifts in behavioural configurations, when they occur, can function as a
generic meta-message ‘something new is happening now’, and interlocutors can adapt mutually
to that new alignment, perhaps also reading the signals of the moment non-literally, whereas
in the immediately preceding moments they had been reading the signals literally. (See the
discussion in Erickson and Shultz 1982, 1997; McDermott et al. 1978.)
(NB A further point should be made here about the pitch patterns that obtain in ordinary
talk in English. In the previous example, there was a high–low–high–low alternation in pitch
between successive volume stressed syllables: ‘night’, ‘Christ-’, ‘all’, and ‘house’. This is the result of
a stylistically characteristic sing-song way of uttering poetry in English in dactylic tetrameter. By
contrast, in the uttering of ordinary prose, a different pattern of pitch variation obtains.
Typically, there is a slight pitch rise near the end of intermediate clauses within a sentence and
a slight pitch rise (or continuous pitch) followed by a slight fall at the end of a sentence. In other
words, there is a tendency to return to the medium level baseline pitch from clause to clause
as the recitation progresses. That is what one sees as well in the ordinary speaking of English: the
maintenance in uttering of a baseline medium pitch level. This is why, in the utterance ‘have we
decided … have we decided this set has the property …’, most of the syllables would be uttered
on the baseline medium pitch level, with the volume accented syllable ‘cid’ in the centre of the
word ‘decided’ and the syllable ‘this’—both of these appearing at syntactic junctures but not at a
sentence terminal juncture—uttered at a slightly higher pitch than the words around them. That
is the usual, unmarked way to say those words. To say instead ‘have we decided tha::::t’ (with a
high-to-low pitch shift on the elongated vowel) was distinctly different from what was usual, and
thus was not predictable by a hearer.)
In the example under discussion, the teacher, by uttering the word ‘tha::::t’ in an infrequent,
behaviourally marked way, was making it special—something unusual—and such rendering
enabled the word ‘that’ to function as a cue with distinctive social meaning: the Batesonian
meta-message, ‘sit still and be quiet’. Gumperz (1982, 1992) calls these kinds of signals
‘contextualization cues’; they are a basic component of the social-interactional steering mechanisms
employed in oral discourse, as we tell each other implicitly and indirectly what we are doing
communicatively, within the ongoing course of our doing it.
We can infer the implicit social meaning ‘sit still and be quiet’ from what the students did as the
teacher uttered the word ‘that’. As she said ‘what have we decided … have we decided’, the stu-
dents were talking simultaneously and were rocking back and forth in their places on the rug,
pointing excitedly at the blocks on the floor. As the word ‘tha::::t’ progressed in the course of its
uttering, more and more children became silent and still. By the time the final /t/ in ‘tha::::t’ was
uttered, not a single child was moving or speaking, and all eyes looked down at the blocks on the
rug. Collective attention had been focused at a particular moment in time, towards information
available at a particular location in space (through the semiotic medium of the coloured and
shaped blocks placed on the rug).
Here is what happened immediately next in the discourse:
1 T: What have we decided
2 Have we decided tha::::t
3 …
(T holds up one of the triangle blocks)
4 these blocks all have the property of the same what?
5 SS: SHAPE!
The word ‘these’ (in ‘these blocks’) received volume stress and slightly higher pitch, as the
teacher held up one triangle from the left-positioned set of blocks. This verbal ‘deixis’ (literally,
from the Greek, for ‘pointing’ – using a pronoun to refer indexically to an entity in the scene at
hand) was accompanied by a non-verbal gesture, holding up the block between the teacher’s
thumb and forefinger. Thus, the collective attention of the students was directed not just to the
blocks on the rug in general, but to one of the two sets of blocks—all of the triangle-shaped
blocks—and to one particular member of that set, which was being taken to stand for the whole
set, in what we might call a ‘graphically enacted metonymy’ (Figure 20.5).
The rhythm of the teacher’s speaking sketched a cadential timing gestalt, with the syllables
‘co’ in ‘dis-co-vered’ and ‘tha::::t’ and ‘these’ all occurring at approximately one-second intervals,
separated by a one-second pause after ‘tha::::t’. This marked, especially slow cadence projects
a ‘next’ pulse after the words, ‘all have the property of …’ as a potential answer-slot for students,
helping the children to predict the moment in which to fill the projected slot with the answer
word ‘shape’.
[Figure: musical transcription. T: THESE blocks are all the pro-per-ty of the same what? SS: SHAPE!]
Fig. 20.5 The collective attention of the students was directed to all of the triangle shaped blocks
and to one particular member of that set.
By such intercalation of turns in the timing between speakers, participants in conversation are
able to complete each other’s sentences. In classrooms, a common discourse pattern is for a
teacher to utter a ‘fill in the blank’ question, as in ‘these blocks all have the property of ____’, with
students answering in chorus, as in the preceding example.
Further evidence for the contention that answer slots and question slots occur in regular
timing relationships in classroom discourse is provided by the next example.
20.2.3 Carlos ‘drumming’ an answer slot

In a bilingual first-grade North American classroom early in the school year, the students were
being introduced to Arabic numerals and to the Spanish words for the numbers from one to ten.
The numerals were displayed in a series of large cards, one card for each numeral, that were
attached above the chalkboard at the front of the classroom. The series of cards covered the width
of the chalkboard; thus each numeral was large enough to be read by all the students in the class.
The teacher asked Carlos to stand before the chalkboard, holding a pointer, and to point up at
each numbered card in succession as the teacher said the word for each numeral. There was a
cadence-like timing gestalt in the sequence, with a volume-stressed number-syllable coming on
the next beat preceded by the word ‘numero’, thus: ‘numero uno …. numero dos …. numero
cinco ….’. This temporally regular way of the teacher’s saying the numbers projected an answer
slot on the next beat after the uttering of the stressed syllable in the numeral name. That was the
‘kairos’ moment (from the Greek, the moment of opportunity), the place in time coming imme-
diately next, the slot which should be filled by the student’s pointing to the card whose numeral
had just been uttered—pointing on the beat of the cadence-like timing formula that was
manifested in the teacher’s speech (Figure 20.6).
Carlos and the teacher completed a question–answer sequence in which the teacher called out
various numerals from one to ten. ‘Muy bien, Carlito’, the teacher said. Then she asked Carlos to
go to his seat so that another child could take a turn at being the designated answerer. Carlos
shook his head in annoyance and sat down reluctantly. As the next child answered by pointing to
each number card at the appropriate moment, Carlos picked up two pencils on his desk and
started to use them as drumsticks, drumming in the appropriate answer-slot moments that were
being projected by the cadence pattern in the teacher’s voice (Figure 20.7).
[Figure: musical transcription, with triplet and duplet markings. T: nu-me-ro un-o, nu-me-ro dos, nu-me-ro (etc.); S: (points here) after each number word.]
Fig. 20.6 Pointing on the beat of the cadence-like timing formula that was manifested in the
teacher’s speech.
[Figure: musical transcription, with triplet and duplet markings. T: muy bien (very good), nu-me-ro cin-co (number five), nu-me-ro seis (number six), nu-me-ro qua-tro (number four), muy bien; S: (answerer points here) after each number word, each annotated (Carlos taps on desk).]
Fig. 20.7 Carlos picked up two pencils on his desk and started to use them as drumsticks.
(For other examples of the role of cadential timing in the organization of conversational turn-
taking and question–answer sequences see Erickson 1996 and Erickson 2004, Chapters 1 and 2.)
20.2.4 ‘Maximum potential!’

In another pair of kindergarten classrooms from the same school in a different school year, the
children had spent the entire year learning in depth key concepts in a series of topics in the
physics of matter, energy, and motion. Their study was culminating in June in the construction of
a classroom-sized roller coaster, a complex machine whose operation was based on the contrast
between kinetic and potential energy. The students were asked to attach cards (on which they had
written either ‘potential’ or ‘kinetic’ in English and in Spanish) at the locations along the roller-
coaster at which each kind of energy was maximized as a ball rolled through the roller-coaster
system, simulating a roller-coaster carriage. In placing the cards on the roller-coaster and dis-
cussing the reasons for that placement, the students had revealed a misunderstanding—some
thought that as the ball accelerated on the down slopes in the system it was ‘gaining energy’ and
that as the ball decelerated on the up slopes in the system it was ‘losing energy’. (This is called in
science education pedagogy the Aristotelian fallacy. Newton showed that matter in motion nei-
ther loses nor gains energy, but that energy is transformed from one form to another, stocked as
potential and expended as kinetic.)
The classroom teachers who were working together as a team realized (in a quickly held
planning meeting) that it was necessary to re-teach the distinction between kinetic and potential
energy. One of the teachers went directly from the planning meeting to a gathering of all the
children in her class. The students were seated on the rug in front of the whiteboard on which the
teacher had used a black felt-tipped marker to draw a schematic outline of a generic roller-
coaster. She asked one of the children to come to the board with her and then demonstrated
the location of potential energy in the system by covering the child’s hand with her own as the
child was holding a red marker. She and the child placed the tip of the marker at the bottom of
the leftmost incline in the diagram. Then they slowly traced a line in red that was parallel to the
black line that had been already drawn (Figure 20.8).
Fig. 20.8 Potential, potential, potential.
As the red marker moved up the incline, the teacher said with increasing volume and rising
pitch:
                                   MAX                  TEN-
                                                   PO
                                        I-MUM                TIAL!!
                    potential ..
          potential
potential
The rising pitch and increased volume continued, and peaked at ‘MAX’ as the marker moved
to the apex of the incline—the point in the system that was the locus of MAXIMUM POTENTIAL
energy. Thus, pitch and volume cues accompanied the gestural cue of drawing a line up the slope
of the incline, with the auditory cues redundantly mimicking (through pitch and volume) what
the drawing gesture was illustrating.
20.3 Conclusion
The musical aspects of speech performance—contrast in pitch, volume, and voice quality,
and cadential patterns in timing—are ubiquitous in classroom talk. They appear to function as
(usually) implicit and (sometimes) explicit signal systems that call attention to the crucial
meanings-of-the-moment in classroom discourse: to literal referential meaning and to
metaphoric social meaning. Thus, in terms of my initial discussion of theoretical foundations,
the pitch, volume, and voice quality in classroom talk appear to function to signal Batesonian
meta-messages, while the cadential timing patterns in classroom talk appear to function to
simplify the informational surround, by summoning collective attention to key information bits
that appear linearly in the speech stream, allowing students to focus attention especially intently
at certain crucial moments on the intentions and thinking of the teacher, and to relax such atten-
tion at other moments. This temporal ebb and flow of attention to informational detail is neces-
sary because of the inherent limits on human information processing (as shown by cognitive
psychology – see Section 20.1), or, in a phenomenological perspective, because of the variability
of subjective purpose. Cadential timing of classroom talk also appears to function to clarify the
sequential organization of talk, as understood by conversation analysts and discourse analysts—
a cadence permits recognition of a now moment and permits prediction of next moments in
relation to that now. All of this is done in a situation of power asymmetry between teacher and
student, with the teacher able to control both the topics and the participation frameworks for
talk in official situations of classroom discussion in a way that is quite different from the social
ecology of interaction in informal social situations. In summary, students in school need to learn
when to attend more and less intently, auditorially and visually, and where in space and time to
direct their attention, in addition to learning to recognize the symbol and cueing systems by
which information is being communicated, both in terms of its literal referential meaning and its
implicit social meaning, continually taking account of the fundamental power difference
between teacher and students to control the content and process of classroom conversation.
The meanings that are communicated in talk, explicitly and implicitly, are so fundamental to
understanding that their correct collective interpretation cannot be left to chance. Rather, such
interpretation must be scaffolded or pointed to in as many ways as are possible—using auditory
cues in combination with visual cues, i.e., employing all possible communicative modalities
redundantly. I infer that this is the reason that the most important meanings are almost always
signalled multimodally in human social interaction; there is now a burgeoning literature on
multimodal discourse analysis of classroom talk (e.g., Kress 2001; Jewitt and Kress 2003).
Early-grade classrooms are especially fruitful sites in which to observe the elementary forms of
classroom discourse and to see how the musicality of talk provides cues to support the moment
by moment meaning-making of students. The examples presented in this chapter moved from
those showing how pitch, volume, voice quality, and timing signal messages about social relations
(e.g., ‘stop moving and talking!’) to those showing how the musicality of talk underscores and
emphasizes crucial points in subject matter content (e.g., the location along an ascending
inclined plane at which the potential energy of a moving object is maximized). The examples
were intended to demonstrate that social relational meaning, referential subject matter meaning,
and the semiotic codes by which meaning itself is signalled, are all present for interpretation
within the behavioural performance—the surface structure—of classroom talk, and that the cues
to meaning are perceived linearly across strips of time.
What I have claimed and begun to illustrate here, as well as in some of my previous work, is that
as children and teachers interact with each other using instructional materials and semiotic sym-
bol systems, the musicality of their talk and listening activity in real time provides a foundation
for the successful conjoint performance of interaction, and for the social organization
of mutual understanding—reciprocal cognition, or intersubjectivity. Through pitch and volume
cues in speech, children and teachers signal crucial now and next moments of information, e.g.,
‘the words I am about to say on the next rhythmic beat are the most important thing I’ve said so
far in this utterance, so listen for them—pay special attention’. Speakers almost never utter
such an injunction explicitly—rather, it is usually said implicitly by the behavioural means
of timing, pitch, volume, and voice quality. Nor does a speaker usually say explicitly ‘up to now
I was serious, now I’m not’, or ‘we’re about to start a new scene in this narrative’, or ‘I’m going to
list some things next in my talk, and you need to pay special attention to each item on the list’, or
‘By starting with a high-pitched noun in the list and then shifting to one said at medium pitch, I’m
implicitly telling you that a third list item will follow—and that will be the last item on the list’.
When teachers and students share a similar implicit musical signalling system for the coordi-
nation of attention and action in talk, they tend to understand one another clearly and to
have positive feelings toward one another. When the mutual signalling system is not working well
and interactional stumbles happen (akin to performance stumbles in dance and music), negative
affect and misunderstanding often occur (see the extended discussion in Erickson and Shultz
1982, Chapters 3 and 4; see also Erickson 1996 and Erickson 2004, Chapter 3). Thus,
as Trevarthen (1999) theorizes (Malloch and Trevarthen, Chapter 1, this volume, and Trevarthen
and Malloch 2000; Hall 1983), our capacity to think with and feel with one another seems to
be tied to our capacity to dance and sing in smooth, predictable rhythm with each other in our
talk. This is especially evident in our sharing of action and experience with young children in
ways that support their pride in acting and knowing (Trevarthen and Malloch 2002).
When neo-Vygotskyan perspectives on social interaction as the site of learning in ‘intent
participation’ (Rogoff 2003; Gutierrez and Rogoff 2003) are combined with a ‘musicality of
social interaction’ perspective, it becomes apparent that such matters as engagement between
teacher and learner in the ‘zone of proximal development’ have as a necessary condition the
establishment of mutual musicality in their talking and listening activity. In other words, mutual
musicality can be considered as a foundation for the opportunity to learn in the classroom, as
elsewhere.
The examples in this chapter by no means exhaust the full range of kinds and uses of musical-
ity that remain to be discovered in classroom discourse. A particular limitation here is that by
focusing on relatively simple oral texts in early-grade classrooms, I have not discussed the more
complex kinds of oral texts that are produced by teachers and by students in classrooms, espe-
cially as students grow older. Informally, from my own experience of reading stories aloud to
children, from observing lectures by teachers in upper-elementary grades, and from my own
experience in lecturing at the university level, I am aware that the musicality of talk functions to
cue diverse meanings in complex narrative and expository oral texts, but I do not yet have data
organized systematically to support and illustrate such a claim.
Another limitation here is that I have not used recently available computer software for
phonetic analysis of speech samples to identify more precisely the timing patterns—especially
the patterns of alteration in pitch—that I have illustrated in musical transcription. Further
work is needed to specify more adequately the relations between pitch change and timing
patterns in speaking. These microanalytic snapshots of brief strips of classroom discourse need
to be integrated with wide-ranging ethnographic documentation of learning as it takes place
within classroom social interaction. All of this goes beyond the scope of this introductory
chapter, which has raised issues and pointed to their potential significance for the study of
social interaction and of pedagogy. I can only hope that this chapter’s limits will prompt others to
transcend them.
References
Auer P, Couper-Kuhlen E and Müller F (1999). Language in time: The rhythm and tempo of spoken
interaction. Oxford University Press, New York and Oxford.
Bateson G (1954). The message ‘this is play’. In B Schaffner, ed., Group processes, pp. 195–242. The Josiah
Macy Foundation, New York. Reprinted in Bateson G (1975). Steps to an ecology of mind. Ballantine
Books, New York.
Cazden C (2001). Classroom discourse: The language of teaching and learning. Heinemann, Portsmouth, NH.
Erickson F (1982). Classroom discourse as improvisation: Relationships between academic task structure
and social participation structure in lessons. In LC Wilkinson, ed., Communication in the classroom,
pp. 155–181. Academic Press, New York. Reprinted in MV Maillo, FJG Castaño, and AD deRada (1993).
Lecturas de antropologia para educadores, pp. 325–353, Impresión Cosmoprint SL Los Naranjos, Madrid.
Erickson F (1986). Listening and speaking. In D Tannen and JE Alatis, eds, Language and linguistics: The
interdependence of theory, data, and application, pp. 294–319. Georgetown University Round Table on
Languages and Linguistics, 1985. Georgetown University Press, Washington, DC.
Erickson F (1992). They know all the lines: Rhythmic organization and contextualization in a conversational
listing routine. In P Auer and A di Luzio, eds, The contextualization of language, pp. 365–397. John
Benjamins Publishing Company, Amsterdam/Philadelphia.
Erickson F (1996). Going for the zone: The social and cognitive ecology of teacher-student interaction in
classroom conversations. In Deborah Hicks, ed., Discourse, learning, and schooling, pp. 29–62.
Cambridge University Press, Cambridge and New York.
Erickson F (2003). Some notes on the musicality of speech. In D Tannen, ed., Georgetown University
Roundtable on Languages and Linguistics 2001, pp. 11–35. Georgetown University Press, Washington,
DC.
Erickson F (2004). Talk and social theory: Ecologies of speaking and listening in everyday life. Polity Press,
Cambridge.
Erickson F and Shultz J (1982). The counselor as gatekeeper: Social interaction in interviews. Academic Press,
New York.
Erickson F and Shultz J (1997). When is a context: Some issues and methods in the analysis of social
competence. In M Cole, Y Engeström, and O Vasquez, eds, Mind, culture, and activity, pp. 22–31.
Gratier M (1999). Expression of belonging: The effect of acculturation on the rhythm and harmony of
mother–infant vocal interaction. Musicae Scientiae (Special Issue 1999–2000), 93–122.
Gumperz J (1982). Discourse strategies. Cambridge University Press, Cambridge.
Gumperz J (1992). Contextualization and understanding. In Duranti A and Goodwin C, eds, Rethinking
context: Language as an interactive phenomenon, pp. 229–252. Cambridge University Press, Cambridge.
Gutierrez K and Rogoff B (2003). Cultural ways of learning: Individual traits or repertoires of practice.
Educational Researcher, 32(5), 19–25.
Hall ET (1983). The dance of life, the other dimension of time. Anchor Press/Doubleday, Garden City, NY.
Jewitt C and Kress G (2003). Multimodal literacy. Peter Lang, New York.
Kress G (2001). Multimodal teaching and learning: The rhetorics of the science classroom. Continuum,
London and New York.
1999–2000), 29–57.
McDermott R, Gospodinoff K and Aron K (1978). Criteria for an ethnographically adequate description of
concerted activities and their contexts. Semiotica, 24, 245–275.
Mehan H (1979). Learning lessons: Social organization in the classroom. Harvard University Press,
Cambridge, MA.
Merleau-Ponty M (1945/1962). Phenomenology of perception. Routledge, Kegan Paul, London.
Newell A and Simon HA (1972). Human problem solving. Prentice Hall, Englewood Cliffs, NJ.
Rogoff B (2003). The cultural nature of human development. Oxford University Press, New York.
Rommetveit R (1998). Intersubjective attunement and linguistically mediated meaning in discourse.
In S Bråten, ed., Intersubjective communication and emotion in early ontogeny, pp. 354–371. Cambridge
Sacks H, Schegloff E and Jefferson G (1974). A simplest systematics for the organization of turn-taking in
conversation. Language, 50, 696–735.
Scollon R (1982). The rhythmic integration of ordinary talk. In D Tannen, ed., Analyzing discourse: Text and
talk, pp. 335–349. Georgetown Roundtable on Language and Linguistics 1981. Georgetown University
Press, Washington, DC.
Simon HA (1979). Information processing models of cognition. Psychological Review, 76, 473–483.
Sinclair J and Coulthard M (1975). Towards an analysis of discourse: The English used by teachers and pupils.
Chapter 21
Spontaneity in the musicality and
music learning of children
Nicholas Bannan and Sheila Woodward
21.1 Introduction: the intuitive musical learner

Developments in educational philosophy question the view that music is in any way an artificial
activity that has to be taught. As the innate skill of infants’ participation in musical games with
their parents has become better appreciated, spontaneous music-making by the young child has
been given greater significance (Bjørkvold 1992; Dissanayake 2000a, b; Flohr and Trevarthen
2007). At the same time a new ‘world music culture’ is animated by a democratization of musical
participation in rock music, garage bands and karaoke. There are also new opportunities for
independent self-expression made possible by music technology, and untrained musicians have
been involved in creative projects in the work of composers such as Cornelius Cardew and
R. Murray Schafer and the practices that have arisen from their influence. All this has changed the
educational agenda (Laycock 2005; Cox 2004; Paynter and Aston 1970), and a theoretical frame-
work has begun to emerge that conceives the sharing of music as an expression of motives for
acoustically responsive movements, and these motivated movements underlie all human com-
munication (Bannan 2002, 2004).
The free, untutored making of music by children appears to express a natural human impulse
to cultivate the sound of movement and emotion in both reflective and sociable ways, a talent or
need that is clearly displayed in cultures where music is a valued part of both daily activity and
ritual occasions, such as those of sub-Saharan Africa and the islands of the Western Pacific Rim
(Dargie 1988; Feld 1990). Indeed, the spontaneous expression and enjoyment of music is identi-
fied by theorists of cultural evolution and of comparative psychobiology as the foundation of our
common humanity (Blacking 1973; Cross 1999; Campbell 1998; Wallin, Merker and Brown 2000;
Mithen 2005; and see chapters in Part 2, this volume).
We focus in this chapter on how the infant’s instinctive participation in musical behaviour may
be fostered throughout childhood, and can develop into mature expressive musicianship. We will
examine the nature of children’s free engagement with music, and the processes by which culture
and education, the ‘imposed’ principles of sociable behaviour that may either release or inhibit
children’s innate musicality, define the style, context and opportunity for practice and enjoyment
of musical art.
21.2 Development of musicality: personal and social

21.2.1 Origins of musical human nature
We understand musicality as a ‘human phenomenon, dwelling within even the very young and
awaiting the call to expression’ (Campbell 1998, p. 226). Its manifestation in early childhood—
from the earliest expressive vocalizations of the newborn, through free-form rhythmic and
melodic motifs of spontaneous play between infants and parents, to the structured chants in the
action games of young children in groups—confirms that music satisfies an innate pleasure in
rhythmic narrative that children and adults recognize in one another.
Infants are not only music makers, they are also ready to learn new musical forms. The process
of musical assimilation starts months before birth. The first pitched murmurings of a newborn
infant are responsive to a blanket of human and environmental sound that has enveloped the
fetus from the moment when auditory processing became functional in the womb. The learning
of culturally transmitted forms of music begins through a process shared with that of learning
spoken language, as affective responses to the essential components of acoustic phenomena com-
mon to both (pitch, duration, timbre and amplitude). There is a gestural completeness in the
expressive nature of early vocalization before it becomes segmented into parts of speech or gram-
matical musical structure (Papoušek and Papoušek 1989; Papoušek 1994; Kühl 2007). Intimate
companionship in musicality stimulates responses, primes memories and shapes behaviour, with
or without the conscious directive attention of adults. In readiness for this learning, innate
human musicality is actively ‘environment expectant’ as well as ‘environment dependent’ (Bekoff
and Fox 1972).
Adaptations for life in a musically communicative human world involve all parts of the body,
but those of the vocal and auditory systems are of special importance. Musical instruments, as
extensions of the body, are made to imitate and transcend song by means of polyrhythmic artic-
ulations of the hands, jaws, lips and tongue, assisted by actions of the trunk, limbs and feet. The
performance is monitored by the mechano-receptive senses, along with touch, sight and hearing.
It appears likely that singing came first in human evolution, that the intricate neural scaffolding
for coordinating and regulating musical expression and perception evolved to serve vocal com-
munication of intentions, thoughts and feelings before the hands began to make music (Donald
2001; Mithen 2005; Cross and Morley, Chapter 5, and Panksepp and Trevarthen, Chapter 7, this
volume); though the feet, in dance, would have made an early appearance in cultural practice.
Social elaboration of meaning in song and dance, as for live speech and language, needs no
instruments, but the art of instrumental performance is a technical invention requiring the
development of sound-making tools, and the learning of motor skills to use them (Lee and
Schögler, Chapter 6, this volume). The invention of additional technologies of writing and
recording has greatly increased the storage and historical elaboration of traditions in
both language and music (Donald 2001).
Practical music archaeology, investigating how music-making may have played a part in the
evolution of culture in pre-human and early human societies, is a new science. The experiment of
Ian Cross and his team with the musical consequences of flint-knapping (Cross et al. 2002)
explores the relationship between working and listening, as well as the potential for communal
music-making as a side-effect of tool making for other purposes. Research on these lines prom-
ises to illuminate in what ways our ancestors may have discovered how to make music with many
kinds of fabricated objects, direct evidence for which is lost because the likely material, wood,
bone and skin, has not survived.
Music-making serves social life, and it aided collective performance of tasks among the earliest
humans. There is ample evidence of this role of music in surviving hunter-gatherer societies
(Dargie 1988; Ellis 2001). Garfinkel (2003) sees the coordinated movements of group dance,
as depicted on pottery and artefacts in Middle Eastern archaeology, as essential to the development
of agriculture (Cross and Morley, Chapter 5, this volume). He observes that the first human
cultures to be dependent on large-scale agricultural organization recorded their achievements,
not in depiction of the work itself, but in many representations of people engaged in dance.
For the products of art, language, and communication are surely less interesting and fundamental
than the search and ultimate verification of the principles that generate these products. We must begin
with the assumption that human culture is driven by functional prerequisites.
Ellis (1999, p. 11)
21.2.2 Musicality in childhood seeks ritual in shared performance

Elliott (1995) identifies making music as something that humans ‘already do’—part of a natural
‘tendency to elaborate aspects of ordinary life’ (p. 120). Historical and cross-cultural evidence
confirms that motivation for creative ‘musicality’, for making meaning with music, which begins
with the experience of sound being cognitively processed by the brain in the service of move-
ment, is a fundamental human attribute (Kühl 2007; see Merker, Chapter 4, this volume).
It has been proposed, from a reductive cognitivist perspective, that the brain has a set of
information-processing components for the sounds of music—a ‘music faculty’ or dedicated
musical intelligence, much in the same way that it is proposed to have a ‘language faculty’ for
‘understanding’ language (Chomsky 2000; Gardner 1983, 1999). This theory of a listener
‘decoding’ music in a stream of sounds fails to appreciate the essential intuitive motives of music,
whether it is enjoyed by a listener or in memory, or created by a performer (Imberty 1997). While
the activities of making and sharing music are evidently biologically determined, in that the
human body and brain are adapted specifically for the required rhythmic motor, affective and
sensory functions (Lee and Schögler Chapter 6, Panksepp and Trevarthen Chapter 7, Turner and
Ioannides Chapter 8, this volume), any particular musical ethnicity, like any given language, is
learned and ‘socially determined’ (Ellis 1999, p. 11). Human musicality includes an inventive and
collaborative motivation for acquiring musical skill, for joining a musical tradition. It is motivated
by innate sympathy for expressive movement between performers, plus a need to find meaning in
one another’s sounds. Advances in brain science confirm the common sense acceptance of sym-
pathy between the motor expressions of intending subjects, and this changes how we can imagine
innate musicality and the processes by which music ‘means’ (Rizzolatti et al. 2001; Kühl 2007;
Panksepp and Trevarthen Chapter 7, Turner and Ioannides Chapter 8, this volume).
Children acquire musical culture as naturally as they learn to walk and talk. But development
of musical understanding, like learning a language, depends on an appropriately stimulating
environment of adequately skilful music-making. Every society provides children with a music
vocabulary and musical stories particular to itself, created by ancestors, and taught, more or
less formally, by teachers. Sounds become meaningful as their organization begins to constitute
a musical ‘grammar’ for the child that is both recognizable and familiar, as a set of conventions
or ‘rituals’ carrying shareable meaning by cultivation of the dynamic story-making impulses
of imagination and memory that come with the sociable playfulness of human nature
(Turner 1982).
When children learn music, they adopt certain rules that determine arrangements of sounds
in the musical tradition, or traditions, which they have encountered. They learn the musical
practices of their cultural environment, with the particular dialects and accents unique to a
specific subculture. But, at the same time, children retain an imaginative musicianship of their
own. While learning the arts of music, they create their own musical inventions, playing a role in
the changes that transform musical cultures across generations, through processes that are
independent of adult intervention. In order to provide a framework for understanding these
complementary natural phenomena, learning and creating, we examine more closely the earliest
musical awareness of children.
21.2.3 Musical listening before and soon after birth and how
the mother’s voice helps
Human auditory consciousness begins in the uterine environment, where the internal sounds of
the mother’s body are interspersed with a wide spectrum of sounds of external origin transmitted
into the uterus (Woodward 1992a, b; Querleu et al. 1984; Benzaquen et al. 1990). The sound
environment triggers and shapes an internally directed process of growth or morphogenesis in a
system for auditory perception and understanding that is adaptively expectant of human sounds.
This is indicated by the finding that neonatal auditory deprivation is linked to profound central
neural changes in the auditory nuclei in the brainstem (Webster and Webster 1977). If these parts
do not receive stimulation, they atrophy.
Research shows that fetal responses to music, evident from approximately three months before
birth, may be elicited in one of two ways. They can be indirectly mediated by a maternal
response, as when fetal heart rate shows change over an extended period during which the
mother listens through headphones to music that she perceives to be either soothing or stressful
(Zimmer et al. 1982). Or fetal response can occur directly to sound, as in a situation where the
mother’s hearing is masked and the auditory stimulus is presented through headphones placed
on her abdomen (Woodward 1992a, b). The prenatal capacities for auditory perception, associa-
tion, memory and learning are confirmed in studies of habituation and dishabituation of the
startle response, a reflex movement which is inhibited through repetition of the same stimulus
and reinitiated through a change in the stimulus (Leader et al. 1982; Lecanuet et al. 1992;
Kisilevsky and Muir 1991; Lecanuet et al. 1986; Shalev et al. 1989). The fetus can be aware of fine
distinctions between changes in auditory stimuli, for example, when reacting to a change of a
repeated stimulus from ‘bi-ba’ to ‘ba-bi’ (Busnel et al. 1986).
Long term auditory memory, from before to after birth, is evident in studies of neonatal
behaviour modification that is associated with activation of a sound that had become familiar
during the period before birth, but not with a similar but unfamiliar sound. Neonates show
learned preference for the maternal voice, or for a lullaby or a passage of spoken prose presented
repeatedly during the pregnancy, over unfamiliar but similar auditory stimuli (DeCasper and
Fifer 1980; Satt 1984; DeCasper and Spence 1986; DeCasper et al. 1994). Furthermore, the
unborn baby may not only acquire a preference for the mother’s voice, but also for the mother’s
language when compared to a different language (Mehler et al. 1988). These preferences are
apparently learned before birth.
Auditory awareness is clearly complex and adaptive at birth, expectant of new human sounds
and ready to flourish within the framework of innate intersubjectivity, or sympathetic engage-
ment with others. Neonates rapidly form listening habits through their auditory interaction with
human companions. Ten minutes after birth, babies are capable of turning their heads towards
nearby sounds, especially to the sound of the human voice, indicating that locating and orienting
to such a sound, while it may benefit from prenatal experience, is not a learned response
(Wertheimer 1961; Alegria and Noirot 1978; Muir and Field 1979; Clifton et al. 1981). A readi-
ness for making human contacts is matched by an intuitive facilitation of the infant’s responses
by the parent. When speaking to infants, affectionate adults use ‘motherese’ or ‘infant-directed
speech’ (IDS), a soft-grained, intensely sympathetic style of speaking that is more rhythmic,
higher pitched, and more repetitive than adult-directed speech—that is, more musical and more
emotional. Preference for the exaggerated prosodic features of IDS is present from birth and con-
tinues through infancy (Cooper and Aslin 1990; Fernald 1985). This preference that infants show
for musical IDS can be seen across languages when a language is presented that is unfamiliar to
the infant (Fernald 1993; Fernald et al. 1989). Furthermore, when exposed to adult singing,
infants show distinct listening preferences for infant-directed singing, which is more clearly
enunciated and higher pitched than non-infant-directed singing (Bergeson and Trehub 1999;
Trainor 1996).
Infants also detect the musicality in movements. The dynamics of IDS and baby songs are shared
with those of maternal hand gestures. A study on deaf children with deaf mothers indicated that
human infants may be equally predisposed to attend to an infant-directed form of hand sign
language, characterized by slower tempo, exaggerated movements and repetition (Masataka 1996).
There are principles of the regulation of human movement that are inherently communicative and
that are independent of the modality by which they are perceived (Lee and Schögler Chapter 6,
Mazokopaki and Kugiumutzakis Chapter 9, Powers and Trevarthen Chapter 10, this volume).
By 4 to 7 months, infants sense musical phrasing, preferring Mozart minuets that have pauses
placed between phrases (a ‘natural’ place for a pause to occur) to minuets that have pauses placed
within a phrase (an ‘unnatural’ placement) (Krumhansl and Jusczyk 1990). Infants are able to
detect small pitch changes and alterations to a melodic contour, timbre, and temporal and rhyth-
mic patterning (Trehub and Trainor 1993; Trehub et al. 1984, 1985, 1990). Such evidence of early
capacity for acute aural discrimination inspires a closer examination of early spontaneous musi-
cal expressions.
21.2.4 The musical expressions of infants: seeking musical

companions
Babies seek to engage actively with musical sounds, sometimes becoming still and attentive, and
then ‘singing’ and/or making rhythmic movements (Metz 1989; Suthers 1995; Mazokopaki
and Kugiumutzakis, Chapter 9, this volume). This effort to communicate with particular organi-
zations of musical sounds appears to be present very early. In a study by Hepper (1998), neonates
who had been exposed repeatedly to the theme tune of a TV soap opera, their mothers having
been asked to watch the show daily during the pregnancy, tended to cease crying and/or appear
more attentive when the tune was played, unlike babies whose mothers had not watched the
TV show. Even newborns can move in sympathy with rhythms of the voice and gesture of an
attentive partner (Condon 1979).
The early manifestations of musicality, from prenatal auditory memories and infants’ willing
responses to musical environments, to the more intricately active involvement of older children
in musical experiences, give strong evidence for a motivation for multimodal discovery of the
representative and communicative strategies on which social engagement and cultural learning
depend (Custodero and Johnson-Green 2003; Flohr and Trevarthen 2007). Young children’s
active response to music through movement is found to benefit their perception of form in musi-
cal structures (Gromko and Poorman 1998b). Sims suggests that, ‘since music is characterized by
the simultaneous interaction of a number of elements... determining how children respond to
more than one element at a time seems important to a more complete understanding of chil-
dren’s responses to music’ (Sims 1991, p. 299; and see Rodrigues, Rodrigues and Correia, Chapter 27,
this volume, for a proposed ‘natural’ experimental paradigm for investigating children’s
responses to music). The musical/poetic features of infantile vocal sharing evidently prepare the
way for learning of language as well as other cultural skills (Papoušek et al. 1985; Papoušek 1996;
Dissanayake 2000a; Miall and Dissanayake 2003; Goddard Blythe 2005; Kühl 2007; Merker
Chapter 4, Powers and Trevarthen Chapter 10, Erickson Chapter 20, this volume).
Babies vocalize and move to music with increasing enthusiasm and skill (Mazokopaki and
Kugiumutzakis, Chapter 9, this volume). Their reactions develop from undirected whole body or
limb responses to acoustic stimuli that begin at the fetal stage, to rhythmic bouncing and limb
waving of the young infant, and then to grasping and reaching activities that have immediate
potential for holding, pressing, shaking, and hitting objects that make sounds, an activity that
introduces the possibility of manipulating resonant instruments (Berk 2002).
This willing movement in response to melodious narrations in music is a distinctive compo-
nent of human behaviour. Our nearest relatives, the chimpanzees, do not appear to move
to humanly produced music (Williams 1967; Merker 2000; Geissmann 2000; Donald 2001;
Merker, Chapter 4, this volume). While elephant orchestras have been recruited in Thailand,
dressage horses are made to dance, and cobras appear charmed by music, all of these instances
are products of training that intentionally shapes instinctual behaviours of the animals so they
will follow physical or visual cues in order to appear musical. There is little, if any, evidence
that these species, or any other, move to music without such priming, though there is evidence
that other species can be emotionally affected by musical sounds (Panksepp and Trevarthen,
Chapter 7, this volume).
21.3 A model of musical education

21.3.1 How vocal expression becomes learned
Modelling of human development and education has, in the last half century, been directed
by the conceptual frameworks of learning theory and Piagetian cognitive psychology. Notions
of acquired abilities for semantic perception and abstract representation, appropriate for
the older child, presently inform attempts to model the early development of musical ability.
One such is the variant of Bruner’s spiral of education (Bruner 1960) devised by Swanwick
and Tillman (1986). Bruner argued that an individual’s learning does not follow a linear
path, but is rather a process of revisiting experiences at progressively more complex levels,
which Bruner depicted in a spiral diagram. Swanwick and Tillman (1986) employed a similar
spiral diagram to illustrate children’s musical learning in school, which resembles Piaget’s
stage-model of learning by the individual, identifying significant transitions in conceptualising,
described for later childhood and adolescence. Such a representation, while offering a
summary of pedagogic experience, tells us little about the role of the critical motivating
foundations growing in early infancy, prior to the onset of abstract thought (Donaldson 1992),
a stage in which an intuitive musical communication with other persons appears to play so
significant a role.
A development of the learning spirals of Bruner and Swanwick and Tillman, one that assumes
an instinctive basis for vocal musical expression (Bannan 2002), represents a different balance
between cognitive development and emotionally motivated behaviours (Figure 21.1). The model
commences at birth, which is represented by the lower end of the spiral. Here, a link is repre-
sented between involuntary (or ‘less voluntary’) self-regulatory actions of the infant, such as
breathing, crying, sucking, biting and yawning, and the development of expressive vocal control
that exercises the sound-making mechanisms that will facilitate learned communication. The
spiral model depicts a chronological sequence of mediation between involuntary self-regulating
responses or motives and voluntary actions or skills, illustrated in the recurring meander
between the left and right axes.
The purpose of this model is to show how, beneath even the most highly developed perform-
ance ability of the opera singer or public speaker, is a necessary reliance on systems of oral
and respiratory function that evolved, in pre-human times, for other, more self-related, purposes.
An infant first masters expression of sounds that he or she makes instinctively, and with
remarkable efficiency from the first hours of life, and then adopts a rich variety of sounds initially
outside his or her control, which are elaborated in spontaneous solitary play and through
interaction and learning with other human beings.
Fig. 21.1 Spiral model of vocal learning (Bannan 2002). The spiral rises from ‘Birth’ between an
‘Involuntary’ axis and a ‘Voluntary’ axis. Legible labels along the spiral include sucking, biting,
crying and cooing, intersubjectivity, response to pitch contour, mimicry, response to the
environment, maturation within self, simultaneous participation, experimentation for self and
with others, and song (repertoire) learning.
This model portrays the acquisition of musical behaviour as a process that transforms both
representation, to the self, and communication, with others. This is a critical distinction in Piagetian
cognitive theory, which assumes that before communication can take place, a means of incorpora-
tion of control within the individual needs to be established by representational redescription
(Karmiloff and Karmiloff-Smith 2001). Cross (1999) has accepted this model to account for musi-
cal learning, and, applying a different version of this thinking, Gardner (1983, p. 301) interpreted a
child’s vocalizing ‘vroom, vroom’, while making disorganized scribbling movements to elaborate
a drawing of a truck, as an early stage of representational development on the way to the capacity
for making visual art. We believe, however, it would be more appropriate to see this behaviour as
an example of a creative multimodal evocation of the representation ‘truck’, in which the vocal
component is more elaborated than the manual. The art is already there with a social purpose
(cf. Dissanayake 2000a, b).
Alongside the uncontrolled (or incompletely controlled) cries, clicks and plops that often
surprise the infant producing them, are those sounds that begin to be produced under more
deliberate control that we can refer to as intentionality. This process is open to attitudes and
interpretations of other persons, sometimes occurring in a solitude that allows for focused
communication with the sounds themselves and other times mediated within a sympathetic con-
nection between human minds. Sounds made intentionally become available and recallable
‘external objects’, able to be transacted in increasingly elaborate fabrications and exchanges
(Donald 2001). The infant becomes involved in assigning and declaring meaning alongside
emotional responses in a complex semiotic traffic of perception and production in which the inte-
gration of sound, touch, and visual cues with movement responses characterizes experience of
stimuli. When drawing, young children dance, gesticulate and vocalize the process of creation
(Matthews 2003). (For a discussion of the evolution and development of processes of musical
representation and semiosis, see Brandt, Chapter 3, this volume, and Kühl 2007.)
Once sounds have gained this quality of an external object, known to the mind and retrievable
by it, they are immediately capable of transferable ownership and referentiality. Just as the physical
‘transitional objects’ of Winnicott (1971)—a favourite doll or a security blanket—can stand in
for the comfort and affection of the mother’s presence, so can a song (Chong 2000). Indeed, Falk
(2004) proposes that this substitution for physical mothering by song evolved as a means by
which hairless proto-human females, unable simultaneously to carry their young and engage in
manual tasks, sang to comfort a remote child while they worked, a theory that has acquired the
soubriquet ‘parking the baby’ in anthropology circles. Musico-poetic sounds announce the pres-
ence of company (Mazokopaki and Kugiumutzakis, Chapter 9, this volume).
21.3.2 Children explore their voices, with others’ interest, and discover the ‘representations’ of song and speech
The ways infants appear to explore their own voices for self-comforting, and how they
might manipulate them as a means of communication, by speech or song, have been widely
theorized. Some say a child learns to sing through imitation of maternal songs and in musical
interactions (Moog 1976; Moorehead and Pond 1978; Hargreaves and Galton 1992; Custodero
and Johnson-Green 2003); others that infants possess an instinctive musical function shared with the
mother, which constitutes the foundation for intersubjective awareness and ‘proto-conversation’,
as proposed by Malloch (1999) and Trevarthen (1999). It could also be a combination or
interweaving of the two.
Infants’ earliest vocal utterances are as much song-like as they are precursors of language.
But can we identify a specific point in development when song and speech diverge? Evidence
from aphasia studies (Sacks 1985; Morgan and Tilluckdharry 1982) and the behaviour of autistic
children (El Mogharbel et al. 2003) suggests that, in adults, the neural mediation of speech and song
is separate, though the findings of functional brain imaging reviewed in Turner and Ioannides,
Chapter 8, this volume, show that the two forms of vocal expression and awareness share
many brain processes. If it is the case that vocal development commences as a single, undifferen-
tiated expressive function, there must come a time at which singing and speaking begin to
represent diverse intentions in the child’s mind. The evidence appears complex. Apparently,
the speaking child’s native language and personal speech characteristics can influence the
manner in which a child sings (Chen-Hafteck 1998; Rutkowski and Chen-Hafteck 2001).
However, one study identified no difference between children of Hong Kong, the USA, and Israel
in mean speaking pitch, while a difference was noted in uses of the singing voice (Rutkowski et al.
2002). There would seem to be a social pragmatics of singing, whereby to sing or not to sing
depends on social acceptance and peer evaluation, acting to govern the inhibition or continued
release and elaboration of expressive vocalization.
A model of how these processes of voice acquisition may unfold is provided in Figures 21.2 and
21.3 (Bannan 2000). They attempt to reconcile the two propositions that: (a) those aspects
of vocalization that are adaptive products of natural selection must be part of the constitution
of every human; and (b) there are cultural differences in the practice and evaluation of singing—
indeed, a significant proportion of persons in Western-style industrial societies come to believe
that they ‘cannot sing’.
This pragmatic theory of voice acquisition is based on the theory of instinct proposed by
Tinbergen (1951), according to which behaviour is either released or inhibited by environmental
‘triggering’ of innate capacities (Figure 21.3). In so complex a behaviour as human singing, the
innate releasing mechanisms that are presumed to determine vocal participation would be seen
in repeated engagement with the human ‘releasing environment’ during development, repre-
sented by the spiral model (Figure 21.1). These operations, at each level, can result in divergent
behaviours accounting, for example, both for solitarious singers (those only comfortable to vocal-
ize in the absence of others) and gregarious ones (who are only comfortable to sing when able to
rely on the support of others).
As children begin to explore their voices, they discover their power to create ‘soundscapes’ that
emerge from within, that imitate the auditory world without, and that reveal, to those of us who
stop to listen, their growing understanding of musical ‘narrative’.
21.3.3 The creativity and pride of children’s music-making

1a Amusement/pleasure derived from use (Right-brain?) (communication with self?)
1b Power derived from use (Left-brain?) (communication with others?)
These are concurrent in early life, converging towards:
2 Satisfaction derived from use
For 2 to be activated in older children or adults with vocal difficulties, both 1a and 1b must be revisited.
Fig. 21.2 A pragmatic stage theory for vocal development.

Released
1a: Continuing via regular exploration of singing potential throughout life.
Signs: Good posture, and sense of self-worth; flexibility of sound; aware of the pleasure
of resonance; accords value to the activity in others as well as self.

Inhibited
1b: Curtailed as a result of inhibitions arising from negative family or peer group responses;
or from inappropriate teaching.
Signs: Poor posture and breathing; low self-esteem; lacks awareness of the pleasure of
resonance; accords low value to the activity for self, sometimes, too, the activity in others.
Fig. 21.3 Appreciation of one’s own voice, and of others’ singing.

The most informative way to study how children’s musical understanding grows is to
observe spontaneous music-making in early childhood, before skills in verbalization develop
(Bjørkvold 1992; Flowers 1993; Gromko 1994; Flohr and Trevarthen 2007). Although, like many
adults, young children cannot name or describe musical concepts and components, they, like all
adults, can demonstrate intuitive understanding of musical principles in their spontaneous
music-making. Adults who have the training to do so can rationally assess intimate details of
this untutored musical understanding as infants and children create with steady beat, phrasing,
pitches and melodies, ‘home’ tone (tonic), repetition, sequence, extension, rhythmic or melodic
patterning, deliberate changes in dynamics, and representation, among other characteristics
(Gromko 1994; Barrett 1996). Children’s creative music-making is part of their being as sociable
persons—it tells us who they are, how they feel and what they know. Pride in mastery of a ritual
performance of a simple action song is part of a 6-month-old infant’s individuality in society
(Trevarthen 2002).
An infant or toddler might ponder on the sounds of their own production while exploring cause
and effect, when vocalizing a chant, or hitting a large drum to hear it resonate, discovering the joy in
what is created before perhaps expanding the theme by adding an extension, and then going back,
repeating the first rhythmic phrase. The activity of infants and youngsters controlling and manipu-
lating sound by ‘doing a riff on it is fascinating, spellbinding, irresistible’ (Dissanayake 2000a,
p. 182). Infants, like jazz musicians, may create their spontaneous music within any ‘grammatical’
structure of the musical culture that they have absorbed. They create clusters of notes that might be
influenced by learnt music, such as the repetition of a phrase with seven regular beats followed by a
pause, as in Twinkle, Twinkle Little Star (Young 2003). Deferred imitations of melodies are a natural
outcome of children’s impressive capacity for aural memory and invention, and of their intuitive
sense of rhythm, phrasing and expressive narration (and see Chapter 14 by Gratier and Danon).
Repetition soon evolves into extension and transformation, by controlling a sequence or vary-
ing certain elements of a motif, such as the pulse, dynamics, melodic direction or rhythmic
content. Children are also seen to develop logical form and structure in their spontaneous music,
perhaps returning to a repetition of an earlier musical statement or developing a definite, empha-
sized ending (Barrett 1996; Gromko 1994). Repetition and pattern seem to be impelled by an
inner need for creative control (Dissanayake 2000a). In these infantile acts, the essential ingredi-
ents of adult creative music-making can be witnessed: ‘preparation (the exploration of possibili-
ties and generation of ideas), incubation (which involves less conscious activity), illumination
(the ‘eureka’ experience) and elaboration (the working out of the project in a tangible form)’
(Boyce-Tillman 2000, p. 19).
Early sound play is part of children’s purposeful sensory exploration of the world by moving,
especially of the social world (Swanwick and Tillman 1986; Tillman 1987). At the same time as
acoustic parameters absorbed from the culture provide tools for transforming early spontaneous
expression, infants continue to display their own distinctive musicality of expression. ‘Children
think aloud through music… They socialize, vent emotions and entertain themselves through
music... It is almost as if children exude music’ (Campbell 1998, p. 4). Children’s own sponta-
neous musical activity forms a subculture of its own (Bjørkvold 1992). They develop their reper-
toire of playground music, often associated with intricate games, vocal chants and body
percussion rhythms. The musical expressions of each child become his or her possessions, satisfy-
ing a powerful sense of ownership. ‘Getting a grasp on what music means to children is coming to
understand what they know and value’ (Campbell 1998, p. 171).
When teachers take time to observe children rather than instruct, they gain information about
children’s musical behaviours, their developmental stages, the role of music in their lives and,
by implication, the roles of the teacher that may be most effective (Tarnowski 1996; Flohr and
Trevarthen 2007; and see Fröhlich Chapter 22, and Custodero Chapter 23, this volume). Stepping
back, getting out of the way, allows us to observe how children may experience total engagement
and focus in their spontaneous musical activity, when they appear to be in ‘flow’, a state defined
by Csikszentmihalyi (1990) as a balance between challenge and skill. Flow is described as integral
to optimum musical experience (Custodero 1998), and an experience that leads to self-growth
(Elliott 1995). In taking a closer look at the various modes of spontaneous music-making, we are
first confronted with the mutual regulation of experiences that occur in early engagements of
vocalization.
21.4 Musical invention and imitation

21.4.1 Inventive song, and its development in communication
From birth, spontaneous vocalizations occur both as independent exploration of sound by the
individual, and as an essential part of intersubjectivity with caregivers, guiding a shared evalua-
tion of the world. Intersubjectivity is described as the ‘mutual understanding that is achieved
between people in communication’ (Rogoff 1990, p. 67). This building of acquaintance with
meaning takes place within intense emotional interaction, and in early infancy it is ‘strongly
regulated by the infant’ (Trevarthen 1979, p. 343). Development of musical responses and under-
standing grows from this intuitively ‘musical’ communication, with a foundation in sensitivity
for the emotional variations and ‘intrinsic motive pulse’ of human movement (Trehub 1990,
2001; Trevarthen 1999; Flohr and Trevarthen 2007). Young infants respond with exquisite preci-
sion to the contingent responses of an engaged parent’s affectionate and joyful facial and vocal
stimuli, even when these are mediated through a video system (Murray and Trevarthen 1985;
Nadel et al. 1999), and a baby can soon perceive emotional messages in faces presented on a
television screen, deriving information from them by ‘social referencing’ to learn how to react to
objects (Mumme et al. 1996; Mumme and Fernald 2003).
In early communicative interactions, musical features define the emotionality of early vocal
signals, and neonates attempt pitch matching, show auditory recognition and anticipation, and
display preference for familiar sounds (Kessen et al. 1979; Minami and Nito 1998; Dissanayake
2000b; Papoušek and Papoušek 1989; see Powers and Trevarthen, Chapter 10, this volume). The
range of these responses is, however, clearly constrained by anatomical features, and the window
for imitative vocal pitch behaviours appears to partially close, possibly due to changes that occur
with the descent of the larynx in the second six months after birth (Kessen et al. 1979; Minami
and Nito 1998; Trollinger 2003).
Early infant ‘babblings’ (Locke 1993) are described as ‘vocal scribblings and meanders’ that lay
the ground work for later speech activity and singing with words (Dissanayake 2000b).
Alternatively, vocalizations or sound-making actions are referred to as musical doodlings, short
melodic or rhythmic fragments that emerge either as the object of focus or while the baby or
toddler is engaged in another activity (Kartomi 1991, pp. 55–56).
Sometimes they are unaware of this musicking, as it flows almost in a stream-of-consciousness way
from their voices and bodies. Yet it is also made by children with the full intent of preserving a song,
rhythm, or game buoyed by music. This music may even be their concentrated efforts to make up
music that expresses their thoughts in musical ways.
Campbell (1998, p. 13)
In their deliberate vocalizations, children express emotions, socialize with friends, expel energy,
accompany tasks, announce transition at the end or start of an activity, think aloud, name, tease,
taunt, beckon, call, jeer and criticize, celebrate, make fun, dramatize, instruct, give directions,
exclaim, complain, describe, and question (Campbell 1998). Pitched vocalizations occur on a
monotone or display a variety of tones (for example, the falling minor third), and incorporate
nonsense syllables or repeated words. Children create their own message songs (speech with idio-
syncratic meaning), and improvise stories or snippets (part or whole) of previously learned vocal
material (Omi 1992). They invent their own rhymes, chants and tunes, some of which match a
wide range of concurrent games, activities, movement, play, manipulation of objects or instru-
ments, and dramatizations. Furthermore, vocalizations may be interspersed with a number of
unvoiced lip and tongue sounds, such as whistles, clicks, and smacks. From infancy, vocalizations
appear to include variations in pauses, stress, amplitude, tempo, tone quality, and rhythm.
Their constant musical utterances, and their music lore and repertoire... [are]... certain testimony to
the integral nature of music in their lives – whether offered as a result of painstaking deliberation or in
the spirit of spontaneity.
Campbell (1998, p. 64)
When children are engaged in creating sounds while acting on instruments and other objects,
they are observed to frequently accompany sporadic and repeated motifs of vocalization with
movement (Matthews 2003). This leads us to consider how children’s initial vocalizations develop
with the experience of rhythm in the body that is no less ‘musical’ than their vocalizations.
21.4.2 Spontaneous dancing and learning music

The human body is a rich and versatile personal resource for musical expression. Refining, regular-
izing and repeating the fundamental form and timing of movements with elaboration are innate in
the child (Fein 1993), and they lead to dance, song and poetry (Dissanayake 2000a; Miall and
Dissanayake 2003). Agility and control in experiences of manipulation are precursors for the
playing of musical instruments (Goddard Blythe 2005; see Fröhlich, Chapter 22, this volume).
Prenatal movement response to sound is a phenomenon reported by pregnant mothers
who feel their unborn babies kicking in response to a sudden noise, such as a banging door, or to
prolonged loud music at a concert. Incubated premature infants lessen their fretful activity
of body and limbs in response to ‘calming’ music, a possible rechannelling of energy that
results in the significant psychophysiological effect observed in increased physical growth (Salk
1962; Katz 1971; Kramer and Pierpont 1976). Neonates spontaneously localize sounds, turning
the head towards an acoustic stimulus (Field et al. 1980), and modify behaviour in order to
activate a recording of familiar music in preference to unfamiliar music (Panneton 1985;
Cooper and Aslin 1989). Infants display physical movement response to their own vocalizations,
as well as to mothers’ singing and recorded music, and at 6 months of age, rhythmic babbling and
rhythmic banging of objects held in the hand develop together (Dissanayake 2000a, b;
Young 2002).
Instinctive looking, reaching and grasping, head turns and head bobbing, and self-exploratory
limb and trunk movements gradually give way to more purposeful reaching, touching, grasping,
and tapping. Repetition and expansion of rhythmic movement gradually become more controlled
as infants spontaneously create and experiment with their own body rhythms, and as they respond
to music (Mazokopaki and Kugiumutzakis, Chapter 9, this volume). The repetitive grasping and
manipulating that leads to the spontaneous creation of sounds with objects and instruments in
order to make music can be seen as the natural outgrowth of early experience in the rhythmic
musicality of all spontaneous movement. Instrumental playing of toddlers is accompanied by
twirls, steps and movement patterns that could be categorized as dance (Young 2003).
In free play, young children communicate with expressive gestures—rhythmic punches, jabs,
pointing of the finger, and beckoning motions—and often these are accompanied by improvised
musical motifs. Campbell (1998), observing young children during school recess, noted that they
wriggle in a regular, rhythmic manner. Legs swing back and forth, shoulders sway and heads bob
metrically. The children manipulate objects, such as lunchboxes, cartons and foil, in repeated pat-
terns, adding to the polyphonic textures of movement created in their own multiple movements
or by the other children around them. At various tempi, children step, stamp, skip, run, trot, gal-
lop, tiptoe, stomp, shuffle, skate, drag, step, wave, stretch, bend, slide and much more. They
invent choreographies, movement patterns, clapping sequences, some of which are in interaction
between two or more people. Children’s imaginative play leads them through musical and move-
ment worlds of dramatizations and representations of animals, people, things, emotions and
situations. ‘Rhythms were as visible in their movement as they were audible’ (Campbell 1998,
p. 30). Similar rhythmic, musical playground behaviours, including rituals created by the
children as their own ‘musical culture’, were observed by Jon-Roar Bjørkvold in Russia, Norway and
the United States (Bjørkvold 1992).
While normal strides in physical development lead a child to spontaneous musical activity
engaging all the body, input of stimulation, especially from other persons, plays an essential role.
Dennis (1960) observed that infants in Iranian orphanages who lay day-after-day on their backs
in their cribs, with no toys or human manipulation, did not move on their own until 2 years
of age, when they scooted on the floor rather than crawled, and only 15 per cent walked by the
age of 3–4 years.
21.4.3 Manipulation of objects as ‘instruments’ in spontaneous musical story-making, for the self, or ‘for show’
The degree to which arm and hand movements can be controlled and modified by anticipation
of their effects determines a person’s ability to control sounds through manipulating objects,
including musical instruments (Lee and Schögler, Chapter 6, this volume). Neonatal prehensile
skills begin as oriented and rhythmic swipes and swings (called pre-reaching) that, though aimed
in coordination with gaze, are not guided in course and may not result in contact with the object
that excites them (Trevarthen 1984; Berk 2002). Voluntary reaching with more precise guidance,
from about 3 months, advances the infant’s exploration of the environment, leading to increasing
mastery of fine movements with development of binocular stereopsis around 6 months, which,
in turn, enables constructive grasping and manipulation of objects and instruments (Case Smith
et al. 1998). The controlled ulnar grasp, where the fingers close against the palm, emerges at this
stage. Not needing the arms to maintain balance from about the age of 4–5 months, the infant
begins to master sitting, and hands are free to explore objects away from the body. One hand
holds an object, while the other is able to scan it with the tips of the fingers (Rochat 1992). From
about 7 months, the infant reaches to grasp with one hand rather than both, and at 9 months the
visually controlled pincer grasp (thumb and index finger moving in opposition) becomes more
versatile (Fagard and Pezé 1997; Berk 2002).
Babies from around 5–6 months will empty a container of objects placed within reach
(Goldschmied and Jackson 1994). The child will clutch the item, before dropping it. The child
becomes increasingly aware that a movement, like dropping, can be used to experiment with
sound-making with the object or music instrument. Deliberate shaking, hitting, scraping, and
blowing may be applied to make sounds develop (Young 2002). Young (2003) describes the visual
and spatial pathways and patterns that are typical of children’s spontaneous instrumental play.
In her observations of children aged 3–4 years, she describes a child setting out an array of
instruments around her on the floor, playing them several times from one end to the other and
then playing them in reverse order. Furthermore, numerical logic and regularity may be evident
in such experimental games. Children build sound shapes, improvising alone and together, using
polyphonic rather than harmonic structure in their expressions (Pond 1981).
‘Children socialize, vent emotions and entertain themselves through music’ (Campbell 1998,
p. 4). They employ instruments in role play, using the instruments as metaphors for objects and
also using their sounds to represent people, creatures or items. An entire story—either one heard
before, or an original creation that is improvised—may be ‘acted out’ with the instruments.
Sometimes, children set up a performance situation, clearly imitating performers they have
observed. Young (2003) observed that many children begin explorations with instrumental
sounds in a mechanical way, seeking what she describes as ‘stability’ from which they can then
expand their invention. She also noted that a few children play mostly insistently and loudly, in
an exhilarating, high-energy style that is hugely exciting and stimulating to the child.
As we examine the repertoire of children’s sound explorations, and observe how musical and
linguistic development appear to proceed in parallel fashion, we are compelled to confront the
communicative nature of their musical interactions alongside the question of how music func-
tions for them in an artistic, aesthetic capacity, beyond language and its informative references or
practical communication and its uses.
21.4.4 Music for music’s sake: finding and losing ‘meaning’ in vocal sound
There is a widespread assumption in the literature (e.g. Moog 1976; Hargreaves 1986; Hargreaves
and Galton 1992) that singing develops to support language as a means of performing with
words that have already been learned: a supposed sequence that models, at the individual level,
Pinker’s (1997) theory that the development of music in our species has been a by-product of the
evolution of language. This account of the inherent, and inessential, dependence of music on lan-
guage is, of course, rejected by those who know the role of music in all human societies (Brandt
Chapter 3, Cross and Morley Chapter 5, Dissanayake Chapter 24, this volume), and the musical
and other communicative abilities in infancy (Trevarthen 1994, 1999). It is inconsistent with the
reality of spontaneous infant musicality reported above, and with the experience of parents. The
younger son of one of the authors was able to vocalize with great musical accuracy the theme of
each of the locomotives in the children’s television programme Thomas the Tank Engine at the age
of 18 months, well before he began to use language fluently.
Yet there is a mystery here that remains to be explored: if a disposition to learn musical vocal-
ization develops even earlier than speech and in a way that supports eventual linguistic mastery,
why do so many people who talk well fail to achieve their musical potential? What is the purpose
of musical art for those who continue to pursue it? The means to explore the first of these
questions may well reside in careful analysis of the divergence of proto-musical and proto-lin-
guistic elements in early spontaneous vocalizations; but we lack the systematic research to pro-
vide the answer. The second question can only be addressed through comparative study of the
relationship between universal capacities for musical development and the specific role of music
experience in child development and cultural practice. Again, data is lacking.
Meanwhile, attempts have been made in the education literature to explain the phenomenon of
children’s enthusiastic engagement with music. Swanwick’s (1999) model of music as discourse
made by sequences of notes has proved influential for a pedagogy in which educators strive to
‘teach music musically’. But just as one can view infant capacity for vocal expression as instinc-
tively gestural, and mimicry of target sounds, whether they be linguistic or animal vocalizations,
as representing the attempted capture of sonic gestures in their entirety, it would seem likely that
infant engagement with instruments is more motivated by the physical whole-body enactment of
contours in pitch and amplitude than the assembly of melody out of discrete pitches. Swanwick’s
account of musical structure parallels Chomsky’s (1957) theory of language: it captures what
may occur once enculturation confers fluency, along with differentiation or ‘discretization’ of
audible actions and expressions (Brandt, Chapter 3, this volume). However, this theory of musi-
cal acquisition cannot account for what young children do prior to the stage when they can iden-
tify pitches in melodies, or for the transition between the two. Swanwick’s Chomskyan framework
for how participants assemble expressive structures out of notes seems influenced by the rational
technique of adult composers as opposed to what children actually do and experience.
The instinctive musical utterances of children tend to represent complete melodic gestures or
phrases made whole, rather than patterns assembled from discrete notes. Cross-cultural analysis
of the prosody of infant-directed speech (Fernald 1989) brings to light the natural process by
which adults also generate pitch-contours that operate gesturally, rather than as assemblies of
elements at a more ‘atomic’ level of organization. Scherer’s (1992) cross-species comparison
of vocal communication suggests that there are neural archetypes for the expressive contours of
vocal utterances and their perception that have similar structural properties across a wide variety
of animal species, and thus clearly predate the development of language in humans. Animals
make rhythmic and phrased signals that resemble music (Wallin 1991; Wallin et al. 2000). This
accords with the evolutionary neuropsychology of Paul MacLean (1990) (and see Panksepp and
Trevarthen, Chapter 7, this volume).
It is clear that children’s spontaneous singing is frequently motivated to communicate
thoughts, ideas, emotions, stories and directions of purpose, as well as for enjoyment of the
moving body (Bond Chapter 18, Fröhlich Chapter 22, this volume). Examination of their cheers,
taunts, and musical narratives allows us to glimpse a few of children’s countless symbolic
representations in sound of motivated characters, objects, feelings and stories (Gromko and
Poorman 1998a; Hetland 2000; Tommis and Fazey 1996; Omi 1992). The music therapy litera-
ture demonstrates that improvised musical communication can have powerful effects on creativ-
ity outside language (El Mogharbel et al. 2003) while it displays all the functions of free
play within acquired conventions that linguists associate with generativity in speech (Robarts
Chapter 17, Wigram and Elefant Chapter 19, this volume). Furthermore, music can be a success-
fully shared experience among children who have grown up speaking different languages and
who cannot communicate through speech, enabling them not only to communicate particular
ideas, but also to share discovery of the aesthetic quality of sounds and the enjoyment of group
activity (Turel 1992).
In many languages, the word for musical engagement, especially with instruments, is play. Just
as language acquisition arises out of experimental social communication, so can music be seen as
an extension of sonic discovery and the capacity of sounds to be employed for playful narrative
or affective purposes. But what, prior to formal musical learning or corrective adult influence, is
the true nature of children’s spontaneous musical invention; and where does it lead? Future
research may illuminate how musical play first experienced in infancy remains a form of memory
or scaffold onto which lifelong musical engagement and perception may be built. While answers
may currently elude us, we will, nevertheless, examine the influence of adults—how they may
inhibit or foster children’s innate musicality.
21.5 Receptive environments for learning music

21.5.1 The role of adult companions: enablers or instructors, or both?
Infants display a natural preference for the attunement of expressive vocalizations and
movements of human companions (Brazelton et al. 1974; Richards 1974; Stern 1971, 1974; Stern
et al. 1985). They listen and quickly learn that their vocalizations elicit maternal response, partic-
ularly as the mother matches and imitates them (Bullowa 1979). They soon engage in the proto-
conversational dialogues that Mary Catherine Bateson (1975) identified as the source of both
language learning and ‘ritual healing practices’. They join in creating lively and ritualized games,
and around 9 months they collaborate in tasks, taking up other persons’ initiatives (Trevarthen
and Hubley 1979; Hubley and Trevarthen 1978). From that time, babies are increasingly inter-
ested in sharing and imitating any activities within their power that they perceive are important
to the adults they know well. They begin cultural learning, which includes ‘proto-language’
(Halliday 1975), and activities we may call ‘proto-music’. According to the theory of Vygotsky
(1962), new understanding has to occur in a social context first, before being incorporated into a
person’s cognitive structures. This is certainly the way older infants and toddlers learn new means
to express their musicality. They learn when the teacher enters willingly into their ‘zone of
proximal development’ (Erickson 1996; and see Erickson, Chapter 20, this volume).
The shared and communicated responses of the family to musical ‘playing’ stimulate music
learning in the child, from infancy. Hobson (2004) observed that a mother’s response to musical
sound guided the interest of an 8-month-old infant. They were playing, and a Chinese bell was
chimed. The mother showed no response, continuing with the game, and the infant also concen-
trated on the play activity. On a second occasion, the bells were chimed and the mother inter-
rupted play, turning and pointing towards the sound, responding with an ‘aa’ vocalization, and
this elicited a strong response of the infant to the sound. Similarly, 3- to 5-year-olds demonstrate
more attentive listening when the teacher exhibits high magnitude affect (Sims 1986). The sense
of anticipation in mother’s action songs like This Little Piggy or Round and Round the Garden
may evoke excited movements, sounds and gestures from a baby that synchronize with her
expressions (Young 2003; see Eckerdal and Merker, Chapter 11, this volume). Certainly the spon-
taneous vocalizations of children may be strongly affected by cultural influences. Rutkowski and
Trollinger (2005) suggest that a child who is not encouraged to sing may stop singing. Though
the impulse to sing is there, it needs appreciative company.
The musical environment of the home offers opportunity for the parents’ own singing or
playing of instruments not just for children, but with the children, accepting their imitative
interest. Live performance, from the singing of a simple melodic line or striking of one sound
from a hand percussion instrument, to masterful musical performance, can tantalize a
child’s impulsive musical contributions at any age, and instruct their curiosity in conventional
musical ideas. After one year, infants like to imitate increasingly clever meanings. The second
author has reported observing an 18-month-old infant spontaneously saying the name of each
composer when hearing the first bar of each piano piece that his mother performed for him
(Woodward 2005). These pieces were part of a small repertoire that the mother knew well.
She had not anticipated that her infant would learn to associate the names she spoke with the
unique sounds of each piece. This act of labelling was sometimes followed by quiet listening
and other times by a variety of movement responses. Her naming had given the pieces special
significance.
In selecting sound-makers or instruments the child can easily play, parents may influence chil-
dren’s natural exploration of expressive sounds through choice of quality and timbre, but some
‘instruments’ targeted by commercial companies for children have poor tone quality and intonation;
others may be dangerously loud. In her own observations of formal music lessons for toddlers with
their parents, the second author noticed that allowing a free-for-all with any mix of instruments
would lead to an excessive cacophony that was obviously both distressing and frightening to some of
the young children. We believe this may be avoided by alternating instrumental activities where sets
of instruments with one tone colour are enjoyed at a time, thus guiding children to listen to the par-
ticular tone qualities of the specific instruments. Other influences include decisions regarding the
designing of play settings, regulating access, integrating recorded music in storytelling and games,
and facilitating adult observation, listening or participation (Littleton 1991; Young 2003).
Early spontaneous musical responses to recorded music by vocalizations, gestures and move-
ment, and later through making sounds with objects or instruments, prove the powerful role
adults can play in fostering or discouraging children’s music-making (Chen-Hafteck 2004). Adults
face an ever-increasing palette of sounds from which to make their selections, including record-
ings of inferior quality marketed towards an audience of babies or children. Systematic research
might better inform us of the recordings appropriate for each developmental level. Most impor-
tantly, the value and meaning of early experiences with recorded music will, we know, depend on
adults sharing in, and providing feedback for, the child’s spontaneous musical responses.
The passing on of traditional folk songs, children’s songs and nursery rhymes to the next
generation plays a significant role from infancy in developing children’s cultural identity,
immersing them in their heritage (Custodero and Johnson-Green 2003; see Custodero, Chapter 23,
this volume). It contributes towards developing a musical vocabulary for children that may be
imitated or altered in spontaneous music-making, or used as tools with which to create new
sounds (Blacking 1995). Adults and older siblings play a part in the relationship between physical
and musical responses in action songs, knee bouncing activities and games that manipulate or
indicate parts of the infant’s body through pointing, touching or tickling (Stern 1990; Stern and
Gibbon 1980). Such games often enact narratives that develop a sense of anticipation and release
(Malloch 1999). They encourage the making of connections between musical sound and move-
ment that emerges in children’s own music.
21.5.2 Improvisation and sharing cultural practice

When adult performers expose children to their own creative music-making, they allow children
to perceive the enjoyment experienced by the adult in sound exploration and the creation of new
musical material. They may adapt creativity to improvise in response to children’s vocalizations
or attempts to play an instrument—a method integral to the Nordoff Robbins music therapy
practice with children who have special needs. We suggest that repeatedly offering a specific
motivator for the child’s musical contributions, through immediate adult musical responses to
the child’s musical sounds, provides a communicative incentive beyond what might be achieved
when the child plays along to an electronic musical recording. We believe it is beneficial for
children to take part in the experience of all kinds of live music, at social events, community
gatherings, religious venues, carnivals, and performing arts venues. Many cultures welcome
young children into community music performances, where they will likely offer spontaneous
musical responses and contributions, unlike the scenario of the Western classical music concert
when the children are left at home with a babysitter.
The greatest challenge faced by adults who seek to foster children’s spontaneous music-making
is in achieving a balance between encouraging, facilitating, responding and guiding on the one
hand, and, on the other, allowing the child freedom to explore independently, which is an essential part of the
creative process (Boyce-Tillman 2000; Flohr and Trevarthen 2007). Criticism and correction by
adults may interfere with explorative sound-making, breaking the child’s focus and the enjoy-
ment of musical flow. The importance of independent musical activity in early childhood is sup-
ported by research that shows children engage in more complex dramatic play when they are
given the freedom to create without teacher ‘modelling’, and they may exhibit greater musical
freedom, creativity and initiative when adults are less engaged as directors (Tarnowski and
Leclerc 1994; Smithrim 1997). When an adult responds by watching, commenting, presenting
musical objects, modelling or imitating, the child is given the messages that the activity brings
attentiveness, and that others give the activity value (Young 2003). This stimulates a feeling of
pride in the child that motivates learning (Trevarthen 2002). Other interactive activities include
initiating conversational games involving musical question and answer; matching the pulse
established by the child; and attaching vocal rhymes or songs to the pulse.
The familiarity and manner of relating of adult to child, or the adult’s ‘respect’, is clearly of
importance. Young (2000) observed that children aged 3 to 4 years who are partnered with an
adult engage in spontaneous instrumental play for longer periods when interacting with a familiar
musically untrained adult than with a musically trained adult stranger. In another study,
if the adult made a deliberate attempt to ‘play badly’, the children lost focus on and enjoyment in
their own playing, turning their attention towards instructing the adult in achieving a better
product by holding the instrument in a different way and finding ways to create a better sound. If the
adult played musical interludes that were unrelated to what the child was doing, the child lost
interest. Furthermore, if the adult came in too early or paused too long in echoing or matching
the child’s phrases, the child became frustrated (Young 1999). This accords with Erickson’s evidence
of the importance of concerted musicality in classroom conversation with young children
(Erickson, Chapter 20, this volume).
Campbell (1998) noted that children as young as three years of age display complex musical
components in their spontaneous music, such as changes in metric emphasis (e.g. shifting a
melody to a metric grouping of three beats, followed by two and then two again). She suggests
that the natural musicianship of children should be retained by capturing and reinforcing these
musical experiments once children enter formal education. Instead, many music curricula in
schools make sure that these complexities are not included in any of the teaching or materials
until many years later, when as youths or young adults, they might have to relearn them as intel-
lectually challenging assignments (Bjørkvold 1992). We believe that as far as possible, children’s
natural capacity for improvisation and composition should be part of the curriculum from the
moment a child starts formal education, to nurture natural musical intelligence. When teachers
progress beyond traditional methods of rote learning, in which children only learn music pre-
scribed by the teacher, they open the classroom to a fascinating world of children’s own music
and creativity (Laycock 2005; Fröhlich Chapter 22, and Custodero Chapter 23, this volume).
It has also been shown that when children create music themselves, they experience a special
sense of ownership.
To deny children the opportunity to work creatively with the materials and structures of music is to
limit their capacity to think artistically and, ultimately, to limit the full exploration of what it means to
be musical.
Campbell and Scott-Kassner (2002, p. 271)
Children need to find musical experience intrinsically meaningful for it to impact significantly
on learning (Elliott 1995; Custodero 1998; Barrett 1990, 1992). Meaning is also influenced by the
context, and that context provided by adults is both ‘crucial and complex’ (Custodero 1998,
p. 24). We should not forget, however, that adults are not the only major social influence on chil-
dren’s spontaneous music-making. Children also discover, learn and invent musical activities and
experiences with peers.
21.5.3 Musical companionship with peers: singing and playing what cannot be said
Children’s socialization with peers plays a significant part in the growing independence of chil-
dren’s music-making from adult influence (for a very early example see Bradley, Chapter 12, this
volume). This process is beautifully captured by Romet (1992) in a study of how children in
Javanese Sunda make the transition from interacting with the infant-directed songs of their
mothers, called Neng Neleng Kung, to participating in the musically quite different repertoire
acquired through contact with other children, named Pring Prang. Romet’s transcriptions and
commentary allow the following comparisons:
Neng Neleng Kung               Pring Prang
Sung by an individual adult    Sung by groups of children
Timeless                       Measured
Melismatic                     Syllabic
Organic                        Repetitive
Melodic                        Rhythmic
Continuum                      Lattice
Individual                     Group
Serial                         Simultaneous
Romet’s illustration of a distinct repertoire of songs transmitted from child to child, with orga-
nizational characteristics defined by group participation, accords with the themes of subversive
and covert means by which groups of children exchange songs with one another revealed in the
work of Iona and Peter Opie (1985) and the linguist Guy Cook (2000). The coining and sharing
of repertoires of this kind, which often represent variants on existing songs, provides a mecha-
nism for social release and the coding of emotional response to real-life issues that cannot,
or dare not, be expressed in conversational speech. The child may not even be aware that such a
choice is being made: the little boy who sang all the tunes associated with engines from Thomas
the Tank Engine employed this means of distinguishing between their colours (red, green, blue,
yellow), vocalizing the melodies long after most children would begin to employ words categori-
cally. Only much later did it dawn on his parents that this represented his way of expressing
responses to colours he was unable to label confidently due to an undiagnosed colour-blindness.
Children are socialized into communities, or become part of their collaborations, through
musical experiences (Campbell and Scott-Kassner 2002). Spontaneous music-making in infants and
toddlers is strongly affected by peer pressure. Where two or more peers are present, children are
sometimes seen to be less focussed on their sound product, than on checking for peer ‘approval,
permission or camaraderie for their actions and responses’ (Custodero 1998). From as early as
6 months, babies are seen to grin at, gesture and imitate playmates (Vandell et al. 1980; Vandell and
Mueller 1995; Bradley, Chapter 12, this volume). When peers interact, an enormous number of
musical expressions emerge as children tease, call, scold and create with other children (Campbell
1998). Children create narratives together, feeding off each other’s development of story in free play,
often with toys or other objects, music being a means of imaginative representation (Gromko and
Poorman 1998a; Young 2002). In addition, children co-invent games that are often accompanied by
body percussion and movement. These games are sometimes intricate in rhythmic pattern and
require considerable skill for successful participation. As children set these high levels of challenge
for themselves, they appear to reach corresponding levels of joy and self-fulfilment in their achieve-
ments (Elliott 1995). Nadel and colleagues describe this infectious play between pre-verbal toddlers
as an ‘immediate imitation’ of meaningful actions, and this research emphasizes the joy that can
come from doing the same things together (Nadel and Pezé 1993).
Besides integrating music into imaginative play, children use it as a coping tool for managing
the world in which they live. ‘Music emerges magically from children as they search for and find
ways to represent their world’ (Campbell and Scott-Kassner 2002, p. xi). One common occurrence
involves children changing the words of a song learnt from adults. Even traumatic events find
their way into playground songs as children use music to help them deal with the horrors of a
society of which they might seldom speak. The second author remembers her experience of
young children on a South African school playground in the late 1990s who sang a well-known
children’s song with an adapted text that referred to the murders of farmers and takeovers of
farms in neighbouring Zimbabwe, with these words:
Old MacDonald had a farm, Ee Ei Ee Ei Oh.
And on that farm he had some guns Ee Ei Ee Ei OH.
With a bang, bang here, bang, bang there,
Here a bang, there a bang, everywhere a bang bang,
Old MacDonald had no farm any more.
She also recorded that in the early 2000s, young South African children referred to both the
AIDS epidemic and the high incidence of rape, adapting words to a song learnt from a children’s
television programme:
I love you, you love me.
Barney gave me HIV.
It started with a kiss and it went too far,
Barney raped me in his car.
While these children were obviously made aware of dangers and atrocities, they seldom
appeared to speak of them amongst themselves, yet they sang about them, the music possibly
providing a means of confronting and coping with their fears.
Finally, we turn to the role of music in the human cycle, where one generation passes to the
next an array of ideas, social practices, stories, and aesthetic responses in tones, rhythms, songs
and sound landscapes. This is a phenomenon integral to human existence.
21.6 The musical life cycle: recovery of musicality in parenthood, and the effects of schooling
If the capacity for musical engagement is instinctive in infants, what happens to this ability over
the human life-cycle? Does musicality offer advantages to mature human adults in technically
complex modern societies? A clue can perhaps be discerned in the way adults communicate with
the very young.
Mothers communicate in a manner special for the infant, a way of speaking and acting expres-
sively not used in any other social interactions. Indeed this interaction, characterized by the more
gentle-toned infant-directed speech (see Section 21.2.3), would be inappropriate in any other circumstance,
but is suggested to comprise biologically relevant signals (Fernald 1992; Papoušek 1994;
Papoušek 1996). This form of communication is adapted to initiate sharing of human experience
and thinking with infants. ‘Mothers’ responses to two-month-old infants are stimulating, attentive,
confirmatory, interpretive and highly supportive’ (Trevarthen 1979, p. 232). The infant’s behaviour
is not only the expression of his or her ‘own consciousness and purpose, but these expressions are
coordinated with the behaviour and experiences of another person’ (Hobson 2004, p. 32). In social
exchanges between infant and mother, ‘both participants in the exchange modify their action in
accord with the feedback they receive from their partner—and so the interchange is genuinely
reciprocal’ (Hobson 2004, p. 36). As infants grow they are ‘developing increasingly rich and pleasurable
forms of mutually sensitive interpersonal engagement’ (Hobson 2004, p. 42). How do
mothers know that this behaviour is appropriate? How do they activate patterns of vocalization
that may be quite unlike their normal speech? Is such behaviour universal?
Ideas drawn from archaeology and anthropology have led to fresh thinking about the human
phenomenon of music. In attempting to reconstruct the role of music-making in early human
society as a model against which to compare the role of music in child development today,
we encounter radical differences in both structure and function. The former could be character-
ized by musical engagement between infant and carer that initially endows the means to
exchange representations of experiences in the self and of responses to the environment. This
may give way to a more ritualistic, peer-responsive repertoire that prepares for roles in the culture.
Material may contain mnemonic or narrative components that have a teaching function to
aid memory and establish beliefs. But, at all stages, musical participation remains a function of
active social interaction: with nearest relatives, with cousins and the extended family, and with
the larger social unit that affords protection and group identity. Under such conditions, creativity
for the self and individuality in the group are one: the distinctive, material voice of each partici-
pant is their contribution to the whole.
The problem of interpreting children’s musical development today in such naturalistic terms is
that we cannot ignore the consequences of schooling, both for its direct effect on pupils’ progress
and self-estimation as musicians, and for its effect on societal conventions as to who is perceived as musically
talented or proficient and who is not. Displacement of the ‘intent participation learning’ charac-
teristic of pre-industrial cultures, and still important for leisure and sport activities, by schooling
according to a curriculum of instruction, has important consequences for the experience of
musical culture (Bruner 1996; Rogoff 2003). For instance, the labelling of ‘tone-deafness’ (a term
the authors would prefer not to employ) has been shown to be almost entirely a consequence of
inappropriate adult judgement or peer pressure (Knight 2000).
In Western-style societies in which perceptions of fame and precedent dominate the reception
and evaluation of expressive activities, this process of selection according to acknowledged talent
rapidly converges on rewarding those who mimic sophisticated adult behaviour early in their
development, either in the prodigious performing abilities of those able to master the classics, or
in those who ape the performing style and repertoire of popular performers. In both cases,
appreciative adults may be seduced by the cuteness factor through which performance is
applauded largely because it is precocious.
There are, in fact, few activities in which it is considered healthy for children to be pushed into
attempting adult levels of physical and mental achievement. That this is possible at all is a fasci-
nating aspect of the unique relationship between musical potential and the plasticity of the
young mind. The danger of such accepted conventions of musical training is not merely for
the talented youngsters themselves, who may not always benefit from a weight of expectation that
sacrifices their childhoods in the pursuit of adult levels of achievement before their time, but for
the many normal children with musical ability who are judged by comparison to be unworthy of
the opportunity to make music effectively. The idea is then formed of a ‘musical child’—one
worth investing in as a pupil for intensive and accelerated training.
A related problem affecting continuity of musical development in children, and sustaining
enjoyment of making music to adulthood, is the way that centrally planned curricula define
musical activities in terms of what can be taught and assessed rather than in terms of what
children might choose to do. At best, a curriculum frames the entitlement of children to partici-
pate in activities that extend their native vocal and movement abilities in a musical context. But a
curriculum can all too easily become an instrument for describing the knowledge that a teacher
delivers: a means of teaching about music rather than through music. This is all too likely to be
necessary where children are taught by teachers comfortable with factual knowledge, but less so
with their own capacity for expressive musical participation. No syllabus can compensate for the
loss of musical curiosity that results from an interruption in children’s confidence to make music
of their own volition (Flohr and Trevarthen 2007).
These twin features—the tendency to single out only a small proportion of children as being
musical, and the trend towards prescribed curricula defined in words and concepts—contribute to
the Western malaise whereby, while all children are born musical, relatively few adults see them-
selves as remaining so. But biology seems to provide a strange way of compensating for this in the
alteration of adult behaviour that can result from sexual reproduction itself. Street (2003), in a
study consistent with existing data in Trehub (2001), and paralleled by Ilari et al. (2003), reveals
how motherhood can release musical communication in women who do not see themselves as able
to sing. There is a marked difference between their negative self-evaluation in interview and
questionnaire, and the expressive nature of their vocal performance with infants captured on video.
Further research may reveal to what extent fatherhood can have the same consequences.
Whatever differences there may be between fathers and mothers, the idea that engagement with
the communicative needs of infants triggers a musical behaviour even in adults who deny that
they have musical ability adds a dimension to the view of music as a biologically determined
emotional need. Just as the infant perceives the environment in musical terms, the role of music-
making comes full circle in adults’ rediscovery of latent musical expressivity in response. The true
heir of spontaneous music-making in childhood is its maintenance as a lifelong component of
human self-fulfilment.
References
Alegria J and Noirot E (1978). Neonate orientation behavior towards the human voice. Early Human
Bannan NJC (2000). Instinctive singing: lifelong development of ‘the child within’. British Journal of Music
Education, 17(3), 295–301.
Bannan NJC (2002). Music in human evolution: an adaptationist approach to voice acquisition. Unpublished
Ph.D. thesis, Department of Arts and Humanities, University of Reading.
Bannan NJC (2004). Language as music: future trends in interdisciplinary research into the origins of
human communication. Unpublished paper presented at the conference Music, Language and Human
Evolution, University of Reading, 28 September–1 October, 2004.
Barrett M (1990). Graphic notation in music education. In Music education: facing the future, pp. 147–153.
International Society for Music Education, Helsinki.
Barrett M (1992). Music education and the natural learning model. International Journal of Music
Education, 20, 27–34.
Barrett M (1996). Children’s aesthetic decision-making: an analysis of children’s musical discourse as
composers. International Journal of Music Education, 28, 37–61.
Bateson MC (1975). Mother–infant exchanges: The epigenesis of conversational interaction. In D Aaronson
and RW Rieber, eds, Developmental psycholinguistics and communication disorders, pp. 101–113. Annals
of the New York Academy of Sciences, Vol. 263. New York Academy of Sciences, New York.
Bekoff M and Fox MW (1972). Postnatal neural ontogeny: Environment-dependent and/or environment-
expectant? Developmental Psychobiology, 5(4), 323–341.
Benzaquen S, Gagnon R, Hunse C and Foreman J (1990). The intrauterine sound environment of the
human fetus during labor. American Journal of Obstetrics and Gynecology, 163, 484–90.
Bergeson TR and Trehub SE (1999). Mothers’ singing to infants and preschool children. Infant Behavior
and Development, 22, 51–64.
Berk L (2002). Infants, children and adolescents, 4th edn. Allyn and Bacon, Boston, MA.
Blacking J (1995). Music, culture and experience. University of Chicago Press, Chicago, IL.
Boyce-Tillman J (2000). Constructing musical healing: The wounds that sing. Jessica Kingsley, London.
Brazelton TB, Koslowski B and Main M (1974). The origins of reciprocity: The early mother–infant
reaction. In M Lewis and R Rosenblum, eds, The effect of the infant on the caregiver, pp.. Wiley,
New York.
Bruner JS (1960). The process of education. Harvard University Press, Cambridge, MA.
Bruner JS (1996). The culture of education. Harvard University Press, Cambridge, MA.
Press, London.
Busnel MC, Lecanuet JP, Granier-Deferre C and De Casper AJ (1986). Perception et acquisition auditives
prénatales. Médecine Périnatale, 37–46.
Campbell PS and Scott-Kassner C (2002). Music in childhood, 2nd edn. Schirmer Books, New York.
Campbell PS (1998). Songs in their heads. Oxford University Press, New York.
Case-Smith J, Bigsby R and Clutter J (1998). Perceptual-motor coupling in the development of grasp.
American Journal of Occupational Therapy, 52, 102–110.
Chen-Hafteck L (1998). Pitch abilities in music and language of Cantonese-speaking children.
International Journal of Music Education, 31(1), 14–24.
Chen-Hafteck L (2004). Music and movement from zero to three: A window to children’s musicality.
In L Custodero, ed., Proceedings of the ISME Early Childhood Conference ‘Els Mons Musical Dels Infants’
(The Musical Worlds of Children), pp.. Barcelona, Spain, July 5–10.
Chomsky N (2000). New horizons in the study of language and mind. Cambridge University Press,
Cambridge.
Chomsky N (1957). Syntactic structures. Mouton, The Hague.
Chong HJ (2000). Vocal timbre preference in children. In BA Roberts and A Rose, eds, The phenomenon
of singing 2, pp. 53–63. Proceedings of the International Symposium. St John’s, Newfoundland, Canada,
Memorial University.
Clifton RK, Morrongiello BA, Kulig JW and Dowd JM (1981). Newborn’s orientation toward sound:
Possible implications for cortical development. Child Development, 52, 833–838.
Cook G (2000). Language play, language learning. Oxford University Press, Oxford.
Condon WS (1979). Neonatal entrainment and enculturation. In M Bullowa, ed., Before speech:
The beginnings of human communication, pp. 131–148. Cambridge University Press, London.
Cooper RP and Aslin RN (1989). The language environment of the young infant: Implications for early
perceptual development. Canadian Journal of Psychology, 43, 247–265.
Cooper RP and Aslin RN (1990). Preference for infant-directed speech in the first month after birth.
Cox G (2004). New sounds in class: Music teaching in UK schools in the 1960s, and its relationship
to the present. In A Giráldez, ed., Sound worlds to discover: Proceedings of the 26th World Conference
of the International Society for Music Education, Tenerife, Spain, pp.. Enclave Creativa
Ediciones, Madrid.
In Suk Won-Yi, ed., Music, mind and science, pp. 10–29. Seoul National University Press, Seoul.
Cross I, Zubrow E and Cowan F (2002). Musical behaviours and the archaeological record: A preliminary
study. In J Mathieu, ed., Experimental archaeology. British Archaeological Reports International Series
1035, pp. 25–34.
Csikszentmihalyi M (1990). Flow: The psychology of optimal experience. Harper and Row, New York.
Custodero L (1998). Observing flow in young children’s music learning. General Music Today,
12(1), 21–27.
Custodero LA and Johnson-Green EA (2003). Passing the cultural torch: Musical experience and musical
parenting of infants. Journal of Research in Music Education, 51(2), 102–114.
Dargie D (1988). Xhosa music: Its techniques and instruments, with a collection of songs. David Philip,
Cape Town.
De Casper AJ, Lecanuet J-P, Busnel M-C, Granier-Deferre C and Maugeais R (1994). Fetal reaction to
recurrent maternal speech. Infant Behavior and Development, 17, 159–164.
DeCasper AJ and Fifer WP (1980). Of human bonding: Newborns prefer their mothers’ voices.
Science, 208(4448), 1174–1176.
DeCasper AJ and Spence M (1986). Prenatal maternal speech influences newborns’ perception of speech
sounds. Infant Behavior and Development, 9, 133–150.
Dennis W (1960). Causes of retardation among institutionalized children: Iran. Journal of Genetic
Dissanayake E (2000a). Antecedents of the temporal arts in early mother–infant interaction. In NL Wallin,
Dissanayake E (2000b). Art and intimacy: How the arts began. University of Washington Press, Seattle
and London.
Donald M (2001). A mind so rare: The evolution of human consciousness. Norton, New York.
Donaldson M (1992). Human minds: An exploration. Allen Lane/Penguin Books, London.
Elliott DJ (1995). Music matters. Oxford University Press, Oxford.
Ellis DG (1999). From language to communication, 2nd edn. Lawrence Erlbaum Associates, Mahwah, NJ.
Ellis CI (2001). Song for ages and stages. In L Macy, ed., Australia 2, Central Aboriginal music ii.
http://www.grovemusic.com.
El Mogharbel C, Laufs I, Wenglorz M and Deutsch W (2003). The sounds of songs without words.
In R Kopiez, AC Lehmann, I Wolther and C Wolf, eds, Proceedings of the 5th Triennial Conference of the
European Society for the Cognitive Sciences of Music. Institute for Research in Music Education
Monograph No. 6, Hanover.
Erickson F (1996). Going for the zone: the social and cognitive ecology of teacher–student interaction in
classroom conversations. In Deborah Hicks, ed., Discourse, learning, and schooling, pp. 29–62.
Cambridge University Press, Cambridge and New York.
Fagard J and Pezé A (1997). Age changes in interlimb coupling and the development of bimanual
coordination. Journal of Motor Behavior, 29, 199–208.
Falk D (2004). Prelinguistic evolution in early hominins: Whence motherese? Behavioural and Brain
Sciences, 27, 491–503.
Fein S (1993). First drawings: Genesis of visual thinking. Exelrod Press, Pleasant Hill.
Feld S (1990). Sound and sentiment: Birds, weeping, poetics and song in Kaluli expression. University of
Pennsylvania Press, Philadelphia, PA.
Fernald A (1993). Approval and disapproval: Infant responsiveness to vocal affect in familiar and
unfamiliar languages. Child Development, 64, 657–667.
Fernald A (1992). Human maternal vocalizations to infants as biologically relevant signals:
An evolutionary perspective. In J Barkow, L Cosmides and J Tooby, eds, The adapted mind:
Evolutionary psychology and the generation of culture, pp. 392–428. Oxford University Press, New York.
Fernald A (1989). Intonation and communicative intent in mothers’ speech to infants: Is the melody the
Fernald A (1985). Four-month-old infants prefer to listen to motherese. Infant Behavior and Development,
8, 181–195.
Fernald A, Taeschner T, Dunn J, Papoušek M, Boysson-Bardies B and Fukui I (1989). A cross-language
Language, 16, 477–501.
Field J, Muir D, Pilon R, Sinclair M and Dodwell P (1980). Infant’s orientation to lateral sounds from
birth to three months. Child Development, 51, 295–298.
Books, New York.
Flowers PJ (1993). Evaluation in early childhood music. In M Palmer and WL Sims, eds, Music in
preschool: Planning and teaching. Music Educator’s National Conference, Reston, VA.
Gardner H (1983). Frames of mind: The theory of multiple intelligences. Heinemann, London.
Gardner H (1999). Intelligence reframed: Multiple intelligences for the 21st century. Basic Books, New York.
Garfinkel Y (2003). Dancing at the dawn of agriculture. University of Texas Press, Austin, TX.
Geissmann T (2000). Gibbon song and human music from an evolutionary perspective. In NL Wallin,
Goddard Blythe S (2005). The well balanced child: Movement and early learning. Hawthorn Press, Stroud,
Gloucestershire.
Goldschmied E and Jackson S (1994). People under three: Young children in day care. Routledge,
London.
Gromko J (1994). Children’s invented notations as measures of musical understanding. Psychology of
Music, 22, 136–147.
Gromko J and Poorman A (1998a). The effect of music training on preschoolers’ spatial–temporal task
performance. Journal of Research in Music Education, 46(2), 173–181.
Gromko J and Poorman A (1998b). Does perceptual–motor performance enhance perception of patterned
art music? Musica Scientiae, 2(2), 157–170.
London.
Hargreaves DJ and Galton M (1992). Aesthetic learning: psychological theory and educational practice.
In B Reimer and RA Smith, eds, The arts, education, and aesthetic knowing, pp. 124–149. University of
Chicago Press, Chicago, IL.
Hargreaves DJ (1986). The developmental psychology of music. Cambridge University Press, Cambridge.
Hepper PG (1988). Fetal ‘soap’ addiction. Lancet, 1, 1347–1348.
Hetland L (2000). Learning to make music enhances spatial reasoning. Journal of Aesthetic Education,
(Special issue) 34(3/4), 179–238.
Hobson P (2004). The cradle of thought: Exploring the origins of thinking. Macmillan, London.
Hubley P and Trevarthen C (1979). Sharing a task in infancy. In I Uzgiris, ed., Social interaction during
infancy: New Directions for Child Development, 4, pp. 57–80. Jossey-Bass, San Francisco, CA.
Ilari B, Polka L and Sundara M (2003). Preferences for ‘a cappella’ and accompanied songs: A study with
infant listeners. In R Kopiez, AC Lehmann, I Wolther and C Wolf, eds, Proceedings of the 5th Triennial
Conference of the European Society for the Cognitive Sciences of Music, pp.. Institute for Research in
Music Education Monograph No. 6, Hanover.
Imberty M (1997). Trends of developmental psychology in music. Paper presented at the Florentine
Workshops in Biomusicology 1. The origins of music, 29 May–2 June, Fiesole, Italy.
Karmiloff K and Karmiloff-Smith A (2001). Pathways to language: From fetus to adolescent. Harvard
Kartomi MJ (1991). Musical improvisations of children at play. World of Music, 33(3), 53–65.
Katz V (1971). Auditory stimulation and developmental behavior of the premature infant. Nursing
Research, 20, 196–201.
Kessen W, Levine J and Wendrich K (1979). The imitation of pitch in infants. Infant Behaviour and
Kisilevsky BS and Muir DW (1991). Human fetal and subsequent newborn responses to sound and
vibration. Infant Behavior and Development, 14, 1–26.
Knight S (2000). Exploring a cultural myth: What adult non-singers may reveal about the nature of
singing. In BA Roberts and A Rose, eds, The phenomenon of singing 2, pp. 144–154. Memorial
University of Newfoundland, St John’s, NF.
Kramer LI and Pierpont ME (1976). Rocking waterbeds and auditory stimuli to enhance growth of
preterm infants. Journal of Pediatrics, 88, 297.
Krumhansl CL and Jusczyk PW (1990). Infants’ perception of phrase structure in music. Psychological
Science, 1, 70–73.
Peter Lang, Bern.
Laycock J (2005). A changing role for the composer in society. Peter Lang, Bern.
Leader LR, Baillie P, Martin B, Molteno C and Wynchank S (1982). The assessment and significance of
habituation to a repeated stimulus by the human fetus. Early Human Development, 7, 211–219.
Lecanuet J-P, Granier-Deferre C, Jaquet A-Y and Busnel M-C (1992). Decelerative cardiac responsiveness
to acoustical stimulation in the near term fetus. Quarterly Journal of Experimental Psychology,
44, 279–303.
Lecanuet J-P, Granier-Deferre C, Cohen H, Le Houezec R and Busnel M-C (1986). Fetal responses to
acoustic stimulation depend on heart rate variability pattern, stimulus intensity and repetition. Early
Human Development, 13, 269–283.
Littleton D (1991). Influence of play settings on preschool children’s music and play behaviors. Doctoral
dissertation, University of Texas, Austin. Dissertation Abstracts International 52–4, 1198A.
Locke JL (1993). The child’s path to spoken language. Harvard University Press, Cambridge MA.
MacLean PD (1990). The triune brain in evolution, role in paleocerebral functions. Plenum Press,
New York.
1999–2000), 29–57.
Masataka N (1996). Perception of motherese in a signed language by 6-month-old deaf infants.
Matthews J (2003). Drawing and painting: Children and visual representation, 2nd edn. Sage, London.
Mehler J, Jusczyk PW, Lambertz G, Halsted N, Bertoncini J and Amiel-Tison C (1988). A precursor of
language acquisition in young infants. Cognition, 29, 143–178.
The origins of music, pp. 315–327. MIT Press, Cambridge MA.
Metz E (1989). Movement as a musical response among preschool children. Journal of Research in Music
Education, 37(1), 48–60.
Minami Y and Nito H (1998). Vocal pitch matching in infants. In Proceedings of the 8th International
Seminar of the Early Childhood Commission of the International Society for Music Education (ISME), pp..
Mithen S (2005). The singing neanderthals: The origins of music, language, mind and body. Weidenfeld and
Nicholson, London.
Moog H (1976). The musical experience of the pre-school child, trans. Claudia Clarke. Schott, London.
Moorehead GE and Pond D (1978). Music of young children. Santa Barbara, CA, Pillsbury Foundation for
Advancement of Music Education.
Morgan OS and Tilluckdharry R (1982). Preservation of singing function in severe aphasia. West Indian
Medical Journal, 31, 159–161.
Archaeological Journal, 12(2), 195–216.
Muir D and Field J (1979). Newborn infants orient to sounds. Child Development, 50, 431–436.
Mumme D and Fernald A (2003). The infant as onlooker: Learning from emotional reactions observed in
a televised scenario. Child Development, 74, 221–237.
Mumme D, Fernald A and Herrera C (1996). Infants’ responses to facial and vocal emotional signals in a
social referencing paradigm. Child Development, 67, 3219–3237.
and their mothers. In TM Field and NA Fox, eds, Social perception in infants, pp. 177–197. Ablex,
Norwood, NJ.
Nadel J, Carchon I, Kervella C, Marcelli D and Réserbat-Plantey D (1999). Expectancies for social
Nadel J and Pezé A (1993). Immediate imitation as a basis for primary communication in toddlers and
autistic children. In J Nadel and L Camioni, eds, New perspectives in early communicative development,
pp. 139–156. Routledge, London.
Omi A (1992). Explaining children’s spontaneous singing. In Proceedings 4th International Seminar of the
Early Childhood Commission of the International Society for Music Education (ISME), pp..
Opie I and Opie P (1985). The singing game. Oxford University Press, Oxford.
Panneton RK (1985). Prenatal auditory experience with melodies: Effects on postnatal auditory preferences in
human newborns. Unpublished D Phil thesis, University of North Carolina at Greensborough, North
Carolina.
In I Deliege and J Sloboda, eds, Musical beginnings: Origins and development of musical competence,
Papoušek M (1994). Melodies in caregivers’ speech: A species-specific guidance towards language.
Early Development and Parenting, 3, 5–17.
Papoušek M and Papoušek H (1989). Forms and functions of vocal matching in precanonical
mother-infant interactions. First Language, 9, 137–158.
Papoušek M, Papoušek H and Bornstein MH (1985). The naturalistic vocal environment of young
infants: On the significance of homogeneity and variability in parental speech. In T Field and N Fox,
eds, Social perception in infants, pp. 269–297. Ablex, Norwood NJ.
Paynter J and Aston P (1970). Sound and silence. Cambridge University Press, Cambridge.
Pinker S (1997). How the mind works. Penguin Books, London.
Pond D (1981). A composer’s study of young children’s innate musicality. Council for Research in
Music Education, 68, 1–12.
Querleu D, Lefebvre C, Titran M et al. (1984). Discrimination of the mother’s voice by the neonate
immediately after birth. Journal de gynecologie, obstetrique et biologie de la reproduction,
13(2)9, 125–134.
Richards MPM (1974). First step in becoming social. In MPM Richards, ed., The integration of a child
into a social world, pp.. Cambridge: Cambridge University Press.
Rizzolatti G, Fogassi L and Gallese V (2001). Neurophysiological mechanisms underlying the
understanding and imitation of action. Nature Reviews Neuroscience, 2, 661–670.
Rochat P (1992). Self-sitting and reaching in 5- to 8-month old infants: The impact of posture and its
development on early eye–hand coordination. Journal of Motor Behavior, 24, 210–220.
Rogoff B (2003). The cultural nature of human development. Oxford University Press, New York.
Rogoff B (1990). Apprenticeship in thinking: Cognitive development in social context. Oxford University
Press, New York.
Romet C (1992). Song acquisition in culture: A West Javanese study in children’s song development,
in H Lees, ed., Music education: Sharing musics of the world. Proceedings of the 20th World Conference of
the International Society for Music Education, Seoul, Korea, pp. 164–173. ISME/University of
Canterbury, Christchurch, NZ.
Rutkowski J and Trollinger VL (2005). Singing. In JW Flohr, ed., The musical lives of young children.
Prentice Hall, Upper Saddle River, NJ.
Rutkowski J, Chen-Hafteck L and Gluschankof C (2002). Children’s vocal connections: A cross-cultural
study of the relationship between first graders’ use of singing voice and their speaking ranges.
In Children’s musical connections: Proceedings of the ISME Early Childhood Commission Conference.
ISBN 87–7701–949–0, Danish University of Education, Copenhagen, Denmark.
Rutkowski J and Chen-Hafteck L (2001). The singing voice within every child: A cross-cultural
comparison of first graders’ use of singing voice. Early Childhood Connections: Journal of Music- and
Movement-Based Learning, 7(1), 37–42.
Sacks O (1985). The man who mistook his wife for a hat. Duckworth, London.
Salk L (1962). Mothers’ heartbeat as an imprinting stimulus. Transactions, Journal of the New York Academy
of Science, 24 (7), 753–763.
Satt BJ (1984). An investigation into the acoustical induction of intra-uterine learning. Unpublished D Phil
thesis, Californian School of Professional Psychologists, Los Angeles.
Scherer KR (1992). Vocal affect expression as symptom, symbol and appeal. In H Papoušek, U Jürgens and
M Papoušek, eds, Nonverbal vocal communication: Comparative and developmental approaches, pp. 43–60.
Editions de la Maison des Sciences de l’Homme, Paris/Cambridge University Press, Cambridge.
Shalev E, Benett MJ, Megory E, Wallace RM and Zuckerman H (1989). Fetal habituation to repeated
sound stimulation. Journal of Medical Science, 25, 77–80.
Sims WL (1986). The effect of high versus low teacher affect and passive versus active student activity
during music listening on preschool children’s attention, piece preference, time spent listening,
and piece recognition. Journal of Research in Music Education, 34, 173–191.
Sims WL (1991). Effects of instruction and task format on preschool children’s music concept
discrimination. Journal of Research in Music Education, 39(4), 298–310.
Smithrim K (1997). Free musical play in early childhood. Canadian Music Educator, 38(4), 17–24.
Stern DN (1971). A micro-analysis of mother–infant interaction: Behaviors regulating social contact
between a mother and her three-and-a-half-month-old twins. Journal of American Academy of Child
Wiley, New York.
mother and infant by means of intermodal fluency. In TM Field and NA Fox, eds, Social perception in
Stern DN and Gibbon J (1980). Temporal expectancies of social behaviours in mother–infant play.
In E Thoman, ed., Origins of infant social responsiveness, pp. 409–429. Erlbaum, New York.
Street A (2003). Mothers’ attitudes to singing to their infants. In R Kopiez, AC Lehmann, I Wolther and
C Wolf, eds, Proceedings of the 5th Triennial Conference of the European Society for the Cognitive Sciences
of Music. Institute for Research in Music Education Monograph No. 6, Hanover.
Suthers L (1995). Music, play and toddlers. International Play Journal, 3, 142–151.
Swanwick K and Tillman J (1986). The sequence of musical development: a study of children’s
composition. British Journal of Music Education, 3, 305–339.
Swanwick K (1999). Teaching music musically. Routledge, London.
Tarnowski S and Leclerc J (1994). Musical play of preschoolers and teacher–child interaction. Update:
Applications of Research in Music Education, 13(1), 9–16.
Tarnowski S (1996). Preservice early childhood educators’ observations of spontaneous imitative song in
preschool children age two to five years. In Proceedings of the 7th International Seminar of the Early
Childhood Commission of the International Society for Music Education (ISME).
Tillman JB (1987). Towards a model of the development of musical creativity: A study of the compositions of
children aged 3–11. Unpublished Ph. D. Thesis, University of London, Institute of Education.
Tinbergen N (1951). The study of instinct. Clarendon Press, Oxford.
Tommis Y and Fazey DMA (1996). The acquisition of pitch element of music literacy skills. In Proceedings
of the 7th International Seminar of the Early Childhood Commission of the International Society for
Music Education.
Trainor LJ (1996). Infant preferences for infant-directed versus non-infant-directed playsongs and
Trehub SE (2001). Musical predispositions in infancy. In RJ Zatorre and I Peretz, eds, The biological
foundations of music. Annals of the New York Academy of Sciences, 930, 11–16.
Trehub SE and Trainor LJ (1993). Listening strategies in infancy: The roots of music and language
development. In S McAdams and E Bigand, eds, Thinking in sound: The cognitive psychology of human
audition, pp. 278–327. Oxford University Press, New York.
Trehub SE, Bull D and Thorpe LA (1984). Infants’ perception of melodies: The role of melodic contour.
Trehub SE, Endman M and Thorpe LA (1990). Infants’ perception of timbre: Classification of complex
tones by spectral structure. Journal of Experimental Child Psychology, 49, 300–313.
Trehub SE, Thorpe LA and Morrongiello BA (1985). Infants’ perception of melodies: Changes in a single
tone. Infant Behavior and Development, 8, 213–223.
intersubjectivity. In M Bullowa, ed., Before speech: The beginning of human communication. Cambridge
Trevarthen C (1984). How control of movements develops. In HTA Whiting, ed., Human motor actions:
Bernstein reassessed, pp. 223–261. Elsevier (North Holland), Amsterdam.
Trevarthen C (1994). Infant semiosis. In W Nöth, ed. Origins of semiosis, pp. 219–252. Mouton de Gruyter,
Berlin.
Trevarthen C (1997). Foetal and neonatal psychology: Intrinsic motives and learning behaviour.
In F Cockburn, ed., Advances in perinatal medicine, pp. 282–291. Parthenon, New York.
Press, Oxford.
meaning in the first year. In A Lock, ed. Action, gesture and symbol. Academic Press, New York.
Trollinger V (2003). Relationships between pitch-matching accuracy, speech fundamental frequency,
speech range, age, and gender in American English-speaking preschool children. Journal of Research in
Music Education, 51(1), 78–94.
Turel T (1992). Music education for babies. Paper presented at the International Society for Music
Education Early Childhood Commission Seminar, Sharing Discoveries about the Child’s World of
Music, Tokyo, Japan.
Vandell DL and Mueller EC (1995). Peer play and friendships during the first two years. In HC Foot,
AJ Chapman and JR Smith, eds, Friendship and social relations in children, pp. 191–208. Transaction,
New Brunswick, NJ.
Vandell DL, Wilson KS and Buchanan NR (1980). Peer interaction in the first year of life: an examination
of its structure, content and sensitivity to toys. Child Development, 58, 176–186.
Vygotsky L (1962). Thought and language. MIT Press, Cambridge, MA.
Wallin NL (1991). Biomusicology: Neurophysiological, neuropsychological, and evolutionary perspectives on
the origins and purposes of music. Pergamon Press, Stuyvesant, NY.
Webster DB and Webster M (1977). Neonatal sound deprivation affects brain stem auditory nuclei.
Archives of Otolaryngology, 103, 392–396.
Wertheimer M (1961). Psychomotor coordination of auditory and visual space at birth. Science, 134, 1692.
Williams L (1967). The dancing chimpanzee: A study of primitive music in relation to the vocalising and
rhythmic action of apes. Norton, New York.
Winnicott DW (1971). Playing and reality. Tavistock, London.
Woodward SC (1992a). The transmission of music into the human uterus and the response to music of the
human fetus and neonate. Unpublished doctoral thesis: University of Cape Town.
Woodward SC (1992b). Intrauterine rhythm and blues? British Journal of Obstetric Gynaecology,
99, 787–790.
Woodward SC (1996). Prenatal auditory stimulation. Practica, Roodepoort.
Woodward SC (2005). Critical matters in early childhood music education. In DJ Elliott, ed., Praxial music
education: Reflections and dialogues, pp. 249–266. Oxford University Press, New York.
Young S (1999). Interpersonal features of spontaneous music-play on instruments among three- and
four-year olds. Paper presented at the conference, Cognitive processes of children engaged in musical
activity, Urbana IL: School of Music, University of Illinois at Champaign-Urbana, 3–5 June, 1999.
Young S (2000). Young children’s spontaneous instrumental music-making in nursery settings. Unpublished
Ph. D. thesis, University of Surrey.
Young S (2002). Young children’s spontaneous vocalisations in free-play: Observations of two- to three-
year-olds in a day care setting. Bulletin of the Council for Research in Music Education, 152, 43–53.
Young S (2003). Music with the under 4’s. Routledge Falmer, New York.
Zimmer EZ, Divon MY, Vilensky A, Sarna Z, Peretz BA and Paldi E (1982). Maternal exposure to music
and fetal activity. European Journal of Obstetrics, Gynaecology and Reproductive Biology, 13(4), 209–13.
Chapter 22
Vitality in music and dance as basic existential experience:
Applications in teaching music
Charlotte Fröhlich
22.1 Introduction
For about 70 years, following the teachings of Carl Orff and Emile Jaques-Dalcroze (Orff and
Keetman 1950; Jaques-Dalcroze 1921), music pedagogy has brought music and movement
together, and more time has been given to improvisation in both classroom music lessons and
instrumental teaching. It is time to ask what experiences have encouraged this trend, and
what research findings may validate a style of teaching that cultivates intuitions for body
movement. It is also time to consider developing new teaching approaches that might refine the
existing methods.
Various empirical surveys have set out to support the connection between music and
movement, through questionnaires or tests of psychological performance (e.g., Altenmüller and
Gruhn 1997; Altenmüller et al. 2000; Aronson and Rosenbloom 1971; Rauscher et al. 1995;
Gruhn and Rauscher 2002). However, these tests, largely concerned with relationships between
measures of audition and movement, fail to elucidate the fundamental processes of motivation.
As regards questionnaires designed to interrogate a learner’s experiences, we must accept that
people may not be as aware of their motivations as the researcher expects them to be. Awareness
often eludes description in terms of measurable criteria defined a priori.
An abstract, ‘scientific’ concept of the world takes analysis of a passive reality as a starting
point; to achieve reliability, reality is divided into the smallest parts possible, there is an attempt
to measure the detail in every part, and processes are hypothesised that relate to the parts. This
way of looking at the world is underpinned by operationalized thinking, using a rationally
constructed set of strategies or explanations, and in the field of music pedagogy it may lead to the
practice of curricula that formulate systematic and rather restricted aims for each music lesson in
advance. The emphasis is on how stimuli from reality direct actions.
However, anthropologists and infancy researchers, who accept more inclusive descriptive
methods that allow for creative or spontaneous causes of experience, argue that there exists
an innate holistic perception. For example, developmental psychologists use terms such as
‘transmodality’ or ‘amodality’ to characterize the phenomenon that a baby spontaneously
connects perceptions that are informed by different sensory modalities (Lawson 1980; Kuhl and
Meltzoff 1982; Trevarthen 1993; Stern 1985/2000). This integrating, purpose-satisfying process of
consciousness is at the core of our artistic and social experience—in babyhood, childhood and
adulthood.
There is another reason why the natural phenomenon of music-making escapes the empirical
fishing net: researchers aim to consider as ‘scientifically’ relevant only those situations that they
believe are repeatable. As music teachers, we may be pleased that we can analyse a situation
several times through video observations and through reviewing documentation from a lesson;
however, in practice, it is not possible to repeat a particular situation with exactly the same effect.
This has, for me, far-reaching consequences in the field of teaching. In other words, I believe
the quality of teaching, especially of music teaching, lies in the non-repeatable moments of
intersubjectivity.
It is not necessarily ‘unscientific’ to accept that the spontaneity of movement is essential to
experiencing or learning music. We can logically deduce the meaningfulness of an approach that
combines music and dance/movement, and so justify it, from the evidence that both sound and
dance/movement have great significance in all human mimetic communication (Blacking 1988;
Donald 2001). I see proof of the effectiveness of combining music and dance/movement in
teaching from observing the increasing motivation of students.
In my role as a teacher, I attempt to find teaching principles—not methods of instruction—
that make artistic experiences possible without nipping them in the bud. These principles are
derived from infancy research on the transmodal nature of infant perception (Trevarthen 1977;
Stern 1985/2000, 2004), and from anthropological insights about attitudes towards irreversible
or unfolding situations (Tarasti 1994), and the role of movement in time in the meaning of music
(Kühl 2007). My purpose is to elucidate artistic experience for both children and adults, and thus
to support and empower their delivery in the classroom. However, if these principles are adopted,
a by-product is that we, as teachers, will need to accept what seem to be detours in our teaching.
We know from psychoanalysis that, for an adult, knowing or being aware of the intentions and
emotions of early experiences, and possibly working through them, can heal. Without forcing
myself to think purely in psychoanalytic terms, my daily experience of teaching shows me that
my musical activities with children and adults, based on an understanding that experience
of music is transmodal and intentional rather than simply analytical, contain a high level of
motivational energy and a high potential of vitality. How can we understand this?
In The present moment in psychotherapy and everyday life, Daniel Stern (2004) refers to the work
of Colwyn Trevarthen (Trevarthen 1980), explaining why intersubjectivity should be seen
‘as a basic, primary motivational system’. That we tend to ‘attune’ to each other, musically or
mimetically, seems to favour the survival of the human species. Thus, it is vitally important for us
to be perceived and to be engaged in the ‘ongoing regulation of the intersubjective space’ (Stern
2004, p. 97). In the intersubjective space of the music lesson, if we become involved with each
other as conscious, intending subjects through music and movement, our students will not
necessarily need extra motivation in the music lesson. The experience of ‘doing this together’
is reward enough. The crucial question here is: how can we start a music session with the
guarantee that we are as close to an attuned intersubjective exchange as possible? If we succeed
with this approach, we are paying attention to both the pedagogical and the ‘therapeutic’
impact of making music or dancing together.
22.2 Irreversibility and motivation

Every process of communication is irreversible. There is not one conversation that I can ‘undo’.
This means that both I and the person I am talking to, or making music with, are in some ways
different after a verbal or a musical interaction. For example, if I have been deeply touched by a
concert performance, if the performance has communicated deeply with me, I leave the concert
hall a different person. I may never forget the artist and the work; whenever I hear the work
again, I may recall that particular performance. The work and the performing artist have
appealed to me on a communicative level. In the arts that unfold over time (such as music), more
so than in the ‘static’ arts (such as painting), I witness instantaneously both a process of
interpretation or performance, and the oeuvre. This event is irreversible in that it refers to a lived
execution or activity. Most notably, it is a lived experience: one of the qualities of experiences is that
they unfold in ‘presentness’, in the moment of their enactment.
In an earlier book I have shown how every process of understanding and discovery is
irreversible, changing the individual as artist or performer, teacher or learner, as well as his or her
subsequent work (Fröhlich 2002). Within this irreversible process, we can see and make use of a
healing, motivational and vitalizing potential. What is the meaning of this, when applied to the
teaching situation? Viewed from a music-pedagogical perspective, and in line with the
findings of infancy research, curiosity and inquisitiveness are essential to vitality itself. They
contribute to all forms of non-verbal experience. We constantly stimulate each other and
ourselves with smaller or larger discoveries—from the snail shell we find on the roadside as a
child, to a polar expedition, from the experiment with a creaking floor, to an elaborate composition.
In the education literature, a distinction is made between primary and secondary motivation
(Deci and Ryan 1985). Primary motivation is understood as that which is derived from activity
itself; it is a synonym for intrinsic motivation. Secondary motivation results from an influence
received from outside one’s own activity, for example praise and other rewards, and is also known
as extrinsic motivation. We can readily make use of secondary motivation in a teaching process,
because secondary motivation is connected to the impulse to learn with the prospect of possible
extrinsic rewards, be they material or social. Primary motivation, in contrast, is often described
as the motivation that cannot be influenced or produced by a teacher. I argue,
however, that teaching, especially music teaching, must be in line with the existential impulses
towards intersubjectivity and the learner’s intrinsic, intuitive wish for self-stimulation. The desire
to integrate the unknown into our life experience, and the wish to differentiate and elaborate
what we already know in an ongoing regulation process, are basic impulses for primary
motivation (see the ‘seeking’ motivation identified as a primary emotional system in Panksepp and
I believe that to maintain the natural vital impulse and motivation in music teaching, we
must regard both music and movement as irreversible events, and cultivate this understanding in
the teaching of music. This approach carries an important responsibility. When teaching within
this paradigm, the challenge for the teacher is that she or he cannot teach according to a
preconceived plan, but must continuously stay in communication and interaction with the pupils, aware
of their spontaneous musical expressions.
22.3 How can we capture irreversible processes?

In this section, I consider sociobiologically based modes of experiencing and learning that
demonstrate the bond between sound, movement and vitality. When creating new principles
for didactic actions based on our understanding of irreversible processes of creativity, I believe
the practice should integrate the power of being effective, both socially (intersubjectively) and
artistically. First, I will take into consideration two existential principles:
1 The out-reaching connection to the natural and personal environment, related to the
‘centrifugal’ tendencies of our soul (Riemann 1991); and
2 The connection to our inner physical experience, related to the ‘centripetal’ tendencies of our
soul (Riemann 1991).
These two basic ways of experiencing lead us to further principles for the practice of creative
teaching.
22.3.1 The personal and the natural environment

Each encounter with a new person, as well as each encounter with a new situation or a new object—
all parts of the natural environment—resembles an exploration of new territory (Figure 22.1).
Willingness for adventure or challenge increases as vitality increases (Csikszentmihalyi 1990;
Custodero 2002). Communication (exploring the reaction of a person who is playing with us) and
improvisation (creating new situations with sound and space) are fundamental learning
experiences that involve more than simple skill development. Experiences in these contexts and by these
means leave far more durable marks in our mind than, for example, learning to play an interval
in tune.
Mimetic communication, including facial, gestural and other body expression as well as vocal
improvisation, are activities that have expressive temporal contours (see Lee and Schögler,
Chapter 6, this volume, for a mathematical exploration of this). We cannot transmit a certain
expression by making the gesture ‘backwards’. This is an attribute that Stern’s ‘vitality affects’
share with music (Stern 2004). The fundamental directionality of movements and their use as
messages helps us to understand how, when working with spontaneous impulses in communica-
tion with others, the sound and space of music-making is at the same time a challenge for artistic
development and a form of social behaviour (Gebauer and Wulf 2003), as well as being a means
of self-reflection and even of spiritual growth. Interpersonal sharing of expression is also found
in composed music and dance. Lullabies, solo performances, works for double choir, folk and
character dance (like the waltz or tango), group improvisations and contemporary works that
involve audience participation—all of these types of performance give space and sound to
various processes of intersubjectivity—the mutual exchange of self-expression (Figure 22.2).
This exchange of self-expression, for example as used in a duet or in responsorial singing
or dancing, can be a highly motivating educational principle in teaching music and movement.
Thus, the quality of a teacher is revealed by how she or he facilitates the ‘fine tuning’ of
spontaneous musical and/or emotional expression in a group or class. The agents of this communi-
cation, the specific approaches that activate musical spontaneity, I will discuss later (Section 22.5).
An instance of this kind of teaching approach is provided by the musical example Me
and you (Section 22.6.2). Scenarios like the ones above, involving mutual exchange of
self-expression, can also be used as starting points in the creation and production of musical
composition and dance choreography (Playing a dream, Section 22.7, p. 506).
Let us now look at movement and music improvisation in relation to their explorative
and communicative power. The first step of a young child can be seen as improvised walking, and
her first vocalization can be seen as improvised singing. If the child is successful with the
‘improvisation’ of walking, she adds a second, and then a third step. She discovers that she is able
to change her spatial situation, and this is sure to be followed by a happy shout of joy. She might
even start exploring and elaborating different shapes of her shouts of joy. She plays out a lived
improvisation in a musical or danced phrase.
Fig. 22.1 Each encounter with a new person, as well as each encounter with a new situation or a
new object is an exploration of new territory.
Fig. 22.2 The exchange of self-expression through exploring the sound and movement of a triangle.
Hence, regarding the various invitations of the personal and natural environment, we can
conclude that interpersonal communication, as well as sound and spatial improvisation, are
irreversible temporal experiences; for this reason, we can interpret and use them as didactic
principles, as pedagogical points of orientation or as starting points into lived musical
experiences. For children, we can directly switch from this discovery and communicative behav-
iour to the creation of music. However, in teaching adults, it is necessary to build a step between
these two stages. Adults, who have become so accustomed to being ‘reasonable’ in their
behaviour, and verbally aware of the reasons, need extra encouragement to rediscover and
become conscious of their autonomous and artistic expression.
22.4 Physical existence

It is not only the discovery of our surroundings, but our very physical existence that is unmerci-
fully irreversible. No physical or physiological occurrence can ever be unmade. There always
remain traces in our body and our mind. This leads to the realization that body awareness, the
awareness of how we move, how we get moved, and how we conduct our movements can be a
tool to teach music as a ‘live’ system.
Our experience of ourselves in our bodies and in our minds is in constant change—maybe we
are more relaxed after a satisfying conversation or a moving improvisation. Depending on our
mood, we can experience our inner and outer movements in different ways. Thus, our existential
orientation underlies our experience of the arts that unfold through time, and gives us a basis for
understanding the diversity of musical expression.
Making music is as much anchored in the physical situation as is dancing. On the bodily level,
the basic bonding structures between music and movement are cycles of breath and pulsation,
and within them the flowing and dramatic moods of narrative. Breath results in singing in
articulated phrases, while pulsations result in rhythmic structures of action, memory and
imagination.
Fig. 22.3 Breathing softly. (See also colour plate 6.)
Pulsation within the body can be perceived by attending to the heartbeat or pulse. We create
beats when we walk, run or jump. The same beats govern the movements of our hands and arms.
We make regular, rhythmic movements when waving, clapping, chopping, shaking and knitting.
We enjoy carrying out movements of the whole body in regular intervals, such as swinging,
rowing, swimming and pulling heavy objects. Pulsation is the gateway to the structuring and
percussive moments that occur in both music and dance (Figures 22.10 and 22.13).
All these measures of the experience of being alive in our bodies are communicable. If we
want to enhance the energy level of a group, we can easily achieve this through exercises and
games that include pulsating movements or pulsating sound production, by clapping, walking,
jumping, or combinations of these movements. If, however, we want to calm a group down, we
can achieve this by asking them to become aware of their breathing. Breathing is the gateway to the
voice, which can produce both unpitched sound and melody. By modulating our breathing, we
come very close to experiences of dynamic and agogic (impulse-leading) processes and musical
articulation (Figures 22.3 and 22.12).
22.5 The four agents for music and movement in connection

We can build on four agents in music and dance that are interrelated, and use them to help initiate
teaching situations. When using the word ‘agent’ in a musical context, I mean an active and
efficient force, capable of producing a particular vitalizing effect. The specific properties of these
agents are that they simultaneously encourage self-awareness and a musical experience in child
(and adult) participants. Furthermore, during the learning of music and dance, the participants
are encouraged to maintain awareness of their existential need for exploring the surroundings
and gaining experience from others through communication.
Fig. 22.4 Discovering pulsation in a clapping game.
The four agents are communication, improvisation, pulsation and cycles of breath. When we give
attention to communication, we can start musical and dance creations through non-verbal forms
of engagement. We can initiate and extend a musical or dance process using the second agent,
improvisation. We nurture curiosity for sound and new movements, to develop, with a teacher’s
support, the children’s own compositions and dance creations.
When we emphasize pulsation, we can start musical and dance creations through a child’s
joy in moving rhythmically (Figure 22.4). The method for developing the experience of pulsation
is to integrate a child’s joyous movements into rhythms of music and dance, and perhaps into
song accompaniments. The fourth agent, the fourth starting point into musical experiences and
dance, is the awareness of cycles of breath. The principal technique here is to take a child’s
enthusiasm for vocal exploration and dynamic change as a point of departure for the experience of
musical and social attunement. When working with adults, the method might change to take into
account an adult’s need for peace and quiet as a starting point, and then to allow movements and
sound production to come intuitively out of a state of relaxation.
22.6 How to enter into a musical process through playing with communication

I will illustrate two ways of acting that take communication as a point of departure for working
with music and movement (for example, Figure 22.7). Although in these two cases the aim was to
create music, a class could start in a similar way to create dance. The crucial point is that the
teacher and the students experiment light-heartedly with basic situations of human contact. For
example, such situations might be ‘hello, I know you’, ‘hey, I agree (or disagree)’, ‘I’d rather be
alone’, ‘let’s go together’, ‘I am very similar/different’, in all of their shades of expression. I have
chosen two examples: the first from the cluster of themes around ‘hey, I agree/disagree’, and the
second from the cluster of themes around the situation of feeling similar or different.
22.6.1 Starting proposal 1: ‘Yes and no’

1 I have the children in a room where they can move around; desks are not necessary.
2 The children choose small percussion instruments. I ask that they listen to and then echo
several rhythms, each a 4/4 bar in length.
3 I tell the children that this sounds as if all the instruments are saying: ‘Yes, my friend’. The last
pattern that I play is as in Figure 22.5.
On my percussion instrument, I then start to play any ametric rhythm. I explain that my instru-
ment obviously has asked a question, and their instrument should answer with: ‘Yes, my friend’.
4 It will not be long before the children get the idea to volunteer that their instruments are
also saying: ‘No, my friend’. Intuitively, they understand that we are dealing with the
same rhythm. Then I point out that one can play in a ‘yes’ mode that is smoother, and that
one can play in a ‘no’ mode that is jerkier.
5 Now I turn to a child who I know from experience will react quickly—their reaction will
serve as a model for other children who would otherwise need a lot of verbal explanation,
which I want to avoid.
6 It may not take long before the children themselves get the idea of asking some ‘ametric’
questions. When they do, I give them brief feedback from a composer’s point of view. I usu-
ally listen to the shape of the question and encourage them, by my mimicry, to be inventive in
the use of dynamic variation, for example, with the use of different ways of playing crescendi
or diminuendi. In this way, I try to give them ideas for both ‘usual’ and ‘unusual’ qualities
of an improvisation. At the same time, I try not to offer too much feedback, so as not to
interrupt or break the musical flow.
7 I next divide the class into two groups. One group walks around and the other group stays
still. Playing ametrically, each child of the moving group walks to a child who is standing still
and asks him or her an ametrical question. The responding child may now be a little free with
the rhythm of the answer, and could play ‘Yes, yes, yes, my friend’ or ‘yes, my friend, yes,
my friend’, or, by playing more loudly, ‘No, noooo, my friend’. I therefore accept a degree of
variation on the given rhythm.
8 We will, of course, have acoustic chaos in the room, as not every question and not every
answer will be equally long. Later, I indicate that the role between asking and answering child
can change each time (but with very young children, I do not ask them to do this).
9 While the children play musically together in this way with each other, I walk through the
group and listen with interest. Children must understand my mimetic communication as well,
and realize that I am curious and that I appreciate their ideas. I accept the acoustical confusion
and I interrupt only when a child obviously loses concentration, the musical structure or
social connection.
10 Finally, I gather the children again into a group or into a circle and I start chanting: ‘Janey,
see anyone pass here ... ?’—and in all probability they will answer with their instruments
or chanting: ‘No (yes), my friend’. I continue by singing: ‘well, all of my dumplings are
gone—I tell you so—all of my dumplings are gone.’
Fig. 22.5 Rhythm for participation in ‘Yes, my friend’ (text underlay: ‘Yes, my friend’, ‘No, my friend’).
Thus, the children have already learned most of a song (Figure 22.6). At the same time,
they have performed rhythmical exercises, acted as improvisers/composers, participated in
spatial experiences, and become aware of different social roles. I believe as well that they have
understood that making sound and music is a communicative process.
Time and time again, I have found that it is easier for children to identify with a song if they
have played with its elements first. This way of teaching encourages children to invent motifs, and
finally songs, of their own. Thus, I start by mobilizing innate abilities of communication, which
then give rise to motivation for creating music (Bjørkvold 1992).
22.6.2 Starting proposal 2: ‘Me and you’

At best, the game ‘Me and You’ starts with something that I notice the children are doing. Let us
consider the following situation: the children are gathered in a circle, but two boys are poking
each other in a friendly way. This could either disturb my plan or I could try to use it as a basis for
the lesson.
1 Without telling the boys that I copied the idea from them, I start speaking or chanting some-
thing that matches their actions, e.g., ‘Me, me, me ... and you, you, you’. Because my sugges-
tion is so close to what they are doing, they usually stop and join in with me, or they correlate
my suggestion to their actions.
‘Janey, you see nobody pass here?’ ‘No, my friend.’
‘Really you see nobody pass here?’ ‘No, my friend.’
‘Well, one of my dumplings gone.’ ‘Don’t tell me so!’ ‘One of my dumplings gone.’
‘Janey, you see nobody pass here?’ ‘No, my friend.’
‘Annie, you see nobody pass here?’ ‘No, my friend.’
‘Well, two of my dumplings gone.’ ‘Don’t tell me so!’ ‘Two of my dumplings gone.’
Fig. 22.6 Janey, you see nobody pass here? learnt through ‘yes and no’ musical communication.
Fig. 22.7 Enjoying communication.
2 First, I point at myself (‘Me, me, me’) and then towards the centre of the circle (‘You, you,
you’). Soon, I change my action to ‘you, you, you’ as I start to softly nudge my neighbour in
the circle. After some repetitions the children may increase the strength of their nudging and
begin poking each other. Here it is extremely important to know the class, as the activity may
get out of control. Some children may take this as an opportunity, or even as an invitation, to
become rude. To offset this possibility, I could possibly remark ‘I am using an idea that I saw
John and Max do—they did it in a very friendly manner, and we can copy it.’ In this way, the
boys feel accepted and empowered, but can recognize from my facial expression that I defi-
nitely see their idea from a critical point of view as well.
3 Now there are at least two options: I could continue in the way that is described in points 4
and 5, or I could directly continue to point 6.
4 On the spur of the moment, I invent a little verse—but I do not make it too complicated.
It could be:
Me, me, me
I jump from the top of that tree
You, you, you,
are invited to see what I do.
5 I invite the children to experiment with the ‘you-sequences’. We might add various gestures
to the game:
◆ clapping into the raised hand of a partner
◆ adding a gesture of the chin—so that we seem to be a little ‘stuck up’
◆ touching a partner’s nose with the point of the little finger
◆ touching each other’s toes or heels.
It soon becomes apparent how quickly one music and movement agent (as described in
Section 22.5) can merge with another. When saying ‘you’ with a big bow, my breathing changes,
which may even lead to the invention of a melody. When touching a partner with my toes,
I am training rhythmical exactitude with my feet. The overall point here is that from the very
first moment, I saw something communicative happening, and I took up that impulse with
the group.
6 Following on from Point 3, I could use ideas that Edwin Gordon proposes for teaching
(Figure 22.8) (Gordon 1997). Having stabilized the responsorial sequence of ‘me, me, me’
and ‘you, you, you’, I start slowly to change the rhythms. I insist that the students over 9 years old
answer with ‘you’. Younger children seem to prefer, at first, to imitate completely and to
answer with ‘me’, until finally they come to enjoy the sudden changes between ‘you’ and ‘me’.
In addition, if I wish to make the children aware of this communicative chanting in the
context of the history of Western music, we can subsequently listen together to Papageno and
Papagena’s duet from Mozart’s The Magic Flute.
22.7 How to enter into a musical process through improvisation

A musical process usually begins with a coincidence, almost never with a request. A child may be
having fun making a noise by knocking a drawing pin against a chair. This attracts my attention.
1 I ask all children to close their eyes and I imitate the noise. Then I ask them to guess what it is
and to copy it.
2 Usually, this stimulates other children to invent new sounds and noises.
3 At this very moment, my attention is divided; I need to assess two items as quickly as
possible:
◆ Do the children understand that my sudden change in teaching style is not signifying that
I now accept any kind of disturbance? Here the teacher needs to quickly gauge the mood of
the group, to create emotional attunement; the reaction of the class may be challenging!
◆ I try to assess quickly which sounds or noises bear particular musical potential;
for instance, I listen for:
(a) sounds that can be performed very regularly;
(b) sounds that have a large range of volume;
(c) sounds that evoke certain associations.
4 I give the children feedback on these specific musical features, and I ask the children to repeat
these sounds as little musical patterns.
Teacher: me, me, me              Students: you, you, you
Teacher: you, you, you, you      Students: me, me, me, me
Teacher: me, you, me             Students: you, me, you
similarly ...
Teacher: you, you, me            Students: me, me, you
and so on ...
Fig. 22.8 The ‘Me and you’ game, based on ideas from Gordon (1997).
5 Subsequently, I might propose that we compose a piece with our ideas. The final form might
be a simple succession: one sound followed by the next. However, this could be rather
superficial, so there is at least one further compositional challenge that I like to introduce,
depending on the level of the class: how do we transition from one formal element to the
next? Do we achieve this by clapping once? By fading out and in? By shifting from one to the
other through slowly mixing the sounds? Through a succession of sounds that become more
and more dense? Via a noise canon—the new element emerging from the succession of
sounds? Through a type of rondo? Through a ‘groove’ (an ostinato to a song we like to sing)?
How the children work together as a group depends on the time of day, the season, and the
atmosphere in the school. If I consider that the children are likely to treat each other sensitively,
I will continue with the improvisation as follows.
The class sits in a circle. One child volunteers to be the dreamer. She may lie in the middle of the
circle, with her eyes closed. I explain to the other children that we are going to play a dream for the
‘sleeper’, but as a dream can never be planned, none of us knows how the dream is going to be
played or understood. So the class starts to play improvised sounds and noises for the sleeper.
Some of these improvisations find their own ending. Sometimes I have to finish the improvisa-
tion by signalling. Usually, the sleeper knows when ‘the dream’ is over, and everyone is eager to
hear what the dream was about. If we wish to relate this improvisation sequence to the historical
western canon, we could, for example, listen to Jimbo’s lullaby by Claude Debussy.
22.8 How to initiate a musical process through pulsation

1 At the beginning of a lesson, the children sit in a circle.
◆ I ask them to stamp wildly for a short time
◆ I ask them to stamp alternately loudly and softly
◆ I might continue by suggesting that the children stamp twice loudly, but leave a long interval
between the two stamps
◆ If the pupils think up more ideas, we try them out and combine them.
2 We start stepping on the spot and I encourage them to step to a regular pulse. While we are
practising this stepping and keeping in time, I invent a new verse or speak a well-known verse
rhythmically. This verse must be short or easy, so that the children can quickly learn it by
heart. It does not matter whether the verse rhymes or makes sense, but it is helpful if the verse
contains some point of climax. (This usually happens quite spontaneously, because a person
tends to create a ‘significant’ ending to a verse when speaking it rhythmically.) In the example,
a well-known children’s verse, I would work initially only with its first two bars (Figure 22.9).
[4/4] How much wood would a wood-chuck chuck, if a wood-chuck could chuck wood? A
wood-chuck would chuck as much wood as it could, if a wood-chuck could chuck wood!
Fig. 22.9 Rhythm of ‘How much wood would a wood-chuck chuck.’
Fig. 22.10 Being sensitive to the strength of a beat with a friend.
3 I gesture to the group to join me in clapping on the word ‘wood’. When the group is successful
in clapping together on this word, I let them see that I am pleased.
4 I now get the group to put their chairs anywhere in the room ‘like bushes in a wood’.
After this short interruption in the musical process, I again try to find a joint rhythm
with the group by all of us clapping together on the last syllable. Next, we start to walk among
the ‘bushes’, trying to step in time. The rules for the last syllable can now be changed—for
example:
◆ we all sit down;
◆ we hide under the chair;
◆ we decide to sit down and stretch our whole body while chanting a long ‘wood’ at the end
of the verse.
Including ideas that the pupils contribute to the lesson is more important than rigidly
following a lesson plan.
The ideas that follow derive from my work with pupils and adult amateurs. There are two
different exercises in this section: one for children and one for adults. While the exercise is
difficult for children (aged about 10 years), it gives them the opportunity to see that real
adult-type challenges can be found and met successfully in this way of approaching music.
For music students and adult amateurs, it will be easy to complete the exercise. Some of the
participants may be stimulated to become musically creative themselves after experiencing
this method of incorporating gesture and percussion into a verse.
5 We return to sitting in a circle, now taking the whole verse as raw material for the creation of
a new musical form.
6 The group is split up into a ‘first’ and a ‘second’ voice (Figure 22.11). The participants playing
the first voice stamp on the word ‘wood’. In the first part of the verse, this will be easy to do;
but in the second part, it becomes more difficult.
Fig. 22.11 The combination of the stamping and clapping rhythms in the wood-chuck pulsation game.
(Two staves in 4/4: the ‘wood’ voice and the ‘chuck’ voice.)
7 The participants playing the second voice clap on the word ‘chuck’, gliding the palm of their
right hand over that of their left hand to produce a clap followed by a throwing motion.
8 I suggest that the group speaks the verse more softly each time they repeat it. Finally, we do
not hear the verse at all, the only sounds being the clapping, gliding and stamping.
If one wants to demonstrate the relationship of pulsation games with works from the western
canon, one could listen to Bolero by Ravel, ‘Cavatine’ from the opera Carmen by Bizet, or the
Military Marches by Schubert.
There are still further challenges to come to grips with in our work with the Woodchuck verse.
The rhythm can be played in ternary mode or as a canon. When performing the verse as a canon,
the second voice normally enters after two bars. Possible, but more difficult, is the entry of the
second voice after two beats. This new development can be used, according to the abilities of the
group, as the rhythmical basis for further improvisations.
22.9 Entering into a musical process through playing with breath and vocalization

In lessons where I realize that the children require a lot of movement, I like to give the children
chiffon scarves or strips of crêpe paper to dance with. They can move freely and spontaneously
all over the room, and I accompany their movements on one of my instruments—the piano, the
tenor recorder or, preferably, the djembè (a type of drum, originally from West Africa).
1 I ask the children to take a deep breath before they start dancing. I adapt the length of a phrase in
my accompanying music to the approximate duration of a child’s breath as he or she breathes out.
Fig. 22.12 Exploring the dynamics of breath.
2 I give the following sequence the title The Indian circle, because I was shown this game
by a teacher who was related to the Indians of South America. The group gathers in a
circle, which they make as large as possible. We all breathe in together and then everybody
‘whooshes’ towards the centre of the circle as they breathe out—we then open out the circle to
begin again.
3 I start to show the group different ways of breathing out and entering into the circle.
It is good for the children to see that I am creating these movements on the spur of the
moment.
◆ Starting to move forward slowly, I breathe out on ‘shshsh’ with a powerful ‘magical’ gesture
of my hand. The children copy and repeat my movement. (The use of the hands is important
in this sequence, as it helps to divert the children’s attention away from their voices
and any self-consciousness as they make the unusual sounds.)
◆ With a powerful gesture, I take a step forward, aggressively shouting a loud and long
war-cry ‘uuaa’.
◆ I enter the circle turning round and round while making a ‘sirrrrrr’ sound and moving my
hands like a hummingbird’s wings.
4 Each time I return to the perimeter of the circle, the children and I repeat the ideas I have
shown them.
5 As the game proceeds, the children can develop their own ideas and take over the lead.
6 Now I have to use my intuition. I begin by interpreting some of the actions that the children
have invented. It is important to let the children feel that their contributions make sense to
other people, and especially that adults can relate to their ideas. I could say, for instance:
‘While we were doing Ellen’s idea, it looked as if we were all bubbles in the wind’; or ‘That
looked like a lot of windmills’; or ‘That looked like kites in the sky in autumn’; or ‘I saw a lot of
happy birds flying around and Henry looked like a bird enjoying its first flight in the morning,
all alone, before the others joined him.’
It usually turns out that some idea or other that the children contribute works in well with the
content of the lesson I have prepared.
Songs and verses about the weather are particularly suitable as a basis for musical creations
involving dynamics. Using these types of songs we can explore the possibilities of changes in
intensity, what Daniel Stern terms ‘intensity contours’ or ‘modulations of intensity gradients’
(Stern, 1985/2000, p. 88f). Thus, we can experience different ways of becoming more intense,
softer, more subdued, quicker and so on.
If we intend to make the children aware of these changes in intensity in the context of western
music, we could listen to the ‘Summer Storm’ from The Four Seasons by Vivaldi, or the ‘Morning’
Symphony by Haydn. To make the children especially aware of the connections between breathing
and vocalization, we could listen to recordings of Bobby McFerrin (see www.bobbymcferrin.com
for examples) or the ‘Cat duet’ by Rossini.
There are many more and different ways of initiating musical and dance experiences through
improvisation and communication games than are described here. What is important is bodily
awareness in connection with pulse and the breathing cycle. In the work of well-trained, imaginative
and naturally musical and creative teachers, there should be no limits to the themes of their teaching.
22.10 The ideal setting

Having described ways of teaching which in my experience engender a lot of joy, mutual
understanding and interest in musical creativity, I wish to describe the setting in which I teach in
Basle, Switzerland. Thanks to a well-organized timetable and close liaison
between the music school and the public school, I had the opportunity to develop this interactive
way of teaching.
I generally teach one half of a primary school class in a special music room. The other half of
the class has a lesson with the class teacher at this time, and afterwards the half-classes change
round. The music room is only used for the purpose of music lessons, and contains no desks; it is
a clear space, so that the children have a large area in which to move about and sit on the floor.
Cupboards around the room contain many percussion instruments, xylophones, chiffon scarves,
ropes, balloons and stringed instruments. Thus, it is always possible to spontaneously include the
children’s own ideas in a lesson. I personally think that this kind of room is not only helpful, but
is necessary, to be able to teach music and movement effectively.
22.11 Signposts for a lively approach to music making in groups

To conclude, I would like to list some of my teaching principles that I use to motivate and
encourage vitality in my pupils.
Fig. 22.13 Building connection to others’ sounds. (See colour plate 6.)
◆ Dynamics, natural forms of expression, contact with others and self-awareness are crucial
experiences in a child’s early encounters with music, and if the child’s overall ‘health’ is in
mind, these factors are considerably more important than, for example, pitch-matching or
sight-reading.
◆ Begin teaching through actions what might be called experimental music theory by using
young children’s own approaches to their environment as the basis for music lessons. Use
‘adult’ terms, such as ‘piano’ or ‘crescendo’, sparingly, almost like a passing comment. Consider
using the children’s terminologies and approaches as much as possible.
◆ Interpret the children’s spontaneous actions musically, and connect them with both your own
teaching repertoire and your artistic repertoire.
◆ Move beyond simply teaching a song, to building connections to the song by combining it
with the child’s spontaneous ideas and actions and by improving on or extending these.
◆ Avoid teaching notes and rhythmic notation at too early an age. By teaching the notation
system, you are not teaching music! This is important, because, by insisting on cognitive
learning, one restricts other essential musical experiences. Music is neither cognition nor
writing nor sight reading, although cognition is, of course, a helpful tool in making music.
Any tool that one tries to use without having a good experiential background is restrictive
rather than helpful.
◆ However, use notes and rhythmic notation as often as they can help the children to remember.
A child will accept such things as reminders, even if he or she does not understand the whole
system. In a lesson where the child is learning to play an instrument, this problem resolves
almost by itself. However, a child who does not play an instrument may forget the notation
system quickly. I believe we are wasting the child’s time and missing opportunities for essen-
tial ensemble experiences if we teach ‘useless’ notation in the classroom.
◆ A lesson in the art of music is something like a score, where different musical issues interact or
interplay, like voices in a piece of music. The various aims of music lessons cannot be taught
separately, as we then miss the basic experience: music. Such issues as singing in tune, reacting
to a signal, listening to the others, creating new material, combining visualization and sound
production, and combining movement with songs can be considered as the ‘voices’ in a piece
of ‘teaching’. Be sure to combine them in balance.
◆ Be aware that musical aims are not easily divisible, and that musical moments are neither
repeatable nor reproducible. The quickest way, the most efficient method, runs the risk of cutting
off basic musical experiences. Slow is also musical. Didactically slow is ‘anthropologically
musical’ and is sure to have a lasting effect on pupils.
References
Altenmüller E and Gruhn W (1997). Music, the brain, and music learning. GIML series vol. 2, GIA
Publishing Inc., Chicago, IL.
Altenmüller E, Gruhn W, Parlitz D and Lieben G (2000). The impact of music education on brain networks:
evidence from EEG-studies. International Journal of Music Education, 5, 47–53.
Aronson E and Rosenbloom S (1971). Space perception in early infancy. Science, 172, 1161–1163.
Bjørkvold JR (1992). Det musiske menneske – barnet og sangen, leg og læring gennem livets faser. Hans
Reizels Forlag, København. Published in English as The muse within: Creativity and communication,
song and play, from childhood through maturity, 1992. Aaron Asher Books/Harper Collins Publishers,
New York.
Blacking J (1988). Dance and music in Venda children’s cognitive development. In G Jahoda and
IM Lewis, eds, Acquiring culture: Cross-cultural studies in child development, pp. 91–112. Croom Helm,
Beckenham, Kent.
Custodero L (2002). Seeking challenge, finding skill: Flow experience in music education. Arts Education
and Policy Review, 103(3), 3–9.
Deci EL and Ryan RM (1985). Intrinsic motivation and self-determination in human behavior. Plenum Press,
New York.
Donald M (2001). A mind so rare: The evolution of human consciousness. Norton, New York and
London.
Fröhlich C (2002). Präsenz und Achtsamkeit. Beiträge zur psychosozialen Prävention aus Musiktherapie und
Elementarer Musikpädagogik. [Being present, being aware. Contributions to psychosocial prevention from
music therapy and elementary music pedagogy.] Lang, Frankfurt.
Gebauer G and Wulf C (2003). Mimetische Weltzugänge. Soziales Handeln – Rituale und Spiele – ästhetische
Produktionen. [The mimetic relation to the world. Social actions – rituals and games – aesthetic productions.]
Kohlhammer, Stuttgart.
Gordon EE (1997). A music learning theory for newborn and young children. GIA, Chicago, IL.
Gruhn W and Rauscher F (2002). The neurobiology of music cognition and learning. In R Colwell and
C Richardson, eds, Second handbook of research on music teaching and learning, pp. 445–460. Oxford
University Press, New York.
Jaques-Dalcroze E (1921). Rhythm, music and education. Putnam’s Sons, New York.
Kühl O (2007). Musical semantics. European Semiotics: Language, Cognition and Culture, No. 7. Peter
Lang, Bern.
Kuhl PK and Meltzoff AN (1982). The bimodal perception of speech in infancy. Science, 218, 1138–1141.
Lawson KR (1980). Spatial and temporal congruity and auditory-visual integration in infants.
Orff C and Keetman G (1950). Musik für Kinder [Music for children]. Schott, Mainz.
Rauscher FH, Shaw GL and Ky KN (1995). Listening to Mozart enhances spatial-temporal reasoning:
Towards a neurophysiological basis. Neuroscience Letters, 185, 44.
Riemann F (1991). Grundformen der Angst. Eine tiefenpsychologische Studie. [Basic forms of fear.] Ernst
Reinhardt, München, Basel.
Stern D (1985/2000). The interpersonal world of the infant. A view from psychoanalysis and developmental
psychology, 2nd edn, 2000. Basic Books, New York.
Stern D (2004). The present moment in psychotherapy and everyday life. Norton and Company,
New York.
Tarasti E (1994). A theory of musical semiotics. Indiana University Press, Bloomington, IN.
Trevarthen C (1977). Descriptive analyses of infant communicative behavior. In HR Schaffer, ed., Studies in
mother–infant interaction, pp. 227–270. Academic Press, New York.
Trevarthen C (1980). The foundation of intersubjectivity: Development of interpersonal and cooperative
understanding in infants. In D Olson, ed., The social foundation of language and thought, pp. 316–342.
Norton, New York.
Trevarthen C (1993). The self born in intersubjectivity: An infant communicating. In U Neisser, ed.,
The perceived self, pp. 121–173. Cambridge University Press, New York.
Chapter 23
Intimacy and reciprocity in improvisatory musical performance:
Pedagogical lessons from adult artists and young children
Lori A. Custodero
The infant is born in the same universe where lives the adult
of ripe mind. But its position is not like a schoolboy who has
yet to learn his alphabet, finding himself in a college class.
The infant has its own joy of life because the world is
not a mere road, but a home, of which it will have more and
more as it grows up in wisdom. With our road that gain is
at every step, for it is the road and the home in one;
it leads us on yet gives us shelter.
Tagore¹ (1921, p. 91)
¹ Rabindranath Tagore (1861–1941), Nobel poet laureate, founded a school, Santiniketan, outside Calcutta,
where classes were held outdoors and children learned through a diverse subject base involving artistic
expression and appreciation.
23.1 Introduction
Through music, we learn about ourselves and about our relatedness to the world; our engagement
with melodies and rhythms provides opportunities for appreciating sound, people and
ideas. Such sympathetic understanding is a product of interaction, a connection (or collision)
between perception and experience, between social context and individual disposition. We make
meaning through shared experiences of musical time and space; we find personal solace and
inspiration from our own and others’ individual interpretations within those dimensions.
Through the compelling qualities of organized sound and the rewards of making such sound, we
are drawn to others through revelations of our common humanity, and drawn to artistry
through realizations of uncommon accomplishments.
This connectedness and sense of competence have their foundations in infancy, when mothers,
fathers, grandparents and childcare providers use musical speech and song, paired with affective
facial expression and gesture, to communicate the structural and emotional nature of shared
experience. These early relationships are intuitively co-constructed, motivated by the need to
know more about each other through listening and responding to perceived meanings in sound.
They are also educative: dancing between responsivity and receptivity, invitations to teach are
heard and acted on by both parties. In this chapter, I consider communicative musicality to be a
fundamental source of relationship, comprised of musical dialogues that generate knowing of the
world through knowing each other. Such knowledge is gleaned from listening deeply to what is
offered, and thoughtfully offering back interpretive responses created in respectful imitation and
expansion of the original. As a mutually informing and emergent process of co-discovery, communicative
musicality functions in support of human development across diverse populations
(Trevarthen 1998), and holds promise as a framework for examining educational contexts and as
a foundation for pedagogical decisions.
Communicative musicality serves a clear purpose in infancy, bonding parents to children,
inducting them into the culture through creating a social connection governed by turn-taking
and mimetic behaviour (Malloch 1999; Trevarthen and Malloch 2002). Throughout the lifespan,
communicative musicality educates through inherent challenges, requiring participants to
attend to the moment; listening and creating, they hypothesize the content of the next musical
moment based on their experience. As they simultaneously read feedback in the musical cues and the
social context, they adjust the emotional nuances of a performance, reassess listening expectations,
and transgress compositional conventions to surprise imagined audiences.
As suggested in the quotation that opens this chapter, the lifespan view is especially informative
when examining mature artistry and the world-making of young children:
The comparison of the natural genius of the child with the cultivated inventiveness of adult genius, especially
at the highest levels, is justified by the fact that both ages are in search of true metaphor[s] which
release the organizing powers of mind and nervous system into action and the making of meaning.
Cobb (1977, p. 102)
These two groups—young children and mature artists—are more alike than adolescents and
adults, who differ in their tolerance for the metaphors of which Cobb writes (Gardner et al.
1990). Picasso’s famous comment about working a lifetime to recapture the child-like approach
to his own artistry has been echoed in the reflective words of countless musicians (e.g., Barron et
al. 1997; McCutchan 1999).
To reveal and investigate these similarities in their most fundamental rendering, I review
improvisatory practices in adult artists as they consciously enter a space of freedom and receptivity,
and in the spontaneous music-making of young children in everyday life. The focus on
indigenously conceived music-making, responsive to the social and physical conditions of the
environment, provides evidence about the symbiosis between what Tagore refers to as ‘journey’
and ‘home’. These metaphors speak to the complexity of artistic experience, which requires a
momentum of personal innovation and invention in an environment of the known; it is the
simultaneity of creative living—of being and becoming.
Earlier sections of this book introduce readers to the inherent qualities of communicative
musicality relevant to the establishment of relationships and the maintenance of culture, and to its
role as an organic resource for coping with debilitating conditions. I follow this thread, which weaves
together ordinary day-to-day life experience with extraordinary capabilities to understand and
share through musical cues. In this chapter, I provide evidence of the organic synchronous and
reciprocal interactions that define communicative musicality between infants and caregivers as
these same forms of interaction re-emerge in and between adults and young children in freely
creative settings. Beginning with the voices of two expert composers, I consider communicative
musicality as it exists in fully developed musicianship. The interview transcripts of these
professionals’ reflections lead to insights into the process and products of collaborative music-
making, and suggest similarities to mother–infant communication. Following this interpretation
of musical improvisation in adults, the spontaneous music-making of young children is
addressed, observed and interpreted in the voices of researchers and parents. These self-initiated
contexts of music-making in adults and children suggest that, when left to our own devices, we
learn from our interactions with each other and with the affordances in our environment.
Tracing communicative musicality in its post-infancy development from these various viewpoints,
themes of reciprocity, intimacy, and agency emerge and serve to inform educational
understanding.
23.2 Engagement with artistry: voices of experts as improvisers

Elaine Barkin and Alexandra Pierce are composers with well-established careers in academia who
value and practise improvisation (Figure 23.1). As part of a phenomenological study (Custodero
2007), we met for a weekend of improvising and reflection on the process. I videotaped and
audiotaped their interactive spontaneous performances, participated sporadically in the music-
making, and entered into the dialogue when explicitly or implicitly invited.
Communicative musicality was evident in the details of collaboration: these were demonstrated
physically through the embodiment of musical gestures as they played the piano, and sonically
in their receptivity to musical invitations, which was characterized by a relatedness and cohesion
that conveyed careful listening to one another and a shared conception of narrative. Narrative
forms were nested in a variety of temporal frames: the individual performances were interpretable
at (a) a microlevel in the gestures and junctures that constituted a whole piece, (b) a larger-scale
narrative that encompassed a set of three performances over the day, and (c) a macro-narrative
that included the background of these performers’ experiences together. Reflections about
each of these narrative structures provided another mode of communication about shared
meaning.
“So it’s not so much [about] trying to improvise
...I like to find myself trying to get to some place
both musically and socially that I haven’t been
before, since it’s very much a social activity.”
Elaine Barkin
“I just do a something when you do a something.”
Alexandra Pierce to Elaine Barkin
Fig. 23.1 Engagement with artistry: voices of experts as improvisers.
The intimate nature of partnered improvisation created interesting challenges and considerations
for the definitions of communicative musicality. In this strong and historical adult friendship,
there was a keen awareness of the partner as different from oneself, heightened by
the research context. Elaine remarked that differences come to the surface during this type
of improvisational music-making, that ‘character types behave differently’, and that it is
‘better facing it and dealing with it than pretending that it doesn’t exist’. This seemed at first
irrelevant to mother–infant communication, yet it may be that the obvious cognitive, biological
and experiential differences between an adult and relatively young infant are at the core of
communicative musicality’s function: facing these differences through co-constructed narrative
renders them as perceived invitations to respond, rather than as obstacles to understanding.
Such differences add interest—they teach us about alternative perspectives—and through
mutual effort and intention create something unimaginable. The juxtaposition of personalities
has been known to result in rewarding musical experiences in improvisational settings
(Bailey 1992).
One of the most interesting aspects of this collaborative effort was the process of getting
started. There was noticeable discomfort and tension between the two musicians on my arrival,
most likely related to the motivation to engage; the setting, although naturalistic from a research
perspective, was less organic and intimate than their usual meetings. The issue was finally negotiated
in piano performances by each in separate rooms, still able to hear one another, but communicating
directly from the musical cues, freed from the physical cues that may have been
vulnerable to a sense of voyeurism created by the research context.
EB [foreground to the camera] slowly approached the grand piano, stood and played large, open
sonorities that were playfully inhabited with shorter musical flourishes by AP. There were rare, brief
imitational responses. Both performers were noticeably patient: during long pauses, there was suspended
stillness in which EB’s attention was visibly concentrated, her attunement regulated with
either closed eyes or gaze directed at the piano. At about 13 minutes, AP entered the front room with a
small bell, transgressing the boundary of physical space, and suggesting a denouement where difference,
or at least distance, was reconciled.
Custodero (2007, p. 84)
The qualities of musical gestures were invitational: Elaine offered open sonorities and
Alexandra would respond by accepting that space, and through her response, inviting continuation
through a complementary gesture. The playfulness of the response was an invitation to
lightness, a sort of gentle teasing used to dissipate tension. The physical distance allowed a musical
closeness that generated communicative musicality and moved the daily narrative forward.
Alexandra spoke of the other room as being ‘totally liberating’, and how she felt more tolerance
from her performing partner:
A: [I] sensed your tolerance from a distance because I’m so busy making you not tolerant.
E: The issue there is having a good time
A: Actually it was fun! [Then moving closer with intentional eye contact] I mean it was fun playing
with you.
A second improvisation followed in which Elaine played in the back room (where Alexandra
had previously been)—an act of conscious reciprocity. The performers also decided to establish a
timeframe of 10 minutes, to which their attunement was remarkably accurate.
This piece began with a flurry of musical dialogue—quick rhythmic gestures in a variety of registers
were answered through imitation and expansion. One example of this thematic expansion was a trilled
semitone figure, first introduced in sustained form by AP as the ‘grundgestalt’ opening the piece, later
reprised as an embellishment, then appropriated by EB who broadened the surface temporal quality to
a slow ostinato figure. AP broadened the motif’s spatial quality by expanding the intervallic structure to
a third, in foreground counterpoint to the more sustained sound. The overall collage effect culminated
in the surprise introduction of the ‘antique’ guitar timbre, plucked and strummed by EB as she entered
the foreground space, as AP had done in the previous piece.
Here, again, it is evident that the communicative musicality is defined through a shared understanding
of the musical material offered, thoughtfully received, and responded to in a returned
gesture of invitation.
The final improvisation of the first day was a complex array of invitations, interventions, and
responses received. The intimacy of physical proximity created a beautiful dance of hands—
spaces were generously provided and gently claimed. Although frustration was evident, so were
tenacity and tolerance for the ‘uninvited collaborators’, who elicited efforts on the part of both
performers to accommodate the unexpected in a musical way:
At last the two performers were seated together at the same piano, EB at the right. The performance
might best be described as a dance—the four hands moved with graceful gestures over the keys.
The visualization of sound—its preparation and release—was communicated to the observer, and seem-
ingly the performance partner, through the physical embodiment of the music’s character. At about
1.5 minutes into the piece, the duet became a quartet: AP’s two dogs joined the sonic landscape and continued
until the conclusion. Their chorus was familiar to AP; EB was notably distracted at first, worried
that she had inadvertently provoked them, but with quiet assurances from AP, who had continued in the
music space, she carried on. The close proximity allowed for much observable playfulness, including
(a) crossing arms to trade registers, prepared by AP with repetitive hand movements (her four fingers
acting as one unit lightly and repeatedly touching the thumb) as she edged over EB’s hands; (b) silently
depressing lower keys to allow upper sounding keys to resonate with a fuller spectrum of overtones; and
(c) musical responses, perhaps interpretable as imitation, to the dogs’ barking. The final cadence, with
slowing harmonic and surface rhythms, reflected a submission to the canines’ tenacity.
Custodero (2007, pp. 85–86)
These musicians were clear about what contributed to their communicative musicality—the
gestures were indications of ‘how you give someone else their space’. Just as mothers and infants
play with the space that defines the temporal nature of their narrative, shortening and lengthening
the response time, so, for these artists, improvising together was about inhabiting a personal
space and integrating it with the spaces offered by others. Alexandra spoke of ‘the open sounds’ as
‘a lovely room to be in’. Both were clear that these invitations were very different from call and
response, which they spoke of with a hint of disdain. For them, it was a matter of belonging—of
negotiating intersubjectivity with intrasubjectivity, expressed by Alexandra as a goal: to ‘give
someone else their space and joyfully (rather than creepily) take your own’.
For these musicians, it was a natural progression to move from a focus on their performing experiences
to teaching. Here, they were similarly conscientious, feeling that active and musical participation
was important, while being cognisant of the need to remain genuine. They spoke of
‘allowing people to have their own space’ and ‘finding ways to aggregate and let go of authority’,
asking ‘How I can stay where I am and still be a part of it?’ Speaking of improvising with her composition
students, Elaine reminisced about their need to get into a groove, and offered: ‘[The] easiest
thing in the world is playing along, imitating; I’m not interested in [that, but it seems to] give them
a sense of belonging—you are being listened to!’
The improvisatory experience lends itself well to the examination of communicative musicality,
as it calls for responsiveness to the momentary intersections of personal contact and musical
invention (see Lee and Schögler, Chapter 6, this volume, for an examination of the interactions of
improvising jazz musicians). These musicians consider the essence of musical activity: as they
play, they continually ask ‘What do I hear?’ The educative function was implicit in the process, as
they claimed, ‘your consciousness is absolutely raised’. Such activity is not without risk:
One of the most provocative things for any of us is that you can’t take anything back, it’s pretty naked. On
the one hand, I might say, there are no mistakes, and on the other hand I find myself saying ‘I wish … ’
In summary, using communicative musicality as a lens to examine the collaborative improvisational
processes of expert musicians reveals key issues about their artistry in ensemble. Musical
engagement with another is an intimate experience, as evidenced in the visual contact between
performers (Finnegan 2002) and the embodiment of and in the musical sounds. Dissanayake
(2000) makes a convincing case for the role of such intimacy as a psychobiological need that
finds fulfilment in the mutuality that leads to artistry (see also Dissanayake, Chapter 2, this volume).
Intimacy also yields vulnerability: the nature of improvisation puts one at risk from a musical
standpoint, as stated in the previous paragraph, and the nature of co-constructing with another
person makes one vulnerable to the possibility of a competing creative muse. Such risk-taking
also leads to learning—about personal capabilities for responding, about musical possibilities,
and about alternative routes seen from the perspective of another.
23.3 Observations of artistry in childhood: voices of researchers and parents
Communicative musicality in infancy may prepare children and adults for learning experiences
later in life, because it creates a disposition of trust and cultural belonging within a musical context
(Trevarthen 1999). As children grow and thrive, they extend their cultural interactions to siblings
and extended family, peers, teachers and others. This provides opportunities for experiences
that educate, as much by the method of delivery as in the content of the text (Smith 1998). Being
biologically disposed to musical content (Trehub 2001), young children discover the consequences
of their actions in the world as they initiate and respond to musical models: they vocalize
and move rhythmically as they go about their everyday lives (Bjørkvold 1989; Campbell 1998) in
communication with sounds, objects, conditions, and people in the environment. Like adult
improvisers, children bring their own past experiences and technical facility, that is, their own
expertise, to the music they make (Figure 23.2).
Music-making can be observed virtually anywhere children are present—in airports and subways,
homes, synagogues and churches, playgrounds, restaurants, museums, concerts, birthday
parties and even grocery stores. Children make music as ‘soundtracks’ to their experiences, and
they respond musically to sounds in their environment. Compared with infant–mother interactions,
communicative musicality in young children may be interpreted more globally, as they
strive to interact, to understand their place in local culture. This is achieved by embodying the
sounds and gestures heard and seen, transforming temporal and spatial characteristics into an
indigenous repertoire of movement and vocalizations.
Fig. 23.2 Mother and daughters engage in musical play.
In interviews with teachers and parents, and in recording children’s musical behaviours that
surround me as I travel around the world, I am presented with consistent, overwhelming evidence
that such music-making and responses permeate the daily lives of young children:
Personal diary entry, Taipei, 17 July 1999
In the hotel restaurant, evening meal. There is lyrical instrumental music emanating from loudspeakers,
meant to be background, it sounds like Chinese folk music. A young girl, about four years old,
is here with her extended family, a group of about eight people. She is the only child and in need of
interesting activity. Seemingly oblivious to anyone who might be watching (although I am careful to
sneak fleeting glimpses rather than a prolonged gaze) she takes a flower from the vase on the table, and
begins to dance very expressively to the music. The interpretive movement is mesmerizing to behold,
and she seems transfixed and transformed by the delightful synchrony of sound and gesture. She continues
this for over five minutes until she is told to return the flower by the adults.
The intimacy of self-expression demonstrated in this example shows a communication with
the sound itself. The young dancer is having a private moment, interacting with the music, with
no visible human partner. Although it may be categorized as solitary, her movements reveal the
knowledge of others and reflect the embodiment of familiar response. Here, she is able to draw
freely from collective models she has witnessed, and from the implicit character of the sounds.
It may be the intrapersonal development through music that makes possible meaningful interpersonal
communication. Like the expert musicians who had to remove themselves from the
physical intimacy to have musical intimacy, children may need to view themselves as agents of
musicality through their solitary engagement, to initiate social engagement (Archer 2000).
Van Manen and Levering (1996) write about secret and sacred places of childhood, and
how they present opportunities ‘to experiment with possibilities of being, of daydreaming,
of feeling, of wondering, of sensing’ (p. 25). The imagination and adventure that these spaces
inspire result from environmental conditions that are familiar and orderly, providing ‘an atmosphere
of happy belonging’ (p. 31). As discussed elsewhere in this chapter, music is an inherently
familiar mode of communication; its compelling qualities—which demand so much of our
sensory attention—coupled with its associative images can transport individuals from the
mundane real to the imagined past or invented present.
A similarly intrapersonal experience was collected six years later by a student researcher,
who participated in a formalized study we called ‘One day in Taipei’ (Custodero et al. 2006).
Research teams visited 16 family-friendly locations in Taipei City, Taiwan, that included
museums, parks, public transportation, shopping malls and restaurants one Saturday in July
2005. (The specific sites had been chosen on the basis of previous visits, to ensure the presence of
families with young children.) We recorded 42 episodes of children’s music-making—singing,
humming, rhythmic chanting, playing objects as musical instruments (in patterned and repeated
phrases)—and analysed them for function and meaning. The following episode could be inter-
preted as a boy in communication with his surroundings, receiving and responding to
the implicit musicality of nature, his vocal reflection and physical position an embodiment of the
calm and quiet setting. The observer of this episode, familiar with the musical culture of
the region, recognized no idiomatic structures in the boy’s vocalization—it was as though he was
using the visual aesthetic as a score.
The boy sang, lying on the balcony and watching the beautiful landscape of Taipei Botanical Garden.
He seemed so relaxed. I guess the beautiful scenery made him sing. He was not singing a [learned]
song, but he was singing his own music. The music was so soft. Then, his mother told them to watch
the lotus pool and he stopped to watch for several seconds, and then he continued singing… for about
five minutes. I expected his sister to sing…, but she didn’t sing with him. Five minutes later, their
mother announced it was time to leave.
Both of these episodes show how spontaneous musical behaviours that are part of a child’s
private world are often ignored or misinterpreted as intentional disruption or non-directed
behaviour. In the first example above, adult attention on the activity causes it to cease. This is a
common occurrence in the episodes I have collected; it is as though the adult, however well-
meaning, has interrupted the narrative and transgressed the special place that music created. Like
the canine chorus in the adults’ improvisation, the uninvited interjection or even attention from
an adult can rob a child’s experience of his or her privacy and imaginative musicality.
The musical communication between young children and adults is less visible in public places,
where motivations are often at odds with one another, unlike at home, where parental attention
can be more focused on a child. Below is an exception, an example of adult–child communicative
musicality, where both are intently focused on the activity at hand:
Personal diary entry, New York, 19 July 2003:
In a subway, late afternoon. I jump on the C train, ready to notice all the subtle music-making that
provides accompaniment to my ride. Today, it is not so subtle. A young girl, looking to be about five or
six years old, is inventing a hand-clapping game with her companion, a young adult male, perhaps her
father or uncle or brother. They are singing ‘Do Re Mi’ from The Sound of Music, and she is clearly the
leader. He follows, but offers his own ideas of how to expand the activity—he wants to make the song
into a canon. They collaborate, attempting to negotiate entry points, the child is giggling with each
iteration of the tune—she continues her efforts, and he continues his support. The duo has attracted
the attention of everyone in the general vicinity, and we look at one another and smile in acknowledgement
of the joy, initiated by the child and thoughtfully supported by the adult. Their activity continues
for at least 10 minutes until they depart the train…
Here, communicative musicality is evident in the shared intentions of the child and adult
and in the secondary enjoyment of the other passengers—it is as if we are part of the game
because we recognize and sympathize with the interactions we witness. The duration of the activity
is longer than the two ‘solitary’ examples, and it comes to an end with a common motivation
(to get off the train). The nature of spontaneous music-making differs greatly depending on the
context: solitary music-making tends to be ephemeral, non-metered and reflective, while music
shared between children tends to consist of short, repeated rhythmic fragments, often using
onomatopoeia (Moorhead and Pond 1978). The first two cited examples also differed in temporal
character from this one: the slow, easy tempo and sustained gestures of the children’s private
renderings contrasted strongly with the interactive patterned repetition.
The adult behaviour in this episode is distinct from that of the other adults in that it acknowl-
edges the child’s musical play as worthwhile; responds to (rather than appropriates) the child’s
agency in crafting the narrative; and follows the child’s sense of temporality. In a study of 10 fam-
ilies with 3-year-olds, we had parents keep diaries of their own children’s musical behaviours that
were particularly noticeable; the goal was to get a sense of what they felt was meaningful, rather
than to obtain frequency counts (Custodero 2006). In these reports, we found much evidence of
communicative musicality: shared intentions, intimacy, reciprocal modelling, as well as success-
ful and unsuccessful attempts to support the child’s agency.
There was much diversity in the musical interactions within families and even between family
members. In one family, the mother and father had complementary modes of musical interac-
tion with their only child. Both the self-described musically untrained mother and the music
educator father demonstrated sensitivity to this mode of childhood sense-making. Parenting has
been examined as a developmental skill, one that responds to the changing needs of the child
(Bornstein 2002). The maternal interactions in this family seemed responsive to the spontaneous
singing of children typical of her daughter’s age, around three years (e.g., Bjørkvold 1989). The
mother found that she was surprised by the amount of music that filled her parenting time, and
noted that she used her own mother’s techniques as a model:
I just noticed during how much of the day music happens, like my mother used to do, a driving song,
a waking up song, or just making up silly songs, nothing structured … spontaneous music happens all
through the day. I’ve always sung about the mundane things we do in our daily life. You just know
intuitively what to do, when something works well [musically].
The father saw himself as the provider of more formal experiences: he was most surprised by
the agency with which his daughter participated:
I would say one thing that surprised me in writing the stuff down is how many times Melanie would ask for
certain songs … what she wanted me to play [during] our piano time … I would say, half the songs are ones
that she’ll say ‘Oh, let’s do “Skip to my Lou”, let’s do “Kookaburra”, or “Sloop John B”’, or whatever it was.
And a few times, she made up little songs. She would make up the lyrics and kind of sing the melody a little
bit, in kind of a singsong way, and I would play … some basic I-V [harmonic] progression behind it.
This role of adult as accompanist is similar to the roles the improvising adults took for one
another (see Section 23.2), presenting spaces in which the child can play. As in the mother–child
interactions vis-à-vis the spontaneous singing, the father–child interactions involved reciprocity,
though it came about in a more formalized way:
She recently had her preschool graduation, so we had the song lyrics to things like ‘Little White Duck
Sitting in the Water’, and songs like that. And she would teach me, ‘you have to move your hands like
this’, and you know, like they were in school, certain movements. ‘No, Daddy, you’re not doing it right,
it’s supposed to be this thing there’.
Perhaps because it is present so early in life, music provides ways for children to be experts early
on, thereby providing the substance for sharing in more equitable ways with adults, who tend to
control much of children’s lives (Figure 23.3). The mother from another family in the study offered
this voice-journal entry regarding her 3-year-old daughter (words in parentheses refer to non-text
vocalization):
In the morning, when Katie was at the breakfast table singing ‘Head, Shoulders, Knees and Toes’ she
then … started singing a little song they taught her at school for Daddy Donut Day. Something
like ‘Three little donuts in the donut shop, some powdered sugar on the very top’ (chants very
rhythmically). Anyway, she used to know the song very well and she’s sort of forgotten some of the
words. So she got stuck. And I started singing the song with her. She fussed at me (laughs) and she was
very angry. She said ‘Mom, don’t sing’. Apparently she wanted to do it solo, which I thought was kind
of amusing. Katie does prefer to sing herself (laugh/sigh).
Here, Katie is asserting herself as the musical agent; her mother’s well-intentioned (yet unin-
vited) assistance was quickly rejected. This resistance to adult help in an activity for which the
child feels expert may be a response to the adults’ misreading of mistakes as a cry for assistance,
rather than mistakes as an opportunity for discovering (or rediscovering). However, in another
episode, Katie does invite her mother to a solo performance—this time improvised. Delivered
with artistic expertise, it communicates:
After Katie had played the little piano and sung to me in her bedroom, she brought the [instrument]
into the … foyer of our house … we were alone … ‘I’m going to sing my last song to you.’ She sang this
lo-o-o-ng thing, where she was playing piano, you know, just punching the random keys, and she …
wasn’t singing any recognizable tune, but she was singing. And it was the long expressive thing, where she
was talking about her whole day. She talked about how she missed her mom during the day, and then
she was happy when her mom came home. She talked about how she was sad because she wanted
her beautiful day, and the rain kept coming and it wouldn’t go away, and she couldn’t have her beautiful
day because of the rain, and all this stuff. Very lengthy, expressive, thing she was doing, she was singing,
while she played the piano. And of course, I was thinking, well, you know, Katie’s a genius. OK, that’s
it (laughter).
Communicative musicality is a diversely interpretable phenomenon in post-infancy early
childhood. As both a consequence of and contributor to human development, it is manifest in a
broad array of partnerships—observable in homes and in public spaces between parents and
children and, in seemingly solitary explorations of agency, in response to aural and visual cues
from the environment and in renditions of learned songs of the culture.
Like the expert musicians, the children in these observed episodes took risks for their artistry, and
were similarly vulnerable to exposure. Music creates a frame for experience, bounding secret play
spaces for children that ‘appear formative of the creative realization that things can be otherwise
than they are now’ (van Manen and Levering 1996, p. 33). The collective understanding that defines
communicative musicality is aided by the imitations and inventions that take place in these secret
spaces, where individual agency is practised as listening, reception, and response to the salient fea-
tures of the immediate context and models held in memory. Given these demonstrations of musi-
cality’s educative function in our experiences as agents of consequential action and as artists whose
actions provoke companionship in the emergent creation of narrative structures, it follows that
instructional contexts may be venues where communicative musicality should flourish.
23.4 Teaching to artistry: communicative musicality and pedagogical insights
Qualities of communicative musicality can be traced in the improvisatory musical activity of both
young children and professional composers. The emotional intensity, the appealingly unifying
temporal structure, and the narratives of reception and response to gestured invitations are
evident in musical interactions beyond infancy. Such characteristics suggest a musical pedagogy
based in a vision of learners as agents, a commitment to co-construction, and a responsive
disposition open to possibility.
Fig. 23.3 A child invites adult participation.
However, because these groups have accrued more experience from which to tender and evalu-
ate responses, and because the interactions often involve more random or assigned partnerships,
communicative musicality may become less intuitive and more vulnerable to self-doubt and
other factors resistant to improvisatory practice. Alexandra Pierce addressed this issue in a per-
sonal correspondence following the improvisation study.
Improvisation, for me, begins as a looking-forward-state-of-curiosity, coupled with an inwardly
decided-on absence of fear. Once begun, the improvisatory spirit continues as a deliberately chosen
freedom from negative inner comments as I hear occurrences that aren’t (objectively) ‘good’ or musi-
cally engaging. Overlying this unfearful trust (confidence) in unfolding events is a self-encouragement
(ongoingly recharged) to awaken responsiveness to the moment and to its evolving context.
I propose that this same deliberate focus be applied to teaching. Returning to Tagore as a
source of timelessness of these issues, I now address common risks and rewards that might be
considered when transferring the observational and theoretical to educational practice. The tem-
porally structured narratives of reciprocity and intimate spatial proximity to the music, and often
to the performing partner(s), not only define infant–mother interactions; they may also serve
teachers in their pursuit of meaningful musical educational experiences.
23.4.1 Emergent musicality: a case for responsive pedagogy

We have to keep in mind the fact that love and action are the only media through which perfect
knowledge can be obtained, for the object of knowledge is not pedantry but wisdom. An institution of
this kind should not only train up one’s limbs and mind to be ready for all emergencies, but to be
attuned to the responses between life and the world, to find the balance of their harmony, which is
wisdom. The first important lesson for children in such a place would be that of improvisation, the
ready-made having been banished in order to give constant occasion to explore one’s capacity through
surprise achievements. I must make it plain that this implies a lesson not in simple life, but in creative
life. For life may grow more complex, and yet, if there is a living personality at its centre, it will have
the unity of creation.
Tagore (1926/1997, pp. 256–257)
Like mother–infant interactions, improvisation is quintessentially experiential (Iyer 2004), a
‘celebration of the moment’ (Bailey 1992, p. 142). Occurring in real time, the performer interprets and
responds to expectations implicit in musical materials. Both in collaborative or private
improvisatory performances and in classrooms, rehearsal halls, and studios, musical invitations emanate
from music and context, responding to an ever-accumulating repertoire of possibilities.
Teaching to ‘surprise achievements’ calls for an open attitude toward possibility; it seems to
address the qualities the adult composers described (‘Getting to a place both musically
and socially that I’ve never been to before’) and the sense of wonder on the part of parents who
observed their children’s music-making. It suggests a pedagogical style that honours diversity
and uniqueness, and looks for what hidden or secret talents may be used as entry points to
instruction (van Manen and Levering 1996). Such a style may be at odds with much music
education, which is very much inculcated with the notion of well-defined achievement in the
form of competition criteria, national standards and institutional expectations. The music edu-
cator who works from a framework of communicative musicality must therefore negotiate these
two fields in facilitating what Archer (2000) refers to as ‘corporate agency’ (p. 11), a collective
approach that deals with transformation of social structures (see Fröhlich, Chapter 22,
this volume, for examples of this).
Additional risks and challenges facing those who choose to teach in response to students’ needs
and interests include the fear of exposure of personal deficits. These seem to disappear when the
focus can be on the moment, reading invitations and responding with an acknowledgement of
the offering and a sustaining contribution to the narrative.
In the spring of 2005 I interviewed Jason, a jazz musician, early childhood pedagogue, and risk-taker,
about a project he had done for my class a year earlier. He began the interview with a
model of artistic passion worthy of emulation, sharing his excitement on discovering how to
bend notes on his new harmonica.
It was a complete accident and I try to reproduce my accidents. I love accidents … Accidents tell a lot
about where you’ve been and where you’re going. Mistakes—mistakes are growth. If you never make
mistakes, you’re kind of stagnant, you’ve mastered whatever you’re doing and it’s time to move on.
Mistakes are the things that can give direction … If you never make a mistake, there’s no investigation,
there’s no growth. So I don’t strive for perfection; I actually enjoy the mistakes.
This ability to see mistakes as opportunities for discovery and growth requires attentiveness
and responsiveness to unexpected possibilities. Such qualities reflect a communicative musicality
that can be responsive to diverse ways of thinking and playing in the classroom. This was mani-
fest in Jason’s project, which involved his 8-year-old cousin, Billy, who has a severe learning dis-
ability associated with the inability to filter sound. Jason read all he could about the disability,
and then wrote a children’s book called Billy’s brave musical adventure in which Billy has to pass a
series of musical challenges to make his way to the dragon’s lair and unleash the colours.
An audio CD accompanied the book, providing musical content for each of the challenges, for
example, the naming of an emotional correlate (e.g., happy, sad) to the sound. Billy met each
challenge successfully, as demonstrated in the initial reading of the story to Billy, captured on
video. Jason believed in Billy’s communicative musicality when others did not, and was able to
reveal it with this project.
23.5 Musical intimacy as a way of knowing

Freedom in the mere sense of independence is meaningless. Perfect freedom lies in the harmony of
relationship, which we realize not through knowing but in being. Objects of knowledge maintain an
infinite distance from us who are the knowers. For knowledge is not union. We attain the world of
freedom through perfect sympathy.
Tagore (1926/1997, p. 253)
The closeness of infant–mother musical interactions creates a sympathetic resonance, a sense of
belonging generated by the motive impulse (Trevarthen 1999). Regardless of age, music calls forth
intimacy, reaching across our senses as we hear and attend to aural cues; as we sing, play or move to
the stylistic rhythms, contours and tempo; as we see images such as musical scores or choreography;
as we experience involuntary responses such as ‘chills’ or an increased heart beat (Gabrielsson 2001).
By attending to music, we get close to the music; it inhabits our being. Similarly, during his inter-
view, Jason, preschool teacher and jazz player, talked about the students he works with: ‘there’s no
limitation on what [children] can be… [they tell us] “I am it!”’ This embodiment of musical expe-
rience is integrative, and allows for a depth of understanding. Such understanding is fundamental,
as it brings to consciousness the emotional meanings of life through its practical significance
(Archer 2000). Re-awakening communicative musicality in teaching would honour this intimate
way that music can be known, involving practices that teach the conceptual through the physical.
Intimacy refers not only to the positioning of music within, but also the positioning of self in
relation to others while engaging with music. Like the adults that needed to be physically apart to
be responsive to the musical cues given to each other, students seem to know where to position
themselves to learn. Whether outside the group periphery to take in the perceived enormity of
the situation, or next to a friend with whom they can communicate with ease, young children
often have a sense of who and what conditions are needed to engage with music in a meaningful
way (St John 2006).
Intimacy also refers to personal spheres (van Manen and Levering 1996), where one can be
alone with their thoughts or their music-making, trying on possibilities within their own private
space, free from evaluation and with room to let mistakes guide the direction of artistry.
Educational settings that contribute to teaching and learning through communicative musicality
honour the implicit intimacy in musical experiences and acknowledge the importance of peers,
friends and private spaces.
23.6 Narratives of reciprocity: the rewards of teaching

In educational institutions our faculties have to be nourished in order to give our mind its freedom, to
make our imagination fit for the world which belongs to art, and to stir our sympathy.
Tagore (1926/1997, p. 261)
Communicative musicality is reciprocal: it influences both parties. In the case of educational
settings, it is not only the students who benefit, but also those who are charged with providing
instruction. This chapter concludes with the stories of two teachers who were receptive to the
possibilities of young children’s artistry, and responsive to what they observed in ways that were
personally gratifying and meaningful.
23.6.1 Musical intimacy: Kirsten’s story

Kirsten is a music teacher who was working with students at a Catholic K-8 school in Harlem,
New York when she provided an interview about a project she had done two-and-a-half years
earlier. She had been curious about what her youngest students thought about music, and inter-
viewed them in the school library. What she discovered was that her students could communicate
to her about the music they liked and why they liked it, and what kinds of music they heard at
home. She discussed her motivation for the project as a search for reciprocity, an attempt at
involving her partners in communicative musicality.
So often I think that students and parents aren’t taken into consideration when we think about music
curriculum. In actuality, children have so much more music happening in their life than just school,
especially at home, with family, whether it’s at their house, their church, or some other part of their
community context.
The process caused her to question conventional practices both in her impressions of the
profession and in her own classroom.
My students have a very different idea about what music is inside of school and what music is outside
of school. Many … thought that music is playing instruments or singing songs, but they made a very
clear distinction between that kind of making music … and what they may do at home … which
led me to the question, ‘Why do we feel the need to teach [only] Western music in our schools?’
because many of our students have a very different musical experience in their homes. As a first year
teacher, I was pretty dependent on the textbook, taking my ideas from that and getting my lesson
plans from that instead of looking to my students to see what they wanted to learn, to see what was
already inside them and to see what they could maybe expand on or maybe they could teach their
classmates.
Kirsten’s interest in her students’ lives and their musical agency and potential for contribution
was motivated by an acknowledgement of musical intimacy:
When you know something about a person’s family you are more connected to them, and I think that’s
another wonderful way to heighten our classroom environment—to make it more comfortable and
more exciting, because music is fun, and it should be.
During one of her interviews with the children, she was privy to this familial information when
a student replied to her question, ‘Do you ever make up your own songs?’ with a tuneful response
to these lyrics:
I like dolls and they love me too
I like my mommy and she loves me too
I like my daddy and he loves me too
Because I give them kisses
And my mommy too.
Through incorporating listening as a curricular tool, Kirsten was able to reconnect with
communicative musicality and find satisfaction and wonder in her interactions with her
students:
The experience itself of talking to my students was really interesting. And then after, incorporating
what they had said into my curriculum, seeing the outcome of that months down the line was really
fun … my classes with my Pre-K and Kindergarten groups, right now they are more free, in the sense
that my students actually get to explore music, and I don’t feel the need to make sure that they know …
about this composer, and this instrument. Because when it’s fun, and when it’s interesting it comes
more easily and they more readily get into it and enjoy it.
23.6.2 Reciprocity: Caroline’s story

Caroline returned to graduate school soon after the birth of her daughter. She teaches Suzuki
cello lessons and has an active performance career. She did an observational assignment
documenting the music-making of her 23-month-old child. Her sophisticated definition of music and
musical activity was challenged, and she was able to use what she knew to describe the commu-
nicative musicality she witnessed.
As I was explaining to Auntie (caregiver) what I needed to observe, I pointed out how S (daughter)
was hammering in a rhythm. It was a dotted rhythm with a sixteenth note and dotted eighth note …
While [Auntie] was talking on the phone I observed the children without her interaction. S took one
of the maracas and started to use it as a hammer. The other children circled around her. It was obvious
to me that the noise was a bit loud for them, but they were drawn to the unique sound it made. S’s
hammering was still in the same rhythmic pattern of the dotted rhythm, but now it was slower and more
deliberate. P (a slightly older child) picked up the other two maracas and began to play a more contin-
uous repetitive beat similar to his previous beat. S looked on with amazement …
She notices intimacy and the compelling nature of music, as well as the agency with
which very young children musically engage.
Auntie asked if C (another child) could sing ‘Twinkle, twinkle little star’. He sang pretty well on pitch,
starting with the pitch D. I was not aware of this while I was listening to C sing, but when I listened to
the tape later I heard one of the children rattle the bells and then a maraca in the background. I couldn’t
believe my ears that this little boy was singing so clearly and with correct words. Auntie started to sing
‘Row, row, row your boat’ and then C chimed in gently with ‘down the stream, merrily, merrily,
merrily, life is but a dream’. Auntie said to sing another one and he started to sing ‘Old MacDonald had
a farm’. This developed into a group sing along. All the children were singing by the end of the song.
After the song, the children went back to their playing. Soon, all four children were singing or
humming ‘Old MacDonald’. They each were doing their own thing. P was playing with a car,
R was climbing on the toy rocking duck, C was playing with some letters and S was busy trying to get
my pen from off the table. I was amazed that the children were all humming the same tune. They were
singing almost in unison with a slight overlap of the rhythm, creating a gentle canon-like heterophony.
It was magical.
Caroline’s use of ‘magical’ was indicative of her realization of communicative musicality and its
permeating presence in the lives of infants. She notes the social nature of music-making, even for
the very young, and is ‘amazed’ at the musical skill and attentiveness they display. For Caroline,
the outcomes of observing were profound and personal; the awareness of social referencing illu-
minated her experience as mother and also expanded the parameters of her own musical identity.
I started to become more aware of the many children’s musical behaviours that I encounter. I started
with my own daughter. Every night we have a bedtime ritual of ‘books in bed’, and then I lie with her
in until she falls asleep … I started to realize that she will sing or babble [to put herself] to sleep.
This past week, I have been recording her. I started to really listen to what she was singing rather than
getting wrapped up in my own thoughts of what I was going to do next. I have been astounded to lis-
ten to her sing songs she knows and to hear her rearrange the melodies or words, and also to hear her
make up her own songs. This process has brought me even closer to S and has also opened my own
mind to the spontaneous and wonderful world of children’s own music-making. I find it incredibly
fascinating that I too am discovering my own voice and have begun to compose music. This is some-
thing I had wanted to do as a child, and did as a child, but it was never considered important.
This teacher’s attentiveness to the communicative musicality of children led to unexpected
‘surprise achievements’ of her own. In addition to supplying a knowledge base that contributes to
teachers’ abilities to design ‘authentic’ curricula that are meaningful to children, the awareness of
children’s musicality fosters a sense of wonder and fascination that facilitates openness to emer-
gent musical behaviours that may exceed preconceived expectations. This linking of theory to
practice—the merging of home to journey and convergence of past experience and personally
derived empirical evidence—contributes to teachers’ developing habits of inquiry. Caroline
makes the connections between her observations, her own musicality and her teaching:
My ears have opened up to a new world. I can relate to my students who have always been troubled
with intonation problems and have the realization that they can now hear when they are in tune or
not. The ears and mind have a new understanding, have made a new connection of what it means to
really listen.
23.7 Conclusion
If content knowledge cannot be communicated in a meaningful way, it is useless at best and
possibly dangerous, inasmuch as dispositions toward lifelong learning are established early, and can
lead to musical/non-musical identity formation (Custodero 2003). Drawing on what we know
about communicative musicality, I have presented a view of music education as an interactive
social phenomenon that requires a responsive and receptive disposition to both the student and
the musical material. Such a pedagogical framework requires that teachers be prepared not only
to make plans and deliver instruction, but to attend to the learner. Communicative musicality for
teachers begins with listening and observing; by risking and respecting the intimacy of musical
experiences, they reap reciprocal rewards, learning not only about their students, but also about
themselves.
References
Archer MS (2000). Being human: The problem of agency. Cambridge University Press, Cambridge.
Bailey D (1992). Improvisation: Its nature and practice in music. Da Capo Press, New York.
Barron F, Montuori A and Barron A (eds) (1997). Creators on creating: Awakening and cultivating the
imaginative mind. Jeremy P. Tarcher, New York.
Bjørkvold J (1989). The muse within: Creativity, communication, song, and play from childhood through
maturity. HarperCollins, New York.
Bornstein MH (ed.) (2002). Handbook of parenting, 2nd edn. Erlbaum, Mahwah, NJ.
Campbell PS (1998). Songs in their heads: Music and its meaning in children’s lives. Oxford University Press,
New York.
Cobb E (1977). The ecology of imagination in childhood. Columbia University Press, New York.
Custodero LA (2003). Perspectives of challenge: A longitudinal investigation of challenge in children’s
music learning. Arts and Learning, 19, 25–53.
Custodero LA (2006). Singing practices in 10 families with young children. Journal of Research in Music
Education, 54(1), 37–56.
Custodero LA (2007). Origins and expertise in the musical improvisations of adults and children:
A phenomenological study of content and process. British Journal of Music Education, 24(1), 77–98.
Custodero LA, Chen JJ, Lin YC and Lee K (2006). One day in Taipei: In touch with children’s spontaneous
music making. Paper presented at the International Society for Music Education Early Childhood
Seminar, July 9–14, Taipei, Taiwan.
Dissanayake E (2000). Art and intimacy: How the arts began. University of Washington Press, Seattle, WA.
Finnegan R (2002). Communicating: The multiple modes of human interconnection. Routledge, New York.
Gabrielsson A (2001). Emotions in strong experiences with music. In PN Juslin and JA Sloboda, eds,
Music and emotion: Theory and research, pp. 431–449. Oxford University Press, New York.
Gardner H, Phelps E and Wolf DP (1990). The roots of adult creativity in children’s symbolic products.
In CN Alexander and EJ Langer, eds, Higher stages of human development: Perspectives on adult growth,
pp. 79–96. Oxford University Press, New York.
Iyer V (2004). Improvisation, temporality and embodied experience. Journal of Consciousness Studies,
11(3–4), 159–173.
1999–2000), 29–57.
McCutchan A (1999). The muse that sings: Composers speak about the creative process. Oxford University
Press, New York.
Moorhead G and Pond D (1978). Music of young children (Reprinted from the 1941–51 editions).
Pillsbury Foundation for the Advancement of Music Education, Santa Barbara, CA.
St John PA (2006). Finding and making meaning: Young children as musical collaborators. Psychology of
Music, 34(2), 239–262.
Smith F (1998). The book of learning and forgetting. Teachers College Press, New York.
Tagore R (1921). Thought relics. Macmillan, New York.
Tagore R (1926/1997). A poet’s school. In K Dutta and A Robinson, eds, Rabindranath Tagore: An
anthology, pp. 248–261. Macmillan, London.
Trehub SE (2001). Musical predispositions in infancy. In RJ Zatorre and I Peretz, eds, The biological
foundations of music, pp. 1–16. New York Academy of Sciences, New York.
Trevarthen C (1998). The child’s need to learn a culture. In M Woodhead, D Faulkner and K Littleton,
eds, Cultural worlds of early childhood, pp. 87–100. Routledge, New York.
shared with pride. Journal of Zero to Three, 23(1), 10–18.
van Manen M and Levering B (1996). Childhood’s secrets: Intimacy, privacy, and the self reconsidered.
Teachers College Press, New York.
Part 5
Musicality in performance
Performance of music is both personal creativity and intimate sharing. As a fellow musician in
a band or as an audience member in a concert, we may participate in what feels to be an intense
relationship with a performer, yet know very little about them, or they about us. What
makes this potentially intimate relating possible is a shared cultural code underpinned by innate
sympathy for performance, with its feelings of movement. The performer’s voice and body
produce gestural narratives through time that have the power to capture our embodied attention,
just as the gestural narratives of an infant can draw us into the rhythm of fluctuating intensities
of a protoconversation or action game. By expressing feelings in the time of song and gesture,
bringing life to a particular cultural form, the performer leads an audience in the narrative of an
imaginary journey. Or, where there is no separation between ‘performer’ and ‘audience’, equal
participants in a musical event may co-create ‘present moments’ (Stern 2004) within an agreed
dramatic narrative. Performance of music creates shared experience, bringing us closer in an act
of musicality and embodied agreement.
In Part 5, the authors consider performance as ritual, as expressing biological rhythms, as
coordinating production and as facilitating creativity. This story begins with an evolutionary
perspective. Ellen Dissanayake (Chapter 24), having argued for a foundation of the temporal arts
in mother–infant communication (Chapter 2), expands her vision to the role of ritual performance
in the development of human culture and meaning. She writes that:
Belonging, meaning and competence are vital human emotional needs, and the temporal arts in ritual
ceremonies help individuals achieve and sustain them. In ceremonies, bodies swayed to music result in
minds relieved of existential anxieties, firmed by convictions, and bonded with their fellows in common
cause.
Like Ellen Dissanayake, Nigel Osborne (Chapter 25) also looks to basic underlying factors for our
musicality in performance. But while Ellen turns to our need to foster meaning, Nigel, as professor
of music and composer, reflects on our biology—in particular with regard to our biological clocks
that measure the time of life—and turns to questions of the relationship between rhythm, biology
and the eddies of consciousness. Through considering our chronobiological nature, he posits that
Perhaps in the experience of pulse at the centre of our being … we most effectively grasp time.
We may speculate it is in the flow of unchanging musical rhythmicities that we can unite
memory and anticipation . . . count the passage of time, yet stay in the same stable, predictable, but
dynamic present.
(Pages 550–551, this volume)
The last two chapters of the book consider the biomechanics of performance: one from the
point of view of embodied coordination (Chapter 26), the other from the point of view of performance
as enthusiastic, passionate creativity (Chapter 27). Jane Davidson and Stephen Malloch (Chapter 26)
present musical performance
as the performer communicating with co-performers and audience through the intrinsic musicality of
body movements, the sounds that are produced by the movements of the body being disciplined by
cultural practice and performance technique so as to create meaning in a gestural narrative.
By considering body movements in two different performance types—a solo singer from the East
and the interlocking messages she conveys, and two professional Western classical performers
preparing a duet—the authors show that performers’ body movements create a ‘dance’ of meaning
in which the music is sounded. Part 5 ends with Chapter 27 by Helena Rodrigues, Paulo
Rodrigues and Jorge Correia—a description and exploration of two performance experiences.
The first is a celebration of parent–baby caring and companionship, Bebé Babá, where
parents and their babies come together with skilled theatre performers to create an environment
where adults and babies can explore their shared creativity and then present to an audience. The
second is a study of music students preparing for high-level performance using techniques that
encourage the realization of embodied narratives. The students are supported to leave behind
constant thinking about technical difficulties and drop into their embodied interpretation.
Performance is a way of ‘showing off’ our personal and shared creative communicative
musicality. Through expressing our individual specialness we discover a meaning we all wish to
know. Thus, with a flourish and a bow, our story of communicative musicality as told in this
book comes to a close… hopefully to be picked up and still further elaborated by inquisitive and
patient explorers of the human psyche who are moved and intrigued by our embodied emotional
and intentional ‘story-making’ nature.
Reference
Chapter 24
Bodies swayed to music: The temporal arts as integral to ceremonial ritual
Ellen Dissanayake
Some sort of emotional experience is probably the main reason
behind most people’s engagement with music.
Juslin and Sloboda (2001, p. 3)
24.1 Introduction
The primary context for the temporal and performative arts in small-scale societies is in ceremonial
ritual, where these activities (or this activity, since they occur together) appear to be essential
and universal. The temporal arts, which include singing, playing instruments, expressive gesture
and movement (clapping hands, marking time, dancing, performance), are the behavioural means
for conveying the message of the ceremony: they mark its importance and may even be used to
provide some kind of fundamental change in individuals’ consciousness (Alcorta and Sosis 2005;
Nettl 2000, p. 468).
There are a number of ways in which ritual (as ceremony) can be approached and understood.
Some scholars are concerned with its meanings to a society or to individuals—what the ritual
is about. Others are concerned with what a ritual accomplishes—its effects on individuals or on
the society as a whole. Still others are interested in how ritual accomplishes its meanings and
effects.
The present chapter is concerned with the last-mentioned, specifically, some of the ways in
which ceremonial ritual (as a behavioural manifestation or conduit of the beliefs transmitted by
the ceremony), through the temporal arts, allows a group of individuals to enact and share emotional
and non-verbal, as well as ‘cognitive’ (cultural or social) meaning. That is, although participants
in the ceremony may construct and wear antelope or chameleon masks made from
particular natural materials and dance in a certain way, and although the ceremony may be
‘about’ attaining a new age-grade or celebrating a good yam harvest, these peculiar means and
these particular practical ends acquire force and conviction through the emotional and bodily
meaning that is developed and transmitted by means of the temporal arts that are integral to the
ceremony—some might say is the ceremony. Most ceremonies are temporally organized events
based on protomusical capacities that emerge and are developed in infancy to enable emotional
coordination and concord with another person (Dissanayake, Chapter 2, this volume).
Such a view of ceremony will not be fashionable today, either with anthropologists or ethnomusicologists
(their fields being critical of what are called essentialist, overly general, or scientific
approaches) or even with evolutionary psychologists (who emphasize cognition and consider
emotion primarily as a proximate phenomenon that alerts or guides behaviour to ultimate
adaptive ends). Adaptationist studies of the arts are typically concerned with aesthetic features,
insofar as they arise from perceptual and cognitive predispositions for adaptive choices
(beneficial features of landscapes, vivid colours and sounds, Gestalt-like forms that are
cognitively satisfying, literary plots with romance and resolved conflict) or with aesthetic works,
insofar as they illustrate or are based on abstract categories of behaviours and goals (‘mating’,
‘parenting’ and ‘competing for status’). Rarely are aesthetic capacities or mechanisms considered,
that is, the behavioural and emotional means by which features or works have their effects.
But the arts do not have their adaptive (or any other) effects simply because they activate
cognitive modules that direct us to good mates, or because they contain the colour red that
connotes biologically salient stimuli such as blood or ripe berries. If the mere stimulus were suffi-
cient (say, a pornographic image or a gushing wound), there would be no need to embed these
categories or features in art works at all, where they are arranged with relation to other stimuli
and otherwise manipulated. It is the manipulations (what is done, the means to the end of the
art) that produce emotional responses or effects of the arts. And it is in the temporal arts,
which take place in time, that one can perhaps first begin to formulate principles about how
emotion is created and manipulated to expressive and eventually adaptive ends.
As just mentioned, most adaptationist studies of the arts deal with static, physical entities—
visual art objects or literary works, both of which exist concretely even when not being perceived
or perused. As aesthetic objects or works, they are the residue of behaviour, and often represent
life-like subject matter. In contrast, performative (or processual) arts, such as music and dance,
may not show or tell a story ‘about’ anything. They are behaviours (not static objects or works)
that have noticeable changing characteristics and effects as they take place in time and then cease.
Their emotional effects are found by perceivers in formal and expressive properties as much as or
more than in subjects and themes. In the temporal arts, participants and audiences are figuratively,
if not literally (in Yeats’s phrase), bodies ‘swayed to music’, there and then. They are moved,
emotionally if not overtly physically—as the word ‘e-motion’ itself indicates (Panksepp and
The vast and complex subject of musical emotion can hardly be addressed in a short essay (for
an introduction to the subject, see, for example, Kivy 1989, Langer 1953, Meyer 1956, and comprehensive
essays in Hodges 1996 or Juslin and Sloboda 2001). Here, I present ideas and findings from
ethology, evolutionary psychology and neuroscience that will, I hope, contribute to a continuing
discourse on the question of the generation and manipulation of emotion in the arts of time.
24.2 Ritualization and aesthetic operations

In Chapter 2 of this volume, I described ‘multimedia’ or ‘multichannelled’—that is, audio-
acoustic, kinesic, visual and tactile—interactions between mothers and infants that evolved to
enable an emotional bond between the pair and, at the same time, to reinforce ‘affiliative neural
circuits’ in the mother’s brain, thereby helping to assure her continuing care of and attention to
her baby. The coordination of behaviour and emotional expression between the pair is made
possible through ‘musical’ or ‘protomusical’ components—temporally organized expressive
vocalizations and movements of face and body that are conspicuously altered versions of similar
expressive signals of friendliness and accord between adults.
The head-nods, raised eyebrows, sustained smiles and undulant sounds made by mothers to
their babies resemble, at an abstract level, features of special kinds of communicative behaviours
that have evolved in other animals, especially birds (Huxley 1914). Ethologists have called this
evolutionary process ‘ritualization’ (Hinde 1982; Tinbergen 1952).
In ritualization, a movement from a practical, ordinary context (say self-grooming, or flapping
the wings before flight) becomes altered (formalized, repeated, exaggerated and elaborated) so
that attention is drawn to it, and it then communicates a new social message. No longer does
preening indicate simply cleaning one’s feathers, but when ritualized means: ‘Notice me. I want to
mate with you.’ Wing-flapping, when ritualized, no longer indicates preparation for flight,
but means ‘Note: this is my territory and I will defend it.’ In its ritualized form, the precursor
behaviour has been subjected to various manipulations that make it look as well as mean something
different. These manipulations or ‘operations’ include simplification (as stereotypy or
formalization), repetition (regularization), and exaggeration in time (longer or shorter, faster or
slower) and space (larger or smaller, higher or lower). The resulting signal becomes conspicuous
and attracts attention. Ritualized behaviours are typically used in agonistic or cooperative and
affiliative contexts.
Mothers’ unusual vocal, facial and kinesic movements to infants are derived from adult affiliative
signals and are, in essence if not fact, ritualized (Dissanayake 2000a). In interactions, a smile
to an infant is typically wider and sustained longer than to adults or even older children. The
head bob is arrested in its backward position, nods are regularized and repeated rhythmically, the
eyebrow flash is held, emphasizing the open (interested and friendly) eyes; speech to infants has
exaggeratedly higher pitch, lilting contours, repetition of words and phrases, and longer interspersed
pauses than adult communication. Unlike ritualized behaviours in other animals, which
are generally quite stereotyped, mothers at times vary or elaborate the vocal, kinesic and facial
expressions used with infants, and as infants mature, mothers begin to manipulate their expectation
through pauses and delays, and by using more dynamic or changing features. However, as in
ritualized behaviours, all of these operations on the precursor communicative signals are spontaneous,
or unself-conscious.
One can say that in human music and dance, the operations of ritualization are also used, but
in an individually or culturally deliberate, conscious manner. For example, ordinary body movements
from everyday life, when formalized, repeated, exaggerated and elaborated, become
‘dance’; the ordinary prosodic or paralinguistic aspects of spoken language, when formalized,
repeated, exaggerated and elaborated, are ‘song’, just as the ordinary syntactic and semantic
aspects become ‘poetry’ (and so forth with other arts). As in ritualized behaviour, these same
operations convert the precursor or ordinary behaviours to something distinctive. They attract
attention and have the potential, when organized temporally, to further elicit and shape
emotional response (see also Watanabe and Smuts 1999, Alcorta and Sosis 2005).1
I suggest that we can call the operations that characterize ritualizations, when used spontaneously
by human adults in interactions with infants, protoaesthetic or, because they comprise
the mechanisms of mutuality of mother–infant interaction, protomusical. Capacities to engage
in and respond to protoaesthetic operations are then available as a sort of reservoir for later
intentional aesthetic operations. That is, what people do in all media when they are making ‘art’
is to formalize, repeat, exaggerate, elaborate and manipulate expectation with respect to movements,
sounds, materials, objects, words, surroundings, themes and ideas. Attention is drawn to these by
means of the aesthetic operations that give them a different or additional meaning from what
they are in their ordinary communicative or existential context. What is considered art in modern
times is a further emancipation or derivation from protoaesthetic capacities evolved first in
ancestral mother–infant interaction, later developed in multimedia ceremonial contexts for religious
purposes (Dissanayake, Chapter 2, this volume), and eventually unmoored for use as separate
art forms in almost any context (see Brandt, Chapter 3, and Merker, Chapter 4, this volume).
1 Identifying these four operations of ritualizations is not meant to deny the existence and importance
of other fundamental endogenous processual or ‘narrative’ influences on aesthetic trajectories that are
expressed and felt as implications and realizations, antecedents and consequents, qualifications and
subordinations, entailment, contrast, redirection, opposition, turn-taking, pacing, tension and release
(Dissanayake 2000b, p. 404; Tarasti 1994; Panksepp and Trevarthen, Chapter 7, this volume).
24.3 Coopting mechanisms of mutuality for intimate social life

It is significant that protomusical (protoaesthetic) capacities are derived from signals that
communicate affiliative intent. Sociality and affiliation are crucial to human existence, and it
should not be surprising that the components that contribute to mother–infant bonding—in
which emotions are expressed, shared and manipulated—should also become the components of
group bonding. Humans are born ready to elicit and respond to these mechanisms of mutuality
and we remain sensitive to them throughout our lives, ready to use them with infants and with
other people when they are developed in ceremonial contexts, and, as today, in the temporal arts.
Until the domestication of plants and animals that made settled life, food surpluses and large
groups possible, our ancestors were members of ‘societies of intimates’, as described by Givón
and Young (2002), who contrast them with ‘societies of strangers’, the larger and more complex
groupings that began to develop around 8000 to 6000 BCE.
Humans evolved to live and prosper in societies of intimates: for 99 per cent of human life on
earth, they were the sole social form, which sometimes still continues to exist as enclaves within
larger societies. Salient characteristics of such societies are small group size (50–150), a foraging
economy (hunting and gathering), a restricted territorial distribution (within a 10–20-mile
radius), a restricted gene pool, cultural uniformity, informational homogeneity and stability, a
consensual leadership structure, and kinship-based social cooperation (Givón and Young 2002).
In such groups, everything needed for life is obtained or made by people’s own hands.
Cooperation and reciprocity are not optional, and emotional mechanisms to ensure these have
been selected for and retained.
Despite homogeneity and routine, subsistence lives are subject to anxiety about daily sustenance
and preservation. As I described in Chapter 2, uncertainty leads to physiological and neuroendocrine
responses that affect brain development, genetic expression, and other factors necessary to
survival and reproduction. When supportive social ties are in evidence, these stress responses
decrease (Taylor 2002, p. 13). The most supportive social tie of all is that between mother and
infant (Keverne et al. 1999), and it seems reasonable to suggest that ceremonial arts, which
make use of the mechanisms evolved to enable mother–infant mutuality, would contribute to the
coordination, conjoinment and emotional reassurance of their adult participants, providing a
sense of social support and of coping that ameliorates the deleterious effects of the stress response.
Regularizing and repeating sounds and movements require actual physical control that—
especially when performed with the social reinforcement of others—can generalize to psychological
or emotional control.
The British anthropologist Radcliffe-Brown, like other ‘structuralist functionalists’ of his
generation, is little cited by anthropologists today. However, his interpretation of the function
of rites and ceremonies, as he observed them among the Andaman Islanders, shows adaptationist
thinking long before its time (Radcliffe-Brown 1922). Religion, he says, has a function in society
apart from whether it does for the participants what they want it to do or think it does—assure
prosperity, safety, health and abundance. Radcliffe-Brown considered the function of religion
(as expressed in what he called ‘rites’) to be the maintenance of an orderly social life, which itself
depends on the individuals having certain sentiments that affect and control their behaviour
with others. It is by means of rites [ceremonies composed of music/movement and other arts] that
these sentiments [emotions] are [aroused, articulated], regulated, maintained, and transmitted
from one generation to another. (The words in brackets and italics are my reformulations or
additions to Radcliffe-Brown’s statements.)
I would additionally reframe Radcliffe-Brown’s thesis and say that the temporal arts (made
even more compelling and arresting by means of the visual emphasis of ornamentation and
costume), based on the mechanisms of mutuality (see the previous section and Chapter 2), were
elaborated to become the myriad kinds and styles of artful behaviours that we see in every
society’s ceremonial rituals, where they serve to coordinate and conjoin individuals and convince
them of the truth of the message that the ceremony is meant to convey. How is such emotionally
infused conviction and belief accomplished? What is emotional or bodily meaning?
24.4 Creating emotional meaning

Whether by philosophers or psychologists, most writing about emotion and meaning in music
tacitly or overtly presumes a Western high art tradition, where music is played by highly trained
and well-rehearsed performers from a written score, often in a special setting like a concert hall,
for an audience of experienced listeners who have the hope of gaining a moving, even transcendent
emotional experience. Writings from this viewpoint cannot be and are not meant to be
generalized to music in all times and places.
From whatever viewpoint, however—and despite the widespread conviction that music does
provide emotional experience—it is difficult to say what ‘musical emotion’ is or might be. In the
simplest sense, emotions are positive and negative indicators of what is good and bad for us.
Feeling fear, we freeze in place or flee from danger; feeling anger, we fight to defend something
important. We feel pleasure or joy when with loved ones in familiar, safe surroundings. But
emotion words such as fear, anger, joy, or even pleasure are inadequate to describe what is felt
when we are engaged with the temporal arts, as participants or audience.
Rather than analyse what is felt, it is perhaps more fruitful to consider how emotion is created. In
the space between the bare bones of emotion words and the subtle iridescent garments of contemporary
aesthetic philosophy, I offer here a ‘preliminary ethological taxonomy’ of four sources of
emotional response to music that apply to both ceremonial music and Western art music, as well as
to contemporary popular music and other forms: (a) the appeal to inherent sensory and cognitive
dispositions; (b) both innate and culturally acquired associations and connotations; (c) the use of
entraining, tuning, driving, and ‘build-up’; and (d) effects of manipulation of expectation (see also
Dissanayake 2000a, pp. 209–18; 2005).
Within each of these ‘types’, which in actual experience will overlap and sometimes be indistin-
guishable, the mechanisms of mutuality described earlier—formalization, repetition, exaggeration,
elaboration, and manipulation of expectation—will attract attention and contribute to the
emotional effect of the music (here referring primarily to music and movement together—as in the
temporal arts of ceremonial ritual).
24.4.1 The appeal to inherent sensory and cognitive dispositions.

Everywhere, the arts make use of attractive and emotionally captivating forms, colours, sounds,
and movements that have intrinsic sensory and cognitive appeal. A new field, evolutionary
aesthetics, seeks to understand how these fundamental stimuli were (and are) adaptive to those
who are drawn to them and consider them beautiful (see Voland and Grammer 2003). It is hardly
an accident that ceremonies make use of striking colours and forms, energetic and graceful
movements, and exciting rhythms, or that performers are often young adults in the prime of life,
whose skill and beauty capture admiration and attention. In West African masquerades, Afikpo
men display their sexually desirable attributes for women, with conspicuous presence, competitive
performance, rhythmic motions, and physical strength all being used as forms of flirtation
(Ottenberg 1982, p. 180).
Although inherently attractive elements are essential to ceremonies, it is important to realize
that additional emotion is generated by making the elements even more striking or extraordinary
than they normally are. Maskers in West Africa violate normal standards of dress, vocalization,
movement, and behaviour (Ottenberg 1982). For example, costumes are either very bright or
very drab. Voices may also be ‘masked’, using guttural, strange sounds, animal cries, or even
silence. Embellishment and exaggeration make clear that the masker is not an ordinary person.
Masked Chapeyakas in Yaqui Easter ceremonies in the American Southwest perform tasks left-
handed or backwards, sometimes improvising movement en route such as walking with trotting
steps, assuming the stance of a bull, or scratching like a flighty chicken (Goodridge 1999).
Additionally, it is important to remember that ‘fundamental aesthetic stimuli’ occur in a context—
often one of active, ongoing participation—that unfolds in time (Scherer and Zentner 2001,
pp. 376–377). It is the unfolding temporal structure, more than any affecting element, that
produces musical emotion.
24.4.2 Innate and culturally acquired associations and connotations.

As with the strains of Wagner’s and Mendelssohn’s Wedding Marches, music typically calls attention
to the beginning, ending, and other parts of a ceremony such as marriage. Even before the
bride enters the hall, we may begin to feel tearful or to weep. Our emotions are structured by
expectations and associations ‘in’ the music, which may suggest festivity, excitement, romance,
group pride and invincibility, control of disorder, the possibility of trance, and a host of other
emotive states.
There are more subtle associations that music calls up—synaesthetic intimations where sounds
have colour and texture, melodies have shapes, and movements seem to echo inside our own
bodies. These intimate inarticulatable sensations can perhaps be traced, mechanically in abstract
scientific language, to the neurobiology of mother–infant interaction, described in Chapter 2 and
in 24.2, above. In these interactions, the signals from vocal, visual, and kinesic modalities are
processed cross-modally. That is, in both partners, visual, tactile, auditory, and olfactory sensory
input converges in the orbitofrontal cortex, which projects extensive pathways to subcortical
motivational and emotional integration centres (Schore 1994, p. 35; Tucker 1992). The several
senses are experienced inseparably and even in terms of one another—what Stern et al. (1985)
call ‘intermodal fluency’.
If music is built upon such intermingling of multisensorial–emotional foundations, it is not
surprising that its emotional expressiveness has been attributed, by writers from Plato to the present,
in some way to its ‘resemblance to’, ‘analogy with’, ‘representation of’, or ‘mimesis of’ human
expressive behaviour and expression (Kivy 1989, p. 171) or that it should be considered to be
‘metaphoric’ (Blacking 1971) or ‘symbolic’ of human feeling (Langer 1953). Steven Feld (1981)
describes metaphors in the musical theory of the Kaluli (in the Southern Highlands province of
Papua New Guinea), for whom kinds of water become ‘kinds of’ sound. The descending minor
third, called sa or ‘waterfall’, is the most basic interval and stands as a symbol of sadness, isolation,
and loss (see also Feld 1982). In Apache Mescalero ceremonies, in which girls are literally ‘sung’ into
womanhood, repetition of melodies, plus their clear contour with octave leaps, triadic outlines, and
sectional structure, gives them an aesthetic design that matches other parts of the ceremony, from
the tipi shapes against the sky to the geometrical designs painted on the Gaahe (Shapiro and
Talamantez 1986, p. 85).
Daniel Stern’s notion of ‘vitality affects’ is also relevant here. These occur (in infant and adult
experience) as qualities of shape or contour, intensity, motion, duration, and rhythm—supramodal
properties that exist in our minds as dynamic and abstract, not bound to any particular feeling or
event—such as surging, fading away, fleeting, or being drawn out, each of which applies to a variety of
circumstances in visual, auditory and kinesic modalities (Stern 1985). Similarly Bunt and Pavlicevic
(2001, p. 195) describe ‘the communicative and expressive mechanisms of basic emotions, that is,
those of intensity, contour, tempo, rhythm, timbre, and dynamics’. As such, music ‘bears resemblance
to the “structure” of our emotions’ or ‘resembles our expressive behavior’ (Kivy 1989, pp. 37, 52),
without being confined to a particular emotion (joy, sadness) or any emotion at all.
24.4.3 The use of entraining, tuning, driving, or ‘build-up’

Apart from some insects or frogs (which produce synchronous courtship sounds), humans are
unique in the animal world in their ability to synchronize or entrain their behaviour to an extrinsic
common or isometric pulse (Merker 2000). This capacity for entrainment is prefigured in the
‘coordinated interpersonal timing’ of vocal interactions between mothers and infants as young as
two months of age (Feldstein et al. 1993) and in their expectation of social contingency as described
by Murray and Trevarthen (1985) and Nadel et al. (1999), where experimental perturbations of
attunement in behaviour and affect are distressing to both infant and mother (see also Miall and
Dissanayake 2003, p. 339, who call this capacity ‘interpersonal sequential dependency’).
Simply keeping together in time with other persons produces a feeling of well-being or euphoria.
The historian, William H. McNeill, has given a name, ‘muscular bonding’, to the phenomenon of
fellow feeling that he experienced as a young army draftee during close-order drill, and speculates
that it evolved because of its contribution to group solidarity. He described it as ‘a strange sense of
personal enlargement; a sort of swelling out, becoming bigger than life’ (McNeill 1995, p. 2).
In a number of studies, neuroscientists Andrew Newberg and Eugene d’Aquili have investigated
neurobiological sources and concomitants of the ability of human ritual to produce ‘emotional
discharges, in varying degrees of intensity, that represent subjective feelings of tranquillity, ecstasy,
and awe’ and ‘transcendent unitary states’ (see Newberg and d’Aquili 2001, and d’Aquili and
Newberg 1999 and 2000 for references to their own and others’ work). Although their primary
interest is in mysticism and altered states in contemporary religious experience, their findings
seem applicable with respect to the repetitive use of music and movement in any sort of ceremonial
ritual behaviour.
Newberg and d’Aquili report that the emotional qualities associated with ritually induced states
appear largely to be the result of the effects of fast or slow repetitive rhythms on the autonomic
nervous system and other parts of the brain (Newberg and d’Aquili 2001, p. 88; see also Gellhorn
and Kiely 1972). Techniques for ‘tuning’ the central nervous system and eliciting these states,
sometimes called ‘driving behaviours’, are also described by Lex (1979). Behaviours such as
dancing vigorously to rhythmic accompaniment excite sympathetic neurophysiological structures
and eventually lead to parasympathetic rebound (an effect of the body’s tendency to maintain
homeostasis), which induces an ‘altered’ state felt as trance or ecstasy.
A representative example of the phenomenon is perhaps the Giraffe Dance of the !Kung of the
Kalahari Desert, in which dancers aim to reach a state of kia that enables them to heal others.
Both men and women may dance for hours until, ‘almost imperceptibly, the mood intensifies [as]
the singing and clapping become more spirited, the dancing more focused’ (Katz 1982, p. 40). Kia
is an intense, emotional state and under its influence the !Kung practice extraordinary activities
such as performing cures and handling or walking on fire. The dancing itself is considered to be
exciting, joyful and powerful, enabling participants to confront uncertainties and contradictions.
‘Being at a dance makes our hearts happy’, say the !Kung (Katz 1982, p. 34).
24.4.4 Effects of manipulation of expectation.

It is not only intense build-up that produces strong emotions. Newberg and d’Aquili (2001)
report that both fast and slow rhythms can drive the brain to unitary states, although these
happen through slightly different mechanisms. In either case, rhythmic behaviours cause the
‘orientation association area’ (the posterior superior parietal lobe, which orients the individual in
physical space) to be blocked from neural flow.
The intensity of those unitary states depends upon the degree of neural blockage. Since the degree of
that blockage can increase by any increment, and theoretically until there is a total blocking, a large
spectrum of increasingly unitary states is possible.
Newberg and d’Aquili (2001, p. 115)
For example, sustained slow repetitive activity such as chanting or contemplative prayer stimulates
the parasympathetic system and, when pushed to very high levels, produces an inhibitory effect
opposite to that of sympathetic arousal, yet with a similar emotional effect of ecstasy, boundary loss, or
‘flow’ that may be subjectively interpreted as transcending the ordinary self and attaining altered
forms of consciousness or connection to a ‘higher’ power.
It is well established that the neuropeptide hormone oxytocin and endogenous opioid peptides
(also called endorphins) are released in maternal and other affiliative behaviours, producing
heightened positive affect, even elation and euphoria (Carter et al. 1999). These hormones also
characterize social interactions such as sports or dancing (Flinn et al. 1996). To my knowledge,
no studies have identified the brain chemicals that are released in either participants or observers
engaged in ceremonies, but it seems highly likely that the repetitive, interactive sounds and
movements of the temporal arts, underlain by the protoaesthetic mechanisms of mutuality
described above, will affect the affiliative neurocircuitry of individuals engaged in them and
promote a sense of union with other participants. Listening to East Indian ragas or Western art
music or jazz requires concentrated attention to the works’ temporal unfolding. Studies show
that for contemporary Western individuals, ‘music’ is one of the primary sources of ‘peak experi-
ences’ (Laski 1961, p. 190; Maslow 1976, pp. 169–170), characterized by strongly felt motor–sensory
responses and a feeling of emotional fusing or merging (Gabrielsson 2001, p. 432).
To keep time with a common pulse makes possible not only entrainment but other interactive
abilities such as alternation of sounds and behaviour, a practice common in musical events
(as well as in mother–infant interaction), and fitting in between the beats of another. Isometric
timekeeping also allows anticipation and the manipulation of expectation, which is an acknowl-
edged way of creating and shaping emotion in Western musical aesthetics (see Meyer 1956;
Sloboda 1999). In the Southern Highlands of Papua New Guinea, performers in the important
gisalo ceremony make audiences weep by ‘jostling expectancies, getting under the surface,
reframing usual thought patterns, and evoking a dramatic response’ (Feld 1982, p. 132). The
‘aesthetic operations’ derived from mother–infant interactions (described in 24.2)—formalization,
repetition, exaggeration and elaboration—can be used to manipulate expectation and create
emotional meaning in the arts.
Newberg and d’Aquili (2001, p. 89) emphasize that autonomic activity alone is insufficient to
create the intense states experienced during ceremonies (or artistic display); these states also
depend on other bodily sensory input and, most importantly, on the cognitive context in which a
ceremony is performed. Emotional experiences of art and ceremony entail remembering what
came before, anticipating what might come next, and connecting what is perceived with other
parts of experience—all cognitive activities. Stylistic norms entail expectations and ‘structures of
signification’ (Tarasti 1994) that invest musical events with dynamism and expressivity.
In puberty ceremonies of Mescalero Apache girls, repetition (in the pulses of rattles and jin-
gling cones on costumes, and in periodic recurrence of song formulas accompanied by ritual
smoke) is used to shape the progression of the ritual and to provide a satisfying sense of regularity.
Because the ceremonies last several days, other elements which help mark the passage of time—
pulse, change, and silence—are carefully structured to sustain over time the experience of tran-
scendence. Songs are formed and grouped to unify the diverse segments of the ritual, creating the
impression that no time has elapsed and that this particular ceremony joins others in its own
recreation of the realm of mythological time (Shapiro and Talamantez 1986).
Oliver Sacks has described how a disconnected and disoriented amnesiac patient, Jimmie G.,
would become completely ‘reintegrated’ during the rite of Mass, enabled through its coherence
and unity—with every moment referring to every other, and filled with meaning—to recover,
if only transiently, his own continuity. He became completely held and absorbed in ‘an act of
his whole being, which carried feeling and meaning in an organic continuity and unity’ (Sacks
1987, p. 38).
24.5 Functions of music in ceremonial rituals

Countless scholars have had important things to say about rites, rituals and ceremonies and their
functions (for useful summaries, see Falassi 1987 and Zuesse 1987). Nearly all have pointed out
how rituals take place in and are meant to invoke a non-ordinary or exceptional time and space.
However, although numerous types of rituals are described, one looks in vain in such studies for
a suggestion of an ultimate function for ceremonial behaviour in a biological or adaptive sense.
In his classic text on the anthropology of music, Merriam (1964) distinguishes between uses
and functions of music. Uses are what evolutionists would call ‘proximate’—the expressed ways
in which music is employed in a given society—whereas the ultimate function of a particular use
of music may not be expressed or even recognized by its members. After summarizing influential
views of other anthropologists, Merriam lists 10 major functions for music at a general, analytical
level: emotional expression, aesthetic enjoyment, entertainment, communication, symbolic
representation, physical response, enforcing conformity to social norms, validating social institu-
tions and religious rituals, contributing to the continuity and stability of culture and contribu-
ting to the integration of society (Merriam 1964, pp. 219–27). One can see that these functions
may overlap and that several may be operant in a given musical event.
Even though these functions of music, and others, have been described in scores if not
hundreds of subsequent ethnomusicological studies (and even though it can be observed that
music functions similarly in modern societies), the anthropological perspective does not go on to
consider why these ends are humanly important, why and how music should have evolved to
enable them, or how music can accomplish them. The evolutionary or adaptationist arguments
of this chapter and Chapter 2 offer explanations or hypotheses about these questions.
In evolutionary psychology, emotions direct us to ‘proximate behaviours’ (such as participating
in the temporal arts) that accomplish ultimate adaptive ends. This may seem a cold-blooded
way to regard music, which—like love, religion, and the other arts—is a source of profound and
precious meaning in our lives. Yet it is unarguable that music has been important to our species
and must be considered adaptive (Alcorta and Sosis 2005; Mithen 2005). Song and dance are
thought to be ancient human behaviours, quite possibly accompanying H. s. sapiens out of Africa
around 100 000 years ago (Cross 2003). Anthropological and sociological evidence documents
the fact that human responsiveness to music is worldwide (Hodges and Haack 1996). Van
Damme (1996, pp. 50–51) cites a number of scholars who find ‘processual’ arts to be of greater
import to non-Western societies’ aesthetics than are static, visual arts.
The strong emotions elicited by the temporal arts create emotional dispositions that, in cere-
monial rituals, lead to (and reinforce) cultural beliefs about the verities of one’s society of
intimates and to feelings of confidence and unity. The temporal arts are integral to ceremonies
because, by elaborating their sources in affiliative behaviour, participants gain a felt sense of
social identity (as in rites of passage) and identification (of belonging to their group).
Additionally, through the temporal arts, ceremonies instil in individuals a sense of meaningful-
ness and significance of their group’s messages and a felt sense of competence that the important
and uncertain matters of the ritual can be dealt with. Belonging, meaning and competence are
vital human emotional needs, and the temporal arts in ritual ceremonies help individuals achieve
and sustain them. In ceremonies, bodies swayed to music result in minds relieved of existential
anxieties, firmed by convictions, and bonded with their fellows in common cause.
References
Alcorta CS and Sosis R (2005). Ritual, emotion and sacred symbols: the evolution of religion as an adaptive
complex. Human Nature, 16, 323–359.
Blacking J (1971). The value of music in human experience. Yearbook of the international folk music
council 1969, 1, 33–71.
Bunt L and Pavlicevic M (2001). Music and emotion: Perspectives from music therapy. In PN Juslin and
Carter CS, Lederhendler II and Kirkpatrick B (eds) (1999). The integrative neurobiology of affiliation.
Cross I (2003). Music and biocultural evolution. In M Clayton, T Herbert and R Middleton, eds,
The cultural study of music: A critical introduction. pp. 19–30. Routledge, London.
Damme W van (1996). Beauty in context: Toward an anthropological approach to aesthetics. Brill, Leiden.
d’Aquili EG and Newberg AB (1999). The mystical mind: Probing the biology of religious experience.
Fortress, Minneapolis, MN.
d’Aquili EG and Newberg AB (2000). The neurobiology of aesthetic, spiritual and mystical states.
Zygon, 35, 39–52.
Dissanayake E (2000a). Art and intimacy: How the arts began. University of Washington Press, Seattle, WA.
Dissanayake E (2000b). Antecedents of the temporal arts in early mother–infant interaction. In NL Wallin,
B Merker, and S Brown, eds, The origins of music, pp. 389–410. MIT Press, Cambridge, MA.
Dissanayake E (2005). Ritual and ritualization: Musical means of conveying and shaping emotion in
humans and other animals. In S Brown and U Volgsten, eds, Music and manipulation, pp. 31–57.
Berghahn, Oxford and New York.
Falassi A (1987). Festival: definition and morphology. In A Falassi, ed., Time out of time: essays on the festival.
University of New Mexico Press, Albuquerque, NM.
Feld S (1981). ‘Flow like a waterfall’: The metaphors of Kaluli musical theory. Yearbook for traditional music,
13, 22–47.
Feld S (1982). Sound and sentiment: Birds, weeping, poetics, and song in Kaluli expression. University of
Pennsylvania Press, Philadelphia.
Feldstein S, Jaffe J and Beebe B et al. (1993). Coordinated interpersonal timing in adult–infant vocal
interactions: A cross-site replication. Infant behavior and development, 16, 455–470.
Flinn MV, Quinlan R, Turner M, Decker SA and England G (1996). Male–female differences in effects of
parental absence on glucocorticoid stress response. Human nature, 7, 125–162.
Gabrielsson A (2001). Emotions in strong experiences with music. In PN Juslin and JA Sloboda, eds,
Music and emotion: Theory and research, pp. 431–449. Oxford University Press, Oxford.
Gellhorn E and Kiely WF (1972). Mystical states of consciousness: Neurophysiological and clinical aspects.
Journal of nervous and mental disease, 154, 399–405.
Givón T and Young P (2002). Cooperation and interpersonal manipulation in the society of intimates.
In M Shibatani, ed., The grammar of causation and interpersonal manipulation, pp. 23–56. John
Benjamins, Amsterdam.
Goodridge J (1999). Rhythm and timing of movement in performance: Drama, dance and ceremony.
Jessica Kingsley Publishers, London.
Hinde RA (1982). Ethology: Its nature and relations with other sciences. Oxford University Press, New York.
Hodges DA (ed.) (1996). Handbook of music psychology, 2nd edn. IMR Press, San Antonio, TX.
Hodges DA and Haack PA (1996). The influence of music on human behavior. In DA Hodges, ed.,
Handbook of music psychology, 2nd edn, pp. 469–555. IMR Press, San Antonio, TX.
Huxley J (1914). The courtship habits of the Great Crested Grebe (Podiceps cristatus) together with a
discussion of the evolution of courtship in birds. Journal of the Linnean Society of London: Zoology,
53, 253–292.
Juslin PN and Sloboda JA (eds) (2001). Music and emotion: Theory and research. Oxford University Press,
Oxford.
Katz R (1982). Boiling energy: Community healing among the Kalahari Kung. Harvard University Press,
Cambridge, MA.
Keverne EB, Nevison CM and Martel FL (1999). Early learning and the social bond. In CS Carter,
II Lederhendler, and B Kirkpatrick, eds, The integrative neurobiology of affiliation, pp. 263–273.
Kivy P (1989). Sound sentiment: An essay on the musical emotions. Temple University Press, Philadelphia, PA.
Langer SK (1953). Feeling and form. Scribner, New York.
Laski M (1961). Ecstasy: A study of some secular and religious experiences. Cresset, London.
Lex BW (1979). The neurobiology of ritual trance. In EG d’Aquili, CD Laughlin Jr and J McManus, eds,
The spectrum of ritual: A biogenetic structural analysis, pp. 117–151.
Maslow AH (1976). The farther reaches of human nature. Penguin, New York.
McNeill WH (1995). Keeping together in time: Dance and drill in human history. Harvard University Press,
Cambridge, MA.
Merriam AP (1964). The anthropology of music. Northwestern University Press, Evanston, IL.
Miall DS and Dissanayake E (2003). The poetics of babytalk. Human nature, 14, 337–364.
Nicolson, London.
and their mothers. In TM Field and NA Fox, eds, Social perception in infants, pp. 177–197. Ablex,
Norwood, NJ.
Nadel J, Carchon I, Kervella C, Marcelli D and Réserbet-Plantey D (1999). Expectancies for social
contingency in 2-month-olds. Developmental science, 2, 164–173.
Nettl B (2000). An ethnomusicologist contemplates universals in musical sound and musical culture.
In NL Wallin, B Merker and S Brown, eds, The origins of music, pp. 463–472. MIT Press, Cambridge, MA.
Newberg A and d’Aquili E (2001). Why God won’t go away: Brain science and the biology of belief.
Ballantine, New York.
Ottenberg S (1982). Illusion, communication, and psychology in West African masquerades. Ethos,
10, 149–185.
Radcliffe-Brown AR (1922). The Andaman islanders. The Free Press, Glencoe, IL.
Sacks O (1987). The man who mistook his wife for a hat and other clinical tales. Harper and Row, New York
(original work published 1970).
Scherer KR and Zentner MR (2001). Emotional effects of music: Production rules. In PN Juslin and
Schore AN (1994). Affect regulation and the origin of the self: The neurobiology of emotional development.
Shapiro AD and A Talamantez (1986). The Mescalero Apache girls’ puberty ceremony: The role of music
in structuring ritual time. Yearbook for traditional music, 18, 77–90.
Sloboda JA (1999). Musical performance and emotion: Its uses and developments. In Suk Won Yi, ed.,
Music, mind, and science, pp. 220–238. Seoul National University Press, Seoul.
Stern D (1985). The interpersonal world of the infant: A view from psychoanalysis and developmental psychology.
Basic Books, New York.
Stern D, Hofer L, Haft W and Dore J (1985). Affect attunement: The sharing of feeling states between
mother and infant by means of intermodal fluency. In TM Field, ed., Social perception in infants,
pp. 249–268. Ablex, Norwood, NJ.
Tarasti E (1994). A theory of musical semiotics. Indiana University Press, Bloomington, IN.
Taylor S (2002). The tending instinct: How nurturing is essential to who we are and how we live. Henry Holt,
New York.
Tinbergen N (1952). Derived activities: Their causation, biological significance, origin, and emancipation
during evolution. Quarterly review of biology, 27, 1–32.
Tucker DM (1992). Developing emotions and cortical networks. In MR Gunnar and CA Nelson, eds,
Minnesota symposium on child psychology, vol. 24, Development, behavior, neuroscience, pp. 75–128.
Voland E and Grammer K (2003). Evolutionary aesthetics. Springer, Berlin.
Watanabe JM and Smuts BB (1999). Explaining religion without explaining it away: Trust, truth, and the
evolution of cooperation in Roy A. Rappaport’s ‘The obvious aspects of ritual,’ American anthropologist,
101, 98–112.
Zuesse EM (1987). Ritual. In M Eliade, ed., Encyclopedia of religion, 12, pp. 405–422. Macmillan, New York.
Chapter 25
Towards a chronobiology of musical rhythm
Nigel Osborne
25.1 Introduction to biological time

The existence of biological clocks—clusters of rhythmically pulsing neurons that keep time for living
organisms, regulate metabolism, procreation, movement, communication and even the temporal
nature of human thought (Foster and Kreitzman 2004)—raises key questions about the relationship
between the rhythm we experience in moving and that we easily share in music, our biology and
neurobiology, and our culturally refined consciousness of time (Husserl 1969; Lakoff and Johnson
1980, 1999; Turner and Pöppel 1988; Schögler 1999; Pöppel and Wittmann 1999; Pöppel 2002;
Donald 1991; Krumhansl 2000; Stern 2004; Lee 2005; Schögler and Trevarthen 2007).
Some organic clocks have to measure long periods of time: certain species of cicada count
17 years underground, accurately to within a few days, before surfacing for just a few weeks of
adult life (Karban et al. 2000). Other clocks work on a 24-hour circadian cycle, like that of the
wide-eyed horseshoe crab, whose internal neural oscillators nightly trigger up to a million-fold
increase in sensitivity to light, while photoreceptor cells on its tail react to daylight and keep the
clock synchronized (Barlow 1990). Many oscillators, including, intriguingly, those that regulate
breathing and heartbeat (Delamont et al. 1999), control rates closer to musical rhythms and
tempi, like the spinal interneurons of the lamprey fish, which depolarize and hyperpolarize
rhythmically in response to raised levels of glutamate in their ion channels, and thus regulate the
movement of swimming (Grillner, in Bear et al. 2006); this is a process analogous, possibly even
homologous in evolution, to spinal motor programmes for the comparable tempi of walking in
human beings, the range of which is represented by the central range of a metronome. The spinal
locomotor network even of a tadpole is modulated by a neurochemistry that amounts to the
same emotional system as transforms skipping allegro to cautious largo (Sillar et al. 1998). All
biological clocks, it should be noted, though they have an inbuilt basic period, are modifiable by
environmental stimuli, usually in an adaptive direction. Like other regulators of vital functions,
including animal play which may be the evolutionary precursor of music, and development of a
child’s brain, they have ‘environment expectancy’ (Bekoff 1972; Bekoff and Fox 1972).
Mainstream chronobiology, the science of inner body times, has not yet engaged directly with
music, possibly because the expressive performance of vocal or instrumental music may require
so much practice, and because of strong cultural influences on our tastes in music. But its literature
is full of musical terminology and metaphor: ‘rhythm’, ‘rhythmicity’, ‘beat’, ‘octaves’, ‘cueing’,
‘bandmasters’, ‘conductors’ and ‘orchestras’ all feature regularly. Perhaps the usefulness of the
terms and the power of the metaphor derive from the fact that rhythm in music offers a
comprehensible encounter with rhythmicity within human experience: a place in consciousness
where ideas like beats, cueing and synchronization, otherwise hidden at various depths in our
biology, or somehow inaccessible beyond the psychological ‘present’, become tangible, embodied
and enacted. We cannot readily sense the rhythmic oscillation of our suprachiasmatic nuclei, or the
flickering of the alpha rhythm in the cerebral cortex, except in the disorder of epilepsy, or ‘feel the
groove’ of the changing seasons, but we can clap and dance to the rhythm of a drum—we feel the
pulse of another body moving. Perhaps music’s special position in the chronobiology of human
beings has been far too obvious to be noticed.
The purpose of this chapter is to investigate this special position: where it may be, what it may
mean, how the body generates and responds to rhythm, how rhythm may regulate and synchro-
nize between us, and how a simple frequency-based notation of musical rhythm may make music
more ‘readable’ to biologists. These are the speculations of a musician about science, so
predictably they diverge, first into musical physics and metaphysics, then into music theory, but
they return to music biology, it is hoped, with an appropriate periodicity.
25.2 Rhythm and frequency in the world of hearing, and in the moving psychology of time

The oscillations of biological clocks and the cycles they control form part of an apparently
infinite spectrum of rhythms or periodicities in nature, from pulsars to planetary motion, and so
on to the electromagnetic spectrum, accelerating from radio frequencies to gamma rays. The
spectrum of the biomechanical energy of sound, what animals hear (Manning 2004), is on the
whole very much slower, but no less dramatic. The energy band of sound is served in human
beings by a beautifully refined anatomy for hearing, the fastest and most evolved firing system in
the brain (Brugge 1987). This means, in theory at least, that we are able to perceive and know
more about the inner workings of sound and hearing than we do about any other energy or
sensation; and this in turn makes possible some revealing journeys through the spectrum of
sound frequencies and the field of energy itself. The extraordinary sensibility of audition to
complex events in time is applied in the computer-aided ‘sonification’ of the polyrhythmic
electrical activity of the brain, making a translation into sounds to aid clinical assessment of
electroencephalograms, like applying a stethoscope to hear the ‘music’ of the cerebral cortex
(Hinterberger and Baier 2005).
At the end of the 1960s I worked at the Polish Radio Experimental Studio, at the time one of
Europe’s most innovative electronic music studios: analogue, valve-driven and creatively
efficient, at a political nexus of the Warsaw Pact. The studio was busy during the day, so students
had to work at night, and I recall in the early hours playing idly with sine wave generators,
oscillators designed to produce ‘pure’ pitches. These particular instruments, originally built for
medical research, had dials calibrated in Hertz (Hz), cycles of sound energy per second: a clock-
wise turn increased the Hz and therefore raised the perceived pitch, an anti-clockwise one
reduced the cycles per second and lowered the pitch. I noticed that as I turned the pitch control
of an oscillator down to very low frequencies, say to between 25 and 20 Hz, I began to hear the
signal as a fast, regular pulse rather than as a pitch. Significantly below 20 Hz I began to hear a
clear, regular and rhythmic ‘woof’. Of course, the signal was in reality below the threshold of
audition, normally 20 Hz at 80 dB (the equivalent of the sound pressure level of heavy traffic in
less extreme domains of audition). It was simply signal impurities and the mechanical movement
of the speaker membrane that made the sub-audio frequencies appear audible, but naive
exploration of this kind had, I found, already formed the basis of some influential theory.
In 1957, the composer Karlheinz Stockhausen, a leading pioneer of electronic music, had
published an article entitled wie die Zeit vergeht (‘How time passes’) (Stockhausen 1971, 1989).
Here he had proposed a new morphology of sound where pitch and rhythm belong in a single
continuum, and where frequency relationships perceived as pitch may be ‘slowed down’, as with
my sine wave generators, to become rhythmic relationships. For example the harmonic interval
of an octave in the domain of pitch, say 220 Hz and 440 Hz (like the first two notes of Somewhere
Over the Rainbow sung together as a chord) may be slowed down to a relationship of 1 Hz to 2 Hz in
the domain of rhythm (like walking briskly and clapping every second step).
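The arithmetic of this octave example can be sketched in a few lines of Python. This is an illustration only, not Stockhausen's own procedure: the helper names `slow_down` and `hz_to_bpm` are hypothetical, and the sketch assumes that 'slowing down' means dividing both frequencies by the same factor, which leaves their 1:2 ratio unchanged.

```python
from fractions import Fraction


def slow_down(frequencies_hz, factor):
    """Divide every frequency by the same factor (hypothetical helper)."""
    return [Fraction(f) / factor for f in frequencies_hz]


def hz_to_bpm(frequency_hz):
    """Convert a pulse rate in Hz to beats per minute."""
    return frequency_hz * 60


octave = [220, 440]                # pitch domain: an octave, ratio 1:2
pulses = slow_down(octave, 220)    # rhythm domain: 1 Hz and 2 Hz
ratio = pulses[1] / pulses[0]      # the 1:2 proportion survives the scaling

print([float(p) for p in pulses], float(ratio))
print([float(hz_to_bpm(p)) for p in pulses])   # the same pulses as metronome marks
```

Expressed as metronome marks, the two pulses correspond to 60 and 120 beats per minute, which is one way of reading the image of clapping on every second step of a brisk walk.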
These ideas were not entirely new: music theorists of the Middle Ages and Renaissance,
inspired by neo-Platonist arithmetic, had recognized the relationship between harmonic propor-
tions in pitch, metre, and rhythmic durations. By the time Moritz Hauptmann published his
seminal Die Natur der Harmonik und Metrik (1853) there was a general awareness among music
theorists of an equivalence of harmonicity in pitch and rhythmic metre. Here harmonicity means
the relationships between the usually inaudible harmonics that lie above a fundamental pitch of
a symmetrical resonating object: for example the sounds sometimes made audible when the
wind blows across a hole in a drainpipe or a hollow tree. The concentrations of energy form an
ascending pattern of frequency bands, the harmonic series, in relationships of 1: 2 (the octave):
3: 4: 5: 6: 7, and so on. Theorists like Hauptmann had noticed that the same ratios operate within
rhythmic relationships, between metrical structures, groups of pulses and their subdivisions: a
sort of rhythmic ‘harmonicity’. What was really new in Stockhausen’s theory was the identifica-
tion of a morphology, the detailed exploration of the continuum of sound energy, and its use as a
compositional tool.
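The parallel between harmonicity in pitch and in metre can be made concrete with a short Python sketch. It is offered under stated assumptions: the fundamental of 110 Hz and the function names `harmonic_series` and `ratios` are chosen here for illustration and do not come from Hauptmann or Stockhausen.

```python
from fractions import Fraction


def harmonic_series(fundamental_hz, count):
    """Frequencies at integer multiples 1, 2, 3, ... of a fundamental."""
    return [Fraction(fundamental_hz) * n for n in range(1, count + 1)]


def ratios(series):
    """Each member of the series expressed relative to the first."""
    return [f / series[0] for f in series]


pitch_series = harmonic_series(110, 7)   # assumed fundamental: 110 Hz
rhythm_series = harmonic_series(1, 7)    # a 1 Hz pulse and its multiples

# Both domains show the same 1:2:3:4:5:6:7 proportions.
print(ratios(pitch_series) == ratios(rhythm_series))

# In the rhythmic domain each multiple divides the one-second beat equally.
subdivision_periods = [1 / f for f in rhythm_series]
print([str(p) for p in subdivision_periods])
```

Read this way, the second and third members of the rhythmic series are simply duple and triple subdivisions of the beat.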
Stockhausen identified the time areas of ‘rhythm’ and ‘metre’ as lying between an upper limit of
approximately one event every 1/16th of a second (intervals of about 60 milliseconds) and a lower
limit of approximately one event every 6 seconds, or 16 Hz and 0.16 Hz, respectively. At first sight, this
appears inconsistent with psychophysical or physiological research (e.g. Phillips and Farmer
1990; Pöppel and Wittmann 1999), where the minimum time required to discriminate two
events as separate is measured to be between 10 and 30 milliseconds, the equivalent of a pulse of
between 100 and 33.3 Hz. But Stockhausen’s concern was to identify the smallest units of time
that could reliably be perceived and performed by musicians, rather than simply appreciated in
abstraction as discrete events.
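The figures in this paragraph all follow from the reciprocal relation between period and frequency, which the following Python sketch makes explicit. The function names are hypothetical, and the printed values show where the text rounds: 1/16th of a second is 62.5 milliseconds, and one event every 6 seconds is closer to 0.167 Hz than to 0.16 Hz.

```python
def period_to_hz(period_seconds):
    """Frequency in Hz of an event that recurs every `period_seconds`."""
    return 1.0 / period_seconds


def hz_to_period_ms(frequency_hz):
    """Period in milliseconds of a pulse at `frequency_hz`."""
    return 1000.0 / frequency_hz


# Stockhausen's limits for rhythm and metre.
upper_hz = period_to_hz(1 / 16)    # one event every 1/16th of a second
lower_hz = period_to_hz(6)         # one event every 6 seconds
print(upper_hz, round(lower_hz, 3), hz_to_period_ms(upper_hz))

# Psychophysical discrimination thresholds of 10 and 30 milliseconds.
print(round(period_to_hz(0.010), 1), round(period_to_hz(0.030), 1))
```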
Definitions of the ‘psychological present’ now vary considerably with the conditions of
measurement, extending up to approximately 3 seconds. Stockhausen’s estimate of the time area of the
rhythmic present was longer. He was supported by the climate of experimental psychology of his
time (for example, the Informationstheorie of his mentor Werner Meyer-Eppler 1959) and once
again by the common sense experience of musicians. It is certainly possible to perceive and
perform a pulse with a 6-second cycle reasonably accurately (without mentally subdividing the
beat). It is also possible to perform unique or coherent metric structures or ‘phrases’ of 6 seconds
duration or more with a strong sense of the rhythmic present within the cycle of component
events—for example the 16-beat Tintal of Indian music, at least when it is conceived in two
halves, or the 12-beat flamenco compás, whose pattern must be completely internalized by the
musician for effective performance. Clearly there is a divergence between the rhythmic present
and the psychological present, which might be best understood in relation to the activity that the
sense of time is serving. Either they are two overlapping but discrete phenomena, or the active
rhythmic present has the capacity somehow to stretch the psychological sense of an ‘event’
happening ‘now’.
In either interpretation of the present, ‘rhythm’ and ‘metre’ occupy a crucial window in the
spectrum of sound energy and in human chronobiology. This special window has the following
properties (cf. Table 25.1):
◆ The window for rhythm opens at around the frequency where audition ends (approximately
20 Hz, c.f. Band III, 10–20 Hz); in this specific and narrow sense the performance of musical
rhythm makes available to the ear and body slow, low frequencies no longer available to
human audition, and articulates them through the higher, audible frequencies of musical
548
XXXX 2008
Table 25.1 Psychobiological time (Modified from Trevarthen 1999, 2007)
25-Malloch-Chap25
Band I Band II Band III
Narrative times of imagination and Times of conscious action and response Times of action and response below conscious discrimination.
memory. in the present moment.
NIGEL OSBORNE
9/10/08
Visceral ‘episodic’ time: ‘disembedded’, Active, consciously monitored time. Pre-conscious intervals. Instantaneous awareness.
future or past, thought about. The Immediate Present. ‘Declarative’,
reasoned experience, or recollections.
Minutes Period in seconds and frequency Period in milliseconds and frequency
1:08 PM
and 30–50 10–25 3–7 0.7–1.5 0.3–0.7 150–200 50–100 30–40 5–20
longer 0.02–0.03 Hz 0.04–0.1 Hz 0.14–0.3 Hz 0.6–1.4 Hz 1.4–3 Hz 5–6 Hz 10–20 Hz 25–200 Hz
NARRATIV XEPHRASE RHYTHM
Page 548
EEG DELTA THETA ALPHA 8–12 Hz BETA 12–30 Hz GAMMA 26–100 Hz

Autonomic γ Wave; β Wave; α Wave; Adult Newborn
physiology para- thermo- relaxed heartbeat heartbeat
and ‘arousal’ sympathetic regulatory breathing
heart and EEG Mayer
fluctuations waves1
Cognitive Attention ‘Orienting’ and ‘Expectancy’ N200 ‘Mismatch’ N100 Sensory Brainstem
physiology. cycle waves wave focus responses
ERPs2 potentials
Psycho- Extended Subjective Single, consciously- Automated ‘Click-sensitive’ Temporal Minimal
physics present present controlled action. finger oscillators. order perceived
Short-term memory tapping Internal clock unit threshold interval
Eye and head Oculomotor Separate Inter-saccade Eye-saccade
orientations scan path head interval. duration
orientations Head turn
Manipulating Manipulative Reaching Fast reach, Finger Reflexes,
sequence. slowly. pick up, grasp. articulations, twitches
Sawing Hammering tapping
Walking Slow walk Walk to run Heel-to-toe Fast reflex
stepping step
Speech Phrase, long Short word. Syllable, full Short vowel/ Consonants. Voice
word. Breath- Call, vowel. syllable. Articles, Lip and onset
25-Malloch-Chap25
cycle. soothing Chewing. prefixes. Fast tongue time

sound counting. articulations.
Interpersonal Long turn Short turn Single Overlaps,
dialogue utterance. interruptions
9/10/08
Utterances, Slow smile. Head-shake, Fast gesture

gestures, Scowl. Stare. nod. Wave. or expression.
expressions Sigh, groan, Glance, Wink. Laugh,
1:08 PM
growl eyebrow raise. spasm, gasp.

Burst of Fast patting
laughter.
Grimace.
Stroking, slow
Page 549
patting.
Music Extended narrative. Musical Phrase, slow Pulse/beat:
Pulse/beat: Vibrato, Trills, fast
Song, ballad to large episode gesture largo toandante to arpeggios, passagework
musical composition. ‘narrative’ andante presto rapid movement
Singing Story, novel, play, Verse Phrase Bar Beat Tremulo
Melodic or drama. Theories, Timeless, Very slow, Slow, Controlled; Fast, bursting,
gestural history. All elaborated floating sedate graceful or
urgent to impetuous,
contours in texts, scores and ponderous
casual thrilling
Poetry media. Stanza Phrase Foot Stressed Unstressed
syllable syllable
Emotions Recollected Moods. Self-regulatory and interpersonal: calm, Immediate, urgent protective reactions, intensely expressive
self-sensing Changes of. sad, angry, joyful. in communication
emotions; emotion
anger and
anxiety to
peace and joy3
1 Sustained breath.
2 Event-related potentials.
TOWARDS A CHRONOBIOLOGY OF MUSIC
3 In a fictional, imagined and recalled times and places, resembling events in the natural world, and with persons in the social world
549
550 NIGEL OSBORNE
instruments and voices; the window closes at the end of the psychological, rhythmic present
(0.32–0.16Hz – a period of 3 to 6 seconds, c.f. Band II, 0.14–0.3 Hz);
◆ It opens at around the maximum frequency where musicians are able physically to articulate
rapid rhythms in a coordinated and reliable manner (between 16–20 Hz); it closes at the end
of the rhythmic present where individual articulations may no longer be understood or
embodied as rhythmically related (0.32–0.16 Hz);
◆ The window of musical rhythm contains the frequencies of practically all common voluntary
and involuntary human movement, such as physiological tremor (8–12 Hz), speech and
hand and finger gesture (up to 10 Hz), Parkinsonian resting tremor (3–5 Hz), heartbeat,
walking and sexual intercourse (1–3 Hz), body sway (roughly 0.5–1 Hz) and breathing (automatically,
normally 0.16 to 3 Hz, but consciously, 0.1 Hz and even lower; more of this later);
◆ The neural oscillators which control these movements voluntarily or by default must of
necessity have either frequencies or phases which fall within the frequency range of the
window; by a thought-provoking coincidence, systems relating to consciousness and
thought also occupy the window, for example the entrained firing of neurons in alpha
rhythms (8–12 Hz, slower in young children) associated with calm wakefulness, highly
activated beta rhythms (over 12 Hz) that occur when attention is focused with intent concentration
and/or anxiety, and, at the other extreme, the delta and theta rhythms of sleep or
drowsiness: under 4 Hz and from 4–7 Hz respectively (Epstein 1983).
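The relation between the frequencies listed above and their periods can be checked with a short calculation. The following Python sketch is only an illustration: the band values and the window limits (0.16 Hz to 20 Hz) are taken from the text, while the names `WINDOW_HZ`, `BANDS_HZ`, `period_seconds` and `inside_window` are assumptions introduced for this example.

```python
# Illustrative sketch: frequency bands quoted in the text, in Hz.
# All names and the data layout are assumptions made for this example.
WINDOW_HZ = (0.16, 20.0)  # slowest and fastest limits of the rhythmic window

BANDS_HZ = {
    "physiological tremor": (8.0, 12.0),
    "Parkinsonian resting tremor": (3.0, 5.0),
    "heartbeat, walking": (1.0, 3.0),
    "body sway": (0.5, 1.0),
    "alpha rhythm": (8.0, 12.0),
    "theta rhythm": (4.0, 7.0),
}


def period_seconds(frequency_hz: float) -> float:
    """Return the duration of one cycle, T = 1 / f."""
    return 1.0 / frequency_hz


def inside_window(band, window=WINDOW_HZ) -> bool:
    """True if the whole band lies between the window limits."""
    low, high = band
    return window[0] <= low and high <= window[1]


for name, (low, high) in BANDS_HZ.items():
    # The higher frequency gives the shorter period, so it is printed first.
    print(f"{name}: {period_seconds(high):.3f} to {period_seconds(low):.3f} s, "
          f"inside window: {inside_window((low, high))}")
```

On this reading, the lower limit of the window, 0.32 to 0.16 Hz, corresponds to periods of about 3.1 to 6.3 seconds, which matches the 3 to 6 seconds quoted for the rhythmic present.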
25.3 The rhythmic ‘present’

The problem of consciousness and of the ‘present’ is addressed in all philosophical traditions.
It has clearly been both a fascination and a puzzle for human beings in diverse cultures at different
times. St Augustine, for example, confesses that if no one asks him what the nature of time is,
he knows what it is, but if he is asked to explain, he has no idea (… si nemo ex me quaerat, scio, si
quaerenti explicare velim, nescio … Confessiones libri XIII: ‘… if no one asks me, I know; if I wish
to explain it to one who asks, I do not know’) (Augustine 2006). He identifies a praesentia de praeteritis, a
sense of the past, based on memory, a praesentia de praesentibus, a consciousness of the present,
and a praesentia de futuris, a feeling for the future, based on anticipation (expectatio). In all of
this, time is the continuation and enduring flow of the ‘present’.
It is remarkable how consistent this view is with a more contemporary phenomenology, like
Edmund Husserl’s reflections on the experience of music (Husserl 1969). Confucius also
struggled with the quicksilver of the ‘present’, suggesting that time is ‘just like this flowing river’
(Fan and Cohen 1996). John Mbiti, the African religious philosopher, quoted by D. A. Masolo
(2000), characterizes a pan-African concept of time in the Swahili terms Zamani and Sasa.
Zamani is the powerful, all-embracing, spiritual realm of the past; Sasa is the dynamic,
human present, with its own micro-past and a very short future, all of which is in the process of
sinking back into the Zamani past. Significantly, parallel to this flows the ‘rhythm that knows
neither end nor radical alteration’: birth, procreation, death, the drum and the dance.
All of this speculation bears witness to the long-term and universal human desire to grasp the
dynamic, apparently flowing present of consciousness, the incomprehensible and unstoppable
river which makes each of these letters as I type them escape from the immediate future through
the present to the past. It is a concern where philosophy, spirituality, psychology and biology
meet, and where the window of rhythm has real significance. It is perhaps in the experience of
pulse at the centre of our being, in its musical chronobiology, that we most effectively grasp time.
We may speculate that it is in the flow of unchanging musical rhythmicities that we can unite
memory and anticipation, as Adam Smith (1777/1982) said we do. We can count the passage of
time, yet stay in the same stable, predictable, but dynamic present.
Let us imagine we are listening to a drum playing a simple pulse. We hear a beat, it catches our
attention, we hear it again and already we know its cycle. By the third beat we have entered a
frame where we have no need to remember what has gone before and we anticipate exactly what
is going to come—and we may well be impelled to move our body. In one sense we are locked
physically and mentally into an illusory ‘timeless’ unchanging present. In another we are scrupulously
marking the passage of time, and engaged in, paying attention to, its dynamic processes.
If this is the case, then the presence of pulse offers a kind of ‘homeostasis’ or self-stabilizing and
ordering to consciousness, a place where it can ‘play’, and ‘hang out’ with others. We, individually
and collectively, are synchronized, activated and reconciled with our personal and sympathetic
human chronobiology.
It is the lowest frequencies of the time area of rhythm that seem to determine the extent of the
‘rhythmic present’ when we are actively moving, and as we have seen, this appears to diverge
from the ‘psychological present’. The psychological present is like looking out of the window in a
moving train. The window represents our sense of the present, and time and experience flow by.
The rhythmic present shares this quality, but it is simultaneously cyclic, and ‘busy’. Metre and
rhythmic phrase create these cyclic units at the lower frequencies of the time area of rhythm, as a
frame for conventional or metrical pulses in the technique of musicianship, and more rapid
rhythmic articulations, as illustrated in the musical chronobiograms below (Figures 25.1 and
25.2). They appear, subjectively, to mark out and to enfold the time within them, like a series of
cycloramas of views from the train window of roughly 1 to 6 seconds duration. I argue that
this process is not simply a Gestalt or general cognitive phenomenon rationally constructed of
experience, but that it is also biological, derived from an intrinsic motive process.
25.4 Rhythm and moving

Almost everyone has experienced the sensation that rhythm ‘makes you move’ (see Mazokopaki
and Kugiumutzakis Chapter 9, this volume). Young infants may rock to rock’n’roll almost as soon
as they can sit up. Trevarthen (1999) finds that the earliest coordinated movements of babies can
be cued or ‘attracted’ by the musical pulse of a mother’s vocalizations, as was demonstrated first by
Condon and Sander (1974). The powerful effects of musical rhythm in cueing coordinated
movement in certain phases of Parkinson’s Disease are well documented (Pacchetti et al. 2000;
Sacks 2007), and there is evidence that music therapy may effect behaviour change and improve
targeted movement repertoires in cerebral palsy (Krakouer et al. 2001). Finally, a significant body
of research points to the existence of some process of entrainment between musical rhythm and
autonomically controlled periodic movement inside the visceral mechanisms of the body, such as
the heartbeat and breathing (Skille and Wigram 1995).
Fig. 25.1 The metrical structure of body movements in the waltz. [Figure: subdivisions at 6 Hz (foot embellishment); principal pulse at 3 Hz (leg movement); metre at 1 Hz (body sway).]
Fig. 25.2 The metrical structure of movements in the Flamenco compás. [Figure: subdivisions at 12 Hz (fingers) and 6 Hz (interlocking hands, both feet); principal pulse at 3 Hz (hands), counted 1 to 12; compás at 1.5 Hz and 1 Hz (foot); body at 0 Hz.]
What is not yet clear is how musical rhythmic impulses activate the neuronal motor systems in
the brain and excite the somatic muscles—how musical sound moves us. It seems likely that
certain classes of sound stimulus have a neural ‘hotline’ to our most fundamental capacity for
action, and especially for its rhythmic source. This is particularly apparent in the simple
and immediate acoustic startle response (ASR), where we jump or blink when there is sudden,
unexpected sound. We do not think about jumping, we just do it. Research with ASR in rats has
established the involvement of the dorsal cochlear nucleus in the brainstem (Meloni and Davis
1998) and of the inferior colliculus of the midbrain roof (Li et al. 1998), the latter strategically
placed close to both limbic and spinal related coordinative systems, especially those of the core
‘emotional motor system’ (Holstege et al. 1996; Panksepp 1998). What seems clear is that the ASR
is largely if not entirely subcortical, and this opens up the possibility that certain rhythmic sound
impulses may be similarly processed, at least in part, outside any neocortical cognitive control.
There is ample evidence from neurobiology, both animal and human, that intuitive rhythms of
instinctive behaviour, and the subconscious and emotional modulation of moving, implicate the
emotional core of the brainstem, the basal ganglia, limbic system and cerebellum (McLean 1990;
Panksepp 1998; Sacks 2007), and that all these regulators of dynamic somatic integrity are important
for our ‘feeling for what happens’ (Damasio 1999) in our active integrated ‘self’ with all its
mobile members held alive in one consciousness (Merker 2005, 2006). They also play a part in the
ability to ‘mirror’ or sympathize with the intentions and consciousness of others (Adolphs 2003;
Gallese 2003). These structures beneath the intelligence, memory and skills that accumulate in the
cortex are certainly involved in the appreciation and making of music, its emotion and its meaning
(Blood and Zatorre 2001; Zatorre and Peretz 2001; Peretz and Zatorre 2003; Kühl 2007; Sacks
2007; Panksepp and Trevarthen Chapter 7 and Turner and Ioannides Chapter 8, this volume).
At the other extreme there has been significant research on consciousness of rhythm and the
neocortex. This has located the ‘higher’ processing of rhythm where it might be expected, in
the auditory cortex of the temporal lobes (e.g. Peretz and Kolinsky 1993). There is also evidence,
contrary to some earlier hypotheses, that the right anterior secondary auditory cortex has a
specific role in the retention of rhythmic patterns (Penhune et al. 1999). However, in the
transformation of perceived rhythms into movement, and vice versa, a whole complex system is
certainly involved, which recruits many levels of the brain, including the premotor cortex, the
basal ganglia, and the cerebellum (Zatorre and Peretz 2001; Peretz and Zatorre 2003). The frontal
lobes will also be involved in a ‘mirror neuron’ effect (Rizzolatti et al. 2001; Gallese 2003); in our
response to rhythm we seem to sense, internalize and imitate the movements which generate
sounds, just as a musician may sense the movement of a dancer. In this way drummer and dancer
are in both synchrony and gestural sympathy. (See Lee and Schögler Chapter 6, this volume, for
an account of experimental investigations of this phenomenon.)
The regulation of bodily rhythmic activity, all of which lies within the frequency window of the
musically rhythmic ‘present’, requires oscillators which are firing either at the frequency of the
movement or in phase with it. This raises the likelihood of different registers of oscillation, possibly
capable of complex rhythmic harmonicities and even inharmonicities. Wittman and Pöppel (1999)
suggest two principal frequency levels: a high-level band around 30 milliseconds (33 Hz), which of
course lies beyond the window of rhythm postulated here, and a low-level band around 3 seconds
(0.33 Hz), corresponding to the psychological ‘present’. Trevarthen (1999) suggests that there are
many more levels, which act as foundations for the polyrhythmia of purposeful coordination
of movements of different speeds and power, for the processes of perception related to these move-
ments, for the psychological present and for the lower frequencies of memory and imagination.
Trevarthen and Aitken (1994) propose that an Intrinsic Motive Formation (IMF) develops among
cells proliferating in the brain of a human embryo. The IMF persists through life both as an integrated
body imaging core system of the brain, and as a neurochemical affective coordinator and regulator
of human movement and experience (Panksepp 1998). Within the IMF, Trevarthen proposes an
Intrinsic Motive Pulse (IMP), a system of generators of neural and body-moving time, which forms
part of a larger system of generators regulating our whole being: movement, emotion and thought
(see Panksepp and Trevarthen Chapter 7, this volume).
The implications of the IMF hypothesis for musical rhythm are far-reaching. As far as music
and movement are concerned, there are two immediate and practical implications. It may be that
differing musical rhythms and tempi are activating, entraining, phase-locking or cueing differing
neurological oscillators, attached to a variety of movement regulating functions throughout the
brain’s representations of the body and its fields of action; and that more complex, layered
rhythms may be engaging more than one set of oscillators at the same time, with the possibility
that they may cue various movement systems simultaneously, such as those of legs and trunk,
arms and hands, breathing, the voice and articulations of the mouth—as in singing and dancing.
25.5 Descriptions, notations and representations of musical rhythm

The history of the understanding, description, theory and notation of musical rhythm in the
European tradition is an eccentric, rambling edifice some way removed from the proposed
biological foundations of rhythmic experience. In his landmark study The notation of polyphonic
music (1953), the normally sober and scholarly Willi Apel confessed amazement at the tortuous
evolution of rhythmic notation:
From its beginnings to the late sixteenth century, the amount of time, labour and ingenuity spent to
bring about a few paltry results is incredible. Parturiunt montes et nascitur ridiculus mus [the mountain
laboured and brought forth a mouse].
The essentials of historical European rhythm, and indeed of most other cultural traditions,
may, however, be expressed very simply. For the sake of intellectual hygiene, and with apologies
to musically literate readers, I propose a minimal description.
The central element is a pulse, which will normally be played at a consistent speed, which is its
tempo. The pulse will usually be stressed or ‘accented’ in a regular manner, say one accent for
every two pulses, as in a march; or one accent for every three pulses, as in a waltz. These accents
usually mark the divisions of metre; in contemporary European notation, these metrical divisions
are called bars (measures in American English), and the accent usually falls on the first pulse, or
beat of each bar. Each beat may be subdivided into faster beats, usually two, three or four to the
original beat, sometimes more. Finally, notes may be sustained over more than one beat, giving
the possibility of different lengths, or durations.
Among the mountain ranges of music cognition, this description seems little more than a reappearance
of the ridiculous mouse. Seminal rule-based work by theorists such as Lerdahl and
Jackendoff (1983), establishing a ‘well-formedness’ protocol for metrical structures, Steedman’s
prosody-related model (1996), or Temperley’s preference rule systems (2001), has inspired
ground-breaking work in music informatics and artificial intelligence which both reveals and
enriches the complex textures of the cognition of musical rhythm. At the same time, the features
of musical rhythm most likely to be active in a less neocortically based human response may be
approachable in a less theoretically ambitious, less computational way. Cognition and biology are
not an ‘either or’ for music, they are a ‘both and’. Povel and Essens’ clock-based model (1985), for
example, is a theory based in cognition which goes some way towards linking with biological concerns.
Here are some general rhythmic concerns which may be expressions of our chronobiology.
25.6 Metre and pulse: a rudimentary chronobiology

Let us re-notate the rhythm of the waltz in terms closer to the world of neural oscillators. For the
sake of mathematical simplicity, this will be a moderately brisk waltz—something like Tchaikovsky’s
Waltz of the Flowers. The principal pulse or ‘beat’ is to be at a tempo, in musical terms, of 180 beats
to a minute or a frequency of 3 Hz. There are three beats to a bar, and there is an accent on the first
beat of every bar. These accents set up another layer of pulse which, as there are three beats to a bar,
will have a frequency of 1 Hz. Let us add some fast notes to the waltz and divide each beat into two,
which gives a frequency of 6 Hz. Here is a simple ‘chronobiogram’ of the waltz (Figure 25.1).
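The arithmetic of the waltz described above can be set out in a few lines. This is a minimal sketch, assuming only the values given in the text (180 beats per minute, three beats to a bar, two subdivisions per beat); the function name `waltz_layers` and the dictionary layout are hypothetical choices made for illustration.

```python
# Minimal sketch: rhythmic layers of the waltz expressed as frequencies.
# Tempo, beats per bar and subdivision are the values given in the text;
# the function name and return layout are illustrative assumptions.

def waltz_layers(tempo_bpm: float, beats_per_bar: int,
                 subdivisions_per_beat: int) -> dict:
    """Return the frequency in Hz of metre, principal pulse and subdivisions."""
    pulse_hz = tempo_bpm / 60.0  # beats per second
    return {
        "metre": pulse_hz / beats_per_bar,          # one accent per bar
        "principal pulse": pulse_hz,
        "subdivisions": pulse_hz * subdivisions_per_beat,
    }


layers = waltz_layers(tempo_bpm=180, beats_per_bar=3, subdivisions_per_beat=2)
for name, hz in layers.items():
    print(f"{name}: {hz:g} Hz (period {1.0 / hz:.3f} s)")
```

The same function could be applied to a slower waltz, where all three layers would drop in frequency together while keeping their 1:3:6 relationship.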
The first observation is that the rhythm of the waltz has three principal frequencies. According to
Wittman and Pöppel’s (1999) theory, these musically rhythmic frequencies would all probably relate
to the same biological low-frequency domain of the psychological present. According to Trevarthen
and Aitken’s IMF hypothesis (1994), at least three different biological clocks are involved, and
probably very many more. Certainly there is a strong frequency-related body image associated with the
waltz: swaying and full body movement at 1 Hz (this is a fast waltz!), leg movements at 3 Hz, and
intermittent embellishments with the foot at 6 Hz. This is not to mention the many frequencies
which regulate the subtle movements and glides in between, essential for both the dancer’s and musician’s
performance. The theory of proliferation of such oscillators does not of course preclude the
possibility, or indeed likelihood, that they are also linked in hierarchical systems, which may behave
with rhythmic ‘harmonicity’. Indeed, in his classical analysis of the coordination and regulation of
movements, the Russian physiologist Bernstein (1967) gives precise evidence for such ‘harmonicity’
of rhythms in the forces generated by the ‘motor images’ of the brain, and he describes how they
change as the body and mind of a child master walking and running.
25.7 More complex rhythms, a coherent body image

Let us examine a more complex structure. The flamenco compás is a pattern of accents within a
repeating cycle of pulses. It is also a form where the movements of the dancer contribute to the
sound of the music. We shall consider a compás called bulerías. It has a repeating cycle of twelve
pulses or beats, and a pattern of accents which does not fit into traditional European bar structures
or metric patterns: 1 2 [3] 4 5 [6] 7 [8] 9 [10] 11 [12] (bracketed figures indicate accents).
It is a long cycle in the low-frequency range of the rhythmic window which determines its own,
extended rhythmic ‘present’. I recall as a young man with heroic dreams of playing flamenco
guitar being sent to a little girls’ dancing school in Seville to clap the rhythm of bulerías. The
idea was to learn to stop counting, and internalize the structure—to begin to ‘feel’ it, in a kind of
recurring musical/motor patterning of the present.
The duration of the window in the example illustrated in Figure 25.2 is 4 seconds. The basic
hand clap pattern is 3 Hz. The pattern of accents of the compás is interesting: unlike conventional
European music there is no accent on the first beat of the cycle and the pattern changes in the
middle. In fact the structure of accents is based on two cycles of oscillation, at 1 Hz and 1.5 Hz
respectively; the pattern simply jumps from one frequency cycle to the other in the middle, as do
many compound, polyrhythmic and asymmetrical metres in music and dance.
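The way two oscillation cycles can generate the accents of the compás may be sketched as follows. The 3 Hz pulse, the twelve-pulse cycle and the 1 Hz and 1.5 Hz accent cycles come from the text; the assumption that the 1 Hz cycle governs pulses 1 to 6 and the 1.5 Hz cycle governs pulses 7 to 12 is an inference from the description (no accent on the first beat, a change in the middle), and the function name is hypothetical.

```python
# Illustrative sketch of the bulerías accent pattern described in the text.
# Assumption: the 1 Hz cycle governs pulses 1-6 and the 1.5 Hz cycle
# governs pulses 7-12; names are invented for this example.

PULSE_HZ = 3.0      # basic hand-clap rate
CYCLE_LENGTH = 12   # pulses per compás


def accent_positions(pulse_hz=PULSE_HZ, cycle_length=CYCLE_LENGTH,
                     first_hz=1.0, second_hz=1.5):
    """Return the 1-based pulse numbers that carry an accent."""
    half = cycle_length // 2
    step_first = round(pulse_hz / first_hz)    # 3 pulses per accent
    step_second = round(pulse_hz / second_hz)  # 2 pulses per accent
    first_half = list(range(step_first, half + 1, step_first))
    second_half = list(range(half + step_second, cycle_length + 1, step_second))
    return first_half + second_half


accents = accent_positions()
cycle_seconds = CYCLE_LENGTH / PULSE_HZ  # duration of one full compás
print("accented pulses:", accents)
print("cycle duration in seconds:", cycle_seconds)
```

Under these assumptions the accents fall on pulses 3, 6, 8, 10 and 12, and one full cycle lasts 4 seconds, in agreement with the window duration given for Figure 25.2.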
The accents of the compás are often stamped by the dancer, and the hand claps are often subdivided,
in this case at 6 Hz, usually in an interlocking manner; the feet may also join this level
of frequency, as may the dancer’s hands with castanets. Finally, the guitar will sometimes embellish
with intermittent right hand finger patterning around 12 Hz. Once again, the rhythm is
interpreted in music and dance with a strong and coherent body image:
◆ from a still, immobile straight-backed centre of the body (a theoretical 0 Hz);
◆ to stamping action, one foot at a time (1 Hz and 1.5 Hz, overlapping cycles);
◆ to hand clapping (3 Hz), to interlocking hand clapping, castanets and both feet (6 Hz);
◆ and finally to the guitarist’s fingers (12 Hz).
25.8 Rhythm and timbre, rhythmic streaming

Clearly such simple chronobiograms indicate only the most rudimentary scaffolding for a
proposed ‘biological’ construct of rhythm, and how it ‘feels’ and can be anticipated and learned,
with the aid of expressive modulations. There are many other features of its architecture. The
pitch register and timbre of rhythmic articulations, for example, appear to be important for
experiencing and guiding the energy of expressions. In rock’n’roll, it is a deep, throbbing pulse
usually involving bass drum and bass guitar, often around 1 Hz, like the surging and varying
heartbeat of the mother heard in the womb, that activates rock’n’roll’s rhythmic power. The pulse
comes as the first and third beats in a metrical cycle of 0.5 Hz containing four beats ([1] 2 [3] 4, with brackets marking the accented beats), the
third beat usually anticipated and embellished. There is also an interlocking pulse of 1 Hz, known
in the business as the ‘backbeat’ (1 [2] 3 [4]). This is a simple form of syncopation, or the displacing of
an accent onto a subsidiary, interlocking pulse. The backbeat may be articulated by a higher
pitched drum, for example rim shots on snare, maybe also supported by lead guitar. Finally
there may be a driving rhythm at 4 Hz on cymbal or perhaps hi-hat. It is the combination of
these rhythmic frequencies and their associated instrumental colours, together with crucial
biomechanical features still to be discussed, that make rock’n’roll rock.
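The interlocking of these layers can be made concrete by listing their onset times within one metrical cycle. The sketch below uses the frequencies given in the text (a 0.5 Hz cycle of four beats, two 1 Hz pulses and a 4 Hz driving rhythm); the helper `onsets` and the variable names are assumptions introduced for illustration, and the anticipation and embellishment of the third beat are deliberately left out.

```python
# Sketch of the rock'n'roll layers as onset times within one metrical cycle.
# Frequencies follow the text; names and layout are illustrative assumptions.

CYCLE_HZ = 0.5        # one four-beat cycle every 2 seconds
BEATS_PER_CYCLE = 4


def onsets(rate_hz: float, duration_s: float, offset_s: float = 0.0) -> list:
    """Onset times in seconds of a steady pulse that starts at offset_s."""
    period = 1.0 / rate_hz
    times = []
    k = 0
    while offset_s + k * period < duration_s:
        times.append(offset_s + k * period)
        k += 1
    return times


cycle_s = 1.0 / CYCLE_HZ                 # 2.0 seconds
beat_s = cycle_s / BEATS_PER_CYCLE       # 0.5 seconds per beat

bass = onsets(1.0, cycle_s)                      # beats 1 and 3
backbeat = onsets(1.0, cycle_s, offset_s=beat_s)  # beats 2 and 4
cymbal = onsets(4.0, cycle_s)                    # driving rhythm

print("bass drum:", bass)
print("backbeat:", backbeat)
print("cymbal:", cymbal)
```

The two 1 Hz pulses never coincide: the backbeat is the same oscillation as the bass pulse, displaced by half its period.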
There is a general tendency here—large, low-pitched instruments for slow, low frequency
pulses, and higher-pitched instruments for higher rhythmic frequencies. This frequency ‘streaming’
is as true of the rock drum kit as it is of the Indonesian gamelan or the Berlin Philharmonic.
It may well relate to the rhythmic ‘body map’, the anatomical and functional representation of
movements and sensations of the body in the brain (Damasio 1999), but it will also be modified
by cultural practice. In West African drum ensembles, for example, the relatively high-pitched
cowbell may play a coordinating ‘time line’ which combines higher and lower frequency pulses in
a single cycle, while essentially lower-pitched drums articulate faster rhythms. This may well have
evolved to help make the time line more distinctly audible and ‘articulate’.
25.9 Melody and harmony: the modulation of metric structures

In the European tradition, as in some other cultural traditions, melody and harmony play an
important role in articulating, reinforcing and modifying rhythmic cycles. In the style of waltz
notated above, for example, the underlying harmony may well change roughly every two bars,
which gives a special importance to the first beat of every two bars. In one sense we have entered
the realm of the cognition of (or ‘thinking about’) the intuitive ‘gut’ evaluations of pitch and
harmony—how they ‘feel’. In another sense, we remain with whole-body action and rhythm,
because the sensing of an important harmonic change encourages the performer to articulate
the accent differently (in banal terms, not simply [Um-cha-cha][Um-cha-cha] but [erUm-cha-cha][Um-cha-cha]).
The question of such articulation, at the meeting point of melody, harmony,
pulse and metre, lies very near to the heart of concerns of musical interpretation in the European
classical tradition, and is essential to any biology of rhythm.
25.10 Accent and anticipation, an auditory ‘tau’?

My composition teacher in Warsaw, Witold Rudziński, devoted much of his life to the exploration
of subtleties of accentual articulation. Drawing first on his studies in Paris in the late 1930s with
Dom Joseph Gajard, a former colleague of Dom André Mocquereau (1849–1930), leader of the
Solesmes School of interpretation of Gregorian plainchant (Du rôle et de la place de l’accent
tonique latin dans le rythme grégorien, 1901, see Mocquereau 1927; Wellesz 1963), and then on an intimate
knowledge of historical and contemporary theory, including key contributions from Central
and Eastern Europe little known or recognized in the West (for example, Szuman 1951,
Kholopova 1994 and Bielawski 1976), Rudziński synthesized a unique perspective of Eastern and
Western intellectual traditions in his Nauka o Rytmie Muzycznym, 1987 (The Study of Musical
Rhythm).
The historical basis of the study of accentual articulation is shared with poetic prosody. The
central idea is of arsis, the anticipation of an accentuation, or the way we raise a foot when we
walk, and of thesis, the accent itself, or how we set our foot on the ground. Its classic manifestation
in prosody is the iambic foot: da-Daah, da-Daah, as in ‘When chapman billies leave the street’
(Burns). It is an image based on the body and motion, and in its very nature anticipates a musical
chronobiology. There are many different ways an accent may be prepared. In some contexts,
faster rhythmic pulses may be recruited to help accent slower ones. In the famous Mexican
folksong La Cucaracha, for example, the rhythm of the melody imitates a Latin percussion
ensemble, with three fast pulses anticipating each lower-frequency accent (La Cu-Ca Ra-Cha La
Cu-Ca Ra-Cha etc.). The same process occurs in the idea of anacrusis, where a single pulse may
be used to prepare for an accent: a single, isolated ‘upbeat’ preparing for a ‘downbeat’. Sometimes
there is a less rhythmically ‘measured’ aspect to this phenomenon, in anticipations and glides
which seem to be leading more dynamically and biomechanically towards the accent.
Perhaps there is an aural connection with Lee’s general tau theory (1998) here, where the trajectory
of a sound both traces the process of movement towards an accent and acts as a tau guide
to those listening or moving to the sound, like the whoop of the cuica, the accelerating rasp
of the reco reco, or the left hand flick of the surdo before the accent in many styles of
samba. Particularly interesting are recent developments of the theory that explore the dynamic
narration of tau functions in the execution of a series of movements of musical performance
or dance, or of both these arts together (see kappa theory in Chapter 6 by Lee and Schögler, this
volume).
Muscle groups in the hands and arms are clearly responsible for the physical manifestation of
these processes, but the subjective experience of musicians suggests the existence of some more
central point of origin, capable of coordinating the whole body in such rhythmic anticipation
(see Davidson and Malloch, Chapter 26, Section 26.6.2, for discussion on the ‘centre of moment’
of a musician’s moving body).
25.11 Rubato, swing and rhythmic ‘vitality’

There are other common modifications of the basic pulse. The physical activity of performing
rapid pulses on certain instruments may create a natural and somehow satisfying unevenness.
For the maracas in certain styles of samba for example, the articulation of the fastest layer of
pulse (frequently around 8 Hz) depends on a rapid movement backwards and forwards of hand
and wrist. The combination of the trajectory and collision of seeds or beans within the spherical
body of the instrument, the interplay of potential and kinetic energies, and the pattern of contraction
and relaxation of distal muscles sets up a richly communicative inequality, involving
asymmetry. The forward movement tends to provoke greater emphasis and a slightly longer
duration; the backward movement is usually less emphasized and shorter.
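The asymmetry of the maracas stroke can be pictured as an uneven division of each pair of pulses. In the sketch below the 8 Hz rate comes from the text, but the 55:45 split between forward and backward strokes is a purely hypothetical figure chosen for illustration; no measured value is given in the source, and the function name is likewise invented.

```python
# Sketch of an uneven fast pulse, as described for the maracas.
# The 8 Hz rate is from the text; forward_share=0.55 is a hypothetical
# value used only to illustrate a slightly longer forward stroke.

def uneven_durations(rate_hz: float, n_strokes: int,
                     forward_share: float = 0.55) -> list:
    """Durations in seconds of alternating forward and backward strokes."""
    pair = 2.0 / rate_hz               # time taken by one forward-backward pair
    forward = pair * forward_share     # slightly longer, more emphasized
    backward = pair - forward          # slightly shorter, less emphasized
    return [forward if i % 2 == 0 else backward for i in range(n_strokes)]


strokes = uneven_durations(rate_hz=8.0, n_strokes=8)
print([round(d, 4) for d in strokes])
print("total:", round(sum(strokes), 4), "seconds")
```

The average rate remains 8 Hz, since each pair of strokes still occupies a quarter of a second; only the division within the pair is unequal.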
Rhythmic instrumental music is full of such satisfying and communicative irregularities. The
physiological/muscular modulation of chronobiologically determined pulses may well communicate
powerful messages in sound about the energy, state of metabolism, and ‘vitality affects’ of
the player (see Stern 2000, pp. 53–61, and his citation of Heinz Werner’s [1948] theory of the
‘physiognomic perception’ of emotion in movement). Swing in jazz of the mid-twentieth century
is a combination of such modulation with distinctive patterns of syncopation.
In many musical cultures, pulse patterns are modulated by the speeding up and slowing down
of their basic frequencies. Rubato style in the European Romantic tradition is an engagingly
expressive flux of tempo, often changing from bar to bar, but normally returning to or ‘tracking’
the original pulse frequency. An accelerando or ritardando (speeding up or slowing down), on
the other hand, may lead to a completely new tempo, or basic pulse frequency, respectively faster
or slower than the original. In Javanese gamelan style, the basic pulse may be in long-term flux,
accelerating and decelerating over whole phrases and sections, at times returning to the original
frequency, at times moving to a new irama, or rhythmic level of subdivision of the pulse, rather
like a runner accelerating gradually from two steps to four steps per breath. (For the physiology
regulating rhythms of breathing and heartbeat, see Delamont et al. 1999.)
Human movement seems to be critical here (Clarke 2001). Speeding up and slowing down
a pulse seems to correspond, in the most primitive terms, to changes in speed and levels of energy
and excitement in human activities, for example, walking, running, swimming or combat.
In practice, in musical cultures it becomes a highly expressive device, articulating subtle nuances
of the flow of energy, excitement, restraint, dynamism, repose, advance, retreat and finely crafted
gesture, which may join the flow of expressive melody, counterpoint or harmony of the musical
language.
There are three mysteries here. The first is how we are able to identify an accelerating pulse as a
‘pulse’—it is likely that both cognitive and biological references play a role. The second is how we
may judge the rate of acceleration of a pulse to arrive, ‘logarithmically’ or ‘quasi-logarithmically’,
at a new frequency at more or less the time we anticipate. This may once again relate to some
cultivated extension of the tau function (Lee and Schögler Chapter 6, this volume).
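One simple way to picture a pulse arriving ‘logarithmically’ at a new frequency is to let the frequency grow by a constant ratio from beat to beat. The following sketch assumes such a geometric curve, which is only one possible reading of the term; the start and end frequencies, the number of beats and the function name `accelerando` are all hypothetical values chosen for illustration.

```python
# Sketch of an accelerating pulse whose frequency grows by a constant ratio
# per beat. The geometric curve and all numbers are illustrative assumptions.

def accelerando(start_hz: float, end_hz: float, n_beats: int):
    """Return (frequencies, onset_times) for a pulse that speeds up smoothly."""
    if n_beats < 2:
        raise ValueError("need at least two beats")
    ratio = (end_hz / start_hz) ** (1.0 / (n_beats - 1))
    freqs = [start_hz * ratio ** i for i in range(n_beats)]
    onset_times = [0.0]
    for f in freqs[:-1]:
        # Each beat lasts one period of the frequency current at that beat.
        onset_times.append(onset_times[-1] + 1.0 / f)
    return freqs, onset_times


freqs, onset_times = accelerando(start_hz=2.0, end_hz=3.0, n_beats=9)
intervals = [b - a for a, b in zip(onset_times, onset_times[1:])]
print([round(f, 3) for f in freqs])
print([round(i, 3) for i in intervals])
```

In this model the time at which the new tempo is reached is fully determined by the starting frequency, the target frequency and the number of beats, which gives one concrete sense in which the arrival could be anticipated.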
Whatever the case, the subjective experience of musicians suggests that changes of speed
in musical rhythm may be a combination of conscious decision-making and less voluntary,
unconscious or neural/muscular guided action, and visceral nervous system regulation of
the economy of energy for moving. The decision to release an accelerando is conscious, as are
probably the decisions that monitor its early stages, but there comes a point where a momentum
seems to build and a less voluntary process takes over. A hypothetical chronobiology of this
phenomenon would include some kind of ‘control’ oscillator, set at the original stable tempo
against which the performer consciously places ‘pulse’ articulations with decreasing durations
between, and some programme of accelerating muscle contraction determined by the brain
and/or by the interaction of muscle fibres and the action potentials of spinal motor neurons.
In this process, for musicians at least, speeding up appears to be easier to control than slowing
down.
The third mystery is how several layers of oscillation involving different instrumentalists
and the different muscle groups within each instrumentalist in, say, an accelerando in a waltz or
flamenco, may speed up in such a manner as to maintain precisely their phase and rhythmic
harmonicity at every point throughout the acceleration. Once again, this suggests some form of
central ‘harmonic’ coordination or control allowing synchrony to be anticipated and regulated
intentionally (Schögler 1999).
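The idea of central ‘harmonic’ coordination can be illustrated by deriving every layer from a single accelerating pulse rather than letting each layer speed up independently. This is a sketch under stated assumptions: the 5 per cent shortening of each beat, the even division of each beat into subdivisions and the function names are hypothetical, and the code is not a model of any neural mechanism.

```python
# Sketch: three rhythmic layers derived from one shared accelerating pulse,
# so that their phase relationship is preserved throughout the accelerando.
# The shrink factor and all names are illustrative assumptions.

def accelerating_pulse(start_hz=3.0, n_beats=13, shrink=0.95):
    """Onset times of a pulse whose beat duration shrinks by a fixed factor."""
    onset_times, t, interval = [], 0.0, 1.0 / start_hz
    for _ in range(n_beats):
        onset_times.append(t)
        t += interval
        interval *= shrink
    return onset_times


def layered_onsets(pulse_onsets, beats_per_bar=3, subdivisions=2):
    """Derive metre and subdivisions from the shared principal pulse."""
    metre = pulse_onsets[::beats_per_bar]          # first beat of each bar
    fast = []
    for start, end in zip(pulse_onsets, pulse_onsets[1:]):
        step = (end - start) / subdivisions        # even split of each beat
        fast.extend(start + k * step for k in range(subdivisions))
    return {"metre": metre,
            "principal pulse": list(pulse_onsets),
            "subdivisions": fast}


pulse = accelerating_pulse()
layers = layered_onsets(pulse)
for name, times in layers.items():
    print(name, [round(t, 3) for t in times])
```

Because the slower and faster layers are computed from the same onsets, every bar accent coincides with a principal pulse and every principal pulse with a subdivision, however the tempo changes.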
25.12 Music with neither pulse nor metre

There are many kinds of music that appear to have no pulse or metre. For example, a large part
of the repertoire for the shakuhachi, the Japanese end-blown bamboo flute, is understood to
be free of rhythmic frequencies. Its phrases are based on the duration of a breath, which in the
consciously controlled, non-autonomic context, may last for many seconds. These breath phrases
are articulated by notes of ‘sensed’ rather than ‘counted’ duration, with fluid musical figures,
flexible glides and a subtle morphology of timbre and sonority. The composer Toru Takemitsu
told me how once he was sitting with friends in a restaurant with a famous shakuhachi master,
and asked him to play the longest, quietest note he could. The master drew breath, put the instrument
to his lips, and for a very long time, there was no sound; then there came a strange, low and
resonant gurgling, and then for many seconds, silence. The friends asked him how he had played
this extraordinary sound. The master replied, ‘I played nothing. It was the sound of the soup
boiling in the kitchen.’
The breath phrase raises questions about both the psychological and the rhythmic ‘present’.
It is perhaps no coincidence that the shakuhachi has been closely associated in its history with the
philosophy and practice of Zen Buddhism. If there is a parallel in the European philosophical
tradition which may help define this specific modality of non-rhythmic musical ‘present’, it is
probably Husserl’s theory of Aktkontinuum, where the observation (Wahrnehmung) of the here-
and-now musical moment is joined in a continuous flow with memory (Erinnerung) and antici-
pation (Erwartung) (Husserl 1969). What is interesting is that this repertoire is profoundly
internalized by players over many years of dedicated rehearsal and repetition, and that there is
notable consistency in how an interpretation evolves. Although there are no obvious frequency
clocks, there is physiological evidence for oscillators with the required periodicity (Delamont
et al. 1999). It seems likely that some deeply unconscious array of overlapping oscillators or
extended accelerating and decelerating tau-related structures may be at work (Schögler and
Trevarthen 2007). The relation of this physiology to the experience of musical ‘narration in
movement’ will surely be the subject of useful future study (Panksepp and Trevarthen Chapter 7,
this volume).
Karlheinz Stockhausen’s compositional research in the 1950s into the morphology of pitch
and rhythm led him to explore complex layered structures of pitch and rhythmic oscillators
(Stockhausen 1971, 1989; Truelove 1998). The result was a music where, once again, the sense
of simple pulse and metre was lost, but where the outcome was a moment-by-moment evolving
atonal fluidity, presenting creative challenges to the rhythmic ‘present’. Conservative music
critics of the time, and neo-conservatives since, considered this music ‘non-biological’, with
‘no beat and no tune’—but dancers reacted with ‘biological’ enthusiasm. For the tradition of
modern dance, issuing from the innovations of figures like Martha Graham, where weight shifted
to the lower body and the flat of the foot, and movement of more complex harmonicity and
inharmonicity became possible, the music of Stockhausen and his generation provided the
parallelism, support and counterpoint for entirely new and expressive narratives of human
movement. Significantly, even more popular choreographers, closer to the traditions of ballet,
like Maurice Béjart, relished making intensely physical dance pieces to these ‘non-rhythmic’
scores.
25.13 Conclusion: musical chronobiology and rhythms of communication
In this chapter it has been argued that music may have an important role to play in human
chronobiology. The creating and enjoying of music may help us understand the science of time
in the mind and the inner regulation of activities, especially those of communication. In a sense
musical rhythm is human chronobiology made manifest and enacted. This enactment takes place
within what we experience as the ‘window’ of rhythm, which begins with the frequencies where
pitch ends and with the fastest rhythms human beings can play, and ends with the low frequen-
cies where we may no longer feel individual pulse articulations as rhythmically related and they
are perceived to become separate events.
Both rhythm and pitch, in their different temporal domains, are qualities of conscious experi-
ence that are inseparable from the generation of movements in the human body—they are prop-
erties of the sounds that moving may create. In music we hear the controlled dynamic regularity
of moving, and this hearing appears to be crucially connected with the unique ability we humans
have to make the meanings of culture. As Merlin Donald (1991, 2001) and Oliver Sacks (2007)
propose, we are animals that use rhythm and melody within our brains and bodies to make a
collective experience of myths and rituals (Merker Chapter 4, this volume) to generate ‘musical
semantics’ (Kühl 2007).
Significantly, or inevitably, the frequencies of the primary functions of human chronobiology
in regulation of actions, ranging from voluntary and involuntary movements to autonomic
periodicities and brainwaves, are all located within this ‘window’ of perceived musical rhythm.
There is evidence that in certain cases there are strong relationships between the experience or
understanding of musical rhythms and such primary chronobiological functions in moving.
These relationships appear to be both cortical and subcortical, or whole brain. They include
complex cognitive processing, simple neural hotlines such as those notionally involved
in the acoustic startle response (ASR), and less tangible surges or tides of motor energy and
attention generated within the emotional motor system and the associated time and serial order
keeping structures of the thalamus, basal ganglia and cerebellum with their palaeo- and neocortical
extensions. It is theorized that at the heart of these relationships lies some form of polyrhythmic,
amodal and physiognomically expressive intrinsic motive pulse (IMP) (Trevarthen 1999), or
system of neural oscillators capable of entraining, and being entrained by musical rhythms and
associated ‘sympathetic’ physical movement of other human beings.
Musical rhythms may be understood as more or less simply structured nested systems of
frequencies, or groups of oscillators tending to be in phase. Occasionally the lowest frequency
may be determined as a phrase, but for most purposes it is articulated through metre. The princi-
pal pulse lies at the higher frequency level above the phrase. Rhythmic subdivisions form the
highest frequencies. Such systems may be represented in chronobiograms, where the hierarchy of
pulses may be related to hypothetical neural systems such as the IMP, and also in certain cases to
a map of the body, portrayed hierarchically from its centre to its extremities. There seems to be a
high degree of coherence, harmonicity and phase between these oscillators. In phenomenological
terms metrical/pulse systems may also be seen as effecting both a homeostasis and a stretching
of the ‘present’ in human awareness—evident in thinking and talking. The same frame and
content patterns of temporal regulation in hierarchies of movement are seen in manipulation, in
speaking, in gestures, in dancing, and in music (Table 25.1).
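The nesting described above can be made concrete with a small numerical sketch. It is not a reconstruction of the chronobiograms referred to in the text; it simply assumes a pulse given in beats per minute and hypothetical integer groupings for subdivision, metre and phrase, and reports each level as a frequency in hertz.

```python
def nested_frequencies(pulse_bpm, subdivisions_per_beat, beats_per_bar, bars_per_phrase):
    """Return the frequency (Hz) of each level in a simple metrical hierarchy."""
    pulse_hz = pulse_bpm / 60.0           # principal pulse
    metre_hz = pulse_hz / beats_per_bar   # one cycle per bar
    return {
        "subdivision": pulse_hz * subdivisions_per_beat,  # highest frequency
        "pulse": pulse_hz,
        "metre": metre_hz,
        "phrase": metre_hz / bars_per_phrase,             # lowest frequency
    }


# Illustrative values only: 120 beats per minute, four subdivisions per beat,
# four beats per bar, four bars per phrase.
levels = nested_frequencies(pulse_bpm=120, subdivisions_per_beat=4,
                            beats_per_bar=4, bars_per_phrase=4)
```

With these assumed values the levels are whole-number multiples of one another, which is the sense in which such a system can be called harmonic and in phase: every phrase boundary falls on a bar line, every bar line on a pulse, and every pulse on a subdivision.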
The connection between music, movement and chronobiology does not depend on rhythmic
frequency alone. Movements are modulated by the quality and intensity of energy or ‘flow’
(Csikszentmihalyi 1990), and therein lies their emotional and emotive power. Factors such as
melody, harmony and timbre are central concerns, as are accent and anticipation, auditory tau
and kappa, unevenness, vitality, acceleration and deceleration. Even apparently non-
metrical/pulse structures may carry strong chronobiological messages in terms of fluidity of
movement or complex superimposition of frequencies.
For musicians, a theory of rhythm as enacted chronobiology may serve to confirm a number of
familiar musical intuitions and experiences (most of which are considered in Oliver Sacks’ recent
book, Musicophilia): such as, that there is a strong link between the perception of rhythm and the
execution and control of movement; that this coordination may correspond to a rhythmic map
of the body in the auditory and motor regions of the brain; that rhythm may be highly expressive
and communicate rich physical and emotional messages; that it may generate a sense of psycho-
logical comfort, well-being and physical energy; that musicians may coordinate themselves and
one another by cueing, setting and regulating matching kinds of internal rhythmic ‘clocks’. It may
give the most precise explanation of the human capacity to generate musical dramas and imagi-
native stories of moving.
For biologists the question is how and why this apparently complex and powerful system
evolved. The classic explanation, derived from social anthropology, is that music, together with
musical rhythm, evolved to serve specific social and spiritual functions such as ritual, courtship
and the coordination of work, as it is seen to do in all existing cultures. The argument still holds
true as far as it goes, but it seems likely that these varying activities are predicated upon some
deeper, common core and origin in sources of aesthetic and moral emotions that show their
beginnings in infancy (Dissanayake Chapter 2, this volume). It is arguable that something resem-
bling music emerged early in the evolution of human beings, and possibly long before language
as we know it. The evidence for this is both archaeological (see Cross and Morley, Chapter 5, this
volume) and psychobiological. The fetal responses to musical experience in the womb (Lecanuet
1996), and the rich musical/rhythmic dialogues between mothers and very young babies identi-
fied by Malloch and Trevarthen (Malloch 1999; Trevarthen 1999; Trevarthen and Malloch 2002)
both point towards the possibility of early beginnings in a preverbal stage. Similarly, the presence
of a simple but recruitable neurology for music in the subcortical brain, for example ‘musical’
neurons in the inferior colliculus and medial geniculate nucleus, capable of detecting
characteristics of pitch glide and vocal modulation, and linked closely to systems of movement, metabolism
and emotion, indicates at least the feasibility and practicability of an emergent musicality early in
the human evolutionary process (Panksepp and Trevarthen Chapter 7, this volume). This is not
to suggest, of course, that this evolution took place independent of innovations and elaborations
in the neocortex permitting a vastly increased learning, but rather that there were well-formed,
ancient foundations for the process in the lower brain.
Such an innate musicality, elaborated from the communicative calls and ‘dancing’ communica-
tive mannerisms of animals (Merker Chapter 4 and Dissanayake Chapter 24, this volume), is
likely to have been linked to innovations in the early social development in human beings—
the need to communicate and learn intentions, and states of mind and body in both intimate
and larger groupings, and the need to engage in sympathy, empathy and synchrony with other
individuals in imaginative fantasy-making play with invented meanings. The place of rhythm
and an enacted chronobiology is central to all of this.
If we accept that musical rhythm is an ‘externalization’ of inner biological rhythms, then it may
also be seen as the chronobiological tool with which humans coordinate one another, entraining
their inner clocks and movements to shared, nested pulses. This is what Trevarthen et al. (2006)
refer to as the shared psychology of ‘synrhythmia’, an elaboration of the mutual physiological reg-
ulation of ‘amphoteronomy’—the way that the sharing of musical rhythm may regulate and
coordinate both conscious and autonomic inner clocks among different individuals. The chrono-
biology of music has the power to carry detailed information about the state of body, emotions,
motivations, energy and vitality of the performer and share all this with others through the
common properties of their motive states. In this coordinated state of being rhythmically ‘with’
others (for examples, see Gratier and Danon Chapter 14, Pavlicevic and Ansdell Chapter 16,
Dissanayake Chapter 24, this volume), it is argued that the phenomenological present may be
appreciated in a common homeostasis of thought and consciousness, and may lead to the
sharing of both emotion and common sense of agency in elaborate stories of purpose and
experience, including the orchestral compositions, jazz improvisations, rock’n’roll pieces, folk
songs and so on into the endless variety of musical forms that musicians make in our culture.
References
Adolphs R (2003). Investigating the cognitive neuroscience of social behavior. Neuropsychologia, 41, 119–126.
Apel W (1953). The notation of polyphonic music: 900–1600. The Medieval Academy of America,
Cambridge, MA.
Augustine, Saint (2006). The confessions of St Augustine. Translated from the Latin. Watkins, London.
Barlow RB (1990). What the brain tells the eye. Scientific American, 262(4), 90–95.
Bear MF, Connors BW and Paradiso MA (2006). Neuroscience: Exploring the brain, 3rd revised edn.
Lippincott Williams And Wilkins, Philadelphia, PA, London.
Bekoff M (1972). The development of social interaction, play, and metacommunication in mammals:
An ethological perspective. The Quarterly Review of Biology, 47(4), 412–434.
Bekoff M and Fox MW (1972). Postnatal neural ontogeny: Environment-dependent and/or environment-
expectant? Developmental Psychobiology, 5(4), 323–341.
Bernstein N (1967). Coordination and regulation of movements. Pergamon, New York.
Bielawski L (1976). Strefowa teoria czasu i jej znaczenie dla antropologii muzyki [Zonal Theory of Time
and its Significance for the Anthropology of Music]. Polskie Wydawnictwo Muzyczne, Kraków.
Blood AJ and Zatorre R (2001). Intensely pleasurable responses to music correlate with activity in brain
regions implicated in reward and emotion. Proceedings of the National Academy of Sciences, 98/20,
11818–11823.
Brugge JF (1987). Auditory system. In G Adelman, ed., Encyclopedia of neuroscience, Vol. I, pp. 89–92.
Birkhäuser, Boston, MA, Basel, Stuttgart.
Clarke E (2001). Meaning and the specification of motion in music. Musicae Scientiae, 5(2), 213–234.
Condon WS and Sander LS (1974). Neonate movement is synchronized with adult speech: Interactional
Damasio AR (1999). The feeling of what happens: Body and emotion in the making of consciousness. Harcourt
Brace, New York.
Delamont RS, Julu POO and Jamal GA (1999). Periodicity of a noninvasive measure of cardiac vagal tone
during non-rapid eye movement sleep in non-sleep-deprived and sleep-deprived normal subjects.
Journal of Clinical Neurophysiology, 16(2), 146–153.
Donald M (1991). Origins of the modern mind: Three stages in the evolution of culture and cognition. Harvard
Donald M (2001). A mind so rare: The evolution of human consciousness. Norton, New York and London.
Epstein CM (1983). Introduction to EEG and evoked potentials. Lippincott, Philadelphia, PA.
Fan D and Cohen RS (1996). Chinese studies in the history and philosophy of science and technology.
Trans. K Dugan and Jiang Mingshan. Springer, New York.
Foster RG and Kreitzman L (2004). Rhythms of life: The biological clocks that control the daily lives of every
living thing. Profile Books, London.
Gallese V (2003). The roots of empathy. The shared manifold hypothesis and the neural basis of
intersubjectivity. Psychopathology, 36, 171–180.
Hauptmann M (1853/1991). The nature of harmony and metre [Die Natur der Harmonik und Metrik].
Da Capo Press, New York.
Hinterberger T and Baier G (2005). Parametric orchestral sonification of EEG in real time. IEEE
MultiMedia, 12, 70–79.
Holstege G, Bandler R and Saper CB (eds) (1996). The emotional motor system. Progress in Brain Research,
Volume 107. Elsevier, Amsterdam.
Husserl E (1969). The phenomenology of internal time-consciousness (1893–1917). [Zur Phänomenologie des
inneren Zeitbewusstseins (1893–1917)]. English Translation R Boehm. Martinus Nijhoff, The Hague,
Netherlands.
Karban R, Black CA and Weinbaum SA (2000). How 17-year cicadas keep track of time. Ecology Letters,
3, 253–256.
Kholopova VN (1994). Muzika kak vid iskusstva [Music as a kind of art]. Parts 1 and 2, 2nd edn. Moscow
Conservatory, Moscow. (In Russian).
Krakouer L, Houghton S, Douglas G and West J (2001). The efficacy of music therapy in effecting
behaviour change in persons with cerebral palsy. International Journal of Psychosocial Rehabilitation,
6, 29–37.
Krumhansl CL (2000). Rhythm and pitch in music cognition. Psychological Bulletin, 126(1), 159–179.
Peter Lang, Bern.
Lakoff G and Johnson M (1999). Philosophy in the flesh, the embodied mind and its challenges to Western
thought. New York, Basic Books.
Lecanuet J-P (1996). Prenatal auditory experience. In I Deliège and J Sloboda, eds, Musical beginnings:
Origins and development of musical competence, pp. 3–34. Oxford University Press, Oxford/New
York/Tokyo.
Lee DN (1998). Guiding movement by coupling taus. Ecological Psychology, 10, 221–250.
Lee DN (2005). Tau in action in development. In JJ Rieser, JJ Lockman and CA Nelson, eds, Action,
perception and cognition in learning and development, pp. 3–49. Erlbaum, Hillsdale, NJ.
Lerdahl F and Jackendoff R (1983). A generative theory of tonal music. MIT Press, Cambridge, MA.
Li L, Korngut LM, Frost BJ and Beninger RJ (1998). Prepulse inhibition following lesions of the inferior
colliculus: prepulse intensity functions – selective uptake and axonal transport of D-[3H] aspartate.
Physiology and Behavior, 65(1), 133–139.
MacLean PD (1990). The triune brain in evolution, role in paleocerebral functions. Plenum Press, New York.
1999–2000), 29–57.
Manning A (2004). The sound of life. BBC Radio 4, July 2004. CD produced by S Blunt for the Open
University, Milton Keynes
Masolo DA (2000). From myth to reality: African philosophy at century-end. Research in African Literatures,
31(1), 149–172.
Meloni EG and Davis M (1998). The dorsal cochlear nucleus contributes to a high-intensity component of
the acoustic startle reflex in rats. Hearing Research, 119(1–2), 69–80.
Merker B (2005). The liabilities of mobility: A selection pressure for the transition to cortex in animal
evolution. Consciousness and Cognition, 14, 89–114.
Merker B (2006). Consciousness without a cerebral cortex: A challenge for neuroscience and medicine.
Behavioral and Brain Sciences, 30, 63–134.
Meyer-Eppler W (1952). Grundlagen und Anwendungen der Informationstheorie. Kommunikation und
Kybernetik in Einzeldarstellungen, Band 1. Springer-Verlag, Berlin.
Mocquereau DA (1927). Le nombre musical grégorien ou rythmique grégorienne – théorie et pratique [The
measure of gregorian music or gregorian rhythm – theory and practice] – Tomes 1 et 2. Société Saint Jean
l’évangéliste, Desclée.
Pacchetti C, Mancini F, Aglieri R, Fundarò C, Martignoni E and Nappi G (2000). Active music therapy in
Parkinson’s disease: An integrative method for motor and emotional rehabilitation. Psychosomatic
Medicine, 62, 386–393.
Panksepp J (1998). The periconscious substrates of consciousness: Affective states and the evolutionary
origins of the self. Journal of Consciousness Studies, 5, 566–582.
37, 315–331.
Peretz I and Kolinsky R (1993). Boundaries of separability between melody and rhythm in music discrimi-
nation: A neuropsychological perspective. Quarterly Journal of Experimental Psychology, 46A, 301–325.
Peretz I and Zatorre R (eds) (2003). The cognitive neuroscience of music. Oxford University Press, Oxford.
Phillips DP and Farmer ME (1990). Acquired word deafness and the temporal grain of sound representa-
tion in the primary auditory cortex. Brain Research, 40, 84–90.
Pöppel E (2002). Three seconds: A temporal platform for conscious activities. In A Grunwald, M Gutmann
and EM Neumann-Held, eds, On human nature. Wissenschaftethik und Technikfolgenbeurteilung, Bd. 15,
pp. 73–79. Springer Verlag, Berlin, Heidelberg, New York.
Pöppel E and Wittmann M (1999). Time in the mind. In R Wilson and F Keil, eds, The MIT encyclopedia of
the cognitive sciences, pp. 836–837. The MIT Press, Cambridge MA.
Povel D-J and Essens P (1985). Perception of temporal patterns. Music Perception, 2(4), 411–440.
Rizzolatti G, Fogassi L and Gallese V (2001). Neurophysiological mechanisms underlying the understanding
and imitation of action. Nature Reviews Neuroscience, 2, 661–670.
Rudziński W (1987). Nauka o rytmie muzycznym. Polskie Wydawnictwo Muzyczne, Kraków.
Sacks O (2007). Musicophilia: Tales of music and the brain. Random House, New York; Picador, London.
Schögler B and Trevarthen C (2007). To sing and dance together. In S Bråten, ed., On being moved: From
mirror neurons to empathy, pp. 281–302. John Benjamins, Amsterdam, Philadelphia.
1999–2000), 75–92.
Sillar KT, Reith CA and McDearmid JR (1998). Development and aminergic neuromodulation of a spinal
locomotor network controlling swimming in Xenopus larvae. Annals of the New York Academy of
Sciences, 860, 318–332.
Skille O and Wigram A (1995). The effects of music, vocalisation and vibrations on brain and muscle
tissue: Studies in vibroacoustic therapy. In A Wigram, B Saperston and R West, eds, The art and science
of music therapy: A handbook, pp. 23–57. Harwood Academic, London.
In WPD Wightman and JC Bryce, eds, Essays on philosophical subjects, pp. 176–213. Liberty Fund,
Indianapolis, IN.
Steedman M (1996). Phrasal intonation and the acquisition of syntax. In J Morgan and K Demuth, eds,
Signal to syntax, pp. 331–342. Erlbaum, Mahwah, NJ.
Stern DN (2000). The interpersonal world of the infant: A view from psychoanalysis and development psychology.
Originally published in 1985. Paperback 2nd edn, with new Introduction. Basic Books, New York.
Stockhausen K (1971). Texte zur Musik. M DuMont Schauberg, Cologne.
Stockhausen K (1989). Stockhausen on music. Lectures and Interviews compiled by Robin Maconie. Marion
Boyars, London and New York.
Szuman S (1951). Dowcip i ironia Chopina. Muzyka, 2, 23–33.
Temperley D (2001). The cognition of basic musical structures. MIT Press, Cambridge, MA, London.
Trevarthen C (2008). The musical art of infant conversation: Narrating in the time of sympathetic
experience, without rational interpretation, before words. Musicae Scientiae (Special Issue), M Imberty
and M Gratier eds. In press.
Intrinsic factors in child mental health. Development and Psychopathology, 6, 599–635.
Trevarthen C, Aitken KJ, Vandekerckhove M, Delafield-Butt J and Nagy E (2006). Collaborative
regulations of vitality in early childhood: Stress in intimate relationships and postnatal psychopathology.
In D Cicchetti and DJ Cohen, eds, Developmental psychopathology, 2nd edn, pp. 65–126. Wiley,
New York.
Truelove S (1998). The translation of rhythm into pitch in Stockhausen’s Klavierstück XI. Perspectives of
New Music, 36(1), 189–220.
Turner F and Pöppel E (1988). Metered poetry, the brain, and time. In I Rentschler, B Herzberger and
D Epstein, eds, Beauty and the brain. Biological aspects of aesthetics, pp. 71–90. Birkhäuser Verlag, Basel.
Wellesz E (1963). The interpretation of plainchant. Music and Letters, 44(4), 343–349.
Werner H (1948). The comparative psychology of mental development. International Universities Press,
New York.
with special reference to music perception and performance. Musicae Scientiae (Special Issue
1999–2000), 13–28.
Zatorre RJ and Peretz I (eds) (2001). The biological foundations of music. New York Academy of Sciences,
New York.
Chapter 26
Musical communication: The body movements of performance
Jane Davidson and Stephen Malloch
26.1 Introduction
The role of body movement in creating and communicating music, and in conveying extra-musical
meaning during performance, has attracted academic attention. Beyond the obvious need for a
body to interact with a musical instrument in order to create musical sounds, the idea has been
explored that musical meaning itself originates in the body, that music is experienced as move-
ment (for example, Cox 2001, 2006; Davidson and Correia 2002; Malloch 2005; Cross and
Morley, Chapter 5, Lee and Schögler, Chapter 6, and Panksepp and Trevarthen, Chapter 7, this
volume). Many of these ideas have been inspired by the writing of philosophers Mark Johnson
and George Lakoff (Johnson 1987; Johnson and Larson 2003; Lakoff and Johnson 1980, 1999)
who argue that the body reveals and shapes all mental states, both conscious and unconscious.
We have been excited by these proposals, and have independently researched the role of the body
in the creation of expressive musical communication. The second author has investigated the
very earliest mother–infant interactions, arguing that caregiver–infant communication is an
expression of our intrinsic musicality (Malloch 1999; Trevarthen and Malloch 2000). The first
author has been developing the idea that our vocal communication, most typically experienced
between adults as speech and filtered through complex social and cultural practices, is a process
parallel to our development and production of musical expression. Thus, musical performance
might be conceived of as the performer communicating with co-performers and audience
through the intrinsic musicality of body movements, the sounds that are produced by the move-
ments of the body being disciplined by cultural practice and performance technique so as to create
meaning in a gestural narrative.
In this chapter we explore these ideas, and show how our independent research efforts come
together to bring practical insights to important theoretical questions around how musical com-
munication takes place. We briefly explore two types of performance as illustrations. First,
we consider the performances of Amy Wu, a Cantonese singer, to demonstrate that Amy’s vocal
sounds and bodily movements co-specify a number of interlocking messages: an interpretation
of the song being performed, sociocultural rules relating to performance etiquette, a projected
sense of the performer’s social self, and an intimate sense of the self revealed by the performer.
In other words, the performance is a complex communication within the medium of music,
emerging out of a range of communicative principles, including self-awareness. Second, we
explore the musical sounds and overall movement patterns of two professional Western classical
performers—a flautist and a clarinettist. The emphasis is on how the overall movement patterns
change as a function of familiarity with the piece, as well as with the change from playing solo to
playing in duet. In particular, we show how the movements permit co-specification of musical
expression with co-regulation of movement in time.
26.2 Theorizing the movements used in musical performance

26.2.1 Movement in music as a fundamental form of human communication
Infant-directed speech or ‘motherese’, used to soothe, stimulate and communicate feelings, inter-
ests and purposes with infants, results in adaptive advantages for the baby’s body and for
development of the baby’s mind – e.g., to promote sleep, to attract attention, to teach language
(Ayers 1973; Fernald and Mazzie 1991; Kitamura and Burnham 2003; Snow 1977, 1989). Talk,
song and play with infants facilitate the cultivation of reciprocity and companionship (Malloch
1999, 2005; Trevarthen 1999, 2001a). Compelling video and spectrographic analyses of sound
and movement interactions of infants and caregivers strongly support the notion that there is a
mutual ‘tuning in’ which promotes the infant’s experience of a social world. Malloch (1999) has
described this aspect of interaction as communicative musicality. He has suggested that when
mothers are suffering from a mood disorder (such as post-natal depression), they do not interact
with their infants with as much ‘musicality’, and so infants may not progress as well in their
general development (Malloch 2004, 2005; Murray and Cooper 1997; Robb 1999). Thus commu-
nicative musicality, listened to and participated in, may be an essential foundation for mental and
social development—a core channel of human emotional and purposeful communication
(Panksepp and Trevarthen, Chapter 7, this volume). Normal enculturation (from motherese
and infants participating in play-songs, to children’s play-singing and sharing common social
musical activities such as singing ‘Happy Birthday’, joining in with musical activities at school,
or singing along to music on the radio) seems to guarantee that children will develop their
singing voice by participating in sung vocal activities in a very natural way, learning intuitively
the structures of the musical language in which they are immersed (Sloboda 1985; Bjørkvold
1989). Gordon’s (1987) work employing standardized musical measures suggests that by 9 years
of age most of us have acquired a consistent range of listening and recognition skills in music,
irrespective of culture and whether we receive special training or not, supporting the idea that
experiencing music is both an innate and developing human ability (Gordon 1987). From this
base, we may propose a theory of musical value and meaning based on the adaptive, evolutionary
function of moving with musicality. Contributors to this volume have done this very convinc-
ingly (Part 1 of this volume). The key role of the body in the generation and co-construction of
motherese (Trevarthen 1986, 2001b; Trevarthen and Malloch 2000) hints at how we might use
this fundamental experience as a basis for more sophisticated and abstract behaviours such as the
music or speech that is experienced within, and helps to define, a specific cultural environment.
Culturally embedded musical forms, which grow from our innate musicality, incorporate a
sophisticated abstracted code dependent on many subtle social and cultural rules. Knowledge
and de-coding skills are required in order to appreciate fully the social and cultural communica-
tion (Clarke and Davidson 1998). However, proposals are emerging, akin to the notion of com-
municative musicality, that are about the understanding and training of musical language itself
which highlight very basic bodily concepts such as ‘weight’, ‘time’, ‘space’ and ‘flow’ as the
elements underpinning musical meaning in movement (Kühl 2007). The argument is that in
order to appreciate fully the nature of music and its meaning, we connect our fundamental
embodiment to all our social and cultural knowledge and understanding. A commonly cited
source of inspiration for such ideas comes in the practical music teaching of Emile Jacques-
Dalcroze (1921) who points to body movement as crucial to the process of unifying musical
elements and focusing on musical expression. Pierce (1994, 2003) has made a significant impact
in the US by adapting Dalcrozian-type principles for work with advanced musicians. Her
approach for piano students involves teaching rhythm by experiencing the beat or pulse of music
through pendular, swinging movements away from the instrument. The point of an exercise
like this is to embody the full motion required to produce the exact attack point on the beat
(cf. Lee and Schögler Chapter 6, this volume). That is, the student can feel the approaching
downbeat and the surrounding moments in the body swing. Guile (2000) has made further
important inroads in this area, relating her ideas to those of dance theorist and practitioner
Rudolf Laban (Laban 1960). Guile shows that physical motion descriptions such as force,
weight and flow can be used as means of eliciting musical expression from music students. Here
expression refers to the subtle variations in pitch, timbre, timing and dynamics that make one
interpretation different from the next and more or less artistically appropriate, according to the specific
cultural framework within which the music is created and performed. Guile arranged for children
learning many different instruments to employ particular expressive effects by asking them
to experiment with physical effects: for example, ‘dabbing’ movements translated into ‘dabbing’ musical
actions, which in turn translated into ‘dabbing’ musical expression. Thus, what could be considered a
highly abstract musical idea can be straightforwardly played out through the body. Although the
efficacy of Guile and Pierce’s work or indeed that of Dalcroze and others can be questioned, these
teachers directly draw on experiences of the moving body to assist students towards a deeper
‘grounding’ and embodiment of their music technique and expression.
Lidov (1987, 2006) and Hatten (2006) have shown that Western classical music (a Beethoven
sonata, for example) can be de-coded or appreciated consciously if the naturally generated phys-
ical gestures that are required to articulate the music are carefully considered during the develop-
ment of ideas on how the music can be expressed. For instance, looking at a melody in terms of
how the body needs to negotiate the trajectory of the musical line can help to achieve effects such
as legato or staccato with a specific underlying emotional quality. Perceptually, music often pro-
duces visceral responses. Listening to music, for example, might cause shivers down the spine or a
lump in the throat (Sloboda 1991; Panksepp and Trevarthen, Chapter 7, this volume).
Participation in musical activity elicits powerful physical/affective states. Watt and Ash (1998)
suggest that these emotional experiences are akin to participating in social interaction, the music
acting like a virtual person on the performer or listener.
26.2.2 Key components of musical performance skills

Information about performance skills helps to demonstrate how the body functions in musical
performance. Playing a piece of music depends on having developed a range of complex and
interactive cognitive, perceptual and action processes, and these processes rely upon internal
representations in the memory of the performer. Such ‘representations’ are situation- and task-
specific (see Lehmann and Davidson 2002 for more details), with the level of fluency in the pro-
duction and use of the mental representations being a function of knowledge and practice.
A novice string student might have a representational system consisting of laborious fingering
combinations, while a more advanced player might have a representational system of fluent
fingering and also the underlying chord progression along with some expressive information,
some aural image of the sounds, and a visual representation of the score. Lehmann and Ericsson
(1997) have proposed that in order to play well, musicians need at least three different types of
mental representations corresponding to a goal representation, a production representation, and
a representation of the current performance. We can explain the degree of bodily engagement of
the performer as being the consequence of performance goals (technical and expressive aims)
and the self-monitoring that goes on during the course of the performance, which are dependent
on combined intellectual/conceptual understanding and motor skill. The performer, of course,
is not aware of every performance aspect, and a key feature of music practice is to ensure that the
general playing activity is completely automatic. Thus, a piece is learned until it is well
established in thought and motor activity, so as to allow the player more mental freedom during
a performance to deal with ‘in the moment’ aspects of expressive interpretation or problem-
solving (Rodrigues et al. Chapter 27, Section 27.3, this volume).
The particular instrument learned will have a critical role in shaping a performer’s representa-
tion of the music, and for each instrument and its particularities, each performer develops ways
of playing which will be slightly different, even if the technicalities of playing are based on the
same principles. In musicians’ interactions with instruments, they may apply the principles of
ergonomics both to differences in the repertoire and to the physical approach required when
playing—for example, the harpsichord versus the piano, or the viola versus the violin. The bodily
investment of a musician will affect his or her capacity to play and the eventual musical outcome.
There is a trade-off between physical economy and expressive affect in terms of the body
movements a musician uses to generate a performance. For example, basic motor control
achievement demonstrates that humans produce a minimum action time when doing a specific
task, which in turn creates an economy of movement, e.g., in rowing, or box packing
(see Davidson 2005 for more details). During performance, musicians do indeed have their own
‘incompressible minimum’ performance profiles. However, there are also manipulations that are
dependent on representations of musical expression, as well as the technical achievement of note
playing. Thus, two kinds of movement information are involved in the physical action of music
performance: those for the requirement of playing the notes and those to achieve musical effects.
When a skilled performer plays a well-learned piece, these actions become co-specified, especially
if the piece is performed with a high degree of automaticity.
The execution of music relies on bodily movement ‘grammars’ which have semantic codes to
produce expressive effects both in the production of the sounds, and for additional interpersonal
communication. The gestural codes seem to have a function similar to those used in speech.
Consider the gestures used to accompany speech when a feature of an event is being described—
for example, a rolling hand movement to illustrate that someone was speaking on and on. Similar
movements (such as a lifted rolling arm movement) are often found in pianists when rippling
melodic ‘on and on’ lines are being played (Davidson 2007).
In addition to the points made above about the body being critical in shaping the expression of
a performance, it is also important to add that we enjoy the vestibular and other proprioceptive
sense of activity (swaying, dancing) that often accompanies listening to music: the metaphorical
movement of the music induces movement and an experience of movement in ourselves (called ‘sympathetic
kinaesthesia’ in Stevens et al. 2001; also see Todd 1995, 1999). Such a proposal fits with
Trevarthen’s claims about the rhythmical functions of patting, bouncing, and singing being used
to stimulate and comfort an infant (e.g., Trevarthen and Aitken 2001). We propose that perform-
ance and the co-arising reactions to it are associated with both communication and personal
enjoyment in movement, and the movements we make are not only related to musical syntax and
semantics, but are also profoundly impacted by codes of social behaviour.
26.3 Sociocultural codes in music performance

26.3.1 Symbolic exchanges
There are culturally expected behaviours between co-performers, and between performer(s) and
audience—for example, ways in which performers musically cue each other, and ways of greeting
and being received by an audience. Certain forms of dress are expected by both parties. In a
Western classical performance, the performers typically wear evening dress, with dinner jackets
and bow ties for the men and long dresses for the women. The audience are more likely to wear
lounge suits for the men, and cocktail style dresses for the women, though this varies according
to country, venue, age of the audience and size of the event. At the Promenade Concerts in the
Royal Albert Hall, London, for example, one is most likely to see young people in jeans and
t-shirts in the promenaders’ standing area, with older people sitting in the grand tier seats
wearing formal evening dress (see Davidson 1997 for a description).
26.3.2 Sources of the codes

Some social and cultural codes emerge from historical practices. They may come from a particu-
lar ‘school’ of performance. Specific ‘lines’ of Western classical vocal training can be traced from
eighteenth-century Italy, through to France and can be found nowadays all over the world. Some
traditions have long and enduring histories, while others appear and disappear quickly. Some fan
behaviours in popular music audience settings, such as gently swaying to the beat of a ballad with
a lit candle in hand, might be fashionable for a couple of years, and then vanish. These codes
clearly provide sharable information for performers and audience. Certain sociocultural prac-
tices mean that some of these codes are more culturally specific than others, and so knowledge of
the specific culture in which they appear may be necessary before ‘de-coding’ can occur.
A Western classical singer may find it very difficult to follow the singing patterns of a classical
Chinese singer owing to lack of familiarity with the musical syntax, the vocal technique and the
particularities of the hand gestures used during the narration of the song.
The codes can provide formal and intimate information for co-performers and audience.
In terms of body gestures, Davidson (2002a) and Kurosawa and
Davidson (2005) have suggested the following common uses of gesture to accompany singing:
◆ Emblems: symbolic body movements, such as making the sign of the cross.
◆ Illustrators: movements that illustrate content, inflection or loudness metaphorically,
or rhythmically accent or trace ideas.
◆ Affect displays: movements that reveal emotional states.
◆ Regulators: movements that maintain and regulate the flow and content of interaction.
◆ Adaptors: personal ‘habit’ behaviours; for example, rocking whilst wrapping the arms around the
body may be a self-protective (self-adaptor) behaviour. Kurosawa and Davidson (2005) noted
that the displays of closeness and intimacy in adaptor behaviours in the singing of the band,
The Corrs, seemed to reflect their familial bonds.
Besides arm and whole body movements, performers will often use facial expressions
as important means of communication. Facial postures and gaze were found to have important
roles in creating communicative information from The Corrs to the audience and between
themselves (Kurosawa and Davidson 2005).
Bearing in mind the functional, symbolic and social communication roles of movement
in music performances, an attempt to understand these in terms of the theory of communi-
cative musicality is now made by first drawing on the performances of a solo singer, and then by
exploring the performance of an instrumental duet.
26.4 Amy Wu
Amy Wu is a popular singer from Hong Kong with more than 30 commercially successful CD
recordings to her name. She presents audiences with what she regards to be her unique fusion of
classical Cantonese opera in a modern easy-listening or ‘own style’, as Amy calls it. But what does
this mean? To investigate the musical means through which she operates, an informal interview
with Amy took place in the office of her recording company (Worldstar Music, Hong Kong) in
the presence of the first author, Paulina (a translator) and Amy’s recording manager. During the
interview, Amy was asked to sing a verse from a well-known song from the classical repertoire in
two ways: her ‘own style’, and in the traditional style. We realized that the conditions may have
seemed somewhat false, but Amy did not hesitate; she knew exactly what she needed to do in
order to differentiate between the performance styles. She has performed extensively in the tradi-
tional Cantonese opera, as well as promoting her own CDs in a career spanning more than
20 years on stage and screen.
Figures 26.1 and 26.2 show transcriptions of the opening verse of the traditional song ‘Tears of
the Red Candle’, which Amy has sung in the classical theatre on many occasions and which she
has recorded on two of her CDs in her unique ‘own style’. Without attempting to translate the lay-
ers of poetic symbolism in the Cantonese lyrics, the song describes lost love and includes images
of the singer’s body (the candle) ‘fluttering like branches in the breeze’.
The songs are shown in the figures in Western notation. There are striking musical differences
between the two interpretations: the cross-over or ‘own style’ version (Figure 26.1) is a
fourth lower and at a slower tempo than the traditional version (Figure 26.2). When we asked
Amy about these differences, she commented that in the first version she wished to deliver a more
intimate communication of the piece to her audience. In the second version, the formal presenta-
tional style of the opera performance determines not only the higher pitch, but a faster, generally
louder and more projected performance. Note that the first interpretation was sung pianissimo,
with slight and subtle dynamic variation, and the second interpretation was louder, beginning piano with
a crescendo to mezzo forte, and finally forte.
To some extent we can ‘understand’ the performances in terms of how the composed material
is executed musically, but what of the non-verbal aspects? What codes are used, and do these
differ between interpretations?
Tables 26.1 and 26.2 present descriptions of the movement types used, according to the musical
phrase.
Fig. 26.1 ‘Tears of the Red Candle’ as sung in Amy’s ‘own style’. The straight lines above the stave
show the duration of each phrase she sings. [Score: music by Wang Yue-san, lyrics by Deng De-san;
tempo marking = 76.]
Fig. 26.2 ‘Tears of the Red Candle’ as sung by Amy in ‘traditional style’. The straight lines above the
stave show the duration of each phrase she sings. [Score: music by Wang Yue-san, lyrics by Deng De-san;
tempo marking = 92.]
In the ‘own style’ performance, Amy uses facial expression, and hardly any or only tiny hand
gestures, generally with the hands held together. Her eyes close from the start of the second
phrase and open only as she sings the last word. Her face shows a range of expressions, and
her head tilts and rocks gently. When the gesture categories are applied to these movements,
most have an illustrative function (emphasizing the meaning of the words and/or the musical
ornaments) and an adaptive function (general sense of inward self-adaptors in the overall posture and
hand holding ‘protective’ position). In the ‘traditional style’ (seen in Table 26.2), the movements
are much larger and the eyes remain open. The hand gestures trace the line of the melody and point
forward in more stylized, illustrative performance gestures, some having emblematic meanings
of poise and loss. They also contain elements of display, and there is less evidence of adaptive
movement quality.
Table 26.1 ‘Own style’ (four main phrases, as shown on the score; the last is extended owing to the
ornamental phrase extension)
Musical phrase      Body movement
1                   H: Gradually raises eyes, looks ahead, gently sways head side-to-side on the
                    ornamented phrase end.
2                   H: First note lifts face higher, raises eyebrows, then closes eyes and inclines
                    head to her right. Slight side-to-side head sway, as she sings about
                    ‘having had hard times’.
3                   H: On deep in-breath, inclines head forward, sways head to left, keeping
                    eyes closed, though eyebrows move with the ornamentation, singing about
                    how difficult it can be to break a love.
4                   H: Keeps head slightly to left.
                    A: Moves hands a little (to create a tension between them, and so raises her
                    shoulders).
                    H: Head sways more, and continuously until she finishes the work. Eyes stay
                    closed, eyebrows flicker, singing that some things – implying love – cannot
                    be sold. She opens her eyes as she finishes the phrase.
To signal the end   A: Raises her hands (palms together) as she inclines forward to make a small
                    bowing gesture.
Key: H, head movement; A, arm/hand/shoulder movement.

Table 26.2 ‘Traditional style’ (four main phrases, as shown on the score; the last is extended
owing to the ornamental phrase extension)
Musical phrase      Body movement
1                   H: Looks straight ahead of her.
                    A: Holding thumb and third finger together, swings arm forwards and
                    upwards from waist level in a brush-stroke manner tracing the phrase with
                    her right hand, to an outstretched position.
                    H: Wiggles head intensely on ending ornament.
2                   H: Keeps looking forward, with head inclined slightly to left.
                    A: Reverses the brush-stroke action, looping the hand down to join other
                    hand placed at waist level.
3                   H: Still inclined to right, stares forward, and wiggles head on final ornament.
                    A: Both hands move together up the front of the torso to the chest level
                    then descend to waist level, gently entwined in the thumb-finger pose.
4                   H: Still inclined to right, stares forward, and wiggles on mid-phrase
                    ornament. Straightens whole body more.
                    A: Repeats movements of both hands moving together up the front of the
                    torso to the chest level then descend to waist level, gently entwined in the
                    thumb-finger pose. As extended ornamentation begins, hands separate and
                    left hand is raised to chest height in an index finger pointing gesture,
                    which gently returns to join the other hand as they both settle at waist level.
                    H: Gently wiggles head as Amy articulates the final ornament.
To signal the end   Stays in fixed pose, then relaxes her body and smiles.
Key: H, head movement; A, arm/hand/shoulder movement.
The reader can gain some sense of the differences between the two performances by looking at
Figure 26.3, which shows images from the end of the last phrase in each interpretation.
When asked about her performance aims, Amy said she preferred to ‘draw the audience in’ (her
description of the ‘own style’ performance), rather than ‘project onto an audience’ (‘traditional
style’ performance). This way of characterizing the differences between the two interpretations led
her to conclude that her ‘own style’ was potentially a more accessible style both musically (‘well,
it is lower and so easier to listen to’) and in terms of the emotional communication (‘it is my real
emotion’). She felt that there were few ‘social layers’ or ‘cultural barriers’ between her and the audi-
ence. She was performing from ‘inside’ herself. In the traditional style, there was the layer of the
projected, formal gestural style. She said that this was part of the historical tradition, and so within
this framework it was necessary to use these emblematic and illustrative movements, in part owing
to the very dramatic make-up and elaborate costumes worn. However, it was also because it was
more culturally acceptable in the traditional form to have restrained personal emotion.
The first author and the translator were moved by Amy’s performances in both styles, but were
able to see how powerful and affecting the first style was within the context of the interview
room. Indeed, after the interview, Paulina referred to having been ‘touched’ by both the music
and the movements with which Amy sang in ‘her style’.
Fig. 26.3 Left: Amy’s final singing position in her ‘own style’ interpretation of ‘Tears of the Red
Candle’. Right: Amy’s final singing gesture in her ‘traditional style’ interpretation of ‘Tears of the
Red Candle’.
While Amy’s performances illustrate the elaborate social and cultural roles of the performer, it
is important to note that Davidson and Coulam (2006) studied ten singers all singing the same
song (‘Summertime’ by George Gershwin), and discovered that the performers most appreciated
by the accompanist and video spectators in terms of musical quality and visual aesthetic were
those who used the most adaptive gestures. Of course, whether or not these movements and
musical effects were completely spontaneous or whether they were ‘choreographed’, based on
previous models, it is impossible to say. However, the effect in the ‘Summertime’ study—as with
Amy Wu—was for the audience to experience a performance which seemed to be more ‘heartfelt’,
and less mediated by cultural emblems. The inference here is that some adaptive gestures have
universal and perhaps inherent appeal and communicative value for humans.
In the example of Amy, we have shown music and movement are co-specified. We have also
indicated that within the intrinsic and innate appeal of music as a form of embodied communi-
cation, elaborate sociocultural practices shape many aspects of performance meaning, though it
seems that to some extent audiences are drawn to those aspects of performance that demonstrate
inner personal emotions as well as deliberately projected emblematic codes. But we have yet to
see how performers collaborate with one another, and what impact bodily presence and interac-
tion may have on the development of the performance representation.
26.5 Deborah DeGraaff and Leah Lock

Deborah DeGraaff (clarinet) and Leah Lock (flute) are professional musicians working in
Sydney, Australia. Both have performed as soloists and played in prestigious professional ensem-
bles for more than 10 years. They work regularly together, and regard themselves as extremely
familiar with one another’s musical and personal tastes and styles, strengths and weaknesses.
They were asked to practise and then perform a short piece of music for flute and clarinet
especially composed for the study by British composer Mark Slater (www.markslater.net). He was
asked to create parts that could function as independent lines, and so could be performed as
solos, but it was also important that the lines would combine and function as a duet. The parts
were to be of equal technical difficulty and use a very similar range of musical effects. The score
in the duet form is shown in Figure 26.4.
Fig. 26.4 Score of ‘Music for Jane and Stephen’ by Mark Slater, for flute and clarinet. [Score image not
reproduced. Parts: Flute; Clarinet in Bb. Markings: A ‘Slow, tranquil = 68, con moto, rubato’;
B ‘a little waltz = c.104’; ‘subito tempo primo (q = 68) con moto’; C. Event letters: I, J, K, L, M,
i, l1, l2.]
Although the work only lasts a little over a minute, it demands a broad range of expressive
effects: extremes of dynamic range, pitch height, and variations in tempo. As the score indicates,
Mark had the idea of movement in mind as he composed, e.g., a little waltz, con moto. The players
were experienced and confident sight-readers, and so were able to execute the task as we required.
First, we asked each to begin by sight-playing the solo line by themselves. They practised the
piece as many times as they felt necessary, and in a manner of their choosing, in order to feel
they had achieved a grasp of the musical content and a desired interpretation. In this solo condi-
tion, each player chose to play the piece four times in run-through, each having additionally
practised small subsections or single notes between the run-throughs. Yet, their motivations
seemed different: Leah wanted to play until she had a ‘fixed interpretation’. Deborah wanted to
‘keep going to try out more and more possibilities’. All full run-throughs were recorded onto
videotape, digital audio tape and a motion capture tracking system (PEAK Motus). This meant
that we could trace in three-dimensional space how the performers moved over time and align
the movement tracks to the musical sound-stream so that relationships between musical effects
and physical actions could be investigated.
After each player had played the piece through by themselves, the two were asked to come
together to work on the piece – a practice akin to preparing for an ensemble concert or sound
recording. It was only at this stage that they were shown the duet version of the score. Previously,
they had no idea of how the other part interacted with their line. Working in a similar manner to
the solo recordings, the two played through the duet version five times, prior to giving a sixth and
agreed ‘final and polished performance’. All data were collected in a similar manner. These procedures
generated a large amount of data, so we can mention only a small part of it here. We focus
on what happens during Deborah and Leah’s rehearsal process during section B of the piece,
moving from solo to duet. Thus, we examine first run-through number 1 and number 4 of each
of the solos, and then run-through number 1 and number 6 of the duets.
26.5.1 Method of data collection

The three-dimensional movement of the instrumentalists was tracked using PEAK Motus motion
tracking equipment. Three markers were used to track the movement of the instrumentalists—
one placed on the player’s forehead, one on their left shoulder, and one on the end of the instrument.
As they performed the music, the position of the three markers was tracked by six cameras
recording motion 50 times a second. Out of the mass of data that was collected, we have
chosen here only to discuss the movement of the ends of the instruments in the vertical dimension.
The end of the instrument was chosen as it is often with the movements of the instrument itself
that performers cue each other and movements of the instrument reflect both the movements of
the player’s torso and their arms. Thus, a graph of the end of the instrument could be said to
‘summarize’ all the movements of the player. For practical reasons only the vertical (up–down)
movements of the end of the instruments are discussed here. As this is a descriptive analysis of
the movement data, a description of more than one dimension of more than one point would
become hopelessly complex (though it could be accommodated through a mathematically based
description). Yet another practical consideration is that side-to-side movements, as opposed to
up–down (vertical) movements, were more difficult to capture in a single graph. Although both
instrumentalists performed facing towards the front of the performance space, the instrumen-
talists were free to move their bodies and feet as they performed; depending on the orientation
of the body of the performer, the relative contributions of the towards-and-away (x-axis) and
side-to-side (y-axis) movements relative to the cameras will change. Thus, to plot just the x or
just the y axis would not account adequately for the side-to-side movement. However, plotting just
the z dimension will account for all up–down movement. For all these reasons, the following
description of the movements of the performers solely considers the vertical (z-axis) movement
of the ends of the instruments.
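The geometric point behind this choice can be illustrated with a minimal sketch in Python. It is not part of the original analysis: the function names and the sample coordinates are hypothetical, and only the sampling rate of 50 per second comes from the description above. Turning a marker about the vertical axis, as happens when a player reorients the body, redistributes its position between the x and y axes but leaves z unchanged, which is why a single vertical trace stays comparable however the player stands.

```python
import math

SAMPLE_RATE_HZ = 50  # sampling rate reported in the text; everything else is illustrative


def rotate_about_vertical(point, angle_deg):
    """Rotate an (x, y, z) marker position about the vertical (z) axis."""
    x, y, z = point
    a = math.radians(angle_deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)


def vertical_track(frames):
    """Turn a list of (x, y, z) frames into (time in seconds, height) pairs."""
    return [(i / SAMPLE_RATE_HZ, z) for i, (_x, _y, z) in enumerate(frames)]


# Hypothetical position of the end of an instrument, in metres.
marker = (0.30, 0.00, 0.25)
# The player turns a quarter circle: x and y swap roles, z is untouched.
turned = rotate_about_vertical(marker, 90)
```

In this sketch the x and y values of `turned` differ from those of `marker`, whereas the third component is identical, so a plot built from `vertical_track` would not change if the player turned on the spot.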
26.5.2 Discussion of the data

The letters on the score in Figure 26.4 are of two types. The letters that are inside a box
(for example, B), demarcate the sections of the piece. The unboxed letters within section B mark
events, and refer to the same letters as shown on the graphs of movement (see Figures 26.5, 26.6
and 26.7). Upper-case unboxed letters are used to denote events at the beginning of a bar, lower-
case unboxed letters denote events that occur within a bar.
26.5.2.1 Section B, the solos (numbers 1 and 4)

There are two immediately striking points about the comparisons between Deborah’s and Leah’s
styles of playing in Section B. First, Deborah (clarinet, Figure 26.5) moves much more than Leah
(flute, Figure 26.6). Second, although both players make larger and faster movements in solo
number 4—probably associated with increased familiarity—Deborah’s movements through
Section B are often quite different in trajectory and amplitude to those in solo number 1. Leah,
on the other hand, carries out a very similar movement pattern in the two solos. The comments
made by the two women help us understand these findings: (i) Deborah was interested in exploring
the musical material as much as possible in order to find an interpretation; (ii) Leah wanted
to use the run-throughs to consolidate an interpretation.
It is instructive to investigate some details of the movement graphs. Deborah’s movements in
solo 4 tend to be larger and faster than in solo 1, and this difference is reflected in the sound.
Fig. 26.5 Movement graphs for Section B, Clarinet solo 1 and 4. [Two panels, ‘Clarinet solo 1’ and
‘Clarinet solo 4’, each plotting vertical position in metres against time in seconds, with events
I, J, K, l(1), l(2) and N marked.]
Fig. 26.6 Movement graphs for Section B, Flute solo 1 and 4. [Two panels, ‘Flute solo 1’ and
‘Flute solo 4’, each plotting vertical position in metres against time in seconds, with events
i, J, K, L, M and N marked.]
Comparing solos 1 and 4, Deborah performs Section B in solo 4 with a greater sense
of urgency and direction, and the individual bars are more shaped and integrated in dynamic and
rhythm; solo 1 is more metrically even, and the individual phrases are more isolated from each
other. Comparing section (J) to (l(1)) in the movement graphs of the two solos, we can see solo
1 consists of a greater number of changes in direction; solo 4 consists of longer, smoother move-
ments. In particular, solo 4 no longer contains the downwards gesture found in solo 1 which
marks the beginning of bar 11 (K). This may reflect Deborah’s desire to ‘play through’ the first
beat of bar 11 so as to link the phrases of bars 10 and 11, which is indeed the way the music
sounds in solo 4. Overall, it appears the differences in movement between solos 1 and 4 reflect
Deborah’s wish to move towards an interpretation that is concerned with joining phrases into
larger musical units.
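One way to make the phrase ‘number of changes in direction’ concrete is to count how often a vertical trace switches between rising and falling. The sketch below is an illustration only, not the procedure used in the study; the function name, the `min_step` jitter threshold and the two short traces are all hypothetical.

```python
def count_direction_changes(heights, min_step=0.0):
    """Count switches between rising and falling in a sequence of heights.

    Steps whose size is at most min_step are ignored, so that tiny
    measurement jitter is not counted as a change of direction.
    """
    changes = 0
    previous_sign = 0  # 0 means no direction has been established yet
    for earlier, later in zip(heights, heights[1:]):
        step = later - earlier
        if abs(step) <= min_step:
            continue
        sign = 1 if step > 0 else -1
        if previous_sign and sign != previous_sign:
            changes += 1
        previous_sign = sign
    return changes


# Invented traces (metres): a 'busy' movement and a longer, smoother one.
busy_trace = [0.20, 0.25, 0.22, 0.27, 0.21, 0.26]
smooth_trace = [0.20, 0.23, 0.27, 0.30, 0.26, 0.21]
```

On these invented traces the busy movement reverses direction several times while the smooth one reverses only once, which is the kind of contrast described between solos 1 and 4.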
For Leah, Section B has the same overall shape in both performances, with small differences that
may reflect changes in the way she thinks of the phrasing of a section. For example, in solo 1 the
performer’s movements ‘arrive’ onto the first beats of bars 10 (J) and 12 (L); in solo 4 the performer
arrives on the down beat of bar 11 (K), otherwise moving through to the first quaver of the following
bar at (J) and (L). Thus, alongside Leah’s stated desire to consolidate her interpretation, her
movements, like Deborah’s, also suggest a search for how best to phrase and shape the music.
26.5.2.2 The duets (numbers 1 and 6)

When the two women came together to play the duet, several interesting exchanges occurred:
(i) Deborah wanted to discuss her ideas with Leah; (ii) Leah participated in the discussion,
though seemed more ready to play than talk; (iii) after each run-through, tiny subsections were
rehearsed at Deborah’s instruction; (iv) overall, the movement profiles of both players share
characteristics with the solo run-throughs, though in the duets both make a greater number of
smaller-amplitude movements.
Figure 26.7 shows the movements for duets 1 and 6. When compared to the solos, both musi-
cians change the way they move in order to interact closely with the other. Deborah returns to
movements more similar to her ‘busy’ solo 1 than her flowing solo 4; this is particularly notice-
able between (I) and (l(1)). It is as if she is trying to find how to ‘dance’ with her duet partner,
exploring the physical space and noticing Leah’s responses. Leah’s movements are now much
bigger, and strongly mirror Deborah’s; it is as though Deborah’s movements are a magnet
towards which Leah’s movements are drawn. Maybe Leah is trying to fit in with Deborah’s
‘groove’? Just as in the solos, Deborah’s movements show many changes from duet 1 to 6, as she
explores different interpretations, whereas Leah’s movements in duet 6 demonstrate the same
overall shapes as she established when she first joined in the duet. Overall, the movements in duet 6
are larger and more expansive than those in duet 1. They are ‘dancing’ more freely together,
bouncing off one another’s tempi and dynamics, and reacting to the harmonies of the piece.
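If one wished to put a number on how closely one player’s vertical movement follows the other’s, a correlation coefficient between the two traces is one simple candidate. The following sketch is offered under that assumption; it was not part of the descriptive analysis reported here, and the function name and the traces are invented for illustration. Values near 1 would indicate that the two instruments rise and fall together; values near −1 that one rises as the other falls.

```python
import math


def pearson_correlation(a, b):
    """Pearson correlation between two equally long sequences of heights."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("traces must have the same length (at least 2 samples)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = sum((x - mean_a) ** 2 for x in a)
    spread_b = sum((y - mean_b) ** 2 for y in b)
    if spread_a == 0 or spread_b == 0:
        raise ValueError("a constant trace has no defined correlation")
    return covariance / math.sqrt(spread_a * spread_b)


# Invented traces (metres): the flute follows the clarinet's shape with smaller amplitude.
clarinet = [0.10, 0.15, 0.12, 0.18, 0.11]
flute_following = [0.02, 0.04, 0.03, 0.05, 0.02]
similarity = pearson_correlation(clarinet, flute_following)
```

Because the correlation ignores overall amplitude, it would register a flute that traces the same shape as the clarinet even when its movements are smaller.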
Fig. 26.7 Movement graphs for Section B, duets 1 and 6. [Two panels, ‘Flute and clarinet duet 1’ and
‘Flute and clarinet duet 6’, each plotting the vertical position of the flute and the clarinet in
metres against time in seconds, with events I, i, J, K, L, l(1), l(2), M and N marked.]
26.6 Conclusions
26.6.1 Summary of the practical investigations
Linking the two sets of practical investigations reported above to the ideas explored at the start of
the chapter, we have provided evidence to help us establish that the body has a critical role in the
construction and communication of music. Amy shows the role of social and cultural frameworks
and how these impact on how the performer expresses the music through the body. We see
that in terms of both the construction and execution of the musical performance some sort of
intimate personal disclosure occurs. With Deborah and Leah, we looked at how each of the
women first established a satisfying solo version, and then collaborated intermusically and inter-
personally as they discussed, played and moved towards a final ‘performance’. The talk-aloud
pre- and post-playing comments made by the women enabled us to realize that Deborah’s more
forceful leadership literally shaped Leah’s movements in the production of musical effects; again
we see the role of social dynamics shaping the musical product. Additionally, the study showed
both that the musical structure elicited movement effects – dance-like coordination as they
rehearsed – and that the body created musical continuity and a sense of achieving wholeness. For
example, the coordinated movement of flute with clarinet in section B of both duets 1 and 6
seemed to help the flautist experience the movement of the on- and off-beat playing of the clar-
inet against her. The exploration has shown us how the two women negotiated their physical
space in order to produce a coordinated musical work. Both Amy’s study and that of Deborah and Leah
reveal an additional element we have not yet discussed: namely that all three engage in rotational
swaying movements. The importance of this aspect is explored below.
26.6.2 Theoretical possibilities

Biomechanical enquiries by James Cutting and his colleagues (Cutting and Kozlowski 1977;
Cutting et al. 1978; Cutting and Proffitt 1981) refer to a centre of moment for physical expression.
According to this theory, there is a central point about which all other movements operate—
swinging, swaying, rotating—which reveals information about intention. Davidson (1997,
2002b) has argued for the possibility of a centre of moment for the physical expression of musical
intention, since it is well-established that movements specify their causes.
Davidson and Dawson (1995) attempted to control a music learning situation (pianists learn-
ing a piece of a two octave range which did not require any shifts in body position from the
normal still upright sitting position) by introducing a learning condition which included con-
straining the pianists from moving their torsos during sight-reading, learning and performance.
The results revealed the final performances to be far less musically expressive and ‘meaningful’
than those where the performers were allowed to move their torsos, and so sway and rotate freely.
The upper torso swaying or rotation we observed in Amy, Deborah and Leah’s performances may
indeed be a core element of generating musical expression (see Davidson 2005 for a fuller discussion
of this idea). Furthermore, if we link back to some of the mutual attuning behaviours
between infants and caregivers which promote adaptation to the social world (Malloch 1999),
we note that part of this behaviour includes rocking, swaying, rotating in synchrony with the
‘conversational’ exchanges. Through observation by the second author, it appears that when
mothers are suffering from post-natal depression, they tend not to engage in these sorts of natu-
ral bodily swaying actions or accompanying vocalizations (and see Gratier and Danon, Chapter 14,
this volume, for further discussion of potential deleterious effects on a mother’s musicality from
her environment and mental state as well as Marwick and Murray, Chapter 13, this volume). We
might theorize that the types of performance movements we have observed in this chapter are
indeed a central part of communicative musicality as it manifests itself in adult professional level
performance. These movements comprise technical movement for execution of the music on a
particular instrument; and integrated with this functional movement are intrapersonal expres-
sive movements (Amy’s gentle adaptive movements perhaps being as much for self-expression as
for external communication); and there are also interpersonal movements for communication
between co-performers, and between co-performers and audience.
Considering that the investigative work we have described in this chapter also highlights the role
of sociocultural factors such as etiquette, gestural emblems, illustrative and display movements,
it might be argued that humans have developed elaborate social and cultural codes to increase
the scope of communicative musicality from its biological adaptive functions to more abstract
and aesthetic levels. Indeed, one area for future research is to elaborate a theory of communica-
tive musicality to embrace social and cultural codes of behaviour—the ‘protohabitus’ proposed
by Gratier and Danon (Chapter 14, Section 14.2.2, this volume). With this said, a crucial point to
be made in the context of this chapter is that performers may benefit from being aware of the
communicative/adaptive function of the movements necessary to create their musical output,
and they should neither ignore nor attempt to suppress the possibilities of the movement. In fact,
given the necessity of movement to create a performance (operating within social and cultural
frameworks) it seems ironic that teachers like Pierce, Guile and others need to explore deliberate
body movement to illustrate and develop ideas for students for their musical expression (though
see Rodrigues et al., Chapter 27, Section 27.3, this volume, for imaginative use of movement for
musical understanding). But just as physical constraint can damage the expressive potential of a
music performance, so can an excess of movement. The performances of the jazz pianist Keith
Jarrett provide an interesting final case for reflection. Without doubt, Jarrett has phenomenal
expertise, but is a source of great controversy with regard to the function of bodily posture and
gesture. He makes many extreme gestures and adopts strange postures at the keyboard, often
shouting and grunting as he plays. He has stated that he would neither be able nor want to
produce his performances differently. Indeed, as an improviser, he says that the way he produces
the music through his body is for him a ‘shadow of an attempt’ to represent in sound what he
hears in his mind (see Elsdon 2003, 2006). For Jarrett this is the only way to play. However, he has
been highly criticized for his extraordinary physical movements and bizarre vocalizations, which
many find detract from the musical content. This is a fascinating case: for Jarrett the very particular
gestures and sounds he makes are a part of his representation of the music, and so
arguably essential to the musical improvisation. However, those onlookers and listeners who
object seem to want him to comply with what other performers do on stage. Thus, the
skills involved in music production do seem to vary, and whether an audience is present or not
might affect the way in which the music is conceived and then produced through the body, and
sociocultural influences seem to have a crucial role in the presentation and reception of the musi-
cal performance.
We conclude by saying that, for us, the core of a good musical performance depends on the
communicative musicality inherent to the performer and expressed through the body, with integrated
mind-body thought and action processes producing a holistic and meaningful result for
both the performer and the receiver of the musical message.
Acknowledgements
Research reported here on tracking musicians’ movements was funded by a research grant from
the University of Western Sydney.
References
Ayers B (1973). Effects of infant carrying practices on rhythm in music. Ethos, 1(4), 387–404.
through maturity. Aaron Asher Books/Harper Collins Publisher, New York.
Clarke EF and Davidson JW (1998). The body in music as mediator between knowledge and action.
In W Thomas, ed., Composition, performance, reception: Studies in the creative process in music,
Cox A (2001). The mimetic hypothesis and embodied musical meaning. Musicae Scientiae, 5(2), 195–212.
Cox A (2006). Hearing, feeling, grasping gestures. In A Gritten and E King, eds, Music and Gesture,
pp. 45–60. Ashgate, Aldershot, UK.
Cutting JE and Kozlowski LT (1977). Recognising friends by their walk: Gait perception without
familiarity cues. Bulletin of the Psychonomic Society, 9, 353–356.
Cutting JE and Proffitt DR (1981). Gait perception as an example of how we may perceive events.
In RD Walk and HL Pick, eds, Intersensory Perception and Sensory Integration, pp. 249–279. Plenum,
New York.
Cutting JE, Proffitt DR and Kozlowski LT (1978). A biomechanical invariant for gait perception.
Journal of Experimental Psychology: Human Perception and Performance, 4, 357–372.
Davidson JW (1997). The social psychology of performance. In DJ Hargreaves and AC North, eds, The
social psychology of music, pp. 209–226. Oxford University Press, Oxford.
Davidson JW (2002a). The performer’s identity. In R MacDonald, D Miell and DJ Hargreaves, eds, Musical
identities, pp. 97–116. Oxford University Press, Oxford.
Davidson JW (2002b). Understanding the expressive movements of a solo pianist. Musikpsychologie,
16, 9–31.
Davidson JW (2005). Bodily communication in musical performance. In D Miell, DJ Hargreaves and
R MacDonald, eds, Musical communication, pp. 215–238. Oxford University Press, New York.
Davidson JW (2007). Qualitative insights into the use of expressive body movement in solo piano
performance: A case study approach. Psychology of Music, 35(3), 381–401.
Davidson JW and Correia JS (2002). Body movement in performance. In R Parncutt and GE McPherson,
eds, The science and psychology of music performance: Creative strategies for teaching and learning,
Davidson JW and Coulam A (2006). Exploring jazz and classical solo singing performance behaviours:
A preliminary step towards understanding performer creativity. In G Wiggins and I Deliege, eds,
Musical creativity: Current research in theory and practice, pp. 181–199. Oxford University Press,
New York.
Davidson JW and Dawson JC (1995). The development of expression in body movement during learning
in piano performance. Conference Proceedings of Music Perception and Cognition Conference,
p. 31. University of California, Berkeley, CA.
Elsdon P (2003). Keith Jarrett and the muse. Conference proceedings, International Conference on Music and
Gesture, p. 35. University of East Anglia, August.
Elsdon P (2006). Listening in the gaze: The body in Keith Jarrett’s solo piano improvisations. In A Gritten
and E King, eds, Music and gesture, pp. 192–207. Ashgate, Aldershot.
Fernald A and Mazzie C (1991). Prosody and focus in speech to infants and adults. Developmental
Gordon E (1987). The nature, description, measurement and evaluation of music aptitudes. Basic Books,
New York.
Guile L (2000). The expressive world of flux! In C Woods, G Luck, R Brochard, F Seddon and J Sloboda,
eds, Conference Proceedings, 6th International Conference on Music Perception and Cognition,
p. 3. Keele University, Department of Psychology.
Hatten R (2006). A theory of musical gesture and its application to Beethoven and Schubert. In A Gritten
and E King, eds, Music and gesture, pp. 1–23. Ashgate, Aldershot, UK.
Johnson M (1987). The body in the mind: The bodily basis of meaning, imagination and reason.
University of Chicago Press, Chicago, IL.
Johnson M and Larson S (2003). ‘Something in the Way She Moves’—metaphors of musical motion.
Metaphor and Symbol, 18 (2), 63–84.
Kitamura C and Burnham D (2003). Pitch and communicative intent in mother’s speech: Adjustments for
age and sex in the first year. Infancy, 4(1), 85–110.
Peter Lang, Bern.
Kurosawa K and Davidson JW (2005). Nonverbal interaction in popular performance: A case study of The
Corrs. Musicae Scientiae, 19, 111–136.
Laban R (1960). The art of movement and dance, 2nd edn. Macdonald and Evans, London.
Lakoff G and Johnson M (1999). Philosophy in the flesh: The embodied mind and its challenge to Western
Lehmann AC and Davidson JW (2002). Taking an acquired skills perspective on music performance.
In R Colwell and C Richardson, eds, Second handbook on music teaching and learning, pp. 542–560.
Lehmann AC and Ericsson KA (1997). Expert pianists’ mental representations: Evidence from successful
adaptation to unexpected performance demands. In A Gabrielsson, ed., Proceedings of the Third
Triennial ESCOM Conference, pp. 165–169. Uppsala University, Uppsala, Sweden.
Lidov D (1987). Mind and body in music. Semiotica, 66, 69–97.
Lidov D (2006). The emotive gesture in music and its contraries. In A Gritten and E King, eds, Music and
gesture, pp. 24–22. Ashgate, Aldershot, UK.
1999–2000), 29–57.
Malloch S (2004). The infant reaches out: The communicative functions of adult–infant vocalisations and
gestures. 9th World Congress of the World Association for Infant Mental Health, Melbourne,
14–17 January.
Malloch S (2005). Why do we like to dance and sing? In R Grove, C Stevens and S McKechnie, eds,
Thinking in four dimensions: Creativity and cognition in contemporary dance, pp. 14–28. Melbourne
University Press, Melbourne.
Murray L and Cooper PJ (1997). The role of infant and maternal factors in postpartum depression,
mother-infant interactions, and infant outcomes. In L Murray and PJ Cooper, eds, Postpartum
depression and child development, pp. 111–135. Guilford Press, New York.
Pierce A (1994). Developing Schenkerian hearing and performing. Integral, 8, 51–123.
Pierce A (2003). Letting gesture through: The practice of reverberation. Paper presented at the International
Conference on Music and Gesture. University of East Anglia, August.
Robb L (1999). Emotional musicality in mother-infant vocal affect, and an acoustic study of postnatal
Sloboda JA (1991). Music structure and emotional response: Some empirical findings. Psychology of
Music, 19, 110–120.
Sloboda JA (1985). The musical mind. Clarendon Press, Oxford.
Snow CE (1977). The development of conversation between mothers and babies. Journal of Child
Language, 4, 1–22.
Snow CE (1989). Understanding social interaction and language acquisition: Sentences are not enough.
In MH Bornstein and JS Bruner, eds, Human interaction, pp. 83–103. Lawrence Erlbaum Associates,
Mahwah, NJ.
Stevens C, Malloch S and McKechnie S (2001). Moving mind: The cognitive psychology of contemporary
dance. Brolga, 15, 55–67.
Todd NP McA (1995). The kinematics of musical expression. Journal of the Acoustical Society of America,
97, 1940–1949.
Todd NP McA (1999). Motion and music: A neurobiological perspective. Music Perception, 17(1), 115–126.
Trevarthen C (1986). Development of intersubjective motor control in infants. In MG Wade and
HTA Whiting, eds, Motor development in children: Aspects of coordination and control, pp. 209–261.
Martinus Nijhof, Dordrecht, The Netherlands.
Trevarthen C (2001a). Intrinsic motives for companionship in understanding: Their origin, development
and significance for infant mental health. Infant Mental Health Journal, 22(1–2), 95–131.
Trevarthen C (2001b). The neurobiology of early communication: intersubjective regulations in human
Trevarthen C and Aitken K (2001). Infant intersubjectivity: Research, theory, and clinical applications.
Journal of Child Psychology and Psychiatry, 42(1), 3–48.
Watt R and Ash R (1998). A psychological investigation of meaning in music. Musicae Scientiae,
II, 33–53.
Chapter 27
Communicative musicality as creative
participation: From early childhood to
advanced performance
Helena Maria Rodrigues, Paulo Maria Rodrigues and
Jorge Salgado Correia
27.1 Introduction: experiences in musical expression

This chapter reports two very different artistic experiences, and uses them to explore fundamen-
tal ideas about communication, music and human behaviour. These two experiences were
created in different contexts, with participants of widely different age and background, were
aimed at different goals, and were conceived independently by the authors of this chapter.
Despite these differences, the experiences have one important feature in common: both sought to
stimulate creative musical artistry and artistic production.
The first experience is Bebé Babá, a musical project involving babies and their parents. Since it
was created in 2001, Bebé Babá has been realized several times. The reflections of Helena Maria
Rodrigues and Paulo Maria Rodrigues are a summary of these experiences and focus on theatri-
cal communication between parents and babies, with music as the medium.
The second experience concerns advanced young performers learning a musical instrument, and how
the body influences their capacity to express themselves musically. The progress of a group of
students under the influence of their teacher, Jorge Salgado Correia, using strategies that focus on
embodying musical meaning, was followed until the progress of their work was formally assessed
by examination. This is the point of departure for reflection on the nature of music communica-
tion in advanced instrumental performance.
The authors believe that human communication is rooted in ancestral musical expressive ele-
ments that can be found in high-level musical performance, in amateur music performance, in
interactions between infants and their caregivers, and even in everyday verbal and non-verbal
interactions between people sharing their interests and purposes. Both of the artistic experiences,
discussed in detail below, raise questions about the intrinsic nature of musical communication.
27.2 Bebé Babá, an educational and artistic project with babies
and their parents
Helena Maria Rodrigues and Paulo Maria Rodrigues
27.2.1 From ideas to practice
The main idea of Bebé Babá is to unite education and performance in a process that places at its
centre music, babies and their parents (Rodrigues and Rodrigues 2004). It was conceived by
586 HELENA MARIA RODRIGUES, PAULO MARIA RODRIGUES AND JORGE SALGADO CORREIA
Fig. 27.1 In Bebé Babá both parents and babies are fully involved in the production. (See also
colour plate 7.)
Companhia de Música Teatral (www.musicateatral.com), a Portuguese music group that has pro-
duced several interdisciplinary artistic projects in recent years. We first presented Bebé Babá in
2001, in Teatro Viriato, Viseu, and have since staged it at many other venues, always with enthusi-
astic reception from audiences and participants. Bebé is the Portuguese word for baby; babá is a
sound that often occurs in infant babbling in many cultures, and the word that Brazilians use for
‘babysitter’. It is also a sound very close to papá and mamã, informal Portuguese words for father
and mother.
The first aim of this project was to create and develop a show for and with babies in which their
parents would also be fully involved (Figure 27.1). Our aim, we emphasize, was not to provide a
ready-made show—not just to entertain; instead, we wanted to create music and play games with
the babies in cooperation with their parents, stimulating active group participation and creativ-
ity. We provide the conditions, but the parents themselves have to be the artists for their own
children. From the beginning, Bebé Babá was conceived as a communication game using a variety
of supports, integrating music, sounds, visuals, tactile sensations, movement and language. The
important issue is that each parent enjoy his or her baby’s companionship through music.
A second aim was to invite the general public to share this experience, by being present at a
final performance/presentation. In this way, we intended to share with other people a way to
make music with babies and their parents.
We conceived Bebé Babá as a ‘chain of shows’. The babies would be amused by their parents
through music, movement and toys; the parents would be amused by the pleasure, responses and
reactions of their babies; the public would be the spectator of both of these intimate shows. In a
certain way, we were not trying to create a final show: all of the moments in Bebé Babá are little
shows. The final performance is just one of those moments that we decide to share with outside
observers.
A typical Bebé Babá project starts with a series of workshops during four weekends—some for
babies and their parents, others just for parents. Normally, we have about 15 dyads participating
(father or mother and his or her baby). The ages of the babies range from just a few weeks up to
2 years. The members of Companhia de Música Teatral who have their own babies bring them as
well; this helps us to model and share our own experience with the participants.
The workshops for parents with their babies are inspired by childhood musical guidance ses-
sions based on Edwin Gordon’s music learning theory (Gordon 1990). We thought that we could
adopt his learning principles to maintain the children’s interest during a musical performance.
Our choice of musical material follows Gordon’s idea that the babies’ interest is best nurtured by
short musical fragments, presenting both variety and repetition. The songs and chants that we
use include a variety of metres and tonalities, and present several kinds of expression and charac-
ter. We also use a variety of original material composed according to these guidelines, plus
Portuguese traditional songs and nursery rhymes. Some of the songs and chants are composed of
meaningless syllables, using simple tonal and rhythmic patterns. We also ask parents to bring
their ‘private sound’ games, and ask that they share them with everybody. We encourage each
parent to develop their particular way to communicate affectionately and musically with their
baby. We regard each of these workshops as a little theatre in which every parent performs for
their child.
The workshops for parents consist of movement activities, exploration and creative games
with puppets, props and materials that we provide, and the practice of musical materials
we select. From early on, we establish an atmosphere of trust and togetherness. The movement
activities have an important role in this respect, and provide general awareness of body and
space, relaxation and energy. The more exploratory and creative activities, besides contributing
to the general good mood of the group and helping to overcome inhibition, are a source of
original ideas that may be shared and used with the babies, and eventually incorporated in the
final performance.
Part of the workshops is concerned with both learning the musical material we bring in and
sharing ‘private vocal play’ that parents normally invent for their children in their own homes.
The workshops are also a space for discussion about music and babies. At the beginning of the
process, there is no emphasis at all on a final presentation. We keep its shape open until quite late
in the process. However, we do discuss the idea of such a presentation during the workshops and
promote the emergence of new ideas, so that every Bebé Babá has unique aspects that depend on
the inventions of the particular set of parents. The workshops with the parents provide basic
musical and stage skills as well as a strong sense of community, an environment for creativity, and
a place for discussion about how to organize a final performance that an audience will watch.
By the end of a month, the group has given shape to the material we provided, and we agree on
basic principles to be observed during performance. We record video images during the work-
shops and, at a final stage, these are included in a collage that is both a report of the process
babies and parents have gone through, and an artistic appraisal of the ways parents and babies
communicate through music. These images are projected onto a screen during the performance
accompanied by sounds triggered by the babies using an interface we developed that allows very
young children to interact with a computer. The atmosphere on the stage on the two days of pub-
lic performances is very much like that of a usual session with the babies and parents and our
company of actors/performers. The lights, projections and the overall organization of the space
on the stage, rather than being a stressful, unfamiliar experience, contribute to a general sense of
organization and cooperation, and establish a very friendly environment. This works like a kind
of game that an audience enjoys observing, because it is beautiful, loving and authentic. The pub-
lic performance/presentation is thus regarded as the continuation of the workshop experiences.
In fact, we do not plan the final presentation as a performance. We just believe that the truth of
such a loving atmosphere is strong enough to captivate the public’s attention.
27.2.2 From practice to reflection: creating a ‘lap’ for musical culture

Bebé Babá came about from the confluence of educative and artistic circumstances. There was an
initial link between the creative philosophy and professional experience of Companhia de Música
Teatral in guiding early childhood musical sessions according to music learning theory. The idea
for the project arose during the observation of Edwin Gordon’s musical guidance sessions with
babies and their parents in Lisbon. Like him, we believe that the roots for musical competence in
a cultural tradition are established in early childhood and, therefore, that it is important to
expose children to their musical culture from birth onwards.
Other influences have nurtured the rationale of Bebé Babá. For instance, a reference for us is
the work of Colwyn Trevarthen, whose use of Stephen Malloch’s concept of communicative
musicality we quote in the title of this chapter (Malloch 1999; Trevarthen 1999; Trevarthen and
Malloch 2002). Hanuš and Mechthild Papoušek’s ideas also greatly inspire us with respect to
what constitutes early musicality and human communication (H Papoušek 1995, 1996;
M Papoušek 1996). We believe that certain forms of education reduce, impoverish and constrict
natural human development, and we agree with the idea that
for the infancy period, it may be advisable not to disturb the earliest forms of intuitive musical stimu-
lation by rationally guided artificial manipulations and formal educational interventions, but to keep
them concealed as a precious part of early parent–infant relationships.
M Papoušek (1996, p. 108)
On the other hand, we also think that the transmission of a certain form of musical culture is
an unavoidable consequence of living in human society. From birth, or even before, a mother
tongue or a type of food characteristic of children’s culture is ‘imposed’ on them. Babies live in
the lap of their culture. It has been shown that the musical environment influences very early
infant ‘musical productions’ (Reigado 2007; Rocha 2007). Thus, we should not miss the opportu-
nity to offer children the best musical lap we can (Figure 27.2).
Fig. 27.2 Infants are born musical. We must offer them the lap of our musical culture.
As Mechthild Papoušek reports, ‘in present industrialized societies, parental tendencies to sing
and dance have decreased in many families’ (M Papoušek 1996, p. 89). Very often in our work, we
encounter parents who are unaware of their own capacities and are surprised by the ‘skilled’
behaviour of their children, rediscovering with us the pleasure of being with them through
music. Therefore, we look at our project as a kind of social compensation mechanism in an
industrialized society.
Promoting the parents’ participation is an essential aim for us and, in the course of conducting
this project, we found many mechanisms that encourage parents to join in. For example, we use
tonal and rhythmic patterns easily imitated by musically untrained adults, and ostinati (repeating
musical patterns) that are easy to maintain within a group; we use songs with humorous lyrics
that appeal to the parents. We believe that it is important to take care of the parents’ well-being,
because it indirectly affects the well-being of the children.
In certain respects, we have adapted principles of Carl Rogers’s non-directive, person-centred
therapy to this artistic situation (Rogers 1961); we feel that the free and empathic atmosphere we
develop in the workshops is a very strong influence. We prefer to stimulate spontaneous participation
rather than to impose order. We believe that in the right atmosphere, everybody’s musicianship can
emerge (Fröhlich, Chapter 22, and Custodero, Chapter 23, this volume). Bebé Babá is an oppor-
tunity for adults and babies to be exposed to a rich musical environment, creating a community
through music. It is a lap of musical culture.
27.2.3 Observing interactions between parents and babies

When we first developed the idea for this project, we were mainly interested in its artistic potential.
With time, we became interested in understanding more general issues related to music and commu-
nication, and we realized that Bebé Babá provided very interesting opportunities for the observation
of the rich variety of interactions between parents and babies. We do not regard it as a ‘baby lab’, but
we believe our empirical work has brought up many interesting ideas worthy of further study.
For instance, we verified for ourselves that the human voice really is the best musical medium
to interact with babies. To keep babies’ attention, we found that we had to alternate songs that
were accompanied by piano with solo songs, and that intervals of silence between the songs were
essential to keep babies interested. Many of the babies’ musical responses occurred during these
‘open’ moments. The babies’ responses were similar to those that we often obtained in a regular,
early-childhood musical guidance session. These data, which confirm some of the principles of
Gordon’s music learning theory, led us progressively to regard the project as a natural context for
the observation of musical development of babies.
We observed that all children react to music, and often enjoy it. We have no doubts that music
has a great effect on babies from birth. At the start of our project, babies are very attentive and
watchful; at the end, they are fully involved and willing to participate much more, taking their
own initiatives. Even young babies contribute actively while they make efforts to be part of the
activities we promote (see Bradley, Chapter 12, this volume, for an account of babies’ rich social
interactions with each other).
We see many cues from the infants which tell us that the babies recognize musical fragments
they have heard before. Some show they recognize melodies even when they are presented in
another key, or at another tempo, or when they are inserted into another musical context. Infants
will identify a piece of music using words or gestures. Often, infants will even ask for a specific
song using sounds or gestures, or will complain if a particular song they like is not sung.
They seem to absorb music all the time, even when it appears they are not attending to it. In the
final stages of our workshops, some children appear able to anticipate the next musical excerpt,
and thus they become the natural leaders of that performance. Some children appear to display
a natural ‘stage instinct’ that, in an adult, we would recognize as the characteristics of a great per-
former. All children appear to have a very strong need to ‘belong’ in a live experience with others.
We have received feedback demonstrating that this is a remarkable experience for babies, par-
ents and the general public. We believe that music is a very special channel of communication
between human beings: it is often through music, or music-like interaction, that the early bonds
are established between parents and babies. Throughout the process of Bebé Babá, we have
observed changes in the way babies and parents relate to music and in the way they relate to each
other. Gradually, music becomes a channel of communication and we observe the development
of bonds between all adults and all babies, and everyone in the group comes closer and closer
with time (Figure 27.3). At the end of one of the projects, one adult participant said, ‘we became
mothers and fathers of every baby; we are no longer the mother or father of just our own child’.
Does music itself work as a kind of ‘string’ connecting people? Is it just a question of enhancing
interpersonal skills and group dynamics? Or is our team ‘socially affected’ by music, so we com-
municate our enthusiasm to the participants?
The minutes that precede the beginning of the workshops, as well as the time when the partic-
ipants stay after the show is finished, are very interesting moments for observing how parents
interact with babies in a natural (non-directed) way. Everyone seems to use strategies that
emphasize the musical characteristics of spoken language, sometimes by means that are difficult
to classify as either language or music. There is much in common between the particular
instances of these more expressive modes of conversation: the type of patterns used, the inflec-
tions in the pitch, the tempo, the use of repetition, and the physical movements.
There is also great diversity of personal expression when we look in more detail and compare
how different dyads interact. Every parent/child seems to develop a repertoire, a vocabulary, that
has its own ‘signature’. Other dyads can understand this specific vocabulary, but within each dyad
is a very private intimacy. During the course of a project, we observe that parents and babies
develop new ways of playing and interacting non-verbally. Normally, this is not a repetition of
what they learn in the workshops, but a development of their own ideas. The work we do seems
to catalyse the innate potential they already had. We believe the intentionality of the babies’
contributions to the music, and the communication and social skills that babies show, challenge
traditional conceptions of child development (see Gratier and Danon, Chapter 14, this volume).
Fig. 27.3 Mother and baby making sound as a team, in attunement. (See also colour plate 7.)
We observe how important parents’ participation is: babies imitate and follow their parents as
their ‘guru’ (in Figure 27.3, is it not easy to guess that the lady with the white top is the mother?
What are the visual clues that tell us?). The way babies react to their father’s or mother’s singing is
different from the way they react to another adult’s singing. We have no doubt that at this age
children look to their parents as a model, while the parents are stimulated by their babies’ appre-
ciation; if a parent sings or moves, that means, for their baby, that singing and moving are a good
thing. The children also seek approval from their parents. Observing how amused and quiet the
babies are in our workshops, we believe pediatricians should add a new reason why sometimes
babies cry and seem to be unsatisfied: because they are bored. They want stimuli and excitement
in their lives—something more than just eating and resting.
In summary, the observations we have made in this project challenge our received knowledge
about the musical development of babies, as well as our ideas about their social skills. We dare to
suggest that through scientific investigation, humans may have come to know more about the
behaviour of apes, dolphins and whales than about their own offspring!
27.2.4 Creating an ecological context for research

Studying child development in experimentally controlled conditions satisfies the need for rigour
in collecting factual evidence, but we believe the often very artificial conditions can lead to mis-
leading interpretations of the results. The behaviour observed in such circumstances, often far
from the natural or familiar situation, can be influenced by the experimental design. Studying
spontaneous human behaviour, which is so complex in its motivations and has so many
elements, requires focused observation and measurement so comparisons can be made.
We think that the artistic model we have developed might be used as a context for scientific
research that bridges this gap between controlled experimentation and naturalistic observation.
For example, rather than looking specifically for volunteers to act as participants for a prescribed
test, perhaps it is more ‘natural’ to invite the participants of a self-generating project such as ours,
which genuinely interests families, to collaborate in research, suggesting to parents that they
might participate in a study directly related to the musical and artistic work of Bebé Babá. Our
experience tells us that the parents are likely to regard such an invitation as a generous offer, and
to be very willing to collaborate. The researcher is then allowed to enter into a richly active event
in which he or she has some control over the conditions of observation and measurement. In
other words, an artistic project such as this can work as a natural and ecologically valid ‘baby-lab’;
participants can be easily recruited, either for musical purposes or for the investigation of other
related questions. Given the mobility and adaptability of our kind of project, it is possible to
regard it as a travelling laboratory—one that easily allows the replication of experiments in dif-
ferent cultural contexts.
Bebé Babá gives us the opportunity to make observations before and after the workshops, and
we are easily able to achieve parental collaboration in this. For example, in one of our initiatives,
we had the parents write diaries of what their babies did, making observations throughout the
week. Bebé Babá is also a learning situation that allows the ‘testing’ of stimuli for learning, but we
came to understand that the motives of the child must lead the way. For instance, in a study in
which we wanted to determine if children aged 1-and-a-half to 2 years old could recognize a song
by its words or melody, we tried to teach a gesture response (waving the hand) to a certain song.
We were not successful in establishing this conditioned response. However, we noticed that
little by little, on hearing just that song, Manuel, a 1-and-a-half year old, would walk to the
door pulling his mother with him. We gave up trying to train a response invented by us, and
instead took note of the clue that was being naturally given by the child, which proved he did
recognize the song.
Fig. 27.4 Babies are sociable, enjoying festive activities in a group.
This kind of attention to detail in what happens spontaneously has radically changed the way
we conceptualize research with children. Rather than trying to obtain a given kind of response
from the child (or even more difficult, consistent pre-specified responses), we believe a researcher
should attempt to read the observational facts, ask whether they can be regarded as informative
responses given freely by the child, and then manipulate variables after this kind of behaviour has
been established.
Bebé Babá has contributed greatly to our understanding of musical development in early child-
hood, and of communication between children and adults. However, thus far its aims have been
primarily artistic and educative. By choosing to proceed more systematically, we believe that this
model can be very fruitful for scientific research into musical development and social interaction
in early childhood (Figure 27.4). We think that it will not be difficult to integrate study of a spe-
cific question into the general artistic situation. We have the potential to integrate objectivity of
observation in an intersubjective reality.
27.2.5 Between art and therapy

As artists looking at society in a systems-sensitive way, we enjoy conceiving art at its critical point
of origin, in its creation as a product of the dynamic social system. We do believe, too, that art can
do much for people, improving their way of living together. The best therapy for personal or
social malaise is to promote a self-realized spontaneity of well-being in others’ company, dimin-
ishing the need for an imposed remedial treatment of the individual.
By listening to participants’ feedback, we discovered that our work within Bebé Babá had a
communicative and therapeutic side, at least in the self-developmental meaning of therapy. We
discovered that participation in the shows engendered a feeling of being supported and created a
relaxed interpersonal space between the participants. This allowed the parents to share their
paternal and maternal feelings and concerns with other parents. Surprisingly, we often observed
adults playing and behaving as children, which led us to consider that this kind of return to child-
hood could be explored for purposes other than musical. The music-making by the group and
the general non-directive attitude of the artistic team generated human dynamics which enabled
positive changes in communication and relationships between people, and promoted psycholog-
ical well-being. This was not just a musical effect: beneficial power came as well from the artistic
team’s beliefs and skills, and from everybody’s positive intent and collaboration.
There were particular situations in which the creative group revealed itself to be particularly
effective in providing social support for the relationship between babies and parents. This was the
case with three adolescent mothers participating in one of the projects: we were told that Bebé
Babá helped them to develop their mothering skills. This was also the case with a baby who was
diagnosed with an autistic spectrum disorder: his father praised Bebé Babá for providing him
with a rich and ‘festive’ environment in which to be with his baby. Finally, there is the case of a
child who had been born premature and was showing signs of developmental delay. Particularly
touching was the moment when this developmentally delayed child made her first steps during
Bebé Babá’s final presentation, her mother saying emotionally, ‘a session of Bebé Babá helped my
child more than ten sessions of therapy!’ We have no doubt that the contact with other parents
and babies and the relaxing and playful atmosphere stimulated the global development of all
these children, as well as encouraging a positive and hopeful attitude in their parents.
We believe that this project put us in contact with the positive essence of art. We feel some art is
destructive, showing and appealing to the darker sides of human beings, reflecting a violent
world. By contrast, we choose to strive for optimism and affection in our art. As artists, we prefer
to offer dreams and to bring light to people’s lives. We were inspired by a poem by Herberto
Helder, which we quoted in the final presentation of the video of Bebé Babá:
E por dentro do amor,
até somente ser possível amar tudo,
e ser possível tudo ser reencontrado por dentro do amor.
Helder (1969)
[And inside love,
until loving everything is the only possibility,
and having been made possible, everything is found again inside love.]
We believe that Bebé Babá’s therapeutic effects are powerful because it is conceived as an artis-
tic experience, not as therapy. Good art in itself is therapeutic. We dare to say that if good art is
provided, there is almost no need for ‘art therapy’; more simply, if we promote well-being, there
is less place for disturbance.
27.2.6 A metaphor for a richer education

Our society is facing what has been called a ‘school crisis’ (Rodrigues and Rodrigues
2006). Increasingly, school resembles an anachronistic institution, less and less capable of com-
peting with other sources of information, such as the mass media or the Internet, both of which
are readily available to very young children. At the same time, school is unable to perform the
social roles that are ascribed to it. Many schools are perceived to have more social and economic
functions than strictly pedagogical. The institution of compulsory schooling strongly regulates
the work market: it guarantees employment for those who work within it, and allows others to
work by taking care of their children (Rodrigues 2004).
In our country, Portugal, we have personally witnessed a growing dependence on school as the
place where children learn, at the same time as education in the family has been losing impor-
tance. It is therefore not surprising that there is a struggle for power and responsibility between
school and family. Besides the increase in the time allocated for compulsory education for older
children, many of our infants enter nursery school by the age of 4 months, in many cases staying
there for around 10 hours a day. Attempts have been made to solve our problems with school
with yet more schooling, which we believe is taking the crisis to new depths.
In the future, we believe families will have to recover their responsibility for the process of
educating children. This recovery will need to start from birth, as it always has done, by giving
parents opportunities to follow their children’s learning and to share with others their experience
of parenthood and the knowledge they gain along the way. However, we feel that after years of
‘dumping’ children in nursery schools and kindergartens, we cannot expect that families will sud-
denly change their attitudes.
Having had the opportunity to create our artistic project in several circumstances, we have
witnessed a range of responses in the parents. We believe these responses might be related to the
degree to which parents are or are not involved in taking care of their children on a daily basis,
and in sharing experiences, purposes and concerns with their children. Whereas some of the
mothers and fathers understood immediately the project’s philosophy, assumed their responsibil-
ity, and took an active role, others seemed to have an expectation they would deliver their chil-
dren to us so we would entertain them, and they could forget their responsibility to take care of
them. In these cases, we encouraged a change in attitude, and gradually observed a reinvestment
from parents as they discovered unknown potential in themselves and in their children
for enjoyment of shared activity.
For many families, joint activities for parents and children are increasingly scarce. In Portugal,
the state provides few structures that support family activities. We believe that problems such as
learning failure, and a lack of discipline and of a sense of security in children at school can be
related to family dysfunction; thus, we conclude that the structures and the conception of cre-
ative group activities such as Bebé Babá are not a luxury, but are becoming a necessity; they are
compensatory in the social network we live in, in which children often lack individual attention
and families lack support.
The survival of school as a benefit for society depends not on strengthening its power, but on
social reorganization aimed at promoting the quality of life of families. It is essential that we
develop parallel education strategies that work as an alternative to school and diminish the power
that it cannot use effectively. Creating opportunities for education and artistic enjoyment from
early childhood is a way to develop committed support within families and to help them to share
educational resources. Early childhood is a special time for both the development of affectionate
attachment between parents and their children, and for sharing discovery and learning. We have
been able to demonstrate how creative musical activities can help make a community, promoting
a sense of belonging, and providing context and opportunities for the nurturing of relationships.
Bebé Babá has been a loving experience that generated many emotions, feelings and instincts
related to motherhood and fatherhood. We believe that to take care of someone makes one feel
better. Bebé Babá is mainly a story about life—birth and death—and love. One of the main con-
tributions of the project is to demonstrate how education can be linked with artistic performance,
and to show that musical interaction that stimulates an eagerness for learning is possible from
birth onwards. Besides being a form of entertainment, this project can be both educative and
artistic, and can make a major contribution to musical acculturation. We believe our work is
helping to demonstrate the importance of exposing children to music from birth, and showing
how that can occur.
The model of Bebé Babá includes time for the adults to relax in a supporting environment,
in an ‘affirmation matrix’ (Freeland et al. 1998), where young parents can share and confirm
aspects of their new and changing identity. It is important to take care of the parents as well as
the children. Our society has lost the social support that used to help parents to assume their new
roles. In Bebé Babá, parents engage themselves in a musical experience similar to the one in
which their children participate; this allows parents a better understanding of the meaning and
importance of shared cultural activity. We believe we have found a strategy that can connect par-
ents and schools and engage them in the common objective of education. Projects like Bebé Babá
may help us to discover fertile interfaces between education, art, therapy and community work.
This may help us to find a new paradigm for school and for the organization of society.
27.2.7 Inspiration for a theory

Working on Bebé Babá and reading authors such as Trevarthen, Stern, Gordon, Trehub, Kuhl,
Hanuš and Mechthild Papoušek, among others, has shaped our ideas about communication,
music and human behaviour. However, it is difficult to rigorously identify how this has happened.
What we propose as a theory of theatre art for infants and toddlers is not really ours: it is like foot-
notes on what other people have said in recent years, a joining of fragments that are dispersed in
the cosmos of ideas. We are trying to establish a framework for a likely ‘version of the facts’.
Bebé Babá has been an opportunity to test those ideas and a source of inspiration for them.
27.2.8 Music and language: emerging from the same source

Infants are, as we see it, born musical, and movement and music have a great effect on babies
from birth. It is possible that our point of view is biased when we perceive certain types of situa-
tions and reactions as ‘musical’, which other people refer to as ‘communicative’, ‘social’ or ‘prelin-
guistic’. Musicality from birth is, for us, so evident that we find it puzzling that most studies of
psychological development have neglected this aspect of early human behaviour.
Before language emerges, a baby must learn how to express its will and experience by vocal and
gestural movements. This preverbal, non-verbal communication is part of the process of growing
language, and it has a musical nature from the start (Trevarthen 1999). The communicative
musicality of infancy (Malloch 1999) is evident in action games between the baby and caregiver,
in ‘motherese’ protoconversations, and in a host of situations where states and functions of mind
are transferred by physical contact between baby and caregiver. We believe it is basic, biological
and universal. From it emerge all languages of the world, be they spoken or expressive in other
active ways (Rodrigues 2005). In other words, music and language develop from the same deep
need and skill to communicate. They share common aspects because they emerge from the same
source. However, they develop through different paths. In a certain way, musical movement is the
‘language’ that preserves the ancestral aspects of that vital pulse of human narrating that pre-
ceded the making of song, music, dance, theatre and language (Rodrigues 2005; see Dissanayake,
Chapter 2, Brandt, Chapter 3, Merker, Chapter 4, and Cross and Morley, Chapter 5, this volume
for discussion on the evolution of music and language).
27.2.9 The musicality of language and music

We have no doubt that when we speak to calm or entertain a baby, it is the melody that matters,
not the words. In adult speech, the melody, timbre and rhythm are not just dispensable acces-
sories to verbal content. Spoken language preserves a musicality that can reinforce or contradict
the verbal content of a message (see Erickson, Chapter 20, this volume, for examples of this).
However, in musical performance, this musicality might be more or less evident. We would like
to clarify this point since, for a non-musician, the distinction between music and its musicality
may not be obvious. While risking the possibility of obscuring other aspects related to culture
and musical taste, we contend that the presence or absence of musicality establishes the distinc-
tion between those who are and are not able to communicate with their audiences. Thus, any
musical phrase may be interpreted with more or less expression or musicality.
There is comparable expressiveness in a musical phrase, in spoken language, and in the
gestures and movements of all we do (Lee and Schögler, Chapter 6, Panksepp and Trevarthen,
Chapter 7, and Davidson and Malloch, Chapter 26, this volume). This leads us to consider the
essential and intrinsic aspects by which the phrasing of a performing musician—a professional
artist—is evaluated. Within the non-verbal expression of music, we find the deepest, indispensa-
ble layer of human communication, through which human emotions flow (Rodrigues 2005).
We may call the science of this communication ‘the psychoacoustics of affection’. Like other con-
tributors to this volume, we hold that there are parameters of dynamic communication that are
primitive and universal among humans. They are present in the first communications a baby
establishes with others, and in spontaneous musical expression. Some spoken discourses or some
artistic performances embody them better than others. That is part of the explanation of their
efficacy as communication.
Bebé Babá, besides being a rewarding artistic experience, has become a philosophical and
scientific challenge, transforming our conceptions of art and music, of the nature of babies, and
of communication between people. It brought us understanding of Popper’s (1989) statement,
‘The work of a creative researcher, the theory, has a lot in common with the work of art; and its
creative activity is very similar to that of the artist.’ Whether our work is art or science is irrele-
vant; what is more important is knowing whether ‘drinking from the source’ nourished the well-
being of the musicians and the people we worked with. The value of a theory is related to its
productive and creative possibilities, more than to the tested limits of its truth and fiction. It is
possible that we just fell in love with a beautiful fiction. However, worthwhile problems for scien-
tific debate have arisen from this work, as well as inspiration for further artistic work. The reason
humans make music is that words are not enough. Music has a vital place in the most literate
cultures because it retains its own importance for human survival: language cannot replace it.
The experience of our creative work is saved in the memory of those who lived it with us. It has
made our experience of the world more beautiful.
Fig. 27.5 In Bebé Babá a festival of musical action makes a community.

27.3 Developing musicality in the teaching of performance

27.3.1 Background: problems with methods of instruction
Teachers of instrumental performance often ask: why can one not just ‘tell and show’ the students
how the piece should be performed? The immediate, obvious answer to this question is that
musical interpretation cannot possibly be based only on imitation of models. The indeterminacy
of the score means that the interpreter can and should be creative; thus, interpretative perform-
ance requires a personal contribution. Only a creative, personal synthesis is capable of producing
communicative, meaningful, intentional musical gestures or musical performances.
In the existing pedagogical literature on performance teaching, the processes involved in
conceiving, rehearsing and performing the expressive elements of a musical work are not given
serious attention; the emphasis is on technical and formal aspects of playing an instrument.
Anecdotal evidence from a variety of teaching contexts indicates that interpretation of music is
discussed, but mainly as an account of notation, which draws on a large stock of standardized
expressive effects acquired through stylistic imitation of a stereotype or model. This model
has probably been provided by the teacher, a particular school of playing, or recordings of a
famous interpreter.
In this context, the performance teacher’s job is to engage the students in a more or less creative
reading of the musical score, trying to develop processes of interpretative meaning construction,
in opposition to the textbook emphasis on technically focused practice, which is potentially con-
straining, or, worse, destructive of artistic confidence. Class observation shows that even if stu-
dents have succeeded in imitating a particular expressive musical phrasing—reproducing it
immediately after their teacher’s demonstration—subsequent performances sound more and
more deficient in capturing the continuities that provided coherence of form and expression in
the teacher’s performance (Correia 2003). All of the information captured in the presence of the
teacher’s demonstration fades away—memory is naturally selective and reductive, especially if
the emotional foundations are fragile. Focused attention on separate expressive cues or on their
combinations, as opposed to a mind-state of open ‘phenomenal’ awareness, is equally reductive
(Deleuze and Guattari 1980).
The method of instruction reported in the following was designed to explore how teachers of
performance can guide their students to be more aware of the musical narrative as it unfolds, and
less attentive to the separate elements of the performance. This practical investigation may be a
useful first step towards teaching creative expression to performance students. In this experi-
ment, I asked if an earlier body movement-based expressive and communicative engagement
with the music could help students to become more musically expressive and teach them to
approach performance more creatively.
27.3.2 Theory: the key issues

The key issues explored in this experiment are artistic/pedagogical and theoretical—both con-
cerned with music communication. A teaching methodology is proposed and teaching strategies
are explored through which students can succeed in approaching performance creatively, inten-
tionally and convincingly. In the course of developing the experiment and later discussing its
results, a theoretical hypothesis has emerged which argues that, in musical communication, per-
formers and audiences share a meaning of musical production that operates below conscious
awareness, and which brings to awareness bodily processes of dynamic experience. At this level, it
is not the literal or referential meaning that is communicated, but rather the self-sensing process
of producing musical, non-verbal (ineffable) meanings—those that are expressible only in terms
of a quasi-corporeal or ‘embodied’ experience. Thus, for musical communication to occur, the
gestures produced by the performers should engage sympathetic listeners, that is, those who are
capable of fulfilling at least two conditions. First, they should be cognitively equipped to react
mimetically to performers’ actions; second, they should be socially and culturally motivated for
playing games of make-believe with those mimetic reactions.
27.3.3 The method

I am a professional flautist and teach flute at the University of Aveiro in Portugal. The practical
investigation I report here is a self-reflective case study of my own teaching to explore how
I might become an active researcher to facilitate a creative learning process for my students. The
basis of the approach is a concept that, in effective teaching, students are guided to develop their
own ideas about interpretation, rather than depending on the imitation of models provided by
the teacher. To achieve this ‘self-motivated learning’, I adopted the following four-point model as
a teaching methodology.
The students were guided as follows:
1 They were invited to contextualize the musical pieces, to experiment with the sounds through
the deliberate exploitation of the chosen context or theme;
2 Then they had to (e)motionally explore the context, by making expressive decisions and
engaging the motions and the emotions of the body to create an emotional narrative with the
sounds (I use the term ‘(e)motion’ to point to the intimate relationship between emotion and
body motion);
3 They had to assimilate the narrative deeply in the physiological memory of the body, through
processes of neural coactivation,1 that is, by repeated movement until an ‘automatic pilot’ was
created for the active expression of each musical section; and
4 They had to explore how to communicate their devised narratives, how to let themselves go
into processes of becoming2 and to be creative in the real-time of the performance of the
musical narrative.
It was hoped this method would bring significance to the music and authenticity to the
performance. I hypothesized that the students would be challenged and that the learning process
would stimulate their imagination and creativity. A report follows of a series of practical investi-
gations in which I attempt to articulate these four theoretical concepts in practical terms.
1 ‘Whenever a domain of subjective experience or judgement is coactivated regularly with a sensorimotor
domain, permanent neural connections are established via synaptic weight changes … Certain neural
connections between the activated source- and target-domain networks are randomly established at first
and then have their synaptic weights increased through their recurring firing. The more times those
connections are activated, the more the weights are increased, until permanent connections are forged’
(Lakoff and Johnson 1999, p. 57).
2 A concept borrowed from Deleuze and Guattari (1980) that refers to the following process of spontaneous
creativity: when performing, performers concentrate on their devised emotional narratives, reproducing
what they decided in the rehearsals; however, as soon as they start to play, the pulse and emotional varia-
tion of movement is back in place with a strong feeling for the context where the action happens—the
‘here and now’ of the performance situation. It is then that new emotional variations may happen; varia-
tions in emotional intensity, and variations caused by the necessity of integrating new elements and
factors that occur in the real time of the performance. These factors come not only from the outward
context of the performance but also from its inward context; that is, motor adjustments in the body
and/or emotional reactions provoked by the performance ritual.
27.3.4 Participants
Aside from the fact that all were studying flute at the same level, the participants were not
selected on the basis of any specific criteria. The three students, all of whom I was preparing for
their final exams, were in the last year of secondary school, having chosen music performance as
their main subject. Their intention was to become professional musicians and to apply to a music
performance course at university. All had been studying flute for at least seven years.
27.3.5 Pilot studies

These first experiments consisted of involving individual students creatively with contextualiza-
tion and (e)motionally exploring the context. My hope was that the musical emotional narratives
created by the students at these early stages would persist and proceed through the other two
phases of coactivation and becoming; that is, the created musical emotional narratives would gain
endurance, being deeply assimilated in the physiological memory of the sensuous body through
processes of coactivation, and this would develop favourable conditions for an experience of
becoming.
One of the students obtained good results immediately when working in this way. As she stated
in the interviews, she felt stimulated and challenged by having to create her own interpretation
and narration. Expressive nuances appeared very soon in her sound, and she changed her atti-
tude on stage, becoming much more ‘present’ and communicative.
I shall give an example. My student Sandra was learning to perform Debussy’s ‘Syrinx’.
Investigating the context of this piece, Sandra found that it is related to Ovid’s Metamorphoses—
that it was programmatic music written for a theatre piece, and that the mythological story of
Syrinx could be summarized as follows: ‘Pan was pursuing the nymph Syrinx, who fled to a river
and begged the nymphs there for help. She was allowed to conceal herself by taking the form of a
reed-bed, from which Pan subsequently picked the reeds to fashion his pipes’ (quoted from
Sandra’s notes).
Sandra was asked to explore the mythological context of the piece, first by acting—playing the
characters of Pan or Syrinx and devising a narrative of gestures and actions. Afterwards, while
playing on the flute, she tried to express the different versions of the narrative she had created.
After a few exercises, Sandra played two versions of the first section of the piece, embodying, so to
speak, the nymph in the first version, and the god Pan in the second version. Clearly, Sandra was
focused on expression in both versions. In the first version, she ‘became’ the nymph Syrinx,
the archetypal feminine character in a pastoral atmosphere, playing tender and soft sounds. In
the second version, she became much more violent and determined, playing the obsessed male
character Pan. There were significant changes in her sound, in her facial expressions and in her
body movements, which were clear signs, for me, of her expressive engagement—her identification—with
the protagonists. By making them clearly audible and visible, Sandra expressed the
(e)motional qualities of her devised narratives.
Fig. 27.6 Sandra performs the story while the author plays ‘Syrinx’ by Debussy.
For this student, it was quite easy to imagine the actions, motions and emotional states of a
character and to immediately apply them to the musical material. Sandra was able to improve her
playing by imagining (e)motional forms of human ‘being’, in sound. When imagining or acting
as Syrinx or Pan, Sandra translated to her musical phrasing the different gestural qualities she
had attributed to the actions of these personalities: she transferred to her playing the intrinsic
relations of movement and rest, speed and slowness, tension and release typical of their imagined
behaviours.
The other students, however, when exploring the context, were not so easily connected or
involved with the communicative and interpretative features of the music. Instead, their efforts
were confined to processing all of the information cognitively; that is, they were more concerned
with mentally representing the music on the page than with expressing the life of it in sound.
Moreover, these students were precisely the ones who had experienced more difficulty in playing
expressively before the experiment. So, in spite of the promising results obtained with Sandra, the
experiment was not producing positive results for those students who seemed to be the ones
most in need of assistance.
These results led me to look for a more effective way of teaching the articulation of mind and
body or, in other words, of linking cognitive structures in a musical composition to the activities
of bodily structures. If the students were processing all of the information cognitively, the negoti-
ation of the expression with the body was possibly failing to occur. Problems emerged when the
students had to translate their readings to actions and expressive gestures. They were failing in
the translation process from what was in the score. This translation process, which corresponds
to (e)motionally exploring the context, brings expressive sound into being. Thus, it became clear
that these students were failing to create or experience an emotional narrative in sound—the
sound of moving, as it were.
It was this ‘bridge’, where the reading or the hermeneutics of the piece is translated into expres-
sive sound, which needed special attention. To work on the hermeneutic dimension, the students
were given drama exercises in which they were asked to express, first acting and then playing the
flute, different characters, affects or states of mind suggested by their explorations of the histori-
cal narrative contexts of the pieces. When the collected information about the context (historical
and analytical) of the piece was considered sufficiently rich and suggestive for the students, they
were asked to look for action-metaphors to help them make the bridge to the achievement of
expressive sound.
Action-metaphors in music are metaphors inspired by the narrative context or drama, and are
suggestive of actions and gestures—of motions and emotions of actors or protagonists—that are
translatable in sound. They result from processes of free associations between the chosen context
and the musical sounds and their relations, giving meaning to the music and stimulating our
affective response to the sounds. Anyone who has participated in advanced music performance
lessons will recall many cases where action-metaphors were explored. For example, these two
come from masterclasses, one with István Matuz and the other with Patrick Gallois:
When playing ‘Les cinq incantations’ by Jolivet, one should be like talking to the people there and asking
for peace in the world, like in a prayer.
István Matuz
I shall give you another example from the Ballade by Frank Martin … Take the very beginning of the
piece. It has to have suspense like in a Hitchcock film when the innocent victim is taking a shower and
you can see the shadow of the murderer getting closer and closer … So I try to express this tension
with the qualities of my sound, this fear that makes your heart go faster.
Patrick Gallois
The exploration of these action-metaphors has the twin aim of capturing the physical qualities
of the metaphorical referents, their intrinsic relations of movement and rest, speed and slowness,
and simultaneously of searching for the flexibility of the musical material to express these
relations. The creative exploration of action-metaphors works as a symbolic activity, linking
(constructing the bridge between) cognitive and bodily structures, and acts to engage the
students creatively with intentional artistic playing (see Brandt, Chapter 3, this volume, for a
discussion of stages of the semiotic process).
The problem seemed to be that, for the students who were finding expression difficult, there
was some kind of interference in the link between cognitive and bodily structures. Unlike Sandra,
these students were not using the body effectively to achieve a musically expressive performance.
The cause for this was probably that they tended to focus on the technicalities of flute playing, as
they were themselves indicating in their notes and comments:
I know exactly what I want to express musically, but as soon as I begin to play, all my technical prob-
lems seem to get in the way … For instance, in the beginning I was planning to create a beautiful dark
tone to obtain a meditative atmosphere, but when I started to play my sound became full of breath
and the B flat was way too low.
Student interview
Needing to find a way forward, I turned to exploring body movement as both the substrate and
a methodological tool for developing musical expression in my students. From experience with
the class and from self-observation, I realized that exploring the same body movements takes dif-
ferent performers to the same musical gestures and intentionality, and ultimately to the same
interpretation of the purpose of the movement. This seemed valid not only for the formal and
structural features of the music, but also for its emotional and narrative dimensions, that is, the
expressive qualities of the body in performance (see Lee and Schögler, Chapter 6, Davidson and
Malloch, Chapter 26, this volume).
27.3.6 Sympathetic body movement: conducting the musical expression
Turning to the body-based approaches to musical performance of authors as different as Emile
Jacques-Dalcroze (1921), Truslit (1938, cited in Repp 1993) and Alexandra Pierce (1994),
I decided to explore a potential short-cut to achieving expressive playing; that is, to make sure
that my students were really negotiating the action-metaphors with their bodies. If the expression
of the piece were first negotiated with the body, the resulting marks and mappings sensed in the
body would hopefully remain effectively associated with the motor production of the musical
phrasing.
Taking inspiration mainly from the work of Dalcroze and Pierce, I found that the way to first
negotiate expression with the body, before playing, was to ask each student to conduct the
musical expression of the piece while I played. I tried to react to all the nuances suggested by
the students’ movements and gestures. The idea was not so much that the students would indi-
cate accurately the entries, the bar or the beat, as a conductor is supposed to do at the most
elementary level, but that they would suggest freely, using any physical gestures, all the nuances
needed to express their previously devised emotional narratives. They would be focusing on
expression only, reducing drastically the possibility of having any interference from the technical-
ities of performing on the flute. The principle is described by Pierce as follows:
Movement processes can become reliable guides in aural analysis. One can, for example, find phrases
by arcing with the arm, can ascertain climaxes by a hand-stretch, and can discover the middle ground
progression by stepping the bass. Each process is easy and satisfying to use and will serve an inquiring
adept as well as a puzzling student. The movements invite self-confident and committed expressive-
ness. Each, though simple, is subject to endless improvement, and is interesting enough to encourage a
continuing search for its perfection. Each feels quite new in different compositions. Most important
for performers, each can be translated into equivalent movements of playing technique. The payoff is
not only theoretical understanding but also an improved performance.
Pierce (1994, p. 58)
Elaborating on the work of Pierce (1994), it seemed reasonable to hypothesize that this effort
to focus on expression, to the point of being able to communicate it to someone else, would have
a profound effect. It would be deeply assimilated and would thus remain intimately associated
with the musical material when actually playing an instrument, in spite of the distraction caused
by the technical difficulties of playing. When this new methodological procedure was tried out in
the pilot studies with this group of students, the structure of the musical narrative was strikingly
preserved from version to version each time the students played the pieces. There were, of course,
variations in such factors as emotional intensity, dynamics and tempo, but the overall sense of the
narrative structure seemed unaffected, which suggested it had been deeply assimilated.
27.3.7 The new teaching procedures
The finally agreed teaching procedures for the practical investigation that we are reporting
included this new element: the prior negotiation with the body through metaphorical projections
or action-metaphors. Therefore, the three students who participated in this experiment
were to follow the four steps of the method proposed above, with this new element added to
the second phase of the process. They were guided to contextualize the musical pieces by creating a
reading of the piece; to (e)motionally explore this reading of the context, building emotional narratives
to be first negotiated with the body—mainly through conducting the expression—and
then translated into sound; to assimilate these musical emotional narratives deeply in the physiological
memory of moving through processes of repetition (coactivation); and to explore how to
communicate these musical emotional narratives to an audience by becoming their narrator.
27.3.8 Results
The students became more musically expressive after negotiating the musical material with their
bodies, or, in other words, after embodying the musical meaning. The three participants in this
practical investigation demonstrated rapid progress in their capacity to play expressively, in spite
of their technical problems. The high marks obtained in their final examinations before a jury,
although not necessarily definitive, represent a relatively objective confirmation of the positive
results of this practical investigation. These three students gave the impression to the jury that
they were freer to communicate musically, less obsessed with technical difficulties, and consis-
tently more confident and more expressive.
In their interviews, the three students considered this experiment to be a rewarding experience,
since they felt they could communicate musically even if they were not technical virtuosi.
They reported having changed their focus when practising and performing. This new way of
working helped them to engage with the expressive goals and to feel that they were in control
of the musical expression when rehearsing and performing. They felt progressively independent
of the teacher and more secure and knowledgeable with respect to making their own interpreta-
tive decisions:
Before I was always trying hard to do what the teachers were demanding and, in the end, I was always
insecure about the results … when working in this way, I am playing much more the way I feel.
From a monthly interview with Vera
Other important changes were observable. The involuntary movements of the students when
playing became much more discreet and more ‘related’ to the musical ideas. Most of the distracting
movements they had exhibited before, which were caused by technical difficulties, vanished;
their playing began to develop a well-motivated sense of narrative—that is, when playing at
their final examinations, they succeeded in creating musically intense moments so that one could
easily become involved with their musical discourse.
27.3.9 Conclusions
The practical investigation reported here seems to open new perspectives within the research area
of performance, teaching the participants to explore issues that have been largely avoided or neg-
lected in past practice. While a considerable amount of research has been done on the methods of
music education, it has mainly been directed to measurement of separate, specific areas
of improvement—such as cognitive development and skill acquisition—and to the identification
of age levels for aptitude (e.g., Gabrielsson 2003). The existing body of academic enquiry from a
number of disciplines has tended to focus discussion around elements of ability, which have at
best been considered extremely difficult to articulate or combine in a coherent theory (Gabrielsson
2003). Much of the creativity involved in improving the conceiving, rehearsing and producing of a
musical performance has been ignored completely (but for an exception, see Pierce 1994).
A wide range of treatises on teaching tends to focus on technical issues rather than interpreta-
tion, with an individual’s interpretation of a work regarded as ‘the bit that cannot be taught’.
Here, also, the consequence of the insistence on form, formation and structure has meant that
fundamental issues of process have been ignored; for example, how to interpret, how to build an
interpretation, and how to create. Historically, it has been argued that some kind of profound
communication or empathy is established between teacher and student, and that it is within this
specially intense relationship that an interpretation is found.
The investigation reported in this chapter has explored how this profound communication can
happen. Instead of trying to capture all the separate aspects of their teachers’ performances and
demonstrations, as students tend to do, they were guided to develop or to create their own musi-
cal gestures and to perform them intentionally. Traditional teaching strategies may have given too
much emphasis to the separate expressive cues (sound qualities such as dynamics, timbre, articu-
lation and timing), which can then easily become objects of conscious focused attention. Too lit-
tle emphasis has been given to the self-generated cues that may provide the continuities that will
turn a performance into a personal, individual, unique amalgam.
27.3.10 How do we communicate musically? A hypothesis
The experience reported here has led me to elaborate a hypothesis on how we communicate musi-
cally. If the students who took part in this practical investigation had to look for self-generated
cues to provide their musical discourse with the gestural continuities needed for an intentional
performance, it may well be the case that music listeners also need self-generated cues to make
sense out of their auditory experiences. As Walton (1990, p. 336) wrote, ‘the appreciation of
music is a more personal, private experience than the appreciation of painting and literature …
listening to music is thus more like dreaming; one’s imaginative activity is largely solitary’.
This constitutes a crucial difference in the way we appreciate art forms. In the appreciation of
painting and literature, there is time to elaborate or to involve higher levels of conscious structur-
ing; there is time to symbolically charge our aesthetic experiences. Eventually, there is even time
to exchange verbal comments; and all of this means that we are able to engage our long-term
memory—extended consciousness in terms of Damasio (1999)—in our personal make-believe
constructions.
But in music, or in other performative temporal arts, we have no time to verbally elaborate on
the gestural continuities (Hatten 1999) displayed in a performance, as it happens. We have to
keep up with the musical flow. We have to go on reacting continuously to its changing surface,
following attentively (introspectively) the changes that the music provokes within our selves, in
our bodies. Our experiment with performance students seems to be a clear demonstration of
this: they could not express the continuities of the musical gestures, because they were too
focused on separate expressive cues, out of the time of performance.
These changes that music provokes within our selves (which we follow introspectively) remain
preverbal representations, in spite of their coexistence with verbal language (cf. Donald 1991,
pp. 166 and 168). They are embodied meanings, which are inseparable from their production
process, indistinguishably enmeshed with both the external motor activity and the internal per-
ceptuo–motor adjustments that produced and experienced them. They are more felt than repre-
sented or signified, and thus, by their nature, they escape Saussure’s distinction or separation
between signified and signifier (and see Brandt, Chapter 3, this volume, for an exploration of the
evolution of music from the viewpoint of semiotics).
Thus the performer’s actions, or musical gestures, provoke reactions that are unmediated by
conventional codes and systems. This is probably so because, unlike verbal language, music, or
more generally, the language of gesture, is not organized to pass on a message, that is, to represent
the same propositional meaning or the same reality to everyone, but to affect directly (cf. Cross
2005; Cross and Morley, Chapter 5, this volume). The function of the language of gesture seems
to be to awaken similar gestures that recognize themselves in it, at the subliminal level of the pat-
terns of our bodily experience. This explains why one may talk about communicative musicality
referring either to infant–mother communication or performance–audience communication
(see discussions of the ‘meaning’ of music in Dissanayake, Chapter 2, Brandt, Chapter 3, Merker,
Chapter 4, Cross and Morley, Chapter 5, and Erickson, Chapter 20, this volume).
This does not mean that there is no representation in the language of gesture. As Donald (1991)
argued, there is representation in gestural language, but it is representation to oneself: the sender’s
act has to be re-enacted by the receptor, who can understand it only ‘on the basis of internal,
self-generated cues’ (Donald 1991, p. 173). The receivers, aurally and/or visually, understand, or are
affected by these meanings, because they are as if imitating the performers through empathy, or
better, sympathy (I use the term ‘performers’ instead of ‘senders’ intentionally to escape the linguis-
tic terminology). Again, in my experience with performance students, conducting the expression
was a means of guiding them to generate their own personal musical gestures and narratives on the
basis of internal, self-generated cues and to experiment and test if these gestures and narratives
were being properly communicated to me, the teacher, who was re-enacting them. Emotion was
naturally and inherently involved, omnipresent, whenever the performance students were express-
ing themselves in terms of movement, because motion is always emotional: to emphasize this
aspect, I have adopted the term ‘(e)motionally’ (for example, ‘(e)motionally exploring the context’).
To understand fully this notion of representation in gestural language—where the receiver
imitates but has to re-enact creatively the performer’s actions to make sense out of them—it
is indispensable to take an evolutionary perspective. Donald (1991) argues that human
communication through the language of gesture was definitely improved with mimesis. There is,
in fact, neurological evidence supporting what might be called the intersubjective nature
of mimesis.
In a proposal that complements Sheets-Johnstone’s ‘intercorporeal iconicity’, Rizzolatti and
Arbib (1998) suggest that mirror neurons provide a basis for social understanding, in that
accommodating the actions of others to one’s own bodily experience allows for an understanding
of their motivations and intentions (see also Tolbert 2001, pp. 89–90).
In Tolbert’s (2001) review of Donald (1991), it becomes apparent how representation took
place and improved the new mimetic minds.
Donald distinguishes between human and non-human primate intelligence primarily in terms of
human’s greater voluntary access to memory. Although apes display a high degree of intelligence,
they seem to depend on environmental cues for access to memory. Donald hypothesizes that the abil-
ity to plan and execute one’s own motor actions can provide a substitute for immediate context by
using the body itself as the missing contextual cue, thereby placing memory under voluntary control.
Through displacement of the here and now, mimesis moves representation from the indexical level to
the symbolic threshold.
Tolbert (2001, p. 88)
27.3.11 The imaginative undercurrents of musical experience
This move of representation ‘from the indexical level to the symbolic threshold’ was (and, onto-
genetically, still is) the result of innumerable imaginative operations, occurring mainly at an
unconscious level. This process allowed for both the emergence of a social mind and the ability to
access memories displaced from their context. The access to the symbolic threshold implies the
existence of a new level of representation, in which the body itself is able to provide the missing
contextual cues (cf. Tolbert 2001).
This ‘major break’, which corresponds to the emergence of the second-order neural maps
(in the terminology of Damasio 1999, p. 170) allows us to understand better what was stated
above: that representation became representation ‘to oneself, on the basis of internal, self-generated
cues’ (Donald 1991, p. 173). This means that these aural and/or visual receivers, when affected by
each other’s intentional gestures, became capable of both imitating the performers by empathy
(or sympathy), and of interpreting the changes observed in themselves by means of evoking
memories and mimetic scenarios, which are, naturally, nurtured by personal meaning associa-
tions. These personal meaning associations, which are triggered by the mimetic reactions and
lived in real time, are regulated and processed by the bodily structures of imagination—by the
emotional logic emerging from our stock of (e)motional experiences.
It is important to highlight the essential role imagination plays in all our cognitive operations,
be they conscious or unconscious. Trevarthen has concluded that:
humans are born with an intrinsic sense of behavioural and experiential time adapted for sympathetic
motivation in imagination, for ‘mirroring’ or ‘echoing’ the motives in another’s song … [It] might be
more appropriately seen as a ‘narrative’ functioning, which is concerned with imagination and its
intersubjective transmission as much as with single subject’s cognitive execution, perceptual learning
and problem solving.
Trevarthen (1999, p. 193)
Walton (1990) argues that, in adulthood, imagination is exercised in our games of make-
believe: ‘it would be surprising if make-believe disappeared without a trace at the onset of
adulthood’ (Walton 1990, pp. 11, 12). He claims that make-believe continues ‘in our interaction
with representational works of art’, and defines games of make-believe as ‘one species of imagina-
tive activity’; specifically, he adds ‘they are exercises of the imagination involving “props” ’
(Walton 1990, p. 12). Since Donald (1991, p. 169) characterizes our capacity for representation as
a ‘creative, novel, expressive’ act, it seems indispensable to consider the role imagination might
play in all aesthetic experience, particularly in musical experiences. Here the second experiment
with imaginative enactment may be crucial in supporting these ideas.
Musical performances are complex events with many layers of meaning, and the fact that they
have the significance and the function of a ritual certainly compounds the complexity of their
multiple meanings. The whole point of musical rituals, Walton (1997, p. 82) suggests, is to supply
us with auditory experiences, which function as props to stimulate our imagination (see also
Merker, Chapter 4, this volume). It is on these auditory experiences that different listeners make
their different, imaginative meaning constructions. They are thus listening to the music and, at
the same time, interpreting the auditory experiences that arise from it. As Lavy writes, ‘Music is
heard as narrative because when we listen to music we conceptualise it in terms of narrative, with
narrative itself acting almost as a meta-metaphor within which all things can be made compre-
hensible’ (Lavy 2001, p. 99). In other words, music listening refers to processes of narrative/cogni-
tive activity in which the bodily origin of meaning is intimately and introspectively revealed to
the self, although in an obscure, indirect form of vague impressions or surface-effect imaginings.
It is not only knowledge about the music that feeds our musical interpretation or make-believe
fantasy, but everything associated with the piece that may stimulate our affective response to the
music: affective/emotional reactions to the contents of programme notes, historical context and
style of the musical piece; emotional memories of past experiences triggered by free association
at any moment during the musical performance; (e)motional, aural and visual mimetic reactions
to the sounds and to the performers’ movements; and whatever pops into the listener’s mind dur-
ing the mimetic processing caused by the performer’s actions and other environmental stimuli,
and that helps them to focus on the temporal unfolding of their personal narratives.
As the evidence produced by both neuroscientists and cognitive psychologists has shown
(cf. Damasio 1999; Donald 1991; Cox 2001), emotional information is directly experienced in the
brain, but often totally escapes conscious control. This means that the emotional information
and/or flux displayed by musical gestures may be perceived and processed at a level below con-
scious awareness. Because emotional content has a determinant role to play in listeners’ auditory
experiences, we may infer that the essence of a listener’s musical experience does not come into
his/her awareness or conscious control. Our musical experiences might be determined to a large
extent by our unconscious operations (and see Panksepp and Trevarthen, Chapter 7, this
volume).
27.3.12 An alternative view of meaning in music
This short interdisciplinary theoretical inquiry into music listening was developed by rejecting
the more traditional theories of meaning and arguing in favour of a very different account. This
alternative account offers a way for understanding musical meaning and musical communication
and can be systematized by the following principles.
1 There is an unconscious cognitive motivating process grounding all meaning production,
and the capacity to produce (e)motional narratives accompanies the very first manifestations
of human cognition in infancy.
2 Imagination operates at all levels of mental activity, be it conscious or unconscious, both
embedding and grounding the most highly elaborated forms of conceptualization and reasoning.
3 Imagination operates thus from an (e)motional logic: it is coherent if it feels right. Emotional
logic constructs (e)motional narratives within the language of gesture, which is the original
ground from which all other languages, whether verbal or musical, have emerged. It is our
emotionally charged and kinaesthetically structured experiences that both nurture and con-
strain the free play of imagination in the construction of all knowledge and during all acts of
communication.
4 The meaning of the language of gesture is embodied symbolic meaning, which implies this
language has to be enacted by the performers to be produced, and re-enacted by the listeners
to be understood.
5 The ritual of musical communication is probably inherently kinaesthetic and intermodal,
and thus intrinsically gestural.
◆ Music listeners seem to react mimetically to the ritualized performer’s actions, enacting
fictionally their personal (e)motional narratives and playing games of make-believe in a
process of continuous, creative introspection.
◆ Music performers seem to re-enact ‘in presence’ their coactivated personal (e)motional
narratives, reacting in the moment to the ritualized social atmosphere of the musical per-
formances, in a process of continuous, creative improvisation.
27.4 Conclusion: what these experiences of human musicality have taught us
Helena Rodrigues, Paulo Rodrigues and Jorge Correia
The accounts of these two different experiments in the production of musical art, one with babies
and one with music students, are partly a reaction to the excessive weight of rationalism brought
upon music by analytical theories. The reported experiences and discoveries might assist an
exploration of issues that we believe have been largely avoided or neglected in the existing litera-
ture on music education and music psychology. We hope that they contribute to a better under-
standing of the nature of music, and of music’s particular mode of communication. It appears
that musical communication involves humans through a synthesis of bodily–sensorial aspects
and cognitive–intellectual aspects, if we accept the traditional split between body and mind.
Thus, music may offer a way to heal that split, because to be fully involved in music makes it
impossible to maintain this split.
The results of the two reported ‘experiments’ support the position that musical experiences are
essentially holistic, leading to the conclusion that an insistence on treating music as separate
aspects, as prescribed by analytical models, is misleading. Much research in music psychology
and in music pedagogy has supported this atomistic tendency, and has neglected the integrative
or coordinating emotional, affective and communicational aspects of music (there are
exceptions to this, especially in the non-academic literature; e.g., Green and Gallwey 1986; Ristad
1982). In music teaching, students are often guided first in an interpretation of the score, or to
accept the teacher’s interpretation. Rarely are they invited as beginners to be interpreters for
themselves, and guided to focus on how they are affected by the music.
In trying to break with the romantic view that music is only about expressing feelings and
emotions, many twentieth-century authors went so far as to deny its physiological or
organic basis—the biological roots and bodily motions that we believe underlie and ground all
musical expression. Following this trend, many music teachers seem to have favoured knowledge
about music rather than stimulating the inner process of musical enjoyment. Making a parallel
with literature, this seems like choosing to know the names of the authors and the titles of the
books, while ignoring completely the stories they tell. Traditional teaching of music performance
has been too often focused on technical improvement, forgetting the reason why technical
improvement is needed. Again, to compare with literature, this is like having a huge vocabulary
but no original stories to tell, and not even the motivation to try.
In reporting these experiences, our intention is to emphasize that music is a whole living
experience—to bring into focus something very basic and essential in human nature: the pleas-
ure and joy of communicating. We conclude that in musical communication every kind of
human participant (parents, babies, professional performers and their audiences) shares meaning
in a way that is beyond conscious awareness, and that reveals (or discovers) bodily structures
of experience. It seems clear that a shared drama of musicality, of story-making in sound, exists
from infancy to the highest levels of musical skill, whatever other elaborate understandings and
associations are incorporated and appreciated. It is also evident that musical communication
between performer and audience at a professional level, and infant–mother communication in
more intuitively musical ways, are essentially of the same nature (Malloch 1999; Trevarthen 1999,
pp. 161–62).
In the future, the authors aim to establish further parallels between infant–mother communi-
cation, musical communication among performers in groups, and the communication of
performers with their audiences. To do this, ways of developing artistic creation and scientific
research together will be further explored.
Acknowledgements
We thank Fundação Calouste Gulbenkian, Instituto Português das Artes (Ministry of Culture)
and Fundação para a Ciência e Tecnologia (Ministry of Science, Technology and Higher
Education) for being supporters of our educational, artistic and research work on music and
early childhood development.
References
Correia JS (2003). Investigating musical performance as embodied socio-emotional meaning construction:
Finding an effective methodology for interpretation. Unpublished Ph.D. thesis, University of Sheffield.
Cox A (2001). The mimetic hypothesis and embodied musical meaning, Musicae Scientiae, 5(2), 195–212.
Cross I (2005). Music and meaning, ambiguity and evolution. In D Miell, R MacDonald and D Hargreaves,
eds, Musical communication, pp. 27–43. Oxford University Press, Oxford.
Harcourt Brace, Orlando, FL.
Deleuze G and Guattari F (1980). Mille Plateaux [One thousand plateaus]. Les Éditions de Minuit, Paris.
Donald M (1991). Origins of the modern mind: Three stages in the evolution of culture and cognition.
Harvard University Press, Cambridge, MA, London.
Freeland A, Stern DN and Bruschweiler-Stern N (1998). O Nascimento de Uma Mãe. Porto: Ambar.
Published in English in 1998 as The birth of a mother: how the motherhood experience changes you
forever. Basic Books, New York.
Gabrielsson A (2003). Music performance research at the millennium, Psychology of Music, 31(3),
221–272.
Gordon E (1990). A music learning theory for newborn and young children. GIA, Chicago, IL.
Green B and Gallwey T (1986). The inner game of music. Doubleday, New York.
Hatten R (1999). Musical gesture online lectures, Cyber Semiotic Institute, University of Toronto.
URL:http://www.chass.utoronto.ca/epc/srb/cyber/hatout.html
Helder H (1969). A Colher Na Boca [A spoon in the mouth]. Ática, Lisboa.
Lakoff G and Johnson M (1999). Philosophy in the flesh: The embodied mind and its challenge to Western
thought. Basic Books, New York.
Lavy MM (2001). Emotion and the experience of listening to music: A framework for empirical research.
Unpublished Ph.D. thesis, Jesus College, Cambridge, 2001.
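Malloch S (1999). Mothers and infants and communicative musicality. Musicae Scientiae (Special Issue
1999–2000), 29–57.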
Papoušek H (1995). No princípio é uma palavra – Uma palavra melodiosa. [In the beginning was the
word – a melodious word.] In JG Pedro and MF Patrício, eds. Bebé XXI. Criança e família na viragem
do século [The child and the family at the turn of the century], pp. 171–175. Fundação Calouste
Gulbenkian, Lisboa. (No translation has been published, however for text that covers similar material,
see Papoušek H and Papoušek M (2002). Parent infant speech patterns. In G Gomes-Pedro, K Nugent,
G Young and B Brazelton, eds. The infant and family in the twenty-first century, pp. 101–108.
Brunner-Routledge, New York/Hove, UK.)
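Papoušek H (1996). Musicality in infancy research: Biological and cultural origins of early musicality.
In I Deliège and J Sloboda, eds, Musical beginnings: origins and development of musical competence,
pp. 37–55. Oxford University Press, Oxford.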
Papoušek M (1996). Intuitive parenting: a hidden source of musical stimulation in infancy. In I Deliège
and J Sloboda, eds. Musical beginnings: Origins and development of musical competence, pp. 88–112.
Oxford University Press, Oxford.
Pierce A (1994). Developing Schenkerian hearing and performing. Intégral, 8, 51–123.
Popper K (1989). Em busca de um mundo melhor. Fragmentos, Lisboa. Published in English in 1992 as
In search of a better world. Routledge, London.
Reigado J (2007). Análise acústica das vocalizações de bebés de 9 a 11 meses face a estímulos musicais
e linguísticos. [Acoustic analyses of 9–11 month-old babies’ vocalizations after they are presented with
musical and linguistic stimuli.] MA Thesis presented to FCSH – UNL, Portugal.
Repp BH (1993). Music as motion: A synopsis of Alexander Truslit (1938) Gestaltung und Bewegung
in der Musik. Psychology of Music, 21, 48–72.
Ristad E (1982). A soprano on her head. Real People Press, Moab, UT.
Rizzolatti G and Arbib MA (1998). Language within our grasp. Trends in Neurosciences, 21(5), 188–194.
Rocha A (2007). As vocalizações de bebés de 9 a 11 meses face à música e à linguagem – análise efectuada
por juízes especializados. [Vocalizations of babies 9–11 months old interacting with musical and linguistic
stimuli – specialized judges’ analyses.] MA Thesis presented to FCSH – UNL, Portugal.
Rodrigues H (2004). Desescolarizar a educação [Let’s ‘unschool’ education]. Jornal de Letras/ Educação,
14 April, 8–9.
Rodrigues H (2005). A Festa da Música na iniciação à vida: da musicalidade das primeiras interacções
humanas às canções de embalar [The festival of music at the beginning of life: from the musicality of
first human interactions to lullabies]. Revista da Faculdade de Ciências Sociais e Humanas, 17, 61–80.
Rodrigues H and Rodrigues P (2006). A Educação e a Música no divã – ‘nóias’, paranóias, dogmas
e paradigmas – seguido de apontamento sobre uma ‘gota no oceano’ [Education and music from the
couch – paranoias, dogmas and paradigms – followed by a note on a ‘drop in the ocean’]. Revista de
Educação Musical da Associação Portuguesa de Educação Musical, 121–23, 61–79.
Rodrigues P and Rodrigues H (2004). Bebé babá – explorations in early childhood music. GIA, Chicago, IL.
Rogers C (1961). On becoming a person: A therapist’s view of psychotherapy. Constable, London.
Tolbert E (2001). Music and meaning: an evolutionary story. Psychology of Music, 29, 84–94.
Trevarthen C (1999). Musicality and the intrinsic motive pulse: Evidence from human psychobiology
and infant communication. Musicae Scientiae (Special Issue 1999–2000), 155–215.
Walton KL (1990). Mimesis as make-believe: On the foundations of the representational arts. Harvard
University Press, Cambridge, MA, London.
Walton KL (1997). Listening with imagination: Is music representational? In J Robinson, ed., Music and
Meaning, pp. 57–82. Cornell University, New York.
Index
Please note that page references to non-textual material such as Figures or Tables will be in italic print, while references
to footnotes will have the letter ‘n’ following the note
aborigines, rituals 48 art 17, 19, 130
accents, defined in music theory 99 and Bebé Babá (project) 592
acetylcholine 347 therapeutic 331
acoustic analysis/recordings 89, 93, 303, 318–320 art collections 24
acoustic startle response (ASR) 552, 559 artistry in childhood, observations 518–522
ACTH (adrenocorticotropic hormone) 346 artists 84
action song 250–253 ASD, see autism/autism spectrum disorder 425–426
action-metaphors 600 Asperger’s syndrome 425
adaptors, gestures 569 ASR (acoustic startle response) 552, 559
adrenocorticotropic hormone (ACTH) 346 attachment
aesthetic community 401, 410–411, 416 of infant and vocal interaction 307
features 414–416 in improvised music therapy 383, 425, 426
path to 417 motivation for 210
pioneers 401–402 attachment theory for dyads 266
affect attentional load 68
between persons 33, 34, 231, 232 attunement theory, and engagement 223
in music 69, 71, 567 auditory cortex, specialization 162
regulation in music therapy 382, 386 auditory imagery 36
in vocalizations 32, 212, 216, 218, 222, 224 auditory-visual modality 52–53
affect displays, gestures 569 authenticity, and belonging 321–322
affective competence, infants 117, 210, 286, 292, autism/autism spectrum disorder (ASD) 55, 116,
306, 466 425–429
affective neuroscience 112–117, 553 case study 427–428, 429
affective tonus 38 autonomic nervous system 337–340, 346
agonism 269
aisthesis, role 411–412 babbling, in infants see infant vocalizations
algorithms 100, 122 babies/infants
alteroceptive emotional regulation 119 see also infant development, and music; infant
altriciality, and play 73–74 vocalizations; infant-directed speech
ambiguity of music 68–69, 70, 71 acoustic preferences and sensitivities 282–283
Among school children (Yeats) 17 cultural differences in communication 218–223,
amphoteronomos regulation 203n 302–305
amphoteronomic regulation 203n dance impulses 200–201
amusia 72 electroencephalic recordings of brains 116
amygdala, brain 121, 347 emotional expressions
ancestors 31–32, 53 adult feelings on hearing 230–231
animals of feelings 210–211
see also ape culture; chimpanzees/chimpanzee perception by adults 227–230
culture testing of emotion, very young babies 213–218
animal calls and mother-infant dialogues 110–111 types 193
brain, neurochemistry of affective systems 126–128 expressions of emotion, when hearing
emotional systems 105 music 197–199
perceptuo-motor systems 86 imitation by 186
sounds by 211–212 innate musicality of 187–189
Ansdell, Gary 7, 362 innate intersubjectivity 2
anthropology 370 interaction with mother see mother-infant
anxiety control, origins of music 25–26 interactions
ape culture 45, 46, 47, 55 intersubjective communication, nature 210–211
see also chimpanzees/chimpanzee culture; monkeys Japan, emotional or interpersonal culture 218–219
archaeological records 74–76 language acquisition 32
Aristotle 8, 185 learning to speak 213–215
arsis 556 musical communication with 185–187
babies/infants (cont.) morals and aesthetics of belonging 313–314
musicality 263–266 musicality and narrative 310–311
protoconversation with 1, 115 performing through shared musicality 302–305
responses to music and expression of feelings 193–199 polysemic and non-discursive roots of belonging
rhythmic expressions 186, 199–200 311–313
defined 190 research findings 302–304
durations of 196–197 Bernatzky, Günther 26
measurement of 192 Bernstein, NA intrinsic motor regulation, motor
rhythms image 85, 554
age differences 195, 196 binary systems 40
developmental course 193–196 biological time 545–546
evolution of emotions in rhythmic ‘dialogue’ biomusicology 370
201–203 bio-musical research, immediate future 132–133
frequencies of 193 biopsychosocial paradigm 350–351
multimodality of 200–201 loop, psychobiological 349–350
timing 201 psychobiological approach see psychobiology
types 196 zones of conflict, music for children in 331–335
right-hemisphere dominance 117–118 birds 50
Scotland, emotional or interpersonal culture learned song compared to language 159
218–219 as primitive ritual 50, 51
singing to 109, 127 protomusicality 108
speech 22, 188, 214 plastic song 50, 249
learning to speak 213–215 social play 112
spontaneous communication 263 subsong 50, 249
visual avoidance 109 vocal development 50, 249
vocal engagement with, testing of emotions birdsong, brain mechanisms 109
adult responses to expressions 217–218 Bjørkvold, J-R, children’s musical culture 315, 477
function of emotions in relationships and cultural Blacking, John
learning 215–216 on community music therapy 369
growth of innate sympathy in co-consciousness cultures, music across 66
216–217 on developmental value of music 71
infant’s calls, receiving/interpreting 217 on group cohesion and music 62
learning to speak 213–215 on kalimba thumb-piano music 67
vocal expressions addressed to on ‘outside’ meaning 68
early vocalizations and reciprocal imitation sexual selection theory 21
284–285 blood flow response methods 148, 149, 150–151
function 286–287, 288 blossom analogy, origins of music 17–18, 19–21
special features 283–284 bodily feelings
Bantu languages 53 of infants 229
Barkin, Elaine 515, 516, 517 from music 123–124, 125, 173
bars/measures 554 BOLD (Blood Oxygenation Level-Dependent)
basal ganglia, brain 121 response 151
basal metabolism 345–349 bole analogy, origins of music 22–25
Bateson, Gregory on meta-communication 449, bonobos 75
456, 460 borderline personality disorder (BPD)
Bateson, Mary Catherine, on protoconversation 1, acoustic analyses, examples from 318–320
190, 211, 480 preliminary research findings 316–320
beats 40, 42 situating 315–316
Bebé Babá (project) 585–596 Bourdieu, P, social habitus 304
art and therapy 592–593 bowing intensity glides 95–96
as ‘chain of shows’ 586 Bowlby, J 266
ecological context for research 591–592 Bowles, S 63
from ideas to practice 585–587 BPD see borderline personality disorder (BPD)
inspiration for 595 brain
parents and babies, observation of amygdala 121, 347
interactions 589–591 animal, neurochemistry of affective/affiliative
from practice to reflection 588–589 systems 126–128
Becker, Judith 370, 371 basal ganglia 121
becoming, concept of 598 of birds 109
Before Speech (Bullowa) 2 core of 113
belonging, sense of developmental changes in 120
and authenticity 321–322 emotion systems 113, 117–122
definition of ‘belonging’ 304–305 emotions, music-induced 122–123
forward problem in measurement of brain chills, provoking 123–124, 125, 173
activity 152 chimpanzees/chimpanzee culture 45, 46, 47, 53, 75
hemispheres, cognitive neuropsychology of 117 human infants compared 244
infant brains, recordings 116 Chomsky, Noam, on innate language faculty 155, 263,
intersubjective sympathy, visualizing brain 467, 479
processes 116–117 chords, musical 154
inverse problem in measurement of brain chromosome 7
activity 152, 154 defective genes 115
left hemisphere 155 chronobiograms 551, 554
left neocortex 118 chronobiology of music 545, 548–549, 559–561
musical awareness, cerebral asymmetry and accent and anticipation 556–557
emotional foundations 117–118 biological time 545–546
musical-emotional, growth 118–120 breath phrase 558
musicality complex rhythms 554–555
asymmetric in 117 evaluation 559–561
tracing in, subcortical reaches 113–115, 120–122 lack of metre 558–559
right-hemisphere dominance, in babies 117–118 lack of pulse 558–559
timing of events related to music 167 melody and harmony 556
vertebrate, asymmetry in 118 metre and pulse 554
voice appreciation, subcortical systems of 117 metric structures, modulation 556
brain-mapping studies 150, 167 psychological time 548–549
brainstem 112–115, 119 pulse patterns 557
breath 501, 508–509 rhythm and affect 113
breathing difficulties 341 rhythm and frequency 546–550, 549
Broca’s area, language 155, 161, 172 rhythm and movement 551–553
Brown, Steven 76 rhythm and timbre 555–556
on definition of music, requirement for 65 rhythmic ‘present’ 550–551
on group cohesion and music 73 Rubato style and swing 557
on mimesis 110 streaming, rhythmic 555–556
on musical signalling 63–64 tau function 557
sexual selection theory 21 ‘vitality’ 557
Bruner, JS 265, 470 circadian cycle 545
on communicative contingency 306, 307 clarinet, example score 573, 574, 575, 578
on education theory 470 classroom discourse
on information in intonation contours 285 cadential timing 460
on intersubjectivity before language 265 early-grade classrooms 460
on narrative 310 examples of speech 452–459
on rule structure in songs 251 speech and listening, musical nature 449–452
bulerias 555 clause embedding 32
Bullowa, Margaret, communication before cleanliness, ritual 47
speech 2, 210 clinical music therapy see music therapy
burl analogy, origins of music 18 clocks, biological see chronobiology of music 545
Clynes, M, on sentic forms of expression and time in
cadence, speech 453 the mind 85, 109, 117
capacities, musical 76 coalition signalling 64
Capgras’ syndrome, paranoid schizophrenia 34 co-consciousness 216–217
capitalizations 113n coefficient of attraction 270
cascade of sign functions 35, 36 cognitive flexibility 71
Cats’ Chorus trio 269, 271, 272, 275–277 cognitive neuroscience 371
cave art and music 22, 31 cognitive semiotics
CBF (cerebral blood flow) 150 death and danger, facing 31–32
centre of moment 579 emotions 33–34
cerebral blood flow (CBF) 150 language 32–33
ceremonial arts 20 signs, artistic and musical 35–37
ceremonial ritual 533–542 cognitive theory, talk and music 154
emotional meaning, creating see emotional collaborative musicing 358, 364
meaning, creating collective thinking 63
functions of music in 541–542 common nouns 34
mutuality, coopting mechanisms 536–537 communal functions, musical actions 69–74
ritualization and aesthetic operations 534–536 communication
chanting, rhythmic 53 and expressiveness of depressed mothers 287, 289
chat 109 of feelings of companionship 231–232
chi square analysis 223–224 human, complexity of 214
communication (cont.) cradle of thought 111
with infant, musical 185–187 crescendos 86, 123–124
intersubjective, nature 210–211 CRH (corticotropin-releasing hormone) 346, 347
mimetic 498 Cro-Magnon people 185
mother-child see mother-infant interactions Cross, Ian 6, 7, 26
and movement 105–110, 190, 566–567 on community music therapy 360
music and dance, interrelation 501 flint-knapping experiment 466
music supporting 441–442 Csikszentmihalyi, M and flow theory 313, 475, 498
musicality in 281–282 cultural fitness 63
non-verbal foundations 242–246 culture
playing with 501–505 ape 45, 46, 47, 55
spontaneous, of babies 263 instrumental 55
support for, in clinical musical therapy 423–425 intergenerational cultural transmission 45
and vowel sounds 211–213 musical, motives of 106–107
communication disorders 131 and ritual culture 45–50
communicative musicality Culture-Centred Music Therapy 357
and autistic children 426–429 CURRY 4.5 (Compumedics Neuroscan) source
as creative participation 585–608 localization software 167
defined 2–6, 566 Custodero, LA 516–517, 521
and intersubjectivity 379 cycles of breath in music and dance 501, 558
music as component of 17
music therapy 378–379 Dalcroze, Emile Jaques 495, 601
pedagogical insights 522–525 Dalcroze, Emile Jacques, method of music
protomusic in 22–23 Dalcroze, Emile Jaques, method of music
and Rett syndrome 429–435 Damasio, Antonio, moving body emotion and
and structured music 433–435 consciousness 8, 371, 552, 555
communitas 417 and lived experience 386
community music therapy dance 53, 54, 192
group life (music therapy event) 365–367 see also movement
musical community 359–360 adult engagement 414
musical-social development model 362–365 contextual influences on
shifting practice and theory 358–359 engagement in 412–413
suggested interdisciplinary links flamenco 554
anthropology 370 gravity, dancing with 413
biomusicology 370 Indian dance forms 87
cognitive neuroscience 371 infant impulses 200–201
musicology 372 leadership 414
social philosophy 372–373 and music, four agents for 500–501
social psychology and sociology peer social engagement 413
of music 371–372 and play 405–406
traditional African ceremony 369–370 ritual 412
Western musical event (opera) 367–369 and sensory impairment 402, 403, 404
companionship motivation, intersubjective 210 voice 412–413
compás 554–555 waltz 551, 554
competition 20 d’Aquili, E and emotion in rituals 539, 540
composition 88 Darwin, Charles 17
conceptual-intentional complexes 70 on courtship display, and music 19, 21
Condon, WS, on interactional synchrony of expressive descent theory 18
movements between adults and with infants 109, on evolution of music in humans 64
283, 313, 315, 469, 551 on language 51, 264
disturbance in child emotional disorder 381 on musical notes 246
conducting 601–602 sexual selection theory 19–21
conformal motive 55 Davidson, JW 569
connotative complexes 69 deaf-blind children 401, 402
consciousness, of music 117 aisthesis, role of immediate perception 411–412
contour plot, magnetic field 168 dance/movement therapy 404, 406–407
conversational discourse analysis 282 content and methods 405–406
Correia, Jorge Salgado 585 dance 406–407
cortical functional segregation 171 emergent categories of engagement 408–410
corticotropin-releasing hormone (CRH) 346, 347 engagement in dance and play 405
cortisol 347 play 407
costly signalling theory, origins of music 19–20 research 404–407
courtship display, music evolved from 19, 21 social and task engagement 408
deferred imitation 50 EEG (electroencephalography) 151–153
deliberate teaching 49 infant brains, recordings 116
deoxyhaemoglobin 150 music-induced emotions, brain activity 122
depression, maternal psychobiology of music, socio-emotional 114
clinical practice, implications for 293–294 raw data 152
communication and expressiveness of mothers speech and music 156–159
287, 289 and stress 347
adult-adult speech/adult-infant vocalizations, temporal resolution 166
expressiveness in 289–292 efferent cognition 37
Edinburgh Postnatal Depression Scale 293 Ego 37
musicality, effects on 281–294 ELAN (early anterior negativities), left
loss, long-term consequences 292–293 hemisphere 157, 158, 159
speech 221 electric bass 99–100
descent theory (Darwin) 18 electric guitar 100
development of musicality electroencephalography (EEG) see EEG
innate basis 187–263 (electroencephalography)
musical expression 469–470 embeddedness 67
musical listening before and soon after birth emblems, gestures 569
468–469 embodied expressive movement, music as 67
origins of musical human nature 465–467 emergent moment 382
rituals, seeking in shared performance 467 emergent musicality 523–525
of rhythms in infancy 190–203 emergent self 430
teaching of performance 597–607 emotional binding 34
developmental value of music 70–72 emotional gaps 102
deviations from reference interval ratio (DRIR) emotional meaning, creating
167, 168, 169 innate and socially acquired associations and
DHEA (dehydroepiandrosterone) 346–347 connotations 538–539
Diagnostic and Statistical Manual of Mental Disorders manipulation of expectation, effects 540–541
4th edition (DSM-IV) 331 sensory and cognitive dispositions, appeal
dialogue-constitutive universals 263 to 537–538
dialoguing, musical 424, 425 emotional voice paradigm 224
dichotomies 8–9 emotions
Dicker-Brandeisova, Friedl 331 alteroceptive emotional regulation 119
dimensions, mental architecture 37 animals, emotional systems 105
discretization 42 basic 216–217
disengagement 223 creating emotional meaning 537–541
distance calls, hominid 53 in ‘dance of well-being’ 379–381
distance gap 86 exteroceptive emotional regulation 119
Donald, Merlin function of in relationships and
on animal calls and mother-infant dialogues 110 cultural learning 215–216
on chronobiology 559 implicit realm of 383–384
on expressive movement 83 infant expressions of
on innate musicality 108 adult feelings on hearing 230–231
on mimetic culture 203 adult perceptions 227–230
on ritual 53 feelings, vocal expression of 210–211
dopamine 112, 126, 127, 128 on hearing music 197–199
deficits in Parkinson’s disease 131 perception by adults 227–230
double bass 95–96, 99, 100 types of expression 193
Down’s children, Williams children distinguished 116 in very young babies 213–218
DRIR (deviations from reference interval ratio) 167, of infants, research problems 216–217
168, 169 measures of, in mother-infant voices
driving behaviours 539 acoustic features, similarities and
drums 439 differences 219–222
duets 577–578 nursery rhymes, reciting in different
jazz 99–101, 307n moods 224–226
durations 554 play, engagement in 223–224
desynchronization and synchronization (ERD and reactions to mother’s mood changes 226–227
ERS) algorithms 122 and intersubjectivity 115
movement and communication,
ECD (equivalent current dipoles) 153, 158, 160 musical-emotional 105–110
echoic memory trace (EMT) 160 and music 173–174, 175
echo-planar imaging (EPI) 151 brain activity 122–123
Eckerdal, Patricia 6 mood induction and emotional effects 130–131
emotions (cont.) future work 175
and music (cont.) learning musical techniques 171
neuroscience of 116–128 magnetic field gradients 150
psychobiology of, socio-emotional 110–116 mechanism of action 150
musical expression, happy or sad 96 and MRI 150
proprioceptive emotional regulation 119 and musical meaning 69
self-regulating 212 music-induced emotions, brain activity 122
empty beats 40 PET contrasted 150
EMT (echoic memory trace) 160 pitch and melody 161–162
engagement speech and music 155–156
communicative 229 and stress 347
in dance, contextual influences 412–413 timbre 164
gravity, dancing with 413 Fonagy, Ivan, expressive communication within
ritual 412 language 107, 213
voice 412–413 Fontanet cave 31
in dance and play 405–406 forebrain 113
defined 223 frameworking technique 427
emergent categories 408–410 Freeman, W J 64
musical, and IQ 72 functional brain imaging studies see fMRI
in play 223–224 (functional Magnetic Resonance Imaging)
social and task 408
vocal, with babies and infants see under babies/infants GABA (gamma-aminobutyric acid) 347
Enlightenment 17 general tau theory
entrainment 67–68 accent and anticipation 556
EPI (echo-planar imaging) 151 description 85–86
equivalent current dipoles (ECD) 153, 158, 160 examining musical communion 102
ERAN (early anterior negativities), hypotheses 88
right hemisphere 157, 158, 159, 161 prospects 101
Erickson, F 450 and tau of a gap/tau-coupling gaps 86
ERP (event-related potential) component see ELAN, tauG-guidance as central concept 87
ERAN, N400 157, 158 gestures
evolution of music 61–65 and chat 109
excitement, expression of in infants 193 feelings, matching to 83
expressive behaviour 247 and language 51
expressive movement, science of 83–85 rhythmic hand, in infants 192
exteroceptive emotional regulation 119 and singing f0 glides 94
types of 569
f0 glides Giraffe Dance 539
delineation 90 Gordon, Edwin, musical guidance instruction 505,
laryngeal 93–94 566, 587–589
and musical stress 92, 93 grammar 43
singing 89–90, 91, 92, 94 grieving songs 34
trombone 94–95 group catharsis 25, 26, 63
fetal responses to music 468 group cohesion, music promoting 62
finger tapping 165 and anxiety reduction 25, 26
finitization 42 group identity 63
fitness, cultural 63 group life (music therapy event) 365–367
flamenco 554 group selection, music as product of 62–63
flow guitars 439
and expectation 540
in communication with infants 6, 109, 286 habitus 304
in musical expression 84, 85, 596 haiku, verses of 41–42
measurement by tau theory 87 Hallan Tonsberg, GE on musical dialogue 424
and rhythm 550, 557, 560 Halliday, MAK on proto-language 267
in music education 475 hand glides 94
in music therapy 378, 385, 395, 396 Hanna, JL on dance 412
of shared knowing 313 harmonic stimuli, dissonant and consonant 109
flutes harmony 185, 186, 556
finding of 31 Harmony of the Spheres 185
score, example 573, 574, 575, 578 Harper House Children’s Service,
shakuhachi (Japanese bamboo flute) 558 Horizon NHS Trust 437, 438
fMRI (functional Magnetic Resonance Imaging) 150 Hatten, R on gesture 567
advantages and disadvantages 151 Hauge, TS on musical dialogue 424
emotion generators, localization 120 heart, and autonomic nervous system 337–340
hemisphere specificity for language and music 155–172 adapting ritual form to developmental
Heschl’s gyrus (left hemisphere primary progression 256–257
auditory cortex) 155 communication, non-verbal foundations 242–246
ECDs localized in 158 developmental paradox of music 248–250
electrophysiology results 160 music as music 246–248
pitch and melody 162, 163 Infant Laboratory, Edinburgh University Psychology
timbre 164 Department 224
hippocampus 347 infant vocalizations
Hobson P and ‘cradle of thought’ 111, 216, 379 see also babies/infants; infant-directed speech
Holck, U on music interaction therapy 424 analysis of 2
Homo erectus 75, 76 coding of vocalizations 270
Homo ergaster 23, 75, 76 early vocalizations and reciprocal imitation 284–285
Homo habilis 75 ‘extended vocables’ 214
Homo heidelbergensis 23, 75, 76 intonation 285
Homo rudolfensis 75 inventive song, development 475–476
Homo sapiens 31, 74, 76 learning to speak 213–214
homunculus, in artistic and musical signs 36, 37 developing musicality 214–215
hormones musical expressions 469–470
anxiety control 25 and other species 74
basal metabolism 346 preverbal babbling 19
and mother-infant interaction 22 rhythmic expressions see rhythmic expressions,
and musical signalling 64 in infants
and PTSD 346, 347, 348 and subsong of birds 50
hours, names of 42 in trios see Red Hat trio, cats’ chorus
HPA (hypothalamo-pituitary-adrenal) trio 269–277
axis 118, 346, 347, 349 vocal play 74
human culture 45–56 infant-directed speech
three-tiered conception of 45, 54–56 defined 188
hyperarousal symptom cluster 344 maternal depression, effects on see depression,
hypothalamus 346 maternal
musical performance 566
icons 35 origins of music 22
identity pitch 283
emotional background to naming 33–34 prosodic features 468
numerical and qualitative 34 regulation of infant emotions 284
idiom, musical 384 vowels, functions of 214
IDS (infant-directed speech) see infant-directed speech infants see babies/infants
illustrators, gestures 569 infants-in-groups paradigm, trios 266–269, 268
Imberty, M, affective poetry of music, relationships, innate emotional responses to music 8, 108, 126, 173,
musical narrative time and meaning 107, 114, 378, 466
305, 309, 310, 312, 467 innate intersubjectivity/sympathy 2, 264, 467, 468
IMF (Intrinsic Motive Formation) 119, 120, 553 innateness
imitation 49, 50, 186 associations and connotations 538–539
imitative arts 5 language 147
imitative generalists 53 musicality 71, 108, 110, 155
imitative learning 49 infants 187–189
IMP (Intrinsic Motive Pulse) 8, 128, 187 and plasticity 159–160
improvisation
clinical, and creative ‘now’ 382–383 insect fishing by chimpanzees 46–47
entering into musical process through 505–506 institutionalized children 402
expert voices 515–518 instructional teaching 49
improvisation zone 306–307 instrumental culture 55
and intersubjective timing see intersubjective timing intensity contours 509
and improvisation intensity glides
learning environment 481–483 bowing 95–96, 97, 98
and mother-infant interaction 524 defined 96
music and dance, interrelation 501 synchronizing in improvised jazz duets 99–100
improvisational music therapy and autistic spectrum tauG-guidance 100
disorder (ASD) 425–426 intent participation 461
Indian dance forms 110 intentionality 472
infant development, and music 241–257 intentions 115
see also babies/infants; mother-infant interactions interest, expression of in infants 193
action song 250–253 interpersonal bonding, and oxytocin 64
interpersonal coordination and conjoinment, Laban, Rudolf, dance theory and learning 404,
music as 24–25 406, 567
intersubjective timing and improvisation Lakoff, G, and time experienced in movement 8, 41, 598
anticipating temporal units and weaving time 305–306 and brain ‘mirroring’ 379
definition of ‘intersubjective time’ 309–310 on counting 41
expressive timing and improvisation zone 306–307 LAN (left anterior negativities) 158
repetition and variation, vital importance 308–309 Langer, S, philosophy of musical narratives and ‘feeling
intersubjectivity 70 forms’ 8, 68, 106, 309, 311, 534, 538
and communicative musicality 379 language 32–33
development of theory 210 see also speech
innate 2, 263–264 affective regulations in syntax and semantics
intersubjective sympathy, visualizing brain 131–133
processes 116–117 Bantu languages 53
psychobiological bases 371 Broca’s area 155, 161, 172
secondary 430 Darwin on 51, 264
timing see intersubjective timing and improvisation gestural mode of, original 51
intonation patterns 32 innateness 147
intracerebral kainic acid 127 linguistic terms, and musical meaning 107
intrinsic guidance 86 and music 174, 595–596
Intrinsic Motive Formation (IMF) 119, 120, 553 musilanguage 63, 76
Intrinsic Motive Pulse (IMP) 8, 128, 199, 371, 475, precision of 68
553, 559 and ritual 56
intrinsic motives 118–120, 215 signal powers 55
intuition, and learning music 465 traces of music in 32
inverse problem, bioelectromagnetic 152, 154 and truth values 63, 68
Ioannides, AA 160 and words 154
IQ, and musical engagement 72 language instinct, brain-based 147
irreversibility laryngeal f0 glides 93–94
capturing irreversible processes 497–499 laryngograph recordings 89, 93
and motivation 496–497 leaf clipping, ape traditions 47
physical existence 499–500 learner bottleneck 54
Israeli Rett Centre 437 learning music, receptive environments
‘Itsy Bitsy Spider’ (action song) 253 adult companions, role 480–481
alternate versions 256–257 improvisation and sharing cultural practice
I/You-Us continuum 365 481–483
musical companionship with peers 483–484
Jackendoff, R, rule-based theory of music 173, 554 leaves analogy, origins of music 17, 18–19
Japan Lee, David 8
emotional or interpersonal culture 218–219 left anterior negativities (LAN) 158
vowels, meanings in mother-child interactions left hemisphere, in language and music 155, 172
219–222 legato singing 89
jazz, tonal or atonal 428 Lerdahl, F, rule-based theory of music 173, 189, 554
jazz duets linguistico-musical compositions 33
intensity glides, synchronizing in 99–100 loudness 100
and mother-infant interaction 307n loved ones, and names 33–34
jazz musicians 88 lullabies 109
joy, expression of in infants 193 ‘Lx waveform’ of the laryngograph 89
kalimba thumb-piano music 67 macrostructure of music 114

kappa XG (kX,G) profiles 89, 92, 93, 95, 98 magnetic field tomography (MFT) 170
in jazz duets 100–101 magnetoencephalography (MEG) see MEG
katapontismos (ancient Greek ritual) 346 (magnetoencephalography)
kit drums 99–100 male display, musical behaviour as 20
Koroko (Japanese principle) 218 Malloch, S 4, 215
Krasa, Hans 331 on communicative movements 379
Krumhansl, CL, infants’ sense of musical on communicative musicality
form 109, 173, 311, 469 373, 431, 566
definition of rhythm 189 on community music therapy 360
Kugiumutzakis, G 186–187 on culture 264
Kühl, Ole, the natural meaning of music 106–108, 189, on music therapy 357
213, 306, 466–469, 496, 552, 566 on temporal coordination between mothers and
kX,G (kappaXG) profiles 89, 92, 93, 95, 98 infants 302
in jazz duets 100–101 mammals 50, 108, 112
Marwick, H 284 Morley, Iain 7

Mashô, Matsuo 41 ‘motherese’ see infant-directed speech
masquerades, West African 538 mother-infant interactions
MBEA (Montréal Battery of Evaluation see also babies/infants
of Amusia) 72 action songs/games 242, 250–253, 284, 466, 474,
meaning 480, 481
co-construction of 379–381 in Japan and Scotland 219–224
creative construction of 383–384 and animal calls 110–111
emotional see emotional meaning, creating anxiety control 25
and fact 36 attraction between babies 269
finding and losing in vocal sound 478–480 belonging, sense of see belonging, sense of
journey to, in infants 209–213 depression in mothers, effects see depression,
and music 68–69, 106–107 maternal
alternative view 606–607 discoveries, history 1–6
nature of 105 hormone release 22
relational 265 and improvisation 524
and sense of belonging 312 interactive timing 244
MEG (magnetoencephalography) 151–153 intersubjective timing and improvisation 305–310
conventional analysis 166 lullabies 109
echoic memory trace 160 measures of emotion
music-induced emotions, brain activity 122 acoustic features, similarities and
pitch and melody 161 differences 219–222
single-trial data 152 nursery rhymes, reciting in different
speech and music 156–159 moods 224–226
temporal resolution 166 play, engagement in 223–224
tomographic reconstructions 153 reactions to mother’s mood changes 226–227
melodic integration 42 mid-range vocal rhythm coordination 307
melodic phrases, reproducibility 42 musicality 4, 212
melody oxytocin 126
defined 165 in protoconversations, see protoconversation
developmental paradox of music 249 and protomusic 22, 23
modulation of metric structures 556 recurrent temporal units, embedding of 306
motion/perception gaps 86 repetition and variation, vital importance 308–309
music therapy 432 rhythmic musicality 105
and pitch 161–164 rhythmical temporal patterning 211
tau theory 88 species-specific 22
memory-based feelings 31 mother-infant interactions (cont.)
mental architecture, and music 37–39 speech, time devoted to 250
Merker, Bjorn 6, 26, 54 temporal organization of 307
message-signalling practices 31 tones of voices communicating companionship
metabolism 345–349 feelings 231–232
metre 554 video documentation 254–255
Meyer, LB, on musical meaning and emotion 69, 70, vocal interactions, tapes of 2, 3, 4
312, 432, 534 motion/perception gaps 86
on Western musical aesthetics 540 motivation
MFT (magnetic field tomography) 170 attachment 210
Miall, David 4 and irreversibility 496–497
Miller, Geoffrey, evolutionary selection of musical primary and secondary 497
behaviour 18–20, 21, 54, 64–65, 73 motives of musical culture 106–107
children’s musical sociability 108 motor cortex 87, 88
mimesis 110, 252 motor regulating system 112–113
mimetic culture 203 mountain gorillas 47
mirror system 165 movement
mismatch negativity (MMN) 159, 161 see also dance
Mithen, Steven, social function of music, and of body in communication, rhythm from 190
evolutionary origins with language 18, 19, 23, 25, and communication, musical-emotional 105–110
106–108, 131 meaning of ‘musical’ 106–107
scientific neglect of music 110 motives of musical culture 106–107
MMN (mismatch negativity) 159, 161 detection of musicality in, by infants 469
monkeys 87, 161 embodied expressive, music as 67
Montréal Battery of Evaluation of Amusia (MBEA) 72 emotional assessment of communicated
mood, and effects of music 130–131 meanings 112–114
mora (Japanese) 218 and emotions, intersubjectivity 115
movement (cont.) primitive 19

expressive, science of 83–85 psychobiology of see psychobiology of music
as fundamental form of human communication psychology of 61
566–567 semiotic functions 36, 37
and impact of music 131 sexual selection theory, according to 19–21
in infants, development during first year 200 sociology of 371–372
metaphors for 1 and speech 154–161
psychobiological approach 342–344 as technology 73
and rhythm 190, 551–553 therapeutic potential of 377–378
sympathetic body 601–602 as therapy see music therapy
theories of 404, 406 traces in language 32
movement-music-movement cycle 102 uses and functions, distinguished 541
Mozart, Wolfgang Amadeus music child, concept 380
Cosi fan Tutte, performance of 367–369 music perception, cortical specializations
Mozart Effect 130 pitch and melody 161–164
Piano Concerto 125 rhythm 164–165
trombone playing 95 timbre 164
MRI (magnetic resonance imaging) 151 music therapy
and fMRI 150 and ASD/autism 425–429
Müller, E 267 communication, support for 423–425
Murray, L 287, 289, 292 communicative musicality 378–379
muscular bonding 26 community see community music therapy
music Culture-Centred Music Therapy 357
across cultures and times 66–67 poiesis in 394
affective regulations in syntax and and Rett syndrome 429–441
semantics 131–133 sexually abused children see sexually abused
ambiguity of 68–69, 70, 71 children, music therapy
applied psychobiology 128–131 musical actions, communal functions 69–74
and attraction dynamics, in trios 269–273 musical scales and affect 42
bodily feelings from 123–124, 125, 173 musical dialoguing 424, 425
chronobiology of see chronobiology of music musical education model
as component of communicative musicality 17 creativity and pride of music-making 473–475
comprehensive definition requirement 65–69 exploration of voices and discovery of song and
consciousness of 117 speech 472–473
and dance, four agents for 500–501 vocal expression, as learned 470–472, 471
defined 85, 241–242 musical expression
developmental paradox of 248–250 creative artistic expression, characteristics 83, 84
developmental value 70–72 f0 glides see f0 glides
embedded nature of 106 happy and sad 96
as embodied expressive movement 67 and human expression 378
and emotion see under emotions of infants 469–470
evolution of see evolution of music jazz duets 99–101
in evolutionary thinking 61–65 playing in time 96–99
‘floating intentionality’ of 6 prospects 101–102
as functionless by-product 21–22 tauG in 88
heart, effect on 339 see also general tau theory; tau theory;
and infant development see infant development, tauG analysis
and music in therapy 378
as interpersonal coordination and conjoinment and vocal gestures in IDS 282
24–25 musical intimacy 525, 526
and language 174, 595–596 musical invention and imitation, see reciprocal
macrostructure 114 imitation
meaning 69 finding and losing meaning in vocal sound 478–480
alternative view 606–607 inventive song, development 475–476
and mental architecture 37–39 manipulation of objects as ‘instruments’, musical
neurology of 118 story-making 477–478
neuroscience of 116–128 spontaneous dancing, and learning music 476–477
social attractions and ‘addictions’, musical life cycle 484–485, 486
neurochemistry of 126 musical pulse or beat, measurement 40
non-adaptive roots 61 musical signalling, and socio-emotional bonding
origins of see origins of music 63–64
participant-selected/experimenter-selected 130 musical skills, key components 567–568
and poetry 107 musical stress, and f0 glides 92, 93
musical sympathy 115 of reciprocity 525–528

musical techniques, learning 171–172 and rhythmic motives
musicality, innate and acquired 71, 108, 110, 118, 154, theory needed
155, 159–164, 171–176, 264, 265 natural environment 498–499
and healing 7, 378, 385 Neapolitan chords 158
and learning music or singing 128, 129, 466, 467, Neng Neleng Kung (Javanese song) 483
473, 476, 480 nervous system, tau in 87–88
and meaning 9, 467 nettle stripping, ape traditions 47
and rhythmic motives for cultural ritual in neural power gap 88
humans 4, 52–55, 108, 379 neural-tau melody 88
theory needed 110 neurobiology 552
musicality neurochemistry of emotions and music 126–128
see also communicative musicality neurochemistry of social attractions
of action, and consciousness 43 and ‘addictions’ 126
asymmetric, in human brain 117 neuroimaging
brain, tracing in 120–122 blood flow response methods 148, 149, 150–151
and community music therapy 362 combining techniques 154
comparative psychobiology of 108–109 electroencephalography see EEG
defined 4–5, 281 (electroencephalography) electrophysiological
development see development of musicality methods 151–153
and energies of Self 6–7 functional brain imaging studies see fMRI
and healing (functional Magnetic Resonance Imaging)
as ‘holding’ 314–315 future work 175–176
human, mysterious nature 110 insights, imaging studies 174–175
in human communication 281–282 magnetoencephalography see MEG
and infants 263–266 (magnetoencephalography)
innate and acquired 71, 108, 110, 118, 154, 155, music perception, cortical specializations 161–165
159–164, 171–176, 264 musical techniques, learning 171–172
of infants 187–189 positron emission tomography see PET (positron
of language and music 595–596 emission tomography) imaging
and learning music or singing state of the art and implementation problems 153–154
maternal depression, effects on 281–294 temporal aspects of music, studying 165–171
loss of musicality, long-term consequences neurology, of music 118
292–293 neurons 86, 87
and meaning neuropeptides 127
mother-infant interactions 4, 212 neuroscience
and narrative 310–311 affective 117
natural, and therapeutic potential of music 377–378 movement and communication, musical-emotional
possible evolution of 108 105–110
sense of belonging, performing through 302–305 of music 116–128
skilled 129 musical awareness, cerebral asymmetry and
as universal human talent 72–73 emotional foundations 117–118
musicianship 362 neurochemistry of social attractions
musicing 358, 362 and ‘addictions’ 126
music-makers, auditory self-consciousness of 266 psychobiology of music, socio-emotional 110–116
musicologists 66 neutral expressions, infants 193
musicology 372 neurotransmitters and stress 347, 348
musilanguage 63, 76 Newberg, A, and emotion in rituals 539, 540
mutuality, mechanisms of 536–537 Niaux cave 31
non-discursive meaning 311
N400 negative ERP 156, 157, 158 non-efficaciousness 67
naloxone 120 Nordoff, P, interactive music therapy for children 378,
naltrexone 120 380–382
naming, principle of 34 norepinephrine attention systems 127
narratives nostalgic songs 34
concept of ‘narrative’ 310 nursery rhymes, reciting in different moods 224–226
emotional 602 infants’ reactions 226–227
infant vocalizations 215
musical expression 83 observational learning 49
and musicality 310–311 OCD (obsessive-compulsive disorder) 55
proto narrative envelopes 215 octave equivalence 162
psychobiology of music 114–115 open-skull procedure, conscious patients 160
pulse and quality 4 opera 367–369
opiate receptors 117, 124 periaqueductal grey (PAG) 117, 118

Opie, Iona and Peter 483 personhood, musicality of 34
opioids 126, 540 PET (positron emission tomography) imaging 148
oral phrases 43 advantages and disadvantages 150
orchestras 98 chills experience 174
origins of music fMRI contrasted 150
archeological evidence 31, 74–76 future work 175
biological origin and function 18 mechanism of action 150
costly signalling theory 19–20 music-induced emotions, brain activity 122
evolutionary theories 62–65 opioid activity in limbic system 124
and human culture 105–108 rhythm 165
non-evolutionary explanation 17 speech and music 155–156
rhythm 18 tracing of musicality, in brain 120
sexual selection theory 19–21 phenomenological analysis 39
tree analogy see tree analogy, origins of music phrase
origins of music in music and language 32, 42
Other Person’s Conscious Doing 39 and body movement in singing 570–572
oxytocin 64, 126, 127, 540 in mothers’ speech to infants 302, 303, 305, 311
and ‘vitality affects’ 309
P600 event-related potential 157, 158 in bipolar depression 310, 318, 319
paintings semantic functions 35 perceived in music by infants 469
palaeo-anatomical records 74–75 and change in expression 577
palaeontology 31 in vocalization of infants and toddlers 474, 479
Panksepp, Jaak 8, 26, 122 and expressive movements 89, 95, 96
Papoušek, Hanuš 6, 62, 265, 282, 588 and the ‘psychological present’ 547, 551
Papoušek, Mechtild 6, 217–218, 248, 265, 282 breath phrases in flute playing 558
and Bebé Babá (project) 588, 589 relation to pulse in music 560
PAG (periaqueductal grey) 117, 118 physical environment, aesthetic community 416
parenthood, human concept 33 physical existence 499–500
see also mother-infant interactions physiognomic forms 211
parietal cortex 87 physiological regulation 203n
Parkinson’s disease 131, 344 Piaget, Jean 186, 471
passive musical perception 67 piano mechanism, hammer-release point 86
Pavlicevic, Mercédès 7 Pierce, Alexandra 515, 566, 567, 601, 602
peak experience 313 Pinker, Steven music as ‘reward’ 21, 22, 65, 73, 370
PEAK Motus motion tracking equipment 575 Pinker, Steven, music as pleasure stimulus 21, 22, 65,
Perception-in-Action Laboratories, University of 73, 370
Edinburgh 85n pitch
Perception-Movement-Action Research, University of and melody 161–164
Edinburgh 102 perception of, in Williams children 116
performance in speech 456
aesthetic foundations 129–130 vocal expressions to infants 283
and art 130 pitch plot 5, 225, 291
data collection 575–576 pitch-blending 21
development of musicality in teaching of plastic song of birds 50, 249
conducting 601–602 plasticity 159–160
method 598–599 Plato, mathematics of music 185, 189, 547
new teaching procedures 602 and mimesis of movement 538
participants 599 play
pilot studies 599–601 and altriciality 73–74
results 602–603 collaborative 308
theory 597–598 and dance 407
duets 577–578 content and methods 405–406
imagination 605–607 operationalizing engagement in 405
movements of 565–580 interactive, physical exuberance of 112
musical skills, key components 567–568 and musical engagement 479
and performers 83, 85 social 112
practical investigations, summary 578–579 and vowel sounds 223–224
sociocultural codes 568–569 pleasure, expression of in infants 193
solos 576–577 pleasure
theoretical possibilities 579–580 expression of in infants 193, 197–202
traditional style 572 and opioid system 120
Pergolesi’s duet 89
Pleistocene 21 movement, emotional assessment of communicated

poetry 32–33, 107 meanings 112–114
and music therapy 385 movement and impact of music 131
rhythm 201 musical effect and training of musical intelligence
poiesis, in music therapy 394 and skill 128–129
Polish Radio Experimental Studio 546 musical sympathy 115
Pöppel, Ernst, musical time in the mind 6, 33, 547, narrative, musical 114–115
553, 554 psychological time 114
Popper, K 596 socio-emotional 110–116
Portel cave 31 Williams syndrome 115–116
positron emission tomography (PET) imaging see PET psychological present 547, 550, 551
(positron emission tomography) imaging psychological regulation 203n
postnatal depression see depression, maternal psychological time 114, 546, 548–549
post-traumatic stress disorder (PTSD) see PTSD psychology of music 61
(post-traumatic stress disorder) PTSD (post-traumatic stress disorder)
Povel, D-J 554 331, 332, 335, 337
poverty of the stimulus 54 and early childhood sexual abuse 384–385
Praat (software) 89, 94, 221, 226 and heart 338
prehistoric art and music 22, 31 and hormones 346, 347, 348
priming phenomenon 173 and respiratory problems 340
primitive music/societies 18, 19 puberty ceremonies 541
prolactin 127 pulsation in music and dance 501, 506–508
proper names 34 pulse/pulses 4, 40, 554
proprioceptive emotional regulation 119 pygmy marmosets, vocal play 74
protoconversation Pythagoras, natural law 185
culture-specific traits 303
example 3, 5, 244, 245 quality of voice, see timbre, voice quality 4
neuroimaging 155 quasi-resonant nuclei 214
‘poetic form’ 4
proto-conversation 1, 3, 5, 111, 218, 282, 285, 288, rain dance 47
318, 472, 595 RATN (right anterior-temporal negativity) 158
proto-grammar 70, 107 reciprocal imitation
protohabitus 304–305, 580 in classroom talk 450, 460
proto-habitus 301, 305, 305, 308, 310, 322 with infants 216, 244, 284, 285
proto-language 132 and intersubjective time 309
proto-music/proto-musicality 62, 107, 108, in music learning 514
128, 132 reciprocity 527–528
and communicative musicality 22–23 Red Hat trio 269, 271, 273–275
and other species 128, 370 regulators, gestures 569
protomusical behaviours 70, 71, 77 relational affects, and cognition 111
in therapy 359, 360, 361, 361 relational meaning 265
proto-narrative envelopes 114, 215, 218 religious ceremonies 32, 46
proto-symbolic expression 312 research, as art-science duet 404–407
protons, and fMRI 150 respiration 340–342
proximate behaviours 541 Rett syndrome
psychobiology communicative musicality for children with
autonomic nervous system and heart 337–340 429–435
basal metabolism 345–349 girls with 430–433
bodily movement 342–344 case study 440, 441
evidence 217 music therapy 435–441
hearing and listening 335–337 preferences 431
loop, psychobiological 349–350 and musical dialogue 425
of music see psychobiology of music Rett Therapy Clinic 437
respiration 340–342 structured music 433–435
psychobiology of music see chronobiology of case study 434, 436
music 105 rhythm
aesthetic foundations of musical performance chronobiology of music 546–550
129–130 cultural differences in protoconversation 303
animal calls and mother-infant dialogues 110–111 defined 189
applied 128–131 descriptions, notations and representations 553–554
interactive play 112 development study 190–193
mood induction and emotional effects of music data analysis 192–193
130–131 recording conditions 191
rhythm (cont.) Romantic tradition 557

infants Round and Round the Garden (nursery rhyme) 224
age differences 195, 196 Rubato style 557
developmental course 193–196
durations of rhythmic expressions 196–197 Sachs, Curt 20
evolution of emotions in rhythmic ‘dialogue’ Sacks, Oliver, music in the brain 15, 559, 560
201–203 and Parkinsonism 551
frequencies of all rhythms 193 sadness, opioid activity 124
multimodality of 200–201 SAM (sympathetic-adrenal medullary) regulation 118
rhythmic expressions of 199–200 samba 557
timing 201 Sander, LW 273, 551
types of rhythms 196 Sander, L interactional synchrony between adults and
and movement 190, 551–553 infants 109, 283, 551
objective 189 Sawyer, Keith 305
origins 189 scale form 384
origins of music 18 Scalise Sugiyama, M 20
in protoconversation 2–5, 285, 302, 303 school crisis 593–595
subjective 189 Schutz, Alfred 309, 313, 368
timing of 201 Making music together 372
variability, in music therapy 432 Schutz, Alfred, sociology of music 309–310, 313
rhythmic expressions, in infants 186, 199–200 social power of Mozart’s music 368
see also infant vocalizations Scotland
defined 190 emotional or interpersonal culture 218–219
durations of 196–197 nursery rhymes, reciting in different moods
measurement of 192 224–226
rhythmic musicality 105 vowels, meanings in mother-child interactions
rhythmic organization 40, 42 219–222
rhythmic ‘present’ 550–551 Second International CA Sys Conference
rhythmic processes 114 (Computing Anticipation System) 6
right anterior-temporal negativity (RATN) 158 self-other awareness 115
right hemisphere semantic binding 33
evoked potentials (ERAN, RATN) 158 semantic mismatches 69
and perception of musical chords and sound semiosis, infant 312
frequency 158–160 sensorimotor area (SMI) 167
ritual culture sensory impairment, and dance 402, 403, 404
see also ritual/rituals sentic forms 109
in animals 50 Serious Road Trip (NGO) 342
defined 45–46 serotonin 127
tier of 55 sexual selection theory 19–21, 64–65, 73
ritual propensity, generalized 52–54 sexually abused children, music therapy
ritualization 46, 534–536 case study (Sally) 387–396
ritual/rituals background 387–388
see also ritual culture educational development, aspects 395–396
adapting ritual form to developmental music therapy 388–389
progression 256–257 significant episodes from session 389–393
ceremonial 533–542 summary of change 393–395
and culture 45–50 therapeutic change, aspects 395
dance, engagement in 412 clinical pathways of symbolization 385–387
definitions 48–49 early childhood abuse and PTSD 384–385
formal terms 49 and post-traumatic stress disorder 384–385
human, social 54 shivers, provoking 123–124, 125, 173
katapontismos (ancient Greek ritual) 346 signs, artistic and musical 35–37
and language 56 singing 34
learnt nature 49 see also songs
mere ritual 51 animal rituals 52
purpose of 46 to babies 109, 127
religious 46 crescendo 86
reward system 64 f0 glides 89–90, 91, 92, 94
ritualization distinguished 46 intonation patterns 32
seeking in shared performance 467 inventive song, development 475–476
Robb, Louise, musicality of mothers’ voices, and effects legato 89
of depression 6, 109, 188, 221, 290, 566 neuroimaging 176
Rodrigues, Helena Maria 585 trying to sing 248
Rodrigues, Paulo Maria 585 vocal learning 50–51
SMA (supplementary motor area) 167, 170 on intersubjectivity 371, 496

SMI (sensorimotor area) 167 on maternal depression, effects 293
Smith, Adam, music as an imitative on mother-infant interaction 307n
art 5, 111 on music therapy 358
music lives between memory and on narratives 215
anticipation 551 on sound-making, infants 275
social innovations, aesthetic community 416 on speech addressed to infants 284
social philosophy 372–373 on Trevarthen 496
social psychology 371–372 on vitality affects 122, 309, 539
social relationships 13 Stockhausen, Karlheinz 546, 547, 559
socially directed behaviour 267 stress
sociocultural codes, in music see also depression, maternal
performance 568–569 anxiety control 25–26
sociocultural conceptualizations 42 hormones 25
socio-emotional bonding, and musical hyperarousal symptom cluster 344
signalling 63–64 and metabolism 345–349
sociology of music 371–372 post-traumatic stress disorder see post-traumatic
somatic receptors, and impulses 38 stress disorder (PTSD)
songbirds, communicative learning in 159 structured music, enhancement of communicative
songs musicality 433–435
see also singing subsong 50
animal rituals 52 ‘Summertime’ (Gershwin) 573
baby 188–189 supplementary motor area (SMA) 167, 170
developmental disabilities, children with 430 suprasegmental structures 32
grieving songs 34 surprise, expression of in infants 193
nostalgic songs 34 syllabic phenomenon 42
plastic song 50 syllables 453, 454
and poetry 107 symbolic exchanges 568–569
subsong 50 symbolic species, humans as 33
vocal learning for 50 symbolization
sound-making, coordinated act of 39
Cats’ Chorus trio 269, 271, 272, 275–277 clinical pathways of 385–387
Red Hat trio 269, 271, 273–275 defined 377
Spanish Civil War (1936–39) 331 synrhythmia 203, 561
spectrograph 3, 302 syntactic phrase formation 42
speech
see also language tau theory
classroom 449–452 see also general tau theory
contrasting of utterance 455 communication 264
examples, in classroom 452–459 gaps, tau of 86
gestalts 455, 456 hypotheses 88–89
infant-directed see infant-directed speech melody 88
of infants see infant vocalizations nervous system, tau in 87–88
maternal depression 221 tau-coupling gaps 86–87, 264
and music 154–161 tauG analysis
EEG and MEG studies 156–159 definition of tauG 86–87
innateness and plasticity, at different temporal f0 glides 90, 92
scales 159–160 jazz duets 99
PET and fMRI studies 155–156 measurement of tauG 87
origins of music 19 procedure 92
pitch 456 tauG in musical expression 88
and ritual 50 and tauG-guidance 87
syllables 453, 454 Tchaikovsky, Pyotr Ilyich 96
time devoted to 250 teaching of music
spike-rate data, monkeys 87 group work 510–511
SPM (statistical parametric mapping) 170 ideal setting 509–510
SSR (steady-state response) 166 irreversibility 496–499
statistical parametric mapping (SPM) 170 musical process, entering into
steady-state response (SSR) 166 through improvisation 505–506
Steedman, M 554 through playing with breath and
Stern, Daniel 4 vocalization 508–509
on emergent moment 382 through playing with communication 501–505
on emergent self 430 through pulsation 506–508
on intensity contours 509 natural and personal environment 498–499
‘Tears of the Red Candle’ (song) 570, 571 on intersubjectivity 70, 263–264, 496
technology, music as 73 on music therapy 357, 358
tempo 432, 554 on regulation types 203
temporal aspects of music, studying 165–171 on speech addressed to infants 284
temporal synchronization 21 on synrhythmia 561
termite fishing 46 trios of infants 269–277
Theodorakis, Mikis 190 trombone f0 glides 94–95
theory of mind (ToM) 49 truth values, and language 63, 68
Tillman, J 470 Turner, Victor, on play and drama 4, 47, 106,
timbre 3, 100, 214–215 131, 370, 417, 467
timbre in childhood 112
and brain responses 164
in communicative musicality 69, 379 Upper Palaeolithic period 31
and quality of the human voice 85, 212 URPM (Unité de Recherche en Psychologie de la
and communication of emotion 218, 379, Musique) 6
394, 539 ur-semantics 107
and energy of expression in time,
complementing rhythm 189, 555, vagal activity, breathing 341
558, 560 Vedic culture, India 48
in infant awareness 188, 469 Vedic texts 186
of a musical instrument, and learned Venda society, southern Africa 71
technique 171, 567 video documentation, mother-infant interaction
and learning to speak 214, 215, 283 254–255
in mother-infant communication 3, 4, 265, Video-Logger Event Recorder 192
305, 466 violin playing 86
as one part of performance 147, 603 vitality affect 122, 309, 539
in speech 595 vitality contours 4
and tau guidance 86 vocal learning 45, 50–51
and teacher talk 449 vocal play 74, 432
in therapy 434, 440 vocal/voice quality, see timbre
timing, see chronobiology of music, and IMP in communication with infants 4, 214–215,
coordinated interpersonal 215 282–293
expressive 306–307, 314 and communicative musicality 4
interactive, mother-infant interaction 244 in classroom talk 453, 455, 459
intersubjective, and improvisation 305–310 richness of human voice 212
narratives 215 variation in mother’s voice with mood 224
rhythms 201 voice appreciation, subcortical systems of 117
vocal 215 voluntary breathing 341
ToM (theory of mind) 49 vowels
tomography, defined 148–149 and communication of emotion and meaning
tones 42 211–213
tonotopy in cerebral cortex 161 defined 213
transferable ownership 472 functions, learning of speech 213–214
tree analogy, origins of music 17–27 meanings in mother-child interactions
anxiety control 25–26 (Japan and Scotland) 219–222
blossom 17–18, 19–21 and ‘quasi-resonant nuclei’ 214
bole 18, 22–25 sounds of, and play 223–224
burl 18, 21–22 Vygotsky, LS, social interaction and cultural
leaves 17, 18–19 learning 216, 461
root 18
Trehub, SE, infants’ musical sounds and their waltz 551, 554
perception of musical features 2, 109, 187–189, War Child Netherlands 334
212, 215, 221, 247–249, 283, 285, 379, 469, 475, War Child UK 333, 334
486, 518, 595 Washington Agreement (1994) 335
Trevarthen, Colwyn 45 Weber, Max 19
and Bebé Babá (project) 588 Werner, Heinz 557
on communicative musicality 373, 431 Wernicke area, language 155
on community music therapy 360 Western musical practices 66
on depression in mothers, effects of 289 whale song 50
on IMF 119, 120, 553 Williams syndrome musical abilities 115–116
on IMP 8, 128, 187 Wittmann, Marc 6, 553, 554
on infants and musicality 265 work ethic, aesthetic community 416
Wray, Alison 54 zones of conflict, music for children in, see Spanish
Wu, Amy 565, 569–573 Civil War, War Child
Yeats, William Butler 17 historical background 331
project, background to 333–334
Zatorre, RJ, emotion systems of brain in rhythm and research, assessment and responsibility 334–335
music 113, 117, 118, 120–124, 155, 162, 552, 553 war and trauma 331–332
