
The Bloomsbury Handbook of Music Production
Edited by Andrew Bourbon and
Simon Zagorski-Thomas
BLOOMSBURY ACADEMIC
Bloomsbury Publishing Inc
1385 Broadway, New York, NY 10018, USA
50 Bedford Square, London, WC1B 3DP, UK

BLOOMSBURY, BLOOMSBURY ACADEMIC and the Diana logo are trademarks of Bloomsbury Publishing Plc

First published in the United States of America 2020

Volume Editors’ Part of the Work © Andrew Bourbon and Simon Zagorski-Thomas, 2020

Each chapter © of Contributor

Cover design: Louise Dugdale


Cover image © Simon Zagorski-Thomas

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage or retrieval system, without prior permission in writing from the publishers.

Bloomsbury Publishing Inc does not have any control over, or responsibility for, any
third-party websites referred to or in this book. All internet addresses given in
this book were correct at the time of going to press. The author and publisher
regret any inconvenience caused if addresses have changed or sites have
ceased to exist, but can accept no responsibility for any such changes.

Whilst every effort has been made to locate copyright holders the publishers
would be grateful to hear from any person(s) not here acknowledged.

Library of Congress Cataloging-in-Publication Data


Names: Bourbon, Andrew, editor. | Zagorski-Thomas, Simon, editor.
Title: The Bloomsbury handbook of music production / edited by Andrew
Bourbon and Simon Zagorski-Thomas.
Description: New York : Bloomsbury Academic, 2020. | Series: Bloomsbury handbooks |
Includes bibliographical references and index. | Summary: “A summary of current research
on the production of stereo and mono recorded music”– Provided by publisher.
Identifiers: LCCN 2019035761 (print) | LCCN 2019035762 (ebook) | ISBN 9781501334023
(hardback) | ISBN 9781501334030 (epub) | ISBN 9781501334047 (pdf)
Subjects: LCSH: Sound recordings–Production and direction. | Popular music–Production
and direction. | Sound–Recording and reproducing–History.
Classification: LCC ML3790 .B645 2020 (print) | LCC ML3790 (ebook) | DDC 781.49–dc23
LC record available at https://lccn.loc.gov/2019035761
LC ebook record available at https://lccn.loc.gov/2019035762

ISBN: HB: 978-1-5013-3402-3
ePDF: 978-1-5013-3404-7
eBook: 978-1-5013-3403-0

Typeset by Integra Software Services Pvt. Ltd.

To find out more about our authors and books visit www.bloomsbury.com
and sign up for our newsletters.
Contents

List of Figures  ix
List of Tables  x
Notes on Contributors  xi

Introduction 
Andrew Bourbon and Simon Zagorski-Thomas  1

Part I  Background
1 Recorded Music 
Simon Zagorski-Thomas  7
2 Authenticity in Music Production 
Mike Alleyne  19
3 How to Study Record Production 
Carlo Nardi  33

Part II  Technology


4 From Tubes to Transistors: Developments in
Recording Technology up to 1970 
Albin Zak III  53
5 Transitions: The History of Recording Technology
from 1970 to the Present 
Paul Théberge  69
6 How Does Vintage Equipment Fit into a Modern
Working Process? 
Anthony Meynell  89

Part III  Places


7 Recording Studios in the First Half of the
Twentieth Century 
Susan Schmidt Horning  109
8 Recording Studios since 1970 
Eliot Bates  125

Part IV  Organizing the Production Process


9 Information, (Inter)action and Collaboration
in Record Production Environments 
M. Nyssim Lefford  145
10 Creative Communities of Practice: Role Delineation
in Record Production in Different Eras and across
Different Genres and Production Settings 
Tuomas Auvinen  161
11 Pre-Production 
Mike Howlett  177

Part V  Creating Recorded Music


12 Songwriting in the Studio 
Simon Barber  189
13 The Influence of Recording on Performance:
Classical Perspectives 
Amy Blier-Carruthers  205
14 Welcome to the Machine: Musicians, Technology and
Industry 
Alan Williams  221

15 Studying Recording Techniques 
Kirk McNally and Toby Seay  233
16 Materializing Identity in the Recording Studio 
Alexa Woloshyn  249

Part VI  Creating Desktop Music


17 Desktop Production and Groove 
Anne Danielsen  267
18 The Boom in the Box: Bass and Sub-Bass in
Desktop Production 
Robert Fink  281
19 Maximum Sonic Impact: (Authenticity/
Commerciality) Fidelity-Dualism in
Contemporary Metal Music Production 
Mark Mynett  293
20 Desktop Production and Commerciality 
Phil Harding  303
21 Audio Processing 
Michail Exarchos (aka Stereo Mike) and
Simon Zagorski-Thomas  317

Part VII  Post-Production


22 Studying Mixing: Creating a Contemporary
Apprenticeship 
Andrew Bourbon  337

Part VIII  Distribution


23 Producer Compensation in the Digital Age 
Richard James Burgess  351
24 Evolving Technologies of Music Distribution:
Consumer Music Formats – Past, Present
and Future 
Rob Toulson  367
25 Listening to Recorded Sound 
Mark Katz  383
26 Interpreting the Materials of a Transmedia
Storyworld: Word-Music-Image in Steven Wilson’s
Hand. Cannot. Erase. (2015) 
Lori A. Burns and Laura McLaren  393

Index  405
Figures

15.1 Publication dates of selected sound-recording textbooks  237


17.1 Amplitude graph and spectrogram of Snoop Dogg’s ‘Can I Get A Flicc
Witchu’  273
17.2 Sidechain pumping example  276
18.1 The dbx Model 100 ‘Boom Box’ Subharmonic Synthesizer  284
18.2 A kick drum enhancer using virtual mid/side filtering for dynamic
equalization  285
18.3 Complete signal path for the Brainworx bx_subsynth plug-in  286
18.4 Sonic Academy KICK 2 drum synthesizer, main control panel  289
20.1 Phil Harding commercial pop EQ guide 2018  311
21.1 The Antares Auto-Tune Realtime UAD plug-in window showing settings
used by one of the authors on the lead rap voice for a recent Trap remix  321
21.2 Flex Pitch mode enabled on a distorted bass guitar track in Logic Pro X (10.4.1),
zooming in on both its Workspace and – the more detailed – Editor views  322
21.3 Ableton Live’s Clip View illustrating a number of available Warp modes and the
Transpose function  323
24.1 US music album sales from 1973 to 2018 (millions of units)  368
24.2 US music sales revenue from 1996 to 2018 (millions of dollars)  368
24.3 US music album and singles sales for CD and download from 2004 to 2018
(millions of units)  375
24.4 US vinyl sales between 1989 and 2018  378
Tables

15.1 List of texts  238


26.1 Release timeline of the Hand. Cannot. Erase. materials and tour  395
Contributors

Mike Alleyne is a professor in the Department of Recording Industry at Middle Tennessee State University (MTSU). He is the author of The Encyclopedia of Reggae: The Golden Age
of Roots Reggae (2012) and a contributing editor of Rhythm Revolution: A Chronological
Anthology of American Popular Music – 1960s to 1980s (2015). He has lectured
internationally and has published in numerous journals, magazines and essay collections.
He was also a consultant and expert witness for the estate of Marvin Gaye in the 2015
copyright infringement trial involving the 2013 hit song ‘Blurred Lines’. He is a writer and
publisher, member of ASCAP and PRS, and currently co-edits the SAGE Business Case
Series in Music Marketing.

Tuomas Auvinen is a musicologist, musician and educator teaching music production and ethnographic methodology courses at the University of Turku, among other places.
He completed his PhD in musicology at the University of Turku (2019) and is currently
researching the relationship between music production and artificial intelligence. He is
a songwriter, arranger, producer and live and studio musician performing on the viola,
guitar, bass, percussion, keyboards and vocals primarily in his native Finland. He is a board
member of the Finnish Society for Ethnomusicology and an editor of its peer-reviewed
journal, the Finnish Yearbook of Ethnomusicology. His publications have appeared in the
Journal on the Art of Record Production, the Finnish Yearbook of Ethnomusicology and
Musiikki.

Simon Barber is a research fellow in the Birmingham Centre for Media and Cultural
Research at Birmingham City University. His work focuses primarily on songwriting
and the relationships between creative workers and industry. He is currently leading
the Songwriting Studies Research Network, a two-year project funded by the Arts and
Humanities Research Council (AHRC), and has published on the subject in journals such
as Popular Music and Society and The European Journal of Cultural Studies. Simon is also
the producer and co-presenter of the popular Sodajerker podcast, which features interviews
with some of the most successful songwriters in the world.

Eliot Bates is Assistant Professor of Ethnomusicology at the Graduate Center of the City
University of New York. He is an ethnomusicologist and technology studies scholar whose
research examines recording production and the social lives of musical instruments and
studio technologies. A graduate of UC Berkeley (2008) and ACLS New Faculty Fellow
(2010), he previously taught at the University of Birmingham (UK), Cornell University
and the University of Maryland, College Park. His publications include Digital Tradition:
Arrangement and Labor in Istanbul’s Recording Studio Culture (2016), Music in Turkey:
Experiencing Music, Expressing Culture (2011), and Critical Approaches to the Production
of Music and Sound co-edited with Samantha Bennett (2018). He is also a performer and
recording artist of the 11-stringed-oud.

Amy Blier-Carruthers is Lecturer in Postgraduate Studies at the Royal Academy of Music, and Teaching Fellow in Performance at King’s College London. She read music at King’s
College London, concurrently undertaking practical studies in violin at the Royal Academy
of Music. Her work is published by Oxford University Press and Routledge, and she has
collaborated with colleagues at the Royal College of Art on a book, Walking Cities: London.
She is co-investigator for the AHRC Digital Transformations project ‘Classical Music
Hyper-Production and Practice as Research’, is on the steering committee of the Institute
of Musical Research, and has worked for the Royal College of Music and the University of
Cambridge.

Andrew Bourbon is Subject Area Lead of Music Technology at Huddersfield University. He previously taught at the London College of Music, UWL and Birmingham City University.
He completed his PhD at Birmingham University with Professor Jonty Harrison. He is
also a producer, sound engineer, composer and musician and has produced and mixed
records for The Waletones, Joe Wander, Lewis Bootle, Grupo Lokito and Alice Auer. He
participated, with Simon Zagorski-Thomas, in the AHRC-funded Performance in the
Studio research network and, with Amy Blier-Carruthers, Emilie Capulet and Simon
Zagorski-Thomas, in the Classical Music Hyper-Production project.

Richard James Burgess is President and CEO for the American Association of Independent
Music (A2IM) and has produced, recorded and performed on many gold, platinum and
multi-platinum albums. He was previously Associate Director of Business Strategies at
Smithsonian Folkways Recordings where he produced Jazz: The Smithsonian Anthology.
He is known for his pioneering work with synthesizers, computers, sampling, EDM, New
Romantics and early house music, as the inventor of the SDSV drum synthesizer and for
coining the music genre terms EDM and New Romantic. His most recent publications
include The Art of Music Production: The Theory and Practice, 4th edition (2013) and The
History of Music Production (2014).

Lori A. Burns is Professor of Music at the University of Ottawa. Her articles have been
published in edited collections and leading journals, such as Popular Music, The Journal for
Music, Sound, and Moving Image, Studies in Music, and The Journal for Music Theory. Her
book Disruptive Divas: Critical and Analytical Essays on Feminism, Identity, and Popular
Music (2002) won the Pauline Alderman Award from the International Alliance for Women
in Music (2005). She is co-editor of The Pop Palimpsest with Serge Lacasse (2018) and The
Bloomsbury Handbook of Popular Music Video Analysis with Stan Hawkins (2019) as well
as series co-editor of the Ashgate Popular and Folk Music Series.

Anne Danielsen is Professor of Musicology and Director of the RITMO Centre for
Interdisciplinary Studies in Rhythm, Time and Motion at the University of Oslo. She
has published widely on rhythm, digital technology, and mediation in post-war popular
music and is the author of Presence and Pleasure: The Funk Grooves of James Brown and
Parliament (2006), Digital Signatures: The Impact of Digitization on Popular Music Sound
with Ragnhild Brøvig-Hanssen (2016) and the editor of Musical Rhythm in the Age of
Digital Reproduction (2010).

Michail Exarchos (aka Stereo Mike) is a hip-hop musicologist and award-winning rap
artist (MTV Best Greek Act 2008), including a nomination for an MTV Europe Music
Award. He is the course leader for Music Mixing and Mastering at London College of
Music (University of West London) where he is carrying out doctoral research on the
relationship between sample-based hip-hop and vintage record production techniques. His
publications include articles for Popular Music and the Journal of Popular Music Education.
His self-engineered and produced album Xli3h was included in the thirty best Greek hip-
hop albums of all time (SONIK magazine) and he is the first Greek artist ever to perform
at South by Southwest (2013).

Robert Fink is a past chair of the UCLA Musicology department, and currently Chair
of the UCLA Herb Alpert School of Music’s Music Industry Program. His publications
include Repeating Ourselves (2005) and The Relentless Pursuit of Tone (2018). His work
on popular music, minimalist experimentalism and post-1965 music and politics has
appeared in the Journal of the American Musicological Society, The Oxford Handbook of
Opera, the Cambridge Opera Journal and the recent collections Rethinking Reich (2019)
and Einstein on the Beach: Opera Beyond Drama (2019). Before coming to UCLA, Fink
taught at the Eastman School of Music (1992–97), and has been a visiting professor at Yale
University (2006) and a Fellow at the Stanford Humanities Center (1998–99).

Phil Harding joined the music industry at the Marquee Studios in 1973, engineering for
the likes of The Clash, Killing Joke and Matt Bianco. In the 1980s, he mixed records for
Stock, Aitken & Waterman, Bananarama, Rick Astley, Depeche Mode, Erasure, Pet Shop
Boys and Kylie Minogue. In the 1990s, he set up his own facility at The Strongroom with
Ian Curnow and further hits followed. Harding has recently worked for Samantha Fox,
Belinda Carlisle and Curiosity with his new team PJS Musicproductions.com. He is Co-
Chairman of JAMES (Joint Audio Media Education Services) and completed his doctorate
in Music Production at Leeds Beckett University in 2017. His most recent publication is
Pop Music Production (2019).

Mike Howlett was born in Lautoka, Fiji. During the 1970s he played bass with space-funk
group Gong. Leaving Gong in 1976, Mike put together his own group, Strontium-90, who
went on to enormous success as The Police. Mike began producing records in the 1980s
and had a string of top ten hits around the world. Mike was a founding member and former
chair of the Record Producers Guild (now known as MPG, the Music Producers Guild).
Following his PhD, Mike was Discipline Leader of Music and Sound at Queensland
University of Technology. He is currently semi-retired, playing occasional gigs with his
space-funk improvisational group PsiGong and re-mixing live multi-track recordings for
an upcoming Gong box set.

Mark Katz is Professor of Music at the University of North Carolina at Chapel Hill and
Founding Director of the hip-hop cultural diplomacy program, Next Level. His publications
include Capturing Sound: How Technology Has Changed Music (2004, rev. 2010), Groove
Music: The Art and Culture of the Hip-Hop DJ (2012), and Build: The Power of Hip Hop
Diplomacy in a Divided World (2019). He is co-editor of Music, Sound, and Technology
in America: A Documentary History (2012) and former editor of the Journal of the Society
for American Music. In 2015 he was recognized by the Hip-Hop Education Center in its
inaugural awards ceremony. In 2016 he received the Dent Medal from the Royal Musical
Association.

M. Nyssim Lefford is a researcher and teacher at Luleå University of Technology in Sweden, in the Audio Technology program. She studied music production and engineering and
film scoring at Berklee College of Music. She received her master’s from the Massachusetts
Institute of Technology’s Media Lab for work on network music collaboration, and her
PhD for investigations into the perceptions of music creators in situ. As a researcher, she
continues to explore the unique creative, perceptual and cognitive processes of music
production and the ecology of recording studio environments, specifically, the nature
of production intelligence. Having worked in both industry and academia, in numerous
contexts, she has developed a breadth of interdisciplinary perspectives and methods.

Laura McLaren is a PhD student in musicology at the University of Toronto. Her research
interests are in popular music, feminist theory, music video and digital media. She
completed her Master’s of Arts with Specialization in Women’s Studies at the University of
Ottawa and completed her thesis ‘The Lyric Video as Genre: Definition, History, and Katy
Perry’s Contribution’ under the direction of Dr Lori Burns. She has presented her research
at IASPM-CAN and contributed a chapter to the forthcoming Bloomsbury Handbook of
Popular Music Video Analysis edited by Lori Burns and Stan Hawkins.

Kirk McNally is Assistant Professor of Music Technology in the School of Music at the
University of Victoria, Canada. He is the program administrator for the undergraduate
combined major program in music and computer science and the graduate program in
music technology. Kirk is a sound engineer who specializes in popular and classical music
recording, and new music performances using electronics. He has worked in studios in
Toronto and Vancouver, with artists including REM and Bryan Adams. His research and
creative work has been supported by the Deutscher Akademischer Austausch Dienst
(DAAD), the Canada Council for the Arts, the Banff Centre for Arts and Creativity and
the Social Sciences Humanities Research Council of Canada (SSHRC).

Anthony Meynell is a record producer, songwriter, performing musician and academic from London. After completing his PhD at the London College of Music, Anthony has
continued to combine extensive industry experience as a record label professional and
performer with lecturing, delivering programmes in popular music employing practice-
led research techniques to studies of performance in the studio, production and history of
technology. His research interest focuses on reenactment of historic recording sessions as
a process to uncover forgotten tacit working practices.

Mark Mynett is a record producer as well as live music front-of-house engineer, and
has worked as Senior Lecturer in Music Technology and Production at the University
of Huddersfield, UK, since 2006. Mark initially had an extensive career as a professional
musician with six worldwide commercial albums with several years of touring. This
was followed by a career as a self-employed record producer and front-of-house sound
engineer. In addition to teaching music technology and his own production work, Mark
frequently writes articles for publications such as Sound on Sound and Guitar World (US).
He is the author of Metal Music Manual: Producing, Engineering, Mixing and Mastering
Contemporary Heavy Music (2017).

Carlo Nardi received his PhD in Sciences of Music from the University of Trento in
2005. He is Research Associate at Rhodes University and Research Assistant at the Free
University of Bozen. He teaches methodology, arts marketing and music production at
Centro Didattico Musica Teatro Danza (CDM). He focuses on the use of technology from
a sensory perspective, authorship in relation to technological change, the organization of
labour in music-making and sound for moving images. Between 2011 and 2013 he was
General Secretary of IASPM (International Association for the Study of Popular Music). In
addition to academic research and teaching, he is also a producer, composer and performer.

Susan Schmidt Horning is Associate Professor of History at St John’s University in Queens, New York. She is the author of Chasing Sound: Technology, Culture, and the Art of Studio
Recording from Edison to the LP (2013; 2015), and her work has appeared in the journals
ICON and Social Studies of Science, and in Music and Technology in the Twentieth Century
(2002) and The Electric Guitar: A History of an American Icon (2004). From her teenage
all-girl rock band, The Poor Girls, to the 1970s power-trio Chi-Pig, she played in northeast
Ohio and New York during the heady days of 1960s and 1970s rock and punk. Her current
project is a global study of all-girl rock bands in the 1960s.

Toby Seay is Professor of Recording Arts and Music Production at Drexel University. As
an engineer, he has worked on multiple gold and platinum certified recordings including
eight Grammy winners. He is a voting member of the Recording Academy (Grammys);
President of the International Association of Sound and Audiovisual Archives (IASA)
2017–2020; and Chair of the Coordinating Council of Audiovisual Archives Associations
(CCAAA) 2020–2021. Selected publications include: ‘The Recording’ in The Bloomsbury
Handbook of the Anthropology of Sound (2019), ‘Sonic Signatures in Record Production’
in Sound as Popular Culture (2016) and ‘Capturing That Philadelphia Sound: A Technical
Exploration of Sigma Sound Studios’ in the Journal on the Art of Record Production, no. 6
(2012).

Paul Théberge is a Canada Research Professor at Carleton University, Ottawa. He is cross-appointed to the Institute for Comparative Studies in Literature, Art and Culture (where he
served as Director, 2008–2011) and to the School for Studies in Art and Culture (Music).
He has published widely on issues concerning music, technology and culture. He is author
of Any Sound You Can Imagine: Making Music/Consuming Technology (1997), which
was the recipient of two academic book awards, and co-editor of Living Stereo: Histories
and Cultures of Multichannel Sound (2015). In 2012, he produced and engineered a set
of experimental recordings from the Glenn Gould archive – Glenn Gould: The Acoustic
Orchestrations.

Rob Toulson is Founder and Director of RT60 Ltd, specializing in technology development
for the audio and music industries. He was previously Professor of Creative Industries and
Commercial Music at the University of Westminster and Director of the Cultures of the
Digital Economy Research Institute at Anglia Ruskin University. Rob is a music producer,
sound designer and studio engineer who has worked with many established music artists
including Talvin Singh, Mediaeval Baebes, Ethan Ash and Janet Devlin. Rob is a successful
software engineer; he developed and co-produced the groundbreaking Red Planet EP
and iPhone music app for Daisy and The Dark. Rob is also the inventor of the unique
iDrumTune iPhone app.

Alan Williams is Professor of Music, Chair of the Department of Music and Coordinator
of Music Business at the University of Massachusetts Lowell. He has published chapters
in The Art of Record Production (2012), The Oxford Handbook of Applied Ethnomusicology
(2015) and Critical Approaches to the Production of Music and Sound (2018). He also writes,
records and performs with his ensemble, Birdsong At Morning.

Alexa Woloshyn is Assistant Professor of Musicology at Carnegie Mellon University. She holds a PhD in musicology from the University of Toronto. Her research focuses on
how electronic, physiological and sociocultural technologies mediate the creation and
consumption of musical practices in both art and popular music. Her current research
projects examine performance practice in live electronic music and Indigenous musicians’
use of mediating technologies to construct and interrogate notions of ‘modern’ Indigeneity.
Her work has been published in the Journal of Popular Music Studies, Intersections: Canadian
Journal of Music, Circuit: musiques contemporaines, eContact!, The American Indian Culture
and Research Journal, TEMPO, and the Journal on the Art of Record Production.

Simon Zagorski-Thomas is Professor at the London College of Music (University of West London, UK) and founded and runs the 21st Century Music Practice Research Network.
He is series editor for the Cambridge Elements series and Bloomsbury book series on
21st Century Music Practice. He is ex-chairman and co-founder of the Association for the
Study of the Art of Record Production. He is a composer, sound engineer and producer and
is, currently, writing a monograph on practical musicology. His books include Musicology
of Record Production (2014; winner of the 2015 IASPM Book Prize) and the Art of Record
Production: Creative Practice in the Studio co-edited with Katia Isakoff, Serge Lacasse and
Sophie Stévance (2019).

Albin Zak III is Professor of Music at the University at Albany (SUNY). He is a composer,
songwriter, record producer and musicologist. His articles and reviews have appeared in
the Journal of the American Musicological Society, Journal of the Society for American Music,
Current Musicology, Journal of American History and in several volumes of collected essays.
He is the author of The Poetics of Rock: Cutting Tracks, Making Records (2001) and I Don’t
Sound Like Nobody: Remaking Music in 1950s America (2010). His recordings include the
albums An Average Day, Waywardness and Inspiration and Villa Maria Road.
Introduction
Andrew Bourbon and Simon Zagorski-Thomas

Organizing this book


It is important to consider what is not in this handbook as well as what it contains. We
can’t include everything and this introduction is about explaining what isn’t included as
well as what is and, hopefully, why. In many ways the production of the book has been a
metaphor for some aspects of the production of much of the recorded music in the last
fifty years. The book would not have been made without the impetus and facilitation
of a multinational corporation. Leah, our commissioning editor at Bloomsbury (our
metaphorical A&R person), has ceded overall creative control to us, Simon and Andrew,
and yet has been liaising with us about the timing and logistics of the production process
throughout. The writing of the book has been a collaborative process that involved us
managing the individual creative contributions of many actors. As editors we had a
vision of what we wanted and we invited a broad range of actors to participate. We
knew all of them to be highly competent and experts in their field but there were other
contributing factors such as status, experience, innovation and novelty as well. Not
everything went to plan. It took longer than expected, some participants dropped out for
various reasons and some chapters ended up being quite different to our expectations.
There were negotiations and edits and the precise nature of the book, including this
introduction, emerged out of that process. Of course, it was fundamentally shaped
by our initial ideas and vision, but it has also organically grown out of the process
of creation. And beyond this production and editing (post-production) process, once
we submit the ‘master’ to Bloomsbury there will be a further round of copy editing
and formatting – paralleling the mastering processes in audio recording – in order to
create the final ‘product’. While we do not want to take this metaphor too far – there
are, of course, major differences between both the creative processes and the completed
artefacts – it does serve to remind us that the end product of any creative process always
seems to have been inevitable in retrospect but was always contingent on the many
vagaries of life and collaboration.

In the proposal document for this book we provided the following as a brief outline of
the book’s goals:
Leading and emerging academics in the field of music production have been brought
together in this handbook to discuss how their cutting edge research fits into the broader
context of other work in their specialism. Examining the technologies and places of music
production as well as the broad range of practices – organization, recording, desktop
production, post-production and distribution – this edited collection looks at production as
it has developed around the whole world and not just in Anglophone countries. In addition,
rather than isolating issues such as gender, race and sexuality in separate chapters, these
points are threaded through the entire text.

Obviously the second sentence outlines the sectional structure, but it might be useful
to outline the reasons for these choices. For example, we could have taken a more
chronological approach, breaking the book into a series of decades or quarter centuries.
We could have chosen to divide the book according to professional (or other) roles such
as engineer, producer, artist, session player, equipment maker, etc. However, there is an
agenda at the heart of this book that is reflected in Bloomsbury’s stated aims for the series:
Bloomsbury Handbooks is a series of single-volume reference works which map the
parameters of a discipline or sub-discipline and present the ‘state of the art’ in terms of
research. Each Handbook offers a systematic and structured range of specially commissioned
essays reflecting on the history, methodologies, research methods, current debates and
future of a particular field of research. (Bloomsbury n.d.)

The idea of mapping the parameters of the discipline and presenting the ‘state of the art’ in
terms of research is complicated by the fact that this is a practical and vocational discipline.
As Zagorski-Thomas discusses in Chapter 1, there is a dichotomy between research that is
concerned with explaining the ‘state of the art’ of practice – essentially, how to do the job
well – and the ‘state of the art’ when it comes to understanding how production works. This
is also reflected in the various approaches that the contributors have taken in this volume.
It might be best understood in terms of two models of education that exist throughout
the music sector. What we will call the ‘traditional’ model is based on the idea of a novice
learning from an expert. This can be contrasted with what we might call the ‘university’
model of using theoretical knowledge to understand how a process works and looking
at new ways of working based on that theory. Of course, neither of these models exists
in a pure form, especially not in the current ‘state of the art’. Universities employ a lot of
dual-professionals who bring extensive knowledge about cutting edge industry practice –
and about the tried and tested expertise from the ‘golden age’ of recording. There is also
more and more research going on about how communication, interaction and creativity
work in the studio (see for example McIntyre 2012; Lefford 2015; Bennett and Bates 2018;
Thompson 2018) and how we listen to and interpret recorded music (e.g. Moylan 2007;
Zagorski-Thomas 2010; Moore 2012; Dibben 2013; Zagorski-Thomas 2014a, b, 2018).
There is extensive literature that combines some scientific knowledge about the nature of
sound with forms of generic technical manuals, for example explaining both the ‘typical’
controls on a dynamic compressor and how they affect an audio signal (e.g. Owsinski
1999; Case 2007; Izhaki 2008; Hodgson 2010; Savage 2011). And there are a broad range
of historical and ethnographic studies that outline past practice in varying amounts of
detail (e.g. Zak 2001; Meintjes 2003; Porcello and Greene 2004; Ryan and Kehew 2006;
Zak 2010; Williams 2012). What there is not much literature on is the connection between
practice and aesthetics. And again, this is an issue which spans the whole of the practical
music education sector. Vocational education should not only be a matter of learning how
experts do their job (and how they did it in the past) but should also be about providing
novices with a theoretical map that lets them think about what the musical objectives of the
song are, decide how that might be embodied in a sonic metaphor or cartoon (Zagorski-
Thomas 2014b), and be able to identify the tools and techniques to achieve it.
So, to rewind for a second, the agenda we mentioned that is reflected in the Bloomsbury
Handbook’s stated aims is to find ways to bridge between the vocational and the academic,
the practical and the theoretical, and the ‘how’ and the ‘why’. Although more or less everyone
in this book can be described as a dual-practitioner of some sort – either simultaneously
or sequentially as both a creative practitioner and an analytical and reflective researcher –
they all also do it in different ways and to different extents. And, also in different ways, we
all negotiate between the ‘traditional’ and the ‘university’ modes of learning and knowledge
that were outlined earlier. The decision about structuring the book was, therefore, based
on finding a way to best achieve those aims of bridging between various differences and
dichotomies. These eight headings allowed us to mix up contributors from different
backgrounds and with different approaches in the same section. Hopefully, although
the contributions can all stand up on their own, this structure will suggest connections,
differences, contrasts and complementarity that will make the book greater than simply
the sum of all its parts. In order to reinforce this approach we have prefaced each of the
sections with a short discussion of the topic in broad terms and the ways that the various
chapters link together.

Bibliography
Bennett, S. and E. Bates (2018), Critical Approaches to the Production of Music and Sound,
New York: Bloomsbury Publishing.
Bloomsbury (n.d.), ‘Bloomsbury Handbooks’. Available online: https://www.bloomsbury.com/
us/series/bloomsbury-handbooks (accessed 5 May 2019).
Case, A. U. (2007), Sound FX: Unlocking the Creative Potential of Recording Studio Effects,
Burlington, MA: Focal Press.
Dibben, N. (2013), ‘The Intimate Singing Voice: Auditory Spatial Perception and Emotion
in Pop Recordings’, in D. Zakharine and N. Meise (eds), Electrified Voices: Medial, Socio-
Historical and Cultural Aspects of Voice Transfer, 107–122, Göttingen: V&R University
Press.
Hodgson, J. (2010), Understanding Records: A Field Guide to Recording Practice, New York:
Bloomsbury Publishing.
Izhaki, R. (2008), Mixing Audio: Concepts, Practices and Tools, Burlington, MA: Focal Press.
Lefford, M. N. (2015), ‘The Sound of Coordinated Efforts: Music Producers, Boundary Objects and Trading Zones’, Journal on the Art of Record Production, (10).
McIntyre, P. (2012), ‘Rethinking Creativity: Record Production and the Systems Model’,
in S. Frith and S. Zagorski-Thomas (eds), The Art of Record Production: An Introductory
Reader to a New Academic Field, Farnham: Ashgate.
Meintjes, L. (2003), Sound of Africa! Making Music Zulu in a South African Studio, Durham,
NC: Duke University Press.
Moore, A. F. (2012), Song Means: Analysing and Interpreting Recorded Popular Song, Farnham:
Ashgate.
Moylan, W. (2007), Understanding and Crafting the Mix: The Art of Recording, 2nd edn,
Burlington, MA: Focal Press.
Owsinski, B. O. (1999), The Mixing Engineer’s Handbook, 1st edn, Boston, MA: Artistpro.
Porcello, T. and P. D. Greene (2004), Wired for Sound: Engineering and Technologies in Sonic
Cultures, Middletown, CT: Wesleyan University Press.
Ryan, K. and B. Kehew (2006), Recording The Beatles, Houston, TX: Curvebender Publishing.
Savage, S. (2011), The Art of Digital Audio Recording: A Practical Guide for Home and Studio,
New York: Oxford University Press.
Thompson, P. (2018), Creativity in the Recording Studio: Alternative Takes, Leisure Studies in a
Global Era, New York: Springer International Publishing.
Williams, S. (2012), ‘Tubby’s Dub Style: The Live Art of Record Production’, in S. Frith and
S. Zagorski-Thomas (eds), The Art of Record Production: An Introductory Reader to a New
Academic Field, Farnham: Ashgate.
Zagorski-Thomas, S. (2010), ‘The Stadium in Your Bedroom: Functional Staging, Authenticity
and the Audience Led Aesthetic in Record Production’, Popular Music, 29 (2): 251–266.
Zagorski-Thomas, S. (2014a), ‘An Analysis of Space, Gesture and Interaction in Kings of
Leon’s “Sex On Fire” (2008)’, in R. von Appen, A. Doehring, D. Helms and A. Moore (eds),
Twenty-First-Century Pop Music Analyses: Methods, Models, Debates, Farnham: Ashgate.
Zagorski-Thomas, S. (2014b), The Musicology of Record Production, Cambridge: Cambridge
University Press.
Zagorski-Thomas, S. (2018), ‘The Spectromorphology of Recorded Popular Music: The
Shaping of Sonic Cartoons through Record Production’, in R. Fink, M. L. O’Brien and Z.
Wallmark (eds), The Relentless Pursuit of Tone: Timbre in Popular Music, New York: Oxford
University Press.
Zak, A. J. (2001), The Poetics of Rock: Cutting Tracks, Making Records, Oakland, CA:
University of California Press.
Zak, A. J. (2010), I Don’t Sound Like Nobody: Remaking Music in 1950s America, Tracking Pop,
Ann Arbor, MI: University of Michigan Press.
Part I
Background

The three chapters here deal with the questions of what recorded music is, the ways in which
it may or may not be authentic, and the problems involved in the research and study of
music production. But the question of background and context goes way beyond the scope
of what could be fitted into a single volume, let alone into the chapters within this single
part. Almost by definition, this part was going to be less to do with the specifics of music
production and more to do with the nature of music, the non-technical and non-musical
factors that influence how it is made, and the types of knowledge we can have about the
subject. While these three chapters address those ideas quite clearly and coherently, they
certainly do not do so exhaustively. Indeed, the chapters in Part VIII, on Distribution, can
be seen to be as equally relevant as background or context. Several of the contributors in
the book, although they are writing about a specific topic here, have professional lives that
reflect some of the complexities of this ‘background’. Mike Alleyne was an expert witness
in the Robin Thicke/estate of Marvin Gaye court case over the single ‘Blurred Lines’. The
influence of the law on creativity is understudied – possibly because it would be so difficult
to establish cause and effect – but record companies’ legal departments are having and have
had a huge influence on which records get made, get released and who gets the money (and
therefore creates incentives for future work). Richard James Burgess, through his work
with the Recording Academy and A2IM, has been an advocate for recognizing producers
as creative contributors to recordings who should receive royalties in the same way that
an artist does. There is also the issue of the way that flows of money in and around the
industry influence the types and quantities of recordings that are made and released. Mike
Howlett, separate to his contribution here, has written about this idea of the producer as a
nexus between the business interests of the record company and the creative and technical
interests of the artists and technicians.
A conceptual thread that goes through many of the chapters in this collection, as well
as the first two chapters in this part, is the idea of ownership and the way that the recorded
artefact does more than simply provide access to sounds. This relates not only to the ways
in which we interpret the ‘unnatural’ experience of sound without sight but also to the
ways in which it helps to shape our sense of identity. This thread also travels through
Alexa Woloshyn’s and Mark Katz’s chapters in relation to both identity and listening
practice. Indeed, this idea of the ways in which a recording industry affects and is affected
by its national and cultural context formed the basis for a series of radio programmes
that Zagorski-Thomas recently made for the BBC World Service called ‘How the World
Changed Music’. It demonstrated the many ways in which the creation of recorded music
was an integral part of a process of cultural development – the spread of a single language
across China, the development of colonial and post-colonial statements of resistance and
identity in the Democratic Republic of Congo and Okinawa, the changing gender roles in
India and China, and the problems with attempts at state control of music in Poland and
Cuba.
Although, of the three, Carlo Nardi’s chapter is the most clearly focused on the types
of knowledge we can have about music production and the recorded artefact, all three are
permeated with this question. Of course, as one would expect with a part that provides
a general background or context to the whole volume, these questions infuse all of the
chapters. And, as discussed in the introduction, this question of what a survey of the ‘state
of the art’ of research in this discipline should look like is at the heart of the design of the
structure of this volume.
1
Recorded Music
Simon Zagorski-Thomas

What is music?
There are a great many theories about the development, purpose and nature of music and,
while this is not the place to go too deeply into that discussion, it is important to discuss
some parameters and definitions at the start of this book. We all think we know exactly
what music is and yet most definitions run into trouble if they try to be universal rather
than personal. Mine is more a series of stories than a single definition.
Singing emerged as part of the bundle of activities that humans developed to encourage
and cement social cohesion and to use metaphors for emotion to synchronize the mood
between participants. It became part of the ritualization of life where we developed
special versions of everyday activities to imbue certain instances of them with additional
importance. Thus singing, chanting and poetry became special versions of speech that
allowed us to mark certain stories as more important than others. They allow us to
exaggerate certain features of speech – the types of energy that inflect the words with
additional meaning – and thus fall into the category of sonic cartoons (Zagorski-Thomas
2014). Instrumental music allows us to isolate the meaningful emotional energy from the
semantic meaning of language and to create a ritualized and schematic representation of
experience.
Being a representational system, it involves two layers of perception, interpretation
and appreciation – of the phenomenon that is being represented and of the way it is
represented. I can perceive, interpret and appreciate the semantic meaning of a poem
and the aesthetic way it is expressed. I can do the same with other forms of literature,
with visual art and with music. I can perceive, interpret and appreciate what I consider
to be the physical or emotional narrative that is being represented as well as the skill
and beauty with which it is represented. And there are three concurrent and interacting
modes of engagement (Middleton 1993) that we use in this process of interpretation and
appreciation:

1 The direct embodied responses of empathy and engagement (e.g. moving to a beat).
2 The learned and physiological subconscious metaphorical connections we make (e.g. hearing the sound of lethargic activity as sad or listless).
3 The conscious metaphorical connections that emerge from learning and problem solving (e.g. hearing structure such as verse/chorus or sonata form in a piece of music).

The ‘production’ activities of musicking (Small 1998) – composition, performance, arranging and staging – that have developed throughout the span of human history and
in the myriad strands of geography and culture, have been designed and nurtured to
encourage these three forms of engagement in various combinations in the ‘consumption’
activities – listening, dancing, worshipping and other forms of participation. It is only
through limited technologies such as the music box and, more recently, through recorded
music that the ‘production’ activities could be embodied in an artefact that allowed them
to be separated in time and space from the ‘consumption’ activities.

What is recorded music?


When Edison developed the recording process, he considered its use to be for recording
speech – as a way of replacing writing. It was to be a representational system that not
only represented the words but also the character and tone of the voice of the individual
person. It is interesting that its primary function soon became the representation of music,
a representational system in itself.
Obviously the term ‘record’ implies that there is a phenomenon or an occurrence and
that there is an artefact that constitutes a record of that phenomenon or occurrence. But
records are not about replication, they are about selecting particular features to measure
or represent. When I talk about a record, I am not talking about reliving a moment exactly
as it was. My medical record is not an exact copy of the history of my body’s health, it
is a record of pieces of information that various health professionals think might be
important to any future health professionals who have to assess my health or treat my
illnesses. Audio recording uses a mechanical process to translate the vibration of air
molecules at a particular place (usually at the diaphragm of a microphone/transducer)
into a representational system that allows other transducers (usually a speaker cabinet or
headphones) to recreate similar vibrations of air molecules in a different place and time.
The ‘realism’ of this representational system has been based on making the transducer
and the storage system for the representation at least as sensitive as the human ear is to
frequency and dynamics. While we may have come quite close to this in terms of these
two features, there are other features of the live and active process of hearing that are
not being represented. There are very few moments in life when we are confused about
whether we are listening to ‘real’ or ‘recorded’ sound and, when we are, we can soon find
out if we want to. The ‘realism’ of recorded music is very limited. Indeed, many forms of
musical recording are about creating something that is clearer and more impactful than
the original moment – or they are about creating something that is an idealized version of
a musical idea.

In addition, even before the moment that electrical speaker systems were used to play
back these types of recording, they were also being used to create new forms of artificial
sound – sounds that were the result of the electrical circuitry in instruments like the
Theremin and the Ondes Martenot. Just as the representational system of music notation
allowed composers to create musical forms that were too complicated to hold in their heads
without the tool of notation, so too did the representational system of sound-as-voltage
allow creative musicians and technicians to produce sounds that were more precise and
ordered than those produced by the mechanical vibration of objects in the natural world.
Later in the twentieth century both art music and popular music developed approaches
and techniques that combined and manipulated these two sources of sonic representation –
recorded sound and electronically generated sound. The last forty years have witnessed a
steady process of integrating these two forms of sound generation into the technologies
of music production. Even in musical traditions that have an ideological averseness to
electronic musical instruments, the recording process now, more often than not, involves
electronically generated artificial reverberation. And, of course, multiple representations
of waveforms – collected from different angles and distances from the various sound
sources – are filtered, reshaped and merged together to create a new and highly artificial
representation of a sonic ‘event’. In traditions such as electroacoustic art music, hip-hop
and EDM, these twin strands of recording and construction and the ubiquitous processes
of waveform manipulation and reshaping come together to produce music that travels far
beyond the limits of what is possible in the acoustic realm of humans and objects making
air vibrate in a given environment.
The story of recorded music is often told as a progression from low fidelity to
high fidelity and, although we have moved beyond that narrative in some senses, the
development of 3-D sound and immersive audio is beset with those images of ‘realism’.
But do we want to be in control of our relationship with music? Do we want the
sound to alter as we move either our head position or our position in a room? Or do
we want to be subjected to a musical experience that has been created for us by the
artists? Of course, the answers to these types of rhetorical question tend to be ‘maybe’,
‘sometimes’ and ‘it depends’. The development of all forms of art and entertainment are
a constant negotiation between some collaborative group of makers, who are designing
and producing something for us to engage with, and the choices and decisions of the
audience member or consumer. They can choose whether, when and how they engage.
They can choose what aspect of the experience to focus their attention on at any given
moment, and their interpretation of what is happening and what it means will be a
unique product of their past experience and the decisions they make while engaging
with it. However, as Eric Clarke (2005) and others have pointed out, our interpretation
is not random. It is the result of our perception of affordances – what is possible and/or
more or less likely – in a given set of circumstances, whilst a representational system
manipulates our subject-position: it selects or distorts our visual or aural perspective, it
affords the perception of some features and not others, and it creates a chronological
narrative.

Sonic cartoons
The term sonic cartoons (rather than being about humour!) relates to the fact that
recordings are not realistic and that they are representations. And, along with the camera
obscura, photography, film, video, etc., they can all involve the creation of a representation
through a mechanical process that encodes a limited set of the physical properties of a
phenomenon – some of the ways it reflects light or causes air molecules to move. The
mechanical accuracy of these types of representation encourages us to place them in a
different category to other forms of representation such as drawing, painting or sculpture,
but we are technologically still a long way from any recording system that can mimic the
real world. We cannot mimic the way that the reflections of multiple sound sources in space
change along with our body’s position in the space, our head’s orientation (and continual
movement) on top of our body, and even, it now seems (Gruters et al. 2018), the muscular
changes in our ears that follow our eye movement.
What, though, is being represented by a recording? Written language is a representation
of the spoken word, but if you read a Hamlet soliloquy you will not experience the
actor’s facial expression or the tone of their voice. With a recording, for example, you can
experience an approximation of the sound of a group of musicians performing in a room
but you cannot see them and you cannot walk up to one of them so that you can hear them
more clearly. And if you move around in a space while listening to recorded sound played
back on speakers, it will change in very different ways than if you were in a room with
the players. Of course, most recordings, even ones of a single performance without any
additional overdubs, have been recorded with microphones in a different configuration
than the two ears on either side of our head. And many are deliberately unrealistic or
even surrealistic representations but we still understand them and make sense of them
in relation to the norms of human activity. Recorded music creates a representation of
the sound of some body (or bodies) and/or some thing(s) happening in some place for
some reason and that representation places you in some perceived physical and, therefore,
social and psychological relationship with that phenomenon. And just as I mentioned in
regard to music itself as a phenomenon, we can perceive, interpret and appreciate both the
musical activity that is being represented (i.e. the composition and the performance) and
the skill and artistry that are used in the process of representation (i.e. the techniques of
recording, production and mixing).

Curated versus constructed


In the same way that we have become accustomed to visual representations that might
be drawings, paintings or computer animations as well as those that are mechanical
‘reflections’, we have also developed aural forms of representation, as I mentioned before,
that are constructed entirely within the representational system rather than being ‘captured’
versions of a reality. Thus, I can record the sound of a snare drum being hit but I can also
create one electronically using white noise, envelope shaping and filtering. I can also record
the timing and dynamics of a performance or I can place sounds on a computer generated
timeline and trigger them at different amplitudes to construct an artificial performance.
And I can mix and match these types of production activities – placing recordings of a real
snare drum on a timeline, triggering electronic sounds from the timing and dynamics of a
recorded performance or realigning the timing of a performance using audio quantization
to a timeline grid.
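
As an illustration of the two approaches described above – and not an example drawn from the chapter itself – the short Python sketch below synthesizes a snare-like hit from white noise using envelope shaping and band-pass filtering, and then ‘triggers’ it at chosen times and amplitudes on a computer-generated timeline. All parameter values, function names and the pattern itself are illustrative assumptions.

import numpy as np
from scipy.signal import butter, lfilter

SR = 44100  # sample rate in Hz (an illustrative assumption)

def synth_snare(duration=0.25, decay=30.0):
    """Shape white noise with an exponential envelope and a band-pass filter."""
    n = int(SR * duration)
    noise = np.random.uniform(-1.0, 1.0, n)         # white noise source
    envelope = np.exp(-decay * np.arange(n) / SR)   # fast exponential decay
    shaped = noise * envelope
    # Keep roughly the 150 Hz to 7 kHz band, where much of a snare's energy sits.
    b, a = butter(2, [150 / (SR / 2), 7000 / (SR / 2)], btype='band')
    return lfilter(b, a, shaped)

def place_on_timeline(hit, times, amplitudes, total_seconds=2.0):
    """Trigger the same sample at the given times (seconds) and relative amplitudes."""
    out = np.zeros(int(SR * total_seconds))
    for t, amp in zip(times, amplitudes):
        start = int(t * SR)
        end = min(start + len(hit), len(out))
        out[start:end] += amp * hit[: end - start]
    return out

# A constructed 'performance': a quiet ghost hit followed by two accented hits.
timeline = place_on_timeline(synth_snare(), times=[0.5, 1.0, 1.5],
                             amplitudes=[0.4, 1.0, 0.8])
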
Whether recorded music involves a quasi-realistic representation of a single
performance that occurred in a specific space, a ‘performance’ that has been constructed
by overdubbing and editing together multiple individual performances into a collage, or an
entirely constructed electronic piece that bears little resemblance to actual human activity,
we still construct our interpretations based on our experience of what kinds of things make
sound, how they make sound and where they make sound. Based on this we can break
that interpretation down into the agency (who or what), the energy (how), the context
(why) and the space (where). In addition, for each of these, there are potential literal and
metaphorical (or associative) interpretations. For example, we might recognize the singing
voice of Joe Strummer on The Clash’s 1979 single ‘London Calling’ and we might also
identify an animal metaphor through the way he howls. As we have also said, though, a
representation such as recorded music is perceived, interpreted and appreciated not only
in terms of what is being represented but also in terms of the representational system. The
way a representation ‘speaks’ to us, whether it is our appreciation of the brush strokes of a
master painter or the way that Norman Petty rhythmically altered the sound of the drums
with an echo chamber on Buddy Holly’s 1957 single ‘Peggy Sue’, adds another dimension
to our interpretation.
The first thing to point out about these categories as we progress through them is the
fact that they are artificial constructs and that there is a lot of overlap and ambiguity. So
while these may be useful categories in helping us to think about recorded music, we can
also use them to think about how examples do not fit into these categories. The following
sections provide explanations and examples of the ways in which the techniques of music
production can be used to affect the perception, interpretation and appreciation of music
through these categories.

The perception of agency


The literal
Of course, the majority of musical sounds are made by people, either solo or in groups, and
either with their own bodies or using tools/instruments, but with the advent of recording it
has been possible to incorporate non-human sound making into music: the sounds of
animals, for instance, or natural phenomena such as thunder, wind, rain and running water. There
are many human characteristics that can influence the types of sound that a person makes:
age, gender, size and strength, for example. And, of course, the number of people makes a
big difference too. The techniques of record production allow these to be manipulated in
a variety of ways. Equalization and dynamic shaping can be used to alter the perception
of size – obviously larger bodies allow deeper resonances and these can be exaggerated or
inhibited – and these features can also help to exaggerate sonic characteristics that relate
to physical and cultural markers of gender, strength and age as well. For example, in Archie
Roach’s 2006 vocal performance on ‘Held Up To The Moon (In My Grandfather’s Hands)’
by the Australian Art Orchestra, the frailty and suggestion of old age in his vocal timbre
have been exaggerated by thinning out the lower frequencies of the voice. The spectral change
does not suggest old age or frailty in itself but it does highlight the features (croakiness and
pitch instability) that do. When it comes to representing the number of people participating
in a performance, there are several characteristics that can be used to exaggerate that sense
of number: variations in vocal timbre, pitch and timing. The last two of those provided the
basis for Ken Townsend’s development of Automatic Double Tracking for John Lennon in
1966 (Ryan and Kehew 2006), a system that altered the pitch and speed of his recorded
voice so that, when mixed with the original unaltered voice recording, it created the
impression of two voices.
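The doubling principle behind ADT can be illustrated with a minimal digital sketch, assuming a delay-line model rather than Townsend’s actual tape-based system: a copy of the voice whose delay drifts slowly (and therefore wavers slightly in pitch and timing) is mixed back with the untouched recording. The function and parameter values below are hypothetical and purely illustrative.

import numpy as np

def double_track(voice, sr=44100, base_delay=0.025, depth=0.004, rate=0.4):
    # delay in samples that drifts slowly, like a varispeed tape machine
    n = np.arange(voice.size)
    delay = (base_delay + depth * np.sin(2 * np.pi * rate * n / sr)) * sr
    # read the 'second voice' from a continuously moving point in the past
    src = n - delay
    lo = np.clip(np.floor(src).astype(int), 0, voice.size - 1)
    hi = np.clip(lo + 1, 0, voice.size - 1)
    frac = src - np.floor(src)
    copy = (1 - frac) * voice[lo] + frac * voice[hi]   # linear interpolation
    # mixing the drifting copy with the original gives the impression of two voices
    return 0.5 * (voice + copy)

# doubled = double_track(dry_vocal)   # 'dry_vocal' stands for a hypothetical mono numpy array

Larger values of depth or rate push the result away from subtle doubling and towards more obvious chorus- or flanger-like treatments that grew out of the same idea.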
When it comes to tools and instruments, it is difficult to differentiate between the
materiality of the object and the type and level of energy used to make it sound. For
example, with recordings of two different snare drums – they might have different heads,
or one might have a wooden shell and the other a metal one – it is difficult to tell whether
audio processing is bringing out the character of each drum or changing the sense of how
hard each one was hit. So, for example, on the two
mixes of ‘By Starlight’ on the 2012 reissue of Mellon Collie and the Infinite Sadness by the
Smashing Pumpkins, the rough mix by Flood allows the metallic ring of the snare drum
to resonate throughout the whole track, whereas the released mix by Alan Moulder and
Billy Corgan dampens that ring so that it only emerges for a short time in the middle
of the track.

The metaphorical
Along with any literal sense of who and what are making the sounds on a recording, there
is also a set of associations and metaphors that they suggest to listeners. Some of those,
like the way that listening to Patti Smith’s voice on her 1975 Horses album always reminds
me of the taste of millionaire’s shortbread, are very personal and idiosyncratic, and others,
like the association of the choir on The Rolling Stones’ 1969 ‘You Can’t Always Get What
You Want’ with the European Christian Church, are more universal (although only for
people familiar with the Christian Church). In addition, some of the literal suggestions that
I have already noted also bring different associations and metaphors to different audiences.
Some, like the suggestion of youth or old age through vocal timbre mentioned earlier, may
also have different cultural associations to do with status, or, like the association of The
Rolling Stones’ track with the church, might suggest metaphors of very different types:
from spirituality to oppression perhaps.

The perception of energy


The literal
Perhaps more meaningful than the specificity of the person or persons involved in the
activity is the nature of the energy expenditure itself. The sound of something being engaged
with using high levels of energy versus lower ones creates powerful differences in the way
it encourages empathy or entrainment. Berry Gordy, the founder of Motown Records,
was famously reported to have used hammering on timber to emphasize the backbeat on
records so that, as the joke had it, ‘even white folks can dance to it’ (Pugh 2015: 245). As we
shall see in later chapters, musical tropes such as heaviness, fatness and groove are strongly
influenced by the type and strength of the energy that is being perceived to be expended.
In Lisa Hannigan’s 2009 ‘I Don’t Know’, from the album Sea Sew, microphone techniques
and equalization have been used to remove ‘weight’ from the performances of some of the
instruments and the voice at the start so that the build in energy across the song is more
dramatic; that build is mostly suggested by making the arrangement and mix fuller rather
than by the individual performances themselves becoming more energetic.

The metaphorical
Of course, a large part of the energy that we hear expended is about metaphorical meaning
in the sense that it is designed to suggest an emotional response through the physical
gesture. While the lightness of the energy in the Lisa Hannigan example may suggest a
particular bouncy form of entrainment, it is also there to suggest a lightness of spirit. The
heaviness of Mark Mynett’s production of ‘Coma Karma’, the 2019 single by his band Kill II This,
may encourage embodied responses through entrainment but also seeks to suggest power,
swagger and grandeur. The track ‘Boro’ on Greek hip-hop artist Stereo Mike’s 2011 album
Aneli3h includes a sliding bass synth that both encourages particular types of entrained
dance responses and evokes the sound of an electric motor.

The perception of space


The literal
While Moylan’s (1992, 2007) writing about the perceived performance environment, the
notion of phonographic staging (Lacasse 2000) and Moore’s (1992, 2012) sound box suggest
that recorded music utilizes realistic spatial environments, they all recognize that, in
recordings from as early as Gene Austin’s 1927 ‘My Blue Heaven’, instruments can appear to
be in different spaces from each other on the same track. Although both the technologies and
the approaches have become more complicated and sophisticated, this has remained a common
way of playing with our perception of space in a recording. A common trope in more
contemporary recordings is to place the verse vocal in a smaller or sparser space and the
chorus in a larger or more intense space. This can be heard in an exaggerated form on the
2018 single ‘That Feeling’ by Mo Jamil.

The metaphorical
Of course, like energy, space in recordings suggests associations and metaphors to
listeners. As Peter Doyle (2006) has pointed out, reverberation and echo were used as a
metaphor for solitude and wide open spaces in the fad for cowboy songs that coincided
with the trend for film westerns, and can be heard in songs like Roy Rogers’s 1948 version
of ‘Don’t Fence Me In’. At the same time, they were also used as a metaphor for the vast
expanse of outer space, albeit often in extreme or impossible mixes of spatial sound with
the ‘dry’ signal as can be heard in Young John Watson’s (aka Johnny ‘Guitar’ Watson) 1954
single ‘Space Guitar’. Of course, the way that dub reggae played with surreal constructions
of spatial sound forced the listener to think of it metaphorically rather than realistically.
The remarkable spatial narrative of records such as King Tubby’s mix of Augustus Pablo’s
1976 album King Tubby Meets Rockers Uptown takes the listener on a wild ride through
a fantasy world of reverberation and echo that we each have to try and make sense of in
our own way. This use of spatial processing as a way of accentuating particular words
and creating texture in an arrangement has developed in modern pop, hip-hop and R ‘n’
B into an almost abstract language for creating formal structure as can be heard, for
example, in the vocal staging of SZA’s and Travis Scott’s performances on her 2017 single
‘Love Galore’.

The perception of context


The literal
The idea of placing the music of a recording in its cultural context harks back in part to
the notion of recordings that are so realistic you can shut your eyes and imagine you are
there – the promotional schtick of high fidelity. Many recordings are sold on the basis of
how they can place you in a special place at a special time. Whether it is Andras Schiff ’s
2019 performances of Schubert’s piano works on an ‘authentic’ period instrument, the
2003 live album of the Dixie Chicks’ Top of the World Tour, or the Christmas service by
the Kings College Choir, Cambridge, it is the context that makes the event
special over and above the physical sound. The extent to which these recordings are always
‘events’ in the strictest sense of the word is another matter. Often there is extensive editing
from multiple performances and/or run-throughs, repairs and alterations made in post-
production either with audio processing or overdubbed performances, and additional
ambience and audience sound added to ‘top up’ the sound of liveness.

The metaphorical
Both the 1977 Marvin Gaye single ‘Got To Give It Up’ and Robin Thicke’s 2013 ‘Blurred
Lines’ (which was found to have plagiarized the Gaye song in a landmark legal battle
discussed by Mike Alleyne in Chapter 2) use a performed soundtrack of party noise –
a constructed context that is designed to conjure up associations of the real thing. In
more ambiguous instances of staged context, Shiyani Ngcobo’s 2004 album Introducing
Shiyani Ngcobo utilizes overdubs recorded in the garden of the South African studio,
which introduce ambient sounds such as birdsong to the recording – sounds that certainly
provided context to some moments of the recording but were not a dominant
feature of the process. In addition, the Raw Bar Collective’s 2011 album Millhouse Measures
was recorded in a pub outside Dublin to capture the authentic atmosphere of the ‘raw
bar’. However, the close microphone placement, the editing out of noise and chat between
tunes, the arrangement of the audience in seated rows as if in a concert hall and the very
respectful and quiet audience all produced an album that is a very pale imitation of
the actual ‘raw bar’ experience.

Representational system


The literal
This final category involves the way that a recording might use the characteristics of the
representational system – the recording or distribution technologies – to suggest further
meaning in the recording. One literal example of this is the introduction to Skee-Lo’s
1995 single ‘I Wish’, which is filtered to sound as if it is being
played back through a tiny portable radio speaker before the full, unfiltered version of the
backing track drops as the vocals enter. This is an updated version of the kind of reduced
arrangement introduction that has been a staple of popular music – often a solo guitar or
piano will play the introduction before the rhythm section enters. Another example is the
1994 Portishead track ‘Biscuit’ from the album Dummy, where the sounds of vinyl crackle
and tape hiss are deliberately highlighted and incorporated into the arrangement through
their rhythmic inclusion and exclusion.

The metaphorical
While these types of mediation may have their own associations and connotations –
particularly with the historical periods when the sounds of transistor radios, vinyl records
and cassette tapes were part and parcel of the experience of recorded music – there are
also ways in which forms of mediation can act metaphorically. In Eminem’s ‘The Real Slim
Shady’, from the 2000 Marshall Mathers LP, one of his vocal takes is processed as if it is
being broadcast through a tannoy system, like a public address message in a supermarket.
The metaphor of the disembodied voice marks it out as distant and remote and as a separate
narrator for the track who can comment on the main storyline.

Conclusion
I have used this first chapter in the book to outline the ways in which recorded music
differs from music in the concert hall and the ways in which the production process can
address and affect the various factors that influence our perception, interpretation and
appreciation of both the musical materials and the ways in which they are represented.
As humans, we understand music as the sound of something happening somewhere, and
on that basis we can think of agency as the nature of the things that are using and being
used: in the most general terms, the people and things that cause sounds, the tools and
instruments that are caused to vibrate, the types and levels of energy involved, the space
and cultural context in which it all happens, and the sound of any recording or distribution
medium that is used to represent the musical activity.

Bibliography
Clarke, E. F. (2005), Ways of Listening: An Ecological Approach to the Perception of Musical
Meaning, New York: Oxford University Press.
Doyle, P. (2006), Echo and Reverb: Fabricating Space in Popular Music Recording, 1900–1960,
Middleton, CT: Wesleyan University Press.
Gruters, K. G., D. Murphy, C. Jenson, D. Smith, C. Shera and J. Groh (2018), ‘The Eardrums
Move When the Eyes Move: A Multisensory Effect on the Mechanics of Hearing’,
Proceedings of the National Academy of Sciences, 115 (6): E1309–E1318. doi: 10.1073/
pnas.1717948115.
Lacasse, S. (2000), ‘Listen to My Voice: The Evocative Power of Vocal Staging in Recorded
Rock Music and Other Forms of Vocal Expression’, PhD thesis, University of Liverpool,
Liverpool. Available online: http://www.mus.ulaval.ca/lacasse/texts/THESIS.pdf (accessed
30 June 2011).
Middleton, R. (1993), ‘Music Analysis and Musicology: Bridging the Gap’, Popular Music,
12 (2): 177–190.
Moore, A. F. (1992), Rock: The Primary Text: Developing a Musicology of Rock, 2nd edn,
Ashgate Popular and Folk Music Series, Farnham: Ashgate.
Moore, A. F. (2012), Song Means: Analysing and Interpreting Recorded Popular Song, Farnham:
Ashgate.
Moylan, W. (1992), The Art of Recording: Understanding and Crafting the Mix, Waltham, MA:
Focal Press.
Moylan, W. (2007), Understanding and Crafting the Mix: The Art of Recording, 2nd edn,
Waltham, MA: Focal Press.
Pugh, M. (2015), America Dancing: From the Cakewalk to the Moonwalk, New Haven, CT:
Yale University Press.
Ryan, K. and B. Kehew (2006), Recording The Beatles, Houston, TX: Curvebender Publishing.
Small, C. (1998), Musicking: The Meanings of Performing and Listening, 1st edn, Middleton,
CT: Wesleyan University Press.
Zagorski-Thomas, S. (2014), The Musicology of Record Production, Cambridge: Cambridge
University Press.

Discography
Austin, Gene (1927), [10” shellac disc] ‘My Blue Heaven’, Victor.
Australian Art Orchestra with Archie Roach (2006), [digital download] ‘Held Up To The
Moon (In My Grandfather’s Hands)’, Australian Art Orchestra Recordings.
The Choir of Kings College, Cambridge (2013), [CD] Nine Lessons and Carols, Kings College.
The Clash (1979), [7” vinyl single] ‘London Calling’, CBS.
Dixie Chicks (2003), [CD-set] Top of the World Tour (Live), Open Wide.
Eminem (2000), [CD] The Marshall Mathers LP, Interscope.
Gaye, Marvin (1977), [7” vinyl single] ‘Got To Give It Up’, Motown.
Hannigan, Lisa (2009), [digital download] Sea Sew, Hoop Recordings.
Holly, Buddy (1957), [7” vinyl single]. ‘Peggy Sue’, Coral.
Jamil, Mo (2018), [digital download] ‘That Feeling’, Polydor.
Kill II This (2019), [digital download] ‘Coma Karma’.
Ngcobo, Shiyani (2004), [CD] Introducing Shiyani Ngcobo, World Music Network.
Pablo, Augustus (1976), [12” vinyl album] King Tubby Meets Rockers Uptown, Yard Records.
Portishead (1994), [CD] Dummy, Go! Beat.
Raw Bar Collective (2011), [CD] Millhouse Measures.
Rogers, Roy (1948), [10” shellac disc] ‘Don’t Fence Me In’, RCA Victor.
The Rolling Stones (1969), [7” vinyl single] ‘You Can’t Always Get What You Want’, Decca.
Schubert, Franz (2019), [CD-set] 4 Impromptus; Piano Sonatas Nos 19 & 20; Klavierstücke;
Andras Schiff, fortepiano, ECM.
Skee-Lo (1995), [12” vinyl single] ‘I Wish’, Sunshine Records.
The Smashing Pumpkins ([1995] 2012), [CD-Deluxe Edition] Mellon Collie and the Infinite
Sadness, Virgin.
Smith, Patti (1975), [12” vinyl album] Horses, Arista.
Stereo Mike (2011), [digital download] Aneli3h, Minos–EMI.
SZA featuring Travis Scott (2017), [digital download] ‘Love Galore’, Top Dawg Entertainment.
Thicke, Robin (2013), [digital download] ‘Blurred Lines’, Interscope.
Watson, Young John (1954), [7” vinyl single] ‘Space Guitar’, Federal.
2
Authenticity in Music Production
Mike Alleyne

Authenticity and production


Few terms are as problematic as authenticity in relation to popular music and record
production, and consequently its very application in these contexts is frequently
reconsidered by musical and academic practitioners (Zagorski-Thomas 2014: 189,
202, 204, 219; Shuker 2017: 24–26; Weisethaunet and Lindberg 2010). Authenticity is a
constructed ideal, the precise characteristics of which may vary among genres and even
apply differently to specific historical phases. It is usually perceived in relation to preexisting
ideals against which works are assessed, and the evaluation of a record as ‘authentic’
becomes a badge of artistic honour despite the numerous aesthetic and technological
contradictions that have now become almost imperceptibly embedded within the use of
the term. The absence of any absolute standard of authenticity imbues the word with a
remarkable elasticity that does little to clarify the validity of its use. This discussion will
explore the record production authenticity dilemma with occasional reference to works
which often fall outside of the historically rock-centred identity parameters of the ascribed
term. This chapter also proposes an emphasis on contexts of authenticity rather than a
broadly applicable overarching rubric, and it incorporates perspectives from several
successful music producers. This approach relates closely to Allan Moore’s often-cited
essay ‘Authenticity as Authentication’ in which authenticity is ‘ascribed, not inscribed’
(2002: 210) in the musical text. Consequently, the discussion continues, this ascription
process may be affected by the subjectivity of observers, inclusive of their proximity to the
creation of the musical work. Thus, Moore identifies characteristics of first, second and
third person perceptions of authenticity, some of which are referred to in this discussion,
though they are rather more complex and perhaps less readily applicable when it comes to
‘world music’ genres.
Record production is a realm which is ultimately characterized by the manipulation
and calculated soundscape design carried out by producers and engineers, applying
a vast variety of techniques and artistic philosophies to meaningfully convey human
performances. However, despite this apparently inherent inconsistency whereby the sonic
results are consciously reshaped, critics and audiences alike have constructed realities from
aural fictions, creating codes separating the artistic end product from the complex means
facilitating its spatial assembly. Referring to the alchemic roles of the recording engineer,
Albin Zak comments that ‘their manipulation of sonic reality must become for the listener
a reality in itself ’ (2001: 169), and this observation applies equally to the overall production
process (within which the engineer’s active participation is integral). Successive changes in
recording technologies and aesthetic approaches have repeatedly challenged our notions of
what is authentic, as each innovation creates means of superseding the limitations of real-
time performance and its accompanying sonic spatial realities (Schmidt Horning 2013: 216).
The apparent dissolution of some genre borders as digital music technologies have become
more pervasive has inspired anti-authenticity debates. However, as Moore suggests, such
thoughts of invalidating the concept in popular music discourse are premature (2002: 209).
Many previous analyses of authenticity in popular music have adopted holistic approaches
that have usually included record production only as an unindividuated component,
implicitly conflating the process with composition and performance. This raises the issue
of whether it is actually possible to separate the authenticity of a song and performance
from the production within which it is framed and through which it is projected to
audiences. It also questions the extent to which responsibility for the authenticity or lack
thereof residing on a record can be ascribed to the producer.
The varying degree of manipulation of sound, space and time to create illusory continuity
and wholeness in a recorded work usually constructed in temporal fragments has become
a production norm, but arguably one which assumes its own forms of authenticity. As
one example, the ambitious reggae remixing strategies in dub, executed in the 1970s and
1980s by Jamaican producer/engineer King Tubby and others, rely wholly on the often
radical deconstruction and reimagining of preexisting works with the overt application
of sonic processing (particularly echo, reverb and delay). The subgenre’s very authenticity
is interwoven with distorting the fabric of real-time aural reality, replacing it with bold
exploratory alternative dimensions of sound and ambience.
Producer, author and independent label advocate Richard James Burgess (Colonel
Abrams, Living In A Box, Five Star), who has written extensively on the inner workings
of music production, considers the place of ‘authenticity’ in the sphere to be intrinsically
complex:

Authenticity is tricky to define in music production because the whole process began as an
artificial replication of the live music experience, which involved compromises from the
beginning. Examples being multiple retakes to achieve perfected performances; bass drums
were not permitted on early jazz recordings for fear of over-modulating the record head,
and singers were asked to sing more quietly to get a better recorded sound. There were many
such manipulations used right from the earliest days. Of course, beginning in the 1940s,
music production completely surpassed the capturing of the live performance as the sole
technique or criterion. (Burgess, email to author, 27 March 2018)

The moments of production mediation in popular music appear to either preclude the
existence of a largely imaginary creative purity or spontaneity, or at the very least to demand
further qualification of authenticity. It is therefore perhaps not coincidental how rarely
musical creative participants use the word ‘authenticity’ explicitly, though it may generally
be implied in their descriptive language outlining their approach to artistic processes.
Producer Albhy Galuten, who worked with the Bee Gees during the group’s
multiplatinum peak in the second half of the 1970s, perceives the concept in highly
nuanced terms, noting that ‘if you’re going to define authenticity, you have to define it on
two different axes. On one axis you have the continuum from spontaneous creativity and
live performance on one end of the spectrum to constructed reality at the other’ (Galuten,
interview with author, 13 March 2018). He goes on to make a distinction between two
discrete dimensions of authenticity:
Oscar Peterson live at the Blue Note is authentic and a Berlioz concerto by the New York
Philharmonic is also authentic. So, authentic, in this meaning of the word, goes from on one
end live, performed at once and captured by some number of microphones to the creative
process of putting things together. In this sense, the kind of music that is represented by a
constructed reality (e.g. overdubbing one instrument at a time) is less authentic. Let’s call
this Performance Authenticity.

Another axis is the authenticity of the emotion that it creates in the listener. We might call
this Emotional Authenticity. So if your goal as a musician is to move your listener, whether
it’s to move them to dance or move them to cry, the extent to which you succeed at that is a
measure of Emotional Authenticity. (Galuten, interview with author, 13 March 2018)

In expanding his discussion, Galuten cites the recording of a vocal verse on Rod Stewart’s
gold-certified album Atlantic Crossing (1975) as one such example of ‘Emotional
Authenticity’.
There is a story I like to tell about Emotional Authenticity in the context of performance.
I remember Tom Dowd was producing the Rod Stewart record, Atlantic Crossing. We were
recording on 8-track tape (state of the art at the time) and Rod said, ‘I think I can beat the
vocal on the second verse.’ Tommy agreed to let him try. I was going to punch in on the only
vocal track and replace the existing performance on the second verse. Everyone knew the
old performance was going away. We were all totally concentrating – reaching for Emotional
Authenticity. Today, the scenario is different. With unlimited tracks, you might say, ‘Let’s
take a couple of passes at the second verse and see if we beat what we already have.’ It is much
harder to get the same degree of focus. (Galuten, interview with author, 13 March 2018)

Recognizing that there is no monolithic producer stereotype that can be indiscriminately
applied is also a crucial part of comprehending the production process (Burgess 2013:
88). Burgess notes that ultimately ‘the producer’s objective is to realize the vision for
the artist, manager, and label’ (60) and that ‘the aesthetic qualities of a production need
to complement the commercial and artistic intentions of the stakeholders’ (86). Under
ideal circumstances, all parties involved should be representing the authenticity of the
artist’s creative identity in conjunction with securing the attention of audiences that will
help sustain a career. Caught between satisfying the needs of several parties and projecting
their own creative capabilities, the producer (or production team) is often
performing a perilous balancing act in attempting to achieve a level of commercial
pragmatism. In a clinical sense then, the producer’s implicit role is to reconcile Moore’s
first, second and third person authenticity perspectives, making the art sufficiently valid
to the parties responsible for its public presence.
Beyond some genre-specific requirements centred on simultaneous collective activity,
modern production is rarely focused on capturing live performance (in the sense of holistic,
unfragmented temporal continuity) but instead on collating several series of
performances to arrive at an aesthetically satisfactory constructed outcome. What may once
have been considered an illusory process separate from real-time expression now presents
a multiplicity of sonic possibilities from which the finished work is assembled as though its
collective shape and embedded aural details always preexisted in that integrated form. The
previously ‘inauthentic’ artefact has in fact assumed its own authenticity, largely displacing
the recording premise of reflecting realism (Moorefield 2005: xv). Moreover, recording
technology innovations have propelled artists, producers and engineers to accelerate and
expand their creativity and to maximize the potential of the new tools at their disposal
(Schmidt Horning 2013: 5–6). In his musings on phonography, Evan Eisenberg proposes
that (multitracked) studio recordings create a temporally deceptive mosaic: ‘Pieced together
from bits of actual events, they construct an ideal event. They are like the composite
photograph of a minotaur’ (2005: 89). However, as Zak discusses (2001: 130–131),
Eisenberg’s descriptive analysis disassociates the recorded multitrack components from
their individual and collective roles in contributing to a song’s holistic character. The ability
to easily alter the sequence of events in both time and space within a song using digital audio
programmes such as Logic or Pro Tools means that listeners outside of the creative team
usually only hear one completed version on which to base authenticity judgments. Without
an intimate knowledge of the performance, recording and editing processes applied to the
respective works, it is difficult at best to make worthwhile assumptions about how the aural
mosaic was constructed or the degree to which individual performances were manipulated
to achieve the desired overall result. Of course, this assemblage process long precedes the
digital era, but combining multiple analogue takes (or parts thereof) was rarely anything
less than tedious, given the greatly reduced technological flexibility.
The rhythm track of ‘In Your Eyes’ by Peter Gabriel (from the 1986 multiplatinum album
So) was created from a tapestry of ninety-six multitrack-tape versions of the song, requiring
the manual extraction of individual bars of musical performance to arrive at the finished
work (Classic Albums: Peter Gabriel – So 2012). Similarly, Gabriel’s 1992 platinum album
Us included ‘Love To Be Loved’ featuring a composite lead vocal constructed from forty
takes and ‘literally edited into existence’ (Zak 2001: 55). The process adopted by Gabriel
and his production/engineering collaborators in both cases implies that the (first person)
authenticity sought by the artist resided in disparate fragments of multiple interpretations,
reconfigured to achieve the intended sonic and emotional results. With current digital
technologies, those alternative representations may become subsumed by rapid erasure
and re-recording from which no fully formed reinterpretation emerges and to which the
audience is likely never exposed. This is especially true for the auteur producer who also
writes, arranges and performs, and who therefore has the opportunity to manifest his or
her creative vision authentically with either minimal or non-existent external mediation
(Burgess 2013: 85).

Record production history and technologies


Guitarist, inventor and producer Les Paul’s production techniques and their
implications for perceptions of authenticity demand close scrutiny. His sound-on-
sound overdubbing strategies using acetate discs, then later multitracking with tape,
resulted in hit records and shattered the norms of recorded performance. As Burgess
comments, ‘The length of the piece no longer defined the duration of the recording
process’ (Burgess 2014: 50). From the beginning of Les Paul’s visionary and complex
sound-on-sound overdubbing on his commercially successful records in the late 1940s,
the process inherently required duplication of performances, which no individual could
simultaneously reproduce in real time beyond the recorded medium. Sound-on-sound
required new parts to be recorded while the previously recorded elements
played back from another disc. Despite the recording industry’s reflexive dismissal
of overdubbing at that time as a sonic novelty somehow divorced from musicianship
and artistic concept (Cunningham 1998: 25–26), the finished products of Les Paul’s
experiments in collaboration with his singing wife Mary Ford were effectively authentic
to a listening public who generally thought and knew little about production. His solo
1948 instrumental hit single ‘Lover’ recorded for Capitol (using Paul’s sound-on-
sound techniques before he used tape) not only featured eight separate electric guitar
parts (Burgess 2014: 51) but also manipulated the speeds at which some parts were
played back to create sounds not reproducible in real-time performance. The record
represented an enormous triumph over technological barriers, and his subsequent
recordings following his acquisition of an Ampex 300 tape machine by 1949 (13)
challenged the industry’s flawed premise that multitracking was merely a transient
distraction (and thus somehow inauthentic).
When introducing the performing couple on American television in 1953, host Alistair
Cooke described the event as ‘the final demolition of this popular and ignorant rumour
that the basis of Les Paul and Mary Ford’s music is electronics’ (Les Paul: Chasing Sound
2007). While the televised programme deliberately revealed the overdubbing process,
Cooke’s descriptions emphasized that the tape recording technology being employed was
merely capturing and collating genuine human performances, thereby authenticating
the validity of the process. His exposition also reveals a widespread anti-technological
cynicism that had already begun to prejudice perceptions of popular musical authenticity,
long before overdubbing and record production had become elevated to art form levels by
both public and critical consensus from the mid-1960s. Les Paul’s production succeeded in
making the sonic illusion real and therefore functionally authentic to most of the listening
audience. However, the failure of American record companies and studios to immediately
recognize and embrace the creative possibilities of recording with multitrack tape delayed
its acceptance as an authentic production medium. Furthermore, the association of Les
Paul’s sound with pop styles preceding the rock era paradoxically made his multitracking
approach initially antithetical – functionally inauthentic – for rock ‘n’ roll performers intent
on capturing the music’s raw immediacy and vitality. The fact that multitrack recording was
vastly under-utilized even a decade after Ampex delivered its first manufactured 8-track
machine to Les Paul in 1957 clearly suggests the persistence of authenticity barriers
in industry perception (Zak 2001: 15). The post-1967 studio artistry that harnessed
technologies to create ambitious soundscapes and effects in popular music, and became
inextricably integral to production techniques, moved from apparent inauthenticity to a
normative position.
It is strikingly clear that the application of new technologies has played a critical
historical role in influencing ideas about authenticity in recorded music. At its earliest
stages, the process simply sought to capture the performances with seemingly minimal
mechanical intervention, although the very act of sonic capture for repeated listening was
itself significantly transformative. Moreover, even those early renditions were directly
affected by the limitations on recording length and sound quality that the available
technology imposed. Thus, the recorded version of a given song might only bear a passing
resemblance to a live version unrestricted by the temporal constraints associated with a
78 rpm disc – particularly where jazz was concerned. The arrival of electrical recording
in 1925, employing the microphone to overcome the difficulties posed by earlier acoustic
recording techniques, changed the dynamics of studio performance, allowing artists to
more effectively articulate with the benefit of amplification (Starr and Waterman 2014:
75). As previously demonstrated in the discussion of Les Paul’s innovations in the 1940s
and 1950s, the emergence of his sound-on-sound and tape multitracking methods in
conjunction with the introduction of the 33⅓ rpm long-playing record signalled a turning point
for aural representation. With each technological phase, musicians and audiences were
challenged to reassess what was authentic to them. Interestingly, Zak notes an instance
in which jazz pianist Lennie Tristano responded to concerns about use of the emergent
multitrack technologies by issuing a printed disclaimer on the record sleeve of his 1962
release The New Tristano to accommodate ‘the traditional ideology of authenticity in
jazz’ (Zak 2001: 7). Tristano’s previous eponymous 1955 release aroused considerable
controversy due to its use of multitracking, deemed by jazz cognoscenti to have violated
the genre’s performance parameters (Case and Britt 1979: 193). A little more than a decade
later, such authenticity concerns would seem relatively trivial. General changes in modern
recording approaches and, in jazz, the stylistic fusion innovations of Miles Davis in
conjunction with the editing concepts of producer Teo Macero overturned jazz’s standard
notions of recorded performance. Inevitably, purist criticism persisted as the material on
albums such as In a Silent Way (1969) and Bitches Brew (1970) reached wider audiences.
Burgess remarks that:
These many hours of jams were recorded, cut up, and assembled by the producer, Teo
Macero. Though the final album did not represent what actually happened in the studio,
I would argue that these recordings were as authentic as, say, Kind of Blue or Sketches of
Spain. The processes and results still conformed to Miles’ artistic intention even though the
methodologies were entirely different. The results were a function of the processes, which
were part of the creative conception. If that is true, isn’t a Katy Perry, Kendrick Lamar, or any
other record equally authentic? (Burgess, email to author, 27 March 2018)

This concept of artistic intent defining (first person) authenticity repeatedly resurfaced
in my interviews with producers, suggesting a possible bifurcation of meaning between
creators and audiences (second person authenticity).

Authenticity and reissues


Revealing the sonic contours of material in gestation may affect our appraisal of its
production characteristics, potentially altering the degree of credibility that we assign
to the process. It is therefore worth considering the extent to which critical and
commercial perceptions of a work’s authenticity are affected by a reissued catalogue in
the form of deluxe editions or other box-set compilations including demos or alternate
versions of songs with which audiences are usually familiar in a more standardized
‘finished’ form.
The case of reggae and its commodification raises particular issues that are less
applicable to parts of the pop world that do not necessarily represent postcolonial
worldviews either sonically or lyrically. Discussing authenticity in a broader postcolonial
cultural context, Ashcroft, Griffiths and Tiffin suggest that ideas ‘become entangled in an
essentialist cultural position in which fixed practices become iconized as authentically
indigenous and others are excluded as hybridized or contaminated’ (2005: 21). The
ideological conditions surrounding such musical genres make perceptions of authenticity
highly contested as issues of textual fixity and flexibility become perennial analytical
matters.
In Bob Marley & the Wailers’ three-disc Exodus Fortieth Anniversary Edition (2017),
the remixed version of the record effectively reinterprets and reconfigures the original
production intent. Remixed by Bob’s son Ziggy – now credited as a co-producer – this
revised version alters the original running sequence of the songs, includes alternate takes
or track elements, and in some cases thoroughly revises the aural text on which the album’s
commercial and cultural status is based. In the particular case of ‘The Heathen’, the remix
thrusts a formerly deeply embedded guitar solo to the foreground, but without the delay and
reverb that imbued the original version with its ethereal mystique. In this instance, the song’s
re-production displaces its original forty-year-old aura with sonic treatments that might be
at least partially perceived as less authentic. In effect, the soundscape has been irrevocably
altered, creating an historical discontinuity between the two sonic representations. This
also represents a fracture between a first person authenticity manifested in Ziggy Marley’s
participation and a third person authenticity in which the genre features of an iconic release
are recalibrated. There is undoubtedly a sense of posthumous hagiography surrounding
Bob Marley and his most successful records despite the fact (discussed in further detail
shortly) that commercial calculation often played a critical role in their textual formation
at the expense of authenticity considerations. As expanded reissues assume a greater role in
sustaining the commercial viability of catalogue material, the circumstances under which
works are sonically repackaged with supplementary production represent a potential
rupture in popular music’s historical narrative. It suggests that, ultimately, there might not
be a definitive version of a recording around which authenticity debates can actually occur,
thus undermining aspects of critical authenticity.

Genre and geographical authenticities


In the reggae genre, any discussion of its global significance inevitably foregrounds the
music of Bob Marley, with particular reference to his catalogue of commercially successful
recordings between 1973 and 1980 as an integral member and later leader of The Wailers.
Beyond the postcolonial ideologies embedded in those albums challenging social, political,
economic and philosophical norms, the records themselves have assumed iconic status as
exemplars of reggae authenticity. A key paradox of the Marley legend is that the album
Catch a Fire (1973), which laid the foundation for The Wailers’ international ascent on the
Island Records label, was a commercially calculated effort to reach the rock market, a point
repeatedly made in documentary film interviews with former Island Records head Chris
Blackwell. As such, it was adorned with seemingly incongruous elements from the rock
soundscape. The overdubbing of additional rock guitars and keyboards wielded by session
musicians who had little or no previous experience with reggae, combined with a studied
de-emphasis of bass frequencies, meant that Catch a Fire’s production strategies focused on
re-presenting the music in ways that were not wholly authentic for the genre (Alleyne 1999:
92–104). The album embodied a recontextualization of reggae for the musical mainstream,
refining previously raw elements to accommodate the perceived target audience’s
preferences. The Wailers’ backing tracks, recorded in Kingston, Jamaica, on both 4-track
and 8-track tapes (at Dynamic Sound Studios, Harry J. Studios and Randy’s Studios),
underwent 16-track overdubs at Island’s studios in Basing Street, London. Engineer Tony
Platt worked on these overdubs and had also previously recorded other reggae acts on the
label. His experience with the genre lent a sonic sensibility that at least allowed Catch a Fire
to retain some of its original stylistic integrity, and he strongly disputes the idea that his (or
Blackwell’s) participation in the project undermined the music, noting that:
Sometimes I’ve had […] slightly annoyed aggressive people accuse me of destroying reggae
forever, which I find a ludicrous concept. Bob was part of the process. If he hadn’t wanted it
to happen, then it would not have happened. He was quite well aware of where it was gonna
take him. I think he was quite well aware of it being a development he’d like to happen for
his music […]. If you’re a musician and a songwriter and you want to get a message across,
you want it to go to the widest audience possible. I think it’s actually slightly disrespectful
towards Bob to jump to this conclusion that in some way he was steamrollered into doing it.
(Platt, interview with author, 29 November 2009)

From his engineering/production perspective then, the musical discovery and reshaping
processes were also artist-driven, thereby possessing a type of innate (first person)
authenticity. Conversely, analyses of the finished product in its historical context have often
interrogated and challenged ideas surrounding the album’s authenticity, largely because
of production decisions that commercially tailored its sonic contours in untypical ways
for the reggae genre. Ultimately, any assumptions surrounding Catch a Fire’s inherent
authenticity are persistently complicated by marketplace considerations. Veteran Jamaican
reggae producer and entrepreneur Gussie Clarke1 makes the crucial observation that:

Sometimes authenticity has to be breached to create innovation. For example, maybe that
is how we moved from mento and all kinds of beats to rocksteady [and then] ended up in
reggae. Somebody moved out of their safe zone, out of being as authentic as they are or could
be, and created something a little bit different. It then worked and became something new.
(Clarke, interview with author, 25 March 2018)

Although made with specific reference to Jamaican popular music, this commentary is also
applicable beyond the reggae realm. Like many notable producers in other genres, Clarke
is essentially guided by the song and the interpretation it requires to make it effective for
the selected recording artist. He also emphasizes the importance of creative flexibility in
the studio, changing from the original prescribed course of action, if necessary, in order to
create a result that is authentic for both the song and artist (Clarke, interview with author,
25 March 2018).
More than any other genre, reggae in its dub guise has recast the authenticity of an
original composite of tracks in a recording, radically remixing the constituent elements to
the point of effectively creating a new work, often with distinctive ambient features bearing
only passing relationships to the holistic source material, and asserting its own identity.
Having achieved initial major commercial success in the Simon and Garfunkel duo
between 1965 and 1970, Paul Simon has periodically sought to revitalize his solo career
by utilizing the rhythms and aural aesthetics of other cultures otherwise located on
the fringes of the pop mainstream, and those efforts have been subject to intense critical
examination.
Few records in popular music history have so clearly stimulated authenticity debates as
Simon’s multiplatinum 1986 Graceland album in which he controversially incorporated
South African music during the apartheid era. However, Simon’s unexpected foray into
the world of reggae to launch his solo career more than a decade earlier is less often
discussed as a crucial antecedent. His 1972 international hit debut solo single ‘Mother
And Child Reunion’ (on which he shared production credit with engineer Roy Halee)
features a Jamaican rhythm section recorded in Kingston, the country’s capital (Alleyne
2000: 21). The boldness of Simon’s stylistic risk is partially counterbalanced by his decision
to relocate to the source of the music for its instrumental recording, imbuing the song
with a cultural authenticity otherwise unavailable: the musicians were captured not only
in organic interaction with each other but also in the physical studio space within which
they usually collaborated. ‘Mother And Child Reunion’ presents a production paradox
wherein an artist not otherwise associated with the culture from which the music arises
interweaves authenticity with more hybrid ideas to achieve commercial success despite the
genre’s marginal status at the time.
The transformation of Senegalese singer/songwriter Youssou N’Dour from local to
global stardom presents another circumstance in which a non-Western artist interweaves
elements of mainstream pop soundscapes with components of his original aesthetic. As
Simone Krüger Bridge discusses (2018: 168–172), a stylistic shift in N’Dour’s sound first
became evident in the 1980s, spurred in part by his successful collaboration with Peter
Gabriel on ‘In Your Eyes’, mentioned earlier in this analysis. N’Dour reconfigured his sonic
identity to achieve wider solo commercial success after signing with Virgin Records, and
his case exemplifies a cultural relocation that raises serious questions about his creative
authenticity, although the music retains evidence of ‘resistance to the homogenizing
forces of globalization’ (171). However, this is only one example within a richly populated
soundscape of global crossover efforts with similar authenticity implications (Alleyne
2018: 362–366).
Issues of production authenticity have become entangled in two prominent copyright
infringement cases involving the largest international hit songs of recent years. The
lawsuits surrounding the 2013 song ‘Blurred Lines’ by Robin Thicke featuring T.I. and
Pharrell, and the late 2014 single ‘Uptown Funk’ by Mark Ronson featuring Bruno Mars
focus on compositional transgression. However, in both instances, production decisions
are implicated in the process of replicating ideas embodied in previously published
works. Artistic influence and the relative assimilation of preexisting ideas are viewed
as inevitable and necessary parts of the creative process, but there remains a point at
which excessive borrowing transmutes into theft and at which authenticity evaporates.
In March 2015, ‘Blurred Lines’ was adjudged to have infringed the late Marvin Gaye’s
chart-topping 1977 single ‘Got To Give It Up’ Pt. 1, with one of the musicologists
representing Gaye’s estate citing eight points of substantial similarity.2 For the lay listener,
the similarities both include and transcend the standard compositional aspects, extending
to the production soundscape. The utilization of a prominent cowbell, the duplication
of bass patterns, and the persistent recurrence of party sounds are all examples of sonic
compositional components coexisting between the two recordings in the production
realm, though admissibility in this case was limited by the judge to lead sheet notation.
Published interviews with both Robin Thicke and producer/co-writer Pharrell Williams
make it clear that Gaye’s ‘Got To Give It Up’ was the guiding template used when ‘Blurred
Lines’ was being recorded as an ostensibly ‘original’ work. They extracted a multiplicity of
distinctive elements from Gaye’s song, effectively cloning the sound of the record as well as
mimicking the musicological characteristics of the composition. The production concepts
that informed ‘Got To Give It Up’ were interpolated into ‘Blurred Lines’, resulting in a
mirror image of the original work. Ultimately, the initial court decision and subsequent
rejection of an appeal designated ‘Blurred Lines’ as an inauthentic work assimilating the
identity of a previously released recording.3

Tracking alive
The 1977 Bee Gees platinum single ‘Stayin’ Alive’ is located at an intersection between
technological innovation and performance authenticity, prior to the arrival and widespread
application of digital music technologies. Apart from its iconic hit status and persistent
popular cultural longevity, the record’s use of a tape-based drum loop is considered the first
such instance on an international chart-topping track (Buskin 2012: 163). Needing a drum
substitute in order to continue tracking at the remote Château d’Hérouville in France
during the enforced absence of the group’s drummer Dennis Bryon, co-producers Albhy
Galuten and Karl Richardson developed the idea of editing two bars of the 4-track drums
from the already recorded ‘Night Fever’ song (Buskin 2012: 163; Richardson, interview with
author, 25 March 2018). After much splicing and re-recording (with the drums now played
back at a lower pitch than in their original form due to the slower speed of ‘Stayin’ Alive’),
the overdubbed addition of tom-tom fills and crash cymbals completed the drum track.
With the central rhythmic foundation in place, overdubs of other instruments and vocals
began, organically synchronized to the drum loop. Notably, Galuten, Richardson and Barry
Gibb rejected the idea of using automated percussion from a Hammond organ because it
lacked the rhythmic feel they sought. Both Galuten and Richardson have observed that
although Bryon’s drumming was edited to create the loop, that performance is distinctly
human in its rhythmic feel, with fractional timing idiosyncrasies separating it from the
typical rigidity of later drum machine programming (Galuten, interview with author,
13 March 2018; Richardson, interview with author, 25 March 2018).
Ironically, although ‘Stayin’ Alive’ is rarely viewed as being in any way inauthentic, its
very origins in the studio are in a sense partially artificial, foreshadowing the digital era of
recycled rhythms and recorded components. The creative intent guiding the production,
songwriting and performance results in a record that transmits a dominant sense of human
authenticity. Albhy Galuten justifiably describes it as ‘possibly the best pop record I ever
worked on’ (interview with author, 13 March 2018).
Producer and engineer John Merchant has worked with the Bee Gees (following the
Galuten/Richardson era), Celine Dion, Toni Braxton and Barbra Streisand, among others,
and in conjunction with other established producers including Arif Mardin (d. 2006),
David Foster, Babyface, Phil Ramone (d. 2013) and Russ Titelman. His assessment
of authenticity in record production also reinforces the idea proposed in this chapter
regarding the inseparability of performance and production; the performance is in many
ways a direct outcome of production decisions, primarily informed by what elements best
serve the goals of the song:

I never once heard Arif (Mardin) say ‘This doesn’t feel authentic.’ His thing was never ‘Is
this authentic?’ It was ‘Does this feel good? Is it a good performance?’ Russ Titelman came
from a similar place. His sense of exploration was amazing, and both of those guys have
immaculate pop ears. You make every decision – technical and aesthetic – in service of the
musical intention of the song. (Merchant, interview with author, 27 April 2017)

Merchant goes further still, suggesting ‘It strikes me that there is no such thing as an authentic
recording.’ From his perspective, the basic demo version of a song written with acoustic
guitar or piano is arguably the most authentic version because at that stage the process
is only mediated by the singer/songwriter without the elements typically accompanying
external production intervention. Observations by veteran hit producer/engineer Hugh
Padgham (The Police, Sting, Phil Collins, David Bowie, XTC, McFly) reinforce the idea that
authenticity is not just contested ideological territory but fundamentally involves
creating the best representation of your art for your audience (Padgham, interview with
author, 20 March 2018).
The shadow of authenticity will always loom above popular music as long as musicians,
critics and audiences maintain the need to define style and genre parameters as means of
identifying prescribed sonic realities beyond which other expressions may be deemed
less than authentic.

Notes
1. Clarke’s production work across several decades has included many well-known Jamaican
reggae artists such as I-Roy, Dennis Brown, the Mighty Diamonds and Gregory Isaacs.
His success has also encompassed international reggae acts including Maxi Priest and
Aswad, and the global hit single ‘Telephone Love’ (1989) by J. C. Lodge.
2. In her preliminary report for Marvin Gaye’s estate dated 17 October 2013, musicologist
Judith Finell (2013) enumerates the specific similarities between ‘Blurred Lines’ and ‘Got
To Give It Up’.
3. The infringement verdict was upheld on 20 March 2018 by a majority opinion, following
an initial judgement on 10 March 2015. I served as an expert witness for the estate of
Marvin Gaye and submitted a report that was admitted into the case’s evidence.

Bibliography
Alleyne, M. (1999), ‘Positive Vibration? Capitalist Textual Hegemony & Bob Marley’, in
B. Edmondson (ed.), Caribbean Romances: The Politics of Regional Representation, 92–104,
Charlottesville: University Press of Virginia.
Alleyne, M. (2000), ‘White Reggae: Cultural Dilution in the Record Industry’, Popular Music
& Society, 24 (1): 15–30. doi: 10.1080/03007760008591758.
Alleyne, M. (2018), ‘Trajectories and Themes in World Popular Music:
Globalization, Capitalism, Identity’, Ethnomusicology Forum, 27 (3): 362–366. doi:
10.1080/17411912.2018.1543608.
Ashcroft, B., G. Griffiths and H. Tiffin (2005), Post-Colonial Studies: The Key Concepts, New
York: Routledge.
Burgess, R. J. (2013), The Art of Music Production: The Theory and Practice, New York: Oxford
University Press.
Burgess, R. J. (2014), The History of Music Production, New York: Oxford University Press.
Buskin, R. (2012), Classic Tracks – The Real Stories Behind 68 Seminal Recordings, London:
Sample Magic.
Case, B. and S. Britt (1979), The Illustrated Encyclopedia of Jazz, London: Salamander.
Classic Albums: Peter Gabriel – So (2012), [DVD] Dir. George Scott, USA: Eagle Rock.
Cunningham, M. (1998), Good Vibrations: A History of Record Production, London:
Sanctuary.
Eisenberg, E. (2005), The Recording Angel: Music, Records and Culture from Aristotle to Zappa,
New Haven, CT: Yale University Press.
The Greatest Ears in Town – The Arif Mardin Story (2013), [DVD] Dir. Joe Mardin and Doug
Bird, USA: Shelter Island.
Judith Finell Music Services, Inc. (2013), ‘Preliminary Report: Comparison of “Got To Give It
Up” and “Blurred Lines”’, 17 October 2013.
Krüger Bridge, S. (2018), Trajectories and Themes in World Popular Music: Globalization,
Capitalism, Identity, Sheffield: Equinox.
Les Paul: Chasing Sound (2007), [DVD] Dir. John Paulson, New York: Koch Vision.
Moore, A. (2002), ‘Authenticity as Authentication’, Popular Music, 21 (2): 209–223.
Moorefield, V. (2005), The Producer as Composer: Shaping the Sounds of Popular Music,
Cambridge, MA: MIT Press.
Schmidt Horning, S. (2013), Chasing Sound: Technology, Culture & the Art of Studio Recording
from Edison to the LP, Baltimore, MD: Johns Hopkins University Press.
Shuker, R. (2017), Popular Music: The Key Concepts, 4th edn, London: Routledge.
Starr, L. and C. Waterman (2014), American Popular Music, 4th edn, New York: Oxford
University Press.
Weisethaunet, H. and U. Lindberg (2010), ‘Authenticity Revisited: The Rock Critic and the
Changing Real’, Popular Music & Society, 33 (4): 465–485.
Zagorski-Thomas, S. (2014), The Musicology of Record Production, Cambridge: Cambridge
University Press.
Zak, A. (2001), The Poetics of Rock, Los Angeles: University of California Press.

Discography
Bee Gees (1977), [7” vinyl single] ‘Stayin’ Alive’, RSO.
Bee Gees (1977), [7” vinyl single] ‘Night Fever’, RSO.
Davis, Miles (1959), [LP] Kind of Blue, Columbia.
Davis, Miles (1960), [LP] Sketches of Spain, Columbia.
Davis, Miles (1969), [LP] In a Silent Way, Columbia.
Davis, Miles (1970), [LP] Bitches Brew, Columbia.
Gabriel, Peter (1986), [LP] So, Geffen.
Gabriel, Peter (1992), [LP] Us, Geffen.
Gaye, Marvin (1977), [7” vinyl single] ‘Got To Give It Up’, Pt. 1, Tamla.
Marley, Bob, & the Wailers (1973), [LP] Catch a Fire, Island.
Marley, Bob, & the Wailers (2017), [CD-set] Exodus Fortieth Anniversary Edition. Tuff Gong/
Island/Universal.
Paul, Les (1948), [78 rpm single] ‘Lover’, Capitol.
Ronson, Mark, featuring Bruno Mars (2014), [digital single] ‘Uptown Funk’, Sony.
Simon, Paul (1972), [7” vinyl single] ‘Mother And Child Reunion’, Columbia.
Simon, Paul (1986), [LP] Graceland, Warner Bros.
Stewart, Rod (1975), [LP] Atlantic Crossing, Warner Bros.
Thicke, Robin, featuring T.I. and Pharrell (2013), [digital single] ‘Blurred Lines’, Star Trak/
Interscope.
Tristano, Lennie (1962), [LP] The New Tristano, Atlantic.
3
How to Study Record Production
Carlo Nardi

Introduction: Premises for the study of record production
In order to study a subject rigorously and systematically, we need to consider what its
characteristics are. Sound recording can be understood as a process in which audio
representations of musical activity are produced and then distributed as autonomous
objects within a complex set of contexts. This has made it possible to set the context of
audio production apart from that of consumption in a temporal, spatial and symbolic way.
Records, digital files, soundtracks, ambient sound, audio books and similar instances of
reproducible sound allow for uses that cannot be controlled by those who created them
in the first place. As cultural artefacts they are at the centre of complex processes of
commodification, identity formation and aestheticization that require research approaches
that, albeit not necessarily new, are at least in tune with the discipline. The musical
phenomenon thereby becomes repeatable and this has had an influence both on what
music is and how we can study it: ‘Recorded sound, by allowing multiple repetitions of the
listening experience, permits a detailed and considered exploration of a piece in a way that
the concert experience doesn’t’ (Zagorski-Thomas 2014: 38). The recorded medium and
technology of reproduction have affected not only the way we make music but also how we
understand it: the phonograph was crucial in enabling ethnomusicologists to measure folk
melodies objectively and to recognize cultural patterns in non-Western musical traditions
(Lampert 2008: 384).
The legitimacy of record production as a distinct field of study can hence be explained
by its peculiar characteristics, which address, among other things, issues related to
music, sound, aesthetics, cultural memory, the division of labour, communication,
commerce, technology and perception. It developed mainly within popular music studies
and ethnomusicology and only in relatively recent times – at least, as compared to the
impact of sound recording on culture and society. Nevertheless, it has arguably produced
new insights about a vast range of topics such as the functioning of the music industry, the
complexity of the creative process, the cycle of technological invention, and the production
and reproduction of musical values.
We can thus study record production both as a process and as an object. In this regard,
Butler argues for a comprehensive look at the different dimensions that contribute to
the recording as an object of/for analysis, namely ‘the interactions of record production,
aesthetic values, song writing and commercial concerns that jointly determine that object’
(2012: 223). From this stance, art and technique are inexorably intertwined, a feature that
makes record production a privileged position from which to look at the social construction
of the concepts listed by Butler.
Research design and any methodological choices following a research question are
dependent upon a range of considerations regarding – but not limited to – the kind of
topic of research, the theoretical framework, more specific aspects concerning the research
problem and how it is addressed in the research question, the type of variables investigated,
the researcher’s wider epistemological standpoint, the resources available, and the actual
or prospective referent/public. More concerns relate to the scale of the phenomenon
under investigation (micro- or macro-study), the nature and extent of the geographical
area covered, the temporal range under investigation, and whether data collection involves
an instantaneous snapshot of reality (cross-sectional or transverse study) or repeated
observations aimed at registering change over time (longitudinal study).
Assuming that the basic methodological notions can be effective regardless of the
chosen subject of research, the aim of this chapter is to look at those assumptions keeping
in mind the specifics of record production. For this purpose, I will discuss some of the
methods so far adopted by scholars, namely ethnography, content analysis (in its most
comprehensive meaning) and historiography, concluding with some observations about
quantitative approaches. While discussing methods, I will tackle some of the most common
methodological concerns about epistemology, conceptualization, operationalization,
sampling and research evaluation, and relate them to the topic at hand.

Ethnography
The growing literature about music production and the establishment of academic
organizations, such as the Association for the Study of the Art of Record Production (ASARP),
bear witness to a rising interest in music technology in education that expands the scope of
a field that has hitherto been an almost exclusive domain of the industry, with bodies such
as the Audio Engineering Society (AES), and complements the co-optation of an increasing
number of sound engineers and music producers into higher education not only in vocational
studies but also in academic courses in the humanities and social sciences. As a corollary
to this, the approach to the topic of record production in scholarly literature has frequently
privileged the emic perspective of practitioners. Sound engineers and record producers have
brought new insights about music technology, music-making and the music industry by
means of offering an insider view of what has often been considered as a sort of black box
where an unfathomable amalgam of music ideas and commercial intuitions are transformed
into hit records (and, more often than not, flops or merely forgettable recordings). This
imbalance has happened for several reasons, including the specialized competence required
to work in a studio, the relatively low visibility in the media of certain roles and the emphasis
on musical professions more in tune with established aesthetic hierarchies based on the
distinction between art and technique (Frith 1986; Kealy [1979] 1990). Hence, it should
not come as a surprise that studies about record production often favour an ethnographic
approach, observing studio practices or providing first-hand accounts of the researcher’s
own experience as a practitioner (see, for example, Porcello 1991).
Ethnographic work is aimed at gaining an understanding of a phenomenon from the
perspective of the actors involved. It requires access to social interactions as they take
place. The task of the ethnographer is to observe, describe and interpret those interactions.
As compared to other methods, ethnography ‘involves a prolonged, systematic, first
hand and direct encounter with the people concerned, as they act out their lives in a
range of interactional contexts’ (Payne and Payne 2004: 73). Insofar as the ethnographer
participates in the reality observed, whether as a facilitator, a peer or simply by inevitably
interfering through their presence, the reflexivity of ethnographic work implies that the
observer’s actions and reactions are at the same time a significant aspect of the reality
observed. This means that, while interactions should remain as close to their normal
unfolding as possible, ethnographers, if they find it useful, can engage in questions or even
interviews in order to gain a better understanding of the phenomena studied.
Citing a study in which ten families were asked to produce recordings of their holidays
and then discuss them together, Karen Bijsterveld (2016: 106–107) argues for what
she defines as ‘experimental ethnography’. Although there was no control group and the
conditions did not allow satisfactory control over the variables, Bijsterveld’s experience
suggests that, in certain situations, the researcher can act as a facilitator, giving tasks to the
subjects observed rather than simply asking them questions. In my PhD dissertation (Nardi
2005) I did something similar by giving sound engineers a mastering task, even though I
was aware that merely by participating in the mastering process I had already tarnished
the field, since mastering sessions are generally limited to the mastering engineer. This
approach rested on the conviction that by assuming a specific role that exceeded that of the
participant observer and by working with audio material I was familiar with I could better
understand the process under observation without having to ask engineers to verbalize
their actions, interrupting the regular flow of their work. There are indeed several examples
in which a task commissioned by the researcher defines the setting for observation of
record production practices. Meynell (2017) pursues ‘practice as theory’ by using historical
equipment to demonstrate how tacit knowledge guides the production choices connected to
that equipment. For this purpose, he re-enacted two classic recording sessions, respectively
of ‘Eight Miles High’ by The Byrds and ‘Rain’ by The Beatles, complying as much as possible
with the constraints that characterized the original sessions. This experiment brought him
to the conclusion that equipment has a weaker impact on the resulting recordings than
established working methods. Similarly, Sean Williams, in his study of King Tubby’s dub-
style techniques, complements the material analysis of equipment with the assimilation of
King Tubby’s studio set-up into his own creative music practice (2012: 235). This allows him
to compensate for the lack of direct documentation of the original studio practice, which
would have led him to rely exclusively on second-hand testimony. The idea – significantly,
both Williams and Meynell operate within the theoretical framework of actor-network
theory (ANT)1 – is to let technological devices ‘speak’ for themselves by activating them
within specific ‘performance ecosystems’ that include context-specific tools, techniques,
studio personnel and audience feedback among other things (S. Williams 2012).
Commissioned tasks can be fruitfully used in very different research settings, such as
action research or vocational subject areas (nursing, hospitality, social work, etc.). Here the
main aim of research might not necessarily be that of understanding studio practice but
rather that of using our understanding of this subject to pursue community engagement
or cultural democracy. Action research methods, being based on critical pedagogy, are
inherently participatory and inclusive. As a consequence, they can be employed to build and
share knowledge about studio practice with the participants. Examples of such practices are
provided by experiences from community music activities (Higgins 2012: 44, 49, 58–59).
The ethnographer must share with the individuals under observation competence in
the relevant cultural codes, which encompass language skills (including technical language
skills and familiarity with professional jargon, see Porcello 2004), technical knowledge as
well as the relational skills required in that particular social context. These competences, as
well as the capacity to act them out naturally, are normally linked to prior direct experience
in the field, something that can be aptly summarized with the concept of habitus as a set of
socially conditioned dispositions comprising a mix of habits, bodily techniques, outlooks,
styles, tastes and orientations (Bourdieu 1972). As Théberge puts it, habitus in music ‘takes
the form of that unconscious yet fully structured system of sounds, gesture, meanings, and
conventions that we commonly refer to as a “style”. […] Style, then, for the musician is
something that is acquired only through an extended process of learning through practice’
(1997: 167).2
Hence, some expertise, both practical and theoretical, is required in order to be able
to grasp what is going on inside the environment of a studio: ‘Like jazz improvisation,
the success of participant ethnography is a matter of interaction and communication,
shifting patterns of strangeness and familiarity, even practice’ (Porcello 2003: 269). A
common professional background, furthermore, can be pivotal to legitimize the presence
of the observer in the research setting, regardless of whether the researcher is undercover
or out in the open. For instance, Alan Williams (2012) expanded his former professional
insights through ethnographic work, which showed how the computer screen mediates the
relationship between sound engineers and musicians inside the recording studio, making
the recording process more accessible to the latter.
Nonetheless, an ethnographer possessing a different skill set and expertise might be
able to grasp different interesting aspects that a music professional might not be able to
recognize, due to their sharing the same ideology as the observed subjects. In other words,
non-professionals can also contribute greatly to the understanding of record production.
A notable example of this possibility is offered by Louise Meintjes’s (2003) study of the
production process of an album of mbaqanga music. Based on ethnographic fieldwork in
a recording studio in Johannesburg, it was conducted around the time of the transition
from apartheid to democracy. It documents the encounter between black musicians and
producers, white sound engineers and a white ethnomusicologist and shows, among other
things, how race and ethnicity are mediated through style and sound aesthetics. Such an
approach allowed Meintjes to recognize how power is acted out in a studio where different
roles have different access to it:

In-studio sound mixing is a process of negotiation for control over the electronic
manipulation of style. If style is conceived as a performed and multi-layered sign that
expresses, constructs, and reproduces the sensibilities of artists […] then recording and
mixing is a dramatized struggle over signs embodying values, identities, and aspirations. In
their struggle, studio music-makers rework or reaffirm their sociopolitical and professional
positioning in relation to one another. These negotiations concern the creative use of the
studio’s technological resources even as they happen through it. (8–9)

Although, as I have mentioned earlier, ethnography shows reality from the perspective
of the people under observation, paradoxically it can also be helpful in dismantling any
implicit or explicit misrepresentations the same people would normally convey about their
work. As with other professions that, being subject to the market and to competition,
build a certain degree of secrecy into their functioning, it might nevertheless be hard to
gain access to the hubs in which decisions are taken.
This is further problematized with the move from an era in which records were
prevalently made in a traditional studio setting (and occasionally in other locations thanks
to a mobile studio, yet always with the mediation of a sound engineer) to a wide range of
situations which, to further complicate things, often involve the conversion of domestic
space into workspace. As a consequence, it can be even more problematic for ethnographers
to identify and access those spaces. Moreover, it must be taken into consideration that not
only the location but also the characteristics of music production spaces have shifted as
compared to traditional working spaces. They now often manage to condense production
and consumption, spare time and working time, and corporate space and the individual
domain (Théberge 1997). In order to be able to grasp this changing context of cultural
production, recording technology itself ‘must be understood as a complete “system” of
production involving the organization of musical, social, and technical means’ (193).
The shift to internet-based forms of production, consumption and collaboration, in
particular, requires new forms of field research, where – Lysloff argues – the field is not
so much a physical place but rather a virtual place ‘in which social interaction and group
formation can take place’ (2003: 28). Recognizing the development of new social spaces
for music production and work collaborations, he uses online ethnography to study an
electronic music community based on the web by means of ‘learning particular musical
and linguistic skills, interviewing composers, visiting numerous research sites, collecting
relevant texts and audio recordings, and observing various kinds of music-related activities’
(24). While online ethnography is concerned with the study of interactions between web
users, the materials thus produced and shared can also constitute a subject for study –
something that cloud systems, as a way to manage individual or collective workflows
online, possibly make even more relevant. Hence, the internet can be thought of as a
storage medium, based on nodes and ties as well as artefacts that can be studied as content.
Unobtrusive research methods: Content analysis and historiography
Content analysis, analysis of existing statistical records, and comparative and historic
research are ways to gather data unobtrusively or non-reactively:

The possibilities are limitless. Like a detective investigating a crime, the social researcher
looks for clues. If you stop to notice, you’ll find that clues of social behavior are all around
you. In a sense, everything you see represents the answer to some important social science
question – all you have to do is think of the question. (Babbie 2013: 295)

Content analysis, in particular, studies recorded human communications in any form
and is therefore interested in social and cultural documents and in any kind of human
artefacts:

A document is any material that provides information on a given social phenomenon
and which exists independently of the researcher’s actions. Documents are produced by
individuals or institutions for purposes other than social research, but can be utilized by the
researcher for cognitive purposes. (Corbetta 2003: 277)

Documents that are useful for the study of record production, besides the recording itself,
include – to name a few – any unit or portion of audio recording used in the final mix or
excluded from it (e.g. alternate takes, separate tracks, stems, samples); any written, drawn
or otherwise recorded notes such as input lists, tech riders, patches, automations, etc.
(e.g. pictures showing the disposition of performers in a studio); instruments and studio
equipment; formats and carriers; recording locations; advertising and press notes; interviews
and media content in general; commercial regulations and contracts; tax returns; minutes of
board meetings. Further valuable documents pertain to the distribution and consumption
of recorded music: music stores, listening devices, locations for music listening, record
charts, playlists, broadcasts, internet cloud content, etc. In quantitative analyses, some of
these may constitute the sampling frame, from which a sample is selected. Finally, given
the overlap between production and reproduction in the contemporary music landscape
(Théberge 1997), artefacts related to the practices of sampling, DJ mixes, mashups and the
like should also be considered.
Documents can be in written form or, more typically for a context significantly based
on aurality, occur as material traces. In addition, they can be studied first-hand by the
researcher or through the mediation of an expert or testimony, hence through interviews,
memoirs or diaries. Regardless of their source and form – videos, interviews, recorded
music, etc. – data can be reorganized and analysed either qualitatively, through thematic,
ideological or semiotic analysis, or quantitatively, through formal content analysis, which
often is called just content analysis.
Content analysis consists of a process of coding, in which data deriving from a form
of communication are classified according to a conceptual framework developed by the
researcher (Babbie 2013: 300). Coding involves the interpretation and categorization of
data and is also influenced by various degrees of access and various forms of negotiation,
including power negotiations between professional roles (e.g. sound engineers vs.
performers), gender, age groups, and so on. It must be noted that content analysis is not
limited to reducing sound recordings to objects, as it should consider the processes that are
in play in producing and using those objects. In other words, it is crucial to recognize that
sound recordings are at the intersection of wider socioeconomic dynamics, or else their
historic and cultural meaning is inevitably misunderstood.
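To make the coding step more concrete, the following minimal sketch (in Python, and purely illustrative) applies a toy codebook to two invented interview excerpts and tallies the categories each excerpt triggers. The category labels, keywords and excerpts are all hypothetical; a real coding scheme would emerge from the researcher’s own conceptual framework and from interpretive work, not from simple keyword matching.

```python
# Illustrative only: a toy codebook applied to invented interview excerpts.
# Categories, keywords and excerpts are hypothetical; real coding is
# interpretive and would be refined iteratively against the material.
from collections import Counter

codebook = {
    "technical_vocabulary": ["compression", "EQ", "latency"],
    "creative_agency": ["my idea", "experiment", "happy accident"],
    "power_negotiation": ["the label wanted", "the producer insisted"],
}

excerpts = [
    "The label wanted a brighter mix, so we pushed the EQ harder than I liked.",
    "That delay was a happy accident, but we kept it because it felt right.",
]

def code_excerpt(text):
    """Return the codebook categories whose keywords appear in the excerpt."""
    lowered = text.lower()
    return [category for category, keywords in codebook.items()
            if any(keyword.lower() in lowered for keyword in keywords)]

tallies = Counter(category for excerpt in excerpts
                  for category in code_excerpt(excerpt))
print(tallies)
```

Even in this reduced form, the sketch makes visible the decisions that coding entails: what counts as an indicator of a category, and how overlapping categories are to be handled.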
A common problem of content analysis concerns sampling, which inevitably affects
the generalizability of its results. Researchers typically choose documents containing
information that is considered representative or symptomatic of some concept that they
want to illustrate. This has an impact on the consistency of a study and its validity, which
describes among other things whether an inference is appropriate for the context and the
sample analysed. The fact that the chosen examples are in accordance with their hypothesis
is not proof of generalizability. They may be exceptions, they may not be causal or there
may be other factors involved.
The problem of generalizability especially concerns formal (quantitative) content
analysis. The latter involves the systematic sampling of texts and is based on the identification
of selected aspects of their content that are described in verbal form and classified
according to determined criteria. Data thus collected can then be analysed statistically, so
that research findings can be generalized. Other researchers, moreover, can replicate the
whole process, strengthening the reliability of the results. This kind of analysis could be
applied to study how specific elements of a recording contribute to a certain outcome such
as entering a record chart or clicking with a certain social group. As a matter of fact, formal
content analysis is especially useful in setting a possible cause in relation to a known effect,
especially comparing findings obtained through content analysis to data collected through
audience research. Since this approach can provide evidence of stereotypes, it could also
be used to substantiate studies addressing gender or racial biases in recording studios,
curricula, online forums and print media.
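As a purely hypothetical illustration of this cause-and-effect logic, the sketch below cross-tabulates a coded feature of a sample of recordings against a known outcome (entering a chart) and applies a chi-square test of independence using SciPy; every count is an invented placeholder rather than a finding, and the caveats about sampling and spurious correlation raised above apply in full.

```python
# Hypothetical illustration: relating a coded content feature to a known
# outcome with a chi-square test of independence. All counts are invented.
from scipy.stats import chi2_contingency

# Rows: feature coded as present / absent in the sampled recordings.
# Columns: entered a chart / did not enter a chart.
contingency_table = [
    [34, 16],  # feature present
    [22, 28],  # feature absent
]

chi2, p_value, dof, expected = chi2_contingency(contingency_table)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}, dof = {dof}")
# A small p-value would only indicate an association within this sample;
# causation and generalizability still depend on sampling and design.
```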
Qualitative content analysis comprises semiotics and thematic analysis. Of these,
semiotics has played a significant part in the affirmation of popular music studies around
the beginning of the 1980s, especially with the work of Philip Tagg in his musicological
analysis of the theme from Kojak (Tagg 1979). While making extensive use of score
transcriptions, he also acknowledges the role of sound recording in the identification of
musemes as the smallest unit of musical meaning. Similarly, Alan Moore’s (1992) definition
of texture as the relationship between different identifiable strands of sound, being
metaphorically derivative from multitracking, not only recognizes the specifics of recorded
sound but also would not be possible from an analytical perspective without the existence of
the recording as the ‘primary text’. While semiotic analysis reminds us that the recognition
of the key role of sound recording lies at the heart of popular music studies, nonetheless it
is more interested in sound recording as an object rather than as the outcome of a process.
Semiotics has found little application in the study of record production, possibly since the
growth of scholarly interest in the latter has occurred concurrently with the relative demise
of the academic popularity of the former.
More recently, cultural studies not only have privileged thematic analysis (analysis
of discourse or ideology) over semiotics, but they also have placed greater emphasis on
highlighting the relations between a text and its contexts of production and reception:
‘Although there may be a nominal text such as a score or a recording, the musical
meaning should be explored through how it emerged from a production system and how
it is interpreted through a reception system’ (Zagorski-Thomas 2014: 32). On defining a
musicology of production as a distinct field of study, Moore writes:

[It] would need to address the musical consequences of production decisions, or the
consequences attendant on the shifting relationship between production decisions and
the decisions of musicians about their performative practice. Production decisions are
made principally by producers (who may also be the musicians involved in a production),
and secondarily by engineers who are responsible for the decisions which mediate what
musicians do to what listeners hear. It is with the (musical) results of these decisions that a
musicology of production would be concerned. (2012: 99)

Thematic analysis is aimed at interpreting the explicit or implicit intentions behind a
text, uncovering any ideological conditioning that may be operating at a given time
and place. This means that the same material traces that we obtain from a text should
be embedded within specific cultural frameworks of music production. To exemplify
this point, Simon Zagorski-Thomas (2014: 36) hypothesizes a ‘spectral and dynamic
analysis to compare the audio effect of analogue tape saturation with a digital signal
processing emulation of the same effect. […] In the end, though, any such evaluation
is based on a presupposed audience aesthetic that is likely to differ between musical
communities.’
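By way of illustration only, the kind of spectral comparison Zagorski-Thomas hypothesizes could be sketched roughly as follows. The file names are placeholders, the use of the soundfile and SciPy libraries and of Welch’s method is an assumption made for the example, and the single summary figure says nothing in itself about how any particular listening community would judge the two signals.

```python
# A rough sketch of the spectral comparison hypothesized above. File names
# are placeholders; both files are assumed to contain the same programme
# material (at the same sample rate), processed once through analogue tape
# and once through a digital emulation.
import numpy as np
import soundfile as sf
from scipy.signal import welch

def average_spectrum(path, n_fft=4096):
    audio, rate = sf.read(path)
    if audio.ndim > 1:          # fold multichannel audio to mono
        audio = audio.mean(axis=1)
    freqs, power = welch(audio, fs=rate, nperseg=n_fft)
    return freqs, 10 * np.log10(power + 1e-12)   # power spectrum in dB

freqs, tape_db = average_spectrum("tape_saturation.wav")
_, emulation_db = average_spectrum("dsp_emulation.wav")

# Mean absolute deviation between the two spectra across all analysed bands.
print(f"mean |difference|: {np.mean(np.abs(tape_db - emulation_db)):.2f} dB")
```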
Since the process of record production is verbalized only in limited part, a scrupulous
work of conceptualization and classification of the relevant themes and their indicators is
required. Furthermore, thematic analysis should be able to recognize the specific aspects
of sound recording as compared to music performance or composition, but it should never
disengage those aspects from both reception and the wider musical and cultural field –
which includes, among other things, performance and composition, too. In this sense,
audience analysis can put thematic analysis to the test.
Thematic analysis, as previously seen, must accept the implication that data are gathered
primarily in support of a given hypothesis. While this inevitably affects the generalizability
of a study, its reliability should be afforded by data comparison and openness to refutation:
theoretical speculation should, directly or indirectly, refer to observable data. If the first
limit is a matter of sampling and concerns, to a lesser or greater degree, all forms of
content analysis, the second limit poses a threat to science in the form of metaphysics,
with postmodernism – as one of its more recent manifestations – emphasizing
relativity and rejecting any unifying principles in science and methodology, to the point of
actually alienating theory from factual corroboration.
Besides unobtrusiveness, another advantage of document analysis is that it can be
used to study the past, an aspect which blends the role of the sociologist with that of the
social historian (Corbetta 2003: 278). Théberge (1997) provides a notable example of the
advantages of integrating thematic analysis with historiography. In his book, he shows how
digital instruments, such as drum machines, synthesizers, samplers and sequencers, engage
music makers not just in the production of sounds but also in their reproduction (2–3).
This informs the nature of musicianship, blurring the divide between production and
consumption of music (4). For this purpose, he focuses on technological innovation using
documents that shed light on the intersection between the industry supplying instruments
(the moment of design/production), their promotion in the media (mediation) and the
meaning that they assume while in use.
Historiography can review widely accepted canons in which certain figures are given more
prominence at the expense of others on ideological grounds. A function of historiography
can thus be that of identifying privileges and patterns of exclusion of entire social groups
from the benefits of record production in order to ascertain individual or collective
contributions that are omitted in official accounts. These may include written records and
contracts, which for various reasons can be suppressed, or partially or utterly falsified. More
specifically, oral histories and other forms of direct testimony allow research to give a voice
to those groups that are typically marginalized in historical reports based on documents,
either because of their lower status or because they are regarded as unexceptional (Bryman
2012: 491). Booth (2008) takes advantage of this opportunity in his rehabilitation of the
often-underestimated role of music directors’ assistants, arrangers and session musicians
in the Hindi film music industry.
From a different angle, historiographies that put too much emphasis on technological
development run the risk of telling a different story from that of cultural production,
neglecting in the meantime users’ agency and the fact that, as technology can influence
ideology, ideology can also influence technology (Sterne 2012a). In this respect, Burgess
(2014), not unlike other scholars, traces a chronology of music production based on
technological devices and their impact on the development of different techniques, music
styles and business models. This brings him to identify stages of development, such as
an acoustic, an electric and a digital age, that follow each other while, at the same time,
interweaving with cyclical economic growth and decline. The impact of technology on
these aspects is self-evident and, to be fair, Burgess also acknowledges the active role of
changes in aesthetics and in the structure of the recording industry. Yet such a framework
for historical analysis is always on the verge of suggesting a deterministic approach to both
technology and cultural endeavours.
It goes without saying that it can happen at any time, including during a revision of
orthodox historiography, that external forces that are not primarily motivated by scientific
curiosity may direct scholarly interests and approaches, insofar as research funding depends
on the favour that a certain topic encounters among corporations and the public. Also,
the way a particular scholar sees reality depends on their social background and their
acculturation. Of course, every historical reconstruction bears its own ideological burden,
so that even the best-intentioned attempt to appreciate or rediscover a forgotten contributor
to the history of recording is really telling a particular history of recording – one more good
reason to always clarify the epistemological and methodological framework of a piece of
research.
Last but not least, research approaches need to consider that the history of recorded
music is also a history of broadcasting, competing music formats, sound diffusion systems,
music used more or less expressly as a background for various activities, and all those
situations in which the possibility of separating sound production from its reproduction is
crucial, if not a binding condition.

Quantitative approaches
Quantitative approaches are inspired by the neopositivist paradigm and structure the
relationship between theory and research according to deduction, where empirical
evidence is used to support a previously formulated hypothesis (Corbetta 2003: 38).
Research planning, hence, involves a cyclical pattern: a theory is presented as a testable
model, a research design as an expression of a strict and rigorous conceptual order is
arranged, data are collected, then analysed and, finally, the initial theory is tested (36).
Quantitative research can manipulate the environment in order to test variables through
experimentation or, otherwise, it can collect data unobtrusively, for instance through
surveys or taking advantage of existing statistics. At any rate, researchers need to distance
themselves psychologically and, if possible, also physically from the research object.
Quantitative approaches normally pursue nomothetic explanations, in which a
given variable is probabilistically linked to one or more other variables in a
relation of causation. As compared to qualitative approaches, which seek description
and interpretation, and to idiographic explanations, which seek an in-depth elucidation
of a single case, nomothetic explanations aspire to universality. They should respect
three mutually binding criteria: statistical correlation between the variables observed, a
consistent time order between cause and effect, and nonspuriousness (i.e. the exclusion of
any further variables as the possible cause of the dependent variable) (Babbie 2013: 93–94).
Researchers are faced with a fundamental choice between depth of analysis, affecting
the validity of their research, and specificity of understanding, affecting its reliability
(300). While ethnography typically aims at depth at the expense of reliability (i.e. the
possibility to generalize research results), surveys, experiments and formal (quantitative)
content analysis aim at specificity, in order to infer data about a population starting from
a sample. The main advantage of quantitative methods is that they allow the validity of
research findings to be assessed (are we really observing what we mean to? Are indicators
actually telling us if a variable is present in the data observed?). They can also indicate their
reliability (if we repeat a measurement with the same conditions at a different time, do we
obtain the same results?) and their generalizability (do the results apply to the entire target
population?).
Quantitative approaches in the human and social sciences are most commonly
associated with surveys. Surveys are a standardized instrument for submitting predefined
questions to all the respondents. It coerces them into discussing only the topics that
the researcher wants to focus on. This means that there is no in-depth analysis of the
respondents and that the researcher needs to have a previous understanding of the
context of the research, in order to identify the relevant variables and indicators.
Moreover, the action of compiling a questionnaire is a sui generis situation that requires
certain arrangements to be made, both regarding the formulation of the questionnaire
items and regarding its administration.
Surveys are usually adopted for studies having individuals as their units of analysis,
which can be especially useful, as the latter would coincide with the respondents (Babbie
2010: 244). When it is not feasible or not advantageous to gather data about an entire
population, the analysis of a sample of that population can provide inferences about the
latter while saving time and other resources. Identifying a target population and extracting a
probabilistic sample can be especially problematic when dealing with cultural products and,
in particular, when the unit of sampling consists of artefacts, such as recordings. A different
order of problems affects studies, where the target population consists of professional roles
for which there is no comprehensive register, such as independent producers, recording
assistants or remixers. These problems are first of all conceptual: can we fully distinguish
a producer from an arranger or a sound engineer in most digital audio workstation-based
productions? How do we define the term ‘independent’? Do we include those employees
of a studio who are not trained for the job as recording assistants but who occasionally
help sound engineers during recording sessions? Secondly, according to the researcher’s
conceptualization, we need to identify a definite target population, whose members possess
the required qualities of our category at least to a degree that is considered significant.
Thirdly, we need to find a way to extract a sample, either randomly or as a stratified random
sample, from this target population.
To exemplify these problems: if we were conducting a quantitative study that sought
to establish a causal relationship between mastering engineers’ age and their preference for
digital versus analogue mastering, our unit of sampling could be, for instance, mastering
engineers. In order to identify the target population, we would have to be able to count all
the mastering engineers operating at a given time and place. If we chose record releases
as the sampling unit, the need to demarcate what a ‘record release’ is would also pose
significant methodological quandaries, first of all because it would plausibly include records
mastered by engineers residing outside our area of interest. Even if we chose the largest
database of record releases, let’s say Discogs, we well know that not all music releases can
be found there. Then we should consider that the same release could have different masters
for different digital music services and for physical formats. This is not to say that these
problems are insurmountable but rather that they should be tackled knowledgeably during
conceptualization and, accordingly, during sampling. Regarding the last example, one
might decide to limit the target population to the releases that enter the Billboard Top 200
Albums or the Billboard Hot 100 Single charts, considering each track as a sampling unit –
although it might be hard to know how sales refer to different formats; also, we would need
to know in advance whether a track has been mastered differently for different formats.
This choice, however, should be justified convincingly not only by geographical reasons but
also by explaining why only successful releases, and only success as measured by charts, are
relevant for our research purposes.
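Purely as an illustration of the sampling step, and assuming the chart entries had already been compiled into a sampling frame (here a hypothetical CSV file), a simple random draw of tracks might look like the following sketch; the file name, field layout and sample size are arbitrary.

```python
# Illustrative only: drawing a simple random sample of sampling units
# (here, chart entries) from a previously compiled sampling frame.
import csv
import random

random.seed(42)  # a fixed seed lets the draw be documented and replicated

with open("billboard_entries.csv", newline="") as frame_file:  # hypothetical
    frame = list(csv.DictReader(frame_file))                   # one row per track

sample_size = 100
sample = random.sample(frame, k=min(sample_size, len(frame)))

print(f"frame: {len(frame)} tracks, sample: {len(sample)} tracks")
```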
Seemingly more appropriate for surveys are studies of the public/reception, where
there can be a more defined idea of the target population (e.g. youths attending secondary
school, citizens of a particular country, online record buyers) and where it is possible to
extract a sample randomly and in a way that both the population size and the sample size,
which affects representativeness, are appropriate for statistical analysis.
There is another aspect suggesting that quantitative analysis could be suited to
understanding record production. Corbetta describes the structure of quantitative
research as ‘a creative process of discovery, which is developed according to a pre-
established itinerary and according to predetermined procedures, that have become
consolidated within the scientific community’ (2003: 57). It should be noted that
sound engineering, as a relevant aspect of record production, is imbued with similar
notions of causality and experimentation, including setting experimental and control
variables (e.g. when comparing alternative EQ filter settings or bypassing an audio
signal processor), preserving the environment from unwanted interferences, and
relying on visual metres as a way to substantiate certain mixing choices with objective
evaluation.
As a matter of fact, there is a long tradition of quantitative studies of sound recording,
linked in particular to the establishment of the AES in 1948 and the Journal of the
Audio Engineering Society in 1953. AES was born in the aftermath of the Second World
War as a way to institute a communication network between company managements:
‘World War II of course produced advances of a technical nature that could be applied
to sound recording but the war impeded their application. It was this delay, this lack of
advancement, that concerned the founders of the AES when they gathered to lament
the slow progress and discuss how improvements could be effected’ (Plunkett 1998: 5).
Due to its links with the industry, the AES has been mainly involved in applied research
focusing on equipment and software specification and measurement.3
Partnerships between audio research organizations and the industry have since
flourished, bringing about an impressive body of data which, besides its own worth,
could be beneficial for a wide range of research endeavours in the cultural and social
sciences as well. This, however, has not happened as much as we would have expected until
very recently, possibly due to various factors: the highly specialized language used in
engineering, the concern that the lack of context in many studies might pin down
human agency and cultural difference, and the suspicion surrounding the epistemological
premises of applied research. Moreover, industrial secrecy can prejudice the depth of this
kind of data.4
Whereas the study of equipment and human perception can be approached also
from a social and cultural perspective, the quantitative study of record production does
not have to be limited to its most obviously quantifiable aspects nor to data gathered
through methods that are inherently linked to this kind of analysis, such as formal content
analysis or questionnaires. In fact, quantitative approaches can also be considered for
data collected through interviews, videos and fieldwork. Where compatible with the research
question, all data can be categorized numerically and measured with scales that depend
on the type of variable and the kind of data available, thus helping to investigate issues
related to attitudes, opinions, orientations, values, social mobility, etc. about both music
makers and their public.
The fact that the study of record production has been mostly confined within certain
specific scientific domains might be related to the kind of cultural and social capital
attached to the profession of sound engineering (and, partially, music producers) as
compared to a cultural industry that, at least in the eyes of the public, places artistic roles
in the foreground. For this reason, sound engineers sometimes emphasize the artistic
dimension of their work in order to enhance the perceived status of their profession (e.g.
Owsinski 1999). However, this view seems to suggest that only art is linked to creativity
while technique is a mere repetition of given procedures. Yet, missing the importance of
creativity in the choice and application of technical tasks would mean not recognizing
the value of the scientific knowledge implied in the daily activity of sound engineers and
record producers, where aesthetics and technical knowledge are inevitably intertwined.5
On the other hand, companies typically resort to quantitative data for various purposes:
justifying investments, segmenting markets, planning marketing strategies, etc. Scholars
may have favoured interpretive paradigms in an attempt to counterbalance this business-
oriented approach and the official history that comes with it, with the intent to give credit
to professionals normally operating relatively in the dark, to highlight their creative
contributions where they are not acknowledged publicly, or to underscore the collaborative
nature of record production even when it is obscured by the industry and the media. The
way terms such as ‘art’ and ‘technique’ are defined and used in actual sociohistorical
contexts may inform, but should never prevent or limit, research, including its methodology.
Finally, given that the practice of sound engineering includes more or less formally
coded psychoacoustic knowledge, the same body of knowledge has been used to understand
the reasons for aesthetic choices among both music makers and listeners as well as to
explore new developments of software engineering, including artificial intelligence (AI)
applications to mixing tasks (see, for example, De Man and Reiss 2013; Ma et al. 2015;
Reiss 2016; Reiss 2017; De Man, McNally, and Reiss 2017; Moffat and Reiss 2018).6 Such an
approach can shed light on topics as diverse as equalization and compression techniques,
the creation or re-creation of space and depth in a mix, microphone placement techniques,
market success and failure of reproduction devices and digital formats, and so on.
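To give a deliberately crude flavour of what such automation involves, the sketch below balances invented multitrack stems to a common level, using RMS as a rough stand-in for perceptual loudness. It is not a reconstruction of any of the systems cited above, which rely on far more sophisticated perceptual models and rule sets.

```python
# A deliberately crude illustration of automatic gain balancing across
# stems, using RMS level as a rough proxy for perceptual loudness. It is
# not a reconstruction of any system in the literature cited above.
import numpy as np

def rms(signal):
    return float(np.sqrt(np.mean(np.square(signal))))

def balance_gains(stems):
    """Return per-stem linear gains that bring every stem to the mean RMS."""
    levels = {name: rms(audio) for name, audio in stems.items()}
    target = float(np.mean(list(levels.values())))
    return {name: target / level for name, level in levels.items()}

# Invented stems: one second of noise at different levels, 44.1 kHz.
rng = np.random.default_rng(0)
stems = {
    "kick": 0.5 * rng.standard_normal(44100),
    "vocal": 0.1 * rng.standard_normal(44100),
    "guitar": 0.25 * rng.standard_normal(44100),
}

for name, gain in balance_gains(stems).items():
    print(f"{name}: gain {gain:.2f}")
```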
Psychoacoustics lies outside the range of topics covered in this chapter, which is mainly
rooted in the human and social sciences. From an epistemological stance it can be argued
that psychoacoustic approaches can encounter problems concerning both reliability and
generalizability. This indeterminateness depends on a conceptual problem at the heart
of the relationship between physiological data and culture, where the first cannot be
abstracted from the second. Music is clearly a cultural phenomenon, just as much as it is
sound: the idea that a sample extracted from a delimited population can reveal essential
truths about how humanity as a whole would judge a given musical event (for a
critique of the so-called ‘Mozart Effect’, see Waterhouse 2006) does not take into account
that music as well as listening are the results of learning processes, as the cultural study
of sound has pointed out (Feld 1982; Sterne 2003; Howes 2004; Erlmann 2010; Sterne
2012b; Papenburg and Schulze 2016). Nevertheless, psychoacoustic research, provided its
results are not extrapolated beyond the target population, can be useful in corroborating
or invalidating particular theories, including theories developed within different
epistemological contexts.

Conclusions: Science vs. metaphysics


The role of methodology is to provide the tools of enquiry around a given research question,
including how to conceptualize the topic investigated according to an epistemological
framework. Based on the acknowledgment of previous theorization, it can also assist with
the choice of the most appropriate ways to collect data about that topic, how to organize
and interpret data, and, finally, how to present the results to the wider public. A premise
is that methodology applies to scientific research, where every statement, including
theoretical speculation, must ultimately be supported by direct or mediated observation:
‘It is in the empirical component that science is differentiated from fantasy’ (Kaplan 1964:
34–35). Kaplan distinguishes between three classes of objects: observational terms, which
rest on direct observation, indirect observables, which require some kind of mediation, and
constructs, which are not observable either directly or indirectly (54–56). Although the
latter are the result of logical thinking, they are more or less directly linked to observables.
In other words, any theoretical construct, even the most abstract, should be exposed – at
least potentially – to empirical validation (or falsification). Constructs, of course, are as
important as observables to scientific research. Nonetheless, they should be kept apart
from metaphysics and any other explanation that rejects objectivity or that does not admit
the existence of reality beyond the knowledge that humans can have (or make) about it.
A second premise is that methodology is a means to an end; while the epistemological
foundations of methodology, like any other form of knowledge, can legitimately be
questioned, the concept of methodology itself is uncountable and does not have a plural
form; otherwise it could not provide a common ground on which the scientific community can
control the rigour of every single piece of research.
A third premise is that methodology oversees the entire development of a research
endeavour, except for the research question at its origin, which, as Max Weber argues, is
the researcher’s choice and can be justified solely by their personal interests:
The constructs of the cultural and social sciences reflect the values of the investigators. […]
Our cultural concerns launch our investigations; but once at work on a set of phenomena,
Weber argued, we should analyze our evidence for its own sake, without further regard for
our value interests. (Ringer 1997: 123)

All that comes after this initial choice, in fact, must follow a rigorous and verifiable path,
where every step develops logically from what precedes it; thanks to careful methodological
design, a study is open to inquiry from other scientists, who can assess whether the premises
justify the conclusions.
Notes
1. For a structured discussion of how to apply ANT to the study of record production, see
Zagorski-Thomas (2018). Zagorski-Thomas combined ANT with the social construction
of technology (SCOT) paradigm to shed light on the schematic representations that
establish direct connections between objects, perception, environments and processes,
thus feeding our expectations while performing specific tasks (852).
2. It is often assumed that sound engineering and music production can be learnt only
through practice as far as verbal teachings are insufficient to pass on its fundamentals.
For instance, Geoff Emerick tells about his first experiences at Abbey Road that ‘there
was no actual training; it was up to you to pick up what you could. You can’t really
train anyone, anyway – it’s like trying to train someone to paint a picture. If you can’t
paint, you can’t paint’ (Emerick quoted in Massey 2000: 52). Similar assertions can
be found in other interviews with practitioners, in vocational texts and in media
representations, often coupled with a certain degree of esotericism, which might be a
way to mask a lack of self-awareness or even self-esteem. It might also be interpreted as
a claim that only sound engineers and producers are entitled to mediate access to these
professions.
3. The social context in which scientific research emerges can affect the latter in many ways.
Merton coined the term ‘Matthew effect’ to describe ‘how resources and rewards (such
as opportunities to publish, or prestige) are assigned and distributed within the scientific
community. […] In science this principle translates into a cumulative effect which
exponentially rewards those who already occupy a privileged position’ (Bucchi 2004:
20). When looking at pictures of the first meetings of the AES (there is a revealing one in
Plunkett (1998) showing a room full of white men in office suits), one has the impression
that they do not simply reflect discrimination by gender and race in society, but even
amplify it.
4. Commenting on their inability to access knowledge about the algorithm of the
automated mastering service LANDR, Sterne and Razlogova (2019: 2) describe this
‘obfuscation of its own internal workings as a constitutive feature of its social and
cultural existence’. They further state that, as far as AI ‘exists within webs and flows of
culture and power’ (2) and since LANDR is a venture capital-funded corporation, ‘the
politics of AI cannot be separated from the politics of corporate capitalism, regulation,
and resistance’ (3).
5. Incidentally, some of the most renowned sound engineers, such as Bruce Swedien,
Glyn Johns and George Massenburg – to name a few – are often linked to eras that
marked pioneering advances in the profession, suggesting that, in this profession, status
is also acquired through creative inventiveness, mostly in the fields of technological
invention and business endeavours.
6. Moffat and Reiss (2018) investigate how to replace recorded with synthesized audio
samples based on perception-evaluation experiments; De Man and Reiss (2013), Ma et
al. (2015) and Reiss (2017) discuss different features of AI in sound processing; De Man
et al. (2017) examine the perception of artificial reverb in music recordings; Reiss (2016)
compares results from experiments into the perception of high resolution audio (beyond
44.1 kHz).
Bibliography
Babbie, E. (2010), The Practice of Social Research, 12th edn, Belmont, CA: Wadsworth.
Babbie, E. (2013), The Practice of Social Research (International Edition), 13th edn, Belmont,
CA: Wadsworth.
Bijsterveld, K. (2016), ‘Ethnography and Archival Research in Studying Cultures of Sound’, in
J. G. Papenburg and H. Schulze (eds), Sound as Popular Culture: A Research Companion,
99–109, Cambridge, MA: MIT Press.
Booth, G. D. (2008), Behind the Curtain: Making Music in Mumbai’s Film Studios, Oxford:
Oxford University Press.
Bourdieu, P. (1972), Esquisse d’une Théorie de la Pratique Précédé de Trois Etudes d’Ethnologie
Kabyle, Paris: Editions du Seuil.
Bryman, A. (2012), Social Research Methods, 4th edn, Oxford: Oxford University Press.
Bucchi, M. (2004), Science in Society: An Introduction to Social Studies of Science, London:
Routledge.
Burgess, R. J. (2014), The History of Music Production, New York: Oxford University Press.
Butler, J. (2012), ‘The Beach Boys’ Pet Sounds and the Musicology of Record Production’,
in S. Frith and S. Zagorski-Thomas (eds), The Art of Record Production: An Introductory
Reader for a New Academic Field, 223–233, Farnham: Ashgate.
Corbetta, P. (2003), Social Research: Theory, Methods and Techniques, London: Sage.
De Man, B., K. McNally and J. D. Reiss (2017), ‘Perceptual Evaluation and Analysis of
Reverberation in Multitrack Music Production’, Journal of the Audio Engineering Society, 65
(1/2): 108–116.
De Man, B. and J. D. Reiss (2013), ‘A Semantic Approach to Autonomous Mixing’, Journal on
the Art of Record Production, 8 (December). Available online: https://www.arpjournal.com/
asarpwp/a-semantic-approach-to-autonomous-mixing/ (accessed 30 July 2019).
Erlmann, V. (2010), Reason and Resonance: A History of Modern Aurality, New York: Zone Books.
Feld, S. (1982), Sound and Sentiment: Birds, Weeping, Poetics, and Song in Kaluli Expression,
Philadelphia: University of Pennsylvania Press.
Frith, S. (1986), ‘Art Versus Technology: The Strange Case of Popular Music’, Media, Culture &
Society, 8 (3): 263–279.
Higgins, L. (2012), Community Music: In Theory and in Practice, New York: Oxford University
Press.
Howes, D. (2004), ‘Sound Thinking’, in J. Drobnick (ed.), Aural Cultures, 240–251, Toronto:
YYZ Books.
Kaplan, A., Jr (1964), The Conduct of Inquiry: Methodology for Behavioral Science, San
Francisco, CA: Chandler.
Kealy, E. R. ([1979] 1990), ‘The Case of Sound Mixers and Popular Music’, in S. Frith and
A. Goodwin (eds), On Record: Rock, Pop, and the Written Word, 207–220, New York:
Pantheon Books.
Lampert, V. (2008), ‘Bartók and the Berlin School of Ethnomusicology’, Studia Musicologica,
49 (3/4): 383–405.
Lysloff, R. T. A. (2003), ‘Music in Softcity: An Internet Ethnography’, in R. T. A. Lysloff and
L. C. Gay, Jr. (eds), Music and Technoculture, 23–63, Middletown, CT: Wesleyan University
Press.
Ma, Z., B. De Man, P. D. L. Pestana, D. A. A. Black and J. D. Reiss (2015), ‘Intelligent Multitrack Dynamic Range Compression’, Journal of the Audio Engineering Society, 63 (6): 412–425.
Massey, H. (2000), Behind the Glass: Top Record Producers Tell How They Craft the Hits, San
Francisco, CA: Backbeat.
Meintjes, L. (2003), Sound of Africa!: Making Music Zulu in a South African Studio, Durham,
NC: Duke University Press.
Meynell, A. (2017), ‘How Recording Studios Used Technology to Invoke the Psychedelic
Experience: The Difference in Staging Techniques in British and American Recordings in
the Late 1960s’, PhD thesis, University of West London, London.
Moffat, D. and J. D. Reiss (2018), ‘Perceptual Evaluation of Synthesized Sound Effects’, ACM
Transactions on Applied Perception, 15 (2), article 13: 1–19.
Moore, A. F. (1992), ‘The Textures of Rock’, in R. Dalmonte and M. Baroni (eds), Secondo
Convegno Europeo di Analisi Musicale, 241–244, Trento: Università degli Studi di Trento.
Moore, A. (2012), ‘Beyond a Musicology of Production’, in S. Frith and S. Zagorski-Thomas
(eds), The Art of Record Production: An Introductory Reader for a New Academic Field,
99–111, Farnham: Ashgate.
Nardi, C. (2005), ‘Playing by Eye: Music-Making and Intersensoriality’, PhD thesis, University
of Trento, Trento.
Owsinski, B. (1999), The Mixing Engineer’s Handbook, Vallejo, CA: Mix Books.
Papenburg, J. G. and H. Schulze, eds (2016), Sound as Popular Culture: A Research
Companion, Cambridge, MA: MIT Press.
Payne, G. and J. Payne (2004), Key Concepts in Social Research, London: Sage.
Plunkett, J. (1998), ‘Reminiscences of the Founding and Development of the Society’, Journal
of the Audio Engineering Society, 46 (1/2): 5–6.
Porcello, T. (1991), ‘The Ethics of Digital-Audio Sampling: Engineer’s Discourse’, Popular
Music, 10 (1): 69–84.
Porcello, T. (2003), ‘Tails Out: Social Phenomenology and the Ethnographic Representation
of Technology in Music Making’, in R. T. A. Lysloff and L. C. Gay, Jr. (eds), Music and
Technoculture, 264–289, Middletown, CT: Wesleyan University Press.
Porcello, T. (2004), ‘Speaking of Sound: Language and the Professionalization of Sound-
Recording Engineers’, Social Studies of Science, 34 (5): 733–758.
Reiss, J. D. (2016), ‘A Meta-Analysis of High Resolution Audio Perceptual Evaluation’, Journal
of the Audio Engineering Society, 64 (6): 364–379.
Reiss, J. D. (2017), ‘An Intelligent Systems Approach to Mixing Multitrack Audio’, in
R. Hepworth-Sawyer and J. Hodgson (eds), Mixing Music, 226–244, New York: Routledge.
Ringer, F. (1997), Max Weber’s Methodology: The Unification of the Cultural and Social
Sciences, Cambridge, MA: Harvard University Press.
Sterne, J. (2003), The Audible Past, Durham, NC: Duke University Press.
Sterne, J. (2012a), MP3: The Meaning of a Format, Durham, NC: Duke University Press.
Sterne, J., ed. (2012b), The Sound Studies Reader, London: Routledge.
Sterne, J. and E. Razlogova (2019), ‘Machine Learning in Context, or Learning from LANDR:
Artificial Intelligence and the Platformization of Music Mastering’, Social Media + Society,
5 (1): 1–18.
Tagg, P. (1979), ‘Kojak – 50 Seconds of Television Music: Toward the Analysis of Affect in
Popular Music’, PhD thesis, Göteborgs Universitet, Göteborg.
Théberge, P. (1997), Any Sound You Can Imagine: Making Music/Consuming Technology,
Hanover, NH: Wesleyan University Press.
Waterhouse, L. (2006), ‘Multiple Intelligences, the Mozart Effect, and Emotional Intelligence:
A Critical Review’, Educational Psychologist, 41 (4): 207–225.
Williams, A. (2012), ‘Putting It On Display: The Impact of Visual Information on Control
Room Dynamics’, Journal on the Art of Record Production, 6 (June). Available online:
https://www.arpjournal.com/asarpwp/putting-it-on-display-the-impact-of-visual-
information-on-control-room-dynamics/ (accessed 30 July 2019).
Williams, S. (2012), ‘Tubby’s Dub Style: The Live Art of Record Production’, in S. Frith and
S. Zagorski-Thomas (eds), The Art of Record Production: An Introductory Reader for a New
Academic Field, 235–246, Farnham: Ashgate.
Zagorski-Thomas, S. (2014), The Musicology of Record Production, Cambridge: Cambridge
University Press.
Zagorski-Thomas, S. (2018), ‘Directions in Music by Miles Davis: Using the Ecological
Approach to Perception and Embodied Cognition to Analyze the Creative Use of
Recording Technology in Bitches Brew’, Technology and Culture, 59 (4): 850–874.

Discography
The Beatles (1966), [7” vinyl single] ‘Paperback Writer/Rain’, Parlophone.
The Byrds (1966), [7” vinyl EP] ‘Eight Miles High’, CBS.
Part II
Technology

Two key elements that we wanted to cover in the contributions to this part were the
history and social construction of technological development and the functionality of
the technology itself. The danger with historical discussion, of course, is that it has to be
selective and that, even in histories of topics without wars, the winners write history. In
histories of technology, this manifests itself in two types of ‘winner’ – the technologies that
survive and the technologies that some historian chooses to write about. In a presentation
to the 2009 Art of Record Production Conference in Cardiff, researcher and novelist Peter
Doyle made an impassioned plea for richer narratives of storytelling in histories of record
production. He pointed to the all too frequent characterization of creative artists as ‘the
goodies’ and the entrepreneurs and business people that they worked with as ‘the baddies’.
This kind of emotive characterization happens with stories of technology as well. Albin Zak’s
chapter covers the often-told stories of the move from mechanical to electrical recording
in the 1920s and the shift from analogue to digital representations of recorded sound in
the 1980s, but he also pulls on a less-frequently told thread: the shift from vacuum tube to
solid state electronics in the 1960s, which had a powerful impact on the sound of music.
And very often, it is the allure of the grand sweeping narrative that provides the ‘winner’ in
this way. It is also very easy to simplify a very complex and messy transition process, such
as analogue to digital, into a grand narrative of tape to digital audio workstation without
thinking about the ways in which the complexities of making records changed in those
gloriously creative last two decades of the twentieth century. Whenever we read histories,
we should be thinking about what has been left out and why the story is being told in the
way it is. That is just as true when telling the history of the British Empire without the
massacres, famines and concentration camps as it is when telling the history of recording
technology without the contribution of the German engineers developing tape technology
for the Nazi propaganda machine.
And just as important as this complex and detailed documentation of what happened,
when and where, are the stories of why and how. Although that has been a cornerstone
of historical studies about people and politics for as long as histories of them have been
written, histories of technology used to be much more positivist. But the deterministic
notion of technological development being a slow and inevitable process of progress
from worse to better was slowly overtaken by the idea that it was a much more chaotic
and subjective process. The social construction of technology became a much stronger
part of the narrative from the 1980s onwards through the influence of scholars from the
sociology of technology. And the idea of technology being socially constructed makes the
narrative much richer. How and when did the affordances of existing technologies collide
and interact with social ideas about what was valuable, desirable or considered part of the
norms of practice? We can see from examples of innovations which took years before they
found a market – such as Blumlein’s stereo patent or the Ampex 8-track tape machine –
that developments which look crucial and inevitable in retrospect might be considered
unnecessary or overcomplicated at the time.
And just as important in that social construction process is the interaction over time
between designers, manufacturers and users of the technology and the way that they can take
developments in new and unexpected directions. As Anthony Meynell points out through
some of the examples in his chapter, many of the technologies of production that start out
as the solution to specific problems end up being transformed into creative tools through
a process of negotiation. Once designers saw a technology such as equalization, which
started out as a way of compensating for the loss of high frequency in technical transfer
processes, being used to shape musical content, they started to redesign the interface and
expand the capabilities to match that function rather than the original one. And those
kinds of tacit negotiation have been happening throughout the history of production
technology, from EQ to record turntables and from compressors to autotune. But also,
in many instances, it was the users of the technology who discovered unplanned features
such as the overlapping ‘bump’ in the Pultec EQP-1A equalizer when you simultaneously
boost and cut (that both Albin Zak and Anthony Meynell mention) or the ‘all buttons in’
mode on the Urei 1176 compressor. And that links in with the problematic idea of there
being a linear chronological progression where new technologies replace old ones. In many
instances, new technologies in music production are simply seen as broadening the choice
of tools rather than being better and more advanced or convenient. The development
of VCA, Vari-Mu, FET and Opto dynamic compressors may have originally been based
around a competition to become the definitive compressor design, but engineers soon realized that they all had different characteristics that were useful in
different contexts. And these kinds of choices between similar technologies also extend to
the interface design as well as to the sound. That is because the workflow that a particular
interface encourages or even allows/prevents can be just as crucial in determining the
sound that emerges at the end of the process.
4
From Tubes to Transistors: Developments in Recording Technology up to 1970
Albin Zak III

Popular music sales charts in late 1940s America were peppered with records of novelty
songs. Long a Tin Pan Alley staple, the genre had a robust repertory of musical tomfoolery
that included such ditties as ‘K-K-K-Katy’ (Geoffrey O’Hara 1918), ‘Yes! We Have No
Bananas’ (Frank Silver and Irving Cohn 1923) and ‘Oh By Jingo!’ (Albert Von Tilzer
and Lew Brown 1919). But there was something different about the post-war novelty
craze. Such songs and their recordings had relied on gimmicks of lyric writing, musical
arrangement or performance to produce their comic effect. Electronic gimmickry was
rarely apparent in the songs’ recorded rendition. The very popular Spike Jones and his City
Slickers perpetrated their hilarious ‘musical depreciation’ on such tracks as ‘Liebestraum’
or ‘The Blue Danube’ without resorting to electronics. The microphone and recording
device seemed little more than passive witnesses to the buffoonish confluence of musical
virtuosity and lowbrow humour enjoyed by audiences at the group’s live shows.
By contrast, the Harmonicats 1947 recording of ‘Peg O’ My Heart’ – in an arrangement
for three harmonicas, bass and electric guitar – presented listeners with an electronically
produced aural scenario unlike any they had encountered, or might encounter, in the
natural acoustic world. In the course of the recording two distinct reverberant images were
juxtaposed, as if the musicians had moved between two dramatically different architectural
spaces in a fraction of a second. This defiance of natural law – perpetrated by the recording
engineer, Bill Putnam, who directed the sound at the speed of electric current to and from
an ambient space the band never visited – displayed a kind of gimmickry neither song
nor performance alone could convey. Not simply a novelty song, ‘Peg O’ My Heart’ was a
novelty record. It gave audiences a glimpse of a fantastic realm where musical sound and
electricity interacted to create a uniquely peculiar listening experience. The public’s opinion
of such contrivance was immediately obvious. The record, released on the small Vitacoustic
label, reached number one on the Billboard charts and surpassed a million units in sales.
The proliferation of novelty records in the late 1940s and early 1950s reflected changing
attitudes towards recorded music. While some recordists continued the longstanding
pursuit of audio ‘fidelity’ – famously exemplified by Robert Fine’s ‘Living Presence’ Mercury
recordings of symphonic repertory employing a single carefully placed microphone –
an antithetical aesthetic trend saw others producing records conceived and executed as
one-off concoctions, whose creative artifice relied on the recording studio and its tools.1
Rather than faithful snapshots of what producer John Hammond called ‘natural sound’,
free of ‘phony effects’ and ‘electronic fakery’, such records were unique works of aural fancy
whose musical and verbal meanings were inseparable from their electronic expression
(Hammond quoted in ‘Natural Sound’ 1954: 17). This notion was argued in perhaps plainer
English in an advertisement for the 1947 novelty hit ‘Near You’ (Bullet), by Francis Craig
and his orchestra. ‘Don’t Settle for a Substitute’, the ad exhorted Billboard readers. ‘GET
THE ORIGINAL.’ In case anyone wondered why the exclusivity, given that cover records
of popular tunes were commonplace and for many record buyers interchangeable, the ad
explained that ‘The Tune Was Made By the Way It Was Played’ (‘The Tune Was Made’
1947: 28). It was an extraordinary claim and an implicit sign that the previous ontological
authority of written music was giving way to a new sort of text.
Though perhaps lowly in their origins and aims, novelty records asserted a tacit
proposition that would change record making forever. Namely, that a record – not merely
recorded but created in the studio – might be an original piece of music, a musical work.
If that were the case, then everything that went into making a record, including the audio
devices that captured and shaped the sound of musicians’ performances, was germane
to the creative process. As electronic reproduction made up-close listening increasingly
normative, what it meant to shape musical sound – how and to what end – was an evolving
topic of interest among sound engineers, musicians, critics and the listening public. In
the rise of novelty recording we see the beginnings of a crystallization of an aesthetic-
technological fusion that would determine record production’s future course.

Engineering musical expression


The evolution of sound recording and reproduction technology involved
a marriage of science, art and commerce whereby practical innovation led to new forms
of aesthetic agency, which were, in turn, tested in the commercial marketplace.2 Electrical
recording, for example, improved on acoustic recording in ways engineers could measure
and describe in scientific terms (e.g. frequency response, signal-to-noise ratio). Yet the
new sound capturing tool they used – the microphone – made unforeseen demands that
science alone could not meet. As audio historian David Morton has observed, electrical
engineers were ‘immediately thrust into the unfamiliar, unscientific realm of aesthetics’
(2000: 26). Where should the mic be placed? How many mics should be used? With the
opportunity to use multiple mics, how should their various ear-points be mixed? How can
the right textural balance among instruments be created? How much ambient room sound
should be captured? Engineers accustomed to solving quantitative technical problems were now drawn into a far more subjective enterprise.
For performers, the pragmatic benefits of electric recording could likewise be described
in fairly objective terms. The increased sensitivity of microphones captured fuller frequency
and dynamic ranges, making for a more accurate representation of the source. And the
ability to control volume with vacuum tube (or valve) amplifiers helped to solve balance
problems that had bedevilled ensembles making records acoustically, forcing musicians to
situate themselves in unfamiliar ways around recording horns according to the loudness
of their instrument. Louis Armstrong’s powerful tone, for instance, had forced him
to stand further from the horn than his band mates in King Oliver’s Creole Jazz Band
during their April 1923 session for Gennett Records (Kennedy and McNutt 1999: 2). The
electrical apparatus might have overcome the problem and allowed Armstrong to return
to his normal place in the group. Yet this logistical advantage, too, came with aesthetic
consequences, for it also allowed quieter instruments to hold their own in any ensemble
according to the engineers’ sensibilities regarding balance and texture.
As the electric process removed sound from its natural vibratory state, new properties
and possibilities came into play. For telephony, where microphone technology got its
start, electricity allowed sound to be sent to a remote destination that it could not reach
on its own. Likewise for radio broadcasting. When microphones began to be used for
sound recording and film the idea was not to send the sound far away in real time
but to capture it and preserve it for later dissemination and reproduction. The same
transformation of acoustic energy into electricity that allowed sound transmission over
telephone wires or airwaves also made it mutable – controllable in ways specific to the
electronic realm. For music recording this meant that such features as loudness and
timbre, once the unique province of musicians, became subject to electronic manipulation.
Recording tools, in turn, were assessed not only in terms of technical efficiency but also
for their perceived musicality.
In the earliest years, the mere possibility of capturing sound evoked wonder. Reporting
on an early version of Edison’s phonograph, a writer for Scientific American wrote, ‘No
matter how familiar a person may be with […] the principle underlying this strange device
[…] it is impossible to listen to the mechanical speech without his experiencing the idea that
his senses are deceiving him’ (‘The Talking Phonograph’ 1877: 385). But the reproduction
was crude, with many words unintelligible. Moving forward, everyone involved with
sound capture, transmission and reproduction aimed to improve the recorded image –
that is, to refine its likeness to its source. Edison aimed to ‘record automatically the speech
of a very rapid speaker […] preserving the characteristics of the speaking voice so that
persons familiar with it would at once recognize it’ (Edison quoted in ‘Men Behind the
Microphones’ 1952: 56). The literature tracing commercial sound recording presents a story
of progressive evolution towards this goal of challenges met and obstacles overcome (e.g.
Read and Welch 1976; Gelatt 1977; Morton 2000; Millard 2005). As the project advanced,
a consensual attitude favouring transparent representation steered developments towards
a ‘realistic’ rendition of source sounds, with the recording apparatus cast as an objective
aural observer.
Yet within the studio walls the project’s aesthetic dimension was always apparent. As
Susan Schmidt Horning reminds us, ‘ever since the earliest recordings, capturing a live
performance on cylinder or disk required subtle manipulation of the recording equipment’.
In fact, she rightly asserts, ‘there has never been a recording that did not require some
intervention’ (Schmidt Horning 2004: 704). Long before tape editing and overdubbing
upended traditional notions of musical performance, instrumental textures were shaped
by engineers’ mic techniques and the electronic balancing they performed at their mixing
consoles. Leopold Stokowski realized, in 1929, that the engineer for one of his radio
broadcasts exerted significant control over his orchestra’s sound. ‘He’s the conductor and
I’m not,’ Stokowski exclaimed, wary of surrendering his maestro’s supreme control to an
electronic collaboration (Daniel 1982: 306).
The technological revolution that permanently reshaped the experience of hearing,
habits of listening and the broad soundscape of the electrified world moved quickly.
Historian Emily Thompson illustrates the point in her monograph, The Soundscape of
Modernity, framing the story with the openings of two famous musical buildings separated
by only three decades. In October 1900, Boston’s Symphony Hall welcomed audiences to
concerts of classical music presented in a pristine acoustic environment. At the end of 1932,
Radio City Music Hall, ‘a celebration of the sound of modernity, its gilded proscenium
crowned […] with state-of-the-art loudspeakers that broadcast music of the day’, opened
its doors on Sixth Avenue in Manhattan (Thompson 2004: 4). The two venues were devoted
to fundamentally different ways of producing and experiencing musical sound. Symphony
Hall evoked a pre-electric aural culture, its clear, detailed sound a marvel of acoustic design.
By contrast, Radio City, whose sound also elicited critical praise, pulsed with electricity.
The ‘music of the day’, which included popular classics, was amplified without regard to
musical idiom or tradition. The hall ‘was wired for sound and no one seemed to mind’, for
by the time of its opening, through the electrical mediation of radio, sound film, records
and public address systems, ‘both the nature of sound and the culture of listening were
unlike anything that had come before’ (231, 2).

From sound to signal


The interface of acoustics and electronics begins with a microphone, the portal between
acoustic and electronic realms. Microphones are transducers, which means they convert
acoustic energy into electrical energy. There are many types of mics (e.g. carbon, condenser,
moving coil, ribbon, piezoelectric), but they all operate according to a common principle.
A diaphragm of some sort vibrates sympathetically with sound waves striking it, like an
eardrum. The microphone ‘hears’ sound and converts it into an audio signal, a series of
electrical impulses that mirror the sound vibrations. Problems of microphone design and
qualitative differences among mics come down to what they hear (directionality), how
accurately they hear it (dynamic and frequency response) and the ways in which they
themselves affect the sound (added noise, presence bump).
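By way of illustration, the directionality described here can be reduced to the textbook first-order polar response of the form a + b·cos θ. The minimal Python sketch below makes that concrete; the pattern coefficients and the 20 mV/Pa sensitivity figure are illustrative assumptions rather than measurements of any particular microphone, and it deliberately ignores frequency response, proximity effect and self-noise.

```python
import numpy as np

# Textbook first-order polar patterns of the form a + b*cos(theta), with a + b = 1.
# These coefficients are illustrative, not measurements of any specific microphone.
PATTERNS = {"omni": (1.0, 0.0), "cardioid": (0.5, 0.5), "figure-8": (0.0, 1.0)}

def mic_voltage(pressure_pa, angle_deg, pattern="cardioid", sensitivity_mv_per_pa=20.0):
    """Reduce transduction to a single sensitivity constant (mV per pascal) and
    directionality to a first-order polar response."""
    a, b = PATTERNS[pattern]
    directivity = a + b * np.cos(np.radians(angle_deg))
    return pressure_pa * directivity * sensitivity_mv_per_pa

# A 1 Pa (94 dB SPL) source on-axis versus directly behind a cardioid capsule:
print(mic_voltage(1.0, 0))    # 20.0 mV: full sensitivity
print(mic_voltage(1.0, 180))  # ~0.0 mV: rejected from the rear
```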
In 1931, RCA Photophone engineers introduced a microphone design that Radio News
reported as a ‘new and radical development’. The industry standard condenser and carbon
mics were plagued by a persistent problem: they were ‘sensitive in an almost equal degree
to sounds coming from any direction’. That is, for ‘certain types of sound transmission
and recording work […] they [were] not sufficiently directional’. The RCA engineers had
‘developed an extremely sensitive directional microphone, which they call the ribbon
microphone’ (Felstead 1931: 1067). Solving the directionality problem, the ribbon design
became widely popular in broadcasting, film and record production, most notably the RCA
44 series bidirectional and the RCA 77 series unidirectional models. In British studios the
BBC-Marconi Type A ribbon mic was the standard.
Yet in addition to the technical achievement of selective hearing, the ribbon mic, like
any mic, also imparted a distinctive flavour through its electronic characteristics. It had a
timbral personality. Among its idiosyncrasies was the mic’s emphasis on low frequencies
when the sound source was close to the diaphragm. This so-called proximity effect produced
a rich, warm sound for close-up voices, a boon to the popular new singing style known as
crooning. Such singers as Rudy Vallee, Russ Columbo and especially Bing Crosby put the
effect to expressive use. As Crosby biographer Gary Giddins has noted, ‘Bing collaborated
with electric current.’ He ‘played the mike with virtuosity […] reinventing popular music
as a personal and consequently erotic medium’ (Giddins 2002: 228). In her chronicle of
crooning, Real Men Don’t Sing, Allison McCracken notes that the ribbon mic ‘gave an
even stronger impression of vocal intimacy than earlier microphones had’ (2015: 280).
According to the film sound engineer and microphone historian, Jim Webb, a consensus
among engineers holds that Crosby’s artful use of the ribbon’s proximity effect ‘was an
essential aspect of his rise to fame’ (McCracken 2015: 368).
For singers, removing the requirement of a big voice opened the door to a broader range
of expressive possibility. As McCracken has observed, the effects on popular culture were
unexpectedly far-reaching. ‘No one anticipated the transformative effects of microphone
technology, which put soft-voiced singers on equal footing with classically trained singers
and Broadway belters, nor the profound social impact of combining these amplified voices
with radio’s expansive reach’ (3). By the 1950s amplified singing had ‘hardened into a
tradition’, as the critic Howard Taubman put it. Whatever the style or genre, all singers of
popular song were now ‘the spawn of the microphone age’ (Taubman 1954: SM26).
Dynamic microphones – ribbons and, for live performances, the more rugged moving
coil designs such as the Shure 55 – remained the industry standard through the war years.
But in 1948 Georg Neumann, who had been designing mics since the 1920s, brought
out a new condenser mic, the U47 (distributed in the United States by Telefunken), with
selectable polar patterns (cardioid and omni). The mic’s tonal qualities and sensitivity were
widely admired. A 1952 Saturday Review feature noted that it had ‘been received very
favorably by many recording engineers, and it is included in the batteries of most of the
recording studios, both major and minor’ (Villchur 1952: 69). ‘When the U47 came along, it
was like a bomb went off’, recalled Webb. ‘Such remarkable results had never been achieved
before’ (quoted in Granata 1999: 23). The U47 was the mic Robert Fine used for his one-
mic orchestral recordings. It became Frank Sinatra’s favourite ‘instrument’, as he called it,
replacing the venerable RCA 44 (Granata 1999: 23). Renowned jazz recording engineer
Rudy van Gelder acquired one of the first U47s sold in America; he was impressed with its
‘extreme sensitivity and warmth’ (Myers 2012).
If singers treated microphones like instruments, engineers in the 1950s, working with
an increasing selection of types and models, began to use them ‘like a painter mixing
colors on a palette’ (Schmidt Horning 2013: 115). Matching mics to sound sources and
placing them appropriately to achieve a desired outcome was an essential skill. Early in
his career Al Schmitt worked at Apex Recording in New York, which owned but one
equalizer (not uncommon at the time). Since equalization (EQ) was available only on the
mix bus, he was forced to rely on mic technique to shape individual sounds. ‘If I wanted a
brighter vocal sound,’ he recalled, ‘I used a brighter mike. If I wanted a warmer bass sound,
a warmer piano sound, I used maybe a ribbon mike’ (Schmidt Horning 2013: 115). Van
Gelder noted another aspect of microphone technique. ‘I used specific microphones’, he
said, ‘located in places that allowed the musicians to sound as though they were playing
from different locations in the room’ (Myers 2012). Van Gelder stressed that a grasp of
all aspects of a microphone’s characteristic behaviours allows an engineer to use it in
ways that ‘fit the music, with the same care a photographer employs in selecting a lens’
(Robertson 1957: 56).

Controlling the signal


Once the microphone has done its work, the resulting audio signals are controlled by an
array of equipment – amplifiers, mixing consoles (or desks), equalizers, compressors –
whose electronic roles in shaping tone colour and dynamics mirror in some ways the work
of performing musicians. As with microphones, these tools were developed initially for
purposes other than sound recording. In fact, EQ – the process of reconfiguring a signal’s
constituent frequencies – found its first use in telegraphy (to transmit multiple messages at
different frequencies over a single telegraph wire) before being used to address the problem
of frequency loss in telephone wires. Compression grew out of a need to keep radio
broadcast signals within a delimited dynamic range. But whatever the pragmatic genesis
of these devices, they would eventually take their places as tools of art in the recordist’s
palette, valued as much for their sonic effect as their functional efficiency.

Console
The nerve centre of any recording project is the mixing console (even if represented virtually
within a digital audio workstation, DAW). The console is where we see the clearest visual
evidence of the atomized nature of record production, with each of a track’s component
sounds represented by its designated channel. It was the console that represented to
Stokowski a rival ‘conductor’, an agent charged with blending multiple sound sources
into a coherent whole. The practical necessity of consoles arose with the use of multiple
microphones, providing engineers with a centralized control surface from which to mix,
or balance, the incoming signals. For decades, consoles were simply banks of amplifiers
dedicated to controlling loudness. Yet even early on they held the promise of further
evolution in the form of the signal routing principle. If at first the routing served merely to
combine multiple signals at a mono summing point, ever more complex schemes emerged
over time. Engineers would find reasons to combine signals into auxiliary subgroups, or to
send them beyond the console for outboard processing (to an echo chamber, for instance,
or an equalizer) and then return them to the mix. With the advent of stereo, signals were
routed to various points along the stereo horizon. Each development drew engineers and
the machines they commanded deeper into a record’s formal organization.
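The routing principle lends itself to a simple worked illustration. The Python sketch below, a minimal assumption-laden toy rather than a model of any real console, sums a set of channel signals to a mono bus and sends a copy of each to an auxiliary bus whose processed return is folded back into the mix, much as an engineer might feed an echo chamber; all signals, fader levels and send amounts are invented for the example.

```python
import numpy as np

def mix_to_mono(channels, faders, aux_sends=None, aux_return=None):
    """Sum several channel signals to a single bus, as early consoles did with
    multiple microphone feeds. aux_sends routes a copy of each channel to an
    auxiliary bus; aux_return stands in for the 'outboard' process (an echo
    chamber, say) whose output is added back into the mix."""
    bus = sum(gain * ch for gain, ch in zip(faders, channels))
    if aux_sends is not None and aux_return is not None:
        aux_bus = sum(send * ch for send, ch in zip(aux_sends, channels))
        bus = bus + aux_return(aux_bus)
    return bus

# Toy usage: two 'instruments', one of them sent to a crude delayed-copy echo.
sr = 44100
t = np.arange(sr) / sr
vocal = np.sin(2 * np.pi * 220 * t)
guitar = 0.5 * np.sin(2 * np.pi * 330 * t)
echo = lambda x: 0.3 * np.concatenate([np.zeros(2000), x[:-2000]])
mono_mix = mix_to_mono([vocal, guitar], faders=[0.8, 0.6],
                       aux_sends=[0.5, 0.0], aux_return=echo)
```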
Consoles would eventually include EQ, which along with its mic preamps, signal path and
architecture contributed to each model’s distinct character. Providing EQ at the engineer’s
fingertips further integrated studio tasks in a single control surface where amplitude balance
and tone colour could be shaped interactively. (This trend would continue. By 1979, the
innovative SL 4000 E Series, manufactured by Solid State Logic, included a compressor
and gate on each channel strip.) Although a few consoles were commercially available in
the 1950s (e.g. RCA, Langevin, Altec), studios large and small commonly assembled or
commissioned their own custom units. Although this practice continued into the 1960s,
demand grew for ready-made products. By 1970 such firms as Trident, Helios and API
were producing stock consoles that represented the wave of the future. It was only a short
time before such plug-and-play consoles were standard throughout the industry.

Equalizer
In Robert Runstein’s distilled definition, ‘equalization (EQ) refers to altering the frequency
response of an amplifier so that certain frequencies are more or less pronounced than
others’ (Runstein 1974: 112). In other words, EQ provides an ability to electronically
reshape timbre. For decades, equalizers were blunt devices best suited in the audio sphere
to solving problems in producing sound films, disc mastering and long-distance radio
transmission over telephone lines, with the common purpose of preserving fidelity to
the sound source and suppressing extraneous noise. With only a limited tone-shaping
capacity, they were used in recording studios largely as what engineer Walter Sear called
‘“corrective” devices [because] they were used to correct mistakes you made when you
recorded’ (Schmidt Horning 2013: 113). For tone colour, engineers were better served by making appropriate microphone choices and placements.
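Runstein’s definition can be made concrete with a short numerical sketch. The Python fragment below implements one band of a generic ‘peaking’ equaliser using the widely circulated Audio EQ Cookbook biquad formulas; it is not a model of any of the hardware units discussed in this chapter, and the 3 kHz centre frequency and 4 dB boost are purely illustrative choices.

```python
import numpy as np
from scipy.signal import lfilter

def peaking_eq(x, fs, f0, gain_db, q=1.0):
    """Boost or cut a band centred on f0 (Hz) by gain_db, using the standard
    'Audio EQ Cookbook' peaking biquad: a generic stand-in for one band of an
    equaliser, not a model of any specific unit."""
    a_lin = 10 ** (gain_db / 40.0)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * a_lin, -2 * np.cos(w0), 1 - alpha * a_lin])
    a = np.array([1 + alpha / a_lin, -2 * np.cos(w0), 1 - alpha / a_lin])
    return lfilter(b / a[0], a / a[0], x)

# Illustration: 'brighten' a second of noise by lifting 3 kHz by 4 dB.
fs = 44100
noise = np.random.default_rng(0).standard_normal(fs)
brighter = peaking_eq(noise, fs, f0=3000, gain_db=4.0)
```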
But equalizers came into wider and more diverse use in recording studios in the 1950s
with outboard units designed specifically for record production (recording, mixing, disc
cutting), such as the RS56 Universal Tone Control, designed by EMI technical engineer
Mike Bachelor for use at EMI studios, and the Pultec EQP-1, designed and built – by hand,
one at a time – by Oliver Summerland and Eugene Shank, and marketed out of their Pulse
Techniques storefront in Teaneck, New Jersey. These units provided a new level of timbral
control and relatively seamless operation. For example, though the Pultec’s ‘passive’ design
was based on that of the Western Electric equalizer, it included an output tube that amplified
the attenuated signal – reduced at output by 16 dB in the Western Electric unit – back up
to its input level (Reilly 2017). So, in addition to the increased potential for manipulating a
sound’s frequency content, the Pultec offered what we would now call improved workflow.
In the words of engineer Donald J. Plunkett, the Pultec’s widespread adoption ‘facilitated
the transition from a purist to manipulative recording technique’ (1988: 1061).

Compressor
In simple terms, a compressor (like its relatives, the limiter and the levelling amplifier) controls loudness.
Initially the device brought relief to vigilant radio ‘control operators’ who manually
governed the broadcast signal for consistent modulation and maximal signal-to-noise
ratio, keeping ‘a happy mean which is at the same time audible and safe’ (Dreher 1923: 40).
Instead of keeping a constant hand on the amplifier knob, engineers could simply monitor
the signal while leaving the moment-to-moment control to the compressor. The device’s
lowly origins as radio’s ‘automatic gain control’, however, gave little indication of the central
place it would take among recording engineers’ tools for sculpting and colouring the
recorded image. Current applications include the use of multiple compressors, in series
or in parallel, on a single sound source – often in combinations of varying design and
manufacturer – and multiband compression, which dials in compression at designated
frequencies. Such elaborate manipulations arise from the compressor’s ability not only
to limit dynamic range but also to affect attack and release characteristics of programme
material above a set threshold, in effect altering a sound’s natural envelope. Further, the
levelling effect they produce on amplitude (determined by an attenuation ratio) can also
change the relationship between a sound’s loudest and softest components, heightening the
sound’s low-level details while controlling its amplitude peaks. In the hands of experienced
engineers, the result is a sense of enhanced ‘presence’ and perceived loudness.
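The behaviour described here, a threshold and ratio acting on level with attack and release times shaping the gain over time, can be sketched in a few lines of Python. The example below is a generic feed-forward design with invented settings; it reproduces none of the tube, optical or FET colorations of the units discussed next, and is offered only to make the arithmetic of gain reduction visible (for instance, 12 dB above threshold at a 4:1 ratio leaves 3 dB above, a 9 dB cut).

```python
import numpy as np

def compress(x, fs, threshold_db=-20.0, ratio=4.0, attack_ms=10.0, release_ms=100.0):
    """Generic feed-forward compressor: level above threshold_db is reduced
    according to ratio, and the resulting gain is smoothed with separate attack
    and release time constants. Illustrative only."""
    level_db = 20 * np.log10(np.abs(x) + 1e-9)
    overshoot = np.maximum(level_db - threshold_db, 0.0)
    target_gain_db = -overshoot * (1 - 1 / ratio)      # 12 dB over at 4:1 -> 9 dB cut
    coef_att = np.exp(-1.0 / (fs * attack_ms / 1000))  # one-pole smoothing coefficients
    coef_rel = np.exp(-1.0 / (fs * release_ms / 1000))
    gain_db = np.zeros_like(target_gain_db)
    g = 0.0
    for n, target in enumerate(target_gain_db):
        coef = coef_att if target < g else coef_rel    # attack while gain is falling
        g = coef * g + (1 - coef) * target
        gain_db[n] = g
    return x * 10 ** (gain_db / 20)

# Example: a loud burst is tamed while its quiet tail passes untouched.
fs = 44100
burst = np.concatenate([np.ones(2000), 0.05 * np.ones(8000)])
squeezed = compress(burst, fs)
```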
As with any other audio device, though designed to operate as transparently as possible,
compressors carry unique sonic signatures. Beyond their architectural fundamentals, the
key variable is the form of gain reduction employed. The three most widely used compressors
of the post-war era are still prized today, each using a different method, which accounts
for non-linear colouring effects specific to each unit. The Fairchild 660/670, designed by
Rein Narma in the 1950s, uses a series of tubes in a so-called variable-mu design whereby
the compression ratio varies automatically according to input gain. The Teletronix LA-2A,
designed by James Lawrence in the early 1960s, uses an electroluminescent panel and a
photoresistor in an ‘optical’ process whose effect is entirely programme dependent. Finally,
Bill Putnam’s Urei 1176, which came to market in 1966, is a solid-state design employing
the field effect transistor (FET). It offers variable attack times (as short as 20 microseconds)
and selectable ratios of 4:1, 8:1, 12:1 and 20:1 (Moore, Till and Wakefield 2016). With its
expanded set of controls, the 1176 invited engineers to experiment. Over the years, sheer
curiosity had led to many innovations in the studio. With the Pultec EQ, engineers had
found that the unlikely practice of boosting and attenuating the same frequency range at
the same time produced an appealing timbral effect. One popular technique discovered on
the 1176 involved pushing in all the ratio buttons at once, happily distorting the programme
material in a way the designers had likely not envisioned (Moore 2012).

Capturing the signal


Tape recording came on to the American studio scene in the midst of the novelty recording
boom and quickly replaced disc recording both for its practical advantages and its creative
possibilities. On the one hand, tape was a great leap forward in sonic transparency and
production efficiency. On the other, it invited novel practices that such enterprising
producers as Mitch Miller and Les Paul exploited with gusto. Tape machines provided
instant gimmick gratification with their ability to change speed – thus altering pitch, tempo
and timbre – and to produce echo. But it was tape’s overdubbing potential that offered the
greatest possibility for compositional artifice. Layering performances was not new. But the
earlier disc method was ponderous and, due largely to surface noise and high-frequency
loss, allowed for only a limited number of overdubs. Tape made the process far more viable.
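The two tape ‘gimmicks’ mentioned above, speed change and echo, are easy to illustrate. The Python sketch below treats varispeed as simple resampling and echo as a delayed, fed-back copy of the signal; the doubling of speed and the delay times are arbitrary choices for the example, and the digital approximation stands in, very loosely, for the continuous tape medium.

```python
import numpy as np

def varispeed(x, speed):
    """Play a recording back at a different 'tape speed' by resampling:
    speed=2.0 halves the duration and raises pitch by an octave, speed=0.5
    does the opposite. Linear interpolation keeps the sketch self-contained."""
    positions = np.arange(int(len(x) / speed)) * speed
    return np.interp(positions, np.arange(len(x)), x)

def tape_echo(x, fs, delay_s=0.3, feedback=0.4):
    """A crude tape-style echo: a delayed copy of the signal fed back on itself."""
    d = int(delay_s * fs)
    y = x.astype(float).copy()
    for n in range(d, len(y)):
        y[n] += feedback * y[n - d]
    return y

fs = 44100
tone = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
octave_up = varispeed(tone, 2.0)            # 440 Hz becomes 880 Hz at half the length
slapback = tape_echo(tone, fs, delay_s=0.15)
```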
In 1944, Jack Mullin, an American GI assigned to the Army’s Signal Corps and stationed
in Britain, noticed that German radio broadcasts continued after the BBC signed off at
midnight. ‘They broadcast symphony concerts in the middle of the night.’ He was sure
the stations were employing ‘canned’ music, a practice frowned on back in the United
States for its lacklustre audio quality. But Mullin was surprised to hear that the ‘frequency
response was comparable to that of a live broadcast’. At the war’s end, he stumbled on the
source of the wee-hours music in a radio station outside Frankfurt. A magnetic recorder
using tape (rather than wire) had been introduced to the German market in 1935 by the
AEG electronics firm as the ‘magnetophone’. Through a decade of refinement, and as it
came into wide use in Germany, the device remained largely unknown to the world beyond
the Third Reich. Mullin ‘flipped’ when he heard the thing play. He ‘couldn’t tell from the
sound whether it was live or playback’ (Mullin 1976: 62–63).
Mullin shipped two magnetophones back to the United States in pieces, reassembled
them and began to give public demonstrations. The device practically sold itself. Its
advantages over disc recording were immediately apparent. In addition to its superior
frequency response, tape added less extraneous noise, offered simpler and more versatile
editing capacity, and a longer recording duration. All of this attracted Bing Crosby, then
keen to begin recording his weekly radio show for later broadcast. Aside from his pragmatic
interest, however, which included investing in the Ampex corporation, one of the first to
bring the tape recorder and recording tape to the American market, Crosby also made an
indirect creative contribution when he supplied an Ampex machine to his guitar player,
Les Paul.
Crosby knew that Paul was fascinated with recording. Paul had taken the one-man-band
concept to unprecedented heights with his multilayered disc recordings – for example,
‘Lover’, ‘Caravan’ and ‘Nola’. He appreciated the tape recorder’s sonic superiority to disc
as well as its more flexible editing capability. But he also saw in the machine an untapped
possibility. By adding a second playback head, Paul reasoned, a single machine might
replace two disc recorders by layering sounds in repeated recordings on the same piece
of tape. He made the modification, it worked, and he and his overdubbing partner Mary
Ford used the gimmick he called ‘sound-on-sound’ to produce a string of multilayered hit
records in the early 1950s (Paul and Cochran 2005: 204).
Sound-on-sound would be replaced by Ampex’s ‘sel-sync’ method, devised by Ross
Snyder and Mort Fuji, whereby parallel bands separated a piece of tape into discrete tracks.
Paul’s laborious process of layering performances (which meant a mistake on a later take
required starting over) was replaced with the ability to park them side by side. A poor
performance on one track could be replaced without affecting those on other tracks.
But though this innovation would significantly advance the practice of overdubbing, its
immediate significance was unclear to Snyder, who recounts that he came up with the
idea as an aid specifically for Paul’s multitrack project, thinking there ‘might be three or
four musicians who’d benefit by it, not more’ (Sanner 2000). Paul took delivery of the first
Ampex 8-track machine, which he dubbed the ‘octopus’, in 1957, but as Snyder suspected
the machine found few other takers (Tom Dowd at Atlantic Records was one). With Paul’s
hit-making days behind him, his brand of compositional recording appeared to have
waned as well. It turned out, however, to be only a temporary divergence of technological
affordance and artistic aims. Within a decade, multitrack recording would become the
normative production practice everywhere, and several manufacturers competed to offer
recordists increasing numbers of tape tracks.

The 1960s
By the 1960s, a distinctive aesthetic ethos pervaded the art of record production. If there
were still some ‘natural sound’ holdouts, widespread evidence suggested that attitudes had
changed. Even in classical music recording, the touchstone for high fidelity, many records
expressed an idiosyncratic electronic identity. The critic R. D. Darrell in 1954 had bemoaned
the ‘No Man’s Land where the boundaries of engineering, esthetics, and commercialism
overlap in the contemporary world of sound recording and reproduction’, resulting in ‘the
too-frequent perversion of great music […] when virtuoso interpreters and engineers run
wild in the quest of the distinctively “different” and “sensational” presentations of familiar
materials’ (1954: 56). Yet just two years later another cautious critic, Edward Tantall Canby,
who had previously fretted about the potential ‘danger’ and ‘slightly appalling’ prospects of
edited performances, proclaimed that ‘a new type of sound’ was ‘here to stay. Taste […] is
ever moving towards the acceptance of effects that are unique to recording itself’ (1950: 71;
1956: 60, emphasis in original). One of the idiom’s most electrifying young pianists, Glenn
Gould, acknowledged that: ‘Of all the techniques peculiar to the studio recording, none
has been the subject of such controversy as the tape splice.’ Yet he himself was a fan of the
practice, often employing creative tape editing to ‘transcend’, as he put it, ‘the limitations
that performance imposes upon the imagination’ (Gould 1966: 51, 53). And the producer
John Culshaw, in his landmark production of Wagner’s Der Ring des Nibelungen (1958–65),
employed active stereo panning to ‘bring the home listener into closer contact with the
drama’ by creating a virtual ‘theater of the mind’ (Culshaw 1967: 10, 5).
Pop music had presented to the public more than a decade of highly processed – and, in
the case of rock ‘n’ roll, often shockingly lo-fi – sound, which had produced a sonic culture
where audiences expected records to offer peculiar sonic thrills. Throughout the 1960s
rock music recording projects became ever more ambitious amid a general awareness that
recording studios were akin to a painter’s atelier and audio devices were intrinsic to musical
craft. Exemplars from the decade’s early years include such stylistically divergent producers
as Phil Spector, Joe Meek, Brian Wilson and Frank Zappa. In an era and studio market
(Los Angeles) where three three-hour sessions were the rule, Spector and Wilson racked
up big bills with long studio hours spent not merely recording performances but pursuing
elusive aural visions (Ribowsky 1989: 102; Badman 2004: 148). Meek literally brought the
studio home, using his London flat as a production space (Irwin 2007). Zappa took the
unusual step (for a musician of the time) of acquiring a studio of his own in 1964 (Miles
2004: 80–81). The decade’s most successful recording group, The Beatles, had no real need
of a personal studio. Their commercial success allowed them the run of EMI’s Abbey Road
studios, where they spent the bulk of their creative time.
With the guidance of producer George Martin and Abbey Road’s expert staff of recording
engineers and technicians The Beatles showcased ‘manipulative’ recording practices
from the pinnacle of the show business world. Beatles records presented constructed
sonic images fashioned in a studio where signal processing, added reverb (which by
now included synthetic plate and spring reverb as well as chambers), overdubbing and
an increasing number of electronic timbral effects were standard fare. Microphones were
chosen for their particular characteristics. Among the studio’s Neumann mics, for instance,
the U47 or U48 handled vocals; the KM53 captured the sound of the echo chamber; the
U67 was favoured for electric and acoustic guitar (Ryan and Kehew 2006: 166–174). The
mixing consoles (EMI’s REDD design) provided not only level controls but also EQ on
each channel. Compressors were everyday staples, used both for recording and mixing
(including submixing). Various models were assigned to specific tasks according to their
perceived musical fit. Geoff Emerick used an RS124 (an Altec compressor modified by
EMI technicians) on Paul McCartney’s bass to achieve what the engineer thought at the
time to be ‘the ultimate bass sound’. For vocals he ‘fell in love with’ the Fairchild 660 for the
‘presence’ it added (Ryan and Kehew 2006: 134, 142).
The pioneering efforts and experimental attitudes of 1960s recordists were spurred by
technological developments that to the Fairchild Corporation’s George Alexandrovich
seemed to have ‘changed most of the design concepts overnight’ (1967: 14). Multitrack
recording – already standard in 2-, 3- or 4-track configurations – became widely available
in an 8-track format with the release of the 3M M-23 in 1966. Unlike Ampex’s limited
production ‘octopus’, the M-23 was intended for a mass market where innovative record
production practices demanded ever more resources. The Ampex and Scully corporations
soon followed 3M’s lead, and in 1967 Ampex doubled the track count to sixteen with the
prototype AG-1000 commissioned by Mirasound Studios in New York City, soon followed
by the mass market MM-1000. In 1968, Grover ‘Jeep’ Harned modified an Ampex 300
handling 2” tape to accommodate twenty-four tracks, a format that by the early 1970s
became widely available in Harned’s MCI JH-24 and Willi Studer’s A80.
More tape tracks, of course, called for more console channels. And in addition to their
increasing size, consoles underwent a sea change in their electronic architecture with a shift
from tubes to transistors. Among the first such consoles was one designed by Rupert Neve
and built at his Neve Electronics firm in 1964 for delivery to the Philips Records studio
in London. The design represented a significant trend in the pro audio field. Solid state
transistors had debuted in consumer audio during the 1950s, in home audio systems and
portable radios, and by the mid-1960s the technology (in particular the FET) had advanced
to a level of quality sufficient for professional applications. As one writer summed up the
prevailing view in 1967: ‘Solid state provides the designer with so many new approaches
to the perfect amplifier that there is little doubt in the minds of progressive engineers that
solid state has taken over in audio leaving vacuum tubes as one of the great stepping stones
in the search for perfect electronic amplification’ (Weathers 1967: 6).
This brave new audio world, of course, also brought a change in the general soundscape.
The sound of transistors was said to have a crisper high end, the sound of tubes more
warmth and punch – much the same terms as the electric vs. acoustic debates of the 1920s,
or the digital vs. analogue quarrels of the 1980s and 1990s. Whatever the characterizations,
it was clear that the sounds were different and that each presented a distinct sonic setting
within which to work, and which might, in turn, influence the musical outcome. During
the sessions for their final album, The Beatles moved into the solid-state world with the new
EMI TG12345 console and the 3M M-23 eight track. According to Geoff Emerick, the new
equipment produced a less ‘punchy’, ‘softer and rounder’ aural effect. He was ‘convinced
that the sound of the new console and tape machine overtly influenced the performance
and the music on Abbey Road […]. The end result was a kinder, gentler-sounding record
– one that is sonically different from every other Beatles album’ (Emerick 2010: 277–278).

Conclusion
Amid increased awareness and a general curiosity about the workings of record production,
the past couple of decades have seen a steady increase in sessionographies, monographs
and videos in which recording machinery makes at least a cameo appearance. Examples
include Mark Lewisohn on The Beatles (1988), Martin Elliott on The Rolling Stones (2012),
Keith Badman on The Beach Boys (2004), John McDermott, Billy Cox and Eddie Kramer
on Jimi Hendrix (1995), Charles Granata on Frank Sinatra (1999), Ashley Kahn on Miles
Davis (2007), Bloomsbury’s 33 1/3 monograph series and the Classic Albums video series
distributed by Eagle Rock Entertainment. For a detailed description of electronic devices at
work in a creative musical milieu, nothing matches Ryan and Kehew’s Recording The Beatles
(2006). If this list suggests a tilt towards popular music, it is not to minimize the importance
of work such as that of Timothy Day (2000) or Andrew Kazdin (1989) on classical music
recording. Rather, it acknowledges that pop music as we know it, with its explicit embrace
of electronic mediation and its ontological association with electronic texts, has thus far
extended a more open and enticing invitation to peek behind the studio curtain.
In addition to documentary interest, a marked historical sensibility, sometimes framed as
‘technostalgia’, pervades current production practices (Taylor 2001; Bennett 2012; Williams
2018). Audio equipment leaves sonic traces on the records it shapes, forming a connection
among music, sound and machine. In the lexicon of sound recording and mixing, such
associations have conferred on many pieces of gear an iconic status through their use
on decades’ worth of hit records. If ‘vintage’ gear inspires a certain fanciful, fetishistic
reverence, its sonic influence is nevertheless real, the evidence audible on records that
have stood the test of time. Thus, the Pultec, for example, remains an esteemed equalizer.
Or, rather, its popularity has revived after being eclipsed in the 1980s. Indeed, the rise of
analogue technostalgia in the recording studio coincides with what we might call digital
modernism. Ironically, however, it is in the digital realm, recast as software emulation,
that the Pultec finds its widest application. Similarly for the Fairchild 670, the LA-2A, the
1176 and vintage tape machines, channel strips, reverb and echo units, and microphones.
All are available in emulations created by digital signal processing (DSP) engineers at such
firms as Universal Audio and Waves. Though transistors initially replaced tubes on the
road to ‘perfect electronic amplification’, the superseding digital phase of audio’s evolution
has made both technologies available in a technological bricolage of historical association
evoking the broad sweep of sound-recording tradition.

Notes
1. Mercury Records took ‘Living Presence’ as the brand name for a series of high-fidelity
classical music recording projects following the term’s use by New York Times critic
Howard Taubman in his review of Rafael Kubelik and the Chicago Symphony Orchestra’s
recordings of Mussorgsky’s Pictures at an Exhibition, Bartók’s Music for Stringed
Instruments, Percussion and Celesta, and Bloch’s Concerto Grosso No. 1. Effusive in his
praise, Taubman characterized the discs as ‘one of the finest technical jobs of recording
made on this side of the Atlantic’ (1951: 8X). Though unacknowledged in the review,
Robert Fine and producer Wilma Cozart oversaw the recordings.
2. For an illuminating analysis of the social dimensions of this interaction in its early years,
see Sterne (2003).

Bibliography
Alexandrovich, G. (1967), ‘The Recording Studio’, db, November: 14–16.
Badman, K. (2004), The Beach Boys: The Definitive Diary of America’s Greatest Band: On Stage
and in the Studio, San Francisco, CA: Backbeat.
Bennett, S. (2012), ‘Endless Analogue: Situating Vintage Technologies in the Contemporary Recording & Production Workplace’, Journal on the Art of Record Production, 7. Available
online: http://arpjournal.com/endless-analogue-situating-vintage-technologies-in-the-
contemporary-recording-production-workplace/ (accessed 4 March 2018).
Canby, E. T. (1950), ‘Some Highs and Lows: Sound Editing’, Saturday Review, 28 January: 71.
Canby, E. T. (1956), ‘The Sound-Man Artist’, Audio, June: 44–45, 60–61.
Culshaw, J. (1967), Ring Resounding, New York: Viking Press.
Daniel, O. (1982), Stokowski: A Counterpoint of Views, New York: Dodd, Mead.
Darrell, R. D. (1954), ‘Chromium-Plated Vulgarity’, Saturday Review, 25 December: 56–57.
Day, T. (2000), A Century of Recorded Music: Listening to Musical History, New Haven, CT:
Yale University Press.
Dreher, C. (1923), ‘Behind the Scenes at a Broadcasting Station’, Radio Broadcast, November:
37–46.
Elliott, M. (2012), The Rolling Stones: Complete Recording Sessions 1962–2012, London: Cherry
Red Books.
Emerick, G. (2010), Here, There and Everywhere: My Life Recording the Music of The Beatles,
New York: Gotham Books.
Felstead, C. (1931), ‘The New Ribbon Microphone’, Radio News, 12 (12): 1067, 1096, 1108.
Gelatt, R. (1977), The Fabulous Phonograph, 1877–1977, New York: Macmillan.
Giddins, G. (2002), Bing Crosby: A Pocketful of Dreams, Boston, MA: Little, Brown.
Gould, G. (1966), ‘The Prospects of Recording’, High Fidelity, April: 46–63.
Granata, C. L. (1999), Sessions with Sinatra: Frank Sinatra and the Art of Recording, Chicago:
A Cappella Books.
Irwin, M. (2007), ‘Take the Last Train from Meeksville: Joe Meek’s Holloway Road Recording
Studio 1963–7’, Journal on the Art of Record Production, 2. Available online: http://
arpjournal.com/take-the-last-train-from-meeksville-joe-meeks%e2%80%99s-holloway-
road-recording-studio-1963-7/ (accessed 3 April 2018).
Kahn, A. (2007), Kind of Blue: The Making of the Miles Davis Masterpiece, New York: Da Capo
Press.
Kazdin, A. (1989), Glenn Gould at Work: Creative Lying, New York: Dutton.
Kennedy, R. and R. McNutt (1999), Little Labels – Big Sound: Small Record Companies and the
Rise of American Music, Bloomington, IN: Indiana University Press.
Lewisohn, M. (1988), The Beatles Recording Sessions, New York: Harmony Books.
McCracken, A. (2015), Real Men Don’t Sing: Crooning in American Culture, Durham, NC:
Duke University Press.
McDermott, J., B. Cox and E. Kramer (1995), Jimi Hendrix Sessions: The Complete Studio
Recording Sessions 1963–1970, Boston, MA: Little, Brown.
‘Men Behind the Microphones: Makers of Music for Millions’ (1952), Newsweek, 8 September:
56–57.
Miles, B. (2004), Zappa: A Biography, New York: Grove Press.
Millard, A. J. (2005), America on Record: A History of Recorded Sound, Cambridge: Cambridge
University Press.
Moore, A. (2012), ‘All Buttons In: An Investigation into the Use of the 1176 FET Compressor
in Popular Music Production’, Journal on the Art of Record Production, 6. Available online:
http://arpjournal.com/all-buttons-in-an-investigation-into-the-use-of-the-1176-fet-
compressor-in-popular-music-production/ (accessed 4 April 2018).
Moore, A., R. Till and J. Wakefield (2016), ‘An Investigation into the Sonic Signature of
Three Classic Dynamic Range Compressors’. Available online: http://eprints.hud.ac.uk/id/
eprint/28482/ (accessed 11 May 2018).
Morton, D. (2000), Off the Record: The Technology and Culture of Sound Recording in America,
New Brunswick, NJ: Rutgers University Press.
Mullin, J. (1976), ‘Creating the Craft of Tape Recording’, High Fidelity, April: 62–67.
Myers, M. (2012), ‘Interview: Rudy Van Gelder (Parts 1–5)’, Jazz Wax, 13–17 February.
Available online: http://www.jazzwax.com/2012/02/interview-rudy-van-gelder-part-3.html
(accessed 18 February 2018).
‘Natural Sound’ (1954), New Yorker, 17 July: 17–18.
Paul, L. and M. Cochran (2005), Les Paul – In His Own Words, New York: Gemstone.
Plunkett, D. J. (1988), ‘In Memoriam’, Journal of the Audio Engineering Society, 36 (12): 1061.
Read, O. and W. L. Welch (1976), From Tin Foil to Stereo: Evolution of the Phonograph,
Indianapolis, IN: Howard W. Sams.
Reilly, B. (2017), ‘The History of Pultec and the Storied EQP-1’, Vintage King blog, November.
Available online: https://vintageking.com/blog/2017/11/pultec-bf/ (accessed 11 March
2018).
Ribowsky, M. (1989), He’s a Rebel, New York: E. P. Dutton.
Robertson, C. A. (1957), ‘Jazz and All That’, Audio, October: 56–58.
Runstein, R. E. (1974), Modern Recording Techniques, Indianapolis, IN: Howard W. Sams.
Ryan, K. L. and B. Kehew (2006), Recording The Beatles: The Studio Equipment and Techniques
Used to Create Their Classic Albums, Houston, TX: Curvebender Publishing.
Sanner, H. (2000), ‘Ross Snyder Interview, 16 November’, Recordist. Available online: http://
recordist.com/ampex/mp3/index.html (accessed 3 March 2018).
Schmidt Horning, S. (2004), ‘Engineering the Performance: Recording Engineers, Tacit
Knowledge and the Art of Controlling Sound’, Social Studies of Science, 34 (5): 703–731.
Schmidt Horning, S. (2013), Chasing Sound: Technology, Culture, and the Art of Studio
Recording from Edison to the LP, Baltimore: Johns Hopkins University Press.
Sterne, J. (2003), The Audible Past: Cultural Origins of Sound Reproduction, Durham, NC:
Duke University Press.
‘The Talking Phonograph’ (1877), Scientific American, 22 December: 384–385.
Taubman, H. (1951), ‘Records: Kubelik Leads Modern Selections on Mercury Label’, New York
Times, 25 November: 8X.
Taubman, H. (1954), ‘Crooners, Groaners, Shouters and Bleeders’, New York Times,
21 November: 26–27, 54–56.
Taylor, T. (2001), Strange Sounds: Music, Technology and Culture, New York: Routledge.
Thompson, E. A. (2004), The Soundscape of Modernity: Architectural Acoustics and the Culture
of Listening in America, 1900–1933, Cambridge, MA: MIT Press.
‘The Tune Was Made By the Way It Was Played’ (1947), Billboard, 30 August: 28.
Villchur, E. M. (1952), ‘The Telefunken Mike’, Saturday Review, 12 October: 69–70.
Weathers, P. (1967), ‘Audio Amplification’, db, November: 4–7.
Williams, A. (2018), ‘Weapons of Mass Deception: The Invention and Reinvention of
Recording Studio Mythology’, in S. Bennett and E. Bates (eds), Critical Approaches to the
Production of Music and Sound, 157–174, New York: Bloomsbury.

Discography
Bartók, Béla (1951), [12” vinyl LP] Music for Stringed Instruments, Percussion and Celesta,
Mercury.
The Beatles (1969), [12” vinyl LP] Abbey Road, Apple.
Bloch, Ernest (1951), [12” vinyl LP], Concerto Grosso No. 1, Mercury.
Craig, Francis, and His Orchestra (1947), [10” 78 rpm single] ‘Near You’, Bullet.
Harmonicats (1947), [10” 78 rpm single] ‘Peg O’ My Heart’, Vitacoustic.
Jones, Spike, and His City Slickers (1946), [10” 78 rpm single] ‘Liebestraum’, RCA Victor.
Jones, Spike, and His City Slickers (1945), [10” 78 rpm single] ‘The Blue Danube’, RCA Victor.
Mussorgsky, Modest (1951), [12” vinyl LP] Pictures at an Exhibition, Mercury.
Wagner, Richard (1958–65), [12” vinyl LP] Der Ring des Nibelungen, Decca.
5
Transitions: The History of Recording Technology from 1970 to the Present
Paul Théberge

Introduction
The so-called ‘Digital Age’ is often described as beginning in the 1970s with the introduction
of microprocessors and personal computers, reaching a peak in the mid-1990s with the
popularization of the internet, and then morphing into a new and even more prolific phase
of innovation in the twenty-first century with the rise of mobile communications. Initially,
these events and forces were thought to be part of the ever-increasing pace of technological
change during the twentieth century but, more recently, they are often understood to be
both economically and socially ‘disruptive’. On the surface, technological changes in the
music industries as a whole appear to have moved in tandem with these larger forces: the
music industries have been, by turn, at the forefront of adopting digital innovations and,
also, on the receiving end of their most disruptive impacts.
But despite the implicit (and explicit) claims of ‘revolutionary’ change imbedded
in popular accounts of this technological trajectory, the transition to digital means of
production within the smaller world of recording technologies was neither sudden,
nor linear, nor entirely coherent. Indeed, the relationship between ‘digital’ technologies
and the wide range of recording technologies that preceded them – most often referred
to collectively as ‘analogue’ technology – was more complex than the assumptions that
often underlie the use of these terms. As Jonathan Sterne (2016) has argued: the supposed
opposition between digital and analogue technologies is of recent origin and too often
belies the points of contact between them.
In the following pages, I want to outline a history of recording technologies over the
past fifty years – from the early 1970s to the end of the second decade of the twenty-first
century – as a series of transitions that reflect multiple trajectories, overlapping interests
and underlying relationships, all operating within particular technical and economic
contexts. The number of individual instruments, recording and processing devices
introduced over the past fifty years is enormous – far more than can be included in a brief
essay – and the global diffusion, adoption and uses of these devices has been varied and
uneven, but also highly creative. Of necessity, the focus here will be partial, representing
only a small number of individual innovations that reflect, or have contributed to, the
larger transitions occurring within the recording industries as a whole.

Transitions I: Multitracking
Understanding the significance of changes in technology and music production practices
that took place during the 1970s and early 1980s is critical in coming to terms with the
broader transition to digital technologies in music. While digital technologies are often
thought to have replaced their analogue counterparts, they were also shaped by them: in
this sense, the process of ‘remediation’ (Bolter and Grusin 1999) is complex insofar as the
expectations of what digital technologies are, and what they can do, were largely modelled,
at least initially, on assumptions about music production as it existed at the time digital
technologies were first developed and put into use. Within the world of music production,
the 1970s were preoccupied, first and foremost, with the coming to fruition of multitrack
recording and the establishment of the multitrack recording studio – and the practices
associated with it – as a mode of production. The implications of these developments were
far-reaching and their impact continues to inform music production a half century later.
Conventional histories of multitrack recording begin with the sound-on-sound
experiments of guitarist Les Paul in the 1950s (even though experiments with overdubbing
predate his work) and then move on to the ambitious studio productions of Phil Spector,
The Beatles and The Beach Boys during the 1960s. But it was not until the 1970s that
multitrack recording came to full fruition and became established as a widespread
phenomenon within music production. During the late 1960s, 8-track tape recorders were
still rare (even one of the most heralded examples of early multitrack possibilities and
excesses, The Beatles’ Sgt Pepper’s Lonely Hearts Club Band (1967), was recorded on two
4-track machines); but by the end of the decade both 16- and 24-track tape recorders had
been introduced (by Ampex and MCI, respectively). More importantly, with the growing
popularity of rock and pop music, independent studios began to flourish and having the
latest in multitrack equipment was one of the grounds on which status and competition
were played out.
Track count – the expansion to 24-track capability (and beyond) – was not simply a
quantitative change but, also, a qualitative one. It changed how musicians, engineers and
producers thought about music performance and recording and it necessitated changes in
a whole range of other technologies – from console design, to the uses of sound processing,
to the architecture of the studio environment itself – and subtle shifts in the division of
labour in music production. The latter was recognized by sociologist, Edward R. Kealy
(1979), in his often-cited study of the relationship between technology, studio work and
industry structures. Kealy argued that the rise of the multitrack studio placed greater
emphasis on recording and mixing practices, enhancing the role of the engineer and
conferring upon them the status of a recording ‘artist’. It may be difficult to generalize such
shifts – Kealy’s research was largely conducted in the United States and, as Simon Zagorski-
Thomas (2012) has argued, even in the United Kingdom changes in labour structures
were more ambiguous – but the point of these researchers is clear: there are a variety of
subtle relationships that can exist between technology, studios, labour and aesthetics, and
these relationships ultimately have an impact on the sound of recorded music. The role of
multitrack recording technology in the production of rock music has been outlined by a
number of writers, including Steve Jones (1992) and Albin Zak (2001), who argued that
making records was central to the aesthetics of rock (unlike other genres, such as jazz) and
that this gave the multitrack recording studio special significance as the site where rock
sounds were first made and then moulded into a final product.
Part of the development of pop/rock aesthetics during the 1970s and 1980s depended
on a fundamental affordance of multitrack technology: the possibility of isolating and
processing individual musical sounds on separately recorded tracks. This not only resulted
in an increased use of overdubbing (recording the contributions of individual musicians
in a series of discrete passes) but, more importantly, it enabled a new level of scrutiny over
the shaping and processing of individual sounds and their overall balance and placement
in the final stereo mix. This was perhaps most clearly manifest in the recording of drum
tracks: by the end of the 1970s it was not unusual to find as many as eight separate tracks
devoted to the drum kit alone, allowing the kick drum, snare, hi-hat and other percussion
sounds to be isolated, processed through equalizers, compressors, gates, reverbs and other
effects, and then placed precisely within the stereo mix. Dockwray and Moore (2010)
have traced changes in the placement of sounds in the stereo field from the late 1960s to
the early 1970s, and Moylan (2002) has elaborated a more general account of multitrack
recording processes and the emergence of multiple sound locations and spaces within
mixing practice. The results of these multitrack processes were not only of significance for
musicians and engineers alone but for consumers as well: listening to recordings made in
the late 1970s and early 1980s was an immersive experience, where listeners experienced
drum and other instrument sounds as a kind of spatialized rhythmic structure stretching
across the entire stereo field (Théberge 1989).
The 1970s and early 1980s were also a period of music industry expansion. Wallis and
Malm (1984) documented the widespread adoption of electronic instruments, studio
technology and inexpensive audio cassettes, and argued that small countries in both the
Western and non-Western world had come to be regarded as sources of new talent, on the
one hand, and peripheral markets, on the other. In the context of globalization, multitrack
recording technology became one of the means by which musicians tailored their sounds
for the international market (Meintjes 2003); however, how the sounds were shaped varied
greatly. One of the most salient examples of the creative response to multitrack technology
to come out of the 1970s was the rise of reggae and dub music in Jamaica: reggae recording
practices allowed each individual instrument to occupy its own space within the larger
rhythmic frame of the music, while dub instituted a new way of thinking of individual tracks
as material to be recycled for use in other recordings – recordings destined for the mobile
‘sound systems’ that were central to Jamaican dance music culture. With their innovative
use of pre-recorded tracks and the liberal deployment of echo and other audio processing,
dub engineers, such as King Tubby, have been credited with inventing the idea of the ‘remix’
that would come to dominate dance music production in urban centres throughout the
world (Howard 2016). By the end of the decade, pop music producers such as Brian Eno
(in a lecture given in 1979 and published in 1983) would declare the multitrack studio –
with its seemingly infinite possibilities for adding, subtracting and processing tracks – as
a ‘compositional tool’, citing reggae as a key moment in its development, and referring to
reggae producers as ‘composers’ (Eno 1983).
While most of this discussion has dealt with professional audio recording equipment,
it needs to be recognized that the significance of the audio cassette goes far beyond its
use as a consumer format. As the first recordable format to gain widespread acceptance
in the consumer market in close to a century of sound recording, the audio cassette
brought production and consumption together in unexpected ways. During the 1970s
and 1980s, the cassette became an alternative medium of musical production and
distribution for groups that previously had little or no access to recording, enhancing
the possibilities of musical, political and ethnic subcultures around the world (Manuel
1993), and it brought to the fore issues of copyright, appropriation and piracy on a scale
that was unprecedented in the music industry. Even long after the industry shifted to
digital recording and CDs, cassettes remained a viable medium, outselling both LPs
and CDs into the late 1980s, and continued to be a source of industry complaints of
piracy well into the 1990s. But in the context of production, cassettes also became the
technical foundation of a market for semi-professional and amateur music recording in
the home – the so-called ‘home studio’.
While open-reel multitrack recorders clearly offered better sound quality than was
possible with any cassette format – the width of the tape tracks, the speed of the tape
transport and other factors guaranteed the sound quality of professional tape decks – the
sheer cost advantages, ease of handling and portability of cassette tape made it accessible
to a wider public. The introduction of the ‘Portastudio’ by Tascam, in 1979, helped create
the idea of the home studio as an essential part of every musician’s stock in trade and also
cemented the relationship between the tape recorder and the recording console as the two
central technologies of multitrack studio production (the Portastudio integrated recording
and console functions into a single device). The distinction between ‘professional’ and
‘consumer’ – or what would eventually become known as ‘pro-sumer’ – devices was still
significant and most home-studio productions were considered within the industry as
demo tapes at best. But even that assumption was put into question when Bruce Springsteen
chose to release a raw, 4-track Portastudio version of his album Nebraska (1982), over a
more elaborately arranged, studio-recorded version of the same material.
While many have celebrated these creative and ‘democratizing’ possibilities of
recording technology, what most commentators of the day (and many since) failed to
recognize was that the social relations of recording studio production were largely male
dominated; and this was as true of the home-studio phenomenon as it was of mainstream,
commercial operations. Indeed, the entire discourse of marketing and promotion of
studio equipment has assumed male music enthusiasts as the primary market, and this
has been the case in the promotion of both analogue and digital equipment, even as the
domestic sphere – traditionally a space for female music-making – was being colonized as
a production environment (Théberge 1997). The male domination of the music industry
has been challenged from a number of fronts in popular music studies over the past several
decades and, most recently, scholar/practitioners such as Paula Wolfe (2018) have argued
that as women in the industry have become increasingly focused on controlling their own
careers, moving into the roles of recording and producing is part of that process. Gender
relations in the studio remain a highly contested terrain, however, and throughout this
discussion of the importance of contemporary recording technology it is important to be
reminded that all technology exists within larger social and industrial structures; and in the
music industry, where sounds, images and careers are highly managed, the ongoing male
domination of the industry creates limits on the uses of technology in music production.

Transitions II: Digital recording – uncertain beginnings
Just as multitrack recording on analogue magnetic tape was reaching its highest level of
development in the 1970s and early 1980s, digital recording technology was already
emerging, albeit tentatively. The primary difference between digital and analogue recording
lies in the method by which audio information is stored: conventional electrical recordings
begin with the microphone, when sound pressure in the air is converted into electrical signals
(a process known as ‘transduction’); in analogue recording, the variations in the electrical
signal can be amplified and applied directly to a cutting lathe to make fluctuations in the
grooves on a disc, or stored as variations in magnetic flux in the iron oxide particles coated
onto magnetic tape. In digital recording, the amplitude of electrical signals is ‘sampled’ and
quantified at fixed intervals of time and stored as a series of discrete, on/off pulses (0s and
1s) on a magnetic or optical medium. The digital process is often referred to as pulse code
modulation (PCM), and its technical characteristics are expressed in terms of the length of
the digital word used to measure the audio and the frequency at which the electrical signal
is sampled (e.g. as in the ‘16-bit/44.1 kHz’ technical standard for CDs).
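To make the sampling and quantization steps concrete, the following minimal sketch (in Python, with illustrative values only, not any manufacturer’s implementation) generates a one-second tone standing in for the transduced microphone signal and converts it to 16-bit values at the 44.1 kHz CD rate:

```python
import numpy as np

SAMPLE_RATE = 44100   # samples per second (the 44.1 kHz CD rate)
BIT_DEPTH = 16        # length of the digital word
DURATION = 1.0        # seconds of audio

# 1. Sample: measure the signal at fixed intervals of time.
t = np.arange(int(SAMPLE_RATE * DURATION)) / SAMPLE_RATE
signal = 0.8 * np.sin(2 * np.pi * 440.0 * t)   # stands in for the electrical mic signal

# 2. Quantize: map each amplitude onto one of 2**16 discrete steps.
max_code = 2 ** (BIT_DEPTH - 1) - 1            # 32767 for 16-bit audio
pcm = np.round(signal * max_code).astype(np.int16)

print(pcm[:8])   # the discrete values that are stored as on/off pulses (0s and 1s)
```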
In both professional and popular discourse, much has been made of the supposed
difference (in sound quality and musicality) between the ‘continuous’ character of analogue
recording methods versus the ‘discrete’ values of digital data but, as Jonathan Sterne (2016)
has argued, there is in fact nothing more, or less, ‘natural’ about either method. Differences
in sound quality did, of course, exist between the two media but this had as much to do
with the advanced level of development of magnetic tape recorders, Dolby noise reduction
and other technologies used to enhance analogue recording at the time as with the relatively
poor quality of the digital converters in use during the same period.
What digital recording promised, first and foremost, was the elimination of surface
noise and instabilities in the analogue tape transport mechanism. All analogue methods of
reproduction result in noise of one kind or another – the surface noise of an LP or the ‘hiss’
of magnetic tape – and imperfections in the medium cause distortion in the reproduced
audio; ‘wow and flutter’ in tape transport causes instability in sound reproduction, most
noticeable when recording sustained pitches; and all of these noises and distortions build
up each time an analogue recording is copied. But digital reproduction only needs to be
able to read enough of the recorded data to reconstruct the individual on/off pulses and
convert them back into electrical signals, and this makes the digital process less susceptible
to minor imperfections or degradation of the recorded surface itself (error correction
is employed when surface problems do occur); timing of the recorded data is also more
tightly controlled.
Although the theoretical foundations of digital recording had been laid as early as the
1920s and 1930s, it was not until the late 1960s that the first practical digital recording
devices were developed at the research division of NHK, the Japanese national broadcaster.
The earliest machines were monophonic, operated at only 12-bit/30 kHz resolution (quite
low compared to later standards) and stored the data on video tape (magnetic audio tape
could not handle the density of the digital information); but the benefits of digital recording
in terms of the lack of surface noise and other forms of distortion were immediately
recognized and, by the early 1970s, NHK, Denon and Nippon Columbia had developed
stereo (and even 8-track) versions of the technology.
Given the track limitations of the new digital recording devices, their primary use
throughout the 1970s was in the recording and mastering of classical music and, to a lesser
extent, jazz. The technology was still relatively unstable and experimental in character,
however, and few recording projects were undertaken without also employing analogue
recording as a backup measure; in many instances, the analogue tapes were deemed
superior in quality and commercial releases used the analogue backups as source material.
In addition, since commercial releases still employed LP or cassette tape media, the benefits
of digital recording and mastering were not always evident to the listener. It was not until
the introduction of the digital compact disc, in 1983, that pressure mounted to consolidate
a fully digital chain of production from recording, to mastering, to consumption.
One of the other drawbacks of early digital recording methods was that the material
could not be edited in a conventional fashion: each system used special controllers to
perform routine editing tasks. The US-based Soundstream Inc. (founded in 1975) was
one of the first companies to develop a fully integrated digital recording and editing system
during the late 1970s: based on custom-designed circuitry for the audio conversion process,
Honeywell drives, a DEC mini-computer, and custom-designed editing and processing
software, Soundstream developed what could be considered as one of the first ‘digital audio
workstations’ (DAWs; although the term was not in widespread use at the time). Given
the technology of the day, the Soundstream system was bulky and prohibitively expensive
(Rumsey 1990); only a small number of these proprietary systems were produced for use
in (primarily classical) music recording and in film post-production, and their impact in
the wider realm of music recording was limited.
In the multitrack studio, computers were first introduced in the late 1970s by console
manufacturers, such as Solid State Logic (SSL), as a way of automating and storing
console settings to aid in the increasingly complex process of mixing analogue multitrack
recordings. Digital microprocessors and algorithms were also employed in a new
generation of artificial reverberation units (introduced by EMT, in 1976, and Lexicon, in
1978), which allowed for the manipulation of reverb time, timbre and pre-delay in ways
that were impossible with analogue reverbs of the day. In these ways, digital technology
did not so much displace analogue technology but was instead designed to control and
automate analogue processes, on the one hand, or provided audio processing tools within
an essentially analogue recording environment, on the other. Even the introduction of a
32-track digital tape recorder, developed by 3M, in 1978, and later digital multitrack decks,
such as Sony DASH and Mitsubishi ProDigi machines introduced in the 1980s, did not
disrupt studio practices: functioning as stand-alone digital devices within a predominantly
analogue recording aesthetic, they lacked the integration that computer-based recording
and editing promised (but was as yet unable to deliver).

Transitions III: Synthesizers, sequencers and MIDI
Perhaps one of the most significant moves towards the full integration of digital technologies
within the studio environment came, surprisingly, not from the manufacturers of sound-
recording equipment but from developments within the synthesizer industry. During the
late 1960s and early 1970s, analogue synthesizers migrated from their previous habitat in
electronic music studios into mainstream music-making. Many early studio synthesizers
were monophonic and, as such, were dependent on multitrack recording: well-known
synthesizer recordings of the day – those by avant-garde composers such as Morton
Subotnick or by popular artists such as Wendy Carlos and Isao Tomita – were composed
on multitrack tape. One of the challenges of the early inventor-entrepreneurs, such as
Robert Moog and Alan R. Pearlman, was to convince music retailers that their devices were
not some obscure form of audio equipment but, in fact, a new kind of musical instrument
(Pinch and Trocco 2002). The success of the Minimoog – a portable, keyboard-based
synthesizer introduced in 1970 – and instruments like it, paved the way for the acceptance
of synthesizers in a wider range of music performance and recording contexts than they
had previously enjoyed.
Not unlike the case of analogue recording, the initial introduction of digital technologies
in music synthesis came in the form of devices used to control and automate the analogue
hardware. ‘Sequencers’ were initially developed as analogue circuits that could supply a
series of stepped voltages to create brief, repeating sequences of pitches: the voltages were
used to drive the pitch of oscillators in a precise fashion while ‘gates’ turned the tones on
and off. The repetitive character of much early synth pop was in large part due to the use
of sequencers. The aptly-named Sequential Circuits, a US-based company founded in 1974
by Dave Smith, produced sequencer and programming devices for popular synthesizers
such as the Minimoog and the ARP 2600 and, by the mid-1970s, Smith began to design his
devices around microprocessor technology, allowing for greater length and variety in the
sequenced patterns. Smith later introduced his own keyboard synthesizer, the Prophet-5 (in
1978), the first analogue synthesizer to include a microprocessor at the core of its hardware
design. Similarly, Ikutaro Kakehashi, founder of the Roland Corporation in Japan, began
his career designing transistor-based drum machines and, later, keyboard synthesizers;
Kakehashi’s drum machines employed a type of sequencer to trigger repeatable rhythmic
patterns that played the electronic drum sounds and he too turned to microprocessor-
based designs by the late 1970s.
At the same time that these hybrid, analogue/digital devices were being developed,
a number of other inventors were working on more integrated, fully digital systems
for recording, synthesizing and controlling audio: instruments like the Synclavier (first
introduced in 1978) and the Fairlight CMI (Computer Musical Instrument, released a year
later) used microprocessors for their internal synthesis engines as well as their sequencers
and programme controls; the CMI also included a computer monitor (an option on later
versions of the Synclavier) and a floppy disk drive, giving it the appearance of a computer
with an organ-like keyboard. Most important, these instruments also included the ability
to record digital audio ‘samples’ – digital recordings of brief duration that could be stored in
RAM (random access memory) and then triggered and pitched by playing the instrument’s
keyboard. While neither instrument had the recording capability of a multitrack recording
deck, their ability to reproduce acoustic instrument sounds and to integrate digital
recording capability, synthesis, editing and sequencing in a single device began to call into
question the distinction between musical instrument and recording device.
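As a rough illustration of what ‘triggering and pitching’ a stored sample involves, the sketch below repitches a recording held in memory by reading it back at a scaled rate; the function name and the linear interpolation are simplifications chosen for clarity, not a description of either instrument’s actual hardware:

```python
import numpy as np

def play_sample(sample, semitones):
    """Read a stored waveform back at a scaled rate to transpose it."""
    rate = 2.0 ** (semitones / 12.0)                 # equal-tempered pitch ratio
    positions = np.arange(0, len(sample) - 1, rate)  # fractional read positions
    # Linear interpolation between stored values approximates the repitched sound.
    return np.interp(positions, np.arange(len(sample)), sample)

# e.g. a short recording held in 'ram' played a fifth (7 semitones) higher:
# transposed = play_sample(ram, 7)
```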
The Synclavier and the Fairlight CMI cost tens of thousands of dollars and, although their
influence on music was significant, they appealed to only a small segment of the market.
But by the early 1980s, a variety of low-cost digital synthesis and sampling instruments
were released to a broader range of musicians by companies such as Roland, Yamaha,
Sequential Circuits, E-mu Systems, Akai and Ensoniq. Individually, these instruments had
a more limited range of capabilities than the Synclavier or the Fairlight, but what gave them
the potential for an even greater level of integration was the introduction of the Musical
Instrument Digital Interface (MIDI) in 1983 (see Théberge 1997).
MIDI, which was designed primarily through the collaborative efforts of Smith and
Kakehashi, is neither a recording device nor a musical instrument; it is, rather, a technical
specification that defines how, via a set of hardware and software protocols, digital
instruments can be connected to one another and to personal computers (which were just
becoming widely available during the early 1980s). MIDI does not carry digital audio but it
defines how control signals can be used to trigger the timing and pitch of sounds produced
by synthesizers, samplers and drum machines. Most industries are governed by technical
standards of one kind or another, and MIDI created a kind of technical standard through
which digital instruments built by different manufacturers could be interconnected – an
important step in the maturation of the synthesizer market. More importantly for the
discussion here, it allowed both hardware and software sequencers to become the centre of
a production environment that could rival that of the multitrack studio. Indeed, while early
sequencers had often been designed to create songs by stringing together groups of repetitive
patterns, software sequencers could take advantage of the power of personal computers to
partially break away from pattern-based composition to create longer and more complex
sequences that essentially organized MIDI data into linear, track-like configurations. But
unlike analogue or digital recorders which fix sounds in a medium, MIDI data could be
edited in a variety of ways, reordered through cut-and-paste methods and made to trigger
sounds from different instruments at will, allowing for a kind of flexibility impossible with
conventional recording technology.
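A minimal sketch may help fix the idea: MIDI describes events (when a note starts, its pitch, its velocity and the channel, and hence the instrument, it addresses) rather than audio, which is what makes the cut-and-paste editing and retargeting described above possible. The structure below is purely illustrative and is not the byte-level MIDI specification:

```python
from dataclasses import dataclass, replace

@dataclass
class NoteEvent:
    beat: float      # position in the sequence
    channel: int     # which connected instrument responds
    pitch: int       # note number (60 = middle C)
    velocity: int    # how hard the key was struck

riff = [NoteEvent(0.0, 1, 60, 100), NoteEvent(1.0, 1, 64, 90),
        NoteEvent(2.0, 1, 67, 90),  NoteEvent(3.0, 1, 72, 110)]

# 'Cut and paste': repeat the first two notes four beats later...
pasted = [replace(e, beat=e.beat + 4.0) for e in riff[:2]]
# ...and send the whole passage to a different synthesizer on channel 2.
retargeted = [replace(e, channel=2) for e in riff + pasted]
```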
The effect of these developments in the world of music production was immediate and
far-reaching: in pop, hip-hop and dance music especially, entire songs could be created
almost entirely from sequencers, drum machines, samplers and synthesizers, resorting
to conventional recording only to add vocals, guitars or other instruments as needed.
Timothy Warner (2003) has singled out the work of producer Trevor Horn as emblematic
of the new recording aesthetic: Horn made innovative uses of multitracking, MIDI and
sampling to create a series of hit singles with a variety of artists throughout the 1980s
and early 1990s. In rap and hip-hop, the combination of sampling beats from previously
recorded material, triggering them from sequencers and layering them with other sounds
on multitrack tape was reminiscent of techniques employed by dub reggae producers a
decade earlier. Alongside the turntable, samplers, drum machines and sequencers became
central technologies in hip-hop production (Schloss 2004).
The flexibility of MIDI technology and the consumer orientation of the synthesizer
market created a downward pressure on studio configurations: a studio could consist
of as little as a few MIDI devices, a small mixing console and an 8-track recorder (the
introduction of modular, digital 8-track tape machines, such as the Alesis ADAT and the
Tascam DA-88 in the early 1990s, attests to the continued popularity of small-format tape
recording devices). MIDI helped establish the so-called ‘project studio’ – essentially a
home studio of sufficient quality that it could take in commercial work – and some project
studios began to compete with larger, professional operations. And like the spread of
cassettes and multitrack recording in the 1970s, MIDI technology created a second wave
of Western technology spreading throughout the globe in the late 1980s and early 1990s,
fuelling a vibrant culture of low-cost, independent production and remixing that became
a part of music scenes in urban centres as far afield as Detroit, Jakarta and Kathmandu
(Greene and Porcello 2004).

Transitions IV: Consolidation – the rise of the digital audio workstation
While MIDI gave rise to a new level of integration in audio production technologies, the
systems to which it gave rise were extremely complex. In addition to the technologies
found in a conventional recording studio – from microphones to consoles, signal
processors, tape machines and monitoring systems – MIDI added a new range of
devices, including computers, synthesizers, samplers, sequencers and drum machines.
The acquisition of new forms of technical knowledge and new vocabularies was also
part and parcel of the turn to MIDI technology. It was the personal computer, however,
with its increasing levels of speed, computational power and storage capacity that would
eventually integrate the entire disparate studio apparatus into a single software-based
device: the DAW.
In everyday usage, the term ‘digital audio workstation’ has had a problematic history:
it was initially imported from the scientific and engineering communities where the
term ‘workstation’ was applied to a class of computers that usually included some form
of proprietary hardware so as to enhance its ability to perform scientific applications –
applications that were beyond the capabilities of mainstream computing. In this regard,
referring retroactively to the Soundstream system (mentioned above) as a prototypical
DAW would be appropriate as the system included custom-designed audio converters
and other hardware and software enhancements that allowed it to record, edit and process
digital audio, a capability far beyond that of other mainstream computers of the day.
Similarly, the DAW label has been applied to the Fairlight CMI because of its particular
configuration of computing and sampling capabilities. However, by the mid-1980s, the
‘workstation’ term was often applied to virtually any sampling or synthesizer keyboard
that included an integrated sequencer: for marketing purposes, it was in the interests of
the manufacturers to associate the keyboard/sequencer combination with a kind of virtual
studio concept, and this usage is still common among synthesizer manufacturers some
three decades later.
Following on from the Soundstream example, however, the term ‘digital audio
workstation’ is most commonly used to designate a computer-based, integrated system
for the recording, editing, processing and mixing of digital audio. From the early 1980s
onwards, a number of companies developed systems around proprietary hardware and
software configurations, such as Advanced Music Systems’ (AMS) ‘AudioFile’, introduced
in 1984. Others took advantage of the increases in computational power offered by
personal computers to develop hybrid systems: these systems typically used a combination
of specialized hardware (such as the Motorola DSP 56000 series of digital signal processing
chips designed to handle the heavy mathematical computations required for digital
mixing, equalization and other tasks) and off-the-shelf consumer-oriented computers
produced by IBM, Apple and Atari to run audio software applications and the graphical
interface. These systems became more prevalent during the second half of the decade: for
example, companies such as Sonic Solutions, Otari and Digidesign developed digital audio
systems specifically for use with the Mac II series of computers (first introduced by Apple
in 1987). By the early 1990s at least twenty-four different DAWs – ranging in price from a
few thousand dollars (not including the price of the computer) to in excess of US$100,000
– were available in the marketplace (Lambert 1991).
Audio production using a DAW differs from conventional recording methods in a
number of significant ways: firstly, the resolution of digital editing, which can be performed
to within thousandths of a second (down to the level of individual digital bits), offers a level
of precision impossible with conventional editing techniques. Later developments in editing
would employ this level of precision to shift the placement of recorded drum sounds to
create subtle variations in the feel of rhythms and beats (Danielsen 2010). Secondly, editing
within the digital domain is a random access, non-linear process: workstations of the late
1980s and early 1990s typically used large amounts of RAM and/or computer hard drives
as storage media, allowing any segment of a recording to be theoretically as accessible as
any other at any given moment; this random accessibility enhances the flexibility of editing
digital audio.
Thirdly, the non-linear aspect of working with digital audio has had a profound
impact on the production of popular music, not only at the level of the manipulation of
individual sounds but also at the level of the arrangement and the overall song structure.
In multitrack tape recording each sound source is recorded on a separate linear ‘track’ that
runs for the entire duration of the song; limited editing can be performed within individual
tracks – through an electronic process known as ‘punching in’, where a portion of a track
is re-recorded to correct errors or to achieve a better performance. However, the overall
structure of the song cannot be altered once the initial tracks have been laid. In the DAW,
on the other hand, each sound source is recorded to a hard drive as a separate file; the files
can be synchronized and played in a linear fashion or, through random access editing, the
files can be cut up, reordered and juxtaposed in an infinite number of ways. This flexibility
greatly enhances the ability of the producer to compose, rearrange or remix a song to create
multiple versions of the material.
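One way to picture this non-linear flexibility is to model a track as a list of regions that merely reference audio files on disk, so that rearranging the song means editing references rather than the recorded audio itself. The structures and names in the sketch below are illustrative, not any particular DAW’s file format:

```python
from dataclasses import dataclass

@dataclass
class Region:
    source_file: str     # audio file recorded to the hard drive
    file_offset: float   # where the region starts within that file (seconds)
    length: float        # duration of the region (seconds)

verse  = Region("vocal_take3.wav", 0.0, 30.0)
chorus = Region("vocal_take3.wav", 30.0, 20.0)

# Rearranging the song means reordering (or repeating) references,
# not touching the recorded audio itself:
arrangement = [chorus, verse, chorus, chorus]
```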
Finally, the use of computer graphics in DAWs has enhanced the engineer’s ability to
visualize audio in new ways: individual sounds are rendered as amplitude waveforms on the
computer monitor and, through Fourier analysis, as spectral displays; these representations can
be used as visual aids in analysing and editing digital audio. Similarly, graphics can
be used to help engineers visualize the effect that digital processing – equalization,
compression or reverberation – has on individual waveforms or, in mastering, the
cumulative effect of processing on the overall sonic contour of the song. Visualization
is also central to the large-scale manipulation of the song’s arrangement as mentioned
above: instrumental and vocal tracks are often represented as blocks of sound that can
be freely edited and moved around within or between tracks through copy-and-paste
procedures. Managing the multiple, onscreen windows and configuring visual as well
as audio information has become a central part of DAW production practices (Savage
2011; Strachan 2017).
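As an indication of how such spectral views can be produced, the following sketch computes frame-by-frame magnitude spectra of the kind a DAW might draw as a spectrogram; the window size and hop length are arbitrary example values, not any product’s defaults:

```python
import numpy as np

def spectrogram_frames(audio, frame_size=1024, hop=512):
    """Return one magnitude spectrum per analysis frame, ready for display."""
    window = np.hanning(frame_size)
    frames = []
    for start in range(0, len(audio) - frame_size, hop):
        frame = audio[start:start + frame_size] * window
        frames.append(np.abs(np.fft.rfft(frame)))   # magnitudes the DAW can draw
    return np.array(frames)
```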
Despite the advantages offered by DAWs, they were only gradually and unevenly
adopted within mainstream music production throughout the early 1990s: because of the
computational demands of recording, processing and mixing multiple tracks of digital
audio, many workstations were capable of handling only four to eight tracks of audio
at a time. By comparison, MIDI sequencer applications made far fewer demands on the
computer because they relied on external synthesizers, samplers and drum machines to
generate audio.
However, realizing the potential to combine DAWs with MIDI sequencing, a number
of sequencer software manufacturers added digital audio capabilities to their products: for
example, Mark of the Unicorn (MOTU), a US-based developer, had introduced a computer-
based MIDI sequencer called ‘Performer’, in 1985, and then renamed it ‘Digital Performer’,
in 1990, after adapting it as a combined digital audio and sequencing platform; similarly,
Steinberg (based in Germany) renamed its software sequencer from ‘Cubase’ (introduced
in 1989) to ‘Cubase Audio’ (in 1992). Approaching the problem from the opposite side
of the equation, Digidesign, a US developer, had entered the music production field in
1984 with a product called ‘Sound Designer’, a software-based tool for editing digital audio
samples (the samples were first transferred from MIDI keyboards or hardware samplers to
the computer and then transferred back after editing). By the early 1990s, Sound Designer
had evolved from a sample editor to a stereo audio recording and editing system, and
then to a four-channel multitrack system named ‘Pro Tools’ (in 1991); recognizing the
competition from the sequencer manufacturers, Digidesign later added a rudimentary
MIDI sequencer to the Pro Tools platform. Each of these manufacturers began to describe
their hybrid, digital audio/sequencer systems as ‘digital audio workstations’, thus enlarging
the scope of the term in the marketplace.
By the end of the 1990s, personal computers had become powerful enough to handle all
the basic processing and mixing needs of digital audio. Although some form of conversion
hardware was still required to get audio into, and out of, the computer, additional digital
signal processing (DSP) hardware was no longer needed for computational purposes and
track counts increased significantly, allowing DAWs to compete more effectively with
tape-based multitrack recorders. While DAWs were widely used in studios throughout
the decade, Ricky Martin’s ‘Livin’ La Vida Loca’, a number one hit in the Latin and
mainstream pop charts in 1999, is widely regarded as the first hit single fully produced
on a Pro Tools system and a watershed moment in the transition of the industry from
conventional recording methods to DAW technology (at the time of writing, Billboard had
even published an article celebrating the twentieth anniversary of the recording; Brown
2019). During this time, Digidesign (now Avid Audio) emerged as a leader in the field of
professional digital recording and its Pro Tools software became the de facto standard (in
the United States) in both the film post-production and music recording industries.
Given that the development and acceptance of DAW technology was an extended
process, some two and a half decades in the making, it is difficult to gauge the speed at
which it has become diffused throughout the world. Certainly by the end of the 1990s,
many manufacturers had ceased to make conventional large-format multitrack recording
decks – whether analogue or digital – and recording to digital media via computers or other
devices had become the default mode of operation in many commercial and home studios.
In his detailed ethnography of recording studios in Turkey, Eliot Bates (2016) has argued
that, by the early 2000s, virtually all major studios in Istanbul had converted to DAW
technology and that the computer screen had become one of the focal points where the
activities of musicians and engineers coalesced, creating a dynamic interplay between
technology and traditional modes of studio work and musicianship.

Transitions V: In the box, on the net and the rise of AI
The movement towards fully integrated, computer-based recording systems has not
been driven by technological innovations alone. At the turn of the millennium, MP3s,
peer-to-peer platforms such as Napster and other forms of internet file sharing spread
quickly and the music industry came to regard the internet as little more than a vehicle for
piracy writ large. The industry’s struggle to adapt to the new medium is well known (see
Morris 2015) but, as profits from conventional record sales (CDs) fell, so did recording
budgets for new releases. These economic forces have created an enormous pressure on
musicians, engineers and producers to adapt to recording ‘in the box’, not simply because
of the advantages offered by DAW software but also because it is cheaper to do so than
to maintain a studio full of expensive gear. And even before commercial services like
Spotify entered the online scene, websites such as Bandcamp, SoundCloud and YouTube
allowed musicians to distribute their music, for free, simply by uploading it, encouraging
a DIY aesthetic where the ratio between the cost of production, on the one hand, and the
potential for revenue, on the other, had to be kept as low as possible.
Of course, studios are still desirable as social spaces (including the interaction between
musicians and experienced recording professionals), as architectural and acoustic spaces,
and for the mixing, processing and monitoring equipment that they afford: in the past,
studio owners have taken great care to choose various equipment – a range of microphones
with specific characteristics, a particular console, etc. – because of the unique sonic
qualities that they lend to any given recording. Depending on the genre in question, much
record production still depends on some combination of musical performance, acoustic
factors, and analogue and digital components to achieve ideal outcomes; as a result, many
commercial and home studios remain hybrid operations, housing the latest DAW software
alongside a selection of ‘vintage’ gear. Furthermore, as Samantha Bennett (2018) has argued,
musicians and producers throughout this period have developed a variety of idiosyncratic
approaches to combining technologies of all kinds in their studio productions.
But at the same time, the designers of DAW software have pursued an opposite path,
employing various forms of digital emulation in an attempt to further establish the
computer as the central – if not the only – relevant piece of equipment needed for audio
production. This tendency has been particularly marked during the past two decades.
Steinberg, for example, continued to develop its approach to sequencer/DAW software by
enhancing its product line with a new software plug-in interface called ‘Virtual Studio
Technology’ (VST), introduced in 1996. In its first iteration, VST provided a standardized
environment within which third-party developers could create small sub-applications
that simulate reverbs and other audio effects. By 1999, Steinberg was able to introduce
a second version of VST that provided for the creation of sequencer-controlled software
‘instruments’ – software emulations of synthesizers, samplers and drum machines. By
taking advantage of the increased power of personal computers in this way, Steinberg was
able to consolidate MIDI sequencing within the computer so that even external synthesizers
and other hardware were no longer needed. In the years that followed, the VST format
became widely adopted by other manufacturers who integrated VST instruments within
their own host applications (Avid, the makers of Pro Tools, have developed their own plug-
in architecture).
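The general shape of the host/plug-in relationship can be suggested by a deliberately simplified sketch: the host repeatedly passes blocks of audio to a plug-in’s processing routine and receives processed blocks in return. This is only an analogy for how such interfaces behave; it is not the actual VST specification, which Steinberg defines in C++, and the class and function names below are invented for the example:

```python
import numpy as np

class GainPlugin:
    """A trivial 'effect': scale the signal by a user-set amount."""
    def __init__(self, gain_db=-6.0):
        self.gain = 10.0 ** (gain_db / 20.0)

    def process(self, block):
        return block * self.gain

def run_host(audio, plugin, block_size=512):
    """The 'host' hands the plug-in one block of audio at a time."""
    out = np.empty_like(audio)
    for start in range(0, len(audio), block_size):
        out[start:start + block_size] = plugin.process(audio[start:start + block_size])
    return out

# e.g. processed = run_host(mix, GainPlugin(gain_db=-3.0))
```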
Even where studio hardware is necessary (in the case of microphones) or desirable
(consoles, for example) there has been a parallel tendency towards designing devices
as interfaces rather than stand-alone components. For example, in 2015, Slate Digital
introduced the ‘Virtual Microphone System’ (VMS): the VMS is a hybrid technology that
employs a hardware microphone and preamp – both designed to be as sonically neutral
as possible – combined with software emulations of ‘classic’ analogue microphones –
microphones that typically cost thousands of dollars. The Slate VMS points to a new concept
of the microphone as a hardware/software device, positioned ambiguously at both the
production and post-production stages of the recording process: for example, one can
record using one plug-in emulation for monitoring purposes (thus keeping its sound
out of the recording chain) and then change it later at the mixing stage, much as one
might change EQ, reverb or other signal processing after a sound has been recorded. To
accomplish this, however, the hardware part of the system must be designed in such a
way as to efface itself: its primary function is to supply data for the purpose of emulating
hardware of the past.
When digital consoles were first introduced by companies such as Yamaha in the late
1980s and early 1990s, they were designed much like their analogue counterparts but with
enhanced automation, instant recall of programmed settings, onboard reverb and other
features that made them advantageous in both live and studio applications. As with all
digital technology, the sound quality of the consoles was very much dependent on the care
with which the input and output stages of the device were implemented and the resolution
of the internal processing architecture. More recently, studio consoles are increasingly
being designed so that they coordinate well with various software platforms, such as those
produced by Avid, Steinberg or Apple. Input and output functions can be located in stand-
alone converters and the console, rather than contributing its own characteristic sound to
the mix, is increasingly regarded as little more than a computer interface.
The issues around hardware, software and sound quality vary, depending on the
aesthetics of the musical genre in question. But historically speaking, the issue of sound
quality has become somewhat moot when it comes to choosing a digital recording
platform: by the early 2000s, digital converters – even converters designed for the
consumer market – routinely employed a 24-bit/96 kHz resolution and, once in the
computer, internal processing within DAW software can be performed at even higher
resolutions in order to maintain optimal sound quality. While factors beyond simple
numbers govern the differences between professional and consumer grade equipment,
it is clear that the sound quality of most DAW productions, even amateur productions,
can be quite high. At the same time, the forces that drove the rise of DAWs in the first
place – the advent of CDs as a playback medium – have been reversed: MP3s have been
the format of choice on most online, streaming and mobile media for two decades
and the gap between production standards and playback standards has perhaps never
been greater. As Jonathan Sterne (2012) has pointed out, MP3 files were designed for
distribution not sound quality, and consumers have clearly opted for convenience over
older notions of ‘fidelity’.
But beyond the purely technical aspects of this phenomenon, there is a kind of
ideological ‘fit’ between the world of mobile music consumption and the idea that DAWs
allow one to record music anywhere and at any time: the ‘studio’ is essentially anywhere
that you decide to put down your laptop. Of course, this is not a new idea – mobility was
built into the name of the ‘Portastudio’ decades ago – but it has been given a new emphasis
as DAW designers attempt to expand into a larger consumer market. To enhance this
market opportunity, most of the manufacturers have released so-called ‘lite versions’ of
their DAW products – versions based on the same fundamental core technologies but
having reduced feature sets (and, often, limited track counts); these software products
are available at a low cost and, in some cases, for free. Pursuing this trend further, Apple
Computer acquired Emagic (in 2002), a German software company that had created
a DAW/sequencer programme known as ‘Logic’; Apple continued to market Logic as
a professional application but also used its core technology to develop ‘GarageBand’,
released in 2004 as part of a suite of free applications bundled with the purchase of
every Apple computer. Like Steinberg’s VST, GarageBand came with a bank of virtual
instruments – drums, keyboards, guitars and other sounds – as well as pre-formed ‘loops’ –
short rhythmic phrases that can be strung together in a variety of ways to produce songs.
By 2011, Apple had developed a version of GarageBand for its iPad and iPhone product
lines, thus making DAW technology available as an easily accessible app on its mobile
devices. Aiding this process, Apogee Electronics, a US-based company whose name had
previously been synonymous with only the most expensive, professional quality digital
audio converters, began to develop a line of inexpensive converters designed specifically
for Apple computers and its mobile devices.
Recognizing the cultural potential of this consumer-oriented recording software, the
rock band Nine Inch Nails released a promotional single called ‘The Hand That Feeds’,
in 2005: originally recorded on a Pro Tools system, the song’s individual tracks were
subsequently converted to the GarageBand format and made available for free on the band’s
website, inviting fans to remix their own version of the single. The band continued this
practice with a number of subsequent songs, including tracks formatted for GarageBand,
Ableton Live and a number of other popular software platforms.
While not as flexible as full-fledged DAW systems and clearly intended for a consumer
market, products such as GarageBand have made modern music production
technology and practices like remixing available to a wider audience than at any time in
the history of sound recording. But the potential of this technology is, in part, limited
by the very same easy-play consumer ethic that has guided the development of many
other music technologies of the past (see Théberge 1997): much of its appeal lies in the
access to presets and pre-recorded musical patterns, not in the power of the underlying
architecture. Given the amount of technical knowledge and skill required to effectively use
most DAW programmes, it is not even clear how wide an audience they actually reach and
who they represent: while Apple’s website, for example, has promoted musicians such as
Grimes as an idealized, female consumer/indie artist, most industry research continues to
emphasize the lack of female representation in the area of professional music production:
on average, about 2 per cent (Smith, Choueiti and Pieper 2019). In this sense, there is
more to the problem of ‘democratization’ than simple access to technology: in an earlier
study of how musicians gain knowledge of music technology, both within and outside
educational institutions, Born and Devine (2016) argued that even DIY cultures have
not fully addressed the problems of gender, race and social status imbedded in electronic
music and sound art.
The problem of knowledge and skill may have reached a new juncture in the ways in
which artificial intelligence (AI) and machine learning are now being applied within audio
production. In professional music production, the final stage of a sound recording – the
stage at which it is prepared for release – has traditionally been handled by a special class of
sound professionals: the mastering engineer. The mastering engineer typically has a deep
knowledge of musical genres, processing tools and distribution formats, and their skills and
listening habits are developed over years of experience in the industry. As DAW technology
has developed, specific mastering applications and plug-ins have been introduced to
allow virtually anyone to potentially take on the role of mastering: iZotope’s Ozone is one
such product. As with iZotope’s entire line of software products, the graphical interface
for Ozone is quite elaborate, offering real-time visual feedback as individual processing
parameters are changed. iZotope designs its products for experienced engineers as well
as the wider consumer demographic of musicians: its website is filled with instructional
videos that provide advice and tips to novice engineers and, not surprisingly, Ozone
includes a number of genre-specific presets that offer a starting point for the mastering
process. Recent versions of the software, however, have added a new plug-in called Tonal
Balance Control, which begins with a smaller number of preset categories that employ AI
to generate equalization curves based on an analysis of thousands of popular songs; these
equalization curves can then be applied, automatically, to one’s own song. The software
essentially ‘listens’ for us.
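The broad mechanics of such curve matching can be sketched in a few lines of code. The following Python fragment is a minimal sketch of the general principle only and makes no claim to reproduce iZotope's actual algorithm: it averages the spectra of a set of reference recordings, compares that average with the spectrum of one's own mix and derives a smoothed, limited gain curve that could then be applied as equalization. The function names and parameter values are illustrative assumptions rather than anything drawn from a real product.

import numpy as np

def average_spectrum(tracks, n_fft=4096):
    """Average the magnitude spectra (in dB) of a list of mono audio arrays."""
    specs = []
    for x in tracks:
        # Hop by half a frame and window each frame before transforming.
        frames = np.lib.stride_tricks.sliding_window_view(x, n_fft)[::n_fft // 2]
        mag = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=-1)).mean(axis=0)
        specs.append(20 * np.log10(mag + 1e-12))
    return np.mean(specs, axis=0)

def matching_curve(target_db, source_db, max_gain_db=6.0, smooth_bins=31):
    """Per-bin gain (dB) nudging the source spectrum towards the target."""
    diff = target_db - source_db
    kernel = np.hanning(smooth_bins)
    diff = np.convolve(diff, kernel / kernel.sum(), mode='same')  # ignore fine detail
    return np.clip(diff, -max_gain_db, max_gain_db)               # keep corrections gentle

# Hypothetical usage: curve = matching_curve(average_spectrum(reference_tracks),
#                                            average_spectrum([my_mix]))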
But ‘listens’ may be the wrong expression in this instance: Tonal Balance Control,
and other products like it (LANDR and CloudBounce claim to offer similar AI-based
mastering services online), belong to a class of technologies that align music with Big
Data. Other forms of AI and machine learning have been introduced into software for
vocal processing, tracking and mixing and, taken as a group, the aims of these automated
processes appear to be efficiency, ease of use and low cost, but they also emphasize
notions of control, conformity and perfection over creativity and experimentation. Most
notable, perhaps, is that in drawing on vast data banks of recorded musical repertoire,
these production technologies bear a resemblance to the type of algorithms utilized by
streaming services to make recommendations to consumers. Production and consumption
have always been linked in complex ways within the music recording industry, but now
that music has become more or less completely aligned with software and digital services
the links between production and consumption appear to have taken on a new kind of
digital logic.
Conclusion
The transition from analogue to digital audio recording and reproduction has been a
complex technical and aesthetic process some five decades in the making. It has not been a
singular transition: as I have attempted to demonstrate, there have been several overlapping
transitions in technology that have each made different contributions to the present
assemblage that we now know as ‘digital’ recording technology in music. And even that
may now be in a new state of transition, playing out the digital logics presently governing
the broader networks of production, distribution and consumption in contemporary
society (see Théberge 2004; Watson 2015).
But for all the visible changes in technological form – the vast number of hardware devices
that have become obsolete (or simply displaced) as their function has been absorbed by,
or emulated in, software – the underlying principles of multitrack recording, overdubbing,
processing, mixing and remixing appear to have remained at the core of digital recording,
albeit vastly expanded and enhanced by possibilities offered by computer-based editing,
arrangement, visualization and automation. The continuation and expansion of these
core principles suggest there may be an underlying musical logic that binds together both
analogue and digital recording technologies, and their associated practices, into something
that is greater than either of them.

Bibliography
Bates, E. (2016), Digital Tradition: Arrangement and Labor in Istanbul’s Recording Studio
Culture, New York: Oxford University Press.
Bennett, S. (2018), Modern Records, Maverick Methods: Technology and Process in Popular
Music Record Production 1978–2000, New York: Bloomsbury Academic.
Bolter, J. D. and R. A. Grusin (1999), Remediation: Understanding New Media, Cambridge,
MA: MIT Press.
Born, G. and K. Devine (2016), ‘Gender, Creativity and Education in Digital Musics and
Sound Art’, Contemporary Music Review, 35 (1): 1–20.
Brown, H. (2019), ‘“We’ve Crossed the Threshold”: How Ricky Martin’s “Livin’ La Vida Loca”
Became the First No. 1 Song Made Entirely in Pro Tools’, Billboard, 10 May. Available
online: https://www.billboard.com/articles/columns/latin/8510936/ricky-martin-livin-la-
vida-loca-pro-tools-desmond-child (accessed 10 May 2019).
Danielsen, A. (2010), Musical Rhythm in the Age of Digital Reproduction, Aldershot:
Ashgate.
Dockwray, R. and A. F. Moore (2010), ‘Configuring the Sound-Box, 1965–72’, Popular Music,
29 (2): 181–197.
Eno, B. (1983), ‘The Studio as Compositional Tool – Part I & II’, Down Beat, 50 (7/8): 56–57,
50–53.
Greene, P. D. and T. Porcello, eds (2004), Wired for Sound: Engineering and Technologies in
Sonic Cultures, Middletown, CT: Wesleyan University Press.
Howard, D. O. (2016), The Creative Echo Chamber: Contemporary Music Production in Kingston, Jamaica, Kingston, Jamaica: Ian Randle Publishers.
Jones, S. (1992), Rock Formation: Music, Technology and Mass Communication, Newbury Park,
CA: Sage Publications.
Kealy, E. R. (1979), ‘From Craft to Art: The Case of Sound Mixers and Popular Music’,
Sociology of Work and Occupations, 6 (1): 3–29.
Lambert, M. (1991), ‘Digital Audio Workstations: Whither the Future?’, Mix, 15 (9): 28–43.
Manuel, P. (1993), Cassette Culture: Popular Music and Technology in North India, Chicago:
University of Chicago Press.
Meintjes, L. (2003), Sound of Africa: Making Music Zulu in a South African Studio, Durham,
NC: Duke University Press.
Morris, J. W. (2015), Selling Digital Music, Formatting Culture, Oakland, CA: University of
California Press.
Moylan, W. (2002), The Art of Recording: Understanding and Crafting the Mix, Boston, MA:
Focal Press.
Pinch, T. and F. Trocco (2002), Analog Days: The Invention and Impact of the Moog
Synthesizer, Cambridge, MA: Harvard University Press.
Rumsey, F. (1990), Tapeless Sound Recording, Boston, MA: Focal Press.
Savage, S. (2011), The Art of Digital Audio Recording: A Practical Guide for Home and Studio,
Oxford: Oxford University Press.
Schloss, J. G. (2004), Making Beats: The Art of Sample-Based Hip-Hop, Middletown, CT:
Wesleyan University Press.
Smith, S. L., M. Choueiti and K. Pieper (2019), Inclusion in the Recording Studio?: Gender and
Race/Ethnicity of Artists, Songwriters & Producers across 700 Popular Songs from 2012–
2018, Los Angeles: USC Annenberg Inclusion Initiative.
Sterne, J. (2012), MP3: The Meaning of a Format, Durham, NC: Duke University Press.
Sterne, J. (2016), ‘Analog’, in B. Peters (ed.), Digital Keywords: A Vocabulary of Information
Society and Culture, 31–44. Princeton, NJ: Princeton University Press.
Strachan, R. (2017), Sonic Technologies: Popular Music, Digital Culture and the Creative
Process, New York: Bloomsbury Academic.
Théberge, P. (1989), ‘The “Sound” of Music: Technological Rationalization and the Production
of Popular Music’, New Formations, (8): 99–111.
Théberge, P. (1997), Any Sound You Can Imagine: Making Music/Consuming Technology,
Hanover, NH: Wesleyan University Press/University Press of New England.
Théberge, P. (2004), ‘The Network Studio: Historical and Technological Paths to a New Ideal
in Music Making’, Social Studies of Science, 34 (5): 759–781.
Wallis, R. and K. Malm (1984), Big Sounds from Small Peoples: The Music Industry in Small
Countries, New York: Pendragon Press.
Warner, T. (2003), Pop Music – Technology and Creativity: Trevor Horn and the Digital
Revolution, Aldershot: Ashgate.
Watson, A. (2015), Cultural Production in and Beyond the Recording Studio, New York:
Routledge.
Wolfe, P. (2018), ‘“An Indestructible Sound”: Locating Gender in Genres Using Different
Music Production Approaches’, in S. Bennett and E. Bates (eds), Critical Approaches to the
Production of Music and Sound, 62–77, New York: Bloomsbury Academic.
Zagorski-Thomas, S. (2012), ‘The US vs. the UK Sound: Meaning in Music Production in the 1970s’, in S. Frith and S. Zagorski-Thomas (eds), The Art of Record Production: An Introductory Reader for a New Academic Field, 57–76, Farnham: Ashgate.
Zak, A. (2001), The Poetics of Rock: Cutting Tracks, Making Records, London: University of
California Press.

Discography
The Beatles (1967), [LP] Sgt Pepper’s Lonely Hearts Club Band, Parlophone.
Martin, Ricky (1999), [CD single] ‘Livin’ La Vida Loca’, Columbia.
Nine Inch Nails (2005), [digital download] ‘The Hand That Feeds’, Interscope.
6
How Does Vintage Equipment Fit
into a Modern Working Process?
Anthony Meynell

Introduction
Contemporary students of record production, interested in the development of studio
practice and seeking to discover historic techniques, will find narratives that not only
mythologize the works of producers such as Spector or Martin but cast historic pieces of
equipment such as Pultec EQs and Fairchild limiters as mythical figures, venerated for
their role in defining the sound of popular music.
Indeed, marketing of software emulations of these historic brands focuses on the
associated recordings, unique personalities and role in creating a canon of influential
recordings, calling them the ‘tone titans of hundreds of hit records’ (Shanks and Berners
2018).
Whereas recording studio practitioners in the past concentrated on song, performer
and arrangement, viewing sound capture as a technical process provided by engineers,
modern recording practice incorporates the sonic design potentials of the control room,
using historic pieces of equipment as part of the overall creative process to recreate sounds
of specific eras and to lend a sense of creative randomness and distortion to otherwise pristine, controllable methods.
Hence, equipment originally designed as an option of last resort to repair or control performances becomes valorized for its creative potential and its reputation for use by renowned engineers on significant recordings. Items such as the Pultec and Fairchild, abandoned as soon as technically superior units became available, are rediscovered, with second-hand prices supporting the mythology of their users and software emulations at keen prices granting access to such pieces without clarifying their place in the modern
recording chain. Bennett (2012) concurs that ‘The digital appropriation of analogue
systems is particularly prevalent in software plug ins.’
This chapter considers the continued use and veneration of analogue equipment that
first appeared over sixty years ago. As old designs are reissued by various companies
copying original electrical circuits and using valves, and digital emulations are marketed as
‘tone titans’, deifying the original concepts, the question is how does this equipment fit into
a modern working process and why does it survive?
Audio processing techniques such as equalization and compression were developed
in the early part of the twentieth century. Equalization originated as a technological
solution to maintain spectral consistency and clarity over long-distance telephone lines by
compensating for high-frequency losses. Compression and limiting emerged in the 1930s
as an electrical method of automatically controlling audio peaks to protect radio broadcast transmission equipment, a task previously handled by manual gain riding:
Operators sat at a console with the sole job of keeping the audio at a constant level, and even
more importantly, preventing the program audio from jumping high enough to hit 100%,
pinch the carrier off and knock the station off the air. (Somich and Mishkind 2018)
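The principle of such automatic gain control can be illustrated with a bare-bones sketch. The Python fragment below simply reduces gain the instant a peak would exceed a threshold and lets the gain recover slowly afterwards; it models the general idea rather than the circuit of any particular broadcast limiter, and the threshold and release values are arbitrary assumptions.

import numpy as np

def peak_limit(signal, threshold=0.7, release=0.999):
    """Hold peaks below the threshold, recovering gain slowly between peaks."""
    gain = 1.0
    out = np.zeros(len(signal))
    for n, sample in enumerate(signal):
        if abs(sample) * gain > threshold:
            gain = threshold / abs(sample)      # clamp instantly, like a limiter
        else:
            gain = min(1.0, gain / release)     # drift back towards unity gain
        out[n] = sample * gain
    return out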

Hence units such as the Pultec and Fairchild were designed to be transparent control
devices, responding to the music in a pas de deux of discreet manipulation, to help
engineers achieve loud and clean programme audio, whether to modulate the transmitter
in the best possible way, or in use by recording studios to maximize signal and fidelity for
vinyl disc-cutting styli.
The chapter follows the fortunes of the Pultec equalizer as an example that has survived in popularity and reputation despite the technological changes that have spawned modern equivalent devices and reshaped working practices.

Pultec
In 1947, American manufacturer Pulse Techniques produced a programme equalizer,
licensed under Western Electric patents, designed to match timbres between recordings
when mastering from different studios or recording dates. The prescriptive tone of the
supporting documentation describes its specific design and targeted market as:
Used by major broadcasting networks, record companies and recording studios to add the
final touch to good program material, and to greatly improve program material previously
recorded on inferior quality or differing characteristics. (Pulse Techniques 1947)

The unit alters input tonal character based on the choice of attenuation or boost at certain
frequencies on the front panel, which controls a passive circuit design.1 A connected
compensating line amplifier maintains the signal strength, so it adds unavoidable
colouration by employing transformers and valves at this stage. This inherent characteristic,
an artefact of using valve equipment, though acceptable at the time given other signal to
noise fluctuations from tape hiss, mains hum and cross talk, etc., enhanced the recorded
signal and became an important feature in later use. Indeed, digital emulations also re-create the harmonic distortion of this circuit whether the equalizer is switched in or out.
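The kind of colouration being modelled can be approximated, very generically, with a soft-clipping waveshaper of the sort commonly used in software saturation. The Python sketch below illustrates that broad technique only; it is not how any named emulation of the Pultec's amplifier stage is implemented, and the drive value is an arbitrary assumption.

import numpy as np

def warm(x, drive=2.0):
    """Gentle tanh saturation, normalised back to the input's peak level."""
    y = np.tanh(drive * x)
    return y * (np.max(np.abs(x)) / np.max(np.abs(y)))

# A pure 100 Hz tone picks up odd harmonics (300 Hz, 500 Hz, ...) on the way
# through, which tends to read as 'thicker' rather than as obvious distortion.
fs, f0 = 48000, 100.0
t = np.arange(fs) / fs
tone = 0.5 * np.sin(2 * np.pi * f0 * t)
spectrum = np.abs(np.fft.rfft(warm(tone)))
for harmonic in (1, 3, 5):
    print(f"{int(harmonic * f0)} Hz level: {spectrum[int(harmonic * f0)]:.1f}")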
The front panel design is also unusual. Boost and attenuation are not combined but
on separate knobs, and emulations eschew standard interface ergonomics to copy this
unconventional layout for authenticity. Hence, operation does not follow common tacit
expectations.
Despite its primitive design and unusual layout of controls, the unit became legendary, and was one of the few commercially available to the recording industry, since corporate recording studios tended to use equipment designed ‘in house’.2
Although the 1950s are perhaps considered the heyday of valve-based technology,
outboard units found little use in a recording studio working practice that was geared
towards capturing the performance and the temporal and spatial information of session musicians playing in ensemble in strictly managed sessions.3 In Chasing Sound, Schmidt Horning
describes 1950s studio practice as the ‘art of controlling sound’ that relied on ‘an implicit
knowledge gained from experience’ (Schmidt Horning 2013: 126): a practice that focused
on microphone choice, placement and ambience as vital ingredients to maintain high
standards and create a high-fidelity sound mixed direct to tape.4 Since patching in outboard
equipment also meant adding noise, engineers avoided such units where possible.
Nevertheless, as the decade progressed, units such as the Pultec and Fairchild began to
find other uses in the studio, to ‘rein in’ loud amplified sounds that were emerging from
rock ‘n’ roll guitar bands, adapting to methods gleaned from scrutinizing recordings made
in independent studios, and while engineers continued to fight the medium of tape and to
limit noise, this combination of tape saturation and valve distortion created a recognizable
sonic character. McCartney observed that ‘valve equipment […] gives you a record-y type
sound – a pleasing sonic distortion part of the aural tradition in rock’ (McCartney quoted
in Zak 2001: 99).5
The exploration of the unit’s adaptability led users towards finding new ways to interpret
its functionality, discovering that while Pultec’s manual clearly states ‘Do not attempt to
boost and attenuate simultaneously on the low frequencies,’ doing so added a valuable bass
‘bump’, a feature unique in a world of otherwise crude shelving equalizers. This ‘creative
abuse’ (Keep 2005) was to become a key signature sound of the unit and followed the
notion of an anti-program (Akrich and Latour 1992: 261). A combination of further
technological innovations and engineers’ inventive adaptations to their use set the stage for
the exploitation of the recording studio as a new creative medium in the 1960s (Schmidt
Horning 2013: 138).
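What the trick does to the frequency response can be illustrated with a deliberately simplified, magnitude-only model. In the Python sketch below the boost and the attenuation are treated as two low shelves whose corner frequencies are offset; the offset and the gain values are assumptions chosen purely for illustration, not measurements of the passive network, but they show why the two settings do not simply cancel.

import numpy as np

f = np.logspace(1, 4, 400)                  # 10 Hz to 10 kHz
boost_db = 10.0 / (1 + (f / 60.0) ** 2)     # +10 dB low shelf, corner near 60 Hz
cut_db = -6.0 / (1 + (f / 150.0) ** 2)      # -6 dB low shelf, corner near 150 Hz
net_db = boost_db + cut_db                  # combined 'boost and attenuate' curve

for freq in (20, 40, 60, 100, 200, 400):
    i = int(np.argmin(np.abs(f - freq)))
    print(f"{freq:>4} Hz: {net_db[i]:+5.1f} dB")
# The printout shows extra weight in the lowest octaves tailing into a gentle
# dip above the selected frequency, rather than the two settings cancelling.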
While the record business followed the paradigm of increased fidelity and stereo
reproduction, parallel advances in circuit design and the adoption of the transistor as a substitute for the valve amplifier not only reduced inherent noise levels and power requirements, but
allowed for the miniaturization of component parts and opened the way for independent
console manufacturers to enter the market, for example Neve, who employed modular
convenience and bespoke flexible design to provide multichannel desks laid out in the now
familiar channel strip layout.
These innovations were often manufacturers’ responses resulting from customer
interaction and user requests to develop modifications. Kirby (2015) describes the
interactive development of the console equalizer at Olympic Studios between designer Sweetenham and engineer Grant, while Neve began experimenting with transistor
technology because he was unable to accommodate feature requests into his original valve-
mixer design, demonstrating that users can be said to share a technological frame with the
equipment’s designers (Oudshoorn and Pinch 2003).
Hence innovations once only available on outboard boxes, such as the Pultec, became
built into each channel of the mixing desk, and as studio working practice adapted to
the concept of serial recording with the introduction of efficient, quieter multitrack tape
machines, so the use of equalization changed to sculpting and filtering a multitude of pre-
recorded performances into a final soundscape. Active sweepable parametric equalizers
on three or four bands, together with hi and lo cut on each strip, emerged as standard,
replacing bulky valve outboard equalizer units; the development of equalizers, once a
technological innovation, was complete.6 The homogenized channel strip became the familiar face and main control surface of the studio, allowing control over spatial placement and tone shaping of individual instruments after recording.
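The contrast with a fixed-frequency programme equalizer can be illustrated with the standard peaking-filter ‘biquad’ from the widely circulated Audio EQ Cookbook formulas, in which centre frequency, gain and bandwidth are all freely adjustable. The Python sketch below is a generic textbook example, not a model of any particular console, and the test signal and settings are arbitrary.

import numpy as np
from scipy.signal import lfilter

def peaking_eq(fs, f0, gain_db, q):
    """Biquad coefficients for a peaking filter centred on f0 Hz."""
    a_lin = 10 ** (gain_db / 40.0)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * a_lin, -2 * np.cos(w0), 1 - alpha * a_lin])
    a = np.array([1 + alpha / a_lin, -2 * np.cos(w0), 1 - alpha / a_lin])
    return b / a[0], a / a[0]

# 'Sweep' the same band to three different centre frequencies on a noise burst.
fs = 48000
noise = np.random.default_rng(0).standard_normal(fs)
for f0 in (250, 1000, 4000):
    b, a = peaking_eq(fs, f0, gain_db=6.0, q=1.4)
    shaped = lfilter(b, a, noise)
    print(f"{f0} Hz boosted, output RMS = {np.sqrt(np.mean(shaped ** 2)):.3f}")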
While Théberge notes that the development of multitrack recording and associated
practices met the technical demands of contemporary new music – rock (Théberge 1989:
99) – the technological advancement also changed the sound of the recorded music.
Emerick describes the difficulty in moving from valve to transistor, from the Sgt Pepper’s
Lonely Hearts Club Band LP (1967) to the Abbey Road LP (1969):
There was presence and depth that the transistors just wouldn’t give me that the tubes did
[…] Abbey Road was the first album that was recorded through an EMI transistorized desk,
and I couldn’t get the same sounds at all. (Droney 2002)

Nevertheless, as studios kept up with technology, once valuable valve units, including Pultecs, became viewed as old-fashioned, noisy and redundant. Built-in
equalizers meant an external one was superfluous, and Pultec ceased manufacture in the
late 1970s.
The eventual EMI ‘sale of the century’ provided proof that the forward-facing industry
had no romantic attachment to equipment from the past; whether used on iconic recordings
or not, the technology was considered valueless, written-down industrial equipment, and was thrown away.
Indeed, as the 1970s progressed and a new generation entered the studio, so the tacit
knowledge of past techniques was lost. 1950s trained engineers, skilled in recording
ensembles in three-hour sessions on limited equipment, found themselves working for
months on end ‘with untrained musicians with just an idea of a song [… which] was an
affront to their profession and tantamount to deskilling’ and they left the industry (Schmidt
Horning 2013: 181).
The end of the 1970s marked a zenith in the pursuit of analogue hi fidelity, and
manufacturers changed their direction to convenience and control, focusing development
on automation, flying faders, and adding compression and noise gates on every channel,
as desks became the centrepiece of studios, while homogenization of design allowed
engineers to move between studios.
Whereas these large studios were locked into tape and large desk formats, embracing
the 1980s era of technological acceleration, Bennett also notes a practice developing where
lo and hi fidelity existed side by side, identifying the emergence of an anti-production
ethos:
Devoid of the perfection, clarity and polish so associated with technology-driven
productions. Nostalgia, technophobia or sentimentalism cannot be attributed to such
a technique, rather the knowledge that technologies are a means to an end. (Bennett
2010: 244)

This anti-production ethos was often the modus operandi of the burgeoning independent studio culture that thrived on second-hand equipment.7 While these large studios adapted quickly to the new technologies of the Musical Instrument Digital Interface (MIDI), drum machines and digital recorders, they also found a use for discarded valve technology, once part of traditional recording methods, as such devices took on a new role, adding colour and character to sounds.
Indeed, the cold and brittle digital sound demanded a hyped input that warmed up the
sound, and valve technology was found to be the saviour, entering a renaissance period
as manufacturers such as the Danish company Tube-Tech introduced new valve outboard units, including a replica of the original Pultec in 1985, for users who wanted a reliable version of the original, now rare, design.
Meanwhile, equipment brokers such as Tony Larking, who had provided a service
to small independent studios breaking large desks into rack-mounted devices, giving
access to high-end channel strips from large and obsolete desks that wouldn’t fit into
a room, also began to manufacture a new valve-driven outboard, designed not only to
equalize and compress but also sold to warm up the signal. Joe Bennett (2009) – Bath Spa
University – provides a typical endorsement: ‘The VTC […] it gives us that high quality
valve sound.’
Phil Harding at PWL also describes using two Pultec EQP-1As during this period, specifically on the bass drum, snare or claps, as an integral part of the recorded sound of the programmed sampled drums, and using the previously discovered unique trick of attenuating and boosting the same frequency at the same time. Hence the unit found a vital new role, even though the equalizer circuit was never designed with bass drums in mind (Harding 2010).
The users’ discovery of the ‘secret trick’ demonstrates a process that has only recently been
integrated into scholarship […] that users don’t always obey the rules, and that when they
don’t it can often have positive and creative results. (Zagorski-Thomas 2014: 129)

Keep concurs: ‘Innovation in record production has developed through the creation
of new sounds and is more likely to come from heuristic experimentation of existing
equipment, rather than adaptation of new technology, while in search for an elusive new
sound’ (2005).
So what was changing wasn’t the product but how it was used in the network
(Latour 2005), and the Pultec became a secret weapon. The tipping point of the digital
revolution can also be seen as the moment valve technology re-entered the studio
control room, alongside musicians and engineers who understood the colouration of
valve guitar amps, etc., while project studios and niche outfits created a demand for
hands-on manipulation as well as maverick use of MIDI, portable multitracks, ADATs,
samplers and other prosumer equipment lauded by the emerging music technology
press (Bennett 2010).
Hence the market for new and refurbished valve equipment became established, with rising prices, and a hybrid studio began to emerge as standard working practice, combining a plethora of analogue equipment on the input chain feeding into a digital workstation, with final mixing back through the existing large desk or a summing unit.
Although digital recording eventually replaced tape during the 2000s, saving studios tape costs, maintenance fees and space, traditionally trained engineers still drew the line at using ‘in the box’ equalization and compression, preferring familiar outboard equipment and citing constraints such as processing power and latency, along with suspicion of the core effect of the processing (that it sounded thin, flat, one-dimensional, cheap, etc.), echoing similar complaints made during the move from valve to transistor decades earlier. The inherent distortion characteristics were noticeably missing
from digital and so had to be compensated for with liberal use of outboard to recreate that
‘record-y sound’.
But as digital audio workstations (DAWs) eventually replaced mixing desks as the
central focus of recording, with improved stock equalizers and compression, so third-party software developers began branding otherwise anonymous computer code as emulations of vintage equipment. Companies such as Waves, UAD and Softube created lookalike and vaguely soundalike replicas of key equipment, including the Pultec EQP-1A, wrapped in graphical user interfaces (GUIs) and marketing that immersed the product in an aura of tradition, mojo and reference to historic and venerated recordings with celebrity
endorsement. Indeed, remixers are now brands in their own right with boutique designed
plug-in collections, while online tutorials, face-to-face masterclasses and extensive
marketing provide an informed perspective on a past once shrouded in secrecy and
industry intrigue. So training and education are also playing a vital role in the selling of the
vintage emulation to the student of record production.
Therefore, the Pultec has acted as a silent witness to changes in working practice as it
adapts to creative uses. In each example above, the functionality does not change but the way it is used does, so it plays a different role in the network, and interpretive flexibility appears at the user stage of the artefact as the equalizer is put to different purposes: first in radio transmission and as a programme equalizer, then as a general studio equalizer, then as a special bass trick, then as lo-fi distortion and warming for digital, then as a model for emulation, and finally the emulation is marketed as a particular frequency curve useful on drums or for air on a vocal, not as a general equalizer but for specific jobs. Even if the outboard version is not used it sells the cachet of vintage equipment, while the working practice may use the plug-in for convenience. It thereby becomes a selling point for the
studio. So the Pultec is more than fidelity; it has name value and vintage connections that allow
it to be incorporated into modern working practice.

Discussion
This modern working practice that vintage valve equipment finds itself in differs from the
era when the units were first designed. The original programme of action was to record
music. The programme of an equalizer and compressor is to ensure loudness and clarity.
With the affordance of equalization and compression on separate channels of a mixer
together with a multitrack, the new programme achieves better loudness, clarity and
separation, allowing a new working practice to emerge. ‘You are different with a gun in
hand; the gun is different with you holding it’ (Latour 1994: 33). The original programmes
of action therefore are both reshaped while becoming part of a new overall programme
of action. Using this example of translation, why would the engineer then use a Pultec to
equalize a track? What is it that modern equivalents cannot provide? Is it for a specific use,
as in PWL, to enhance certain frequencies, to add distortion or to add mojo, or because it
is there to satisfy a client? What does it do that the ears say nothing else will do? Why is it
worth so much trouble and expense to add that circuit? It is a primitive, noisy artefact, so
wouldn’t the job be better done with the desk equalizer or the stock plug-in?
Although professional users such as Andrew Scheps and Tom Lord-Alge both built
reputations on use and choice of outboard hardware,8 modern working practice demands
a circular rather than linear mixing approach, often working on concurrent mixes and
revisiting previous work for further adjustment before final sign off. Both have recently
moved to ‘in the box’ software mixing for pragmatic reasons. Scheps also prefers working ‘in the box’ ‘because no one comes over – they just get the mix […] what they hear is what they get is what they make a judgement on’, adding, ‘when they could see your rack they said – what are you using on the bass? I hate that compressor – and it changed their perception of what the bass sounded like even if they didn’t know’ (Music Faculty: University of Oxford 2018).
Lord-Alge admits his outboard is ‘gathering dust’: ‘today I am able to get the same sound
that I was getting from my outboard from plug-ins. […] It’s much easier, also, with respect
to recalling mixes’ (Tingen 2015b).
This perspective is also mirrored in the series of recent online interviews with
soundtrack composers at work in their boutique studios,9 revealing a broad picture of current creative working practice. Whereas opening conversations turn to the glamour of the hardware in the racks and associated stories, discussion of working practice reveals a constant move towards meeting deadlines and fast turnarounds; the advantage of hardware is described as input colouration and as a ‘device to make me get up and move
across the room so avoiding repetitive strain injury locked into the computer screen all
day’ (Gray).
All reveal a romantic connection to collected equipment that inspires the initial
creative spark:

The hardware makes you want to go to work […] you feel involved […] it’s how it makes
you feel […] software sounds the same but sweeping eq – I feel more involved, it’s how you
engage.
(Andy Britten)

Less gear constrains options […] Promotes creativity.
(Orbital)

Satisfaction of turning a knob and playing it like an instrument you never get from a plug in.
(Sefi Carmel)

Andy Gray, who owns an extensive synthesizer collection, explains that he uses a small MIDI keyboard to write, with MIDI converted to CV to trigger the original keyboards, before he concedes, ‘Kontakt is 99% of what I use to be honest.’
The host of the interviews, Henson, laments that from the perspective of the client, ‘“in the box” is less sociable, I can’t see what’s happening, I don’t want to
squint over someone’s shoulder, so there’s no point attending the mix any more’. However,
Scheps says ‘that’s what I want – don’t judge my work by my rack gear, listen to the music’.
Indeed, professional mix engineers have now accepted that initial reservations about sound have been overcome and that plug-in emulations sound the same as the hardware.
Paradoxically, whereas hardware is normally associated with being ‘professional’, the
opposite is also true. As professionals find that endless options of equipment choice are less
important than establishing a working method of templates and system management that
supports the overly complex multitrack and multistage decision process, self-producing
musicians, hobbyists and niche studios working on serial recording sessions create an
environment where there is less necessity to recall mixes. This limited method can embrace
analogue hardware because decisions have to be made, budgets and time are finite, and
mixes have to be agreed on the spot. Typically, ensembles who record in a traditional studio space to capture the feel and ambience, often semi-pro musicians at the end of the long tail who place value in using professional outboard equipment and traditional methods, rarely have the resources to return and tweak the result, and make a judgement based on seeing racks of expensive, rare outboard rather than on the sound of the mix. So using a Pultec suggests a certain aura even if it has not been switched on – it’s in the rack and you sense the vocals
or bass drum are affected and sound bigger because of it, thereby making a judgement not
only on what you hear but what is perceived as professional equipment.
Therefore, as iconic studios close or become prohibitively expensive for mixing, so the mojo now exists in associated items or methodology – the Pultec equalizer, the Glyn Johns drum technique or the Phil Spector sound – which may not be appropriate but are metaphors for
professional knowledge and embodied heritage, where hardware acts as an investment and
selling point. Although limited in use, it suggests a calibre of fidelity, as a window-dressing
studio specification to attract clients who are informed via Gearslutz-style discussion
boards and staged nostalgia of ‘Mix with the Masters’-style re-enactments.
Crucially, whereas the vintage Pultec can survive because its design can be copied,10
and circuit diagrams and components are readily available, even as do-it-yourself kits,
more ‘modern’ electronic items such as tape machines that incorporate logic boards and
other computer-controlled components have not survived as they rely on crucial items no
longer manufactured, so are defunct. Indeed, one can argue that the in-built obsolescence
of computerization, abandoned operating systems and connecting systems creates a vortex
of updates and abandoned formats where vintage gear is the only long-term reliable item in
the room. Windows 95 PCs, ISA cards, 1630 tape, zip discs, rs422 cables, etc., the industry
is strewn with equipment that cannot connect to each other or play back vital recordings,
whereas a valve machine can connect to a transistor unit and a tape recorded fifty years ago
can play back on any 2- to 24-track machine.
While vintage gear is being used as a metaphor for creativity, professionalism and style by association with celebrity recordings, rather than perhaps being the correct choice for a given recording, the branding is being used in software to create an aspirational product out of anonymous programming, implying that, no matter how dull your recordings, this plug-in will make them Technicolor like the old days. Indeed, the description says you don’t even have
to switch it on for it to work its magic. The implication is that old engineers sounded good
because they used valve equipment, glossing over the artistry, musicianship, room, tacit
knowledge and work experience, and with the same gear you can too.
But if software equivalents of vintage hardware do not have the advantage of tactile
interaction and are simply selling you the idea that if it is used on old records it must
be good, then why use an emulation of a sixty-year-old design when a modern process
may achieve better results? While the plug-in market is overflowing with permutations
of saturators, distortion and tape emulations, why use the version cloaked in a vintage
GUI when you can use an alternative at half the price to do a better job? How strong is the
marketing and who is responding to that? Whose ears are you trusting?
Developments in machine learning and AI now provide plug-in equalizers that
can match famous recordings, provide preset combinations of settings and create aural
equivalents of sepia, retrolux or monochrome filters, etc. Industry commentators bemoan
that these innovations remove the human element or nuance of possibilities, promoting a
Fisher-Price ‘my first mix’ reliance on suggested presets (Senior 2017), an approach that
doesn’t tell you how or why to equalize. This argument also echoes Lanier’s proposition
that modern technology, while giving an illusion of empowerment, is increasingly about the removal of liberty and the homogenization of the user base, where ‘keeping up with new technology
actually ends up shepherding the creation process along quite restrictive lines’ (Pattison
2013). The idea is that not knowing what is ‘under the hood’ in the algorithm underscores
a suspicion that the device may also be biased towards evaluating a generic or safe solution
rather than a creative and ingenious hunch based on outside the box circumstances or
wider network influences. Indeed, developers concede that cost and processing limitations
stop them modelling the absolute boundary responses of hardware units (Lambert 2010),
yet these responses, when things go wrong or break down, are the very nuances that created
the palette of undiscovered sounds and ideas that enabled the trick or special identity and
drew users to the original hardware.
Hence, using vintage outboard equipment suggests the work involves specialized
craftsmanship and reflects the handmade personality of the maker, the imprint of
humanity and the feeling of being in control. Indeed, using hardware becomes the industry
anti-programme rather than an expression of nostalgia or anti-technology, as the DAW
environment increasingly depends on learning new versions, renewing licences and
discovering your new updated operating system has just deleted your favourite legacy
plug-ins and won’t open previous session templates and mixes. As Townshend says:
It looks like vanity or elitism. But what we know about vintage studio equipment is that it
makes us feel different about what we do, and how we do it, in the studio […] we are following
in a long line of studio process and tradition that reminds us that if we use these great vintage
tools carefully, but audaciously, we might break new ground all over again. (2018)

The three case studies


The following three examples of a mixer, a producer and a recording engineer provide contrasting methodological frameworks showing how users embrace vintage gear.11

Kesha – ‘Praying’ 2017


Mixer Jon Castelli (Lady Gaga, Ariana Grande) combines outboard valve equipment with
an ‘in the box’ strategy in his LA studio. ‘The reason I have the tube gear is because it
creates harmonic content that I don’t believe exists in the digital realm’ and he describes
the sonic advantage as warm, musical and fat, with more headroom acting as a safety net
against digital clipping. His VLC-1 console is ‘based on a vintage RCA preamp tube circuit,
with a Pultec-style EQ on every channel’, and comes up as inserts in Pro Tools. He used one
channel as an insert on the lead vocal and two channels on the mix bus for Kesha’s song. He
employs this console and outboard on 80 per cent of mixes as a matter of taste but accepts
that he can also achieve depth, separation and clarity ‘in the box’: ‘I’m not precious about
my analogue gear anymore, despite spending a lot of money on it!’
After an ‘in the box’ rough mix was approved by Kesha, her manager and record
company, Castelli decided to redo the rough mix incorporating his outboard analogue
equipment. Although Kesha and her label agreed it was better, her management preferred
the timbre of the original rough mix and asked for adjustments.
Castelli describes the rough mix as ‘bedroom’, raw and drier, whereas the analogue remix
was ‘platinum’, lush, gigantic, hi-fi, clear, top 20, but he conceded it was the rawness that the
people responded to, that gave ‘goosebumps’. Six weeks and nineteen further remixes failed
to deliver an accepted result using the hybrid of analogue and plug-in processing. The final
approved mix was an edited amalgam of the original ‘bedroom’ mix for the verses, spliced
with the ‘platinum’ mix from the second chorus onwards.12 The choice of mix selection
rested on which soundscape complemented the lyrical performance, which went from
intimate in the verse to bombastic in the chorus. The description of ‘bedroom rough mix’
belies the fact that it was ‘70 tracks processed with several hundred instances of plugins’, mostly
UAD vintage and tube emulations. The platinum mix reduced this arrangement to thirty-
three stem tracks, which were further treated with outboard and vintage plug-ins.
The above example illustrates that although the mixer’s creative preference may be to
incorporate vintage hardware equipment to add harmonic distortion to colour the sound
and also provide a creative tactile relationship to the process, the workload increasingly
dictates an abandonment of this style of working, staying within a digital environment
to manipulate a complex hyper-fidelity approach of layering, processing and managing
sounds and mix decisions. Thus the mixer preferred to use his prerogative to employ tacit
skills to achieve an accepted paradigm of fidelity, which conflicted with the instinctive
judgement of the wider network and how the mix made them feel. The plug-ins, although
precise emulations, did not achieve the same sonic openness, although they created a more
than acceptable result.

Clean Bandit – ‘Solo’ 2018


Following five years of mixing exclusively ‘in the box’, former session musician turned
producer and mixer Mark Ralph (Hot Chip, Years & Years, Rudimental, Jess Glynne) now incorporates a hybrid approach to create electronica/organic soundscapes from his SSL-equipped facility:13
When mixing, I get the mix up to about 70 percent, using desk EQ and compression and
analogue outboard to shape the sound. When it’s time to start the automation, I transfer all
48 channels back into the same Pro Tools session, get rid of the tracks I don’t use, and finalise
the mix inside the box, with the four groups coming back up on the desk again.

Ralph further states ‘the creative process by which I arrive at a sound and the way in
which I perform with a piece of hardware is completely different to staring at a computer
screen, moving a mouse around a picture of a piece of equipment’. Although accepting
plug-ins can achieve similar sonic signatures to hardware counterparts and acknowledging
the disadvantage of losing plug-in automation, instant total recall and versatility, he argues that outboard forces a commitment to sounds before digitizing, avoids later manipulation to fix and correct, and results in smaller final arrangements and less processing:
I find the clarity and separation of summing through the mix bus section of the desk much
better than summing in digital […]. I find it easier to make judgements in analogue at an
early stage, when you need to make important decisions […] the moment I began splitting
things out over a desk again, it gave me 10 to 20 percent more clarity and separation.

Ralph also describes how the ‘physical contact environment’ engenders an intuitive and
spontaneous performative creative process that extends the studio as instrument ethos
into a social event, especially relevant when working with musicians in the control room.
‘Everyone can get involved. That is impossible when it’s just one guy working in the box
[…] When I have bands in the studio, multiple people can play multiple pieces of hardware
at the same time.’
He extends the notion of vintage outboard to synths and often-forgotten early reverb
and effect units famous for their grainy digital sound, noting that the largest piece of outboard is the desk.
Although the methodology of printing outboard processing into Pro Tools and sending submixed stems back out to capture a final SSL buss compressor mix appears an elaborate workaround to incorporate outboard and maintain a tactile relationship with the mix process, he argues that the benefits of the approach outweigh any logistical constraints and produce
better results.
Nevertheless, he accepts: ‘In the old days, recalls would take two hours on an SSL so you
didn’t do them too often. But today there are endless requests for changes. By the time you have
done all those minute changes, you can sometimes end up with more than 50 mix versions.’

Bob Dylan – ‘Shadows in the Night’ 2014


Engineer and mixer Al Schmitt (twenty-four Grammy Awards) has been recording
since 1957 and is renowned for his vast experience and tacit understanding of historic
methodology, eschewing equalization and compression for microphone choice and
placement, capturing ambience and spill, and committing ensemble performances direct
to tape, often riding the vocal-track volume fader while recording to avoid later processing.
Dylan’s album of Sinatra covers was recorded as a live ensemble at Capitol Studios, Studio B, in Los Angeles, using seven microphones to record Dylan and his five-piece combo through the Neve 8068 desk to a 24-track analogue tape machine and also direct to stereo. Indeed, three songs were mixed live to two-track during the session and became the final masters. The final mix of the other seven tracks simply incorporated level adjustments with no further processing or edits. No headphones were used; live room balancing was a matter of placing musicians in a semicircle so they could see and hear each other. Additional
horn performers on three tracks were positioned away from the main ensemble but mainly
picked up by the omni mic in the centre of the semicircle.
Schmitt concurs:

A lot of the time was spent on making sure that each musician was playing the right parts,
with the right performances. We also wanted to make sure that everyone was comfortable
and could hear each other […] Sometimes the very first take would be the take, so there was
nothing to adjust, but most of the time after listening to it, they had their ideas, and I would
say that I would need a little bit more volume here, or little bit less there, and I asked them to
adjust that in the room. When there was a guitar solo, he just played a little louder. I did not
want to be riding faders, I wanted it to be natural. I rode faders on the vocals, but for the rest,
once I set it up they balanced themselves in the room. After this there was very little for me to
do. That was it. There was no editing, no fixing, no tuning. Everything was just the way it was.

Dylan describes the vocal sound as ‘the best he’s heard in 40 years’. Schmitt used a U47
and Neve 1173 for vocals, with Audio-Technica ribbon microphones on the instruments,
noting: ‘The only compression I used on the entire album was on Bob’s voice, a tiny bit of
an old mono Fairchild. I barely touched it, I used it mainly for the tube sound. It just added
some warmth. On the desk I also mixed in some of Capitol’s live chamber number four on
his voice.’
Hence the compressor wasn’t used for its original design intention but its inherent
character was used to shine a sonic spotlight and further lift the vocal in the mix. Indeed,
the ribbon microphones, desk and tape machines were ‘modern’ in design and Schmitt
relied on ensemble working practice to capture the ‘old school’ character of the songs and
performances rather than employ an elaborate array of available vintage equipment to
further imitate the historic spirit of Sinatra’s original session.
Although Schmitt has successfully mixed ‘in the box’, his preference is to use a
console, which not only matches his methodology but has clear sonic advantages, while
the simplicity provided by committing to sonic decisions during recording avoids later
complex mixing scenarios.

Conclusion
Historic recording studios are endowed with mythology as representing some of the most
creative, uplifting and noble spaces. These spaces still serve symbolically to reinforce the
spirit of a golden age of recording, as palaces of expertise where engineering experience
and interaction with technology create a powerful sense of importance.
In recent years, recording spaces have been altered to accommodate an increasingly
digital environment and have adapted to changes in working practices emerging from new
musical styles, abilities and declines in recording budgets.
Tacit knowledge of the working practices that incorporated vintage equipment such as the Pultec, designed for specific uses in the 1950s, is now invisible and mythologized in the context of current practice. This mythology is mostly the marketing of tradition for
branding.
Whilst the Pultec has an undeniable sonic signature, its use as an equalizer is limited
compared to modern technology. Its continued function is a product of adapting to the
changes in working practice rather than changes in specification.
The Pultec has survived not because its functionality was changed but because the way it is used was, so it has played a different role in the network:
First in radio transmission,
Then as a programme equalizer,
Then as a special bass trick,
Then as lo-fi distortion,
Then as warming for digital,
Then as a model for emulation,
Then as a selling point for studios.

Its adaptability belies its simplicity, and its old technology, with components still readily available, ensures it is a repairable and repeatable design, unlike later technology that incorporated integrated circuits and now redundant computer protocols.
Even if left unused it sells the cachet of vintage equipment, while the working practice
may use the plug-in for convenience. Hence it becomes a trophy, a statement art piece that
confers an aura of tradition, mojo and reference to historic and venerated recordings.
While analogue is often associated with the ‘professional user’, the opposite is also true.
Famous ardent analogue users are now self-confessed ‘in the box’ users because the industry
demands it, as the job is not only creative, it’s recall, organization, storage, etc., even if they
have racks upon racks of high-end outboards. Nevertheless, commentators often appear
polarized, seeing mixing ‘in the box’ as being a causal agent of change rather than as an
opportunity for change. As Henson says, ‘I want to see what you are using to feel part of
the process,’ whereas Scheps says ‘you don’t hear the gear only the end result’. Meanwhile,
hobbyists, tinkerers and niche studios maintain a prodigious enthusiasm for otherwise
redundant technology, to create recordings bearing the patina of a previous epoch.
The above studio examples illustrate that hardware is still in use for tracking, adding
familiar harmonic content to the sonic signal prior to digitization, by engineers who just
like to be hands-on and make better decisions when turning knobs, or by users who want
to have fun unencumbered by client turnaround time or the economics of running a
business. However, there is a danger that the studio as an instrument can become the studio as an indulgence, with no commercial thought beyond creative freedom and no deadline.
Rediscovering the merits of valve equipment in a digital world also involves the deification of vintage technology and its emulations, and changes to the story, as the collective memory of working practice is replaced by a romantic view of what we think it was like, misunderstanding the role of vintage equipment in historic hit records and recasting it as another tool within the context of current methodology rather than as a tool designed to serve an industry dealing with tape hiss, distortion and working practices developed for radio broadcast and live recording. Nevertheless, while its original
use may not provide an adequate solution amongst the plethora of modern technological
options, new groups of users have applied subsequent interpretations, influenced by the
scope and adaptability of the unit’s functionality, allowing divergent interpretations to be
realized, sustaining the reputation and longevity of the technology.

Notes
1. Passive electrical circuits use components that do not rely on electrical power for
operation but the circuit results in a drop in signal power.
2. For example, in 1951 EMI designed a parametric disc-cutting equalizer Universal Tone
Control (UTC) or ‘curve bender’, which played a similar role to the Pultec ‘to improve
the tonal quality of recordings acquired from some external source’, implying that EMI
recordings did not need fixing but inferior recordings from outside studios did. Ken Scott
suggests many EMI engineers learnt about equalization from playing around with the
UTC as 2nd engineers cutting acetates. The corporate working practice is underscored
with the protocol that the unit was not commonly allowed on recording sessions, since
modifying a sound too far from its original state conflicted with their ‘true fidelity’ ethos.
Nevertheless, The Beatles were granted exclusive use of the UTC, which can be seen in
Sgt Pepper studio photographs (Ryan and Kehew 2006: 151).
3. While technological advancements and experimentation in recording studio practice
during the 1960s signalled the emergence of multitracking, labour agreements dating
back to the Second World War between the broadcast and recording industries, and
the American Federation of Musicians in the United States (and Musicians Union
in the United Kingdom) established modes of working that continued to favour live
performance. These unions sought to protect their members from a post-war music
industry based on selling records rather than on live music, dictating terms such as no
recorded overdubs, which lasted into the 1960s (Meynell 2017).
4. Malcolm Addey concurs, ‘at EMI equalising and limiting at the time of mastering was the
order of the day. In fact the first time I wanted to use a limiter on vocal only in the studio,
a memo was shown to me expressly forbidding such a thing and I had to have that order
waived at managerial level just to get it patched in!’ (Massey 2015: 31).
5. Kehew notes that, ‘every time The Beatles engineer added an eq or compressor in circuit,
so a tube driven line transformer has to be added to bring up the signal from the passive
box. Abbey Road Studios had a very un-orthodox standard for impedances [… the]
Standard [for] both the incoming and outgoing signal was 200 ohms. The result was
that a lot of line amps were needed. For example the REDD 37 mixing desk needed
31 Siemens V72S valve amps’ (Ryan and Kehew 2006: 75).
6. In 1971, Daniel Flickinger invented his circuit, known as ‘sweepable EQ’, which allowed
an arbitrary selection of frequency and gain in three overlapping bands. ‘I wrote and
delivered the AES paper on Parametrics at the Los Angeles show in 1972 […] It’s the first
mention of “Parametric” associated with sweep-tunable EQ’ (Massenburg 1972).
7. Much like 1950s American studios, repurposed radio broadcast tape machines and
mixing desks.
8. He still owns £750,000 of outboard gear, recently installed at Monnow Valley Studios, Wales.
9. Creative Cribs is a series of extensive interviews exploring creative working practice in
the context of film and sound design (see Spitfire Audio 2019).
10. The original Western Electric patent has expired.
11. All case studies are articles published in Sound on Sound magazine in the series ‘Inside
Track: Secrets of the Mix Engineers’, and all written by Paul Tingen; see Tingen 2015b,
2017, 2018.
12. A similar solution to George Martin’s approach following Lennon’s request to combine
two separate recordings of ‘Strawberry Fields Forever’.
13. He has developed the former Beethoven Street Studio in London into a multiroom
production complex.

Bibliography
Akrich, M. and B. Latour (1992), ‘A Summary of a Convenient Vocabulary for the Semiotics of
Human and Nonhuman Assemblies’, in W. E. Bijker and J. Law (eds), Shaping Technology,
Building Society: Studies in Sociotechnical Change, 259–64. Cambridge, MA: MIT Press.
Akrich, M., M. Callon, B. Latour and A. Monaghan (2002), ‘The Key to Success in Innovation
Part I: The Art of Interessement’, translated by A. Monaghan, International Journal of
Innovation Management, 6 (2): 187–206.
Bennett, J. (2009), ‘2009 June’, Joe Bennett (blog). Available online: https://joebennett.net/2009/06/page/2/ (accessed 18 August 2019).
Bennett, S. (2010), 'Examining the Emergence and Subsequent Proliferation of Anti-Production Amongst the Popular Music Producing Elite', PhD thesis, University of Surrey, Guildford.
Bennett, S. (2012), ‘Endless Analogue: Situating Vintage Technologies in the Contemporary
Recording & Production Workplace’, Journal on the Art of Record Production 7 (November).
Billboard (1980), ‘EMI Sale of Century’, Billboard, 30 August.
Droney, M. (2002), 'Geoff Emerick', Mixonline, London: Future plc.
Harding, P. (2010), PWL from the Factory Floor, London: Cherry Red Books.
Keep, A. (2005), ‘Does Creative Abuse Drive Developments in Record Production?’, Paper
presented at the First Art of Record Production Conference, University of Westminster,
London. Available online: https://www.artofrecordproduction.com/aorpjoom/arp-
conferences/arp-2005/17-arp-2005/72-keep-2005 (accessed 22 August 2019).
Kirby, P. R. (2015), ‘The Evolution and Decline of the Traditional Recording Studio’, PhD
thesis, University of Liverpool, Liverpool.
Lambert, M. (2010), ‘Plug-in Modelling’, Sound on Sound, August 2010. Available online:
https://www.soundonsound.com/techniques/plug-modelling (accessed 22 August 2019).
Latour, B. (1994), 'On Technical Mediation', Common Knowledge, 3 (2): 29–64.
Latour, B. (2005), Reassembling the Social: An Introduction to Actor-Network-Theory, Oxford:
Oxford University Press.
Massenburg, G. (1972), ‘Parametric Equalization’, in Proceedings of the 42nd Convention of the
Audio Engineering Society, Los Angeles, CA, 2–5 May.
Massey, H. (2015), The Great British Recording Studios, Milwaukee, WI: Hal Leonard
Corporation.
Meynell, A. (2017), ‘How Recording Studios Used Technology to Invoke the Psychedelic
Experience: The Difference in Staging Techniques in British and American Recordings in
the late 1960s’, PhD thesis, University of West London, London.
Music Faculty: University of Oxford (2018), Andrew Scheps at the University of Oxford – ‘What
Comes Out of the Speakers’, YouTube, 8 January. Available online: https://www.youtube.
com/watch?v=HVCdrYbUVW8 (accessed 22 August 2019).
Oudshoorn, N. E. J. and T. Pinch (2003), How Users Matter: The Co-construction of Users and
Technologies, Cambridge, MA: MIT Press.
Pattison, L. (2013), ‘Boards of Canada: “We’ve become a lot more nihilistic over the years”’,
The Guardian, 6 June 2013. Available online: https://www.theguardian.com/music/2013/
jun/06/boards-of-canada-become-more-nihilistic (accessed 22 August 2019).
Pulse Techniques (1947), Pultec Manual EQP-1, West Englewood, NJ: Preservation Sound.
Available online: http://www.preservationsound.com/wp-content/uploads/2012/02/Pultec_
EQP-1.pdf (accessed 22 August 2019).
Ryan, K. and B. Kehew (2006), Recording The Beatles, Houston, TX: Curvebender Publishing.
Schmidt Horning, S. (2013), Chasing Sound: Technology, Culture, and the Art of Studio
Recording from Edison to the LP, Baltimore: Johns Hopkins University Press.
Senior, M. (2017), ‘iZotope Neutron’, Sound on Sound, January 2017. Available online: https://
www.soundonsound.com/reviews/izotope-neutron (accessed 22 August 2019).
Shanks, W. and D. Berners (2018), ‘UA’s Art and Science of Modeling UAD Plug-Ins, Part 2 |
Universal Audio’, Universal Audio (blog). Available online: https://www.uaudio.com/blog/
ask-doctors-ua-modeling-plug-ins/ (accessed 22 August 2019).
Somich, J. and B. Mishkind (2018), ‘Sound Processing: A History of Audio Processing’, in
B. Mishkind (ed.), Broadcasters’ Desktop Resource. Available online: https://www.thebdr.
net/articles/audio/proc/ProcHist.pdf (accessed 22 August 2019).
Spitfire Audio (2019), ‘Creative Cribs’. Available online: https://www.spitfireaudio.com/
editorial/cribs/ (accessed 1 August 2019).
Théberge, P. (1989), 'The "Sound" of Music: Rationalization and the Production of Popular Music', New Formations, 8: 99–111.
Tingen, P. (2015a), ‘Al Schmitt: Recording Bob Dylan’s Shadows In The Night’, Sound on
Sound, May 2015. Available online: https://www.soundonsound.com/techniques/al-
schmitt-recording-bob-dylans-shadows-night (accessed 22 August 2019).
Tingen, P. (2015b), ‘Inside Track: Tom Lord-Alge’, Sound on Sound, January 2015. Available
online: https://www.soundonsound.com/people/inside-track-tom-lord-alge (accessed
22 August 2019).
Tingen, P. (2017), ‘Inside Track: Kesha “Praying”’, Sound on Sound, November 2017. Available
online: https://www.soundonsound.com/techniques/inside-track-kesha-praying (accessed
22 August 2019).
Tingen, P. (2018), ‘Inside Track: Clean Bandit “Solo”’, Sound on Sound, November 2018.
Available online: https://www.soundonsound.com/techniques/inside-track-clean-bandit-
solo (accessed 22 August 2019).
Townshend, P. (2018), ‘The Fairchild 660/670 Tube Compressor’, Vintage King. Available
online: https://vintageking.com/fairchild-660-670-compressor-limiter (accessed 22 August
2019).
Zagorski-Thomas, S. (2014), The Musicology of Record Production, Cambridge: Cambridge
University Press.
Zak, A. J. (2001), The Poetics of Rock: Cutting Tracks, Making Records, Berkeley, CA: University
of California Press.

Discography
The Beatles (1967), [LP] Sgt Pepper’s Lonely Hearts Club Band, Parlophone.
The Beatles (1969), [LP] Abbey Road, Parlophone.
Clean Bandit (2018), [digital download] ‘Solo’, Atlantic.
Dylan, Bob (2015), [CD] Shadows in the Night, Columbia.
Kesha (2017), [digital download] ‘Praying’, Kemosabe/RCA.
Sinatra, Frank (1957), [LP] Where Are You?, Capitol.
Part III
Places

There are two aspects of the notion of spaces in music production. The first, which is what
this part is predominantly concerned with, is of the studio as a place of work or a site of
creative practice – as a cross between the factory or workshop and the artist’s studio. The
second, which is dealt with to some extent by various chapters in parts five, six and seven,
is the question of what we hear in the recorded music. As Zagorski-Thomas discusses in
the first chapter, we hear the sound of something happening somewhere. That somewhere
may be a schematic, even impossible or surreal, somewhere, but it cannot be nowhere. We
may be able to use the idea of something happening nowhere or in a non-place as a kind
of metaphor or image but it is not something that we imagine as a physical phenomenon.
And it is interesting how important it seems to be that the special ritual of musicking
should take place in a special space. The sound of special spaces seems to go back to the
beginnings of human art – cave paintings are often in the most acoustically resonant part
of a cave complex – and we consistently build acoustically special spaces as our places of
ritual. The Gonbad-e Qabus tower in Iran, the Hungarian Fertőrákos Cave Theatre, the
5,000-year-old temple at Hypogeum Hal Saflieni in Malta, the Gol Gumbaz Mausoleum
in Karnataka, India, and the great cathedrals of Europe all bear testament to
humanity’s recognition of acoustics as an important contributor to the rituals of life. And
as the sensitivity of recording technology improved such that we could hear the detail of
the reverberant sound in the space where the recording was taking place, we started to
take more care over the control and manipulation of those reverberations. The temples
of sound in the 1940s and 1950s, such as Columbia’s 30th Street Studio in New York –
a large deserted church converted into a studio in 1948, which used a complex array of
reflectors, absorbers, acoustic screens and parabolic reflectors to shape the sound of the
space and capture it from different vantage points in the room to create a controlled and
pristine soundscape – began the creation of the sound of music in a special space that has
continued into the age of mechanical and electronic reverberation.
But the recording studio as a place of work is just as complicated and full of paradoxes
and dilemmas. The early recordists and sound engineers certainly treated it as an industrial
place of work – whether it was the factory-like recording facility of EMI’s Abbey Road
or one of the many small-scale enterprises that were started up by ex-servicemen with
electrical training after the war, which were mostly very like offices or technical workshops.
However, the visual impact was not what was important and despite the fact that, for
example, Stax Records’ studio was a converted cinema and Norman Petty’s Clovis Studios
looks like a small office complex, the sounds that emanated from these recordings were
still special. It is interesting that in the last half of the twentieth century, when the concept of high fidelity arose and then started to fade away again, the aesthetics of expensive- and cheap-sounding acoustic spaces, also connected to notions of sophistication and commerciality, started to emerge.
But if the sound engineers and producers were used to these spaces as places of work,
the artists and musicians who came into them developed a more fluid relationship. The
working methods of musicians in the 1960s and 1970s changed from treating studios as
another gig – a place where you came in and out much like a concert and expected a drab
green room or cafeteria and unpleasant toilets – into places where they went to be inspired
and creative, and to spend extended periods of time doing both of those things. And at the
same time, the balance of financial and cultural capital shifted. In the earlier part of the
century, the record companies had the financial capital but they also believed they had (and
did have) the cultural capital – the knowledge of what record-buying audiences wanted or,
at least, the ability to shape that demand. As the audiences and the musicians became
younger, the record companies felt that they were losing this cultural capital – they had to rely more on the musicians to make the creative decisions, and this gave the musicians more power and leverage. And, as that coincided with the rapid growth of revenues as
album sales grew in relation to singles, musicians were able to convert this power, leverage
and money into more conducive working conditions. So at the top end of the industry,
the big stars were putting studios into their houses and mansions and entrepreneurs
were creating resort-style residential studios, but in the mid- and lower-budget end these
ideas about comfort and inspiration also spread in the form of lounges and games rooms,
ambient lighting, domestic-style décor and comfortable furniture. One of the features that
Eliot Bates describes in his chapter, however, is the tension between the requirements of
acoustics and the ergonomics of musicians. The logistics of acoustic separation are often at
odds with the logistics of detailed aural and visual communication that allow performers,
technicians and producers to communicate and respond to each other’s creative practice.
7
Recording Studios in the First Half
of the Twentieth Century
Susan Schmidt Horning

Introduction
Early recording studios bore no resemblance to the glamorous dens of technological
marvel and musical creativity that came to exemplify studios of the late twentieth century.
In fact, the notion of the recording studio as a purpose-built space did not emerge
until 1931 when EMI completed its Abbey Road Studios, but even this was actually an
extensive renovation of a preexisting mansion (Massey 2015). The earliest spaces used for
recording were inventors’ laboratories and machine shops, rooms in office buildings and
travel recording locations ranging ‘from a grass hut to a palace’ (Sooy 1898). From the
1890s to 1930s, studios were concentrated on the East Coast of the United States and in
Great Britain in the greater London area, but from the very beginning, recording was a
global endeavour (Gronow and Saunio 1998; Burrows 2017). By 1912, the Gramophone
Company had branches operating in London, Paris, Madrid, Berlin, Brussels, St
Petersburg, Vienna, Budapest, Warsaw, Copenhagen, Stockholm, Alexandria, Calcutta
and Bombay. In the 1920s Columbia Graphophone started operating in Australia and
Japan. When the two companies merged to form EMI, their assets included fifty factories in nineteen countries and studios that were either permanent or in venues such as concert halls with which the company had agreements. By the 1960s, they had opened or acquired
studios in Alexandria, Egypt; Lagos, Nigeria; Johannesburg, South Africa; and Beirut,
Lebanon (Martland 1997).
In the United States, the original East Coast concentration soon gave way to studios
in Chicago and Los Angeles, and after the Second World War in Nashville, Memphis,
Detroit, New Orleans and Cincinnati, where independent studios and small record
labels flourished (Kennedy and McNutt 1999). In cities outside the major entertainment
centres, from Cleveland, Ohio (Schmidt Horning 2002), to Lubbock, Texas (Peoples
2014), and points in-between, dozens of independent studios arose to service a range
of clientele, from advertisers to touring performers and local talent, and these studios
became springboards for future recording stars. By 1970, an international directory
listed about 800 recording studios worldwide, with a little over a third of those outside
the United States (Billboard 1970).
Each of the major technological milestones in the primary medium of recording during
the analogue era – acoustical recording to electrical recording, disc to magnetic tape, mono to stereo and multitrack – brought about changes in studio design as well as engineering
practice and record production (Cunningham 1996; Burgess 2014). The speed of change,
the source of innovation, and the design and construction of the technology varied from
studio to studio, but all studios, whether the flagship operations of a major label such as
Columbia or the small independent studio with no label affiliation, evolved along similar
lines. This chapter charts that evolution and surveys a variety of studio environments and
practices in the United States and Great Britain where the majority of studios were located
during the first century of sound recording.

The acoustic recording studio


The earliest recording studios were laboratories in which inventors and mechanics
experimented with methods of capturing sound using acoustical recording machines and
various sizes and shapes of horns. In Thomas Edison’s laboratory in Menlo Park, New
Jersey, the staff undertook extensive experimentation with a wide variety of instruments,
recording horns, wax cylinder compounds and methods of adjusting room acoustics
(Millard 1990). In the Columbia Street Studio in West Orange, the walls, floors, ceiling,
piano and piano bench were coated with cow hair, creating an acoustically dry and
dead environment, not pleasant for musicians (Harvith and Harvith 1987). Commercial
recording took place in a variety of spaces. The Victor Talking Machine Company had its
home laboratories in Philadelphia and Camden, New Jersey, but used various locations in
New York City for music recording, including room 826 of Carnegie Hall, where Enrico
Caruso made his first US recordings (Sooy 1898). In 1898, Fred Gaisberg recorded in the
Gramophone Company’s London studio, which was the former smoking room of the
Cockburn Hotel (Kinnear 1994, photos in Burrows 2017: 36–37). The Gennett Records’
Richmond, Indiana, studio was in a 125 x 30 foot rural shed near a rail spur, ‘soundproofed’
by sawdust between the interior and exterior walls and deadened by floor to ceiling monk’s
cloth draperies (Kennedy 1994).
Edison’s National Phonograph Company Recording Department on the top floor of
the Knickerbocker building at Fifth Avenue and 16th Street in Manhattan exemplifies a
commercial operation of the early twentieth century. It included two recording studios,
a reception area, the manager’s private office, a rehearsal and audition room, test rooms,
receiving and shipping rooms, and an experimental machine shop. In a smaller recording
room, vocalists sang into a horn protruding through a curtained partition in the corner.
The largest studio was equipped with myriad devices, hangings and other apparatus
where large bands, orchestras and other instrumentalists recorded. Once the record was
‘taken’, a process described as ‘arduous and unromantic’, an assistant removed it from the
machine, carried it to the test room where the official critic and his assistant listened,
passed judgement and made suggestions for improvements. Once the musicians had
made the necessary corrections the recordist took another record and repeated the entire
process until the results were approved, after which more masters were made to be sent
to the factory for processing (‘Our New York Recording Plant’ 1906). With the growth
of the popular music industry, recording studios began to accommodate artists, but the
environment was intimidating and encouraged efficiency over creativity. In an effort to
mitigate the ‘phonograph fright’ that plagued artists unaccustomed to performing without
an audience, the Aeolion Company (Vocalion Records) provided comfortable surroundings
in nicely decorated rehearsal rooms before sending them to the bare-walled studio with
hanging wires, straight-backed chairs and the impersonal horn jutting out of the wall (‘The
Aeolion Co. Announces the Vocalion Record’ 1918).
Secrecy permeated the early recording business as inventors sought to protect their
inventions and innovations from being copied by rivals. Both Edison and Eldridge Johnson
of Victor Talking Machine protected their innovations that either required no patent
or could not be patented, keeping their labs remote and under lock and key (Aldridge
1964). Consequently, few contemporary descriptions of the recordist’s room survive. A
1911 reminiscence by the advertising manager of Columbia’s London studio described
the recordist’s ‘shrine of mystery’ as a space few were permitted to enter (Gelatt 1965).
It contained a turntable mounted on a heavy steel base, controlled by gravity weight, a
floating arm with its recording diaphragm, a small bench strewn with spare diaphragms
and a heating cupboard where the wax blanks were kept warm to soften the recording
surface to better receive the cut. The partition that separated the recordist’s room from the
studio where the artists performed remained fixed in nearly all early studios, becoming
more fortified over time. This distinction between work domains emerged because the
technology and the proprietary nature of recording methods demanded it. Recordists
moved freely between these realms because the job required it, but they jealously guarded
their technique and technology. Artists were not permitted to see the recording machine
and how it worked.

Electrical recording, big rooms and small studios

Early recordists were aware of the importance of room size, shape and acoustical resonance,
but the scientific study of sound and its behaviour in enclosed spaces was still in its infancy
and they were continually learning the ‘tricks of sound waves’ (Lescarboura 1918: 178).
The 1920s introduction of electrical recording, amplification and microphones enabled
a broader range of voices and instruments to be captured, in both company studios and
makeshift location studios like those used for recording early blues and folk artists (Dixon
and Godrich 1970; Sutton 2008). In studios, the conversion to electrical recording was not
completely smooth nor was it immediate, but it was dramatic in its effect, and the improved
sound quality and ease of recording also meant that different types of music, types of
performers and ways of listening emerged. Microphones replaced the cumbersome and
temperamental recording horns, so musicians no longer had to crowd together and jockey
for position before the horn. The more sensitive microphones captured room acoustics, so
recording and broadcasting studios employed sound-absorbing wall coverings, rugs and
drapes to eliminate reverberation, but this dead environment proved difficult for musicians.
Radio pioneered the use of live end–dead end studios to enable both controlled and live sound, and recording engineers recognized the value of both acoustical environments ('WCAU Uses Dead End and Live End Studios' 1932).
The new electrical recording chain gave the recordist greater control. A condenser
microphone, vacuum tube amplifier and electromagnetically powered cutting lathe
replaced the cumbersome recording horn, diaphragm and cutting stylus of the acoustic
recording system (Maxfield 1926). Rather than running back and forth between studio
and control room to reposition musicians, a recordist could 'see' sound on the volume indicator dial, audibly monitor it from a speaker, control it through knobs at his fingertips, and communicate with the studio through an intercom system like that used in radio broadcast studios. Triple-pane glass and soundproof partitions separated control
room and studio, providing protection for the recording equipment, which was sensitive
to vibrations, extraneous noise and environmental conditions. It also served as a barrier
between artist and technician, a boundary between realms, and a stronger distinction
between their respective roles. This division afforded greater efficiency and control for the
engineer, but it led to dramatically different listening environments in the control room
and in the studio (Newell 1995).
One of the great advantages of electrical recording was its ability to capture and
reproduce a wider range of frequencies and dynamics, and thus, larger musical ensembles
could now be successfully recorded, leading to the search for larger recording venues. The
major American record labels’ company studios were not large enough to accommodate
orchestras or the popular big bands of the era. The typical studio measured roughly 15,000
to 35,000 cubic feet, with enough acoustical treatment to give it what Bill Putnam described
as a ‘pinched’ sound (1980: 2). So RCA, Columbia and Decca regularly used alternate
venues for their large ensemble recording during the 1940s and 1950s. Columbia Records
used the World Broadcasting radio studios; RCA Victor used Manhattan Center, a former
opera house, and Webster Hall, a dance hall once used for bohemian costume balls, society
weddings and as a speakeasy during Prohibition; and Decca used the ballroom of the
Pythian Temple, an elaborate Egyptian-themed structure built by the Knights of Pythias in
1926. These large rooms provided the space and natural acoustics conducive to recording
symphonies and big bands, but they were not built as recording studios and thus were not
without problems. Engineers had to devise ways of minimizing external noise that bled
into the recording, halt sessions when that was not possible, and anchor recording lathes
in cement to counteract subway rumble (Schmidt Horning 2004). One site, Liederkranz
Hall, home to a German singing society, with its solid wood floors and walls, became the
preferred recording room for Columbia Records, until CBS President Bill Paley turned it
into a television studio. Columbia engineers and producers canvassed Manhattan to find
a replacement; an abandoned church on East 30th Street proved ideal (Liebler 1954; Kahn 2000), but only after the engineers figured out how to control the excessive reverberation.
All these recording venues and acoustical treatments had the effect of imparting greater
volume and presence to the records made in them, which was highly desirable in jukebox
and radio play.
Elsewhere in the world, RCA Victor was growing its network of studios. The first of
its studios to use the principles of polycylindrical diffusers was a disc recording studio in
South America in 1940, but eventually, nearly all RCA radio and recording studios were
outfitted with these half-round plywood panels arrayed along studio surfaces at mutually
perpendicular axes (Volkmann 1945). Devised by RCA engineer John Volkmann to obtain
a more uniform decay of sound, these diffusers actually made rooms feel and sound bigger
than ordinary studios. In 1962, RCA Italiana built a massive recording facility outside of
Rome, with Studio A (one of four studios in the complex) measuring 400,000 cubic feet,
the largest studio in the world, big enough to fit the entire cast of Aida and acoustically
tuneable with movable polycylindrical diffusers and panels covering the walls and ceiling
(Wechsberg 1962).
Although location recording had been going on since the early twentieth century, mobile
units for recording opera and symphony orchestra performances began in the electrical
recording era. In England, the Gramophone Company had a mobile recording system in 1927,
a compact recording suite built onto a two-ton Lancia van chassis, which enabled recordings
of classical concerts at Queen’s Hall near Oxford Circus and Kingsway Hall in the City of
London (Burrows 2017; Massey 2015). In early 1931, the Gramophone Company merged
with Columbia Graphophone to form Electric and Musical Industries (EMI), and by the end
of the year, EMI’s Abbey Road Studios opened in St John’s Wood.1 Unlike their American
counterparts, both EMI and Decca featured massive in-house studios with high ceilings and
staircases to elevated control rooms. At Abbey Road’s Studio One there was no direct access to
the control room from the studio, and in Studio Two, where The Beatles recorded, musicians
had to climb up a long flight of stairs to the control room, described by Paul McCartney
as ‘Where the grown-ups lived’ (Massey 2015). At Decca, this same arrangement of steep
steps to the elevated control room gave singer Marianne Faithfull the feeling that she and the
musicians were ‘like workers in the factory while the fat cats directed operations from on
high’ (Faithfull with Dalton 1994). The division between technicians and artists/producers/
music directors was more pronounced in the UK, where even into the 1960s EMI engineers
wore white lab coats and protocol dictated that permission had to be obtained in writing to
move microphones from their standard spot (Emerick and Massey 2006).
At the major labels in the United States, a similar division existed between recording engineers and the artists and producers they worked with, owing to engineering unions, which protected the engineers from any encroachment on their duties. At Columbia, RCA, Decca and Capitol, engineers were members of one of two major unions: the National Association of Broadcast Engineers and Technicians (NABET) and the International Brotherhood of
Electrical Workers (IBEW). During a session, artists and producers might offer suggestions
for balance, but only engineers could touch the controls. If a producer wanted to adjust the
volume he had to place his hands on the engineer’s hands to do so.
From disc to tape, stereo and multitracking


With the conversion to magnetic tape recording and the introduction of 2- and 3-track
stereo in the 1950s, studios underwent considerable redesign and jobs diversified. The
affordances of tape recording (editing, re-recording, overdubbing and other sound
manipulations) offered opportunities to devise new sound and production values, and
performances could now be edited to previously unimagined standards of perfection
(Milner 2009). New jobs emerged, such as mixer, tape editor and button pusher, the newest
title for a studio apprentice. Initially, recording continued on disc as well as on tape. Once
companies were satisfied with the reliability of magnetic tape, it replaced disc recording as
the primary recording medium, and disc-cutting lathes were moved to separate mastering rooms where mastering engineers undertook the final step of cutting the mixed tape to a master disc to be sent for processing.
The first large-scale custom-designed recording studios in the United States were
those incorporated in the Capitol Records Tower in Hollywood, completed in 1956. Built
literally from the ground up, the Tower studios used techniques of motion picture sound-
stage design and construction, modern acoustical materials and constructions ‘to achieve
a new concept in studio design: minimized reverberation’, and included four sublevel
reverberation chambers ‘to provide optimal acoustical properties’, resulting in ‘a modern,
diversified plant, physically attractive, acoustically controllable and electromechanically
flexible’ (Bayless 1957). The studio did not meet these high expectations immediately
and, during the first sessions in February 1956, the musicians felt it sounded ‘dead as hell’
(Granata 1999). Capitol engineer John Palladino (1999) recalled that session and many
that followed as disappointing to the musicians and challenging to the engineers, mainly
because the studio was so much larger than the previous studio with a different performing
and monitoring environment. ‘You don’t have anything of a constant […] and some of
the same techniques [we] used at Melrose didn’t seem to work out […] because of the
acoustics.’ Quite possibly the ‘minimized reverberation’ was the problem, but there was
a certain level of resistance to the new and attachment to the familiar. Another Capitol
engineer grew so tired of hearing the musicians wish they could go back to Melrose, the
studio they had been using since 1949, originally an NBC radio studio, that he finally
played them a recording made at the former studio, ‘and they all suddenly realized that the
new studio was by far acoustically superior’ (Grein 1992).
Some popular record producers exploited studio technology to create unique sounds
and exaggerated effects, creating a sense of space and place far removed from the actual
space in which recording took place (Doyle 2005). Recording engineers and producers
first began to experiment with artificially recreating the sense of space with the use of
reverberation chambers, commonly called ‘echo chambers’, techniques first pioneered in
radio and sound film, and later by using electro-acoustic devices such as spring reverbs
and EMT Reverberation Plates (Schmidt Horning 2012). As early as 1937, bandleader
Raymond Scott achieved a big auditorium sound on his records simply by placing
microphones in the hallway and men’s room outside his record company’s office, which
also served as the label’s recording studio. A decade later, Chicago engineer Bill Putnam
used the bathroom at Universal Recording to achieve exaggerated reverberation on The
Harmonicats’ recording of ‘Peg O’ My Heart’. In New York City, Columbia Records used
the stairwell of their studios at 799 Seventh Avenue. In Phoenix, fledgling record producer
Lee Hazlewood installed a cast iron storage tank in the Ramsey Recording parking lot,
inserted a microphone in one end and a cheap speaker in the other to create guitarist
Duane Eddy’s ‘million dollar twang’ (Schmidt Horning 2004). Another popular means
of achieving echo utilized two tape recorders, called tape slap. In Memphis, Tennessee,
engineer and producer Sam Phillips used it to great effect recording early Elvis Presley and
Jerry Lee Lewis records, creating what came to be known as the ‘Sun Sound’. Phillips had
two Ampex 350 recorders in his studio in 1954, one on the console and the other mounted
on a rack behind his head. By bouncing the signal from one machine to another, with
a split-second lag between the two, he created an echo effect that became the sound of
rockabilly. In London, EMI Chamber Two and Decca’s Broadhurst Gardens rooftop echo
chamber featured vertically placed glazed sewer pipes to diffuse sound (Massey 2015),
reminiscent of the polycylindrical diffusers lining RCA studio walls.
A different attempt to exploit the possibilities of stereo and studio technology was
embraced in classical music. Decca Records (London) used the Sofiensaal in Vienna,
Austria, as its main European recording venue during the 1950s and 1960s. In his account
of the recording of Richard Wagner’s Der Ring des Nibelungen from 1958 to 1965, record
producer John Culshaw described how he and the engineering team managed to produce
all the effects and perspectives required in the production of Götterdämmerung. Recording
the previous two operas proved time-consuming due to the need for multiple hands on the
console. The new desk, designed by Decca engineers and constructed in Vienna, enabled
the producer to sit in the middle, with one engineer on either side of him, one controlling
the orchestra and the other the voices and effects cues (Culshaw 1967).
The use of echo and reverb, and the exploitation of the stereo image, a more three-
dimensional listening experience than monophonic or one-eared listening, all embraced
the concept of using acoustical space, or the simulation thereof, in recording, whether
in lavish operatic productions such as Culshaw’s Ring or the ‘ping-pong’ stereo of Enoch
Light’s Persuasive Percussion (1959). Another approach to manipulating sound in studio
recording was to use the technology itself to create sounds. British independent producer
Joe Meek produced and engineered records in London professionally from 1955 to 1967,
working at IBC and Lansdowne before striking out on his own. He ran practically every
sound through compressors and frequently drenched everything in excessive reverb, echo
and delay. Sometimes he added homemade sound effects and even sped up the tape to give
it ‘more personality’ (Cleveland 2001: xi). His was the ultimate cobbled together home
studio: a series of rooms occupying three floors, above a leather goods store on Holloway
Road. His use of rooms as sound booths and the creative use of corrective devices to distort
sound comprise an example of user-modification of technology, a practice that has become
ubiquitous in music-making and other fields (Oudshoorn and Pinch 2003).
With the introduction of multitrack recording in the 1960s, a new concept of recording
in layers led ultimately to the reintroduction of acoustical deadening and the separation
of musicians in the studio. In order to maintain discrete tracks, with instruments kept separate so that each channel could be controlled in the final mix, engineers employed close microphone techniques and baffles or gobos to wall off musicians from one another.
Recording engineers had to spend a great deal of time getting the right levels of instruments,
leading some musicians to lose a sense of spontaneity in their performance.2 Vocal booths
had been used since the days of big bands, when the collective instrumentation often
overpowered a vocalist, but now drum booths became necessary. Because multitracking
made it possible for instruments to be recorded at different times and overdubbed,
recordings could now be made without an entire ensemble present, and musicians did not
have to be well rehearsed as they did in the days of disc recording, when mistakes could
neither be edited out nor corrected. The multitrack studio may have thwarted spontaneity and undermined skills as musicians and producers relied on the engineer's ability to fix it in the mix, but it afforded many creative possibilities, from Les Paul's pioneering use of sound-on-sound disc recording to his early 8-track recording; and no one made better use of multitracking than George Martin and The Beatles in the making of Sgt Pepper (Martin with Pearson 1994; Emerick and Massey 2006).

Rise of the independents


Independent recording studios first emerged in the 1930s with the introduction of
instantaneous disc recorders (Brock-Nannestad 2012). That semi-professional equipment
served as the training ground for future professional engineers and studio owners. In
the 1940s, major labels used independent studios Universal Recording in Chicago and
Radio Recorders in Hollywood to record their artists and cut master discs. Before Sam
Phillips opened his Memphis Recording Service in 1950, he honed his engineering skills
by recording transcription programmes for radio station WREC. Once he opened for
business, he cut his early records at 78 rpm on 16-inch discs using a Presto 6-N lathe and
Presto turntable before switching to Ampex tape recorders (Escott with Hawkins 1991).
His was a storefront studio, a simple room with an upright piano, vinyl floors and acoustical tiles, because Phillips had some knowledge of the behaviour of sound in rooms. In Chicago,
Leonard and Phil Chess partnered with Jack Wiener to build their studio at 2120 South
Michigan Avenue. Although it became famous as the ‘Chess Studio’, Wiener originally
designed and built it as an independent studio, Sheldon Recording. When Wiener learned
that the Chess brothers wanted to maintain the studio for only their own artists, Wiener
left and took the recording equipment with him (Rudolph 2018). In Cincinnati, Syd Nathan
established King Records as a small empire, with recording, mastering, processing and
shipping all under one roof (Kennedy and McNutt 1999).
Before it joined New York, Los Angeles and Chicago as a major recording centre, Nashville
had a number of independent studios. The city’s early recording activity took place in Castle
Recording Laboratory, located in the mezzanine of the Tulane Hotel (1947–55), and in a
remodelled house and surplus military Quonset Hut that together became Owen Bradley’s
Film and Recording Studios, where Columbia and Decca artists recorded. Before building
their own studio, RCA Victor held sessions at several Nashville studios, including a studio
and office space rented from the Television, Radio, and Film Commission (TRAFCO) of
the Methodist Church. It was there that Elvis recorded ‘Heartbreak Hotel’ on 10 January
1956, but the space had a curved ceiling which amplified the bass notes, and some church
leaders were not happy about hearing the ‘devil’s music’ as their educators worked on church
publications (Rumble 2016). By 1957, RCA Studio B, measuring 65 x 150 feet, with an echo
chamber on the second floor, opened for business (Cogan and Clark 2003). RCA did not
provide vocal booths and only had 4-foot acoustical panel baffles to isolate the drums,
apparently believing Nashville did not require the sophisticated acoustical treatment of its
other studios. So engineer Bill Porter bought acoustical panels, cut them up and created
pyramid-shaped structures, which he hung at different heights across the room to eliminate
standing waves (pictured in Rumble 2016: 30). By 1966, the studio had been redesigned
to include parquet wood floors, acoustic tile and convex plywood panels (polycylindrical
diffusers, found in most RCA radio and recording studios) (Volkmann 1966).
Small independent labels flourished in the 1950s and 1960s and many of these relied
on a growing number of independent studios. During this period, independent studios
grew in number, some rivalling major label operations in their organizational structure.
Bell Sound Studio in New York City was the first independent studio to give the major
labels real competition for clients, and it exemplifies a very successful independent studio
of the time, recording numerous hit singles and producing innovative technology. Bell
began, as so many other independent studios began in the post-war period, recording
air checks, weddings and bar mitzvahs. By the 1960s, Bell was a full-service operation
with three recording studios, editing rooms, mastering rooms, a film room, acoustic echo
chambers, an office and tape library on the ground floor, receptionists on the second and
fifth floors, a full-time maintenance crew and a sales department. Young aspiring engineers
could begin as apprentice ‘button pushers’ assisting with various tasks such as tape cueing
and microphone set-ups, and the mixers had the more glamorous job of recording the
session and working directly with artists. Lead engineer Dan Cronin designed and built
one of the first solid-state professional recording consoles as well as the electronics for their
Scully tape recorders. Bell also had one of the first 12-channel tape recorders, designed by
Mort Fujii, a former Ampex engineer. That particular innovation backfired, however, when
its uniqueness became not the envisioned selling point but a problem. If a client needed a
mix or wanted to overdub but could not get time at heavily booked Bell, they could take
it to another studio, but a 12-track master tape would not align on the standard 8- or
16-track machines at other studios. In a period of intense competition for the latest studio
technology, Bell’s 12-channel was an example of technology’s unintended consequences.
But during the early 1960s, Bell was the most popular independent studio in New York,
frequented by rock groups such as the Lovin’ Spoonful and The McCoys, composers such
as Burt Bacharach, and solo artists and independent producers such as Phil Spector.
In the UK, over a dozen major independent studios emerged by the mid-1960s to
compete with the four majors (EMI, Decca, Pye, Philips). IBC, Lansdowne, Advision and
Cine-Tele Sound (CTS)/The Music Centre, along with 304 Holloway Road (Joe Meek’s
studio), Olympic, Trident and, by 1970, George Martin’s own AIR Studios were recording
British rock bands such as The Kinks, The Who and The Rolling Stones, and popular
singers such as Petula Clark. In the United States, by the late 1960s the major label flagship
studios that had dominated recording from the post-war era waned in popularity as a new
crop of independent studios arose. The shift signalled a generational divide between artists
and technicians over perceptions of the use and meaning of technology, and notions of
skill, creativity and musicality. From 1956 to 1968, only signed artists could use the Capitol
Tower studios, but as many rock bands in the late 1960s demanded the right to record
wherever they wanted without the union restrictions and rigid work environment of major
labels, Capitol opened its studios to outside clients (Grein 1992). Les Paul never used the
company studios, making all of his sound-on-sound and multitrack recordings in home
studios in California and Mahwah, New Jersey (Schmidt Horning 2004). In recording
‘Good Vibrations’, The Beach Boys’ Brian Wilson used five different studios in Los Angeles:
Gold Star, Sunset Sound, Western, Columbia and RCA (Wilson and Gold 1991; Sharp
2015), ultimately combining performances from different studios into one hugely popular
hit record. When The Band embarked on their second album for Capitol, they insisted on decamping to a former pool house, which required the company to transport recording equipment and maintain Capitol engineers on site during recording (union rules) even though John Simon did the engineering. Columbia also began to attract independent
producers to its New York, Nashville and San Francisco studios by advertising in trade
magazines: ‘Make your hits at our place’ (Schmidt Horning 2013). Jimi Hendrix spent so
much time in recording studios that his accountant advised he build his own, but Electric
Lady did not open until after his death. By the early 1970s, rock bands sought retreats from
urban studios and location recording became popular. The Rolling Stones parked a mobile
unit outside Keith Richards’s rented villa in the south of France, where the band recorded
basic tracks for Exile on Main Street (Massey 2015). The Beach Boys commissioned
construction of a state-of-the-art 16-track studio in Los Angeles, had it broken down and
shipped to the Netherlands where they had it reassembled in a rural barn and proceeded to
record Holland (Schmidt Horning 2013).

Gender and race in the studio workplace


The social and cultural diversity of the entertainment industry meant that recording studios
were always places where race, class and gender intermixed, albeit with certain limitations.
Women and people of colour could be found behind the microphone but not at the control
desk. In 1940, Mary Howard, a wealthy divorcee with a passion for recording, applied for
a job in the recording department at NBC. Instead, they offered her a secretarial position,
but when the war depleted the engineering staff, she got the chance to cut discs. She soon
earned the reputation of ‘a master recording engineer’ and eventually left NBC to open her
own recording studio in her Midtown apartment (‘Woman with a Disk’ 1947). At Radio
Recorders in Hollywood, Rose Palladino and Evelyn Blanchard became expert editors of
radio programmes, editing out commercials to make 16-inch transcription discs of music
for the Armed Forces Radio Service. Evelyn eventually mixed two major hits – ‘Smoke!
Smoke! Smoke!’ by Tex Williams (1947) and ‘Twelfth Street Rag’ by Pee Wee Hunt (1948) –
when the newly formed Capitol Records (1942) used Radio Recorders before it had its own
studios. Rose’s brother, John Palladino, who became a Capitol engineer and producer and
Evelyn’s future husband, said, ‘I had to marry her to get rid of the competition!’ (Palladino
1999). Her work went uncredited but until the 1960s no recording engineers, male or
female, received credit for their work.
Although African American performers sang and played in recording studios from the
very beginning (Brooks 2004), few people of colour held technical positions in studios
until well after the Second World War. In 1949, at a time when recording engineers learned
their craft on the job and had little if any engineering experience, Ray Hall was an African
American Marine Corps veteran with an electrical engineering degree from Purdue
University. Yet when he applied to RCA Records in response to an ad seeking blacks with
scientific and technical backgrounds, he first underwent a battery of tests at various RCA
locations before starting his employment repairing phonographs for artists such as Leopold Stokowski; he then advanced to second engineer, and eventually first engineer (Schmidt
Horning 2013). Motown Records, the first successful African American owned and
operated record label in the post-Second World War era, relied on the technical expertise
of white engineers, even as its entire stable of talent was black (Fostle 1997a, b). For women
and people of colour, studio employment opportunities were rare well into the 1970s.

Conclusion
What began with efforts to capture and reproduce sound by mechanical means became
a technology-driven music industry with the studio as the central creative hub and the
control room as the nerve centre. Early studios were sites of experiment and innovation,
and recording was a labour-intensive process, but the recording technology was the
preserve of the recordist and no artist could ever view, much less touch, any of the devices.
Electrical recording refined the process and gave recordists more control over sound,
musicians more space in the studio and the possibility of recording more styles of music.
Magnetic tape brought editing, re-recording, overdubbing and mixing, and multitracking
introduced creative possibilities that led to more time spent in recording, more involvement
of technicians in the creative process and, in some cases, artists being permitted to see
and to touch the controls. By 1970, Billboard declared the recording studio ‘the crucible
of creativity’, but recording studios had always invited one kind of creativity or another
from the early experimentation with acoustical methods, techniques, materials and room
size, to the increasingly collaborative deployment of multitrack recorders, echo chambers,
outboard effects and cut-and-try methods pioneered by people such as Les Paul and Joe
Meek. The studio began with technological invention in the service of art and came to be a
site for musical and technological collaboration (Schmidt Horning 2016).
Notes
1. Today, Abbey Road Studios include three full-size recording studios, mixing, mastering
and copying rooms, and administrative offices. For a detailed look at the technical and
physical characteristics, personnel and studio stories of Abbey Road and every major
and independent British studio and mobile unit up through the 1970s, Howard Massey’s
deeply researched and beautifully illustrated The Great British Recording Studios (2015) is
essential reading.
2. One example would be Thelonious Monk’s frustrated reaction to the extended level
checks when he recorded at Columbia’s 30th Street Studio, as seen in the film Thelonious
Monk: Straight, No Chaser (1988) and discussed in Chasing Sound (Schmidt Horning
2013: 194–197).

Bibliography
'The Aeolian Co. Announces the Vocalion Record' (1918), The Music Trade Review, 18 May: 48, 50.
Aldridge, B. L. (1964), The Victor Talking Machine Company, RCA Sales Corporation.
Bayless, J. W. (1957), ‘Innovations in Studio Design and Construction in the Capitol Tower
Recording Studios’, Journal of the Audio Engineering Society, 5 (2) (April): 75–77.
Billboard (1970), International Directory of Recording Studios, 9 May 1970.
Brock-Nannestad, G. (2012), ‘The Lacquer Disc for Immediate Playback: Professional
Recording and Home Recording from the 1920s to the 1950s’, in S. Frith and S. Zagorski-
Thomas (eds), The Art of Record Production: An Introductory Reader for a New Academic
Field, 13–27, Farnham: Ashgate.
Brooks, T. (2004), Lost Sounds: Blacks and the Birth of the Recording Industry, 1890–1919,
Urbana, IL: University of Illinois Press.
Burgess, R. J. (2014), The History of Music Production, New York: Oxford University Press.
Burrows, T. (2017), The Art of Sound: A Visual History for Audiophiles, London: Thames &
Hudson.
'Chasing Sound Oral History Project' (n.d.), Louie B. Nunn Center for Oral History,
University of Kentucky Libraries. Available online: https://kentuckyoralhistory.org/
ark:/16417/xt7pzg6g4k5n (accessed 22 August 2019).
Cleveland, B. (2001), Creative Music Production: Joe Meek’s Bold Techniques, Vallejo, CA: Mix
Books.
Cogan, J. and W. Clark (2003), Temples of Sound: Inside the Great Recording Studios, San
Francisco: Chronicle Books.
Culshaw, J. (1967), Ring Resounding, New York: Penguin.
Cunningham, M. (1996), Good Vibrations: A History of Record Production, Chessington:
Castle Communications.
Dixon, R. M. W. and J. Godrich (1970), Recording the Blues, New York: Stein and Day.
Doyle, P. (2005), Echo and Reverb: Fabricating Space in Popular Music Recording, 1900–1960,
Middletown, CT: Wesleyan University Press.
Emerick, G. and H. Massey (2006), Here, There and Everywhere: My Life Recording the Music
of The Beatles, New York: Gotham Books.
Escott, C. with M. Hawkins (1991), Good Rockin’ Tonight: Sun Records and the Birth of Rock ‘n’
Roll, New York: St Martin’s Press.
Faithfull, M. with D. Dalton (1994), Faithfull: An Autobiography, Boston, MA: Little, Brown.
Fostle, D. W. (1997a), ‘The Audio Interview: Mike McLean – Master of the Motown Sound,
Part I’, Audio, 81 (11): 56–61.
Fostle, D. W. (1997b), ‘The Audio Interview: Mike McLean, Part II’, Audio, 81 (12): 50–56.
Gelatt, R. (1965), The Fabulous Phonograph: From Edison to Stereo, rev. edn, New York:
Appleton-Century.
Granata, C. L. (1999), Sessions with Sinatra: Frank Sinatra and the Art of Recording, Chicago:
A Cappella Books.
Grein, P. (1992), Capitol Records: Fiftieth Anniversary, 1942–1992, Hollywood, CA: Capitol
Records.
Gronow, P. and I. Saunio (1998), An International History of the Recording Industry, translated
by C. Moseley. London: Cassell.
Harvith, J. and S. Edwards Harvith, eds (1987), Edison, Musicians, and the Phonograph: A
Century in Retrospect, New York: Greenwood Press.
Kahn, A. (2000), Kind of Blue: The Making of the Miles Davis Masterpiece, New York: Da Capo
Press.
Kennedy, R. (1994), Jelly Roll, Bix, and Hoagy: Gennett Studios and the Birth of Recorded Jazz,
Bloomington, IN: Indiana University Press.
Kennedy, R. and R. McNutt (1999), Little Labels – Big Sound, Bloomington, IN: Indiana
University Press.
Kinnear, M. S. (1994), The Gramophone Company’s First Indian Recordings, 1899–1908,
Bombay: Popular Prakashan.
Lescarboura, A. C. (1918), ‘At the Other End of the Phonograph’, Scientific American,
31 August: 164, 178.
Liebler, V. J. (1954), ‘A Record is Born!’, The Columbia Record: 3–4, 9.
Lubin, T. (1996), ‘The Sounds of Science: The Development of the Recording Studio as
Instrument’, National Association of Recording Arts and Sciences Journal, 7 (Summer/Fall):
41–102.
Martin, G. with W. Pearson (1994), With a Little Help From My Friends: The Making of Sgt
Pepper, Boston, MA: Little, Brown and Company.
Martland, P. (1997), Since Records Began: EMI The First Hundred Years, London: B. T.
Batsford.
Massey, H. (2015), The Great British Recording Studios, Milwaukee, WI: Hal Leonard Books.
Maxfield, J. P. (1926), ‘Electrical Phonograph Recording’, Scientific Monthly, 22 (1): 71–79.
Millard, A. (1990), Edison and the Business of Innovation, Baltimore: Johns Hopkins
University Press.
Milner, G. (2009), Perfecting Sound Forever: An Aural History of Recorded Music, New York:
Faber and Faber.
Newell, P. R. (1995), Studio Monitoring Design: A Personal View, Oxford: Focal Press.
Oudshoorn, N. and T. Pinch, eds (2003), How Users Matter: The Co-Construction of Users and
Technologies, Cambridge, MA: MIT Press.
‘Our New York Recording Plant’ (1906), Edison Phonograph Monthly, 4 (9): 6–8.
Palladino, J. (1999), Interview by Susan Schmidt Horning, 15 October 1999, ‘Chasing Sound
Oral History Project’, Louie B. Nunn Center for Oral History, University of Kentucky
Libraries. Available online: https://kentuckyoralhistory.org/ark:/16417/xt74f47gt943
(accessed 1 July 2019).
Peoples, C. (2014), ‘The Only Mountain in Lubbock: A History of Analog Recording Studios
in Lubbock, Texas’, Association for Recorded Sound Collections Journal, 45 (2): 141–155.
Putnam, M. T. (1980), ‘A Thirty-Five Year History and Evolution of the Recording Studio’,
Audio Engineering Society Preprint 1661, Los Angeles: Engineering Society.
Rudolph, Dr. (2018), ‘Jack Wiener and the Sheldon Recording Studios’. Available online:
http://www.crlf.de/ChuckBerry/sheldon.html (accessed 22 August 2019).
Rumble, J. W. (2016), Historic RCA Studio B Nashville: Home of 1,000 Hits, Nashville, TN:
Country Music Foundation Press.
Schmidt Horning, S. (2002), ‘From Polka to Punk: Growth of an Independent Recording
Studio, 1934–1977’, in Hans-Joachim Braun (ed.), Music and Technology in the Twentieth
Century, 136–147, Baltimore: Johns Hopkins University Press.
Schmidt Horning, S. (2004), ‘Recording: The Search for the Sound’, in A. Millard (ed.),
The Electric Guitar: A History of an American Icon, 105–122, Baltimore: Johns Hopkins
University Press.
Schmidt Horning, S. (2012), ‘The Sounds of Space: Studio as Instrument in the Era of High
Fidelity’, in S. Frith and S. Zagorski-Thomas (eds), The Art of Record Production: An
Introductory Reader for a New Academic Field, 29–42, Farnham: Ashgate.
Schmidt Horning, S. (2013), Chasing Sound: Technology, Culture and the Art of Studio
Recording from Edison to the LP, Baltimore: Johns Hopkins University Press.
Schmidt Horning, S. (2016), ‘Creativity in the Trading Zone: Sound Recording as
Collaboration’, in H.-J. Braun (ed.) in collaboration with S. Schmidt Horning, Creativity:
Technology and Music, 169–186, Frankfurt am Main: Peter Lang.
Schonberg, H. C. (1970), ‘Acoustics at RCA’s Studio Music to Engineers’ Ears’, New York
Times, 9 August. Available online: https://www.nytimes.com/1970/08/09/archives/
acoustics-at-rcas-studio-music-to-engineers-ears-acoustics-please.html (accessed 5 July
2019).
Sharp, K. (2015), Sound Explosion! Inside L.A.’s Studio Factory with the Wrecking Crew,
Woodland Hills, CA: Wrecking Crew LLC.
Sooy, R. B. (1898), ‘Memoirs of My Recording and Traveling Experiences for the Victor
Talking Machine Company’, in Sooy Brothers Memoirs, Victor Talking Machine
Company, Hagley Digital Archives. Available online: https://digital.hagley.org/
LMSS_2464_09_X_B_MA1252_20?solr_nav%5Bid%5D=27ab02b008c2e92db1a3&solr_
nav%5Bpage%5D=0&solr_nav%5Boffset%5D=2#page/1/mode/1up (accessed 10 June
2019).
Sutton, A. (2008), Recording the Twenties: The Evolution of the American Recording Industry,
1920–1929, Denver, CO: Mainspring Press.
Thelonious Monk: Straight, No Chaser (1988), [Film] Dir. C. Zwerin; Producers C. Eastwood,
C. Zwerin and B. Ricker, Los Angeles: Warner Bros.
Thompson, E. (2002), The Soundscape of Modernity: Architectural Acoustics and the Culture of
Listening in America, 1900–1933, Cambridge, MA: MIT Press.
Volkmann, J. E. (1945), ‘Polycylindrical Diffusers in Room Acoustic Design’, Broadcast News
40 (January): 2–7.
Volkmann, J. E. (1966), ‘Acoustic Requirements of Stereo Recording Studios’, Journal of the Audio Engineering Society, 14 (4): 324–327.
‘WCAU Uses Dead End and Live End Studios’ (1932), Broadcasting 5 (October): 21.
Wechsberg, J. (1962), ‘RCA’s Home Away from Home’, Saturday Review, 45 (26 May): 53.
Wilson, B. and T. Gold (1991), Wouldn’t It Be Nice: My Own Story, New York: HarperCollins.
‘Woman with a Disk’ (1947), Newsweek 30 (29 December): 42.

Discography
The Beach Boys (1966), ‘Good Vibrations’, Capitol Records.
The Beach Boys (1973), Holland, Brother Records.
The Beatles (1967), Sgt Pepper’s Lonely Hearts Club Band, Parlophone.
The Harmonicats (1947), ‘Peg O’ My Heart’, Vitacoustic Records.
Hunt, Pee Wee (1948), ‘Twelfth Street Rag’, Capitol Records.
Light, Enoch (1959), Persuasive Percussion, Command Records.
The Rolling Stones (1972), Exile on Main Street, Rolling Stones Records.
Wagner, Richard (1958–65), Der Ring des Nibelungen, conducted by Georg Solti, produced by
John Culshaw, Decca Records.
Williams, Tex (1947), ‘Smoke! Smoke! Smoke!’, Capitol Records.
8
Recording Studios since 1970
Eliot Bates

Introduction
As with many other speciality, purpose-built spaces, we tend to think of recording studios in instrumental terms, meaning that the space is defined in relation to the nominal type of work that the space is instrumental towards. While audio recordings have been made in a wide variety of spaces since 1877, not all of these spaces tend to be regarded as recording studios, partly because so many recordings were made in environments designed for other types of work;
indeed, much of the first seventy years of US and UK recorded music history transpired at
radio stations, concert halls and lightly treated mixed-use commercial spaces (e.g. offices,
furniture stores or drugstores where drapes might be informally used to deaden some of
the reverberations and early reflections), as has been well documented by Schmidt Horning
(2015) and Kenney (1999).
locations such as Turkish municipal buildings (Bates 2016), ‘in the field’ in numerous
colonies and postcolonial countries (Western 2018), or in the untreated living rooms of
producers’ houses in Jamaica (Veal 2007). While there were early exceptions such as EMI
Studios London (1931–) and Capitol Studios in Los Angeles (1956–), the architectural,
acoustical and aesthetic conventions that are taken for granted today, and their relation
to patterned workflows for making recorded music, only became firmly established and
adopted around the world in the 1970s. And, there is nothing ‘natural’ or ‘inevitable’ about
these conventions: it easily ‘could have been otherwise’ (Mol and Law 2002: 11).
Therefore, the instrumental definition of studios provides an inadequate framework
for understanding what a studio is and what studios are designed to do (Bates 2012),
much of which exceeds the circumscribed goal of making an audio recording. Aware that
many musicians find large-scale commercial studios alienating (both socially and creatively), and that many producers and engineers find even very famous spaces difficult to work in due to problematic acoustical or layout features, we should not assume that
all studios necessarily fully succeed in their instrumental role, at least not for everyone
involved. Instead, in this chapter, extending Pickering’s call to attend to the ‘mangle’ of
practice in sociotechnical milieus (1995) and Law’s argument for bringing the ‘mess’ back
into social science research (2004), I argue that we need to understand studios as a messy
and uneven entanglement between four domains: the material, the spatial, the positional
and the occupational. However, these are not discrete, separate domains, and what is so
fascinating about spaces such as studios is that they are sites where we can observe, for
example, the material enframing the occupational domain, and the positional enframing
the spatial domain. Ultimately, studios are social spaces – the term ‘social’ here relating to
a wide variety of ways in which people interact with other people and with technological
objects – and what defines the unique characteristics of a studio as a social space arises
from patterned relations between these four, always entangled, domains. In other words,
the physical matter of the studios, the organization of objects and people in the space,
the positionality of the studio in relation to the outside world, and the way in which
occupations contribute to production labour define the studio as a kind of space – and
constrain the social dynamics of that particular space.

The materiality of studios


From an architectural standpoint, post-1970 studios are designed to simultaneously
isolate the studio from the outside world, to isolate the rooms of multiroom studios from
each other and to impart a desired acoustical/reverberant character to the sound as it
propagates inside the space(s). With regard to the isolating component, while acoustic
isolation is a must (sound from inside the studio shouldn’t get out, and sound from outside
shouldn’t get in), in nearly all cases this results in a near-complete visual isolation as well.
Studio workers, hermetically sealed off from the world, produce sonic representations of
the world (Hennion 1989: 407–408; Bates 2018). But there’s not just one qualitative kind of
isolation, as some studios are explicitly described by their owners as ‘wombs’, ‘bunkers’ or
‘cockpits’, all of which suggest the desire for a kind of isolation that is imagined in relation
to feminine and nurturing labour (wombs) or in relation to the masculine world of military
and warcraft technologies (bunkers and cockpits) (Bates 2012). Regarding the acoustical
component, the ubiquity of artificial reverberation in production workflows (Doyle 2005),
and the fact that most listeners now find recordings using artificial reverb to sound more
plausibly ‘natural’ than recordings made without it, led to standard acoustical treatment
designs that assume that artificial reverb will be added during the mixing phase. Thus, the
materiality of the studio enforces isolation and also imparts an acoustic quality that sounds
and feels like no other space, but that gets partially masked before recordings made within
circulate to the outside world (Watson 2015: 67–69).
Whereas nearly every other type of architectural space exhibits an effort to partially or
wholly conceal the presence of acoustical treatments (an exception might be music practice
rooms), studios are distinctive due to the strangeness of the space and the imposing presence
of the materials used to build and treat them. Hallmark visual features of studios of many sizes/
functions include peculiar ‘stuff ’, perhaps consisting of acoustical foam, angular wooden
treatments used to diffuse or resonate, strikingly coloured cloth coverings barely concealing
aesthetically unpleasant (and sometimes unhealthy) absorptive materials, unusual floors
of hardwoods or poured and dyed concrete, strangely shaped and angled ceilings, and
nonstandard lighting fixtures with bulbs of unusual colour temperatures (e.g. lava lamps,
xenon bulbs). Post-1970 studios of all sizes become immediately recognizable through the
consistent presence of these materials; products by speciality companies such as Auralex
and Primacoustic provide widely used prefabricated solutions. Not all these categories are
employed everywhere, but even home studios may be immediately recognizable as a studio
from the commanding presence of 2 × 4 foot mineral wool or Owens Corning 703 rigid
fibreglass panels (especially those manufactured by RealTraps and GIK Acoustics, or their
DIY equivalents), a scattering of Auralex acoustical foam squares and/or the hundreds of
protruding wood squares of an RPG Quadratic Residue Diffuser.
Although pre-1970s studios did not usually have these features, it is not because the
underlying technologies did not exist. It is interesting how long it took for certain acoustical
technologies to become adopted for music recording spaces. The birth of ‘modern’
acoustics is widely associated with Wallace Sabine (professionally active 1895–1919). While
perhaps best known for his design of Symphony Hall in Boston, Massachusetts, the first
acoustical challenge Sabine was employed to tackle was a subpar lecture hall (Thompson
2002: 34). His pioneering research into the absorptive properties of different materials
led to the adoption of a standard measurement of reverberation time, RT60, and helped
inspire the founding of the Acoustical Society of America in 1928, the same year that it
first became possible to semi-accurately measure the sound pressure levels in a room
(Sabine 1977: 259). Acoustics wings of construction firms first emerged in 1911 (257). The
Acoustical Materials Association (AMA) was founded in the United States in 1933, bringing
together manufacturers of gypsum, cork and fibreglass-based materials, and their semi-
regular publications included an ever-expanding list of acoustics products with measured
absorption coefficients. Until the AMA mandated that their members use a centralized
testing lab and consistent methodology, there had been no reliable measurements of
absorption (Sabine 1977). The number of participating companies in the AMA expanded
from eight in 1940 to thirteen in 1965, demonstrating the expansion of the acoustics sector,
although the expansion was not specifically on account of music-related applications, and
most of the products in question were designed for spaces we wouldn’t usually consider
‘architectural’, such as aircraft interiors. Regarding the specific materials that define the
post-1970 studio, patent documents suggest that rockwool had existed since the 1840s
(in the familiar ‘batt’ form since at least 1928), rigid fibreglass was manufactured as early
as 1940 by Owens Corning, and polyurethane foam since roughly 1954. And there had
been hundreds of years of precedent for using unevenly-shaped wood or stone surfaces to
pleasingly diffuse sound in environments such as opera houses or cathedrals.
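To give a concrete sense of the measurement at stake, and purely as an illustration, Sabine’s formula estimates reverberation time as RT60 ≈ 0.161 × V / A in metric units, where V is the volume of the room in cubic metres and A is its total absorption, that is, the sum of each surface’s area multiplied by the published absorption coefficient of its material. A 90-cubic-metre control room whose surfaces together provide 25 metric sabins of absorption would therefore take roughly 0.6 seconds (0.161 × 90 / 25) for sound to decay by 60 decibels.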
Part of the problem had to do with the limited attention given to recording spaces and
their distinct acoustical properties or challenges, in contrast to the more developed literature
on concert and opera halls (Beranek 1962, 2004), or office/industrial spaces. A trickle of
articles starting in 1932 mention features specific to studio architecture, although until the
1940s differentiation wasn’t made between radio studios, which often hosted live orchestras
for broadcasts, and recording studios. Stanton and Schmid suggest that broadcast studios
should have reverberant characteristics to facilitate musical performance but that the goal
of recording music should be to recreate the experience of being in a concert hall (1932:
46–47). Morris and Nixon’s (1936) article on the design of the NBC radio studio discusses
acoustic coefficients and is possibly the first mention of the need to install extensive bass
trapping materials (they suggested a minimum of 4 inches of rockwool, which has since
become a standard for this widely used material). Knudsen (1938) recognized the creative
potential for studios with variable acoustics as early as 1938, but did not discuss this in detail.
Green and Dunbar regarded the studio as ‘the final instrument that is recorded’ (1947: 413),
but their article mainly discussed generic construction materials that can be used to alter
the reverberant characteristics of existing rooms. It wasn’t until the 1950s that, within the
Journal of the Audio Engineering Society, we began to see a quorum of publications relating
to studio-specific architecture, and by the 1970s substantial discussion could be found in
acoustics-specific journals (e.g. Schaudinischky and Schwartz 1970). The audience for such
publications was acousticians, not studio owners, musicians or the lay public; more ‘accessible’
acoustics manuals came considerably later (Everest 1981; Newell 2003; Gervais 2011).
This had very real consequences for the quality of the rooms that were actually built
or redeveloped in the first eighty-five years of recorded music history. As Newell notes,
‘commercial recording studios put up with a lot of bad rooms until the early 1970s, when
serious efforts were made, on an international scale, to try to find control room designs
which could be relied upon to produce recordings which generally travelled well; both to
the outside world, and between studios’ (2003: 350). Additionally, there was a considerable
gap between the ‘technical research’ and the ‘artistic end’ of recording work (Putnam 1980:
10). Things began to change only with the emergence of speciality acoustic design firms, of
which the best known for specializing in studios included Thomas Hidley/Westlake, George
Augspurger, Northward, John Sayers, Walters-Storyk, Wes Lachot and Michael Cronin/
Blackbird. As control rooms came to take a more significant role in production workflows
during the same period, these firms established iterative studio designs that transformed the
feel and sound of control rooms. The first of these designs was the geometrically controlled
room, designed by Hidley/Westlake and installed in several hundred studios starting in the
mid-1970s.1 Subsequent iterative designs include the RFZ/LEDE (reflection free zone/live-
end, dead-end) room where the front is highly absorbent and the rear diffusive (invented
in 1980), the non-environment room (mid-1980s), FTB (front-to-back) designs (2005),
and MyRoom (2010). All of these sound drastically different, and impose different kinds
of material presences on the users of the rooms. However, they ‘do’ much more than this.
Simultaneously, changes in production workflows, recording sector economics, labour
and the process of making recordings necessitated the creation of smaller home/project
studios. Whereas large-scale commercial studios are characterized by a small number
of iterative designs, project studios are defined in one sense by their uniqueness and
idiosyncrasies (for example, no two rooms have the same dimensions). However, in practice only a limited number of architectural materials actually get used in non-commercial studios. Egg crates stapled to the wall were a hallmark of US punk-rock studios from the
1970s, but the potential of these studios to catch on fire, and the lack of any acoustical
benefit from this particular treatment, fortunately led to the decline in this trend. Auralex,
founded in 1977, was the first speciality acoustic treatment company to serve specifically
the home- and project-studio market, and in 1983 RPG began introducing diffusion
and absorption systems. Of these two, RPG was the more proactive in patenting its
designs (Cox and D’Antonio 2004; D’Antonio 2011), even though many were in essence
improvements or syntheses of ideas that had been invented by others decades in the past.

Organizing objects in space


As noted above, the idiosyncratic approach to architectural acoustics means that studios,
big and small, showcase some of the rooms’ acoustical treatments. Others remain effectively
invisible, such as the neoprene rubber pucks used to decouple a floating floor, or the
resilient channel used to hang the inner leaf of green glued gypsum and/or medium-density
fibreboard panels. While in most other types of room the elements of architecture other
than its ornaments are intended to recede from conscious perception (Kilmer and Kilmer
2014), in studios there is a fundamental tension between this ‘environmental’ tendency
and the overwhelming presence of the studio’s acoustical treatments – which effectively
become perceived as objects within the space rather than constituting the space itself.
Moreover, acoustical treatments and room constructions do not result in rooms
where the reverberant characteristics of sound are experienced consistently everywhere;
rooms have ‘sweet spots’, and even well-designed rooms often have null points or
room modes where significant parts of the frequency spectrum are either inaudible or
amplified as standing waves. Sweet spots often only encompass a small space (especially
in geometrically controlled or LEDE rooms, and any smaller spaces), meaning that
there may only be room for one person at a time to experience the ideal sound quality
within that space. For performing musicians, their instruments or voices will sound very
different if positioned in the centre of a room versus close to and facing a reflective surface
(Zagorski-Thomas 2014: 65). And due to the essentially four-dimensional nature of
sound propagating through space, specific frequencies move through three-dimensional
space and decay differently through time. Thus, room acoustics is not a static quality of
sound, and it additionally leads to the precise positioning of people and objects within
the space. In many (but not all) studios, the chosen studio culture dictates that the ideal
placement of microphone and instrument within a tracking room, defined in relation to
the architecture and acoustical-aesthetic outcomes, takes precedence over the comfort or
spontaneity of musicians.
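A simple illustration conveys the scale of the problem: between two parallel surfaces a distance L apart, axial room modes occur at frequencies of n × c / (2L), where c is the speed of sound (roughly 343 metres per second) and n is a whole number. Between walls five metres apart, the lowest mode therefore sits at about 34 Hz, with further build-ups and nulls at multiples of that frequency, and it is precisely these low-frequency irregularities that acoustical treatment, and the careful placement of listeners, instruments and microphones, attempt to manage.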
Accordingly, studio designers/owners make choices about where to place prominent
technologies (e.g. the mixing console or studio monitors) and furniture (e.g. the ubiquitous
‘producer’s couch’ at the back of the room) that rearticulate precisely where studio workers
of all categories are supposed to be, leading to rooms that, in contrast to many more flexible
workspaces in other professions, discourage being reconfigured and serve to continuously
reinforce gender, class and other sociocultural power asymmetries (Meintjes 2003; Bates
2016: 146). In some studios, both professional and project (Slater 2016), the ‘gear’, defined
here as many but not all audio technologies entailing the use of electricity, overtakes the
space, further limiting the movement of people. Studios in general, across many national
contexts, are characterized by this gear, much of which now rarely gets used and has been
rendered functionally redundant by software plug-in simulations, yet the fetishization of bulky, physical gear continues. In addition to the ways in which gear-as-fetish indexes the
obsessions that studio workers have with gear (Meintjes 2012) or the ‘technoporn’ framing
of studio gear within magazines and advertising (Bennett 2012), we should understand
gear fetishes with regards to ways in which ‘they mystify unequal relations of exchange by
being attributed autonomous agency or productivity’ (Hornborg 2014: 121). In other words,
studios frame gear as objects that mediate relations between people. Oddly, software plug-
ins are not perceived as a form of ‘gear’, even though they nominally do the same thing as
the expensive outboard they replace and can be hoarded like hardware. In nearly all studio
recordings made in the 2000s, the primary capture device is a computer, and this computer
is also used for non-linear editing and hosting the effects plug-ins (in addition to many other
tasks). Almost everything centres on the computer, which is often strategically placed outside
the control room, or if in the control room within an isolating box that reduces the fan
noise and conceals the computer itself, much as larger analogue studios used machine rooms
to isolate the critical recording spaces from the fans of console amplifiers and multitrack
tape machines. Thus, the prominence of the technology within a production workflow may
not correspond with the prominence of the object in the room, contrary to what might be
expected.
While most of the attention in phonomusicological and art of record production
scholarship has focused on those technologies that at some point leave a clear audible
trace on recorded sound (in other words, capture and signal processing technologies),
much of the technology that defines the studio is used, instead, to compensate for the
inter-room isolation and constrained visual connection that studio architecture enforces
(Bennett 2017). Headphones, headphone amps, talkback, cabling, studio monitoring and
cue mixing, which in aggregate I term technologies of audition (Bates 2016: 124), all work
together as a system to enable workers isolated in different rooms to participate in real
time in the production process, with significant limitations, however. While the effects
of headphone listening on musician creativity has received a bit of attention (Williams
2012; Watson 2015: 69), the lack of attention of the other components of these technologies
of audition probably has to do with their status as infrastructure, what Larkin defines as
‘the material forms that allow for exchange over space’ (2008: 5). The disorienting effects
introduced by these technologies, which include altered perceptions of musical event
timing stemming from system latency (Bates 2016: 134) and audible aesthetics and timbral
qualities stemming from transducer proximity, require all studio workers to adapt their
kinesthetic approaches to musicianship and computer use, and to cultivate context-specific
embodied modes of critical listening. In sum, studio architecture leads to the incorporation
of a site-specific configuration of interconnected infrastructural technologies that then
further constrains the sensory and embodied dispositions that workers have towards the
musical production process. Not surprisingly, these substantial preconditions and the
challenges inherent to overcoming them come to limit who actually participates in studio
work – and in which ways.
Positioning studios
Iterative design studios have been installed in numerous countries; the designs may have
‘a’ nominal country of origin (Westlake and LEDE rooms originated in the United States,
Northward invented the FTB design in the Netherlands and Bogic Petrovic co-created the
MyRoom design in Serbia) but are conceptualized as essentially placeless. That is, they
are placeless if one considers only the iterative acoustic aspects, as much of the rest of a
studio’s design results from interaction with a local customer, and studios are built by local
construction crews. Hidley’s largest-budget studio complex was built with the help of Swiss
architects in 1991 in what at the time was the Republic of Bophuthatswana (now incorporated
into South Africa); the control rooms look like ‘classic Hidley’ but are positioned adjacent to
a massive residential complex with its own bar and restaurant that clearly articulates South
African architectural styles (Robjohns 2013). If the prospective customer has the budget and
a suitable location, it is now entirely feasible to build a world-class studio, at least based on
the norms of these iterative designs, nearly anywhere in the world, and this has considerably
lessened the dependence upon US and UK studios outside the Anglophone context.
Within national markets, there are other designs that might be termed ‘semi-iterative’.
For example, I visited several studios in Turkey designed by local acoustician Sorgun Akkor,
which were immediately recognizable from the incorporation of certain construction
materials and visual aspects from foreign iterative studios, but implemented without the
rigour of the iterative design, and fused with the addition of local materials and aesthetic
changes. Rather than being necessarily ‘flawed’ in design, these semi-iterative studios were
often extremely successful in functioning as social workplaces that held together robust
music production networks (Bates 2019). Therefore, while many twenty-first century
home and project studios to an extent replicate models presented in textbooks or in
online discussion fora (e.g. John Sayers’s acoustics forum or the Gearslutz studio acoustics
subforum, both of which have been active since 2003), there is considerable variability in
the extent to which they adopt locally available materials (e.g. woods from indigenous trees)
and reflect the idiosyncratic aesthetics of the studio’s owner(s) or local architectural norms.
While some studios exhibit the ‘placelessness’ or interchangeability that Théberge describes
in relation to what he terms network studios (2004), many do not, instead reflecting local
acoustic practices, construction materials and personal aesthetic preferences.
Recording studios have a peculiar geographical and economic relation to the recorded
music sector. While the activities of record labels have increasingly become ‘virtual’ and
therefore less rooted to any particular place, studios are in effect ‘communities of workers
anchored to particular places’ (Leyshon 2009: 1313). Studios are expensive and space-
intensive facilities, but constitute one of the least-profitable parts of the music sector, and
this appears consistent regardless of the national market.2 Whereas prior to the 1970s most
studios were owned by a single record label, radio station or entrepreneur and were often
used for producing recordings for that label or station, now few studios are label-owned,
with most being owned instead by producers, arrangers, musicians, engineers (of various
sorts), film music composers, beats producers, or other professionals or amateurs who need
considerable access to a studio space for extended periods of time. Following what Leyshon
terms the ‘vertical disintegration’ (2009: 1327) of the production process, studios serving film
companies, major labels and their subsidiaries compete now within an oligopsonic market for
studio time/access (there are many studios but very few ‘customers’), while studios catering to
‘independent’ artists have a somewhat larger, but financially poorer, potential customer base.
However, in many regional contexts the purpose of studios goes beyond economic
considerations or the aforementioned musical-technical work, and studios often function
as charged nodes within broader production networks, even becoming important
destinations or social hang-out spaces. For those who are granted access (either those
involved in musical-technical work, other music industry workers, or close friends or
family members), studios may constitute the hub of local music scenes, and the act of
hosting guests brings considerable social capital to the owners or managers of such spaces.
This became especially apparent in the studios I worked at in Istanbul, where the ability
to perform the cultural institution of misafirperverlik (a Turkey-specific form of providing
hospitality) became a key motivating factor for people to run studios, and the social labour
of this hospitality, more than anything else, maintained the social networks of the recorded
music sector (Bates 2016: 147).3 This connects recording studios to other workspaces that
are also in the business of providing hospitality (restaurants, bars, hotels) but distinguishes
them from other kinds of studios or technical workplaces which lack a similar place-
defining social function. However, in my experience, it is not only the actual recording
rooms that host this social function but the non-recording places as well. Fifty to eighty
per cent of the square footage of Turkish commercial studios may be primarily designated
for hosting guests. Elsewhere, the phenomenon of the ‘residential studio’, which combines short- or medium-term accommodation with a studio complex situated outside of urban
areas, provides another approach. In contrast, songwriting or beat production studios may
not dedicate any space at all for hosting guests and thus are not, at least in this manner,
social spaces, although they are still social spaces in other ways, especially if they are nodes
connected to some sort of distributed or collective recording work.

Production labour and workflow


Production labour has been, at least since the publication of Marx’s Capital in the 1800s,
a key topic in disciplines that we today label economics, economic philosophy, cultural
geography and sociology. As Marx demonstrated in chapter fifteen, in addition to enabling
the production of surplus value, the shift from handiwork manufacturing to factory
production radically transformed the relation between people and the tools/technologies/
machines used to do work, turning workers into a ‘mere living appendage’ to machines
(Marx 1906: 467). Several factors contributed to the recording studio becoming a factory
for music recordings – and those involved with production becoming factory workers,
Marx’s ‘living appendages’. These include the shift of recording labour from a craft mode
of organization done in informal spaces to a more ‘entrepreneurial’ (and therefore more
industrial) one done in formal studios (Kealy 1979), the increase in the technological
complexity of recording spaces, and the site of technological development being relocated
to profit-driven entrepreneurial companies. It is certainly not the case that the larger
commercial studios are inherently more factory-like than small rooms, as, for example, the
thousands of tiny producer-owned rooms that each churn out thousands of film and TV
music cues a year, or the studios that provide mastering or editing services, immediately
appear as factory-like. However, studios are peculiar kinds of factories, involved as they are both with what Leyshon terms ‘emotional labor’ (2009: 1316) and with creating
products that often have a pronounced role in constituting or representing ‘culture’.
Perhaps it is this paradox, between what has been termed the ‘production of culture’
(Peterson 1976) and the ‘cultures of production’ (Fine 1992) that has led to production
being a field of singular interest within US organizational sociology and UK cultural
sociology. Certainly, the production of culture perspective (Peterson and Anand 2004)
is the only sociological paradigm I know of that originated in the problems specific to
studying music-related phenomena and was subsequently applied to numerous other
kinds of organizations and creative industries. But it is not only in sociology that we find
attention on production. As Samantha Bennett and I have argued, production studies have
been pulled by a tension between phonomusicological works (where the object of study is
‘music’ itself), organizational sociological approaches, which only in vague terms cover any
of the labour of recording work, and popular music studies works that (typically) attempt
to valorize the auteurism of technical or social-managerial professions (e.g. producers,
engineers) as evidence of their inherently creative nature (Bates and Bennett 2018). Through
all these works, production is typically discussed in terms of things ‘people do’, but, as
I have hinted already, that provides an incomplete account, specifically with regards to
geographical, topographical, technological, architectural and other spatial concerns. While
we all know that most production happens ‘in studios’, precisely how production is impacted
by the architecture has not been considered to any significant extent. But architecture, and
space more broadly, always have a considerable role in shaping production. For example,
Lefèbvre has discussed the active, ‘operational or instrumental role of space, as knowledge
and action, in the existing mode of production’ (1991: 11). For Lefèbvre, space, ‘in addition
to being a means of production […] is also a means of control’ (26).
Perhaps part of this neglect stems from the difficulty in relating what in effect is a
practice that unfolds through time (production) to an entity we tend to misperceive as
static (architecture). But specifically it was changes in production workflows that led to
the need to change studio architecture. Before the establishment of overdubbing-based
production workflows, which became part of commercial studio work in the United States
in the 1950s and subsequently permeated other national contexts, there was in effect one
recording act: tracking and mixing were part of a singular process of capturing a recording
in real time. With multitrack recording and overdubbing, it became possible to stretch the
temporality of the recording act, which then necessitated a subsequent mixing stage. Tape
(or magnetophone wire) splicing suggested the possibilities of tape editing for corrective
or creative purposes, famously exploited by Canadian pianist Glenn Gould and Miles
Davis producer/editor Teo Macero (Zagorski-Thomas 2014). The complexities of creating
recorded music that would command listener attention when subsequently broadcast
on radio led to more attention on the process of cutting the recording’s lacquer master,
resulting in a practice called mastering. While all these stages of recording production
existed in the 1950s, increasing competition in the market for recorded music, increasing
complexity in each stage of the production process, and increasing ability to capture
sounds encompassing the full range of human hearing led to greater attention on the
specific sociotechnical demands of each of these stages. This in turn led to the design and
construction of dedicated spaces for performing each stage – tracking rooms, control
rooms, cutting rooms, mastering rooms, editing suites and post-production facilities.
It has been these technical problems relating to production, not ‘creative’ concerns, that
have led to changes in studio design and use. In essence, little of what actually happens
in a workday in most contemporary studios is best described as ‘creative’, as on the one
hand contemporary production workflows necessitate considerable mundane computer-
based technical work (e.g. editing, corrective tuning, session template preparation,
software updates) and technical set-up based on successful formulas (e.g. microphone
placement, putting low-cut filters on channels), and on the other so much effort goes into
re-creating aspects of prior recordings (e.g. loudness maximization, sample-replacing kick
drums or vocal repair – see Marshall 2018). However, there is a strong desire in multiple
Anglo-American contexts – within academic, music industry and distributed musician
communities – to reimagine all of this as being especially creative in nature. Having worked
on eighty-something albums and film/TV soundtracks over the past couple of decades in
several countries in nearly every professional role, I can only think of four projects where
even half of the active participants’ time was spent ‘being creative’ (two of these projects
were conspicuously not recorded in a studio), and in over half of the projects perhaps less
than 5 per cent of the working hours in the studio entailed essentially creative labour. Often
the bulk of ‘creation’ had happened prior to anyone arriving at the studio (composition,
arrangement, rehearsals), and studio time was spent attempting to translate as much of
this pre-production labour as possible into a finished product without excessively losing
the creative ‘spark’. My point is not to lessen the importance or status of the non-creative
work that goes into album and film soundtrack production, or to argue that the resulting
products might not become viewed as ‘creative’ by consumers, but rather to illustrate that
studios, by their design, have become places that necessitate a disproportionate amount
of reiterative and repetitive technical work in order to make a product that is deemed
acceptable in the marketplace. This technicity, rather than creativity, is the principal
defining feature of labour within post-1970s studios.

Conclusion
In other industries, there is a conscious attempt to critically interrogate and redesign
architecture in order to facilitate innovation, and we can see evidence of this in spaces
such as Apple Park, Apple’s new US$5 billion technology campus in Cupertino, California,
designed by Foster+Partners (McGuigan 2014). This builds on a many-decades practice of conducting extensive usability studies in the design phase of architecture, which in the
work of ‘subversive’ architect Christopher Alexander led him to conduct psychological
therapy-like sessions with all categories of prospective users of the space to be built,
and culminated in the authorship of his visionary manifesto for architecture that instils
a sense of belonging (what he terms ‘wholeness’) as opposed to alienation (Alexander
2004). Usability, as a concept, doesn’t surface in any of the studio construction manuals
or acoustics journals publications, nor in the published discussion that accompanies the
launch of new recording spaces. Returning to Newell’s discussion, while in certain technical
regards the acoustics of control rooms can be argued to have improved considerably since
the 1970s, the usability of most spaces has not improved at all, and likely it has worsened.
Studio workers ‘make do’ with spaces that are suboptimal: in terms of usability, openness to creativity and sensory aesthetics, and in their excessive focus on technicity. The mess of
the studio, and the mangle of the production workflow (recalling Law and Pickering), is
the ‘natural’ consequence of a mismatch between the stated purpose of studios (making
meaningful music) and the actuality of studio work.
If it is not obvious by now, I see a fundamental ambiguity in post-1970 recording studios.
On the one hand, phenomenal recordings are occasionally produced in studio environments,
and many recordings continue to produce significant effects when they circulate in the
world. The visual iconicity of studios (their distinctive architecture, and the way they house
large and expensive technologies) resonates with some music consumers and stimulates
imagination about the glamour of recording production work. On the other hand, many
notable recording artists, and even some of the best-known record producers, refuse to work
in conventional studios because such spaces are typically anathema to open-ended creativity.
Moreover, many studio spaces, in their design and privileging of fetishized technology and
technicity over creativity, exclude participation by many demographic groups. For example,
the vast majority of commercial studios are coded and understood as male/masculine spaces
and limit or circumscribe participation by women, which has led prominent female music
producers such as Imogen Heap and Juana Molina to work primarily in their own home
studios (Woloshyn 2009; Wolfe 2018, 2019). Further research needs to be done with regards
to other social identities in different local and national contexts as well (e.g. Ottosson 2007;
Wilson 2014). Considering the often radical social projects to which musicians seemingly wish to contribute, studio architecture, infrastructural technologies and the remarkably conservative recorded music industry that nominally provides economic support for it all greatly limit the radical liberatory potential of recorded music.

Notes
1. While some engineers continue to regard Hidley/Westlake rooms as the gold standard,
Hidley himself speaks disparagingly about several design aspects that led to undesirable
low-frequency resonance problems in his pre-1980 rooms (Stewart 2009). He retired
from the business in 1980, but left retirement in 1986 to start a new design company
based in Switzerland that addressed some of the problems of the earlier designs.
Nonetheless, it is his earlier rooms that are the best known.
2. In the 2000s many of the largest and best-known studios closed down, often due to
multi-year economic losses stemming from the declining demand for long-term studio
lockouts. One common euphemism is that such spaces will be torn down and turned into
parking lots. This has actually happened, though, and invariably the parking lot is much
more profitable than the preceding studio.
3. I have also observed this at several of the studios I worked at in San Francisco and studios
I have visited in Europe, although the cultural norms of hospitality differed somewhat in
form.

Bibliography
Alexander, C. (2004), The Nature of Order: An Essay on the Art of Building and the Nature of
the Universe, vol. 1, London: Taylor & Francis.
Bates, E. (2012), ‘What Studios Do’, Journal on the Art of Record Production, 7. Available
online: https://www.arpjournal.com/asarpwp/what-studios-do/ (accessed 22 August 2019).
Bates, E. (2016), Digital Tradition: Arrangement and Labor in Istanbul’s Recording Studio
Culture, New York: Oxford University Press.
Bates, E. (2018), ‘Producing TV Series Music in Istanbul’, in S. Bennett and E. Bates (eds),
Critical Approaches to the Production of Music and Sound, 81–97, New York: Bloomsbury.
Bates, E. (2019), ‘Technological Encounters in the Interculturality of Istanbul’s Recording
Studios’, El Oído Pensante, 7 (1): 145–171.
Bates, E. and S. Bennett (2018), ‘The Production of Music and Sound: A Multidisciplinary
Critique’, in S. Bennett and E. Bates (eds), Critical Approaches to the Production of Music
and Sound, 1–21, New York: Bloomsbury.
Bennett, S. (2012), ‘Revisiting the “Double Production Industry”: Advertising, Consumption
and “Technoporn” Surrounding the Music Technology Press’, in A.-V. Kärjä, L. Marshall
and J. Brusila (eds), Music, Business and Law: Essays on Contemporary Trends in the Music
Industry, 117–145, Helsinki: IASPM Norden; Turku: International Institute for Popular
Culture.
Bennett, S. (2017), ‘Songs About Fucking: John Loder’s Southern Studios and the Construction
of a Subversive Sonic Signature’, Journal of Popular Music Studies, 29 (2): 1–14.
Beranek, L. L. (1962), Music, Acoustics and Architecture, New York: Wiley.
Beranek, L. L. (2004), Concert Halls and Opera Houses: Music, Acoustics and Architecture, 2nd
edn, New York: Springer.
Cox, T. J. and P. D’Antonio (2004), Acoustic Absorbers and Diffusers: Theory, Design and
Application, London: Spon Press.
D’Antonio, P. (2011), ‘RPG Diffusor Systems: Overview of Nearly 30 Years of Research and
Development’, Journal of the Acoustical Society of America, 130 (4): 2388.
Doyle, P. (2005), Echo and Reverb: Fabricating Space in Popular Music Recording, 1900–1960,
Middletown, CT: Wesleyan University Press.
Everest, F. A. (1981), The Master Handbook of Acoustics, 1st edn, Blue Ridge Summit, PA: TAB
Books.
Fine, G. A. (1992), ‘The Culture of Production: Aesthetic Choices and Constraints in Culinary
Work’, American Journal of Sociology, 97 (5): 1268–1294.
Gervais, R. (2011), Home Recording Studio: Build It Like the Pros, 2nd edn, Boston, MA:
Cengage.
Green, L. and J. Y. Dunbar (1947), ‘Recording Studio Acoustics’, Journal of the Acoustical
Society of America, 19 (3): 412–414.
Hennion, A. (1989), ‘An Intermediary between Production and Consumption: The Producer
of Popular Music’, Science, Technology, & Human Values, 14 (4): 400–424.
Hornborg, A. (2014), ‘Technology as Fetish: Marx, Latour, and the Cultural Foundations of
Capitalism’, Theory, Culture & Society, 31 (4): 119–140.
Kealy, E. R. (1979), ‘From Craft to Art: The Case of Sound Mixers and Popular Music’,
Sociology of Work and Occupations, 6 (1): 3–29.
Kenney, W. H. (1999), Recorded Music in American Life: The Phonograph and Popular Memory, 1890–1945, New York: Oxford University Press.
Kilmer, R. and W. O. Kilmer (2014), Designing Interiors, 2nd edn, Hoboken, NJ: Wiley.
Knudsen, V. O. (1938), ‘Some Cultural Applications of Modern Acoustics’, Journal of the
Acoustical Society of America, 9 (3): 175–184.
Larkin, B. (2008), Signal and Noise: Media, Infrastructure, and Urban Culture in Nigeria,
Durham, NC: Duke University Press.
Law, J. (2004), After Method: Mess in Social Science Research, London: Routledge.
Lefèbvre, H. (1991), The Production of Space, translated by D. Nicholson-Smith, Oxford:
Blackwell.
Leyshon, A. (2009), ‘The Software Slump?: Digital Music, the Democratisation of Technology,
and the Decline of the Recording Studio Sector within the Musical Economy’, Environment
and Planning A: Economy and Space, 41 (6): 1309–1331.
Marshall, O. (2018), ‘Auto-Tune In Situ: Digital Vocal Correction and Conversational Repair’,
in S. Bennett and E. Bates (eds), Critical Approaches to the Production of Music and Sound,
175–194, New York: Bloomsbury.
Marx, K. (1906), Capital: A Critique of Political Economy, edited by F. Engels and E. Untermann,
translated by S. Moore and E. B. Aveling, vol. 1, New York: The Modern Library.
McGuigan, C. (2014), ‘Asking Mr Big’, Architectural Record, 16 March. Available online:
https://www.architecturalrecord.com/articles/5846-asking-mr-big (accessed 13 August
2019).
Meintjes, L. (2003), Sound of Africa!: Making Music Zulu in a South African Studio, Durham,
NC: Duke University Press.
Meintjes, L. (2012), ‘The Recording Studio as Fetish’, in J. Sterne (ed.), The Sound Studies
Reader, 265–282, New York: Routledge.
Mol, A. and J. Law (2002), ‘Complexities: An Introduction’, in J. Law and A. Mol (eds),
Complexities: Social Studies of Knowledge Practices, 1–22, Durham, NC: Duke University
Press.
Morris, R. M. and G. M. Nixon (1936), ‘NBC Studio Design’, Journal of the Acoustical Society
of America, 8 (2): 81–90.
Newell, P. R. (2003), Recording Studio Design, 1st edn, London: Focal Press.
Ottosson, Å. (2007), ‘“We’re Just Bush Mob”: Producing Aboriginal Music and Maleness in a
Central Australian Recording Studio’, World of Music, 49 (1): 49–63.
Peterson, R. A. (1976), ‘The Production of Culture: A Prolegomenon’, American Behavioral
Scientist, 19 (6): 669–684.
Peterson, R. A. and N. Anand (2004), ‘The Production of Culture Perspective’, Annual Review
of Sociology, 30 (1): 311–334.
Pickering, A. (1995), The Mangle of Practice: Time, Agency, and Science, Chicago: University of
Chicago Press.
Putnam, M. T. (1980), ‘A Thirty-Five Year History and Evolution of the Recording Studio’,
Audio Engineering Society, preprint 1661.
Robjohns, H. (2013), ‘The BOP Studios Story: Out of Africa’, Sound on Sound, October.
Available online: https://www.soundonsound.com/music-business/bop-studios-story
(accessed 13 August 2019).
Sabine, H. J. (1977), ‘Building Acoustics in America, 1920–1940’, Journal of the Acoustical
Society of America, 61 (2): 255–263.
Schaudinischky, L. and A. Schwartz (1970), ‘The Acoustical Design of Multi-Purpose
Recording Studios in Existing Buildings’, Applied Acoustics, 3 (4): 283–298.
Schmidt Horning, S. (2015), Chasing Sound: Technology, Culture, and the Art of Studio
Recording from Edison to the LP, Baltimore: Johns Hopkins University Press.
Slater, M. (2016), ‘Locating Project Studios and Studio Projects’, Journal of the Royal Musical
Association, 141 (1): 167–202.
Stanton, G. T. and F. C. Schmid (1932), ‘Acoustics of Broadcasting and Recording Studios’,
Journal of the Acoustical Society of America, 4 (1): 44–55.
Stewart, A. (2009), ‘The Name Behind the Name: Tom Hidley – Westlake/Eastlake
Audio’, AudioTechnology 37. Available online: http://www.audiotechnology.com/PDF/
REGULARS/NAME_BEHIND_THE_NAME/AT37_NBN_Tom_Hidley.pdf (accessed
13 August 2019).
Théberge, P. (2004), ‘The Network Studio: Historical and Technological Paths to a New Ideal
in Music Making’, Social Studies of Science, 34 (5): 759–781.
Thompson, E. A. (2002), The Soundscape of Modernity: Architectural Acoustics and the Culture
of Listening in America, 1900–1933, Cambridge, MA: MIT Press.
Veal, M. E. (2007), Dub: Soundscapes and Shattered Songs in Jamaican Reggae, Middletown,
CT: Wesleyan University Press.
Watson, A. (2015), Cultural Production in and Beyond the Recording Studio, New York:
Routledge.
Western, T. (2018), ‘Field Recording and the Production of Place’, in S. Bennett and E. Bates (eds),
Critical Approaches to the Production of Music and Sound, 23–40, New York: Bloomsbury
Academic.
Williams, A. (2012), ‘“I’m Not Hearing What You’re Hearing”: The Conflict and
Connection of Headphone Mixes and Multiple Audioscapes’, in S. Frith and S. Zagorski-
Thomas (eds), The Art of Record Production: An Introductory Reader, 113–127, Farnham:
Ashgate.
Wilson, O. (2014), ‘Ples and Popular Music Production: A Typology of Home-Based Recording
Studios in Port Moresby, Papua New Guinea’, Ethnomusicology Forum, 23 (3): 425–444.
Wolfe, P. (2018), ‘“An Indestructible Sound”: Locating Gender in Genres Using Different
Music Production Approaches’, in S. Bennett and E. Bates (eds), Critical Approaches to the
Production of Music and Sound, 62–80, New York: Bloomsbury.
Wolfe, P. (2019), Women in the Studio: Creativity, Control and Gender in Popular Music Sound
Production, Farnham: Ashgate.
Woloshyn, A. (2009), ‘Imogen Heap as Musical Cyborg: Renegotiations of Power, Gender and
Sound’, Journal on the Art of Record Production, 4. Available online: http://www.arpjournal.
com/asarpwp/imogen-heap-as-musical-cyborg-renegotiations-of-power-gender-and-
sound/ (accessed 13 August 2019).
Zagorski-Thomas, S. (2014), The Musicology of Record Production, Cambridge: Cambridge
University Press.
Part IV
Organizing the Production Process

One of the themes that keeps recurring in the contributions to this volume is the question
of whether (or to what extent) music production is an industrial process rather than a
craft-like creative one. I think that wherever this question does get addressed – either
implicitly or explicitly – the answer seems to be: it depends. There do seem to be two key
components of this issue: the way that the fragmentation of the process involves a clear
division of labour and whether the aim is to create something that is similar to music that
has gone before, i.e. either following existing demand or creating demand for something
new through creativity. In this part, all three contributors deal with the ways in which this
collaborative process is managed and how the division of labour works, although the idea
of creativity is dealt with in very different ways.
So what is the difference between a process of creative collaboration and an industrial
process that, for whatever reason, reduces the creative process to a series of smaller (and
possibly less skilled) tasks? If we step back to examine and define the terminology of that sentence, then we might simply say that an industrial process is not creative. And,
on the one hand, the notion that the motivation in an industrial process is to produce
many similar products for which there is a proven existing demand does support that
idea. But then the question gets more complicated if the industrial process simply makes
many exact copies of the same product or service, i.e. the mass production of a Beatles
recording for sale or streaming. In the same way that selling posters of the Mona Lisa
makes Leonardo Da Vinci part of an industrial process, the same is true for any artist that
makes recorded music for mass distribution. The question of whether we label The Beatles
as art or entertainment makes no difference to their participation in the process. However,
just as Charles and Ray Eames designed highly beautiful and sought-after chairs for the
Herman Miller furniture company in the 1950s and lesser designers have collaborated on
lesser designs for cheaper furniture companies, the design of the product (or the writing
of the track) may not involve a process of conveyor belt mass production and distribution
of labour. And just as participating in the creation of a mass-produced product does not
necessarily mean that there is no artistic or creative activity involved, neither does the
distribution of labour in the song writing and/or music production process. Tuomas
Auvinen takes us through a history of these collaborative and distributed practices within
the world of record production, but these types of collaboration can be found throughout
the history of musical activity. Once again, by questioning the categories that we habitually
use in the descriptions of ‘creative’ and ‘uncreative’ practices we can learn a lot from the
ways things do not fit into the categories as well as from the ways that they do.
Indeed, the nature of novelty and originality in music is highly problematic itself. We
may distinguish between musicking that seeks to satisfy an existing demand and that
which seeks to create demand for something new or different, but that prioritizes quite a
narrow function of the role that music performs around the world. Neither what we might call the industrial/popular approach nor the folkloric approach to music is interested in novelty or innovation, only in performing the correct social function. Zagorski-Thomas argues, in
the first chapter of this volume, for an expanded version of Richard Middleton’s tripartite
theory of the ways in which music creates meaning. That argument suggests that there can
be three ways of demonstrating artistry: in the ability to stimulate empathy or entrainment,
in the ability to suggest subconscious associations and metaphors – particularly between
the sounds made by gestural shapes and emotional narratives – and in the ability to embody
consciously constructed metaphors – music that makes us think. Under this model, the
traditional idea of originality lies in the third of these categories, but it should be obvious
that the minute detail of shaping people’s responses to the first two categories can be
original and certainly requires a high degree of skill. Indeed, much like a conductor, one of
the key functions of a producer (and/or of the musical director in some working patterns)
is to encourage – or, in the terms of Nyssim Lefford’s ecological approach, to configure the
environment so that it affords – the appropriate action from the participants.
And that idea of configuring the environment so that it affords particular types of action
also works on a more macro, temporal and structural level too. All three of these chapters
deal with some aspect of the way that the recording process is structured. In addition to
what is covered here, there are a great many other ways of thinking about the organizing
structure of the process and the ways in which that influences the activity of the participants
and the final outcome. There are theoretical approaches such as Csikszentmihalyi’s systems
approach to creativity or actor-network theory (ANT) – more of a methodology than a
theory per se – that have been used extensively in the literature but there are also many
less formalized ways of thinking about structure. Sessions – or entire projects – could be
thought of in terms of phases of activity which, on the one hand, might be used as externally
defined descriptors, or on the other hand, as individual schema or scripts that users engage
with in order to guide their thought processes and behaviour. And it may well emerge that
different participants are thinking that they are in different phases of activity, for example, ‘I
didn’t know you were recording! I’m still working my part out.’ How are these types of
mental processes identified, used and abused in the organization and management of the
production process? For example, ‘I get some of my best takes from people who don’t know
they are being recorded – they’re more relaxed.’
That idea of managing the process also reaches back to the origins of the A&R (Artist
and Repertoire) department – to someone who puts teams together and puts people
together with repertoire. That may simply involve brokering partnerships between people
who may be expected to work well – the kind of executive producer role – or, as Mike
Howlett mentions in relation to pre-production, may involve putting musicians in the right
studio (or other) environment for a particular project. But it also might involve selecting
a team for their specific and complementary skills – or even, for stimulating a creative
melting pot where various people are given limited information about what is required of
them and are put into exciting (or perhaps stressful) contexts. This, for example, is how
Miles Davis started to work in the studio in the second half of the 1960s, culminating with
the eclectic Bitches Brew album – where some of the musicians did not even recognize
themselves or the project when they heard the finished mix.
And this idea of being placed in a context where the musician is out of their comfort
zone flags up the notion of emotional labour in the recording studio. The conventional idea
of emotional labour is that of synchronizing the emotional response of customers with
the real or acted emotional façade of staff – the compassionate calm of a nurse, the
happy waiter or the formal seriousness of a police officer. The notion of emotional labour
relates to the requirement that staff put on that façade no matter how they
feel in their own lives – and to the emotional dissonance that may arise between those two
mental states. Of course, the same is true of musical performance – of playing a happy
piece when you are sad – but in the studio there is the additional dimension of having a
facilitator, the producer, part of whose job is to create the right mood in the room
so that this emotional labour by the musician is easier. There are a great many charming,
and some more disturbing, anecdotes (possibly even apocryphal) about this kind of emotional
labour in the studio. For example, Norman Whitfield changing the key of ‘I Heard It Through
The Grapevine’ so that the strain in Marvin Gaye’s voice reflected the angst of the lyrics;
associate producers Malcolm Cecil and Robert Margouleff irritating Stevie Wonder by
hiding his tea when they wanted him to deliver a grittier vocal performance for ‘Living
For The City’; or Mike Howlett building a little vocal booth with Persian rugs and scented
candles for Joan Armatrading to get her into the right mood for a vocal performance.

Discography
Davis, Miles (1970), Bitches Brew, Columbia Records.
Gaye, Marvin (1968), ‘I Heard It Through The Grapevine’, Motown Records.
Wonder, Stevie (1973), ‘Living For The City’, Motown Records.
9
Information, (Inter)action and
Collaboration in Record Production
Environments
M. Nyssim Lefford

Introduction
Music production, particularly in the context of the recording studio, involves amalgamating
different performances, both aesthetic and technical, into something that sounds like one
coherent recording to the one audience. Often these performances are created by a group
of collaborators; but even when produced by a multitalented individual, different kinds of
performances, or work, require different working methods and different information to guide how
work gets done (e.g. key, tempo, how loud/soft, feeling/mood, style, etc.). The production
process accommodates these varying information requirements (Lefford 2015); but the
perceptual and cognitive mechanisms that enable different perspectives to cooperate are
not yet well understood. One thing all collaborators (and the shifting perspectives of each
individual) have in common is the working environment of the recording studio. All
contributors, regardless of differences in their perspectives, have typically and historically operated
not only within a shared working context but also within a shared physical space; and for
good reason. The shared physical environment does much to sustain cooperation among
contributors. This chapter unpacks those relationships among the physical characteristics
of the working environment of the recording studio, how individual contributors do their
specialized work there and how a pluralistic group of contributors collaborate to produce
a recording that (hopefully) sounds like a unified whole.

Recorded and recording worlds


‘A symphony must be like a world. It must embrace everything,’ said Mahler (quoted in de
La Grange 1995: 753). As a world, each music composition has unique structural attributes
that emerge from and make sense within its own internal consistencies. As if it is an
ecosystem, the composition allows and does not allow certain kinds of structures to exist
within it. Listeners perceive this internal logic. Music recordings are similarly world-like.
The process of producing music recordings not only brings about and captures musical
performances, but also situates and organizes them within an acoustic space, in realistic
or surrealistic ways, for example, by balancing levels in the mix. Recorded space environs
the varied recorded elements/objects within. Producing, like composing (Moorefield
2010), imposes an arrangement, a logic and physics on sounds, making explicit how they
may exist and interact within the world of the recording. Musical and technical parts and
performances interoperate, each functioning within an organic system of sorts.
Becker (1974) calls the numerous creators required to make these parts an ‘art world’,
a collaboration in which contributors informed by their own unique perspectives and
expertise perform work with the foreknowledge that every work product is part of a complex
assemblage. It takes many perspectives to make a music recording. Each contributor and
contribution fills a niche in the recorded world. In Becker’s art worlds contributors do not
necessarily need to share a physical space, as they often do in music production. Together
in a studio environment, interacting with each other within it, contributors share common
resources. They form something more akin to a natural community where niches not only
coexist in balance but also depend on other niches for their very existence.
Like the biological environment of a natural community, the recording studio environment
physically contains the creative work that happens there. And just as the recording imposes
structure on all of the sonic parts of which it is comprised, so the studio environment imposes structure on
production work and on the (social) interactions among contributors. The studio’s physical
configuration can make it easy to focus on certain characteristics of a performance and/or
it can make it difficult for musicians to hear and respond to others. The studio also environs
technology and musical instruments, and thus it also affects interactions with these objects.
Essentially, the physical attributes of the studio environment shape how varied kinds of
work and perspectives on work come together.
In the natural, biological world, biotic and abiotic entities interact with one another,
forming symbiotic relationships. The environment does more than support life. It
constrains, informs and enables some actions and behaviours, and the growth and change
of all the organisms contained within it. And, vice versa, the actions of organisms also
affect the environment, sometimes profoundly. For example, overgrowth pushes out
other members of the natural community. When we frame collaboration in the recording
studio as an ecosystem, how the studio environment gives rise to collaboration becomes
readily apparent. This approach provides a framework for deconstructing collaborators’
perceptions and links their individual and shared experiences to their behaviours, actions
and creative conceptualizations.

Niches
The perspectives and work of different contributors (instrumentalist, engineers,
vocalists, songwriters, etc.) fill different niches in a production. In order to thrive
in biological environments, organisms carve out niches for themselves through
evolutionary adaptation. Adaptation leads to particular modes of interacting with the
environment, utilizing its resources and developing without competing excessively
with other species for the basic means of survival. Contributors to productions become
specialized through training and experience. Each, from the perspective of their
expertise and practices, learns to interact with and within the production environment,
to utilize its resources for creative work and to coexist and co-work with others sharing
the same environment.
Among collaborators, the producer’s niche is exceptional. The producer facilitates and
integrates the work of those in other niches. The producer is described by Howlett (2012)
as a ‘nexus’ who connects artists, recording technologies and ‘commercial aspirations’.
Zagorski-Thomas positions the producer as a node in a (social) network (Zagorski-Thomas
2014) using actor-network theory (ANT; see Law 1992; Latour 1996). The producer is also
said to function as a ‘boundary object’ (Star and Griesemer 1989), meeting the varying
informational requirements of collaborators individually and collectively (Lefford 2015).
None of these attributions describe the producer’s role comprehensively, and all apply. All
point to a contributor who connects contributors and work products. For that, the producer
needs an exceptional grasp on the multifarious types of work needed to produce a music
recording and, moreover, a subtle understanding of the perspectives of each contributor.
This knowledge puts the producer in a strong position to influence how work in the
studio gets done. All the things producers usually do to facilitate work and collaboration
– adjusting headphone mixes, positioning players in live rooms, iso-booths or the control
room, even directing attention and suggesting associations to guide performances (i.e.
think of it this way) – impact what contributors filling other niches perceive and therefore
the information they have to do their work; and thus how they do their work. Through
means like the ones listed above the producer influences how individuals and work
products connect. But to wield this influence, from an ecological perspective, the producer
needs the physical environment of the studio (or production space).

Production ecology
The ‘ecological approach’ to perception seeks to explain how individuals perceive
information within an environment. It was first proposed by Gibson ([1979] 1986).
Gibson studied visual perception, but his ideas have had a far-reaching impact. There are
numerous ecological explanations of music-making, notably starting with Clarke’s writings
(2005). Zagorski-Thomas and Bourbon (2017) address production specifically and align
ecological explanations of mixing decisions in the studio with ANT. Many ecological
descriptions of creative activity, however, depart significantly from Gibson’s original ideas.
Offered here is a deep ecology of the recording studio. It begins with Gibson’s ideas about
information and perception in an environment. Thereafter the chapter tracks information
from the perceptions of individuals to its sharing among collaborators, and from working
with perceptual to conceptual information. Finally, the producer’s role is revisited through
the frame of ecological perception.

Environments and information


In natural ecosystems, organisms do what they do: (1) because they can, the environment
makes it possible, and (2) because it benefits their survival/development, even if obliquely.
To survive, creatures not only feed, they also defend, mate and cooperate. There is
interaction among conspecifics and between species. Every behaviour contributes not only
to the welfare of the organism but also to the state of the environment, and thus to possible
subsequent behaviours within the environment. The most fundamental influence on the
behaviour of all organisms who perceive or sense their environment is the availability of
information with potential to guide action. Gibson ([1979] 1986) posits that the physical
world we inhabit as organisms is awash with ambient sensory information (i.e. stimuli)
that perceivers either detect or do not detect. Detected information informs action. All
forces that influence detection or perception influence action.
An environment, Gibson ([1979] 1986) states, is comprised of a medium, objects and
energy. The medium surrounds the objects and organisms within. Energy such as light or
sound is perceived when it interacts with the environment’s medium and objects. Ecological
‘events’ happen ‘at the level of substances and surfaces’ (93). Events, interactions, have
structure that the senses detect. These patterns are information. Changes in patterns are
information. In vision, reshaping, repositioning, a shift in orientation and (dis)appearance
result in changing information. In auditory perception, notes begin and decay. A stream of
sound radiating from a source may be dampened by air or obstructed by objects, attenuating
frequencies, before reaching a perceiver. The perceiver’s physical position – distance,
angle, in light or shade – determines what patterns may be perceived. Every percept is
embodied. Bodies, their potential for psychophysical experience and subjectivity, dictate
what is perceived in the environment as well as what is done as a result of perceiving that
information. Varela, Thompson and Rosch (1993) assert a perceiver’s embodiment is its
fundamental connection to its environment. Gibson referred to the combination of factors
that determine what is detected as the perceiver’s ‘vantage’ ([1979] 1986).
The state and configuration of the environment and the perceiver’s vantage interact. The
very presence and the actions of a perceiver in the environment impact the environment,
for example, by occluding an object or otherwise interfering with the flow of perceptible
energy, and therefore information (Gibson [1979] 1986). A guitarist cranking an amplifier
masks the sounds of other instruments. Certain patterns and object properties are salient
from certain vantages. A pianist may hear her piano despite the amplified guitar, but
others in the space may not. Some object properties are ‘invariant’ regardless of a change
in vantage, position or perspective (73).
According to Gibson ([1979] 1986), perceived characteristics and structures are
indicative of an object’s interaction potential or its ‘affordances’. Norman (1988) applies
the affordance concept to the features and utility of designed objects. A door knob affords
turning to open. The recording studio environment is filled with objects that afford
interaction. Zagorski-Thomas and Bourbon (2017) claim, from the ecological perspective,
that ‘there is no separate category for tools or objects being used by the human actors to
make noises’. Eleanor J. Gibson explains: ‘When we perceive the affordance of anything […]
we are perceiving the relation between something […] and its use or value to ourselves’
(2000: 54); and this subsequently informs (inter)actions with it.
Affordances in nature denote things of value to the survival or success of ‘organisms in
an environmental niche’ (Osbeck and Nersessian 2014: 94). Vantage precludes irrelevant
affordances. A perceiver that cannot grasp does not perceive that which affords grasping.
Vantage affords perceptibility. If studio collaborators cannot see each other’s faces or do
not know or care to look, then visual social cues do not afford interaction. Interactions
are opportunistic. The convergence of perceivers’ vantages and values, affordances and, of
course, the environment itself define interaction potential.

Perceiving and acting


‘We must see the organism and environment as bound together in reciprocal specification
and selection’ (Varela, Thompson and Rosch 1993: 174). Every (inter)action, every
behaviour, is an ‘enactment’ (Varela, Thompson and Rosch 1993) of the environment – in the
studio, of the studio environment. Enactments bear the mark of the perceiver’s experiences
(Varela, Thompson and Rosch 1993). The recorded product itself is therefore an enactment
of the environment in which it was produced. Fundamentally, the environment unifies all
work performed there. But ‘perception is not simply embedded within and constrained
by the surrounding world; it also contributes to the enactment of the surrounding world’
(174). The work studio collaborators do creates (sonic) objects and phenomena in the
environment that influence the very structure of the environment and the perceptions of
all within.
Much of this creative work requires interaction with technological objects, musical
instruments, microphones, signal processors, monitors, consoles, etc. Technologies,
through screens or as a result of their operations, provide perceptual information for
doing work. For instance, as Franinović and Serafin explain: ‘When playing a musical
instrument, there is clearly interaction with a physical object, the sound is dependent on
several interactions between player and instrument in complex ways. The player adjusts
such sound by moving different parts of his body in an action-perception loop’ (2013:
242). The studio’s acoustics may enhance this connection. Collaborators’ actions may
interfere. A producer may rebalance a singer’s headphone mix to change how the singer
sings by controlling, maybe selectively impeding, the flow of information. That alters what
is perceived as afforded, since, ‘Our perceptions come into existence through our action,
and the other way around. Action and perception must be understood as one’ (50) and as
inseparable from the perceiver’s subjective vantage and niche.
Perceivers do not perceive passively. They focus attention (on affordances) selectively.
‘Attentional mechanisms select, modulate, and sustain focus on information most relevant
for behavior’ (Chun, Golomb and Turk-Browne 2011: 73). Usually, attention is directed
according to goals (Chun, Golomb and Turk-Browne 2011). Focusing attention is a
way of valuing. Since attentional resources are finite (Chun, Golomb and Turk-Browne
2011), perceivers do not spend them indiscriminately. They have a cost because ‘attention
determines how well the target information is processed’ (75). Experience, specialization
and conditioning, for example, from learning to play an instrument or engineer recordings,
determine what is worthy of attention and what information to extract from those percepts.
Every niche has its perspectives on what needs attention for doing its work.
Attention may be internally directed towards mental representations as well as captured
by salient, external perceptual stimuli (Chun, Golomb and Turk-Browne 2011). Focusing
internally, periodically, perceivers maintain their ‘priorities’ by comparing what is known
or conceived of to what is physically perceived (Lavie et al. 2004). Knowledge (experiential,
tacit and also declarative/propositional, procedural, etc.) pilots attention and thus
perception and thus action. Cognition and perception mutually inform work.

Synecology of the recording studio


What contributors enact individually is the product of integrating what they know
(cognitively) and what they perceive (physically). Each performance is ‘active externalism’
through which, according to Clark and Chalmers (1998), embodied cognition is ‘coupled’
to an ‘external entity’ by means of interaction. In the studio that entity is usually the
recording-in-production, the work-in-progress, to which all collaborators contribute.
By actively externalizing, both cognition (informed by perception) and the entity ‘jointly
govern behavior’ (Clark and Chalmers 1998). Kirsh (2010) has identified several ways that
external representations affect an individual’s cognition, hence direct attention and inform
action. The physicality of an external representation leads to less ambiguous (perceptual)
encoding. ‘Encoding transforms the perceived external stimulus into an internal
representation’ that may be mentally analysed and manipulated (Fiske and Taylor 2013:
59). Externalized representations ‘help coordinate thought’, ‘create persistent referents’ and
‘facilitate re-representation’ (Kirsh 2010: 441) or future performances and reworkings.
External representations also ‘serve as a shareable object of thought’ (441). The
open sharing of performances that contribute to the development of the recording-in-
production is a form of mutualism, the principal act of mutualism in a studio collaboration.
All contributors provide information for other contributors’ work. As a result, musical
and technical performances overtly interoperate. This mutualism roots the synecology
(i.e. the study of relationships within natural communities) of the studio environment.
Collaborative behaviours are essential for working in the studio and the success/survival of
each contributor. The most essential (physical) features of the production environment are
those that sustain mutualism throughout the community of collaborators.
Although collaborators in their niches have distinct expertise and vantages, externalizing
makes possible similar perceptions, common understanding and shared knowledge, which
set a basis for communication about each contributor’s experiences. Externalizing together
increases the likelihood that all can/will attend to similar information, making it easier
to facilitate interaction and coordinate activities among contributors. Unsurprisingly,
attention is often swayed by social expectations and norms and knowledge that affords
social interaction (Fiske and Taylor 2013).
Collaborators are able to share understandings at a perceptual level that transcend
music-specific, local experiences; for example, all collaborators recognize patterns that are
salient for all humans (including audiences). As a species, we share perceptual sensitivities
to certain patterns. Ramachandran and Hirstein (1999) suggest that all human perceivers
learn structural rules such as ‘rectangularity’. The more an object exemplifies a rule, the more
the perceiver responds to perceiving it: a ‘peak-shift’ effect. It follows that any ‘amplification’
of structural rules or enhancement of recognizable patterns produces salient ‘super
stimuli’ (Ramachandran and Hirstein 1999). Dissanayake (2001) stresses the importance
of exaggerated (amplified) patterns in all forms of artistic expression. In music production,
engineers often add signal processing effects to technically amplify or exaggerate gestures.
Collaborators also share conceptual knowledge of art-making. They know that art-
making, like play, involves an ‘elaboration’ on actions, for example, the layering or
reworking of recorded performances. Fittingly, properties that afford elaboration are likely
to be salient to and valued by all studio collaborators. More fundamentally, collaborators
understand that art, like ritual, is a special kind of human social activity (Dissanayake
2001) and therefore has its own rules of engagement. Collaborators share assorted cultural,
musical and non-musical reference points in these regards.
Porcello (2004) shows that professional engineers and producers, to coordinate their
work, use a specialized vocabulary that builds on shared knowledge and experience. Their
words reflect affordances and values that engineers and producers share. Musicians share
another vocabulary. Modes of communicating accommodate particular ‘participation
frameworks’ (Suchman 1996: 52) for engaging in collaboration. The production community
supports multiple participation frameworks.
According to Dourish and Bellotti, collaborators derive information from witnessing
not only externalized work but ‘the way in which [objects are] produced’ (1992: 107):
objects like musical performances. When collaborators share feedback verbally, it ‘makes
information about individual activities apparent to other participants’ (111). Shared
feedback increases ‘awareness’ ‘of the activities of others, which provides a context for your
own activity’ (107). In other words, feedback directs attention. Fussell, Kraut and Siegel
notice collaborators ‘exchange evidence about what they do or do not understand’, and this
‘grounding’ informs subsequent (inter)actions (2000: 21–22).
However, even with grounding and shared feedback, contributors may not always
see evidence for their own actions or the actions of others, for example, when they find
themselves in unfamiliar creative territory. In such cases, studio collaborators often trust
the producer to ground and guide action. As the collaborator responsible for unifying
everyone else’s work, the producer appreciates the ‘character’ of each individual work
product and its ‘significance with respect to the whole group and its goals’. The producer’s
vantage on the entire context is needed ‘to ensure that individual contributions are relevant
to the group’s activity as a whole, and to evaluate individual actions with respect to group
goals and progress’ (Dourish and Bellotti 1992: 107).
Over and above that, the producer’s grounding and mediating of work also regulates
the flow of information and ideas coming into the local context of the studio environment
from outside. Collaborators bring experience and knowledge with them into the studio.
Nersessian (2006) observes the same in scientific research communities where experts
bring to a specific laboratory and shared working context experience and knowledge
of practice from a wider research community. McIntyre (2008) explains that creative
work is produced by agents who themselves incorporate ‘an individual’s personal life
experiences […] their peculiar biological attributes manifest in talent’; the constraints of
a field defined by ‘the social organization, the hierarchy of groups and individuals who
deal with and can influence the knowledge system’ used by those who work within; and
domain-specific knowledge of ‘the symbol system that the person and others working in
the area utilize’.
Studio locales, as Watson notes, ‘are at once insulated spaces of creativity, isolated from
the city outside, and spaces influenced directly by the wider urban contexts in which the
studios operate’ (Watson 2012: 9–10). Rogers maintains that creative practices bound
to a location and situation, nevertheless, ‘operate through networks and flows that link
locations together’ (2011: 663, quoted in Watson 2012: 9). The studio environment is
porous. The producer is instrumental in keeping all the work of a given production
situated.

Situatedness
The studio environment affords many possibilities for shaping how information for work
is detected, processed collectively and otherwise shared. The space is routinely modified
to accommodate different types of work, which often progress simultaneously. Studio
acoustics, the way sound/energy travels through the space, are adjusted with baffles, blankets and
other acoustic treatments. Instruments are positioned because of the way they interact
with the room’s acoustics to bring forth information and perceptual events. The choice
and position of microphones determines what information about an instrument and/or
acoustic environment is available to be perceived and integrated into work products. The
producer more than any other collaborator is likely to direct how the physical space and
objects within are to be (re)configured, even if indirectly, for example, by placing requirements
on what an engineer’s actions should achieve.
More than a workspace, the studio environment is a ‘situational territory’ (Goffman
1971, quoted in Suchman 1996: 37). Its special equipment, configuration and conditions
constrain the very nature of work performed there, all (inter)actions and perceptions
within, thereby ‘situating’ them both physically and conceptually (Suchman 1987). Ceci
and Roazzi suggest that ‘one physical context may elicit [conceptual] strategies that are
not elicited by another’ (1994: 75); as may the social context, for example, if a situation is
serious or not (Ceci and Roazzi 1994).
Suchman’s early research on situatedness describes another shared workspace, an
airport operations room, as ‘a singular unitary locus for the assembly of disparate activities
into a coherent network or whole’ (Suchman 1996: 36); which inarguably captures how
recording studios are used. In Suchman’s airport, ‘the room’s location in an organizational,
spatial, and temporal field sets up the conditions within which the work of operations
gets done’ (36). The recording studio, with its organizational place in the music industry,
its spatial configuration and how it temporally organizes work, sets up the conditions for
creative collaboration.
In the airport operations room ‘each role in the division of labor is […] mapped to a
specific location in the room’ (38). The studio comparably supports the divvying up of tasks,
for example, by separating the control and live rooms. Equally important to the utility of
a shared workspace is how it enables contributors within to alternate between individual
work and interaction, between subjective perspectives and shared awareness, such as when
collaborators move in and out of the control room to share (some of) the engineer’s or
producer’s vantage. Cooperation and ‘joint activity’ requires ‘a congruent analysis of what
is relevant and irrelevant to their common purposes at hand’ (42).
The centralizing in one location of the different types of information needed by
actors for their individual work fosters coordination among actors. Collaborators with
different ‘participation frameworks’ interact via ‘discrimination and recognition of mutual
relevances’ that are apparent because all the information and all actors are in one place
(52). Under these conditions, collaborators may organize themselves physically to signal
(socially) a sharing of perspectives and a ‘shared focus of attention’ (57), and, at the same
time, identify invariants across vantages.
In Suchman’s operations room, personnel engage in ‘the continuous (re)constitution of
the room’s centrality’ (36); centrality being (re)constituted through reconfiguration.
Reconfigurability aids collaboration by affording diversity in vantages and, simultaneously,
means to align vantages or prioritize vantages or invariants (Suchman 1996). The studio
environment can be easily reconfigured by any collaborator but most importantly, and
regularly, by a producer. To facilitate collaborative work around the centralized work
product, producers may organize contributors around instruments, place them in
positions with and without sightlines, isolate individuals, etc. In this way, contributors
with different frameworks for participation may be enabled or indeed discouraged
from interacting with some collaborators or objects while continuing to interact with
other collaborators and objects. Producers manage when, where and how contributors
are to be brought together for concurrent work and interaction in situations of the
producer’s devising. Interestingly, Koszolko reports that producers in networked music
collaborations, with collaborators situated in sundry locations, can find themselves
‘losing control over the mix, which could be hijacked and compromised by fellow
project members’ (2017: 97). Without the recording studio’s shared physical listening
environment, its possibilities to jointly (re)configure the space, co-discover affordances
and share information and understanding of goals and context, it is very difficult to
integrate the work and conceptions of contributors.

World building in a shared conceptual space
Creative collaborations require the sharing of mental/cognitive spaces in addition to
physical and social ones. Like Suchman’s airport control centre, where all contributors’
responsibilities and work products are needed to keep passengers safe, in the studio
all melodic, harmonic and rhythmic elements are interrelated and interdependent. In
practice, it is not the engineer alone who is responsible for the timbre of the guitar. Rather,
the engineer, guitarist and producer combine efforts and, intentionally or in effect, make
collaborative decisions about each niche’s contribution. These work products mutually
benefit all collaborators as well as the recorded production. To coordinate behaviour,
collaborators have to share not only physical work products but also ideas and concepts
about how work might be done, before it is externalized or enacted. Creative collaborators
share conceptual and physical spaces.
Externalizing work products as thinking about them evolves demonstrates how
performances (can) fit in. Shared perceptual experiences and feedback about these protean
performances (afforded by the environment) communally ground understanding and
establish a shared context. Experiences across niches, expertise and vantages are coupled
through shared knowledge of music, the production process, society, culture and the
world at large, and of conventional mental models and schema, all of which can then
guide subsequent work. When attention is directed towards common understanding, the
community can act collectively.
Contributors operate collectively and within their niches concurrently. Creative
collaboration requires constantly alternating between shared and individual experience,
between functioning as an individual contributor and as a community. The environment
affords movement between these two positions, for example, by offering physical spaces
that accommodate large groups working and listening together, and small isolated areas
where individuals direct attention to what is most pertinent to their niche.
Work inside each niche not only bridges perceptual and cognitive domains (e.g. this
is a minor 2nd interval. The sound is dissonant, and dissonance is musically significant)
but also bridges cognitive domains, mentally (e.g. a love song). Both kinds of mental
work guide attention and performances. Bridging cognitive domains involves combining
existing, known concepts for the purposes of creating new, complex structures (i.e. this idea
and that idea together make a new idea). Concepts about or physical sensation from one
domain might be fused with concepts of another kind entirely (e.g. eyes like limpid pools).
Fauconnier and Turner (1996, 1998a) call this cognitive process ‘conceptual blending’,
whereby the ‘integration of related events’ or concepts turns into complex, perhaps ‘novel
conceptual structures’ (Fauconnier and Turner 1998a: 3). In a blend, each known structure
or ‘input space’ has structural attributes or ‘elements’ that are ‘projected’ onto a blended
space. Typically, only select elements align, and ‘partial projection[s]’ (Fauconnier and
Turner 1998b: 145) are assimilated into the blend (i.e. eyes are like pools in some respects).
‘Blending exploits and develops counterpart connections between inputs’ (Fauconnier and
Turner 1996, 1998a: 3), opportunistically matching affordances in each. ‘Taken together,
the projections from the inputs make new relations available that did not exist in the
separate inputs’ (Hutchins 2005: 1556).
So that each input is fully comprehended, blending ‘recruit[s] structure from more stable,
elaborate, and conventional conceptual structures’ (Fauconnier and Turner 1996, 1998a:
3–4), for example, commonplace schema or perhaps perceptual invariants. ‘Conceptual
structure must be represented in a way that allows some parts of the representation to be
manipulated, while other parts remain stable’ (Hutchins 2005: 1557). Zagorski-Thomas
(2014) offers that conceptual blending helps to explain meaning-making in recorded
music. A melody that descends in pitch is, in some respects, like things that fall physically.
There are also falls from grace. In music production, stable conceptions of reality become
affiliated with recorded representations. But, a fall from grace may be significant in myriad
ways. Blending involves picking out the ways that are relevant and stable in the local context.
The product of conceptual blending is a new ‘mental space’ or kind of structure that draws
from ‘conceptual domains and from local context’ (Fauconnier and Turner 1998b: 157).
Locally, collaborators build a shared context. Collectively grounding experiences
(verbally) and feedback makes concepts common across the community available for
blends, potentially blends that pull from both shared concepts and specialized personal
knowledge. New conceptual structures may be discussed again verbally; and, in these
sorts of exchanges, which happen frequently in production, studio collaborators often
use metaphors and analogies to relate conceptual and physical domains. Johnson (2002)
observes scientists doing the same, using metaphors to link observed phenomena with
explanations. ‘Metaphorical mappings are what allow us to carry over knowledge and
inferences within the source domain to knowledge and inferences in the target domain’
(17), from the mind to shared perceptual experiences of performances and vice versa.
Collaborators also share new structures simply by externalizing performances.
Hutchins (2005) suggests conceptual blending often recruits material anchors, physical
objects with perceptually accessible attributes. Material is very stable, and conceptual
and material structures are easily associated; as is illustrated in all forms of art, whenever
an artefact represents cultural models. Blending can also link an anchor’s materiality to
experiences of ‘bodily interaction with the physical world’ (1560). For example, Hutchins
suggests that ‘queuing’ is the product of blending the conceptual and physical. Queuing is
a widely understood cultural practice that involves serving customers in sequential order
based on time of arrival and the (physical) structure of a line. Customers arrange their
bodies in a sequence representing a line. Gestalt perception then groups individual bodies.
The blend, as a new and comprehensible structure, provides a basis for further ‘elaboration’,
for example, deliberately positioning line occupants (Hutchins 2005). An elaboration in a
musical scenario might involve a descending, melancholy melody picked up by different
instruments, each using the same basic musical material to enact melancholy from a
particular vantage.
If collaborators do not share a local context they are less likely to share similar
understandings of each other’s blends. The studio environment, by environing and
contextualizing both potential blend inputs and the perceivers of those structures, affords
shared understanding. Given these conditions, with so much information, there are
many stable inputs to recruit and many possible artistic elaborations (Dissanayake 2015)
grounded in mutual understanding.
Any collaborator in the studio may verbally describe an input or a blend. Any
collaborator’s actions might alter a shared recorded artefact and thus how it serves as a
material anchor for subsequent blending. Nevertheless, given the producer’s appreciation
for the vantages of each actor, the producer is uniquely equipped to suggest stable inputs
for blending. The producer is best able to constrain, frame or facilitate the blending work of
individual contributors or groups; for example, by providing a ‘vision’ for the production
(Anthony 2017). The producer, more than any other collaborator, has opportunities to
control how the materiality of the recording-in-production is perceived, by directing what
is heard through the control-room monitors, what tracks are muted and unmuted, their
balance and signal processing, and therefore what attributes and affordances are most salient
to fellow collaborators. Of course, the producer also directs attention verbally. From both
perceptual and conceptual or cognitive perspectives, the producer exerts a prodigious
amount of control over how information flows through the production environment from
percept to concept; if, when and how it is detected; and how it is cognitively processed
and used for work. Ecologically speaking, the producer’s ability to direct productions
comes not from the position’s presumed authority but, rather, from the producer’s ability
to successfully frame the perceptual and cognitive experiences of collaborators.

The producer as mediator


The producer’s control over the perceptual environment, over information and what is
detected as afforded and what is valued, ultimately defines all the behaviours that emerge
within. Producing, then, it can be said, is a process of shaping the vantages of varied
contributors within a shared recording environment, so that the work products produced,
individually and collectively, all similarly exude characteristics of that ecosystem – they
all appear to the listener to belong together and interoperate. All work (or actions)
within an environment manifest attributes of having been similarly environed; but not all
environments have a producer who mediates the environmental medium, and as a result,
all experiences of ecological events within.
Like that of any other collaborator, the producer’s work involves integrating perceptual and
conceptual/cognitive information/knowledge. However, the producer’s concepts are not
limited to a particular niche or kind of work. The producer integrates an overarching conceptual
view of the production with the properties of the local context. The producer’s vantage
and thus enactions of the environment are unlike those of other collaborators and exert a
disproportionate influence on the environment. The enactions of other collaborators are to a
great degree enactions of an environment if not created then heavily shaped by the producer.
Collaborators do what they do – individually and collectively – because they can in the
environment and context the producer makes possible. From an ecological perspective,
it can be said that many problems in collaboration stem from confusion around the
perceptions of different vantages or informational requirements being met unevenly.
The producer, as designated mediator, not only facilitates communications but prevents
environmental conditions that lead to breakdowns.
Collectively, a studio-based music production’s ‘creativity […] is the emergent property
of a system at work’ (McIntyre 2016: 18). The uniqueness of each recorded production
is reflective of the state of the environment and the vantages of collaborators at the time
actions were taken. The producer’s actions, enacted from the producer’s vantage, not only
provide the necessary environmental conditions for creative collaboration but, more
specifically, the structure that defines this emergence for every production.

Conclusion
In the studio, each contributor is in collaboration with the environment, the technology and
the other contributors, any of which may be influenced by controlling the availability of information.
By controlling or mediating the flow of information, the producer exerts influence over the
physical properties of each collaborator’s actions and work and the recorded artefact as a
whole. The producer:
● Modifies the environment, which changes the ambient information available for
perception and thus work.
● Physically and conceptually focuses collaborators’ attention on information and
affordances.
● Facilitates and regulates collaborators’ (inter)actions and the process of
collaboration.
● Structures situations for both independent and collaborative working, making it
possible to amalgamate distinct work products into a single recorded artefact.

As a space for collaborative work – configured by a producer – the studio exerts a unifying
force over all work performed there because all actions are enactions of that environment.
As collaborators modify and manipulate recorded objects in their individual ways,
externalizing their cognitive work, other perceivers/collaborators are able to share vantages
and understanding of the context, which subsequently inform their own actions. Though it is
possible to generalize about production decisions and common practice in the studio, actions
cannot be thoroughly explained without incorporating the working environment in which
they are situated; an environment that environs not only technology but other perceivers
who share resources, and also create resources and information for all collaborators to share.

Bibliography
Anthony, B. (2017), ‘The Producer’s Vision: A Study into the Multi-Faceted Cognitive Design
of the Popular Music Recording Aesthetic’, 12th Art of Record Production Conference:
Mono: Stereo: Multi, Stockholm, 1–3 December.
Becker, H. (1974), ‘Art as Collective Action’, American Sociological Review, 39 (6): 767–776.
Ceci, S. and A. Roazzi (1994), ‘The Effects of Context on Cognition: Postcards from Brazil’, in
R. J. Sternberg and R. K. Wagner (eds), Mind in Context, 74–101, New York: Cambridge
University Press.
Chun, M., J. Golomb and N. Turk-Browne (2011), ‘A Taxonomy of External and Internal
Attention’, Annual Review of Psychology, 62: 73–101.
Clark, A. and D. Chalmers (1998), ‘The Extended Mind’, Analysis, 58 (1): 7–19.
Clarke, E. (2005), Ways of Listening: An Ecological Approach to the Perception of Musical
Meaning, New York: Oxford University Press.
de La Grange, H.-L. (1995), Gustav Mahler, vol. 3, Vienna: Triumph and Disillusion (1904–
1907), Oxford: Oxford University Press.
Dissanayake, E. (2001), Homo Aestheticus: Where Art Comes From and Why, Seattle, WA:
University of Washington Press.
Dissanayake, E. (2015), What is Art For?, Seattle, WA: University of Washington Press.
Dourish, P. and V. Bellotti (1992), ‘Awareness and Coordination in Shared Workspaces’,
in Proceedings of the 1992 ACM Conference on Computer-Supported Cooperative Work,
Toronto, 1–4 November, 107–114, New York: ACM.
Fauconnier, G. and M. Turner (1996), ‘Blending as a Central Process of Grammar’, in A.
Goldberg (ed.), Conceptual Structure, Discourse, and Language, 113–129, Stanford, CA:
Center for the Study of Language and Information (CSLI), distributed by Cambridge
University Press.
Fauconnier, G. and M. Turner (1998a), ‘Blending as a Central Process of Grammar [expanded
version]’. Available online: https://www.cc.gatech.edu/classes/AY2013/cs7601_spring/
papers/Fauconnier_Turner.pdf (accessed 18 February 2018).
Fauconnier, G. and M. Turner (1998b), ‘Conceptual Integration Networks’, Cognitive Science,
22 (2): 133–187.
Fiske, S. and S. Taylor (2013), Social Cognition: From Brains to Culture, Los Angeles: Sage.
Franinović, K. and S. Serafin (2013), Sonic Interaction Design, Cambridge, MA: MIT Press.
Fussell, S., R. Kraut and J. Siegel (2000), ‘Coordination of Communication: Effects of Shared
Visual Context on Collaborative Work’, in Proceedings of the 2000 ACM Conference on
Computer Supported Cooperative Work, Philadelphia, PA, 2–6 December, 21–30, New York:
ACM.
Gibson, E. (2000), ‘Where Is the Information for Affordances?’, Ecological Psychology, 12 (1):
53–56.
Gibson, J. ([1979] 1986), The Ecological Approach to Visual Perception, Hillsdale, NJ: Lawrence
Erlbaum Associates, Inc.
Goffman, E. (1971), ‘The Territories of the Self ’, in Relations in Public, 28–61, New York: Basic
Books.
Howlett, M. (2012), ‘The Record Producer as Nexus’, Journal on the Art of Record Production,
6. Available online: https://www.arpjournal.com/asarpwp/the-record-producer-as-nexus/
(accessed 22 August 2019).
Hutchins, E. (2005), ‘Material Anchors for Conceptual Blends’, Journal of Pragmatics, 37 (10):
1555–1577.
Johnson, M. (2002), ‘Metaphor-Based Values’, in L. Magnani and N. J. Nersessian (eds),
Scientific Models in Model-Based Reasoning: Science, Technology, Values, 1–19, New York:
Kluwer Academic/Plenum Publishers.
Kirsh, D. (2010), ‘Thinking with External Representations’, AI & Society, 25: 441–454.
Koszolko, M. (2017), ‘The Impact of Remote Music Collaboration Software on Collaborative
Music Production’, PhD thesis, RMIT University, Australia.
Latour, B. (1996), ‘On Actor-Network Theory. A Few Clarifications Plus More Than a Few
Complications (English Version)’, Soziale Welt, 47: 369–381.
Lavie, N., A. Hirst, J. W. De Fockert and E. Viding (2004), ‘Load Theory of Selective Attention
and Cognitive Control’, Journal of Experimental Psychology: General, 133 (3): 339–354.
Law, J. (1992), ‘Notes on the Theory of the Actor-Network: Ordering, Strategy, and
Heterogeneity’, Systems Practice, 5 (4): 379–393.
Lefford, M. N. (2015), ‘The Sound of Coordinated Efforts: Music Producers, Boundary
Objects and Trading Zones’, Journal of the Art of Record Production, 10. Available online:
https://www.arpjournal.com/asarpwp/the-sound-of-coordinated-efforts-music-producers-
boundary-objects-and-trading-zones/ (accessed 22 August 2019).
McIntyre, P. (2008), ‘The Systems Model of Creativity: Analyzing the Distribution of Power
in the Studio’, Journal of the Art of Record Production, 3. Available online: https://www.
arpjournal.com/asarpwp/the-systems-model-of-creativity-analyzing-the-distribution-of-
power-in-the-studio/ (accessed 22 August 2019).
McIntyre, P. (2016), ‘General Systems Theory and Creativity’, in P. McIntyre, J. Fulton and
E. Paton (eds), The Creative System in Action: Understanding Cultural Production and
Practice, 13–26, London: Palgrave Macmillan.
Moorefield, V. (2010), The Producer as Composer: Shaping the Sounds of Popular Music,
Cambridge, MA: MIT Press.
Nersessian, N. (2006), ‘The Cognitive-Cultural Systems of the Research Laboratory’,
Organization Studies, 27 (1): 125–145.
Norman, D. (1988), The Design of Everyday Things, New York: Doubleday Currency.
Osbeck, L. and N. Nersessian (2014), ‘Situating Distributed Cognition’, Philosophical
Psychology, 27 (1): 82–97.
Porcello, T. (2004), ‘Speaking of Sound: Language and the Professionalization of Sound-
Recording Engineers’, Social Studies of Science, 34 (5): 733–758.
Ramachandran, V. S. and W. Hirstein (1999), ‘The Science of Art: A Neurological Theory of
Aesthetic Experience’, Journal of Consciousness Studies, 6 (6–7): 15–51.
Rogers, A. (2011), ‘Butterfly Takes Flight: The Translocal Circulation of Creative Practice’,
Social & Cultural Geography, 12 (7), 663–683.
Star, S. L. and J. Griesemer (1989), ‘Institutional Ecology, “Translations” and Boundary
Objects: Amateurs and Professionals in Berkeley’s Museum of Vertebrate Zoology, 1907–
39’, Social Studies of Science, 19 (3): 387–420.
Suchman, L. (1987), Plans and Situated Actions, New York: Cambridge University Press.
Suchman, L. (1996), ‘Constituting Shared Workspaces’, in Y. Engeström and D. Middleton (eds),
Cognition and Communication at Work, 35–60, Cambridge: Cambridge University Press.
Varela, F., E. Thompson and E. Rosch (1993), The Embodied Mind: Cognitive Science and
Human Experience, Cambridge, MA: MIT Press.
Watson, A. (2012), ‘Sound Practice: A Relational Economic Geography of Music
Production in and Beyond the Recording Studio’, PhD thesis, Loughborough University,
Loughborough.
Zagorski-Thomas, S. (2014), The Musicology of Record Production, Cambridge: Cambridge
University Press.
Zagorski-Thomas, S. and A. Bourbon (2017), ‘The Ecological Approach to Mixing Audio:
Agency, Activity and Environment’, Journal of the Art of Record Production, 11. Available
online: https://www.arpjournal.com/asarpwp/the-ecological-approach-to-mixing-audio-
agency-activity-and-environment-in-the-process-of-audio-staging/ (accessed 22 August
2019).
10
Creative Communities of Practice:
Role Delineation in Record
Production in Different Eras
and across Different Genres and
Production Settings
Tuomas Auvinen

Introduction
Record production is a multifaceted process that consists of different activities and
temporally consecutive and/or concurrent phases. These activities have often been
distributed among specialized individuals who form what in record production has been
called the creative collective (Hennion 1983). From a historical perspective, however, these
activities have been distributed differently at different times and eras, in various genres,
and they have taken different shapes depending on the individual project. The distribution
of activities and the form of the activities themselves again have been largely influenced
by opportunities and possibilities afforded by technological changes and developments. In
this chapter, my aim is, firstly, to identify and examine a generic set of roles and activities
that have been a part of record production throughout its 100-year history and across all
genres, and secondly, to discuss how these activities have been distributed among different
human agents in different eras and in various production settings. My aim, therefore, is
essentially to examine and discuss the division of labour in the creative process of record
production. Despite the growing influence of artificial intelligence (AI), record production
in the twenty-first century still remains predominantly a collaborative technological
pursuit of human agents. This study is partially a summarized derivative of earlier research
on the subject and is partly based on my original ethnographic research on production
projects of popular, classical and rock music in Finland. I have conducted this research
162 The Bloomsbury Handbook of Music Production

between 2015 and 2017. This, I hope, will on its own part provide a non-Anglophone
contribution to the body of research, which has traditionally been strongly dominated by
scholars from or based in the English-speaking world. I will first outline some premises of
the nature of record production. Then I will outline the different activities and phases of a
production project and discuss how each activity has been delineated in different eras and
genres. Finally, I will take a look at how role delineation plays out in contemporary record
production and provide some insight into how I see this developing in the future.

Premises: The creative collective


Record production has generally been understood as a collective process. Antoine
Hennion’s (1983: 160) concept, the creative collective, has been regarded as somewhat of
a premise when discussing the social nature of the record production process. Hennion
argues: ‘The creative collective, a team of professionals who simultaneously take over all
aspects of a popular song’s production, has replaced the individual creator who composed
songs which others would then play, disseminate, defend or criticize’ (160). Here, Hennion
seems to contrast the creative process of record production with musical processes before
record production. Agents with different roles need to work together instead of working
separately for separate aims. Elaborating further:
This team shares out the various roles which the single creator once conjoined: artistic
personality, musical know-how, knowledge of the public and of the market, technical
production and musical execution have become complementary specialisms and skills
performed by different individuals. (160–161)

Albin Zak offers a similar perspective on the record production process:


Making records is intrinsically a collaborative creative process, involving the efforts of a
‘composition team’ whose members interact in various ways. As a matter of form, the ‘artist’
on a recording is usually the person or group who receives top billing on the album cover,
but in fact most of the tasks involved in making a record require some measure of artistry.
(2001: 163)

Consequently, creativity in the record production process must be understood as distributed
and taking place as a result of social interactions between the different agents involved. It
seems that record production still is comprehended as a collective effort despite the possibility
afforded by digital technologies for one person to take over several or all of the activities.
What must be noted here, however, is that Hennion’s views are based on his research into
the shift from ‘the illusion of reality’ to the ‘reality of illusion’, when record production
became an art form of its own through multitracking and ‘tapification’ (Moorefield 2005;
Schmidt Horning 2013: 171–172), not during what I would consider the full bloom of the
digital revolution in music production technologies along with the digital audio workstation
(DAW) and visual displays (e.g. Williams 2012). Also, the ‘composition team’ mentioned
by Zak is a concept used by ‘the fifth Beatle’ George Martin (Williams 2012), whose career
began in the 1950s and who is perhaps best known for his pioneering production work in
the 1960s. I would argue that the digital revolution has, to some extent, reversed the process
of collective creativity in record production. I will return to this argument later.

The roles in record production


From a creative perspective, making a musical record consists of five phases or activities.
They are songwriting (or composing), arranging, performing, engineering and producing
(Zak 2001: 164). These are what Zak calls the ‘nominal categories of contributors’ that are
usually written onto an album’s cover. The boundaries of these categories are nevertheless,
as Zak puts it, ‘fluid and tasks often merge or overlap’. Zagorski-Thomas (2007: 191)
and Moore (2012) provide a similar list of stages in most production projects but omit
‘producing’ from the list. This might reflect the notion that ‘Producers may act as arrangers,
performers, songwriters, or engineers’ (Zak 2001: 164). Especially in the contemporary
setting, producing can easily be understood as influencing all the essential categories of
contribution to various degrees depending on the genre and individual project. Furthermore,
as I will argue later, producing as a category might even absorb all other actions.
All of these necessary phases of music production still need human agents as
contributors, regardless of the recent developments in AI in the field of, for instance,
mastering (which should be understood as a subcategory of engineering). The nominal
categories of contribution can be understood to transcend boundaries of musical style,
genre and historical era of music production (or record production). All of them need
to happen in one form or another at one phase or another if a musical record is to be
produced. However, the degree to which the boundaries of the categories intertwine and
overlap can be understood as an indication of differences in genre values. Also, the changes
in music production technologies have contributed to the changes in role delineation.
From an industrial perspective, however, a managerial or leadership role is always
required. Someone has to oversee the overall process and make sure that, as Howlett (2012)
puts it, the ‘creative inspiration of the artist, the technology of the recording studio, and the
commercial aspirations of the record company’ converge in a way that is both artistically
sufficient and economically viable. Roughly speaking, before the 1950s, the producer’s role
was mainly managerial (e.g. Moorefield 2005). They would be hired by the record company
to decide which records were going to be made and to oversee the financial, creative and
logistical aspects of a production project. The contemporary role of the music producer
evolved from this role, in the case of major record labels, and from entrepreneurs who
ran their own small businesses (see Schmidt Horning 2013). In the contemporary music
industry, the A&R (artists and repertoire) person might be closest to the old managerial
record producer. The A&R is an employee of the record company who oversees the
financial aspects as well as the schedules and deadlines of a production process, whereas the producer, who handles aesthetic decision-making, often works as an independent entrepreneur (see, for example, Burgess 2013: 15, 17; Auvinen 2018: 74). This is the case
especially in popular music, although it is becoming increasingly common in, for example,
classical music (see Blake 2012: 195).

Songwriting/composing
For music to be produced into a recording, a song or a piece of music must be written
or composed. Here, I will include writing lyrics into this category, even if it can be
regarded as a separate process especially in subgenres of Western classical music (e.g.
opera). Traditionally, this category of contribution would take place in the beginning of
the process and can also be understood as ‘the reason most commercial recordings in the
world of popular music are taking place at all’ (Howlett 2012). This, I would argue, is true
especially in record production before the 1950s, when record production mainly aimed at
producing an illusion of reality (Moorefield 2005: xiii–xiv).
During this era, the aim of record production was to capture a live performance onto a
recording medium, which then could be sold to audiences. Before music can be performed
and the performance captured onto a recording medium, a musical structure must exist for
the musician to perform. Traditionally, a composition or a song would exist as a musical score,
also an item to be copied and sold. This was, and largely remains, the case especially in Western
classical music, excluding technology-based genres such as musique concrète and perhaps, for
instance, some subcategories of minimalism. This can also be understood as having been the
case with early traditions of popular music such as Tin Pan Alley songs, early jazz and ragtime.
Composers (or songwriters) would form their own group of agents separate from other agents
involved in record production. Alternatively, the artist performers would compose their own
music. However, as record production in the 1950s took steps in the direction of becoming an art form in itself, rather than merely a reproduction of an artistic event (xiii–xiv), the first steps towards
integrating composition with the technological process were taken. In the 1960s, when artists
started spending more and more time in the studio or even doing much of their compositional
work at the studio (e.g. Schmidt Horning 2013: 208), the technological processes of
engineering became integrated with the compositional process. The artists and the engineer/
producer formed a ‘composition team’ (Zak 2001: 163). Practices pioneered by, for example,
the producer Brian Eno resulted in a full integration of engineering and composing as his ideas
made the studio into a musical instrument (Moorefield 2005: 53–54). Today, this is the case
with many producer-driven genres, typically based on digital technological practices. In much
of popular music today, the producer is also the songwriter/composer.

Arranging
Arranging as a category of contribution, at least in copyright terms, is understood as what
backs up the melody, i.e. the composition. It is regarded as what succeeds the compositional
process and accounts for ‘the rest’ of the music: the rhythm tracks, the chords, the harmony.
Consequently, at least according to the current copyright law in Finland, the arranger is
entitled to a maximum of 16.67 per cent of the overall copyright fee of a composition
or song (Teosto 2016: 16).1 This understanding of the arrangement is questionable, to put it mildly, especially in contemporary music production based on the DAW studio process, where the arrangement might constitute a great share of a track's meaning.
Hennion writes as early as 1983: ‘The real music of the song hides behind the melody and
gives it its meaning. The audience only notices the melody and thinks it is the tune itself it
likes’ (1983: 163). Furthermore, in contemporary music production, the arrangement or,
in other words, a ‘backing track’ might be what inspires the writing of the melody to begin
with (see Bennett 2011).
In the traditional model of record production, arrangers constituted their own
professional group separate from other categories of contribution. Accounts of music
production before the DAW handle arranging as a separate practice taking place prior to
the recording process and often in the form of a written full score even in non-classical
music (e.g. Swedien 2009: 2, 15–16). Furthermore, within smaller music markets, arranging remained completely separate from producing until comparatively late. For example, in Finland,
the producer as a credited contributor in the production of a musical record emerged as
late as the 1970s. Prior to this, agents who engaged in activities that are now associated with
producing were called ‘arrangers’ (Muikku 2001: 308).
However, of all the nominal categories of contribution in music production, arranging as a separate activity might have been most profoundly affected by the development started by the shift from the 'illusion of reality' to the 'reality of illusion'
(Moorefield 2005: xiii–xiv). In contemporary popular music production, especially in
the ‘pop and urban’ genres (see Burgess 2008; Auvinen 2018: 92), the arrangement is
often taken care of by a producer or producers (Auvinen 2016; Auvinen 2017), or at least
the producer greatly influences it if they are working with an artist or a band. Initially,
producers were able to take over the arrangement through multitracking, tape-splicing
and overdubbing processes. However, the introduction of computer sequencing by means
of, for example, Fairlight and Synclavier influenced this change in a much more dramatic
way and strengthened the arranging capacities of the producer; one no longer had to have
as much formal knowledge of music theory to be able to write arrangements. Musical
Instrument Digital Interface (MIDI) programming took this development even further.
The latest developments in digital editing and the visual arrange window have finalized the producer's role as an arranger. Firstly, digital editing gives the opportunity
to treat recorded (or programmed) sonic material as raw material for later use (Auvinen
2016; 2017). This rearranging of sonic material during or after a studio session can be
understood as arranging. Secondly, the visual arrange window, as a metaphor for the traditional musical score, gives the agent using the technology (engineer, producer) an
opportunity to comprehend the entirety of the musical arrangement in the same way
a traditional composer or arranger would, without requiring knowledge of traditional
Western notation. The change in the delineation of the activity of arranging in the music
production process is perhaps best visible in technology-based producer-driven genres
such as rap or electronic dance music (EDM). Here, the producer is responsible for the
backing tracks – the arrangement – although in the absence of an actual melody (as in rap) it is hard to distinguish arrangement from composition in the first place.
Another way to look at why contemporary producers usually engage in arranging is
from a financial perspective. Burgess (2008) discusses how producer compensation has fallen since the sales figures of physical records plummeted and album budgets consequently shrank. Although writer-producers have been around for
a long time (Burgess 2008), it may now be the case that a producer of popular music must,
among other things, have at least a slice of the copyright fees of a song in order to sustain
a financially viable career.

Performing
Performing is perhaps the least male-dominated category of contribution in record production. This arguably also reflects how rarely other activities, for example producing, have been distributed to female agents. While this is changing and there are well-
known exceptions, many female producers are artists who produce their own music (e.g.
Wolfe 2012). I would argue that performance has remained somewhat of an independent
category of contribution in record production regardless of era or genre, excluding fully programmed electronic music with no vocals. The performing artist is, after all, the
one who audiences tend to focus on, the one receiving the ‘top billing on the album cover’
(Zak 2001: 163) regardless of who did most of the work in the production process. At the
very least members of the audience need to think that the artist whose name a recording
carries has actually performed their own sonic parts. Also, vocal performances still seem sonically irreplaceable for producer-programmers, despite current breakthroughs in vocal synthesis. On the other hand, musicians performing on records can be
understood as ‘extensions of the producer’ or their vision. Phil Spector, for example, had
a ‘group of regulars’ who came into the studio to play according to his vision (Moorefield
2005: 10).
As I understand it, the greatest change in the delineation of performance activities in
record production is a result of sound synthesis and the digital revolution. The introduction
of synthesizers with their vast sound banks allowed one agent to produce sounds that they
would not have had the skills to produce through traditional musicianship. Virtual plug-in synthesizers integrated into sound-recording software in a DAW production setting completed
this development by essentially distributing the performance activities from musicians to the
producer in genres such as rap and the myriad different kinds of EDM; the producer largely
builds a simulation of a performance through programming. This profoundly problematizes
the very idea of performing in musical records. Classical musical records still largely
remain simulations of an ideal performance (Blake 2012) achieved through heavy editing
of bits of real recorded performances. In many styles of popular music, however, records
are simulations of performances that never happened as sonic events. These programmed
simulations can then be performed live by the artist-producer (Moorefield 2005).

Engineering
Of the nominal categories of contribution in record production, engineering is perhaps the
easiest to divide into subcategories. These are recording, editing, mixing and mastering.
The reason they are easily recognizable is that they largely do not remain merely nominal
but actually have manifested and still manifest themselves as different and separate
activities each with their own specific aims and purposes in the record production process.
Consequently, the delineation of these activities has remained separate to a greater degree
than, for instance, composing and arranging.
According to Susan Schmidt Horning, the early recording engineers, or ‘recordists’
and ‘recording experts’, were a separate category of agents for a clear reason (2013: 29).
Recording technologies did not come in ready-made packages but recordists often had to
build and maintain their own (29). Consequently, a recordist would have to be interested
in designing and building tools for recording sound. This brings forth the question of
how different roles should be labelled. Technological change and the change of studio
environments can be seen as a driving force behind the change of role descriptions in
music production especially when it comes to the category of engineering. Schmidt
Horning provides a prime example:
By the 1960s, recordists had become ‘recording engineers’, yet ironically, many of those who
entered this profession often knew less about electronics and the basic technical foundations
of recording than their self-taught predecessors. (2013: 143)

It cannot be a coincidence that 'recordist' and 'recorder', interchangeable job descriptions from the era of acoustic recording (15), were replaced by the new term 'recording engineer' at a time of overall reorganization of the recording industry and ideological expansion in record production during the 1960s.
A fundamental difference in the delineation of engineering activities lies between
popular music and classical music. In classical productions a separate recording engineer
is always needed. According to the classical producer Seppo Siirala:
I always need a professional recording engineer as a partner to work with. I have to brief
him about the project of course. The recording engineer doesn’t necessarily have to get
acquainted with the repertoire the same way that I do, but s/he has to have a clue about what
kind of an overall sound we are aiming at and what the [important] points in the repertoire
are and what kind of stuff we are going to be working with, I need to be able to describe that.
(interview with author, Ondine’s building, Malmi, Helsinki, 26 November 2015)

This remains the case in, for instance, the production of Western classical records, perhaps not least because classical musicians and composers typically receive no training in music production technologies and/or show little interest in them (exceptions naturally exist; see, for example, Klein 2015). In popular music, by contrast, a home studio 'is virtually a prerequisite for any aspiring pop musician' (Warner 2003: 20). This conflates the roles of the engineer and the performing artist or musician.
Role delineation within the category of engineering perhaps shows the greatest variation between genres. In particular, the ways in which engineering merges with composing and/or arranging characterize entire categories of musical style. Conventionally, engineering and all its subcategories would remain separate from composing, arranging and performing. However, in contemporary production settings, in genres based more on innovative uses of music production technologies and less on traditional musicianship – rap, for instance, and the multitude of genres within the broad category of EDM – engineering is essentially merged with composing, as I outlined earlier, and with producing, which I will deal with next (see Auvinen 2017). This development is at least partly the
result of the heavy commodification of music production technologies especially from the
1980s onwards (see Théberge 1997). This was manifested in two ways. Firstly, technologies
were made simpler to use through automation that didn’t require calibration, and through
sound capturing techniques that didn’t require acoustically treated spaces. Secondly, music
production technologies became cheaper through lower technical specifications and
ultimately smaller through laptops and the move from hardware to software. In a relatively
short timeframe, many more people could afford music production technologies that can
be used to produce music that sounds professional. This connects in part with Schmidt
Horning’s idea of how the development of recording technology in the 1960s created
recording engineers who didn’t know anything about traditional engineering (2013: 143)
as opposed to earlier engineers who had to build their own equipment, which required
knowledge of actual engineering (143). Is it possible that we now have people who call
themselves ‘producers’ without knowing anything about producing?
Nevertheless, mixing and mastering as subcategories of engineering seem to have more
or less survived as independent categories of contribution distributed to highly specialized
agents (see, for example, Gibson 2005: 205).

Producing
Producing, the only category of contribution limited to record (or music) production, remains the most vague and ambiguous in terms of the specific activities related to it. This observation has led scholars of music production to note: 'The question often arises:
“What exactly does a record producer do?”’ (Zak 2001: 172). As noted earlier, producers
might contribute to any of the aforementioned categories (164). Historically speaking,
the role of the producer might have been most profoundly affected by the shift of record production from craft to art (Kealy 1979). Within popular music, producers such as Phil Spector and George Martin are widely regarded as the first producers with artistic agency, producers responsible for aesthetic decision-making rather than merely for the practical
arrangements of recording sessions. This change in the conceptualization of producing can
be understood as being connected to a more fundamental change in the way that people
started to conceptualize popular music from the 1960s onwards. Théberge (1997: 192)
writes:
Clearly by the early 1960s the notion of a ‘sound’ was part of the vocabulary of popular
culture. Phil Spector was perhaps the first pop producer to be recognized as having his own
unique sound – ‘Spector Sound’ (also known in more general terms as the ‘wall of sound’) –
and a variety of recording studios and musical genres soon were identified as the promoters
and/or possessors of a particular ‘sound’.

From this point of view, the rising importance of the idea of 'sound', perhaps at the expense of abstract musical parameters written on a score in Western notation, seems crucial to understanding the ever-expanding role of the producer, especially the producer's takeover of an increasing number of the categories of contribution in record production (I will return to this later).
In classical music, a similar shift in the ideology of ‘producing’ took place at the same
time. However, the aim here was not to create new sounds through producing records but
to simulate the ideal performance (Blake 2012). Classical producer Walter Legge explains:
I was the first of what are called ‘producers’ of records. Before I established myself and my
ideas, the attitude of recording managers of all companies was ‘we are in the studio to record
as well as we can on wax what the artists habitually do in the opera house or in the concert
platforms’. My predecessor Fred Gaisberg told me: ‘We are out to make sound photographs
of as many sides as we can get during each session.’ My ideas were different. It was my
aim to make records that would set the standards by which public performances and the
artists of the future would be judged – to leave behind a large series of examples of the best
performances of each epoch. (Legge quoted in Frith 1996: 227)

Despite the different aesthetic aims between popular and classical music, here too we can
see how the activity of producing shifted from something that aimed to take a photograph
of a musical event to the production of artefacts as ends in themselves.
It must also be noted that ideas and conceptions of the activity of ‘producing’ have
developed at different times in different countries and music markets. For example, the
Finnish musicologist Jari Muikku (2001: 308) has argued that the concept of the ‘producer’
only emerged in the relatively small music industry of Finland ‘during the multi-tracking
era in the 1970s’, which is significantly later than in the American or the British context.
Prior to this, agents called ‘arrangers’ or ‘conductors’ took care of hiring suitable musicians
and making the proper arrangements both musically and practically (308). Only the new
producer generation of the 1970s started to think in terms of isolated tracks and in terms
of sounds rather than notated arrangements. Consequently, the philosophical shift from
the ‘illusion of reality’ to the ‘reality of illusion’, which had taken place at least in the United
States and Britain in the 1950s and 1960s (cf. Moorefield 2005; Hiltunen 2016), took place
through the new producer generation as late as in the 1970s in Finland.
Overall, it must be noted that what Moorefield describes as a philosophical shift from
one aim to another is better understood as an expansion. While it is true that much of
the music in the vast aesthetic category of ‘pop’ did follow this shift, other musical styles
remained more or less the same. For example, classical record production still largely aims
at producing an illusion of a real concert experience (Auvinen 2018). The same is true with
much of jazz, even country, blues and many rock recordings, where notions of performance
authenticity are important and the sonic product of record production is contrasted against
a live concert experience (e.g. Frith 2012: 208).

Current developments: Programming, the producer as 'tracker' and the composer as 'topliner'
Current developments in the digital production space have seen many of the other
categories of contribution absorbed into the agency of the ‘tracker/producer’ (Auvinen
2016, 2017). The Finnish musicologist Riikka Hiltunen (2016: 6) discusses the tracker at songwriting camps arranged by Music Finland, a Finnish music promotion and export organization. Hiltunen writes:
Each group has one or several trackers who are responsible for the production and recording
of a demo recording and two or more topliners who are responsible for the composition
and writing of the melody and the lyrics. The terms tracker and topliner are widely used
in the trade. Their meanings are still partly unstable. The terms tracker and producer are
used in part as synonymous even if their working roles differ to some extent. The roles often
cross. The tracker might have an important role, for example, in the creation of the chord
background and s/he can also take part in making the melody and the lyrics. On the other
hand, topliners can throw around ideas on the production. (6, my translation)

Hiltunen goes on to explain how the blurring of the differences between the various music-making roles is essentially bound to the growing role of technology in the process of music-making. This music production setting connects strongly with ideas presented by
the British musicologist Joe Bennett (2011):
[A] completed backing track is supplied by a ‘producer’ to a top-line writer who will supply
melody and lyric. The backing track acts as harmonic/tempo template but more crucially as
inspiration for genre-apposite creative decisions, such as singability of a line.

Here, Bennett clearly refers to the tracker-producer type without using the same term. This
would strengthen Hiltunen’s idea that the terminology of contemporary music production
is not yet fully established even if the role of the tracker is already an intrinsic part of the
communal creative process (2016: 6).
The role of the tracker can also be examined through Zak’s categories (2001: 164).
The overlapping of Zak’s categories of contribution and their heavy concentration on
the tracker/producer are, in my understanding, the result of the heavy digitalization of
the music production process and its relocation into the DAW. This can be viewed as
a fusion of the use of the studio as a ‘fully-fledged musical instrument’ (Moorefield
2005: 53) by innovators such as Kraftwerk, Giorgio Moroder, Trevor Horn and Brian
Eno, and even George Martin, and self-producing multi-instrumentalist artists such
as Prince, Stevie Wonder and Mike Oldfield. Also, earlier figures such as Les Paul who
experimented with multitracking technologies to create layered arrangements can be seen
as precursors of the contemporary tracker. The tracker/producer or producer/tracker,
at least in the case study I have conducted (Auvinen 2016, 2017), takes part in or takes
over all the nominal categories of contribution: composing, arranging (in the form of
programming on the DAW), performing and engineering (editing). Even if they do
not explicitly perform themselves, they influence the performance by giving extensive
instruction and/or feedback to the singer (Auvinen 2017, 2018). Also, the tracker takes part
in constructing the performance through editing, even though editing also overlaps with the nominal category of arranging when recorded audio is used as raw material for constructing arrangements (Auvinen 2017). The notion of programming
in music production raises an interesting comparison. ‘Programming’ essentially means
arranging in traditional terms of contribution and has become an important activity in
music production. Hence the role of the ‘programmer’, though it is also often merged
with that of the producer. In the same way that agents occupying the role of the recording
engineer in the 1960s didn’t have much knowledge about actual engineering, programmers
in the age of digital music production have very little to do with writing actual code. Thus,
the term can be seen as a remnant from an era before large digital sound libraries existed
and musicians or producers had to actually programme the desired sounds themselves.
When most or all of the activities included in the nominal categories of contribution
of the creative process get conflated and merge together through digital technological
practices, perhaps a new conceptualization of record production could be useful. This becomes even more important when the traditionally separate activities of performing live and
producing a record get conflated; what is produced in a studio is taken onstage and becomes
part of the sonic material of a live event. I propose that contemporary record (or music)
production that takes place in a DAW environment from the beginning of the compositional
process to the end of mastering could be understood as a form of technological musicking
(see Small 1998). Separate categories of contribution merge together into one meaningful
and goal-oriented technomusical activity, where it is impossible to say where one category
of contribution ends and another begins.
Despite the fact that the digitalization of the music production process seems to have
enabled the concentration of different activities in one agent, some activities seem to have
remained independent and detached from the process of studio recording. Gibson has
made note of ‘the survival of high-level mastering and post-production facilities’ (2005:
205), which also include the agents working in them. Mixing and mastering seem to have
remained separate activities and what is important to note here is that both activities
are essentially interested in sound. Consider the following statement in response to the
question of why the producer would not do the mixing:
Kane is a better mixing engineer […] I also think it’s cool to share responsibilities […] I
know the things that have been done in that so I might not have the ability to get a fresh
perspective. For example, I might have listened to the snare a lot when I’ve tweaked it. And
for Kane, when he hears it for the first time, he might be like ‘oh, this snare is a little shallow,
let’s put some bottom to it’. Perspective. (J. Olsson interview with author, InkFish studio,
Helsinki, 24 January 2017)

This statement indicates that mixing as a subcategory of engineering is, at least to an extent, a highly specialized activity not easily taken over merely as a result of technological
access by agents not dedicated to the craft. Furthermore, it seems desirable not to have
been involved in the phases of composition, arrangement and recording when mixing
or mastering; the mixing (or mastering) engineer needs to
have ‘fresh ears’ with respect to the music to bring perspective. At least for these reasons,
mixing and mastering have largely remained activities independent from the recording
and editing process despite the emergence of the DAW and especially the graphic
display, which have been seen as having shifted the power relations between musicians
and engineers by revealing secrets formerly accessible to engineers only (Williams 2012).
Finally, the ‘coolness’ of sharing responsibilities indicates that multi-agent collaboration in
music production might be desirable in itself despite the possibility of centralizing activities
afforded by digital production technology.

Concluding thoughts
Examining the changes and developments in role delineation in the communal process of record production brings to the fore the centralization of the necessary roles and activities of music production in one agent: the producer. As the digital revolution has made the basics of sound engineering relatively easy to learn, it has created the potential for anyone to become a 'producer' who takes over all the nominal categories
of contribution of record production. This progression is particularly visible in the agency
of the tracker/producer, or producer/tracker, who either fully takes over or contributes
to most of the categories of contribution essential in the process of record (or music)
production. This is, however, also related to musical style, as the tracker/producer seems
mostly to be an agent category related to the so-called urban pop style (Auvinen 2016). The
centralization of roles in one agent can be seen as stemming from the emergence of the
DAW and the standardization of digital music production technologies. Additionally, the
development of all roles and activities coming together in the agent of the producer must
be, as said, understood as genre related, at least to some extent. Partially for this reason,
this progression should be viewed as ‘more additive than evolutionary’ (Moorefield 2005:
xiv); different genres harbour different kinds of role delineation, arising from each genre's specific musical aims and influenced by the traditions of production, technological origins, values and philosophical underpinnings in which they are nested.
The roots of this development of role centralization can surely be traced to the work of pioneering engineer-producers such as Giorgio Moroder, Phil Spector, George Martin, Joe Meek, Brian Eno, Trevor Horn and even Les Paul, and their use of the studio as a musical instrument bringing together composing (songwriting), performance and engineering. All these activities can, in many cases of music production, now be seen as nested inside the vague and ambiguous activity of 'producing'. The process that started in the 1950s and ultimately led to the formation of Hennion's creative collective, or George Martin's composition team, can be seen as having been somewhat reversed. All or
most of the elements of the creative process formerly distributed among the members of
a group can now be seen as increasingly being concentrated into the dominion of one
agent: the producer. The aforementioned pioneers were early examples of this but remained a marginal group in their own time. The digital revolution, and especially the emergence of the DAW, has finalized this change. As the vast majority of the sonic product can
these days be programmed digitally, even the nominal category of performance can
in many cases be taken over by the producer even if the producer does not engage in
traditional musical performance. The centralization of most or all nominal categories of
contribution is increasingly becoming an industry standard. Consequently, the very term 'producing' might need redefining to match the reality of music (or record) production in its contemporary state, if it has ever been properly defined in the first place.

Note
1. Teosto (2018) is a non-profit organization founded in 1928 by composers and music
publishers, to administer and protect their rights. At Teosto, decision-making power lies
with ordinary members – that is, music authors. Teosto represents more than 32,000
Finnish and almost three million foreign composers, lyricists, arrangers and music
publishers. They collect and distribute royalties to the music authors they represent, for
the public performance and mechanical reproduction of their music in Finland.

Bibliography
Auvinen, T. (2016), ‘A New Breed of Home Studio Producer: Agency and Cultural Space in
Contemporary Home Studio Music Production’, The Finnish Yearbook of Ethnomusicology,
28: 1–33. Available online: https://doi.org/10.23985/evk.60227 (accessed 6 May 2018).
Auvinen, T. (2017), ‘A New Breed of Home Studio Producer?: Agency and the Idea “Tracker”
in Contemporary Home Studio Music Production’, Journal on the Art of Record Production,
11. Available online: http://arpjournal.com/a-new-breed-of-home-studio-producer-
agency-and-the-idea-tracker-in-contemporary-home-studio-music-production/ (accessed
6 March 2017).
Auvinen, T. (2018), The Music Producer as Creative Agent: Studio Production, Technology and
Cultural Space in the Work of Three Finnish Producers, Annales Universitatis Turkuensis,
Ser. B. 467, Humaniora, Piispanristi: Painotalo Painola. Available online: http://urn.fi/
URN:ISBN:978-951-29-7519-8 (accessed 5 May 2019).
Bennett, J. (2011), ‘Collaborative Songwriting – The Ontology of Negotiated Creativity in
Popular Music Studio Practice’, Journal on the Art of Record Production, 5. Available online:
http://arpjournal.com/collaborative-songwriting-–-the-ontology-of-negotiated-creativity-
in-popular-music-studio-practice/ (accessed 8 July 2016).
Blake, A. (2012), ‘Simulating the Ideal Performance: Suvi Raj Grubb and Classical Music
Production’, in S. Frith and S. Zagorski-Thomas (eds), The Art of Record Production: An
Introductory Reader for a New Academic Field, 195–206, London: Ashgate.
Burgess, R. J. (2008), ‘Producer Compensation: Challenges and Options in the New Music
Business’, Journal on the Art of Record Production, 3. Available online: http://arpjournal.
com/producer-compensation-challenges-and-options-in-the-new-music-business/
(accessed 7 August 2016).
Burgess, R. J. (2013), The Art of Music Production. The Theory and Practice, New York: Oxford
University Press.
Frith, S. (1996), Performing Rites, Cambridge, MA: Harvard University Press.
Frith, S. (2012), ‘The Place of the Producer in the Discourse of Rock’, in S. Frith and
S. Zagorski-Thomas (eds), The Art of Record Production: An Introductory Reader for a
New Academic Field, 207–222, London: Ashgate.
Frith, S. and S. Zagorski-Thomas (2012), The Art of Record Production: An Introductory
Reader for a New Academic Field, London: Ashgate.
Gibson, C. (2005), ‘Recording Studios: Relational Spaces of Creativity in the City’, Built
Environment, 31 (3): 192–207.
Hennion, A. (1983), ‘The Production of Success: An Antimusicology of the Pop Song’, Popular
Music: Producers and Markets, 3: 159–193.
Hiltunen, R. (2016), ‘Luovia valintoja rajoitetussa tilassa. Popkappaleen tekeminen
ryhmätyönä Biisilinna 2015 – leirillä’, The Finnish Yearbook of Ethnomusicology, 28.
Available online: https://doi.org/10.23985/evk.60231 (accessed 4 May 2018).
Howlett, M. (2012), ‘The Record Producer as Nexus’, Journal on the Art of Record
Production, 6. Available online: http://arpjournal.com/the-record-producer-as-
nexus/ (accessed 21 March 2016).
Kealy, E. R. (1979), ‘From Craft to Art: The Case of Sound Mixers and Popular Music’, Work
and Occupations, 6 (3): 3–29. Available online: https://doi.org/10.1177/009392857961001
(accessed 4 April 2018).
Klein, E. (2015), ‘Performing Nostalgia on Record: How Virtual Orchestras and YouTube
Ensembles Have Problematised Classical Music’, Journal on the Art of Record Production, 9.
Available online: http://arpjournal.com/performing-nostalgia-on-record-how-virtual-
orchestras-and-youtube-ensembles-have-problematised-classical-music/ (accessed 2 June
2017).
Moore, A. F. (2012), ‘Beyond a Musicology of Production’, in S. Frith and S. Zagorski-Thomas
(eds), The Art of Record Production: An Introductory Reader for a New Academic Field,
99–112, London: Ashgate.
Moorefield, V. (2005), The Producer as Composer: Shaping the Sounds of Popular Music,
Cambridge, MA: MIT Press.
Muikku, J. (2001), Musiikkia kaikkiruokaisille. Suomalaisen populaarimusiikin
äänitetuotanto 1945–1990, Tampere: Gaudeamus Kirja/Oy Yliopistokustannus University
Press Finland.
Schmidt Horning, S. (2013), Chasing Sound. Technology, Culture & the Art of Studio Recording
from Edison to the LP, Baltimore, MD: Johns Hopkins University Press.
Small, C. (1998), Musicking: The Meaning of Performing and Listening, Middletown, CT:
Wesleyan University Press.
Swedien, B. (2009), In the Studio with Michael Jackson, New York: Hal Leonard Books.
Teosto (2016), Säveltäjäin tekijänoikeustoimisto Teostory, Tilitys- ja jakosääntö 13.4.2016.
Available online: https://www.teosto.fi/sites/default/files/files/Tilitys-jakosaanto-2016.pdf
(accessed 20 March 2018).
Teosto (2018), ‘About Teosto’. Available online: https://www.teosto.fi/en/teosto/about-teosto


(accessed 20 March 2018).
Théberge, P. (1997), Any Sound You Can Imagine: Making Music/Consuming Technology,
Hanover, NH: Wesleyan University Press.
Warner, T. (2003), Pop Music – Technology and Creativity: Trevor Horn and the Digital
Revolution, Burlington, VT: Ashgate.
Williams, A. (2012), ‘Putting It on Display: The Impact of Visual Information on Control
Room Dynamics’, Journal on the Art of Record Production, 6. Available online: http://
arpjournal.com/putting-it-on-display-the-impact-of-visual-information-on-control-room-
dynamics/ (accessed 28 February 2018).
Wolfe, P. (2012), ‘A Studio of One’s Own: Music Production, Technology and Gender’, Journal
on the Art of Record Production, 7. Available online: https://www.arpjournal.com/asarpwp/
a-studio-of-one’s-own-music-production-technology-and-gender/ (accessed 22 August
2019).
Zagorski-Thomas, S. (2007), ‘The Musicology of Record Production’, Twentieth Century Music,
4 (2): 189–207. Available online: http://journals.cambridge.org/action/displayIssue?
jid=TCM&volumeId=4&seriesId=0&issueId=02 (accessed 21 March 2018).
Zak, A. (2001), The Poetics of Rock. Cutting Tracks, Making Records, Berkeley, CA: University
of California Press.
11
Pre-Production
Mike Howlett

Introduction
A good production consists of four ingredients: the composition, the arrangement, the performance and the engineering (Moore 2005; Zagorski-Thomas 2007).
Of these the composition – the song – is almost always the beginning of the process and
is the reason most commercial recordings in the world of popular music take place at all.1
The main reason that record companies sign an artist is because they believe the artist has
a song that can be a hit.
The way you arrange a song and the key that you put it in will have far more effect on the end
result than the desk or where you put the microphone. (Horn 2008)

But to demonstrate how important the arrangement and the performance can be,
consider two recordings of the Motown classic, ‘I Heard It Through The Grapevine’. Both
recordings were made in the Motown Studio in Detroit around the same time, using the
same engineering technology and with many of the same musicians. One recording was by Gladys Knight and the Pips and the other by Marvin Gaye; yet with the same song, the same engineers and the same recording technology, the Marvin Gaye version was a global smash hit and the other just a footnote in an otherwise great career. And, yes, the performances were different too, and both were excellent singers, but the key difference is in the arrangement.
As Moore (2005) argues, the composition, the arrangement and the performance are
also the subjects of musicology; nevertheless, they are also key components in record
production. An arrangement can be considered in terms of the assembly of frequencies: the
bass guitar and bass drum occupy the low-frequency range; guitars, keyboards and vocals
provide the mid-range; and cymbals and string pads the upper-mid- and high-frequency
ranges. When an arrangement is well structured the engineering process is greatly
simplified and the engineer does not have to struggle to achieve clarity and separation,
to make, for example, the vocal audible because the guitars or keyboards are flooding the
same frequency range, or the bass is being muddied by similar frequencies on the keyboard.
However, these arrangement decisions are best made long before the artist enters the
studio. Even for experienced artists, the studio can be a challenging environment with a high
intensity of focus on every detail of a performance. Sudden changes to the arrangement can
be surprisingly unsettling, especially for young musicians with little recording experience.
The best place to sort out the details of the song structure and arrangement is in a rehearsal
room and this is called ‘pre-production’.
The term ‘pre-production’ loosely covers all the actions taken to prepare for a recording
session and has developed over the years to include rehearsals, decisions about the material,
refinements of the arrangement and approaches to the recording itself, such as whether to
record everyone at once or in overdubs. As Richard Burgess confirms: ‘Preproduction is
when you decide how to make the record – whether it will be overdubbed, performed live,
programmed, or a combination’ (2013: 65). Historically, before the invention of multitrack
recorders, most of these practices were the standard by necessity – sessions were always
live and had to be well prepared. Nowadays, many decisions can be deferred until the
mix. As John Leckie notes below, some recordings don't need rehearsal, in order to preserve spontaneity – especially with improvised forms such as jazz – but even then decisions need to be made before going into the studio.
The process of pre-production has too often been underplayed in the teaching of, and
research into, record production, but it is potentially the most valuable part of a recording.
As noted above, changes to the arrangement can be unsettling for musicians in the
studio. Many performers feel exposed and under close scrutiny – the studio can seem like
a microscope exposing the slightest flaws. In a rehearsal room, on the other hand, the
atmosphere is much more relaxed; ideas can be suggested and tried out with little pressure.
And another very helpful aspect of pre-production is that in the less-pressured environment
the producer can get to know the musicians and understand the internal relationships in
the band – i.e. who are the dominant personalities, where are the creative ideas coming
from? The value of pre-production was something I learned from bitter experience early
in my career and this chapter is written with the aim of helping aspiring producers to
avoid some of the same mistakes. In this chapter I suggest approaches to arrangement for
recording and various aspects that can best be sorted in rehearsal. I have also interviewed
a number of professional producers to see how they approach pre-production, and below
I report on my discoveries. Understandably, each has their own method, and it surprised
me to see how many different ways producers manage it, but all agree on the importance
and value of pre-production.

The linear structure


The first encounter between a producer and a prospective client is usually about finding
common ground and agreeing an approach to the recording process. To begin a dialogue
with an artist or a band a helpful starting point is to be able to discuss the structure of the
songs. What is the sequence of events, or sections, of the arrangement? Sometimes this is
easy to see and the components can be identified quickly, such as the verse, the chorus, the
middle 8 or bridge, intros and outros. Listening to the demo, or getting the band to play
through the song, is the simplest way to identify these parts. I like to take notes and make
a rough map of the different sections. Quite often there will be sections that don’t seem to
fit the standard definitions, but just asking the band what they call a particular section is
enough. They will often have generic names, such as ‘Dave’s riff ’ or ‘the Link’ – the point is
for the producer to get an overview of the piece quickly and to be able to talk about these
various elements in a way that the artist/band can understand.
Sometimes long sections can be identified that complicate the flow of a song
unnecessarily. A catchy section – a hook – may only appear once, or not until the end.
Having a clear understanding of the linear structure enables the producer to identify and
suggest changes that can make all the difference.2

The groove
Once the overall shape of the song is established the details of the arrangement can be
analysed. First, for me, is to consider the groove: does it feel right for the melody and the
style of the song? Usually this is part of a band’s sound or identity and can be a sensitive
subject to raise, but if the arrangement feels too busy, or too stiff, for example, it needs to be
discussed. If the feel is right the arrangement can be a lot emptier, and as a rule of thumb,
it is a lot easier to make a great-sounding recording when there is a lot of space to hear the
details. Rhythmic feel is also particularly vulnerable to change during a recording session,
so now, in the rehearsal room, is the time and the place to try ideas. Musicians can be very
touchy about their playing, but a good way to approach changes is to present an idea as
a suggestion, one that you can say perhaps may not work, but let’s try it. No pressure. If
anyone doesn’t like it we can try something else. In this way the question is raised and at
least some discussion can be had about the subject. The tempo too is a large part of the
groove and should be considered at this point. The best judge of tempo, I have found, is the
singer. If a song feels too fast or too slow it is harder for the singer to phrase and to express
feelings. The tempo may be fine but the question should be asked, and this is the time to do
it – not when you are recording the vocals!

The density and dynamic development


Having established the groove and the overall flow of the structure, the way the arrangement
develops becomes clearer. The components of each section serve specific functions. The
intro of a song has particular significance because it sets the tone and the atmosphere of
the piece. This is especially important if the recording is aimed at radio. The first moments
identify a track to the listeners and can hold their attention as the track develops. Once
the song is familiar from repeated plays the effect of the intro is even stronger. This is also
true for radio programmers and selectors – the people who decide whether the track will
be on their playlist – and for new unknown artists the amount of time to get noticed is
short. Programme selectors may listen to as little as 20–30 seconds before jumping to the next contender. So a lot of attention should be paid to this relatively short part of the song.
This is where an unusual sound or texture can make a big difference; it could be simply a
new treatment of the snare or an unusual keyboard instrument. Defining guitar riffs work
well, particularly if they feature strongly in the song. Some examples would include Martha
and the Muffins’ ‘Echo Beach’, Blue Oyster Cult’s ‘(Don’t Fear) The Reaper’, Free’s ‘All Right
Now’, The Beatles’ ‘Ticket To Ride’ and The Rolling Stones’ ‘(I Can’t Get No) Satisfaction’.
Once the voice comes in the harmonic context becomes the important focus. If the
textures are too dense then the vocal struggles to be heard. As noted above, the engineer’s
job becomes more difficult. Far better to make sure the instruments around the vocal leave
space. As Zak puts it, ‘in many cases a track’s arrangement develops according to criteria
that are specific to recorded sound’ (2001: 32), and it is a mistake ‘to separate sonics from
the overall arrangement’ (Mitchell Froom quoted in Zak 2001: 32).
Space here also means the frequencies around the voice. High string parts can float
above and still leave room for the vocal; piano and guitars are much closer in frequency
and can crowd the space for the vocal. The voice generally occupies a large amount of
aural space in a recording, so it is always advisable to work with a guide vocal from the
beginning, and keep it in the mix during all overdubs. It can be surprising how little needs
to be going on behind the voice.
As the song develops, from intro to verse to chorus, new melodic elements are introduced,
but when they come around again – in the second verse and chorus, for example – some
new textures can be introduced to add interest. These can be an instrumental counter
melody or harmonic element such as a string part, or perhaps additional percussion. The
point is to make it feel like a development from the first time around.
Although arrangement ideas can be introduced at any time, the most effective time for major
structural changes is in the rehearsal room. This is also the place to look at the equipment. Are
the amplifiers noisy? What condition is the drum kit in? Are the cymbals good quality? New
drum heads are also generally advisable for recording, as are new guitar strings.

The right studio


A major decision that the producer needs to make is the choice of studio. This depends
very much on the best approach for recording a given artist. If the act is a group and
they want to play together then the studio needs good separation and, ideally, separate
headphone mixes. If the recordings are to be layered, working to a click track, then the
room can be smaller, but it depends on the kind of drum sound: does the production style
want a big, crashing ambience? Or perhaps a close, drier feel? A production based mostly
on sequenced programming and samples is best in a studio with a large control room
where everyone involved can have space. These decisions need to be made before choosing
the studio and should come out of discussions with the artist and the producer’s vision for
the production. This is part of pre-production.

How do other producers like to work?


During my research for this chapter, I approached a number of producers and asked
them how they viewed pre-production and even whether they practised it at all. Everyone
agreed it was important, although some found it helpful for different reasons and noted
that artists’ needs vary greatly and each requires a particular approach. Here are some of
the comments:3
Mick Glossop (Van Morrison, Frank Zappa, The Ruts et al.) says:
I find the pre-production period very important […] it gives me time to get to know
the musicians in a less stressed (and less expensive) environment […] importantly, I get
to discover the strengths and weaknesses (both personal and musical) of each member
of the recording project and to understand the ‘politics’ of the situation; whose views or
contributions are most important or most highly regarded, etc. It all helps in sorting out
the ‘psychology’ of those involved. Whilst leaving some room for experimentation and
spontaneity during the recording, I like to be as prepared as possible beforehand.

Pip Williams (Status Quo, The Moody Blues et al.) says:


As much prep beforehand as possible will always save time and money! I do a ‘recce’ session
with the vocalist – this helps me decide how long to set aside for vocal recording. Some
singers are knackered after a day of singing! I work out a list of aims for each day of recording
followed by actual achievements at the end of the day. If working with a solo artist, I would
have rehearsal sessions to fine-tune the songs and song choice.
I also have pre-production mixing planning, where I sit down with the band and the
A&R/Record Co., to try to suss how they see things and how they are planning to market it.
Better than having confusion in the studio during mixing!

John Leckie (Radiohead, Stone Roses, XTC et al.) agrees on the importance and value
of pre-production but notes that the ‘main work in pre-production is agreeing on tempos:
even if not using a click track I always like to have a tempo for the count in on every song.
Agreeing on arrangements: intros and outros etc.’ Leckie adds:
Just do what’s needed. I’ve done albums of fourteen songs with no pre-production and only
had the briefest phone conversation with the band. And [that was] in the days before YouTube
when you couldn’t see a gig […] Sometimes I’ve been presented with a band with fourteen
to twenty songs all rehearsed and demoed and raring to go. (Radiohead.) Best to just go in
and do it before they get stale and bored. I’ve gone in and the band can be over rehearsed on
songs and just a little bored with playing them and you have to keep their energy levels up.

Richard Burgess (Spandau Ballet, Colonel Abrams and many others) – who wrote the
seminal work on record production The Art of Record Production (1994), now revised as
The Art of Music Production (2013) – felt strongly that pre-production was essential. To my
question of how much time he would allocate to pre-production he said:
[It] depends on the project. For a band that plays its own instruments, about two weeks. If
it’s a computer project, probably the same if things are in good shape but much more if the
songs are not quite there yet or not there at all. It could be six or nine months if the material
is not in great shape.

On the question of using pre-production to decide the recording approach, Burgess's response was emphatic:
This is critical. I have changed my mind completely in pre-production as to how to record
a band. Even when I have seen them live a number of times, things become obvious in pre-
production that it is easy to miss at a gig. These include lack of the drummer’s ability to play
in time, general untogetherness between the players, out of tune singing. The number one
reason why I have gone to click track and overdubbing was because of general sloppiness
that was not enhancing the music. When I say that, people always bring up The Rolling
Stones but they have a uniquely tight sloppiness that really works. There is a funky looseness
that is cool and then there is just messy. Pre-production helps me figure out which one I am
dealing with. […]
Pre-production clues you in to things like whether they show up on time. You find out
if one person is habitually late or disappears early for this reason or that. You can judge the
internal dynamic of the band and figure out where the tensions might lie and you can get an
idea of how each personality is best handled. Bands are never monolithic: the singer is going
to have a very different personality than the bass player and it’s good to have a grasp of what
those differences are and how they play into the overall band dynamic. […]
I can’t imagine going into a studio without pre-production for most projects. Some
jazz albums, with well-known and extremely competent musicians are best done without
preparation and I did a few of those when I was playing. […]
Beyond that, I think pre-production is essential. Some established artists do their pre-
production in the studio and that has its merits if you can justify the budget because you
can capture the music as soon as it is ready. I like doing pre-production with bands because
I drill them hard for a couple of weeks, rearrange the songs as needed, rehearse them until
they have them down cold in the new format and then give them a couple of days off. When
we hit the studio, the new arrangement is second nature to them so we don’t get the stupid
mistakes of, say, going to the bridge eight bars too early but we still have the fresh energy
because they have had a couple of days off to clear their heads. For me, that is the dilemma,
how do you get them to play a new song or new arrangement correctly, as if it was their
300th gig, but with all that fresh first take energy. I really try for early takes once I get into
the studio. This is something I learned as a studio musician. There were many times when
famous producers would drive us to dozens and even hundreds of takes because they would
keep changing the arrangement but the energy and vibe of those early takes (usually the first
to the third or so) were always better in my estimation. The right arrangement is critical but
so is great energy.

Richard’s comments about jazz groups make the point that not all sessions need so
much planning and pre-production. As John Leckie also noted, sometimes freshness
and spontaneity can be lost by too much preparation. However, the discussion I had
with producer Steve Power, who co-wrote, produced and engineered Robbie Williams’s
massive hits ‘Millennium’ and ‘Angels’ amongst many others, told a rather different story.
Nowadays, working in the world of high-end commercial pop, he is expected to provide
virtually complete and ready-recorded arrangements including guide vocals, backing
vocals and solos. And that is without any guarantee that his version will be used! This
is pre-production taken to an extreme, perhaps, but has echoes of an earlier age of
recording.
In the 1930s, 1940s and 1950s, before the advent of multitrack recording, arrangements
defined the production outcome. Without the option to add instruments later, the
whole finished production had to be visualized and scored before entering the studio.
Nevertheless, rehearsals would still take place beforehand in most cases because of the
high cost of studio time. Apart from Les Paul’s groundbreaking method of bouncing
from disk to disk, and 3-track recorders used by Elvis Presley and Buddy Holly in the
1950s, multitrack recording only really got going in the 1960s. By the 1970s, 16- and
24-track recorders were ubiquitous and the process of overdubbing became firmly
established, allowing for creative development of the arrangement as the session
progressed. This practice also led to a tendency to try out multiple ideas and leave final
decisions to the mix – or until track space became a problem! With the modern situation
of virtually unlimited tracks on digital recorders, the possibility of keeping many versions
has increased exponentially. While this can be a blessing, it can also lead to indecision
and procrastination. It is also counter to the advice of many producers to have a clear
vision of the ideal outcome for a recording, which leads us back to the importance of
pre-production.
From my discussions with many producers it has become clear that there are many
approaches to pre-production, although common themes of clarifying the vision and how
to go about the recordings seem consistent. Here are some further perspectives:
Aubrey Whitfield, producer of Little Mix, Kelly Clarkson et al., made detailed
observations and recommendations:
For a single I don’t spend as much time on pre-production as I have created templates and
processes that help me cut down the prep time so I can just go straight to producing.

My pre-production tends to involve:

Have a meeting with the client (ideally face to face) to discuss reference artists and songs,
deadlines, their requirements and for them to understand how I like to work and what they
can expect in my studio.
Curating a playlist of reference tracks for the specific project so I can reference back to
them throughout the project to ensure I’m on track (usually no more than 2–3 reference
tracks recommended by the clients on my request).
Requesting a rough demo from the client in advance and working out the key and tempo,
downloading any stems that are pivotal and importing them into my templates. I always ask
the client to record a demo to a click track so I can easily build around the demo or use as a
reference point.
Organizing my session musicians if needed (usually guitarist or drummer) and making
sure they are available.
Doing a project plan in conjunction with the artist (for albums) which sets out the order
in which we’ll record the album, recording dates and first mix delivery dates. This is usually
a spreadsheet saved in my Google drive.
Creating the DAW templates for each track (my go-to plug-ins set on each instrument
channel, colour coding, labelling, etc.).
Booking in recording sessions in my diary (as my diary gets very busy, very fast!) I tend
to book about 1–2 weeks in advance.
Because of the nature of the music I produce (mostly pop, electronic and some singer-
songwriter), I don’t tend to use a lot of analogue equipment. That saves me time as all my kit
is set up in my studio and ready to go from the off. So equipment checks aren’t really needed
as a lot of my producing is MIDI [Musical Instrument Digital Interface] based or using a
simple set-up to record guitar or vocals.

Producer Phil Harding (East 17, Boyzone, 911 et al.) worked as a programmer and
engineer for Stock, Aitken and Waterman in the 1980s as well as mixing for artists including
Michael Jackson and Diana Ross. Harding said: ‘In pop, pre-production doesn’t really
exist other than in the initial keyboard and drum programming.’ But he then went on to
say: ‘In pop it is all about preparing for the vocal sessions because that’s what the artists
are – singers – not wishing to make that appear minor.’ So pre-production in pop could be
defined simply as preparing everything for a successful and smooth-running artist vocal
session. He continued:
Vocals are a major part of a pop project and even at the pre-production stage I will be
picturing the final mix in my head with the lead vocals up front and the backing vocals and
other instruments in the background. […]
In order to achieve a smooth-running vocal session for pop, producers and their
programmers will first be concerned about the best possible key for the singer(s) and may
even go as far as a quick vocal key test session if the vocalists are willing to give the time
for that. Then the pop production team would establish the key/set the tempo and song
arrangement and begin programming keyboards/drums and any audio instruments such as
guitars, etc. Crucially and unusually, most pop production teams at this early stage would
book a session singer to record all the backing vocal and harmony parts, thus allowing the
producers to map out exactly what they want from the artist on the main vocal session to
follow. […]
That is what I would call pre-production in pop and everything I have described above is
preparation for the main artist vocal session and could take 3–4 studio days.

Pre-production clearly serves different purposes in pop than it does for rock groups and more
‘serious’ artists. Nevertheless, all the producers I approached, without exception, agreed
that pre-production was important, although for various reasons. Tony Platt (AC/DC, Iron
Maiden, Motorhead et al.) commented that a good reason for pre-production was, ‘making
sure everyone is clear about the album we are going to make’. This point was emphasized by
Manon Grandjean, Breakthrough Engineer of the Year at the 2018 MPG Awards. Manon
has worked with artists such as London Grammar and grime artist Stormzy and says an
important part of pre-production is ‘deciding a common approach in terms of sound’,
adding that, ‘the more you can do in advance with the band to prepare the better. With
smaller budgets nowadays you will have to reduce the studio recording time quite a bit so
rehearsing and sorting keys, tempos, gear, getting to know the band, rehearsing the songs,
etc. are key.’

Reflections
Talking recently to some young new producer/writers, I learned that a way of making records has developed which merges the writing, the arrangement and even the performances, with different artists contributing the rhythm arrangement, the lyrics, the melodies and the voices in a collaborative effort. The entire production becomes a continuous process from composition to the mix, so the whole process is virtually a continuous state of pre-
production! What is clear from this review of producers’ methods is that, at the very least,
taking the time to discuss with artists their vision and an agreed approach to recording is a
valuable process and highly recommended to aspiring producers.

Notes
1. The most common exception is when an artist develops a composition out of in-studio
improvisation, when the sound has been set up and the performance is part of the
composition process. Artists such as Prince and Dave Stewart liked to work in this way,
although usually material would subsequently be developed further and re-recorded or
parts added.
2. This approach helped me to see that the ‘hook’ riff in the song ‘Echo Beach’ by Martha
and the Muffins (1980) appeared only in the intro, so I suggested that the riff appear again before the second verse and on the outro, making this most attractive and catchy part more prominent in the recording.
3. All of the comments quoted here came either from direct conversations or by email
exchanges.

Bibliography
Burgess, R. J. (1994), The Art of Record Production, London: Omnibus Press.
Burgess, R. J. (2013), The Art of Music Production, 4th edn, New York: Oxford University Press.
Horn, T. (2008), ‘Trevor Horn Interview at SARM Studios – Part One’, RecordProduction.
com. Available online: http://www.recordproduction.com/trevor-horn-record-producer.
html (accessed 5 July 2018).
Moore, A. F. (2005), ‘Beyond a Musicology of Production’, in Proceedings of the Art of Record
Production Conference, University of Westminster, 17–18 September. Available online:
https://www.artofrecordproduction.com/aorpjoom/arp-conferences/arp-2005/17-arp-
2005/76-moore-2005 (accessed 27 August 2019).
Zagorski-Thomas, S. (2007), ‘The Musicology of Record Production’, Twentieth-Century
Music, 4 (4): 189–207. Available online: http://journals.cambridge.org/action/display
AbstractfromPage=online&aid=1742460&fulltextType=RA&fileId=S1478572208000509
(accessed 11 July 2008).
Zak, A. J., III (2001), The Poetics of Rock: Cutting Records, Making Tracks, London: University
of California Press.

Discography
The Beatles (1965), [7” vinyl single] ‘Ticket To Ride’, Parlophone.
Blue Oyster Cult (1976), [7” vinyl single] ‘(Don’t Fear) The Reaper’, Columbia.
Free (1970), [7” vinyl single] ‘All Right Now’, Island.
Gaye, Marvin (1968), [7” vinyl single] ‘I Heard It Through The Grapevine’, Tamla.
Knight, Gladys, and the Pips (1967), [7” vinyl single] ‘I Heard It Through The Grapevine’,
Tamla.
Martha and the Muffins (1980), [7” vinyl single] ‘Echo Beach’, DinDisc.
The Rolling Stones (1965), [7” vinyl single] ‘(I Can’t Get No) Satisfaction’, Decca.
Robbie Williams (1998), [CD single] ‘Millennium’, Chrysalis.
Part V
Creating Recorded Music

The idea that musical practices required in the recording process might be radically different
from those required in the concert hall crops up at various points in this book. While it was
true that the studio was the ‘abnormal’ and the concert hall the ‘normal’ situation in the days
before cheap home recording equipment, there is now a generation (or more) of musicians
who were experienced at recording before they had extensive exposure to the concert hall.
Either way round, performance, writing and arranging all happen differently when there is
recording involved than they do for live music. In some styles of music the onus has remained
on the studio environment to accommodate the performers, while in others the musicians
have engaged with the technology and adjusted their habitus. But in either direction there has
had to be movement. So while classical and folk performers still tend to record in ensembles
without much physical or acoustic separation between them, they are still working around
the editing process in some way – creating performances that are structured more like
rehearsals (i.e. repeating sections that are not quite right in some way) – and yet they are
being asked to maintain the intensity and emotional energy of the ‘one chance only’ concert
performance. The various styles of popular music that involve real-time performance, on the
other hand, have seen those performers engaging with the affordances (and requirements)
of recording technology – overdubbing, playing short fragments, performing in isolation,
performing to a click track or performing to the sound of their instrument or voice through
various types of processing. And these different forms of performance in the studio do not
fall simply into two camps. There is a whole continuum of variation and the production
process often involves an explicit or implicit ongoing negotiation about what is the most
favoured or effective approach at any given moment in the process.
And these types of negotiation, as well as others to do with song writing, arranging
and sound design, are central to the question of what it means to be creative in the studio.
Creativity in the studio relies on some of the activities being unplanned – if all the creativity
happens in advance, everything is pre-planned and the process will simply involve the
technical realization of a plan. So ‘creativity’ undermines that other idea which has been
cropping up throughout the book, the idea of recording as an industrial process, because
an industrial process is one where everything is pre-planned. However, in any recording
process there is going to be an element of quality control – even if everything else has
been pre-planned someone needs to decide on whether the performance was good enough
for artistic or commercial purposes. But very often, as these chapters testify, the music
production process starts when many of the features are still (deliberately) unplanned –
song writing, arranging, feel, atmosphere and even the ways in which the recording is
going to help to construct the musical identity of the artist.
Another theme that creeps in and out of the chapters in this part of the volume as well as
elsewhere is the way in which technology can enhance or inhibit creativity – and whose? Very
often in the history of recording, developments in the technology have shifted some aspect
of the creative agency in the process from one set of actors to another rather than providing
some conceptually new affordance. In some instances this has moved the agency away from
skilled technicians to musical creatives by simplifying the interface in some way – although
this usually also shifts some of the technical nuance and detail of control away from the studio
and into the product design process. Thus, the notion of ‘presets’ has become more ubiquitous
across many technologies – from plug-ins to guitar pedals to electronic instrument sounds –
and these involve shifting much of the detail of the decision-making process into the design
stage. However, technologies that automate or de-skill some activity can be empowering
because they free you up to focus your attention and your energies on some other aspect of
the process. But you have to be able to work out what you want to focus on.
And that notion of where the knowledge can and should lie in the production process
also brings us to the question of how that knowledge is disseminated. In the second half of
the twentieth century that question would have been about apprenticeship, training and the
informal learning processes of home-studio production but the formal education sector –
what McNally and Seay term the ‘cognitive apprenticeship’ approach to learning – has
become hugely influential in recent years. Music production has shifted sharply from being
something that was engaged in largely by professionals in the 1970s to being something that
currently involves more professionals supplying the ‘production technology consumers’
with products and services to enable them to make recorded music. There are the designers
and suppliers of DAWs, plug-ins and other technology, and the various forms of experts
who sell the knowledge that allows these (largely non-professional) users to engage more
effectively. On the one hand, we can see this as part of the huge supply infrastructure of a
more gig-economy driven cultural industry and, on the other hand, we can see this as a
shift in the balance of consumer activity – from the musicians as producers and audiences
as consumers, to a time when being a ‘musician’ has become a less specialized and skilled
activity (for many) and more of a consumer activity.
Finally we come to the question of how these processes of musicking in the recording
process contribute to the creation of identity. The idea of the persona has been extensively
explored in popular music studies – the various ways in which an artist can present an
identity through their musical practice (and the accompanying promotional activity). And
that idea extends to the notion of recording practice as well. Alexa Woloshyn explores
many of the ways in which identity can both be expressed through the sound of recorded
performances and through the ideological choices that artists may be seen to have made
during the production process. And this, of course, relates back strongly to Mike Alleyne’s
discussion of authenticity in Chapter 2.
12
Songwriting in the Studio
Simon Barber

Introduction
I was fourteen when I wrote my first song and sixteen when I made my earliest studio
recordings. These experiences set me on a path that has guided my creative and scholarly
interests since, so it is not surprising to me that twenty-five years later I find myself using this
space to explore how professional musicians engage with the art and craft of songwriting
as a mode of studio practice. Though the writing of popular songs and the production
of commercial music recordings have, chiefly for reasons of cost, often been overlapping
but separate processes, the increasing availability of inexpensive, high-quality recording
solutions has, in recent years, afforded songwriters opportunities to experiment more
readily with a philosophy of ‘writing as recording’. This is an approach where the component
parts of the song are recorded in their final form as soon as they are written, and sometimes
written as they are recorded, with a view to keeping the initial inspiration intact on the final
release. With the modern digital audio workstation (DAW) having become central to the
creative process in the studio (Marrington 2017; Bennett 2018), the conventional routine
of remaking one’s ‘demos’ in a costly multitrack recording facility has become an obsolete
concept for many artists working within the commercial music industries. Drawing
from insights provided by professional songwriters speaking on my podcast, Sodajerker
On Songwriting, I find that the primary motivation for adopting this methodology is to
conflate stages of the process involving intermediate recording in order to achieve finished
productions which they perceive to possess the kind of energy and immediacy usually
associated with the demo. This practice also emerges as a productive strategy for artists
subject to the demands of record label deadlines and budgetary constraints.
I begin this chapter by looking at how the recording studio has been conceptualized as a
space for creativity and collaboration through songwriting. I consider some of the concepts
that have been used by scholars to explain how songwriting happens in the studio and the
importance of studio technologies in the evolution of recording practice. From there, I
trace how songwriting has been understood to work as a production culture, reviewing
some of the ways in which organizations such as Motown and Stock Aitken Waterman
(SAW) have achieved success as a result of studio-based songwriting teams working in
time-compressed writing and recording configurations. I then turn to the comments
made by songwriters appearing on the Sodajerker podcast, showing how they use these
strategies to synthesize writing and recording processes in the studio. Following Bennett
(2018), I adopt a broadly ethnographic approach, focusing on interviews with professional
songwriters whose music is subject to critical and commercial scrutiny and situating the
creative work they describe in the context of the creative objects they produce. I conclude
by giving an overview of some of the ways in which technology is informing songwriting
beyond the studio space, such as via artificial intelligence (AI) platforms and apps, and how
these innovations are already posing challenges to the creative practices discussed herein.

The recording studio as a space for songwriting
Before we can look in more detail at how songwriters strive to consolidate the writing and
recording process in the studio, it is first necessary to contextualize some of the particulars of
this environment and its culture. There are, after all, numerous places in which songwriting
happens – from the bedroom, to the rehearsal/writing room, to the publisher’s offices – and
all of them may facilitate the creative process in different ways. As Thompson and Lashua
(2016) point out, recording studios are not typically open to the general public and attempts
to mythologize studio practice are rife in popular media such as film and television; see, for
example, Muscle Shoals (2013), Empire (2015) or A Star is Born (2018). This can make it
challenging for outsiders to appreciate the diverse aesthetics of recording studios, the social
dynamics that exist between artists and workers in these spaces, and their importance as
hubs of creativity in populated areas (Gibson 2005). Thompson and Lashua (2016) argue
that contemporary trends amalgamating songwriting, production and performance duties
under the purview of individual artists and producers are to be resisted in order to undermine
notions of auteurism and to highlight the complex relations between staff, creative workers
and technology in studio environments. The same conclusion is drawn by Thompson in his
work with Harding (2019), which posits a ‘service model’ of creativity observed through the
working practices of SAW at PWL (Pete Waterman Limited) during the 1980s.
This seemingly fluid characterization is, however, complicated by the work of scholars
such as artist-producer Paula Wolfe (2019), whose research evinces the field of popular
music production and recording studios as gendered spaces subject to an overwhelmingly
male bias. Wolfe argues that the music industry establishment is dominated by patriarchal
frameworks designed to encourage stereotyping, marginalization and containment of
women’s achievements in the studio and across wider spheres of media production. Indeed,
the same problems apply to songwriting and production as professional activities. A report
from the Annenberg Inclusion Initiative (Smith et al. 2019), which looks at the gender
and race/ethnicity of artists, songwriters and producers across 700 popular songs between
2012 and 2018, finds that females are missing in popular music. Of the 3,330 songwriters
credited in the sample, just 12.8 per cent were women. Statistics on 871 producers across
400 songs were also damning with just 2.1 per cent identified as female. Moreover, just 4 of
the 871 producers credited were women of colour.
It is with this lack of diversity duly noted that I begin to raise questions about how the
creative process is activated in the recording studio and how it relates to songwriting. One
of the most prominent ways in which creativity has been understood in this environment
is through the systems model, a concept advanced by the work of Csikszentmihalyi (1988)
and taken up by scholars such as McIntyre (2008b, 2011), Bennett (2012) and Thompson
(2016). The principle of this idea is that creativity is the product of three interacting forces:
A set of social institutions, or field, that selects from the variations produced by individuals
those that are worth preserving; a stable cultural domain that will preserve and transmit the
selected new ideas or forms to the following generations; and finally the individual, who
brings about some change in the domain, a change that the field will consider to be creative.
(Csikszentmihalyi 2014: 47)

Csikszentmihalyi’s model provides us with a foundation for a more practical
understanding of creativity in the recording studio and the specific kinds of interactions
that happen within this domain. The evidence for this emerges in ethnographic work on
the routines, practices and aspirations of technically enabled songwriters and producers,
including music students (Marrington 2011), aspirant hitmakers (Auvinen 2017) and
seasoned professionals (Bennett 2018). Research by Gooderson and Henley (2017) explores
the creative process of songwriting through the contrasting perspectives of a professional
songwriting team and a student songwriting team in order to gain insights into the
epistemology of the practice of songwriting. Taking Csikszentmihalyi’s (1988) model as
the foundation of their analysis, Gooderson and Henley (2017) continue the claim (via
Toynbee 2000) that popular musicians rework material that is already present in the social
domain. Or as McIntyre (2008a) puts it, songwriters use their knowledge of existing songs
to produce variations on prominent conventions and rules. It is their mastery of the rules
of the domain, and the acceptance of their decisions by others, which confirms their status
as producers of impactful ideas in the field or, after Bourdieu (1993), reaps the reward
of cultural capital. In the present, the affordances of modern technology are central to this
practice and Bennett’s (2018) emphasis on the DAW and broadband internet connectivity
as primary tools chimes with the experiences of recording studio workers across studies of
different genres (Morey and McIntyre 2014) and non-academic studies, such as Seabrook’s
(2015) account of the industrialized approaches of leading pop producers in New York, Los
Angeles, Stockholm and Korea. And yet, as indicated, women and people of colour remain
largely relegated to supporting roles in these narratives.

Songwriting as a mode of production


Throughout history, there have been a variety of popular musical styles and traditions
that have achieved sustained commercial success as a result of organized approaches to
songwriting (Goldmark 2015). The cycle of song production within production houses,
publishing companies and record labels is necessarily formulaic; a process that often
incorporates an attempt to replicate the performance of some past success by fashioning
something new from a proven source. Influenced to a significant degree by Marxist critiques
of mass production and manipulation (Adorno 1941), the canons, fashions, symbols and
meanings of pop have been subject to dismissal and devaluation. As Jones explains: ‘Once
we insert the term “commodity” into the debate, and thus re-specify pop as an industrial
product as well as a musical activity, a source of pleasure or a source of meaning, it can
provoke a kind of reflex response in that debate’s participants’ (2003: 148). However, the
context of production need not reflect negatively on the quality of the music produced. In
his work on the musicology of pop songs, Hennion makes obvious the futility of trying to
separate a popular song from the machinations of its creation:
Is success achieved through bribery, through massive ‘plugging’, through a dulling of the
senses or through conformism, as the ritual claims of the press would have it? Is it a by-
product of profit, of standardisation, of alienation or of the prevailing ideology, as Marxists
argue? (1983: 159)

During the past 100 years, from the standards of the Great American Songbook and
the emergence of rock ‘n’ roll, to the social and political power of popular music culture in
the 1960s (Etzkorn 1963), hubs of songwriting concentration across North America such
as Tin Pan Alley, the Brill Building, Motown and Fame Studios in Muscle Shoals, have
been the basis of our understanding of the influence and scope of what is possible with
pop music. As documented elsewhere (Emerson 2006; Barber 2016), the so-called ‘factory’
model adopted by songwriters in and around the Brill Building in New York during the
1950s and 1960s continued this successful mode of production. Indeed, songwriting teams
such as Barry and Greenwich, Mann and Weil, Goffin and King, and Bacharach and David
accounted for forty-eight pop hits with thirty-two different performers between 1963 and
1964 alone (Fitzgerald 1995b: 61).
Bennett defines the factory model as a physical space, usually an office or studio
environment formalized in part by conventional working hours and able to ‘produce a large
number of “hits” owing to its work ethic and quality control systems’ (2012: 159). However,
as Inglis argues, the concept of the ‘production line’ or ‘songwriting factory’ reduces the
creative act of writing a hit song to a workaday task (2003: 215), which is insufficient as a
way of understanding the complexities and subtleties of this kind of creative labour. It is
worth remembering that the motivations behind commercial song production are complex
and rely on a mix of factors affecting songwriters (institutional stimuli, leisure) in tandem
with the work of (typically male) animateurs, such as publishers or entrepreneurs, focused
on the exploitation of songs as a commodity in a capitalistic enterprise (Henderson and
Spracklen 2017). To turn to our previous examples, this has been the case with the Brill
Building-era publishing company Aldon Music (Don Kirshner and Al Nevins), Motown
(Berry Gordy), SAW (Pete Waterman) and Xenomania (Brian Higgins). In Studio A at
Motown, aka ‘The Snake Pit’, songs by writing teams such as Holland-Dozier-Holland were
expeditiously fashioned into records. The company’s round-the-clock studio practices, its
quality control meetings and reservoir of melodic songs bound up in social and political
commentary make for a striking comparison with accounts of the Brill Building-era song
industry (Fitzgerald 1995a; Nelson 2007). Indeed, Motown founder Berry Gordy was
directly inspired by the efficiency of the Brill Building model, which Motown ultimately
‘updated and replaced’ (Fitzgerald 1995b: 75).
Closer to home, teams at UK ‘hit factories’ such as SAW and Xenomania, and Cheiron
Studios in Sweden, mirrored the methodologies of these successful production houses and
record labels. In Britain during the 1980s, SAW was renowned for its highly routinized,
round-the-clock recording operation responsible for pop hits by Kylie Minogue, Jason
Donovan, Rick Astley and others. SAW achieved more than 100 UK top forty hits,
earning somewhere in the region of £60 million from the sale of approximately 40 million
records (Petridis 2005). Such a fast-paced environment necessitated the manipulation of
existing material using tricks such as reversing chord structures and bass lines from prior
hits in order to produce new songs that would emit the auratic qualities of their already
successful works (Sodajerker 2013). Toulson and Burgess (2017) convey this fact using
Harding’s account of programming and mixing the Bananarama song ‘Venus’ with the
same techniques previously used on Dead or Alive’s most successful records: ‘Within 12
hours we had turned Bananarama into a female Dead or Alive and the record was huge
everywhere’ (Harding 2009: 64).
In The Song Machine (2015), journalist John Seabrook outlines a systematic songwriting
process called ‘track and hook’. This is an approach, industrialized primarily in the recording
studios of Sweden, New York and Los Angeles by hitmakers such as Max Martin and
Stargate, in which the writing and production of instrumental backing music comes first,
with specialist freelancers brought in to ‘top-line’ or add melodies and lyrics to preexisting
tracks. A top-liner might be one of several talented experts asked to invent a hook for
a song, making this an economically precarious profession but with the opportunity for
significant reward. ‘Songwriters would be assigned different parts of a song to work on,’ says
Seabrook, ‘choruses would be taken from one song and tried in another; a bridge might be
swapped out, or a hook. Songs were written more like television shows, by teams of writers
who willingly shared credit’ (64). ‘Writing camps’ – gatherings convened by labels and
artists at which dozens of track writers and top-liners are invited to exchange ideas in a
sort of ‘speed dating for songwriters’ format – is another strategy that is frequently adopted
in an attempt to ignite a musical spark between collaborators. In this way, bids to increase
the likelihood of success by obtaining outside input have become standard practice for
songwriting/production teams and solo singer-songwriters. Ed Sheeran’s documentary
Songwriter (2018) is a rare depiction of how collaboration with multiple writers in the
studio has become the status quo for marquee names under pressure to deliver a steady
flow of hit records.
Columbia University’s David Hajdu (2015), music critic for The Nation, criticized
Seabrook for adopting an industrial scheme to explain contemporary pop songwriting
and production. He argues that Seabrook overlooks the fact that we’ve already entered
the post-industrial age, a world in which the song factories have closed, and the old
tropes of the machine era no longer adequately explain the organization of creative
work by songwriters. Given the previous examples presented in this chapter, and those
introduced in the next section, I’d argue that contemporary pop songwriting and
production, while transformed by digitalization, owes a greater debt to the past than
Hajdu suggests. That said, the need to understand not only the technology and culture
of the modern recording studio but also the politics of song administration – staking
claims to song splits where five, eight or ten others might be sharing credit – has surely
become a dizzying aspect of the contemporary freelance songwriter’s skill set (Morey
and McIntyre 2011; de Laat 2015).

Writing as recording
Why bother playing a part if you can’t have it count towards the actual final recording? […]
I couldn’t even imagine doing that. It makes me nauseous.
Andrew W.K. (Sodajerker 2018)

In this section, I want to bring forth the voices of professional practitioners to illustrate
how they are conceptualizing songwriting and production as a continuum taking place
in the recording studio in as few stages as possible. Drawing on a range of prominent
individuals working in a variety of popular music styles, I use their commentary to
unpick the philosophy of writing as recording established at the outset, connecting these
approaches to the historical trajectory of industry practice reviewed thus far. Of course,
gathering reflections on the process from songwriters is not a new idea. There have been
a number of authors, Zollo (2003) and Rachel (2013) among them, who have published
interviews with songwriters about their craft. However, the desire of popular media to
overlook everyday working practices and routines in favour of understanding songwriters
as ‘romantically inspired artists’ (Negus and Astor 2015) is strong, and the majority of such
sources cultivate that notion. I have therefore deliberately selected subjects working in the
solo singer-songwriter idiom in order to explore the extent to which such artists might be
engaged with the routines, practices and decisions associated with the industrial context
for songwriting established earlier in this chapter.
Working together under the name Sodajerker, my colleague Brian O’Connor and I
began recording interviews with professional songwriters in late 2011. The resulting
podcast, Sodajerker On Songwriting, contains 140 episodes at the time of writing featuring
our conversations with world-renowned practitioners spanning from the 1950s to the
present day. As Llinares, Fox and Berry (2018) have claimed, over the past fifteen years
podcasting has come into its own as a new aural culture. The long-form podcast, and the
freedom it engenders for discussion of niche topics, has emerged as a critically important
methodology for my research, affording songwriters a more expansive space in which to
articulate their working routines and their place in the wider political economy of the
music industries.
When asked about the writing of the song ‘Music Is Worth Living For’, the American
rock performer Andrew W.K., whose words opened this section, was quick to reveal that
he doesn’t compartmentalize the writing and recording processes, nor does he test out his
ideas in an intermediate recording stage: ‘I don’t think I’ve ever recorded what would be
considered demos, where you practise a recording and then you scrap that and start fresh,
even if you like it,’ he says. ‘It’s such a waste! If you’ve already started working on it, why
start over again?’ (Sodajerker 2018). Capturing an idea at the moment of inspiration has
become a central way in which artists such as W.K. carry out the record-making process,
and this is made possible by the affordability of high-resolution recording systems, which
are frequently installed in the homes of recording artists, away from the pressures of time
and money associated with large recording complexes. W.K. maintains that his passion and
emotional investment in the song is contained within the originating seed of the idea and
encourages songwriters to capture that initial inspiration and keep it within the recording
until completion rather than abandoning it in service of ‘trying to grow the same tree’
with subsequent re-recordings (Sodajerker 2018). This holds with prior research (Long
and Barber 2015), which finds that the core of the songwriter’s labour deals in emotion,
investing and articulating feelings in forms recognizable enough to listeners that they elicit
affective responses. This is realized in W.K.’s music through a strongly held belief system of
positivity, self-actualization and ‘partying’.
Recording parts as they are conceived, or conceiving them as they are recorded,
relieves W.K. from the frustration of trying to recapture the energy of the demo in a
separate recording session. However, there is a distinction to be made here with the art
of improvisation, as the artist is willing to re-record a part during the initial period of
inspiration if it helps to solidify his compositional goals. ‘Once you get it, you’ve got it,’ he
relates. However, while W.K. relishes working in solitude, his status as a recording artist
contracted to Sony Music complicates notions of autonomy (Holt and Lapenta 2010) and
his work is facilitated by recording budgets, record label staff, engineers and feedback from
friends and family, among other relationships within the infrastructure that surrounds
him. The importance of collaboration is borne out in W.K.’s account of deciding to
spontaneously write/record the spoken word sections contained on his 2018 album, You’re
Not Alone, whilst in the final stages of post-production on the record. This he achieved
unconventionally with help from his mastering engineer (Sodajerker 2018).
Singer-songwriter Jack Savoretti also provides evidence of his desire to release recordings
which contain his earliest performances of an idea:
The studio is my favourite place because I love capturing it while we write it. There’s
something about how you sing a song the first time you write it that’s hard to recreate, even
if it’s not perfect. The melody might develop and get better on tour, or more emotional, but
there’s something about when somebody sings a song that’s come out of nowhere, I think it’s
quite captivating when you hear it; it’s genuine. (Sodajerker 2017a)

Savoretti’s position is guided by a sense of what represents an authentic performance
but also his experiences within the music industries. Scholars who have scrutinized
creativity in the cultural industries (Hesmondhalgh 2007; Menger 2014) connect this
kind of decision-making to the political, economic and cultural frameworks of capitalist
production. Savoretti is motivated, then, not only by the aesthetic benefits of this approach
but also the economic considerations:
Demo is like a dirty word for me now, because I’ve spent so many years wasting money –
people going, ‘oh, we love it, now let’s spend time and money trying to recreate it’. I’m like,
‘either we go and do something different or that’s the one we use’. So now 90 per cent of the
vocal takes you hear on the record, or the last two records, are me singing it the day the song
was written. (Sodajerker 2017a)

As revealed in earlier discussion, these sorts of economically and artistically productive
strategies for recording can be observed in the lineage of time-compressed ‘factory’-style
production routines used not only by pop songwriting teams but also solo artists such as
Savoretti. ‘I like to go into the studio at twelve, go for lunch, get that out of the way, and
then go back into the studio and it’s heads down until six-thirty, seven o’clock, eight, and
leave with a finished product,’ he says. ‘Studios are very easy to just fuss about something
for hours. It’s amazing what you can do when you do it’ (Sodajerker 2017a).
Following my earlier review of the marginalization of female songwriters in
the recording studio, I want to turn now to the views of two disparate but equally
accomplished women: the award-winning singer-songwriter Dido, and the celebrated
folk singer and banjoist Rhiannon Giddens. As with the artists discussed in previous
research (Barber 2017; Long and Barber 2017), and the strategic practices documented
in this chapter, these artists think deeply about the processes of writing and recording,
and adopt strategies that they believe will enhance the emotional impact of their work
and improve their efficiency. Dido finds that once she has an idea for a song it is her
instinctive creative response that will produce the most satisfactory results: ‘I think sort
of going with that first instinct has always worked for me. ... Every single vocal on this
record, I was like, “look, I’m just going to put it down while I’ve got the idea and we’ll
do the proper vocals when I come back next time,” and then I just never did the proper
vocals’ (Sodajerker 2019a). Dido’s affection for early takes of a performance extends to
capturing the absolute first performance of a song and ensuring that the moment of
creation is made available as the finished product:
On this record, there’s the song called ‘Have To Stay’ I wrote with Ryan Louder. I had a
melody in my head, but while I was singing him the melody without playing the guitar, he
started filling in these really beautiful chords and that’s actually what’s on the record, is him
working it out while I’m singing the song to him. You can hear, ‘cause he sits on the first
chord for, like, two verses; he’s obviously just listening and thinking ‘where shall I go?’ So,
you can get moments like that, which are literally the first idea and you’ve just kept it and I
always make sure we record the first idea of anything. (Sodajerker 2019a)

Collaboration, then, is an essential part of being able to achieve performances of
embryonic ideas that might develop in the studio to meet the standards of finished work.
For Rhiannon Giddens, her creative synergy with multi-instrumentalist Francesco Turrisi
afforded them the luxury of recording at an early stage in the writing and arranging process,
aided by the organic nature of their approach to traditional music: ‘A lot of the tracks were
actually first takes and some of them were the only take,’ says Giddens. ‘There was a sense
of being in a constant river of creation and not really worrying about, “Is this the song? Is
this the take? Is this the arrangement?” It was just, “we’re going to do this” and it was all
captured and we usually knew, “okay, there’s no need to be doing this again, that was the
one”’ (Sodajerker 2019b).
Giddens’s position raises an interesting question for those bypassing intermediate stages
of recording in the production of popular music, which could be expressed as: ‘when does
a song become a song?’ The songwriter, drummer and video director Kevin Godley, of 10cc
and Godley and Creme, recalls the studio process that led to the song ‘Neanderthal Man’, a
sprawling chant recorded at Strawberry Studios in Stockport during an equipment test by
the line-up that became 10cc, but which was released under the moniker ‘Hotlegs’:
We were in Strawberry Studios, testing the gear, it was like, ‘okay, let’s do the drums first’,
so I laid down a beat and then, ‘well, what if we double-track it?’ Okay. ‘What are we going
to put over this? How about “I’m a Neanderthal Man?”’ It’s a process that you go through,
it’s an experimental process, a research process. It’s a lab. The recording studio is a lab to
me and we weren’t aiming to turn it into a song, but something in our sensibilities jointly
helped us turn it into a recording. I think it’s a recording, it’s not a song, that’s for sure.
(Sodajerker 2016)

While a song’s status and identity can be discovered through the writing as recording
process – or in the case of ‘Neanderthal Man’, through commercial exploitation – having
the help of studio collaborators to seize upon raw ideas and imbue them with the urgency
required for rapid development is again seen to be critically important. Noel Gallagher,
who is known for his autocratic approach to the creative process, reports that producer
David Holmes facilitated his writing process on an eponymous album for his High Flying
Birds. Gallagher describes a regimented routine in which he would spend the day in the
studio with Holmes, his spontaneous attempts to write songs captured by the producer.
Holmes would then use this as the raw material for a curation process that would happen
outside of normal working hours:
David would start working at six and he couldn’t wait to get me out the fucking door. He’d be
like, ‘it’s ten to six, shall I get you a cab?’ Then he’d stick his headphones on with his computer
and he’d get into what he was doing and I’d come back the next day and he’d say, ‘I’ve found
this bit of music that you played’ and I’d be like, ‘wow, fucking hell, that’s when I’d switched
off!’ And he’d say, ‘this is going to be the song’. (Sodajerker 2017b)

Songwriting beyond the studio


In this chapter, I have explored how professional musicians engage with songwriting as
a mode of studio practice. Developments in home recording technologies have afforded
singer-songwriters and songwriting teams opportunities to abandon intermediate ‘demo’
stages of production, developing release-quality recordings during the writing process. This
production methodology has two main benefits: it provides artists with ‘inspired’, dynamic
recordings and simultaneously reduces the costs associated with production. When these
practices are located within the context of the long history of industrialized songwriting,
and the increasingly modular ways in which this kind of work is carried out by teams, it
becomes easier to appreciate how contemporary attitudes to technology, collaboration and
time constraints are designed to increase the efficacy of production while still enhancing
the durability and quality of the material. In that sense, the distance between the hit factory
and the contemporary singer-songwriter is smaller than one might imagine, though issues
of gender balance, equality and power continue to pervade most forms of studio practice
in the present.
These conclusions are also worthy of attention in the context of recent and new
technologies designed to assist songwriters in the writing and production process. In line
with Bennett’s (2018) claims about the vital role of internet connectivity in contemporary
practice, collaboration online through interconnected networks of remote musicians
crowdsourcing for partners to perform, mix or remix their work has become commonplace
(Hajimichael 2011; Koszolko 2015). However, the emergence of AI engines drawing on vast
databases of music data engenders postmodern fears for some about the future redundancy
of songwriters and composers in the production process (Reilly 2019; Spezzatti 2019). The
reality, of course, typically involves the deployment of these technologies for the purposes
of automation, assistance and augmentation rather than autonomous control (Davis 2019).
Beyond the traditional recording studio facility, or even the home studio, applications
such as Amadeus Code (French 2018; Murphy 2019) bring the possibilities of assisted
music production to the smartphone. They join a substantial history of technology firms
working with artists to produce apps and platforms based on deep learning networks to
enhance music creation (Deahl 2019). This has meant a steady increase in the amount of
music generated through algorithmic or computational methods, particularly in the area
of library music (Passman 2017; Jancer 2018), but questions still remain about the ability
of AI platforms such as Amper to write innovative or engaging pop songs (Garza 2018;
Deahl 2019).
Returning to our discussion of social fields and domains of knowledge, AI music
solutions, much like human songwriters, base their ability to produce new songs on learning
the structures, words and melodic patterns present in existing catalogues of material,
and so they can be asked to regurgitate generic country songs or mimic the music of The
Beatles if required (Heaven 2019). While AI can master the symbolic rules of a culture and
may even bring novelty to the domain, the discourse raises questions for this researcher
about the extent to which innovations generated by AI will be recognized and validated by
a field of experts (Knight 2016; Dredge 2019; Weiner 2019). Furthermore, songs written
with the assistance of AI are subject to debates about the validity of copyright (Carlisle
2019) and questions remain regarding the potential for technologies such as blockchain
to protect songwriters’ interests and ensure fair remuneration (Rys 2017); a potentially
positive step forward for beleaguered creators in the age of streaming services. With the
AI sector estimated to be worth £232 billion to the UK economy by 2030 (PwC 2017),
there can be little doubt that the recording studio, however that space evolves, or devolves,
will continue to be a complex product of the interacting forces of humans and technology
working in pursuit of hit songs.

Bibliography
Adorno, T. (1941), ‘On Popular Music’, Studies in Philosophy and Social Science, 9 (17): 17–48.
A Star Is Born (2018), [Film] Dir. Bradley Cooper, USA: Warner Bros Pictures.
Auvinen, T. (2017), ‘A New Breed of Home Studio Producer?: Agency and the Idea “Tracker”
in Contemporary Home Studio Music Production’, Journal on the Art of Record Production,
(11). Available online: https://www.arpjournal.com/asarpwp/a-new-breed-of-home-studio-
producer-agency-and-the-idea-tracker-in-contemporary-home-studio-music-production
(accessed 7 July 2019).
Barber, S. (2016), ‘The Brill Building and the Creative Labor of the Professional Songwriter’, in
K. Williams and J. A. Williams (eds), The Cambridge Companion to the Singer-Songwriter,
67–77, Cambridge: Cambridge University Press.
Barber, S. (2017), ‘Professional Songwriting Techniques: A Range of Views Summarised from
the Sodajerker Interviews (2011–2015)’, in J. Williams and K. Williams (eds), The Singer-
Songwriter Handbook, 51–68, New York: Bloomsbury Academic.
Bennett, J. (2012), ‘Constraint, Collaboration and Creativity in Popular Songwriting Teams’, in
D. Collins (ed.), The Act of Musical Composition: Studies in the Creative Process, 139–169,
Farnham: Ashgate.
Bennett, J. (2018), ‘Songwriting, Digital Audio Workstations, and the Internet’, in N. Donin
(ed.), The Oxford Handbook of the Creative Process in Music, 1–25, New York: Oxford
University Press.
Bourdieu, P. (1993), The Field of Cultural Production: Essays on Art and Literature, New York:
Columbia University Press.
Carlisle, S. (2019), ‘Should Music Created by Artificial Intelligence be Protected by Copyright?’,
NSU Florida, 7 June. Available online: http://copyright.nova.edu/ai (accessed 7 July 2019).
Csikszentmihalyi, M. (1988), ‘Society, Culture, and Person: A Systems View of Creativity’,
in R. Sternberg (ed.), The Nature of Creativity: Contemporary Psychological Perspectives,
325–339, Cambridge: Cambridge University Press.
Csikszentmihalyi, M. (2014), The Systems Model of Creativity: The Collected Works of Mihaly
Csikszentmihalyi, Dordrecht: Springer.
Davis, H. (2019), ‘Robot Rhythms: The Startups Using AI to Shake Up the Music Business’,
The Guardian, 18 June. Available online: https://www.theguardian.com/music/2019/
jun/18/robot-rhythms-the-startups-using-ai-to-shake-up-the-music-business (accessed
7 July 2019).
de Laat, K. (2015), ‘“Write a Word, Get a Third”: Managing Conflict and Rewards in
Professional Songwriting Teams’, Work and Occupations, 42 (2): 225–256.
Deahl, D. (2019), ‘How AI-Generated Music is Changing the Ways Hits Are Made’, The Verge,
31 August. Available online: https://www.theverge.com/2018/8/31/17777008/artificial-
intelligence-taryn-southern-amper-music (accessed 7 July 2019).
Dredge, S. (2019), ‘Music Created by Artificial Intelligence Is Better Than You Think: A.I.
Doesn’t Have To Be a Threat to Human Musicians. It Might Actually Improve Their
Melodies’, Medium, 1 February. Available online: https://medium.com/s/story/music-
created-by-artificial-intelligence-is-better-than-you-think-ce73631e2ec5 (accessed 7 July
2019).
Emerson, K. (2006), Always Magic in the Air: The Bomp and Brilliance of the Brill Building Era,
New York: Penguin.
Empire (2015), [TV programme] Fox, 7 January.
Etzkorn, K. (1963), ‘Social Context of Songwriting in the United States’, Ethnomusicology,
7 (2): 96–106.
Fitzgerald, J. (1995a), ‘Motown Crossover Hits 1963–1966 and the Creative Process’, Popular
Music, 14 (1): 1–11.
Fitzgerald, J. (1995b), ‘When the Brill Building Met Lennon-McCartney: Continuity and
Change in the Early Evolution of the Mainstream Pop Song’, Popular Music & Society,
19 (1): 59–77.
French, I. (2018), ‘Amadeus Code: The AI-Powered Songwriting Assistant’, Decoded Magazine,
27 March. Available online: https://www.decodedmagazine.com/amadeus-code-the-ai-
powered-songwriting-assistant (accessed 7 July 2019).
Garza, F. (2018), ‘The Quest to Teach AI to Write Pop Songs’, Gizmodo, 19 April. Available
online: https://gizmodo.com/the-quest-to-teach-ai-to-write-pop-songs-1824157220
(accessed 7 July 2019).
Gibson, C. (2005), ‘Recording Studios: Relational Spaces of Creativity in the City’, Built
Environment, 31 (3): 192–207.
Goldmark, D. (2015), ‘“Making Songs Pay”: Tin Pan Alley’s Formula for Success’, The Musical
Quarterly, 98 (1–2): 3–28.
Gooderson, M. C. and J. Henley (2017), ‘Professional Songwriting: Creativity, the Creative
Process, and Tensions between Higher Education Songwriting and Industry Practice
in the UK’, in G. D. Smith, Z. Moir, M. Brennan, S. Rambarran and P. Kirkman (eds),
The Routledge Research Companion to Popular Music Education, 257–271, New York:
Routledge.
Hajdu, D. (2015), ‘It’s an Old Trope, but How Well Does the Factory Model Explain Pop
Music?’, The Nation, October 29. Available online: https://www.thenation.com/article/its-
an-old-trope-but-how-well-does-the-factory-model-explain-pop-music (accessed 7 July
2019).
Hajimichael, M. (2011), ‘Virtual Oasis – Thoughts and Experiences About Online Based
Music Production and Collaborative Writing Techniques’, Journal on the Art of Record
Production, (5). Available online: https://www.arpjournal.com/asarpwp/virtual-oasis-
–-thoughts-and-experiences-about-online-based-music-production-and-collaborative-
writing-techniques/ (accessed 7 July 2019).
Harding, P. (2009), PWL from the Factory Floor, Bury: W.B. Publishing.
Heaven, D. (2019), ‘Songwriting Machine Churns Out Beatles Hits’, New Scientist, 241
(3214): 16.
Henderson, S. and K. Spracklen (2017), ‘“If I had my way, I’d have been a killer”: Songwriting
and Its Motivations for Leisure and Work’, Leisure/Loisir, 41 (2): 231–247.
Hennion, A. (1983), ‘The Production of Success: An Anti-Musicology of the Pop Song’,
Popular Music, 3: 159–193.
Hesmondhalgh, D. (2007), The Cultural Industries, London: Sage.
Holt, F. and F. Lapenta (2010), ‘Introduction: Autonomy and Creative Labour’, Journal for
Cultural Research, 14 (3): 223–229.
Inglis, I. (2003) ‘“Some Kind of Wonderful”: The Creative Legacy of the Brill Building’,
American Music, 21 (2): 214–235.
Songwriting in the Studio 201

Jancer, M. (2018), ‘More Artists Are Writing Songs in the Key of AI’, Wired, 17 May. Available
online: https://www.wired.com/story/music-written-by-artificial-intelligence (accessed
7 July 2019).
Jones, M. (2003), ‘The Music Industry as Workplace: An Approach to Analysis’, in A. Beck
(ed.), Cultural Work: Understanding the Cultural Industries, 147–156, London: Routledge.
Knight, W. (2016), ‘AI Songsmith Cranks Out Surprisingly Catchy Tunes’, MIT Technology
Review, 30 November. Available online: https://www.technologyreview.com/s/603003/ai-
songsmith-cranks-out-surprisingly-catchy-tunes (accessed 7 July 2019).
Koszolko, M. K. (2015), ‘Crowdsourcing, Jamming and Remixing: A Qualitative Study of
Contemporary Music Production Practices in the Cloud’, Journal on the Art of Record
Production, (10). Available online: https://www.arpjournal.com/asarpwp/crowdsourcing-
jamming-and-remixing-a-qualitative-study-of-contemporary-music-production-practices-
in-the-cloud (accessed 7 July 2019).
Llinares, D., N. Fox and R. Berry (2018), Podcasting: New Aural Cultures and Digital Media,
Cham: Palgrave Macmillan.
Long, P. and S. Barber (2015), ‘Voicing Passion: The Emotional Economy of Songwriting’,
European Journal of Cultural Studies, 18 (2): 142–157.
Long, P. and S. Barber (2017), ‘Conceptualizing Creativity and Strategy in the Work of
Professional Songwriters’, Popular Music & Society, 40 (5): 556–572.
Marrington, M. (2011), ‘Experiencing Musical Composition in the DAW: The Software
Interface as Mediator of the Musical Idea’, Journal on the Art of Record Production,
(5). Available online: https://www.arpjournal.com/asarpwp/experiencing-musical-
composition-in-the-daw-the-software-interface-as-mediator-of-the-musical-idea-2
(accessed 7 July 2019).
Marrington, M. (2017), ‘Composing with the Digital Audio Workstation’, in J. Williams
and K. Williams (eds), The Singer-Songwriter Handbook, 78–89, New York: Bloomsbury
Academic.
McIntyre, P. (2008a), ‘Creativity and Cultural Production: A Study of Contemporary Western
Popular Music Songwriting’, Creativity Research Journal, 20 (1): 40–52.
McIntyre, P. (2008b), ‘The Systems Model of Creativity: Analyzing the Distribution of Power
in the Studio’, Journal on the Art of Record Production, (3). Available online: https://www.
arpjournal.com/asarpwp/the-systems-model-of-creativity-analyzing-the-distribution-of-
power-in-the-studio (accessed 7 July 2019).
McIntyre, P. (2011), ‘Rethinking the Creative Process: The Systems Model of Creativity
Applied to Popular Songwriting’, Journal of Music, Technology and Education, 4 (1):
77–90.
Menger, P. (2014), The Economics of Creativity, Cambridge, MA: Harvard University Press.
Morey, J. and P. McIntyre (2011), ‘“Working Out the Split”: Creative Collaboration and
Assignation of Copyright Across Differing Musical Worlds’, Journal on the Art of Record
Production, (5). Available online: https://www.arpjournal.com/asarpwp/‘working-out-the-
split’-creative-collaboration-and-assignation-of-copyright-across-differing-musical-worlds
(accessed 7 July 2019).
Morey, J. and P. McIntyre (2014), ‘The Creative Studio Practice of Contemporary Dance Music
Sampling Composers’, Dancecult: Journal of Electronic Dance Music Culture, 1 (6): 41–60.
Murphy, D. (2019), ‘Amadeus Code Raises $1.8m to Develop Its AI-Powered Songwriting
Platform’, Mobile Marketing, 31 May. Available online: https://mobilemarketingmagazine.
202 The Bloomsbury Handbook of Music Production

com/amadeus-code-raises-18m-to-develop-its-ai-powered-songwriting-platform (accessed
7 July 2019).
Muscle Shoals (2013), [Film] Dir. Greg ‘Freddy’ Camalier, USA: Magnolia Pictures.
Negus, K. and P. Astor (2015), ‘Songwriters and Song Lyrics: Architecture, Ambiguity and
Repetition’, Popular Music, 34 (2): 226–244.
Nelson, G. (2007), Where Did Our Love Go? The Rise & Fall of the Motown Sound, Urbana, IL:
University of Illinois Press.
Passman, J. (2017), ‘Music as a Commodity: Songwriting with Artificial Intelligence’, Forbes,
3 March. Available online: https://www.forbes.com/sites/jordanpassman/2017/03/03/
music-as-a-commodity-songwriting-with-artificial-intelligence (accessed 7 July 2019).
Petridis, A. (2005), ‘Return of the Hitmen’, The Guardian, 2 December. Available online:
https://www.theguardian.com/music/2005/dec/03/popandrock (accessed 7 July 2019).
PwC (2017), ‘The Economic Impact of Artificial Intelligence on the UK Economy’, June.
Available online: https://www.pwc.co.uk/economic-services/assets/ai-uk-report-v2.pdf
(accessed 7 July 2019).
Rachel, D. (2013), Isle of Noises: Conversations with Great British Songwriters, London:
Picador.
Reilly, D. (2019), ‘A.I. Songwriting Has Arrived. Don’t Panic’, Fortune, 25 October. Available
online: https://fortune.com/2018/10/25/artificial-intelligence-music (accessed 7 July 2019).
Rys, D. (2017), ‘Can Blockchain Keep Songwriters from Getting Stiffed?’, Billboard, 9 March.
Available online: https://www.billboard.com/articles/news/magazine-feature/7717068/
blockchain-payment (accessed 7 July 2019).
Seabrook, J. (2015), The Song Machine: Inside the Hit Factory, New York: W.W. Norton &
Company.
Smith, S. L., M. Choueiti, K. Pieper, H. Clark, A. Case and S. Villanueva (2019), ‘Inclusion
in the Recording Studio? Gender and Race/Ethnicity of Artists, Songwriters & Producers
across 700 Popular Songs from 2012–2018’, USC Annenberg Inclusion Initiative, February.
Available online: http://assets.uscannenberg.org/docs/inclusion-in-the-recording-studio.
pdf (accessed 7 July 2019).
Sodajerker (2013), ‘Episode 32 – Mike Stock’, Sodajerker On Songwriting podcast, 2 January.
Available online: https://www.sodajerker.com/episode-32-mike-stock (accessed 7 July
2019).
Sodajerker (2016), ‘Episode 95 – Kevin Godley’, Sodajerker On Songwriting podcast,
9 November. Available online: https://www.sodajerker.com/episode-95-kevin-godley
(accessed 7 July 2019).
Sodajerker (2017a), ‘Episode 99 – Jack Savoretti’, Sodajerker On Songwriting podcast, 15 May.
Available online: https://www.sodajerker.com/episode-99-jack-savoretti (accessed 7 July
2019).
Sodajerker (2017b), ‘Episode 110 – Noel Gallagher’, Sodajerker On Songwriting podcast,
20 December. Available online: https://www.sodajerker.com/episode-110-noel-gallagher
(accessed 7 July 2019).
Sodajerker (2018), ‘Episode 114 – Andrew W.K.’, Sodajerker On Songwriting podcast,
21 March. Available online: https://www.sodajerker.com/episode-114-andrew-wk
(accessed 7 July 2019).
Sodajerker (2019a), ‘Episode 133 – Dido’, Sodajerker On Songwriting podcast, 18 March.
Available online: https://www.sodajerker.com/episode-133-dido (accessed 7 July 2019).
Songwriting in the Studio 203

Sodajerker (2019b), ‘Episode 139 – Rhiannon Giddens’, Sodajerker On Songwriting podcast,


6 June. Available online: https://www.sodajerker.com/episode-139-rhiannon-giddens
(accessed 7 July 2019).
Songwriter (2018), [Film] Dir. Murray Cummings, UK: Murray Pictures.
Spezzatti, A. (2019), ‘Neural Networks for Music Generation: Can We Reproduce Artists’
Creativity through AI?’, Towards Data Science, 24 June. Available online: https://
towardsdatascience.com/neural-networks-for-music-generation-97c983b50204 (accessed
7 July 2019).
Thompson P. (2016), ‘Scalability of the Creative System in the Recording Studio’, in
P. McIntyre, J. Fulton and E. Paton (eds), The Creative System in Action, 74–86, London:
Palgrave Macmillan.
Thompson, P. and B. Lashua (2016), ‘Producing Music, Producing Myth? Creativity in
Recording Studios’, IASPM@Journal, 6 (2): 70–90.
Thompson, P. and P. Harding (2019), ‘Collective Creativity: A “Service” Model of Commercial
Pop Music Production at PWL in the 1980s’, in R. Hepworth-Sawyer, J. Hodgson, J. Paterson
and Rob Toulson (eds), Innovation in Music: Performance, Production, Technology, and
Business, 149–153, New York: Routledge.
Toulson, R. and R. J. Burgess (2017), ‘Singer-Songwriter Meets Music Production and Studio
Technology’, in J. Williams and K. Williams (eds), The Singer-Songwriter Handbook,
91–112, New York: Bloomsbury Academic.
Toynbee, J. (2000), Making Popular Music: Musicians, Creativity and Institutions, London:
Arnold.
Weiner, K. (2019), ‘Machines Can Create Art, but Can They Jam? Jazz Composition and
Performance Is the Next Frontier in Creative AI’, Scientific American, 29 April. Available
online: https://blogs.scientificamerican.com/observations/machines-can-create-art-but-
can-they-jam (accessed 7 July 2019).
Wolfe, P. (2019), Women in the Studio: Creativity, Control and Gender in Popular Music
Production, New York: Routledge.
Zollo, P. (2003), Songwriters on Songwriting, Cambridge, MA: Da Capo.

Discography
Andrew W.K. (2018), [digital download] ‘Music Is Worth Living For’, Bee & El/Sony Music.
Andrew W.K. (2018), [digital download] You’re Not Alone, Bee & El/Sony Music.
Bananarama (1986), [digital download] ‘Venus’, London Music Stream/Because Music.
Dido (2019), [digital download] ‘Have To Stay’, BMG Rights Management (UK) Ltd.
Hotlegs (1971), [digital download] ‘Neanderthal Man’, UMC (Universal Music Catalogue).
204
13
The Influence of Recording on Performance: Classical Perspectives
Amy Blier-Carruthers

Performing for recording: An act of translation
It is an often-repeated axiom that studio recordings are not the same thing as live
performances, but comparatively little research has been undertaken into what happens
behind the often closed doors of a classical recording studio. What influence does recording
have on the performances that are captured and the performers that are being recorded?
What changes do classical musicians have to make, consciously and unconsciously, to
produce a performance suitable for recording? How can musicians and production teams
use the tools of the recording studio to their creative advantage? It has been shown that
changes certainly are made when a performer has to translate a concert experience into
a recorded one (Auslander 1999; Blier-Carruthers 2010; Fabian 2008; Katz 2004; Philip
2004). A recorded performance has many traits that a live one does not share: it is captured,
mediated, collaborated upon, retaken, edited, constructed, retouched, packaged, displaced,
consumed differently, carried around and repeated.
It may not be an overstatement to say that classical recording studios are mysterious
spaces – producers and sound engineers often call their work a ‘black art’. For decades, the
practices of, and discourses around, classical recording have been ones of self-effacement:
we are not meant to be aware of anything that might get in the way of the transmission
of the composer’s work, via the performer, to the listener. The producers, engineers, and
by extension conductors and performers, have by and large aimed for a simulation of a
‘best seat in the house’ concert experience, with no sense given of how this was achieved
(this term is often used by production team members, and also discussed in Cook 2013:
375–391). However, one of the effects of this way of thinking and working is that people
seem to ignore the influence that recording has on performance. The fact that recording is
kept a ‘black art’ not only hides a world of interesting material and working practices, but
also puts pressure on people to keep the process shrouded in mystery. A performer might say
‘I aim for my recording to be as close to a live performance as possible,’ but they are very
unlikely to admit that this ideal performance was constructed from 82 takes, with 450 edit
points on a 72-minute album! We can therefore see that there is a vast gulf that needs to
be bridged between the live performance that the performer brings to the studio and the
final edited marketable product. There is a looking-glass that Alice steps through to a place
where everything is different, yet somehow very similar, metaphorically speaking. What
are these processes of translation? What effect do they have?
The invention of recording has had an unparalleled impact on classical music. From the
moment Edison invented his phonograph machine, live concerts and recordings became
irrevocably separate processes and products. The advent of recording was arguably the biggest change that musicians have ever had to deal with: it transformed their lives forever, and it instilled in many a dislike or fear of the recording process, a sense of anxiety distinct from the nerves associated with performing live. Stories abound
of early recording sessions in which singers are performing with their heads stuck down
the recording horn and sometimes being bodily dragged back so the high notes wouldn’t
ruin the recording. But what is striking is that, even after over a century of commercial
classical recording, many of the issues that disturbed earlier recording artists are true
for performers today – distrust of the technology, dislike of the process, doubts about
the captured performance, disillusionment with editing, disagreement with the level of
perfection expected for a recording and discomfort about the concept of a disembodied
performance existing at all. This might have something to do with the fact that performers
don’t have the level of control in the making of their recordings that listeners might think
they do. Classical performers involved in mainstream commercial recording almost never
participate in editing their recordings to any meaningful extent. They hear what is called
a ‘first edit’ – a complete performance compiled of takes and splice-points chosen by the
producer, the details of which the performer is unaware. The performer is invited to make
minor comments or requests. They then hear a ‘final edit’, but they have no idea how it was
constructed – is it made up of ten or 600 edits? Was the material taken from two main takes
or forty-two smaller ones? It therefore cannot be ignored that they are far from being in
artistic control of their recorded output.

Theoretical contexts: Key concepts and texts
The effects of recording are not a new topic of discussion. From Benjamin, through Adorno
and Gould, to Auslander, musicians, theorists and listeners have been aware that the two
performance modes are different. Benjamin famously talks about the loss of the ‘aura’ of
the artwork in the age of mechanical reproduction ([1936] 1982: 223). Adorno suggests
that the record stems from ‘an era that cynically acknowledges the dominance of things
over people’ ([1934] 2002: 277) and is responsible for ‘the barbarism of perfection’ ([1938]
2002: 301). Much of the early discourse around recordings seems to be focused on the
threat they might pose to live music-making (Philip 2004: 13) and to the quality of attention
(or lack thereof) with which people listen. It should however be remembered that many
of the writings of the period were reactions to a relatively new technology and were set
against a backdrop of war and increasing mechanization, with the very real presence of
mass propaganda being disseminated via the public address loudspeaker and straight into
people’s homes via the radio (Blier-Carruthers 2010: 26–34).
The title of this chapter might perhaps trigger echoes of Glenn Gould’s 1966 article
‘The Prospects of Recording’, in which he predicted the demise of the public concert a
century hence, its functions having been ‘entirely taken over by electronic media’ (Gould
[1966] 1987: 331). On the effects of recording, he observes that people’s current musical
tastes ‘could be attributed directly to the influence of the recording […] – characteristics
such as analytic clarity, immediacy, and indeed almost tactile proximity’ (333). In his
description and defence of the technique of editing, he advocates using the opportunities
afforded by recording: ‘by taking advantage of the post-taping afterthought […] one can
very often transcend the limitations that performance imposes upon the imagination’
(339). Gould is unique in the dedication with which he championed the creative
possibilities of the recording studio (353–357), completely eschewing the concert
platform in order to use the studio as a ‘crucible for creativity’, as Georgina Born later
defined it (Born 2009: 296).
Several scholars of performance studies have more recently engaged in the debate
around the influence of recording on performance (including Small 1998; Auslander
1999; Day 2000; Clarke 2002; Katz 2004; Philip 2004; Fabian 2008; Gritten 2008; Leech-
Wilkinson 2009; Blier-Carruthers 2010; Johnson 2010; Botstein 2011; Cook 2013).
Auslander compares live and ‘mediatized’ (recorded) performance events, particularly
highlighting the irony of the fact that in a time when live performance holds a higher
cultural value, it nevertheless seeks to replicate the mediatized product in terms of
perfection and the level of audio detail (Auslander 1999: 23–38). Philip (2004) describes
how the very existence of recordings has changed the way we perform and experience
music, and Katz (2004) coins the term ‘phonograph effects’, which relates to the results
of the portability, repeatability and manipulability of recordings. Cook (2013) asks why
recordings still seem to be assessed mainly in comparison to live performances, focusing
on concepts such as the ‘paradigm of representation’, the ‘discourse of fidelity’ and the
‘best seat in the house ideology’. Performers and production team members have also
joined the debate about live versus recorded performance (Gould [1966] 1987; Haas 2003;
Rushby-Smith 2003; Hallifax 2004; Tomes 2004; Brendel 2007; Freeman-Attwood 2009).
With the exception of Gould, performers most often talk about the difficulties they have
in coming to terms with making recordings. On the other hand, people involved in the
production process advocate for recording being a completely separate art form, with its
own challenges and opportunities, which necessitates a translation from the concert to
the studio performance.
Capturing/translating/creating: Case studies in recording
The case studies here cover a range of approaches to recording, beginning with an experiment
in re-enacting the making of an early acoustic recording, to more current approaches to
large-scale stereo classical recording, and finishing by examining an experiment in taking
recording beyond its more usual role of capturing a performance to using the techniques of
recording to enhance or go beyond what might be possible in a live performance. Each of
these examples represents a step along a continuum of what a recorded performance consists of or signifies. The first of these – making an acoustic recording by etching sound onto a
wax disc – could be seen as quite a simple process of capturing a live performance. The
second – using modern studio practices, stereo techniques and editing processes – could
be described as translating a live performance for recording. And finally, the use of so-called ‘hyper-production’ techniques to achieve more through the recording process than is usual in the classical world could be described as creating a new kind of performance
specifically for recording (see Zagorski-Thomas et al. 2015).

Recording onto wax: Capturing a live performance
The first example of the influence of recording on performance opens a window onto a
hitherto obscure moment in recorded history. It is an example of ‘capturing’ a performance
and it centres on the 1913 recording of Arthur Nikisch and the Berlin Philharmonic
Orchestra’s interpretation of Beethoven’s C minor Symphony (No. 5), which was the first
recording of a complete orchestral work by a world-renowned conductor together with a
leading professional orchestra. It was one of the very first attempts to capture the natural
sound of a full orchestra on record. The experiment, undertaken in collaboration with
Aleks Kolkowski and Duncan Miller, and a chamber orchestra from the Royal College
of Music conducted by Robin O’Neill, sought to recreate this early orchestral acoustic
recording in order to try to find out more about how these recordings were made.
A central question underpinning this experiment was: what performance lies behind
an early twentieth-century acoustic recording? What was happening behind the hiss and
crackle of the disc, and what effect did the limited technology have on the performance
that was being recorded? This has been widely speculated upon by musicians, recordists,
academics and listeners, but many questions have been left unanswered because of the lack
of evidence. In addition to the technical knowledge we were able to test, the re-enactment
afforded three main ways of answering the overarching question posed above. The first
was to make an acoustic recording whilst simultaneously capturing the sound in the room
using modern digital recording technology, thus allowing a comparative listening analysis
of the two different recorded outcomes. The second was to assess whether musicians had
to adapt their playing style for the acoustic recording process. The third method was that,
with the results of these two types of investigation in hand, it would be possible to attempt
an extrapolation of what the Berlin Philharmonic may have sounded like behind the extant
artefact of the 1913 acoustic recording.
Fred Gaisberg, famed as arguably the first record producer, aimed to capture what
he called ‘sound photographs’ (Legge quoting Gaisberg in Schwarzkopf 1982: 16). But
‘capturing’ a performance was not as easy as it sounds at that point in history. As a modern
listener, it can be difficult to get past the hiss and crackle of an acoustic recording, and the
old-fashioned performance style we are presented with. Until now, the prevailing discourse
around what an early recording offers us in terms of evidence of a performance has been that
only a limited range of frequencies could be captured, resulting in a recording that was an
unrealistic impression of the performer (Day 2000: 9; Philip 2004: 28).
Did early recorded performers have to change their performance style to come across
via acoustic technology? On the evidence of this experiment, the simple answer is:
yes, but perhaps not as much as might have been expected, or previously assumed. After
the first acoustic playback, it became clear that certain things relating to performance
would need to change (see Blier-Carruthers, Kolkowski and Miller 2015 for fuller results
and sound examples). Several elements changed: the musicians were repositioned dramatically to achieve the desired sonic balance, the tempo was quicker, the articulation more focused and aggressive, the dynamics louder, the vibrato more intense and the expression of the performance more overtly characterized. The early recordist Gaisberg offers some evidence to corroborate our findings
when he writes that, in order for a Stroh violinist to be audible on a recording, he has to
‘exaggerate heavily the pizzicato, glissando and vibrato’ (Gaisberg 1947). The picture that
emerges is that adjustments were being made in terms of positioning and performance in
order to produce a sonic picture that resembled the sound the musicians felt they made
live; a sound that they could recognize as themselves. They had to arrive at a recognizable
final performance by working through and despite the effects of the technology. In the
words of one participant: ‘It was contrived to be an accurate representation’ of their playing
(Participant 23; see Blier-Carruthers, Kolkowski and Miller 2015). It could be posited that
even though the sound in the room had to be significantly different to their normal style of
playing live, the recorded result was an attempt at what I call an ‘approximation of realism’.
From these examples, we can extrapolate that the ‘voice’ of the 1913 orchestra can still be heard.

Live performance versus studio recording: Translating a live performance
The next case study brings us into the twenty-first century, though it deals with recording
practices which have been more or less stable in mainstream classical recording since
the late 1960s. These practices attempt what I have described as a ‘translation’ of the
live performance for the recorded medium. This is how producers and sound engineers
would usually describe the process, though I have argued elsewhere (Blier-Carruthers,
forthcoming), as has Nicholas Cook, that we are still married to recreating a live concert
sound and experience – what Cook calls the ‘Best Seat in the House Ideology’ (Cook 2013:
376). Wherever one sits on this debate, it is quite impossible to deny that live concerts
and studio recordings are different, both in terms of the process and the final product. I
have investigated this with particular reference to the work of the conductor Sir Charles
Mackerras and the performers – from major London orchestras such as The Philharmonia
and the Orchestra of the Age of Enlightenment – and production team members with
whom he was working. By combining both ethnographic approaches and detailed
performance analysis techniques, I aimed to define the differences between the two types
of performance and to contextualize them within musicians’ experience.1
The performance analysis examined comparative case studies which were based on
directly corresponding pairs of live and studio recordings of the same piece performed
by the same performers, ideally recorded at around the same time. When heard in direct
comparison to each other, even for just a few seconds, and considering how many factors
they have in common (piece, performers, date and often venue), there is an obvious
difference between the sound of the live and studio recordings, and this can be heard in
every aspect of the performances. The research supports a number of conclusions, including:
the timbre of a recording changes depending on the performance situation; a live performance tends to be declaimed and characterized in a more overt and expressive manner; there is more space for dramatic timing in a concert; vibrato is usually more intense in concert and comparatively relaxed in a studio recording; phrasing and articulation are often more sustained in the recording studio; when tempo differs it is almost always quicker in a concert; and perfection is more of an issue for a studio recording than it is in a live
performance. Also, it has been shown that the sonic balance and perspective it is possible
to create in a studio recording are of great benefit for detailed or subtle sections of
music which might otherwise be obscured or inaudible in live performance, thus rendering
the studio recording aesthetically better in these instances.
Finally, considering working processes and the expected final product, perfection is a
big issue in the recording studio, whereas in a concert, expression is more important.
The results of repeated takes and editing have trained the public to expect perfection and
finesse, something that many musicians feel is somewhat at odds with the expression
and excitement they aim for in a live concert. Mistakes take time, and time is money
(and budgets are decreasing consistently in classical recording), and there is incredible
pressure to get it right as quickly as possible. We might also add to this the fact that, as
the performance is being recorded for posterity, musicians feel bound to make sure they
don’t do something they may regret later, or that might pall upon repeated hearing. It is
then hardly surprising that this situation is not conducive to experimentation, partly due
to practicality and partly to psychology. The findings of this research are supported by
Fabian’s results, which found that most of her interviewees (79.5 per cent) ‘reported taking
fewer risks in the studio, in spite of the potential for correction’ (Fabian 2008: 242). This
perfecting influence of recording came across very strongly in my research and is echoed
in writings by performers (for example, Tomes, Brendel, Ax [Sachs 2008]), producers (for
example, Keener [Philip 2004: 54–56]) and scholars (Auslander 1999; Day 2000; Adorno 2002; Katz 2004; Philip 2004; Leech-Wilkinson 2009; Botstein 2011). The pianist
Alfred Brendel goes as far as to say that the editable perfection of recording has turned
modern listeners into ‘wrong-note fiends’ (Brendel 2007: 347). This may seem strange to
many recordists (producers and sound engineers), who feel that there is more freedom
in the studio because of the possibility of repeated takes and editing, but many classical
musicians don’t see it this way, because there is a huge time pressure and expectation of
professionalism to get the notes right first time (Blier-Carruthers, forthcoming).

Hyper-production: Creating a performance through and for recording
I would now like to shift our attention to ways in which we might take control and
experiment with the possible influences of recording, rather than allowing those influences
to take effect as a by-product of getting a performance down on record. As Gould suggested in the 1960s, and others before and after him, the recording studio could be used
more as a creative space rather than simply as a re-creative or reifying space (Gould [1966]
1987; Culshaw [1967] 2012; Legge quoted in Schwarzkopf 1982; Born 2009; Freeman-
Attwood 2009; Howlett 2009; Blier-Carruthers 2013; Cook 2013; Zagorski-Thomas 2014).
This would mean creating a new kind of performance specifically for recording by wielding
the possibilities of the recording and editing technology more creatively, seeing them as
tools that could be consciously used from the outset to generate a new kind of musical
product.
Here we focus on an AHRC Digital Transformations project entitled Classical Music
Hyper-Production and Practice as Research, which in part seeks to: ‘create new and
exciting sonic worlds in the production of live and recorded performances […] drawing
on the multi-tracking and editing techniques of popular music to create innovative
and experimental recordings of the classical repertoire’. One of the experiments was a
collaboration with cellist Neil Heyde (Kreutzer Quartet/Royal Academy of Music), an
experienced professional both on the concert platform and in the recording studio. He
chose to record a solo piece by Richard Beaudoin (Dartmouth College) entitled Bacchante,
which is based on a microtiming of Debussy’s Welte-Mignon piano-roll recording of ‘[…]
Danseuses de Delphes’ from the Préludes, made in Paris on 1 November 1913. Also involved
were producer Simon Zagorski-Thomas and sound engineer Andrew Bourbon. We spent
two days in the studio, one recording, and one mixing and editing (3–4 September 2015,
Vestry Hall, University of West London). I was involved in a dual role, partly insider and
partly outsider, as I was there as an objective ethnographic researcher but had also been
active in the planning discussions and the development of the overall Hyper-Production
project concept. This is a particular kind of ethnography – there is an element of action
research and advocacy in my position; participant-observer research has a long history, and
by collecting different voices and triangulating them with my own observations it becomes
possible to paint a representative picture (on advocacy, see Seeger 2008; Castelo-Branco
2010; and for the concept of narrative ethnography, see Kisliuk 2008; Bayley 2010a).
There were a few main aims for this experiment. The first was to try to capture this
performance with some techniques which are more usually found in the popular music
recording studio. A traditional classical set-up would be to have a stereo pair of microphones
at a distance and possibly also one microphone closer to the cello; this recording employed
nine microphone set-ups, set at various distances from the cello. Another priority for the
session was to explore what could be achieved with the sound of the cello that went beyond
what was possible live. Finally, by setting enough time aside to experiment with mixing and
editing with the performer present, we created a situation which is unusual in the classical
world.
In commercial classical recording, studio time is often limited to the minimum
amount needed to capture enough material to construct a performance that performers
and producers are happy with, rather than allowing the time to experiment. Furthermore,
mainstream recording practices usually put the editing decisions in the hands of the
producer, with performers only exceptionally getting involved in any detail with post-
production decision-making. It is perhaps not surprising that Glenn Gould was a strong
advocate of the performer assuming an editorial role, though this has not made its way
into mainstream practices. In this instance, the process and product of the recording
aims to be one in which the performer is far more involved in the curatorial process
than in a traditional recording project, thereby enhancing the scope for the application
of their subjective view. Keeping in mind the topic of this chapter, which is the influence
of recording on performance, the most relevant themes that emerged from these
sessions included: the ‘spatialization’ (positioning in space) of the microphones; how
this allowed for experimentation with sonic perspective and the treatment of dynamics;
collaboration and sharing of control; and the phenomenon of the performer at the
mixing desk.

‘Spatialization’ of the microphones
What became termed the ‘spatialization’ of microphones created a vast sound palette which
allowed Neil (cellist), Andrew (sound-engineer) and Simon (producer) to play with sound
gestures. Andrew came to call these various types of sounds ‘gestural fractals’, ‘bubbling
distortions’ and ‘spatial crescendos and decrescendos’ (Vestry Hall, 3–4 September
2015). This recording approach opened up the potential for recomposition using microphones, unintentionally akin to the Culshaw/Solti recording of Wagner’s Ring cycle ([1959–66] 1967) with its implied live opera staging (Culshaw [1967] 2012; Patmore and Clarke 2007) or
Glenn Gould’s ‘acoustic orchestrations’ or ‘choreographies’ of Scriabin and Sibelius from
the early 1970s (see Glenn Gould: The Alchemist 1974; Théberge 2012).
On the first day, as Neil was warming up in the hall and Andrew started setting mics up
around him, Neil said: ‘So many mics! It’s like having an audience – that’s how it should
always be!’ This was perhaps just a throwaway comment, but the concept of seeing the mics
as ears listening in is quite interesting, as it could be seen to imply a kind of freedom that
goes beyond what normally happens in the studio. Rather than having a few mics, which
are balanced on the day and often left at those levels, capturing a performance from a set
perspective, here there are a number of ‘ears’ listening in, all from their own perspective (as
an audience listens), which allows the freedom to experiment with different balances (and
therefore effects and meanings) in the post-production process.

Experimenting with sonic perspective and treatment of dynamics
Sound and sonic perspective could be manipulated here because they were using
multitracking, which meant that as they were listening through they could explore the
separation of musical lines and make decisions about which microphone perspective
suited which section of music best. They were also exploring the concept of dynamics
versus performance intensity. Perhaps the most striking sounds were coming from the
microphones placed close to the instrument: one just above the cellist, one near the belly
of the instrument and one under Neil’s chair. These created sounds that gave a sense of
being inside the instrument and of being able to hear the grain of the instrument’s sound,
also including the breathing of the performer and the tiny extraneous noises that result
from playing a cello. Neil became interested in ‘the things that only the instrumentalist
hears’, and in ‘hearing more than normal’. He felt that some of the sounds being achieved
were ‘like hearing the body of the instrument’. There was a strong sense that this kind of
recording was amplifying what was happening live, or getting closer to what the performer
hears in their imagination: Neil said ‘That’s how I feel it,’ ‘We’re using recording to do what
I can’t do.’

Control and collaboration
Because of the experimental nature of the sessions, and the time spent editing and
mixing as a group, the control was shared much more equally than in a normal
commercial session, and there was a lot of collaborative decision-making. Furthermore,
the conversations seemed to be of a quite different type than usually go on in a classical
recording studio. For instance, the sounds being captured remained a very present topic
throughout, rather than focused mainly on the sound-check at the beginning, after which
point it’s the performance that takes precedence. Usually each part of the process occurs
quite separately – the material is recorded, the producer goes away to choose the edits
and then, based on the producer’s annotations in the score, the sound engineer/editor
will put the first edit together. In this project, the triumvirate of the performer, producer
and engineer was brought onto a more equal footing; this quite distinctly felt like an act
of collaborative co-creation.
The performer at the mixing desk – Heyde and Gould
Typical studio practices dictate that the mixing desk is the tool and domain of the sound
engineer. The producer is usually the most powerful agent in a recording project, but even
he would not interfere with the numerous faders and dials. It is therefore all the
more striking that there was a moment on the second day when the cellist reached over
and started changing the levels on the mixing desk, and was eventually even sitting in the
sound engineer’s seat.2
To set the scene, on day two the plan was to listen through to choose takes, and
also to experiment with which combinations of microphones were going to be used
for the various mixes. At one point, there was a lull in listening and discussions as it
had been decided to send a rough mix to the composer Richard Beaudoin to see what
he thought of the results so far: as Neil writes to him, Simon leaves the room for a few
moments, and after some silent time where Neil is concentrating on writing his e-mail
(0:53), Neil leans over (1:14) and Andrew suddenly starts explaining something about
the mics. Neil gets up, and I notice that the discussion might make more sense with a
visual element so I move the camera so that the faders on the mixing desk are in view
(1:43). It takes only a few seconds before Neil is reaching out to touch the faders and
saying: (2:01) ‘Can I […] can I just, out of curiosity […] can I hear those up, and where’s
the floor one? I’m just curious what happens if you’ve got something [gestures with his
hands].’ And twenty seconds later Andrew gets up and says ‘So it’s quite fun just to have
a bit of a play, really […] explore the […] explore the piece’, and Neil takes the sound
engineer’s seat (2:20).
This might seem like a minor event, but even after hundreds of hours spent in the
recording studio, the only other time I have seen this happen is in a video of Glenn Gould
working on his acoustic orchestrations of Scriabin. His intention was to experiment with
editing between different sound-worlds like a film director would cut between different
camera angles in a movie. This footage can be seen in Monsaingeon’s documentary Glenn
Gould: The Alchemist (1974), and at the time of writing an excerpt of this is available
online, see p0lyph0ny (2008a). Only one of these experiments was released in Gould’s lifetime – Sibelius’s Kyllikki Op. 41 and Sonatines Op. 67, in 1977 (see Cook 2013: 383) – and the others were evidenced only by the scene in the documentary, until Paul Théberge (2012) remixed the Sony masters and released them. In the striking and crucial moment
in question, Gould starts from a position of leaning on the mixing desk opposite the
sound engineer, where he is instructing the engineer in which mics to turn up or down
(p0lyph0ny 2008b: 0:09). He then begins gesturing, and then he suddenly reaches over
and starts moving the faders himself (0:12). In fact, he pretty much pushes the engineer
out of the way (0:26, and again at 1:16) and starts to ‘perform’ the piece via the sounds that
the different microphone set-ups offer him. He then proceeds to conduct the rest of the
playback (0:00–2:57).
Is it a coincidence that in both instances, as soon as the performer is allowed to
become an agent in the mixing and editing process, his interest in the sound that he
has created with his instrument and how that might be creatively wielded comes to
the fore, and he suddenly feels the urge to reach out and intervene in a way that is
outside of usual studio protocol? In both instances there is a moment of collaboration
where both agents have their hands on the desk, but then the performer takes over.
Since sitting down, Neil has been moving the faders to see the different effects they can create, but as he gets more comfortable (3:39) he starts making performing gestures with his hands, arms and then shoulders. As with Gould, the mixing desk has suddenly
become an instrument for him. The transition is unmissable: the change from tentative
experimental gestures to visceral involvement is striking. He is obviously very engaged
and interested in what he’s hearing. After a time, it looks like the moment is over, they
start talking again, and Neil gets up (4:36). At this point I asked Andrew to run through
the different properties of each microphone grouping again, as his explanation to Neil
had been very quick and I had not had the camera in a good place for capturing it. I
did not have any predetermined agenda other than to clarify what had just been said,
but this led to a fascinating exchange between Neil and Andrew about what each mic
placement sounded and felt like.
One example is Mic 3, which is on the floor under the cello, and which Andrew feels
has a sorrowful, booming sound, but to Neil it feels like it comes from his stomach. It is a
sound that reminds him of what it feels like in his body to play the cello: ‘I love that sound,
because that’s the sound […] that’s the relationship you have with the instrument in your
body, through your legs [taps thighs and stomach]’ (5:39). This mic came to be known as
the ‘stomach mic’. It feels like such a quick interaction, almost a passing comment, but this
is a moment where a sound engineer and a performer manage to triangulate how a sound
can seem right or real in the way that it correlates with the performer’s experience, or his
imaginative concept of what playing the cello means or feels like. The next moment also
relates to a physical experience of sound: Andrew says that Mic 4 is above the top of the
cello, and Neil makes sniffing noises and says ‘It’s in your nose!’ Andrew qualifies this by
saying it’s breathy and airy, and Neil replies ‘Exactly, but it feels like what you smell off food,
or something, to me’ (6:01). This mic came to be known as the ‘nose mic’. Overall, it was
striking that the sound engineer often speaks in terms of sound or emotion, whereas the
cellist is often relating it to bodily experiences, for instance referring to stomach, thighs,
feeling, nose and smell (for how metaphor is employed to discuss sound, see Hallifax 2004:
39; or Leech-Wilkinson 2009: 1.1.7). The end of this exchange is marked by Simon coming
in and having the others describe to him what they’ve just been listening to and discussing.
Neil has heard something that he likes, and it is not what he was expecting. The sound Neil
and Andrew arrived at was created by using completely different mic combinations than
usual for Neil: ‘This is still believable, actually, curiously, I quite like that […] I learned a
lot from being able to hear that. You see, I’ve never been able to do this before […] That’s
completely fascinating […] And we’ve just taken out the mics I always use, effectively!’
(12:27).
It is worth noting that this interaction happened in a moment of hiatus, a suspension
of the normal working process. A mere thirteen minutes of accidental impromptu
discussion and experimentation allowed a number of things to happen: a performer got
his hands on a mixing desk, which acted as a new instrument for him; the performer
had agency beyond what he normally has in the recording studio and began to ‘perform’
with the sounds; there was discussion and explanations of sound metaphors and how
sound relates to the physicality of playing the instrument, as well as ideas about how to
consciously translate that experience into sound for the listener; the sound engineer was
able to share his knowledge of his craft and artistic intentions more thoroughly than is
the norm; the performer heard different balances than he is accustomed to and heard
new things that he liked; the producer and sound engineer supported and enabled the
performer’s choice; and the performer learned a lot by doing something he’s never had
the chance to do before. It is fascinating that it resulted in a complete reversal of normal
microphone choices.
These few moments were only possible because of the interest of the performer and the
openness of the sound engineer, but these ingredients in themselves are not rare – what is
rare is that they combine in this particular way and result in a leap forward in imagination
and understanding of what a recording might possibly sound like. Overall, the triumvirate
of the performer, producer and sound engineer collaborated on a different footing, and
unintentionally completely subverted normal classical studio practices, both in terms of
process and outcome.

Ways of working creatively with the influences of recording on performance
It seems clear that recording has an influence on performance at every turn, so how might
we better prepare classical performers to work creatively and collaboratively in the studio,
and to work with the influences of recording, instead of often being at the mercy of those
influences?
In any recording, performers are trying to put something across that is artistically
significant to them, but they have to collaborate with others, and with technology, in order
to achieve this in a different way to how they operate on the concert platform. Sometimes
they have to change what they do, sometimes aspects of performance change unconsciously.
But there are three points of tension which stand out from these examples, which also
loosely correlate to some potential solutions I’d like to offer here. The first is the pressure
put upon performers by the expectation of perfection; the second is the degree of control the performer has over whether their voice is heard in the studio; the third is the
extent to which a performer is allowed or invited to participate in decisions about sound
and editing choices, rather than being only a provider of musical material to be sculpted in
the post-production process.
The musician’s performance travels through the transformative prism of the production
team and recording process, but musicians have often not learned how to manage this experience
successfully. Is it because of the inherent qualities of the product and process themselves, or
because of a lack of preparation during their training? Musicians spend thousands of hours
preparing for the concert platform but nearly none preparing for the recording studio.
So it seems clear that musicians studying in conservatoires must be trained to achieve
this transformation; by becoming more knowledgeable, they would feel empowered to
get involved in the technical aspects of studio recording and even be properly equipped to
curate and manage their own recording projects (Blier-Carruthers 2013).
Secondly, if there was more time available in the studio, there would be less pressure
to simply capture the material in a textually accurate way in the shortest time possible.
Therefore, the expectation of perfection to the detriment of expressive moments might
loosen its stranglehold on the aesthetic of mainstream recording and allow for more
experimentation and creativity to flourish in the studio. This would certainly help to put
many performers in a better frame of mind when they approach the microphone (Philip
2004; Fabian 2008; Blier-Carruthers 2013, forthcoming).
Flowing out of this, finally, is the concept that classical recording could and should be
viewed as an art form in and of itself, as it arguably is in popular music, or as film and theatre
are seen as distinct practices and products (Gould [1966] 1987; Philip 2004: 54–55; Cook 2013:
357–372; Blier-Carruthers, forthcoming). This would mean being willing to reconsider the
tried and tested formulas for working in the studio – they function very efficiently, but they
don’t necessarily allow for the interesting conversations to happen. It would also mean allowing
the performer to get involved in decisions about the sound beyond the initial sound-check, and
also in the post-production processes of mixing and editing. These more collaborative modes
of working have the potential to inspire new language, original results and exciting possibilities.
Whether we are talking about capturing, translating or creating a performance for
recording, it can be seen that recording has had a range of significant influences on
performance. We are living in paradoxical times: despite the near demise of the recording
industry as we know it, recordings are too much a part of our everyday lives to be at risk
of disappearing completely. Bergh and DeNora have argued that recorded music is one
of the ways in which we articulate our identities; it is ‘reflexive embodied praxis’ – it
accompanies us whilst ‘dancing, crying, sleeping’ in a way that live music cannot (2009:
111). Recordings are not going anywhere, so our challenge is to see how we can work to
make these experiences in the studio more vibrant, creative and collaborative; to find ways
to make the final products as artistically valuable and aesthetically enjoyable for the people
who make and listen to them, for many years to come.

Notes
1. See the Sir Charles Mackerras Collection (C961 and C1189), British Library Sound
Archive, London, CDs 1–4, shelf mark: 1CDR0032905-08, compiled 2010. The sound
examples to go with this research are for the moment only to be found in the British
Library Sound Archive, where they have been selected from the Sir Charles Mackerras
Collection as well as the corresponding commercially available discs; and Blier-
Carruthers (2010).
2. Please see https://www.bloomsbury.com/the-bloomsbury-handbook-of-music-
production-9781501334023/ for the video clip under discussion, transcript, and
supporting materials.

Bibliography
Adorno, T. W. (2002), Essays on Music, edited by R. Leppert, Berkeley, CA: University of
California Press.
Auslander, P. (1999), Liveness: Performance in a Mediatized Culture, London: Routledge.
Bayley, A. (2010a), ‘Multiple Takes: Using Recordings to Document Creative Process’,
in A. Bayley (ed.), Recorded Music: Performance, Culture and Technology, 206–224,
Cambridge: Cambridge University Press.
Bayley, A., ed. (2010b), Recorded Music: Performance, Culture and Technology, Cambridge:
Cambridge University Press.
Benjamin, W. ([1936] 1982), ‘The Work of Art in the Age of Mechanical Reproduction’, in
H. Arendt (ed.), Illuminations, 219–253, Bungay, UK: Fontana/Collins.
Bergh, A. and T. DeNora (2009), ‘From Wind-Up to iPod: Techno-Cultures of Listening’, in
N. Cook, E. Clarke, D. Leech-Wilkinson and J. Rink (eds), The Cambridge Companion to
Recorded Music, 102–115, Cambridge: Cambridge University Press.
Blier-Carruthers, A. (2010), ‘Live Performance – Studio Recording: An Ethnographic and
Analytical Study of Sir Charles Mackerras’, PhD thesis, King’s College, University of
London, London.
Blier-Carruthers, A. (2013), ‘The Studio Experience: Control and Collaboration’, in
Proceedings of the International Symposium on Performance Science, Vienna, August.
Available online: http://performancescience.org/wp-content/uploads/2018/08/isps2013_
proceedings.pdf (accessed 12 July 2019).
Blier-Carruthers, A. (forthcoming), ‘The Problem of Perfection in Classical Recording – The
Performer’s Perspective’, Musical Quarterly. Available online: https://www.ram.ac.uk/
research/research-output/research-repository (accessed 12 July 2019).
Blier-Carruthers, A., A. Kolkowski and D. Miller (2015), ‘The Art and Science of Acoustic
Recording: Re-enacting Arthur Nikisch and the Berlin Philharmonic Orchestra’s
Landmark 1913 Recording of Beethoven’s Fifth Symphony’, Science Museum Group Journal,
(3). Available online: http://journal.sciencemuseum.ac.uk/browse/issue-03/the-art-and-
science-of-acoustic-recording/ (accessed 12 July 2019).
Born, G. (2009), ‘Afterword – Recording: From Reproduction to Representation to
Remediation’, in N. Cook, E. Clarke, D. Leech-Wilkinson and J. Rink (eds), The Cambridge
Companion to Recorded Music, 286–304, Cambridge: Cambridge University Press.
Botstein, L. (2011), ‘The Eye of the Needle: Music as History After the Age of Recording’, in
J. F. Fulcher (ed.), The Oxford Handbook of the New Cultural History of Music, 523–550,
Oxford: Oxford University Press.
Brendel, A. (2007), ‘A Case for Live Recordings’, in On Music: His Collected Essays, 345–351,
London: JR Books.
Castelo-Branco, S. E.-S. (2010), ‘Epilogue: Ethnomusicologists as Advocates’, in J. M. O’Connell
and S. E.-S. Castelo-Branco (eds), Music and Conflict, 243–252, Urbana, IL: University of
Illinois Press.
Clarke, E. (2002), ‘Listening to Performance’, in J. Rink (ed.), Musical Performance: A Guide to
Understanding, 185–196, Cambridge: Cambridge University Press.
Cook, N. (2013), Beyond the Score: Music as Performance, Oxford: Oxford University Press.
Cook, N., E. Clarke, D. Leech-Wilkinson and J. Rink, eds (2009), The Cambridge Companion
to Recorded Music, Cambridge: Cambridge University Press.
Culshaw, J. ([1967] 2012), Ring Resounding: The Recording of Der Ring des Nibelungen,
London: Pimlico.
Day, T. (2000), A Century of Recorded Music: Listening to Musical History, New Haven, CT:
Yale University Press.
Doğantan-Dack, M., ed. (2008), Recorded Music: Philosophical and Critical Reflections,
London: Middlesex University Press.
Fabian, D. (2008), ‘Classical Sound Recordings and Live Performances: Artistic and Analytical
Perspectives’, in M. Doğantan-Dack (ed.), Philosophical Reflections on Sound Recordings,
232–260, London: Middlesex University Press.
Freeman-Attwood, J. (2009), ‘Still Small Voices’, in N. Cook, E. Clarke, D. Leech-Wilkinson
and J. Rink (eds), The Cambridge Companion to Recorded Music, 54–58, Cambridge:
Cambridge University Press.
Gaisberg, F. W. (1947), Music on Record, London: Robert Hale.
Glenn Gould: The Alchemist (1974), [Film] Dir. B. Monsaingeon, EMI Classics, IMG Artists,
Idéale Audience International.
Gould, G. ([1966] 1987), ‘The Prospects of Recording’, in T. Page (ed.), The Glenn Gould
Reader, London: Faber and Faber.
Gritten, A. (2008), ‘Performing after Recording’, in M. Doğantan-Dack (ed.), Philosophical
Reflections on Sound Recordings, 82–99, London: Middlesex University Press.
Haas, M. (2003), ‘Studio Conducting’, in J. A. Bowen (ed.), The Cambridge Companion to
Conducting, 28–39, Cambridge: Cambridge University Press.
Hallifax, A. (2004), The Classical Musician’s Recording Handbook, London: SMT.
Howlett, M. (2009), ‘Producing For (and Against) the Microphone: Finding a Credible Vocal’,
in E. Clarke, N. Cook, D. Leech-Wilkinson and J. Rink (eds), The Cambridge Companion to
Recorded Music, 30–31, Cambridge: Cambridge University Press.
Johnson, P. (2010), ‘Illusion and Aura in the Classical Audio Recording’, in A. Bayley (ed.),
Recorded Music: Performance, Culture and Technology, 38–51, Cambridge: Cambridge
University Press.
Katz, M. (2004), Capturing Sound: How Technology Has Changed Music, Berkeley, CA:
University of California Press.
Kisliuk, M. (2008), ‘(Un)doing Fieldwork: Sharing Songs, Sharing Lives’, in G. Barz and
T. J. Cooley (eds), Shadows in the Field: New Perspectives for Fieldwork in Ethnomusicology,
183–205, Oxford: Oxford University Press.
Leech-Wilkinson, D. (2009), ‘The Changing Sound of Music: Approaches to the Study of
Recorded Musical Performances’, Charm. Available online: http://www.charm.kcl.ac.uk/
studies/chapters/intro.html (accessed 12 July 2019).
p0lyph0ny (2008a), ‘Glenn Gould Records Scriabin Désir Part 1’, YouTube, 12 July.
Available online: https://www.youtube.com/watch?v=JllD47HIees&t=35s (accessed 12 July
2019).
p0lyph0ny (2008b), ‘Glenn Gould Records Scriabin Désir Part 2’, YouTube, 12 July.
Available online: https://www.youtube.com/watch?v=chHJdmyIiRk (accessed 12 July 2019).
Patmore, D. and E. Clarke (2007), ‘Making and Hearing Virtual Worlds: John Culshaw and
the Art of Record Production’, Musicae Scientiae, 11 (2): 269–293.
Philip, R. (2004), Performing Music in the Age of Recording, New Haven, CT: Yale University
Press.
Rushby-Smith, J. (2003), ‘Recording the Orchestra’, in Colin Lawson (ed.), The Cambridge
Companion to the Orchestra, 169–179, Cambridge: Cambridge University Press.
Sachs, H. (2008), Six Famous Ears: Emanuel Ax, Alfred Brendel, and Andras Schiff Tell How They
Listen, interviews for the Orpheus Instituut, Ghent, Belgium, presented at The Musician as
Listener conference, 22–23 May. Available online: https://orpheusinstituut.be/assets/files/
publications/1386520124-Six-famous-ears-Harvey-Sachs.pdf (accessed 12 July 2019).
Schwarzkopf, E. (1982), On and Off the Record, London: Faber and Faber.
Seeger, A. (2008), ‘Theories Forged in the Crucible of Action: The Joys, Dangers, and
Potentials of Advocacy and Fieldwork’, in G. Barz and T. J. Cooley (eds), Shadows in
the Field: New Perspectives for Fieldwork in Ethnomusicology, 271–288, Oxford: Oxford
University Press.
Small, C. (1998), Musicking: The Meanings of Performing and Listening, Middletown, CT:
Wesleyan University Press.
Théberge, P. (2012), Liner Notes to: Glenn Gould: The Acoustic Orchestrations. Works by
Scriabin and Sibelius, recorded in 1970 by Glenn Gould, produced and mixed in 2012 by
Paul Théberge (Sony Classical, 88725406572).
Tomes, S. (2004), ‘A Performer’s Experience of the Recording Process’, in Beyond the Notes:
Journeys with Chamber Music, 140–150, Woodbridge, UK: Boydell Press.
Zagorski-Thomas, S. (2014), The Musicology of Record Production, Cambridge: Cambridge
University Press.
Zagorski-Thomas, S., A. Blier-Carruthers, A. Bourbon and E. Capulet (2015), AHRC Digital
Transformations Project: ‘Classical Music Hyper-Production and Practice-as-Research’,
University of West London partnering with Royal Academy of Music. Available online: https://
www.uwl.ac.uk/classical-music-hyper-production/about-project (accessed 12 July 2019).

Discography
Beethoven, Ludwig van ([1913] 2012), [CD] C minor Symphony (No. 5), the Berlin
Philharmonic Orchestra, conducted by Arthur Nikisch, ‘The Berlin Philharmonic
Orchestra and Their Music Directors’, EMI Classics.
Debussy, Claude ([1913] 1992), [CD] ‘Danseuses de Delphes’, Préludes, ‘Masters of the Piano
Roll: Debussy Plays Debussy’, Dal Segno.
Wagner, Richard ([1959–66] 1967), Der Ring des Nibelungen, produced by John Culshaw,
conducted by Sir Georg Solti, Wiener Philharmoniker, Wiener Staatsopernchor, originally
released in 1959, 1963, 1965 and 1966, Decca.
14
Welcome to the Machine:
Musicians, Technology
and Industry
Alan Williams

Introduction
Musicians are inherently wed to technology as every musical instrument is a machine
of some sort, and most musicians’ musical conceptions are shaped by the instruments
they play. Many are quite comfortable working within existing norms. Others push back
against this technological determinism with extended technique, altering the method of
playing without altering the machines in their hands. Some then make slight changes to
the standard, altering tunings, muting a snare drum head with a wallet, etc. The impulse
to modify technology is present in these situations, but these deviations are temporary;
the initial state of the machine can be easily reclaimed. Like developing musical practices,
incremental adjustments to these machines, often undertaken to address an individual
musician’s particular challenges and ideas, are sometimes adopted and adapted by
ever-expanding numbers of musicians. The entire science of organology is a history of
musicians shaping technology to suit their purposes. The shifts of instrument design
over long periods usually bear the fingerprints of too many individuals to identify. But
occasionally, a musician seeks to permanently alter the devices they have encountered,
resulting in designs that might bear a resemblance to a technological lineage but which
are clearly identifiable in their own right, and they may in fact signal the introduction
of a new instrument classification, demanding entirely new methods and techniques for
the realization of a musical idea. There are individuals whose inventions become widely
adopted, and their role as musicians is overshadowed by the impact and longevity of their
mechanical creations.
Musicians have shaped technology at both micro- and macroscopic levels, but it is
only when the ideas of an individual musician become embraced and replicated that
the impact moves technological practice from the idiosyncratic to the normative. The
impulse to bend a machine to serve one’s purpose can be as subtle as strapping a capo
on a guitar neck, or as revolutionary as scratching a record to cue a breakbeat. While
the emergence of practices such as turntablism has been widely documented and
acknowledged, I would like to focus upon less seismic examples of musicians tinkering
with technology in ways that resonate and influence both technological development
and musical practice. I posit that there are two types of inventor – passive influencers
and active proselytizers – and that there is an important distinction between them.
The passive influencer is a musician who shapes technology to suit their own purpose,
someone whose modifications and inventions are so compelling that they become
widely adopted, with little to no effort on the part of the inventor. The active proselytizer, by contrast, is someone who endeavours to establish their designs in the marketplace,
who seeks to create new technologies that will become standard, accepted devices with
a clear impact on subsequent musical practices, which are forever associated with their
inventor, often bearing the name of their creator.
However, will alone won’t necessarily establish new musical technologies. Intrepid
musician-inventors must also be savvy entrepreneurs, able to think as creatively and
effectively in the marketplace as they do in their basement laboratories and practice
rooms. The successful launch of a new technology is dependent upon the appeal of the idea
to a larger number of potential users; the active proselytizer must develop technologies
that address the needs of large numbers of musicians and be able to articulate these
needs to investors and manufacturers who not only see the potential for profit, but
who have the means and experience to successfully bring the product to market. This
is where many inventive endeavours fall short, and musicians acting as engineers often
encounter outright hostility from these sectors of industry. But for those few musicians
who master the arts of design and commerce, the immediate payoff and lasting impact
can be enormous.
Furthermore, as musicians began to explore the creative possibilities of the recording
studio and the various technologies they found within, the resulting artefacts became
the model to be emulated or recreated in performance, and musicians began to push for
technology that would replicate the recording in a live context. From one perspective,
this innovation was driven by individual musicians whose performance style was ever-
more shaped by the studio experience, with specific songs demanding the presence of a
recognizable, even intrinsic sonic signature. From another angle, audiences’ conceptions
of the music they paid their admission price to hear were constructed via the controlled
laboratory of the studio, and their expectations of the live performance as mediated by
audio technology demanded a re-creation of the recorded mixes that were now inseparable
from their favourite songs or moments of compelling performance. This chapter examines
some of these examples from the last 200 years of musical technologies, with a focus on
the complicated encounters between musicians and the business of technology, and how
technological practices born in the recording studio shape musical performance outside
the studio for both musicians and audiences.

Passive influencers
Often, the distinction between large- and small-scale impact is the result of an individual’s
intention. Occasionally, a musician devises a modification to suit a specific personal need,
with little regard for whether others might have a common interest in the solution to
a shared problem. For these nascent inventors, frustrations with the status quo lead to
designs that are highly personalized, the result of efforts intended only for their
own musical endeavours. In the rock era, Brian May and Eddie Van Halen both created
handcrafted guitars, singular instruments that came to define their respective sounds and
styles (Obrecht 1978; May and Bradley 2014). Those instruments were never intended
for mass manufacture; indeed, May sometimes expresses an element of embarrassment about his guitar’s homemade nature, as if it were not a real instrument bought from a store. This is
in contrast to Van Halen’s rejection of mass-manufactured goods; his dissatisfaction with
the limited options presented to musician-consumers forced him to attempt to craft an
instrument that would deliver his ideal. As Van Halen explained in an early interview,
Nobody taught me how to do guitar work: I learned by trial and error. I have messed up a
lot of good guitars that way, but now I know what I’m doing, and I can do whatever I want
to get them the way I want them. I hate store-bought, off-the-rack guitars. They don’t do
what I want them to do, which is kick ass and scream. (Van Halen quoted in Obrecht 1978)

The highly audible and visible careers that followed resulted in acolytes closely studying
these unique machines, and manufacturers offering to market copies of May’s ‘Red Special’
or Van Halen’s ‘Frankenfender’. But because these musicians endeavoured to work with
technology in the service of their own musical expression, their mark on future technologies
is fairly minimal. The history of the electric guitar is filled with examples of musicians
tinkering with the instrument, crafting modifications that expanded the sonic capabilities
of the instrument (Tolinski and Di Perna 2017). In many cases, musicians commissioned
engineers to build machines that would allow for specific tonal manipulations of amplified
sound. Vox produced larger models of their amplifier line specifically for The Beatles to
tour with, and Jim Marshall worked with Pete Townshend and Jimi Hendrix, constantly
modifying his amplifier designs to accommodate the musicians’ requests (Kraft 2004).
This notion of musicians consulting in the design of musical instruments has a parallel
in the introduction of consumer electronic synthesizers, with Bob Moog in particular
courting The Beatles and musicians such as Keith Emerson helping to promote and refine
his inventions. Donald Buchla employed musicians such as Suzanne Ciani in his factory,
but Moog’s courtship of high-profile musicians is one of the reasons he triumphed in the
marketplace, reinforcing the value of brand and endorsements in transforming novelties
into the commonplace (Pinch and Trocco 2004).
Active proselytizers
Some musicians have designed equipment and instruments with the specific intention of
creating new ways for other musicians to make musical sound; Adolphe Sax exemplifies this
category of musical inventors. The nineteenth century saw the crafting and codification of
trademark and patent laws designed to protect the rights and potentially lucrative claims of
ownership that inventors might be entitled to (Hyde 2010). Concurrent with developments
in steam and electrical power, and mass-manufactured consumer goods and devices, several
musical instruments such as the piano made the transformation from modified prototype
variations on existing instruments into common household objects. While not quite as
ubiquitous as the piano, the family of instruments that Sax developed (and which bear his
name) has become entrenched in musical practices well into the twenty-first century, and
in a variety of genres, some of which, such as jazz, are unthinkable without the saxophone.
Sax was a conservatory-trained musician, performing on both flute and clarinet.
His work in modifying the design of the bass clarinet would lead him to develop the
instrument family that now bears his name. Championed by composers such as Berlioz,
Sax won acclaim in Paris, and his new instruments were quickly adopted across Europe,
not so much for orchestras, but for marching bands. For a time, Sax was financially secure
due to winning prize money and manufacturing instruments. But with a limited time
period on his patents, he was unable to take advantage of the rapidly expanding market for
saxophones and died in poverty. The failure of Sax as a businessman would be mirrored
by many other musician-inventors in the following decades (Cottrell 2012: 10–37). But
it is Sax’s ambition to have his designs become widely adopted that separates him from
individuals such as Harry Partch or Raymond Scott, musicians who developed instruments
specifically to realize their idiosyncratic compositional ideas, with no expectation that
other musicians would compose or perform their own music on them (Winner 2008). Sax’s
success at establishing such a wide-scale adoption of his invention is also notable because
of its singularity; no other musician has seen their inventions become so widely utilized
in music-making; rather, most other successes have been more modest modifications, noteworthy but not impactful.
Other musicians were directly involved in the design and manufacture of musical
instruments, if on a smaller scale or without the full commitment to establishing their
inventions in the marketplace. Merle Travis drew up the designs that Paul Bigsby worked
with in building his solid body electro-acoustic guitar, the distinctive headstock shape
emulated by Leo Fender in his Telecaster and Stratocaster designs (Millard 2004). Clarence
White, the country/bluegrass musician who joined The Byrds in the late 1960s, wanted to be
able to emulate the combination of static and pitch-shifted tones that pedal steel guitarists
were employing on many country recordings of the era, but on a standard 6-string guitar.
He came up with the concept behind the B-Bender, a modification that allowed the guitarist
to pull on the neck of the guitar, with the tension on the strap engaging a mechanism that
could raise the tuning of the B string while other strings remained unaltered. Fortunately,
another member of The Byrds, drummer Gene Parsons, was a machinist able to build a
working prototype for White. Other musicians were intrigued, and Parsons and White
licensed their invention to Leo Fender. Following White’s untimely death, Parsons formed
his own company to manufacture the units, and though not exactly a household name,
various incarnations of the B-Bender exist to this day, occasionally utilized by a number of
guitarists (Russell 2003).
A number of inventive musicians sought to monetize technologies that had been
originally designed for their own personal use. Kevin Godley and Lol Creme of the band
10cc built a device they dubbed the gizmotron, or ‘gizmo’, that allowed for unending
sustained notes on the guitar, emulating the way a bow produces sound on a violin by
attaching a set of small wheels with serrated edges that would constantly set a string
to vibrate. The pair used the machine on several songs in the 10cc catalogue, and their
obsession with the device led them to leave the band, first to craft a triple album set that
featured multiple incarnations of the gizmo in action. They then furthered their dedication
by launching a partnership with Musictronics in 1979 in order to mass manufacture the
device for the consumer market. Flaws in the design and installation led to the failure of
the enterprise, though the infinite sustain concept was subsequently explored by Roland in
their designs of the guitar synthesizer in the 1980s (Dregni 2014).
Thomas Dolby, whose pop hit ‘She Blinded Me With Science’ helped to define electro-pop
in the 1980s, moved into the realm of computer coding and software design once the hits
began to dry up. With a grasp of the value of venture capital, business plans and corporate
industry partnerships, Dolby founded Headspace, a company that specialized in the design
of new technologies only tangentially related to music – from video games to mobile phone
ringtones – and established a second career as a technological guru, frequently serving as a
keynote speaker at tech conferences during the first wave of the Silicon Valley gold rush of
the 1990s. The trouble was that these partners and investors saw the value in Dolby as a pop
star brand/endorser but didn’t fully consider Dolby a true technological engineer/designer.
He was a rock star dilettante, useful for promotion and marketing (and could be counted
upon to give an entertaining keynote lecture), but not someone capable of writing deep
code. As such, tech officers within his own company began to bypass their boss as they
developed the technological backend to bring his concepts to practical application stage
(Dolby 2016). Dolby can claim to have thought up ideas that would later become widely
adopted, but he was unable to capitalize on their eventual success in the marketplace.

Captains of industry
As recordings became the primary artefact of popular music-making, generations
of musicians sought to book studio time as the logical extension of the desire to make
some noise with a newly acquired guitar amplifier. The exponential growth in potential
clients encouraged entrepreneurs to open studios in every small city, and many small
towns, to serve the demand to create recorded product. As a result, the audio industry
began to design and market more affordable studio gear – mixing consoles, loudspeakers,
processors, acoustic room treatments, etc. Independent studios of the 1950s and 1960s
tended to work with modified radio broadcast equipment, while the larger, more professional facilities operated custom-designed, proprietary equipment. But by the end of the 1970s, a
serious market in studio equipment had evolved. Record-influenced musicians sought the
gear, and in turn, the culture of recording practice as a means to achieve fame and fortune
established technologically informed dreams and desires in the hearts of burgeoning rock
stars.
Some musicians had as much influence over developing studio technology as they
had in instrument design. In terms of impact, it would be difficult to overstate Les Paul’s
contributions to recording technology and popular music in general. But this accolade is
more applicable to his invention of the select/synchronous tape head, a crucial component
in the development of multitrack recording technology, than it is to the instrument that
he is most known for, the electric guitar model that bears his name. Common wisdom
often misidentifies Les Paul as the inventor of the electric guitar in large part because of
his name recognition as a performing artist, one who made his name ironically in the
process that he was much more instrumental in developing. But the success of the Les Paul
model guitar manufactured by Gibson illustrates the value of name branding that Adolphe
Sax established a century earlier. Paul’s successful partnerships with industry provide an
alternative model from Sax’s own manufacturing ventures. Rather than building a business
from scratch, then spending his time managing such an enterprise, Paul instead tinkered
with the designs already manufactured by the Ampex company, making modifications that
allowed him to achieve the musical ideas in his head, then proposing to license the patent
back to Ampex, thus ensuring that an established company with manufacturing facilities
and a distribution system would be able to literally capitalize on Paul’s concepts. In turn,
Paul’s use of overdubbing would prove the key to his ‘New Sound’, a novelty he promoted
in all publicity materials and appearances, heralding the arrival of multitrack technology
in popular music (Zak 2012). The resulting commercial success quickly inspired producers
and performing artists to explore the possibilities of what Paul Théberge has dubbed the
‘rationalized and fragmented’ process of methodically layering performance elements to
construct a composite whole (Théberge 1997: 216).
Yet, despite the seismic shifts in music-making as a result of Paul’s noteworthy invention,
it is the Les Paul guitar that most sustains his notoriety. Though his work with Gibson was
more as a consultant about type and location of pick-up placement, neck dimensions, etc.,
because the Les Paul model bears his name and is one of the more iconic guitars to emerge in
the 1950s, embraced by numerous rock musicians in subsequent decades, Paul is frequently
identified as the ‘inventor’ of the electric guitar (Millard 2004). The development of the
electric guitar in reality was a series of modifications undertaken by a number of people,
musicians and engineers alike, that, while relatively rapid in adoption, was more akin to the
long-term development of any instrument type such as the guitar, the piano, the violin,
etc. Paul, as much an entrepreneur as a musician or engineer, endeavoured to partner with
companies that oversaw the bulk of the work (and the financial risk) in exchange for a
smaller but consistent share in the financial stake of the endeavour (Kraft 2004). And the
initial benefit to the companies that manufactured his designs using his name association
was subsequently superseded by the benefit that his brand recognition brought to his later
performing career.
Perhaps the most successful musician as inventor is Tom Scholz, a self-described
‘engineer first, musician second’. Scholz, a graduate of MIT, assembled a recording studio
in his basement in the 1970s while employed by Polaroid. Here, he began not only writing
and recording what would become the first album issued under the name Boston, but also
designing and building a set of devices that would deliver the sound of a wide variety of
electronic distortion without the use of amplifiers, a condition necessitated by the limits of
his basement studio. With the massive success of Boston’s debut, Scholz took his devices
with him on the road, allowing him to replicate his album sonics, and frequently giving
interviews to music instrument magazines touting the benefits of his machines. Interest
grew, and Scholz was able to launch a product line, Rockman, in the early 1980s. Sales
of the Rockman were strong, and Scholz has earned more as a technology designer than
as a musician, even with multimillion sales of his records. Scholz notes the resistance
he encountered from a recording industry that was quick to scoff at his home demos and prototype designs for achieving his guitar tone. Yet the Boston catalogue was largely crafted in Scholz’s basement with his nontraditional approach to electric guitar sound, an approach that has since been widely adopted by a number of high-profile guitarists and is often utilized in recording studios to process drums, vocals and other non-guitar sounds (‘Tom
Scholz: Sound Machine’ 2014). The success of his company came from his devotion to the business and from a product that offered a new machine in a field that was already well established, unlike Thomas Dolby’s more bleeding-edge ventures.
If the success of Scholz’s Rockman product line exemplifies the potentially profitable
ventures of engineers moonlighting as musicians, the most successful musician
moonlighting as technological entrepreneur would be Dr Dre, co-founder of N.W.A.,
and hit-making producer for Snoop Dogg and Eminem. Though his long-term cultural
impact will likely be his creative output as performer and producer, his current status
as the wealthiest musician on the planet is the result of his partnership with recording
engineer turned record mogul Jimmy Iovine. Beats by Dre, a mid-level headphone line produced by their company, prompted Apple to acquire the business, resulting in an estimated payday worth US$3 billion. Split between partners, and with a mixture of cash and stock options
in Apple, overnight Dr Dre earned US$700 million, not for his skills as a producer, or for
any engineering design acumen, but rather from an entrepreneurial understanding of the
value of his brand, a clear focus on consumer audio technology as a relatively untapped
market and the logical extension of his work as a recording musician.
While the production of headphones may seem only tangentially related to music
production, the motivation for Dre’s interest came from a deep dissatisfaction with how
audiences auditioned the results of his painstaking labours in the studio. As Iovine recalls,
‘Apple was selling $400 iPods with $1 earbuds […] Dre told me, “Man, it’s one thing that
people steal my music. It’s another thing to destroy the feeling of what I’ve worked on”’
(Greenburg 2018: 164–165).
The same concern for audio quality that many musicians strive to achieve in their
recorded work is applicable to the post-production world of manufactured consumer
audio products, an area that in retrospect seems oddly unaddressed by earlier generations
of musicians. As Iovine and Dre first bonded over discussions of well-crafted stereo mixes,
it makes sense that they would extend their partnership into the realm of the consumer
listening experience, crafting sets of headphones with dramatically increased low-
frequency response, simulating the sound of playback in clubs and in automobiles tricked out with subwoofers. Pricing the units in the US$300 range, they created a product that was
both an entry-level audiophile item as well as a cultural totem fashion accessory akin to
Air Jordans. This latter aspect was dependent upon the brand value of Dre’s moniker, and
he is responsible for the product name – Beats by Dre. Beats by Dre mirrors the brand
recognition of Gibson’s Les Paul model guitar, but with the crucial distinction that Dre was
a major partner in the company, not simply licensing his endorsement of someone else’s
product. His status as co-owner also resembles that of Tom Scholz, but whereas Scholz’s
market was limited to musicians and, in reality, a small subsection of electric guitarists,
Beats by Dre were purchased by the exponentially larger audience of music fans. So
successful was the line that, while initial purchases were made by fans of the artist, their
ubiquitous presence as a fashion accessory soon extended the market to fans who never
listened to hip-hop, much less Dre’s own productions.
One of the ramifications of Beats by Dre was not simply furthering Dre’s own name
recognition but making private music listening a visible public statement. Whereas Apple’s
earbuds were designed to disappear into the ear, to hide the fact that their users were wearing
them, Beats headphones were quite large, boldly coloured and instantly recognizable. With
a generation of listeners exchanging the fairly limited bandwidth of earbuds for the higher
resolution (though admittedly hyped) audio playback of larger diaphragm designs, Dre
not only achieved his goal of having his fans experience the ‘feeling of what (he) worked
on’ but extended that experience to fans of other musicians as well. The generation raised
on tiny, lo-fi earbuds was replaced by another that places a greater importance upon audio quality, or at least has demonstrated an awareness of the possibility of higher resolution
and the enhanced listening experience, even if motivated more by fashion than music.
While the revenues generated by his headphone line are staggeringly impressive, and mark
an important moment in hip-hop culture and African American business, the impact upon
musical practices might be more profound. Turning the tide away from ever-narrowing
limited frequency and dynamics, Beats by Dre re-established the value of the listening
experience and may prove to be just as important to musical development as Les Paul’s sel/
sync tape head was fifty years earlier.

Taking it to the stage


Sound-recording practice began to shape musical performance almost from the moment
of its birth, whether it was the emergence of international performing stars such as
Enrico Caruso, whose recordings set the stage for integrated marketing strategies, or
the development of public address systems that allowed crooners such as Bing Crosby to
whisper in the ears of not only the gramophone and radio listeners, but also those dance
hall crowds who flocked to see these pre-rock rock stars.
But as musicians began to explore sound recording as a means of first extending,
then subsequently transcending what could be accomplished in live performance, new
techniques and technologies were developed to make these sonic adventures replicable in
a live setting. If The Beatles refrained from attempts to perform their most studio-centric
work during their final years of touring, by the 1970s, many musicians approached live
performance from the perspective of the recordings that were conceived as studio creations
rather than as documents of performance in the studio. Often, it was musicians’ ad hoc
approach to replicating their recorded music that drove technological developments and
their eventual manufacture and consumer markets. For example, once The Who built
classic anthems such as ‘Baba O’Riley’ and ‘Won’t Get Fooled Again’ around recorded
sequenced electronic ostinatos, those same recordings served as the foundation of their
live performance. This posed challenges in playing to tape, some related to locking human
rhythmic fluctuations to mechanized pulse, some related to the fallible technology of the
tape playback systems themselves. For the last eight years of his life, drummer Keith Moon
was forced to play with a pair of headphones duct taped to his head, just as he had done
when tracking his recorded performance in the studio (The Kids Are Alright 1979).
Headphones allowed Moon to keep more precise time than he could by relying on often unreliable sound
systems. But headphones also allowed musicians access to sonic information not intended
for the audience. For heavily sequenced/programmed music in the 1980s and beyond,
headphones became a necessity, a common component of a drummer’s stage set-up. Many
drummers began to perform to metronomic click tracks, even when the sounds broadcast
to the audience did not appear to contain any mechanized programming elements. Such live
performance strategies reflect how deeply embedded notions of acceptable performance
had been shaped by recording practice as tempos were set not only to match that of the
record but also to maintain those tempos throughout the performance, just as the records
being reproduced had been tracked to metronomic click tracks (Théberge 1989).
For many musicians, the desire to recreate the recorded performance went beyond
replicating the final mix. Instead, these musicians sought to construct a simulacrum of
the sonic environment in which they delivered their original studio performance. In-ear
monitors, a far more stage-worthy iteration of a headphone, could deliver carefully crafted
monitor mixes that emulated the instrumental balances, with supporting atmospheric
digital reverberation and delay. These personal monitoring devices had the added
benefit of reducing the need for foldback monitors on stage, dramatically lowering stage volumes and
subsequently allowing house sound engineers to boost acoustic-based signals in the mix
without the previously common feedback issues. This in turn resulted in mixes that were
even closer to the original recordings and thus invited audiences not only to relive the
experience of listening to the record but also to become more fully present at the moment
of its (re)creation.
Digital reverberation and delay units not only afforded small-scale studios the luxury of
echo chambers, plates and tape delays, but brought these sounds into both the coffeehouse
and the arena. Programmable synthesizers enabled keyboardists who had spent hours
carefully dialling in the sonic elements of electronic processors in the studio to store and
recall these settings at the touch of a button. Drum machines not only brought the big bam boom to musicians without access to drums; once these sounds became the sound of the record, the record also became even easier to replicate onstage. Drummers began
to utilize drumsticks and heads that delivered a crisper attack, replicating the close-mic’ed
sounds of samplers and records for live performance (Zagorski-Thomas 2010).
The influence of recorded sound on live performance isn’t relegated solely to the
popular music industry. The advent of rock-oriented musicals brought a contemporary
pop aesthetic not only to composition but also to the sonic experience of Broadway and
West End audiences. Rock bands replaced pit orchestras, and theatres installed sound
systems remarkably similar to those utilized by touring rock and pop performers. The
most telling component of this switch is the use of microphones for singing actors. Small,
thin, carefully disguised microphones equipped with radio transmitters became part of
theatre stagecraft. And as these productions moved into community and high school
theatres, a new micro-industry in consumer theatre amplification emerged. In turn, many
religious services sought to emulate both the experience of pop recordings and musical
theatre productions during worship services. Figures from the North American Music
Merchandisers Association indicate that one of the most profitable sectors of the music
product is the field of audio systems for churches, with a full complement of PA systems,
in-ear monitors and headset microphones (Matas 2014). The difference between a
production of Jesus Christ Superstar and Jesus Christ Sunday School is minimal.

Conclusion
The historical examples examined in this chapter illustrate the myriad forms in which
musicians have shaped technology. They also highlight the importance of commerce in
transforming the inspirations of an inventive individual into tangible, widely adopted
products. Many intriguing ideas have failed to establish themselves in any lasting way
because of lack of capitalization, a hostile or dismissive business environment, an inability
to clearly articulate the potential of technology or perhaps just bad timing, a little too
far ahead of the bleeding edge. While large-scale industry made Les Paul and Dr Dre
household names and wealthy men, newer developments in replication and distribution,
via 3-D printers, crowdfunded platforms and social media outlets, may afford the creative
musical inventor the opportunity to bypass the older manufacturing and retail structures
that have proven challenging to earlier generations of musician-engineer-entrepreneurs,
erasing the distinctions and divisions implied by such hyphenated phrases, where creative
expression is found in both musical sound and the technologies that make it possible.
The consumer audio industry now serves the needs of a wide range of performers
who cannot conceive of live performance without the mediation of technology. The
standardization of intonation to A440 Hz can be seen as a result of widely distributed tuners
set to this frequency. Global traditions of non-equal temperament have gradually given way
to the theoretical formulas of pitch that arrive as defaults on most electronic keyboards. The
spectre of technological determinism is ever-present, yet always gives way to the individual
willing to look under the hood and hotwire a circuit board in the quest for something new
and unique. Musicians first embrace what is available, then push equipment beyond its
design specs. Subsequently, industry swoops in to incorporate these modifications into
mass-produced products. And the cycle is set in motion again.

Bibliography
Cottrell, S. (2012), The Saxophone, New Haven, CT: Yale University Press.
Dolby, T. (2016), The Speed of Sound: Breaking the Barriers Between Music and Technology,
New York: Flatiron Books.
Dregni, M. (2014), ‘Gizmotron: Most Bizarre Guitar Effect of All Time?’, Vintage Guitar
Magazine, March: 44–47. Available online: http://www.vintageguitar.com/18431/
gizmotron/ (accessed 1 August 2019).
Greenburg, Z. (2018), 3 Kings: Diddy, Dr Dre, Jay-Z and Hip-Hop’s Multibillion-dollar Rise,
New York: Little, Brown and Co.
Hyde, L. (2010), Common as Air: Revolution, Art, and Ownership, New York: Farrar, Straus
and Giroux.
The Kids Are Alright (1979), [Film] Dir. Jeff Stein, USA: Pioneer.
Kraft, J. (2004), ‘Manufacturing: Expansion, Consolidation, and Decline’, in A. Millard (ed.),
The Electric Guitar: A History of an American Icon, 63–88, Baltimore, MD: Johns Hopkins
University Press.
Matas, D. (2014), ‘House of Worship: A Growing and Evolving Market’, NAMM Playback:
Industry Crossroads. Available online: https://www.namm.org/playback/industry-
crossroads/houses-worship-growing-and-evolving-market (accessed 31 May 2019).
May, B. and S. Bradley (2014), Brian May’s Red Special: The Story of the Home-Made Guitar
that Rocked Queen and the World, Milwaukee, WI: Hal Leonard Corporation.
Millard, A. (2004), ‘Inventing the Electric Guitar’, in A. Millard (ed.), The Electric Guitar: A
History of an American Icon, 41–62, Baltimore: Johns Hopkins University Press.
Obrecht, J. (1978), ‘Eddie Van Halen: Heavy-Metal Guitarist from California Hits the Charts
at Age 21’, Guitar Player, 12 (11). Available online: https://www.guitarplayer.com/players/
eddie-van-halens-first-interview-1978 (accessed 11 August 2019).
Pinch, T. and F. Trocco (2004), Analog Days: The Invention and Impact of the Moog
Synthesizer, Cambridge, MA: Harvard University Press.
Russell, R. (2003), ‘Parsons-White String Bender’, Vintage Guitar Magazine, February.
Available online: http://www.vintageguitar.com/1970/parsons-white-string-bender/
(accessed 1 August 2019).
Théberge, P. (1989), ‘The “Sound” of Music: Technological Rationalization and the Production
of Popular Music’, New Formations, (8): 99–111.
Théberge, P. (1997), Any Sound You Can Imagine: Making Music/Consuming Technology,
Middletown, CT: Wesleyan University Press.
Tolinski, B. and A. Di Perna (2017), Play It Loud: An Epic History of the Style, Sound, and
Revolution of the Electric Guitar, New York: Anchor Press.
‘Tom Scholz: Sound Machine’ (2014), [TV Programme] NOVA’s Secret Life of Scientists
and Engineers, PBS, 22 May. Available online: https://www.youtube.com/
watch?v=NYXgfzVjrTw (accessed 1 August 2019).
Winner, J. (2008), ‘The World of Sound: A Division of Raymond Scott Enterprises’, in
P. D. Miller (ed.), Sound Unbound, 181–202, Cambridge, MA: MIT Press.
Zagorski-Thomas, S. (2010), ‘Real and Unreal Performances: The Interaction of Recording
Technology and Drum Kit Performance’, in A. Danielsen (ed.), Musical Rhythm in the Age
of Digital Reproduction, 195–212, Farnham: Ashgate Press.
Zak, A. (2012), I Don’t Sound Like Nobody: Remaking Music in 1950s America, Ann Arbor:
University of Michigan Press.

Discography
Dolby, Thomas (1982), ‘She Blinded Me With Science’, The Golden Age of Wireless, Capitol
Records.
The Who (1971), ‘Baba O’Riley’, Who’s Next, MCA Records.
The Who (1971), ‘Won’t Get Fooled Again’, Who’s Next, MCA Records.
15
Studying Recording Techniques
Kirk McNally and Toby Seay

Introduction
This chapter is the result of a conversation we had about the study of recording techniques –
particularly the research-based texts from our field and our sense that they had not been
critically evaluated as a whole. Independently we had voiced frustration with various
aspects of the standard texts, including the gaps we saw in them and the problems that had
arisen using them in teaching. We had also heard from students about their own issues,
which further pointed to the need for dialogue around the books we ask them to read and
study.
However, thinking about actually writing such a critique brought on a fraught silence.
Beyond the obvious reluctance to criticize the life’s work of colleagues, our view was that
many of the standard texts suffer because of the features of this field. The texts we use
were all written over a relatively short period of time, with the earliest entries dating
back only to the middle of the last century (Bernhart and Honegger 1949). Sound recording is a
multifaceted, technology-intensive field, which creates a need for resources that evolve,
grow and accommodate the ever-changing ‘state of the art’. Compared to fields such
as mathematics, English literature, philosophy or medicine, ours is still in its infancy.
Extending this metaphor, we heed the advice that ‘children need models rather than
critics’. Any critique we put forth could be countered by the call to simply do better and
write that perfect text.
Why, then, did we not write a new text? The answer came during McNally’s visit to an
institutional library, where recording-techniques textbooks were found in the TK section.
TK is a subcategory of the technology class, reserved for books on electrical engineering,
electronics and nuclear engineering. This classification was striking and raised the question
as to how texts such as Roey Izhaki’s Mixing Audio: Concepts, Practices and Tools (2008) and
William Moylan’s The Art of Recording: Understanding and Crafting the Mix (2002), which
both have a strong creative and artistic component, came to be shelved in the electrical
engineering section. Certainly sound recording uses technology and has a historical basis
in electrical engineering, but it is also a field in which artistry and craft are recognized,
valued and taught.
As Grassie and Clark (2015) point out, the study of music production and audio covers
a broad set of disciplines, making it difficult to place it in a singular category. Leaving the
library, McNally walked past table after table of students glued to their laptops and mobile
devices. Some students were watching video tutorials or lectures, some were reading online
and a few had textbooks in hand. It was seeing those students using a variety of learning
mediums in juxtaposition with the historical narrative illustrated by the TK classification
that provided clarity on how we might critically evaluate the texts within the field of sound
recording. What follows is an analysis of the methods used to study sound recording and
how they have influenced the content of research texts in the field.
The chapter is divided into three parts. The first documents the approaches to the study
of sound recording, both historical and contemporary. The second section categorizes the
available texts and aligns them with the previous approaches. The third section introduces
contemporary disrupters in the form of new and nontraditional resources in the field and
analyses the value of these resources.

History of the study of sound recording


In this brief history of the study of sound recording, we will break from the convention of
using a chronological timeline and begin with the concept of Tonmeister training.1

The Tonmeisters
Within the academic writing on sound recording, the formal training of sound-recording
engineers is reported to have begun roughly seventy years ago (Borwick 1973), when the
composer Arnold Schoenberg wrote to Robert Maynard Hutchins, the chancellor of the
University of Chicago, to suggest that ‘soundmen’ would be better prepared for the demands
of the field if they received musical training in addition to training in the ‘mechanical
fields’ (physics, electronics). Schoenberg writes:
The student should become able to produce an image in his mind of the manner in which
music should sound when perfectly played. In order to produce this image he should not use
the corrupting influence of an instrument. Merely reading the score must suffice. He will be
trained to notice all the differences between his image and the real playing; he will be able to
name these differences and to tell how to correct them if the fault results from the playing.
His training in the mechanical fields should help him to correct acoustic shortcomings, as,
for example, missing basses, unclear harmony, shrill notes, etc. (1987: 241)

His name for this new breed of sound engineer was ‘Tonmeister’, and the original training
concept has now become the philosophical foundation of a large number of institutions,
primarily in Europe but also in Canada and the United States. The Erich Thienhaus
Institute, founded by its namesake in 1949 in Detmold, Germany, is credited with being
the first educational programme for the study of sound recording to use the Tonmeister
concept.
While each institution will offer its own particular flavour of this concept, a
Tonmeister programme will either include musical training or require a formal musical
education as a prerequisite. Once in the programme, students take scientific courses,
including mathematics, acoustics, physics, electrical engineering or digital signal
processing as well as practical courses in the recording studio to familiarize them
with the technology and equipment used in sound recording. When synthesized, the
stated goals of the various Tonmeister programmes currently offered indicate a desire
to provide musicians with the technical skills required to be successful in the music-
recording industry.

The apprentices
Schoenberg’s letter makes it clear that there were working ‘soundmen’ before he penned
his letter, his ideas and recommendations being based upon what must have been negative
experiences working with contemporary sound engineers. Where did these early engineers
receive their training, and what was the method? The answer to these questions is, of
course, in-employment training and apprenticeship. This type of training requires that you
are first an employee of a given institution or private enterprise, which would seem to
distance it significantly from a formal education model. We argue that the apprenticeship
model must be considered when looking at the study of sound recording because many of
the authors of the texts used in the field have apprenticeship training in their background.
For transparency, it should be noted that both authors of this chapter apprenticed as part
of their professional training.
The apprenticeship model, also described as situated learning (Lave and Wenger
1991), is well known in the field of sound recording. The classic job progression observed
in recording studios around the world – runner, second or assistant engineer, engineer,
producer – exhibits the hallmark features of situated learning. Individuals starting their
career at a traditional studio will find themselves participating in a new social context
which provides learning that is both rich and complex in meaning. The apprentice
engineer, with the support of a master practitioner, moves through a series of key steps,
which are easily understood and include observation, modelling, scaffolding, fading and
coaching (Pratt 1998). Through these steps and the cyclical repetition of product creation,
the learners’ schemas – the theory individuals have about the nature of events, objects
or situations (Rumelhart and Norman 1980) – become more complex and elaborate.
This process illustrates the development of tacit knowledge – the unarticulated, implicit
knowledge gained from practical experience (Polanyi 1958, 1966), which Susan Schmidt
Horning (2004) applies to sound recording and describes as the ability of recording
engineers to ‘deploy a working knowledge of the behavior of sound and the machinery
of its propagation’. Importantly, these schemas are not extracted from the studio context
to remain only inside the heads of the apprentices. Rather, ‘the ultimate product of
situated learning is knowledge that is embedded in, rather than extracted from, contexts
of its application’ (Pratt 1998: 88). This model requires a physical engagement within
a community of practitioners. Through this training, an apprentice will develop an
extensive body of knowledge, have the ability to work in a variety of roles, anticipate
outcomes and improvise new schemas based on changing circumstances within the
studio.

The cognitive apprentices


Reflecting on the discussion thus far, it becomes clear that another model for how to
study sound recording is necessary. A large number of institutions have sound-recording
programmes that don’t conform to the Tonmeister model or cannot adhere to a strict
apprenticeship approach. Generally speaking, these programmes, to different degrees,
combine theoretical study with practical training. We describe this type of programme as
the cognitive apprenticeship model.2
A cognitive apprenticeship shares many of the features of the traditional model, but
the focus shifts to intellectual or cognitive tasks that aren’t always easily observed (Pratt
1998). Because of this, the role of the teacher – for this model is most often found within
an educational setting – is significantly different from that of the master practitioner. The
cognitive apprenticeship model in action is illustrated by the following scene, common
to many hospital television dramas. A patient presents with some serious condition
or hidden ailment, and the doctor queries the resident, asking them to call upon their
current content knowledge to propose possible causes of the ailment. An examination
follows, with the doctor stating aloud the mental process they used to make decisions
and evaluate the case. The resident is asked for their diagnosis and planned course of
treatment. The head doctor then chastises the poor resident for an obvious omission
and provides their own diagnosis, with an accompanying climax in the musical score.
Discussion between fellow residents and the doctor about where they were right or
wrong concludes the scene.
For that kind of scene to work, the master and the learner must assume a familiar yet
different role in the steps that describe a cognitive apprenticeship model. As with the
traditional apprenticeship model, the steps are easily understood and include modelling,
approximating, fading, self-directed learning and generalizing (Pratt 1998, adapted from
Farmer 1992). The hospital scene just described illustrates that both the master (doctor)
and learner (resident) articulate their thinking, which allows for assessment on multiple
levels. Does the student have the appropriate content knowledge? Situational knowledge?
Are they applying this skill and knowledge appropriately for the given situation, task or
activity? As competency grows, the master provides assistance only when asked and is
ultimately tasked in the final ‘generalizing’ stage with discussing the appropriateness of
using existing knowledge in a new situation.
The texts
Our analysis of texts in the field of sound recording pointed to a clustering into three
principal categories: texts that are overtly technical, texts that are comprehensive in nature
and methods textbooks that focus on a specific approach, such as critical listening, or on
specific stages of the music production process – music mixing, for example. At a very
high level, the trajectory of the texts when viewed upon a timeline shows a movement
from strictly technical to methods texts. Though not exhaustive, Figure 15.1 provides an
overview of the number of published textbooks in the field over the past forty years. In
selecting texts for this list (Table 15.1), we considered both the content and the perceived
primary readership, choosing ones aimed at student or professional practitioners in sound
recording and excluding those serving as reference texts for the designers and makers
of the equipment and technology. The proliferation of texts in the last fifty years mirrors
both the democratization of music technology and the expansion of post-secondary and
private-institution programmes in sound recording. As stated in the introduction, the
point of the following analysis is not to critique the texts in our field but rather to illustrate
how the methods used to study sound recording have influenced the content of our texts,
which in turn provides greater clarity on what is missing or inadequately addressed in the
contemporary context.
It must also be noted that white, male, anglophone authors dominate the literature. In
a loose canvassing of the membership of the Art of Record Production community, which
included ex-students and non-native English-speaking lecturers from Latin America, Europe,
Africa, China and India, editor Simon Zagorski-Thomas reported only a few examples of
‘local’ texts. The anecdotal evidence suggested that the vast majority use textbooks from
the United Kingdom and the United States. Female authors of recording texts are similarly
limited, with Sylvia Massy and Jenny Bartlett being two notable exceptions. In the world of
hip-hop, with its origins in African American culture, the first how-to book was written by
Joseph Schloss (2004), a white male author, an observation further supporting the claim
that there is limited diversity among the authors of recording texts.
Given the large number of texts available, it is impossible to discuss them all within
the context of this chapter, but we have chosen carefully and attempted to address each
category equally. The following texts have been selected based on their professional
reputation and publishing record, and are divided into three categories: technical, methods
and comprehensive.

Figure 15.1  Publication dates of selected sound-recording textbooks.



Table 15.1  List of texts

Date | Title | Author | Publication
1949 | Traité de prise de son | Bernhart, José | Paris: Eyrolles
1959 | The Audio Cyclopedia | Tremaine, Howard M. | Indianapolis, IN: H.W. Sams
1974 | Modern Recording Techniques | Runstein, Robert E. | Indianapolis, IN: H.W. Sams
1975 | Handbook of Multichannel Recording | Everest, F. Alton | Blue Ridge Summit, PA: G/L Tab Books
1976 | Sound Recording Practice: A Handbook | Borwick, John | London: Oxford University Press
1976 | Sound Recording | Eargle, John | New York: Van Nostrand Reinhold Co.
1976 | The Recording Studio Handbook | Woram, John M. | Plainview, NY: Sagamore Pub. Co.
1986 | Handbook of Recording Engineering | Eargle, John | New York: Van Nostrand Reinhold
1986 | Modern Recording Techniques | Runstein, Robert E.; Huber, David Miles | Indianapolis, IN: H.W. Sams
1987 | Handbook for Sound Engineers: The New Audio Cyclopedia | Ballou, Glen | Indianapolis, IN: H.W. Sams
1987 | Introduction to Professional Recording Techniques | Bartlett, Bruce | Indianapolis, IN: H.W. Sams
1988 | Audio Engineering Handbook | Benson, K. Blair | New York: McGraw-Hill
1992 | Practical Recording Techniques | Bartlett, Bruce; Bartlett, Jenny | Carmel, IN: Sams
1992 | The New Stereo Soundbook | Everest, F. Alton | Blue Ridge Summit, PA: Tab Books
1992 | Sound and Recording: An Introduction | Rumsey, Francis; McCormick, Tim | Oxford: Focal Press
1996 | Audio in Media: The Recording Studio | Alten, Stanley R. | Belmont, CA: Wadsworth Pub. Co.
1997 | The Art of Mixing: A Visual Guide to Recording, Engineering, and Production | Gibson, David; Petersen, George | Emeryville, CA: Mix Books
1999 | The Mixing Engineer’s Handbook | Owsinski, Bobby; O’Brien, Malcolm | Emeryville, CA: Mix Books
2000 | The Mastering Engineer’s Handbook | Owsinski, Bobby; Englefried, Sally | Vallejo, CA: Mix Books
2002 | Mastering Audio: The Art and the Science | Katz, Robert A. | Oxford: Focal Press
2002 | The Art of Recording: Understanding and Crafting the Mix | Moylan, William | Boston, MA: Focal Press
2003 | Recording Tips for Engineers: Over 1,000 Easy-to-Use Tips, Hints, Tricks, How-to’s, Setups, Explanations and Suggestions for Today’s Recording Engineer, Musician and Home Studio User | Crich, Tim | Vancouver: Black Ink Pub.
2005 | Critical Listening Skills for Audio Professionals | Everest, F. Alton | Cambridge, MA: Course Technology
2005 | The Recording Engineer’s Handbook | Owsinski, Bobby | Boston, MA: Thomson Course Technology, ArtistPro
2010 | Audio Production and Critical Listening: Technical Ear Training | Corey, Jason | Oxford: Focal Press
2012 | Audio Engineering 101: A Beginner’s Guide to Music Production | Dittmar, Tim | Waltham, MA: Focal Press
2012 | Mixing Audio: Concepts, Practices, and Tools | Izhaki, Roey | Oxford: Focal Press
2014 | Mic It!: Microphones, Microphone Techniques, and Their Impact on the Final Mix | Corbett, Ian | Hoboken, NJ: Taylor and Francis
2014 | Recording Secrets for the Small Studio | Senior, Mike | New York: Focal Press
2016 | Recording Orchestra and Other Classical Music Ensembles | King, Richard | New York: Focal Press
2018 | Mixing With Impact: Learning to Make Musical Choices | Oltheten, Wessel; Osch, Gijs van | New York: Routledge

Technical
The format of texts in this category will be familiar to many in the field. These texts
generally prioritize the theory, engineering design, features or measured performance
of equipment over the practical application of the tools – they are references for all
things technical that relate to sound recording. For example, in the preface to Rumsey
and McCormick’s Sound and Recording (1992), the authors classify their text as targeted
towards newcomers to the field and ‘biased towards an understanding of “how it works,”
as opposed to “how to work it”’. Chapters on basic physics, acoustics and psychoacoustics
are common, as are sections on specific equipment, including microphones, consoles and
signal processing gear.
In Glen Ballou’s Handbook for Sound Engineers: The New Audio Cyclopedia (1987; the
title pays homage to Howard Tremaine’s technical text Audio Cyclopedia [1959]), the
biographies for chapter authors routinely cite either institutional affiliations or positions
within industry, including research positions. Some authors do cite practical experience,
but the vast majority have electrical engineering backgrounds. This is not surprising
for contributors to a technical reference, but these authors’ training would have been at odds
with that of a great number of people working in professional recording studios at
the time the books were published. These texts mirror a division that Schmidt Horning
identifies in audio-engineering publications of the early 1950s. The first Journal of the
Audio Engineering Society (JAES), published in 1953, ushered in editorial changes to its
precursor, the magazine Audio Engineering, which became simply Audio in 1954 and
repositioned itself towards the ‘newcomer to our ranks – and possibly less [towards] the
scientist’ (Schmidt Horning 2013: 75).
Tremaine’s text is a logical progression in the maturation of the audio engineering field
described by Schmidt Horning. These offerings are clearly aligned with formal training
approaches. Certainly Audio Cyclopedia can be seen as a response to the need for a
reference text that could be used by students and professionals alike as a resource for best
design practices and information on standards. Meeting that need was one of the historical
priorities of, and the reason for, the formation of the Audio Engineering Society (AES)
(Schmidt Horning 2013: 73–76). Tönmeister and cognitive apprenticeship approaches
alike would benefit from texts such as these, but they do not directly address amateurs or
‘hobbyists’ interested in sound recording, nor do they reflect the vast knowledge capital
held by professional recording engineers of the era. This, of course, wasn’t the point of the
authors, but it does highlight a significant gap in the available resources during the early
years of formal sound-recording programmes.
A distinguishing feature of technical texts is their inclusion of references and background
literature, commonly cited in-text or as suggested further readings. This approach aligns
with scientific writing practices and clearly illustrates to the reader where previous
knowledge and technical information are coming from. Such resources align with both
the Tönmeister and cognitive apprenticeship approaches, where technical information is
either studied as an independent discipline or is used as evidence of content knowledge
when placed into a practical situation.

Comprehensive
The textbooks we have categorized as comprehensive include some discussion of applied
theory or practice. While a technical text would describe a microphone by its operational
design principles and measured response, a text in the comprehensive group would include
some discussion about how the microphone is commonly deployed in practice and perhaps
why it might be a good choice for a given situation. One could be forgiven for mistaking a
text such as Robert E. Runstein’s Modern Recording Techniques (1974) for another technical
reference text if it weren’t for the chapter called ‘Studio Session Procedures’ (230–243) and
the first pages of a chapter devoted to ‘Automated Mixdown’ (258–259). In these pages
we begin to get a glimpse of the activities that take place within the walls of the recording
studio and some information about the expectations that will be placed upon the individual
in control of the recording equipment.
In the preface to his book, Runstein cites a story from his tenure as head engineer and
technical director of Intermedia Sound Studios in Boston, Massachusetts, where he was
continuously approached by people fascinated with recording and wanting ‘to learn how
the records they enjoyed were made and to participate in making records as engineers,
producers, or recording artists’ (1974: preface). In its ninth edition, this text, now
authored by David Miles Huber, includes an introductory chapter in which we hear an
echo of Runstein’s experience. The chapter includes sections titled ‘Knowledge is power!’,
‘Making the project studio pay for itself ’ and ‘Career development’ (Huber and Runstein
2017: 1–40). Nested within this brash assessment of the requirements to be successful in
the recording industry is the acknowledgment that the kind of individuals Runstein met
in 1974 are still around today. The desire and hunger for knowledge about the recording
process remain as present today as they were in 1974.
The growth of the educational market and the formalization of sound-recording
training programmes are explicitly acknowledged in the preface to John Eargle’s Sound Recording
(1976) and the later Handbook of Recording Engineering (1986). John Borwick’s Sound
Recording Practice is another text that addresses ‘the vast army of aspirants who are
continually knocking on studio doors trying to get in’ (1976: editor’s preface). While
the voice of the practitioner isn’t included in the technical texts, authors of texts in the
comprehensive category attempt in different ways to impart some of the knowledge and
experience of the recording engineer. Eargle and Runstein report the standard practices,
specific activities and questions the engineer would be expected to be able to answer.
The Runstein and Huber editions are written using the voice of the mentor, or sage
studio veteran, which is reminiscent of an apprenticeship approach. In the Borwick
text a collection of experts gathered from the industry and academia report on their
knowledge and experience, at times even sharing their own philosophy on working in
the studio: ‘It is important to establish and maintain a congenial atmosphere to bring
out the best performance from the musicians. Therefore one should try to be as helpful
and friendly as possible to producers and musicians alike’ (Peter Tattersall quoted in
Borwick 1976: 262).
The methods used to include ‘how to work it’ information in these comprehensive texts
can be seen on a spectrum from observational reporting to first-person accounts. The latter
method clearly acknowledges the apprenticeship-training approaches that exist within the
field but aren’t reflected in the technical textbooks. The danger of this method is that as
text alone it becomes anecdotal and lacks the context for applying the knowledge being
conveyed. The reader isn’t able to genuinely access the body of knowledge that resides
within an experienced practitioner, and there is no method for troubleshooting a particular use
case or philosophy should it not work as the master suggests.

Methods
This category is the most diverse of the groups analysed. It includes textbooks that are based on
specific approaches or on specific stages of the music-production timeline and that largely
exclude the foundational theory found in comprehensive texts. In general, these textbooks
provide a framework for working with audio that includes a method for identifying,
evaluating and shaping sound. The relative weight given to developing these abilities varies,
as do the frameworks presented by the authors, but in all of these texts the reader is given
a system to work within.
An obvious example of a methods text is Jason Corey’s Audio Production and Critical
Listening: Technical Ear Training (2010). Focused on ear training, this book presents a
systematic method for developing critical listening skills within the sound-recording
context. As it is a contemporary text, it includes a software application and musical
examples available through a compendium website. An example of an early methods text,
and one that happens to be aligned philosophically with Corey’s, is Alec Nisbett’s The
Technique of the Sound Studio (1962). It systematically discusses studio techniques as well
as methods for assessing and shaping sound in the studio. The ability to listen is identified
as the primary feature of a recording engineer. Nisbett states that a goal of the text is to
educate readers in ‘learning to use the ears properly’ (9). Other examples of methods texts
include William Moylan’s The Art of Recording: Understanding and Crafting the Mix (2002),
Roey Izhaki’s Mixing Audio: Concepts, Practices, and Tools (2008), David Gibson’s The Art
of Mixing: A Visual Guide to Recording, Engineering, and Production (1997) and Bobby
Owsinski’s The Mixing Engineer’s Handbook (1999).
Like the Corey book, the majority of these texts include musical references, which
provide context for the specific topic or approach being discussed. This can be seen as an
attempt to further incorporate the apprenticeship approach into the texts. A more overt
example of ties to the apprenticeship model can be seen in the Owsinski and O’Brien text,
which includes interviews with famous mixing engineers. In the first case, the inclusion
of musical references is better seen as a cognitive apprenticeship approach, due to the fact
that the author was rarely the engineer or producer on the original recording. In Owsinski
and O’Brien, and as we argued earlier, the inclusion of first-hand accounts is limited by
the fact that the ‘master’ often lacks the ability to critically extract and articulate their
individual schema. In general, these musical examples and voices are a positive addition to
the textbook offerings as they directly address the fascination readers have with recordings
and connect them – if only superficially – with how they were made.
The majority of the texts in this category focus on mixing, but there are also texts about
recording and mastering as well as more contemporary offerings, including audio for
games, microphone techniques and orchestral recording. This diversity is positive, but it
becomes problematic when we consider using the texts in conjunction with one another;
there is potential for contradictions and inconsistencies in education and training.
In order to mitigate any disagreement between texts, the author must reflect critically
upon the specific knowledge and learning being conveyed, particularly in the case of
technical information. A criticism regarding this very issue has been raised about William
Moylan’s The Art of Recording: Understanding and Crafting the Mix. Within this methods
text, references appear in a bibliography but are not linked through in-text citations or
footnotes, which demands that the reader make the necessary connections (Lapointe
2010). The Corey text uses in-text citations, which clearly delineate the body of knowledge
and previous work from any of the author’s unique opinions, methods or approaches.

The disrupters
Any discussion of the resources students use to learn in a contemporary setting would be
incomplete without including the internet and the many modes and channels of information
it represents. Textbooks may still hold a place within educational institutions, but anecdotal
experience suggests that an ever-increasing number of students are turning to secondary
resources that are more in line with how they consume media and entertainment. As
a discipline, sound recording isn’t unique in this, but the effect is amplified due to the
democratization of the technology associated with the field (Leyshon 2009). The ease of
access to technology previously restricted to the recording studio, coupled with the access
afforded to industry professionals by the internet, creates a significant disruption to the way
we think about and approach the study of sound recording. Students can turn not only to a
host of online courses, or MOOCs (massive open online courses) such as those on Lynda.
com or Coursera, but also to any number of YouTube channels dedicated to sound recording.
If you admire a certain recording engineer who is of some stature, you can likely watch a
video of your idol on the internet, speaking about their work or giving you their tips of
the trade. Between the formal MOOC model and the informal, anything-goes YouTube
format there is also a growing niche of content providers, including Pensado’s Place (2019),
The Mix Academy by David Glenn (David Glenn Recording 2019), the ‘Shaking Through’
series by Weathervane Music (2014), and Mike Senior’s ‘Mixing Secrets for the Small
Studio’ (Cambridge-MT 2019). Are these disrupters offering something new and filling
a gap that exists in the traditional texts we use to study and teach sound recording, or are
they simply a new medium for the delivery of established content?
If we apply to these disrupters the same analytical technique that we used for the
textbooks, the same broad categories can be observed. We see evidence of the historical
apprenticeship approach, whereby significant engineers – the majority focusing on music
mixing – provide insights into the tricks, tools and techniques they employ in their creative
practice. These examples often deviate from the strict apprenticeship model, adding some
level of analysis by the engineer in an attempt to translate tacit skill into teachable sound
bites. The master draws viewers to their channel through their personal stature, but in
the majority of cases it is a one-way line of communication. Also, in the majority of
cases the medium defines the presentation format, and judicious editing, scripting and
highlighting of key moments significantly alter the understanding of the practice. The
handling of reference materials, as discussed in the previous section, is another issue to be
addressed, particularly when technical or scientific elements are being communicated.
Upon analysis we see that, outside of the MOOC, references are rarely identified within
the disrupter offerings. When skill and craft are the desired learning outcomes, this is
not necessarily a damning feature in itself. However, when there is an attempt to provide
a more comprehensive learning experience, and because of the mediated nature of the
learning situation, the potential for significant inconsistencies or contradictions across
learning materials becomes problematic.
The cognitive apprenticeship approach is evident in the more formalized MOOC
offerings. A broad spectrum of courses is available, but in general the principal difference
between this delivery method and the previous example is its use of supplementary
theoretical and technical materials to support the practical tutorials. In the most
rudimentary form, these can be considered virtual textbooks. More advanced courses
provide new materials to the learner – for example, a set of shared multitrack audio
recordings. This provides a more legitimate connection to the practice, and because of
the shared content and context, the learner’s skills development will be more evident as a
result. However, is this approach really new, or is it simply a new medium for delivering
an existing method? The Corey and Izhaki texts would suggest that it is the latter, the
‘accompanying audio CD’ being replaced by a compendium website.
Where we do see a significant advantage in several of these disrupter platforms is their
ability to establish a community of practice and learners, much like what once existed
within the walls of the traditional large studio. The ‘Shaking Through’ series by Weathervane
Music is an example of a virtual apprenticeship model that leverages positive features of
the traditional approach into a contemporary learning environment. Subscribers to this
site are given session materials and video footage of the sessions (albeit edited), and are
invited to submit their mixes for evaluation and critique by the organization’s principal
actors. There are, of course, practical limitations to the depth of the relationship and
communication possible between the master and apprentice, but this approach does afford
access that could substantially enhance an individual’s attempt to study sound recording.
Used within an institutional setting, this type of resource is valuable because it provides
the community of learners with a diversity of voices and mentors.

Conclusions
Returning to the original motivation for this chapter – a conversation about how we
study recording techniques and the texts we use to support this learning – it is important
to remember that each individual approach – Tönmeister, apprenticeship, cognitive
apprenticeship and likely every sound-recording text available – has helped produce
proficient and successful recording engineers. As Nixon et al. (2015) point out, the skills
necessary to succeed in music production are diverse and require technical, creative,
relational and cultural contexts to be engaged during training. This discussion, therefore,
has not been about which training approach or text is better; rather, we hope we have
provided some insight into how to interpret a text’s content in terms of which context it fills
and with which tradition it aligns. Our discussion of contemporary disrupters illustrates
the value of critically evaluating the methods and approaches being used in these new
resources, particularly in light of both their prevalence and their lack of the editorial
oversight associated with traditional textbooks.
We have illustrated how apprenticeship approaches have influenced a great number of
texts within our field. Whether the framework is a traditional or a cognitive apprenticeship,
a fundamental feature of the method is observation. A text such as Richard King’s (2016)
book on orchestral and ensemble recording, which offers extremely valuable insights into
his recording process, is further evidence of the value placed upon the master practitioner
within our field and also of the inherent limitations of a textual format where this approach
is used. A paradox is thus present: the field craves the master practitioner, but the textual
modality compromises the ability of the learner to legitimately access the master’s
knowledge, skill and experience.
Lefford and Berg (2015) describe their use of structured observations within
an institutional audio programme, identifying time as a critical element of these
observations: the length of an observation affects the learner’s ability to identify the
frequency of a behaviour or particular occurrence and, therefore, their ability to
determine what is significant. Time is a luxury the internet affords, and though there
are challenges, there is also the potential to capture recording-studio activity and
present this to learners in a more authentic and meaningful way, creating opportunities
for situational observation and a more genuine connection to the
community of practice. Observation is a logical progression from the use of reference
materials such as the Izhaki and Corey texts, and it is in this area that disrupter
platforms show the most promise.
Further, we have identified that the handling of reference material is important in order
to situate material within the broader context and mitigate misconceptions, biases or
disagreements between texts. This is particularly important when linking to the concept of
observation and the potential of contemporary disrupters to provide new platforms
for learning. The evidence from our textual history points to the need for a clear indication
of where foundational knowledge is being presented and how it is being applied within
a given scenario. Such an indication allows learners to clearly grasp where they are
deficient, either in foundational knowledge or skill, while also illustrating where a given
master practitioner’s approach or technique presents a novel entry into the field of sound
recording.
More than anything, this chapter supports the idea that a singular philosophy or
methodology for the study of sound recording is actually limiting and detrimental to learners.
While texts such as Corey’s on ear training are specific and focused in their approach,
other texts suffer from the attempt to be both authoritative and comprehensive at the same
time. Careful consideration of a text’s approach (technical, comprehensive, methods) and
training methodology (Tönmeister, apprenticeship, cognitive apprenticeship), and not
limiting this evaluation to only the content covered, is vital if we are to carry forward the
lessons learned from our short history of music-production instruction and move towards
new and innovative methods of delivering content.

Notes
1. For further reading on the history of sound recording, we can recommend the following
texts: Schmidt Horning (2013); Burgess (2014); and Sterne (2003).
2. For further reading on the topic of cognitive apprenticeship theory, and the use of this for
higher education music technology programmes, please see Walzer (2017).

Bibliography
Ballou, G. (1987), Handbook for Sound Engineers: The New Audio Cyclopedia, Indianapolis,
IN: H.W. Sams.
Bernhart, J. and A. Honegger (1949), Traité de prise de son, Paris: Éd. Eyrolles.
Borwick, J. (1973), ‘The Tonmeister Concept’, Presented at the 46th Audio Engineering Society
Convention, New York, 10–13 September.
Borwick, J. (1976), Sound Recording Practice: A Handbook, London: Oxford University Press.
Burgess, R. J. (2014), The History of Music Production, New York: Oxford University Press.
Cambridge-MT (2019), ‘Mixing Secrets for the Small Studio’. Available online: http://www.
cambridge-mt.com/ms-mtk.htm (accessed 22 April 2019).
Corey, J. (2010), Audio Production and Critical Listening: Technical Ear Training, Amsterdam:
Focal Press.
David Glenn Recording (2019), ‘About David Glenn’. Available online: https://www.
davidglennrecording.com (accessed 22 April 2019).
Eargle, J. (1976), Sound Recording, New York: Van Nostrand Reinhold Co.
Eargle, J. (1986), Handbook of Recording Engineering, New York: Van Nostrand Reinhold.
Farmer, J. A., Jr. (1992), ‘Cognitive Apprenticeship: Implications for Continuing Professional
Education’, New Directions for Adult and Continuing Education, 55: 41–49.
Gibson, D. (1997), The Art of Mixing: A Visual Guide to Recording, Engineering, and
Production, edited by G. Petersen, Emeryville, CA: Mix Books.
Grassie, C. and D. F. Clark (2015), ‘An Integrated Approach to Teaching Electroacoustics
and Acoustical Analysis to Music Technology Students’, Presented at the 26th UK AES
Conference on Audio Education, Glasgow, Scotland, 26–28 August.
Huber, D. M. and R. E. Runstein (2017), Modern Recording Techniques, New York: Focal Press.
Izhaki, R. (2008), Mixing Audio: Concepts, Practices, and Tools, Oxford: Focal Press.
King, R. (2016), Recording Orchestra and Other Classical Music Ensembles, New York: Focal
Press.
Lapointe, Y. (2010), ‘William Moylan. 2007. Understanding and Crafting the Mix: The Art of
Recording’, Intersections: Canadian Journal of Music/Revue Canadienne de Musique, 31 (1):
209–213.
Lave, J. and E. Wenger (1991), Situated Learning: Legitimate Peripheral Participation,
Cambridge: Cambridge University Press.
Lefford, M. N. and J. Berg (2015), ‘Training Novice Audio Engineers to Observe: Essential
Skills for Practical Development and Analytical Reasoning’, Presented at the 26th UK AES
Conference on Audio Education, Glasgow, Scotland, 26–28 August.
Leyshon, A. (2009), ‘The Software Slump?: Digital Music, the Democratisation of Technology,
and the Decline of the Recording Studio Sector within the Musical Economy’, Environment
and Planning A: Economy and Space, 41 (6): 1309–1331.
Moylan, W. (2002), The Art of Recording: Understanding and Crafting the Mix, Boston, MA:
Focal Press.
Nisbett, A. (1962), The Technique of the Sound Studio, New York: Hastings House.
Nixon, P., T. Young, E. Brown and L. Wiltshire (2015), ‘Interdisciplinary Benefits: Encouraging
Creative Collaboration’, Presented at the 26th UK AES Conference on Audio Education,
Glasgow, Scotland, 26–28 August.
Owsinski, B. (1999), The Mixing Engineer’s Handbook, edited by M. O’Brien, Emeryville, CA:
Mix Books.
Pensado’s Place (2019), ‘Pensado’s Place’. Available online: https://www.pensadosplace.tv
(accessed 22 April 2019).
Polanyi, M. (1958), Personal Knowledge: Towards a Post-Critical Philosophy, Chicago:
University of Chicago Press.
Polanyi, M. (1966), The Tacit Dimension, Garden City, NY: Doubleday.
Pratt, D. D. (1998), Five Perspectives on Teaching in Adult and Higher Education, Malabar, FL:
Krieger Pub. Co.
Rumelhart, D. E. and D. A. Norman (1980), Analogical Processes in Learning [Report],
15 September. Available online: http://www.dtic.mil/docs/citations/ADA092233 (accessed
1 August 2019).
Rumsey, F. and T. McCormick (1992), Sound and Recording: An Introduction, Oxford: Focal
Press.
Runstein, R. E. (1974), Modern Recording Techniques, Indianapolis, IN: H.W. Sams.
Schloss, J. G. (2004), Making Beats: The Art of Sample-Based Hip-Hop, Middletown, CT:
Wesleyan University Press.
Schmidt Horning, S. (2004), ‘Engineering the Performance: Recording Engineers, Tacit
Knowledge and the Art of Controlling Sound’, Social Studies of Science, 34 (5): 703–731.
Schmidt Horning, S. (2013), Chasing Sound: Technology, Culture, and the Art of Studio
Recording from Edison to the LP, Baltimore: Johns Hopkins University Press.
Schoenberg, A. (1987), Arnold Schoenberg Letters, Berkeley: University of California Press.
Sterne, J. (2003), The Audible Past: Cultural Origins of Sound Reproduction, Durham, NC:
Duke University Press.
Tremaine, H. M. (1959), The Audio Cyclopedia, Indianapolis, IN: H.W. Sams.
Walzer, D. A. (2017), ‘The Curricular Soundtrack: Designing Interdisciplinary Music
Technology Degrees Through Cognitive Apprenticeship and Situated Learning’, in New
Directions for Computing Education, 143–160, Cham: Springer.
Weathervane Music (2014), ‘Shaking Through’. Available online: https://weathervanemusic.
org/shakingthrough/episode-index (accessed 22 April 2019).
16
Materializing Identity in the
Recording Studio
Alexa Woloshyn

Planes of mediation: The individual, society and the institution
The process of identity formation in the recording studio entails both conscious and
unconscious negotiations of broader aesthetic, social and industry values. Identity is neither
fixed nor inevitable. Anthropologist and sociologist of music Georgina Born outlines four
planes of social mediation to articulate ‘how music materializes identities’ (2011: 376). The
first plane recognizes that musical practices produce social relations – between musicians,
producers, technicians, etc. The recording studio creates its own ecology, which is sonified
and can then be analysed. The second plane recognizes music’s ability to create imagined
and virtual communities. Music plays a significant role in the expression of identity of
the music practitioners as well as the identity formation of listeners. The third plane
recognizes that music emerges from wider social identity formations, such as class, age,
race, ethnicity and so on. These broader social relations of the individuals in the recording
studio will variously shape the specifics of the identities materialized in music. Finally, the
fourth plane recognizes the institutional mediation of music as it is produced, reproduced
and transformed. The recording studio is a central institutional space that itself mediates
industry standards through individuals (e.g. producers) and technologies.

Staging the voice in popular music


Due to popular music’s emphasis on the voice – in performance staging, mixing, the
importance of lyrics, the star persona, etc. – vocal performance and staging are
prioritized in the recording studio. Identity is constructed and then mediated rather than
revealed from some inherent form. Serge Lacasse describes vocal staging as a ‘deliberate
practice whose aim is to enhance a vocal sound, alter its timbre or present it in a given
spatial and/or temporal configuration with the help of [a] mechanical or electrical process’
(2000: 4). Vocal staging and its outcomes directly impact the listener, resulting in certain
connotations regarding presence, intimacy and emotion. It starts with the voice and what
the microphone picks up, but production choices such as double tracking and filtering
create the potential for playfulness, aloofness and so on (Lacasse 2010). For example, in
Tori Amos’s ‘’97 Bonnie And Clyde’, studio production emphasizes paralinguistic sounds
(e.g. breathing, swallowing). By combining these with specific vocal delivery choices (e.g.
whispering vs. murmuring), Amos constructs a materiality to her voice that serves her
feminist critique of Eminem’s original song.
Both Nicola Dibben (2009b) and Lacasse (2000) point to the contrast between a
voice staged ‘dry’ and one staged with reverberation as a strategy for constructing relationality,
specifically between the voice and the listener. According to Dibben, a listener who is closer
to a sound source will hear a larger proportion of direct sound than reverberant sound:
close microphone placement ensures a high level of direct sound. This close microphone
placement results in a ‘flat’ voice, which is ‘close-up sound, sound spoken by someone close
to me, but it is also sound spoken toward me rather than away from me. Sound with low
reverb is sound that I am meant to hear, sound that is pronounced for me’ (Altman 1992:
61, italics in original). Recordings also indicate proximity through amplitude in a parallel
way to sound sources in real life. Dibben explains: ‘sounds which are louder in the mix in
a recording tend to be heard as being nearer the listener than sounds which are quieter’
(2009b: 320). This relationality enables the emotional connection listeners feel to vocalists.

Constructing emotion, feminism and nationalism with Björk
Dibben argues that Björk’s music, specifically on Vespertine and Homogenic, represents
and possibly incites emotional experiences; also ‘it contributes uniquely to the idea of
what emotion is and of how it is perceived’ (2006: 197). Because Björk’s involvement
in all matters of studio processes (e.g. beat-making, writing lyrics, processing sounds,
collaborating with producers) is emphasized, her work is read as autobiographical. Thus,
studio production creates a space for connection: for example, in ‘Unison’, spatialization,
vocal staging and lyrics work in tandem to immerse the listener ‘in an increasingly full
sonic space’ (177) while simultaneously situating the listener in close proximity to Björk due
to close miking (and intimate lyrics). On Vespertine, ‘Even when the virtual space evoked is
seemingly extended through a deepening of the aural perspective, a combination of vocals
and delicate percussive sound is often placed at the very front of the mix […], thereby
reinforcing the (virtual) proximity of the listener to the source materials involved’ (180).
Björk’s music and the discourse surrounding it demonstrate the capacity for the identity
of a single artist – constructed principally in the studio – to stand in for national identity.
Indeed, Björk uses her musical sound to argue that Icelandic national identity unites nature
and technology. Dibben argues that ‘technologised nature’ or ‘naturalised technology’ is
conveyed through the following musical strategies: ‘the “resistance” created between her
voice and the metric grid of beats of the sonic background of her music; the integration
of acoustic and synthesized sources; the mimesis of natural sounds as the timbre of beats;
miniaturisation of beats; and exploitation of the failures and idiosyncrasies of music
technology’ (2009a: 142).
Shana Goldin-Perschbacher (2014) broadens Björk’s performance of Icelandic national
identity to examine the role of ‘difference feminism’ on Medúlla and Volta. The recording
process is central to the finished album and its message: ‘[Björk] laid down her own
tracks and then brought in guest vocalists, recording their contributions and picking
and choosing what she wanted to include with what she had already created’ (Goldin-
Perschbacher 2014: 63). This production process exemplifies ‘emergence’, or ‘the gradual
materialization, introduction or becoming of a fuller texture or more complete musical idea’
(Malawey 2011: 143), whereby musical emergence parallels motherhood as emergence.
By pointing to ‘peace politics through maternal sensuality’ (Goldin-Perschbacher 2014:
64), Björk constructs a particular feminist (i.e. maternal) identity for herself in order to
foster a kind of global unity for the listeners – to transcend nation-state boundaries and
the strong influence of nationalism on identity. This shift to a ‘one tribe’ message was a
response to 9/11: to articulate the dangers of getting distracted by ‘religion and patriotism’
(Björk quoted in Goldin-Perschbacher 2014: 61) and to advocate for embracing humanity
through the shared expression of her ‘close, natural, and fallible’ voice (69). Björk’s diverse
vocal timbres articulate various identity positions, in this case both mother and sexual
being, thus rejecting the position of ‘mother’ as asexual (though embodied). Björk rejects
certain stereotypes of femininity and motherhood in Western culture, but not all. Björk’s
song ‘Hope’ in particular emphasizes the multiplicity of female perspectives, a hallmark of
difference feminism.
Björk continues to be a dominant focus of popular music scholarship,
and her musical processes offer insight into diverse ways in which the recording studio
can create identity. Her work in the recording studio pushes the boundaries of normative
constructions of gender, sexuality and nationality. Other artists create non-normative
gender and sexual identities in the recording studio.

Identity processes and generic conventions: Gender, queerness and the cyborg
Meshell Ndegeocello is a Black queer musician whose constantly changing musical
practice eschews labels, including those most typically placed on her: black feminist,
black nationalist/Africanist. Ndegeocello constructs multiple personae to critique identity
politics’ ‘inherent reinvestment in intertwining heteropatriarchal, racist and classist
formations’ (Goldin-Perschbacher 2013: 473). Goldin-Perschbacher’s application of affect
studies emphasizes Ndegeocello’s identity(ies) as process over essence and argues against
deterministic readings of artists’ identity politics into musical meaning. In this case,
Ndegeocello performs a queer masculinity (in sound recording and live performance) to
salvage both masculinity (which generally rejects ‘feminine’ qualities such as vulnerability)
and femininity (which generally rejects ‘masculine’ qualities). Ndegeocello demonstrates
how social and institutional notions of gender, race and sexuality intersect with genre.
The perceived normativity of Americana positions the genre well for transgender and
queer artists to share perspectives with audiences not typically exposed to transgender
and queer artists (Goldin-Perschbacher 2015). As a storytelling genre, Americana
allows transgender and queer artists to express their identity politics and critique
heteropatriarchal white supremacy from multiple angles. Rae Spoon and Namoli Bennett,
for example, illustrate that ‘Sung music is a performative articulation of the “self-in-
progress,” a musicking in which the interplay of individual and collective identities
compels us to make fresh sense of ourselves and the world’ (Goldin-Perschbacher 2015:
796). TransAmericana singing is open and diverse, which allows trans and gender non-
conforming musicians to express their identities vocally.
Jeff Buckley’s strategy was to keep the pronouns when singing songs written initially for
female performers; also, he did not ‘clean’ up his recordings to remove the body (Goldin-
Perschbacher 2003). His recordings create what Goldin-Perschbacher calls an ‘unbearable
intimacy’: ‘Unbearable intimacy seems to arise not only because Jeff is so intimate that he
displays his vulnerability, but also because this intimacy encourages an empathic response
in listeners, one which encourages listeners to search their own souls, sometimes to feel
painfully vulnerable themselves’ (309). His vocal performance and staging in the recording
studio construct a compelling identity that creates a strong sense of connection with
listeners (i.e. second plane of mediation).
Artists such as Buckley expose the internal technologies of the voice in the body. When
this presence is combined with other production and post-production choices within
genre expectations, popular music can challenge gendered logic and reveal the cyborg.
The association of The Carpenters with ‘easy listening’ reflects the intersection of Born’s
third and fourth planes of mediation in which societal norms regarding gender and industry
institutional standards position the group as outmoded. Freya Jarman-Ivens claims that the
double bias against both female musicians and easy listening, non-countercultural genres
is a potential opening for articulating the queer because the ‘derision [of The Carpenters]
is based on such gendered logic’ (2011: 67).
The studio becomes a site for the emergence of the cyborg. In this case, the cyborg
materializes through the subservience of Karen Carpenter’s internal technologies of the
voice to the external technologies controlled by Richard Carpenter. Jarman-Ivens applies
Roland Barthes’s grain of the voice and the pheno-song/geno-song binary to distinguish
between the effects of the ubiquitous overdubbing and the rare audibility of Karen’s internal
technologies (i.e. the voice in the body). For example, Ray Coleman explained that Richard
left in Karen’s deep breaths on ‘Goodbye To Love’. While this sound is audibly striking,
Jarman-Ivens argues we know this sound is there only through Richard’s permission –
that any such sound is ‘always subject to Richard’s technological manipulation’ (2011:
87). Thus, the grain is not actually breaking through independently but is ‘an artificial
construct’ (87). The result of this situation – that Karen’s voice as created through internal
technologies is always subservient to the will of external technologies – materializes the
cyborg.
Developments in production and post-production technologies mean the cyborg identity
emerges more readily in the recording studio. The specific construction of
identity, though, relies on Born’s first plane: the social relations in the studio. Producers
are central figures in the popular music industry, and the nature of their relationship to
artists and role within projects vary (Burgess 2002; Howlett 2012). For The Carpenters,
the recording studio was characterized by Richard’s control over Karen’s voice, a process
that mirrors broader societal gender hierarchies. This emphasis on her ‘natural’ voice
contrasts with her initial role as drummer in the Richard Carpenter Trio. Richard himself
praised her instrumental virtuosity and described how she could ‘speedily maneuver the
sticks’ (Richard Carpenter quoted in Schmidt 2012: 7). Soon, she combined singing with
drumming, but by the early 1970s, Karen was presented as lead singer and only occasional
drummer. The recording studio in particular, under Richard’s direction, capitalized on her
unique voice.
Many artists choose to self-produce, a choice that is increasingly viable with the
affordability of commercial-level equipment. Paula Wolfe (2012) explains that, for many
women artist-producers, this ‘creative retreat’ has been essential to their developing
‘confidence in [their] technical abilities’ within the male-dominated commercial studio.
Thus, unlike Karen Carpenter’s, British electronica artist Imogen Heap’s cyborg is one that
defies gendered logic in the studio as she positions herself as empowered and embodied
(i.e. feminine).
Imogen Heap has demonstrated her technological prowess in the recording studio with
her solo albums I Megaphone (1998), Speak for Yourself (2005), Ellipse (2009) and Sparks
(2014). Developments in recording technology that increase portability and decrease cost
allow artists such as Heap to sidestep institutional and commercial structures to create
a sonic identity of their own design (Woloshyn 2009). In Heap’s case, this means that a
female musician takes on the role not only of vocalist/performer but also of producer. The
cyborg image becomes useful not only in describing the sonic merging of her acoustic
voice with technological manipulation in the sonic outcome of Heap’s work, but also in
articulating a body-machine unity in the musical processes located in the recording studio
(and then often re-enacted on a live stage). The successful track ‘Hide And Seek’ exemplifies
the cyborg identity, as the vocal input from her physical voice is required for the vocoder
and harmonizer processing. By taking on a cyborg identity, Heap rejects the association
of technology with the masculine. She takes ownership of the often genderless (or post-gender)
cyborg by embracing the feminine in her media and performance image. Since
‘Hide And Seek’, Heap has made significant progress in translating the cyborg identity of
the recording studio to the live stage through the MI.MU gloves she created with a team of
electronic, software, design and textile experts. Now in both spaces, Heap’s gestures and
vocal utterances are seamlessly captured, transformed and reimagined through real-time
digital processing.
The recording studio has become a playground for exploring and challenging normative
constructs of gender and sexuality. Artists from George Michael and Boy George to Lady
Gaga and Azealia Banks emphasize queerness in diverse ways. When Stan Hawkins asks
‘what makes pop music queer?’ (2016), he looks at heavily produced and playful music –
music that some might consider antithetical to resistance. Hawkins argues that pleasure
is central to the disruptive possibilities because the collaborations inherent in any large-
scale pop artist’s output offer queerness as a vision for a utopian future (a notion drawn from José Esteban
Muñoz). This pleasure is sonically expressed through the musical stylings of camp, pop
art, irony, parody, disco, kitsch, glamour and flair – stylings which are constructed in the
recording studio and reinforced through videos and live performances. For example, the
Scissor Sisters celebrate excess and affectation in ‘Skin Tight’ by pairing a simple harmonic
sequence and the requisite four-on-the-floor disco beat with high production values in
filtering, placement in the mix and equalization (EQ) (Hawkins 2016: 107). Because of the
voice’s central position in popular music, vocal delivery is a ‘prime signifier of identity’ (2).
The voice articulates the personality of the performer; gendered identification is inevitable,
but the voice performs rather than marks a stable identity. For example, Hawkins’s analysis
of Le1f ’s ‘vocal costuming’ traces five ‘costume changes’ linked to the singing style. These
production choices showcase his voice as haute couture. Hawkins explains: ‘It seems
that Le1f knows full well that to be black and queer stands for the formidable Other and
that musical styles can be exploited to represent primitiveness and mysteriousness’ (216).

Hierarchies of race and ethnicity in the recording studio
While the wider social identity formations based on class, age, race, ethnicity and so on
mediate all identities constructed through popular music, specific genres and musicians
emphasize this mediation more strikingly. The recording studio becomes its own ecology
consisting of the various artists, technicians and producers involved in a given project
(Born’s first plane). These social relations mediate how the recorded music materializes
identity. Certain contexts highlight wider social identity formations (third plane), in
particular when societal hierarchies are replicated in the recording studio. For example,
Paul Simon’s Graceland signalled the kinds of musical and social significations possible in
a collaborative album, particularly one associated with the political context of apartheid
South Africa (Meintjes 1990). Collaboration – musical and social – is entwined in the
processes and sounds of the album and will be interpreted depending on one’s political
stance on South Africa and one’s belief about music’s autonomy.
Black Zulu musicians struggle for power in the recording studio – to have agency over
the kind of Zulu music that will be shared on the global market (Meintjes 2003). Power
dynamics in the studio (both in apartheid and post-apartheid South Africa) ensure that
white sound engineers are the gatekeepers for what Zulu music will hit the international
market. In her book Sound of Africa! (2003), Louise Meintjes discusses one engineer who
speaks in essentializing terms, which, Meintjes argues, ‘perpetuates inequality in everyday
social interaction in and out of the studio and exposes the social, especially racialized,
mechanisms through which studio practice is rarefied’ (103). While this particular
engineer does not support apartheid ideology, he still weaponizes it against the musicians
to explain ‘the divisions of labor and authority in the studio’ (103). The divisions come
down to not only race but also class and ethnicity, as working-class musicians have ‘the
least-empowered position’ (104). Black musicians are denied similar access to studio space
and new technology as white musicians as a means of innovation. As a result, these artists
continue to be confined to stereotypically ‘authentic’ Zulu music; they are denied agency
as modern subjects.
In powwow recording culture, the recording studio – especially one owned by non-
Natives (such as Arbor Records) – is a place where assumptions about race and culture
come to the fore. Christopher Scales, thus, refers to powwow music labels as ‘intercultural
contact zones’, ‘where competing commonsense notions about music, musical performance,
musical ownership and authorship, and “normal” rules of social conduct and social
relations continually rub up against one another’ (2012: 21): compromise and concession
are required and negotiated by both sides. The common assumption is that powwow music
expresses Native identity. However, Scales argues:
This expression or meaning or articulation is never guaranteed and must be actively produced
by all involved in the social production and consumption of powwow music. Particular
elements of Native identity must be articulated to particular elements of powwow musical
or choreographic or social style. Certain ideas about the nature, definition and content of
Native tradition must be articulated to particular social and musical values. (9)

Reverb, EQ and compression are employed as specific strategies for achieving a sense
of ‘liveness’, which is deemed crucial for producers and consumers seeking ‘cultural
authenticity’. At the same time, musicians can be eager to use more obvious forms of
electronic mediation in the studio as innovation and thereby construct modern Indigenous
identities.
The recording studio is thus not a neutral, open space. The position of the
recording studio within the recording culture of a specific genre is central to how we
understand identity formation in that space. The Indigenous hip-hop scene similarly seeks
to construct a modern Indigenous identity through a contemporary musical expression
and approach to production/post-production that still engages an Indigenous worldview
and history.
Charity Marsh explains that hip-hop culture becomes ‘a way to express and make sense
of present-day lived experiences, including the ongoing legacies of […] colonization’ (2012:
347–348). The work of Indigenous hip-hop artists in the recording studio allows them and
the youth who consume their music to ‘convey the contradictions, struggles, resistances,
and celebrations of their current lived experiences while simultaneously attempting to
acknowledge and respect the (hi)stories of their ancestors’ (348). Saskatchewan-based
female Cree hip-hop emcee Eekwol (Lindsay Knight) exemplifies the strong relationship
between an artist’s work in the recording studio and the broader community (i.e. Born’s
second plane). Marsh emphasizes Eekwol’s role as a storyteller in ‘Apprento’ from the
album Apprentice to the Mystery. By calling on the importance of storytelling in Indigenous
culture and including many recognizable Indigenous sounds, such as round dance singer
Marc Longjohn, flute and rattle, Eekwol ‘challenges the listener to really hear her stories
and to embody the affects of the storytelling act and the storyteller’s meaning. Eekwol
puts herself, and her contradictions, out there, simultaneously becoming vulnerable and
powerful as she dares the listener to reflect and to move’ (366).
Eekwol mobilizes the globalization of hip-hop in service of her local context. In
recording albums, Eekwol has found a safe place to speak out, which is particularly
important for a female emcee in a male-dominated scene. The increased access to
DIY technology in the recording industry (fourth plane) allows Eekwol and her brother
Mils to produce their own work. This Indigenous-centred ecology of Eekwol’s work in the
recording studio avoids interference from non-Indigenous producers who have their own
agenda regarding hip-hop and Indigenous identity.

Cross-/trans-cultural creativity in the recording studio
The recording studio can be a site of creativity for Indigenous musicians, particularly to
reflect the heterogeneity of Indigenous music. The stylistic distinctions of various repertoires
are often captured and deliberately constructed through processes in the recording studio.
Beverley Diamond explains: ‘Decisions about style and arrangement often involve teams
of collaborators. […] style can reflect cross-cultural alliances, the exigencies of textual
expression, or personal aesthetic preferences’ (2008: 152). The recording studio becomes
a site of identity formation through its social processes (first plane), the production/post-
production choices that signal specific relationships to tradition and contemporary musical
practices (third plane), and the engagement with the recording industry more broadly (e.g.
major vs. indie labels; fourth plane).
The social relations in the studio enact the values of Indigenous culture and point to
potential tensions in collaboration. As Native American artists have enjoyed a surge in
productivity, concerns about commodifying Native culture – given the long history of
ethnographic ‘preservation’ – create an opportunity for Native American artists to assert
agency over the definition of the commodified product (Diamond 2005). Diamond explains
that, in order to increase their agency within the recording studio ecology, the majority of
artists work outside of the major labels; they require the following when choosing where
and with whom to work: (1) trust within collaboration, (2) more input in processes, and (3)
freedom to include diverse material on a single album rather than to target a specific genre.
Diversity on an album means that traditional songs may be juxtaposed with pop styles,
such as on Hearts of the Nation by Aboriginal Women’s Voices (1997). Some artists aim
for a fusion. In either case, the recording studio may be variously used: in some cases,
artists isolate different parts and piece them together layer by layer; in other cases, the
social interaction of musicians in creating the sounds of traditional music is central to the
mode of recording. Strategies in-between or in combination are also possible. For example,
when the Wallace family recorded their album Tzo’kam (2000), they employed different
recording techniques in four different studios; cuts from each studio were included on the
final album.
Diamond’s work with Native artists stresses one common priority, especially for women
artists: ‘[they] regarded the choice of collaborators to be part of the production process,
a part that is as significant as the more tangible techniques of production’ (2005: 125).
What constitutes the ‘right’ collaboration team differs according to industry expectations and genre
conventions and may be specific to the artist. What Diamond found with these Native
artists was that they wanted the recording studio to be a space for asserting an Indigenous
worldview and individual and community identity with Indigenous and non-Indigenous
collaborators. Production and post-production choices in the recording studio have
consequences on the presentation and perception of regional, national and individual
identity for these Native artists.
Indigenous artists around the world also see the recording studio as a site for
experimentation when creating a sonic Indigenous identity. For example, Sámi artists
often experiment in the recording studio.1 Some use popular genres in combination with
traditional genres such as joik to express a modern Sámi identity. Sámi musicians have found
ways to translate traditional values and musical practices in joik in the studio, especially
with the ability to layer sounds and use archival recordings. These approaches are ‘part of
a larger project of making Sámi culture tangible and visible – and audible’ (Diamond 2007:
44). Often the vocal production techniques aim to distinguish between joik and song, and
to capture the individuality of each singer. Artists try to present a voice with ‘soul’, and the
production/post-production techniques that achieve that can vary greatly. Diamond does
highlight a gender divide not in joik but in music categorized as ‘song’ in which producers
often aim for a feminized sound, despite the discomfort of the singers themselves, which
reflects a gendered hierarchy of wider social relations.
Like folk and traditional musicians from various musical practices, many Sámi artists
consider the studio to be completely different from live performance, even ‘dead’ or ‘awful’
(26). Diamond notes the tension between ‘indigenous concepts and norms [… and] the
industry norms of studio production’ (26), a tension that arises specifically within rock
and pop industry conventions (Hilder 2017). For example, when recording joik, artists
are faced with important decisions: some choose to ‘adjust’ joik to 4/4 time, while others
seek arrangements that avoid metrical clarity or achieve cross rhythms. Many Sámi artists
value creating a sense of ‘liveness’ in the recorded product. One strategy is to mix in a field
recording with a studio recording.
The institutional mediation of music has a strong influence on the construction of Sámi
identity. Sámi recording culture intersects with the broader institutions of government-
sponsored cultural processes as many projects are funded with state subsidies and funding
for cultural projects tends to focus on ‘audibly yoik-based’ music (Diamond 2007: 25).
Sámi-owned record labels and Sámi radio stations are both crucial institutions for the
circulation of these records. Richard Jones-Bamman underlines the importance of these
recordings: ‘recordings of Saami music have played a significant role in how the Saami
conceive of themselves collectively, and how they would prefer to be perceived by others’
(2001: 191). As a result, Jones-Bamman can point to the contrast between Sven-Gösta
Jonsson singing ‘I am Lapp’ in ‘Vid Foten Av Fjället’ (1959) and Jonas Johansson singing
‘I am Saami’ in ‘Goh Almethj Lea’ (1991) to illustrate the transformations in Sámi popular
music. The choices made in the recording studio both reflect and inspire conversations
about ethnic identity in the broader Sámi community.
Popular music evidences the global circulation of artists and recordings. As a result,
many artists employ strategies in the recording studio to position an artistic identity
that is both global and local: ‘glocal’. For example, Roderic Knight (1989) describes the
‘Mande sound’ as a combination of popular sounds, such as Latin rhythms and popular
instruments (e.g. electric guitar) as well as traditional instruments and songs, specifically
from the jali tradition. The contemporary recording industry for popular musicians in West
Africa reflects the transnational/transcultural exchange of various popular musics. This
exchange of popular music codes can be brought directly into the political sphere, such as
with Zambia’s Zed Beats, which blends Zambian rhythms and languages with global pop
and dance trends (e.g. reggae and hip-hop). Matthew Tembo argues that the tune ‘Dununa
Rivesi’, which was created for the Patriotic Front party during the 2016 Zambian national
election, ‘changed the ways people hear politics’ (2018).
Popular music across Africa explores this transnational/transcultural exchange in
various ways, depending on the marketing scale. For example, local consumers from a
shared culture will understand (and even seek out) popular music that maintains nuances
of local traditional musics. What is ‘traditional’, though, will be read through a complex
of factors, including class and generational divide, and will be interpreted variously
as complicit or resistant, particularly within the history of colonialism in Africa. The
international world music market requires artists to find a broadly appealing balance
between familiarity (e.g. pop song form, metric regularity) and ‘exoticism’ (e.g. traditional
instrumental and vocal timbres); even in this marketing context, though, a perception of
‘progression’ or ‘resistance’ may be part of consumer appeal.
Sophie Stévance describes Inuk vocalist Tanya Tagaq as a transcultural/transnational
artist who ‘rewrites the symbolic dimension of katajjaq [traditional Inuit vocal games, or
throat singing] according to her wishes and values, which are rooted in two cultures’ (2017:
50), the two cultures being Inuit and Western. Stévance declares: ‘In terms of the subject
rather than the object, Tagaq works against the grain of these conventions by consciously
exercising control over them for her own purposes’ (54). Tagaq creates an ecology in the
studio that is conducive to her transcultural expression – her ‘right to modernity’ (48),
which is constructed through specific choices that relate to katajjaq and contemporary
electro-pop studio techniques (or ‘phonographic staging’). For example, overdubbing
allows Tagaq to interact with herself as a katajjaq partner on ‘Qimiruluapik’. The mix allows
the listener to ‘distinguish each of the parts’ while still ‘contributing to the merging of the
vocal timbre’. Her diverse vocal sounds are each treated like individual instruments in the
mix. For example, amplifying the low frequencies in her voice (similar to the EQ of a kick
drum) lends a ‘contemporary’ feel to the track.
On ‘Caribou’, the production techniques on the voice, in particular, create a pop/rock song, even as Tagaq continues to call upon her katajjaq influences. The chorus is particularly
effective as it layers four different vocal techniques: the layering of tracks means that Tagaq
sonifies her transnationalism in a single sonic moment (Stévance and Lacasse 2019). Tagaq
creates an individual transnational artistic identity based on diverse cultural references, or
what Stévance and Lacasse call ‘aesthetic cosmopolitanism’ (from Regev 2013). Production
and post-production techniques are crucial for establishing Tagaq’s identity that emerges
from integrating musical and cultural codes. They construct an identity that bypasses
the restrictive binary of traditional/modern so often imposed on Indigenous artists by
producers, consumers and industry labels.
Tagaq’s two latest albums (Animism and Retribution), in conjunction with her social
media community (i.e. second plane), assert an identity as a modern Inuk woman and
reject violations against Inuit sovereignty (Woloshyn 2017). For example, ‘Aorta’ on
Retribution references katajjaq in some of the vocal sounds and Western pop music in the
steady drum beat and song structure, but with a vocal virtuosity and alarming sound world
reflecting the album’s theme. The extra-vocal sounds inherent to katajjaq (e.g. breaths and
grunts) give Tagaq the freedom to include a vast array of vocal sounds, many of which
portray a sensuality and sexuality that are central to her identity as an Inuk woman. The
recording studio is a space in which Tagaq gathers individuals whose collaborations allow
her to express her artistic agency, and thus her agency as an Inuk woman.

Rejecting neutrality, embracing potentiality


This chapter has focused on aspects of collective identity that are materialized in the
recording studio. Collective identities are the broader contexts from which one’s particular
self emerges and is expressed. Numerous examples discussed here also point to the potential
to capture and construct individuality as sonically expressed through vocal styles (and
how those styles are staged), unique use of extra-vocal sounds for phrasing, distinctive
instrumental timbres and so on.
Some aspects that are central to an artist’s identity (both unique and in relation to a
collective) will not translate audibly. For example, Tagaq’s performance approach, both live
and in studio, is characterized by her corporeality, with a wide range of physical gestures,
many of which have corresponding sets of vocalizations. Her movements inevitably affect
her vocalizations, and some of her movement may well be captured and preserved on
an audio recording. However, the exact relationship between movement and sound will be
relatively inaudible to a listener. What we hear is what happens when Tagaq finds the right
ecology (first plane) for her uninhibited musical expression in the studio.
Sonic traces of identity will also be interpreted variably by listeners, often resulting
in imagined communities (second plane); production choices related to arrangement,
tracking, mixing and mastering will influence the formation of listener subcultures in
which artists are hailed as heroes or villains, such as within the ‘lo-fi as authentic’ versus ‘hi-fi as corporate sell-out’ debates of certain music communities.
Those of us who were not part of the record-making process often will not – cannot –
know what exactly happened to create the sonic outcome we hear. For example, the final
version of Kate Bush’s ‘Lake Tahoe’ is missing a piano note; that take ‘just had a feel about it’,
and because Bush was confident in her own musical direction, the ‘mistake’ made it through
(Wolfe 2012). There is a sonic trace of this ‘right’ performance, but without the story, what
do we hear in that piano part? Perhaps we will never hear it as ‘missing’ anything, because
we are unaware of what was intended.
The recording studio is a space of negotiation. Because the individuals in the studio
mirror broader social identity formations, inequality, oppression and racism can exist in
the recording studio. Institutional demands regarding genre and marketability as well as
access to technology may also limit what identities are created in the studio. Artists do not
share the same levels of agency in determining what elements of collective and individual
identities will be embraced and in what ways. Artist-producers such as Björk and Heap
have more authority over such matters. Emerging artists can be particularly vulnerable to
the hierarchies within the popular music industry.
In her recent authorized biography, Buffy Sainte-Marie discusses her early years with
Vanguard Records: ‘I wish that I had been able to choose the takes because Vanguard had
a certain perception of me, I think, and really wanted to rub it in. […] In my first couple of
records, whoever was choosing takes wanted me to sound like I was old and dying. I think they
imagined that maybe I was a junkie or they probably thought that I was going to be a young
casualty’ (Warner 2018: 64). Nonetheless, Sainte-Marie’s career, like countless others, also reminds us that the recording studio offers artists the potential to express diverse and sometimes subversive identities. Contrast Sainte-Marie’s early years with Vanguard with her 2015 album Power in the Blood, on which the production is a full partner to her artistic vision, creating what Warner calls ‘a new era in protest music and resistance’ (2018: 244).
The recording studio is not a neutral space. It is a space of potential and process. These
diverse examples demonstrate the exciting opportunities for artists to use studio techniques
to construct identities that engage with and challenge the broader society.

Note
1. Both Sámi and Saami are common spellings. The traditional music genre joik is also often
spelled yoik. For purposes of consistency, this chapter will use Sámi and joik unless a
direct quotation or reference uses a different spelling.

Bibliography
Altman, R. (1992), ‘Sound Space’, in R. Altman (ed.), Sound Theory, Sound Practice, 46–64,
New York: Routledge.
Born, G. (2011), ‘Music and the Materialization of Identities’, Journal of Material Culture,
16 (4): 376–388.
Burgess, R. J. (2002), The Art of Music Production, London: Omnibus Press.


Diamond, B. (2005), ‘Media as Social Action: Native American Musicians in the Recording
Studio’, in P. D. Greene and T. Porcello (eds), Wired for Sound: Engineering and Technologies
in Sonic Cultures, 118–137, Middletown, CT: Wesleyan University Press.
Diamond, B. (2007), ‘“Allowing the Listener to Fly as They Want To”: Sámi Perspectives on
Indigenous CD Production in Northern Europe’, World of Music, 49 (1): 23–48.
Diamond, B. (2008), Native American Music in Eastern North America: Experiencing Music,
Expressing Culture, New York: Oxford University Press.
Dibben, N. (2006), ‘Subjectivity and the Construction of Emotion in the Music of Björk’,
Music Analysis, 25 (1–2): 171–197.
Dibben, N. (2009a), ‘Nature and Nation: National Identity and Environmentalism in
Icelandic Popular Music Video and Music Documentary’, Ethnomusicology Forum, 18 (1):
131–151.
Dibben, N. (2009b), ‘Vocal Performance and the Projection of Emotional Authenticity’, in
D. B. Scott (ed.), The Ashgate Research Companion to Popular Musicology, 317–334,
Farnham: Ashgate.
Goldin-Perschbacher, S. (2003), ‘“Unbearable Intimacy” and Jeff Buckley’s Transgendered
Vocality’, Proceedings of the 2003 IASPM-International Conference, Montreal.
Goldin-Perschbacher, S. (2013), ‘The World Has Made Me the Man of My Dreams: Meshell
Ndegeocello and the “Problem” of Black Female Masculinity’, Popular Music, 32 (3):
471–496.
Goldin-Perschbacher, S. (2014), ‘Icelandic Nationalism, Difference Feminism, and Björk’s
Maternal Aesthetic’, Women and Music: A Journal of Gender and Culture, 18 (1): 48–81.
Goldin-Perschbacher, S. (2015), ‘TransAmericana: Gender, Genre, and Journey’, New Literary
History, 46 (4): 775–803.
Hawkins, S. (2016), Queerness in Pop Music: Aesthetics, Gender Norms, and Temporality, New
York: Routledge.
Hilder, T. R. (2017), ‘Sámi Festivals and Indigenous Sovereignty’, in F. Holt and A.-V. Kärjä
(eds), The Oxford Handbook of Popular Music in the Nordic Countries, 363–378, New York:
Oxford University Press.
Howlett, M. (2012), ‘The Record Producer as a Nexus’, Journal on the Art of Record Production,
(6). Available online: https://www.arpjournal.com/asarpwp/the-record-producer-as-nexus/
(accessed 2 August 2019).
Jarman-Ivens, F. (2011), Queer Voices: Technologies, Vocalities, and the Musical Flaw, New
York: Palgrave Macmillan.
Jones-Bamman, R. (2001), ‘From “I’m a Lapp” to “I am Saami”: Popular Music and Changing
Images of Indigenous Ethnicity in Scandinavia’, Journal of Intercultural Studies, 22 (2):
189–210.
Knight, R. (1989), ‘The Mande Sound: African Popular Music on Records’, Ethnomusicology,
33 (2): 371–376.
Lacasse, S. (2000), ‘“Listen to My Voice”: The Evocative Power of Vocal Staging in Recorded
Rock Music and Other Forms of Vocal Expression’, PhD thesis, University of Liverpool,
Liverpool.
Lacasse, S. (2010), ‘The Phonographic Voice: Paralinguistic Features and Phonographic
Staging in Popular Music Singing’, in A. Bayley (ed.), Recorded Music Society, Technology,
and Performance, 225–251, Cambridge: Cambridge University Press.
Malawey, V. (2011), ‘Musical Emergence in Björk’s “Medúlla”’, Journal of the Royal Musical
Association, 136 (1): 141–180.
Marsh, C. (2012), ‘Bits and Pieces of Truth: Storytelling, Identity, and Hip Hop in
Saskatchewan’, in A. Hoefnagels and B. Diamond (eds), Aboriginal Music in Contemporary
Canada: Echoes and Exchanges, 346–371, Montreal; Kingston: McGill-Queen’s University
Press.
Meintjes, L. (1990), ‘Paul Simon’s Graceland, South Africa, and the Mediation of Musical
Meaning’, Ethnomusicology, 34 (1): 37–73.
Meintjes, L. (2003), Sound of Africa!: Making Music Zulu in a South African Studio, Durham,
NC: Duke University Press.
Regev, M. (2013), Pop-Rock Music: Aesthetic Cosmopolitanism in Late Modernity, Cambridge:
Polity.
Scales, C. (2012), Recording Culture: Powwow Music and the Aboriginal Recording Industry on
the Northern Plains, Durham, NC: Duke University Press.
Schmidt, R. L., ed. (2012), Yesterday Once More: The Carpenters Reader, Chicago: Chicago
Review Press.
Stévance, S. (2017), ‘From Throat Singing to Transcultural Expression: Tanya Tagaq’s Katajjaq
Musical Signature’, in S. Hawkins (ed.), The Routledge Research Companion to Popular
Music and Gender, 48–62, Abingdon: Routledge.
Stévance, S. and S. Lacasse (2019), ‘Tanya Tagaq, A Cosmopolitan Artist in the Studio’,
in S. Zagorski-Thomas, K. Isakoff, S. Lacasse and S. Stévance (eds), The Art of Record
Production: Creative Practice in the Studio, 21–37, Abingdon: Routledge.
Tembo, M. (2018), ‘“Dununa Rivesi” (“Kick Back”): Dancing for Zambia’, Bring the Noise:
Popular Music Studies: Ethnomusicology Review, 2 March. Available online: https://www.
ethnomusicologyreview.ucla.edu/content/“dununa-rivesi”-“kick-back”-dancing-zambia
(accessed 2 August 2019).
Warner, A. (2018), Buffy Sainte-Marie: The Authorized Biography, Vancouver: Greystone
Books.
Wolfe, P. (2012), ‘A Studio of One’s Own: Music Production, Technology and Gender’, Journal
on the Art of Record Production, (7). Available online: http://www.arpjournal.com/asarpwp/
a-studio-of-one’s-own-music-production-technology-and-gender (accessed
2 August 2019).
Woloshyn, A. (2009), ‘Imogen Heap as Pop Music Cyborg: Renegotiations of Power, Gender
and Sound’, Journal on the Art of Record Production, (4, Suppl. to ARP08). Available online:
https://www.arpjournal.com/asarpwp/imogen-heap-as-musical-cyborg-renegotiations-of-
power-gender-and-sound/ (accessed 2 August 2019).
Woloshyn, A. (2017), ‘“Welcome to the Tundra”: Tanya Tagaq’s Creative and Communicative
Agency as Political Strategy’, Journal of Popular Music Studies, 29 (4).

Discography
Aboriginal Women’s Voices (1997), [CD] Hearts of the Nation, Banff.
Amos, Tori (2001), [CD] ‘’97 Bonnie And Clyde’, Strange Little Girls, Atlantic.
Björk (1997), [CD] Homogenic, One Little Indian.
Björk (2001), [CD] Vespertine, One Little Indian.


Björk (2004), [CD] Medúlla, One Little Indian.
Björk (2007), [CD] ‘Unison’, Vespertine, One Little Indian.
Björk (2007), [CD] Volta, One Little Indian.
Bush, Kate (2011), [CD] ‘Lake Tahoe’, 50 Words for Snow, Fish People.
The Carpenters (1972), [7” single] ‘Goodbye To Love’, A&M.
Eekwol (2004), [CD] Apprentice to the Mystery, Mils Productions.
Heap, Imogen (1998), [CD] I Megaphone, Almo Sounds.
Heap, Imogen (2005), [CD] ‘Hide And Seek’, Speak for Yourself, Megaphonic Records.
Heap, Imogen (2005), [CD] Speak for Yourself, Megaphonic Records.
Heap, Imogen (2009) [CD] Ellipse, Megaphonic Records.
Heap, Imogen (2014) [CD] Sparks, Megaphonic Records.
Johansson, Jonas [Almetjh Tjöönghkeme] (1991), [CD] ‘Goh Almethj Lea’, Vaajesh,
Jojkbox AB.
Jonsson, Sven-Gösta (1959), [7” single] ‘Vid Foten Av Fjället (Jag är Lapp …)’, Bonniers
Folkbibliotek.
Sainte-Marie, Buffy (2015), [CD] Power in the Blood, True North Records.
Scissor Sisters (2010), [CD] ‘Skin Tight’, Night Work, Downtown.
Simon, Paul (1986), [LP] Graceland, Warner Bros.
Tagaq, Tanya (2005), [CD] ‘Qimiruluapik’, Sinaa, Jerico Beach Music.
Tagaq, Tanya (2014), [CD] ‘Caribou’, Animism, Six Shooter Records.
Tagaq, Tanya (2014), [CD] Animism, Six Shooter Records.
Tagaq, Tanya (2016), [CD] Retribution, Six Shooter Records.
The Wallace Family (2000), [CD] Tzo’kam, Red Plant Records.
Part VI
Creating Desktop Music

Sound recordist and rock musician Steve Albini (who rejects the title of producer) has
suggested that the digital audio workstation (DAW) is an ideological programme which
undermines the autonomy of musicians by transferring the agency for creating recorded
music into the post-production process. Albini’s definition of a musician in this instance,
however, is of a member of a band or a solo artist who performs live and whose musical
autonomy and identity exists primarily outside the studio. This may seem odd as Albini
has spent such a large part of his working life in the studio, but it makes more sense in
light of his formative years in the punk and alternative rock scenes as both a recordist and
a performing musician. Of course, Albini is right in one sense: the existence of this vast
array of technologies for manipulating, editing and transforming performances can take
the musical agency away from the initial performer if they do not take part in this post-
performance process. And, to take that a little further, the ergonomics of placing all that
control in the technology of the personal computer does encourage the idea that it is and
should be the domain of a single person. However, there are a great number of musicians
in a wide range of popular music genres around the world for whom the DAW has been a
creative boon.
Zagorski-Thomas, in the first chapter of the book, points to the dual nature of music
production – the mechanical-electrical process of representing actual performances, what
we tend to think of under the term ‘recording’, but also the electronic construction of
pseudo-performances that forms the basis of all electronic and electroacoustic music in all
of the styles and traditions in which it exists. And electroacoustic in this instance is used in
a more literal sense than usual to include any style which conjoins the processes of acoustic
recording with electronic ‘construction’ – from Stockhausen to Public Enemy. Indeed, we
could argue that any live performance which has been compiled or reordered into a newly
constructed ‘performance’ is electroacoustic under this definition. And, in addition, it
would include unedited performances that have been subjected to processing and effects
in a way that significantly alters their character – such as King Tubby’s dub mixes. However,
Stockhausen, King Tubby and Public Enemy do not represent desktop production: their recorded outputs were created with tape-based technologies. In Part II, Paul Théberge provides
a nuanced overview of the ways in which desktop technologies emerged from various
interacting streams of development.
The ‘desktop music’ of this part is focused more on this second aspect of music
production: the creative manipulation of real and constructed ‘performances’. And there
are three key differences between tape-based examples of these kinds of practice and the
DAW-based, desktop versions:
● the addition of a powerful visual dimension to the representation
● the ubiquitous ‘undo’ functionality
● the notionally infinite nature of the technology that the virtual provides.

Perhaps ironically, given that the visual representation of sound is a product of the virtual
world of graphic user interfaces, it also encourages us to objectify sound. By thinking of
a sound file as a shape that can be sculpted and sliced rather than as a set of instructions
for making a noise, we are provided with an entirely new set of possible metaphors. It is
an interesting question as to whether the kinds of rhythmic innovation that Danielsen talks about in her chapter would have been perceived as affordances were it not for the
visual editors that have become ubiquitous in DAWs. And there is a good deal of anecdotal
evidence that the sound of music changed dramatically when engineers started to equalize
audio in conjunction with a visual representation of the frequency curve as opposed to a
series of potentiometers on a physical mixing desk. And the number of edits in classical
music production has grown exponentially since they could be performed using a computer
screen rather than physical tape splicing or timecoded tape-offset bounces.
In addition to the potential for doing all this complicated editing and shaping that the
digital desktop afforded through visual representation is the potential for undoing it all
as well. This hugely important aspect of digital audio (as well as of musical instrument
digital interface [MIDI] sequencing) is the result of a combination of the ‘lossless’ copying
of data files in the digital domain and the potential to play back any portions of any files
in any order without actually altering the files – simply by creating a ‘playlist’ of memory
addresses (e.g. play take 1 from 0’ 00” to 0’ 03.25” followed by take 2 from 0’ 04.16” to 1’
23.67”). This process of non-destructive editing creates one type of potential for ‘undo’
while the ability to make a ‘lossless’ copy of any file that is going to be altered allows you
to return to the unaltered file at any point. This undo function allowed engineers, artists
and producers to fearlessly experiment with techniques that could potentially have ruined
wonderful recorded performances in the world of analogue tape recording. However,
it is also the undo function, along with the third difference – the potentially limitless capacity for recording tracks and storing multiple takes – that is responsible for the digital world’s ‘fear of commitment’. On the one hand, the undo function allows participants to be much more adventurous because they can almost always step back from a catastrophic failure in their experimentation. On the other hand, decisions cease to be commitments and workflows have increasingly become exercises in keeping one’s options open, which, it might be argued, makes the decision-making processes at the end of a project much more complex and large scale than they would be if small but absolute commitments were made throughout the workflow, as they used to be in the world of analogue tape.
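To make the ‘playlist’ principle described above concrete, the following is a minimal sketch in Python – not any DAW’s actual file format or API, and the file names are purely illustrative – of how an edit can be stored as an ordered list of references into unaltered source takes.

```python
# A minimal sketch (not any DAW's actual format) of non-destructive editing:
# a 'playlist' of regions that reference time ranges in unaltered source takes,
# so playback order can be changed without touching the audio files themselves.
from dataclasses import dataclass

@dataclass
class Region:
    source: str     # file name of the unaltered source take (illustrative)
    start_s: float  # where to start reading in the source, in seconds
    end_s: float    # where to stop reading, in seconds

# The edit is just an ordered list of regions; 'undo' amounts to restoring an
# earlier version of this list, and the source files are never modified.
playlist = [
    Region("take_1.wav", 0.0, 3.25),
    Region("take_2.wav", 4.16, 83.67),
]

for r in playlist:
    print(f"play {r.source} from {r.start_s:.2f}s to {r.end_s:.2f}s")
```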
17
Desktop Production and Groove
Anne Danielsen

Introduction
The conditions for working with groove-based music changed considerably with the
advent of digital music production tools and, in particular, with their integration in
the digital audio workstation (DAW). As a consequence, working ‘in-the-box’ – that is,
producing a track entirely within one software programme, such as Pro Tools
or Logic Pro – has become more and more common, as every function needed to produce
music is now integrated into the software. Whereas groove-based music in previous times
relied heavily on the skills of outstanding musicians, recording studios and physical units
for effect processing, today’s desktop producers can produce compelling grooves via a
computer alone. In many genres, then, grooves produced in-the-box have taken over the
musical function of the traditional rhythm section, and the new techniques for creating
grooves provided by the DAW now make their mark on a wide spectrum of popular music,
from the avant-garde to the mainstream.
Groove is a musical term used by musicians, fans and academics alike. It is used (as
a noun) to name a characteristic rhythmic pattern typical of a musical style (i.e. swing
groove, rock groove, funk groove) but also (as is reflected in the adjective ‘groovy’) to
evoke the particular pleasurable quality of, as well as the appeal to dance and motion
emanating from, such patterns when they are performed well.1 Producing a groove
thus involves more than creating the correct rhythmic pattern; it entails the detailed
manipulation of the ‘analogue aspects’ (Kvifte 2007) of rhythm – that is, features such as
microtiming, choice and processing of sounds, and so on. The increase in computational
power has been a critical factor in the development of opportunities to control these
aspects of groove in desktop production. Equally important, however, has been the
creation of more ergonomic and user-friendly interfaces, both on the screen and in
the form of external controllers. Desktop production is thus no longer solely about
programming but involves a form of physical interaction with an ‘instrument’ that
resembles traditional musicking.
This chapter starts with a historical overview of some of the technological developments
that led to current desktop production practices. I then explore selected techniques that
have deeply affected the desktop production of grooves. I both give a brief presentation
of each technique and present related research on music that features the technique. In
particular I focus on the creation of rhythmic patterns and practices that do not resemble
‘traditional musicking’ – sometimes even seeking to deliberately undermine it – and which
have, in turn, come to influence performance. Ultimately, I touch upon the topic of desktop
production in live settings before concluding the chapter and outlining some directions for
future research.

The technological prehistory of desktop-produced grooves
Certain steps in the development of digital music tools are particularly salient to the
emergence of the desktop production of grooves. In the following, I will briefly introduce
the most important of them: sampling, Musical Instrument Digital Interface (MIDI) and
audio sequencing, and digital sound recording/editing. At some point, all of them became
integrated into one software package, the DAW, which is a clear prerequisite of today’s
desktop production practices.

Sampling
Sampling, in the sense of using small fragments of recorded sound as musical building
blocks,2 is not in itself a new practice. There are pre-digital instruments based on sampling,
such as the Mellotron used in the progressive and symphonic rock genres of the late 1960s
and 1970s, and Pierre Schaeffer and colleagues (2004) also conducted a sort of analogue
sampling via microphone and magnetic tape in the 1940s. Whereas these examples of
analogue sampling are based on magnetic tape-recording technology, digital sampling
involves creating a numerical representation of the audio waveform. The capacity of
digital electronics to record and store sound in digital memory chips was first exploited
via recording-studio delay units (Roads 1996: 120). Towards the end of the 1970s, when
computer memory became cheaper, digital sampling was introduced in commercial
sample-based keyboard instruments, such as the Fairlight CMI and the Emulator (E-MU),
and a wave of increasingly powerful and inexpensive sample-based keyboard instruments
soon followed (Holmes 2012: 492).
Whereas sustained acoustic instruments, such as brass or strings, often sounded ‘dead’
due to the short sampling time and lack of development in the sound source, many
percussive sounds could be recorded and played back in their entirety. The new sampling
technology thus suited the reproduction of percussive instruments very well, and an
important area of application for early digital sampling technology in the commercial
market was the drum machine. The drum machine does not in itself require digital
technology and analogue drum machines could be programmed like their digital heirs.
The key difference between analogue and early digital drum machines was that the former
used analogue sound synthesis, rather than samples, for their drum sounds. For example,
in an analogue drum machine, a snare drum sound typically came about via subtractive
sound synthesis, using a burst of white noise as its starting point. This meant that the final
sound was not particularly close to the sound of the real instrument. In contrast, the digital
drum machine used samples of real drums as its sound sources. Unlike more sustained sounds, the short durations of drum hits presented few challenges to the sampling process. Sampled sounds
improved drum machines considerably, because percussive sounds, which are very
complex and have disharmonic spectral features, are almost impossible to produce in a
realistic way via other forms of synthesis.
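As a rough illustration of the difference described above, the following Python sketch (using numpy; all parameter values are arbitrary) contrasts an analogue-style snare – an enveloped, filtered burst of white noise – with digital sample playback, which simply reads back a stored numerical representation of a recorded hit.

```python
# Minimal numpy sketch contrasting an analogue-style snare (an enveloped burst
# of filtered white noise) with sample playback (reading back a stored
# numerical representation of a real drum hit). Values are arbitrary.
import numpy as np

SR = 44100  # sample rate in Hz

def synth_snare(dur=0.2, cutoff=0.3):
    """Rough subtractive snare: white noise -> one-pole low-pass -> decay envelope."""
    n = int(SR * dur)
    noise = np.random.uniform(-1.0, 1.0, n)
    filtered = np.empty(n)
    y = 0.0
    for i, x in enumerate(noise):       # simple one-pole low-pass filter
        y = y + cutoff * (x - y)
        filtered[i] = y
    envelope = np.exp(-np.linspace(0.0, 8.0, n))
    return filtered * envelope

def play_sample(sample, gain=1.0):
    """Digital sample playback: simply return the stored waveform."""
    return gain * sample

snare_synthesized = synth_snare()               # not very close to a real snare
snare_sampled = play_sample(np.zeros(SR // 5))  # stand-in for a recorded snare hit
```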

MIDI and audio sequencing


The manufacturers of early drum machines used different digital standards, and, as was
the case with synthesizers, compatibility among machines remained a problem. In 1981,
Dave Smith and Chet Wood from the company Sequential Circuits presented a paper at
the Audio Engineering Society that described the concept behind MIDI for the first time.
MIDI allowed for entirely new ways of using old tools, including synthesizers, sequencers,
samplers and drum machines. MIDI also enabled computers to be applied to the music-
making process in a new way and thus sparked wide-reaching interest in the general
integration of the various tools that were required for record production. MIDI was, in
short, a transformative technology: ‘MIDI increased access to a rapidly expanding palette
of sounds and began the convergence and integration of the various technologies and
methodologies. The many all-in-one units and subsequent software programs, along with
falling prices, signalled the beginning of the democratization of the production process’
(Burgess 2014: 142).
MIDI increased the richness of available sounds. One set of MIDI signals could be
used to control many instruments at the same time, which made it easy to duplicate a
pattern across several instruments. MIDI also led to a new level of precision in timing by
way of quantization, which is the process of automatically positioning performed musical
notes according to an underlying predetermined temporal grid that commonly represents
beats and subdivisions of beats. The new richness of quantized sounds blurred the line
between humans and machines in that we heard acoustic drum kits played with machine-like consistency with regard to temporal, dynamic and timbral precision. By 1984, when the
Linn 9000 drum machine was released, the MIDI code had become an industry standard.
The Linn 9000 was fully equipped for MIDI communication, which meant that its drum
sounds could be controlled by an external device, and vice versa – the sequencer could
control sounds in other digital modules. The capacity of these new units was limited, but
they represented the beginning of the development that would ultimately result in the
DAW. Compared to today’s DAW, however, several important functions remained absent –
for example, digital sound editing and digital sound processing. When combined with
the vast increase in the power of personal computers, however, early ‘music production
centres’ such as the Linn 9000 or the early AKAI MPCs brought about a revolution of sorts
in the recording industry.
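The quantization process introduced above can be sketched in a few lines of Python; this is an illustrative simplification rather than any sequencer’s actual implementation, and the onset values are invented.

```python
# A minimal sketch of quantization: snapping performed note onsets (in beats)
# to the nearest position on a predetermined grid of beat subdivisions.
def quantize(onsets_beats, subdivision=4):
    """subdivision=4 gives a sixteenth-note grid (four slots per beat)."""
    step = 1.0 / subdivision
    return [round(t / step) * step for t in onsets_beats]

performed = [0.02, 0.27, 0.49, 0.76, 1.03]  # slightly loose sixteenths (invented)
print(quantize(performed))                  # -> [0.0, 0.25, 0.5, 0.75, 1.0]
```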
Digitally stored audio represents a lot of data, and processing audio is demanding on
computer power. Up to the late 1980s, then, consumer-level computers such as the Atari ST
and Apple Macintosh had only enough computing power to handle MIDI data. By the end of the 1980s, however, audio sequencing started to appear within traditional MIDI sequencer programmes.

Hard-disk recording
The road to digital hard-disk recording began with the development of sample editors
that would prepare samples for sample-based instruments such as the Fairlight CMI
and the Synclavier. Connecting a sampler to a personal computer (a Macintosh or Atari)
added a larger screen and extra computing power to speed up the editing. Pro Tools, for
example, began as a Mac-based sample editor by Digidesign called Sound Designer. This
software morphed into Sound Tools, a direct-to-disk recording system launched in 1989
that allowed for non-destructive editing in stereo and also included some simple digital
processing. In 1991, a 4-track version called Pro Tools was released, and, shortly thereafter,
Digidesign’s Digital Audio Engine became available to other manufacturers, opening up
the hardware in Pro Tools to users of sequencer software (Musicradar 2011a; Hofer 2013;
Burgess 2014: 145).
Hard-disk recording facilitates non-destructive editing at all levels, allowing for more
‘custom-made’ designs of both the basic pattern of the groove and its overall form. In
combination with the marvellous ‘undo’ function, this flexibility led to a new practice of
‘endless’ editing across all genres of popular music.
For a long time, audio and MIDI sequencing on the one hand, and hard-disk recording
on the other, were two distinct lines of development. Accordingly, what might be regarded
as prototypes of DAWs tended to focus on either sequencing (Steinberg’s Cubase and
EMagic’s Creator, Notator and Logic Series) or recording (Digidesign’s Sound Tools
and, later, Pro Tools). The graphical interface of the Pro Tools software, for example,
was modelled on analogue tape-based recording studios, whereas the interface of the
sequencer-oriented software was an extension of earlier MIDI-sequencing software.
When these two lines of development eventually came together in the early 1990s, a new
world of opportunity opened up for manipulating groove. The next important step in
this regard came in 1996, when Steinberg introduced Cubase VST, which could record
and play back up to thirty-two tracks of digital audio on a Mac (Musicradar 2011b). The
software offered a tape-like interface for recording and editing as well as the entire mixing
desk and effects rack that were common in analogue studios. Its most revolutionary
aspect, however, was that all of these operations could be done ‘in-the-box’, that is, in the
software alone.
The new opportunities for combining MIDI and audio sequencing meant that audio
recordings became part of a production practice measured by a fixed timeline; audio came
to be viewed against a ‘grid’. Consequently, Zagorski-Thomas (2010) argues, the idea of ‘in time’ versus ‘out of time’ started to be considered as scientific and measurable rather
than musical and aesthetic. Moreover, two convergent trends started to emerge: ‘On the
one hand players seem to be trying to sound more like machines and on the other hand
programmers creating computer based music were often aiming to make the machines
sound more like people’ (Zagorski-Thomas 2010: 197). This tension between making
humans more machine-like and machines more human-like became a central organizing
concept in the development of technology.

Creating groove ‘in-the-box’: The techniques and their musical applications
In what follows I will address some basic techniques in the desktop production of groove
and the ways in which they have been employed in creative ways in actual music. I will
focus on both manipulations of rhythmic events along the temporal axis and effects
that influence the sound-related aspects of groove. Ultimately, I will discuss interactions
between timing and sound and their consequences for groove.

Relocating tracks and events


A simple way to manipulate the timing of sampled or performed music in the DAW is to move rhythmic events to new temporal positions. This can, of course, be used to
correct mistakes or poor timing, but it can also be used to generate new rhythmic feels.
Even though the technique is simple, the result can be rather complex perceptually, because
such procedures sometimes result in profound discrepancies between rhythmic events
that were initially aligned (beatwise). As D’Errico points out, the instability introduced
through such a disruptive manoeuvre tends to become normalized in the context of a
stable and repetitive loop (2015: 283). However, such interventions nonetheless introduce
a characteristic halting feel to the groove.
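A minimal sketch of this relocation technique, in Python with invented values, might look as follows; real DAWs of course operate on audio regions rather than bare onset lists.

```python
# Minimal sketch of relocating events: every onset of one layer is moved to a
# new temporal position relative to the others, so events that were initially
# aligned no longer coincide as manifestations of the same structural position.
def shift_layer(onsets_s, offset_s):
    """Return the layer's onsets (in seconds) moved by a fixed offset."""
    return [t + offset_s for t in onsets_s]

woodblock = [0.0, 0.5, 1.0, 1.5]           # quarter notes at 120 bpm (invented)
synth_riff = shift_layer(woodblock, 0.08)  # the riff now sounds 80 ms 'late'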
Several recordings by mainstream contemporary R&B and rap artists from the early
2000s display the innovative use of non-linear sound editing, an option that came with
hard-disk recording. Carlsen and Witek, in an analysis of the song ‘What About Us’ from
Brandy’s innovative album Full Moon (2002, produced by Rodney Jerkins), show that the
peculiar rhythmic feel of that tune derives from simultaneously sounding rhythmic events
that ‘appear to point to several alternative structures that in turn imply differing placements
of the basic beat of the groove. Though these sounds might coincide as sounds […] they
do not coincide as manifestations of structure’ (2010: 51). An illustration of this is when
the last sixteenth note of the melodic-thematic synth riff sounds concurrently with the first
quarter note of the woodblock in the next measure, that is, the last sixteenth note before
the downbeat in one rhythmic layer is delayed to such an extent that it coincides with the
sound that structurally represents that downbeat in another layer. In other words, rather
than being perceived as deviations from a shared underlying reference structure, such
simultaneously sounding rhythmic events point to several alternative structures that in
turn imply differing placements of the basic beat at the microlevel of the groove. The result
resembles the rhythmic feel of several grooves on D’Angelo’s pioneering Voodoo album
from 2000 (for analysis, see Danielsen 2010), where the relocation of tracks in relation to
one another also led to multiple locations of the pulse merging into one extended beat at
the microlevel of the groove.

Time-warping
Dynamic time-warping is an algorithm that calculates an optimal match between two
given sequences by matching indices in one sequence to corresponding indices in the other
sequence. In addition to a similarity measure, this procedure produces a ‘warping path’ by
which the two time sequences can be non-linearly time-aligned. Time-warping is used,
for example, when audio is adjusted to fit the grid of a sequencer. Radical time-warping
became increasingly popular in the 2000s, in particular in funk-derived computer-based
genres such as contemporary R ‘n’ B and hip-hop. On several tracks of Snoop Dogg’s
innovative album R&G (Rhythm & Gangsta): The Masterpiece (2004), producers such as
J. R. Rotem and Josef Leimberg contributed beats that come across as time-warped. The
groove in ‘Can I Get A Flicc Witchu’ (produced by Leimberg), for example, which consists
of a programmed bass riff and a drum kit, along with vocals that are mainly rapped,
demonstrates no less than two forms of this technique simultaneously. First, the length
of the beats is gradually shortened, so that beat 2 is shorter than beat 1, beat 3 is shorter
than beat 2, and so on. The very linear manner in which this is done suggests that tempo
automation has been applied to a sequence of kick drum samples, a function that was
available in the DAW at the time of production of Rhythm & Gangsta. Used as described
above, this technique results in a general vagueness in the positioning of rhythmic events.
The second layer of time-warping in this track is a perceptual effect, caused instead by the absence of appropriate time-warping of the audio. The case in point is the bass pattern,
which follows its own peculiar schematic organization and is a main reason for the ‘seasick’
rhythmic feel of the tune. This pattern neither relates to the 4/4 metre nor conforms to a
regular periodicity of its own (see Figure 17.1; for a detailed analysis, see Brøvig-Hanssen
and Danielsen 2016: ch. 6).
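The dynamic time-warping algorithm described at the start of this section can be sketched in its textbook dynamic-programming form as follows; this is a simplified Python/numpy illustration, not the implementation used in any particular DAW.

```python
# Textbook dynamic-programming sketch of dynamic time-warping: computes a
# similarity cost and the warping path that non-linearly aligns two sequences
# (for example, onset or feature envelopes).
import numpy as np

def dtw(a, b):
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    # Backtrack to recover the warping path as pairs of aligned indices.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return cost[n, m], path[::-1]

distance, path = dtw([0, 1, 2, 3, 2, 1], [0, 1, 1, 2, 3, 2, 1])
```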
Its peculiar feel could have been produced by manually drawing the notes away from
the grid or by ‘forcing’ a sampled bass riff from a different source into the bar without the
‘necessary’ adjustments in the tempo or length of the sample. As a consequence, the sample
is de facto deformed perceptually: the listener will unavoidably try to structure the sample
in accordance with the overarching metrical context of the new musical context (Danielsen
2006, 2010), and, if it does not fit, the sample will be heard as weird, even though it might
have been completely straightforward in its original context.
Figure 17.1  Amplitude graph and spectrogram (produced in Praat version 6.0) of Snoop
Dogg’s ‘Can I Get A Flicc Witchu’ (bar 9) illustrating two levels of time-warping: (a) Tempo
automation of beat durations (indicated on top), and (b) non-metric synth bass sample
(indicated by box in the spectrogram). Grid on isochronous quarter notes marked by
vertical lines.

This strategy was also used by J Dilla to produce the peculiar feels of his Donuts album
(2006). Here, as well, the natural periodicity of the original samples is severely disturbed by
the shortening or lengthening of one or more beats/slices of the sample. When this type of
operation is looped, again, the result is a dramatically halting, deformed feel. In J Dilla’s music,
however, it is not the combination of several layers that produces the strange halting effect.
As D’Errico points out, J Dilla has his own characteristic way of reconfiguring single musical
sources, since he often ‘abstains from juxtaposing various samples into a multi-layered loop,
instead rearranging fragments of a single sample into an altogether different groove’ (2015:
283). This strategy depends on creating weird microtemporal relationships within one layer
rather than combining instrumental layers that are displaced in relation to one another.

Cutting up sounds: Cut ‘n’ paste, glitch and granulation
Cutting up sounds and relocating the fragments is yet another much-used technique in
the desktop production of grooves, and one that DAWs clearly facilitated. As Oliver (2015)
points out, it is not always first and foremost the transformation of temporal features or
durations that produces peculiar microrhythmic effects but the abrupt transitions between
sounds that are produced by such cuts. In his analysis of jungle and drum and bass, he shows
how the effect of chopping up the crash cymbal of the much-sampled ‘Amen break’, for
example, relies heavily on the fact that it is an initially acoustic, and thus very rich, sound.3
In addition, more automatic procedures for chopping up sound yield characteristic
groove effects. An automatic procedure such as granulation, for example, which chops up
the sounds into small ‘pixels’ of equal length, then reorganizes them, often in a random
fashion, produces a microrhythmic pattern in which all of the events are strictly on the
grid. Yet the abrupt onsets and offsets of the ‘grains’ are in stark contrast to how musical
sounds tend to behave when produced by a conventional instrument. As in the case above
with the ‘Amen break’, chopping up a very rich sound in an unexpected way produces
abrupt transitions and, in turn, a very characteristic microrhythmic feel. In her analysis
of DJ Food’s ‘Break’, Brøvig-Hanssen (2010) demonstrates further the microrhythmic
potential of abrupt cuts to sonic ‘wholes’. In this case, the peculiar stuttering effect stands
in clear contrast to the swung ‘jazz’-like groove of the rest of the work. This combination
of organic and machinic rhythm generated a novel musical expression. Today, however,
the dividing line between the two has become increasingly difficult to draw (Danielsen
2019).
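A granulation procedure of the kind described above can be sketched as follows (a minimal Python/numpy illustration, with an arbitrary sine tone standing in for a rich source sound).

```python
# Minimal numpy sketch of granulation: chop a signal into equal-length 'grains',
# reorder them at random and butt them together, so that every onset falls
# strictly on the grain grid while the joins remain abrupt.
import numpy as np

SR = 44100

def granulate(signal, grain_len, seed=0):
    rng = np.random.default_rng(seed)
    n_grains = len(signal) // grain_len
    grains = [signal[i * grain_len:(i + 1) * grain_len] for i in range(n_grains)]
    order = rng.permutation(n_grains)  # random reordering of the grains
    return np.concatenate([grains[i] for i in order])

source = np.sin(2 * np.pi * 220 * np.arange(SR) / SR)  # stand-in for a rich source sound
stuttered = granulate(source, grain_len=SR // 16)       # roughly 62 ms grains
```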
All techniques that involve cutting sound into fragments and reorganizing them anew
in various ways imply that a coherent sonic totality has been ‘destroyed’. Accordingly, there
is always an additional layer of significance to cut-up sounds. The label glitch music4 – a
substyle of electronic dance music associated with artists such as Autechre, Aphex Twin
and Squarepusher – hints at the ways in which we perceive these soundscapes. An
important point here, which Brøvig-Hanssen discusses at length, is that this approach
to sound relies on the listener being able to imagine a ‘music within the music’ – that
is, a fragmented sound presupposes an imagined and spatiotemporally coherent sound
(Brøvig-Hanssen 2013). In her detailed analysis of the manipulations of the vocal track in
two versions of Squarepusher’s ‘My Red Hot Car’,5 where one is a ‘glitched’ version of the
other, she demonstrates how the vocal track of the glitched version has been ‘deformed’,
in that sounds are cut off too early; there are repeated iterations of sound fragments
separated by signal dropouts; and fragments are dislocated from their original locations:
‘We can discern two layers of music, the traditional and the manipulated, neither of
which, in this precise context, makes sense without the other’ (Brøvig-Hanssen and
Danielsen 2016: 95).

Beat slicing, quantization and swing ratio


Beat slicing is yet another technique for cutting up sound – an audio file is cut into segments in accordance with its rhythmic structure, the segment boundaries being identified via transients in the sound. This technique is useful for changing the tempo of the audio without changing its pitch.6 However, it may also be used to produce strange effects – for example, by targeting a sound without an obvious rhythmic profile, such as a sustained synth
pad. The sliced parts can, furthermore, be moved to new positions using quantization. As
mentioned above, quantization means adjusting the rhythmic positions of audio or MIDI
events to a specified time grid, such that all notes are moved to the nearest position on
the grid. (When quantizing audio tracks, certain parts of the audio are time stretched.)
Different layers and also different parts of a single layer can be quantized to different grids,
and this – together with the option of varying the swing ratio – allows artists to employ
all kinds of combinations of equal and unequal subdivisions within and across musical
layers. One early example of the use of differing subdivision grids is the song ‘Nasty Girl’
by Destiny’s Child from their album Survivor (2001), in which each section has a distinct
structural subdivision ‘profile’ – the verse has a clear triplet feel, the bridge is quantized to
straight sixteenths and the chorus hovers somewhere in-between (for a detailed analysis,
see Danielsen 2015).
A more recent example of this practice can be found in the Norwegian hip-hop
producer Tommy Tee’s track ‘Going On (feat. Mike Zoot)’ from Bonds, Beats & Beliefs
vol. 2 (2016). Here, different sections of the hi-hat pattern have been quantized to
different grids. In practice, this is done by splitting the audio into different regions,
applying quantization and then gluing the fragments back together again. According to
Tommy Tee (interview with the author), this practice is now widespread in contemporary
hip-hop.
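The split–quantize–glue workflow described in this section can be sketched as follows; this is a simplified Python illustration with invented onset values, using one common way of parameterizing the swing ratio rather than any specific DAW’s definition.

```python
# Sketch of the split-quantize-glue workflow: slice onsets (e.g. detected
# transients, given in beats) are snapped to a sixteenth grid whose off-beat
# positions are delayed by an adjustable swing ratio.
def swung_grid(n_beats, swing=0.5):
    """swing=0.5 gives straight sixteenths; swing=2/3 approaches a triplet (shuffle) feel."""
    grid = []
    for beat in range(n_beats):
        for eighth in (0.0, 0.5):
            grid.append(beat + eighth)                # on-beat sixteenth
            grid.append(beat + eighth + 0.5 * swing)  # delayed off-beat sixteenth
    return grid

def quantize_to(onsets_beats, grid):
    """Snap each slice onset to the nearest position on the (possibly swung) grid."""
    return [min(grid, key=lambda g: abs(g - t)) for t in onsets_beats]

hihat_slices = [0.03, 0.24, 0.52, 0.79, 1.01, 1.27]            # transients, in beats
verse = quantize_to(hihat_slices, swung_grid(2, swing=2 / 3))  # triplet-feel section
bridge = quantize_to(hihat_slices, swung_grid(2, swing=0.5))   # straight-sixteenth section
```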

Altering the sound shape


Ultimately, it should be mentioned that all shaping of sound indirectly influences timing.
Several perception studies have shown that sounds with slow rise times and long durations
move their perceived timing later in time (Gordon 1987; Wright 2008; Danielsen et al.
2019). Lowering the pitch in musical sounds probably works in the same direction (Zeiner-
Henriksen 2010), although systematic studies of the effect of frequency range on perceived
timing are scarce (one exception is Danielsen et al. 2019). The ultimate evidence of the
effect of the shaping of sound on perceived timing, however, is the widespread use of
sidechain pumping in contemporary EDM-related popular music. To achieve sidechain
pumping, one routes a track with a percussive sound – for example, a kick drum – into the sidechain of a compressor inserted on a track with a more sustained sound, such as a synth
pad or a bass synth. The volume of this track is ‘ducked’ when the sidechain trigger kicks in,
and it returns to its normal level in tandem with the compressor’s release (see Figure 17.2).
This whole process produces a characteristic rhythmic ‘swell’, or what Hodgson describes
as a ‘regularized rhythmic flexing’ (2011).
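A heavily simplified sketch of this ducking process, assuming numpy and ignoring the threshold, ratio and attack behaviour of a real compressor, might look as follows.

```python
# Heavily simplified numpy sketch of sidechain pumping: the kick drum's level
# drives a gain envelope that 'ducks' the sustained pad on each hit and lets it
# swell back as the (idealized) compressor releases.
import numpy as np

SR = 44100

def duck(pad, kick, depth=0.8, release_s=0.25):
    """Reduce the pad's gain wherever the kick is loud, with an exponential release."""
    level = np.abs(kick)
    coeff = np.exp(-1.0 / (release_s * SR))  # one-pole release smoothing
    followed = np.empty_like(level)
    y = 0.0
    for i, x in enumerate(level):            # instant attack, slow release
        y = x if x > y else y * coeff
        followed[i] = y
    gain = 1.0 - depth * followed / (followed.max() + 1e-12)
    return pad * gain

t = np.arange(2 * SR) / SR
pad = 0.5 * np.sin(2 * np.pi * 110 * t)                      # sustained synth pad
kick = np.zeros_like(pad)
for beat_time in np.arange(0.0, 2.0, 0.5):                   # four-on-the-floor quarters
    i = int(beat_time * SR)
    kick[i:i + 2000] = np.exp(-np.linspace(0.0, 6.0, 2000))  # simple kick-like burst
pumped_pad = duck(pad, kick)                                 # the characteristic 'swell'
```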
Sidechain pumping, and the related technique of envelope following, completely alters
the shape of the sound that is ducked (or, in the case of envelope following, processed
according to the shape of the chosen envelope). In general, the vast opportunities for
manipulating both timing and sound that digital signal processing presents to the desktop
producer have given rise to a situation where the temporal and sound-related aspects of
microrhythm are difficult to pull apart.
Figure 17.2  Sidechain pumping example. From top to bottom: (i) a synth pad side-
chained to (ii) a kick drum playing straight quarter notes, and (iii) the resulting pad.
(Reprinted with permission from Hodgson 2011. © Jay Hodgson.)

Controllers and desktop production on the stage


In the last decade, desktop production has also ‘entered stage right’. The DAW Ableton
Live has been particularly important in this development. Moreover, due to increased
processing power, the desktop production of grooves on stage has become more and more
similar to desktop production in the professional or home studio. Whereas automated
cutting processes could initially be applied only to pre-recorded sound, for example, they
can now be used in real time.7
An important enabling factor here is the development of DAW controllers. Using an
external controller allows for the intuitive and embodied creation of desktop-produced
grooves. Controllers have been widely used in the home studio for a long time, because they
produce a different microrhythmic feel from that produced by visual programming. In the
live setting, the controller enables better communication with an audience, because it can
be used to indicate how a sound that the audience hears is actually produced. In the words
of a Norwegian live electronics artist, ‘Generally, if it is just a small pulse in the background,
it is okay to play from the machine, but not if it is a very special sound. Then you miss the
actual movement, [and] to us that feels like cheating’ (Kjus and Danielsen 2016).
One crucial difference between live and recorded formats is the real-time aspect of a
live performance. Because there is simultaneity between the music’s unfolding and the
audience’s experience of it, there is no erase or undo button in a live context. The threat of
technological breakdown is thus a great concern for desktop production in a live context
(Kjus and Danielsen 2016), and the use of pre-programmed and pre-recorded – that is,
pre-produced – elements on stage is widespread as well. The level of acceptance of this
trend varies among genres, of course. In electronic dance music, all aspects of the sound,
even the vocals, can be prefabricated without any perceived impact upon the quality of
the live performance (see Danielsen and Helseth 2016; Kjus and Danielsen 2016). In rock,
on the other hand, the production techniques typical of desktop production would be
viewed as inauthentic and a threat to the experience of a live concert as ‘live’ (Frith 1986).
The live desktop production of grooves is thus most common in genres where looping
and processing of sounds and samples are already integral parts of the overall musical
expression.

Conclusion
The integration of sampling, MIDI and audio sequencing, hard-disk recording and editing,
and digital signal processing in one software platform – that is, the DAW – has opened up
vast opportunities for in-the-box experimentation with groove features. The techniques
and examples discussed above demonstrate that this experimentation at the microrhythmic
level of grooves involves not only severe manipulation of temporal relationships but also radical disruption of sonic wholes and the reordering of the resulting sound fragments.
Experiencing the unique presence of typical desktop production techniques requires
that one recognize that they have been used in the first place. Put differently, mediating
processes can be either transparent or opaque for the listener (Brøvig-Hanssen 2010: 162) –
that is, aspects of the mediating process can be both exposed and completely hidden. Above,
I have focused on the opaque use of some important tools that are available for the desktop
production of groove. Most of the tools discussed – such as time-warping, quantization
and cut ‘n’ paste – are, however, also used in a transparent manner and have had a profound
influence on the timing in more traditional popular music styles. This reminds us, in turn,
that desktop production today is also employed to create grooves that belong to styles and
genres that existed long before the desktop production era. Even though such tracks often
tend to show no signs of the use of digital music tools, country, rock ‘n’ roll and other pre-
digital popular musics are now often also produced partly in-the-box.
What is new compared to pre-DAW times, however – and this concerns both the
traditions that tend to expose the digital production strategies employed in the creation
of groove and those that do not – is the possibility of endless manipulation and control of both the structural and the micro aspects of rhythm. The DAW makes it possible to control every aspect of a groove down to the millisecond, and the ability to ‘undo’ has clearly contributed to a new world of previously unheard grooves.

Notes
1. For a thorough discussion of the term ‘groove’ and its applications, see Câmara and
Danielsen (2018).
2. For a discussion of some of the different meanings of the term ‘sampling’, see Kvifte
(2007: 106–108).
3. The ‘Amen break’ refers to a drum solo performed by Gregory Sylvester Coleman in the
song ‘Amen, Brother’ (1969) by The Winstons.
4. ‘Glitch’ initially referred to a sound caused by malfunctioning technology.
5. The two versions were released as the first two tracks of Squarepusher’s EP My Red Hot
Car (2001). The second track was subsequently placed on the Squarepusher album Go
Plastic (2001).
6. The use of beat slicing started with the stand-alone software package ReCycle by Propellerhead in 1994 and gradually came to be integrated into DAWs in the early 2000s.
7. For an introduction to the algorithmic procedures underlying different automated cutting
processes in live electronica performance, see Collins 2003.

Bibliography
Brøvig-Hanssen, R. (2010), ‘Opaque Mediation. The Cut-and-Paste Groove in DJ Food’s
“Break”’, in A. Danielsen (ed.), Musical Rhythm in the Age of Digital Reproduction, 159–175,
Farnham: Ashgate.
Brøvig-Hanssen, R. (2013), ‘Music in Bits and Bits of Music: Signatures of Digital Mediation
in Popular Music Recordings’, PhD thesis, University of Oslo, Oslo.
Brøvig-Hanssen, R. and A. Danielsen (2016), Digital Signatures. The Impact of Digitization on
Popular Music Sound, Cambridge, MA: MIT Press.
Burgess, R. J. (2014), The History of Music Production, Oxford: Oxford University Press.
Câmara, G. and A. Danielsen (2018), ‘Groove’, in A. Rehding and S. Rings (eds), The Oxford
Handbook of Critical Concepts in Music Theory, Oxford: Oxford University Press. doi:
10.1093/oxfordhb/9780190454746.013.17.
Carlsen, K. and M. A. G. Witek (2010), ‘Simultaneous Rhythmic Events with Different
Schematic Affiliations: Microtiming and Dynamic Attending in Two Contemporary R&B
Grooves’, in A. Danielsen (ed.) Musical Rhythm in the Age of Digital Reproduction, 51–68,
Farnham: Ashgate.
Collins, N. (2003), ‘Generative Music and Laptop Performance’, Contemporary Music Review,
22 (4): 67–79.
Danielsen, A. (2006), Presence and Pleasure. The Funk Grooves of James Brown and Parliament,
Middletown, CT: Wesleyan University Press.
Danielsen, A. (2010), ‘Here, There, and Everywhere. Three Accounts of Pulse in D’Angelo’s
“Left and Right”’, in A. Danielsen (ed.), Musical Rhythm in the Age of Digital Reproduction,
19–36, Farnham: Ashgate.
Danielsen, A. (2015), ‘Metrical Ambiguity or Microrhythmic Flexibility? Analysing Groove
in “Nasty Girl” by Destiny’s Child’, in R. Appen, A. Doehring and A. F. Moore (eds), Song
Interpretation in 21st-Century Pop Music, 53–71, Farnham: Ashgate.
Danielsen, A. (2019), ‘Glitched and Warped: Transformations of Rhythm in the Age of the
Digital Audio Workstation’, in M. Grimshaw-Aagaard, M. Walther-Hansen and
M. Knakkergaard (eds), The Oxford Handbook of Sound & Imagination, Oxford: Oxford
University Press.
Danielsen, A. and I. Helseth (2016), ‘Mediated Immediacy: The Relationship between
Auditory and Visual Dimensions of Live Performance in Contemporary Technology-Based
Popular Music’, Rock Music Studies, 3 (1): 24–40.
Danielsen, A., K. Nymoen, E. Anderson, G. S. Câmara, M. T. Langerød, M. R. Thompson
and J. London (2019), ‘Where Is the Beat in That Note? Effects of Attack, Duration, and
Frequency on the Perceived Timing of Musical and Quasi-Musical Sounds’, Journal of
Experimental Psychology: Human Perception and Performance, 45 (3): 402–418.
D’Errico, M. (2015), ‘Off the Grid: Instrumental Hip-Hop and Experimentalism After the
Golden Age’, in J. A. Williams (ed.), The Cambridge Companion to Hip-Hop, 280–291,
Cambridge: Cambridge University Press.
Frith, S. (1986), ‘Art Versus Technology: The Strange Case of Popular Music’, Media, Culture
and Society, 8 (3): 263–279.
Gordon, J. W. (1987), ‘The Perceptual Attack Time of Musical Tones’, The Journal of the
Acoustical Society of America, 82 (1): 88–105.
Hodgson, J. (2011), ‘Lateral Dynamics Processing in Experimental Hip Hop: Flying Lotus,
Madlib, Oh No, J-Dilla and Prefuse 73’, Journal on the Art of Record Production, (5).
Available online: http://arpjournal.com/lateral-dynamics-processing-in-experimental-hip-
hop-flying-lotus-madlib-oh-no-j-dilla-and-prefuse-73/ (accessed 18 May 2018).
Hofer, M. (2013), ‘Pro Tools Hardware History Video’, YouTube, 27 March. Available online:
https://www.youtube.com/watch?v=ENeYnkp3RrY (accessed 14 May 2018).
Holmes, T. (2012), Electronic and Experimental Music. Technology, Music, and Culture, 4th
edn, New York: Routledge.
Kjus, Y. and A. Danielsen (2016), ‘Live Mediation: Performing Concerts Using Studio
Technology’, Popular Music, 35 (3): 320–337.
Kvifte, T. (2007), ‘Digital Sampling and Analogue Aesthetics’, in A. Melberg (ed.), Aesthetics at
Work, 105–128, Oslo: Unipub.
Musicradar (2011a), ‘A Brief History of Pro Tools’, Future Music, 30 May. Available online:
https://www.musicradar.com/tuition/tech/a-brief-history-of-pro-tools-452963/ (accessed
14 May 2018).
Musicradar (2011b), ‘A Brief History of Steinberg Cubase’, Future Music, 24 May. Available
online: http://www.musicradar.com/tuition/tech/a-brief-history-of-steinberg-
cubase-406132/ (accessed 14 May 2018).
Oliver, R. A. (2015), ‘Rebecoming Analogue: Groove, Breakbeats and Sampling’, PhD thesis,
University of Hull, Hull.
Roads, C. (1996), The Computer Music Tutorial, Cambridge, MA: MIT Press.
Schaeffer, P. (2004), ‘Acousmatics’, in C. Cox and D. Warner (eds), Audio Culture. Readings in
Modern Music, 76–81, New York: Continuum.
Smith, D. and C. Wood (1981), ‘The “USI”, or Universal Synthesizer Interface’, in Audio
Engineering Society Convention 70, Audio Engineering Society E-library. Available online:
http://www.aes.org/e-lib/browse.cfm?elib=11909 (accessed 9 August 2019).
Wright, M. (2008), ‘The Shape of an Instant: Measuring and Modeling Perceptual Attack Time
with Probability Density Functions’, PhD thesis, Center for Computer Research in Music
and Acoustics (CCRMA), Stanford University, Stanford, CA.
Zagorski-Thomas, S. (2010), ‘Real and Unreal Performances: The Interaction of Recording
Technology and Rock Drum Kit Performance’, in A. Danielsen (ed.), Musical Rhythm in the
Age of Digital Reproduction, 195–212, Farnham: Ashgate.
Zeiner-Henriksen, H. T. (2010), ‘The “PoumTchak” Pattern: Correspondences Between
Rhythm, Sound, and Movement in Electronic Dance Music’, PhD thesis, University of
Oslo, Oslo.

Discography
Brandy (2002), [CD] ‘What About Us’, Full Moon, Atlantic.
D’Angelo (1999), [CD] Voodoo, Cheeba Sound/Virgin.
Destiny’s Child (2001), [CD] ‘Nasty Girl’, Survivor, Columbia.
DJ Food (2000), [CD] ‘Break’, Kaleidoscope, Ninja Tune.
J Dilla (2006), [CD] Donuts, Stones Throw.
Snoop Dogg (2004), [CD] R&G (Rhythm & Gangsta): The Masterpiece, Geffen.
Squarepusher (2001), [CD] ‘My Red Hot Car’, Go Plastic, Warp Records.
Squarepusher (2001), [CD] ‘My Red Hot Car (Girl)’, My Red Hot Car, Warp Records.
Tommy Tee (2016), [CD] ‘Going On (feat. Mike Zoot)’, Bonds, Beats & Beliefs vol. 2, Tee
Productions.
The Winstons (1969), [7” vinyl] ‘Amen Brother’, Color Him Father/Amen Brother, Metromedia
Records MMS-117.
18
The Boom in the Box: Bass and
Sub-Bass in Desktop Production
Robert Fink

Mia Hansen-Løve’s 2014 film Eden is an elegiac look back at the rise and fall of the millennial
house scene as seen through the career of Paul Vallée, a fictional French DJ-producer
based on her own brother, Sven. Unlike his buddies Daft Punk, who make periodic cameo
appearances, Paul/Sven doesn’t make it big, but his artistic integrity is never in doubt, nor
is his almost religious connoisseurship of sound production. In a gently mordant scene
about halfway into the film, Paul and a collaborator sit, bleary-eyed, in front of a computer
monitor in his dingy apartment, trying to choose from among 250 kick drum samples and
thereby get a track started. After rejecting one as too ‘heavy, too fat’ and another as weak,
‘like a rabbit fart’ (comme un pet de lapin), they seem to be closing in on something, only
to lose confidence as they re-audition the sample:
Paul  Try 29 – I think I liked it. No […] too feminine.

Arnaud  Too feminine? It’s not bad. [listens] I see what you mean. It’s thin, superficial.

Paul  A little stingy […]

The spectacle of two grown men arguing over the precise sonic ‘feel’ of brief, almost
identical bursts of low-frequency noise like wine lovers enthusing over grapes is inherently
funny, and it neatly foreshadows the personal obsessiveness and inability to compromise
that will limit Paul’s career as a dance music entrepreneur. But most house producers
would, ruefully, recognize the basic rightness of this cinematic portrayal of what it’s like to
take electronic dance music seriously: the spread of digital tools, especially the ability to
chain and layer software instruments as plug-ins inside the framework of a digital audio
workstation (DAW), has put incredible power over sound in the hands of even the most
underground musicians, and with the endless possibilities come extremely high standards
for the functionality of individual sounds – like kick drum hits, whose timbre can and must
be systematically analysed, synthesized and manipulated to serve the exacting needs of the
groove and the mix while maintaining generic integrity.
The other response of the contemporary dance producer would be that we have come a
long way since the 1990s, when producers needed hundreds of sampled kicks because they
didn’t really understand how, beyond a few elementary analogue techniques such as filtering
and envelope generation, to work efficiently with the ones they had. Post-millennial sub-
bass heavy genres such as UK garage and dubstep threaten the producer-as-connoisseur:
with more musical information pushed further down into the sub-bass register, precise
sonic control of the deepest parts of a mix, the kick and bassline, became something more
like science than art. What actually goes on in the sub-bass register of a dance track? And
how do contemporary desktop producers navigate these new and (sometimes) dangerous
depths?

A historical ecology of sub-bass


In contemporary electronic music production, the deepest layers of bass begin at
about 100 Hz, and extend right down to the haptic boundary at 18–25 Hz, where the
perception of sound shades off into a sensation of touch (Goodman 2009; Henriques
2011). Although the deepest fundamentals of acoustic music reach almost this low, for
most of recording and broadcast history, frequencies below 100 Hz were uniformly
attenuated to protect amplifiers and speakers from the deleterious effects of mechanical
noise and harmonic distortion (Read 1952; Millard 1995). Placing complex musical
information into this sub-bass range was only possible with the vanishingly low noise floor of digital production and reproduction. Styles of contemporary music that focus
on the sub-bass register actually favour desktop producers, since working entirely ‘in-
the-box’, with synthesized, sequenced and sampled audio, minimizes the possibility that
bad equipment or faulty recording technique will introduce damaging low-frequency
artefacts to the mix.
Producers of bass-dominated music recognize three distinct regions of sub-bass, referred
to functionally or by the region of the body in which the corresponding frequencies are
imaginatively felt (Fink 2018: 104–108). The ‘thump’ or ‘meat’ of sub-bass is centred in a
±10 Hz band above and below 50 Hz, not coincidentally almost the precise frequency range
occupied by the Roland TR-808 analogue kick that anchors old-school house and hip-
hop. Below this, from 25–35 Hz, one talks about ‘boom’, the semi-audible vibration in the
gut felt during the deepest drops in dancehall and dubstep. ‘Punch’ or ‘smack’, a sharper
pressure associated with percussive transients felt in the chest or face, starts around 75 Hz
and extends a full octave up into the lower end of the bass register (150–175 Hz). Each sub-
bass register represents a distinct ecological niche for which a species of sub-bass speaker
has been functionally refined: punch and smack come from old-fashioned folded horns,
which throw sound directionally and work best in the transitional upper-sub register,
while boomy drops are usually assigned to simple but massive bass reflex designs, which
can be precisely tuned to reproduce a narrow range of the lowest audible frequencies. The
all-important ‘meat’ of the kick is often channelled through complex bandpass enclosures –
but most important is that the whole ensemble, which, in the largest sound systems, can
encompass as many as eighteen to twenty-four sub-bass drivers, is designed as a tight
frequency resonance ladder where each breed of speaker handles the sliver of the sonic
spectrum to which it is best adapted (Dickason 2006).
For a generation of desktop musicians familiar with this neatly segmented terrain, all
sub-bass sounds are thus by definition abstract, decomposable into quantifiable response
curves applied to narrow bands of low-frequency energy. Intuitive spectral analysis of sub-
bass kicks into discrete sonic functions – does this drum sample have the right combination
of initial smack, meaty sustain and booming decay? – has been supplemented by easy access
to sophisticated digital tools for visualizing and shaping sound. No longer must an aspiring
producer make or accumulate dozens of sound clips, then painstakingly train the ear to
remember a hundred different combinations of punch, thump and boom. If a particular
kick is too ‘fat’ (too much energy in the sub-35 Hz register), or thin and ‘stingy’ (nice attack,
but not enough 50 Hz sustain), this can be diagnosed with a spectral analyser, treated
quickly with the appropriate digital signal processing (DSP) and the work of production
continued efficiently. As Computer Music’s popular Production Manual 2012 put it, ‘if
you have control over a sound at the harmonic level and you know what properties the
different frequency areas in the mix need, then your job is an easier one’ (2012: 145).
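A minimal sketch of such a spectral diagnosis, written here in Python with NumPy purely as an illustration (the band edges are indicative approximations of the ‘boom’, ‘thump’ and ‘punch’ regions named above, not fixed standards), might proceed as follows:

import numpy as np

def band_energy_report(kick, sr=44100):
    # Power spectrum of the sample and its frequency axis.
    spectrum = np.abs(np.fft.rfft(kick)) ** 2
    freqs = np.fft.rfftfreq(len(kick), d=1.0 / sr)
    bands = {'boom (25-35 Hz)': (25, 35),
             'thump (40-60 Hz)': (40, 60),
             'punch (75-175 Hz)': (75, 175)}
    total = spectrum.sum() + 1e-12
    # Share of the total energy falling in each named sub-bass region.
    return {name: float(spectrum[(freqs >= lo) & (freqs < hi)].sum() / total)
            for name, (lo, hi) in bands.items()}

# Illustration: a decaying 50 Hz sine reads, as expected, as mostly 'thump'.
sr = 44100
t = np.arange(int(0.5 * sr)) / sr
print(band_energy_report(np.sin(2 * np.pi * 50 * t) * np.exp(-6 * t), sr))
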

Harmonic layering, sub-bass synthesis and transient shaping
One of the most venerable production techniques for bass is layering: the addition of
spectral content through instrumental doubling, synthesis or the use of multiple samples
at once. In the analogue era, when sound reproduction stopped at the sub-bass border,
recording musicians developed layering to encourage the ear as it interpolated fundamental
frequencies that weren’t actually present in the mix. Bebop pioneers such as Slam Stewart
and Major Holley used to hum along one or two octaves above their bowed bass solos,
creating a ‘rich, eerie sound’ adapted to the narrow passband of early electrical recording
(Palmer 1987); the even stingier bass response of 1960s AM radio led Motown producers
to double James Jamerson’s driving Fender bass ostinatos with layers of bowed strings and
the woody rattle of the marimba. At the other end, the first disco sound systems capable
of reproducing sonic content down to 20 Hz revealed that even the ‘heaviest’ funk and
soul records of the 1970s carried little useful information in that lowest octave. The secret
weapon at venues such as New York’s Paradise Garage was a subharmonic synthesizer like the dbx Model 100 ‘Boom Box’, which fed synthesized tones a full octave below the existing bass into an array of state-of-the-art folded horns, using what was, for the time, cutting-edge signal processing (Figure 18.1).
Figure 18.1  The dbx Model 100 ‘Boom Box’ Subharmonic Synthesizer. Source: dbx
Corporation (1980).

A looser kind of subharmonic doubling had, since the dawn of the rock era, been a
closely held producer’s secret for adding mid-sub meat to a poorly recorded kick drum:
before the advent of synthesizers, an engineer could at least commandeer one of the studio’s
test oscillators to generate a low-frequency sine wave in the 40–60 Hz range, run it through a
gating circuit and then patch the kick drum back into the gate through a ‘side chain’ in order
to trigger it (‘Get Your Kicks’ 2013). This technique was perfected using hardware synths in
the old-school rave era, and today the digital sidechaining of layered subharmonics under a
looped kick drum sample is effortless, and thus, in principle, every kick can have a precisely
tuneable analogue ‘boom’ characteristic of classic drum machines. If setting up a side chain
in the mixer is too complicated, a desktop producer can rely on single-use software plug-
ins such as Brainworx’s bx_boom! at the mastering stage. Deliberately evoking the original
dbx Boom Box, this virtual instrument has only two rotary controls, in this case mapped to
cartoon images of the kick drum head and foot pedal, respectively (Figure 18.2).
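As a rough illustration of the oscillator-and-gate technique described above, the sketch below (in Python with NumPy, making its own simplifying assumptions about threshold and release values rather than reproducing any studio patch) opens a fixed 50 Hz sine only while the envelope of a kick signal exceeds a threshold, then layers the result back under the original:

import numpy as np

def sidechain_sub_layer(kick, sr=44100, sub_hz=50.0, threshold=0.1, release_ms=80.0):
    # Crude envelope follower: rectification plus a smoothed release.
    release = np.exp(-1.0 / (sr * release_ms / 1000.0))
    env = np.zeros_like(kick)
    level = 0.0
    for i, x in enumerate(np.abs(kick)):
        level = max(x, level * release)
        env[i] = level
    # Gate keyed by the kick: the sub only sounds while the kick is sounding.
    gate = (env > threshold).astype(float)
    t = np.arange(len(kick)) / sr
    sub = np.sin(2 * np.pi * sub_hz * t) * gate * env
    return kick + 0.5 * sub  # blend to taste

# Illustration: add 50 Hz weight under a short, clicky synthetic kick.
sr = 44100
t = np.arange(int(0.4 * sr)) / sr
layered = sidechain_sub_layer(np.sin(2 * np.pi * 120 * t) * np.exp(-25 * t), sr)
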
Turning the first knob controls the amount of ‘boom’, i.e. bass energy, added or subtracted,
using a complex set of algorithms for virtual mid/side filtering to create the equivalent
of a dynamically loaded bandpass filter in the sub-bass range. (Mid/side recording, first
patented in 1934, combines a cardioid (mid) and a bidirectional (side) microphone to
recreate the sensation of audio ‘width’ in a mono signal (Ballou 2002: 447–452); as a signal
processing technique, it allows a digital producer to boost or attenuate the perceived ‘low
centre’ of a mix, where the kick drum sits, without causing muddiness or harsh frequency
spikes across the stereo image.) The other dial puts bx_boom! into one of three modes,
allowing a track’s sub-bass resonance to be ‘tuned’ within the LOW (36 Hz), MID (48 Hz) and HI (64 Hz) octaves, exactly following the tripartite ‘boom, thump, punch’ mapping outlined above.

Figure 18.2  A kick drum enhancer using virtual mid/side filtering for dynamic equalization.
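The mid/side principle just invoked can be stated compactly: the mid channel is the sum of left and right, the side channel their difference, and anything applied only to the mid channel leaves the stereo difference signal untouched. The following minimal sketch, offered as an illustration rather than a description of Brainworx’s actual algorithms (a one-pole low-pass stands in for a proper shelving filter), boosts the low end of the mid channel alone:

import numpy as np

def one_pole_lowpass(x, sr, cutoff_hz):
    # First-order low-pass used to isolate the low end of the mid channel.
    a = np.exp(-2 * np.pi * cutoff_hz / sr)
    y = np.zeros_like(x)
    state = 0.0
    for i, sample in enumerate(x):
        state = (1.0 - a) * sample + a * state
        y[i] = state
    return y

def boost_low_centre(left, right, sr=44100, cutoff_hz=80.0, gain_db=3.0):
    # Encode to mid/side, lift only the mid channel's lows, decode back to left/right.
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right)
    extra = (10 ** (gain_db / 20.0) - 1.0) * one_pole_lowpass(mid, sr, cutoff_hz)
    mid = mid + extra
    return mid + side, mid - side
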
Brainworx actually uses this throwaway plug-in, a simple skinning of a single setting
from their generic bx_dynEQ software plug-in, as a gateway to increasingly elaborate levels
of low-frequency control. (The cartoon-level simplicity maps onto a simplified two-factor
model of kick drum ‘weight’ in the metaphorical way discussed in Zagorski-Thomas 2018.)
Slightly more functionality is available in the configurable bx_subfilter: here the bandpass
effect comes with low, high and ‘extreme’ settings, and the resulting quantum of ‘tight punch’
is freely movable between 20 and 60 Hz. The subfilter function, in turn, is an essential
part of Brainworx’s bx_subsynth, designed to model old-school hardware subharmonic
synthesizers such as those once built by dbx. But ubiquitous DSP has transformed what
was a simple, if expensive, analogue boom box into a cheap and virtual sub-bass laboratory.
Before ‘tight punch’ and ‘low end’ filters are activated, a producer can add and individually
contour three separate subharmonic strata, one for each sub-bass register (24–36 Hz, 36–
56 Hz and 56–80 Hz), each layer dynamically resynthesized from ‘dominant’ frequencies
precisely one octave higher in the mix.
With its suggestively named ‘Edge’ function (Figure 18.3), the bx_subsynth moves
beyond its hardware inspirations, incorporating a dynamic compressor or transient shaper
whose onset can be smooth or harsh, and whose effective range can be dialled in using
matched high-pass and low-pass filters. Its primary function is to normalize and ‘smooth
out’ the subharmonic oscillations of a mix through signal saturation and even slight overdrive, allowing a sound system to pump out sub-bass at peak efficiency:
This section is capable of manipulating the signal to make it tighter (reducing the level of reverb or room info between peaks), less dynamic (taming transients, for example), or even distorted, depending on your source signal and settings. Please be creative here and experiment; there are no rules when you’re this close to the Edge. (Brainworx n.d.)

Figure 18.3  Complete signal path for the Brainworx bx_subsynth plug-in.

No fixed rules, perhaps, but the promise is still of control, of tighter tolerance when mixing
and mastering at the limits of human hearing and driver specifications. The algorithms
take a desktop producer close to the edge, but never over it, where uncontrolled digital
reverberation, stereo phase cancellation, and other exotic threats to effective sound
design lie in wait to rob the mix of its sonic coherence and power. Dynamic equalization,
compression and transient shaping at the mixing stage have become an integral and
cost-effective part of desktop production. An industry-standard suite of mixing plug-ins like those in iZotope’s Neutron 3 is designed for studio producers, and it can fix much
bigger problems in a mix than a kick drum that sounds like a rabbit fart; but entry-level
versions start at US $129, and, as dozens of online tutorials show, there is now no need
to spend hours auditioning hundreds of samples to find the right sound. Let’s imagine a
2019 sequel to our opening movie vignette – call it Return to Eden – in which producer
protagonist Paul Vallée returns to the same folder of unsatisfactory bass samples that
stymied him in 1994: today, he could pick one, any one, and use Neutron’s pinpoint
equalization tools and an extraordinarily flexible three-band transient shaper to add the
precise amount of bass weight needed to fix its ‘stingy’, ‘superficial’ or ‘feminine’ sound,
freeing him up to … spend some time contemplating his own casual yet ingrained
sexism, perhaps?

Assembling the ‘perfect kick drum’


Neutron’s transient shaper, like Brainworx’s entire suite of sub-bass plug-ins, is simple to
use because it depends on an internal side chain to decide when and how much to ‘squeeze’
the low-frequency signal:
Unlike many dynamic tools that aim at certain peaks (transients) in your signal, bx_boom!
actually FILTERS the bass-drum ‘punch’ [… It] is NOT just boosting low end frequencies
all the time, but is a ‘dynamic EQ process’ that is being ‘triggered’ by (internal) compressors
with heavy side chaining features. (Brainworx n.d.: 4)

The power of dynamic shaping circuits is that they allow a bass drum sound to ‘adjust
itself ’ millisecond by millisecond through the tight negative feedback loop inherent in the
side-chain relationship. But there are many instances where desktop producers want to
intervene directly in the way a kick drum sample changes over time; often it is desirable
to aim right at the most notable peaks in the signal, especially the all-important initial
transient when the (virtual) beater hits the (virtual) drum. A typical such case arises
when layering multiple samples or analogue patches to create a customized kick drum
sound. Computer Music devoted a full five-page spread of its 2012 summary special issue
on desktop production techniques to the search for ‘The Perfect Kick Drum’, flattering
its readers with the observation that ‘samplists have always been adventurous and open
minded with sound sources, mixing and matching samples and synth tones to create
powerful kicks’ (‘The Perfect Kick Drum’ 2012: 85–90).
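One common construction for such a dynamic shaping tool, distinct from the sidechained dynamic EQ quoted above, compares a fast envelope follower with a slow one: wherever the fast envelope exceeds the slow one, the signal is in an attack transient and that portion can be boosted or tamed. The sketch below illustrates this general approach only; it is not a reconstruction of the Brainworx or iZotope processors discussed here:

import numpy as np

def envelope(x, sr, attack_ms, release_ms):
    # One-pole envelope follower with separate attack and release time constants.
    atk = np.exp(-1.0 / (sr * attack_ms / 1000.0))
    rel = np.exp(-1.0 / (sr * release_ms / 1000.0))
    env = np.zeros_like(x)
    level = 0.0
    for i, s in enumerate(np.abs(x)):
        coeff = atk if s > level else rel
        level = coeff * level + (1.0 - coeff) * s
        env[i] = level
    return env

def transient_shaper(x, sr=44100, amount=0.5):
    # Positive 'amount' exaggerates attacks; negative values soften them.
    fast = envelope(x, sr, attack_ms=1.0, release_ms=50.0)
    slow = envelope(x, sr, attack_ms=30.0, release_ms=50.0)
    transientness = np.clip((fast - slow) / (slow + 1e-9), 0.0, 4.0)
    return x * (1.0 + amount * transientness)
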
The search for the perfect kick enables a characteristically functionalist, objectified and
analytical approach to sound. In desktop production, our complex response to a ‘live’ kick
drum is abstracted into what we can reverse-engineer operationally from observed practice
as three ‘kick functions’, each corresponding to one of the physical sensations listeners
should get from a properly managed sub-bass kick: K(punch), the initial, transient-heavy
attack, essential for cutting through a dense mix; K(thump), responsible for the sensation of
weight, and tuneable to avoid interference with other sub-bass information; and K(boom),
a sweeping resonance drop characteristic of large drumheads but also good for dramatic
transition effects from beat to beat. (The attentive reader will notice that each of these
functions also corresponds to one of the three frequency ranges of the sub-bass register
and to a type of subwoofer optimized for that sensation on the surface of the body.) It
might seem simple to graft the attention-grabbing 150 Hz punch from one sampled kick
onto the meaty 45 Hz thump of another, but due to the interactive relationship of the three
kick functions in the mix, just triggering two or three samples simultaneously probably
won’t produce acceptable results: ‘The relationship between the attack phase and the body
of the kick is in such delicate balance that it would be unreasonable to expect multiple
attack sounds at different speeds and pitches to automatically work together in unison’
(‘The Perfect Kick Drum’ 2012: 89).
Computer Music’s ‘The Perfect Kick Drum’ provides step-by-step instructions for
integrating and maximizing K(punch), K(thump) and K(boom) using desktop software. At
the most basic level, aspiring producers should learn how to use transient shaping to meld
kick drum samples without destructive interference. It may be possible to cut and splice
the two kicks together at just the right point to preserve one’s punch and another’s thump,
but – as audio engineers and beat slicers using the earliest generation of digital samplers
already knew – to avoid an audible click both samples have to be sliced precisely, as their
digital waveform crosses a zero amplitude point. An alternative path involves layering two
samples while using a DAW’s fader and filter control curves to mute everything but the
first track’s hit and the second’s weight. The same kind of control curve, applied to pitch,
can be used to tune the meat of a kick, or extend the resonance of its boom, while leaving
the distinctive attack untouched. (Novice producers are advised to use a spectral analyser
to visualize the fundamental frequency at which the sample resonates and where in the
sub-bass register it should sit after processing, then draw in a digital glissando that will
pitch the audio stream up or down just after the initial transients dissipate.) This kind of
K(thump, boom) tuning can keep the kick drum from crowding any other material in the
lowest octave; the inverse effect, where the K(punch) is grafted onto a heavily distorted
bass line, can be achieved by adding a wild upward spike of pitch modulation to the first
milliseconds of a synthesized bass patch.
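The zero-crossing rule is simple to state in code: find the sample nearest the desired splice point at which the waveform crosses zero and cut there, so that neither layer introduces a discontinuity. The following sketch is an illustration only, with an arbitrary fifteen-millisecond splice point:

import numpy as np

def nearest_zero_crossing(x, index):
    # Index of the sign change closest to the requested splice point.
    crossings = np.where(np.diff(np.sign(x)) != 0)[0]
    if len(crossings) == 0:
        return index
    return int(crossings[np.argmin(np.abs(crossings - index))])

def splice_kicks(punch_kick, thump_kick, sr=44100, splice_ms=15.0):
    # Graft one kick's attack onto another's body, cutting both at zero crossings.
    target = int(sr * splice_ms / 1000.0)
    cut_a = nearest_zero_crossing(punch_kick, target)
    cut_b = nearest_zero_crossing(thump_kick, target)
    return np.concatenate([punch_kick[:cut_a], thump_kick[cut_b:]])
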
More ambitious desktop producers will easily master the tutorial on building a kick
drum patch from scratch, and, in so doing, take a mini-course in shaping sound: start
with a sine wave, construct initial attack and decay, control the concomitant rise and fall
of drum head resonance, then add distortion and white noise to simulate the inharmonics
of the beater. Having mastered the solfège of the (sub-bass) sound object, the truly
adventurous are ready to think analytically about any sound at all: ‘It can be a useful
exercise to analyze the frequency content and dynamic shape of some tried-and-tested
kick sounds […] if a pitch envelope can turn a simple sine wave into a usable kick, imagine
what it can do for a more obscure sound’ (‘The Perfect Kick Drum’ 2012: 87). But there will
always be producers who long for a dedicated kick drum construction kit. Programmable
analogue drum machines have existed for almost a half-century; the convergence of audio
sampling and musical instrument digital interface (MIDI) sequencing allowed producers
to customize the sounds they arranged on the digital grid; now, custom drum plug-ins with
names like Punch, Tremor and Nerve merge synthesis and sampling into one hyperflexible
process. As the dance tech magazine Attack noted when lauding one of the cult favourites
in this category, Sonic Charge’s µTonic, this software ‘doesn’t make any claims to recreate
analog classics, or follow strict rules about how sounds should be created’ (‘Ten of the Best:
Drum Synth Plugins’ 2015).
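The from-scratch recipe summarized above (a sine wave, pitch and amplitude envelopes, a burst of noise for the beater) reduces to a few lines of code. The values below are illustrative choices, loosely echoing the logarithmic pitch descent described for Sonic Academy’s KICK later in this chapter rather than any plug-in’s actual preset:

import numpy as np

def synth_kick(sr=44100, length_s=0.7, start_hz=1500.0, end_hz=35.0):
    n = int(sr * length_s)
    t = np.arange(n) / sr
    # Logarithmic pitch drop; integrating frequency gives the oscillator's phase.
    pitch = np.geomspace(start_hz, end_hz, n)
    phase = 2.0 * np.pi * np.cumsum(pitch) / sr
    body = np.sin(phase) * np.exp(-5.0 * t)                     # decaying body of the drum
    rng = np.random.default_rng(0)
    click = 0.3 * rng.standard_normal(n) * np.exp(-400.0 * t)   # noise burst for the beater
    kick = body + click
    return kick / np.max(np.abs(kick))                          # normalize to full scale

kick = synth_kick()
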

The (desktop) recording studio as (sub-bass) compositional tool
Given the fundamental importance of the kick across all styles of Afro-diasporic pop
music, it is not surprising that Attack’s round-up of the ten best drum synth plug-ins of
2015 mentions no fewer than four dedicated kick drum plug-ins, ranging from the cheap-
and-cheerful (GuDa’s US$19 KickR, which allows shaping of the three basic kick functions
along with some sliders for equalization and distortion) to the almost comically elaborate:
Vengeance Sound’s US$119 Metrum, which buys access to over a thousand genre-specific
sample sets (e.g. ‘German Clubsnd 3’) which can be loaded into three independently
editable layers of processed audio, plus a full complement of sinewave oscillators, filters
and envelope shapers, all patched into a modulation matrix that allows any parameter to
control any other. This may seem like a return to the bad old days of sitting for hours in
front of a screen auditioning fractionally distinguishable samples, but Metrum’s pitch is just
the opposite: ‘How many times have you spent countless hours previewing and searching
for the perfect kick drum to suit your musical project, literally wasting valuable time and
sometimes wasting musical inspiration? Those days are long gone with Metrum – build a
kick drum in minutes, not hours’ (Keilworth Audio 2012).
The fact that Metrum’s designers were forced to put a large RANDOMIZER button at
the top of their crowded interface tends to falsify their claim – and Brian Eno (1976) could
have predicted this – that providing total software control alleviates creative paralysis. But
some single-purpose bass drum synthesizers, such as Sonic Academy’s KICK, do empower
the desktop producer as they make her access to the key structural functions of sub-bass
kicks elegantly clear.
Figure 18.4 shows the pitch editing screen of this software plug-in, from which, thanks to
careful interface design, it is possible to ‘read’ the complex sound profile under construction
as if from a musical score. One immediately sees that the entire kick sample is generated
by a logarithmic descent in pitch, first almost instantaneously from the top of the audio
range to the Fletcher-Munson peak of maximum audibility (from 20 kHz down to 1.557 kHz), then quickly
to the upper mid-bass (251 Hz) – this is the ‘click’ and ‘punch’ – then much more slowly,
over about 250 milliseconds, into the middle of the sub-bass register (70 Hz) – the ‘thump’
or body of the kick – and, finally, a relatively slow slide – the ‘boom’ – over the next 400
milliseconds down to the haptic boundary at 20 Hz. One also intuits how this white curve
could be reshaped by grabbing one of the ‘handles’ and moving it: up or down to change
the actual pitch, and side to side to set inflection points in the rate of change. It’s easy to see,
also, that the kick sample dies out too fast for the full descent to be heard (the blue wave)
and that, if this is a problem, a subharmonic synthesizer lies ready at the lower right.

Figure 18.4  Sonic Academy KICK 2 drum synthesizer, main control panel.
Contrast this highly abstract computational interface with the almost childishly
instrumental design of bx_boom!: Brainworx shows you two knobs made to look like parts
of a ‘real’ kick drum that work unpredictably and conceal the complex web of algorithms
working ‘underneath’; Sonic Academy’s KICK, much the better mentor, subordinates its
remediating aspect (the familiar knobs) to a new, dramatic presentation of an unseen kick’s
progress through the abstract pitch-time continuum within which its DSP algorithms do
their work. One shows you a drum and tells you nothing; the other shows you what to do
when it’s time to leave that drum metaphor behind and deal with sub-bass sound on its own
terms. My dichotomies here follow D’Errico’s thesis ‘Interface Aesthetics: Sound, Software,
and the Ecology of Digital Audio Production’ (2016) (currently the most thoroughgoing
conceptual study of music software interface design); D’Errico notes that, ‘while plugins
represent existing musical tools and techniques, DAWs also offer the unique capability of
abstracting what might be considered more traditional “musical” tools and techniques,
thus presenting new possibilities for digital composition’ (43). D’Errico was probably not
thinking about the quest for the perfect kick drum when he wrote that sentence, but if
the current survey has had its intended effect, it will not seem outlandish to quote it here.
Sculpting the sub-bass register, at the edge of human hearing, may involve the irreducibly
physical (Henriques 2011), but it is also inherently abstract.

Bibliography
Ballou, G. (2002), Handbook for Sound Engineers, 3rd edn, Boston, MA: Focal Press.
Brainworx (n.d.), Manual bx_boom!. Brainworx Music & Media GmbH. Available online:
https://files.plugin-alliance.com/products/bx_boom/bx_boom_manual.pdf (accessed
2 August 2019).
Computer Music (2012), Production Manual 2012, Bath: Future Publishing.
dbx Corporation (1980), dbx Model 100 ‘Boom Box’ Sub Harmonic Synthesizer Instruction
Manual, Waltham, MA. Available online: https://www.hifiengine.com/manual_library/
dbx/100-boom-box.shtml (accessed 10 June 2019).
D’Errico, M. (2016), ‘Interface Aesthetics: Sound, Software, and the Ecology of Digital Audio
Production’, PhD thesis, University of California, Los Angeles.
Dickason, V. (2006), The Loudspeaker Design Cookbook, 7th edn, Peterborough, NH: Audio
Amateur.
Dressler, S. (2014), ‘Brian Eno 1979 Lecture The Recording Studio as Compositional Tool
Lecture’, YouTube, 5 September. Available online: https://youtu.be/E1vuhJC6A28 (accessed
13 March 2019).
Eden (2014), [Film] Dir. M. Hansen-Løve, Palace Films.
Eno, B. (1976), ‘Generating and Organizing Variety in the Arts’, Studio International 984
(Nov./Dec.), 279–283. Reprinted in G. Battock, ed. (1981), Breaking the Sound Barrier:
A Critical Anthology of the New Music, New York: Dutton.
Fink, R. (2018), ‘Below 100 Hz: Toward a Musicology of Bass Culture’, in R. Fink, M. Latour
and Z. Wallmark (eds), The Relentless Pursuit of Tone: Timbre in Popular Music, 88–118,
New York: Oxford University Press.
‘Get Your Kicks’ (2013), Sound on Sound, December. Available online: https://www.
soundonsound.com/techniques/get-your-kicks (accessed 10 June 2019).
Goodman, S. (2009), Sonic Warfare: Sound, Affect, and the Ecology of Fear, Cambridge, MA:
MIT Press.
Henriques, J. (2011), Sonic Bodies: Reggae Sound Systems, Performance Techniques, and Ways
of Knowing, London: Continuum.
Keilworth Audio (2012), ‘Vengeance Producer Suite/Metrum Official Product Video’,
YouTube, 31 January. Available online: https://youtu.be/qiARjNZjJOA (accessed 4 July
2019).
Millard, A. (1995), America on Record. A History of Recorded Sound, Cambridge: Cambridge
University Press.
Palmer, R. (1987), ‘Slam Stewart, 73, a Jazz Bassist Known for Singing with His Solos’,
[obituary] New York Times, 11 December.
‘The Perfect Kick Drum’ (2012), Computer Music, Production Manual 2012, 85–89, Bath:
Future Publishing.
Read, O. (1952), The Recording and Reproduction of Sound. A Complete Reference Manual for
the Professional and the Amateur, 2nd edn, Indianapolis, IN: Howard W. Sams & Co.
‘Ten of the Best: Drum Synth Plugins’, Attack, 22 May 2015. Available online: https://www.
attackmagazine.com/reviews/the-best/ten-of-the-best-drum-synth-plugins/ (accessed
26 June 2019).
Zagorski-Thomas, S. (2018), ‘The Spectromorphology of Recorded Popular Music: The
Shaping of Sonic Cartoons through Record Production’, in R. Fink, M. Latour and
Z. Wallmark (eds), The Relentless Pursuit of Tone: Timbre in Popular Music, 345–366, New
York: Oxford University Press.
19
Maximum Sonic Impact:
(Authenticity/Commerciality)
Fidelity-Dualism in Contemporary
Metal Music Production
Mark Mynett

Introduction
Metal is part of the Westernised, commercial pop and rock music industry that has imposed
itself on the rest of the world […] metal has played and continues to play a key role in the
globalised entertainment industries.
(Hill and Spracklen 2010: vii)

The term ‘heavy metal’ was first used as an adjective relating to popular music in the late 1960s; however, in the early 1970s the expression began to be employed as a noun and
therefore as a descriptor for a music genre (Walser 1993: 7). Heavy metal, more recently
referred to simply as metal, has therefore existed for approximately five decades. Given the
longevity of this music’s appeal, it is significant to note that there was little critical discourse
on the genre prior to the early 1990s (Bennett 2001: 42; Phillipov 2012: xi). However,
in the past seven years there has been a dramatic increase in the number of academics
researching and studying the area (Scott and Von Helden 2010: ix). This is evidenced
by the world’s first scholarly conference on the metal genre, ‘Heavy Fundametalisms –
Music, Metal and Politics’, held in Salzburg, Austria, in 2008 (Sheppard 2008). To date,
the focus of this academic study has tended to address the importance and relevance of
metal from a historical, sociological, anthropological, cultural, musicological and political
science perspective (e.g. Weinstein 1991; Walser 1993; McIver 2000, 2005; Kahn-Harris
2007). Additionally, Weinstein points to metal studies drawing on the fields of economics,
literature, communication and social psychology (2011: 243). Therefore, as Pieslak notes,
the existing literature on heavy metal is mainly dedicated to its culture and transformations,
rather than the music itself (2008: 35). Furthermore, academic exploration into the
processes of music production, more generally, can be viewed as being in an embryonic
phase. Much of the discussion, as Howlett highlights, is ‘marginalised as an incidental,
or peripheral observation’, due to discourse often being found in texts presented from
cultural theory perspectives, or those that focus on the history of recording and the impact
that the arrival of recording technology had from a sociological viewpoint (2009: 7). Of
particular relevance here is that comprehensive study into procedural methodologies
for the production of contemporary metal music is ‘virtually non-existent’ (Turner
2012: ii). By focusing on the qualities – as well as the associated processes, approaches
and techniques – that provide recorded and mixed contemporary metal music (CMM)
with maximum sonic impact, the author intends that this chapter partially addresses this
apparent literary gap.

Value judgements
Frith suggests that academics have a duty to make value judgements rather than evade
them (1996: 8). He proposes that:
Popular cultural arguments […] are not about likes and dislikes as such, but about ways
of listening, ways of hearing, about ways of being. The importance of value judgment for
popular culture thus seems obvious, but it has been quite neglected. (8)

These sentiments are relevant to the conceptual framework of CMM produced to deliver ‘maximum sonic impact’. In other words, the concept of maximum sonic impact is
intrinsically linked to ways of listening, hearing and being, which therefore require
value judgements to be made. Here, these value judgements will likely involve shifting
and conflicting opinions about the relationship between CMM and sound quality. Due to
these value judgements not being able to offer a fixed standard, they could be viewed as
problematic. However, a conceptual framework with subjective boundaries is nevertheless
provided.
A majority of listeners want this style of music to present a dense and powerful yet clear
sound. The artists usually want the same, as this translates and enhances the best aspects of
their performances. (Mynett 2017: 21).

This conceptual framework states that maximum sonic impact for CMM in recorded
form is afforded through an effective balance between clarity, heaviness, sonic weight
and performance precision, with each of these qualities having the potential to impact
and inform the other. Different bands and their respective productions need these
characteristics presented in different ways; however, a recording that delivers a deficient
sense of all four of these qualities is inevitably weak.

Artistic vs. commercial concerns


Numerous writers have highlighted the dichotomy in popular music between artistic/
authentic/aesthetic values versus commercial concerns. For example, Moore refers to ‘the
opposition between the authentic and the commercial’ (2002: 211), and Frith discusses the
rhetoric between art and commercial values being kept apart (1996: 42). However, as will
be explained, with CMM in recorded and mixed form, this dualism tends to be significantly
reduced, with CMM’s artistic and commercial values often being aligned. Importantly
though, this alignment is broadly moderated by perceived performance authenticity, with
clarity tending to have a greater perceived value for artists featuring a high-speed, high-
precision aesthetic (virtuosity), and sonic weight and/or heaviness tending to have a greater
perceived value for artists featuring slower performance approaches.
The technicality of musical composition and performance complexity often displayed
in CMM represents a fundamental authentic/artistic perspective. This proficiency and
sophistication are often afforded high value and esteem by artists and enthusiasts alike
(Purcell 2003: 12–14), with Phillipov claiming:
Technical complexity is often claimed as a virtue in and of itself, with fans and musicians
often claiming a level of prestige for the music based on its technical difficulty. (2012: 64)

Clearly then, contemporary metal culture places high value on genuine virtuosity and
musicianship. However, for the listener to perceive virtuosity and advanced standards
of musicianship in a CMM production, a high level of clarity needs to be provided. If
this clarity is not provided, the often-complex performative gestures are rendered largely
unintelligible, and the ability to receive, or perceive, the virtuosity involved becomes
obscured, or lost. An example here is Darkthrone’s album Transilvanian Hunger (1994),
where despite fast double bass drum work being performed on many of the album’s tracks,
this is largely unintelligible. Nevertheless, the particularly lo-fi production aesthetics
on this album appeared to partly reflect the band reacting against what they saw as the
increasingly overt commerciality of metal music production – with the production values
thereby reflecting the artists’ notions of apparent authenticity. All the same, from many fan
and musician perspectives, the high value, esteem and prestige of the music are obscured or
lost by such productions and, from these perspectives, clarity remains an essential aesthetic
principle for conveying both artistry and authenticity.
Clarity is the primary quality that accentuates the perceived energy of CMM’s
performances, which directly informs the sonic impact of the production. Clarity enhances
and exaggerates the feelings of performance physicality that are required to produce these
highly embodied sounds, in turn evoking a synaesthetic response in the listener. In this
respect, Wallach refers to the impact of music’s sound often being ‘audiotactile’, in that it
aims to literally move the listener (2003: 42). He goes on to state that: ‘Music recordings
are cultural objects whose meaningful effects come about primarily through their ability to
produce material sonic presences’ (Wallach 2003: 37). In recorded music, Corbett discusses
embodied presence as tracing visual presence (1994: 41–44). This would appear to be
particularly true of CMM where particular cultural understandings, often gained through
the live experience, become embedded in the experience. An example of such cultural
understanding would be the actions that are performed to create certain sounds. Therefore,
in contrast to Frith, who states: ‘Most listeners, for example, no longer care that they have
no idea what instrument (if any) makes their favorite sound’ (1988: 125), this is likely to
be far from the case for CMM listeners. In many instances when guitars are perceived, the
actions and emotional associations behind the relevant sounds are simulated. Wallach refers
to sound waves’ ability to create this experience as ‘copresence’ (2003: 36). In CMM this
notion of copresence is largely achieved through the clarity of the various sound sources,
the emotional associations of which, therefore, contribute to a production’s perceived
sonic impact. The author therefore proposes that productions presenting a lo-fi approach,
with reduced sound source intelligibility, are not as subjectively heavy as those where the
performative gestures of the musicians are more intelligible. An example of this would
be contrasting the aforementioned ‘lo-fi’ approach of Darkthrone’s Transilvanian Hunger,
where the transients of the drum performance are largely unintelligible, with the high-
fidelity/high commercial standards of Dimmu Borgir’s Spiritual Black Dimensions (1999).
Here it is relevant to note that, although the black metal subgenre is renowned for its lo-fi
production aesthetics, the most commercially successful black metal bands, for example,
Dimmu Borgir and Cradle of Filth, tend towards high levels of clarity. Nevertheless – and
reflecting Frith’s (1996) ‘ways of listening’ – there are significant niches within the metal
audience with different priorities about what constitutes authenticity, who prioritize this
apparent quality over production clarity/perceived heaviness. It could be argued that, in
these instances, there is a dualism between perceived authenticity and commerciality, with
Darkthrone labelled as the former, and Cradle of Filth as the latter.

Clarity, definition and intelligibility


As we have seen, clarity can enhance the perceived intensity and energy of each and every
instrument/sound component in a metal production, which in turn can strengthen the
perceived power and drive of the performance’s rhythm structures. Clarity can therefore
be viewed as a valuable parameter of sonic impact/effective heaviness. However, a more
precise way of breaking down the concept of clarity is into ‘definition’ and ‘intelligibility’, which,
although often considered similar in meaning, can be differentiated.
The term ‘Definition’ refers to the characteristics of a single isolated performance that
enables the performance to be clear, and clearly understood. For example, definition refers
to the qualities of the individual drum sounds heard during the brief solo drum fill from
3:43 to 3:45 of Lamb of God’s ‘Blacken the Cursed Sun’ that allow the fast subdivisions to be
decipherable and easily understood. If the drums on this production lacked definition, for
instance if they had a dull, flat attack and long, resonant sustain, it would be much harder for
the listener to make sense of this drum fill. Without definition, its impact would be largely
lost, as would drummer Chris Adler’s intentions when playing this part. (Mynett 2017: 19)
Whereas definition relates to a single performance heard in isolation, intelligibility relates to the ease of understanding and perception of a single performance/sound
component when heard within the context of the rest of the ensemble. For example,
the Chris Adler drum fill referenced above, being heard within a guitar riff and bass
performance; and, likewise, the ease of perception and understanding of the relevant
guitar riff and bass part within this context. In this respect, Izhaki states: ‘Intelligibility
is the most elementary requirement of sonic quality’, and ‘sonic quality is also a powerful
selling point’ (2007: 5).
The clarity criterion has become essential to CMM, largely shaping the expectations of
bands, musicians, record labels and enthusiasts, whilst in turn becoming a central aesthetic
value pursued by CMM producers. Nevertheless, the manner in which clarity is captured,
enhanced and presented is an important element that needs to be negotiated between
band and producer. CMM artists that have a much lesser emphasis on performance/note
complexity and a greater focus on sound/timbre are likely to require less accentuation of
clarity and vice versa.
Fan identification within metal appears to be shaped at its most rudimentary and ‘popular’
level around the music, but in two differing aspects – sound/timbre or performance/note
complexity. (Pieslak 2008: 46)

Heaviness
‘Heaviness’ is metal music’s defining feature. Although the term is regularly used to portray
a wide variety of sound and performance qualities in a wide range of styles, it is most often used in relation to the metal genre, describing the music’s perceived weight,
size and density. Heaviness is primarily substantiated through displays of distortion and,
regardless of the listening levels involved, associated with perceived volume, power, energy,
intensity, emotionality and aggression.
Fans of many, if not most, styles of popular music react positively to distortion almost
instinctively. When a device is overloaded, something exciting must be happening […] there
is something visceral and stimulating about distortion that makes the music more exciting.
(Case 2007: 97–150)

These sentiments from Case relate particularly to metal music where ‘the most important
aural sign […] is the sound of an extremely distorted electric guitar’ (Walser 1993: 41).
Distortion creates potentially unlimited sustain in electric guitars due to signal compression.
As exertion is normally required for sustaining any physical activity, distorted electric guitar
sounds also signal energy and power ‘through this temporal display of unflagging capacity
for emission’ (42). ‘Distortion also results in a timbral change towards brightness […] since
distorting a signal increases the energy of its higher harmonics’ (42). Furthermore, when
additional spectral information, in the form of high-frequency energy, is introduced to
guitars’ timbres, they are perceived as heavier (Berger and Fales 2005: 193–194). In order
for the other instrumentation to punch through this ‘sonic wall’ (Turner 2009) of extremely bright rhythm guitars, and to be perceived as part of the same context, heightened high-frequency content is normally required for much of that instrumentation.
Due to high-frequency sound dissipating with distance, the perception of heightened
high-frequency content is associated with sound being very close and intimate. Zagorski-
Thomas even suggests that intense high-frequency content ‘can be used to make something
seem closer than the loudspeaker it emanates from’ (2012: 8). Additionally, the perception
of sound being very close and intimate is an essential element of CMM’s intelligibility.
In addition to harmonic distortion heightening high-frequency content, distortion generates difference tones at frequencies below the fundamental of the guitar’s
lowest open string. For example, with a dropped-C root/fifth (C/G) power chord –
approximately 65.4 Hz and 98 Hz, respectively – an additional tone at approximately 32.6 Hz
(98 Hz minus 65.4 Hz) is constructed.
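The arithmetic here is simply that of intermodulation: two strongly driven tones produce sum and difference frequencies alongside their harmonics. A quick check of the dropped-C figures, offered purely as a worked example:

# Intermodulation products of a dropped-C root/fifth power chord under heavy distortion.
root, fifth = 65.4, 98.0          # approximate fundamentals in Hz
print(fifth - root)               # difference tone: 32.6 Hz, below the lowest open string
print(fifth + root)               # sum tone: 163.4 Hz, reinforcing the upper bass
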
As human hearing becomes increasingly sensitive to low and high frequencies the louder they are played (Senior 2011: 62), capturing or enhancing high, as well as low,
frequencies in this manner typically results in sounds being perceived as louder than
they actually are. Furthermore, these frequencies at the extremes of the audio spectrum
significantly contribute to our somatic perception of sound, due to their ability to cause
vibrations in skin and internal organs (Zagorski-Thomas 2014: 7).
Highly relevant to CMM production, though, is that low-mid attenuation is a technique
that, psycho-acoustically, can achieve the same result as simultaneously emphasizing low
and high frequencies. Low-mid attenuation is not only a stylistic marker of the death metal
subgenre’s guitar tones (referred to as ‘scooped’) but also a staple production technique
for the CMM style in general. In many ways, the important element of the sense of space
used in CMM production is the bearing that ‘holes’ have within the four dimensions of
laterality, prominence, register and temporal continuity (Moore 2001: 121). For CMM,
these ‘holes’ will often be created in the low-mid frequency range.
In addition to the radical harmonic distortion in its guitar timbres, CMM is likely to
present numerous other audio sources with distorted characteristics. Bass sounds that
contain distortion are a regular feature of CMM’s composite bass timbres and CMM’s
vocal performances are often so heavily distorted and guttural as to fail to present any
distinguishable note or pitch (Berger 1999: 164). Lastly, high levels of signal compression
normally used throughout the mixing and mastering stage of CMM often generate related
harmonics. In addition to increasing inherent loudness levels, this compression often
creates a desirable form of harmonic distortion (Case 2007: 150).

Sonic weight
CMM’s heaviness is intrinsically related to distorted rhythm guitars but also to sounds
perceived as emanating from large, intense and powerful entities, referred to as sonic
weight.
Sonic weight is a vital parameter of metal music’s heaviness. Sonic weight and heaviness
are related in numerous ways, but can be distinguished from one another. The perceived
heaviness of an individual instrument or overall production is highly impacted by high
frequency energy. However, the concept of sonic weight is specifically concerned with low
frequency content, and, more precisely, the perceived ‘size’ and ‘mass’ of this spectral region.
(Mynett 2017: 14)

The world we inhabit has consistent physical laws, which are reflected in the way sound
delivers information about the perceived size of the source. When we hear, or perhaps
feel, low frequencies, we typically associate the generation of these sounds with sizeable,
weightier objects. An example of which would be that we wouldn’t associate the low-
frequency content of a large lion ‘roar’ with a household cat. Likewise, low frequencies
also tend to be associated with sounds generated via high impact (i.e. an object forcefully
striking another object, for instance, the sound produced by substantial thudding, such as
kicking a door, compared to light tapping, such as a fingernail on a table).
For musical purposes, the term ‘sonic weight’ refers to low-frequency qualities associated
with high levels of energy, power, impact, and loudness, creating the perception that the
sound source is large, dense, and powerful. It is unlikely that sonic weight would be apparent
if, for instance, the bass or guitar performance involved a very soft attack from the picking
hand (i.e. a lack of energy, impact, and loudness) or if the instrument/amp/cab combination
sounded thin and weak. It would also be highly unlikely that a dense, big, powerful sound
could be achieved simply by boosting the low frequencies of this example. For these reasons,
the term ‘sonic weight’ refers to both the quantity and qualities of low-frequency energy.
(Mynett 2017: 15)

Performance precision
The term ‘performance precision’ refers to subdivisions that are executed very close to the
intended rhythmic sequence. This involves the performers playing with the same concept
of where hits/notes begin, and to a certain extent with bass and guitar, where these notes
end. Performance precision is a principal requirement for providing a CMM production
with maximum sonic impact, particularly as it typically elevates a
production’s sense of clarity and heaviness.
There are few other popular music genres that involve as great a focus on higher
tempi and fast subdivisions as CMM. CMM performance tempi regularly display dramatic changes within the same song (Purcell 2003: 11–24); Kahn-Harris notes that modern metal tempi often fall between 150 and 250 bpm (2007: 32) and, in radical recent examples, tempi sometimes exceed 350 bpm for quarter-note values (Berry and Gianni 2004: 85). This results in smaller inter-onset intervals and, consequently, more
rhythmically concentrated drum, bass and guitar patterns. Inter-onset intervals are
‘the elapsed time, measured in milliseconds, between the onset of one sound and the
onset of the next’ (Zagorski-Thomas 2007: 334). These smaller metric groupings are
likely to restrict the possibilities for dynamic variation, improvisation and variation of
performance. Furthermore, numerous studies have been carried out that support the
hypothesis that expressive timing does not scale proportionally with tempo (Collier and Collier 1996; Friberg and Sundström 2002; Honing and de Haas 2008), with these authors concluding that an increase in tempo frequently results in a decrease in the swing ratio found in drum performance patterns. These findings would seem to have relevance to CMM, which is frequently performed with minimal, or no, swing, groove or expressive timing discrepancies, sometimes referred to as 'human-feel'. These 'straight',
rather than swung, drum performance characteristics partly dictate the style’s aesthetic
of performance precision (Haid 2006: 100). This means that straight, metronomic
performances with consistent dynamics are often involved in CMM, with any deviation
usually being unintended and unwanted.
Kahn-Harris emphasizes the fast subdivisions frequently found in the rhythm guitar
performances of CMM by noting that some riffs (short repeated rhythm phrases) are
played at 500–600 notes per minute (2007: 33). Fast subdivisions such as these frequently
involve the guitarist exploiting ‘the rhythmic potential of repeated pitches’ (Pillsbury 2006:
129). Recurring rhythmic phrases, or motifs, repeated at the same pitch are sometimes
referred to as ostinato. Playing the same single pitch, rather than different pitches, enables
faster subdivisions, which in CMM are regularly played in the guitar’s lower registers. These
repeated low notes are a generic identifier of metal (197) that produce a ‘stable affective
base’ (11). Furthermore, playing these fast subdivisions with the same single pitch, rather
than a chord, provides a clearer and more prominent identification of a low root tone.
Pieslak proposes: ‘In this lower register […] the fifth of the power chord tends to obscure
rather than reinforce the fundamental or root in more active passages’ (2008: 220).
Stamina, endurance and agility are required to perform these fast subdivisions or fast double-kick patterns with consistency and accuracy, and it takes only a slight deviation from the intended rhythmic pattern for the result to sound disordered or confused.
For example, a sixteenth note sequence at 180 bpm contains inter-onset intervals of
approximately 83 milliseconds. Deviations of just 10 or 20 milliseconds would significantly
confuse this rhythmic sequence, whereas this deviation might be insignificant to a
performance featuring slower subdivisions. The need for performance precision therefore
increases with faster subdivisions, and especially when featuring ensemble rhythmic
synchronization.
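The arithmetic behind these figures can be set out explicitly. The short sketch below is a worked illustration only; the function name and example values are not drawn from the studies cited above.

```python
def inter_onset_interval_ms(bpm, subdivisions_per_beat=4):
    """Elapsed time between successive onsets for evenly spaced subdivisions
    of the beat (quarter note) at a given tempo."""
    beat_ms = 60000.0 / bpm              # one beat in milliseconds
    return beat_ms / subdivisions_per_beat

# Sixteenth notes at 180 bpm: 60000 / 180 / 4 is roughly 83.3 ms between onsets,
# so a 10-20 ms timing deviation is a large fraction of the interval.
print(round(inter_onset_interval_ms(180), 1))   # 83.3
```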
Ensemble rhythmic synchronization refers to the metric structures of guitar/bass
performance parts being largely or entirely coordinated by drum patterns driven
predominantly by the kick, which collectively provides the music with a dense texture.
When multiple performances are endeavouring to synchronize the same subdivisions but
without the required level of precision, this quickly results in a chaotic mush of sound.
As well as impeding the listener's clear comprehension of the music's frequently complex construction, a mush of sound is typically perceived as subjectively
less heavy, with less sonic impact than a production featuring comparatively more precise
performances. With this in mind, the concept of rhythmical perfection is a characteristic
that many CMM musicians and producers endeavour to capture and present.

Bibliography
Bennett, A. (2001), Cultures of Popular Music, Philadelphia, PA: Open University Press.
Berger, H. (1999), ‘Death Metal Tonality and the Act of Listening’, Popular Music, 18 (2): 161–178.
Berger, H. and C. Fales (2005), '"Heaviness" in the Perception of Heavy Metal Guitar
Timbres: The Match of Perceptual and Acoustic Features over Time', in P. D. Greene and
T. Porcello (eds), Wired for Sound: Engineering and Technologies in Sonic Cultures, 181–197,
Middletown, CT: Wesleyan University Press.
Berry, M. and J. Gianni (2004), The Drummer’s Bible: How to Play Every Drum Style from Afro-
Cuban to Zydeco, Tucson, AZ: See Sharp Press.
Budofsky, A. (2006), The Drummer: 100 years of Rhythmic Power and Invention, Cedar Grove,
NJ: Hal Leonard Corporation.
Case, A. (2007), Sound FX: Unlocking the Creative Potential of Recording Studio Effects,
Burlington, MA: Focal Press.
Collier, G. and L. Collier (1996), ‘The Swing Rhythm in Jazz’, in Proceedings of the 4th
International Conference on Music Perception and Cognition, 477–480, Montreal: Faculty
of Music, McGill University.
Corbett, J. (1994), ‘Free, Single and Disengaged: Listening Pleasure and the Popular Music
Object', in Extended Play: Sounding Off from John Cage to Dr. Funkenstein, 32–55, Durham,
NC: Duke University Press.
Friberg, A. and A. Sundström (2002), ‘Swing Ratios and Ensemble Timing in Jazz Performance:
Evidence for a Common Rhythmic Pattern’, Music Perception, 19 (3): 333–349.
Frith, S. (1988), ‘Video Pop: Picking Up the Pieces’, in S. Frith (ed.), Facing the Music: A
Pantheon Guide to Popular Culture, 88–130, New York: Pantheon Books.
Frith, S. (1996), Performing Rites: On the Value of Popular Music, Cambridge, MA: Harvard
University Press.
Hill, R. and K. Spracklen (2010), ‘Introduction’, in R. Fisher and N. Billias (eds), Heavy
Fundametalisms: Music, Metal and Politics. Second Global Conference of Heavy
Fundametalisms: Music, Metal and Politics, Salzburg, Austria, 10–12 November 2009, vii–x,
Oxford: Inter-Disciplinary Press.
Honing, H. and W. B. de Haas (2008), ‘Swing Once More: Relating Timing and Tempo in
Expert Jazz Drumming’, Music Perception, 25 (5): 471–476.
Howlett, M. (2009), ‘The Record Producer as Nexus: Creative Inspiration, Technology and the
Recording Industry’, PhD thesis, University of Glamorgan, Glamorgan.
Izhaki, R. (2007), Mixing Audio: Concepts, Practices and Tools, Oxford: Elsevier Science Ltd.
Kahn-Harris, K. (2007), Extreme Metal: Music and Culture on the Edge, Oxford: Berg.
McIver, J. (2000), Extreme Metal Handbook, London: Omnibus Press.
McIver, J. (2005), Extreme Metal II, London: Omnibus Press.
Moore, A. F. (2001), Rock: The Primary Text, Burlington, VT: Ashgate Publishing.
Moore, A. F. (2002), ‘Authenticity as Authentication’, Popular Music, 21 (2): 214–218.
Mynett, M. (2017), Metal Music Manual, London: Routledge.
Phillipov, M. (2012), Death Metal and Music Criticism: Analysis at the Limits, Lanham, MD:
Lexington.
Pieslak, J. (2008), ‘Sound, Text and Identity in Korn’s “Hey Daddy”’, Popular Music, 27 (1):
35–52.
Pillsbury, G. (2006), Damage Incorporated: Metallica and the Production of Musical Identity,
New York: Routledge.
Purcell, N. (2003), Death Metal Music: The Passion and Politics of a Subculture, Jefferson, NC:
McFarland and Company.
Scott, N. and I. Von Helden (2010), ‘Preface’, in R. Fisher and N. Billias (eds), The Metal Void:
First Gatherings. First Global Conference of Heavy Fundametalisms: Music, Metal and
Politics, Salzburg, Austria, 3–5 November 2008, ix–xiv, Oxford: Inter-Disciplinary Press.
Senior, M. (2011), Mixing Secrets for the Small Studio, Burlington, MA: Elsevier.
Sheppard, J. (2008), ‘World’s First Heavy Metal Conference Hits Salzburg’, The Guardian,
29 October. Available online: http://www.guardian.co.uk/education/2008/oct/29/research-
music?INTCMP=SRCH (accessed 29 March 2018).
Turner, D. (2009), ‘Outlining the Fundamental Production Aesthetics of Commercial Heavy
Metal Music Utilising Systematic Empirical Analysis’ [conference presentation], Art of
Record Production, 21 April. Available online: https://www.artofrecordproduction.com/
aorpjoom/arp-conferences/arp-archive-conference-papers/21-arp-2009/117-turner-2009
(accessed 2 August 2019).
Turner, D. (2012), ‘Profiling a Metal Mastermind: The Mixing Techniques of Andy Sneap’,
PhD thesis, University of Glamorgan.
Wallach, J. (2003), ‘The Poetics of Electrosonic Presence: Recorded Music and the Materiality
of Sound’, Journal of Popular Music Studies, 15 (1): 34–64.
Walser, R. (1993), Running with the Devil: Power, Gender and Madness in Heavy Metal Music,
Hanover, NH: Wesleyan University Press.
Weinstein, D. (1991), Heavy Metal: A Cultural Sociology, New York: Maxwell Macmillan
International.
Weinstein, D. (2011), ‘How Is Metal Studies Possible?’, Journal for Cultural Research, 15 (3):
243–245.
Zagorski-Thomas, S. (2007), ‘The Study of Groove’, Ethnomusicology Forum, 16 (2): 327–335.
Zagorski-Thomas, S. (2012), ‘Musical Meaning and the Musicology of Record Production’, in
D. Helms and T. Phelps (eds), Black Box Pop. Analysen populärer Musik, 135–147, Bielefeld.
Zagorski-Thomas, S. (2014), ‘Sonic Cartoons’, in The Research Companion to the Sound in
Media Culture Research Network.

Discography
Darkthrone (1994), [LP] Transilvanian Hunger, Peaceville Records, VILELP43PMI.
Dimmu Borgir (1999), [LP] Spiritual Black Dimensions, Nuclear Blast, NB 349-9.
20
Desktop Production and
Commerciality
Phil Harding

Introduction: Transformation and process


The analogue methodologies derived from 1970s and 1980s commercial pop and rock
music can be identified in current characteristics of commercial pop that are ascribable
to processing and effects for mixing ‘in-the-box’ – ITB (Burgess 2014; Paterson 2017).
Current desktop music production technology allows the creative technologist to access
emulations of classic and popular analogue hardware processors (Burgess 2014) in the
digital domain with an ease that was only a dream to engineers and producers like
myself in the 1980s and 1990s. Pop music songwriting and production teams are now
more frequently part of a larger creative collective (Hennion 1990) in creating a musical
product.
This chapter explores the creative production and mixing processing tools (Moylan 2015)
found in systems available for desktop music production today. Drawing upon a series
of interviews and data gathered during an extended ethnographic and autoethnographic
study, this chapter presents my previous analogue workflow in the pop music domain
of the 1970s, 1980s and 1990s, and examines the current output requirements to satisfy
commercial music clients at various stages of the commercial pop production and mixing
process.
Evidence also suggests that for individuals to successfully participate in the field and
domain of commercial pop music they will need to display a genuine creative ‘love’ of the
genre’s culture and technology. I will set out a desktop production ‘Service Model for the
Creation of Commercial Pop Music’ (Harding 2017).
This chapter begins with an autoethnographic reflection of my working practices as a
producer, engineer and mixer from the 1970s to the end of the 1990s. I will then explore
the work that has come since and the choices available to industry practitioners today. To
give a general overview of my current working practices: I assign ITB digital processing
and effects during the recording and mix process of commercial pop music projects that
are still based on analogue working practices from the 1970s to 1990s. In this chapter,
I distinguish between ‘Top Down’ and ‘Bottom Up’ approaches to mixing (Harding 2017).
Top down refers to starting a mix with the lead vocals and then working ‘down’ through
the arrangement to the drums. Bottom up mixing begins with the drums and ends with the
vocals. The latter method, in my experience, has been the traditional routine in rock, pop
and dance music genres since the 1970s.
Music production has changed enormously over the last thirty to forty years.
Today’s technology now allows anyone to make high-quality recordings at home.
That said, the analogue technologies and techniques of record making from the pre-
digital era are still with us and still inform many of the processes used today. It is
important for new and current practitioners to be aware of the contextual background
of the music production process as it once was as well as of the current possibilities.
Whilst commercial desktop production differs from ‘traditional’ recording studio
processes, many production parameter decisions will remain, such as arrangement,
instrumentation, tempo/time signature/groove and performance quality. Hepworth-
Sawyer and Golding (2011) rightly emphasize that ‘these four overarching parameters
can offer a great deal of scope when altered’ and are equally important in commercial
desktop production. This chapter will put aside those arrangement and musical
considerations and concentrate on the characteristics of commercial pop that are
ascribable to processing and effects.
There are clear and definable differences, in my view, between recording and mixing
for acts such as The Clash, Killing Joke and Toyah Willcox, which I did in the late 1970s/
early 1980s compared to the ‘stacking’ process of recording for the acts I worked with at
PWL Studios (PWL) throughout the 1980s. The recording routine of producers Stock,
Aitken and Waterman (SAW) became deliberately aimed at supplying an abundance
of keyboard and programmed orchestral overdubs as commercial hooks and ‘spotlight’
sounds for the mix engineers to focus upon. Pete Waterman always attended the mixing
sessions at PWL, and it became an easier approach for the production team to supply too
much instrumentation to support the vocal hooks rather than too little. Pete Waterman
and the PWL engineers were masters of knowing what to leave out rather than knowing
what to request from SAW to add in. That same routine continued for Ian Curnow and
myself producing boy bands such as East 17 and Boyzone in the 1990s. Therefore we
could view an ideal commercial desktop production mix as something that has been
deliberately overloaded with too many overdubs to allow the final mix to be sculpted by
the mix engineer into the ideal commercial sound for current national and commercial
radio.
The complex narrative for acts such as The Clash and Killing Joke was somewhat the opposite: only overdubs that had been creatively scrutinized by the band members in the
rehearsal rooms and then the recording studio were recorded – and nothing more. In many
ways that makes the mixing process easier for these types of acts.

Analogue audio processing and effects from the 1970s to today
There was a limited array of external audio hardware processing units and techniques
when I started as an assistant engineer at the Marquee Studios in 1973. We had access to
two EMT (Elektromesstechnik) 140 Reverb plates and an external analogue equipment
rack containing various classic limiters (Universal Audio 1176), compressors (DBX 160)
along with parametric and graphic equalizers (EQs). By the time I was engineering and
mixing sessions for artists such as The Clash, Killing Joke and Toyah Willcox from the
late 1970s and early 1980s, all of those pieces of equipment were still in place. Also added
were Drawmer gates, Roland tape delay units (555) and early versions of digital delay,
harmonizer (Eventide) and chorus units (Roland Dimension D). Lexicon was among the
first of the audio hardware companies to introduce digital reverb to studios with the 224
in 1978.
In the 1980s, music-making technology (rather than recording or processing) gradually
shifted from largely unreliable hardware platforms such as the Linn 9000 and Roland MIDI
sequencers, with limited back-up technology (e.g. the largest capacity floppy disc was
1.9 Mb on average), to the Atari ST computer: originally designed for home-gaming and the
only personal computer to ship with built-in musical instrument digital interface (MIDI)
connectivity. The two popular music sequencer developers of the time were Steinberg
(Cubase software) and C-Lab (Notator software, later developed into Logic) – both developed their MIDI sequencer
platforms utilizing the flexibility of the Atari computer. There were other MIDI sequencers
on the market in the late 1980s in the United Kingdom and Europe, but these two dominated
in recording studios and programming rooms into and throughout the 1990s.
By 1984 I had moved from the Marquee Studios to Pete Waterman’s PWL Studios,
working as chief engineer to the SAW production team. A major piece of creative
technology in the early 1980s that quickly gained favour with engineers and producers in
the commercial pop genre was the Solid State Logic (SSL) mixing console. It was renowned
for producing a bright and sharp sound which was ideal for pop and dance music. We can
now access those classic 1980s SSL sounds from the various plug-ins available to us from
Universal Audio and Waves. In particular, the SSL G+E-series EQ and the SSL master
buss compressor were vital to the SAW/PWL ‘Signature Sound’ (Zagorski-Thomas 2014)
throughout the 1980s. PWL engineers such as Dave Ford, Tony King and myself would
‘mix in’ to that SSL master buss compressor from the start of the mix (drums up) onwards.
It took a long time for Stock and Aitken to embrace the Atari/Cubase technology, and they stood by their Linn 9000 system – described by Burgess (2014) as a 'self-contained production box' – for many more years. Aitken explained this in an interview with
Sean Egan for the book The Guys Who Wrote ‘Em:
We were still using a relatively archaic system [Linn 9000] when everybody else was using
Atari-based systems and we spent so long trying to get a system that was reliable and having
[ironed] the bugs out of the system over two or three, four years, there was little point going
to something that was going to crash on you. (Aitken quoted in Egan 2004: 302)

Aitken was correct to indicate that the early versions of the Steinberg Cubase music
sequencer system would regularly crash on the Atari computer. This was understandably
disruptive to the creative workflow of songwriting and recording sessions at that time.
PWL musician, programmer and producer Ian Curnow supplies a view that was quite
different to Aitken's with regard to the Atari/Cubase system circa 1991:
2 Mb of RAM, that was the biggest system available [the basic was 1 Mb] and I think the
hard disk [which was purchased separately] was about 20 Mb, or maybe 40 Mb, I’m not sure.
Amazing! Cubase was a revelation, and having been involved with Steinberg as a Pro 24 user,
they asked me, along with several other users, exactly what we’d want from a ‘clean sheet’
sequencer, having used Pro 24. So when Cubase came out, it was like it was made for me, so
many of my suggestions and ideas were there for me to use, which was fantastic. (Curnow
quoted in Harding 2010: 555)

Ian Curnow was an early Cubase beta-tester for Steinberg and he was therefore able to
input many suggestions to the development of the software. He explains why he thinks this
type of technical information is important to our knowledge:
I get people asking me on Facebook for my Yamaha DX7 Bass sounds – I don’t have them
anymore! We [Harding and Curnow] used to be called ‘International Rescue’ in the 1980s
and 1990s, where someone had a radio mix that they couldn’t get right for the label or the
manager and we would be hired to save it. The process now is all ITB and revisiting tracks to
make adjustments is a lot easier. You can re-visit/re-vamp tracks yourself without necessarily
sending it off to another remix or production team to ‘rescue’. We also tend to look for
team collaborations and co-producers now rather than just handing the whole track over to
someone fresh for a re-vamp. Technology allows us now to achieve the same processes as
the 1990s but it’s a lot easier now, we’re swapping files and not tapes etc. (Curnow, interview
with author, 2014)

This is useful information for people who are trying to understand how boy band records of
the 1990s were produced and gain knowledge of the equipment used to generate the sound
sources. I also still receive messages on Facebook about the complete chain of sounds for
instrument processing such as our kick and snare drums from the 1980s PWL period. A
full description of the music technology utilized at the very successful PWL Studios is
included in my book PWL from the Factory Floor (Harding 2010).
William Moylan (2015) notes that ‘people in the audio industry need to listen to and
evaluate sound’. I would also agree with Moylan and other academics that prolonged periods
of critical listening ‘can be used to evaluate sound quality and timbre’ (186). Moylan raises
the important question of timbre, which Zagorski-Thomas notes ‘is a function of the nature
of the object making the sound as well as the nature of the type of activity’ (2014: 65).
Today, everything that was available to creative technicians and producers from those
eras onwards is available to us via digital audio plug-in specialists such as Universal Audio
and Waves in our digital audio workstations (DAWs). This allows the current commercial
pop producers and engineers to access and simulate the tried and tested processing
techniques and settings used by their predecessors. This can provide a very good starting
point for those working in the desktop production environment and creating commercial
pop music. DAWs first arrived in the early 1990s (1992 most would agree) when Steinberg
introduced Cubase Audio onto the Apple Mac system, combining their Cubase MIDI
sequencer with digital audio via the Digidesign A/D interface (see Harding 2010) and
DAWs ‘have since the early 2000s become central to the creation of commercially released
music’ (Marrington 2017: 77).

What audio processing settings can we ascribe to the characteristics of commercial pop?
In my view the most important part of any commercial pop record in Western music is the
vocal and that is the reasoning behind my ‘Top-Down – 12-Step Mixing Program’ (Harding
2017: 62–76), which starts the mix process with the vocals. We can make an argument for
some commercial pop music processing and effects that are vital for monitoring throughout
the recording process and through to the final mix. The likelihood now is that we will also be
turning to Harold 'Andy' Hildebrand's Auto-Tune and Celemony's Melodyne pitch correction tools
during post-vocal recording editing sessions, and these have ‘eroded the boundary between
reality and artificiality’ according to Burgess (2014: 179). For most commercial desktop
production practitioners, these tools have become a necessity and part of the regular workflow.
Part of the current 'singer-songwriter disposition' is a requirement to keep up with other artists' manipulation of vocals against the pop hip-hop trend of 'a bare and mostly unadorned
sonic environment’ and the temptation for the vocals to be ‘digitally manipulated to the
point of hardly sounding human' (Appel 2017: 12), for example the Kanye West vocals on 'Love Lockdown' from his 2008 album 808s & Heartbreak.
Ultimately, the mixing process is something that needs to be performed, practised and
mastered by every creative music person in this time of diversity. Composers and musicians,
as well as engineers and producers, will find themselves in the mixing seat, due to budget
or time constraints in an age where budgets for all types of recorded music and audio have
fallen by 75 per cent or more since the 1990s. The following examples from the ‘12-Step
Mixing Program’ still serve me well after forty years of experience as an industry practitioner
and are a framework for others to reference and experiment with on their own projects.

Lead and backing vocals


This is my standard set of vocal mix (and monitoring) techniques:

1 Insert a vocal compressor starting with a 3:1 ratio and the threshold set so that the
gain reduction meter is only active on the louder notes.
2 Set up an auxiliary send to a vocal plate reverb with a decay of about 3 seconds and
a high pass filter (HPF) up to 150 Hz.
3 Set up an auxiliary send to a crotchet mono delay effect with around 35 per cent
feedback and 100 per cent wet.
4 The EQ settings are entirely dependent on how the vocals sound, but typically if the
vocals were recorded flat (and one will not know unless one has recorded them)
then I would boost a few decibels (dB) around 10 kHz or 5 kHz and consider a
4 dB cut at 300–500 Hz, and also a HPF up to 100 Hz, provided this doesn’t lose
the body of the vocal sound. If the vocal is already sounding too thin then try
a boost around 150–250 Hz, but no lower than that as I would want to save any
boost of 100 Hz downwards for kick drum and bass only. See Figure 20.1 for a
full commercial pop EQ guide for all commonly used instruments in commercial
desktop production.
5 My final suggestion on a lead vocal, and I apply this later in the mix, is the Roland
Dimension D at its lowest setting – Dimension 1. The Universal Audio (and other
manufacturers) plug-in virtual copy of this piece of hardware is a good replacement
and again this would be on an auxiliary send, in addition to keeping the original
signal in the stereo mix. The effect on the Dimension D is still set to 100 per cent
wet and balanced behind the lead vocal to create a stereo spread but strangely it has
a wonderful effect of bringing the voice forward, hence this is best used later in the
mix, when there is more going on behind the lead vocal.

Apart from the Dimension D, I would also be running settings 1–4 during the recording
process, certainly the reverb as an absolute minimum to both inspire the vocalist
performing and to allow the vocal to sit into the monitor balance whilst recording and
playback of other overdubs.
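For readers who want these settings gathered in one place, steps 1 to 5 can be summarized as a simple template along the following lines. This is an illustrative sketch only: the labels and units are shorthand devised for this summary, not the parameter names of any particular DAW or plug-in.

```python
# Illustrative shorthand for the lead-vocal chain described in steps 1 to 5 above;
# keys, units and wording are devised for this summary, not any plug-in's controls.
lead_vocal_chain = {
    "insert_compressor":  {"ratio": "3:1",
                           "threshold": "gain reduction only on the louder notes"},
    "aux_plate_reverb":   {"decay_s": 3.0, "hpf_hz": 150},
    "aux_crotchet_delay": {"sync": "quarter note", "feedback_pct": 35, "wet_pct": 100},
    "eq": {"hpf_hz": 100,                      # unless it loses the body of the voice
           "boosts": [{"hz": 10000, "db": "a few"}, {"hz": 5000, "db": "a few"}],
           "cuts":   [{"hz": "300-500", "db": 4}]},
    "aux_dimension_d":    {"setting": "Dimension 1", "wet_pct": 100,
                           "applied": "later in the mix"},
}
```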
All of the processing I have described for the main lead vocal would generally go onto
the double-track lead vocal except for the crotchet delay; I tend to leave that just for the single lead track, as otherwise it can sound confusing if it is on the double as well. Next would
be any harmony vocals recorded to the lead vocal, generally a third up or maybe a fifth
below. If the harmonies are single tracked then they would remain panned centre. If they
were double tracked, then I would pan them half left and right, or even tighter. Processing
on these would be similar to the lead vocals but with no delay effects. Finally, to complete
the vocal stage of the mixing (or monitor mix whilst recording), we move onto the chorus
backing vocal blocks, which would often start with double or quadruple tracked unisons to
the lead vocal in the chorus. This is to add strength and depth and a stereo image with these
panned fully left and right. From there, all of the other harmonies in the chorus would
be panned from the outside fully or, for instance, half left and right for the mid-range
harmonies, tight left and right at 10 o’clock and 2 o’clock for the highest harmonies. All of
these need to be at least double tracked once to achieve a true stereo. The processing would
be applied on the stereo group fader that these vocals are routed to. This saves the computer
system DSP by not processing the individual tracks. Typical backing vocal processing
would be compression first, set similarly to the lead vocal, then equalization, again similar
to the lead vocal but less low-mid cut and minimal HPF. The vocal reverb would stay the
same, though it is worth considering a longer reverb time, four seconds or higher, to place
the backing vocals further back from the lead vocals. I would not put the crotchet delay
on the backing vocals except for a special, automated effect on one or two words, sending
the backing vocals to a small amount of a quaver delay overall to give them a different and
tighter perspective to the lead vocals. Multiple tests and use of this methodology since the
1990s have proven to me that this is a repeatable formula for all pop and dance mixes. One
may wish to vary interpretations of this with more delays and processing for extended and
club mixes (especially by more use of the crotchet delay on the backing vocals), but for
a radio and video mix the above techniques almost guarantee an industry standard and
accepted sound.
It is worth noting that throughout the 1980s at PWL Studios, we would generally process
the lead vocals to sound brighter and slightly sharper by using a buss send to the Eventide
H910 Harmonizer, set to a 1.01 pitch ratio. This setting, together with triple-tracking the lead vocals, would contribute to the PWL 'Signature Sound' (Zagorski-Thomas 2014) for vocals.
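To put that setting in perspective, a pitch ratio of 1.01 corresponds to roughly 17 cents of upward shift, a subtle detune rather than an audible harmony. The short calculation below is included purely for illustration.

```python
import math

def ratio_to_cents(ratio):
    """Convert a pitch ratio (such as an H910-style setting) into cents."""
    return 1200 * math.log2(ratio)

print(round(ratio_to_cents(1.01), 1))   # approximately 17.2 cents sharp
```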

Acoustic guitars
I would generally deal with acoustic guitars before electric guitars in a final mix and here
are some tips on the processing I use. If the production were centred on an acoustic singer/
songwriter then the artist’s main guitar part would have been my first support instrument
of choice whilst processing the lead vocal. The equalization choices for the acoustic guitar
on the mix would typically be HPF up to 100 Hz; a 2–6 dB cut at 300–900 Hz; and if it
sounds too dull then apply boosts at 5 kHz and 10 kHz. All of this is for the multitracked, single-microphone acoustic. If you have just one main acoustic and you are using a multi-microphone technique and direct injection, then I would pan the 'body' condenser signal hard or half left and leave it virtually flat other than a 300–900 Hz cut; I would even consider a 150–200 Hz boost of 2–4 dB to give more depth. I would then pan the
sound-hole microphone hard or half right and duplicate the equalization described for
the multitrack acoustic but possibly with the HPF up to 200 Hz. The direct input (DI) box
signal would feed in behind the stereo microphones in the middle, probably kept flat, but at this stage one should check the phasing of the three signals combined and finally consider a
tiny bit of the vocal reverb plate on the right-hand signal to help blend the acoustic guitar
into the track. Certainly I would avoid room, ambience or hall reverb. See Figure 20.1 for
further EQ guides.

Electric guitars
In the hope that the guitar sounds are well sourced and well played, I do little or nothing
at the mix stage. I generally do not touch the low frequencies or lower-mid frequencies,
deliberately leaving them in because I am cutting them so much elsewhere. If the guitar
sounds at all dull I only boost around 4.5 kHz, as I am trying to leave 5 kHz and above to
the vocals, piano, acoustic guitars and cymbals on the drums, see Figure 20.1. I would only
consider any reverb or delays if the guitarist has not used any pedals or guitar amp effects.
Thankfully the technology of guitar effect pedals and the effects on guitar amps is now, I believe, good enough for engineers to trust their quality and low noise. I generally trust the guitarist to deliver the sound that feels right to them on their amplifier to suit the track. If the guitars have been recorded flat and dry I would add some short plate or ambience reverb of between 1 and 2 seconds in length and some quaver delay with around
30 per cent feedback for rhythm guitars. Usually I would apply some longer reverb and
crotchet delay for solo and lead guitar parts.
As one may gather from these processing and placement descriptions, I am building a
multidimensional landscape of sound across a stereo picture. During my mixing process,
from the 1980s onwards, I have always used my sonic landscape in relation to a picture
landscape. To take that further in the mixing stage, I imagine the picture as 3-D so that
I can analyse the staged layers in a deliberate attempt to separate the instruments for
the listener and yet also to help those instruments to meld together as one. This should
sound to the listener as though the musicians (live or programmed) and singers are in one
space – or all on stage – together. I use the word picture deliberately here because that’s
how I plan a final mix, like a multidimensional landscape painting, with the lead vocal at
the front and heart of the picture. I usually have this picture in my head before I even start
the production and I believe this is a sensible way to plan a commercial pop production.
Therefore some of the processing and effects I have described such as the vocal reverb
and delays will help to achieve this goal, and this statement from Burgess and Toulson
(2017: 94), 'reverb and delay effects both add a sense of space and depth to the mix', further
supports my own views.

Keyboards, bass, drums and orchestration


The alternatives available to us for these instruments are too variable to detail here as there
are such huge processing differences between the programmed versions generally used for commercial pop and dance music and the 'real' instruments. Some
typical examples would be short (less than one second) digital room ambience for snare
drums and tom toms, adding some of the vocal reverb in small amounts to some keyboard
parts and electric guitars.
Also, at the final mix stage, I would send the bass to the Roland Dimension D, as
mentioned earlier for the lead vocal. Whilst we do not want the bass to be as loud as the
lead vocal, the Dimension D effect brings it forward and gives a stereo perspective that is
generally satisfying for commercial pop.
The idea with this 'Top-Down' method of mixing is that one concentrates on the song from the minute the mix is started. Fauconnier and Turner (2002) talk about 'conceptual
blending’ as an unconscious human activity and that is exactly what I seem to do in my
mixing approach. Short ambience on the drums blended with medium to long plate reverb
on the vocals, then those vocal reverbs blended with my crotchet mono delay as highlighted
previously. These are good examples of the types of processing we engage with whilst mixing a record using today's technology.

Figure 20.1  Phil Harding commercial pop EQ guide 2018.

The EQ dilemma
I have been surprised over recent years to come across many musicians, songwriters and
producers, across many music genres, who admit to possessing very little true knowledge
of EQ. They find themselves ‘twiddling’, experimenting and guessing at what to do until
arriving at something that, for them, simply ‘sounds good’ or at least ‘works’ for their
requirements. Clearly a lack of formal training is among the root causes of this, and
many have found it useful to follow some of my basic advice and guidance. The EQ guide
in Figure 20.1 outlines some basic principles that I use that are useful for processing
instruments during a commercial pop project – especially at the mixing stage.
These EQ settings are best sourced from individual characterful software-based EQs
(or outboard hardware equivalents) such as the Focusrite d2 EQ and Universal Audio’s
Pultec EQ simulation as opposed to multipurpose channel strip plug-ins.

The digital domain: Convergence and integration
Some of today’s most ‘die-hard’ high-level recording-industry professionals, such as Haydn
Bendall (Kate Bush and Alfie Boe) and Tony Platt (AC/DC and Bob Marley), have now
converted from either totally analogue mixing or using analogue consoles and external
hardware to being happy to mix entirely in the digital domain within their desktop
computers. With regard to ITB mixing Burgess comments:
There are detractors who feel that such mixes lack the depth of those using external analog
effects and modifiers. A counter argument is that by staying in the digital domain the audio
is damaged less by multiple A-D and D-A conversions. (Burgess 2014: 145)

Burgess comments further on DAW resources becoming ‘exhausted with a big and
[complex] mix’ in the early days of ITB mixing. It would, however, be fair to say that
since Burgess wrote that in 2014 computer resources have become faster and more
powerful, enabling them to cope with the highest levels of desktop production and
commercial mixing. Audio plug-in manufacturers such as Universal Audio and Native
Instruments (Kontakt) have offered solutions to ease the speed and memory burdens on
DAWs by supplying their products in formats that process and operate outside of the
computer’s internal central processing unit (CPU). Justin Paterson states: ‘The current
paradigm of mixing ITB is almost entirely based around the metaphor of the traditional
mixing console’ (2017: 89) and this can be seen among the vast array of channel strips
(generally copied from analogue mixing consoles such as SSL) available from Universal
Audio, Waves and others. It is worth emphasizing that DAWs ‘have since the early 2000s
become central to the creation of commercially released music’ (Marrington 2017: 77).

How do we define commerciality in desktop production?
The commercial pop world has changed significantly in the last decade in terms of song
arrangement and repeated looped sections of the same musical and audio cycle. Bennett
(2017) has illustrated that, in 2012, five of the year’s top ten bestsellers worldwide made use
of a repeated 4-bar harmonic or audio sample loop, whereas in 2017 it was up to 100 per
cent, as Bennett explains:
For example, at the time of writing, the current top 10 worldwide Spotify streaming songs (all
presumably created using DAWs) make substantial use of 2- and 4-bar harmonic or audio
sample loops, with more than 50% of them consisting entirely of a single loop. Contrast this
with an equivalent chart from 30 years previously (October 1986), and all of the songs use
longer chord chains over a greater number of bars. Has verticality crossed from songwriter
habit to listener preference? (Bennett 2018)

Defining commerciality in desktop production is often out of the producer's hands as Howlett states:
Finding an audience is the role of the record company. A record producer must also engage
with the commercial expectations of the interests that underwrite a recording. (2009: iii)

Nevertheless, a commercial pop producer has high expectations on their shoulders to
deliver a ‘chart-worthy’ product to the client, and retaining a commercial perspective
will always be easier when collaborating as a production team as opposed to being a sole
producer working in isolation.

Can we control creativity in desktop production?
Our engagement with and immersion in recorded music today both as creators and
listeners has become more intense than it used to be in the 1960s and 1970s. This is partly
due to the many platforms we use with current technology such as mobile phones, tablets
and laptop computers. On these particular platforms it is common to use headphones for
playback and the leap in the audio quality of headphones from the 1970s (made popular as
a consumer listening device by the Sony Walkman cassette player) to today is enormous.
This requires a production team to apply very high standards of creative control over the
audio output. As Alex James has recently stated:
I don’t think you can manage creativity, it’s irrepressible, it’s indomitable and it’s like thistles:
just springing up everywhere. You can’t stop it and you can’t control it, you just have to make
sure it’s pointing in the right direction. (Hits, Hype and Hustle 2018)

So, we have to be more honest in our desktop production for commercial music. No longer
can people hide behind drum loops and uncleared samples from other records. We have
to be more aware of an audience that is listening on mobile devices and headphones that
are delivering professional recording studio quality. As Alex James says, we need to point
creativity in the right direction and hope that direction is right. In contrast, my view is that
a creative desktop production team making records for the commercial pop music market
can effect a certain amount of control over creativity by following my ‘Service Model for
(Pop Music) Creativity and Commerce’ (Harding 2019).
With regard to the 1980s PWL version of the ‘Service Model’ (Harding and Thompson
2019), Pete Waterman’s status within the process allowed him to use his extensive
knowledge of pop music and culture to inform his decisions and exercise his agency to
ease the ‘tension between commerce and creativity’ (Negus 1992: 153). Waterman could
say with confidence what would or would not work on a final product leaving the studio
and every member of the production team would trust Waterman on such decisions. This
generally allowed and often generated a positive creative workflow at PWL.

Conclusions
Richard James Burgess (2014) states that ‘transformative technologies’ such as DAWs
completely changed the working practices of creative studio technicians and producers
such as Ian Curnow and me in the 1990s. I have commented on this throughout the chapter, and on how the successful workflow of some musical creatives such as SAW can cause a reluctance to change as these transformative technologies arrive. To the SAW team
leader, Pete Waterman, these technologies were irrelevant to his critical listening skills
and commercial music judgement. His tacit knowledge from two decades of hit making
taught him to ‘feel’ when a potential hit was presented to him at the mix stage for his
approval. In his psyche at that point was, surely, not only 'how will this sound on radio?' but 'what will the client (generally the record company then) think?' However, as the hits kept
coming for SAW in the 1980s and the PWL team gained the knowledge of what satisfied
the ears of their gatekeeper, certain songwriting, production and mixing elements
became clearly repeatable. Music theorists dissecting SAW hit songs have discovered that
a lift into the all-important chorus was often achieved by a subtle upward key change.
There would always be maximum support for the main vocal hooks with harmonies
and backing vocals. Basically the philosophy became ‘emphasize anything catchy within
the track’ whether that is vocal hooks or instrumental hooks and melodies – make sure
they jump out of the track. Clear, intelligible lyrics were achieved by pushing for good
enunciation at the time of recording and the type of mix EQ processing I have outlined
in this chapter.
During the 1990s boy band phenomenon, Ian Curnow and I (P&E Music) would initially
hire the session backing vocalists to record backing vocals, and a guide lead, before the band
vocals, thereby creating a clear roadmap for us to follow during the artist vocal sessions.
During that time at P&E, there was a clear expectation by our team leader and manager,
Tom Watkins, never to allow a dull moment in the production: something interesting
always had to be present, preferably involving the band members for easy camera options
in future music videos and performances.
Simon Zagorski-Thomas (2014) states that the common ‘cut and paste’ methods of
DAWs have encouraged composers to work in a modular fashion since their inception in
the late 1980s into the 1990s, causing producers to adapt their methodologies and record
session musicians in the same way. This means only recording one chorus and cutting
and pasting that take throughout the arrangement. This has become completely normal
practice for the pop and boy band music genre I have described throughout this chapter.
The hardware studio equipment described in the first section of this chapter was, in my view, the minimum requirement for competitive pop production teams and studios of the day and was
required to fulfil my personal sonic aesthetics. Today the equivalent technology can be
found inside DAWs as virtual plug-in copies of the hardware described earlier.
For commercial desktop production teams today there is a need to devise their own
methodologies to know when something is sounding commercially ‘catchy’. Employing the
ears of a more experienced 'team leader' or maybe an executive producer could be one way to achieve this. The final judgement of any producer's work, though, always rests with the client. No
matter what we think critically, creatively or commercially, if the client is happy, our job is
completed. Commercial desktop production is a service industry.

Bibliography
Appel, N. (2017), ‘The Singer-Songwriter Disposition’, in J. A. Williams and K. Williams (eds),
The Singer-Songwriter Handbook, 12–13, New York: Bloomsbury Academic.
Bennett, J. (2017), 'How Someone Controlled You: The Digital Audio Workstation and the
Internet as Influences on Songwriting Creativity’, Presentation at the 2017 Art of Record
Production Conference, Royal College of Music, Stockholm, Sweden, 1–3 December.
Bennett, J. (2018), Oxford Handbook of Creative Process in Music, Oxford: Oxford University
Press.
Burgess, R. (2014), The History of Music Production, Oxford: Oxford University Press.
Burgess, R. and R. Toulson (2017), ‘Singer-Songwriter Meets Music Production’, in
J. A. Williams and K. Williams (eds), The Singer-Songwriter Handbook, 94–95, New York:
Bloomsbury Academic.
Egan, S. (2004), The Guys Who Wrote ’Em: Songwriting Geniuses of Rock and Pop, London:
Askin Publishing.
Fauconnier, G. and M. Turner (2002), The Way We Think: Conceptual Blending and the Mind's Hidden Complexities, New York: Basic Books.
Harding, P. (2010), PWL from the Factory Floor, London: Cherry Red Books.
Harding, P. (2017), ‘Top-Down Mixing – A 12-Step Mixing Program’, in R. Hepworth-Sawyer
and J. Hodgson (eds), Mixing Music, ch. 4, Oxford: Routledge.
Harding, P. (2019), Pop Music Production – Manufactured Pop and Boy Bands of the 1990s,
Oxford: Routledge.
Harding, P. and P. Thompson (2019), ‘Collective Creativity: A ‘Service’ Model of Commercial
Pop Music Production at PWL in the 1980s’, in J. Hodgson, R. Hepworth-Sawyer,
J. Paterson and R. Toulson (eds), Innovation in Music: Performance, Production, Technology,
and Business, New York: Routledge.
Hennion, A. (1990), ‘The Production of Success: An Anti-Musicology of the Pop Song’, in
S. Frith and A. Goodwin (eds), On Record: Rock, Pop and the Written Word, 185–206,
London: Routledge.
Hepworth-Sawyer, R. and C. Golding (2011), What Is Music Production?, Oxford: Focal Press.
Hits, Hype and Hustle (2018), [TV programme] BBC4 TV, 9 February 2018.
Howlett, M. (2009), ‘The Record Producer as Nexus’, PhD thesis, University of Glamorgan/
Prifysgol Morgannwg, Cardiff.
Marrington, M. (2017), ‘Composing with the Digital Audio Workstation’, in J. A. Williams and
K. Williams (eds), The Singer-Songwriter Handbook, 77–78, New York: Bloomsbury Academic.
McIntyre, P. (2013), ‘Creativity as a System in Action’, in K. Thomas and J. Chan (eds),
Handbook of Research on Creativity, 84–97, Cheltenham: Edward Elgar.
Moylan, W. (2015), Understanding and Crafting the Mix: The Art of Recording, 185–189,
Oxford: Focal Press.
Negus, K. (1992), Producing Pop: Culture and Conflict in the Popular Music Industry, London:
Oxford University Press.
Paterson, J. (2017), ‘Mixing in the Box’, in R. Hepworth-Sawyer and J. Hodgson (eds), Mixing
Music, ch. 5, Oxford: Routledge.
Zagorski-Thomas, S. (2014), The Musicology of Record Production, Cambridge: Cambridge
University Press.

Discography
West, Kanye (2008), [CD] 'Love Lockdown', 808s & Heartbreak, USA: Universal Records.
21
Audio Processing
Michail Exarchos (aka Stereo Mike)
and Simon Zagorski-Thomas

Introduction
This chapter initiates the discussion of post-production through an exploration of the ‘state
of the art’ in both the practice and our theoretical understanding of audio processing,
echoing the underlying pursuit stated in the Handbook's introduction – that is, the
bridging of dichotomies between the theoretical and the practical (the ‘how’ and the
‘why’) in/behind record production processes and outcomes. In some ways, this is also
reflected in the tone and examples offered by the two authors, who contribute both practice-based/phonographic illustrations and surveys of relevant theorizations from other
disciplines. In attempting to provide a theoretical map/ping between musical-sonic
objectives, technological/processual actualization and pursued aesthetics (personal/
stylistic sonic signatures in record production), we examine four domains of sonic
characteristics, and ensuing tools and processes devised for their manipulation – namely
‘pitch’, ‘amplitude’, ‘spectrum and timbre’ and ‘time and space’.
Our discussion of pitch processing and its associated tools explores the first historical
attempts at de-associating the domains of pitch and time via, initially, innovations in
the design of analogue devices, which quickly problematizes the notion of neat sonic
categories when looked upon through the lens of practice; we then examine how the
science behind digital pitch-processing devices and algorithms takes advantage of
the functional interrelationship between pitch and delay at the heart of their effective
operation. The amplitude section explores this notion further, suggesting that perhaps
(almost) all mixing processes/workflows can be seen as a function of amplitude processing;
the implication, however, being that spectral/timbral and dynamic/temporal perceptual
effects cannot be understood but as artefacts of processes (ab)using aspects of amplitude
(over different domains, such as relative balance, panorama, spectra and time), which –
in turn – highlights shortcomings in the discipline’s consistency over the current state
of our definitions and categorizations. Spectrum and timbre starts from this very issue to
bring together understandings from related areas (such as music and audio psychology,
plug-in design, ecological perception, semantics, production pedagogy, ethnomusicology
and popular music studies), demonstrating how embodied knowledge and critical listening
can be further informed by enriching practice; although the section commences by offering
residualist and frequency-specific definitions of timbre, it eventually illustrates through
a merging of semantic and plug-in-design understandings, how our current (software)
mixing tools have come closer to a ‘bundling’ of simplified timbral controls that reflect
perception rather than causality. Finally, the time and space section provides a time-based
definition of spatial effects as used in record production, illustrating the surreal complexity
of staging artefacts in contemporary (popular) music-making, to then underline the gap
that exists between technical and musicological theories of production aesthetics.

Pitch
At the heart of contemporary pitch manipulation lies the process of pitch shifting and
its close relationship with delay. Although delay-related processing will be dealt with in
more detail in the time and space section below, the focus here will remain on processes
dealing with, and perceived as, forms of pitch manipulation in record production. A
number of tools and applications borne out of the creative exploitation of pitch shifting
have been created serving record production needs, from pitch correction, through to
harmonizing, enhancing and real-time, expressive textural applications. Interestingly, as
pitch processing technologies have developed, a cyclic thread can be observed from early
efforts attempting to avoid inadvertent ‘glitch’ or granular artefacts as a result of pitch-
shifting processes, through to Auto-Tune's self-proclaimed transparent pitch-correction
and back to much contemporary pitch manipulation, celebrating and bringing to the fore
the textural artefacts infused through – extreme – pitch processing. In other words, we
can map the history of pitch-processing applications and technologies against inadvertent
(caused by technological limitations) and/or conscious (stylistically driven, aesthetically
pursued) textural traces. Of course, exploring this causality links pitch processing also
to spectral phenomena, which are the direct subject of the spectrum and timbre section
below.
To better understand the relationship between pitch processing and the time domain,
it is worth exploring some of the earliest attempts at pitch and time manipulation. Much
of the literature dealing with the history of pitch manipulation starts off with the Eventide
H910 Harmonizer – the first commercially available pitch correction device, yet one that
came with the trade-off of substantial digital artefacts (Owsinski n.d.). But as Costello
(2010) points out, a 1960s AES article locates some of the earliest patents relating to pitch
and time manipulation back to analogue devices from the 1920s (Marlens 1966). Marshall
informs us that the driver for such innovation had been communication and not record
production, offering the 1967 Eltro Information Rate Changer – specifically marketed
for ‘speech compression’ (Eltro Brochure 1967) – as the earliest commercially available
analogue pitch-time changer evolving from these earlier patents:
The ability to change a recorded sound’s pitch independently of its playback rate had its
origins not in the realm of music technology, but in efforts to time-compress signals for
faster communication […] In short, for the earliest ‘pitch-time correction’ technologies, the
pitch itself was largely a secondary concern, of interest primarily because it was desirable for
the sake of intelligibility to pitch-change time-altered sounds into a more normal-sounding
frequency range. (Marshall 2014)

This relationship between pitch processing and the time domain is not just a functional
one – it sits at the core of pitch processing theory and application. Focusing on pitch
shifting as key for respective digital processes, Izhaki provides a useful definition that
exemplifies the relationship:
A pitch shifter alters the pitch of the input signal without altering its duration […] If we
zoom into the recording of a human voice, we identify repeating patterns that only vary
over time […] A pitch shifter identifies these patterns, then shrinks or stretches them into
the same time span […] The actual process is more complex, but at its core is this rudiment.
(Izhaki 2018: 469)

To uncover some of the complexities behind the theory of pitch shifting, we can turn to
Case who explains pitch ‘recalculation’ through ‘a bit of math’, and demonstrates that,
although a source sound would simply be shifted in time were it run through a fixed
digital delay, running it through a variable delay actually results in pitch change (2012:
249–262). Crucially, he highlights two important theoretical problems: firstly, the
manipulation of source content spanning the duration of whole songs would necessitate
devices able to produce delay times unthinkable before access to substantial computer-
based processing power; secondly, an infinitely varying delay would result in phenomena
akin to manipulating analogue playback devices (249–262). It is worth citing Case at some
length here to illustrate the practical implications and how these issues are dealt with in
digital devices:
Pitch-shifting signal processors differentiate themselves from tape speed tricks in their
clever solving of this problem. Digital delays can be manipulated to always increase, but also
to reset themselves […] It (is) the rate of change of the delay that (leads) to pitch shifting, not
the absolute delay time itself […]
It is a problem solved by clever software engineers who find ways to make this inaudible.
Older pitch shifters ‘glitched’ as they tried to return to the original delay time. Today, those
glitches are mostly overcome by intense signal processing. Software algorithms can evaluate
the audio and find a strategic time to reset the delay time, applying cross fades to smooth
things out. (253)
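The mechanism Case describes (a delay time that ramps continuously, resets, and is hidden behind crossfades) can be sketched in a few lines of code. The following is a deliberately simplified illustration of the principle, using two read taps that sweep through a short delay window and are crossfaded with triangular gains; it is not a reconstruction of any commercial pitch shifter, and the function and parameter names are ours.

```python
import numpy as np

def delay_line_pitch_shift(x, ratio, sr=44100, sweep_ms=50.0):
    """Toy pitch shifter: the output reads from a delay line whose delay time
    ramps continuously (its rate of change sets the pitch ratio) and wraps around;
    two taps half a sweep apart are crossfaded so that the resets stay inaudible."""
    sweep = int(sr * sweep_ms / 1000.0)          # length of the delay sweep, in samples
    rate = 1.0 - ratio                           # delay change per sample: >0 shifts down, <0 up
    out = np.zeros(len(x))
    for n in range(len(x)):
        d1 = (n * rate) % sweep                  # delay of tap 1 (keeps ramping, then resets)
        d2 = (d1 + sweep / 2.0) % sweep          # tap 2, offset by half a sweep
        g1 = 1.0 - abs(2.0 * d1 / sweep - 1.0)   # triangular gain: zero at each reset point
        g2 = 1.0 - abs(2.0 * d2 / sweep - 1.0)
        i1, i2 = int(n - d1), int(n - d2)
        out[n] = g1 * (x[i1] if i1 >= 0 else 0.0) + g2 * (x[i2] if i2 >= 0 else 0.0)
    return out

# e.g. shift a test signal up a semitone: delay_line_pitch_shift(signal, 2 ** (1 / 12))
```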

However, the ‘glitch’ effects perceptible in early digital devices such as the H910 were, of
course, put to creative use by record producers. Notable highlights include: Kevin Killen’s
detuning of snares on U2’s War (1983) and The Unforgettable Fire (1984) albums; AC/DC’s
Back in Black (1980), featuring Tony Platt’s infamous snare, vocal and guitar riff ‘fattening’;
Tony Visconti’s metallic snare on ‘Breaking Glass’ from David Bowie’s Low (1977); Eddie
Van Halen’s use of two H910s for much of his ultra-wide 1980s guitar tones – for example,
on 5150 (1986); Laurie Anderson’s studio and live vocal/instrumental processing as of the
late 1970s; and Aphex Twin’s use of four units of the later H949 model on the Grammy-
winning Syro (2014) (Eventide Audio 2016a, b; Bain 2017).
Notably, Case differentiates between older pitch shifters that ‘glitched’ and today’s
‘intense signal processing’ that facilitates a high degree of transparency (2012: 253).
Fast-forwarding to 1997, Antares Auto-Tune’s initial impact was very much a result of
achieving this textural transparency, which allowed – arguably – discreet pitch correction.
Provenzano explains:
Older analog voice manipulation technologies such as vocoders and talk boxes can bind the
output of the larynx to an equal-tempered scale, but there is a cost (or a benefit): the timbre
of the voice gets mixed up and reworked, sometimes beyond recognition. Auto-Tune, by
contrast, when used not as an overt effect but as a pitch-correction tool, is not timbrally
expensive. (2018: 163)

Provenzano here focuses on the creative and sonic implications of a pitch-processing tool
that can be deployed for pitch-correction only, leaving the vocalist’s (or instrumentalist’s)
timbral and (non-pitch-related) emotional aspects of the performance intact. This is a
processing objective mirrored in all of Antares’s promotional and instructional literature,
highlighting that their pitch correction takes place ‘without distortion or artefacts, while
preserving all of the expressive nuance of the original performance’ (Antares 2017).
In much of her chapter, ‘Auto-Tune, Labor, and the Pop-Music Voice’ (2018), Provenzano
delineates between this functional objective, which allows producers and artists to focus
in on untamed performance not restricted by pursuits of tuning perfection, and the
‘overt’ use of Auto-Tune as an effect that does incur a timbral cost. But it is important
to question at which point does our auditory perception transcend from pitch-only to
textural appreciation? Or, in other words, what is the threshold between pitch correction
and timbral effect, and what are the key variables responsible for crossing over from one
type of manipulation to the other? To begin answering this we need to briefly review the
functionality of Auto-Tune, and other pitch-correction/manipulation software that have
been produced since its release – namely Celemony Melodyne, Waves Tune and cases
of digital audio workstation (DAW) specific tuning functionality, such as Logic Pro X’s
internal Flex Pitch and Ableton Live’s Warp modes/Transpose function.
In their latest user manual for Auto-Tune Realtime (for UAD), Antares (2017) identify
the settings necessary to achieve the ‘Auto-Tune vocal effect’, referring to the hard ‘pitch
quantization’ artefacts exemplified by records such as Cher’s Believe (1998) and T-Pain’s
Epiphany (2007) – this consists of selecting a scale and setting the fastest possible retune
speed, which ‘limits each note to its exact target pitch and forces instantaneous transitions
between notes’ (our emphasis). It is worth clarifying at this point that the word ‘auto’ in
Auto-Tune refers to one of its two main operating modes, the automatic one, which is
typically opted for when correcting performances that are generally quite close in tuning
to an intended scale. By selecting the respective scale and setting the ‘Retune Speed’,
‘Correction Style’, ‘Natural Vibrato’ and ‘Humanize’ parameters to values rendering
naturalistic results – or, conversely, extreme ones to achieve the ‘Auto-Tune Vocal Effect’
(see Figure 21.1 below) – an efficient pitch correction process can be carried out, which
may work for the majority of the source content. But in cases where the automatic pitch
correction renders some unwanted results (pitch quantization defaulting to unintended
notes of the chosen scale due to a largely inaccurate performance), Auto-Tune's Graphic
Mode is more apt (though more challenging to operate), allowing increased editing/manual
accuracy over individual notes and utterances.

Figure 21.1  The Antares Auto-Tune Realtime UAD plug-in window showing settings
used by one of the authors on the lead rap voice for a recent Trap remix: a softer take on
the hard 'Auto-Tune Vocal Effect' has been achieved (courtesy of the less than maximum
Retune Speed setting), and one note of the selected F# minor scale has been bypassed
to facilitate the 'auto' mode without graphic intervention.
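
Antares do not publish their algorithm, but the behaviour of 'auto' mode and the Retune Speed control can be modelled very crudely. The following sketch (Python/numpy) operates on an already-extracted monophonic pitch track rather than on audio; the scale table, frame size and one-pole smoothing are our own illustrative assumptions, not Antares's implementation. Setting retune_ms to zero yields the hard-quantized 'Auto-Tune vocal effect'; larger values give progressively more naturalistic correction.

```python
import numpy as np

F_SHARP_MINOR = {6, 8, 9, 11, 1, 2, 4}   # pitch classes of F# natural minor (F#, G#, A, B, C#, D, E)

def retune(f0_track_hz, scale=F_SHARP_MINOR, retune_ms=20.0, frame_ms=5.0):
    """Toy 'auto mode': pull each frame of a monophonic pitch track towards the
    nearest note of the chosen scale, gliding there at the retune speed.
    retune_ms = 0 gives hard pitch quantization; larger values leave more of
    the original vibrato and note transitions intact."""
    midi = 69 + 12 * np.log2(np.asarray(f0_track_hz, dtype=float) / 440.0)  # Hz -> fractional MIDI
    allowed = np.array([n for n in range(128) if n % 12 in scale])          # every note of the scale
    alpha = 1.0 if retune_ms <= 0 else 1 - np.exp(-frame_ms / retune_ms)    # per-frame smoothing
    corrected = np.empty_like(midi)
    correction = 0.0
    for i, note in enumerate(midi):
        target = allowed[np.argmin(np.abs(allowed - note))]    # nearest note of the scale
        correction += alpha * ((target - note) - correction)   # glide towards the full correction
        corrected[i] = note + correction
    return 440.0 * 2 ** ((corrected - 69) / 12)                # back to Hz

# corrected = retune(detected_f0_frames, retune_ms=0)   # the hard-quantized vocal effect
```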
It is in comparison to Auto-Tune’s Graphic Mode that Celemony’s Melodyne (winner
of the Technical Grammy in 2012) has brought creative advantages to the table since
its first public viewing at the NAMM (National Association of Music Merchants) show
in 2001. Melodyne takes the graphical interface possibilities further and expands the
functionality – in later releases – to polyphonic pitch correction and manipulation. In a
comparative review for Auto-Tune’s version 5 and Melodyne’s first plug-in version, Walden
(2007) points out that Melodyne’s ‘power takes it beyond the role of a purely corrective
processor and into that of a powerful creative tool’. The ‘powerful’ artefact-free algorithm
has, as a result, enabled creative manipulation beyond simple pitch correction or hard pitch
quantization and many practitioners have been empowered by its intuitive interface, using
it to create everything from Harmonizer-style pseudo-ADT (Anderton 2018), through to
artificial backing harmonies, and – notably also – Skrillex Bangarang-style (2011) ‘glitch’
vocals (Computer Music Magazine 2014).
By 2013, Apple had brought many of Melodyne’s features to Logic Pro X in the form of
its Flex Pitch functionality (Breen 2013; Kahn 2013) providing direct access to advanced
processing of multiple pitch parameters inside a DAW (see Figure 21.2), while a few years
prior, Waves had provided a ‘mutant child of Auto-Tune and Celemony’s Melodyne’ with
Waves Tune (Godfrey 2006). Ableton Live's powerful elastic engine, courtesy of the
program's flexible Warp mode algorithms (see Figure 21.3), arguably became its unique selling
point, initially for easy and sonically transparent pitch/time-stretching, and, eventually, for
the glitchy and granular sound design explored by artists such as Flume (for example, on
Skin, 2016), when its ‘granular resynthesis’ engine is pushed to extremes (McFarlane 2010).
Although there are arguments for or against both the ease and sound of the now many
available algorithm alternatives in pitch-processing practice (see, for example, Vandeviver
2018), it would be fair to say that the current state of the art is certainly at a stage where
the user can dial in anything from transparent/functional control or correction, through
to convincing pseudo-performative layers, or onto expressive/experimental real-time
artefacts – benefitting from the elasticity now available as a direct result of increased
processing power.

Figure 21.2  Flex Pitch mode enabled on a distorted bass guitar track in Logic Pro X
(10.4.1), zooming in on both its Workspace and – the more detailed – Editor views. The
six nodes visible on the latter provide access to parameters such as Pitch Drift, Vibrato
and Formant Shift.

Figure 21.3  Ableton Live's Clip View illustrating a number of available Warp modes and
the Transpose function, which are often combined and pushed to extreme settings live or
via automation in pursuit of granular effects.

This brings us back full circle to our timbral thread and its relationship to pitch
manipulation. According to Moore (2012): 'Timbre is concerned with the harmonic
relationships, phase relationships and the overall volume contour or envelope of the
sound’, and, further, cites Moylan (2007) who suggests that parameters such as ‘dynamic
envelope; spectral content and spectral envelope are all crucial in the perception of
timbre’. Although – as the spectrum and timbre section will show – the study of timbre
is problematized by questions of definition, it can certainly be argued then that pitch
manipulation, even at its most discreet, results in textural artefacts of different shades
on the scale of perceptibility. It is, therefore, certainly necessary to further research
the timbral (envelope, amplitude, spectral, etc.) ramifications of pitch processing more
systematically.

Amplitude
In a similar vein, it is almost impossible to single out amplitude processing in a record/
post-production context from spectral and time-based causality and ramifications. It is
interesting that the literature on amplitude effects and psychoacoustics weaves analytical
perspectives that juxtapose notions of amplitude, perceived loudness, the time domain
and spectral or harmonic distortion to meaningfully discuss our perception of amplitude
in musical sound. Case (2012), for example, breaks down his analysis of amplitude effects
into chapters dealing first with distortion, equalization, dynamics (compression, limiting,
expansion and gating) and then volume. His distortion chapter ‘leaves the time axis well
alone and focuses on the sometimes accidental and sometimes deliberate manipulation of
the amplitude axis’ (89). Howard and Angus explain that:
Although the perceived loudness of an acoustic sound is related to its amplitude, there is not
a simple one-to-one functional relationship. As a psychoacoustic effect it is affected by both
the context and nature of the sound […]
The pressure amplitude of a sound wave does not directly relate to its perceived loudness.
In fact it is possible for a sound wave with a larger pressure amplitude to sound quieter than
a sound wave with a lower pressure amplitude. How can this be so? The answer is that the
sounds are at different frequencies and the sensitivity of our hearing varies as the frequency
varies. (2009: 91–93)

It follows then that we need to consider the various approaches to amplitude processing
in record production from the perspectives of creative application (processing tools,
practice), pursued effect (aesthetics, perception) and their interrelationship. A helpful
concept to consider here may be Stavrou’s self-proclaimed guiding mixing principle of
generating ‘Maximum Illusion with Minimum Voltage’, highlighting the psychoacoustic
oxymoron that exists between abused causality (in record production practice) and
sonic effect (in perception) (2003: 51). The remainder of this section will look at cases of
amplitude processing practised in a contemporary record production context that pursue
a range of loudness perception effects.
Of course, from a post-production perspective, one has to start with faders – and before
them, historically, rotary knobs – which have been used ‘for coarse level adjustments’
and constitute ‘the most straightforward tools in the mixing arsenal’ (Izhaki 2018: 183).
Although there are differences in fader designs (potentiometers providing level attenuation
through resistance, VCA faders allowing voltage-controlled amplification, digital faders
determining sample multiplication), functionally, they all enable the balancing of levels
for individual elements in a mix even though they – arguably by design – inspire turning
things up (183–187). Due to this affordance, one of the mix engineer’s primary concerns
becomes ‘gain-staging’: the management of relative levels between different stages of a
signal path, allowing sufficient headroom for further level adjustment while maximizing
signal-to-noise ratios (Davis 2013; Houghton 2013). Sufficient headroom in optimum
gain-staging parlance technically relates to the avoidance of distortion and noise artefacts
but, as Case’s (2012) thematic breakdown of amplitude effects demonstrates, distortion of
amplitude should very much be appreciated as a processing artefact stemming from the
creative application of headroom abuse.
A classic, if extreme, recorded example of amplitude processing can be heard
in the harmonically distorted kicks and snares of countless drum and bass, hip-hop and
EDM records (check, for example, the exposed beat intro to ‘Ghosts n’ Stuff ’ from For Lack
of a Better Name by deadmau5 [2009]), achieved through hard ‘clipping’. This is the process
of ‘abusing the analogue stages of an A-D converter’ resulting in the generation of ‘odd
harmonics’ (Houghton 2015); an effect initially carried out through the abuse of converters
on digital samplers or audio interfaces, later via digital clipping within DAWs operating
at lower resolutions than today (it is actually quite hard to reach clipping point within
contemporary DAWs running at high-bit floating architectures – see Thornton 2010),
and – currently – typically via third-party clipping plug-ins. What is interesting about
harmonic distortion achieved through hard clipping is that not only does the process result
in increased loudness perception but also it is a case of amplitude processing bringing about
a form of synthesis – the generation of new harmonics, typically only over the transient
phase of a percussive source, with both dynamic and textural implications. This can also
be viewed as a fundamentally stylistically driven sonic signature, with post-production
implications for loudness perception, yet again blurring the theoretical delineation of
amplitude and timbral processing.
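
The harmonic consequence of symmetrical hard clipping is easy to verify numerically. In the short Python/numpy sketch below, a 60 Hz sine stands in for a kick-drum fundamental; the over-driven signal is clipped at full scale and the levels of its first few harmonics are read from an FFT. The drive amount and test frequency are arbitrary illustrative choices.

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr                            # one second: FFT bins land on exact 1 Hz steps
fundamental = 60.0
tone = np.sin(2 * np.pi * fundamental * t)        # stand-in for a kick-drum fundamental
clipped = np.clip(4.0 * tone, -1.0, 1.0)          # drive it hard into the 'converter' ceiling

spectrum = np.abs(np.fft.rfft(clipped)) / len(clipped)
for k in range(1, 8):                             # inspect harmonics 1 to 7
    level_db = 20 * np.log10(spectrum[int(k * fundamental)] + 1e-12)
    print(f"harmonic {k} ({int(k * fundamental)} Hz): {level_db:6.1f} dB")
# odd harmonics (180, 300, 420 Hz ...) carry energy; even ones stay vanishingly low
```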
Equalizers are very much part of the same debate, as they are effectively tools that
allow amplitude manipulation over a specific part of the frequency spectrum (there are
many types of equalizers allowing and achieving different types of spectral manipulation
and effect, respectively, covered by a variety of technical handbooks such as, for example,
Case 2012; Davis 2013; Izhaki 2018), but resulting in altered timbre. Following the same
logic, independent amplitude manipulation of the signal levels of two or more channels
(typically via panoramic potentiometers) – mapped to different speakers – results in
imaging or spatial aural illusions of placement or movement. Finally, amplitude processing
over time pertains to our use of dynamic range processors, such as compressors, limiters,
gates, expanders and duckers. Perceiving these tools and applications under the conceptual
theme of time-based amplitude processing is helpful, as is distinguishing between processing
focused on the micro- and the macro-temporal domain. It would not be a stretch to
suggest that most mixing processes revolve around four notions of amplitude processing:
● as relative level balancing;
● as panning;
● over specific areas of the frequency spectrum (equalization); and
● over time (dynamics)

… leaving only ambient/spatial processing as a separate case (although there are many
amplitude-related functions to this aspect as well, but perhaps it cannot be as neatly
explained solely in relation to amplitude as either a parameter or a form of processing).
This notion may explain the unavoidable crossover between our theoretical delineations of
different processing categories and why they are so entangled in practice. We would therefore
suggest that there is a need to complicate our theoretical mapping of the interrelationship
between physical/acoustic/technological causality and psychoacoustic perception effects,
taking into account the semantically incomplete attempts at categorization and definition
of sonic phenomena in record production musicology.
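
As an aside on the 'panning' notion in the list above, the level-only nature of stereo placement is easily shown in code. The sketch below (Python/numpy) is nothing more than the standard constant-power pan law: two amplitude multipliers derived from a single position value; the function name and the -1 to +1 position convention are our own.

```python
import numpy as np

def constant_power_pan(mono, position):
    """position: -1.0 = hard left, 0.0 = centre, +1.0 = hard right.
    Amplitude-only processing: one mono signal, two channel gains chosen
    so that the summed power stays constant across the stereo image."""
    theta = (position + 1.0) * np.pi / 4.0        # map -1..+1 onto 0..pi/2
    left = np.cos(theta) * np.asarray(mono, dtype=float)
    right = np.sin(theta) * np.asarray(mono, dtype=float)
    return np.column_stack([left, right])

# stereo = constant_power_pan(shaker, +0.3)       # a little right of centre
```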

Spectrum and timbre


The study of timbre has been plagued by questions of definition. These range from the
notion that all sound is only frequency spectrum over time, i.e. there is not anything but
timbre, to a kind of residualist approach in music theory – that timbre is what is left over
in the process of orchestrating and arranging after pitch, rhythm, dynamics, harmony and
form have been accounted for (i.e. it differentiates instrument types and delineates playing
techniques – excluding dynamics – such as pizzicato, marcato, con legno, sul ponticello/
tasto, con sordini etc.). In light of this and of similar problems around the notion of spatial
and time-based effects, these last two sections will include brief surveys of the ways in
which these subjects are theorized in other disciplines as well as exploring how they are
approached through the techniques of record production. Hopefully this will provide
further insights from both sides of the poiesis/esthesis fence.
In the world of audio and music psychology, a good deal of work has developed around
the problems of identification and categorization (e.g. Bregman 1994; McAdams et al. 1995;
Peeters et al. 2011). How do we distinguish between a trumpet and violin while at the
same time being able to categorize the whole range of different timbres that a trumpeter
can produce as being ‘the same’? Some of this work is based on the notion of ‘feature
extraction’ – of looking for statistical correlations between sounds that are identified as in
some way ‘the same’. Feature extraction is, of course, at the heart of the notion of emulation
in digital effects, and plug-in design and research (e.g. Yeh, Abel and Smith 2007; Zölzer
2011; Paiva et al. 2012), and that has become hugely important in the commercial world
of record production, as the notion of timbre has become more and more associated with
vintage products and the sonic signatures of star mixers rather than with principles or
theory. Porcello has discussed the way in which sound engineers speak about sound,
stating that one of the key ways that this happens is through association, in particular
by indexically invoking industry professionals or production technologies (2004: 747).
This has now become embodied in the design and marketing techniques of product
designers, and the economics of the sector is driving research towards emulation rather
than innovation. Of course, that is not stopping practitioners from using these emulative
tools in innovative ways.
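
By way of illustration, one of the simplest descriptors in this feature-extraction tradition is the spectral centroid, often treated as a rough correlate of perceived brightness. The Python/numpy sketch below computes it frame by frame; the frame and hop sizes are arbitrary, and the timbre toolboxes referenced above extract dozens of such descriptors rather than one.

```python
import numpy as np

def spectral_centroid(x, sr=44100, frame=2048, hop=512):
    """Amplitude-weighted mean frequency of each analysis frame, one of the
    classic extracted 'features' and a rough correlate of perceived brightness."""
    window = np.hanning(frame)
    centroids = []
    for start in range(0, len(x) - frame, hop):
        spectrum = np.abs(np.fft.rfft(x[start:start + frame] * window))
        freqs = np.fft.rfftfreq(frame, 1.0 / sr)
        total = spectrum.sum()
        centroids.append((freqs * spectrum).sum() / total if total > 0 else 0.0)
    return np.array(centroids)

# trumpet_brightness = spectral_centroid(trumpet_take)   # compare muted and open playing, say
```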
In parallel to this detailed empirical and practical work, there are also several strands
of work that are approaching the notion of timbre from the perspective of what it means
to people. In the world of electroacoustic music, flowing from the idea of creating and
listening to sound in an abstract way, various composer-researchers (e.g. Schaeffer
1977; Smalley 1986, 1997; Wishart 1997) have explored both how our perception and
interpretation of music is tied to the cause of a sound (Clarke 2005) and how we can
seek to break those ties. These discussions relate both to the nature of the ‘thing’ that
makes the sound (by vibrating) and the nature of the energy that causes it to vibrate.
Without much direct crossover in either direction, this has also been an important factor
in work within ethnomusicology and popular music studies (Fales 2002, 2005; Berger
and Fales 2005; Fales 2017; Fink, Latour and Wallmark 2018; Zagorski-Thomas 2018).
Of course, these approaches are reflected in the first chapter of this Handbook that used
the categories of agents (human and non-human), energy, space, context and media (the
representational systems used to produce and disseminate the sounds) as a framework
for analysing recorded music. This semantic approach to audio (Who did what? Where,
when and why?) is also reflected in recent developments in plug-in design. This is seen,
for example, in the Waves CLA series (branded on producer/mixer Chris Lord-Alge)
where there are parameters labelled with terms such as spank, roar, bark and honk, or
the Infected Mushroom Pusher (branded on the Infected Mushroom EDM duo), which
includes punch, push, body and magic. This semantic approach takes us away from tools
that were perceived to be controlling only one parameter and towards bundled effects
that affect dynamics/envelope, frequency/spectrum (including distortion), ambience/
reverberation and other effects such as chorus and delay. Elsewhere in this Handbook
both Meynell and Zak have discussed the ways in which the Pultec EQP-1A affects the
dynamics of the signal as well as the frequency content, and a lot of the emulation plug-ins
we have been describing are dynamic processors, which also add spectral colouration and
distortion in some desirable manner.
The key texts used in production pedagogy, on the other hand (e.g. Owsinski 1999; Case
2012; Izhaki 2018), treat spectral processing (equalization and distortion) in generic terms.
The technical details of both equalization and distortion are covered in detail and a series
of established approaches are also discussed. And although there are variations in the detail
of how those mentioned above and other authors (e.g. Savage 2011; Mynett 2013) describe
the reasons for equalization, there are three fundamental approaches:
● to make a particular feature of a sound stand out more, for example, to exaggerate
the high-frequency consonant sounds (t, k, p, s, etc.) in a vocal to improve
intelligibility;
● to remove some unwanted feature such as noise or an unpleasant resonance; and
● to improve clarity and prevent masking by ensuring different sounds in a mix are
not competing in the same frequency range.

One of the reasons for making a particular frequency range stand out more, and this might
be done by equalization or distortion, is to change the perceived energy expenditure of the
activity that is causing the noise. This might be quite subtle, in the sense of bringing out
energy that is already there – heaviness or lightness in a gesture for example – or it might
be about creating quite a surreal cartoon of high energy through distortion (although
distortion can signify the vibration of a degraded artefact as well as a higher level of energy
in making it vibrate). In any event, there are a range of views about the possibility of
specifying good and bad generic equalization settings. We all have our ‘go to’ approaches
for specific contexts – Phil Harding outlined one for popular music earlier in this book
– but it is also true that the ‘go to’ approach is a starting point from which to use critical
listening and decide what else needs to be done.
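
In implementation terms, all three approaches come down to the same operation: gain applied over a region of the spectrum. As a purely illustrative sketch, not drawn from any of the texts or devices discussed here, the Python code below implements a single peaking-EQ band using the widely circulated 'audio EQ cookbook' biquad formulas; the Q values and the frequencies in the usage comment are our own assumptions.

```python
import numpy as np
from scipy.signal import lfilter

def peaking_eq(x, sr, centre_hz, gain_db, q=1.0):
    """Single peaking-EQ band built from the widely used 'audio EQ cookbook'
    biquad: positive gain_db lifts a feature out of the mix, negative gain_db
    notches out an unwanted resonance."""
    a_lin = 10 ** (gain_db / 40.0)
    w0 = 2 * np.pi * centre_hz / sr
    alpha = np.sin(w0) / (2.0 * q)
    b = np.array([1 + alpha * a_lin, -2 * np.cos(w0), 1 - alpha * a_lin])
    a = np.array([1 + alpha / a_lin, -2 * np.cos(w0), 1 - alpha / a_lin])
    return lfilter(b / a[0], a / a[0], x)

# e.g. lift consonant clarity around 5 kHz, then dip a boxy 300 Hz resonance:
# vocal = peaking_eq(peaking_eq(vocal, 44100, 5000, +3.0, q=0.8), 44100, 300, -4.0, q=2.0)
```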

Time and space


It may seem odd to put time and space under the same heading but the idea is that time-
domain effects – in contrast to pitch, amplitude and spectral effects – are related to most
of the techniques that are concerned with spatial processing: delay and reverberation most
notably. They are techniques that are based, in the acoustic world at least, on reflections
and therefore on delayed copies of some original ‘direct’ signal. Given that definition, it
also makes sense to place phasing, flanging and chorusing in the same box although they
all also involve either different frequencies being delayed by different amounts or the delays
being caused by different and varying playback speeds of the ‘copied’ signal – which causes
both time and pitch differences. An additional level of time-domain effects is the world
of looped samples, audio quantizing and time-warping, but these are dealt with by Anne
Danielsen elsewhere in the Handbook and so will not be covered here.
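
The 'delayed copy of the direct signal' principle scales from reverberation right down to flanging, where the copy sits only a few milliseconds behind the direct sound and the delay time is slowly modulated. A minimal flanger along those lines is sketched below in Python/numpy; the depth, rate and mix values are arbitrary, and a practical implementation would normally add feedback and smoother interpolation.

```python
import numpy as np

def flanger(x, sr=44100, max_delay_ms=3.0, rate_hz=0.4, mix=0.5):
    """Mix the input with a copy read through a short, slowly swept delay:
    the moving delayed copy produces the sweeping comb-filter colouration
    characteristic of flanging."""
    x = np.asarray(x, dtype=float)
    max_delay = max_delay_ms / 1000.0 * sr                   # sweep depth in samples
    n = np.arange(len(x))
    delay = 0.5 * max_delay * (1 + np.sin(2 * np.pi * rate_hz * n / sr))
    offset = int(np.ceil(max_delay)) + 1                     # padding so reads stay in range
    padded = np.concatenate([np.zeros(offset), x, np.zeros(1)])
    read = n - delay + offset                                # fractional read positions
    i0 = np.floor(read).astype(int)
    frac = read - i0
    delayed = (1 - frac) * padded[i0] + frac * padded[i0 + 1]
    return (1 - mix) * x + mix * delayed

# flanged = flanger(guitar_bus, rate_hz=0.25, mix=0.4)
```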
In the same way that we dealt with spectrum and timbre from a range of perspectives,
we will start by discussing the ways in which the world of audio and music psychology has
addressed this issue. Not surprisingly, it has been focused on the way that our perceptual
and cognitive systems deal with spatial audio in the ‘real world’ (e.g. Howard and Angus
1996; Moore 1999; Begault, Wenzel and Anderson 2001; Rumsey 2001) and, given that the
vast majority of the funding for research in this area is focused on virtual reality and the
gaming industry, the works on technologies of production, recording and emulation are
similarly focused on realism (Abel et al. 2006; Maxwell 2007; Schlemmer 2011). Although,
as with Abel’s work on emulating spring reverbs, some of the ‘realism’ is concerned with
‘real’ vintage technologies, one set of workflows that is emerging is to select very realistic
reverbs and to combine them in a very surreal sonic cartoon.
Once again, these empirical and practical approaches exist in parallel with a range of
strands that explore acoustic space in terms of its meaning. Reaching back to Edward Hall’s
(1966) work on proxemics, the anthropology of architecture, we can find a range of work
in sound studies, social history and cultural theory that runs in a similar vein (e.g. Schmidt
Horning 2012; Born 2013; Dibben 2013; Sterne 2015a, b; Revill 2016). It is concerned with
psychological and cultural notions of space, intimacy and the many ways in which how
we exist in the natural and built environments influences our identities and interactions.
Being culturally tied to the concert hall more than to commercially distributed recording
formats has allowed and encouraged the world of electroacoustic music to embrace the
many 3-D formats of spatial audio, while also approaching the notion of space in ways that
are simultaneously more surreal and more literal (Zotkin, Duraiswami and Davis 2004;
Smalley 2007; Kendall 2010). The use of surround sound and 3-D formats has encouraged
composers to surround and immerse the listener with the sound source, but the virtual
spaces in which the sound sources are staged are often very literal.
Within the world of popular music, on the other hand, the case is very often reversed:
the instrumentation is frequently very conventional while the staging is often highly
surreal, with multiple spaces of different sizes superimposed on each other. Although
the complexities of staging in popular music have been discussed much more than in
other musical traditions (e.g. Moore 1992; Doyle 2006; Camilleri 2010; Dockwray and
Moore 2010; Moore 2012; Dibben 2013; Zagorski-Thomas 2015), these very common
practices of multiple parallel and often conflicting spatial cues need to be addressed in
more detail. Indeed, it is often the case that a small, bright room ambience is used purely
as a timbral and dynamic effect, thickening up and lengthening the tone of drums for
example, while additional reverbs and delays are considered to be doing the spatial
work.
Stereo spatial effects involving time-based processing (delay and reverb) are very seldom
realistic – even binaural recordings are seldom dummy-head and the levels of ‘artificiality’
can be altered in many different ways:
● configuring two microphones differently than two ears on a head: e.g. pointing two
directional mics towards the sound source or spacing them wider apart than a head;
● using more microphones than we have ears and then mixing them down to a stereo
master afterwards;
● using artificial or schematic reverb that can then be processed separately from the
‘dry’ original sound; and
● superimposing more than one reverb on top of each other, combining delay and
reverb or changing the perceived performance environment in different sections of
a piece of music.

Much like the question of equalization and distortion, the established literature (Owsinski
1999; Moylan 2007; Case 2012; Izhaki 2018; Minchella 2018) focuses on the theory of the
generic technologies rather than the sonic specifics of particular units or spaces. There
is also quite extensive reference to the practices and techniques of UK and US popular
music from the 1960s to the 1990s, and the ways in which reverberation was often delayed,
equalized or filtered to prevent masking. Indeed, whether through interviews with key
engineers (Owsinski 1999) or reference to landmark recordings (Moylan 2007; Case 2012;
Izhaki 2018), the literature harks back to the apprentice model of learning by example from
a master, with relatively little in terms of detailed analysis of the justification or musical
reasons for particular approaches. Thus Izhaki explains delay and reverb coupling in the
following terms:
Blending a delay with a reverb is known to result in more impressive effect than having only
one of them. One way to look at it is as if the delay enhances the reverb. (2018: 448)

And Case on unreal space:


Reverb devices are also used for synthesizing a reverberant character that may not exist in
nature. Wishing to enhance the music with lushness or some other form of beauty and held
only to a standard that it ‘sound good’, an engineer might dial in settings on a reverb device
that violate the physics of room acoustics […] Such a reverb can be made to sound glorious
coming out of loudspeakers. (2012: 309)

This reflects one of the key historical problems with pedagogy in this area: the scientific
theory behind the devices is something to be learned, but the aesthetics of how they should
be applied in musical circumstances are a matter of long-term experience and learning
‘hints and tips’ from the masters about what they did. There is very little connect between
the technical theory and a musicological theory of production aesthetics. To be fair, this
is a disconnect that exists across the whole world of music education but it is a disconnect
that we should be working hard to address.

Conclusion
The chapter’s narrative has, on the one hand, followed a linear path informed by its domain-
categorization, but, on the other hand, it has simultaneously challenged it by progressing
in its dynamics from questions brought about by practice/process, which highlight issues
in our theoretical understanding, to syntheses of cross-disciplinary theoretical frameworks,
which should leave – it is our hope – a circular (non-linear) resonance in the reader’s
mind. The words emphasized here are, of course, purposefully chosen to reflect the
introduction’s music production metaphors, at the same time illustrating that a completely
textual interpretation of sonic phenomena may be limited (i.e. the musical analogy is
more than a playful metaphor – it functions as a structural interpretative tool as well).
The chapter’s sections therefore connect, contrast and interact in their illustration of how
specific domain classifications are problematized by both audio processing tool designs
and creative application, identifying gaps in the discipline’s use of meaningful definitions/
understandings that need to be addressed in the pursuit of a more complete musicological
theory of record production aesthetics.

Bibliography
Abel, J. S., D. P. Berners, S. Costello and J. O. Smith III (2006), ‘Spring Reverb Emulation
Using Dispersive Allpass Filters in a Waveguide Structure’, presented at the 121st
Convention of the Audio Engineering Society, Journal of the Audio Engineering Society,
54 (Abstracts): 1277.
Anderton, C. (2018), ‘Double-Tracking, Harmonizing, and Layering: How to Record and Mix
Multiple Vocals’, Reverb, 18 September. Available online: https://reverb.com/news/double-
tracking-harmonizing-and-layering-how-to-record-and-mix-multiple-vocals (accessed
26 June 2019).
Antares (2017), Auto-Tune Realtime: User Manual, USA: Antares Audio Technologies.
Bain, K. (2017), ‘How the Eventide Harmonizer Expanded the Possibilities of Sound’, Red Bull
Music Academy Daily, 3 November. Available online: https://daily.redbullmusicacademy.
com/2017/11/eventide-instrumental-instruments (accessed 29 June 2019).
Begault, D. R., E. M. Wenzel and M. R. Anderson (2001), ‘Direct Comparison of the Impact of
Head Tracking, Reverberation, and Individualized Head-Related Transfer Functions on the
Spatial Perception of a Virtual Speech Source’, Journal of the Audio Engineering Society, 49
(10): 904–916.
Berger, H. M. and C. Fales (2005), ‘“Heaviness” in the Perception of Heavy Metal Guitar
Timbres: The Match of Perceptual and Acoustic Features Over Time’, in P. D. Greene and
T. Porcello (eds), Wired for Sound, 181–197, Middletown, CT: Wesleyan University Press.
Born, G. (2013), Music, Sound and Space: Transformations of Public and Private Experience,
Cambridge: Cambridge University Press.
Breen, C. (2013), ‘Review: Logic Pro X Loses None of Its Power, Gains Great New Features’,
Macworld, 16 July. Available online: https://www.macworld.com/article/2044283/logic-
pro-x-loses-none-of-its-power-gains-great-new-features.html (accessed 8 July 2019).
Bregman, A. S. (1994), Auditory Scene Analysis: The Perceptual Organization of Sound,
Cambridge, MA: MIT Press.
Camilleri, L. (2010), ‘Shaping Sounds, Shaping Spaces’, Popular Music, 29 (2): 199–211.
Case, A. U. (2012), Sound FX: Unlocking the Creative Potential of Recording Studio Effects,
Burlington, MA: Focal Press.
Clarke, E. F. (2005), Ways of Listening: An Ecological Approach to the Perception of Musical
Meaning, New York: Oxford University Press.
Computer Music Magazine (2014), ‘Skrillex-Style Twisted, Glitchy Vocals Using Melodyne
– Part 4/10’, YouTube, 15 September. Available online: https://www.youtube.com/
watch?v=orbroFt1_8E (accessed 26 June 2019).
Costello, S. (2010), ‘Pitch Shifters, Pre-Digital’ [blog], The Halls of Valhalla, 4 May. Available
online: https://valhalladsp.wordpress.com/2010/05/04/pitch-shifters-pre-digital/ (accessed
24 June 2019).
Davis, D. (2013), Sound System Engineering, 4th edn, New York: Focal Press.
Dibben, N. (2013), ‘The Intimate Singing Voice: Auditory Spatial Perception and Emotion
in Pop Recordings’, in D. Zakharine and N. Meise (eds), Electrified Voices: Medial, Socio-
Historical and Cultural Aspects of Voice Transfer, 107–122, Göttingen: V&R University
Press.
Dockwray, R. and A. F. Moore (2010), ‘Configuring the Sound-Box 1965–1972’, Popular
Music, 29 (2): 181–197. doi: 10.1017/S0261143010000024.
Doyle, P. (2006), Echo and Reverb: Fabricating Space in Popular Music Recording, 1900–1960,
Middleton, CT: Wesleyan University Press.
Eltro Brochure (1967). Available online: http://www.wendycarlos.com/other/Eltro-1967/
Eltro-1967.pdf (accessed 26 June 2019).
Eventide Audio (2016a), ‘History of the Eventide H910 Harmonizer®’, YouTube, 11 July.
Available online: https://www.youtube.com/watch?v=977Sri5EcCE&app=desktop
(accessed 26 June 2019).
Eventide Audio (2016b), ‘Laurie Anderson and the Eventide H910 Harmonizer (Full Version)’,
YouTube, 11 July. Available online: https://www.youtube.com/watch?v=6veZObvDCXU&
app=desktop (accessed 26 June 2019).
Fales, C. (1998), ‘Issues of Timbre: The Inanga Chuchotee’, in Ruth M. Stone (ed.), Garland
Encyclopedia of World Music, 164–207, New York: Garland.
Fales, C. (2002), ‘The Paradox of Timbre’, Ethnomusicology, 46 (1): 56–95.
Fales, C. (2005), ‘Listening to Timbre During the French Enlightenment’, in Proceedings of the
Conference on Interdisciplinary Musicology (CIM05), Montréal.
Fink, R., M. Latour and Z. Wallmark (2018), The Relentless Pursuit of Tone: Timbre in Popular
Music, New York: Oxford University Press.
Godfrey, J. (2006), ‘Waves Vocal Bundle’, Sound on Sound. Available online: https://www.
soundonsound.com/reviews/waves-vocal-bundle (accessed 26 June 2019).
Hall, E. T. (1966), The Hidden Dimension, London: Doubleday.
Houghton, M. (2013), ‘Gain Staging in Your DAW Software: Level Headed’, Sound on Sound.
Available online: https://www.soundonsound.com/techniques/gain-staging-your-daw-
software (accessed 3 July 2019).
Houghton, M. (2015), ‘Q. “What’s the Best Way to Clip My Drums?”’, Sound on Sound.
Available online: https://www.soundonsound.com/sound-advice/q-whats-best-way-clip-
my-drums (accessed 3 July 2019).
Howard, D. M. and J. Angus (1996), Acoustics and Psychoacoustics, Oxford: Focal Press.
Howard, D. M. and J. A. S. Angus (2009), Acoustics and Psychoacoustics, 4th edn, Oxford:
Focal Press.
Izhaki, R. (2018), Mixing Audio: Concepts, Practices and Tools, 3rd edn, New York: Routledge.
Kahn, Jordan (2013), ‘Logic Pro X Review: Powerful New Features & a Simplified UI
With No Compromises for Pros’, 9TO5Mac, 26 July. Available online: https://9to5mac.
com/2013/07/26/logic-pro-x-review-powerful-new-features-a-simplified-ui-with-no-
compromises-for-pros/ (accessed 8 July 2019).
Kendall, G. S. (2010), ‘Spatial Perception and Cognition in Multichannel Audio for
Electroacoustic Music’, Organised Sound, 15 (3): 228–238.
Marlens, W. S. (1966), ‘Duration and Frequency Alteration’, Journal of the Audio Engineering
Society, 14 (2): 132–139.
Marshall, O. (2014), ‘A Brief History of Auto-Tune’ [blog], Sound Studies, 21 April. Available
online: https://soundstudiesblog.com/2014/04/21/its-about-time-auto-tune/ (accessed
23 June 2019).
Maxwell, C. B. (2007), ‘Real-Time Reverb Simulation for Arbitrary Object Shapes’, in
Proceedings of the International Conference on Digital Audio Effects, University of Bordeaux,
Bordeaux, France.
McAdams, S., S. Winsberg, S. Donnadieu, G. De Soete and J. Krimphoff (1995), ‘Perceptual
Scaling of Synthesized Musical Timbres: Common Dimensions, Specificities, and Latent
Subject Classes’, Psychological Research, 58 (3): 177–192.
McFarlane, B. (2010), ‘Learn the Perks of Ableton Live’s Warp Function’, Electronic Musician,
5 January. Available online: https://www.emusician.com/how-to/learn-the-perks-of-
ableton-lives-warp-function (accessed 27 June 2019).
Minchella, D. (2018), ‘The Poietics of Space: The Role and Co-performance of the Spatial
Environment in Popular Music’, in S. Bennett and E. Bates (eds), Critical Approaches to the
Production of Music and Sound, 41–61, New York: Bloomsbury Publishing.
Moore, A. (2012), ‘All Buttons In: An Investigation into the Use of the 1176 FET Compressor
in Popular Music Production’, Journal on the Art of Record Production, (6). Available online:
https://www.arpjournal.com/asarpwp/all-buttons-in-an-investigation-into-the-use-of-the-
1176-fet-compressor-in-popular-music-production/ (accessed 19 August 2019).
Moore, A. F. (1992), Rock: The Primary Text: Developing a Musicology of Rock, Ashgate Popular
and Folk Music Series, 2nd edn, Farnham: Ashgate.
Moore, A. F. (2012), Song Means: Analysing and Interpreting Recorded Popular Song, Farnham:
Ashgate.
Moore, B. C. J. (1999), ‘Controversies and Mysteries in Spatial Hearing’, in Audio Engineering
Society 16th International Conference: Spatial Sound Reproduction, Rovaniemi: Audio
Engineering Society.
Moylan, W. (2007), Understanding and Crafting the Mix: The Art of Recording, 2nd edn, New
York: Focal Press.
Mynett, M. (2013), ‘Contemporary Metal Music Production’, PhD thesis, University of
Huddersfield, Huddersfield.
Owsinski, B. (n.d.), ‘The Subtle Art of Pitch Correction’ [blog], UA. Available online: https://
www.uaudio.com/blog/pitch-correction-basics/ (accessed 25 June 2019).
Owsinski, B. (1999), The Mixing Engineer’s Handbook, 1st edn, Vallejo, CA: MixBooks/ArtistPro.
Paiva, R. C., S. D. Angelo, J. Pakarinen and V. Välimäki (2012), ‘Emulation of Operational
Amplifiers and Diodes in Audio Distortion Circuits’, IEEE Transactions on Circuits and
Systems II: Express Briefs, 59 (10): 688–692.
Peeters, G., B. L. Giordano, P. Susini, N. Misdariis and S. McAdams (2011), ‘The Timbre
Toolbox: Extracting Audio Descriptors from Musical Signals’, Journal of the Acoustical
Society of America, 130 (5): 2902–2916.
Porcello, T. (2004), ‘Speaking of Sound: Language and the Professionalization of Sound-
Recording Engineers’, Social Studies of Science, 34 (5): 733–758.
Provenzano, C. (2018), ‘Auto-Tune, Labor, and the Pop-Music Voice’, in R. Fink, M. Latour
and Z. Wallmark (eds), The Relentless Pursuit of Tone: Timbre in Popular Music, 159–181,
New York: Oxford University Press.
Revill, G. (2016), ‘How Is Space Made in Sound? Spatial Mediation, Critical Phenomenology
and the Political Agency of Sound’, Progress in Human Geography, 40 (2): 240–256.
Rumsey, F. (2001), Spatial Audio, New York: Focal Press.
Savage, S. (2011), The Art of Digital Audio Recording: A Practical Guide for Home and Studio,
New York: Oxford University Press.
Schaeffer, P. (1977), Traité des objets musicaux, Paris: Le Seuil.
Schlemmer, U. (2011), ‘Reverb Design’, in Pure Data Convention, Weimar and Berlin, Germany.
Schmidt-Horning, S. (2012), ‘The Sounds of Space: Studio as Instrument in the Era of High
Fidelity’, in S. Frith and S. Zagorski-Thomas (eds), The Art of Record Production: An
Introductory Reader to a New Academic Field, 29–42, Farnham: Ashgate.
Smalley, D. (1986), ‘Spectromorphology and Structuring Processes’, in S. Emmerson (ed.), The
Language of Electroacoustic Music, 61–93, London: Macmillan.
Smalley, D. (1997), ‘Spectromorphology: Explaining Sound-Shapes’, Organised Sound, 2 (2):
107–126.
Smalley, D. (2007), ‘Space-Form and the Acousmatic Image’, Organised Sound, 12 (1): 35–58.
Stavrou, M. P. (2003), Mixing With Your Mind: Closely Guarded Secrets of Sound Balance
Engineering, Mosman, Australia: Flux Research Pty.
Sterne, J. (2015a), ‘Space Within Space: Artificial Reverb and the Detachable Echo’, Grey
Room, 60: 110–131.
Sterne, J. (2015b), ‘The Stereophonic Spaces of Soundscape’, in P. Théberge, K. Devine and T.
Everett (eds), Living Stereo: Histories and Cultures of Multichannel Sound, 65–83, New York:
Bloomsbury.
Thornton, M. (2010), ‘Headroom & Master Fader’, Sound on Sound. Available online: https://
www.soundonsound.com/techniques/headroom-master-fader (accessed 3 July 2019).
Vandeviver, C. (2018), ‘Why I Prefer Flex Pitch Over Melodyne for Pitch Editing in Logic –
Expert Opinion’, Logic Pro Expert, 30 August. Available online: https://www.pro-tools-
expert.com/logic-pro-expert/2018/8/30/why-i-prefer-flex-pitch-over-melodyne-for-pitch-
editing-in-logic-expert-opinion (accessed 26 June 2019).
Walden, J. (2007), ‘Auto-Tune vs Melodyne’, Sound on Sound. Available online: https://www.
soundonsound.com/reviews/auto-tune-vs-melodyne (accessed 25 June 2019).
Wishart, T. (1997), On Sonic Art, 2nd rev. edn, New York: Routledge.
Yeh, D. T., J. S. Abel and J. O. Smith (2007), ‘Simplified, Physically-Informed Models of
Distortion and Overdrive Guitar Effects Pedals’, in Proceedings of the International
Conference on Digital Audio Effects (DAFx-07), Citeseer, 10–14.
Zagorski-Thomas, S. (2015), ‘An Analysis of Space, Gesture and Interaction in Kings of
Leon’s “Sex On Fire”’, in R. von Appen, A. Doehring, D. Helms and A. F. Moore (eds),
Song Interpretation in 21st-Century Pop Music, 115–132, Farnham: Ashgate Publishing
Limited.
Zagorski-Thomas, S. (2018), ‘Timbre as Text: The Cognitive Roots of Intertextuality’, in
S. Lacasse and L. Burns (eds), The Pop Palimpsest: Intertextuality in Recorded Popular
Music, 273–290, Ann Arbor: University of Michigan Press.
Zölzer, U. (2011), DAFX: Digital Audio Effects, Chichester: John Wiley & Sons.
Zotkin, D. N., R. Duraiswami and L. S. Davis (2004), ‘Rendering Localized Spatial Audio in a
Virtual Auditory Space’, IEEE Transactions on Multimedia, 6 (4): 553–564.

Discography
AC/DC (1980), [vinyl] Back in Black, Albert Productions.
Anderson, Laurie (2001), [HD CD] Life on a String, Nonesuch.
Aphex Twin (2014), [CD] Syro, Warp Records.
Bowie, David (1977), [vinyl] ‘Breaking Glass’, Low, RCA Victor.
Cher (1998), [CD] Believe, Warner Bros Records.
Deadmau5 (2009), [CD] ‘Ghosts n’ Stuff ’, For Lack of a Better Name, Ultra Records.
Flume (2016), [CD] Skin, Mom + Pop.
Skrillex (2011), [CD] Bangarang, Big Beat.
T-Pain (2007), [CD] Epiphany, Jive.
U2 (1983), [vinyl] War, Island Records.
U2 (1984), [vinyl] The Unforgettable Fire, Island Records.
Van Halen (1986), [vinyl] 5150, Warner Bros Records.
Part VII
Post-Production

The notion of post-production requires that there is something that can be done to a
recording after the ‘capture’ of a performance. Another way in which we could divide up
the history of recording – other than mechanical, analogue, digital or through the types
of format on which it was distributed – is through the ways in which music production
technology afforded the alteration and manipulation of recorded sound before or after it
was stored on a representational system. In the acoustic period, when there was mostly a
single horn for capturing the sound, the dynamics and balance of an ensemble could only
be affected through performance or through proximity to the device. In some instances
performances were ‘mixed’ by performers being pushed backwards and forwards on
wheeled chairs or platforms to get them closer to or further from the recording horn. When
electrical disc cutting was the primary process, there was scope for some equalization and
dynamic compression in the transfer from the recording disc to the pressing plate, but the
balance of the individual instruments and voices in a track was controlled by microphones.
Before the recording took place, the position of the musicians in relation to however many
microphones were being used and the volume of each of the microphones going to the
disc recorder determined the mix. Once recorded there was no going back – there were
only processes of pre-production and production. You wrote songs and arrangements, you
selected musicians and rehearsed them – these almost always happened as pre-production –
and then you ‘produced’ the recording – establishing the right balance, getting the right
performance and recording it.
The advent of tape recording in the 1940s was the initial driver for the more conscious
split between production and post-production because it allowed for editing. The selection of
segments of performances from different takes that could be (literally) stuck together to create
a new sequence of activity gave an agency to recordists ‘after the fact’ that had not existed
before. The addition of equalization, dynamic processing or spatial effects had to happen
during the production process – while it was being played – and that was mostly achieved as
a single performance. By using separate microphones for separate instruments, people who
were playing together in the same room could be treated differently and this was famously
how crooning – someone singing quietly over a loud orchestra – or the use of echo chambers
to add reverberation to a single voice in an ensemble, developed. However, in the last years of
direct to disc recording and the first years of tape recording a few pioneers started to work out
ways of using technology to allow performers to contribute to the same recording by playing
at different times. Les Paul famously used a ‘sound on sound’ technique on his recordings
with Mary Ford to layer his guitars and her voices by ‘bouncing’ from one recording machine
to another while adding a new performance at the same time. And other artists as diverse as
Sidney Bechet, the famous jazz clarinettist who duetted with himself, and The Chipmunks,
who recorded an instrumental track at one speed and then bounced on their vocals at a slower
speed to create that famous high-pitched cartoon vocal sound, used the same technique. But
although all these techniques for manipulating and shaping the sound on recordings were
laying the ground rules for future experimentation, they were still part of the production
process, and the only activities that could really be considered post-production until the
1960s were tape editing and any ‘mastering’ – such as equalization or dynamic compression
and limiting – that happened during the transfer to the pressing master.
It was the development of the ‘sel-sync’ system allowing multitrack recording
that ushered in a new era of post-production, changing the workflows of
recording and the sound of recorded music forever. By recording different instruments and
voices onto different tracks of the recording system during the production process, a new
post-production stage of mixing was introduced. And now all of those clever techniques
of pitch alteration, equalization, dynamic processing or spatial effects could be thought
about and added afterwards. And the ‘sel-sync’ analogue multitrack tape machines were
superseded by digital tape and cassette (e.g. ADAT and DA-88) and then hard disc and
digital audio workstation (DAW) systems. In addition, musical instrument digital interface
(MIDI) sequencers allowed the addition of hardware and software electronic musical
instruments and samplers. All of these developments helped to shift the technological frame,
the way that users of these technologies thought about the affordances and obstacles
associated with recording technology, from one of capturing something that exists in the
world towards creating something that never did exist.
In the preface to Part V we mentioned the way that technology does not simply de-
skill or automate certain professions, but also shifts agency and control between different
participants in the process. This is very often driven by economics and demand rather than
by some more abstract process of making the technology work ‘better’ or ‘more effectively’.
And the people with the economic power, either through personal affluence or through
sheer numbers, are the people who, in conjunction with the producers of the technology,
define and shift the technological frame. In the contemporary marketplace, demand for
production technology is driven in large part by students, amateurs and semi-professionals
rather than, as in the mid-twentieth century, by production professionals. This creates three
drivers in the post-production technology market – emulations of vintage technologies
that hark back to the ‘golden age’ of production, the incorporation of presets which
embody the knowledge of existing experts, and the simplification of interfaces through the
use of semantic terminology such as ‘punch’ or ‘heaviness’ to control multiple parameters.
This certainly is not a criticism of this shift. It is recognition that the amazing advances
and developments in current production technology are occurring within a context that is
strongly influenced by a broad range of socioeconomic factors that are the result of race,
gender, sexuality, postcolonialism, globalization and many other influences.
22
Studying Mixing: Creating a
Contemporary Apprenticeship
Andrew Bourbon

Introduction
Historically, mixing has been a skill that has been developed by progression through a
number of roles within the studio environment. The traditional route to mixing would see
interns and assistants working in the studio, taking on everyday tasks such as cleaning,
artist support and general receptionist responsibilities. As these employees moved ‘up the
ladder’ they would move on from the perceived tea-making role to tape operator, assistant,
engineer and eventually to the role of mix engineer. This traditional studio hierarchy
does still exist in a few large modern studios; however, opportunities are more difficult
to come by as more mixers move from commercial spaces into privately owned facilities,
often in the residence of the engineer. Théberge (2012) suggests that there is now ‘a lack
of apprenticeship placements’, with those looking to engage in a career in mixing turning
either to self-instruction or academic study as their route into a career.
This chapter will explore the role of educational establishments in the development of
the study of mixing, examining the tools currently available to create the replacement for
the traditional apprenticeship in an academic environment. As a practitioner I have not
had the opportunity to study under an established mix engineer, instead relying on study
and experimentation to develop my own skills and practice. Much of the discussion here
is based around my own findings in this process of developing an effective curriculum in
support of a new generation of mix engineers who have turned to the education
system to help find their own voice in a challenging environment.

Apprenticeship: Learning through emulation


The traditional apprenticeship model sees less experienced engineers working with more
experienced engineers to learn their craft to a point where they are trusted with the
responsibilities of mixing. During his talk at the 2015 Art of Record Production Conference
hosted by Drexel University, Philadelphia, noted mix engineer Tony Maserati spoke about
himself not as a mix engineer but as a brand. Maserati has a number of assistants who have
developed their skills through observation, practice and feedback, much as a student would in
an academic institution. As they develop they are trusted with greater responsibility, until the
point at which they are ready to mix as part of the Maserati brand. This process is not unique
to Maserati, with a number of established mix engineers now having a clear sonic brand
present throughout the credited output. In order to be successful through this apprenticeship
process those involved will have to study in detail the practices of their engineer in order not
only to be able to recreate the sound of that engineer but also to anticipate the imagination
and musical direction imparted by these engineers in order to convincingly replicate and
indeed become part of the evolving sound of the brand that they represent.
As previously stated the apprenticeship journey begins with tasks that are not traditionally
associated with music creation. Once trust is gained, the apprentice moves into the role of
tape-op and assistant. During this time the assistant will be offered the opportunity to
work alongside the engineer, engaging in activities from console recall to digital audio
workstation (DAW) operation and patching of equipment. Maserati again remarked in his
keynote that the opportunities for an apprentice are further compromised by the move into
the DAW as the primary mixing environment. Though he prefers to work on a console,
there is simply not the budget in most cases to engage in that workflow. The problem is
further exacerbated by the change in the expectation of clients, with the end of the mix
often now representing a negotiation between client and engineer through a process of
recalls. In the past the recall of the console and preparation of stems for mix would be a
significant component of the activities of an assistant, with the physical environment and
workflow clearly mapped through the console and patchbay. In the modern environment
this has been replaced with mix templates in the DAW, which, as will be discussed later in
this chapter, afford an interesting opportunity for workflow study.

The apprenticeship model: Studying workflow and process
One of the key affordances of the traditional apprenticeship is the opportunity to work
alongside an established engineer, reflecting on their workflow through observation and
physical recall. The sound associated with mix engineers is the sum of a set of incredibly
complex interactions between the engineer, the tools used in mixing and the musical
relationship with the song itself. At the heart of the traditional apprenticeship lies the
role of the assistant, who has a range of responsibilities including patching of hardware,
initial console preparation, editing and tuning, and mix recall. The mix engineer workflow
is a set of interactions with the chosen tools that is essential to analyse in preparation
for promotion into a mix brand, or indeed a solo mix engineer career following in the
footsteps of an established mentor. The apprentice would gain insight not just from the
settings on the various audio-processing tools employed in a mix but also from the decision to
use those specific tools in a given context. It is clear that workflow plays a huge part in the
sound of a number of engineers. Chris Lord-Alge (CLA) for example has a very particular
way of engaging with his Solid State Logic (SSL) console, which has developed through
years of practice. His use of buses to drive compressors and the sound of the SSL equalizer
are all arguably a creative abuse of the original intended use of the console, but are an
essential part of the recognizable sound of a CLA mix. The use of automation in the mixes
is another essential feature in replicating the sound of a CLA mix, with a process again
driven by the affordances of the automation system on the SSL console. Though the sound
of the mix is very much the sound of CLA, it is clear that the workflow and properties of
the tools selected by CLA in his mix process have become a huge part of his sound. An
assistant working under an engineer will have the opportunity to study the workflow in
detail, understanding not just the processes undertaken but the inspirations behind those
processes and the musical decisions made in the deployment of mix processing tools.

The apprenticeship model: Studying response to a musical language
A number of top mix engineers have moved from working on a console to a more
consolidated approach, either driven by a hybrid mix technique of hardware and software
or through moving completely ‘in-the-box’. It is interesting that through this transition the
overarching sonic qualities of the mixes do not change – a Spike Stent or Mike Crossey in-
the-box mix still has the same identifiable characteristics as a mix completed on a console.
During the early stage of the transition from console to computer a trained ear may be
able to identify qualities in the mixes that differentiate between these two workflows, but
despite this the overarching signature of the engineer still dominates the mix. It is clear
that the workflow employed by these engineers was essential in allowing them to find and
define their sound, but once established, engineers are able to find their sound despite
changing the workflow and toolset. As an assistant engaging in a studio apprenticeship, it is
important to move beyond replicating the process and to understand the musical decisions
that are made in the process of a mix. In his keynote at the 2018 Art of Record Production
Conference hosted by the University of Huddersfield, Andrew Scheps posed the question:
what does the listener actually hear? It is not the speakers that the engineer worked on, or the
console and processing choices, but instead the music itself that resonates with the listener.
The job of a mix engineer is arguably to enhance the presentation of music to maximize
the effectiveness of the desired impact on the listener. The tools used in the mixing process
will have a significant effect on this. It is, for example, possible to take a piano line and
make it mournful or resolute through differing approaches to distortion. It is the study
of the choices made by engineers as to what direction needs to be imparted and how this
impacts on the overall presentation of a mix that is at the core of the mix process, and of the
process of learning to mix through an apprenticeship. It is the combination of the workflow
and mix approach, the tools engaged with through mixing and an appreciation of their
affordances, and the musical direction and inspiration of individual engineers, that has led to the
development of the recognized engineer sound or brand, with an apprenticeship offering a
unique opportunity to study all aspects of the creative mix process simultaneously.

An alternative to the traditional apprenticeship
With these diminished opportunities to engage in a traditional apprenticeship, students
looking to develop careers in mixing are now regularly turning to the music technology
education system to gain the instruction and insight required to become recognized as
mix engineers. The traditional apprenticeship provides an opportunity to study multiple
aspects of the mix process through a close relationship with a specific engineer. A number
of highly qualified and industry recognized practitioners can now be found in educational
establishments, passing on their skills and knowledge to students. However, there are
a number of potential opportunities for an enhanced new apprenticeship to be offered
through a multimodal approach to the study of mixing at universities.

Studying mixing through traditional textbooks


There are a number of established textbooks regularly used in the study of mix practice.
This chapter is not written to provide a literature review of mixing texts, but instead it will
look at trends in the approaches taken in key publications. Many of these texts focus on
an instructional approach to mixing that breaks down the creative and technical choices
employed by engineers in the process of creating a mix. Izhaki (2018) and Owsinski
(2017) represent two such texts, with structured approaches to mixing explored through
both texts. In both cases a brief history of mixing is explored, along with a discussion of
elements that are considered essential in the creation of a successful mix. Izhaki begins by
identifying what a mix is on a conceptual level, reflecting on the importance of listening
and critical evaluation. The key concept discussed here relates to three steps to creative
mixing involving interactions between vision, evaluation and action (Izhaki 2018: 20).
Following this process, workflow becomes the next topic for discussion, with a somewhat
generic analysis of potential workflow and mix order approaches and general working
practices. It is noteworthy that though workflow is discussed there is no acknowledgment
of workflows used by established engineers. There is recognition of a number of top mix
engineers, including Andy Wallace, Spike Stent and Rich Costey; however, there is little to link the questions asked regarding perceived mix quality to engineer practice. As the
text continues to develop a number of interesting workflow and mix decision questions
are asked of the reader, reflecting on staging, structure and musical interest through
the domains of frequency, level, stereo image and depth. The text then looks to specific
mix tools by category, providing the reader with audio examples of the key tools employed
through the mixing process. Though this is far from an exhaustive evaluation of this text,
the structure is common to a number of texts regularly used by students and educators in
supporting students in developing their skills in mixing. Gibson (2019) takes a compatible
approach to mix processing but uses visualization of space with sound boxes to analyse
the spatial and spectral content of mixes, looking at appropriate staging approaches on a
genre-by-genre basis. Case (2007) takes a more processing focused approach, exploring
workflow and then breaking down audio processing tools by category. Owsinski (2017)
also contains a number of interviews with established practitioners, offering some insight
into tools, workflows and musical vision but without providing the depth of understanding
found through a traditional apprenticeship.
There are a number of other important texts in this field that take a broadly similar approach
to the subject matter, establishing musical inspiration, workflow and audio processing as the
primary tools for the education of aspiring engineers. Many of these resources provide a
‘how to’ approach to processing but do not describe processing in the terms that you might
expect to hear from an engineer discussing the sonic impact of their favourite processing
tools. An 1176, for example, is in my opinion not best described in the technical description
of attack release and ratio. Instead, the 1176 imparts an energy and excitement, with faster
attack times creating a sense of enhanced ‘hair’ as the compressor digs into the source
sound, and faster release times adding a sense of ‘air’ as the compressor releases. We can
explore other processing tools in similar ways, exploring the impact of tools on the perceived
response to a sound by a listener. This level of insight into the impact of processing tools is
rarely approached in mixing textbooks but will be at the heart of a traditional apprenticeship.
Though these texts do not provide the insight into engineer practice that would be
gained through an apprenticeship approach, there is still significant value in this approach
to the study of mixing. The fundamental questions posed regarding the structure of a
mix, the manipulation of audience reaction to a performance and an analysis of the tools
engaged with through the mixing process are essential to support further analysis to be
discussed through this chapter.

Academic writing and mixing


Students have access to a range of publications that fall outside of the category of traditional
textbook in support of their studies. As with the reflection on the provision of textbooks,
this section is not intended to represent a detailed literature review but rather to explore
the types of materials available in supporting students in gaining insight into the record-
mixing process. The defining feature of the writing included in this section is in the
focus on conceptual understanding of mix process rather than instruction and technical
description. Zagorski-Thomas (2014) explores a range of important concepts, including
the impact of ecological perception on staging and the resulting impact on the listener.
This analysis provides important insight in the study of engineer practice as discussed later
in this chapter. The equalization (EQ) approaches of engineers such as Andy Wallace and
Chris Lord-Alge have clear affordances in terms of the energy required to create the sound
being received, for example, and sit at the heart of the clear sonic signature of these mixers’
output. Moylan (2002) provides an important collection of concepts for the analysis of
mixed material, with a focus on staging, envelope and timbre from which a number of
published concepts are developed. Periodicals such as the Journal of the Art of Record
Production also provide an important collection of resources that support the analysis of
mixing process and practice, with contributions from academics and practitioners.
Composers such as Schaeffer (2012) also contribute important concepts to the analysis
of music and audio processing. Schaeffer identified twenty-five different categories
for the description of the spectromorphology of a sound, which, though proposed as a methodology for understanding his own music, also provide a set of useful criteria for
the evaluation of sound processing. Compression, for example, is described in terms of
the technical process in many textbooks, with discussion of the type of gain reduction,
attack, release, ratio and threshold as the key attributes. Context may be provided in a basic
analysis of the characteristics of particular gain reduction types, and the impact of speed in
the attack and release characteristics in relation to the overarching sound envelope. There
is little discussion as to the change in colour, richness or thickness as the compressor moves
in and out of compression. Schaeffer, however, provides us with a set of criteria that focus
on the morphology of a sound, which can be plotted in multiple dimensions. This analysis
plays an essential role in providing a student with the understanding of the potential
affordances of a processing tool, and therefore brings that student closer to understanding
how they may explore the realization of a chosen musical direction.

Measurement
In conversation with engineers about the characteristics of particular tools, it is not uncommon to resort to semantic descriptors to articulate the affordances of audio processing tools. As previously discussed, we have a number of approaches to the analysis of
mix practice provided through contemporary academic writing and the appropriation of
language previously used to describe genres such as electroacoustic music. Measurement
provides students with a useful set of tools for exploring what is happening through audio
processing, giving visual feedback to reinforce and expose elements perceived through
listening analysis. There are a number of useful measurement approaches available
to students, ranging from EQ curve analysis to compression curve and burst tone
measurement. Simple distortion measurement is also incredibly useful, demonstrating the
audio enhancement that takes place through the action of compression, for example. An
LA2A will show a different distortion profile when in compression compared with being
out of compression, which, when combined with an analysis of attack and release curves and response times, can be used to explain the much-cherished sonic characteristics of
that tool. The burst tone shows the nature of the attack and release envelope of a given
compressor, again giving an important insight into the behaviour of particular audio
devices. The dbx 160 is a particularly striking example of this: its aggressive release curve suggests the action of throwing the sound at the listener, and goes some way to showing why this tool found a place as a mono bus tool reinforcing the impact of a kick drum, followed by a Pultec equalizer, in the workflow of Bob Power and his protégés. It is entirely likely that these characteristics
were not discovered through measurement but through the creative exploration of
these tools, with many engineers reluctant to engage in technical measurement and
analysis, instead choosing to rely on feel and techniques passed down from generation
to generation. Measurement simply provides another methodology for the articulation of
the affordances discovered by engineers in their practice, and a way to encourage students to try
creative processing and then to explore the results through multiple analysis approaches.
It is incredibly difficult to explore through measurement alone both the complex behaviour of audio processing tools and their perceptual interaction with listeners. However, by combining multiple measurement techniques with an auditory analysis of the musical affordances, a set of descriptors for audio processing can be developed and creative mix practice enhanced.
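The kinds of measurement described above can be prototyped very simply. The sketch below is a minimal, hypothetical example in Python (the sample rate, tone frequencies and the `process` placeholder standing in for whatever compressor is being studied are all assumptions, not a published method): it generates a quiet/loud/quiet burst tone, passes it through a processor and reports a crude gain-reduction figure and the harmonic distortion of the loud segment.

```python
import numpy as np

SR = 48_000  # sample rate (assumption)

def burst_tone(freq=1_000.0, quiet_db=-30.0, loud_db=-6.0, seg=0.5, sr=SR):
    """Quiet/loud/quiet sine burst used to expose attack and release behaviour."""
    t = np.arange(int(3 * seg * sr)) / sr
    tone = np.sin(2 * np.pi * freq * t)
    gain = np.full_like(tone, 10 ** (quiet_db / 20))
    gain[(t >= seg) & (t < 2 * seg)] = 10 ** (loud_db / 20)
    return tone * gain

def envelope(x, win_ms=5.0, sr=SR):
    """Rectify-and-smooth amplitude envelope (crude but adequate here)."""
    win = np.ones(int(sr * win_ms / 1000))
    return np.convolve(np.abs(x), win / win.size, mode="same")

def thd(x, fundamental=1_000.0, harmonics=5, sr=SR):
    """Total harmonic distortion of a steady tone, from the magnitude spectrum."""
    spec = np.abs(np.fft.rfft(x * np.hanning(x.size)))
    freqs = np.fft.rfftfreq(x.size, 1 / sr)
    level = lambda f: spec[np.argmin(np.abs(freqs - f))]
    harm = np.sqrt(sum(level(fundamental * n) ** 2 for n in range(2, harmonics + 2)))
    return harm / level(fundamental)

# 'process' is a placeholder for the compressor under study (e.g. audio rendered
# offline through a plugin); the identity function keeps the script self-contained.
process = lambda x: x

dry = burst_tone()
wet = process(dry)
print("peak level change (dB):", 20 * np.log10(envelope(wet).max() / envelope(dry).max()))
print("THD of loud segment:", thd(wet[int(0.6 * SR):int(0.9 * SR)]))
```

Plotting the two envelopes against one another would expose the attack and release shapes that distinguish, say, an LA2A from a dbx 160.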

Musical analysis: Building a history of mix engineers
In my own teaching I have made the study of the history of mix engineers a core subject
in the curriculum. The approaches found through more traditional textbooks, academic
writing focusing on mixing and texts that focus on the analysis of electroacoustic music
and modes of listening have provided students with a language and a set of tools for the
analysis of mix engineer sonic signatures. In the study of mixes by CLA it is clear from both
listening and technical analysis that there are a number of consistent features in his mixes.
Throughout his mix history the prominence of snare drum is clear, with a particular speed
and hardness of delivery created through an EQ approach that enhances snap and drive.
Kick is also incredibly important in these mixes, with low-frequency energy presented
in such a way that the transient of the kick is given priority in the mix. Reverbs tend
to be short, maintaining a sense of proximity and aggressive delivery in the mix. Lead
vocals are protected in the mix, with backing vocals becoming increasingly distant as they
separate from the primary performance and therefore reduce in importance compared
to the primary focus of the mix. Guitars are incredibly solid, providing a focused wall of
energy to the left and right of the vocal but never stopping the relentless energy of kick and
snare through the mix. This listening analysis is clearly backed up through spectrogram
analysis and use of stereo metering tools. It is also clear that transitions between sections
are carefully managed through automation, with snare often changing in level and feel
between chorus and verse with a subtle drop in level in the chorus to allow the energy of
other elements to push through.
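For readers who want to try this kind of corroboration themselves, the following sketch is one possible starting point rather than a description of any engineer's actual method. The file names are placeholders for any two mixdowns under comparison, and the analysis settings are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import spectrogram, welch

def load_mono(path):
    """Read a WAV file, fold to mono and normalise to peak = 1."""
    sr, data = wavfile.read(path)
    mono = data.astype(np.float64)
    if mono.ndim > 1:
        mono = mono.mean(axis=1)
    return sr, mono / np.max(np.abs(mono))

# Placeholder file names: any two mixes being compared.
for label, path in [("mix A", "cla_style_mix.wav"), ("mix B", "reference_mix.wav")]:
    sr, mono = load_mono(path)
    # Long-term average spectrum: where is the energy concentrated?
    f, pxx = welch(mono, fs=sr, nperseg=8192)
    plt.semilogx(f, 10 * np.log10(pxx + 1e-12), label=label)
plt.xlabel("Frequency (Hz)"); plt.ylabel("Level (dB)"); plt.legend()

# Spectrogram of one mix: kick and snare hits appear as vertical striations
# whose brightness reflects the EQ and transient decisions described above.
sr, mono = load_mono("cla_style_mix.wav")
f, t, sxx = spectrogram(mono, fs=sr, nperseg=2048, noverlap=1536)
plt.figure()
plt.pcolormesh(t, f, 10 * np.log10(sxx + 1e-12), shading="auto")
plt.xlabel("Time (s)"); plt.ylabel("Frequency (Hz)")
plt.show()
```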
In contrast to the mixes of CLA, an engineer such as Spike Stent provides a very different
set of musical characteristics through his mixes. We hear dramatic gestures throughout the
musical presentation, with elements moving through space in reaction to the production.
Space is incredibly important in these mixes, with elements occupying an exciting and
engaging range of positions in space, but always in reaction to some kind of musical event
within the production. The low end of the mixes is a complete departure from the lean
control of a CLA mix, instead exploring the lowest octaves of sound reproduction systems.
The stereo image is greatly enhanced through phase manipulation, with clear intent in
this stereo manipulation identifiable through a process of middle and side monitoring and
little concern for mono compatibility on some of the mixes. CLA tends to throw the mix
at you, delivering relentless energy and punch, whilst Stent draws you into an immersive
mix environment and guides the listener through the production by manipulating musical gesture in space. Both are hugely successful mix engineers, but with completely different approaches and musical inspirations.
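The middle and side monitoring and mono-compatibility check mentioned above can also be approximated numerically. The sketch below is a hedged illustration rather than a standard metering tool: the file name is a placeholder for any stereo mixdown, and the correlation figure is only a rough guide (values approaching -1 indicate material that will largely cancel when summed to mono).

```python
import numpy as np
from scipy.io import wavfile

sr, stereo = wavfile.read("stent_style_mix.wav")  # placeholder stereo mixdown
stereo = stereo.astype(np.float64)
stereo /= np.max(np.abs(stereo))                  # normalise to peak = 1
left, right = stereo[:, 0], stereo[:, 1]

mid = (left + right) / 2    # what a mono fold-down will contain
side = (left - right) / 2   # width and anti-phase information

def rms_db(x):
    return 20 * np.log10(np.sqrt(np.mean(x ** 2)) + 1e-12)

# A strongly side-heavy balance hints at deliberate width and phase manipulation.
print(f"mid  RMS: {rms_db(mid):6.1f} dB (re. peak)")
print(f"side RMS: {rms_db(side):6.1f} dB (re. peak)")

# Channel correlation: +1 fully mono-compatible, 0 uncorrelated, -1 out of phase.
print(f"L/R correlation: {np.corrcoef(left, right)[0, 1]:+.2f}")
```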
It is also of value to study mix engineer genealogy. We can see clear developments in mix
engineer family sounds, with generational shifts and developments as assistants go on to
become mix engineers in their own right and, indeed, then train their own assistants who
again bring their own interpretation of a family approach, much as a folk musician would
develop and embellish a tune passed from generation to generation. One such example of
this could be seen with Bob Power, with the family tree moving to Dave Pensado and then
to Jaycen Joshua. There is a clear family sound and mix workflow approach between these
mixers, but with an evolving approach to low-end management and mix size in response
to genre changes and improvements in technology.
Through the study of these mix engineers it is possible to gain an insight into the musical
inspiration that drives them. The texts and analysis methods discussed above provide the tools and language for
this analysis, with clear parallels with the language used by Schaeffer (2012) to describe
the musical transformations possible in his music and the transformations employed by
engineers through their mix practice. We can achieve a strong sense of the musical vision
of these engineers through this process and can engage in a process of experimentation to
try and understand the actions taken by engineers to create these musical visions.

Commercializing mix practice: Studying engineers through multiple analysis methods
In recent times a number of companies and engineers have released content aimed at
capitalizing on access to interviews, videos and sessions that provide insight into the
workflows and equipment choices made by engineers. Previously this content was limited
to magazine interviews that offered insight into the tools and practices of named engineers.
This new content provides an opportunity for study that had previously been unavailable
to students, instead being reserved for the privileged few given the opportunity to engage
in the traditional apprenticeship. This content provides a significant resource for those
looking to engage in the study of mix practice, highlighting mix workflow and creative
musical choices as engineers explore the tracks for which they have become famous and
bringing context to the mix engineer analysis explored in this chapter. Resources such as
Mix with the Masters (2019) allow students to watch some of the world’s top mix engineers
engage in mixing commercially released materials, providing a commentary into the
processes undertaken and establishing the workflow and musical direction choices made
by engineers as they mix. Engineers such as Andrew Scheps have enhanced this content
with pre-populated mix templates, providing instant access to the processes and, through
supporting content, the musical approaches taken through the mix. Understanding the
choices made by engineers has always been at the heart of mix engineer apprenticeship,
with new resources providing unique and detailed mix insight. Engineers such as Andy
Wallace appear on multiple sessions, identifying in detail the ways in which they engage
with their mixes. Users are provided with information regarding the ordering of channels on the console, the provision of time-based processing on auxiliary outputs, the approach to ‘controlled ambience’ and the use of samples to provide this control, and the approach to automation, EQ and overall mix balance.

Conclusion
The traditional studio apprenticeship has provided an essential method for the transfer of
knowledge from generation to generation, resulting in clearly identified sonic signatures
associated with engineers at the top of the mix profession. As opportunities for this
experience have diminished in the studio environment, music education establishments
have taken on the responsibility for the training of new engineers in preparation for a career
in music. By combining the study of core mix practice and mix processing tools through
traditional textbooks, students gain a fundamental understanding of what these tools can
achieve from a technical perspective, but often lack the context provided through working
with engineers to explore these tools, drawing conclusions as to the musical affordances
of audio processing and mix workflows. By engaging in academic studies of mixing through contemporary academic writing, measurement and the analysis of practice that can be achieved through contemporary rich media resources, universities are able to
provide students not only with a deep level of technical and aesthetic understanding, but
also with the knowledge and musical inspirations held within a generation of professional
mix engineers. Though the traditional apprenticeship under a single engineer may have
been lost to the industry, we are now moving into an environment where a student can
effectively undertake multiple pseudo-apprenticeships, reflecting on workflows and sonic
signatures and finding their own voice as an engineer.
Bibliography
Case, A. (2007), Sound FX: Unlocking the Creative Potential of Recording Studio Effects,
Burlington, MA: Focal Press.
Gibson, D. (2019), The Art of Mixing: A Visual Guide to Recording, Engineering and
Production, 3rd edn, New York: Routledge.
Izhaki, R. (2018), Mixing Audio, 3rd edn, New York: Routledge.
Mix with the Masters (2019), ‘Mix with the Masters’. Available online: https://
mixwiththemasters.com (accessed 27 August 2019).
Moylan, W. (2002), The Art of Recording: Understanding and Crafting the Mix, 2nd edn,
Oxford: Focal Press.
Owsinski, B. (2017), The Mixing Engineer’s Handbook, 4th edn, Burbank, CA: BOMG
Publishing.
Schaeffer, P. (2012), In Search of a Concrete Music, translated by C. North and J. Dack,
Berkeley: University of California Press.
Théberge, P. (2012), ‘The End of the World as We Know It: The Changing Role of the Studio
in the Age of the Internet’, in S. Frith and S. Zagorski-Thomas (eds), The Art of Record
Production: An Introductory Reader for a New Academic Field, 77–91, Farnham: Ashgate.
Zagorski-Thomas, S. (2014), The Musicology of Record Production, Cambridge: Cambridge
University Press.
Part VIII
Distribution

The production processes of recorded music are virtually meaningless without the final
stage of that process – the delivery of that product to an audience. The history of music
production has involved an interactive tug of war between models that have been based
on selling a product and those based on a service. In various places in the book we have
mentioned the various formats that have jostled for position as the product to be sold.
However, there has always been a parallel stream of service-based choices that have worked
in collaboration and competition – from the public presentation of recorded music as a
kind of ‘fairground attraction’ in the early years to the latest developments in streaming
curation. They have also, of course, included radio, jukeboxes, discos and clubs, but there
is also the more subliminal world of background music in shops and restaurants. Different
legal systems around the world provide for different levels of artist remuneration in these
service models. And there has been a complex and shifting balance in the symbiotic
relationship between live music and recordings with each acting as promotional material
for the other and their relative pricing shifting dramatically over the years. Whereas in the
1970s a loss-making tour might often be subsidized by a record company to promote a
new album, we have often seen a complete switch in this economic balance in recent years.
One of the primary drivers of distribution in recorded music has been innovation in the
technologies of data storage. As we have stated earlier, recorded music is a representational
system where some set of data that embodies the instructions for creating a sound wave
through a transducer is stored so that the process can occur when the user so desires. This
was as true of the etchings into wax cylinders as it is of the 0s and 1s of digital audio files – or
rather their electrical or magnetic counterparts. Just as each of these systems was reliant on
the development of parallel and prerequisite systems in other sectors and industries, they
also afford different modes of distribution and the technologies and marketing systems
that facilitate them. And the advantages and disadvantages of each of the various systems
encourage or prevent a whole range of parallel or subsequent synchronization strategies.
Indeed, recorded music is often a subsidiary ‘sync product’ to other brands – such as films
or film stars, television programmes, sporting events, adverts or even political movements.
And these technologies have two sets of characteristics that make a difference to the
quality of the experience of ‘consumption’: how they create a sound wave through a
transducer and how they act as a storage medium. At first glance we might consider that
both of these functions can be judged through fairly empirical parameters. In the first
instance, there has been a long battle throughout the twentieth century to improve the
frequency response and the dynamic response of both the recording and the reproduction
processes and, at the same time, to ensure that the minimum of additional noise is added
to the signal. These can be thought of as the primary indicators of ‘good quality’ in both
recording and reproduction, but there is the obvious subjective element in that there may be
some trade-off between the quality of these characteristics. Choosing which characteristic
is more important can be highly subjective – as can be seen in those consumers who prefer
the way that analogue vinyl produces a sound wave to the way that digital formats do,
despite the fact that the vinyl system introduces surface noise that digital formats do not.
In the second instance, how they act as a storage medium, there is a similar but much larger
range of relatively empirical features that often require a subjective decision in terms of
trade-off. These include price, maximum capacity for data, size and shape of the storage
medium, the extent to which the medium is tied to the translation device (e.g. a record or
a CD is not tied to a hi-fi while an internal hard drive is much more tied to a PC) and the
extent to which the data are tied to the storage medium (e.g. very much so on a vinyl record
but much less so on an audio cassette).
All of these technical factors have been important in determining the success and
popularity of the various data storage formats, but there are also three other important
considerations:
● the types of income stream that these various technical formats afford and how
those income streams (as well as the costs of production) are likely to be split
between the various participants in the production process;
● how the users respond to the technical format. Many other socioeconomic and
cultural factors influence the ways in which users respond to technology and these
factors can stimulate different sectors, geographic and social, to value, for example,
aspects of convenience over price or audio quality;
● the ‘packaging’, either physical or virtual, has had an historically important
influence on patterns of consumption. On the one hand, the richness of album
cover art and information is presumably one of the factors in the resurgence of vinyl
sales but, on the other hand, the richness and accessibility of ‘unpackaged’ material
on the internet (music videos, lyrics, photos, gossip, social media, Wikipedia, etc.)
are driving other audiences in that direction.

And in addition to the question of whether recorded music is consumed through a product
or a service model, there is also the question of how active or passive an experience we
want that consumption to be. In the last chapter of the volume, McLaren and Burns
examine a release format, which is based on making the experience of the ‘package’ into
a highly proactive and thought-provoking process. By turning the story of a concept
album into a sort of multimedia, multi-artform, detective story, the audience is drawn
towards a highly active form of consumption. At the same time both Toulson and Katz
discuss passive and more active recorded music distribution systems: from interactive
music apps to karaoke. There are a whole variety of potential ‘products’ that fall under the
umbrella of recorded music, which range in character from an artist (or group) placing
you (as audience) in a specific relationship with the music through to more interactive
and participatory approaches where your engagement is a powerful determinant of the
structure of the musical experience.
And as we have also discussed elsewhere in the book, the process of production is
moving more and more towards becoming a consumer activity – a hobbyist and semi-
professional activity as well as a professional and industrial one. Are the makers of music
becoming consumers again – rather like the participatory consumer activities of late
nineteenth-century music when owning a piano and buying sheet music allowed you to
‘do it yourself’? Are GarageBand, Fruity Loops and even Ableton, Logic and Cubase simply
doing the same?
23
Producer Compensation in the Digital Age
Richard James Burgess

Introduction
In 2008, I published a paper entitled ‘Producer Compensation: Challenges and Options in
the New Music Business’. This was a time when the music business was in a transitional
phase moving from physical to virtual distribution. The industry was still experiencing the
decline in revenues that had begun in 1999 with the launch of Napster and the widespread
availability of free music online. Record producers were hit hard because of declining
budgets for recording and the fact that the required skill sets, work environments, sources
and types of work, and the ways and means of remuneration were in a state of flux. Some
factors included drastically reduced revenues because of piracy and the need to compete
with free recorded music as well as the technological advances in digital recording
equipment, which opened up production capabilities to many more people and created
competition from much cheaper recordings. At the time of writing, the transition from
physical to digital delivery continues, but streaming is now the dominant consumption
model. Vinyl continues to grow modestly, while CDs and downloads are in sharp decline.
The challenges are narrowing, and the future is coming into sharper focus.
The good news is that an exponential increase in the uptake of streaming subscriptions
has produced three straight years of double-digit revenue growth for the recorded music
industry and this is beginning to be reflected in budgets for some productions and
producers. Nevertheless, the business of music production has been forever changed by
the digital revolution, and we are now seeing the monetization of parts of a production as
well as its whole. This chapter’s purpose is to document, in the most general terms, how
and to what degree producers are compensated in today’s music business environment.
Prior to the digital disruption


Before 1999 the primary sources of revenue for producers were an advance against a
royalty on physical goods sold and any further royalties due should the sales revenues
exceed the advance paid to the producer.1 In the 1990s, CDs dominated the market and
vinyl retreated to a single digit percentage of sales. Most producer contracts would provide
for ancillary income from licences of the produced work. Compared to now, this was an
uncomplicated system and, with variations, that system had been in place since at least
the 1960s for producers. For artists signed to major or independent labels, the producer’s
advance was usually more than sufficient to pay for a music industry lawyer to draw up a
contract. Producers working with self-funded and self-releasing artists would often accept
a fee with no back-end. However, the chances of an artist having a self-released hit, in those
days, were slim.

Post-digital disruption
So, what has changed in the recording industry today that affects how producers get paid?
The democratization of the creation process caused by the advent of relatively inexpensive
production software has caused an explosion in the number of productions. Distribution
has likewise been made available to all by distribution or aggregation companies such as
CDBaby, Tunecore and many more. This has resulted in more than 600,000 track uploads
per month to digital services. Many of these productions are self-produced and self-
released by the artist. Some self-releasing artists are using outside producers and, when
there is no formal label structure or equivalent in place, those producers are often working
for a fee with no back-end royalty. Oftentimes, that fee has to cover engineering the record,
and providing the recording equipment and the studio space. Even if the producer has a
contractual provision for back-end royalties, tracking and claiming them can be difficult
if there is no label or, at least, no formal accounting or royalties’ system on the artist’s side.

Self-releasing artists
Reports from producer managers indicate that as much as 50 per cent of production work
is being done for artists that have no record deal and that most of these self-funded albums
are also being self-released by the artist.2 Very few of these productions pay a royalty in the
traditional sense. Generally, producers must negotiate a share of the net revenues payable
to the artist and hope that the artist has an accountant or an accounting system that will
accurately report and pay out the producer’s share. Sales are often so small that neither
royalties nor a share of net generate significant additional revenues for the producer.
Because self-releasing artists often lack the royalty payment or accounting infrastructure of
a label, reports indicate that the producer or producer manager often has to chase royalty
payments more than they do with label-released work. Producers of artist-funded and
artist-released work are often paid modest sums ranging from US$5,000 to US$15,000
(although successful self-releasing artists are paying more). The upside for the producers of
these albums is that they are recorded quickly because self-funded artists can rarely afford
to spend weeks in a studio.

Who gets paid and what do they get paid?


In contrast to the artist-released productions described above, on label-released
productions, the producer will often receive an advance against a royalty of 3–4 per cent
of PPD (published price to dealer/wholesale price). Of course, the higher percentages are
reserved for in-demand producers with a track record of success. Some older contracts
offer similar percentages of manufacturers’ suggested retail price (MSRP), and since PPD
is significantly less than MSRP it is essential to understand on which basis your royalties
are being calculated. Increasingly common are net profits deals. Anything above US$40,000
is currently a good producer advance for an album production, and track production is
usually paid on an approximately pro rata basis. Worth noting here is that it is important to
negotiate a pro rata share of master use licences and a share of SoundExchange (US), PPL
(UK) or other neighbouring rights monies (depending on which country the artist is based
in).3 SoundExchange requires a Letter of Direction (LOD) from the artist in order to pay
the producer, and this should be one of the requirements in the producer’s contract with
the artist because it can be difficult to get after the fact. Since these performance royalties
are now a material source of income it is essential that the producer’s lawyers, managers
or the producers themselves ensure that the artist’s lawyer or manager gets the artist to
sign whatever LOD or instructions the specific collection agency requires. Reportedly, more
producers today are getting their share of neighbouring rights revenues than ever before.
Terms can still be hard to negotiate for audio/visual use of produced masters. Some artists’
lawyers fight to hold on to 100 per cent share of sync/master use licence revenue. However,
producers today cannot and should not depend on any one of the many revenue streams
(and especially not sales) as their sole source of revenue. It has become essential to aggregate
the various revenue streams. It is worth noting that neighbouring rights royalties vary widely
from country to country and from one neighbouring rights collective to another. Producer
access to these monies will be tied to the collectives that the artist belongs to.
Entrepreneurial producers – those who discover new artists, develop them and either
sell produced masters to a label or release them on their own label – have a much wider
range of compensation possibilities. The project could fail to catch anyone’s imagination
and the entire investment will be lost or, at the other end of the scale, it could be a huge
hit and the producer can receive a significant payout from their share of revenues.
Executive producers can sometimes be paid a fee and, occasionally, the equivalent of a
point (1 per cent of PPD), but most do not get paid. They often work for the record label
or management company, or may provide the financial backing and/or creative support for
the project in some other capacity. The credit is usually more of an acknowledgment of a
key (typically non-creative) role.
Associate producers may get paid a fee and very occasionally a royalty. Often artists
request that the associate producer receive credit to reflect their creative input to the production.
Otherwise the associate producer credit may go to an engineer or a key musician who
would be paid for performing that role. Sometimes artists will request a joint producer
credit, such as ‘Produced by “Artist name”, “Producer name”’; these can be vanity credits or
revenue earning.
Engineers and beatmakers are usually paid a fee with no royalty. Vocal and other
speciality producers typically command a fee. Mixers can earn both a fee and the equivalent
of a point or two (1–2 per cent of PPD) but mixers are increasingly working for a fee
with no point. Mixer fees seem to have fallen significantly over the past couple of decades.
Remixers get a fee and sometimes a royalty with advances depending on their stature and
the nature of the project. As with any of these categories, anyone considered ‘hot’ or with
current hits can command much higher fees, advances or royalties.
Writer producers are paid a fee, get a royalty for the production and a royalty for their
share of the song. Almost all pop and urban music is handled by writer producers. While
this may seem like a new trend it can be traced back to Motown in the 1960s, the Brill
Building in the 1950s and beyond. Writer producers are often entrepreneurial as well, in that
they discover new talent that they write and produce the material for before placing them
with a label or releasing them on their own label. It is not uncommon for a multitrack file to
be sent around the world to be worked on by multiple writers and producers, which creates
the ‘Franken-writer’ syndrome of many credited writers and producers on a single track.

A move away from royalties?


A significant shift in the past two decades, because of the industry’s financial struggles,
is that there are experienced producers with substantial, recent and even award-winning
track records that are no longer asking for points. The emphasis for them now is on
payment for their work and a good credit. Sadly, although there is some recent progress,
we have still not solved the problem of the lack of credits on digital services. Since credits
represent the opportunity for future work, this can be quite damaging to the careers of
creative individuals in our business and to labels that depend on having a branded identity.
Since the early 2000s it has been harder for mid-level producers to make a living than
previously. Now, it seems that producers represented by substantial producer managers
are making decent livings from producing full-time, but they often need to own their own
studio, or at least be capable of engineering and mixing in addition to producing. For
aspiring producers, the digital revolution, by making available high-quality, inexpensive
recording and distribution technology, has lowered the barriers to entry but has increased
competition in the marketplace and most cannot make a living from producing.
When is the money paid?


Generally, producers get paid half of their advance or fee upfront and half at the end. Some
independents ask to split it up in thirds: 33 per cent on signature, 33 per cent at the start of
mixing and 34 per cent on delivery of masters and stems (more of which later). Royalties
or splits of net profits are paid once the recording has recouped all costs. Producers can be
paid back to record one, meaning that the producer does not have to contribute part of his
or her share towards the costs of the recording but the producer advance will have to be
recouped before royalties are paid. This can mean that a producer has, theoretically, earned
royalties but can’t unlock them unless or until the record recoups.
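The interaction between advance, recoupment and a ‘back to record one’ clause is easier to see with numbers. The following worked example uses invented round figures and one plausible reading of such a clause; actual deals vary widely.

```python
# Invented, illustrative figures only - not quoted industry rates.
ppd = 10.00                 # published price to dealer per unit (US$)
producer_rate = 0.03        # producer royalty: 3 per cent of PPD
producer_advance = 30_000   # advance already paid to the producer (US$)
recoupable_costs = 120_000  # recording costs and advances charged to the project

def producer_royalty_due(units_sold, all_in_rate=0.18):
    """Royalty payable to the producer under a 'back to record one' clause.

    Nothing is payable until the project has recouped its costs at the
    assumed all-in royalty rate; after that, the producer is credited on
    every unit from the first, less the advance already received.
    """
    project_earnings = units_sold * ppd * all_in_rate
    if project_earnings < recoupable_costs:
        return 0.0                                # project not yet recouped
    earned = units_sold * ppd * producer_rate     # counted from record one
    return max(earned - producer_advance, 0.0)    # advance recouped first

for units in (20_000, 80_000, 250_000):
    print(f"{units:>7,} units sold -> further royalty due: "
          f"US${producer_royalty_due(units):,.2f}")
```

On these assumptions, at 20,000 units nothing is payable because the project has not recouped; at 80,000 units the project has recouped but the producer's earned royalties still sit inside the advance; only at 250,000 units does additional money flow.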

All-inclusive fund or separate recording budget?
There has long been production work for which producers are paid an all-inclusive fund out
of which the producers pay themselves and the costs of the recording. Producer managers
generally express preference for projects where the producer is paid from one fund and the
artist or label pays the recording costs from a separate fund. There are advantages to this more traditional, two-bucket system. The producers know how much they will make for
the project, how much they have available to spend on the recording and they never need
to decide whether to spend more on the production as opposed to putting the extra money
into their own pockets.
All-in fund projects usually work best when the producer supplies the studio/equipment/
recording location because it reduces the pressure of daily studio charges. Having said that,
producers that have their own recording situations are invariably subsidizing production
expenses by absorbing the capital costs of the equipment, maintenance costs and the cost
of the space (even if it is a home studio there are still costs to the producer). Unless they
are explicitly factored in to the recording budget and paid by the label or artist these are
negative externalities that are borne by the producer.

Additional tasks to be aware of


Besides producing and recording the tracks, producers are commonly expected to perform
other functions that must be factored in to the production budget if they want to make
a reasonable profit. It is easy for new producers to overlook these additional tasks when
budgeting and the extra time required can be significant.
In certain genres, it is common for producers to be required to deliver final mixes. For
many producers, this is a preference but mixing takes time, which must be factored in
to the overall budget. Remixes still tend to be the domain of specialists. Vocal up/vocal
down/no lead vox and instrumental versions are always needed. Stems are a relatively new
addition and producers need to add a line item for them because they can add significant
time at the end of the production. Most stems are requested after the mixes are delivered,
so it is essential to be prepared to do them to avoid interruptions when you are in the
middle of your next project. Unless the producer explicitly builds the cost of creating stems
into the agreed budget, labels are unlikely to pay separately (especially after the project is
completed) for supplying these additional elements. They will be regarded as being part
of the production fee and are most likely required in the delivery/acceptance terms of the
contract.

Differences between major and independent label payouts
There is a difference in compensation levels between major and independent labels. Major
labels are likely to pay a higher advance against the producer royalty and budgets are
often bigger, but the independents tend to have a better success rate than the majors. The
majors look for blockbuster hits and often abandon moderately successful projects that the
independents manage so well. Many independent labels have fifty-fifty split profits deals
with their artists rather than a royalty against PPD or MSRP. It is often possible to negotiate
a share of the band’s net profits income and if the album is successful producers will make
more from an independent label on that basis. Major labels do not usually pay artists a
50 per cent share of net profits. Net profits is one of those tricky terms that needs to be
defined in the producer contract. The producer’s royalties will be calculated on the same
basis as the artist’s, so it is helpful to know what the label is netting out from the gross.
There is a concept known as ‘Hollywood net’, which results in the artist and producer never
getting paid because all things imaginable (and some unimaginable) are expensed against
the gross to produce a net of zero or worse.

Challenges in getting paid


There is another hurdle that most producers and their managers must jump and that is
getting paid any back-end monies. This can be the royalty or share of the net or simply
the final payment of the advance/fee upon completion of the project. The completion or
acceptance language can be a harsh lesson for neophyte producers. Contracts usually have
some language that defines the label’s acceptance of the project and triggers the final portion
of the advance/fee (often as much as 50 per cent of the total payment due). The delivery
language usually includes delivery of fully mastered tracks, all credits, delivery of stems, vox
up/down, instrumentals, backing tracks and the digital audio workstation files themselves
(in defined formats). For experienced producers, these are either part of their workflow or
they negotiate them out of their contracts. Less experienced producers may not be aware of
these requirements until they attempt and fail to get their final payment for the production.
Some companies will insist on delivery of ‘commercially acceptable masters’. Clearly, until a
track or album is a hit or has earned significant revenue, the term ‘commercially acceptable’
is subjective. ‘Technically acceptable’ is safer language if the producer wants to increase his
or her chances of getting paid in full upon completion. There may still be some subjectivity
in a technical standard but much less so than a commercial one.
Then there is the problem of getting paid the royalty or share of net that you negotiated.
For the most part, majors and independent labels all account when they are supposed to
but chasing a self-funded band for a royalty is rarely going to be easy. They often lack the
technology, know-how and staff to make the calculations and to report accurately. In the
case of self-releasing artists, it is better to agree a share of net profits, which requires less
expertise and work for them to calculate.

Contracts
As we have seen, back-end income relies upon having a solid, enforceable contract in
place that, today, must embrace the multiple sources of revenue that accrue to the artist.
Some of these sources of revenue are understood but not proven to be profitable and some
are currently unknown. To maximize the chances of making a living and to ensure a fair
share of the success that may be generated by their work, producers need to be aware of
the newer business models that are in place and anticipate those that are still forming.
The producer contract of today needs to provide for a fair share of all revenues that are
associated with the sale or use of the recordings. These revenue sources include but may
not be limited to:
● physical sales, which, while diminishing, can still be important. These include CDs,
vinyl and any other physical format that may become popular such as the return of
cassettes
● virtual sales – downloads
● streaming audio – both interactive and non-interactive
● free models, name your own price or others that accrue income in some indirect
way such as advertising, a lump sum payout from sponsorships, settlements, digital
breakage or future equity deals (per the equity payout from the Spotify public
listing)
● blank media levies
● sales through the artist’s own label
● publisher and writer royalties if the producer wrote or published any part of the
produced songs
● other sources such as monies raised by the producer from investors, label deals,
publishing deals, etc.
● stems are now being monetized and producers should consider their right to a share
of that revenue
● samples can generate significant income and since they include the producer’s work
the producer should consider how to participate in that potential revenue stream.

Physical sales
Physical sales can be dealt with much as they always have been by a contractual royalty
based on either retail or wholesale price. What this model usually does not consider is
what happens when there is no third-party record label. Even when there is a major or
independent label involved it behoves the producer to ensure that they are entitled to a fair
share of profits if the rights ever revert to the artist. The established producer/mixer royalty
model of somewhere between 1 and 4 or even 5 per cent of suggested retail price is based upon
the major-label artist royalty model but no longer remains fair to the producer when the
artist is releasing via their own label. Instead of realizing a gross royalty of 10–20 per cent
of retail (approximately one to two dollars of physical sales), the artist could generate five to
ten times that in gross revenue. Even after manufacturing costs and publishing royalties, a
self-releasing artist can expect to earn many times what they could make as a royalty from a
major label. Even superstar artists often sell less on their own label, but the business model
works because the profit per-unit is greatly increased. Producers need to make sure that new
contracts accommodate such eventualities and entitle them to a fair share of profits from
any source that involves the use of their work even if the terms of the artist’s deal change.
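A rough comparison with invented round numbers illustrates why the old percentage-of-retail model breaks down when the artist is the label. The figures below are assumptions chosen only to show the order of magnitude, and they ignore distribution and retail margins by supposing direct-to-fan sales at the retail price.

```python
# Invented round numbers for illustration only.
retail_price = 10.00        # US$ per unit
artist_royalty_rate = 0.15  # typical major-label artist royalty (~10-20% of retail)
manufacturing = 1.50        # per-unit pressing and packaging (assumption)
mechanicals = 0.91          # per-unit publishing royalties (assumption)

# Signed to a label: the artist sees only the royalty per unit.
label_deal_per_unit = retail_price * artist_royalty_rate

# Self-released, sold direct at retail: gross receipts less direct costs.
self_release_per_unit = retail_price - manufacturing - mechanicals

print(f"label deal:    ${label_deal_per_unit:.2f} per unit")
print(f"self-released: ${self_release_per_unit:.2f} per unit "
      f"({self_release_per_unit / label_deal_per_unit:.1f}x)")
```

On these assumptions the self-releasing artist keeps roughly five times as much per unit, which is why a producer royalty pegged to a notional retail price rather than to the artist's actual receipts can leave the producer badly short-changed.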

Downloads
Downloads are fading fast but as long as they exist they have to be accounted for. The
industry standard for downloads, by default, became the iTunes US$0.99 per track US$9.99
per album model. Now there are other download models such as Amazon (full albums can
be priced at US$4.99, US$6.99, etc.) and subscription services such as eMusic where the
per-download rate can be much lower per-track (such as US$0.30 or possibly less). These
variations indicate that the producer royalty might be best expressed as a percentage of
revenue due to the artist including any additional income received where the artist is also
the label and is receiving the label share.
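One way to make the suggestion above concrete is to peg the producer's share to whatever the artist actually receives per download, so that it scales automatically across stores with very different rates. The percentages and payouts below are invented for illustration.

```python
# Invented figures for illustration only.
producer_share = 0.20  # e.g. 20 per cent of the revenue due to the artist

artist_receipt_per_download = {   # assumed artist receipts per download (US$)
    "premium store track":  0.70,
    "budget album track":   0.35,
    "subscription service": 0.21,
}

for store, receipt in artist_receipt_per_download.items():
    print(f"{store:>20}: producer earns ${receipt * producer_share:.3f} per download")
```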

Streaming audio
Streaming audio is divided into two categories: interactive (on-demand) and non-
interactive (radio style).
Interactive streams
There is currently no statutory rate set for interactive streams in the United States or other
territories. Interactive or on-demand streaming rates are negotiated directly between
the label/distributor/aggregator and the Digital Service Provider (such as Apple Music,
Amazon, Spotify, etc).4 This reinforces the need for the producer to ensure that their
contract covers all income related to the recordings produced so that they can access this
revenue as negotiated by the label or the artist.

Non-interactive streams
Non-interactive streams are statutory payments governed by the Digital Millennium
Copyright Act 1998, or DMCA, in the United States and by similar legislation in other
countries (Library of Congress 1998). The 50 per cent of DMCA royalties that is directly payable to artists, musicians and singers is not subject to recoupment or cross-collateralization by the label but is paid directly through SoundExchange to the artists. Musicians and singers receive theirs through the unions. All that labels and featured or non-featured artists need to do is sign up with SoundExchange to receive this income from
non-interactive digital streaming services. Unfortunately, there was no provision in the
DMCA for producers. The only way that producers can get their entitlement from this
digital streaming income is to get an LOD from the artist instructing SoundExchange to
pay the producer directly from the artist’s share. In the United Kingdom, PPL distributes
these monies to producers, but there are restrictions on which producers can qualify for
this revenue (eligible producers).5 Other territories also pay producers for these uses but
the percentages and the terms and conditions vary from one collection agency to another.
Some labels strike direct deals with the non-interactive streaming services thus bypassing
SoundExchange. Producer contracts should address any such direct-deal revenue also. As
with most other terms the producer contract will reflect the terms of the artist contract.

Neighbouring rights
Neighbouring rights is an unnecessarily confusing term, but it refers to the right to publicly
perform, or broadcast, a sound recording. Outside the United States, collection agencies
collect from a bigger pool of users of music than SoundExchange can in the United States.
For example, PPL in the United Kingdom collects not just from digital radio but also from
terrestrial radio, television stations, clubs, shops, pubs, restaurants, bars and grills, and
thousands of other music users who play sound recordings in public. Currently the only
type of public performance income that exists in the United States for artists and performers
that producers can get a share of is non-interactive digital streaming income (internet and
satellite radio) and only then by obtaining the LOD. Music publishers and writers in the
United States have, for many decades, had legislated access to similar performance monies
from the use of compositions. Considering the growth of performance income for featured
and non-featured artists and producers in the rest of the world, the fact that this money
is not collected and distributed represents the loss of a lucrative source of income for
US producers, artists, musicians and labels. The loss includes international performance
monies. International collection agencies will not pay US-based performers and producers
because the United States cannot reciprocate. Even outside of the United States, producers
have not always had easy access to performance royalties. In the United Kingdom, to qualify
for this income, the producer must be deemed a ‘performing producer’. This means that
they either contributed an audible performance (such as playing an instrument or singing);
or they conducted or musically directed another performer’s live performance as it was
being recorded. Edits or remixes do not qualify unless the new versions involve new audible
performances. The performing producer categorization is an imperfect compromise but
one that was hard-fought and won by the Music Producers Guild (MPG) in the United
Kingdom. There are various related but different royalties available to producers in other
countries from performing rights organizations such as GVL in Germany and GRAMO in
Norway. Each society has its own qualifying parameters and calculations for the payouts.

The terrestrial right in the United States


The musicFIRST coalition (A2IM, AFM, NARAS, RIAA, SAG-AFTRA) is currently
lobbying to remove the exemption from US copyright law that allows terrestrial (including
HD) radio to play music without compensation to the labels, featured and non-featured
artists, and performers. Should the musicFIRST campaign succeed and a terrestrial right be
established in the United States, producers may have the same problem as with the DMCA
in needing an LOD to be able to claim their share of this revenue. An LOD can be difficult
to obtain in retrospect so it is essential that producers get one drawn up with their initial
contract. The Allocation for Music Producers (AMP) Act 2018 was recently passed by
Congress as part of the MMA (Music Modernization Act) and this helps legacy producers
who cannot get an LOD from the artists they produced. Producers need to ensure that they
have a contract with the artist and that the contract is specific as to how the producer’s share
of performance royalties is to be paid to them. Even when the terrestrial radio exemption
is removed it may well be that access to those performance royalties will require an LOD
from the artist to instruct SoundExchange (or whichever performing rights organization
distributes those monies) to pay the producer.

Free and other business models


Over the past twenty years there has been much talk about ‘free’ as a business model. Of
course, this is a well-tested system with broadcast radio, television, and some magazines
and papers being made available free to the consumer courtesy of third-party advertising

revenues. An artist or label can strike a co-promotional deal with a third party such as, say,
the British Daily Mail giveaway of Planet Earth, a Prince album, in 2007. There are also the
name your own price deals (e.g. Radiohead’s In Rainbows album in 2007). These are some
of the many creative business models that developed in response to the digital disruption
and rampant digital piracy. It is not always immediately obvious how artists are being paid
from these types of models, but producers need to tie their royalty income to the revenue
received by the artist from any source triggered by the produced recording and, where
applicable, get an LOD for payment at source.

Blank media levies


Blank media levies are collected in many countries and they vary as to which media and
equipment they are collected from. These monies are usually paid out to the artist on a
pro rata basis. This is a source of income that may disappear if digital delivery eliminates
physical formats but, for the time being, producers should make sure that the income from
blank media levies is covered by language in their contract.

Contracts
It is impossible when discussing producer compensation to avoid referring to the contract.
In the past eighteen years, it would seem that a higher percentage of productions are being
done without contracts. Prior to Napster, most producers had agreements prepared by
music industry lawyers. They were expensive but, for the most part, effective. When the
recorded music industry collapsed to less than half its peak revenues, budgets for producers
and productions fell to levels that made paying for producer contracts uneconomical in
certain circumstances. Even if an album advance would be sufficient to pay a lawyer, a
single-track advance might not be and a single-track contract is as complex as a full album
contract. Producer managers I have spoken to say that most producers today work under
a producer contract. Contracts for bigger budget productions are still prepared by high-
end music business lawyers who have a good grasp of current thinking. Those contracts
offer excellent protections. By contrast, contracts for most independent and self-funded
projects are often prepared by the producer’s manager because the advances are so low that
they can’t afford or justify a lawyer (some producer managers are also lawyers). A good
producer manager understands the necessary parameters and potential pitfalls as well as
any lawyer. Producer contracts are all similar in their terms and conditions and they almost
invariably accept the artist’s terms (per their contracts with the labels) as the basis for the
producer’s terms. Producers with such a contract should be fine. There is a risk when a
producer works without a contract or with a ‘home-made’ contract with an unsigned artist
that the recording will be very successful and the producer will not share financially in that
success. Massive hits don’t come along every day and you want to make sure that, should
you produce one, you will be paid your fair share of royalties.

Producer managers
I have referred to producer managers a good deal in this chapter, so some discussion of
their merits is worthwhile. Producing tends to be an all-consuming process. Today there
is little room for the managerial producers who are in the producer booth on the phone,
calling in from the tennis courts, or texting and emailing while at the console. Given the
focus and hours required to be a great, artist-centric producer, having a manager look after
the business aspects of your career makes sense. They are in discussions regarding the next
projects while you are buried in your acoustically perfect cave. Good producer managers
are constantly in conversation with labels regarding new signings and opportunities, and
they have an eye to your overarching career trajectory. They know the going rates and they
understand which occasional sideways steps can benefit your career and which ones can
cause you harm. That said, good producer managers are hard to come by. To persuade a
good producer manager to represent you, it is usually necessary to bootstrap your career
to a level where your value is obvious. At that point, the tendency is to think you can go
it alone, but this is precisely when a good manager can add their expertise, contacts and
wisdom to lift your demonstrated value to new heights.

Organizations
There are organizations that work assiduously to improve the producer’s lot. In the United
Kingdom there is the MPG and in the United States there is the Producers and Engineers Wing
(P&E Wing) of the Recording Academy. There are equivalent organizations in many other
countries as well. I recommend joining your local organization for networking opportunities,
educational purposes and to contribute to the overall well-being of the profession.

Checklist
In 2008, when I had the honour of being asked to join the P&E Wing’s steering committee,
Maureen Droney (Managing Director of the P&E Wing) and I, along with other members
of the committee, put together a list of items for inclusion, as applicable, in producer and
engineer contracts. Not all these items apply to every project, but the list might be useful as
a memory jogger when putting together a producer agreement.

Upfront monies:
● advance against production royalties
● recording engineering fee
● mixing fee
● mastering fee

● studio rental fee


● equipment rental fees
● development fee where applicable (monetize your time invested)
● percentage of monies raised by producer from investors in the project, label deals,
publishing deals, etc.

Back-end monies:
● royalty against sales of any type of physical goods
● royalty against sales of any type of downloads
● royalty against interactive streams
● LOD from the artist to the relevant collection agency for non-interactive streams
and other neighbouring rights revenue
● percentage of ringtones/ringbacks, etc.
● writers’ royalties where applicable
● publishing royalties where applicable
● percentage of Blank Media and Hardware levies where applicable
● percentage of Black Box monies where applicable
● percentage of equity payouts tied to the use of the produced masters
● percentage of digital breakage6
● percentage of income from any advertising monies that accrue to the artist/label
that are tied to the use of the recording
● increased percentage if sales are through the artist’s own label
● in the case of 360/free giveaway/loss leader deals (à la Prince–Daily Mail) or name
your own price (à la Radiohead or Nine Inch Nails), a percentage of the other streams
of income stimulated by the freebie, freemium, promotional or low-priced sales
● percentage of any other type of income that is derived from the exploitation of the
sound recordings
● bumps with sales hurdles or chart action
● percentage of buyouts.

General guidelines:
● ensure that the artist signs the applicable form(s) directing the collection agency/
agencies that will be collecting the sound-recording performance royalties
(including those generated by electronic, terrestrial, satellite and any other means of
use or distribution) to pay the producer their contracted share of the revenue
● make sure that the contract covers all formats ‘now known and hereinafter devised’.

Alternative payment systems:
● all-in fee (be careful that the recording costs don’t exceed the fee)
● hourly fee/day rate/project rate.

Other considerations that should be factored into the overall payment:
● the state of previously recorded material and how much work is needed in
preparation for you to work on it
● production coordination (booking musicians, studios, equipment, etc.)
● documentation of metadata (credits, lyrics, etc.)
● backup and archiving
● materials and listening copies
● increased percentage or separate label deal if sales are through the producer’s own
label.

Conclusion
There is well-founded optimism about the future of the recorded music industry. The records
that we love and value are the product of their producers persuading the many pieces into
one coherent whole: the artist, the musicians, engineers, writers, A&R people, the budget –
all who touch, and everything that affects, the project. These pieces are the orchestra and the
producer is the conductor willing that orchestra to harmonize and synchronize. Whether
there is a named producer on a project or not, someone is performing the functions of a
music producer. The music producer not only conducts the creatives but also conforms the
music, technology, people and business into coherence. Music producers perform many
roles: CEO of the project, creative director, technical director, connector, programmer,
curator, editor, song doctor, arranger, therapist, coach, teacher and cheerleader to will the
recording out of the spirit world into the material. Our business is far more complicated
than it was twenty years ago, but opportunities abound. We are finally in growth mode
again. It is critical that we protect the creative imperative and ensure that our musicians,
artists and producers, who create the music that enriches our lives, receive a fair share of
the rewards they generate. If we value recorded music, we need to return sufficient value to
the creators so that they continue to create.

Notes
1. Royalties are payments made for the use of an intellectual property right. The royalty
usually takes the form of a percentage of the revenue or profit that the other party (in this
case, the record label) makes.
2. From author interviews with several top producer managers including Bennet Kaufman
(Patriot Management), Sandy Roberton (Worlds End Producer Management) and others
who wish to remain anonymous.

3. A master use licence is needed when a recording is used for synchronization with visual
materials for a commercial, movie, TV show, etc. Compilations and other third-party
uses of the recording also require a master use licence, but these are becoming less
common in the streaming environment. Samples also require a licence.
4. Non-interactive streams are analogous to radio programmes where the consumer has no
control over which track is played. Interactive or on-demand streaming services allow
users to choose which track they want to listen to next.
5. A performing producer who made a musical contribution to the recording qualifies for
remuneration. Where a studio producer does not make an audible contribution (such
as vocals or instrumentals) they are eligible for payment as an Eligible Studio Producer
if they conduct (or provide a similar musical direction to) another performer’s live
performance as it is being recorded.
6. Digital breakage refers to the difference between monies earned for attributable streams
and minimum guarantees negotiated with the digital service.

Bibliography
Burgess, R. J. (2008), ‘Producer Compensation: Challenges and Options in the New Music
Business’, Art of Record Production Journal, 3. Available online: http://www.arpjournal.
com/asarpwp/producer-compensation-challenges-and-options-in-the-new-music-
business/ (accessed 22 August 2019).
Library of Congress (1998), ‘The Digital Millennium Copyright Act of 1998 US Copyright
Office Summary’, December 1998. Available online: http://www.copyright.gov/legislation/
dmca.pdf (accessed 9 August 2019).

Discography
Prince (2007), [CD] Planet Earth, NPG Records/Columbia.
Radiohead (2007), [CD] In Rainbows, XL Recordings.
24
Evolving Technologies of Music
Distribution: Consumer Music
Formats – Past, Present and
Future
Rob Toulson

Introduction
The development of music recording and playback technologies in the early twentieth
century enabled recorded music to be sold directly to the consumer for home listening. As
commercial music formats have evolved, consumers have engaged with ever-changing systems for purchasing and listening to recorded music, each bringing different properties with respect to sound quality, accessibility, portability and price. Since 1973, the Recording Industry Association of America (RIAA) has published annual sales and revenue data for each commercial music format, highlighting which formats are emerging and which appear to be becoming obsolete (RIAA 2019a). Figure 24.1, for example, shows US unit sales of music albums for the period 1973–2018, illustrating trends in preferred formats and the decline of overall album sales towards the present day.
What emerges from the analysis of past and present commercial music formats
is a critical relationship between the technical audio quality permitted by each format,
and the consumer-focused features and convenience enabled by each format, which rarely
correlate directly. For example, the compact cassette, which rose to prominence in the late 1970s, brought a
significant reduction in audio quality in comparison to the established vinyl format, yet its
compact size, low-power playback requirement and the ability to facilitate home recording
enabled it to be sustained as a successful music format for many years.
Whilst Figure 24.1 shows valuable data for album sales, it doesn’t in itself give a full
picture of the current state of recorded music revenue, given the emergence of music
streaming services and the trend for digital music downloads to be weighted significantly
towards sale of individual tracks. The revenue of recorded music from the consumer

Figure 24.1  US music album sales from 1973 to 2018 (millions of units).

Figure 24.2  US music sales revenue from 1996 to 2018 (millions of dollars).

therefore describes a subtly different picture (Figure 24.2), showing that revenues are
currently experiencing marginal growth, predominantly owing to advert-based and
subscription streaming services.
Data related to commercial music sales also allow the music charts for different countries
and genres to be published. Prior to recorded music, the music sales charts related purely
to the sale of sheet music of popular songs, but, in 1936, the New York based Billboard

magazine first published a list of the three major record companies’ best-selling records
(Sale 1996). By 1940, Billboard had devised its own combined music sales chart from the
data of fifty record shops in the USA, though it wasn’t until 1952 that the New Musical
Express printed the first British singles chart, collated from the sales data of twenty record
shops around the UK (Stanley 2013: 3). The advent of home listening changed the face
of the music industry forever, bringing with it innovative music artists, celebrity cultures
and hysterical fans. Nowadays, given the rise of download and streaming services, the
charts have become more complex and niche for different platforms, genres and playlists.
Furthermore, music artists and producers have regularly modified their creative approaches
to capitalize on consumer preferences with regard to music formats. This is particularly evident with respect to eras when full music albums made up the majority of record sales, enabling artists to present multi-themed sequential stories and concept albums through their music and cover art. It contrasts with the modern approach of producing single tracks that aim for immediate impact with the listener and combine well within a playlist of songs by similar-genre artists on a digital playback platform.
This chapter discusses the contemporary and historical advances in
technology that have enabled the commercialization of recorded music, from the initial
inventions of Thomas Edison, to the compact disc (CD), data compressed music files and
streaming services. Innovative and future commercial music formats are also discussed,
with consideration of interactive music apps, the emergence of virtual reality (VR) and the
modern resurgence of vinyl.

The golden age of analogue


In 1877, inventor Thomas Edison made the first reproducible sound recording when he
recited Mary Had a Little Lamb into the mouthpiece of his hand-powered phonograph.
As Edison turned the phonograph handle, the sound waveform was etched onto a sheet of
tin foil that was wrapped around a metal cylinder (Burgess 2014: 7). Once the recording
had been made, a playback needle could be slotted into the tinfoil grooves to reproduce
the sound. Edison had identified that playing back recorded sound could potentially
bring music and voice recordings to the home, but it was not until 1915 that Edison could
demonstrate that recorded music could sound indistinguishable from a live performance
(Milner 2009: 4). Edison spent many years devising live ‘tone tests’ in which an audience would be challenged to identify whether they were hearing a live vocalist or ensemble or a reproduction from his Diamond Disc Phonograph. Audiences would be amazed
when the musicians on stage stopped performing (really, they were just miming) and the
sound of their performance continued. Despite almost a century of technological advances thereafter, audiences still claim to be able to distinguish a real performance from a recorded one, yet Edison managed to convince his audiences every time, not least because his tone test singers were trained to adapt their voices to the nuances of the Diamond Disc and give a comparable, though perhaps compromised, performance (7). Nevertheless, Edison

changed the commercial landscape for music performance and indirectly invented the role
of the ‘record producer’, who would learn how to perfect a recording set-up such that the
most authentic playback could be achieved.
Initially, commercial phonograph systems were purely acoustic, relying on the physical
vibration of air (i.e. air pressure disturbances) on a lightweight diaphragm and needle to
etch a soundwave groove in a rotating cylinder. Despite the mechanical design, electrical
components had already been introduced to the Phonograph in 1888, by means of an
electric motor to turn the rotating cylinder at a reliable and repeatable speed (Burgess
2014: 16). The bulky size of recording cylinders made them expensive to manufacture and
the flat shellac disc would hence become ubiquitous by 1912. However, the disc brought a
design challenge in that it was less easy to connect the acoustic capture horn and diaphragm
directly onto the recording surface. In the mid-1920s, both Victor and Columbia introduced
electrically recorded discs, which used the principle of electromagnetic induction to convert
acoustic sound waves (motion) into electrical signals. The benefit of an electrical unit was
that the acoustic capture horn and diaphragm did not need to come into direct contact
with the recording disc, also enabling an electrical amplifier to be incorporated before the
(now higher-power) electric signal was converted back into electromagnetic motion and
used to etch the sound waveform on the recording disc. Despite the acoustic and electrical
approaches coexisting for a while, the electrical system enabled a wider frequency range
and more dynamic detail to be captured, giving the music playback more bass, brightness,
body and detail (Schmidt Horning 2015: 37). Electric systems also allowed amplified
(louder) playback through moving coil loudspeakers. Indeed, the electrical development
inspired audio engineer H. A. Hartley to coin the much-used phrase ‘high fidelity’ in 1927
(Burgess 2014: 31).
In 1931, Alan Blumlein was an impressive electro-mechanical engineer working for the
Columbia Graphophone Company, which would soon be consolidated into a new company,
Electric and Musical Industries (EMI). Blumlein had been curious about how the sound he
heard at the cinema always came from the same place (a single loudspeaker) even though
the actors could be positioned either on the left or right of the screen (Alexander 2013: 60).
Blumlein knew that he could create a false or ‘phantom’ binaural positioning of the sound
if he used two loudspeakers, one on either side of the screen, and adjusted the relative
volumes of an actor’s voice into each speaker. However, the binaural or ‘stereophonic’
invention could only be of widespread use if he could develop an ‘end-to-end’ process
that enabled both the effective recording and playback of stereo sound. In his 1931 patent,
Blumlein presented the first stereophonic recording technique, using two microphone
capsules angled 45 degrees positive and negative from the sound source. His patent also
included a method for recording two channels of audio into a single groove of a flat disc,
using a two-axis tool that cut each channel at 45 degrees to the surface of the disc (68).
Unfortunately, Blumlein’s invention was ‘so far ahead of its time’ and ‘misunderstood by
many of those around him’ (63). While Blumlein demonstrated his technology effectively,
the recording industry was not ready to capitalize and, in many respects, the improved
sonic experience was not sufficient at the time to justify the extra cost of manufacturing
two channels of audio in commercial music formats, and the consumer inconvenience

of requiring two loudspeakers in a room. It would be more than twenty-five years before
Blumlein’s stereo concept was incorporated into recording studios, commercial records
and home playback systems, long after his patent had expired.
The recording industry benefitted from a gradual increase in phonograph record sales
during the 1930s, but this was halted dramatically by the onset of the Second World War,
which in the United States also involved a Musicians Union recording ban between 1942
and 1944 (Green and Dunbar 1947). However, the war brought technology advancements
that would later benefit the music production and audio industries; a shortage of shellac
‘drove companies to investigate alternative materials, which resulted in an improved
record made of Vinylite’ (Schmidt Horning 2015: 68). Around the same time, post-war
audio experts were coming together to form the Audio Engineering Society (AES) as
an organization for exchanging ideas and developing technical standards for promoting
common approaches to consumer audio systems (68). As a result, the microgroove vinyl
long play record (LP), with either a 10- or 12-inch diameter, a playback speed of 33⅓ revolutions per minute (rpm) and a standardized playback filter circuit (AES 1951), became
the first non-proprietary music format. This enabled ‘at long last, a common platform for
the reproduction of all recordings regardless of speed, groove dimensions, or manufacturer’
(AES 1951).
The first popular vinyl LP was arguably Columbia’s 1948 reissue of The Voice of Frank
Sinatra (Marmorstein 2007: 165), while 7-inch 45 rpm singles were launched for the first
time in 1952, with Mario Lanza’s Because You’re Mine being possibly the first on the EMI
label (Stanley 2013: 4). It’s worth noting that the term ‘LP’ itself has been adopted as a descriptor for a collection of music tracks, i.e. a music album, and has since become less synonymous with the specific music format. The vinyl LP allowed artists for the first time to experiment with storytelling through a collection of sequential songs and the possibility of a ‘concept album’, such as The Rise and Fall of Ziggy Stardust and the Spiders from Mars by David Bowie. Such releases made full use of the record cover for elaborate artwork, band photography, song lyrics, production credits, information booklets and pop-up inlay designs, and hence the LP is generally regarded as a higher-quality artistic product than vinyl singles and the more convenient formats that followed. A further cultural shift enabled by vinyl was the invention of
the jukebox, a coin-operated mechanical device that could be programmed to play songs
from a number of vinyl discs in a specified order. Installed in public bars and cafes, the
jukebox enabled the first consumer curated playlists and brought with it a new opportunity
for social venues to attract a music loving crowd (‘The Top Twenty Come to the Public Bar’
1956).
Rather than recording direct to disc, analogue magnetic tape recording had become
ubiquitous in recording studios in the 1950s, because it allowed multiple synchronized
tracks of music to be recorded simultaneously and facilitated creative, non-linear editing
after the recordings had been made. The first successful consumer tape format was the
eight track, introduced in the early 1960s. The 8-track cartridge was viable and had a short
period of success, partially due to the fact that it could be utilized in cars and on the move
(Despagni 1976), unlike vinyl, which would skip if experiencing mechanical disturbances.
It allowed four stereo audio recordings to be stored in parallel pairs of magnetic tracks

on the tape, with an electromagnetic playback head moving across the tape to align with
the desired song. Figure 24.1 shows the 8-track format declining to obsolescence in 1983,
when the smaller and more convenient compact cassette started to increase in popularity.
The 2-track compact cassette tape was a successful medium for music delivery for a long
period, partly because it was small, portable and more robust than vinyl. The cassette used
four tracks to hold two stereo recordings which played in opposite directions with songs
from one ‘side’ arranged in series from the beginning to end of the tape and then the cassette
could be turned over to play the other ‘side’. The Sony Walkman portable cassette player emerged in 1979 and ‘changed the world’, despite media and retail scepticism. The music-listening public overlooked the cassette’s inferior audio playback quality and overcame the perceived negative image of wearing headphones in public, and, given the robust and portable convenience of the format, a new cultural paradigm was born (Costello 2009). The cassette was also the first format that allowed home recording and duplication, meaning that for the first time listeners could compile and curate their own bespoke ‘mix tapes’ and make their own copies. The cassette hence brought with it a second cultural paradigm shift: the onset of music piracy, which the format had unexpectedly facilitated and which would penetrate the music industry further with the later introduction of digital formats (Milner 2009: 213).
The main negative attribute of the cassette tape was sound quality. Analogue magnetic tape inherently introduces noise to the recording medium; while this is manageable with professional-grade studio machines, the portable cassette design accepted it as a compromise in order to meet manufacturing and economic targets. Inventor Ray Dolby was a physicist
and keen recording engineer in the 1960s who was eager to improve the noise performance
of analogue tape (Hull 2013). He developed a dynamic noise reduction circuit that only
attempted to filter noise frequencies from the quiet parts of a recording (since the noise was
not so evident when the recorded audio or music was loud). By this method, the recording
system could add selective ‘pre-emphasis’ to the captured audio, requiring a sympathetic ‘de-emphasis’ process on playback devices (Dolby 1967). The professional Dolby-A system
was sold to studios and record labels from 1965, and in 1970 the more cost-effective
Dolby-B system was introduced for the compact cassette. The Dolby-B circuit was licensed
to cassette player manufacturers, transforming the format to become viable for sale and
distribution of recorded music.
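By way of illustration, the sketch below (in Python, using NumPy and SciPy) applies a simple static pre-emphasis filter before noise is introduced and the complementary de-emphasis afterwards. It is a deliberately simplified model: real Dolby circuits apply the boost dynamically and only to quiet passages, and the filter coefficient and signal levels here are arbitrary choices for demonstration only.

import numpy as np
from scipy.signal import lfilter

fs = 44_100
t = np.arange(fs) / fs
music = 0.5 * np.sin(2 * np.pi * 440 * t)          # stand-in for the programme material

a = 0.95                                            # illustrative emphasis coefficient
emphasized = lfilter([1, -a], [1 - a], music)       # unity gain at low frequencies, boosts highs
hiss = 0.01 * np.random.default_rng(0).standard_normal(fs)
on_tape = emphasized + hiss                         # broadband noise added by the tape medium
played_back = lfilter([1 - a], [1, -a], on_tape)    # complementary de-emphasis on playback

residual = played_back - music                      # the music is recovered; only filtered hiss remains
print(f"noise power on tape:        {np.mean(hiss ** 2):.2e}")
print(f"noise power after playback: {np.mean(residual ** 2):.2e}")
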

The birth of digital


Physical digital
Despite Dolby’s innovative noise suppressors, the desire for truly noise-free playback was
strong. In the early 1970s, Denon engineers were convinced that digital recordings gave
an improvement over analogue systems (Fine 2008), and by 1982 digital technologies had
become fast and reliable enough for Philips and Sony to commercialize the digital CD
format, with Billy Joel’s 52nd Street album being the first available to consumers.

The digital concept meant that as long as the microprocessors in the recording and
playback systems could operate accurately and significantly faster than the frequency
threshold of human hearing (22 kHz), then analogue audio data could be sampled and
stored as discrete data, i.e. as digital ‘pulse codes’ representing 0 or 1 values. With digital
systems, for example, 0 volts represents binary 0, whereas 5 volts represents binary 1, so
any analogue noise would be ignored by the system unless it were greater than the huge
2.5 volts needed to incorrectly flip a 0 to a 1 or vice versa (analogue noise in audio circuits
is generally in the region of millivolts, so the likelihood of noise induced errors is very
small). Philips and Sony deduced that sampling at 44.1 kHz with 16-bit resolution gave
sufficient accuracy to capture authentic and noise-free audio recordings – meaning that
each analogue audio voltage was measured 44,100 times per second to a value between 0 and 2¹⁶. The CD data format holds around 80 minutes of audio and is described as the
Philips/Sony Red Book standard. Subsequently ratified by the AES, the Red Book standard
has remained as the minimum benchmark recommendation for pulse code digital
playback (known more technically as the pulse code modulation, or PCM, format). The
CD additionally allows digital data describing the held audio also to be embedded within
the disc. This brought the first widespread use of audio metadata and the opportunity for
tracks to be uniquely identified on the disc by an International Standard Recording Code
(ISRC), which is particularly useful for tracking repertoire, radio playback and mechanical
sales data (Toulson, Grint and Staff 2014), and also allowed users to programme selected
tracks for playback and avoid others, further enabling listeners to move away from the fixed
format of the album. Additionally, a CD master file, which utilizes the disc description
protocol (DDP) format, ensures that the duplication manufacturer receives the exact audio
product provided by the record label, guaranteeing that no potential defects introduced
during transit are propagated into the final manufactured discs.
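The arithmetic behind these Red Book parameters can be reproduced in a few lines; the following Python sketch quantizes an illustrative test tone to 16-bit values and prints the resulting CD data rate and capacity (the 440 Hz tone is simply a stand-in signal).

import numpy as np

SAMPLE_RATE = 44_100          # samples per second per channel (Red Book)
BIT_DEPTH = 16                # bits per sample
CHANNELS = 2                  # stereo

# One second of a 440 Hz test tone in the notional analogue range -1.0 to +1.0
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
analogue = np.sin(2 * np.pi * 440 * t)

# Quantize each sample to one of 2**16 integer steps (-32768 to 32767)
pcm = np.round(analogue * (2 ** (BIT_DEPTH - 1) - 1)).astype(np.int16)

# The quantization error is the 'noise floor' of the digital representation
error = analogue - pcm / (2 ** (BIT_DEPTH - 1) - 1)
print(f"peak quantization error: {np.max(np.abs(error)):.1e}")

# Red Book audio data rate: 2 channels x 16 bits x 44,100 samples per second
bits_per_second = CHANNELS * BIT_DEPTH * SAMPLE_RATE
print(f"CD audio data rate: {bits_per_second} bits per second (~1411 kbps)")
print(f"80 minutes of audio: {bits_per_second * 80 * 60 / 8 / 1e6:.0f} MB")
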
The CD has been a hugely successful format, still achieving significant sales in 2019.
However, its main disadvantage was clearly its poor portability, since physical motion or disturbance of the playback system could cause the CD to skip or become damaged. Equally, as a write-once format, the CD was never particularly suitable for home recording, meaning the cassette tape and portable cassette players enjoyed an extended lifetime into
the early twenty-first century, as highlighted in Figure 24.2. The only potential rival to both
the CD and cassette in the late 1990s was the Sony Minidisc format, which promised the
best of both formats – being both robust and portable, whilst facilitating home recording –
but it failed to gain any commercial traction and very few commercial releases were made,
given the emergence of a second digital audio revolution (Faulkner 2012).

Data and downloads


The CD/PCM standard is described as ‘lossless’ because the digital audio data directly
represent a sampled version of the analogue audio waveform, and, owing to sampling
theorems described by Nyquist (1924) and Shannon (1949), an analogue audio signal can
be exactly recreated from PCM lossless data. When used on a computer system, PCM audio

data are usually stored in the Microsoft WAVE file format with a ‘.wav’ extension (Microsoft
2007), or the similar Audio Interchange File Format (AIFF) utilized by Apple. The
emergence of these lossless file formats enabled the first home-computer based recording
systems, and ultimately the ability for music consumers to store their own library of music
on a digital computer hard disc.
In the late 1980s, lossless audio files took up a relatively large amount of computer
memory space and hence would also take a long time to send and share over computer
networks. For this reason, researchers had long been exploring methods to reduce the file
size of audio files, before Fraunhofer IIS researcher Karlheinz Brandenburg developed
the ‘lossy’ MPEG Layer 3 (MP3) audio file format (Brandenburg et al. 1990). The MP3
is classed as a lossy data compression algorithm because the original analogue audio
signal cannot be fully reproduced from the digital data contained in the audio file. The
algorithm uses a unique mathematical model to remove elements of the audio signal that
are anticipated to be less perceptible to the human ear. This is enabled by a psychoacoustic
phenomenon, known as auditory masking, where some audio frequencies are not heard
if they are similar in pitch to, or significantly quieter than, more dominant frequencies
in the signal (Oxenham 2013). By comparison, whereas the CD audio standard gives two
channels of 16-bit data at 44,100 samples per second (i.e. 1411 kbps), equivalent MP3 files
encoded at 128 kbps allow over ten times as much audio to be held in the same storage space, and hence over ten times faster transmission over a computer network.
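That ‘over ten times’ figure is easily checked with back-of-envelope arithmetic; the short Python sketch below (not an encoder, just a data-rate comparison) contrasts raw CD-quality audio with a 128 kbps MP3 for a notional four-minute track.

cd_kbps = 2 * 16 * 44_100 / 1000      # stereo, 16-bit, 44.1 kHz: ~1411 kbps
mp3_kbps = 128                        # a typical early MP3 encoding rate
ratio = cd_kbps / mp3_kbps

minutes = 4                           # a notional single-track duration
cd_megabytes = cd_kbps * 1000 * minutes * 60 / 8 / 1e6
mp3_megabytes = mp3_kbps * 1000 * minutes * 60 / 8 / 1e6

print(f"CD: {cd_kbps:.0f} kbps, MP3: {mp3_kbps} kbps -> ratio of {ratio:.1f}:1")
print(f"A {minutes}-minute track: {cd_megabytes:.1f} MB uncompressed vs {mp3_megabytes:.1f} MB as MP3")
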
Lossy audio formats, by definition, involve a loss of fidelity in the playback audio signal,
so they are not generally appropriate for professional audio applications. Katz (2012)
demonstrates the quantifiable addition of distortion and noise artefacts that are inherent in
a data-compressed audio file. Corbett (2012) similarly reports on a number of additional
audible artefacts – including sound masking, pre-echoes and low-frequency imbalance –
that can be heard in lossy compressed audio.
Despite the reduced audio quality, the MP3 algorithm revolutionized the music industry.
As internet speeds increased, music listeners were able to download online music files and
build large catalogues of music that could be accessed with the click of a mouse. Much of this
involved non-legal activity, with recorded music being unlawfully shared and distributed,
and little revenue making its way back to the copyright owner or artist. The decline in music
revenue during the years from 2000 onwards (see Figure 24.2) is generally attributed to
music piracy and peer-to-peer sharing of music over the internet (Janssens, Van Daele and
Beken 2009). Specifically, this correlates with the establishment of the Napster website in
1999, which facilitated illegal sharing of music and amassed over 70 million international
users in its two years of activity (Menn 2003). The digital download paradigm did however
bring a new problem, that of library and file management – with potentially hundreds of
thousands of audio tracks saved on a computer, it becomes a huge challenge to search and
find exactly what music is desired at any moment in time, particularly if the files are not
sensibly named or organized into a database structure.
In 2001, Napster was forced to shut down, owing to a successful lawsuit filed by the
RIAA, but the damage had already been done and music consumers would from then
onwards demand digital music download services above established physical formats.

Furthermore, consumers’ appetite to pay high costs for music was severely reduced,
and it would be another fifteen years before the decline in music sales revenue would be
halted. In the same year that Napster closed, Apple capitalized on the new music appetite
by launching the iPod, and the Apple iTunes store followed in 2003. Using Apple’s own
lossy Advanced Audio Codec (AAC) file type, consumers could now purchase music and
download directly over the internet. The iTunes platform also delivered a user-friendly
music file database and library management system, which was undoubtedly a contributor
to the success of the music download format. The on-demand purchase model enabled consumers to purchase only the individual songs they liked (rather than complete albums) and music revenue declined further, particularly owing to reduced sales of full
albums. This trend is highlighted in Figure 24.3, which shows the significant move towards
single downloads in comparison to CD or download album purchases and CD singles.
Despite the increase in unit sales shown during 2004 to 2012, the actual US revenue from
these sales fell roughly fourfold, from approximately US$20 billion to US$5 billion, over the same
period (shown in Figure 24.2).
Given the loss of audio quality and revenue on music sales, a small community of music
makers and audiophiles pushed for lossless and high-resolution alternatives to the MP3.
High-resolution (or hi-res) audio can be defined as digital audio data that have greater
amplitude resolution than 16-bit or greater time-axis resolution than 44.1 kHz (Rumsey
2007). As the compact disc delivery format itself delivers 16-bit and 44.1 kHz accuracy,
hi-res can sometimes be described simply as ‘greater than CD’ resolution, i.e. raising the
digital resolution to, say, 24-bit or using an increased sample rate of, for example, 96 or
192 kHz. One particular file format for holding hi-res audio is FLAC (Free Lossless Audio
Codec). The FLAC algorithm is capable of encoding PCM data into approximately half
the raw file size, but still enables 100 per cent recreation of the PCM data when decoded.
The high-res movement has resulted in small revenues from formats such as Super Audio
CD, Blu-ray Audio and direct high-resolution downloads (from, for example, HDtracks),

Figure 24.3  US music album and singles sales for CD and download from 2004 to 2018
(millions of units).

but even Neil Young’s highly anticipated hi-res Pono format could not gain mainstream
industry or consumer attention, or reverse the decline of revenue during the early twenty-
first century (Lewis 2018).
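The data rates implied by these ‘greater than CD’ resolutions can be estimated in the same back-of-envelope manner; in the Python sketch below, the FLAC figures simply assume the approximate halving described above and will of course vary from recording to recording.

def pcm_kbps(channels, bit_depth, sample_rate_hz):
    """Raw PCM data rate in kilobits per second."""
    return channels * bit_depth * sample_rate_hz / 1000

formats = {
    "CD (16-bit / 44.1 kHz)":    pcm_kbps(2, 16, 44_100),
    "Hi-res (24-bit / 96 kHz)":  pcm_kbps(2, 24, 96_000),
    "Hi-res (24-bit / 192 kHz)": pcm_kbps(2, 24, 192_000),
}

for name, kbps in formats.items():
    # Assume FLAC roughly halves the raw size, as noted above (an approximation only)
    print(f"{name}: {kbps:.0f} kbps raw PCM, ~{kbps / 2:.0f} kbps as FLAC")
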

The emergence of streaming


As internet speeds increased further, the need to download and keep copies of music
audio files diminished, and on-demand music streaming services such as Spotify, Apple
Music and Deezer emerged. Despite early resistance from successful artists such as Thom
Yorke and Taylor Swift (Flynn 2016), the streaming platform has finally turned a corner
for music industry revenues from recorded music. Whilst streaming is generally identified
as being responsible for the latter fall in download sales shown in Figure 24.3, it is equally
responsible for the upturn in overall revenues shown from 2015 in Figure 24.2. In 2018,
50.2 million subscribers to on-demand streaming services accounted for 75 per cent of
all US revenue from recorded music, up from 34 per cent (10.8 million subscribers) in
2015 (Glazier 2019; RIAA 2019b). The streaming surge is partly owing to accessibility for
the consumer and music industry buy-in, and an adoption of the higher quality 320 kbps
lossy bit rate, but equally down to the intelligent algorithms employed to engage listeners
for longer with suggestions of new music, playlists and curated content. Data on every
song’s demographic of listeners, discovery route, play count and skip rate (i.e. the length of time a song is listened to on average before it is skipped) are all valuable to labels and independent artists, and equally allow streaming platforms to develop numerous bespoke
charts and playlists that songs can automatically climb and fall from based on algorithms
and statistics. Additionally, the upturn in engagement with on-demand streaming is
supported by a move (for example, by streaming platform Tidal) to incorporate lossless
PCM data and new high-resolution algorithms developed specifically for streaming, such
as Meridian Audio’s MQA (Master Quality Authenticated) codec (Generale and King 2017).
It is of interest to note that this move from an ownership to a service model also reflects the
historical development of the consumer as a selector rather than a collector, which could
have been somewhat predicted given the gradual increase in consumer desire for personal
curation of music playlists, from the first jukebox, through recording analogue mixtapes,
programming the order of CD playback and the first fully curated playlists enabled by the
digital download format.

Interactive music formats


Novel approaches to drawing greater interactivity from recorded music have been explored for many years, and were perhaps first pioneered by the DJs of the reggae sound systems in Jamaica in the 1950s and 1960s, who developed techniques to
use live effects and processing as they played back records (Zagorski-Thomas 2007). More

recently, in 2005, Trent Reznor of Nine Inch Nails chose to release multitrack files of the
band’s commercial releases, so that fans could engage with the music in a way not possible
before (Reznor 2005). There soon followed interactive music games such as Guitar Hero and
Rock Band, which brought back-catalogue revenues for labels and publishers who were
invited to deliver multitrack stems for games featuring, for example, The Beatles, Metallica
and Aerosmith.
Given an appetite for interactive commercial music, and the increasing ubiquity and
processing power of smartphones on Apple iOS, Android and Windows platforms, the
Album App format emerged (Bogdan 2013). Björk released her Biophilia album as an
app in 2011, which can be purchased and downloaded to iPhone and iPad devices and
incorporates a number of unique graphics and interactivity features. Later in 2014, Paul
McCartney re-released five of his solo albums as album apps. The album app format allows
artistic and interactive content to be packaged alongside the audio, which may include
artwork, photography, song lyrics, video, animation and even interaction and gaming
features (Dredge 2012). The album app is also relatively secure (i.e. it cannot be manipulated
after final rendering) and it is much more difficult to pirate and distribute unauthorized
copies, given its closed format that can only be installed through official online app stores
(Paterson et al. 2016).
Gwilym Gold’s Tender Metal album app (2012) used a novel algorithmic system to
play back a unique synthesized composition on each rendition, without the need for any
user control. Shakhovskoy and Toulson (2015) further defined a potential album-app
platform in collaboration with artist Francois and the Atlas Mountains for their album
Piano Ombre (2014), which was recognized as the world’s first chart-eligible app. The
developed album app platform was later extended by Paterson and colleagues (2016) to
incorporate interactive music features, allowing the listener to explore alternative mixes
and instrument stems of songs in real time, resulting in the variPlay format; the first
variPlay release being the innovative Red Planet EP by artist Daisy and The Dark, launched
on the Apple App Store in 2015 (Arts and Humanities Research Council [AHRC] 2015).
This was soon followed in 2016 by a similar interactive album app Fantom by Massive
Attack, which allowed the user to interact with music playback through the device sensors
including the built-in camera, microphone, compass and accelerometer (Monroe 2016).
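The underlying idea can be sketched very simply; in the Python example below the stem names, file names and gain values are purely illustrative and are not taken from variPlay, Fantom or any released app. The point is only that an interactive release re-mixes pre-rendered stems under listener control.

import numpy as np
from scipy.io import wavfile

# Hypothetical stem files (same length and sample rate) with user-controlled gains
user_gains = {"drums": 1.0, "bass": 0.8, "vocals": 1.2, "synths": 0.5}

mix = None
sample_rate = None
for stem_name, gain in user_gains.items():
    sample_rate, audio = wavfile.read(f"{stem_name}.wav")   # placeholder file names
    audio = audio.astype(np.float64) * gain
    mix = audio if mix is None else mix + audio

mix /= np.max(np.abs(mix))                                   # simple peak normalization
wavfile.write("interactive_mix.wav", sample_rate, (mix * 32767).astype(np.int16))
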

Conclusion
It is clear that the evolution of music formats has swung back and forth owing to changing
consumer demands for high audio quality and enhanced features, convenience and user
experience. In many cases it is seen that an established music format (such as vinyl or
the CD) is upstaged by a new, more convenient or feature-rich format which delivers a
lower audio playback quality (such as the cassette following vinyl and the MP3 download
following the CD). Yet once the new format is established, consumer demand reverts to

improved audio quality as an essential progression, until new convenient formats emerge
and the cycle repeats.
Interactive digital music has shone a glimmer of light into the future of commercial
music formats. With the emergence of VR games and applications comes a desire for VR audio and 3-D sound systems. Binaural algorithms, such as Headphone X by DTS, have shown that it is possible to spatialize audio in standard stereo headphones using advanced transfer functions that mimic the physical and psychoacoustic properties of
the human head and hearing system. However, despite these significant advances in
audio technology, and experimentation by the ever-innovative Björk, 3-D sound has
yet to deliver a compelling format that could be considered for release of commercial
music.
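The principle behind such spatialization can nonetheless be sketched: a mono source convolved with a measured pair of head-related impulse responses yields a binaural stereo file for headphone playback. The Python example below is a generic illustration (not the DTS Headphone X algorithm) and the impulse-response file names are hypothetical.

import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve

fs, mono = wavfile.read("source_mono.wav")           # hypothetical mono source
_, hrir_left = wavfile.read("hrir_left.wav")          # hypothetical measured impulse responses
_, hrir_right = wavfile.read("hrir_right.wav")

mono = mono.astype(np.float64)
left = fftconvolve(mono, hrir_left.astype(np.float64))
right = fftconvolve(mono, hrir_right.astype(np.float64))

stereo = np.stack([left, right], axis=1)
stereo /= np.max(np.abs(stereo))                      # normalize to avoid clipping
wavfile.write("binaural_out.wav", fs, (stereo * 32767).astype(np.int16))
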
Paradoxically, the vinyl format has seen a significant resurgence since 2006, matching
sales from the late 1980s and approaching a half billion-dollar industry in the United States
(see Figure 24.4). Vinyl is often regarded as the most ‘artistic’ music format (Shakhovskoy
and Toulson 2015), given the physical size and style, which lends itself to rich artwork
and associated texts including song lyrics and producer credits, so this trend is potentially
owing to a consumer backlash to the digital revolution (Dewey 2013), though it is uncertain
whether the resurgence will hold, grow further or fall in the future.
With the upturn in vinyl sales, the record industry has also lobbied streaming service
providers to deliver higher quality streaming data rates, provide better consumer experience
(such as incorporating artwork, music videos, lyrics and producer credits) and return
better royalties to artists. As a result, the original contracts for streaming services are being
renegotiated more favourably with labels. RIAA data show that, in 2019, more music is
being created and consumed than ever before, and with the dark days of digital piracy
predominantly in the past, the future looks positive for the commercial music industry and
music artists alike.

Figure 24.4  US vinyl sales between 1989 and 2018.



Bibliography
Alexander, R. C. (2013), The Inventor of Stereo: The Life and Works of Alan Blumlein,
Burlington, MA: Focal Press.
Arts and Humanities Research Council (AHRC) (2015), ‘Music App Is “Magic Meeting
Technology”’, 9 July. Available online: http://www.ahrc.ac.uk/newsevents/news/musicapp/
(accessed 3 August 2019).
Audio Engineering Society (AES) (1951), ‘AES Standard Playback Curve’, Audio Engineering,
January: 22–45.
Bogdan, G. (2013), ‘Is the Music Album App the Next Game Changer’, The Guardian,
21 August. Available online: http://www.theguardian.com/media-network/2013/aug/21/
music-album-app (accessed 9 August 2019).
Brandenburg, K., H. Gerhser, D. Seitzer and T. Sporer (1990), ‘Transform Coding of High
Quality Digital Audio at Low Bit Rates–Algorithms and Implementation’, Proceedings
of IEEE International Conference on Communications, Including Supercomm Technical
Sessions, Atlanta, GA, 16–19 April: 3: 932–936.
Burgess, R. J. (2014), The History of Music Production, Oxford: Oxford University Press.
Corbett, I. (2012), ‘What Data Compression Does to Your Music’, Sound on Sound Magazine,
April. Available online: https://www.soundonsound.com/techniques/what-data-
compression-does-your-music (accessed 9 August 2019).
Costello, J. (2009), ‘Rewind: How the Walkman Changed the World’, The Independent, 16 July.
Available online: https://www.independent.ie/entertainment/music/rewind-how-the-
walkman-changed-the-world-26551309.html (accessed 3 August 2019).
Despagni, A. J. (1976), ‘Some Help from Debussy for the Hassled Driver’, New York Times, 25
January. Available online: https://www.nytimes.com/1976/01/25/archives/some-help-from-
debussy-for-the-hassled-driver.html (accessed 3 August 2019).
Dewey, C. (2013), ‘Vinyl Record Sales Have Hit Their Highest Point Since 1997’, Style [blog],
Washington Post, 11 April. Available online: http://www.washingtonpost.com/blogs/style-
blog/wp/2013/04/11/vinyl-records-are-more-popular-now-than-they-were-in-the-late-
90s/ (accessed 9 August 2019).
Dolby, R. (1967), ‘An Audio Noise Reduction System’, Journal of the Audio Engineering Society,
15 (4): 383–388.
Dredge, S. (2012), ‘Music Apps Are the New Albums’, The Guardian, 7 November. Available
online: http://www.theguardian.com/technology/appsblog/2012/nov/07/music-apps-
david-gilmour-lady-gaga (accessed 9 August 2019).
Faulkner, J. (2012), ‘MiniDisc, the Forgotten Format’, The Guardian, 24 September. Available
online: https://www.theguardian.com/music/musicblog/2012/sep/24/sony-minidisc-20-
years (accessed 3 August 2019).
Fine, T. (2008), ‘The Dawn of Commercial Digital Recording’, Association for Recorded Sound
Collections Journal, 39 (1): 1–17.
Flynn, M. (2016), ‘You Need Me Man, I Don’t Need You: Exploring the Debates
Surrounding the Economic Viability of On-Demand Music Streaming’, in R. Hepworth
Sawyer, J. Hodgson, J. L. Paterson and R. Toulson (eds), Innovation in Music II:
Transactions on Innovation in Music, 157–173, Shoreham-by-Sea: Future Technology
Press.

Generale, M. and R. King (2017), ‘Perceived Differences in Timbre, Clarity, and Depth in
Audio Files Treated with MQA Encoding vs. Their Unprocessed State’, Convention e-Brief
392 Presented at the 143rd Audio Engineering Society Convention, October, New York.
Glazier, M. (2019), ‘50 Million Reasons for Optimism’, Medium.com, 28 February. Available
online: https://medium.com/@RIAA/50-million-reasons-for-optimism-d70cff45c8ab
(accessed 3 August 2019).
Green, L. and J. Y. Dunbar (1947), ‘Recording Studio Acoustics’, Journal of the Acoustical
Society of America, 19 (May): 412–414.
Hull, J. (2013), ‘Ray Dolby 1933–2013’, Journal of the Audio Engineering Society, 61 (11):
947–949.
Janssens, J., S. Van Daele and T. V. Beken (2009), ‘The Music Industry on (the) Line: Surviving
Music Piracy in a Digital Era’, European Journal of Crime, Criminal Law and Criminal
Justice, 17: 77–96.
Katz, B. (2012), iTunes Music: Mastering High Resolution Audio Delivery, Burlington, MA:
Focal Press.
Lewis, R. (2018), ‘Neil Young Fights for Quality Streaming with Career-Spanning Online
Archive’, Chicago Tribune, 14 February. Available online: http://www.chicagotribune.com/
entertainment/music/la-et-ms-neil-young-archives-online-20180214-story.html.
Marmorstein, G. (2007), The Label: The Story of Columbia Records, New York: Thunder’s
Mouth Press.
Menn, J. (2003), All the Rave: The Rise and Fall of Shawn Fanning’s Napster, New York: Crown
Business.
Microsoft (2007), Multiple Channel Audio Data and WAVE Files. Available online: http://
www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/Docs/multichaudP.pdf
(accessed 9 August 2019).
Milner, G. (2009), Perfecting Sound Forever: An Aural History of Recorded Music, New York:
Faber & Faber.
Monroe, J. (2016), ‘Massive Attack Launch App Containing New Music’, Pitchfork, 21 January.
Available online: https://pitchfork.com/news/61596-massive-attack-launch-app-containing-
new-music/ (accessed 3 August 2019).
Nyquist, H. (1924), ‘Certain Factors Affecting Telegraph Speed’, Bell Systems Technical Journal,
3 (2): 324–346.
Oxenham, A. J. (2013), ‘Mechanisms and Mechanics of Auditory Masking’, Journal of
Physiology, 591 (10): 2375.
Paterson, J., E. R. Toulson, S. Lexer, T. Webster, S. Massey and J. Ritter (2016), ‘Interactive
Digital Music: Enhancing Listener Engagement with Commercial Music’, in R. Hepworth
Sawyer, J. Hodgson, J. L. Paterson and R. Toulson (eds), Innovation in Music II:
Transactions on Innovation in Music, 193–209, Shoreham-by-Sea: Future Technology Press.
Recording Industry Association of America (RIAA) (2019a), ‘US Sales Database’. Available
online: https://www.riaa.com/u-s-sales-database/ (accessed 3 August 2019).
Recording Industry Association of America (RIAA) (2019b), RIAA 2018 Year-end
Music Industry Revenue Report. Available online: http://www.riaa.com/wp-content/
uploads/2019/02/RIAA-2018-Year-End-Music-Industry-Revenue-Report.pdf (accessed
3 August 2019).
Reznor, T. (2005), ‘The Hand That Feeds (Read Me from Trent)’, NINRemixes, 15 April.
Available online: http://ninremixes.com/8/multitracks.php (accessed 3 August 2019).

Rumsey, F. (2007), ‘High Resolution Audio’, Journal of the Audio Engineering Society, 55 (12):
1161–1167.
Sale, J. (1996), ‘Sixty Years of Hits, from Sinatra to … Sinatra’, Independent, 5 January.
Available online: https://www.independent.co.uk/life-style/sixty-years-of-hits-from-
sinatra-to-sinatra-1322429.html (accessed 3 August 2019).
Schmidt Horning, S. (2015), Chasing Sound: Technology, Culture, and the Art of Studio
Recording from Edison to the LP, repr. edn, Baltimore: Johns Hopkins University Press.
Shakhovskoy, J. and E. R. Toulson (2015), ‘Future Music Formats: Evaluating the Album App’,
Journal on the Art of Record Production, 10. Available online: https://www.arpjournal.com/
asarpwp/future-music-formats-evaluating-the-album-app/ (accessed 9 August 2019).
Shannon, C. E. (1949), ‘Communication in the Presence of Noise’, Proceedings of the Institute
of Radio Engineers, 37 (1): 10–21.
Stanley, B. (2013), Yeah Yeah Yeah: The Story of Modern Pop, London: Faber & Faber.
‘The Top Twenty Come to the Public Bar: Remorseless March of the Juke Box’ (1956),
Manchester Guardian, 30 November. Available online: https://www.theguardian.com/
music/2017/nov/30/remorseless-march-of-the-jukebox-1956 (accessed 9 August 2019).
Toulson, E. R., B. Grint and R. Staff (2014), ‘Embedding ISRC Identifiers in Broadcast Wave
Audio Files’, in R. Hepworth-Sawyer, J. Hodgson, J. Paterson and E. R. Toulson (eds),
Innovation in Music, 213–223, Shoreham-by-Sea: Future Technology Press.
Zagorski-Thomas, S. (2007), ‘Gesturing Producers and Scratching Musicians’, in Proceedings of
the Art of Record Production Conference, Queensland University of Technology, Brisbane.

Discography
Björk (2011), [app] Biophilia, One Little Indian Records.
Bowie, David (1976), [vinyl LP] The Rise and Fall of Ziggy Stardust and the Spiders from Mars,
RCA Victor.
Daisy and The Dark (2015), [app] Red Planet EP, Red Planet Records.
Francois and the Atlas Mountains (2014), [app] Piano Ombre, Domino Records.
Gold, Gwilym (2012), [app] Tender Metal, Gwilym Gold.
Joel, Billy (1982), [CD] 52nd Street, CBS Sony.
Lanza, Mario (1952), [7” vinyl] Because You’re Mine, EMI.
Massive Attack (2016), [app] Fantom, Fantom and Sons Ltd.
Sinatra, Frank (1948), [vinyl LP] The Voice of Frank Sinatra, Columbia.
25
Listening to Recorded Sound
Mark Katz

Introduction
The advent of mechanical sound recording in the nineteenth century marked a momentous
development in the history of music. It did not, however, mark the advent of technologically
mediated listening. For centuries, it had already been possible to listen to music created
by self-playing instruments, from the water-driven flutes of third-century BCE Greece to
the pianist androids of eighteenth-century France (Ord-Hume 2001). In 1876, the year
before Thomas Edison introduced the phonograph, a self-playing ‘Pianista’ was exhibited in
Philadelphia, and automatic pianos developed alongside sound-recording machines (Loesser
1954: 580–581). What, then, was new and significant about sound recording? Perhaps what
early phonograph listeners marvelled at most was its uncanny separation of the human
voice from the corporeal body, a feat no previous technology achieved. From the historical
standpoint, however, we can see that this technology ushered in a profound, global change
in musical listening habits. Technologically mediated music became ubiquitous, shaping
every facet of musical life. It is this process, still evolving nearly a century and a half later,
that I examine here. I organize this brief survey around three periods of listening, from the
cylinder recordings of the late nineteenth century to the digital streaming technologies of the
early twenty-first century. Throughout, I’ll call attention to the interdependent relationship
among listening, technology and culture.
Any number of key dates could facilitate a periodization of listening in the age of recorded
sound: 1877, the year Edison and his lab brought out the cylinder-playing phonograph;
1906, when the internal-horn player (best known in the form of the Victrola) made
record players more acceptable in domestic settings; 1925, when, with the introduction of
electrical recording, microphones started capturing a wider range of sounds; 1948, when
the long-playing record, or LP, quadrupled the playing time of the 78 rpm record to 20 minutes
per disc side; 1963, the year the compact cassette tape went public; 1979, the debut of the
Walkman; 1982, the release of the first compact disc, or CD; 1999, when Napster’s peer-to-
peer service went live and MP3 became a household word; or 2008 with the launch of the
music streaming service Spotify. Such a timeline essentially articulates a corporate history
of recorded sound, recounting the triumphs and occasional failures of Edison, Victor,
Columbia, Philips, Sony, Napster, et al., while simultaneously positioning technology as
the prime driver of musical experience. Without dismissing corporate- and technology-
centred perspectives, I’d like to offer a tripartite periodization that focuses on modes of
listening: learning to listen, high-fidelity listening and post-fidelity listening.

Learning to listen
Listening to music is a learned skill. We learn which sounds constitute the music and which
are incidental. We determine where to affix our gaze. We learn how to respond emotionally
to music and how to manifest these responses, whether to clap, sing, dance, mosh or sit in
silence. Likewise, we learn how to listen to recorded sound, but because it differs from live
musical performance in the ways in which it occupies time and space, we learn to adjust.
Recorded music typically has no accompanying visual analogue, so we need to decide how
to occupy our ocular organ. Recorded music can be portable, and so we learn how to listen
to music while moving through space. (And when we don't learn well enough, we wreak
havoc when we walk, cycle or drive while too heavily under its influence.) Recorded music is
repeatable, and often easily accessible, and so we discover that we can skip songs, knowing
that they can be refound. We learn how to exploit the distinctive interfaces and affordances
of particular technologies, whether the tone arm and stylus of a phonograph, the rewind
and fast forward buttons on a tape player, or the skip button/icon on CD and MP3 players or
computer screens. Much of this happens unconsciously, and so we learn without knowing
we are learning. Years ago I observed my infant daughter first become aware of recorded
music. She would look anxiously around to locate its origin, having only ever experienced
sounds emanating from sources that she could readily see, like her mother or father. But
after only a few days she stopped looking for the music. She had learned a key aspect of
phonographic listening – the decoupling of sight and sound.1
I’ve pointed out a few ways that we, as individuals, learn how to listen to recorded sound.
In the early history of sound recording there was also a collective process of learning in
which a variety of modes of listening were explored, tested, accepted or rejected. During
this period – which, by my estimate, lasted until the Second World War – a number of broad
questions were addressed. What was the main purpose of listening to recorded sound? What
was the primary mode of listening? What was the sonic ideal of recorded sound? Here are
the answers. The main purpose of listening to recorded sound was musical entertainment.
The primary mode of attending to music was flexible, semi-engaged listening. The sonic
ideal came to be called high-fidelity, in which recorded sound was meant to approximate
or simulate the experience of listening to live music. In each case, other answers were
posited and championed, but never became predominant. When Edison sketched out the
possibilities of recorded sound in 1878, listening to music was only one of many activities,
and not at the top of the list – the writing and receiving of letters owned that spot (Edison
1878, reproduced in Taylor, Katz and Grajeda 2012: 29–37). An 1877 New York Times article
marvelled at the prospect of keeping a well-stocked oratorical cellar of vintage speeches
preserved on wax cylinder (‘The Phonograph’ 1877, reproduced in Taylor, Katz and Grajeda
2012: 37–39). In early twentieth-century Madras (now Chennai), recordings that featured
vikatam – a genre of vocal mimicry (of anything from birds to trains to social interactions) –
rivalled music discs for popularity (Weidman 2010). In an alternative universe of recorded
sound, listening to the spoken word or vocal mimicry sketches could well be the dominant
form of engagement. In the early days of recording, many writers championed a form of
attentive listening that carefully recreated the Western classical music concert experience
of the late nineteenth century; over time, however, it was deemed acceptable to listen to
recordings in very different ways. And although fidelity might seem to be an obvious goal for
recorded sound, this too had to be negotiated; some proposed and experimented with using
recordings to create sounds that could not be performed or heard otherwise. Although this
is a tremendous simplification of a long, complex and never wholly completed process, we
can see in retrospect the emergence of certain broad patterns of thinking and acting around
recorded sound that came to predominate, setting the stage for the next several decades.
The negotiations about recorded sound taking place during this period are fascinating
to observe, for they reveal a wide range of what might strike us today as odd debates and
unlikely forms of listening. Some writers pushed for appreciation, rather than entertainment,
as the ideal mode of listening, and a small industry grew up in the early twentieth century
that promoted educational records and books about how to learn to listen to music through
the phonograph.2 Music appreciation via sound recording lives on in the CD anthologies
(and now internet-accessible playlists) that accompany university music survey courses.
Still, appreciation lost out to entertainment. Arguments abounded about whether listening
to recorded music alone, listening on Sunday, or listening during meals or other quotidian
activities such as dressing or shaving was proper. Consider two competing articles in
the inaugural volume of the British periodical, Gramophone, from 1923. Grant Richards
(1923) makes his view clear from the title of his piece, ‘Against the Gramophone’, in which
he insists: ‘The gramophone is a destroyer of peace; it rends the family asunder. The hours
which were peaceful it has made cacophonous. In a gramophone-ridden house one can
never be sure that the sacred silence will not be shattered.’ Orlo Williams (1923), on the
other hand, argued that recordings should be welcomed at all times and seasons, making
a special case for listening to recordings alone, which some thought improper: 'when
science and enterprise have put so much pleasure within our reach as a gramophone and
a selection of good records will give, you will find it hard to convince me by logic, or mere
vociferation, that music in solitude is indecent’. Over time, disputes about phonographic
(or gramophonic) propriety mostly faded; recorded music came to accompany nearly
every conceivable human activity, and uncontroversially so.

High-fidelity listening
By the Second World War, it must have seemed to most listeners that the meaning and value
of recorded music were settled. This high-fidelity period, as I'm identifying it, persisted until
the turn of the millennium.3 With the development of LPs, 45 rpm discs and the jukeboxes
they filled, and the rise of a star system of recording artists, a massive, international music
industry came to play a major role in modern entertainment. In order to compete for
customers, equipment manufacturers and record labels feverishly sought (and exuberantly
proclaimed) sonic improvements, particularly along three parameters – frequency range,
dynamic range and noise reduction. The high-fidelity period was not simply focused
on sound, however; it promoted a product-based (as opposed to service-based) model,
one intended to create an unquenchable demand for the dizzying array of gear and discs
on offer every month. (Speciality magazines, like the aptly titled High Fidelity, which
launched in 1951, emerged to help consumers navigate the ever-expanding marketplace.)
This quest for fidelity – and the production of related goods – drove the industry and
shaped listening habits for decades, from the first mass-produced stereo recordings in the
late 1950s to the rise of digital recording in the late 1970s and early 1980s, first on LP and
cassette and then on CD and DAT (digital audiotape). With the CD, record companies
made huge sums of money essentially reselling consumers their vinyl collections in the
form of digital reissues, invoking high-fidelity to drive demand. And although there were
no revolutionary advances in CD technology in the last two decades of the century, every
incremental change in sampling rates or digital-to-analogue conversion was touted as
crucial. Even cassette tapes, designed for convenience, portability and user-friendliness,
were often marketed for their so-called fidelity to reality.
I use the phrase ‘so-called fidelity’ because high-fidelity is a construct, a rhetorical
gambit. In reality, high-fidelity is hyperreality. Recordings often offer listeners an impossible
(or near-impossible) vantage point, say in the middle of a string quartet or floating above
the musicians. Balances among instruments or between voices and instruments or the
movement of sound through the stereo field might only exist on recordings. Bass or treble
might be boosted or dampened in ways that correspond to no live performance. Jonathan
Sterne offers this no-nonsense deflation of the concept of fidelity: ‘To argue that the primary
purpose of media is to reproduce experience as it unfolds in real life is to miss the point
on two grounds. This position proposes that consciousness is somehow “unmediated” (as
if language and culture did not exist) and it reduces all large-scale social activity to some
kind of aberration of face-to-face communication’ (2012: 250n16). It’s not clear whether
listeners wholly believed in the verisimilitude of recordings or understood them to be more
real than real. Nevertheless, for decades, the industry and millions of listeners remained
faithful to fidelity.

Post-fidelity listening
I use the term post-fidelity to describe the period, starting just before the turn of the
millennium, when listening habits changed, new technologies arose, and the discourse of
recorded sound began to shift.4 Two interrelated technologies, compression and file-sharing,
helped drive the priorities of this new era. With compression formats, most famously the
MP3 (introduced to the public in 1995), size mattered, and smaller was better. The goal
of compression was to ease the dissemination of files across the internet and to occupy
less real estate on computer hard drives. In order to shrink files, sonic information – in
the form of strings of 1s and 0s – had to be removed. The scientists who developed these
formats sought to minimize the effect of compression on listeners; still, much debate has
ensued about the degree to which this digital pruning could be perceived by listeners. Later
in the 1990s came the MP3’s partner in crime (literally, according to some landmark court
cases): peer-to-peer (P2P) file sharing. Most famously in the form of Napster (1999–2001),
these networks allowed users to disseminate and accumulate digital sound files easily and,
initially, at no cost. Engaging with music through MP3s was distinctive in many ways.
Acquiring recordings was often haphazard and unreliable, with poorly labelled, severely
compressed or corrupt files a common result; CD-quality sound was also possible, but
it was hard to ensure. Since downloading required a computer, a common playback
device at the time was an immobile desktop with low-fidelity speakers. When MP3s were
transferred to portable digital players, such as Apple’s iPod (first released in 2001), the
sound quality conveyed by earbuds and headphones was often compromised as well. The
combination of compressed files and typically low-quality speakers realized no sound
engineer’s high-fidelity ideal. According to the assumed teleology of sound recording,
in which perfection is the technology’s manifest destiny, compression and file-sharing
seemed to represent a regression. Nevertheless, for millions of listeners, the partnership
of MP3 and P2P was a match made in the heavenly jukebox. I was not alone in finding the
experience of downloading music an exhilarating fever dream of consuming sonic sweets
in an endless candy store in which everything was free and overindulging an impossibility.
And like many others, I tolerated low-bit-rate, watery-sounding music and mislabelled files
both because I could access seemingly all the music I had ever wanted and all the music I
didn’t know I wanted but now needed, and also because I could listen to it whenever and
wherever I wanted.5
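By way of a rough, back-of-the-envelope illustration of the scale of that trade-off – the figures below assume standard CD audio parameters and a typical 128 kbit/s MP3 of the period, and are not drawn from this chapter – consider a four-minute track:

CD-quality PCM: 44,100 samples/s × 16 bits × 2 channels = 1,411,200 bits/s ≈ 1.4 Mbit/s
Uncompressed, four minutes: 1,411,200 bits/s × 240 s ≈ 339 Mbit ≈ 42 MB
Encoded at 128 kbit/s: 128,000 bits/s × 240 s ≈ 31 Mbit ≈ 3.8 MB

On these assumptions, the MP3 occupies roughly one-eleventh of the space of its CD-quality source – a reduction that mattered enormously on the slow connections and small hard drives of the late 1990s.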
This is what I mean by post-fidelity. It is not that high-quality sound no longer mattered,
and it wasn’t as if access, convenience and affordability had not mattered in the previous
century. Rather, there was an inversion of priorities and, for the first time in the history of
recording, when it came to sound quality, good enough was good enough. Moreover, and
crucially, the industry in the post-fidelity era moved away from the product-based model that
had dominated for decades, replacing it with a service-based model in which consumers
are meant to license rather than own the music they listen to on their various devices.
Post-fidelity has persisted beyond the demise of free file sharing and into the era of paid
downloading platforms (such as Apple’s iTunes Store), video sharing sites (most notably
YouTube) and streaming services (such as Pandora and Spotify). The post-fidelity era, now
about twenty years old, is the new norm.
There has been, however, a notable form of resistance to the post-fidelity ideal: the
resurgence of analogue technologies, particularly the record player and vinyl disc. This
vinyl revival started gathering momentum around 2007 and continues at the time of
this writing, more than a decade later, with sales steadily rising every year and a well-
established global, annual celebration known as Record Store Day (2019). The revival
is modest, taking place largely within Western nations and mostly among the higher
end of the socioeconomic spectrum. Still, the phenomenon is instructive, revealing the
appeal of seeing and touching grooved vinyl and perhaps exposing the limitations of
internet-based listening. But it’s also worth noting that the celebration of the analogue
does not necessarily represent a rejection of the digital. Most vinyl lovers today also (and
even predominantly) consume digital media. Free digital downloads often come
with new LPs so that listeners can have the best of both worlds. The analogue allows
them to engage in a satisfying, multisensory, even sensual musico-technological ritual:
removing the disc from its sleeve, placing it over the turntable’s spindle, lowering the
tonearm (grasped gently between thumb and forefinger) onto the record, nestling the
diamond-tipped stylus into the v-shaped groove, tapping the start-stop button and then
reversing the process. The digital download, on the other hand, can accompany listeners
wherever they venture and is easily repeated, or skipped, for that matter.6 The vinyl revival
does not suggest a return to the high-fidelity era, for it typically represents no more
than 5 per cent of total annual sales; it does, however, demonstrate that technological
negotiation is ongoing, as listeners do more than simply respond to the possibilities or
constraints of the technologies offered to them in the marketplace. They also, always and
vitally, exercise their agency, their aesthetics and their values.

Cultures of listening
In the preceding discussion, culture has lurked in the background, revealing itself, if only
indirectly, as an ever-present force in the development of mediated listening. I now want
to bring it into the foreground. Consider the high-fidelity era. Not to undercut the validity
of my own periodization, but it cannot be said that high-fidelity, as a value and an ideal,
was equally embraced across the world; it was much more marked in the West. More to
my point, high-fidelity, particularly as it was manifested in the United States, was deeply
influenced by gender relations within the white American middle class. As Keir Keightley
(1996) writes: ‘The high-fidelity phenomenon of the late 1940s and 1950s involved not only
the masculinisation of home audio technology and the reclaiming of masculine domestic
space; it was also part of a significant development in the history of American middle-class
culture.’ It was during this period that recorded music came to be thoroughly masculinized;
the technology was not as strongly gendered in previous decades and, in fact, women had
been its prime consumers and users in schools and at home in most Western nations.
Consider, for example, Agnes Haynes’s 1908 Talking Machine News article, ‘How to Give a
Christmas Party’. As she explained:
At the majority of Christmas gatherings at the home a band is, and always has been, quite out
of the question for many reasons. […] But since the arrival of the talking machine everyone
may confidently solve the problem by a little forethought in the selection of records. One
advantage of mechanical music is that it does not suffer from nerve-fright when facetious
would-be comedians satirise its efforts; nor does it become prostrate from excessive demands
on its energies. (Haynes 1908)7

We see here a negotiation of recorded sound’s meaning in this early era. The talking machine,
as it was often called in those days, both built upon and subtly altered a longstanding
cultural practice (in this case, in England) – the middle-class home Christmas party. The
existing cultural contexts shaped the use of the technology much more powerfully than the
technology influenced cultural practices. Put another way, just because record players gave
life to Christmas parties didn’t mean that phonograph owners in, say, Muslim-majority
Iran would suddenly start throwing Christmas parties.
This may seem an obvious point, but too often we look to the technology itself to
understand or interpret its modes of consumption rather than the cultural context of the
technology’s users. Staying with the example of Iran, the consumption of cassettes and CDs
in Iran in the years since the revolution of 1979 has been directly influenced by the relative
religious conservatism of those in power. In some Muslim traditions, music is haram –
forbidden – and this was the perspective in Iran shortly after the revolution. Muslims
who wished to listen to recorded music during this time had to do so in secret, seeking
black-market recordings, listening only at certain times and with certain people, removing
informational or illustrative labels, notes or sleeves, and so on. Certainly no one was
publishing articles on how to enliven the Eid al-Fitr celebration with 'a little forethought
in the selection of records’. Music may be haram in these circumstances, but what counts
as music may vary depending on who is in power. Qur’anic chant, for example, is always
halal, and thus not considered to be music. But other forms of singing, particularly
by men and performed without instruments, were also acceptable. In later years, male
musicians were shown on Iranian television, but their instruments might be blurred out
or covered by animations (Robertson 2012: 30). Again, we see a clear relationship among
culture, politics and listening habits. This is certainly a stark example and one that cannot
be assumed to apply to the more than one billion Muslims living worldwide – after all,
Indonesians and Kosovars, Senegalese and Uzbeks may share a religion but their cultural
traditions are often quite distinct. The example of listening in post-revolution Iran, however,
reveals how strongly listening habits may be shaped by forces that have nothing to do with
technological affordances.
To take a final example, consider karaoke. The practice didn’t arise because technology
made it possible. Karaoke emerged in the early 1970s in Japan, where impromptu amateur
singing at social gatherings had long been a traditional cultural practice. The main purpose of
karaoke is simply to facilitate amateur social singing by providing professionally recorded
accompaniment. The crucial feature of any karaoke technology – whether employing
cassettes, CDs, laser discs, music videos, MP3s or some other format – is that the recordings
it plays are not complete in themselves. The very term ‘karaoke’, roughly translated as ‘empty
orchestra’, reveals this incompleteness. Simply insofar as karaoke demands active music
participation, it provides a counterexample to the long-held assumption that recording
technology encourages a passive relationship with music. Those who interact with karaoke
technology are anything but passive – after all, listeners are typically just performers-in-
waiting. At the same time, we must acknowledge that the technology has influenced the
tradition of communal singing, affecting how, where and what people sing, and that this
influence has spread well beyond the borders of Japan.

The lesson here is to be wary of all forms of determinism, whether cultural or
technological. Or, to put this in the positive, if we are to understand the now long and
rich tradition of listening to recorded sound, we must take a broadminded approach, one
that accounts for the manifold technological and cultural factors that shape it and, most
importantly, never loses sight of the endlessly creative and complex humans who connect
so deeply with the music.

Notes
1. The term ‘acousmatic’ describes such sound decoupled from its originating cause. See
Kane (2014).
2. See, for example, Faulkner (1913); Scholes (1921). Note that both books were published
by record companies; this was not uncommon. For more on the business of music
appreciation, see Chybowski (2017).
3. This is not to say that fidelity had not been a goal of the recording industry prior to the
Second World War; indeed, the quest for fidelity is as old as the phonograph. My point
is that by the post-war period fidelity, particularly under the rubric of high-fidelity, came
to operate as the dominant and natural mode of phonographic discourse. See Thompson
(1995).
4. For more on the concept of post-fidelity, see Guberman (2011).
5. For more on the iPod and other forms of mobile music, see Bull (2007); Gopinath and
Stanyek (2014).
6. For more on the vinyl revival, see Bartmanski and Woodward (2015); Katz (2015); and
Krukowski (2017).
7. See also Adams (1908).

Bibliography
Adams, G. (1908), ‘How to Give a Christmas or New Year’s Party’, Talking Machine News and
Journal of Amusements, 1 (December): 66.
Bartmanski, D. and I. Woodward (2015), Vinyl: The Analogue Record in the Digital Age,
London: Bloomsbury.
Bull, M. (2007), Sound Moves: iPod Culture and the Urban Experience, London: Routledge.
Chybowski, J. J. (2017), ‘Selling Musical Taste in Early Twentieth-Century America: Frances
E. Clark and the Business of Music Appreciation’, Journal of Historical Research in Music
Education, 38: 104–127.
Edison, T. A. (1878), ‘The Phonograph and Its Future’, North American Review, 126: 530–536.
Faulkner, A. (1913), What We Hear in Music, Camden, NJ: Victor Talking Machine Company.
Gopinath, S. and J. Stanyek (2014), The Oxford Handbook of Mobile Music Studies, 2 vols,
Oxford: Oxford University Press.
Guberman, D. (2011), 'Post-Fidelity: A New Age of Music Consumption and Technological
Innovation', Journal of Popular Music Studies, 23: 431–454.
Haynes, A. (1908), ‘How to Give a Christmas Party’, Talking Machine News and Journal of
Amusements, 1 (December): 42.
Kane, B. (2014), Sound Unseen: Acousmatic Sound in Theory and Practice, Oxford: Oxford
University Press.
Katz, M. (2010), Capturing Sound: How Technology Has Changed Music, rev. edn, Berkeley:
University of California Press.
Katz, M. (2015), ‘The Persistence of Analogue’, in G. Borio (ed.), Musical Listening in the Age
of Technological Reproduction, 275–287, Farnham: Ashgate.
Keightley, K. (1996), ‘“Turn It Down!” She Shrieked: Gender, Domestic Space, and High
Fidelity, 1948–59’, Popular Music, 15 (2): 149–177.
Krukowski, D. (2017), The New Analog: Listening and Reconnecting in the Digital World, New
York: The New Press.
Loesser, A. (1954), Men, Women, and Pianos: A Social History, New York: Simon and Schuster.
Ord-Hume, A. W. J. G. (2001), ‘Mechanical Instrument’, in L. Macy (ed.), Grove Music Online.
doi: 10.1093/gmo/9781561592630.article.18229.
‘The Phonograph’ (1877), New York Times, 7 November: 4.
Record Store Day (2019), ‘Home’. Available online: https://www.recordstoreday.com/
(accessed 3 August 2019).
Richards, G. (1923), ‘Against the Gramophone’, Gramophone 1 (June): 36.
Robertson, B. (2012), Reverberations of Dissent: Identity and Expression in Iran's Illegal Music
Scene, London: Continuum.
Scholes, P. (1921), Learning to Listen by Means of the Gramophone: A Course in Appreciation,
London: Gramophone Company.
Sterne, J. (2012), MP3: The Meaning of a Format, Durham, NC: Duke University Press.
Taylor, T. D., M. Katz and A. Grajeda, eds (2012), Music, Sound, and Technology in America:
A Documentary History of Early Phonograph, Cinema, and Radio, Durham, NC: Duke
University Press.
Thompson, E. (1995), ‘Machines, Music, and the Quest for Fidelity: Marketing the Edison
Phonograph in America, 1877–1925’, Musical Quarterly, 79 (Spring): 131–171.
Weidman, A. (2010), ‘Sound and the City: Mimicry and Media in South India’, Journal of
Linguistic Anthropology, 20: 294–313.
Williams, O. (1923), ‘Times and Seasons’, Gramophone 1 (June): 38–39.
26
Interpreting the Materials of a Transmedia Storyworld: Word-Music-Image in Steven Wilson's Hand. Cannot. Erase. (2015)
Lori A. Burns and Laura McLaren

Introduction
Building upon a long-standing tradition of musical transmediality in the creative work
of artists such as The Beatles, David Bowie and Pink Floyd, contemporary musical artists
increasingly mobilize new digital technologies and social media platforms to build
transmedia storyworlds. With the advancement of media technologies, artistic strategies
for developing an album package have expanded from images and words to promotional
videos, cinematic and animated films, illustrated books and comics, music videos, concert
films, staged musicals, video games and internet blogs (see Kelly 2007; Holm-Hudson 2008;
McQuinn 2011; Richardson 2012; Jost 2015; Rose 2015; Burns 2016, 2018). Henry Jenkins
understands the convergence of such materials as a vehicle for telling transmedia stories,
in which each text makes its own ‘distinctive and valuable contribution’ to the overall
expression (2006: 95). For recording artists, such materials enhance and expand the world-
building potential of their musical storytelling, providing the spectator with a myriad of
ways to interact with the cultural content. This chapter examines an elaborate example of
a transmedia concept album that shapes a compelling narrative of human experience and
encourages the spectator to explore dynamic perspectives on an extensive gathering of
multimodal materials. Relying upon recent theories of intermediality and transmediality
(see Jenkins 2006; Herman 2010; Ryan 2014; Ryan and Thon 2014; Burns 2017, 2018), we
consider progressive rock artist Steven Wilson’s approach to transmedia storytelling in his
2015 album Hand. Cannot. Erase. In order to respond to the complex materials of this
work, we have developed an innovative platform that presents our analysis in an interactive
media framework. In so doing, we invite the reader to engage – as a spectator-analyst –
with the very materials that comprise Steven Wilson’s Hand. Cannot. Erase., curated and
interpreted to illuminate their musical-cultural significance.
For all of his solo albums to date – Insurgentes (2008), Grace for Drowning (2011), The
Raven That Refused to Sing (And Other Stories) (2013), Hand. Cannot. Erase. (2015) and
To The Bone (2017) – Steven Wilson has collaborated with photographer Lasse Hoile and
designer Carl Glover to produce elegant hardbound books (10” x 10”) for limited edition
releases that include bonus demos and 5.1 mixes of the songs.1 The Raven takes Wilson’s
storytelling to a new level, as each song is connected to a short story, written by Hajo
Müller.2 Also beginning with The Raven, the limited editions have included Hoile’s studio
documentary films and Wilson entered into collaboration with Jess Cope, an animation
director who has created music video treatments for songs on The Raven, Hand. Cannot.
Erase. and To The Bone. The limited edition book for Hand. Cannot. Erase. (hereafter
H.C.E.) presents a diverse assemblage of materials (personal notes, photos and artefacts, i.e.
birth certificate, report card, diary, newspaper clippings) in ‘scrapbook’ form. Taking these
materials to the realm of social media, Wilson also launched a blog (no longer available
at www.handcannoterase.com), which accomplished a number of important tasks: it
simulated the protagonist’s timeline for writing the scrapbook entries between 2008 and
2015, thus reinforcing her credibility as a real subject; it served as an album teaser and built
anticipation for the album release; and it offered a free alternative for fans to participate in
the transmedia experience without having to order the deluxe edition ‘scrapbook’, in which
many of the photos and almost all of the blog posts are reproduced, with the exception of
the final seven posts from 1 January to 11 February 2015.3
Wilson declares that H.C.E. is ‘driven by narrative’ and that the materials account
for 'all of the curves of someone's life' (Blum 2015). The narrative was inspired by the true
story – captured in the drama-documentary film Dreams of a Life (2011) – of Joyce Carol
Vincent, who died in a London apartment, but whose body was not found for three years.
To summarize the story that emerges in the H.C.E. narrative, the female subject (only
ever referred to as H) experiences isolating circumstances in an urban setting while her
memories transport her to other places and times. Her ideas and perspectives are conveyed
not only through the song lyrics but also the writings of her personal blog, the scrapbook,
photos and artefacts. The blog accounts for her present situation but features flashbacks to
past experiences and relationships. H is an artist who resists emotional attachment and is
both drawn to and repelled by the narcissism of social media. In the course of the narrative,
her sister (J) and cat (Laika) both ‘disappear’, she is detached from her brother, and she
feels distrust of the ‘visitors’ who enter her apartment. As H experiences deeper and deeper
isolation and a loss of reality, she contemplates her own disappearance. The biographical
style of H.C.E. is enhanced further by the music videos for the songs ‘Hand Cannot Erase’
(directed by Lasse Hoile), ‘Perfect Life’ (directed by Youssef Nassar) and ‘Happy Returns’
(also directed by Nassar), which feature the female subject of the scrapbook photographs
in her apparent real-life settings. To make the photography and the videography even
more realistic, the actress who assumes the role of the female protagonist, Karolina ‘Carrie’
Grzybowska, provided personal photographs from her own childhood to be featured in
the album materials. Notably, the video for ‘Routine’ steps away from the biographical
representations and offers instead what narratologists would refer to as a character
focalization, as the subject becomes fascinated by the tragic situation of another woman
whose story is reported in the news.
Working with the H.C.E. materials, it becomes apparent that the expressive content in
the scrapbook and blog is meant to correspond to specific songs on the album. Adopting
an approach of 'narrative immersion' (Herman, Jahn and Ryan 2005: 238), the
spectator can listen to the album, read the scrapbook and/or blog, and watch the videos,
making connections between the songs and the materials that amplify the lyrical story.4 To
interpret this intersecting network of artistic materials, the spectator-analyst is challenged
to understand the dialogue between musical production and the media artefacts, to absorb
the multimodal discourse of the recorded tracks and music videos, to discover the ways
in which individual texts are connected in an intermedia relationship to other texts in the
network, and to grasp the cohesive narrative that emerges in and through the transmedia
platform.
Table 26.1 reproduces the release timeline of the H.C.E. materials and tour. On the first
day of 2015, a ‘Teaser’ video was released on YouTube, offering a passage from Track 1 of
the album (‘First Regret’) and pointing fans to the handcannoterase.com website, which
comprised the blog listings from 8 October 2008 until 1 January 2015. From that date
forward, the seven remaining posts appeared on the blog site in real time (4 January;
12 January; 23 January; 27 January; 10 February; 28 February; and 2 March 2015). While
these blog posts built anticipation, the first music video (‘Perfect Life’) was released on
4 February, a ‘Reveal’ video for the Deluxe Edition appeared on YouTube on 15 February,
and the CD, Vinyl and Deluxe Edition were all released on 27 February. The official tour

Table 26.1  Release timeline of the Hand. Cannot. Erase. materials and tour
1 January 2015 ‘Hand. Cannot. Erase. Teaser’, video (Wilson 2012)
1 January 2015 Blog goes live, with entries from 8 October 2008 to 1 January 20156
4 February 2015 ‘Perfect Life’, music video (Wilson 2015b)
15 February 2015 ‘Hand. Cannot. Erase. Deluxe Edition Reveal’, video (Wilson 2015a)
27 February 2015 CD and vinyl release
27 February 2015 Deluxe edition ‘scrapbook’ release
12 March 2015 H.C.E. Tour begins (Corn Exchange, Cambridge, UK)
13 June 2015 Yahoo Music Live Nation Livestream (The Wiltern, Los Angeles, CA)
29 October 2015 'Routine', music video (Wilson 2015c)
16 May 2016 'Hand Cannot Erase', live film (Wilson 2016a)
28 October 2016 'Happy Returns', music video (Wilson 2016b)
2 December 2016 H.C.E. Tour ends (NH7 Bacardi Weekender, Pune, India)
began shortly thereafter on 12 March 2015, with a livestream performance appearing on
Yahoo Music Live Nation on 13 June. Attendees of the tour witnessed the projection of
not only images from the Teaser video and ‘Perfect Life’ video (already available to fans
on YouTube), but also new music videos for ‘Routine’, ‘Hand Cannot Erase’ and ‘Happy
Returns’, videos not available on YouTube until several months later: October 2015, May
2016 and October 2016, respectively. The rollout of the H.C.E. materials was thus tied both
to the album release as well as the tour, carefully sequenced to build anticipation for the
materials connected to the project.5
We present the analysis that follows as an interpretive curation of Wilson's artistic work;
we have therefore used two digital tools – the Omeka content management platform
(https://omeka.org/) and the Storymap tool (https://storymap.knightlab.com/) – to assemble
the materials in the style of an interpretive exhibit. The reader
is encouraged to visit the Omeka page (http://biblio.uottawa.ca/omeka1/transmedia/hand-
cannot-erase), which hosts two storymaps:
● The ‘Narrative Timeline’ offers a pathway through selected images and texts,
following the temporal narrative of events in H’s life. This is not the order of the
album materials but rather the order that the spectator-analyst might construct for
the temporal contexts of H’s story. For the reader who is able to explore the Omeka
exhibit, the ensuing discussion of the Narrative Timeline will be supported by a full
narrative immersion.
● The ‘Musical Timeline’ respects the order of the songs on the album and offers a
sonic map to the full tracklist.7 The reader is invited to explore the musical analyses
provided here.

Narrative Timeline of H.C.E.


Entering the page for the Narrative Timeline, the reader will find an interpretive summary
of the album narrative. With Tracks 1 and 2, Wilson adopts the device known as in medias
res (‘into the middle of things’), a common storytelling strategy that places the crisis of
the story at the opening, after which flashbacks and additional events fill in the necessary
storyline until the crisis moment returns and is resolved (Baldick 2015).8 H.C.E. features
three significant moments of memory flashback (Tracks 3, 4 and 8), which appear
in reverse chronological order: the most recent flashback, to college life, in Track 3;
a memory from adolescence in Track 4; and the deepest recollection, from childhood, in
Track 8. Track 5 stands out as a ‘timeless’ moment that does not represent H’s own life
but rather offers a related story that captures her interest. The remaining tracks follow the
timeline of the blog: Tracks 6–7 are associated with the period 2008–2011; Track 9 with
the period 2012–2014; Track 10 with December 2014; Tracks 1–2 with early 2015; and
finally Track 11 with February 2015. The in medias res opening explores H’s moment of
decisional crisis, to ‘stay’ in this world or to ‘disappear’. Since Track 11 has no lyrics, we
understand that decision only through the content of the blog entry, in which
H declares that she is ‘ready to leave’.
The second table on the Narrative Timeline page reorders the songs to recreate
the linear timeline, from childhood to the present. We are not suggesting that Wilson
should have produced the songs on the album in this order, any more than we would
suggest that modern filmmakers or novelists should re-cut their films to avoid the strategy
of in medias res. Rather, our reordering of the tracks is meant to demonstrate how the
timeline evolves and how H appears to arrive at her decision.
We now invite the reader to start exploring the Narrative Timeline by navigating with
the forward arrows in the storymap. Event 1 in the timeline of H’s experiences is a memory
from childhood that emerges in Track 8 (‘Transience’). Connected to the childhood period
are a number of artefacts in the scrapbook: childhood photos, H’s birth certificate and a
‘child observation’ document (presumably written by a teacher or psychologist) identifying
her tendency towards isolation and her creative spirit. The blog entry from 9 January 2009
describes her parents as preoccupied with their own interests and recalls a phrase that her
father often used: 'It's only the start.' This line then appears in the lyrics to 'Transience',
which portray a family train trip from the perspective of a third-person narrator. The
scrapbook reproduces the lyrics over an image of a vacation day by the sea. Musically, this
song captures the childhood memory, overshadowed by a layer of darkness. The song is
characterized by a simple form, a gentle triplet pattern on the finger-picked guitar, and an
intimate vocal presentation for the verses in contrast with a more layered and distant vocal
for the chorus. While these features might connote ‘innocence’, an ominous low synth tone
cracks the veneer at several points.
Event 2 in the Narrative Timeline places H at the age of thirteen, when she was strongly
attached, for a six-month period, to a sister who was three years older (referred to as J at
points in the blog). The blog entry of 12 August 2009 describes their relationship and explains
that her parents' subsequent separation led to her sister's rehousing. The mixtape,
represented as an artefact, lists songs that reveal a preoccupation with disappearance,
escape and nostalgia. Most of the blog text and a few of the songs from the mixtape appear
in the lyrics to Track 4 (‘Perfect Life’), although the key information about the parents’
separation is left out. The lyrics to the song appear in an artefact reproduction of a little
book, Key of Skeleton, which is inserted into the pages of the scrapbook. In the song setting,
the lyrics are delivered in a spoken word passage by a female voice (Katherine Jenkins)
over a 4/4 groove presented by an electronic drum kit, with sonic fragments that float into
the ambient texture. The spoken word section is followed by a repeated chorus, which
declares: ‘We have got the perfect life,’ while the texture builds to include a jangly guitar
with an oscillating stereo delay effect. Harmonically, the song is unresolved (closing on the
subdominant). The video also points to an unresolved ending, as scenes of the teenage girls
spending time in an open park end with the younger girl alone, crying for her sister, while
the bicycle they shared stands empty.
Event 3 takes us to some college memories. In a blog entry (September 2009) H admits
her tendency to ‘hide in plain sight’, and she plays artistically with the reproduction
of a photo, in which the surrounding figures are visible in the blog but blacked out in
the scrapbook. She also writes (December 2009) about a boyfriend, suggesting that the
relationship made her feel as if she was ‘playing a role in a dream of a life’ (Dreams of a Life
2011), the latter reference pointing to the title of the film that was the inspiration for the
album. The lyrics to Track 3 (‘Hand Cannot Erase’), which appear in the form of a postcard
to a friend, suggest the love in that relationship to be enduring, while also revealing her
detachment as she needs ‘more space’. Musically, the song offers an accessible song form,
an upbeat 4/4 and post-punk style. The black and white ‘live film’ created by Lasse Hoile
was displayed behind the band on the H.C.E. tour. In it, H and her male partner are seen
embracing and caressing one another, with their movements being tracked in slow motion.
At the introduction of the chorus, they are doused with water, possibly suggesting what
happens when ‘life gets hard’, as the lyrics suggest. While the streams of water initially drive
the couple apart, the video shows them alternately coping with the onslaught alone and in
each other’s arms, where they remain as the water finally stops. Though we know that H
eventually broke off the relationship due to her isolated tendencies, the video hints at what
might have happened if she had turned to him for strength.
Event 4 represents the moment when H begins writing her blog and thus brings us
into the ‘present’ of her narrative, revealing her choice of the city as ‘the place to hide’
(blog, 30 April 2010) and her fascination with social media. The lyrics to Track 6 (‘Home
Invasion’) juxtapose the attachments that can result from the internet (‘download love’),
with the negative effects of that medium for escape (‘I have lost all faith in what’s outside’).
With this song, the music takes a very dark turn, especially in comparison with the
flashback songs associated with Events 1 through 3. An extended instrumental intro driven
by a jam session gives way to a funky synth groove supporting Wilson’s gritty vocal. Gone
is any attempt to see relationships in a positive light. The song links immediately to the
subsequent Track 7 (‘Regret #9’).
Event 5 continues with H reflecting on city life and experiencing a failed effort to form
an attachment to a cat. In the blog entry of 11 September 2011, H considers the potential
of humans to feel regret when they do not make commitments, juxtaposing this idea with
a comment about the ‘visitors’ who ‘prowl her flat’. Even though it is an instrumental track
with no lyrical cues to confirm the connection, this blog entry is linked to Track 7 (‘Regret
#9’) by means of a strong sonic cue: as a representation of the visitors in her apartment, the
track opens with the faint sound of a male voice reciting numbers (over a transparent texture
of deep bass tones and a lilting 6/8 kit pattern). The distance of the voice suggests that H
is aware of the visitors but not really engaged with them. The suggestion of her deepening
isolation is supported by the musical content of the track, which follows immediately from
Track 6 and functions like an extended outro for that song. Intense solos on the Moog
synth followed by a thick, saturated guitar suggest the internal workings of H’s mind, in
seeming disregard of the visitors.
Event 6 focuses on H’s self-declared fascination with ‘stories of the disappeared’. The
theme of disappearance is manifest throughout H.C.E. in news clippings about missing
women. Her 9 March 2011 blog entry concludes that she finds it appealing to consider
the possibility of ‘leaving everything behind’. One such clipping (from 1993) reports the
story of a woman, Madeline Hearne, who lost her children and husband in a reported
school shooting and subsequently went missing. This particular case becomes the focus
of Track 5 (‘Routine’), as the voice of Madeline Hearne emerges to convey her feelings of
grief; the song thus represents a moment in the H.C.E. narrative that focalizes another
subject’s experience. The musical setting stands out for its adherence to a number of
progressive rock conventions: an extended progressive form, a lilting 5/4 metre, expanded
vocal resources including boy soprano, guest artist Ninet Tayeb and choir, a wide dynamic
range and vast sonic space. The song climaxes with a vocal scream that is followed by
a calming postlude. The video stands out as the only animated clip associated with the
album, featuring the sophisticated stop-motion animation created by Jess Cope.9 In the
video scene that immediately precedes the scream, the subject comes across the newspaper
clipping that reported the case. In her detailed attention to the aesthetic content and design
of H.C.E., Cope ensures that the material values of the album are developed in her own
artistic contribution to the transmedia storyworld.
Event 7 continues to follow H’s fascination with isolation and disappearance, but now she
seems to contemplate her own disappearance in more explicit terms. In the blog entry of
28 March 2013, she considers the possibility of ‘coming back’, just like the deceased woman
in an unfathomable story, although the visitors tell her, ‘no one ever comes back’. This line
points strongly to Track 9 (‘Ancestral’), in which the bridge and outro lyrics express the
idea of coming back. As she commits further to the idea of disappearing entirely, she asks
herself, ‘Why do I still stay?’ (blog, 2 December 2014). The longest track on the album,
this song offers a complex form, chromatic harmonic development and challenging metric
alternation between 7/4 and 4/4. Through a number of intertextual links between ‘Routine’
and ‘Ancestral’, we can understand the latter as H’s own journey into the stories of loss and
disappearance that she has imagined in her fascination with stories of the disappeared.
Event 8 in the Narrative Timeline takes us to December 2014, with H writing a letter
to her estranged brother. The words of the letter comprise the lyrics to Track 10 (‘Happy
Returns’), an attempted moment of intimacy that is never realized, as the letter is left
unfinished and unsent; her final words, 'I'll finish this tomorrow,' are left without resolution
in the materials of the album. Her excuse not to continue writing (‘I’m feeling kind of
drowsy now’) points to her use of medication, which is referenced in the music video when
we see her imagined goodbye to her brother and subsequently taking a pill (2:29). ‘Happy
Returns’ is signalled by Wilson as a key moment in the narrative, by means of a series
of strong musical connections to Tracks 3 (‘Hand Cannot Erase’) and 4 (‘Perfect Life’).
By referring to these two songs, ‘Happy Returns’ recalls H’s memories of her formative
relationships with her college boyfriend and sister. The track opens with the natural
sounds of a thunderstorm that transform into the oscillating delay from Tracks 3 and 4.
Where, in those two songs, the stereo oscillation was fairly narrow, it is now quite wide;
we interpret this as a representation of how her life is coming apart in ways that were
foreshadowed even in those early relationships. Track 10 also recalls significant moments
in Track 1 (‘First Regret’): the oscillating delay and piano melody of Track 1 are reprised for
the opening of ‘Happy Returns’, marking a musical-formal reprise to conclude the album
and also to explicitly connect these two tracks. We especially appreciate this connection in
the context of the in medias res Narrative Timeline, since Track 1 is understood to follow
directly from ‘Happy Returns’. With this range of intertextual references within the album,
Wilson both signals ‘Happy Returns’ as a strongly marked moment in the H.C.E. narrative
and offers coherent pathways for the listener to understand the earlier
tracks on the album.
Event 9 takes us to the moment that we identified earlier as the in medias res opening of
the album. Tracks 1–2 (‘First Regret’ and ‘3 Years Older’) show H to be arriving at a moment
of decisional crisis: while she has been reflecting throughout the blog and scrapbook on her
tendency to self-isolate, her obsession with the notion of disappearance and her distance
from loved ones, we now see her to be moving ever closer to a decision about ‘disappearing’.
In January 2015 (the moment when Wilson’s H.C.E. teaser video was released and the blog
went live) the blog entries begin to feature retrospective content, as she refers to lyrical
lines and concepts from previous songs (i.e. the 1 January reference to ‘hand cannot erase’;
the 4 January reference to the lyrical line ‘awning of the stars’ from Track 6; the 12 January
reference to a photo of her and her sister which she has never seen before and does not
think was ever taken; and the 23 January reference to her sister's mixtape). The blog entry
from 27 January suggests that H is losing touch with reality, as she sees and interacts with
her thirteen-year-old self, who says that their sister is ‘waiting for us’. The drawing attached
to this blog entry is from the Key of Skeleton book that was connected to ‘Perfect Life’
(Event 2). By 10 February, the blog indicates that she has given up, as she has not left her
apartment in over a week and wonders ‘how long it will be before anyone notices me’.
The music associated with these feelings of deep isolation and detachment, Tracks 1–2
(‘First Regret’ and ‘3 Years Older’), has a double function: first, to set the stage for the
complete album as they are the opening tracks, and second, to raise the curtain on the
crisis moment – the in medias res scene that draws the spectator into the narrative – which
actually occurs later in the temporal scheme. Fulfilling these two functions, Track 1 sets
the stage for the album by establishing some significant sonic materials: the oscillating
delay effect that is later identified as having its source in the flashback songs (Tracks 3
and 4); the dark and intimate piano melody that will be reprised when H writes to her
brother in ‘Happy Returns’; and the ambient sounds and voices (also occurring in Tracks
5, 6, 7 and 9) that situate H in relation to her environment and her memories of human
interaction. Track 1 also brings us into the moment of H’s decisional crisis as she moves
towards ‘disappearance’: the crescendo in the mellotron and the heavy, dark heartbeat at
the end of the track point ahead to her mental and physical isolation in Tracks 6, 7 and 9.
Track 2 also fulfils these two functions: it sets the sonic stage for the album with its shifting
textures, dynamics and ideas, and it raises the curtain on the moment of crisis by exploring
an expanded form that connects to other musical moments of deep internal reflection and
decision-making.
Event 10 marks the end of the journey, and is stunningly minimal in its material
content. In the final blog entry of 28 February 2015, H declares herself ready to disappear.
The self-portrait in the scrapbook features her face completely painted over, with streams
of paint flowing from her closed eyes. The final image in the scrapbook and the blog of
lights in the sky connects to several earlier comments about the sky as well as to the title
of Track 11 (‘Ascendant Here On …’). Growing immediately out of ‘Happy Returns’, this
brief final track functions as a choral outro for the album, balancing the instrumental
intro that was Track 1 (‘First Regret’). In a lush, rounded, distant and ethereal production,
the choir vocalizes with piano chords and ambient sounds, reprising the harmonic
progression from ‘Perfect Life’.

Interpretive conclusions
Based on this narrative immersion in the transmedia storyworld of H.C.E., the interpreter
is led to conclude that the loss of her sister was a critical juncture for H; she never recovered
from that traumatic loss and was subsequently unable to sustain meaningful attachments.
Just as the sister disappeared – and perhaps because the sister disappeared – H spends the
rest of her life wanting to disappear. The transmedia materials document and communicate
the subject’s gradual descent into total isolation and disappearance. The spectator is thus
meant to engage with a compelling human chronicle as it is conveyed in and through
artistic means. Wilson’s self-declared intention to write ‘the curves of someone’s life’ is
realized in a transmedia narrative that provides meticulous documentation of H’s personal
tragedy.
The specific pathway of engagement is, however, not prescribed; the design of
this transmedia setting allows for multiple avenues to be explored, as the album, blog,
scrapbook, music videos and live show each offer unique narrative commentaries.
While the songs alone present sonic and lyrical clues to the story, an encounter with the
transmedia materials leads to a more robust understanding. The dated entries in the blog
and scrapbook allow the reader to situate the lyrics within significant events in H’s life and
contribute to her authenticity as the subject and author of a personal blog or scrapbook.
Finally, the visual elements such as the photos, artefacts and music videos elaborate the
transmedia storyworld, allowing the spectator to witness the main character in particular
settings and social spaces. The artefacts tucked into the scrapbook further authenticate the
female subject. Taken collectively, the material elements – musical, textual and visual –
contribute cohesively to H’s story.
Whereas the live concert tour might immerse the spectator in song performances
coordinated with the backdrop of music video images, the experience of listening to the
album while reading the scrapbook or blog creates another style of narrative immersion.
We found that we could follow the scrapbook at the same pace as the music, pausing our
reading, for example, when the image or text depicted what was happening in a given song.
This encouraged periods of reflection, including instances
of cross-referencing pages in order to substantiate material connections. For the most part,
we discovered that simultaneous listening and reading revealed a complementary narrative,
as the song order seemed to correspond intelligently with the layout of the scrapbook.
This approach to the transmedia work allows for aural, visual and tactile absorption, a
multimodal immersion that is enhanced by the loose artefacts tucked into the book as well
as the realism of the photography, which portrays H (Karolina Grzybowska) as a credible
subject in tangible settings and actions. By the end of the album and book/blog, the spectator
can receive H as a plausible subject and can form a response to her striking disappearance.
Although Wilson does not suggest an unequivocal ending for the story, the references to
the medicine cabinet (Track 2) and the hint that H is slipping into sleep (Track 10) reveal her
attempts to disappear with drugs. The very brief image of H placing a pill on her tongue in
the music video for ‘Happy Returns’ is a visual confirmation of that interpretation. Wilson’s
narrative thus offers some critical messages about the post-millennial human condition:
he reveals the negative consequences of urban isolation and the myth of social media
connectivity; he cautions us about the potentially destructive consequences of drugs; and
ultimately he underscores the importance of human relationships and social structures
(e.g. the ‘visitors’, who might be interpreted as social workers) that are there to help.
This profound social commentary emerges not only in the sophisticated transmedia
materials but is also deeply embedded in Wilson’s musical content and expression.
Through his formal structures, as well as through his timbral resources, Wilson represents
H’s subjectivity and perspective. He mobilizes a range of devices – ambient sounds, sonic
fragments, oscillating delays, guitar tones, instrumental developments, etc. – to build a
musical narrative that answers many of the puzzling questions the transmedia spectator
may experience in the engagement with this album.

Notes
1. For the many readers who do not have access to the limited edition book, it is possible to view
an ‘unboxing video’ for all of Wilson’s limited editions: Insurgentes, see https://www.youtube.
com/watch?v=RAPqPRLo23g (accessed 3 August 2019). Grace for Drowning, see Lock 2011; The
Raven, see Lock 2013; Hand. Cannot. Erase., see Wilson 2015a; and To The Bone, see Lock 2017.
2. Burns 2017, 2018 offer studies of the transmedia storytelling in two songs on The Raven:
‘Drive Home’ and ‘The Raven That Refused to Sing’.
3. These seven blog posts, which all appeared online in the lead-up to the album release,
portray H’s deepening descent into isolation: she becomes increasingly aware of her
solitude, reminisces about her lost sister, and dreams of empty buildings and encounters
with her younger self.
4. This kind of spectator engagement is not new to progressive rock but rather builds on
that genre’s fascination with extra-musical materials. For instance, Camel’s 1975 musical
adaptation of Paul Gallico’s novella The Snow Goose (1971) comes to mind, a book-album
experience familiar to many progressive rock fans.
5. For discussions of such new economic models of the music industry, see Tschmuck 2017;
Wikström 2013.
6. The blog was available at handcannoterase.com but has since been taken down. It can be
viewed on archive websites such as the Wayback Machine and Archive.is. For example,
for the blog as it appeared on 4 January 2016, after the tour ended, see https://web.archive.
org/web/20161207192455/http://handcannoterase.com/
7. The spectrographic data in the Musical Timeline images were derived from Sonic
Visualiser, with three layers from top to bottom: peak frequency spectrogram (channels
mixed); melodic range spectrogram (channels mixed); and waveform (channels separate).
The analytic annotations were added in a PDF editing tool, Preview.
8. We see this in many pieces of literature and film, from Homer’s Odyssey to the film City of
God (2002).
9. Cope, of Owl House Studios, also directed the videos for Wilson’s ‘Drive Home’ and ‘The
Raven’ (Owl House Studios 2018).

Bibliography
Baldick, C. (2015), ‘In medias res’, in Oxford Dictionary of Literary Terms, 4th edn, Oxford:
Oxford University Press.
Blum, J. (2015), ‘Genius. Doesn’t. Fade. A Conversation with Steven Wilson’, PopMatters,
12 March. Available online: https://www.popmatters.com/genius-doesnt-fade-a-
conversation-with-steven-wilson-2495558585.html (accessed 3 August 2019).
Burns, L. (2016), ‘The Concept Album as Visual–Sonic–Textual Spectacle: The Transmedial
Storyworld of Coldplay’s Mylo Xyloto’, Journal of the International Association for the Study
of Popular Music, 6 (2): 91–116.
Burns, L. (2017), ‘Multimodal Analysis of Popular Music Video: Genre, Discourse, and Narrative
in Steven Wilson’s “Drive Home”’, in C. Rodrigues (ed.), Coming of Age: Teaching and
Learning Popular Music in Academia, 81–110, Ann Arbor, MI: University of Michigan Press.
Burns, L. (2018), ‘Transmedia Storytelling in Steven Wilson’s “The Raven That Refused to
Sing”’, in C. Scotto, K. M. Smith and J. Brackett (eds), The Routledge Companion to Popular
Music Analysis: Expanding Approaches, 95–113, New York: Routledge.
City of God (2002), [Film] Dir. F. Meirelles and K. Lund, Brazil: Miramax Films.
Dreams of a Life (2011), [Film] Dir. C. Morley, United Kingdom: Dogwoof Pictures.
Herman, D. (2010), ‘Word-Image/Utterance-Gesture: Case Studies in Multimodal Storytelling’, in
R. Page (ed.), New Perspectives on Narrative and Multimodality, 78–98, New York: Routledge.
Herman, D., M. Jahn and M.-L. Ryan (2005), Routledge Encyclopedia of Narrative Theory,
London: Routledge.
Holm-Hudson, K. (2008), Genesis and the Lamb Lies Down on Broadway, Aldershot: Ashgate.
Jenkins, H. (2006), Convergence Culture: Where Old and New Media Collide, New York: New
York University Press.
Jost, C. (2015), ‘Popular Music and Transmedia Aesthetics: On the Conceptual Relation of
Sound, Audio-Vision and Live Performance’, in E. Encabo (ed.), Reinventing Sound: Music
and Audiovisual Culture, 2–13, Newcastle upon Tyne: Cambridge Scholars Publishing.
Kelly, J. (2007), ‘Pop Music, Multimedia and Live Performance’, in J. Sexton (ed.), Music,
Sound and Multimedia: From the Live to the Virtual, 105–120, Edinburgh: Edinburgh
University Press.
Lock, D. (2011), ‘Grace for Drowning by Steven Wilson Deluxe Edition Unboxing’, YouTube,
29 September. Available online: https://www.youtube.com/watch?v=z-9MZTEQqbU
(accessed 3 August 2019).
Lock, D. (2013), ‘Prog Review Extra – The Raven That Refused to Sing Deluxe Edition’,
YouTube, 23 March. Available online: https://www.youtube.com/watch?v=jZnyyqHg6II
(accessed 3 August 2019).
Lock, D. (2017), ‘FIRST LOOK – To The Bone Deluxe Edition – Steven Wilson’, YouTube,
29 August. Available online: https://www.youtube.com/watch?v=im7UjGdE0rI (accessed
3 August 2019).
McQuinn, J. (2011), Popular Music and Multimedia, Aldershot: Ashgate.
Owl House Studios (2018), ‘Home’. Available online: http://www.owlhousestudios.com/
(accessed 3 August 2019).
Richardson, J. (2012), An Eye for Music: Popular Music and the Audiovisual Surreal, Oxford:
Oxford University Press.
Rose, P. (2015), Roger Waters and Pink Floyd: The Concept Albums, London: Fairleigh
Dickinson University Press.
Ryan, M.-L. (2014), ‘Story/Worlds/Media: Tuning the Instruments of a Media-Conscious
Narratology’, in M.-L. Ryan and J.-N. Thon (eds), Storyworlds Across Media: Toward a
Media-Conscious Narratology, 25–49, Lincoln, NE: University of Nebraska Press.
Ryan, M.-L. and J.-N. Thon, eds (2014), ‘Storyworlds Across Media’, in M.-L. Ryan and J.-N.
Thon (eds), Storyworlds Across Media: Toward a Media-Conscious Narratology, 1–12,
Lincoln, NE: University of Nebraska Press.
Tschmuck, P. (2017), The Economics of Music, Newcastle-upon-Tyne: Agenda Publishing.
Wikström, P. (2013), The Music Industry: Music in the Cloud, Cambridge: Polity Press.
Wilson, S. (2012), ‘Hand. Cannot. Erase. Teaser’, YouTube, 1 January. Available online: https://
www.youtube.com/watch?v=uXKpm-y3n1M (accessed 3 August 2019).
Wilson, S. (2015a), ‘Steven Wilson – Hand. Cannot. Erase. Deluxe Edition Reveal’, YouTube,
15 February. Available online: https://www.youtube.com/watch?v=HLsgd4L-nAM
(accessed 3 August 2019).
Wilson, S. (2015b), ‘Steven Wilson – Perfect Life’, YouTube, 4 February. Available online:
https://www.youtube.com/watch?v=gOU_zWdhAoE (accessed 3 August 2019).
Wilson, S. (2015c), ‘Steven Wilson – Routine’, YouTube, 29 October. Available online: https://
www.youtube.com/watch?v=sh5mWzKlhQY (accessed 3 August 2019).
Wilson, S. (2016a), ‘Steven Wilson – Hand. Cannot. Erase’, YouTube, 16 May. Available online:
https://www.youtube.com/watch?v=A64J8mo8oZE (accessed 3 August 2019).
Wilson, S. (2016b), ‘Steven Wilson – Happy Returns (from Hand. Cannot. Erase.)’, YouTube,
28 October. Available online: https://www.youtube.com/watch?v=l7uls7grg5Y (accessed 3
August 2019).

Discography
Camel (1975), The Snow Goose, Decca SKL-R 5207.
Wilson, Steven (2008), Insurgentes, Kscope 808.
Wilson, Steven (2011), Grace for Drowning, Kscope 176.
Wilson, Steven (2013), The Raven That Refused to Sing (And Other Stories), Kscope 242.
Wilson, Steven (2015), [CD] Hand. Cannot. Erase, Kscope 316.
Wilson, Steven (2015), [vinyl] Hand. Cannot. Erase, Kscope 875.
Wilson, Steven (2015), [Deluxe edition, double CD, DVD, book, additional inserts] Hand.
Cannot. Erase, Kscope 522.
Wilson, Steven (2017), To The Bone, Caroline International 2557593020.
Index

1176 (see also UA1176 and Urei 1176) 341 Auslander, Philip 205, 206, 207, 211
Abbey Road 47, 63, 92, 103, 107, 109, 113, 120 Austin, Gene 13
Abel, Jonathan 328 authenticity 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
actor-network theory (ANT) 36, 47, 142, 147 29, 30, 91, 170, 188, 255, 295, 296, 401
Adorno, Theodor 206 authentic 5, 14, 15, 19, 20, 21, 23, 24, 25, 26, 27,
AES. See Audio Engineering Society 29, 30, 195, 245, 255, 259, 295, 370, 373
aesthetics 3, 27, 32, 37, 41, 45, 54, 71, 82, 108, Auvinen, Thomas 141, 161, 163, 165, 168, 169,
130, 131, 135, 190, 295, 296, 314, 317, 318, 170, 171, 172, 191
324, 329, 330, 388
affordances 9, 52, 114, 148, 149, 151, 153, 155, Babbie, Earl 38, 42, 43
156, 157, 187, 191, 266, 336, 338, 339, 340, Babyface (Kenneth Edmonds) 29
342, 343, 345, 384, 389 Bacharach, Burt 117
agency 11, 16, 41, 44, 54, 130, 170, 172, 188, 216, Bachelor, Mike 59
254, 255, 256, 259, 260, 265, 313, 335, 336, Badman, Keith 63, 64
353, 359, 363, 388 Baldick, Chris 396
Akai 76 Ballou, Glen 238, 241, 284
Akrich, Madeline 91 Bamman, Jones- 258
Albini, Steve 265 Banks, Azealia 254
Alesis 77 Bartlett, Jenny 237
Alexandrovich, George 63 Bates, Eliot 2, 67, 80, 108, 125, 126, 129, 130, 131,
Alleyne, Mike 5, 15, 19, 26, 27, 28, 188 132, 133
Ampex 23, 24, 52, 61, 62, 63, 64, 70, 115, 116, Bayless, James 114
117, 226 Bayley, Amanda 212
Anand, N 133 Beach Boys, The 64, 65, 70, 118
Angus, Jamie 323, 328 Beatles, The 35, 63, 64, 66, 67, 70, 103, 113, 116,
ANT. See actor-network theory 121, 141, 180, 198, 223, 229, 377, 393
API (Automated Processes, Inc.) 59 Becker, Gary 146
Apogee Electronics 83 Bee Gees 21, 28, 29
Armatrading, Joan 143 Bendall, Haydn 311
Armstrong, Louis 55 Benjamin, Walter 206
ARP 2600 (synthesiser) 76 Bennett, Andy 293
arranging 8, 163, 165, 166, 167, 168, 171, 187, Bennett, Joe 65, 93, 165, 170, 189, 190, 191, 192,
188, 196, 325 198, 312
Ashcroft, Bill 25 Bennett, Namoli 252
Association for the Study of the Art of Record Bennett, Samantha 2, 67, 81, 89, 93, 94, 130, 133
Production 34 Beranek, Leo 127
Audio Engineering Society 34, 44, 49, 104, 128, Berger, Harris 297, 326
138, 241, 269, 330, 371, 381 Bernhart, Nicholas 233, 238
Bijsterveld, Karen 35 Choueiti, Marc 84


Björk 250, 251, 260, 262 Chun, Marvin 149, 150
Blackwell, Chris 26 Ciani, Suzanne 223
Blake, Andrew 166 CLA. See Lord-Alge, Chris
Blanchard, Evelyn 118 Clark, Andy 150
Blier-Carruthers, Amy 205, 207, 209, 210, 211, 217 Clark, Fraser 234
Blumlein, Alan 370 Clark, Petula 118
Bolter, Jay 70 Clark, William 117
Booth, Gregory 41 Clarke, Eric 9, 27, 30, 147, 207, 212, 219, 326
Born, Georgina 84, 328 Cochran, Paul 62, 67
Borwick, John 234, 242 Cogan, Jim 117
Bourbon, Andrew 1, 147, 148, 211, 337 Cohn, Irving 53
Bourdieu, Pierre 36, 191 Columbo, Russ 57
Bowie, David 29, 319, 371, 393 composition 8, 10, 20, 28, 40, 77, 134, 145, 146,
Boy George 254 162, 164, 165, 166, 170, 172, 177, 185, 230,
Bradley, Simon 223 290, 295, 377
Braxton, Toni 29 compressor 2, 52, 59, 60, 63, 95, 100, 101, 103,
Breen, Christopher 322 275, 285, 305, 307, 332, 341, 342, 343
Bregman, Albert 325 Cook, Nicholas 207, 210
Brendel, Alfred 207, 211, 219, 220 Cope, Jess 394
Britten, Andy 96 Corbett, John 295
Brock-Nannestad, George 116 Corbetta, Piergiorgio 38, 40, 42, 44
Brøvig-Hanssen, Ragnhild 272, 274, 277 Corey, Jason 242, 243, 244
Bryman, Alan 41 Corgan, Billy 12
Bryon, Dennis 29 Costello, John 372
Buchla, Donald 223 Costello, Sean 318
Burges, Richard 5, 20, 21, 22, 23, 24, 41, 110, 163, Costey, Rich 340
165, 166, 178, 181, 182, 193, 246, 253, 269, Cottrell, Stephen 224
270, 303, 305, 307, 310, 311, 312, 313, 351, Cox, Billy 64
369, 370 Craig, Francis 54
Burns, Lori 393 Cronin, Dan 117
Bush, Kate 260 Crosby, Bing 57, 61, 66, 228
Butler, Mark 34 Csikszentmihalyi, Mihalyi 142, 191
systems approach 142
Camilleri, Lelio 328 Culshaw, John 63, 115, 211, 212, 220
Carlisle, Stephen 198 Cunningham, Mark 110
Carlos, Wendy 75 Curnow, Ian 306
Carlsen, Kristoffer 271
Carmel, Sefi 96 D’Antonio, Peter 129
Carpenter, Karen 252, 253 D’Errico, Mike 273, 290
Caruso, Enrico 110, 228 Daniel, Oliver 56, 103
Case, Alex 3, 174, 208, 297, 298, 319, 320, 323, Danielsen, Anne 79, 266, 267, 272, 274, 275, 277,
324, 325, 327, 329, 341, 403 278, 327
Castelli, Jon 98 Davis, Don 324, 325
Castelo-Branco, Gustav 212, 219 Davis, Larry 328
Cecil, Malcolm 143 Davis, Miles 24, 64, 143
Chalmers, David 150 Day, Timothy 65, 207, 209
Chess, Phil 116 dbx 283, 284, 285, 343
de Laat, Kim 194 ethnographer 35, 36


De Man, Brecht 45, 47 Everest, Frank Alton 128
Deahl, Dani 198
Decca 68, 112, 113, 115, 117, 186, 220, 404 Fabian, Dorottya 205, 207, 210, 217
Despagni, Anthony 371 Fairchild (manufacturer) 60, 63, 65, 89, 90, 91,
Devine, Kyle 84 100
Di Perna, Alan 223 Fairlight (sampling workstation) 76, 78, 165, 268,
Diamond, Beverley 256, 257 270
Dibben, Nicola 2, 250, 328 Fales, Cornelia 326
Dickason, Vance 283 Fauconnier, Gilles 154, 155, 158, 310
Dilla, J 273 Felstead, Charles 57
Dion, Celine 29 Fender, Leo 224, 225
Discogs 43 FET (field effect transistor) 52, 60, 64, 66,
Dissanayake, Ellen 151, 156 332
Distribution 5, 347, 349, 352, 367 fidelity 9, 14, 54, 59, 62, 65, 83, 90, 91, 92, 93, 95,
Dockwray, Ruth 328 96, 99, 103, 108, 207, 296, 370, 374, 384,
Dolby, Ray 372 385, 386, 387, 388, 390
Dolby, Thomas 225, 227 Fine, Thomas 372
Dowd, Tom 21, 62 Fink, Robert 326
Doyle, Peter 14, 51, 114, 126, 328 Fiske, Susan 150, 151
Dre, Dr. 227 Fitzgerald, Jon 193
Dredge, Stuart 198, 377 Flynn, Matthew 376
Dregni, Michael 225 Ford, Mary 23, 62, 336
Dreher, Carl 60 Foster, David 29
Droney, Maureen 92, 362 Fostle, D 119
Duraiswami, Ramani 328 Franinović, Karmen 149
Dylan, Bob 100 Freeman-Attwood, Jonathan 207
Frith, Simon 294, 296
ecological approach 142, 147
Ed Sheeran 193 Gabriel, Peter 22, 28
Eddy, Duane 115 Gaisberg, Fred 110, 209
Edison, Thomas 8, 31, 55, 110, 111, 121, 206, 247, Gaisberg, Frederick 169, 209
369, 381, 383, 384 Galuten, Albhy 21, 29
EDM (electronic dance music) 9, 165, 166, 168, Gaye, Marvin 5, 15, 28, 30, 143, 177
275, 324, 326 Gelatt, Roland 55, 66, 111
Eisenberg, Evan 22 Generale, Mariane 376
electroacoustic 9, 265, 326, 328, 342, 343 Gibb, Barry 29
embodied cognition 3, 7, 8, 13, 26, 28, 96, 130, Gibson, James 147, 148, 168, 190
148, 150, 217, 251, 253, 276, 295, 318, Giddens, Rhiannon 196
326 Giddins, Gary 57
Emerick, Geoff 47, 63, 64, 92, 113, 116 Glossop, Mick 181
Eminem 15, 227, 250 Glover, Carl 394
emotional labour 143 Glynne, Jess 99
Eno, Brian 72, 164, 170, 172, 289 Godfrey, Jem 322
Ensoniq (manufacturer) 76 Golding, Craig 304
Erlmann, Veit 45 Goldin-Perschbacher, Shana 251, 252
Escott, Colin 116 Gooderson, Matt 191
ethnography 34, 35, 36, 37, 42, 80, 211, 212 Goodman, Steve 282
Gordy, Berry 13, 192, 193 Hirstein, William 151


Gould, Glenn 62, 133, 207, 212, 214, 220 historiography 34, 38, 41
Granata, Charles 64 Hodgson, Jay 3, 275, 276, 315
Grassie, Colin 234 Hoile, Lasse 394, 398
Gray, Andy 96 Holly, Buddy 11, 183
Greene 3, 77, 330 Holmes, David 197
Grein, Paul 114, 118 Holmes, Thom 268
Griesemer, James 147 Holm-Hudson, Kevin 393
Griffiths, Gareth 25 Holt, Fabian 195
Gronow, Pekka 109 Honegger, Gitta 233
Grusin, Richard 70 Horn, Trevor 77, 170, 172, 177
Gruters, K. 10 Hornborg, Alf 130
Hot Chip 99
Haas, Michael 207 Houghton, Matt 324
Hajdu, David 193, 194 Howard, David 323, 328
Hajimichael, Mike 198 Howard, Mary 118
Halee, Roy 27 Howes, David 45
Hall, Ray 119 Howlett, Mike 5, 143, 147, 163, 164, 177, 211,
Hallifax, Andrew 207 253, 294, 301, 312
Hammond, John 54 Hull, Joseph 372
Hannigan, Lisa 13, 17 Hutchins, Edwin 155
Harding, Phil 93, 184, 190, 193, 303, 304, 306, Hyde, Lewis 224
307, 311, 313, 327
Harmonicats 53, 68, 115 Inglis, Ian 192
Harvith, John 110 Iovine, Jimmy 227
Harvith, Susan 110 Irwin, Mark 63
Hawkins, Martin 116 Izhaki, Roey 3, 233, 240, 242, 244, 297, 319, 324,
Hawkins, Stan 254 325, 327, 329, 340
Haynes, Agnes 388
Hazlewood, Lee 115 Jamerson, James 283
Heap, Imogen 135, 253 James, Alex 313
Heaven, Douglas 198 Jarman-Ivens, Freya 252
Helios (manufacturer) 59 Jenkins, Henry 393
Helseth, Inger 277 Johnson, Eldridge 111
Henderson, Stephen 192 Johnson, Mark 155
Hendrix, Jimi 64, 118, 223 Jones, Spike 53
Henley, Jennie 191 Jones, Steve 71
Hennion, Antoine 126, 161, 162, 165, 172, 192, Joshua, Jaycen 344
303 Jost, Christofer 393
Henriques, Julian 282
Hepworth- Sawyer, Russ 304 Kahn, Ashley 64, 113
Heyde, Neil 211 Kahn-Harris, Keith 293, 300
Higgins, Lee 36, 192 Kakehashi, Ikaturo 76
Hilder, Thomas 257 Kaplan, Abraham 46
Hill, Rosemary 293 Katz, Mark 205, 207
Hiltunen, Riikka 169, 170 Kazdin, Andrew 65
hip-hop 9, 13, 14, 77, 228, 237, 255, 256, 258, 272, Kealy, Edward 35, 70, 71, 133, 168
275, 307, 324 Keep, Andy 93
Kehew, Brian 3, 12, 63, 64, 103 machine learning 84, 97


Kelly, Jem 393 MacQuinn, Julie 393
Kendall, Gary 328 magnetophone 61, 133
Kennedy, Rick 55, 109, 110, 116 Manuel, Peter 72
Kesha 98 Mardin, Arif 29
King, Richard 376 Margouleff, Robert 143
Kinnear, Michael 110 Marlens, W 318
Kinney, WIlliam 125 Marley, Bob 25, 26, 30, 311
Kirsh, David 150, 159 Marmorstein, Gary 371
Kisliuk, Michelle 212 Marrington, Mark 189, 191, 307, 312
Kjus, Yngvar 277 Mars, Bruno 28
Klein, Eve 167 Marsh, Charity 255, 256
Knight, Will 198 Marshall, Jim 223
Kolkowski, Aleks 208 Marshall, Owen 134
Koszolko, Martin 153, 198 Martin, George 63, 103, 116, 118, 162, 168, 170,
Kraft, James 226 172
Kramer, Eddie 64 Martin, Max 193
Krüger Bridge, Simone 28 Martin, Ricky 80
Kvifte, Tellef 267, 278 Martland, P 109
Marx, Karl 132
LA2A. See Teletronix LA-2A Mary Ford 23
Lacasse, Serge 13, 249, 250, 259 Maserati, Tony 338
Lady Gaga 98, 254 Massey, Howard 47, 103, 109, 113, 115, 116, 118,
Lambert, Mel 97 120
Lampert, Vera 32 Massy, Sylvia 237
Lapenta, Francesco 195 Maxfield, J.P 112
Larking, Tony 93 May, Brian 223
Lashua, Brett 190 McAdams, Stephen 325
Latour, Bruno 91, 93, 95, 104, 147, 326 McCartney, Paul 63, 113, 377
Lave, Jean 235 McCormick, Tim 238, 240, 247
Law, John 125 McCracken, Alison 57, 66
Lawrence, James 60 McDermott, John 64
Leckie, John 178, 181 McGuigan, Cathleen 135
Lefèbvre, Henri 133 McIntyre, Phillip 2, 152, 157, 191, 194
Lefford, Nyssim 2, 142, 145, 147, 245 McIver, Joel 293
Legge, Walter 169 McNally, Kirk 45, 233
Leimberg, Josef 272 McNutt, Randy 55, 109, 116
Lewis, Jerry Lee 115 Meek, Joe 63, 66, 115, 117, 119, 172
Lewisohn, Mark 64 Meintjes, Louise 3, 36, 37, 71, 129, 130, 254,
Lexicon (manufacturer) 75, 305 255
Leyshon, Andrew 131, 132, 133, 243 Mellotron (instrument) 268
Liebler, Vincent 113 Merchant, John 29
Lindberg, U. 19 conscious 8
liveness 14, 255, 257 subconscious 7
Lord-Alge, Chris (CLA) 326, 339, Meynell, Anthony 35, 36, 52, 89, 103, 326
342 Michael, George 254
Lord-Alge, Tom 95 Middleton, Richard 7, 142, 247
Lysloff, René 37, 49 Miles, Barry 24, 63, 133, 143, 238, 242
Millard, Andre 55, 110, 122, 224, 226, 282 Oliver, Rowan 273
Miller, Duncan 208 Ondes Martenot (instrument) 9
Miller, Mitch 61 Opto (compressor type) 52
Milner, Greg 114, 369, 372 Orbital 96
Minimoog (instrument) 75, 76 Ord-Hume, Arthur 383
Moffat, David 45, 47 Osbeck, Lisa 149
Mol, Annemarie 125 Ottosson, Ase 135
Monroe, Jazz 377 Oudshoorn, Nelly 92, 115
Moog, Robert 75 Owsinski, Bobby 2, 45, 49, 238, 240, 242, 318,
Moon, Keith 229 327, 329, 340, 341
Moore, Allan 2, 13, 19, 20, 21, 31, 39, 40, 71, 163, Oxenham, Andrew 374
177, 279, 295, 298, 301, 328
Moore, Austin 60, 61, 322 Padgham, Hugh 29
Moorefield, Virgil 22, 31, 146, 162, 163, 164, 165, Paley, Bill 112
166, 169, 170, 172 Palladino, John 114, 119
Morey, Justin 191, 194 Palladino, Rose 118
Moroder, Giorgio 170, 172 Papenburg, Jens 45
Morton, David 54, 55 Paterson, Justin 312
Motown Records 13, 119, 143, 177, 189, 192, 193, Pattison, Louis 97
200, 202, 283, 354 Paul, Les 23, 24, 31, 61, 62, 70, 116, 118, 119, 170,
Moulder, Alan 12 172, 183, 226, 228, 230, 336
Moylan, William 2, 13, 71, 238, 242–243, 303, Peeters, Geoffry 326
306, 323, 233, 329, 342 Peg-O-My Heart 53
Muikku, Jari 165, 169 Pensado, Dave 344
Mullin, Jack 61 performance 8, 10, 11, 12, 13, 20, 21, 22, 23, 24,
musicking 8, 107, 142, 171, 188, 252, 267, 268 28, 29, 36, 40, 53, 56, 62, 63, 64, 70, 75, 79,
Myers, Marc 58 81, 91, 98, 103, 116, 127, 143, 146, 150,
Mynett, Mark 13, 293, 294, 296, 299, 301, 327 164, 166, 169, 171, 172, 173, 177, 178, 185,
187, 188, 190, 195, 196, 205, 206, 207, 208,
N’Dour, Youssou 27 209, 210, 211, 212, 213, 216, 217, 222, 223,
Nardi, Carlo 6, 32, 35 226, 228, 229, 230, 240, 242, 249, 251, 252,
Nassar, Youssef 394 253, 255, 257, 259, 260, 265, 268, 277, 278,
Nathen, Syd 116 294, 295, 296, 297, 299, 300, 304, 320, 321,
Ndegeocello, Meshell 251, 252 329, 335, 336, 341, 343, 353, 359, 360, 363,
Nersessian, Nancy 149, 152, 159 365, 369, 370, 372, 384, 386, 396
Neumann, Georg 57 Perlman, Alan R. 75
Neve (manufacturer) 64, 91, 92, 100 Peterson, Oscar 21
Neve, Rupert 64 Petty, Norman 11, 108
Newell, Philip 112, 128, 135 Pharrell 28
Ngcobo, Shiyani 15 Philip, Robert 205, 207, 209
Nikisch, Arthur 208 Phillipov, Michelle 293, 295
Nine Inch Nails 83, 363, 377 Phillips, Sam 116
Nisbett, Alec 242 Pieper, Katherine 84, 202
Norman, Donald 235 Pieslak, Jonathan 294
Pinch, Trevor 75, 92, 115, 223
O’Connor, Brian 194 Pink Floyd 393
O’Hara, Geoffrey 53 Platt, Tony 311
O’Neill, Robin 208 Plunkett, Donald J. 60
Polanyi, Michael 235 Runstein, Robert 59, 238, 241, 242


Porcello, Tom 3, 35, 36, 77, 85, 151, 326 Ryan, Kevin 3, 12, 63, 64, 103, 196, 404
Portastudio 72, 83 Ryan, Marie-Laure 393, 395
Porter, Bill 117
Portishead 15 Sabine, Walter 127
Power, Bob 343, 344 Sanner, H 62
Pratt, Daniel 235, 236 Saunio, Ilpo 109
pre-production 134, 143, 178, 180, 181, 182, 183, Savage, Steve 3, 79, 327
184, 335 Savoretti, Jack 195, 196
Presley, Elvis 115, 117, 183 Schaeffer, Pierre 326, 342, 344
Provenzano, Catherine 320 Scheps, Andrew 95, 96, 102, 104, 339, 345
psychoacoustics 45, 323, 324, 325, 331, 374, 378 Schiff, Andras 14, 220
Pugh, M. 13 Schloss, Joseph 77, 237
Pultec (manufacturer) 52, 59, 60, 65, 89, 90, 91, Schmid, F. C. 127
92, 93, 94, 95, 96, 97, 98, 101, 102, 311, Schmidt Horning, Susan 20, 22, 56, 58, 59, 91, 92,
326, 343 109, 112, 114, 115, 118, 119, 120, 122, 125,
Purcell, Natalie 295 162, 163, 164, 167, 168, 235, 241, 246, 328,
Putnam, Bill 53, 60, 112, 115 370, 371
Schmitt, Al 58, 100, 101
Rachel, Daniel 194 Schoenberg, Arnold 234, 235, 247
Ralph, Mark 99 Scholz, Tom 227
Ramachandran, Vilayanur 151 Schubert, Franz 14
Ramone, Phil 29 Schulze, Holger 45, 49
Read, Oliver 55, 67, 282 Scissor Sisters 254
realism 8, 9, 22, 209, 328, 401 Scott, Niall 293
recontextualization 26 Scott, Raymond 114
record company 5, 98, 114, 163, 312, 314, 347 Scott, Travis 14, 30, 102
Reilly, Dan 198 Seabrook, John 193
Reiss, Joshua 45, 47 Sear, Walter 59
Reznor, Trent 377 Semiotics 39
Ribowsky, Mark 63 Senior, Mike 97, 243
Richardson, John 393 Serafin, Stefania 149
Richardson, Karl 29 Shuker, Roy 19, 31
Ringer, Fritz 46 Shure (manufacturer) 55, 57
Roach, Archie 12 Siirala, Seppo 167
Roberston, Bronwen 389 Silver, Frank 53
Robert Fine 54 Simon and Garfunkel 27
Robjohns, Hugh 131 Simon, John 118
Rogers, Roy 14, 152 Simon, Paul 27, 254
Roland Corporation 76 Sinatra, Frank 57, 64, 66, 100, 101, 371, 381
Rolling Stones, The 12, 180 situated learning 235, 236
Ronson, Mark 28 Skee-Lo 15
Rosch, Elanor 148, 149 Slater, Mark 129
Rose, Phil 393 Small, Christopher 8, 117, 171, 207, 230, 240, 243
Rotem, J. R. 272 Smalley, Dennis 326, 328
Rudemental 99 Smashing Pumpkins 12
Rumelhart, David 235 Smith, Dave 76, 269
Rumsey, Francis 74, 238, 240, 247, 328, 375 Smith, Julius 326
Smith, Patti 12 thematic analysis 39, 40, 41


Smith, Stacy 84 Theremin (instrument) 9
Snoop Dogg 227, 272 Thicke, Robin 5, 15, 28
Sodajerker 189, 190, 193, 194, 195, 196, 197 Thompson, Emily 56, 122, 127, 390
Solid State Logic (manufacturer) 59, 75 Thompson, Evan 148, 149
sonic cartoon 3, 7, 10, 284, 285, 327, 328, 336 Thompson, Paul 2, 190, 191, 313
Sooy, Raymond 109, 110 Thon, Jan-Noel 393
Soundstream 74, 78 Tiffin, Helen 25, 30
spatialization 212, 250 Till, Rupert 60
Spector, Phil 63, 70, 96, 117, 166, 168, 169, 172 Tingen, Paul 95, 103
Spezzatti, Andy 198 Titelman, Russ 29
Spracklen, Karl 192, 200, 293 Tolinski, Brad 223
Springsteen, Bruce 72 Tomes, Susan 207
staging 8, 13, 14, 212, 249, 250, 252, 258, 318, Tomita, Isao 75
324, 328, 340, 341, 342 Toulson, Rob 193, 310, 348, 367, 373, 377, 378
Stanley, Bob 369, 371 Townshend, Pete 98, 223
Stanton, G. T. 127 Trident (manufacturer) 59, 118
Star, Susan 147 Trocco, Frank 75, 223
Stargate 193 Tubby, King 14, 20, 35, 72, 265
Starr, Larry 24, 31 Turk-Browne, Nicholas 149, 150
Stavrou, Mike 324 Turner, Dan 294
Steinberg (manufacturer) 80, 81, 82, 83, 270, 305, Turner, Mark 154, 155, 310
306, 307
Stent, Mark ‘Spike’ 340 UA1176 (see also 1176 and Urei 1176) 52
Stereo Mike. See Exarchos, Mike Urei 1176 (see also 1176 and UA1176) 60
Sterne, Jonathan 41, 45, 47, 65, 69, 73, 83, 246,
247, 328, 386 Vallée, Paul 286
Stévance, Sophie 258, 259 Vallee, Rudy 57
Stockhausen, Karlheinz 265 Van Halen, Eddie 223, 319
Strachan, Robert 79 Vandeviver, Christopher 322
streaming 376 Varela, Francisco 148, 149
Streisand, Barbra 29 Vari-Mu (compressor) 52
Strummer, Joe 11 VCA (voltage-controlled amplifier) 52, 324
Subotnick, Morton 75 Veal, Michael 125
Suchman, Lucy 153, 154 Villchur, Edgar 57
Swift, Taylor 376 Vitacoustic (record label) 53
Synclavier (manufacturer) 76, 165, 270 Volkmann, John 113
Von Helden, Imke 293
T.I. (Clifford Harris) 28
Tagg, Philip 39 Wakefield, Jonathan 60
Tascam (manufacturer) 72, 77 Wallace, Andy 340, 345
Taubman, Howard 57, 65 Wallach, Jeremy 295, 296
Taylor, Shelley 150, 151 Walser, Robert 293, 297
Teletronix LA-2A 60, 342 Walter Welch 55, 67
Tembo, Matthew 258 Warner, Timothy 77, 167, 175
Théberge, Paul 36, 37, 38, 40, 69, 71, 73, 76, 83, Waterman, Christopher 24, 184, 190, 192
85, 92, 131, 168, 212, 214, 226, 229, 265, Watson, Alan 130
337 Watson, Allan 85, 126
Webb, Jim 57 Witek, Maria 271


Weber, Max 46 Wolfe, Paula 73, 135, 190, 253
Weiner, Ken 198 Woloshyn, Alexa 5, 135, 188, 249, 253, 259
Weinstein, Deena 293, 294
Weisethaunet, H. 19 Yamaha (manufacturer) 76, 82, 306
Wenger, Etienne 235 Years & Years 99
Wiener, Jack 116 Yorke, Thom 376
Williams, Alan 3, 36, 65, 130, 162, 172,
221 Zagorski-Thomas, Simon 1, 2, 3, 6, 7, 19, 31, 32,
Williams, Orlo 385 40, 47, 71, 93, 107, 129, 133, 142, 147, 148,
Williams, Pip 181 155, 163, 177, 208, 211, 230, 237, 265, 271,
Williams, Robbie 182 285, 298, 299, 305, 306, 309, 314, 317, 326,
Williams, Sean 35, 36 328, 341, 376
Williams, Tex 119 Zak, Albin 3, 20, 22, 24, 51, 52, 53, 71, 91, 162,
Wilson, Brian 63, 118 163, 164, 166, 168, 170, 180, 226,
Wilson, Oli 135 326
Wilson, Steven 393, 394, 397, 400 Zappa, Frank 63, 66, 181
Winner, Jeff 224 Zollo, Paul 194
Wishart, Trevor 326 Zotkin, Dimitry 328
