
AUDIBLE DESIGN
A PLAIN AND EASY INTRODUCTION
TO PRACTICAL SOUND COMPOSITION
by
TREVOR WISHART
Published by Orpheus the Pantomime Ltd: 1994
Copyright: Trevor Wishart: 1994
PREFACE
The main body of this book was written at the Institute of Sonology at the Royal Conservatory in the Hague, where I was invited as composer in residence by Clarence Barlow, in 1993. Some clarifications and supplementary material were added after discussions with Miller Puckette, Zak Settel, Stephan Bilbao and Philippe Depalle at IRCAM. However, the blame for any inaccuracies or inconsistencies in the exposition rests entirely with me. The text of the book was originally written entirely in longhand and I am indebted to Wendy McGreavy for transferring these personal pretechnological hieroglyphs to computer files. Help with the final text layout was provided by Tony Myatt of the University of York.

My career in music-making with computers would not have been possible without the existence of a community of like-minded individuals committed to making powerful and open music-computing tools available to composers on affordable domestic technology. I am therefore especially indebted to the Composers Desktop Project and would hence like to thank my fellow contributors to this long running project; in particular Tom Endrich, the real driving force at the heart of the CDP, to whom we owe the survival, expansion and development of this cooperative venture, against all odds, and whose persistent probing and questioning has led to clear instrument descriptions & musician-friendly documentation; Richard Orton and Andrew Bentley, the other composer founder members of, and core instrument contributors to, the project; David Malham, who devised the hardware bases of the CDP and much else, and continues to give support; Martin Atkins, whose computer science knowledge and continuing commitment made, and continues to make, the whole project possible (and from whom I have very slowly learnt to program less anarchically, if not yet elegantly!); Rajmil Fischman of the University of Keele, who has been principally responsible for developing the various graphic interfaces to the system; and to Michael Clarke, Nick Laviers, Rob Waring, Richard Dobson and to the many students at the Universities of York, Keele, and Birmingham and to individual users elsewhere, who have supported, used and helped sustain and develop this resource.
All the sound examples accompanying this book were either made specifically for this publication, or come from my own compositions Red Bird, The VOX Cycle or Tongues of Fire, except for one item, and I would like to thank Paul de Marinis and Lovely Music for permitting me to use the example in Chapter 2 from the piece Odd Evening on the CD Music as a Second Language (Lovely Music LCD 3011). Thanks are also due to Francis Newton, for assistance with data transfer to DAT.
CONTENTS

Chapter 0    Introduction ............................ 1
Chapter 1    The Nature of Sound ................... 11
Chapter 2    Pitch ................................... 25
Chapter 3    Spectrum ................................ 35
Chapter 4    Onset ................................... 44
Chapter 5    Continuation ............................ 48
Chapter 6    Grain Streams ........................... 55
Chapter 7    Sequences ............................... 59
Chapter 8    Texture ................................. 66
Chapter 9    Time .................................... 72
Chapter 10   Energy .................................. 81
Chapter 11   Time-stretching ......................... 86
Chapter 12   Interpolation ........................... 96
Chapter 13   New Approaches to Form ................ 103

Appendix 1 ............................................. 112
Brief Bibliography ..................................... 139
CHAPTER 0
INTRODUCTION
WHAT THIS BOOK IS ABOUT
This is a book about composing with sounds. It is based on three assumptions.
1. Any sound whatsoever may be the starting material for a musical composition.
2. The ways in which this sound may be transformed are limited only by the imagination of the
composer.
3. Musical structure depends on establishing audible relationships amongst sound materials.
The first assumption can be justified with reference to both aesthetic and technological developments in the Twentieth Century. Before 1920, the French composer Varèse was imagining a then unattainable music which had the same degree of control over sonic substance as musicians have traditionally exercised over melody, harmony and duration. This concern grew directly out of the sophisticated development of orchestration in the late Nineteenth Century and its intrinsic limitations (a small finite set of musical instruments). The American composer John Cage was the first to declare that all sound was (already) music. It was the emergence and increasing perfection of the technology of sound recording which made this dream accessible.
The exploration of the new sounds made available by recording technology was begun by Pierre Schaeffer and the G.R.M. in Paris in the early 1950s. Initially hampered by unsophisticated tools (in the early days, editing between lacquer discs; later the transformations - like tape speed variation, editing and mixing - available with magnetic tape), masterpieces of this new medium began to emerge and an approach to musical composition rooted in the sound phenomenon itself was laid out in great detail by the French school.
The second of our assumptions had to await the arrival of musical instruments which could handle, in a subtle and sophisticated way, the inner substance of sounds themselves. The digital computer provided the medium in which these tools could be developed. Computers allow us to digitally record any sound at all and to digitally process those recorded sounds in any way that we care to define.
In this book we will discuss in general the properties of different kinds of sound materials and the effects certain well defined processes of transformation may have on these. We will also present, in the Appendix, a simple diagrammatic explanation of the musical procedures discussed.
The third assumption will either appear obvious, or deeply controversial, depending on the musical perspective of the reader. For the moment we will assume that it is obvious. The main body of this book will therefore show how, starting from a given sound, many other audibly similar sounds may be developed which, however, possess properties different from, or even excluded from, the original sound. The question of how these relationships may be developed to establish larger scale musical structures will be suggested towards the end of the book but in less detail as, to date, no universal tradition of large-scale form-building (through these newly accessible sound-relationships) has established itself as a norm.
WHAT THIS BOOK IS NOT ABOUT

This book is not about the merits of computers or particular programming packages. However, most of the programs described were available on the Composers Desktop Project (CDP) System at the time of writing. The CDP was developed as a composers' cooperative and originated in York, U.K.
Nor will we, outside this chapter, discuss whether or not any of the processes described can be, or
ought to be implemented in real time. In due course, many of them will run in real time environments.
My concern here, however, is to uncover the musical possibilities and restraints offered by the medium
of sonic composition, not to argue the pros and cons of different technological situations.
A common approach to sound-composition is to define "instruments" - either by manipulating factory
patches on a commercial synthesizer, or by recording sounds on a sampler - and then trigger and
transpose these sounds from a MIDI keyboard (or some other kind of MIDI controller). Many
composers are either forced into this approach, or do not see beyond it, because cheaply available
technology is based in the note/instrument conception of music. At its simplest such an approach is no
more than traditional note-oriented composition for electronic instruments, particularly where the MIDI
interface confines the user to the tempered scale. This is not significantly different from traditional
on-paper composition and although this book should give some insight into the design of the
'instruments' used in such an approach, I will not discuss the approach as such here - the entire
history of European compositional theory is already available!
On the contrary, the assumption in this book is that we are not confined to defining "instruments" to
arrange on some preordained pitch/rhythmic structure (though we may choose to adopt this approach in
particular circumstances) but may explore the multidimensional space of sound itself, which may be
moulded like a sculptural medium in any way we wish.
We also do not aim to cover every possibility (this would, in any case, be impossible) but only a wide and, hopefully, fairly representative, set of processes which are already familiar. In particular, we will focus on the transformation of sound materials taken from the real world, rather than on an approach
through synthesis. However, synthesis and analysis have, by now, become so sophisticated, that this
distinction need barely concern us any more. It is perfectly possible to use the analysis of a recorded
sound to build a synthesis model which generates the original sound and a host of other related sounds.
It is also possible to use sound transformation techniques to change any sound into any other via some
well-defined and audible series of steps. The common language is one of intelligent and sophisticated
sound transformation so that sound composition has become a plastic art like sculpture. It is with this
that we will be concerned.
THINKING ALOUD - A NEW CRITICAL TRADITION
I cannot emphasise strongly enough that my concern is with the world of sound itself, as opposed to the
world of notations of sound, or the largely literary disciplines of music score analysis and criticism. I
will focus on what can be aurally perceived, on my direct response to these perceptions and on what can
be technically, acoustically or mathematically described.
The world of sound-composition has been hampered by being cast in the role of a poor relation to more traditional musical practice. In particular the vast body of analytical and critical writings in the musicology of Western Art music is strongly oriented to the study of musical texts (scores) rather than to a discipline of acute aural awareness in itself. Sound composition requires the development of both new listening and awareness skills for the composer and, I would suggest, a new analytical and critical discipline founded in the study of the sonic experience itself, rather than its representation in a text.
This new paradigm is beginning to struggle into existence against the immense inertia of received
wisdom about 'musical structure'.
I have discussed elsewhere (On Sonic Art) the strong influence of mediaeval 'text worship' on the critical/analytical disciplines which have evolved in music. Both the scientific method and technologised industrial society have had to struggle against the passive authority of texts declaring eternal truths and values inimical to the scientific method and to technological advance.
I don't wish here to decry the idea that there may be "universal truths" about human behaviour and human social interaction which science and technology are powerless to alter. But because our prime medium for the propagation of knowledge is the written text, powerful institutions have grown up around the presentation, analysis and evaluation of texts and textual evidence .... so powerful that their
influence can be inappropriate.
In attempting to explore the area of composing with sound, this book will adopt the point of view of a scientific researcher delving into an unknown realm. We are looking for evidence to back up any hypotheses we may have about potential musical structure and this evidence comes from our perception of sounds themselves. (Scientific readers may be surprised to hear that this stance may be regarded as polemical by many musical theorists).
In line with this view, therefore, this book is not intended to be read without listening to the sound examples which accompany it. In the scientific spirit, these are presented as evidence of the propositions being presented. You are at liberty to affirm or deny what I have to say through your own experience, but this book is based on the assumption that the existence of structure in music is a matter of fact to be decided by listening to what is presented, not a matter of opinion to be decided on the authority of a musical text (a score or a book ... even this one), or the importance of the scholar or
composer who declares structure to be present (I shall return to such matters in particular in Chapter 9
on 'Time').
However, this is not a scientific text. Our interest in exploring this new area is not to discover
universal laws of perception but to suggest what might be fruitful approaches for artists who wish to
explore the vast domain of new sonic possibilities opened up by sound recording and computer
technology.
We might in fact argue that the truly potent texts of our times are certainly not texts like this book, or even true scientific theories, but computer programs themselves. Here, the religious or mystical potency with which the medieval text was imbued has been replaced by actual physical efficacy. For the text of a computer program can act on the world through associated electronic and mechanical hardware, to make the world anew, and in particular to create new and unheard sonic experiences. Such texts are potent but, at the same time, judgeable. They do not radiate some mystical authority to which we must kow-tow, but do something specific in the world which we can judge to be more or less successful. And if we are dissatisfied, the text can be modified to produce a more satisfactory result.
The practical implications of this are immense for the composer. In the past I might spend many days working from an original sound source, subjecting it to many processes before arriving at a satisfactory result. As a record of this process (as well as a safeguard against losing my hard-won final product) I would keep copious notes and copies of many of the intermediate sounds, to make reconstruction (or variation) of the final sound possible. Today, I store only the source and a brief so-called "batch-file". The text in the batch-file, if activated, will automatically run all the programs necessary to create the goal sound. It can also be copied and modified to produce whatever variants are required.
An illuminating manuscript indeed!
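By way of illustration only (this is not the CDP batch syntax; the program names and flags below are invented for the example), a modern equivalent of such a batch-file might be a short script which regenerates the goal sound from the source alone by running a chain of processing programs in order:

```python
import subprocess

# Each step runs one hypothetical command-line sound-processing program,
# reading the previous intermediate file and writing the next one.
# Tool names and parameters are placeholders, not real CDP commands.
steps = [
    ["stretch_time",    "source.wav", "step1.wav", "--factor", "2.5"],
    ["shift_spectrum",  "step1.wav",  "step2.wav", "--semitones", "-7"],
    ["impose_envelope", "step2.wav",  "goal.wav",  "--shape", "swell.brk"],
]

for command in steps:
    # Re-running the script reconstructs the goal sound from the source alone;
    # editing one parameter and re-running produces a variant.
    subprocess.run(command, check=True)
```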
SOUND TRANSFORMATION: SCIENCE OR ART?
In this book we will refer to sounds as sound-materials or sound-sources and to the processes of changing them as transformations or sound-transformations. The tools which effect these changes will be described as musical instruments or musical tools. From an artistic point of view it is important to stress the continuity of this work with past musical craft. The musical world is generally conservative and denizens of this community can be quick to dismiss the "new-fangled" as unmusical or artistically inappropriate. However, we would stress that this is a book by, and for, musicians.
Nevertheless, scientific readers will be more familiar with the terms signal, signal processing and computer program or algorithm. In many respects, what we will be discussing is signal processing as it applies to sound signals. However, the motivation of our discussion is somewhat different from that of the scientific or technological study of signals. In analysing and transforming signals for scientific purposes, we normally have some distinct goal in mind - an accurate representation of a given sequence of data in the frequency domain, the removal of noise and the enhancement of the signal "image" - and we may test the result of our process against the desired outcome and hence assess the validity of our procedure.
In some cases, musicians share these goals. Precise analysis of sounds, extraction of time-varying information in the frequency domain (see Appendix p3), sound clarification or noise reduction, are all of great importance to the sonic composer. But beyond this, the question that a composer asks is, is this process aesthetically interesting? - does the sound resulting from the process relate perceptually and in a musically useful manner to the sound we began with? What we are searching for is a way to transform sound material to give resulting sounds which are clearly close relatives of the source, but also clearly different. Ideally we require a way to "measure" or order these degrees of difference, allowing us to articulate the space of sound possibilities in a structured and meaningful way.
Beyond this, there are no intrinsic restrictions on what we do. In particular, the goal of the process we set in motion may not be known or even (with complex signals) easily predictable beforehand. In fact, as musicians, we do not need to "know" completely what we are doing (!!). The success of our efforts will be judged by what we hear. For example, a technological or scientific task may involve the design of a highly specific filter to achieve a particular result. A musician, however, is more likely to require an extremely flexible (band-variable, Q-variable, time-variable: Appendix p8) filter in order to explore its effects on sound materials. He/she may not know beforehand exactly what is being searched for when it is used, apart from a useful aesthetic transformation of the original source. What this means in practice may only emerge in the course of the exploration.
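As a minimal sketch of what such an exploratory, time-variable filter might look like (assuming Python with numpy and scipy, which are illustrative choices and not the tools discussed in this book): a band-pass filter whose centre frequency changes from block to block. Filter state is not carried across blocks, so this is a toy for exploration rather than a finished instrument.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def variable_bandpass(signal, sr, centre_hz, q=5.0, block=1024):
    """Band-pass filter whose centre frequency follows `centre_hz`
    (one value per block of samples): a crude 'time-variable' filter."""
    out = np.zeros_like(signal)
    for i in range(len(signal) // block):
        fc = centre_hz[min(i, len(centre_hz) - 1)]        # centre for this block
        lo, hi = fc * (1 - 0.5 / q), fc * (1 + 0.5 / q)   # band edges derived from Q
        sos = butter(2, [lo, hi], btype="bandpass", fs=sr, output="sos")
        seg = signal[i * block:(i + 1) * block]
        out[i * block:(i + 1) * block] = sosfilt(sos, seg)
    return out
```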
This open-ended approach applies equally to the design of musical instruments (signal processing programs) themselves. In this case, however, it clearly helps to have a modicum of acoustic knowledge. An arbitrary, number-crunching program will produce an arbitrary result - like the old adage of trying to write a play by letting a chimpanzee stab away at a typewriter in the hope that a masterpiece will emerge.
So scientists be forewarned! We may embark on signal-processing procedures which will appear bizarre to the scientifically sophisticated, procedures that give relatively unpredictable results, or that are heavily dependent on the unique properties of the particular signals to which they are applied. The question we must ask as musicians, however, is not, are these procedures scientifically valid, or even predictable, but rather, do they produce aesthetically useful results on at least some types of sound materials.
There is also a word of caution for the composer reader. Much late Twentieth Century Western art music has been dogged by an obsession with complicatedness. This has arisen partly from the permutational procedures of late serialism and also from an intellectually suspect linkage of crude information theory with musical communication - more patterns means more information, means more musical "potency". This obsession with quantity, or information overload, arises partly from the breakdown of consensus on the substance of musical meaning. In the end, the source of musical potency remains as elusive as ever but in an age which demands everything be quantifiable and measurable, a model which stresses the quantity of information, or complicatedness of an artefact, seems falsely plausible.
This danger of overkill is particularly acute with the computer processing of sound, as anything and everything can be done. For example, when composing, I may decide I need to do something with a sound which is difficult, or impossible, with my existing musical tools. I will therefore make a new instrument (a program) to achieve the result I want. Whilst building the instrument, however, I will make it as general purpose as possible so that it applies to all possible situations and so that all variables can vary in all possible ways. Given the power of the computer, it would be wasteful of time not to do this. This does not mean however that I will, or even intend to, use every conceivable option the new instrument offers. Just as with the traditional acoustic instrument, the task is to use it, to play it, well. In sound composition, this means to use the new tools in a way appropriate to the sound we are immediately dealing with and with a view to particular aesthetic objectives. There is no inherent virtue in doing everything.
EXPLICIT AND INTUITIVE KNOWLEDGE
In musical creation we can distinguish two quite distinct modes of knowing and acting. In the first, a physical movement causes an immediate result which is monitored immediately by the ears and this feedback is used to modify the action. Learned through physical practice and emulation of others and aided by discussion and description of what it involved, this real-time-monitored action type of knowledge I will describe as "intuitive". It applies to things we know very well (like how to walk, or how to construct meaningful sentences in our native tongue) without necessarily being able to describe explicitly what we do, or why it works. In music, intuitive knowledge is most strongly associated with musical performance.
On the other hand, we also have explicit knowledge of, e.g., acceptable HArmonic progressions within a given style or the spectral contour of a particular vowel formant (see Chapter 3). This is knowledge that we know that we know and we can give an explicit description of it to others in language or mathematics. Explicit knowledge of this kind is stressed in the training, and usually in the practice, of traditional composers.
In traditional "on paper" composition, the aura of explicitness is enhanced because it results in a definitive text (the score) which can often be given a rational exegesis by the composer or by a music score-analyst. Some composers (particularly the presently devalued "romantic" composers) may use the musical score merely as a notepad for their intuitive musical outpourings. Others may occasionally do the same but, in a cultural atmosphere where explicit rational decision-making is most highly prized, will claim, post hoc, to have worked it all out in an entirely explicit and rational way.
Moreover, in the music-cultural atmosphere of the late Twentieth Century, it appears "natural" to assume that the use of the computer will favour a totally explicit approach to composition. At some level, the computer must have an exact description of what it is required to do, suggesting therefore that
the composer must also have a clearly explicit description of the task. (I will describe later why this is
not so).
The absurd and misguided rationalist nightmare of the totally explicit construction of all aspects of a musical work is not the "natural" outcome of the use of computer technology in musical composition - just one of the less interesting possibilities.
It might be supposed that the pure electro-acoustic composer making a composition directly onto a recording medium has already abandoned intuitive knowledge by eliminating the performer and his/her role. The fact, however, is that most electro-acoustic composers 'play' with the medium, exploring through informed, and often real-time, play the range of possibilities available during the course of composing a work. Not only is this desirable. In a medium where everything is possible, it is an essential part of the compositional process. In this case a symbiosis is taking place between composerly and performerly activities and, as the activity of composer and performer begin to overlap, the role of intuitive and explicit knowledge in musical composition must achieve a new equilibrium.
In fact, in all musical practice, some balance must be struck between what is created explicitly and what is created intuitively. In composed Western art music where a musical score is made, the composer takes responsibility for the organisation of certain well-controlled parameters (pitch, duration, instrument type) up to a certain degree of resolution. Beyond this, performance practice tradition and the player's intuitive control of the instrumental medium takes over in pitch definition (especially in processes of succession from pitch to pitch on many instruments, and with the human voice), timing precision or its interpretation, and sound production and articulation.
At the other pole, the free-improvising performer relies on an intuitive creative process (restrained by the intrinsic sound/pitch limitations of a sound source, e.g. a retuned piano, a metal sheet) to generate both the moment-to-moment articulation of events and the ongoing time structure at all levels. However, even in the free-improvisation case, the instrument builder (or, by accident, the found-object manufacturer) is providing a framework of restrictions (the sound world, the pitch-set, the articulation possibilities) which bound the musical universe which the free improviser may explore. In the sense that an instrument builder sets explicit limits to the performer's free exploration she/he has a restricting role similar to that of the composer.
In contrast, the role of the computer instrument builder is somewhat different. Information Technology allows us to build sound-processing tools of immense generality and flexibility (though one might not guess this fact by surveying the musical hardware on sale in high street shops). Much more responsibility is therefore placed on the composer to choose an appropriate (set of) configuration(s) for a particular purpose. The "instrument" is no longer a definable (if subtle) closed universe but a groundswell of possibilities out of which the sonic composer must delimit some aesthetically valid universe. Without some kind of intuitive understanding of the universe of sounds, the problem of choice is insurmountable (unless one replaces it entirely by non-choice strategies, dice-throwing procedures etc).
REAL TIME OR NOT REAL TIME? - THAT IS THE QUESTION
As computers become faster and more and more powerful, there is a clamour among musicians working in the medium for "real-time" systems, i.e. systems on which we hear the results of our decisions as we take them. A traditional musical instrument is a "real-time system". When we bow a note on a violin, we immediately hear a sound being produced which is the result of our decision to use a particular finger position and bow pressure and which responds immediately to our subtle manual articulation of these. Success in performance also depends on a foreknowledge of what our actions will precipitate. Hence the rationale for having real-time systems seems quite clear in the sphere of musical performance. To develop new, or extended, musical performance instruments (including real-time-controllable sound processing devices) we need real-time processing of sounds.
Composition, on the other hand, would seem, at first glance, to be an intrinsically non-real-time process in which considered and explicit choices are made and used to prepare a musical text (a score), or (in the studio) to put together a work, sound-by-sound, onto a recording medium out of real time. As this book is primarily about composition, we must ask what bearing the development of real-time processing has on compositional practice apart from speeding up some of the more mundane tasks involved.
Although the three traditional roles of performer-improviser, instrument-builder and composer are being blurred by the new technological developments, they provide useful poles around which we may assess the value of what we are doing.
I would suggest that there are two conflicting paradigms competing for the attention of the sonic composer. The first is an instrumental paradigm, where the composer provides electronic extensions to a traditional instrumental performance. This approach is intrinsically "real time". The advantages of this paradigm are those of "liveness" (the theatre versus the cinema, the work is recreated 'before your very eyes') and of mutability dependent on each performer's reinterpretation of the work. This approach fits well into a traditional musical way of thinking.
Its disadvantages are not perhaps immediately obvious. But the composer who specifies a network of electronic processing devices around a traditional instrumental performance must recognise that he/she is in fact creating a new and different instrument for the instrumentalist (or instrumentalist-"technician" duo) to play and is partly adopting the role of an instrument builder, with its own very different responsibilities. In the neomanic cultural atmosphere of the late Twentieth Century, the temptation for anyone labelled "composer" is to build a new electronic extension for every piece, to establish
credentials as an "original" artist. However, an instrument builder must demonstrate the viability and efficacy of any new instrument being presented. Does it provide a satisfying balance of restrictions and flexibilities to allow a sophisticated performance practice to emerge?
For the performer, he/she is performing on a new instrument which is composed of the complete system acoustic-instrument-plus-electronic-network. Any new instrument takes time to master. Hence there is a danger that a piece for electronically processed acoustic instrument will fall short of our musical expectations because, no matter how good the performer, his or her mastery of the new system is unlikely to match his or her mastery of the acoustic instrument alone, with the centuries of performance practice from which it arises. There is a danger that electronic extension may lead to musical trivialisation. Because success in this sphere depends on a marriage of good instrument design and evolving performance practice, it takes time! From this perspective it might be best to establish a number of sophisticated electronic-extension-archetypes which performers could, in time, learn to master as a repertoire for these new instruments develops.
The second paradigm is that of pure electro-acoustic (studio) composition. Such composition may be regarded as suffering from the disadvantage that it is not a 'before your very eyes' and interpreted medium. In this respect it has the disadvantages, but also the advantages, that film has vis-a-vis theatre. What film lacks in the way of reinterpretable archetypicality(!) it makes up for in the closely observed detail of location and specifically captured human uniqueness. Similarly, studio composition can deal with the uniqueness of sonic events and with the conjuring of alternative or imaginary sonic landscapes outside the theatre of musical performance itself. Moreover, sound diffusion adds an element of performance and interpretation (relating a work to the acoustic environment in which it is presented) not available in the presentation of cinema.
However, electro-acoustic composition does present us with an entirely new dilemma. Performed
music works on sound archetypes. When I finger an "mf E" on a flute, I expect to hear an example of a
class of possible sounds which all satisfy the restrictions placed on being a flute "mf E". Without this
fact there can be no interpretation of a work. However, for a studio composer, every sound may be
treated as a unique event. Its unique properties are reproducible and can be used as the basis of
compositional processes which depend on those uniquenesses. The power of the computer to both
record and transform any sound whatsoever, means that a "performance practice" (so to speak) in the traditional sense is neither necessarily attainable nor desirable. But as a result the studio composer
must take on many of the functions of the performer in sound production and articulation.
This does not necessarily mean, however, that all aspects of sound design need to be explicitly understood. As argued above, the success of studio-produced sound-art depends on the fusion of the roles of composer and performer in the studio situation. For this to work effectively, real-time processing (wherever this is feasible) is a desirable goal.
In many studio situations in which I have worked, in order to produce some subtly time-varying modification of a sound texture or process, I had to provide to a program a file listing the values some parameter will take and the times at which it will reach those values (a breakpoint table). I then ran the program and subsequently heard the result. If I didn't like this result, I had to modify the values in the table and run the program again, and so on. It is clearly simpler and more efficacious, when first exploring any time-varying process and its effect on a particular sound, to move a fader, turn a knob, bow a string, blow down a tube (etc.) and hear the result of this physical action as I make it. I can then adjust my physical actions to satisfy my aural experience - I can explore intuitively, without going through any conscious explanatory control process.
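For concreteness, a breakpoint table of the kind described is simply a list of time/value pairs in a text file, which the program interpolates between. The file name, parameter and values below are invented for the example; a minimal sketch (assuming numpy) of reading and interpolating one:

```python
import numpy as np

# A hypothetical breakpoint file, "sweep.brk": time (seconds) and parameter value.
#   0.0    0.0
#   1.5    4.0
#   3.0   -2.0
#   5.0    0.0

def read_breakpoints(path):
    table = np.loadtxt(path)            # two columns: time, value
    return table[:, 0], table[:, 1]

def value_at(times, values, t):
    # Linear interpolation between breakpoints; the end values are held.
    return np.interp(t, times, values)

times, values = read_breakpoints("sweep.brk")
print(value_at(times, values, 2.25))    # the value half-way between 1.5 s and 3.0 s
```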
In this way I can learn intuitively, through performance, what is "right" or "wrong" for me in this new situation without necessarily having any conscious explicit knowledge of how I made this decision. The power of the computer both to generate and provide control over the whole sound universe does not require that I explicitly know where to travel.

But beyond this, if the program also recorded my hand movements I could store these intuitively chosen values and these could then be reapplied systematically in new situations, or analysed to determine their mathematical properties so I could consciously generalise my intuitive knowledge to different situations.
In this interface between intuitive knowledge embedded in bodily movement and direct aural feedback, and explicit knowledge carried in a numerical representation, lies one of the most significant contributions of computing technology to the process of musical composition, blurring the distinction between the skill of the performer (intuitive knowledge embedded in performance practice) and the skill of the composer and allowing us to explore the new domain in a more direct and less theory-laden manner.
SOUND COMPOSITION: AN OPEN UNIVERSE
Finally, we must declare that the realm of sound composition is a dangerous place for the traditional composer! Composers have been used to working within domains established for many years or centuries. The tempered scale became established in European music over 200 years ago, as did many of the instrumental families. New instruments (e.g. the saxophone) and new ordering procedures (e.g. serialism) have emerged and been taken on board gradually.
But these changes are minor compared with the transition into sonic composition where every sound and every imaginable process of transformation is available. The implication we wish the reader to draw is that this new domain may not merely lack established limits at this moment, it may turn out to be intrinsically unbounded. The exploratory, researching approach to the medium, rather than just the mastering and extension of established craft, may be a necessary requirement to come to grips with this new domain.
Unfortunately, the enduring academic image of the European composer is that of a man (sic) who, from the depths of his (preferably Teutonic) wisdom, wills into existence a musical score out of pure thought. In a long established and stable tradition with an almost unchanging set of sound-sources (instruments) and a deeply embedded performance practice, perhaps these supermen exist. In contrast, however, good sound composition always includes a process of discovery, and hence a coming to terms with the unexpected and the unwilled, a process increasingly informed by experience as the composer engages in her or his craft, but nevertheless always open to both surprising discovery and to errors of judgement. Humility in the face of experience is an essential character trait!
In particular, the sound examples in this book are the result of applying particular musical tools to particular sound sources. Both must be selected with care to achieve some desired musical goal. One cannot simply apply a process, "turn the handle", and expect to get a perceptually similar transformation with whatever sound source one puts into the process. Sonic Art is not like arranging.
We already know that sound composition presents us with many modes of material-variation which are unavailable - at least in a specifically controllable way - in traditional musical practice. It remains to be seen whether the musical community will be able to draw a definable boundary around available techniques and say "this is sound composition as we know it" in the same sense that we can do this with traditional instrumental composition.
CHAPTER 1
THE NATURE OF SOUND
SOUNDS ARE NOT NOTES
One of the first lessons a sound composer must learn is this. Sounds are not notes. To give an analogy with sculpture, there is a fundamental difference between the idea "black stone" and the particular stone I found this morning which is black enough to serve my purposes. This particular stone is a unique piece of material (no matter how carefully I have selected it) and I will be sculpting with this unique material. The difficulty in music is compounded by the fact that we discuss musical form in terms of idealisations. "F5 mf" on a flute is related to "E-flat4 ff" on a clarinet. But these are relations between ideals, or classes, of sounds. For every "F5 mf" on a flute is a different sound-event. Its particular grouping of micro-fluctuations of pitch, loudness and spectral properties is unique.
Traditional music is concerned with the relations among certain abstracted properties of real sounds - the pitch, the duration, the loudness. Certain other features like pitch stability, tremolo and vibrato control etc. are expected to lie within perceptible limits defined by performance practice, but beyond this enter into a less well-defined sphere known as interpretation. In this way, traditionally, we have a structure defined by relations among archetypal properties of sounds and a less well-defined aura of acceptability and excellence attached to other aspects of the sound events.
With sound recording, the unique characteristics of the sound can be reproduced. This has innumerable consequences which force us to extend, or divert from, traditional musical practice.
To give just two examples ...
1. We can capture a very special articulation of sound that cannot be reproduced through performance - the particular tone in which a phrase is spoken on a particular occasion by an untrained speaker; the resonance of a passing wolf in a particular forest or a particular vehicle in a road-tunnel; the ultimate extended solo in an all-time great improvisation which arose out of the special combination of performers involved and the energy of the moment.
2. Exact reproducibility allows us to generate transformations that would otherwise be impossible. For example, by playing two copies of a sound together with a slight delay before the onset of the second, we generate a pitch, because the exact repetition of sonic events at an exact time interval corresponds to a fixed frequency and hence a perceived pitch.
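A small sketch of this second example (assuming numpy; the numbers are illustrative): mixing a sound with a copy of itself delayed by a fixed interval reinforces whatever repeats at that interval, so a pitch near 1/delay Hz is heard colouring the result.

```python
import numpy as np

def double_with_delay(signal, sr, delay_seconds):
    """Mix a sound with a copy of itself delayed by delay_seconds.
    Exact repetition at this fixed interval corresponds to a frequency of
    1 / delay_seconds Hz, heard as a pitch colouring the result."""
    d = int(round(delay_seconds * sr))
    out = np.zeros(len(signal) + d)
    out[:len(signal)] += signal          # the original
    out[d:] += signal                    # the delayed copy
    return out * 0.5                     # scale down to avoid clipping

# e.g. a 5 millisecond delay implies a perceived pitch near 1 / 0.005 = 200 Hz
```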
The most important thing to understand, however, is that a sound is a sound is a sound is a sound. It is
not an example of a pitch class or an instrument type. It is a unique object with its own particular
properties which may be revealed, extended and transformed by the process of sound composition.
Furthermore, sounds are multi-dimensional phenomena. Almost all sounds can be described in terms of onset (particularly onset-grain), pitch or pitch-band, pitch motion, spectral harmonicity-inharmonicity and its evolution, spectral contour and formants (see Chapter 3) and their evolution, spectral stability and its evolution, and dispersive, undulating and/or forced continuation (see Chapter 4), all at the same time. In dividing up the various properties of sound, we don't wish to imply that there are different classes of sounds corresponding to the various chapters in the book. In fact, sounds can be grouped into different classes, with fuzzy boundaries, but most sounds have most of the properties that we will discuss. As compositional tools may affect two or more perceptual aspects of a sound, as we go through the book it will be necessary to refer more than once to many compositional processes as we examine their effects from different perceptual perspectives.
UNIQUENESS AND MALLEABILITY: FROM ARCHITECTURE TO CHEMISTRY
To deal with this change of orientation, our principal metaphor for musical composition must change from one of architecture to one of chemistry.
In the past, composers were provided, by the history of instrument technology, performance practice and the formalities of notational conventions (including theoretical models relating notatables like pitch and duration), with a pool of sound resources from which musical "buildings" could be constructed. Composition through traditional instruments binds together as a class large groups of sounds and evolving-shapes-of-sounds (morphologies) by collecting acceptable sound types together as an "instrument", e.g. a set of struck metal strings (piano), a set of tuned gongs (gamelan), and uniting this with a tradition of performance practice. The internal shapes (morphologies) of sound events remain mainly in the domain of performance practice and are not often subtly accessed through notation conventions. Most importantly however, apart from the field of percussion, the overwhelming dominance of pitch as a defining parameter in music focuses interest on sound classes with relatively stable spectral and frequency characteristics.
We might imagine an endless beach upon which are scattered innumerable unique pebbles. The previous task of the instrument builder was to seek out all those pebbles that were completely black to make one instrument, all those that were completely gold to make a second instrument, and so on. The composer then becomes an expert in constructing intricate buildings in which every pebble is of a definable colour. As the Twentieth Century has progressed and the possibilities of conventional instruments have been explored to their limit we have learned to recognise various shades of grey and gilt to make our architecture ever more elaborate.
Sound recording, however, opens up the possibility that any pebble on the beach might be usable - those that are black with gold streaks, those that are multi-coloured. Our classification categories are overburdened and our original task seems to become overwhelmed. We need a new perspective to understand this new world.
In sonic terms, not only sounds of indeterminate pitch (like unpitched percussion, definable portamenti, or inharmonic spectra) but those of unstable, or rapidly varying, spectra (the grating gate, the human speech-stream) must be accepted into the compositional universe. Most sounds simply do not fall into the neat categories provided by a pitched-instruments oriented conception of musical architecture. As most traditional building plans used pitch (and duration) as their primary ordering principles, working with these newly available materials is immediately problematic. A complete reorientation of musical thought is required - together with the power provided by computers - to enable us to encompass this new world of possibilities.
We may imagine a new personality combing the beach of sonic possibilities, not someone who selects, rejects, classifies and measures the acceptable, but a chemist who can take any pebble and, by numerical sorcery, separate its constituents, merge the constituents from two quite different pebbles and, in fact, transform black pebbles into gold pebbles, and vice versa.
The signal processing power of the computer means that sound itself can now be manipulated. Like the chemist, we can take apart what were once the raw materials of music, reconstitute them, or transform them into new and undreamt of musical materials. Sound becomes a fluid and entirely malleable medium, not a carefully honed collection of givens. Sculpture and chemistry, rather than language or finite mathematics, become appropriate metaphors for what a composer might do, although mathematical and physical principles will still enter strongly into the design of both musical tools and musical structures.
The precision of computer signal processing means, furthermore, that previously evanescent and uncontainable features of sounds may be analysed, understood, transferred and transformed in rigorously definable ways. A minute audible feature of a particular sound can be magnified by time-stretching or brought into focus by cyclic repetition (as in the works of Steve Reich). The evolving spectrum of a complex sonic event can be pared away until only a few of the constituent partials remain, transforming something that was perhaps coarse and jagged into something aetherial (spectral tracing: see Chapter 3). We may exaggerate or contradict - in a precise manner - the energy (loudness) trajectory (or envelope) of a sound, enhancing or contradicting its gestural propensities, and we can pass between these "states of being" of the sound with complete fluidity, tracing out an audible path of musical connections - a basis for musical form-building.
This shift in emphasis is as radical as is possible - from a finite set of carefully chosen archetypal properties governed by traditional "architectural" principles, to a continuum of unique sound events and the possibility to stretch, mould and transform this continuum in any way we choose, to build new worlds of musical connectedness. To get any further in this universe, we need to understand the properties of the "sonic matter" with which we must deal.
THE REPRESENTATION OF SOUND: PHYSICAL & NUMERICAL ANALOGUES
To understand how this radical shift is possible, we must understand both the nature of sound, and how it can now be physically represented. Fundamentally sound is a pressure wave travelling through the air. Just as we may observe the ripples spreading outwards from a stone thrown into still water, when we speak similar ripples travel through the air to the ears of listeners. And as with all wave motion, it is the pattern of disturbance which moves forward, rather than the water or air itself. Each pocket of air (or water) may be envisaged vibrating about its current position and passing on that vibration to the pocket next to it. This is the fundamental difference between the motion of sound in air and the motion of the wind, where the air molecules move en masse in one direction. It also explains how sound can travel very much faster than the most ferocious hurricane.
Sound waves in air differ from the ripples on the surface of a pond in another way. The ripples on a pond (or waves on the ocean) disturb the surface at right angles (up and down) to the direction of motion of that wave (forward). These are known as lateral waves. In a sound wave, the air is alternately compressed and rarefied in the same direction as the direction of motion (See Diagram 1). However, we can represent the air wave as a graph of pressure against time. In such a graph we plot pressure on the vertical axis and time on the horizontal axis, so our representation ends up looking just like a lateral wave! (See Diagram 1).
[DIAGRAM 1: SOUND WAVES. Sound waves are patterns of compression and rarefaction of the air, in a direction parallel to the direction of motion of the wave. TRANSVERSE WAVES: waves on the surface of a pond involve motion of the water in a direction perpendicular to the direction of motion of the waves. The representation of a sound wave as a graph of pressure against time looks like a transverse wave.]

[DIAGRAM 2: An aneroid barometer stack expands and contracts as air pressure varies over time; the point of a pen traces out similar movements, which are recorded as a (spatial) trace on a regularly rotating drum. Companion sketches show the analogous traces captured via microphone on lacquer and vinyl discs and on magnetic tape.]
Such waves are, of their very nature, ephemeral. Without some means to 'halt time' and to capture their
form out of time, we cannot begin to manipulate them. Before the Twentieth Century this was
technologically impossible. Music was reproducible through the continuity of instrument design and
performance practice and the medium of the score, essentially a set of instructions for producing sounds
anew on known instruments with known techniques.
The trick involved in capturing such ephemeral phenomena is the conversion of time information into spatial information. A simple device which does this is the chart-recorder, where a needle traces a graph of some time-varying quantity on a regularly rotating drum. (See Diagram 2).
And in fact the original phonograph recorder used exactly this principle, first converting the movement of air into the similar (analogue) movements of a tiny needle, and then using this needle to scratch a pattern on a regularly rotating drum. In this way, a pattern of pressure in time is converted to a physical shape in space. The intrinsically ephemeral had been captured in a physical medium.
This process of creating a spatial analogue of a temporal event was at the root of all sound-recording before the arrival of the computer and is still an essential part of the chain in the capture of the phenomenon of sound. The fundamental idea here is of an analogue. When we work on sound we in fact work on a spatial or electrical or numerical analogue, a very precise copy of the form of the sound-waves in an altogether different medium.
The arrival of electrical power contributed greatly to the evolution of an Art of Sound itself. First of all, reliable electric motors ensured that the time-varying wave could be captured and reproduced reliably, as the rotating devices used could be guaranteed to run at a constant speed. More importantly, electricity itself proved to be the ideal medium in which to create an analogue of the air pressure-waves. Sound waves may be converted into electrical waves by a microphone or other transducer. Electrical waves are variations of electrical voltage with time - analogues of sound waves. Such electrical analogues may also be created using electrical oscillators, in an (analogue) synthesizer.
Such electrical patterns are, however, equally ephemeral. They must still be converted into a spatial analogue in some physical medium if we are to hold and manipulate sound-phenomena out of time. In the past this might be the shape of a groove in rigid plastic (vinyl discs) or the variation of magnetism along a tape (analogue tape recorder). By running the physical substance past a pickup (the needle of a record player, the head of a tape recorder) at a regular speed, the position information was reconverted to temporal information and the electrical wave was recreated. The final link in the chain is a device to convert electrical waves back into sound waves. This is the loudspeaker. (See Diagram 3).
Note that in all these cases we are attempting to preserve the shape, the form, of the original pressure
wave, though in an entirely different medium.
The digital representation of sound takes one further step which gives us ultimate and precise control over the very substance of these ephemeral phenomena. Using a device known as an A to D (analogue to digital) converter, the pattern of electrical fluctuation is converted into a pattern of numbers. (See Appendix p1). The individual numbers are known as samples (not to be confused with chunks of recorded sounds, recorded in digital samplers, also often referred to as 'samples'), and each one represents the instantaneous (almost!) value of the pressure in the original air pressure wave.

[DIAGRAM 3: The recording and playback chain: sound source, air pressure wave, microphone, electrical signal, A-to-D conversion to digital samples (or storage on disc or magnetic tape), electrical signal, loudspeaker, sound pressure wave, ear.]
We now arrive at an essentially abstract representation of sonic substance for, although these numbers are normally stored in the spatial medium of magnetic domains on a computer disk, or pits burned in a CD, there need be no simple relationship between their original temporal order and the physical-spatial order in which they are stored. Typically, the contents of a file on a hard disk are scattered over the disk according to the availability of storage space. What the computer can do however is to re-present to us, in their original order, the numbers which represent the sound.

Hence what was once essentially physical and temporally ephemeral has become abstract. We can even represent the sound by writing a list of sample values (numbers) on paper, providing we specify the sample rate (how many samples occur, at a regular rate, in each second), though this would take an inordinately long time to achieve in practice.
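As a tiny illustration of that remark (assuming numpy, and an assumed sample rate of 44100 samples per second): the first few of the 44100 numbers that would represent one second of a 440 Hz sine tone.

```python
import numpy as np

SAMPLE_RATE = 44100                         # samples per second (an assumed, typical value)
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE    # one second of sample times
samples = np.sin(2 * np.pi * 440.0 * t)     # instantaneous pressure values of a 440 Hz tone

print(samples[:8])                          # only the first eight of 44100 numbers
```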
It is interesting to compare this new musical abstraction with the abstraction involved in traditional music notational practice. Traditional notation was an abstraction of general and easily quantifiable large-scale properties of sound events (or performance gestures, like some ornaments), represented in written scores. This abstracting process involves enormous compromises in that it can only deal accurately with finite sets of definable phenomena and depends on existing musical assumptions about performance practice and instrument technology to supply the missing information (i.e. most of it!) (See the discussion in On Sonic Art). In contrast, the numerical abstraction involved in digital recording leaves nothing to the imagination or foreknowledge of the musician, but consequently conveys no abstracted information on the macro level.
THE REPRESENTATION OF SOUND: THE DUALITY OF TIME AND FREQUENCY
So far our discussion has focused on the representation of the actual wave-movement of the air by physical and digital analogues. However, there is an alternative way to think about and to represent a sound. Let us assume to begin with that we have a sound which is not changing in quality through time. If we look at the pressure wave of this sound it will be found to repeat its pattern regularly (See Appendix p3).
However, perceptually we are more interested in the perceived properties of this sound. Is it pitched or noisy (or both)? Can we perceive pitches within it? etc. In fact the feature underlying these perceived properties of a static sound is the disposition of its partials. These may be conceived of as the simpler vibrations from which the actual complex waveform is constructed. It was Fourier (an Eighteenth Century French mathematician investigating heat distribution in solid objects) who realised that any particular wave-shape (or, mathematically, any function) can be recreated by summing together elementary sinusoidal waves of the correct frequency and loudness (Appendix p2). In acoustics these are known as the partials of the sound. (Appendix p3).
If we have a regular repeating wave-pattern we can therefore also represent it by plotting the frequency and loudness of its partials, a plot known as the spectrum of the sound. We can often deduce perceptual information about the sound from this data. The pattern of partials will determine if the sound will be perceived as having a single pitch or several pitches (like a bell) or even (with singly pitched sounds) what its pitch might be.
The spectral pattern which produces a singly pitched sound has (in mosl cases) a particularly simple
structure. and is said to be "harmonic". Thus should not to be confused with the traditional notion of
I
I
15
2 ,
f{A.rIIlony in EUropean music. Thus we use double capitalisation for the latter. and none for the former.
(for a fuller discussion. see Chapter 2). The bell spectrum. in contrast. is known as an irtharmonic
spectrum.
It is important to note that this new representation of sound is out-of-time. We have converted the temporal information in our original representation to frequency information in our new representation. These two representations of sound are known as the time-domain representation and the frequency-domain representation respectively. The mathematical technique which allows us to convert from one to the other is known as the Fourier transform (and, to convert back again, the inverse Fourier transform), which is often implemented on computers in a highly efficient algorithm called the Fast Fourier Transform (FFT).
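A minimal sketch of this duality (Python with numpy, purely illustrative): the FFT carries the time-domain samples into the frequency domain, and the inverse FFT recovers the original waveform:

    import numpy as np

    sr = 44100                                   # sample rate
    t = np.arange(1024) / sr
    wave = np.sin(2*np.pi*440*t) + 0.5*np.sin(2*np.pi*880*t)   # time-domain samples

    spectrum = np.fft.rfft(wave)                 # forward FFT: frequency-domain channels
    freqs = np.fft.rfftfreq(len(wave), 1/sr)     # centre frequency of each channel
    loudest = freqs[np.abs(spectrum).argmax()]   # the channel nearest 440 Hz dominates

    restored = np.fft.irfft(spectrum)            # inverse FFT: back to the time domain
    assert np.allclose(restored, wave)           # nothing is lost in the round trip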
If we now wish to represent the spectrum of a sound which varies in time (i.e. any sound of musical interest and certainly any naturally occurring sound) we must divide the sound into tiny time-snapshots (like the frames of a film) known as windows. A sequence of these windows will show us how the spectrum evolves with time (the phase vocoder does just this; see Appendix p11).
Note also that a very tiny fragment of the time-domain representation (a fragment shorter than a wavecycle), although it gives us accurate information about the time-variation of the pressure wave, gives us no information about frequency. Converting it to frequency data with the FFT will produce energy all over the spectrum. Listening to it we will hear only a click (whatever the source).
Conversely, the frequency-domain representation gives us more precise information about the frequency of the partials in a sound the larger the time window used to calculate it. But in enlarging the window, we track less accurately how the sound changes in time. This trade-off between temporal information and frequency information is identical to the quantum mechanical principle of indeterminacy, where the time and energy (or position and momentum) of a particle/wave cannot both be known with accuracy - we trade off our knowledge of one against our knowledge of the other.
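The trade-off can be seen directly in the arithmetic (a sketch only, tied to no particular analysis program): for a given sample rate, the frequency resolution of a window is the sample rate divided by the window length, while its time resolution is the window length divided by the sample rate:

    sr = 44100                                   # samples per second
    for window_len in (256, 1024, 4096):         # samples per analysis window
        time_res = window_len / sr               # how finely we can track change in time
        freq_res = sr / window_len               # spacing between analysis channels
        print(f"{window_len:5d} samples: {time_res*1000:6.1f} ms window, "
              f"{freq_res:6.1f} Hz between channels")
    # enlarging the window sharpens frequency detail but blurs temporal detail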
TIME FRAMES: SAMPLES, WAVE-CYCLES, GRAINS AND CONTINUATIONS.
In working with sound materials, we quickly become aware of the different time-frames involved and their perceptual consequences. Stockhausen, in the article How Time Passes (published in Die Reihe), argued for the unity of formal, rhythmic and acoustic time-frames as the rationale for his composition Gruppen. This was a fruitful stance to adopt as far as Gruppen was concerned and fits well with the 'unity' mysticism which pervades much musical score analysis and commentary, but it does not tally with aural experience.
Extremely short time frames of the order of 0.0001 seconds have no perceptual significance at all. Each sample in the digital representation of a waveform corresponds to a time less than 0.0001 seconds. Although every digitally recorded sound is made out of nothing but samples, the individual sample can tell us nothing about the sound of which it is a part. Each sample, if heard individually, would be a broad-band click of a certain loudness. Samples are akin to the quarks of subatomic particle theory, essential to the existence and structure of matter, but not separable from the particles (protons and neutrons) they constitute.
The first significant object from a musical point of view is a shape made out of samples, and in particular a wavecycle (a single wavelength of a sound). These may be regarded as the atomic units of sound. The shape and duration of the wavecycle will help to determine the properties (the spectrum and pitch) of the sound of which it is a part. But a single wavecycle is not sufficient on its own to determine these properties. As pitch depends on frequency, the number of times per second a waveform is repeated, a single wavecycle supplies no frequency information. Not until we have about six wavecycles do we begin to associate a specific pitch with the sound. Hence there is a crucial perceptual boundary below which sounds appear as more or less undifferentiated clicks, regardless of their internal form, and above which we begin to assign specific properties (frequency, pitch/noise, spectrum etc) to the sound (for a fuller discussion of this point see On Sonic Art).
This perceptual boundary is nicely illustrated by the process of waveset time-stretching. This process lengthens a sound by repeating each waveset (a waveset is akin to a wavecycle, but not exactly the same thing; for more details see Appendix p55). With noisy sources, the wavesets vary widely from one to another but we hear only the net result, a sound of indefinite pitch. Repeating each waveset lengthens the sound and introduces some artefacts. If we repeat each waveset five or six times however, each one is present long enough to establish a specific pitch and spectral quality and the original source begins to transform into a rapid stream of pitched beads. (Sound example 1.0).
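A sketch of waveset repetition (Python with numpy, purely illustrative; a 'waveset' is simplified here to the span between successive upward zero-crossings, which is not exactly the definition given in the Appendix):

    import numpy as np

    def waveset_repeat(signal, repeats):
        """Lengthen a sound by repeating each waveset 'repeats' times."""
        neg = np.signbit(signal)
        ups = np.where(neg[:-1] & ~neg[1:])[0] + 1      # upward zero-crossings
        out = [signal[:ups[0]]] if len(ups) else [signal]
        for a, b in zip(ups[:-1], ups[1:]):
            out.extend([signal[a:b]] * repeats)         # each waveset heard 'repeats' times
        if len(ups):
            out.append(signal[ups[-1]:])
        return np.concatenate(out)

    # repeating each waveset five or six times lets each one last long enough
    # to register a pitch: the noisy source becomes a stream of pitched beads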
Once we can perceive distinctive qualitative characteristics in a sound, we have a grain. The boundary
between the wavecycle time frame and the grain time-frame is of great importance in instrument
design. For example, imagine we wished to separate the grains in a sound (like a rolled "R") by
examining the loudness trajectory of the sound. Intuitively we can say that the grains are the loud part
of the signal, and the points between grains the quietest parts. If we set up an instrument which waits
for the signal level to drop below a certain value (a threshold) and then cuts out the sound (gating), we
should be able to separate the individual grains.
However, on reflection, we see that this procedure will not work. The instantaneous level of a sound
signal constantly varies from positive to negative so, at least twice in every wavecycle, it will fall below
the threshold and our procedure will chop the signal into its constituent half wavecycles or smaller units
(see Diagram 4) - not what we intended. What we must ask the instrument to do is search for a point
in the signal where the signal stays below the (absolute) gate value for a significant length of time.
This time is at least of grain time-frame proportions. (See Diagram 5).
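A sketch of that windowed gating strategy (Python with numpy; the function and parameter names are illustrative, not those of a CDP instrument):

    import numpy as np

    def find_grains(signal, sr, gate=0.05, window_ms=10.0):
        """Return (start, end) sample indices of grains: stretches where the
        absolute maximum of each short window stays above the gate level."""
        win = max(1, int(sr * window_ms / 1000))
        loud = [np.abs(signal[i*win:(i+1)*win]).max() > gate
                for i in range(len(signal) // win)]
        grains, start = [], None
        for i, is_loud in enumerate(loud + [False]):    # sentinel closes a final grain
            if is_loud and start is None:
                start = i * win
            elif not is_loud and start is not None:
                grains.append((start, i * win))
                start = None
        return grains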
A Grain differs from any larger structure in that we cannot perceive any resolvable internal structure. The sound presents itself to us as an indivisible unit with definite qualities such as pitch, spectral contour, onset characteristics (hard-edged, soft-edged), pitchy/noisy/gritty quality etc. Often the grain is characterised by a unique cluster of properties which we would be hard pressed to classify individually but which enables us to group it in a particular type e.g. unvoiced "k", "t", "p", "d". Similarly, the spectral and pitch characteristics may not be easy to pin down, e.g. certain drums have a focused spectrum which we would expect from a pitched sound (they definitely don't have noise spectra as in a hi-hat), yet no particular pitch may be discernible. Analysis of such sounds may reveal either a very short inharmonic spectrum (which has insufficient time to register as several pitches, as we might hear out in an inharmonic bell sound), or a rapidly portamentoing pitched spectrum. Although the internal structure of such sounds is the cause of what we hear, we do not resolve this internal structure in our perception. The experience of a grain is indivisible.
DIAGRAM 4
Set the sound to zero wherever its absolute value falls below the level of the gate.
DIAGRAM 5
Divide up the sound into finite length windows, and use the absolute maximum level in each window to decide where the gate closes.
The internal structure of grains and their indivisibility was brought home to me through working with birdsong. A particular song consisted of a rapid repeated sound having a "bubbly" quality. One might presume from this that the individual sounds were internally portamentoed at a rate too fast to be heard. In fact, when slowed down to 1/8th speed, each sound was found to be comprised of a rising scale passage followed by a brief portamento!
Longer sound events can often be described in terms of an onset or attack event and a continuation. The onset usually has the timescale and hence the indivisibility and qualitative unity of a grain and we will return to this later. But if the sound persists beyond a new time limit (around 0.05 seconds) we have enough information to detect its temporal evolution: we become aware of movements of pitch or loudness, or evolution of the spectrum. The sound is no longer an indivisible grain: we have reached the sphere of Continuation.
This is the next important time-frame after Grain. It has great significance in the processing of sounds. For example, in the technique known as brassage, we chop up a sound into tiny segments and then splice these back together again. If we retain the order of the segments, using overlapping segments from the original sound but not overlapping them (so much) in the resulting sound, we will clearly end up with a longer sound (Appendix p44).
If we try to make segments smaller than the grain-size, we will destroy the signal because the splices (cross-fading) between each segment will be so short as to break up the continuity of the source and destroy the signal characteristics. For example, attempting to time-stretch by a factor of 2 we will in fact resplice together parts of the waveform itself, to make a waveform twice as long, and our sound will drop by an octave, as in tape-speed variation. If the segments are in the grain time-frame, the instantaneous pitch will be preserved, as the waveform itself will be preserved intact, and we should achieve a time-stretching of the sound without changing its pitch (the harmoniser algorithm). If the segments are longer than grains, their internal structure will be heard out and we will begin to notice echo effects as the perceived continuations are heard to be repeated. Eventually, we will produce a collage of motifs or phrases cut from the original material, as the segment size becomes very much larger than the grain time-frame. (Sound example 1.1).
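A sketch of brassage time-stretching under these constraints (Python with numpy; grain-sized segments are read from overlapping positions in the source and spliced, with crossfades, at normal spacing in the output; all names are illustrative):

    import numpy as np

    def brassage_stretch(signal, sr, stretch=2.0, seg_ms=50.0):
        """Time-stretch by resplicing grain-sized segments of the source."""
        seg = int(sr * seg_ms / 1000)           # segment length, in the grain time-frame
        hop_out = seg // 2                      # spacing of segments in the output
        hop_in = hop_out / stretch              # read the source more slowly than we write
        n_segs = int((len(signal) - seg) / hop_in)
        out = np.zeros(n_segs * hop_out + seg)
        fade = np.hanning(seg)                  # the splice (crossfade) envelope
        for k in range(n_segs):
            src = int(k * hop_in)
            out[k*hop_out : k*hop_out + seg] += signal[src:src+seg] * fade
        return out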
The grain/continuation time-frame boundary is also of crucial importance when a sound is being time-stretched and this will be discussed more fully in Chapter 11.
The boundaries between these time-frames (wavecycle, grain, continuation) are not, of course, completely clear cut and interesting perceptual ambiguities occur if we alter the parameters of a process so that it crosses these time-frame thresholds. In the simplest case, gradually time-stretching a grain makes its internal structure progressively apparent (see Chapter 11), so we can pass from an indivisible qualitative event to an event with a clearly evolving structure of its own. Conversely, a sound with clear internal structure can be time-contracted into a structurally irresolvable grain. (Sound example 1.2).
Once we reach the sphere of continuation, perceptual descriptions become more involved and perceptual boundaries, as our time frame enlarges, less clear cut. If the spectrum has continuity, perception of continuation may be concerned with the morphology (changing shape) of the spectrum, with the articulation of pitch (vibrato, jitter), loudness (tremolo), spectral contour or formant gliding (see Chapter 3) etc, etc. The continuation may, however, be discontinuous as in iterative sounds (grain-streams such as rolled "R", and low contrabassoon notes which are perceived partly as a rapid sequence of onsets) or segmented sounds (see below), or granular texture streams (e.g. maracas with coarse pellets) where the sound is clearly made up of many individual randomised attacks. Here, variation in grain or segment speed or density and grain or segment qualities will also contribute to our aural experience of continuation.
As our time-frame lengthens, we reach the sphere of the Phrase. Just as in traditional musical practice,
the boundary between a long articulation and a short phrase is not easy to draw. This is because we
are no longer dealing with clear cut perceptual boundaries, but questions of the interpretation of our
experience. A trill, without variation, lasting over four bars may be regarded as a note-articulation (an
example of continuation) and may exceed in length a melodic phrase. But a trill with a marked series of
loudness and speed changes might well function as a musical phrase (depending on context). (Sound
example 1.3).
A similar ambiguity applies in the sound domain with an added difficulty. Whereas it will usually be quite clear what is a note event and what is a phrase (ornaments and trills blurring this distinction), a sound event can be arbitrarily complex. We might, for example, start with a spoken sentence, a phrase-time-frame object, then time-shrink it to become a sound of segmented morphology (see Chapter 11). As in traditional musical practice, the recognition of a phrase as such will depend on musical context, and the fluidity of our new medium will allow us to shrink, or expand, from one time-frame to another.
A similar ambiguity applies as we pass further up the time-frame ladder towards larger scale musical
entities. We can however construct a series of nested time-frames up to the level of the duration of an
entire work. These nested time-frames are the basis of our perception of both rhythm and larger scale
form and this is more fully discussed in Chapter 9.
THE SOUND AS A WHOLE - PHYSICALITY AND CAUSALITY
Most sounds longer than a grain can be described in terms of an onset and a continuation. A detailed discussion of the typology of sounds can be found in On Sonic Art and in the writings of the Groupe de Recherches Musicales. Here, I would like to draw attention to two aspects of our aural experience.
The way in which a sound is attacked and continues provides evidence of the physicality of its origin. In the case of transformed or synthesized sounds, this evidence will be misleading in actuality, but we still gain an impression of an imagined origin of the sound.
It is important to bear this distinction in mind. As Pierre Schaeffer was at pains to stress, once we begin working with sounds as our medium the actual origin of those sounds is no longer of any concern. This is particularly true in the era of computer sound transformation. However, the apparent origin (or physicality) of the sound remains an important factor in our perception of the sound in whatever way it is derived.
We may look at this in another way. With the power of the computer, we can transform sounds in such radical ways that we can no longer assert that the goal sound is related to the source sound merely because we have derived one from the other. We have to establish a connection in the experience of the listener, either through clear spectral, morphological or other similarities between the sounds, or a clear path through a series of connecting sounds which gradually change their characteristics from those of the source to those of the goal. This is particularly true when the apparent origin (physicality) of the goal sound is quite different to that of the source sound.
Thus, for example, we may spectrally time-stretch and change the loudness trajectory (enveloping) of a vocal sound, producing wood-like attacks, which are then progressively distorted to sound like unpitched drum sounds. (Sound example 1.4).
In general, sounds may be activated in two ways - by a single physical event (e.g. a striking blow), or by a continuous physical event (blowing, bowing, scraping). In the first case, the sound may be internally damped, producing a short sound, or grain - a xylophone note, a drum-stroke, a vocal click. It may be short, but permitted to resonate through an associated physical system, e.g. a cello soundbox for a pizzicato note, a resonant hall for a drum stroke. Or the material itself may have internal resonating properties (bells, gongs, metal tubes) producing a gradually attenuated continuum of sound.
In the case of continuous excitation of a medium, the medium may resonate, producing a steady pitch which varies in loudness with the energy of the excitation, e.g. flute, tuba, violin. The medium may vibrate at a frequency related to the excitation force (e.g. a rotor-driven siren, or the human voice in some circumstances) so that a varying excitation force varies the pitch. Or the contact between exciting force and vibrating medium may be discontinuous, producing an iterated sound (rolled "R", drum roll etc).
The vibrating medium itself may be elastically mobile - a flexatone, a flexed metal sheet, a mammalian larynx - so that the pitch or spectrum of the sound varies through time. The material may be only gently coaxed into motion (the air in a "Bloogle", the shells in a Rainmaker) giving the sound a soft onset, or the material may be loosely bound and granular (the sand or beads in a Shaker or Wind-machine) giving the sound a diffuse continuation. Resonating systems will stabilise after a variety of transient or unstable initiating events (flute breathiness, coarse hammer blows to a metal sheet) so that a transient, disconnected onset leads to a stable or stably evolving spectrum.
I am not suggesting that we consciously analyse our aural experience in this way. On the contrary,
aural experience is so important to us that we already have an intuitive knowledge (see earlier) of the
physicality of sound-sources. I also do not mean that we see pictures of physical objects when we
hear sounds, only that our aural experience is grounded in physical experience in a way which is not
necessarily consciously understood or articulated. Transforming the characteristics of a sound-source
automatically involves transforming its perceived physicality and this may be an important feature to
bear in mind in sound composition.
In a similar and not easily disentangled way, the onset (or attack) properties of a sound give us some inkling of the cause of that sound - a physical blow, a scraping contact, a movement, a vocal utterance. The onset or attack of a sound is always of great significance if only because it is the moment of greatest surprise when we know nothing about the sound that is to evolve, whereas during the continuation phase of the sound, we are articulating what the onset has revealed. It is possible to give the most unlikely sounds an apparent vocal provenance by very carefully splicing a vocal onset onto a non-vocal continuation. The vocal "causality" in the onset can adhere to the ensuing sound in the most unlikely cases.
STABILITY & MOTION: REFERENCE FRAMES
In general, any property of a sound (pitch, loudness, spectral contour etc) may be (relatively) stable, or it may be in motion (portamento, crescendo, opening of a filter, etc.) Furthermore, motion may itself be in motion, e.g. cyclically varying pitch (vibrato) may be accelerating in cycle-duration, while shrinking in pitch-depth. Cyclical motions of various kinds (tremolo, vibrato, spectral vibrato, etc.) are often regarded as stable properties of a sound.
We may be concerned with stable properties, or with the nature of motion itself, and these two aspects of sound properties are usually perceptually distinguishable. Their separation is exaggerated in Western Art Music by the use of a discontinuous notation system which notates static properties well, but moving properties much less precisely (for a fuller discussion see On Sonic Art).
Furthermore, our experiences of sonic change are often fleeting and only inexactly reproducible. We can, for example, reproduce the direction, duration and range of a rising portamento and, in practice, we can differentiate a start-weighted from an end-weighted portamento. (See Diagram 6).
In many non-Western cultures, subtle control of such distinctions (portamento type, vibrato speed and depth etc) are required skills of the performance practice. But the reproduction of a complex portamento trajectory (see Diagram 7) with any precision would be difficult merely from immediate memory. The difficulty of immediate reproducibility makes repetition in performance very difficult and therefore acquaintance and knowledge through familiarity impossible to acquire.
With computer technology, however, complex, time-varying properties of a sound-event can be tied down with precision, reproduced and transferred to other sounds. For the first time, we have sophisticated control over sonic motion in many dimensions (pitch, loudness, spectral contour or formant evolution, spectral harmonicity/inharmonicity etc. etc.) and can begin to develop a discipline of motion itself (see Chapter 13).
In sound composition, the entire continuum of sound possibilities is open to us, and types of motion are as accessible as static states. But in our perception of our sound universe there is another factor to be considered, that of the reference-frame. We may choose to (it is not inevitable!) organise a sound property in relation to a set of reference values. These provide reference points, or nameable foci, in the continuum of sonic possibilities. Thus, for example, any particular language provides a phonetic reference-frame distinguishing those vowels and consonant types to be regarded as different (and hence capable of articulating different meanings) within the continuum of possibilities. These distinctions may be subtle ("D" and "T" in English) and are often problematic for the non-native speaker (English "L" and "R" for a Japanese speaker).
Usually, these reference frames refer to stable properties of the sonic space but this is not universal. Consonants like "W" and "Y" (in English) and various vowel diphthongs are in fact defined by the motion of their spectral contours (formants: see Appendix). But in general, reference frames for motion types are not so well understood and certainly do not feature strongly in traditional Western art music practice. Nevertheless, we are equipped in the sphere of language perception, at the level of the phoneme ("w", "y", etc. and tone languages like Chinese) and of "tone-of-voice", to make very subtle distinctions of pitch motion and formant motion. They are vital to our comprehension at the phonemic and semantic level. And in fact, sounds with moving spectral contours tend to be classified alongside sounds with stable spectral contours in the phonetic classification system of any particular language.
DIAGRAM 6
Start-weighted portamento; end-weighted portamento.

DIAGRAM 7
Complex portamento trajectory.

DIAGRAM 8
The same pattern of notes placed on the semitone ladder, against the MINOR SCALE and the MAJOR SCALE. The pattern only fits against the minor scale semitone template at one place: we can therefore identify the 2nd note as the minor 3rd of the scale. The pattern only fits against the major scale semitone template at one place: we can therefore identify the 2nd note as the tonic of the scale.
We can see the same mixed classification in the Indian raga system where a raga is often defined in terms of both a scale (of static pitch values) and various motivic figures often involving sliding intonations (motion-types). In general, in the case of pitch reference-frames, values lying off the reference frame may only be used as ornamental or articulatory features of the musical stream. They have a different status to that of pitches on the reference frame (cf. "blue notes" in jazz, certain kinds of baroque ornamentation etc).
A reference frame gives a structure to a sonic space enabling us to say "I am here" and not somewhere else. But pitch reference frames have special properties. Because we normally accept the idea of octave equivalence (doubling the frequency of a pitched sound produces the "same" pitch, one octave higher), pitch reference frames are cyclic, repeating themselves at each octave over the audible range. In contrast, in the vowel space, we have only one "octave" of possibilities but we are still able to recognise, without reference to other vowel sounds, where we are in the vowel space. We have a sense of absolute position.
People with perfect pitch can deal with pitch space in this absolute sense, but for most of us, we have only some general conception of high and low. We can, however, determine our position relative to a given pitch, using the notion of interval. If the octave is divided by a mode or scale into an asymmetrical set of intervals, we can tell where we are from a small set of notes without hearing the key note, because the interval relationships between the notes orient us within the set-of-intervals making up the scale. We cannot do this trick, however, with a completely symmetrical scale (whole-tone scale, chromatic scale) without some additional clues (See Diagram 8).
Cyclic repetition over the domain of reference and the notion of interval are specific to pitch and time reference-frames. However, time reference frames which enter into our perception of rhythm are particularly contentious in musical circles and I will postpone further discussion of these until Chapter 9.
Traditional Western music practice is strongly wedded to pitch reference frames. In fact on many instruments (keyboards, fretted strings) they are difficult to escape. However, in sonic composition we can work ....
(a) with static properties on a reference frame.
(b) with static properties without a reference frame.
(c) with properties of motion.
Furthermore, we can transform the reference frame itself, through time, or move on to, and away from, such a frame. Computer control permits the very precise exploration of this area of new possibilities.
It is particularly important to understand that we can have pitch, and even stable pitch, without having a stable reference frame and hence without having a HArmonic sense of pitch in the traditional sense. (See Diagram 9). We can even imagine establishing a reference frame for pitch-motion types without having a HArmonic frame - this we already find in tone languages in China and Africa.
DIAGRAM 8 (continued)
The same pattern of notes placed on the semitone ladder against the WHOLE-TONE SCALE. The pattern fits against the whole-tone scale in many places. We cannot therefore orient ourselves within the scale.
DIAGRAM 9
Stable pitches defining a reference frame; stable pitches not defining a reference frame (every pitch is different).
THE MANIPULATION OF SOUND: TIME-VARYING PROCESSING
Just as I have been at pains to stress that sound events have several time-varying properties, it is important to understand that the compositional processes we apply to sounds may also vary in time in subtle and complex ways. In general, any parameter we give to a process (pitch, speed, loudness, filter centre, density etc.) should be replaceable by a time-varying quantity and this quantity should be continuously variable over the shortest time-scales.
In a tradition dominated by the notation of fixed values (pitch, loudness level, duration etc), it is easy to imagine that processes themselves must have fixed values. In fact, of course, the apparently fixed values we dictate in a notation system are turned into subtly varying events by musical performers. Where this is not the case (e.g. quantised sequencer music on elementary synthesis modules) we quickly tire of the musical sterility.
The same fixed-value conception was transferred into acoustic modelling and dominated early synthesis experiments. And for this reason, early synthesized sounds governed by fixed values (or simply-changing values) suffered from a lack of musical vitality. In a similar fashion an "effects unit" used as an all-or-nothing process on a voice or instrument is simply that, an "effect" which may occasionally be appropriate (to create a particular general acoustic ambience, for example) but will not sustain our interest as a compositional feature until we begin to articulate its temporal evolution.
A time-varying process can, on the other hand, effect a radical transformation within a sound, e.g. a gradually increasing (but subtly varying) vibrato, applied to a fairly static sound, gradually giving it a vocal quality; a gradual spectral stretching of a long sound causing its pitch to split into an inharmonic stream; and so on. Transforming a sound does not necessarily mean changing it in an all-or-nothing way. Dynamic spectral interpolation in particular (see Chapter 12) is a process which depends on the subtle use of time-varying parameters.
THE MANIPULATION OF SOUND: RELATIONSHIP TO THE SOURCE
The infinite malleability of sound materials raises another significant musical issue, discussed by Alain Savouret at the International Computer Music Conference at IRCAM in 1985. As we can do anything to a signal, we must decide by listening whether the source sound and the goal sound of a compositional process are in fact at all perceptually related, or at least whether we can define a perceptible route from one to the other through a sequence of intermediate sounds. In score-based music there is a tradition of claiming that a transformation which can be explained, and whose results can be seen in a score, by definition defines a musical relationship. This mediaeval textual idealism is out of the question in the new musical domain.
An instrument which replaces every half-waveset by a single pulse equal in amplitude to the amplitude of the half-waveset is a perfectly well-defined transformation, but reduces most sounds to a uniform crackling noise. Nevertheless, for complex real-world sounds, each crackling signal is unique at the sample level and uniquely related to its source sound. Yet, because that relationship cannot be heard, no musical relationship has been established between the source sound and the transformed sound.
In sound composition a relationship between two sounds is established only through aurally perceptible similarity or relatedness, regardless of the methodological rigour of the process which transforms one sound into another.
Another important aspect of Savouret's talk was to distinguish between source-focused transformations (where the nature of the resulting sound is strongly related to the input sound, e.g. time-stretching of a signal with a stable spectrum, retaining the onset unstretched) and process-focused transformations (where the nature of the resulting sound is more strongly determined by the transformation process itself, e.g. using very short time digital delay of a signal, superimposed on the non-delayed signal, to produce delay-time related pitched ringing). There is, of course, an interesting area of ambiguity between the two extremes.
In general, process-focused transformations need to be used sparingly. Often when a new compositional technique emerges, e.g. pitch parallelism via the harmoniser, there is an initial rush of excitement to explore the new sound possibilities. But such process-focused transformations can rapidly become clichés.
Transformations focused in the source, however, retain the same infinite potential that the infinity of natural sound sources offer us. Sound-processing procedures which are sensitive to the evolving properties (pitch, loudness, spectral form and contour etc.) of the source-sound are those most likely to bring rich musical rewards.
CHAPTER 2
PITCH
WHAT IS PITCH?
Certain sounds appear to possess a clear quality we call pitch. Whereas a cymbal or a snare-drum has no such property and a bell may appear to contain several pitches, a flute, violin or trumpet when played in the conventional way produces sounds that have the quality "pitch".
Pitch arises when the partials in the spectrum of the sound are in a particularly simple relation to one another, i.e. they are all whole number multiples of some fixed value, known as the fundamental. In mathematical terms, the fundamental frequency is the highest common factor (HCF) of these partials. When a spectrum has this structure it is said to be harmonic, and the individual partials are known as harmonics. But this must not be confused with the notion of "harmony" in traditional music. What happens when the partials are not in this simple relationship is discussed in the next Chapter. Thus the numbers 200, 300, 400, 500 are all whole number multiples of 100, which is their fundamental. The frequency of this fundamental determines the "height" or "value" of the pitch we hear.
In most cases, the fundamental frequency of such a sound is present in the sound as the frequency of the lowest partial, but this is not necessarily true (e.g. the lowest notes of the piano do not contain any partial whose frequency corresponds to the perceived pitch). It is important to understand, therefore, that the perceived pitch is a mental construct from a harmonic spectrum, and not simply a matter of directly perceiving a fundamental frequency in a spectrum. Such a frequency may not be physically present.
The most important feature of pitch perception is that the spectrum appears to fuse into a unitary percept, that of pitch, with a certain spectral quality or "timbre". This fusion is best illustrated by undoing it.
For example, if we play a (synthesized) voice sound by placing the odd harmonics on the left loudspeaker and the even harmonics on the right loudspeaker, we will hear one vocal sound between the two loudspeakers. This happens because of a phenomenon known as aural streaming. When sounds from two different sources enter our ears simultaneously we need some mechanism to disentangle the partials belonging to one sound from those belonging to the other. One way in which the ear is able to process the data relies on the micro-instabilities (jitter) in pitch, or tessitura, and loudness which all naturally occurring sounds exhibit. The partials derived from one sound will all jitter in parallel with one another, while those from the other sound will jitter differently but also in parallel with one another. This provides a strong clue for our brain to assign any particular partial to one or other of the source sounds. In our loudspeaker experiment, however, we have removed this clue by maintaining synchronicity of microfluctuations between the partials coming from the two loudspeakers. Hence the ear does not unscramble the data into two separate sources. The voice remains a single percept. (Sound example 2.1).
If, now, we gradually add a different vibrato to each set of partials, the sound image will split. The ear is now able to group the 2 sets of data into two aural streams and assign two different source sounds to what it hears. The odd harmonics, say 300, 500, 700, 900, will continue to imply a fundamental of 100 cycles but will take on a clarinet type quality (clarinets produce only the odd harmonics in the spectrum) and move into one of the loudspeakers. The remaining harmonics, 200, 400, 600, 800, will be interpreted as having a fundamental at 200 (as 200 is the HCF of 200, 400, 600 and 800) and hence a second "voice", an octave higher, will appear to emanate from the other loudspeaker. Hence, with no change of spectral content, we have generated 2 pitch percepts from a single pitch percept. (Sound example 2.2).
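A sketch of the experiment (Python with numpy; a plain synthetic tone stands in for the voice, and every name is illustrative): the odd harmonics go to one loudspeaker and the even harmonics to the other, first with a shared micro-instability only, then with a different vibrato added to each set:

    import numpy as np

    sr, f0 = 44100, 100.0
    t = np.arange(2 * sr) / sr
    jitter = 1 + 0.003 * np.cumsum(np.random.randn(len(t))) / sr   # shared micro-instability

    def harmonics(ns, extra_vibrato=0.0):
        vib = 1 + extra_vibrato * np.sin(2 * np.pi * 5 * t)        # per-stream vibrato
        phase = 2 * np.pi * f0 * np.cumsum(jitter * vib) / sr
        return sum(np.sin(n * phase) / n for n in ns)

    # parallel microfluctuations: heard as one voice between the loudspeakers
    fused = np.stack([harmonics(range(1, 16, 2)),                  # odd harmonics, left
                      harmonics(range(2, 16, 2))], axis=1)         # even harmonics, right

    # a different vibrato on each set: two streams, the even set implying 200 Hz
    split = np.stack([harmonics(range(1, 16, 2), 0.01),
                      harmonics(range(2, 16, 2), -0.01)], axis=1)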
SPECTRAL AND HARMONIC CONCEPTIONS OF PITCH
Our definition of pitch leads us into conflict both with traditional conceptions and traditional terminology. First of all, to say that a spectrum is harmonic is to say that the partials are exact multiples of the fundamental and this is the source of the perceptual fusion. Once this exact relationship is disturbed, this spectral fusion is disturbed (see next Chapter).
There are not different kinds of harmony in the spectrum. Most of the relationships we deal with between pitches in traditional Western "harmony" are between frequencies that do not stand in this simple relationship to one another (because our scale is tempered). They are approximations to whole number ratios which "work" in the functional context of Western harmonic musical language. But, more importantly, they are relationships between the averaged properties of sounds. An "A" and a "C" played on two flutes (or on one piano) are two distinct sound events, each having its own integrated spectrum. Each spectrum is integrated with itself because its internal microfluctuations run in parallel over all its own partials. But these microfluctuations are different to those in the other spectrum. Within a single spectrum, however, partials might roughly correspond to an A and a C (but in exact numerical proportions, unlike in the tempered scale) but will also have exactly parallel microfluctuations and hence fuse in our unitary perception of a much lower fundamental (e.g. an F 2 octaves below). We can draw analogies between these two domains (as some composers have done) but they are perceptually quite distinct.
To avoid confusion, we will try to reserve the words "HArmony" and "HArmonic" (capitalised as shown) to apply to the traditional concern with relations amongst notes in Western art music, and we will refer to a spectrum having pitch as a pitch-spectrum or as having harmonicity, rather than as a harmonic spectrum (which is the preferred scientific description). However, the term "harmonic" may occasionally be used in the spectral sense as a contrasting term to "inharmonic".
A second problem arises because a spectrum in motion may still preserve this simple relationship between its constituent partials as it moves. To put it simply, a portamento is pitched in the spectral sense. It is difficult to speak of a portamento as "having a pitch" in the sense of conventional harmony. This sense of "having a pitch", i.e. being able to assign a pitch to a specific class like E-flat or C#2, is quite a different concept from the timbral concept of pitch described here. We will therefore refer to that traditional concept as Hpitch, an abbreviation for pitch-as-related-to-conventional-harmony.
Perception of Hpitch depends on the existence of a frame of reference (see Chapter 1). Even with steady (non-portamento) pitches, we may still have no sense of Hpitch if the pitches are selected at random from the continuum of values, though often our cultural predispositions cause us to "force" the notes onto our preconceived notion of where they "ought" to be. In the sound example we hear first a set of truly random pitches, next a set of pitches on a HArmonic field, then a set of pitches approximable to a HArmonic field and finally the "same" set locked onto that HArmonic field. (Sound example 2.3).
We can, in fact, generate a perception of a single, if unstable, pitch from an event containing very many different pitches scattered over a narrow band. The narrower the band, the more clearly a single pitch is defined. (Sound example 2.4).
Once we confine our selection of pitches to a given reference frame (scale, mode, HArmonic field) we establish a clear sense of Hpitch for each event.
Returning now to portamenti, and considering rising portamenti, it is possible to relate such moving pitches to Hpitch if the portamenti have special properties. Thus, if they are onset-weighted and the onsets located in a fixed HArmonic field, we will assign Hpitch to the events, regarding the portamenti as ornaments or articulations to those Hpitches. (Sound example 2.5a). Similarly, end-focused portamenti (often heard in popular music singing styles), where the portamenti settle clearly onto values in a HArmonic field, will be perceived as anacrusial ornaments to Hpitches. (Sound example 2.5b). Portamenti with no such weighting, however, will not be perceived to be Hpitched. (Sound example 2.5c).
The same arguments apply to falling portamenti and, even more powerfully, to complexly moving portamenti. (Sound example 2.6). Nevertheless all these sounds are pitched in the spectral sense.
Using our new computer instruments it becomes possible to follow and extract the pitch from a sound event, but this does not necessarily (or usually) mean that we are assigning an Hpitch, or a set of Hpitches, to it. The pitch flow of a speech stream can be followed, extracted and applied to an entirely different sound object, establishing a definite relationship between the two without any conception of Hpitch or scale, mode or HArmonic field entering into our thinking, or our perception. This should be borne in mind while reading this Chapter as it is all too easy for musicians to slide imperceptibly into thinking of pitch as Hpitch!
A good example in traditional musical practice of pitch not treated as Hpitch can be found in Xenakis' Pithoprakta where portamenti are organised in statistical fields governed by Poisson's formula or in portamenti of portamenti, which themselves rise and fall without having any definite Hpitch. (See On Sonic Art). (Sound example 2.7).
In contrast, Paul de Marinis uses pitch extraction on natural speech recordings, subsequently reinforcing the
speech pitches on synthetic instruments so the speech appears to sing. (Sound example 2.8).
PITCH-TRACKING
It seems odd so early in this book to tackle what is perhaps one of the most difficult problems in instrument design. Whole treatises have been written on pitch-detection and its difficulties. The main problem for the instrument designer is that the human ear is remarkably good at pitch detection and even the best computing instruments do not quite match up to it. This being said, we can make a reasonably good attempt in most cases.
Working in the time domain, we will recognise pitch if we can find a cycle of the waveform (a wavelength) and then correlate that with similar cycles immediately following it. The pitch is then simply one divided by the wavelength. This is pitch-tracking by auto-correlation (Appendix p70).
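A sketch of auto-correlation pitch detection (Python with numpy; illustrative only, and far less robust than a real pitch-tracker):

    import numpy as np

    def pitch_autocorrelate(frame, sr, fmin=50.0, fmax=1000.0):
        """Find the lag at which the frame best correlates with itself
        (the wavelength, in samples); the pitch is then sr / lag."""
        frame = frame - frame.mean()
        ac = np.correlate(frame, frame, mode='full')[len(frame)-1:]
        lo, hi = int(sr / fmax), int(sr / fmin)          # plausible wavelengths only
        lag = lo + np.argmax(ac[lo:hi])
        return sr / lag

    sr = 44100
    t = np.arange(2048) / sr
    print(pitch_autocorrelate(np.sin(2 * np.pi * 220 * t), sr))    # approximately 220 Hz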
We can also attempt pitch-tracking by partial analysis (Appendix p71). Hence in the frequency domain we would expect to have a sequence of (time) windows, in which the most significant frequency information has been extracted in a number of channels (as in the phase vocoder). Provided we have many more channels than there are partials in the sound, we will expect that the partials of the sound have been separated into distinct channels. We must then separate the true partials from the other information by looking for the peaks in the data.
Then, as our previous discussion indicated, we must find the highest common factor of the frequencies of our partials, which will exist if our instantaneous spectrum is harmonic. Unfortunately, if we allow sufficiently small numbers to be used then, within a given limit of accuracy, any set of partial frequency values will have a highest common factor, e.g. partials at 100.3, 201, 307 and 513.5 have an HCF of 0.1. We must therefore reject absurdly small values. A good lower limit would be 16 cycles, the approximate lower limit of pitch perception in humans.
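A sketch of the HCF search over a set of detected partial frequencies (Python; this version demands exact multiples on a 0.1 Hz grid, where a real tracker would allow a tolerance around each multiple):

    from math import gcd
    from functools import reduce

    def fundamental_from_partials(partials, floor=16.0):
        """HCF of the partial frequencies, measured to the nearest 0.1 Hz;
        candidates below 'floor' are rejected as absurdly low."""
        ints = [round(p * 10) for p in partials]     # work on a 0.1 Hz grid
        hcf = reduce(gcd, ints) / 10
        return hcf if hcf >= floor else None

    print(fundamental_from_partials([200.0, 300.0, 400.0, 500.0]))   # 100.0
    print(fundamental_from_partials([100.3, 201.0, 307.0, 513.5]))   # None: the HCF is only 0.1 Hz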
The problem of pitch-tracking by partial analysis can in fact be simplified if we begin our search on a quarter-tone grid, and also if we know in advance what the spectral content of the source sound is (see Appendix p71). In such relatively straightforward cases pitch-tracking can be very accurate, with perhaps occasional octaviation problems (the Hpitch can be assigned to the wrong octave). However, in the general case (e.g. speech, or synthesised sequences involving inharmonic spectra), where we wish to track pitch independently of a reference frame, and where we cannot be sure whether the incoming sound will be pitched or not, the problem of pitch-tracking is hard.
For even greater certainty we might try correlating the data from the time-domain with that from the frequency domain to come up with our most definitive solution.
The pitch data might be stored in (the equivalent of) a breakpoint table of time/frequency values. In this case we need to decide upon the frequency resolution of our data, i.e. how much must a pitch vary before we record a new value in our table? More precisely, if a pitch is changing, when is the rate of change adjudged to have changed? (See Diagram 1).
If we are working on an Hpitch reference frame the task is, of course, much simpler. If we do not confine ourselves to such frames, to be completely rigorous we could store the pitch value found at every window in the frequency domain representation. But this is needlessly wasteful. Better to decide on our own ability to discriminate rates of pitch motion and to give the pitch-detection instrument a portamento-rate-change threshold which, when exceeded, causes a new value to be recorded in our pitch data file.
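A sketch of that storage policy (Python; names illustrative): keep a new time/frequency breakpoint only when the rate of pitch change (the gradient) has itself changed by more than a threshold:

    def reduce_pitch_trace(times, pitches, gradient_threshold=2.0):
        """Thin a window-by-window pitch trace, keeping a breakpoint only where
        the portamento rate (Hz per second) changes by more than the threshold."""
        keep = [(times[0], pitches[0])]
        last_gradient = None
        for i in range(1, len(times)):
            dt = times[i] - times[i-1]
            gradient = (pitches[i] - pitches[i-1]) / dt if dt > 0 else 0.0
            if last_gradient is None or abs(gradient - last_gradient) > gradient_threshold:
                keep.append((times[i], pitches[i]))
                last_gradient = gradient
        if keep[-1] != (times[-1], pitches[-1]):
            keep.append((times[-1], pitches[-1]))    # always retain the final value
        return keep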
PITCH TRANSFER
Once we have established a satisfactory pitch-trace for a sound, we can modify the pitch of the original sound and this is most easily considered in the frequency domain. We can provide a new (changing) pitch-trace, either directly, or from a second sound. By comparing the two traces, a pitch-following instrument will deduce the ratio between the new pitch and the original pitch at a particular window time (the instantaneous transposition ratio), then multiply the frequency values in each channel of the original window by that ratio, hence altering the perceived pitch in the resynthesized sound. (See Diagram 2).
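A sketch of the core arithmetic (Python with numpy; the data layout, one frequency value per channel per window, is an assumption loosely modelled on a phase-vocoder analysis):

    import numpy as np

    def transfer_pitch(channel_freqs, original_trace, desired_trace):
        """channel_freqs: array of shape (windows, channels) of analysed frequencies.
        original_trace, desired_trace: one pitch value (Hz) per window.
        Every channel in each window is multiplied by that window's transposition ratio."""
        ratios = np.asarray(desired_trace) / np.asarray(original_trace)
        return channel_freqs * ratios[:, None]       # broadcast the ratio across channels

    # e.g. a window whose original pitch is 200 Hz and desired pitch 300 Hz
    # has all of its channel frequencies multiplied by 1.5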
DIAGRAM 1
STORING VALUES OF A TIME-CHANGING PITCH.
Store at regular times: accurate but inefficient (the stable pitch in segment 'a' is stored many times).
Another scheme shown is inaccurate: segment 'a' is replaced by a slight portamento.
Store at regular gradient changes (the gradient is the rate of change of pitch): efficient and accurate.
DIAGRAM 2
PITCH TRANSFER
In each analysis window, multiply all values by the transposition ratio for this window (e.g. when the transposition ratio is 1/2, multiply all channel values by 1/2; the results fall into lower channels). Derive the (time-varying) transposition ratio from the pitch-trace of the original sound and the trace of the desired pitch.
There are also some fast techniques that can be used in special cases. Once the pitch of a sound is known, we can transpose it up an octave using comb-filter transposition (Appendix p65). Here we delay the signal by half the wavelength of the fundamental and add the result to the original signal. This process cancels (by phase-inversion) the odd harmonics while reinforcing the even harmonics. Thus if we start with a spectrum whose partials are at 100, 200, 300, 400, 500 etc with a fundamental at 100Hz, we are left with partials at 200, 400 etc whose fundamental lies at 200Hz, an octave above the original sound. A modification of the technique, using the Hilbert transform, allows us to make an octave downward transposition in a similar manner. The process is particularly useful because it does not disturb the contour of the spectrum (the formants are not affected: see Chapter 3) so it can be applied successfully to vocal sounds.
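A sketch of comb-filter octave transposition (Python with numpy; it assumes the fundamental has already been tracked):

    import numpy as np

    def comb_octave_up(signal, sr, fundamental):
        """Delay by half the fundamental wavelength and add to the original:
        odd harmonics cancel by phase-inversion, even harmonics reinforce."""
        delay = max(1, int(round(sr / (2.0 * fundamental))))   # half a wavelength, in samples
        delayed = np.concatenate([np.zeros(delay), signal[:-delay]])
        return 0.5 * (signal + delayed)

    # a harmonic tone on 100 Hz treated this way retains only 200, 400, 600 ... Hz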
It is important to emphasize that pitch manipulation does not have to be embedded in a traditional approach to Hpitches. The power of pitch-tracking is that it allows us to trace and transfer the most subtle or complex pitch flows and fluctuations without necessarily being able to assign specific Hpitch values at any point. For example, the subtleties of portamento and vibrato articulation in a particular established vocal or instrumental idiom, or in a naturally recorded bird or animal cry, could be transferred to an arbitrarily chosen non-instrumental, non-vocal, or even synthetic sound-object. It would even be possible to interpolate between such articulation styles without at any stage having a quantifiable (measurable) or notatable representation of them - we do not need to be able to measure or analytically explain a phenomenon to make an aesthetic decision about it.
NEW PROBLEMS IN CHANGING PITCH
Changing the pitch of a musical event would seem, from a traditional perspective, to be the most obvious thing to do. Instruments are set up, either as collections of similar objects (strings, metal bars, wooden bars etc.), or with variable access to the same objects (flute fingerholes, violin fingerboard, brass valves), to permit similar sounds with different pitches to be produced rapidly and easily.
There are two problems when we try to transfer this familiar notion to sound composition. Firstly, we do not necessarily want to confine ourselves to a finite set of pitches or to steady pitches (an Hpitch set). More importantly, the majority of sounds do not come so easily prepackaged, either because the circumstances of their production cannot be reproduced (breaking a sheet of glass ... every sheet will break differently, no matter how much care we take!), or because the details of their production cannot be precisely remembered (a spoken phrase can be repeated at different pitches by the same voice within a narrow range but, assuming natural speech inflection, the fine details of articulation cannot usually be precisely remembered).
In fact changing the pitch on an instrument does involve some spectral compromises, e.g. low and high pitches on the piano have a very different spectral quality, but we have come to regard these discrepancies as acceptable through the skill of instrument design (the piano strings resonate in the same sound-box and there is a relatively smooth transition in quality from low to high strings), and the familiarity of tradition. We are not, in fact, changing the pitch of the original sound, but producing another sound whose relationship to the original is acceptable.
The ideal way, therefore, to change the pitch of the sound is to build a synthetic model of the sound, then alter its fundamental frequency. However this is a very complicated task, and adopting this approach for every sound we use would make sound composition unbelievably arduous. So we must find alternative approaches.
DIFFERENT APPROACHES TO PITCH-CHANGING
In the time-domain, the obvious way to change the pitch is to change the wavelength of the sound. In classical tape studios the only way to do this was to speed up (or slow down) the tape. On the computer, we simply re-read the digital data at a different step (instead of every one sample, read every 2 samples, or every 1.3 samples). This is tape-speed variation. This automatically makes every wavelength shorter (or longer) and changes the pitch. Unfortunately, it also makes the source sound shorter (longer). If this doesn't matter, it's the simplest method to adopt, but with segmented sounds (speech, melodies) or moving sounds (e.g. portamenti) it changes their perceived speed. (Sound example 2.9).
Computer control makes such "tape-speed" variation a more powerful tool as we can precisely control
the speed change trajectory or specify a speed-change in terms of its final velocity (tape acceleration).
The availability of time-variable processing gives us a new order of compositional control of such
time-varying processes. (Sound example 2.10).
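A sketch of re-reading the data at a different step (Python with numpy, interpolating linearly between samples; names illustrative):

    import numpy as np

    def tape_speed(signal, step):
        """Read every 'step' samples: 2.0 gives an octave up at half the length,
        0.5 an octave down at twice the length."""
        positions = np.arange(0, len(signal) - 1, step)
        return np.interp(positions, np.arange(len(signal)), signal)

    # 'step' could equally be accumulated from a time-varying trajectory to give
    # precisely controlled accelerandi (cf. Sound example 2.10)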
Waveset transposition is an unconventional approach which avoids the time-distortion involved in tape-speed variation and can be used for integral multiples of the frequency. Here, each waveset (in the sense of a pair of zero-crossings: Appendix p50) is replaced by N shortened copies occupying the same time as the original one. (Diagram 3 and Appendix p51). This technique is very fast to compute but often introduces strange, signal-dependent (i.e. varying with the signal) artefacts. (It can therefore be used as a process of constructive distortion in its own right!) Grouping the wavesets in pairs, triplets etc., before reproducing them, can affect the integrity of reproduction of the sound at the new pitch (see Diagram 4). The grouping to choose again depends on the signal.
With accurate pitch-tracking this technique can be applied to true wavecycles (deduced from a knowledge of both the true wavelength and the zero-crossing information) and should avoid producing artefacts. (Diagram 5).
The technique can also be used to transpose the sound downwards, replacing N wavesets or wavecycles by just one of them, enlarged, but too much information may be lost (especially in a complex sound) to give a satisfactory result in terms of just pitch-shifting. (Appendix p51; Sound example 2.11a).
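A sketch of upward waveset transposition (Python with numpy; as before, a 'waveset' is simplified to the span between successive upward zero-crossings, and each one is squeezed into N copies occupying its original duration):

    import numpy as np

    def waveset_transpose_up(signal, n_copies):
        """Replace each waveset by n_copies shortened copies filling the same time,
        multiplying the frequency by n_copies (3 transposes up a twelfth)."""
        neg = np.signbit(signal)
        ups = np.where(neg[:-1] & ~neg[1:])[0] + 1          # upward zero-crossings
        out = signal.copy()
        for a, b in zip(ups[:-1], ups[1:]):
            length = b - a
            short = np.interp(np.linspace(0, length, length // n_copies, endpoint=False),
                              np.arange(length), signal[a:b])          # shrunken copy
            copies = np.tile(short, n_copies)
            out[a:a + len(copies)] = copies                 # a few samples may be left untouched
        return out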
A more satisfactory time-domain approach is through brassage. To lower the pitch of the sound, we cut the sound into short segments, slow them down as in tape-speed variation, which lengthens them, then splice them together again so they overlap sufficiently to retain the original duration. It is crucial to use segments in the grain time-frame (see Chapters 1 & 4), so that each segment is long enough to carry instantaneous pitch information, but not long enough to have a perceptible internal structure which would lead to unwanted echo effects within the pitch-changed sound. This technique, used in the harmoniser, works quite well over a range of one octave, up or down, but beyond this begins to introduce significant artefacts: the signal is transformed as well as pitch-shifted. (Sound example 2.12).
DIAGRAM 3
Replace each waveset by three shortened copies of itself. Wavelength reduced to 1/3, therefore frequency x 3, hence transpose up by interval of a 12th.

DIAGRAM 4
Take wavesets in groups of three. Replace each set by 3 shortened copies of itself.

DIAGRAM 5
Find true wavecycles, with the help of a pitch-tracking instrument. Replace each wavecycle with 3 shortened copies.
In the frequency domain, pitch-shifting is straightforward. We need only multiply the frequencies of the components in each channel (in each window) by an appropriate figure, the transposition ratio. As this does not change the window duration, the pitch is shifted without changing the sound's duration. This is spectral shifting. (Sound example 2.13).
PRESERVING THE SPECTRAL CONTOUR
All these approaches, however, shift the formant-characteristics of the spectrum. The problem here is that certain spectral characteristics of a sound are determined by the overall shape of the spectrum at each moment in time (the spectral contour) and particularly by peaks in the spectral contour known as formants (see Chapter 3). Thus the vowel sound "a" will be found to be related to various peaks in the spectral contour. If we change the pitch at which "a" is sung, the partials in the sound will all move up (or down) the frequency ladder. However, the spectral peaks will remain where they were in the frequency space. Thus, if there was a peak at around 3000 Hz, we will continue to find a peak at around 3000 Hz. (See Appendix p10).
Simply multiplying the channel frequencies by a transposition ratio causes the whole spectrum, and hence the spectral peaks (formants), to move up (or down) the frequency space. Hence the formants are moved and the "a"-ness of the sound destroyed. (Appendix p16).
A more sophisticated approach therefore involves determining the spectral contour in each window, retaining it and then superimposing the unshifted contour on the newly shifted partials. The stages might be as follows (a rough sketch of the idea is given after the list)...

(a) Extract the spectral contour using linear predictive coding (LPC) (Appendix pp12-13).
(b) Extract the partials with the phase vocoder (Appendix p11) or with a fine-grained LPC (see spectral focusing below).
(c) Flatten the spectrum using the inverse of the spectral contour.
(d) Change the spectrum.
(e) Reimpose the original spectral contour. (See Appendix p17).
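The sketch below compresses these stages into a single FFT magnitude frame, using heavy smoothing of the spectrum as a crude stand-in for the LPC contour extraction of stage (a); stage (b) is assumed to have happened already, in that we start from analysis data, and all names are my own.

    import numpy as np

    def shift_preserving_contour(mag, ratio, smooth=32):
        # mag: magnitude spectrum of one analysis window
        bins = np.arange(len(mag))
        kernel = np.ones(smooth) / smooth
        contour = np.convolve(mag, kernel, mode='same') + 1e-12   # (a) spectral contour
        flat = mag / contour                                      # (c) flatten the spectrum
        shifted = np.interp(bins / ratio, bins, flat,
                            left=0.0, right=0.0)                  # (d) energy at bin k -> bin k*ratio
        return shifted * contour                                  # (e) reimpose the contour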
Ideally this approach of separating the formant data and the partial data should be applied even when merely imposing vibrato on a sound (see Chapter 10) but it is computationally intensive and, except in the case of the human voice, probably excessively fastidious in most situations.

Formant drift is an obvious problem when dealing with speech sounds, but needs to be borne in mind more generally. An instrument is often characterised by a single soundbox (piano, violin) which provides a relatively fixed background spectral contour for the entire gamut of notes played on it. We are, however, more obviously aware of formant drift in situations (like speech) where the articulation of formants is significant.
PITCH
Various compositional processes allow us to generate more than one pitch from a sound. For example, using the harmoniser approach we can shift the pitch of a sound without altering its duration and then mix this with the original pitch. Apart from the fact that we can apply this technique to any sound, it differs from simply playing two notes on the same instrument because the variations and microfluctuations of the two pitches remain fairly well in step, a situation impossible to achieve with two separate performers, though it may be closely approached by a single performer using e.g. double-stopping on a stringed instrument. The technique tends to be more musically interesting when used on subtly fluctuating sounds, rather than as a cost-saving way of adding conventional HArmony to a melodic line. (Sound example 2.14).
As usual, the harmoniser algorithm introduces significant artefacts over larger interval shifts. An alternative approach, therefore, is to use spectral shifting in the frequency domain, superimposing the result on the original source. In fact, we can use this kind of spectral shifting to literally split the spectrum in two, shifting only a part of the spectrum. The two sets of partials thus generated will imply two different fundamentals and the sound will appear to have a split pitch. (Appendix p18). (Sound example 2.15).

All these techniques can be applied dynamically so that the pitch of a sound gradually splits in two. (Sound example 2.16).
Small pitch-shifts, superimposed on the original sound, add "body" to the sound, producing the well known "chorus" effect (an effect produced naturally by a chorus of similar singers, or similar instruments playing the same pitches, where small variations in tuning between individual singers, or players, broaden the spectral band of the resultant massed sound). (Sound example 2.17).
A different kind of pitch duality can be produced when the focus of energy in the spectrum (spectral peak) moves markedly above a fixed pitch or over a pitch which is moving in a contrary direction. There are not truly two pitches present in these cases but percepts of conflicting frequency motions within the sound can certainly be established. With very careful control, including the phasing in and out of partials at the top and bottom of the spectrum, sounds can be created which get higher in Hpitch yet lower in tessitura (or lower in Hpitch but higher in tessitura), the so-called Shepard Tones (see On Sonic Art and Appendix p72). (Sound example 2.18).
Similarly, by appropriate filtering, we can individually reinforce the harmonics (or partials) in a spectrum so that our attention is drawn to them as perceived pitches in their own right (as in Tibetan chanting or Tuvan harmonic singing). (Sound example 2.19).
Another pitch phenomenon which is worth noting is the pitch drift associated with spectral stretching (see Appendix p19 and Chapter 3). If the partials of a pitch-spectrum are gradually moved so the relationship ceases to be harmonic (no longer whole number multiples of the fundamental), the sound will begin to present several pitches to our perception (like bell sounds). Moreover, if the stretch is upwards, even if the fundamental frequency is present in the spectrum and remains unchanged, the lowest perceived pitch may gradually move upwards. (Sound example 2.20).
TESSITURA CHANGE OF UNPITCHED SOUNDS
The techniques of pitch change we have described can be applied to sounds without any definite pitch. Inharmonic sound will, in general, be transposed in a similar way to pitched sounds. Waveset transposition will usually change the centre-of-energy (the "pitch-band" or tessitura) of noisy sounds so they appear to move higher (or lower). (Sound example 2.21). The harmoniser will usually have the same effect. Splitting the spectrum of a broad-band, noise-based sound using spectral shifting may not have any noticeable perceptual effect on the sound, even when the split is quite radical, and chorusing will often be unnoticeable as the spectrum is already full of energy. However, the problem of formant shifting when transposing will apply equally well to, e.g., unvoiced speech sounds. (Sound example 2.22).

We can also use this technique to give a sense of pitch motion in unpitched sounds - noise portamenti. (Sound example 2.23).
PITCH CREATION
It is possible to give pitch qualities to initially unpitched sounds. There are two approaches to this which, at a deeper level, are very similar. A filter is an instrument which reinforces or suppresses particular parts of the spectrum. A filter may work over a large band of frequencies (when it is said to have a small Q) or over a narrow band of frequencies (when it has a large Q). We can use a filter not only to remove or attenuate parts of the spectrum but, by inversion, to accentuate what remains (bandpass filter: Appendix p7). In particular, narrow filters with very high "Q" (so that the bands allowed to pass are very abruptly defined) can single out narrow ranges of partials from a spectrum. A very narrow fixed filter can thus impose a marked spectral peak on the noisiest of sounds, giving it a pitched quality.
In fact this process works best on noisier sounds because here the energy will be distributed over the entire spectrum and wherever we place our filter band(s), we will be sure to find some signal components to work on. Very tight Q on the filter will produce oscillator-type pitches, whilst less tight Q and broader bands will produce a vaguer focusing of the spectral energy. There are many degrees of "pitchiness" between pure noise and a ringing oscillator. (Sound example 2.24). We can also force our sound through a stack of filters, called a filter bank, producing "chords", and, increasing Q with time, move from noise towards such chords. (Sound example 2.25).
In a sound with a simpler spectrum, a narrow, tight filter may simply miss any significant spectral elements and we may end up with silence.
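A sketch of the filter-bank idea, using off-the-shelf band-pass filters from scipy rather than any particular studio instrument; the Q-to-bandwidth conversion and the chosen frequencies are illustrative assumptions.

    import numpy as np
    from scipy.signal import butter, sosfilt

    def pitch_noise(noise, sr, freqs, q=50.0):
        # Pass the noise through a bank of very narrow band-pass filters, one per
        # requested frequency.  Higher q -> narrower bands -> a more oscillator-like
        # "ringing" pitch; lower q gives a vaguer focusing of the spectral energy.
        out = np.zeros_like(noise)
        for f in freqs:
            bw = f / q
            lo = (f - bw / 2) / (sr / 2)          # band edges, normalised to Nyquist
            hi = (f + bw / 2) / (sr / 2)
            sos = butter(2, [lo, hi], btype='bandpass', output='sos')
            out += sosfilt(sos, noise)
        return out

    sr = 44100
    noise = np.random.uniform(-1.0, 1.0, sr)                 # one second of white noise
    chord = pitch_noise(noise, sr, [220.0, 277.2, 329.6])    # a rough A major "chord"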
A very sophisticated approach to this process is spectral focusing (Appendix p20). In this process, we first make an analysis of a sound with (possibly time-varying) pitch using Linear Predictive Coding (Appendix pp12-13). However, we vary the size of the analysis window through time. With a normal size analysis window we will extract the spectral contour (see above). With a very fine analysis window, however, we will pick out the individual partials.
[DIAGRAM 6: Time-varying delay between two sounds, one of which is a time-varying time-stretched version of the other.]
If we now use the analysis data as a set of (time-varying) filters on an input noise sound, wherever the analysis window was normal-sized the resultant filters will impose the formant characteristics of the original sound on the noise source (e.g. analysed voiced speech will produce unvoiced speech), but where the window size was very fine, we will have generated a set of very narrow-Q filters at the (time-varying) frequencies of the original partials. These will then act on the noise to produce something very close to the original signal.

If the original analysis window-size varied in time from normal to fine, our output sound would vary from formant-shaped noise to strongly pitched sound (e.g. from an analysis of pitched speech, our new sound would pass from unvoiced to voiced speech). This then provides a sophisticated means to pass from noise to pitch in a complexly evolving sound-source.
The second approach to pitch-generation is to use delay. As digital signals remain precisely in time, the delay between equivalent samples in the original and delayed sound will remain exactly fixed. If this delay is short enough, we will hear a pitch corresponding to one divided by the delay time, whatever sound we input to the system. This technique is known as comb filtering. Longer delays will give lower and less well defined pitches. (Sound example 2.26).
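A minimal recursive comb filter in the same spirit; the feedback amount is an arbitrary choice, and a real implementation would interpolate fractional delays, which this sketch ignores.

    import numpy as np

    def comb_filter(signal, sr, pitch_hz, feedback=0.95):
        # Mix the signal with a delayed copy of its own output.  A delay of
        # 1/pitch_hz seconds imposes a pitch of pitch_hz on any input, even noise;
        # longer delays give lower, less well defined pitches.
        delay = max(1, int(round(sr / pitch_hz)))
        out = np.zeros(len(signal))
        for n in range(len(signal)):
            fed_back = out[n - delay] if n >= delay else 0.0
            out[n] = signal[n] + feedback * fed_back
        return out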
Both these techniques allow us to produce dual-pitch percepts with the pitch of the source material moving in some direction and the pitch produced by the delay or filtering fixed, or moving in a different sense (with time-variable filtering or delay).
Producing portamenti is an even simpler process. When a sound is mixed with a very slightly time-stretched (or shrunk) copy of itself, we will produce a gradually changing delay (See Diagram 6). If the sounds are start-synchronised, this will produce a downward portamento. If the sounds are end-synchronised, we will produce an upward portamento. We may work with more than two time-varied copies. (Sound example 2.27).
Phasing or flangeing, often used in popular music, relies on such delay effects. In this case the signal is delayed by different amounts in different frequency registers using an all-pass filter (Appendix p9) and this shifted signal is allowed to interact with the unchanged source.
The production of pitch-motion seems an appropriate place to end this Chapter as it stresses once again the difference between pitch and Hpitch and the power of the new compositional tools to provide control over pitch-in-motion.
CHAPTER 3

SPECTRUM

WHAT IS TIMBRE?
The spectral characteristics of sounds have, for so long, been inaccessible to the composer that we have become accustomed to lumping together all aspects of the spectral structure under the catch-all term "timbre" and regarding it as an elementary, if unquantifiable, property of sounds. Most musicians with a traditional background almost equate "timbre" with instrument type (some instruments producing a variety of "timbres" e.g. pizz, arco, legno, etc.). Similarly, in the earliest analogue studios, composers came into contact with oscillators producing featureless pitches, noise generators producing featureless noise bands, and "envelope generators" which added simple loudness trajectories to these elementary sources. This gave no insight into the subtlety and multidimensionality of sound spectra.

However, a whole book could be devoted to the spectral characteristics of sounds. The most important feature to note is that all sound spectra of musical interest are time-varying, either in micro-articulation or large-scale motion.
HARMONICITY - INHARMONICITY

As discussed in Chapter 2, if the partials which make up a sound have frequencies which are exact multiples of some frequency in the audible range (known as the fundamental) and, provided this relationship persists for at least a grain-size time-frame, the spectrum fuses and we hear a specific (possibly gliding) pitch. If the partials are not in this relationship, and provided the relationships (from window to window) remain relatively stable, the ear's attempts to extract harmonicity (whole number) relationships amongst the partials will result in our hearing several pitches in the sound. These several pitches will trace out the same micro-articulations and hence will be fused into a single percept (as in a bell sound). The one exception to this is that certain partials may decay more quickly than others without destroying this perceived fusion (as in sustained acoustic bell sounds).
In Sound example 3.1 we hear the syllable "ko->u" being gradually spectrally stretched (Appendix p19). This means that the partials are moved upwards in such a way that their whole number relationships are preserved less and less exactly and eventually lost. (See Diagram 1). Initially, the sound appears to have an indefinable "aura" around it, akin to phasing, but gradually becomes more and more bell-like.

It is important to understand that this transformation "works" due to a number of factors apart from the harmonic/inharmonic transition. As the process proceeds, the tail of the sound is gradually time-stretched to give it the longer decay time we would expect from an acoustic bell. More importantly, the morphology (changing shape) of the spectrum is already bell-like. The syllable "ko->u" begins with a very short broad band spectrum with lots of high-frequency information ("k") corresponding to the initial clang of a bell. This leads immediately into a steady pitch, but the vowel formant is varied from "o" to "u", a process which gradually fades out the higher partials leaving the lower to continue. Bell sounds have a similar property, the lower partials, and hence the lower heard pitches, persisting longer than the higher components. A different initial morphology would have produced a less bell-like result.

This example (used in the composition of Vox 5) illustrates the importance of the time-varying structure of the spectrum (not simply its loudness trajectory).
[DIAGRAM 1: the partials of a harmonic spectrum progressively stretched so that their whole-number relationships are lost.]
We may vary this spectral stretching process by changing the overall stretch (i.e. the top of the spectrum moves further up or further down from its initial position) and we may vary the type of stretching involved. (Appendix p19). (Sound example 3.2).

Different types of stretching will produce different relationships between the pitches heard within the sounds.
Note that small stretches produce an ambiguous area in which the original sound appears "coloured" in some way rather than genuinely multi-pitched. (Sound example 3.3). Inharmonicity does not therefore necessarily mean multipitchedness. Nor (as we have seen from the "ko->u" example) does it mean bell sounds. Very short inharmonic sounds will sound percussive, like drums, strangely coloured drums, or akin to wood-blocks (Sound example 3.4). These inharmonic sounds can be transposed and caused to move (subtle or complex pitch-gliding) just like pitched sounds (also see Chapter 5 on Continuation).
Proceeding further, the spectrum can be made to vary, either slowly or quickly, between the harmonic and the inharmonic, creating a dynamic interpolation between a harmonic and an inharmonic state (or between any state and something more inharmonic) so that a sound changes its spectral character as it unfolds. We can also imagine a kind of harmonic to inharmonic vibrato-like fluctuation within a sound. (Sound example 3.5).
Once we vary the spectrum too quickly, and especially if we do so irregularly, we no longer perceive individual moments or grains with specific spectral qualities. We reach the area of noise (see below). When transforming the harmonicity of the spectrum, we run into problems about the position of formants akin to those encountered when pitch-changing (see Chapter 2) and to preserve the formant characteristics of the source we need to preserve the spectral contour of the source and apply it to the resulting spectrum (see formant preserving spectral manipulation: Appendix p17).
FORMANT STRUCTURE
In any window, the contour of the spectrum will have peaks and troughs. The peaks, known as formants, are responsible for such features as the vowel-state of a sung note. For a vowel to persist, the spectral contour (and therefore the position of the peaks and troughs) must remain where it is even if the partials themselves move. (See Appendix p10).
As we know from singing, and as we can deduce from this diagram, the frequencies of the partials in the spectrum (determining pitch(es), harmonicity-inharmonicity, noisiness) and the position of the spectral peaks can be varied independently of each other. This is why we can produce coherent speech while singing or whispering. (Sound example 3.6).
Because most conventional acoustic instruments have no articulate time-varying control over spectral contour (one of the few examples being hand-manipulable brass mutes), the concept of formant control is less familiar as a musical concept to traditional composers. However, we all use articulate formant control when speaking.
It is possible to extract the (time varying) spectral contour from one signal and impose it on another, a
process originally developed in the analogue studios and known as vocoding (no connection with the
phase vocoder). For this to work effectively, the sound to be vocoded must have energy distributed
over the whole spectrum so that the spectral contour to be imposed has something to work on.
Vocoding hence works well on noisy sounds (e.g. the sea) or on sounds which are artificially prewhitened by adding broad band noise, or subjected to some noise-producing distortion process.
(Sound example 3.7).
It is also possible to normalise the spectrum before imposing the new contour. This process is described in Chapter 2, and under formant preserving spectral manipulation in Appendix p17.
Formant-variation of the spectrum does not need to be speech-related and, in complex signals, is often
more significant than spectral change. We can use spectral freezing to freeze certain aspects of the
spectrum at a particular moment. We hold the frequencies of the partials, allowing their loudnesses to
vary as originally. Or we can hold their amplitudes stationary, allowing the frequencies to vary as
originally. In a complex signal, it is often holding steady the amplitudes, and hence the spectral contour,
which produces a sense of "freezing" the spectrum when we might have anticipated that holding the
frequencies would create this percept more directly. (Sound example 3.8).
NOISE, "NOISY NOISE" & COMPLEX SPECTRA
Once the spectrum begins to change so rapidly and irregularly that we cannot perceive the spectral quality of any particular grain, we hear "noise". Noise spectra are not, however, a uniform grey area of musical options (or even a few shades of pink and blue) which the name (and past experience with noise generators) might suggest. The subtle differences between unvoiced staccato "t", "d", "p", "k", "s", "sh", "f", and the variety amongst cymbals and unpitched gongs give the lie to this.
Noisiness can be a matter of degree, particularly as the number of heard-out components in an inharmonic spectrum increases gradually to the point of noise saturation. It can, of course, vary formant-wise in time: whispered speech is the ideal example. It can be more or less focused towards static or moving pitches, using band-pass filters or delay (see Chapter 2), and it can have its own complex internal structure. In Sound example 3.9 we hear portamentoing inharmonic spectra created by filtering noise. This filtering is gradually removed and the bands become more noise-like.
A good example of the complexity of noise itself is "noisy noise", the type of crackling signal one gets from very poor radio reception tuned to no particular station, from masses of broad-band click-like sounds (either in regular layers - cicadas - or irregular - masses of breaking twigs or pebbles falling onto tiles - or semi-regular - the gritty vocal sounds produced by water between the tongue and palate in e.g. Dutch "gh") or from extremely time-contracted speech streams. There are also fluid noises produced by portamentoing components, e.g. the sound of water falling in a wide stream around many small rocks. These shade off into the area of "Texture" which we will discuss in Chapter 8. (Sound example 3.10).
These examples illustrate that the rather dull sounding word "noise" hides whole worlds of rich sonic material largely unexplored in detail by composers in the past.
Two processes are worth mentioning in this respect. Noise with transient pitch content, like water falling in a stream (rather than dripping, flowing or bubbling), might be pitch-enhanced by spectral tracing (see below). (Sound example 3.11). Conversely, all sounds can be amassed to create a sound with a noise-spectrum if superimposed randomly in a sufficiently frequency-dense and time-dense way. At the end of Sound example 3.9 the noise band finally resolves into the sound of voices. The noise band was in fact simply a very dense superimposition of many vocal sounds.
Different sounds (with or without harmonicity, soft or hard-edged, spectrally bright or dull, grain-like, sustained, evolving, iterated or sequenced) may produce different qualities of noise (see Chapter 8 on Texture). There are also undoubtedly vast areas to be explored at the boundaries of inharmonicity/noise and time-fluctuating-spectrum/noise. (Sound example 3.12).
A fruitful approach to this territory might be through spectral focusing, described in Chapter 2 (and Appendix p20). This allows us to extract, from a pitched sound, either the spectral contour only, or the true partials, and then to use this data to filter a noise source. The filtered result can vary from articulated noise formants (like unvoiced speech) following just the formant articulation of the original source, to a reconstitution of the partials of the original sound (and hence of the original sound itself). We can also move fluidly between these two states by varying the analysis window size through time. This technique can be applied to any source, whether it be spectrally pitched (harmonic) or inharmonic, and gives us a means of passing from articulate noise to articulate not-noise spectra in a seamless fashion.
Many of the sound phenomena we have discussed in this section are complexes of simpler units. It is therefore worthwhile to note that any arbitrary collection of sounds, especially mixed in mono, has a well-defined time-varying spectrum - a massed group of talkers at a party; a whole orchestra individually, but simultaneously, practising their difficult passages before a concert. At each moment there is a composite spectrum for these events and any portion of it could be grist for the mill of sound composition.
SPECTRAL ENHANCEMENT
The already existing structure of a spectrum can be utilised to enhance the original sound. This is particularly important with respect to the onset portion of a sound and we will leave discussion of this until Chapter 4. We may reinforce the total spectral structure, adding additional partials, by spectral shifting the sound (without changing its duration) (Appendix p18) and mixing the shifted spectrum onto the original. As the digital signal will retain its duration precisely, all the components in the shifted signal will line up precisely with their non-shifted sources and the spectrum will be thickened while retaining its (fused) integrity. Octave enhancement is the most obvious approach but any interval of transposition (e.g. the tritone) might be chosen. The process might be repeated and the relative balance of the components adjusted as desired. (Appendix p48). (Sound example 3.13).
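Only the mixing stage is sketched below; the transposed copies themselves are assumed to come from a duration-preserving transposition such as the spectral shifting of Chapter 2, and all arrays are assumed to be the same length.

    import numpy as np

    def enhance(original, transposed_copies, weights):
        # Mix duration-preserving transpositions (octave, tritone, ...) back onto
        # the original.  Because every copy keeps exactly the same duration, its
        # components line up in time with their sources and the thickened spectrum
        # retains its fused integrity.
        out = original.astype(float)
        for copy, w in zip(transposed_copies, weights):
            out += w * copy
        peak = np.max(np.abs(out))
        return out / peak if peak > 0 else out   # normalise to avoid clipping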
A further enrichment may be achieved by mixing an already stereo spectrum with a pitch-shifted
version which is left-right inverted. Theoretically this produces merely a stage-centre resultant
spectrum but in practice there appear to be frequency dependent effects which lend the resultant sound
a new and richer spatial "fullness". (Sound example 3.14).
Finally, we can introduce a sense of multiple-sourcedness to a sound (e.g. make a single voice appear crowd-like) by adding small random time-changing perturbations to the loudnesses of the spectral components (spectral shaking). This mimics part of the effect of several voices attempting to deliver the same information. (Sound example 3.15). We may also perturb the partial frequencies. (Sound example 3.16).
SPECTRAL BANDING
Once we understand that a spectrum contains many separate components, we can imagine processing the sound to isolate or separate these components. Filters, by permitting components in some frequency bands to pass and rejecting others, allow us to select parts of the spectrum for closer observation. With dense or complex spectra the results of filtering can be relatively unexpected, revealing aspects of the sound material not previously appreciated. A not-too-narrow and static band pass filter will transform a complex sound-source while (usually) retaining its morphology (time-varying shape) so that the resulting sound will relate to the source sound through its articulation in time. (Sound example 3.17).
A filter may also be used to isolate some static or moving feature of a sound. In a crude way, filters may be used to eliminate unwanted noise or hums in recorded sounds, especially as digital filters can be very precisely tuned. In the frequency domain, spectral components can be eliminated on a channel-by-channel basis, either in terms of their frequency location (using spectral splitting to define a frequency band and setting the band loudness to zero) or in terms of their time-varying relative loudness (spectral tracing will eliminate the N least significant, i.e. quietest, channel components, window by window. At an elementary level this can be used for signal-dependent noise reduction, but see also "Spectral Fission" below). More radically, sets of narrow band pass filters can be used to force a complex spectrum onto any desired Hpitch set (HArmonic field in the traditional sense). (Sound example 3.18).
In a more signal-sensitive sense a filter or a frequency-domain channel selector can be used to separate some desired feature of a sound, e.g. a moving high frequency component in the onset, a particularly strong middle partial etc., for further compositional development. In particular, we can separate the spectrum into parts (using band pass filters or spectral splitting) and apply processes to the N separated parts (e.g. pitch-shift, add vibrato) and then recombine the two parts, perhaps reconstituting the spectrum in a new form. However, if the spectral parts are changed too radically, e.g. adding completely different vibrato to each part, they will not fuse when remixed, but we may be interested in the gradual dissociation of the spectrum. This leads us into the next area.
Ultimately we may use a procedure which follows the partials themselves, separating the signal into its component partials (partial tracking). This is quite a complex task which will involve pitch tracking and pattern-matching (to estimate where the partials might lie) on a window by window basis. Ideally it must deal in some way with inharmonic sounds (where the form of the spectrum is not known in advance) and noise sources (where there are, in effect, no partials). This technique is however particularly powerful as it allows us to set up an additive synthesis model of our analysed sound and thereby provides a bridge between unique recorded sound-events and the control available through synthesis.

SPECTRAL FISSION & CONSTRUCTIVE DISTORTION
We have mentioned several times the idea of spectral fusion where the parallel micro-articulation of the many components of a spectrum causes us to perceive it as a unified entity - in the case of a harmonic spectrum, as a single pitch. The opposite process, whereby the spectral components seem to split apart, we will describe as spectral fission. Adding two different sets of vibrato to two different groups of partials within the same spectrum will cause the two sets of partials to be perceived independently - the single aural stream will split into two. (Sound example 3.19).
Spectral fission can be achieved in a number of quite different ways in the frequency domain. Spectral arpeggiation is a process that draws our attention to the individual spectral components by isolating, or emphasising, each in sequence. This can be achieved purely vocally over a drone pitch by using appropriate vowel formants to emphasise partials above the fundamental. The computer can apply this process to any sound-source, even whilst it is in motion. (Sound example 3.20).
Spectral tracing strips away the spectral components in order of increasing loudness (Appendix p25). When only a few components are left, any sound is reduced to a delicate tracery of (shifting) sine-wave constituents. Complexly varying sources produce the most fascinating results as those partials which are at any moment in the permitted group (the loudest) change from window to window. We hear new partials entering (while others leave) producing "melodies" internal to the source sound. This feature can often be enhanced by time-stretching so that the rate of partial change is slowed down. Spectral tracing can also be done in a time-variable manner so that a sound gradually dissolves into its internal sine-wave tracery. (Sound example 3.21).
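A sketch of the idea on phase-vocoder amplitude data (a 2-D array, one row of channel amplitudes per window); the names and the fixed n_keep are assumptions, and n_keep could equally be made a function of time for the gradual dissolve just described.

    import numpy as np

    def spectral_trace(amp_frames, n_keep):
        # In each analysis window, zero every channel except the n_keep loudest,
        # reducing the sound to a shifting tracery of its strongest components.
        traced = np.zeros_like(amp_frames)
        for i, frame in enumerate(amp_frames):
            loudest = np.argsort(frame)[-n_keep:]   # indices of the loudest channels
            traced[i, loudest] = frame[loudest]
        return traced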
Spectral time-stretching, which we will deal with more fully in Chapter 11, can produce unexpected spectral consequences when applied to noisy sounds. In a noisy sound the spectrum is changing too quickly for us to gain any pitch or inharmonic multi-pitched percept from any particular time-window. Once, however, we slow down the rate of change, the spectrum becomes stable or stable-in-motion for long enough for us to hear out the originally instantaneous window values. In general, these are inharmonic and hence we produce a "metallic" inharmonic (usually moving) ringing percept. By making perceptible what was not previously perceptible we effect a "magical" transformation of the sonic material. Again, this can be effected in a time-varying manner so that the inharmonicity emerges gradually from within the stretching sound. (Sound example 3.22).
Alternatively we may elaborate the spectrum in the time-domain by a process of constructive distortion. By searching for wavesets (zero-crossing pairs: Appendix p50) and then repeating each waveset before proceeding to the next (waveset time-stretching) we may time-stretch the source without altering its pitch (see elsewhere for the limitations on this process). (Appendix p55). Wavesets correspond to wavecycles in many pitched sounds, but not always (Appendix p50). Their advantage in the context of constructive distortion is that very noisy sounds, having no pitch, have no true wavecycles - but we can still segment them into wavesets (Appendix p50).
In a very simple sound source (e.g. a steady waveform, from any oscillator) waveset time-stretching produces no artefacts. In a complexly evolving signal (especially a noisy one) each waveset will be different, often radically different, from the previous one, but we will not perceptually register the content of that waveset in its own right (see the discussion of time-frames in Chapter 1). It merely contributes to the more general percept of noisiness. The more we repeat each waveset, however, the closer it comes to the grain threshold where we can hear out the implied pitch and the spectral quality implicit in its waveform. With a 5 or 6 fold repetition, therefore, the source sound begins to reveal a lightning fast stream of pitched beads, all of a slightly different spectral quality. A 32 fold repetition produces a clear "random melody" apparently quite divorced from the source. A three or four fold repetition produces a "phasing"-like aura around the sound in which a glimmer of the bead stream is beginning to be apparent. (Sound example 3.23).
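A rough sketch of waveset time-stretching, reading a waveset as the span between one upward zero-crossing and the next (one positive and one negative half-wave); this is my reading of the zero-crossing-pair definition, and the names are mine.

    import numpy as np

    def waveset_timestretch(signal, repeats):
        # Repeat each waveset 'repeats' times before moving on.  Wavelengths are
        # unchanged, so pitch is preserved; with large repeat counts each waveset
        # crosses the grain threshold and is heard out as a tiny pitched bead.
        neg = np.signbit(signal)
        bounds = np.where(neg[:-1] & ~neg[1:])[0] + 1   # upward zero-crossings
        pieces = [signal[bounds[i]:bounds[i + 1]] for i in range(len(bounds) - 1)]
        if not pieces:
            return signal.copy()
        return np.concatenate([np.tile(p, repeats) for p in pieces])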
Again, we have a compositional process which makes perceptible aspects of the signal which were not perceptible. But in this case, the aural result is entirely different. The new sounds are time-domain artefacts consistent with the original signal, rather than revelations of an intrinsic internal structure. For this reason I refer to these processes as constructive distortion.
SPECTRAL MANIPULATION IN THE FREQUENCY DOMAIN
There are many other processes of spectral manipulation we can apply to signals in the frequency
domain. Most of these are only interesting if we apply them to moving spectra because they rely on the
interaction of data in different (time) windows and if these sets of data are very similar we will
perceive no change.
We may select a window (or sets of windows) and freeze either the frequency data or the loudness
data which we find there over the ensuing signal (spectral freezing). If the frequency data is held
constant, the channel amplitudes continue to vary as in the original signal but the channel
frequencies do not change. If the amplitude data is held constant then the channel frequencies continue
to vary as in the original signal. As mentioned previously, in a complex signal, holding the amplitude data is often more effective in achieving a sense of "freezing" the signal. We can also freeze both
amplitude and frequency data but, with a complex signal, this tends to sound like a sudden splice
between a moving signal and a synthetic drone. (Sound example 3.24).
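A sketch on phase-vocoder analysis frames (arrays of shape windows x channels); the argument names are my own.

    import numpy as np

    def spectral_freeze(freq_frames, amp_frames, freeze_at,
                        hold_freqs=True, hold_amps=False):
        # From window 'freeze_at' onwards, hold the chosen data constant while
        # the other continues to evolve as in the original signal.
        f, a = freq_frames.copy(), amp_frames.copy()
        if hold_freqs:
            f[freeze_at:] = f[freeze_at]     # partial frequencies stop moving
        if hold_amps:
            a[freeze_at:] = a[freeze_at]     # spectral contour stops moving
        return f, a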
We may average the spectral data in each frequency-band channel over N time-windows (spectral blurring), thus reducing the amount of detail available for reconstructing the signal. This can be used to "wash out" the detail in a segmented signal and works especially effectively on spiky, crackly signals (those with brief, bright peaks). We can do this and also reduce the number of partials (spectral trace & blur) and we may do either of these things in a time-variable manner so that the details of a sequence gradually become blurred or gradually emerge as distinct. (Sound example 3.25).
Finally, we may shuffle the time-window data in any way we choose (spectral shuffling), shuffling windows in groups of 8, 17, 61 etc. With large numbers of windows in a shuffled group we produce an audible rearrangement of signal segments, but with only a few windows we create another process of sound blurring akin to brassage, and particularly apparent in rapid sequences. (Sound example 3.26).
SPECTRAL MANIPULATION IN THE TIME-DOMAIN
A whole series of spectral transformations can be effected in the time-domain by operating on
wavesets defined as pairs of zero-crossings. Bearing in mind that these do not necessarily correspond
to true wavecycles, even in relatively simple signals, we will anticipate producing various unexpected
artefacts in complex sounds. In general, the effects produced will not be entirely predictable, but they
will be tied to the morphology (time changing characteristics) of the original sound. Hence the resulting sound will be clearly related to the source in a way which may be musically useful. As this process destroys the original form of the wave I will refer to it as destructive distortion. The following manipulations suggest themselves.
We may replace wavesets with a waveform of a different shape but the same amplitude (waveset
substitution: Appendix p52). Thus we may convert all the wavesets to square waves, triangular waves, sine-waves, or even user-defined waveforms. Superficially, one might expect that sine-wave replacement would in some way simplify, or clarify, the spectrum. Again, this may be true with simple sound input but complex sounds are just changed in spectral "feel" as a rapidly changing sine-wave is no less perceptually chaotic than a rapidly changing arbitrary wave-shape. In the sound examples the sound with a wood-like attack has wavesets replaced by square waves, and then by sine waves. Two interpolating sequences (see Chapter 12) between the 'wood' and each of the transformed sounds are then created by inbetweening (see Appendix p46 & Chapter 12). (Sound example 3.27).
Inverting the half-wave-cycles (waveset inversion: Appendix p51) usually produces an "edge" to the spectral characteristics of the sound. We might also change the spectrum by applying a power factor to the waveform shape itself (waveset distortion: Appendix p52). (Sound example 3.28).
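A sketch of waveset substitution, again taking a waveset as the span between successive upward zero-crossings; the available shapes and their construction are illustrative choices only.

    import numpy as np

    def waveset_substitute(signal, shape='sine'):
        # Keep the length and peak amplitude of every waveset but replace its
        # shape with one cycle of a sine, square or triangle wave.
        neg = np.signbit(signal)
        bounds = np.where(neg[:-1] & ~neg[1:])[0] + 1   # upward zero-crossings
        out = signal.astype(float)
        for start, end in zip(bounds[:-1], bounds[1:]):
            n = end - start
            amp = np.max(np.abs(signal[start:end]))
            phase = 2.0 * np.pi * np.arange(n) / n
            if shape == 'sine':
                cycle = np.sin(phase)
            elif shape == 'square':
                cycle = np.sign(np.sin(phase))
            else:                                        # triangle
                cycle = (2.0 / np.pi) * np.arcsin(np.sin(phase))
            out[start:end] = amp * cycle
        return out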
We may average the waveset shape over N wavesets (waveset averaging). Although this process appears to be similar to the process of spectral blurring, it is in fact quite irrational, averaging the waveset length and the wave shape (and hence the resulting spectral contour) in a perceptually unpredictable way. More interesting (though apparently less promising), we may replace N in every M wavesets by silence (waveset omission: Appendix p51). For example, every alternate waveset may be replaced by silence. Superficially, this would appear to be an unpromising approach but we are in fact thus changing the waveform. Again, this process introduces a slightly rasping "edge" to the sound quality of the source sound which increases as more "silence" is introduced. (Sound example 3.29).
We may add 'harmonic components' to the waveset in any desired proportions (waveset harmonic distortion) by making copies of the waveset which are 1/2 as short (1/3 as short etc.) and superimposing 2 (3) of these on the original waveform in any specified amplitude weighting. With an elementary waveset form this adds harmonics in a rational and predictable way. With a complex waveform, it enriches the spectrum in a not wholly predictable way, though we can fairly well predict how the spectral energy will be redistributed. (Appendix p52).
We may also rearrange wavesets in any specified way (waveset shuffling: Appendix p51) or reverse wavesets or groups of N wavesets (waveset reversal: Appendix p51). Again, where N is large we produce a fairly predictable brassage of reversed segments, but with smaller values of N the signal is altered in subtle ways. Values of N at the threshold of grain perceptibility are especially interesting. Finally, we may introduce small, random changes to the waveset lengths in the signal (waveset shaking: Appendix p51). This has the effect of adding "roughness" to clearly pitched sounds.
Such distortion procedures work particularly well with short sounds having distinctive loudness trajectories. In the sound example a set of such sounds, suggesting a bouncing object, is destructively distorted in various ways, suggesting a change in the physical medium in which the 'bouncing' takes place (e.g. bouncing in sand). (Sound example 3.30).
CONCLUSION
In a sense, almost any manipulation of a signal will alter its spectrum. Even editing (most obviously in very short time-frames, in brassage e.g.) alters the time-varying nature of the spectrum. But, as we have already made clear, many of the areas discussed in the different chapters of this book overlap considerably. Here we have attempted to focus on sound composition in a particular way, through the concept of "spectrum". Spectral thinking is integral to all sound composition and should be borne in mind as we proceed to explore other aspects of this world.
CHAPTER 4
ONSET
WHAT IS SIGNIFICANT ABOUT THE ONSET OF A SOUND?
In the previous Chapter (Spectrum) we have discussed properties of sounds which they possess at
every moment, even though these properties may change from moment to moment. There are, however,
properties of sound intrinsically tied to the way in which the sound changes. In this Chapter and the next
we will look at those properties. In fact, the next chapter, entitled "Continuation" might seem to deal
happily with all those properties. Why should we single out the properties of the onset of a sound, its
attack, as being any different to those that follow?
The onset of a sound, however, has two particular properties which are perceptually interesting. In most
naturally occurring sounds the onset gives us some clue as to the causality of the sound - what source
is producing it, how much energy was expended in producing it, where it's coming from. Of course, we can pick up some of this information from later moments in the sound, but such information has a primitive and potentially life-threatening importance in the species development of hearing. After all, hearing did not develop to allow us to compose music, but to better help us to survive. We are therefore particularly sensitive to the qualities of sound onset - at some stage in the past our ancestors' lives may have depended on the correct interpretation of that data. A moment's hesitation for reflection may have
been too long!
Secondly, because of the way sound events are initiated in the physical world, the onset moment almost inevitably has some special properties. Thus a resonating cavity (like a pipe) may produce a sustained and stable sound once it is activated, but there needs to be a moment of transition from non-activation to activation, usually involving exceeding some energy threshold, to push the system into motion. Bells need to be struck, flutes blown, etc. Some resonating systems can, with practice, be put into resonance with almost no discontinuity (the voice, bowl gongs, bloogles). Others either require a transient onset event, or can be initiated with such an event (flute or brass tonguing). Other systems have internal resonance - once set in motion we do not have to continue supplying energy to them but we therefore have to supply a relatively large amount of energy in a short time at the event onset (piano-string, bell). Other systems produce intrinsically short sounds as they have no internal resonance. Such sources can produce either individual short sounds (drums, xylophones, many vocal consonants) or be activated iteratively (drum roll, rolled "r", low contrabassoon notes).
Iterative sounds are a special case in which perceptual considerations enter into our judgement. Low and high contrabassoon notes are both produced by the discontinuous movement of the reed. However, in the lower notes we hear out those individual motions as they individually fall within the grain time-frame (see Chapter 1). Above a certain speed, the individual reed movements fall below the grain time-frame boundary and the units meld in perception into a continuous event. Sounds which are perceptually iterative, or granular, can be thought of as a sequence of onset events. This means that they have special properties which differentiate them (perceptually) from continuous sounds and must be treated differently when we compose with them. These matters are discussed in Chapters 6, 7 and 8.
In acoustic instruments the initiating transition from 'off' to 'on' is most often a complex event in its own right, a clang, a breathy release, or whatever, with the dimensions of a grain (see Chapter 1) and with its own intrinsic sonic properties. These properties are, in fact, so important that we can destroy the recognisability of instrumental sounds (flute, trumpet, violin) fairly easily by removing their onset moments. It is of course not only the onset which is involved in source recognition. Sound sources with internal resonance and natural decay (struck piano strings, struck bells) are also partly recognisable through this decay process and if it is artificially prevented from occurring, our percept may change (is it a piano or is it a flute?). For a more detailed discussion see On Sonic Art.

We need, therefore, to pay special attention to the onset characteristics of sounds.
GRAIN-SCALE SOUNDS
Very short sounds (xylophone notes, vocal clicks, two pebbles struck together) may be regarded as onsets without continuation. Such sounds may be studied as a class on their own. We may be aware of pitch, pitch motion, spectral type (harmonicity, inharmonicity, noisiness etc.) or spectral motion. But our percept will also be influenced strongly by the loudness trajectory of such sounds. (See Diagram 1).

Thus any grain-scale sound having a loudness trajectory of type 1a (see Diagram) will appear "struck", as the trajectory implies that all the energy is imparted in an initial shock and then dies away naturally. We can create the percept "struck object" by imposing such a brief loudness trajectory on almost any spectral structure. For example, a time-stretched vocal sound may have an overall trajectory imposed on it made out of such grain-scale trajectories, but repeated. The individual grains of the resulting iterated sound may appear like struck wood. (Sound example 4.1).
If these grains are then spectrally altered (using, for example, the various destructive distortion instruments discussed in Chapter 3) we may alter the perceived nature of the "material" being "struck". In particular, the more noisy the spectrum, the more "drum-like" or "cymbal-like" the result, but we are retaining the percept "struck" because of the persisting form of the loudness trajectory. (Sound example 4.2).
If, however, we provide a different loudness trajectory (by enveloping) like type 1b, which has a quiet onset and peaks near the end, the energy in the sound seems to grow, which we might intuitively associate with rubbing or stroking or some other gentler way of coaxing an object into its natural vibrating mode. At the very least the percept is "gradual-initiated", rather than "sudden-initiated". (Sound example 4.3).
Yet another energy trajectory, a sudden excitation brought to an abrupt end (1c), suggests perhaps an extremely forced scraping together of materials where the evolution of the process is controlled by forces external to the natural vibrating properties of the material, e.g. sawing or bowing, particularly where these produce forced vibrations rather than natural resonant frequencies. (Sound example 4.4).
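The three trajectory types can be sketched as simple envelopes multiplied onto a grain; the curve shapes below are my own guesses at the general shapes in the diagram, not measurements of it.

    import numpy as np

    def impose_trajectory(grain, kind='struck'):
        # 'struck' (type 1a): all the energy at the start, dying away naturally.
        # 'grown'  (type 1b): quiet onset, peaking near the end.
        # 'cut'    (type 1c): sudden excitation brought to an abrupt stop.
        n = len(grain)
        t = np.linspace(0.0, 1.0, n)
        if kind == 'struck':
            env = np.exp(-6.0 * t)
        elif kind == 'grown':
            env = t ** 2
        else:
            env = np.ones(n)
            env[-max(1, n // 20):] = 0.0
        return grain * env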
Thus with such very brief sounds, transformation between quite different aural percepts can be effected by the simple device of altering the loudness trajectory. Combining this with the control of content and spectral change gives us a very powerful purchase on the sound-composition of grain-size sounds.
[DIAGRAM: grain-scale loudness trajectory types 1a, 1b and 1c.]
ALTERED PHYSICALITY
What we have been describing are modifications to the perceived physicality of the sound. If we proceed now to sounds which also have a continuation, subtle alterations of the onset characteristics may still radically alter the perceived physicality of the sound. For example we can impose a sudden-initiated (struck-like) onset on any sound purely by providing an appropriate onset loudness trajectory, or we can make a sound gradual-initiated by giving it an onset loudness trajectory which rises more slowly. In the grain time-frame we can proceed from the "struck inelastic object" to the "struck resonating object" to the "rubbed" and beyond that to the situation where the sound appears to rise out of nowhere like the "singing" of bowl gongs or bloogles. (Sound example 4.5).
Where the sound has a "struck" quality, we may imply not just the energy but the physical quality of the striking medium. Harder striking agents tend to excite higher frequencies in the spectrum of the vibrated material (compare padded sticks, rubber-headed sticks and wooden sticks on a vibraphone). We can generalise this notion of physical "hardness" of an onset to the onset of any sound. By exaggerating the high frequency components in the onset moment we create a more "hard" or "brittle" attack. (I make no apologies for using these qualitative or analogical terms. Grain time-frame events have an indivisible qualitative unity as percepts. We can give physical and mathematical correlates for many of the perceived properties, but in the immediate moment of perception we do not apprehend these physical and mathematical correlates. These are things we learn to appreciate on reflection and repeated listening.)
One way to achieve this attack hardening is to mix octave upward transposed copies of the source onto the onset moment, with loudness trajectories which have a very sudden onset and then die away relatively quickly behind the original sound (we do not have to use octave transposition and the rate of decay is clearly a matter of aesthetic intent and judgement) (octave stacking). The transpositions might be in the time frame of the original sound, or time-contracted (as with tape-speed variation: see Chapter 11). The latter will add new structure to the attack, particularly if the sound itself is quickly changing. We can also, of course, enhance the attack with downward transpositions of a sound, with similar loudness trajectories, the physical correlate of such a process being less clear. This latter fact is not necessarily important as, in sound composition, we are creating an artificial sonic world. (Sound example 4.6).
We can, for example, achieve in this way a "hard" or "metallic" attack to a sound which is revealed (immediately) in its continuation to be the sound of water, or human speech, or a non-representational spectrum suggesting physical softness and elasticity. We are not constrained by the photographically real, but our perception is guided by physical intuition even when listening to sound made in the entirely contrived space of sound composition. (Sound example 4.7).
Another procedure is to add noise to the sound onset but allow it to die away very rapidly. We may cause the noisiness to 'resolve' onto an actual wavelength of the modified sound by preceding that wavelength with repetitions of itself which are increasingly randomised, i.e. noisier. This produces a plucked-string-like attack (sound plucking) and relates to a well known synthesis instrument for producing plucked string sounds called the Karplus-Strong algorithm. (Sound example 4.8).
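A sketch of the sound-plucking idea (not the Karplus-Strong algorithm itself): one wavelength of the sound is preceded by copies of itself that are progressively noisier the further they lie from the resolution point. The crossfade law and repeat count are arbitrary choices.

    import numpy as np

    def pluck_onset(signal, cycle_start, cycle_len, n_repeats=30):
        # Precede one wavelength of the sound with increasingly randomised
        # repetitions of itself, so that the attack "resolves" out of noise onto
        # the pitch of that wavelength.
        cycle = signal[cycle_start:cycle_start + cycle_len].astype(float)
        amp = np.max(np.abs(cycle))
        onset = []
        for k in range(n_repeats):
            noisiness = 1.0 - k / n_repeats          # 1.0 at the start, falling towards 0
            noise = np.random.uniform(-amp, amp, cycle_len)
            onset.append((1.0 - noisiness) * cycle + noisiness * noise)
        return np.concatenate(onset + [signal[cycle_start:].astype(float)])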
The effect of modifying the onset has to be taken into consideration when other processes are put into motion. In particular, time-stretching the onset of a sound will alter its loudness trajectory and may even extend it beyond the grain time-frame. As the onset is so perceptually significant, time-stretching the onset is much more perceptually potent than time-stretching the continuation. This issue is discussed in Chapter 1. Also, editing procedures on sequences (melodies, speech-streams etc.) in many circumstances need to preserve event onsets if they are not to radically alter the perceived nature of the materials (the latter, of course, may be desired). Finally, extremely dense textures in mono will eventually destroy onset characteristics; stereo separation will allow the ear to discriminate event onsets even in very dense situations. (Sound example 4.9).
ALTERED CAUSALITY
Because the onset characteristics of a sound are such a significant clue to the sound's origin, we can alter the causality of a sound through various compositional devices. In particular, a sound with a vocal onset tends to retain its "voiceness" even when continuation information contradicts our initial intuitive assumption. The piece Vox-5 uses this causality transfer throughout in a very conscious way but it can operate on a more immediate level.
Listen first to Sound example 4.10. A vocally initiated event transforms in a strange (and vocally impossible) way. If we listen more carefully, we will hear that there is a splice (in fact a splice at a zero crossing: zero-cutting) in this sound where the vocal initiation is spliced onto its non-vocal (but voice derived) continuation. (In Vox-5 the vocal/non-vocal transitions are achieved by smooth spectral interpolation, rather than abrupt splicing - see Chapter 12). When this abrupt change is pointed out to us, we begin to notice it as a rather obvious discontinuity; the 'causal chain' is broken, but in the wider context of a musical piece, using many such voice-initiated events, we may not so hone in on the discontinuity.
A more radical causality shift can be produced by onset fusion. When we hear two sounds at the same time certain properties of the aural stream allow us to differentiate them. Even when we hear two violinists playing in unison, we are aware that we are hearing two violins and not a single instrument producing the same sound stream. At least two important factors in our perception permit us to differentiate the two sources. Firstly, the micro fluctuations of the spectral components from one of the sources will be precisely in step with one another but generally out of step with those of the other source. So in the continuation we can aurally separate the sources. Secondly, the onset of the two events will be slightly out of synchronisation no matter how accurately they are played. Thus we can aurally separate the two sources in the onset moments.
If we now precisely align the onsets of two (or more) sounds to the nearest sample (onset synchronisation) our ability to separate the sources at onset is removed. The instantaneous percept is one of a single source. However, the continuation immediately reveals that we are mistaken. We thus produce a percept with "dual causality". At its outset it is one source but it rapidly unfolds into two.

In Sound example 4.11 from Vox-5 this process is applied to three vocal sources. Listen carefully to the first sound in the sequence. The percept is of "bell" but also voices, even though the sources are only untransformed voices. This initial sound initiates a sequence of similar sounds, but as the sequence proceeds the vocal sources are also gradually spectrally stretched (see Chapter 3) becoming more and more bell-like in the process.
CHAPTER 5
CONTINUATION
WHAT IS CONTINUATION?
Apart from grain-duration sounds, once a sound has been initiated it must continue to evolve in some way. Only contrived synthetic sounds remain completely stable in every respect. In this Chapter we will discuss various properties of this sound continuation, sometimes referred to as morphology and allure.

Some types of sound-continuation are, however, quite special. Sounds made from a sequence of perceived rapid onsets (grain-streams), sounds made from sequences of short and different elements (sequences) and sounds which dynamically transform one set of characteristics into a quite different set (dynamic interpolation) all have special continuation properties which we will discuss in later chapters. Here we will deal with the way in which certain single properties, or a small set of properties, of a sound may evolve in a fairly prescribed way as the sound unfolds. These same properties may evolve similarly over grain-streams, sequences and dynamically interpolating sounds. They are not mutually exclusive.
DISPERSIVE CONTINUATION & ITS DEVELOPMENT
Certain natural sounds are initiated by a single or brief cause (striking, a short blow or rub) and then continue to evolve because the physical material involved has some natural internal resonance (stretched metal strings, bells) or the sound is enhanced by a cavity resonance (slot drum, resonant hall acoustics). As the medium is no longer being excited, however, the sound will usually gradually become quieter (not inevitably; for example the sound of the tam-tam may grow louder before eventually fading away) and its spectrum may gradually change. In particular, higher frequencies tend to die away more quickly than lower frequencies except in cases where a cavity offers a resonating mode to a particular pitch, or pitch area, within the sound. This may then persist for longer than any other pitch component not having such a resonating mode. We will call this mode of continuation attack dispersal. (Sound example 5.1).
In the studio we can immediately reverse this train of events (sound reversal), causing the sound to grow from nowhere, gradually accumulating all its spectral characteristics and ending abruptly at a point of (usually) maximum loudness. The only real-world comparable experience might be that of a sound-source approaching us from a great distance and suddenly stopping on reaching our location. We will call this type of continuation an accumulation.
Accumulations are more startling if they are made from non-linear dispersals. The decay of loudness and spectral energy of a piano note tends to be close to linear, so the associated accumulation is little more than a crescendo. Gongs or tam-tams or other complex-spectra events (e.g. the resonance of the undamped piano frame when struck by a heavy metal object) have a much more complex dispersal in which many of the initial high frequency components die away rapidly. The associated accumulation therefore begins very gradually but accelerates in spectral "momentum" towards the end, generating a growing sense of anticipation. (Sound example 5.2).
The structures of dispersal and accumulation can be combined to generate more interesting continuation structures. By splicing copies of one segment of a dispersal sound in a back-to-back fashion so that the reversed version exactly dovetails into its forward version, we can create an ebb and flow of spectral energy (see Diagram 1). By time-variably time-stretching (time-warping) the result, we can avoid a merely time-cyclic ebb and flow. More significantly, the closer we cut to the onset of the original sound, the more spectral momentum our accumulation will gather before releasing into the dispersal phase. Hence we can build up a set of related events with different degrees of musical intensity or tension as the accumulation approaches closer and closer to the onset point. (See Diagram 2). As the listener cannot tell how close any particular event will approach, the sense of spectral anticipation can be played with as an aspect of compositional structure.
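As a rough illustration, the back-to-back construction might be sketched as follows (a minimal numpy sketch, not a description of any particular studio instrument; the cut points and crossfade length are arbitrary assumptions):

    import numpy as np

    def dovetail(sound, cut_start, cut_end, fade=256):
        # Splice a reversed copy of sound[cut_start:cut_end] onto its forward
        # version, crossfading at the join, so an accumulation releases
        # directly into its dispersal.
        segment = sound[cut_start:cut_end]
        accumulation = segment[::-1]          # reversed copy: energy gathers
        dispersal = segment                   # forward copy: energy disperses
        ramp = np.linspace(0.0, 1.0, fade)
        join = accumulation[-fade:] * (1.0 - ramp) + dispersal[:fade] * ramp
        return np.concatenate([accumulation[:-fade], join, dispersal[fade:]])

Cutting cut_start closer and closer to the onset of the source yields the family of increasingly tense events described above.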
This reminds me somewhat of the Japanese gourmet habit of eating a fish which is poisonous if the flesh too close to the liver is eaten. Some aficionados, out of bravado, ask for the fish to be cut within a hair's breadth of the liver, sometimes with fatal consequences. Sound composition is, fortunately, a little less dangerous. (Sound example 5.3).
Again, time-warping, spatial motion or different types of spectral reinforcement (see Chapter 3) can be used to develop this basic idea.
UNDULATING CONTINUATION AND ITS DEVELOPMENT
Certain time varying properties of sound evolve in an undulating manner. The most obvious examples of these are undulation of pitch (vibrato) and undulation of loudness (tremolando). Undulating continuation is related to physical activities like shaking and it is no accident that a wide, trill-like, vibrato is known as a "shake". These variations in vocal sounds involve, in some sense, the physical shaking of the diaphragm, larynx or throat or (in more extended vocal techniques) the rib cage, the head or the whole body (!). This may also be induced in elastic physical materials (like thin metal sheets, thin wooden boards etc) by physically shaking them. (Sound example 5.4).
In naturally occurring vibrato and tremolando, there is moment-to-moment instability of undulation speed (frequency) and undulation depth (pitch-excursion for vibrato, loudness fluctuation for tremolando) which is not immediately obvious until we create artificial vibrato or tremolando in which these features are completely regular. Completely regular speed, in particular, gives the undulation a cyclical, or rhythmic, quality, drawing our attention to its rhythmicity. (Sound example 5.5).
Both speed and depth of vibrato or tremolo may have an overall trajectory (e.g. increasing speed, decreasing depth etc.). In many non-Western art music cultures, subtle control of vibrato speed and depth is an important aspect of performance practice. Even in Western popular music, gliding upwards onto an Hpitch as vibrato is added is a common phenomenon. (Sound example 5.6).
Vocal vibrato is in fact a complex phenomenon. Although the pitch (and therefore the partials) of the sound drift up and down in frequency, for a given vowel sound the spectral peaks (formants) remain where they are, or, in diphthongs, move independently. (See Appendix p10 and p66). The pitch excursions of vibrato thus make it more likely that any particular partial in a sound will spend at least a little of its existence in the relatively amplified environment of a spectral peak. Hence vibrato can be used to add volume to the vocal sound. (See Diagram 3).
[Diagrams 2 and 3: graphic sketches. Diagram 2 illustrates cuts taken progressively closer to the onset of a dispersal sound; Diagram 3 plots amplitude against frequency, showing the pitch excursions of vibrato passing beneath fixed spectral (formant) peaks.]
Ideally we should separate formant data before adding some new vibrato property to a sound, reimposing the separate motion of the formants on the pitch-varied sound. (See formant preserving spectral manipulation : Appendix p17). In practice we can often get away with mere tape-speed variation transposition of the original sound source.
We may extract the ongoing evolution of undulating properties using pitch-tracking (Appendix pp70-71) or envelope following (Appendix p58) and apply the extracted data to other events. We may also modify (e.g. expansion, compression: Appendix p60) the undulating properties of the source-sound. Alternatively, we may define undulations in pitch (vibrato) or loudness (tremolo) in a sound, with control over time-varying speed and depth. In this way we may, e.g., impose voice-like articulations on sounds with static spectra, or non-vocal environmental sources (e.g. the sound of a power drill). (Sound example 5.7).
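By way of illustration, imposed tremolo and vibrato might be sketched as follows (a minimal numpy sketch; fixed speed and depth are used here, whereas the processes described above would make both time-varying, and no attempt is made to preserve formants):

    import numpy as np

    def tremolo(sound, sr, speed_hz=6.0, depth=0.5):
        # Undulate the loudness: depth=0 leaves the sound untouched,
        # depth=1 lets the amplitude dip to zero on each cycle.
        t = np.arange(len(sound)) / sr
        mod = 1.0 - depth * 0.5 * (1.0 - np.cos(2 * np.pi * speed_hz * t))
        return sound * mod

    def vibrato(sound, sr, speed_hz=6.0, depth_semitones=0.5):
        # Undulate the pitch by reading the sound at a wobbling speed
        # (simple variable-speed transposition, so formants move too).
        t = np.arange(len(sound)) / sr
        ratio = 2.0 ** (depth_semitones * np.sin(2 * np.pi * speed_hz * t) / 12.0)
        read_pos = np.cumsum(ratio)
        read_pos = read_pos[read_pos < len(sound) - 1]
        return np.interp(read_pos, np.arange(len(sound)), sound)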
We may also produce extreme, or non-naturally-occurring, articulations using extremely wide (several octaves) vibrato, or push tremolando to the point of sound granulation. At all stages the overall trajectories (gradual variation of speed and depth) will be important compositional parameters. (Sound example 5.8).
In the limit (very wide and slow) tremolando becomes large-scale loudness trajectory and we may progress from undulating continuation to forced articulation (and vice versa). Similarly, vibrato (very wide and slow) becomes the forced articulation of moving pitch. We thus move from the undulatory articulation of a stable Hpitch to the realm of pitch motion structures in which Hpitch has no further significance. (Sound example 5.9).
At the opposite extreme, tremolando may become so deep and fast that the sound granulates. We may then develop a dense grain texture (see Chapter 8) on which we may impose a new tremolando agitation, and this may all happen within the ongoing flow of a single event. (Sound example 5.10).
We may also imagine undulatory continuation of a sound's spectrum, fluctuating in its harmonicity/inharmonicity dimension, its stableness-noise dimension, or in its formant-position dimension (spectral undulation). Formantal undulations (like hand-yodelling, head-shake flutters or "yuyuyuyu" articulation) can be produced and controlled entirely vocally. (Sound example 5.11).
FORCED CONTINUATION AND ITS DEVELOPMENT
In any system where the energising source has to be continually applied to sustain the sound (bowed string, blown reed, speech) the activator exercises continuous control over the evolution of the sound. With an instrument, a player can force a particular type of continuation, a crescendo, a vibrato or tremolando, or an overblowing, usually in a time-controllable fashion. In general, the way in which the sound changes in loudness and spectrum (with scraped or bowed sounds etc) or sometimes in pitch (with rotor generation as in a siren or wind machine) will tell us how much energy is being applied to the sounding system. These forced loudness, spectral or pitch movement shapes may thus be thought of as physical gestures translated into sound.
Any gestural shape which can be applied by breath control or hand pressure on a wind or bowed string instrument can be reproduced over any arbitrary sound (a sustained piano tone, the sound of a dense crowd) by applying an appropriate loudness trajectory (enveloping) with perhaps more subtle paralleling features (spectral enhancement by filtering, harmonicity shifting by spectral stretching or subtle delay). Moreover, these sonically created shapes can transcend the boundaries of the physically likely (!). Sounds which are necessarily quiet in the real world (unvoiced whispering) can be unnervingly loud, while sounds we associate with great forcefulness, e.g. the crashing together of large, heavy objects, the forced grating of metal surfaces, can be given a pianissimo delicacy.
Moreover, we can extract the properties of existing gestures and modify them in musically appropriate ways. This is most easily done with time-varying loudness information which we can capture (envelope following) and modify using a loudness trajectory manipulation instrument (envelope transformation), reapplying it to the original sound (envelope substitution), or transferring it to other sounds (enveloping or envelope substitution : see Appendix p59).
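For illustration only, envelope following and envelope substitution might be sketched like this (a minimal numpy sketch, not the studio instruments themselves; the window length is an arbitrary assumption):

    import numpy as np

    def follow_envelope(sound, window=512):
        # Track the loudness trajectory as a windowed RMS contour.
        pad = np.pad(sound, (0, (-len(sound)) % window))
        frames = pad.reshape(-1, window)
        return np.sqrt((frames ** 2).mean(axis=1))

    def substitute_envelope(target, envelope, window=512):
        # Impose an extracted loudness trajectory on another sound,
        # first flattening the target's own envelope.
        own = follow_envelope(target, window)
        n = min(len(own), len(envelope))
        gain = envelope[:n] / np.maximum(own[:n], 1e-9)
        frame_times = np.arange(n) * window + window / 2
        per_sample = np.interp(np.arange(len(target)), frame_times, gain)
        return target * per_sample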
Information can be extracted from instrumental performance (which we might specifically compose or improvise for the purpose), speech or vocal improvisation, but also from fleeting unpredictable phenomena (the dripping of a tap) or, working in stereo, the character of a whole field of naturally occurring events (e.g. traffic flow, the swarming of bats etc). The extracted gestural information can then be modified and reapplied to the same material (envelope transformation followed by envelope substitution), or applied to some entirely different musical phenomenon (enveloping or envelope substitution: see Appendix p59), or stored for use at some later date. All such manipulations of the loudness trajectory are discussed more fully in Chapter 10.
Sounds may also have a specific spatial continuation. A whole chapter of On Sonic Art is devoted to the exploration of spatial possibilities in sound composition. Here we will only note that spatial movement can be a type of forced continuation applied to a sound, that it will work more effectively with some sounds than with others, and that it can be transferred from one sound to another provided these limitations are borne in mind. Noisy sounds, grain-streams and fast sequences move particularly well, low continuous sounds particularly poorly. Some sounds move so well that rapid spatial movement (e.g. a rapid left-right sweep) may appear as an almost indecipherable quality of grain or of sound onset. (Sound example 5.12).
The movement from mono into stereo may also play a significant role in dynamic interpolation (see Chapter 12).
CONSTRUCTED CONTINUATION: TIME-STRETCHING & REVERBERATION
In many cases we may be faced with a relatively short sound and wish to create a continuation for it. There are a number of ways in which sounds can be extended artificially, some of which reveal continuation properties intrinsic to the sound (reverberation, time-stretching, some kinds of brassage) while others impose their own continuation properties (zigzagging, granular-extension and other types of brassage).

The most obvious way to create continuation is through time-stretching. There are several ways to do this (brassage/harmoniser, spectral stretching, waveset time stretching, granular time stretching) and these are discussed more fully in Chapter 11. It is clear, however, that through time-stretching we can expand the indivisible qualitative properties of a grain into a perceptibly time-varying structure, a continuation. In this way an indivisible unity can be made to reveal a surprising morphology. This process can also change a rapid continuation into something in a phrase time-frame; for example, formant-gliding consonants ("w", "y" in English) become slow formant transformations ("oo" -> "uh" and "ee" -> "uh"). Conversely, a continuation structure can be locked into the indivisible qualitative character of a grain, through extreme time-contraction. (Sound example 5.13).
Continuation may also be generated through reverberation. This principle is used in many acoustic instruments where a sound box allows the energy from an instantaneous sound-event to be reflected around a space and hence sustained. Reverberation will extend any momentarily stable pitch and spectral properties in a short event. It may also be combined with filtering to sustain specific pitches or spectral bands. It provides a new dispersal continuation for the sound. (Sound example 5.14).
Natural reverberation is heard when (sufficiently loud) sounds are played in rooms or enclosures of any kind, except where the reverberation has been designed out, as in an anechoic chamber. Natural reverberation is particularly noticeable in old stone buildings (e.g. churches) or tiled spaces like bathrooms or swimming pool enclosures. Reverberation in fact results from various delayed (due to travelling in indirect paths to the ear, by bouncing off walls or other surfaces) and spectrally altered (due to the reflection process taking place on different physical types of surface) versions of the sound being mixed with the direct sound as it reaches the ear. Such processes can be electronically mimicked. The electronic mimicry has the added advantage that we can specify the dimensions of unlikely or impossible spaces (e.g. infinite reverberation from an 'infinitely long' hall, the inside of a biscuit tin for an orchestra etc.). There is thus an enormous variety of possibilities for creating sound continuation through reverberation.
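As a crude illustration of this principle of mixing delayed, attenuated copies back with the direct sound (a toy sketch only, not a usable room simulation; the delay times, feedback amount and tail length are arbitrary assumptions):

    import numpy as np

    def toy_reverb(sound, sr, delays_ms=(23, 41, 59, 73), feedback=0.4, tail_s=1.5):
        # Mix the direct sound with recirculating delayed copies
        # (a cascade of simple feedback delays), building up a tail.
        out = np.concatenate([sound, np.zeros(int(sr * tail_s))])
        for d_ms in delays_ms:
            d = int(sr * d_ms / 1000)
            for i in range(d, len(out)):      # slow pure-Python loop, for clarity only
                out[i] += feedback * out[i - d]
        return out / np.max(np.abs(out))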
Reverberation can be used to add the ambience of a particular space to any kind of sound. In this case we are playing with the illusion of the physical nature of the acoustic space itself, the generalised continuation properties of our whole sound set. But it can also be used in a specific way to enlarge or alter the nature of the individual sound events, extending (elements of) the spectrum in time.

Reverberation cannot, however, by itself, extend any undulatory or forced continuation properties of a sound. On the contrary it will blur these. Moving pitch will be extended as a pitch band, loudness variations will simply be averaged in the new dispersal. We can, of course, post-hoc, add undulatory or forced continuation properties (vibrato, tremolo, enveloping) to a reverberation-extended sound. These will in fact help to unify the percept initiator-reverberator as a single sound-event. The reverberation part of the sound will appear to be part of the sound production itself, rather than an aspect of a sound box or characteristic of a room. (Sound example 5.15).
Initiator-reverberator models have in fact been used recently to build synthesis models of acoustic instruments, from vibraphones, where the separation might seem natural, to tubas.
CONSTRUCTED CONTINUATION: ZIGZAGGING & BRASSAGE
Continuation can be generated by the process of zigzagging. Here, the sound is read in a forward-backward-forward etc sequence, each reversal-read starting at the point where the preceding read ended. The reversal points may be specified to be anywhere in the sound. Provided we start the whole process at the sound's beginning and end it at its end, the sound will appear to be artificially extended. In the simplest case, the sound may oscillate between two fixed times in the middle of the sound until told to proceed to the end. This "alternate looping" technique was used in early commercial samplers to allow a short sampled sound to be sustained. It may work adequately where the chosen portion of the sound has a stable pitch and spectrum. In general, however, any segment of a sound, beyond grain time-frame, will have a continuation structure and looping over it will cause this structure to be heard as a mechanical repetition in the sound. (See Appendix p43). (Sound example 5.16).
Zigzagging, however, can move from any point to any other within the sound, constantly changing length and average position as it does so. We can therefore use the process to select spectral, pitch or loudness movements or discontinuities in a sound and focus attention on them by alternating repetition. By altering the zigzag points subtly from zig to zig, we may vary the length (and hence duration) of the repeated segment. In this way zigzagging can be used to generate non-mechanical undulatory properties, or (on a longer timescale) dispersal-accumulation (see above) effects, within a sound-continuation. (Sound example 5.17).
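A minimal sketch of the zigzag read itself (assuming the reversal points are supplied as a list of sample positions beginning at 0 and ending at the length of the sound; the crossfades a practical instrument would add at each turn are omitted):

    import numpy as np

    def zigzag(sound, turn_points):
        # Read forward or backward between successive turn points,
        # each read starting where the previous one ended.
        out = []
        pos = turn_points[0]
        for nxt in turn_points[1:]:
            if nxt >= pos:
                out.append(sound[pos:nxt])        # forward read
            else:
                out.append(sound[nxt:pos][::-1])  # backward read
            pos = nxt
        return np.concatenate(out)

    # e.g. zigzag(snd, [0, 40000, 20000, 50000, 25000, len(snd)]) extends the
    # sound while still beginning at its start and ending at its end.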
Sounds can also be artificially continued by brassage. In the brassage process successive segments are cut from a sound and then respliced together to produce a new sound. Clearly, if the segments are replaced exactly as they were cut, we will reproduce the original sound. We can extend the duration of the sound by cutting overlapping segments from the source, but not overlapping them in the goal sound. (See Appendix p44-B). (See note at end of chapter).
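A rough sketch of this duration-extending brassage (segment length and stretch factor are arbitrary illustrative parameters; the short crossfades of real brassage, mentioned in the note at the end of the chapter, are omitted):

    import numpy as np

    def brassage_stretch(sound, stretch=2.0, seglen=2048):
        # Cut overlapping segments from the source at a step of
        # seglen/stretch, but butt them end to end in the output,
        # so the result is roughly 'stretch' times longer.
        step = max(1, int(seglen / stretch))
        segments = [sound[i:i + seglen]
                    for i in range(0, len(sound) - seglen, step)]
        return np.concatenate(segments)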
As discussed previously, grain time-frame segments will produce a simple time-stretching of the sound (harmoniser effect). Slightly larger segments may introduce a granular percept into the sound as the perceived evolving shapes of the segments are heard as repeated. (Sound example 5.18). The segment granulation of an already granular source may produce unexpected phasing or delay effects within the sound. Longer segments, especially when operating on a segmented source (a melody, a speech stream), will result in a systematic collage of its elements. With regular segment-size our attention will be drawn to the segment length and the percept will probably be repetitively rhythmic. However, we may vary the segment size, either progressively or at random, producing a less rhythmicised, collage-type extension. (Sound example 5.19).
This idea of brassage can be generalised. Using non-regular grain size near to the grain time-frame boundary, the instantaneous articulations of the sound will be echoed in an irregular fashion, adding a spectral (very short time-frame) or articulatory (brief but longer time-frame) aura to the time-stretched source. We may also permit the process to select segments from within a time-range measured backwards from the current position in the source-sound (see Appendix p44-C). In this way, echo percepts are randomised further. Subtly controlling this and the previous factors, we can extract rich fields of possibilities from the small features of sounds with evolving spectra (especially sequences, which present us with constantly and perceptibly evolving spectra). (Sound example 5.20).
Ultimately, we can make the range include the entire span of the sound up to the current position (see Appendix p44-D). Now, as we proceed, all the previous properties of the sound become grist to the mill of continuation-production. In the case e.g. of a long melodic stream which we brassage in this way using a relatively large segment-size (including several notes), we will create a new melodic stream including more and more of the notes in the original melody. The original melody will thus control the evolving HArmonic field of Hpitch possibilities in the goal sound. (Sound example 5.21). On a smaller time-frame, the qualities of a highly characteristic onset event can be redistributed over the entire ensuing continuation. (Sound example 5.22).
53
The brassage components may also be varied in pitch. If this is done at random over a very small range, the effect will be to broaden the pitch band or the spectral energy of the original sound. (Sound example 5.23). We may also cycle round a set of pitches in a very small band, providing a subtle "stepped vibrato" inside the continuation (particularly if grain-size is slightly random-varied to avoid rhythmic constructs in perception). (Sound example 5.24). The pitch band can also be progressively broadened. (Sound example 5.25). The loudness of segments can also be varied in a random, progressive or cyclical way. We might also spatialise, or progressively spatialise, the segmented components, moving from a point source to a spread source. (Sound example 5.26).
Eventually, such evolved manipulations (and their combinations) force a continuous or coherently segmented source to dissociate into a mass of atomic events. The ultimate process of this type is sound shredding, which completely deconstructs the original sound and will be discussed in Chapter 7.
CONSTRUCTED CONTINUATION: GRANULAR RECONSTRUCTION
We may generalise the brassage concept further, taking us into the realm of granular reconstruction. As with brassage, our source sound is cut into segments which we may vary in duration, pitch or loudness. However, instead of merely resplicing these tail-to-tail, they are used as the elements of an evolving texture in which we can control the density and the time-randomisation of the elements. (See Appendix p73). (Sound example 5.27).
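A minimal sketch of such a reconstruction (grain length, density, scatter and stretch are illustrative parameters of this sketch, not values prescribed by the text):

    import numpy as np

    def granular_reconstruct(sound, sr, stretch=1.0, grain_ms=50,
                             density=2.0, scatter_ms=10, seed=0):
        # Scatter enveloped grains cut from the source onto an output
        # timeline: 'density' grains per grain-length of output time,
        # each randomly displaced by up to +/- scatter_ms.
        rng = np.random.default_rng(seed)
        grain = int(sr * grain_ms / 1000)
        scatter = int(sr * scatter_ms / 1000)
        out = np.zeros(int(len(sound) * stretch) + grain + scatter)
        env = np.hanning(grain)
        step = max(1, int(grain / density))
        for out_pos in range(0, len(out) - grain - scatter, step):
            src_pos = min(int(out_pos / stretch), len(sound) - grain - 1)
            jitter = int(rng.integers(-scatter, scatter + 1))
            start = max(0, out_pos + jitter)
            out[start:start + grain] += sound[src_pos:src_pos + grain] * env
        return out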
In this way we can overlay segments in the resulting stream, or, if segments are very short, introduce momentary silences between the grains. This process, especially where used with very tiny grains, is also known as granular synthesis (the boundaries between sound processing and synthesis are fluid) and we may expect the spectral properties and the onset characteristics of the grains to influence the quality of the resulting sound stream, alongside imposed stream characteristics (density, pitch spread, loudness spread, spatial spread etc.). This process then passes over into texture control, and is discussed more fully in Chapter 8. (Sound example 5.28).
With granular reconstruction, if we keep the range and pitch-spread small, we may expect to generate a time-stretched goal-sound which is spectrally thickened (and hence often more noisy, or at least less focussed), the degree of thickening being controlled by our choice of both density and pitch bandwidth. But as range, density and bandwidth are increased and segment duration varied, perhaps progressively, the nature of the source will come to have a less significant influence on the goal sound. It will become part of the overall field-properties of a texture stream (see Chapter 8). (Sound example 5.29).
Final note: in order to simplify the discussion in this chapter (and also in the diagrammatic appendix) brassage has been described here as a process in which the cut segments are not overlapped (apart from the length of the edits themselves) when they are reassembled. In practice, normal brassage/harmoniser processes use a certain degree of overlap of the adjacent segments to ensure sonic continuity and avoid granular artefacts (where these are not required) in the resulting sound.
CHAPTER 6
GRAIN-STREAMS
DISCONTINUOUS SPECTRA
In this chapter and the next we will discuss the particular properties of sounds with perceptibly
discontinuous spectra. The spectra of many sounds are discontinuous on such a small time-frame that
we perceive the result as noise (or in fact as pitch if the discontinuities recur cyclically), rather than as a
sequence of changing but definite sound events. Once, however, the individual spectral moments are
stable or stable-in-motion for a grain time-frame or more, we perceive a sound-event with definite
but rapidly discontinuous properties.
In one sense, all our sound-experience is discontinuous. No sound persists forever and, in the real world, will be interrupted by another, congruously or incongruously. We are here concerned with perceived discontinuous changes in the time range from the speed of normal speech down to the lower limit of grain perception.
Compositionally, we tend to demand different things of discontinuous sounds than of continuous ones. In particular, if we time-stretch a continuous sound, we may be disturbed by the onset distortion but the remainder of the sound may appear spectrally satisfactory. If we time-stretch a discontinuous sound, however, we will be disconcerted everywhere by onset distortion as the sound is a sequence of onsets. Often we want the sound (e.g. in the real environment, a drum roll, a speech-stream) to be delivered more slowly to us without the individual attacks (the drum strike, the timbral structure of consonants) being smeared out and hence transformed. We wish to be selective about what we time-stretch!
The idea of slowing down an event stream without slowing down the internal workings of the events is quite normal in traditional musical practice - we just play in a slower tempo on the same instrument and the internal tempi of the onset events are not affected. But with recorded sounds we have to make special provision to get this to work.
We will divide discontinuous sounds into two classes for the ensuing discussion. A grain-stream is a sound made by a succession of similar grain events. In the limit it is simply a single grain rapidly repeated. Even where this (non!) ideal limit is approached in naturally occurring sounds (e.g. low contrabassoon notes) we will discover that the individual grains are far from identical, nor are they ever completely regularly spaced in time. (Sound example 6.1).
Discontinuous sounds consisting of different individual units (speech, a melody on a single instrument, any rapid sequence of different events) we will refer to as sequences and will discuss these in the next Chapter.

Both grain-streams and sequences can have (or can be composed to have) overall continuation properties (dispersive, undulatory and forced continuation and their developments, as discussed in Chapter 5). (Sound example 6.2). In this chapter and the next, we will talk only about those properties which are special to grain-streams and sequences.
CONSTRUCTING GRAIN STREAMS
Grain-streams appear naturally from iterative sound-production - any kind of roll or trill on drums, keyed percussion or any sounding material. They are produced vocally by rolled "r" sounds of various sorts in various languages, by lip-farts and by tracheal rasps. Vocally, such sounds may be used to modulate others (sung rolled "r", whistled rolled "r", flutter-tongued woodwind and brass etc). The rapid opening and closing of a resonating cavity containing a sound source (e.g. the hand over the mouth as we sing or whistle) can be used to naturally grain-stream any sound. (Sound example 6.3).
In the studio, any continuous sound may be grain-streamed by imposing an appropriate on-off type loudness trajectory (enveloping), which itself might be obtained by envelope following another sound (see Chapter 10). (On-offness might be anything from a deep tremolando fluttering to an abrupt on-off gating of the signal). A particularly elegant way to achieve the effect is to generate a loudness trajectory on a sound tied to the wavesets or wavecycles it contains (waveset enveloping). If each on-off type trajectory is between about 25 and 100 wavesets in length, we will hear grain-streaming. (Below this limit, we may produce a rasping or spectral "coarsening" of the source sound). This process ties the grain-streaming to the internal properties of the source-sound so, for example, the grain-streaming ritardandos if the perceived pitch falls. This suggests to the ear that the falling portamento and the ritardando are causally linked and intrinsic to the goal-sound, rather than a compositional affectation (!). (Sound example 6.4).
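As a rough illustration of grain-streaming a continuous sound by enveloping (a minimal sketch with a fixed grain rate and duty cycle, not tied to wavesets as the waveset-enveloping process described above would be):

    import numpy as np

    def grain_stream(sound, sr, rate_hz=20.0, duty=0.6):
        # Impose an on-off loudness trajectory: within each period the
        # sound is faded in and out over 'duty' of the period and
        # silenced for the remainder.
        period = int(sr / rate_hz)
        on = max(1, int(period * duty))
        window = np.zeros(period)
        window[:on] = np.hanning(on)
        n_periods = int(np.ceil(len(sound) / period))
        envelope = np.tile(window, n_periods)[:len(sound)]
        return sound * envelope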
Alternatively grain-streams may be constructed by splicing together individual grains (!). Looping can be used to do this but will produce a mechanically regular result. An instrument which introduces random fluctuations of repetition-rate and randomly varies the pitch and loudness of successive grains over a small range (iteration) produces a more convincingly natural result. (Sound example 6.5).

More compositionally flexible, but more pernickety, is to use a mixing program so that individual grains can be placed and ordered, then repositioned, replaced or reordered using meta-instruments which allow us to manipulate mixing instructions (mixshuffling) or to generate and manipulate time-sequences (sequence generation).
In this way grain-streams can be given gentle accelerandos or ritardandos of different shapes and be slightly randomised to avoid a mechanistic result. (Sound example 6.6). Similarly, the grains themselves can be sequentially modified using some kind of enveloping (see Chapter 4), or spectral transformation tools (e.g. destructive distortion through waveset distortion, waveset harmonic distortion or waveset-averaging : see Chapter 3), combined perhaps with inbetweening (see Chapter 12) to generate a set of intermediate sound-states. (Sound example 6.7).
Short continuous sounds can be extended into longer grain-streamed sounds by using brassage with appropriate (grain time-frame duration) segment-size. (Sound example 6.8). The granulation of the resulting sound can be exaggerated by corrugation (see Chapter 10) and the regularity of the result mitigated by using some of the grain-stream manipulation tools to be described below.
Many of these compositional processes provide means of establishing audible (musical) links between
materials of different types. We are able to link grains with grain-streams and continuous sounds with
grain-streams in this way and hence begin to build networks of musical relationships amongst diverse
materials.
DISSOLVING GRAIN-STREAMS
Just as continuous sounds can be made discontinuous, grain-streams can be dissolved into continuous, or even single-grain, forms. Speeding up a grain-stream by tape speed variation or spectral time-contraction (with no grain pitch alteration) may force the grain separation under the minimum time limit for grain-perception and the granulation frequency will eventually emerge as a pitch. (Sound example 6.9). Alternatively, by speeding up the sequence rate of grains without changing the grains (granular time-shrinking: see below) we will breach the grain-perception limit. The sound will gradually become a continuous fuzz. In this case a related pitch may or may not emerge. (Sound example 6.10). Reverberation will blur the distinction between grains. (Sound example 6.11). Increasing the grain density (e.g. via the parameters of a granular synthesis instrument) will also gradually fog over the granular property of a grain-stream. (Sound example 6.12).
Grain-streams may also be dissociated in other ways:

(1) slowing down the sequence of grains but not the grains themselves, so that grains become detached events in their own right (granular time-stretching: Sound example 6.13).

(2) slowing down the sequence and the grains, so the internal morphology of the individual events comes to the foreground of perception (spectral time-stretching: Sound example 6.14).

(3) gradually shifting the pitches or spectral quality of different grains differently, so the grain-stream becomes a sequence (granular reordering : Sound example 6.15).
Again. we are describing here ways in which networks of musical relationships can be established
amongst diverse musical materials.
CHANGING GRAIN-STREAM STRUCTURE
In some ways, a grain-stream is akin to a note-sequence on an instrument. In the latter we have control over the timing and Hpitching and sequencing of the events. In a grain-stream not constructed from individually chosen grains, but e.g. by enveloping a continuous source, we do not initially have this control. However, we would like to be able to treat grains in a similar way to the way we deal with note events. By the appropriate use of gating, cutting and resplicing or remixing, which may all be combined in a single sound processing instrument, we can retime the grains in a sequence using various mathematical templates (slow by a fixed factor, ritardando arithmetically, geometrically or exponentially, randomise grain locations about their mean, shrink the time-frame in similar ways and so on : granular time-warping). The grain-stream can thus be elegantly time-distorted without distorting the grain-constituents. (Sound example 6.16).
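Assuming the grains have already been separated (for example by gating at the silences between them), one of the retiming templates mentioned above might be sketched as follows (the geometric ritardando shown, and its parameters, are illustrative choices):

    import numpy as np

    def retime_grains(grains, sr, start_gap=0.05, ratio=1.1):
        # Resplice a list of grain arrays with geometrically lengthening
        # gaps between onsets (a geometric ritardando), leaving each
        # grain itself untouched.
        onsets, t, gap = [], 0.0, start_gap
        for _ in grains:
            onsets.append(int(t * sr))
            t += gap
            gap *= ratio
        length = max(o + len(g) for o, g in zip(onsets, grains))
        out = np.zeros(length)
        for onset, g in zip(onsets, grains):
            out[onset:onset + len(g)] += g
        return out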
We can also reverse the grain order (granular reversing), without reversing the individual grains themselves (sound reversing), producing what a traditional composer would recognise as a retrograde (as opposed to an accumulation, see Chapter 5). Thus a grain-stream moving upwards in pitch would become a grain-stream moving downwards in pitch. (Sound example 6.17).
We can go further than this. We might rearrange the grains in some new order (grain reordering). A
rising succession of pitched sounds might thus be converted into a (motivic) sequence. (Sound
example 6.18). Or we might move the pitch of individual grains without altering the time-sequence of
the stream, or alter the timing on individual grains without altering their pitch. or some combination of
both. In this way the control of grain-streams passes over into more conventional notions of event
sequencing. melody and rhythm. I will not dwell on these here because hundreds of books have already
been written about melody and rhythm. whereas sound composition is a relatively new field. We will
therefore concentrate on what can be newly achieved, assuming that what is already known about
traditional musical parameters is already part of our compositional tool kit.
CHAPTER 7
SEQUENCES
MELODY & SPEECH
The most common examples of sequences in the natural world are human speech, and melodies played on acoustic instruments. However any rapidly articulated sound stream can be regarded as a sequence (some kinds of birdsong, klangfarbenmelodie passing between the instruments of an ensemble, a "break" on a multi-instrument percussion set etc). We can also construct disjunct sequences of arbitrary sounds by simply splicing them together (e.g. a dripping tap, a car horn, an oboe note, a cough - with environmentally appropriate or inappropriate loudness and/or reverberation relations one to the other) or by modifying existing natural sequences (time-contraction of speech or music, for example).
(Sound example 7.1).
Naturally occurring sequences cannot necessarily be accurately reproduced by splicing together (in the studio) constituent elements. The speech stream in particular has complex transition properties at the interfaces between different phonemes which are (1994) currently the subject of intensive investigation by researchers in speech synthesis. To synthesize the speech stream it may be more appropriate to model all the transitions between the phonemes we tend to notate in our writing systems, rather than those elements themselves (this is Diphone synthesis). Starting from the separate elements themselves, to achieve the flowing unity of such natural percepts as speech, it may be necessary to "massage" a purely spliced-together sequence. A simple approach might be to add a little subtle reverberation. However, for the present discussion, we will ignore this subtle flow property and treat all
sequences as if they were formally equivalent.
Clearly, sequences of notes on a specific instrument and sequences of phonemes in a natural language have well-documented properties, but here we would like to consider the properties of any sequence whatsoever.

CONSTRUCTING & DESTRUCTING SEQUENCES
Sequences can be generated in many ways, apart from splicing together the individual elements "by hand". Any sound source in directed motion (e.g. a pitch-glide or a formant glide) can be spliced into elements which, when rearranged, do not retain the spectral-continuity of the original. (Sound example 7.2). An existing speech-stream can be similarly reordered to destroy the syntactic content and (depending on where we cut) the phoneme-continuity. We may do this by chopping up the sequence into conjunct segments and reordering them (as in sound shredding: see below), or by selecting segments to cut at random (so they might overlap other chosen segments) and reordering them as they are spliced back together again (random cutting: Appendix p41). (Sound example 7.3).
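A rough sketch of the second option, random cutting (the number of segments and the length range are arbitrary assumptions of this sketch):

    import numpy as np

    def random_cut(sound, n_segments=20, min_len=2000, max_len=8000, seed=None):
        # Select segments from random (possibly overlapping) positions in
        # the source and splice them together in the order chosen.
        rng = np.random.default_rng(seed)
        pieces = []
        for _ in range(n_segments):
            length = int(rng.integers(min_len, max_len))
            start = int(rng.integers(0, len(sound) - length))
            pieces.append(sound[start:start + length])
        return np.concatenate(pieces)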
We may work with a definite segment length or with arbitrary lengths and we may shift the loudness or pitch of the materials, marginally or radically, before constructing the new sequence. Alternatively we may cut our material into sequential segments, modify each in a non-progressive manner (different filterings, pitch shifts, etc) and reconstitute the original sequence by resplicing the elements together again, but they will now have discontinuously varying imposed properties. (Sound example 7.4).
Using brassage techniques, we may achieve similar results if the segment size is set suitably large and we work on a clearly spectrally-evolving source. Using several brassages with similar segment length settings but different ranges (see Appendix p44-C) we might create a group of sequences with identical field properties but different, yet related, order properties, i.e. smaller ranges would tend to preserve the order relations of the source; large ranges would reveal the material in the order sequence of the source but, in the meantime, return unpredictably to already revealed materials. (Sound example 7.5).
Brassage with pitch variation of segments over a finite Hpitch set (possibly cyclically) could establish an Hpitch-sequence from a pitch-continuous (or other) source. (Sound example 7.6). Even brassage with spatialisation (see Appendix p45) will separate a continuous (or any other) source into a spatial sequence, and so long as they are clearly delineated, such spatial sequences can be musically manipulated. (Sound example 7.7). A sequence from left to right can become a sequence from right to left, or a sequence generally moving to the left from the right but with deviations can be restationed at a point in space, or change to an alternation of spatial position, and so on.
The perception of sequence can also be destroyed in various ways. Increasing the speed of a sequence beyond a certain limit will produce a gritty noise or even, in the special circumstance of a sequence of regularly spaced events with strong attacks, a pitch percept. Conversely, time-stretching the sequence beyond a certain limit will bring the internal properties of the sequence into the phrase time-frame and the sequence of events will become a formal property of the larger time-frame. (Sound example 7.8). Alternatively, copies of the sequence, or (groups of) its constituents, may be amassed in textures (see Chapter 8) of sufficient density that our perception of sequence is overwhelmed. (Sound example 7.9).
Conversely a sequence may be shredded (sound shredding). In this process the sequence is cut into random-length conjunct segments which are then reordered randomly. This process may be repeated ad infinitum, gradually dissolving all but the most persistent spectral properties of the sequence in a watery complexity (the complex in fact becomes a simple unitary percept). (See Appendix p41). (Sound example 7.10).
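A minimal sketch of a single pass of sound shredding (the number of cuts is an arbitrary choice; repeated passes dissolve the sequence further, as described above):

    import numpy as np

    def shred(sound, n_cuts=12, seed=None):
        # Cut the sound into random-length conjunct segments and
        # resplice them in random order.
        rng = np.random.default_rng(seed)
        cuts = np.sort(rng.integers(1, len(sound), size=n_cuts))
        segments = np.split(sound, cuts)
        rng.shuffle(segments)
        return np.concatenate(segments)

    # shredded = snd
    # for _ in range(8):          # repeated shredding
    #     shredded = shred(shredded)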
GENERAL PROPERTIES OF SEQUENCES
All sequences have two general properties. They can define both a field and an order. Thus, on the large scale, the set of utterances within a particular natural language defines a field, the set of phonemes which may be used in the language. Similarly any sequence played on a piano defines the tuning set or HArmonic field of the piano (possibly just a subset of it). It is possible to construct sequences which do not have this property (in which no elements are repeated) just as it is possible to construct pitch fields where no Hpitch reference-frame is set up. But in general, for a finite piece of music, we will be working within some field, a reference-frame for the sequence constituents. (Sound example 7.11).

Sequences are also characterised by order properties. In existing musical languages, certain sequences of notes will be commonplace, others exceedingly unlikely. In a particular natural language certain clusterings of consonants (e.g. "scr" in English) will be commonplace, others rare and yet others absent. It is easy to imagine and to generate unordered sequences, though the human mind's pattern-seeking predilection makes it over-deterministic, hearing definite pattern where pattern is only hinted at!
In the finite space of a musical composition we may expect a reference set (field) and ordering properties to be established quite quickly if they are to be used as musically composed elements. On the larger scale, these may be predetermined by cultural norms, like tuning systems or the phoneme set of a natural language, but traditional musical practice is usually concerned with working on subsets of these cultural norms and exploring the particular properties and malleabilities of these subsets.

In this context the size of the field is significant. Musical settings of prose, for example, may treat the Hpitch (and duration) material in terms of a small ordered reference set which is constantly regrouped in subsets (e.g. chord formations over a scale) and reordered (motivic variation), whereas the phonetic material establishes no such small time-frame field and order properties - the text is used as referential language, and field and order properties are on the very large timescale of extensive language utterance. In this situation, the text is perceived as being in a separate domain to the "musical".
Poetry, however, through assonance, alliteration and particularly rhyme, begins to adopt the small-scale reference-frame and order sequencing for phonemes we find normal in traditional musical practice. We therefore discover a meeting ground between phonemic and traditional Hpitch and durational musical concerns and these connections have been explored by sound poets (Amirkhanian etc) and composers (Berio etc) alike. As we move towards poetry which is more strongly focused in the sonority of words, or just of syllabic utterance, the importance of small-scale reference-frame and order sequencing may become overriding (e.g. Schwitters' Ursonate). (Sound example 7.12).
COMPOSING FIELD PROPERTIES
Constructing sequences from existing non-sequenced, or differently sequenced, objects (a flute melody, or an upward sweeping noise-band, or a traffic recording in a tunnel with a particularly strong resonance, or a conversation in Japanese) ensures that some field properties of the source sounds will inhere in the resulting sequence: a defined Hpitch reference set and a flute spectrum (with or without onset characteristics; in the latter case the field is altered), rising noise-bands within a given range, the resonance characteristics of the tunnel, the spectral characteristics of the phonemes of Japanese and perhaps the sex of the specific voices. These field properties may then define the boundaries of the compositional domain (a piano is a piano is a piano is a piano) or conversely become part of the substance of it, as we transform the field characteristics (piano -> "bell" -> "gong" -> "cymbal" -> unvoiced sibilant etc.). (Sound example 7.13).
Compositionally we can transform the field (reference-set) of a sequence through time by gradual substitution of one element for another, or by the addition of new elements, or reduction in the total number of elements (this can be done with or without a studio!). We can also gradually transform each element (e.g. by destructive distortion with inbetweening, see Chapter 12) so that the elements of a sequence become more differentiated or, conversely, more and more similar, moving towards a grain-stream (see above), or simply different. (Sound example 7.14).
Or we may blur the boundaries between the elements through processes like reverberation, delay, small time-frame brassage, spectral blurring, spectral shuffling, waveset shuffling or granular reconstruction, or simply through time-contraction (of various sorts) so that the sequence succession rate falls below the grain time-frame and the percept becomes continuous. We thus move a sequence towards a simple continuation sound. In these ways we can form bridges between sequences, grain streams and smoothly continuous sounds, establishing audible links between musically diverse materials. (Sound example 7.15).
We may also focus the field-properties of a sequence, or part of a sequence, by looping (repeating that portion). Thus a series of pitches may seem arbitrary (and may in fact be entirely random across the continuum, establishing no reference frame of Hpitch). If, however, we repeat any small subset of the sequence, it will very rapidly establish itself as its own reference set. We can thus move very rapidly from a sense of Hpitch absence to Hpitch definiteness. (Sound example 7.16).
The same set-focusing phenomenon applies to the spectral characteristics of sounds. If a meaningful group of words is repeated over and over again we eventually lose contact with meaning in the phrase as our attention focuses on the material as just a sequence of sonic events defining a sonic reference frame. (Sound example 7.17).
COMPOSING ORDER PROPERTIES
We can also compose with the order properties of sequences. In the simplest case, cycling (see above)
establishes a rhythmic subset and a definitive order relation among the cyclic grouping.
Beyond this simple procedure we move into the whole area of the manipulation of finite sets of entities which has been very extensively researched and used in late Twentieth Century serial composition. Starting originally as a method for organising Hpitches, which have specific relational properties due to the cyclical nature of the pitch domain (see Chapter 13), it was extended first to the time-domain, which has some similar properties, and then to all aspects of notated composition.
Permutation of a finite set of elements in fact provides a way of establishing relationships among element-groupings which, if the groupings are small, are clearly audible. If the sets are large, such permutations preserve the field (reference-set) so that complex musical materials are at least field-related (claims that they are perceptually order-related are often open to dispute). All the insights of serial or permutational thinking about materials may be applied to any type of sequence, provided that we ask the vital question, "is this design audible?". And in what sense is it audible? - as an explicit re-ordering relation - or as an underlying field constraint?
In traditional scored music, the manipulation of set order-relations is usually confined to a subset of properties of the sounds, e.g. fixed duration values, a set of loudness values. But we can work with alternative properties, or groups of properties, such as total spectral type. For example, permutations of instrument type in a monodic klangfarbenmelodie - Webern's Opus 21 Sinfonie provides us with such an instrument-to-instrument line and uses it canonically. We could, however, permute the instrumental sequence and not the Hpitches. Formant sequences, e.g.

bo-ba-be = do-da-de = lo-la-le

and onset characteristic sequences, e.g.

bo-ka-pe = bu-ki-po = ba-ko-pu etc.
can be most easily observed as patternable properties in language or syllables, but formant and onset characteristics can be extracted and altered in all sounds.

We might imagine an unordered set of sounds which define no small reference set, on which we impose a sequence of "bright", "hard", "soft", "subdued" etc. attacks as a definable sequence. The continuations of such sounds might be recognisable (e.g. water, speech, a bell, a piano concerto excerpt etc.) but ordering principles are applied just to the onset types. This is an extreme example because I wish to stress the point that order can be applied to many different properties and can be used to focus the listener's attention on the ordered property, or to contrast the area of order manipulation with that of lesser order. In the setting of prose this contrast is clear. In our extreme example we have suggested a kind of ordering of onset characteristics laid against a "cinematographic" montage which might have an alternative narrative logic - the two might offset and comment upon one another in the same way that motivic (and rhythmic) ordering and the narrative content of prose do so in traditional musical settings of prose.

Ordering principles can, of course, also be applied to sounds as wholes. So any fixed set of sounds can be rearranged as wholes and these order relations explored and elaborated. A traditional example is English change-ringing, where the sequence in which a fixed set of bells is rung goes through an ordered sequence of permutations. (See Diagram 1). (Sound example 7.18).

[Diagram 1: English bell-ringing patterns ("Plain Bob") - the permutation cycle on eight bells built by alternately swapping pairs of adjacent bells (pattern ALPHA), with an occasional variant swap omitting the first bell (pattern BETA), continuing until the cycle returns to 1 2 3 4 5 6 7 8.]

STRESS PATTERNS AND RHYTHM

We can apply different order-groupings to mutually exclusive sets of properties (e.g. onset, continuation, loudness, as against Hpitch) within the same reference-set of sounds. Onset characteristics and overall loudness are often organised cooperatively to create stress patterns in sequences. Combined with the organisation of durations, we have the group of properties used to define rhythm. Rhythmic structure can be established independently of Hpitch structure (as in a drum-kit solo) or parallel to and independently of Hpitch order relations. Most pre-twentieth century European art music and e.g. Indian classical music separates out these two groupings of sound properties and orders them in a semi-independent but mutually interactive way (HArmonic rhythm in Western Music, melodic/rhythmic cadencing in Indian Classical Music).

Again, whole volumes might be devoted to rhythmic organisation but as there is already an extensive literature on this, I will confine myself to just a few observations.

Stress patterns may be established, even cyclical stress patterns, independent of duration organisation, or on a regular duration pattern that is subsequently permuted or time-warped in a continuously varying manner, a feature over which we can have complete control in sound composition. Hence, stress duration patterns at different tempi, or at different varying tempi, can be organised in such a way that they synchronise at exactly defined times, these times perhaps themselves governed by larger time-frame stress patterns. (See Chapter 9). (Sound example 7.19).

By subtle continuous control of relative loudness or onset characteristics, we can change the perception of the stress pattern. It might, for example, be possible to create superimposed grouping perceptions on the same stream of semiquaver units, e.g. grouped in 5 and in 7 and in 4, and to change their relative perceptual prominence by subtly altering loudness balance or onset characteristics, or in fact to destroy grouping perception by a gradual randomisation of stress information. Here field variation is altering order perception. (Sound example 7.20).
Alternatively, Clarence Barlow has defined an "indispensability factor" for each event in a rhythmic grouping, e.g. a 6/8 bar. This defines how important the presence or absence of an event is to our perception of that grouping. In 6/8 for example, the first note of the 6 is most important, the 4th note (which begins the 2nd set of 3 notes) the next, and so on. In 3/4 over the same 6 quavers, the indispensability factors will be arranged in a different order, stressing the division of the 6 into 3 groups of 2. These factors are then used to define the probability that a note will occur in any rhythmic sequence and by varying these probabilities, we can e.g. vary our perception of 6/8-ness versus 3/4-ness. Also, by making all notes equally probable, all stress regularity is lost and the sequence becomes arhythmic above the quaver pulse level. (Diagram 2). (Sound example 7.21).

[Diagram 2: notated rhythmic examples labelled "Most probably 3/4", "Most probably 6/8", "Ambiguous 3/4 or 6/8" and "Undecidable".]
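A toy sketch of this probabilistic use of indispensability (the rankings below are illustrative placeholders standing in for Barlow's actual indispensability values; they simply rank the first and fourth quavers highest for 6/8, and the odd-numbered quavers highest for 3/4):

    import numpy as np

    # illustrative rankings over six quavers (higher = more indispensable)
    RANK_6_8 = np.array([6, 1, 2, 5, 3, 4])   # stresses quavers 1 and 4
    RANK_3_4 = np.array([6, 1, 5, 2, 4, 3])   # stresses quavers 1, 3 and 5

    def bar(rank, density, rng):
        # Sound a note on each quaver with a probability that rises with
        # its rank; 'density' scales the average number of notes per bar.
        probs = np.clip(density * rank / rank.sum(), 0.0, 1.0)
        return rng.random(len(rank)) < probs

    rng = np.random.default_rng(1)
    # Interpolating between the two rankings shifts perception from
    # 6/8-ness towards 3/4-ness; making all ranks equal removes all
    # stress regularity, as described above.
    mix = 0.7 * RANK_6_8 + 0.3 * RANK_3_4
    bars = [bar(mix, density=3.0, rng=rng) for _ in range(4)]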
ORDERING ORDER
We can also work with order-groupings of order-groupings, e.g. with Hpitches, altering the time-sequencing of motivic-groups-taken-as-wholes. There is an extensive musical literature discussing order-regrouping, especially of Hpitches, and all such considerations can be applied to any subset of the properties of sounds, or to sounds as wholes.
Here we are beginning to stray beyond the frame of reference of this chapter, because reordering principles can be applied on larger and larger time-frames and are the substance of traditional musical "form". From a perceptual point of view, I would argue that the larger the time-frame (phrase-frame and beyond), the more perception is dependent on long term memory, and the less easy pattern-retention and recall becomes. Hence larger scale order relations tend to become simpler, in traditional practice, as we deal with larger and larger time-frames. This is not to say that, in traditional practice, the "repeated" objects are not internally varied in more complicated ways at smaller time-frames, but as large time-blocks qua large time-blocks, the permutation structure tends to be simpler. (Such matters are discussed in greater detail in Chapter 9).
The computer, having no ear or human musical judgement, can manipulate order sequences of any length
and of any unit-size in an entirely equivalent manner. As composers, however, we must be clear about
the relationships of such operations to our time experience. Equivalence in the numerical domain of the
computer (or for that matter on the spatial surface of a musical score) is not the same as experiential
equivalence. Audible design requires musical judgement and cannot be "justified" on the basis of the
computer-logic of a generating process.
In fact, as the elements of our sequence become larger, we pass over from one time-frame to another. Thus the temporal extension of a sequence (by unit respacing, time-stretching etc.) is a way of passing from one perceptual time-frame to another, just as time-shrinking will ultimately contract the sequence into an essentially continuous perceptual event. With time-expansion, the perceptual boundaries are less clear, but no less important.
Permutational logic is heady stuff and relationships of relationships can breed in an explosive fashion,
like Fibonacci's rabbits. Permuting and otherwise reordering sequences in ever more complicated ways
is something that computers do with consummate ease. A powerful order-manipulation instrument can
be written in a few lines of computer code, whereas what might appear to be a simple
spectral-processing procedure might run to several pages. The questions must always be, does the
listener hear those reorderings and if so, are they heard directly, or as field constraints on the total
experience? If the latter, need they be quite so involved as I suppose, i.e. is there a simpler, and hence
more elegant, way to achieve an equivalent aural experience? A more difficult question, as with any
musical process is, does it matter to the listener that they hear this reordering process? Or, what does
this reordering "mean" in the context of an entire composition? Is it another way of looking down a
kaleidoscope of possibilities, or a special moment in a time-sequence of events taking on a particular
significance in a musical-unfolding because of its particular nature and placement with respect to related events? These are, of course, aesthetic questions going beyond the mere use, or otherwise, of a
compositional procedure.
CHAPTER 8
TEXTURE
WHAT IS TEXTURE?
So far in this book we have looked at the intrinsic or internal properties of sounds. However in this chapter we wish to consider properties of dense agglomerations of the same or similar (e.g. tape-speed variation transposed) versions of the same sound, or set of sounds. This is what I will call texture.
Initially we will assume that what I mean by texture is obvious. In the next chapter we will analyse in
more detail the boundary between textural and measured perception, particularly in relation to the
distribution of events in time.
In the first two sound examples, we give what we hope will be (at this stage) indisputable examples of
measuredly perceived sound and texturally perceived sound. In the first we are aware of specific order
relations amongst Hpitches and the relative onset time of events and can, so to speak, measure them,
or assign them specific values (this will be explained more fully later). In the latter we hear a welter of
events in which we are unaware of any temporal order relations. However, in the latter case, we are
aware that the sound experience has some definable persisting properties - the Hpitches define a
persisting reference set (a HArmonic field). (Sound example 8.1).
We can lose the sense of sequential ordering of a succession of sounds in two ways. Firstly, the
elements may be relatively disordered (random) in some property (e.g. Hpitch). Secondly, the elements
may succeed each other so quickly that we can no longer grasp their order.
There are immediate perceptual problems with the notion of disorder. We can generate an order-free
sequence in many ways, but it is possible to pick up on a local ordering in a totally disordered sequence.
Thus a random sequence of zeros and ones may contain the sequence 11001100 which we may
momentarily perceive as ordered, even if the pattern fails to persist. Such focusing on transient
orderliness may be a feature of human perception as we tend to be inveterate pattern-searchers. So a
disorderly sequence, of itself, need not lead to textural perception. (Sound example 8.2).
By the same token, if a sequence, no matter how rapid, is repeated (over and over), the sequence shape
will be imprinted in our memory. This 'looping effect' can thus contradict the effect of event-rate on our
perception. (Sound example 8.3).
Textural perception therefore only takes over unequivocally when the succession of events is both
random and dense, so we no longer have any perceptual bearings for assigning sequential properties to
the sound stream. (Sound example 8.4).
A disordered sequence of Hpitches in a temporally dense succession is a fairly straightforward
conception. However, we can also apply the notion of texture to temporal perception itself. This
involves more detailed arguments about the nature of perception and we will leave this to the next
chapter.
So, broadly speaking, texture is a sequence in which no order is perceived, whether or not order is
intended or definable in any mathematical or notatable way. Texture differs from Continuum in that we
retain a sense that the sound event is composed of many discrete events. Pure textural perception
takes over from measured perception when we are aware only of persisting field properties of a musical
stream and completely unaware of any ordering properties for the parameters in question.
In some sense, texture is an equivalent of noise in the spectral domain, where the spectrum is changing
so rapidly and undirectedly that we do not latch onto a particular spectral reference frame and we hear
an average concept, "noise". But, like noise, texture comes in many forms, has rich properties and also
vague boundaries where it touches on more stable domains.
GENERATING TEXTURE STREAMS
The most direct way to make a texture-stream is through a process that mixes the constituents in a
way given by higher order (density, random scatter, element and property choice) variables. There are
many ways to do this and we will describe just three.
We may use untransposed (or transposed) source sounds, specifying their timing and loudness through a mixing score
(or a graphic mixing environment, though in high density cases this can be less convenient) and use
various mix shuffling meta-instruments to control timing, loudness and sound-source order.
Alternatively textural elements may be submitted as 'samples' (on a 'sampler') or, equivalently, as
sampled sound in a look-up table for submission to a table-reading instrument like CSound. Textures
can then be generated from a MIDI keyboard (or other MIDI interface), using MIDI data for
note-onset, note-off, key velocity, and key-choice to control timing, transposition and loudness (or, in
fact, any texture parameters we wish to assign to the MIDI output) - or from a CSound (textfile) score.
The keyboard approach is intuitively direct but difficult to control with subtlety when high densities are
involved. The CSound score approach requires typing impossible amounts of data to a text file, but
this can be overcome by using a meta-instrument which generates the CSound score (and 'orchestra')
from higher level data.
Via such texture control or texture generation procedures (see Appendix pp68-69), we can generate a
texture for a specified duration from any number of sound-sources, giving information on
temporal-density of events, degree of randomisation of onset-time, type (or absence) of event-onset
quantisation (the smallest unit of the time-grid on which events must be located), pitch-range (defined
over the continuum or a prespecified, possibly time-varying, HArmonic field), range of loudness from
which each event gets a specific loudness, individual event duration, and the spatial location and spatial
spread of the resulting texture-stream. In addition all of these parameters may vary (independently)
through time (see Appendix pp68-69). (Sound example 8.5).
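Purely as an illustration of such a texture-generation procedure (the function name, parameter names and default values below are arbitrary, not those of the instruments described in the Appendix), a minimal sketch in Python might scatter event onsets at a given average density, optionally quantise them onto a time-grid, and draw each event's pitch and loudness from a range or from an HArmonic field:

    import random

    def texture_stream(duration, density, scatter, quantise=None,
                       hfield=None, pitch_range=(48, 72), loud_range=(0.2, 1.0)):
        """Generate a list of (onset, pitch, loudness) events for a texture-stream.

        density  : average events per second
        scatter  : random displacement (seconds) applied to each regular onset
        quantise : if given, the time-grid (seconds) onto which onsets are snapped
        hfield   : if given, a list of MIDI pitches (an HArmonic field) to draw from;
                   otherwise pitches are drawn from the continuum within pitch_range
        """
        events = []
        step = 1.0 / density                          # average onset separation
        t = 0.0
        while t < duration:
            onset = t + random.uniform(-scatter, scatter)
            if quantise:                              # snap onto the time-grid
                onset = round(onset / quantise) * quantise
            if hfield:
                pitch = random.choice(hfield)         # field-constrained Hpitch
            else:
                pitch = random.uniform(*pitch_range)  # pitch continuum
            loud = random.uniform(*loud_range)
            events.append((max(onset, 0.0), pitch, loud))
            t += step
        return sorted(events)

    # e.g. 10 seconds of fairly dense texture over a whole-tone field
    field = [60, 62, 64, 66, 68, 70]
    for onset, pitch, loud in texture_stream(10.0, density=12, scatter=0.08,
                                             hfield=field)[:5]:
        print(f"{onset:6.3f}s  pitch {pitch}  loudness {loud:.2f}")

In practice each of these arguments could itself be made a function of time, as described above.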
Thirdly, any shortish sound may be used as the basis for a texture-stream, if used as an element in
granular synthesis where the onset time distribution is randomised and the density is high, but not
extremely high. The properties of the sound (loudness trajectory, spectral brightness etc) may be varied
from unit to unit to give a diversity of texture-stream elements. (Sound example 8.6).
Alternatively we may begin with individually longer sound sources. Grain streams and sequences may
begin to take on texture-stream characteristics by becoming arhythmic (through the scattering of
onset-time away from any time reference-frame: see later in this Chapter) and by becoming
sequentially disordered. If we now begin to superimpose such streams, even with a few superimpositions we
will certainly generate a texture stream. (Sound example 8.7).
Continuous sounds which are evolving through time (pitch-glide, spectral change etc) may also change
into texture-streams. Applying granular reconstruction (see Appendix p73) to such a continuous sound
with a grain-size that is large enough to retain perceptible internal structure, and provided both that the
density is not extremely high (when the sound becomes continuous once again) and the grain time
distribution is randomised, can produce a texture stream. (Sound example 8.8).
Spatialisation can be a factor in the emergence of a texture-stream. If a rapid sequence has its
elements distributed alternately between extreme left and right positions, the percept will most
probably split into two separate sequences, spatially distinct and with lower event rates. However, if
the individual events are scattered randomly over the stereo space, we are more likely to create a
stereo texture-stream concept as the sense of sequential continuity is destroyed, particularly if the
distribution is randomised. A superimposition of two such scattered sequences will almost
certainly merge into a texture-stream. (Sound example 8.9).
And in fact any dense and complex sequence of, possibly layered, musical events, originally having
sequential, contrapuntal and other logics, can become a texture-stream if the complications of these
procedures and temporal density are pushed beyond certain perceptual limits.
FIELD
The two fundamental properties of texture are Field and Density.
Field refers to a grouping of different values which persists through time. Thus a texture may be
perceived to be taking place over a whole-tone scale, even though we do not retain the exact sequence
of Hpitches. In this case, we retain a HArmonic percept. Similarly, we may be able to distinguish
French from Portuguese, or Chinese from Japanese, even if we do not speak these languages and hence
do not latch onto significant sequences in the speech stream. This is possible because the
vowel-formants, consonant-types, syllabic-combinations or even the pitch-articulation types for one
language form a field of properties which are different from those of another language.
Even aspects of the time organisation may create a field percept. Thus, if the time-placement of events
is disordered so that we perceive only an indivisible agitation over a rhythmic stasis, but this placement
is quantised over a very rapid pulse (e.g. events only occur on time-divisions at multiples of 1/30th of a
second), we may still be aware of this regular time-grain to the texture. This is a field percept. A more
complete discussion of temporal perception can be found in the next chapter. (Sound example 8.10).
Field properties themselves may also be evolving through time, in a direct or oscillating manner. One
method of creating sound interpolation (see Chapter 12) is through the use of a texture whose
elements are gradually changing in type. In Sound example 8.11 we hear a texture of vocal utterances
which becomes very dense and rises in pitch. As it does so, its elements are spectrally-traced (see
Appendix p25) so that the texture-stream becomes more pitched in spectral quality.
There are four fundamental ways to change a field property of a texture-stream. (see Diagram 1).
(a) We may gradually move its value.
(b) We may spread its value to create a range of values.
(c) We may gradually change the range of values.
(d) We may shrink the range to a single value.
For example, we may reduce the event duration so that the texture might become grittier (this also
depends on the onset characteristics of the grains). We might begin with a fixed spectral type defined
mainly by enveloping (see Chapter 9) or by source class (e.g. twig snaps, pebbles on tiles, bricks into
mud, water drips etc.) then spectrally interpolate (see Chapter 12) over time from one to another, or
gradually expand the spectral range from just one of these to include all the others. We may begin with
pitched spectra and change these gradually to inharmonic spectra of a particular type, or to inharmonic
spectra of a range of types, or noisy spectra etc. We may begin with a fixed pitch and gradually spread
events over a slowly widening pitch-band (wedging: Appendix p69).
If our property has a reference-frame (see Chapter 1), e.g. an Hpitch set for pitches, these field
properties of the texture may gradually change in a number of distinct ways.
In terms of an Hpitch field ...
(a) We may change the Hpitch field by deletion, addition or substitution of field components. Addition
and substitution imply a larger inclusive Hpitch field and the process is equivalent to a HArmonic
change over a scale system (e.g. the tempered scale) in homophonic tonal music. (see Diagram
2a).
(b) We may change the Hpitch field by gradually retuning the individual Hpitch values towards new
values. This kind of gradual tuning shift can destroy the initial Hpitch concept and reveals the
existence of the pitch continuum. (see Diagram 2b).
(c) We may change the Hpitch field by spreading the choice of possible values of the original Hpitches.
Each Hpitch can now be chosen from an increasing range about a median value. Initially we
will retain the sense of the original Hpitch field, but eventually, as the range broadens, this will
dissolve into the pitch continuum. (see Diagram 2c).
(d) We may gradually destroy the Hpitch field by gliding the pitches. Once the gliding is large, and not
start- or end-focused (see Chapter 2), we lose the original Hpitch percept. (See
Diagram 2d).
Such reference-frame variations can be applied to other properties which are perceived against a
reference frame. (Sound example 8.12).
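As a sketch only of transformations (b) and (c) above - assuming Hpitches are held as (possibly fractional) MIDI note numbers, and with arbitrary interpolation and spread values - one might write:

    import random

    def retune_field(old_field, new_field, fraction):
        """(b) Gradually retune each Hpitch towards its counterpart in a new field.
        fraction = 0 gives the old field, 1 the new field; in between lies the
        pitch continuum revealed by the shifting tuning."""
        return [old + fraction * (new - old)
                for old, new in zip(old_field, new_field)]

    def spread_field(field, semitone_spread):
        """(c) Choose each Hpitch from a widening band about its original value.
        As semitone_spread grows, the field dissolves into the continuum."""
        return [p + random.uniform(-semitone_spread, semitone_spread) for p in field]

    old = [60, 64, 67, 71]          # original Hpitch field
    new = [59, 63, 66, 70]          # target field
    print(retune_field(old, new, 0.25))   # a quarter of the way to the new tuning
    print(spread_field(old, 0.75))        # each pitch scattered within +/- 0.75 semitones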
DIAGRAM 1: four ways of changing a field property (a, b, c, d).
DIAGRAM 2: the original Hpitch field transformed by (a) DELETION, (b) GRADUALLY RETUNE PITCHES towards new-field values, (c) GRADUALLY SPREAD RANGE from which each pitch is selected, (d) GRADUALLY INTRODUCE PITCH-GLIDING and increase its range.
We cannot hope to describe all possible controllable field parameters of all texture-streams because,
as the elements of the stream are perceptible, we would need to describe all possible variations of each
constituent, all combinations of these properties, all changes of the individual properties and all changes
in combinations of these properties. We will therefore offer a few other suggestions ...
(1) Variation of sound-type of the constituents (harmonicity-inharmonicity; type of inharmonicity;
formant-type; noisiness-clarity; stability or motion of any of these); the range of sound-types;
the temporal variation of these.
(2) Variation of the individual spatial motion of elements; the temporal variation of this.
And, if the texture elements are groups of smaller events ...
(3) The group size and its variation: the range of group size and its variation.
(4) The internal pitch-range, spectral-range (various) of the groups.
(5) The internal group-speed, group-speed range and their slow or undulating variation.
(6) The internal spatialisation of groups (moving left, spatial oscillation, spatial randomness), range of
spatialisation types, and the time-variation of these.
(7) The variation of order-sequence or time-sequence of the groups.
All such features may be compositionally controlled independently of the stream density and the
onset-time randomness of the texture-stream.
DENSITY
The events in a texture-stream will also have a certain density of event-onset-separation which we
cannot measure within perception but which we can compare with alternative densities. Thus we will be
able to perceive increases and decreases in density, oscillations in density and abrupt changes in
density. We have this comparative perception of density changes so long as these changes are in a
time-frame sufficiently longer than that of density perception itself. Otherwise there is no way to
distinguish density from density fluctuation. (Sound example 8.13).
In fact event-onset-separation-density perception is like temperature measurement in a material
medium. Temperature is a function of the speed and randomness of motion of molecules. Hence the
temperature at a point, or of a single molecule, has no meaning. In the same way, density at the
time-scale of the event repetitions has no meaning. Temperature can only be measured over a certain
large collection of molecules - and density can only be measured over a group of event entries. Spatial
variation in temperature similarly can only be measured over an even larger mass - and temporal
variation in density similarly requires more events, more time, to be perceived than does density itself.
Event-onset-separation-density fluctuations themselves may be random in value but regular, or
semi-regular in time, providing a larger time-frame reference-frame. Furthermore, slow density
fluctuations will tend to be perceived as directed flows - as continuation properties - as compared with
rapid density fluctuations. In extremely dense textures, the density fluctuations themselves may
approach the size of large grains - we create granular or 'crackly' texture. (Sound example 8.14).
Increasing the density of a complex set of events can be used to average out its spectral properties and
make it more amenable to interpolation with other sounds. In fact, before the advent of computer
technology this was one of the few techniques available to achieve sound interpolation (see Chapter
12).
In Sound example 8.15 (from Red Bird: 1977, a pre-digital work), a noisy-whistled voice is
interpolated with a (modified) skylark song via this process of textural thickening. In Sound example
8.16 (from Vox 5: 1986), behind the stream of complex vocal multiphonics, we hear the sound of a
crowd emerging from a vocal-like noise band which is itself a very dense texture made from that crowd
sound.
When this process is taken to its density extreme we can produce white-out. When walking on snow
in blizzard conditions it is sometimes possible to become completely disoriented. If the snowfall is
sufficiently heavy, all sense of larger outlines in the landscape is lost in the welter of white snow
particles. Similarly, when a sound texture is made extremely dense, we lose perceptual track of the
individual elements and the sound becomes more and more like a continuum. In particular, where the
sound elements are of many spectral types including noise elements, the texture goes over into noise.
This is what we describe as white-out.
In Sound example 8.17, a dense texture of vocal sounds whites-out and the resulting noise-band is
then filtered to give it spectral pitches while it slides up and down. The pitchness is then gradually
removed from all bands except one, and the density decreases once more to reveal the original human
voice elements.
Finally we should note that changes in field and density properties might be coordinated between
parameters, or all varied independently, or even have nested 'contradictory' properties. Thus a texture
may consist of events whose onsets are entirely randomly distributed in time but whose
event-elements are clearly rhythmically ordered within them or, conversely, events may begin
on the Hpitches of a clearly defined HArmonic field while the event-elements have Hpitches distributed
randomly over the continuum. (Sound example 8.18).
CHAPTER 9
TIME
ABOUT TIME
In this chapter we will discuss various aspects of the organisation of time in sound composition.
Clearly, rhythm is a major aspect of such a discussion but we will not dwell on rhythmic organisation at
length because this subject is already dealt with in great detail in the existing musical literature. We
will be more concerned with the nature of rhythmic perception and its boundaries.
Our discussion will enable us to extend the notion of perceptual time-frames to durations beyond that
of the grain and upwards towards the time-scale of a whole piece.
WHERE DOES DENSITY PERCEPTION BEGIN?
In the previous chapter we discussed texture-streams which were temporally dense but which might
retain field properties in other dimensions (like Hpitch or formant-type). We must now admit that the
concept of density and density variation can be applied to any sound parameter. For example, if we
have pitch confined over a given range, a pitch density value would tell us how densely the events
covered the continuum of pitch values between the upper and lower limits of the range
(not time-wise but pitch-wise).
In this way we can begin to see that the concepts of Density and Field applied over the same
parameter come into conflict. Once the pitch-density (in this new sense) becomes very high, we lose
any sense of a specific Hpitch field or HArmonic reference frame, though we may continue to be aware
of the range limits of the field. (Sound example 9.1).
We must therefore ask what is the dividing line between field, or reference-frame, perception and
density perception in any one dimension. In this chapter we will confine ourselves to the dimension of
temporal organisation. Our conclusions may however be generalised to the field/density perceptual
break in any other dimension (e.g. Hpitch organisation).
Compositionally, we can create sequences of events that gradually lose a (measured) sense of rhythm.
Thus we may begin with a computer-quantised rhythmic sequence which has an "unnatural" or
"mechanical" precision. Adding a very small amount of random scatter to the time-position of the
events gives the rhythm a more "natural" or "human performed" feel. This is because very accurately
performed rhythmic music is not "accurate" in a precisely measured sense but contains subtle
fluctuations from "exactness" which we regard as important and often essential to a proper rhythmic
"feel".
Increasing the random scatter a little further we move into the area of loosely performed rhythm, or even
badly performed rhythm, and eventually the rhythm percept is lost. The time-sequence is arhythmic.
Once this point is reached we perceive the event succession as having a certain density of event onsets
- our perception has changed from grasping rhythmic ordering as such to grasping only density and
density fluctuations. (Sound example 9.2).
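The progression just described can be sketched directly, assuming event onsets are simple time points in seconds; the grid and scatter values below are arbitrary:

    import random

    def scatter_onsets(onsets, scatter):
        """Displace each event onset by a random amount within +/- scatter seconds."""
        return sorted(max(0.0, t + random.uniform(-scatter, scatter)) for t in onsets)

    grid = 0.25                                   # semiquaver grid at crotchet = 60
    quantised = [i * grid for i in range(16)]     # mechanically precise sequence

    for scatter in (0.005, 0.02, 0.08, 0.25):     # "human feel" -> loose -> arhythmic
        displaced = scatter_onsets(quantised, scatter)
        print(f"scatter {scatter:5.3f}s:", [f"{t:.3f}" for t in displaced[:6]], "...")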
In this sequence we probably retain the sense of an underlying measured percept (which is being
articulated by random scattering) a long way into the sequence. We are given a reference frame by the
initial strict presentation, which we carry with us into the later examples. If we were presented with
some of the later examples without hearing the reference frame, we would perhaps be more willing to
declare them arhythmic.
Taking the sequence in the opposite order, there may be a perceptual switching point at which we
suddenly become aware of rhythmic order in the sequence. (Sound example 9.3).
Let us now look at this situation from another viewpoint. Beginning again with our strictly rhythmic set
of events, we note that the event-onsets lie on (or very close to) a perceivable time grid or
reference-frame (the smallest common beat subdivision, which may also be thought of as a
time-quantisation grid). Allowing event-onsets to be displaced randomly by very small amounts from
this time reference-frame, we initially retain the percept of this reference frame and of a rhythm in the
event stream. Once these excursions are large, however, our perception of the frame, and in
consequence rhythmicity, breaks down.
It is informative to compare this with the analogous situation pertaining to an Hpitch reference frame.
Here we would begin with events confined to an Hpitch set (a HArmonic field), then gradually
randomise the tuning of notes from the Hpitch set, slowly destroying the Hpitch field characteristics,
even though we might retain the relative up-downness in the pitch sequencing.
From this comparison we can see that a durational reference-frame underlying rhythmic perception is
similar to a field, and rhythm is an ordering relation over such a reference frame. Dissolving rhythmicity
is hence analogous to dissolving the percept of Hpitch which also relates intrinsically to a reference set.
Strictly speaking, to provide a precise analogy with our use of HArmonic field, a duration field would be
the set of all event-onset-separation durations used in a rhythmic sequence. However, just as
underlying any HArmonic field we may be able to define a frame made up of the smallest common
intervallic unit (e.g. the semitone for scales played in the Western tempered scale, the srutis of the
Indian rag system), it is more useful to think of the smallest subdivision of all the duration values in our
rhythmic sequence, which we will refer to as the time-frame of the event. In an idealised form, this may
also be thought of as the time quantisation grid.
Such a time-frame, constructed from our perception of event-onset-separation duration, provides a
perceptual reference at a particular scale of temporal activity. As such it provides us with a way to
extend the notion of perceptual time-frames used previously to define sample-level, grain-level and
continuation-level perception (or lack of it) into longer swathes of time. Moreover, because such
time-frames may be nested (see below) we can in fact define a hierarchy of time-frames up to and
including the duration of an entire work.
Just as with the dissolution of Hpitch perception, it is the dissolution of the time-frame which leads us
from field-ordered (rhythmic) perception of temporal organisation to density perception. And just as
dissolving the Hpitch percept by the randomisation of tuning leaves us with many comparatively
perceived pitch properties to compose, dissolving the time-reference-frame leads us into the complex
domain of event-onset-separation-density-articulation discussed in the previous chapter.
MEASURED, COMPARATIVE & TEXTURAL PERCEPTION
To make these distinctions completely clear we must examine the nature of time-order perception and
define our terms more precisely.
In the ensuing discussion we will use the term 'duration' to mean event-onset-separation-duration,
to make things easier to read. Bear in mind however that duration, here, means only this, and not the
time-length of sounds themselves.
There are two dimensions to our task. Firstly, we must divide temporal perception into three types -
the measured, the comparative and the textural (the perception of temporal density and density
fluctuation).
Secondly, there is the question of time-frames. As will be discussed below, time-frames may be
nested within one another in the rhythmic or temporal organisation of a piece so that at one level (e.g. a
semiquaver time-frame) order-sequences are put in motion (rhythm) while simultaneously
establishing a time-frame at the larger metrical unit (e.g. the bar) in which longer-duration
order-sequences can be set up.
So we can ask, in what time-frames is our perception of a particular event measurable and in what
time-frames is it comparative or textural?
The first assertion we will make is that measured perception requires an established or only slowly
changing reference-frame against which we can "measure" where we are. In the pitch domain this
might be a tuning system, or a mode or scale, or, in Western tonal music, a subset of the scale defining
a particular key. This permits structural judgements like "this is an E-flat" or "we are in the key of C#
minor". Such a reference-frame might be established by cultural norms (e.g. tempered tuning) or
established within the context of a piece (e.g. the initial statement of a raga, the establishment of tempo
and metrical grouping in a particular piece). If we do not have a reference set, we are still able to make
comparative judgements. In the pitch domain for example, "we are now higher than before", or, "we are
moving downward", or, "we are hovering around the central value".
In the sphere of durations, if we establish a time reference-frame of, say, repeated quavers at [crotchet
= 120] we can recognise with a fair degree of precision the various multiples (dotted crotchet, minim,
etc) and integral divisions (semiquaver, demisemiquaver, triplet semiquaver) of this unit. (See
Diagram 1). Hence we can, in many cases, measure our perception against the reference frame and
recognise specific duration sequences. Our perception of duration is measured, at the time-frame of the
quaver.
Similarly in the sequence in Diagram 2 we will probably hear a clearcut sequence of duration values in
the perceptibly measured ratios 1:2:3:4:5. Our perception is again measured.
However, in the situation in Diagram 3 we are aware of the comparative quality of the successive
durations. We perceive them as getting relatively longer and we also have an overall percept of
"slowing down" but we do not (normally) have a clear percept of the exact proportions appertaining
between the successive events. Furthermore no two live performances of this event will preserve the
same set of measured proportions between the constituents. This is, then, an example of comparative
perception.
DIAGRAM 1: a reference-frame of repeated quavers at [crotchet = 120] with its multiples and integral divisions.
DIAGRAM 2: a sequence of duration values in the ratios 1:2:3:4:5.
DIAGRAM 3: molto rit. ... a tempo.
DIAGRAM 4: a dotted (3:1) rhythm over a crotchet reference-frame at [crotchet = 120].
A more interesting example is presented by the sequence in Diagram 4. If this rhythm occurs in a
context where there is a clear underlying semiquaver reference-frame, and the sequence is played
"precisely" as written, we will perceive the 3:1:3:1:3:1 etc. sequence of duration proportions clearly -
our perception will be measured. However, in many cases, this time pattern is encountered where the
main reference frame is the crotchet, and the pattern may be more loosely interpreted by the performer,
veering towards 2:1:2:1 etc. at the extreme. Here we are perceiving a regular alternation of short and
long durations which, however, are not necessarily perceived in some measurable proportions. The
score may give an illusory rigour to a 3:1 definition, but we are concerned here with the percept.
(Sound example 9.4).
The way in which such crotchet beats are divided is one aspect of a sense of "swing" in certain styles of
music. A particular drummer, for example, may have an almost completely regular long-short division
of the crotchet in the proportion 37:26. We may be aware of the regularity of his/her beat and
appreciative of the particular quality of this division, in the sense of the particular swing it
imparts to his/her playing, while remaining completely unaware of the exact numerical proportions
involved. Here then we have a comparative perception with fundamental qualitative consequences.
(Sound example 9.5).
It is important to note at this point that our perception of traditional fully-scored music is comparative
in many respects and in some respects textural, to the extent that e.g. the precise morphology of each
violin note cannot be specified in the notation and varies arbitrarily over a small range of possibilities,
as does the vibrato of opera singers. In sound composition, these factors can be precisely specified and,
if desired, raised to the level of comparative, or measured, perception.
More importantly, our rhythmic example using dotted rhythms illustrates the second aspect of our
discussion. For in this example, perception at the level of the crotchet remains measured - the music is
"in time". However, simultaneously, in a smaller time-frame, our perception has become comparative.
We can see the same division into time-frames if we look again at the idea of "indispensability factor"
proposed by Clarence Barlow (see Chapter 7). As discussed previously, we can define, over a
reference frame of quavers, the relative indispensability of each note in a 6/8 or a 3/4 (or a 7/8) pattern.
Linking the probability of occurrence of a note to its indispensability factor allows us to generate a
strong 6/8 grouping feel, or a strong 3/4 feel, or an ambiguous percept between the two. Once every
quaver becomes equally probable, however, the sense of grouping breaks down altogether and at the
level of 6-groupings (7-groupings, or any groupings) measured perception is lost. Our perception
reverts to the field characteristics. In this case, however, the field is defined by a smaller set of
durations, the quavers themselves. So at the smaller time-frame, we retain a sense of measured
regularity and hence our perception there is measured perception.
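The principle (though emphatically not Barlow's actual indispensability formula) can be illustrated with crude hand-chosen positional weights, sounding each quaver with a probability proportional to its weight:

    import random

    # Crude positional weights for the six quavers of a bar (not Barlow's formula,
    # merely an illustration of linking note probability to a positional weight).
    weights_68 = [6, 1, 2, 5, 1, 2]   # groupings of 3+3 quavers -> 6/8 feel
    weights_34 = [6, 1, 4, 1, 5, 1]   # groupings of 2+2+2 quavers -> 3/4 feel
    flat       = [1, 1, 1, 1, 1, 1]   # every quaver equally probable: grouping dissolves

    def bar(weights, note_density=0.6):
        """Sound each quaver ('x') with a probability proportional to its weight."""
        top = max(weights)
        return ['x' if random.random() < note_density * w / top else '.'
                for w in weights]

    for name, w in [("6/8 ", weights_68), ("3/4 ", weights_34), ("flat", flat)]:
        print(name, ' '.join(''.join(bar(w)) for _ in range(4)))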
SHORT DURATION REFERENCE-FRAMES: RHYTHMIC DISSOLUTION
There are, however, limits to this time-frame switch phenomenon. On the large scale, if the pulse cycle
is too long (e.g. 30 minutes), we will no longer perceive it as a time reference-frame (some
musicians will dispute this, see below). More significantly, if the pulse-frame becomes too small we
also lose a sense of measurability and hence of measured perception. We cannot give a precise figure
for this limit. We can perceive the regularity of grain down to the lower limit of grain perception but
comparative judgements of grain-durations, especially in more demanding proportions (5:7 as opposed
to 1:2), seem to break down well above this limit.
DIAGRAM 5: two duration streams in the proportion 2:3, sharing a crotchet pulse at the larger time-frame.
DIAGRAM 6: the same streams regrouped so as to contradict the mutual crotchet pulse.
DIAGRAM 7: a smaller common pulse recovered by subdividing the quavers of the two streams.
This problem becomes particularly important where different divisions of a pulse are superimposed. To
give an example, if we compose two duration streams in the proportion 2:3 (see Diagram 5) we have
a clear mutual time-frame pulse at the larger time-frame of the crotchet. (Sound example 9.6).
If we now regroup the elements in each stream in a way which contradicts this mutual pulse (see
Diagram 6) we may still be able to perceptually integrate the streams (perceive their exact
relationship) in a measured way over a smaller time-frame. By dividing the quavers of the slower
stream into 3 and the faster stream into 2, we may discover (i.e. perceive) a common pulse. (see
Diagram 7). (Sound example 9.7).
Even here this perception does not necessarily happen (certainly not for all listeners), particularly
where we are relying on the accuracy of performers.
The more irrational (in a mathematical sense) the tempo relationship between the two streams, the
shorter this smaller mutual time-frame pulse becomes. And the smaller this unit, the more demanding
we must be on performance accuracy for us to hear this smaller frame. With computer precision,
however, this underlying mutual pulse may continue to be apparent in situations where it would be lost
in live performance, as in Diagram 8. (Sound example 9.8).
Perceiving a specific 9 to 11 proportion as in the last example, where the smaller common time-frame
unit has a duration of c. 1/100th of a second, is simply impossible, even given computer precision in
performance. Measured perception on the smaller mutual time-frame has broken down, though we may
be comparatively aware (if the density of events-relative-to-the-stream-tempo can be compared in
the two streams) that we have 2 streams of similar but different event density.
With 3 superimposed tempi the problem is, of course, compounded, as the common pulse unit becomes
even shorter. We may, for example, when composing on paper, set up sequences of events in which
simultaneous divisions of the crotchet beat into (say) 7, 11 & 13 at [crotchet = 120] are used. (See
Diagram 9).
And we may always claim that we are setting up an exact percept created by the exact notational
device used. However, in this particular case we should ask, with what accuracy can this concept be
realised in practice?
No human performance is strictly rhythmically regular (see discussion of quantised rhythm above).
When three streams are laid together the deviations from regularity will not flow in parallel (they will
not synchronise like the parallel micro-articulations of the partials in a single sound-source: see
Chapter 2). We may describe the fluctuations of the performed lines from exact correlation with the
common underlying pulse by some scattering factor, a measure of how much an event is displaced from
its 'true' position. A factor 1 means it is displaced by a whole unit. In larger time-frame terms, a
quaver in a sequence of quavers would be inaccurately placed by up to a whole quaver's length. By
anyone's judgement this would have to be described as an inaccurate placement! (See Diagram 10).
In fact I would declare the situation in which events are randomly scattered within a range reaching to
half of the duration of the time-frame units as definitively destroying the percept of that time-frame. In
practice the time-frame percept probably breaks down with even more closely confined random
displacements. (Diagram 11).
DIAGRAM 8: two superimposed streams at [crotchet = 120] and the smaller common pulse they share.
DIAGRAM 9: simultaneous divisions of the crotchet beat at [crotchet = 120], with part of the pattern placed on the smaller common pulse.
DIAGRAM 10: a time-frame unit (tfu) and an event displaced from its 'true' position.
DIAGRAM 11: original regular pulse with its time-frame unit (tfu); scatter ranges of half a tfu; one possible result of maximal scattering.
DIAGRAM 12: sequence failure - the order of two closely spaced events reversed in performance.
In our notated example, the mutual time-frame duration lasts approximately 0.0005 seconds, or half a
millisecond, and hence there is no doubt that the live-performed events will be displaced by at least
half the time-frame unit (half of half a millisecond). In fact we can confidently declare that events will
be displaced by multiples of the time-frame unit, in all performances. The precision of the result is
simply impossible and an appeal to future idealised performances mere sophistry.
In fact we can be fairly certain that even the order of these events will not be preserved from
performance to performance. The 7th unit in a 13-in-the-time-of-1 grouping, and the 6th unit in an
11-in-the-time-of-one grouping, over the same crotchet, at [crotchet = 120] are only 1/143th of a
crotchet, or 1/286th of a second (less than 4 milliseconds) apart. It only needs one of these units to be
misplaced by 3 or more milliseconds in one direction, and the other by 3 or more milliseconds in the
opposite direction, for the order of the two events to be reversed in live performance! (See Diagram
12).
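The arithmetic can be checked directly, assuming a crotchet of 0.5 seconds at [crotchet = 120]:

    crotchet = 0.5                      # seconds, at crotchet = 120

    # smallest mutual pulse for simultaneous divisions of the crotchet into 7, 11 and 13
    mutual = crotchet / (7 * 11 * 13)   # = 0.5 / 1001 s
    print(f"mutual time-frame unit: {mutual*1000:.2f} ms")    # about 0.5 ms

    # 7th unit of the 13-grouping vs 6th unit of the 11-grouping, over the same crotchet
    offset = (6/13 - 5/11) * crotchet   # = crotchet / 143
    print(f"onset separation: {offset*1000:.2f} ms")          # about 3.5 ms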
Hence we are not, here, composing a sequence intrinsically tied to an "exact" notation. The exact
notation in fact specifies a class of sequences with similar properties within a fairly definable range. We
are in some senses specifying a density flow within the limitations of a certain range of random
fluctuations. We could describe this class of sequences by a time-varying density function with a
specified (possibly variable) randomisation of relative time-positions. In writing notes on paper, the
exact notation is more practicable so long as we acknowledge that it does not specify an exact result.
In the computer domain, the density approach may be more appropriate if we are really looking for the
same class of percepts as in the notated case.
In such complex cases, we may still retain a longer time-frame reference set (see Diagram 13).
In a cyclically repeated pattern of superimposed "irrationally" related groupings, even given the intrinsic
"inaccuracy" of live performance, we should be aware of the repetition of the bar length or
larger-time-frame unit (which we are dividing). We hear in a measured way at the larger time-frame
level, but we hear only comparatively at smaller time-frames. On the other hand, with
computer-precision in generating the sound sequence, we may be able to perceive a very precise
"density flux" in the combination of these streams, at least if they are repeated a sufficient number of
times. (Sound example 9.9).
Our comments about notated music are further reinforced if we now organise the material internally so
as to contradict any mutual reinforcement at larger pulse (e.g. bar) interfaces. In this way we can also
destroy measured perception at the larger level. Again, if we repeat a sequence of (say) 4 bars, we may
re-establish a measured perception of phrase regularity in a yet larger time-frame, e.g. the 4-bar
frame. (Diagram 14).
Eventually, however, either because we change bar-lengths in a non-repeating way, or because we
constantly undermine the bar-level mutual pattern reinforcement, larger time-frame reference-frames
will not be perceptually established. (Diagram 15).
We then pass over exclusively to comparative or to textural perception.
DIAGRAM 13: the larger time-frame and common pulse retained above superimposed streams.
DIAGRAM 14: a repeated sequence of bars re-establishing measured perception at a yet larger (e.g. 4-bar) time-frame.
LARGE-DURATION REFERENCE FRAMES - THE NOT SO GOLDEN SECTION
At the other extreme, music can be constructed as a set of nested time-frames in which at one level
(e.g. a semiquaver time-frame) order-sequences are put in motion (rhythm) while at the same time
establishing a time-frame (e.g. semibreve bars) in which longer duration order sequences can be set
up.
Within the time-frames from fast-demisemiquaver to 1-or-2-minutes, hierarchical systems of interlocked
levels functioning as order sequences in one direction and duration frames in the other can be
articulated. (Diagram 16).
There will, however, be a limit to our ability to perceive a large time-frame as precisely measurable and
hence comparable with another event of comparable time-frame proportions, i.e. a timescale limit to
measured perception. Though many composers writing musical scores would stake their reputations on
the idea that we can hear and respond to precise proportions in larger time-frames (especially those
derived from the Fibonacci series and associated Golden Section) there is little evidence to support
this. Such proportions certainly look nice in scores and provide convenient ways to subdivide larger
time-scales.
The exact proportions offered by the Fibonacci series have the particular advantage of being nestable at
longer and longer time-frames. (Diagram 17).
This makes them attractive for work based on integral time units, e.g. quavers at the fundamental
tempo of a piece. The question is, however, can we perceive these proportions in a measured sense, or
do we perceive them in a comparative sense (1 to 1/2ish-2/3ish) while the exact measure makes the
task of laying out the score possible. If we consider the golden section itself (the limit of the Fibonacci
series) this will nest perfectly with itself. (Diagram 18).
However, being an irrational quantity (in the mathematical sense - it cannot be expressed as the ratio
of any pair of whole numbers) its parts are not integrally divisible by any number of fixed pulses, no
matter how small. Hence it is intrinsically problematic to attempt to score in exact Golden Sections.
There is, of course, no such intrinsic problem to cutting tape-segments to such irrational proportions
and hence, hypothetically, a perfect Golden Section nesting might be achieved in a studio composition.
Whether we are measurably perceiving this handiwork is quite a different question.
On the relatively small time-frame of "swing" (see earlier discussion), subtle but comparative
perception of proportions may be very significant (just as there are very subtle parameters in the
spectral character of grain which we cannot "hear out" but which are fundamental to our qualitative
perception). However, on much larger time-frames, we run into the problem of both longer term time
memory and the influence of smaller scale event patterns on our sense of the passage of time (waiting
ten minutes for this train seemed an interminable time .... reading the newspaper, an hour has gone by
without my noticing).
Due to the structure of the Fibonacci series, almost any proportion between 1 to 1/2 & 1 to 2/3 can be
found within it and the nearer these proportions are to the Golden Section, the more likely they are to
occur in the series. The question is, do we believe that we can measurably perceive these proportions
(e.g. 34:21) or merely that we can comparatively perceive them all as lying within the range 1 to
1/2ish-2/3ish. More dramatically, because the Golden Section is an irrational quantity, then, by
DIAGRAM 16: nested time-frames in a passage by Mozart - crotchet pulse, bars, phrases, etc.
DIAGRAM 17: Fibonacci-series durations (34, 21, 13, 8 ...) nesting at longer and longer time-frames.
DIAGRAM 18: Golden Section nesting, with a/b = b/c = c/d = d/e = e/f = f/g = g/h ... etc.
definition, there is no smaller time-frame over which it can be measurably perceived. This is part of the
definition of a mathematically irrational quantity. Hence, by definition, we cannot measurably perceive a
Golden Section, though again we may comparatively perceive that it lies within the bounds of 1 to
1/2ish-2/3ish.
I am happy to accept that Fibonacci-derived ratios give a satisfactory comparative percept of 1 to
1/2ish-2/3ish (possibly even a slightly narrower range) but not that, at larger time-frames, there is any
measured perception involved. In large time-frames it is certainly not possible to distinguish a Golden
Section from a host of other roughly similar proportions lying within a range. The longer the time-frame
the bigger this range of inexactness becomes.
I am certain of two things. Firstly that when psycho-acousticians run detailed tests, even on
composers who regularly use these devices, they will discover that these relations on larger
time-frames are not heard exactly. Secondly, that when this is discovered, a large section of the
musical community will reject the evidence. Number mysticism has a long and distinguished tradition in
musical thought and the Golden Section is a well-defended tenet of a modern musical gnosticism.
WHEN IS DENSITY CONTROL APPROPRIATE ?
We must now ask the crucial question, when is composing with density parameters appropriate and
when, conversely, should we proceed in a deterministic way, i.e. specifying exact time-placement,
exact pitch-value, exact spectral contour etc.
If we demand computer precision, we will produce a precise, and precisely repeatable, result. The
question here is, how exact does it need to be, i.e. by how much can we alter the parameters before we
notice any difference in what we perceive. This judgement has to be set in context. Is the musical
organisation focusing our attention on these exact timing aspects of the total gestalt, e.g. through
cyclical repetition of the pattern in which timing parameters are slowly changing in a systematic
manner? Or is the time-sequence within this event merely one of several parameters the listener is
being asked to follow? Or is it an incidental result of other processes and not central to the way in which
we scan the musical events for a sense of mutual connectedness and contrast?
It is always possible to claim that everything is important. In particular, composers are especially
prone to claiming that anything is an important constituent of what the listener hears if it was an
important part of the method by which they arrived at the sounding result. However, this need not be
the case. I can devise a method for producing sound in which every wavecycle has at least one sample
of value 327. This, by itself, will provide no perceived coherence within the sound world I create. To
declare that it does is simply dishonest. The composer must thus be able to monitor what is and what
is not perceptibly related in a composition, independent of the knowledge he/she has of the generating
procedure.
Deterministic procedures in which parameters are systematically varied may be used to generate
ranges of sound materials. We need, however, a separate approach, based in perception, to define
whether the sound materials generated are related and in what ways they are related, if at all. The
generating procedure and its systematicness are not, on their own, a guarantee of perceptual
relatedness.
We may, in the same spirit, use texture generation procedures, also with systematic variation of
parameters, to generate materials for subsequent perceptual assessment.
Conversely, through experience, we may already know what kind of materials we want. If, for example,
we know that the result must be characterised by slowly density-fluctuating arhythmicity, it would be
more direct and simpler, when composing sound directly with a computer, to use a texture control
process, refining its parameters until we get the result we desire. Defining a deterministic process to
attain the same perceptual result will usually be much more difficult both to specify and to
systematically vary in some way coherent with perceptual variations.
We may also combine the two approaches by employing stochastic processes in which the probabilities
of proceeding from one set of states to the next are defined precisely, ensuring a high degree of control
at one time-frame over a process that evolves unpredictably at smaller time-frames. As the transition
probabilities can have any value up to unity (completely deterministic) we have here a way to pass
seamlessly between pure density perception and total rhythmic determinism (or between the equivalent
extremes for any other sonic parameter). Again we must be clear that whether the result is rhythmic or
arhythmic, density-perceived or field-perceived, is to be ascertained by listening to the results, not by
appealing to the method of generation.
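A sketch of such a stochastic process, in which the 'states' are simply onset-separation durations and the transition table is an arbitrary example (a row containing a single probability of 1 collapses into determinism):

    import random

    # Transition probabilities between onset-separation durations (in seconds).
    # Each row sums to 1; a row with a single probability of 1.0 is fully deterministic.
    durations = [0.125, 0.25, 0.5]
    transitions = {
        0.125: [0.6, 0.3, 0.1],
        0.25:  [0.3, 0.4, 0.3],
        0.5:   [0.1, 0.5, 0.4],
    }

    def duration_chain(start, n):
        """Generate n onset-separations by walking the transition table."""
        seq, current = [], start
        for _ in range(n):
            seq.append(current)
            current = random.choices(durations, weights=transitions[current])[0]
        return seq

    print(duration_chain(0.25, 16))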
Compositional virtue does not lie in the determinism (or even the describability) of the compositional
method, but in the control of the perceived results and their perceptual connectedness. This is
unfortunately not always the view handed down at academies of music composition.
CHAPTER 10
ENERGY
ENERGY TRAJECTORIES
Sounds in the real world which do not simply die away to nothing require some continuous energy input
such as bowing, blowing, scraping or an electrical power supply to maintain them. Usually the level of
the flow of energy into the system determines the loudness of the output and hence allows us to read
the fluctuating causality of the sound.
We may create the illusion of energy flow with a continuous, grain-stream, sequential or textural sound
by simply imposing a loudness trajectory, or envelope, on it (enveloping). (Sound example 10.1).
Such a trajectory may be derived directly from the activity of another sound (prerecorded or a live
performer) (envelope following). Enveloping is particularly useful for creating a loudness anacrusis (a
crescendoing onto a key sonic event), a particularly musical device as most natural sounds have
precisely the opposite loudness evolution (Sound example 10.2).
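A minimal sketch of enveloping, assuming the sound is held as an array of samples: a loudness trajectory given as a few breakpoints is interpolated to sample length and multiplied into the signal (the breakpoint values here are arbitrary):

    import numpy as np

    def envelope(breakpoints, n_samples, sr=44100):
        """Interpolate (time, level) breakpoints into a sample-by-sample envelope."""
        times, levels = zip(*breakpoints)
        t = np.arange(n_samples) / sr
        return np.interp(t, times, levels)

    def apply_envelope(signal, breakpoints, sr=44100):
        """Impose a loudness trajectory on an existing sound (enveloping)."""
        return signal * envelope(breakpoints, len(signal), sr)

    sr = 44100
    tone = 0.5 * np.sin(2 * np.pi * 440 * np.arange(sr) / sr)   # 1 second of tone

    # a loudness anacrusis: a long crescendo onto a peak, then a rapid decay
    anacrusis = [(0.0, 0.0), (0.8, 1.0), (1.0, 0.1)]
    shaped = apply_envelope(tone, anacrusis, sr)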
Conversely, we may begin with a sustained sound (continuum, grain-stream, sequence, texture) and
give it an exponentially decaying loudness, through enveloping. This suggests the sound originates
from some vibrating medium which has been struck (or otherwise set in motion) and then left to
resonate, especially if the onset of the sound is reinforced. With simple sustained sound this can
completely alter the apparent physicality and causality of the source (see below). With sounds which
more easily betray their origins (e.g. speech) we produce an interesting dual percept. (Sound example
10.3).
If a sound already has a noticeably varying loudness trajectory we may track this variation (envelope
following) and then perform various operations with or on this trajectory (envelope transformation). As
already suggested we might transfer the trajectory to an entirely different sound (enveloping or
envelope substitution). Sequences of events which are extremely strongly characterised by loudness
articulation may thus be 'recapitulated' with an entirely different sonic substance (Sound example
10.4).
We may also modify the loudness trajectory we have extracted (envelope transformation) and reapply
it to the original sound. Note that we must do this by envelope substitution and not simply by
enveloping (see Appendix pp59 & 61). Thus we may exaggerate the loudness trajectory to heighten
its energetic (or dramatic) evolution (expanding: see below).
The loudness trajectory may also be lengthened (envelope extension) or shortened (envelope
contraction), extending or contracting the associated gesture (see for example the discussion of
time-stretching textures in Chapter 11), and any of the trajectory (envelope) transformations described
here might be applied in a time-varying manner so that e.g. a sound may gradually become prominently
corrugated (see below).
FINE-GRAINED ENVELOPE FOLLOWING
The result of the process of envelope following will depend to a great extent on how the loudness
trajectory is analysed. To assess the loudness of a sound at a particular moment, we need to look at a
certain small time-snapshot of the sound (a time-frame defining a window size: see Appendix p58).
The instantaneous loudness of a sound has no meaning (see below). The size of this time-snapshot
will affect what exactly we are reading.
Thus if we wish to exaggerate the loudness trajectory of a rhetorical speech, we are interested in
detecting and exaggerating the variations in loudness from word to word or even from phrase to phrase.
We do not wish to exaggerate the loudness variation from syllable to syllable. Even less do we wish to
exaggerate the undulations of loudness within a single rolled 'r'. We want a "coarse" reading of the
loudness trajectory. We therefore choose a relatively large time-frame over which the loudness of the
signal is measured. (Alternatively we might detect the loudness level at the smallest meaningful
time-frame (around 0.005 seconds) and then average the result over an appropriate number of these
frames to get a coarser result).
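Under those assumptions, envelope following might be sketched as an RMS reading over successive windows, the window length deciding whether we track phrase-level or grain-level fluctuation (the window sizes below are illustrative only):

    import numpy as np

    def follow_envelope(signal, sr=44100, window_s=0.05):
        """Track a loudness trajectory as RMS values over successive windows.
        A large window gives a coarse, phrase-level reading; a window near
        0.005 s reads fluctuations down at the grain time-frame."""
        win = max(1, int(window_s * sr))
        n_frames = len(signal) // win
        frames = signal[:n_frames * win].reshape(n_frames, win)
        return np.sqrt(np.mean(frames ** 2, axis=1))

    sr = 44100
    # a crude speech-like test signal: noise with a slowly undulating loudness
    speech_like = np.random.randn(sr) * np.abs(np.sin(2 * np.pi * 3 *
                                                      np.arange(sr) / sr))
    coarse = follow_envelope(speech_like, sr, window_s=0.25)    # word/phrase level
    fine = follow_envelope(speech_like, sr, window_s=0.005)     # grain level
    print(len(coarse), len(fine))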
Conversely, we may wish to track loudness changes more precisely. If the loudness trajectory is tracked
using a fine time-frame (small window) over a granular sound, we will often be able to detect the
individual grains. Exaggerating the fluctuations in the trajectory can then be used to force momentary
silences between the grains (corrugation). This both alters the quality of the sound and makes the
grains more amenable to independent manipulation (Sound example 10.5).
Following or manipulating the loudness trajectory below the grain time-frame has little meaning. The
instantaneous value (pressure) of the sound wave changes from positive (above the norm) to negative
(below the norm) to create the experience we know as sound. The amplitude of the signal describes the
size of these swings in pressure, and, as we cannot know how large a swing will be until it reaches its
maximum excursion, we need to use a time-window which covers whole wavecycles when determining
the loudness trajectory of a sound.
Beneath the duration of the wavecycle the instantaneous variation of the pressure determines the
shape of the wave and hence the sound spectrum. Imposing loudness changes over one (or only a few)
wavecycles, or wavesets, in effect changes the local shape of the wave(s) and is hence a process of
spectral manipulation rather than of loudness control (see Appendix p53). But if we gradually expand
the number of wavecycles or wavesets we cause to fall under our loudness-changing envelope, we
pass from the spectral domain to the time-frame of grain-structure and eventually to that of
independent events. These transitions are nicely illustrated by waveset enveloping. In Sound example
10.6, the number of wavesets falling under a single loudness-changing envelope is progressively
increased on each repetition.
The foregoing discussion illustrates once again the importance of time-frames in the perception and
composition of sonic events.
GATES AND TRIGGERS
We may apply more radical modifications to the loudness trajectory of a sound. We may cause a signal
to cut out completely if it falls below a certain level. This procedure, known as gating, can be used for
elementary noise reduction, but is also often used in popular music to enhance the impact of percussive
sounds.
Detecting the appropriate level at which to apply the gate is known as threshold detection. We may also use
threshold detection as a means to trigger other events, either when the level of a signal exceeds a
threshold, or when it falls below it. In the former case we may heighten the attack characteristics of
particularly loud sounds by causing them to trigger a second sound to be mixed into the sonic stream.
Or we may trigger quite different sounds to appear briefly and at low levels, so that these new sounds
are partly masked by the original sound, making a kind of hidden appearance only (Sound example
10.7).
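A sketch of gating and threshold-triggering on a windowed loudness reading (window length and threshold are arbitrary): the signal is silenced wherever its tracked loudness falls below the threshold, and each upward crossing of the threshold is reported as a trigger point:

    import numpy as np

    def gate_and_trigger(signal, sr=44100, window_s=0.01, threshold=0.1):
        """Silence the signal below a loudness threshold (gating) and return the
        times at which the loudness rises through the threshold (triggers)."""
        win = max(1, int(window_s * sr))
        n = len(signal) // win
        frames = signal[:n * win].reshape(n, win)
        loudness = np.sqrt(np.mean(frames ** 2, axis=1))     # RMS per window

        keep = np.repeat(loudness >= threshold, win)         # gate mask, per sample
        gated = signal[:n * win] * keep

        crossings = np.flatnonzero((loudness[1:] >= threshold) &
                                   (loudness[:-1] < threshold))
        trigger_times = (crossings + 1) * window_s           # in seconds
        return gated, trigger_times

The returned trigger times could then be used to mix in a second sound, or to switch on a treatment of the source, as described above and below.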
Alternatively we might trigger a process affecting the source sound, such as reverberation or delay.
We could switch such processes on and off rapidly, or vary their values according to the loudness
trajectory of the source signal, e.g. quiet events might be made more strongly reverberant and louder
events drier. Combining this with triggered spatial position control we might articulate a whole
stereo-versus-reverb-depth space from the loudness trajectory of a single mono source.
In conventional recording studio practice two very commonly used gating/triggering procedures are
limiting and compressing. In a limiter, any sound above a certain threshold loudness is reduced in level
to that threshold value. This ensures that e.g. a concert can be recorded at a high level with less risk of
particularly loud peaks overloading the system. Compression works in a more sophisticated manner.
Above the threshold, the louder the sound, the more its loudness is reduced, producing a more subtle
containment of the signal. This process may be used more generally, setting the threshold level quite
low, to flatten or smooth the gestural trajectory of a sound, thereby e.g. "calming" a hyperactive
sequence of events.
An expander, in contrast, does the opposite, expanding the level of a sound more, the louder it is, hence
exaggerating the contrasts in loudness in a sound source and thereby perhaps exaggerating the
gestural energy of a sound event or sequence of events. (See Appendix p60).
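The gain laws involved can be sketched as follows (a hypothetical illustration, not the behaviour of any particular studio device); 'level' is a loudness estimate such as the block RMS used in the gating sketch above, and the thresholds and ratios are illustrative.

    import numpy as np

    def limiter_gain(level, threshold):
        # above the threshold, the output level is held at the threshold value
        level = np.maximum(level, 1e-12)
        return np.minimum(threshold / level, 1.0)

    def compressor_gain(level, threshold, ratio=4.0):
        # above the threshold, the louder the sound, the more it is reduced
        over = np.maximum(np.maximum(level, 1e-12) / threshold, 1.0)
        return over ** (1.0 / ratio - 1.0)

    def expander_gain(level, threshold, ratio=2.0):
        # above the threshold, louder sounds are boosted still further,
        # exaggerating the contrasts in loudness
        over = np.maximum(np.maximum(level, 1e-12) / threshold, 1.0)
        return over ** (ratio - 1.0)

Multiplying the signal, block by block, by the gain returned for its current level gives the limited, compressed or expanded result.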
BALANCE
In orchestral music, the balance of loudness between sound sources is achieved partly through the
combination of performance practice with the notation of dynamics in a score, and partly through the
medium of the conductor who must interpret the composer's instructions for the acoustic of the
particular performance space. Furthermore, the blending or contrast of sounds is aided or hindered by
the fact that all the sounds are generated in the same acoustic space (normally). In the studio we may
bring together sounds from entirely different acoustic spaces (a forest, a living room) and with quite
different proximity characteristics (close-miked, or miked at a distance such that room ambience is
significantly incorporated into the sound).
Moreover, we may combine such features in ways which contradict our experience of natural acoustic
environments. Thus proximately recorded loud sounds may be played back very quietly, while distant
quiet sounds may be projected very loudly. These features of the sound landscape are discussed
more fully in On Sonic Art.
In sound composition in general, the loudness balance between diverse sources is described in a
(possibly graphic) mixing score, or created in real-time at a mixing desk, perhaps with action
information recorded for subsequent exact reproduction or detailed modification.
This can give us very precise control over balance, even within the course of grain-size events. Balance
changes within the grain time-frame are in fact a means of generating new sonic substances
intermediate between the constituents (e.g. inbetweening, see Chapter 12). (Sound example 10.8).
Mixing may also be used to consciously mask the features of one sound by another or, more usually, to
consciously avoid this. The latter process is aided by distributing the different sounds over the stereo
space. (Sound example 10.9).
In certain cases we may wish to ensure the prominence of a particular sound or sounds without thereby
sacrificing the loudness level of other sources. In popular music the device of ducking is used to this
end. Here the general level of some of the other instruments is linked inversely to that of the voice.
Before the singer begins, the rest of the band is recorded as loudly as possible, but, once the voice
enters, some instruments 'duck' in level so as not to mask the voice. This ensures that the voice is
always clearly audible while at the same time a generally high recording level is maintained whether or
not the voice is present.
This process may be used more generally. We may mix any two sounds so that the loudness trajectory
of the second is the inverse of the first (envelope inversion). For example, a sound of rapidly varying
loudness (like speech) may be made to interrupt a sound of more constant energy (a large crowd talking
amongst itself, heavy traffic) by imposing the inverted loudness trajectory of the speech on the traffic
noise (Sound example 10.10). This procedure will ensure that the traffic will not mask the voice, yet
will remain prominent, perhaps alternating in perceptual importance with the voice (depending on how
the relative maximum levels are set and on the intrinsic interest of the traffic sounds themselves!).
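A hedged sketch of envelope inversion, assuming two mono NumPy arrays at the same sample rate: the loudness trajectory of the speech is extracted, inverted, and imposed on the traffic, so the traffic ducks whenever the speech is loud. The 'floor' value, which sets how far the traffic ducks, is illustrative.

    import numpy as np

    def envelope(signal, block=1024):
        """Block-wise RMS envelope, interpolated back to one value per sample."""
        rms = [np.sqrt(np.mean(signal[i:i + block] ** 2))
               for i in range(0, len(signal), block)]
        positions = np.arange(len(rms)) * block
        return np.interp(np.arange(len(signal)), positions, rms)

    def envelope_invert(traffic, speech, floor=0.1):
        n = min(len(traffic), len(speech))
        env = envelope(speech[:n])
        env = env / (env.max() + 1e-12)          # normalise to the range 0..1
        inverted = 1.0 - (1.0 - floor) * env     # loud speech -> quiet traffic
        return traffic[:n] * inverted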
PROXIMITY
Loudness information in the real world usually provides us with information about the proximity of a
sound source. The same sound heard more quietly is usually further away. It is important to
understand that other factors feature in proximity perception, especially the presence or absence of high
frequencies in the spectrum (such high frequencies tend to be lost as sound travels over greater
distances). There are of course other aspects of the sound environment affecting proximity perception.
The presence or absence of barriers (walls, doors to rooms or entrances to other containers) and the
nature of reflective surfaces (hard stone, soft furnishings) contribute to a sense of ambient
reverberation. Even air temperature is important (on cold clear nights, sound waves are refracted
downwards and hence sounds travel further).
Using a combination of loudness reduction and low-pass filtering (to eliminate higher frequencies) we
may make loud and closely recorded sounds appear distant, especially when contrasted with the
original source. But we may also produce self-contradictory images e.g. closely miked whispering,
which we associate with close-to-the-ear intimacy and hence 'quietness', may be projected very
loudly, while bellowed commands may be given the acoustic of a small wooden box - and these may
both be projected in the same space.
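The combination described here can be sketched very simply: attenuate the signal and pass it through a one-pole low-pass filter so that high frequencies are progressively lost, as they would be over distance. The coefficient values are illustrative only, and the signal is assumed to be a floating-point NumPy array.

    import numpy as np

    def make_distant(signal, attenuation=0.25, smoothing=0.9):
        out = np.zeros(len(signal))
        state = 0.0
        for i, x in enumerate(signal):
            # one-pole low-pass: each output sample is a weighted average of
            # the previous output and the current input
            state = smoothing * state + (1.0 - smoothing) * x
            out[i] = attenuation * state
        return out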
Proximity may also be used dynamically, like a continuous zoom in cinematography. A microphone may
be physically moved towards or away from a sound source during the course of a recording. The effect is
particularly noticeable with complex spectra with lots of high frequency components (bells and other
ringing metal sounds) where the approach of the microphone to the vibrating object picks out more and
more high frequency detail. This aural zoom is hence a combination of loudness and spectral variation. It
is important to understand that this effect takes place over a very small physical range (a few
centimetres) in contrast to the typical range of the visual zoom. (Sound example 10.11).
PHYSICALITY AND CAUSALITY
As we have already suggested, loudness trajectory plays an important part in our attribution of both the
physicality of a source (rigid, soft, loose aggregate etc.) and the causality of the excitation (struck,
stroked etc). The loudness trajectory of the sound onset is particularly significant in this respect, and
often simple onset-trajectory manipulation is sufficient to radically alter the perceived
physicality/causality of the source. Thus almost any sound can be given a struck-like attack by
providing a sudden onset and perhaps an exponential decay. Conversely, a strongly percussive sound
may be softened by very carefully 'shaving off' the attack to give a slightly more slowly rising
trajectory. (Sound example 10.12).
This area is discussed in more detail in Chapter 4.
CHAPTER 11
TIME STRETCHING
TIME-STRETCHING & TIME-FRAMES
Time-stretching and time-shrinking warrant a separate chapter in this book because they are
procedures which may breach the boundaries between perceptual time-frames. The importance of
time-frames in our perception of sounds is discussed in detail in the section "Time-frames: samples,
wavecycles, grains & continuation" in Chapter 1.
Furthermore, the degree of time-stretching of a sound may itself vary through time, and with the
precision of the computer such time-varying time-stretching (time-warping) may be applied with
great accuracy.
There are several different approaches to time-stretching and the approach we choose will depend both
on the nature of the sound source and the perceptual result we desire. Below we will look at
tape-speed variation, brassage techniques, waveset repetition (waveset time-stretching), frequency
domain time-stretching (spectral time-stretching), and grain separation or grain duplication (granular
time-stretching). We will then discuss various general aesthetic and perceptual issues relating to
time-stretching before dealing with the most complex situation, the time-stretching of texture-streams.
"TAPE-SPEED" VARIATION
In the classical tape-music studio, the only generally available way to time-stretch a sound was to
change the speed at which the tape passed over the heads. The digital equivalent of this is to change
the sampling frequency (in fact, interpolating new sample values amongst the existing ones, but storing
the result at the standard sampling rate: see Appendix p37). This approach is used in sampling
keyboards (1993) and table-reading instruments e.g. in CSound.
In both cases the procedure (tape-speed variation) not only changes the sound duration but also the
pitch because it alters the wavelengths (and therefore the frequency: see Appendix p4) in the
time-domain signal. Similarly, it changes the frequency of the partials and hence also shifts the spectral
contour, and hence the formants. And it time-stretches the onset characteristics, probably radically
changing the sound percept in another way. (Appendix p36). (Sound example 11.1).
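A minimal sketch of the digital equivalent, assuming a mono NumPy array: new sample values are interpolated amongst the existing ones by reading through the source at a different rate, and the result is kept at the original sampling rate, so duration, pitch, spectral contour and onset are all scaled together.

    import numpy as np

    def tape_speed(signal, speed):
        # speed > 1 shortens and raises the sound; speed < 1 lengthens and lowers it
        read_positions = np.arange(0, len(signal) - 1, speed)
        return np.interp(read_positions, np.arange(len(signal)), signal)

    # e.g. an octave downwards: twice as long, all frequencies (and formants) halved
    # slowed = tape_speed(source, 0.5)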
Although time-warping, pitch-warping and formant-warping are thus not independent, this approach
has its musical applications. In particular (multi-) octave (etc) upward transpositions can be used as
short time-frame reinforcements of a sound's onset characteristics (see Chapter 4). Moreover,
downward transposition by one or two octaves not only reveals the details of a sound's evolving
morphology in a slower time-frame, making important details graspable; the transposition process
itself often brings complex, very high frequency spectral information into a more perceptually accessible
mid-frequency range (our hearing is most sensitive in this range). The internal qualities of a complex
sound may thus be magnified in two complementary domains. Digital recording and transposition is
also much more robust than the old analogue recording tape-speed variation, capturing important
extremely high frequency information for re-listening in a moderate frequency range and preserving
frequency information transposed down to very low frequencies. (Sound example 11.2).
Tape-speed variation may be applied in a time-varying manner e.g. causing a sound to plunge into the
lowest pitch range, hence bringing very high frequency detail into the most sensitive hearing range, at
the same time as magnifying the time-frame. Conversely, a sound may accelerate rapidly, rising
pitch-wise into the stratosphere (using e.g. tape acceleration). As the sound rises, internal detail is
lost. With sufficient acceleration almost any sound can be converted into a structureless rising
pitch-portamento. This is an elementary way to perceptually link the most diverse sound materials, if
they occur in long enough streams for such acceleration to be possible, or a way of creating musical
continuity between a complex stream of diverse events and event structures focused on pitch
portamenti. (Sound example 11.3).
BRASSAGE TECHNIQUES
As discussed previously, brassage involves cutting a sound into successive, and possibly overlapping,
segments and then reassembling these by resplicing them together (for pure time-stretching, in exactly
the same order) but differently spaced in time (see Appendix pp44-45). It can always be arranged, by
appropriate choice of segment length and segment overlap, for the resulting sound to be continuous, if
the source sound is also continuous. Provided the cut segments are of short grain duration (i.e. with
perceptible pitch and spectral properties but no pitch or spectral evolution over time) then the goal
sound will appear time-stretched relative to the source.
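A hedged sketch of this procedure, assuming a mono NumPy array and illustrative segment and overlap sizes: grain-sized segments are cut in order from the source, but the read position advances more slowly than the write position, and the enveloped segments are respliced by overlap-adding, so a continuous source gives a continuous, time-stretched result.

    import numpy as np

    def brassage_stretch(signal, stretch, segment=2048, overlap=512):
        hop_out = segment - overlap
        hop_in = hop_out / stretch                 # advance more slowly through the source
        out = np.zeros(int(len(signal) * stretch) + segment)
        envelope = np.hanning(segment)             # smooth splice at each segment boundary
        read, write = 0.0, 0
        while int(read) + segment <= len(signal) and write + segment <= len(out):
            grain = signal[int(read):int(read) + segment] * envelope
            out[write:write + segment] += grain    # resplice, differently spaced in time
            read += hop_in
            write += hop_out
        return out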
Good algorithms for doing this are currently (1994) embedded in commercially available hardware
devices (often known as harmonisers) and function reasonably well, often in a continuously
time-variable fashion over a range from half to two-times time-stretch. At the limits of this range and
beyond we are beginning to hear spectral and other artefacts of the process. These may, however, be
useful as sound-transformation techniques. (Sound example 11.4).
Particularly in long time-stretches, brassage may lead to...
(1) pitch artefacts - related to the event-separation rate of the segments.
(2) granulation artefacts - where the individual grains are large enough to reveal a time-evolving
structure, and hence, as successive segments are chosen from overlapping regions of the source,
delayed repetitions are heard.
(3) phasing artefacts - due to the interaction of rapid repetition or "delay", and gradual shifting along
the source.
With long time-stretches the perceptual connection between source sound and goal sound may be
remote and may require the perception of mediating sounds (with less time-stretching) to make the
connection apparent. Repeated application of brassage techniques to a source (in effect using the output
of one pass as the source for the next) may entirely destroy the original characteristics of the source. In
contrast, spectral time-stretching (in the frequency domain) can be repeated non-destructively. (Sound
example 11.5).
Brassage may be extended in a variety of ways, using larger and variable length segments, varying the
time-range in the source from which the goal segment may be selected, varying the pitch, loudness
and/or spatial position of successive segments and, ultimately, varying the output event density. We
thus move gradually out of the field of time-stretching into that of generalised brassage and granular
reconstruction. (These possibilities are discussed in more detail in Chapter 5 in the section
"Constructed Continuation" and in Appendix pp44-45 and Appendix p73). We need add only that
using Granular Reconstruction with time-varying parameters (average segment length, length spread,
search range, pitch, loudness, spatial divergence and output density) it is possible to create, in a single
event, a version of a source sound which begins as mere time-stretch and ends as a texture-stream
development of that source.
A more satisfactory time-stretching application of the brassage process to sequences such as speech
streams can be achieved by source-segment-synchronous brassage. In this case we need to apply a
sophisticated combination of envelope following and pitch-synchronous spectral analysis to isolate the
individual segments of the source stream. These can then be individually brassage-time-stretched
and respliced together in one operation. Because none of the goal segments cross over between source
segments, we avoid artefacts created at source segment boundaries in simple brassage (see Diagram
1).
WAVESET TIME-STRETCHING
Time-stretching can be achieved by searching for zero-crossing pairs and repeating the wavesets thus
found (see Appendix p55). This technique will produce only integral time-multiples of the source
duration. As discussed elsewhere, two zero-crossings do not necessarily correspond to a wavecycle (a
true wavelength of the signal) so a waveset is not necessarily a wavecycle. As a result this technique
will have some unpredictable, though often interesting, sonic consequences. Sometimes parts of the
signal will pitch-shift (e.g. at x2 time-stretch, by an octave downwards, as in tape-speed variation).
For a x2 (or even x3) time-stretch, artefacts can often be reduced by repeating pairs (or larger groups)
of wavesets (a special case of pitch-synchronous brassage). These kinds of artefacts can of course be
avoided, in truly pitched material, by using pitch-following to help us to distinguish the
true wavecycles. (Sound example 11.6).
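A minimal sketch of this idea, assuming a mono NumPy array, with a waveset taken here as the span between successive zero-crossings in the same (upward) direction; each waveset found is simply written out 'repeats' times, giving an integral multiple of the source duration (material before the first and after the last crossing is dropped in this sketch).

    import numpy as np

    def wavesets(signal):
        # indices where the signal crosses zero in the upward direction
        crossings = np.where((signal[:-1] <= 0) & (signal[1:] > 0))[0]
        return [signal[a:b] for a, b in zip(crossings[:-1], crossings[1:])]

    def waveset_stretch(signal, repeats=2):
        out = []
        for ws in wavesets(signal):
            out.extend([ws] * repeats)             # repeat each waveset as it is found
        return np.concatenate(out) if out else signal.copy()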
As the number of repetitions increases, other artefacts begin to appear. At x3 there is often a
"phasing"-like coloration of the sound. With time-evolving and noisy signals, at x16 time-stretch a
rapid stream of pitched beads is produced as each waveset group achieves (near-) grain dimensions
and is heard out in pitch and spectral terms. (No such change occurs, however, in a steady tone or
stable spectrum.) This spectral fission is heard 'subliminally' within the source even at x4
time-stretch. (Appendix p55). (Sound example 11.7).
It is possible also to interpolate between waveset durations and between waveset shapes through the
sequence of repetitions. In x64 time-stretching with such interpolation the new signal clearly glides
around in pitch as each "bead" pitch glides into the next. At x16 time-stretch, we are aware more of
the 'fluidity' of a bead-stream rather than of a continually portamentoing line. Even at x4
time-stretch, this fluidity quality contributes to the percept in an intangible way. (Sound example
11.8).
DIAGRAM 1: Use fine-grained envelope following and detailed spectral analysis to extract sequential
features of the sound. Deduce the pitch of each feature and use it to define pitch-synchronous brassage
segments of that feature.
DIAGRAM 2: Effect of granular time-stretching; effect of spectral time-stretching.
Except for x2 time-stretching, when working on musically interesting sound sources, waveset
repetition is more useful as a special sound transformation procedure. Successive applications of such
x2 stretching do not significantly reduce the bead-streaming effect in long time dilations.
SPECTRAL TIME-STRETCHING
A sustained sound with stable spectrum and no distinctive onset characteristics may be analysed to
produce a (windowed) frequency-domain representation. We can then resynthesize the sound, using
each window to generate a longer duration (phase vocoder). The resultant sound appears longer but
retains its pitch and spectral characteristics. We may notice an extension of the initial rise time and
final decay time but in this case these may well not be perceptually crucial. Of all the techniques so far
discussed, this is by far the best for pure time-stretching and works well up to about x8
time-stretching. (Sound example 11.9).
Beyond this, however, it too becomes less satisfactory as a pure time-stretching procedure. The
original window size for the analysis is chosen to be so brief in the time-domain that the human ear
perceives the small change from window to window (in fact a "step") as a smooth continuous transition.
Once we do too long a resynthesis from individual windows, e.g. a x64 time-stretch, the spectrum of
each window is sustained long enough for us to become aware of the joins. The continuity of the
original source is not reconstructed.
One way around this limitation is to stretch the source x8, synthesize the result, reanalyse and stretch
by x8 once again. However, a more satisfactory approach is to interpolate between the
existing windows to create new windows intermediate in channel-frequency and channel-loudness
values between the original windows but spaced at the original window time-interval (spectral
time-stretching). The procedure ensures a perceptually continuous result even at x64 time-stretch.
(Sound example 11.10).
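A hedged sketch of the idea, assuming a mono NumPy array longer than the analysis window, with illustrative window and hop sizes: the source is analysed into overlapping windows, new windows are synthesized at the original time-interval by interpolating channel loudness between neighbouring analysis windows while the channel phase is advanced by the measured inter-window phase difference, and the result is overlap-added. This is only a simplified stand-in for a full phase-vocoder implementation.

    import numpy as np

    def spectral_stretch(signal, stretch, size=1024, hop=256):
        window = np.hanning(size)
        frames = np.array([np.fft.rfft(signal[i:i + size] * window)
                           for i in range(0, len(signal) - size, hop)])
        mags, phases = np.abs(frames), np.angle(frames)
        out = np.zeros(int(len(signal) * stretch) + size)
        phase = phases[0].copy()
        pos, write = 0.0, 0
        while pos < len(frames) - 1 and write + size <= len(out):
            lo, frac = int(pos), pos - int(pos)
            # channel loudness interpolated between the two neighbouring windows
            mag = (1 - frac) * mags[lo] + frac * mags[lo + 1]
            out[write:write + size] += np.fft.irfft(mag * np.exp(1j * phase)) * window
            # advance each channel's phase by its measured inter-window increment
            phase += phases[lo + 1] - phases[lo]
            pos += 1.0 / stretch                   # move through the source more slowly
            write += hop
        return out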
However, even in this case, spectral transformations may arise. In particular, a sound with a rapidly
changing spectrum, originally perceived as noise, will be sufficiently slowed for us to hear out the
spectral motion involved. In general we will hear a resulting sound with a gliding inharmonic
("metallic") spectrum in place of our source noise, at very great time-stretching factors. (Sound
example 11.11).
Using time-variable spectral time-stretching (spectral time-warping), we can use this effect to
produce spectrally diverse variants of a source. Zooming in to a maximal time-stretch at a particular
point in a source will produce a particular inharmonic artefact in the goal sound. Zooming in to a
different point in the source will produce a different inharmonic artefact in the new goal sound. We can
thus produce a collection of related musical events deriving from the same source (provided enough of
the resulting signals are elsewhere similar to each other). (Sound example 11.12).
The ultimate extension of this process is spectral freezing, where the frequencies or loudnesses of the
spectral components in a particular window are retained through the ensuing windows. This
compositional tool is discussed in Chapter 3.
If we have a sound with marked (grain time-frame) onset characteristics, e.g. an attack-dispersion
sound (piano, bell, pizz), even our interpolated spectral time-stretch over a long stretch will radically
alter the sound percept by making the indivisible qualitative character of the onset become a
time-varying percept (see Chapter 1). (Sound example 11.13).
If we wish to preserve the characteristics of the source sound, we must retain the onset characteristics
by not time-stretching the onset. This we can achieve by time-variable spectral time-stretching
(spectral time-warping), making the time-stretch equal to 1.0 (i.e. no stretch) over the first few
milliseconds of the source and thence increasing rapidly as we wish to any large value we desire.
(Sound example 11.14).
Alternatively, we can use spectral time-stretching to heighten the internal spectral variation of the
source. Time-stretching the onset of the signal, but not the continuation, will alter the sound percept
radically (altering the causality) but retain a perceptual connection between source and goal through the
stability of the continuation. Clearly, the longer the continuation, the stronger the sense of relatedness.
Clearly also, there are many combinations of onset transformation and continuation transformation.
(Sound example 11.15).
In a very long spectral time-stretch of a sound's continuation, where the sound is spectrally varying,
we can reveal these changes by stripping away partials from the time-stretched sound until only the
most prominent (say) twenty remain (spectral tracing: see Appendix p25 and Chapter 3). As the
spectrum varies in time, partials will enter and leave this honoured set and we will hear out the new
entries as 'revealed melodies', a type of constructive distortion. If the time-stretch is itself not
time-varying, these revealed melodies will be regularly pulsed at the time-separation of the original
window duration multiplied by the time-stretch factor. Spectral time-warping will create a fluid
(accel-rit) tempo of entries and will accelerate into the time continuum where the time-stretch factor
reduces towards 1.0 (no stretch). (Sound example 11.16).
GRANULAR TIME-STRETCHING
The time-stretching of grain-streams is problematic. As we have seen, if we time-stretch the onset
of a sound we risk completely altering its perceived character. We overcame this problem by
time-variably time-stretching (time-warping) a sound in such a way that the onset was not stretched.
However, grain-streams are in effect a sequence of onsets. We cannot in this case, therefore, preserve
only the beginning of the sound. Ideally we would need to use an envelope-follower to uncover the
loudness trajectory of the sound and thus locate all the onsets, and then apply a time-warping process
that left the sound unaltered during every onset moment. This is feasible but awkward to achieve
successfully.
It is therefore useful to be able to time-vary a grain-stream by separating the grains and repositioning
them in time, causing the sequence of grains to accelerate, ritardando, randomly scatter etc. (granular
time-warping by grain separation). (Sound example 11.17).
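A minimal sketch of this repositioning, assuming the grains have already been isolated (e.g. by envelope following) as a list of NumPy arrays together with their onset times in seconds: the grains themselves are untouched, only the gaps between their onsets are rescaled. Replacing the single stretch factor with a value that changes from grain to grain would give the accelerando, ritardando or random scattering described above.

    import numpy as np

    def reposition_grains(grains, onsets, stretch, sr=44100):
        new_onsets = [t * stretch for t in onsets]       # rescale only the separations
        length = max(int(t * sr) + len(g) for g, t in zip(grains, new_onsets))
        out = np.zeros(length)
        for grain, t in zip(grains, new_onsets):
            start = int(t * sr)
            out[start:start + len(grain)] += grain       # each grain is preserved intact
        return out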
With even moderately large granular time-stretching of this sort, the stream character of a
grain-stream breaks down in our perception - we hear only isolated staccato events, the elements of
potential musical phrases. Conversely, time-shrinking of a sequence of isolated events, by reducing
separation time, can reach a point where the sounds become a grain-stream, or sequence-stream,
rather than musical "points" in their own right. (Sound example 11.18).
We may also extend grain-streams by duplicating elements (granular time-stretching by grain
duplication), e.g. respace the original units at twice the initial separation and repeat each grain halfway
in time between the new grains. This is a grain-dimension analogue of waveset repetition and suffers
from the same drawbacks. If the grain elements are changing in character, grain repetition will be
perceptually obvious and the more repetitions of each grain, the more perceptually prominent it will be.
(It might be obviated by a very narrow-range pitch-scattering of the grains to destroy the sense of
patterning introduced by the grain repetition.) Note that in this way we can time-stretch the
grain-stream by integral multiples of the original duration (and by selected choice of grains to repeat,
by any rational fraction), without thereby altering the (average) pulse-rate (density) of the stream.
(Sound example 11.19).
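A sketch of the procedure just described, under the same assumption that grains and onset times have already been extracted: the original grains are respaced at twice their initial separation and each grain is repeated halfway between, doubling the duration without altering the average pulse-rate.

    import numpy as np

    def duplicate_grains(grains, onsets, sr=44100):
        seps = [b - a for a, b in zip(onsets, onsets[1:])]
        seps.append(seps[-1] if seps else 0.1)           # extrapolate a final separation
        placements = [(g, s) for g, t, sep in zip(grains, onsets, seps)
                      for s in (2 * t, 2 * t + sep)]     # each grain and its halfway repeat
        length = max(int(s * sr) + len(g) for g, s in placements)
        out = np.zeros(length)
        for grain, s in placements:
            out[int(s * sr):int(s * sr) + len(grain)] += grain
        return out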
TIME-SHRINKING
All the above processes may be applied to time-contraction, with noticeably different results.
Tape-speed variation time-contraction has already been discussed. Spectral time-shrinking can
smoothly contract any sound, or part of a sound. It can be used to contract a sound with continuation
into an indivisible grain, though in this process data will be irretrievably lost, i.e. if the grain is now
time-dilated once more, the resulting sound will be considerably less time-variably spectrally detailed
than the original. The process of successive contraction and expansion will give similar results to the
process of spectral blurring (see Chapter 3). (Sound example 11.20).
If grain-streams or sequences are spectrally time-shrunk, the individual elements will shrink, become
less spectrally detailed and more click-like as their durations approach the lower grain time-frame
boundary. (Sound example 11.21). Granular time-shrinking by grain separation involves the
repositioning of grains by reducing intervening silence (where possible), or overlaying or splicing
together the existing grains. These latter processes will tend to blur grains together. The process hence
tends towards a tremolo-like loudness trajectory and eventually to a continuous percept, from which the
original grain is not recoverable. (A new grain might be created by enveloping.) (Sound example
11.22).
Granular time-shrinking by grain deletion is not prone to this blurring effect, but it is easy to destroy
the continuity of the grain-stream percept if grains are deleted from that stream.
In waveset time-shrinking waveset duplication is replaced by waveset omission. This process also
loses granular or sequential detail as the sound contracts, though in a different way to grain separation
contraction. We may choose to omit wavesets in a sequentially regular manner (e.g. every fourth
waveset) in which case the sound becomes increasingly fuzzy, quieter, less detailed. Or, we may
choose to omit the least significant (lowest amplitude) wavesets. In this case the sound retains its
original loudness and loudness trajectory, but also loses sequential detail. The first process tends
towards silence; the latter eventually reduces any sound to a continuation-less but loud point sound.
(Sound example 11.23).
THE CONSEQUENCES OF TIME-WARPING; TIME-FRAMES
As we have discovered, time-warping is not a single, simple process. The musical implications of
applying particular processes to particular sounds must be considered on their own merits. There are,
however, a number of general perceptual considerations to be borne in mind.
Considering single grain time-frame sounds, the internal structure of the grain, perceived as an
indivisible whole, a quality of the event, when time-stretched becomes a clearly time-varying property,
a feature of the sound continuations. We may in this way uncover changing pitch, changing spectral
type, changing formant placement, changing loudness, undulations in any of these, or specific granularity
of sequencing of events within the original grain. In many cases there may be no apparent perceptual
connection between the quality of the grain and the revealed morphology of a much time-stretched
version of that grain. It will need to be established through the structural mediation of sounds made by
time-stretching the grain by much less. In fact, perceptual connectedness will be found to follow an
exponential curve as we move away from the grain dimensions, i.e. initially very tiny spectral
time-stretches will produce very significant perceptual relations, but as we proceed, much larger
time-stretching will present less and less surprising perceptual information. (Sound example 11.24).
If we wish to preserve the onset characteristics of a grain time-frame sound, we must use
time-variable spectral time-stretching (spectral time-warping) (see above) or simply, perhaps, use
the original grain as a superimposed onset for time-stretched versions of itself. (Sound example
11.25).
Conversely, a sound with continuation will usually have distinctive onset characteristics of grain
dimensions. When we time-stretch such a sound we may choose to preserve the time-frame of the
onset, hence preserving a key perceptual feature of the source, or we may time-stretch the onset too,
producing, with large time-stretches, a dramatic change in the percept. Once again, some kind of
mediation (through sounds with less time-stretched onsets) may be necessary to establish the
perceptual link between source and goal sound, though the continuation of the two sounds may provide
sufficient perceptual linkage between them. (Sound example 11.26).
As we have seen, grain-streams (and sequences) may be time-extended by granular time-stretching
by grain duplication, by granular time-stretching by grain separation or by spectral time-stretching,
which latter qualitatively transforms the grain (sequence) elements. These each have quite different
musical implications. In particular, granular time-stretching by grain separation (in which the grains
themselves are preserved while the intervening time-gaps are manipulated) will give us a clear percept
of rhythmic variation (ritardando, accelerando, randomisation etc) as the internal pulse of the
grain-stream (sequence) changes. In contrast, spectral time-stretching (or brassage/harmoniser type
time-stretching) applied to continuous sounds is more akin to looking at time itself through a
magnifying glass - the sound gets longer without any sense of the rhythmic slowing down of a pulse.
When applied to a grain-stream these will usually create both a sense of rhythmic change and this
sense of temporal extension. (Diagram 2). (Sound example 11.27).
Granular time-stretching by grain separation of a grain-stream will lead, ultimately, to a "slow"
sequence of individual staccato events which may be re-sequenced in terms of onset-time (rhythm),
pitch, or various spectral properties, or any combination of these. Spectral time-stretching or
harmoniser time-stretching of a grain-stream, if it stretches the grain constituents by a large amount,
will reveal spectral qualities and evolving shapes (morphologies) within the grain (or sequence)
elements having their own musical implications. (Sound example 11.28).
The continuation of a sound, itself spectrally time-stretched over a number of seconds (especially in
complexly evolving sounds) may provide enough musical information for us to treat the extended sound
as a phrase in its own right. As we have heard, time-variable spectral time-stretching (spectral
time-warping) allows us to produce many versions of such a phrase with different internal time
proportions and, perhaps, spectral emphases, just as we might produce variants, in the Hpitch domain, of
a melodic phrase. Also, time-warping of a continuous sound can create a kind of forced continuation
(see Chapter 5) in the spectral domain if applied to a complexly spectrally evolving sound. Similarly,
undulating continuation properties will become slow glides with quite different musical implications to
their undulating sources and we can alter these implications through time-warping. (Sound example
11.29).
The time-warping of rhythmically pulsed events provides us with an entirely new area for musical
exploration. It is already possible to work with multiple fixed time-pulses (e.g. mutually synchronised
click-tracks in different tempi, as in Vox 3) or with time-pulses mutually varying in a linear manner (as
in the "phasing" pieces of Steve Reich, which generate new melodic, rhythmic or spectral percepts
through finely controlled delay) but we might also produce mutual interaction between rhythmic streams
which are themselves accelerating or decelerating, from time to time forcing the streams to
pulse-synchronise in the same (possibly changing) tempo and in the same (or a displaced)
pulse-grouping unit (bars coincide or not). (Sound example 11.30).
This device is used in Vox 5 where three copies of an ululating voice are slightly time-varied in different
ways and made to move differently in space. They begin in synchronisation, in a single spatial location,
and reconverge to a similar state at the section end by appropriate inversions of the speed variations
(see Diagram 3). Hence speed divergence and resynchronisation become elements in the emergence
and merging of aural streams; or "counterstreaming", the time-fluid cousin of counterpoint. Note,
however, that the sound constituents of two time-varying counter-streams do not have to be the same,
or even similar. (Sound example 11.31).
In fact, if the stream constituents are identical, are synchronised at some point and have slightly
different time-stretch parameters, their interaction produces portamenti rising to or falling from the
point of synchronisation (see Chapter 2 on Pitch Creation). (Sound example 11.32). In the previous
case (Sound example 11.31) these precise-delay pitch-artefacts were avoided by carefully fading two
of the three streams just before such artefacts would appear.
The interaction of time-varying pulse-streams bears the same relation to fixed-tempo polyrhythm as
controlling pitch-glide textures does to Hpitch field pitch organisation.
Time-stretching may lead to spectral fission (see Chapter 3), e.g. spectral time-stretching of noisy
sounds by large amounts may produce sounds with gliding inharmonic spectra (Sound example 11.11)
- or to constructive distortion, e.g. waveset time-stretching produces pitched bead-streams (Sound
examples 11.7 & 11.8) and spectral time-stretching reveals window-stepping in spectrally traced
sounds (Sound example 11.16). In both cases a long time-stretched goal sound may not be
perceptually relatable to the source and will require mediating sounds (e.g. less time-stretched
versions of the source using the same technique) to establish a musical connection.
Throughout this section we have talked only of time-stretching but the same arguments may be applied,
in reverse, to time-contraction. In particular, phrase structure may be compressed into a grain-stream (a
comprehensible spoken sentence can become a rapid-fire, spectrally irregular sequence of grains) and
continuation compressed into the indivisible qualitative percept of grain. Once again, if a perceptual
connection between source and goal is required (to build musical structure), mediating sound types may
be necessary.
TIME-STRETCHING OF TEXTURE-STREAMS
We have left the discussion of texture-streams until the last because it introduces further
multi-dimensionality into our discussion of time-stretching. We have already encountered a
two-dimensional situation with grain-streams. A grain-stream may be granular time-stretched by grain
separation, or by grain duplication. The first process reduces the pulse-rate (or density) of the
stream; the latter does not. With texture-streams the situation is even more complex.
We may distinguish three distinct approaches to time-stretching a texture-stream. In the first we
treat the texture-stream as an indivisible whole and time-stretch it. We may do this by spectral
time-stretching, thereby stretching all the texture constituents, and hence very quickly producing a
radical spectral transformation of the percept. All the perceptible time-varying field properties of the
texture-stream will thereby be time-stretched e.g. loudness trajectory, pitch-bandwidth change etc.
The revelation of the inner structure of grains may even alter the field percepts (e.g. noise elements
becoming inharmonic sounds, or Hpitches appearing as gliding pitches) of the source sound. (Sound
example 11.33).
Waveset time-stretching will produce surprising and unpredictable artefacts when applied to
texture-streams as the zero-crossing analysis will confuse the contribution of various distinct grains to
the overall signal. With a stereo texture, waveset duplication may be applied to each channel
independently, producing arbitrary phase shifts between the channels, as well as the aforementioned
artefacts. (Sound example 11.34).
Brassage techniques with above-grainsize segments, using regular segment size and zero
search-range (see Appendix p44), will quickly destroy the unpatterned quality of the texture-stream,
as brassage repetitions introduce a "spurious" order into the goal sound. Brassage will work better at
preserving the inherent qualities of the texture-stream if we use a large enough segment size to
capture the disorder of the texture-stream and a large enough range to avoid obvious repetition of
materials. However, too large a range will begin to destroy any time-varying order in the field
characteristics of the stream (e.g. directed change of the Hpitch field, loudness trajectory etc). (Sound
example 11.35).
Assuming we have fine control of the texture generation process we could, in fact, separate out some of
these field properties, e.g. time-stretching a dynamically flat version of the texture, then reimposing the
original loudness trajectory in exactly the same time-frame as in the source texture-stream. We will
discuss this parameter separation further, below. (Sound example 11.36).
The second approach to time-stretching a texture-stream would be granular time-stretching by grain
separation, as with grain-streams and sequences. However, because of the mutual overlaying of
grains (or larger constituents) in a texture-stream, there is usually no simple way we can achieve this.
It can only be done, in general, by returning to the texture generation process and altering the
event-onset density parameters. To achieve an integrated time-stretch of this sort, any time-varying
field properties (Hpitch field change, loudness trajectory, formant change etc.) would need to be
similarly time-stretched in the generating instructions. We could, however, choose not to alter these
features of the stream. In this way, we may create a goal sound which appears less event-onset dense
than the source sound but not perceptually time-stretched in any meaningful sense. (Sound example
11.37).
This suggests the third approach to time-stretching. In this case we may generate events at the
original density, but for a longer time, then impose time-dilated field-variation parameters on the
stream. Here the loudness trajectory, the pitch-range variation, the formant changes, the transitions to
noisiness etc would move more slowly, but the event-onset density would remain as before. This is the
textural equivalent of granular time-stretching by grain duplication. (Sound example 11.38).
We here begin to touch upon interesting music-philosophical ground. For in this last case, the
texture-stream percept is clearly not a "unique" sound-event, in the sense which we spoke of this in
Chapter 1. The texture-stream is an example of a class of sounds with certain definable time-varying
properties, just as a note, in traditional music, is a representative of a class of sounds with certain
definable stable properties. It is the musical context which focuses our attention upon particular
properties, or groups of properties of a sound, or on its holistic characteristics. Composition focuses
perception on what is being perceptually organised through time. Or rather it does this so long as it is
aware of what can be perceived and what will be perceived in the resulting musical stream.
We have hence given three quite different definitions of time-stretching a texture-stream. If we
include the possibility of spectrally time-stretching the texture-components prior to generating the
texture-stream, we may imagine another option in which the texture constituents are spectrally
time-stretched (this time-stretching itself changing from constituent to constituent, as we proceed
timewise through the texture) while (as far as is possible) the temporal evolution of density and field
characteristics remain unchanged. This might better be regarded as a time-varying
spectral-type/duration-type transformation of the source texture.
In fact we can time-stretch event-onset-separation-density variation, event-duration variation,
overall loudness trajectory, pitch-range variation, evolution of the spectral contour (formant evolution)
etc. etc. all independently of one another. Time has thus become a multi-dimensional phenomenon
within the sound percept and we may choose amongst the many compositional options available to us.
CHAPTER 12
INTERPOLATION
WHAT IS INTERPOLATION?
Any perceptually effective compositional process applied to a sound will produce a different sound.
However, if the process is sufficiently radical (extreme spectral stretching, or time-warping or filtering
or randomisation of perceptual elements etc) we will produce a sound which we recognise as being of a
different type. I will try to define this more clearly below.
For the moment, we also note that we can, by using a similar compositional process, create a whole set
of sounds, whose properties are intermediate between those of the source and those of the goal. We
will describe this mediation by progressive steps as static interpolation.
Alternatively, provided our sound is sufficiently long, we may gradually apply a compositional process,
changing the value of various parameters through time, e.g. we may gradually spectrally stretch the
harmonic spectrum of an instrumental tone until we reach, through a continuous process, a complex and
inharmonic sound. Or we may gradually add vibrato to a relatively short-term, pitch-stable sound (e.g.
dense traffic) so that it eventually involves such extreme and rapid swings of pitch that the original
percept is swallowed up by the process. In these cases we have a process of dynamic interpolation
taking place through the application of a continuously time-varying process.
Clearly, we can apply this kind of reasoning to all compositional intervention and all compositional
processes might be described as so many sophisticated variants of interpolation. However, in this
chapter we will deal largely with compositional processes which interpolate between two (or more)
pre-existing sounds. And, in a similar way, we will discuss both static and dynamic interpolation
between these sounds.
It is important to understand that in this case we wish to achieve some sense of perceptual fusion
between the two original percepts - a mere superimposition of one over the other is not acceptable as
an interpolation. This is discussed more fully below.
IS RECOGNITION IMPORTANT?
A significant factor to define, when discussing the idea of interpolation, is the recognition of source and
goal sounds in the process. What distinguishes interpolation from mere variation is our feeling that we
have moved away from one type of sound and arrived at a different type. This percept is most readily
understood when the source and goal sounds are recognisable in some referential sense. The source is
a trumpet, the goal is a violin; the source is the sea, the goal is a voice; the source is spoken English,
the goal is spoken French; or even the source is a singing voice, the goal is clearly not a voice.
Interpolation may pass directly from one recognition (voice) to another (bees) or it may seek out an
ambiguous ground in which two recognition concepts conflict or cooperate within the same experience
(the talking sea).
However, a percept of interpolation is possible without such clear referential clues where there is a
distinct change in the sense of physicality or causality of the source (see Chapter 1).
Hence, by greatly spectrally time-stretching a voice, or a flexed metal sheet sound, and then imposing a
rapid series of hard-edged loudness trajectories on the resulting continuum, we move from the sense of
forced continuation of an elastic medium, to a sense of striking a hard inflexible medium. Both
physicality and causality have been altered. (Sound example 12.1).
Hence modifications to onset characteristics, the rate of spectral change (and elsewhere to the
irregularity-regularity of sequencing etc) alter our intuitive type-classifications of the sounds we hear.
When compositional processes move sounds across these boundaries, we create the percept of
interpolation.
MEDIATION: AMBIGUITY: CHANGE
Before we go on to discuss compositional methods for achieving interpolation, it is worth considering
why we might want to do it. There would seem to be at least three different motivations for musical
interpolation and each motivation leads to a different emphasis in the way the technique is applied.
The first approach is aimed at achieving some kind of mediation in sound between distinct sound types.
This approach may be heard in Stockhausen's Gesang der Junglinge where the 'pure' pitched singing
voice of a young boy and pure (pitched) sine tones are mediated through a set of intermediate pitched
sounds (sounds between 'boy' and 'sine tone').
This desire to mediate between the child's voice and a set of more 'abstract' (i.e. less
source-recognisable) sounds has a metaphysical underpinning (a religious conception of 'unity' in the
cosmos), which pervades much of Stockhausen's musical thought. The mediation is not achieved
through dynamic interpolation (technologically almost impossible at the time), nor through a clear
progressive movement from one sound-type to the other, but in the sense that the piece is grounded in
a field of sound types which span the range 'boy's voice' to 'sine tone'. These are sequentially
articulated according to an entirely different logic (a serialist sequencing aesthetic), which also governs
the rest of the musical organisation in the piece.
A second approach to sound interpolation focuses on the ambiguous implications of the sounds thus
created. Roger Reynolds has used interpolations between a voice speaking a Samuel Beckett text in
English, the same voice speaking the text in French and the sound of brass instruments. Interpolation
takes place in two dimensions, between English and French on the one hand and between voice and
instrument on the other. The composer focuses on the cusp of the interpolations, where we are most
undecided about whether what we hear is English or French, voice or instrument. This approach also
has its own metaphysical implications, if of a more secular variety. Technically the aim (and difficulty)
here is to achieve a percept which is capable of these dual interpretations without entirely losing
'source credibility' (i.e. is it anything at all that we can recognise?). This can be particularly difficult,
even with the most advanced technology.
The third approach focuses upon the process of change itself. In Vox 5 the transformation voice->bees
aims to achieve clear recognition of both source and goal, and a seamless transition from one to the
other without any intervening artefacts which might suggest some other physicality/causality, or even
betray the technical process involved. (Sound example 12.2).
Moreover the dynamism of the change is a crucial parameter; in the voice->bees example, the voice
almost melts slowly into the bee swarm; other transformations in Vox 5 are quicker and more forceful,
suggesting a generative energy in the vocal source, spitting or throwing out the goal sounds - and this
dynamism is enhanced by spatialisation in the sense of spatial motion, or the emergence of stereo
images from a mono source. Here, the technical problems are those of seamlessness in the spectral
transition and achieving the right dynamism, especially as interpolation recognition often requires a
relatively long time-frame in order to work smoothly. Musical context can play a vital role here.
INBETWEENING
The most obvious way to achieve some kind of interpolation between two sounds would seem to be to
mix them in appropriate (relative loudness) proportions. However, as we know from our everyday
experience this almost never creates perceptual fusion of the two sources. Our brain manages to
unscramble the many sounds impinging on our ears at any time and to sort them into separate sources.
We hear mix, not fusion. This is due partly to the ear's sensitivity to onset synchronicity (or the lack of
it) and partly to the parallelism of micro-fluctuations of the components in any one source, at the same
time being different from that in other sources.
Successful interpolation by mixing can only be achieved if we can defeat either, or both, of these. In
sounds with continuation, the very precise (to the sample) synchronisation of attack (onset
synchronisation) can achieve an instantaneous fusion of the aural image which is however immediately
contradicted by the continuation of the sounds in question. (Sound example 12.3).
To achieve a completely convincing sound intermediate between two others, we must work with sounds
whose continuations are (almost) identical, or with grain time-frame sounds (which have no
continuation). In the latter case in particular, we can achieve good intermediate percepts simply by
mixing, but not necessarily. The ideal case is one in which the source sound is transformed into the goal
sound by a distortion process that retains the duration and general shape of the source down to the
level of the wavecycle or waveset (see destructive distortion in Chapter 3). Superimpositions of these
two sounds in various proportions will create convincing intermediate states (inbetweening: Appendix
p46). (Sound example 12.4).
Where the sounds have continuation, this process may be unsatisfactory. Consider, for example, the
goal and source sounds of the sequence "Ko->u" to "Bell" from Vox 5. (Sound example 12.5).
If we began with the goal and source of this sequence and merely mixed them in various proportions, we
would not achieve a satisfactory set of intermediate sounds. The process of spectral stretching,
successively applied, has gradually separated the spectral components (and hence altered the spectral
structure) to a point where they will not simply fuse together again by mixing.
The difficulties here might be compared with those in the process of in-betweening in animation. Here
the directing artist draws key frames for the film, and assistants (or computers) do the various
in-between drawings such that, when all are combined frame-by-frame on the film, realistic movement
will be created. In-betweening may not be as skilful an occupation as originating the key frames but it
is by no means elementary. Simple point-to-point linear interpolation will, in general, not work.
INTERLEAVING AND TEXTURAL SUBSTITUTION
A different approach to this problem of static interpolation is to, in some sense, interleave the data from
our two sounds. One way to do this is to interleave analysis windows obtained from the spectral
analysis of two sounds.
When we produce a spectral analysis of a time-varying sound, we divide the sound into tiny
time-slices, or windows, and analyse the spectrum in each window (using e.g. the fast Fourier
transform, see Appendix pp2-3, and Chapter 1) in order to follow the temporal evolution of the
spectrum. This procedure is the basis of the phase vocoder (see Appendix).
Having produced such an analysis of our two different sounds we may interleave alternate windows
from each sound, preserving the original time-frame, or simply interleave windows as they are, creating
a goal sound as long as the sum of the original sounds' lengths (see Appendix). This procedure can
result in a kind of closely welded 'mix', if not exactly a fusion of aural percept. We may also choose to
interleave pairs (or larger groups) of windows. With a sufficiently large grouping of windows, this will,
of course, produce a rapid regular oscillation between the two source sounds. (Sound example 12.6).
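A minimal sketch of window interleaving, assuming both sounds have already been analysed into sequences of equal-sized spectral frames (e.g. successive FFTs of windowed segments, as in the spectral time-stretching sketch in Chapter 11): alternate windows, or groups of windows, are taken from each sound in turn, giving a goal sound as long as the sum of the two.

    import numpy as np

    def interleave_frames(frames_a, frames_b, group=1):
        out = []
        n = min(len(frames_a), len(frames_b))
        for i in range(0, n, group):
            out.extend(frames_a[i:i + group])     # a group of windows from sound 1...
            out.extend(frames_b[i:i + group])     # ...then the same group from sound 2
        return np.array(out)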
A similar percept can, however, be more easily achieved through brassage where we also have control
over the segment size and are not therefore tied to regular oscillations. Two or more sources can be
brassaged together (multi-source brassage) with control of segment size, search range, pitch shift,
spatialisation of segments etc, just as in single-source brassage. This approach perhaps works best in
producing a process-focussed transformation (see Chapter 1) (e.g. granulated or pitch-spread) in
which the two (or more) sources 'peer through' the distorting grill of the process simultaneously. The
process's artefact grill disguises the lack of true fusion of the sources - both are an integral part of the
resulting 'mince'. (Sound example 12.7).
We can go one step further and attempt to integrate the wavecycles, or wavesets, of the two sounds.
Interleaving wavecycles or wavesets will produce spectrally unpredictable sidebands; two interleaved
sinewaves of different wavelengths will produce a sideband whose wavelength is the sum of the
originals, but with complex signals the process is likely to result in radical mutual distortion. This
becomes a mutual transformation process rather than a true interpolation (waveset interleaving).
(Sound example 12.8).
Similarly, we may impose the waveset (or wavecycle) shape of one sound on the waveset (or
wavecycle) length of the other. With simple sources, e.g. square wave to sine wave, this will force the
pitch of the latter on the spectrum of the former. With complex sources, using wavesets, the results are
once again complex and perceptually unpredictable. Again, we have mutual transformation, but not true
interpolation (waveset transfer). (Sound example 12.9).
If we have a very rapid sequence or a texture-stream we may achieve a dynamic interpolation between
two quite distinct states by careful element substitution, e.g. we might start with a texture-stream of
vaguely pitched noise-granules spread over a wide pitchband and, through gradually tighter band-pass
filtering of the elements themselves, focus the pitch of the granules, while simultaneously narrowing
the pitchband. In this way, we can force a broad-band noise-granule stream onto a single pitch which we
might animate with formant-glides and vibrato articulations reminiscent of the human voice (granular
synthesis: see Appendix). (Sound example 12.10).
VOCODING AND SPECTRAL MASKING
The distribution of partials in a spectrum (spectral form) and the spectral contour (formants) are
separable phenomena and impinge differently on our perception. As discussed in Chapter 3 the spectral
form creates our sense of harmonicity-inharmonicity, while the spectral contour contributes to our
formant, or 'vowel', perception. If we therefore have one source with a clear articulation of the formants
(e.g. speech), and another source which lacks significant formant variation (e.g. a flute, the sea), we can
impose the formant variation of the first on the spectral form of the second to create a dual percept
(vocoding). As the spectral contour defining the formants must have something on which to 'grip' on the
second source, this process works best if the second source has a relatively flat spectral contour over
the whole frequency spectrum. (See Diagram 1).
In speech analysis and synthesis, spectral contour data is recovered and stored as data defining a set of
time-varying filters, using a process known as linear predictive coding. The speech can be
reconstituted by driving a generating signal through these time-varying filters. A sequence of constant
squarewave-type buzzes of appropriate pitches (for voiced vowels and consonants) and stable white
noise (for noise consonants and unvoiced speech) played through the time-varying filters can be used to
reconstitute the original speech. The process can also be used to reconstitute the speech at a different
pitch (change the pitch of the buzz), or speed (change the rate of succession of the filters and the
buzz/noise), or to change the voiced-unvoiced characteristics (choice of buzz or noise).
For interpolation applications, the signal we send through the filters will be our second source for
interpolation (i.e. the one that's not the voice!). It will usually not have a flat, or even a stable,
spectrum but we can enhance the formant transfer process by 'whitening' the second source, i.e. by
adding noise discreetly to the spectrum in frequency ranges where there is little energy in the source.
For obvious reasons, this process is most often used to interpolate between a voice and another
source and is often called vocoding. It should not be confused with the phase vocoder. (Similar vocoding
procedures may, however, be attempted, using spectral data extracted by phase-vocoder analysis).
This type of interpolation might also be applied progressively so that 'voiceness', or lack of it, emerges
out of a continuing non-vocal sound. (Sound example 12.11).
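A hedged sketch of the vocoding idea (not the LPC procedure itself): for each analysis window, a smoothed spectral contour is taken from the formant-carrying voice and multiplied onto the spectrum of the second source before resynthesis. A simple moving average stands in for a proper contour estimate, and the window, hop and smoothing width are illustrative values.

    import numpy as np

    def spectral_contour(mag, width=16):
        kernel = np.ones(width) / width
        return np.convolve(mag, kernel, mode="same")     # crude formant envelope

    def vocode(voice, carrier, size=1024, hop=256):
        window = np.hanning(size)
        out = np.zeros(min(len(voice), len(carrier)))
        for i in range(0, len(out) - size, hop):
            v = np.abs(np.fft.rfft(voice[i:i + size] * window))
            c = np.fft.rfft(carrier[i:i + size] * window)
            contour = spectral_contour(v)
            contour /= contour.max() + 1e-12             # keep the level in a sane range
            # impose the voice's contour on the spectral form of the carrier
            out[i:i + size] += np.fft.irfft(c * contour) * window
        return out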
Making the sea 'talk' may be a sophisticated process of control of spectral evolution. However,
the inverse process, making 'talk' sound like the sea, can be envisioned much more simply: a very dense
texture of unvoiced speech to which an appropriate wave-breaking-shape loudness-trajectory is
applied (using a choral conductor!). If the spectral type of the sounds within the texture stream is made
to vary appropriately (e.g. lack of sibilants initially, 'f' and 'sh' sibilants at the wave peak, 's' sibilants
with high formants for the undertow) we can create a dual percept with no electronic technology
whatsoever. We might then proceed to vocode a recording of our 'sea' construct - voices within
voices.
DIAGRAM 1
For each window: extract the spectral contour from the formant-varying sound 1, and apply this contour to the spectrum of the other sound.
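The operation shown in Diagram 1 might be sketched as follows (Python with numpy assumed). The window and hop sizes, and the smoothing used to estimate the contour, are illustrative choices, not those of any particular instrument.

import numpy as np

N = 1024     # analysis window length (assumed)
HOP = 256    # hop between successive windows

def stft(x):
    frames = [x[i:i + N] * np.hanning(N) for i in range(0, len(x) - N, HOP)]
    return np.array([np.fft.rfft(f) for f in frames])

def istft(spec, length):
    out = np.zeros(length)
    for k, frame in enumerate(spec):
        i = k * HOP
        out[i:i + N] += np.fft.irfft(frame) * np.hanning(N)   # overlap-add resynthesis
    return out

def contour(mag, width=20):
    """Smooth each window's magnitudes to estimate the spectral contour (formants)."""
    kernel = np.ones(width) / width
    return np.array([np.convolve(m, kernel, mode='same') for m in mag])

def vocode(voice, other):
    """Impose the time-varying spectral contour of 'voice' on the spectral
    form of 'other' - a crude window-by-window contour transfer."""
    sv, so = stft(voice), stft(other)
    n = min(len(sv), len(so))
    cv = contour(np.abs(sv[:n]))                  # formant contour of the voice
    co = contour(np.abs(so[:n])) + 1e-9           # contour of the other source
    shaped = so[:n] * (cv / co)                   # flatten 'other', then apply the voice contour
    return istft(shaped, len(other))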
Another way in which we can achieve interaction between the spectra of two or more sounds is via
spectral masking. Here we construct a goal sound from the spectra of two (or more) source sounds by
selecting the loudest partial on a frequency-band by frequency-band basis for each time window. (See
Appendix). If one of our sources has prominent high frequency partials, it may mask out the high
frequency data in the other source(s), and the high frequency characteristics of the masked sources will
be suddenly revealed if the first source pauses, or gets quieter. Hence aspects of the spectra of two (or
more) sources may be played off against each other in a contrapuntal interaction of the spectral data.
This technique should, in general, be regarded as a form of spectrally interactive mixing, rather than
interpolation in the true sense. However, if our two (or more) source sounds have stably pitched
spectra which are strongly pitch-related, alterations in the loudness balance of the source sounds will
produce interpolated spectral states. With voices, or other sources with variable formants, this
interaction will be particularly potent. (Sound example 12.12).
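Reduced to code, the masking decision itself is very simple: window by window and channel by channel, keep whichever source has the greater magnitude. A Python/numpy sketch, assuming the analysis data come from a windowed FFT such as the one sketched after Diagram 1:

import numpy as np

def spectral_mask(spec_a, spec_b):
    """Build a goal spectrum by taking, in each analysis window and each
    frequency channel, the louder of the two sources' channel data.
    spec_a, spec_b: complex arrays of shape (windows, channels)."""
    n = min(len(spec_a), len(spec_b))
    a, b = spec_a[:n], spec_b[:n]
    choose_a = np.abs(a) >= np.abs(b)     # True where source A is louder in that channel
    return np.where(choose_a, a, b)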
SPECTRAL INTERPOLATION
The most satisfactory form of dynamic interpolation is achieved by interpolating progressively between
the time-changing spectra of two sources (spectral interpolation). This process is used extensively in
Vox 5.
As described in the Appendix (p32), each sound is represented by a series of frequency-domain analysis
windows. The information in these windows changes, window by window, for each sound. Due to the
nature of the analysis procedure, however, the windows are in step-time synchronisation between the
two sounds, where the time-step corresponds to the window duration.
We can now apply a process of moving the amplitude (loudness) and frequency values in window N of
sound 1 towards the values in window N of sound 2. If we do this progressively, so that in window N+1
we move a little further away from the values of sound 1 in window N+1 and a little closer to the
values of sound 2 in window N+1, then in windows N+2 etc. the resulting window values will move
progressively from being close to those in sound 1 to being close to those in sound 2, and the resulting
sound will be heard to move gradually from the first percept (sound 1) to the second (sound 2).
It is important to understand that we are interpolating over the difference between the values in
successive windows. We are moving gradually away from the current value in sound 1 towards the
current value in sound 2, and not from the original values (back in window N) of sound 1 towards the
ultimate values (onward in window N+K) of sound 2. The latter process, being a linear translation
between two static spectral states, would produce merely a spectral glide perceptually disconnected
from both source sounds. (Sound example 12.13).
Our process, in contrast, is moving from the 'wobbling' of one spectrum into the 'wobbling' of the other.
For this very reason, the interpolation tends to be perceptually smooth. Mixing sounds normally fails to
fuse them as a single percept because the micro-fluctuations within each sound are mutually
synchronised and out of sync with those in the other sound. For this reason, a cross-fade does not
produce an interpolation. In our process, we are effectively interpolating the micro-fluctuations
themselves. (Listen to Sound example 12.3).
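A Python/numpy sketch of this window-by-window process. The amplitude and frequency arrays are assumed to come from a phase-vocoder-style analysis, and the linear ramp between a chosen 'start' and 'end' window is only the simplest possible interpolation curve:

import numpy as np

def spectral_interpolate(amp1, frq1, amp2, frq2, start, end):
    """Move the channel amplitudes and frequencies of sound 1 progressively
    towards those of sound 2, window by window, between windows 'start'
    and 'end'.  Before 'start' we hear sound 1's own fluctuations; after
    'end', sound 2's.  All arrays have shape (windows, channels)."""
    n = min(len(amp1), len(amp2))
    out_amp = np.empty((n, amp1.shape[1]))
    out_frq = np.empty_like(out_amp)
    for w in range(n):
        # interpolation fraction for THIS window (0 = all sound 1, 1 = all sound 2)
        x = np.clip((w - start) / float(end - start), 0.0, 1.0)
        # we interpolate between the two sounds' CURRENT window values, so the
        # micro-fluctuations of both spectra are themselves interpolated
        out_amp[w] = (1.0 - x) * amp1[w] + x * amp2[w]
        out_frq[w] = (1.0 - x) * frq1[w] + x * frq2[w]
    return out_amp, out_frq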
There are a number of refinements to the interpolation process. We may interpolate amplitude data and
frequency data separately (at different times and/or over different timescales) and we may interpolate in
a linear or non-linear fashion, and we must choose these parameters in a way which is appropriate to
the particular pair of sounds with which we are working. (see Appendix p33).
There are also perceptual problems in creating truly seamless interpolation between two recognisable
sources. First of all, source recognition takes time and we need enough time for both source and goal to
be recognised, as well as for the interpolation itself to take place. It is also important for the two
sources to be perceptually similar (e.g. same pitch, similarly noisy), if the seamless transition is to be
achieved without intervening artefacts which either suggest a third and unrelated physicality/causality
in the sound, or even reveal the mechanics of the process of composition itself.
A more difficult problem is created by categoric switching in perception, e.g. in the transformation
voice->bees, we tend to perceive "a voice - a voice? - a voice??? - no, it's bees!" - there is a
sudden switch as we recognise the goal percept. It may be necessary, to achieve a truly seamless
transition, to create perceptual 'false trails' which distract the ear's attention sufficiently at the point of
maximum uncertainty for the transition to be accepted. In the voice->bees transition in Vox 5, a very
high-register, low-level part of the spectrum undergoes a slide at the 'moment of maximal doubt' in a
way which is not consciously registered but seems sufficient to undermine the sudden categoric
switching which had not been overcome by other means.
This type of spectral process can also be used for static interpolation, creating a set of sounds
spectrally intermediate between source and goal. If we also allow progressive time-stretching (so that
the duration of source sound and goal sound can be matched through the interpolation) and we place no
aesthetic restrictions (such as the goal of 'seamlessness' in the examples above) on the intervening
sound-types, it should be possible to produce a set of spectral intermediates between any two sounds
- with one word of caution! Equal small changes in spectral parameters need not lead to equal small
perceptual changes. In fact a slight deviation in spectral form (e.g. harmonicity) may have a dramatic
perceptual result.
Achieving a successful interpolation is about creating convincing, small perceptual changes in the
resulting sounds, not about the internal mathematical logic that produces them. To achieve an
approximately linear sense of interpolation along a set of sounds may require a highly non-linear
sequence of processing parameter values. The proof of the mathematics is in the listening.
SPATIAL CONSIDERATIONS
Spatial perception may also be an important factor in creating convincing static or dynamic interpolation.
In the crudest sense, imposed reverberation can help to fuse a sound-percept by giving an impression
of 'spatial integrity' (this sound was apparently produced at a single place in a given space). More
profoundly, spatial streaming will tend to separate a fused image; in a sequence or texture-stream, if
one set of elements moves left and the other right, we experience aural stream dissociation.
A movement from mono to true stereo can, however, be used to enhance, or 'dramatise', the process of
dynamic spectral interpolation. In Vox 5 most of the voice->other interpolations start with a voice in
mono at front centre stage and interpolate to a stereo image (crowd, bees etc) which itself often moves
off over the listeners' heads. Spatialisation and spatial motion hence reinforce the dynamism of the
transition.
CHAPTER 13
NEW APPROACHES TO FORM
BEYOND SOUND-OBJECTS
There is still the danger of regarding sound-composition as a means to provide self-contained objects
which are thenceforward to be controlled by an external architecture of Hpitch and rhythm along
traditional lines. In particular, MIDI technology (1994) makes this option so easy and other approaches
so roundabout that it is easy to give up at this stage and revert to purely traditional concepts for
building large-scale form.
Furthermore, I do not wish to decry traditional approaches to musical form-building. On the contrary, a
broad knowledge of ideas about melodic construction and evolution, rhythmic organisation, HArmonic
control, large-scale musical forms in general, etc, etc, from many different cultures (both 'serious' and
'popular') and historical periods, should underpin compositional choice. However, a compositional
practice confined to this, in the late Twentieth Century, will inevitably be limited. I would particularly
stress a detailed appreciation of singing styles and styles of declamation from around the world's
cultures as a way to appreciate subtleties of sonic architecture and chemistry which are often missing or
excluded from Western art music practice. Similarly, a study of the world's languages reveals the
great range of sound materials that enter into everyday human sonic articulation somewhere on the
planet.
The aim of this chapter is to suggest extensions to traditional ways of thinking, extensions which are
grounded in sound-composition itself. These may be used to complement or to replace traditional
approaches depending on the sound context and the aesthetic aims of the composer.
MULTIDIMENSIONALITY
Given the priorities of notated Western Art Music, and the fact that these priorities are constructed into
the instrument technology (from the tempered keyboard, or keyed flute, to the MIDI protocol), it is easy
to view music as a two-dimensional (Hpitch/duration) structure, 'coloured-in' by sound. The very
two-dimensionality of the musical page reinforces the notion that only two significant parameters can
be precisely controlled, using horizontal staves and vertical bars. Sound composition involves
rejecting this simplistic hierarchisation.
Sounds may be organised into multi-dimensional relation sets in terms of their Hpitch and Hpitch-field,
or their pitch and pitch-band, their vowel set, harmonicity type, onset density,
vibrato-acceleration-type etc, etc, with each of these treated as distinct perceptible orderable
parameters. Not all listable parameters are perceptually separable in every case, e.g. vibrato
acceleration type may be imperceptible when onset-density is very high. But in all definable sonic
situations we have many sound parameters at our disposal.
We may also organise sounds in terms of their overall holistic qualities. This approach is appropriate
both for grain time-frame events where separable classes of properties may not be distinguishable and
for sound sets created by progressive interpolation between quite different sounds (e.g. 'ko->u' to
'Bell': listen to Sound example 12.5). This latter sound interpolation illustrates the fact that we can
create scales of perception between definable points without parameterising our experience beforehand.
From a mathematical viewpoint, in a multi-dimensional space, we do not need to create scales along
any preferred axes of the space; we can create a stepped line in any direction.
We may then explore this multi-dimensional space establishing relationships between sounds
(holistically or through different property sets) which are both perceptible and musically potent in some
sense.
Given this multi-dimensional space, we can generalise the notion of 'modulation' (in the tonal sense).
We are already aware that reorchestrating Hpitched music can fundamentally change its affective
character. Music transferred from vibraphones to shawms, for example, or, within instrumental practice,
a change in "expression" in the production of the sound stream (e.g. bowing attack, and/or vibrato
continuation on violins).
The distinction made between "structure" and "expression" is in fact an arbitrary, ideological divide. All
the changes to the sounds can be traced to structural properties of the sounds and the control of those
structural properties. The distinction structure/expression arises from the arbitrary divide created by
the limitation of notation. Features of sound we have, in the past, been able to notate with some
exactitude (Hpitch, duration) are opened up to "rational" control (or at least rationalised explication) by
composers and commentators. Those which remain invisible or vague in the notation are out of reach of
this rationalisable control. They do, of course, remain under the intuitive control of the performer (see
Chapter 1) but this type of control comes to have a lower ontological status in the semantics of music
philosophy.
Sound recording and computer control destroy the basis for this dualistic view. The
complexity of the sound world is opened up to compositional control. In this new space of possibilities,
reason (or rationalisation) must come to terms with intuition. With precise sound-compositional
control of the space, we can move from what were (or appeared to be)
all-or-nothing shifts in sound-type to a subtly articulated and possibly progressively time-varying
"playing" of the sound space. Moreover, we do not have to treat each parameter as a separate entity.
We may group properties into related sets, or link the way one property varies with the variation of
another (e.g. vibrato speed with depth) and we may vary these linkages.
We are already familiar with such subtle articulations of a sound space within our
everyday experience. Consider the many affective ways to deliver a text, even where we specify no
significant change in tempo or rhythm. The range of human intent, physiological, health or age
characteristics, pre-existing physical or emotional condition (breathlessness, hysteria etc.), textual
meaning (irony, accusation, information, questioning etc) which can be conveyed by multi-dimensional
articulation of the sound space, is something we take for granted in everyday social interactions and in
theatre contexts.
With precise sound-compositional control of this multi-dimensional situation for any desired sound, we
can see that there is a significant and subtly articulable space awaiting musical exploration.
THE STRUCTURAL FUNCTION OF INTERPOLATION
It is instructive to examine the form of some sound-composition phrases to illustrate this
multi-dimensional approach. The sound examples here are taken from Tongues of Fire (1992-94). In
Sound example 13.1 we begin with a vocal sound whose tail is spectrally time-stretched with a gentle
tremolo at its end. The prior context is that of voice sounds. Voice sounds themselves are only
recognisable as such through a complex interaction of properties (pitch-tessitura, pitch-glide speed
and range, formant and formant-glide set, noise types, general rate and semi-regularity of sequencing
etc.).
After the first downward portamento we arrive at a section based on variants of this time-stretched
voice-tail. Here the sliding inharmonic sound falls or rises or is vibrato or tremolo articulated in
numerous ways. Each event in this segment begins vocally, but the spectrally time-stretched tail
extensions are linked through their (time-evolving) spectral type; pitch and loudness are the principal
articulating parameters.
This leads us to a varied recapitulation of the falling portamento with tremolando idea, but now the
loudness trajectory of the tremolando cuts so deep that the originally continuous sound breaks into a
succession of wood-like onsets. This is a sonic modulation from voice to 'wood' and is akin to a key
change in the tonal system. However, this is only a passing modulation. We switch back again into a
section of variants on the time-stretched tail of the vocal sound.
This section ends with a true sonic modulation, the ritardandoing wood events firmly establishing a new
sonic area, no longer vocal, but 'wood-like'. It is interrupted by a surprising shift back towards the
voice-tail, but, in fact, this octave-stacked, onset-enveloped version of the tail occurs without the vocal
initiator, thus underlining the sonic 'key change' which has taken place. As this event clears, the
wood-like events rise in pitch and density, establishing a high granular texture. The voice has now
been left behind. We are in a new sonic 'key'.
This sense of elsewhere is reinforced by a further pair of passing sonic modulations, as an accelerating
wood-event (twice) transforms into a drum-like attack, itself coloured by a very high frequency version
of the granular texture, adding a cymbal-like presence to these onsets.
The section ends with yet another surprising sound-modulation as the vocal onset returns but (out of a
set of other transforms) goes over into a sound like random vocal grit.
In my description of this event I have tried to emphasise the structural importance of sonority by
comparing it to tonal structure. I am also stressing the structural function of interpolation as a
continuum-domain analogue of tonal modulation. There are clearly important differences. The set of
keys in the tonal system form a finite, discrete and cyclic set of possibilities (see On Sonic Art) and the
tonal system is rooted in this underlying structure. The Sonic Continuum is neither discrete nor cyclic -
but, by contrast, it is wonderfully multi-dimensional, and we still retain some sense of the 'distance' of
one sound from another even if we cannot measure this in the way we can measure the distance around
the cycle of fifths (see also the discussion of measured, comparative and textural perception in Chapter
9).
In this context, interpolation provides the sense of logical passage from one sonic area to another (as
opposed to mere juxtaposition) which we also feel during tonal modulation.
We can also pick out other features of this sequence which help bind it together: the falling pitch shapes
(and their rising inversion), the decelerating (and related accelerating) rhythmic patterns. What is
significant is that many different dimensions of sound organisation can provide structural reference
points. We no longer have the traditional hierarchy: Hpitch, duration, others.
In Sound example 13.2 we begin with a sequence whose structure is rhythmically grounded (we hear
varied repetitions of a rhythmic cell whose original material was vocal in origin: this fact is more
apparent in the context of the whole piece than in the context of the example here) into which a
resonant pitched event is interjected. As the sequence proceeds the units are given spectral pitch
through successive filterbank filtering with increasing Q, and the structural focus partly passes over to
the pitch domain. Once the 'fireworks' enter, the rhythmic patterning is lost in a dense texture, but the
pitch-focus remains in the HArmonic field of this texture. In the foreground the 'fireworks' are linked
by a new device, the falling portamento shape whose origin we hear when the spectrally time-stretched
voice peers through the texture.
In Sound example 13.3 we begin with a vocal texture increasing in density, which begins to rise in
tessitura. As it does so the texture whites out to produce a noise band which is subsequently
filterbank filtered to produce a multi-pitched inharmonic sound. The rising tessitura of the texture is
first perceived as a secondary property (an articulation), but it soon becomes the binding structural
feature of the ensuing section, as both noise bands and inharmonic sounds glide up and down.
It is the very multi-dimensionality of the sonic continuum which in some sense forces us to call upon
different perceptual foci as binding elements at different times. I am not suggesting that one
consciously composes in this way but that an intuitive sense of formal cohesion over an unhierarchised
set of sonic properties may be important in sound composition.
FROM THE RATIONAL TO THE REAL
As I have argued elsewhere (On Sonic Art), the two-dimensional grid of staves and barlines of the
musical score reinforces a culturally received conception of the musical universe as lying on a grid, or
lattice, of discrete values, particularly of Hpitch and of duration (but also instrument type). Notation
hence deals with the "rational" in the mathematical sense of the term, i.e. that which can be counted, or
that which can be expressed in terms of ratio: finite sets of Hpitches and the associated intervals,
event-durations as multiples or divisions of a regular unit (e.g. a crotchet). The grid hides from us the
reality of the musical continuum from which it is carved. But just as the integers and the rational
numbers (fractions) are only special cases lying along the underlying continuum of the real numbers, so
Hpitch sets and countable rhythmic units are special cases of an underlying continuum of frequency and
duration values matched by the continua of formant types, spectral forms etc, etc.
Rationality, however, is in the ear of the beholder. For musicians accustomed to working in the
Western tempered scale, there is nothing more simply rational than the intervallic division of the octave
in terms of the logarithm of frequency by which we have come to measure pitch-interval. All
semitones are equal and the fifth is simply 7 steps up a 12-step scale. Musicians from other cultures,
and devotees of various pre-tempered approaches to tuning (e.g. just intonation), however, would beg
to differ from this analysis.
For them the interval we know as the 5th derives from the second and third partials in a harmonic spectrum.
The ratio of their frequencies is exactly 3:2 or 1.5. However, in the tempered scale, where larger
intervals must always be expressible as exact multiples of the semitone (whose frequency ratio is 2
raised to the power of 1/12), the interval of a 5th is 2 raised to the power of 7/12, or approximately
1.498307077. In fact, in terms of frequency, the tempered scale 5th cannot be expressed as an exact
ratio of any two whole numbers; it is not a rational number (in the mathematical sense). (It is certainly
out of tune with the 'pure' fifth).
The tritone in the tempered scale, roughly corresponding to the mediaeval 'devil in music', is even more
instructive in this regard. For the tempered scale tritone corresponds to a precise frequency ratio of
the-square-root-of-2 : 1. And the square root of 2 is perhaps the most famous non-rational number of
all.
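These values are easily checked, for example in Python (whose floating-point arithmetic is itself only a rational approximation, as discussed below):

# Ratios of the intervals discussed above
fifth_pure     = 3 / 2            # exactly 1.5, from the 2nd and 3rd partials
fifth_tempered = 2 ** (7 / 12)    # not a ratio of whole numbers
tritone        = 2 ** (6 / 12)    # = the square root of 2, irrational

print(round(fifth_tempered, 9))   # 1.498307077
print(tritone ** 2)               # 2.0000000000000004 (a floating-point approximation)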
Pythagoras' discovery of the link between musical pitch relationships and the length ratios of vibrating
strings spawned the pythagorean cult which linked mathematics, mystical numerology and music.
However. it was the Greek discovery of the non-rationality of the square root of 2 which destroyed
Greek arithmetic and led to an almost complete reliance on geometrical methods. For the square root of
2, the length of the diagonal of a square of unit side, is perfectly comprehensible geometrically.
However as a number it cannot be expressed as the ratio of two other numbers. This was proved by
the Greeks and seemed hopelessly baffling. The square root of 2 is simply not a rational number (a
number expressible as a fraction).
It was only 2,000 years later, in the Seventeenth Century, with Descartes' development of coordinate
geometry (linking geometrical and numerical thinking) and Newton's use of fluxions, (the origin of the
calculus) to study accelerated motion, that mathematicians began to deal in a sophisticated way with
the numerical continuum. And it was not until the Nineteenth Century that a rigorous definition of these
new numbers, the very stuff of the continuum, was developed.
The numbers which form the continuum (and which include the rational numbers) are known as the Real
Numbers. So, in the mathematical world, the rational appears as only a small part (in fact an infinitely
small part) of the Real!
In a sense, Western musical practice has remained overawed by the Pythagorean fixation with the
mathematically rational. The obsession with geometrical proportions in written scores may be seen as
an aspect of the persistence of this tradition.
But the computer as a recording and transformational tool has altered our relationship to the musical
continuum. Sampling is a means whereby the continuum of our sonic experience can be captured and
subjected to rational calculation. The key to this is the existence of perceptual time-frames. Below a
certain duration limit (as discussed in Chapter 1) we can no longer distinguish individual sound events,
hence our perception of the continuity of experience can be recreated by piecing together extremely
short, but stable, sound elements. Just as the 24 or 25 static frames per second of film or video create
for us the illusion of seamless motion, the reconstruction of a sound from a windowed analysis, or the playing of
samples through a digital-to-analogue converter, recreates for us a continuous sonic domain from what
are fixed and countable values. The computer thus provides a link between the world of rational
mathematics and the continuum of our sonic experience (so-called "floating point" arithmetic is a
rational approximation of operations on real, in the mathematical sense, numbers).
It must also be said that, in dealing with the continuum mathematically on the computer, we both learn
the necessity of numerical approximation and come up against the limits of human perceptual acuity,
which itself sets limits to the accuracy required for our calculations. The world of "pure" ratios in the
sense of tuning systems, or temporal proportions, is seen for the idealisation that it is. We come
face-to-face with the Real, the reality of human experience, and the realities of rational calculation.
Nevertheless, what before lay only within the intuitive control of physical action (which varies spatially
and temporally over the continuum) in performance practice, is now recordable, repeatable, analysable,
recalculable in a way not previously available to us.
SEQUENTIAL AND MORPHIC FORM
An important question is, then, to what extent can we distinguish and appreciate articulations of the
continuum. Appreciation of musical performance, and subtle comprehension of the "innuendo" of spoken
language, suggest we have a refined, if not well described, ability in this sphere.
We will describe shapes articulated in the continuum as morphic forms in contrast with forms created
by juxtaposing fixed values, which are sequential forms. Almost all Western Art Music as it appears in
the notation is concerned with articulating sequential forms.
The invention of the orchestral crescendo by the Mannheim School of Symphonists seemed a major
advance at the time. The crescendo is an example of the most elementary morphic form, a linear motion
from one state (in this case, of loudness) to another. In a world of stable loudness fields (not taking
into account the subtleties of loudness articulation in the performance practice) this was a startling
development. But traditional notation gives no means to add detail to this simple state-interpolation.
(More recent notational developments include stepped crescendi: see Diagram 1.)
Similarly, tempo variation, as expressed through traditional notation, cannot be given any subtlety of
articulation. It is either happening at a "normal" rate (accel), slowly (poco a poco accel) or rapidly
(molto accel). Internal articulation of the rate of speed change is not describable (though some
twentieth century composers, like Stockhausen, have attempted to extend notational practice to do this,
in works for solo performer). Performance practice may involve the subtle or involved use of "rubato"
(performance "feel") related tempo variation. It is interesting to note how "rubato" interpretation,
associated with musical romanticism, is generally frowned upon in serious music circles. From a
notation-focused perspective, it clearly does not adhere to the notated time information. But the fact is,
we cannot notate this kind of subtle tempo fluctuation. Does this mean therefore it must be abandoned
as an element in musical construction?
The subtle articulation of pitch, continuation and spectral formant properties can be observed in the
vocal music practice of many non-European musical traditions, as well as within jazz and in popular
musics. Musics handed down, at least partly, by aural tradition, can carry the morphic form information
from one generation of performers to another. It does not get hived off into a separate domain of
"performance practice" separated from stable-property notatables and subsequently devalued.
DIAGRAM 1
A conventional crescendo compared with a stepped crescendo.
With computer control and the possibility of analytic understanding, we can imagine an extension of
morphic control to larger and larger time-frames and to many levels and dimensions of musical
experience. However, because of the existence of the text/performance separation in Western Art
Music and the textual fixation of analysis and evaluation, it is difficult for music analytic thought to take
on board morphic form. Such a vast literature of musical analysis exists, grounded in sequential
properties. By contrast, there has been no means to notate, let alone describe and analyse, morphic
form.
The combined powers of the computer to record sound and to perform numerical analysis of the data to
any required degree of precision immediately remove any technical difficulty here. For those willing to
deal with precise numerical representations of sonic reality, a whole new field of study opens up. To
those fixated by musical texts, however, this area will remain a closed book.
From a perceptual perspective, even in the most elementary case, a motion from state A to state B, we
are able to distinguish articulations of the motion. For example, a pitch-portamento which rises
steeply before slowly settling on the goal pitch can be differentiated from a linear portamento and from
one which leaves its initial pitch slowly before accelerating towards the goal. These different motion
types can be observed to have different perceptual qualities even within very short time-frames.
(Sound example 13.4).
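These motion types can be modelled crudely as differently curved glides, for example (Python/numpy; the exponent values are arbitrary):

import numpy as np

def portamento(p_start, p_goal, dur, sr=44100, shape=1.0):
    """Pitch-glide from p_start to p_goal over 'dur' seconds.
    shape < 1 : rises steeply, then settles slowly onto the goal pitch
    shape = 1 : linear glide
    shape > 1 : leaves the initial pitch slowly, then accelerates to the goal"""
    t = np.linspace(0.0, 1.0, int(dur * sr))
    return p_start + (p_goal - p_start) * t ** shape

steep_then_settle = portamento(60, 67, 0.5, shape=0.3)
linear_glide      = portamento(60, 67, 0.5, shape=1.0)
slow_then_rush    = portamento(60, 67, 0.5, shape=3.0)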
Once we begin to combine morphic form over a number of parameters. we may observe the emergence
of elementary morphic categories. Simple examples would be ....
1. The decrescendoing rising portamento which accelerates away from an Hpitch, a "leading-from" or
dispersal morpheme.
2. A rising portamento which slows as it approaches the final Hpitch and from which gradually emerges
a widening vibrato (also making the percept louder) - a common gesture in Western popular music
- a "leading to", or morphic anacrusis.
3. A sense of morphic stability achieved on a sustained tone (with jitter) with or without a fairly stable
vibrato depth and rate, as compared to ...
4. Morphic instability occasioned if the vibrato on the same note begins to vary arbitrarily in speed and
depth over a noticeable range.
(Sound example 13.5)
Such formal groupings can be transferred to other parameters (loudness variation, change of cyclic
loudness variation (tremolo rate), formant motion, harmonicity shift, density fluctuation etc, etc). In
other cases, simultaneous morphic development of parameters may lead in contradictory directions. A
tone whose vibrato-rate gradually slows but whose vibrato depth simultaneously widens ceases to be
perceived as a tone event and becomes a pitch-glide structure. We have a simple example of morphic
modulation, taking us from (perhaps) Hpitch field percepts to pitch-glide field percepts.
As morphic forming is developed over larger time-domains, new musical formal structures may
develop. In particular, different musical streams, undergoing internal morphic change, may interact with
one another. We have the possibility of counterstreaming, the morphic flow equivalent of counterpoint
in the sequential domain. Stream interaction in an elementary sense can be heard in Steve Reich's
phasing and in the ululation section of Vox 5 (see Chapter 11), but in neither of these cases is the
material in the separate streams undergoing real morphic change (in Vox 5 the streams move in
space relative to one another).
Imagine, for example, a detailed control of several pitch-portamento lines over long time-frames, or a
texture-stream which undergoes continual morphic development of density, granularity, spectral energy
focus, pitch-band width, pitch-band-location, internal pitch-stream motion etc, and which can diverge
into perceptually separate streams, perhaps moving separately in space, before uniting sonically and
spatially at a later time.
Imagine also a theatre situation in which we see the quite separate evolution of two characters with
entirely different, but internally consistent, goals and desires. Then external circumstances conspire to
cause the two to meet briefly in a small room. The theatrical outcome is dramatic and we, as audience,
have an inkling of how this collision of personalities might turn out, whereas the characters are
unprepared for this chance encounter. After their meeting, both their lives are changed dramatically.
From a broader perspective, this is also stream-counterpoint, the collision and mutual interaction of
developing streams.
The possibilities in this domain remain almost entirely unexplored, a rich seam of development for any
composer who can grasp the concepts and mathematical controls of morphic flow processes.
POSTLUDE: TIME & SPACE
The forms of living organisms are the result of continuous processes of flow (growth) - they are
morphic forms captured in space. Even the calcified remains of sea creatures (e.g. the shell of the
Nautilus) tell us about the processes of continuous growth which formed them, and any spatial
proportions we may observe in the final forms are merely the traces of these continuous processes,
time-flows revealed in space.
What strikes us as formally coherent about a tree is not some exact and static proportioning of
branching points and branch angles, but the way in which a flexible (and environmentally sensitive)
expression of some branching principle is realised through a process of branching flow. The overall
state of a particular flow pattern (its result being the current state of the tree) links our perception of
one member of a tree species with the other members of the same species.
Unlike the growing tree, a musical event never fixes its time-flow evolution in a spatial object; it is
intrinsically evanescent and a musical score is not the musical experience itself, but a set of instructions
to generate this aetherial substance, just as the genetic code is a set of instructions to grow an
organism. The symmetries of the DNA molecule are not, in any sense, copies or analogues of the
symmetries of the organism. Similarly, we cannot assume that spatial order in a musical score
corresponds to some similar sense of order in our musical experience of time-flow.
The way in which we divide experience into the spatially codifiable and the flowing governs much more
than our attitude to music. Human social life itself is an experience of interactive flow, or growth,
amongst human individuals. Some people find all flow (growth, change) unsettling and seek to codify
experience, professional practice, social order and even human relationships in strict categories. Others
see categoric divisions as temporary pens in which we attempt to corral the flow of experience and
history.
Similarly, some artists seek out exact rational proportions, golden sections or, more recently, fractality,
in everything. Perhaps it may seem to reveal the hand of the ultimate designer in all things, or, as with
Plato, reality may be conceived as a hidden world of fixed Ideals to which everyday reality corresponds
with varying degrees of success. In such a world, the score (the text), or the compositional idea or
method, may be viewed as a musical Ideal and its realisation in performance an adequate or poor
reflection of this Ideal object. Reality is set in stone, or print, in the exact proportions of architecture, the
unchanging fixity of the text, the codification of the artist's method.
For artists like myself, reality is flow, growth, change and ultimate uncertainty. Texts, from the
religious to the scientific, are our mortal attempts to hold this flow in our grasp so we can understand it
and perhaps control it a little. Compositional methods are, like engineering methods, subject to test, and
ultimately to possible failure.
For a certain time a procedure, an approach, a theory, works, until we discover new facts, new
symmetries, new symmetry breaking, and those texts have to be revised or rewritten, the proportions
of our architecture reconsidered, the appropriateness of our methods reassessed in the light of
experience.
For me, sound-composition, in seeking the measure of the evanescent flow of sonic events, attempts to
grapple with the very essence of human experience.
APPENDIX 1
(Page numbers refer to the diagrammatic Appendix 2).
***** A *****
ACCELERANDO
An increase in the speed at which musical events succeed one another.
ACCUMULATION
Sound which 'gathers momentum' (increases in loudness and spectral content) as it proceeds.
ALL-PASS FILTER .................... 9
AMPLITUDE
A scientific measure of the strength of a sound-signal. This normally reveals itself in perception as the
loudness of the sound, but amplitude and loudness are not equivalent. In particular, the sensitivity of
the human ear varies from one frequency range to another, hence Amplitude and perceived loudness are
not identical. However, in this book, the term Loudness is used, instead of Amplitude, in the text,
wherever this does not cause any confusion. Diagrams, however, usually refer to Amplitude.
ANALYSIS
In this book analysis almost always refers to Fourier Analysis, the process of converting the waveform
(time-domain representation) of a sound into its spectrum (frequency-domain representation).
Analysis of musical scores is the process of determining the essential structural features of a
composition from its notated representation in a score. This notion might be generalised to include any
process attempting to uncover the underlying structure of a piece of music.
ARTICULATION
The (conscious) changing/moving of a property of a sound, usually in a way bound by conventions (of
language, musical practice etc) but often not appearing as such in any associated notation system.
ATTACK
The onset portion of a sound where it achieves its initial maximum loudness.
***** B *****
BAND PASS FILTER 7
BAND REJECT FILTER .................. . 7
BAR
A grouping of musical duration-units. The bar-length is measured in some basic musical unit (e.g.
crotchets, quavers) for which a speed (tempo) is given (e.g. 120 crotchets per minute). In most
musics, bar length is regular for long stretches of time, and bar length is one determinant of perceived
rhythm.
BRASSAGE 44-45
A procedure which chops a sound into a number of segments and then resplices these together tail to
head. In the simplest case sounds are selected in order, from the source, and respliced back together in
order. However there are many possible variations on the procedure. Brassage may be used for
changing the duration of a sound, for evolving montage based on a sound-source, and for many other
musical applications. See also GRANULAR RECONSTRUCTION.
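A minimal sketch of the simplest case described above, with segments taken in order from the source and respliced tail to head using short crossfading splices (Python/numpy; all durations are illustrative):

import numpy as np

def brassage(snd, seg_dur=0.05, step_dur=0.04, splice_dur=0.005, sr=44100):
    """Chop 'snd' into segments and resplice them tail to head.
    Reading through the source in steps shorter than the segment length
    lengthens the result; longer steps shorten it."""
    seg, step, splice = int(seg_dur * sr), int(step_dur * sr), int(splice_dur * sr)
    fade_in = np.linspace(0.0, 1.0, splice)
    fade_out = fade_in[::-1]
    out = np.zeros(0)
    for start in range(0, len(snd) - seg, step):
        piece = snd[start:start + seg].copy()
        piece[:splice] *= fade_in            # splice the edges to avoid clicks
        piece[-splice:] *= fade_out
        if len(out) >= splice:
            out[-splice:] += piece[:splice]  # overlap the splices
            out = np.concatenate([out, piece[splice:]])
        else:
            out = piece
    return out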
BREAKPOINT TABLE
A table which associates the value of some time-varying quantity (e.g. pitch, loudness, spectral stretch
etc.) and the time at which that value is reached. e.g ....
(time) (value)
0.0 2.7
1.3 2.2
1.7 2.1
This table allows us to describe the time-varying trajectory of the quantity, and a musical program may
read the data in the table, interpolating between the given values where necessary, to determine what
to do at a particular time in the source-sound.
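For example, a program might read the table above like this (Python/numpy sketch, using linear interpolation):

import numpy as np

# (time, value) pairs from the example breakpoint table above
table = [(0.0, 2.7), (1.3, 2.2), (1.7, 2.1)]

def value_at(t, table):
    """Return the (linearly interpolated) value of the quantity at time t."""
    times, values = zip(*table)
    return float(np.interp(t, times, values))

print(value_at(0.65, table))   # halfway between 2.7 and 2.2 -> 2.45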
***** C *****
CADENCE
A recognised device in a musical language which signals the end of a musical phrase, section or piece.
CAUSALITY
The way in which the sound was apparently initiated. NOT the actual cause of the sound. Was the
source apparently struck, rubbed, shaken, spun etc. ??
CHANGE RINGING
A form of bell-ringing in which the order in which the bells are sounded is permuted in specific ways.
CHANNEL
Channel is most often used in this book to refer to an analysis channel. We derive the spectrum
(frequency domain representation) of a sound from its waveform (time-domain representation) by a
process of analysis. In doing the analysis we must decide how accurate we would like to be. We may
search for a partial in each block of 100 cycles per second (i.e. between 50 and 150, 150 and 250, 250
and 350 etc) or, more discriminatingly, in each block of 10 cycles per second (i.e. between 5 and 15, 15
and 25, 25 and 35 etc.). These search blocks are the channels of the analysis. Channels should not be
confused with WINDOWS.
Channel is also used to refer to the right hand and left hand parts of a stereo sound (which can be
viewed as two separate streams of digital information).
CHORD
A set of pitches initiated and sounding at the same time. Usually a set of pitches within a known
reference set (e.g. the European tempered scale).
CHORUSING ..................... 45
A process of making a single musical source (e.g. a voice) sound like a group of similar sources all
making the same sound (e.g. a chorus of singers singing the same pitch).
CHROMATIC SCALE
The european scale consisting of all the semitone divisions of the octave.
COMB-FILTER TRANSPOSiTION ........... 65
A quick method of achieving octave-upward transposition of a sound, where its pitch is known.
COMB-FILTERING 64
COMPRESSION .................................. 60
Reducing the loudness of a sound by greater amounts where the sound itself becomes louder.
CONSTRUCTED CONTINUATION
The extension of a sound by some compositional process (e.g. brassage, zigzagging).
CONSTRUCTIVE DISTORTION
A process which generates musically interesting artefacts from the intrinsic properties of the waveform
or the (time-varying) spectrum of a sound.
CONTINUATION
Where sounds are longer than grains, we hear how the sound qualities evolve in time, (their
morphology). These sounds have continuation.
CONTOUR
The shape of some property of a sound at one moment in time. In particular, the shape of the spectrum.
This is often referred to as Spectral Envelope. However, Envelope is also used to describe the
time-changing evolution of a property (especially Loudness). To avoid any confusion, this book
reserves Trajectory for such time-changing properties, and Contour for the instantaneous shape of a
property. The names of computer instruments may, however, use the term 'Envelope'.
The spectral contour describes the overall loudness contour of the spectrum at a single moment in time.
But note that the spectral contour may itself evolve (change) through time.
CORRUGATION ....................... 60
CROTCHET = 120
An indication of the speed at which musical events succeed one another. Here the duration unit, the
crotchet, occurs 120 times every minute. This speed is known as the Tempo.
CSOUND
A general purpose computer language which allows a composer to describe a sound-generating
procedure (synthesis method) and ways to control it, in almost any degree of detail, and to define a
sequence of events (a score) using the sounds generated, and which then generates the sound events
thus defined. CSound is the most recent development of a series of such general purpose synthesis
engines, and the one in most common use at the time of writing (Autumn, 1994).
CUTTING .................... 40
***** D *****
DELAY .................... 64
DENSITY
Describes the way in which a range is filled. Applies particularly to time ranges. A high-density texture
has a great many events in a short time. Temporal density is a primary property of
TEXTURE-STREAMS. The concept of density can also be applied to pitch-ranges.
DESTRUCTIVE DISTORTION
An irreversible transformation of the waveform of a sound, changing its spectral quality (the brightness,
noisiness etc), rather than the pitch or duration. Distortion also implies the degradation of the sound
(and irreversibility means that the original sound cannot be restored from the distorted version).
Destructive distortion which preserves zero-crossing points can be musically useful. See
WAVESETS.
DIPHONE SYNTHESIS
The reconstruction of speech (usually) by synthesising the transitions between significant phonemes
(roughly speaking, vowels & consonants) rather than the phonemes themselves.
DRONE
A pitched or multipitched sustained sound which persists for a long time.
DUCKING .................... 62
Means of ensuring the prominence of a lead 'voice' in a mix.
DURATION
The length of time a sound persists. Not to be confused with event-onset-separation-duration, which is
the time between the start of successive sound events.
DYNAMIC INTERPOLATION
The process whereby a sound gradually changes into a different (kind of) sound during the course of a
single sonic event.
***** E *****
ECHO .................... 64
EDITING
The processes of cutting sounds into shorter segments and/or splicing together sounds or segments of
sounds.
ENVELOPE
The loudness trajectory of a sound (the way the loudness varies through time) is often referred to as
the envelope of the sound. Computer instruments which manipulate this loudness trajectory are usually
called 'Envelope something'. Envelope is also used in the literature to refer to the time-changing
variation of any property (we use the term trajectory) and even to the instantaneous shape of the
spectrum (we use the term contour).
ENVELOPE CONTRACTION .................... 60
ENVELOPE FOLLOWING ....................... . 58
ENVELOPE INVERSION 60
ENVELOPE SMOOTHING 60
ENVELOPE SUBSTITUTION .................... 59, 61
ENVELOPE TRANSFORMATION ........... 60
Musical transformations of the loudness trajectory of a sound.
ENVELOPING 59
EXPANDING 60
***** F *****
F5
The pitch 'F' in the European scale, in the 5th octave.
FAST FOURIER TRANSFORM
A very efficient computer algorithm for performing the Fourier Transform.
FF
Fortissimo = very loud.
FIBONACCI SERIES
Series of numbers in which each term is given by the sum of the previous two terms. The sequence
begins with 1, 1 and is hence...
1, 1, 2, 3, 5, 8, 13, 21, 34, 55 etc.
The ratio between successive terms
1/1, 1/2, 2/3, 3/5, 5/8, 8/13, 13/21, 21/34, 34/55 etc.
approaches closer and closer to the Golden Section as one goes up the series.
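For example (Python):

a, b = 1, 1
print(a, '/', b, '=', a / b)
for _ in range(12):
    a, b = b, a + b
    print(a, '/', b, '=', a / b)   # the ratios approach the Golden Section, ~0.618...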
FIELD
A set of values defined for some property, e.g. the pitches of the scale of G minor define a Field. Any
pitch of the tempered scale lies either in that field or outside it. Any pitch within the pitch continuum lies
close (in some perceptually defined sense) to that Field or not.
FILTER .................... 7-8
An instrument which changes the loudness of the spectral components of a sound (often reducing some
to zero).
FILTER BANK .................... 7
FLUTTER-TONGUING
A way of articulating a granular sound on breath-controlled instruments (woodwind or brass) by
making a rolled 'r' in the mouth while producing a pitch on the instrument.
FORMANT 10
FORMANT PRESERVING SPECTRAL MANIPULATION ....... 17
FOURIER ANALYSIS .................... 2
The representation of the waveform of a sound as a set of simpler (sinusoidal) waveforms. The new
representation is known as the spectrum of the sound.
FOURIER TRANSFORM
A mathematical procedure which allows us to represent any arbitrary waveform as a sum of elementary
silUlsoidai waveforms. It is used in Fourier Analysis. See also INVERSE FOURIER TRANSFORM.
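For example, in Python/numpy the transform and its inverse (see INVERSE FOURIER TRANSFORM) reconstruct the waveform exactly, to within floating-point accuracy:

import numpy as np

t = np.arange(1024) / 44100.0
wave = 0.7 * np.sin(2 * np.pi * 440 * t) + 0.3 * np.sin(2 * np.pi * 880 * t)
spectrum = np.fft.rfft(wave)         # waveform -> spectrum (Fourier transform)
restored = np.fft.irfft(spectrum)    # spectrum -> waveform (inverse transform)
print(np.allclose(wave, restored))   # True: the transform is reversible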
FREQUENCY .................. , .............. 3,4
A steady sound has a definite repeating shape, a waveform, and this waveform has a definite length, which
takes a certain time to pass the listener. The number of waveforms that pass in each second (the
number of cycles per second) is known as the frequency of the wave. The frequency of the wave helps to
determine the pitch we hear.
FREQUENCY DOMAIN REPRESENTATION ......................... 3
***** G *****
GATING .................... 60
GOLDEN SECTION
If a straight line AB is cut at a point P such that...
AP/AB = PB/AP
this ratio is described as the Golden Section, and is approximately equal to 0.618.
A --------- P ------ B
GRAIN
Sound having a duration which is long enough for spectral properties to be perceived but not long
enough for time-fluctuation of properties to be perceived.
GRAIN-STREAM
A sound consisting of a rapid sequence of similar onsets.
GRAIN TIME-FRAME
The typical duration of a grain.
GRANULAR PROCESSES .................... 56-57
Musical processes which preserve the grains within a grain-stream.
GRANULAR RECONSTRUCTION .................... 73
A procedure which chops a sound into a number of segments and then redistributes these in a texture of
definable density. The process differs from BRASSAGE in that the segments need not be rejoined tail
to head. Granular Reconstruction of output density 1 is brassage.
GRANULAR REORDERING .................... 57
GRANULAR REVERSAL .................... 56
GRANULAR SYNTHESIS
A process, almost identical to granular reconstruction, in which very brief sound elements are generated
with particular (time-varying) properties and a particular (time-varying) density, to create a new
sound.
GRANULAR TIME SHRINKING BY GRAIN DELETION .................... 56
GRANULAR TIME SHRINKING BY GRAIN SEPARATION .................... 56
GRANULAR TIME STRETCHING BY GRAIN DUPLICATION .................... 56
GRANULAR TIME STRETCHING BY GRAIN SEPARATION .................... 56
GRANULAR TIME WARPING
Granular time-stretching or time-shrinking, which itself varies in time.
***** H *****
HARMONIC
In traditional European practice, harmonic means pertaining to harmony and relates particularly to tonal
music. In sound composition, harmonic refers to the property of a spectrum where all the partials are
multiples of some (audible) fundamental frequency. These partials are then known as harmonics, the
spectrum is said to be harmonic and the sound has a single definite pitch. (See also INHARMONIC).
These two usages are incompatible, so in the text we use HArmonic to refer to the traditional usage,
and harmonic to refer to the sound-compositional usage.
HARMONIC FIELD
A reference frame of pitches. This might be thought of as a chord. All pitches in a texture controlled by a
HArmonic Field will fall on one or other of the pitches of the chord.
HARMONICITY
The property of having a harmonic spectrum.
HARMONICS .................... 4
The partials of a sound of definite pitch are (usually) exact multiples of some (audible) frequency known
as the fundamental. In this case the partials are known as the harmonics of the sound.
HARMONISER .................... 38
Application of brassage with tape-speed transposition to change the pitch of a sound without altering
its duration. Generic name of commercially available hardware units which do both this and a number of
other grain time-frame brassage procedures (e.g. duration change without pitch-change).
HARMONY
In European music, the rules governing the sounding-together of pitches, and the sequencing of chords.
HOMOPHONIC
Music in which there are several parts (instruments or voices) but all parts sound simultaneously
(though not necessarily with the same pitch) at each musical event.
HPITCH
Pitch in a sense which refers to a HArmonic Field or to European notions of Harmony, in contrast to
pitch as a property of a spectrum.
***** I *****
INBETWEENING .................... 46
INDIAN RAG SYSTEM
The system of scales and associated figures and ornaments at the base of Classical Indian music theory
and practice.
INHARMONIC
A spectrum which is not harmonic, but which is not noise, is said to be inharmonic. Inharmonic spectra
may be bell-like (suggesting several pitches) or drum-like (being focused in some sense but lacking
definitive pitch).
INTERPOLATION
The process of moving gradually between two defined states. See SPECTRAL INTERPOLATION.
INVERSE FOURIER TRANSFORM
A mathematical procedure which allows us to convert an arbitrary spectrum (frequency-domain
representation of a sound) into the corresponding waveform (time-domain representation of the
sound). See also FOURIER TRANSFORM.
INVERSE KARPLUS STRONG
See SOUND PLUCKING.
ITERATION 42
***** J *****
JITTER
Tiny random fluctuations in apparently perceptually stable properties of a sound, such as pitch or
loudness.
***** K *****
KARPLUS STRONG
An efficient algorithm for generating plucked-string sounds.
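A bare-bones sketch of the algorithm (Python/numpy; practical implementations add fractional-delay tuning and finer control of damping):

import numpy as np

def karplus_strong(freq=220.0, dur=2.0, sr=44100, damping=0.996):
    """Plucked-string synthesis: a noise burst circulates through a short
    delay line and is averaged (low-pass filtered) on each pass."""
    period = int(sr / freq)                      # delay-line length sets the pitch
    buf = np.random.uniform(-1.0, 1.0, period)   # the initial 'pluck' is a noise burst
    out = np.empty(int(dur * sr))
    for i in range(len(out)):
        out[i] = buf[i % period]
        # average the current sample with the next one round the loop
        buf[i % period] = damping * 0.5 * (buf[i % period] + buf[(i + 1) % period])
    return out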
KEY
A piece of european music using the tempered scale (except where it is intentionally atonal) can usually
be related to a scale beginning on a particular pitch. around which the melodic patterns and chord
progressions of the piece are organised. The pitch which begins the scale defines the Key of the piece.
KLANGFARBENMELODIE
Musical line where successive pitches are played by different (groups of) instruments. Literally,
tone-colour melody.
***** L *****
LIMITING .................... 60
LINEAR PREDICTIVE CODING .................... 12
LOOPING 42
LOUDNESS
Technically speaking, loudness is a property which is related to perception, and is measured in a way
which takes into account the varying sensitivity of the ear over different frequency ranges. In this sense
it differs from the Amplitude of the signal, which is a scientific measure of the strength of a sound. In
this book, the term loudness is used in the text wherever this will not cause any confusion. Diagrams
usually refer to Amplitude.
LPC .................... 12
***** M *****
MAJOR
Most european music uses one of two scale patterns, known as the major scale and the minor scale
(the latter having a number of variants).
META-INSTRUMENT
An instrument which provides control instructions for another instrument. E.g. the mixing of sounds is
controlled by a mixing score (which may be a graphic representation of sounds and their entry times, or
a written list in a computer file). A meta-instrument might write or modify the mixing instructions
according to criteria supplied by the composer, or in response to other data.
MF
Mezzo forte = moderately loud.
MIDI
Acronym for Musical Instrument Digital Interface. This is a communication protocol for messages sent
between different digital musical instruments and computers. MIDI stores information on which key is
pressed or released, how forcefully (or quickly) it is pressed, and on certain kinds of control information
provided by controllers on digital instruments (e.g. pitch-glide, tremolo etc), together with more
instrument-specific data (which synthesis patch is being used). MIDI does NOT record the sound itself.
MINOR
Most european music uses one of two scale patterns, known as the major scale and the minor scale.
The minor scale has two important variants, the melodic and the harmonic minor.
MIX-SHUFFLING ................ 47
MIXING 46
MIXING SCORE
A set of instructions detailing what will happen when a number of sounds are MIXED together. This
might be a text file or a graphic display on a computer, but could equally well be a set of instructions for
moving faders on a mixing desk in a studio. A typical mixing score would contain information about
which sounds were to be used, at what time each would start, how loud each would be, and at what
spatial position each should be placed.
MONO
Sound emanating from a single source (e.g. a single loudspeaker) or a single channel of digital
information. As opposed to stereo.
MORPHOLOGY
The way in which the properties of a sound vary with time.
MOTIF
Small element of musical structure, usually consisting of a sequence of a few pitches (in notated music),
and out of which larger structural units (e.g. phrases) are built.
MULTI-DIMENSIONAL SPACE
A line defines a one-dimensional space, a sheet of paper or the surface (only) of a sphere a
two-dimensional space, and the world we live in is a three-dimensional space. We may generalise the
notion of a space to any group of independent parameters, e.g. pitch and duration may be thought of as
defining a two-dimensional space, and this is the space that we draw in when we notate a traditional
musical score. Spaces may be of any number of dimensions (i.e. not necessarily ones that we can
visualise in our own spatial experience) from the four dimensions of Einstein's space-time, to the
infinite number of dimensions in Hilbert space.
MULTI-SOURCE BRASSAGE .................. 45
***** N *****
NOISE
Sound having no perceptible pitch(es) and in which energy is distributed densely and randomly over the
spectrum and/or in a way which varies randomly with time. Typical examples might be the consonants
's' or 'sh'. Other sounds (especially those recorded directly from the natural environment) may contain
elements of unwanted noise which we may wish to eliminate by noise reduction.
NOISE REDUCTION
Process of eliminating or reducing unwanted noise in a sound source.
NOTCH FILTER ..................................... 7
***** O *****
OCTAVE
A sound whose pitch is an octave higher than a second sound usually has twice the frequency of that
second sound. Two pitches separated by an octave, in European music, are regarded in some sense as
the 'same' pitch, or as belonging to the same 'pitch-class'.
OCTAVE-STACKING ................... 48
ONSET SYNCHRONISATION 48
***** P *****
PARAMETER
Any property of a sound or a sequence of sounds which can be musically organised. Parameter often
also implies the measurability of that property.
PARTIAL .............................. 3
The sinusoidal elements which define the spectrum of a sound.
PARTIAL TRACKING 21
PERMUTATION
Specific rearrangement of the elements, or of the properties of the elements (e.g. loudness), of a sequence
of musical events.
PHASE 9
PHASE INVERSION 9
The sound waveform may be replaced by the same form but 'upside-down' (i.e. the new wave rises
where the other falls and vice versa). The same effect is achieved with a sinusoidal waveform by
restarting the wave from the first point at which it recrosses the zero-line (known as a phase-shift by
180 degrees, or by PI radians). Superimposing the original sound on its phase-inverted version causes
the sounds to cancel one another.
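A minimal sketch, in Python, of the cancellation described above (the waveform values are illustrative only):

    def phase_invert(samples):
        # Turn the waveform 'upside-down' by negating every sample value.
        return [-s for s in samples]

    original = [0.0, 0.5, 0.9, 0.5, 0.0, -0.5, -0.9, -0.5]
    inverted = phase_invert(original)

    # Superimposing (mixing) the original with its inversion cancels it.
    mixed = [a + b for a, b in zip(original, inverted)]
    print(mixed)   # every value is 0.0: silence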
PHASE VOCODER 11
Instrument producing a moment-by-moment analysis of a sound waveform to reveal its time-evolving
spectrum. A windowed Fast Fourier Transform.
PHASING 9
PHONEME
A fundamental sound unit of a language. Crudely speaking we may think of vowels and consonants (as
heard, rather than as written), but the true definition is more subtle.
PHRASE
Element of musical structure consisting of a sequence of sound events and, in traditional practice,
usually lasting for a few bars.
PHYSICALITY
The physical nature of the apparent source of the sound, NOT the physical nature of the real source. Is
the apparent source hard or soft, rigid or flexible, granular or of-a-piece, etc.
PITCH ............................................................. 4
A property of instrumental and vocal sounds organised in most traditional musical practices. Pitch
arises from a regular arrangement of partials in the spectrum. All partials are multiples of some
fundamental frequency which is audible (and which may or may not be present in the spectrum), and in
this case they are known as the harmonics of the sound. In the simplest case (the sine wave) there is
only the fundamental present. Humans hear pitch between approximately 16 cycles per second and 4000
cycles per second. Below 16 cps, the sound breaks up into a grain-stream. Above 4000 cycles we may
still be aware of tessitura (relative pitch-range) but assigning specific pitch becomes more problematic.
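A minimal sketch, in Python, of a harmonic spectrum: partials at exact multiples of a 220 cycles-per-second fundamental are summed and heard as a single pitch. The partial amplitudes chosen here are arbitrary.

    import math

    def harmonic_tone(fundamental, amplitudes, duration, sample_rate=44100):
        # Sum partials whose frequencies are exact multiples of the fundamental.
        out = []
        for n in range(int(duration * sample_rate)):
            t = n / sample_rate
            out.append(sum(a * math.sin(2 * math.pi * fundamental * (k + 1) * t)
                           for k, a in enumerate(amplitudes)))
        return out

    # Partials at 220, 440, 660 and 880 Hz: heard as one pitch at 220 Hz.
    tone = harmonic_tone(220.0, [1.0, 0.5, 0.33, 0.25], duration=0.5)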
PITCH-GLIDE
See PORTAMENTO.
PITCH-TRACKING 70,71
Finding and recording the (time-varying) pitch of a sound.
PITCH-TRACKING BY AUTO-CORRELATION .. ...70
PITCH-TRACKING BY PARTIAL ANALYSIS 71
PITCH-TRANSFER
Imposing the (time-varying) pitch of one sound on a different sound.
PORTAMENTO
Sliding of pitch. Often incorrectly referred to as glissando, but there is an important distinction. On a
fretted or keyed instrument like a piano, we may slide fingers from a high pitch to a low pitch, but the
keys allow us to access only the pitches of the scale, and we hear a rapid descending scale passage: a
glissando. On a trombone or violin, a similar motion with the slide, or with the finger along the string,
causes the pitch to fall through the continuum, without picking out intervening scale pitches: a portamento.
PROCESS-FOCUSSED TRANSFORMATION
Some musical processes will so radically alter a source sound that the perceived goal-sound is more
dependent on the process of transformation than on the source itself. The same process applied to two
very different sources will produce very similar goal-sounds. The perceived result of the procedure is
governed more by the artefacts of the process of transformation than by the particular nature of
the source-sounds employed.
***** Q *****
Q .......................... 8
The steepness with which a filter cuts out unwanted frequencies.
QUANTISATION
Forcing the timing of events on to a time grid of a specific size. E.g. we may set a grid at 0.01 seconds.
Any event must then fall at some multiple of 0.01 seconds. Alternatively we may set a grid at some
division of the metrical unit, e.g. a grid of demi-semi-quavers. All events must then fall on some
multiple of demisemiquaver divisions of the beat. The actual time quantisation will then also be
determined by the tempo (the number of crotchets per minute). The quantisation grid provides a time
reference-frame. On keyed or fretted instruments, pitch is similarly quantised.
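A minimal sketch, in Python, of time quantisation on the two kinds of grid mentioned above (the event times are invented for the example):

    def quantise(times, grid):
        # Move each event time to the nearest multiple of the grid size.
        return [round(t / grid) * grid for t in times]

    events = [0.013, 0.118, 0.304, 0.517]          # times in seconds
    print(quantise(events, 0.01))                  # fixed 0.01-second grid

    # A demisemiquaver grid at crotchet = 120: one crotchet lasts 0.5 seconds,
    # so the grid size is 0.5 / 8 = 0.0625 seconds.
    print(quantise(events, 0.5 / 8))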
QUARTER TONE
A very small division of the musical scale: half a semitone (the semitone being the smallest interval
accessible on a standard European keyboard instrument, like a piano).
***** R *****
RANDOM-CUTTING ................ 41
REFERENCE FRAME
A set of values which provide a reference set against which other values can be measured, e.g. the
chromatic scale as a reference set for European harmony, the set of vowels in standard English as a
reference set for classifying regional accents, etc.
RESONANCE
If an object is vibrated it will produce a sound. Due to its particular weight, size and shape there will be
certain frequencies at which it will vibrate 'naturally'. If supplied with frequency-unspecific energy it
will tend to vibrate at these natural resonant frequencies. A flute tube with a certain combination of
closed holes has specific resonant frequencies which produce the pitches for that fingering. A hall or
building will reinforce certain frequencies in a voice, orchestra etc which fall on its natural resonant
frequencies.
RETROGRADE
The performance of a sequence of sound events in the reverse order: A-B-C-D-E becomes
E-D-C-B-A. Note that the sound-events themselves are not reversed.
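A minimal sketch, in Python, of the distinction made above, with each sound-event standing for a short list of samples (the values are arbitrary):

    a, b, c = [0.1, 0.2], [0.3, 0.4], [0.5, 0.6]
    sequence = [a, b, c]

    retrograde = list(reversed(sequence))                   # order becomes c, b, a
    fully_reversed = [list(reversed(ev)) for ev in reversed(sequence)]
    # 'retrograde' keeps each sound intact; 'fully_reversed' additionally plays
    # each sound backwards, which is a different operation (sound reversal).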
REVERBERATION .................................... 64
RITARDANDO
A decrease in the speed at which musical events succeed one another.
***** S *****
SAMPLER
A piece of hardware, or a software package on a computer, which digitally records any sound and
allows it to be manipulated (e.g. pitch-change by 'tape-speed' variation, with the specific transposition
information sent from a MIDI keyboard). The sounds recorded on a sampler are often referred to in the
commercial literature as 'samples'. These should not be confused with the individual numbers used to
record the shape of the waveform itself, which are properly known as samples. (See SAMPLING.)
SAMPLING .................................................
Sound is digitally recorded by sampling the value of the (electrical analogue of the) sound wave at
regular time-intervals. These time intervals must be very short if high frequencies in the sound are to
be resolved (e.g. between 22000 and 48000 samples per second). At a sampling rate of 48000 samples
per second, the highest resolvable frequency is 24000 cycles per second.
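A minimal sketch, in Python, of why the limit is half the sampling rate: at 48000 samples per second a 30000 cycles-per-second tone produces exactly the same samples as an 18000 cycles-per-second tone (apart from a phase inversion), so the two cannot be distinguished after sampling.

    import math

    RATE = 48000

    def sample_tone(freq, n_samples):
        return [math.sin(2 * math.pi * freq * n / RATE) for n in range(n_samples)]

    above_nyquist = sample_tone(30000, 8)
    alias = sample_tone(18000, 8)          # 48000 - 30000 = 18000

    # The two sample streams cancel exactly, i.e. one is the phase-inverted
    # copy of the other: a 30000 Hz input would simply be heard as 18000 Hz.
    print([round(x + y, 6) for x, y in zip(above_nyquist, alias)])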
SCORE
The notation of a piece of music from which a performance of the work is recreated.
SEMITONE
The smallest interval between consecutive pitches on a modern European keyboard or fretted
instrument. Musical scales are defined as some pattern of tones (equal to two semitones) and
semitones, the harmonic minor scale also containing a three-semitone step.
SEQUENCE GENERATION
SEQUENCES
Groups of consecutive sounds having distinctly different spectral properties, e.g. speech, or melodic
phrases on keyed instruments (the spectra of whose elements differ by perceptually significant pitch
steps).
SERIAL COMPOSITION
Style of twentieth century European musical composition in which the 12 pitches of the chromatic scale
are arranged in a specific order, and this sequence (and certain well-specified transformations of it) are
used as the basis for the organisation of pitches in a piece. The general idea of serialism was also
extended to sequences of durations, of loudnesses etc. The even more general notion of permuting a
given set of elements has been more widely used (e.g. systems music).
SERIALISM
See SERIAL COMPOSITION.
SHAWM
A mediaeval wind instrument with a strong reedy sound.
SHEPARD TONES 72
Sounds (or sequences of sounds) constructed so that they rise in tessitura while their pitch falls (or
vice versa).
SINE WAVE ...................................... 2, 5
The elementary oscillations in terms of which all other regular oscillations, vibrations or waveforms can
be described. The oscillation of a simple (idealised) pendulum is described by a sine wave.
SINUSOIDAL
Having the shape of a sine-wave.
SOUND REVERSING ................ 43
SOUND SHREDDING 41
SOUND-PLUCKING 72
A musical transformation which imposes a plucked-string-like attack on a sound.
SOURCE-FOCUSSED TRANSFORMATION
A musical transformation whose outcome depends strongly on the nature of the source sound. Defined
in contrast to PROCESS-FOCUSSED TRANSFORMATION.
SPECTRAL ARPEGGIATION 24
SPECTRAL BLURRING ................. 26
SPECTRAL BRIGHTNESS
A measure of where energy is focused in the spectrum. If some of the upper partials are very loud, the
sound will appear bright.
SPECTRAL FOCUSING 20
SPECTRAL FORMANT TRANSFER
see VOCODING.
SPECTRAL FREEZING 22
SPECTRAL INTERLEAVING 35
SPECTRAL INTERPOLATION ................. 32-33
SPECTRAL MANIPULATION 18-35
Musical processes that work directly on the (time-varying) spectrum of the sound.
SPECTRAL MASKING ............................... 34
SPECTRAL SHAKING ................................ 23
SPECTRAL SHIFTING ............................... 18
SPECTRAL SPLITTING .............................. 29
SPECTRAL STRETCHING ......................... 19
SPECTRAL TIME-SHRINKING ............... 30-31
SPECTRAL TIME-STRETCHING ............ 30-31
SPECTRAL TIME-WARPING .................. 30-31
SPECTRAL TRACE-AND-BLUR ............ 27
SPECTRAL TRACING 25
SPECTRAL UNDULATION 28
SPECTRUM 2, 3
Representation of a sound in terms of the frequencies of its partials (those sinusoidal waves which can
be summed to produce the actual waveform of the sound). The frequency-domain representation of the
sound.
SPLICING .................. 40
The tail-to-head joining of two sounds. In the classical tape studio this would be achieved by joining the
end of one sound tape to the beginning of another, using sticky tape.
SQUARE WAVE 5
SRUTI
The smallest unit into which the octave is divided in classical Indian music and from which the various
rag scales can be derived. It is at least 6 times smaller than a semitone. This is more of a theoretical
unit than a practically applied unit. In the Western tradition the semitone is built into the
structure of its keyboard instruments.
STATIC INTERPOLATION
The process whereby a sound gradually changes into a different (kind of) sound during a series of
repetitions of the sound, where each repeated unit is changed slightly away from the previous one and
towards the goal sound.
STEREO
Sound emanating from two channels (e.g. two loudspeakers) or stored as two channels of digital
information. As opposed to mono (from a single source). Sound information provided through two
loudspeakers is able to convey information about the (apparent) positioning of sound sources in the
intervening space between the loudspeakers, rather than suggesting merely a pair of sound sources.
STOCHASTIC PROCESSES
A process in which the probabilities of proceeding from one state, or set of states, to another are defined.
The temporal evolution of the process is therefore governed by a kind of weighted randomness, which
can be chosen to give anything from an entirely determined outcome to an entirely unpredictable one.
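A minimal sketch, in Python, of such a process: a random walk between three (hypothetical) tessitura states, where the transition probabilities weight the randomness.

    import random

    # Hypothetical transition probabilities. Weights of (1, 0, 0) would give an
    # entirely determined succession; equal weights an entirely unpredictable one.
    transitions = {
        "low":  {"low": 0.2, "mid": 0.7, "high": 0.1},
        "mid":  {"low": 0.3, "mid": 0.2, "high": 0.5},
        "high": {"low": 0.6, "mid": 0.3, "high": 0.1},
    }

    def walk(start, length):
        state, path = start, [start]
        for _ in range(length - 1):
            states = list(transitions[state])
            weights = [transitions[state][s] for s in states]
            state = random.choices(states, weights=weights)[0]
            path.append(state)
        return path

    print(walk("low", 12))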
SYNTHESIS
Process of generating a sound from digital data, or from the parameters supplied to an electrical
oscillator. Originally synthetic sounds were recognisably such, but now it is possible, through a process
of careful analysis and subtle transformation, to recreate a recorded sound in a changed form which
however sounds as convincingly 'natural' as the recording of the original sound.
***** T *****
TAPE-ACCELERATION .......................... 36
TAPE-SPEED VARIATION ...................... 36
TEMPERED TUNING
The tuning of the scales used in European music, a system which became firmly established in the
early eighteenth century. In tempered tuning the octave is divided into 12 exactly equal semitones, i.e.
the ratio between the frequencies of any two pitches which are a semitone apart is exactly the same. In
a harmonic spectrum, the frequencies of the partials are exact multiples of some fundamental frequency.
The ratios of their frequencies form a pattern known as the harmonic series, i.e.
2/1  3/2  4/3  5/4  6/5  7/6 .....
and the frequency ratios between members of this series are known as 'pure' intervals. Some pure
interval ratios are...
octave ..... 2/1        4th ..... 4/3
5th ..... 3/2           7th ..... 7/4
There is no common smaller interval from which all these 'pure' intervals can be constructed. Hence the
striving to achieve some kinds of compromise tunings, of which the tempered scale is just one example.
Apart from the octave, the tempered scale only approximates the frequency ratios of the pure intervals. The
7th has no close approximation in the tempered scale.
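A minimal sketch, in Python, comparing pure ratios with their nearest tempered equivalents (each tempered semitone being a frequency ratio of 2 to the power 1/12); the choice of nearest tempered intervals is an editorial assumption.

    pure      = {"octave": 2 / 1, "fifth": 3 / 2, "fourth": 4 / 3, "seventh": 7 / 4}
    semitones = {"octave": 12,    "fifth": 7,     "fourth": 5,     "seventh": 10}

    for name in pure:
        tempered = 2 ** (semitones[name] / 12)
        print(name, round(pure[name], 4), round(tempered, 4))

    # Only the octave matches exactly. The fifth and fourth come close, but the
    # nearest tempered interval to the pure 7/4 (ten semitones, about 1.7818)
    # is noticeably wider than 1.75: the 7th has no close tempered approximation.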
TEMPO
The rate at which musical events occur. In European music the relative durations of events are indicated
by note values, e.g. a crotchet is as long as two quavers. The speed of the whole will be indicated by a
tempo marking e.g.
crotchet = 120
which means there are to be 120 crotchets in one minute.
TESSITURA
Also known as register. In its simplest sense, the range of pitch in question. The tessitura of soprano
voices is higher than that of tenor voices. However, unpitched (e.g. noise) sounds may rise in tessitura
while having no perceivable pitch, while SHEPARD TONES may rise in tessitura while falling in
pitch(!). Hence tessitura also has something to do with where the main focus of energy lies in the
spectrum (where the loudest groups of partials are to be found).
TEXTURE
Organisation of sound elements in terms of (temporal) density and field properties.
TEXTURE CONTROL 68-69
TEXTURE GENERATION 68-69
TEXTURE OF GROUPS .............................. 68
Texture in which small sets of sound elements are considered as the organisable units of the texture,
rather than individual sounds themselves. The sounds forming any particular group are, however,
chosen arbitrarily from the available sound sources.
TEXTURE OF MOTIFS 69
Texture in which specific small sets of sound elements, known as motifs, are considered as the
organisable units of the texture, rather than individual sounds themselves. The sounds forming any
particular motif are in some kind of predefined arrangement (e.g. some combination of sequence,
time-placement, pitch-relationship, formant sequence, etc).
TEXTURE OF ORNAMENTS 69
Texture in which specific small sets of sound elements, known as ornaments, are attached to the
fundamental sound elements of the texture, and each of these may be organised as units of the texture.
TEXTURE STREAM
Dense and relatively disordered sequence of sound events in which specific ordering properties
between individual elements cannot be perceived. Properties may be ordered in terms of Field and
Density.
THRESHOLD
A value either not to be exceeded (or fallen below), or to be noted (in order to cause something else to
happen) if it is thus exceeded.
TIMBRE
A catch-all term for all those aspects of a sound not included in pitch and duration. Of no value to the
sound composer!
TIME STRETCHING
See spectral time-stretching, granular time-stretching, waveset time-stretching, harmoniser,
brassage.
TIME-DOMAIN REPRESENTATION
The waveform of a sound represented as the variation of air-pressure with time.
TIME-VARIABLE TIME-STRETCHING
See time-warping.
TIME-WARPING 30
Changing the duration of a sound in a way which itself varies with time.
TONAL MUSIC
Music organised around keys and the progression between different keys. In contrast, atonal music
avoids indicating the dominance of any particular key or pitch.
TONE
The interval between the first two pitches of a major (or minor) scale in European music. European
scales consist of patterns of tones and semitones (half a tone) and, in some cases, 3-semitone steps.
TRAJECTORY
The variation of some property with time, e.g. loudness trajectory, pitch trajectory, formant trajectory.
Loudness trajectory is often also known as 'envelope', and instruments which manipulate the loudness
trajectory are here called 'Envelope something'.
TREMOLO 66
Cyclical undulations of loudness between c. 4 and 20 cycles per second.
TRANSPOSITION
Changing the pitch of a sound, or sound sequence.
TRIGGERING 62
Using the value of some time-varying property (usually the loudness) of a sound to cause something
else to happen.
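A minimal sketch, in Python, of loudness-threshold triggering; the envelope values and frame duration are invented for the example.

    def trigger_times(envelope, threshold, frame_seconds=0.01):
        # 'envelope' holds loudness values measured every frame_seconds.
        # Report the times at which the loudness first rises above the threshold.
        times = []
        for i in range(1, len(envelope)):
            if envelope[i - 1] < threshold <= envelope[i]:
                times.append(round(i * frame_seconds, 3))
        return times

    envelope = [0.01, 0.02, 0.4, 0.5, 0.1, 0.05, 0.6, 0.2]
    print(trigger_times(envelope, threshold=0.3))   # -> [0.02, 0.06]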
TRITONE
The musical interval of 6 semitones. In the European tempered scale, a frequency ratio of the square
root of 2 to 1.
***** U *****
UNDULATION
Cyclical change in the value of a musical property, like the rising and falling of pitch in vibrato, or of
loudness in tremolo.
***** V *****
VIBRAPHONE
A musical instrument consisting of pitched metal bars suspended over resonating tubes. When the bars
are struck they produce a specific pitch, and the associated tube resonates at that pitch. A motor drives
a small rotary blade inside the tube, adding a slight vibrato-tremolo to the resonated sound.
VIBRATO 66
Cyclical undulations of pitch between c. 4 and 20 cycles per second.
VOCODING 34
***** W *****
WAVELENGTH 3, 4
WAVESET 50
WAVESET DISTORTION 52
Generally refers to any process which irreversibly alters the wavesets in a sound, e.g. waveset
inversion, omission, reversal, shaking, shuffling, substitution, averaging and harmonic distortion.
Specifically used to refer to power-distortion, i.e. raising each sample value of the sound to a power
(e.g. squaring, cubing, taking the square root).
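A minimal sketch, in Python, of power-distortion. The treatment of negative samples is an assumption here (the sign of each sample is preserved and only its magnitude is raised to the power); the waveform values are arbitrary.

    import math

    def power_distort(samples, power):
        # Raise the magnitude of every sample to the given power, keeping its sign.
        return [math.copysign(abs(s) ** power, s) for s in samples]

    wave = [0.0, 0.7, 1.0, 0.7, 0.0, -0.7, -1.0, -0.7]
    print([round(v, 3) for v in power_distort(wave, 2)])    # squaring
    print([round(v, 3) for v in power_distort(wave, 0.5)])  # square root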
WAVESET ENVELOPING ........................ 53
WAVESET HARMONIC DISTORTION ... 52
WAVESET INTERLEAVING .................... 54
WAVESET INVERSION ............................ 51
WAVESET OMISSION .............................. 51
WAVESET REVERSAL ............................. 51
WAVESET SHAKING ................................ 51
WAVESET SHUFFLING ........................... 51
WAVESET SUBSTITUTION ...................... 52
WAVESET TIMESTRETCHING ............... 55
WAVESET TRANSFER 54
WAVESET TRANSPOSITION 51
WEDGEING ..................................... 69
WHITE-OUT
A musical process in which a texture becomes so dense that all spectral detail is lost, producing a noise
band.
WINDOW
In sound analysis (conversion from waveform to spectrum, to spectral envelope or to time-varying
filter-coefficients: see LPC) a window is a very brief slice of time (a few milliseconds) in which we
make a spectral analysis, before passing to the next window to make our next analysis. We hence
discover how the spectrum changes in time. Not to be confused with analysis CHANNEL.
WINDOWED FFT
A Fast Fourier Transform that is performed over a brief slice of time (a window), then performed over
and over again at successive windows throughout the entire duration of the sound. The Phase Vocoder
is a windowed FFT.
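A minimal sketch, in Python, assuming the NumPy library is available; the window size and overlap are illustrative choices, not those of any particular phase vocoder.

    import numpy as np

    def windowed_fft(samples, window_size=1024, hop=256):
        # Analyse successive, overlapping, briefly-windowed slices of the sound,
        # giving one spectrum per window and so revealing how the spectrum
        # changes in time.
        window = np.hanning(window_size)
        spectra = []
        for start in range(0, len(samples) - window_size, hop):
            frame = samples[start:start + window_size] * window
            spectra.append(np.abs(np.fft.rfft(frame)))
        return np.array(spectra)   # one row per window, one column per analysis channel

    sr = 44100
    t = np.arange(sr) / sr
    rising_tone = np.sin(2 * np.pi * (220 + 220 * t) * t)   # pitch rises over one second
    print(windowed_fft(rising_tone).shape)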
***** Z *****
ZERO-CROSSING
The waveform of a sound continually rises above the centre line and then falls below it. The point where
it crosses the centre line is a zero-crossing.
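A minimal sketch, in Python, of locating zero-crossings in a list of sample values (the waveform is invented for the example):

    def zero_crossings(samples):
        # Indices at which the waveform passes from one side of the
        # centre line (zero) to the other.
        return [i for i in range(1, len(samples))
                if (samples[i - 1] < 0) != (samples[i] < 0)]

    wave = [0.2, 0.6, 0.1, -0.3, -0.5, -0.1, 0.4, 0.2]
    print(zero_crossings(wave))   # -> [3, 6]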
ZERO-CUTTING ................................. 40
ZIGZAGGING .................................... 43

o.

,j

'!

I, 'l

:~
"

:1

!~ ."

'.
:a

I

:1

:~

J

.J

.. .. .,
11
,j

.

TIle question of how these relationships may be developed to establish larger scale musical structures will be suggested towards the end of the book but in less detail as. stalling from a given sound.CHAPTER 0 INTRODUCTION WHAT THIS BOOK IS ABOUT This is a book about composing with sounds.like rape Jpeed variatioll.l . This concern grew directly out of the sophisticated development of orchestration in the late Nineteenth Century and its intrinsic limitations (a small finite set of musical instruments). For the moment we will assume that it is obvious. The ways in which this sound may be transformed are limited only by the imagination of the composer. We will also present.. 3. The third assumption will either appear obvious. It is based on three assumptions.j I' " .'. :1 '0 :1 l " ..l ':t :~ I. Initially hampered by unsophisticated tools (in the early days. . The main body of this book will therefore show how. Computers allow us to digitally record any sound at all and to digitally process those recorded sounds in any way that we care to define. The second of our assumptions had to await the arrival of musical instruments which could handle in a subtle and sophisticated way. the inner substance of sounds themselves.M. a simple diagrammatic explanation of the musical procedures discussed.later the transfonnations .available with magnetic tape) masterpieces of this new medium began to emerge and an approach to musical composition rooted in the sound phenomenon itself was laid out in great detail by the French school. In this book we will· "discuss in general the properties of different kinds of sound materials and the effects certain well defined processes of transformation may have on these. . no universal tradition of large scale fonn-building (through these newly accessible sound-relationships) has established itself as a norm. . many other audibly similar sounds may be developed which however. the French composer Varcse was imagining a then unattainable music which had the same degree of control over sonic substance as musicians have traditionally exercised over melody. I. The first assumption can be justilied with reference to both aesthetic and technological developments ill the Twentieth Century. harmony and duration. Any sound whatsoever may be the stalling material for a musical composition. It was the emergence and increasing perfection of the technology of sound recording which made this dream accessible. in the Appendix. in Paris in the early 1950s.R. editing between lacquer discs . editillg and mixillg . . 2. Before 1920." '.' I il .i :~ . The digital computer provided the medium in which these tools could be developed. to date. depending on the musical perspective of the reader. TIle American composer John Cage was the first to declare that all sound was (already) music. Musical structure depends on establishing audible relationships amongst sound materials. The exploration of the new sounds made available by recording technology was begun by Pierre Schaeffer and the G. possess properties different or even excluded from the original sound. or deeply controversial. !l "I " .

.i :~ . However.. as opposed to the world of notations of sound. It is perfectly possible to use the analysis of a recorded sound to build a synthesis model which generates the original sound and a host of other related sounds.. fairly representative. [Iut because our prime medium for the propagation of knowledge is the written text. Many composers are either forced into this approach. The COP was developed as a composers cooperative and originated in York.y.A NEW CRITICAL TRADITION I cannot emphasise strongly enough that my concern is with the world of sound itself. or the importance of the scholar or composer who declares structure to be present (I shall return to such matters in particular in Chapter 9 on 'Time'). In line with Illis view. For the text of a computer program can act on the world through associated electronic and mechanical hardware. We might in fact argue that the truly potent texts of our times are certainly not texts like this book. or by recording sounds on a sampler . I have discussed elsewhere (011 SOllie Art) the strong influence of mediaeval 'text worship' on the critical/analytical disciplines which have evolved in music. This is not significantly different from traditional on-paper composition and although this book should give some insight into the design of the 'instruments' used in such an approach. most of the programs described were available on the Composers Desktop Project (COP) System at the time of writing. to make the world anew. however. In due course. It is also possible to use sound transformation techniques to change any sound into any other via some well-defined and audible series of steps. because cheaply available technology is based in the note/instrument conception of music.. This new paradigm is beginning to struggle into existence against the immense inertia of received wisdom about 'musical structure'.. in any case. However. is to uncover the musical possibilities and restraints offered by the medium Qf sonic composition. the text can be modified to produce a more satisfactory result. However.. A common approach to sound-composition is to define "instruments" . rather than its representation in a text. even this one). not to argue the pros and cons of different technological situations. and in particular to crcate new and unhcard sonic experiences. acoustically or mathematically described. . I would suggest. (Scientific readers may be surprised to hear that this stance may be regarded as polemical by many musical theorists). THINKING ALOUD . It is with this that we will be concerned. or do not see beyond it. I don'. They do nOt rJdlate some mystical authority 10 wiuch we must kow-tow. synthesis and analysis have. this book will adopt the point of view of a scientific researcher delving into an unknown realm. at the same time. the religious or mystical POtency willi which the medieval text was imbued has IJccn replaced by at. I will focus on what can be aurally perceived. but do something specific in the world which we can judge to be more or Ie. many of them will run in real time environments.. by now. Sound composition requires the development of both new listening and awareness skills for the composer and. rather than on an approach through synthesis. this is not a scientific text. In aHempting to explore the area of composing with sound. The common language is onc of intelligent and sophisticHted sound transformation so that sound composition has become a plastic art like sculpture. 
The world of sound-composition has been hampered by being cast in the role of a poor relation to JIIore traditional musical practice.~ :. be impossible) but oniy a wide and. . . set of processes which arc already familiar. In particular the vast body of analytical and critical writings in the iIlusicology of Western Art music is strongly oriented to the study of musical texts (scores) rather than 10 a discipline of acute aural awareness in itself.either by manipulating factory patches on a commercial synthesizer. particularly where the MIDI interface confines the user to the tempered scale. therefore. In the scientific spirit. on my direct response to these perceptions and on what can be technically. powerful institutions have grown up around the presentation. Nor will we. a new analytical and critical discipline founded in the sludy of the sonic experience itself. " . which may be moulded like a sculptural medium in any way we wish. Here. . 'J . Our interest in exploring this new area is not to discover universal laws of perception but to suggest what might be fruitful approaches for artists who Wish to explore the vast domain of new sonic possibilities opened up by sound recording and computer technology. so powerful that their influence can be inappropriate. i Q . Both the scientific method and technologised industrial society have had to struggle against the passive authority of texts declaring eternal !ruths and values inimical to the scientific method and to technological advance. the assumption in this book is that we are not confmed to defining "instruments" to arrange on some preordained pitch/rhythmic structure (though we may choose to adopt this approach in particular circumstances) but may explore the multidimensional space of sound itself. We are looking for evidence to back up any hypotheses we may have about potential musical structure and this evidence comes from our perception of sounds themselves. analysis and evaluation of texts and textual evidence . or ought to be implemented in real time. but computer programs themselves. II l~ . !. hopefully.s Successful.:tual phYSical cftica. but this book is based on the assumption that the existence of structure in music is a mailer of fact to be decided by listening to the sound~ presented..the entire history of European compositional theory is already available! " . this book is not intended to be read without listening to the sound examples which accompany it. outside this chapter. not a matter of opinion to be decided on the authority of a musical text (a score or a book . I will not discuss the approach as such here . We also do not aim to cover every possibility (this would. My concern here.~ . judgeable. these are presented as evidence of the propositions being presented. At its simplest such an approach is no ffiQre than traditional note-oriented composition for electroniC instruments.and then trigger and transpose these sounds from a MIDI keyboard (or some other kind of MIDI controller).K. In particular.. that this distinction need barely concern us any more. And if we are dissatisfied.WHAT THIS BOOK [S NOT ABOUT This book is not about the merits of computers or particular programming packages. t I "1... U. or even true scientific theories. or the largely literary disciplines of music score analysis and criticism. 2 3 ..~ On the contrary. become so sophisticated. You arc at liberty to affirm or deny what I have to say Illfough your own experience. 
we will focus on the transformation of sound materials taken from the real world. Such texts are potent but. wish here to decry the idea that there may be "universal truths" about human behaviour and human social interaction which science and technology are powerless to alter. discuss whether or not any of the processes described can be.

So scientists be forewarned! We may embark on signal-processing procedures which will appear bizarre to the scientifically sophisticated. Just as with tile traditional acoustic instrument. Much late Twentieth Century Western art music has been dogged by an obsession wi!h complicatedness. use every conceivable option the new instrument offers. In music. However. Precise analysis of sounds. SOUND TRANSFORMATION: SCIENCE OR ART? In this book we will refer to sounds as sound-materials or sound-sources and to the process of changing them a. bul rather. What this means in practice may only emerge in the course of the exploration.like the old adage of trying 10 write a play by letting a chimpanzee stab away at a typewriter in the hope that a masterpiece will emerge. .more pallems means more information. a model which stresses !he quamity of information. if activated. Helshe may not know beforehand exactly what is being searched for when it is used. however. the motivation of our discussion is somewhat different from that of the scientific or technological study of Signals. this means to use the new tools in a way appropriate to Ihe sound we arc immediately dealing with and wi!h a view 10 p-JIlicular aesthetic objectives. !he question that a composer asks is. In some cases. but also clearly different. ~ . The musical world is generall y conservative and denizens of this community can be quick to dismiss !he "new-fangled" as unmusical or artistically inappropriate. This danger of overkill is particularly acute with the computer. signal proceSSing and computer program or algorithm. In sound composition. it would be wasteful of time not to do !his. The success of our cfforts will be judged by what we hear. however.and we may test the result of our process against the desired outcome and hence assess the validity of our procedure..~ transformations or sound-transformations. '. In particular. This docs not mean however that I will. time-variable: Appendix p8) filter in order to explore its effects on sound materials. this real-time-monitored action type of knowledge. o . Learned through physical practice and emulation of others and aided by diSCUSSion and description of what it involved. I will therefore make a new instrument (a program) to achieve the result I want.\ 'f . arises parUy from the breakdown of consensus on the subslance of musical meaning. or even intend to. we do not need to "know" completely what we are doing (!!). is this process aes!helically interesting? . will automatically run all the programs necessary to create the goal sound. wilh my existing mUSIcal tools..! . There is also a word of caution for the composer reader. Whilst building the instrument. !he source of musical potency remains as elusive as ever but in an age whidl demands everything be quantifiable and measurable. musicians share !hcse goals. or why it works. In the end. The tools which effect changes will be described as musical instruments or musical tools. It can also be copied and modified to produce whatever variants are required.~ . means more musical "potency". is more likely to require an extremely flexible (band-variable. or even predictable. A musician. Beyond this.'The pntCtical implications of this are immense for the composer. when composing. or how to construct meaningful sentences in our native longue) wi!hout necessarily being able to describe expliCitly what we do. there are no intrinsic restrictions on what we do. In the tlrst. For example. 
apart from a useful aes!hetic transformation of !he Original source. however. For example. It applies to things we know very well (like how 10 walk. seems falsely plausible. In many respects. The text in the batch-file. I will make it as general purpose as possible so that it applies to all possible situations and so that all variables can vary in all possible ways. This obsession wi!h quantity.an accurate representation of a given sequence. I ~ ~ :t ~ '~ . The question we must ask as musicians however is nOl. . what we will be discussing is Signal processing as it applies to sound signals.! . e~traction of data in the frequency cWmain. I store only the source and a brief so-called "batch-file". There is no inherent virtue in doing everything. are these procedures scientifically valid. . Q-variable. In fact. From an artistic point of view it is important to stress the continuity of this work to past musical craft. it clearly helps 10 have a modicum of acoustic :wledge. 4 5 . or complicmedness of an artefact. EXPLICIT AND INTUITIVE KNOWLEDGE In musical creation we can distinguish two quite distinct modes of knowing and acting. scientific readers will be more familiar with the terms signal. procedures that give relatively unpredictable results. as musicians. number-<:runching program will produce an arbitrary result . well. In analysing and transforming signals for scientific purposes. .. to make reconstruction (or variation) of the final sound possible. However. the goal of thc process we set in motion may not be known or even (wi!h complex signals) easily predictable beforehand.does !he sound resulting from !he process relate perceptually and in a musically useful manner to the sound we began wi!h? What we are searching for is a way to transform sound material to give resulting sounds which are clearly close relatives of the source. a phYsical movement causes an immediate result which is monitored immediately by the cars and this feedback is used to modify the action. and for. do they produce aesthetically useful resulls on al least some types of sound materials. In this case. intuitive knowledge is most strongly associated WiUl mUsical performance. sound clarification or noise reduction. But beyond lhis. In the past I might spend many days working from an original sound source subjecting it to many processes before arriving at a satisfactory result As a record of this process (as well as a safeguard against losing my hard-won fmal product) I would keep copious notes and copies of many of the intermediate sounds. Today. This has arisen partly from the permutational procedures of late serialism and also from an intellectually suspect linkage of crude information theory with musical communication . a technological or scientific task may involve the design of a highly specific filler to achieve a particular result. I may decide I need to do some!hing with a sound which is difficult. musicians. extraction of time-varying infonnation in the frequency domain (see Appendix p3).processing of sound as anything and everything can be done. or information overload. or ti13t are heavily dependent on the unique properties of the particular signals to which they are applied. !he task is to use it. Given the power of the computer.. Nevertheless. An arbitrary. Ideally we require a way to "measure" or order these degrees of difference allowing us to articulate the space of sound possibilities in a structured and meaningful way. the removal of noise and the enhancement of the signal "image" . 
we normally have some distinct goal in mind . we would stress that this is a book by. or impossible. An illuminating manuscript indeed! This open-ended approach applies equally 10 the design of musical instruments (signal processing grams) themselves. I will describe as illluilive". to play it. are all of great importance to the sonic composer..

~ponsibility REAL TIME OR NOT REAL TIME? . In this case a symbiosis is taking place between composerly and perfonner/y activities and as thc activity of composer and performer begin to overlap. In a medium where everything is possible. At some level. In traditional "on paper" composition. TIus approach is intrinsically "real time". 10 have worked it all out in an entirely explicil and rational way.choice strategies. play the range of possibilities available during the course of composing a work. onlo a recording medium out of real time. traditional composers. When we bow a note on a violin. at first glance. In composed Western art music where a musical score is made. must achieve a new cquilibrium.. the found-objecl manufacturer) is providing a framework of restrictions (the sound world. . and usually in the practice of. Although the three traditional roles performer-improviser. the temptation for anyone labelled . and more and more powerful. Explicit knowledge of this kind is stressed in the training. perfonnance practice tradition and the player's intuitive control of the instrumcntal medium takes over in pitch definition (especially in processes of succession from pilch to pitch on many instruments. composer" is to build a new electronic extcnsion for every piece. thc pitch-set. post-hoc. To develop new. they provide useful poles around which we may a~sess the value of what we are dOing. systems on which we hear the results of our decisions as we take them. TIle advantages of this paradigm are those of "livcness" (the theatrc versus the cinema. a metal sheet). a retuned piano. suggesting therefore that the composer must also have a clearly explicit description of the task.. in the music-<:ultural atmosphere of the late Twentieth Century.g. the composer takes responsibility for the organisation of certain weU-<:onlrOlled parameters (pitch. As this book is primarily about composition.g. Without some kind of intuitive understanding of the universe of sounds. and often real-timc. or extended musical perfonnance instruments (including real-time-control!able sound processing devices) we need real-time processing of sounds. exploring through infonned. however. is that most electro-acoustic composers 'play' with the medium. . (I will describe later why this is In contraSl. cven in the free-improvisation case. Much more is therefore placed on the composer to choose an appropriate (set of) configuration(s) for a particular purpose. we must ask what bearing the development of real-time proceSSing has on compositional practice apart from speeding up somc of thc morc mundane tasks involved.. the aura of explicitness is enhanced because it results in a definitive lext (the score) which can often be given a rational exegesis by the composer or by a music score-analyst. may use the musical score merely as a notepad for their intuitive musical outpourings. we immediatcly hear a sound being produced which is the result of our decision to use a particular finger position and bow pressure and which responds immediately to our subtle manual articulation of these. timing precision or its interpretation. TIle first is an instrumental pal'ddigm.~ Q ~ l In fact. TIle absurd and misguided rationalist nightmare of the totally explicit construction of all aspects of a musical work is not the "natural" outcome of the use of computer technology in musical composition ­ just one of the less interesting possibilities. ThIs approach filS well into a tnlditional musical way of thinking. 
The "instrument" is no longer a definable (if subtle) closed universe but a groundswell of possibilities out of which the sonic composer must delimit some aesthetically valid universe.THAT IS THE QUESTION As computcrs becomc faster and fa~ter. ' . However. in all musical practice. the computer must have an exact description of what it is required to do.r I On !be other hand. In the sense that an instrument builder sets explicit limits to the perfonncr's free exploration shelhc has a restricting role similar to that of the composer. 'Ole fact. the work is rccrcated 'before your very eyes') and of mutability dependent on each performer's reinterpretation of the work. COmposition on the other hand would seem. by aCCident. and with the human voice). instrument-builder and composer arc being blurred by the new tcchnological developments. the articulation possibilities) which hound the musical univcrse which the free improviser may explore. It might be supposed that the pure electro-acoustic composer making a composition directly onto a recording medium has already abandoned intuitive knowledge by eliminating the performer and hislher role. Success in perfonnance also dcpends on a foreknowledge of whal our actions will precipitate. AI the other pole. duration. Others may occasionally do the same but in a cultural atmosphere where explicit rational decision-making is most highly prized. sound-by-sound. or (in the studio) to put together a work. Beyond this. A traditional musical instrument is a "real-timc system". Hence the rationale for having real-lime systems seems quite clear in the sphere of musical performance. In the neomanic cultural atmosphere of the late Twentieth Century. there is a clamour among musicians working in the medium for "real-timc· system. it is an essential part of the compositional process. where thc composer provides elcctronic extensions to a traditional instrumental performance. • . the problem of choice is insunnountable (unless one replaces it entirely by non. to establish not so). the instrument builder (or. . the role of computer instrument builder is somewhat different Infonnation Technology alloWS us to build sound-processing tools of immense generality and flexibility (though one might not uess this fact by surveying the musical hardware on sale in high street shops). will claim.e. I would suggest that there are two conflicting paradigms competing for thc attention of the sonic composer. i. dice-throwing procedures etc).~ . Moreover. e. Not only is this desirable.~ !t '! 6 7 . II's disadvantages are not perhaps immediately obvious. instrument type) up to a certain degree of resolution. some balance must be struck between what is created explicitly and what is created intuitively. the free-improvising perfonner relies on an intuitive creative process (restrained hy the intrinsic sound/pitch limitations of a sound source. the role of intuitive and explicit knowledge in musical composition. 'This is knowledge that we know that we know and we can give an explicit description of ilto others in language or mathematics.~ . c. it appears "natural" to assume that thc use of the computer will favour a totally explicit approach to composition.S ~ ~ I ~ J . acceptable HAnnonic progressions within a !liven style or the spectral contour of a particular vowel formant (see Chapter 3). to generate both the moment-to-moment articulation of events and the ongoing time structure at all levels. and sound production and articulation. we also have explicit knowledge of. 
to be an intrinsically non-real-time process in which considercd and explicit choices arc made and used to prepare a musical lext (a score). Some composers (particularly the presently devalued "romantic· composers). But the composer who specilles a network of electrOnic processing devices around a traditional instrumental performance must recognise that helshe is in fact creating a new and different instrumenl for the instrumentalist (or iustrumentalist-"techllician" duo) to play and is partly adopting the role of an instrument builder with its own vcry differenl reSponsibilities.

may be a necessary requirement to come to grips with lilis new domain. Its unique properties are reproducible and can be used as the basis of compositional processes which depend on those uniquenesses. In a long established and stable tradition with an aimosl unchanging sel of sound-sources (instruments) and a deeply embedded performance practice. a process increasingly informed by experience as the composer engages in her or his craft. However. Hence there is a danger that a piece for electronically processed acoustic instrument will fall short of our musical expectations because no matter how good the performer. The tempered scale became established in European music over 200 years ago. that film has viz-a-viz theatre. and hence a coming 10 terms with the unexpected and the unwilled. means that a "performance practice" (so to speak) in the traditional sense is neither necessarily attainable or desirable.Q . The implication we wish the reader to draw is that this new domain may not merely lack established limits at lhis moment. il may turn out to be intrinsically unbounded. I can then adjust my physical actions to satisfy my aural experience . Humility in the face of experience is an essential character Irait! In Particular the sound examples in this book are the result of applying particular musical tools to t . bul nevertheless always open to both surprising discovery and to errors of judgement. that all aspects of sound design need to be explicitly understoOd. real-time processing (wherever this is feasible) is a desirable goal. studio composition can deal with the uniqueness of sonic events and with the conjuring of alternative or imaginary sonic land'!capes outside the theatre of musical performance itself. Any new instrument takes time to master. i.. in time. I then . In this interface between intuitive knowledge embedded in bodily movement and direct aural feedback. If I didn't like this result. " it partiCular sound sources. lies one of the most significant contributions of computing technology 10 the process of musical composition. One cannot simply apply a process.g. through performance. his or her mastery of the new system is unlikely to match his or her mastery of the acoustic instrument alone with the centuries of perfonnance practice from which it arises. There is a danger that electronic extension may lead to musical trivialisation. rather than just the mastering and extension of established craft.~ . when lirst exploring any time-varying process and its effect on a particular sound. we must declare thai the realm of sound composition is a dangerous place for the traditional composer! Composers have been used to working within domains estahlished for many years or centuries. The power of the computer both to generate and provide control over the whole sound universe does nOI require that I explicitly know where to travel. learn to master as a repertoire for these new instruments develops. in order 10 produce some subtly time-varying modification of a sound texture or process. to move a fader. The exploralory researching approach to the medium. I had to provide to a program a file listing the values somc parameter will takc and the times OIl which il will reach those values (a breakpoint table). [Urn a knob. . and expect to get a perceptually similar transformation With whatever sound source one puts inlo the process. In this respect it has the disadvantages.10 the program and subsequently heard the result. 
electro-acoustic composition does present us with an entirely new dilemma. what is "right" or "wrong" for me in this new siW ation without necessarily having any conscious e1tplicit k.nowledge to diflerent situations. In many studio situations in which I have worked. "turn the handle". as did many of the instrumental families. and so on. 8 9 . j SOUND COMPOSITION. Unfonunately. Such composition may be regarded as suffering from the disadvantage that it is not a 'before your very eyes'.11 II I ilii I.s credentials as an "original" artist. and explicit knowledge carried in a numerical representation. Sonic An is nOI like arranging. !.lut as a result the studio composer must take on many of the functions of the performer in sound production and articulation. Both must be selecled with care to achieve some desired musical goal. I expect to hear an example of a class of possible sounds which all satisfy the restrictions placed on being a flute "mf E". But beyond this. from the depths of his (preferably Teutonic) wisdom. sound diffusion adds an element of performance and interpretation (relating a work to the acoustic environment in which it is presented) not available in the presentation of cinema. Because success in this sphere depends on a marriage of good instrument design and evolving performance practice. Does it provide a satisfying balance of restrictions and ~ Ii! 'i flellibilities to allow a sophisticated performance practice to emerge? Foe the performer he/she is performing on a new instrument which is composed of the complete system acoustic-Instrument-plus. This does not necessarily mean.. it takes time! From this perspective it might be best to establish a number of sophisticated electronic-extension-archetypes which performers could. When I finger an "mf E" on a flute. Similarly.j i The second paradigm is that of pure electro-acoustic (studio) composition. however. But these changes are ounor compared with the transition into SOllie composition where every sound and every imaginable process of transformation is available. I had to modify the values in the table and run the program again. bow a string. wHls into existence a musical score out of pure though!. without going through any conscious explanatory control process. New instruments (e. the saxophone) and new ordering procedures (e. In contrast. It is clearly simpler and more efficacious. however. good sound composition always includes a process of discovery.nowledge of how I made this decision. In this way I can learn intuitively. . an instrument builder must demonstrate the viability and effieaey of any new instrument being presented. the enduring academic image of the European composer is that of a man (sic) who. AN OPEN UNIVERSE Finally. Moreover. for a studio composer. . Performed music works on sound archetypes. serialism) have emerged and been taken on board gradually.I can explore intuitively. Without this fact there can be no interpretation of a work. perhaps these supermen exist. the success of studiO produced sound-art depends on the fusion of Ihe roles of composer and performer in the studio situation. if the program also recorded my hand movements I could store these intuitively chosen values and these could then be reapplied systematically in new situations.electronic-network.g. ii. . What film lacks in the way of reinterpretable archetypicality(!) it makes up for in the closely observed detail of location and specifically captured human uniqueness. 
blow down a tube (etc) and hear the result of tlus physical action as I make il. or analysed to determine their mathematical properties so I could conSciously generalise my intuitive k. blurring the distinction between the skill of the perfomler (intuitive knowledge embedded in (X'fformance practice) and the skill of the composer and allowing us to e1tplore the new domain in a more direct and less theory-laden manner. The power of the computer to both record and tntnsform allY sound whatsoever. However. but also the advantages. However. and interpreted medium. As argued above. " . every sound may be treated as a unique event. For this to work effectively.

CHAPTER 1
THE NATURE OF SOUND

SOUNDS ARE NOT NOTES

One of the first lessons a sound composer must learn is this: a sound is a sound is a sound is a sound. Sounds are not notes. Every "F5 mf" on a flute is a different sound-event; its particular grouping of micro-fluctuations of pitch, loudness and spectral properties is unique. A sound is not an example of a pitch class or an instrument type. It is a unique object with its own particular properties, which may be revealed, extended and transformed by the process of sound composition.

Traditional music, in contrast, is concerned with the relations among certain abstracted properties of real sounds: pitch or pitch-band, the loudness, the duration. In traditional musical practice, "F5 mf" on a flute is related to "E-flat4 ff" on a clarinet. But these are relations between ideals. Certain other features, like pitch stability, tremolo and vibrato control etc., are expected to lie within perceptible limits defined by performance practice, but beyond this they enter into a less well-defined sphere known as interpretation. In this way we have a structure defined by relations among archetypal properties of sounds, and a less well-defined aura of acceptability and excellence attached to other aspects of the sound events. The difficulty in music is compounded by the fact that we discuss musical form in terms of idealisations.

To give an analogy with sculpture: there is a fundamental difference between the idea "black stone" and the particular stone I found this morning which is black enough to serve my purposes. This particular stone is a unique piece of material (no matter how carefully I have selected it), and I will be sculpting with this unique material.

With sound recording, the unique characteristics of the sound can be reproduced. We can capture a very special articulation of sound that cannot be reproduced through performance: the particular tone in which a phrase is spoken on a particular occasion by an untrained speaker, the resonance of a passing wolf in a particular forest or a particular vehicle in a road-tunnel, the ultimate extended solo in an all-time great improvisation which arose out of the special combination of performers involved and the energy of the moment. Furthermore, exact reproducibility allows us to generate transformations that would otherwise be impossible. To give just two examples: by playing two copies of a sound together with a slight delay before the onset of the second, we generate a pitch, because the exact repetition of sonic events at an exact time interval corresponds to a fixed frequency and hence a perceived pitch.

The most important thing to understand, however, is that sounds are multi-dimensional phenomena. Almost all sounds can be described in terms of onset (particularly onset-grain), pitch or pitch motion, spectral stability and its evolution, spectral harmonicity-inharmonicity and its evolution, spectral contour and formants (see Chapter 3) and their evolution, undulating, dispersive and/or forced continuation (see Chapter 4), and so on, all at the same time. We already know, then, that sound composition presents us with many modes of material-variation which are unavailable, at least in a specifically controllable way, in traditional musical practice. This has consequences which force us to extend, or divert from, traditional musical practice.
It remains to be seen whether the musical community will be able to draw a definable boundary around the available techniques and say "this is sound composition as we know it" in the same sense that we can do this with traditional instrumental composition.

In dividing up the various properties of sound in this way, we do not wish to imply that there are different classes of sounds corresponding to the various chapters in the book: most sounds have most of the properties that we will discuss. As compositional tools may affect two or more perceptual aspects of a sound, it will in fact be necessary, as we go through the book, to refer more than once to many compositional processes as we examine their effects from different perceptual perspectives.

UNIQUENESS AND MALLEABILITY: FROM ARCHITECTURE TO CHEMISTRY

To deal with this change of orientation, we need to understand the properties of the "sonic matter" with which we must deal. In the past, composers were provided, by the history of instrument technology, performance practice and the formalities of notational conventions (including theoretical models relating notatables like pitch and duration), with a pool of sound resources from which musical "buildings" could be constructed. Composition through traditional instruments binds together as a class large groups of sounds and evolving-shapes-of-sounds (morphologies) by collecting acceptable sound types together as an "instrument" (e.g. a set of tuned gongs (gamelan), a set of struck metal strings (piano), and so on) and uniting this with a tradition of performance practice. The internal shapes (morphologies) of sound events remain mainly in the domain of performance practice and are not often subtly accessed through notation conventions. As most traditional building plans used pitch (and duration) as their primary ordering principles, the overwhelming dominance of pitch as a defining parameter focuses interest on sound classes with relatively stable spectral and frequency characteristics. Most sounds simply do not fall into the neat categories provided by a pitched-instruments-oriented conception of musical architecture: not only sounds of indeterminate pitch (like unpitched percussion, or inharmonic spectra) but those of unstable or rapidly varying spectra (the grating gate, the human speech-stream) must be accepted into the compositional universe. As the Twentieth Century has progressed and the possibilities of conventional instruments have been explored to their limit, we have learned to recognise various shades of grey and gilt to make our architecture ever more elaborate; but our classification categories are overburdened and our original task seems to become overwhelmed. We need a new perspective to understand this new world.

The signal processing power of the computer means that sound itself can now be manipulated. The precision of computer signal processing means, furthermore, that previously evanescent and uncontainable features of sounds may be analysed, understood, transferred and transformed in rigorously definable ways (with fully definable portamenti, for example). Sound becomes a fluid and entirely malleable medium, and we can pass between these "states of being" of the sound with complete fluidity, and vice versa. Most importantly, however, our principal metaphor for musical composition must change from one of architecture to one of chemistry, although mathematical and physical principles will still enter strongly into the design of both musical tools and musical structures.
A minute audible feature of a particular sound can be magnified by time-stretching, or brought into focus by cyclic repetition (as in the works of Steve Reich). The evolving spectrum of a complex sonic event can be pared away until only a few of the constituent partials remain, transforming something that was perhaps coarse and jagged into something aetherial (spectral tracing: see Chapter 3). We may exaggerate or contradict, in a precise manner, the energy (loudness) trajectory (or envelope) of a sound, enhancing or contradicting its gestural propensities.

We might imagine an endless beach upon which are scattered innumerable unique pebbles. The previous task of the instrument builder was to seek out all those pebbles that were completely black to make one instrument, all those that were completely gold to make a second instrument, and so on. The composer then becomes an expert in constructing intricate buildings in which every pebble is of a definable colour. Sound recording, however, opens up the possibility that any pebble on the beach might be usable: those that are black with gold streaks, those that are multi-coloured, and so on. We may imagine a new personality combing the beach of sonic possibilities: not someone who selects, rejects, classifies and measures the acceptable, but a chemist who can take any pebble and, by numerical sorcery, transform black pebbles into gold pebbles, merge the constituents from two quite different pebbles, separate a pebble into its constituents, reconstitute them, or transform them into new and undreamt-of musical materials. Like the chemist, we can take apart what were once the raw materials of music to build new worlds of musical connectedness, tracing out an audible path of musical connections: not a carefully honed collection of givens, but a continuum of unique sound events and the possibility to stretch, mould and transform this continuum in any way we choose. This shift in emphasis is as radical as is possible, from a finite set of carefully chosen archetypal properties, governed by traditional "architectural" principles, as a basis for musical form, to the fluid material of sound itself. Working these newly available materials is, however, immediately problematic.
Sculpture and chemistry, rather than language or finite mathematics, become appropriate metaphors for what a composer might do. A complete reorientation of musical thought is required, together with the power provided by computers, to enable us to encompass this new world of possibilities.

THE REPRESENTATION OF SOUND
PHYSICAL & NUMERICAL ANALOGUES

To understand how this radical shift is possible, and to get any further in this universe, we must understand both the nature of sound and how it can now be physically represented. Fundamentally, sound is a pressure wave travelling through the air. Just as we may observe the ripples spreading outwards from a stone thrown into still water, when we speak similar ripples travel through the air to the ears of listeners. As with all wave motion, it is the pattern of disturbance which moves forward, rather than the water or air itself: each pocket of air (or water) may be envisaged as vibrating about its current position and passing on that vibration to the pocket next to it. This is the fundamental difference between the motion of sound in air and the motion of the wind, where the air molecules move en masse in one direction. It also explains how sound can travel very much faster than the most ferocious hurricane.

Sound waves in air differ from the ripples on the surface of a pond in another way. The ripples on a pond (or waves on the ocean) disturb the surface at right angles (up and down) to the direction of motion of the wave (forward); these are therefore known as transverse waves. In a sound wave, the air is alternately compressed and rarefied in the same direction as the direction of motion (see Diagram 1). Nevertheless, we can represent the air wave as a graph of pressure against time. In such a graph we represent pressure on the vertical axis and time on the horizontal axis, so our representation ends up looking just like a transverse wave (see Diagram 2).

[Diagram 1: Transverse waves: waves on the surface of a pond involve motion of the water in a direction perpendicular to the direction of motion of the waves. Sound waves: patterns of compression and rarefaction of the air, in a direction parallel to the direction of motion of the wave; the pattern itself moves forward.]

[Diagram 2: An aneroid barometer stack expands and contracts as air pressure varies over time; the point of a pen traces out similar movements, which are recorded as a (spatial) trace on a regularly rotating drum. The representation of a sound wave as a graph of pressure against time looks like a transverse wave.]

Such waves are, of their very nature, ephemeral. Without some means to 'halt time' and to capture their form out of time, we cannot begin to manipulate them. Before the Twentieth Century this was technologically impossible: music was reproducible only through the continuity of instrument design and performance practice and through the medium of the score, essentially a set of instructions for producing sounds anew on known instruments with known techniques.

The trick involved in capturing such ephemeral phenomena is the conversion of time information into spatial information. A simple device which does this is the chart-recorder, where a needle traces a graph of some time-varying quantity on a regularly rotating drum. In this way, a pattern of pressure in time is converted to a physical shape in space. (See Diagram 2.)
And in fact the original phonograph recorder used exactly this principle, first converting the movement of air into the similar (analogue) movements of a tiny needle, and then using this needle to scratch a pattern on a regularly rotating drum. This process of creating a spatial analogue of a temporal event was at the root of all sound-recording before the arrival of the computer, and is still an essential part of the chain in the capture of the phenomenon of sound. The fundamental idea here is that of an analogue: a very precise copy of the form of the sound-waves in an altogether different medium. In the past this might be the shape of a groove in rigid plastic (vinyl discs) or the variation of magnetism along a tape (analogue tape recorder). Note that in all these cases we are attempting to preserve the shape, the form, of the original pressure wave. The intrinsically ephemeral had been captured in a physical medium. By running the physical substance past a pickup (the needle of a record player, the head of a tape recorder) at a regular speed, the position information was reconverted to temporal information and the wave was recreated.

The arrival of electrical power contributed greatly to the evolution of an Art of Sound itself. First of all, reliable electric motors ensured that the time-varying wave could be captured and reproduced reliably, as the rotating devices used could be guaranteed to run at a constant speed. More importantly, electricity itself proved to be the ideal medium in which to create an analogue of the air pressure-waves. Electrical waves are variations of electrical voltage with time: analogues of sound waves, though in an entirely different medium. Sound waves may be converted into electrical waves by a microphone or other transducer; such electrical analogues may also be created using electrical oscillators, as in an (analogue) synthesizer. The final link in the chain is a device to convert electrical waves back into sound waves: the loudspeaker.

[Diagram 3: the recording and playback chain. A sound source produces an air pressure wave; a microphone converts it to a voltage; the voltage is stored as a spatial analogue (vinyl disc, magnetic tape) or, via an A-to-D converter, as numbers on disk; on playback the stored form is reconverted to a voltage and a loudspeaker recreates the pressure wave at the listener's ear.]

Such electrical patterns are, however, equally ephemeral. They must still be converted into a spatial analogue in some physical medium if we are to hold and manipulate sound-phenomena out of time. When we work on sound we in fact work on a spatial, electrical or numerical analogue. The digital representation of sound takes one further step, which gives us ultimate and precise control over the very substance of these ephemeral phenomena. Using a device known as an A to D (analogue to digital) converter, the pattern of electrical fluctuation is converted into a pattern of numbers (see Appendix p1). The individual numbers are known as samples (not to be confused with chunks of recorded sound captured in digital samplers, which are also often referred to as 'samples'), and each one represents the instantaneous (almost!) value of the pressure in the original air pressure wave.
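As a minimal sketch of the A-to-D idea just described (in Python with NumPy; the function name, the 16-bit quantisation and the 44100 Hz rate are illustrative assumptions, not a description of any particular converter or CDP program), a continuous pressure function can be turned into a numbered list of instantaneous values at a stated sample rate:

```python
import numpy as np

def analogue_to_digital(pressure, duration, sample_rate=44100, bits=16):
    """Sample a continuous pressure function at a regular rate and quantise
    each instantaneous value to an integer: the 'pattern of numbers'
    described in the text."""
    times = np.arange(int(duration * sample_rate)) / sample_rate
    wave = pressure(times)                       # instantaneous pressure values
    peak = max(np.max(np.abs(wave)), 1e-12)      # avoid dividing by zero
    scale = 2 ** (bits - 1) - 1
    return np.round(wave / peak * scale).astype(np.int16), sample_rate

# e.g. a 440 Hz pressure wave sampled for one second
samples, sr = analogue_to_digital(lambda t: np.sin(2 * np.pi * 440 * t), 1.0)
```

Each entry of `samples` is one of the "instantaneous (almost!)" pressure values referred to above; only together with the sample rate do they describe the original wave.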

We can even represent the sound by writing a list of sample values (numbers) on paper, providing we specify the sample rate (how many samples occur, at a regular rate, in each second), though this would take an inordinately long time to achieve in practice. We now arrive at an essentially abstract representation of sonic substance for, although these numbers are normally stored in the spatial medium of magnetic domains on a computer disk, or pits burned in a CD, there need be no simple relationship between their original temporal order and the physical-spatial order in which they are stored. Typically, the contents of a file on a hard disk are scattered over the disk according to the availability of storage space. What the computer can do, however, is to re-present to us the numbers which represent the sound in their original order. It is important to note that this new representation of sound is out-of-time: what was once essentially physical and temporally ephemeral has become abstract.

It is interesting to compare this new musical abstraction with the abstraction involved in traditional music notational practice. Traditional notation was an abstraction of general and easily quantifiable large-scale properties of sound events (or performance gestures, like some ornaments). This abstracting process involves enormous compromises, in that it can only deal accurately with finite sets of definable phenomena, and it depends on existing musical assumptions about performance practice and instrument technology to supply the missing information (i.e. most of it!) (see the discussion in On Sonic Art). In contrast, the numerical abstraction involved in digital recording leaves nothing to the imagination or foreknowledge of the musician, but consequently conveys no abstracted information on the macro level.

THE REPRESENTATION OF SOUND
THE DUALITY OF TIME AND FREQUENCY

So far our discussion has focused on the representation of the actual wave-movement of the air by physical and digital analogues. There is, however, an alternative way to think about and to represent a sound.
Let us assume to begin with that we have a sound which is not changing in quality through time. If we look at the pressure wave of this sound it will be found to repeat its pattern regularly (see Appendix p3). Although this gives us accurate information about the time-variation of the pressure wave, perceptually we are more interested in the perceived properties of this sound. Is it pitched or noisy (or both)? Can we perceive pitches within it? In fact the feature underlying these perceived properties of a static sound is the disposition of its partials. It was Fourier (an Eighteenth Century French mathematician investigating heat distribution in solid objects) who realised that any particular wave-shape (or, mathematically, any function) can be recreated by summing together elementary sinusoidal waves of the correct frequency and loudness (Appendix p2). In acoustics these are known as the partials of the sound; they may be conceived of as the simpler vibrations from which the actual complex waveform is constructed. If we have a regular repeating wave-pattern we can therefore also represent it by plotting the frequency and loudness of its partials, a plot known as the spectrum of the sound (Appendix p3). The pattern of partials will determine whether the sound will be perceived as having a single pitch or several pitches (like a bell), or even (with singly pitched sounds) what its pitch might be. The spectral pattern which produces a singly pitched sound has (in most cases) a particularly simple structure, and is said to be "harmonic". (This should not be confused with the traditional notion of HArmony: we use double capitalisation for the latter, and none for the former; see Chapter 2.) The bell spectrum, in contrast, is known as an inharmonic spectrum.
We have thus converted the temporal information in our original representation to frequency information in a new representation. These two representations of sound are known as the time-domain representation and the frequency-domain representation respectively. The mathematical technique which allows us to convert from one to the other is known as the Fourier transform (and, to convert back again, the inverse Fourier transform), which is often implemented on computers in a highly efficient algorithm called the Fast Fourier Transform (FFT). If we now wish to represent the spectrum of a sound which varies in time (i.e. any sound of musical interest, and certainly any naturally occurring sound), we must divide the sound into tiny time-snapshots (like the frames of a film) known as windows. A sequence of these windows will show us how the spectrum evolves with time (the phase vocoder does just this: see Appendix). The frequency-domain representation gives us more precise information about the frequency of the partials in a sound the larger the time window used to calculate it; but in enlarging the window, we track less accurately how the sound changes in time. Conversely, a very tiny fragment of the time-domain representation (a fragment shorter than a wavecycle), although it gives us accurate information about the time-variation of the pressure wave, gives us no information about frequency: converting it to frequency data with the FFT will produce energy all over the spectrum. This trade-off between temporal information and frequency information is identical to the quantum mechanical principle of indeterminacy, where the time and energy (or position and momentum) of a particle/wave cannot both be known with accuracy: we trade off our knowledge of one against our knowledge of the other.
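The Fourier idea can be illustrated with a short sketch (Python with NumPy; the 200 Hz fundamental, the partial loudnesses and the one-second window are arbitrary illustrative choices): summing sinusoids at whole-number multiples of a fundamental yields a repeating waveform, and the FFT of that waveform recovers the frequencies and loudnesses of the partials, i.e. its spectrum.

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr                        # a one-second analysis window

# Build a repeating wave by summing partials at whole-number multiples
# of a 200 Hz fundamental, each with its own loudness.
fundamental = 200.0
amps = [1.0, 0.5, 0.33, 0.25]
wave = sum(a * np.sin(2 * np.pi * fundamental * (k + 1) * t)
           for k, a in enumerate(amps))

# Frequency-domain representation: the FFT recovers the partials.
# (A longer window gives finer frequency resolution, but blurs how the
# sound changes in time: the trade-off described in the text.)
spectrum = np.abs(np.fft.rfft(wave))
freqs = np.fft.rfftfreq(len(wave), d=1.0 / sr)
peaks = freqs[np.argsort(spectrum)[-4:]]
print(sorted(np.round(peaks).astype(int)))    # -> [200, 400, 600, 800]
```

The four strongest analysis frequencies fall exactly on the partials we put in; shortening the window below one cycle of the fundamental would smear this energy across the whole spectrum, as the text describes.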

--C.-.. "d".. ~ too ~i'd :00• . ! _ _ _ _ _ _ __ . pitch/noise. But a single wavecycle is not sufficient on its own to determine these properties.0). "t".. Hence there is a crucial perceptual boundary below which sounds appear as more or less undifferentiated clicks. the spectral and pitCh characteristics may not be easy to pin down. <:7 . With noisy sources. . Intuitively we can say that the grains are the loud part of the signal.'d Set the sound to zero wherever its absolute value falls below the level of the gate r.. soft-edged). lllis time is at least of grain time-frame proportions.. lllis process lengthens a sound by repeating each waveset (A waveset is akin to a wavecycle. certain drums have a focused spectrum which we would expect from a pitched sound (they definitely don't have noise spectra as in a hi-hat). ~ ~ tho '0_.g. onset characteristics (hard-edged. This perceptual boundary is nicely illustrated by the process of waveset time-stretching. each one is present long enough to establish a specific pitch and spectral quality and the original source begins to transfonn into a rapid stream of pitched beads... . the wavesel~ vary widely from one to another but we hear only the net result.th~y. r (\ • . The experience of a grain is indivisible. and in particular a wavecycle (a single wavelength of a sound). but not exactly the same thing.. e. " "" .j~tory. The instantaneous level of a sound signal constantly varies from positive to negative so.-' • . These may be regarded as the atomic units of sound.-. spectral contour. we have a grain. pitchy/noisy/gritty quality etc.--v --. the number of times per second a wavefonn is repeated. . (See Diagram 5). Often the grain is characterised by a unique cluster of properties which we would be hard pressed to classify individually but which enables us to group it in a particular type e. For more details see Appendix p55). For example. If we repeat each waveset five or six times however..I II' " Once we can perceive distinctive qualitative characteristics in a sound. 'The first significant object from a musical point of view is a shape made out of samples. we should be able to separate the individual grains. on reflection. yet no particular pitch may be discernible.. it will fall below the threshold and our procedure will chop the signal into its constituent half wavecycles or smaller units (see Diagram 4) . roo o u DIAGRAM 5 0 0 CJ 0 0.. Similarly. Although the internal structure of such sounds is the cause of what we hear. A Grain differs from any larger structure in that we cannot perceive any resolvable internal structure. o~ "J /:il>le V\TV V 17 . a sound of indefinite pitch.- - - -_ _ . and the points between grains the quietest parts. . What we must ask the instrument to do is search for a point in the signal where the signal stays below the (absolute) gate value for a significant length of time... However. Repeating each waveset lengthens the sound and introduces some artefacts. "p".not what we intended. or a rapidly portamentoing pitched spectrum. and above which we begin to assign specific properties (frequency. and use the absolute maximum II -WTffi7zi?n(~ 12I7ZTT!I/Ij/~ ~ ...g. spectrum etc) to the sound (for a fuller discussion of this point see On Sonic Art).Iil 'l'il' I·". 'The shape and duration of the wavecycle will help to determine the properties (the spectrum and pitch) of the sound of which it is a part. we do not resolve this internal structure in our perception. a single wavecycle supplies no frequency infonnation. 
DIAGRAM 4. we see that this procedure will not work.--Divide up the sound into finite length windows. tc."" I I'I' I: I I:!' '. imagine we wished to separate the grains in a sound (like a rolled "R") by examining the loudness trajectory of the sound. -1\ --A ii -~U nvA=~~ V ~.q-1Ljo' . regardless of their internal form.. unvoiced "k".~7/) tc. (Sound example 1. If we set up an instrument which waits for the signal level to drop below a certain value (a threshold) and then cuts out the sound (gating). The sound presenl~ itself to us as an indivisible unit with definite qualities such as pitch. as we might hear out in an inhannonic bell sound)." I. As pitch depends on frequency. _ n ~ -~~ ~~~ -~ . The boundary between the wavecycle time frame and the grain time-frame is of great importance in instrument design. at least twice in every wavecycle. I 'I II Ii -""~"'''<L . . Not until we have about six wavecycles do we begin to associate a specific pitch with the sound. Analysis of such sounds may reveal either a very short inhannonic spectrum (which has insufficient time to register as several pitches.I I' f rrV~~~~C:j~~.j~tory t~th.

in the technique known as brassage. Just as in traditional musical practice. we can transform sounds in such radical ways that we can no longer assert that the goal sound is related 10 the source sound merely because we have derived one from the other.05 seconds) we have enough information to detect its temporal evolution. In the simplest case. We have 10 establish a connection ill the experience of the listener either through clear spectral. jitter). The onset usually has the timescale and hence the indivisibility and qualitative unity of a grain and we win rerum to this later. The sound is no longer an indivisible grain: we have reached the sphere of Continuation. the recognition of a phrase as such. Longer sound events can often be described in terms of an onset or attack event and a continuation. (Sound example 1. This is because we are no longer dealing with clear cut perceptual boundaries. A detailed discussion of the typology of sounds can be found in Of! Sonic Art and in the writings of the Groupe de Recherches Musicales.Ito. or a clear path through a series of connecting sounds which gradually change their characteristics from those of the source. a sound with clear internal structure can be time-contracted imo a structurally irresolvable gl". or evolution of the spectrum. and we should achieve a time-stretching of the sound without changing its pitch (the hannoniser algorithm). or etc similarities between the sounds. Ths is particularly true when the apparent origin (physicality) of the gOal SOund is quite different to that of the source sound. II The Internal structure of grains and their indivisibility was brought home to me through worldng with birdsong. completely clear cut and interesting perceptual ambiguities occur if we aller the parameters of a process so that it crosses these time-frame thresholds. and low contrabassoon notes which are perceived partly as a rapid sequence of onsets) or segmented sounds (see below). to make a waveform twice as long. the apparelll origin (or physicality) of Ihe sound remains an important factor in our perception of the sound in whatever way it is derived. Eventually. each sound was found to be comprised of a rising scale passage followed by a brief portarnento! with coarse pellets) where the sound is clearly made up of many individual randomised attacks. We may look at this in another way. If the segment~ are in the grain time-frame.2).lnsformed or synthesized sounds. In the case of tr.pitch will be preserved. For example. (Sound example 1. a sound event can be arbitrarily complex. to those of the goal. we reach the sphere of the Phrase. for example. but don't overlap them (so much) in the resulting sound. Conversely. As our time-frame lengthens. perception of continuation may be concerned with the morphology (changing shape) of the spectrum. A similar ambiguity applies as we pass further up the time-frame ladder towards larger scale musical entities. when slowed down to 1I8th speed. However. less clear cut. (Sound example 1. 18 19 . These nested time-frames are the basis of our perception of both rhytilm and larger scale form and this is more fully discussed in Chapter 9. If the segments are longer than grains. Here. will depend on musical conlext and tile fluidity of our new medium will allow us to shrink. Here. We can however construct a series of nested time-frames up to the level of the duration of an entire work. continuation) are not. perceptual descriptions become more involved and perceptual boundaries. We mighl. 
The boundaries between these time-frames (wavecycle, grain, continuation) are not, however, completely clear cut, and interesting perceptual ambiguities occur if we alter the parameters of a process so that it crosses these time-frame thresholds. For example, gradually time-stretching a grain gradually makes its internal structure apparent (see Chapter 11), so we can pass from an indivisible qualitative event to an event with a clearly evolving structure of its own. Conversely, a sound with clear internal structure can be time-contracted into a structurally irresolvable grain. (Sound example 1.1.)

The grain/continuation time-frame boundary is also of crucial importance when a sound is being time-stretched, and this will be discussed more fully in Chapter 11. In the simplest case, as in tape-speed variation, to make a sound last twice as long we make the waveform itself twice as long, and our sound will drop by an octave. In the technique known as brassage, by contrast, we chop up a sound into tiny segments and then splice these back together again. If we retain the order of the segments, using overlapping segments from the original sound but not overlapping them (so much) in the resulting sound, we will clearly end up with a longer sound (Appendix p44). If the segments are in the grain time-frame, the instantaneous pitch will be preserved, as the waveform itself will be preserved intact, and we should achieve a time-stretching of the sound without changing its pitch (the harmoniser algorithm). If we try to make segments smaller than the grain-size then, attempting to time-stretch by a factor of 2, we will in fact resplice together parts of the waveform itself, and we will destroy the signal, because the splices (cross-fading) between each segment will be so short as to break up the continuity of the source and destroy the signal characteristics. If the segments are longer than grains, their internal structure will be heard out and we will begin to notice echo effects as the perceived continuations are heard to be repeated. Eventually, as the segment size becomes very much larger than the grain time-frame, we will produce a collage of motifs or phrases cut from the original material. (Sound example 1.2.) (A sketch of this segment-splicing process appears at the end of this section.)

But if a sound persists beyond a further time limit (around .05 seconds) we have enough information to detect its temporal evolution. The sound is no longer an indivisible grain: we have reached the sphere of Continuation, the next important time-frame after the grain. As our time-frame lengthens, we become aware of movements of pitch or loudness. If the spectrum has continuity, perception of continuation may be concerned with the articulation of pitch (vibrato, jitter), of loudness (tremolo), of spectral contour or formant gliding (see Chapter 3), or with the morphology (changing shape) or evolution of the spectrum. The continuation may, of course, be discontinuous, as in iterative sounds (grain-streams such as a rolled "R", or low contrabassoon notes which are perceived partly as a rapid sequence of onsets), segmented sounds (see below), or granular texture streams (e.g. maracas with coarse pellets) where the sound is clearly made up of many individual randomised attacks. Here, variation in grain or segment speed or density, and in grain or segment qualities, will also contribute to our aural experience of continuation. A detailed discussion of the typology of sounds can be found in On Sonic Art and in the writings of the Groupe de Recherches Musicales. (Sound example 1.3.)

Once we reach the sphere of continuation, perceptual descriptions become more involved and perceptual boundaries become less clear cut. This is because we are no longer dealing with clear-cut perceptual thresholds but with questions of the interpretation of our experience. As our time-frame enlarges further, we reach the sphere of the Phrase. Just as in traditional musical practice, whereas it will usually be quite clear what is a note event and what is a phrase (ornaments and trills blurring this distinction), the boundary between a long articulation and a short phrase is not easy to draw. A trill lasting over four bars may be regarded as a note-articulation (an example of continuation) and may exceed in length a melodic phrase; but a trill with a marked series of loudness and speed changes might well function as a musical phrase (depending on context). A similar ambiguity applies in the sound domain, with an added difficulty: the recognition of a phrase as such will depend on musical context, and the fluidity of our new medium will allow us to shrink, or expand, a sound from one time-frame to another. We might, for example, start with a spoken sentence, a phrase-time-frame object, then time-shrink it to become a sound of segmented morphology (see Chapter 11). A similar ambiguity applies as we pass further up the time-frame ladder towards larger-scale musical entities. We can, however, construct a series of nested time-frames up to the level of the duration of an entire work. These nested time-frames are the basis of our perception of both rhythm and larger-scale form, and this is more fully discussed in Chapter 9.
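Returning to the brassage process described above, here is a minimal sketch (Python with NumPy; the segment duration, overlap and Hanning crossfade are illustrative choices, not the parameters of any particular harmoniser): segments are read from the source at one rate and written to the output at a slower one, overlapped and crossfaded, so the result is longer while the waveform inside each segment, and hence the perceived pitch, is kept intact.

```python
import numpy as np

def brassage_stretch(signal, sr, stretch=2.0, seg_dur=0.05, overlap=0.5):
    """Time-stretch by splicing overlapping segments of the source back
    together at a slower rate; each segment's waveform is unchanged, so
    the perceived pitch is preserved."""
    seg = int(seg_dur * sr)
    hop_out = int(seg * (1 - overlap))       # spacing of segments in the output
    hop_in = int(hop_out / stretch)          # spacing of read points in the source
    env = np.hanning(seg)                    # crossfade envelope for each splice
    out = np.zeros(int(len(signal) * stretch) + seg)
    pos_in = pos_out = 0
    while pos_in + seg <= len(signal) and pos_out + seg <= len(out):
        out[pos_out:pos_out + seg] += signal[pos_in:pos_in + seg] * env
        pos_in += hop_in
        pos_out += hop_out
    return out
```

Pushing `seg_dur` below the grain time-frame makes the crossfades destroy the waveform continuity, exactly the failure described above; pushing it well above the grain time-frame turns the result into a collage of recognisable fragments.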
THE SOUND AS A WHOLE: PHYSICALITY AND CAUSALITY

Most sounds longer than a grain can be described in terms of an onset (or attack event) and a continuation. The onset usually has the timescale, and hence the indivisibility and qualitative unity, of a grain, and we will return to this later. In fact, of course, a sound event can be arbitrarily complex. Here I would like to draw attention to two aspects of our aural experience. First, the way in which a sound is attacked and continues provides evidence of the physicality of its origin. As Pierre Schaeffer was at pains to stress, once we begin working with sounds as our medium the actual origin of those sounds is no longer of any concern; but we still gain an impression of an imagined origin of the sound. In the case of transformed or synthesized sounds this evidence will be misleading in actuality, yet the apparent origin (or physicality) of the sound remains an important factor in our perception of the sound, in whatever way it is derived. This is particularly true in the era of computer sound transformation. We can transform sounds in such radical ways that we can no longer assert that the goal sound is related to the source sound merely because we have derived one from the other. We have to establish a connection in the experience of the listener, either through clear spectral, morphological or other similarities between the sounds, or through a clear path through a series of connecting sounds which gradually change their characteristics from those of the source to those of the goal. This is particularly true when the apparent origin (physicality) of the goal sound is quite different to that of the source sound.

[Footnote: The internal structure of grains and their indivisibility was brought home to me through working with birdsong. A particular song consisted of a rapid repeated sound having a "bubbly" quality; when slowed down to 1/8th speed, each sound was found to be comprised of a rising scale passage followed by a brief portamento! One might presume from this that the individual sounds were internally portamentoed at a rate too fast to be heard.]

The onset or attack of a sound is always of great significance, if only because it is the moment of greatest surprise, when we know nothing about the sound that is to evolve. In general, the onset properties of a sound give us some inkling of the cause of that sound: a physical blow, a scraping contact, a vocal utterance, a movement, etc. Sounds may be activated in two ways: by a single physical event (e.g. a drum-stroke, a vocal click, a striking blow) or by a continuous physical event (blowing, bowing, scraping). In the first case the sound may be internally damped, producing a short sound (a xylophone note, producing wood-like attacks), or it may be permitted to resonate through an associated physical system (a cello soundbox for a pizzicato note, a resonant hall for a drum stroke). The material itself may have internal resonating properties (bells, gongs, metal tubes), producing a gradually attenuated continuum of sound; or the material may be loosely bound and granular (the sand or beads in a Shaker or Wind-machine), giving the sound a diffuse continuation; or the material may be only gently coaxed into motion (the air in a "Bloogle", the shells in a Rainmaker), giving the sound a soft onset. In the case of continuous excitation of a medium, the medium may vibrate at a frequency related to the excitation force (a rotor-driven siren, or the human voice in some circumstances), so that a varying excitation force varies the pitch; or it may resonate, producing a steady pitch which varies in loudness with the energy of the excitation (e.g. Tuba, Flute, Violin); or the contact between exciting force and vibrating medium may be discontinuous, producing an iterated sound (rolled "R", drum roll etc.). The vibrating medium itself may be elastically mobile (a flexatone, a flexed metal sheet, a mammalian larynx), so that the pitch or spectrum of the sound varies through time. Resonating systems will stabilise after a variety of transient or unstable initiating events (flute breathiness, coarse hammer blows to a metal sheet, the opening of a filter, etc.), so that a complex and disconnected onset leads to a stable or stably evolving spectrum. During the continuation phase of the sound, any property (pitch, loudness, spectral contour etc.) may be (relatively) stable, or it may be in motion (portamento, crescendo, vibrato, tremolo, spectral vibrato); furthermore, the motion may itself be in motion: a cyclically varying pitch (vibrato) may be accelerating in cycle-duration while shrinking in pitch-depth, and so on. (See Diagram 6.)

It is possible to give the most unlikely sounds an apparent vocal provenance by very carefully splicing a vocal onset onto a non-vocal continuation: the vocal "causality" in the onset can adhere to the ensuing sound in the most unlikely cases. (Sound example 1.4.) Transforming the characteristics of a sound-source automatically involves transforming its perceived physicality, and this may be an important feature to bear in mind in sound composition, as when vocal sounds are progressively distorted to sound like unpitched drum sounds.

I am not suggesting that we consciously analyse our aural experience in this way, only that our aural experience is grounded in physical experience in a way which is not necessarily consciously understood or articulated. Nor do I mean that we see pictures of physical objects when we hear sounds. On the contrary, aural experience is so important to us that we already have an intuitive knowledge (see earlier) of the physicality of sound-sources.
STABILITY & MOTION: REFERENCE FRAMES

Thus, in the continuum of sonic possibilities, we are already equipped with an intuitive sense of the physicality of sounds. But in our perception of our sound universe there is another factor to be considered: that of the reference-frame. We may choose to (it is not inevitable!) organise a sound property in relation to a set of reference values. These provide reference points, or nameable foci. Usually, such reference frames refer to stable properties of the sonic space, but this is not universal.

In sound composition, for the first time, the entire continuum of sound possibilities is open to us, and types of motion are as accessible as static states. With computer technology, time-varying aspects of a sound-event can be tied down with precision, reproduced, and transferred to other sounds. We have sophisticated control over sonic motion in many dimensions (pitch, loudness, spectral contour etc.) and can begin to develop a discipline of motion itself (see Chapter 13). We can, for example, reproduce the direction, duration and range of a rising portamento and, in practice, differentiate a start-weighted from an end-weighted portamento, making very subtle distinctions of motion. But the reproduction of a complex portamento trajectory (see Diagram 7) with any precision would be difficult merely from immediate memory: our experiences of sonic change are often fleeting and only inexactly reproducible, and this difficulty of immediate reproducibility makes repetition in performance very difficult, and therefore acquaintance and knowledge through familiarity impossible to acquire.
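As an illustration of how such a trajectory can be tied down with precision (a hedged sketch in Python with NumPy; the breakpoint format and function name are illustrative, not those of any CDP program), a portamento can be stored as a breakpoint function of time against frequency and resynthesised, reversed or rescaled exactly:

```python
import numpy as np

def portamento(breakpoints, duration, sr=44100):
    """Render a pitch trajectory given as (time, frequency) breakpoints.
    The instantaneous frequency is integrated into phase, so the same
    trajectory can be reproduced or transferred exactly."""
    times = np.arange(int(duration * sr)) / sr
    bp_t, bp_f = zip(*breakpoints)
    freq = np.interp(times, bp_t, bp_f)          # frequency at every sample
    phase = 2 * np.pi * np.cumsum(freq) / sr
    return np.sin(phase)

# an end-weighted rising portamento: slow start, rapid final rise to the goal
rise = portamento([(0.0, 220.0), (0.8, 260.0), (1.0, 440.0)], 1.0)
```

Swapping or rescaling the breakpoints turns the same gesture into a start-weighted or inverted portamento, which is precisely the kind of exact reproduction that immediate memory cannot supply.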
In a similar and not easily disentangled way, we are equipped in the sphere of language perception. At the level of the Phoneme, any particular language provides a phonetic reference-frame distinguishing those vowel and consonant types to be regarded as different (and hence capable of articulating different meanings) within the continuum of possibilities. These distinctions may be subtle ("D" and "T" in English) and are often problematic for the non-native speaker (English "L" and "R" for a Japanese speaker). Consonants like "W" and "Y" (in English) and various vowel diphthongs are in fact defined by the motion of their spectral contours (formants: see Appendix p3). Thus sounds with moving spectral contours tend to be classified alongside sounds with stable spectral contours in the phonetic classification system of any particular language. Such distinctions are also vital to our comprehension, at the phonemic and semantic level, of pitch motion (in tone languages like Chinese) and of "tone-of-voice". In many non-Western musical cultures, subtle control of such distinctions (portamento type, vibrato speed and depth etc.) are required skills of the performance practice.

Cyclical motions of various kinds (tremolo, vibrato, spectral vibrato etc.) are often regarded as stable properties of a sound. We may be concerned with stable properties, or with the nature of motion itself, and these aspects of sound properties are usually perceptually distinguishable: we may, for example, spectrally time-stretch a vocal sound and change its loudness trajectory (enveloping). In general, however, reference frames for motion types are not so well understood, and they certainly do not feature strongly in traditional Western art music practice. The separation of static and moving properties is exaggerated in Western Art Music by the use of a discontinuous notation system which notates static properties well, but moving properties much less precisely (for a fuller discussion see On Sonic Art).

[Diagram 6: a sound property may be organised (a) with static properties on a reference frame, (b) with static properties without a reference frame, or (c) with properties of motion.]
[Diagram 7: pitch trajectories: a start-weighted portamento, an end-weighted portamento, and a complex portamento trajectory.]

We can see the same mixed classification in the Indian raga system, where a raga is often defined in terms of both a scale (of static pitch values) and various motivic figures, often involving sliding intonations (motion-types). Time reference frames, which enter into our perception of rhythm, are particularly contentious in musical circles, and I will postpone further discussion of these until Chapter 9.

A reference frame gives a structure to a sonic space, enabling us to say "I am here" and not somewhere else. Traditional Western music practice is strongly wedded to pitch reference frames; in fact, on many instruments (keyboards, fretted strings) they are difficult to escape. In general, values lying off the reference frame may only be used as ornamental or articulatory features of the musical stream: they have a different status to that of pitches on the reference frame (cf. "blue notes" in Jazz, certain kinds of baroque ornamentation etc.). Pitch reference frames also have special properties. Because we normally accept the idea of octave equivalence (doubling the frequency of a pitched sound produces the "same" pitch, one octave higher), pitch reference frames are cyclic, repeating themselves at each octave over the audible range, so that we have only one "octave" of possibilities; yet we are still able to recognise where we are within it. If the octave is divided by a mode or scale into an asymmetrical set of intervals, we can tell where we are from a small set of notes, without hearing the key note, because the interval relationships between the notes orient us within the set of intervals making up the scale. We cannot do this trick, however, with a completely symmetrical scale (the whole-tone scale, the chromatic scale) without some additional clues: we cannot orient ourselves within the scale (see Diagram 8). Cyclic repetition over the domain of reference, and the notion of interval, are specific to pitch and time reference-frames.

[Diagram 8: notes placed on a semitone ladder are fitted against scale templates. Against the minor scale semitone template the pattern fits at only one place, so we can identify the second note as the minor 3rd of the scale; against the major scale template it likewise fits at only one place, so we can identify the second note as the tonic of the scale; against the whole-tone scale the pattern fits in many places, and we cannot orient ourselves within the scale.]
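Diagram 8's point can be expressed in a few lines (an illustrative Python sketch, not an analysis tool from the book): count how many transpositions of a scale template accommodate a given set of pitch classes. An asymmetrical template such as the major scale admits only one fit for this pattern, so the notes orient us; the symmetrical whole-tone scale admits a fit at every one of its transpositions, so they cannot.

```python
MAJOR = {0, 2, 4, 5, 7, 9, 11}        # semitone template of the major scale
WHOLE_TONE = {0, 2, 4, 6, 8, 10}

def fitting_transpositions(notes, template):
    """Transpositions (0-11 semitones) at which every note of the pattern
    lands on a degree of the scale template."""
    return [shift for shift in range(12)
            if all((n - shift) % 12 in template for n in notes)]

pattern = [0, 2, 6]                    # C, D, F-sharp as pitch classes
print(fitting_transpositions(pattern, MAJOR))       # -> [7]   : one fit, we are oriented
print(fitting_transpositions(pattern, WHOLE_TONE))  # -> [0, 2, 4, 6, 8, 10] : ambiguous
```

The single major-scale fit corresponds to hearing the pattern unambiguously within one key; the many whole-tone fits correspond to the disorientation described above.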
Contrast this with the vowel space, where we have a sense of absolute position: we can tell where we are in the vowel space without reference to other vowel sounds. People with perfect pitch can deal with pitch space in this absolute sense, but for most of us there is only some general conception of high and low, and we determine our position relative to a given pitch, using the notion of interval.

It is particularly important to understand that we can have pitch, and even stable pitch, without having a stable reference frame, and hence without having a HArmonic sense of pitch in the traditional sense (see Diagram 9). In sonic composition, furthermore, we can transform the reference frame itself through time, and we can move onto, or away from, such a frame. Computer control permits the very precise exploration of this area of new possibilities. We can even imagine establishing a reference frame for pitch-motion types without having a HArmonic frame: this we already find in the tone languages of China and Africa.

[Diagram 9: stable pitches defining a reference frame; stable pitches not defining a reference frame (every pitch is different).]

THE MANIPULATION OF SOUND: TIME-VARYING PROCESSING

Just as I have been at pains to stress that sound events have several time-varying properties, it is important to understand that the compositional processes we apply to sounds may also vary in time in subtle and complex ways. In a tradition dominated by the notation of fixed values (pitch, loudness level, duration etc.), it is easy to imagine that processes themselves must have fixed values. In fact, the apparently fixed values we dictate in a notation system are turned into subtly varying events by musical performers. Where this is not the case (e.g. quantised sequencer music on elementary synthesis modules) we quickly tire of the musical sterility. The same fixed-value conception was transferred into acoustic modelling and dominated early synthesis experiments, and for this reason early synthesized sounds governed by fixed values (or simply-changing values) suffered from a lack of musical vitality. In general, any parameter we give to a process (pitch, loudness, speed, filter centre, duration, density etc.) should be replaceable by a time-varying quantity, and this quantity should be continuously variable over the shortest time-scales.

Transforming a sound does not necessarily mean changing it in an all-or-nothing way. A time-varying process can, of course, effect a radical transformation within a sound: for example, a gradually increasing (but subtly varying) vibrato applied to a fairly static sound, a gradual spectral stretching of a long sound causing its pitch to split into an inharmonic stream, or a process gradually giving a sound a vocal quality, and so on. Dynamic spectral interpolation in particular (see Chapter 12) is a process which depends on the subtle use of time-varying parameters. In a similar fashion, an "effects unit" used as an all-or-nothing process on a voice or instrument is simply that: an "effect", which may occasionally be appropriate (to create a particular general acoustic ambience, for example) but which will not sustain our interest as a compositional feature until we begin to articulate its temporal evolution.

THE MANIPULATION OF SOUND: RELATIONSHIP TO THE SOURCE

The infinite malleability of sound materials raises another significant musical issue. As we can do anything to a signal, we must decide by listening whether the source sound and the goal sound of a compositional process are in fact at all perceptually related, or at least whether we can define a perceptible route from one to the other through a sequence of intermediate sounds. In score-based music there is a tradition of claiming that a transformation which can be explained, and whose results can be seen in a score, by definition defines a musical relationship. This mediaeval textual idealism is out of the question in the new musical domain. An instrument which replaces every half-waveset by a single pulse equal in amplitude to the maximum amplitude of that half-waveset is a perfectly well-defined transformation, but it reduces most sounds to a uniform crackling noise. For complex real-world sounds, each crackling signal is unique at the sample level and uniquely related to its source sound.
Yet no musical relationship has thereby been established between the source sound and the transformed sound.
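The half-waveset-to-pulse process mentioned above can be written down directly (a sketch in Python with NumPy of the general idea, not the CDP waveset-distortion code): each run of samples of constant sign is replaced by a single pulse of its peak amplitude, a perfectly well-defined transformation that nevertheless reduces most sources to a similar crackle.

```python
import numpy as np

def halfwavesets_to_pulses(signal):
    """Replace every half-waveset (a run of samples of constant sign,
    bounded by zero crossings) with a single pulse whose height is the
    peak amplitude of that half-waveset."""
    out = np.zeros_like(signal)
    sign = np.sign(signal)
    start = 0
    for i in range(1, len(signal) + 1):
        if i == len(signal) or sign[i] != sign[start]:
            chunk = signal[start:i]
            peak = chunk[np.argmax(np.abs(chunk))]    # signed peak of this run
            out[start] = peak                         # one pulse per half-waveset
            start = i
    return out
```

Because almost nothing of the source survives except the timing of its half-wavesets, the result is uniquely derived from the source at the sample level yet, as argued above, bears no audible relationship to it.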

In sound composition a relationship between two sounds is established only through aurally perceptible similarity or relatedness, regardless of the methodological rigour of the process which transforms one sound into another. These issues were discussed by Alain Savouret at the International Computer Music Conference at IRCAM in 1985. An important aspect of Savouret's talk was to distinguish between source-focused transformations (where the nature of the resulting sound is strongly related to the input sound, e.g. time-stretching of a signal with a stable spectrum, retaining the onset unstretched) and process-focused transformations (where the nature of the resulting sound is more strongly determined by the transformation process itself, e.g. pitch parallelism via the harmonizer, or the use of a very short digital delay of a signal, superimposed on the non-delayed signal, to produce delay-time-related pitched ringing). Often when a new compositional technique emerges there is an initial rush of excitement to explore the new sound possibilities, but such process-focused transformations can rapidly become clichés. In general, therefore, process-focused transformations need to be used sparingly. Transformations focused in the source retain the same infinite potential that the infinity of natural sound sources offers us: sound-processing procedures which are sensitive to the evolving properties (pitch, loudness, spectral form and contour etc.) of the source-sound are those most likely to bring rich musical rewards. There is, of course, an interesting area of ambiguity between the two extremes.

CHAPTER 2
PITCH

WHAT IS PITCH?

Certain sounds appear to possess a clear quality we call pitch. A flute, violin or trumpet, when played in the conventional way, produces sounds that have the quality "pitch", with a certain spectral quality or "timbre", whereas a cymbal or a snare-drum has no such property, and a bell may appear to contain several pitches. Pitch arises when the partials in the spectrum of the sound are in a particularly simple relation to one another: they are all whole number multiples of some fixed value, known as the fundamental. In mathematical terms, the fundamental frequency is the highest common factor (HCF) of the partials; thus the partial frequencies 200, 300, 400 and 500 are all whole number multiples of 100, which is their fundamental. The frequency of this fundamental determines the "height" or "value" of the pitch we hear. In most cases the fundamental frequency of such a sound is present in the sound as the frequency of the lowest partial, but this is not necessarily true (e.g. the lowest notes of the piano do not contain any partial whose frequency corresponds to the perceived pitch). It is important to understand, therefore, that the perceived pitch is a mental construct from a harmonic spectrum, and not simply a matter of directly perceiving a fundamental frequency in the spectrum: such a frequency may not be physically present. What happens when the partials are not in this simple relationship is discussed in the next Chapter.

The most important feature of pitch perception is that the spectrum appears to fuse into a unitary percept, that of pitch. This fusion is best illustrated by undoing it, through a phenomenon known as aural streaming. When sounds from two different sources enter our ears simultaneously we need some mechanism to disentangle the partials belonging to one sound from those belonging to the other. One way in which the ear is able to process the data relies on the micro-instabilities (jitter) in pitch and loudness which all naturally occurring sounds exhibit. The partials derived from one sound will all jitter in parallel with one another, while those from the other sound will jitter differently, but also in parallel with one another. This provides a strong clue for our brain to assign any particular partial to one or other of the source sounds. For example, if we play a (synthesized) voice sound by placing the odd harmonics on the left loudspeaker and the even harmonics on the right loudspeaker, we will hear one vocal sound between the two loudspeakers. In this loudspeaker experiment we have removed the streaming clue by maintaining synchronicity of microfluctuations between the partials coming from the two loudspeakers; hence the ear does not unscramble the data into two separate sources, and the voice remains a single percept.
When a spectrum has this structure it is said to be harmonic, and the individual partials are known as harmonics. (Sound example 2.1.) But this must not be confused with the notion of "harmony" in traditional music. If we now gradually add a different vibrato to each set of partials in our loudspeaker experiment, the sound image will split: the ear is now able to group the two sets of data into two aural streams and to assign two different source sounds to what it hears.
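The loudspeaker experiment can be simulated directly (a hedged Python/NumPy sketch of the principle; the jitter depth, the 200 Hz fundamental and the number of harmonics are arbitrary choices): odd harmonics go to the left channel and even harmonics to the right. With a single shared micro-fluctuation the two channels fuse into one percept; with independent fluctuations the ear streams them into two sources.

```python
import numpy as np

sr, dur, f0 = 44100, 2.0, 200.0
t = np.arange(int(sr * dur)) / sr

def jitter(seed, depth=0.01):
    """A slow random micro-fluctuation of pitch (about +/- 1 percent)."""
    rng = np.random.default_rng(seed)
    slow = np.repeat(rng.standard_normal(int(dur * 20)), int(sr / 20))[:len(t)]
    return 1.0 + depth * slow

def partial(k, flux):
    """The k-th harmonic, frequency-modulated by the given fluctuation."""
    return np.sin(2 * np.pi * f0 * k * np.cumsum(flux) / sr) / k

shared = jitter(0)
fused = np.stack([sum(partial(k, shared) for k in (1, 3, 5, 7)),      # left: odd
                  sum(partial(k, shared) for k in (2, 4, 6, 8))])     # right: even
split = np.stack([sum(partial(k, jitter(1)) for k in (1, 3, 5, 7)),
                  sum(partial(k, jitter(2)) for k in (2, 4, 6, 8))])
# `fused` is heard as one source between the speakers; `split` streams apart.
```

In the `split` case the microfluctuations within each channel still run in parallel, so each channel fuses into its own percept, which is the splitting described in the text and heard in the next sound example.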

Once this exact relationship between the partials is disturbed, the spectral fusion is disturbed (see the next Chapter). In our loudspeaker experiment, once the two sets of partials fluctuate differently, the odd harmonics, say 300, 500, 700 and 900, will continue to imply a fundamental of 100 cycles, but will take on a clarinet-type quality (clarinets produce only the odd harmonics in the spectrum) and move into one of the loudspeakers. The remaining harmonics, 200, 400, 600 and 800, will appear to emanate from the other loudspeaker and will be interpreted as having a fundamental at 200 (as 200 is the HCF of 200, 400, 600 and 800), and hence as a second "voice" an octave higher. Each spectrum is integrated with itself because its internal microfluctuations run in parallel over all its own partials, but these microfluctuations are different from those in the other spectrum. In this way we have generated two pitch percepts from a single pitch percept. (Sound example 2.2.)

SPECTRAL AND HARMONIC CONCEPTIONS OF PITCH

Our definition of pitch leads us into conflict both with traditional conceptions and traditional terminology. To put it simply, to say that a spectrum is harmonic, i.e. is a harmonic spectrum (which is the preferred scientific description), is to say that the partials are exact multiples of the fundamental, and this is the source of the perceptual fusion. But this must not be confused with the notion of "harmony" in traditional music. An "A" and a "C" played on two flutes (or on one piano) are two distinct sound events, each having its own integrated spectrum. Within a single spectrum, partials might roughly correspond to an A and a C (but in exact numerical proportions, unlike in the tempered scale), yet they will also have exactly parallel microfluctuations and hence fuse in our unitary perception of a much lower fundamental (e.g. an F two octaves below). Most of the relationships we deal with between pitches in traditional Western "harmony" are between frequencies that do not stand in this simple relationship to one another (because our scale is tempered); they are approximations to whole number ratios which "work" in the functional context of Western harmonic musical language. Moreover, they are relationships between the averaged properties of sounds: there are not different kinds of "harmony" within the spectrum. We can draw analogies between these two domains (as some composers have done), but they are perceptually quite distinct. Being able to assign a pitch to a specific class like E-flat or C#2 is quite a different concept from the timbral concept of pitch described here. To avoid confusion, we will try to reserve the words "HArmony" and "HArmonic" (capitalised as shown) for the traditional concern with relations amongst notes in Western art music, and we will refer to a spectrum having pitch as a pitch-spectrum, or as having harmonicity. This being said, the term "harmonic" may occasionally be used in the spectral sense as a contrasting term to "inharmonic". We will refer to the traditional concept as Hpitch, an abbreviation for pitch-as-related-to-conventional-harmony. This should be borne in mind while reading this Chapter, as it is all too easy for musicians to slide imperceptibly into thinking of pitch as Hpitch!

A good example in traditional musical practice of pitch not treated as Hpitch can be found in Xenakis' Pithoprakta, where portamenti are organised in statistical fields governed by Poisson's formula, or in portamenti of portamenti. It is difficult to speak of a portamento as "having a pitch" in the sense of conventional harmony; nevertheless a portamento is pitched in the spectral sense, because a spectrum in motion may still preserve the simple whole-number relationship between its constituent partials. This sense of "having a pitch" does not necessarily (or usually) mean that we are assigning an Hpitch, or a set of Hpitches, to the event. Perception of Hpitch depends on the existence of a frame of reference (see Chapter 1). Even with steady (non-portamento) pitches we may still have no sense of Hpitch if the pitches are selected at random from the continuum of values, with no scale, mode or HArmonic field entering into our thinking. Once we confine our selection of pitches to a given reference frame (scale, mode, HArmonic field) we establish a clear sense of Hpitch for each event, and we will assign Hpitch to the events, though often our cultural predispositions cause us to "force" the notes onto our preconceived notion of where they "ought" to be. In the sound example we hear first a set of truly random pitches, next a set of pitches on a HArmonic field, then a set of pitches approximable to a HArmonic field, and finally the 'same' set locked onto that HArmonic field. (Sound examples 2.3 and 2.4.) Similarly, we can generate a perception of a single, if unstable, pitch from an event containing very many different pitches scattered over a narrow band: the narrower the band, the more clearly a single pitch is defined. (See On Sonic Art.)

PITCH-TRACKING

It seems odd so early in this book to tackle what is perhaps one of the most difficult problems in instrument design: pitch detection, on whose difficulties whole treatises have been written. The main problem for the instrument designer is that the human ear is remarkably good at pitch detection, and even the best computing instruments do not quite match up to it. However, we can make a reasonably good attempt in most cases. Working in the time domain, we will recognise pitch if we can find a cycle of the waveform (a wavelength) and then correlate it with similar cycles immediately following it; this is pitch-tracking by auto-correlation (Appendix p70). The pitch is then simply one divided by the wavelength. Using our new computer instruments it therefore becomes possible to follow and extract the pitch from a sound event. The pitch flow of a speech stream can be followed, extracted and applied to an entirely different sound object, establishing a definite relationship between the two without any conception of Hpitch or scale. (Sound examples 2.7 and 2.8.)
Paul de Marinis uses pitch extraction on natural speech recordings. Once this exact relationship is disturbed. Hence. regarding the portamenti as ornaments or aniculations to those Hpitches.~ a harmonic spectrum (which is the preferred scientJfic deSCription). 200. In the sound example we hear first a set of truly random pitches. We can. unlike in the tempered scale) but will also have exactly parallel microl1uctuations and hence fuse in our unitary perceplion of a much lower fundamental (e. Once we confine our selection of pitches to a given reference frame (scale. This being said. This should be borne in mind while reading this Chapter as it is all too easy for mUSicians to slide imperceptibly imo thinking of pitch as Hpilch! A good example in traditional musical practice of pitch not treated as Hpitch can be found in Xenakis' Pithoprakta where ponamenti are organised in statistical fields governed by Poisson's fonnula or in portamenti of portamenti.. This is pitch-tracking by auto-correlation (Appendix p70). establishing a definite relatJonship between the two without any conception of Hpitch or scale.8).Sa). (Sound example 2. An" A" and a ·C" played on two flutes (or on one piano) are two distinct sound events. We can draw analogies between these two domains (as some composers have done) but they are perceptually quite distinct. we will try to reserve the wonis "HArmony" and "HArmonic" (capitalised as shown) to apply to the traditional concern with relations amongst notes in Western art music and we will refer to a spectrum having pildt as a pitch-spectrum or as having harmonicity. To put it simply. Even with steady (non-portamento) pitches. 700. we will assign iIpitch to the events. The main problem for the instrument designer is (hat the human ear is remarkably good at pitch detection and even the best computing instrumentS do not quite match up to it. The narrower the band. with no change of spectral content. Using our new computer instruments it becomes possible to follow and extract the pitch from a sound event. we can make a reasonably good attempt in most cases. is to say that the partials are exact multiples of the fundamental and this is the source of the perceptual fusion. HArmonic field) we establish a clear sense of Hpitch for each event.g. in fact. mode. if they are onset-weighted and the onsets located in a fixed HArmonic field. (Sound example 2. this spectral fusion is disturbed (see next Chapter). 900. being able to assign II pitch to a specific class like E-f1a( or C#2 is quite a different concept from the timbral concept of pitch descrihed here. They are approximations to whole number ratios which ·work" in the functional context of Western hannonic musical language. 600. extracted and applied to an entirely different sound object. will not be perceived to be Hpitched. The pitch is then simply one divided by the wavelength. PITCH-TRACKING It seems odd so early in this book to tackle what is perhaps one of the most difficult problems in illstrument design.. 600 and 800) and hence a second ·voke" an octave higher.e. mode or HAnnonic field entering into our thinking. Nevenheless all these sounds are pitched in the spectral sense. 800. Whole treatises have been written on pitch. (Sound example 2.7). an abbreviation for pitch-as-related-to-conventional-harmony. (Sound example 2. panials might roughly correspond to an A and a C (but in exact numerical proportions. will be perceived as anacrusial ornaments to Hpitches. Thus. 
it is possible to relate such moving pitches to Hpitch if the portamenti have special properties.what it he3rs.Sc).2). (See OT! Sonic Art).
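Since the text only sketches auto-correlation in words, a minimal illustration may be useful. The sketch below is not a CDP instrument; it is a bare-bones, single-frame pitch estimate written in Python with the NumPy library, and the frame, sample rate and search limits are all hypothetical choices.

    import numpy as np

    def autocorrelation_pitch(frame, sample_rate, fmin=50.0, fmax=1000.0):
        """Estimate the pitch of one short frame by auto-correlation.

        Correlate the frame with delayed copies of itself and pick the lag
        (the candidate wavelength) with the strongest correlation.
        """
        frame = frame - np.mean(frame)                  # remove DC offset
        ac = np.correlate(frame, frame, mode="full")    # full auto-correlation
        ac = ac[len(ac) // 2:]                          # keep non-negative lags

        min_lag = int(sample_rate / fmax)               # shortest wavelength allowed
        max_lag = int(sample_rate / fmin)               # longest wavelength allowed
        lag = min_lag + np.argmax(ac[min_lag:max_lag])  # best-matching cycle length

        # The pitch is one divided by the wavelength (the lag, in seconds).
        return sample_rate / lag

    # Hypothetical usage: a 440 Hz test tone should come back close to 440.
    sr = 44100
    t = np.arange(2048) / sr
    print(autocorrelation_pitch(np.sin(2 * np.pi * 440 * t), sr))

A real pitch-tracker would add octave-error checks and confidence thresholds; this sketch only shows the core idea of matching a cycle against the cycles that immediately follow it.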

We can also attempt pitch-tracking by partial analysis (Appendix p71). In the frequency domain we have a sequence of (time) windows in which the most significant frequency information has been extracted into a number of channels (as in the phase vocoder). Provided we have many more channels than there are partials in the sound, we can expect the partials to have been separated into distinct channels. We must then separate the true partials from the other information by looking for the peaks in the data, and find the highest common factor of the frequencies of our partials, which will exist, within a given limit of accuracy, if our instantaneous spectrum is harmonic. Unfortunately, any set of partial frequency values will have a highest common factor if we allow sufficiently small numbers to be used: partials at, say, 100, 201.5, 307 and 513.5 share an HCF of 0.5, implying a fundamental far below anything we could hear. We must therefore reject absurdly small values; a good lower limit would be 16 cycles, the approximate lower limit of pitch perception in humans.

If we are working on an Hpitch reference frame the task is much simpler, and the search can be further simplified if we begin it on a quarter-tone grid, or if we know in advance what the spectral content of the source sound is (see Appendix p71). In such relatively straightforward cases pitch-tracking can be very accurate, with perhaps occasional octaviation problems (the Hpitch being assigned to the wrong octave). However, in the general case (e.g. speech, or synthesised sequences involving inharmonic spectra), where we do not confine ourselves to such frames and cannot be sure whether the incoming sound will be pitched or not, the problem of pitch-tracking is hard. For even greater certainty we might try correlating the data from the time domain with that from the frequency domain to come up with our most definitive solution.

To be completely rigorous we could store the pitch value found at every window of the frequency-domain representation, but this is needlessly wasteful. Better to decide on our own ability to discriminate rates of pitch motion and to give the pitch-detection instrument a portamento-rate-change threshold which, when exceeded, causes a new value to be recorded in our pitch data file. The pitch data might then be stored in (the equivalent of) a breakpoint table of time/frequency values. In this case we need to decide upon the frequency resolution of our data: how much must a pitch vary before we record a new value in our table and, more precisely, when is the rate of change adjudged to have changed? (See Diagram 1).

DIAGRAM 1 : STORING VALUES OF A TIME-CHANGING PITCH (pitch, as frequency, plotted against time). Storing at regular times is accurate but inefficient: the stable pitch in segment 'a' is stored many times, and segment 'a' may be reproduced with a slight portamento. Storing at changes of gradient (the gradient being the rate of change of pitch) is efficient and accurate.

PITCH TRANSFER

Once we have established a satisfactory pitch-trace for a sound, we can modify the pitch of the original sound, and this is most easily considered in the frequency domain. We provide a new (changing) pitch-trace, either directly or from the pitch-trace of a second sound. By comparing the two traces, a pitch-following instrument will deduce the ratio between the new pitch and the original pitch at each particular window time (the instantaneous transposition ratio). In each analysis window we then multiply the frequency values in every channel of the original window by the transposition ratio for that window, hence altering the perceived pitch in the resynthesised sound. (See Diagram 2).

DIAGRAM 2 : PITCH TRANSFER. The transposition ratio is derived, window by window, from the pitch-trace of the original sound and the trace of the desired new pitch; all channel frequencies in each window are multiplied by the ratio for that window (e.g. halved when the transposition ratio is 1/2).

Using our new computer instruments it therefore becomes possible to follow and extract the pitch from one sound event and apply it to an entirely different sound object, establishing a definite relationship between the two without any conception of Hpitch, scale, mode or HArmonic field entering into our thinking. Paul de Marinis uses pitch extraction on natural speech recordings, subsequently reinforcing the speech pitches on synthetic instruments so that the speech appears to sing.
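As a concrete illustration of the window-by-window multiplication just described, here is a minimal sketch in Python/NumPy. It assumes the phase-vocoder channel frequencies and the two pitch-traces are already available as arrays; the function and variable names are hypothetical, not CDP programs.

    import numpy as np

    def pitch_transfer(channel_freqs, orig_trace, target_trace):
        """Apply a window-by-window transposition ratio to analysis data.

        channel_freqs : array of shape (n_windows, n_channels), the analysed
                        frequency of each channel in each time window.
        orig_trace    : pitch-trace of the original sound, one value per window.
        target_trace  : the desired (possibly time-varying) pitch, per window.
        """
        ratios = np.asarray(target_trace) / np.asarray(orig_trace)
        # Multiply every channel frequency in a window by that window's ratio;
        # the channel amplitudes (not shown) are left untouched.
        return channel_freqs * ratios[:, None]

    # Hypothetical usage: move a steady 200 Hz trace onto a rising trace.
    n_win = 4
    freqs = np.tile([200.0, 400.0, 600.0], (n_win, 1))   # a harmonic spectrum
    orig = np.full(n_win, 200.0)
    target = np.array([200.0, 210.0, 220.0, 230.0])      # the new pitch-trace
    print(pitch_transfer(freqs, orig, target))

Resynthesis from the modified data is left out here; the point is only that the transposition ratio is recomputed for every window, so the transferred pitch may move as freely as the trace it was taken from.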

The power of pitch-tracking is that it allows us to trace and transfer the most subtle or complex pitch flows and fluctuations without necessarily being able to assign specific Hpitch values at any point. Thus the subtleties of portamento and vibrato articulation in a particular established vocal or instrumental idiom, or in a naturally recorded bird or animal cry, could be transferred to an arbitrarily chosen non-instrumental, non-vocal, or even synthetic sound-object. It would even be possible to interpolate between such articulation styles without at any stage having a quantifiable (measurable) or notatable representation of them: we do not need to be able to measure or analytically explain a phenomenon to make an aesthetic decision about it.

NEW PROBLEMS IN CHANGING PITCH

Changing the pitch of a musical event would seem, from a traditional perspective, to be the most obvious thing to do. Instruments are set up either as collections of similar objects (strings, metal bars, wooden bars etc.) or with variable access to the same object (flute fingerholes, violin fingerboard, brass valves) to permit similar sounds with different pitches to be produced rapidly and easily. There are two problems when we try to transfer this familiar notion to sound composition. Firstly, we do not necessarily want to confine ourselves to a finite set of pitches, or to steady pitches (an Hpitch set). More importantly, the majority of sounds do not come so easily prepackaged, either because the circumstances of their production cannot be reproduced (breaking a sheet of glass: every sheet will break differently, no matter how much care we take!) or because the details of their production cannot be precisely remembered (a spoken phrase can be repeated at different pitches by the same voice within a narrow range but, assuming natural speech inflection, the fine details of articulation cannot usually be precisely remembered). In fact changing the pitch on an instrument does involve some spectral compromises. Low and high pitches on the piano have a very different spectral quality, but we have come to regard these discrepancies as acceptable through the skill of instrument design (the piano strings resonate in the same sound-box and there is a relatively smooth transition in quality from low to high strings) and the familiarity of tradition. It is also important to emphasise that pitch manipulation does not have to be embedded in a traditional approach to Hpitches.

DIFFERENT APPROACHES TO PITCH-CHANGING

The ideal way, of course, to change the pitch of a sound is to build a synthetic model of the sound and then alter its fundamental frequency. However this is a very complicated task, and adopting this approach for every sound we use would make sound composition unbelievably arduous. So we must find alternative approaches.

In the time domain, the obvious way to change the pitch is to change the wavelength of the sound. In classical tape studios the only way to do this was to speed up (or slow down) the tape before reproducing the sounds. On the computer, we simply re-read the digital data at a different step (instead of every one sample, read every 2 samples, or every 1.3 samples). This automatically makes every wavelength shorter (or longer) and changes the pitch: this is tape-speed variation. Unfortunately it also makes the source sound shorter (or longer), and with segmented sounds (speech, melodies) or moving sounds (e.g. portamenti) it changes their perceived speed. Computer control makes such "tape-speed" variation a more powerful tool, as we can precisely control the speed-change trajectory or specify a speed change in terms of its final velocity (tape acceleration). The availability of time-variable processing gives us a new order of compositional control over such time-varying processes. (Sound example 2.9).

Waveset transposition is an unconventional approach which avoids the time-distortion involved in tape-speed variation and can be used for integral multiples of the frequency. Here, each waveset (in the sense of a pair of zero-crossings: Appendix p50) is replaced by N shortened copies occupying the same time as the original one (Diagram 3 and Appendix p51). This technique is very fast to compute but often introduces strange, signal-dependent (i.e. varying with the signal) artefacts. (It can therefore be used as a process of constructive distortion in its own right!) With accurate pitch-tracking the technique can instead be applied to true wavecycles (deduced from a knowledge of both the true wavelength and the zero-crossing information) and should then avoid producing artefacts. The technique can also be used to transpose the sound downwards, replacing N wavesets or wavecycles by just one of them, but too much information may be lost (especially in a complex sound) to give a satisfactory result in terms of pure pitch-shifting. The grouping to choose again depends on the signal: grouping the wavesets in pairs, triplets etc. can affect the integrity of reproduction of the sound at the new pitch (see Diagram 4). (Sound example 2.10).

A more satisfactory time-domain approach is through brassage. Here we cut the sound into short segments, change their pitch, e.g. slow them down as in tape-speed variation (which lengthens them), then splice them together again so that they overlap sufficiently to retain the original duration. It is crucial to use segments in the grain time-frame (see Chapters 1 & 4), so that each segment is long enough to carry instantaneous pitch information but not long enough to have a perceptible internal structure, which would lead to unwanted echo effects within the pitch-changed sound. This technique, used in the harmoniser, works quite well over a range of one octave but, beyond this, begins to introduce significant artefacts: the signal is transformed as well as pitch-shifted. The process is particularly useful because it does not disturb the contour of the spectrum (the formants are not affected: see Chapter 3), so it can be applied successfully to vocal sounds. (Diagram 5 : Sound example 2.11a).

There are also some fast techniques that can be used in special cases. Once the pitch of a sound is known, we can transpose it up an octave using comb-filter transposition (Appendix p65). Here we delay the signal by half the wavelength of the fundamental and add the result to the original signal. This process cancels (by phase-inversion) the odd harmonics while reinforcing the even harmonics. Thus if we start with a spectrum whose partials are at 100, 200, 300, 400, 500 Hz etc., with a fundamental at 100 Hz, we are left with partials at 200, 400, 600 Hz etc., whose fundamental lies at 200 Hz, an octave above the original sound. A modification of the technique, using the Hilbert transform, allows us to make an octave downward transposition in a similar manner. (Sound example 2.12).
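As a concrete illustration of the comb-filter idea, here is a minimal Python/NumPy sketch (not a CDP instrument): the signal is delayed by half the fundamental period and added to itself, assuming the fundamental frequency has already been tracked.

    import numpy as np

    def comb_filter_octave_up(signal, sample_rate, fundamental_hz):
        """Transpose a pitched signal up an octave by self-addition.

        Delaying by half the wavelength of the fundamental and adding the
        result phase-inverts (and so cancels) the odd harmonics while
        reinforcing the even ones, leaving a spectrum an octave higher.
        """
        half_period = int(round(sample_rate / (2.0 * fundamental_hz)))
        delayed = np.concatenate([np.zeros(half_period), signal[:-half_period]])
        return 0.5 * (signal + delayed)   # scale down to avoid clipping

    # Hypothetical usage: a 100 Hz test tone rich in harmonics.
    sr = 44100
    t = np.arange(sr) / sr
    tone = sum(np.sin(2 * np.pi * 100 * k * t) / k for k in range(1, 8))
    octave_up = comb_filter_octave_up(tone, sr, 100.0)

The same structure, with a delay that is not matched to the fundamental, becomes the general comb filter used for pitch creation later in this chapter.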

DIAGRAM 3 : WAVESET TRANSPOSITION. Each waveset is replaced by three shortened copies of itself; the wavelength is reduced to 1/3, hence the sound is transposed up by the interval of a 12th (frequency x 3).

DIAGRAM 4 : Alternatively, with the help of a pitch-tracking instrument, we may find true wavecycles (or take wavesets in groups) and replace each with shortened copies; the grouping chosen affects the integrity of reproduction of the sound at the new pitch.

PRESERVING THE SPECTRAL CONTOUR

In the frequency domain, pitch-shifting is straightforward. We need only multiply the frequencies of the components in each channel (in each window) by an appropriate figure, the transposition ratio. As this does not change the window duration, the pitch is shifted without changing the sound's duration. This is spectral shifting.

The problem with all these approaches is that certain spectral characteristics of a sound are determined by the overall shape of the spectrum at each moment in time (the spectral contour), and particularly by peaks in the spectral contour known as formants (see Chapter 3). Thus the vowel sound "a" will be found to be related to various peaks in the spectral contour. If we change the pitch at which "a" is sung, the spectral peaks remain where they were in the frequency space: if there was a peak at around 3000 Hz, we will continue to find a peak at around 3000 Hz. Simply multiplying the channel frequencies by a transposition ratio causes the whole spectrum, and hence the spectral peaks (formants), to move up (or down) the frequency space. Hence the formants are moved and the "a"-ness of the sound is destroyed. We are most obviously aware of formant drift in situations (like speech) where the articulation of formants is significant, but it needs to be borne in mind more generally: an instrument is often characterised by a single soundbox (piano, violin) which provides a relatively fixed background spectral contour for the entire gamut of notes played on it.

A more sophisticated approach therefore involves determining the spectral contour in each window, retaining it, and then superimposing the unshifted contour on the newly shifted partials. The stages might be as follows:

(a) Extract the spectral contour using linear predictive coding (LPC) (Appendix pp12-13), or with a fine-grained LPC (see spectral focusing below).
(b) Extract the partials with the phase vocoder.
(c) Flatten the spectrum using the inverse of the spectral contour.
(d) Change the spectrum, i.e. transpose the partials, with the help of a pitch-tracking instrument where necessary.
(e) Reimpose the original spectral contour.

(Sound example 2.13). Ideally this approach of separating the formant data and the partial data should be applied even when merely imposing vibrato on a sound (see Chapter 10), but it is computationally intensive and probably excessively fastidious in most situations.
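The sketch below illustrates the shape of stages (a) to (e) in Python/NumPy for a single analysis frame. It is a crude stand-in, not the CDP implementation: the contour is estimated by smoothing the magnitude spectrum rather than by LPC, the "partials" are simply FFT bins, and all names and parameter values are hypothetical.

    import numpy as np

    def shift_preserving_contour(frame, ratio, smooth=32):
        """Pitch-shift one windowed frame while keeping its spectral contour."""
        spec = np.fft.rfft(frame * np.hanning(len(frame)))
        mag, phase = np.abs(spec), np.angle(spec)

        # (a) estimate the spectral contour by smoothing the magnitudes
        kernel = np.ones(smooth) / smooth
        contour = np.convolve(mag, kernel, mode="same") + 1e-12

        # (c) flatten the spectrum with the inverse of the contour
        flat = mag / contour

        # (d) shift the flattened spectrum by the transposition ratio
        bins = np.arange(len(mag))
        src = bins / ratio                      # where each new bin comes from
        shifted = np.interp(src, bins, flat, left=0.0, right=0.0)

        # (e) reimpose the original (unshifted) contour
        new_mag = shifted * contour
        return np.fft.irfft(new_mag * np.exp(1j * phase))

    # Hypothetical usage on one 1024-sample frame of a signal x.
    x = np.random.randn(1024)
    y = shift_preserving_contour(x, ratio=1.5)   # up a fifth, formants fixed

Stage (b), proper phase-vocoder partial extraction, is omitted (the original phases are simply reused), which is why this is only a schematic of the method rather than a usable instrument.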

Various compositional processes allow us to generate more than one pitch from a sound. Using the harmoniser approach we can shift the pitch of a sound without altering its duration and then mix the result with the original. Apart from the fact that we can apply this technique to any sound, it differs from simply playing two notes on the same instrument because the variations and microfluctuations of the two pitches remain fairly well in step, a situation impossible to achieve with two separate performers, though it may be closely approached by a single performer using e.g. double-stopping on a stringed instrument. It is in this light, in particular, that the harmoniser is best regarded, rather than as a cost-saving way of adding conventional HArmony to a melodic line. (Sound example 2.14). Small pitch-shifts, superimposed on the original sound, add "body" to the sound, producing the well-known "chorus" effect (an effect produced naturally by a chorus of similar singers, or similar instruments playing the same pitches, where small variations in tuning between the individual singers or players broaden the spectral band of the resultant massed sound). (Sound example 2.15).

An alternative approach is to use spectral shifting in the frequency domain, shifting only a part of the spectrum: we can use this kind of spectral shifting to literally split the spectrum in two. The two sets of partials thus generated will imply two different fundamentals and the sound will appear to have a split pitch. (Sound example 2.16). All these techniques can be applied dynamically, so that the pitch of a sound gradually splits in two. (Sound example 2.17). Similarly, by appropriate filtering, we can individually reinforce the harmonics (or partials) in a spectrum so that our attention is drawn to them as perceived pitches in their own right (as in Tibetan chanting or Tuvan harmonic singing). (Sound example 2.18).

Another pitch phenomenon worth noting is the pitch drift associated with spectral stretching (see Appendix p19 and Chapter 3), including the phasing in and out of partials at the top and bottom of the spectrum. If the partials of a pitch-spectrum are gradually moved so that the relationship ceases to be harmonic (the partials are no longer whole-number multiples of the fundamental), the sound will begin to present several pitches to our perception (like bell sounds). Moreover, if the stretch is upwards, the lowest perceived pitch may gradually move upwards, even if the fundamental frequency is present in the spectrum and remains unchanged. (Sound example 2.19). A different kind of pitch duality can be produced when the focus of energy in the spectrum (the spectral peak) moves markedly above a fixed pitch, or over a pitch which is moving in a contrary direction. There are not truly two pitches present in these cases, but percepts of conflicting frequency motions within the sound can certainly be established. (Sound example 2.20). With very careful control, sounds can even be created which get higher in Hpitch yet lower in tessitura (or lower in Hpitch but higher in tessitura): the so-called Shepard Tones (see On Sonic Art and Appendix p72). (Sound example 2.21).

PITCH / TESSITURA CHANGE OF UNPITCHED SOUNDS

The techniques of pitch change we have described can also be applied to sounds without any definite pitch, e.g. unvoiced speech sounds. Waveset transposition will usually change the centre of energy (the "pitch-band" or tessitura) of noisy sounds so that they appear to move higher (or lower). (Sound example 2.22). The harmoniser will usually have the same effect, although the harmoniser algorithm introduces significant artefacts over larger interval shifts. (Sound example 2.23). Splitting the spectrum of a broad-band, noise-based sound using spectral shifting, however, may not have any noticeable perceptual effect, even when the split is quite radical, and chorusing will often be unnoticeable as the spectrum is already full of energy. Inharmonic sounds will, in general, be transposed in a similar way to pitched sounds, and the problem of formant shifting when transposing will apply equally well to them. (Sound example 2.24).

PITCH CREATION

It is possible to give pitch qualities to initially unpitched sounds. There are two approaches to this which, at a deeper level, are very similar. The first is to use a filter. A filter is an instrument which reinforces or suppresses particular parts of the spectrum. It may work over a large band of frequencies (when it is said to have a small Q) or over a narrow band of frequencies (when it has a large Q). We can use a filter not only to remove or attenuate parts of the spectrum but also, by inversion, to accentuate what remains (band-pass filter: Appendix p7). Narrow filters with very high Q (so that the bands allowed to pass are very abruptly defined) can single out narrow ranges of partials from a spectrum, and a very narrow fixed filter can thus impose a marked spectral peak on the noisiest of sounds, giving it a pitched quality. Very tight Q on the filter will produce oscillator-type pitches, whilst less tight Q and broader bands will produce a vaguer focusing of the spectral energy: there are many degrees of "pitchiness" between pure noise and a ringing oscillator. In fact this process works best on noisier sounds, because here the energy is distributed over the entire spectrum and, wherever we place our filter band(s), we will be sure to find some signal components to work on. In a sound with a simpler spectrum, a narrow, tight filter may simply miss any significant spectral elements and we may end up with silence! (Sound example 2.25).

We can also force our sound through a stack of filters, called a filter bank, producing "chords", and sets of narrow band-pass filters can in this way be used to move from noise towards such chords. The technique tends to be more musically interesting when used on subtly fluctuating sounds, and we can also use it to give a sense of pitch motion in unpitched sounds: noise portamenti. (Appendix p18).
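A minimal sketch of the filter idea follows, written in Python/NumPy rather than as a CDP instrument: a simple two-pole resonator whose bandwidth stands in for the Q discussed above. The parameter names and the choice of filter design are assumptions for illustration only.

    import numpy as np

    def resonator(noise, sample_rate, centre_hz, bandwidth_hz):
        """Run a signal through a simple two-pole resonant (band-pass) filter.

        The narrower the bandwidth (the higher the Q), the more the output
        rings at the centre frequency, imposing a pitch on a noisy input.
        """
        r = np.exp(-np.pi * bandwidth_hz / sample_rate)      # pole radius
        w0 = 2.0 * np.pi * centre_hz / sample_rate           # pole angle
        a1, a2 = 2.0 * r * np.cos(w0), -r * r
        y = np.zeros_like(noise)
        for n in range(len(noise)):
            # negative indices read the still-zero tail of y at the start
            y[n] = noise[n] + a1 * y[n - 1] + a2 * y[n - 2]
        return y / np.max(np.abs(y))                          # normalise

    # Hypothetical usage: white noise focused towards a pitch of 440 Hz.
    sr = 44100
    noise = np.random.randn(sr)
    tight = resonator(noise, sr, 440.0, 20.0)     # large Q: oscillator-like
    vague = resonator(noise, sr, 440.0, 400.0)    # small Q: vaguer focusing

A filter bank, as described above, is simply several such resonators tuned to different centre frequencies and mixed, which pushes the noise towards a "chord".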

A very sophisticated approach to this process is spectral focusing (Appendix p20). In this process we first make an analysis of a sound with (possibly time-varying) pitch using Linear Predictive Coding (Appendix pp12-13), varying the size of the analysis window through time. With a normal-size analysis window we will extract the spectral contour (the formants) (see above); with a very fine analysis window we will pick out the individual partials. If we now use the analysis data as a set of (time-varying) filters on an input noise sound, then wherever the analysis window was normal-sized the resultant filters will impose the formant characteristics of the original sound on the noise source (e.g. analysed voiced speech will produce unvoiced speech), but wherever the window size was very fine we will have generated a set of very narrow-Q filters at the (time-varying) frequencies of the original partials, which will act on the noise to produce something very close to the original signal. If the original analysis window-size varied in time from normal to fine, our output sound will therefore vary from formant-shaped noise to strongly pitched sound (e.g. our new sound would pass from unvoiced to voiced speech), whatever sound we input to the system. This provides a sophisticated means to pass from noise to pitch in a complexly evolving sound-source. (Sound example 2.26).

The second approach to pitch-generation is to use delay, applying a very short digital delay to a signal and mixing the result with the original. As digital signals remain precisely in time, the delay between equivalent samples in the original and delayed sound will remain exactly fixed and, if this delay is short enough, we will hear a pitch corresponding to one divided by the delay time. Longer delays will give lower and less well defined pitches. This technique is known as comb filtering (a sketch of the idea appears at the end of this chapter). (Sound example 2.27).

Producing portamenti is an even simpler process. When a sound is mixed with a very slightly time-stretched (or shrunk) copy of itself, we produce a gradually changing delay between the two (see Diagram 6). Depending on whether the sounds are start-synchronised or end-synchronised (and on whether the copy is stretched or shrunk), the delay grows or shrinks through time, producing a downward or an upward portamento respectively. We may also work with more than two time-varied copies.

DIAGRAM 6 : Time-varying delay between two sounds, one of which is a time-varying time-stretched version of the other.

Phasing or flangeing, often used in popular music, relies on such delay effects. In this case the signal is delayed by different amounts in different frequency registers using an all-pass filter (Appendix p9), and this shifted signal is allowed to interact with the unchanged source.

Both these techniques allow us to produce dual-pitch percepts, with the pitch of the source material moving in some direction and the pitch produced by the delay or filtering fixed, or moving in a different sense (with time-variable filtering or delay). The production of pitch-motion seems an appropriate place to end this Chapter, as it stresses once again the difference between pitch and Hpitch, and the power of the new compositional tools to provide control over pitch-in-motion.
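Before leaving the subject, here is the promised sketch of delay-based pitch and portamento creation, in Python/NumPy. It uses a plain swept delay (a simple flange) rather than the all-pass phasing described above; the sweep shape and parameter values are assumptions for illustration.

    import numpy as np

    def flange(signal, sample_rate, min_ms=1.0, max_ms=10.0, sweep_hz=0.2):
        """Mix a signal with a copy delayed by a slowly varying amount.

        As the delay sweeps, the comb of spectral peaks (at multiples of
        1/delay) slides up and down, giving the familiar flanging sweep; a
        steadily growing or shrinking delay gives a portamento instead.
        """
        n = np.arange(len(signal))
        sweep = 1.0 - np.cos(2 * np.pi * sweep_hz * n / sample_rate)
        ms = min_ms + 0.5 * (max_ms - min_ms) * sweep
        delay = (ms * 0.001 * sample_rate).astype(int)        # per-sample delay
        src = n - delay
        delayed = np.where(src >= 0, signal[np.maximum(src, 0)], 0.0)
        return 0.5 * (signal + delayed)

    # Hypothetical usage: a two-second noise band given a swept coloration.
    sr = 44100
    noise = np.random.randn(2 * sr)
    swept = flange(noise, sr)

Holding the delay fixed instead of sweeping it turns the same structure into the comb filter of Sound example 2.27, with a perceived pitch of roughly one divided by the delay time.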

CHAPTER 3

SPECTRUM

WHAT IS TIMBRE?

The spectral characteristics of sounds have, for so long, been inaccessible to the composer that we have become accustomed to lumping together all aspects of the spectral structure under the catch-all term "timbre" and regarding it as an elementary, if unquantifiable, property of sounds. Most musicians with a traditional background almost equate "timbre" with instrument type (some instruments producing a variety of "timbres", e.g. pizz., arco, legno etc.). Similarly, in the earliest analogue studios, composers came into contact with oscillators producing featureless pitches, noise generators producing featureless noise bands, and "envelope generators" which added simple loudness trajectories to these elementary sources. This gave no insight into the subtlety and multidimensionality of sound spectra. A whole book could be devoted to the spectral characteristics of sounds; the most important feature to note here is that all sound spectra of musical interest are time-varying, either in micro-articulation or in large-scale motion.

HARMONICITY - INHARMONICITY

As discussed in Chapter 2, if the partials which make up a sound have frequencies which are exact multiples of some frequency in the audible range (known as the fundamental), and provided this relationship persists for at least a grain-size time-frame, the spectrum fuses and we hear a specific (possibly gliding) pitch. If the partials are not in this relationship, the ear's attempts to extract harmonicity (whole-number) relationships amongst the partials will result in our hearing several pitches in the sound. Provided the relationships (from window to window) remain relatively stable, these several pitches will trace out the same micro-articulations and hence will be fused into a single percept, as in a bell sound. (See Diagram 1). The one exception to this is that certain partials may decay more quickly than others without destroying this perceived fusion (as in sustained acoustic bell sounds), the lower partials, and hence the lower heard pitches, persisting longer than the higher components.

DIAGRAM 1 : (harmonic and spectrally stretched, inharmonic, spectra).

In Sound example 3.1 we hear the syllable "ko->u" being gradually spectrally stretched (Appendix p19). This means that the partials are moved upwards in such a way that their whole-number relationships are preserved less and less exactly and are eventually lost. Initially the sound appears to have an indefinable "aura" around it, akin to phasing, but it gradually becomes more and more bell-like. It is important to understand that this transformation "works" due to a number of factors apart from the harmonic/inharmonic transition. First, the morphology (changing shape) of the spectrum is already bell-like: the syllable "ko->u" begins with a very short broad-band spectrum with lots of high-frequency information ("k"), corresponding to the initial clang of a bell, and this leads immediately into a steady pitch. More importantly, the vowel formant is varied from "o" to "u", a process which gradually fades out the higher partials, leaving the lower to continue, just as in an acoustic bell. As the process proceeds, the tail of the sound is also gradually time-stretched to give it the longer decay time we would expect from an acoustic bell. A different initial morphology would have produced a less bell-like result. This example (used in the composition of Vox 5) illustrates the importance of the time-varying structure of the spectrum (not simply its loudness trajectory).
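The sketch below shows one plausible spectral-stretching law in Python/NumPy. It is an assumption for illustration (not necessarily the formula used by the CDP programs or in Sound example 3.1): frequencies are raised by a power law above a reference frequency, so that a harmonic series gradually becomes inharmonic.

    import numpy as np

    def stretch_partials(freqs, f_ref, stretch):
        """Move partials upward so whole-number relationships are lost."""
        freqs = np.asarray(freqs, dtype=float)
        # stretch = 1 leaves the series unchanged; values above 1 push the
        # upper partials progressively further out of harmonic alignment.
        return f_ref * (freqs / f_ref) ** stretch

    # A 100 Hz harmonic series, unstretched and then mildly stretched.
    harmonic = 100.0 * np.arange(1, 9)
    print(harmonic)
    print(stretch_partials(harmonic, 100.0, 1.0))    # unchanged
    print(stretch_partials(harmonic, 100.0, 1.08))   # drifts into inharmonicity

Resynthesising the stretched frequencies (e.g. with a bank of oscillators, or by moving phase-vocoder channel data) would then yield the "aura" and bell-like colourations described above; with small stretch values the result is coloured rather than multi-pitched.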

We may vary this spectral stretching process by changing the overall stretch (i.e. how far the top of the spectrum moves up or down from its initial position), and we may vary the type of stretching involved (Appendix p19). Different types of stretching will produce different relationships between the pitches heard within the sounds. Small stretches produce an ambiguous area in which the original sound appears "coloured" in some way rather than genuinely multi-pitched. (Sound example 3.2). Proceeding further, particularly as the number of heard-out components in an inharmonic spectrum increases gradually to the point of noise saturation, we reach the area of noise (see below). Inharmonicity does not therefore necessarily mean multipitchedness. Nor (as we have seen from the "ko->u" example) does it mean bell sounds: the variety amongst cymbals, unpitched gongs and strangely coloured drums gives the lie to this. Very short inharmonic sounds will sound percussive, like drums or wood-blocks. (Sound example 3.3). These inharmonic sounds can be transposed and caused to move (subtle or complex pitch-gliding) just like pitched sounds (see also Chapter 5 on Continuation). (Sound example 3.4). We can also imagine a kind of harmonic-to-inharmonic vibrato-like fluctuation within a sound, or a dynamic interpolation between a harmonic and an inharmonic state (or between any state and something more inharmonic), so that a sound changes its spectral character as it unfolds. (Sound example 3.5). Note that, when transforming the harmonicity of the spectrum, we run into problems about the position of formants akin to those encountered when pitch-changing (see Chapter 2); to preserve the formant characteristics of the source we need to extract the spectral contour of the source and apply it to the resulting spectrum (see formant-preserving spectral manipulation: Appendix p17).

We can also use spectral freezing to freeze certain aspects of the spectrum at a particular moment. We may hold the frequencies of the partials, allowing their loudnesses to vary as originally, or we may hold their amplitudes stationary, allowing the frequencies to vary as originally. In complex signals it is often holding steady the amplitudes which produces a sense of "freezing" the spectrum, when we might have anticipated that holding the frequencies would create this percept more directly: the loudness trajectory of the components is often more significant than spectral change. (Sound example 3.6).

FORMANT STRUCTURE

In any window, the contour of the spectrum will have peaks and troughs. The peaks, known as formants, are responsible for such features as the vowel-state of a sung note. The frequencies of the partials in the spectrum (determining pitch(es), harmonicity-inharmonicity, noisiness) and the position of the spectral peaks, and hence the spectral contour, can be varied independently of each other. As we know from singing, for a vowel to persist the spectral contour (and therefore the position of the peaks and troughs) must remain where it is even if the partials themselves move: this is why we can produce coherent speech while singing or whispering, and we all use articulate formant control when speaking. Because most conventional acoustic instruments have no articulate time-varying control over spectral contour (one of the few exceptions is the hand-manipulable brass mute), the concept of formant control is less familiar as a musical concept to traditional composers. Formant-variation of the spectrum does not need to be speech-related, however, and the spectrum can be made to vary formant-wise in time, either slowly or quickly: whispered speech is the ideal example. (Sound example 3.7).

It is possible to extract the (time-varying) spectral contour from one signal and impose it on another, a process originally developed in the analogue studios and known as vocoding (no connection with the phase vocoder). For this to work effectively, the sound to be vocoded must have energy distributed over the whole spectrum so that the spectral contour to be imposed has something to work on. Vocoding hence works well on noisy sounds (e.g. the sea) or on sounds which are artificially prewhitened by adding broad-band noise. It is also possible to normalise the spectrum before imposing the new contour. (A rough sketch of the idea follows at the end of this section.) (Sound example 3.8).

NOISE, "NOISY NOISE" & COMPLEX SPECTRA

Once we vary the spectrum too quickly, and especially if we do so irregularly, we no longer perceive individual moments or grains with specific spectral qualities: we hear "noise". Noise spectra are not, however, the uniform grey area of musical options (or even the few shades of pink and blue) which the name, and past experience with noise generators, might suggest. Noisiness can be a matter of degree; it can be more or less focused towards static or moving pitches, and it can have its own complex internal structure. In Sound example 3.9 we hear portamentoing inharmonic spectra created by filtering noise, using band-pass filters or delay (see Chapter 2); this filtering is gradually removed and the bands become more noise-like, producing featureless noise bands. There are also fluid noises produced by portamentoing components, and noises built from masses of broad-band click-like sounds, either in regular or semi-regular layers (cicadas) or irregular ones (masses of breaking twigs, or pebbles falling onto tiles); these shade off into the area of "Texture" which we will discuss in Chapter 8. The subtle differences between unvoiced staccato "t", "p", "k" and "d", or between sustained "s", "sh", "f" and rolled unvoiced "r", are a reminder of how finely differentiated brief and sustained noise spectra can be. (Sound example 3.10).

A good example of the complexity of noise itself is "noisy noise": the gritty vocal sounds produced by water between the tongue and palate in e.g. a rolled "r" or the Dutch "gh", the sound of water falling in a wide stream around many small rocks, the type of crackling signal one gets from very poor radio reception tuned to no particular station, or extremely time-contracted speech streams. These examples illustrate that the rather dull-sounding word "noise" hides whole worlds of rich sonic material largely unexplored in detail by composers in the past.
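As promised above, here is a bare-bones channel-vocoder sketch in Python/NumPy: for each frame, the energy of the "contour" sound is measured in a few broad bands and the carrier's spectrum is scaled band by band to match. Frame size, band count and all names are hypothetical choices, not the analogue or CDP implementations.

    import numpy as np

    def impose_contour(source, carrier, frame=1024, bands=32):
        """Impose the time-varying spectral contour of `source` onto `carrier`."""
        out = np.zeros(len(carrier))
        win = np.hanning(frame)
        edges = np.linspace(0, frame // 2 + 1, bands + 1).astype(int)
        for start in range(0, len(carrier) - frame, frame // 2):   # 50% overlap
            s = np.fft.rfft(source[start:start + frame] * win)
            c = np.fft.rfft(carrier[start:start + frame] * win)
            for b in range(bands):
                lo, hi = edges[b], edges[b + 1]
                c_energy = np.sqrt(np.sum(np.abs(c[lo:hi]) ** 2)) + 1e-12
                s_energy = np.sqrt(np.sum(np.abs(s[lo:hi]) ** 2))
                c[lo:hi] *= s_energy / c_energy        # match band energy
            out[start:start + frame] += np.fft.irfft(c) * win      # overlap-add
        return out

    # Hypothetical usage: the contour of a voice imposed on white noise.
    sr = 44100
    voice = np.random.randn(sr)         # stand-in for a recorded voice
    noise = np.random.randn(sr)
    talking_noise = impose_contour(voice, noise)

Because each band simply copies the source's energy, the carrier needs energy everywhere in the spectrum for the result to speak, which is the prewhitening point made in the text.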

At each moment there is a composite spectrum for any such complex event, and any portion of it could be grist for the mill of sound composition: a massed group of talkers at a party, or a whole orchestra practising its difficult passages before a concert, has a well-defined time-varying spectrum. It is therefore worth noting that any arbitrary collection of sounds can be amassed to create a sound with a noise-spectrum if superimposed randomly in a sufficiently frequency-dense and time-dense way. At the end of Sound example 3.9 the noise band finally resolves into the sound of voices: the noise band was in fact simply a very dense superimposition of many vocal sounds. Different sounds (with or without harmonicity, grain-like or sustained, evolving, iterated or sequenced, soft or hard-edged, spectrally bright or dull) may produce different qualities of noise (see Chapter 8 on Texture). Noise with transient pitch content, like water falling in a stream (rather than dripping, flowing or bubbling), might be pitch-enhanced by spectral tracing (see below). There are also undoubtedly vast areas to be explored at the boundaries of inharmonicity/noise and of time-fluctuating-spectrum/noise. Many of the sound phenomena we have discussed in this section are complex concatenations of simpler units.

SPECTRAL BANDING

Once we understand that a spectrum contains many separate components, we can imagine processing the sound to isolate or separate these components. Filters, by permitting components in some frequency bands to pass and rejecting others, allow us to select parts of the spectrum for closer observation. At an elementary level, filters may be used to eliminate unwanted noise or hums in recorded sounds, especially as digital filters can be very precisely tuned. With dense or complex spectra the results of filtering can be relatively unexpected, revealing aspects of the sound material not previously appreciated. A not-too-narrow and static band-pass filter will transform a complex sound-source while (usually) retaining its morphology (time-varying shape), so that the resulting sound relates to the source sound through its articulation in time. (Sound example 3.11). Sets of narrow band-pass filters can be used to force a complex spectrum onto any desired Hpitch set (a HArmonic field in the traditional sense). (Sound example 3.12). In a more signal-sensitive sense, a filter or a frequency-domain channel selector can be used to isolate some static or moving feature of a sound - a moving high-frequency component in the onset, a particularly strong middle partial etc. - for further compositional development. This is particularly important with respect to the onset portion of a sound, and we will leave discussion of it until Chapter 4.

More radically, we can separate the spectrum into parts (using band-pass filters or spectral splitting), apply different processes to the N separated parts (e.g. pitch-shift one, add vibrato to another), and then recombine the parts, perhaps reconstituting the spectrum in a new form. If the spectral parts are changed too radically, e.g. by adding completely different vibrato to each part, they will not fuse when remixed, but we may be interested in precisely this gradual dissociation of the spectrum (see also "Spectral Fission" below). (Sound example 3.13).

In a crude way, spectral components can be eliminated on a channel-by-channel basis, either in terms of their frequency location (using spectral splitting to define a frequency band and setting the band loudness to zero) or in terms of their time-varying relative loudness (spectral tracing will eliminate the N least significant, i.e. quietest, channel components). At an elementary level this can be used for signal-dependent noise reduction. (Sound example 3.14).

Ultimately we may use a procedure which follows the partials themselves, separating the signal into its component partials (partial tracking). This is quite a complex task, which will involve pitch-tracking and pattern-matching (to estimate where the partials might lie) on a window-by-window basis. Ideally it must deal in some way with inharmonic sounds (where the form of the spectrum is not known in advance) and with noise sources (where there are, in effect, no partials). The technique is, however, particularly powerful, as it allows us to set up an additive synthesis model of our analysed sound and thereby provides a bridge between unique recorded sound-events and the control available through synthesis. (Appendix p48). A fruitful approach to this territory is spectral focusing, described in Chapter 2 (and Appendix p20). This allows us to extract, window by window, either the spectral contour only or the true partials, and then to use this data to filter a noise source. The filtered result can vary from articulated noise formants (like unvoiced speech), following just the formant articulation of the original source, to a reconstitution of the partials of the original sound (and hence of the original sound itself), and it gives us a means of passing from articulate noise to articulate not-noise spectra in a seamless fashion. This leads us into the next area.

SPECTRAL ENHANCEMENT

The already existing structure of a spectrum can be utilised to enhance the original sound. We may reinforce the total spectral structure, adding additional partials by spectrally shifting the sound (without changing its duration: Appendix p15) and mixing the shifted spectrum with the original. As the digital signal retains its duration precisely, all the components in the shifted signal will line up precisely with their non-shifted sources, and the spectrum will be thickened while retaining its (fused) integrity. Octave enhancement is the most obvious approach, but any interval of transposition (e.g. the tritone) might be chosen. The process might be repeated, and the relative balance of the components adjusted, as desired. (Sound example 3.15). A further enrichment may be achieved by mixing an already stereo spectrum with a pitch-shifted version which is left-right inverted. Theoretically this produces merely a stage-centre resultant spectrum, especially when mixed in mono, but in practice there appear to be frequency-dependent effects which lend the resultant sound a new and richer spatial "fullness". (Sound example 3.16).

Conversely, we can introduce a sense of multiple-sourcedness to a sound (e.g. make a single voice appear crowd-like) by adding small, random, time-changing perturbations to the loudnesses of the spectral components (spectral shaking). This mimics part of the effect of several voices attempting to deliver the same information simultaneously. (Sound example 3.17). We may also perturb the partial frequencies. (Sound example 3.18).
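A minimal sketch of the spectral-shaking idea follows, in Python/NumPy. It is not the CDP program: each analysis channel's amplitude is simply scaled by a random factor chosen afresh in each frame, and the frame size and depth are arbitrary assumptions.

    import numpy as np

    def spectral_shake(signal, frame=1024, depth=0.4, seed=0):
        """Randomly perturb channel loudnesses, frame by frame.

        `depth` (0..1) sets how far the per-channel gains may stray from 1,
        mimicking small disagreements between several performers delivering
        the same material together.
        """
        rng = np.random.default_rng(seed)
        win = np.hanning(frame)
        out = np.zeros(len(signal))
        for start in range(0, len(signal) - frame, frame // 2):
            spec = np.fft.rfft(signal[start:start + frame] * win)
            wobble = 1.0 + depth * (rng.random(len(spec)) - 0.5)   # channel gains
            out[start:start + frame] += np.fft.irfft(spec * wobble) * win
        return out

    # Hypothetical usage on a mono array (a stand-in for a recorded voice).
    voice = np.random.randn(44100)
    crowdish = spectral_shake(voice, depth=0.6)

Perturbing the channel frequencies instead of (or as well as) the amplitudes gives the frequency version mentioned in the text; the structure of the loop is the same.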

SPECTRAL MANIPULATION IN THE FREQUENCY DOMAIN

There are many other processes of spectral manipulation we can apply to signals in the frequency domain. Most of these are only interesting if we apply them to moving spectra, because they rely on the interaction of data in different (time) windows: if these sets of data are very similar to one another we will perceive no change.

We may select a window (or a set of windows) and freeze either the frequency data or the loudness data which we find there over the ensuing signal (spectral freezing). If the frequency data is held constant, the channel amplitudes (loudnesses) continue to vary as in the original signal but the channel frequencies do not change. If the amplitude data is held constant, the channel frequencies continue to vary as in the original signal. As mentioned previously, even in relatively simple signals, holding the amplitude data is often more effective in achieving a sense of "freezing" the signal. We can also freeze both amplitude and frequency data but, with a complex signal, this tends to sound like a sudden splice between a moving signal and a synthetic drone. (Sound example 3.19).

We may average the spectral data in each frequency-band channel over N time-windows (spectral blurring), thus reducing the amount of detail available for reconstructing the signal. This can be used to "wash out" the detail in a segmented signal, and works especially effectively on spikey, crackly signals (those with brief, bright peaks). We can do this and also reduce the number of partials (spectral trace & blur), and we may do either of these things in a time-variable manner so that the details of a sequence gradually become blurred or gradually emerge as distinct. (Sound example 3.20).

We may also shuffle the time-window data in any way we choose (spectral shuffling), e.g. shuffling windows in groups of 8, 17, 61 etc. With large numbers of windows in a shuffled group we produce an audible rearrangement of signal segments, but with only a few windows we create another process of sound blurring, akin to brassage. (Sound example 3.21).

Spectral tracing strips away the spectral components in order of increasing loudness (Appendix p25). When only a few components are left, any sound is reduced to a delicate tracery of (shifting) sine-wave constituents. Complexly varying sources produce the most fascinating results, as the partials which are at any moment in the permitted group (the loudest) change from window to window: we hear new partials entering (while others leave), producing "melodies" internal to the source sound. This feature can often be enhanced by time-stretching, so that the rate of partial change is slowed down. Here we have a compositional process which makes perceptible aspects of the signal which were not perceptible before; by making perceptible what was not previously perceptible we effect a "magical" transformation of the sonic material. Spectral tracing can also be done in a time-variable manner, so that a sound gradually dissolves into its internal sine-wave tracery. (Sound examples 3.22, 3.23).

Spectral arpeggiation is a process that draws our attention to the individual spectral components by isolating, or emphasising, each in sequence. This can be achieved purely vocally over a drone pitch by using appropriate vowel formants to emphasise partials above the fundamental; the computer can apply this process to any sound-source, even whilst it is in motion. (Sound example 3.24).

Spectral time-stretching, finally, can produce unexpected spectral consequences when applied to noisy sounds. In a noisy sound the spectrum is changing too quickly for us to gain any pitch, or any inharmonic multi-pitched percept, from any particular time-window: it merely contributes to the more general percept of noisiness. Once, however, we slow down the rate of change, the spectrum becomes stable, or stable-in-motion, for long enough for us to hear out the originally instantaneous window values. In general these are inharmonic, and hence we produce a "metallic", inharmonic (usually moving) ringing percept. Again, this can be effected in a time-varying manner so that the inharmonicity emerges gradually from within the stretching sound. (Sound example 3.25).

SPECTRAL FISSION & CONSTRUCTIVE DISTORTION

We have mentioned several times the idea of spectral fusion, where the parallel micro-articulation of the many components of a spectrum causes us to perceive it as a unified entity, as a single pitch. The opposite process, whereby the spectral components seem to split apart, we will describe as spectral fission, and we will deal with it more fully in Chapter 11. Spectral fission can be achieved in a number of quite different ways in the frequency domain. Adding two different sets of vibrato to two different groups of partials within the same spectrum, for example, will cause the two sets of partials to be perceived independently: the single aural stream will split into two. (Sound example 3.26).

SPECTRAL MANIPULATION IN THE TIME-DOMAIN

Alternatively, we may elaborate the spectrum in the time domain by a process of constructive distortion. A whole series of spectral transformations can be effected in the time domain by operating on wavesets, defined as pairs of zero-crossings (Appendix p50). Wavesets correspond to true wavecycles in the case of a harmonic spectrum, and in many pitched sounds, but not always; their advantage in the context of constructive distortion is that very noisy sounds, having no pitch, have no true wavecycles, yet we can still segment them into wavesets. Bearing in mind that wavesets do not necessarily correspond to true wavecycles, we will anticipate producing various unexpected artefacts in complex sounds. The new sounds are time-domain artefacts consistent with the original signal, rather than revelations of an intrinsic internal structure, but they will be tied to the morphology (time-changing characteristics) of the original sound. For this reason I refer to these processes as constructive distortion; in general, the effects produced will not be entirely predictable.

By searching for wavesets and then repeating each waveset before proceeding to the next one (waveset time-stretching), we may time-stretch the source without altering its pitch (see elsewhere for the limitations of this process). (Appendix p55). In a very simple sound source (e.g. a steady waveform from any oscillator) waveset time-stretching produces no artefacts. In a complexly evolving signal (especially a noisy one) each waveset will be different, all of a slightly different spectral quality, but we will not perceptually register the content of any single waveset in its own right (see the discussion of time-frames in Chapter 1). The more we repeat each waveset, however, the closer it comes to the grain threshold where we can hear out the implied pitch and the spectral quality implicit in its waveform. A three or four fold repetition produces a "phasing"-like aura around the sound, in which a glimmer of a bead stream is beginning to be apparent. With a 5 or 6 fold repetition the source sound begins to reveal a lightning-fast stream of pitched beads. A 32 fold repetition produces a clear "random melody", apparently quite divorced from the source.

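A minimal sketch of waveset time-stretching follows, in Python/NumPy. It is not the CDP instrument: a waveset is taken here as the span between successive rising zero-crossings (one positive and one negative half-cycle), following the text's definition, and each one is simply repeated.

    import numpy as np

    def waveset_timestretch(signal, repeats=2):
        """Time-stretch by repeating each waveset before moving to the next.

        Repeating each zero-crossing pair N times stretches the sound
        roughly N-fold without changing the implied pitch of each waveset.
        """
        # indices where the signal crosses zero going upward
        rising = np.where((signal[:-1] < 0) & (signal[1:] >= 0))[0] + 1
        pieces, prev = [], 0
        for z in rising:
            pieces.extend([signal[prev:z]] * repeats)   # repeat this waveset
            prev = z
        pieces.append(signal[prev:])                    # tail after last crossing
        return np.concatenate(pieces)

    # Hypothetical usage: a short test tone stretched to twice its length.
    sr = 44100
    t = np.arange(sr // 10) / sr
    tone = np.sin(2 * np.pi * 220 * t)
    stretched = waveset_timestretch(tone, repeats=2)
    print(len(tone), len(stretched))        # roughly doubled

On a steady tone this is artefact-free; on a noisy signal each repeated waveset briefly exposes its own implied pitch, which is the "bead stream" effect described above.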
The following manipulations suggest themselves. More interesting (though apparently less promising), we may replace N in every M wavesets by silence (waveset omission: Appendix p51); for example, every alternate waveset may be replaced by silence. Superficially this would appear to be an unpromising approach, but we are in fact thus changing the waveform, and hence the spectrum. The process introduces a slightly rasping "edge" to the sound quality of the source sound, which increases as more "silence" is introduced. (Sound example 3.28).

We may also rearrange wavesets in any specified way (waveset shuffling: Appendix p51), or reverse wavesets, or groups of N wavesets (waveset reversal: Appendix p51). Where N is large we produce a fairly predictable brassage of reversed segments, but with smaller values of N the signal is altered in subtle ways; values of N at the threshold of grain perceptibility are especially interesting. Similarly, we may introduce small, random changes to the waveset lengths in the signal (waveset shaking: Appendix p51), which has the effect of adding "roughness" to clearly pitched sounds. We may average the waveset shape over N wavesets (waveset averaging); although this process appears to be similar to spectral blurring, it is in fact quite irrational, averaging the waveset length and the wave shape (and hence the resulting spectral contour) in a perceptually unpredictable way, and the aural result is entirely different. Inverting the half-wave-cycles (waveset inversion: Appendix p51) usually produces an "edge" to the spectral characteristics of the sound. We might also change the spectrum by applying a power factor to the waveform shape itself (waveset distortion: Appendix p52); as this process destroys the original form of the wave, I will refer to it as destructive distortion. (Sound example 3.29).

We may replace wavesets with a waveform of a different shape but the same amplitude (waveset substitution: Appendix p52). Thus we may convert all the wavesets to square waves, triangular waves, sine-waves, or even user-defined waveforms. Superficially, one might expect that sine-wave replacement would in some way simplify, or clarify, the spectrum. This may be true with a simple sound input, but complex sounds are simply changed in spectral "feel", as a rapidly changing sine-wave is no less perceptually chaotic than a rapidly changing arbitrary wave-shape. Such distortion procedures work particularly well with short sounds having distinctive loudness trajectories, and the resulting sound will be clearly related to the source in a way which may be musically useful. In the sound examples, a set of short sounds with a wood-like attack, suggesting a bouncing object, has its wavesets replaced by square waves, and then by sine waves, suggesting a change in the physical medium in which the 'bouncing' takes place (e.g. bouncing in sand). Two interpolating sequences (see Chapter 12) between the 'wood' and each of the transformed sounds are then created by inbetweening (see Appendix p46 & Chapter 12). (Sound example 3.30).
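The sketch below illustrates waveset substitution in Python/NumPy. It is not the CDP program: each detected waveset keeps its duration and peak amplitude, but its interior is redrawn with a supplied wave-shape (a sine by default; a crude square is shown as an alternative).

    import numpy as np

    def waveset_substitute(signal, shape=np.sin):
        """Replace each waveset with a new wave-shape of the same length and peak."""
        rising = np.where((signal[:-1] < 0) & (signal[1:] >= 0))[0] + 1
        out = np.copy(signal)
        prev = rising[0] if len(rising) else 0
        for z in rising[1:]:
            length = z - prev
            peak = np.max(np.abs(signal[prev:z]))
            phase = np.linspace(0.0, 2.0 * np.pi, length, endpoint=False)
            out[prev:z] = peak * shape(phase)          # same length, same peak
            prev = z
        return out

    # Hypothetical usage: sine replacement, then a crude square replacement.
    noisy = np.random.randn(4410)
    as_sines = waveset_substitute(noisy)
    as_squares = waveset_substitute(noisy, shape=lambda p: np.sign(np.sin(p)))

Because only the shape inside each zero-crossing pair changes, the loudness trajectory and segmentation of the original sound survive, which is why the bouncing-object example above remains recognisably a set of bounces.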
averaging the waveset length and the wave shape (and hence the resulting spectral contour) in perceptually unpredictable way. We may add 'harmonic components' to the waveset in any desired proportions (waveset Iwrmollic distortion) by making copies of the wa veset which are 112 as shon (1/3 as sbort etc) and superimposing 2 (3) of these on the original waveform in any specified amplitude weighting. We might also change the spectrum by applying a power factor 10 the wavefonn shape itself (waveset distortion: Appendix pS2) (Sound example 3. Here we have attempted to focus on sound composition in a particular way. this may he true with simple sound input but complex sounds are just changed in spectral "feel" as a rapidly changing sine-wave is no less perceptually chaotic than a rapidly changing arbitrary wave-shape.
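By way of illustration, here is a minimal sketch of waveset substitution, assuming the sound is a mono numpy array; the segmentation (spans between alternate zero-crossings) follows the idea described above, but the function names are my own and this is not the CDP instrument itself.

```python
import numpy as np

def waveset_boundaries(sig):
    """Indices of zero-crossings; a waveset is the span between
    alternate crossings (two half-cycles)."""
    signs = np.signbit(sig)
    crossings = np.where(signs[:-1] != signs[1:])[0] + 1
    return crossings[::2]

def waveset_substitute(sig, shape=np.sin):
    """Replace each waveset with one cycle of `shape`, preserving its
    length and peak amplitude (waveset substitution)."""
    out = np.zeros_like(sig)
    edges = waveset_boundaries(sig)
    for a, b in zip(edges[:-1], edges[1:]):
        peak = np.max(np.abs(sig[a:b]))
        phase = np.linspace(0.0, 2 * np.pi, b - a, endpoint=False)
        out[a:b] = peak * shape(phase)
    return out

# Square-wave substitution could be tried with:
# waveset_substitute(snd, lambda ph: np.sign(np.sin(ph)))
```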

CHAPTER 4
ONSET

WHAT IS SIGNIFICANT ABOUT THE ONSET OF A SOUND?

In the previous chapter (Spectrum) we discussed properties which sounds possess at every moment, even though these properties may change from moment to moment. There are, however, properties of sound intrinsically tied to the way in which the sound changes. In this chapter and the next we will look at those properties.

Why should we single out the properties of the onset of a sound as being any different to those that follow? After all, the next chapter, entitled "Continuation", might seem to deal happily with all those properties. The onset of a sound, its attack, has two particular properties which are perceptually interesting.

Firstly, in most naturally occurring sounds the onset gives us some clue as to the causality of the sound: what source is producing it, where it's coming from, how much energy was expended in producing it. Of course, we can pick up some of this information from later moments in the sound, but such information has a primitive and potentially life-threatening importance in the species development of hearing. Hearing did not develop to allow us to compose music, but to better help us to survive. We are therefore particularly sensitive to the qualities of sound onset: at some stage in the past our ancestors' lives may have depended on the correct interpretation of that data. A moment's hesitation for reflection may have been too long!

Secondly, because of the way sound events are initiated in the physical world, the onset moment almost inevitably has some special properties. Bells need to be struck, flutes blown etc., to push the system into motion. Some resonating systems can, with practice, be put into resonance with almost no discontinuity (the voice, flute or brass tonguing), but there needs to be a moment of transition from non-activation to activation, usually involving exceeding some energy threshold. Thus a resonating cavity (like a pipe) may produce a sustained and stable sound once it is activated. Other systems have internal resonance: once set in motion we do not have to continue supplying energy to them, but we therefore have to supply a relatively large amount of energy in a short time at the event onset (piano string, bell). Other systems produce intrinsically short sounds as they have no internal resonance. Others either require a transient onset event, or can be initiated with such an event. Such sources can produce either individual short sounds (drums, xylophones, bowl gongs, bloogles, many vocal consonants) or be activated iteratively (drum roll, rolled "r"). This means that they have special properties which differentiate them (perceptually) from continuous sounds and must be treated differently when we compose with them.

Iterative sounds are a special case in which perceptual considerations enter into our judgement. Low and high contrabassoon notes are both produced by the discontinuous movement of the reed. In the lower notes we hear out those individual motions, as they individually fall within the grain time-frame (see Chapter 1). Above a certain speed, however, the individual reed movements fall below the grain time-frame boundary and the units meld in perception into a continuous event. Sounds which are perceptually iterative, or granular, can be thought of as a sequence of onset events. These matters are discussed in Chapters 6, 7 and 8.

These properties are, in fact, so important that we can destroy the recognisability of instrumental sounds (flute, trumpet, violin) fairly easily by removing their onset. In acoustic instruments the initiating transition from "off" to "on" is most often a complex event in its own right, a breathy release, a clang, or whatever, particularly where these produce forced vibrations rather than natural resonant frequencies. It is of course not only the onset which is involved in source recognition. Sound sources with resonance and natural decay (struck piano strings, struck bells) are also partly recognisable by this decay process, and if it is artificially prevented from occurring our percept may change (is it a piano or is it a flute?). We need, therefore, to pay special attention to the onset characteristics of sounds. For a more detailed discussion see On Sonic Art.

GRAIN-SCALE SOUNDS

Very short sounds (xylophone notes, vocal clicks, two pebbles struck together) may be regarded as onsets without continuation. Such sounds may be studied as a class on their own, with the dimensions of a grain (see Chapter 1) and their own intrinsic sonic properties. We may be aware of pitch, pitch motion, spectral type (harmonicity, inharmonicity, noisiness etc.) or spectral motion. But our percept will also be influenced strongly by the loudness trajectory of such sounds. Thus any grain-scale sound having a loudness trajectory of type 1a (see Diagram 1) will appear "struck", as the trajectory implies that all the energy is imparted in an initial shock and then dies away naturally. We can create the percept "struck object" by imposing such a brief loudness trajectory on almost any spectral structure. The more noisy the spectrum, the more "drum-like" or "cymbal-like" the result, but we are retaining the percept "struck" because of the persisting form of the loudness trajectory. (Sound example 4.1).

If, however, we provide a different loudness trajectory (by enveloping), like type 1b, the energy in the sound seems to grow. At the very least the percept is "gradual initiated" rather than "sudden initiated", which we might intuitively associate with rubbing or stroking or some other gentler way of coaxing an object into its natural vibrating mode. (Sound example 4.2). A sudden excitation brought to an abrupt end (1c), or yet another energy trajectory which has a quiet onset and peaks near the end, suggests perhaps an extremely forced scraping together of materials, where the evolution of the process is controlled by forces external to the natural vibrating properties of the material, e.g. sawing or bowing. (Sound example 4.3). Thus, with such very brief sounds, transformation between quite different aural percepts can be effected by the simple device of altering the loudness trajectory.

Combining this with the control of spectral content and spectral change gives us a very powerful purchase on the sound-composition of grain-size sounds. For example, a time-stretched vocal sound may have an overall trajectory imposed on it made out of such grain-scale trajectories. The individual grains of the resulting iterated sound may appear like struck wood. If these grains are then spectrally altered (using, for example, the various destructive distortion instruments discussed in Chapter 3) we may alter the perceived nature of the "material" being "struck". (Sound example 4.4).

DIAGRAM 1
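To make the loudness-trajectory idea concrete, here is a minimal sketch (Python, assuming a mono numpy array) of imposing a type-1a "struck" trajectory on an arbitrary grain-scale source. The exponential decay shape and the function names are my own assumptions, not a prescription from the text.

```python
import numpy as np

def struck_envelope(n_samples, sr=44100, decay_time=0.08):
    """Type-1a trajectory: instantaneous attack followed by a
    natural-sounding exponential decay."""
    t = np.arange(n_samples) / sr
    return np.exp(-t / decay_time)

def make_struck(source, sr=44100, decay_time=0.08):
    """Impose a 'struck' loudness trajectory on any grain-scale source."""
    return source * struck_envelope(len(source), sr, decay_time)

# e.g. a plain noise burst acquires a drum/cymbal-like 'struck' percept:
sr = 44100
grain = make_struck(np.random.uniform(-1, 1, int(0.3 * sr)), sr)
```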

ALTERED PHYSICALITY

What we have been describing are modifications to the perceived physicality of the sound. If we proceed now to sounds which also have a continuation, subtle alterations of the onset characteristics may still radically alter the perceived physicality of the sound. For example, we can impose a sudden-initiated (struck-like) onset on any sound purely by providing an appropriate onset loudness trajectory, or we can make a sound gradual-initiated by giving it an onset loudness trajectory which rises more slowly. In the grain time-frame we can proceed from the "struck inelastic object" to the "struck resonating object" to the "rubbed", and beyond that to the situation where the sound appears to rise out of nowhere, like the "singing" of bowl gongs or bloogles. (Sound example 4.5).
Where the sound has a "struck" quality, we may imply not just the energy but the physical quality of the striking medium. Harder striking agents tend to excite higher frequencies in the spectrum of the vibrated material (compare padded sticks, rubber-headed sticks and wooden sticks on a vibraphone). We can generalise this notion of the physical "hardness" of an onset to the onset of any sound. By exaggerating the high frequency components in the onset moment we create a more "hard" or "brittle" attack. (I make no apologies for using these qualitative or analogical terms. Grain time-frame events have an indivisible qualitative unity as percepts. We can give physical and mathematical correlates for many of the perceived properties, but in the immediate moment of perception we do not apprehend these physical and mathematical correlates. These are things we learn to appreciate on reflection and repeated listening.)

One way to achieve this attack hardening is to mix octave upward transposed copies of the source onto the onset moment, with loudness trajectories which have a very sudden onset and then die away relatively quickly behind the original sound (octave stacking). We do not have to use octave transposition, and the rate of decay is clearly a matter of aesthetic intent and judgement. The transpositions might be in the time frame of the original sound, or time-contracted (as with tape-speed variation: see Chapter 11). The latter will add new structure to the attack, particularly if the sound itself is quickly changing. We can also, of course, enhance the attack with downward transpositions of a sound, with similar loudness trajectories, the physical correlate of such a process being less clear. This latter fact is not necessarily important as, in sound composition, we are creating an artificial sonic world. (Sound example 4.6).

We can, for example, achieve in this way a "hard" or "metallic" attack to a sound which is revealed (immediately) in its continuation to be the sound of water, or human speech, or a non-representational spectrum suggesting physical softness and elasticity. We are not constrained by the photographically real, but our perception is guided by physical intuition even when listening to sound made in the entirely contrived space of sound composition. (Sound example 4.7).

Another procedure is to add noise to the sound onset but allow it to die away very rapidly. We may cause the noisiness to "resolve" onto an actual wavelength of the modified sound by preceding that wavelength with repetitions of itself which are increasingly randomised, i.e. noisier. This produces a plucked-string-like attack (sound plucking) and relates to a well known synthesis instrument for producing plucked string sounds, the Karplus-Strong algorithm. (Sound example 4.8).

The effect of modifying the onset has to be taken into consideration when other processes are put into motion. In particular, time-stretching the onset of a sound will alter its loudness trajectory and may even extend it beyond the grain time-frame. As the onset is so perceptually significant, time-stretching the onset is much more perceptually potent than time-stretching the continuation. This issue is discussed in Chapter 1. Also, editing procedures on sequences (melodies, speech-streams etc.) in many circumstances need to preserve event onsets if they are not to radically alter the perceived nature of the materials (the latter, of course, may be desired). Finally, extremely dense textures in mono will eventually destroy onset characteristics, whereas stereo separation will allow the ear to discriminate event onsets even in very dense situations. (Sound example 4.9).
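A minimal sketch of the octave-stacking idea follows, assuming a mono numpy array; the crude decimation-based transposition and the decay constant are illustrative assumptions only.

```python
import numpy as np

def octave_up(sig):
    """Crude octave transposition by dropping every other sample
    (tape-speed-variation style: twice as fast, twice as high)."""
    return sig[::2].copy()

def harden_attack(sig, sr=44100, decay=0.05, level=0.7):
    """Mix an octave-up, time-contracted copy, gated by a fast-decaying
    envelope, onto the onset of the original sound (octave stacking)."""
    top = octave_up(sig)
    env = np.exp(-np.arange(len(top)) / (decay * sr))
    out = sig.copy()
    out[:len(top)] += level * top * env
    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out
```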

ALTERED CAUSALITY

Because the onset characteristics of a sound are such a significant clue to the sound's origin, we can alter the causality of a sound through various compositional devices. In particular, a sound with a vocal onset tends to retain its "voiceness" even when continuation information contradicts our initial intuitive assumption. The piece Vox-5 uses this causality transfer throughout in a very conscious way, but it can operate on a more immediate level. Listen first to Sound example 4.10. A vocally initiated event transforms in a strange (and vocally impossible) way. If we listen more carefully, we will hear that there is a splice (in fact a splice at a zero crossing: zero-cutting) in this sound, where the vocal initiation is spliced onto its non-vocal (but voice-derived) continuation. (In Vox-5 the vocal/non-vocal transitions are achieved by smooth spectral interpolation, rather than abrupt splicing: see Chapter 12). When this abrupt change is pointed out to us, we begin to notice it as a rather obvious discontinuity and the "causal chain" is broken, but in the wider context of a musical piece using many such voice-initiated events we may not so easily home in on the discontinuity.

A more radical causality shift can be produced by onset fusion. When we hear two sounds at the same time, certain properties of the aural stream allow us to differentiate them. Even when we hear two violinists playing in unison, we are aware that we are hearing two violins and not a single instrument producing the same sound stream. At least two important factors in our perception permit us to differentiate the two sources. Firstly, the micro-fluctuations of the spectral components from one of the sources will be precisely in step with one another but generally out of step with those of the other source. So in the continuation we can aurally separate the sources. Secondly, the onsets of the two events will be slightly out of synchronisation no matter how accurately they are played. Thus we can aurally separate the two sources in the onset moments. If we now precisely align the onsets of two (or more) sounds to the nearest sample (onset synchronisation), our ability to separate the sources at onset is removed. The instantaneous percept is one of a single source. However, the continuation immediately reveals that we are mistaken. We thus produce a percept with "dual causality": at its outset it is one source, but it rapidly unfolds into two. In Sound example 4.11, from Vox-5, this process is applied to three vocal sources. Listen carefully to the first sound in the sequence. The percept is of "bell" but also voices, even though the sources are only untransformed voices. This initial sound initiates a sequence of similar sounds, but as the sequence proceeds the vocal sources are also gradually spectrally stretched (see Chapter 3), becoming more and more bell-like in the process.
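A minimal sketch of onset synchronisation is given below, assuming mono numpy arrays; the amplitude-threshold onset detector is a stand-in of my own, not the method used in Vox-5.

```python
import numpy as np

def onset_index(sig, threshold=0.02):
    """First sample whose absolute amplitude exceeds `threshold`:
    a crude stand-in for a real onset detector."""
    above = np.nonzero(np.abs(sig) > threshold)[0]
    return int(above[0]) if len(above) else 0

def onset_synchronise(a, b, threshold=0.02):
    """Mix two sounds with their onsets aligned to the same sample,
    so the composite starts as a single fused event."""
    shift = onset_index(a, threshold) - onset_index(b, threshold)
    if shift > 0:                      # delay b so its onset meets a's
        b = np.concatenate([np.zeros(shift), b])
    elif shift < 0:
        a = np.concatenate([np.zeros(-shift), a])
    out = np.zeros(max(len(a), len(b)))
    out[:len(a)] += a
    out[:len(b)] += b
    return out / max(np.max(np.abs(out)), 1e-12)
```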



CHAPTER 5
CONTINUATION

WHAT IS CONTINUATION?

Apart from grain-duration sounds, once a sound has been initiated it must continue to evolve in some way. Only contrived synthetic sounds remain completely stable in every respect. In this chapter we will discuss various properties of this sound continuation, sometimes referred to as morphology and allure. Some types of sound-continuation are, however, quite special. Sounds made from a sequence of perceived rapid onsets (grain-streams), sounds made from sequences of short and different elements (sequences) and sounds which dynamically transform one set of characteristics into a quite different set (dynamic interpolation) all have special continuation properties which we will discuss in later chapters. Here we will deal with the way in which certain single properties, or a small set of properties, of a sound may evolve in a fairly prescribed way as the sound unfolds. These same properties may evolve similarly over grain-streams, sequences and dynamically interpolating sounds. They are not mutually exclusive.

DISPERSIVE CONTINUATION & ITS DEVELOPMENT
Certain natural sounds are initiated by a single or brief cause (striking, a short blow or rub) and then continue to evolve because the physical material involved has some natural internal resonance (stretched metal strings, bells) or the sound is enhanced by a cavity resonance (slot drum, resonant hall acoustics). As the medium is no longer being excited, however, the sound will usually gradually become quieter (not inevitably; for example the sound of the tam-tam may grow louder before eventually fading away) and its spectrum may gradually change. In particular, higher frequencies tend to die away more quickly than lower frequencies, except in cases where a cavity offers a resonating mode to a particular pitch, or pitch area, within the sound. This may then persist for longer than any other pitch component not having such a resonating mode. We will call this mode of continuation attack dispersal. (Sound example 5.1).

In the studio we can immediately reverse this train of events (sound reversal), causing the sound to grow from nowhere, gradually accumulating all its spectral characteristics and ending abruptly at a point of (usually) maximum loudness. The only comparable real-world experience might be that of a sound-source approaching us from a great distance and suddenly stopping on reaching our location. We will call this type of continuation an accumulation. Accumulations are more startling if they are made from non-linear dispersals. The decay of loudness and spectral energy of a piano note tends to be close to linear, so the associated accumulation is little more than a crescendo. Gongs or tam-tams or other complex-spectra events (e.g. the resonance of the undamped piano frame when struck by a heavy metal object) have a much more complex dispersal, in which many of the initial high frequency components die away rapidly. The associated accumulation therefore begins very gradually but accelerates in spectral "momentum" towards the end, generating a growing sense of anticipation. (Sound example 5.2).


The structures of dispersal and accumulation can be combined to generate more interesting continuation structures. By splicing copies of one segment of a dispersal sound in a back-to-back fashion, so that the reversed version exactly dovetails into its forward version, we can create an ebb and flow of spectral energy (see Diagram 1). By time-variably time-stretching (time-warping) the result, we can avoid a merely time-cyclic ebb and flow. More significantly, the closer we cut to the onset of the original sound, the more spectral momentum our accumulation will gather before releasing into the dispersal phase. Hence we can build up a set of related events with different degrees of musical intensity or tension as the accumulation approaches closer and closer to the onset point. (See Diagram 2). As the listener cannot tell how close any particular event will approach, the sense of spectral anticipation can be played with as an aspect of compositional structure. This reminds me somewhat of the Japanese gourmet habit of eating a fish which is poisonous if the flesh too close to the liver is eaten. Some aficionados, out of bravado, ask for the fish to be cut within a hair's breadth of the liver, sometimes with fatal consequences. Sound composition is, fortunately, a little less dangerous. (Sound example 5.3). Again, time-warping, spatial motion or different types of spectral reinforcement (see Chapter 3) can be used to develop this basic idea.
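A minimal sketch of the back-to-back splice described above, assuming a mono numpy array; cut points are sample indices and the function name is illustrative.

```python
import numpy as np

def ebb_and_flow(dispersal, cut_start, cut_end):
    """Back-to-back splice of one segment of a decaying ('dispersal') sound:
    the reversed copy accumulates spectral energy, then dovetails exactly
    into the forward copy, which disperses it again."""
    seg = dispersal[cut_start:cut_end]
    return np.concatenate([seg[::-1], seg])

# Cutting closer to the onset (smaller cut_start) gives the accumulation
# more spectral "momentum" before it releases into the dispersal phase.
```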

DIAGRAM 1

DIAGRAM 2
UNDULATING CONTINUATION AND ITS DEVELOPMENT

Certain time-varying properties of sound evolve in an undulating fashion. The most obvious examples of these are undulation of pitch (vibrato) and undulation of loudness (tremolando). Undulating continuation is related to physical activities like shaking, and it is no accident that a wide, trill-like vibrato is known as a "shake". These variations in vocal sounds involve, in some sense, the physical shaking of the diaphragm, larynx or throat or (in more extended vocal techniques) the rib cage, the head or the whole body (!). This may also be induced in elastic physical objects (like thin metal sheets, thin wooden boards etc.) by physically shaking them. (Sound example 5.4).

In naturally occurring vibrato and tremolando there is a moment-to-moment instability of undulation speed (frequency) and undulation depth (pitch-excursion for vibrato, loudness fluctuation for tremolando) which is not immediately obvious until we create artificial vibrato or tremolando in which these features are completely regular. Completely regular speed, in particular, gives the undulation a cyclical, or rhythmic, quality, drawing our attention to its rhythmicity. (Sound example 5.5). Both speed and depth of vibrato or tremolo may have an overall trajectory (e.g. increasing speed, decreasing depth etc.). In many non-Western art music cultures, subtle control of vibrato speed and depth is an important aspect of performance practice. Even in Western popular music, gliding upwards onto an Hpitch as vibrato is added is a common phenomenon. (Sound example 5.6).

Vocal vibrato is in fact a complex phenomenon. Although the pitch (and therefore the partials) of the sound drift up and down in frequency, for a given vowel sound the spectral peaks (formants) remain where they are or, in diphthongs, move independently. (See Appendix p10 and p66). The pitch excursions of vibrato thus make it more likely that any particular partial in a sound will spend at least a little of its existence in the relatively amplified environment of a spectral peak. Hence vibrato can be used to add volume to the vocal sound. (See Diagram 3).

Ideally we should separate formant data before adding some new vibrato property to a sound, reimposing the separate motion of the formants on the pitch-varied sound. (See formant preserving spectral manipulation: Appendix p17). In practice we can often get away with mere tape-speed variation transposition of the original sound source.

DIAGRAM 3

We may extract the ongoing evolution of undulating properties using pitch-tracking (Appendix pp70-71) or envelope following (Appendix p58) and apply the extracted data to other events. We may also modify (e.g. by expansion or compression: Appendix p60) the undulating properties of the source-sound. Alternatively, we may define undulations in pitch (vibrato) or loudness (tremolo) in a sound, with control over time-varying speed and depth. In this way we may, for example, impose voice-like articulations on sounds with static spectra, or on non-vocal environmental sources (e.g. the sound of a power drill). (Sound example 5.7). We may also produce extreme, or non-naturally-occurring, articulations using extremely wide (several octaves) vibrato, or push tremolando to the point of sound granulation. At all stages the overall trajectories (gradual variation of speed and depth) will be important compositional parameters. (Sound example 5.8). In the limit (very wide and slow) tremolando becomes large-scale loudness trajectory and we may progress from undulating continuation to forced articulation (and vice versa). Similarly, vibrato (very wide and slow) becomes the forced articulation of moving pitch. We thus move from the undulatory articulation of a stable Hpitch to the realm of pitch motion structures in which Hpitch has no further significance. (Sound example 5.9).
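As a simple illustration of imposing an undulation with time-varying speed and depth, here is a sketch of a tremolando whose rate and depth glide between chosen start and end values; the parameter names and linear trajectories are assumptions of mine, not the instrument described in the text.

```python
import numpy as np

def tremolo(sig, sr=44100, speed=(4.0, 9.0), depth=(0.1, 0.6)):
    """Impose an undulating loudness (tremolando) whose speed (Hz) and
    depth (0..1) each glide linearly between the given start/end values."""
    n = len(sig)
    sp = np.linspace(speed[0], speed[1], n)
    dp = np.linspace(depth[0], depth[1], n)
    phase = 2 * np.pi * np.cumsum(sp) / sr      # integrate speed -> phase
    env = 1.0 - dp * 0.5 * (1.0 + np.sin(phase))
    return sig * env
```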

At the opposite extreme, tremolando may become so deep and fast that the sound granulates. We may then develop a dense grain texture (see Chapter 8) on which we may impose a new tremolando agitation, and this may all happen within the ongoing flow of a single event. (Sound example 5.10).

We may also imagine undulatory continuation of a sound's spectrum, fluctuating in its harmonicity/inharmonicity dimension, its stableness-noise dimension, or in its formant-position dimension (spectral undulation). Formantal undulations (like hand-yodelling, head-shake flutters or "yuyuyuyu" articulation) can be produced and controlled entirely vocally. (Sound example 5.11).

FORCED CONTINUATION AND ITS DEVELOPMENT

In any system where the energising source has to be continually applied to sustain the sound (bowed string, blown reed, speech) the activator exercises continuous control over the evolution of the sound. With an instrument, a player can force a particular type of continuation, a crescendo, a vibrato or tremolando, or an overblowing, usually in a time-controllable fashion. In general, the way in which the sound changes in loudness and spectrum (with scraped or bowed sounds etc.) or sometimes in pitch (with rotor generation as in a siren or wind machine) will tell us how much energy is being applied to the sounding system. These forced loudness, spectral or pitch movement shapes may thus be thought of as physical gestures translated into sound. Any gestural shape which can be applied by breath control or hand pressure on a wind or bowed string instrument can be reproduced over any arbitrary sound

(a sustained piano tone, the sound of a dense crowd) by applying an appropriate loudness trajectory (enveloping), with perhaps more subtle paralleling features (spectral enhancement by filtering, harmonicity shifting by spectral stretching, or subtle delay). Moreover, these sonically created shapes can transcend the boundaries of the physically likely (!). Sounds which are necessarily quiet in the real world (unvoiced whispering) can be unnervingly loud, while sounds we associate with great forcefulness, e.g. the crashing together of large, heavy objects or the forced grating of metal surfaces, can be given a pianissimo delicacy.

Moreover, we can extract the properties of existing gestures and modify them in musically appropriate ways. This is most easily done with time-varying loudness information, which we can capture (envelope following) and modify using a loudness trajectory manipulation instrument (envelope transformation), reapplying it to the original sound (envelope substitution), or transferring it to other sounds (enveloping or envelope substitution: see Appendix p59). Information can be extracted from instrumental performance (which we might specifically compose or improvise for the purpose), from speech or vocal improvisation, but also from fleeting unpredictable phenomena (the dripping of a tap) or, working in stereo, the character of a whole field of naturally occurring events (e.g. traffic flow, the swarming of bats etc.). The extracted gestural information can then be modified and reapplied to the same material (envelope transformation followed by envelope substitution), or applied to some entirely different musical phenomenon (enveloping or envelope substitution: see Appendix p59), or stored for use at some later date. All such manipulations of the loudness trajectory are discussed more fully in Chapter 10.

Sounds may also have a specific spatial continuation. A whole chapter of On Sonic Art is devoted to the exploration of spatial possibilities in sound composition. Here we will only note that spatial movement can be a type of forced continuation applied to a sound, that it will work more effectively with some sounds than with others, and that it can be transferred from one sound to another provided these limitations are borne in mind. Noisy sounds, grain-streams and fast sequences move particularly well; low continuous sounds particularly poorly. Some sounds move so well that rapid spatial movement (e.g. a rapid left-right sweep) may appear as an almost indecipherable quality of grain or of sound onset. (Sound example 5.12). The movement from mono into stereo may also play a significant role in dynamic interpolation (see Chapter 12).
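A minimal sketch of envelope following and envelope substitution, assuming mono numpy arrays; the windowed peak follower and the function names are illustrative assumptions, not the Appendix instruments themselves.

```python
import numpy as np

def envelope_follow(sig, sr=44100, window_ms=20.0):
    """Extract a loudness trajectory: peak of |signal| in successive windows,
    interpolated back to sample rate (envelope following)."""
    w = max(1, int(sr * window_ms / 1000.0))
    n_win = int(np.ceil(len(sig) / w))
    peaks = np.array([np.max(np.abs(sig[i*w:(i+1)*w])) for i in range(n_win)])
    centres = (np.arange(n_win) + 0.5) * w
    return np.interp(np.arange(len(sig)), centres, peaks)

def envelope_substitute(target, donor_env):
    """Impose an extracted loudness trajectory on another sound
    (enveloping / envelope substitution)."""
    own = envelope_follow(target)
    env = np.interp(np.linspace(0, len(donor_env) - 1, len(target)),
                    np.arange(len(donor_env)), donor_env)
    return target * env / np.maximum(own, 1e-6)

# e.g. give a sustained drone the gesture of a spoken phrase:
# shaped = envelope_substitute(drone, envelope_follow(speech))
```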

CONSTRUCTED CONTINUATION: TIME-STRETCHING & REVERBERATION

In many cases we may be faced with a relatively short sound and wish to create a continuation for it. There are a number of ways in which sounds can be extended artificially, some of which reveal continuation properties intrinsic to the sound (reverberation, time-stretching, some kinds of brassage) while others impose their own continuation properties (zigzagging, granular extension and other types of brassage). The most obvious way to create continuation is through time-stretching. There are several ways to do this (brassage/harmoniser, spectral stretching, waveset time-stretching, granular time-stretching) and these are discussed more fully in Chapter 11. It is clear, however, that through time-stretching we can expand the indivisible qualitative properties of a grain into a perceptibly time-varying structure, a continuation. In this way an indivisible unity can be made to reveal a surprising morphology.

This process also can change a rapid continuation into something in a phrase time-frame: for example, formant-gliding consonants ("w", "y" in English) become slow formant transformations ("oo" -> "uh" and "ee" -> "uh"). Conversely, a continuation structure can be locked into the indivisible qualitative character of a grain through extreme time-contraction. (Sound example 5.13).

Continuation may also be generated through reverberation. This principle is used in many acoustic instruments, where a sound box allows the energy from an instantaneous sound-event to be reflected around a space and hence sustained. Reverberation will extend any momentarily stable pitch and spectral properties in a short event. It may also be combined with filtering to sustain specific pitches or spectral bands. It provides a new dispersal continuation for the sound. (Sound example 5.14).

Natural reverberation is heard when (sufficiently loud) sounds are played in rooms or enclosures of any kind, except where the reverberation has been designed out, as in an anechoic chamber. Natural reverberation is particularly noticeable in old stone buildings (e.g. churches) or tiled spaces like bathrooms or swimming pool enclosures. Reverberation in fact results from various delayed versions of the sound (delayed because they travel by indirect paths to the ear, bouncing off walls or other surfaces) and spectrally altered versions (because the reflection process takes place on different physical types of surface) being mixed with the direct sound as it reaches the ear. Such processes can be electronically mimicked. The electronic mimicry has the added advantage that we can specify the dimensions of unlikely or impossible spaces (e.g. infinite reverberation from an "infinitely long" hall, the inside of a biscuit tin for an orchestra etc.). There is thus an enormous variety of possibilities for creating sound continuation through reverberation.

Reverberation can be used to add the ambience of a particular space to any kind of sound. In this case we are playing with the illusion of the physical nature of the acoustic space itself, the generalised continuation properties of our whole sound set. But it can also be used in a specific way to enlarge or alter the nature of individual sound events, extending (elements of) the spectrum in time. Reverberation cannot, however, by itself extend any undulatory or forced continuation properties of a sound. On the contrary, it will blur these: moving pitch will be extended as a pitch band, and loudness variations will simply be averaged in the new dispersal. We can, of course, post hoc add undulatory or forced continuation properties (vibrato, tremolo, enveloping) to a reverberation-extended sound. These will in fact help to unify the percept initiator-reverberator as a single sound-event. The reverberation part of the sound will appear to be part of the sound production itself, rather than an aspect of a sound box or a characteristic of a room. (Sound example 5.15). Initiator-reverberator models have in fact been used recently to build synthesis models of acoustic instruments, from vibraphones, where the separation might seem natural, to tubas.
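To show how such electronic mimicry might look in outline, here is a small Schroeder-style reverberator sketch (parallel feedback combs followed by allpass diffusers), assuming a mono numpy array; the delay times and feedback values are conventional illustrative choices, not those of any instrument mentioned in the text.

```python
import numpy as np

def comb(sig, delay, feedback):
    """Feedback comb filter: y[n] = x[n] + feedback * y[n - delay]."""
    out = np.copy(sig)
    for n in range(delay, len(out)):
        out[n] += feedback * out[n - delay]
    return out

def allpass(sig, delay, gain=0.5):
    """Schroeder allpass: diffuses echoes without colouring the spectrum."""
    out = np.zeros(len(sig))
    buf = np.zeros(delay)
    for n, x in enumerate(sig):
        v = x + gain * buf[n % delay]
        out[n] = buf[n % delay] - gain * v
        buf[n % delay] = v
    return out

def reverberate(sig, sr=44100, mix=0.3):
    """Mix the dry sound with a diffused tail built from parallel combs
    feeding series allpass filters."""
    tail = int(1.5 * sr)
    x = np.concatenate([sig, np.zeros(tail)])
    combs = sum(comb(x, int(sr * t), fb)
                for t, fb in [(0.0297, 0.77), (0.0371, 0.74),
                              (0.0411, 0.72), (0.0437, 0.70)])
    wet = allpass(allpass(combs, int(0.005 * sr)), int(0.0017 * sr))
    return x + mix * wet / 4.0
```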

CONSTRUCTED CONTINUATION: ZIGZAGGING & BRASSAGE

Continuation can be generated by the process of zigzagging. Here the sound is read in a forward-backward-forward etc. sequence, each reversal-read starting at the point where the preceding read ended. The reversal points may be specified to be anywhere in the sound. Provided we start the whole process at the sound's beginning and end it at its end, the sound will appear to be artificially extended. In the simplest case, the sound may oscillate between two fixed times in the middle of the sound until told to proceed to the end. This "alternate looping" technique was used in early commercial samplers to allow a short sampled sound to be sustained. It may work adequately where the chosen portion of the sound has a stable pitch and spectrum. Any segment of a sound beyond the grain time-frame will, however, have a continuation structure, and looping over it will cause this structure to be heard as a mechanical repetition in the sound.
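A minimal sketch of zigzag reading follows, assuming a mono numpy array and a user-supplied list of turning points (sample indices); the simple list-based interface is my own assumption.

```python
import numpy as np

def zigzag(sig, points):
    """Read the sound forward and backward between successive turning
    points, each reversal starting where the previous read ended.
    Starting at the beginning and finishing at the end makes the
    extension sound intrinsic to the source."""
    out, pos, forward = [], 0, True
    for p in points + [len(sig)]:
        lo, hi = (pos, p) if forward else (p, pos)
        seg = sig[lo:hi]
        out.append(seg if forward else seg[::-1])
        pos, forward = p, not forward
    return np.concatenate(out)

# e.g. oscillate inside the middle of the sound before moving on:
# extended = zigzag(snd, [30000, 15000, 40000, 20000])
```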


By altering the zigzag points subtly from zig to zig, we may vary the length (and hence duration) of the repeated segment. In this way zigzagging can be used to generate non-mechanical undulatory properties within a sound-continuation. We may also cycle round a set of pitches in a very small band, providing a subtle "stepped vibrato" inside the continuation (particularly if grain-size is slightly random-varied to avoid rhythmic constructs in perception). (Sound example 5.16). Zigzagging, however, can move from any point to any other within the sound, constantly changing length and average position as it does so. We can therefore use the process to select spectral, pitch or loudness movements or discontinuities in a sound and focus attention on them by alternating repetition (Sound example 5.17), or (on a longer timescale) to create dispersal-accumulation (see above) effects. (Sound example 5.18).

Sounds can also be artificially continued by brassage. In the brassage process successive segments are cut from a sound and then respliced together to produce a new sound. We can extend the duration of the sound by cutting overlapping segments from the source, but not overlapping them in the goal sound. (See Appendix p43). (Sound example 5.19). This idea of brassage can be generalised (see Appendix p44-B). In general, if segments are very short, grain time-frame segments will produce a simple time-stretching of the sound (harmoniser effect). Slightly larger segments may introduce a granular percept into the sound, as the perceived evolving shapes of the segments are heard as repeated. (Sound example 5.20). Longer segments, especially when operating on a segmented source (a melody, a speech stream), will result in a systematic collage of its elements. In the case, for example, of a long melodic phrase which we brassage in this way using a relatively large segment-size (including several notes), we will create a new melodic stream including more and more of the notes of the original melody as we proceed. The original melody will thus control the evolving HArmonic field of Hpitch possibilities in the goal sound. (Sound example 5.21).

We may also permit the process to select segments from within a time-range measured backwards from the current position in the source-sound (see Appendix p44-C). If this is done at random over a very small range, the instantaneous articulations of the sound will be echoed in an irregular fashion, adding a spectral (very short time-frame) or articulatory (brief but longer time-frame) aura to the time-stretched source. On a smaller time-frame, we can extract rich fields of possibilities from the small features of sounds with evolving spectra (especially sequences, which present us with constantly and perceptibly evolving spectra). (Sound example 5.22). But as range, density and bandwidth are increased and segment duration varied, echo percepts are randomised further. (Sound example 5.23). Ultimately, we can make the range include the entire span of the sound up to the current position (see Appendix p44-D). In this way the qualities of a highly characteristic onset event can be redistributed over the entire ensuing continuation. (Sound example 5.24).

The brassage components may also be varied in pitch. If we keep the range and pitch-spread small, we may expect to generate a time-stretched goal-sound which is spectrally thickened (and hence often more noisy, or at least less focussed), the degree of thickening being controlled by our choice of both density and pitch bandwidth. (Sound example 5.25). The pitch band can also be progressively broadened, and the loudness of segments can be varied in a random, progressive or cyclical way, the effect being to broaden the pitch band or the spectral energy of the original sound. We might also spatialise, or progressively spatialise, the segmented components, moving from a point source to a spread source. (Sound example 5.26).

CONSTRUCTED CONTINUATION: GRANULAR RECONSTRUCTION

We may generalise the brassage concept further, taking us into the realm of granular reconstruction. With granular reconstruction, our source sound is cut into segments which we may vary in duration, pitch or loudness, either progressively or at random; but instead of merely resplicing these tail-to-tail, they are used as the elements of an evolving texture in which we can control the density and the time-randomisation of the elements. In this way we can overlay segments in the resulting stream or, alternatively, introduce momentary silences between the grains. With regular segment-size our attention will be drawn to the segment length and the percept will probably be repetitively rhythmic. Using non-regular grain size near to the grain time-frame boundary produces a less rhythmicised, collage-type extension. The segment granulation of an already granular source may produce unexpected phasing or delay effects within the sound. (Sound example 5.27). Subtly controlling this and the previous factors, alongside imposed stream characteristics (density, pitch spread, loudness spread, spatial spread etc.), the nature of the source will come to have a less significant influence on the goal sound. Eventually it will become part of the overall field-properties of a texture stream (see Chapter 8). This process then passes over into texture control, and is discussed more fully in Chapter 8. (Sound example 5.28). Clearly, all the previous properties of the sound become grist to the mill of continuation-production. Ultimately, such evolved manipulations (and their combinations) force a continuous or coherently segmented source to dissociate into a mass of atomic events. The ultimate process of this type is sound shredding, which completely deconstructs the original sound and will be discussed in Chapter 7. (Sound example 5.29). This process, especially where used with very tiny grains, is also known as granular synthesis (the boundaries between sound processing and synthesis are fluid), and we may expect the spectral properties and the onset characteristics of the grains to influence the quality of the resulting sound stream.

Final note: in order to simplify the discussion in this chapter (and also in the diagrammatic appendix) brassage has been described here as a process in which the cut segments are not overlapped (apart from the length of the edits themselves) when they are reassembled. In practice, normal brassage/harmoniser processes use a certain degree of overlap of the adjacent segments to ensure sonic continuity and avoid granular artefacts (where these are not required) in the resulting sound. (See Appendix p73).
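To make the brassage idea concrete, here is a minimal overlap-add sketch of brassage used as time-stretching, assuming a mono numpy array; the Hanning windowing and parameter names are conventional assumptions of mine rather than the behaviour of any specific brassage/harmoniser instrument.

```python
import numpy as np

def brassage_stretch(sig, stretch=2.0, seg_len=2048):
    """Time-stretch by brassage: successive overlapping segments are cut
    from the source but the read position advances more slowly than the
    write position; segments are crossfaded into the goal sound."""
    hop_out = seg_len // 2                      # write position advance
    hop_in = int(hop_out / stretch)             # read position advance
    n_out = int(len(sig) * stretch) + seg_len
    out = np.zeros(n_out)
    window = np.hanning(seg_len)
    read, write = 0, 0
    while read + seg_len <= len(sig) and write + seg_len <= n_out:
        out[write:write + seg_len] += sig[read:read + seg_len] * window
        read += hop_in
        write += hop_out
    return out / max(np.max(np.abs(out)), 1e-12)
```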

CHAPTER 6
GRAIN-STREAMS

DISCONTINUOUS SPECTRA

In this chapter and the next we will discuss the particular properties of sounds with perceptibly discontinuous spectra. In one sense, all our sound-experience is discontinuous. No sound persists forever and, in the real environment, a sound will be interrupted by another, congruously or incongruously. The spectra of many sounds are discontinuous on such a small time-frame that we perceive the result as noise (or in fact as pitch if the discontinuities recur cyclically), rather than as a sequence of changing but definite sound events. We are here concerned with perceived discontinuous changes in the time range from the speed of normal speech down to the lower limit of grain perception, where the individual spectral moments are stable, or stable-in-motion, for a grain time-frame or more, and we perceive a sound-event with definite but rapidly discontinuous properties.

We will divide discontinuous sounds into two classes for the ensuing discussion. A grain-stream is a sound made by a succession of similar grain events; in the limit it is simply a single grain rapidly repeated. Even where this (non!) ideal limit is approached in naturally occurring sounds (e.g. a drum roll, low contrabassoon notes), we will discover that the individual grains are far from identical, nor are they ever completely regularly spaced in time. (Sound example 6.1). Discontinuous sounds consisting of different individual units (speech, a melody on a single instrument, any rapid sequence of different events) we will refer to as sequences and will discuss these in the next chapter. (Sound example 6.2).

Compositionally, we tend to demand different things of discontinuous sounds than of continuous ones. In particular, if we time-stretch a continuous sound, we may be disturbed by the onset distortion, but the remainder of the sound may appear spectrally satisfactory. If we time-stretch a discontinuous sound, however, we will be disconcerted everywhere by onset distortion, as the sound is a sequence of onsets. We wish to be selective about what we time-stretch! Often we want the sound (e.g. a drum roll, a melody, a speech-stream) to be delivered more slowly to us without the individual attacks (the drum strike, the timbral structure of consonants) being smeared out and hence transformed. The idea of slowing down an event stream without slowing down the internal workings of the events is quite normal in traditional musical practice: we just play in a slower tempo on the same instrument, and the internal tempi of the onset events is not affected. But with recorded sounds we have to make special arrangements to get this to work.

Both grain-streams and sequences can have (or can be composed to have) overall continuation properties (dispersive, undulatory and forced continuation and their developments, as discussed in Chapter 5). In this chapter and the next, however, we will talk only about those properties which are special to grain-streams and sequences.

CONSTRUCTING GRAIN-STREAMS

Grain-streams appear naturally from iterative sound-production: any kind of roll or trill on drums, keyed percussion or any sounding material. Vocally, they are produced by rolled "r" sounds of various sorts in various languages, by lip-farts and by tracheal rasps. Such sounds may be used to modulate others (sung rolled "r", whistled rolled "r", flutter-tongued woodwind and brass etc.). The rapid opening and closing of a resonating cavity containing a sound source (e.g. the hand over the mouth as we sing or whistle) can be used to naturally grain-stream any sound. (Sound example 6.3).

In the studio, any continuous sound may be grain-streamed by imposing an appropriate on-off type loudness trajectory (enveloping), or by the appropriate use of gating; the trajectory itself might be obtained by envelope following another sound (see Chapter 10). (On-offness might be anything from a deep tremolando fluttering to an abrupt on-off gating of the signal). Looping can be used to do this but will produce a mechanically regular result. An instrument which introduces random fluctuations of repetition-rate and randomly varies the pitch and loudness of successive grains over a small range (iteration) produces a more convincingly natural result. (Sound example 6.4). A particularly elegant way to achieve the effect is to generate a loudness trajectory on a sound tied to the wavesets or wavecycles it contains (waveset enveloping). If each on-off type trajectory is between about 25 and 100 wavesets in length, we will hear grain-streaming. (Below this limit, we may produce a rasping or spectral "coarsening" of the source sound). This process ties the grain-streaming to the internal properties of the source-sound, so that it appears intrinsic to the material rather than a compositional affectation (!). (Sound example 6.5). Short continuous sounds can be extended into longer grain-streamed sounds by using brassage with appropriate (grain time-frame duration) segment-size. The granulation of the resulting sound can be exaggerated by corrugation (see Chapter 10) and the regularity of the result mitigated by using some of the grain-stream manipulation tools to be described below. (Sound example 6.6).

Alternatively, grain-streams may be constructed by splicing together individual grains (!), by cutting and resplicing or remixing. More compositionally flexible, but more pernickety, is to use a mixing program so that individual grains can be placed and ordered, then repositioned, replaced or reordered using meta-instruments which allow us to manipulate mixing instructions (mixshuffling) or to generate and manipulate time-sequences (sequence generation). In the latter case we have control over the timing and Hpitching and sequencing of the events. (Sound example 6.7).

DISSOLVING GRAIN-STREAMS

Just as continuous sounds can be made discontinuous, so grain-streams can be dissolved into continuous, or even single-grain, forms. By speeding up the sequence rate of grains without changing the grains (granular time-shrinking: see below) we will breach the grain-perception limit. (Sound example 6.8). Speeding up a grain-stream by tape-speed variation or spectral time-contraction (with no grain pitch alteration) may force the grain separation under the minimum time limit for grain-perception, and the granulation frequency will eventually emerge as a pitch. In a grain-stream not constructed from individually chosen grains, a related pitch may or may not emerge. (Sound example 6.9). Increasing the grain density (e.g. via the parameters of a granular synthesis instrument) will also gradually fog over the granular property of a grain-stream: the sound will gradually become a continuous fuzz. Reverberation will likewise blur the distinction between grains. (Sound example 6.10). Grain-streams may also be dissociated in other ways: the grains themselves can be sequentially modified using some kind of enveloping (see Chapter 4), or spectral transformation tools (e.g. destructive distortion through waveset distortion, waveset harmonic distortion or waveset averaging: see Chapter 3), combined perhaps with inbetweening (see Chapter 12) to generate a set of intermediate sound-states. (Sound example 6.11). We are able to link grains with grain-streams, and continuous sounds with grain-streams, in this way and hence begin to build networks of musical relationships amongst diverse materials. Many of these compositional processes provide means of establishing audible (musical) links between materials of different types; we are describing here ways in which networks of musical relationships can be established amongst diverse musical materials. (Sound example 6.12).

CHANGING GRAIN-STREAM STRUCTURE

In some ways a grain-stream is akin to a note-sequence on an instrument, and in the studio we would like to be able to treat grains in a similar way to the way we deal with note events. In a grain-stream not constructed from individually chosen grains, we do not initially have this control. We can, however, slow down (or speed up) a grain-stream in at least three different ways: (1) slowing down the sequence of grains but not the grains themselves, so that grains become detached events in their own right (granular time-stretching: Sound example 6.13); (2) slowing down both the sequence and the grains, so the internal morphology of the individual events comes to the foreground of perception (spectral time-stretching: Sound example 6.14); (3) gradually shifting the pitches or spectral quality of different grains differently, so the grain-stream becomes a sequence (granular reordering: Sound example 6.15).

For example, a grain-stream slowed by tape-speed variation ritardandos as the perceived pitch falls. This suggests to the ear that the falling portamento and the ritardando are causally linked and intrinsic to the goal-sound. Alternatively, we can retime the grains in a sequence using various mathematical templates (slow by a fixed factor, ritardando arithmetically, geometrically or exponentially, randomise grain locations about their mean, shrink the time-frame in similar ways and so on: granular time-warping), which may all be combined in a single sound-processing instrument. In this way grain-streams can be given gentle accelerandos or ritardandos of different shapes and be slightly randomised to avoid a mechanistic result. The grain-stream can thus be elegantly time-distorted without distorting the grain-constituents. (Sound example 6.16). We can also reverse the grain order (granular reversing) without reversing the individual grains themselves (as sound reversing would), producing what a traditional composer would recognise as a retrograde (as opposed to an accumulation, see Chapter 5). Thus a grain-stream moving upwards in pitch would become a grain-stream moving downwards in pitch. (Sound example 6.17).
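As an illustration of grain-streaming a continuous sound by enveloping, with the small random fluctuations of rate and loudness described for the iteration process, here is a sketch assuming a mono numpy array; the jitter ranges and parameter names are my own assumptions.

```python
import numpy as np

def grain_stream(sig, sr=44100, rate=12.0, rate_jitter=0.15,
                 loud_jitter=0.2, duty=0.6):
    """Grain-stream a continuous sound by imposing an on-off loudness
    trajectory whose repetition rate and grain loudness fluctuate
    slightly from grain to grain."""
    env = np.zeros(len(sig))
    pos = 0
    while pos < len(sig):
        period = int(sr / (rate * (1 + np.random.uniform(-rate_jitter,
                                                         rate_jitter))))
        on = min(max(1, int(duty * period)), len(sig) - pos)
        level = 1.0 + np.random.uniform(-loud_jitter, loud_jitter)
        ramp = np.hanning(on) if on > 2 else np.ones(on)
        env[pos:pos + on] = level * ramp
        pos += period
    return sig * env
```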

We can go further than this. We might rearrange the grains in some new order (grain reordering): a rising succession of pitched sounds might thus be converted into a (motivic) sequence. Or we might move the pitch of individual grains without altering the time-sequence of the stream, or alter the timing of individual grains without altering their pitch, or some combination of both. (Sound example 6.18). In this way the control of grain-streams passes over into more conventional notions of event sequencing, melody and rhythm. I will not dwell on these here because hundreds of books have already been written about melody and rhythm, whereas sound composition is a relatively new field. We will therefore concentrate on what can be newly achieved, assuming that what is already known about traditional musical parameters is already part of our compositional tool kit.

CHAPTER 7
SEQUENCES

GENERAL PROPERTIES OF SEQUENCES

All sequences have two general properties: they can define both a field, a reference-frame for the sequence constituents, and an order. Thus the set of utterances within a particular natural language defines a field, the set of phonemes which may be used in the language. Similarly, any sequence played on a piano defines the tuning set, or HArmonic field, of the piano (possibly just a subset of it). In the finite space of a musical composition we may expect a reference set (field) and ordering properties to be established quite quickly if they are to be used as musically composed elements; if, within a finite piece of music, we repeat any small subset of the sequence, it will very rapidly establish itself as its own reference set. Sequences are also characterised by order properties. In a particular natural language certain clusterings of consonants (e.g. "scr" in English) will be commonplace, others rare and yet others absent. In existing musical languages, certain sequences of notes will be commonplace, others exceedingly unlikely. It is possible to construct sequences which do not have this property (in which no elements are repeated), just as it is possible to construct pitch fields where no Hpitch reference-frame is set up. It is easy to imagine and to generate unordered sequences, though the human mind's pattern-seeking predilection makes it over-deterministic, hearing definite pattern where pattern is only hinted at! But in general, for the present discussion, we will ignore this subtle flow property and treat all sequences as if they were formally equivalent. (Sound example 7.1).

MELODY & SPEECH

The most common examples of sequences in the natural world are human speech and melodies played on acoustic instruments. However, any rapidly articulated sound stream can be regarded as a sequence (some kinds of birdsong, a dripping tap, a klangfarbenmelodie passing between the instruments of an ensemble, a "break" on a multi-instrument percussion set etc.). (Sound example 7.2). Sequences of notes on a specific instrument and sequences of phonemes in a natural language have well-documented properties; these may be predetermined by cultural norms, but here we would like to consider the properties of any sequence whatsoever. Naturally occurring sequences cannot necessarily be accurately reproduced by splicing together (in the studio) their constituent elements. The speech stream in particular has complex transition properties at the interfaces between different phonemes which are (1994) currently the subject of intensive investigation by researchers in speech synthesis. To synthesize the speech stream it may be more appropriate to model all the transitions between the elements we tend to notate in our writing systems, rather than those elements themselves (this is Diphone synthesis). Thus, to achieve the flowing unity of such natural percepts as speech, it may be necessary to "massage" a purely spliced-together sequence. A simple approach might be to add a little subtle reverberation. (Sound example 7.3).

CONSTRUCTING & DESTRUCTING SEQUENCES

Sequences can be generated in many ways, apart from splicing together the individual elements "by hand". We can construct disjunct sequences of arbitrary sounds by simply splicing them together (e.g. a flute melody, an oboe note, a car horn, a cough, with environmentally appropriate or inappropriate loudness and/or reverberation relations one to the other), or by modifying existing natural sequences (time-contraction of speech or music, for example). (Sound example 7.4). Using brassage techniques, if the segment size is set suitably large and we work on a clearly spectrally-evolving source, we may achieve similar results: large ranges would reveal the material in the order sequence of the source but, in the meantime, return unpredictably to already revealed materials; smaller ranges would tend to preserve the order relations of the source. (See Appendix p41). Using several brassages with similar segment length settings but different ranges (see Appendix p44-C) we might create a group of sequences with identical field properties but different, yet related, order properties, i.e. copies of the sequence. (Sound example 7.5). Brassage with pitch variation of segments over a finite Hpitch set (possibly cyclically) could establish an Hpitch-sequence from a pitch-continuous (or other) source. (Sound example 7.6). Even brassage with spatialisation (see Appendix p45) will separate a continuous (or any other) source into a spatial sequence, and so long as they are clearly delineated, such spatial sequences can be musically manipulated: a sequence from left to right can become a sequence from right to left, or a sequence generally moving to the left from the right but with deviations, or change to an alternation of spatial position, and so on. Any sound source in directed motion (e.g. a pitch-glide or a formant glide) can be spliced into elements which, when rearranged, do not retain the spectral-continuity of the original. (Sound example 7.7).

Alternatively, we may cut our material into sequential segments, modify each in a non-progressive manner (different filterings, pitch shifts, etc.) and reconstitute the original sequence by resplicing the elements together again. We may work with a definite segment length or with arbitrary lengths, and we may shift the loudness or pitch of the materials before constructing the new sequence. (Sound example 7.8).

An existing speech-stream can be similarly reordered to destroy the syntactic content and (depending on where we cut) the phoneme-continuity. We may do this by chopping up the sequence into conjunct segments and reordering them (as in sound shredding: see below), or by selecting segments to cut at random (so they might overlap other chosen segments) and reordering them as they are spliced back together again (random cutting: Appendix p41). (Sound example 7.9).

The perception of sequence can also be destroyed in various ways. Increasing the speed of a sequence beyond a certain limit will produce a gritty noise or even, in the special circumstance of a sequence of regularly spaced events with strong attacks, a pitch percept. Conversely, time-stretching the sequence beyond a certain limit will bring the internal properties of the sequence into the phrase time-frame, and the sequence of events will become a formal property of the larger time-frame. (Sound example 7.10). Alternatively, a sequence may be shredded (sound shredding). In this process the sequence is cut into random-length conjunct segments which are then reordered randomly. This process may be repeated ad infinitum, gradually dissolving all but the most persistent spectral properties of the sequence in a watery complexity (the complex in fact becomes a simple unitary percept). Or (groups of) its constituents may be amassed in textures (see Chapter 8) of sufficient density that our perception of sequence is overwhelmed. (Sound example 7.11).
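A minimal sketch of sound shredding as just described, assuming a mono numpy array; the number of cuts is an illustrative parameter of my own.

```python
import random
import numpy as np

def shred(sig, n_cuts=20):
    """Sound shredding: cut the sound into random-length conjunct segments
    and resplice them in random order."""
    cuts = sorted(random.sample(range(1, len(sig)), n_cuts))
    segments = np.split(sig, cuts)
    random.shuffle(segments)
    return np.concatenate(segments)

# "This process may be repeated ad infinitum":
# dissolved = shred(shred(shred(snd)))
```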

COMPOSING FIELD PROPERTIES

Constructing sequences from existing non-sequenced, or differently sequenced, objects (a flute melody, a conversation in Japanese, or a traffic recording in a tunnel with a particularly strong resonance, for example) ensures that some field properties of the source sounds will inhere in the resulting sequence: a defined Hpitch reference set and a flute spectrum (with or without onset characteristics), the spectral characteristics of the phonemes of Japanese and perhaps the sex characteristics of the specific voices, the resonance characteristics of the tunnel. These field properties may then define the boundaries of the compositional domain (a piano is a piano is a piano is a piano) or conversely become part of the substance of it, as we transform the field characteristics (piano -> "bell" -> "gong" -> "cymbal" -> unvoiced sibilant etc.). (Sound example 7.12). Such fields may be predetermined by cultural norms, like tuning systems or the phoneme set of a natural language, but traditional musical practice is usually concerned with working on subsets of these cultural norms and exploring the particular properties and malleabilities of those subsets.

Compositionally we can transform the field (reference-set) of a sequence through time by gradual substitution of one element for another, by the addition of new elements, or by reduction in the total number of elements (this can be done with or without a studio!). We can also gradually transform each element (e.g. by destructive distortion: see Chapter 12) so that the elements of a sequence become more differentiated or, conversely, more and more similar, moving towards a grain-stream (see above). Or we may blur the boundaries between the elements through processes like reverberation, delay, spectral blurring, spectral shuffling, waveset shuffling, small time-frame brassage or granular reconstruction, or simply through time-contraction (of various sorts) so that the sequence succession rate falls below the grain time-frame and the percept becomes continuous. (Sound example 7.13). In these ways we can form bridges between sequences, grain streams and smoothly continuous sounds, establishing audible links between musically diverse materials.

We may also focus the field-properties of a sequence. In this context the size of the field is significant. In the finite space of a musical composition we may expect a reference set (field) and ordering properties to be established quite quickly if they are to be used as musically composed elements. Thus a series of pitches may seem arbitrary (and may in fact be entirely random across the continuum, establishing no reference frame of Hpitch). If, however, we repeat any small subset of the sequence, or part of a sequence, by looping (repeating that portion), it will very rapidly establish itself as its own reference set. We can thus move very rapidly from a sense of Hpitch absence to Hpitch definiteness. (Sound example 7.14). Similarly, if a meaningful group of words is repeated over and over again we eventually lose contact with meaning in the phrase, as our attention focuses on the material as just a sequence of sonic events defining a sonic reference frame. The same set-focusing phenomenon applies to the spectral characteristics of sounds, such as total spectral type (rising noise-bands within a given range, or an upward-sweeping noise-band, for example), which has some similar properties. (Sound example 7.15).

This focusing of small-scale field and order properties connects sound composition with poetry. In musical settings of prose the text is perceived as being in a separate domain to the "musical": the text is used as referential language, its field and order properties operating on the very large timescale of extensive language utterance, whereas the phonetic material establishes no small time-frame field and order properties. As we move towards poetry which is more strongly focused on the sonority of words, through assonance, alliteration and particularly rhyme, the importance of small-scale reference-frames and order sequencing may become overriding (e.g. Schwitters' Ursonate), and poetry, or simply syllabic utterance, begins to adopt the small-scale reference-frame and order sequencing for phonemes we find normal in traditional musical practice. We therefore discover a meeting ground between phonemic and traditional Hpitch and durational musical concerns, and these connections have been explored by sound poets (Amirkhanian etc) and composers (Berio etc) alike. (Sound example 7.16).

COMPOSING ORDER PROPERTIES

We can also compose with the order properties of sequences. In the simplest case, cycling (see above) establishes a rhythmic subset and a definitive order relation among the cyclic grouping. Beyond this simple procedure we move into the whole area of the manipulation of finite sets of entities, which has been very extensively researched and used in late Twentieth Century serial composition. Starting originally as a method for organising Hpitches, which have specific relational properties due to the cyclical nature of the pitch domain (see Chapter 13), it was extended first to the time-domain and then to all aspects of notated composition.

In traditional scored music the manipulation of set order-relations is usually confined to a subset of properties of the sounds, e.g. Hpitch, fixed duration values, a set of loudness values, or permutations of instrument type in a monodic klangfarbenmelodie. We may, for example, treat the Hpitch (and duration) material in terms of a small ordered reference set which is constantly regrouped in subsets (e.g. chord formations over a scale) and reordered (motivic variation). Or we could permute the instrumental sequence and not the Hpitches: Webern's Opus 21 Sinfonie provides us with such an instrument-to-instrument line and uses it canonically. Permutation of a finite set of elements in fact provides a way of establishing relationships among element-groupings which, if the groupings are small, are clearly audible. If the sets are large, such permutations at least preserve the field (reference-set), so that complex musical materials are at least field-related (claims that they are perceptually order-related are often open to dispute).

All the insights of serial or permutational thinking about materials may be applied to any type of sequence, provided that we ask the vital question, "is this design audible?". And in what sense is it audible: as an explicit re-ordering relation, or as an underlying field constraint? We can also work with alternative properties, or groups of properties, of the sounds, such as formant and onset characteristics. Thus, considered as a formant sequence, bo-ba-be = do-da-de = lo-la-le, while considered as an onset-characteristic sequence, bo-ka-pe = bu-ki-po = ba-ko-pu etc. (Sound example 7.17).
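As an illustration of how small such order-manipulation tools can be, the following Python sketch enumerates ordered regroupings of a hypothetical five-note reference set and then permutes an instrument line against a fixed pitch ordering. The pitch values and instrument names are invented for the example.

    from itertools import permutations
    import random

    # A hypothetical five-note Hpitch reference set (MIDI numbers) and instruments.
    field = [60, 62, 63, 67, 70]
    instruments = ["flute", "oboe", "clarinet", "horn", "harp"]

    # Every ordered trichord drawn from the set: the orderings differ, but all
    # remain inside the same field, hence are at least field-related.
    trichords = list(permutations(field, 3))
    print(len(trichords), trichords[:3])

    # Permuting the instrumental assignment while leaving the Hpitch order
    # intact, in the spirit of an instrument-to-instrument (klangfarben) line.
    line = random.sample(field, len(field))        # one ordering of the pitches
    for assignment in list(permutations(instruments))[:3]:
        print(list(zip(assignment, line)))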

STRESS PATTERNS AND RHYTHM

We can apply different order-groupings to mutually exclusive sets of properties (e.g. onset, loudness, continuation, as against Hpitch) within the same reference-set of sounds. We might imagine an unordered set of sounds which define no small reference set, on which we impose a sequence of "bright", "soft", "hard", "subdued" etc. attacks as a definable sequence. The continuations of such sounds might be recognisable (e.g. water, speech, a bell, a piano concerto excerpt etc.), but ordering principles are applied just to the onset types. (Sound example 7.18). This is an extreme example because I wish to stress the point that order can be applied to many different properties and can be used to focus the listener's attention on the ordered property, or to contrast the area of order manipulation with that of lesser order. In our extreme example we have suggested a kind of ordering of onset characteristics laid against a "cinematographic" montage which might have an alternative narrative logic: the two might offset and comment upon one another in the same way that motivic (and rhythmic) ordering and the narrative content of prose do so in traditional musical settings of prose. In the setting of prose this contrast is clear. Here field variation is altering order perception. (Sound example 7.19).

Ordering principles can, of course, also be applied to sounds as wholes. Any fixed set of sounds can be rearranged as wholes and these order relations explored and elaborated. A traditional example is English change-ringing, where the sequence in which a fixed set of bells is rung goes through an ordered sequence of permutations. (See Diagram 1, and the sketch after it).

Onset characteristics and overall loudness are often organised cooperatively to create stress patterns in sequences. Stress patterns, as continuation properties, can be most easily observed as patternable properties in language or language-derived syllables. Combined with the organisation of durations, we have the group of properties used to define rhythm. Whole volumes might be devoted to rhythmic organisation but, as there is already an extensive literature on this, I will confine myself to just a few observations.

Rhythmic structure can be established independently of Hpitch structure (as in a drum-kit solo) or parallel to and independently of Hpitch order relations. Most pre-twentieth-century European art music and e.g. Indian classical music separates out these two groupings of sound properties and orders them in a semi-independent but mutually interactive way (HArmonic rhythm in Western music; melodic/rhythmic cadencing in Indian classical music). (See Chapter 9). Stress patterns, even cyclical stress patterns, may be established independently of duration organisation, e.g. on a regular duration pattern that is subsequently permuted or time-warped in a continuously varying manner. Stress duration patterns at different tempi, or at different varying tempi, can be organised in such a way that they synchronise at exactly defined times, these times perhaps themselves governed by larger time-frame stress patterns.

It might, for example, be possible to create superimposed grouping perceptions on the same stream of semiquaver units, e.g. grouped in 5 and in 7 and in 4, and to change their relative perceptual prominence by subtly altering loudness balance or onset characteristics, or in fact to destroy grouping perception by a gradual randomisation of stress information. By subtle continuous control of relative loudness or onset characteristics we can change the perception of the stress pattern, a feature over which we can have complete control in sound composition. (Sound example 7.20).

DIAGRAM 1: ENGLISH BELL-RINGING PATTERNS (PLAIN BOB). [Patterns of permutations on eight bells, generated by alternately swapping adjacent pairs ("swap even pairs" / "swap odd pairs", with "swap even pairs except the 1st" at the ends of the ALPHA/BETA blocks), continuing until the cycle returns to 12345678.]
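The permutation scheme sketched in Diagram 1 can be generated mechanically. The Python sketch below produces the "plain hunt" core of such change-ringing rows by alternately swapping adjacent pairs of bells; it is not a full implementation of the Plain Bob method with its "bob" calls, and the function names are my own.

    def plain_hunt(n_bells=8, rows=16):
        """Generate change-ringing rows by alternately swapping adjacent
        pairs starting at position 1, then at position 2 (the 'plain hunt'
        that underlies methods such as Plain Bob)."""
        row = list(range(1, n_bells + 1))
        out = [row[:]]
        for i in range(rows - 1):
            start = 0 if i % 2 == 0 else 1       # alternate the swap pattern
            row = row[:]
            for j in range(start, n_bells - 1, 2):
                row[j], row[j + 1] = row[j + 1], row[j]
            out.append(row)
        return out

    for r in plain_hunt(8, 8):
        print(" ".join(map(str, r)))

The first few rows (12345678, 21436587, 24163857, ...) match the pattern of pair-swaps shown in the diagram; substituting any fixed set of sounds for the bells gives the same ordered sequence of whole-object permutations.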

the "repeated" objects are not internally varied in more complicated ways at smaller time-frames. e. or as field constraints on the total i ¥r Ur . the more perception is dependent on long term memory.) is a way of passing from one perceptual time-frame to another.r IF t @* P~f~! fn. are they heard directly." a . 77 J. stressing the division of the 6 into 3 groups of 2. the first note of the 6 is most important. and all such considerations can be applied to any subset of the properties of sounds. Also. Most probably 3/4 t J1lDJ11 Most probably 6/8 1. Permuting and otherwise reordering sequences in ever more complicated ways is something that computers do with consummate ease. can manipulate order sequences of any length and of any unit-size in an entirely equivalent manner.i' ?tl¥- 7 Und.. _____ J)7 ~"7 7 J.t ~ -r . but no less important. This is not to say that.SU itfJij IJ. as the elements of our sequence hccome larger. does the listener hear those reorderings and if so.z XI h~p ~? . just as time-shrinking will ultimately contract the sequence into an essentially continuous perceptual event. Permutational logic is heady stuff and relationships of relationships can breed in an explosive fashion. The questions must always be.. I 64 ~7~1~11:7'I~ ~:'f-V f7f1lf1 7.lf?7W'%~~i¥i ~1m~~! . because reordering principles can be applied on larger and larger time-frames and are the substance of traditional musical "form". Clarence Barlow has defined an "indispensability factor" for each event in a rhythmic grouping.Alternatively.I II ? (( . Thus the temporal extension of a sequence (by unit respacing.g.-1.1 :I! .1'~~cl~h ?. A powerful order-manipulation instrument can be written in a few lines of computer code. whereas what might appear to be a simple spectral-processing procedure might run to several pages. the permutation structure tends to be simpler. These factors are then used to define the probability that a note will occur in any rhythmic sequence and by varying these probabilities. In 618 for example. (Diagram 2). Hence larger scale order relations tend to become simpler. Equivalence in the numerical domain of the computer (or for that matter on the spatial surface of a musical score) is not the same as experiential equivalence. a 618 bar. and so on. The computer.'I ('llkl.21). In 3/4 over the same 6 quavers." !II . ~ r ?~ ~lfn. time-stretchillg etc.'. especially of Hpitches.f7J. . but as large time-block qua large time-blocks. by making all notes equally probable. _ . ~~ ~ hi r ~ 11"J f r t ''')(1'. I' i OR [ij it ~fs~11. the 4th note (wflich begins the 2nd set of 3 notes) the next. in traditional practice. however. with Hpitches.+~ fl = • ntlf':~~'~J'Bl~71r-rf7 r1f'ttt1­ " . altering the time-sequencing of motivic-groups-taken-as-wholes.g. ~.. as we deal with larger and larger time-frames.. - 77_77p 1\1 G7 E7 7 P~_~ ?±!. 7 ol: .#. all stress regularity is lost and the sequence becomes arhythmic above the quaver pulse level. As composers. ~¢! r iri~ Itjfi~ H uti' ~. 677~#:G7. I would argue that the larger the time-frame (phrase-frame and beyond).ePlt ~1 ~7 tr' tiij ij Pt 17 7 • GliTJ lJ71 .7~7~77~'17:~1:/J'7 1:1<1 1'1 Ambiguous 3/4 or 6/8 "$ r.. we can e. DIAGRAn: 2 ORDERING ORDER We can also work with order-groupings of order-groupings e. we pass over from one time-frame to another. In fact. in traditional practice. like Fibonacci's rabbits.. ~ ~14h7'f1 ~~ ii~ I2t!k~ r t "~ ~=. (Sound example 7. There is an extensive musical literature discussing order-regrouping.oldobl. 
(Such matters are discussed in greater detail in Chapter 9).G. vary our perception of 6/8-ness versus 3/4-ness. having no ear or human musical judgement. we must be clear about the relationships of such operations to our time experience.tJ. i rII j :tt $kZ I' 1 I •• I I .7:lfyc?~tr71M11 fit.g. Audible design requires musical judgement and cannot be "justified" on the basis of the computer-logic of a generating process. From a perceptual point of view. or to sounds as wholes. Here we are beginning to stray beyond the frame of reference of this chapter. With time-expansion. and the less easy pattern-retention and recall becomes. the perceptual boundaries are less clear. This defines how important the presence or absence of an event is to our perception of that grouping. the Indispensability factors will be arranged in a different order.
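A sketch of the probability idea described above, in Python. The indispensability ranks here are hand-chosen to suggest a 6/8 and a 3/4 feel and are not computed from Barlow's actual formula; the "stress" parameter is my own shorthand for moving between strongly grouped quavers and equally probable (arhythmic) ones.

    import random

    # Illustrative indispensability ranks for six quaver positions
    # (higher = more indispensable).  NOT Barlow's computed values.
    RANKS_68 = [5, 0, 1, 4, 2, 3]   # strongest on quavers 1 and 4 (two groups of 3)
    RANKS_34 = [5, 0, 3, 1, 4, 2]   # strongest on quavers 1, 3 and 5 (three groups of 2)

    def bar(ranks, density, stress=1.0, rng=random):
        """Return a 1/0 pattern of which quavers sound.  With stress=1 the
        probability of a note rises with its indispensability; with stress=0
        all quavers are equally probable and the grouping feel disappears."""
        top = max(ranks)
        return [1 if rng.random() < density * ((1 - stress) + stress * r / top) else 0
                for r in ranks]

    for _ in range(4):
        print(bar(RANKS_68, density=0.8), bar(RANKS_34, density=0.8))

Interpolating the ranks (or the stress value) between the two tables gives the ambiguous 6/8-versus-3/4 percepts described in the text.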

The questions must always be asked: does it matter to the listener that they hear this reordering process? Or need the procedures be quite so involved as I suppose: is there a simpler, and hence more elegant, way to achieve an equivalent aural experience? A more difficult question, but no less important: what does this reordering "mean" in the context of an entire composition? Is it another way of looking down a kaleidoscope of possibilities, or a special moment in a time-sequence of events taking on a particular significance in a musical unfolding because of its particular nature and placement with respect to related events? These are, of course, aesthetic questions going beyond the mere use, of itself, of a compositional procedure, as with any musical process. Audible design requires musical judgement and cannot be "justified" on the basis of the computer-logic of a generating process.

CHAPTER 8

TEXTURE

WHAT IS TEXTURE?

So far in this book we have looked at the intrinsic or internal properties of sounds. However, in this chapter we wish to consider properties of dense agglomerations of the same or similar (e.g. tape-speed variation transposed) versions of the same sound, or set of sounds. This is what I will call texture. Initially we will assume that what I mean by texture is obvious. In the first two sound examples we give what we hope will be (at this stage) indisputable examples of measuredly perceived sound and texturally perceived sound. In the first we are aware of specific order relations amongst Hpitches and the relative onset times of events and can, so to speak, measure them, or assign them specific values (this will be explained more fully later). (Sound example 8.1). In the latter we hear a welter of events in which we are unaware of any temporal order relations. However, we are aware that the sound experience has some definable persisting properties: the Hpitches define a persisting reference set (a HArmonic field). (Sound example 8.2). In the next chapter we will analyse in more detail the boundary between textural and measured perception, particularly in relation to the distribution of events in time; we can also apply the notion of texture to temporal perception itself. This involves more detailed arguments about the nature of perception, and we will leave it to the next chapter.

We can lose the sense of sequential ordering of a succession of sounds in two ways. Firstly, the elements may be relatively disordered (random) in some property (e.g. HPitch). There are, however, immediate perceptual problems with the notion of disorder. We can generate an order-free sequence in many ways, but it is possible to pick up on a local ordering in a totally disordered sequence. Thus a random sequence of zeros and ones may contain the sequence 11001100, which we may momentarily perceive as ordered, even if the pattern fails to persist. Such focusing on transient orderliness may be a feature of human perception, as we tend to be inveterate pattern-searchers. So a disorderly sequence, of itself, need not lead to textural perception. (Sound example 8.3). A disordered sequence of Hpitches in a temporally dense succession is a fairly straightforward conception: textural perception only takes over unequivocally when the succession of events is both random and dense. (Sound example 8.4).

Secondly, the elements may succeed each other so quickly that we can no longer grasp their order: we no longer have any perceptual bearings for assigning sequential properties to the sound stream. By the same token, however, if a sequence, or set of sounds, is repeated (over and over), no matter how rapid, the sequence shape will be imprinted in our memory. This "looping effect" can thus contradict the effect of event-rate on our perception. (Sound example 8.0).

So, broadly speaking, texture is sequence in which no order is perceived, whether or not order is intended or definable in any mathematical or notatable way. In some sense texture is an equivalent of noise in the spectral domain and, like noise, it has rich properties and also vague boundaries where it touches on more stable domains. Texture differs from a continuum in that we retain a sense that the sound event is composed of many discrete events, and it differs from noise, where the spectrum is changing so rapidly and undirectedly that we do not latch onto a particular spectral reference frame and hear only an average concept. Texture comes in many forms. Pure textural perception takes over from measured perception when we are aware only of persisting field properties of a musical stream and completely unaware of any ordering properties for the parameters in question.

Thus a texture may be perceived to be taking place over a whole-tone scale: we retain a HArmonic percept even though we do not retain the exact sequence of Hpitches. This is a field percept. Similarly, we may be able to distinguish French from Portuguese, or Chinese from Japanese, even if we do not speak these languages and hence do not latch onto significant sequences in the speech stream: the spectral characteristics of the phonemes, the consonant-types, syllabic-combinations or even the pitch-articulation types of one language form a field of properties different from those of another language. Even aspects of the time organisation may create a field percept. If the time-placement of events is disordered, so that we perceive only an indivisible agitation over a rhythmic stasis, but this placement is quantised over a very rapid pulse (e.g. events only occur on time-divisions at multiples of 1/30th of a second), we may still be aware of this regular time-grain to the texture. (Sound example 8.5).

Grain streams and sequences may begin to take on texture-stream characteristics by becoming arhythmic (through the scattering of onset-times away from any time reference-frame: see later in this chapter) and by becoming sequentially disordered. In fact any dense and complex sequence of musical events, originally having sequential, contrapuntal and other logics, can become a texture-stream if the complications of these procedures and the temporal density are pushed beyond certain perceptual limits. (Sound example 8.6). Spatialisation can also be a factor in the emergence of a texture-stream. If a rapid sequence has its elements distributed alternately between extreme left and right positions, the percept will most probably split into two separate sequences, spatially distinct and with lower event rates. If, however, the individual events are scattered randomly over the stereo space, we are more likely to create a stereo texture-stream percept, as the sense of sequential continuity is destroyed. (Sound example 8.7). A superimposition of two such scattered sequences will almost certainly merge into a texture-stream; even with a few superimpositions we will certainly generate one. (Sound example 8.8).

Continuous sounds which are evolving through time (pitch-glide, spectral change etc) may also change into texture-streams. Applying granular reconstruction (see Appendix p73) to such a continuous sound, with a grain-size large enough to retain perceptible internal structure, can produce a texture stream, provided that the grain time distribution is randomised and that the density is not extremely high (when the sound becomes continuous once again). (Sound example 8.9). Similarly, any shortish sound may be used as the basis for a texture-stream if used as an element in granular synthesis where the onset-time distribution is randomised and the density is high, but not extremely high. The properties of the sound (loudness trajectory, spectral brightness etc) may be varied from unit to unit to give a diversity of texture-stream elements. (Sound example 8.10).

GENERATING TEXTURE STREAMS

The most direct way to make a texture-stream is through a process that mixes the constituents in a way given by higher-order (density, element and property choice) variables. There are many ways to do this and we will describe just three.

Firstly, via texture control or texture generation procedures (see Appendix pp68-69), we can generate a texture for a specified duration from any number of sound-sources, giving information on the temporal density of events, the degree of randomisation of onset-time, onset quantisation (the smallest unit of the time-grid on which events must be located), individual event duration, type (or absence) of event, pitch-range (defined over the continuum or over a prespecified, possibly time-varying, HArmonic field), the range of loudness from which each event gets a specific loudness, and the spatial location and spatial spread of the resulting texture-stream. In addition, all of these parameters may vary (independently) through time. Alternatively we may begin with individually longer sound sources, specifying their timing and loudness through a mixing score (or a graphic mixing environment, though in high-density cases this can be less convenient) and use various mix shuffling meta-instruments to control timing, loudness and sound-source order (see Appendix pp68-69).

Secondly, textures can be generated from a MIDI keyboard (or other MIDI interface), using MIDI data for note-onset, note-off, key velocity and key-choice to control timing, transposition and loudness (or, equivalently, any texture parameters we wish to assign to the MIDI output), the textural elements being submitted as 'samples' (on a 'sampler'). The keyboard approach is intuitively direct but difficult to control with subtlety when high densities are involved.

Thirdly, we may use the sound-sources, untransposed or transposed, as sampled sound in a look-up table for submission to a table-reading instrument like CSound, driven from a CSound (textfile) score. The CSound score approach requires typing impossible amounts of data into a text file, but this can be overcome by using a meta-instrument which generates the CSound score (and 'orchestra') from higher-level data.

FIELD

The two fundamental properties of texture are Field and Density. Field refers to a grouping of different values which persists through time. (A more complete discussion of temporal perception can be found in the next chapter.)
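As a concrete (and much simplified) picture of such a texture-generation meta-instrument, the following Python sketch produces an event list from a handful of higher-order variables. The parameter names and output format are invented for illustration and do not correspond to any particular CDP or CSound tool.

    import random

    def texture_stream(duration, density, pitch_field=None, pitch_range=(48.0, 72.0),
                       loudness=(-18.0, -6.0), scatter=1.0, seed=None):
        """Generate (onset, pitch, dB, pan) events for a texture-stream.
        density is the mean number of events per second; scatter=0 keeps a
        regular pulse, scatter=1 randomises onsets fully within each slot.
        If pitch_field is given (e.g. a whole-tone set of MIDI notes) events
        are confined to it; otherwise pitch is drawn from the continuum."""
        rng = random.Random(seed)
        events, step, t = [], 1.0 / density, 0.0
        while t < duration:
            onset = t + rng.uniform(0, step) * scatter
            if pitch_field:
                pitch = rng.choice(pitch_field)
            else:
                pitch = rng.uniform(*pitch_range)
            events.append((round(onset, 4), round(pitch, 2),
                           round(rng.uniform(*loudness), 1), round(rng.random(), 2)))
            t += step
        return sorted(events)

    # e.g. a 5-second texture at 20 events per second over a whole-tone field
    print(texture_stream(5.0, 20, pitch_field=[60, 62, 64, 66, 68, 70])[:5])

In a real instrument each of these parameters would itself be allowed to vary through time, as described above; here they are fixed for clarity.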

Field properties themselves may also be evolving through time: via such texture-generation procedures the field properties of a texture may gradually change in a number of distinct ways. In general, if a property is perceived against a reference-frame (see Chapter 1), e.g. an Hpitch set for pitches, there are four fundamental ways to change that field property of a texture-stream:

(a) We may gradually move its value.
(b) We may gradually change the range of values.
(c) We may spread its value to create a range of values.
(d) We may shrink the range to a single value.

(See Diagram 1). Similar reference-frame variations can be applied to other properties which are perceived against a reference frame. For example, we may begin with a fixed pitch and gradually spread events over a slowly widening pitch-band (wedging: see Appendix p69, and the sketch below). Or we may reduce the event duration so that the texture becomes grittier (this also depends on the onset characteristics of the grains). We might begin with a fixed spectral type, defined mainly by enveloping (see Chapter 9) or by source class (e.g. pebbles on tiles, twig snaps, bricks into mud, water drips etc.), then spectrally interpolate (see Chapter 12) over time from one to another, or gradually expand the spectral range from just one of these to include all the others. We may begin with pitched spectra and change these gradually to inharmonic spectra of a particular type, to inharmonic spectra of a range of types, or to noisy spectra etc. One method of creating sound interpolation (see Chapter 12) is through the use of a texture whose elements are gradually changing in type. In Sound example 8.11 we hear a texture of vocal utterances which becomes very dense and rises in pitch; as it does so, its elements are spectrally-traced (see Appendix p25) so that the texture-stream becomes more pitched in spectral quality. (Sound example 8.11).

In terms of an Hpitch field in particular:

(a) We may change the Hpitch field by deletion, addition or substitution of field components. Addition and substitution imply a larger inclusive Hpitch field, and the process is equivalent to a HArmonic change over a scale system (e.g. the tempered scale) in homophonic tonal music. (See Diagram 2a).

(b) We may change the Hpitch field by gradually retuning the individual Hpitch values towards new values, in a direct or oscillating manner. This kind of gradual tuning shift can destroy the initial Hpitch percept and reveals the existence of the pitch continuum. (See Diagram 2b).

(c) We may change the Hpitch field by spreading the choice of possible values of the original Hpitches. Each Hpitch is now chosen from a range about a median value. Initially we will retain the sense of the original Hpitch field but eventually, as the range broadens, this will dissolve into the pitch continuum. (See Diagram 2c).

(d) We may gradually destroy the Hpitch field by gliding the pitches. Once the gliding is large, we lose the original Hpitch percept. (See Diagram 2d).

(Sound example 8.12).

DIAGRAM 1: [Sketches (a)-(d) of the four ways of changing a field property: gradually moving a value, changing a range, spreading a value into a range, and shrinking a range to a single value.]

DIAGRAM 2: [Sketches of the four Hpitch-field changes: (a) DELETION (or addition/substitution) of field components; (b) GRADUALLY RETUNE PITCHES towards new-field values; (c) GRADUALLY SPREAD the RANGE from which each pitch is selected; (d) GRADUALLY INTRODUCE PITCH-GLIDING and increase its range.]
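A minimal sketch of "wedging" as described above, assuming a list of event onset times and MIDI-style pitch values; the linear widening of the band is an arbitrary illustrative choice.

    import random

    def wedge(onsets, centre=60.0, max_spread=12.0, seed=None):
        """'Wedging' sketch: events start on a fixed pitch and are gradually
        spread over a widening band around it as time goes on."""
        rng = random.Random(seed)
        total = onsets[-1] or 1.0                  # avoid dividing by zero
        out = []
        for t in onsets:
            half_band = max_spread * (t / total)   # band half-width grows with time
            out.append((round(t, 3),
                        round(centre + rng.uniform(-half_band, half_band), 2)))
        return out

    onsets = [i * 0.1 for i in range(60)]
    print(wedge(onsets)[:3], wedge(onsets)[-3:])

The same shape run in reverse (shrinking the band back to a single value) gives method (d) of the generic list above; replacing the continuous band with a slowly retuned pitch set gives method (b) of the Hpitch list.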

We cannot hope to describe all possible controllable field parameters of all texture-streams because, to do so, we would need to describe all possible variations of each constituent, all changes of the individual properties and all changes in combinations of these properties. We will therefore offer just a few further suggestions, particularly for the case where the texture elements are themselves groups of smaller events:

(1) Variation of the sound-type of the constituents (harmonicity-inharmonicity, type of inharmonicity, formant-type, noisiness-clarity), the range of sound-types, and the temporal variation of these.
(2) Variation of the individual spatial motion of elements.
(3) The group size and its variation; the range of group sizes and its variation.
(4) The internal pitch-range and spectral-range (various) of the groups, and the time-variation of these.
(5) The internal group-speed, the group-speed range, and their slow or undulating variation.
(6) The internal spatialisation of groups (moving left, spatial oscillation, spatial randomness), the range of spatialisation types, and the stability or motion of any of these.
(7) The variation of the order-sequence or time-sequence of the groups.

All such features may be compositionally controlled independently of the stream density and the onset-time randomness of the texture-stream. (Sound example 8.13).

Finally we should note that changes in field and density properties might be coordinated between parameters, all varied independently, or even given nested "contradictory" properties. Thus a texture may consist of events whose onsets are entirely randomly distributed in time but whose event-elements are clearly rhythmically ordered within them. Or events may begin on the Hpitches of a clearly defined HArmonic field while the event-elements have Hpitches distributed randomly over the continuum. (Sound example 8.14).

DENSITY

The events in a texture-stream will also have a certain density of event-onset-separation which we cannot measure within perception, but which we can compare with alternative densities. In fact event-onset-separation-density perception is like temperature measurement in a material medium. Temperature is a function of the speed and randomness of motion of molecules, and can only be measured over a certain large collection of molecules: the temperature at a point, or of a single molecule, has no meaning. In the same way, density can only be measured over a group of event entries; as the elements of the stream are individually perceptible, density at the time-scale of the individual event repetitions has no meaning. Spatial variation in temperature, similarly, can only be measured over an even larger mass, and temporal variation in density likewise requires more events, more time, to be perceived than does density itself. Otherwise there is no way to distinguish density from density fluctuation.

We have this comparative perception of density changes so long as the changes occur in a time-frame sufficiently longer than that of density perception itself, providing a larger time-frame reference-frame. Thus we will be able to perceive increases and decreases in density, oscillations in density and abrupt changes in density. Density fluctuations themselves may be random in value but regular, or semi-regular, in time. Slow density fluctuations will tend to be perceived as directed flows; rapid density fluctuations, where the fluctuations themselves approach the size of large grains, create granular or "crackly" texture.

Increasing the density of a complex set of events, where the sound elements are of many spectral types including noise elements, can be used to average out its spectral properties and make it more amenable to interpolation with other sounds; before the advent of computer technology this was one of the few techniques available to achieve sound interpolation (see Chapter 12). In Sound example 8.15 (from Red Bird: 1977, a pre-digital work) a noisy-whistled voice is interpolated with a (modified) skylark song via this process of textural thickening. In Sound example 8.16 (from Vox 5: 1986) we hear the sound of a crowd emerging from a vocal-like noise band which is itself a very dense texture made from that crowd sound; the density then decreases once more to reveal the original human voice elements. In Sound example 8.17, behind the stream of complex vocal multiphonics, a dense texture of vocal sounds whites-out and the resulting noise-band is then filtered to give it spectral pitches while it slides up and down; the pitchedness is then gradually removed from all bands except one.

When this process is taken to its density extreme we can produce white-out. When a sound texture is made extremely dense we lose perceptual track of the individual elements, the sound becomes more and more like a continuum, and the texture goes over into noise. The visual analogy is walking on snow in blizzard conditions, when it is sometimes possible to become completely disoriented: if the snowfall is sufficiently heavy, all sense of the larger outlines of the landscape is lost in the welter of white snow particles. (Sound example 8.18).
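Density and density flux can also be specified directly. The Python sketch below draws event onsets whose local rate follows a slowly varying density function, so that the flux itself becomes a composable parameter; the exponential spacing is simply one convenient way to keep the stream texturally random, not a claim about how any existing instrument works.

    import random

    def onsets_from_density(density, duration, seed=None):
        """Generate random event onsets whose local rate (events per second)
        follows a slowly varying density function.  Gaps are drawn from an
        exponential distribution so the stream stays texturally random while
        the density flux remains audible as a directed flow."""
        rng = random.Random(seed)
        t, out = 0.0, []
        while t < duration:
            rate = max(density(t), 1e-6)
            t += rng.expovariate(rate)
            if t < duration:
                out.append(round(t, 4))
        return out

    # density rising from 5 to 80 events per second over 10 seconds
    # ('thickening' a texture towards white-out)
    ons = onsets_from_density(lambda t: 5 + 7.5 * t, 10.0)
    print(len(ons), ons[:5])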

We must now admit that the concept of density and density variation can be applied to any sound parameter. For example, if we have pitch confined within a given range, a pitch density value would tell us how densely the pitch-events cover the continuum of pitch values between the upper and lower limits of the range (not time-wise but pitch-wise). Once the pitch-density (in this new sense) becomes very high, we lose any sense of a specific Hpitch field or HArmonic reference frame, though we may continue to be aware of the range limits of the field. In this case we can begin to see that the concepts of Density and Field applied over the same parameter come into conflict.

CHAPTER 9

TIME

ABOUT TIME

In this chapter we will discuss various aspects of the organisation of time in sound composition. Clearly, rhythm is a major aspect of such a discussion, but we will not dwell on rhythmic organisation at length because this subject is already dealt with in great detail in the existing musical literature. We will be more concerned with the nature of rhythmic perception and its boundaries. Our discussion will enable us to extend the notion of perceptual time-frames to durations beyond that of the grain and upwards towards the time-scale of a whole piece.

WHERE DOES DENSITY PERCEPTION BEGIN?

In the previous chapter we discussed texture-streams which were temporally dense but which might retain field properties in other dimensions (like Hpitch or formant-type). In this chapter we will confine ourselves to the dimension of temporal organisation. Our conclusions may, however, be generalised to the field/density perceptual break in any other dimension (e.g. Hpitch organisation).

Compositionally, we can create sequences of events that gradually lose a (measured) sense of rhythm. Thus we may begin with a computer-quantised rhythmic sequence which has an "unnatural" or "mechanical" precision. Adding a very small amount of random scatter to the time-position of the events gives the rhythm a more "natural" or "human performed" feel. This is because very accurately performed rhythmic music is not "accurate" in a precisely measured sense, but contains subtle fluctuations from "exactness" which we regard as important and often essential to a proper rhythmic "feel". (Sound example 9.1). Increasing the random scatter a little further, we move into the area of loosely performed, or even badly performed, rhythm, and eventually the rhythm percept is lost: the time-sequence is arhythmic. (Sound example 9.2). Once this point is reached we perceive the event succession as having a certain density of event onsets. Our perception has changed from grasping rhythmic ordering as such to grasping only density and density fluctuations. We must therefore ask: what is the dividing line between field (or reference-frame) perception and density perception in any one dimension?
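A sketch of this progressive scattering, assuming a list of quantised onset times in seconds; the "amount" parameter (my own name) is the random displacement expressed as a fraction of the time-frame unit.

    import random

    def scatter_onsets(onsets, unit, amount, seed=None):
        """Displace quantised onsets by a random fraction of the time-frame
        unit.  amount around 0.02-0.05 gives a 'performed' feel; amounts
        approaching 0.5 effectively destroy the percept of the pulse."""
        rng = random.Random(seed)
        return [t + rng.uniform(-amount, amount) * unit for t in onsets]

    quavers = [i * 0.25 for i in range(16)]       # strict quavers at crotchet = 120
    print([round(t, 3) for t in scatter_onsets(quavers, 0.25, 0.03)])

Running the same function with increasing amounts (0.03, 0.1, 0.3, 0.5) reproduces the progression from "mechanical" through "human performed" to arhythmic described above.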

MEASURED, COMPARATIVE & TEXTURAL PERCEPTION

To make these distinctions completely clear we must examine the nature of time-order perception and define our terms more precisely. We must divide temporal perception into three types: the measured, the comparative and the textural (the perception of temporal density and density fluctuation). There are two dimensions to our task.

Firstly, there is the question of time-frames. In the ensuing discussion we will use the term "duration" to mean event-onset-separation-duration, and not the time-length of sounds themselves, simply to make things easier to read; bear in mind, however, that duration, here, means only this. A time-frame, constructed from our perception of event-onset-separation duration, provides a perceptual reference at a particular scale of temporal activity; in an idealised form it may also be thought of as a time-quantisation grid. As will be discussed below, time-frames may be nested within one another in the rhythmic or temporal organisation of a piece, so that at one level (e.g. at the time-frame of the quaver or semiquaver) order-sequences are put in motion (rhythm) while simultaneously a time-frame is established at the larger metrical unit (e.g. the bar) in which longer-duration order-sequences can be set up. Because such time-frames may be nested, we can in fact define a hierarchy of time-frames up to and including the duration of an entire work. This provides us with a way to extend the notion of perceptual time-frames used previously to define sample-level, grain-level and continuation-level perception (or the lack of it) into longer swathes of time. So we can ask: in what time-frames is our perception of a particular event measurable, and in what time-frames is it comparative or textural?

Secondly, there is the question of reference-frames. The first assertion we will make is that measured perception requires an established, or only slowly changing, reference-frame against which we can "measure" where we are. In the pitch domain this might be a tuning system, a mode or scale, or, in Western tonal music, a subset of the scale defining a particular key. Such a reference-frame might be established by cultural norms (e.g. tempered tuning) or within the context of a piece (e.g. the srutis of the Indian rag system and the initial statement of a raga; or the establishment of tempo and metrical grouping in a particular piece). This permits structural judgements like "this is an E-flat" or "we are in the key of C# minor". In the sphere of durations, if we establish a time reference-frame of, say, repeated quavers at [crotchet = 120], we can recognise with a fair degree of precision the various multiples (dotted crotchet, minim, etc) and integral divisions (semiquaver, demisemiquaver, triplet semiquaver) of this unit. (See Diagram 1). Our perception of duration is then measured, and rhythm is an ordering relation over such a reference frame. Similarly, in the sequence in Diagram 2 we will probably hear a clearcut sequence of duration values in the perceptibly measured ratios 1:2:3:4:5.

However, in the situation in Diagram 3 we are aware only of the comparative quality of the successive durations. We perceive them as getting relatively longer, and we also have an overall percept of "slowing down", but we do not (normally) have a clear percept of the exact proportions appertaining between the successive events. Moreover, no two live performances of this event will preserve the same set of measured proportions between the constituents. This is an example of comparative perception. Even without a reference set we are still able to make comparative judgements; in the pitch domain, for example, "we are moving downward", or "we are now higher than before", or "we are hovering around the central value".

Beginning again with a strictly rhythmic set of events, we note that the event-onsets lie on (or very close to) a perceivable time grid or reference-frame (the smallest common beat subdivision, which may also be thought of as a time-quantisation grid). Allowing event-onsets to be displaced randomly by very small amounts from this time reference-frame, we initially retain the percept of the reference frame and of a rhythm in the event stream. Once these excursions are large, however, our perception of the frame, and in consequence of rhythmicity, breaks down. In such a sequence of increasingly scattered versions we probably retain the sense of an underlying measured percept (which is being articulated by random scattering) a long way into the sequence: we are given a reference frame by the initial strict presentation, which we carry with us into the later examples. If we were presented with some of the later examples without hearing the reference frame, we would perhaps be more willing to declare them arhythmic. Taking the sequence in the opposite order, there may be a perceptual switching point at which we suddenly become aware of rhythmic order in the sequence.

It is informative to compare this with the analogous situation pertaining to an Hpitch reference frame. Here we would begin with events confined to an Hpitch set (a HArmonic field), then gradually randomise the tuning of notes from the Hpitch set, slowly destroying the Hpitch field characteristics. Once the mistunings are large we lose the Hpitch percept, even though we might retain the relative up-downness of the pitch sequencing. Dissolving rhythmicity is hence analogous to dissolving the percept of Hpitch, which also relates intrinsically to a reference set. (Sound example 9.3).

From this comparison we can see that a durational reference-frame underlying rhythmic perception is similar to a field. Strictly speaking, to provide a precise analogy with our use of HArmonic field, a duration field would be the set of all event-onset-separation durations used in a rhythmic sequence. However, just as underlying any HArmonic field we may be able to define a frame made up of the smallest common intervallic unit (e.g. the semitone for scales played in Western tempered tuning), it is more useful here to think of the smallest common subdivision of all the duration values in our rhythmic sequence. It is the dissolution of this time-frame which leads us from field-ordered (rhythmic) perception of temporal organisation to density perception. And just as dissolving the Hpitch percept by the randomisation of tuning leaves us with many comparatively perceived pitch properties to compose with, dissolving the time reference-frame leads us into the complex domain of event-onset-separation-density articulation discussed in the previous chapter.

SHORT-DURATION REFERENCE-FRAMES: RHYTHMIC DISSOLUTION

There are, however, limits to this time-frame switching phenomenon. If the pulse-frame becomes too small we lose a sense of measurability, and hence of measured perception; we cannot give a precise figure for this limit. We can perceive the regularity of grain down to the lower limit of grain perception, but comparative judgement of grain-durations, especially in more demanding proportions (5:7 as opposed to 1:2), seems to break down well above this limit. More significantly, if the pulse cycle becomes too long (e.g. 30 minutes), we will no longer perceive it as a time reference-frame at all (some musicians will dispute this: see below).

It is important to note at this point that our perception of traditional fully-scored music is comparative in many respects, and in some respects textural, to the extent that e.g. the precise morphology of each violin note cannot be specified in the notation and varies arbitrarily over a small range of possibilities, as does the vibrato of opera singers. Our rhythmic example using dotted rhythms (Diagram 4) illustrates this second aspect of our discussion. If this rhythm occurs in a context where there is a clear underlying semiquaver reference-frame, and the sequence is played "precisely" as written, we will perceive the 3:1:3:1:3:1 etc. sequence of duration proportions clearly: our perception will be measured. However, in many cases this time pattern is encountered where the main reference frame is the crotchet, and the pattern may be more loosely interpreted by the performer, veering towards 2:1:2:1 etc. The score may give an illusory rigour to a 3:1 definition, but we are concerned here with the percept. In this case we are perceiving a regular alternation of short and long durations which are not necessarily heard in any measurable proportion. A particular drummer, for example, may have an almost completely regular long-short division of the crotchet in the proportion 37:26. The way in which such crotchet beats are divided is one aspect of the sense of "swing" in certain styles of music. We may be aware of the regularity of his or her beat, and appreciative of the particular quality of this division in the sense of the particular swing it imparts to the playing, while remaining completely unaware of the exact numerical proportions involved. Here the division of the beat is raised only to the level of comparative perception, but it is a comparative perception with fundamental qualitative consequences. (Sound example 9.4).

We can see the same division into time-frames if we look again at the idea of the "indispensability factor" proposed by Clarence Barlow (see Chapter 7). Using these factors we can define the relative indispensability of each note in a 6/8, a 3/4 (or a 7/8) pattern; in the computer domain these factors can be precisely specified and, if desired, linked to the probability of occurrence of each note. This allows us to generate a strong 6/8 grouping feel, a strong 3/4 feel, or an ambiguous percept between the two. Once every quaver becomes equally probable, however, the sense of grouping breaks down altogether and, at the level of 6-groupings (7-groupings, or any groupings), measured perception is lost. Our perception reverts to the field characteristics: in a smaller time-frame, the field is defined by a smaller set of durations, the quavers themselves. Simultaneously, at the time-frame of the quaver, we retain a sense of measured regularity, and hence our perception there remains measured. So at the larger time-frame our perception has become comparative, or textural, while, provided perception at the level of the quaver pulse (or the crotchet) remains measured, the music is still "in time". (Sound example 9.5).

DIAGRAMS 1-4: [Notated duration sequences at crotchet = 120: (1) a reference frame of repeated quavers with multiples and divisions of the unit; (2) a sequence of durations in the measured ratios 1:2:3:4:5; (3) a "molto rit." sequence of gradually lengthening durations perceived only comparatively; (4) a dotted (3:1) rhythm over a crotchet reference frame.]
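The comparative quality of such beat divisions is easy to tabulate. A small sketch, assuming a crotchet of half a second:

    def offbeat_position(beat_dur, long, short):
        """Onset of the off-beat within one beat for a given long:short division."""
        return beat_dur * long / (long + short)

    beat = 0.5                                    # a crotchet at crotchet = 120
    for long, short in [(1, 1), (2, 1), (3, 1), (37, 26)]:
        print(f"{long}:{short} -> off-beat at {offbeat_position(beat, long, short):.4f} s")

The 37:26 division places the off-beat about 0.294 s into the beat, between the "straight" 0.25 s and the triplet-swing 0.333 s: a difference we readily hear as a quality of swing without ever measuring it.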

This problem becomes particularly important where different divisions of a pulse are superimposed. To give an example: if we compose two duration streams in the proportion 2:3 (see Diagram 5), we have a clear mutual time-frame pulse at the larger time-frame of the crotchet. (Sound example 9.6). If we now regroup the elements in each stream in a way which contradicts this mutual pulse (see Diagram 6), we may still be able to perceptually integrate the streams (perceive their exact relationship) in a measured way over a smaller time-frame: by dividing the quavers of the slower stream into 3 and those of the faster stream into 2, we may discover (i.e. perceive) a common pulse. Even here this perception does not necessarily happen (certainly not for all listeners), particularly where we are relying on the accuracy of performers. (Sound example 9.7).

The more irrational (in the mathematical sense) the tempo relationship between the two streams, the shorter this smaller mutual time-frame pulse becomes, and the more demanding we must be about performance accuracy if we are to hear this smaller frame. With 3 superimposed tempi the problem is compounded (see Diagram 7): the common pulse unit becomes even shorter, as in Diagrams 8 and 9, where the smaller common time-frame unit has a duration of c. 1/100th of a second. With computer precision this underlying mutual pulse may continue to be apparent in situations where it would be lost in live performance. (Sound example 9.8).

We may, of course, when composing on paper, set up sequences of events in which simultaneous divisions of the crotchet beat into (say) 7, 11 and 13 at [crotchet = 120] are used. And we may always claim that we are setting up an exact percept created by the exact notational device used. But we are concerned here with the percept and, in this particular case, we should ask with what accuracy this concept can be realised in practice. No human performance is strictly rhythmically regular (see the discussion of quantised rhythm above). We may describe the fluctuations of the performed lines from exact correlation with the common underlying pulse by some scattering factor, i.e. a measure of how much an event is displaced from its "true" position. A factor of 1 means it is displaced by a whole unit; in larger time-frame terms, a quaver in a sequence of quavers would be inaccurately placed by up to a whole quaver's length. By anyone's judgement this would have to be described as an inaccurate placement! (See Diagram 10). In fact I would declare the situation in which events are randomly scattered within a range reaching to half of the duration of the time-frame unit as definitively destroying the percept of that time-frame; in practice the time-frame percept probably breaks down with even more closely confined random displacements. Moreover, when three streams are laid together the deviations from regularity will not flow in parallel (they will not synchronise like the parallel micro-articulations of the partials in a single sound-source: see Chapter 2). (Diagram 11). Measured perception on the smaller mutual time-frame then breaks down, though we may be comparatively aware (if the density of events-relative-to-the-stream-tempo can be compared in the two streams) that we have two streams of similar but different event density. Perceiving a specific proportion such as 11 against 13, as in the last example, is simply impossible, even given computer precision in performance.

DIAGRAMS 5-7: [Notated examples: (5) two duration streams in the tempo proportion 2:3 sharing a crotchet pulse; (6) the same streams regrouped to contradict the mutual pulse, with the smaller common pulse shown; (7) three superimposed tempi and their still smaller common time-frame unit.]
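The duration of the smaller mutual time-frame unit can be computed directly from the superimposed divisions (a sketch; the particular division sets are chosen for illustration):

    from math import gcd

    def mutual_unit(beat_dur, divisions):
        """Duration of the smallest common time-frame unit when several
        divisions of the same beat are superimposed (e.g. 7 against 5)."""
        lcm = 1
        for d in divisions:
            lcm = lcm * d // gcd(lcm, d)
        return beat_dur / lcm

    beat = 0.5                                    # crotchet at crotchet = 120
    for divs in [(2, 3), (7, 5), (7, 5, 4), (7, 11, 13)]:
        print(divs, round(mutual_unit(beat, divs) * 1000, 3), "ms")

For 2 against 3 the common unit is a comfortable 83 ms; for 7 against 11 against 13 it shrinks to about half a millisecond, which is the basis of the argument in the next paragraphs.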

DIAGRAMS 8-12: [Notated illustrations of mutual pulses and scattering: an original regular pulse and its smaller common time-frame unit (tfu); part of the pattern placed on the smaller common pulse; events displaced by fractions of the time-frame unit; and one possible result of maximal scattering ("sequence failure").]

In our notated example the mutual time-frame duration lasts approximately 0.0005 seconds, or half a millisecond, and there is no doubt that the live-performed events will be displaced by multiples of this time-frame unit, and by at least half of it (half of half a millisecond). In fact we can be fairly certain that even the order of these events will not be preserved from performance to performance. The 7th unit in a 13-in-the-time-of-one grouping, and the 6th unit in an 11-in-the-time-of-one grouping, at [crotchet = 120], are only 1/143rd of a crotchet, or 1/286th of a second (less than 4 milliseconds), apart. It only needs one of these units to be misplaced by 3 or more milliseconds in one direction, and the other by 3 or more milliseconds in the opposite direction, for the order of the two events to be reversed in live performance! (See Diagram 12). Hence, in writing notes on paper here, we are not composing a sequence intrinsically tied to an "exact" notation: the precision of the result is simply impossible, and an appeal to future idealised performances mere sophistry. The exact notation in fact specifies a class of sequences with similar properties within a fairly definable range. We could equally describe this class of sequences by a time-varying density function with a specified (possibly variable) randomisation of relative time-positions: we are, in some senses, specifying a density flow within the limitations of a certain range of random fluctuations. In such complex cases the density approach may be more appropriate if we are really looking for the same class of percepts as in the notated case. In the computer domain, on the other hand, the exact notation is more practicable, so long as we acknowledge that it does not specify an exact result. (Sound example 9.9).

In a cyclically repeated pattern of superimposed "irrationally" related groupings we should be aware of the repetition of the bar length or larger-time-frame unit (which we are dividing), at least if it is repeated a sufficient number of times, and, with computer precision in generating the sound sequence, we may even be able to perceive a very precise "density flux" in the combination of these streams. (Diagram 13). We then hear in a measured way at the larger time-frame level, but only comparatively at smaller time-frames. Our comments about notated music are further reinforced if we now organise the material internally so as to contradict any mutual reinforcement at the larger pulse (e.g. bar) interfaces. (Diagram 14). In this way we can also destroy measured perception at the larger level, either because we change bar-lengths in a non-repeating way, or because we constantly undermine the bar-level mutual pattern reinforcement: larger time-frame reference-frames will not then be perceptually established, and we pass over exclusively to comparative or to textural perception. Eventually, however, if we repeat a sequence of (say) 4 bars, we may re-establish a measured perception of phrase regularity in a yet larger time-frame, the 4-bar frame. (Diagram 15).

DIAGRAMS 13-15: [Notated examples of superimposed groupings heard against a larger time-frame pulse; the same material regrouped to contradict the bar-level pulse; and a repeated 4-bar pattern re-establishing a larger time-frame reference.]
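The 4-millisecond figure quoted above can be checked directly (a small sketch, assuming a crotchet of half a second):

    def onset(beat_dur, division, k):
        """Onset time of the k-th unit (1-indexed) in an n-in-the-time-of-one
        division of a beat."""
        return beat_dur * (k - 1) / division

    beat = 0.5                                    # crotchet at crotchet = 120
    a = onset(beat, 13, 7)                        # 7th unit of a 13-division
    b = onset(beat, 11, 6)                        # 6th unit of an 11-division
    print(round(abs(a - b) * 1000, 3), "ms apart")   # about 3.5 ms, i.e. 1/286 s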

LARGE-DURATION REFERENCE FRAMES - THE NOT SO GOLDEN SECTION

At the other extreme, within the time-frames hemidemisemiquaver to 1-or-2-minutes, music can be constructed as a set of nested time-frames in which at one level (e.g. a semiquaver time-frame) order-sequences are put in motion (rhythm), while at the same time establishing a time-frame (e.g. semibreve bars) in which longer duration order-sequences can be set up. In this way, hierarchical systems of interlocked levels, functioning as order sequences in one direction and duration frames in the other, can be articulated. [Diagram 16 (Mozart): nested bar and phrase structure.]

There will, of course, be a limit to our ability to perceive a large time-frame as precisely measurable and hence comparable with another event of comparable time-frame proportions, i.e. a timescale limit to measured perception. More dramatically, on much larger time-frames, we run into the problem of both longer-term time memory and the influence of smaller-scale event patterns on our sense of the passage of time (waiting ten minutes for this train seemed an interminable time; reading the newspaper, an hour has gone by without my noticing). On the relatively small time-frame of "swing" (see earlier discussion), subtle but comparative perception of proportions may be very significant (just as there are very subtle parameters in the spectral character of grain which we cannot "hear out" but which are fundamental to our qualitative perception).

Though many composers writing musical scores would stake their reputations on the idea that we can hear and respond to precise proportions in larger time-frames (especially those derived from the Fibonacci series and associated Golden Section), there is little evidence to support this. Such proportions certainly look nice in scores and provide convenient ways to subdivide larger time-scales. The exact proportions offered by the Fibonacci series have the particular advantage of being nestable at longer and longer time-frames. Due to the structure of the Fibonacci series, almost any proportion between 1 to 1/2 and 1 to 2/3 can be found within it, and the nearer these proportions are to the Golden Section, the more likely they are to occur in the series. This makes them attractive for work based on integral time units, e.g. quavers at the fundamental tempo of a piece. The question is, can we perceive these proportions in a measured sense, or do we perceive them in a comparative sense (1 to 1/2ish-2/3ish) while the exact measure merely makes the task of laying out the score possible? (Diagram 17).

If we consider the Golden Section itself (the limit of the Fibonacci series), this will nest perfectly with itself. [Diagram 17: Golden Section nesting, a/b = c/d = e/f = g/h, etc.] However, being an irrational quantity (in the mathematical sense: it cannot be expressed as the ratio of any pair of whole numbers), its parts are not integrally divisible by any number of fixed pulses. Hence it is intrinsically problematic to attempt to score in exact Golden Sections. There is, however, no such intrinsic problem in cutting tape-segments to such irrational proportions and hence, hypothetically, a perfect Golden Section nesting might be achieved in a studio composition. Whether we are measurably perceiving this handiwork is quite a different question. The question is, then, do we believe that we can measurably perceive these proportions (e.g. 34:21), or merely that we can comparatively perceive them all as lying within the range 1 to 1/2ish-2/3ish? (Diagram 18).

I am certain of two things. Firstly, that when psycho-acousticians run detailed tests, they will discover that these relations on larger time-frames are not heard exactly, if at all. Secondly, that when this is discovered, a large section of the musical community will reject the results. Number mysticism has a long and distinguished tradition in musical thought and the Golden Section is a well-defended tenet of a modern musical gnosticism. I am happy to accept that Fibonacci-derived ratios give a satisfactory comparative percept of 1 to 1/2ish-2/3ish (possibly even a slightly narrower range), but not that, at larger time-frames, there is any measured perception involved. Because the Golden Section is, by definition, an irrational quantity, there is no smaller time-frame over which it can be measurably perceived: this is part of the definition of a mathematically irrational quantity. Hence we cannot measurably perceive a Golden Section, though again we may comparatively perceive that it lies within the bounds of 1 to 1/2ish-2/3ish. In large time-frames it is certainly not possible to distinguish a Golden Section from a host of other roughly similar proportions lying within a range, and the longer the time-frame, the bigger this range of inexactness becomes.

If we demand computer precision, specifying exact time-placement, exact pitch-value, exact spectral contour etc., we will produce a precise, and precisely repeatable, result. The question here is, if there is any measured perception involved, how exact does it need to be, i.e. by how much can we alter the parameters before we notice any difference in what we perceive? Is the musical organisation focusing our attention on these exact timing aspects of the total gestalt, e.g. through cyclical repetition of the pattern in which timing parameters are slowly changing in a systematic manner? Or is the time-sequence within this event merely one of several parameters the listener is being asked to follow? Or is it an incidental result of other processes and not central to the way in which we scan the musical events for a sense of mutual connectedness and contrast? It is always possible to claim that everything is important. In particular, composers are especially prone to claiming that anything is an important constituent of what the listener hears if it was an important part of the method by which they arrived at the sounding result. I can devise a method for producing sound in which every wavecycle has at least one sample of value 327. This, by itself, will provide no perceived coherence within the sound world I create. To declare that it does is simply dishonest. The generating procedure and its systematicness are not, on their own, a guarantee of perceptual relatedness. Compositional virtue does not lie in the determinism (or even the describability) of the compositional method, but in the control of the perceived results and their perceptual connectedness. This is unfortunately not always the view handed down at academies of music composition. The composer must thus be able to monitor what is and what is not perceptibly related in a composition, independent of the knowledge he or she has of the generating procedure.

WHEN IS DENSITY CONTROL APPROPRIATE?

We must now ask the crucial question: when is composing with density parameters appropriate, and when, conversely, should we proceed in a deterministic way? This judgement has to be set in context. We need, through experience, a separate approach, based in perception, to define whether the sound materials generated are related and in what ways they are related, not by appealing to the method of generation. Hence, when composing sound directly with a computer, we may already know what kind of materials we want: for example, we know that the result must be characterised by slowly density-fluctuating arhythmicity. In this case it would be more direct and simpler to use a texture control process, refining its parameters until we get the result we desire. Defining a deterministic process to attain the same perceptual result will usually be much more difficult, both to specify and to systematically vary in some way coherent with perceptual variations. Conversely, deterministic procedures in which parameters are systematically varied may be used to generate ranges of sound materials for subsequent perceptual assessment; and we may, in the same spirit, use texture generation procedures, also with systematic variation of parameters. Again we must be clear that whether the result is rhythmic or arhythmic, density-perceived or field-perceived, is to be ascertained by listening to the results.

We may also combine the two approaches by employing stochastic processes in which the probabilities of proceeding from one set of states to the next are defined precisely, ensuring a high degree of control at one time-frame over a process that evolves unpredictably at smaller time-frames. As the transition probabilities can have any value up to unity (completely deterministic), we have here a way to pass seamlessly between pure density perception and total rhythmic determinism (or between the equivalent extremes for any other sonic parameter).
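A transition-probability table of the kind just described is easy to sketch. The following Python fragment is not drawn from the book; the states (here, durations in beats) and the probability values are invented purely for illustration. Raising any transition probability to 1.0 makes that step fully deterministic; intermediate values give the stochastic middle ground.

import random

def markov_sequence(trans, start, length, seed=0):
    # Walk a table of transition probabilities.  trans[s] maps each possible
    # successor state to its probability (probabilities for each s sum to 1).
    random.seed(seed)
    out, s = [start], start
    for _ in range(length - 1):
        r, acc = random.random(), 0.0
        for nxt, p in trans[s].items():
            acc += p
            if r <= acc:
                s = nxt
                break
        out.append(s)
    return out

# hypothetical duration states (in beats); push 0.7 up to 1.0 for total determinism
table = {0.25: {0.25: 0.7, 0.5: 0.3},
         0.5:  {0.25: 0.5, 0.5: 0.5}}
print(markov_sequence(table, 0.25, 16))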

in effect changes the local shape of the wave(s) and is hence a process of spectral manipulation rather than of loudness control (see Appendix pS3). as we cannot know how large a swing will be until it reaches its maximum excursion. a particularly musical device as most natural sounds have precisely the opposite loudness evolution (Sound example 10. speech) we produce an interesting dual percept. we need to usc a time.ljectory of a rhetorical speech. This suggests the sound originates from some vibrating medium which has been struck (or otherwise set in motion) and then left to resonate. sequential or textural sound by simply imposing a loudness trajectory. a sound may gradually become prominently corrugated (sec below). scraping or an electrical power supply 10 maintain them. But if we gradually expand the number of wavecycles or wavescts we cause to fall under our loudness-changing envelope. We do not wish to exaggerate the loudness variation from syllable to syllable. (Alternatively we might detect the loudness level at the smallest meaningful time-frame (around 0. 1be instlJlltaneOU$ loudness of a sound has no meaning (see below). We may create the illusion of energy flow with a continuous. The loudness trajectory may also be lengthened (envelope exll'llSioll) or shortened (envelope conlraction). through enveloping. Usually the level of the flow of energy into the system determines the loudness of the output and hence allows us to read Ihe fluauating causality of the sound. blowing. Exaggerating the fluctuations in the trajectory can then be used to force momentary silences between the grains (corrugalion). We therefore choose a relatively large time-frame over which the loudness of the signal is measured.CHAPTER 10 FINE-GRAINED ENVELOPE FOLLOWING ENERGY The result of the process of envelope following will depend to a great extent on how the loudness trajeCtory is analysed. In Sound example 10. 81 82 ..3). Such a trajectory may be derived directly from the activity of another sound (prerecorded or a live performer) (envelope jollowing).6. If loudness trajectory is tracked using a fine time-frame (small window) over a granular sound. or wavesets. These transitions arc nicely illustrated by wavesel enveloping. (Sound example 10.2). Conver. If a sound already has a noticeably varying loudness trajectory we may track this variation (envelope jollowing) and then perform various operations with or on this trajectory (envelope IransjOrtfUilioll). on it (enveloping) (Sound example 10. Beneath the duration of the wavccycle the instantaneous variation of the pressure determines the shape of the wave and hence the sound spectrum. The foregOing discussion illustrates ollce again the importance of time-frames in the perception and composition of sonic events. we will often be able to detect the individual grains. we are interested in deteCting and exaggerating the variations in loudness from word to word or even from phrase to phrase. we need to look at a certain small lime-snapshot of the sound (a time-frame defining a window size: see Appendix pS8). This both alters the quality of the sound and makes the grains more amenable to independent manipulation (Sound example 10. or envelope.005 seconds) and then average the result over an appropriate number of Ihese frames to get a coarser result). The amplitude of the signal describes the size of these swings in pressure. and any of the trajectory (envelope) transformations described here might be applied in a time-varying manner so that e.g.. 
Imposing loudness changes over one (or only a few) wavecycles. The instantaneous value (pressure) of the sound wave changes from positive (above the norm) to negative (below the norm) 10 create the experience we know as sound. the number of wavesets falling under a single loudness-changing envelope is progressively increased on each repetition. Following or manipUlating the loudness trajectory below the grain time-frame has little meaning.1). The size of this time-snapshot will affect what exactly we are reading.5). Thus if we wish to exaggerate the loudness tr. which are extremely strongly characterised by loudness articulation may thus be 'recapitulated' with an entirely different sonic substance (Sound example 10. ENERGY TRAJECTORIES Sounds in the real world which do nOl simply die away to nothing require some continuous energy input such as bowing. especially if the onset of the sound is reinforced. Even less do we wish to exaggerate the undulations of loudness Within a single rolled 'r'.g. Sequences of event. Thus we may exaggerate the loudness trajectory to heighten its energetic (or dramatic) evolution (expanding: see below) . Note that we must do this by envelope substifutioll and not simply by enveloping (see Appendix ppS9 & 61). With sounds which more easily betray their origins (e. extending or contracting the associated gesture (see for example the discussion of time-stretching textures in Chapter II).ely. Conversely. We want a ·coarse" reading of the loudness trajectory. With simple sustained sound this can completely alter the apparent physicality and causality of the source (see below). As already suggested we might transfer the trajectory 10 an entirely different sound (enveloping or envelope SubSlilulion). and. we may wish to track loudness changes more precisely.4). sequence. grain-stream. To assess the loudness of a sound at a particular moment. grain-stream.tcted (envelope transjanllalioll) and reapply it to the original sound. texture) and give it an exponentially decaying loudness. we may begin with a sustained sound (continuum. Enveloping is particularly useful for creating a loudness anacrusis (a crescendoing onto a key sonic event).window which covers whole wavecycles when determining the loudness trajectory of a sound. we pass from the spectral domain to the time-frame of grain-structure and eventually to that of independent events. We may also modify the loudness trajectory we have extr.
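The dependence of envelope following on window size can be made concrete with a short sketch. This is plain Python/NumPy, not one of the programs discussed in the book; the function names and the 0.05-second default window are illustrative only, and the signal is assumed to be a 1-D array of float samples.

import numpy as np

def envelope(signal, sr, window_s=0.05):
    # Loudness trajectory: RMS measured over successive windows.
    # window_s sets the time-frame: ~0.005 s tracks grain-level detail,
    # ~0.05-0.2 s gives the "coarse" word- or phrase-level reading.
    n = max(1, int(window_s * sr))
    pad = (-len(signal)) % n
    frames = np.pad(signal, (0, pad)).reshape(-1, n)
    return np.sqrt((frames ** 2).mean(axis=1))      # one value per window

def impose(signal, sr, env, window_s=0.05):
    # A rough form of envelope substitution: rescale each window of `signal`
    # so that its RMS follows `env`.  A more careful implementation would
    # interpolate the gain smoothly between windows.
    n = max(1, int(window_s * sr))
    own = envelope(signal, sr, window_s)
    m = min(len(own), len(env))
    gain = np.repeat(env[:m] / np.maximum(own[:m], 1e-9), n)
    k = min(len(signal), len(gain))
    return signal[:k] * gain[:k]

Extracting an envelope from a speech recording with a coarse window and imposing it, via impose(), on an entirely different sustained sound is the kind of envelope transfer described above.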

GATES AND TRIGGERS

We may apply more radical modifications to the loudness trajectory of a sound. We may cause a signal to cut out completely if it falls below a certain level. This procedure, known as gating, may be applied either when the level of a signal exceeds a threshold, or when it falls below it. Detecting the appropriate level at which to apply the gate is known as threshold detection. Gating, setting the threshold level quite low, can be used for elementary noise reduction.

In conventional recording-studio practice two very commonly used gating/triggering procedures are limiting and compressing. In a limiter, any sound above a certain threshold loudness is reduced in level to that threshold value. This ensures that e.g. a concert can be recorded at a high level with less risk of particularly loud peaks overloading the system. Compression works in a more sophisticated manner: above the threshold, the louder the sound, the more its loudness is reduced, producing a more subtle containment of the signal. It is also often used in popular music to enhance the impact of percussive sounds. An expander, in contrast, does the opposite, expanding the level of a sound the more, the louder it is, hence exaggerating the contrasts in loudness in a sound source and thereby perhaps exaggerating the gestural energy of a sound event or sequence of events. This process may be used more generally or, conversely, to flatten or smooth the gestural trajectory of a sound, thereby e.g. "calming" a hyperactive sequence of events.

In certain cases we may wish to ensure the prominence of a particular sound or sounds without thereby sacrificing the loudness level of other sources. In popular music the device of ducking is used to this end. Here the general level of some of the other instruments is linked inversely to that of the voice. Before the singer begins, the rest of the band is recorded as loudly as possible but, once the voice enters, some instruments 'duck' in level so as not to mask the voice. This ensures that the voice is always clearly audible while at the same time a generally high recording level is maintained whether or not the voice is present. In the same spirit, a sound of rapidly varying loudness (like speech) may be made to interrupt a sound of more constant energy (a large crowd talking amongst itself, heavy traffic) by imposing the inverted loudness trajectory of the speech on the traffic noise (Sound example 10.7). This procedure will ensure that the traffic will not mask the voice, yet will remain prominent, perhaps alternating in perceptual importance with the voice (depending on how the relative maximum levels are set and on the intrinsic interest of the traffic sounds themselves!). More generally, we may mix any two sounds so that the loudness trajectory of the second is the inverse of the first (envelope inversion) (see Appendix p60).

We may also use threshold detection as a means to trigger other events. In the former case we may heighten the attack characteristics of particularly loud sounds by causing them to trigger a second sound to be mixed into the sonic stream. Or we may trigger quite different sounds to appear briefly and at low levels, so that these new sounds are partly masked by the original sound, making a kind of hidden appearance only (Sound example 10.8). Alternatively we might trigger a process affecting the source sound, such as reverberation or delay, or vary their values according to the loudness trajectory of the source signal. For example, quiet events might be made more strongly reverberant and louder events drier. Combining this with triggered spatial position control we might articulate a whole stereo-versus-reverb-depth space from the loudness trajectory of a single mono source (Sound example 10.9). We could switch such processes on and off rapidly, even within the course of grain-size events. This can give us very precise control over balance.
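The gating and ducking procedures just described reduce to a few lines of arithmetic on a loudness trajectory. The sketch below, in Python/NumPy, is my own illustration (not studio hardware or a CDP program); it assumes `env` is a windowed loudness trajectory of the kind produced by the envelope follower sketched earlier, and the `depth` value is arbitrary.

import numpy as np

def gate(sig, env, threshold):
    # Cut the signal wherever its window-level loudness falls below threshold.
    reps = len(sig) // len(env) + 1
    mask = np.repeat(env >= threshold, reps)[:len(sig)]
    return sig * mask

def duck(bed, voice_env, depth=0.8):
    # Ducking: lower the backing sound in inverse proportion to the voice's
    # loudness trajectory.  voice_env is assumed normalised to 0..1 and
    # resampled to one value per sample of `bed`.
    k = min(len(bed), len(voice_env))
    return bed[:k] * (1.0 - depth * voice_env[:k])

Imposing the inverted trajectory of a speech recording on traffic noise, as in the example above, is just duck() with the traffic as `bed` and the speech envelope as `voice_env`.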
BALANCE

In orchestral music, the balance of loudness between sound sources is achieved partly through the combination of performance practice with the notation of dynamics in a score, and partly through the medium of the conductor, who must interpret the composer's instructions for the acoustic of the particular performance space. In sound composition in general, the loudness balance between diverse sources is described in a (possibly graphic) mixing score, or created in real time at a mixing desk, perhaps with action information recorded for subsequent exact reproduction or detailed modification. Moreover, in orchestral music the blending or contrast of sounds is aided or hindered by the fact that all the sounds are (normally) generated in the same acoustic space. Mixing may also be used to consciously mask the features of one sound by another or, more usually, to consciously avoid this. The latter process is aided by distributing the different sounds over the stereo space. Balance changes within the grain time-frame are in fact a means of generating new sonic substances intermediate between the constituents (e.g. inbetweening: see Chapter 12).

PROXIMITY

Loudness information in the real world usually provides us with information about the proximity of a sound source. The same sound heard more quietly is usually further away. It is important to understand, however, that other factors feature in proximity perception, especially the presence or absence of high frequencies in the spectrum (such high frequencies tend to be lost as sound travels over greater distances). Even air temperature is important (on cold clear nights, sound waves are refracted downwards and hence sounds travel further). There are of course other aspects of the sound environment affecting proximity perception. The presence or absence of barriers (walls, doors to rooms or entrances to other containers) and the nature of reflective surfaces (hard stone, soft furnishings) contribute to a sense of ambient reverberation. These features of the sound landscape are discussed more fully in On Sonic Art.

In the studio we may bring together sounds from entirely different acoustic spaces (a forest, a living room) and with quite different proximity characteristics (close-miked, or miked at a distance such that room ambience is significantly incorporated into the sound). Furthermore, we may combine such features in ways which contradict our experience of natural acoustic environments. Thus proximately recorded loud sounds may be played back very quietly, while distant quiet sounds may be projected very loudly (Sound example 10.10).

But we may also produce self-contradictory images: e.g. closely miked whispering, which we associate with close-to-the-ear intimacy and hence 'quietness', may be projected very loudly, while bellowed commands may be given the acoustic of a small wooden box, and these may both be projected in the same space (Sound example 10.11). Using a combination of loudness reduction and low-pass filtering (to eliminate higher frequencies) we may make loud and closely recorded sounds appear distant.

Proximity may also be used dynamically, like a continuous zoom in cinematography. A microphone may be physically moved towards or away from a sound source during the course of a recording. The effect is particularly noticeable with complex spectra having lots of high frequency components (bells and other ringing metal sounds), where the approach of the microphone to the vibrating object picks out more and more high frequency detail. This aural zoom is hence a combination of loudness and spectral variation. It is important to understand that this effect takes place over a very small physical range (a few centimetres), in contrast to the typical range of the visual zoom.

PHYSICALITY AND CAUSALITY

As we have already suggested, loudness trajectory plays an important part in our attribution of both the physicality of a source (rigid, soft, loose aggregate etc.) and the causality of the excitation (struck, stroked etc.), and often simple onset-trajectory manipulation is sufficient to radically alter the perceived physicality/causality of the source. The loudness trajectory of the sound onset is particularly significant in this respect. Thus almost any sound can be given a struck-like attack by providing a sudden onset and perhaps an exponential decay. Conversely, a strongly percussive sound may be softened by very carefully 'shaving off' the attack to give a slightly more slowly rising trajectory, especially when contrasted with the original source (Sound example 10.12). This area is discussed in more detail in Chapter 4.
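The "loudness reduction plus low-pass filtering" recipe for pushing a sound into the distance, mentioned above, can be sketched very simply. This Python fragment is illustrative only (the cutoff and gain values are arbitrary choices, not values given in the book), and assumes float samples in the range -1 to 1.

import numpy as np

def one_pole_lowpass(sig, sr, cutoff_hz):
    # Simple one-pole low-pass filter: rolls off high-frequency detail.
    dt = 1.0 / sr
    alpha = dt / (dt + 1.0 / (2 * np.pi * cutoff_hz))
    out = np.empty_like(sig)
    y = 0.0
    for i, x in enumerate(sig):
        y += alpha * (x - y)
        out[i] = y
    return out

def push_into_distance(sig, sr, gain=0.2, cutoff_hz=2500.0):
    # Make a closely recorded sound seem further away: reduce its level and
    # remove the higher frequencies that would signal proximity.
    return gain * one_pole_lowpass(sig, sr, cutoff_hz)

Applying the opposite moves (raising level, leaving the spectrum bright, perhaps adding a touch of close detail) is one way to construct the self-contradictory proximity images described above.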

CHAPTER 11

TIME STRETCHING

TIME-STRETCHING & TIME-FRAMES

Time-stretching and time-shrinking warrant a separate chapter in this book because they are procedures which may breach the boundaries between perceptual time-frames. The importance of time-frames in our perception of sounds is discussed in detail in the section "Time-frames: samples, wavecycles, grains & continuation" in Chapter 1. There are several different approaches to time-stretching, and the approach we choose will depend both on the nature of the sound source and on the perceptual result we desire. Below we will look at tape-speed variation, brassage techniques, waveset repetition (waveset time-stretching), frequency-domain time-stretching (spectral time-stretching), and grain separation or grain duplication (granular time-stretching). We will then discuss various general aesthetic and perceptual issues relating to time-stretching before dealing with the most complex situation, the time-stretching of texture streams. Moreover, the degree of time-stretching of a sound may itself vary through time, and with the precision of the computer such time-varying time-stretching (time-warping) may be applied with great accuracy. With long time-stretches the perceptual connection between source sound and goal sound may be remote and may require the perception of mediating sounds (with less time-stretching) to make the connection apparent.

"TAPE-SPEED" VARIATION

In the classical tape-music studio, the only generally available way to time-stretch a sound was to change the speed at which the tape passed over the heads. The digital equivalent of this is to change the sampling frequency (in fact, interpolating new sample values amongst the existing ones, but storing the result at the standard sampling rate: see Appendix p37). This approach is used in sampling keyboards (1993) and in table-reading instruments, e.g. in CSound, and digital recording and transposition is also much more robust than the old analogue recording tape-speed variation. In both cases the procedure (tape-speed variation) not only changes the sound duration but also the pitch, because it alters the wavelengths (and therefore the frequency: see Appendix p4) in the time-domain signal. Furthermore, it changes the frequency of the partials and hence also shifts the spectral contour, and hence the formants (Appendix p36). Time-warping, pitch-warping and formant-warping are thus not independent here. And it time-stretches the onset characteristics, probably radically changing the sound percept in another way.

This approach has its musical applications. The transposition process itself often brings complex, very high frequency spectral information into a more perceptually accessible mid-frequency range (our hearing is most sensitive in this range). Downward transposition by one or two octaves not only reveals the details of a sound's evolving morphology in a slower time-frame; at the same time as magnifying the time-frame, it captures important, extremely high frequency information for re-listening in a moderate frequency range while preserving frequency information transposed down to very low frequencies. The internal qualities of a complex sound may thus be magnified in two complementary domains, making important details graspable (Sound example 11.1). Conversely, (multi-)octave (etc.) upward transpositions can be used as short time-frame reinforcements of a sound's onset characteristics (see Chapter 4) (Sound example 11.2).

Tape-speed variation may be applied in a time-varying manner: e.g. a sound may accelerate rapidly, rising pitch-wise into the stratosphere (using e.g. tape acceleration), or, conversely, a sound may be made to plunge into the lowest pitch range. As the sound rises, internal detail is lost. With sufficient acceleration almost any sound can be converted into a structureless rising pitch-portamento. This is an elementary way to perceptually link the most diverse sound materials, if they occur in long enough streams for such acceleration to be possible, or a way of creating musical continuity between a complex stream of diverse events and event structures focused on pitch portamenti (Sound example 11.3).

To time-stretch a sound without also transposing it, we must use other means. Good algorithms for doing this are currently (1994) embedded in commercially available hardware devices (often known as harmonisers) and function reasonably well, often in a continuously time-variable fashion over a range of half to two-times time-stretch. At the limits of this range and beyond we are beginning to hear spectral and other artefacts of the process (Sound example 11.4).

BRASSAGE TECHNIQUES

As discussed previously, brassage involves cutting a sound into successive, and possibly overlapping, segments and then reassembling these by resplicing them together (for pure time-stretching, in exactly the same order) but differently spaced in time (see Appendix p44). It can always be arranged, by appropriate choice of segment length and segment overlap, for the resulting sound to be continuous, provided the source sound is also continuous. Provided the cut segments are of short grain duration (i.e. with perceptible pitch and spectral properties but no pitch or spectral evolution over time), the goal sound will appear time-stretched relative to the source (Sound example 11.5).

Particularly in long time-stretches, brassage may lead to artefacts: (1) pitch artefacts, due to the interaction of rapid repetition or "delay", related to the event-separation rate of the segments; (2) granulation artefacts, where the individual grains are large enough to reveal a time-evolving structure; and (3) phasing artefacts, as successive segments are chosen from overlapping regions of the source. These may, however, be useful as sound-transformation techniques. Repeated application of brassage techniques to a source (in effect using "feedback") may entirely destroy the original characteristics of the source (Sound example 11.6). In contrast, spectral time-stretching (in the frequency domain) can be repeated non-destructively.
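A minimal brassage time-stretch can be sketched as follows. This Python/NumPy fragment is my own illustration of the principle (segment cutting, slower advance through the source, resplicing with a crossfade), not a CDP program; segment length, overlap and the Hanning splice window are arbitrary illustrative choices.

import numpy as np

def brassage_stretch(sig, sr, stretch=2.0, seg_s=0.05, overlap=0.5):
    # Cut short overlapping segments from the source, advance through the
    # source more slowly than through the output, and resplice with crossfades.
    seg = int(seg_s * sr)
    hop_out = int(seg * (1 - overlap))
    hop_in = int(hop_out / stretch)              # read the source more slowly
    fade = np.hanning(seg)                       # simple splice window
    out = np.zeros(int(len(sig) * stretch) + seg)
    pos_in = pos_out = 0
    while pos_in + seg < len(sig) and pos_out + seg < len(out):
        out[pos_out:pos_out + seg] += sig[pos_in:pos_in + seg] * fade
        pos_in += hop_in
        pos_out += hop_out
    return out

With seg_s at grain duration and a modest stretch the result sounds continuous; pushing seg_s larger, or randomising the read position, moves the process towards the granulation and phasing artefacts (and towards generalised brassage) described in the text.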

A more satisfactory application of the brassage process to the time-stretching of sequences such as speech streams can be achieved by source-segment-synchronous brassage. In this case we need to apply a sophisticated combination of envelope following and pitch-synchronous spectral analysis to isolate the individual segments of the source stream: use fine-grained envelope following and detailed spectral analysis to extract sequential features of the sound; deduce the pitch of each feature; and use that pitch to help define pitch-synchronous brassage segments of the feature. These can then be individually brassage-time-stretched and respliced together in one operation. Because none of the goal segments crosses over between source segments, we avoid the artefacts created at source-segment boundaries in simple brassage (see Diagram 1). (Sound example 11.7).

Brassage may be extended in a variety of ways: using larger and variable-length segments, varying the pitch, loudness and/or spatial position of successive segments, varying the time-range in the source from which the goal segment may be selected, and varying the output event density. We thus move gradually out of the field of time-stretching into that of generalised brassage and granular reconstruction. (These possibilities are discussed in more detail in Chapter 5 in the section "Constructed Continuation", and in Appendix pp44-45 and Appendix p73.) We need add only that, using Granular Reconstruction with time-varying parameters (average segment length, length spread, pitch, search range, spatial divergence and output density), it is possible to create, in a single event, a version of a source sound which begins as mere time-stretch and ends as a texture-stream development of that source. (Sound example 11.8).

WAVESET TIME-STRETCHING

Time-stretching can also be achieved by searching for zero-crossing pairs and repeating the wavesets thus found (see Appendix p55). With time-evolving and noisy signals, two zero-crossings do not necessarily correspond to a wavecycle (a true wavelength of the signal), so a waveset is not necessarily a wavecycle. As a result this technique will have some unpredictable, though often interesting, sonic consequences. These kinds of artefacts can of course be avoided, in truly pitched material, by using a pitch-following instrument to help us to distinguish the true wavecycles; and artefacts can often be reduced by repeating pairs (or larger groups) of wavesets (a special case of pitch-synchronous brassage). Note that this technique will produce only integral time-multiples of the source duration.

At x2 time-stretch, parts of the signal will sometimes pitch-shift (e.g. by an octave downwards), even in a steady tone or stable spectrum. (No such change occurs, as in tape-speed variation, in truly pitched material.) This spectral fission is heard 'subliminally' within the source even at x4 time-stretch. At x3 there is often a "phasing"-like coloration of the sound. As the number of repetitions increases, other artefacts begin to appear: at x16 time-stretch a rapid stream of pitched beads is produced as each waveset group achieves (near-)grain dimensions and is heard out in pitch and spectral terms.

It is possible also to interpolate between waveset durations and between waveset shapes through the sequence of repetitions. In x64 time-stretching with such interpolation the new signal clearly glides around in pitch as each "bead" pitch glides into the next. At x16 time-stretch, we are aware more of the "fluidity" of a bead-stream than of a continually portamentoing line. Even at x4 time-stretch, this fluidity contributes to the percept in an intangible way. (Sound example 11.9).

[Diagram 1 and Diagram 2: the effects of spectral time-stretching and of granular time-stretching.]
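The core of waveset repetition is easy to state in code. The sketch below (Python/NumPy, my own illustration rather than the CDP instrument) finds wavesets as the spans between alternate zero-crossings and repeats each one, giving the integral time-multiples and the artefacts described above; it assumes a signal that actually crosses zero reasonably often.

import numpy as np

def wavesets(sig):
    # Split a signal into wavesets: spans between alternate zero-crossings
    # (not necessarily true wavecycles).
    signs = np.signbit(sig)
    zx = np.where(signs[1:] != signs[:-1])[0] + 1   # all zero-crossings
    bounds = zx[::2]                                # every second crossing
    return [sig[a:b] for a, b in zip(bounds[:-1], bounds[1:])]

def waveset_stretch(sig, repeats=2):
    # Waveset time-stretching: repeat each waveset `repeats` times.
    ws = [w for w in wavesets(sig) if len(w)]
    if not ws:
        return sig.copy()
    return np.concatenate([np.tile(w, repeats) for w in ws])

Repeating pairs or larger groups of wavesets, as suggested above, is the same loop applied to concatenated groups rather than to single wavesets.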

GRANULAR TIME-STRETCHING

The time-stretching of grain-streams, or sequence-streams, is problematic. As we have seen, grain-streams are in effect a sequence of onsets, rather than musical "points" in their own right, and these onsets are the elements of potential musical phrases. If we time-stretch the onset of a sound we risk completely altering its perceived character. It is therefore useful to be able to time-vary a grain-stream by separating the grains and repositioning them in time (granular time-warping by grain separation), causing the sequence of grains to accelerate (by reducing separation time), ritardando, randomly scatter, and so on (Sound example 11.10). However, with even moderately large granular time-stretching of this sort, the stream character of a grain-stream breaks down in our perception: we hear only isolated staccato events.

We may notice an extension of the initial rise time and final decay time, but in this case these may well not be perceptually crucial. Conversely, if we wish to preserve the onset characteristics of a sound with marked (grain time-frame) onset characteristics, we must retain those characteristics by not time-stretching the onset of the signal. Ideally we would need to use an envelope-follower to uncover the loudness trajectory of the sound and thus locate all the onsets, and then apply a time-warping process that left the sound unaltered during every onset moment. This is feasible, but awkward to achieve successfully.

SPECTRAL TIME-STRETCHING

A sustained sound with stable spectrum and no distinctive onset characteristics may be analysed to produce a (windowed) frequency-domain representation. We can then resynthesize the sound, using each window to generate a longer duration (phase vocoder). The original window size for the analysis is chosen to be so brief in the time domain that the human ear perceives the small change from window to window (in fact a "step") as a smooth, continuous transition. The resultant sound appears longer but retains its pitch and spectral characteristics. The procedure ensures a perceptually continuous result even at x64 time-stretch (Sound example 11.11). Of all the techniques so far discussed, this is by far the best for pure time-stretching and works well up to about x8 time-stretching. Beyond this, the spectrum of each window is sustained long enough for us to become aware of the joins, and it too becomes less satisfactory as a pure time-stretching procedure. One way around this limitation is to stretch the source x8, reanalyse, and time-stretch by x8 once again, e.g. to reach a x64 time-stretch. However, successive applications of such very long spectral time-stretching may give rise to spectral transformations of their own. A more satisfactory approach is to interpolate between the existing windows to create new windows, intermediate in channel-frequency and channel-loudness values between the original windows but spaced at the original window time-interval (Sound example 11.12). By comparison, and except for this, waveset repetition is more useful as a special sound-transformation procedure, a type of constructive distortion: even x1.2 stretching does not significantly reduce the bead-streaming effect in long time dilations.

If the sound is spectrally varying, even our interpolated spectral time-stretch over a long stretch will be sufficiently slowed for us to hear out the spectral motion involved. We can reveal these changes by stripping away partials from the time-stretched sound until only the most prominent (say) twenty remain (spectral tracing: see Appendix p25 and Chapter 3). As the spectrum varies in time, partials will enter and leave this honoured set and we will hear out the new entries as 'revealed melodies'. If the time-stretch is itself not time-varying, these revealed melodies will be regularly pulsed at the time-separation of the original window duration multiplied by the time-stretch factor. Spectral time-warping will create a fluid (accel-rit) tempo of entries (Sound example 11.13). The ultimate extension of this process is spectral freezing, where the frequencies or loudnesses of the spectral components in a particular window are retained through the ensuing windows (Sound example 11.14).

With a sound originally perceived as noise, e.g. a sound with a rapidly changing spectrum, too long a resynthesis from individual windows means that the original source is not reconstructed: in general we will hear a resulting sound with a gliding inharmonic ("metallic") spectrum in place of our source noise. Zooming in to a maximal time-stretch at a particular point in a source will produce a particular inharmonic artefact in the goal sound; zooming in to a different point will produce a different inharmonic artefact in the new goal sound. Hence, when working on musically interesting sound sources, we can use this effect to produce spectrally diverse variants of a source, and thus a collection of related musical events deriving from the same source (provided enough of the resulting signals are elsewhere similar to each other) (Sound example 11.15).

Clearly, a very long spectral time-stretch of a sound's continuation, but not its onset, will radically alter the internal spectral variation of the result while retaining a perceptual connection between source and goal through the stability of the onset. Conversely, time-stretching the onset, e.g. of an attack-dispersion sound (piano, bell, pizz), will alter the sound percept radically (altering the causality), or alter the indivisible qualitative character of the onset so that it becomes a time-varying percept (see Chapter 1). There are many possible combinations of onset transformation and continuation transformation; the longer the continuation, the stronger the sense of relatedness. If we wish to preserve the characteristics of the source sound, we can overcome the onset problem by time-variably time-stretching (time-warping) the sound in such a way that the time-stretch is equal to 1.0 (i.e. no stretch) over the first few milliseconds of the source and thence increases as rapidly as we wish to any large value we desire (spectral time-warping) (Sound example 11.16).
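Whatever stretching engine is used, the time-warping idea itself reduces to a mapping between output time and source time obtained by integrating a time-varying stretch factor. The following Python sketch (illustrative names and values, not a program from the book) computes such a map; a stretch factor of 1.0 over the first few milliseconds leaves the onset untouched, exactly as described above.

def warp_map(stretch_at, duration_s, frame_s=0.01):
    # For each source frame, record where it should be placed in the output.
    # stretch_at(t) is the time-varying stretch factor at source time t.
    t_in, t_out, pairs = 0.0, 0.0, []
    while t_in < duration_s:
        pairs.append((t_out, t_in))
        t_out += frame_s * stretch_at(t_in)   # each source frame occupies more output time
        t_in += frame_s
    return pairs                              # (output_time, source_time) read points

# e.g. no stretch for the first 30 ms, then ramping towards x64 over the rest
m = warp_map(lambda t: 1.0 if t < 0.03 else min(64.0, 1 + (t - 0.03) * 200), 2.0)

A resynthesis stage (phase vocoder, brassage or granular) would then read the source at the listed source times and write at the listed output times.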

Grain-streams (and sequences) may also be time-extended by granular time-stretching by grain duplication: e.g. respace the original units at twice the initial separation and repeat each grain halfway in time between the new grains. This is a grain-dimension analogue of waveset repetition and suffers from the same drawbacks. If the grain elements are changing in character, grain repetition will be perceptually obvious, and the more repetitions of each grain, the more perceptually prominent it will be. (It might be obviated by a very narrow-range pitch-scattering of the grains to destroy the sense of patterning introduced by the grain repetition.) Note that in this way we can time-stretch the grain-stream by integral multiples of the original duration (and, by selected choice of grains to repeat, by any rational fraction), without thereby altering the (average) pulse-rate (density) of the stream (Sound example 11.19).

THE CONSEQUENCES OF TIME-WARPING: TIME-FRAMES

As we have discovered, time-warping is not a single, simple process. The musical implications of applying particular processes to particular sounds must be considered on their own merits. There are, however, a number of general perceptual considerations to be borne in mind.

Granular time-stretching by grain separation (in which the grains themselves are preserved while the intervening time-gaps are manipulated) will give us a clear percept of rhythmic variation (ritardando, accelerando, randomisation etc.) as the internal pulse of the grain-stream (sequence) changes. Applied to a grain-stream, such processes will usually create both a sense of rhythmic change and a sense of temporal extension. Ultimately, granular time-stretching by grain separation leads to a "slow" sequence of individual staccato events which may be re-sequenced in terms of onset-time (rhythm), pitch, loudness, or various spectral properties, or any combination of these, the individual grain (or sequence) elements having their own musical implications (Sound example 11.20). In contrast, spectral time-stretching (or brassage/harmoniser-type time-stretching) applied to continuous sounds is more akin to looking at time itself through a magnifying glass: the sound gets longer without any sense of the rhythmic slowing down of a pulse. Spectral time-stretching or harmoniser time-stretching of a grain-stream, if it stretches the grain constituents by a large amount, will reveal spectral qualities and evolving shapes (morphologies) within the grain (or sequence) elements, which qualitatively transforms those elements (Diagram 2). These each have quite different musical implications (Sound example 11.21).

The continuation of a sound, or part of a sound, itself spectrally time-stretched over a number of seconds (especially in complexly evolving sounds), may provide enough musical information for us to treat the extended sound as a phrase in its own right, and time-variable spectral time-stretching (spectral time-warping) allows us to produce many versions of such a phrase with different internal time emphases, just as we might produce variants in the Hpitch domain (Sound example 11.22). As we have seen, a sound with continuation will usually have distinctive onset characteristics of grain dimensions. When we time-stretch such a sound we may choose to preserve the time-frame of the onset, hence preserving a key perceptual feature of the source: we must use time-variable spectral time-stretching (spectral time-warping) (see above), or simply preserve only the beginning of the sound, or use the original grain as a superimposed onset for time-stretched versions of itself. Or we may time-stretch the onset too, by granular time-stretching by grain separation or by spectral time-stretching, producing, perhaps, a dramatic change in the percept, though the continuation of the two sounds may provide sufficient perceptual linkage between them. Once again, some kind of mediation (through sounds with less time-stretched onsets) may be necessary to establish the perceptual link between source and goal sound (Sound example 11.23).

Considering single grain time-frame sounds: the internal structure of the grain, perceived as an indivisible whole, a quality of the event, becomes, when time-stretched, a clearly time-varying property, a feature of the sound continuation. We may in this way uncover changing pitch, changing spectral type, changing formant placement, changing loudness undulations in any of these, or a specific granularity of sequencing of events within the original grain (Sound example 11.24). In many cases there may be no apparent perceptual connection between the quality of the grain and the revealed morphology of a much time-stretched version of that grain. It will need to be established through the structural mediation of sounds made by time-stretching the grain by much less. In particular, perceptual connectedness will be found to follow an exponential curve as we move away from the grain dimensions, i.e. initially very tiny spectral time-stretches will produce very significant perceptual relations but, as we proceed, much larger time-stretching will present less and less surprising perceptual information (Sound example 11.25).

TIME-SHRINKING

All the above processes may be applied to time-contraction, with noticeably different results. These each have quite different musical implications. Tape-speed variation time-contraction has already been discussed. In waveset time-shrinking, waveset duplication is replaced by waveset omission. We may choose to omit wavesets in a sequentially regular manner (e.g. every fourth waveset), in which case the sound becomes increasingly fuzzy; or we may choose to omit the least significant (lowest amplitude) wavesets, in which case the resulting sound will be considerably less time-variably spectrally detailed than the original, and quieter. In this process data will be irretrievably lost (Sound example 11.26).

Granular time-shrinking by grain separation involves the repositioning of grains by reducing intervening silence (where possible). In this case the sound retains its original loudness and loudness trajectory, but it is easy to destroy the continuity of the grain-stream percept. Granular time-shrinking by grain deletion also loses granular or sequential detail as the sound contracts, though in a different way to grain separation contraction, and grains are deleted from the stream (Sound example 11.27). Spectral time-shrinking can smoothly contract any sound, but also loses sequential detail: if grain-streams or sequences are spectrally time-shrunk, the individual elements will shrink, become less spectrally detailed and more click-like as their durations approach the lower grain time-frame boundary. These latter processes will tend to blur grains together, and the process of successive contraction and expansion will give similar results to the process of spectral blurring (see Chapter 3). Granular time-shrinking by grain deletion is not prone to this blurring effect. The first process tends towards silence; the latter eventually reduces any sound to a continuation-less but loud point sound, from which the original grain is not recoverable, even if the grain is now time-dilated once more. Spectral time-shrinking can thus be used to contract a sound with continuation into an indivisible grain. (A new grain might also be created by enveloping, or by overlaying or splicing together the existing grains.) (Sound example 11.28).
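The two granular operations that recur throughout this discussion, grain separation and grain duplication, can be sketched directly once the grains have been isolated (for instance by the envelope-following and threshold detection described in Chapter 10). The Python fragment below is illustrative only; it assumes `grains` is a list of arrays and `gaps_s` the original inter-grain silences in seconds.

import numpy as np

def stretch_by_grain_separation(grains, gaps_s, sr, factor=2.0):
    # Keep each grain intact and scale only the silences between them:
    # the grains are unaltered but the pulse slows (or, with factor < 1,
    # accelerates and the stream eventually blurs or breaks).
    out = []
    for g, gap in zip(grains, gaps_s):
        out.append(g)
        out.append(np.zeros(int(gap * factor * sr)))
    return np.concatenate(out)

def stretch_by_grain_duplication(grains, repeats=2):
    # Repeat each grain: the average density is preserved, but repetition
    # patterning may become audible, as noted above.
    return np.concatenate([g for g in grains for _ in range(repeats)])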

If grain-streams or sequences are spectrally time-shrunk, phrase structure may be compressed into grain-stream (a comprehensible spoken sentence can become a rapid-fire, spectrally irregular sequence of grains) and continuation compressed into the indivisible qualitative percept of grain. Throughout this section we have talked only of time-stretching, but the same arguments may be applied, in reverse, to time contraction, or to constructive distortion: e.g. waveset time-stretching produces pitched bead streams (Sound examples 11.7 & 11.8) and spectral time-stretching reveals window-stepping in spectrally traced sounds (Sound example 11.16). Spectral time-stretching of noisy sounds by large amounts may produce sounds with gliding inharmonic spectra (Sound example 11.11). Similarly, undulating continuation properties will become slow glides with quite different musical implications to their undulating sources, and we can alter these implications through time-warping (Sound example 11.29). Spectral time-warping of a continuous sound can also create a kind of forced continuation (see Chapter 5) in the spectral domain if applied to a complexly spectrally evolving sound (Sound example 11.30). In both cases a long time-stretched goal sound may not be perceptually relatable to the source and will require mediating sounds (e.g. less time-stretched versions of the source using the same technique) to establish a musical connection. Once again, if a perceptual connection between source and goal is required (to build musical structure), mediating sound types may be necessary.

The time-warping of rhythmically pulsed events provides us with an entirely new area for musical exploration. It is already possible to work with multiple fixed time-pulses (e.g. mutually synchronised click-tracks in different tempi, as in Vox 3), or with time-pulses mutually varying in a linear manner (as in the "phasing" pieces of Steve Reich, which generate new melodic, rhythmic or spectral percepts through finely controlled delay). But we might also produce mutual interaction between rhythmic streams which are themselves accelerating or decelerating, from time to time forcing the streams to pulse-synchronise in the same (possibly changing) tempo and in the same (or a displaced) pulse-grouping unit (bars coincide or not). Hence speed divergence and resynchronisation become elements in the emergence and merging of aural streams. The interaction of time-varying pulse-streams bears the same relation to fixed-tempo polyrhythm as controlling pitch-glide textures does to Hpitch-field pitch organisation. This is "counterstreaming", the time-fluid cousin of counterpoint. Note, however, that the sound constituents of two time-varying counter-streams do not have to be the same, or even similar (Sound example 11.31).

This device is used in Vox 5, where three copies of an ululating voice are slightly time-varied in different ways and made to move differently in space. They begin in synchronisation, in a single spatial location, and reconverge to a similar state at the section end by appropriate inversions of the speed variations (see Diagram 3). In fact, if the stream constituents are identical, are synchronised at some point and have slightly different time-stretch parameters, their interaction produces portamenti rising to or falling from the point of synchronisation (see Chapter 2 on Pitch Creation). In the previous case (Sound example 11.31) these precise-delay pitch-artefacts were avoided by carefully fading two of the three streams just before such artefacts would appear (Sound example 11.32).

[Diagram 3: speed divergence and reconvergence of the three streams.]

TIME-STRETCHING OF TEXTURE-STREAMS

We have left the discussion of texture-streams until last because it introduces further multi-dimensionality into our discussion of time-stretching. We have already encountered a two-dimensional situation with grain-streams: a grain-stream may be granular time-stretched by grain separation, or by grain duplication. The first process reduces the pulse-rate (or density) of the grain-stream; the latter does not. With texture-streams the situation is even more complex. The texture-stream is an example of a class of sounds with certain definable time-varying properties, just as a note, in traditional music, is a representative of a class of sounds with certain definable stable properties, in the sense in which we spoke of this in Chapter 1. The texture-stream percept is clearly not a "unique" sound-event. We may distinguish three distinct approaches to time-stretching a texture-stream.

In the first, we treat the texture-stream as an indivisible whole and time-stretch it. We may do this by spectral time-stretching, thereby stretching all the texture constituents. All the perceptible time-varying field properties of the texture-stream will thereby be time-stretched: e.g. the loudness trajectory, the pitch-range variation, the formant changes, the transitions to noisiness etc. would move more slowly. However, as well as the aforementioned artefacts, the revelation of the inner structure of grains may alter the field percepts (e.g. noise elements becoming inharmonic sounds, or Hpitches appearing as gliding pitches), very quickly producing a radical spectral transformation of the percept. This might better be regarded as a time-varying spectral-type/duration-type transformation of the source texture (Sound example 11.33). Waveset time-stretching will produce surprising and unpredictable artefacts when applied to texture-streams, as the zero-crossing analysis will confuse the contributions of various distinct grains to the overall signal. With a stereo texture, waveset duplication may be applied to each channel independently, producing arbitrary phase shifts between the channels (Sound example 11.34). Brassage, using regular segment size and zero search-range (see Appendix p44), will quickly destroy the unpatterned quality of the texture-stream, as brassage repetitions introduce a "spurious" order into the goal sound. Brassage will work better at preserving the inherent qualities of the texture-stream if we use a large enough segment size to capture the disorder of the texture-stream and a large enough range to avoid obvious repetition of materials. However, too large a range will begin to destroy any time-varying order in the field characteristics of the stream (e.g. directed change of the Hpitch field, loudness trajectory etc.) (Sound example 11.35).

The second approach to time-stretching a texture-stream would be granular time-stretching by grain separation, or by grain duplication (the textural equivalent of granular time-stretching by grain duplication). However, because of the mutual overlaying of grains (or larger constituents) in a texture-stream, there is usually no simple way we can achieve this. It can only be done, in general, by returning to the texture generation process and altering the event-onset density parameters. Any time-varying field properties (Hpitch-field change, pitch-band-width change, loudness trajectory, event-duration variation etc.) would also need to be similarly time-stretched in the generating instructions. Otherwise we may create a goal sound which appears less event-onset dense than the source sound but not perceptually time-stretched in any meaningful sense (Sound example 11.36).

This suggests the third approach to time-stretching. Assuming we have fine control of the texture generation process, we could separate out the field properties and time-stretch event-onset-separation-density variation, pitch-range variation, loudness trajectory, overall loudness trajectory, evolution of the spectral contour (formant evolution) etc., all independently of one another; or choose not to alter some of these features of the stream. For example, we may separate out the loudness trajectory by time-stretching a dynamically flat version of the texture and then reimposing the original loudness trajectory in exactly the same time-frame as in the source texture-stream. Or we may generate events at the original density and then impose time-dilated field-variation parameters on the stream. If we include the possibility of spectrally time-stretching the texture components prior to generating the texture-stream, we may imagine another option in which the texture constituents are spectrally time-stretched (this time-stretching itself changing from constituent to constituent as we proceed timewise through the texture) while, as far as is possible, the temporal evolution of density and field characteristics remains unchanged (Sound example 11.37).

We have hence given three quite different definitions of time-stretching a texture-stream. Time has thus become a multi-dimensional phenomenon within the sound percept, and we may choose amongst the many compositional options available to us (Sound example 11.38). We here begin to touch upon interesting music-philosophical ground. Composition focuses perception on what is being perceptually organised through time; or rather it does this so long as it is aware of what can be perceived and what will be perceived in the resulting musical stream. It is the musical context which focuses our attention upon particular properties, or groups of properties, of a sound, or on its holistic characteristics. We will discuss this parameter separation further below.
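The third approach, regenerating the texture with its time-varying generation parameters read on a dilated clock, can be illustrated with a very small sketch. This Python fragment handles only one field property (event-onset density) and uses an exponential (Poisson-like) spacing purely for illustration; it is not the texture-generation program described in this book.

import random

def texture_onsets(duration_s, density_at, stretch=1.0, seed=0):
    # Regenerate event onsets so that the time-varying density envelope is
    # read more slowly (by `stretch`), while each constituent event would
    # keep its original duration.
    random.seed(seed)
    t, onsets = 0.0, []
    while t < duration_s * stretch:
        d = max(density_at(t / stretch), 1e-6)   # read the density envelope on a dilated clock
        t += random.expovariate(d)               # random event spacing at that density
        onsets.append(t)
    return onsets

# e.g. a texture whose density rises from 5 to 30 events per second, time-stretched x2
ons = texture_onsets(8.0, lambda t: 5 + 3.125 * t, stretch=2.0)

A fuller version would treat pitch-range, loudness trajectory and formant evolution as further time-varying parameters, each of which may or may not be read on the dilated clock, exactly as the three definitions above suggest.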

MEDIATION: AMBIGUITY: CHANGE Before we go on to discuss compositional methods for achieving interpolation. 96 97 . is to achieve a percept wilich is capable of these dual interpretations without entirely lOSing 'SOurce credibility' (Le. to a sense of striking a hard inflexible medium. is it anything at all that we can recognise?). the goal is clearly not a voice. we also note that we can. Interpolation may pass directly from one recognition (voice) to another (bees) or it may seek out an ambiguous ground in wilich two recognition concepts contlict or cooperate within the same experience (the talking sea). voice or instrument. I will try to define tilis more clearly below. it is worth considering why we might want to do it. less source-recognisable) sounds. It is important to understand that in this case we'wish to acilieve some sense of perceptual jusioll between the two Original percepts . A. and then imposing a rapid series of hard-edged loudness trajectories on the resulting continuum. in a similar way. we may gradually spectrally stretch the harmonic spectrum of an instrumental tone until we rcacn. When compositional processes move sounds across these boundaries. even with the most advanced technology. This approach may be heard in Stockhausen's Gesaflg tier Junglinge where the 'pure' pitched Singing voice of a young ooy and pure (pitched) sine tones are mediated through a set of intennediate pitched sounds (sounds between 'boy' and 'sine tone').~ a metaphysical underpinning (a religiOUS conception of 'unity' in Ihc cosmos). where we are most undecided about whether what we hear is English or French. the goal is a violin: the source is the sea. What distinguishes interpolation from mere variation is our feeling that we have moved away from one type of sound and arrived at a different type. Interpolation takes place in two dimensions. a complex and inharmonic sound. The firsl approach is aimed at achieving some kind of mediation in sound between distinct sound types.CHAPTER 12 INTERPOLATlON aoWever. These are sequentially articulated according to an entirely different logic (a serialist sequencing aesthetic). aence. For tnc moment. is the recognition of source and goal sounds in the process. This approach also has its own metaphysical implications. a voice. the rdte of spectral cnange (and elsewhere to the irregularity-regularity of sequencing etc) alter our intuitive type-classifications of the sounds we hear. Technically the aim (and difficulty) here.g. or a flexed metal sheet sound. In these cases we have a process of dynamic interpolation taking place through the application of a continuously time-varying process. The mediation is not achieved through dynamiC interpolation (technologically almost impossible 3t the time). second approach to sound interpolation stres. or time warping or /iltering or randomisation of perceptual elements etc) we will produce a sound wnicn we recognise as being of a different type. through a continuous process. dense traffic) so that it eventually involves such extreme and rapid swings of pitcnthat the original percept is swallowed up by the process. Clearly. However. The source is a trumpet. IS RECOGNITION IMPORT ANT? A significant factor to deline. wilich also govems the rest of the musical organisation in the piece. e.a mere superimposition of one over the other is not acceptable as an interpolation. The composer focuses on the cusp of the interpolations. 
a percept of interpolation is possible without such clear referential clues where there is a distinct cnange in the sense of physicality or causality of the source (see Chapter I). create a wnole set of sounds. pitch-stable sound (e. ha. }-Ience modifications to onset characteristics. we will diseuss both static and dynamic interpolation between these sound. we move from the sense of forced continuation of an elastic medium. This is discussed more fully below. This can be particularly diflicult. wilich pervades much of StOCkhausen's musical thought. we may gradually apply a compositional process changing the value of various parameters through time. However.~es the ambiguous implications of the sounds thus created. we create the percept of interpolation. the goal is sjXlken French: or even the source is a singing voice. if of a more secular variety.. nor through a clear progressive movement from one sound-type to the other. by using a similar compositional process. but in the sense tnat the piece is grounded in a field of sound types wilich span the range 'boy's voice' to 'sine tone'. Alternatively. And. the goal is a voice: the source is spoken English. the same voice speaking the text in Frencn and the sound of brass instruments. There would seem to be at least three different motivations for musical interpolation and each motivation Icads to a different empnasis in the way the tecnnique is applied. This desire to mediate between the child's voice and a sct of more 'abstract' (I. Both physicality and causality have been altered. (Sound example 12. between English and French on the one nand and between voice and instrument on the otner. Roger Reynold's nas used interpolations between a voice speaking a Samuel Beckett text in English. in this chapter we will deal largely with compositional processes which interpolate between two (or more) pre-existing sounds. by greatly spectrally time-stretching WHAT IS INTERPOLATION? Any perceptually effective compositional process applied to a sound will produce a different sound.e. This percept is most readily understood wnen the source and goal sounds are recognisable in some referential sense. when discussing the idea of inlerpolalion. if the process is sufficiently radical (extreme spectral stretc/ling. Or we may gradually add vibrato [0 a relatively short tenn.1). provided our sound is sumciently long.g. we can apply this kind of reasoning to all compositional intervention and all compositional processes might be described as so many sopltisticalcd variants of interpolation. wnose properties are intermediate between toose of tne source and tnose of tne gOal. We will describe this mediation by progressive steps as static interpolation.
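A trivial sketch of such a gradually applied process, with a plain synthetic tone standing in for a real source (an assumption made purely for illustration): a vibrato whose depth is ramped from nothing to a wide pitch-swing over the length of the sound, so that a stable tone is slowly transformed into a pitch-glide percept.

    import numpy as np

    sr = 44100
    dur = 6.0
    t = np.arange(int(sr * dur)) / sr

    # Vibrato depth (in semitones) ramps from 0 to 7 across the sound,
    # gradually changing the type of the percept rather than merely varying it.
    depth = np.interp(t, [0.0, dur], [0.0, 7.0])
    vib = depth * np.sin(2 * np.pi * 6.0 * t)      # 6 Hz vibrato
    freq = 220.0 * 2 ** (vib / 12.0)               # time-varying frequency
    phase = 2 * np.pi * np.cumsum(freq) / sr
    tone = 0.2 * np.sin(phase)                     # the resulting sound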

or even betray the technical process involved. square wave to sine wave. we would not achieve a satisfactory set of intermediate sounds. If we began with the goal and source of this sequence and merely mixed them in various proportions. To aChieve a completely convincing sound intennediate betwecn two othern. or simply interleave windows as they are.6). INTERLEAVING AND TEXTURAL SUBSTITUTION A different approach to this problem of static interpolation is to. we must work willl sounds whose continuations are (almost) identical. we may impose the waveset (or wavecycle) shape of one sound on the waveset (or Wavecycle) length of the other. Successful interpolation hy mixing can only be achieved if we can defeat either. In-betweening may not be as skilful an occupation as originating the key frames but it is by no means elementary. Having produced such an analysis of our two different sounds we may interleave alternate windows from each sound. e. 98 99 . search nmge. With complex sources. We may also choose to interleave pairs (or larger groups) of windows. spatialisation of segments etc. not fusion. The process's artefact grill disguises the lack of trUe fusion of the sources . This is due partly to the ear's sensitivity to onset synchronicity (or the lack of it) and partly to the parallelism of micro-fluctuations of the components in anyone source. and a seamless transition from one to the other without any intervening artefacts which might suggest some other physicality/causality.TIle third approach focuses upon the process of change itself. 1be difficulties here might be compared with those in the process of in-betweening in animation. be more easily achieved through brassage where we also have control over the segment size and are not therefore tied to regular oscillations. In the latter case in pmticular. in the voice->bees example.2). we divide the sound into tiny time-slices. bUI With complex signals the process is likely to resull in radical mutual distortion. In Vox 5 the transformation voice->bees aims 10 achieve clear recognition of both source and goal. With a sufficiently large grouping of windows. Simple point-to-point linear interpolation will. In sounds with continuation. for example. Again. The process of spectral stretching. With simple sources. the goal and source sounds of the sequence "Ko->u· 10 "Bell" from Vox 5. using wavesets. produce a rapid regular oscillation between the two source sounds. and analyse the spectrum in each window (using e. When we produce a spectral analysis of a time-varying sound. in general. or wavescts. Moreover the dynamism of the change is a crucial parameter. Similarly. preserving the original time-frame. We hear mix.both are an integral palt of the resulting 'mince'. the voice almost melts slowly into the bee swarm. A similar percept can. One way to do this is to interleave analysis windows obtained from the spectral analySiS of two sounds. or with grain lime-frame sounds (which have no continuation). creating a goal sound as long as the sum of the original sounds' lengths (see Appendix). granulated or pitch-spread) in which the tWO (or more) sources 'peer through' the distorting grill of the process simultaneously. Here. successively applied.s (or computers) do the various in-between drawings such that. (Sound example 12. (Sound example 12.g. the fasl fourier transform. however. 
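For readers who wish to experiment, a minimal sketch of window interleaving follows (Python); it uses a bare FFT analysis and resynthesis rather than a true phase vocoder, and simply ignores the phase discontinuities between interleaved windows, so its artefacts will be cruder than those of the process described above.

    import numpy as np

    def stft(x, n=1024, hop=256):
        w = np.hanning(n)
        return np.array([np.fft.rfft(w * x[i:i + n])
                         for i in range(0, len(x) - n, hop)])

    def istft(frames, n=1024, hop=256):
        w = np.hanning(n)
        y = np.zeros(hop * len(frames) + n)
        for k, f in enumerate(frames):
            y[k * hop:k * hop + n] += w * np.fft.irfft(f, n)
        return y

    def interleave_windows(a, b, group=1):
        """Alternate groups of analysis windows from two sounds."""
        out = []
        for k in range(0, min(len(a), len(b)), group):
            out.extend(a[k:k + group])
            out.extend(b[k:k + group])
        return np.array(out)

A goal sound can then be made with istft(interleave_windows(stft(a), stft(b))), giving a result roughly as long as the two sources combined, as described above.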
INBETWEENING 'The most obvious way to achieve some kind of interpolation between two sounds would seem to be to mix them in appropriate (relative loudness) proportions. This procedure is the basis of the phase vocoder (see Appendix). (Sound example 12. we have mutual transformation. This becomes a mutual transformation process rather than a true interpolation (waveset interleaving). Here (be directing artist draws key frames for the film. two interleaved sinewavcs of different wavelengths will produce a sidcband whose wavelength is the sum of the onginais. we can achieve good intermediate percepts simply by mixing. at the same time being different from that in other sources. We can go one step further and altempt to integrate the wave-<:ycles. Two or more sources can be hrassaged together (fllIIlli-source brassage) with control of segment size. if not exactly a fusion of aural percept. and assistant. the rcsults arc once again complel( and perceptually unpredictable. when all are combined frame-by-frame on the film. of course. Our brain manages to unscramble the many sound impinging on our ears at any lime and to sort them into separate sources.7). of these.g. realistic movement will be created. just as in single source brassage. However. but not true interpolation (waveset transfer). suggesting a generative energy in the vocal source. This procedure can result in a kind of closely welded' mix'. this will. interleave the data from our two sounds. this process may be unsatisfactory. ConSider. this will force the pitch of the latter on the spectrum of the fonner. Musical context can playa vital role here.3). and Chapter I) in order to follow the temporal evolution of the spectrum. or both. (Sound example 12. the technical problems are those of seamlessness in the spectral transition and achieving the right dynamism. (Sound example 12. spitting or throwing out the goal sounds . as we know from our everyday experience this almost never creates perceptual fusion of the two sources. (Sound example 12.8).g.5). This approach p<~rhaps works best in producing a process-focussed transforlllation (see Chapter I) (e. The ideal case is one in which the source sound is transformed into the goal sound by a distortion process that retains the duralion and general shape of the source down to the levcl of the wavecycle or waveset (sec destructive distorticm in Chapter 3). in some sense. not work. Interleaving wave cycles or wavescts will produce spectrally unpredictahle sidehands. the very precise (to the sample) synchronisation of auack (onset synchronisation) can achieve an instantaneous fusion of the aural image which is however immediately contradicted by the continuation of the sounds in queslion.and this dynamism is enhanced by spatialisalion in the sense of spatial motion.4).9). Where the sounds have continuation. but not necessarily. pitch shift. (Sound example 12. of the two sounds. or windows. has gradually separated the spectral components (and hence altered the spectral Structure) to a point where they will not simply fuse together agllln by mixing. see Appendix pp2-3. other transformations in Vax 5 are quicker and more forceful. (Sound example 12. Superimpositions of these two sounds in various proportions will create convinCing intermediate states (inbetweening : Appendix p46). or the emergence of stereo images from a mono source. especially as interpolation recognition often requires a relatively long time-frame in order to work smoothiy.
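A sketch of inbetweening by simple mixing (Python, assuming two mono numpy arrays that have already been onset-synchronised to the sample); as the discussion above makes clear, this yields convincing intermediates only for simple, closely related sources.

    import numpy as np

    def inbetween_mixes(src, goal, steps=5):
        """Generate a graded set of intermediate sounds by mixing source and
        goal in varying proportions. Both sounds are assumed to be the same
        length and precisely onset-synchronised before mixing."""
        n = min(len(src), len(goal))
        return [(1 - p) * src[:n] + p * goal[:n]
                for p in np.linspace(0.0, 1.0, steps)]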

If we have a very rapid sequence or a texture-stream we may achieve a dynamic interpolation between two quite distinct states by careful element substitution. We might, for example, start with a texture-stream of vaguely pitched noise-granules spread over a wide pitchband and, through gradually tighter band-pass filtering of the elements themselves, focus the pitch of the granules while simultaneously narrowing the pitchband. In this way we can force a broad band noise granule stream onto a single pitch which we might animate with formant-glides and vibrato articulations reminiscent of the human voice (granular synthesis: see Appendix). This type of interpolation might also be applied progressively, so that 'voiceness' emerges out of a continuing non-vocal sound. (Sound example 12.10).

VOCODING AND SPECTRAL MASKING

The distribution of partials in a spectrum (spectral form) and the spectral contour (formants) are separable phenomena and impinge differently on our perception. As discussed in Chapter 3, the spectral form creates our sense of harmonicity-inharmonicity, while the spectral contour contributes to our formant, or 'vowel', perception. If we therefore have one source with a clear articulation of the formants (e.g. speech) and another source which lacks significant formant variation (e.g. a flute, or some other relatively stable sound such as the sea), we can impose the formant variation of the first on the spectral form of the second to create a dual percept (vocoding).

In speech analysis and synthesis, using a process known as linear predictive coding, spectral contour data is recovered and stored as data defining a set of time-varying filters. The speech can be reconstituted by driving a generating signal through these time-varying filters: a sequence of constant squarewave-type buzzes of appropriate pitches (for voiced vowels and consonants) and stable white noise (for noise consonants and unvoiced speech) played through the time-varying filters can be used to reconstitute the original speech. The process can also be used to reconstitute the speech at a different pitch (change the pitch of the buzz), or speed (change the rate of succession of the filters and the buzz/noise), or to change the voiced-unvoiced characteristics (choice of buzz or noise). (Similar vocoding procedures may, however, be attempted using spectral data extracted by phase vocoder analysis.)

For interpolation applications, the signal we send through the filters will be our second source for interpolation (i.e. the one that's not the voice!). For this reason the process is most often used to interpolate between a voice and a second source and is often called vocoding. It should not be confused with the phase vocoder. For obvious reasons, this process works best if the second source has a relatively flat spectral contour over the whole frequency spectrum. As the spectral contour defining the formants must have something on which to 'grip' on the second source, which will usually not have a flat spectrum, we can enhance the formant transfer process by 'whitening' the second source, e.g. by adding noise discreetly to the spectrum in frequency ranges where there is little energy in the source (see Diagram 1).

[Diagram 1: for each window, extract the spectral contour from the formant-varying sound (sound 1) and apply this spectral contour to the spectrum of sound 2.]

Making the sea 'talk' may thus be a sophisticated process of control of spectral contour evolution. We might then proceed to vocode a recording of our 'sea' construct: voices within voices. The inverse process, making 'talk' like the sea, can be envisioned much more simply: a very dense texture of unvoiced speech to which an appropriate wave-breaking-shape loudness-trajectory is applied (using a choral conductor?). If the spectral type of the sounds within the texture-stream is made to vary appropriately (e.g. lack of sibilants initially, 't' and 'sh' sibilants at the wave peak, 's' sibilants with high formants for the undertow) we can create a dual percept with no electronic technology whatsoever. (Sound example 12.11).
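The following sketch suggests one crude way to transfer a time-varying spectral contour in Python, working on FFT analysis frames rather than the linear predictive coding described above; dividing by the second source's own band energy is a rough form of the 'whitening' just mentioned. The frame format is assumed to be that of the stft sketch given earlier in this chapter.

    import numpy as np

    def vocode(voice_frames, other_frames, bands=32, floor=1e-6):
        """For each analysis window, measure the spectral contour of the
        formant-carrying sound in a few coarse bands and impose it on the
        corresponding window of the second sound."""
        out = []
        for V, O in zip(voice_frames, other_frames):
            edges = np.linspace(0, len(V), bands + 1, dtype=int)
            shaped = O.copy()
            for lo, hi in zip(edges[:-1], edges[1:]):
                contour = np.abs(V[lo:hi]).mean()        # contour of the voice
                level = np.abs(O[lo:hi]).mean() + floor  # energy already present
                shaped[lo:hi] *= contour / level         # gives the formants 'grip'
            out.append(shaped)
        return np.array(out)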

[n Vox 5 most of the voice->other interpolations start with a voice in mono at front centre stage and interpolate to a stereo image (crowd. however. This process is used extensively in Vox 5. This technique should. We can now apply a process of moving the amplitude (loudness) and frequency values in window N of sound I towards the values in window N of sound 2. Achieving a successful interpolation is about creating convincing. For this reason. a very high-register. in the transformation voice->bees. e. We are moving gradually away from the currelll value in sound I towards the currenJ value in sound 2. This type of spectral process can also be used for static interpolation.12). if our two (or more) source sounds have stably pitched speclra which are strongly pitch-related. wobbling . Due to the nature of the analysis procedure. or gets quieter. source recognition takes time and we need enough time for both source and goal to be recognised. (Sound example 12. to create perceptual 'false trails' which distract the ear's attention sufficiently at the point of maximum uncertainty for the transition to be accepted. In the voice->beeS lransition in Vox 5. '{bere are a number of refinements to the interpolation process. be regarded as a form of spectrally interactive mixing. More profoundly. alterations in the loudness balance of source sounds will produce interpolated spectral states. and we must choose these parameters in a way which is appropriate to the particular pair of sounds with which we arc working.13). Here we construct a goal sound from the spectra of two (or more) source sounds by selecting the loudest partial on a frequency-band by frequency-band basis for each time window. in general.step corresponds to the window duration.g. In Appendix p32 each sound is represented by a series of frequency domain analysis windows. need not lead to equal small perceptual changes. harmonicity) may have a dramatic perceptual result. same pitch. where the time. INTERPOLATION 1lle most satisfactory form of dynamic interpolation is achieved by interpolating progressively between the time-changing speCIra of two sources. the resulting window values will move progressively from being close to those in sound I 10 being close to those in sound 2 and Ule resulting sound will be heard to move gradually from the first percept (sound I) to the second (sound 2). First of all. would produce merely a spectral glide perceptually disconnected from both source sounds. we tend to perceive . It is important to understand that we are interpolating over the difference between the values in successive windows.Another way in which we can achieve interaCtion between the spectra of two or more sounds is via spectral masking. If we do this progressively so that in window N+I we can move a little fUl1her away from the values in sound I in window N+I and a little closer to those values in sound 2 in window N+l. in contrast. and not from the original values (back in window N) of sound I towards the ultimate values (onward in window N+K) of sound 2.g. it should be possible to produce a set of spectral intermediates between any tWo sounds _ with one word of caution! Equal small changes in spectral parameters. 'Dlere are also perceptual problems in creating truly seamless interpolation between two recognisable sources. (Sound example 12. In our process.3). small perceptual changes in the resulting sounds not about the intemal mathematical logic that produces them. as well as for the interpolation itself:o take place. 
in a sequence or tcxture-stream. of the other. SPECTRAL. A more difficult problem is created by categoric switching in perception. window by window. it may mask out the high frequency data in the other source(s) and the high frequency characteristics of the masked sources will be suddenly revealed if the first source pauses. creating a set of sounds spectrally intermediate between source and goaL If we also allow progressive time-stretching (so that duration of source sound and goal sound can be matched through interpolation) and we place no aesthetic restrictions (such as the goal of 'seamlessness' in the examples above) on the intervening sound-types. for each sound. With voices. It is also important for the two SOUrces to be perceptually similar (e. (see Appendix p33). or 'dramatise' the process of dynamic spectral interpolatioll. to achieve a truly seamless transition. bees etc) which itself often moves off over the listeners heads. If one of our sources has prominent high frequency partials. imposed reverberation can help to fuse a sound-percept by giving an impression of 'spatial integrity' (this sound was apparently produced at a single place in a given space). TIle information in these windows changes.no."a voice . (listen to Sound example 12. A movement from mono to true stereo can. To achieve an apprOldmately linear sense of interpolation along a set of sounds may require a highly non-linear sequence of processing parameter values. if one set of elements moves left and the other right. Our process. if the seamless transition is to be achieved without intervening artefacts which either suggest a third and unrelated physicality/causality in the sound. similarly noisy). rather than illleqlOlation in the true sense. the windows are in step-time synchronisation between the two sounds. then in windows N+2 etc. or even reveal the mechanics of the process of composition itself. Spatialisation and spatial motion hence reinforce the dynamism of the transition.a voice? . 101 102 . SPATIAL CONSIDERATIONS Spatial perception may also be an important factor in creating convincing static or dynamic interpolation. be used to enhance. is moving from the 'wobbling' of one spectrum into the .a voice??? . We may interpolate amplitude data and frequency data separately (at different times and/or over different timescales) and we may interpolate in a linear or non-linear fashion. a cross-fade does not produce an interpolation. low-level part of the spectrum undergoes a slide at the 'moment of maximal doubt' in a way which is not conSCiously registered but seems sufficient to undermi ne the sudden categoric switching which had not been overcome by other means. In the crudest sense. It may be necessary. The latter process. this interaction will be particularly potent. we experience aural stream dissociation. Hence aspects of the spectra of two (or more) sources may be played off against each other in a contrapuntal interaction of the spectral data. however.there is a sudden switch as we recognise the goal percept. In fact a slight deviation in spectral form (c. spatial strcaming will tend to separate a fused image.g. Mixillg sounds normally fails 10 fuse them as a single percept because the micfO-lluclUations within each sound are mutually synchronised and out of sync with those in the other sound. The proof of the mathematics is in the listening. (spectral ilUerpoiation). the interpolation tends to be perceptually smooth. we are effectively interpolating the micro-Iluctuations themselves. 
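A minimal sketch of spectral masking (Python, again assuming FFT analysis frames): channel by channel and window by window, whichever source has the louder partial is kept. The text above speaks of frequency bands rather than single analysis channels; grouping channels into bands is an easy extension.

    import numpy as np

    def spectral_mask(frames_a, frames_b):
        """Window by window, keep whichever source is louder in each
        analysis channel (a channel-by-channel 'loudest wins' choice)."""
        out = []
        for A, B in zip(frames_a, frames_b):
            keep_a = np.abs(A) >= np.abs(B)
            out.append(np.where(keep_a, A, B))
        return np.array(out)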
it's bees l " . However. (See Appendix). For this very reason. or other sources with variable formants. being a linear translation between two static spectral states.
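A compact sketch of the window-by-window interpolation described above (Python, FFT frames assumed). It interpolates magnitudes only and simply switches its phase source halfway, whereas the process described in the text interpolates amplitude and frequency data separately, possibly on different timescales and non-linearly; it is meant only to show that we move between the current windows of the two sounds, not between two fixed spectra.

    import numpy as np

    def spectral_interpolate(frames_1, frames_2, start, end):
        """Between window 'start' and window 'end' (start < end), move the
        magnitude values of sound 1 progressively towards those of sound 2
        in the same window."""
        n = min(len(frames_1), len(frames_2))
        out = np.array(frames_1[:n], dtype=complex)
        for k in range(n):
            w = np.clip((k - start) / float(end - start), 0.0, 1.0)
            mag = (1 - w) * np.abs(frames_1[k]) + w * np.abs(frames_2[k])
            ph = np.angle(frames_1[k]) if w < 0.5 else np.angle(frames_2[k])
            out[k] = mag * np.exp(1j * ph)
        return out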

from many different cultures (both ·serious· and 'popular") and historical periods. The aim of this chapter is to suggest extensions to traditional ways of thinking. This approach is appropriate both for grain time-frame events where separable classes of properties may not be distinguishable and for sound sets created by progressive interpolation between quite different sounds (e. TIle very tWo-dimensionality of the musical page reinforces the notion thaI only two significant parameters can be preCisely controlled using horizontal staves and vertical bars. etc. to the MIDI protocol) it is easy to view music as a two dimensional (Hpitch/duration) structure. a study of the World's languages reveals !he great range of sound materials that enter into everyday human sonic articulation somewhere on the planet. Sound composition involves rejecting this Simplistic hierMchisation. with each of these treated as distinct perceptible orderable parameters. We may also organise sounds in terms of their overall holistic qualities. MIDI technology (1994) makes this option so easy and other approaches SO roundabout !hat it is easy to give up at this stage and revert to purely traditional concepts for building large-scale fonn. vibrato acceleration type may be imperceptible when onsct-<1ensity is very high. or their pitCh and pitch-band. However. Sounds may be organised into multi-dimcnsional relation scts in terms of their Hpitch and Hpitch-field. On !he contrary a broad knowledge of ideas about melodic construction and evolution. HArmonic control. MULTIDIMENSIONALITY Given the priorities of notated Western Art Music. Not all listable parameters arc perceptually separJhle in every case. in the late Twentieth Century. will inevitably be limited. harOlonicily type. I do not wish to decry traditional approaches to musical form-building. a compositional practice confined to this. Furthennore. vibrato-acceleration-type etc.g. onset density. These may be used to complement or to replace traditional approaches depending on the sound context and !he aes!hetic aims of the composer. Similarly. their vowel set. e. Extensions which are grounded in sound-composition itself.CHAPTER 13 NEW APPROACHES TO FORM BEYOND SOUND-OBJECTS 1befe is still the danger of regarding sound-composition as a means to provide self-contained objects which are !henceforward to be controlled by an externai architecture of Hpitch and rhy!hm along traditional lines. etc. etc. should underpin compositional choice. large-scale musical fonns in general. This latter sound interpolation illustrates the fact !hat we can 103 . ·coloured-in" by sound. rhy!hmic organisation. and the fact that these priorities are constructed into the instrument technology (from the tempered keyboard.5). But in all definable sonic Situations we have many sound parameters at our disposal. "ko->u" to 'Bell": listen to Sound example 12. In particular. I would particularly streSS a detailed appreciation of singing styles and styles of declamation from around !he world's cultures as a way to appreciate subtleties of sonic architecture and chemistry which are often missing or excluded from Western art music practice.g. or keyed flute.

The voice has now been left behind. general rate and semi-regularity of sequencing etc. They do. onset-enveloped version of the tail occurs without the vocal initiator. All the changes to the sounds can be traced to structural properties of the sounds and the control of those structural properties. the wood-like events rise in pitch and density. health or age characteristics. TIle set of keys in the tonal system form a finite. The range of human intent. Moreover. With precise sound-compositional control of this multi-dimensional situation for any desired sound. filE STRUCTURAL FUNCTION OF INTERPOLATION It is instructive to examine the form of some sound composition phrases to illustrate this rnulti-dimensional approach. We may group properties into relal£d sets. bowing attack. In this new space of possibilities. we do not have to treat each parameter as a separate entity. questioning etc) which can be conveyed by multi-dimensional articulation of the sound space. Music transferred from vibraphones to shawms. Those which remain invisible or vague in the notation are out of reach of this rationalisable control. or link the way one property varies with the variation of another (e. is something we take for granted in everyday social interactions and in theatre contexts. After the first downward portamento we amve at a section based on variants of this time-stretched voice-tail. comparative and textural perception in Chapter 9). The sound examples here are taken from Tongues of Fire (1992-94). this is only a passing modulation. hysteria etc. I am also stressing the structural function of interpolation as a continuum-domain analogue of tonal modulation. Consider the many affective ways to deliver a text. The multi~imensional complexity of the sound world is I1JCned up to compositional control. ideological divide. even where we specify no Significant change in tempo or rhythm.1 we begin with a vocal sound whose tail is spectrally lime-stretched with a gentle trenW10 at its end. for example.). Here the sliding inharmonic sound falls or rises or is vibrato or tremolo articulated in numerous ways. change in the tonal system. thus underlining the sonic "key change" which has taken place. but the spectrally time-stretched tail extensions are linked through their (time-evolving) spectr. adding a cymbal-like presence to these onsets. must come to terms with intuition. discrete and cyclic set of possibilities (see Oil SOllie Art) and lbe tonal system is rooted in this underlying structure. We are already aware that reorchestrating Hpitched music can fundamentally change its affective character. From a mathematical viewpoint. In my description of this event I have tried to emphasise the structural importance of sonority hy comparing it to tonal structure. reason (or rationalisation). we can see that there is a significant and subtly articulahlc space awaiting musical exploration. We are in a new sonic "key'. by contrast. pitch and loudness are the principle articulating parameters. information. we can generalise the notion of 'modulation' (in the tonal sense). The section ends with yet another surprising sound-modulation as the vocal onset returns hut (out of a set of other transforms) goes over into a sound like random vocal grit. There are clearly important differences. Given this multi-dimensional space.~ vocally. but now the loudness trajectory of the tremolando cuts so deep that the originally continuous sound breaks into a succession of wood-like onsets. in a multi-dimensional space. 
pitch-glide speed and range. It is interrupted by a surprising shift back towards the voice-tail. Sound recording and computer coutrol destroy the basis for this dualistic view. vibrato speed with depth) and we may vary these linkages. a change in "expression" in the production of the sound stream (e. 'This leads us to a varied recapitulation of the falling portamento with tremolando idea. we do not need to create scales along any preferred axes of the space. we can move from what were (or appeared to he) all-or-nothing shifts in sound-type to a subtly articulated and possibly progressively time-varying "playing" of the sound space. 104 105 . the ritardandoing wood events firmly establishing a new sonic area. We switch back again into a section of variants on the time-stretched tail of the vocal sound. but •wood-like' . The Sonic Continuum is neither discrete nor cyclic ­ but. In sound Example 13. Each event in this segment begin. duration) are I1JCned up to "rational" control (or at least rationalised explication) by comjXlSers and commentators. and we still retain some sense of the 'distance' of one sound from another even if we cannot measure this in the way we can measure the distance around Ihe cycle of fifths (see also the discussion of measured. itself colored by a very high frequency version of the granular texture. We may then explore this multi-dimensional space establishing relationships between sounds (holistically or through different property sets) which are both perceptible and musically potent in some sense. it is wonderfully multi-dimensional. of course. 'This section ends with a true sonic modulation. accusation.g. As this event clears. we can create a stepped line in any direction. The prior context is that of voice sounds. no longer vocal. but. pre-cxisting physicaJ ur emotional condition (brealhlessness.). physiological. formant and formant-glide set. as an accelerating wood-event (twice) transforms into a drum-like attack. The distinction structure/expression arises from the arbitrary divide created by the limitation of notation. Voice sounds themselves are only recognisable as such through a complex interaction of properties (pilch-tessitura. establishing a high granular texture. in fact this octave stacked. in the past. noise types. However.create scales of perception betweet1 definable points without parameterising our experience beforehand. 'This sense of elsewhere is reinforced by a further pair of passing sonic modulations. This is a sonic modulation from voice to 'wood' and is akin to a key . Features of sound we have. been able to notate with some exactitude (Hpitch.u type. We are already familiar with such subtle articulations of a multi~imensional sound space within our everyday experience. or within instrumental practice. The distinction made between "strucrure' and "expression' is in fact an arbitrary. andlor vibrato continuation on violins). remain under the intuitive conlrol of Ule performer (see Chapter I) but this type of control comes to have a lower ontological starus in the semantics of music philosophy. textual meaning (irony. With precise sound-compositional control of the multi~imensional space.g.
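One way to make the idea of 'distance' in such a multi-dimensional property space concrete is to hold each sound as a vector of (suitably scaled) property values and compare weighted differences; the properties, values and weights below are invented purely for illustration, and perceptual distance need not be Euclidean at all.

    import numpy as np

    # Hypothetical property vectors: each axis is one perceptible parameter
    # (say Hpitch, onset density, vibrato rate, noisiness), pre-scaled to 0..1.
    sounds = {
        "A": np.array([0.2, 0.8, 0.1, 0.0]),
        "B": np.array([0.3, 0.7, 0.2, 0.1]),
        "C": np.array([0.9, 0.1, 0.8, 0.9]),
    }
    weights = np.array([1.0, 0.5, 0.5, 2.0])   # how much each property 'counts'

    def distance(p, q):
        return np.sqrt(np.sum(weights * (p - q) ** 2))

    # A and B lie in the same sonic area; C is a remote one.
    print(distance(sounds["A"], sounds["B"]), distance(sounds["A"], sounds["C"]))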

For the square root of 2. the rhythmic patterning is lost in a dense texture. FROM THE RATIONAL TO THE REAL As I have argued elsewhere (On Sonic Art).2 we begin with a sequence whose structure is rhythmically grounded (we hear varied repetitions of a rhythmic cell whose original material was vocal in origin: this fact is more apparent in the context of the whole piece than in the context of the example here) into which a resonant pitched event is inteljected. However as a number it cannot be expressed as the ratio of two other numbers. the two dimensional grid of staves and barlincs of the musical score reinforces a culturally received conception of the musical universe as lying on a grid. It is the very multi-<limensionality of the sonic continuum which in some sense forces us to call upon different perceptual foci as binding elements at different times. is even more instructive in this regard. In fact. This was proved by the Greeks and seemed hopelessly baffling. In Sound example 13. In Sound example 13. the falling portamento shape whose origin we hear when the spec/rally lime-s/re/ched voice peers through the texture. in the tempered scale. etc. cvcnt-<lurations as multiples or divisions of a regular unit (c. hence our perception of the continuity of experience can be recreated by piecing together Clttremely short. As it does so the texture while's 01(1 to produce a noise band which is subsequently jilterbank filtered to produce a multi-pitched inharmonic sound. however. And the square root of 2 is perhaps the most famous non-rational numher of aiL Pythagoras' discovery of the link between musical pitch relationships and the length ratios of vibrating strings spawned the pythagorean cult which linked mathematics. in terms of frequency. Sampling is a means whereby the continuum of our sonic experience can be captured and subjected to rational calculation. The rising tessitura of the texture is first perceived as a secondary propeny (an artiCUlation). 1be ratio of their frequellCies is exactly 3:2 or 1. Le. which begins to rise in lessitura. And it was not until the Nineteenth Century that a rigorous delinition of these new numbers.498307077.and devotees In this context. so Hpitch sets and countable rhythmic units are special cases of an underlying continuum of frequency and duration values matched by the continua of formant types. The square root of 2 is simply nOl a rational number (a number expressible as a fraction). is in the ear of the beholder. The obsession with geometrical proportions in written scores may be s<-'en as an aspect of the persistence of this tradition. in the Seventeenth Century. But just as the integers and the r. of various pre-tempered approaches to tuning (e. sound elements. it is not a rational number (in the mathematical sense). others. In the foreground the 'fireworks' are linked by a new device. Below a certain duration limit (as discussed in Chapter I) we can no longer distinguish indiVidual sound events. The grid hides from us the reality of the· musical continuum from which it is carved. Notation hence deals with the "rational" in the mathematical sense of the term. roughly corresponding to the mediaeval "devil in music". The key to this is the existence of perceptual time-frames. just intonation). However. where larger intervals must always be expressible as exact multiples of the semi tone (whose frequency ratio is 2 raised to the power of 1112). (the origin of the calculus) to study accelerated motion. or approximalely (.5. So. 
it was the Greek discovery of the non-rationality of the square root of 2 which destroyed Greek arithmetic and led to an almost complete reliance on geometrical methods. the length of the diagonal of a square of unit side.g.g. lbe numbers which form the continuum (and which include the rational numbers) are known as the Real Numbers.000 years later. I am not suggesting that one consciously composes in this way but that an intuitive sense of formal cohesion over an unhicrarchiscd set of sonic properties may be important in sound composition. or that which can be expressed in terms of ratio: finite sets of Hpilches and the associated intervals.3 we begin with a vocal texture increasing in density. The compuler thus provides a link between the world of raljonal 106 107 . and the structural focus partly passes over to the pitch domain. the reconstruction of a sound from a windowed analysis. For them the interval we know as the 5th derives from the second twO partials in a harmonic spectrum. But the computer as a recording and transformational tool has altered our relationship to the musical continuum. The tritone in the tempered scale. as both noise bands and inharmonic sounds glide up and down. duration. As the sequence proceeds the units are given spectral pilch through successive jillerballk filtering with increasing Q. the decelerating (and related accelerating) rhythmic patterns. the tempered scale 5th cannot be expressed as an exact ratio of any two whole numbers. the rational appears as only a small part (in fact an infinitely small part) of the Real! In a sense. For musicians accustomed to working in the Western tempered scale. We can also pick out other features of this sequence which help bind it together: the falling pilch shapes (and their rising inversion). inlerpolation provides the sense of logical passage from one sonic area to another (as opposed 10 mere juxtaposition) which we also feel during tonal modulation. in the mathematical world. or lattice. Musicians from other cultures. Western musical prJctice has remained overawed by the Pythagorean fixation with the mathematically rational. however would beg to differ from this analysis. What is significant is that many different dimensions of sound organisation can provide structural reference points. Rationality. that which can be counted. (It is certainly out of tune with the •pure' fifth). that mathematicians began to deal in a sophisticated way with the numerical continuum. but stable. the interval of a 5th is 2 raised to the power of 7/12. the very stuff of the continuum. For the tempered scale trHone corresponds to a precise frequency ratio of the-square-root-of-2: I. Just as the 24 or 25 static frames of film or video create for us the illusion of seamless motion.ltional numbers (fractions) are only special cases lying along the underlying continuum of the real numbers. spectral forms etc. or the playing of samples through a digital to analogue converter. All semi-tones are equal and the lifth is simply 7 steps up a 12 step scale. We no longer have the traditional hierarchy: Hpitch. particularly of Hpitch and of duration (but also instrument type). recreates for us a continuous sonic domain from what are fixed and countable values. of discreet values. is perfectly comprehensible geometrically. However. with Descartes' development of coordinate geometry (linking geometrical and numerical thinking) and Newton's use of fluxions. a crotchet). but the pilch-focus remains in the HArmonic field of this texture. It was only 2. 
Dut it soon hecomes the binding structural feature of the ensuing section. Once the 'fireworks' enter. mystical numerology and music. there is nothing more simply rational than the intervallic division of the octave in terms of the logarithm of frequency by which we have come to measure pitch-interval. was developed.
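The figures quoted above are easily checked (Python):

    # The tempered fifth is 7 steps of the 12-step octave; the tritone is 6 steps.
    fifth = 2 ** (7 / 12)        # 1.4983070768...  (the 'pure' fifth is 3/2 = 1.5)
    tritone = 2 ** (6 / 12)      # 1.4142135623...  = the square root of 2
    semitone = 2 ** (1 / 12)     # frequency ratio of one tempered semitone
    print(fifth, 3 / 2, tritone, 2 ** 0.5)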

ions on real. then. slowly. have allempted to extend nolational practice to do this. Almost all Western Art Music as it appears i1l lile nota/ioll is concerned with articulating sequential forms. is now recordable. of loudness) to another. it clearly docs not adhere to the notated lime information. The crescendo is an example of the most elementary morphic form. as expressed through traditional notalion. Appreciation of musical perfornlance. by aural tradition. ~ TIle invention of the orchestral crescendo by the Mannheim School of Symphonists seemed a major advance at the time. (poco a poco accel) or rapidly (molto acecl). if not well described. From a nOlation-focused perspective. the reality of human experience. which itself sets limits to the aceuracy required for our calculations. and the realities of rational calculation. Musics handed down. what before lay only within the intuitive control of physical action (which varies spati ally and temporally over the continuum) in performance practice. we both learn the necessity of numerical approximation and come up against the limits of human perceptual acuity DIAGRAM 1. 'The world of "pure" ratios in the sense of tuning systems. ability in this sphere. In a world of stable loudness fields (not laking into aceount the subtleties of loudness articulation in the performance practice) t1Us was a startling development. It is either happening at a "normal" rate (aceel). and subtle comprehension of the "innuendo" of spoken language.) Similarly. -========= Conventional crescendo~ Stepped crescendo. We will describe shapes articulated in the continuum as /twrpllic forms in contrast with forms created by juxtaposing fixed values. a linear motion from one state (in tlus case. II is interesting to note how "rubalo" interpretation. cannot be gi Yen any subtlelY of articulation. analysable. suggest we have a refined. to what extent can we distinguish and appreciate articulations of the continuum. SEQUENTIAL AND MORPHIC FORM An important question is. in dealing with the continuum mathematically on the computer. But traditional notation gives no means to add detail to t1Us simple state-interpolation. continuation and spectral formant properties can be observed in the vocal music practice of many non-European musical traditions. recalculable in a way not previously available to us. at least pru'lly. as well as witlUn jazz and in popular musics. we cannot notale t1Us kind of subtle tempo fluctuation. is seen for the idealisation that it is. We come face-to-face with the Real. It docs not get hived off into a separate domain of "performance practice" separated from stable-property not3tables and subsequently devalued. Nevertheless. can carry the morphic form information from one generation of performers to another. Docs Ihis mean therefore it must be abandoned as all element in musical construction? The subtle articulation of pitch. like Stockhausen. is generally frowned upon in serious music circles. numbers). But the facI is. (Moce recent notational developments include stepped cresccmli: see Diagram 1. or temporal proportions.mathematics and the continuum of our sonic experience (so-called "floating point" arithmetic is a rational approximation of oper4l. in the mathematical sense. tempo variation. Internal articulation of the rate of speed change is not describable (though some twentieth century composers. which are sequential fOn/IS. 108 .. 
I'erfonnancc practice may involve Ihe subtle or involved use of "rubato" (performance "feel") related tempo variation. It mllSt also be said that. repeatable. in works for solo performer). associated with musical romanticism.
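A sketch of this distinction in code terms: a morphic crescendo whose internal rate of change can itself be shaped, against the notated stepped crescendo (Python; loudness here is simply a 0 to 1 trajectory to be applied as an amplitude envelope, ignoring the non-linearity of perceived loudness).

    import numpy as np

    def crescendo(n, curve=1.0):
        """Loudness trajectory from quiet to loud over n points. curve < 1
        grows quickly then levels off; curve > 1 holds back then surges;
        curve = 1 is the plain linear crescendo of the notation."""
        t = np.linspace(0.0, 1.0, n)
        return t ** curve

    def stepped_crescendo(n, steps=6):
        """The notated 'stepped crescendo': a staircase of fixed loudness values."""
        t = np.linspace(0.0, 1.0, n)
        return np.floor(t * steps) / steps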

spectral energy focUS. the collision and mutual interaction of developing streams.4).With CQIIlputer control and the possibility of analytic understanding. Imagine also a theatre situation in which we see the quite separate evolution of two characters with entirely different. change of cyclic loudness variation (tremolo rate). a pitch-portamento which rises steeply before slowly settling on the goal pitch. Such a vast literature of musical analysis exists. (Soond example 13. 11le combined powers of the computer to record sound and to perform numerical analysis of the data to any required degree of precision immediately removes any technical difficulty here. 109 10 . The descrescendoing rising portamento which accelerates away from a Hpitch. lime-flows revealed in space.). may interact with one another. Even the calcified remains of sea creatures (e. density fluctuation etc. The theatrical outcome is dramatic and we.they are morphic forms captured in space. I. taking us from (perhaps) Hpitch lield percepts to pitCh-glide field percepts. but a set of instructions to generate this aetherial substance. 3. but in neither of these cases is ~ matenal In the separate streams undergomg real morphic change (10 Vox-5 the streams move in space relative to one another). a motion from state A to state B. form. or growth. harmonicity shift. 4. As morphic forming is developed over larger time-<iomains. however. 2. because of the existence of the text/performance separation in Western Art Music and the textual fixation of analysis and evaluation. These different motion types can be observed to have different perceptual qualities even within very short time-frames. just as the genetic code is a set of instructions to grow an organism. can be differentiated from a linear portamento and from one which leaves its initial pitch slowly before accelerating towards the goal. the shell of the Nautilus) tell us about the processes of continuous growth which formed them. new musical formal structures may develop. for example. In particular. change) unsettling and seck 10 codify eltperience. POSTLUDE: TIME & SPACE The forms of living organisms are the resull of continuous processes of flow (growth) . have an inkling of how this collision of personalities might tum out. this area will remain a closed book. but the way in which a flexible (and environmentally sensitive) eltpression of some branching principle is realised through a process of branching flow. In other cases. etc. are able Once we begin to combine morphic form over a number of parameters. The symmetries of the DNA molecule are not. it is intrinsically evanescent and a musical score is not the musiclil experience itself. granularity. Others see categoric divisions as temporary pens in which we attempt 10 corral the !low of experience and history.a "leading to". Stream interaction in an elementary sense can be heard in Steve ReiCh's pIlIISing p~ec:s and in the ululation section o. Morphic instability occasioned if the vibrato on the same note begins to vary arbitrarily in speed and depth over a noticeable range. morphic IJI lite sequential domain. both their lives are changed dramatically. a detailed control of several pitch-portamento lines over long time-frames. Some people lind all now (growth. even in the most elementary case. social order and even human relationships in strict categories. By contrast. After their meeting. in any sense.. (Sound example 13. However. 
a rich seam of development for any ~und composer who can grasp the concepts and mathematical controls of morphic flow processes. A rising portamenlO which slows as it approaches the final Hpitch and from which gradually emerges a widening vibrato (also making the percept louder) .. as compared to. internal pitch-stream motion etc. or a lexture-stream which undergoes continual morphic development of density. the morphic flow equivalent of counterpoint The way in which we divide experience into the spatially codifiable and the Ilowing governs much mon: Iban our attitude to music. What sllikes us as formally coherent about a tree is not some exact and slatic proportioning of branching points and branch angles. Simple examples would be . To those fixated by musical texts. A sense of Illorphic stability achieved on a sustained tone (with jitter) with or without a fairly stable vibrato depth and rate. We have the possibility of counterstreaming. but internally consistent. simultaneous morphic development of parameters may lead in contradictory directions. this is also stream-counterpoint. amongst human individuals. From a perceptual perspective..f Vox-5 (see ~hapter II). We have a simple example of morphic modulation. The overall state of a particular flow pattern (its result being the current state of the tree) lin~~ our perception of one member of a tree species with the other members of the same species. different musical streams. professional practice. whereas the characters are unprepared for this chance encounter. let alone describe and analyse. Unlike the growing tree.g. copies or analogues of the symmetries of the organism. we cannot assume that spatial order in a musical score corresponds to some similar sense of order in our musical experience of time-flow. lormant motion. or morphic anacrusis. we to distinguish articulations of the motion. Similarly. llIIagine. and which can diverge jnto perceptually separate streams.a common gesture in Western popular music . The possibilities in this domain remain almost entirely unexplored. Then external circumstances conspire 10 cause the two 10 meet briefly in a small room. it is difficult for music analytic thought to take on boanI morphic form. goals and desires. Human social life itself is an experience of interactive flow. pitch-band-location. From a broader perspective. there has been no means to notate. undergoing internal morphic change. before uniting sonically and spatially at a later time. as audience. grounded in sequential properties. a "leading-from" or dispersal morpheme. pilch-band width. a whole new field of study opens up. we may observe the emergence of elementary morphic categories.5) Sueh formal groupings can be transferred to other parameters (loudness variation. and any spatial proportions we may observe in the final (hrms are merely the traces of these continuous processes. we can imagine an extension of morphic control to larger and larger time-frames and to many levels and dimensions of musical experience. A tone whose vibrato-ratc gradually slows but whose vibrato depth simultaneously widens ceases to be perceived as a tonc event and becomes a pitch-glide structure. perhaps moving separately in space. For those willing to deal with precise numerical representations of sonic reality. For example. a musical event never fixes its time-nnw evolution in a spatial object.
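The portamento categories described above can be sketched as differently curved trajectories between two Hpitches (Python; the cubic curves are arbitrary choices, and any monotonic shaping function would serve):

    import numpy as np

    def portamento(start_hz, goal_hz, n, shape="settle"):
        """Pitch trajectory (in Hz) between two Hpitches. 'settle' rises
        steeply then slows onto the goal; 'launch' leaves the initial pitch
        slowly before accelerating; 'linear' glides evenly."""
        t = np.linspace(0.0, 1.0, n)
        if shape == "settle":
            s = 1 - (1 - t) ** 3
        elif shape == "launch":
            s = t ** 3
        else:
            s = t
        # interpolate in log-frequency so equal steps are equal intervals
        return start_hz * (goal_hz / start_hz) ** s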

In one view, reality is set in stone: the unchanging fixity of the text, or print; or, as with Plato, reality may be conceived as a hidden world of fixed Ideals to which everyday reality corresponds with varying degrees of success. In such a world the score (the text), the codification of the artist's method, or the compositional idea or method, may be viewed as a musical Ideal and its realisation in performance an adequate or poor reflection of this Ideal object. Similarly, some artists seek out exact rational proportions, golden sections or, more recently, fractality, in everything, in the exact proportions of architecture. Perhaps it may seem to reveal the hand of the ultimate designer in all things, from the religious to the scientific.

For artists like myself, however, reality is flow, growth, change and ultimate uncertainty. Texts, works, are our mortal attempts to hold this flow in our grasps so we can understand it and perhaps control it a little. Compositional methods are, like engineering methods, subject to test and ultimately to possible failure. For a certain time a procedure, a theory, an approach works, until we discover new facts, new symmetries, new symmetry breaking, and those texts have to be revised or rewritten, the appropriateness of our methods reassessed in the light of experience, the proportions of our architecture reconsidered. For me, sound-composition, in seeking the measure of the evanescent flow of sonic events, attempts to grapple with the very essence of human experience.

APPENDIX 1
(Page numbers refer to the diagrammatic Appendix 2.)

***** A *****

ACCELERANDO    An increase in the speed at which musical events succeed one another.

ACCUMULATION    Sound which 'gathers momentum' (increases in loudness and spectral content) as it proceeds.

ALL-PASS FILTER ..... 9

AMPLITUDE    A scientific measure of the strength of a sound-signal. This normally reveals itself in perception as the loudness of the sound, but amplitude and loudness are not equivalent. In particular, the sensitivity of the human ear varies from one frequency range to another, hence amplitude and perceived loudness are not identical. In this book the term Loudness is used in the text, instead of Amplitude, wherever this does not cause any confusion. Diagrams, however, usually refer to Amplitude.

ANALYSIS    In this book analysis almost always refers to Fourier Analysis, the process of converting the waveform (time-domain representation) of a sound into its spectrum (frequency-domain representation). Analysis of musical scores is the process of determining the essential structural features of a composition from its notated representation in a score. This notion might be generalised to include any process attempting to uncover the underlying structure of a piece of music.

ARTICULATION    The (conscious) changing/moving of a property of a sound, usually in a way bound by conventions (of language, musical practice etc) but often not appearing as such in any associated notation system.

ATTACK    The onset portion of a sound where it achieves its initial maximum loudness.

. .. BAR CHANGE RINGING A grouping of musical duration-units. 150 and 250. CHORD A set of pitches initiated and sounding at the same time.. between 50 and 150. a voice) sound like a group of similar sources all making the same sound (e. interpolating between the given values where necessary..e.. Usually a set of pitches within a known reference set (e. COMB-FILTERING 64 113 114 .) and the time at which that value is reached.... NOT the actual cause of the sound.3 1.7 2. Channel is also used to refer to the right hand and leli hand parts of a stereo sound (which can be viewed as two separate streams of digital information). In the simplest case sounds are selected in order. e. to determine what to do at a particular time in the source-sound.0 1....... and a musical program may read the data in the table.. CHORUSING .***** c ***** CADENCE ***** B ***** A recognised device in a musical language which signals the end of a musical phrase.. COMB-FILTER TRANSPOSiTION .. in each block of 10 cycles per second (i. Was the source apparently struck. shaken. BREAKPOINT TABLE A table which associates the value of some time-varying quantity (e....... BAND PASS FILTER BAND REJECT FILTER . and for many other musical applications.g . A process of making a single musical source (e... Channels should not be confused with WINDOWS. and respliced back together in order..7 2.... We derive the spectrum (frequency domain representation) of a sound from its waveform (time-{\omain representation) by a process of analysis. CHROMATIC SCALE The european scale consisting of all the semi tone divisions of the octave. between 5 and 15.g. 120 crotchets per minutes). The bar-length is measured in some basic musical unit (e. spectral stretch etc.. more discriminatingly. However there are many possible variations on the procedure.. Brassage may be used for changing the duration of a sound. where its pitch is known... loudness. 15 and 25.). bar length is regular for long stretches of time. pitch... These search blocks are the channels of the analysis.. 65 A quick method of achieving octave-upward transposition of a sound. crotchets.g..g.. 45 7 7 CAUSALITY The way in which the sound was apparently initialed..g.1 This table allows us to describe the time-varying trajectory of the quantity..e. In most musics. for evolving montage based on a sound-source. and bar length is one determinant of perceived rhythm...g. the European tempered scale).g.. 25 and 30 etc.. BRASSAGE A form of bell-ringing in which the order in which the bells are sounded is permuted in specific ways. See also GRANULAR RECONSTRUCTION. CHANNEL Channel is most often used in this book to refer to an analysis channel. section or piece. 250 and 350 etc) or. We may search for a partial in each block of 100 cycles per second (i.. rubbed. quavers) for which a speed (tempo) is given (e...2 2.. spun etc. In doing the analysis we must decide how accurate we would like to be. from the source. ?? 44-45 A procedure which chops a sound into a number of segments and then resplices these together tail to head.. (time) (value) 0.. a chorus of singers singing the same pitch).
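A breakpoint table of this kind can be read, with linear interpolation between the given points, in a few lines (Python; the times and values below are illustrative only):

    import numpy as np

    # A breakpoint table: (time, value) pairs describing the trajectory of
    # some time-varying quantity, interpolated linearly between the points.
    times = np.array([0.0, 0.5, 1.2, 3.0])
    values = np.array([0.0, 1.0, 0.3, 0.8])

    def read_table(t):
        """Return the interpolated value of the trajectory at time t."""
        return np.interp(t, times, values)

    print(read_table(0.25), read_table(2.0))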

COMPRESSION .......... 64
Reducing the loudness of a sound by greater amounts where the sound itself becomes louder. (See the code sketch at the end of this section.)

CONSTRUCTED CONTINUATION
The extension of a sound by some compositional process (e.g. brassage, zigzagging).

CONSTRUCTIVE DISTORTION
A process which generates musically interesting artefacts from the intrinsic properties of the waveform, or the (time-varying) spectrum, of a sound. See WAVESETS.

CONTINUATION
Where sounds are longer than grains, we hear how the sound qualities evolve in time (their morphology). These sounds have continuation.

CONTOUR
The shape of some property of a sound at one moment in time, e.g. the shape of the spectrum. The spectral contour describes the overall loudness contour of the spectrum at a single moment in time. This is often referred to as Spectral Envelope. But note that the spectral contour may itself evolve (change) through time. Envelope is also used to describe the time-changing evolution of a property (especially Loudness). To avoid any confusion, this book reserves Trajectory for such time-changing properties, and Contour for the instantaneous shape of a property. The names of computer instruments may however use the term 'Envelope'.

CORRUGATION ..........

CROTCHET = 120
An indication of the speed at which musical events succeed one another. Here the duration unit, the crotchet, occurs 120 times every minute. This speed is known as the Tempo.

CSOUND
A general purpose computer language which allows a composer to describe a sound-generating procedure (synthesis method) and ways to control it, in almost any degree of detail, and to define a sequence of events (a score) using the sounds generated, and which then generates the sound events thus defined. CSound is the most recent development of a series of such general purpose synthesis engines, and the one in most common use at the time of writing (Autumn, 1994).

CUTTING ..........

***** D *****

DELAY .......... 62

DENSITY
Describes the way in which a range is filled. Applies particularly to time ranges. A high-density texture has a great many events in a short time. Temporal density is a primary property of TEXTURE-STREAMS. The concept of density can also be applied to pitch-ranges.

DESTRUCTIVE DISTORTION
An irreversible transformation of the waveform of a sound, changing its spectral quality (the brightness, noisiness etc.). Distortion also implies the degradation of the sound (and irreversibility means that the original sound cannot be restored from the distorted version). Destructive distortion which preserves zero-crossing points can be musically useful.

DIPHONE SYNTHESIS
The reconstruction of speech (usually) by synthesising the transitions between significant phonemes (roughly speaking, vowels & consonants) rather than the phonemes themselves.

DRONE
A pitched or multipitched sustained sound which persists for a long time.

DUCKING
Means of ensuring the prominence of a lead 'voice' in a mix.

DURATION
The length of time a sound persists. Not to be confused with event-onset-separation-duration, which is the time between the start of successive sound events.

DYNAMIC INTERPOLATION
The process whereby a sound gradually changes into a different (kind of) sound during the course of a single sonic event.
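To illustrate the COMPRESSION entry above, here is a crude Python sketch: the louder the signal, the more its level is reduced. This sample-by-sample treatment, and the threshold and ratio values, are simplifying assumptions for the example; practical compressors act on the loudness trajectory of the sound over time rather than on individual sample values.

    # A minimal sketch of compression: levels above a threshold are reduced.
    import numpy as np

    def compress(signal, threshold=0.5, ratio=4.0):
        """Reduce the level of samples above the threshold by the given ratio."""
        out = np.copy(signal)
        mask = np.abs(out) > threshold
        excess = np.abs(out[mask]) - threshold
        out[mask] = np.sign(out[mask]) * (threshold + excess / ratio)
        return out

    loud = np.array([0.1, 0.4, 0.8, 1.0, -0.9])
    print(compress(loud))   # the loudest values are pulled down towards the threshold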

***** E *****

ECHO ..........

EDITING .......... 64
The processes of cutting sounds into shorter segments and/or splicing together sounds, or segments of sounds.

ENVELOPE
The loudness trajectory of a sound (the way the loudness varies through time) is often referred to as the envelope of the sound. Computer instruments which manipulate this loudness trajectory are usually called 'Envelope something'. Envelope is also used in the literature to refer to the time-changing variation of any property (we use the term trajectory) and even to the instantaneous shape of the spectrum (we use the term contour).

ENVELOPE CONTRACTION .......... 58

ENVELOPE FOLLOWING .......... 59
(See the code sketch at the end of this section.)

ENVELOPE INVERSION .......... 60

ENVELOPE SMOOTHING .......... 60

ENVELOPE SUBSTITUTION .......... 60

ENVELOPE TRANSFORMATION .......... 60
Musical transformations of the loudness trajectory of a sound.

ENVELOPING ..........

EXPANDING .......... 59

***** F *****

F5
The pitch 'F' in the European scale, in the 5th octave.

FAST FOURIER TRANSFORM
A very efficient computer algorithm for performing the Fourier Transform.

FF
Fortissimo = very loud.

FIBONACCI SERIES
Series of numbers in which each term is given by the sum of the previous two terms. The sequence begins with 1, 1 and is hence 1, 1, 2, 3, 5, 8, 13, 21, 34, 55 etc. The ratio between successive terms, 1/1, 1/2, 2/3, 3/5, 5/8, 8/13, 13/21, 21/34, 34/55 etc., approaches closer and closer to the Golden Section as one goes up the series.

FIELD
A set of values defined for some property. E.g. the pitches of the scale of G minor define a Field. Any pitch of the tempered scale lies either in that field or outside it. Any pitch within the pitch continuum lies close (in some perceptually defined sense) to that Field, or not.

FILTER .......... 7-8
An instrument which changes the loudness of the spectral components of a sound (often reducing some to zero).

FILTER BANK .......... 7

FLUTTER-TONGUING
A way of articulating a granular sound on breath-controlled instruments (woodwind or brass) by making a rolled 'r' in the mouth while producing a pitch on the instrument.

FORMANT .......... 10
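To illustrate ENVELOPE FOLLOWING (tracking the loudness trajectory of a sound), here is a minimal Python sketch using RMS measurement over short windows. The window size and the decaying test tone are arbitrary choices for the example, not values taken from the book.

    # A minimal sketch of envelope following: one RMS loudness value per window.
    import numpy as np

    def envelope_follow(signal, window=512):
        """Return the RMS level of each successive window of samples."""
        n = len(signal) // window
        frames = signal[:n * window].reshape(n, window)
        return np.sqrt(np.mean(frames ** 2, axis=1))

    sr = 44100
    t = np.arange(sr) / sr
    decaying = np.sin(2 * np.pi * 220 * t) * np.exp(-3 * t)   # a fading test tone
    env = envelope_follow(decaying)
    print(env[0], env[-1])    # the measured loudness falls over the course of the sound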

FORMANT PRESERVING SPECTRAL MANIPULATION .......... 17

FOURIER ANALYSIS
The representation of the waveform of a sound as a set of simpler (sinusoidal) waveforms. The new representation is known as the spectrum of the sound.

FOURIER TRANSFORM
A mathematical procedure which allows us to represent any arbitrary waveform as a sum of elementary sinusoidal waveforms. It is used in Fourier Analysis. See also INVERSE FOURIER TRANSFORM.

FREQUENCY
A steady sound has a definite repeating shape, a waveform, and this waveform a definite length, which takes a certain time to pass the listener. The number of waveforms that pass in each second (the number of cycles per second) is known as the frequency of the wave. The frequency of the wave helps to determine the pitch we hear.

FREQUENCY DOMAIN REPRESENTATION .......... 2

***** G *****

GATING ..........

GOLDEN SECTION
If a straight line AB is cut at a point P such that AP/AB = PB/AP, this ratio is described as the Golden Section, and is approximately equal to 0.618.

GRAIN
Sound having a duration which is long enough for spectral properties to be perceived but not long enough for time-fluctuation of properties to be perceived.

GRAIN-STREAM
Sound consisting of a rapid sequence of similar onsets.

GRAIN TIME-FRAME
The typical duration of a grain.

GRANULAR PROCESSES .......... 56-57
Musical processes which preserve the grains within a grain-stream.

GRANULAR RECONSTRUCTION .......... 73
A procedure which chops a sound into a number of segments and then redistributes these in a texture of definable density, which itself varies in time. The process differs from BRASSAGE in that the segments need not be rejoined tail to head. Granular Reconstruction of output density 1 is brassage.

GRANULAR REORDERING .......... 57

GRANULAR REVERSAL .......... 56

GRANULAR SYNTHESIS
A process, almost identical to granular reconstruction, in which very brief sound elements are generated with particular (time-varying) properties and a particular (time-varying) density, to create a new sound.

GRANULAR TIME SHRINKING BY GRAIN DELETION .......... 56

GRANULAR TIME SHRINKING BY GRAIN SEPARATION .......... 56

GRANULAR TIME STRETCHING BY GRAIN DUPLICATION .......... 56
(See the code sketch at the end of this section.)

GRANULAR TIME STRETCHING BY GRAIN SEPARATION .......... 56

GRANULAR TIME WARPING
Granular time-stretching or time-shrinking.
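As an illustration of GRANULAR TIME STRETCHING BY GRAIN DUPLICATION, here is a minimal Python sketch: the sound is cut into short grains and each grain is repeated, roughly doubling the duration. The fixed grain size, the plain butt-joins (no crossfading or enveloping of grains) and the test tone are simplifications assumed for the example.

    # A minimal sketch of granular time-stretching by repeating each grain.
    import numpy as np

    def time_stretch_by_duplication(signal, grain=1024, repeats=2):
        """Repeat each grain `repeats` times, lengthening the sound."""
        grains = [signal[i:i + grain] for i in range(0, len(signal), grain)]
        return np.concatenate([g for g in grains for _ in range(repeats)])

    sr = 44100
    t = np.arange(sr) / sr
    src = np.sin(2 * np.pi * 330 * t)
    stretched = time_stretch_by_duplication(src)
    print(len(src), len(stretched))    # the output is about twice as long as the input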

***** H *****

HARMONIC
In traditional European practice, harmonic means pertaining to harmony and relates particularly to tonal music. In sound composition, harmonic refers to the property of a spectrum where all the partials are multiples of some (audible) fundamental frequency; the spectrum is said to be harmonic and the sound has a single definite pitch, and in this case the partials are known as the harmonics of the sound. (See also INHARMONIC.) These two usages are incompatible, so in the text we use HArmonic to refer to the traditional usage, and harmonic to refer to the sound-compositional usage.

HARMONIC FIELD
A reference frame of pitches. This might be thought of as a chord. All pitches in a texture controlled by a HArmonic Field will fall on one or other of the pitches of the chord.

HARMONIC INBETWEENING
The process of moving gradually between two defined states.

HARMONICITY
The property of having a harmonic spectrum.

HARMONICS .......... 4
The partials of a sound of definite pitch are (usually) exact multiples of some (audible) frequency known as the fundamental. These partials are then known as harmonics. (See the code sketch at the end of this section.)

HARMONISER .......... 38
Application of brassage with tape-speed transposition to change the pitch of a sound without altering its duration. Generic name of commercially available hardware units which do both this and a number of other grain time-frame brassage procedures (e.g. duration change without pitch-change).

HARMONY
In European music, the rules governing the sounding-together of pitches, and the sequencing of chords.

HOMOPHONIC
Music in which there are several parts (instruments or voices) but all parts sound simultaneously (though not necessarily with the same pitch) at each musical event.

HPITCH
Pitch in a sense which refers to a HArmonic Field or to European notions of Harmony. In contrast to pitch as a property of a spectrum.

***** I *****

INDIAN RAG SYSTEM
The system of scales and associated figures and ornaments at the base of Classical Indian music theory and practice.

INHARMONIC
A spectrum which is not harmonic, but which is not noise, is said to be inharmonic. Inharmonic spectra may be bell-like (suggesting several pitches) or drum-like (being focused in some sense but lacking definitive pitch).

INTERPOLATION
See SPECTRAL INTERPOLATION.

INVERSE FOURIER TRANSFORM
A mathematical procedure which allows us to convert an arbitrary spectrum (frequency-domain representation of a sound) into the corresponding waveform (time-domain representation of the sound). See also FOURIER TRANSFORM.

INVERSE KARPLUS STRONG
See SOUND PLUCKING.

ITERATION .......... 42

***** J *****

JITTER
Tiny random fluctuations in apparently perceptually stable properties of a sound, such as pitch or loudness.
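To illustrate the HARMONICS entry, here is a minimal Python sketch which builds a harmonic spectrum by summing partials at exact multiples of an audible fundamental. The choice of fundamental, the eight partials and the 1/n amplitude weighting are arbitrary illustrative assumptions.

    # A minimal sketch of a harmonic spectrum: partials at integer multiples
    # of a fundamental, summed to give a single pitched sound.
    import numpy as np

    sr = 44100
    t = np.arange(sr) / sr
    fundamental = 110.0                     # Hz; the pitch we hear

    sound = np.zeros_like(t)
    for n in range(1, 9):                   # harmonics 1..8
        sound += (1.0 / n) * np.sin(2 * np.pi * n * fundamental * t)
    sound /= np.max(np.abs(sound))          # normalise the result

    # An inharmonic variant would place partials at non-integer multiples,
    # e.g. at fundamental * (n ** 1.3) instead of fundamental * n.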

***** K *****

KARPLUS STRONG
An efficient algorithm for generating plucked-string sounds. (See the code sketch at the end of this section.)

KEY
A piece of european music using the tempered scale (except where it is intentionally atonal) can usually be related to a scale beginning on a particular pitch, around which the melodic patterns and chord progressions of the piece are organised. The pitch which begins the scale defines the Key of the piece.

KLANGFARBENMELODIE
Literally, tone-colour melody. Musical line where successive pitches are played by different (groups of) instruments.

***** L *****

LIMITING ..........

LINEAR PREDICTIVE CODING ..........

LOOPING ..........

LOUDNESS
Technically speaking, loudness is a property which is related to perception, and is measured in a way which takes into account the varying sensitivity of the ear over different frequency ranges. In this sense it differs from the Amplitude of the signal, which is a scientific measure of the strength of a sound. In this book, wherever this will not cause any confusion, the term loudness is used in the text. Diagrams usually refer to Amplitude.

LPC .......... 60

***** M *****

MAJOR
Most european music uses one of two scale patterns, known as the major scale and the minor scale (the latter having a number of variants).

META-INSTRUMENT
An instrument which provides control instructions for another instrument. A meta-instrument might write or modify the mixing instructions according to criteria supplied by the composer, or in response to other data.

MF
Mezzo forte = moderately loud.

MIDI
Acronym for Musical Instrument Digital Interface. This is a communication protocol for messages sent between different digital musical instruments and computers. MIDI stores information on which key is pressed or released, how forcefully (or quickly) it is pressed, and on certain kinds of control information provided by controllers on digital instruments (e.g. pitch-glide, tremolo etc.), together with more instrument-specific data (which synthesis patch is being used). MIDI does NOT record the sound itself.

MINOR
Most european music uses one of two scale patterns, known as the major scale and the minor scale. The minor scale has two important variants, the melodic and the harmonic minor.

MIX-SHUFFLING .......... 47

MIXING .......... 46
In this book, the mixing of sounds is controlled by a mixing score (which may be a graphic representation of sounds and their entry times, or a written list in a computer file).

MIXING SCORE
A set of instructions detailing what will happen when a number of sounds are MIXED together. This might be a text file or a graphic display on a computer, but could equally well be a set of instructions for moving faders on a mixing desk in a studio. A typical mixing score would contain information about which sounds were to be used, at what time each would start, how loud each would be, and at what spatial position each should be placed.

MONO
Sound emanating from a single source (e.g. a single loudspeaker) or a single channel of digital information. As opposed to stereo.
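As an illustration of the KARPLUS STRONG entry, here is a minimal Python sketch of the well-known plucked-string algorithm: a delay line filled with noise is repeatedly averaged, giving a decaying pitched tone. The pitch, duration and sampling rate are arbitrary choices for the example.

    # A minimal sketch of the Karplus-Strong plucked-string algorithm.
    import numpy as np

    def karplus_strong(freq=220.0, sr=44100, seconds=1.0):
        period = int(sr / freq)                   # delay-line length sets the pitch
        buf = np.random.uniform(-1, 1, period)    # initial burst of noise (the "pluck")
        out = np.empty(int(sr * seconds))
        for i in range(len(out)):
            out[i] = buf[i % period]
            # average the current sample with its neighbour: a gentle low-pass
            # filter that makes the tone decay like a plucked string
            buf[i % period] = 0.5 * (buf[i % period] + buf[(i + 1) % period])
        return out

    tone = karplus_strong()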

...g... usually consisting of a sequence of a few pitches (in notated music)........ MOTIF Small element of musical structure...... MULTI-DIMENSIONAL SPACE A /ine defines a one-dimensional space.. Other sounds (especially those recorded directly from the natural environmcn!) may contain elements of unwanted noise which we may wish to eliminate by noise reduction......g...·N . PHASE PHASE INVERSION 9 (9) The sound waveform may be replaced by thc same fonll bUI 'upside-down' (i.. OCTAVE-STACKING ..... to the infinite number of dimensions in Hilbert space..ero-line (known as a phase-shift by 180 degrees... Parameter often also implies the measurnbility of that property... NOISE REDUCTION Process of eliminating or reducing unwanted noise in a sound source. 3 .c....... Two pitcbes separated by an octave... NOTCH FILTER ..... 7 21 Specific rearrangement of the elements. ONSET SYNCHRONISATION 48 48 TIle way in which the properties of a sound vary with time.... Typical examples might be the consonants 's' or 'sh'.. pitch and duration may be thought of as defining a two-dimensional space....... Superimposing the original sound on its phase-inverted version causes the sounds to cancel one another.. PARTIAL TRACKING NOISE PERMUTATION Sound having no perccplible pitch(es) and in which energy is distributed densely and randomly over the spcctrom andlor in a way which varies randomly wilh time. The sinUSOidal elements which define the spectrum of a sound. or hy PI rndians).. or the properties of thc elements (e.· OCTAVE A sound whose pitch is an octave higher than a second sound.. the new wave riscs where the olher falls and vice versa).... e.. Spaces may be of any number of dimensions (I... of a sequence of musical events.*...... MULTI-SOURCE BRASSAGE . not necessarily ones that we can visualise in our own spatial experience) from the four dimensions of Einstein's space-time.....e.. 45 .. and out of which larger stroctura! units (e. are regarded in some sense as the 'same' pilch. or as belonging to the same ·pitch-class'. The samc-cffcct is achieved with a sinusoidal waveform hy restarting thc wave from the tirst point at which it recrosses the 1.g. usually has twice the frequency of thaI second sound.... loudness)...... a sheet of paper or the surface (on! y) of a sphere a two-dimensional space. PARTIAL ... 125 126 . phrases) are built......... in european music.MORPHOLOGY ·····0··.....* PARAMETER p ••••• Any property of a sound or a sequence of sounds which can be musically organised.• ... and the world we live in is a three-dimensional space.. We may generalise the notion of a space to any group of independent parameters... and this is the space that we draw in wben we notate a traditional musical score.... .

PHASE VOCODER .......... 11
Instrument producing a moment-by-moment analysis of a sound waveform to reveal its time-evolving spectrum. A windowed Fast Fourier Transform.

PHASING .......... 9

PHONEME
A fundamental sound unit of a language. Crudely speaking we may think of vowels and consonants (as heard, rather than as written), but the true definition is more subtle.

PHRASE
Element of musical structure consisting of a sequence of sound events and, in traditional practice, usually lasting for a few bars.

PHYSICALITY
The physical nature of the apparent source of the sound. Is the apparent source hard or soft, rigid or flexible, granular or of-a-piece etc.? NOT the physical nature of the real source.

PITCH .......... 4
A property of instrumental and vocal sounds organised in most traditional musical practices. Pitch arises from a regular arrangement of partials in the spectrum. All partials are multiples of some fundamental frequency which is audible (and which may or may not be present in the spectrum), and in this case they are known as the harmonics of the sound. In the simplest case (the sine wave) there is only the fundamental present. Humans hear pitch between approximately 16 cycles per second and 4000 cycles per second. Below 16 cps, the sound breaks up into a grain-stream. Above 4000 cycles we may still be aware of tessitura (relative pitch-range), but assigning specific pitch becomes more problematic.

PITCH-GLIDE
See PORTAMENTO.

PITCH-TRACKING
Finding and recording the (time-varying) pitch of a sound.

PITCH-TRACKING BY AUTO-CORRELATION .......... 70

PITCH-TRACKING BY PARTIAL ANALYSIS .......... 71

PITCH-TRANSFER
Imposing the (time-varying) pitch of a sound on a different sound.

PORTAMENTO
Sliding of pitch. Often incorrectly referred to as glissando, but there is an important distinction. On a trombone or violin, a motion with the slide, or the finger along the string, causes pitch to fall through the continuum, without picking out intervening scale pitches: a portamento. On a fretted or keyed instrument like a piano, we may slide fingers from a high pitch to a low pitch, but the keys allow us to access only the pitches of the scale, and we hear a rapid descending scale passage: a glissando.

PROCESS-FOCUSSED TRANSFORMATION
Some musical processes will so radically alter a source sound that the perceived goal-sound is more dependent on the process of transformation than on the source itself. The perceived result of the procedure is governed more by the artefacts of the process of transformation than by the particular nature of the source-sounds employed. The same process applied to two very different sources will produce very similar goal-sounds.

***** Q *****

Q
The steepness with which a filter cuts out unwanted frequencies.

QUANTISATION
Forcing the timing of events on to a time grid of a specific size. E.g. we may set a grid at 0.01 seconds: any event must then fall at some multiple of 0.01 seconds. Alternatively we may set a grid at some division of the metrical unit, e.g. a grid of demi-semi-quavers: all events must then fall on some multiple of demi-semiquaver divisions of the beat, and the actual time quantisation will then also be determined by the tempo (the number of crotchets per minute). The quantisation grid provides a time reference-frame. On keyed or fretted instruments, pitch is similarly quantised. (See the code sketch at the end of this section.)

QUARTER TONE
A very small division of the musical scale: half a semitone (the smallest interval accessible on a standard European keyboard instrument, like a piano).
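To illustrate QUANTISATION, here is a minimal Python sketch which forces hypothetical event-onset times onto a grid of demi-semiquavers at crotchet = 120. The onset times themselves are invented for the example.

    # A minimal sketch of time quantisation onto a demi-semiquaver grid.
    event_times = [0.07, 0.61, 1.12, 1.48]   # hypothetical onsets, in seconds

    crotchet = 60.0 / 120                    # 0.5 seconds per crotchet at tempo 120
    grid = crotchet / 8                      # demi-semiquaver = 0.0625 seconds

    quantised = [round(t / grid) * grid for t in event_times]
    print(quantised)                         # each onset moved to the nearest grid point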

. 2.... which are properly known as samples. SEQUENCE GENERA1l0N SEQUENCES Groups of consecutive sounds having distinctly different spectral properties. vibrations or waveforms call be descrihed.. If an object is vibrated it will produce a sound. Note that the sound-events themselves are nOl reversed.. The even more general notion of permuting a given set of elements has been more widely used (e.... The sounds recorded on a sampler are often referred to in the commercial literature as 'samples'....g..... ......... 129 130 .......... •••• * s ••••• SAMPLER I SERIALISM •.... These time intervals must be very short if high frequencies in the sound arc to be resolved. ... Sound is digitally recorded by sampling the value of the (elecuical analogue of the) sound wave. SERIAL COMPOSITION Style of twentielh century European musical composition in which the 12 pitches of the chromatic scale are arranged in a specific order.. Sounds (or sequences of sounds) constructed so that they rise in tessilura while their pitch falls (or vice versa)..)sition information sent from a MIDl keyboard). SAMPLING . of whose elements differ by perceptually significant pitch steps). e.... Ihe chromatic scale as a reference set for european harmony. These should not be confused wilh !he individual numbers used to record Ihe shape of the waveform itself. of loudnesses etc. RESONANCE The smallest interval between consecutive pitches on a modem European keyboard or fretted instrument.. SEMITONE A set of values which provide a reference set against which OIher values can be measured. SINE WAVE .. The oscillation of a simple (idealised) pendulum is described by a sine wave...... Musical scales are defined as some pattern of tones (equal to two semitones) and semitones.. (See SAMPLING).. pitch-change by 'tape-speed' variation with !he specific transp.. al regular time..'ronant frequencies.... See SERIAL COMPOSITION . systems music).g.....intervals. and this sequence (and cenain well-specified transformations of it) arc used as the basis for Ihe organisation of pitches in a piece........ RETROGRADE 1he performance of a sequence of sound events in Ihe reverse order.. speech... Due to its particular weight... If supplied with frequency-unspecific energy it will tend to vibrate at these natural re....... orchestra etc which fall on its natural resonant frequencies. A hall or building will reinforce cenain rrequencies in a voice. between 22000 and 48000 samples per second). size and shape there will be certlIin frequencies at which it will vibrate 'naturally'.... SHEPARD TONES 72 A piece of hardware...... (e...g. 64 RITARDANDO A decrease in the speed at which musical events succeed one another.g....... The general idea of serialism was also extended to sequences of durations.••••• R ••••• RANDOM-CUTIING REFERENCE FRAME 41 SCORE The notation of a piece of music from which a performance of the work is recreated. REVERBERATION .. .. A-B-C-D-E becomes E-D-C-B-A.. which Ihat digitally records any sound and allows it to be manipulated (e.. or melodic pltrases on keyed instruments (the speCtrJ. !he set of vowels in standard english as a reference set for classifying regional accents etc..... or a software package on a computer........ e.... At a sampling rate of 48000 samples per second.. SINUSOIDAL Having the shape of a sine-wave.g.5 The elementary oscillations in terms of which all other regular oscillations. HArmonic minor scales also containing a three semitone step.. 
A flute tube with a certain combination of closed holes has specific resonant frequencies which produce the pitches for Ihat fingering. Ihe highest resolvable frequency is 24000 cycles per second... SHAWM A mediaeval wind instrument with a strong reedy sound...
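To illustrate SAMPLING, here is a minimal Python sketch which reads the value of a waveform at regular time intervals and reports the highest resolvable frequency (half the sampling rate). The 1000 Hz test tone and the 0.01 second duration are assumptions for the example.

    # A minimal sketch of digital sampling at regular time intervals.
    import numpy as np

    sr = 48000                                   # samples per second
    duration = 0.01
    t = np.arange(int(sr * duration)) / sr       # the regular sampling instants
    samples = np.sin(2 * np.pi * 1000 * t)       # a 1000 Hz sine wave, sampled

    print(len(samples), "samples;",
          "highest resolvable frequency:", sr // 2, "cycles per second")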

SOUND-PLUCKING
A musical transformation which imposes a plucked-string-like attack on a sound.

SOUND REVERSING .......... 43

SOUND SHREDDING .......... 41

SOURCE-FOCUSSED TRANSFORMATION
A musical transformation whose outcome depends strongly on the nature of the source sound. Defined in contrast to PROCESS-FOCUSSED TRANSFORMATION.

SPECTRAL ARPEGGIATION .......... 24

SPECTRAL BLURRING .......... 25

SPECTRAL BRIGHTNESS .......... 19
A measure of where energy is focused in the spectrum. If some of the upper partials are very loud, the sound will appear bright.

SPECTRAL FOCUSING .......... 20

SPECTRAL FORMANT TRANSFER
See VOCODING.

SPECTRAL FREEZING ..........

SPECTRAL INTERLEAVING .......... 22

SPECTRAL INTERPOLATION .......... 35

SPECTRAL MANIPULATION .......... 18-35
Musical processes that work directly on the (time-varying) spectrum of the sound.

SPECTRAL MASKING .......... 27

SPECTRAL SHAKING .......... 34

SPECTRAL SHIFTING .......... 23

SPECTRAL SPLITTING .......... 18

SPECTRAL STRETCHING .......... 29

SPECTRAL TIME-SHRINKING .......... 30-31

SPECTRAL TIME-STRETCHING .......... 30-31

SPECTRAL TIME-WARPING .......... 32-33

SPECTRAL TRACE-AND-BLUR .......... 30-31

SPECTRAL TRACING .......... 26

SPECTRAL UNDULATION .......... 28

SPECTRUM .......... 2-3
Representation of a sound in terms of the frequencies of its partials (those sinusoidal waves which can be summed to produce the actual waveform of the sound); the frequency-domain representation of the sound.

SPLICING .......... 1
The tail to head joining of two sounds. In the classical tape studio this would be achieved by joining the end of one sound tape to the beginning of another, using sticky tape. (See the code sketch at the end of this section.)

SQUARE WAVE .......... 5

SRUTI
The smallest unit into which the octave is divided in classical Indian music and from which the various rag scales can be derived. This is more of a theoretical unit of measurement than a practically applied unit. It is at least 6 times smaller than a semitone. In contrast, the Western semitone is built into the structure of its keyboard instruments.

STATIC INTERPOLATION
The process whereby a sound gradually changes into a different (kind of) sound during a series of repetitions of the sound, where each repeated unit is changed slightly away from the previous one and towards the goal sound.

STEREO
Sound emanating from two channels (e.g. two loudspeakers) or stored as two channels of digital information. Sound information provided through two loudspeakers is able to convey information about the (apparent) positioning of sound sources in the intervening space between the loudspeakers, rather than suggesting merely a pair of sound sources. As opposed to mono (from a single source).
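To illustrate SPLICING, here is a minimal Python sketch which joins the tail of one sound to the head of another. A short linear crossfade stands in here for the angled tape splice; the 10 millisecond splice length is an arbitrary assumption, and both sounds are assumed to be longer than the splice.

    # A minimal sketch of splicing two sounds tail to head with a short crossfade.
    import numpy as np

    def splice(a, b, sr=44100, splice_ms=10):
        n = int(sr * splice_ms / 1000)            # length of the crossfade in samples
        fade_out = np.linspace(1.0, 0.0, n)
        fade_in = 1.0 - fade_out
        return np.concatenate([
            a[:-n],
            a[-n:] * fade_out + b[:n] * fade_in,  # overlap the tail of a with the head of b
            b[n:],
        ])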

STOCHASTIC PROCESSES
A process in which the probabilities of proceeding from one state, or set of states, to another, is defined. The temporal evolution of the process is therefore governed by a kind of weighted randomness, which can be chosen to give anything from an entirely determined outcome to an entirely unpredictable one.

SYNTHESIS
Process of generating a sound from digital data, or from the parameters supplied to an electrical oscillator. Originally synthetic sounds were recognisably such, but now it is possible, through a process of careful analysis and subtle transformation, to recreate a recorded sound in a changed form which however sounds as convincingly 'natural' as the recording of the original sound.

***** T *****

TAPE-ACCELERATION .......... 68

TAPE-SPEED VARIATION ..........

TEMPERED TUNING
The tuning of the scales used in European music, a system which became firmly established in the early eighteenth century. In tempered tuning the octave is divided into 12 exactly equal semitones, i.e. the ratio between the frequencies of any two pitches which are a semitone apart is exactly the same. In a harmonic spectrum, the frequencies of the partials are exact multiples of some fundamental frequency. The ratios of their frequencies form a pattern known as the harmonic series, i.e. 2/1, 3/2, 4/3, 5/4, 6/5, 7/6 ..., and the frequency ratios between members of this series are known as 'pure' intervals. Some pure interval ratios are: octave 2/1, 5th 3/2, 4th 4/3, 7th 7/4. There is no common smaller interval from which all these 'pure' intervals can be constructed; hence the striving to achieve some kinds of compromise tunings, of which the tempered scale is just one example. However, apart from the octave, the tempered scale only approximates the frequency ratios of the pure intervals. The 7th has no close approximation in the tempered scale. (See the code sketch at the end of this section.)

TEMPO
The rate at which musical events occur. In european music the relative durations of events are indicated by note values, e.g. a crotchet is as long as two quavers. The speed of the whole will be indicated by a tempo marking, e.g. crotchet = 120, which means there are to be 120 crotchets in one minute.

TESSITURA
In its simplest sense, the range of pitch in question. Also known as register. The tessitura of soprano voices is higher than that of Tenor voices. However, unpitched (e.g. noise) sounds may rise in tessitura while having no perceivable pitch, while SHEPARD TONES may rise in tessitura while falling in pitch(!). Hence tessitura also has something to do with where the main focus of energy lies in the spectrum (where the loudest groups of partials are to be found).

TEXTURE
Organisation of sound elements in terms of (temporal) density and field properties, rather than individual sounds themselves.

TEXTURE CONTROL .......... 68-69

TEXTURE GENERATION .......... 68-69

TEXTURE OF GROUPS
Texture in which small sets of sound elements are considered as the organisable units of the texture. The sounds forming any particular group are, however, chosen arbitrarily from the available sound sources.

TEXTURE OF MOTIFS
Texture in which specific small sets of sound elements, known as motifs, are considered as the organisable units of the texture, rather than individual sounds themselves. The sounds forming any particular motif are in some kind of predefined arrangement (e.g. some combination of sequence, time-placement, pitch-relationship, formant sequence etc.), and each of these may be organised as units of the texture.

TEXTURE OF ORNAMENTS .......... 69
Texture in which specific small sets of sound elements, known as ornaments, are attached to the fundamental sound elements of the texture.

TEXTURE STREAM .......... 36
Dense and relatively disordered sequence of sound events in which specific ordering properties between individual elements cannot be perceived. Properties may be ordered in terms of Field and Density.

THRESHOLD
A value either not to be exceeded (or fallen below), or to be noted (in order to cause something else to happen) if it is thus exceeded.
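To illustrate TEMPERED TUNING, here is a minimal Python sketch comparing tempered intervals (every semitone the same ratio, the twelfth root of 2) with the pure ratios of the harmonic series mentioned above.

    # A minimal sketch: tempered intervals only approximate pure interval ratios.
    semitone = 2 ** (1 / 12)

    tempered_fifth = semitone ** 7           # 7 semitones
    pure_fifth = 3 / 2
    tempered_seventh = semitone ** 10        # 10 semitones
    pure_seventh = 7 / 4

    print(round(tempered_fifth, 4), pure_fifth)      # 1.4983 vs 1.5  : a close match
    print(round(tempered_seventh, 4), pure_seventh)  # 1.7818 vs 1.75 : no close match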

TIMBRE
A catch-all term for all those aspects of a sound not included in pitch and duration. Of no value to the sound composer!

TIME-DOMAIN REPRESENTATION
The waveform of a sound represented as the variation of air-pressure with time.

TIME STRETCHING
See spectral time-stretching, granular time-stretching, waveset time-stretching, harmoniser, brassage.

TIME-VARIABLE TIME-STRETCHING
See time-warping.

TIME-WARPING .......... 30
Changing the duration of a sound in a way which itself varies with time.

TONAL MUSIC
Music organised around keys and the progression between different keys. In contrast, atonal music avoids indicating the dominance of any particular key or pitch.

TONE
The interval between the first two pitches of a major (or minor) scale in European music. European scales consist of patterns of tones and semitones (half a tone) and, in some cases, 3-semitone steps.

TRAJECTORY
The variation of some property with time, e.g. pitch trajectory, loudness trajectory, formant trajectory. Loudness trajectory is often also known as 'envelope', and instruments which manipulate the loudness trajectory are here called 'Envelope something'.

TRANSPOSITION
Changing the pitch of a sound.

TREMOLO
Cyclical undulations of loudness between c. 4 and 20 cycles per second.

TRIGGERING
Using the value of some time-varying property (usually the loudness) of a sound to cause something else to happen.

TRITONE
The musical interval of 6 semitones, i.e. in the European tempered scale, a frequency ratio of the square root of 2.

***** U *****

UNDULATION
Cyclical change in the value of a musical property, like the rising and falling of pitch in vibrato, or of loudness in tremolo.

***** V *****

VIBRAPHONE
Musical instrument consisting of pitched metal bars suspended over resonating tubes. When the bars are struck they produce a specific pitch, and the associated tube resonates at that pitch. A motor drives small rotary blades inside the tubes, adding slight vibrato-tremolo to the resonated sound.

VIBRATO .......... 66
Cyclical undulations of pitch between c. 4 and 20 cycles per second.

VOCODING ..........

***** W *****

WAVELENGTH .......... 3-4

WAVESET .......... 50

WAVESET DISTORTION .......... 52
Generally refers to any process which irreversibly alters the wavesets in a sound, e.g. omission, reversal, inversion, shuffling, substitution, shaking, averaging and harmonic distortion. Specifically used to refer to power-distortion, i.e. raising each sample value of the sound to a power (e.g. squaring, cubing, taking the square root). (See the code sketch at the end of this section.)
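To illustrate the power-distortion sense of WAVESET DISTORTION, here is a minimal Python sketch which raises each sample value to a power while preserving its sign. The cube power and the test tone are arbitrary choices for the example.

    # A minimal sketch of power distortion: each sample raised to a power.
    import numpy as np

    def power_distort(signal, power=3):
        """Raise each sample to `power`, preserving its sign."""
        return np.sign(signal) * np.abs(signal) ** power

    sr = 44100
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 200 * t)
    distorted = power_distort(tone)   # quieter material is pushed down; new partials appear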

........................................ before passing to the next window to make our next analysis.......~1 Fourier Transform that is perlormed over a brief slice of time (a window). 69 137 138 ....................... ..... ...... WINDOW In sound analysis (conversion from wavdorm 10 spectrum... 51 ZERO-CROSSING ...... 51 W A VESET SUBSTITUTION ... •• The waveform of a sound continually rises above the centre line and then falls below H..... ZERO-CUTTING ...... 53 WAVESET HARMONIC DISTORTION ......... to spectral envelope or to time-varying tiher4....... The point where it crosses the centre line is a zer04.... ........... 52 WAVESET TIMESTRETCHING ... Not to be confused with analysis CHANNEL WINDOWED I-+T A Fa....................................... 55 WAVESET TRANSFER WAVESET TRANSPOSITION WEDGEING . 54 WA VESET INVERSION ........ 40 W A VESET OMISSION ............rossing.......... then performed over and over again at successive windows throughout the entire duration of the sound.................................... We hence discover how the spectrum changes in time...43 WAVESET REVERSAL ..WAVESET ENVELOPING .................................. producing a nOise band..... WHIlE-OUT A musical process in which a texlure becomes so dense thaI all spectral detail is lost.............. The Phase Vocoder is a windowed FFT....... Z .........ocfficients : see LPC) a window is a very brief slice of time (a few milliseconds) in which we make a spectral analysis............... 51 WAVESET SHAKING . 52 WAVESET INlERLEAVING ... 54 51 ..... 51 WA VESET SHur-'FLlNG ........... 51 ZIGZAGGING ...................