Ebook TA225 Block 1 Part 1 ISBN0749258942 L3

TA225 BLOCK 1 INVESTIGATING SOUND CHAPTER 1 SOUND BASICS
TA225 The Technology of Music
Investigating
Sound
Chapter 1 Sound Basics
Chapter 2 Sound Shape and Colour
Chapter 3 Sound and Time
Index
This publication forms part of an Open University course, TA225 The Technology of Music.
Details of this and other Open University courses can be obtained from the Course
Information and Advice Centre, PO Box 724, The Open University, Milton Keynes MK7 6ZS,
United Kingdom: tel. +44 (0)1908 653231, email general-enquiries@open.ac.uk
Alternatively, you may visit the Open University website at http://www.open.ac.uk
where you can learn more about the wide range of courses and packs offered at all
levels by The Open University.
To purchase a selection of Open University course materials visit the webshop at
www.ouw.co.uk, or contact Open University Worldwide, Michael Young Building, Walton
Hall, Milton Keynes MK7 6AA, United Kingdom for a brochure. tel. +44 (0)1908 858785;
fax +44 (0)1908 858787; email ouwenq@open.ac.uk
The Open University

Walton Hall, Milton Keynes
MK7 6AA
First published 2004
Copyright © 2004 The Open University
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system,
transmitted or utilized in any form or by any means, electronic, mechanical, photocopying,
recording or otherwise, without written permission from the publisher or a licence from the
Copyright Licensing Agency Ltd. Details of such licences (for reprographic reproduction) may be
obtained from the Copyright Licensing Agency Ltd of 90 Tottenham Court Road, London W1T 4LP.
Open University course materials may also be made available in electronic formats for use by
students of the University. All rights, including copyright and related rights and database rights, in
electronic course materials and their contents are owned by or licensed to The Open University, or
otherwise used by The Open University as permitted by applicable law.
In using electronic course materials and their contents you agree that your use will be solely for
the purposes of following an Open University course of study or otherwise as licensed by The Open
University or its assigns.
Except as permitted above you undertake not to copy, store in any medium (including electronic
storage or use in a website), distribute, transmit or re-transmit, broadcast, modify or show in
public such electronic materials in whole or in part without the prior written consent of The Open
University or in accordance with the Copyright, Designs and Patents Act 1988.
Edited, designed and typeset by The Open University.
Printed in the United Kingdom by The Burlington Press, Foxton, Cambridge CB2 6SW.
ISBN 0 7492 5894 2
1.1
TA225 Block 1 Investigating sound
Chapter 1
Sound Basics
CONTENTS
Aims of Chapter 1
1 Introduction
1.1 Music and technology
1.2 What is sound?
1.3 Summary of Section 1
2 Sinusoidal pressure waves
2.1 Introduction
2.2 Pressure in the atmosphere
2.3 Pressure waves and cycles
2.4 Period
2.5 Wavelength
2.6 Pressure variations in one place
3 Frequency
3.1 Frequency and period
4 The speed of sound
4.1 The experimental result
4.2 Frequency, wavelength and the speed of sound
5 Phase
5.1 Phase and phase difference
5.2 Cancellation and reinforcement
6 Amplitude
6.1 Defining amplitude
6.2 Practical units of amplitude
6.3 Root-mean-square amplitude
7 Pitch and loudness
7.1 The subjective experience
8 The octave
8.1 The octave sound
8.2 Octave pitch and frequency increments
9 The ranges of human hearing
9.1 Frequency range
9.2 Dynamic range
10 The decibel
10.1 Introduction
10.2 Adding decibels
10.3 The decibel as a measure of sound amplitude
Summary Chapter 1
Answers to self-assessment activities
Learning outcomes
Acknowledgement
AIMS OF CHAPTER 1
• To introduce the concept of a travelling pressure wave as the
physical manifestation of sound.
• To present a simple model relating pressure waves to molecular
activity.
• To introduce the basic mathematical properties of sine waves.
• To introduce the basic terms and measurements of sound used in
acoustics.
• To relate the terms from basic acoustics to corresponding musical
terms.
• To explore the subjective perception of pitch and loudness, in
particular their relationship to frequency and amplitude.
1 INTRODUCTION
1.1 Music and technology
Music technology in one guise or another is part of everybody’s life,
because music is a part of almost everybody’s life. For instance, if you
are an instrumental performer of music, professional or not, then your
instrument, be it the harp or the rock’n’roll drums, will be the result of
considerable technological expertise on the part of the instrument
maker. On the other hand, if you are not a performer but like to listen
to music, the chances are that most of your listening is done via a
home hi-fi system or a car radio rather than in a concert hall. In this
case your music is relayed to you via various technological devices;
and well before the music reaches a CD or radio it will have been
manipulated in various ways using studio devices for recording, mixing
and mastering. From the traditional acoustic instruments to the
modern computer-based sampler or synthesiser, from the museum
gramophone to the latest MP3 player, whatever your choice of musical
style and whatever your relationship with music, technology is
virtually inescapable.
Technology and music have been closely associated since the first
musical instruments were constructed. This may come as a surprise to
people who are used to thinking of music technology in terms of
electric or electronic devices. The piano, for example, has taken
centuries of evolving technological expertise to become what it is
today. Throughout its history, music has freely exploited state-of-the-
art technological developments, and has had an intimate relationship
with sciences such as physics, mathematics and, more recently,
electronics and computing. Throughout this course you will see many
examples that highlight this close relationship of music with both
technology and the sciences.
This block of TA225, specifically, explores sound, which is the basis of
all music. Music, of course, implies sound. Music technology,
essentially, has the purpose of enabling sound production,
manipulation, storage and reproduction. But what is sound?
Throughout this block you will study the various aspects that, together,
comprise a basic model of sound. In this first chapter, in particular,
you will start by learning about the various ways of interpreting the
word ‘sound’, concentrating at first on a more formal approach to
sound from the perspective of physics.
1.2 What is sound?

In the section above I posed a question: what is sound? Take a few
minutes to think about this. This may seem a straightforward question,
but, in fact, sound is a rather more complicated thing to pin down than
you might think on a first analysis. In this section I would like to
explore and map out this complexity, and we will do this together,
based on your own experience with sound. To start, I’d like to propose
some listening activities.
ACTIVITY 1 (LISTENING)

The purpose of this activity is to put your understanding of sound into
perspective, to provide a basis for the exploration of sound undertaken
in this block. Listen to the eight audio tracks for this activity. Jot down
a few words to describe what you hear. Use whatever terms seem
appropriate to the sound you hear.
Comment
Here are my descriptions. Your descriptions will almost certainly be
different from mine because people have different musical backgrounds
and experiences. This is to be expected and is not at all a problem, as
you should be able to relate your descriptions to the main points I will
be making shortly.
Track 1: A major and a minor scale played on the piano.
Track 2: A low-pitched note followed by a high-pitched note played on
the recorder.
Track 3: A drum solo.
Track 4: Humming of machinery.
Track 5: Sound of an ambulance siren.
Track 6: Sounds of sea waves.
Track 7: A tune sung by a male voice.
Track 8: Sounds of three female voices speaking in different languages.

ACTIVITY 2
Listen to the tracks from Activity 1 again, this time looking at my
descriptions to familiarise yourself with them, as the next activity will
build on this previous work.

ACTIVITY 3
Listen to the audio tracks again from Activity 1 as many times as you
like, and then consider the descriptions I produced in Activity 1.
Examining the words used in my descriptions, can you think of any
similarities between them, perhaps more general threads that may
bring two or more descriptions together as one broader class?
I can group my descriptions of the sounds in Activity 1 into three
categories, based on the types of things I have said:
(a) descriptions that refer to the objects or phenomena used to create
the sounds: for example, musical instruments (piano, recorder),
environmental sounds (siren, waves) and human voice (speech,
song, male, female);
(b) descriptions that refer to musical elements: scale, note, pitch;
(c) descriptions that use metaphors: high/low, humming.
ACTIVITY 4 (EXPLORATORY)

Look at the descriptions you produced for the audio tracks in Activity 1.
Do they seem to belong to these categories of description?
Comment
Naturally I do not know what your descriptions were like, but I would be
very surprised if most of them could not be put into one of these categories.
Let’s now take a closer look at my list of categories, starting with item
(a). In my descriptions of this sort, I referred to the source-causes of
the sounds, that is, objects or instruments (the sources of sounds) and
ways of using these to produce sounds (the causes of sounds). Here are
other examples of source-cause descriptions:
Violin (source) played pizzicato (cause); pizzicato is a plucking
technique for string instruments that are normally played with a bow.
Piano strings (source) struck with a felt-covered hammer (cause); this
is the basic mechanism of a modern piano.
Hand (source) clapping (cause); this is central to traditional Spanish
Flamenco music.
Source-cause descriptions are probably the most common way of
describing sounds. Your list probably resembles mine in this respect.
Such descriptions are particularly interesting because it is quite
remarkable how descriptions of sounds seem to rely so much on things
that are not the sounds themselves. Consider the way we would
generally describe an object we see. Normally we would describe an
object by mentioning a label (let’s say, chair) accompanied by some
qualities that, we feel, make the object distinct from others (for example, a
wooden, white-painted, chair in the corner of the room). With a verbal
explanation, we can characterise the object so that it can be distinguished
from other nearby or similar objects. Instead, where sounds are
concerned we tend to mention their origins. This sort of description of
sounds might be compared to describing objects we see by talking
about how they are produced. Perhaps this is because the recognition
of the source-causes of sounds is one of the most basic listening abilities,
an instinctive ability, as you will see in Chapter 5 of this block.
My second category of descriptions (ones that refer to musical elements)
is a bit more specific. If your background is not musical, you may not
have used or mentioned this sort of terminology at all, keeping your
description to the identification of the instruments in the first two
tracks, for example. On the other hand, if your background is musical,
you may have added yet more detail, for example, mentioning the key
in which the scales in the first track are played, the relationship
between them in harmonic terms, or, perhaps, the interval you identify
in the second track. Naturally, the use of musical terminology depends
on musical training, but some of the traditional terms have crept into
colloquial speech, perhaps due to some form or other of musical training
received in early schooling. ‘Note’, ‘key’ and ‘chord’, for example, have
specific meanings in music (which are not easy to pin down), but have
become part of everyday language, as in: ‘that strikes a chord’.
My final category, item (c) ‘Descriptions that use metaphors’, is a most
interesting one, as it relates to a sort of informal language used commonly
by musicians and experienced music listeners. (If you want to remind
yourself about what a metaphor is, see Box 1.) You may have heard
qualities like brightness, darkness and depth attributed to sounds,
although these are clearly attributes of things we perceive visually.
Interestingly, pitch is supposed to have been originally associated with
a metaphor of high/low, which expressed the impression of
highness/lowness of the singing voice. This is still reflected in the
physical arrangement of singers in a choir according to their type of voice.
Box 1 Metaphor
According to the Concise Oxford Dictionary, Eighth Edition (1990), a metaphor
is ‘the application of a name or descriptive term or phrase to an object or action
to which it is imaginatively but not literally applicable (e.g. a glaring error).’
My three categories have one thing in common. All of them rely on
perception, that is, on the sense of hearing. Perceptual categories are
most useful and, indeed, necessary because a lot of the sense we make
of the world around us is enabled by what we see, hear, smell, that is,
generally, perceive. There is no pun intended here: we make a lot of
sense of things using our senses. Music technology, in particular,
relies much on perception: if the proof of the pudding is in the eating,
the ‘proof’ of music technology is in the hearing.
Additionally, though, in technological analyses we need a different
way of describing sounds, a way that allows their formal assessment
and numerical representation. Essentially, we need to be sure we are
talking about the same thing, which is quite difficult to achieve if we
count exclusively on subjective perception. Therefore, a way of
approaching sound is borrowed from physics, specifically, from
acoustics, and this is the perspective you will be exploring in the
remainder of this chapter.
In summary, I have used the word ‘sound’ to refer to things that are,
indeed, quite different in nature. Sound refers both to what is
perceived – a sensation – and to the stimulus that suggests the sensation –
a physical phenomenon involving vibrations and energy. It may be a
bit perplexing that the same word has such different meanings, and
some authors do prefer to use different words. However, in this course
we have decided to stick to one term only, because it is normally quite
easy to understand what it refers to from the context in which it appears.
The remainder of this chapter is a short introduction to the phenomenon
of sound: what it is physically, how we quantify it, and, at a basic
level, how its physical properties are interpreted as sensations.
1.3 Summary of Section 1

The close relationship of music and technology is not new, and is not
confined to electronically generated or computer-generated music.
Historically there is a long association between music and technology,
and this continues in the way instruments are made and the way music
is disseminated. Additionally, for most listeners in the developed
world, listening to music is almost synonymous with listening to
electronically processed and delivered music, through recordings,
broadcasts or the Internet.
Sound, which is fundamental to all varieties of music, can be
considered objectively and subjectively. As an objective phenomenon
it can be measured and described using a scientific, and to some extent
a musical, vocabulary. As a subjective phenomenon it is experienced
as a perception, and descriptions tend to be metaphorical, although
some musical terminology also relates to subjective perception.
2 SINUSOIDAL PRESSURE WAVES

2.1 Introduction
For much of the rest of this chapter we will be concerned with the
properties of a type of sound wave which, when represented as a
graph, has a characteristic shape known as a sine wave. Figure 1
shows you what a sine-wave graph looks like. For the moment you
need not be concerned with what this graph represents.
Figure 1 A sine wave
Despite their theoretical importance, sine waves are of limited use in
the actual business of creating music, at least music of conventional
kinds. Few instruments, for instance, produce a sine-wave type of
sound when played in the normal way, although in electronic music
sine waves can be a basic component of synthesised sounds.
Because pure sine-wave sounds are not often heard in music, I ought to
begin this study by giving a few reasons why they are so important. One
reason has already been hinted at in my reference to electronic synthesis.
For our purposes sine waves are important for three main reasons:
1 They can be used to define some basic terms and quantities relating
to sound of all kinds. They therefore give us a basic vocabulary for
talking about the physical properties of sound.
2 They allow us to explore the relationships between the purely physical
properties of sound and the subjective experience of hearing it.
3 They are fundamental to the analysis and synthesis of sounds that
are used musically (and also non-musically).
The first two of these reasons are the basis of this chapter. The third
reason will be the basis of Chapter 2 of this block, but will also loom
large throughout the rest of the course.
Sine waves are fundamental in many areas of mathematics, science
and technology, not just in sound. Many of the properties of sine
waves, which I shall discuss later in this chapter, are therefore of
wider application than sound, although I will not be referring to these
other applications to any significant extent. However, before I can say
anything further about sine waves, I need to prepare the ground by
discussing pressure waves.
2.2 Pressure in the atmosphere

The sounds we hear generally consist of rapid fluctuations of air
pressure in the atmosphere that surrounds us. Sound can also be
transmitted through other media, for instance water, so not all sound
consists of fluctuations in air pressure. However, for the purposes of
this discussion I will confine myself to sound in air.
These fluctuations in air pressure are caused by a local disturbance to the
air pressure, which might be sudden and transient – for example when
a paper bag is burst – or continuous and regular – for example when
someone sings a steady note. Whatever the nature of the disturbance,
the pressure variations spread outwards from the source through the
surrounding air, becoming gradually weaker. At a sufficient distance
from the source, the pressure variations die away completely.
For a listener in the vicinity of the sound source, the pressure variations
act on the listener’s hearing mechanism, causing the eardrum to move
in sympathy with the source of the pressure variations. The movements
of the eardrum are detected by a mechanism that will be described
later in this block, and are interpreted by the brain as sound.
The pressure variations we hear can also sometimes be felt. You may
be familiar with the experience of holding an inflated balloon and
feeling it vibrate in response to nearby sounds, or standing near a
loudspeaker and feeling vibrations from the bass notes. In the case of
the balloon, pressure variations cause the skin of the balloon to vibrate
in just the same way as they cause the eardrum to vibrate. Similarly,
all drummers and percussion players are familiar with the way
drumskins resonate in sympathy with sound from nearby instruments,
and often need to be damped to prevent unwanted noises being
generated.
The air around us presses on everything it touches, and this is roughly
what is meant by the atmospheric pressure. Generally we are unaware
of the pressure because it acts equally in all directions, and so its
effects are self-cancelling. However, if you removed some of the air
from an empty, thin-walled plastic bottle, the pressures inside and
outside the bottle would cease to be self-cancelling, and the bottle
would buckle (Figure 2). The kinds of pressure imbalance that would
make a plastic bottle buckle are, however, much larger than the
pressure fluctuations associated with sound.
Figure 2 A plastic bottle can be buckled by unbalanced pressure (panel labels: equal pressure inside and out; reduced pressure inside)
With regard to sound and the way it travels, we need to think about
pressure in relation to the arrangement of molecules in the air.
Atmospheric air is a mixture of gases, and at the sub-microscopic scale
consists of a mixture of gas molecules. These molecules are so tiny
that they are only detectable individually by sophisticated scientific
apparatus. In a moderately sized volume of air, such as inside a bottle,
there is a colossal number of such molecules, and between them there
is empty space. The molecules are not static, but continually move
around, bouncing off other molecules or off any solid or liquid objects
in their vicinity. In general, if there is no sound or other disturbance to
the pressure of a sample of air, the molecules are evenly (though
randomly) distributed throughout the sample.
The pressure of air (or any gas) is related to how closely packed its
molecules are (Figure 3). If the molecules are widely dispersed, the
pressure is lower than when they are closer together, other things
being equal (principally temperature).
Figure 3 High pressure corresponds to closer molecular spacing (panel labels: high pressure; low pressure)
When you pump up a bicycle tyre, by driving more air into the tyre
you squash the molecules together more closely than they are outside
the tyre. Hence the pressure inside the tyre is higher than that outside.
The message to remember from this section is that sound consists of
rapid fluctuations of atmospheric pressure, and that, at the molecular
level, high pressure corresponds to air molecules being bunched
together, and low pressure corresponds to air molecules being
relatively more widely separated.
ACTIVITY 5 (SELF-ASSESSMENT)

The science fiction film Alien (1979) was promoted with the grim
slogan, ‘In space, no one can hear you scream.’ Is this slogan true in
the following cases?
(a) In a space craft where there is an artificial atmosphere to sustain
the crew.
(b) Outside the space craft, where there is a vacuum and the crew need
to wear space suits to survive.
(c) On a planet where there is a poisonous atmosphere and where the
crew need to wear space suits (which are not sound-proof).
2.3 Pressure waves and cycles

In this section we shall be looking at the behaviour and properties of
pressure waves in the atmosphere. There are some computer
animations to help you visualise what is happening.
ACTIVITY 6 (COMPUTER)

Run the software item for this activity.
In Activity 6 you saw that a vibrating object, a tuning fork, created
alternating regions of high and low pressure, as in Figure 4. These
alternating pressure regions travel away from the fork in all directions,
though we will concentrate on one direction.
Figure 4 Pattern of pressure variations caused by a vibrating tuning fork (labels: alternating high-pressure and low-pressure regions)
In reality, molecules are not neatly arranged in rows and columns in
the way represented in the animation and in Figure 4. These
representations are therefore simplified images of what is happening
at the molecular level. They are nevertheless a convenient way of
showing the bunching-up and spreading-out of molecules created by
the vibration of the tuning fork’s prongs.
A regular pattern of high- and low-pressure regions like this is known
as a pressure wave. Because the wave travels away from its source, it
is also known as a travelling wave, to distinguish it from another type
of wave, the standing wave, which you will meet in Block 2. Note that
not all travelling waves are pressure waves. Light, for instance, is a
travelling wave but not a pressure wave.
Among other things, a travelling wave is a way of transmitting energy.
In the case of the tuning fork, the energy imparted to the tuning fork
when it is made to vibrate is conveyed by the pressure wave into the
environment. If the tuning fork were suspended in a vacuum and had
no mechanical connection to anything else, it would not only be
inaudible but it would also continue to vibrate for much longer
because there would be nothing to transfer its energy into the
environment.
Notice that although the pressure wave travels away from the tuning
fork, the molecules do not travel away from the fork, at least not in the
long term. In the simplified representation of Figure 4, each molecule
moves regularly backwards and forwards about a fixed point. If this is
not clear to you, watch the animation again, and focus your attention
on a particular molecule. Motion such as this, which repeats itself
regularly, is known as cyclic (or cyclical) motion, or oscillatory
motion. One cycle or oscillation is one complete sequence of motion
up to the point at which the motion starts to repeat itself. (The motion
of the tuning fork’s prongs is also cyclic or oscillatory.)
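The back-and-forth motion described above can be sketched numerically. The short Python sketch below assumes an idealised sinusoidal wave; the function name and the values chosen for the period, the wavelength and the maximum displacement are hypothetical and serve only as an illustration. It follows one molecule over two cycles and shows that it is back at the same position after each complete cycle and never strays far from its fixed point, even though the wave itself keeps travelling away from the source.

```python
import math

# All numbers below are illustrative assumptions, not values from the course text.
PERIOD = 0.002            # time for one cycle, in seconds
WAVELENGTH = 0.7          # spacing of successive high-pressure regions, in metres
MAX_DISPLACEMENT = 1e-6   # furthest a molecule moves from its fixed point, in metres


def molecule_position(rest_position, time):
    """Position (metres from the fork) of one molecule in an idealised sinusoidal wave."""
    phase = 2 * math.pi * (time / PERIOD - rest_position / WAVELENGTH)
    return rest_position + MAX_DISPLACEMENT * math.sin(phase)


REST_POSITION = 1.0  # the fixed point about which this molecule oscillates, in metres

# Sample the molecule eight times per cycle over two complete cycles.
positions = [molecule_position(REST_POSITION, n * PERIOD / 8) for n in range(17)]
furthest_excursion = max(abs(p - REST_POSITION) for p in positions)

print("position at start:      ", positions[0])
print("position one cycle on:  ", positions[8])
print("position two cycles on: ", positions[16])
print("furthest excursion (m): ", furthest_excursion)
```

The three printed positions agree, which is what 'cyclic' means here: the motion repeats itself exactly after each cycle.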
As a final characterisation of the type of wave we are dealing with,
notice that the oscillations of the molecules are back and forth along a
line which is also the direction of travel of the wave. The wave is said
to be longitudinal for this reason. There are other kinds of wave,
which you will meet in Block 2, where the oscillation in the medium
is at right-angles to the direction of the wave’s travel. These are known
as transverse waves. An example can be seen in the ripples on the
surface of water: the water oscillates up and down as the wave radiates
outwards from the source.

ACTIVITY 7
Below are three true statements, 1 to 3:
1 Sound waves are pressure waves.
2 Sound waves emanating from a single source in the open (away
from buildings etc.) are travelling waves.
3 Sound waves are longitudinal waves.
Below are three explanations of the above statements, but in the wrong
order. Which explanation goes with which statement?
(a) Because the molecular oscillations are along the line of travel of the
wave.
(b) Because the pressure variations radiate outwards from their source,
conveying energy away from the source.
(c) Because they consist of cyclical changes of pressure.
2.4 Period
You saw in the computer animation in Activity 6 that the prongs of
the tuning fork vibrated cyclically. The animation explained that a
cycle of the prongs’ vibration is a complete sequence of motion up to
the point at which the motion starts to repeat itself. Another term for
this repetitive kind of motion is periodic motion. The time taken for
one cycle to occur is called the period of the vibration, and in
theoretical work it is usually represented by T, or by the Greek letter
tau, τ.
Properly speaking, in periodic or cyclical motion every cycle is
identical to every other. With a practical tuning fork, however, no two
cycles are identical. This is because each cycle is slightly weaker than
the one before, as the vibration of the prongs diminishes from the
moment the fork is struck. Nevertheless, it takes several seconds for a
tuning fork to become silent, during which time there will be
thousands of cycles of vibration. Thus over the course of a few cycles
there will be very little change from one cycle to the next and we can
regard the motion as periodic.
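A rough calculation shows how small the change from one cycle to the next is. The Python sketch below is only an illustration: the period, the time taken to fall silent and the fraction of vibration treated as 'silent' are all assumed values, and so is the simplification that every cycle is weaker than the previous one by the same factor.

```python
# Illustrative assumptions only: none of these values comes from the course text.
PERIOD = 0.0025            # seconds per cycle of the fork
TIME_TO_SILENCE = 5.0      # seconds until the fork is treated as silent
FRACTION_LEFT = 0.001      # share of the initial vibration remaining at that time

# Number of cycles completed before the fork falls silent.
cycles_before_silence = TIME_TO_SILENCE / PERIOD

# Assume each cycle is weaker than the previous one by the same factor.
factor_per_cycle = FRACTION_LEFT ** (1 / cycles_before_silence)
loss_per_cycle_percent = (1 - factor_per_cycle) * 100

print(f"cycles before silence: {cycles_before_silence:.0f}")
print(f"each cycle is weaker than the last by about {loss_per_cycle_percent:.2f}%")
```

With these assumed figures the fork completes about two thousand cycles, and each cycle is weaker than the one before by well under one per cent, so treating a few neighbouring cycles as identical is a reasonable approximation.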

ACTIVITY 8
Approximately how many cycles of the fork’s vibration were required
to create the pressure pattern shown in Figure 5?
Figure 5 Pressure variations for Activity 8
For a pressure wave created by a particular tuning fork, the distance
from one high pressure region to the next is fixed. A different tuning
fork (that is, one that vibrates at a different rate) will produce a
different separation of high- and low-pressure regions.
2.5 Wavelength
So far we have seen that sound is a pressure wave, and that the
spacing of the pressure variations is related to the period of vibration
of the source.
A graphical representation of the pressure wave from a tuning fork
closely approximates to a certain type of wave known as a sine wave,
as shown in Activity 9.

ACTIVITY 9
Run the software for this activity.
The last activity showed that if we freeze the pattern of high and low
pressure regions in the pressure wave, we have the pattern shown in
Figure 6(a), for which we can draw a graph relating pressure to
distance from the fork (Figure 6b).
Figure 6 A graph of the pressure wave produced by a tuning fork is a sine wave: (a) alternating high- and low-pressure regions; (b) graph of pressure against distance, with the wavelength marked
For the kind of pressure wave produced by a tuning fork, the
transitions from high pressure to low pressure are not sudden. This
can be seen from the graph of the pressure wave in Figure 6(b). Notice
that the peaks of the graph line up with the high-pressure regions, and
the troughs line up with the low-pressure regions. In between there is
a smooth change of pressure.
This graph of pressure variation has a very characteristic shape,
known as a sinusoidal shape. Alternatively, we can describe the graph
as being a sine wave. Not many instruments produce pure sine waves,
at least not when they are played in the normal way, so, as I said at the
start of this section, the use of sine waves as musical tones is limited
in practice. Nevertheless, sine waves do have a characteristic sound
which it is worth becoming familiar with. The following activity gives
you a chance to hear some.
ACTIVITY 10 (LISTENING, EXPLORATORY)

The audio track for this activity contains a variety of sine waves for
you to listen to. How would you describe these sounds?
Comment
Using the source-cause type of description, you might have described
some of the sine waves as flute-like, the flute being one of the few
common instruments that can produce a sine wave, or a close
approximation to one.
As far as metaphorical descriptions go, sine waves are often described
as ‘neutral’, ‘pure’ or ‘colourless’. You may disagree – particularly if
you are a flute player.
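If you would like to generate a sine-wave sound of your own to listen to, the following Python sketch shows one possible way using only the standard library. It is not the method used to make the course audio tracks; the sample rate, the period of the tone, its duration, the scaling factor and the function names are all assumptions chosen for illustration.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000   # samples per second (assumed)


def sine_samples(period, duration):
    """Return sine-wave samples between -1 and 1 for a tone with the given period (seconds)."""
    count = round(duration * SAMPLE_RATE)
    return [math.sin(2 * math.pi * (n / SAMPLE_RATE) / period) for n in range(count)]


def write_tone(target, samples, scale=0.5):
    """Write samples as 16-bit mono audio; target is a file name or a binary file object."""
    frames = b"".join(struct.pack("<h", int(scale * 32767 * s)) for s in samples)
    with wave.open(target, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(frames)


# A tone whose period is 1/440 of a second, lasting half a second (both assumed values).
samples = sine_samples(period=1 / 440, duration=0.5)

# Written into memory here; pass a file name such as "tone.wav" instead to save a file.
buffer = io.BytesIO()
write_tone(buffer, samples)
audio_bytes = buffer.getvalue()
```

Changing the `period` argument changes the sound you hear; shorter periods pack more cycles into each second.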
A full discussion of sine waves and their properties would entail quite
a lot of mathematics, which is beyond the scope of this course.
However, you may be interested to know that the oscillations of many
smoothly vibrating systems, when plotted as a graph, have the
characteristic sine-wave shape. Other examples would include the
oscillations of a mass on a spring and the swinging of a pendulum,
provided the oscillations are relatively small in each case.
Notice that in the pressure wave in Figure 6, the distance between any
two adjacent regions of high pressure (or low pressure) is the same.
This distance is called the wavelength of the sound, and is usually
represented by the Greek letter lambda, λ. In fact, the distance between
any two corresponding points of consecutive cycles is the wavelength.
For instance, in Figure 7, points A and B are one wavelength apart, as
are C and D.
Figure 7 Wavelength (graph of pressure against distance; points A, P, B, C and D are marked, and the wavelength λ is indicated)
However, although point P in Figure 7 is at the same pressure as A and
B, this pressure was generated at a different part of the cycle from that
which generated A and B. So the distance from A to P is not a
wavelength, nor is that from P to B.
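The definition of wavelength as the distance between corresponding points of consecutive cycles can be checked numerically. The Python sketch below is an illustration only: it assumes a perfectly sinusoidal pressure pattern frozen in time, and the wavelength, the sampling step and the function names are hypothetical. It locates the pressure peaks along the line of travel and measures the spacing between neighbouring peaks.

```python
import math

# Illustrative assumptions, not values from the course text.
WAVELENGTH = 0.8   # metres
STEP = 0.001       # distance between sampled points, in metres


def pressure_deviation(distance):
    """Deviation from normal pressure at a given distance, for a frozen sine wave."""
    return math.sin(2 * math.pi * distance / WAVELENGTH)


def peak_positions(total_distance):
    """Distances (metres) at which the sampled pressure reaches a local maximum."""
    n = round(total_distance / STEP)
    values = [pressure_deviation(i * STEP) for i in range(n + 1)]
    peaks = []
    for i in range(1, n):
        # A peak is a sample higher than its left neighbour and not lower than its right one.
        if values[i - 1] < values[i] >= values[i + 1]:
            peaks.append(i * STEP)
    return peaks


peaks = peak_positions(3.0)
spacings = [later - earlier for earlier, later in zip(peaks, peaks[1:])]

print("peaks found at (m):", [round(p, 3) for p in peaks])
print("spacing between neighbouring peaks (m):", [round(s, 3) for s in spacings])
```

Every spacing equals the assumed wavelength. Measuring between any other pair of corresponding points, such as successive troughs, would give the same distance.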
All sine waves that we would describe from their sound as continuous
and unchanging (like those in Activity 10) have an unchanging
wavelength. The wavelengths of audible sine waves typically range
from a few centimetres to several metres.
Although I have defined wavelength in terms of pressure variations
produced by a tuning fork, which have a characteristic sine wave
shape, the same definition applies to non-sinusoidal periodic waves,
which are the sorts of waves more frequently encountered in music.

Figure 8 gives two sinusoidal graphs of pressure variations. What are
the wavelengths of these pressure waves?
Figure 8 Sine waves for Activity 11 (two graphs, (a) and (b), of
pressure against distance in metres, with the distance axis marked at
0, 1 and 2)

Note that in the time it takes a pressure wave to travel a distance equal
to one wavelength, the source performs one complete cycle of oscillation.
To see why this is so, consider Figure 9(a). The fork is at a particular part
of its cycle, and the point X on the pressure wave is adjacent to the fork.
Figure 9 Pressure wave travels one wavelength in one cycle of oscillation
((a) pressure against distance, with the direction of travel shown;
(b) the same graph one cycle later, with points Y and X marked one
wavelength apart)
Figure 9(b) shows the situation one cycle later. The prongs of the fork
are back at the same part of the cycle as in (a), and the pressure at
point Y is at exactly the same part of the cycle of pressure variation as
X was. In the meantime, X has travelled away from the fork a distance
equal to one wavelength. Thus, in the time it has taken for the source
to go through one cycle of oscillation, the wave has travelled a distance
equal to one wavelength away from the fork. This is an important point
which I shall return to when we come to look at the speed of sound.
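The point made above, one wavelength of travel for each cycle of the source, can be turned into a small calculation. The following Python sketch is illustrative only: the function name distance_travelled and the example values (a wavelength of 0.5 metres and 10 cycles) are assumptions chosen for the example, not values from the text.

```python
def distance_travelled(wavelength_m, cycles):
    """Distance in metres a pressure wave travels while its source
    completes the given number of cycles of oscillation."""
    # Each complete cycle of the source corresponds to one wavelength of travel.
    return wavelength_m * cycles


# Hypothetical values: a 0.5 m wavelength and 10 cycles of the source.
print(distance_travelled(0.5, 10))  # prints 5.0
```

The calculation does not need the period or the speed of the wave, because the number of cycles already tells us how many wavelengths have been covered.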

A particular tuning fork generates pressure waves with a wavelength
of 1.5 metres. In the time it takes the tuning fork to perform 200 cycles
of vibration, how far does the pressure wave travel?
2.6 Pressure variations in one place

So far, when we have been thinking about pressure waves we have
visualised a pattern of pressure variations extending through space,
and travelling away from the source of the vibration.
I now want to consider how the pressure variations change at one
particular place in the vicinity of the tuning fork as time passes. You
could think of this as examining how the pressure at your eardrum
varies from moment to moment as you listen to a tuning fork’s sound,
or how the pressure changes at any fixed point in the vicinity of a
source or instrument.

Now run the software for this activity. The commentary in this item
refers to ‘frequency’. We shall be looking at this shortly, so do not
worry if you cannot follow this part of the commentary.
In Activity 13 you saw a graph with a familiar sinusoidal shape,
reproduced in Figure 10(b).
Figure 10 (a) Pressure wave, with one point indicated by arrows. (b) Pressure
variations at the point indicated in (a) as time passes (one period is
marked at two places on the time axis)
It is important to appreciate the distinction between the graph in
Figure 10(b) and the one shown earlier in Figure 6(b). In Figure 6(b), it
was as though we could survey the whole of the region in which the
tuning fork was audible, and at a particular instant see where the high-
pressure regions were and where the low-pressure regions were. In
Figure 10 we are focusing our attention on one region of space, such as
that between the pair of arrows in Figure 10(a), and observing what
happens as time passes rather than at one instant of time. The graph in
Figure 10(b) shows the variations of pressure in this region as the
wave goes by. It has the familiar sinusoidal shape, but the pressure is
shown as a variation with time instead of with distance. (Notice that
the horizontal axis now carries the label ‘time’.)
Just as the prongs of the fork cyclically go backwards and forwards, so
the pressure at any particular point, such as that indicated by the
arrows, cyclically rises and falls. In other words, the pressure
variations at any point are periodic, just as the motion of the fork’s
prongs is periodic. In the time it takes for the prongs to complete one
cycle of movement there is one complete cycle of pressure variation
(from high to low and back to high again, for instance). Hence, the
period of one complete cycle of pressure variation is the same as the
period of the tuning fork.
The duration of a cycle (the period) is the time interval between any
two corresponding points on consecutive cycles of the pressure wave.
Figure 10(b) shows the period marked at two different places on the
graph, but there are infinitely many places from which to measure the
period of the oscillation.

What are the periods of the pressure variations represented by the
graphs in Figure 11?
Figure 11 Pressure graphs for Activity 14 (two graphs of pressure
against time in seconds: (a) with the time axis marked at 0, 1, 2 and 3;
(b) with the time axis marked at 0, 0.1 and 0.2)
The period of the waves encountered in music is generally very short.
For instance, a typical value might be 0.001 s, or a thousandth of a
second. When dealing with such short times as these, it is often more
convenient to use the millisecond as the unit of time. One millisecond
(1 ms) is a thousandth of a second.

Pressure in the air is related to how closely packed the molecules are.
Other things being equal, more closely packed molecules are at a
higher pressure than more dispersed molecules. Sound is associated
with fluctuations of the air pressure caused by local disturbance.
Fluctuations of pressure travel outwards away from the disturbance,
carrying energy imparted by the disturbance.
A simple form of local disturbance to air pressure is a vibrating tuning
fork. It generates a pressure wave, consisting of alternating regions of
high and low pressure which travel away from the fork. (The
molecules in the air oscillate longitudinally, but do not themselves
travel away from the source.) Because the pressure wave radiates away
from the source, it is known as a travelling wave.
One cycle of oscillation of the tuning fork has a characteristic time
(depending on how quickly the prongs oscillate) known as the period
of oscillation, symbol T or τ. In a single cycle of the fork’s oscillation,
one complete high-pressure region and one complete low-pressure
region in the pressure wave are produced.
For a pressure wave produced by a tuning fork, a graph of pressure
plotted against distance is a sine wave. Sounds with a sinusoidal
pressure wave are considered to be neutral or flute-like. A sinusoidal
pressure wave has a characteristic distance, known as the wavelength
(symbol λ), between adjacent regions of high pressure (or low
pressure) in consecutive cycles. The size of the wavelength is
determined by the period of oscillation of the source: a quickly
vibrating source produces a shorter wavelength than a more slowly
vibrating source. The wavelength is also the distance the pressure
wave travels in the time it takes the source to complete one cycle.
If we monitor the pressure at a fixed point in the vicinity of a
sinusoidally oscillating source and plot the results as a graph of
pressure against time, the result is again a sine wave. The period of the
pressure variations is the same as that of the source.
3 FREQUENCY
3.1 Frequency and period
In Figure 11 you saw that waveform (b) had a much shorter period
than waveform (a). Hence waveform (b) completes more cycles of
oscillation in a second than does waveform (a). Waveform (b) is said to
have a higher frequency than waveform (a). The frequency of an
oscillation (usually represented by the symbol f ) is the number of
cycles there are in a second. This may or may not be a whole number.
For instance, a certain wave might have 25.5 cycles in a second;
another might have exactly 100. In either case, though, the number
quoted is acceptable as a frequency.
The unit of frequency used to be ‘cycles per second’, which had the
merit of being self-explanatory. Nowadays it is given the
internationally agreed unit hertz (symbol Hz), named after the German
physicist Heinrich Hertz (1857–94). A typical tuning fork might have a
frequency of oscillation of 440 Hz (or 440 hertz), meaning that the
prongs perform 440 oscillations every second. For high frequencies the
kilohertz is often used as a unit of frequency. One kilohertz (1 kHz) is a
thousand hertz. (Larger units than the kilohertz are not required for
sound, although they may be used in connection with equipment used
in sound technology.)
The frequency of an oscillation is directly related to the period of the
oscillation. Suppose a source of pressure waves vibrates at the rate of
100 Hz, that is, 100 cycles per second. It is fairly clear that each cycle
must last for one-hundredth of a second. Mathematically we express
this relationship as:
frequency = 1/period or period = 1/frequency
In symbols, f = 1/T or T = 1/f
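The reciprocal relationship between frequency and period can be expressed as a pair of one-line conversions. This is a minimal Python sketch; the function names are assumptions made for the example. It uses the 100 Hz source mentioned above and the typical period of 0.001 s mentioned earlier.

```python
def frequency_from_period(period_s):
    """Frequency in hertz for a period given in seconds (f = 1/T)."""
    return 1.0 / period_s


def period_from_frequency(frequency_hz):
    """Period in seconds for a frequency given in hertz (T = 1/f)."""
    return 1.0 / frequency_hz


# A 100 Hz source: each cycle lasts one-hundredth of a second.
print(period_from_frequency(100.0))

# A period of 0.001 s (1 ms) corresponds to 1000 Hz, that is, 1 kHz.
print(frequency_from_period(0.001))
```

Applying one conversion and then the other returns the starting value, which is another way of saying that each quantity is the reciprocal of the other.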

Figure 11 showed sine waves with periods of 1 second and 0.02
second. What are the frequencies of these waves?

The number of cycles of oscillation per second, both for a vibrating
source and a pressure wave, is known as the frequency, symbol f.
Frequency is specified in hertz (symbol Hz) or kilohertz (kHz). One
hertz is one cycle per second; one kilohertz is one thousand cycles per
second. Frequency and period are directly related. Frequency is the
reciprocal of period:
f = 1/T or T = 1/f
4 THE SPEED OF SOUND

4.1 The experimental result
One way to establish the speed of sound is to measure it
experimentally. That is, one measures how long the sound takes to
travel a known distance, and from this works out the speed. The
answer turns out to depend somewhat on the prevailing temperature
and humidity. At an air temperature of 14°C the speed is 340 metres
per second and at about 22.5°C it is 345 metres per second. That is a
change of speed of less than 1.5% for an appreciable change of
temperature. To a reasonable approximation, therefore, we can regard
the speed of sound in air as constant, and a value of 340 metres per
second is a good general-purpose approximation to use. You do not
need to memorise this number.
Although the speed of sound is fast by everyday standards, it is far
from instantaneous. Light, for instance, travels much faster, which is
why a flash of distant lightning is always seen before the arrival of the
associated thunderclap.
For performers in large spaces the finite speed of sound can have
practical implications. For instance, in a cathedral or large church, the
reflected sound can arrive back at the performer after a perceptible
delay, which can be confusing for the performers. Also, for spatially
distributed groups of performers in antiphonal music (see Box 2),
there can be synchronisation problems.

Suppose two groups A and B of performers are 34 metres apart. Group
B synchronises itself to the sounds it hears from Group A.
To the members of Group A, does Group B appear to be synchronised
with Group A? If not, what is the apparent discrepancy (in seconds)?
Take the speed of sound to be 340 metres per second.
A delay of 0.2 seconds, as in the last activity, is not insignificant, as
you can hear in the following activity.

In the audio track for this activity, the audio begins on one of the
stereo channels. After a few seconds, you will hear a duplicated
version of the audio enter on the other channel, but this version is
delayed by 0.2 seconds relative to the first.
One way to bring two spatially separated groups of performers into
synchrony is to have them take their cue from a conductor placed
mid-way between them. To the conductor and to listeners mid-way
between the groups, the performers will be synchronised. To each of
groups A and B, however, the other group appears to lag.
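The size of these lags follows directly from the distance involved and the speed of sound. The Python sketch below is illustrative: the function names and the 17 metre separation are assumptions made for the example, and the conductor's visual cue is treated as arriving instantaneously, on the grounds that light travels much faster than sound.

```python
SPEED_OF_SOUND = 340.0  # metres per second, the approximation used in the text


def travel_time(distance_m, speed=SPEED_OF_SOUND):
    """Time in seconds for sound to cover distance_m metres."""
    return distance_m / speed


def lag_following_by_ear(separation_m):
    """Lag one group perceives when the other group plays in time with
    what it hears, rather than with a visual cue."""
    # The sound must cross the gap once to be heard and once more to return.
    return 2.0 * travel_time(separation_m)


def lag_with_central_conductor(separation_m):
    """Lag each group perceives in the other when both follow a conductor
    placed mid-way between them (visual cue assumed instantaneous)."""
    # Both groups play together, so each hears the other after one crossing.
    return travel_time(separation_m)


separation = 17.0  # metres, a hypothetical value
print(round(lag_following_by_ear(separation), 3))        # prints 0.1
print(round(lag_with_central_conductor(separation), 3))  # prints 0.05
```

On these assumptions the conductor halves the lag that each group perceives, but does not remove it.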
BOX 2 Antiphonal music and the speed of sound
Figure 12 Interior of St Mark’s Cathedral, Venice
At St Mark’s Cathedral in Venice (Figure 12) there developed a tradition of performance
with groups of performers located in different galleries of the building. Compositions by
Andrea Gabrieli (1532/3–1585) and his nephew Giovanni Gabrieli (1554–1612) were
specially written to exploit these spatial effects. This type of composition, where
performers are grouped in spatially distributed ensembles, is known as antiphonal. The
problem of synchronisation suggests that the performers in this kind of music may not
have been as widely separated as has often been claimed, as the following indicates:
‘The amount of spatial separation between the choirs of instruments and voices used
by composers such as Giovanni Gabrieli has often been overstated. Vocal polychoral
pieces a due cori [for two choirs] were generally performed with no spatial
separation of the performing forces, but with a division between soloists and ripieno
[the rest of the performers]; the total number of singers could be as few as 12. Some
of the most extravagant late 16th-century performances saw one group in each of the
organ lofts, situated on either side of the altar, and a third group on a specially built
temporary stage on the main floor of the church, not far from the main altar.’
New Grove Dictionary of Music and Musicians, second edition, Macmillan, 2001, ‘Venice’.

You can hear a sample of the music of Giovanni Gabrieli in the audio
track for this activity. It is his Canzona no.13 from the Sacrae
Symphoniae of 1597.
4.2 Frequency, wavelength and the speed of sound

The speed of sound is related to both the wavelength and the
frequency of the sound. To see why, recall that at the end of
Section 2.5, in connection with the wave produced by a tuning fork,
I said ‘in the time it takes for the source to go through one cycle of
oscillation, the wave travels a distance equal to one wavelength....’
The time taken for the source to perform one cycle of oscillation is its
period T. So, in one period of oscillation, the wave travels a distance
λ. To determine the speed of the wave, we need to know how far it
travels in a second, rather than in one period. In a second there are
f cycles of oscillation, where f is the frequency, so in one second the
wave travels f times as far as it travels during just a single cycle of
oscillation. Thus,
speed = frequency × wavelength
Or, if we let the speed be represented by v:
v = f × λ
The equation above can be restated in two other ways:
frequency = speed/wavelength or f = v/λ
wavelength = speed/frequency or λ = v/f
Note that when using these equations in calculations, it is necessary to
specify wavelengths in metres and frequencies in hertz (rather than
kilohertz or megahertz) in order to be consistent with a speed
expressed in metres per second.

A tuning fork has a frequency of 384 Hz. Is the wavelength of the
sound it produces greater than 1 metre or less than 1 metre? Take the
speed of sound to be 340 metres per second. No calculator needed.
The relationship v = f × λ might seem to suggest that the speed of a
sound is dependent in some way on its frequency or its wavelength.
But experiments show the speed to be constant (more or less). The
correct interpretation of this relationship is that frequency and
wavelength are inversely proportional to each other. This is what we
would expect, because a tuning fork that vibrates more quickly creates
high-pressure regions (and low-pressure regions) at a greater rate. The
speed at which the pressure wave travels is fixed, so the regions of
high (or low) pressure must pack more closely together. The equation
v = f × λ shows that if one of f or λ is doubled, the other must halve so
that the product f × λ remains unchanged. If one is tripled, the other
must reduce to a third of its former value. And so on.
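The inverse proportionality can be seen by working out a few wavelengths at the fixed speed of 340 metres per second. This Python sketch is illustrative only; the function names and the example frequencies are assumptions chosen for the example.

```python
SPEED_OF_SOUND = 340.0  # metres per second, the approximation used in the text


def wavelength_m(frequency_hz, speed=SPEED_OF_SOUND):
    """Wavelength in metres for a frequency in hertz (wavelength = v/f)."""
    return speed / frequency_hz


def khz_to_hz(frequency_khz):
    """Convert kilohertz to hertz so that the units match metres per second."""
    return frequency_khz * 1000.0


# Each doubling of the frequency halves the wavelength.
for frequency in (100.0, 200.0, 400.0):
    print(frequency, "Hz:", round(wavelength_m(frequency), 3), "m")

# A frequency quoted in kilohertz must be converted to hertz first.
print(round(wavelength_m(khz_to_hz(1.0)), 3), "m")
```

In every case the product of frequency and wavelength comes back to the same speed.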

The speed of sound in air, symbol v, is approximately constant at
340 metres per second. (You do not need to memorise this value.)
As temperature increases, the speed increases slightly.
Speed, frequency and wavelength are related by the formula v = f × λ.
Other forms of this relationship are f = v/λ and λ = v/f. Because the
speed is approximately constant, it follows that frequency and
wavelength are inversely proportional: doubling the frequency halves
the wavelength, etc.
5 PHASE
5.1 Phase and phase difference
In this section I am considering sine waves which have the same
frequency, but are out of step with each other. The following activity
shows one way in which this can arise.

Run the software for this activity.
Figure 13 shows how the pressure varies with time near the tuning
fork (Figure 13a) and at a distance (Figure 13b).
Figure 13 Pressure variations (a) near the tuning fork and (b) at a
distance (two graphs of pressure against time, with the time axis
marked at 0, 1 ms and 2 ms)
Although the two graphs have the same frequency, they are not in step.
At any given moment, each is at a different part of its cycle. For
example, if you look at the 1 ms point on each graph you can see that
the curves are at different parts of a cycle.
We use the word phase to refer to the part of a cycle which a particular
vibrating system is in at any moment. In practice we are often less
concerned with the phase of a single wave than with the phase
difference between two (or more) waves having the same frequency.
(The stipulation ‘having the same frequency’ is necessary because we
cannot really speak of a fixed phase difference between sine waves
with different frequencies.) Another way of saying that there is a phase
difference between two sine waves is to say that they are out of phase.
When the waves have the same phase, they are said to be in phase.
In Figure 13, on each graph’s horizontal axis events to the right are
happening later than events to the left. Thus, because the first peak in (b)
is to the right of the first peak in (a), we say that the pressure variations
in (b) are lagging in phase behind those in (a). Alternatively, we can
say that the pressure variations in (a) are leading in phase those in (b).

By how many milliseconds does Figure 13(b) lag behind Figure 13(a)?
In the last activity there was a phase difference of 0.2 ms, but phase
differences are not always expressed in units of time. Instead, the
following two methods are commonly used to express a phase
difference quantitatively:
(a) as a fraction of a cycle
(b) as an angle.
The first of these is fairly straightforward, as the following activity
demonstrates.

In Figure 13, by what fraction of a cycle does (b) lag (a)?
Expressing a phase difference as an angle depends on the fact that in a
periodic wave, every cycle is identical to every other, and we can
regard one cycle as being like a complete rotation round a circle: 360
degrees. After one cycle, we are back at the part of the cycle where we
began. After half a cycle we are half-way to the part of the cycle where
we began. This connection between sine waves and rotation is
probably easier to appreciate through an animation.

The software for this activity shows how a sine wave can be traced out
by a rotating rod, pivoted at one end (like the spoke of a wheel). You
will see that one complete rotation of the rod creates a single cycle of
the sine wave.
The phase difference in Figure 13 was calculated in Activity 22 to be a
fifth of a cycle. To express this as a phase difference in degrees we
simply calculate a fifth of 360 degrees, because one cycle corresponds
to 360 degrees. The answer is 72 degrees.
If the phase lag were increased continuously beyond 72 degrees, it
would eventually reach 360 degrees, which would bring the two sine
waves back into phase again. If the phase difference continued to
increase, the waves would next be in phase at 720 degrees, and so on.
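The conversions between a time lag, a fraction of a cycle and an angle can be set out as a short calculation. The Python sketch below is illustrative; the function names are assumptions made for the example. It uses the 0.2 ms lag from the text together with a period of 1 ms, which is the period implied by that lag being a fifth of a cycle.

```python
def phase_fraction(time_lag, period):
    """Phase difference as a fraction of a cycle.
    Both arguments must be in the same unit of time."""
    return time_lag / period


def phase_degrees(time_lag, period):
    """Phase difference in degrees, taking one cycle as 360 degrees."""
    return 360.0 * phase_fraction(time_lag, period)


def wrapped_degrees(degrees):
    """Reduce a phase difference to less than one full rotation.
    Whole multiples of 360 degrees map to 0, that is, back in phase."""
    return degrees % 360.0


# A lag of 0.2 ms with a period of 1 ms (both in milliseconds).
print(round(phase_fraction(0.2, 1.0), 6))  # prints 0.2
print(round(phase_degrees(0.2, 1.0), 6))   # prints 72.0

# Phase differences of 360 and 720 degrees both bring the waves into phase.
print(wrapped_degrees(360.0), wrapped_degrees(720.0))  # prints 0.0 0.0
```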
5.2 Cancellation and reinforcement

I have shown that a phase difference between two points in space
arises as a natural consequence of the finite time it takes a pressure
wave to travel between two points in space. This is not the only way in
which a phase difference can arise. A phase difference can arise
between two sine waves if one is delayed relative to the other. Also,
almost any form of electronic sound-processing equipment affects the
phase of the signal it is processing, so that what comes out is not in
phase with what goes in. This applies to common pieces of equipment
such as amplifiers, filters, mixing desks, and so on, as well as
recording equipment and effects units. (You will read more about these
in Block 3.) The extent to which such shifts of phase are audible is
contentious, but experiments suggest that a varying phase shift can be
audible, whereas an unchanging one is inaudible.
One of the reasons for being interested in phase arises from the
consequences of mixing, or adding, two sine waves that are phase-
shifted relative to each other. Figure 14 shows two sine waves that are
completely out of phase.
Figure 14 Out-of-phase sine waves (two graphs, (a) and (b), plotted
against time)

What is the phase difference between these waves in degrees?
The consequence of adding or mixing two sine waves that are 180° out
of phase is complete cancellation of one wave by the other. This is the
basis of a noise-reduction technique sometimes used in noisy
environments: an out-of-phase version of the noise is played through
amplifiers and loudspeakers into the noisy environment, causing
cancellation and thus elimination of the noise. (Incidentally, the term
‘out of phase’ is used here to mean ‘180° out of phase’ rather than just
‘not in phase’.)
When two sine waves are in phase, there is mutual reinforcement. For
instance, in Figure 15 sine waves (a) and (b) are in phase. When they
are added or mixed the result is (c). Note that (c) is a sine wave with
the same period (and hence same frequency) as (a) and (b).
Figure 15 In-phase sine waves, (a) and (b), produce maximum
reinforcement when added, (c) (three graphs plotted against time)
Intermediate amounts of phase shift, between completely in phase and
completely out of phase, produce intermediate amounts of
cancellation or adding. However, the result is always a sine wave with
the same frequency as the waves being combined. The term
interference is sometimes used to describe an interaction between two
or more sine waves leading to reinforcement or cancellation.
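The way reinforcement shades into cancellation can be checked numerically. The Python sketch below is illustrative only and goes slightly beyond the text: it assumes two sine waves of equal amplitude and equal frequency, and uses the standard trigonometric result that their sum has amplitude 2 × amplitude × cos(half the phase difference). The function names are assumptions made for the example. The second function adds the two waves sample by sample over one cycle and finds the largest value, as a check on the formula.

```python
import math


def summed_amplitude(amplitude, phase_deg):
    """Amplitude of the sum of two sine waves of equal amplitude and
    frequency, separated by phase_deg degrees (standard identity)."""
    return 2.0 * amplitude * abs(math.cos(math.radians(phase_deg) / 2.0))


def sampled_peak(amplitude, phase_deg, samples=3600):
    """Largest magnitude of the sum, found by adding the two waves
    sample by sample over one complete cycle."""
    phase = math.radians(phase_deg)
    peak = 0.0
    for n in range(samples):
        angle = 2.0 * math.pi * n / samples
        total = amplitude * math.sin(angle) + amplitude * math.sin(angle - phase)
        peak = max(peak, abs(total))
    return peak


for phase_deg in (0.0, 72.0, 90.0, 180.0):
    print(phase_deg,
          round(summed_amplitude(1.0, phase_deg), 3),
          round(sampled_peak(1.0, phase_deg), 3))
```

At 0 degrees the combined amplitude is double that of either wave, at 180 degrees it is zero, and in between it takes intermediate values.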
Reinforcement and cancellation of musical sound through phase
shifting is exploited in the effect known as flanging (or sometimes
phasing). In the original version of this effect, first used in the 1960s, a
piece of music was mixed with a slightly delayed version of itself.
This led to selective cancelling and reinforcement throughout the
spectrum of frequencies in the music. (There will be more on this idea
of a frequency spectrum in Chapter 2.) If the amount of delay is varied
the result is a very characteristic sound used in popular music.
Nowadays the effect can be created electronically, rather than by
mixing a delayed version with the original sound.
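To see why a fixed delay cancels some frequencies and reinforces others, note that a given delay corresponds to a different phase shift at each frequency: a delay that is half a period at one frequency is a whole period at twice that frequency. The Python sketch below is illustrative only; the function names and the 1 ms delay are assumptions made for the example, not details of any particular flanging unit.

```python
def phase_shift_degrees(frequency_hz, delay_ms):
    """Phase shift in degrees that a delay of delay_ms milliseconds
    represents for a sine wave of the given frequency."""
    return 360.0 * frequency_hz * delay_ms / 1000.0


def cancelled_frequencies(delay_ms, max_hz):
    """Frequencies up to max_hz at which the delayed copy is 180 degrees
    (or an odd multiple of 180 degrees) out of phase with the original."""
    frequencies = []
    k = 0
    while True:
        frequency = (2 * k + 1) * 1000.0 / (2.0 * delay_ms)
        if frequency > max_hz:
            break
        frequencies.append(frequency)
        k += 1
    return frequencies


# Hypothetical delay of 1 ms.
print(phase_shift_degrees(500.0, 1.0))     # prints 180.0
print(cancelled_frequencies(1.0, 3000.0))  # prints [500.0, 1500.0, 2500.0]
```

Changing the delay moves the whole set of cancelled frequencies, which is consistent with the sweeping character of the effect when the delay is varied.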

The audio track for this activity illustrates the effect of flanging (or
phasing).
Other instances of the effects of phase shifting in sound technology are
too numerous to list, but one that you are sure to have heard is the
ear-splitting whistling produced by (for instance) public address systems
when the volume is too high and the microphone picks up sound from
the loudspeaker. Here, at one particular frequency, there is an exact
phase shift of 360 degrees between the microphone and sounds
returning to the microphone from the loudspeaker. There is thus
reinforcement, and under the right conditions a sine wave with a
continually growing amplitude is produced at that particular
frequency.

The term phase is used to refer to the part of a cycle which an
oscillating system is in at a particular moment. For two sine waves of
the same frequency that are not in step, one wave lags or leads the
other in time. We can express the amount by which they are out of
step as a phase difference. Usually phase difference is expressed as a
fraction of a cycle or as a certain number of degrees (one complete
cycle corresponding to 360°).
If two (or more) sine waves are completely out of phase (a phase
difference of 180° or odd multiples thereof), there is complete
cancellation when the waves are combined by adding. If the sine
waves are completely in phase (a phase difference of 0°, 360° or
multiples thereof), there is complete reinforcement. When the phase
difference lies between these extremes, there is partial reinforcement
or partial cancellation. The result, however, is always a sine wave of
the same frequency as the ones being combined.
6 AMPLITUDE
6.1 Defining amplitude
Another important property of a sine wave we need to be able to
specify is its amplitude. In essence, the amplitude of a sine wave is its
size. Unfortunately there are various ways of defining what is meant by
the size of a sine wave, and you are likely to come across many of them
in material you look at outside the course. Before I explain what our
definition is, it will help matters if we look at what is meant by the
average value of a sine wave.
Figure 16 shows a sinusoidally alternating voltage. The curve is
symmetrical around the time axis, which is also the line of zero
voltage. The average value of this sine wave over many cycles is
therefore zero.
Figure 16 Average value of a sine wave is midway between the peaks
and troughs (graph of voltage against time, with a peak of 10 V and the
average value marked at 0)
All sine waves are symmetrical around a line running through the
middle, midway between the peaks and troughs of the wave. However,
it does not follow that all sine waves have an average value of zero.
Look at Figure 17. This is an exaggerated graph of a sinusoidal pressure
variation in the air.
Figure 17 The average value here is not zero (graph of pressure
against time, with the peak value Pmax marked and the average value
marked at atmospheric pressure Pa)

Once again, the average value over many cycles runs midway between
the peaks and troughs, but now its value is not zero. The average value
is the prevailing atmospheric pressure.
A standard way of defining the amplitude of a sine wave is in terms of
its maximum departure from its average value. To see what this means,
look at Figure 18.
Figure 18 Amplitude for a sine wave of average value zero (graph of
voltage against time, with a peak of 10 V; the amplitude, the
peak-to-peak amplitude and the quantity a are marked)
The amplitude is the height of the peak relative to the average value. In
this case, because the average value is zero, the amplitude is just the
peak value of the sine wave, namely 10 volts. However, in Figure 19
the amplitude is not simply the peak value.
Figure 19 Amplitude for a sine wave of non-zero average value (graph
of pressure against time, with Pmax and atmospheric pressure Pa marked;
the amplitude, the peak-to-peak amplitude and the quantity a are marked)
Here, because the sine wave does not have an average value of zero,
the amplitude is the difference between the peak value and the
average value, that is, Pmax – Pa.
In general, the amplitude of a sine wave is the maximum deviation of
the sine wave from its average value. In other words, it is the
difference between its peak value and its average value. Another way
to express this is half the height of the wave from peak to peak.
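These definitions can be applied directly to a set of sampled values. The Python sketch below is illustrative only: the function name describe_sine and the example wave (an average value of 5 and an amplitude of 2, in arbitrary units) are assumptions made for the example.

```python
import math


def describe_sine(samples):
    """Average value, amplitude and peak-to-peak amplitude of a sampled
    sine wave, using the definitions given in the text."""
    peak = max(samples)
    trough = min(samples)
    average = (peak + trough) / 2.0  # midway between peak and trough
    return {
        "average": average,
        "amplitude": peak - average,     # maximum departure from the average
        "peak_to_peak": peak - trough,   # twice the amplitude
    }


# Hypothetical wave: one cycle, average value 5, amplitude 2.
samples = [5.0 + 2.0 * math.sin(2.0 * math.pi * n / 360) for n in range(360)]
result = describe_sine(samples)
print({name: round(value, 6) for name, value in result.items()})
```

Note that the amplitude here is 2, not the peak value of 7, because the wave does not have an average value of zero.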
Sometimes the peak-to-peak height itself is used as a measure of a sine
wave’s size, because it is easy to read from graphical displays. Such a
reading is invariably referred to as the peak-to-peak amplitude, and is
twice the amplitude as defined above. It is marked on Figures 18 and 19.
Some authors define the word ‘amplitude’ rather differently from the
definition given above. For them, the moment-by-moment deviation of a sine
wave from its average value is its amplitude. In this usage, the
amplitude is constantly changing, and can only be specified at a
particular moment. In Figures 18 and 19, a marks an amplitude
according to this particular usage. Although we shall not be adopting
this usage in this course, you can expect to encounter it if you look at
other books on the subject.
Earlier in this chapter, in Figures 6 and 7, you saw graphs representing
sound waves travelling away from a source. The amplitudes of the sine
waves were constant. In reality the amplitude of a sound wave must
decrease with distance from the source. Figure 20 shows how
amplitude can decay with distance from a source.
Figure 20 Decaying amplitude (graph of pressure against distance)
6.2 Practical units of amplitude

The amplitude of a sine wave is measured in whatever units are used
to calibrate the vertical axis, as you saw in connection with Figures 16
and 17. Nearly all the graphs you have seen so far in this chapter have
had pressure plotted up this axis. One of the scientific units of
pressure is the pascal (symbol Pa), but this is a large unit. A more
appropriate unit in relation to sound pressure variations is the
micropascal (symbol µPa), which is a millionth of a pascal. Another
unit often used in connection with sound is the decibel. This is a very
different kind of unit from the pascal, and merits a section to itself. I
will return to it at the end of the chapter.
Because in music technology we are often dealing with electrical
representations of sound, the volt is another common unit for
amplitude. As electrical signals can be quite small, the millivolt
(symbol mV), which is a thousandth of a volt, and the microvolt (µV),
which is a millionth of a volt, are often used.

What are the amplitudes of the sine waves in Figure 21?
Figure 21 Sine waves for Activity 26 ((a) voltage against time in
seconds, with the vertical axis marked from –0.4 to 0.4 and the time
axis from 0.1 to 0.4; (b) voltage against distance in metres, with the
vertical axis marked from –6 to 6 and the distance axis from –8 to 10)
6.3 Root-mean-square amplitude

One drawback of the amplitude as I have defined it is that, although it
allows the relative sizes of sine waves to be compared, it does not give
a good idea of what a sine wave can deliver in absolute terms. For
instance, a sine wave with an amplitude of 10 volts has twice the
amplitude of one with an amplitude of 5 volts. But is a power source
that delivers a sine wave with an amplitude of 10 volts as powerful as,
say, a 10 volt battery? Could you use it to drive a bulb and get the
same illumination? The answer is ‘no’, and the discrepancy is due to
the fluctuating nature of the sine wave, which for most of each cycle is
well below its maximum value. It is thus often useful to specify the
magnitude of a sine wave in a way that facilitates direct comparison
with a non-oscillatory source of energy. One benefit of this is that it
enables us to say how big a non-oscillatory source would be needed to
deliver the same energy as the sine wave delivers in a particular length
of time.
A tempting solution might simply be to use the average value of the
sine wave over several cycles, but this turns out not to be useful. As
we have seen from the last section, the average value of the sine wave
in Figure 16 is zero, but this sine wave would certainly be capable of
transferring energy. The solution to the problem is found in the
concept of the root-mean-square (r.m.s.) amplitude of a sine wave,
which is mainly used in the context of electrical sine waves. You can
think of this as an alternative way of specifying how big a sine wave
is, but with the advantage of allowing direct comparison with a
non-oscillating source of energy. The root-mean-square is a kind of average,
but it is derived by calculating the average power of a sine wave.
The root-mean-square amplitude of a sine wave is its amplitude
multiplied by a factor of approximately 0.71. (The actual value is 1/√2,
which, to five decimal places, is 0.70711.) Thus, a sine wave with an
amplitude of 10 volts has an r.m.s. amplitude of (approximately)
0.71 × 10 volts, which is 7.1 volts. It therefore conveys energy at the
same rate as a steady 7.1 volt source, other things being equal. The
r.m.s. amplitude of a sine wave is proportional to the amplitude as I
defined it earlier. Doubling one doubles the other; tripling one triples
the other; and so on.
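The factor of approximately 0.71 can be checked numerically. The Python sketch below is illustrative only. It assumes the usual meaning of the name root-mean-square, which the text does not spell out: square each value, take the mean of the squares over a whole cycle, then take the square root. The function name rms and the choice of 1000 samples are assumptions made for the example; the 10 volt amplitude is the value used above.

```python
import math


def rms(samples):
    """Root-mean-square of a list of values: the square root of the
    mean of the squared values."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


amplitude = 10.0   # volts, as in the example in the text
n_samples = 1000   # samples spread evenly over exactly one cycle
one_cycle = [amplitude * math.sin(2.0 * math.pi * n / n_samples)
             for n in range(n_samples)]

print(round(rms(one_cycle), 2))              # prints 7.07
print(round(amplitude / math.sqrt(2.0), 2))  # prints 7.07
```

The two results agree, and both are close to the 7.1 volts obtained above with the rounded factor of 0.71.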

A sine wave has a peak-to-peak amplitude of 2 volts.
(a) What is its amplitude?
(b) What is its r.m.s. amplitude?
(c) What steady source of voltage (e.g. a battery) would deliver energy
at the same rate, other things being equal?

Amplitude refers to the size of a sine wave. It can be defined in
various ways, but a standard definition is that it is the maximum value
of a wave’s departure from its average value. (The average value of a
sine wave lies midway between its peaks and troughs.) The size of a
sine wave is sometimes also expressed as a peak-to-peak amplitude,
which is the vertical distance from peak to trough.
Root-mean-square (r.m.s.) amplitude is a way of specifying the size of
a sine wave so that comparisons can more easily be made with steady,
non-oscillating sources. The r.m.s. amplitude is the amplitude defined
above multiplied by approximately 0.71 (you do not need to memorise
this figure). A steady source equal in value to the r.m.s. amplitude of
an oscillating source supplies energy at the same rate as the oscillatory
source, other things being equal. Root-mean-square amplitudes are
often met in electrical contexts.
7 PITCH AND LOUDNESS

7.1 The subjective experience
Two of the properties of sound that we have examined from an
objective stance, frequency and amplitude, have a fundamental
importance to our appreciation of sound and music. In this section I
want to look more closely at the subjective interpretation of these two
properties of sound. I should stress that I am talking about sine-wave
sounds in this section. The complex, non-sinusoidal sounds
encountered in music add extra layers of complexity to the
relationships I am discussing here.

Run the software item for this activity. This allows you to change
the amplitude and frequency of a sine wave and to hear the result.
(a) Keeping the frequency constant, vary the amplitude setting a few
times. How would you describe the effect of changing the
amplitude?
(b) Now keep the amplitude constant, but vary the frequency setting a
few times and listen to the result. How would you describe the
effect of increasing the frequency and decreasing it?
Comment
(a) I expect you found that changing the amplitude changed the
loudness, and the bigger the amplitude, the louder the sound.
(b) The effect of changing the frequency is rather harder to pin
down. If you are familiar with musical concepts you probably
said that changing the frequency changed the pitch. If you are
not familiar with ‘pitch’ as a musical term, then you might
have said the sound became higher and lower, and you could
regard this activity as supplying an illustration of what we
mean by pitch. Q
To summarise the message from the last activity:

Loudness is the subjective property of sound that is heard to change
when amplitude is changed while the frequency is held constant.
Pitch is the subjective property of sound that is heard to change
when the frequency is changed while the amplitude is held
constant.
Generally, pitch is felt to exist on a continuum from low to high.
Low pitches are associated with low frequencies, and high pitches
are associated with high frequencies. However, although pitch is
experienced as being on a continuum, for musical purposes a series
of more-or-less fixed points on the continuum is usually defined.
These are the pitches to which we give letter names, such as A, B
flat, B, and so on.
These apparently simple correspondences between amplitude and

loudness, and between frequency and pitch, are complicated by a
number of factors. One of these is the uneven response of the hearing
system. The human ear is not equally sensitive to all frequencies
within the range of human hearing, being most sensitive at around
4 kHz. Changing the frequency of a sine wave, particularly over a wide
range, while holding the amplitude constant, can result in a change of
loudness. It is also true that changing the amplitude of a sound while
holding the frequency constant can result in a slight shift of pitch.
Nevertheless, it is broadly true that we experience changes of
amplitude as changes of loudness, and changes of frequency as
changes of pitch.
Another factor that complicates the relationship between the objective
and subjective properties of sound is the way the ear judges changes of
amplitude. This is something I will return to when I discuss decibels,
but for the moment I would like you to try another experiment.

Run the software item for this activity. Start with the amplitude at 1
and listen to the sound.
(a) Now reduce the amplitude to 0.5 and listen again. Would you say
the sound was half as loud as when the amplitude was 1?
(b) If you did not think that halving the amplitude halved the
loudness, what reduction of amplitude seems to you to
correspond to a halving of loudness?
Comment
This activity is concerned with the subjective interpretation of
amplitude, so there is no hard-and-fast answer that everyone is sure to
agree with. In fact, you may have found the concept of halving the
loudness of a sound to be almost meaningless. Nevertheless, most
people who do this experiment find that halving the amplitude does
not halve the loudness. Usually a bigger reduction is required to give
the impression of a halving of loudness. Another way to express this is
to say that the amplitude must be more than doubled to give the
impression of a doubling of loudness. Q
We shall return to the relationship between amplitude and loudness

when we look at decibels later in this chapter.
The correspondence between frequency and pitch, though not exact, is
sufficiently close for us to use frequency to define the pitches used in
music. For example, in the pitch standard known as concert pitch, the
pitch of the note A above the note middle C (Figure 22) is set at 440 Hz.
(This note is sometimes referred to as A4 using a convention that will
be explained shortly.)
Figure 22 Frequency of A4 in concert pitch (440 Hz)
Other pitch standards are in use, particularly for the performance of
older music, and even concert pitch is not universally used by
contemporary performers.

Two of the following statements are true and one is false. Find the true
and false statements.
(a) If two equal-amplitude sine waves A and B are exactly in phase,
the result of adding them will sound twice as loud as A or B by
itself.
(b) Reducing the amplitude of a sine wave always reduces its
loudness if the frequency is held constant.
(c) A sound of a particular amplitude may be audible at about 4 kHz,
but inaudible at 1 kHz. Q

Pitch and loudness are subjective properties of sound. Pitch is closely
correlated with frequency, and loudness is closely correlated with
amplitude. However, under certain circumstances, slight changes of
pitch can be created by changes of amplitude, and changes of loudness
can be created by changes of frequency. The ear’s uneven response
is part of the explanation for these latter phenomena. In the pitch
standard known as concert pitch, the note A4 (the A above middle C)
is set to 440 Hz.
8 THE OCTAVE
8.1 The octave sound
One feature of pitch that seems to be universal to all cultures is that
for musical purposes the pitch range is divided into discrete steps, for
instance the notes of a scale. This is not to say that musicians rigidly
adhere to those steps when they play, but the existence of these steps
is fundamental to the way music is conceived and organised. Different
cultures have different ways of defining the steps in their scale of
pitches, but nearly all cultures take the octave as their starting point. It
has a very characteristic sound, and it corresponds precisely to a
particular relationship of frequencies.
ACTIVITY 31 (COMPUTER, EXPLORATORY)

Run the virtual keyboard software for this activity, or use a keyboard of
your own if you know where middle C is on it. On the keyboard, play
the notes middle C and the seventh white note to its right (labelled X
in Figure 23). Play them one after the other (in either order), listening
carefully to the sound of the two notes. Can you describe the
relationship between the two pitches?
Figure 23 Section of keyboard for Activity 31 (middle C and the note X are labelled)
Comment
Although the two pitches are clearly different (X has a higher pitch
than middle C), most people find that there is nevertheless
something very similar about them. One way to express this idea is
to say that they are two versions of the same musical sound. If you
have difficulty hearing a similarity between them, try the following.
Play all the white keys from middle C to X in order, starting at
middle C, and then play middle C again. When you reach X it
should feel like a return to base, although naturally it is not a
return to middle C. Q
The two pitches you played in the last activity are said to be an
octave apart. The pitch marked X is an octave above middle C. Its
note name is also C. The seventh white note to the right of X
(counting the white note next to X on the right as 1) is another C. It
is two octaves above middle C. The seventh white note to the left of
middle C is an octave below middle C and is also given the note
name C.
Sine waves whose pitches are an octave apart have frequencies in
the ratio 2:1. That is, the higher pitch has a frequency that is twice
that of the pitch below it. Alternatively, the lower pitch has a
frequency that is half that of the pitch an octave above.

The A above middle C has a frequency of 440 Hz in concert pitch.
What is the frequency of the pitch three octaves above it? Q
We saw earlier that doubling the frequency of a sine wave

corresponds to a halving of its wavelength. This follows directly
from the relationship v = f × λ. Thus a sine wave that is an octave
above another sine wave has half the wavelength of the other
sound.
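The two relationships just described, the 2:1 frequency ratio of the octave and v = f × λ, can be tried out in a few lines of Python. This is only a sketch: the function names are my own, and the speed of sound of 340 m/s is an assumed nominal figure, so the wavelengths it gives are approximate.

```python
SPEED_OF_SOUND = 340.0  # metres per second; an assumed nominal value for air

def octave_shift(frequency_hz, octaves):
    """Frequency after moving up (positive) or down (negative) by whole octaves."""
    return frequency_hz * 2 ** octaves

def wavelength(frequency_hz, speed=SPEED_OF_SOUND):
    """Wavelength in metres, from v = f x lambda rearranged as lambda = v / f."""
    return speed / frequency_hz

a4 = 440.0                   # A above middle C in concert pitch
a5 = octave_shift(a4, 1)     # one octave above
a3 = octave_shift(a4, -1)    # one octave below

print(a3, a4, a5)
print(wavelength(a3), wavelength(a4), wavelength(a5))
```

Each step up of an octave doubles the frequency and halves the wavelength, whatever value is assumed for the speed of sound.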
8.2 Octave pitch and frequency increments

Because a doubling of frequency corresponds to an octave increase of
pitch, it follows that there is no constant increment of frequency that
always corresponds to a one octave increment of pitch. That is to say,
there is no fixed amount by which a frequency can be augmented that
will always produce a one-octave pitch rise.
For instance, starting at the pitch A with a frequency of 440 Hz, we
need to augment the frequency by 440 Hz to get the pitch one
octave above (880 Hz). But a further augmentation of 440 Hz does
not take us to the next A. We need an augmentation of 880 Hz to
reach the next A. Clearly the frequency steps get bigger as the
frequency (and pitch) get higher. In Figure 24, the horizontal axis
shows a series of As an octave apart (subscripted to show that they
are different pitches). Notice that they are equally spaced,
corresponding to the equal pitch-step of one octave between each.
The vertical axis shows the frequency corresponding to each pitch.
Notice that these frequencies are not equally spaced.
Figure 24 Pitch/frequency graph (horizontal axis: pitch, from A0 to A6; vertical axis: frequency in Hz, with 27.5, 55, 110, 220, 440, 880 and 1760 marked; A4 is at 440 Hz)
The same effect is found with any pair of pitches as we move up

through the frequency range. For instance, the step from any C to the G
above is five white notes on a piano keyboard, irrespective of whether
you play it at the top end of the keyboard or at the bottom. In terms of
named musical pitches, therefore, the step is always the same size:
five notes. The size of the frequency step, however, is not at all
constant, being much wider at the upper end of the keyboard than at
the lower. What is constant is the ratio of the frequencies, being 1.5:1
for G and C (that is, the frequency of G is 1.5 times that of the first C
below it, whatever the region of the keyboard). Similarly, the
frequency step for a single tone, from C to the D above it for example,
is not a fixed amount and varies from octave to octave; but the ratio of
their frequencies is always the same. It is generally true that equal
increments of musical pitch correspond to equal ratios of frequency,
not equal increments of frequency.
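A short Python sketch may help to make this concrete. It applies the ratio of 1.5:1 quoted above to two starting frequencies three octaves apart. The starting frequencies of 100 Hz and 800 Hz are arbitrary round figures chosen for illustration (they are not the frequencies of any particular C), and the function name is my own.

```python
def step_up(frequency_hz, ratio=1.5):
    """Return (frequency step in Hz, frequency ratio) for a fixed pitch step."""
    upper = frequency_hz * ratio
    return upper - frequency_hz, upper / frequency_hz

# Two hypothetical starting frequencies, three octaves apart (800 = 100 x 2 x 2 x 2).
low_step, low_ratio = step_up(100.0)
high_step, high_ratio = step_up(800.0)

print(low_step, low_ratio)
print(high_step, high_ratio)
```

The same pitch step is a 50 Hz increment at the lower starting point but a 400 Hz increment at the higher one, while the ratio is 1.5 in both cases.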
Box 3 ‘A typographical notation for octave ranges’ explains the
subscript convention used in Figure 24.

Suppose the keys on a piano keyboard were spaced in proportion to
their frequency.
(a) At which end of the keyboard would the keys be closest together?
(b) On a four-octave keyboard where the bottom octave occupied
20 centimetres, what would be the total width of the keyboard?
Q
Box 3 A typographical notation for octave ranges

Several typographical systems have been devised for indicating the octave
range in which a particular pitch is situated. A common convention uses
numerical subscripts. In this convention, each new octave is regarded as
starting on C and ending on the B above, and the lowest range is given the
subscript 0. Figure 25 shows a keyboard and a notational representation of this
convention.
Figure 25 Subscript system for indicating pitch regions (keyboard and staff notation marking the octave boundaries B0/C1, B1/C2, B2/C3, B3/C4, B4/C5, B5/C6 and B6/C7; C4 is middle C)
Another system you might come across in books is the so-called Helmholtz
system. In this system C’’ stands for C0, C’ for C1, C for C2, c for C3, c’ for C4, c’’
for C5, etc.
Yet another system uses CCCC for C0, CCC for C1, CC for C2, C for C3, c for C4, c’
for C5, c’’ for C6, etc.

A fundamental musical and acoustical relationship is the octave, and
pitches which are one or more octaves apart are heard musically as
different instances of the same sound. A one octave increase in pitch
corresponds to a doubling of frequency.
For musical purposes, a pitch range of one octave is divided into
discrete steps that together form a scale, the individual pitches of which
are given letter names (A, B flat, B, etc.). The pattern of pitches in a scale
is repeated in other octave ranges, as can be seen on a piano keyboard,
and pitches that are one or more octaves apart share the same musical
letter name. Subscripts are sometimes added to the letter name to
distinguish different pitches which share the same name, for instance
A1, A2, A3, etc.
Equal separation of pitch does not correspond to equal separation of
frequency. For instance, the pitch step from A4 to B4 is the same as the
pitch step from A5 to B5, but the frequency step is different. However,
the ratio of the frequencies in each step is the same.
9 THE RANGES OF HUMAN HEARING

9.1 Frequency range
The lowest frequency humans can hear is approximately 20 Hz. The
upper limit for humans is nominally 20 000 Hz (20 kHz), but this limit
tends to decline with age, and for most of us it is well below this figure.

Taking the upper limit of frequency as 20 kHz, how many octaves span
the range of human hearing? Q

With age, your upper limit might drop from 20 kHz to 10 kHz. How
many octaves have you lost? Q
Although human hearing covers a range of, say, ten octaves at best,
seven of these octaves cover the bottom eighth of the range, from 20 Hz
up to 2500 Hz, which corresponds roughly to the pitch range from E♭0
to E♭7 (Figure 26). As far as music is concerned, this is where the action
is concentrated. Of the standard acoustic instruments, only the piano,
harp and piccolo go higher than E♭7, and those not by very much.
Figure 26 Range of musical pitch (keyboard running from E♭0, approximately 20 Hz, to E♭7, approximately 2500 Hz, with middle C marked)
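Because each octave is a doubling of frequency, the number of octaves between two frequencies is the number of doublings needed to get from one to the other, which is a base-2 logarithm. The sketch below uses this to check the figures of 'seven octaves' and 'the bottom eighth of the range' quoted above. The function name is my own, and logarithms are not required for the course.

```python
import math

def octaves_between(low_hz, high_hz):
    """Number of octaves from low_hz up to high_hz (one octave per doubling)."""
    return math.log2(high_hz / low_hz)

musical_octaves = octaves_between(20, 2500)   # the range where music is concentrated
fraction_of_range = 2500 / 20_000             # 2500 Hz as a fraction of the 20 kHz limit

print(round(musical_octaves, 2), fraction_of_range)
```

The result is just under seven octaves, occupying one eighth of the nominal frequency range.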
At the lower end of the musical pitch range, the bottom note on a
double bass or bass guitar, E1, has a frequency of just over 40 Hz. Not
many instruments can go below this, except mainly the harp, piano,
double bassoon and organ. (The keyboard in Figure 26 is extended
below a normal piano keyboard.)
To say that musical instruments rarely produce pitches above E♭7 is
not the same as saying that they rarely produce frequencies above
2500 Hz. The sounds produced by virtually all instruments are not
pure sine waves. Instead, they are more or less complex mixtures of
sine waves covering a range of frequencies above that corresponding to
the pitch being played. I will say more about this in Chapter 2, but for
the moment the important point to bear in mind is that the mixture of
frequencies associated with a single musical pitch can extend well
above the frequency corresponding to the pitch that is heard. The
presence of these additional frequencies (sometimes called overtones,
partials or harmonics) is partly what gives individual ‘colour’ to
particular instruments.

The ear is at its most sensitive around 4 kHz. Is this within the range
occupied by the pitches mainly used in music? Q
9.2 Dynamic range

The quietest sound we can hear corresponds to a pressure wave with
an amplitude of about 10 µPa, which is a very small pressure
amplitude indeed. It is about 0.000 000 01% of nominal atmospheric
pressure, and the resultant displacement of the eardrum is less than a
tenth of the diameter of a hydrogen molecule.
At the upper end of the scale, a sound which is distressingly loud
might typically correspond to a pressure wave with an amplitude of
0.01% of nominal atmospheric pressure or more. (Atmospheric
pressure itself varies from day to day, which is why I refer to a
‘nominal’ value of atmospheric pressure.) We call a range of sound
amplitudes such as this a dynamic range. From the figures given here,
it is clear that the loudest sounds we encounter can have an amplitude
more than a million times greater than the quietest.
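The factor of a million can be checked directly from the two percentages. The Python sketch below also converts them to pascals, for which it has to assume a value for nominal atmospheric pressure. The figure of 101 325 Pa is a commonly quoted standard value and is my assumption, not a figure given in this chapter.

```python
NOMINAL_ATMOSPHERE_PA = 101_325.0   # assumed standard value, in pascals

quiet_fraction = 1e-8 / 100   # 0.000 000 01% expressed as a fraction
loud_fraction = 0.01 / 100    # 0.01% expressed as a fraction

quiet_pa = quiet_fraction * NOMINAL_ATMOSPHERE_PA   # about 10 micropascals
loud_pa = loud_fraction * NOMINAL_ATMOSPHERE_PA     # about 10 pascals

dynamic_range = loud_pa / quiet_pa   # ratio of loudest to quietest amplitude

print(quiet_pa, loud_pa, dynamic_range)
```

Note that the ratio does not depend on the value assumed for atmospheric pressure, because it cancels when one amplitude is divided by the other.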
The size of the human dynamic range comes as a surprise to many
people. Loud sounds, subjectively, do not seem to exceed quiet ones by
a factor of a million. The reason for this relates to the non-proportional
relationship between amplitude and loudness that we met in Section 7.
With quiet sounds, we readily notice a small increase in the amplitude.
For instance, two bicycle bells ringing sound distinctly louder than one.
But if the same increase is made to a louder sound (for instance, 11
bicycle bells instead of 10), the change is not so noticeable. Research
into the perception of sound indicates that instead of hearing equal
increments of amplitude as equal increments of loudness, we hear equal
multiples of amplitude as equal increments. For instance, successive
doublings of the amplitude of a sound are generally perceived as equal
increments of loudness. This is somewhat akin to the way we hear
successive doublings of frequency as equal increments of pitch.
ACTIVITY 37 (LISTENING, COMPUTER)

(a) In the first part of the audio track for this activity you hear a quiet
sine wave which grows in amplitude by successive doublings, and
then decreases by successive halvings. Does the loudness seem to
change by the same amount each time?
(b) In the second part of the audio track you hear the sine wave grow
in amplitude by equal increments, and then decrease by equal
decrements. Does the loudness seem to change by the same
amount each time?
Comment
I expect you answered ‘yes’ to (a) and ‘no’ to (b). (Depending on your
audio equipment, you may not have heard all the changes in (b). There
are as many steps in (b) as in (a).) Q

The nominal frequency range of human hearing is 20 Hz to 20 kHz,
though most people cannot hear to 20 kHz. However, the pitches used
in music correspond roughly to frequencies in the range from 20 Hz to
2.5 kHz. Generally, musical tones are not pure sine waves but are
mixtures of sine waves with frequencies that can extend well beyond
2.5 kHz. However, although they are mixtures of sine waves, they are
usually heard as having a single pitch.
The dynamic range of human hearing refers to the range of amplitudes
the ear can cope with. It covers a range of more than 1 000 000:1.
Equal increments of amplitude are not heard as equal increments of
loudness.
10 THE DECIBEL
10.1 Introduction
For a variety of reasons, not least the very wide dynamic range of
human hearing, the decibel (symbol dB) is often used as a unit for
the amplitude of sound waves. The decibel is also used in other
contexts, such as specifying the amplification of amplifiers or for
specifying the degree to which a signal is affected by noise. In the
context of sound, the use of the decibel as a unit captures
something of the subjective impression of the way loudness
changes with amplitude.
The decibel unit has two rather unusual properties in comparison with
other more conventional units you have probably met, such as the
metre or the second.
1 It indicates a ratio, rather than an absolute value. Thus the
decibel can be used as a way of comparing one amplitude with
another.
2 Equal decibel increments correspond to equal multiplications of
ratio.
Because the decibel expresses a ratio rather than an absolute value, it
cannot by itself specify the absolute amplitude of a sound. I shall
explain shortly how it can be adapted for the expression of absolute
values, but first I want to pursue the second feature I listed above,
namely that equal decibel increments correspond to equal
multiplications of ratio. I want to do this in the context of ratios of
amplitude.
Table 1 Amplitude ratios for selected decibel values
Decibels    Amplitude ratio
–12         0.25:1
–6          0.5:1
0           1:1
6           2:1
12          4:1
18          8:1
20          10:1
24          16:1
30          32:1
36          64:1
40          100:1
60          1000:1
Figure 27 Relationship of amplitude ratios to decibels (graph: horizontal axis, decibel scale from –20 to 60; vertical axis, equivalent amplitude ratio from 100:1 to 1000:1)
Figure 27 relates ratios to their decibel equivalents, and Table 1 does

the same thing for a few discrete values.
One thing that is immediately clear from Figure 27 is the non-
proportional relationship between decibels and their equivalent
amplitude ratios. For instance, the graph shows that an amplitude ratio
of 100:1 has a decibel-equivalent of 40 decibels, but an amplitude ratio
that is twice as big (200:1) does not have 80 decibels as its decibel-
equivalent. In fact its decibel-equivalent is about 46 decibels. A
proportional relationship between decibels and their corresponding
amplitude ratios would have a straight-line graph. Notice also (from
Table 1) that 0 dB corresponds to a ratio of 1:1, and that for ratios
between 1:1 and 0:1 the decibel-equivalent has a negative value.
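If you would like to check individual points on the graph, the conversion can be sketched in Python using the formula for amplitude ratios given in Box 4 later in this section (20 log10 of the ratio). The function names are my own. The conversion also shows that some entries in Table 1 are rounded: a ratio of 2:1 is slightly more than 6 decibels.

```python
import math

def amplitude_ratio_to_db(ratio):
    """Decibel equivalent of an amplitude ratio, using 20 log10(ratio)."""
    return 20 * math.log10(ratio)

def db_to_amplitude_ratio(db):
    """Amplitude ratio corresponding to a decibel value (the inverse conversion)."""
    return 10 ** (db / 20)

for ratio in (0.5, 1, 2, 100, 200):
    print(ratio, round(amplitude_ratio_to_db(ratio), 2))
```

Doubling the ratio from 100:1 to 200:1 adds only about 6 decibels (from 40 dB to roughly 46 dB), and a ratio smaller than 1:1 gives a negative decibel value.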
In the first part of the audio track for Activity 37 you heard a sound
clip that doubled in amplitude at each reappearance. However, the
sound appeared to be getting louder by an equal amount each time.
What does a doubling of amplitude look like when measured in
decibels? Let’s say that the starting amplitude of the sound was 1 unit
on a convenient scale of pressure or voltage. In Table 2 the first column
shows the successive amplitudes of the waves that you heard in terms
of this unit. Notice that in each line of the table the amplitude is twice
that in the line above.
Table 2 Amplitude ratios and decibel equivalents
Amplitude    Amplitude ratio    Decibel equivalents
1 unit       1:1                0 dB
2 units      2:1                6 dB
4 units      4:1                12 dB
8 units      8:1                18 dB
16 units     16:1               24 dB
The second column gives these amplitudes as a ratio of the starting

amplitude in the first line. The third column expresses these ratios in
decibels, using data from Table 1.

How do the values in the decibel column of Table 2 increase from line
to line?
Comment
On each line, the decibel value is an equal increment on the decibel
value in the line above. The increment is 6 decibels each time. Thus in
the first part of the audio track in Activity 37 there was a 6 decibel
increase of amplitude with each recurrence of the sound. Q
The equality of the decibel increment you saw in the last activity
therefore matches our subjective sense of equal increments of loudness
when the amplitude is multiplied by a constant factor (2 in this case).
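The contrast between the two parts of the audio track in Activity 37 can also be seen numerically. The Python sketch below converts a doubling sequence and an equal-increment sequence to decibels and looks at the size of each decibel step. The equal-increment sequence of 1, 2, 3, 4, 5 units is my own illustrative choice and is not necessarily the one used on the audio track; the conversion is the 20 log10 formula for amplitude ratios.

```python
import math

def to_db(ratio):
    """Decibel equivalent of an amplitude ratio (20 log10 of the ratio)."""
    return 20 * math.log10(ratio)

doubled = [1, 2, 4, 8, 16]   # amplitudes from Table 2, in arbitrary units
stepped = [1, 2, 3, 4, 5]    # equal increments of 1 unit, for contrast

doubled_db = [to_db(a / doubled[0]) for a in doubled]
stepped_db = [to_db(a / stepped[0]) for a in stepped]

# Differences between neighbouring decibel values in each sequence.
doubled_steps = [b - a for a, b in zip(doubled_db, doubled_db[1:])]
stepped_steps = [b - a for a, b in zip(stepped_db, stepped_db[1:])]

print([round(d) for d in doubled_db])
print([round(s, 1) for s in doubled_steps])
print([round(s, 1) for s in stepped_steps])
```

The doubling sequence climbs by about 6 dB every time, whereas the equal-increment sequence climbs by steps that shrink (roughly 6.0, 3.5, 2.5 and 1.9 dB), which fits the impression that the later changes are harder to hear.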
10.2 Adding decibels

A feature of decibels is that adding two decibel values is equivalent to
multiplying the ratios they represent. To see how this comes about,
consider another context in which a decibel measurement is often
used, that of signal amplification.
In Figure 28, the triangular symbol represents an amplifier which
amplifies a signal one thousandfold. A sine wave enters the amplifier
on the left and emerges on the right with its amplitude enlarged. (The
sine waves are not drawn to scale.) The ratio of the output voltage
amplitude to the input voltage amplitude is 1000:1, so, from Table 1,
we can say that the voltage amplification is 60 dB. Such an amplifier is
sometimes said to have a voltage gain of 1000, or 60 decibels.
An amplification of a thousand times could also be achieved in two
separate stages, as shown in Figure 29.
The first stage gives a tenfold amplification, and the second gives a
one-hundredfold amplification. In terms of decibels, the first stage
gives a gain of 20 dB and the second a gain of 40 dB. Adding these
gives 60 dB, which we saw in Figure 28 (or Table 1) to be the
decibel-equivalent of an amplitude ratio of 1000:1. Thus adding
decibels is equivalent to multiplying their corresponding ratios.
Figure 28 A single-stage amplifier (signal in, ×1000 stage labelled 60 dB, signal out)
Figure 29 A two-stage amplifier (signal in, ×10 stage labelled 20 dB, ×100 stage labelled 40 dB, signal out)
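The two-stage example can be expressed as a short Python sketch: the overall ratio is found by multiplying the stage ratios, and the overall gain in decibels by adding the stage gains. The function and variable names are my own, and the conversion assumes the 20 log10 formula for amplitude ratios.

```python
import math

def to_db(ratio):
    """Decibel equivalent of an amplitude (voltage) ratio."""
    return 20 * math.log10(ratio)

stage_ratios = [10, 100]   # the tenfold and one-hundredfold stages described above

overall_ratio = math.prod(stage_ratios)             # multiply the ratios
overall_db = sum(to_db(r) for r in stage_ratios)    # add the decibel gains

print(overall_ratio, overall_db, to_db(overall_ratio))
```

Both routes give the same answer: an overall ratio of 1000:1, which is 60 dB whether it is found by adding 20 dB and 40 dB or by converting 1000:1 directly.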

(a) A two-stage amplifier consists of a first stage which gives a gain of
12 decibels followed by a stage which gives a gain of 18 decibels.
What is the overall amplification of the amplifier, both in decibels
and as a ratio of natural numbers?
(b) A two-stage amplifier has a first stage which amplifies by a factor
of 8 and a second which amplifies by a factor of 10. What is the
overall amplification of the amplifier, both in decibels and as a
ratio of natural numbers? Q
The property of decibels whereby adding them is equivalent to

multiplying their corresponding ratios results from the fact that they
are based on logarithms. Box 4 ‘Mathematical definition of decibels’
explains their mathematical background, but this material is not included
in the outcomes for the course and you will not be assessed on it.
Box 4 Mathematical definition of decibels

Decibels were originally devised as a way of expressing a ratio of powers.
Given two powers P1 and P2, which have a ratio P1/P2, the decibel-
equivalent is defined to be
$$10 \log_{10} \frac{P_1}{P_2}$$
For sound waves, the power is proportional to the square of the pressure
amplitude. Hence, if the amplitudes corresponding to the power levels P1
and P2 are A1 and A2 respectively, the decibel equivalent of this power ratio
can be written as
$$10 \log_{10} \frac{A_1^2}{A_2^2} = 10 \log_{10} \left(\frac{A_1}{A_2}\right)^2 = 20 \log_{10} \frac{A_1}{A_2}$$
This is the equation used to give the values in Table 2. If you have already
worked with decibels as power ratios you might find the decibel values
given in Table 1 are double what you are used to seeing for the
corresponding ratios. The difference arises because Table 1 shows the
decibel equivalents of amplitude ratios rather than power ratios.
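The factor of two between the power form and the amplitude form can be checked with a small Python sketch. It takes an amplitude ratio of 4:1 as an arbitrary example and assumes only that power is proportional to the square of the amplitude, so that the constant of proportionality cancels in the ratio. The function names are my own.

```python
import math

def power_ratio_to_db(p1, p2):
    """Decibel equivalent of a power ratio: 10 log10(P1/P2)."""
    return 10 * math.log10(p1 / p2)

def amplitude_ratio_to_db(a1, a2):
    """Decibel equivalent of an amplitude ratio: 20 log10(A1/A2)."""
    return 20 * math.log10(a1 / a2)

a1, a2 = 4.0, 1.0            # an arbitrary amplitude ratio of 4:1
p1, p2 = a1 ** 2, a2 ** 2    # corresponding powers, up to a constant that cancels

print(amplitude_ratio_to_db(a1, a2))   # amplitude form
print(power_ratio_to_db(p1, p2))       # power form, same signal
print(power_ratio_to_db(a1, a2))       # 4:1 treated as a power ratio instead
```

The first two results agree (about 12 dB), because they describe the same signal. The third is half as large (about 6 dB), which is the difference you would notice if you were used to reading a ratio such as 4:1 as a ratio of powers.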
10.3 The decibel as a measure of sound amplitude

As I mentioned earlier, because a decibel is a way of expressing a ratio,
it cannot by itself express the absolute size of anything. To express
absolute values it must be referred to a fixed reference quantity, against
which whatever is being measured can be compared. In the context of
acoustics the reference used is the lower limit of audibility – the
threshold of audibility. This varies from person to person, but has a
nominal value which can be expressed as a pressure wave with an
amplitude of about 0.000 000 01% of atmospheric pressure. This is
taken to be 0 dB. If a sound level is described as being ‘40 dB’, it means
that its amplitude is 40 dB relative to the agreed threshold of audibility.
This means that its amplitude is 100 times greater than that of the
audibility threshold. Sound amplitudes expressed in decibels in this
way are said to have a sound pressure level (SPL) of so many decibels
(see Box 5, which is non-assessable).

The amplitude of a sound is 1000 times greater than the reference
value. What is its sound pressure level in decibels? Use either the
graph in Figure 27 or Table 1. Q
The following activity is designed to give you a flavour of what a

change in sound pressure level of a few decibels corresponds to.

In the audio track for this activity you hear a sine wave in which the
sound pressure level changes in 3 dB steps. You might like to compare
it with the first part of the audio track of Activity 37 where the sound
pressure level changed in 6 dB steps. Q

I mentioned earlier that a sound level that was a million times greater
than the threshold of audibility would be distressingly loud for most
people. Given that 0 dB is the level assigned to the threshold of
audibility, use Table 1 and the additive property of decibels to find the
sound pressure level of such a loud sound. Q
Table 3 gives some approximate sound pressure levels. You might
be surprised at how loud some instruments are. We do not normally
think of the violin and flute as loud instruments. However, because
of the close proximity of the player’s ear to the sound source, sound
pressure levels for the performer are not far from levels that can
damage the player’s hearing. Frequent, extended playing of these
instruments at high volume, as may happen in orchestras, can result in
hearing loss for the player. Players of other instruments, particularly
brass instruments, are similarly at risk, as are users of personal stereo
systems with headphones if they are regularly used at high volume.
Box 5 Sound pressure level (non-assessable)
The sound pressure level (SPL) in decibels of a sound with a given pressure
amplitude is:
$$\mathrm{SPL} = 20 \log_{10} \frac{\text{pressure in pascals}}{2 \times 10^{-5}}$$
(The figure of $2 \times 10^{-5}$ in the denominator is the amplitude in pascals of a
pressure wave at the lower limit of audibility.) Table 3 gives some typical
sounds and their sound pressure levels in decibels.
Table 3 Some typical sound pressure levels
Sound                                                 Sound pressure level
Threshold of pain                                     130 dB
Jet taking off at 100 m                               120 dB
Peak levels in dance club                             110+ dB
Timpani and bass drum rolls                           106 dB
Rock performance at close range                       100+ dB
Orchestral music during loud passages,
experienced by performers (permanent
hearing damage on prolonged exposure)                 90+ dB
Violin, flute at player’s ear                         85+ dB
Heavy car traffic at about 10 m                       80 dB
Chamber music in small auditorium                     75+ dB
Car interior/Singer fortissimo at 1 metre             70 dB
Piano practice                                        60–70 dB
Conversation at 1 m                                   60 dB
Office noise                                          50 dB
Domestic living room                                  40 dB
Bedroom                                               30 dB
Empty concert hall                                    20 dB
Breeze through leaves                                 10 dB
Threshold of hearing                                  0 dB
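The formula in Box 5 can be used in either direction: to turn a pressure amplitude into a sound pressure level, or to turn an entry in Table 3 back into a pressure amplitude. The Python sketch below does both. The function names are my own, the reference value is the 2 × 10⁻⁵ pascals given in Box 5, and, like Box 5 itself, this material is not assessed.

```python
import math

REFERENCE_PA = 2e-5   # pressure amplitude at the threshold of audibility (Box 5)

def spl_db(pressure_pa):
    """Sound pressure level in decibels for a pressure amplitude in pascals."""
    return 20 * math.log10(pressure_pa / REFERENCE_PA)

def pressure_from_spl(spl):
    """Pressure amplitude in pascals corresponding to a sound pressure level."""
    return REFERENCE_PA * 10 ** (spl / 20)

living_room_pa = pressure_from_spl(40)   # 'Domestic living room' in Table 3
traffic_pa = pressure_from_spl(80)       # 'Heavy car traffic at about 10 m'

print(spl_db(REFERENCE_PA), living_room_pa, traffic_pa)
```

On these figures a 40 dB sound has a pressure amplitude of about 0.002 Pa, which is 100 times the reference value, and an 80 dB sound has an amplitude of about 0.2 Pa.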

The decibel (symbol dB) is a way of expressing a ratio. It is based on
logarithms, and so adding decibels is equivalent to multiplying their
corresponding ratios. Decibels can be used to express absolute values
by referring them to a reference value.
A common use of decibels is to express ratios of amplitudes. For
instance, the amplification (or gain) of an amplifier can be expressed
either as the ratio of the output and input amplitudes, or as a certain
number of decibels. With a multi-stage amplifier where the gain of
each stage is expressed in decibels, the overall gain in decibels is just
the sum of the individual stages’ gains.
The sound pressure level (SPL) is a measure for expressing the amplitudes
of sound waves relative to the threshold of hearing. The SPL system
uses units of decibels, and the lower threshold of hearing has a value
of 0 dB. An advantage of the SPL as a measure is that it reflects the way
loudness is experienced. That is to say, equal increments of SPL are
heard as approximately equal increments of loudness.
SUMMARY OF CHAPTER 1
Sound can be considered objectively and subjectively. (Section 1.2)

As an objective phenomenon sound can be measured and described using a
scientific vocabulary (and to some extent a musical vocabulary).
(Section 1.2)

As a subjective phenomenon sound is experienced as a perception, and
descriptions tend to be metaphorical, although some musical terminology
also relates to subjective perception. (Section 1.2)

In the air, sound consists of pressure fluctuations that are propagated
through the atmosphere. These fluctuations are known as pressure waves.
They are also longitudinal travelling waves. (Sections 2.2 and 2.3)

Pressure is related to molecular spacing. Closer average spacing means
higher pressure (other things being equal). (Section 2.2)

One cycle of an oscillating source is a complete sequence of motion up
to the point at which the motion starts to repeat itself. (Section 2.3)

The time taken for one cycle is the period. (Section 2.3)

One cycle of an oscillation produces one complete high-pressure region
and one complete low-pressure region. (Section 2.3)

Pressure waves produced by a sinusoidally oscillating source such as a
tuning fork have a sinusoidal shape when plotted as graphs. Sine waves
have a characteristically pure or neutral sound. (Section 2.4)

travels a distance equal to one wavelength. (Section 2.5)

Pressure variations produced by a sinusoidally oscillating source, when
observed at a fixed point in space, are sinusoidal when plotted as a
graph against time. (Section 2.6)

The period (symbol T or τ) of a sinusoidal pressure variation is the
time interval between two corresponding points on consecutive cycles of
pressure variation. (Section 2.6)

The period of a pressure wave is the same as the period of the source.
(Section 2.6)

The frequency f of a pressure wave is the number of cycles of
oscillation per second. Frequency is the reciprocal of period
(f = 1/T). (Section 3.1)

Frequencies of sound are measured in hertz (Hz) or kilohertz (kHz).
(Section 3.1)

The speed of sound in air is for many practical purposes constant.
(Section 4.1)

Speed = frequency × wavelength, or v = f × λ. Alternatively, f = v/λ or
λ = v/f. (Section 4.2)

Phase refers to the part of a cycle which a particular vibrating system
is in at any moment. (Section 5.1)

A difference of phase between two sine waves of the same frequency is
usually expressed as a fraction of a cycle or as a certain number of
degrees. (Section 5.1)

When two waves are not in phase, one leads or lags the other.
(Section 5.1)
If two sine waves are half a cycle out of
The wavelength λ of a periodic wave is the phase (180° degrees or any odd multiple of
distance between two points of the same 180°), there is complete cancellation when
pressure created at corresponding points they are combined. If the waves are in phase,
of consecutive cycles of the source. there is maximum reinforcement.
(Section 2.5) Intermediate phase differences produce
intermediate amounts of cancellation or
In the time it takes for a source to complete reinforcement but the result is still a sine
one cycle of oscillation, the pressure wave wave. (Section 5.2)
Amplitude is the size of a sine wave. It is the maximum value of the
wave’s departure from its average value. (A sine wave oscillates
symmetrically about its average value.) (Section 6.1)

Peak-to-peak amplitude is the vertical distance from peak to trough. It
is twice the amplitude. (Section 6.1)

Amplitudes of sounds are typically measured in units of pressure or
volts, or in decibels. (Sections 6.2 and 10.3)

Root-mean-square (r.m.s.) amplitude is the amplitude multiplied by 0.71
(approximately). (Section 6.3)

A steady source equal in value to the r.m.s. amplitude of a sinusoidally
oscillating source supplies energy at the same rate as the oscillatory
source (other things being equal). (Section 6.3)

Pitch is the subjective property of sound that is heard to change while
frequency changes. (Section 7.1)

Loudness is the subjective property that is heard to change while
amplitude changes. (Section 7.1)

Equal changes of frequency are not experienced as equal changes of
pitch. Equal changes of amplitude are not experienced as equal changes
of loudness. (Sections 7.1, 8.2, 10.1 and 10.3)

For musical purposes, pitches are organised into discrete, named steps.
(Sections 7.1 and 8.1)

In the pitch standard known as concert pitch, the pitch A4 is defined as
the pitch of a sine wave of frequency 440 Hz. (Section 7.1)

An octave pitch increase corresponds to a doubling of frequency.
(Section 8.1)

Pitches which are one or more octaves apart are heard musically as
different instances of the same sound, and they are given the same
musical letter-name (A, A flat, B etc.). (Section 8.1)

Subscripts may be used to distinguish different pitches which share the
same letter-name, for instance A1, A2, A3, etc. (Section 8.2)

Equal separation of pitch does not correspond to equal separation of
frequency, but to equal ratios of frequency. (Section 8.2)

The frequency range of human hearing is nominally 20 Hz to 20 kHz,
although most people cannot hear to 20 kHz. (Section 9.1)

The pitches used in music correspond roughly to frequencies in the
seven-octave range from 20 Hz to 2500 Hz. (Section 9.1)

Generally, musical tones are not pure sine waves but are mixtures of
sine waves. (Section 9.1)

The dynamic range of human hearing is the range of amplitudes the ear
can cope with. It covers a range of more than 1 000 000:1. (Section 9.2)

The decibel (symbol dB) is a way of expressing a ratio. It can be used
to express absolute values by referring measurements to a fixed
standard. (Sections 10.1 and 10.3)

Adding decibels is equivalent to multiplying their corresponding ratios.
(Sections 10.1 and 10.2)

Decibels can be used to express a ratio of amplitudes, for instance in
specifying the voltage gain of an amplifier. (Section 10.2)

The sound pressure level (SPL) is a unit for expressing the amplitude of
a sound wave relative to the threshold of hearing. Its unit is the
decibel. (Section 10.3)

Equal increments of SPL are heard as approximately equal increments of
loudness. (Section 10.3)

ANSWERS TO SELF-ASSESSMENT ACTIVITIES

Activity 5
(a) Sound would be conveyed by the atmosphere within the craft, so
the slogan does not apply here.
(b) In the vacuum of space there is no medium within which there can
be pressure fluctuations, so the slogan applies here.
(c) The atmosphere on the planet would be able to sustain pressure
variations, so the slogan does not apply here.
Activity 7
The following are the correct pairings:
1 and (c)
2 and (b)
3 and (a).
The following is the correct text:
Sound waves are pressure waves, because they consist of cyclical
changes of pressure.
Sound waves emanating from a single source in the open (away from
buildings etc.) are travelling waves because the pressure variations radiate
outwards from their source, conveying energy away from the source.
Sound waves are longitudinal waves, because the molecular
oscillations are along the line of travel of the wave.
Activity 8
It took two complete cycles, and a little bit more, to create this pattern.
To see why, notice that the right-hand fork is at the centre of the low-
pressure part of the cycle, when the prongs are closest. To the right
there are two other low-pressure zones, so to create this pattern two
complete cycles were needed (Figure 30), plus a bit extra to account
for the region to the right of the diagram.
Figure 30 Pattern of pressures for Activity 8 (two complete cycles marked, plus an ‘extra’ portion)
Activity 11
(a) Each cycle of the pressure wave occupies 1 metre, so this is the
wavelength.
(b) We cannot directly read a single wavelength here, but four cycles
occupy 1 metre, so the wavelength is 0.25 metre.
Activity 12
It travels 300 metres. To see why, recall that in the time it takes for one
cycle of the fork the wave travels one wavelength, which is 1.5 metres.
So in the time required for 200 cycles the wave travels 200 times as
far, which is 200 × 1.5 metres = 300 metres.
Activity 14
With calibrated graphs like these, it makes sense to measure a period
from the point where the curve crosses the horizontal axis to the
corresponding point on the next cycle.
(a) It is clear from this graph that one cycle takes one second. Hence
the period is one second.
(b) In this graph it is not so easy to read the time for a single cycle.
However, five cycles clearly take 0.1 second, so a single cycle
would take one-fifth of this. The period is therefore 0.02 second.
Activity 15
The sine wave with a period of 1 second has a frequency of 1 Hz. The
sine wave with a period of 0.02 second has a frequency of (1/0.02) Hz,
or 50 Hz.
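The arithmetic in Activities 14 and 15 can be checked with a short Python sketch. The function names are my own and purely illustrative; the only relationships assumed are the ones stated in the chapter, namely that the period is the time taken by one cycle and that f = 1/T.

```python
def period_from_cycles(total_time_s, cycles):
    """Period in seconds when a whole number of cycles spans total_time_s."""
    return total_time_s / cycles


def frequency_from_period(period_s):
    """Frequency in hertz is the reciprocal of the period in seconds."""
    return 1.0 / period_s


# Activity 14(b): five cycles take 0.1 second.
period = period_from_cycles(0.1, 5)
# Activity 15: convert that period to a frequency.
frequency = frequency_from_period(period)
print(f"period = {period:.3f} s, frequency = {frequency:.1f} Hz")
```

Reading several cycles from the graph and dividing, as in Activity 14(b), tends to give a more reliable period than trying to read a single cycle.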
Activity 16
To members of Group A, Group B appears to be lagging by 0.2 seconds.
It takes 0.1 second for the sound to travel 34 metres from A to B at a
speed of 340 metres per second, and a further 0.1 second for the sound
to travel from B back to A. This makes a total delay of 0.2 second for
the round trip.
Activity 19
The wavelength is less than 1 metre. Using λ = v/f, we can see that the
wavelength in metres is 340 ÷ 384. Without using a calculator this can
be seen to be less than one. In fact its value is about 0.89 metres.
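The calculations in Activities 16 and 19 both follow from the speed of sound, and can be sketched in Python as below. The speed of 340 metres per second is the value used in these answers; the function and constant names are hypothetical, chosen only for this illustration.

```python
SPEED_OF_SOUND = 340.0  # metres per second, the value used in this chapter


def wavelength(frequency_hz, speed=SPEED_OF_SOUND):
    """Wavelength in metres, from lambda = v / f."""
    return speed / frequency_hz


def travel_time(distance_m, speed=SPEED_OF_SOUND):
    """Time in seconds for sound to cover distance_m."""
    return distance_m / speed


# Activity 19: wavelength of a 384 Hz tone.
print(round(wavelength(384), 2), "metres")
# Activity 16: round trip between two groups 34 metres apart.
print(round(2 * travel_time(34), 3), "seconds")
```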
Activity 21
The delay is 0.2 ms. Figure 31 is a repeat of Figure 13.
Figure 31 Repeat of Figure 13: (a) pressure near fork and (b) pressure at a distance, each plotted against time from 0 to 2 ms
One way to answer the question would be to look at where the first
peaks occur in each graph and to try to read the time difference. This
is not easy because neither peak occurs on a grid line, and finding the
exact summit of a smoothly curving sine wave is not easy. In cases like
this it makes more sense to compare the points where the graphs cross
the horizontal axis at the end of corresponding cycles. In Figure 31(a)
the end of the first cycle occurs at 1 ms. In Figure 31(b) the end of the
corresponding cycle occurs at the next vertical grid line after 1 ms. As
there are five grid lines between 1 ms and 2 ms, the space between
each line represents 0.2 ms. Hence a delay of 0.2 ms.
Activity 22
The phase difference is 0.2 milliseconds, and the period is 1
millisecond, so the phase lag is 0.2 of a cycle, or one-fifth of a cycle.
Activity 24
The phase difference is half a cycle, or 180 degrees. We could equally
say the waves were one-and-a-half cycles apart, or two-and-a-half, and
so on, which would give phase differences of 540 degrees and 900
degrees respectively. However, it is customary to speak of a phase
difference like this as 180 degrees.
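The conversions used in Activities 22 and 24 can be sketched as follows. This is only an illustration, with function names of my own choosing; it assumes that one full cycle corresponds to 360 degrees and that whole cycles can be discarded when quoting a phase difference.

```python
def phase_fraction(delay, period):
    """Phase difference as a fraction of a cycle (delay and period in the same units)."""
    return delay / period


def phase_degrees(delay, period):
    """Phase difference in degrees, with whole cycles discarded."""
    return (360.0 * delay / period) % 360.0


# Activity 22: a delay of 0.2 ms when the period is 1 ms.
print(phase_fraction(0.2, 1.0), phase_degrees(0.2, 1.0))

# Activity 24: half, one-and-a-half and two-and-a-half cycles
# all reduce to the same angle.
for cycles in (0.5, 1.5, 2.5):
    print(cycles, phase_degrees(cycles, 1.0))
```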
Activity 26
(a) The amplitude is 0.35 volts. The tops of the peaks fall half way
between 0.3 and 0.4 volts, hence the value of 0.35 volts.
(b) The amplitude is 6 volts.
Activity 27
(a) The amplitude is half the peak-to-peak amplitude, so the answer
is 1 volt.
(b) The r.m.s. amplitude is 0.71 of the answer in (a), so the answer is
0.71 volts.
(c) The r.m.s. amplitude is the size of a steady source that would
deliver energy at the same rate, other things being equal. So the
steady source needs to be 0.71 volts.
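The factor of 0.71 used in Activity 27 can be checked numerically. The sketch below samples one cycle of a sine wave, squares the samples, averages them and takes the square root, which is what 'root-mean-square' means. The number of samples is an arbitrary choice of mine, and the 2 volt peak-to-peak value is inferred from the answer to part (a).

```python
import math


def rms_of_sine(amplitude, n_samples=1000):
    """Root-mean-square value of one cycle of a sine wave, found numerically."""
    squares = [
        (amplitude * math.sin(2 * math.pi * i / n_samples)) ** 2
        for i in range(n_samples)
    ]
    return math.sqrt(sum(squares) / n_samples)


peak_to_peak = 2.0              # volts (assumed from the answer to part (a))
amplitude = peak_to_peak / 2    # the amplitude is half the peak-to-peak value
print(round(rms_of_sine(amplitude), 2), "volts r.m.s.")
```

The exact factor is 1/√2, which is 0.71 to two decimal places.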
Activity 30
(a) False. The sum of A and B will have double the amplitude of A or
B because A and B are in phase. A doubling of amplitude does not
produce a doubling of loudness.
(b) This is true.
(c) True. The ear is most sensitive around 4 kHz.
Activity 32
The pitch three octaves above this A has a frequency of 3520 Hz.
Unless you are familiar with this type of calculation, it is sensible to
do it an octave at a time, as follows.
One octave above has a frequency of 2 × 440 Hz = 880 Hz.
Two octaves above has a frequency of 2 × 880 Hz = 1760 Hz.
Three octaves above has a frequency of 2 × 1760 Hz = 3520 Hz.
The reason for doing the calculation a step at a time is to avoid a
couple of traps that can easily be fallen into. First, note that a three
octave rise does not correspond to a tripling of frequency. Secondly,
note that three successive doublings of frequency do not amount to a
six-fold increase in frequency overall. That misapprehension would
have given an answer of 2640 Hz. In fact, three successive doublings of
frequency amount to an eight-fold increase. Hence the factor by which
we need to multiply the original frequency is (2 × 2 × 2), which is
called ‘two cubed’ and customarily written as 2³.
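The octave-by-octave calculation above can be expressed compactly in Python. This is a small sketch with a function name of my own; it relies only on the rule that each octave rise doubles the frequency.

```python
def octaves_above(frequency_hz, octaves):
    """Frequency reached by raising frequency_hz by a whole number of octaves."""
    return frequency_hz * 2 ** octaves


# Step through one, two and three octaves above the 440 Hz A.
for n in range(1, 4):
    print(n, "octave(s) above:", octaves_above(440, n), "Hz")
```

Raising 2 to the power of the number of octaves gives the multiplying factor directly, which avoids the tripling and six-fold traps described above.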
Activity 33
(a) Figure 24 showed that the frequency increments become smaller
as we go down in pitch, so the bottom end of the keyboard (the
left-hand end) would be where the keys were closest together.
(b) It is safest to do this calculation in stages. The width of the second
octave would be 2 × 20 cm = 40 cm. The width of the third octave
would be 80 cm, and the width of the fourth octave would be
160 cm. The total width would therefore be:
20 cm + 40 cm + 80 cm + 160 cm = 300 cm, or 3 metres
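The staged calculation in part (b) can be sketched in Python as below. It assumes, as the answer does, that each octave of this hypothetical keyboard is twice as wide as the octave below it; the function name is illustrative only.

```python
def keyboard_width(first_octave_cm, octaves):
    """Widths of successive octaves (each double the last) and their total."""
    widths = [first_octave_cm * 2 ** k for k in range(octaves)]
    return widths, sum(widths)


widths, total = keyboard_width(20, 4)
print(widths, "total:", total, "cm")
```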
Activity 34
Ten octaves. A simple approach is to divide 20 000 Hz repeatedly by
two until we reach the lower limit of the human frequency range.
The number of times the division can be carried out is the number of
octaves. Thus, starting at the upper end of the range:
20 000 Hz ÷ 2 = 10 000 Hz
10 000 Hz ÷ 2 = 5000 Hz
5000 Hz ÷ 2 = 2500 Hz
2500 Hz ÷ 2 = 1250 Hz
1250 Hz ÷ 2 = 625 Hz
625 Hz ÷ 2 = 312.5 Hz
312.5 Hz ÷ 2 = 156.25 Hz
156.25 Hz ÷ 2 = 78.125 Hz
78.125 Hz ÷ 2 = 39.0625 Hz
39.0625 Hz ÷ 2 = 19.53 Hz
Hence the span is ten octaves.
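The repeated halving can also be carried out by a short loop, as in the sketch below. The function name is my own. The logarithm to base 2 in the last line is an alternative I have added for comparison; it gives the number of octaves as a figure that need not be a whole number.

```python
import math


def halvings_to_reach(upper_hz, lower_hz):
    """Halve upper_hz until it is no longer above lower_hz.

    Returns the number of halvings and the frequency finally reached.
    """
    count = 0
    frequency = upper_hz
    while frequency > lower_hz:
        frequency /= 2
        count += 1
    return count, frequency


print(halvings_to_reach(20_000, 20))
print(round(math.log2(20_000 / 20), 2), "octaves by logarithm")
```

The logarithm comes out slightly below ten because the tenth halving overshoots the 20 Hz limit a little, as the final line of the table above shows.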
Activity 35
One octave, corresponding to a halving of the upper-frequency limit.
Activity 36
It is above the frequency of the pitches produced by virtually all
instruments. However, it is within the range of the harmonics of some
instruments.
Activity 39
(a) The overall amplification is 12 dB + 18 dB = 30 dB. From Table 1
this can be seen to be equivalent to a ratio of 32:1. Multiplying the
amplifications of the individual stages confirms this. The first
gives an amplification ratio of 4:1 and the second an amplification
ratio of 8:1.
(b) The overall amplification ratio is 80:1. Table 1 does not give a
decibel equivalent for this, but it must be the sum of the decibel
equivalents for each stage. From Table 1 these are 18 dB and 20
dB, so the overall amplification is 18 dB + 20 dB, or 38 dB.
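The equivalence between adding decibels and multiplying ratios can be checked with the sketch below. It assumes the standard definition for amplitude ratios, dB = 20 × log10(ratio), which agrees with the figure of 60 dB for 1000:1 quoted in this chapter. The function names are illustrative. Exact conversions differ slightly from the rounded values read from Table 1.

```python
import math


def ratio_to_db(amplitude_ratio):
    """Convert an amplitude ratio (for example output/input) to decibels."""
    return 20 * math.log10(amplitude_ratio)


def db_to_ratio(decibels):
    """Convert decibels back to an amplitude ratio."""
    return 10 ** (decibels / 20)


# Part (a): stages of 12 dB and 18 dB.
stage_gains_db = [12, 18]
total_db = sum(stage_gains_db)
total_ratio = 1.0
for gain in stage_gains_db:
    total_ratio *= db_to_ratio(gain)   # multiply the individual ratios
print(total_db, "dB;", round(total_ratio, 1), "vs", round(db_to_ratio(total_db), 1))

# Part (b): an overall ratio of 80:1 expressed in decibels.
print(round(ratio_to_db(80), 1), "dB")
```

The exact ratios for 12 dB and 18 dB are a little under 4:1 and 8:1, so the product is a little under 32:1; the tabulated values are convenient roundings.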
Activity 40
From Table 1 or from the graph, an amplitude ratio of 1000:1 is
equivalent to 60 dB. This is the sound pressure level of this sound.
Activity 42
Table 1 does not give us a decibel equivalent directly for an amplitude
ratio of a million to one. However, Table 1 shows that a ratio of 1000:1
has a decibel equivalent of 60 dB. Hence 1 000 000, which is 1000 ×
1000, has a decibel equivalent of 60 dB + 60 dB, which is 120 dB. This
sound pressure level corresponds to a jet taking off when heard from a
distance of 100 m.
LEARNING OUTCOMES
After studying this chapter, you should be able to:
1 Explain correctly the meaning of the emboldened terms in the main
text and use them correctly in context.
2 Explain ‘cycle’ in terms of an oscillating source and the pressure
wave it produces. (Activity 8)
3 Describe simply what a pressure wave is and give a simple
explanation of sound in terms of a travelling pressure wave.
(Activity 7)
4 Relate amplitude (including peak-to-peak and r.m.s.), frequency,
period and wavelength to a sinusoidal waveform.
5 Calculate the wavelength of a pressure wave from a graph of
pressure against distance. (Activity 11)
6 Relate the distance that a pressure wave travels to the number of
cycles of oscillation performed by the source in a given length of
time. (Activity 12)
7 Calculate the period of a pressure wave from a graph of pressure
against time, and hence calculate the frequency. (Activities 14
and 15)
8 Perform simple distance and time calculations for sound, given the
speed of sound. (Activity 16)
9 Use the formula v = f × λ to perform simple calculations relating
speed, frequency and wavelength of sound. (Activity 19)
10 Explain phase difference and how it is quantified, and be able to
relate cancellation and reinforcement to phase difference.
11 Calculate phase difference in seconds, degrees or fractions of a
cycle from a graph showing two sine waves. (Activities 21, 22, 24)
12 Read or calculate the amplitude, peak-to-peak amplitude and r.m.s.
amplitude of a sine wave from its graph, given data relating
amplitude to r.m.s. amplitude. (Activity 26)
13 Discuss the relationship between amplitude and loudness, and
between frequency and pitch. (Activity 30)
14 Discuss the significance of the octave in terms of frequency and in
terms of pitch, and the role of the octave in relation to musical
scales.
15 Perform simple frequency calculations in connection with
octave-related pitches. (Activities 32, 33, 34, 35)
16 Specify approximately where in the human frequency range the
sounds used in music lie, in terms of both their pitches and the
extent of the frequencies spanned by their harmonics. (Activity 36)
17 Explain the use of the decibel as a way of representing sound
pressure level.
18 Perform simple decibel calculations, given a table or graph relating
decibels to ratios. (Activity 38)
Acknowledgement
Grateful acknowledgement is made to Robert Harding Picture Library/
Alamy Images for permission to reproduce Figure 12 [St Marks
Cathedral].
Chapter 2
Sound Shape and Colour
CONTENTS
Aims of Chapter 2
1 Introduction
2 Waveshape and timbre
3 Fourier’s theorem
3.1 Introduction
3.2 The harmonic series
3.3 The effect of phase
3.4 Synthesis and analysis
4 Frequency spectrum
4.1 Line spectrum
4.2 Bandwidth
5 The problem of synthesis
5.1 The problem of timbre solved?
5.2 The evolving sound
6 Repetition rate and fundamental frequency
7 The missing fundamental
8 Formants
9 Pitches in the harmonic series
9.1 Introduction
9.2 Identifying the pitch classes
9.3 Consonance and dissonance; the triad
9.4 The complete harmonic series
10 Intervals in the harmonic series
10.1 The perfect fifth
10.2 The perfect fourth
10.3 The major third
10.4 Other intervals
11 Tuning and temperament
11.1 Ascending by fifths
11.2 Ascending by thirds
12 Equal temperament
13 Beats
13.1 Close frequencies
13.2 Beating in near harmonics
Summary of Chapter 2
AIMS OF CHAPTER 2
• To introduce the concept of timbre.
• To investigate some of the factors that affect timbre, principally
frequency spectrum, attack and decay and formant.
• To introduce the harmonic series.
• To introduce Fourier’s theorem and to show the role of harmonics
in creating non-sinusoidal periodic waves.
• To introduce the concept of a time-varying frequency spectrum and
show its relevance to timbre.
• To introduce the concepts of attack, steady state, decay and
formant.
• To show how the harmonic series can be used to define pitches and
intervals.
• To show some of the anomalies that arise from using the harmonic
series to define pitches and intervals and to show how equal
temperament overcomes them.
• To introduce the phenomenon of beating.
1 INTRODUCTION
Much of Chapter 1 was concerned with sound waves and their
properties. These properties were broadly of two sorts: objective and
subjective. Three objective properties were found to be particularly
important: frequency, amplitude and phase. Two of these, frequency
and amplitude, were found to be closely related to audible, subjective
properties, namely pitch and loudness.
I mentioned in passing that the sounds produced by conventional
musical instruments, if plotted as graphs, are seldom pure sine waves;
rather, the sounds are mixtures of sine waves. In this chapter I shall be
looking at this idea more closely. The timbre of instruments is related
to the non-sinusoidal nature of the sound waves they produce. Timbre
is the characteristic sound, or ‘colour’, of an instrument or voice which
enables us to differentiate it from other instruments or voices even
when they are playing the same note. For instance, a trumpet sounds
very different from a clarinet. Timbre turns out to be a very complex
phenomenon, partly related to the particular mixture of sine waves
produced by instruments, and partly related to a number of other
factors, not least of which is the playing style of the performer.
A large part of the chapter will be concerned with the harmonic series.
This is a commonly occurring set of frequency relationships found in
the mixture of sine waves produced by many instruments. The harmonic
series is important not only in the story of timbre but also as the basis
of some of the pitches used in musical scales in many cultures.
The word ‘harmonic’ will occur frequently in this chapter. If you are a
player of a stringed instrument, you will be familiar with the word
‘harmonic’ to denote a way of playing which achieves a particular
timbre through the player lightly touching the string at certain places,
rather than by pressing it against the instrument’s neck. This meaning
of the word harmonic is different from the one that will feature in this
chapter, although the two meanings are not unrelated.
2 WAVE SHAPE AND TIMBRE

Figures 1–4 show some waveforms for a few standard musical
instruments, playing different pitches. All these graphs show parts of
the wave that are well away from either the start or the end of the note.
I mention this because the starts and ends of notes are usually unlike
the main part of a note, as we shall see later. The zeros on the time axis
therefore do not coincide with the starts of these notes.
Figure 1 Flute playing a sharp F#4 (waveform; time axis 0 to 7 milliseconds)
Figure 2 Oboe playing C#5 (waveform; time axis 0 to 8 milliseconds)
Figure 3 Violin playing G3 (waveform; time axis 0 to 20 milliseconds)
Figure 4 Piano playing G#4 (waveform; time axis 0 to 38 milliseconds)
Notice that, apart from the flute (Figure 1), all these waveforms are
distinctly non-sinusoidal.

Activity 1
You can hear the instrumental sounds from which these waveforms
were taken in the audio track for this activity.

It is tempting to deduce that because the waves look so different, and
the instruments sound so different, each wave shape must account for
the corresponding instrumental timbre. In other words, one might
conclude that each instrument produces a characteristic wave shape,
and that the wave shape gives rise to a characteristic timbre, as though
timbre were the subjective counterpart to wave shape in some way. To
a considerable degree the wave shape is directly related to our experience
of timbre, but timbre cannot simply be reduced to a particular wave
shape. For instance, the acoustic character of the room in which a sound
is heard can considerably affect the shape of the wave, and yet most
listeners would say that something distinctive about the timbre of the
instrument remained unchanged. Another way to express this observation
is to say that sometimes different wave shapes have the same timbre.
The waves in Figures 1 to 4, apparently all different, have one common
feature about which I have so far said nothing – the fact that they are, or
appear to be, periodic. That is to say, the waves consist of a repeating
portion, and this portion repeats at regular intervals. Periodicity is a
general feature of the waveforms produced by pitch-playing
instruments, such as stringed instruments, keyboard instruments,
woodwind, brass and voices. Excluded from the class of pitch-playing
instruments are snare drums, cymbals, triangles, and other instruments
which do not produce an identifiable pitch.

Activity 2
In concert pitch, the note G3 is assigned a frequency of 196 Hz. That is
to say, a sine wave of this frequency would have a pitch of G3.
(a) What is the period of such a sine wave?
(b) Does the waveform in Figure 3, which is for a violin playing G3,
have the same period?

Non-sinusoidal periodic waves have received a lot of theoretical attention
from scientists and mathematicians, and nearly all their discoveries build
on the fundamental work of Joseph Fourier, to whose work we now turn.
3 FOURIER’S THEOREM
3.1 Introduction
The French mathematician Joseph Fourier (1768–1830, roughly
contemporary with Beethoven) made important discoveries regarding
periodic waves, which have profound implications for the analysis of
sound, and indeed for innumerable other branches of science,
mathematics and engineering. He discovered that a non-sinusoidal
periodic wave can be created by adding together sine waves of
appropriate frequencies, amplitudes and phases. This observation is
usually referred to as Fourier’s theorem. Another way to think of
Fourier’s theorem is that a non-sinusoidal periodic waveform is
equivalent to a number of sine waves of appropriate frequency,
amplitude and phase added together. The amplitudes of the sine waves
that are added can be different from each other (in fact, they usually
are), but they are constant; that is, they do not change as time passes.
The frequencies and phases of the sine waves are similarly constant as
time passes.
A few restrictions must be placed on the interpretation of the phrase
‘non-sinusoidal periodic wave’ for Fourier’s theorem to work, because
not every conceivable non-sinusoidal periodic wave can be regarded as
a combination of sine waves. In principle, Fourier’s theorem applies to
non-sinusoidal periodic waves which satisfy both of the following
criteria:
1 infinite duration,
2 not too many gaps, sudden jumps, or other discontinuities.
The first condition looks impossible to meet, but turns out not to be a
major obstacle in practice. Provided the duration of a periodic wave is
long in relation to the duration of one cycle, then effectively the wave
is infinitely long for the purposes of Fourier’s theorem.
The second condition in the above list is designed to exclude rather
strange waves of a kind that do not really occur in music. Provided a
periodic wave shape is sufficiently well formed to be capable of being
drawn or printed on a piece of paper, then it satisfies the second
condition. In practice, then, the two restrictions above do not pose a
problem for the use of Fourier’s theorem in connection with real,
periodic musical waves such as those in Figures 1 to 4.
3.2 The harmonic series

I said above that Fourier’s theorem relates to the combining of sine waves
of appropriate frequencies to create periodic non-sinusoidal waves, but
what do I mean by ‘appropriate’ in connection with frequency? The
answer lies in the relationship between the frequencies used.
Specifically, the frequencies must be whole-number multiples of a
single, lowest frequency. An example should make this clear. The
following frequencies satisfy the condition:
100 Hz, 200 Hz, 300 Hz, 400 Hz, 500 Hz, 600 Hz, 700 Hz, 800 Hz.
Every frequency here is a whole-number multiple of 100 Hz, as shown
below:
100 Hz is 1 × 100 Hz
200 Hz is 2 × 100 Hz
300 Hz is 3 × 100 Hz
and so on. A frequency of 250 Hz could not be included in the above
series because it is not a whole-number multiple of 100 Hz.
Combining sine waves having these frequencies and having unvarying
amplitudes would result in a non-sinusoidal periodic wave, as
outlined by Fourier. You can confirm this for yourself shortly.

Activity 3
Do the following frequencies meet the condition of being whole-number
multiples of a single, lowest frequency?
150 Hz, 300 Hz, 450 Hz, 600 Hz, 750 Hz

Frequencies that are whole-number multiples of a single, lowest
frequency are said to be harmonically related. A series of such
frequencies is called a harmonic series. All the frequencies in a
harmonic series are whole-number multiples of the fundamental
frequency, sometimes just called the fundamental. In the first
example given above, the fundamental frequency was 100 Hz. In
Activity 3 the fundamental frequency was 150 Hz. For any
arbitrarily chosen fundamental frequency there is a series of
harmonically related frequencies.

Activity 4
Write the first six frequencies of a harmonic series whose fundamental
frequency is 98 Hz.

Activity 5
Are the following frequencies harmonically related to a fundamental
frequency of 100 Hz?
100 Hz, 150 Hz, 166.66667 Hz, 201 Hz, 305.39 Hz

Generally, a series of harmonically related frequencies has this pattern:
f, 2f, 3f, 4f, 5f, ...
where f stands for the fundamental frequency. The dots at the end
indicate that the series can continue indefinitely.
Sine waves that have harmonically related frequencies are called
harmonics. A harmonic at the fundamental frequency is called the first
harmonic; a harmonic with frequency 2f is called the second
harmonic; and so on.
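The pattern f, 2f, 3f, ... and the numbering of harmonics can be sketched in Python as below. The function names and the tolerance used for comparing frequencies are my own choices for illustration, not part of the definition.

```python
def harmonic_series(fundamental_hz, count):
    """First `count` frequencies of the harmonic series on fundamental_hz."""
    return [n * fundamental_hz for n in range(1, count + 1)]


def harmonic_number(frequency_hz, fundamental_hz, tolerance=1e-9):
    """Return n if frequency_hz is the n-th harmonic of fundamental_hz, else None."""
    ratio = frequency_hz / fundamental_hz
    nearest = round(ratio)
    if nearest >= 1 and abs(ratio - nearest) <= tolerance:
        return nearest
    return None


print(harmonic_series(100, 8))
print(harmonic_number(300, 100))   # a whole-number multiple of 100 Hz
print(harmonic_number(250, 100))   # not a whole-number multiple of 100 Hz
```

A frequency belongs to the series only if dividing it by the fundamental gives a whole number, which is the test the second function applies.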
We can now re-phrase Fourier’s theorem in terms of this new
vocabulary. Essentially the theorem states that (with certain
restrictions) a non-sinusoidal periodic wave can be regarded as a
combination of constant-amplitude sine waves whose frequencies are
harmonically related. To put it another way, if you combine sine waves
whose frequencies are harmonically related (and whose amplitudes are
unvarying), the result is a periodic, non-sinusoidal wave. This is of
interest to us because periodic, non-sinusoidal waveforms are what
many instruments produce, and we saw examples in Figures 1 to 4
(although Figure 1 was almost sinusoidal).
Note that Fourier’s theorem does not say that a harmonic series of
frequencies needs to be complete. A series may contain gaps. Sine
waves with the following frequencies, when added together, would
still produce a periodic non-sinusoidal result, despite the harmonic
series being incomplete.
100 Hz, 200 Hz, 400 Hz, 500 Hz, 800 Hz
Note that this set of frequencies lacks the third harmonic (300 Hz), the
sixth harmonic (600 Hz), the seventh harmonic (700 Hz) and all the
harmonics above the eighth.

Activity 6
Give the next four harmonics above the eighth for the above series.

The following activity gives you the chance to see and hear for yourself
what happens when harmonically related sine waves are added.

Activity 7
The software for this activity enables you to add up to twelve
harmonics in varying amounts. Set all the sliders to zero except for
the fundamental.
Play the fundamental, which has the characteristically ‘pure’ tone
of a sine wave. Now experiment with adding various harmonics, in
varying amounts, and seeing and hearing the result. You might
want to try adding just the even harmonics or just the odd ones, and
noticing the characteristics of the resulting waveform. Also, try the
effect of omitting some harmonics.
Another thing to investigate is the period of the combined
waveform compared with that of the fundamental.

Activity 7 should have convinced you that adding harmonically related
sine waves creates non-sinusoidal, periodic waves. You will have noticed
that the shape and sound of the combined waveform depended not only
on which harmonics were included, but also on the amounts of each
harmonic, in other words, on their amplitudes. Thus, you were able to
modify the timbre of the sound by adjusting its harmonic composition.
You probably found that one cycle of the combined waveform had the
same width as one cycle of the fundamental, indicating that they have the
same period. However, it is possible to produce results that apparently
do not fit this pattern. I shall return to the relationship between the
period of the combined wave and that of the fundamental later.
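If you do not have the course software to hand, the addition of harmonics can be imitated with a few lines of Python. The sketch below is not the software used in Activity 7; the function name, the 100 Hz fundamental and the particular amplitudes are all assumptions made for illustration. It adds constant-amplitude sine waves at whole-number multiples of the fundamental and compares the result at a few instants with its value exactly one fundamental period later.

```python
import math


def synthesise(fundamental_hz, amplitudes, t):
    """Value at time t (seconds) of a sum of harmonically related sine waves.

    amplitudes[0] belongs to the first harmonic (the fundamental),
    amplitudes[1] to the second harmonic, and so on.
    """
    return sum(
        amplitude * math.sin(2 * math.pi * (index + 1) * fundamental_hz * t)
        for index, amplitude in enumerate(amplitudes)
    )


fundamental = 100.0                   # hertz (illustrative value)
period = 1.0 / fundamental            # period of the fundamental in seconds
amplitudes = [1.0, 0.5, 0.0, 0.25]    # illustrative; third harmonic omitted

for t in (0.0012, 0.0037, 0.0081):
    now = synthesise(fundamental, amplitudes, t)
    later = synthesise(fundamental, amplitudes, t + period)
    print(f"{now:+.6f}  {later:+.6f}")
```

Each pair of printed values should agree, which is what it means for the combined wave to repeat with the period of the fundamental.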
3.3 The effect of phase

When you were adding sine waves in Activity 7 to create a non-
sinusoidal periodic wave, you might have wondered what the effect of
changing the phase relationship of the sine waves would have been, as
the software does not allow you to experiment with this. In fact,
changing the phase of the sine waves can radically alter the shape of
the resultant non-sinusoidal wave.
In Figure 5, harmonically related sine waves (a) and (b) are added to
create (c). In Figure 6, sine waves (a) and (b) are identical to those in
Figure 5, except that the phase of wave (b) is different. Adding sine
waves in (a) and (b) in Figure 6 to produce wave (c) gives a very
different result from that obtained in Figure 5.
Despite the clear difference between waves (c) in Figures 5 and 6, there
is no audible difference between them. The following activity gives
you the opportunity to check this for yourself.
Figure 5 Harmonically related sine waves: (a) 200 Hz, (b) 400 Hz, (c) the sum of (a) and (b); time axis 0 to 20 ms
Figure 6 Sine waves from Figure 5 with a phase difference: (a) 200 Hz, (b) 400 Hz, (c) the sum of (a) and (b); time axis 0 to 20 ms

Activity 8
In the audio track for this activity the first sound corresponds to wave
(c) from Figure 5. The second corresponds to wave (c) from Figure 6.
Can you hear a difference?
Comment
I am sure that you found the sounds to be identical. However, this does
not mean that the effects of phase change are always inaudible. This is
still a somewhat controversial area, but you would probably be able to
hear the effects of a transition from the wave at the bottom of Figure 5
to the wave at the bottom of Figure 6 during the transition. Q
3.4 Synthesis and analysis

The process of adding harmonics to create periodic non-sinusoidal
waves is known as Fourier synthesis. Thus, in Activity 7 you were
performing Fourier synthesis when you added together sine waves
with harmonically related frequencies to create non-sinusoidal
periodic waves.
There is an inverse process to Fourier synthesis in which a
periodic, non-sinusoidal wave is decomposed into its constituent
harmonics. This is known as Fourier analysis. Fairly complex
mathematical techniques are required to perform Fourier analysis.
A complete Fourier analysis would tell you not only which
harmonics were present in a non-sinusoidal wave, but also the
relative amounts of each (that is, their amplitudes) and their phase
relationships.
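For readers who would like to see the two processes side by side, here
is a minimal sketch. The function names, the sample count and the
example 'recipe' of amplitudes and phases are all invented for
illustration. The analysis step uses the simplest possible method,
correlating one whole cycle of the wave with a sine and a cosine at
each harmonic frequency, which is enough for a strictly periodic wave
sampled over exactly one cycle.

```python
import math

def synthesise(recipe, n=256):
    """Fourier synthesis: one cycle built from {harmonic: (amplitude, phase)}."""
    return [sum(a * math.sin(2 * math.pi * h * k / n + p)
                for h, (a, p) in recipe.items())
            for k in range(n)]

def analyse(wave, max_harmonic):
    """Fourier analysis: recover (amplitude, phase) for harmonics 1..max_harmonic."""
    n = len(wave)
    result = {}
    for h in range(1, max_harmonic + 1):
        c = sum(x * math.cos(2 * math.pi * h * k / n) for k, x in enumerate(wave))
        s = sum(x * math.sin(2 * math.pi * h * k / n) for k, x in enumerate(wave))
        amplitude = 2 * math.hypot(c, s) / n
        # Phase is meaningless for an absent harmonic, so report zero.
        phase = math.atan2(c, s) if amplitude > 1e-9 else 0.0
        result[h] = (amplitude, phase)
    return result

# A hypothetical recipe: harmonic number -> (amplitude, phase in radians).
recipe = {1: (1.0, 0.0), 2: (0.5, math.pi / 2), 3: (0.25, -math.pi / 4)}
recovered = analyse(synthesise(recipe), max_harmonic=4)
for h, (amplitude, phase) in recovered.items():
    print(h, round(amplitude, 3), round(phase, 3))
```

The analysis returns the amplitudes and phases that went into the
synthesis, and reports an amplitude of essentially zero for the fourth
harmonic, which was never added.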

A harmonically related series of frequencies has this pattern: f1, 2f1, 3f1,
4f1, 5f1 .... The frequency f1 is called the fundamental frequency. Sine
waves with harmonically related frequencies are called
harmonics. Harmonics are identified by number according to their
frequency: the harmonic with a frequency equal to the fundamental is
the first harmonic, the one with a frequency twice the fundamental is
called the second harmonic, and so on.
Fourier’s theorem states that, with some restrictions, a non-sinusoidal
periodic wave can be created by combining constant-amplitude
harmonics. This is Fourier synthesis.
Changing the phase relationship of harmonics changes the shape of the
resultant wave, but, except in some specialised instances (such as
when the phase relationship is changing), the effect is inaudible.
Fourier’s theorem can also be applied ‘in reverse’: a non-sinusoidal
periodic wave can be analysed into its constituent harmonics. This
process is called Fourier analysis.
4 FREQUENCY SPECTRUM
4.1 Line spectrum
In the last section I discussed the idea that a non-sinusoidal periodic
wave has a harmonic composition, that is to say, it is composed of
harmonics. The complete set of harmonics that make up a periodic
wave is known as the frequency spectrum (or sometimes just
spectrum) of the wave.
It is useful to have a simple, graphical way to represent a frequency
spectrum. To specify a spectrum fully we need to be able to represent
three kinds of information:
1 The frequencies present.
2 Their amplitudes.
3 Their phase relationship.
Because the effects of phase relationship tend to be inaudible (except
in special circumstances), we can afford to drop the third type of
information when looking for a simple way to represent the frequency
spectrum of sound. The first two types of information can be captured
in a particular type of frequency spectrum graph in which harmonics
are represented as vertical lines. Figure 7 is an example.
Figure 7 A frequency spectrum (also called an amplitude spectrum or a
line spectrum). Vertical axis: amplitude; horizontal axis: frequency,
with lines at 200 Hz and 400 Hz
Figure 7 shows that there are only two harmonics in this wave form, at
frequencies of 200 Hz and 400 Hz. The vertical lines at these frequencies
have the same height, indicating that these harmonics have equal
amplitude.
A diagram like Figure 7 can have various names. Sometimes it is known
as a frequency spectrum (because it shows the frequencies present).
Sometimes it is referred to as an amplitude spectrum, because it shows
the amplitudes of the harmonics but not their relative phases. (Bear in
mind that a complete representation of a frequency spectrum would
have to show both the amplitudes of the harmonics and their relative
phases, but we are choosing to ignore the phase information.)
As it happens, Figure 7 is also a line spectrum, because the harmonics
are shown as discrete lines. Any spectrum consisting of sine waves of
discrete frequencies (whether harmonically related or not) must
consist of isolated lines, as in Figure 7. However, sometimes frequency
spectra may consist of a continuous spread of frequencies rather than
discrete frequencies. Such spectra are usually associated with
unpitched sounds, for example a cymbal crash or the sound of the
wind. The important point to bear in mind is that not every frequency
spectrum graph consists of isolated lines. Phase information is
sometimes represented as a separate diagram.

Sketch the amplitude spectrum of the sound produced by a tuning
fork tuned to a frequency of 440 Hz. For the purposes of this activity,
regard the tuning fork as producing a sound with a constant amplitude
of 10 units. Q
4.2 Bandwidth
The concept of a frequency spectrum naturally gives rise to the concept
of frequency bandwidth, or just bandwidth. This is the range of
frequencies over which a frequency spectrum extends. In the case of
line spectra like Figure 7 the bandwidth is simply the highest
frequency present minus the lowest. In Figure 7 this is
400 Hz – 200 Hz = 200 Hz
As you can see, bandwidth is measured in the unit of frequency, hertz.
For frequency spectra that do not consist of isolated lines, bandwidth
must be defined rather differently, but it still conveys the same
essential idea, namely the range of frequencies over which a spectrum
is extended.
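The calculation for a line spectrum is simple enough to write down
directly. In this sketch a line spectrum is represented as a table of
frequency against amplitude; that representation and the function
name are my own choices for illustration. The example values are
those of Figure 7, with the equal amplitudes given an arbitrary value
of 1.

```python
def bandwidth(line_spectrum):
    """Bandwidth in Hz of a line spectrum given as {frequency_hz: amplitude}."""
    # Only lines that are actually present (non-zero amplitude) count.
    present = [f for f, a in line_spectrum.items() if a > 0]
    return max(present) - min(present)

figure_7 = {200: 1.0, 400: 1.0}  # the amplitude unit is arbitrary
print(bandwidth(figure_7), "Hz")  # 400 Hz - 200 Hz = 200 Hz
```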

What is the bandwidth of the sound whose frequency spectrum is
shown in Figure 8? Q
Figure 8 A frequency spectrum. Vertical axis: amplitude; horizontal
axis: frequency (Hz), marked at 0, 75, 150, 300, 375 and 450
The concept of bandwidth of a sound gives us a simple way to
characterise the sound. The almost perfect sine wave produced by the
flute in Figure 1 has a narrower bandwidth than the waves produced
by the other instruments in Figures 2 to 4. In fact, it is generally true
that the more angular a wave shape appears, the greater is its
bandwidth, because high harmonics are needed to supply the fine
features of a wave, such as sharp corners or sudden transitions. The
bandwidth of musical sounds can be quite wide, sometimes covering a
range of 10 kHz for certain instruments.
Graphs of frequency spectrum give us an alternative way to represent a
periodic waveform. Frequency spectrum graphs are a form of
frequency domain representation. On the other hand, graphs such
as those in Figures 1 to 4, which show the moment-by-moment
values of a wave, are known as time domain representations. In
general, it is possible to represent any periodic waveform in either
way, but bear in mind that a full frequency-domain representation
would have to include phase information. We are ignoring it
because the effects of phase are largely inaudible.

A frequency spectrum is a range of frequencies. A frequency
spectrum can be represented as a frequency spectrum graph (sometimes
also known as an amplitude spectrum). Such a representation is also
known as a frequency domain representation, in contrast with a time
domain representation such as an ordinary graph of a wave against time.
Periodic non-sinusoidal waveforms have line spectra. That is, the
frequency spectrum consists of discrete lines corresponding to the
harmonically related frequencies. The height of the lines in a line
spectrum indicates the amplitudes of the sine wave components. A
complete representation of a frequency spectrum contains information
about frequencies, amplitudes and phases, although phase information
is sometimes omitted.
The frequency bandwidth is the range of frequencies in a spectrum.
For a line spectrum, the bandwidth is just the highest frequency minus
the lowest frequency. For a continuous frequency spectrum a different
definition is necessary.
5 THE PROBLEM WITH SYNTHESIS

5.1 The problem of timbre solved?
Fourier’s theorem and the concept of a frequency spectrum seem to
offer a solution to the problem of timbre. Using Fourier’s theorem,
we could say that the weird and wonderful wave shapes produced
by conventional instruments are mixtures of sine waves of
harmonically related frequencies. In other words, each timbre
consists of a particular spectrum of frequencies. We can even
account for how an instrument produces a spectrum of frequencies,
rather than a single frequency: it is because vibrating systems, such
as stretched strings and air columns, unlike tuning forks, can easily
be made to vibrate in more than one way at once. (You will learn
more about this in Block 2.) These ways of vibrating, or modes of
vibration, when they are harmonically related (as they sometimes
are), combine to create the non-sinusoidal periodic results shown
in Figures 2 to 4.
This simple explanation even suggests a way to recreate the timbre
of instruments artificially. An instrumental sound could be
subjected to Fourier analysis to determine its frequency spectrum.
This would give us a ‘recipe’ for that particular timbre, which we
could then recreate using electronic sine-wave generators tuned to
the required frequencies and amplitudes. Some early attempts at
electronic synthesis consisted of little more than this. The results,
however, were disappointing when considered as imitations of
conventional instruments, although many musicians found them
interesting and useful in their own right. You can try synthesising
some instrumental sounds yourself in the following activity.

The software for this activity has some pre-set amplitudes for
harmonics for a small selection of instruments. You might like to try
listening to them and adjusting the sliders to see whether you can
improve the quality of the simulation.
Comment
I expect you found that the pre-set values gave a sound that was not
very convincing as a simulation. All the same, you probably also found
that changing the settings did not give any significant improvement. Q
Why are the results of this form of synthesis so disappointing? The
fundamental problem is that our view of timbre is too simple. One
shortcoming, which you did not have the chance to test with the
software in the last activity, lies in the assumption that there is a
single spectrum or ‘recipe’ for each instrumental sound, and that this
can be used for all the notes the instrument can play. In fact the wave
shape produced by an instrument can vary considerably depending on
whether it is playing a high note or a low note or one in the middle of
its range. Figure 9 shows the wave form produced by a flute playing a
low note. It is very different from the almost sinusoidal wave shape in
Figure 1, and therefore consists of a very different spectrum from that
of the wave in Figure 1. We shall look at one reason for this changing
wave shape in Section 8, on formants.
Figure 9 Flute in low register (F4). Horizontal axis: time, 0 to 26
milliseconds
There is a more fundamental problem with trying to reduce timbre to a
particular frequency spectrum, however, and it arises because the
shape of a wave changes or evolves during the course of a note. That is
to say, the graphs in Figures 1 to 4 show only brief portions of the
sounds produced by the respective instruments, and if we looked at
earlier or later portions of the waveforms, we might very well find that
the shape of the wave had changed.
5.2 The evolving sound

In the piano waveform in Figure 4 you can see that the wave shape is
actually changing gradually in the short section shown. This is perhaps
only to be expected with an instrument that cannot produce a
sustained sound. The note is dying almost from the moment it is
struck, so we should expect its wave shape at least to get smaller –
though it is not so obvious that the actual contour of the wave would
change. But even with instruments such as the oboe or flute, where the
note can be sustained for as long as the player has breath, we find that
the wave shape changes if we look at a sufficiently long stretch.
Figure 10 shows the envelope of the whole of the oboe note of which
Figure 2 shows a part. (The arrow in Figure 10 shows from which part
of the overall waveform Figure 2 was taken.)

You can hear the oboe note from which Figure 10 is taken in the audio
track associated with this activity. Q
In Figure 10 it is as though we were standing so far back from the wave
form that we can no longer see the individual oscillations that we saw
in Figure 2. Instead we see the overall rise and fall in the amplitude of
the wave over the course of the sound, which in this case lasts
approximately two-and-a-half seconds.
Figure 10 Oboe playing C#5. Envelope plotted against time, showing the
attack (35 ms), the steady state (1.5 s) and the decay (0.8 s); an
arrow marks the portion shown in Figure 2
Three distinct phases can be distinguished in Figure 10: the attack, the
steady state and the decay. Very many instrumental and vocal sounds
(but not all) have a three-part structure.
During the attack phase (sometimes also called the onset) the note is
establishing itself and growing rapidly in amplitude. Although the
attack phase is relatively short, it is nevertheless an extremely
significant part of an instrument’s timbre.
After the attack, there is typically a steady-state part. However, as
Figure 10 shows, the term ‘steady state’ can be something of a
misnomer, as the wave may vary quite a lot during this portion. The
cause of the variation may be voluntary or involuntary. Very few
instrumentalists or singers can maintain a note with absolute
steadiness, and even with mechanical instruments, such as the organ,
where you might expect the steady state to be absolutely steady, there
is often a degree of unsteadiness. In addition, many instrumentalists
and singers habitually add vibrato to a note. Vibrato is a periodic
variation in the amplitude or frequency of a waveform, typically with a
period of about 0.2 or 0.3 seconds (which is much longer than the
period of the waveform itself). The amount of vibrato added to a note is
a matter of taste and style, and at different historical periods different
amounts of vibrato have been used. However, even supposedly vibrato-
less performers often turn out to be using a small amount of vibrato.
The final phase is the decay (sometimes called offset). This is much
less significant than the earlier two phases in determining the timbre.
Depending on the instrument, the decay phase may or may not be
under the player’s control.
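The three-phase structure can be sketched as a simple function of
time. The durations below are the ones marked on Figure 10; the
straight-line rise and fall are my own simplifying assumption, since
a real envelope such as the oboe's is far less regular, particularly
in the so-called steady state.

```python
ATTACK = 0.035  # seconds, from Figure 10
STEADY = 1.5    # seconds, from Figure 10
DECAY = 0.8     # seconds, from Figure 10

def envelope(t):
    """Relative amplitude (0 to 1) at time t seconds, using straight-line segments."""
    if t < 0:
        return 0.0
    if t < ATTACK:                   # attack: amplitude grows rapidly
        return t / ATTACK
    if t < ATTACK + STEADY:          # steady state: idealised as constant
        return 1.0
    if t < ATTACK + STEADY + DECAY:  # decay: amplitude falls away
        return 1.0 - (t - ATTACK - STEADY) / DECAY
    return 0.0

total = ATTACK + STEADY + DECAY
print("total duration:", round(total, 3), "s")
for t in (0.0, 0.0175, 1.0, 2.0, 2.5):
    print(t, round(envelope(t), 3))
```

Multiplying a waveform by such a function, sample by sample, imposes
the envelope on it.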
Although it is clear from Figure 10 that the amplitude of the
underlying waveform is changing, it is not clear that this has any
implications for the frequency spectrum of the sound. However, a
waveform with a varying amplitude may in fact have a different
frequency spectrum from a waveform with the same shape but with
a constant amplitude. Figure 11, for instance, shows a sine wave in
which the amplitude is changing; it has a different frequency spectrum
from Figure 12, which is a sine wave with a constant amplitude.
Figure 11 Sine wave with increasing amplitude (horizontal axis: time)
Figure 12 Sine wave with constant amplitude (horizontal axis: time)
Another way to express this idea is that we can synthesise something
like Figure 11 from a number of constant-amplitude sine waves.
Precisely which sine waves are required depends on the precise way
in which we want the amplitude to change. (If it is to periodically
grow and shrink, for instance, then synthesis is relatively easy using
sine waves with harmonically related frequencies.) On the other hand,
to synthesise the wave in Figure 12 would require only one constant-
amplitude harmonic – the wave itself, since this is a constant-
amplitude sine wave.
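The case of an amplitude that periodically grows and shrinks can be
checked with a short calculation. In the sketch below all three
numbers (a 500 Hz sine wave whose amplitude swings by half its value
100 times a second) are assumptions chosen for illustration. The
trigonometric identity cos(a) sin(b) = [sin(b + a) + sin(b - a)] / 2
shows that such a wave is exactly the sum of three constant-amplitude
sine waves, here at 400 Hz, 500 Hz and 600 Hz, which are the fourth,
fifth and sixth harmonics of 100 Hz.

```python
import math

CARRIER = 500.0  # Hz, the sine wave whose amplitude varies (assumed value)
RATE = 100.0     # Hz, how often the amplitude grows and shrinks (assumed value)
DEPTH = 0.5      # the amplitude swings between 0.5 and 1.5 (assumed value)

def varying_amplitude(t):
    """A single sine wave with a periodically varying amplitude."""
    amplitude = 1 + DEPTH * math.cos(2 * math.pi * RATE * t)
    return amplitude * math.sin(2 * math.pi * CARRIER * t)

def three_steady_sines(t):
    """The same wave built from three constant-amplitude sine waves."""
    return (math.sin(2 * math.pi * CARRIER * t)
            + DEPTH / 2 * math.sin(2 * math.pi * (CARRIER - RATE) * t)
            + DEPTH / 2 * math.sin(2 * math.pi * (CARRIER + RATE) * t))

# Compare the two versions over 10 ms, sampled 48 000 times a second.
largest_gap = max(abs(varying_amplitude(k / 48000) - three_steady_sines(k / 48000))
                  for k in range(480))
print("largest difference:", largest_gap)
```

The difference is no larger than rounding error, so the wave with a
varying amplitude has a spectrum of three lines rather than one.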
We can therefore appreciate that during the attack and decay phases,
where the amplitude is markedly changing, the frequency spectrum
will be markedly different from the spectrum during the steady-state
phase; and even during the steady-state phase it may vary, because
the amplitude may not actually be steady.
During the attack phase many processes are at work, and the way in
which the wave changes during this phase can amount to more than a
simple increase of amplitude. Typically the shape of the wave itself
changes during this phase. Figure 13 is from the attack phase for the
oboe sound in Figure 10, and the shape of the wave is clearly changing,
although the period remains constant. Such a changing shape is itself
indicative of a changing frequency spectrum.
Figure 13 From the attack phase of an oboe. Horizontal axis: time, 0 to
45 milliseconds
The conclusion from all these observations is that it is not surprising
that a single ‘recipe’ of harmonics does not yield a satisfactory
synthesised sound, since the sounds produced by instruments and
voices consist of complex time-varying spectra, that is, spectra that
change with time.
I said at the start of this section that not all instruments produced
sounds with the three-phase structure that I identified in the oboe.
With plucked and hammered sounds (for instance those produced by
harps, guitars, harpsichords, pianos, and percussion instruments) the
waveform consists of an attack phase followed immediately by a decay
phase, with no intervening steady state. Figure 14 shows the envelope
of the sound produced by a piano. The attack phase is very short (too
short to show clearly in Figure 14) and the decay phase is very long.
Figure 14 Envelope of a piano sound plotted against time, showing the
attack (5 ms) and the decay (2 s); the portion shown in Figure 4 is
marked
The attack phase of an instrument’s sound is particularly important in
defining the instrument’s timbre. The following activity gives you a
chance to verify this.

The sound track for this activity consists of a sequence of paired
instrumental sounds. The first sound of each pair has had its attack
removed. The second sound of each pair is the same sound with its
attack phase restored. The purpose of the demonstration is to show
how removing the attack markedly affects the timbre of the sound. The
instruments used are the clarinet, violin, guitar, tuba and tenor
saxophone. Q
I said in Section 3 that Fourier’s theorem relates non-sinusoidal
periodic waves to the harmonics that constitute them. Properly
speaking, when a wave changes shape from cycle to cycle, we ought
not to describe it as periodic. This is because, strictly speaking, to be
described as periodic a cyclical oscillation must meet two criteria:
1 The oscillation must repeat itself identically from cycle to cycle.
2 The period of each cycle must remain constant.
Some of the oscillations we have looked at in this section do not
entirely meet the first criterion, or do not meet it during all phases.
Such waves are often said to be nearly periodic. If the rate of change of
the wave shape is fairly slow, as tends to happen during the steady
state or the decay phases, then it can still be legitimate to use Fourier’s
theorem as an approximation. However, if the wave shape changes
rapidly over the space of a few cycles, for instance as can happen
during the attack phase, then the spectrum may consist of non-
harmonically related frequencies and more complex mathematical
techniques are required to determine the frequency spectrum.

The wave forms produced by pitch-playing instruments generally have
time-varying spectra. The reasons for these time-varying spectra are
complex, and relate to the shape of the envelope of the sound and to
processes operating during, particularly, the attack phase of the sound.
Instrumental and vocal sounds typically have a three-phase structure
consisting of an attack (or onset) phase during which the amplitude of
the sound is growing, a steady-state phase during which the amplitude
is roughly constant, and a decay phase (or offset), during which the
amplitude reduces.
Some instrumental sounds have a two-phase structure consisting of an
attack phase followed immediately by a decay phase.
6 REPETITION RATE AND FUNDAMENTAL

FREQUENCY
In this section I am reverting to the kinds of strictly periodic non-
sinusoidal waves with which I began my discussion of Fourier’s
theorem in Section 3.
One consequence of the fact that a non-sinusoidal periodic wave
consists of a spectrum of frequencies is that we cannot ascribe a single
frequency to it. For instance, the non-sinusoidal wave at the bottom of
Figure 15 (which you saw earlier as Figure 5) has a well-defined period
of oscillation, 5 ms, which is the same period as that of the 200 Hz
fundamental.
However, we could not properly say that wave (c) in Figure 15 has a
frequency of 200 Hz, because this wave contains a spectrum of
frequencies. For non-sinusoidal periodic waves like these we use the
term repetition rate rather than frequency. The repetition rate is the
number of cycles per second of a non-sinusoidal periodic wave. Its
unit is the hertz. The repetition rate is the reciprocal of the period, just
as a sine wave’s frequency is the reciprocal of its period.
Figure 15 Sinusoidal waves (a) and (b) combine to produce the
non-sinusoidal periodic wave (c): (a) 200 Hz, (b) 400 Hz, plotted
against time from 0 to 20 ms
You might wonder whether the repetition rate of a non-sinusoidal
periodic wave is always equal to the frequency of the fundamental.
The answer is ‘yes’, but the fundamental frequency is not always what
it appears to be at first sight. As an example, consider this series of
harmonically related frequencies:
100 Hz 200 Hz 300 Hz 400 Hz 500 Hz 600 Hz 700 Hz 800 Hz
Combining sine waves with these frequencies will produce a non-
sinusoidal periodic wave with a repetition rate of 100 Hz, which is the
frequency of the fundamental.
ACTIVITY 14 (COMPUTING)
(a) Confirm the last sentence above using the software supplied for
this activity. Start with only the first harmonic and observe the
width of one cycle on the screen. Now add in the other harmonics
and notice that the width of a cycle remains unchanged.
(b) When you have all the harmonics present, take the amplitude of
the first harmonic down to zero, leaving the others unchanged.
Does the width of a single cycle change? Q
Part (b) of the last activity would have shown you that the result of
combining this series of frequencies
200 Hz 300 Hz 400 Hz 500 Hz 600 Hz 700 Hz 800 Hz
is to produce a periodic wave with a period equal to that of a 100 Hz
sine wave. In other words, the repetition rate is 100 Hz. Thus, even
though the fundamental frequency is absent from the above series of
frequencies, the remaining frequencies are all harmonically related to
this missing frequency, and the combined wave form has a repetition
rate equal to the fundamental frequency.
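If you do not have the course software to hand, the same point can be
checked with a short calculation. This is a sketch under simplifying
assumptions: every sine wave is given the same amplitude and the same
starting phase, and 'repeats' is taken to mean that the combined wave
has the same value at times t and t + period at every instant
sampled.

```python
import math

def combined(t, frequencies):
    """Sum of equal-amplitude sine waves at time t seconds."""
    return sum(math.sin(2 * math.pi * f * t) for f in frequencies)

def repeats_after(period, frequencies, samples=200):
    """True if the combined wave is unchanged by a shift of `period` seconds."""
    times = (k * 0.0001 for k in range(samples))  # every 0.1 ms up to 19.9 ms
    return all(abs(combined(t, frequencies) - combined(t + period, frequencies)) < 1e-9
               for t in times)

no_fundamental = [200, 300, 400, 500, 600, 700, 800]  # 100 Hz is absent

print("repeats every 10 ms:", repeats_after(0.010, no_fundamental))
print("repeats every 5 ms: ", repeats_after(0.005, no_fundamental))
```

The wave repeats every 10 ms, the period of a 100 Hz sine wave, but
not every 5 ms, even though 200 Hz is the lowest frequency present.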

Using the software for this activity, set all the harmonics to zero
(including the fundamental). Now add some of harmonic number 2.
Notice that you see two cycles on the screen. Add in as much as you
like of the other even-numbered harmonics, but be sure to keep the
odd-numbered harmonics (including the fundamental) at zero.
(a) How many cycles of the combined waveform do you see?
(b) What is the repetition rate of the combined wave form in relation
to the frequency of the second harmonic?
Comment
(a) With only even harmonics present, you see two cycles of the
combined waveform, as in Figure 16.
Figure 16 Two cycles of a wave created from even harmonics
(b) The two cycles of the combined wave have the same period as two
cycles of the second harmonic, Figure 17.
Figure 17 Two cycles of the second harmonic
Hence the repetition rate of the combined wave is the same as the
frequency of the second harmonic. Q
What the last activity shows is that if we combine these frequencies
200 Hz 400 Hz 600 Hz 800 Hz
then the combined wave has a repetition rate of 200 Hz, not
100 Hz. It is not difficult to see why this should be the case.
Although the above frequencies are all harmonically related to a
fundamental of 100 Hz, being respectively the second, fourth, sixth
and eighth harmonics, they are also harmonically related to a
fundamental of 200 Hz.

In terms of a fundamental frequency of 200 Hz, what harmonics are
present in the sequence of frequencies 200 Hz, 400 Hz, 600 Hz and
800 Hz? Q
This is what I mean by saying that the fundamental frequency of a
harmonic series may not be what it appears to be at first sight.
Specifically, where a harmonic series has gaps, there may be a frequency
to which the remaining frequencies can be harmonically related, but
which differs from the fundamental frequency of the complete series.
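For frequencies that are whole numbers of hertz, the highest
frequency to which all of them are harmonically related is simply
their greatest common divisor. The sketch below relies on that
observation; the function name is my own, and the restriction to
whole-number frequencies is an assumption made to keep the
arithmetic exact.

```python
from functools import reduce
from math import gcd

def repetition_rate(frequencies_hz):
    """Highest frequency (Hz) of which every listed frequency is a whole multiple."""
    return reduce(gcd, frequencies_hz)

complete = [100, 200, 300, 400, 500, 600, 700, 800]
no_fundamental = [200, 300, 400, 500, 600, 700, 800]
even_only = [200, 400, 600, 800]

for series in (complete, no_fundamental, even_only):
    rate = repetition_rate(series)
    print(series, "->", rate, "Hz, period", 1000 / rate, "ms")
```

The first two series give 100 Hz and the third gives 200 Hz, in
agreement with the results of the activities above.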

Periodic non-sinusoidal waves are said to have a repetition rate rather
than a frequency. The repetition rate is the reciprocal of the period
and is measured in hertz. The repetition rate is the same as the frequency
of the fundamental. An incomplete harmonic series may be harmonically
related to more than one frequency. The repetition rate is equal to the
highest frequency to which the harmonics are harmonically related.
7 THE MISSING FUNDAMENTAL

The ability of the ear–brain combination to perform a sort of Fourier
analysis naturally prompts the question of what pitch you hear when
presented with an incomplete harmonic series. The question is very
relevant to instrumental timbre, because the spectra of sounds
produced by instruments are sometimes anomalous when considered
in relation to the straightforward harmonic series we have been
considering in this chapter. Bassoons, for instance, produce spectra in
which the fundamental frequency can be very weak, almost to the point
of non-existence (Figure 18).
Figure 18 Spectrum of the note E3 played on a bassoon. Vertical axis:
amplitude; horizontal axis: harmonic number, 1 to 5
In the case of the bassoon sound represented in Figure 18, the
ear–brain combination appears to perform a kind of analysis of the
wave shape and to supply a pitch corresponding to the missing first
harmonic.
This is an example of the phenomenon of the missing fundamental,
which has received a lot of attention from researchers. The subject is
rather complex to analyse, but you might like to hear some examples in
the following activity.

In the audio track for this activity you hear a series of paired sounds.
In each case, the second sound is derived from the first by filtering out
the lowest harmonic. Do you detect a change of pitch between the first
and the second sounds of each pair? Q
What the phenomenon of the missing fundamental appears to
demonstrate is that harmonics are important in our perception of
pitch, and that adding harmonically related frequencies to a sound,
rather than making the resulting sound’s pitch less clear, can actually
reinforce the sense of the pitch associated with the fundamental. A
combination of harmonically related frequencies, therefore, tends to be
heard as a fused tone, with one pitch and a particular timbre, rather
than as a combination of sounds with individually distinguishable
pitches. This is not to say that the individual harmonics cannot be
heard, but most people find it rather difficult to pick them out. The
following activity is designed to help you pick out a harmonic from a
complex sound.

In the audio track for this activity you can hear a sound with twelve
harmonics. The eighth harmonic is periodically silenced and restored.
Figure 19 shows the pattern. Each horizontal line represents a harmonic
in the spectrum. The eighth harmonic is silent in alternate seconds.
When you play this audio track you should hear an apparently constant
tone against which a single high-pitched tone comes and goes every
second. Q
Figure 19 Twelve harmonics with the eighth switched on and off.
Vertical axis: harmonics, 0 to 12; horizontal axis: time, 0 to 14
seconds

A series of harmonically related frequencies from which the
fundamental frequency has been removed may nevertheless be heard
to have a pitch corresponding to the missing fundamental frequency.
This is the phenomenon of the missing fundamental.
The presence of harmonics reinforces a sense of the fundamental.
Adding harmonically related sine waves creates a fused tone which
has a pitch corresponding to the frequency of the fundamental.
8 FORMANTS
In Section 5 I mentioned that the wave shape produced by an
instrument was apt to differ depending on where in its range the
instrument was being played. In terms of frequency spectra, we would
say that the frequency spectrum changes through the instrument’s
range. In one sense the frequency spectrum must change, because
when an instrument plays a high note its frequency spectrum is
further to the right on a frequency spectrum graph than when it plays
a low note (Figure 20).
Figure 20 Hypothetical frequency spectrum for (a) a low note and (b) a
high note. Each panel plots amplitude against frequency (Hz)
In essence, the same spectrum has been shifted to a higher frequency.
But this is not what we mean when we say that the frequency spectrum
changes in different parts of the instrument’s range. What we mean is
that the shape of the spectrum changes. There are several reasons why
this happens, which you will learn more about in Block 2.
In relation to timbre, however, a particularly significant modifier of
the spectrum is the presence of a formant in the instrument’s acoustic
behaviour. A formant is a fixed frequency region (or regions) in which
harmonics are emphasised, irrespective of the frequency of the
fundamental. To see how this works, let us imagine we have a source
that produces the flat frequency spectrum shown in Figure 21.
Figure 21 A flat frequency spectrum. Vertical axis: amplitude;
horizontal axis: frequency, 0 to 1000 Hz in steps of 100 Hz
The spectrum is described as flat because all the harmonics have the
same amplitude, which is a rather unrealistic state of affairs. Suppose
this spectrum is subjected to a process which increases the amplitudes
of all harmonics between 750 Hz and 950 Hz by a factor of two (6 dB).
This region of frequency is the formant region for this particular case.
As a result of this process, the spectrum now has the appearance of
Figure 22.
Figure 22 The spectrum of Figure 21 showing the effect of a formant
region from 750 Hz to 950 Hz. Vertical axis: amplitude; horizontal
axis: frequency, 0 to 1000 Hz in steps of 100 Hz
Now suppose that the fundamental frequency of the source rises to
200 Hz. The formant region remains fixed at 750 Hz to 950 Hz.

Sketch the frequency spectrum for the case with a fundamental of 200
Hz and the same formant region. Q
Although the formant operates on a fixed region of the frequency
spectrum, its effect on the harmonics varies depending on the
fundamental frequency.
ACTIVITY 20 (SELF-ASSESSMENT)
(a) When the fundamental was 100 Hz, which harmonics were
emphasised by the formant?
(b) When the fundamental was 200 Hz, which harmonics were
emphasised by the formant? Q
Activity 20 shows that the effect of the formant is to change which
harmonics are emphasised as the fundamental frequency changes.
However, the emphasised harmonics always lie in the same
frequency range.
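The idealised formant used in this section can be sketched in a few
lines, which may be useful for checking your answers to the last two
activities. The sketch assumes the flat source spectrum of Figure 21
(every harmonic up to 1000 Hz with an amplitude of 1 unit) and a
formant that exactly doubles the amplitudes between 750 Hz and
950 Hz; the function names are my own.

```python
import math

def apply_formant(fundamental_hz, top_hz=1000, low_hz=750, high_hz=950, gain=2.0):
    """Spectrum {frequency_hz: amplitude} of a flat source after a fixed formant."""
    spectrum = {}
    for f in range(fundamental_hz, top_hz + 1, fundamental_hz):
        # The formant region is fixed; only harmonics falling inside it are boosted.
        spectrum[f] = gain if low_hz <= f <= high_hz else 1.0
    return spectrum

def emphasised_harmonics(fundamental_hz):
    """Harmonic numbers whose amplitude the formant has raised."""
    return [f // fundamental_hz
            for f, a in apply_formant(fundamental_hz).items() if a > 1.0]

gain_db = 20 * math.log10(2.0)  # a factor of two in amplitude, in decibels
print("gain:", round(gain_db, 2), "dB")
for fundamental in (100, 200):
    print(fundamental, "Hz fundamental:", apply_formant(fundamental))
    print("  emphasised harmonic numbers:", emphasised_harmonics(fundamental))
```

The emphasised harmonic numbers differ between the two fundamentals,
but the emphasised frequencies always lie between 750 Hz and 950 Hz.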
Many instruments (and the voice) have one or more built-in formants
which favour frequencies over a particular range. As the pitch of the
note played on the instrument changes, different harmonics are
emphasised according to the character of the instrument, and this is
part of what enables us to identify it. For instance, a bassoon has a
formant in the region of 440 Hz to 500 Hz, and another, weaker, one
around 1220 Hz to 1280 Hz. The presence of one or more formants is
very characteristic of instrumental or vocal timbre.

A formant is a fixed range of frequencies over which harmonics are
emphasised. Instruments and voices have characteristic formants.
9 PITCHES IN THE HARMONIC SERIES

9.1 Introduction
The connection between the harmonic series and the work of Fourier
in the eighteenth and nineteenth centuries might give the impression
that interest in the harmonic series has been relatively recent in
relation to musical history. In fact, musicians’ interest in it dates back
at least a couple of thousand years. For musicians, interest in the
harmonic series has historically been concentrated on pitch
relationships that are defined by the frequencies of the series, and this
is what I shall be concentrating on in the remainder of this chapter.
The question I want to investigate is whether there is any significant
relationship between the series of pitches produced by the harmonic
series. To put it another way, suppose we choose as our fundamental
frequency one that corresponds to a conventionally agreed musical
pitch. For the purposes of my discussion I will use a frequency of
98 Hz, which corresponds to the pitch G2 in concert pitch; however,
there is nothing special about this pitch, and any other could equally
well be chosen. The question now is, with a fundamental tuned to G2,
do any other members of the harmonic series correspond to musical
pitches in the concert-pitch standard? The answer, as you will see, is a
mixture of yeses and noes.
Before I go any further into this question, you might think that there
are other questions that ought to precede it, namely: Where do our
musical pitches come from in the first place? Why do we subdivide an
octave as we do, into a particular scale of pitches? The answer is that
no one really knows why. There are, however, enough
correspondences between the pitches we use and the pitches of the
harmonic series for many commentators to conclude that the harmonic
series is ultimately the source of our scales. The harmonic series, it is
said, is a phenomenon of nature, discernible in the timbre of natural,
pitch-producing objects, and it supplies a set of pitch relationships
which has been systematised into scales. This is a controversial view,
however, and there is not space in this chapter to pursue it further.
Irrespective of whether the harmonic series is or is not the origin of
our way of subdividing an octave into a scale of pitches, many
musicians have considered that the harmonic series provides an ideal
for the frequency relationships to use for tuning instruments. For
example, if we say that an octave corresponds to a doubling of
frequency, we are effectively using the harmonic series to define what
we mean by an octave. However, when listeners are asked to identify
an octave by ear, researchers have found that there is a fairly
consistent (albeit small) bias away from an exact frequency ratio of 2:1.
There is, therefore, room for a certain amount of latitude in how an
octave is defined, and piano tuners, for instance, tend not to use an
exact ratio of 2:1 at the extreme ends of the keyboard.
The issue of how a scale of pitches is defined, and how an instrument
is tuned, might seem to have little to do with the question of timbre,
but in fact musical keys (that is F major, C minor, B flat major, etc.)
have often been regarded as having particular timbres or colour.
Depending on the tuning system used, such an attribution of timbre to
key is more or less plausible. Nowadays the widespread use of the
so-called equal temperament system for tuning (to be discussed in
Section 11) eliminates the inherent characteristics of particular keys,
though most conventional instruments do in fact play differently and
sound different in different keys because of the idiosyncrasies of the
instrument. In the past, though, before the widespread adoption of
equal temperament, different keys could have different timbres, and
this characteristic has returned to music with the revival of old ways
of tuning and playing instruments in the trend towards historically
informed performance styles.
9.2 Identifying the pitch classes

It will help our discussion of the pitches in the harmonic series if we
use the concept of a pitch class. Box 1 explains what this is.
BOX 1 Pitch class

A pitch class is a set of musical pitches that share the same note name, but
not necessarily the same pitch. Thus the pitches G2, G3, G4, etc. all belong to
the pitch class G. Any pitches using other note names, such as A, A#, Bb, C,
and so on, are excluded from this class. Likewise G#3 or Gb5 (for instance)
do not belong to the pitch class G. They belong respectively to the pitch
classes G# and Gb. The pitch class G#, incidentally, is distinct from the pitch
class Ab, even though on many instruments it is not possible to differentiate
between them (for instance on keyboard instruments and guitars).
We have already seen that the second harmonic belongs to the same
pitch class as the first harmonic, because they are one octave apart.
The following activity is designed to identify other members of this
pitch class.

ACTIVITY 21
Which other members of the harmonic series belong to the same pitch
class as the fundamental? Look for frequencies that are two or more
octaves above the fundamental. You should be able to spot a pattern
by thinking about the frequencies of just the first eight harmonics.
You may have realised during the last activity that every even-
numbered harmonic must be one or more octaves above an odd-
numbered harmonic. In other words, as you ascend the harmonic
series, every time you meet an even-numbered harmonic you
encounter a member of a pitch class that has already been introduced
by a lower-numbered odd harmonic. For instance, the sixth harmonic
belongs to the same pitch class as the third, so the sixth harmonic does
not introduce any new pitch class to the series.
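The rule in the last paragraph amounts to repeatedly halving an even harmonic number until an odd number is left. The following Python sketch shows the idea for the first eight harmonics. The function names are hypothetical, and the code assumes only what is stated above: that halving a harmonic number lowers the pitch by an octave and so leaves the pitch class unchanged.

```python
def odd_part(n):
    """Divide out factors of 2 (whole octaves), leaving an odd harmonic number."""
    while n % 2 == 0:
        n //= 2
    return n


def group_by_pitch_class(highest_harmonic):
    """Map each odd harmonic to the harmonics that share its pitch class."""
    groups = {}
    for n in range(1, highest_harmonic + 1):
        groups.setdefault(odd_part(n), []).append(n)
    return groups


print(group_by_pitch_class(8))
# {1: [1, 2, 4, 8], 3: [3, 6], 5: [5], 7: [7]}
```

Each key in the result is an odd harmonic that introduces a pitch class, and the list beside it contains every harmonic belonging to that class.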

ACTIVITY 22
In the first sixteen members of a harmonic series, how many pitch
classes are there?
Figure 23 shows the first eight odd-numbered harmonics of a harmonic
series with a fundamental frequency of 98 Hz (G2), in terms of the
keyboard and in terms of music notation.
[Figure 23: piano keyboard and music notation, with middle C marked]
harmonic number   1    3    5    7     9    11     13   15
pitch             G2   D4   B4   F5*   A5   C#6*   E6   F#6
Figure 23 The first eight odd-numbered harmonics. Harmonics marked with an
asterisk are audibly out of tune
Apart from the pitch G2, none of these pitch correspondences is exact.
That is to say, the pitch of the fifth harmonic (say) is not exactly the
same as the pitch B4 as you would find it on a modern piano. The
difference arises because, as I hinted earlier, the harmonic series is not
in practice used to define the pitches to which most modern
instruments are tuned. We shall be looking at this in more detail in
Section 11. For harmonics 3, 5, 9, 13 and 15, the discrepancy in pitch
between that supplied by the harmonic series and tuning of a modern
piano is relatively small, and many listeners would either not notice it
or would have to listen very carefully to hear a difference.
It is a different story with harmonics 7 and 11, marked with an asterisk
in Figure 23. Here there are very noticeable differences between the
pitches supplied by the harmonic series and those used in modern
tuning, and the pitch names that are allocated (F5 and C#6) to these
harmonics are approximations. (Alternatively, you could say that the
pitches used in modern tuning are only approximations to the pitches
supplied by the harmonic series.)

ACTIVITY 23
What pitch classes are represented by the first eight odd-numbered
harmonics?
The series of pitch classes in the last activity is not quite enough to
give us a scale of G, which is G, A, B, C, D, E, F# and G, because it
lacks C. Actually it would not be difficult to establish the pitch class C
via a harmonic series on a different fundamental, because any series
that had G as its third harmonic would have C as its fundamental. It
looks, then, as though the harmonic series could, with a certain
amount of octave transposition, be used to give us the pitches of a
standard major scale. In principle, therefore, the idea that our
subdivision of the octave derives from the harmonic series is not
implausible. However, as a practical method of deriving all the pitches
for, say, tuning a keyboard instrument, this approach leads to
problems. For example, if we use the series in Figure 23 to define the
pitch class A, and then use this A as the fundamental of a new
harmonic series to derive the pitch class E, the pitch class E we
produce in this way will be slightly different from the one represented
by the thirteenth harmonic in Figure 23. In practice, attempts to use
the harmonic series to define all the pitch classes used in music
concentrate on using just the first few harmonics of the series, and
extrapolate from these to all the other pitch classes in a way that will
be described in Section 11.
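The discrepancy between the two versions of the pitch class E can be worked out with exact fractions. The sketch below is a hedged illustration, not a tuning method. It assumes that the E derived from A is taken from the third harmonic of the series on A, which the text does not state explicitly, and it moves every ratio into a single octave so that the two results can be compared directly. The helper name is hypothetical.

```python
from fractions import Fraction


def reduce_to_octave(ratio):
    """Transpose a frequency ratio by octaves until it lies between 1 and 2."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


# Pitch class A: the ninth harmonic of the series on G.
a_from_g = reduce_to_octave(Fraction(9, 1))    # 9/8
# Pitch class E as the third harmonic of a new series built on that A
# (assumption: the third harmonic is the one intended).
e_via_a = reduce_to_octave(a_from_g * 3)       # 27/16
# Pitch class E taken directly from the thirteenth harmonic on G.
e_direct = reduce_to_octave(Fraction(13, 1))   # 13/8

print(e_via_a, float(e_via_a))     # 27/16 1.6875
print(e_direct, float(e_direct))   # 13/8 1.625
```

Both values are frequency ratios relative to G, and they do not agree, which is the difficulty described above.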
9.3 Consonance and dissonance; the triad

Just as the octave is a fundamental musical interval or step, with its
own characteristic sound, so there are other fundamental musical
relationships embodied in the harmonic series. This section will be
looking at some of these.
ACTIVITY 24 (LISTENING, COMPUTER)

The software for this activity allows you to explore the sounds of
various harmonics in relation to the fundamental. You may have done
this already earlier in the chapter. If you have not done so already,
consider the effect of starting with just the first harmonic, then add the
third harmonic, and then the fifth. Do these three harmonics in
combination sound consonant or dissonant to you?
Comment
Whether you regard these pitches in combination as dissonant is a
subjective matter. However, most people regard them as consonant.
The pitch class of the third harmonic is regarded as being the most
consonant in relation to the fundamental after the octave. The pitch
class of the third harmonic is therefore regarded as being specially
important in music. The pitch class represented by the fifth harmonic
is consonant in relation to both the fundamental and the third
harmonic. It is also particularly important in music.
The pitches you used in Activity 24 were the first three in Figure 23,
reproduced here in Figure 24.
[Figure 24: piano keyboard and music notation, with middle C marked]
harmonic number   1    3    5
pitch             G2   D4   B4
Figure 24 First three odd-numbered harmonics of the harmonic series with
fundamental G2
If you play these pitches on a piano or any other instrument, the result
will sound very different from what you heard in Activity 24. This is
because the pitches you play on an instrument will not be pure sine
waves, whereas the harmonics you were adding in the activity
consisted of pure sine waves.
The pitch classes corresponding to the first three odd harmonics are
used to create the most basic type of chord used in music, the major
triad. A triad of G major is formed by playing the pitches G, B and D in
their closest possible arrangement, which is that shown in Figure 25.
[Figure 25: piano keyboard with the keys G, B and D marked]
Figure 25 G major triad
(I have not put subscripts on the pitches because this arrangement of
pitches is a triad wherever it is played on the keyboard.)
ACTIVITY 25 (COMPUTER, EXPLORATORY)

Try playing the triad in Figure 25 on your keyboard (either a real one
or the one supplied as part of the course software) to familiarise
yourself with its sound. Play it in different parts of the keyboard. Also,
play the notes in sequence and simultaneously. Notice that when you
add other pitches from the pitch classes G, B and D, the basic
character of the triad is not changed.
You might like to experiment with a few other triads, for instance:
C, E, G (C major),
F, A, C (F major),
D, F#, A (D major)
Any selection of pitches drawn from the pitch classes of the triad,
such that each class is represented by at least one pitch, is a major
chord. For instance, this set of pitches is a chord of G major: G2, G3, G4,
D5, G6, B6; but it is only one of innumerable chords of G major that can
be assembled from the pitch classes G, B and D. A major triad is thus
the most basic form of a major chord. As you would have heard in
Activity 25, a major chord and its corresponding major triad share a
characteristic sound. Thus in the published scores of a lot of popular
music, chords are represented in shorthand by letters, such as G (to
mean a chord of G major), C (a chord of C major), D (a chord of D
major), etc. It is left to the player to decide the particular arrangement
of pitches from the corresponding triads to use.
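The definition of a major chord given above can be expressed as a simple test on pitch classes. The Python sketch below is an illustration under stated assumptions: pitches are written as a note name followed by an octave number (for example "G2" or "F#6"), and the function names are hypothetical. It does not treat enharmonic pairs as equivalent, in keeping with Box 1.

```python
def pitch_class(pitch):
    """Strip the trailing octave number: 'G2' -> 'G', 'F#6' -> 'F#'."""
    return pitch.rstrip("0123456789")


def is_chord_of(pitches, triad_classes):
    """True if every pitch class of the triad appears and no other class does."""
    return {pitch_class(p) for p in pitches} == set(triad_classes)


G_MAJOR = ("G", "B", "D")

print(is_chord_of(["G2", "G3", "G4", "D5", "G6", "B6"], G_MAJOR))  # True
print(is_chord_of(["G2", "D3", "G3"], G_MAJOR))                    # False: no B
```

The first call uses the set of pitches quoted in the text. The second fails because the pitch class B is not represented.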
In the next activity we start to explore the higher harmonics.

ACTIVITY 26
Using the software for this activity, add the seventh harmonic to the
first, third and fifth. Is the result harmonious?
Comment
Generally adding the seventh harmonic to the other three would be
regarded as producing dissonance, although for many people the
dissonance is mild.
Activity 26 shows that not all the pitch classes in the harmonic series
are consonant with the fundamental, or with each other (to check this
point you might try listening to just the fifth and seventh harmonics,
with none of the others present).

ACTIVITY 27
With the software for this activity, add the ninth and eleventh
harmonics. Try them individually with the first, third and fifth, and
see whether you think they are dissonant. Also, try them with the first,
third, fifth and seventh.
Comment
You probably found the ninth and eleventh harmonics dissonant in
relation to just the first, third and fifth harmonics. Adding the seventh
harmonic as well would generally be regarded as increasing the
overall level of dissonance.
9.4 The complete harmonic series

Up to now I have concentrated on just the odd-numbered harmonics,
because these determine the pitch classes in the harmonic series. It is
now time to bring the even-numbered harmonics back into the picture.
Figure 26 is the result.
Not all the even harmonics are properly in tune relative to other
members of the series. Specifically, the fourteenth, being an octave
above the seventh, is equally out of tune.
[Figure 26: piano keyboard and music notation, with middle C marked]
odd harmonics    1    3    5    7     9    11     13    15
pitch            G2   D4   B4   F5*   A5   C#6*   E6    F#6
even harmonics   2    4    6    8     10   12     14    16
pitch            G3   G4   D5   G5    B5   D6     F6*   G6
Figure 26 Complete harmonic series up to the sixteenth harmonic

ACTIVITY 28
If the harmonic series were extended beyond the sixteenth, which
would be the next even harmonic to be markedly out of tune?
Notice in Figure 26 that the harmonics become progressively closer in
pitch as we ascend the series. Indeed in places there are consecutive
runs of pitches that form sections of scales.
I have summarised the data for the first sixteen harmonics of the
harmonic series on G2 in Table 1. In the second column the frequency
is expressed as its ratio to the frequency immediately preceding it. For
instance, the second harmonic has 2:1 in this column, indicating that
its frequency is twice that of the harmonic in the row above it, which
in this case happens to be the fundamental. Notice that the ratios in the second
column are all whole-number ratios, and that they advance in an
orderly way from top to bottom. This pattern is a straightforward
consequence of the frequency relationships of the harmonic series:
f1, 2f1, 3f1, 4f1, etc.
Table 1 The first sixteen harmonics of the harmonic series on G2.
(Asterisks indicate ‘out of tune’ harmonics)

Harmonic   Frequency ratio relative   Frequency   Pitch
           to preceding harmonic
 1                                      98 Hz     G2
 2         2:1                         196 Hz     G3
 3         3:2                         294 Hz     D4
 4         4:3                         392 Hz     G4
 5         5:4                         490 Hz     B4
 6         6:5                         588 Hz     D5
 7         7:6                         686 Hz     F5*
 8         8:7                         784 Hz     G5
 9         9:8                         882 Hz     A5
10         10:9                        980 Hz     B5
11         11:10                      1078 Hz     C#6*
12         12:11                      1176 Hz     D6
13         13:12                      1274 Hz     E6
14         14:13                      1372 Hz     F6*
15         15:14                      1470 Hz     F#6
16         16:15                      1568 Hz     G6
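The numerical columns of Table 1 follow directly from the rule that the nth harmonic has n times the frequency of the fundamental. The short Python sketch below regenerates them. It is offered only as a check on the arithmetic: the function name is hypothetical, and the pitch names are omitted because they cannot be derived from the frequencies without a tuning standard.

```python
def harmonic_rows(fundamental_hz, count):
    """Return (harmonic number, ratio to preceding harmonic, frequency) rows."""
    rows = []
    for n in range(1, count + 1):
        ratio = f"{n}:{n - 1}" if n > 1 else ""   # the fundamental has no predecessor
        rows.append((n, ratio, n * fundamental_hz))
    return rows


for n, ratio, frequency in harmonic_rows(98, 16):   # 98 Hz is G2
    print(f"{n:>2}  {ratio:<6} {frequency:>5} Hz")
```

Changing the first argument gives the equivalent table for any other fundamental. The ratio column stays the same, because it depends only on the harmonic numbers.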

A pitch class is a set of pitches that share the same note name. The
odd-numbered harmonics of a harmonic series can be used to define several
pitch classes in relation to the pitch class of the fundamental. Two
especially important pitch classes are those of the third and fifth
harmonics. Together with the pitch class corresponding to the
fundamental, these three pitch classes are used to create a major triad,
the basic major chord of music.
Some members of the harmonic series (for example the seventh and
eleventh) are markedly out of tune in relation to the rest.
10 INTERVALS IN THE HARMONIC SERIES

10.1 Perfect fifth
We can regard the data in Table 1 as supplying data not just for
pitches, but for intervals. In musical parlance, an interval is a step in
musical pitch. For instance, a one-octave step is an interval of one
octave. As you know, the interval between the first two harmonics in
the harmonic series is an octave.
After the octave, the most important interval is that between the
second and third harmonics of the harmonic series. This interval is
called a perfect fifth, or informally just a fifth. Figure 27 shows this
interval on a piano keyboard with a lower pitch of G, although any
pitch could be chosen.
[Figure 27: piano keyboard with the keys G and D marked]
Figure 27 A perfect fifth

ACTIVITY 29
Play the fifth in Figure 27 a few times on a keyboard to familiarise
yourself with the sound. Try playing it in different parts of the
keyboard. Its distinctive quality is probably best appreciated by
playing the two pitches in sequence, upwards and downwards, and by
singing them. Other fifths you can try are E and the first B above it, or
D and the first A above it.
Table 1 shows that the frequency ratio for this interval is 3:2, or 1.5:1,
meaning that the frequency corresponding to the higher pitch is 1.5
times that of the lower pitch.
Notice in Figure 27 that the fifth is spanned by five white notes. These
five white notes are the first five notes of the scale of G, and this is the
origin of the name ‘fifth’. A perfect fifth, however, does not have to
begin and end on white keys, and not all groups of five adjacent white
keys span a perfect fifth. Unless you are conversant with music theory,
the surest way to check whether an interval is a perfect fifth is to
count the semitone steps.

ACTIVITY 30
How many semitone steps are there in a perfect fifth?
Perfect fifths occur elsewhere in the harmonic series than between the
second and third harmonics. Any two members of the harmonic series
whose frequencies are in the ratio 3:2 (or 1.5:1) are a perfect fifth apart.
In fact, any harmonics whose harmonic numbers are in the ratio 3:2
are a perfect fifth apart. For example, harmonics 6 and 4 have
frequencies in the ratio 6:4, which is the same as 3:2. Alternatively,
you could say that the frequency of harmonic 6 (frequency 6f1) is 1.5
times that of harmonic 4 (frequency 4f1). Hence sixth and fourth
harmonics are a perfect fifth apart.
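The same reasoning can be applied mechanically to find every pair of harmonics a perfect fifth apart. The following Python sketch searches the first sixteen harmonics for pairs whose harmonic numbers are in the ratio 3:2. The function name and the limit of sixteen are choices made for this illustration.

```python
from fractions import Fraction


def pairs_with_ratio(ratio, highest_harmonic=16):
    """Return (lower, upper) harmonic numbers with upper:lower equal to ratio."""
    return [
        (lower, upper)
        for lower in range(1, highest_harmonic + 1)
        for upper in range(lower + 1, highest_harmonic + 1)
        if Fraction(upper, lower) == ratio
    ]


print(pairs_with_ratio(Fraction(3, 2)))
# [(2, 3), (4, 6), (6, 9), (8, 12), (10, 15)]
```

Because `Fraction` reduces 6:4 to 3:2 automatically, the comparison is exact and avoids the rounding errors of decimal arithmetic. Passing a different ratio finds pairs separated by a different interval.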
10.2 The perfect fourth

The next interval step in Table 1 is that between the third and fourth
harmonics. This is called a perfect fourth, or just a fourth. Table 1
shows that the frequency ratio is 4:3. Figure 28 shows a perfect fourth
on a piano keyboard.
[Figure 28: piano keyboard with the keys D and G marked]
Figure 28 Perfect fourth

ACTIVITY 31
Try playing a perfect fourth. Again, try playing the pitches in sequence
and singing them. Other perfect fourths you can try are from E to the
first A above it, or from G to the first C above it.
10.3 The major third

The next interval in the sequence is that between the fourth and fifth
harmonics. This is called a major third, and has a frequency ratio of
5:4. Figure 29 shows three intervals of a major third on a piano
keyboard. As can be seen from Figure 29(a), a major third spans the
first three notes of a major scale. However, as with the perfect fifth, the
surest way to check whether an interval is a major third is to count the
semitone steps. A major third always has four semitone steps.
[Figure 29: piano keyboard showing three major thirds:
(a) G to B; (b) B to D#/Eb; (c) D#/Eb to G]
Figure 29 Three major thirds within one octave
Notice that in Figure 29, (b) and (c) have been chosen such that each
begins where the one before ends. In other words, the lower note in (b)
was the upper note in (a), and the lower note in (c) was the upper note
in (b). Thus three interlocking major thirds span exactly one octave, a
fact we shall make use of in Section 11.

ACTIVITY 32
Using a keyboard, play the major thirds in Figure 29 to familiarise
yourself with their sound. In addition to these major thirds, others you
can try are D to the F# immediately above, or C to the E immediately
above, or any other pair of notes separated by four semitone steps.
The major third occurs elsewhere in the harmonic series than between
the fifth and fourth harmonics. Any frequencies in the ratio 5:4 (or
1.25:1) are a major third apart.
The intervals of a major third and a perfect fifth are used in the major
triad. Taking the home note of the triad as G, for instance, the note
above it is B, which is a major third above G, and the top note is D,
which is a perfect fifth above G.
10.4 Other intervals

I am not going to explore all the other intervals that are listed in
Table 1. The perfect fifth and the major third are the most important
for our purposes.
However, it is worth pointing out an anomaly concerning the whole-
tone step that emerges clearly from Table 1. Consider the step from
A5 down to G5 (harmonic 9 to 8). These are a whole tone apart.
Their frequencies in Table 1 are shown to be in the ratio 9:8 (1.125:1).
Now consider the pair B5 and A5 (harmonics 10 and 9), which are
also adjacent keys on the piano keyboard and therefore also a whole
tone apart. Their frequencies are in the ratio 10:9 (or about 1.111:1).
The pair E6 and D6 (harmonics 13 and 12) are also neighbours on the
piano keyboard, and have frequencies in the ratio 13:12 (about 1.083:1).
Thus the harmonic series does not give us a consistent frequency ratio
for a whole-tone step.
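The three whole-tone ratios quoted above can be confirmed with a few lines of Python. This sketch simply restates the figures from Table 1. The dictionary of steps and the function name are hypothetical, introduced only for the illustration.

```python
from fractions import Fraction

# Whole-tone steps named in the text, as (upper, lower) harmonic numbers.
WHOLE_TONE_STEPS = {
    "A5 / G5": (9, 8),
    "B5 / A5": (10, 9),
    "E6 / D6": (13, 12),
}


def step_ratios(steps):
    """Convert each pair of harmonic numbers into an exact frequency ratio."""
    return {name: Fraction(upper, lower) for name, (upper, lower) in steps.items()}


for name, ratio in step_ratios(WHOLE_TONE_STEPS).items():
    print(f"{name}: {ratio.numerator}:{ratio.denominator} = {float(ratio):.3f}")
```

The three printed values (1.125, 1.111 and 1.083) are all different, even though each pair of pitches is a whole tone apart on the keyboard.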

ACTIVITY 33
(a) From Table 1 (or otherwise), which harmonic is a perfect fifth
above the sixth harmonic?
(b) From Table 1 (or otherwise), which harmonic is a major third
above the twelfth harmonic?
(c) Is any member of the harmonic series a major third above the
tenth?

An interval is a step in musical pitch. The harmonic series can be
used to define frequency ratios for intervals. After the octave, the
most important musical intervals are the perfect fifth and the major
third.
The perfect fifth has a frequency ratio of 3:2, corresponding to its
place in the harmonic series as the interval between the third and
second harmonics. In a major scale, the perfect fifth is the interval
between the home note, or tonic, and the fifth note of the scale.
The major third has a frequency ratio of 5:4, corresponding to its
place in the harmonic series as the interval between the fifth and
fourth harmonics. In a major scale, the major third is the interval
between the home note, or tonic, and the third note of the scale.
11 TUNING AND TEMPERAMENT

11.1 Ascending by fifths
Practical attempts to derive scales from the harmonic series for tuning
instruments such as pianos, harpsichords and organs tend to use the
first few intervals from Table 1, and to apply them repeatedly. To give
an example, let us consider what happens if we progressively move
upwards in steps of a perfect fifth from an arbitrary starting pitch.

ACTIVITY 34
Figure 30 is somewhat wider than a full-sized piano keyboard. Starting
at the lowest G on this keyboard, go up by a chain of perfect fifths until
you return to a G. The fifths have to be interlocking; that is, as you
ascend the keyboard, the upper note of one fifth must become the
lower note of the next.
(a) What is the sequence of pitches passed through on the way? (I
have not given a subscript for the initial G because its actual pitch
is irrelevant for this activity. You therefore do not need to bother
with subscripts for the pitches you identify.)
(b) How many perfect fifths are needed to accomplish this ascent?
Unless you are conversant with music theory, use the fact that a perfect
fifth consists of seven semitone steps.
[Figure 30: extended piano keyboard with the lowest G marked]
Figure 30 Keyboard for Activity 34
If we take the pitches visited in part (a) of Activity 34 and regard them
as representatives of their respective pitch classes, we can arrange
them in ascending order of name, to produce this sequence of pitch
classes:
G – G# – A – A# – B – C – C# – D – D# – E – F – F# – G
This series of pitch classes covers all the pitches of a chromatic scale,
a chromatic scale being one where all the white and black notes are
sounded between the lower and upper notes of the scale. Thus by
starting from an arbitrary pitch and rising through a chain of perfect
fifths we can derive all twelve pitch classes of the chromatic scale.
This assumes that we regard enharmonic pairs, such as G# and Ab, as
equivalent.
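The ascent through a chain of fifths can be simulated by counting semitone steps. The Python sketch below is a minimal illustration. It assumes, as the text does, that enharmonic pairs are treated as equivalent, so that the keyboard has twelve pitch classes per octave, and it uses the fact that a perfect fifth spans seven semitone steps. The names used are hypothetical.

```python
PITCH_CLASSES = ["G", "G#", "A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#"]
SEMITONES_IN_FIFTH = 7


def chain_of_fifths(start_index=0):
    """Step upward by perfect fifths until the starting pitch class recurs."""
    visited = []
    index = start_index
    while True:
        visited.append(PITCH_CLASSES[index])
        index = (index + SEMITONES_IN_FIFTH) % 12   # wrap round at each octave
        if index == start_index:
            break
    return visited


chain = chain_of_fifths()
print(chain)
print(len(chain), "pitch classes visited")
```

Every one of the twelve pitch classes is visited exactly once before the chain returns to G.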
You might be wondering how these pitch classes relate to those
derivable from the harmonic series in Figure 26. For the pitch classes
G, D and A (the first three from the ascending chain of fifths) the two
methods give the same result. All the other pitch classes, however,
differ slightly. Thus there is actually more than one way to derive a
scale of pitches from the harmonic series.
Unfortunately, the appealingly simple method of using a chain of
perfect fifths to define a chromatic scale of pitches soon leads us
into problems.

ACTIVITY 35
How many octaves are there between the bottom G and the top G in
Figure 30?
Apparently, then, if you move up in a series of twelve perfect fifths,
you should arrive at the same pitch as you would if you moved up
through seven octaves.

ACTIVITY 36
Suppose, for simplicity, that the frequency of the lowest G in Figure 30
is 1 Hz. Calculate what happens if this frequency is taken through
seven steps at a frequency ratio of 2:1 (that is, seven doublings) and
compare the result with twelve steps at 3:2 (in other words, multiply
by 1.5 twelve times). (You will need to use a calculator for this.)
The result of Activity 36 shows that the G reached by ascending through
twelve perfect fifths is not the same as the G reached by ascending
through seven octaves. Thus pitches which can be derived through
different routes and which ought to be the same turn out to be different.
This problem afflicts all methods of defining pitches based on the use
of the harmonic series, or chains of intervals derived from the
harmonic series.
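The calculation in Activity 36 can also be done in a few lines of Python rather than on a calculator. The sketch below starts, as the activity does, from a nominal frequency of 1 Hz. The variable names are hypothetical. The small excess it reveals is usually called the Pythagorean comma, a term not used elsewhere in this chapter.

```python
START_HZ = 1.0

octaves = START_HZ * 2 ** 7      # seven doublings
fifths = START_HZ * 1.5 ** 12    # twelve steps at a ratio of 3:2

print(octaves)                      # 128.0
print(round(fifths, 3))             # 129.746
print(round(fifths / octaves, 5))   # 1.01364
```

The chain of fifths overshoots the seven octaves by a little under 1.4 per cent in frequency.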
11.2 Ascending by thirds

Discrepancies such as that in Activity 36 are also evident over a
much smaller range than seven octaves. For instance, we saw in
Section 10.3 that three major thirds span an octave. What happens
if we use the precise ratio for a major third from Table 1 to ascend
through an octave?

ACTIVITY 37
Again, treating the starting pitch as being equivalent to a frequency of
1 Hz, calculate what frequency is arrived at after a succession of three
major thirds, and compare the result with that obtained from a single
step of one octave.
The discrepancy revealed in Activity 37 is very audible, as Activity
38 shows.

ACTIVITY 38
The audio track for this activity consists of a succession of major
thirds, as shown schematically in Figure 31.
[Figure 31: three pairs of pitches in ascending order of pitch:
(a) C and E; (b) E and G#/Ab; (c) G#/Ab and the C an octave above
the C in (a)]
Figure 31 Ascending major thirds for Activity 38
The first sound is an exact major third (using the ratio from Table
1). It consists of the pitches C and E played simultaneously, as
represented by (a) in Figure 31. The second sound is an exact major
third with a lower note of E, as in (b). However, in the third sound,
as represented by (c), the upper C is not an exact major third above
the lower note of the pair. Instead, it is an exact octave above the C
in (a). Although the major thirds in (a) and (b) sound harmonious,
that in (c) sounds harsh.
A characteristic feature of the sound of intervals that do not have
the precise frequency ratios of Table 1 is the phenomenon of ‘beats’,
which is a pulsation in the loudness of the sound. You may have
heard them in Activity 38. We shall be looking into the
phenomenon of beats in Section 12.
In Activity 38, the pitch we arrived at after rising through two
major thirds from C is, properly speaking, G# and not Ab. On the
other hand, the pitch that is a major third below C is properly
speaking Ab and not G# (C is the third note of a major scale of Ab).
Thus, if we used the exact frequency ratios of Table 1, there would
be a difference between G# and Ab. To illustrate this, consider again
a starting frequency of 1 Hz. Going up by two major thirds takes us to:
1 Hz × 1.25 × 1.25 = 1.5625 Hz.
Now let us start from a frequency of 2 Hz (one octave above 1 Hz) and
go down by a major third. This gives us a frequency of
2 Hz ÷ 1.25 = 1.6 Hz
Thus G# and Ab have close but different frequencies.
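The arithmetic of this section can be checked exactly with fractions. The Python sketch below repeats the calculations for three stacked major thirds and for the G#/Ab pair, starting from the same nominal 1 Hz. The variable names are hypothetical and the code assumes nothing beyond the 5:4 ratio from Table 1.

```python
from fractions import Fraction

MAJOR_THIRD = Fraction(5, 4)

three_thirds = MAJOR_THIRD ** 3   # C up to E, then G#, then the would-be C
g_sharp = 1 * MAJOR_THIRD ** 2    # two major thirds above 1 Hz
a_flat = 2 / MAJOR_THIRD          # a major third below 2 Hz

print(three_thirds, float(three_thirds))   # 125/64 1.953125
print(float(g_sharp), float(a_flat))       # 1.5625 1.6
print(float(a_flat / g_sharp))             # 1.024
```

Three exact major thirds reach 1.953125 Hz rather than 2 Hz, and Ab lies above G# by a ratio of 128:125, or 1.024:1.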
The tuning system of equal temperament, which is discussed in the
next section, evolved to overcome the anomalies that have
been revealed in this section.

A chain of interlocking perfect fifths can supply twelve pitch
classes of the chromatic scale. However, twelve interlocking perfect
fifths are very slightly larger than seven octaves; three major thirds
are very slightly less than an octave, and the pitches of enharmonic
pairs, such as G# and Ab, are slightly different. The harmonic series,
therefore, is not a practical way to define frequency ratios for
musical intervals, except for the octave.
11.3 Equal temperament
The previous section has demonstrated the audibility of the anomalies
arising from the use of the ratios in Table 1 to define intervals. Can
anything be done to rescue the situation? In general, musicians have
compromised in various ways, and indeed it is possible to make
sufficient adjustments on most string and wind instruments during
playing to correct for certain amounts of ‘out-of-tuneness’. With a
keyboard instrument such as the piano, organ or harpsichord,
however, this is not possible, since the pitch of the notes has been set
by the person who tuned the instrument. Normally, there is just one
key that has to serve, for example, as both G# and Ab. How, therefore, is
the pitch of the note which is played by this key to be determined?
There does not seem to have been much of a problem in Mediaeval
and Renaissance times, because only a restricted range of keys (that is,
home notes) was in use, and so you tuned the notes as was most
appropriate for your commonly used keys. In the case of our G#/Ab
example, the note would have been biased towards G#, because that
note was employed in the commonly used key of A minor, while there
was very little use of the keys in which the combination of Ab and C
was required.
In later periods various tuning compromises have been adopted which
have entailed putting some intervals out of ‘perfect’ tuning. For
example, some major thirds might be true (that is, in a ratio of 5:4),
whereas other major thirds would not. The ideal was to keep as many
intervals as true as possible. This process of adjusting pitches and
intervals away from their true values is called tempering. Many kinds
of tempering bias the tuning towards a particular key or home note. In
such systems a set of pitches that sounds acceptable in music with G
as its home note might sound less acceptable with A as its home note.
These difficulties became more acute as composers began to use a
wider range of modulation – that is, changing key or home note within
a piece. As this happened, the carefully set bias for the original key
became a cacophony if your music moved to more distantly related
keys. (A distantly related key was one whose scale had very few
notes in common with the scale of the original key.) All kinds
of tunings were worked out, in theory and in practice, to accommodate
the musical needs as they developed. The particular combination of
pitches in a tuning system is known as a temperament.
Towards the nineteenth century, musicians’ needs were developing in
a direction that required complete freedom of modulation – that is, the
ability to move from any major or minor key to any other major or
minor key in the course of a piece of music. This was accommodated
by a system of equal temperament, in which the ‘out-of-tuneness’ was
spread equally around the pitch-relationships. In equal temperament,
only the octaves remain perfectly in tune: every other interval is
deliberately ‘out of tune’ to some degree. Perfect fifths are all flattened
by a regular amount: they are narrowed, in fact, from a ratio of 3:2 to
2.9966:2. One consequence of this is that twelve perfect fifths do now
become equivalent to seven octaves. Major thirds, on the other hand,
are expanded; instead of a ratio of 5:4 they have a ratio of 5.04:4 (or
1.26:1 instead of 1.25:1). Consequently equally tempered major thirds
do truly divide an octave into three parts. That is 1.26 × 1.26 × 1.26 =
2 (very nearly). Another consequence of equal temperament is that the
enharmonic equivalents (G#/Ab, A#/Bb, F#/Gb, etc.) do actually have
the same pitch. Theoretically, however, they remain distinct and are
notated differently.
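The tempered ratios quoted above can be reproduced from the standard definition of equal temperament, in which the octave is divided into twelve semitone steps of equal frequency ratio. That definition is not spelled out in the text, so the Python sketch below should be read as an illustration resting on that assumption. The function name is hypothetical.

```python
def tempered_ratio(semitones):
    """Frequency ratio of an equal-tempered interval (assumes 12 equal steps per octave)."""
    return 2 ** (semitones / 12)


fifth = tempered_ratio(7)   # a perfect fifth spans seven semitone steps
third = tempered_ratio(4)   # a major third spans four semitone steps

print(round(fifth, 4), round(2 * fifth, 4))          # 1.4983 2.9966
print(round(third, 4), round(4 * third, 2))          # 1.2599 5.04
print(round(fifth ** 12, 6), round(third ** 3, 6))   # 128.0 2.0
```

The first two lines of output match the ratios 2.9966:2 and 5.04:4 given above. The last line confirms that twelve tempered fifths equal seven octaves and that three tempered major thirds equal one octave.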
Equal temperament has established itself as standard for pianos,
organs, electronic keyboards, guitars, woodwind instruments – in fact,
for almost any instrument where the relationship between pitches is
determined by the construction of the instrument or the way it has
been set up. The solution it provides has, of course, been bought at a
price: there are no ‘pure’ thirds and fifths, for example. Thus, you may
be able to hear a kind of throbbing as you sustain the interval of a third
or a fifth on a piano or organ because the interval is not the precise one
given in Table 1. (The effect is usually easier to hear in the middle-to-
lower registers than in the treble.) If you hear this throbbing, don’t jump
to the conclusion that the piano tuner has done a bad job! It is a direct
consequence of the use of equal temperament. The throbbing is the
phenomenon of beats, which we shall look at in the next section.
We are now used to the sound of equally tempered instruments, and in
general do not have the sense of them being ‘out of tune’. The realities
of the situation sometimes become apparent, nevertheless, when
equally tempered instruments play in ensemble with other instruments
and voices. To take a simple example, you can sometimes hear the
difference between the piano and a violin, because the violinist
naturally tunes the strings in ‘perfect’ fifths, which are wider than the
tempered fifths of the piano.
Another change that came with the adoption of equal temperament
was the removal of ‘key colour’. On instruments with non-equal
tunings there was naturally a bias towards some keys in which the
intervals were purer: when you moved away to another key the pitch-
system became noticeably rougher, and then smoothed out again as
you returned to the original key. Many of the descriptions that
attribute characters to particular keys (Beethoven, for example, called
the key of F minor ‘barbarous’) may in fact be the consequence of the
different effects in the context of unequal temperaments. It’s also
worth remembering that equal temperament did not become the
generally accepted standard until quite late in the nineteenth century,
so an enormous repertory of Western music was not created with the
assumptions about pitch relationships that we take for granted today.
In spite of the apparent supremacy achieved by equal temperament, the
use of unequal temperaments nevertheless survived on a small scale
among some musical enthusiasts into the twentieth century, and in fact
was taken up again by the movement towards ‘authentic’ performance
in the Early Music revival among performing groups in the last quarter
of the century. CD notes often state the system of temperament that is
used for the keyboard instruments on recordings by such groups.

Activity 39
The audio track for this activity consists of a twenty-minute
presentation of speech and music on the subject of temperament.
For reference, Table 2 lists the frequencies used in equal temperament
for the concert pitch standard of tuning (A4 = 440 Hz). The numbers in
this table have been rounded, so there is a small amount of
approximation in most cases. However, the effect of this
approximation is inaudible.

Activity 40
From Table 2, what is the frequency ratio for an equally tempered
whole-tone step, and how does it compare with the ratio of 9:8 given
by Table 1 for the ratio between A5 and G5?
Table 2 Notes and pitches (equal temperament)
Note name        Frequency
C4 (middle C)    261.63 Hz
C♯4              277.18 Hz
D4               293.66 Hz
D♯4              311.13 Hz
E4               329.63 Hz
F4               349.23 Hz
F♯4              369.99 Hz
G4               392.00 Hz
G♯4              415.30 Hz
A4               440 Hz
A♯4              466.16 Hz
B4               493.88 Hz
C5               523.25 Hz

You will notice in Table 2 that there are twelve semitone steps
from C4 to the pitch an octave higher, C5. In equal temperament,
each semitone step upwards corresponds to a multiplication of
the frequency by a factor of approximately 1.0594. Thus,
multiplying any starting frequency by this number twelve times takes
you to a frequency one octave above the starting frequency.
(Dividing by this number twelve times takes you down an octave.)
The factor 1.0594 is, in fact, an approximation to the twelfth root
of 2 (written as ¹²√2). The twelfth root of two is the number which,
when multiplied by itself twelve times, gives an answer of two.
Multiplying a starting frequency by this number twelve times doubles the
frequency. Thus, given a reference frequency for a pitch (in Concert Pitch
that is 440 Hz for A4), all the other frequencies are found by multiplying
or dividing by the twelfth root of 2 the required number of times.
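The procedure just described can be sketched in Python as follows. This is only an illustration, assuming the A4 = 440 Hz reference from the text; the names NOTE_NAMES and equal_tempered_table are hypothetical and chosen for this example.

```python
A4_HZ = 440.0  # concert pitch reference, as stated in the text

# One octave of note names in semitone order, matching the rows of Table 2.
NOTE_NAMES = ["C4", "C#4", "D4", "D#4", "E4", "F4", "F#4",
              "G4", "G#4", "A4", "A#4", "B4", "C5"]
A4_INDEX = NOTE_NAMES.index("A4")


def equal_tempered_table(reference_hz=A4_HZ):
    """Return {note name: frequency in Hz} built from the reference pitch."""
    semitone = 2 ** (1 / 12)
    # A negative exponent divides by the semitone factor (notes below A4);
    # a positive exponent multiplies by it (notes above A4).
    return {name: reference_hz * semitone ** (i - A4_INDEX)
            for i, name in enumerate(NOTE_NAMES)}


table = equal_tempered_table()
for name, hz in table.items():
    print(f"{name:<4} {hz:7.2f} Hz")
```

Rounded to two decimal places, the printed values should agree with Table 2. Passing a different reference frequency to equal_tempered_table shifts every pitch by the same ratio.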

Adjusting intervals away from the precise ratios of the harmonic series
to overcome anomalies is known as tempering. Many schemes of
tempering have been devised. Generally they favour certain keys over
others. Equal temperament, however, treats all keys equally, allowing
free modulation from one key to any other without any attendant
change of timbre. In equal temperament, only intervals of an octave
have frequency ratios that match the prescription of the harmonic
series. All other intervals are tempered to a greater or lesser amount.
One consequence of equal temperament is that the notes of
enharmonic pairs have the same pitch. However, the difference is
retained in music theory, and they are notated differently. In equal
temperament, semitone steps are defined by a constant frequency
ratio. The multiplicative factor for a semitone step in equal
temperament is the twelfth root of 2 (¹²√2).
13 BEATS
13.1 Close frequencies
In the last section, mention was made of a ‘throbbing’ sound that might
be heard when intervals are not quite perfectly in tune (that is, not
quite as given by the harmonic series). The term used for this effect is
beating. It is most readily demonstrated using sine waves with pitches
that are close in frequency, but it is not exclusively associated with
sine waves nor with close frequencies.

Activity 41
Launch the software associated with this activity. Play simultaneously
the pitches of 440 Hz and 442 Hz and listen to the result. Try the effect
of adjusting the second frequency downwards to 441 Hz, and upwards
to 449 Hz.
The throbbing or beating effect of two close-frequency sine waves
would have been very clear in Activity 41. You would also have found
that the closer the two frequencies became, the slower became the rate
of beating, and the further apart they became, the faster the rate of
beating.
Figure 32 is a somewhat exaggerated view of a waveform pulsating in
the way you have just heard. Clearly there is an underlying sine wave,
the amplitude of which is varying cyclically. The cyclical variation of
amplitude is the pulsation or beating you heard.
Figure 32 Pulsating sinusoidal waveform (horizontal axis: time)
It is interesting to note that there is only one underlying sine wave in
Figure 32, yet two sine waves were added to create it. The frequency
of this underlying sine wave is the average of the two sine waves that
were added. Thus, when you added sine waves of 440 Hz and 442 Hz,
the frequency of the resulting pitch was 441 Hz, or (440 Hz + 442 Hz)/
2. The rate of pulsation is the difference of the two frequencies. Thus,
with 440 Hz and 442 Hz tones, the pulsation rate is 442 Hz – 440 Hz,
or 2 Hz, meaning two beats per second. Clearly the closer two
frequencies become to each other, the slower the rate of beating.
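The two rules above (average frequency for the underlying sine wave, difference frequency for the beating) follow from a standard trigonometric identity: sin A + sin B = 2 sin((A + B)/2) cos((A – B)/2). The following Python sketch checks the identity numerically. It is an illustration only, assuming two sine waves of equal amplitude; the function names are made up for this example.

```python
import math


def two_tone(f1, f2, t):
    """Sum of two equal-amplitude sine waves at time t (seconds)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)


def carrier_times_envelope(f1, f2, t):
    """Same signal written as one sine wave with a slowly varying amplitude."""
    average = (f1 + f2) / 2          # frequency of the underlying sine wave
    half_difference = (f2 - f1) / 2  # frequency of the amplitude envelope
    envelope = 2 * math.cos(2 * math.pi * half_difference * t)
    return envelope * math.sin(2 * math.pi * average * t)


def beat_rate(f1, f2):
    """Beats per second: the envelope peaks in size twice per envelope cycle."""
    return abs(f2 - f1)


def underlying_frequency(f1, f2):
    """Frequency of the single pitch that is heard pulsating."""
    return (f1 + f2) / 2


print(beat_rate(440, 442), underlying_frequency(440, 442))
```

For 440 Hz and 442 Hz this gives 2 beats per second on an underlying 441 Hz tone, as in the text. The envelope itself has a frequency of only 1 Hz, but loudness depends on the size of the envelope rather than its sign, so it swells twice in each envelope cycle.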
The phenomenon of beating is often encountered in music. For
instance, it can be heard when instruments playing in unison are not
quite in tune with each other. Beats are also often used practically
when tuning instruments, especially stringed instruments and stringed
keyboard instruments such as pianos and harpsichords. On a violin or
guitar, for instance, the player might tune one string to a tuning fork or
to some other standard pitch. Then the other strings are tuned to the
already tuned string. To do this the player finds the same note as the
tuned string on another string and listens for beats when the two notes
are played simultaneously. The string being tuned is adjusted until the
beat-rate falls to zero.
We saw and heard earlier that increasing the frequency separation of
the two sine waves increased the beating rate. The next activity
investigates how far this phenomenon continues with increasing
separation.

Activity 42
Run the software for this activity and continue to increase the
separation between the two frequencies.
As the frequency separation is increased, not only does the frequency
of the beating increase, but what I have referred to as the underlying
sine wave starts to sound less and less ‘pure’. By the time the
frequency separation is around 15 Hz it sounds quite rough, and
beating becomes less apparent. Further increasing the separation of the
frequencies eventually leads to a loss of this roughness, and the
discernible presence of two pitches, corresponding to the two
frequencies being combined. Figure 33 summarises these findings.
Figure 33 The emergence of two distinct pitches as frequency separation increases
(horizontal axis: frequency difference f2 – f1, with 0 and 15 Hz marked; regions
labelled: single tone with beats, smooth then rough; two tones, smooth; the
critical bandwidth is marked)
The explanation for these phenomena lies in the operation of the ear
rather than in the physics of sound, and in particular the significance
of the critical bandwidth in Figure 33 will become clear later, in
Chapter 5, when we look at the working of the ear. (The critical band
has an upper and a lower limit; the lower limit is to the left of the zero in
Figure 33.)
The phenomenon of beats and roughness created by closely spaced
notes played simultaneously has many musical consequences. For
instance, sine waves at the frequencies of G2 (98 Hz) and G♯2 (103.83 Hz)
are close enough for beats to be very audible when they are played
simultaneously. Whether the beats are audible in a musical context
depends very much on the instrument(s) playing the notes. If they are
played simultaneously on the piano the throbbing is not so easy to
hear; however, on a guitar, for example, beats can be heard and felt by
the player when A2 (the second-to-lowest string) is played
simultaneously with G♯2 on the bottom string.
Because of the non-sinusoidal wave forms produced by most musical
instruments, there is plenty of scope for beating between a harmonic
of one note and a harmonic of another. For instance, although a
fundamental frequency of 100 Hz is far enough away from a
fundamental of 51 Hz for there to be no beating between them, if the
51 Hz fundamental has a prominent second harmonic (102 Hz), then
there may be beating between it and the 100 Hz fundamental. The rate
of beating will be 102 Hz – 100 Hz = 2 Hz.
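The calculation in this example can be written as a small Python function. This is a sketch under the assumption stated in the text, namely that the beating arises between one harmonic of each note; the name harmonic_beat_rate is hypothetical.

```python
def harmonic_beat_rate(fundamental_a, harmonic_a, fundamental_b, harmonic_b):
    """Beat rate in Hz between a chosen harmonic of note A and one of note B.

    harmonic_a and harmonic_b are harmonic numbers (1 is the fundamental).
    """
    frequency_a = harmonic_a * fundamental_a
    frequency_b = harmonic_b * fundamental_b
    return abs(frequency_a - frequency_b)


# Second harmonic of a 51 Hz note against the fundamental of a 100 Hz note.
print(harmonic_beat_rate(51, 2, 100, 1))
```

With the values from the text this prints 2, that is, two beats per second.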

Activity 43
(a) A beating at 3 Hz is heard between two close-frequency sine
waves. The lower-frequency sine wave has a frequency of 180 Hz.
What is the frequency of the higher-frequency sine wave?
(b) The second harmonic of wave A is slightly lower in frequency
than the third harmonic of wave B. Wave A has a fundamental
frequency of 90 Hz. The rate of beating is 3 Hz. What is the
fundamental frequency of wave B?
13.2 Beating in near-harmonics

Beating is not only found with closely spaced frequencies. It is also
found with widely spaced frequencies that are nearly, but not quite,
harmonically related. For instance, when a frequency of 250 Hz is
combined with one of 502 Hz, a beating at the rate of 2 Hz can be
heard. This effect is the cause of the beating that can sometimes be
heard when an interval is played on a piano tuned in equal
temperament. Piano tuners in fact sometimes need to count the beat
rate to ensure that particular pairs of notes have the right degree of
‘out of tuneness’ for equal temperament.
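To give a feel for the beat rates a tuner might count, here is a hedged Python sketch. It assumes that for an interval whose pure ratio is p:q (upper to lower), the beat rate is the difference between q times the upper frequency and p times the lower frequency, which reproduces the 250 Hz and 502 Hz example above. It also assumes ideal notes in exact equal temperament with A4 = 440 Hz; real piano strings will differ somewhat. All names are illustrative.

```python
def mistuned_interval_beat_rate(lower_hz, upper_hz, p, q):
    """Beat rate in Hz for an interval close to the pure ratio p:q."""
    return abs(q * upper_hz - p * lower_hz)


def equal_tempered(semitones_from_a4, a4_hz=440.0):
    """Frequency of the note a given number of semitones above (or below) A4."""
    return a4_hz * 2 ** (semitones_from_a4 / 12)


c4 = equal_tempered(-9)
e4 = equal_tempered(-5)
g4 = equal_tempered(-2)

print(mistuned_interval_beat_rate(250, 502, 2, 1))           # example from the text
print(round(mistuned_interval_beat_rate(c4, g4, 3, 2), 2))   # tempered fifth C4 to G4
print(round(mistuned_interval_beat_rate(c4, e4, 5, 4), 2))   # tempered major third C4 to E4
```

On these assumptions the tempered fifth C4 to G4 beats a little under once per second, while the tempered major third C4 to E4 beats roughly ten times per second, which is one reason the tempering of thirds is more noticeable than that of fifths.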

Two sine waves with close frequencies f1 and f2 (e.g. two or three hertz
apart), when played simultaneously, are heard as a single pitch with a
pulsating amplitude. This pulsation is known as beating. The rate of
pulsation is equal to the difference between the frequencies, f1 – f2
or f2 – f1 (depending which of f1 or f2 is the greater). As the separation
of the frequencies increases, the pulsating pitch acquires a rough
quality, and the beating becomes less noticeable. Eventually, with
increasing separation, the frequencies are heard as two separate
pitches.
Beating is also heard when two sine waves have frequencies that are
close in ratio to those in the harmonic series. Hence, beating is
sometimes heard in the intervals of instruments tuned to equal
temperament.
Timbre is the characteristic sound of an instrument or voice.
(Section 1)

The sound waves produced by instruments are usually not sinusoidal.
The non-sinusoidal character of the wave is related to the
instrument’s timbre. (Section 1)

A harmonically related series of frequencies has this pattern:
f1  2f1  3f1  4f1  5f1 ....
Frequency f1 is called the fundamental frequency. (Section 3.2)

Sine waves with harmonically related frequencies are called
harmonics. Harmonics are numbered: the first harmonic has frequency
f1, the second harmonic has frequency 2f1, and so on. (Section 3.2)

Fourier’s theorem states that, with some restrictions, a
non-sinusoidal periodic wave can be created by combining harmonics
with unvarying amplitudes. The non-sinusoidal periodic wave so
created has the same period as the first harmonic. (Sections 3.1
and 3.2)

Changing the relative amounts of the harmonics changes the shape and
timbre of the resultant periodic wave. (Section 3.1)

Changing the phase relationship of harmonics changes the shape of the
resultant wave, but providing the phase relationship is not changing,
the effect is inaudible. (Section 3.3)

Combining harmonics to create a non-sinusoidal periodic wave is
called Fourier synthesis. Analysing a non-sinusoidal periodic wave
into its constituent harmonics is called Fourier analysis.
(Section 3.4)

A frequency spectrum is a range of frequencies. (Section 4.1)

A frequency spectrum can be represented as a frequency spectrum graph
(also known as an amplitude spectrum). (Section 4.1)

The frequency spectrum of a periodic non-sinusoidal wave is a line
spectrum. (Section 4.1)

In a line spectrum, each line represents a harmonic, and the height
of the line indicates the amplitude of the harmonic. (Section 4.1)

Graphical representations of frequency spectra are examples of
frequency-domain representations. Waveform graphs plotted against
time are time-domain representations. (Section 4.2)

A complete representation of a frequency spectrum would contain
information about frequencies, amplitudes and phases. (Section 4.2)

A frequency bandwidth is the range of frequencies in a spectrum. For
a line spectrum, the bandwidth is the highest frequency minus the
lowest frequency. (Section 4.3)

Synthesising an instrumental timbre convincingly is complex, and
generally requires more than simply adding harmonics according to a
simple scheme. (Section 5.1)

The wave forms produced by musical sounds typically have an attack
phase (onset), a steady-state phase and a decay phase (offset). The
attack phase is very characteristic of an instrument’s or voice’s
timbre. (Section 5.2)

The frequency spectrum of a musical sound is typically a time-varying
spectrum. (Section 5.2)

Periodic non-sinusoidal waves are said to have a repetition rate
rather than a frequency. (Section 6)

Repetition rate is the reciprocal of the period, and is measured in
hertz. (Section 6)

An incomplete harmonic series may be harmonically related to more
than one frequency. (Section 6)

A series of harmonically related frequencies from which the
fundamental frequency has been removed may be heard to have a pitch
corresponding to the missing fundamental frequency. This is the
phenomenon of the missing fundamental. (Section 7)

Harmonically related sine waves combine to create a fused tone which
has a pitch corresponding to the frequency of the fundamental.
(Section 7)

A formant is a fixed range of frequencies over which harmonics are
emphasised. (Section 8)

Instruments and voices have characteristic formants. (Section 8)

A pitch class is a set of pitches that share the same letter name.
(Section 9.2)

The odd numbered pitches of a harmonic series can be used to define
several pitch classes relative to the pitch class of the fundamental.
(Section 9.2)

The seventh and eleventh harmonics are markedly out of tune in
relation to the rest of the series. (Section 9.2)

The pitch class of the third harmonic corresponds to the fifth note
of a major scale. (Sections 9.3 and 10.1)

The pitch class of the fifth harmonic corresponds to the third note
of a major scale. (Sections 9.3 and 10.3)

A major triad consists of the home note (the tonic), the third note
and the fifth note from a major scale. (Sections 9.3, 10.1 and 10.3)

A major triad is the most basic form of the major chord. A major
chord consists of at least one representative of each of the pitch
classes of the major triad. (Section 9.3)

An interval is a step in musical pitch. The harmonic series can be
used to define frequency ratios for intervals. (Section 10.1)

The perfect fifth and the major third are especially important
musical intervals. (Sections 10.1 and 10.3)

The perfect fifth is the interval between the third and second
harmonics. It has a frequency ratio of 3:2. It is also the interval
between the home note of a major (or minor) scale and the fifth note
of the scale. (Section 10.1)

The major third is the interval between the fifth and fourth
harmonics. It has a frequency ratio of 5:4. It is also the interval
between the home note of a major scale and the third note of the
scale. (Section 10.3)

A chain of interlocking perfect fifths can supply twelve pitch
classes of the chromatic scale. (Section 11.1)

A chain of ascending perfect fifths with the ratio 3:2 eventually
arrives at a pitch which is slightly higher than seven octaves above
the starting pitch. A chain of three ascending major thirds using the
ratio 5:4 covers slightly less than one octave. (Sections 11.1
and 11.2)

Adjusting the frequency ratios for intervals away from those of the
harmonic series is called tempering. (Section 12)

Most schemes of tempering (i.e. most temperaments) favour certain
keys over others. (Section 12)

Equal temperament removes the anomalies inherent in using the simple
ratios of the harmonic series for intervals of less than an octave.
It treats all keys equally, allowing free modulation from one key to
any other. (Section 12)

In equal temperament, the only intervals whose frequency ratios agree
exactly with those of the harmonic series are octaves. Enharmonic
pairs have the same pitch. The semitone step is defined by a constant
multiplicative factor of the twelfth root of 2 (¹²√2). (Section 12)

Two sine waves with close frequencies f1 and f2, when played
simultaneously, are heard as a single pitch that beats at the rate
equal to the difference between the two frequencies. (Section 13.1)

Beating is also heard with frequencies that are close in ratio to
those in the harmonic series. Beating is sometimes heard in the
intervals of equal temperament. (Section 13.2)

Activity 2
(a) If f = 196 Hz the period is (1 ÷ 196) s = 5.1 ms.
(b) One cycle of the waveform can conveniently be taken as
comprising the wave from one peak to the next (Figure 34).
Figure 34 One cycle of the violin wave form (horizontal axis: time in
milliseconds, from 0 to 20; one cycle is marked)
From the horizontal axis we see that one cycle takes just over 5 ms. To
get a more accurate answer would require careful measurement of the
amount by which the separation is greater than 5 ms. In fact, careful
measurement shows that the period is certainly in the region of 5.1 ms, so
within the tolerance to be expected when reading such graphs, we can
say that the period is the same as in part (a). In fact, it is often (but not
always) true that the period of a wave produced by an instrument is
the same as the period of a sine wave tuned to the same pitch.
Activity 3
Yes, all these frequencies are whole-number multiples of 150 Hz.
Activity 4
The first six terms in this series are these multiples of 98 Hz.
(1 × 98 Hz), (2 × 98 Hz), (3 × 98 Hz), (4 × 98 Hz), (5 × 98 Hz), (6 × 98 Hz)
These work out as follows:
98 Hz, 196 Hz, 294 Hz, 392 Hz, 490 Hz, 588 Hz
Activity 5
No, these frequencies are not harmonically related to a fundamental of
100 Hz. For instance, there is no whole number by which 100 Hz can be
multiplied to give any of the frequencies in this series apart from the first.
Activity 6
This fundamental frequency is 100 Hz, so the next four above 800 Hz
are 900 Hz, 1000 Hz, 1100 Hz and 1200 Hz.
Activity 9
A tuning fork’s waveform is a pure sine wave (to a very close
approximation), so there is only one frequency in its spectrum, at
440 Hz. Hence the spectrum is as shown in Figure 35.
Figure 35 Frequency spectrum of a tuning fork tuned to 440 Hz (vertical
axis: amplitude; horizontal axis: frequency in Hz; a single line at 440 Hz)
Activity 10
The highest frequency present is 450 Hz. The lowest frequency is
75 Hz. Hence the bandwidth is (450 – 75) Hz = 375 Hz.
Activity 16
Given a fundamental frequency of 200 Hz, the first four harmonics
have the following frequencies:
200 Hz × 1 = 200 Hz (first harmonic)
200 Hz × 2 = 400 Hz (second harmonic)
200 Hz × 3 = 600 Hz (third harmonic)
200 Hz × 4 = 800 Hz (fourth harmonic)
Thus harmonics 1 to 4 are present in the frequencies given.
Activity 19
When the fundamental is 200 Hz, the harmonics are at frequencies of
200 Hz, 400 Hz, 600 Hz, 800 Hz and 1000 Hz. The effect of the formant
is to emphasize the 800 Hz harmonic. Figure 36 shows the result.
Figure 36 Effect of the formant on a harmonic series with fundamental
200 Hz (vertical axis: amplitude; horizontal axis: frequency in Hz, from 0
to 1000)
Activity 20
(a) With a fundamental of 100 Hz, the eighth and ninth harmonics
were emphasised.
(b) With a fundamental of 200 Hz, the fourth harmonic was emphasised.
Activity 21
The fourth harmonic, with frequency 4f1, is an octave above the second
harmonic, and therefore two octaves above the fundamental. The eighth
harmonic, 8f1, is three octaves above the fundamental, and the sixteenth
harmonic, 16f1, is four octaves above the fundamental, and so on.
Activity 22
There are eight pitch classes. As we go up through the series adding
harmonics, we can ignore the even-numbered harmonics because they
do not contribute new pitch classes. Only the odd-numbered
harmonics contribute new pitch classes, and there are eight of those.
Activity 23
To establish the pitch classes from Figure 23, we simply take the note
names, remove the subscripts, and arrange the letter-names in
ascending order. This is the result:
G, A, B, C♯*, D, E, F* and F♯.
Activity 28
The twenty-second harmonic would be out of tune. It is an octave
above the eleventh, which is marked with an asterisk.
Activity 30
There are seven semitone steps. Figure 37 shows them.
Figure 37 Seven semitone steps comprise a perfect fifth (G to D, with the
steps numbered 1 to 7)
Activity 33
(a) A perfect fifth has a frequency ratio of 3:2, which is the same as
1.5:1. The upper note of a perfect fifth therefore has a frequency of
1.5 times the frequency of the lower. The lower frequency here is
6f1, so the upper frequency must be 9f1. Hence the ninth harmonic
is a perfect fifth above the sixth.
(b) In this case the upper frequency is 1.25 times the lower (because
the frequency ratio is 5:4, or 1.25:1). The lower frequency is 12f1,
so the upper frequency is 15f1. Hence the fifteenth harmonic is a
major third above the twelfth.
(c) No. The tenth harmonic has a frequency of 10f1. The frequency a
major third above this is 1.25 × 10f1, which is 12.5f1. This does not
appear in the harmonic series, being between the twelfth and
thirteenth harmonics.
Activity 34
(a) These are the pitches passed through on the way:
G – D – A – E – B – F♯ – C♯ – G♯ – D♯ – A♯/B♭ – F – C – G
You may have given some of these pitches their alternative ‘flat’
names. Figure 38 shows these perfect-fifth steps. If your answer was
different, you might like to check that there are seven semitone steps in
each of these perfect-fifth intervals.
Figure 38 The chain of perfect-fifth steps from G back to G
(b) As Figure 38 shows, it takes twelve perfect fifths in succession to
return you to the note G.
Activity 35
There are seven octaves from the bottom G to the top G.
Activity 36
Seven octave steps bring you to the figure of 128 (which is 2^7). Twelve
steps of a perfect fifth in a 3:2 ratio give you approximately
129.74634. If you went up through twelve ‘perfectly’ tuned fifths, you
would therefore end up with a note that was sharper in pitch than if
you went up through seven octaves.
Activity 37
Three steps of the ratio of 5:4 (or 1.25:1) starting from 1 Hz gives a
frequency of
1 Hz × 1.25 × 1.25 × 1.25 = 1.953 Hz
A single octave step gives a frequency of 2 Hz.
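The arithmetic in the answers to Activities 36 and 37 can be reproduced exactly with Python’s standard fractions module. This is a sketch for checking purposes only; the variable names are illustrative.

```python
from fractions import Fraction

PERFECT_FIFTH = Fraction(3, 2)  # pure (harmonic-series) perfect fifth
MAJOR_THIRD = Fraction(5, 4)    # pure (harmonic-series) major third
OCTAVE = Fraction(2, 1)

twelve_fifths = PERFECT_FIFTH ** 12   # Activity 36: chain of twelve fifths
seven_octaves = OCTAVE ** 7           # Activity 36: seven octaves
three_thirds = MAJOR_THIRD ** 3       # Activity 37: chain of three major thirds

print(float(twelve_fifths), "against", float(seven_octaves))
print("overshoot ratio:", float(twelve_fifths / seven_octaves))
print(float(three_thirds), "against", float(OCTAVE))
```

Using exact fractions avoids any rounding, so the comparison shows directly that twelve pure fifths overshoot seven octaves while three pure major thirds fall short of one octave.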
Activity 40
Any whole-tone step from Table 2 can be used. Taking the ratio of
D4:C4 we get 293.66:261.63, which is about 1.122425:1. The ‘just’ ratio
of 9:8 is equivalent to 1.125:1. Thus the equally tempered whole-tone
step is slightly smaller.
Activity 43
(a) The frequency difference is 3 Hz, because this is the rate of
beating. Hence the higher frequency must be 183 Hz.
(b) The second harmonic of wave A has a frequency of 180 Hz. It is
the lower frequency of a pair that beats at 3 Hz, so the higher
frequency must be 183 Hz. The 183 Hz frequency is that of the
third harmonic of wave B, so the fundamental frequency of wave
B must be 183 Hz ÷ 3 = 61 Hz.
LEARNING OUTCOMES
2 Derive the period and repetition rate of a non-sinusoidal periodic
wave from its graph. (Activity 2)
3 Explain in simple terms Fourier’s theorem, and explain the
distinction between Fourier synthesis and Fourier analysis.
4 Recognise whether a series of frequencies is harmonically related.
(Activities 3, 5)
5 Calculate the frequency of any harmonic given a fundamental, and
calculate the fundamental given a harmonic. (Activities 4, 6, 16)
6 Draw a line spectrum from appropriate frequency data, or interpret
frequency data from a line frequency spectrum. (Activities 9, 10)
7 Calculate bandwidth from a line frequency spectrum. (Activity 10)
8 Modify a line spectrum to show the effect of a formant, or deduce
the effect of a formant from appropriate frequency spectrum data.
(Activities 19, 20)
9 Discuss the main factors that affect timbre and explain why the
timbre of instruments cannot easily be recreated artificially.
10 Relate pitches to pitch classes. (Activity 23)
11 Perform simple frequency calculations involving the intervals of
the octave, perfect fifth and major third. (Activities 33, 36)
12 Identify members of the harmonic series which are related by
intervals of the octave, the perfect fifth and the major third.
(Activity 33)
13 Perform simple frequency calculations given a table of equally
tempered frequencies. (Activity 40)
14 Discuss some of the difficulties that arise when the harmonic series
is used to define musical intervals other than the octave.
15 Discuss the system of equal temperament and why it has evolved.
16 Perform simple calculations relating to rates of beating. (Activity 43)
Chapter 3
Sound and Time
CONTENTS
Aims of Chapter 3
1 Introduction
2 The time dimension in music
2.1 Relative durations
2.2 Allocation of time to notes
2.3 Beat and pulse
2.4 Beats and subdivisions
2.5 Subdividing unequally
2.6 Are beats always regular?
3 Time signatures
3.1 Accents and bars
3.2 Interpreting the time signature
3.3 Spacing of notes on the page
3.4 Variability of beats and accents
3.5 Compound time signatures
3.6 Rhythm
4 Expressivity in performance
4.1 The work of Carl Seashore
4.2 Categorical perception
5 Synchronising an ensemble
Summary of Chapter 3
AIMS OF CHAPTER 3
To explain the way the temporal aspect of music is represented in
notation, and how the temporal dimension of music is related to an
underlying pattern of beats and accents.
To introduce the concept of metre and the time signature as a way of
representing it.
To introduce the concept of categorical perception and relate it to the
notational categories of notated music.
To discuss expressivity in terms of categorical perception.
To discuss synchronisation of musicians in terms of underlying pulse,
and show some of the counting strategies used by musicians.
1 INTRODUCTION
So far we have spent quite a lot of time investigating the pitch and
frequency aspect of musical sounds. Music, however, unfolds in time;
musical notes have duration as well as pitch, and music has a tempo.
The temporal aspect of music is important in a course on music
technology for a number of reasons. A very straightforward reason is
that the complexities of human performance need to be appreciated if
one is trying to create or recreate music convincingly by electronic
means. Many of these complexities relate to the temporal aspect of
music: when notes begin and end, how they relate to an underlying
beat, and so on. Another reason is the widespread use of terminology
in music technology that relates to the temporal aspect of music, such
as ‘beat’, ‘rhythm’ and ‘tempo’. These have very particular meanings,
which I shall explain. Another issue relates to synchronisation and
co-ordination, which inevitably arise when musicians perform
together.
To discuss the temporal aspect of music I shall draw on terminology
and ideas from conventional music notation. I should, however, stress
that you do not need to be able to read music to follow the discussion
in this chapter, nor is it my intention to teach the reading of music, in
the sense of teaching how to perform it or how to transcribe it.
However, I shall spend quite a lot of time explaining the way the
temporal aspect of music is represented in notation, and by the end
you should understand some of the basic principles of the notation of
music’s temporal aspect. I shall also be using both British and
American terminology for note values. I am including American
terminology (‘quarter notes’, ‘eighth notes’, etc.) not because I think it
is generally preferable to British terminology, but simply because it
makes some of the arithmetic easier. If you are already familiar with
British terminology, you should have no problem. If you are not
familiar with either system, but find the American system easier to use
and would prefer to use it exclusively, that will be acceptable for the
purposes of this course.
I should also stress at the outset that what I will be discussing is the
modern conception of time in relation to music as it has existed in the
broadly European and American musical traditions for two or three
centuries. That may seem like a long time, but in terms of musical
history it is fairly recent. If you go back to the twelfth, thirteenth and
fourteenth centuries you find different conceptions of the temporal
and metrical aspects of music. Also, in music traditions outside Europe
and America, for instance in Java and India, you again find different
ways of conceiving the temporal aspect of music. Furthermore, I shall
only be concerned with metrical music, that is, music with a beat or
pulse (I shall be explaining these terms in the chapter). Many of the
concepts I discuss are not relevant to styles of music such as musique
concrète (music based on the manipulation of recorded natural sounds)
or electro-acoustic music.
2 THE TIME DIMENSION IN NOTATION

2.1 Relative durations
In conventional music notation, time is represented in a horizontal
direction (Figure 1). The notes are laid out on a system of horizontal
lines called a staff or stave, and events towards the right happen after
events towards the left. By ‘event’ I mean things such as the starts and
ends of notes, changes from loud to quiet, or vice versa, and so on.
Pitch is represented in the vertical direction, with lines and spaces
standing for pitches. Increasing height on the vertical dimension
signifies increasing pitch. (I shall be explaining some of the other
features of this figure later.)
Figure 1 Pitch and time occupy two dimensions in notation (vertical axis:
pitch; horizontal axis: time)
Perhaps the first important point to grasp about the temporal aspect of
notation is that the durations of notes or silences (usually called rests)
are relative. That is to say, nothing in Figure 1 indicates directly how
long each note lasts in units of time, such as seconds or fractions of a
second. Instead, the note shapes embody a system of time
relationships. As you can see, most of the notes in Figure 1 are of this
type: q (crotchet, or quarter note in American terminology). In
addition, at the end of the line, there is one of this type h (minim, or
half note), which is preceded by one of this type e (quaver, or eighth
note), which is in turn preceded by one of this type q. (dotted crotchet,
or dotted quarter note). We can regard any type of note as a basic unit
of time, but the crotchet (quarter note) is commonly used. In terms of
this note, the relative durations of the other notes are as follows:
h (minim, or half note) is twice as long as q (crotchet, or quarter note)
e (quaver, or eighth note) is half as long as q (crotchet, or quarter note)
q. (dotted crotchet, or dotted quarter note) is 50% longer than q
(crotchet, or quarter note)
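The relationships listed above can be checked with exact fractions. The following is only an illustrative Python sketch, not part of conventional notation; the names RELATIVE_TO_CROTCHET and relative_duration are invented for the illustration.

```python
from fractions import Fraction

# Durations measured in crotchets (quarter notes), taken from the list above.
RELATIVE_TO_CROTCHET = {
    "minim": Fraction(2),               # twice as long as a crotchet
    "crotchet": Fraction(1),            # the basic unit chosen here
    "quaver": Fraction(1, 2),           # half as long as a crotchet
    "dotted crotchet": Fraction(3, 2),  # 50% longer than a crotchet
}

def relative_duration(note, unit="crotchet"):
    """Return how many `unit` notes last as long as one `note`."""
    return RELATIVE_TO_CROTCHET[note] / RELATIVE_TO_CROTCHET[unit]

print(relative_duration("minim"))                      # 2
print(relative_duration("dotted crotchet", "quaver"))  # 3
```

Because any note value can serve as the basic unit, the second call measures the dotted crotchet in quavers rather than crotchets.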
The pattern of note values used here is just a selection from a larger set
of possible values. Box 1 ‘Standard note values’ gives the note values
that are in common use in notated music.
BOX 1 Standard note values

Table 1 shows the common note values. The top one is the longest
in common use. The downward extension could theoretically
continue indefinitely, but seldom goes beyond a further two or
three steps.

Table 1 Note values in British and American terminology

Notation  American name       British name    Equivalent rest      Duration relative to
                                                                   whole note (semibreve)
w         whole note          semibreve       semibreve rest       1
h or H    half note           minim           minim rest           1/2
q or Q    quarter note        crotchet        crotchet rest        1/4
e or E    eighth note         quaver          quaver rest          1/8
x or X    sixteenth note      semiquaver      semiquaver rest      1/16
r or R    thirty-second note  demisemiquaver  demisemiquaver rest  1/32

Sequences of notes of the same value are sometimes joined together in
notation by a beam. Beaming can only be used for quavers (eighth notes)
or shorter-value notes, because the beam is basically an extension of
the hooks applied to the stems of these notes. Crotchets (quarter notes)
and minims (half notes), which have no hooks on their stems, cannot be
beamed.

The duration of any note can be extended by 50% by the addition of a
dot. In other words, a dotted note lasts as long as the note itself plus
a note of half its value. A second dot extends the duration by a further
25% of the original note value, so a double-dotted note lasts as long as
the note itself plus a note of half its value plus a note of a quarter
of its value.

The standard scheme of note values described here is based on repeated
halvings or doublings of time values. Subdivisions by other factors are
possible. A subdivision of a crotchet into three equal parts (a
triplet), for instance, is indicated by a figure 3 placed above or below
the group of three notes, and other note values can be subdivided
equally into three in the same way. Various extensions of the schemes
described here are used to allow for other subdivisions of notes, and
also to allow for asymmetrical subdivision, where, for instance, a note
might be subdivided into unequal parts.

BOX 2 The Metronome

The metronome is a mechanical or electronic device for beating time at
a specified rate. It was invented around 1812 by Diderich Winkel, but is
more often associated with Johann Maelzel, who refined and patented
Winkel’s invention. Until the late twentieth century, metronomes were
invariably clockwork devices consisting of a pendulum which swings from
side to side at a regular rate (Figure 2). Twice per cycle of
oscillation, the metronome emits an audible click. The rate of
oscillation, and hence the rate of clicking, is set by sliding a
counter-weight along the upper part of the pendulum (above the pivot).
The pendulum is graduated with numbers, which indicate the number of
beats per minute (b.p.m.) with the counterweight at that setting.
Electronic metronomes are now widely used. These tend to be smaller than
clockwork models (some can even be worn on the ear), have additional
features (such as flashing lights as well as, or in place of, the
clicks), and are rather more robust. Clockwork models, for instance,
must be placed on a horizontal surface and kept stationary to work
properly.

Figure 2 A metronome (labelled parts: adjustable counter-weight, pivot,
fixed weight)

Once a basic duration for any particular note value has been decided, the
durations of all the other notes in a piece (with some provisos) are fixed.
A composer can establish the duration of a particular note value by
means of a metronome mark (see Box 2 ‘The metronome’). A
metronome mark says how many of a particular type of note there are
in a minute. For instance, a metronome mark for Figure 1 might be
written as q = 120, meaning that a hundred and twenty crotchets
(quarter notes) would last a minute, and hence the duration of a
crotchet comes out as half a second. From this, the durations of all the
other notes follow. The metronome mark is a precise way of setting the
tempo or speed of a piece of music. Less precise ways of specifying
tempo involving Italian terms, such as presto (very fast), allegro (fast,
or lively), andante (moderate walking speed), adagio (leisurely), and
lento (slow) have a long tradition in classical music and are still widely
used, either in conjunction with a metronome mark or in place of one.
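The step from a metronome mark to durations in seconds can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name seconds_per_note and its parameters are invented here, and the dot rule is the one described in Box 1 (one dot adds 50%, a second dot a further 25%).

```python
from fractions import Fraction

# Note values as fractions of a whole note (semibreve), as in Table 1.
WHOLE_NOTE_FRACTION = {
    "semibreve": Fraction(1),
    "minim": Fraction(1, 2),
    "crotchet": Fraction(1, 4),
    "quaver": Fraction(1, 8),
    "semiquaver": Fraction(1, 16),
}

def seconds_per_note(note, mark_note, mark_per_minute, dots=0):
    """Seconds taken by `note` when `mark_note` occurs `mark_per_minute` times a minute."""
    mark_seconds = Fraction(60, mark_per_minute)
    # No dot: factor 1. One dot: 3/2. Two dots: 7/4.
    value = WHOLE_NOTE_FRACTION[note] * (2 - Fraction(1, 2 ** dots))
    return mark_seconds * value / WHOLE_NOTE_FRACTION[mark_note]

# The mark q = 120 used in the text: a crotchet lasts half a second.
print(float(seconds_per_note("crotchet", "crotchet", 120)))          # 0.5
print(float(seconds_per_note("minim", "crotchet", 120)))             # 1.0
print(float(seconds_per_note("crotchet", "crotchet", 120, dots=1)))  # 0.75
```

Exact fractions are used internally so that no rounding creeps in before the final conversion to seconds.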

What is the duration of a semiquaver (sixteenth note) relative to that of
a minim (half note)? ■

(a) A piece of music is specified as having a metronome mark of q = 90.
If the music lasts for five minutes and consists entirely of quavers
(eighth notes), how many notes are there?
(b) In a piece of music with a metronome mark of e = 120, how long
does each quaver (eighth-note) last?
(c) In the same piece of music as in (b), how long does a minim (half
note) last? ■
2.2 Allocation of time to notes

I have been discussing the durations of notes as though there were no
problem with the idea of a note having a particular duration: a note has
a beginning and an end, and the time from beginning to end is the
note’s duration. In practice, things are not always so straightforward.
For instance, with plucked instruments, such as guitars, mandolins,
harpsichords, etc., or those with hammered strings, such as the piano,
a note decays virtually from the moment it is played, and the ending of
the note is not always completely under the player’s control. Therefore
a note may become inaudible before its notated duration has been
reached. Additionally, at certain periods of musical history it has been
understood by players and composers that the precise duration of a note
is not always what is literally indicated by the notation. By convention,
the note might be played shorter than its written value, possibly because
this is considered more expressive. It is therefore more appropriate to
think of notation as a way of indicating an allocation of time to a note.
Whether the note fills its allocation depends on the kinds of factor I have
described above. If a composer thinks it important for a note to get its full
allocation, there are ways of indicating this in the notation. I mention
these points because attempts to realise notated music by mechanical or
electronic means have sometimes resulted in performances that obey the
time values in the score (and indeed the pitches) more literally than an
expressive human performance would, and which sound correspondingly
lifeless. I shall return to this issue later in the chapter.
2.3 Beats and pulse

You might well be wondering how a musician can accurately play
notes that, in relation to a basic note length, are twice as long, half as
long, three times as long, one-eighth as long, and so on.
A musician’s ability to judge relative durations is dependent on the
human ability to assess temporal regularity quite accurately, at
least in the short term. This ability is used almost instinctively by
groups of people when there is a need to move a heavy object by
synchronising several people’s efforts. One of the group might
count aloud something like ‘one, two, three, pull!’. The instruction
‘pull’ falls where ‘four’ would have fallen, and because everyone
has the same idea of where that is, the desired synchronisation is
achieved. Notice that the four equally spaced counts define three
equal time intervals (Figure 3).
Figure 3 Counting to achieve synchronisation (the four equally spaced
counts ‘One! Two! Three! Pull!’ mark off the first, second and third
time intervals along the time axis)
ACTIVITY 3 (PRACTICAL)

You might like to investigate your own time-keeping ability. For this
activity you will need a clock or watch with a seconds hand or a
seconds counter. Watch the seconds hand or counter, and count to
yourself ‘one, two, three,’ at one-second intervals. At ‘three’, look away
but continue to count. At ‘five’, look back at the seconds hand or
counter and see whether you are still synchronised.
Now try counting to ten. Look away at three, and look back when you
reach ten.
You may also like to try counting aloud as well as counting silently to
see if this gives you a more accurate result, and also to try tapping a
finger as you count.
Comment
I expect you found that you were quite accurate when counting to five,
but probably less so when counting to ten. However, you may have
found that with a little practice you could markedly improve your
results. Many people also find that making a physical movement, such
as tapping their foot or a finger, enables them to improve their
accuracy. ■
Virtually all conventional music has a pulse or beat, that is to say,
a sequence of evenly spaced markers of time, like the ticking of a
clock or the tapping of a foot. This beat may be ‘in the foreground’,
that is audible, and possibly a prominent feature of the music.
Listeners can synchronise themselves to it, as is clear from their
ability to tap their feet, clap their hands, or dance to the music.
Alternatively, the beat or pulse might operate in the background,
being in the heads of the performer(s) rather than actually audible
in the music. As far as the listeners are concerned, there may be no
discernible beat, but the performers themselves would have been
aware of it as a sort of imaginary clock or metronome ticking away.

In the audio track associated with this activity you can hear two pieces
of music, one where the beat is very evident, followed by a piece
which is famous for having a very indistinct beat. The first piece is
from the Trojan March by Berlioz; the second is an extract from the
prelude to Wagner’s opera Parsifal. Try tapping your foot or clapping
your hands to both.
Comment
You probably had no difficulty finding the beat in the first extract. With
the second example I expect you found it very difficult to tell where the
beat was, and you may have thought there was no beat at all. However,
the players and conductor of the piece would have been very well aware
of where the beat was. Because relatively few musical events coincide
with the beat, the listener cannot easily detect the background pulse. ■
I should introduce a word of caution here about the word beat, which
tends to be used in two senses in music. In one sense it means what I
have outlined above: regularly recurring, instantaneous markers of
time (for instance, hand claps, drum beats, clicks). In its other sense,
beat means the time interval between two consecutive markers of time.
It is important to grasp that the word has these two meanings, which
may occur within one sentence, for instance when a musician says
something like, ‘Start on the third beat and hold the note for four
beats.’ I will use both senses in this chapter, though it should be clear
from the context which is meant. Figure 4 summarises these two
meanings of ‘beat’.
Figure 4 ‘Beat’ as an instantaneous time marker and as a measure of
duration (pulse markers labelled beat 1, beat 2 and beat 3 along the
time axis; the interval between two consecutive markers is labelled
‘1 beat’)
2.4 Beats and subdivisions

The presence of regular beats is the key to the musician’s ability to
gauge the durations of notes. In conventional, notated music, a note
value is selected to represent the basic unit of the pulse. In other
words, a particular note value is made to represent the time interval
between two beats. This note value is said to represent the beat (using
the word ‘beat’ in the second of the two senses defined earlier). Often
the crotchet q (quarter-note) is used as the basic note value, but by no
means always. If this note value represents the time interval between
two consecutive beats, then a minim (half-note) corresponds to the
interval spanned by three consecutive beats (that is, two beat
intervals), as Figure 5 shows.
In Figure 5, the horizontal direction represents time. The crosses
represent equally spaced beats. Below these we see a crotchet (quarter-
note) starting on the first beat and ending on the second. Below this is
another note which starts on beat two and ends on beat four. This note
will be twice the length of the crotchet, so it is a minim (half-note).
Figure 5 Relative durations of a crotchet (quarter note) and a minim
(half-note) in terms of an underlying crotchet pulse (beats 1 to 4 are
marked along the time axis; the crotchet lasts from beat 1 to beat 2 and
the minim from beat 2 to beat 4)
The close similarity between Figures 5 and 3 is not coincidental. In
both, a series of equally spaced, momentary events (beats in one case,
counts in another) is used to define a series of equal time durations.
It is perhaps not so clear how one can assess the duration of notes that
are less than the gap between two beats. However, the problem is
simplified by the fact that there are very standard ways of subdividing
this gap. Thus, although there are theoretically infinitely many ways
of subdividing this gap, in practice a small number of very standard
subdivisions tends to be used. Furthermore, each of these standard
subdivisions has a characteristic sound, which musicians learn to
recognise. The following activity gives you a chance to create some of
the standard subdivisions yourself.
ACTIVITY 5 (PRACTICAL)

For this activity you will ideally need to be able to walk for several
paces without interruption. Walk at your normal walking pace, and say
to yourself ‘left, right, left, right...’ as you put down each foot. Now,
without adjusting your walking pace, insert the word ‘and’ between
each of the words ‘left’ and ‘right’. If you consider the ‘lefts’ and
‘rights’ to be beats, the ‘ands’ are falling half way between them.
If you are not able to try this activity while walking, you could
alternatively clap your hands at a steady rate, saying alternately
‘left, right,
left, right...’ in synchronism. When you have established a steady beat,
drop in ‘and’ without changing the rate at which you clap your hands.
Comment
You probably did not have much difficulty with this activity. In fact,
the ease with which you can subdivide a beat into two equal parts
probably accounts for why the most basic subdivision in conventional
music is a halving. ■
In terms of notation, what you were doing in Activity 5 is shown in
Figure 6.
Figure 6 (a) Representing a regular beat pattern using quarter-notes.
(b) Subdivision of the beat pattern into half-length notes
In (a) we set up a regular pattern of beats, and each beat coincides with
the commencement of a note. I am using the crotchet (quarter note) to
represent the time gap between two consecutive beats, although, as
always, any note value could be used. (The horizontal line after each
‘left’ and ‘right’ indicates that the sound of the word lasts through to
the next beat. The use of a line like this is common in notated vocal
music.)
In (b) each ‘left’ and ‘right’ lines up with those in (a), but between each
‘left’ and ‘right’ there is an ‘and’. In (b), the words are now represented
by quavers (eighth-notes). (We could also include a short horizontal
line after each ‘left’, ‘right’ and ‘and’, although this is not so necessary
when the notes are close together, as here.)
One thing that should be apparent from Figure 6, although you may
not have realised it during the activity, is that in line (b) the words
‘left’ and ‘right’ must be said more quickly to make room for
interpolated ‘ands’.
Other subdivisions of the beat, such as into three, four, or more equal
parts can be a bit tricky, but, once more, playing them or recognising
them is a matter of practice. Once the sound of a particular subdivision
has become familiar it then becomes almost second-nature for a
performer to play it. Sometimes music students use mnemonics to help
them get used to the sound of subdivisions, common ones being based
on words or phrases with the right number of syllables to give the
required subdivision. For instance, saying the words ‘higgledy,
piggledy, higgledy, piggledy, ...’ with one word at each step while
walking (or at each hand clap) gives you a regular three-part or triplet
division of a beat, as in Figure 7.
Figure 7 Subdividing a beat equally into three
In music where the beat is frequently subdivided into triplets, the
notation can be made tidier by using a dotted note to represent the
beat. Almost invariably a dotted crotchet (dotted quarter note) is used,
as in Figure 8. This gets rid of the need to put a figure ‘3’ above or
below the triplet groups.
Figure 8 Use of a dotted crotchet to represent the beat simplifies the
notation of triplet subdivision
To help get a regular four-part division of the beat, the word
‘caterpillar’ can be used, as shown in Figure 9.
Figure 9 Using the word ‘caterpillar’ to subdivide a beat equally into
four

It is quite unusual for a piece of music to use the same subdivision of
the beat from beginning to end. However, in the audio track for this
activity you can hear three pieces for solo cello by Bach which employ
an almost constant subdivision, with little change from beginning to
end. The subdivisions are the ones discussed in this section. The
pieces you hear are as follows:
(a) The first half of the Prelude from the Fourth Suite for
unaccompanied cello. This consists entirely of an even two-part
division of the beat (‘left-and-right-and...’).
(b) The Gigue from the Fourth Suite for unaccompanied cello. This
consists almost entirely of unbroken three-part divisions of the
beat (‘higgledy piggledy higgledy piggledy...’).
(c) The Prelude from the First Suite for unaccompanied cello. This
consists almost exclusively of continuous four-part divisions of
the beat (‘caterpillar caterpillar ...’). ■
We have looked in this section at the regular division of beats into 2, 3
or 4 equal subdivisions. These are what I call the ‘standard’
subdivisions. Larger numbers of subdivisions are possible, but are less
common. For instance, divisions into 5, 7, 11 or 13 subdivisions are
quite rare. Rather more common are divisions into 6, 8, 9 and 12
subdivisions. Such large numbers of subdivisions tend to be found
mainly in slow music. The reason that 6, 8, 9 and 12 are more common
than 5, 7, 11 or 13 is that they are multiples of the numbers we have
already considered in this section. They can therefore be thought of as
standard subdivisions of the standard subdivisions.
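The point about multiples can be made concrete with a small Python sketch. It is an illustration only; the name as_nested_standard is invented here, and the sketch simply looks for ways of writing a number of subdivisions as one standard subdivision (2, 3 or 4) applied to another.

```python
STANDARD = (2, 3, 4)  # the 'standard' subdivisions discussed in this section

def as_nested_standard(n):
    """Return the pairs (a, b) of standard subdivisions for which a * b == n."""
    return [(a, b) for a in STANDARD for b in STANDARD if a <= b and a * b == n]

for n in (5, 6, 7, 8, 9, 11, 12, 13):
    print(n, as_nested_standard(n))
```

The lists for 6, 8, 9 and 12 each contain a pair (for example 6 is a two-part division of a three-part division), whereas the lists for 5, 7, 11 and 13 are empty.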

Explain briefly what a beat is and explain its use as a way of specifying
the durations of notes or rests. ■
2.5 Subdividing unequally

All the subdivisions so far have been equal subdivisions. That is, we
have considered subdivision in two, three, four and more equal parts. It
is also possible, however, for subdivisions to be unequal. I will not
explore this much further, except to say that, once again, there tend to
be standard patterns. Figure 10 shows how two very common unequal
subdivisions are related to an equal two-part subdivision.
Figure 10 Common unequal subdivisions of a crotchet (quarter note)
At the top of Figure 10, at (a), we have two crotchets (quarter notes),
on the left and right. These are to be subdivided into two parts. Below
these, in (b), we have equal subdivisions of each crotchet into two
quavers (eighth notes). Each pair is beamed together. I have emphasized
the equality of the subdivisions by putting a 1:1 ratio under each
beamed pair, although a ratio such as this does not form part of the
musical notation.
Now, to make this subdivision unequal, we can either extend the first
note of each pair at the expense of the second, or extend the second
note at the expense of the first. These two schemes are adopted
respectively on the left and the right of Figure 10. Thus, on the left-
hand side of Figure 10, at (c), the first note is now twice as long as the
second (indicated by the 2:1 ratio), whereas on the right-hand side the
second note is twice as long as the first (indicated by the 1:2 ratio).
The pattern is continued in (d). On the left, the two notes are now in
the ratio 3:1 whereas on the right they are in the ratio 1:3.
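The ratios in Figure 10 translate directly into fractions of the crotchet. The Python sketch below is illustrative only; the name split_beat is invented, and durations are expressed as fractions of a whole note (semibreve), so that a crotchet is 1/4.

```python
from fractions import Fraction

def split_beat(ratio, beat=Fraction(1, 4)):
    """Split `beat` (a fraction of a whole note) into two parts in the given ratio."""
    first, second = ratio
    total = first + second
    return beat * Fraction(first, total), beat * Fraction(second, total)

for ratio in [(1, 1), (2, 1), (1, 2), (3, 1), (1, 3)]:
    first_part, second_part = split_beat(ratio)
    print(ratio, first_part, second_part)
```

For the 3:1 ratio the two parts come out as 3/16 and 1/16 of a whole note, that is, a dotted quaver followed by a semiquaver. For the 2:1 ratio the parts are two-thirds and one-third of the crotchet.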
As with even subdivisions, standard patterns such as these (and a few
others) are very common and have a characteristic sound which
musicians learn to recognise. The 2:1 ratio at (c), for instance, when
repeated in consecutive pairs of notes, has a characteristic ‘rumty
tumty rumty tumty’ sound that occurs in much British (and especially
English) folk music. The notation for this subdivision in Figure 10(c)
is rather awkward, as you probably noticed. There is a simpler way to
notate this subdivision which I shall mention later in connection with
compound time signatures.
The important point to grasp here is that there are standard patterns of
subdivision, which are very common and have characteristic sounds.
These standard patterns are based on simple numerical ratios. For the
purposes of this course you are not expected to be able to recognise
the sound of these subdivisions or to be able to perform them.
2.6 Are beats always regular?

It might seem perverse to ask whether beats are always regular, because
regularity is partly what defines a beat, as I have discussed it so far.
However, musical pulse is not always strictly regular. In the following
activity you can hear a piece of music where the pulse is fairly
variable.

In the audio track for this activity you can hear the famous Spanish
guitarist Andrés Segovia (1893–1987) playing Bach’s Gavotte from the
Partita in E major, originally for solo violin. You might like to assess
whether the pulse is regular.
Comment
Segovia was famous for his flexible attitude to the beat, and to the
subdivision of the beat, but his flexibility was not unusual when this
recording was made (1927). Many musicians of his era played with a
fairly flexible pulse, and this was considered to be musically
expressive. Nowadays performers tend to be less flexible. However, in
folk music and other non-classical genres you can still often hear quite
variable pulses. ■
3 TIME SIGNATURES
3.1 Accents and bars
When presented with a sequence of identical beats, equally spaced,
people generally hear them as unequally accented. A familiar example
is the ticking of a clock. The ticks of a clock are all equal, but are
usually differentiated by the listener as tick, tock, tick, tock, .... The
listener interprets the sounds as forming a recurring two-beat cycle:
‘tick, tock’.
With a little imagination, the same ticking of a clock can be heard as a
three-beat cycle: ‘tick, tock, tock, tick, tock, tock, ....’ Four-beat cycles
and five-beat cycles can also be imagined, and other numbers of beats.
What defines the start of each new cycle is a feeling of accentuation on
the first beat of the cycle, making that beat a ‘strong’ beat. A cycle,
then, consists of a pattern of strongly and weakly accented beats. For a
two-beat cycle it goes like this:
S W S W S W
where ‘S’ stands for ‘strong’ and ‘W’ for ‘weak’. For a three-beat cycle
the pattern goes like this:
S W W S W W
For a four-beat cycle it goes like this:
S W M W S W M W
(Here ‘M’ stands for ‘medium’, an accentuation intermediate between
strong and weak.)
It is usual to attach count-numbers to the beats of a cycle. For a two-
beat pattern, the count goes like this:
1 2 1 2 1 2
A two-beat cycle is often found in marches, the two beats
corresponding to ‘left’ and ‘right’.
A three-beat cycle is counted as follows:
1 2 3 1 2 3, etc
Such a pattern is found in waltzes and minuets.
A four-beat pattern, predictably enough, is
1 2 3 4 1 2 3 4
What all these patterns have in common is that beat 1 is the strongest
of each cycle.
The question of what an accent actually consists of is complex. The
example of the ticking clock shows that listeners can hear an accent in
sounds that are identical, and that they can choose to hear
accentuation in different ways. However, in actual musical
performance everyone should ideally be hearing the same pattern of
accentuation. A sense of a strong beat can be created in several ways,
one of which is to increase the loudness, but loudness is not the only
way. Anything that draws attention to a particular beat could be said to
be accenting it. However, if you are beating on a drum you have
limited scope for indicating an accent other than by changing the
loudness or by displacing the beat slightly.
Cycles of beats are grouped into bars. Each bar contains one cycle of
beats, the strongest beat coming first. Bars are indicated in music
notation by a vertical line, called a bar line. Here are some two-beat
cycles grouped into bars.
⏐1 2⏐1 2⏐1 2⏐1 2⏐
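The grouping of counts and accents into bars can also be generated mechanically. The following Python sketch is an illustration only: the names ACCENTS and write_bars are invented, the accent patterns are the ones given earlier in this section, and an ordinary ‘|’ character stands in for the bar line.

```python
# Accent patterns for the two-, three- and four-beat cycles described in the text
# (S = strong, M = medium, W = weak).
ACCENTS = {2: "SW", 3: "SWW", 4: "SWMW"}

def write_bars(beats_per_bar, bars):
    """Return two lines of text: the counts and the accents, grouped into bars."""
    counts = " ".join(str(n) for n in range(1, beats_per_bar + 1))
    accents = " ".join(ACCENTS[beats_per_bar])

    def barred(cell):
        # One copy of `cell` per bar, with a bar line before, between and after.
        return "|" + "|".join([cell] * bars) + "|"

    return barred(counts), barred(accents)

count_line, accent_line = write_bars(2, 4)
print(count_line)   # |1 2|1 2|1 2|1 2|
print(accent_line)  # |S W|S W|S W|S W|
```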

Write out a few cycles of three-beat and four-beat patterns, including
the bar lines, and indicate underneath the strong and weak beats. ■
The pattern of accents on the beats of a piece of music tends to transfer
itself to the notes that are heard on those beats. For instance, in a three-
beat cycle the notes falling on beat 1 are generally accented, whereas
those on beats 2 and 3 are generally unaccented. I use the word
‘generally’ because a performance where the accents fell only in
regular and predictable places could very soon seem boring and
unmusical, rather as poetry can become doggerel if its accentuation is
unremittingly the same from beginning to end. Some composers,
indeed, have written unbarred music, to deter performers from playing
a recurring pattern of strong and weak accents. Some of the piano
music of Erik Satie (1866–1925) and Federico Mompou (1893–1987)
falls into this category. More recent composers have also composed
unbarred music.
I said earlier that accentuation is not necessarily achieved by
increasing loudness. A low note followed by a sequence of high ones
can also appear to carry an accent, as can a long note followed by a
series of shorter ones.
3.2 Interpreting the time signature

Figure 11 is the notation of the main tune from the last movement of
Beethoven’s Ninth Symphony. As you can see, it is a vocal piece. The
words are from the writings of the German poet Friedrich Schiller
(1759–1805) and relate to ideas of universal brotherhood (to use the
terminology of the time), to which Beethoven was sympathetic. I have
included the words as an aid to following the music when you listen to
this piece. However, I think you will find it quite easy to follow. I have
added bar numbers to make the discussion below easier to follow.

In the audio track for this activity you can hear the ‘Ode to Joy’ theme
as it appears in the symphony, sung by a baritone singer. (It is actually
sung an octave lower than shown in Figure 11.) In the audio track there
is about one-and-a-half seconds of music before you hear the tune of
Figure 11. In this performance, each crotchet (quarter note) takes about
half a second. ■
Figure 12 shows the first few bars of Figure 11. At the start of the staff,
there is a clef, a key signature and a time signature.
Figure 11 ‘Ode to Joy’ theme from Beethoven’s Ninth Symphony
Figure 12 Basic nomenclature (labels: clef, key signature, time
signature)
I do not want to say too much about the first two, except to say that the
clef identifies the second line from the bottom of the staff as
representing the pitch G (specifically G4, using the pitch notation
introduced in Chapter 2). Hence the pitches of all the other lines and
spaces can be derived from it. Clefs other than the G clef are in use.
They assign other pitches to the lines or spaces of the staff.
The key signature defines the key, that is to say, the scale from which
the notes of the music are taken. Here it consists of two sharp signs,
but other numbers of sharps and flats are possible. (A key signature
never mixes sharp and flat signs.) The sharp signs here are attached to
the line corresponding to the pitch F5, and to the space corresponding
to a pitch of C5. This key signature means that all members of the pitch
class F (not just the F5) are sharpened by a semitone, and all members
of the pitch class C (not just C5) are sharpened by a semitone, unless
there is some instruction to the contrary later in the music. These two
pieces of information tell the performer that the music is in the key of
D major or B minor, because these are the only two keys where only the
notes F and C are consistently sharpened. (The scale of D major, for
instance, is D, E, F#, G, A, B, C#, D.) As it happens, this piece is in
D major rather
than B minor, but this fact can only be established either by listening
to the music (major and minor keys sound distinctly different) or by
looking at whether the note D or B acts as a ‘home’ note for the piece.
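The effect of this key signature on the written notes can be sketched in Python. This is an illustration only, under a simplifying assumption: it ignores any later ‘instruction to the contrary’ in the music, and the name apply_key_signature is invented here.

```python
def apply_key_signature(notes, sharpened=("F", "C")):
    """Sharpen every note whose pitch class is named in the key signature."""
    # A note is written as a letter followed by its octave number, e.g. 'F5'.
    return [n[0] + "#" + n[1:] if n[0] in sharpened else n for n in notes]

written = ["D4", "E4", "F4", "G4", "A4", "B4", "C5", "D5"]
print(apply_key_signature(written))
# ['D4', 'E4', 'F#4', 'G4', 'A4', 'B4', 'C#5', 'D5']
```

The sharps apply to every F and every C regardless of octave, which is why the check looks only at the letter name and not at the octave number.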
The symbol I want to concentrate on here is the time signature,
because it tells us two vital pieces of information:
1 How many beats there are per bar.
2 What note value has been selected to represent the beat (or, if you like,
what note value has a duration equal to the space between two beats).
The time signature for this piece is 4/4, said as ‘four-four’. Another
way to express the same idea is to say that the metre of the piece is
four-four. Notice in Figure 11 that the time signature is not repeated at
the start of each staff, although the clef and key signature are.
Although a time signature looks like a fraction, it is not a fraction in
the usual arithmetical sense. Nevertheless, for our purposes you can
think of the time signature being the result of a fraction multiplication,
like this:
\[ \frac{4}{4} = 4 \times \frac{1}{4} \]
This way of rewriting the time signature brings out the two pieces of
information listed above. First of all, we have the number of beats per
bar. This is the multiplying term, 4. There are four beats per bar in this
piece. The second piece of information is the note value chosen to
represent the beat. It is, in American terminology, the quarter note, or
crotchet.
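The same reading of a time signature can be written as a short Python sketch. It is illustrative only: the name read_time_signature is invented, and the reading is the one given in this section (upper number as beats per bar, lower number as the note value of the beat).

```python
from fractions import Fraction

def read_time_signature(signature):
    """Return (beats per bar, beat value, bar length), lengths in whole notes."""
    top, bottom = (int(part) for part in signature.split("/"))
    beat_value = Fraction(1, bottom)   # e.g. 1/4 for a quarter note (crotchet)
    bar_length = top * beat_value      # total duration of one bar
    return top, beat_value, bar_length

beats, beat_value, bar_length = read_time_signature("4/4")
print(beats, beat_value, bar_length)   # 4 1/4 1
```

The two numbers are kept separate rather than being reduced as a fraction would be, because, as noted above, a time signature is not a fraction in the usual arithmetical sense.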
A time signature of 4/4 is extremely common in all musical styles,
especially in popular music, show tunes, and jazz. In fact a metre of
four-four is often referred to as common time. Other very common time
signatures are 3/4 (used in waltzes) and 2/4 (much used in marches),
and 2/2, 3/2 and 4/2 (often used in hymns). Time signatures of 5/4 and
7/4 are relatively much less common, though quite often used in East
European folk music. Note that the American name ‘quarter note’ is
not derived from the fact that there are four of them in a bar of 4/4. The
name derives simply from the fact that it is a quarter of the longest
commonly encountered note value.
Bars 4, 8, 10, 11 and 16 of Figure 11 show that not all bars in this piece
consist of four crotchets (quarter notes). However, the durations of the
notes in each of bars 4, 8, 10, 11 and 16 add up to the same duration as
occupied by four crotchets (quarter-notes). You can check this with a
little arithmetic.
If we take bars 10 and 11, each consists of this sequence of note values:
q e e q q
Using American terminology the time value of this sequence is easily
calculated:
\[ \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + \frac{1}{4} + \frac{1}{4} = \frac{4}{4} \]
So these bars are each equivalent to a bar of four quarter notes, or
crotchets.
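This kind of check is easy to automate. The Python sketch below is an illustration only; the names VALUES, bar_total and fills_bar are invented, and the letters h, q and e stand for the minim, crotchet and quaver as in the text.

```python
from fractions import Fraction

# Note values as fractions of a whole note, using American terminology.
VALUES = {"h": Fraction(1, 2), "q": Fraction(1, 4), "e": Fraction(1, 8)}

def bar_total(symbols):
    """Add up the values of a space-separated sequence of note symbols."""
    return sum((VALUES[s] for s in symbols.split()), Fraction(0))

def fills_bar(symbols, signature="4/4"):
    """Check whether the notes add up to exactly one bar of the time signature."""
    top, bottom = (int(part) for part in signature.split("/"))
    return bar_total(symbols) == Fraction(top, bottom)

print(fills_bar("q e e q q"))  # True
```

A rest would be entered with the same value as the note it replaces, since only durations matter for this sum.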

Confirm that each of bars 4, 8 and 16 is equivalent to four crotchets
(quarter notes). ■
When doing this kind of addition, it is important to include rests. For
instance, in the following bar there is a rest which is equivalent to a
crotchet (quarter note):
q e e q, together with a crotchet (quarter-note) rest
The addition sum for this bar is
\[ \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + \frac{1}{4} + \frac{1}{4} = \frac{4}{4} \]

If the time signature remains the same throughout a piece of music,
will all the bars have equal durations? ■
Beethoven’s tune in Figure 11 begins on the first beat of the bar, but
plenty of tunes begin at a different part of the bar. Figure 13 shows a
familiar tune that begins on the last beat of the bar.
Figure 13 A tune that begins on the final beat of the bar
In Figure 13 I have added a couple of rests before the tune starts to clarify
where in the bar the tune starts. However, rests such as these are
generally not added to an incomplete bar at the start of a piece of music.

In the audio track for this activity you will hear extracts from five
pieces of music with a range of time signatures. Do not worry if you
cannot relate what you hear to the time signatures; in some of these
pieces it is quite tricky to work out the time signature. The extracts are
given only as examples of how these time signatures have been used.
The extracts are taken from:
(a) Elgar’s Pomp and Circumstance March No. 1, in 2/4.
(b) Tchaikovsky’s ‘Waltz’ from Eugene Onegin, in 3/4.
(c) Beethoven’s Trio in B flat for Piano, Violin and Cello, op. 97
(‘The Archduke’), in 4/4.
(d) Tchaikovsky’s Sixth Symphony, second movement, in 5/4.
(e) Sibelius’s Third Symphony, second movement, in 6/4. ■
One reason why it can be hard to tell a piece’s time signature from the
sound alone is that there is generally more than one way in which the
music can be notated. For instance, Figure 14(a) and (b) would sound
identical. The notational difference is due to the use of a different note
value to represent the beat. In (a), a crotchet (quarter note) is used; in
(b), a minim (half note) is used.
Figure 14 (a) and (b) are alternative notations. (c) can often sound indistinguishable from (a) and (b)
When we come to Figure 14(c), although this is theoretically different
from (a) and (b) – having four beats per bar instead of two – in practice
it can sound indistinguishable from the other two. This is because in
(c) the weakly accented beats (2 and 4) can often sound like
subdivisions of the strong beats rather than beats in their own right.
This pushes the feel of the music towards that of (b), where the last
notes in bars 1 and 3 fall on subdivisions of the beat.
3.3 Spacing of notes on the page

In Figure 15 I have taken two bars from Figure 11 to illustrate a couple
of points about spacing in notation. First, notice that bar (b) occupies
more space than (a), even though both occupy the same amount of time.
Figure 15 Spacing of notes gives only an approximate indication of their relative durations
The amount of space occupied by a bar on the page is not a reliable
indication of how long the bar lasts, and is governed by other factors
such as legibility.
Notice also that in (a), where all the note-values are of the same type,
the notes are evenly spaced, reflecting their even spacing in time. In (b)
there is an attempt to suggest the temporal spacing of the notes by their
spatial placement in the bar. For instance, the quaver (eighth note) is
much closer to the minim (half note) than to the dotted crotchet
(dotted quarter note). However, this spatial placement gives only a
suggestion of the temporal placement. In conventional notation there is
not a precise correspondence between spatial placement and temporal
placement. (In some modern compositions, however, there is a more
exact correspondence between spatial and temporal placement.)
3.4 Variability of beats and accents

You might at this point be wondering whether all the bars in a piece of
music must have a constant number of beats, and whether all the beats
in successive bars must be accented in the same way. There is actually
no reason for the number of beats in a bar to remain constant within a
piece of music, or within a section of a piece of music, other than the
fact that composers have tended to compose in this way. However,
many composers in the twentieth century began to vary the number of
beats in a bar, the most famous example being Stravinsky in his Rite of
Spring. Where the number of beats changes in the score, it is
accompanied by a new time signature.
As for the pattern of accentuation of beats within a bar, this too can
change, and there are ways of indicating this in a score using an accent
sign, >. For instance, a bar of music in four-four might have this
symbol over a normally weak beat, such as the fourth, to indicate that
the note falling on this beat should be accented (Figure 16).
Figure 16 Accent on a weak beat
Beethoven achieves this very effect in bar 12 of Figure 11, but without
using this symbol. Instead he ties this normally weakly accented note
to the one following, which is normally strongly accented, as shown in
Figure 17.
Figure 17 Tying notes together (the tie is labelled)
The curved line joining the last note of bar 12 to the first of bar 13
makes them into a single note with a combined length of the two notes
added together. (Naturally only notes having the same pitch can be
tied.) Because of the tie, in bar 13 there is no note beginning on the
first beat, and thus no note to receive a strong accent. In effect, the
accent that would normally fall here is transferred forward to the weak
beat at the end of bar 12, as in Figure 18.
Figure 18 A syncopation
This transferring of an accent from its customary place onto a weak
beat is known as syncopation. It is a common feature of ragtime, jazz
and popular music, but, as this example shows, it has been in use for a
long time.

In the audio track for this activity you can hear bars 12 and 13 (Figure 18).
Because they are quite quick, they are played three times. ■
3.5 Compound time signatures

Time signatures such as 4/4, 3/4, 2/4, 4/2, 3/2 and 2/2 are relatively
easy to interpret. As we have seen, the top number tells you how many
beats there are per bar, and the bottom number tells you the note value
that has been chosen to represent the beat.
In cases where a dotted note represents the beat, matters are not so
straightforward. As you will recall, a dotted note is often used to
represent the beat in cases where the prevailing subdivision is into
three equal parts rather than two. There is no unambiguous way to
indicate the use of a dotted-note beat in the time signature. For
instance, in terms of the table of standard note values, the value of a
dotted crotchet (dotted quarter note) is:
1/4 + 1/8 = 3/8
This is to be expected, because the simplest subdivision of a dotted
crotchet (dotted quarter note) is into three quavers (eighth notes). If
there are two beats per bar, and the dotted crotchet (dotted quarter
note) represents the beat, the time signature is:
2 × 3/8 = 6/8
A time signature of 6/8, however, could be interpreted as:
6 × 1/8
That is to say, it could be interpreted as indicating six beats per bar,
using an eighth note to represent the beat.
Time signatures of 6/8, 9/8 and 12/8 are commonly used for music
where a dotted crotchet (dotted quarter note) represents the beat, and
are known as compound time signatures. They indicate, respectively,
that there are two, three and four beats per bar. When music is in a
compound time signature, the metronome mark (if there is one) usually
reflects the fact that a dotted note represents the beat, for instance
dotted crotchet = 80.
If a composer really intended there to be six beats per bar in a piece in
6/8, the composer might put a note to that effect at the head of a score,
or re-notate it in 6/4.
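The usual reading of a time signature can be sketched in a few lines. This is only an illustration, assuming that 6/8, 9/8 and 12/8 are always read as compound (which, as noted above, a composer may occasionally not intend) and that the metronome mark counts the beat. The function names and the worked figures (16 bars of 9/8 at 72 beats per minute) are invented for the example.

```python
from fractions import Fraction

def interpret_time_signature(top, bottom):
    """Return (beats per bar, beat value as a fraction of a whole note).

    Assumption: 6/8, 9/8 and 12/8 are read as compound, with a dotted
    crotchet (3/8) beat; every other signature is read literally.
    """
    if bottom == 8 and top in (6, 9, 12):
        return top // 3, Fraction(3, 8)
    return top, Fraction(1, bottom)

def passage_seconds(bars, top, bottom, beats_per_minute):
    """Duration of a passage when the metronome mark counts the beat."""
    beats_per_bar, _ = interpret_time_signature(top, bottom)
    return Fraction(bars * beats_per_bar * 60, beats_per_minute)

print(interpret_time_signature(6, 8))        # (2, Fraction(3, 8))
print(interpret_time_signature(3, 4))        # (3, Fraction(1, 4))
print(float(passage_seconds(16, 9, 8, 72)))  # 40.0
```

In the last line, 16 bars of 9/8 contain 48 dotted-crotchet beats, which at 72 beats per minute last 40 seconds.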

A film composer wants to accompany 30 seconds of film with 24 bars
of music in 6/8. What must the metronome mark be for the music to
last exactly the right length of time? ■
Besides being used when the prevailing subdivision of the beat is into
three rather than two, compound time signatures are frequently used
when the beat is often subdivided into unequal parts in the ratio of 2:1
(or 1:2), as in Figure 10(c). Figure 19 shows how the notation works
when a dotted crotchet (dotted quarter note) represents the beat.
Figure 19 Compound time signatures are convenient for a prevailing 2:1 subdivision: (a) beat; (b) subdivided equally into three; (c) first two notes tied; (d) re-notated
The dotted crotchet (dotted quarter note) which represents the beat in
Figure 19(a) is equivalent to three quavers (eighth notes), as in
Figure 19(b). If we tie the first two notes together, as in Figure 19(c), we have
effectively subdivided the beat into two parts which are in a ratio of 2:1.
The two tied quavers (eighth notes) can be replaced by a crotchet
(quarter note), as in Figure 19(d).

The audio track associated with this activity is a short excerpt from the
Cello Concerto by Elgar. The time signature is 9/8, and in the excerpt the
melody is almost entirely in pairs of notes of the kind in Figure 19(d). ■

A piece of music has a time signature of 2/4. For how many beats (or
fractions of a beat) does
(a) a crotchet (quarter note) last
(b) two dotted quavers (dotted eighth notes) last? ■
3.6 Rhythm
In connection with Beethoven’s ‘Ode to Joy’ tune in Figure 11 we
noticed that not all bars consisted of four crotchets (quarter notes),
although all the bars had a 4/4 metre. For instance, bars 4, 8, 10, 11 and
16 consist of other note values than four crotchets (quarter notes). We
express this idea by saying that these bars have a different rhythm from
that of a bar of four crotchets (quarter notes). However, the rhythms of
bars 4, 8 and 16 are identical, and the rhythm of bar 10 is the same as
that of bar 11. A rhythm is thus the pattern of note values used in any
section of music, usually together with their metre.

Why might it be necessary to specify the metre when referring to the
rhythm of a particular series of notes? ■
The concept of rhythm does not apply to just one bar’s worth of note
values. There is, for instance, a distinctive rhythm in the two-bar group
formed by bars 12 and 13 arising from the syncopation. Similarly, we can
speak of the two-note pair in Figure 19(d) as having a distinctive rhythm.

Briefly explain what a bar is in relation to a series of regularly
accented beats. ■

Briefly explain what a time signature indicates in relation to beats,
bars and the note value used to represent the beat. ■

(a) What is the principal difference between a compound time
signature and a non-compound time signature in terms of the type
of note value used to represent the beat?
(b) Under what circumstances would a compound time signature be
likely to be used? ■
4 EXPRESSIVITY IN PERFORMANCE
4.1 The work of Carl Seashore
The modern temporal notation of music, as you have seen, divides
time into beats and subdivisions of the beat. A note either falls on a
beat or not on a beat. If it does not fall on a beat then it falls half-way
between the beats, or on some other relatively simple numerical
subdivision of the beat such as two thirds or one quarter.
In using a relatively limited set of categories of note values and
placements, the temporal representation in notation is a counterpart to
pitch representation. In the representation of pitch a note has to be one
pitch or another, F# or G, for instance, and the set of available pitches
is finite and quite limited. But what really happens in the performance
of music? Are these categories of placement and pitch an accurate
representation of what musicians actually play? I have already hinted
that in an expressive performance there might be some flexibility of
pulse, but are the other parameters (temporal placement and pitch)
also more flexible than the notation indicates?
The answer is yes, but the extent to which they are flexible was not
fully appreciated until the pioneering work of an American musical
psychologist called Carl Seashore in the 1930s. Using equipment that
now seems quite crude, he analysed recorded performances of singers
and instrumentalists from which he was able to make precise
measurements of pitch, loudness and time. He found that the
apparently rigid framework of notation is approximate to a degree that
surprised even many highly skilled musicians.
Figure 20 is based on one of Seashore’s analyses.1 The top line is the
first twelve notes of the folk tune ‘All through the Night’. Do not worry
if you cannot read music; you should be able to follow the analysis.
Above the staff, the notes have been numbered, and the words, or lyrics,
added above the numbers. (The usual place for the lyrics of a song is
below the staff, but it is more convenient here to put them above.)
This section of the tune uses only five pitches, F, G, A, Bb and C, and
the lines and spaces corresponding to these pitches have been labelled
at the left of the staff. Unusually, the horizontal placement of the notes
on the staff in Figure 20 accurately represents the temporal placement
of the notes indicated by the notation. You can check the horizontal
placement of the notes by referring to the short vertical lines below the
staff, which represent evenly spaced beats. As you can see, there are
four per bar, and note 2, for instance, lies half-way between the second
and third beats of the first bar.

The recording that Carl Seashore analysed is unfortunately not
available, but you can hear a modern recording of this short extract in
the audio track for this activity. ■
The bottom part of Figure 20 is part of Carl Seashore’s analysis of a
recording of this section of the song sung by a competent singer. Each
dashed or dotted horizontal line corresponds to one pitch. The line
between F and G, for instance, corresponds to F#. The analytical
diagram does not confine notes to the ‘grid’ of possibilities underlying
conventional notation. For example, intermediate pitches between,
say, F# and G can be shown; and the horizontal positioning of a note
accurately represents its temporal placement. (Seashore’s original
analytical diagrams also included a chart showing how loudness varied
throughout a performance, but I have not reproduced this in Figure 20.)
1 Carl E. Seashore, Psychology of Music, McGraw Hill, 1938; republished by Dover, 1967, p. 258.
Figure 20 ‘All through the Night’ in conventional notation (top) and a performance
analysed by Carl Seashore (bottom)
Notice that after nearly every note in the bottom part there is a wiggly
line. This is showing a cyclical fluctuation of pitch. Instead of holding
a steady note, the singer is using vibrato, which is a regular cyclical
variation of pitch (and loudness) often used by singers.

Over what range of pitch do notes 1 and 3 vary in this performance?
Look at the wiggly line following each of these notes.
Comment
The first note, nominally Bb, varies almost between B and A, almost a
whole tone. Note 3, at its widest deviation, varies almost between G#
and F#, again almost a whole tone. ■
Regarding pitch, notice that the average value of the pitch during vibrato
is not always the nominal pitch. For instance, during note 7 the average
pitch is lower than the nominal pitch of the note (A), and during note 8
the average pitch is rising. Notice also that notes 2 and 3, and 7 and 8, run
into each other. From the diagram it is hardly possible to say where note 2
ends and 3 begins (for instance). In addition, several notes are approached
from below their nominal pitch. Notes 9 and 10, in particular, start below
their nominal pitch and rise. (At note 11, Seashore’s singer performs a
version of the tune where this note is given the same pitch as note 12.)
As far as the temporal placement of the notes goes, if the singer were
giving a literal interpretation of the note values, then the notes in this
bottom part of the diagram would be aligned with the notation in the
upper part of the diagram. Notice that although notes 1 to 4 fall close to
where they would be expected to fall, notes 5–10 are all begun early.
One thing that is immediately clear from Seashore’s results is that what
happens in a real performance is not as rigid as conventional notation
indicates. The beginnings and endings of notes are not always clearly
defined, nor are their pitches, and the placement of notes can also be
quite flexible.
I should introduce a caution here. The recording that Seashore analysed
would undoubtedly have been a product of its time, and therefore
probably used certain performance practices that were current at the time
but which would not be used today to the same extent. Nevertheless, analyses
of more modern performances, using more sophisticated equipment than
Seashore had available, have confirmed Seashore’s general findings.
An interesting question is whether a person (or a machine) who did not
know this song, but who was conversant with musical notation, could
deduce the notation at the top of Figure 20 from Seashore’s analytical
diagram. That would certainly not be easy, and yet transcribing a sound
recording of the song into notation could be done by any reasonably
able music student. The resulting notation would be regarded as
accurate, even though, as Seashore’s diagram shows, it is really only an
approximation. However, the transcription would be acknowledged to
have missed the expressive element in the performance, which
conventional notation is hardly capable of representing. It is almost as
though the many deviations of placement and pitch from what is shown
in the notation are not recognised for what they are, at least by listeners
who are attuned to the musical culture that produced this song. Why
should this be? The answer seems to be related to the phenomenon of
categorical perception, which we shall consider next.

Compare and contrast the Seashore-style pitch diagram of a
performance and the conventional notation for the same piece. ■
4.2 Categorical perception

Categorical perception is a psychological term and its basic idea is that
phenomena are perceived by humans (within a particular culture) in
terms of pre-existing categories. New experiences or phenomena are
assimilated to their closest existing category. In this way humans
interpret the new in terms of what is already known. In the process, of
course, features that differentiate a new phenomenon from other
members of the category may be overlooked.
The effect is well known in psychological research, where in one classic
experiment subjects are quickly shown a series of playing cards and
required to identify them. The cards contain one or more trick cards,
which are the wrong colour for their suit (for instance a black four of
hearts). The purpose of the experiment is to see how the subjects deal
with these trick cards. When the experiment is conducted at a brisk
rate, subjects typically assimilate the trick cards to the existing categories
of suit or colour without consciously noticing that the trick cards are
strange. For instance, a black four of hearts may confidently be named
as either the four of hearts or the four of spades. Some subjects may
have difficulty naming a trick card, and yet be hard pressed to identify
what is anomalous about the card, even becoming confused or
distressed at their inability to identify the thing they are looking at.2
In music, categorical perception has been shown to operate both in terms
of the temporal placement of notes and (under certain conditions) pitch.
2 Jerome S. Bruner and Leo Postman, ‘On the Perception of Incongruity: A Paradigm’, Journal of
Personality, vol. 18, 1949, pp. 206–223.
To take just the question of placement, in one simple experiment,
conducted on both skilled musicians and untrained subjects, the
subjects were presented with a series of metronomic clicks. Once the
pulse had been established by being allowed to run for a while, a new
click was introduced between the consecutive regular clicks. Figure 21
represents this diagrammatically.
Figure 21 Click X is interpolated between clicks 1 and 2
The numbers 1 and 2 represent consecutive regular clicks, and X
represents a click introduced between them. In the diagram, X is
midway between 1 and 2, so this represents a simple subdivision of the
beat into two equal parts. However, in the experiment, X could lie
anywhere between 1 and 2. The subjects were required to reproduce
what they had heard. Now, although music notation can theoretically
cope with any placement of X between 1 and 2, certain placements are
very standard. As we have seen, the standard placements correspond
to simple fractions of the gap, such as a half, a quarter, three quarters, a
third, two thirds. A placement of X three sevenths of the way between
1 and 2 is, in terms of conventional notation, very non-standard.
The result of the experiment showed that the subjects had great difficulty
reproducing the non-standard placements accurately, and tended to
assimilate them to the standard placements. However, the results of
performance analyses such as Carl Seashore’s suggest that in practice
we hear non-standard placements (and pitching) of notes very often,
perhaps most of the time. It seems, then, that listeners raised in a basically
European music tradition assimilate the very variable timings and
pitches of a performance to certain standard categories which musical
notation has evolved to represent. This is not to say that the deviations
of non-standard timings and pitchings are not noticed, but they appear
to be interpreted in terms of expression, or ‘feel’, rather than in terms
of timing or pitch. Listeners, even musically skilled ones, are apt to
characterise such musical expression or ‘feel’ using ill-defined terms
like ‘swing’, ‘groove’, ‘spring’, ‘lift’ and so on, rather than in terms of
exact temporal displacements or pitch displacements.
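The idea of assimilating a placement to its nearest standard category can be illustrated with a toy calculation. This is not a model of perception and is not taken from the experiment described above: it simply assumes that the ‘standard’ placements are the five simple fractions listed earlier and that the nearest one wins. The names STANDARD_PLACEMENTS and assimilate are invented for the illustration.

```python
from fractions import Fraction

# The "standard" placements named in the text, as fractions of the gap
# between click 1 and click 2.
STANDARD_PLACEMENTS = [
    Fraction(1, 4), Fraction(1, 3), Fraction(1, 2),
    Fraction(2, 3), Fraction(3, 4),
]

def assimilate(placement):
    """Return the standard placement closest to the one actually heard."""
    return min(STANDARD_PLACEMENTS, key=lambda s: abs(s - placement))

heard = Fraction(3, 7)    # the non-standard example from the text
print(assimilate(heard))  # 1/2
```

On this crude rule, a click three sevenths of the way through the gap would be treated as a simple halving of the beat.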
A common example of the use of non-standard placements (in terms of
conventional notation) occurs in jazz and blues, where runs of
consecutive notes tend to be played as unequal pairs, for example
SLSLSL etc. or LSLSLS etc. (where S stands for short and L for long).

You can hear an example of this unequal pairing of notes in the audio
track for this activity. It is from Cornet Chop Suey, recorded by Louis
Armstrong’s Hot Five in 1926. ■
Notations of jazz, which was originally a largely unnotated form of music,
have tended to represent such pairs of notes as having a durational ratio
of 2:1, using the notations of Figure 10(c) or Figure 19(c) or (d). The trouble
with these notations is that, when played accurately, they do not feel right
for the jazz idiom. Close analysis of what jazz musicians actually play
(using a Seashore type of analysis) reveals that a ratio of 2:1 is, at best, an
approximation, and what they play is closer to 2.3:1 or 2.4:1. For jazz or
blues musicians, this ‘non-standard’ ratio is, of course, very standard.
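To get a feel for how small the difference between these ratios is in absolute terms, the durations of the long and short notes can be worked out for a given beat length. The sketch below assumes a beat of half a second (120 beats per minute), a figure chosen only for illustration; the function name unequal_pair is likewise my own.

```python
def unequal_pair(beat_seconds, ratio):
    """Split one beat into a long and a short note in the ratio `ratio`:1."""
    short = beat_seconds / (ratio + 1)
    long_ = beat_seconds - short
    return long_, short

beat = 0.5  # seconds per beat, i.e. 120 beats per minute (illustrative)
for ratio in (2.0, 2.3, 2.4):
    long_, short = unequal_pair(beat, ratio)
    print(f"{ratio}:1 -> long {long_ * 1000:.0f} ms, short {short * 1000:.0f} ms")
```

At this tempo the short note of a 2.3:1 pair is only about 15 milliseconds shorter than that of a 2:1 pair, yet the difference in feel is readily heard.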
The use of computers to transcribe ‘live’ performance into musical
notation is another area where these ‘categorical perception’ issues
arise. Many computer systems can now transcribe a performance
played on an attached piano-style keyboard into conventional
notation. Such systems have been used by composers who do not read
or write music to create scores for orchestral or choral compositions.
The pitch aspect of the music presents no problem for the computer,
because the pitches are predefined by the keyboard, and only those
pitches can be used. (However, the computer may have difficulty
interpreting enharmonic pairs correctly, for instance deciding between
D# and Eb.) For transcription of the temporal aspect, however, it is
usually necessary to predefine steps of time (this is known as
quantisation). The quantised step is the shortest note value that the
system will recognise, and this becomes the basic unit of time for that
piece. Beginnings and endings of notes played at the keyboard will be
made to coincide with the nearest time step. However, even allowing
for the degree of approximation introduced by this process, the
resulting transcription is liable to be too literal. A note that falls just
before or after a beat might be accurately transcribed as such, whereas
the performer may have felt that it was on the beat, but played
expressively. Alternatively, a note that is badly timed by the performer
may have its incorrect placement faithfully transcribed by the
computer, even though to a human listener the player’s intention may
have been obvious. The programs that perform such transcription are
being improved all the time, yet the results still generally need to be
edited afterwards to make them into acceptable notation.
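Quantisation in its simplest form can be sketched as follows. This assumes that times are measured in beats and that every time is simply moved to the nearest multiple of the chosen step; real transcription programs are considerably more elaborate, and the timings below are invented for the example.

```python
def quantise(times, step):
    """Move each time (in beats) to the nearest multiple of `step`."""
    return [round(t / step) * step for t in times]

# Onsets in beats, as a player might produce them (invented figures).
played = [0.02, 0.98, 1.55, 2.27, 3.01]
print(quantise(played, 0.25))   # [0.0, 1.0, 1.5, 2.25, 3.0]

# A note played slightly early, just before beat 1:
print(quantise([0.93], 0.25))   # [1.0]   heard as being on the beat
print(quantise([0.93], 0.125))  # [0.875] transcribed as before the beat
```

The last two lines show the ‘too literal’ problem: with a finer step, a note the performer felt as being on the beat is faithfully notated as falling just before it.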
Related difficulties arise in connection with the various computerised
systems used for performing notated music as sound. Making a
computer system perform conventionally notated music with the
correct feel has proved very difficult, although the programs are
improving all the time. To a considerable extent the computer must
interpret the notation in a way that is not too literal, particularly as
regards durations and placements (for instance in jazz and swing
pieces, but also in other types of music). Additionally, sliding between
notes (portamento, in musical parlance) can be added, and other
expressive effects. The key to making a convincing-sounding result lies
in knowing when to use these expressive effects and when not to. As
we saw in connection with the Carl Seashore analysis, a human
performer does not always join notes together with a slide, and does
not always displace notes away from the beat. Largely thanks to the use
of computerised analytical tools, musicians and programmers are
learning how music is performed expressively in terms of the small
shifts of time and pitch which musicians make almost instinctively.

The audio track for this activity contains a short extract played by a
computer from musical notation. The first half is played
inexpressively; the second half has the computer’s expressive
interpretation of the notes. ■

(a) Summarise briefly the concept of categorical perception.
(b) Give an example of an aspect of music where categorical
perception seems to operate. ■
5 SYNCHRONISING AN ENSEMBLE
When more than one musician is involved in a performance of a piece,
the question of synchronisation naturally arises. How do two performers
ensure that their notes coincide? The potential for non-synchronisation
escalates as more musicians are involved until, with a large choir or
orchestra, there may be in the region of 70 to 100 performers.
The conductor (if there is one) has a score, or, more correctly here, a
full score, which shows what every performer should be doing at any
instant. Figure 22 shows one page of the score from part of Beethoven’s
Ninth Symphony. (The first two bars of Figure 11 are in the final two
bars of Figure 22.) I have indicated the instruments or voices to which
each staff applies. Numbers in brackets indicate the number of
instruments playing each staff, except for the choral staves and string
staves, where the precise number of performers is not indicated by the
composer. (The first and second violins of a modern symphony
orchestra might typically comprise around fourteen or twelve players
each, with maybe ten violas, ten cellos and six double basses.)
Figure 22 Page of the full score of Beethoven’s Ninth Symphony (from M. Unger, Eulenburg).
Staves, top to bottom: Flutes (2), Oboes (2), Clarinets (2), Bassoons (2), Double Bassoon,
French Horns (4), Trumpets (2), Timpani, First Violins, Second Violins, Violas, Soprano, Alto,
Tenor, Baritone, Sopranos, Altos, Tenors, Basses, Cellos and double basses. Annotations mark
a note lying between beats 2 and 3 and a group of vertically aligned notes.
On a score, alignment on a common vertical line indicates simultaneity.
Thus, for instance, in the final bar of Figure 22, the cellos, double
basses, baritone soloist, violins and violas all play four crotchets
(quarter notes). Some of these bars are boxed in Figure 22. In these
bars, all the first crotchets (quarter notes) align with those above and
below. Likewise, all the second, third and fourth crotchets (quarter
notes) align. The oboes, however, have a quaver (eighth note) between
beats two and three, and its horizontal placement is carefully arranged
to fall between the second and third crotchets (quarter notes) of those
bars where four crotchets (quarter notes) are played.
Individual orchestral players do not perform from a full score but from
a part, which shows only the music played by that instrument (or
group of instruments in the case of a section of the orchestra, such as
the first violin section). The singers usually perform from a vocal score
containing the vocal lines and a piano arrangement of the orchestral
parts which a pianist can use during rehearsals.
Figure 23 shows the first-violin part from the score extract in Figure
22. Notice that it contains no information about what any other
instruments are playing, so it is not much help from the point of view
of synchronisation.
Figure 23 Extract from the first violin part
Although a piece such as this would have a conductor, whose gestures
are intended to indicate where the beats occur, musicians would rarely
rely on the conductor for the placement of individual notes. Instead,
they would achieve synchronisation in the way (in principle) that I
described earlier for synchronising the efforts of a group of people to
move an object. There would be an initial count (maybe aloud in the
case of a jazz group, and maybe a silent physical gesture in the case of
an orchestral- or chamber-music performance), and thereafter the
performers keep counting, listening and watching. Continued listening
and watching are essential because performers can easily drift out of
synchronisation with each other.
What, however, do the performers count? In principle, they count the
beat. For instance, if a passage is in 4/4, with a crotchet (quarter-note)
beat, the performers silently count one, two, three, four during each
bar. This counting may be almost subconscious, but even so – mishaps
aside – the performer always knows which beat is being played. If the
beat is subdivided, for instance into continuous semiquavers (sixteenth
notes), the performer still in principle counts the beat, as in Figure 24.
Figure 24 Counting the beat (counts 1, 2, 3, 4)

However, if the music is being taken fairly slowly, then the beat as
indicated by the notation may be too slow to count comfortably. For
instance, if the extract shown in Figure 24 is played so that each group
of four notes takes two seconds or longer, the performer may increase
the counting rate, and count in the places indicated by ‘X’ in Figure 25,
or even, in extreme cases, count each note individually. This is known
as dividing the beat.
Figure 25 Dividing the beat
On the other hand, if the music is played quickly, the performer may
not count each beat. For instance in fast music in 4/4, the performer
may count in the places shown in Figure 26.
Figure 26 Possible counting tactic in fast music
The disparity between what a musician may count, and what the time
signature indicates the beat to be, indicates that what is meant by ‘beat’
is not entirely straightforward. Notational subdivisions of the beat can
come to be regarded as beats in slow music; and notational beats can
come to be regarded as subdivisions in fast music. Behind this
phenomenon lies the experimentally observed fact that people are
most likely to feel comfortable tapping their feet at around one beat
every 0.6 seconds. So when conductors are conducting, or musicians
are counting, or dancers are dancing, they will tend to respond to
repetitive events in the music that are happening at about this rate, and
have a physical sensation that this is the beat. Whether these beats are
actually the same as the beat indicated in notation (if the musicians are
playing from notation) depends very much on the speed at which the
music is performed.
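The pull towards a comfortable counting rate can be illustrated with a rough calculation. The sketch below assumes, purely for illustration, that a performer chooses between counting half beats, whole beats or every other beat, and picks whichever gives a count closest (as a ratio) to one every 0.6 seconds. The function name and the three options are my own simplification.

```python
import math

COMFORTABLE_SECONDS = 0.6  # preferred tapping interval quoted in the text

def counting_level(beats_per_minute):
    """Pick the counting unit whose period is nearest 0.6 s on a ratio scale.

    Assumption: the only options are half a notated beat, one notated
    beat, or two notated beats per count.
    """
    beat_seconds = 60 / beats_per_minute
    options = {
        "divide the beat": 0.5,
        "count the beat": 1,
        "count every other beat": 2,
    }

    def distance(multiple):
        return abs(math.log(beat_seconds * multiple / COMFORTABLE_SECONDS))

    return min(options, key=lambda name: distance(options[name]))

print(counting_level(40))   # divide the beat
print(counting_level(100))  # count the beat
print(counting_level(220))  # count every other beat
```

A metronome mark of 100 gives exactly one beat every 0.6 seconds, so the notated beat and the felt beat coincide; at 40 or 220 they part company.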

Why is the presence of a score not essential for the synchronisation of
a group of performers? ■
I mentioned at the start of this chapter that my account of the temporal
aspect of music was based on practices in Western music that have
been standard for a few centuries. The introduction of electronic
technology into music has not fundamentally altered the way time is
conceived. For instance, in the context of MIDI (which will be
discussed in Block 3), music is still organised into patterns of beats
and bars, and has a time signature. Electronic technology has, however,
had an influence in matters of precision and complexity.
To take precision first, the prevalence of music recording has changed
listeners’ and performers’ expectations of the degree of
synchronisation required of performers. For instance, in orchestral
performances nowadays levels of synchronisation are frequently
achieved that were often not achieved in the past. This is not to say
that modern performances are therefore artistically superior to older
ones, simply that expectations have changed. In some special musical
contexts, indeed, musicians often play to a click track, which is a
series of electronically derived metronome pulses played to the
musicians through headphones. This is most likely to be encountered
in the recording studio for film work and other contexts where exact
synchronisation is required.
Concerning complexity, technology has sometimes enabled music to be
created mechanically that would be impossible for human performers.
For instance, extremely complex or extremely rapid subdivisions of a
beat can be achieved electronically that would scarcely be feasible for
a human performer. This can be found in both avant-garde and
popular music.
We saw in Figure 22 an extract from a full score. This style of
representation too has survived into much music created
electronically. For instance, in MIDI, the computer display typically
shows each track of the music (analogous to an instrumental line) as a
separate horizontal line, and the tracks are arranged under each other
as the instrumental lines of a full score are. This too will be discussed
in Block 3.
In music notation, note durations are relative. Durations are notated
using a hierarchical system of note values which represent successive
doublings or halvings of duration. In American terminology, note names
indicate relative durations (half note, quarter note, eighth note, etc.)
This system can be adapted to accommodate other subdivisions than
equal, binary subdivisions. (Section 2.1)

A metronome mark can be used to specify an absolute duration for a
particular note value, and hence for any other note value. A metronome
mark is a precise way of indicating tempo. (Section 2.1)

Music conventionally has a beat or pulse. Beats are regularly spaced,
instantaneous temporal markers, to which musical events are related.
The word ‘beat’ also refers to the time interval between two
consecutive temporal markers. In much music, audible repetitive events
coincide with the beat, making the beat an explicit part of the music.
In other music, the beat may be inaudible. (Section 2.3)

The durations of notes are conceived in terms of the underlying beat
rate. A note lasts for a certain number of beats, or fractions
(subdivisions) of a beat. Commonly used subdivisions of the beat are
based on simple numerical ratios and have characteristic sounds.
(Section 2.4)

A stream of unaccented beats will begin to seem regularly grouped if
listened to for more than a few seconds. Groups of evenly spaced beats
appear to be set off from one another by equally spaced accented beats.
Regular beats fall into repeating cycles of strong and weak
accentuation. Each cycle begins with a strong accent. A cycle of
accentuation is a bar. (Section 3.1)

The time signature is a way of indicating how many beats there are in a
bar, and what note value represents the beat (in the sense of the time
interval between two beats). Except for compound time signatures, the
upper number of a time signature represents the number of beats per
bar, and the lower number, interpreted as the denominator of a
fraction, indicates the note value that represents the beat.
(Section 3.2)

A change to the pattern of accentuation within a bar can be indicated
by the use of an accent symbol (>). A syncopation occurs when a strong
accent is shifted to a normally weakly accented beat. (Section 3.4)

Compound time signatures are used where the principal subdivision of
the beat is into three parts rather than two, or where the prevailing
subdivision is into two parts with durations in the ratio of 2:1. In
music of this kind, a dotted note represents the beat. Most commonly, a
dotted crotchet (dotted quarter note) is used. (Section 3.5)

Modern musical notation has evolved to represent certain ways of
arranging the temporal and pitch aspects of musical material.
Performances are typically much more flexible than the notation
suggests. Notation appears to represent the categories of musical
perception of its originating culture. (Sections 4.1 and 4.2)

The expressive quality of music appears to reside in its flexibility in
relation to the categories of organisation used in notation.
(Sections 4.1 and 4.2)

Acceptable computer-based transcription or performance requires
intelligent computer systems that are able to interpret or generate
expressive effects correctly. (Section 4.2)

Synchronisation of ensembles is achieved through matched counting by
the performers. If the performers are playing from notation, their
counts may or may not coincide with the notational beats, depending on
the tempo of the performance. (Section 5)

Activity 1
Calculations such as these are easier to perform using the American
terminology. The ratio of their durations is (1/16): (1/2). Multiplying
both sides of the ratio by 2 gives (1/8): 1. In other words, a semiquaver
(sixteenth note) is one eighth of the duration of a minim (half note).
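The same ratio can be checked with exact fractions. The short Python sketch below treats each note value as a fraction of a whole note, following the American names; the dictionary and function names are illustrative assumptions rather than anything defined in the course text.

```python
from fractions import Fraction

# Note values expressed as fractions of a whole note (American naming).
NOTE_VALUES = {
    "minim": Fraction(1, 2),        # half note
    "crotchet": Fraction(1, 4),     # quarter note
    "quaver": Fraction(1, 8),       # eighth note
    "semiquaver": Fraction(1, 16),  # sixteenth note
}


def duration_ratio(first, second):
    """Duration of `first` expressed as a fraction of `second`."""
    return NOTE_VALUES[first] / NOTE_VALUES[second]


print(duration_ratio("semiquaver", "minim"))  # 1/8
```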
Activity 2
(a) There are 90 crotchets (quarter notes) per minute, and hence 180
quavers (eighth notes) per minute. So in five minutes there will be
5 × 180 = 900 notes.
(b) There are 120 quavers (eighth notes) per minute, so each one lasts
half a second.
(c) A minim (half note) lasts four times as long as a quaver (eighth
note), so each lasts for 4 × 0.5 second, or two seconds.
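The arithmetic in these three answers follows one pattern: convert the metronome mark into a rate for the note value of interest, then convert the rate into a duration if needed. A minimal Python sketch of that pattern is given below. The helper names are hypothetical, and note values are again written as fractions of a whole note.

```python
from fractions import Fraction

MINIM = Fraction(1, 2)     # half note
CROTCHET = Fraction(1, 4)  # quarter note
QUAVER = Fraction(1, 8)    # eighth note


def notes_per_minute(mark, mark_value, note_value):
    """Rate of `note_value` notes when `mark_value` notes occur `mark` times a minute."""
    return Fraction(mark) * mark_value / note_value


def seconds_per_note(mark, mark_value, note_value):
    """Duration in seconds of one `note_value` note at the given metronome mark."""
    return 60 / notes_per_minute(mark, mark_value, note_value)


# (a) crotchet = 90: quavers per minute, then quavers in five minutes
print(notes_per_minute(90, CROTCHET, QUAVER) * 5)  # 900
# (b) 120 quavers per minute: duration of one quaver in seconds
print(seconds_per_note(120, QUAVER, QUAVER))       # 1/2
# (c) same tempo: duration of one minim in seconds
print(seconds_per_note(120, QUAVER, MINIM))        # 2
```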
Activity 7
A beat is one of a series of regularly spaced, instantaneous temporal
markers, for example the clicks of a metronome or the silent markers
inside a performer’s head. The word is also used to represent the time
between two such temporal markers. This length of time can serve as a
temporal unit. By using a particular note value to represent this
temporal unit, the durations of other note values can be specified in
terms of beats, either as whole numbers or fractions of a beat.
Activity 9
This is a three-beat pattern:
⏐ 1 2 3 ⏐ 1 2 3 ⏐ 1 2 3 ⏐
⏐ S W W ⏐ S W W ⏐ S W W ⏐
This is a four-beat pattern:
⏐ 1 2 3 4 ⏐ 1 2 3 4 ⏐ 1 2 3 4 ⏐
⏐ S W M W ⏐ S W M W ⏐ S W M W ⏐
Activity 11
Each of these begins with a dotted crotchet (dotted quarter note),
which is 50% longer than a crotchet (quarter note). In other words, this
note is worth
$$\frac{1}{4} + \frac{1}{8}$$
The whole bar is therefore
$$\left(\frac{1}{4} + \frac{1}{8}\right) + \frac{1}{8} + \frac{1}{2} = \frac{4}{4}$$
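A check of this kind can be automated. The Python sketch below adds up the note values in a bar and compares the total with the time signature read as a fraction. The function names are assumptions for illustration, and the example bar is the one worked through above (dotted crotchet, quaver, minim in 4/4).

```python
from fractions import Fraction


def dotted(value):
    """A dot lengthens a note value by 50%."""
    return value * Fraction(3, 2)


def bar_is_complete(note_values, time_signature):
    """True if the note values (and rests) add up to the time signature."""
    upper, lower = time_signature
    return sum(note_values, Fraction(0)) == Fraction(upper, lower)


# Dotted crotchet + quaver + minim in 4/4
bar = [dotted(Fraction(1, 4)), Fraction(1, 8), Fraction(1, 2)]
print(bar_is_complete(bar, (4, 4)))  # True
```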
Activity 12
All bars will have an equal total of note values and rests, so, as long as
the duration of each note value remains the same, the duration of each
bar will remain the same. However, the duration of the note values might
not remain the same, for instance if the music speeds up or slows
down. If there are no tempo changes, the bars are all the same length.
Activity 15
Twenty-four bars of music in 6/8 have forty-eight beats (two beats per
bar). If forty-eight beats are to last for 30 seconds, then in a minute
there would be 96 beats. So the music must be played at 96 beats per
minute. Thus the metronome mark must be dotted crotchet (dotted
quarter note) = 96.
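The calculation can be written as a one-line formula: beats per minute equals the total number of beats multiplied by 60 and divided by the duration in seconds. A small Python sketch follows; the function name is a hypothetical choice.

```python
def metronome_mark(bars, beats_per_bar, seconds):
    """Beats per minute needed for `bars` bars to last `seconds` seconds."""
    total_beats = bars * beats_per_bar
    return total_beats * 60 / seconds


# Twenty-four bars of 6/8 (two beats per bar) lasting 30 seconds
print(metronome_mark(24, 2, 30))  # 96.0
```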
Activity 17
As usual, the use of American terminology helps considerably.
(a) A time signature of 2/4 means that there are two beats per bar, and
the beat is represented by a crotchet (quarter note). Thus the
duration is one beat.
(b) A quaver (eighth note) is half a crotchet (quarter note), so an
undotted quaver (undotted eighth note) lasts half a beat. Adding a
dot extends the duration by 50%, so the duration of one dotted
quaver (dotted eighth note) becomes three-quarters of a beat. For
two such notes the duration is one-and-a-half beats.
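The steps in part (b) can be sketched in Python as follows. The sketch assumes a non-compound time signature, so that the lower number alone identifies the beat; the function name and its parameters are illustrative assumptions.

```python
from fractions import Fraction


def duration_in_beats(note_value, time_signature, dots=0, count=1):
    """Length in beats of `count` notes of `note_value` (a fraction of a whole note).

    Assumes a non-compound time signature, where the lower number gives
    the note value that represents the beat.
    """
    _, lower = time_signature
    beat_value = Fraction(1, lower)
    length = Fraction(note_value)
    extra = length
    for _ in range(dots):
        extra /= 2       # each dot adds half of the previous increment
        length += extra
    return count * length / beat_value


QUAVER = Fraction(1, 8)  # eighth note

print(duration_in_beats(QUAVER, (2, 4)))                   # 1/2
print(duration_in_beats(QUAVER, (2, 4), dots=1))           # 3/4
print(duration_in_beats(QUAVER, (2, 4), dots=1, count=2))  # 3/2
```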
Activity 18
A given series of notes would have a different rhythmic character if
played in one metre rather than another. For instance, four crotchets
(quarter notes) played in 4/4 are rhythmically different from four
crotchets (quarter notes) played in 3/4 because of the differing
arrangements of strong and weak accents.
Activity 19
Patterns of regularly accented beats are naturally grouped into
repeating cycles of strong and weak accentuation. A bar is a cycle of
accentuation, beginning with a strong accent.
Activity 20
The time signature is a way of indicating how many beats there are in
a bar, and what note value has been chosen to represent the beat (in
the sense of the time interval between two beats). In non-compound
time signatures, the upper number of a time signature represents the
number of beats per bar, and the lower number indicates the note
value that represents the beat.
Activity 21
(a) In compound time signatures, a dotted note represents the beat
(usually a dotted crotchet or dotted quarter note). In non-
compound time signatures an undotted note represents the beat.
(b) Compound time signatures are used where the principal
subdivision of the beat is into three parts rather than two, or
where the prevailing subdivision is into two parts with durations
in the ratio of 2:1.
Activity 24
Both the Seashore-style diagram and conventional notation are
graphical representations of music. The Seashore diagram has a chart
showing how pitch varies (and another showing how loudness varies).
As with conventional notation, the Seashore pitch chart represents
pitch in the vertical direction and time along the horizontal axis. The
Seashore diagram always represents a recorded performance, and is
made for analysis. It cannot itself be performed. Conventional
notation, on the other hand, can either represent a performance (in the
case of a transcription) or can be performed.
In a Seashore diagram, pitch and time are represented continuously;
that is, pitches are not confined to the standard note-names, and note
placements and durations are indicated by proportional horizontal
spacing. In conventional notation, pitches are discrete, and
intermediate pitches between named pitches are not represented. In
conventional notation horizontal spacing is only a very approximate
indication of temporal placement. More accurate information about
temporal placement and duration is derived from the note values used
and their relation to an underlying pattern of beats. This system does
not allow time to be represented as a continuous variable.
The Seashore diagram reveals more closely what actually happens in a
performance, in terms of pitch and timing, than does a transcription of
the same performance into conventional notation.
Activity 27
(a) Categorical perception is a psychological concept. It is the process
by which newly experienced phenomena are assimilated to pre-
existing categories, which they may not exactly fit. The mismatch
between the phenomenon and the category is, however,
overlooked until such time as it leads to unignorable anomalies.
(b) Musical pitch is one aspect. The categories are the named pitches,
and, in the context of a musical performance, pitches may be
assimilated to these categories. Another aspect is the series of
temporal categories produced by simple subdivisions of an
interval of time.
Activity 28
Synchronism is achieved through matched counting among the
players, for which a score is not needed. In addition, in many styles of
music, notation is not used. Even in those musical styles where notation
is used, the performers play from a part (showing their own part in the
music) rather than from a score, which shows all the parts together.
LEARNING OUTCOMES
2 Perform simple calculations based on the relative durations of
minims (half notes), crotchets (quarter notes), quavers (eighth
notes), semiquavers (sixteenth notes), and their dotted forms, given
a table of standard note values. (Activity 1)
3 Interpret a metronome mark and perform simple calculations based
on it, given a table of standard note values. (Activity 2)
4 Explain how durations of notes can be related to beats and their
spacings. (Activity 7)
5 Explain what a bar is and show the patterns of accentuation in bars
of two beats, three beats and four beats. (Activities 9 and 19)
6 Show that the note values (and rests) in a bar add to the correct
amount in relation to the time signature, given a table of standard
note values, and be able to relate the duration of a bar to its
constituent notes (and rests). (Activities 11 and 12)
7 Interpret the duration of a note (dotted and undotted), or a series of
notes, in terms of a number of beats, given a time signature and a
table of standard note values. (Activity 17)
8 Explain the use of a time signature. (Activities 20 and 21)
9 Interpret common time signatures (2/4, 3/4, 4/4, 6/8, 9/8 and 12/8)
in terms of the number of beats in the bar and the note value used
to represent the beat. (Activities 15, 17 and 20)
10 Discuss some of the findings of Carl Seashore regarding the
relationship between a notated piece of music and a performance of
the same piece. (Activity 24)
11 Describe simply the concept of categorical perception and relate
it to aspects of music. (Activity 27)
12 Explain the relationship of a musical score to a musical part, and
explain how ensembles of performers maintain synchronism.
(Activity 28)
Acknowledgement
Cover images: © 1997 Photodisc, Inc., and Ingram Publishing
